Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
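
The average record size is the total in-memory size divided by the number of observations. A minimal sketch of that arithmetic, using only the figures reported above; the function name is illustrative and not part of ydata-profiling:

```python
KIB = 1024  # bytes per kibibyte


def average_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Return the mean in-memory size of one record, in bytes."""
    return total_kib * KIB / n_rows


# 488.3 KiB spread over 10000 observations.
avg = average_record_size_bytes(488.3, 10000)
print(f"{avg:.1f} B")
```

The result rounds to the 50.0 B shown in the statistics.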

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15820/S/1/datasetView.do

Alerts

년월일 has constant value "201909" [Constant]
금액 is highly skewed (γ1 = 25.02076186) [Skewed]
금액 has 2157 (21.6%) zeros [Zeros]

Reproduction

Analysis started: 2024-05-11 06:01:13.869218
Analysis finished: 2024-05-11 06:01:15.048311
Duration: 1.18 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명 (apartment name)
Text

Distinct: 2107
Distinct (%): 21.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 22
Median length: 20
Mean length: 7.1755
Min length: 2

Characters and Unicode

Total characters: 71755
Distinct characters: 431
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
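
The character breakdowns in this report group characters by Unicode properties. As a rough illustration, the sketch below counts characters by general category with Python's standard `unicodedata` module. It uses the five sample values shown for this variable; the helper name is hypothetical, and this is not necessarily how ydata-profiling computes its tables.

```python
import unicodedata
from collections import Counter

# Sample values copied from this variable's "Sample" rows in the report.
samples = ["신림동부", "잠실동트리지움", "래미안힐스테이트 고덕", "무악현대", "항동하버라인3단지"]


def category_counts(values):
    """Count characters by Unicode general category code (e.g. 'Lo', 'Nd', 'Zs')."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


counts = category_counts(samples)
for code, n in counts.most_common():
    print(code, n)
```

The two-letter codes map to the names used in the report: `Lo` is Other Letter (Hangul syllables fall here), `Nd` is Decimal Number, and `Zs` is Space Separator.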

Unique

Unique: 101
Unique (%): 1.0%
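
"Distinct" and "Unique" measure different things. The sketch below assumes the usual profiling definitions: distinct is the number of different values, and unique is the number of values that occur exactly once. The toy list and function name are illustrative only.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Toy data: two names repeat, two appear once.
toy = ["래미안", "래미안", "신반포", "무악현대", "신반포", "거여우방"]
distinct, unique = distinct_and_unique(toy)
print(distinct, unique)

# Percentages as reported for this variable: 2107 distinct and 101 unique out of 10000.
print(f"{2107 / 10000:.1%}", f"{101 / 10000:.1%}")
```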

Sample

1st row: 신림동부
2nd row: 잠실동트리지움
3rd row: 래미안힐스테이트 고덕
4th row: 무악현대
5th row: 항동하버라인3단지

Value | Count | Frequency (%)
아파트 | 115 | 1.1%
래미안 | 25 | 0.2%
서울숲2차푸르지오임대 | 21 | 0.2%
신동아파밀리에 | 18 | 0.2%
신도림현대 | 15 | 0.1%
힐스테이트 | 14 | 0.1%
래미안밤섬리베뉴 | 14 | 0.1%
송천센트레빌 | 13 | 0.1%
신반포 | 13 | 0.1%
래미안허브리츠 | 13 | 0.1%
Other values (2163) | 10226 | 97.5%

Most occurring characters

Value | Count | Frequency (%)
(not shown) | 2203 | 3.1%
(not shown) | 2139 | 3.0%
(not shown) | 1954 | 2.7%
(not shown) | 1837 | 2.6%
(not shown) | 1811 | 2.5%
(not shown) | 1707 | 2.4%
(not shown) | 1575 | 2.2%
(not shown) | 1519 | 2.1%
(not shown) | 1426 | 2.0%
(not shown) | 1352 | 1.9%
Other values (421) | 54232 | 75.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 65698 | 91.6%
Decimal Number | 3856 | 5.4%
Uppercase Letter | 733 | 1.0%
Space Separator | 545 | 0.8%
Lowercase Letter | 361 | 0.5%
Dash Punctuation | 159 | 0.2%
Close Punctuation | 131 | 0.2%
Open Punctuation | 131 | 0.2%
Other Punctuation | 129 | 0.2%
Letter Number | 8 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not shown) | 2203 | 3.4%
(not shown) | 2139 | 3.3%
(not shown) | 1954 | 3.0%
(not shown) | 1837 | 2.8%
(not shown) | 1811 | 2.8%
(not shown) | 1707 | 2.6%
(not shown) | 1575 | 2.4%
(not shown) | 1519 | 2.3%
(not shown) | 1426 | 2.2%
(not shown) | 1352 | 2.1%
Other values (375) | 48175 | 73.3%

Uppercase Letter
Value | Count | Frequency (%)
S | 123 | 16.8%
C | 100 | 13.6%
K | 90 | 12.3%
L | 74 | 10.1%
H | 53 | 7.2%
D | 51 | 7.0%
M | 51 | 7.0%
G | 41 | 5.6%
E | 35 | 4.8%
I | 34 | 4.6%
Other values (7) | 81 | 11.1%

Lowercase Letter
Value | Count | Frequency (%)
e | 199 | 55.1%
l | 38 | 10.5%
i | 34 | 9.4%
v | 27 | 7.5%
k | 15 | 4.2%
s | 14 | 3.9%
c | 10 | 2.8%
w | 10 | 2.8%
a | 5 | 1.4%
g | 5 | 1.4%

Decimal Number
Value | Count | Frequency (%)
1 | 1168 | 30.3%
2 | 1151 | 29.8%
3 | 531 | 13.8%
4 | 245 | 6.4%
5 | 192 | 5.0%
6 | 158 | 4.1%
7 | 115 | 3.0%
0 | 102 | 2.6%
9 | 98 | 2.5%
8 | 96 | 2.5%

Other Punctuation
Value | Count | Frequency (%)
, | 103 | 79.8%
. | 26 | 20.2%

Space Separator
Value | Count | Frequency (%)
(space) | 545 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 159 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 131 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 131 | 100.0%

Letter Number
Value | Count | Frequency (%)
(not shown) | 8 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 4 | 100.0%

Most occurring scripts

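Value | Count | Frequency (%)
Hangul | 65698 | 91.6%
Common | 4955 | 6.9%
Latin | 1102 | 1.5%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not shown) | 2203 | 3.4%
(not shown) | 2139 | 3.3%
(not shown) | 1954 | 3.0%
(not shown) | 1837 | 2.8%
(not shown) | 1811 | 2.8%
(not shown) | 1707 | 2.6%
(not shown) | 1575 | 2.4%
(not shown) | 1519 | 2.3%
(not shown) | 1426 | 2.2%
(not shown) | 1352 | 2.1%
Other values (375) | 48175 | 73.3%

Latin
Value | Count | Frequency (%)
e | 199 | 18.1%
S | 123 | 11.2%
C | 100 | 9.1%
K | 90 | 8.2%
L | 74 | 6.7%
H | 53 | 4.8%
D | 51 | 4.6%
M | 51 | 4.6%
G | 41 | 3.7%
l | 38 | 3.4%
Other values (19) | 282 | 25.6%

Common
Value | Count | Frequency (%)
1 | 1168 | 23.6%
2 | 1151 | 23.2%
(space) | 545 | 11.0%
3 | 531 | 10.7%
4 | 245 | 4.9%
5 | 192 | 3.9%
- | 159 | 3.2%
6 | 158 | 3.2%
) | 131 | 2.6%
( | 131 | 2.6%
Other values (7) | 544 | 11.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 65698 | 91.6%
ASCII | 6049 | 8.4%
Number Forms | 8 | < 0.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not shown) | 2203 | 3.4%
(not shown) | 2139 | 3.3%
(not shown) | 1954 | 3.0%
(not shown) | 1837 | 2.8%
(not shown) | 1811 | 2.8%
(not shown) | 1707 | 2.6%
(not shown) | 1575 | 2.4%
(not shown) | 1519 | 2.3%
(not shown) | 1426 | 2.2%
(not shown) | 1352 | 2.1%
Other values (375) | 48175 | 73.3%

ASCII
Value | Count | Frequency (%)
1 | 1168 | 19.3%
2 | 1151 | 19.0%
(space) | 545 | 9.0%
3 | 531 | 8.8%
4 | 245 | 4.1%
e | 199 | 3.3%
5 | 192 | 3.2%
- | 159 | 2.6%
6 | 158 | 2.6%
) | 131 | 2.2%
Other values (35) | 1570 | 26.0%

Number Forms
Value | Count | Frequency (%)
(not shown) | 8 | 100.0%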

아파트코드 (apartment code)
Text

Distinct: 2113
Distinct (%): 21.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 101
Unique (%): 1.0%

Sample

1st row: A15101101
2nd row: A13822002
3rd row: A10027207
4th row: A11081503
5th row: A10025614

Value | Count | Frequency (%)
a14272313 | 13 | 0.1%
a13070301 | 13 | 0.1%
a13378103 | 13 | 0.1%
a13376906 | 13 | 0.1%
a13187004 | 12 | 0.1%
a15288803 | 12 | 0.1%
a13071302 | 12 | 0.1%
a13610003 | 11 | 0.1%
a14386109 | 11 | 0.1%
a13203408 | 11 | 0.1%
Other values (2103) | 9879 | 98.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 18294 | 20.3%
1 | 17823 | 19.8%
A | 9987 | 11.1%
3 | 9021 | 10.0%
2 | 7985 | 8.9%
5 | 6179 | 6.9%
8 | 5793 | 6.4%
7 | 4801 | 5.3%
4 | 3648 | 4.1%
6 | 3550 | 3.9%
Other values (2) | 2919 | 3.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18294 | 22.9%
1 | 17823 | 22.3%
3 | 9021 | 11.3%
2 | 7985 | 10.0%
5 | 6179 | 7.7%
8 | 5793 | 7.2%
7 | 4801 | 6.0%
4 | 3648 | 4.6%
6 | 3550 | 4.4%
9 | 2906 | 3.6%

Uppercase Letter
Value | Count | Frequency (%)
A | 9987 | 99.9%
B | 13 | 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 88.9%
Latin | 10000 | 11.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18294 | 22.9%
1 | 17823 | 22.3%
3 | 9021 | 11.3%
2 | 7985 | 10.0%
5 | 6179 | 7.7%
8 | 5793 | 7.2%
7 | 4801 | 6.0%
4 | 3648 | 4.6%
6 | 3550 | 4.4%
9 | 2906 | 3.6%

Latin
Value | Count | Frequency (%)
A | 9987 | 99.9%
B | 13 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18294 | 20.3%
1 | 17823 | 19.8%
A | 9987 | 11.1%
3 | 9021 | 10.0%
2 | 7985 | 8.9%
5 | 6179 | 6.9%
8 | 5793 | 6.4%
7 | 4801 | 5.3%
4 | 3648 | 4.1%
6 | 3550 | 3.9%
Other values (2) | 2919 | 3.2%

비용명 (cost item name)
Text

Distinct: 77
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 9
Mean length: 5.9824
Min length: 2

Characters and Unicode

Total characters: 59824
Distinct characters: 107
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 선급비용
2nd row: 기타충당부채
3rd row: 장기수선충당부채
4th row: 수선유지비충당부채
5th row: 미수관리비예치금

Value | Count | Frequency (%)
당기순이익 | 336 | 3.4%
예수금 | 326 | 3.3%
현금 | 313 | 3.1%
미처분이익잉여금 | 313 | 3.1%
비품 | 312 | 3.1%
연차수당충당부채 | 312 | 3.1%
선급비용 | 311 | 3.1%
미부과관리비 | 310 | 3.1%
관리비미수금 | 310 | 3.1%
장기수선충당부채 | 307 | 3.1%
Other values (67) | 6850 | 68.5%

Most occurring characters

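Value | Count | Frequency (%)
(not shown) | 4736 | 7.9%
(not shown) | 3811 | 6.4%
(not shown) | 3163 | 5.3%
(not shown) | 3095 | 5.2%
(not shown) | 3019 | 5.0%
(not shown) | 2989 | 5.0%
(not shown) | 2663 | 4.5%
(not shown) | 2398 | 4.0%
(not shown) | 1920 | 3.2%
(not shown) | 1795 | 3.0%
Other values (97) | 30235 | 50.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 59824 | 100.0%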

Most frequent character per category

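Other Letter
Value | Count | Frequency (%)
(not shown) | 4736 | 7.9%
(not shown) | 3811 | 6.4%
(not shown) | 3163 | 5.3%
(not shown) | 3095 | 5.2%
(not shown) | 3019 | 5.0%
(not shown) | 2989 | 5.0%
(not shown) | 2663 | 4.5%
(not shown) | 2398 | 4.0%
(not shown) | 1920 | 3.2%
(not shown) | 1795 | 3.0%
Other values (97) | 30235 | 50.5%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 59824 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not shown) | 4736 | 7.9%
(not shown) | 3811 | 6.4%
(not shown) | 3163 | 5.3%
(not shown) | 3095 | 5.2%
(not shown) | 3019 | 5.0%
(not shown) | 2989 | 5.0%
(not shown) | 2663 | 4.5%
(not shown) | 2398 | 4.0%
(not shown) | 1920 | 3.2%
(not shown) | 1795 | 3.0%
Other values (97) | 30235 | 50.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 59824 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not shown) | 4736 | 7.9%
(not shown) | 3811 | 6.4%
(not shown) | 3163 | 5.3%
(not shown) | 3095 | 5.2%
(not shown) | 3019 | 5.0%
(not shown) | 2989 | 5.0%
(not shown) | 2663 | 4.5%
(not shown) | 2398 | 4.0%
(not shown) | 1920 | 3.2%
(not shown) | 1795 | 3.0%
Other values (97) | 30235 | 50.5%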

년월일 (date)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
201909 | 10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 201909
2nd row: 201909
3rd row: 201909
4th row: 201909
5th row: 201909

Common Values

Value | Count | Frequency (%)
201909 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
201909 | 10000 | 100.0%

금액 (amount)
Real number (ℝ)

SKEWED  ZEROS

Distinct: 7528
Distinct (%): 75.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 75488528
Minimum: -3.8900128 × 10^8
Maximum: 2.0750082 × 10^10
Zeros: 2157
Zeros (%): 21.6%
Negative: 332
Negative (%): 3.3%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -3.8900128 × 10^8
5th percentile: 0
Q1: 412
Median: 3284357.5
Q3: 33750790
95th percentile: 3.4069278 × 10^8
Maximum: 2.0750082 × 10^10
Range: 2.1139083 × 10^10
Interquartile range (IQR): 33750378

Descriptive statistics

Standard deviation: 3.6423283 × 10^8
Coefficient of variation (CV): 4.8250091
Kurtosis: 1135.581
Mean: 75488528
Median Absolute Deviation (MAD): 3284357.5
Skewness: 25.020762
Sum: 7.5488528 × 10^11
Variance: 1.3266556 × 10^17
Monotonicity: Not monotonic
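
Several of these figures are tied together by simple identities: CV is the standard deviation divided by the mean, variance is the standard deviation squared, and the sum is the mean times the number of observations. The sketch below checks those relationships against the reported values and includes a plain moment-based skewness function to show what a positive γ1 means. The helper and the toy list are illustrative; the report's own skewness estimator may apply a small-sample correction, which this sketch does not.

```python
def skewness(xs):
    """Moment-based skewness: third central moment over the 1.5 power of the second."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


# Figures as reported for this variable.
mean = 75_488_528
std = 3.6423283e8
n = 10_000

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance from the standard deviation
total = mean * n       # sum from the mean

print(f"CV = {cv:.4f}, variance = {variance:.4e}, sum = {total:.4e}")

# A toy right-tailed sample: one large value pulls the skewness positive.
right_tailed = [0, 0, 1, 2, 3, 100]
print(skewness(right_tailed) > 0)
```

A symmetric sample such as `[1, 2, 3]` has skewness 0 under this definition, so a value near 25 indicates a very long right tail.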
Histogram with fixed size bins (bins=50)
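
A fixed-size-bin histogram splits the interval between the minimum and maximum into equal-width bins and counts the values in each. The sketch below is a minimal standard-library version of that idea; the function name and the handling of a constant input are assumptions for illustration, not the plotting code used by the report.

```python
def fixed_width_histogram(values, bins=50):
    """Count values into `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if lo == hi:
        # Simplification: a constant input goes entirely into the first bin.
        counts[0] = len(values)
        return counts
    width = (hi - lo) / bins
    for v in values:
        # The maximum would land on the right edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


toy = [0, 1, 2, 3, 4, 10]
print(fixed_width_histogram(toy, bins=5))
```

With the reported range of about 2.11 × 10^10, each of the 50 bins spans roughly 4.2 × 10^8. Since the 95th percentile is 3.4069278 × 10^8, at least 95% of the observations fall into the first two bins, which is why the plot is dominated by its left edge.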

Common Values

Value | Count | Frequency (%)
0 | 2157 | 21.6%
250000 | 24 | 0.2%
500000 | 22 | 0.2%
200000 | 15 | 0.1%
300000 | 13 | 0.1%
484000 | 13 | 0.1%
242000 | 12 | 0.1%
10000000 | 11 | 0.1%
5000000 | 8 | 0.1%
20000000 | 8 | 0.1%
Other values (7518) | 7717 | 77.2%

Minimum 10 values

Value | Count | Frequency (%)
-389001283 | 1 | < 0.1%
-368366420 | 1 | < 0.1%
-238576550 | 1 | < 0.1%
-221814932 | 1 | < 0.1%
-221347140 | 1 | < 0.1%
-205582628 | 1 | < 0.1%
-193857280 | 1 | < 0.1%
-189742270 | 1 | < 0.1%
-144004700 | 1 | < 0.1%
-140766677 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
20750082050 | 1 | < 0.1%
7994861235 | 2 | < 0.1%
6273644510 | 1 | < 0.1%
5689478761 | 1 | < 0.1%
5568518331 | 1 | < 0.1%
5518560098 | 2 | < 0.1%
5240090438 | 1 | < 0.1%
4988196611 | 1 | < 0.1%
4859262096 | 1 | < 0.1%
4858442326 | 1 | < 0.1%

Interactions


Correlations

 | 비용명 | 금액
비용명 | 1.000 | 0.207
금액 | 0.207 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
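
Both views are built from the same information: whether each cell is present or missing. The sketch below builds a small 0/1 nullity matrix and per-column counts of present values with plain Python. The function names are hypothetical, and the second record, with a missing 금액, is invented purely to show a gap; this dataset itself has no missing cells.

```python
def nullity_matrix(rows, columns):
    """One 0/1 row per record: 1 where a value is present, 0 where it is None."""
    return [[0 if row.get(col) is None else 1 for col in columns] for row in rows]


def present_by_column(rows, columns):
    """Count the present (non-missing) values in each column."""
    matrix = nullity_matrix(rows, columns)
    return {col: sum(r[i] for r in matrix) for i, col in enumerate(columns)}


columns = ["아파트명", "아파트코드", "비용명", "년월일", "금액"]
rows = [
    # First record taken from the report's sample table.
    {"아파트명": "신림동부", "아파트코드": "A15101101", "비용명": "선급비용",
     "년월일": "201909", "금액": 9855975},
    # Hypothetical record with a missing amount, for illustration only.
    {"아파트명": "무악현대", "아파트코드": "A11081503", "비용명": "수선유지비충당부채",
     "년월일": "201909", "금액": None},
]

print(nullity_matrix(rows, columns))
print(present_by_column(rows, columns))
```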

Sample

Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
49792 | 신림동부 | A15101101 | 선급비용 | 201909 | 9855975
35328 | 잠실동트리지움 | A13822002 | 기타충당부채 | 201909 | 0
2304 | 래미안힐스테이트 고덕 | A10027207 | 장기수선충당부채 | 201909 | 1675805012
5187 | 무악현대 | A11081503 | 수선유지비충당부채 | 201909 | 0
302 | 항동하버라인3단지 | A10025614 | 미수관리비예치금 | 201909 | 2409000
10559 | 신사한신휴플러스 | A12208103 | 상여충당부채 | 201909 | 0
53063 | 고척서울가든 | A15282810 | 승강기유지비충당부채 | 201909 | 0
36544 | 거여우방 | A13881601 | 수선유지비충당부채 | 201909 | 460600
19210 | 마장삼성 | A13305006 | 단기보증금 | 201909 | 1043500
64359 | 신월대림 | A15883803 | 예금 | 201909 | 39250274

Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
8528 | 서강한화오벨리스크스위트 | A12177801 | 당기순이익 | 201909 | 26871576
41920 | 중계경남아너스빌 | A13986703 | 연차수당충당부채 | 201909 | 5597705
24560 | 청담삼성1차 | A13510001 | 장기수선충당부채 | 201909 | 198396295
11458 | 수색대림한숲타운 | A12287204 | 정화조관리비충당부채 | 201909 | 8177900
51570 | 신구로자이 | A15205508 | 예금 | 201909 | 85934922
45296 | 자양현대5차 | A14319203 | 수선유지비충당부채 | 201909 | 17556430
9289 | 성산시영아파트 | A12185004 | 연차수당충당부채 | 201909 | 34335840
21934 | 성내성안청구 | A13403002 | 관리비예치금 | 201909 | 50485000
8041 | 상암월드컵파크4단지 | A12127006 | 선급비용 | 201909 | 30279200
48246 | 당산반도유보라 | A15072201 | 예수금 | 201909 | 3147221