Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
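
The average record size appears to be the total memory divided by the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes (the variable names are illustrative, not taken from the profiling tool):

```python
# Figures copied from the dataset statistics in this report.
total_kib = 488.3      # total size in memory, KiB
n_rows = 10_000        # number of observations

total_bytes = total_kib * 1024            # assumption: 1 KiB = 1024 bytes
avg_record_bytes = total_bytes / n_rows   # bytes per row

print(f"{avg_record_bytes:.1f} B")  # 50.0 B
```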

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15820/S/1/datasetView.do

Alerts

년월일 has constant value "202111" (Constant)
금액 has 2318 (23.2%) zeros (Zeros)
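
Both alerts can be reproduced by hand with simple column checks. The sketch below is only illustrative: the helper names are hypothetical, the data are the first ten rows of the sample table in this report, and the profiling tool's own thresholds are not shown here.

```python
from typing import Sequence


def is_constant(values: Sequence) -> bool:
    """True when every value in the column is identical."""
    return len(set(values)) == 1


def zero_fraction(values: Sequence[float]) -> float:
    """Share of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


# 년월일 and 금액 for the first ten sample rows of this report.
year_month = ["202111"] * 10
amount = [3032020, 0, 951490, 0, 31761660,
          4293180, 33137727, 737748211, 1812709, 0]

print(is_constant(year_month))  # True
print(zero_fraction(amount))    # 0.3
```

On the full dataset the same zero check would give 2318 / 10000 = 23.2%, the figure in the alert.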

Reproduction

Analysis started: 2024-05-11 05:58:28.364310
Analysis finished: 2024-05-11 05:58:29.496307
Duration: 1.13 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명 (apartment name)
Text

Distinct: 2151
Distinct (%): 21.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 20
Mean length: 7.3642
Min length: 2

Characters and Unicode

Total characters: 73642
Distinct characters: 434
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
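
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using the five sample values reported for this variable. It is not the profiling tool's own code. The two-letter codes map to the names used in this report, for example `Lo` is Other Letter and `Nd` is Decimal Number.

```python
import unicodedata
from collections import Counter

# The five sample values reported for this variable.
names = [
    "방화대림E편한세상",
    "신도림대림5차e-편한세상",
    "서초삼풍",
    "관악우방",
    "휘경 미소지움아파트",
]


def category_counts(strings):
    """Count Unicode general categories over all characters (NFC-normalized)."""
    return Counter(
        unicodedata.category(ch)
        for s in strings
        for ch in unicodedata.normalize("NFC", s)
    )


counts = category_counts(names)
print(counts["Lo"])  # 35 Hangul syllables, all in category Other Letter
```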

Unique

Unique: 89
Unique (%): 0.9%
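
The report distinguishes "distinct" from "unique". As far as can be told from the figures, distinct counts the different values present, while unique counts values that occur exactly once. A small sketch under that assumption, with a hypothetical helper name and toy data:

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


example = ["a", "b", "b", "c", "c", "c"]
print(distinct_and_unique(example))  # (3, 1)
```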

Sample

1st row: 방화대림E편한세상
2nd row: 신도림대림5차e-편한세상
3rd row: 서초삼풍
4th row: 관악우방
5th row: 휘경 미소지움아파트

Value | Count | Frequency (%)
아파트 | 171 | 1.6%
래미안 | 41 | 0.4%
아이파크 | 24 | 0.2%
e편한세상 | 23 | 0.2%
sk뷰 | 19 | 0.2%
고덕 | 17 | 0.2%
신반포 | 16 | 0.1%
백련산 | 15 | 0.1%
신내동성1차2차 | 15 | 0.1%
푸르지오 | 15 | 0.1%
Other values (2223) | 10401 | 96.7%

Most occurring characters

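Value | Count | Frequency (%)
(glyph lost) | 2539 | 3.4%
(glyph lost) | 2509 | 3.4%
(glyph lost) | 2279 | 3.1%
(glyph lost) | 1749 | 2.4%
(glyph lost) | 1741 | 2.4%
(glyph lost) | 1657 | 2.3%
(glyph lost) | 1506 | 2.0%
(glyph lost) | 1467 | 2.0%
(glyph lost) | 1375 | 1.9%
(glyph lost) | 1373 | 1.9%
Other values (424) | 55447 | 75.3%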

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 67406 | 91.5%
Decimal Number | 3638 | 4.9%
Space Separator | 826 | 1.1%
Uppercase Letter | 790 | 1.1%
Lowercase Letter | 398 | 0.5%
Open Punctuation | 167 | 0.2%
Close Punctuation | 167 | 0.2%
Dash Punctuation | 141 | 0.2%
Other Punctuation | 106 | 0.1%
Letter Number | 3 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(glyph lost) | 2539 | 3.8%
(glyph lost) | 2509 | 3.7%
(glyph lost) | 2279 | 3.4%
(glyph lost) | 1749 | 2.6%
(glyph lost) | 1741 | 2.6%
(glyph lost) | 1657 | 2.5%
(glyph lost) | 1506 | 2.2%
(glyph lost) | 1467 | 2.2%
(glyph lost) | 1375 | 2.0%
(glyph lost) | 1373 | 2.0%
Other values (379) | 49211 | 73.0%

Uppercase Letter
Value | Count | Frequency (%)
S | 139 | 17.6%
C | 103 | 13.0%
K | 86 | 10.9%
D | 78 | 9.9%
M | 78 | 9.9%
L | 69 | 8.7%
H | 58 | 7.3%
E | 36 | 4.6%
G | 36 | 4.6%
I | 31 | 3.9%
Other values (7) | 76 | 9.6%

Lowercase Letter
Value | Count | Frequency (%)
e | 202 | 50.8%
l | 40 | 10.1%
i | 35 | 8.8%
k | 25 | 6.3%
s | 25 | 6.3%
v | 24 | 6.0%
c | 16 | 4.0%
h | 8 | 2.0%
g | 8 | 2.0%
a | 8 | 2.0%

Decimal Number
Value | Count | Frequency (%)
1 | 1080 | 29.7%
2 | 1075 | 29.5%
3 | 499 | 13.7%
4 | 245 | 6.7%
5 | 197 | 5.4%
6 | 179 | 4.9%
7 | 117 | 3.2%
9 | 85 | 2.3%
0 | 83 | 2.3%
8 | 78 | 2.1%

Other Punctuation
Value | Count | Frequency (%)
, | 84 | 79.2%
. | 22 | 20.8%

Space Separator
Value | Count | Frequency (%)
(space) | 826 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 167 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 167 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 141 | 100.0%

Letter Number
Value | Count | Frequency (%)
(glyph lost) | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 67406 | 91.5%
Common | 5045 | 6.9%
Latin | 1191 | 1.6%

Most frequent character per script

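Hangul
Value | Count | Frequency (%)
(glyph lost) | 2539 | 3.8%
(glyph lost) | 2509 | 3.7%
(glyph lost) | 2279 | 3.4%
(glyph lost) | 1749 | 2.6%
(glyph lost) | 1741 | 2.6%
(glyph lost) | 1657 | 2.5%
(glyph lost) | 1506 | 2.2%
(glyph lost) | 1467 | 2.2%
(glyph lost) | 1375 | 2.0%
(glyph lost) | 1373 | 2.0%
Other values (379) | 49211 | 73.0%

Latin
Value | Count | Frequency (%)
e | 202 | 17.0%
S | 139 | 11.7%
C | 103 | 8.6%
K | 86 | 7.2%
D | 78 | 6.5%
M | 78 | 6.5%
L | 69 | 5.8%
H | 58 | 4.9%
l | 40 | 3.4%
E | 36 | 3.0%
Other values (19) | 302 | 25.4%

Common
Value | Count | Frequency (%)
1 | 1080 | 21.4%
2 | 1075 | 21.3%
(space) | 826 | 16.4%
3 | 499 | 9.9%
4 | 245 | 4.9%
5 | 197 | 3.9%
6 | 179 | 3.5%
( | 167 | 3.3%
) | 167 | 3.3%
- | 141 | 2.8%
Other values (6) | 469 | 9.3%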

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 67406 | 91.5%
ASCII | 6233 | 8.5%
Number Forms | 3 | < 0.1%

Most frequent character per block

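Hangul
Value | Count | Frequency (%)
(glyph lost) | 2539 | 3.8%
(glyph lost) | 2509 | 3.7%
(glyph lost) | 2279 | 3.4%
(glyph lost) | 1749 | 2.6%
(glyph lost) | 1741 | 2.6%
(glyph lost) | 1657 | 2.5%
(glyph lost) | 1506 | 2.2%
(glyph lost) | 1467 | 2.2%
(glyph lost) | 1375 | 2.0%
(glyph lost) | 1373 | 2.0%
Other values (379) | 49211 | 73.0%

ASCII
Value | Count | Frequency (%)
1 | 1080 | 17.3%
2 | 1075 | 17.2%
(space) | 826 | 13.3%
3 | 499 | 8.0%
4 | 245 | 3.9%
e | 202 | 3.2%
5 | 197 | 3.2%
6 | 179 | 2.9%
( | 167 | 2.7%
) | 167 | 2.7%
Other values (34) | 1596 | 25.6%

Number Forms
Value | Count | Frequency (%)
(glyph lost) | 3 | 100.0%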

아파트코드 (apartment code)
Text

Distinct: 2156
Distinct (%): 21.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 89
Unique (%): 0.9%

Sample

1st row: A15722204
2nd row: A15288805
3rd row: A13792001
4th row: A15303203
5th row: A13077702

Value | Count | Frequency (%)
a13186708 | 15 | 0.1%
a13881701 | 14 | 0.1%
a14085002 | 14 | 0.1%
a10025263 | 12 | 0.1%
a13407104 | 12 | 0.1%
a13402003 | 12 | 0.1%
a15081002 | 12 | 0.1%
a12282203 | 12 | 0.1%
a14003106 | 12 | 0.1%
a13920506 | 12 | 0.1%
Other values (2146) | 9873 | 98.7%

Most occurring characters

Value | Count | Frequency (%)
0 | 18629 | 20.7%
1 | 17635 | 19.6%
A | 10000 | 11.1%
3 | 8731 | 9.7%
2 | 8377 | 9.3%
5 | 6235 | 6.9%
8 | 5531 | 6.1%
7 | 4607 | 5.1%
4 | 4012 | 4.5%
6 | 3340 | 3.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18629 | 23.3%
1 | 17635 | 22.0%
3 | 8731 | 10.9%
2 | 8377 | 10.5%
5 | 6235 | 7.8%
8 | 5531 | 6.9%
7 | 4607 | 5.8%
4 | 4012 | 5.0%
6 | 3340 | 4.2%
9 | 2903 | 3.6%

Uppercase Letter
Value | Count | Frequency (%)
A | 10000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 88.9%
Latin | 10000 | 11.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18629 | 23.3%
1 | 17635 | 22.0%
3 | 8731 | 10.9%
2 | 8377 | 10.5%
5 | 6235 | 7.8%
8 | 5531 | 6.9%
7 | 4607 | 5.8%
4 | 4012 | 5.0%
6 | 3340 | 4.2%
9 | 2903 | 3.6%

Latin
Value | Count | Frequency (%)
A | 10000 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18629 | 20.7%
1 | 17635 | 19.6%
A | 10000 | 11.1%
3 | 8731 | 9.7%
2 | 8377 | 9.3%
5 | 6235 | 6.9%
8 | 5531 | 6.1%
7 | 4607 | 5.1%
4 | 4012 | 4.5%
6 | 3340 | 3.7%

비용명 (expense item name)
Text

Distinct: 77
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 10
Mean length: 5.9536
Min length: 2

Characters and Unicode

Total characters: 59536
Distinct characters: 107
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 가수금
2nd row: 상여충당부채
3rd row: 복리후생비충당부채
4th row: 미처분이익잉여금
5th row: 퇴직급여충당부채

Value | Count | Frequency (%)
미처분이익잉여금 | 328 | 3.3%
당기순이익 | 317 | 3.2%
관리비미수금 | 312 | 3.1%
선급비용 | 309 | 3.1%
공동주택적립금 | 307 | 3.1%
연차수당충당부채 | 307 | 3.1%
비품 | 305 | 3.0%
예금 | 299 | 3.0%
퇴직급여충당부채 | 299 | 3.0%
예수금 | 298 | 3.0%
Other values (67) | 6919 | 69.2%

Most occurring characters

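Value | Count | Frequency (%)
(glyph lost) | 4565 | 7.7%
(glyph lost) | 3847 | 6.5%
(glyph lost) | 3121 | 5.2%
(glyph lost) | 3046 | 5.1%
(glyph lost) | 3035 | 5.1%
(glyph lost) | 2960 | 5.0%
(glyph lost) | 2656 | 4.5%
(glyph lost) | 2485 | 4.2%
(glyph lost) | 1890 | 3.2%
(glyph lost) | 1697 | 2.9%
Other values (97) | 30234 | 50.8%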

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 59536 | 100.0%

Most frequent character per category

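Other Letter
Value | Count | Frequency (%)
(glyph lost) | 4565 | 7.7%
(glyph lost) | 3847 | 6.5%
(glyph lost) | 3121 | 5.2%
(glyph lost) | 3046 | 5.1%
(glyph lost) | 3035 | 5.1%
(glyph lost) | 2960 | 5.0%
(glyph lost) | 2656 | 4.5%
(glyph lost) | 2485 | 4.2%
(glyph lost) | 1890 | 3.2%
(glyph lost) | 1697 | 2.9%
Other values (97) | 30234 | 50.8%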

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 59536 | 100.0%

Most frequent character per script

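Hangul
Value | Count | Frequency (%)
(glyph lost) | 4565 | 7.7%
(glyph lost) | 3847 | 6.5%
(glyph lost) | 3121 | 5.2%
(glyph lost) | 3046 | 5.1%
(glyph lost) | 3035 | 5.1%
(glyph lost) | 2960 | 5.0%
(glyph lost) | 2656 | 4.5%
(glyph lost) | 2485 | 4.2%
(glyph lost) | 1890 | 3.2%
(glyph lost) | 1697 | 2.9%
Other values (97) | 30234 | 50.8%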

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 59536 | 100.0%

Most frequent character per block

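Hangul
Value | Count | Frequency (%)
(glyph lost) | 4565 | 7.7%
(glyph lost) | 3847 | 6.5%
(glyph lost) | 3121 | 5.2%
(glyph lost) | 3046 | 5.1%
(glyph lost) | 3035 | 5.1%
(glyph lost) | 2960 | 5.0%
(glyph lost) | 2656 | 4.5%
(glyph lost) | 2485 | 4.2%
(glyph lost) | 1890 | 3.2%
(glyph lost) | 1697 | 2.9%
Other values (97) | 30234 | 50.8%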

년월일 (date)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
202111 | 10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202111
2nd row: 202111
3rd row: 202111
4th row: 202111
5th row: 202111

Common Values

Value | Count | Frequency (%)
202111 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
202111 | 10000 | 100.0%

금액 (amount)
Real number (ℝ)

ZEROS

Distinct: 7341
Distinct (%): 73.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 72390703
Minimum: -3.6943877 × 10^8
Maximum: 6.7024207 × 10^9
Zeros: 2318
Zeros (%): 23.2%
Negative: 323
Negative (%): 3.2%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -3.6943877 × 10^8
5-th percentile: 0
Q1: 0
Median: 3005049
Q3: 34301188
95-th percentile: 3.4176207 × 10^8
Maximum: 6.7024207 × 10^9
Range: 7.0718594 × 10^9
Interquartile range (IQR): 34301188
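
The range and interquartile range follow directly from the other quantiles. A minimal check of that arithmetic, using the exact minimum and maximum from the extreme-value listings in this report (the variable names are illustrative):

```python
# Exact extremes taken from the smallest and largest values listed in this report.
minimum = -369_438_768
maximum = 6_702_420_662

# Quartiles as printed in the quantile statistics.
q1 = 0
q3 = 34_301_188

value_range = maximum - minimum   # range = max - min
iqr = q3 - q1                     # interquartile range = Q3 - Q1

print(value_range)  # 7071859430, i.e. about 7.0718594 × 10^9
print(iqr)          # 34301188
```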

Descriptive statistics

Standard deviation: 2.7537762 × 10^8
Coefficient of variation (CV): 3.8040468
Kurtosis: 142.33877
Mean: 72390703
Median Absolute Deviation (MAD): 3005049
Skewness: 9.9056155
Sum: 7.2390703 × 10^11
Variance: 7.5832836 × 10^16
Monotonicity: Not monotonic
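
Several of these figures are tied together by simple identities: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. The sketch below checks the printed values against those identities. Because the inputs are already rounded in the report, agreement is only expected to a few significant digits.

```python
import math

# Values as printed in this report (already rounded).
mean = 72_390_703
std = 2.7537762e8
n = 10_000

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the square of the standard deviation
total = mean * n       # sum of all values

print(math.isclose(cv, 3.8040468, rel_tol=1e-6))           # True
print(math.isclose(variance, 7.5832836e16, rel_tol=1e-6))  # True
print(math.isclose(total, 7.2390703e11, rel_tol=1e-6))     # True
```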
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0 | 2318 | 23.2%
500000 | 36 | 0.4%
242000 | 17 | 0.2%
250000 | 16 | 0.2%
484000 | 13 | 0.1%
300000 | 13 | 0.1%
1000000 | 12 | 0.1%
100000 | 12 | 0.1%
200000 | 11 | 0.1%
2000000 | 10 | 0.1%
Other values (7331) | 7542 | 75.4%

Value | Count | Frequency (%)
-369438768 | 1 | < 0.1%
-264005590 | 1 | < 0.1%
-257913346 | 1 | < 0.1%
-230922000 | 1 | < 0.1%
-201330000 | 1 | < 0.1%
-167011730 | 1 | < 0.1%
-139259226 | 1 | < 0.1%
-133221705 | 1 | < 0.1%
-119527363 | 1 | < 0.1%
-106213220 | 1 | < 0.1%

Value | Count | Frequency (%)
6702420662 | 1 | < 0.1%
5447921597 | 1 | < 0.1%
5230947921 | 1 | < 0.1%
5168126591 | 1 | < 0.1%
4921004897 | 1 | < 0.1%
4904836096 | 1 | < 0.1%
4873014602 | 1 | < 0.1%
3811349718 | 1 | < 0.1%
3733288838 | 1 | < 0.1%
3727769325 | 1 | < 0.1%

Interactions


Correlations

Variable | 비용명 | 금액
비용명 | 1.000 | 0.530
금액 | 0.530 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
63078 | 방화대림E편한세상 | A15722204 | 가수금 | 202111 | 3032020
57659 | 신도림대림5차e-편한세상 | A15288805 | 상여충당부채 | 202111 | 0
36901 | 서초삼풍 | A13792001 | 복리후생비충당부채 | 202111 | 951490
58083 | 관악우방 | A15303203 | 미처분이익잉여금 | 202111 | 0
15968 | 휘경 미소지움아파트 | A13077702 | 퇴직급여충당부채 | 202111 | 31761660
58406 | 독산계룡 | A15381402 | 예수금 | 202111 | 4293180
45979 | 하계한신 | A13993503 | 수선유지비충당부채 | 202111 | 33137727
58250 | 독산주공14단지 | A15375809 | 장기수선충당예금 | 202111 | 737748211
68687 | 은평뉴타운마고정11단지 | A41279913 | 선급금 | 202111 | 1812709
62513 | 마곡금호어울림 | A15721001 | 기타유동부채 | 202111 | 0

Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
20315 | 창동금용 | A13204201 | 장기수선충당부채 | 202111 | 405745967
41891 | 하계2차현대아파트 | A13923106 | 선급금 | 202111 | 1833310
3462 | 항동하버라인8단지 | A10025858 | 예금 | 202111 | 141547079
65419 | 목동3단지 | A15805003 | 선수전기료 | 202111 | 2452315
39222 | 잠실5단지아파트 | A13879102 | 승강기유지비충당부채 | 202111 | 36590440
27203 | 동양파라곤 | A13501001 | 미처분이익잉여금 | 202111 | 4016006
27438 | 역삼아이파크 | A13508009 | 미처분이익잉여금 | 202111 | 0
20413 | 창동태영데시앙 | A13204205 | 장기수선충당예금 | 202111 | 868152202
34657 | 래미안 서초스위트 아파트 | A13707009 | 선급비용 | 202111 | 1112170
66449 | 신정동일하이빌 | A15807315 | 장기수선충당예금 | 202111 | 891594389