Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
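
The average record size follows from the two figures above it: total memory divided by the number of observations. The short sketch below redoes that arithmetic, assuming the report uses binary units (1 KiB = 1024 B); the function name is illustrative and not part of any library.

```python
# Figures copied from the "Dataset statistics" block of this report.
TOTAL_SIZE_KIB = 488.3
N_OBSERVATIONS = 10_000


def average_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Convert KiB to bytes (assuming 1 KiB = 1024 B) and divide by the row count."""
    return total_kib * 1024 / n_rows


size = average_record_size_bytes(TOTAL_SIZE_KIB, N_OBSERVATIONS)
print(f"{size:.1f} B per record")
```

Rounded to one decimal place, the result agrees with the 50.0 B reported.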

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download (파일 다운로드)
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15820/S/1/datasetView.do

Alerts

년월일 (year-month) has constant value "202108" [Constant]
금액 (amount) has 2377 (23.8%) zeros [Zeros]

Reproduction

Analysis started: 2024-05-11 05:58:48.619380
Analysis finished: 2024-05-11 05:58:49.599807
Duration: 0.98 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명 (apartment name)
Text

Distinct: 2212
Distinct (%): 22.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 20
Mean length: 7.2939
Min length: 2

Characters and Unicode

Total characters: 72939
Distinct characters: 436
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
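
As a rough illustration of how such category counts can be derived, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own code. The module returns two-letter codes (`Lo` for Other Letter, `Nd` for Decimal Number, `Zs` for Space Separator, and so on) and does not expose script or block, so only the category breakdown is covered here. The helper name `category_counts` is an assumption for this example.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters in `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# One of the sample values from this variable: Hangul syllables plus two digits.
counts = category_counts("송파파인타운10단지")
for code, n in counts.most_common():
    print(code, n)
```

Summing these per-value counters over the whole column would give a table of the same shape as "Most occurring categories".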

Unique

Unique: 103
Unique (%): 1.0%
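
"Distinct" and "Unique" are easy to confuse. The sketch below assumes the usual reading of these two fields: distinct counts every different value, while unique counts only the values that occur exactly once. The function name and the toy input are illustrative only.

```python
from collections import Counter
from typing import Iterable, Tuple


def distinct_and_unique(values: Iterable[str]) -> Tuple[int, int]:
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Toy input built from names that appear in this report's sample rows.
toy = ["사당극동", "사당극동", "하계한신", "정릉대우", "정릉대우", "이문쌍용"]
print(distinct_and_unique(toy))
```

Under that reading, 2212 distinct names with only 103 unique ones means most apartment names recur across several rows.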

Sample

1st row: 신내중앙하이츠
2nd row: 송파파인타운10단지
3rd row: 신월대성유니드
4th row: 고덕아이파크아파트
5th row: 사당극동

Value | Count | Frequency (%)
아파트 | 143 | 1.3%
래미안 | 39 | 0.4%
e편한세상 | 19 | 0.2%
팰리스 | 17 | 0.2%
해모로 | 16 | 0.1%
고덕 | 16 | 0.1%
래미안밤섬리베뉴 | 14 | 0.1%
북한산 | 14 | 0.1%
아이파크 | 14 | 0.1%
우리유앤미 | 13 | 0.1%
Other values (2285) | 10374 | 97.1%

Most occurring characters

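Value | Count | Frequency (%)
(not shown) | 2430 | 3.3%
(not shown) | 2407 | 3.3%
(not shown) | 2214 | 3.0%
(not shown) | 1832 | 2.5%
(not shown) | 1778 | 2.4%
(not shown) | 1712 | 2.3%
(not shown) | 1476 | 2.0%
(not shown) | 1448 | 2.0%
(not shown) | 1412 | 1.9%
(not shown) | 1314 | 1.8%
Other values (426) | 54916 | 75.3%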

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 66941 | 91.8%
Decimal Number | 3622 | 5.0%
Space Separator | 767 | 1.1%
Uppercase Letter | 758 | 1.0%
Lowercase Letter | 320 | 0.4%
Dash Punctuation | 144 | 0.2%
Open Punctuation | 139 | 0.2%
Close Punctuation | 139 | 0.2%
Other Punctuation | 100 | 0.1%
Letter Number | 9 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not shown) | 2430 | 3.6%
(not shown) | 2407 | 3.6%
(not shown) | 2214 | 3.3%
(not shown) | 1832 | 2.7%
(not shown) | 1778 | 2.7%
(not shown) | 1712 | 2.6%
(not shown) | 1476 | 2.2%
(not shown) | 1448 | 2.2%
(not shown) | 1412 | 2.1%
(not shown) | 1314 | 2.0%
Other values (381) | 48918 | 73.1%

Uppercase Letter
Value | Count | Frequency (%)
S | 135 | 17.8%
K | 102 | 13.5%
C | 91 | 12.0%
L | 62 | 8.2%
M | 56 | 7.4%
D | 56 | 7.4%
H | 46 | 6.1%
I | 41 | 5.4%
G | 39 | 5.1%
E | 38 | 5.0%
Other values (7) | 92 | 12.1%

Lowercase Letter
Value | Count | Frequency (%)
e | 197 | 61.6%
l | 24 | 7.5%
i | 22 | 6.9%
k | 18 | 5.6%
v | 17 | 5.3%
s | 16 | 5.0%
c | 10 | 3.1%
w | 7 | 2.2%
h | 3 | 0.9%
g | 3 | 0.9%

Decimal Number
Value | Count | Frequency (%)
1 | 1101 | 30.4%
2 | 1054 | 29.1%
3 | 455 | 12.6%
4 | 259 | 7.2%
5 | 200 | 5.5%
6 | 144 | 4.0%
7 | 136 | 3.8%
8 | 99 | 2.7%
9 | 92 | 2.5%
0 | 82 | 2.3%

Other Punctuation
Value | Count | Frequency (%)
, | 79 | 79.0%
. | 21 | 21.0%

Space Separator
Value | Count | Frequency (%)
(space) | 767 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 144 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 139 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 139 | 100.0%

Letter Number
Value | Count | Frequency (%)
(not shown) | 9 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 66941 | 91.8%
Common | 4911 | 6.7%
Latin | 1087 | 1.5%

Most frequent character per script

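Hangul
Value | Count | Frequency (%)
(not shown) | 2430 | 3.6%
(not shown) | 2407 | 3.6%
(not shown) | 2214 | 3.3%
(not shown) | 1832 | 2.7%
(not shown) | 1778 | 2.7%
(not shown) | 1712 | 2.6%
(not shown) | 1476 | 2.2%
(not shown) | 1448 | 2.2%
(not shown) | 1412 | 2.1%
(not shown) | 1314 | 2.0%
Other values (381) | 48918 | 73.1%

Latin
Value | Count | Frequency (%)
e | 197 | 18.1%
S | 135 | 12.4%
K | 102 | 9.4%
C | 91 | 8.4%
L | 62 | 5.7%
M | 56 | 5.2%
D | 56 | 5.2%
H | 46 | 4.2%
I | 41 | 3.8%
G | 39 | 3.6%
Other values (19) | 262 | 24.1%

Common
Value | Count | Frequency (%)
1 | 1101 | 22.4%
2 | 1054 | 21.5%
(space) | 767 | 15.6%
3 | 455 | 9.3%
4 | 259 | 5.3%
5 | 200 | 4.1%
- | 144 | 2.9%
6 | 144 | 2.9%
( | 139 | 2.8%
) | 139 | 2.8%
Other values (6) | 509 | 10.4%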

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 66941 | 91.8%
ASCII | 5989 | 8.2%
Number Forms | 9 | < 0.1%

Most frequent character per block

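Hangul
Value | Count | Frequency (%)
(not shown) | 2430 | 3.6%
(not shown) | 2407 | 3.6%
(not shown) | 2214 | 3.3%
(not shown) | 1832 | 2.7%
(not shown) | 1778 | 2.7%
(not shown) | 1712 | 2.6%
(not shown) | 1476 | 2.2%
(not shown) | 1448 | 2.2%
(not shown) | 1412 | 2.1%
(not shown) | 1314 | 2.0%
Other values (381) | 48918 | 73.1%

ASCII
Value | Count | Frequency (%)
1 | 1101 | 18.4%
2 | 1054 | 17.6%
(space) | 767 | 12.8%
3 | 455 | 7.6%
4 | 259 | 4.3%
5 | 200 | 3.3%
e | 197 | 3.3%
- | 144 | 2.4%
6 | 144 | 2.4%
( | 139 | 2.3%
Other values (34) | 1529 | 25.5%

Number Forms
Value | Count | Frequency (%)
(not shown) | 9 | 100.0%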

아파트코드 (apartment code)
Text

Distinct: 2217
Distinct (%): 22.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 103
Unique (%): 1.0%

Sample

1st row: A13186907
2nd row: A13821005
3rd row: A15809403
4th row: A13408003
5th row: A15681503

Value | Count | Frequency (%)
a14001101 | 12 | 0.1%
a15608002 | 12 | 0.1%
a13985201 | 12 | 0.1%
a15805105 | 12 | 0.1%
a13980019 | 12 | 0.1%
a13679403 | 11 | 0.1%
a15386506 | 11 | 0.1%
a13184208 | 11 | 0.1%
a13606002 | 11 | 0.1%
a13681304 | 11 | 0.1%
Other values (2207) | 9885 | 98.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 18468 | 20.5%
1 | 17658 | 19.6%
A | 9992 | 11.1%
3 | 8581 | 9.5%
2 | 8385 | 9.3%
5 | 6149 | 6.8%
8 | 5693 | 6.3%
7 | 4758 | 5.3%
4 | 3964 | 4.4%
6 | 3408 | 3.8%
Other values (2) | 2944 | 3.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18468 | 23.1%
1 | 17658 | 22.1%
3 | 8581 | 10.7%
2 | 8385 | 10.5%
5 | 6149 | 7.7%
8 | 5693 | 7.1%
7 | 4758 | 5.9%
4 | 3964 | 5.0%
6 | 3408 | 4.3%
9 | 2936 | 3.7%

Uppercase Letter
Value | Count | Frequency (%)
A | 9992 | 99.9%
B | 8 | 0.1%
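
Taken together, these figures suggest a fixed code layout: every value is 9 characters long, and the 10,000 values contain exactly 10,000 uppercase letters (A or B) and 80,000 digits. The sample rows are consistent with one leading letter followed by eight digits. The sketch below checks a value against that inferred pattern. The pattern is an assumption drawn from the statistics, not a published specification, and the names are illustrative.

```python
import re

# Assumed layout: one uppercase ASCII letter, then eight ASCII digits.
# Inferred from the character statistics in this report, not from a spec.
APARTMENT_CODE = re.compile(r"[A-Z][0-9]{8}")


def is_well_formed_code(code: str) -> bool:
    """Return True if the whole string matches the assumed code layout."""
    return APARTMENT_CODE.fullmatch(code) is not None


for candidate in ["A13186907", "a14001101", "A1318690"]:
    print(candidate, is_well_formed_code(candidate))
```

The word-frequency table for this variable lists codes in lowercase (for example a14001101) while the sample rows are uppercase, so a check like this would presumably be applied to the raw column rather than to the tokens shown in that table.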

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 88.9%
Latin | 10000 | 11.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18468 | 23.1%
1 | 17658 | 22.1%
3 | 8581 | 10.7%
2 | 8385 | 10.5%
5 | 6149 | 7.7%
8 | 5693 | 7.1%
7 | 4758 | 5.9%
4 | 3964 | 5.0%
6 | 3408 | 4.3%
9 | 2936 | 3.7%

Latin
Value | Count | Frequency (%)
A | 9992 | 99.9%
B | 8 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18468 | 20.5%
1 | 17658 | 19.6%
A | 9992 | 11.1%
3 | 8581 | 9.5%
2 | 8385 | 9.3%
5 | 6149 | 6.8%
8 | 5693 | 6.3%
7 | 4758 | 5.3%
4 | 3964 | 4.4%
6 | 3408 | 3.8%
Other values (2) | 2944 | 3.3%

비용명 (expense item name)
Text

Distinct: 77
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 9
Mean length: 5.9956
Min length: 2

Characters and Unicode

Total characters: 59956
Distinct characters: 107
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 경비비충당부채
2nd row: 주차장충당예금
3rd row: 관리비예치금
4th row: 미수관리비예치금
5th row: 전신전화가입권

Value | Count | Frequency (%)
관리비미수금 | 329 | 3.3%
예금 | 325 | 3.2%
당기순이익 | 315 | 3.1%
예수금 | 310 | 3.1%
미처분이익잉여금 | 307 | 3.1%
공동주택적립금 | 306 | 3.1%
미부과관리비 | 305 | 3.0%
선급비용 | 304 | 3.0%
퇴직급여충당부채 | 300 | 3.0%
연차수당충당부채 | 296 | 3.0%
Other values (67) | 6903 | 69.0%

Most occurring characters

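Value | Count | Frequency (%)
(not shown) | 4697 | 7.8%
(not shown) | 3819 | 6.4%
(not shown) | 3096 | 5.2%
(not shown) | 3041 | 5.1%
(not shown) | 3019 | 5.0%
(not shown) | 2942 | 4.9%
(not shown) | 2618 | 4.4%
(not shown) | 2409 | 4.0%
(not shown) | 1803 | 3.0%
(not shown) | 1792 | 3.0%
Other values (97) | 30720 | 51.2%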

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 59956 | 100.0%

Most frequent character per category

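Other Letter
Value | Count | Frequency (%)
(not shown) | 4697 | 7.8%
(not shown) | 3819 | 6.4%
(not shown) | 3096 | 5.2%
(not shown) | 3041 | 5.1%
(not shown) | 3019 | 5.0%
(not shown) | 2942 | 4.9%
(not shown) | 2618 | 4.4%
(not shown) | 2409 | 4.0%
(not shown) | 1803 | 3.0%
(not shown) | 1792 | 3.0%
Other values (97) | 30720 | 51.2%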

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 59956 | 100.0%

Most frequent character per script

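Hangul
Value | Count | Frequency (%)
(not shown) | 4697 | 7.8%
(not shown) | 3819 | 6.4%
(not shown) | 3096 | 5.2%
(not shown) | 3041 | 5.1%
(not shown) | 3019 | 5.0%
(not shown) | 2942 | 4.9%
(not shown) | 2618 | 4.4%
(not shown) | 2409 | 4.0%
(not shown) | 1803 | 3.0%
(not shown) | 1792 | 3.0%
Other values (97) | 30720 | 51.2%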

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 59956 | 100.0%

Most frequent character per block

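Hangul
Value | Count | Frequency (%)
(not shown) | 4697 | 7.8%
(not shown) | 3819 | 6.4%
(not shown) | 3096 | 5.2%
(not shown) | 3041 | 5.1%
(not shown) | 3019 | 5.0%
(not shown) | 2942 | 4.9%
(not shown) | 2618 | 4.4%
(not shown) | 2409 | 4.0%
(not shown) | 1803 | 3.0%
(not shown) | 1792 | 3.0%
Other values (97) | 30720 | 51.2%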

년월일 (year-month)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
202108 | 10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202108
2nd row: 202108
3rd row: 202108
4th row: 202108
5th row: 202108

Common Values

Value | Count | Frequency (%)
202108 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
202108 | 10000 | 100.0%

금액 (amount)
Real number (ℝ)

ZEROS

Distinct: 7330
Distinct (%): 73.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 73334093
Minimum: -4.09024 × 10^9
Maximum: 1.1738153 × 10^10
Zeros: 2377
Zeros (%): 23.8%
Negative: 343
Negative (%): 3.4%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -4.09024 × 10^9
5-th percentile: 0
Q1: 0
Median: 2975505
Q3: 34115385
95-th percentile: 3.5166449 × 10^8
Maximum: 1.1738153 × 10^10
Range: 1.5828393 × 10^10
Interquartile range (IQR): 34115385

Descriptive statistics

Standard deviation: 3.0726014 × 10^8
Coefficient of variation (CV): 4.1898676
Kurtosis: 363.44093
Mean: 73334093
Median Absolute Deviation (MAD): 2975505
Skewness: 13.944817
Sum: 7.3334093 × 10^11
Variance: 9.4408793 × 10^16
Monotonicity: Not monotonic
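
Several of the figures above are tied together by standard definitions: CV is the standard deviation divided by the mean, variance is the standard deviation squared, range is maximum minus minimum, IQR is Q3 minus Q1, and sum is the mean times the number of observations. The sketch below recomputes them from the rounded values printed in this report, so agreement is only expected to within rounding. The function and dictionary keys are illustrative.

```python
# Rounded inputs copied from the statistics for 금액 in this report.
MEAN = 73334093
STD = 3.0726014e8
MINIMUM = -4.09024e9
MAXIMUM = 1.1738153e10
Q1, Q3 = 0, 34115385
N_OBSERVATIONS = 10_000


def derived_statistics() -> dict:
    """Recompute the derived figures from the primary ones."""
    return {
        "cv": STD / MEAN,               # coefficient of variation
        "variance": STD ** 2,
        "range": MAXIMUM - MINIMUM,
        "iqr": Q3 - Q1,
        "sum": MEAN * N_OBSERVATIONS,
    }


# Values as printed in the report, for comparison.
REPORTED = {
    "cv": 4.1898676,
    "variance": 9.4408793e16,
    "range": 1.5828393e10,
    "iqr": 34115385,
    "sum": 7.3334093e11,
}

for name, value in derived_statistics().items():
    rel_err = abs(value - REPORTED[name]) / abs(REPORTED[name])
    print(f"{name}: computed={value:.8g} reported={REPORTED[name]:.8g} rel_err={rel_err:.1e}")
```

A standard deviation more than four times the mean, together with the high skewness and kurtosis, is what one would expect from a column where almost a quarter of the values are zero and a few amounts reach into the billions.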
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 2377 | 23.8%
500000 | 27 | 0.3%
250000 | 18 | 0.2%
3000000 | 16 | 0.2%
300000 | 14 | 0.1%
100000 | 14 | 0.1%
242000 | 13 | 0.1%
484000 | 12 | 0.1%
1000000 | 12 | 0.1%
200000 | 10 | 0.1%
Other values (7320) | 7487 | 74.9%

Minimum 10 values

Value | Count | Frequency (%)
-4090240000 | 1 | < 0.1%
-190693510 | 1 | < 0.1%
-190291536 | 1 | < 0.1%
-158616022 | 1 | < 0.1%
-152882860 | 1 | < 0.1%
-138548880 | 1 | < 0.1%
-135424278 | 1 | < 0.1%
-130563831 | 1 | < 0.1%
-116223936 | 1 | < 0.1%
-109655711 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
11738152789 | 1 | < 0.1%
9014652346 | 1 | < 0.1%
6691255231 | 1 | < 0.1%
6356577257 | 1 | < 0.1%
5985559147 | 1 | < 0.1%
4203166294 | 1 | < 0.1%
3996183711 | 1 | < 0.1%
3948034200 | 1 | < 0.1%
3814810498 | 1 | < 0.1%
3621919722 | 1 | < 0.1%

Interactions

[Figure: interactions plot]

Correlations

Variable | 비용명 | 금액
비용명 | 1.000 | 0.487
금액 | 0.487 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot visual patterns in data completeness.

Sample

First rows

Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
19125 | 신내중앙하이츠 | A13186907 | 경비비충당부채 | 202108 | 37786160
39159 | 송파파인타운10단지 | A13821005 | 주차장충당예금 | 202108 | 0
68690 | 신월대성유니드 | A15809403 | 관리비예치금 | 202108 | 43240000
25938 | 고덕아이파크아파트 | A13408003 | 미수관리비예치금 | 202108 | 210000
62869 | 사당극동 | A15681503 | 전신전화가입권 | 202108 | 0
22646 | 마장신성미소지움 | A13305003 | 기타충당예금 | 202108 | 0
54922 | 신림건영4차 | A15102902 | 기타유동부채 | 202108 | 0
46896 | 하계한신 | A13993503 | 저장품 | 202108 | 370800
32518 | 정릉푸르지오 | A13610202 | 미수금 | 202108 | 0
4026 | 래미안서초에스티지에스아파트 | A10026411 | 경비비충당부채 | 202108 | 26542727

Last rows

Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
55466 | 관악국제산장 | A15176701 | 선수수익 | 202108 | 0
34226 | 정릉대우 | A13676702 | 기타공동주택관리비충당부채 | 202108 | 0
57018 | 영화 아이닉스 | A15209304 | 기타유동부채 | 202108 | 150000
3702 | 힐스테이트청계 | A10026104 | 기타당좌자산 | 202108 | 0
3756 | 문래동모아미래도아파트 | A10026197 | 미지급금 | 202108 | 28957508
33983 | 삼선푸르지오아파트 | A13672101 | 퇴직급여충당부채 | 202108 | 87238787
38345 | 송파파크데일2단지 | A13812005 | 선급금 | 202108 | 1330590
11103 | 상암월드컵파크7단지 | A12127005 | 공동체활성화단체지원적립금 | 202108 | 0
60445 | 신대방경남교수 | A15601102 | 미수관리비예치금 | 202108 | 0
16581 | 이문쌍용 | A13082704 | 선급금 | 202108 | 3751560