Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
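
The average record size follows from the two memory figures above: total size divided by the number of observations. A quick check of that arithmetic, assuming 1 KiB = 1024 bytes (the variable names are illustrative, not part of the report):

```python
# Figures reported in the dataset statistics.
total_size_kib = 488.3
n_observations = 10_000

# Convert KiB to bytes (assuming 1 KiB = 1024 B), then divide by the row count.
avg_record_size_bytes = total_size_kib * 1024 / n_observations

print(f"{avg_record_size_bytes:.1f} B")
```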

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15821/S/1/datasetView.do

Alerts

년월일 has constant value "202312" (Constant)
금액 has 1309 (13.1%) zeros (Zeros)
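
Both alerts come down to simple checks: a column is constant when it holds a single distinct value, and the zeros alert reports the share of entries equal to zero. A minimal sketch on made-up data; the function names are hypothetical and this is not ydata-profiling's own implementation:

```python
def is_constant(values):
    """Return True when every entry holds the same value."""
    return len(set(values)) == 1


def zero_share(values):
    """Return (count, fraction) of entries equal to zero."""
    zeros = sum(1 for v in values if v == 0)
    return zeros, zeros / len(values)


# Toy columns for illustration only, not the real dataset.
dates = ["202312"] * 5
amounts = [0, 120000, 0, 53770, 6322]

print(is_constant(dates))
count, share = zero_share(amounts)
print(f"{count} ({share:.1%}) zeros")
```

Applied to the figures in this report, 1309 zeros out of 10000 rows gives the 13.1% shown in the alert.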

Reproduction

Analysis started: 2024-05-18 02:46:33.366947
Analysis finished: 2024-05-18 02:46:34.784819
Duration: 1.42 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명
Text

Distinct: 2150
Distinct (%): 21.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 20
Mean length: 7.462
Min length: 2

Characters and Unicode

Total characters: 74620
Distinct characters: 434
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
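
To illustrate how such character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using the five sample values listed for this variable. The helper name and the category-name mapping are illustrative assumptions, and this is not the profiler's own code. The standard library exposes the general category but not the script or block, so only categories are shown.

```python
import unicodedata
from collections import Counter

# Readable names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Nl": "Letter Number",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# The five sample values shown for this variable.
sample = ["왕십리풍림아이원", "자양우성2차", "반포미도2차", "당산쌍용예가클래식", "래미안미아1차"]
counts = category_counts(sample)
for name, n in counts.most_common():
    print(name, n)
```

Hangul syllables fall under Other Letter, which is why that category dominates the tables for this variable.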

Unique

Unique: 117
Unique (%): 1.2%

Sample

1st row: 왕십리풍림아이원
2nd row: 자양우성2차
3rd row: 반포미도2차
4th row: 당산쌍용예가클래식
5th row: 래미안미아1차

Value                Count  Frequency (%)
아파트               194    1.8%
래미안               53     0.5%
e편한세상            32     0.3%
송파                 22     0.2%
이편한세상           22     0.2%
고덕                 21     0.2%
아이파크             21     0.2%
sk뷰                 17     0.2%
북한산               16     0.1%
목동14단지           16     0.1%
Other values (2231)  10513  96.2%

Most occurring characters

Value               Count  Frequency (%)
(not preserved)     2645   3.5%
(not preserved)     2635   3.5%
(not preserved)     2562   3.4%
(not preserved)     1776   2.4%
(not preserved)     1684   2.3%
(not preserved)     1647   2.2%
(not preserved)     1546   2.1%
(not preserved)     1484   2.0%
(not preserved)     1425   1.9%
(not preserved)     1394   1.9%
Other values (424)  55822  74.8%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       68378  91.6%
Decimal Number     3496   4.7%
Space Separator    1037   1.4%
Uppercase Letter   844    1.1%
Lowercase Letter   329    0.4%
Close Punctuation  154    0.2%
Open Punctuation   154    0.2%
Other Punctuation  114    0.2%
Dash Punctuation   109    0.1%
Letter Number      5      < 0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
(not preserved)     2645   3.9%
(not preserved)     2635   3.9%
(not preserved)     2562   3.7%
(not preserved)     1776   2.6%
(not preserved)     1684   2.5%
(not preserved)     1647   2.4%
(not preserved)     1546   2.3%
(not preserved)     1484   2.2%
(not preserved)     1425   2.1%
(not preserved)     1394   2.0%
Other values (379)  49580  72.5%

Uppercase Letter
Value             Count  Frequency (%)
C                 143    16.9%
S                 123    14.6%
D                 113    13.4%
M                 113    13.4%
K                 90     10.7%
H                 39     4.6%
E                 39     4.6%
L                 36     4.3%
I                 35     4.1%
V                 29     3.4%
Other values (7)  84     10.0%

Lowercase Letter
Value  Count  Frequency (%)
e      194    59.0%
i      25     7.6%
l      22     6.7%
s      21     6.4%
k      19     5.8%
v      18     5.5%
w      12     3.6%
c      8      2.4%
h      6      1.8%
a      2      0.6%

Decimal Number
Value  Count  Frequency (%)
2      1023   29.3%
1      995    28.5%
3      476    13.6%
4      244    7.0%
5      222    6.4%
6      173    4.9%
7      106    3.0%
8      105    3.0%
9      83     2.4%
0      69     2.0%

Other Punctuation
Value  Count  Frequency (%)
,      98     86.0%
.      16     14.0%

Space Separator
Value    Count  Frequency (%)
(space)  1037   100.0%

Close Punctuation
Value  Count  Frequency (%)
)      154    100.0%

Open Punctuation
Value  Count  Frequency (%)
(      154    100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      109    100.0%

Letter Number
Value            Count  Frequency (%)
(not preserved)  5      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  68378  91.6%
Common  5064   6.8%
Latin   1178   1.6%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
(not preserved)     2645   3.9%
(not preserved)     2635   3.9%
(not preserved)     2562   3.7%
(not preserved)     1776   2.6%
(not preserved)     1684   2.5%
(not preserved)     1647   2.4%
(not preserved)     1546   2.3%
(not preserved)     1484   2.2%
(not preserved)     1425   2.1%
(not preserved)     1394   2.0%
Other values (379)  49580  72.5%

Latin
Value              Count  Frequency (%)
e                  194    16.5%
C                  143    12.1%
S                  123    10.4%
D                  113    9.6%
M                  113    9.6%
K                  90     7.6%
H                  39     3.3%
E                  39     3.3%
L                  36     3.1%
I                  35     3.0%
Other values (19)  253    21.5%

Common
Value             Count  Frequency (%)
(space)           1037   20.5%
2                 1023   20.2%
1                 995    19.6%
3                 476    9.4%
4                 244    4.8%
5                 222    4.4%
6                 173    3.4%
)                 154    3.0%
(                 154    3.0%
-                 109    2.2%
Other values (6)  477    9.4%

Most occurring blocks

Value         Count  Frequency (%)
Hangul        68378  91.6%
ASCII         6237   8.4%
Number Forms  5      < 0.1%

Most frequent character per block

Hangul
Value               Count  Frequency (%)
(not preserved)     2645   3.9%
(not preserved)     2635   3.9%
(not preserved)     2562   3.7%
(not preserved)     1776   2.6%
(not preserved)     1684   2.5%
(not preserved)     1647   2.4%
(not preserved)     1546   2.3%
(not preserved)     1484   2.2%
(not preserved)     1425   2.1%
(not preserved)     1394   2.0%
Other values (379)  49580  72.5%

ASCII
Value              Count  Frequency (%)
(space)            1037   16.6%
2                  1023   16.4%
1                  995    16.0%
3                  476    7.6%
4                  244    3.9%
5                  222    3.6%
e                  194    3.1%
6                  173    2.8%
)                  154    2.5%
(                  154    2.5%
Other values (34)  1565   25.1%

Number Forms
Value            Count  Frequency (%)
(not preserved)  5      100.0%

아파트코드
Text

Distinct: 2154
Distinct (%): 21.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9
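
Every value is exactly nine characters long, and the character statistics for this variable report 10000 uppercase A characters and 80000 decimal digits across 10000 rows. That suggests a fixed format of A followed by eight digits. The check below is a sketch built on that inference; the pattern is not taken from a published specification, and the names are illustrative:

```python
import re

# Assumed format, inferred from the report's statistics: "A" plus eight ASCII digits.
CODE_PATTERN = re.compile(r"A[0-9]{8}")


def is_valid_code(code: str) -> bool:
    """Return True when the whole string matches the assumed code format."""
    return CODE_PATTERN.fullmatch(code) is not None


# The five sample codes shown for this variable.
samples = ["A13302206", "A14386109", "A13770105", "A15072001", "A14277601"]
print(all(is_valid_code(code) for code in samples))
```

A check like this is cheap to run before joining on the code column, since a stray lowercase prefix or truncated value would otherwise fail silently.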

Characters and Unicode

Total characters: 90000
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 117
Unique (%): 1.2%

Sample

1st row: A13302206
2nd row: A14386109
3rd row: A13770105
4th row: A15072001
5th row: A14277601

Value                Count  Frequency (%)
a15807606            16     0.2%
a13187201            13     0.1%
a10024254            13     0.1%
a13613009            12     0.1%
a13876114            12     0.1%
a13981901            12     0.1%
a13204104            12     0.1%
a12013003            12     0.1%
a13813010            11     0.1%
a13822002            11     0.1%
Other values (2144)  9876   98.8%

Most occurring characters

Value  Count  Frequency (%)
0      18616  20.7%
1      17305  19.2%
A      10000  11.1%
3      8714   9.7%
2      8578   9.5%
5      6164   6.8%
8      5599   6.2%
7      4532   5.0%
4      4138   4.6%
6      3434   3.8%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    80000  88.9%
Uppercase Letter  10000  11.1%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      18616  23.3%
1      17305  21.6%
3      8714   10.9%
2      8578   10.7%
5      6164   7.7%
8      5599   7.0%
7      4532   5.7%
4      4138   5.2%
6      3434   4.3%
9      2920   3.6%

Uppercase Letter
Value  Count  Frequency (%)
A      10000  100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  80000  88.9%
Latin   10000  11.1%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      18616  23.3%
1      17305  21.6%
3      8714   10.9%
2      8578   10.7%
5      6164   7.7%
8      5599   7.0%
7      4532   5.7%
4      4138   5.2%
6      3434   4.3%
9      2920   3.6%

Latin
Value  Count  Frequency (%)
A      10000  100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  90000  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      18616  20.7%
1      17305  19.2%
A      10000  11.1%
3      8714   9.7%
2      8578   9.5%
5      6164   6.8%
8      5599   6.2%
7      4532   5.0%
4      4138   4.6%
6      3434   3.8%

비용명
Text

Distinct: 86
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 10
Median length: 9
Mean length: 4.8107
Min length: 2

Characters and Unicode

Total characters: 48107
Distinct characters: 120
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 급여
2nd row: 정화조관리비
3rd row: 승강기유지비
4th row: 충당부채전입이자비용
5th row: 주차장수익

Value              Count  Frequency (%)
승강기유지비       257    2.6%
수선유지비         238    2.4%
급여               233    2.3%
통신비             230    2.3%
사무용품비         223    2.2%
경비비             222    2.2%
교육비             219    2.2%
소독비             215    2.1%
세대전기료         210    2.1%
복리후생비         210    2.1%
Other values (76)  7743   77.4%

Most occurring characters

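Value               Count  Frequency (%)
(not preserved)     5457   11.3%
(not preserved)     3490   7.3%
(not preserved)     2101   4.4%
(not preserved)     1978   4.1%
(not preserved)     1378   2.9%
(not preserved)     1371   2.8%
(not preserved)     1082   2.2%
(not preserved)     909    1.9%
(not preserved)     808    1.7%
(not preserved)     801    1.7%
Other values (110)  28732  59.7%

Most occurring categories

Value         Count  Frequency (%)
Other Letter  48107  100.0%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
(not preserved)     5457   11.3%
(not preserved)     3490   7.3%
(not preserved)     2101   4.4%
(not preserved)     1978   4.1%
(not preserved)     1378   2.9%
(not preserved)     1371   2.8%
(not preserved)     1082   2.2%
(not preserved)     909    1.9%
(not preserved)     808    1.7%
(not preserved)     801    1.7%
Other values (110)  28732  59.7%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  48107  100.0%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
(not preserved)     5457   11.3%
(not preserved)     3490   7.3%
(not preserved)     2101   4.4%
(not preserved)     1978   4.1%
(not preserved)     1378   2.9%
(not preserved)     1371   2.8%
(not preserved)     1082   2.2%
(not preserved)     909    1.9%
(not preserved)     808    1.7%
(not preserved)     801    1.7%
Other values (110)  28732  59.7%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  48107  100.0%

Most frequent character per block

Hangul
Value               Count  Frequency (%)
(not preserved)     5457   11.3%
(not preserved)     3490   7.3%
(not preserved)     2101   4.4%
(not preserved)     1978   4.1%
(not preserved)     1378   2.9%
(not preserved)     1371   2.8%
(not preserved)     1082   2.2%
(not preserved)     909    1.9%
(not preserved)     808    1.7%
(not preserved)     801    1.7%
Other values (110)  28732  59.7%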

년월일
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
202312  10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202312
2nd row: 202312
3rd row: 202312
4th row: 202312
5th row: 202312

Common Values

Value   Count  Frequency (%)
202312  10000  100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value   Count  Frequency (%)
202312  10000  100.0%

금액
Real number (ℝ)

ZEROS 

Distinct: 7152
Distinct (%): 71.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4630968.1
Minimum: -97721050
Maximum: 5.509708 × 10^8
Zeros: 1309
Zeros (%): 13.1%
Negative: 21
Negative (%): 0.2%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -97721050
5-th percentile: 0
Q1: 72175
Median: 328230
Q3: 1602092.5
95-th percentile: 22409322
Maximum: 5.509708 × 10^8
Range: 6.4869185 × 10^8
Interquartile range (IQR): 1529917.5

Descriptive statistics

Standard deviation: 18633773
Coefficient of variation (CV): 4.0237317
Kurtosis: 208.16744
Mean: 4630968.1
Median Absolute Deviation (MAD): 328230
Skewness: 11.672279
Sum: 4.6309681 × 10^10
Variance: 3.472175 × 10^14
Monotonicity: Not monotonic
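
Several of the figures above are derived from others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. The sketch below recomputes them from the reported values as a consistency check; the variable names are illustrative, and small differences are expected because the report rounds its inputs:

```python
# Figures reported in the quantile and descriptive statistics.
minimum = -97721050
maximum = 550970800
q1, q3 = 72175, 1602092.5
mean = 4630968.1
std = 18633773
n = 10_000

value_range = maximum - minimum  # largest minus smallest value
iqr = q3 - q1                    # spread of the middle 50% of values
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
total = mean * n                 # sum recovered from mean and row count

print(value_range, iqr, round(cv, 4), f"{variance:.4e}", f"{total:.4e}")
```

A coefficient of variation around 4 means the standard deviation is about four times the mean, which is consistent with the heavy right tail indicated by the skewness and kurtosis.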
[Figure: Histogram with fixed size bins (bins=50)]
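
A fixed-size-bin histogram splits the interval from the minimum to the maximum into equally wide bins and counts the values in each. The sketch below shows the idea in plain Python on toy data; the function name is hypothetical, and the report itself was drawn with Matplotlib rather than with this code:

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        if width == 0:
            # All values are identical; put everything in the first bin.
            idx = 0
        else:
            # The maximum lands exactly on the last edge; clamp it into the final bin.
            idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return counts, edges


# Toy data for illustration, not the real 금액 column.
counts, edges = fixed_width_histogram([0, 0, 5, 10, 95, 100], bins=10)
print(counts)
```

With 50 bins between the reported minimum and maximum, each bin is roughly 13 million wide, so most observations fall into a handful of bins near zero.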

Value                Count  Frequency (%)
0                    1309   13.1%
200000               63     0.6%
300000               59     0.6%
100000               57     0.6%
150000               50     0.5%
60000                31     0.3%
30000                30     0.3%
600000               30     0.3%
50000                29     0.3%
250000               29     0.3%
Other values (7142)  8313   83.1%

Minimum 10 values

Value      Count  Frequency (%)
-97721050  1      < 0.1%
-23750000  1      < 0.1%
-14890911  1      < 0.1%
-12400000  1      < 0.1%
-12211328  1      < 0.1%
-10811000  1      < 0.1%
-6833040   1      < 0.1%
-2311700   1      < 0.1%
-1200000   1      < 0.1%
-614770    1      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
550970800  1      < 0.1%
415078400  1      < 0.1%
413106110  1      < 0.1%
366227130  1      < 0.1%
361462346  1      < 0.1%
351480460  1      < 0.1%
334995005  1      < 0.1%
288745580  1      < 0.1%
264658165  1      < 0.1%
245813329  1      < 0.1%

Interactions


Correlations

        비용명  금액
비용명  1.000   0.569
금액    0.569   1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

Index  아파트명  아파트코드  비용명  년월일  금액
34687  왕십리풍림아이원  A13302206  급여  202312  18482170
73681  자양우성2차  A14386109  정화조관리비  202312  273005
52472  반포미도2차  A13770105  승강기유지비  202312  522500
76344  당산쌍용예가클래식  A15072001  충당부채전입이자비용  202312  6322
71227  래미안미아1차  A14277601  주차장수익  202312  1417740
83402  고척벽산블루밍  A15283711  임대료수익  202312  2400000
42980  세곡리엔파크3단지  A13519003  소독비  202312  260000
32330  창동대동  A13204501  산재보험료  202312  53770
8044  이편한세상 상도노빌리티  A10025768  음식물처리비  202312  1265640
38221  서울숲한신더휴아파트  A13386702  장기수선비  202312  30134200

Index  아파트명  아파트코드  비용명  년월일  금액
50187  종암극동아파트  A13671207  복리후생비  202312  0
74699  당산성원아파트  A15004501  연체료수익  202312  6130
61289  상계주공16단지  A13920803  충당부채전입이자비용  202312  4499436
91472  마곡엠밸리4단지  A15721008  통신비  202312  118750
63327  공릉비선  A13980018  감가상각비  202312  0
6841  항동하버라인4단지아파트  A10025302  교통비  202312  0
97714  목동6단지  A15875103  부과차익  202312  19864
19807  마포동원베네스트  A12170401  교통비  202312  0
15457  효성주얼리시티아파트  A11041001  보험료  202312  1438350
99431  은평뉴타운마고정3단지  A41279912  급여  202312  15353180