Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
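
The average record size follows from the two figures above: total memory divided by the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes (the variable names are illustrative and not part of the report):

```python
# Figures copied from the dataset statistics above.
total_size_kib = 488.3
n_observations = 10_000

# Assumption: 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
average_record_bytes = total_size_bytes / n_observations

print(f"{average_record_bytes:.1f} B")
```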

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: 서울특별시
URL: https://data.seoul.go.kr/dataList/OA-15821/S/1/datasetView.do

Alerts

년월일 has constant value "202304" (Constant)
금액 has 1078 (10.8%) zeros (Zeros)

Reproduction

Analysis started: 2024-05-11 06:48:11.759454
Analysis finished: 2024-05-11 06:48:14.150679
Duration: 2.39 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명
Text

Distinct: 2111
Distinct (%): 21.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 20
Mean length: 7.4343
Min length: 2

Characters and Unicode

Total characters: 74343
Distinct characters: 431
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
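
As an illustration of how such character properties can be read, here is a minimal sketch using Python's standard `unicodedata` module. It only covers the general category; the standard library does not expose script or block, so those breakdowns are not reproduced. The helper name `named_category_counts` and the code-to-name mapping are assumptions made for this example, not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Nl": "Letter Number",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
}


def named_category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for text in values:
        for ch in text:
            code = unicodedata.category(ch)
            # Fall back to the raw code for categories not listed above.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return dict(counts)


# Two values taken from the sample rows of this variable.
result = named_category_counts(["창동주공4단지", "래미안 신반포 리오센트"])
print(result)
```

Hangul syllables fall under "Other Letter", which is why that category dominates the tables for the Korean text columns.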

Unique

Unique: 104
Unique (%): 1.0%

Sample

1st row: 수유역두산위브
2nd row: 서울숲한신더휴아파트
3rd row: 래미안 신반포 리오센트
4th row: 창동주공4단지
5th row: 이문현대

Value | Count | Frequency (%)
아파트 | 206 | 1.9%
래미안 | 48 | 0.4%
e편한세상 | 35 | 0.3%
아이파크 | 25 | 0.2%
경남아너스빌 | 21 | 0.2%
힐스테이트 | 18 | 0.2%
푸르지오 | 17 | 0.2%
북한산 | 16 | 0.1%
이편한세상 | 15 | 0.1%
신반포 | 15 | 0.1%
Other values (2193) | 10522 | 96.2%

Most occurring characters

Value | Count | Frequency (%)
| 2569 | 3.5%
| 2548 | 3.4%
| 2490 | 3.3%
| 1836 | 2.5%
| 1704 | 2.3%
| 1571 | 2.1%
| 1468 | 2.0%
| 1432 | 1.9%
| 1427 | 1.9%
| 1413 | 1.9%
Other values (421) | 55885 | 75.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 68083 | 91.6%
Decimal Number | 3621 | 4.9%
Space Separator | 1035 | 1.4%
Uppercase Letter | 771 | 1.0%
Lowercase Letter | 265 | 0.4%
Open Punctuation | 167 | 0.2%
Close Punctuation | 167 | 0.2%
Dash Punctuation | 130 | 0.2%
Other Punctuation | 98 | 0.1%
Letter Number | 6 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
| 2569 | 3.8%
| 2548 | 3.7%
| 2490 | 3.7%
| 1836 | 2.7%
| 1704 | 2.5%
| 1571 | 2.3%
| 1468 | 2.2%
| 1432 | 2.1%
| 1427 | 2.1%
| 1413 | 2.1%
Other values (376) | 49625 | 72.9%

Uppercase Letter
Value | Count | Frequency (%)
S | 123 | 16.0%
C | 109 | 14.1%
K | 89 | 11.5%
M | 79 | 10.2%
D | 79 | 10.2%
L | 57 | 7.4%
H | 47 | 6.1%
E | 42 | 5.4%
I | 33 | 4.3%
G | 30 | 3.9%
Other values (7) | 83 | 10.8%

Lowercase Letter
Value | Count | Frequency (%)
e | 186 | 70.2%
i | 15 | 5.7%
s | 13 | 4.9%
k | 13 | 4.9%
l | 12 | 4.5%
v | 8 | 3.0%
w | 6 | 2.3%
c | 4 | 1.5%
a | 3 | 1.1%
g | 3 | 1.1%

Decimal Number
Value | Count | Frequency (%)
1 | 1087 | 30.0%
2 | 997 | 27.5%
3 | 507 | 14.0%
4 | 241 | 6.7%
5 | 221 | 6.1%
6 | 180 | 5.0%
7 | 105 | 2.9%
8 | 104 | 2.9%
9 | 99 | 2.7%
0 | 80 | 2.2%

Other Punctuation
Value | Count | Frequency (%)
, | 75 | 76.5%
. | 23 | 23.5%

Space Separator
Value | Count | Frequency (%)
(space) | 1035 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 167 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 167 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 130 | 100.0%

Letter Number
Value | Count | Frequency (%)
| 6 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 68083 | 91.6%
Common | 5218 | 7.0%
Latin | 1042 | 1.4%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
| 2569 | 3.8%
| 2548 | 3.7%
| 2490 | 3.7%
| 1836 | 2.7%
| 1704 | 2.5%
| 1571 | 2.3%
| 1468 | 2.2%
| 1432 | 2.1%
| 1427 | 2.1%
| 1413 | 2.1%
Other values (376) | 49625 | 72.9%

Latin
Value | Count | Frequency (%)
e | 186 | 17.9%
S | 123 | 11.8%
C | 109 | 10.5%
K | 89 | 8.5%
M | 79 | 7.6%
D | 79 | 7.6%
L | 57 | 5.5%
H | 47 | 4.5%
E | 42 | 4.0%
I | 33 | 3.2%
Other values (19) | 198 | 19.0%

Common
Value | Count | Frequency (%)
1 | 1087 | 20.8%
(space) | 1035 | 19.8%
2 | 997 | 19.1%
3 | 507 | 9.7%
4 | 241 | 4.6%
5 | 221 | 4.2%
6 | 180 | 3.4%
( | 167 | 3.2%
) | 167 | 3.2%
- | 130 | 2.5%
Other values (6) | 486 | 9.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 68083 | 91.6%
ASCII | 6254 | 8.4%
Number Forms | 6 | < 0.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
| 2569 | 3.8%
| 2548 | 3.7%
| 2490 | 3.7%
| 1836 | 2.7%
| 1704 | 2.5%
| 1571 | 2.3%
| 1468 | 2.2%
| 1432 | 2.1%
| 1427 | 2.1%
| 1413 | 2.1%
Other values (376) | 49625 | 72.9%

ASCII
Value | Count | Frequency (%)
1 | 1087 | 17.4%
(space) | 1035 | 16.5%
2 | 997 | 15.9%
3 | 507 | 8.1%
4 | 241 | 3.9%
5 | 221 | 3.5%
e | 186 | 3.0%
6 | 180 | 2.9%
( | 167 | 2.7%
) | 167 | 2.7%
Other values (34) | 1466 | 23.4%

Number Forms
Value | Count | Frequency (%)
| 6 | 100.0%

아파트코드
Text

Distinct: 2115
Distinct (%): 21.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 105
Unique (%): 1.1%

Sample

1st row: A14270301
2nd row: A13386702
3rd row: A10025418
4th row: A13204104
5th row: A13082703

Value | Count | Frequency (%)
a13086101 | 13 | 0.1%
a15210206 | 13 | 0.1%
a15605103 | 13 | 0.1%
a15001009 | 12 | 0.1%
a13983816 | 12 | 0.1%
a15083701 | 12 | 0.1%
a15807605 | 11 | 0.1%
a13790703 | 11 | 0.1%
a14280502 | 11 | 0.1%
a13611011 | 11 | 0.1%
Other values (2105) | 9881 | 98.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 18643 | 20.7%
1 | 17438 | 19.4%
A | 9984 | 11.1%
3 | 8820 | 9.8%
2 | 8497 | 9.4%
5 | 6123 | 6.8%
8 | 5478 | 6.1%
7 | 4516 | 5.0%
4 | 4011 | 4.5%
6 | 3438 | 3.8%
Other values (2) | 3052 | 3.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18643 | 23.3%
1 | 17438 | 21.8%
3 | 8820 | 11.0%
2 | 8497 | 10.6%
5 | 6123 | 7.7%
8 | 5478 | 6.8%
7 | 4516 | 5.6%
4 | 4011 | 5.0%
6 | 3438 | 4.3%
9 | 3036 | 3.8%

Uppercase Letter
Value | Count | Frequency (%)
A | 9984 | 99.8%
B | 16 | 0.2%

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 88.9%
Latin | 10000 | 11.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18643 | 23.3%
1 | 17438 | 21.8%
3 | 8820 | 11.0%
2 | 8497 | 10.6%
5 | 6123 | 7.7%
8 | 5478 | 6.8%
7 | 4516 | 5.6%
4 | 4011 | 5.0%
6 | 3438 | 4.3%
9 | 3036 | 3.8%

Latin
Value | Count | Frequency (%)
A | 9984 | 99.8%
B | 16 | 0.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18643 | 20.7%
1 | 17438 | 19.4%
A | 9984 | 11.1%
3 | 8820 | 9.8%
2 | 8497 | 9.4%
5 | 6123 | 6.8%
8 | 5478 | 6.1%
7 | 4516 | 5.0%
4 | 4011 | 4.5%
6 | 3438 | 3.8%
Other values (2) | 3052 | 3.4%

비용명
Text

Distinct: 86
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 10
Median length: 9
Mean length: 4.7529
Min length: 2

Characters and Unicode

Total characters: 47529
Distinct characters: 120
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 연차수당
2nd row: 복리후생비
3rd row: 제수당
4th row: 전산고지비
5th row: 입주자대표회의운영비

Value | Count | Frequency (%)
소독비 | 251 | 2.5%
세대전기료 | 247 | 2.5%
도서인쇄비 | 243 | 2.4%
퇴직급여 | 243 | 2.4%
통신비 | 235 | 2.4%
제수당 | 227 | 2.3%
수선유지비 | 227 | 2.3%
사무용품비 | 223 | 2.2%
산재보험료 | 222 | 2.2%
교육비 | 221 | 2.2%
Other values (76) | 7661 | 76.6%

Most occurring characters

Value | Count | Frequency (%)
| 5316 | 11.2%
| 3611 | 7.6%
| 2231 | 4.7%
| 1923 | 4.0%
| 1417 | 3.0%
| 1280 | 2.7%
| 1146 | 2.4%
| 880 | 1.9%
| 874 | 1.8%
| 804 | 1.7%
Other values (110) | 28047 | 59.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 47529 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
| 5316 | 11.2%
| 3611 | 7.6%
| 2231 | 4.7%
| 1923 | 4.0%
| 1417 | 3.0%
| 1280 | 2.7%
| 1146 | 2.4%
| 880 | 1.9%
| 874 | 1.8%
| 804 | 1.7%
Other values (110) | 28047 | 59.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 47529 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
| 5316 | 11.2%
| 3611 | 7.6%
| 2231 | 4.7%
| 1923 | 4.0%
| 1417 | 3.0%
| 1280 | 2.7%
| 1146 | 2.4%
| 880 | 1.9%
| 874 | 1.8%
| 804 | 1.7%
Other values (110) | 28047 | 59.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 47529 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
| 5316 | 11.2%
| 3611 | 7.6%
| 2231 | 4.7%
| 1923 | 4.0%
| 1417 | 3.0%
| 1280 | 2.7%
| 1146 | 2.4%
| 880 | 1.9%
| 874 | 1.8%
| 804 | 1.7%
Other values (110) | 28047 | 59.0%

년월일
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
202304 | 10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202304
2nd row: 202304
3rd row: 202304
4th row: 202304
5th row: 202304

Common Values

Value | Count | Frequency (%)
202304 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
202304 | 10000 | 100.0%

금액
Real number (ℝ)

ZEROS 

Distinct: 7236
Distinct (%): 72.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3778251.1
Minimum: -2820000
Maximum: 4.3325737 × 10^8
Zeros: 1078
Zeros (%): 10.8%
Negative: 9
Negative (%): 0.1%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -2820000
5-th percentile: 0
Q1: 89567.5
Median: 350000
Q3: 1571887.5
95-th percentile: 18262329
Maximum: 4.3325737 × 10^8
Range: 4.3607737 × 10^8
Interquartile range (IQR): 1482320

Descriptive statistics

Standard deviation: 14090720
Coefficient of variation (CV): 3.7294292
Kurtosis: 233.88961
Mean: 3778251.1
Median Absolute Deviation (MAD): 346718.5
Skewness: 12.275287
Sum: 3.7782511 × 10^10
Variance: 1.985484 × 10^14
Monotonicity: Not monotonic
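
Several of the figures for 금액 are derived from others, so they can be cross-checked against each other. The sketch below recomputes the IQR, range, coefficient of variation, variance and sum from the reported values, using the usual textbook definitions; it is not the profiler's own code. The inputs are the rounded figures printed in the report (the maximum is taken in full from the largest-values table), so agreement is only expected up to rounding.

```python
import math

# Values as printed in the report for 금액.
q1, q3 = 89567.5, 1571887.5
minimum, maximum = -2820000, 433257372
mean, std = 3778251.1, 14090720
n = 10_000  # number of observations

iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
total = mean * n                 # sum recovered from the mean

print(iqr, value_range, round(cv, 4), f"{variance:.6e}", f"{total:.7e}")
```

A CV well above 1, together with skewness of about 12.3, is consistent with a heavily right-skewed distribution in which a few very large amounts dominate the mean.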
Histogram with fixed size bins (bins=50)
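
To give a feel for what 50 fixed-size bins mean for this variable, here is a minimal sketch of equal-width binning. It assumes the bins span the observed minimum to maximum, which is the default in common plotting libraries but is not stated in the report; the function name `fixed_width_bin` is hypothetical.

```python
def fixed_width_bin(value, lo, hi, bins=50):
    """Return the 0-based index of the equal-width bin that contains value."""
    width = (hi - lo) / bins
    index = int((value - lo) // width)
    # The maximum sits on the right edge and is assigned to the last bin.
    return min(index, bins - 1)


# Minimum and maximum of 금액 as reported.
lo, hi = -2820000, 433257372

q3_bin = fixed_width_bin(1571887.5, lo, hi)   # third quartile
p95_bin = fixed_width_bin(18262329, lo, hi)   # 95th percentile
max_bin = fixed_width_bin(hi, lo, hi)         # maximum

print(q3_bin, p95_bin, max_bin)
```

Under that assumption each bin is roughly 8.7 million wide, so the third quartile still falls in the first bin: at least three quarters of the observations land in a single bar, and the long right tail occupies the remaining bins.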
Common values

Value | Count | Frequency (%)
0 | 1078 | 10.8%
200000 | 96 | 1.0%
300000 | 60 | 0.6%
100000 | 59 | 0.6%
150000 | 47 | 0.5%
400000 | 39 | 0.4%
50000 | 35 | 0.4%
30000 | 34 | 0.3%
500000 | 33 | 0.3%
250000 | 30 | 0.3%
Other values (7226) | 8489 | 84.9%

Minimum 10 values

Value | Count | Frequency (%)
-2820000 | 1 | < 0.1%
-447500 | 1 | < 0.1%
-420800 | 1 | < 0.1%
-358590 | 1 | < 0.1%
-286320 | 1 | < 0.1%
-217100 | 1 | < 0.1%
-146460 | 1 | < 0.1%
-32000 | 1 | < 0.1%
-290 | 1 | < 0.1%
0 | 1078 | 10.8%

Maximum 10 values

Value | Count | Frequency (%)
433257372 | 1 | < 0.1%
361828570 | 1 | < 0.1%
296312760 | 1 | < 0.1%
287181770 | 1 | < 0.1%
258694178 | 1 | < 0.1%
243383708 | 1 | < 0.1%
242005070 | 1 | < 0.1%
237719560 | 1 | < 0.1%
215909205 | 1 | < 0.1%
207789430 | 1 | < 0.1%

Interactions


Correlations

| 비용명 | 금액
비용명 | 1.000 | 0.280
금액 | 0.280 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
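
Both displays rest on the same simple idea: mark each cell as present or missing, then summarise per column or show the marks row by row. A minimal sketch under stated assumptions: missing cells are represented as `None`, the three records are copied from the sample rows of this report, and the helper names are hypothetical.

```python
COLUMNS = ["아파트명", "아파트코드", "비용명", "년월일", "금액"]

# Three records copied from the sample rows of this report.
rows = [
    ("수유역두산위브", "A14270301", "연차수당", "202304", 580000),
    ("서울숲한신더휴아파트", "A13386702", "복리후생비", "202304", 1113330),
    ("공릉대동1차", "A13980801", "교육비", "202304", 0),
]


def missing_per_column(rows, columns):
    """Count None cells per column; a 0 amount is a value, not a missing cell."""
    counts = {name: 0 for name in columns}
    for row in rows:
        for name, cell in zip(columns, row):
            if cell is None:
                counts[name] += 1
    return counts


def nullity_matrix(rows):
    """One list of booleans per record: True where the cell is present."""
    return [[cell is not None for cell in row] for row in rows]


missing = missing_per_column(rows, COLUMNS)
matrix = nullity_matrix(rows)
print(missing)
```

The distinction between zero and missing matters here: the report flags 10.8% zeros in 금액 while reporting no missing cells at all.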

Sample

| 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
64028 | 수유역두산위브 | A14270301 | 연차수당 | 202304 | 580000
34707 | 서울숲한신더휴아파트 | A13386702 | 복리후생비 | 202304 | 1113330
6304 | 래미안 신반포 리오센트 | A10025418 | 제수당 | 202304 | 3039730
28699 | 창동주공4단지 | A13204104 | 전산고지비 | 202304 | 418000
24240 | 이문현대 | A13082703 | 입주자대표회의운영비 | 202304 | 1100000
70446 | 신길경남 | A15083703 | 복리후생비 | 202304 | 540000
58038 | 공릉대동1차 | A13980801 | 교육비 | 202304 | 0
66790 | 자양우성3차 | A14386110 | 세금과공과 | 202304 | 258610
53297 | 가락우성2차아파트 | A13880602 | 재활용품수익 | 202304 | 50000
22024 | 백련산힐스테이트3차 | A12290901 | 부과차익 | 202304 | 1626

| 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
81628 | 우장산한화꿈에그린 | A15701004 | 국민연금 | 202304 | 145290
47269 | 양재리본타워1단지 | A13713001 | 음식물처리비 | 202304 | 326270
18762 | 공덕현대 | A12180401 | 보험료 | 202304 | 318930
65784 | 광장11현대홈타운 | A14321001 | 도서인쇄비 | 202304 | 169330
70434 | 신길경남 | A15083703 | 이자수익 | 202304 | 0
17003 | 마포도화우성아파트 | A12104007 | 소모품비 | 202304 | 1054860
2631 | 마포프레스티지자이아파트 | A10024347 | 검침수익 | 202304 | 728420
79565 | 이수교스위첸 | A15608001 | 세대수도료 | 202304 | 5897830
69735 | 한강아파트 | A15080501 | 재활용품수익 | 202304 | 132500
78524 | 대방경남아너스빌 | A15602001 | 기타운영수익 | 202304 | 35