Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
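The two memory figures agree with each other: dividing the total size by the number of observations gives the average record size. A quick check, using only the figures reported above and assuming 1 KiB = 1024 bytes:

```python
# Figures copied from the dataset statistics above.
total_size_kib = 488.3
n_observations = 10_000

# Assumption: 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B per record")
```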

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15820/S/1/datasetView.do

Alerts

년월일 has constant value "202310" (Constant)
금액 has 2400 (24.0%) zeros (Zeros)
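The two alerts correspond to simple column-level checks. The sketch below is illustrative only: ydata-profiling's own alert logic and thresholds are not shown in this report, and the helper names `is_constant` and `zero_share` are hypothetical. The input is four (년월일, 금액) pairs taken from the sample rows at the end of this report.

```python
from typing import Iterable, Sequence


def is_constant(values: Sequence[object]) -> bool:
    """True when a non-empty column holds exactly one distinct value."""
    return len(values) > 0 and len(set(values)) == 1


def zero_share(values: Iterable[float]) -> float:
    """Fraction of entries equal to zero (0.0 for an empty column)."""
    values = list(values)
    if not values:
        return 0.0
    return sum(1 for v in values if v == 0) / len(values)


# Four (년월일, 금액) pairs copied from the sample rows of this report.
rows = [
    ("202310", 36871580),
    ("202310", 0),
    ("202310", 20561249),
    ("202310", 153071490),
]
dates = [r[0] for r in rows]
amounts = [r[1] for r in rows]

print(is_constant(dates))   # True
print(zero_share(amounts))  # 0.25
```

On the full dataset the same checks would report a constant 년월일 column and a zero share of 2400 / 10000 = 24.0% for 금액, matching the alerts above.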

Reproduction

Analysis started: 2024-05-11 05:56:29.582139
Analysis finished: 2024-05-11 05:56:30.852889
Duration: 1.27 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
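The reported duration follows directly from the two timestamps. A small check with the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the Reproduction section.
started = datetime.strptime("2024-05-11 05:56:29.582139", FMT)
finished = datetime.strptime("2024-05-11 05:56:30.852889", FMT)

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} s")
```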

Variables

아파트명
Text

Distinct: 2217
Distinct (%): 22.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 21
Mean length: 7.4249
Min length: 2

Characters and Unicode

Total characters: 74249
Distinct characters: 436
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
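As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point ("Lo" is Other Letter, "Nd" is Decimal Number). This is a minimal sketch, not the profiler's own implementation; `category_counts` is a hypothetical helper, and the standard library offers no direct lookup for script or block.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category (e.g. 'Lo', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)


# One of the sampled values of this variable.
name = "왕십리텐즈힐2구역214동"
counts = category_counts(name)
print(counts["Lo"], counts["Nd"])  # 9 4
```

Summing such counts over all 10000 values would give the per-category totals listed in this section.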

Unique

Unique: 127
Unique (%): 1.3%

Sample

1st row: 공릉태강
2nd row: 삼성동중앙하이츠빌리지
3rd row: 왕십리텐즈힐2구역214동
4th row: 양평거성파스텔
5th row: 문정건영

Value | Count | Frequency (%)
아파트 | 177 | 1.6%
래미안 | 45 | 0.4%
e편한세상 | 33 | 0.3%
아이파크 | 27 | 0.2%
푸르지오 | 20 | 0.2%
이편한세상 | 18 | 0.2%
힐스테이트 | 18 | 0.2%
강남한신휴플러스 | 17 | 0.2%
은평뉴타운상림마을6단지 | 16 | 0.1%
왕십리 | 16 | 0.1%
Other values (2297) | 10496 | 96.4%

Most occurring characters

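Value | Count | Frequency (%)
(not preserved) | 2496 | 3.4%
(not preserved) | 2457 | 3.3%
(not preserved) | 2384 | 3.2%
(not preserved) | 1942 | 2.6%
(not preserved) | 1724 | 2.3%
(not preserved) | 1634 | 2.2%
(not preserved) | 1491 | 2.0%
(not preserved) | 1400 | 1.9%
(not preserved) | 1400 | 1.9%
(not preserved) | 1390 | 1.9%
Other values (426) | 55931 | 75.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 67986 | 91.6%
Decimal Number | 3559 | 4.8%
Space Separator | 956 | 1.3%
Uppercase Letter | 940 | 1.3%
Lowercase Letter | 292 | 0.4%
Close Punctuation | 146 | 0.2%
Open Punctuation | 146 | 0.2%
Dash Punctuation | 117 | 0.2%
Other Punctuation | 104 | 0.1%
Letter Number | 3 | < 0.1%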

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 2496 | 3.7%
(not preserved) | 2457 | 3.6%
(not preserved) | 2384 | 3.5%
(not preserved) | 1942 | 2.9%
(not preserved) | 1724 | 2.5%
(not preserved) | 1634 | 2.4%
(not preserved) | 1491 | 2.2%
(not preserved) | 1400 | 2.1%
(not preserved) | 1400 | 2.1%
(not preserved) | 1390 | 2.0%
Other values (381) | 49668 | 73.1%

Uppercase Letter
Value | Count | Frequency (%)
S | 161 | 17.1%
C | 131 | 13.9%
K | 116 | 12.3%
M | 94 | 10.0%
D | 94 | 10.0%
E | 53 | 5.6%
L | 52 | 5.5%
H | 47 | 5.0%
I | 45 | 4.8%
V | 34 | 3.6%
Other values (7) | 113 | 12.0%

Lowercase Letter
Value | Count | Frequency (%)
e | 177 | 60.6%
i | 23 | 7.9%
l | 22 | 7.5%
v | 14 | 4.8%
k | 12 | 4.1%
s | 12 | 4.1%
c | 10 | 3.4%
w | 7 | 2.4%
h | 5 | 1.7%
g | 5 | 1.7%

Decimal Number
Value | Count | Frequency (%)
1 | 1046 | 29.4%
2 | 1016 | 28.5%
3 | 506 | 14.2%
4 | 282 | 7.9%
5 | 192 | 5.4%
6 | 153 | 4.3%
8 | 108 | 3.0%
7 | 98 | 2.8%
9 | 85 | 2.4%
0 | 73 | 2.1%

Other Punctuation
Value | Count | Frequency (%)
, | 83 | 79.8%
. | 21 | 20.2%

Space Separator
Value | Count | Frequency (%)
(space) | 956 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 146 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 146 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 117 | 100.0%

Letter Number
Value | Count | Frequency (%)
(not preserved) | 3 | 100.0%

Most occurring scripts

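Value | Count | Frequency (%)
Hangul | 67986 | 91.6%
Common | 5028 | 6.8%
Latin | 1235 | 1.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 2496 | 3.7%
(not preserved) | 2457 | 3.6%
(not preserved) | 2384 | 3.5%
(not preserved) | 1942 | 2.9%
(not preserved) | 1724 | 2.5%
(not preserved) | 1634 | 2.4%
(not preserved) | 1491 | 2.2%
(not preserved) | 1400 | 2.1%
(not preserved) | 1400 | 2.1%
(not preserved) | 1390 | 2.0%
Other values (381) | 49668 | 73.1%

Latin
Value | Count | Frequency (%)
e | 177 | 14.3%
S | 161 | 13.0%
C | 131 | 10.6%
K | 116 | 9.4%
M | 94 | 7.6%
D | 94 | 7.6%
E | 53 | 4.3%
L | 52 | 4.2%
H | 47 | 3.8%
I | 45 | 3.6%
Other values (19) | 265 | 21.5%

Common
Value | Count | Frequency (%)
1 | 1046 | 20.8%
2 | 1016 | 20.2%
(space) | 956 | 19.0%
3 | 506 | 10.1%
4 | 282 | 5.6%
5 | 192 | 3.8%
6 | 153 | 3.0%
) | 146 | 2.9%
( | 146 | 2.9%
- | 117 | 2.3%
Other values (6) | 468 | 9.3%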

Most occurring blocks

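Value | Count | Frequency (%)
Hangul | 67986 | 91.6%
ASCII | 6260 | 8.4%
Number Forms | 3 | < 0.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not preserved) | 2496 | 3.7%
(not preserved) | 2457 | 3.6%
(not preserved) | 2384 | 3.5%
(not preserved) | 1942 | 2.9%
(not preserved) | 1724 | 2.5%
(not preserved) | 1634 | 2.4%
(not preserved) | 1491 | 2.2%
(not preserved) | 1400 | 2.1%
(not preserved) | 1400 | 2.1%
(not preserved) | 1390 | 2.0%
Other values (381) | 49668 | 73.1%

ASCII
Value | Count | Frequency (%)
1 | 1046 | 16.7%
2 | 1016 | 16.2%
(space) | 956 | 15.3%
3 | 506 | 8.1%
4 | 282 | 4.5%
5 | 192 | 3.1%
e | 177 | 2.8%
S | 161 | 2.6%
6 | 153 | 2.4%
) | 146 | 2.3%
Other values (34) | 1625 | 26.0%

Number Forms
Value | Count | Frequency (%)
(not preserved) | 3 | 100.0%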

아파트코드
Text

Distinct: 2220
Distinct (%): 22.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 127
Unique (%): 1.3%

Sample

1st row: A13980019
2nd row: A13550701
3rd row: A13373302
4th row: A15010306
5th row: A13820005

Value | Count | Frequency (%)
a13610202 | 15 | 0.1%
a12013003 | 12 | 0.1%
a14004002 | 12 | 0.1%
a15205301 | 12 | 0.1%
a13792001 | 12 | 0.1%
a15001005 | 12 | 0.1%
a13183502 | 12 | 0.1%
a13606002 | 11 | 0.1%
a10025768 | 11 | 0.1%
a13184401 | 11 | 0.1%
Other values (2210) | 9880 | 98.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 18581 | 20.6%
1 | 17491 | 19.4%
A | 9986 | 11.1%
3 | 8824 | 9.8%
2 | 8560 | 9.5%
5 | 6088 | 6.8%
8 | 5429 | 6.0%
7 | 4547 | 5.1%
4 | 4007 | 4.5%
6 | 3324 | 3.7%
Other values (2) | 3163 | 3.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18581 | 23.2%
1 | 17491 | 21.9%
3 | 8824 | 11.0%
2 | 8560 | 10.7%
5 | 6088 | 7.6%
8 | 5429 | 6.8%
7 | 4547 | 5.7%
4 | 4007 | 5.0%
6 | 3324 | 4.2%
9 | 3149 | 3.9%

Uppercase Letter
Value | Count | Frequency (%)
A | 9986 | 99.9%
B | 14 | 0.1%
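
The figures for this variable (every value is 9 characters long, 10000 uppercase letters that are all A or B, and 80000 decimal digits) are consistent with each code being one letter plus eight digits. The following is a minimal validation sketch under that reading, assuming the letter always comes first as it does in the sampled rows. The pattern and the helper `is_valid_code` are hypothetical and are not part of the source dataset's documentation.

```python
import re

# Assumed shape: one uppercase A or B followed by exactly eight ASCII digits.
CODE_PATTERN = re.compile(r"[AB][0-9]{8}")


def is_valid_code(code: str) -> bool:
    """True when the whole string matches the assumed code shape."""
    return CODE_PATTERN.fullmatch(code) is not None


# Sampled values copied from this report.
samples = ["A13980019", "A13550701", "A13373302"]
print(all(is_valid_code(s) for s in samples))  # True
print(is_valid_code("a13610202"))              # False
```

The second call returns False because the value-count table lists the codes in lowercase while the sampled rows are uppercase; a stricter or looser pattern may be appropriate depending on which form the raw data uses.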

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 88.9%
Latin | 10000 | 11.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18581 | 23.2%
1 | 17491 | 21.9%
3 | 8824 | 11.0%
2 | 8560 | 10.7%
5 | 6088 | 7.6%
8 | 5429 | 6.8%
7 | 4547 | 5.7%
4 | 4007 | 5.0%
6 | 3324 | 4.2%
9 | 3149 | 3.9%

Latin
Value | Count | Frequency (%)
A | 9986 | 99.9%
B | 14 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18581 | 20.6%
1 | 17491 | 19.4%
A | 9986 | 11.1%
3 | 8824 | 9.8%
2 | 8560 | 9.5%
5 | 6088 | 6.8%
8 | 5429 | 6.0%
7 | 4547 | 5.1%
4 | 4007 | 4.5%
6 | 3324 | 3.7%
Other values (2) | 3163 | 3.5%

비용명
Text

Distinct: 77
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 10
Mean length: 5.9821
Min length: 2

Characters and Unicode

Total characters: 59821
Distinct characters: 107
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 가수금
2nd row: 장기수선충당예금
3rd row: 가수금
4th row: 현금
5th row: 관리비미수금

Value | Count | Frequency (%)
당기순이익 | 324 | 3.2%
관리비미수금 | 318 | 3.2%
연차수당충당부채 | 315 | 3.1%
예수금 | 304 | 3.0%
예금 | 303 | 3.0%
비품 | 300 | 3.0%
비품감가상각누계액 | 298 | 3.0%
공동주택적립금 | 298 | 3.0%
미부과관리비 | 297 | 3.0%
장기수선충당예금 | 294 | 2.9%
Other values (67) | 6949 | 69.5%

Most occurring characters

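Value | Count | Frequency (%)
(not preserved) | 4548 | 7.6%
(not preserved) | 3814 | 6.4%
(not preserved) | 3172 | 5.3%
(not preserved) | 3098 | 5.2%
(not preserved) | 2963 | 5.0%
(not preserved) | 2899 | 4.8%
(not preserved) | 2583 | 4.3%
(not preserved) | 2565 | 4.3%
(not preserved) | 1870 | 3.1%
(not preserved) | 1747 | 2.9%
Other values (97) | 30562 | 51.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 59821 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 4548 | 7.6%
(not preserved) | 3814 | 6.4%
(not preserved) | 3172 | 5.3%
(not preserved) | 3098 | 5.2%
(not preserved) | 2963 | 5.0%
(not preserved) | 2899 | 4.8%
(not preserved) | 2583 | 4.3%
(not preserved) | 2565 | 4.3%
(not preserved) | 1870 | 3.1%
(not preserved) | 1747 | 2.9%
Other values (97) | 30562 | 51.1%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 59821 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 4548 | 7.6%
(not preserved) | 3814 | 6.4%
(not preserved) | 3172 | 5.3%
(not preserved) | 3098 | 5.2%
(not preserved) | 2963 | 5.0%
(not preserved) | 2899 | 4.8%
(not preserved) | 2583 | 4.3%
(not preserved) | 2565 | 4.3%
(not preserved) | 1870 | 3.1%
(not preserved) | 1747 | 2.9%
Other values (97) | 30562 | 51.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 59821 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not preserved) | 4548 | 7.6%
(not preserved) | 3814 | 6.4%
(not preserved) | 3172 | 5.3%
(not preserved) | 3098 | 5.2%
(not preserved) | 2963 | 5.0%
(not preserved) | 2899 | 4.8%
(not preserved) | 2583 | 4.3%
(not preserved) | 2565 | 4.3%
(not preserved) | 1870 | 3.1%
(not preserved) | 1747 | 2.9%
Other values (97) | 30562 | 51.1%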

년월일
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
202310 | 10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202310
2nd row: 202310
3rd row: 202310
4th row: 202310
5th row: 202310

Common Values

Value | Count | Frequency (%)
202310 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
202310 | 10000 | 100.0%

금액
Real number (ℝ)

ZEROS 

Distinct: 7285
Distinct (%): 72.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 79407907
Minimum: -7.9485722 × 10^8
Maximum: 7.8483692 × 10^9
Zeros: 2400
Zeros (%): 24.0%
Negative: 350
Negative (%): 3.5%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -7.9485722 × 10^8
5-th percentile: 0
Q1: 0
Median: 3245337
Q3: 38918475
95-th percentile: 3.8752169 × 10^8
Maximum: 7.8483692 × 10^9
Range: 8.6432264 × 10^9
Interquartile range (IQR): 38918475

Descriptive statistics

Standard deviation: 3.0082813 × 10^8
Coefficient of variation (CV): 3.7883901
Kurtosis: 160.80531
Mean: 79407907
Median Absolute Deviation (MAD): 3245337
Skewness: 10.333237
Sum: 7.9407907 × 10^11
Variance: 9.0497562 × 10^16
Monotonicity: Not monotonic
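
Several of these statistics are derived from one another: CV = standard deviation / mean, variance = standard deviation squared, range = maximum - minimum, IQR = Q3 - Q1, and sum = mean × number of observations. The sketch below recomputes them from the rounded figures reported for 금액, so small differences in the last printed digit are expected.

```python
# Figures copied from the statistics reported for 금액 (rounded as printed).
n = 10_000
mean = 79_407_907
std = 3.0082813e8
minimum = -7.9485722e8
maximum = 7.8483692e9
q1, q3 = 0, 38_918_475

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range
total = mean * n                 # sum of all values

print(f"CV       = {cv:.4f}")
print(f"Variance = {variance:.4e}")
print(f"Range    = {value_range:.4e}")
print(f"IQR      = {iqr}")
print(f"Sum      = {total:.4e}")
```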
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0 | 2400 | 24.0%
250000 | 26 | 0.3%
500000 | 19 | 0.2%
1000000 | 17 | 0.2%
300000 | 15 | 0.1%
600000 | 11 | 0.1%
242000 | 11 | 0.1%
20000000 | 10 | 0.1%
484000 | 9 | 0.1%
200000 | 9 | 0.1%
Other values (7275) | 7473 | 74.7%

Value | Count | Frequency (%)
-794857216 | 1 | < 0.1%
-524930511 | 1 | < 0.1%
-500876502 | 1 | < 0.1%
-294532030 | 1 | < 0.1%
-279074605 | 1 | < 0.1%
-267912322 | 1 | < 0.1%
-251269244 | 1 | < 0.1%
-251074850 | 1 | < 0.1%
-249751829 | 1 | < 0.1%
-205483040 | 1 | < 0.1%

Value | Count | Frequency (%)
7848369151 | 1 | < 0.1%
7223579420 | 1 | < 0.1%
6232488710 | 1 | < 0.1%
5070153106 | 1 | < 0.1%
4758948467 | 1 | < 0.1%
4525189071 | 1 | < 0.1%
4486267091 | 1 | < 0.1%
4117482873 | 1 | < 0.1%
4093344472 | 1 | < 0.1%
4075073552 | 1 | < 0.1%

Interactions


Correlations

Variable | 비용명 | 금액
비용명 | 1.000 | 0.624
금액 | 0.624 | 1.000

Missing values

Figure: A simple visualization of nullity by column.
Figure: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
44577 | 공릉태강 | A13980019 | 가수금 | 202310 | 1326073
31399 | 삼성동중앙하이츠빌리지 | A13550701 | 장기수선충당예금 | 202310 | 544832190
24929 | 왕십리텐즈힐2구역214동 | A13373302 | 가수금 | 202310 | 3508735
54007 | 양평거성파스텔 | A15010306 | 현금 | 202310 | 150020
40145 | 문정건영 | A13820005 | 관리비미수금 | 202310 | 7022520
32309 | 압구정한양아파트제2단지 | A13590204 | 장기수선충당부채 | 202310 | 1200661951
51911 | 광장청구 | A14381513 | 공동주택적립금 | 202310 | 8000000
18374 | 청량리한신 | A13086704 | 가수금 | 202310 | 687161
54608 | 대림코오롱 | A15081105 | 당기순이익 | 202310 | 36517973
14510 | 신촌태영데시앙 | A12188205 | 관리비예치금 | 202310 | 99540000

Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
43005 | 상계금호타운 | A13920501 | 퇴직급여충당부채 | 202310 | 36871580
63059 | 사당제일 | A15609501 | 장기수선충당부채 | 202310 | 465877949
5309 | 천왕이펜하우스8단지 | A10026039 | 당기순이익 | 202310 | 3471183
62063 | 아이파크상도동 | A15603203 | 기타충당예금 | 202310 | 13819278
45668 | 상계한신3차 | A13982002 | 장기수선충당예금 | 202310 | 1042223824
818 | 강동리엔파크13단지아파트 | A10023855 | 수선유지비충당부채 | 202310 | 16092150
7655 | 상봉듀오트리스 | A10027670 | 기타유동부채 | 202310 | 0
1223 | 개포더샵트리에 | A10023996 | 장기수선충당부채 | 202310 | 20561249
45557 | 상계동양메이저 | A13981608 | 퇴직급여충당부채 | 202310 | 65827600
5734 | 래미안서초에스티지에스아파트 | A10026411 | 미지급금 | 202310 | 153071490