Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
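The average record size can be recovered from the two figures above it: total memory divided by the number of observations. A minimal check, assuming the report's KiB means 1024 bytes (the function name is illustrative and not part of ydata-profiling):

```python
def average_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Average bytes per row, assuming 1 KiB = 1024 bytes."""
    return total_kib * 1024 / n_rows


# Figures taken from the dataset statistics in this report.
avg = average_record_size_bytes(488.3, 10000)
print(f"{avg:.1f} B")
```

Rounded to one decimal place, this reproduces the 50.0 B reported for the average record size.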

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15820/S/1/datasetView.do

Alerts

년월일 has constant value "202206" (Constant)
금액 has 2340 (23.4%) zeros (Zeros)
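Both alerts are simple column-level checks. The sketch below illustrates the idea on toy data; the helper names are hypothetical, and the thresholds ydata-profiling uses to decide when to raise an alert are not shown in this report, so none are modelled here.

```python
def constant_value(values):
    """Return the single repeated value if the column is constant, else None."""
    distinct = set(values)
    return next(iter(distinct)) if len(distinct) == 1 else None


def zeros_share(values):
    """Return (count, percentage) of entries equal to zero."""
    n_zero = sum(1 for v in values if v == 0)
    return n_zero, 100.0 * n_zero / len(values)


# Toy columns (hypothetical data, not the real dataset).
dates = ["202206"] * 5
amounts = [0, 1639200, 0, 4223430, 1071120]

print(constant_value(dates))   # the one value shared by every row
print(zeros_share(amounts))    # (number of zeros, percentage of zeros)
```

Applied to the figures in the alert, the same ratio gives 2340 / 10000 = 23.4% zeros for 금액.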

Reproduction

Analysis started: 2024-05-11 05:57:51.553742
Analysis finished: 2024-05-11 05:57:52.677064
Duration: 1.12 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명 (apartment name)
Text

Distinct: 2249
Distinct (%): 22.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 21
Mean length: 7.3888
Min length: 2

Characters and Unicode

Total characters: 73888
Distinct characters: 433
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
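As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for one value from this dataset; it covers only categories, since scripts and blocks are not available in the standard library, and it is not necessarily how the report itself was computed.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category code (e.g. 'Lo', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)


# A value of this variable taken from the report's sample rows.
sample = "강일리버파크7단지"
print(category_counts(sample))
```

Here `Lo` corresponds to the report's "Other Letter" (Hangul syllables fall in this category) and `Nd` to "Decimal Number".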

Unique

Unique: 149
Unique (%): 1.5%
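The report lists both a Distinct and a Unique count for this variable. The sketch below assumes the usual reading: Distinct is the number of different values, and Unique is the number of values that occur exactly once. The helper name and the toy data are illustrative.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Toy column (hypothetical data): one name repeats, two appear once.
names = ["래미안", "래미안", "아이파크", "푸르지오"]
print(distinct_and_unique(names))
```

Under this reading, Unique can never exceed Distinct, which is consistent with the 149 unique values among 2249 distinct ones reported here.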

Sample

1st row: 도곡삼성래미안
2nd row: 강일리버파크7단지
3rd row: 꿈의숲코오롱하늘채아파트
4th row: 마곡수명산파크2단지
5th row: 상도효성해링턴플레이스

Value | Count | Frequency (%)
아파트 | 176 | 1.6%
래미안 | 29 | 0.3%
e편한세상 | 26 | 0.2%
아이파크 | 23 | 0.2%
브라운스톤 | 17 | 0.2%
푸르지오 | 17 | 0.2%
경남아너스빌 | 16 | 0.1%
sk뷰 | 15 | 0.1%
북한산 | 14 | 0.1%
이편한세상 | 14 | 0.1%
Other values (2331) | 10419 | 96.8%

Most occurring characters

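Value | Count | Frequency (%)
[not preserved] | 2522 | 3.4%
[not preserved] | 2513 | 3.4%
[not preserved] | 2327 | 3.1%
[not preserved] | 1797 | 2.4%
[not preserved] | 1768 | 2.4%
[not preserved] | 1645 | 2.2%
[not preserved] | 1528 | 2.1%
[not preserved] | 1404 | 1.9%
[not preserved] | 1402 | 1.9%
[not preserved] | 1393 | 1.9%
Other values (423) | 55589 | 75.2%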

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 67565 | 91.4%
Decimal Number | 3658 | 5.0%
Space Separator | 868 | 1.2%
Uppercase Letter | 829 | 1.1%
Lowercase Letter | 393 | 0.5%
Open Punctuation | 164 | 0.2%
Close Punctuation | 164 | 0.2%
Dash Punctuation | 141 | 0.2%
Other Punctuation | 103 | 0.1%
Letter Number | 3 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[not preserved] | 2522 | 3.7%
[not preserved] | 2513 | 3.7%
[not preserved] | 2327 | 3.4%
[not preserved] | 1797 | 2.7%
[not preserved] | 1768 | 2.6%
[not preserved] | 1645 | 2.4%
[not preserved] | 1528 | 2.3%
[not preserved] | 1404 | 2.1%
[not preserved] | 1402 | 2.1%
[not preserved] | 1393 | 2.1%
Other values (378) | 49266 | 72.9%

Uppercase Letter
Value | Count | Frequency (%)
S | 132 | 15.9%
C | 117 | 14.1%
K | 105 | 12.7%
M | 78 | 9.4%
D | 78 | 9.4%
L | 57 | 6.9%
H | 51 | 6.2%
I | 46 | 5.5%
E | 40 | 4.8%
V | 30 | 3.6%
Other values (7) | 95 | 11.5%

Lowercase Letter
Value | Count | Frequency (%)
e | 214 | 54.5%
l | 35 | 8.9%
i | 33 | 8.4%
s | 28 | 7.1%
v | 23 | 5.9%
k | 20 | 5.1%
h | 13 | 3.3%
w | 13 | 3.3%
c | 8 | 2.0%
a | 3 | 0.8%

Decimal Number
Value | Count | Frequency (%)
1 | 1079 | 29.5%
2 | 1055 | 28.8%
3 | 503 | 13.8%
4 | 267 | 7.3%
5 | 195 | 5.3%
6 | 171 | 4.7%
7 | 129 | 3.5%
9 | 95 | 2.6%
8 | 92 | 2.5%
0 | 72 | 2.0%

Other Punctuation
Value | Count | Frequency (%)
, | 78 | 75.7%
. | 25 | 24.3%

Space Separator
Value | Count | Frequency (%)
(space) | 868 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 164 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 164 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 141 | 100.0%

Letter Number
Value | Count | Frequency (%)
[not preserved] | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 67565 | 91.4%
Common | 5098 | 6.9%
Latin | 1225 | 1.7%

Most frequent character per script

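Hangul
Value | Count | Frequency (%)
[not preserved] | 2522 | 3.7%
[not preserved] | 2513 | 3.7%
[not preserved] | 2327 | 3.4%
[not preserved] | 1797 | 2.7%
[not preserved] | 1768 | 2.6%
[not preserved] | 1645 | 2.4%
[not preserved] | 1528 | 2.3%
[not preserved] | 1404 | 2.1%
[not preserved] | 1402 | 2.1%
[not preserved] | 1393 | 2.1%
Other values (378) | 49266 | 72.9%

Latin
Value | Count | Frequency (%)
e | 214 | 17.5%
S | 132 | 10.8%
C | 117 | 9.6%
K | 105 | 8.6%
M | 78 | 6.4%
D | 78 | 6.4%
L | 57 | 4.7%
H | 51 | 4.2%
I | 46 | 3.8%
E | 40 | 3.3%
Other values (19) | 307 | 25.1%

Common
Value | Count | Frequency (%)
1 | 1079 | 21.2%
2 | 1055 | 20.7%
(space) | 868 | 17.0%
3 | 503 | 9.9%
4 | 267 | 5.2%
5 | 195 | 3.8%
6 | 171 | 3.4%
( | 164 | 3.2%
) | 164 | 3.2%
- | 141 | 2.8%
Other values (6) | 491 | 9.6%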

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 67565 | 91.4%
ASCII | 6320 | 8.6%
Number Forms | 3 | < 0.1%

Most frequent character per block

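Hangul
Value | Count | Frequency (%)
[not preserved] | 2522 | 3.7%
[not preserved] | 2513 | 3.7%
[not preserved] | 2327 | 3.4%
[not preserved] | 1797 | 2.7%
[not preserved] | 1768 | 2.6%
[not preserved] | 1645 | 2.4%
[not preserved] | 1528 | 2.3%
[not preserved] | 1404 | 2.1%
[not preserved] | 1402 | 2.1%
[not preserved] | 1393 | 2.1%
Other values (378) | 49266 | 72.9%

ASCII
Value | Count | Frequency (%)
1 | 1079 | 17.1%
2 | 1055 | 16.7%
(space) | 868 | 13.7%
3 | 503 | 8.0%
4 | 267 | 4.2%
e | 214 | 3.4%
5 | 195 | 3.1%
6 | 171 | 2.7%
( | 164 | 2.6%
) | 164 | 2.6%
Other values (34) | 1640 | 25.9%

Number Forms
Value | Count | Frequency (%)
[not preserved] | 3 | 100.0%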

아파트코드 (apartment code)
Text

Distinct: 2255
Distinct (%): 22.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 151
Unique (%): 1.5%

Sample

1st row: A13550502
2nd row: A13410010
3rd row: A10026571
4th row: A15728004
5th row: A10027472

Value | Count | Frequency (%)
a13986306 | 13 | 0.1%
a13204505 | 12 | 0.1%
a15679104 | 12 | 0.1%
a13789002 | 12 | 0.1%
a14003001 | 12 | 0.1%
a13820006 | 12 | 0.1%
a13776510 | 11 | 0.1%
a15009402 | 11 | 0.1%
a13905105 | 11 | 0.1%
a13811205 | 11 | 0.1%
Other values (2245) | 9883 | 98.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 18645 | 20.7%
1 | 17461 | 19.4%
A | 9994 | 11.1%
3 | 8772 | 9.7%
2 | 8223 | 9.1%
5 | 6170 | 6.9%
8 | 5519 | 6.1%
7 | 4699 | 5.2%
4 | 4076 | 4.5%
6 | 3353 | 3.7%
Other values (2) | 3088 | 3.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18645 | 23.3%
1 | 17461 | 21.8%
3 | 8772 | 11.0%
2 | 8223 | 10.3%
5 | 6170 | 7.7%
8 | 5519 | 6.9%
7 | 4699 | 5.9%
4 | 4076 | 5.1%
6 | 3353 | 4.2%
9 | 3082 | 3.9%

Uppercase Letter
Value | Count | Frequency (%)
A | 9994 | 99.9%
B | 6 | 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 88.9%
Latin | 10000 | 11.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18645 | 23.3%
1 | 17461 | 21.8%
3 | 8772 | 11.0%
2 | 8223 | 10.3%
5 | 6170 | 7.7%
8 | 5519 | 6.9%
7 | 4699 | 5.9%
4 | 4076 | 5.1%
6 | 3353 | 4.2%
9 | 3082 | 3.9%

Latin
Value | Count | Frequency (%)
A | 9994 | 99.9%
B | 6 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18645 | 20.7%
1 | 17461 | 19.4%
A | 9994 | 11.1%
3 | 8772 | 9.7%
2 | 8223 | 9.1%
5 | 6170 | 6.9%
8 | 5519 | 6.1%
7 | 4699 | 5.2%
4 | 4076 | 4.5%
6 | 3353 | 3.7%
Other values (2) | 3088 | 3.4%

비용명 (cost item name)
Text

Distinct: 77
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 9
Mean length: 5.9475
Min length: 2

Characters and Unicode

Total characters: 59475
Distinct characters: 107
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 세대배부용비품
2nd row: 선수난방비
3rd row: 기타유형자산
4th row: 연차수당충당부채
5th row: 소프트웨어

Value | Count | Frequency (%)
예금 | 350 | 3.5%
당기순이익 | 326 | 3.3%
가수금 | 317 | 3.2%
미처분이익잉여금 | 316 | 3.2%
공동주택적립금 | 309 | 3.1%
관리비미수금 | 307 | 3.1%
선급비용 | 306 | 3.1%
수선유지비충당부채 | 302 | 3.0%
비품 | 298 | 3.0%
미부과관리비 | 297 | 3.0%
Other values (67) | 6872 | 68.7%

Most occurring characters

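Value | Count | Frequency (%)
[not preserved] | 4679 | 7.9%
[not preserved] | 3797 | 6.4%
[not preserved] | 3093 | 5.2%
[not preserved] | 3021 | 5.1%
[not preserved] | 2988 | 5.0%
[not preserved] | 2882 | 4.8%
[not preserved] | 2561 | 4.3%
[not preserved] | 2429 | 4.1%
[not preserved] | 1895 | 3.2%
[not preserved] | 1763 | 3.0%
Other values (97) | 30367 | 51.1%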

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 59475 | 100.0%

Most frequent character per category

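Other Letter
Value | Count | Frequency (%)
[not preserved] | 4679 | 7.9%
[not preserved] | 3797 | 6.4%
[not preserved] | 3093 | 5.2%
[not preserved] | 3021 | 5.1%
[not preserved] | 2988 | 5.0%
[not preserved] | 2882 | 4.8%
[not preserved] | 2561 | 4.3%
[not preserved] | 2429 | 4.1%
[not preserved] | 1895 | 3.2%
[not preserved] | 1763 | 3.0%
Other values (97) | 30367 | 51.1%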

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 59475 | 100.0%

Most frequent character per script

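Hangul
Value | Count | Frequency (%)
[not preserved] | 4679 | 7.9%
[not preserved] | 3797 | 6.4%
[not preserved] | 3093 | 5.2%
[not preserved] | 3021 | 5.1%
[not preserved] | 2988 | 5.0%
[not preserved] | 2882 | 4.8%
[not preserved] | 2561 | 4.3%
[not preserved] | 2429 | 4.1%
[not preserved] | 1895 | 3.2%
[not preserved] | 1763 | 3.0%
Other values (97) | 30367 | 51.1%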

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 59475 | 100.0%

Most frequent character per block

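Hangul
Value | Count | Frequency (%)
[not preserved] | 4679 | 7.9%
[not preserved] | 3797 | 6.4%
[not preserved] | 3093 | 5.2%
[not preserved] | 3021 | 5.1%
[not preserved] | 2988 | 5.0%
[not preserved] | 2882 | 4.8%
[not preserved] | 2561 | 4.3%
[not preserved] | 2429 | 4.1%
[not preserved] | 1895 | 3.2%
[not preserved] | 1763 | 3.0%
Other values (97) | 30367 | 51.1%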

년월일 (date)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Value | Count
202206 | 10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202206
2nd row: 202206
3rd row: 202206
4th row: 202206
5th row: 202206

Common Values

Value | Count | Frequency (%)
202206 | 10000 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot; 202206 accounts for all 10000 rows (100.0%)]

금액 (amount)
Real number (ℝ)

ZEROS

Distinct: 7350
Distinct (%): 73.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 73552745
Minimum: -4.09024 × 10^9
Maximum: 9.0859515 × 10^9
Zeros: 2340
Zeros (%): 23.4%
Negative: 348
Negative (%): 3.5%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -4.09024 × 10^9
5-th percentile: 0
Q1: 0
Median: 3061335
Q3: 34484036
95-th percentile: 3.713755 × 10^8
Maximum: 9.0859515 × 10^9
Range: 1.3176191 × 10^10
Interquartile range (IQR): 34484036

Descriptive statistics

Standard deviation: 2.9125409 × 10^8
Coefficient of variation (CV): 3.9597991
Kurtosis: 221.3177
Mean: 73552745
Median Absolute Deviation (MAD): 3061335
Skewness: 11.07729
Sum: 7.3552745 × 10^11
Variance: 8.4828947 × 10^16
Monotonicity: Not monotonic
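Several of these figures are tied together by simple identities, which makes them easy to cross-check. The sketch below reads the report's scientific notation as powers of ten and recomputes the derived statistics from the rounded values printed above, so agreement is only expected to several significant digits. The variable names are illustrative.

```python
import math

# Rounded values as printed in this report.
n = 10000
mean = 73552745
std = 2.9125409e8
q1, q3 = 0, 34484036
minimum, maximum = -4.09024e9, 9.0859515e9

cv = std / mean                 # coefficient of variation
variance = std ** 2             # variance is the squared standard deviation
total = mean * n                # sum of all values
iqr = q3 - q1                   # interquartile range
value_range = maximum - minimum

print(f"CV={cv:.7f} variance={variance:.7e} sum={total:.7e}")
print(f"IQR={iqr} range={value_range:.7e}")

# Compare against the reported figures within a loose relative tolerance.
assert math.isclose(cv, 3.9597991, rel_tol=1e-6)
assert math.isclose(variance, 8.4828947e16, rel_tol=1e-6)
assert math.isclose(total, 7.3552745e11, rel_tol=1e-6)
assert math.isclose(value_range, 1.3176191e10, rel_tol=1e-6)
```

A coefficient of variation near 4, together with a mean far above the median, matches the strong right skew reported for this variable.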
[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
0 | 2340 | 23.4%
250000 | 28 | 0.3%
500000 | 23 | 0.2%
300000 | 17 | 0.2%
484000 | 15 | 0.1%
242000 | 15 | 0.1%
1000000 | 10 | 0.1%
10000000 | 8 | 0.1%
2000000 | 7 | 0.1%
100000 | 7 | 0.1%
Other values (7340) | 7530 | 75.3%

Minimum 10 values
Value | Count | Frequency (%)
-4090240000 | 1 | < 0.1%
-483769996 | 1 | < 0.1%
-283117556 | 1 | < 0.1%
-263942701 | 1 | < 0.1%
-230922000 | 1 | < 0.1%
-211155798 | 1 | < 0.1%
-195908810 | 1 | < 0.1%
-190422700 | 1 | < 0.1%
-179333490 | 1 | < 0.1%
-174771277 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
9085951481 | 1 | < 0.1%
7549587613 | 1 | < 0.1%
6575495178 | 1 | < 0.1%
5430910202 | 1 | < 0.1%
4397584477 | 1 | < 0.1%
4347672691 | 1 | < 0.1%
4109028170 | 1 | < 0.1%
4108913314 | 1 | < 0.1%
4032492352 | 1 | < 0.1%
3945671117 | 1 | < 0.1%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

 | 비용명 | 금액
비용명 | 1.000 | 0.287
금액 | 0.287 | 1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

First rows

(index) | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
31324 | 도곡삼성래미안 | A13550502 | 세대배부용비품 | 202206 | 1639200
27546 | 강일리버파크7단지 | A13410010 | 선수난방비 | 202206 | 0
5106 | 꿈의숲코오롱하늘채아파트 | A10026571 | 기타유형자산 | 202206 | 0
66627 | 마곡수명산파크2단지 | A15728004 | 연차수당충당부채 | 202206 | 4223430
6540 | 상도효성해링턴플레이스 | A10027472 | 소프트웨어 | 202206 | 1071120
13431 | 신수현대 | A12185603 | 저장품 | 202206 | 95700
42514 | 잠실현대 | A13886701 | 상여충당부채 | 202206 | 0
40378 | 가락1차현대아파트 | A13820004 | 장기수선충당예금 | 202206 | 510328385
1622 | 종암sh빌아파트 | A10024603 | 관리비미수금 | 202206 | 759010
4880 | 송파호반베르디움더퍼스트 | A10026362 | 퇴직급여충당부채 | 202206 | 23401980

Last rows

(index) | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
42180 | 송파파크데일1단지 | A13881701 | 장기수선충당부채 | 202206 | 202946251
18988 | 신내6단지 | A13176901 | 주차장충당예금 | 202206 | 12818669
56977 | 봉천은천1단지 | A15106101 | 공동주택적립금 | 202206 | 78459615
57494 | 신림건영1차 | A15185704 | 당기순이익 | 202206 | 11218179
3567 | 항동하버라인3단지 | A10025614 | 장기수선충당예금 | 202206 | 81489513
70569 | 목동10단지 | A15873701 | 기타공동주택관리비충당부채 | 202206 | 71572503
55636 | 보라매두산위브 | A15086001 | 선수관리비 | 202206 | 32240000
23873 | 마장세림 | A13305007 | 선급금 | 202206 | 73120
33071 | 삼선1SH-VILLE | A13604301 | 관리비예치금 | 202206 | 32841000
3320 | 항동하버라인2단지 | A10025387 | 퇴직급여충당부채 | 202206 | 68960720