Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
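As a quick consistency check, the average record size follows from dividing the total in-memory size by the number of observations. The sketch below is a minimal illustration, assuming 1 KiB = 1024 bytes and using the rounded figures reported here; the variable names are hypothetical.

```python
# Figures copied from the dataset statistics in this report.
total_size_kib = 488.3
n_observations = 10000
n_variables = 5
missing_cells = 0

BYTES_PER_KIB = 1024  # assumption: KiB is the binary unit (1024 bytes)

# Average bytes per row = total bytes / number of rows.
avg_record_bytes = total_size_kib * BYTES_PER_KIB / n_observations

# Missing cells as a share of all cells (rows x columns).
missing_pct = 100 * missing_cells / (n_observations * n_variables)

print(f"{avg_record_bytes:.1f} B")
print(f"{missing_pct:.1f}%")
```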

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15820/S/1/datasetView.do

Alerts

년월일 has constant value "202002" (Constant)
금액 is highly skewed (γ1 = 27.29397723) (Skewed)
금액 has 2107 (21.1%) zeros (Zeros)

Reproduction

Analysis started: 2024-05-11 06:00:39.259600
Analysis finished: 2024-05-11 06:00:40.442970
Duration: 1.18 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
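The reported duration can be checked from the two timestamps. This is a minimal sketch using only the Python standard library; it assumes both timestamps are in the same time zone.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-05-11 06:00:39.259600")
finished = datetime.fromisoformat("2024-05-11 06:00:40.442970")

# Subtracting two datetimes gives a timedelta; convert it to seconds.
duration_seconds = (finished - started).total_seconds()

print(f"{duration_seconds:.2f} seconds")
```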

Variables

아파트명
Text

Distinct: 2166
Distinct (%): 21.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 22
Median length: 20
Mean length: 7.2041
Min length: 2

Characters and Unicode

Total characters: 72041
Distinct characters: 431
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
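To illustrate how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own implementation; the helper name `category_counts` is hypothetical, and the two input strings are taken from values shown in this report. The short codes map to the category names used in the tables (`Lo` = Other Letter, `Nd` = Decimal Number, `Ps` = Open Punctuation, `Po` = Other Punctuation).

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count the Unicode general category of every character in the strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Two values that appear in this report.
sample = ["서초롯데캐슬프레지던트아파트", "신반포한신5지구(12,13,18차"]
counts = category_counts(sample)

for code, count in counts.most_common():
    print(code, count)
```

The standard library does not expose the script or block properties, so those breakdowns would need another data source.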

Unique

Unique: 108
Unique (%): 1.1%

Sample

1st row: 서초롯데캐슬프레지던트아파트
2nd row: 양평동보아파트
3rd row: 아카데미스위트
4th row: 상일동아
5th row: 개봉동현대아파트
Value | Count | Frequency (%)
아파트 | 115 | 1.1%
래미안 | 26 | 0.2%
은평뉴타운상림마을6단지 | 17 | 0.2%
신반포한신5지구(12,13,18차 | 15 | 0.1%
은평뉴타운상림마을제3단지 | 13 | 0.1%
우리유앤미 | 13 | 0.1%
제1아파트 | 13 | 0.1%
힐스테이트 | 13 | 0.1%
래미안밤섬리베뉴 | 13 | 0.1%
브라운스톤 | 13 | 0.1%
Other values (2226) | 10243 | 97.6%
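The counts in this frequency listing sum to more than the 10000 observations, which suggests that the percentages are shares of all counted tokens rather than of rows. That reading is an assumption; the minimal sketch below checks it against the listed counts.

```python
# Counts copied from the frequency listing; "other_count" is the "Other values" row.
top_counts = [115, 26, 17, 15, 13, 13, 13, 13, 13, 13]
other_count = 10243

# Assumption: percentages are relative to the total number of counted tokens.
total_tokens = sum(top_counts) + other_count


def share(count, total):
    """Return the percentage share rounded to one decimal place."""
    return round(100 * count / total, 1)


print(total_tokens)
print(share(115, total_tokens))
print(share(other_count, total_tokens))
```

Under this assumption the recomputed shares agree with the reported 1.1% and 97.6%.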

Most occurring characters

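Value | Count | Frequency (%)
(blank) | 2238 | 3.1%
(blank) | 2182 | 3.0%
(blank) | 1962 | 2.7%
(blank) | 1867 | 2.6%
(blank) | 1806 | 2.5%
(blank) | 1612 | 2.2%
(blank) | 1564 | 2.2%
(blank) | 1526 | 2.1%
(blank) | 1467 | 2.0%
(blank) | 1274 | 1.8%
Other values (421) | 54543 | 75.7%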

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 65999 | 91.6%
Decimal Number | 3912 | 5.4%
Uppercase Letter | 692 | 1.0%
Space Separator | 537 | 0.7%
Lowercase Letter | 301 | 0.4%
Open Punctuation | 153 | 0.2%
Close Punctuation | 153 | 0.2%
Other Punctuation | 148 | 0.2%
Dash Punctuation | 133 | 0.2%
Letter Number | 9 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(blank) | 2238 | 3.4%
(blank) | 2182 | 3.3%
(blank) | 1962 | 3.0%
(blank) | 1867 | 2.8%
(blank) | 1806 | 2.7%
(blank) | 1612 | 2.4%
(blank) | 1564 | 2.4%
(blank) | 1526 | 2.3%
(blank) | 1467 | 2.2%
(blank) | 1274 | 1.9%
Other values (375) | 48501 | 73.5%

Uppercase Letter
Value | Count | Frequency (%)
S | 127 | 18.4%
K | 93 | 13.4%
C | 74 | 10.7%
L | 57 | 8.2%
M | 51 | 7.4%
D | 51 | 7.4%
G | 43 | 6.2%
E | 38 | 5.5%
H | 37 | 5.3%
I | 34 | 4.9%
Other values (7) | 87 | 12.6%

Lowercase Letter
Value | Count | Frequency (%)
e | 192 | 63.8%
l | 26 | 8.6%
i | 24 | 8.0%
v | 16 | 5.3%
k | 11 | 3.7%
s | 10 | 3.3%
w | 8 | 2.7%
c | 6 | 2.0%
a | 3 | 1.0%
g | 3 | 1.0%

Decimal Number
Value | Count | Frequency (%)
1 | 1182 | 30.2%
2 | 1142 | 29.2%
3 | 525 | 13.4%
4 | 302 | 7.7%
5 | 226 | 5.8%
6 | 148 | 3.8%
7 | 109 | 2.8%
8 | 100 | 2.6%
0 | 92 | 2.4%
9 | 86 | 2.2%

Other Punctuation
Value | Count | Frequency (%)
, | 128 | 86.5%
. | 20 | 13.5%

Space Separator
Value | Count | Frequency (%)
(space) | 537 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 153 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 153 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 133 | 100.0%

Letter Number
Value | Count | Frequency (%)
(blank) | 9 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 4 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 65999 | 91.6%
Common | 5040 | 7.0%
Latin | 1002 | 1.4%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(blank) | 2238 | 3.4%
(blank) | 2182 | 3.3%
(blank) | 1962 | 3.0%
(blank) | 1867 | 2.8%
(blank) | 1806 | 2.7%
(blank) | 1612 | 2.4%
(blank) | 1564 | 2.4%
(blank) | 1526 | 2.3%
(blank) | 1467 | 2.2%
(blank) | 1274 | 1.9%
Other values (375) | 48501 | 73.5%

Latin
Value | Count | Frequency (%)
e | 192 | 19.2%
S | 127 | 12.7%
K | 93 | 9.3%
C | 74 | 7.4%
L | 57 | 5.7%
M | 51 | 5.1%
D | 51 | 5.1%
G | 43 | 4.3%
E | 38 | 3.8%
H | 37 | 3.7%
Other values (19) | 239 | 23.9%

Common
Value | Count | Frequency (%)
1 | 1182 | 23.5%
2 | 1142 | 22.7%
(space) | 537 | 10.7%
3 | 525 | 10.4%
4 | 302 | 6.0%
5 | 226 | 4.5%
( | 153 | 3.0%
) | 153 | 3.0%
6 | 148 | 2.9%
- | 133 | 2.6%
Other values (7) | 539 | 10.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 65999 | 91.6%
ASCII | 6033 | 8.4%
Number Forms | 9 | < 0.1%

Most frequent character per block

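Hangul
Value | Count | Frequency (%)
(blank) | 2238 | 3.4%
(blank) | 2182 | 3.3%
(blank) | 1962 | 3.0%
(blank) | 1867 | 2.8%
(blank) | 1806 | 2.7%
(blank) | 1612 | 2.4%
(blank) | 1564 | 2.4%
(blank) | 1526 | 2.3%
(blank) | 1467 | 2.2%
(blank) | 1274 | 1.9%
Other values (375) | 48501 | 73.5%

ASCII
Value | Count | Frequency (%)
1 | 1182 | 19.6%
2 | 1142 | 18.9%
(space) | 537 | 8.9%
3 | 525 | 8.7%
4 | 302 | 5.0%
5 | 226 | 3.7%
e | 192 | 3.2%
( | 153 | 2.5%
) | 153 | 2.5%
6 | 148 | 2.5%
Other values (35) | 1473 | 24.4%

Number Forms
Value | Count | Frequency (%)
(blank) | 9 | 100.0%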

아파트코드
Text

Distinct: 2173
Distinct (%): 21.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 110
Unique (%): 1.1%

Sample

1st row: A10028146
2nd row: A15010501
3rd row: A13527011
4th row: A13409001
5th row: A15209207
Value | Count | Frequency (%)
a13790726 | 15 | 0.1%
a41279908 | 13 | 0.1%
a12220001 | 13 | 0.1%
a10026571 | 12 | 0.1%
a15105303 | 12 | 0.1%
a13788208 | 12 | 0.1%
a15807703 | 12 | 0.1%
a14004002 | 11 | 0.1%
a13920506 | 11 | 0.1%
a12071002 | 11 | 0.1%
Other values (2163) | 9878 | 98.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 18212 | 20.2%
1 | 17782 | 19.8%
A | 9990 | 11.1%
3 | 8869 | 9.9%
2 | 7996 | 8.9%
5 | 6314 | 7.0%
8 | 5797 | 6.4%
7 | 4950 | 5.5%
4 | 3765 | 4.2%
6 | 3373 | 3.7%
Other values (2) | 2952 | 3.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18212 | 22.8%
1 | 17782 | 22.2%
3 | 8869 | 11.1%
2 | 7996 | 10.0%
5 | 6314 | 7.9%
8 | 5797 | 7.2%
7 | 4950 | 6.2%
4 | 3765 | 4.7%
6 | 3373 | 4.2%
9 | 2942 | 3.7%

Uppercase Letter
Value | Count | Frequency (%)
A | 9990 | 99.9%
B | 10 | 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 88.9%
Latin | 10000 | 11.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18212 | 22.8%
1 | 17782 | 22.2%
3 | 8869 | 11.1%
2 | 7996 | 10.0%
5 | 6314 | 7.9%
8 | 5797 | 7.2%
7 | 4950 | 6.2%
4 | 3765 | 4.7%
6 | 3373 | 4.2%
9 | 2942 | 3.7%

Latin
Value | Count | Frequency (%)
A | 9990 | 99.9%
B | 10 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18212 | 20.2%
1 | 17782 | 19.8%
A | 9990 | 11.1%
3 | 8869 | 9.9%
2 | 7996 | 8.9%
5 | 6314 | 7.0%
8 | 5797 | 6.4%
7 | 4950 | 5.5%
4 | 3765 | 4.2%
6 | 3373 | 3.7%
Other values (2) | 2952 | 3.3%

비용명
Text

Distinct: 77
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 9
Mean length: 5.9584
Min length: 2

Characters and Unicode

Total characters: 59584
Distinct characters: 107
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 가수금
2nd row: 기타유형자산감가상각누계액
3rd row: 미부과관리비
4th row: 선급비용
5th row: 장기수선충당예금
Value | Count | Frequency (%)
퇴직급여충당부채 | 341 | 3.4%
예금 | 338 | 3.4%
예수금 | 330 | 3.3%
선급비용 | 316 | 3.2%
당기순이익 | 316 | 3.2%
장기수선충당부채 | 315 | 3.1%
연차수당충당부채 | 311 | 3.1%
미처분이익잉여금 | 303 | 3.0%
미부과관리비 | 300 | 3.0%
비품 | 298 | 3.0%
Other values (67) | 6832 | 68.3%

Most occurring characters

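Value | Count | Frequency (%)
(blank) | 4698 | 7.9%
(blank) | 3838 | 6.4%
(blank) | 3237 | 5.4%
(blank) | 3133 | 5.3%
(blank) | 3019 | 5.1%
(blank) | 2934 | 4.9%
(blank) | 2701 | 4.5%
(blank) | 2337 | 3.9%
(blank) | 1955 | 3.3%
(blank) | 1772 | 3.0%
Other values (97) | 29960 | 50.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 59584 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(blank) | 4698 | 7.9%
(blank) | 3838 | 6.4%
(blank) | 3237 | 5.4%
(blank) | 3133 | 5.3%
(blank) | 3019 | 5.1%
(blank) | 2934 | 4.9%
(blank) | 2701 | 4.5%
(blank) | 2337 | 3.9%
(blank) | 1955 | 3.3%
(blank) | 1772 | 3.0%
Other values (97) | 29960 | 50.3%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 59584 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(blank) | 4698 | 7.9%
(blank) | 3838 | 6.4%
(blank) | 3237 | 5.4%
(blank) | 3133 | 5.3%
(blank) | 3019 | 5.1%
(blank) | 2934 | 4.9%
(blank) | 2701 | 4.5%
(blank) | 2337 | 3.9%
(blank) | 1955 | 3.3%
(blank) | 1772 | 3.0%
Other values (97) | 29960 | 50.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 59584 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(blank) | 4698 | 7.9%
(blank) | 3838 | 6.4%
(blank) | 3237 | 5.4%
(blank) | 3133 | 5.3%
(blank) | 3019 | 5.1%
(blank) | 2934 | 4.9%
(blank) | 2701 | 4.5%
(blank) | 2337 | 3.9%
(blank) | 1955 | 3.3%
(blank) | 1772 | 3.0%
Other values (97) | 29960 | 50.3%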

년월일
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
202002: 10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202002
2nd row: 202002
3rd row: 202002
4th row: 202002
5th row: 202002

Common Values

Value | Count | Frequency (%)
202002 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
202002 | 10000 | 100.0%

금액
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 7559
Distinct (%): 75.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 76402797
Minimum: -4.09024 × 10^9
Maximum: 2.097081 × 10^10
Zeros: 2107
Zeros (%): 21.1%
Negative: 328
Negative (%): 3.3%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -4.09024 × 10^9
5-th percentile: 0
Q1: 8236.25
Median: 3457040
Q3: 37221905
95-th percentile: 3.6697535 × 10^8
Maximum: 2.097081 × 10^10
Range: 2.506105 × 10^10
Interquartile range (IQR): 37213669
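The range and interquartile range follow directly from the other quantile figures: range = maximum - minimum and IQR = Q3 - Q1. The minimal sketch below recomputes both, using the unrounded minimum and maximum as they appear in the report's listings of the smallest and largest 금액 values.

```python
# Unrounded extremes as listed among the smallest and largest values of 금액.
minimum = -4_090_240_000
maximum = 20_970_809_510

# Quartiles copied from the quantile statistics.
q1 = 8236.25
q3 = 37_221_905

value_range = maximum - minimum  # range = max - min
iqr = q3 - q1                    # interquartile range = Q3 - Q1

print(f"{value_range:.6e}")
print(round(iqr))
```

Both results agree with the reported values once rounded to the precision shown in the report.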

Descriptive statistics

Standard deviation: 3.4725599 × 10^8
Coefficient of variation (CV): 4.5450692
Kurtosis: 1407.9972
Mean: 76402797
Median Absolute Deviation (MAD): 3457040
Skewness: 27.293977
Sum: 7.6402797 × 10^11
Variance: 1.2058673 × 10^17
Monotonicity: Not monotonic
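Several of these statistics are tied together by simple identities: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. The sketch below checks them with the rounded figures from this report, so agreement is only expected up to rounding.

```python
import math

# Rounded figures copied from the report.
mean = 76_402_797
std = 3.4725599e8
n = 10_000

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the square of the standard deviation
total = mean * n       # sum of all values

# Compare against the reported values with a loose relative tolerance.
print(math.isclose(cv, 4.5450692, rel_tol=1e-6))
print(math.isclose(variance, 1.2058673e17, rel_tol=1e-6))
print(math.isclose(total, 7.6402797e11, rel_tol=1e-6))
```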
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 2107 | 21.1%
250000 | 26 | 0.3%
500000 | 24 | 0.2%
300000 | 16 | 0.2%
242000 | 12 | 0.1%
5000000 | 12 | 0.1%
55000 | 12 | 0.1%
10000000 | 12 | 0.1%
1000000 | 11 | 0.1%
100000 | 10 | 0.1%
Other values (7549) | 7758 | 77.6%

Minimum 10 values

Value | Count | Frequency (%)
-4090240000 | 1 | < 0.1%
-408407676 | 1 | < 0.1%
-245627770 | 1 | < 0.1%
-243406144 | 1 | < 0.1%
-240093440 | 1 | < 0.1%
-194381780 | 1 | < 0.1%
-179048970 | 1 | < 0.1%
-169830614 | 1 | < 0.1%
-129945492 | 1 | < 0.1%
-127399450 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
20970809510 | 1 | < 0.1%
9967592425 | 1 | < 0.1%
5406884888 | 1 | < 0.1%
5061438094 | 1 | < 0.1%
4772425366 | 1 | < 0.1%
4090240000 | 1 | < 0.1%
3979903204 | 1 | < 0.1%
3650587166 | 1 | < 0.1%
3609763351 | 1 | < 0.1%
3543228234 | 1 | < 0.1%

Interactions


Correlations

Variable | 비용명 | 금액
비용명 | 1.000 | 0.372
금액 | 0.372 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

(index) | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
4920 | 서초롯데캐슬프레지던트아파트 | A10028146 | 가수금 | 202002 | 641148
50294 | 양평동보아파트 | A15010501 | 기타유형자산감가상각누계액 | 202002 | -957000
27425 | 아카데미스위트 | A13527011 | 미부과관리비 | 202002 | 261701363
23924 | 상일동아 | A13409001 | 선급비용 | 202002 | 4958500
54387 | 개봉동현대아파트 | A15209207 | 장기수선충당예금 | 202002 | 1972654726
3960 | 마곡힐스테이트 | A10027687 | 공동주택적립금 | 202002 | 113990026
15488 | 면목경남아너스빌 | A13120403 | 비품감가상각누계액 | 202002 | 17972156
38323 | 거여우방 | A13881601 | 장기수선충당예금 | 202002 | 287856521
48395 | 자양우성2차 | A14386109 | 미지급금 | 202002 | 48058050
46740 | 송천센트레빌 | A14272313 | 미수금 | 202002 | 535522

Last rows

(index) | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
37142 | 잠실엘스아파트 입주자대표회의 | A13822004 | 기타유동부채 | 202002 | 2320358
36395 | 송파동부센트레빌 | A13816101 | 퇴직급여충당예금 | 202002 | 56063597
41181 | 공릉우성 | A13980108 | 퇴직급여충당부채 | 202002 | 30530050
41915 | 상계조합대림 | A13981407 | 미처분이익잉여금 | 202002 | 97379278
33354 | 방배서리풀e편한세상 | A13771601 | 미수관리비예치금 | 202002 | 0
34878 | 양재우성KBS(113동) | A13789201 | 퇴직급여충당부채 | 202002 | 21187731
33808 | 엠브이아파트 | A13780201 | 선급비용 | 202002 | 6156260
55155 | 고척대우 | A15279404 | 상여충당부채 | 202002 | 0
67819 | 은평뉴타운박석고개1단지 | A41279910 | 선급금 | 202002 | 43160
22513 | 성수현대그린 | A13381902 | 미부과관리비 | 202002 | 35362210