Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
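
The percentages and the average record size follow directly from the counts. A minimal sketch of that arithmetic, assuming the memory figure is the rounded KiB value printed in the report; the variable names are illustrative and not taken from ydata-profiling:

```python
# Figures copied from the dataset statistics in this report.
n_variables = 5
n_observations = 10_000
missing_cells = 0
duplicate_rows = 0
total_memory_kib = 488.3  # rounded value as printed, so results are approximate

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# 1 KiB = 1024 bytes; dividing by the row count gives bytes per record.
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```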

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15820/S/1/datasetView.do

Alerts

년월일 has constant value "202307" (alert: Constant)
금액 has 2439 (24.4%) zeros (alert: Zeros)
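
Both alerts reduce to simple column checks. The sketch below illustrates the idea on the first five rows of the sample; the function names and the 10% threshold are assumptions for illustration, not ydata-profiling's actual implementation or defaults:

```python
def constant_alert(values):
    """Return True when a column holds exactly one distinct value."""
    return len(set(values)) == 1


def zeros_alert(values, threshold=0.10):
    """Return (zero_count, zero_share, flagged) for a numeric column.

    The threshold is a hypothetical cut-off chosen for this example.
    """
    count = sum(1 for v in values if v == 0)
    share = count / len(values)
    return count, share, share > threshold


# 년월일 and 금액 values from the first five sample rows of this report.
dates = ["202307"] * 5
amounts = [50396810, 0, 39735837, 4904800, 1184680]

print(constant_alert(dates))
print(zeros_alert(amounts))

# The share reported for the full column: 2439 zeros in 10000 rows.
report_zero_share = 2439 / 10_000
print(f"{report_zero_share:.1%}")
```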

Reproduction

Analysis started: 2024-05-11 05:56:07.905284
Analysis finished: 2024-05-11 05:56:09.145683
Duration: 1.24 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명 (apartment name)
Text

Distinct: 2250
Distinct (%): 22.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 20
Mean length: 7.4474
Min length: 2
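
Length statistics of this kind can be reproduced by measuring each string in code points. A minimal sketch using the five sample values of this variable (not the full column, so the numbers differ from the report); normalizing to NFC first is an assumption made here so that each Hangul syllable counts as one character:

```python
import unicodedata
from statistics import mean, median

# The five sample rows shown for this variable.
names = [
    "래미안장위포레카운티아파트",
    "신반포4차",
    "신내9단지",
    "둔촌동동아",
    "광장신동아파밀리에",
]

# len() counts code points; NFC keeps each Hangul syllable as one code point.
lengths = [len(unicodedata.normalize("NFC", name)) for name in names]

stats = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": mean(lengths),
    "min": min(lengths),
}
print(lengths)
print(stats)
```

The same relationship ties the report's figures together: a mean length of 7.4474 over 10000 rows corresponds to 74474 characters in total.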

Characters and Unicode

Total characters: 74474
Distinct characters: 435
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
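
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. That module exposes the general category but has no script or block lookup, so only the category breakdown is shown; the helper name and the choice of example strings (two apartment names from the sample rows) are illustrative:

```python
import unicodedata
from collections import Counter

# Two-letter general category codes and the names used in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
    "Nl": "Letter Number",
}


def category_counts(text):
    """Count the Unicode general category of every character in text."""
    return Counter(unicodedata.category(ch) for ch in text)


mixed = category_counts("DMC래미안e편한세상")
punctuated = category_counts("광장삼성1,2차")

for code, count in mixed.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```

Hangul syllables fall under Other Letter, which is why that category dominates an apartment-name column.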

Unique

Unique: 138
Unique (%): 1.4%
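
Distinct and Unique measure different things: distinct counts the different values present, while unique is read here as the number of values that occur exactly once. A small sketch under that reading, with a hypothetical helper name and a toy list of codes:

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of distinct values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Toy input: one code repeats, two occur once.
codes = ["A10025461", "A13790828", "A10025461", "A13187305"]
result = distinct_and_unique(codes)
print(result)
```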

Sample

1st row: 래미안장위포레카운티아파트
2nd row: 신반포4차
3rd row: 신내9단지
4th row: 둔촌동동아
5th row: 광장신동아파밀리에

Value | Count | Frequency (%)
아파트 | 194 | 1.8%
래미안 | 53 | 0.5%
e편한세상 | 28 | 0.3%
경남아너스빌 | 22 | 0.2%
송파 | 20 | 0.2%
아이파크 | 19 | 0.2%
래미안밤섬리베뉴 | 15 | 0.1%
해모로 | 15 | 0.1%
푸르지오 | 15 | 0.1%
sk뷰 | 14 | 0.1%
Other values (2337) | 10514 | 96.4%

Most occurring characters

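Value | Count | Frequency (%)
(not preserved) | 2580 | 3.5%
(not preserved) | 2539 | 3.4%
(not preserved) | 2437 | 3.3%
(not preserved) | 1896 | 2.5%
(not preserved) | 1695 | 2.3%
(not preserved) | 1692 | 2.3%
(not preserved) | 1474 | 2.0%
(not preserved) | 1469 | 2.0%
(not preserved) | 1440 | 1.9%
(not preserved) | 1410 | 1.9%
Other values (425) | 55842 | 75.0%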

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 68169 | 91.5%
Decimal Number | 3582 | 4.8%
Space Separator | 1010 | 1.4%
Uppercase Letter | 920 | 1.2%
Lowercase Letter | 302 | 0.4%
Open Punctuation | 133 | 0.2%
Close Punctuation | 133 | 0.2%
Dash Punctuation | 109 | 0.1%
Other Punctuation | 107 | 0.1%
Letter Number | 9 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 2580 | 3.8%
(not preserved) | 2539 | 3.7%
(not preserved) | 2437 | 3.6%
(not preserved) | 1896 | 2.8%
(not preserved) | 1695 | 2.5%
(not preserved) | 1692 | 2.5%
(not preserved) | 1474 | 2.2%
(not preserved) | 1469 | 2.2%
(not preserved) | 1440 | 2.1%
(not preserved) | 1410 | 2.1%
Other values (380) | 49537 | 72.7%

Uppercase Letter
Value | Count | Frequency (%)
S | 151 | 16.4%
K | 123 | 13.4%
C | 121 | 13.2%
D | 85 | 9.2%
M | 85 | 9.2%
L | 60 | 6.5%
I | 54 | 5.9%
E | 51 | 5.5%
H | 49 | 5.3%
V | 35 | 3.8%
Other values (7) | 106 | 11.5%

Lowercase Letter
Value | Count | Frequency (%)
e | 192 | 63.6%
l | 30 | 9.9%
i | 20 | 6.6%
v | 17 | 5.6%
s | 15 | 5.0%
k | 9 | 3.0%
h | 8 | 2.6%
c | 4 | 1.3%
w | 3 | 1.0%
a | 2 | 0.7%

Decimal Number
Value | Count | Frequency (%)
2 | 1034 | 28.9%
1 | 1032 | 28.8%
3 | 485 | 13.5%
4 | 268 | 7.5%
5 | 201 | 5.6%
6 | 167 | 4.7%
9 | 115 | 3.2%
7 | 108 | 3.0%
8 | 97 | 2.7%
0 | 75 | 2.1%

Other Punctuation
Value | Count | Frequency (%)
, | 83 | 77.6%
. | 24 | 22.4%

Space Separator
Value | Count | Frequency (%)
(space) | 1010 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 133 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 133 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 109 | 100.0%

Letter Number
Value | Count | Frequency (%)
(not preserved) | 9 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 68169 | 91.5%
Common | 5074 | 6.8%
Latin | 1231 | 1.7%

Most frequent character per script

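Hangul
Value | Count | Frequency (%)
(not preserved) | 2580 | 3.8%
(not preserved) | 2539 | 3.7%
(not preserved) | 2437 | 3.6%
(not preserved) | 1896 | 2.8%
(not preserved) | 1695 | 2.5%
(not preserved) | 1692 | 2.5%
(not preserved) | 1474 | 2.2%
(not preserved) | 1469 | 2.2%
(not preserved) | 1440 | 2.1%
(not preserved) | 1410 | 2.1%
Other values (380) | 49537 | 72.7%

Latin
Value | Count | Frequency (%)
e | 192 | 15.6%
S | 151 | 12.3%
K | 123 | 10.0%
C | 121 | 9.8%
D | 85 | 6.9%
M | 85 | 6.9%
L | 60 | 4.9%
I | 54 | 4.4%
E | 51 | 4.1%
H | 49 | 4.0%
Other values (19) | 260 | 21.1%

Common
Value | Count | Frequency (%)
2 | 1034 | 20.4%
1 | 1032 | 20.3%
(space) | 1010 | 19.9%
3 | 485 | 9.6%
4 | 268 | 5.3%
5 | 201 | 4.0%
6 | 167 | 3.3%
( | 133 | 2.6%
) | 133 | 2.6%
9 | 115 | 2.3%
Other values (6) | 496 | 9.8%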

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 68169 | 91.5%
ASCII | 6296 | 8.5%
Number Forms | 9 | < 0.1%

Most frequent character per block

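Hangul
Value | Count | Frequency (%)
(not preserved) | 2580 | 3.8%
(not preserved) | 2539 | 3.7%
(not preserved) | 2437 | 3.6%
(not preserved) | 1896 | 2.8%
(not preserved) | 1695 | 2.5%
(not preserved) | 1692 | 2.5%
(not preserved) | 1474 | 2.2%
(not preserved) | 1469 | 2.2%
(not preserved) | 1440 | 2.1%
(not preserved) | 1410 | 2.1%
Other values (380) | 49537 | 72.7%

ASCII
Value | Count | Frequency (%)
2 | 1034 | 16.4%
1 | 1032 | 16.4%
(space) | 1010 | 16.0%
3 | 485 | 7.7%
4 | 268 | 4.3%
5 | 201 | 3.2%
e | 192 | 3.0%
6 | 167 | 2.7%
S | 151 | 2.4%
( | 133 | 2.1%
Other values (34) | 1623 | 25.8%

Number Forms
Value | Count | Frequency (%)
(not preserved) | 9 | 100.0%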

아파트코드 (apartment code)
Text

Distinct: 2254
Distinct (%): 22.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 139
Unique (%): 1.4%

Sample

1st row: A10025461
2nd row: A13790828
3rd row: A13187305
4th row: A13406002
5th row: A14380605

Value | Count | Frequency (%)
a15792602 | 14 | 0.1%
a13380803 | 13 | 0.1%
a15089411 | 13 | 0.1%
a10026924 | 13 | 0.1%
a15805303 | 12 | 0.1%
a41279917 | 12 | 0.1%
a14272304 | 12 | 0.1%
a15721006 | 11 | 0.1%
a15722102 | 11 | 0.1%
a13006003 | 11 | 0.1%
Other values (2244) | 9878 | 98.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 18522 | 20.6%
1 | 17383 | 19.3%
A | 9988 | 11.1%
3 | 8776 | 9.8%
2 | 8425 | 9.4%
5 | 6257 | 7.0%
8 | 5447 | 6.1%
7 | 4661 | 5.2%
4 | 4013 | 4.5%
6 | 3423 | 3.8%
Other values (2) | 3105 | 3.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18522 | 23.2%
1 | 17383 | 21.7%
3 | 8776 | 11.0%
2 | 8425 | 10.5%
5 | 6257 | 7.8%
8 | 5447 | 6.8%
7 | 4661 | 5.8%
4 | 4013 | 5.0%
6 | 3423 | 4.3%
9 | 3093 | 3.9%

Uppercase Letter
Value | Count | Frequency (%)
A | 9988 | 99.9%
B | 12 | 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 88.9%
Latin | 10000 | 11.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18522 | 23.2%
1 | 17383 | 21.7%
3 | 8776 | 11.0%
2 | 8425 | 10.5%
5 | 6257 | 7.8%
8 | 5447 | 6.8%
7 | 4661 | 5.8%
4 | 4013 | 5.0%
6 | 3423 | 4.3%
9 | 3093 | 3.9%

Latin
Value | Count | Frequency (%)
A | 9988 | 99.9%
B | 12 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18522 | 20.6%
1 | 17383 | 19.3%
A | 9988 | 11.1%
3 | 8776 | 9.8%
2 | 8425 | 9.4%
5 | 6257 | 7.0%
8 | 5447 | 6.1%
7 | 4661 | 5.2%
4 | 4013 | 4.5%
6 | 3423 | 3.8%
Other values (2) | 3105 | 3.5%

비용명 (cost item name)
Text

Distinct: 77
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 9
Mean length: 5.9789
Min length: 2

Characters and Unicode

Total characters: 59789
Distinct characters: 107
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 당기순이익
2nd row: 미수수익
3rd row: 당기순이익
4th row: 미처분이익잉여금
5th row: 예수금

Value | Count | Frequency (%)
연차수당충당부채 | 323 | 3.2%
장기수선충당예금 | 315 | 3.1%
선급비용 | 314 | 3.1%
미처분이익잉여금 | 307 | 3.1%
관리비미수금 | 307 | 3.1%
비품 | 301 | 3.0%
당기순이익 | 300 | 3.0%
예수금 | 300 | 3.0%
공동주택적립금 | 298 | 3.0%
퇴직급여충당부채 | 293 | 2.9%
Other values (67) | 6942 | 69.4%

Most occurring characters

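Value | Count | Frequency (%)
(not preserved) | 4617 | 7.7%
(not preserved) | 3840 | 6.4%
(not preserved) | 3115 | 5.2%
(not preserved) | 3008 | 5.0%
(not preserved) | 3008 | 5.0%
(not preserved) | 2832 | 4.7%
(not preserved) | 2548 | 4.3%
(not preserved) | 2467 | 4.1%
(not preserved) | 1923 | 3.2%
(not preserved) | 1718 | 2.9%
Other values (97) | 30713 | 51.4%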

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 59789 | 100.0%

Most frequent character per category

Other Letter
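Value | Count | Frequency (%)
(not preserved) | 4617 | 7.7%
(not preserved) | 3840 | 6.4%
(not preserved) | 3115 | 5.2%
(not preserved) | 3008 | 5.0%
(not preserved) | 3008 | 5.0%
(not preserved) | 2832 | 4.7%
(not preserved) | 2548 | 4.3%
(not preserved) | 2467 | 4.1%
(not preserved) | 1923 | 3.2%
(not preserved) | 1718 | 2.9%
Other values (97) | 30713 | 51.4%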

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 59789 | 100.0%

Most frequent character per script

Hangul
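Value | Count | Frequency (%)
(not preserved) | 4617 | 7.7%
(not preserved) | 3840 | 6.4%
(not preserved) | 3115 | 5.2%
(not preserved) | 3008 | 5.0%
(not preserved) | 3008 | 5.0%
(not preserved) | 2832 | 4.7%
(not preserved) | 2548 | 4.3%
(not preserved) | 2467 | 4.1%
(not preserved) | 1923 | 3.2%
(not preserved) | 1718 | 2.9%
Other values (97) | 30713 | 51.4%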

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 59789 | 100.0%

Most frequent character per block

Hangul
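Value | Count | Frequency (%)
(not preserved) | 4617 | 7.7%
(not preserved) | 3840 | 6.4%
(not preserved) | 3115 | 5.2%
(not preserved) | 3008 | 5.0%
(not preserved) | 3008 | 5.0%
(not preserved) | 2832 | 4.7%
(not preserved) | 2548 | 4.3%
(not preserved) | 2467 | 4.1%
(not preserved) | 1923 | 3.2%
(not preserved) | 1718 | 2.9%
Other values (97) | 30713 | 51.4%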

년월일 (year-month-day)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
202307: 10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202307
2nd row: 202307
3rd row: 202307
4th row: 202307
5th row: 202307

Common Values

Value | Count | Frequency (%)
202307 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
202307 | 10000 | 100.0%

금액 (amount)
Real number (ℝ)

ZEROS 

Distinct: 7191
Distinct (%): 71.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 79015370
Minimum: -3.8900128 × 10^8
Maximum: 8.5835122 × 10^9
Zeros: 2439
Zeros (%): 24.4%
Negative: 349
Negative (%): 3.5%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -3.8900128 × 10^8
5-th percentile: 0
Q1: 0
Median: 2731579.5
Q3: 36731885
95-th percentile: 3.966876 × 10^8
Maximum: 8.5835122 × 10^9
Range: 8.9725135 × 10^9
Interquartile range (IQR): 36731885
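
Range and IQR are derived from the other figures: range is maximum minus minimum, and IQR is Q3 minus Q1. A short sketch that re-derives both from values printed in this report, then computes quartiles for the 금액 values of the first five sample rows. Using `statistics.quantiles` with `method="inclusive"` (linear interpolation between order statistics) is an assumption about the interpolation rule, not a statement of what the profiler does internally:

```python
from statistics import quantiles

# Exact extremes as listed among the smallest and largest 금액 values.
minimum = -389_001_283
maximum = 8_583_512_204
q1, q3 = 0, 36_731_885

value_range = maximum - minimum
iqr = q3 - q1
print(f"Range: {value_range:.7e}")
print(f"IQR: {iqr}")

# Quartiles of a tiny sample: 금액 from the first five sample rows.
sample = sorted([50396810, 0, 39735837, 4904800, 1184680])
sample_q1, sample_median, sample_q3 = quantiles(sample, n=4, method="inclusive")
print(sample_q1, sample_median, sample_q3)
```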

Descriptive statistics

Standard deviation: 2.957669 × 10^8
Coefficient of variation (CV): 3.7431566
Kurtosis: 161.09316
Mean: 79015370
Median Absolute Deviation (MAD): 2731579.5
Skewness: 10.114869
Sum: 7.901537 × 10^11
Variance: 8.7478058 × 10^16
Monotonicity: Not monotonic
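
Several of these statistics are tied together by simple identities: CV is the standard deviation divided by the mean, variance is the squared standard deviation, and the sum is the mean times the number of observations. The sketch below checks those identities against the rounded figures printed here, so agreement is only expected to within rounding:

```python
import math

# Rounded figures as printed in this report.
mean = 79_015_370
std = 2.957669e8
n_observations = 10_000

cv = std / mean              # coefficient of variation
variance = std ** 2          # variance is the square of the standard deviation
total = mean * n_observations

print(f"CV: {cv:.7f}")
print(f"Variance: {variance:.7e}")
print(f"Sum: {total:.6e}")

# Compare with the reported values, allowing for rounding in the inputs.
cv_matches = math.isclose(cv, 3.7431566, rel_tol=1e-6)
variance_matches = math.isclose(variance, 8.7478058e16, rel_tol=1e-6)
```

A CV well above 1, together with skewness near 10, points to a heavily right-skewed column in which a few very large amounts dominate the mean.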
Histogram with fixed size bins (bins=50)
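
Fixed-size binning splits the span between the minimum and maximum into equally wide intervals and counts the values falling in each. A minimal sketch of that idea with a hypothetical helper; it is not the plotting code used to produce the report, and it places the maximum in the last bin by convention:

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equally wide intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if lo == hi:
        # Degenerate case: every value is identical, so use the first bin.
        counts[0] = len(values)
        return counts
    width = (hi - lo) / bins
    for v in values:
        # The maximum would land at index `bins`, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


toy_counts = fixed_width_histogram([0, 1, 2, 3, 4, 10], bins=5)
print(toy_counts)
```

With 50 bins over a range of about 8.97 × 10^9, each bin is roughly 1.8 × 10^8 wide, so most of this column's values fall in the first few bins.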
Value | Count | Frequency (%)
0 | 2439 | 24.4%
500000 | 33 | 0.3%
250000 | 26 | 0.3%
300000 | 19 | 0.2%
1000000 | 17 | 0.2%
10000000 | 15 | 0.1%
100000 | 13 | 0.1%
484000 | 12 | 0.1%
200000 | 11 | 0.1%
242000 | 11 | 0.1%
Other values (7181) | 7404 | 74.0%

Value | Count | Frequency (%)
-389001283 | 1 | < 0.1%
-329973010 | 1 | < 0.1%
-164694360 | 1 | < 0.1%
-146871400 | 1 | < 0.1%
-128171264 | 1 | < 0.1%
-127089406 | 1 | < 0.1%
-125288064 | 1 | < 0.1%
-120316856 | 1 | < 0.1%
-117966690 | 1 | < 0.1%
-113580500 | 1 | < 0.1%

Value | Count | Frequency (%)
8583512204 | 1 | < 0.1%
6087730861 | 1 | < 0.1%
5457108888 | 1 | < 0.1%
5271660033 | 1 | < 0.1%
4820644805 | 1 | < 0.1%
4488118530 | 1 | < 0.1%
4397557030 | 1 | < 0.1%
3958199369 | 1 | < 0.1%
3849821947 | 1 | < 0.1%
3741998939 | 1 | < 0.1%

Interactions


Correlations

Variable | 비용명 | 금액
비용명 | 1.000 | 0.484
금액 | 0.484 | 1.000
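
The report does not say which measure produced the 0.484 between the text column 비용명 and the numeric column 금액. One common choice for a categorical/numeric pair is Cramér's V computed after binning the numeric column; the sketch below assumes that choice and implements the plain, uncorrected statistic, so it is not expected to reproduce the reported value exactly. The function name and toy inputs are illustrative:

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, row_n in row_totals.items():
        for y, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Toy data: the amount bin is fully determined by the category...
perfect = cramers_v(["a", "a", "b", "b"], ["low", "low", "high", "high"])
# ...versus a bin that is independent of the category.
independent = cramers_v(["a", "a", "b", "b"], ["low", "high", "low", "high"])
print(perfect, independent)
```

Values range from 0 (no association) to 1 (one variable fully determines the other), which is consistent with reading 0.484 as a moderate association.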

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
4189 | 래미안장위포레카운티아파트 | A10025461 | 당기순이익 | 202307 | 50396810
39618 | 신반포4차 | A13790828 | 미수수익 | 202307 | 0
21197 | 신내9단지 | A13187305 | 당기순이익 | 202307 | 39735837
27629 | 둔촌동동아 | A13406002 | 미처분이익잉여금 | 202307 | 4904800
52488 | 광장신동아파밀리에 | A14380605 | 예수금 | 202307 | 1184680
11219 | DMC래미안e편한세상 | A12013003 | 소프트웨어 | 202307 | 0
18073 | 대림아파트201동 | A13079401 | 미부과관리비 | 202307 | 22091429
52603 | 광장삼성1,2차 | A14381506 | 예금 | 202307 | 104108780
50145 | 원효산호 | A14085002 | 시설보수충당부채 | 202307 | 94206
55415 | 래미안당산1차아파트 | A15081001 | 연차수당충당부채 | 202307 | 12429488

Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
13949 | 마포태영아파트 | A12181103 | 선급금 | 202307 | 8441230
9002 | 신당푸르지오 | A10045001 | 수선유지비충당부채 | 202307 | 1795290
24287 | 창동성원 | A13292701 | 공동주택적립금 | 202307 | 15053434
49075 | 대우월드마크용산 | A14001101 | 주차장충당예금 | 202307 | 0
44433 | 중계한화꿈에그린 | A13922905 | 예수금 | 202307 | 1942130
10791 | 홍은풍림2차 | A12010103 | 미수수익 | 202307 | 0
70353 | 신트리4단지 | A15807316 | 장기수선충당예금 | 202307 | 690166809
43115 | 송파현대힐스테이트 | A13887901 | 청소비충당부채 | 202307 | 6388664
33649 | 돈암동일하이빌 | A13603501 | 기타당좌자산 | 202307 | 0
20398 | 신내5단지대림두산 | A13184610 | 저장품 | 202307 | 982190