Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
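
The derived figures in this table follow from the others by simple arithmetic. The sketch below is a minimal check, assuming the total size is reported in KiB (1 KiB = 1024 B) and that the missing-cell percentage is taken over all rows times columns. The function names are illustrative and are not part of ydata-profiling.

```python
# Values copied from the "Dataset statistics" table.
total_size_kib = 488.3
n_observations = 10_000
n_variables = 5
missing_cells = 0


def average_record_size_bytes(total_kib: float, rows: int) -> float:
    """Spread the total in-memory size (KiB) evenly over the rows, in bytes."""
    return total_kib * 1024 / rows


def missing_cell_share(missing: int, rows: int, cols: int) -> float:
    """Fraction of all cells (rows x columns) that are missing."""
    return missing / (rows * cols)


avg_record = average_record_size_bytes(total_size_kib, n_observations)
missing_share = missing_cell_share(missing_cells, n_observations, n_variables)
print(round(avg_record, 1), f"{missing_share:.1%}")
```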

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15820/S/1/datasetView.do

Alerts

년월일 has constant value "201910" (Constant)
금액 has 2184 (21.8%) zeros (Zeros)
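
Both alerts describe simple column-level checks. The sketch below shows one way such checks could be expressed; it is an illustration only, not the rule set ydata-profiling applies, and the helper names and the miniature columns are made up for the example.

```python
from typing import Sequence


def is_constant(values: Sequence) -> bool:
    """True when a non-empty column holds exactly one distinct value."""
    return len(values) > 0 and len(set(values)) == 1


def zero_share(values: Sequence[float]) -> float:
    """Fraction of entries equal to zero."""
    return sum(1 for v in values if v == 0) / len(values)


# Illustrative miniature columns (five rows each).
year_month = ["201910"] * 5
amounts = [0, 13486073, 0, 9506690, 53455180]

print(is_constant(year_month), zero_share(amounts))

# The reported alert figure: 2184 zeros out of 10000 observations.
reported_zero_share = 2184 / 10_000
```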

Reproduction

Analysis started: 2024-05-11 06:01:07.201009
Analysis finished: 2024-05-11 06:01:08.273315
Duration: 1.07 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명
Text

Distinct: 2110
Distinct (%): 21.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 22
Median length: 20
Mean length: 7.1796
Min length: 2
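
The length statistics are plain aggregates over the character count of each value. A minimal sketch, using the five values listed in this variable's Sample rows as input; the helper name is an assumption for illustration. Note that the mean length times the number of observations (7.1796 × 10000) matches the 71796 total characters reported for the same variable.

```python
from statistics import mean, median


def length_stats(values):
    """Summarise string lengths: max, median, mean, min and total characters."""
    lengths = [len(v) for v in values]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
        "total_characters": sum(lengths),
    }


names = ["우장산에스케이뷰", "갈현베르빌주상복합아파트", "묵동신안2차", "래미안아름숲", "상계현대1차"]
stats = length_stats(names)
print(stats)
```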

Characters and Unicode

Total characters: 71796
Distinct characters: 430
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
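
As a rough illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories in a made-up string (not a value from the dataset). The mapping from two-letter category codes to long names is written out by hand and covers only the ten categories that appear for this variable.

```python
import unicodedata
from collections import Counter

# Two-letter general-category codes and their long names.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Nl": "Letter Number",
}


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category (e.g. 'Lo', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)


# Hypothetical example string mixing Hangul, digits, Latin letters and a Roman numeral.
sample = "상계현대1차 SK-View(Ⅱ)"
counts = category_counts(sample)
for code, n in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), n)
```

Hangul syllables fall under Other Letter, while a Roman numeral such as Ⅱ (U+2161) is a Letter Number in the Number Forms block.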

Unique

Unique: 111
Unique (%): 1.1%
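
"Distinct" and "Unique" measure different things: distinct counts how many different values occur, while unique counts only the values that occur exactly once. A minimal sketch with made-up values; the function name is illustrative.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Hypothetical column: "b" and "d" occur once, "a" and "c" repeat.
distinct, unique = distinct_and_unique(["a", "a", "b", "c", "c", "c", "d"])
print(distinct, unique)
```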

Sample

1st row: 우장산에스케이뷰
2nd row: 갈현베르빌주상복합아파트
3rd row: 묵동신안2차
4th row: 래미안아름숲
5th row: 상계현대1차

Value | Count | Frequency (%)
아파트 | 114 | 1.1%
래미안 | 23 | 0.2%
래미안밤섬리베뉴 | 17 | 0.2%
힐스테이트 | 17 | 0.2%
신반포한신5지구(12,13,18차 | 13 | 0.1%
무학현대 | 12 | 0.1%
은평뉴타운구파발9-2단지 | 12 | 0.1%
상암월드컵파크9단지 | 12 | 0.1%
래미안라센트 | 12 | 0.1%
북한산 | 12 | 0.1%
Other values (2170) | 10286 | 97.7%

Most occurring characters

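Value | Count | Frequency (%)
(missing) | 2168 | 3.0%
(missing) | 2167 | 3.0%
(missing) | 1967 | 2.7%
(missing) | 1879 | 2.6%
(missing) | 1878 | 2.6%
(missing) | 1629 | 2.3%
(missing) | 1567 | 2.2%
(missing) | 1552 | 2.2%
(missing) | 1484 | 2.1%
(missing) | 1416 | 2.0%
Other values (420) | 54089 | 75.3%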

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 65649 | 91.4%
Decimal Number | 3943 | 5.5%
Uppercase Letter | 643 | 0.9%
Space Separator | 582 | 0.8%
Lowercase Letter | 373 | 0.5%
Dash Punctuation | 160 | 0.2%
Open Punctuation | 149 | 0.2%
Close Punctuation | 149 | 0.2%
Other Punctuation | 139 | 0.2%
Letter Number | 9 | < 0.1%

Most frequent character per category

Other Letter
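Value | Count | Frequency (%)
(missing) | 2168 | 3.3%
(missing) | 2167 | 3.3%
(missing) | 1967 | 3.0%
(missing) | 1879 | 2.9%
(missing) | 1878 | 2.9%
(missing) | 1629 | 2.5%
(missing) | 1567 | 2.4%
(missing) | 1552 | 2.4%
(missing) | 1484 | 2.3%
(missing) | 1416 | 2.2%
Other values (375) | 47942 | 73.0%
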
Uppercase Letter
Value | Count | Frequency (%)
S | 106 | 16.5%
K | 90 | 14.0%
C | 74 | 11.5%
L | 49 | 7.6%
H | 43 | 6.7%
M | 42 | 6.5%
D | 42 | 6.5%
I | 39 | 6.1%
G | 31 | 4.8%
E | 30 | 4.7%
Other values (7) | 97 | 15.1%

Lowercase Letter
Value | Count | Frequency (%)
e | 193 | 51.7%
l | 38 | 10.2%
i | 34 | 9.1%
v | 29 | 7.8%
s | 22 | 5.9%
k | 21 | 5.6%
w | 13 | 3.5%
c | 12 | 3.2%
h | 7 | 1.9%
a | 2 | 0.5%

Decimal Number
Value | Count | Frequency (%)
1 | 1242 | 31.5%
2 | 1125 | 28.5%
3 | 520 | 13.2%
4 | 270 | 6.8%
5 | 220 | 5.6%
6 | 160 | 4.1%
8 | 105 | 2.7%
9 | 104 | 2.6%
7 | 102 | 2.6%
0 | 95 | 2.4%

Other Punctuation
Value | Count | Frequency (%)
, | 116 | 83.5%
. | 23 | 16.5%

Space Separator
Value | Count | Frequency (%)
(space) | 582 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 160 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 149 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 149 | 100.0%

Letter Number
Value | Count | Frequency (%)
(missing) | 9 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 65649 | 91.4%
Common | 5122 | 7.1%
Latin | 1025 | 1.4%

Most frequent character per script

Hangul
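Value | Count | Frequency (%)
(missing) | 2168 | 3.3%
(missing) | 2167 | 3.3%
(missing) | 1967 | 3.0%
(missing) | 1879 | 2.9%
(missing) | 1878 | 2.9%
(missing) | 1629 | 2.5%
(missing) | 1567 | 2.4%
(missing) | 1552 | 2.4%
(missing) | 1484 | 2.3%
(missing) | 1416 | 2.2%
Other values (375) | 47942 | 73.0%
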
Latin
Value | Count | Frequency (%)
e | 193 | 18.8%
S | 106 | 10.3%
K | 90 | 8.8%
C | 74 | 7.2%
L | 49 | 4.8%
H | 43 | 4.2%
M | 42 | 4.1%
D | 42 | 4.1%
I | 39 | 3.8%
l | 38 | 3.7%
Other values (19) | 309 | 30.1%

Common
Value | Count | Frequency (%)
1 | 1242 | 24.2%
2 | 1125 | 22.0%
(space) | 582 | 11.4%
3 | 520 | 10.2%
4 | 270 | 5.3%
5 | 220 | 4.3%
6 | 160 | 3.1%
- | 160 | 3.1%
( | 149 | 2.9%
) | 149 | 2.9%
Other values (6) | 545 | 10.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 65649 | 91.4%
ASCII | 6138 | 8.5%
Number Forms | 9 | < 0.1%

Most frequent character per block

Hangul
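Value | Count | Frequency (%)
(missing) | 2168 | 3.3%
(missing) | 2167 | 3.3%
(missing) | 1967 | 3.0%
(missing) | 1879 | 2.9%
(missing) | 1878 | 2.9%
(missing) | 1629 | 2.5%
(missing) | 1567 | 2.4%
(missing) | 1552 | 2.4%
(missing) | 1484 | 2.3%
(missing) | 1416 | 2.2%
Other values (375) | 47942 | 73.0%
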
ASCII
Value | Count | Frequency (%)
1 | 1242 | 20.2%
2 | 1125 | 18.3%
(space) | 582 | 9.5%
3 | 520 | 8.5%
4 | 270 | 4.4%
5 | 220 | 3.6%
e | 193 | 3.1%
6 | 160 | 2.6%
- | 160 | 2.6%
( | 149 | 2.4%
Other values (34) | 1517 | 24.7%

Number Forms
Value | Count | Frequency (%)
(missing) | 9 | 100.0%

아파트코드
Text

Distinct: 2115
Distinct (%): 21.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 112
Unique (%): 1.1%

Sample

1st row: A15701002
2nd row: A12271402
3rd row: A13185502
4th row: A13002002
5th row: A13983707

Value | Count | Frequency (%)
a13790726 | 13 | 0.1%
a13385802 | 12 | 0.1%
a13676702 | 12 | 0.1%
a41279920 | 12 | 0.1%
a13671209 | 12 | 0.1%
a12179504 | 12 | 0.1%
a13820006 | 11 | 0.1%
a13922110 | 11 | 0.1%
a15807705 | 11 | 0.1%
a15205513 | 11 | 0.1%
Other values (2105) | 9883 | 98.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 18223 | 20.2%
1 | 17694 | 19.7%
A | 9993 | 11.1%
3 | 9033 | 10.0%
2 | 8186 | 9.1%
5 | 6186 | 6.9%
8 | 5790 | 6.4%
7 | 4809 | 5.3%
4 | 3677 | 4.1%
6 | 3377 | 3.8%
Other values (2) | 3032 | 3.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18223 | 22.8%
1 | 17694 | 22.1%
3 | 9033 | 11.3%
2 | 8186 | 10.2%
5 | 6186 | 7.7%
8 | 5790 | 7.2%
7 | 4809 | 6.0%
4 | 3677 | 4.6%
6 | 3377 | 4.2%
9 | 3025 | 3.8%

Uppercase Letter
Value | Count | Frequency (%)
A | 9993 | 99.9%
B | 7 | 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 88.9%
Latin | 10000 | 11.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18223 | 22.8%
1 | 17694 | 22.1%
3 | 9033 | 11.3%
2 | 8186 | 10.2%
5 | 6186 | 7.7%
8 | 5790 | 7.2%
7 | 4809 | 6.0%
4 | 3677 | 4.6%
6 | 3377 | 4.2%
9 | 3025 | 3.8%

Latin
Value | Count | Frequency (%)
A | 9993 | 99.9%
B | 7 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18223 | 20.2%
1 | 17694 | 19.7%
A | 9993 | 11.1%
3 | 9033 | 10.0%
2 | 8186 | 9.1%
5 | 6186 | 6.9%
8 | 5790 | 6.4%
7 | 4809 | 5.3%
4 | 3677 | 4.1%
6 | 3377 | 3.8%
Other values (2) | 3032 | 3.4%

비용명
Text

Distinct: 77
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 9
Mean length: 5.9704
Min length: 2

Characters and Unicode

Total characters: 59704
Distinct characters: 107
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 미수관리비예치금
2nd row: 단기차입금
3rd row: 미지급금
4th row: 연차수당충당부채
5th row: 미부과관리비

Value | Count | Frequency (%)
관리비미수금 | 330 | 3.3%
가수금 | 330 | 3.3%
예수금 | 319 | 3.2%
선급비용 | 317 | 3.2%
퇴직급여충당부채 | 316 | 3.2%
연차수당충당부채 | 308 | 3.1%
미처분이익잉여금 | 307 | 3.1%
예금 | 307 | 3.1%
현금 | 294 | 2.9%
당기순이익 | 294 | 2.9%
Other values (67) | 6878 | 68.8%

Most occurring characters

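Value | Count | Frequency (%)
(missing) | 4724 | 7.9%
(missing) | 3767 | 6.3%
(missing) | 3199 | 5.4%
(missing) | 3091 | 5.2%
(missing) | 2992 | 5.0%
(missing) | 2973 | 5.0%
(missing) | 2663 | 4.5%
(missing) | 2321 | 3.9%
(missing) | 1879 | 3.1%
(missing) | 1753 | 2.9%
Other values (97) | 30342 | 50.8%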

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 59704 | 100.0%

Most frequent character per category

Other Letter
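Value | Count | Frequency (%)
(missing) | 4724 | 7.9%
(missing) | 3767 | 6.3%
(missing) | 3199 | 5.4%
(missing) | 3091 | 5.2%
(missing) | 2992 | 5.0%
(missing) | 2973 | 5.0%
(missing) | 2663 | 4.5%
(missing) | 2321 | 3.9%
(missing) | 1879 | 3.1%
(missing) | 1753 | 2.9%
Other values (97) | 30342 | 50.8%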

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 59704 | 100.0%

Most frequent character per script

Hangul
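Value | Count | Frequency (%)
(missing) | 4724 | 7.9%
(missing) | 3767 | 6.3%
(missing) | 3199 | 5.4%
(missing) | 3091 | 5.2%
(missing) | 2992 | 5.0%
(missing) | 2973 | 5.0%
(missing) | 2663 | 4.5%
(missing) | 2321 | 3.9%
(missing) | 1879 | 3.1%
(missing) | 1753 | 2.9%
Other values (97) | 30342 | 50.8%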

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 59704 | 100.0%

Most frequent character per block

Hangul
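Value | Count | Frequency (%)
(missing) | 4724 | 7.9%
(missing) | 3767 | 6.3%
(missing) | 3199 | 5.4%
(missing) | 3091 | 5.2%
(missing) | 2992 | 5.0%
(missing) | 2973 | 5.0%
(missing) | 2663 | 4.5%
(missing) | 2321 | 3.9%
(missing) | 1879 | 3.1%
(missing) | 1753 | 2.9%
Other values (97) | 30342 | 50.8%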

년월일
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

201910 | 10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 201910
2nd row: 201910
3rd row: 201910
4th row: 201910
5th row: 201910

Common Values

Value | Count | Frequency (%)
201910 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
201910 | 10000 | 100.0%

금액
Real number (ℝ)

ZEROS 

Distinct: 7471
Distinct (%): 74.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 69850424
Minimum: -6.3136859 × 10^8
Maximum: 1.3613033 × 10^10
Zeros: 2184
Zeros (%): 21.8%
Negative: 342
Negative (%): 3.4%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -6.3136859 × 10^8
5-th percentile: 0
Q1: 0
Median: 3555366
Q3: 34578065
95-th percentile: 3.0648018 × 10^8
Maximum: 1.3613033 × 10^10
Range: 1.4244402 × 10^10
Interquartile range (IQR): 34578065

Descriptive statistics

Standard deviation: 3.297648 × 10^8
Coefficient of variation (CV): 4.7210135
Kurtosis: 527.9957
Mean: 69850424
Median Absolute Deviation (MAD): 3555366
Skewness: 18.604884
Sum: 6.9850424 × 10^11
Variance: 1.0874482 × 10^17
Monotonicity: Not monotonic
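
Several of the figures for 금액 are tied together by standard definitions: CV = standard deviation / mean, variance = standard deviation squared, sum = mean × number of observations, range = maximum − minimum, and IQR = Q3 − Q1. The sketch below re-derives them from the reported values as a consistency check. It takes the exact minimum and maximum from the extreme-value tables and works with rounded inputs, so it compares with a tolerance rather than for equality.

```python
import math

# Reported values for 금액 (rounded as shown in the report).
n = 10_000
mean = 69_850_424
std = 3.297648e8
minimum = -631_368_591
maximum = 13_613_033_099
q1, q3 = 0, 34_578_065

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
total = mean * n                 # sum of all values
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range

print(f"CV={cv:.7f} variance={variance:.7e} sum={total:.7e}")
print(f"range={value_range:.7e} IQR={iqr}")

# Compare against the reported figures with a small relative tolerance.
assert math.isclose(cv, 4.7210135, rel_tol=1e-6)
assert math.isclose(variance, 1.0874482e17, rel_tol=1e-6)
```

A CV well above 1 together with the large positive skewness suggests a heavily right-skewed distribution, which is consistent with the median sitting far below the mean.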
Histogram with fixed size bins (bins=50)
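
A histogram with fixed size bins splits the span between the minimum and the maximum into equally wide intervals and counts the values in each. The sketch below is a plain-Python illustration of that idea, assuming the bins span exactly min..max; it is not the plotting code used to produce the report, and the function name is made up.

```python
def fixed_width_histogram(values, bins=50):
    """Return (bin_width, counts) for equal-width bins spanning min..max."""
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("all values are identical; bin width would be zero")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum lands exactly on the upper edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return width, counts


# Small hypothetical input: ten integers in five bins.
width, counts = fixed_width_histogram(list(range(10)), bins=5)
print(width, counts)

# Bin width implied by the reported minimum and maximum of 금액 with bins=50.
reported_width = (13_613_033_099 - (-631_368_591)) / 50
print(reported_width)
```

With the reported range, each of the 50 bins would be roughly 2.85 × 10^8 wide under this assumption, so most observations would fall into the first few bins.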

Value | Count | Frequency (%)
0 | 2184 | 21.8%
500000 | 27 | 0.3%
250000 | 21 | 0.2%
484000 | 14 | 0.1%
250400 | 13 | 0.1%
20000000 | 13 | 0.1%
200000 | 13 | 0.1%
300000 | 12 | 0.1%
30000000 | 11 | 0.1%
242000 | 10 | 0.1%
Other values (7461) | 7682 | 76.8%

Value | Count | Frequency (%)
-631368591 | 1 | < 0.1%
-265628480 | 1 | < 0.1%
-243231024 | 1 | < 0.1%
-238861320 | 1 | < 0.1%
-227163515 | 1 | < 0.1%
-222480180 | 1 | < 0.1%
-204350320 | 1 | < 0.1%
-169714594 | 1 | < 0.1%
-151940030 | 1 | < 0.1%
-132654576 | 1 | < 0.1%

Value | Count | Frequency (%)
13613033099 | 1 | < 0.1%
9822153902 | 1 | < 0.1%
8736155155 | 1 | < 0.1%
7943651403 | 1 | < 0.1%
7402618872 | 1 | < 0.1%
7251883796 | 1 | < 0.1%
6250746161 | 1 | < 0.1%
5327680530 | 1 | < 0.1%
4252955297 | 1 | < 0.1%
4203099737 | 1 | < 0.1%

Interactions


Correlations

 | 비용명 | 금액
비용명 | 1.000 | 0.379
금액 | 0.379 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index) | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
58305 | 우장산에스케이뷰 | A15701002 | 미수관리비예치금 | 201910 | 0
11127 | 갈현베르빌주상복합아파트 | A12271402 | 단기차입금 | 201910 | 0
15665 | 묵동신안2차 | A13185502 | 미지급금 | 201910 | 13486073
11858 | 래미안아름숲 | A13002002 | 연차수당충당부채 | 201910 | 9506690
40722 | 상계현대1차 | A13983707 | 미부과관리비 | 201910 | 53455180
39988 | 불암현대 | A13981208 | 가수금 | 201910 | 11291965
14665 | 상봉프레미어스엠코 | A13122002 | 수선유지비충당부채 | 201910 | 27528990
57752 | 대방2차현대 | A15681104 | 선수관리비 | 201910 | 45676800
26153 | 도곡경남 | A13527008 | 공동주택적립금예금 | 201910 | 0
30174 | 종암선경아파트 | A13671203 | 장기수선충당예금 | 201910 | 364865373

(index) | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
16648 | 도봉동아에코빌 | A13201206 | 미부과관리비 | 201910 | 91482897
27446 | 도곡대림 | A13586101 | 공동주택적립금예금 | 201910 | 18407731
28057 | 개포한신 | A13594402 | 장기수선충당부채 | 201910 | 954549991
46788 | 영등포푸르지오 | A15003002 | 경비비충당부채 | 201910 | 28231212
18751 | 쌍문삼익 | A13286304 | 미처분이익잉여금 | 201910 | 0
60708 | 가양4단지 | A15780705 | 단기차입금 | 201910 | 0
25428 | 강남엘에이치1단지 | A13519007 | 기타유형자산감가상각누계액 | 201910 | -15134194
11728 | 역촌센트레빌 | A12289501 | 저장품 | 201910 | 280000
2926 | 위례2차아이파크아파트 | A10027553 | 공동주택적립금 | 201910 | 4148229
48990 | 대림신동아 | A15081606 | 수선유지비충당부채 | 201910 | 4031029