Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
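The average record size is the total memory footprint divided by the number of observations. The following is a minimal arithmetic check, assuming 1 KiB = 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Values copied from the dataset statistics in this report.
total_kib = 488.3
n_rows = 10_000

# Convert KiB to bytes (assuming 1 KiB = 1024 B), then divide by the row count.
avg_record_bytes = total_kib * 1024 / n_rows
print(f"{avg_record_bytes:.1f} B")
```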

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15821/S/1/datasetView.do

Alerts

년월일 (date) has constant value "202211" [Constant]
금액 (amount) has 1792 (17.9%) zeros [Zeros]
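Both alerts can be reproduced with simple counting: a column is constant when it holds exactly one distinct value, and the zeros share is the count of zero cells divided by the number of rows (1792 / 10000 = 17.9%). The sketch below is an illustration under those assumptions, not ydata-profiling's own implementation; the function name `alerts` is hypothetical, and the library may apply thresholds that are not shown in this report.

```python
from collections import Counter

def alerts(column_name, values):
    """Return alert strings for a constant column and for zero values."""
    found = []
    counts = Counter(values)
    # Exactly one distinct value means the column is constant.
    if len(counts) == 1:
        found.append(f'{column_name} has constant value "{values[0]}"')
    # Only numeric zeros match here; the string "0" does not equal 0.
    zeros = sum(1 for v in values if v == 0)
    if zeros:
        share = 100 * zeros / len(values)
        found.append(f"{column_name} has {zeros} ({share:.1f}%) zeros")
    return found

print(alerts("년월일", ["202211"] * 4))
print(alerts("금액", [0, 988350, 0, 73000]))
```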

Reproduction

Analysis started: 2024-05-11 06:50:19.940255
Analysis finished: 2024-05-11 06:50:22.344491
Duration: 2.4 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명
Text

Distinct: 2114
Distinct (%): 21.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 21
Mean length: 7.3615
Min length: 2

Characters and Unicode

Total characters: 73615
Distinct characters: 430
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
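As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point (`Lo` = Other Letter, `Nd` = Decimal Number, `Lu` = Uppercase Letter, and so on). The sketch below counts categories over a few apartment names taken from this report. The helper name `category_counts` is hypothetical, and script and block lookups are not available in the standard library, so only categories are shown.

```python
import unicodedata
from collections import Counter

def category_counts(strings):
    """Count Unicode general categories over all characters in the strings."""
    counts = Counter()
    for s in strings:
        # Normalize to NFC so each Hangul syllable is a single code point.
        for ch in unicodedata.normalize("NFC", s):
            counts[unicodedata.category(ch)] += 1
    return counts

sample = ["위례포레샤인15단지", "창동주공1단지", "DMC휴먼빌"]
counts = category_counts(sample)
print(counts)
```

Hangul syllables fall under `Lo`, which is why "Other Letter" dominates the category tables for the Korean text columns.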

Unique

Unique: 94
Unique (%): 0.9%

Sample

1st row: 위례포레샤인15단지
2nd row: 제기이수브라운스톤
3rd row: 창동주공1단지
4th row: 창동주공18단지
5th row: 마포강변힐스테이트
Value | Count | Frequency (%)
아파트 | 208 | 1.9%
래미안 | 45 | 0.4%
아이파크 | 27 | 0.2%
송파 | 24 | 0.2%
e편한세상 | 23 | 0.2%
고덕 | 21 | 0.2%
신반포 | 20 | 0.2%
푸르지오 | 18 | 0.2%
sk뷰 | 17 | 0.2%
북한산 | 17 | 0.2%
Other values (2195) | 10535 | 96.2%

Most occurring characters

ValueCountFrequency (%)
2674
 
3.6%
2602
 
3.5%
2410
 
3.3%
1785
 
2.4%
1666
 
2.3%
1539
 
2.1%
1480
 
2.0%
1444
 
2.0%
1420
 
1.9%
1246
 
1.7%
Other values (420) 55349
75.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 67517 | 91.7%
Decimal Number | 3312 | 4.5%
Space Separator | 1066 | 1.4%
Uppercase Letter | 830 | 1.1%
Lowercase Letter | 326 | 0.4%
Close Punctuation | 164 | 0.2%
Open Punctuation | 164 | 0.2%
Dash Punctuation | 121 | 0.2%
Other Punctuation | 108 | 0.1%
Letter Number | 7 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2674
 
4.0%
2602
 
3.9%
2410
 
3.6%
1785
 
2.6%
1666
 
2.5%
1539
 
2.3%
1480
 
2.2%
1444
 
2.1%
1420
 
2.1%
1246
 
1.8%
Other values (375) 49251
72.9%
Uppercase Letter
Value | Count | Frequency (%)
S | 138 | 16.6%
C | 129 | 15.5%
K | 93 | 11.2%
M | 84 | 10.1%
D | 84 | 10.1%
L | 60 | 7.2%
H | 45 | 5.4%
E | 43 | 5.2%
G | 32 | 3.9%
I | 32 | 3.9%
Other values (7) | 90 | 10.8%
Lowercase Letter
Value | Count | Frequency (%)
e | 184 | 56.4%
l | 30 | 9.2%
i | 28 | 8.6%
k | 18 | 5.5%
v | 16 | 4.9%
s | 14 | 4.3%
c | 12 | 3.7%
a | 9 | 2.8%
g | 9 | 2.8%
w | 4 | 1.2%
Decimal Number
Value | Count | Frequency (%)
1 | 1002 | 30.3%
2 | 971 | 29.3%
3 | 422 | 12.7%
4 | 228 | 6.9%
5 | 202 | 6.1%
6 | 150 | 4.5%
7 | 118 | 3.6%
9 | 84 | 2.5%
8 | 81 | 2.4%
0 | 54 | 1.6%
Other Punctuation
Value | Count | Frequency (%)
, | 84 | 77.8%
. | 24 | 22.2%
Space Separator
ValueCountFrequency (%)
1066
100.0%
Close Punctuation
ValueCountFrequency (%)
) 164
100.0%
Open Punctuation
ValueCountFrequency (%)
( 164
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 121
100.0%
Letter Number
ValueCountFrequency (%)
7
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 67517 | 91.7%
Common | 4935 | 6.7%
Latin | 1163 | 1.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2674
 
4.0%
2602
 
3.9%
2410
 
3.6%
1785
 
2.6%
1666
 
2.5%
1539
 
2.3%
1480
 
2.2%
1444
 
2.1%
1420
 
2.1%
1246
 
1.8%
Other values (375) 49251
72.9%
Latin
ValueCountFrequency (%)
e 184
15.8%
S 138
11.9%
C 129
11.1%
K 93
 
8.0%
M 84
 
7.2%
D 84
 
7.2%
L 60
 
5.2%
H 45
 
3.9%
E 43
 
3.7%
G 32
 
2.8%
Other values (19) 271
23.3%
Common
ValueCountFrequency (%)
1066
21.6%
1 1002
20.3%
2 971
19.7%
3 422
 
8.6%
4 228
 
4.6%
5 202
 
4.1%
) 164
 
3.3%
( 164
 
3.3%
6 150
 
3.0%
- 121
 
2.5%
Other values (6) 445
9.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 67517 | 91.7%
ASCII | 6091 | 8.3%
Number Forms | 7 | < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2674
 
4.0%
2602
 
3.9%
2410
 
3.6%
1785
 
2.6%
1666
 
2.5%
1539
 
2.3%
1480
 
2.2%
1444
 
2.1%
1420
 
2.1%
1246
 
1.8%
Other values (375) 49251
72.9%
ASCII
ValueCountFrequency (%)
1066
17.5%
1 1002
16.5%
2 971
15.9%
3 422
 
6.9%
4 228
 
3.7%
5 202
 
3.3%
e 184
 
3.0%
) 164
 
2.7%
( 164
 
2.7%
6 150
 
2.5%
Other values (34) 1538
25.3%
Number Forms
ValueCountFrequency (%)
7
100.0%
아파트코드
Text

Distinct: 2118
Distinct (%): 21.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 94
Unique (%): 0.9%

Sample

1st row: A10024197
2nd row: A13006003
3rd row: A13290809
4th row: A13290105
5th row: A12112002
Value | Count | Frequency (%)
a13285404 | 16 | 0.2%
a13113003 | 14 | 0.1%
a14021001 | 13 | 0.1%
a14387605 | 12 | 0.1%
a13880603 | 11 | 0.1%
a13986703 | 11 | 0.1%
a13876108 | 11 | 0.1%
a10025024 | 11 | 0.1%
a13872504 | 11 | 0.1%
a13187305 | 11 | 0.1%
Other values (2108) | 9879 | 98.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 18930 | 21.0%
1 | 17570 | 19.5%
A | 10000 | 11.1%
3 | 9016 | 10.0%
2 | 8548 | 9.5%
5 | 5803 | 6.4%
8 | 5273 | 5.9%
7 | 4418 | 4.9%
4 | 4087 | 4.5%
6 | 3517 | 3.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 18930
23.7%
1 17570
22.0%
3 9016
11.3%
2 8548
10.7%
5 5803
 
7.3%
8 5273
 
6.6%
7 4418
 
5.5%
4 4087
 
5.1%
6 3517
 
4.4%
9 2838
 
3.5%
Uppercase Letter
ValueCountFrequency (%)
A 10000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 80000
88.9%
Latin 10000
 
11.1%

Most frequent character per script

Common
ValueCountFrequency (%)
0 18930
23.7%
1 17570
22.0%
3 9016
11.3%
2 8548
10.7%
5 5803
 
7.3%
8 5273
 
6.6%
7 4418
 
5.5%
4 4087
 
5.1%
6 3517
 
4.4%
9 2838
 
3.5%
Latin
ValueCountFrequency (%)
A 10000
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 90000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 18930
21.0%
1 17570
19.5%
A 10000
11.1%
3 9016
10.0%
2 8548
9.5%
5 5803
 
6.4%
8 5273
 
5.9%
7 4418
 
4.9%
4 4087
 
4.5%
6 3517
 
3.9%
비용명
Text

Distinct: 87
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 10
Median length: 9
Mean length: 4.8787
Min length: 2

Characters and Unicode

Total characters: 48787
Distinct characters: 120
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

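Unique: 3
Unique (%): < 0.1%

Sample

1st row: 승강기유지비 (elevator maintenance expenses)
2nd row: 교육비 (training expenses)
3rd row: 재활용품비용 (recycling expenses)
4th row: 기타운영비용 (other operating expenses)
5th row: 이자수익 (interest income)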
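Value | Count | Frequency (%)
승강기유지비 (elevator maintenance expenses) | 229 | 2.3%
세대전기료 (household electricity charges) | 220 | 2.2%
경비비 (security expenses) | 219 | 2.2%
수선유지비 (repair and maintenance expenses) | 218 | 2.2%
급여 (salaries) | 212 | 2.1%
이자수익 (interest income) | 211 | 2.1%
도서인쇄비 (books and printing expenses) | 209 | 2.1%
보험료 (insurance premiums) | 208 | 2.1%
제수당 (allowances) | 206 | 2.1%
통신비 (communication expenses) | 204 | 2.0%
Other values (77) | 7864 | 78.6%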

Most occurring characters

ValueCountFrequency (%)
5337
 
10.9%
3668
 
7.5%
2076
 
4.3%
2073
 
4.2%
1692
 
3.5%
1318
 
2.7%
1069
 
2.2%
846
 
1.7%
765
 
1.6%
733
 
1.5%
Other values (110) 29210
59.9%

Most occurring categories

ValueCountFrequency (%)
Other Letter 48787
100.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5337
 
10.9%
3668
 
7.5%
2076
 
4.3%
2073
 
4.2%
1692
 
3.5%
1318
 
2.7%
1069
 
2.2%
846
 
1.7%
765
 
1.6%
733
 
1.5%
Other values (110) 29210
59.9%

Most occurring scripts

ValueCountFrequency (%)
Hangul 48787
100.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5337
 
10.9%
3668
 
7.5%
2076
 
4.3%
2073
 
4.2%
1692
 
3.5%
1318
 
2.7%
1069
 
2.2%
846
 
1.7%
765
 
1.6%
733
 
1.5%
Other values (110) 29210
59.9%

Most occurring blocks

ValueCountFrequency (%)
Hangul 48787
100.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
5337
 
10.9%
3668
 
7.5%
2076
 
4.3%
2073
 
4.2%
1692
 
3.5%
1318
 
2.7%
1069
 
2.2%
846
 
1.7%
765
 
1.6%
733
 
1.5%
Other values (110) 29210
59.9%

년월일
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
202211 | 10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202211
2nd row: 202211
3rd row: 202211
4th row: 202211
5th row: 202211

Common Values

Value | Count | Frequency (%)
202211 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
202211 | 10000 | 100.0%

금액
Real number (ℝ)

ZEROS 

Distinct: 6655
Distinct (%): 66.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3559721.1
Minimum: -3452100
Maximum: 6.5471416 × 10^8
Zeros: 1792
Zeros (%): 17.9%
Negative: 15
Negative (%): 0.1%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -3452100
5-th percentile: 0
Q1: 39092.5
Median: 270560
Q3: 1317922.5
95-th percentile: 17596809
Maximum: 6.5471416 × 10^8
Range: 6.5816626 × 10^8
Interquartile range (IQR): 1278830

Descriptive statistics

Standard deviation: 14581346
Coefficient of variation (CV): 4.0962046
Kurtosis: 549.9505
Mean: 3559721.1
Median Absolute Deviation (MAD): 270560
Skewness: 17.447669
Sum: 3.5597211 × 10^10
Variance: 2.1261565 × 10^14
Monotonicity: Not monotonic
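Several of these statistics are derived from others, so they can be cross-checked with plain arithmetic: IQR = Q3 - Q1, range = maximum - minimum, CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. The sketch below uses the values printed in this report (the exact maximum, 654714155, comes from the table of largest values); the variable names are illustrative, and small differences are expected because the report rounds its figures.

```python
# Values copied from the 금액 statistics in this report.
mean = 3559721.1
std = 14581346
q1, q3 = 39092.5, 1317922.5
minimum, maximum = -3452100, 654714155
n = 10_000

iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance from the standard deviation
total = mean * n                 # sum from the mean and the row count

print(iqr, value_range, round(cv, 4))
print(f"{variance:.4e}", f"{total:.4e}")
```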
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1792 | 17.9%
200000 | 93 | 0.9%
100000 | 58 | 0.6%
300000 | 52 | 0.5%
150000 | 51 | 0.5%
400000 | 33 | 0.3%
50000 | 31 | 0.3%
110000 | 29 | 0.3%
60000 | 29 | 0.3%
30000 | 28 | 0.3%
Other values (6645) | 7804 | 78.0%
Minimum 10 values
Value | Count | Frequency (%)
-3452100 | 1 | < 0.1%
-1562730 | 1 | < 0.1%
-1352040 | 1 | < 0.1%
-950720 | 1 | < 0.1%
-596000 | 1 | < 0.1%
-459920 | 1 | < 0.1%
-393310 | 1 | < 0.1%
-219070 | 1 | < 0.1%
-218810 | 1 | < 0.1%
-145530 | 1 | < 0.1%
Maximum 10 values
Value | Count | Frequency (%)
654714155 | 1 | < 0.1%
421708913 | 1 | < 0.1%
317306269 | 1 | < 0.1%
311240320 | 1 | < 0.1%
290540500 | 1 | < 0.1%
221496235 | 1 | < 0.1%
201124650 | 1 | < 0.1%
194510363 | 1 | < 0.1%
184462902 | 1 | < 0.1%
174721259 | 1 | < 0.1%

Interactions


Correlations

 | 비용명 | 금액
비용명 | 1.000 | 0.307
금액 | 0.307 | 1.000
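The report does not state which coefficient produced the 0.307 between 비용명 and 금액. One measure commonly used for a categorical column paired with a binned numeric column is Cramér's V, and the sketch below assumes that measure purely for illustration. The function name `cramers_v` is hypothetical, no bias correction is applied, and a numeric column such as 금액 would first have to be discretized into bins, a step not shown here.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long label sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row = Counter(xs)
    col = Counter(ys)
    k = min(len(row), len(col)) - 1
    if n == 0 or k == 0:
        # V is undefined when a column is constant; 0.0 is a convention here.
        return 0.0
    chi2 = 0.0
    for r, r_count in row.items():
        for c, c_count in col.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))

# Toy labels: the expense name fully determines the amount bin.
names = ["급여", "급여", "식대", "식대"]
bins = ["high", "high", "low", "low"]
print(cramers_v(names, bins))
```

A value near 0 indicates little association and a value near 1 a strong one, so 0.307 would indicate a moderate association under this assumption.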

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
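The counts behind such a nullity view can be sketched with a per-column tally of missing cells. The example below assumes rows are held as dictionaries and treats `None` and the empty string as missing; both choices are assumptions for illustration, and the function name `missing_counts` is hypothetical. The rows are taken from the sample of this report, which has no missing cells, so every count is zero.

```python
def missing_counts(rows, columns):
    """Count cells that are None or an empty string, per column."""
    return {
        c: sum(1 for r in rows if r.get(c) in (None, ""))
        for c in columns
    }

columns = ["아파트명", "아파트코드", "비용명", "년월일", "금액"]
rows = [
    {"아파트명": "평창롯데", "아파트코드": "A11084601", "비용명": "식대",
     "년월일": "202211", "금액": 100000},
    {"아파트명": "고척한일유앤아이", "아파트코드": "A15208204", "비용명": "피복비",
     "년월일": "202211", "금액": 0},
]
# A zero amount is a valid value and is not counted as missing.
print(missing_counts(rows, columns))
```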

Sample

First rows
Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
1701 | 위례포레샤인15단지 | A10024197 | 승강기유지비 | 202211 | 988350
25386 | 제기이수브라운스톤 | A13006003 | 교육비 | 202211 | 73000
35468 | 창동주공1단지 | A13290809 | 재활용품비용 | 202211 | 293818
35346 | 창동주공18단지 | A13290105 | 기타운영비용 | 202211 | 0
18885 | 마포강변힐스테이트 | A12112002 | 이자수익 | 202211 | 1171217
48278 | 압구정한양3단지 | A13590602 | 공동전기료 | 202211 | 11708327
3421 | 위례포레샤인18단지 | A10024577 | 주차장수익 | 202211 | 2468340
85706 | 고척한일유앤아이 | A15208204 | 퇴직급여 | 202211 | 1346530
16782 | DMC휴먼빌 | A12013001 | 수도광열비 | 202211 | 39110
93379 | 사당동작삼성래미안아파트 | A15609306 | 수선유지비 | 202211 | 12418540

Last rows
Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
25647 | 휘경동양1.2차 | A13009001 | 사무용품비 | 202211 | 107000
15422 | 평창롯데 | A11084601 | 식대 | 202211 | 100000
85713 | 고척한일유앤아이 | A15208204 | 피복비 | 202211 | 0
94456 | 대방현대1차 | A15681106 | 교육비 | 202211 | 40000
7013 | 백련산파크자이아파트 | A10025683 | 복리후생비 | 202211 | 1124020
63965 | 상계1차중앙하이츠 | A13920207 | 입주자대표회의운영비 | 202211 | 845000
22676 | 북한산수자인 | A12204001 | 검침수익 | 202211 | 66650
8795 | 삼성동센트럴아이파크 | A10026350 | 세대전기료 | 202211 | 34978000
39075 | 성수청구강변 | A13383003 | 급여 | 202211 | 9187380
55099 | 서초현대아파트 | A13772801 | 세대수도료 | 202211 | 4668130