Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
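
The average record size can be cross-checked from the two figures above it: total size in memory divided by the number of observations. The following is a minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes; the variable names are illustrative and not taken from the report.

```python
# Figures copied from the dataset statistics above.
total_size_kib = 488.3
n_observations = 10_000

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{avg_record_bytes:.1f} B")
```

The result rounds to the 50.0 B shown in the report.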

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15821/S/1/datasetView.do

Alerts

년월일 has constant value "202212" (Constant)
금액 has 1637 (16.4%) zeros (Zeros)

Reproduction

Analysis started: 2024-05-11 06:50:49.453811
Analysis finished: 2024-05-11 06:50:53.012489
Duration: 3.56 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명
Text

Distinct: 2097
Distinct (%): 21.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 21
Mean length: 7.3702
Min length: 2

Characters and Unicode

Total characters: 73702
Distinct characters: 430
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
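
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not how the report itself was produced; the helper name `category_counts` is hypothetical, and the two input strings are names taken from the sample rows of this report. The two-letter codes map to the labels used in the tables: `Lo` is Other Letter, `Lu` is Uppercase Letter, `Nd` is Decimal Number, and `Zs` is Space Separator. The standard library does not expose script or block properties, so only categories are shown.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Two apartment names taken from the sample rows of the report.
names = ["당산효성1차", "테헤란 IPARK"]
counts = category_counts(names)

for category, count in counts.most_common():
    print(category, count)
```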

Unique

Unique: 115
Unique (%): 1.1%

Sample

1st row: 압구정현대아파트
2nd row: 구로우성
3rd row: 역삼래미안팬타빌
4th row: 당산효성1차
5th row: 상도쌍용

Value  Count  Frequency (%)
아파트  226  2.1%
래미안  54  0.5%
아이파크  27  0.2%
e편한세상  24  0.2%
sk뷰  22  0.2%
푸르지오  21  0.2%
보라매  19  0.2%
북한산  19  0.2%
고덕  19  0.2%
신길삼두  18  0.2%
Other values (2181)  10556  95.9%
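
The lowercase entries (for example `sk뷰`) and the total count above 10000 suggest that this table counts whitespace-separated, lowercased words rather than whole values. The sketch below reproduces that kind of word-frequency table under exactly that assumption; the profiler's real tokenisation may differ, the helper name `word_frequencies` is hypothetical, and the short input list is illustrative (it repeats one sample name on purpose).

```python
from collections import Counter


def word_frequencies(values):
    """Lowercase each value, split on whitespace, and count the tokens."""
    counts = Counter()
    for value in values:
        counts.update(value.lower().split())
    return counts


# Illustrative input built from names that appear in the sample rows.
names = ["테헤란 IPARK", "구로우성", "테헤란 IPARK"]
freq = word_frequencies(names)
total = sum(freq.values())

for word, count in freq.most_common():
    print(word, count, f"{count / total:.1%}")
```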

Most occurring characters

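Value  Count  Frequency (%)
(character lost)  2778  3.8%
(character lost)  2717  3.7%
(character lost)  2508  3.4%
(character lost)  1706  2.3%
(character lost)  1590  2.2%
(character lost)  1533  2.1%
(character lost)  1515  2.1%
(character lost)  1455  2.0%
(character lost)  1327  1.8%
(character lost)  1200  1.6%
Other values (420)  55373  75.1%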

Most occurring categories

Value  Count  Frequency (%)
Other Letter  67485  91.6%
Decimal Number  3293  4.5%
Space Separator  1112  1.5%
Uppercase Letter  895  1.2%
Lowercase Letter  369  0.5%
Open Punctuation  156  0.2%
Close Punctuation  156  0.2%
Dash Punctuation  116  0.2%
Other Punctuation  116  0.2%
Letter Number  4  < 0.1%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
(character lost)  2778  4.1%
(character lost)  2717  4.0%
(character lost)  2508  3.7%
(character lost)  1706  2.5%
(character lost)  1590  2.4%
(character lost)  1533  2.3%
(character lost)  1515  2.2%
(character lost)  1455  2.2%
(character lost)  1327  2.0%
(character lost)  1200  1.8%
Other values (375)  49156  72.8%

Uppercase Letter
Value  Count  Frequency (%)
C  144  16.1%
S  134  15.0%
K  103  11.5%
M  98  10.9%
D  98  10.9%
L  67  7.5%
H  54  6.0%
E  38  4.2%
I  38  4.2%
G  28  3.1%
Other values (7)  93  10.4%

Lowercase Letter
Value  Count  Frequency (%)
e  188  50.9%
l  50  13.6%
i  35  9.5%
v  27  7.3%
s  21  5.7%
k  16  4.3%
h  9  2.4%
c  8  2.2%
w  5  1.4%
a  5  1.4%

Decimal Number
Value  Count  Frequency (%)
1  999  30.3%
2  962  29.2%
3  472  14.3%
4  197  6.0%
5  181  5.5%
6  139  4.2%
7  101  3.1%
8  94  2.9%
9  83  2.5%
0  65  2.0%

Other Punctuation
Value  Count  Frequency (%)
,  101  87.1%
.  15  12.9%

Space Separator
Value  Count  Frequency (%)
(space)  1112  100.0%

Open Punctuation
Value  Count  Frequency (%)
(  156  100.0%

Close Punctuation
Value  Count  Frequency (%)
)  156  100.0%

Dash Punctuation
Value  Count  Frequency (%)
-  116  100.0%

Letter Number
Value  Count  Frequency (%)
(character lost)  4  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  67485  91.6%
Common  4949  6.7%
Latin  1268  1.7%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
(character lost)  2778  4.1%
(character lost)  2717  4.0%
(character lost)  2508  3.7%
(character lost)  1706  2.5%
(character lost)  1590  2.4%
(character lost)  1533  2.3%
(character lost)  1515  2.2%
(character lost)  1455  2.2%
(character lost)  1327  2.0%
(character lost)  1200  1.8%
Other values (375)  49156  72.8%

Latin
Value  Count  Frequency (%)
e  188  14.8%
C  144  11.4%
S  134  10.6%
K  103  8.1%
M  98  7.7%
D  98  7.7%
L  67  5.3%
H  54  4.3%
l  50  3.9%
E  38  3.0%
Other values (19)  294  23.2%

Common
Value  Count  Frequency (%)
(space)  1112  22.5%
1  999  20.2%
2  962  19.4%
3  472  9.5%
4  197  4.0%
5  181  3.7%
(  156  3.2%
)  156  3.2%
6  139  2.8%
-  116  2.3%
Other values (6)  459  9.3%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  67485  91.6%
ASCII  6213  8.4%
Number Forms  4  < 0.1%

Most frequent character per block

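Hangul
Value  Count  Frequency (%)
(character lost)  2778  4.1%
(character lost)  2717  4.0%
(character lost)  2508  3.7%
(character lost)  1706  2.5%
(character lost)  1590  2.4%
(character lost)  1533  2.3%
(character lost)  1515  2.2%
(character lost)  1455  2.2%
(character lost)  1327  2.0%
(character lost)  1200  1.8%
Other values (375)  49156  72.8%

ASCII
Value  Count  Frequency (%)
(space)  1112  17.9%
1  999  16.1%
2  962  15.5%
3  472  7.6%
4  197  3.2%
e  188  3.0%
5  181  2.9%
(  156  2.5%
)  156  2.5%
C  144  2.3%
Other values (34)  1646  26.5%

Number Forms
Value  Count  Frequency (%)
(character lost)  4  100.0%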

아파트코드
Text

Distinct: 2101
Distinct (%): 21.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 116
Unique (%): 1.2%

Sample

1st row: A13589802
2nd row: A15283809
3rd row: A13508006
4th row: A15004506
5th row: A15683901

Value  Count  Frequency (%)
a15083701  18  0.2%
a15678102  16  0.2%
a13610003  15  0.1%
a13676504  14  0.1%
a10026051  13  0.1%
a13805002  12  0.1%
a15280602  12  0.1%
a13588601  11  0.1%
a15004506  11  0.1%
a10024152  11  0.1%
Other values (2091)  9867  98.7%

Most occurring characters

Value  Count  Frequency (%)
0  19156  21.3%
1  17474  19.4%
A  10000  11.1%
3  9060  10.1%
2  8623  9.6%
5  5867  6.5%
8  5220  5.8%
7  4383  4.9%
4  3951  4.4%
6  3409  3.8%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  80000  88.9%
Uppercase Letter  10000  11.1%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0  19156  23.9%
1  17474  21.8%
3  9060  11.3%
2  8623  10.8%
5  5867  7.3%
8  5220  6.5%
7  4383  5.5%
4  3951  4.9%
6  3409  4.3%
9  2857  3.6%

Uppercase Letter
Value  Count  Frequency (%)
A  10000  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Common  80000  88.9%
Latin  10000  11.1%

Most frequent character per script

Common
Value  Count  Frequency (%)
0  19156  23.9%
1  17474  21.8%
3  9060  11.3%
2  8623  10.8%
5  5867  7.3%
8  5220  6.5%
7  4383  5.5%
4  3951  4.9%
6  3409  4.3%
9  2857  3.6%

Latin
Value  Count  Frequency (%)
A  10000  100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  90000  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0  19156  21.3%
1  17474  19.4%
A  10000  11.1%
3  9060  10.1%
2  8623  9.6%
5  5867  6.5%
8  5220  5.8%
7  4383  4.9%
4  3951  4.4%
6  3409  3.8%

비용명
Text

Distinct: 86
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 10
Median length: 9
Mean length: 4.9444
Min length: 2

Characters and Unicode

Total characters: 49444
Distinct characters: 118
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 광고료수익
2nd row: 재활용품비용
3rd row: 위탁관리수수료
4th row: 회계감사비
5th row: 재활용품수익

Value  Count  Frequency (%)
도서인쇄비  220  2.2%
수선유지비  213  2.1%
청소비  212  2.1%
세대전기료  210  2.1%
이자수익  209  2.1%
승강기유지비  207  2.1%
급여  206  2.1%
경비비  206  2.1%
통신비  206  2.1%
교육비  204  2.0%
Other values (76)  7907  79.1%

Most occurring characters

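Value  Count  Frequency (%)
(character lost)  5398  10.9%
(character lost)  3621  7.3%
(character lost)  2071  4.2%
(character lost)  2059  4.2%
(character lost)  1719  3.5%
(character lost)  1343  2.7%
(character lost)  1036  2.1%
(character lost)  922  1.9%
(character lost)  775  1.6%
(character lost)  760  1.5%
Other values (108)  29740  60.1%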

Most occurring categories

Value  Count  Frequency (%)
Other Letter  49444  100.0%

Most frequent character per category

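Other Letter
Value  Count  Frequency (%)
(character lost)  5398  10.9%
(character lost)  3621  7.3%
(character lost)  2071  4.2%
(character lost)  2059  4.2%
(character lost)  1719  3.5%
(character lost)  1343  2.7%
(character lost)  1036  2.1%
(character lost)  922  1.9%
(character lost)  775  1.6%
(character lost)  760  1.5%
Other values (108)  29740  60.1%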

Most occurring scripts

Value  Count  Frequency (%)
Hangul  49444  100.0%

Most frequent character per script

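Hangul
Value  Count  Frequency (%)
(character lost)  5398  10.9%
(character lost)  3621  7.3%
(character lost)  2071  4.2%
(character lost)  2059  4.2%
(character lost)  1719  3.5%
(character lost)  1343  2.7%
(character lost)  1036  2.1%
(character lost)  922  1.9%
(character lost)  775  1.6%
(character lost)  760  1.5%
Other values (108)  29740  60.1%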

Most occurring blocks

Value  Count  Frequency (%)
Hangul  49444  100.0%

Most frequent character per block

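Hangul
Value  Count  Frequency (%)
(character lost)  5398  10.9%
(character lost)  3621  7.3%
(character lost)  2071  4.2%
(character lost)  2059  4.2%
(character lost)  1719  3.5%
(character lost)  1343  2.7%
(character lost)  1036  2.1%
(character lost)  922  1.9%
(character lost)  775  1.6%
(character lost)  760  1.5%
Other values (108)  29740  60.1%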

년월일
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202212
2nd row: 202212
3rd row: 202212
4th row: 202212
5th row: 202212

Common Values

Value  Count  Frequency (%)
202212  10000  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
202212  10000  100.0%

금액
Real number (ℝ)

ZEROS 

Distinct: 6943
Distinct (%): 69.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4273453.6
Minimum: -15700000
Maximum: 8.764205 × 10^8
Zeros: 1637
Zeros (%): 16.4%
Negative: 11
Negative (%): 0.1%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -15700000
5-th percentile: 0
Q1: 45750
Median: 287200
Q3: 1350515
95-th percentile: 18206528
Maximum: 8.764205 × 10^8
Range: 8.921205 × 10^8
Interquartile range (IQR): 1304765

Descriptive statistics

Standard deviation: 21102425
Coefficient of variation (CV): 4.9380259
Kurtosis: 643.99075
Mean: 4273453.6
Median Absolute Deviation (MAD): 287200
Skewness: 19.925873
Sum: 4.2734536 × 10^10
Variance: 4.4531233 × 10^14
Monotonicity: Not monotonic
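
Several of the 금액 figures are derived from one another, so they can be cross-checked with a few lines of arithmetic. The sketch below assumes the usual definitions (IQR = Q3 - Q1, range = maximum - minimum, CV = standard deviation / mean, variance = standard deviation squared, sum = mean × number of observations). The inputs are copied from the report, with the exact maximum taken from its list of largest values; the variable names are illustrative.

```python
# Figures copied from the 금액 statistics in this report.
n = 10_000
mean = 4273453.6
std = 21102425
q1, q3 = 45750, 1350515
minimum = -15700000
maximum = 876420496  # exact value from the list of largest values

iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance from the rounded standard deviation
total = mean * n                 # sum of all values

print(iqr, value_range)
print(round(cv, 4), f"{variance:.4e}", f"{total:.4e}")
```

Small differences in the last digits are expected because the report rounds the standard deviation and mean before display.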
Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
0  1637  16.4%
200000  85  0.9%
100000  54  0.5%
300000  41  0.4%
150000  39  0.4%
30000  35  0.4%
250000  34  0.3%
50000  28  0.3%
400000  28  0.3%
500000  26  0.3%
Other values (6933)  7993  79.9%

Minimum 10 values

Value  Count  Frequency (%)
-15700000  1  < 0.1%
-13540546  1  < 0.1%
-3200000  1  < 0.1%
-1869173  1  < 0.1%
-1489870  1  < 0.1%
-501790  1  < 0.1%
-61030  1  < 0.1%
-35557  1  < 0.1%
-11350  1  < 0.1%
-3660  1  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
876420496  1  < 0.1%
856850080  1  < 0.1%
447455018  1  < 0.1%
442785490  1  < 0.1%
427513679  1  < 0.1%
306945650  1  < 0.1%
293643040  1  < 0.1%
271935030  1  < 0.1%
268009923  1  < 0.1%
255414470  1  < 0.1%

Interactions


Correlations

Variable  비용명  금액
비용명  1.000  0.486
금액  0.486  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index  아파트명  아파트코드  비용명  년월일  금액
48356  압구정현대아파트  A13589802  광고료수익  202212  1707270
88673  구로우성  A15283809  재활용품비용  202212  80000
43550  역삼래미안팬타빌  A13508006  위탁관리수수료  202212  425000
78962  당산효성1차  A15004506  회계감사비  202212  198000
95334  상도쌍용  A15683901  재활용품수익  202212  0
49343  개포주공5단지  A13599402  입주자대표회의운영비  202212  1344000
62032  가락금호  A13880407  검침비용  202212  300000
89366  중앙구로하이츠  A15286806  국민연금  202212  343610
78523  당산신동아파밀리에  A15004101  수선유지비  202212  617190
47088  삼성동중앙하이츠빌리지  A13550701  장기수선비  202212  10794000

Last rows

Index  아파트명  아파트코드  비용명  년월일  금액
23176  북한산힐스테이트3차  A12204004  복리후생비  202212  425800
18639  공덕한화꿈에그린  A12102002  위탁관리수수료  202212  192500
40064  대우한강베네시티  A13402003  공동난방비  202212  52160
78874  당산2차효성타운  A15004503  재활용품비용  202212  240000
13659  정릉꿈에그린아파트  A10028000  업무추진비  202212  0
52193  장위참누리  A13614302  퇴직급여  202212  1306000
43728  테헤란 IPARK  A13508012  광고료수익  202212  400000
65049  중계주공8단지  A13922111  제수당  202212  2084490
21581  서교동대우미래사랑  A12184201  검침비용  202212  157380
25273  답십리두산위브  A13003003  소모품비  202212  1238980