Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
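
The average record size can be checked from the total memory figure and the number of observations. A minimal sketch in Python, with both figures copied from the statistics above (the variable names are illustrative, not part of the report):

```python
# Figures copied from the dataset statistics above.
total_size_kib = 488.3     # "Total size in memory"
n_observations = 10_000    # "Number of observations"

# 1 KiB = 1024 bytes; dividing by the row count gives bytes per record.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the result agrees with the reported 50.0 B.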

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15821/S/1/datasetView.do

Alerts

년월일 has constant value "202002" (Constant)
금액 has 856 (8.6%) zeros (Zeros)

Reproduction

Analysis started: 2024-05-11 06:57:32.241285
Analysis finished: 2024-05-11 06:57:34.434495
Duration: 2.19 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명
Text

Distinct: 2157
Distinct (%): 21.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 22
Median length: 20
Mean length: 7.2036
Min length: 2

Characters and Unicode

Total characters: 72036
Distinct characters: 431
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
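
As an illustration of how such character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. This is an assumption made for demonstration only; the profiling tool may derive its categories, scripts, and blocks differently. The helper name `category_counts` is hypothetical, and the two input strings are apartment names taken from the sample rows of this report.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories (e.g. 'Lo', 'Nd') over all characters."""
    counts = Counter()
    for value in values:
        # Normalise to NFC so each Hangul syllable is a single code point.
        for ch in unicodedata.normalize("NFC", value):
            counts[unicodedata.category(ch)] += 1
    return counts


# Two apartment names from the sample rows of this report.
names = ["상도2차 두산위브트레지움 아파트", "등촌대림e편한세상"]
counts = category_counts(names)
print(counts)
```

The two-letter codes correspond to the category names used in this report: `Lo` is Other Letter (Hangul syllables fall here), `Nd` is Decimal Number, `Zs` is Space Separator, and `Ll` is Lowercase Letter.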

Unique

Unique: 97
Unique (%): 1.0%

Sample

1st row: 상계수락한신
2nd row: 번동한양
3rd row: 신정현대
4th row: 중동계룡
5th row: 대치롯데캐슬아파트

Value | Count | Frequency (%)
아파트 | 128 | 1.2%
래미안 | 28 | 0.3%
힐스테이트 | 20 | 0.2%
신내 | 18 | 0.2%
북한산 | 17 | 0.2%
신반포 | 17 | 0.2%
입주자대표회의 | 16 | 0.2%
래미안밤섬리베뉴 | 14 | 0.1%
아이파크 | 14 | 0.1%
코오롱하늘채아파트 | 14 | 0.1%
Other values (2220) | 10324 | 97.3%

Most occurring characters

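Value | Count | Frequency (%)
(not captured) | 2435 | 3.4%
(not captured) | 2290 | 3.2%
(not captured) | 2078 | 2.9%
(not captured) | 1829 | 2.5%
(not captured) | 1805 | 2.5%
(not captured) | 1688 | 2.3%
(not captured) | 1510 | 2.1%
(not captured) | 1457 | 2.0%
(not captured) | 1444 | 2.0%
(not captured) | 1302 | 1.8%
Other values (421) | 54198 | 75.2%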

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 66060 | 91.7%
Decimal Number | 3734 | 5.2%
Uppercase Letter | 680 | 0.9%
Space Separator | 662 | 0.9%
Lowercase Letter | 312 | 0.4%
Close Punctuation | 158 | 0.2%
Open Punctuation | 158 | 0.2%
Dash Punctuation | 147 | 0.2%
Other Punctuation | 113 | 0.2%
Letter Number | 9 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not captured) | 2435 | 3.7%
(not captured) | 2290 | 3.5%
(not captured) | 2078 | 3.1%
(not captured) | 1829 | 2.8%
(not captured) | 1805 | 2.7%
(not captured) | 1688 | 2.6%
(not captured) | 1510 | 2.3%
(not captured) | 1457 | 2.2%
(not captured) | 1444 | 2.2%
(not captured) | 1302 | 2.0%
Other values (376) | 48222 | 73.0%

Uppercase Letter
Value | Count | Frequency (%)
S | 108 | 15.9%
C | 98 | 14.4%
K | 89 | 13.1%
D | 61 | 9.0%
M | 61 | 9.0%
L | 43 | 6.3%
H | 38 | 5.6%
E | 34 | 5.0%
I | 33 | 4.9%
A | 25 | 3.7%
Other values (7) | 90 | 13.2%

Decimal Number
Value | Count | Frequency (%)
2 | 1130 | 30.3%
1 | 1082 | 29.0%
3 | 506 | 13.6%
4 | 261 | 7.0%
5 | 211 | 5.7%
6 | 158 | 4.2%
9 | 106 | 2.8%
8 | 100 | 2.7%
7 | 92 | 2.5%
0 | 88 | 2.4%

Lowercase Letter
Value | Count | Frequency (%)
e | 198 | 63.5%
i | 24 | 7.7%
l | 20 | 6.4%
v | 16 | 5.1%
k | 16 | 5.1%
s | 13 | 4.2%
w | 9 | 2.9%
c | 6 | 1.9%
a | 5 | 1.6%
g | 5 | 1.6%

Other Punctuation
Value | Count | Frequency (%)
, | 87 | 77.0%
. | 26 | 23.0%

Space Separator
Value | Count | Frequency (%)
(space) | 662 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 158 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 158 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 147 | 100.0%

Letter Number
Value | Count | Frequency (%)
(not captured) | 9 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 66060 | 91.7%
Common | 4975 | 6.9%
Latin | 1001 | 1.4%

Most frequent character per script

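Hangul
Value | Count | Frequency (%)
(not captured) | 2435 | 3.7%
(not captured) | 2290 | 3.5%
(not captured) | 2078 | 3.1%
(not captured) | 1829 | 2.8%
(not captured) | 1805 | 2.7%
(not captured) | 1688 | 2.6%
(not captured) | 1510 | 2.3%
(not captured) | 1457 | 2.2%
(not captured) | 1444 | 2.2%
(not captured) | 1302 | 2.0%
Other values (376) | 48222 | 73.0%

Latin
Value | Count | Frequency (%)
e | 198 | 19.8%
S | 108 | 10.8%
C | 98 | 9.8%
K | 89 | 8.9%
D | 61 | 6.1%
M | 61 | 6.1%
L | 43 | 4.3%
H | 38 | 3.8%
E | 34 | 3.4%
I | 33 | 3.3%
Other values (18) | 238 | 23.8%

Common
Value | Count | Frequency (%)
2 | 1130 | 22.7%
1 | 1082 | 21.7%
(space) | 662 | 13.3%
3 | 506 | 10.2%
4 | 261 | 5.2%
5 | 211 | 4.2%
6 | 158 | 3.2%
) | 158 | 3.2%
( | 158 | 3.2%
- | 147 | 3.0%
Other values (7) | 502 | 10.1%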

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 66060 | 91.7%
ASCII | 5967 | 8.3%
Number Forms | 9 | < 0.1%

Most frequent character per block

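Hangul
Value | Count | Frequency (%)
(not captured) | 2435 | 3.7%
(not captured) | 2290 | 3.5%
(not captured) | 2078 | 3.1%
(not captured) | 1829 | 2.8%
(not captured) | 1805 | 2.7%
(not captured) | 1688 | 2.6%
(not captured) | 1510 | 2.3%
(not captured) | 1457 | 2.2%
(not captured) | 1444 | 2.2%
(not captured) | 1302 | 2.0%
Other values (376) | 48222 | 73.0%

ASCII
Value | Count | Frequency (%)
2 | 1130 | 18.9%
1 | 1082 | 18.1%
(space) | 662 | 11.1%
3 | 506 | 8.5%
4 | 261 | 4.4%
5 | 211 | 3.5%
e | 198 | 3.3%
6 | 158 | 2.6%
) | 158 | 2.6%
( | 158 | 2.6%
Other values (34) | 1443 | 24.2%

Number Forms
Value | Count | Frequency (%)
(not captured) | 9 | 100.0%
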
아파트코드
Text

Distinct: 2164
Distinct (%): 21.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 97
Unique (%): 1.0%

Sample

1st row: A13920105
2nd row: A14286104
3rd row: A15807204
4th row: A12187901
5th row: A10024821

Value | Count | Frequency (%)
a13410003 | 13 | 0.1%
a12201301 | 13 | 0.1%
a15375809 | 13 | 0.1%
a10027553 | 12 | 0.1%
a12187906 | 12 | 0.1%
a12070101 | 12 | 0.1%
a15004507 | 12 | 0.1%
a15205513 | 12 | 0.1%
a12007001 | 11 | 0.1%
a13923103 | 11 | 0.1%
Other values (2154) | 9879 | 98.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 18460 | 20.5%
1 | 17601 | 19.6%
A | 9983 | 11.1%
3 | 8789 | 9.8%
2 | 8290 | 9.2%
5 | 6248 | 6.9%
8 | 5773 | 6.4%
7 | 4774 | 5.3%
4 | 3770 | 4.2%
6 | 3371 | 3.7%
Other values (2) | 2941 | 3.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18460 | 23.1%
1 | 17601 | 22.0%
3 | 8789 | 11.0%
2 | 8290 | 10.4%
5 | 6248 | 7.8%
8 | 5773 | 7.2%
7 | 4774 | 6.0%
4 | 3770 | 4.7%
6 | 3371 | 4.2%
9 | 2924 | 3.7%

Uppercase Letter
Value | Count | Frequency (%)
A | 9983 | 99.8%
B | 17 | 0.2%
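
The statistics for this variable (length always 9, only the uppercase letters A and B, eight digits per value on average) together with the sample rows suggest a fixed shape: one letter followed by eight digits. The sketch below encodes that shape as a regular expression. The pattern is inferred from this report rather than taken from any documented specification, and `CODE_PATTERN` and `is_valid_code` are hypothetical names.

```python
import re

# Assumed shape, inferred from the report: one letter A or B, then eight ASCII digits.
CODE_PATTERN = re.compile(r"[AB][0-9]{8}")


def is_valid_code(code: str) -> bool:
    """Return True if the whole string matches the assumed code shape."""
    return CODE_PATTERN.fullmatch(code) is not None


# The five sample values listed for this variable.
sample_codes = ["A13920105", "A14286104", "A15807204", "A12187901", "A10024821"]
all_valid = all(is_valid_code(code) for code in sample_codes)
print(all_valid)
```

A check like this could flag malformed codes before profiling, although it would have to be relaxed if other prefixes turn out to be legitimate.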

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 88.9%
Latin | 10000 | 11.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18460 | 23.1%
1 | 17601 | 22.0%
3 | 8789 | 11.0%
2 | 8290 | 10.4%
5 | 6248 | 7.8%
8 | 5773 | 7.2%
7 | 4774 | 6.0%
4 | 3770 | 4.7%
6 | 3371 | 4.2%
9 | 2924 | 3.7%

Latin
Value | Count | Frequency (%)
A | 9983 | 99.8%
B | 17 | 0.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18460 | 20.5%
1 | 17601 | 19.6%
A | 9983 | 11.1%
3 | 8789 | 9.8%
2 | 8290 | 9.2%
5 | 6248 | 6.9%
8 | 5773 | 6.4%
7 | 4774 | 5.3%
4 | 3770 | 4.2%
6 | 3371 | 3.7%
Other values (2) | 2941 | 3.3%

비용명
Text

Distinct: 87
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 10
Median length: 9
Mean length: 4.7955
Min length: 2

Characters and Unicode

Total characters: 47955
Distinct characters: 120
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 위탁관리수수료
2nd row: 연체료수익
3rd row: 소독비
4th row: 검침수익
5th row: 승강기유지비

Value | Count | Frequency (%)
청소비 | 247 | 2.5%
수선유지비 | 245 | 2.5%
연체료수익 | 238 | 2.4%
승강기유지비 | 238 | 2.4%
사무용품비 | 237 | 2.4%
급여 | 237 | 2.4%
소독비 | 237 | 2.4%
통신비 | 234 | 2.3%
퇴직급여 | 227 | 2.3%
경비비 | 224 | 2.2%
Other values (77) | 7636 | 76.4%

Most occurring characters

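Value | Count | Frequency (%)
(not captured) | 5378 | 11.2%
(not captured) | 3630 | 7.6%
(not captured) | 2197 | 4.6%
(not captured) | 1943 | 4.1%
(not captured) | 1695 | 3.5%
(not captured) | 1355 | 2.8%
(not captured) | 1051 | 2.2%
(not captured) | 886 | 1.8%
(not captured) | 825 | 1.7%
(not captured) | 816 | 1.7%
Other values (110) | 28179 | 58.8%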

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 47955 | 100.0%

Most frequent character per category

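Other Letter
Value | Count | Frequency (%)
(not captured) | 5378 | 11.2%
(not captured) | 3630 | 7.6%
(not captured) | 2197 | 4.6%
(not captured) | 1943 | 4.1%
(not captured) | 1695 | 3.5%
(not captured) | 1355 | 2.8%
(not captured) | 1051 | 2.2%
(not captured) | 886 | 1.8%
(not captured) | 825 | 1.7%
(not captured) | 816 | 1.7%
Other values (110) | 28179 | 58.8%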

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 47955 | 100.0%

Most frequent character per script

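Hangul
Value | Count | Frequency (%)
(not captured) | 5378 | 11.2%
(not captured) | 3630 | 7.6%
(not captured) | 2197 | 4.6%
(not captured) | 1943 | 4.1%
(not captured) | 1695 | 3.5%
(not captured) | 1355 | 2.8%
(not captured) | 1051 | 2.2%
(not captured) | 886 | 1.8%
(not captured) | 825 | 1.7%
(not captured) | 816 | 1.7%
Other values (110) | 28179 | 58.8%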

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 47955 | 100.0%

Most frequent character per block

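Hangul
Value | Count | Frequency (%)
(not captured) | 5378 | 11.2%
(not captured) | 3630 | 7.6%
(not captured) | 2197 | 4.6%
(not captured) | 1943 | 4.1%
(not captured) | 1695 | 3.5%
(not captured) | 1355 | 2.8%
(not captured) | 1051 | 2.2%
(not captured) | 886 | 1.8%
(not captured) | 825 | 1.7%
(not captured) | 816 | 1.7%
Other values (110) | 28179 | 58.8%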

년월일
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
202002 | 10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202002
2nd row: 202002
3rd row: 202002
4th row: 202002
5th row: 202002

Common Values

Value | Count | Frequency (%)
202002 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
202002 | 10000 | 100.0%

금액
Real number (ℝ)

ZEROS 

Distinct: 7302
Distinct (%): 73.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3484129.5
Minimum: -2285555
Maximum: 4.3522297 × 10^8
Zeros: 856
Zeros (%): 8.6%
Negative: 11
Negative (%): 0.1%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -2285555
5-th percentile: 0
Q1: 93185
Median: 332460
Q3: 1441682.5
95-th percentile: 17098769
Maximum: 4.3522297 × 10^8
Range: 4.3750852 × 10^8
Interquartile range (IQR): 1348497.5
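
The range and the interquartile range follow directly from the other quantile statistics: range is maximum minus minimum, and IQR is Q3 minus Q1. A minimal sketch that recomputes both from the reported figures (the exact maximum, 435222968, is taken from the extreme-values listing for this variable; the variable names are illustrative):

```python
# Figures copied from the quantile statistics of this variable.
minimum = -2_285_555
maximum = 435_222_968   # exact value behind the rounded 4.3522297 × 10^8
q1 = 93_185
q3 = 1_441_682.5

value_range = maximum - minimum   # range = max - min
iqr = q3 - q1                     # interquartile range = Q3 - Q1
print(value_range, iqr)
```

Both results agree with the reported range (about 4.3750852 × 10^8) and IQR (1348497.5).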

Descriptive statistics

Standard deviation: 13518360
Coefficient of variation (CV): 3.8799821
Kurtosis: 305.45498
Mean: 3484129.5
Median Absolute Deviation (MAD): 317540
Skewness: 13.976058
Sum: 3.4841295 × 10^10
Variance: 1.8274605 × 10^14
Monotonicity: Not monotonic
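
Several of these descriptive statistics are tied together by simple identities: CV is the standard deviation divided by the mean, the sum is the mean times the number of observations, and the variance is the square of the standard deviation. The sketch below checks them using the reported figures. Because the report rounds its values, only approximate agreement is expected, and the variable names are illustrative.

```python
# Figures copied from the report; all are rounded as displayed.
mean = 3_484_129.5
std = 13_518_360
n_observations = 10_000

cv = std / mean                 # coefficient of variation
total = mean * n_observations   # sum of all values
variance = std ** 2             # variance as the square of the standard deviation

print(f"CV       = {cv:.4f}")
print(f"Sum      = {total:.7e}")
print(f"Variance = {variance:.5e}")
```

With these inputs the computed values match the reported CV, sum, and variance to within the rounding of the displayed figures.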
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0 | 856 | 8.6%
200000 | 82 | 0.8%
100000 | 81 | 0.8%
300000 | 68 | 0.7%
150000 | 54 | 0.5%
400000 | 42 | 0.4%
250000 | 38 | 0.4%
50000 | 38 | 0.4%
500000 | 34 | 0.3%
220000 | 30 | 0.3%
Other values (7292) | 8677 | 86.8%

Value | Count | Frequency (%)
-2285555 | 1 | < 0.1%
-1763878 | 1 | < 0.1%
-904920 | 1 | < 0.1%
-517100 | 1 | < 0.1%
-234720 | 1 | < 0.1%
-117500 | 1 | < 0.1%
-80000 | 1 | < 0.1%
-71260 | 1 | < 0.1%
-51120 | 1 | < 0.1%
-2459 | 1 | < 0.1%

Value | Count | Frequency (%)
435222968 | 1 | < 0.1%
351387546 | 1 | < 0.1%
342790630 | 1 | < 0.1%
333476770 | 1 | < 0.1%
313970370 | 1 | < 0.1%
284752702 | 1 | < 0.1%
242034746 | 1 | < 0.1%
203604370 | 1 | < 0.1%
201529847 | 1 | < 0.1%
160034000 | 1 | < 0.1%

Interactions


Correlations

Variable | 비용명 | 금액
비용명 | 1.000 | 0.410
금액 | 0.410 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
53198 | 상계수락한신 | A13920105 | 위탁관리수수료 | 202002 | 327419
63439 | 번동한양 | A14286104 | 연체료수익 | 202002 | 8120
88107 | 신정현대 | A15807204 | 소독비 | 202002 | 170000
15039 | 중동계룡 | A12187901 | 검침수익 | 202002 | 74850
61 | 대치롯데캐슬아파트 | A10024821 | 승강기유지비 | 202002 | 550000
6261 | 위례아이파크아파트 | A10027744 | 연차수당 | 202002 | 630080
5851 | 상도2차 두산위브트레지움 아파트 | A10027633 | 검침비용 | 202002 | 249830
5137 | 목동힐스테이트 | A10027375 | 재활용품수익 | 202002 | 1621500
54095 | 중계주공5단지 | A13922114 | 회계감사비 | 202002 | 108166
89546 | 신정이펜하우스3단지 | A15879502 | 통신비 | 202002 | 116390

Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
84449 | 화곡초록 | A15770801 | 제수당 | 202002 | 1322000
78593 | 브라운스톤상도 | A15603002 | 고용보험료 | 202002 | 122800
85517 | 등촌대림e편한세상 | A15783703 | 산재보험료 | 202002 | 149440
4482 | 경희궁자이3단지 | A10027105 | 연체료수익 | 202002 | 33040
91361 | 은평뉴타운우물골8단지 | A41279915 | 산재보험료 | 202002 | 129860
60259 | 동부이촌동우성 | A14003001 | 공동수도료 | 202002 | 56070
14988 | 래미안용강아파트 | A12187602 | 감가상각비 | 202002 | 45000
18794 | 휘경동일하이빌 | A13009202 | 승강기수익 | 202002 | 160000
46682 | 서초2차e편한세상 | A13787102 | 업무추진비 | 202002 | 100000
17020 | DMC자이1단지 | A12275501 | 잡수익 | 202002 | 0