Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
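
The derived figures in this block follow from the raw counts by simple arithmetic. The sketch below recomputes them from the numbers reported above; the variable names are illustrative and are not part of the profiling tool's output.

```python
# Figures copied from the "Dataset statistics" block of the report.
n_variables = 5
n_observations = 10_000
missing_cells = 0
duplicate_rows = 0
total_size_kib = 488.3

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations

missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# 1 KiB = 1024 bytes; the report rounds the result to one decimal place.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the average record size agrees with the 50.0 B shown in the report.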

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15821/S/1/datasetView.do

Alerts

년월일 has constant value "201906" (Constant)
금액 has 1154 (11.5%) zeros (Zeros)

Reproduction

Analysis started: 2024-05-11 06:59:27.532189
Analysis finished: 2024-05-11 06:59:29.647538
Duration: 2.12 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명
Text

Distinct: 2097
Distinct (%): 21.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 22
Median length: 20
Mean length: 7.1816
Min length: 2
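
The mean length of each text variable is consistent with dividing its "Total characters" figure by the number of observations. The sketch below checks this for the three text variables; the variable names are taken from the sample table's column headers, and the helper `mean_length` is a hypothetical name introduced only for this illustration.

```python
# (variable, total characters, mean length) as reported for the three text variables.
N_OBSERVATIONS = 10_000

reported = [
    ("아파트명", 71_816, 7.1816),
    ("아파트코드", 90_000, 9.0),
    ("비용명", 48_845, 4.8845),
]


def mean_length(total_characters: int, n_rows: int) -> float:
    """Average number of characters per row."""
    return total_characters / n_rows


for name, total_chars, reported_mean in reported:
    computed = mean_length(total_chars, N_OBSERVATIONS)
    print(f"{name}: computed {computed:.4f}, reported {reported_mean}")
```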

Characters and Unicode

Total characters: 71816
Distinct characters: 430
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
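
As a rough illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module for three apartment names that appear in this report. It is not the profiling tool's own implementation. The two-letter codes map to the category names used in the report (`Lo` is Other Letter, `Nd` is Decimal Number, `Zs` is Space Separator, `Ps` and `Pe` are Open and Close Punctuation). The standard library does not expose script or block properties, so those are not covered here.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


# Three values taken from the sample rows of this report.
samples = ["이편한세상 상도 노빌리티", "문래현대1차", "염창한강동아(2차)"]

counts = category_counts(samples)
for code, count in counts.most_common():
    print(code, count)
```

Hangul syllables fall under `Lo`, which is why Other Letter dominates the category table for this variable.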

Unique

Unique: 97
Unique (%): 1.0%

Sample

1st row: 마곡푸르지오
2nd row: 돈암코오롱하늘채아파트
3rd row: 문래현대1차
4th row: 이편한세상 상도 노빌리티
5th row: 잠실우성4차
Value | Count | Frequency (%)
아파트 | 102 | 1.0%
래미안 | 21 | 0.2%
신동아파밀리에 | 17 | 0.2%
힐스테이트 | 14 | 0.1%
신내 | 14 | 0.1%
우리유앤미 | 13 | 0.1%
브라운스톤 | 13 | 0.1%
문화촌현대 | 13 | 0.1%
신길삼두 | 13 | 0.1%
청구e편한세상(분양 | 13 | 0.1%
Other values (2150) | 10251 | 97.8%

Most occurring characters

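Value | Count | Frequency (%)
[unreadable] | 2306 | 3.2%
[unreadable] | 2178 | 3.0%
[unreadable] | 1981 | 2.8%
[unreadable] | 1889 | 2.6%
[unreadable] | 1819 | 2.5%
[unreadable] | 1676 | 2.3%
[unreadable] | 1571 | 2.2%
[unreadable] | 1561 | 2.2%
[unreadable] | 1433 | 2.0%
[unreadable] | 1373 | 1.9%
Other values (420) | 54029 | 75.2%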

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 65919 | 91.8%
Decimal Number | 3782 | 5.3%
Uppercase Letter | 679 | 0.9%
Space Separator | 522 | 0.7%
Lowercase Letter | 351 | 0.5%
Dash Punctuation | 153 | 0.2%
Close Punctuation | 148 | 0.2%
Open Punctuation | 148 | 0.2%
Other Punctuation | 110 | 0.2%
Letter Number | 4 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[unreadable] | 2306 | 3.5%
[unreadable] | 2178 | 3.3%
[unreadable] | 1981 | 3.0%
[unreadable] | 1889 | 2.9%
[unreadable] | 1819 | 2.8%
[unreadable] | 1676 | 2.5%
[unreadable] | 1571 | 2.4%
[unreadable] | 1561 | 2.4%
[unreadable] | 1433 | 2.2%
[unreadable] | 1373 | 2.1%
Other values (375) | 48132 | 73.0%

Uppercase Letter
Value | Count | Frequency (%)
S | 127 | 18.7%
K | 93 | 13.7%
C | 79 | 11.6%
L | 59 | 8.7%
H | 51 | 7.5%
D | 39 | 5.7%
M | 39 | 5.7%
G | 35 | 5.2%
E | 34 | 5.0%
I | 33 | 4.9%
Other values (7) | 90 | 13.3%

Lowercase Letter
Value | Count | Frequency (%)
e | 195 | 55.6%
l | 42 | 12.0%
i | 38 | 10.8%
v | 28 | 8.0%
w | 12 | 3.4%
s | 10 | 2.8%
k | 9 | 2.6%
g | 5 | 1.4%
a | 5 | 1.4%
c | 4 | 1.1%

Decimal Number
Value | Count | Frequency (%)
1 | 1121 | 29.6%
2 | 1113 | 29.4%
3 | 548 | 14.5%
4 | 243 | 6.4%
5 | 234 | 6.2%
6 | 169 | 4.5%
7 | 112 | 3.0%
9 | 101 | 2.7%
0 | 73 | 1.9%
8 | 68 | 1.8%

Other Punctuation
Value | Count | Frequency (%)
, | 90 | 81.8%
. | 20 | 18.2%

Space Separator
Value | Count | Frequency (%)
(space) | 522 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 153 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 148 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 148 | 100.0%

Letter Number
Value | Count | Frequency (%)
[unreadable] | 4 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 65919 | 91.8%
Common | 4863 | 6.8%
Latin | 1034 | 1.4%

Most frequent character per script

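Hangul
Value | Count | Frequency (%)
[unreadable] | 2306 | 3.5%
[unreadable] | 2178 | 3.3%
[unreadable] | 1981 | 3.0%
[unreadable] | 1889 | 2.9%
[unreadable] | 1819 | 2.8%
[unreadable] | 1676 | 2.5%
[unreadable] | 1571 | 2.4%
[unreadable] | 1561 | 2.4%
[unreadable] | 1433 | 2.2%
[unreadable] | 1373 | 2.1%
Other values (375) | 48132 | 73.0%

Latin
Value | Count | Frequency (%)
e | 195 | 18.9%
S | 127 | 12.3%
K | 93 | 9.0%
C | 79 | 7.6%
L | 59 | 5.7%
H | 51 | 4.9%
l | 42 | 4.1%
D | 39 | 3.8%
M | 39 | 3.8%
i | 38 | 3.7%
Other values (19) | 272 | 26.3%

Common
Value | Count | Frequency (%)
1 | 1121 | 23.1%
2 | 1113 | 22.9%
3 | 548 | 11.3%
(space) | 522 | 10.7%
4 | 243 | 5.0%
5 | 234 | 4.8%
6 | 169 | 3.5%
- | 153 | 3.1%
) | 148 | 3.0%
( | 148 | 3.0%
Other values (6) | 464 | 9.5%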

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 65919 | 91.8%
ASCII | 5893 | 8.2%
Number Forms | 4 | < 0.1%

Most frequent character per block

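Hangul
Value | Count | Frequency (%)
[unreadable] | 2306 | 3.5%
[unreadable] | 2178 | 3.3%
[unreadable] | 1981 | 3.0%
[unreadable] | 1889 | 2.9%
[unreadable] | 1819 | 2.8%
[unreadable] | 1676 | 2.5%
[unreadable] | 1571 | 2.4%
[unreadable] | 1561 | 2.4%
[unreadable] | 1433 | 2.2%
[unreadable] | 1373 | 2.1%
Other values (375) | 48132 | 73.0%

ASCII
Value | Count | Frequency (%)
1 | 1121 | 19.0%
2 | 1113 | 18.9%
3 | 548 | 9.3%
(space) | 522 | 8.9%
4 | 243 | 4.1%
5 | 234 | 4.0%
e | 195 | 3.3%
6 | 169 | 2.9%
- | 153 | 2.6%
) | 148 | 2.5%
Other values (34) | 1447 | 24.6%

Number Forms
Value | Count | Frequency (%)
[unreadable] | 4 | 100.0%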

아파트코드
Text

Distinct: 2104
Distinct (%): 21.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 100
Unique (%): 1.0%

Sample

1st row: A15722004
2nd row: A10027227
3rd row: A15009604
4th row: A10025768
5th row: A13822902
Value | Count | Frequency (%)
a15083701 | 13 | 0.1%
a12009305 | 13 | 0.1%
a10045002 | 13 | 0.1%
a13980107 | 12 | 0.1%
a10027755 | 12 | 0.1%
a13524006 | 12 | 0.1%
a15286801 | 12 | 0.1%
a13614001 | 11 | 0.1%
a15606007 | 11 | 0.1%
a13184208 | 11 | 0.1%
Other values (2094) | 9880 | 98.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 18510 | 20.6%
1 | 17666 | 19.6%
A | 9987 | 11.1%
3 | 8884 | 9.9%
2 | 8092 | 9.0%
5 | 6219 | 6.9%
8 | 5643 | 6.3%
7 | 4745 | 5.3%
4 | 3845 | 4.3%
6 | 3501 | 3.9%
Other values (2) | 2908 | 3.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18510 | 23.1%
1 | 17666 | 22.1%
3 | 8884 | 11.1%
2 | 8092 | 10.1%
5 | 6219 | 7.8%
8 | 5643 | 7.1%
7 | 4745 | 5.9%
4 | 3845 | 4.8%
6 | 3501 | 4.4%
9 | 2895 | 3.6%

Uppercase Letter
Value | Count | Frequency (%)
A | 9987 | 99.9%
B | 13 | 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 88.9%
Latin | 10000 | 11.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18510 | 23.1%
1 | 17666 | 22.1%
3 | 8884 | 11.1%
2 | 8092 | 10.1%
5 | 6219 | 7.8%
8 | 5643 | 7.1%
7 | 4745 | 5.9%
4 | 3845 | 4.8%
6 | 3501 | 4.4%
9 | 2895 | 3.6%

Latin
Value | Count | Frequency (%)
A | 9987 | 99.9%
B | 13 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18510 | 20.6%
1 | 17666 | 19.6%
A | 9987 | 11.1%
3 | 8884 | 9.9%
2 | 8092 | 9.0%
5 | 6219 | 6.9%
8 | 5643 | 6.3%
7 | 4745 | 5.3%
4 | 3845 | 4.3%
6 | 3501 | 3.9%
Other values (2) | 2908 | 3.2%

비용명
Text

Distinct: 87
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 10
Median length: 9
Mean length: 4.8845
Min length: 2

Characters and Unicode

Total characters: 48845
Distinct characters: 120
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 지급수수료
2nd row: 교육비
3rd row: 복리후생비
4th row: 피복비
5th row: 연차수당
Value | Count | Frequency (%)
소독비 | 236 | 2.4%
퇴직급여 | 224 | 2.2%
사무용품비 | 221 | 2.2%
이자수익 | 220 | 2.2%
통신비 | 212 | 2.1%
세대전기료 | 212 | 2.1%
수선유지비 | 209 | 2.1%
승강기유지비 | 208 | 2.1%
급여 | 206 | 2.1%
연체료수익 | 205 | 2.1%
Other values (77) | 7847 | 78.5%

Most occurring characters

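Value | Count | Frequency (%)
[unreadable] | 5432 | 11.1%
[unreadable] | 3552 | 7.3%
[unreadable] | 2103 | 4.3%
[unreadable] | 2032 | 4.2%
[unreadable] | 1806 | 3.7%
[unreadable] | 1304 | 2.7%
[unreadable] | 1087 | 2.2%
[unreadable] | 828 | 1.7%
[unreadable] | 794 | 1.6%
[unreadable] | 746 | 1.5%
Other values (110) | 29161 | 59.7%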

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 48845 | 100.0%

Most frequent character per category

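Other Letter
Value | Count | Frequency (%)
[unreadable] | 5432 | 11.1%
[unreadable] | 3552 | 7.3%
[unreadable] | 2103 | 4.3%
[unreadable] | 2032 | 4.2%
[unreadable] | 1806 | 3.7%
[unreadable] | 1304 | 2.7%
[unreadable] | 1087 | 2.2%
[unreadable] | 828 | 1.7%
[unreadable] | 794 | 1.6%
[unreadable] | 746 | 1.5%
Other values (110) | 29161 | 59.7%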

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 48845 | 100.0%

Most frequent character per script

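Hangul
Value | Count | Frequency (%)
[unreadable] | 5432 | 11.1%
[unreadable] | 3552 | 7.3%
[unreadable] | 2103 | 4.3%
[unreadable] | 2032 | 4.2%
[unreadable] | 1806 | 3.7%
[unreadable] | 1304 | 2.7%
[unreadable] | 1087 | 2.2%
[unreadable] | 828 | 1.7%
[unreadable] | 794 | 1.6%
[unreadable] | 746 | 1.5%
Other values (110) | 29161 | 59.7%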

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 48845 | 100.0%

Most frequent character per block

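Hangul
Value | Count | Frequency (%)
[unreadable] | 5432 | 11.1%
[unreadable] | 3552 | 7.3%
[unreadable] | 2103 | 4.3%
[unreadable] | 2032 | 4.2%
[unreadable] | 1806 | 3.7%
[unreadable] | 1304 | 2.7%
[unreadable] | 1087 | 2.2%
[unreadable] | 828 | 1.7%
[unreadable] | 794 | 1.6%
[unreadable] | 746 | 1.5%
Other values (110) | 29161 | 59.7%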

년월일
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
201906 | 10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 201906
2nd row: 201906
3rd row: 201906
4th row: 201906
5th row: 201906

Common Values

Value | Count | Frequency (%)
201906 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
201906 | 10000 | 100.0%

금액
Real number (ℝ)

ZEROS 

Distinct: 6987
Distinct (%): 69.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2689012.8
Minimum: -1725000
Maximum: 2.8434045 × 10^8
Zeros: 1154
Zeros (%): 11.5%
Negative: 12
Negative (%): 0.1%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -1725000
5-th percentile: 0
Q1: 63802.25
Median: 300000
Q3: 1304620
95-th percentile: 13560896
Maximum: 2.8434045 × 10^8
Range: 2.8606545 × 10^8
Interquartile range (IQR): 1240817.8
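
The range and interquartile range follow directly from the other quantile statistics. The sketch below recomputes both; it uses the exact maximum of 284340449 listed among the largest values of 금액 in this report, and the variable names are illustrative only.

```python
# Quantile statistics for 금액, copied from the report.
minimum = -1_725_000
q1 = 63_802.25
q3 = 1_304_620
maximum = 284_340_449  # shown in rounded scientific notation in the summary

# Range is the distance between the extremes.
value_range = maximum - minimum

# The interquartile range spans the middle 50% of the observations.
iqr = q3 - q1

print(f"Range: {value_range}")
print(f"Interquartile range (IQR): {iqr}")
```

Both results agree with the reported figures once rounded to the precision the report displays.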

Descriptive statistics

Standard deviation: 9198861.6
Coefficient of variation (CV): 3.4209065
Kurtosis: 204.26373
Mean: 2689012.8
Median Absolute Deviation (MAD): 299000
Skewness: 11.049385
Sum: 2.6890128 × 10^10
Variance: 8.4619054 × 10^13
Monotonicity: Not monotonic
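
Several of these descriptive statistics are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. The sketch below checks the reported values against those identities. Because the inputs are the rounded figures shown in the report, the results match only up to rounding; the variable names are illustrative.

```python
# Descriptive statistics for 금액, copied from the report (already rounded).
n_observations = 10_000
mean = 2_689_012.8
std = 9_198_861.6

cv = std / mean                # coefficient of variation
variance = std ** 2            # variance is the squared standard deviation
total = mean * n_observations  # sum of all values

print(f"Coefficient of variation (CV): {cv:.7f}")
print(f"Variance: {variance:.7e}")
print(f"Sum: {total:.7e}")
```

A coefficient of variation above 3, together with the high skewness and kurtosis, points to a strongly right-skewed distribution dominated by a few very large amounts.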
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1154 | 11.5%
200000 | 107 | 1.1%
300000 | 69 | 0.7%
100000 | 68 | 0.7%
150000 | 44 | 0.4%
250000 | 38 | 0.4%
500000 | 35 | 0.4%
400000 | 34 | 0.3%
50000 | 32 | 0.3%
600000 | 28 | 0.3%
Other values (6977) | 8391 | 83.9%

Value | Count | Frequency (%)
-1725000 | 1 | < 0.1%
-1717350 | 1 | < 0.1%
-280160 | 1 | < 0.1%
-232080 | 1 | < 0.1%
-51055 | 1 | < 0.1%
-41790 | 1 | < 0.1%
-35100 | 1 | < 0.1%
-18140 | 1 | < 0.1%
-8000 | 1 | < 0.1%
-3000 | 1 | < 0.1%

Value | Count | Frequency (%)
284340449 | 1 | < 0.1%
215050535 | 1 | < 0.1%
191084440 | 1 | < 0.1%
188124800 | 1 | < 0.1%
165975281 | 1 | < 0.1%
154683140 | 1 | < 0.1%
147398042 | 1 | < 0.1%
142234508 | 1 | < 0.1%
124378670 | 1 | < 0.1%
119812630 | 1 | < 0.1%

Interactions


Correlations

Variable | 비용명 | 금액
비용명 | 1.000 | 0.366
금액 | 0.366 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

(index) | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
88262 | 마곡푸르지오 | A15722004 | 지급수수료 | 201906 | 2890
3756 | 돈암코오롱하늘채아파트 | A10027227 | 교육비 | 201906 | 0
70990 | 문래현대1차 | A15009604 | 복리후생비 | 201906 | 18000
736 | 이편한세상 상도 노빌리티 | A10025768 | 피복비 | 201906 | 324190
52583 | 잠실우성4차 | A13822902 | 연차수당 | 201906 | 802150
38408 | 도곡경남 | A13527008 | 통신비 | 201906 | 62886
37682 | 수서신동아 | A13522006 | 회계감사비 | 201906 | 165000
3333 | 도봉숲 아뜨리움 | A10027136 | 위탁관리수수료 | 201906 | 350000
91798 | 염창한강동아(2차) | A15786510 | 제수당 | 201906 | 700000
94815 | 신정이펜하우스3단지 | A15879502 | 공동주택지원금수익 | 201906 | 0

(index) | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
57424 | 공릉한보아파트 | A13924003 | 사무용품비 | 201906 | 0
28056 | 창동성원 | A13292701 | 경비비 | 201906 | 7092110
22422 | 용마금호타운 | A13181203 | 업무추진비 | 201906 | 200000
7211 | 중림삼성래미안아파트 | A10085902 | 공동전기료 | 201906 | 776570
94075 | 목동12단지 | A15807706 | 선거관리위원회운영비 | 201906 | 762140
15728 | 갈현미미 | A12205001 | 위탁관리수수료 | 201906 | 0
5091 | 위례아이파크아파트 | A10027744 | 자치활동비 | 201906 | 0
80798 | 신도림우성1,2차 | A15288806 | 부과차손 | 201906 | 110000
62197 | 삼창타워프라자 | A13986702 | 공동수도료 | 201906 | 112670
85182 | 상도삼호 | A15678102 | 급여 | 201906 | 20447280