Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
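
The two memory figures are consistent with each other: dividing the total size by the number of observations gives the average record size. A minimal check using only the numbers reported above (the variable names are illustrative, not part of the report):

```python
# Figures copied from the dataset statistics above.
total_size_kib = 488.3
n_observations = 10_000
n_variables = 5

# 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

# Number of cells that the "Missing cells" count is measured against.
n_cells = n_observations * n_variables

print(f"average record size: {avg_record_bytes:.1f} B")
print(f"cells: {n_cells}")
```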

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15820/S/1/datasetView.do

Alerts

년월일 (date) has constant value "202308" (Constant)
금액 (amount) has 2526 (25.3%) zeros (Zeros)
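
The two alerts can be reproduced with simple per-column rules. The sketch below is illustrative only: the function name `column_alerts` and the 10% zero-share threshold are assumptions, not the profiler's actual implementation or defaults. The amounts are the ten values from the first sample table at the end of this report.

```python
def column_alerts(values, zero_threshold=0.10):
    """Return alert labels for one column.

    zero_threshold is an assumed cut-off chosen for illustration.
    """
    alerts = []
    # A column is constant when it holds exactly one distinct value.
    if len(set(values)) == 1:
        alerts.append("Constant")
    # The zero check only applies to numeric values.
    numeric = [v for v in values
               if isinstance(v, (int, float)) and not isinstance(v, bool)]
    if numeric:
        zero_share = sum(1 for v in numeric if v == 0) / len(values)
        if zero_share > zero_threshold:
            alerts.append("Zeros")
    return alerts


dates = ["202308"] * 10
amounts = [141586154, 383976, 213137834, -2703620, 0,
           357230, 0, 0, -28917450, 24182079]

print(column_alerts(dates))
print(column_alerts(amounts))
```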

Reproduction

Analysis started: 2024-05-11 05:56:15.364218
Analysis finished: 2024-05-11 05:56:16.463152
Duration: 1.1 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명 (apartment name)
Text

Distinct: 2244
Distinct (%): 22.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 21
Mean length: 7.4452
Min length: 2
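
The length statistics are plain summaries of per-value string lengths. A minimal sketch, using the five sample values listed for this variable rather than the full column, so its output differs from the figures above. The last lines cross-check the reported mean length against the reported total character count.

```python
from statistics import mean, median

# Five sample values of this variable; the full column has 10000 rows.
samples = ["구로다솜금호", "자양현대5차", "마포래미안푸르지오", "묵동한국", "평창롯데"]

lengths = [len(s) for s in samples]  # len() counts Unicode code points
length_stats = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": mean(lengths),
    "min": min(lengths),
}
print(length_stats)

# Cross-check on the reported figures: mean length = total characters / rows.
reported_mean_length = 74452 / 10000
print(reported_mean_length)
```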

Characters and Unicode

Total characters: 74452
Distinct characters: 433
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
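
As an illustration of that idea, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories over five sample values of this variable. The mapping from two-letter codes to the long names used in this report is written out by hand for the two categories that occur. The standard library does not expose script or block properties, so those are not covered here.

```python
import unicodedata
from collections import Counter

# Five sample values of this variable.
samples = ["구로다솜금호", "자양현대5차", "마포래미안푸르지오", "묵동한국", "평창롯데"]

# Long names for the two general categories that occur in these samples.
CATEGORY_NAMES = {"Lo": "Other Letter", "Nd": "Decimal Number"}

category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)

for code, count in category_counts.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```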

Unique

Unique: 123
Unique (%): 1.2%

Sample

1st row: 구로다솜금호
2nd row: 자양현대5차
3rd row: 마포래미안푸르지오
4th row: 묵동한국
5th row: 평창롯데

Value                 Count   Frequency (%)
아파트                189     1.7%
래미안                53      0.5%
e편한세상             28      0.3%
이편한세상            17      0.2%
송파                  17      0.2%
푸르지오              15      0.1%
sk뷰                  14      0.1%
경남아너스빌          14      0.1%
장안위더스빌          13      0.1%
양재신영체르니        12      0.1%
Other values (2332)   10500   96.6%

Most occurring characters

Value                 Count   Frequency (%)
                      2595    3.5%
                      2573    3.5%
                      2432    3.3%
                      1855    2.5%
                      1735    2.3%
                      1625    2.2%
                      1543    2.1%
                      1488    2.0%
                      1402    1.9%
                      1390    1.9%
Other values (423)    55814   75.0%

Most occurring categories

Value                 Count   Frequency (%)
Other Letter          68094   91.5%
Decimal Number        3699    5.0%
Space Separator       953     1.3%
Uppercase Letter      799     1.1%
Lowercase Letter      365     0.5%
Close Punctuation     153     0.2%
Open Punctuation      153     0.2%
Dash Punctuation      126     0.2%
Other Punctuation     102     0.1%
Letter Number         8       < 0.1%

Most frequent character per category

Other Letter
Value                 Count   Frequency (%)
                      2595    3.8%
                      2573    3.8%
                      2432    3.6%
                      1855    2.7%
                      1735    2.5%
                      1625    2.4%
                      1543    2.3%
                      1488    2.2%
                      1402    2.1%
                      1390    2.0%
Other values (378)    49456   72.6%

Uppercase Letter
Value                 Count   Frequency (%)
S                     125     15.6%
C                     120     15.0%
M                     94      11.8%
D                     94      11.8%
K                     86      10.8%
H                     50      6.3%
L                     44      5.5%
I                     35      4.4%
E                     34      4.3%
V                     28      3.5%
Other values (7)      89      11.1%

Lowercase Letter
Value                 Count   Frequency (%)
e                     201     55.1%
l                     36      9.9%
i                     32      8.8%
v                     21      5.8%
s                     18      4.9%
k                     16      4.4%
c                     14      3.8%
w                     10      2.7%
h                     9       2.5%
a                     4       1.1%

Decimal Number
Value                 Count   Frequency (%)
1                     1110    30.0%
2                     1070    28.9%
3                     484     13.1%
4                     248     6.7%
5                     224     6.1%
6                     163     4.4%
7                     123     3.3%
9                     95      2.6%
0                     92      2.5%
8                     90      2.4%

Other Punctuation
Value                 Count   Frequency (%)
,                     82      80.4%
.                     20      19.6%

Space Separator
Value                 Count   Frequency (%)
(space)               953     100.0%

Close Punctuation
Value                 Count   Frequency (%)
)                     153     100.0%

Open Punctuation
Value                 Count   Frequency (%)
(                     153     100.0%

Dash Punctuation
Value                 Count   Frequency (%)
-                     126     100.0%

Letter Number
Value                 Count   Frequency (%)
                      8       100.0%

Most occurring scripts

Value                 Count   Frequency (%)
Hangul                68094   91.5%
Common                5186    7.0%
Latin                 1172    1.6%

Most frequent character per script

Hangul
Value                 Count   Frequency (%)
                      2595    3.8%
                      2573    3.8%
                      2432    3.6%
                      1855    2.7%
                      1735    2.5%
                      1625    2.4%
                      1543    2.3%
                      1488    2.2%
                      1402    2.1%
                      1390    2.0%
Other values (378)    49456   72.6%

Latin
Value                 Count   Frequency (%)
e                     201     17.2%
S                     125     10.7%
C                     120     10.2%
M                     94      8.0%
D                     94      8.0%
K                     86      7.3%
H                     50      4.3%
L                     44      3.8%
l                     36      3.1%
I                     35      3.0%
Other values (19)     287     24.5%

Common
Value                 Count   Frequency (%)
1                     1110    21.4%
2                     1070    20.6%
(space)               953     18.4%
3                     484     9.3%
4                     248     4.8%
5                     224     4.3%
6                     163     3.1%
)                     153     3.0%
(                     153     3.0%
-                     126     2.4%
Other values (6)      502     9.7%

Most occurring blocks

Value                 Count   Frequency (%)
Hangul                68094   91.5%
ASCII                 6350    8.5%
Number Forms          8       < 0.1%

Most frequent character per block

Hangul
Value                 Count   Frequency (%)
                      2595    3.8%
                      2573    3.8%
                      2432    3.6%
                      1855    2.7%
                      1735    2.5%
                      1625    2.4%
                      1543    2.3%
                      1488    2.2%
                      1402    2.1%
                      1390    2.0%
Other values (378)    49456   72.6%

ASCII
Value                 Count   Frequency (%)
1                     1110    17.5%
2                     1070    16.9%
(space)               953     15.0%
3                     484     7.6%
4                     248     3.9%
5                     224     3.5%
e                     201     3.2%
6                     163     2.6%
)                     153     2.4%
(                     153     2.4%
Other values (34)     1591    25.1%

Number Forms
Value                 Count   Frequency (%)
                      8       100.0%

아파트코드 (apartment code)
Text

Distinct: 2248
Distinct (%): 22.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 123
Unique (%): 1.2%

Sample

1st row: A15283806
2nd row: A14319203
3rd row: A12175203
4th row: A13185402
5th row: A11084601

Value                 Count   Frequency (%)
a13078701             13      0.1%
a13987306             12      0.1%
a10027375             12      0.1%
a15205305             12      0.1%
a15807604             12      0.1%
a13508006             12      0.1%
a13789002             12      0.1%
a13703011             12      0.1%
a13983709             11      0.1%
a13185508             11      0.1%
Other values (2238)   9881    98.8%

Most occurring characters

Value                 Count   Frequency (%)
0                     18373   20.4%
1                     17433   19.4%
A                     9988    11.1%
3                     9005    10.0%
2                     8383    9.3%
5                     6140    6.8%
8                     5565    6.2%
7                     4778    5.3%
4                     3883    4.3%
6                     3410    3.8%
Other values (2)      3042    3.4%

Most occurring categories

Value                 Count   Frequency (%)
Decimal Number        80000   88.9%
Uppercase Letter      10000   11.1%

Most frequent character per category

Decimal Number
Value                 Count   Frequency (%)
0                     18373   23.0%
1                     17433   21.8%
3                     9005    11.3%
2                     8383    10.5%
5                     6140    7.7%
8                     5565    7.0%
7                     4778    6.0%
4                     3883    4.9%
6                     3410    4.3%
9                     3030    3.8%

Uppercase Letter
Value                 Count   Frequency (%)
A                     9988    99.9%
B                     12      0.1%

Most occurring scripts

Value                 Count   Frequency (%)
Common                80000   88.9%
Latin                 10000   11.1%

Most frequent character per script

Common
Value                 Count   Frequency (%)
0                     18373   23.0%
1                     17433   21.8%
3                     9005    11.3%
2                     8383    10.5%
5                     6140    7.7%
8                     5565    7.0%
7                     4778    6.0%
4                     3883    4.9%
6                     3410    4.3%
9                     3030    3.8%

Latin
Value                 Count   Frequency (%)
A                     9988    99.9%
B                     12      0.1%

Most occurring blocks

Value                 Count   Frequency (%)
ASCII                 90000   100.0%

Most frequent character per block

ASCII
Value                 Count   Frequency (%)
0                     18373   20.4%
1                     17433   19.4%
A                     9988    11.1%
3                     9005    10.0%
2                     8383    9.3%
5                     6140    6.8%
8                     5565    6.2%
7                     4778    5.3%
4                     3883    4.3%
6                     3410    3.8%
Other values (2)      3042    3.4%

비용명 (expense item name)
Text

Distinct: 77
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 10
Mean length: 5.9612
Min length: 2

Characters and Unicode

Total characters: 59612
Distinct characters: 107
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 예금
2nd row: 현금
3rd row: 연차수당충당부채
4th row: 비품감가상각누계액
5th row: 기타유동부채

Value                 Count   Frequency (%)
관리비미수금          330     3.3%
미처분이익잉여금      326     3.3%
당기순이익            313     3.1%
장기수선충당예금      313     3.1%
예금                  309     3.1%
공동주택적립금        308     3.1%
연차수당충당부채      306     3.1%
예수금                302     3.0%
가수금                293     2.9%
비품감가상각누계액    290     2.9%
Other values (67)     6910    69.1%

Most occurring characters

Value                 Count   Frequency (%)
                      4682    7.9%
                      3845    6.5%
                      3170    5.3%
                      3065    5.1%
                      3011    5.1%
                      2844    4.8%
                      2556    4.3%
                      2494    4.2%
                      1873    3.1%
                      1761    3.0%
Other values (97)     30311   50.8%

Most occurring categories

Value                 Count   Frequency (%)
Other Letter          59612   100.0%

Most frequent character per category

Other Letter
Value                 Count   Frequency (%)
                      4682    7.9%
                      3845    6.5%
                      3170    5.3%
                      3065    5.1%
                      3011    5.1%
                      2844    4.8%
                      2556    4.3%
                      2494    4.2%
                      1873    3.1%
                      1761    3.0%
Other values (97)     30311   50.8%

Most occurring scripts

Value                 Count   Frequency (%)
Hangul                59612   100.0%

Most frequent character per script

Hangul
Value                 Count   Frequency (%)
                      4682    7.9%
                      3845    6.5%
                      3170    5.3%
                      3065    5.1%
                      3011    5.1%
                      2844    4.8%
                      2556    4.3%
                      2494    4.2%
                      1873    3.1%
                      1761    3.0%
Other values (97)     30311   50.8%

Most occurring blocks

Value                 Count   Frequency (%)
Hangul                59612   100.0%

Most frequent character per block

Hangul
Value                 Count   Frequency (%)
                      4682    7.9%
                      3845    6.5%
                      3170    5.3%
                      3065    5.1%
                      3011    5.1%
                      2844    4.8%
                      2556    4.3%
                      2494    4.2%
                      1873    3.1%
                      1761    3.0%
Other values (97)     30311   50.8%

년월일 (date)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value                 Count
202308                10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202308
2nd row: 202308
3rd row: 202308
4th row: 202308
5th row: 202308

Common Values

Value                 Count   Frequency (%)
202308                10000   100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value                 Count   Frequency (%)
202308                10000   100.0%

금액 (amount)
Real number (ℝ)

ZEROS 

Distinct: 7160
Distinct (%): 71.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 77844828
Minimum: -3.4508111 × 10^8
Maximum: 1.4435281 × 10^10
Zeros: 2526
Zeros (%): 25.3%
Negative: 349
Negative (%): 3.5%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -3.4508111 × 10^8
5th percentile: 0
Q1: 0
Median: 2757067
Q3: 36273762
95th percentile: 3.7224869 × 10^8
Maximum: 1.4435281 × 10^10
Range: 1.4780362 × 10^10
Interquartile range (IQR): 36273762
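
Range and IQR are derived from the other quantiles: range is maximum minus minimum, and IQR is Q3 minus Q1. A minimal check, taking the exact minimum and maximum from the extreme-value tables further down in this section (the variable names are illustrative):

```python
# Exact extremes as listed in the extreme-value tables of this report.
minimum = -345_081_106
maximum = 14_435_281_182

# Quartiles from the quantile statistics above.
q1 = 0
q3 = 36_273_762

value_range = maximum - minimum
iqr = q3 - q1

# Same scientific-notation precision as the report (8 significant digits).
print(f"range = {value_range:.7e}")
print(f"IQR   = {iqr}")
```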

Descriptive statistics

Standard deviation: 3.3072303 × 10^8
Coefficient of variation (CV): 4.2484908
Kurtosis: 477.78544
Mean: 77844828
Median Absolute Deviation (MAD): 2757067
Skewness: 16.283004
Sum: 7.7844828 × 10^11
Variance: 1.0937773 × 10^17
Monotonicity: Not monotonic
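
Several of these statistics are tied together by simple identities: CV is the standard deviation divided by the mean, variance is the squared standard deviation, and the sum is the mean times the number of observations. A minimal consistency check on the rounded figures above (small deviations are expected because the report rounds to eight significant digits):

```python
import math

n_observations = 10_000
mean_value = 77_844_828
std_dev = 3.3072303e8

cv = std_dev / mean_value          # coefficient of variation
variance = std_dev ** 2            # variance = std^2
total = mean_value * n_observations  # sum = mean * n

print(f"CV       = {cv:.7f}")
print(f"variance = {variance:.7e}")
print(f"sum      = {total:.7e}")
```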
[Figure: Histogram with fixed size bins (bins=50)]
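
Fixed-size binning splits the interval from the minimum to the maximum into equally wide bins and counts the values that fall into each. The sketch below is an illustration of that technique, not the profiler's plotting code; the function name is hypothetical, and it uses 5 bins on the ten amounts from the first sample table instead of 50 bins on the full column.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equally wide bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        if width == 0:
            idx = 0  # all values identical: everything lands in the first bin
        else:
            # The maximum would fall on the right edge; clamp it into the last bin.
            idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


amounts = [141586154, 383976, 213137834, -2703620, 0,
           357230, 0, 0, -28917450, 24182079]
counts = fixed_width_histogram(amounts, bins=5)
print(counts)
```

Most values land in the first bin, which mirrors the strong right skew reported for the full column.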

Common values
Value                 Count   Frequency (%)
0                     2526    25.3%
500000                24      0.2%
250000                16      0.2%
1000000               13      0.1%
484000                13      0.1%
200000                12      0.1%
300000                12      0.1%
242000                9       0.1%
2000000               9       0.1%
3000000               8       0.1%
Other values (7150)   7358    73.6%

Minimum 10 values
Value                 Count   Frequency (%)
-345081106            1       < 0.1%
-310187704            1       < 0.1%
-306959554            1       < 0.1%
-285332080            1       < 0.1%
-246006355            1       < 0.1%
-204744212            1       < 0.1%
-204723650            1       < 0.1%
-192260384            1       < 0.1%
-165179325            1       < 0.1%
-160313140            1       < 0.1%

Maximum 10 values
Value                 Count   Frequency (%)
14435281182           1       < 0.1%
8637605194            1       < 0.1%
6562691622            1       < 0.1%
6505000187            1       < 0.1%
5508155658            1       < 0.1%
4860224589            1       < 0.1%
4644103478            1       < 0.1%
4595645088            1       < 0.1%
4332787860            1       < 0.1%
4166459492            1       < 0.1%

Interactions


Correlations

          비용명    금액
비용명    1.000     0.314
금액      0.314     1.000
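
The report does not say which coefficient produced 0.314. For a categorical-numeric pair such as 비용명 and 금액, one common choice is Cramér's V computed after binning the numeric column. The sketch below assumes that choice and shows the uncorrected Cramér's V on two toy inputs; the function name and the toy data are illustrative, and the code does not reproduce the 0.314 figure.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association
    return math.sqrt(chi2 / (n * k))


# Toy data: the second sequence stands in for a binned numeric column.
perfect = cramers_v(["a", "a", "b", "b"], ["lo", "lo", "hi", "hi"])
independent = cramers_v(["a", "a", "b", "b"], ["lo", "hi", "lo", "hi"])
print(perfect, independent)
```

A constant column such as 년월일 has no association to measure, which is handled here by the `k == 0` branch.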

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index)  아파트명  아파트코드  비용명  년월일  금액
60222  구로다솜금호  A15283806  예금  202308  141586154
51860  자양현대5차  A14319203  현금  202308  383976
13427  마포래미안푸르지오  A12175203  연차수당충당부채  202308  213137834
20457  묵동한국  A13185402  비품감가상각누계액  202308  -2703620
10085  평창롯데  A11084601  기타유동부채  202308  0
4938  래미안베라힐즈아파트  A10025846  기타충당부채  202308  357230
41100  송파파인타운5단지  A13821003  장기수선충당부채적립금  202308  0
34516  길음뉴타운 경남아너스빌  A13610107  미처분이익잉여금  202308  0
4673  답십리파크자이  A10025754  비품감가상각누계액  202308  -28917450
20118  면목삼익  A13183502  미지급비용  202308  24182079

(index)  아파트명  아파트코드  비용명  년월일  금액
42722  문정푸르지오1차  A13882402  수선유지비충당부채  202308  12956690
69823  신정현대  A15807204  상여충당부채  202308  0
69798  목동9단지  A15807101  관리비미수금  202308  14342120
28368  고덕리엔파크2단지  A13410011  수선유지비충당부채  202308  0
28719  길동현대아파트  A13480803  예수금  202308  561230
3618  한양수자인사가정파크아파트  A10025159  장기수선충당부채  202308  109002369
12549  마포강변힐스테이트  A12112002  당기순이익  202308  111535426
49664  한남동리첸시아  A14021001  기타충당부채  202308  1218
71044  신정삼성SH임대  A15876402  미처분이익잉여금  202308  0
62954  상도래미안1차  A15603204  수선유지비충당부채  202308  23774200