Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
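
The derived figures in this table follow from the raw counts. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes (the variable names are illustrative, not part of the report):

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 5
n_observations = 10_000
missing_cells = 0
total_size_kib = 488.3

# Share of missing cells over all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size: total bytes divided by the number of rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```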

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download (파일 다운로드)
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15820/S/1/datasetView.do

Alerts

년월일 (date) has constant value "202212" [Constant]
금액 (amount) has 2465 (24.6%) zeros [Zeros]

Reproduction

Analysis started: 2024-05-11 05:57:06.749498
Analysis finished: 2024-05-11 05:57:07.952218
Duration: 1.2 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
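
The reported duration is the difference between the two timestamps. A small sketch using only the standard library shows how it can be checked:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" section.
started = datetime.fromisoformat("2024-05-11 05:57:06.749498")
finished = datetime.fromisoformat("2024-05-11 05:57:07.952218")

# Subtracting two datetimes yields a timedelta.
seconds = (finished - started).total_seconds()
print(f"Duration: {seconds:.1f} seconds")
```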

Variables

아파트명 (apartment name)
Text

Distinct: 2240
Distinct (%): 22.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 21
Mean length: 7.4082
Min length: 2

Characters and Unicode

Total characters: 74082
Distinct characters: 436
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
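
As an illustration of how such character properties can be tallied, the sketch below counts Unicode general categories with Python's `unicodedata` module. The helper name `category_counts` and the NFC normalisation step are assumptions made for this example. The report does not say how it computes these counts. The input is the five sample rows of this variable.

```python
import unicodedata
from collections import Counter

# Readable names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter", "Nd": "Decimal Number", "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter", "Zs": "Space Separator", "Ps": "Open Punctuation",
    "Pe": "Close Punctuation", "Pd": "Dash Punctuation",
    "Po": "Other Punctuation", "Nl": "Letter Number",
}

def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        # Assumption: normalise to NFC so each Hangul syllable is one code point.
        for ch in unicodedata.normalize("NFC", value):
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

samples = ["사당4-3우성", "가락금호", "오류금강수목원", "신트리3단지", "목동파크자이아파트"]
counts = category_counts(samples)
for name, count in counts.most_common():
    print(name, count)
```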

Unique

Unique: 123
Unique (%): 1.2%
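
"Distinct" and "Unique" measure different things: distinct counts how many different values occur, while unique counts the values that occur exactly once. A minimal sketch on a made-up list (the toy values are illustrative, not taken from the dataset's frequencies):

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Hypothetical toy column: two values repeat, two occur once.
toy = ["래미안", "래미안", "아이파크", "푸르지오", "푸르지오", "힐스테이트"]
distinct, unique = distinct_and_unique(toy)
print(distinct, unique)
```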

Sample

1st row: 사당4-3우성
2nd row: 가락금호
3rd row: 오류금강수목원
4th row: 신트리3단지
5th row: 목동파크자이아파트

Value | Count | Frequency (%)
아파트 | 165 | 1.5%
래미안 | 44 | 0.4%
경남아너스빌 | 20 | 0.2%
신도림현대 | 17 | 0.2%
아이파크 | 17 | 0.2%
e편한세상 | 17 | 0.2%
푸르지오 | 16 | 0.1%
힐스테이트 | 14 | 0.1%
은평뉴타운상림마을6단지 | 14 | 0.1%
잠실엘스아파트 | 13 | 0.1%
Other values (2325) | 10444 | 96.9%

Most occurring characters

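Value | Count | Frequency (%)
(not preserved) | 2572 | 3.5%
(not preserved) | 2552 | 3.4%
(not preserved) | 2330 | 3.1%
(not preserved) | 1922 | 2.6%
(not preserved) | 1745 | 2.4%
(not preserved) | 1703 | 2.3%
(not preserved) | 1494 | 2.0%
(not preserved) | 1433 | 1.9%
(not preserved) | 1420 | 1.9%
(not preserved) | 1379 | 1.9%
Other values (426) | 55532 | 75.0%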

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 67865 | 91.6%
Decimal Number | 3651 | 4.9%
Uppercase Letter | 879 | 1.2%
Space Separator | 874 | 1.2%
Lowercase Letter | 284 | 0.4%
Open Punctuation | 149 | 0.2%
Close Punctuation | 149 | 0.2%
Dash Punctuation | 126 | 0.2%
Other Punctuation | 99 | 0.1%
Letter Number | 6 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 2572 | 3.8%
(not preserved) | 2552 | 3.8%
(not preserved) | 2330 | 3.4%
(not preserved) | 1922 | 2.8%
(not preserved) | 1745 | 2.6%
(not preserved) | 1703 | 2.5%
(not preserved) | 1494 | 2.2%
(not preserved) | 1433 | 2.1%
(not preserved) | 1420 | 2.1%
(not preserved) | 1379 | 2.0%
Other values (381) | 49315 | 72.7%

Uppercase Letter
Value | Count | Frequency (%)
C | 137 | 15.6%
S | 130 | 14.8%
K | 109 | 12.4%
D | 87 | 9.9%
M | 87 | 9.9%
L | 56 | 6.4%
I | 51 | 5.8%
H | 43 | 4.9%
E | 42 | 4.8%
V | 27 | 3.1%
Other values (7) | 110 | 12.5%

Lowercase Letter
Value | Count | Frequency (%)
e | 184 | 64.8%
i | 21 | 7.4%
s | 14 | 4.9%
k | 13 | 4.6%
l | 12 | 4.2%
v | 11 | 3.9%
w | 8 | 2.8%
a | 7 | 2.5%
g | 7 | 2.5%
c | 4 | 1.4%

Decimal Number
Value | Count | Frequency (%)
1 | 1087 | 29.8%
2 | 1048 | 28.7%
3 | 487 | 13.3%
4 | 250 | 6.8%
5 | 204 | 5.6%
6 | 156 | 4.3%
7 | 135 | 3.7%
8 | 112 | 3.1%
9 | 94 | 2.6%
0 | 78 | 2.1%

Other Punctuation
Value | Count | Frequency (%)
, | 80 | 80.8%
. | 19 | 19.2%

Space Separator
Value | Count | Frequency (%)
(space) | 874 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 149 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 149 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 126 | 100.0%

Letter Number
Value | Count | Frequency (%)
(not preserved) | 6 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 67865 | 91.6%
Common | 5048 | 6.8%
Latin | 1169 | 1.6%

Most frequent character per script

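Hangul
Value | Count | Frequency (%)
(not preserved) | 2572 | 3.8%
(not preserved) | 2552 | 3.8%
(not preserved) | 2330 | 3.4%
(not preserved) | 1922 | 2.8%
(not preserved) | 1745 | 2.6%
(not preserved) | 1703 | 2.5%
(not preserved) | 1494 | 2.2%
(not preserved) | 1433 | 2.1%
(not preserved) | 1420 | 2.1%
(not preserved) | 1379 | 2.0%
Other values (381) | 49315 | 72.7%

Latin
Value | Count | Frequency (%)
e | 184 | 15.7%
C | 137 | 11.7%
S | 130 | 11.1%
K | 109 | 9.3%
D | 87 | 7.4%
M | 87 | 7.4%
L | 56 | 4.8%
I | 51 | 4.4%
H | 43 | 3.7%
E | 42 | 3.6%
Other values (19) | 243 | 20.8%

Common
Value | Count | Frequency (%)
1 | 1087 | 21.5%
2 | 1048 | 20.8%
(space) | 874 | 17.3%
3 | 487 | 9.6%
4 | 250 | 5.0%
5 | 204 | 4.0%
6 | 156 | 3.1%
( | 149 | 3.0%
) | 149 | 3.0%
7 | 135 | 2.7%
Other values (6) | 509 | 10.1%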

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 67865 | 91.6%
ASCII | 6211 | 8.4%
Number Forms | 6 | < 0.1%
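
Python's `unicodedata` module does not expose Unicode blocks, so a block lookup has to be built from code point ranges. The sketch below is a hypothetical, abbreviated lookup covering only the three blocks named in this table. It assumes the report's "ASCII" corresponds to Basic Latin (U+0000 to U+007F) and its "Hangul" to Hangul Syllables (U+AC00 to U+D7AF).

```python
# Hypothetical, abbreviated block table: only the blocks this report mentions.
BLOCKS = [
    (0x0000, 0x007F, "ASCII"),         # Basic Latin
    (0x2150, 0x218F, "Number Forms"),  # e.g. Roman numeral characters
    (0xAC00, 0xD7AF, "Hangul"),        # Hangul Syllables
]

def block_of(ch):
    """Return the block label for a single character, or 'Unknown'."""
    cp = ord(ch)
    for low, high, name in BLOCKS:
        if low <= cp <= high:
            return name
    return "Unknown"

print([block_of(ch) for ch in "e편한세상"])
```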

Most frequent character per block

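Hangul
Value | Count | Frequency (%)
(not preserved) | 2572 | 3.8%
(not preserved) | 2552 | 3.8%
(not preserved) | 2330 | 3.4%
(not preserved) | 1922 | 2.8%
(not preserved) | 1745 | 2.6%
(not preserved) | 1703 | 2.5%
(not preserved) | 1494 | 2.2%
(not preserved) | 1433 | 2.1%
(not preserved) | 1420 | 2.1%
(not preserved) | 1379 | 2.0%
Other values (381) | 49315 | 72.7%

ASCII
Value | Count | Frequency (%)
1 | 1087 | 17.5%
2 | 1048 | 16.9%
(space) | 874 | 14.1%
3 | 487 | 7.8%
4 | 250 | 4.0%
5 | 204 | 3.3%
e | 184 | 3.0%
6 | 156 | 2.5%
( | 149 | 2.4%
) | 149 | 2.4%
Other values (34) | 1623 | 26.1%

Number Forms
Value | Count | Frequency (%)
(not preserved) | 6 | 100.0%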

아파트코드 (apartment code)
Text

Distinct: 2245
Distinct (%): 22.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 124
Unique (%): 1.2%

Sample

1st row: A15681501
2nd row: A13880407
3rd row: A15210211
4th row: A15807311
5th row: A10025729

Value | Count | Frequency (%)
a13822004 | 13 | 0.1%
a15721006 | 12 | 0.1%
a15007201 | 12 | 0.1%
a15603203 | 12 | 0.1%
a13922910 | 11 | 0.1%
a12104005 | 11 | 0.1%
a12079501 | 11 | 0.1%
a12007001 | 11 | 0.1%
a13002002 | 11 | 0.1%
a10026207 | 11 | 0.1%
Other values (2235) | 9885 | 98.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 18438 | 20.5%
1 | 17509 | 19.5%
A | 9989 | 11.1%
3 | 8920 | 9.9%
2 | 8289 | 9.2%
5 | 6241 | 6.9%
8 | 5590 | 6.2%
7 | 4675 | 5.2%
4 | 3984 | 4.4%
6 | 3304 | 3.7%
Other values (2) | 3061 | 3.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18438 | 23.0%
1 | 17509 | 21.9%
3 | 8920 | 11.2%
2 | 8289 | 10.4%
5 | 6241 | 7.8%
8 | 5590 | 7.0%
7 | 4675 | 5.8%
4 | 3984 | 5.0%
6 | 3304 | 4.1%
9 | 3050 | 3.8%

Uppercase Letter
Value | Count | Frequency (%)
A | 9989 | 99.9%
B | 11 | 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 88.9%
Latin | 10000 | 11.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18438 | 23.0%
1 | 17509 | 21.9%
3 | 8920 | 11.2%
2 | 8289 | 10.4%
5 | 6241 | 7.8%
8 | 5590 | 7.0%
7 | 4675 | 5.8%
4 | 3984 | 5.0%
6 | 3304 | 4.1%
9 | 3050 | 3.8%

Latin
Value | Count | Frequency (%)
A | 9989 | 99.9%
B | 11 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18438 | 20.5%
1 | 17509 | 19.5%
A | 9989 | 11.1%
3 | 8920 | 9.9%
2 | 8289 | 9.2%
5 | 6241 | 6.9%
8 | 5590 | 6.2%
7 | 4675 | 5.2%
4 | 3984 | 4.4%
6 | 3304 | 3.7%
Other values (2) | 3061 | 3.4%

비용명 (expense item name)
Text

Distinct: 76
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 10
Mean length: 5.9868
Min length: 2

Characters and Unicode

Total characters: 59868
Distinct characters: 107
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 비품감가상각누계액
2nd row: 예수금
3rd row: 가수금
4th row: 경비비충당부채
5th row: 미처분이익잉여금

Value | Count | Frequency (%)
예수금 | 333 | 3.3%
당기순이익 | 328 | 3.3%
관리비미수금 | 320 | 3.2%
미처분이익잉여금 | 315 | 3.1%
비품 | 313 | 3.1%
연차수당충당부채 | 311 | 3.1%
예금 | 310 | 3.1%
선급비용 | 307 | 3.1%
퇴직급여충당부채 | 302 | 3.0%
미부과관리비 | 298 | 3.0%
Other values (66) | 6863 | 68.6%

Most occurring characters

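Value | Count | Frequency (%)
(not preserved) | 4479 | 7.5%
(not preserved) | 3841 | 6.4%
(not preserved) | 3117 | 5.2%
(not preserved) | 3086 | 5.2%
(not preserved) | 2996 | 5.0%
(not preserved) | 2963 | 4.9%
(not preserved) | 2649 | 4.4%
(not preserved) | 2559 | 4.3%
(not preserved) | 1846 | 3.1%
(not preserved) | 1690 | 2.8%
Other values (97) | 30642 | 51.2%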

Most occurring categories

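Value | Count | Frequency (%)
Other Letter | 59868 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 4479 | 7.5%
(not preserved) | 3841 | 6.4%
(not preserved) | 3117 | 5.2%
(not preserved) | 3086 | 5.2%
(not preserved) | 2996 | 5.0%
(not preserved) | 2963 | 4.9%
(not preserved) | 2649 | 4.4%
(not preserved) | 2559 | 4.3%
(not preserved) | 1846 | 3.1%
(not preserved) | 1690 | 2.8%
Other values (97) | 30642 | 51.2%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 59868 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 4479 | 7.5%
(not preserved) | 3841 | 6.4%
(not preserved) | 3117 | 5.2%
(not preserved) | 3086 | 5.2%
(not preserved) | 2996 | 5.0%
(not preserved) | 2963 | 4.9%
(not preserved) | 2649 | 4.4%
(not preserved) | 2559 | 4.3%
(not preserved) | 1846 | 3.1%
(not preserved) | 1690 | 2.8%
Other values (97) | 30642 | 51.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 59868 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not preserved) | 4479 | 7.5%
(not preserved) | 3841 | 6.4%
(not preserved) | 3117 | 5.2%
(not preserved) | 3086 | 5.2%
(not preserved) | 2996 | 5.0%
(not preserved) | 2963 | 4.9%
(not preserved) | 2649 | 4.4%
(not preserved) | 2559 | 4.3%
(not preserved) | 1846 | 3.1%
(not preserved) | 1690 | 2.8%
Other values (97) | 30642 | 51.2%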

년월일 (date)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
202212 | 10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202212
2nd row: 202212
3rd row: 202212
4th row: 202212
5th row: 202212

Common Values

Value | Count | Frequency (%)
202212 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
202212 | 10000 | 100.0%

금액
Real number (ℝ)

ZEROS 

Distinct: 7209
Distinct (%): 72.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 77109353
Minimum: -5.3920078 × 10^8
Maximum: 8.0600941 × 10^9
Zeros: 2465
Zeros (%): 24.6%
Negative: 349
Negative (%): 3.5%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -5.3920078 × 10^8
5-th percentile: 0
Q1: 0
Median: 3186842
Q3: 39642525
95-th percentile: 3.7252641 × 10^8
Maximum: 8.0600941 × 10^9
Range: 8.5992949 × 10^9
Interquartile range (IQR): 39642525
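
Range and IQR are simple differences of the other quantile statistics. The sketch below recomputes them, using the exact minimum and maximum that the report lists among its extreme values for 금액:

```python
# Exact extremes as listed in the report's extreme-value tables for 금액.
minimum = -539_200_781
maximum = 8_060_094_086
q1 = 0
q3 = 39_642_525

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # IQR = Q3 - Q1

print(f"Range: {value_range:.7e}")
print(f"IQR: {iqr}")
```

Because Q1 is 0 here, the IQR equals Q3.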

Descriptive statistics

Standard deviation: 2.8592909 × 10^8
Coefficient of variation (CV): 3.7080986
Kurtosis: 162.39727
Mean: 77109353
Median Absolute Deviation (MAD): 3186842
Skewness: 10.295164
Sum: 7.7109353 × 10^11
Variance: 8.1755443 × 10^16
Monotonicity: Not monotonic
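
Several of these statistics are tied together by standard identities: CV is the standard deviation divided by the mean, variance is the squared standard deviation, and the sum is the mean times the number of observations. A short check with the rounded figures from the report (small discrepancies in the last digits are expected from that rounding):

```python
mean = 77_109_353
std = 2.8592909e8
n_observations = 10_000

cv = std / mean                # coefficient of variation
variance = std ** 2            # variance is the square of the standard deviation
total = mean * n_observations  # sum of all values

print(f"CV: {cv:.4f}")
print(f"Variance: {variance:.4e}")
print(f"Sum: {total:.7e}")
```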
Histogram with fixed size bins (bins=50)
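
A fixed-size-bin histogram splits the interval from the minimum to the maximum into equally wide bins and counts the values in each. The following is a minimal stdlib sketch of that idea, not the plotting code used by the report. The function name and the choice to place the maximum in the last bin are assumptions.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equally wide intervals spanning [min, max]."""
    low, high = min(values), max(values)
    counts = [0] * bins
    if low == high:
        # Degenerate case: every value is identical, so use the first bin.
        counts[0] = len(values)
        return counts
    width = (high - low) / bins
    for v in values:
        # The maximum would land at index `bins`; clamp it into the last bin.
        idx = min(int((v - low) / width), bins - 1)
        counts[idx] += 1
    return counts

example = fixed_width_histogram([0, 0, 1, 5, 10], bins=5)
print(example)
```

With a heavily right-skewed variable such as 금액, most observations fall into the first few bins, which is why the histogram is dominated by values near zero.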
Value | Count | Frequency (%)
0 | 2465 | 24.6%
500000 | 26 | 0.3%
300000 | 21 | 0.2%
250000 | 17 | 0.2%
484000 | 14 | 0.1%
55000 | 12 | 0.1%
242000 | 11 | 0.1%
100000 | 11 | 0.1%
2000000 | 10 | 0.1%
1000000 | 9 | 0.1%
Other values (7199) | 7404 | 74.0%

Minimum 10 values
Value | Count | Frequency (%)
-539200781 | 1 | < 0.1%
-389001283 | 1 | < 0.1%
-253309086 | 1 | < 0.1%
-207330046 | 1 | < 0.1%
-189911240 | 1 | < 0.1%
-178668590 | 1 | < 0.1%
-146927715 | 1 | < 0.1%
-144587080 | 1 | < 0.1%
-126305200 | 1 | < 0.1%
-109453760 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
8060094086 | 1 | < 0.1%
5702598583 | 1 | < 0.1%
5513147342 | 1 | < 0.1%
5230883774 | 1 | < 0.1%
4671473816 | 1 | < 0.1%
4461687602 | 1 | < 0.1%
4439795074 | 1 | < 0.1%
4356268262 | 1 | < 0.1%
4284746163 | 1 | < 0.1%
4140213046 | 1 | < 0.1%

Interactions


Correlations

 | 비용명 | 금액
비용명 | 1.000 | 0.534
금액 | 0.534 | 1.000
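
The report does not state which coefficient produced the 0.534 between 비용명 and 금액. One common measure of association between a categorical variable and a binned numeric one is Cramér's V. The sketch below is an uncorrected textbook version under that assumption, with numeric values already discretised into labels, and is not necessarily what the report computed.

```python
from collections import Counter
from math import sqrt

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    for x, row_n in row_totals.items():
        for y, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Toy data: the category fully determines the (binned) amount.
perfect = cramers_v(["a", "a", "b", "b"], ["low", "low", "high", "high"])
# Toy data: category and amount bin are independent.
independent = cramers_v(["a", "a", "b", "b"], ["low", "high", "low", "high"])
print(perfect, independent)
```

The value ranges from 0 (no association) to 1 (one variable fully determines the other).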

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
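
The numbers behind a nullity chart are per-column counts of missing cells. A minimal sketch, assuming rows are dictionaries and that `None` or an empty string marks a missing cell (both are assumptions for this example). The two rows are taken from the sample rows in this report.

```python
def missing_per_column(rows, columns):
    """Count cells that are None or an empty string in each column."""
    return {
        col: sum(1 for row in rows if row.get(col) in (None, ""))
        for col in columns
    }

columns = ["아파트명", "아파트코드", "비용명", "년월일", "금액"]
rows = [
    {"아파트명": "가락금호", "아파트코드": "A13880407", "비용명": "예수금",
     "년월일": "202212", "금액": 3659510},
    {"아파트명": "옥수삼성", "아파트코드": "A13375902", "비용명": "예금",
     "년월일": "202212", "금액": 332722866},
]
missing = missing_per_column(rows, columns)
print(missing)
```

A zero in 금액 is a real value rather than a missing cell, which is why the report lists zeros and missing cells separately.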

Sample

First rows
Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
63768 | 사당4-3우성 | A15681501 | 비품감가상각누계액 | 202212 | -39280250
41348 | 가락금호 | A13880407 | 예수금 | 202212 | 3659510
58370 | 오류금강수목원 | A15210211 | 가수금 | 202212 | 330540
69014 | 신트리3단지 | A15807311 | 경비비충당부채 | 202212 | 100634030
4082 | 목동파크자이아파트 | A10025729 | 미처분이익잉여금 | 202212 | 0
2889 | 고덕롯데캐슬베네루체 | A10025112 | 가지급금 | 202212 | 50
41054 | 송파더센트레아파트 | A13876113 | 기타유형자산감가상각누계액 | 202212 | -2679770
24745 | 옥수삼성 | A13375902 | 예금 | 202212 | 332722866
2714 | 휘경 해모로 프레스티지 아파트 | A10025015 | 퇴직급여충당예금 | 202212 | 0
21015 | 방학삼성래미안2단지 | A13202103 | 선급비용 | 202212 | 16714150

Last rows
Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
55032 | 신길남서울 | A15085805 | 미지급금 | 202212 | 37380790
22395 | 도봉파크빌2단지 | A13275303 | 선수전기료 | 202212 | 1699800
10509 | DMC휴먼빌 | A12013001 | 가지급금 | 202212 | 0
23654 | 하왕금호베스트빌 | A13302204 | 선수전기료 | 202212 | 2747390
19801 | 묵동신안2차 | A13185502 | 비품 | 202212 | 3371370
22007 | 창동대동 | A13204501 | 기타유동부채 | 202212 | 219000
21487 | 쌍문금호1차아파트 | A13203408 | 현금 | 202212 | 59922
68063 | 목동한신청구 | A15805002 | 가지급금 | 202212 | 0
27001 | 명일삼환아파트 | A13407202 | 당기순이익 | 202212 | 10005491
45966 | 상계한신 | A13983608 | 퇴직급여충당부채 | 202212 | 59467094