Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
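
The size figures above follow from one another by simple arithmetic. As a quick sanity check (a sketch that assumes 1 KiB = 1024 bytes; the variable names are illustrative and not part of the report):

```python
# Figures copied from the dataset statistics above.
n_variables = 5
n_observations = 10_000
missing_cells = 0
total_size_kib = 488.3

# Total number of cells and the share of them that is missing.
n_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / n_cells

# Average record size in bytes, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"cells: {n_cells}, missing: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```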

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15820/S/1/datasetView.do

Alerts

년월일 has constant value "202104" (Constant)
금액 has 2131 (21.3%) zeros (Zeros)
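
The two alerts can be reproduced by hand on any tabular data. The sketch below uses plain Python and five rows copied from the Sample section of this report; `constant_columns` and `zero_share` are hypothetical helper names, not ydata-profiling functions, and the tool's own alert logic may differ.

```python
# Five rows copied from the report's Sample section (only two columns kept).
rows = [
    {"년월일": "202104", "금액": -11591720},
    {"년월일": "202104", "금액": 78000},
    {"년월일": "202104", "금액": 0},
    {"년월일": "202104", "금액": 8750800},
    {"년월일": "202104", "금액": 0},
]

def constant_columns(rows):
    """Return the columns that hold a single distinct value."""
    columns = list(rows[0].keys())
    return [c for c in columns if len({r[c] for r in rows}) == 1]

def zero_share(rows, column):
    """Return the fraction of rows whose value in `column` equals zero."""
    return sum(1 for r in rows if r[column] == 0) / len(rows)

print("constant:", constant_columns(rows))
print("zero share of 금액:", zero_share(rows, "금액"))
```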

Reproduction

Analysis started: 2024-05-11 05:59:14.906696
Analysis finished: 2024-05-11 05:59:15.689167
Duration: 0.78 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명 (apartment name)
Text

Distinct: 2227
Distinct (%): 22.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 21
Median length: 19
Mean length: 7.3212
Min length: 2
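
Length statistics like these summarize per-value character counts over the column. A minimal sketch over the five sample values of this variable, assuming lengths are counted in Unicode code points (as Python's `len` does); it also checks that the reported mean length times the number of observations matches the total character count reported for this variable (73212).

```python
import statistics

# Sample values copied from this variable's Sample section.
samples = ["논현신동아", "독산동한양수자인아파트", "이촌동부센트레빌", "쌍문한양5차", "구로보광"]

# len() counts code points; each precomposed Hangul syllable counts as one.
lengths = [len(value) for value in samples]

print("max:", max(lengths))
print("median:", statistics.median(lengths))
print("mean:", statistics.mean(lengths))
print("min:", min(lengths))

# Consistency check on the reported figures: mean length x observations.
reported_total_chars = round(7.3212 * 10_000)
print("total characters implied by the mean:", reported_total_chars)
```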

Characters and Unicode

Total characters: 73212
Distinct characters: 435
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
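
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point ("Lo" is Other Letter, "Nd" is Decimal Number). The sketch below counts categories over the five sample values of this variable; the standard library does not expose script or block, so only categories are shown.

```python
import unicodedata
from collections import Counter

# Sample values copied from this variable's Sample section.
samples = ["논현신동아", "독산동한양수자인아파트", "이촌동부센트레빌", "쌍문한양5차", "구로보광"]

# Count characters by Unicode general category across all sample values.
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)

for category, count in category_counts.most_common():
    print(category, count)
```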

Unique

Unique: 128
Unique (%): 1.3%
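
In this report, Distinct counts how many different values occur, while Unique appears to count only the values that occur exactly once, which is why it is much smaller. A toy sketch of the distinction, using a hypothetical list built from sample names in this report:

```python
from collections import Counter

# Hypothetical toy column: sample names repeated for illustration only.
values = ["구로보광", "논현신동아", "구로보광", "쌍문한양5차", "논현신동아", "이촌동부센트레빌"]

counts = Counter(values)

# Distinct: number of different values.
n_distinct = len(counts)
# Unique: number of values that occur exactly once.
n_unique = sum(1 for c in counts.values() if c == 1)

print("distinct:", n_distinct, "unique:", n_unique)
```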

Sample

1st row: 논현신동아
2nd row: 독산동한양수자인아파트
3rd row: 이촌동부센트레빌
4th row: 쌍문한양5차
5th row: 구로보광
Value | Count | Frequency (%)
아파트 | 171 | 1.6%
래미안 | 30 | 0.3%
아이파크 | 19 | 0.2%
해모로 | 18 | 0.2%
은평뉴타운상림마을6단지 | 18 | 0.2%
브라운스톤 | 17 | 0.2%
e편한세상 | 16 | 0.1%
북한산 | 15 | 0.1%
고덕 | 13 | 0.1%
신대방현대 | 13 | 0.1%
Other values (2294) | 10338 | 96.9%
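
A frequency table of this shape (the ten most common values plus an "Other values" row) can be sketched with `collections.Counter`. The helper name `top_values`, the whitespace tokenization, and the lower-casing are assumptions for illustration, not the tool's documented behaviour; the input list is a hypothetical toy example built from names that appear in this report.

```python
from collections import Counter

def top_values(values, k=10):
    """Return (value, count, percent) rows for the k most frequent values,
    followed by an 'Other values (n)' row when anything is left over."""
    counts = Counter(values)
    total = sum(counts.values())
    top = counts.most_common(k)
    rows = [(value, count, 100 * count / total) for value, count in top]
    other_distinct = len(counts) - len(top)
    other_count = total - sum(count for _, count in top)
    if other_distinct:
        label = f"Other values ({other_distinct})"
        rows.append((label, other_count, 100 * other_count / total))
    return rows

# Hypothetical toy input, split into lower-cased whitespace-separated tokens.
names = ["위례 송파푸르지오", "구로보광", "구로보광"]
tokens = [token for name in names for token in name.lower().split()]

rows = top_values(tokens, k=1)
for row in rows:
    print(row)
```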

Most occurring characters

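Value | Count | Frequency (%)
(not preserved) | 2580 | 3.5%
(not preserved) | 2493 | 3.4%
(not preserved) | 2231 | 3.0%
(not preserved) | 1827 | 2.5%
(not preserved) | 1805 | 2.5%
(not preserved) | 1674 | 2.3%
(not preserved) | 1506 | 2.1%
(not preserved) | 1461 | 2.0%
(not preserved) | 1440 | 2.0%
(not preserved) | 1302 | 1.8%
Other values (425) | 54893 | 75.0%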

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 67020 | 91.5%
Decimal Number | 3794 | 5.2%
Space Separator | 755 | 1.0%
Uppercase Letter | 716 | 1.0%
Lowercase Letter | 347 | 0.5%
Open Punctuation | 161 | 0.2%
Close Punctuation | 161 | 0.2%
Dash Punctuation | 128 | 0.2%
Other Punctuation | 126 | 0.2%
Letter Number | 4 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 2580 | 3.8%
(not preserved) | 2493 | 3.7%
(not preserved) | 2231 | 3.3%
(not preserved) | 1827 | 2.7%
(not preserved) | 1805 | 2.7%
(not preserved) | 1674 | 2.5%
(not preserved) | 1506 | 2.2%
(not preserved) | 1461 | 2.2%
(not preserved) | 1440 | 2.1%
(not preserved) | 1302 | 1.9%
Other values (380) | 48701 | 72.7%

Uppercase Letter
Value | Count | Frequency (%)
S | 123 | 17.2%
C | 87 | 12.2%
K | 81 | 11.3%
D | 62 | 8.7%
M | 62 | 8.7%
L | 56 | 7.8%
H | 46 | 6.4%
I | 36 | 5.0%
G | 35 | 4.9%
E | 26 | 3.6%
Other values (7) | 102 | 14.2%

Lowercase Letter
Value | Count | Frequency (%)
e | 185 | 53.3%
i | 32 | 9.2%
l | 32 | 9.2%
v | 26 | 7.5%
s | 18 | 5.2%
k | 18 | 5.2%
w | 11 | 3.2%
c | 10 | 2.9%
g | 5 | 1.4%
h | 5 | 1.4%

Decimal Number
Value | Count | Frequency (%)
1 | 1182 | 31.2%
2 | 1090 | 28.7%
3 | 486 | 12.8%
4 | 263 | 6.9%
5 | 222 | 5.9%
6 | 169 | 4.5%
7 | 103 | 2.7%
9 | 98 | 2.6%
0 | 96 | 2.5%
8 | 85 | 2.2%

Other Punctuation
Value | Count | Frequency (%)
, | 96 | 76.2%
. | 30 | 23.8%

Space Separator
Value | Count | Frequency (%)
(space) | 755 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 161 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 161 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 128 | 100.0%

Letter Number
Value | Count | Frequency (%)
(not preserved) | 4 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 67020 | 91.5%
Common | 5125 | 7.0%
Latin | 1067 | 1.5%

Most frequent character per script

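Hangul
Value | Count | Frequency (%)
(not preserved) | 2580 | 3.8%
(not preserved) | 2493 | 3.7%
(not preserved) | 2231 | 3.3%
(not preserved) | 1827 | 2.7%
(not preserved) | 1805 | 2.7%
(not preserved) | 1674 | 2.5%
(not preserved) | 1506 | 2.2%
(not preserved) | 1461 | 2.2%
(not preserved) | 1440 | 2.1%
(not preserved) | 1302 | 1.9%
Other values (380) | 48701 | 72.7%

Latin
Value | Count | Frequency (%)
e | 185 | 17.3%
S | 123 | 11.5%
C | 87 | 8.2%
K | 81 | 7.6%
D | 62 | 5.8%
M | 62 | 5.8%
L | 56 | 5.2%
H | 46 | 4.3%
I | 36 | 3.4%
G | 35 | 3.3%
Other values (19) | 294 | 27.6%

Common
Value | Count | Frequency (%)
1 | 1182 | 23.1%
2 | 1090 | 21.3%
(space) | 755 | 14.7%
3 | 486 | 9.5%
4 | 263 | 5.1%
5 | 222 | 4.3%
6 | 169 | 3.3%
( | 161 | 3.1%
) | 161 | 3.1%
- | 128 | 2.5%
Other values (6) | 508 | 9.9%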

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 67020 | 91.5%
ASCII | 6188 | 8.5%
Number Forms | 4 | < 0.1%

Most frequent character per block

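Hangul
Value | Count | Frequency (%)
(not preserved) | 2580 | 3.8%
(not preserved) | 2493 | 3.7%
(not preserved) | 2231 | 3.3%
(not preserved) | 1827 | 2.7%
(not preserved) | 1805 | 2.7%
(not preserved) | 1674 | 2.5%
(not preserved) | 1506 | 2.2%
(not preserved) | 1461 | 2.2%
(not preserved) | 1440 | 2.1%
(not preserved) | 1302 | 1.9%
Other values (380) | 48701 | 72.7%

ASCII
Value | Count | Frequency (%)
1 | 1182 | 19.1%
2 | 1090 | 17.6%
(space) | 755 | 12.2%
3 | 486 | 7.9%
4 | 263 | 4.3%
5 | 222 | 3.6%
e | 185 | 3.0%
6 | 169 | 2.7%
( | 161 | 2.6%
) | 161 | 2.6%
Other values (34) | 1514 | 24.5%

Number Forms
Value | Count | Frequency (%)
(not preserved) | 4 | 100.0%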

아파트코드 (apartment code)
Text

Distinct: 2232
Distinct (%): 22.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 128
Unique (%): 1.3%

Sample

1st row: A13501004
2nd row: A15370301
3rd row: A14003004
4th row: A13286105
5th row: A15285503
Value | Count | Frequency (%)
a13606201 | 13 | 0.1%
a15601105 | 13 | 0.1%
a10045601 | 12 | 0.1%
a10027817 | 12 | 0.1%
a13522006 | 12 | 0.1%
a11081503 | 12 | 0.1%
a13583402 | 11 | 0.1%
a15785710 | 11 | 0.1%
a15205103 | 11 | 0.1%
a13184401 | 11 | 0.1%
Other values (2222) | 9882 | 98.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 18370 | 20.4%
1 | 17704 | 19.7%
A | 9989 | 11.1%
3 | 8797 | 9.8%
2 | 8286 | 9.2%
5 | 6323 | 7.0%
8 | 5493 | 6.1%
7 | 4849 | 5.4%
4 | 3894 | 4.3%
6 | 3294 | 3.7%
Other values (2) | 3001 | 3.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18370 | 23.0%
1 | 17704 | 22.1%
3 | 8797 | 11.0%
2 | 8286 | 10.4%
5 | 6323 | 7.9%
8 | 5493 | 6.9%
7 | 4849 | 6.1%
4 | 3894 | 4.9%
6 | 3294 | 4.1%
9 | 2990 | 3.7%

Uppercase Letter
Value | Count | Frequency (%)
A | 9989 | 99.9%
B | 11 | 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 88.9%
Latin | 10000 | 11.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18370 | 23.0%
1 | 17704 | 22.1%
3 | 8797 | 11.0%
2 | 8286 | 10.4%
5 | 6323 | 7.9%
8 | 5493 | 6.9%
7 | 4849 | 6.1%
4 | 3894 | 4.9%
6 | 3294 | 4.1%
9 | 2990 | 3.7%

Latin
Value | Count | Frequency (%)
A | 9989 | 99.9%
B | 11 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18370 | 20.4%
1 | 17704 | 19.7%
A | 9989 | 11.1%
3 | 8797 | 9.8%
2 | 8286 | 9.2%
5 | 6323 | 7.0%
8 | 5493 | 6.1%
7 | 4849 | 5.4%
4 | 3894 | 4.3%
6 | 3294 | 3.7%
Other values (2) | 3001 | 3.3%

비용명 (expense item name)
Text

Distinct: 77
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 10
Mean length: 5.9292
Min length: 2

Characters and Unicode

Total characters: 59292
Distinct characters: 107
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

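1st row: 비품감가상각누계액 (accumulated depreciation on equipment)
2nd row: 저장품 (supplies)
3rd row: 주차장충당예금 (parking lot reserve deposit)
4th row: 공동체활성화단체지원적립금 (community activation group support reserve)
5th row: 미처분이익잉여금 (unappropriated retained earnings)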
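Value | Count | Frequency (%)
당기순이익 (net income for the period) | 341 | 3.4%
미처분이익잉여금 (unappropriated retained earnings) | 335 | 3.4%
퇴직급여충당부채 (provision for severance benefits) | 326 | 3.3%
연차수당충당부채 (provision for annual leave pay) | 323 | 3.2%
예금 (deposits) | 323 | 3.2%
관리비미수금 (maintenance fees receivable) | 313 | 3.1%
예수금 (withholdings) | 312 | 3.1%
선급비용 (prepaid expenses) | 309 | 3.1%
장기수선충당부채 (provision for long-term repairs) | 302 | 3.0%
현금 (cash) | 300 | 3.0%
Other values (67) | 6816 | 68.2%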

Most occurring characters

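Value | Count | Frequency (%)
(not preserved) | 4676 | 7.9%
(not preserved) | 3824 | 6.4%
(not preserved) | 3146 | 5.3%
(not preserved) | 2984 | 5.0%
(not preserved) | 2934 | 4.9%
(not preserved) | 2903 | 4.9%
(not preserved) | 2619 | 4.4%
(not preserved) | 2461 | 4.2%
(not preserved) | 1871 | 3.2%
(not preserved) | 1708 | 2.9%
Other values (97) | 30166 | 50.9%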

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 59292 | 100.0%

Most frequent character per category

Other Letter
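Value | Count | Frequency (%)
(not preserved) | 4676 | 7.9%
(not preserved) | 3824 | 6.4%
(not preserved) | 3146 | 5.3%
(not preserved) | 2984 | 5.0%
(not preserved) | 2934 | 4.9%
(not preserved) | 2903 | 4.9%
(not preserved) | 2619 | 4.4%
(not preserved) | 2461 | 4.2%
(not preserved) | 1871 | 3.2%
(not preserved) | 1708 | 2.9%
Other values (97) | 30166 | 50.9%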

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 59292 | 100.0%

Most frequent character per script

Hangul
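Value | Count | Frequency (%)
(not preserved) | 4676 | 7.9%
(not preserved) | 3824 | 6.4%
(not preserved) | 3146 | 5.3%
(not preserved) | 2984 | 5.0%
(not preserved) | 2934 | 4.9%
(not preserved) | 2903 | 4.9%
(not preserved) | 2619 | 4.4%
(not preserved) | 2461 | 4.2%
(not preserved) | 1871 | 3.2%
(not preserved) | 1708 | 2.9%
Other values (97) | 30166 | 50.9%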

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 59292 | 100.0%

Most frequent character per block

Hangul
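Value | Count | Frequency (%)
(not preserved) | 4676 | 7.9%
(not preserved) | 3824 | 6.4%
(not preserved) | 3146 | 5.3%
(not preserved) | 2984 | 5.0%
(not preserved) | 2934 | 4.9%
(not preserved) | 2903 | 4.9%
(not preserved) | 2619 | 4.4%
(not preserved) | 2461 | 4.2%
(not preserved) | 1871 | 3.2%
(not preserved) | 1708 | 2.9%
Other values (97) | 30166 | 50.9%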

년월일 (year-month-day)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
202104: 10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202104
2nd row: 202104
3rd row: 202104
4th row: 202104
5th row: 202104

Common Values

Value | Count | Frequency (%)
202104 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
202104 | 10000 | 100.0%

금액 (amount)
Real number (ℝ)

ZEROS 

Distinct: 7544
Distinct (%): 75.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 70793966
Minimum: -4.3086778 × 10^8
Maximum: 7.7812863 × 10^9
Zeros: 2131
Zeros (%): 21.3%
Negative: 300
Negative (%): 3.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -4.3086778 × 10^8
5-th percentile: 0
Q1: 6940.5
median: 3037613.5
Q3: 32334470
95-th percentile: 3.5975952 × 10^8
Maximum: 7.7812863 × 10^9
Range: 8.2121541 × 10^9
Interquartile range (IQR): 32327530
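
The last two rows are derived from the others: the range is maximum minus minimum, and the IQR is Q3 minus Q1. A small check using the exact extremes listed in this variable's extreme-value tables (the variable names are illustrative):

```python
# Exact extremes and quartiles as they appear elsewhere in this report.
minimum = -430867776
maximum = 7781286294
q1 = 6940.5
q3 = 32334470

# Range: distance between the largest and the smallest value.
value_range = maximum - minimum
# Interquartile range: spread of the middle half of the data.
iqr = q3 - q1

print(f"range = {value_range:.7e}")
print(f"IQR   = {iqr}")
```

The report shows these figures rounded to eight significant digits, which is why the IQR appears as 32327530.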

Descriptive statistics

Standard deviation: 2.6569723 × 10^8
Coefficient of variation (CV): 3.7531056
Kurtosis: 176.01228
Mean: 70793966
Median Absolute Deviation (MAD): 3037613.5
Skewness: 10.410688
Sum: 7.0793966 × 10^11
Variance: 7.0595018 × 10^16
Monotonicity: Not monotonic
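
Several of these statistics are tied together by definition: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. A rough consistency check on the rounded figures (a sketch; small discrepancies from rounding are expected):

```python
# Rounded figures copied from the report.
mean = 70793966
std = 2.6569723e8
n_observations = 10_000

cv = std / mean                 # coefficient of variation
variance = std ** 2             # variance as squared standard deviation
total = mean * n_observations   # sum of all values

print(f"CV       = {cv:.7f}")
print(f"variance = {variance:.7e}")
print(f"sum      = {total:.7e}")
```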
Histogram with fixed size bins (bins=50)
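
With fixed-size bins, the span between minimum and maximum is divided into equally wide intervals and each value is counted in the interval it falls into. The sketch below is one plain-Python way to do this; `fixed_width_histogram` is a hypothetical helper, not the plotting code used by the report, and it is demonstrated on five 금액 values from the Sample section.

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equally wide intervals between min and max.

    Assumes at least two different values, so that the bin width is positive.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum is placed in the last bin instead of opening a new one.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return width, counts

# Five 금액 values copied from the report's Sample section.
amounts = [-11591720, 78000, 0, 8750800, 0]
width, counts = fixed_width_histogram(amounts, bins=2)
print(width, counts)

# Bin width implied by the report's extremes and bins=50.
report_bin_width = (7781286294 - (-430867776)) / 50
print(report_bin_width)
```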
Value | Count | Frequency (%)
0 | 2131 | 21.3%
500000 | 21 | 0.2%
250000 | 20 | 0.2%
1000000 | 12 | 0.1%
20000000 | 11 | 0.1%
300000 | 11 | 0.1%
30000000 | 10 | 0.1%
2000000 | 9 | 0.1%
3000000 | 9 | 0.1%
200000 | 9 | 0.1%
Other values (7534) | 7757 | 77.6%

Minimum 10 values
Value | Count | Frequency (%)
-430867776 | 1 | < 0.1%
-389001283 | 1 | < 0.1%
-288131647 | 1 | < 0.1%
-268025520 | 1 | < 0.1%
-190422700 | 1 | < 0.1%
-174724570 | 1 | < 0.1%
-155891782 | 1 | < 0.1%
-148478412 | 1 | < 0.1%
-117902189 | 1 | < 0.1%
-115783434 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
7781286294 | 1 | < 0.1%
6339079850 | 1 | < 0.1%
5452830675 | 1 | < 0.1%
4636264353 | 1 | < 0.1%
4237699740 | 1 | < 0.1%
3993049083 | 1 | < 0.1%
3816357211 | 1 | < 0.1%
3544988262 | 1 | < 0.1%
3494880949 | 1 | < 0.1%
3473469573 | 1 | < 0.1%

Interactions


Correlations

 | 비용명 | 금액
비용명 | 1.000 | 0.519
금액 | 0.519 | 1.000
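
The report does not say which coefficient produced 0.519. One measure often used when a categorical variable is involved is Cramér's V, computed from a contingency table after binning the numeric variable. The sketch below is an assumption for illustration, not a statement of what the tool computed: it implements the uncorrected Cramér's V in plain Python, and the labels in the usage lines are hypothetical.

```python
from collections import Counter
from math import sqrt

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long label sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_totals = Counter(xs)
    y_totals = Counter(ys)
    # Chi-squared statistic over every cell of the contingency table.
    chi2 = 0.0
    for x, x_count in x_totals.items():
        for y, y_count in y_totals.items():
            expected = x_count * y_count / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(x_totals), len(y_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Hypothetical labels: the expense item fully determines the amount bin.
perfect = cramers_v(["현금", "현금", "예금", "예금"], ["low", "low", "high", "high"])
# Hypothetical labels: the amount bin is independent of the item.
independent = cramers_v(["현금", "현금", "예금", "예금"], ["low", "high", "low", "high"])
print(perfect, independent)
```

A value of 1 means one variable fully determines the other, and 0 means no association, so 0.519 would indicate a moderate association under this reading.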

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
27702 | 논현신동아 | A13501004 | 비품감가상각누계액 | 202104 | -11591720
60239 | 독산동한양수자인아파트 | A15370301 | 저장품 | 202104 | 78000
47564 | 이촌동부센트레빌 | A14003004 | 주차장충당예금 | 202104 | 0
21800 | 쌍문한양5차 | A13286105 | 공동체활성화단체지원적립금 | 202104 | 8750800
59106 | 구로보광 | A15285503 | 미처분이익잉여금 | 202104 | 0
33747 | 석관코오롱 | A13615002 | 승강기유지비충당부채 | 202104 | 261470
23977 | 행당두산위브아파트 | A13377901 | 현금 | 202104 | 50930
6689 | 위례 송파푸르지오 | A10028086 | 퇴직급여충당부채 | 202104 | 41390216
44803 | 상계주공14단지 | A13981903 | 예수금 | 202104 | 7520100
55973 | 봉천건영6차아파트 | A15176602 | 상여충당부채 | 202104 | 0

Last rows
Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
31609 | 개포7차우성 | A13594403 | 단기보증금 | 202104 | 5300000
18832 | 신내석탑 | A13186503 | 주차장충당예금 | 202104 | 74397232
59638 | 신도림우성3차 | A15288804 | 기타유동부채 | 202104 | 1120000
6337 | 텐즈힐1단지 | A10027920 | 당기순이익 | 202104 | 86340770
30187 | 삼성동중앙하이츠빌리지 | A13550701 | 승강기유지비충당부채 | 202104 | 5948500
58101 | 서울수목원현대홈타운스위트 | A15271601 | 예수금 | 202104 | 1010220
58381 | 개봉거성푸르뫼2차아피트 | A15280303 | 주차장충당부채 | 202104 | 0
64675 | 강서한강자이 | A15720001 | 예수금 | 202104 | 6541535
8834 | 연희대우 | A12011002 | 승강기유지비충당부채 | 202104 | 0
20791 | 창동대동 | A13204501 | 기타당좌자산 | 202104 | 490000