Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15820/S/1/datasetView.do

Alerts

년월일 has constant value "202204" (Constant)
금액 has 2325 (23.2%) zeros (Zeros)
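
The two alerts above can be read as simple per-column rules. The following is a minimal sketch of that idea, not ydata-profiling's own implementation: the function name column_alerts and the 5% zero-share threshold are illustrative assumptions.

```python
def column_alerts(values, zero_share_threshold=0.05):
    """Return alert labels for one column.

    The threshold is a hypothetical default chosen for illustration only.
    """
    alerts = []
    # A column with exactly one distinct value carries no information.
    if len(set(values)) == 1:
        alerts.append("Constant")
    # Only check the zero share when every value is numeric.
    numeric = [v for v in values if isinstance(v, (int, float))]
    if numeric and len(numeric) == len(values):
        zero_share = sum(1 for v in numeric if v == 0) / len(numeric)
        if zero_share > zero_share_threshold:
            alerts.append("Zeros")
    return alerts


constant_alerts = column_alerts(["202204"] * 5)
zero_alerts = column_alerts([0, 0, 33971934, 321550, -2248900])
print(constant_alerts, zero_alerts)
```

With the report's own figures, 2325 zeros out of 10000 rows is a zero share of 0.2325, which any small threshold of this kind would flag.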

Reproduction

Analysis started: 2024-05-11 05:58:04.987598
Analysis finished: 2024-05-11 05:58:06.657244
Duration: 1.67 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명 (apartment name)
Text

Distinct: 2239
Distinct (%): 22.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 20
Mean length: 7.3154
Min length: 2

Characters and Unicode

Total characters: 73154
Distinct characters: 433
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
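
As a rough illustration of the character-property analysis described above, the sketch below counts Unicode general categories with Python's standard unicodedata module. The helper name category_counts is hypothetical, and the two inputs are sample values listed in this report. Script and block lookups are left out because the standard library does not expose them.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in the strings."""
    counts = Counter()
    for value in values:
        # Normalise to NFC so each Hangul syllable is a single code point.
        for ch in unicodedata.normalize("NFC", value):
            counts[unicodedata.category(ch)] += 1
    return counts


samples = ["월계주공2단지", "신정이펜하우스1단지(총세대 기준)"]
counts = category_counts(samples)
for code, n in counts.most_common():
    print(code, n)
```

The two-letter codes correspond to the category names used in this report: Lo is Other Letter, Nd is Decimal Number, Ps and Pe are Open and Close Punctuation, and Zs is Space Separator.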

Unique

Unique: 135
Unique (%): 1.4%
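
Distinct and Unique measure different things: in ydata-profiling's usage, Distinct counts the different values present, while Unique counts the values that occur exactly once. A minimal sketch of the distinction follows; the function name and the short list of codes are made up for illustration.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


# Illustrative input: two codes repeat, two appear once.
codes = ["A13984814", "A15870701", "A12170601", "A12170601", "A13986504", "A13986504"]
distinct, unique = distinct_and_unique(codes)
print(distinct, unique)
```

This is why a column can have thousands of distinct values but only a small number of unique ones: most values in it appear more than once.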

Sample

1st row: 월계주공2단지
2nd row: 신정이펜하우스1단지(총세대 기준)
3rd row: 삼성래미안공덕4차
4th row: 신창세방리버하이빌
5th row: 중계무지개아파트

Value | Count | Frequency (%)
아파트 | 151 | 1.4%
래미안 | 43 | 0.4%
e편한세상 | 22 | 0.2%
아이파크 | 22 | 0.2%
상계수락파크빌 | 15 | 0.1%
마포자이 | 15 | 0.1%
고덕 | 14 | 0.1%
푸르지오 | 14 | 0.1%
경남아너스빌 | 14 | 0.1%
북한산 | 14 | 0.1%
Other values (2319) | 10405 | 97.0%

Most occurring characters

Value | Count | Frequency (%)
 | 2545 | 3.5%
 | 2510 | 3.4%
 | 2324 | 3.2%
 | 1780 | 2.4%
 | 1709 | 2.3%
 | 1665 | 2.3%
 | 1431 | 2.0%
 | 1420 | 1.9%
 | 1383 | 1.9%
 | 1375 | 1.9%
Other values (423) | 55012 | 75.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 67164 | 91.8%
Decimal Number | 3495 | 4.8%
Uppercase Letter | 858 | 1.2%
Space Separator | 802 | 1.1%
Lowercase Letter | 333 | 0.5%
Open Punctuation | 141 | 0.2%
Close Punctuation | 141 | 0.2%
Dash Punctuation | 126 | 0.2%
Other Punctuation | 87 | 0.1%
Letter Number | 7 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 2545 | 3.8%
 | 2510 | 3.7%
 | 2324 | 3.5%
 | 1780 | 2.7%
 | 1709 | 2.5%
 | 1665 | 2.5%
 | 1431 | 2.1%
 | 1420 | 2.1%
 | 1383 | 2.1%
 | 1375 | 2.0%
Other values (378) | 49022 | 73.0%

Uppercase Letter
Value | Count | Frequency (%)
S | 123 | 14.3%
C | 119 | 13.9%
K | 109 | 12.7%
D | 81 | 9.4%
M | 81 | 9.4%
L | 63 | 7.3%
I | 52 | 6.1%
E | 40 | 4.7%
H | 39 | 4.5%
G | 36 | 4.2%
Other values (7) | 115 | 13.4%

Lowercase Letter
Value | Count | Frequency (%)
e | 210 | 63.1%
l | 25 | 7.5%
i | 22 | 6.6%
s | 17 | 5.1%
v | 16 | 4.8%
k | 15 | 4.5%
h | 8 | 2.4%
w | 8 | 2.4%
c | 6 | 1.8%
a | 3 | 0.9%

Decimal Number
Value | Count | Frequency (%)
1 | 1087 | 31.1%
2 | 956 | 27.4%
3 | 443 | 12.7%
4 | 275 | 7.9%
5 | 210 | 6.0%
6 | 144 | 4.1%
7 | 127 | 3.6%
8 | 89 | 2.5%
9 | 83 | 2.4%
0 | 81 | 2.3%

Other Punctuation
Value | Count | Frequency (%)
, | 67 | 77.0%
. | 20 | 23.0%

Space Separator
Value | Count | Frequency (%)
(space) | 802 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 141 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 141 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 126 | 100.0%

Letter Number
Value | Count | Frequency (%)
 | 7 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 67164 | 91.8%
Common | 4792 | 6.6%
Latin | 1198 | 1.6%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 2545 | 3.8%
 | 2510 | 3.7%
 | 2324 | 3.5%
 | 1780 | 2.7%
 | 1709 | 2.5%
 | 1665 | 2.5%
 | 1431 | 2.1%
 | 1420 | 2.1%
 | 1383 | 2.1%
 | 1375 | 2.0%
Other values (378) | 49022 | 73.0%

Latin
Value | Count | Frequency (%)
e | 210 | 17.5%
S | 123 | 10.3%
C | 119 | 9.9%
K | 109 | 9.1%
D | 81 | 6.8%
M | 81 | 6.8%
L | 63 | 5.3%
I | 52 | 4.3%
E | 40 | 3.3%
H | 39 | 3.3%
Other values (19) | 281 | 23.5%

Common
Value | Count | Frequency (%)
1 | 1087 | 22.7%
2 | 956 | 19.9%
(space) | 802 | 16.7%
3 | 443 | 9.2%
4 | 275 | 5.7%
5 | 210 | 4.4%
6 | 144 | 3.0%
( | 141 | 2.9%
) | 141 | 2.9%
7 | 127 | 2.7%
Other values (6) | 466 | 9.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 67164 | 91.8%
ASCII | 5983 | 8.2%
Number Forms | 7 | < 0.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 2545 | 3.8%
 | 2510 | 3.7%
 | 2324 | 3.5%
 | 1780 | 2.7%
 | 1709 | 2.5%
 | 1665 | 2.5%
 | 1431 | 2.1%
 | 1420 | 2.1%
 | 1383 | 2.1%
 | 1375 | 2.0%
Other values (378) | 49022 | 73.0%

ASCII
Value | Count | Frequency (%)
1 | 1087 | 18.2%
2 | 956 | 16.0%
(space) | 802 | 13.4%
3 | 443 | 7.4%
4 | 275 | 4.6%
e | 210 | 3.5%
5 | 210 | 3.5%
6 | 144 | 2.4%
( | 141 | 2.4%
) | 141 | 2.4%
Other values (34) | 1574 | 26.3%

Number Forms
Value | Count | Frequency (%)
 | 7 | 100.0%

아파트코드 (apartment code)
Text

Distinct: 2244
Distinct (%): 22.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 135
Unique (%): 1.4%

Sample

1st row: A13984814
2nd row: A15870701
3rd row: A12170601
4th row: A14006001
5th row: A13986504

Value | Count | Frequency (%)
a13983810 | 15 | 0.1%
a15609305 | 13 | 0.1%
a13986504 | 12 | 0.1%
a15603203 | 12 | 0.1%
a15703204 | 12 | 0.1%
a14272313 | 12 | 0.1%
a12170601 | 11 | 0.1%
a15205301 | 11 | 0.1%
a13821004 | 10 | 0.1%
a13677401 | 10 | 0.1%
Other values (2234) | 9882 | 98.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 18463 | 20.5%
1 | 17515 | 19.5%
A | 9993 | 11.1%
3 | 8885 | 9.9%
2 | 8345 | 9.3%
5 | 6207 | 6.9%
8 | 5635 | 6.3%
7 | 4592 | 5.1%
4 | 4090 | 4.5%
6 | 3255 | 3.6%
Other values (2) | 3020 | 3.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18463 | 23.1%
1 | 17515 | 21.9%
3 | 8885 | 11.1%
2 | 8345 | 10.4%
5 | 6207 | 7.8%
8 | 5635 | 7.0%
7 | 4592 | 5.7%
4 | 4090 | 5.1%
6 | 3255 | 4.1%
9 | 3013 | 3.8%

Uppercase Letter
Value | Count | Frequency (%)
A | 9993 | 99.9%
B | 7 | 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 88.9%
Latin | 10000 | 11.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18463 | 23.1%
1 | 17515 | 21.9%
3 | 8885 | 11.1%
2 | 8345 | 10.4%
5 | 6207 | 7.8%
8 | 5635 | 7.0%
7 | 4592 | 5.7%
4 | 4090 | 5.1%
6 | 3255 | 4.1%
9 | 3013 | 3.8%

Latin
Value | Count | Frequency (%)
A | 9993 | 99.9%
B | 7 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18463 | 20.5%
1 | 17515 | 19.5%
A | 9993 | 11.1%
3 | 8885 | 9.9%
2 | 8345 | 9.3%
5 | 6207 | 6.9%
8 | 5635 | 6.3%
7 | 4592 | 5.1%
4 | 4090 | 4.5%
6 | 3255 | 3.6%
Other values (2) | 3020 | 3.4%

비용명 (cost item name)
Text

Distinct: 77
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 10
Mean length: 5.92
Min length: 2

Characters and Unicode

Total characters: 59200
Distinct characters: 107
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 기타유형자산감가상각누계액
2nd row: 시설보수충당부채
3rd row: 관리비미수금
4th row: 가지급금
5th row: 기타투자자산

Value | Count | Frequency (%)
예수금 | 340 | 3.4%
예금 | 334 | 3.3%
미처분이익잉여금 | 328 | 3.3%
당기순이익 | 326 | 3.3%
선급비용 | 315 | 3.1%
공동주택적립금 | 311 | 3.1%
연차수당충당부채 | 308 | 3.1%
장기수선충당부채 | 306 | 3.1%
관리비미수금 | 303 | 3.0%
퇴직급여충당부채 | 292 | 2.9%
Other values (67) | 6837 | 68.4%

Most occurring characters

Value | Count | Frequency (%)
 | 4655 | 7.9%
 | 3836 | 6.5%
 | 3222 | 5.4%
 | 3010 | 5.1%
 | 2915 | 4.9%
 | 2888 | 4.9%
 | 2590 | 4.4%
 | 2484 | 4.2%
 | 1918 | 3.2%
 | 1790 | 3.0%
Other values (97) | 29892 | 50.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 59200 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 4655 | 7.9%
 | 3836 | 6.5%
 | 3222 | 5.4%
 | 3010 | 5.1%
 | 2915 | 4.9%
 | 2888 | 4.9%
 | 2590 | 4.4%
 | 2484 | 4.2%
 | 1918 | 3.2%
 | 1790 | 3.0%
Other values (97) | 29892 | 50.5%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 59200 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 4655 | 7.9%
 | 3836 | 6.5%
 | 3222 | 5.4%
 | 3010 | 5.1%
 | 2915 | 4.9%
 | 2888 | 4.9%
 | 2590 | 4.4%
 | 2484 | 4.2%
 | 1918 | 3.2%
 | 1790 | 3.0%
Other values (97) | 29892 | 50.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 59200 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 4655 | 7.9%
 | 3836 | 6.5%
 | 3222 | 5.4%
 | 3010 | 5.1%
 | 2915 | 4.9%
 | 2888 | 4.9%
 | 2590 | 4.4%
 | 2484 | 4.2%
 | 1918 | 3.2%
 | 1790 | 3.0%
Other values (97) | 29892 | 50.5%

년월일 (date)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
202204 (count 10000)

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202204
2nd row: 202204
3rd row: 202204
4th row: 202204
5th row: 202204

Common Values

Value | Count | Frequency (%)
202204 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
202204 | 10000 | 100.0%

금액 (amount)
Real number (ℝ)

ZEROS

Distinct: 7341
Distinct (%): 73.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 80057681
Minimum: -9.0793549 × 10^8
Maximum: 8.4784997 × 10^9
Zeros: 2325
Zeros (%): 23.2%
Negative: 290
Negative (%): 2.9%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -9.0793549 × 10^8
5th percentile: 0
Q1: 0
Median: 3142490
Q3: 37870709
95th percentile: 3.9360256 × 10^8
Maximum: 8.4784997 × 10^9
Range: 9.3864352 × 10^9
Interquartile range (IQR): 37870709

Descriptive statistics

Standard deviation: 3.1499409 × 10^8
Coefficient of variation (CV): 3.9345893
Kurtosis: 198.13172
Mean: 80057681
Median Absolute Deviation (MAD): 3142490
Skewness: 11.427159
Sum: 8.0057681 × 10^11
Variance: 9.9221278 × 10^16
Monotonicity: Not monotonic
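
Several of the figures above are derived from one another, so they can be cross-checked with a few lines of arithmetic. The sketch below does so from the rounded values printed in this report; the variable names are illustrative, and small discrepancies in the last digit are expected because the inputs are already rounded.

```python
import math

# Values as printed in the report (rounded to about 8 significant figures).
n = 10_000
mean = 80_057_681
std = 3.1499409e8
minimum = -907_935_489
maximum = 8_478_499_741
q1, q3 = 0, 37_870_709

total = mean * n                 # Sum = mean x number of observations
value_range = maximum - minimum  # Range = maximum - minimum
iqr = q3 - q1                    # Interquartile range = Q3 - Q1
cv = std / mean                  # Coefficient of variation = std / mean
variance = std ** 2              # Variance = std squared

print(total, value_range, iqr, round(cv, 7), f"{variance:.7e}")
```

A coefficient of variation close to 4, together with a median far below the mean, is consistent with the strong right skew the report gives for this column.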
Histogram with fixed size bins (bins=50)
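
A fixed-size-bin histogram splits the interval from the minimum to the maximum into equally wide bins and counts the values that fall in each. The following is a minimal sketch of that binning, not the plotting code used for the report: the function name is hypothetical, the input is the ten 금액 values from the first sample table in this report, and 5 bins are used instead of 50 to keep the output short.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if lo == hi:
        # Degenerate case: every value is identical, so use the first bin.
        counts[0] = len(values)
        return counts
    width = (hi - lo) / bins
    for v in values:
        # The maximum sits on the final edge; clamp it into the last bin.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts


amounts = [0, 0, 33971934, 321550, 0, -2248900, 8604586, 0, 203793754, 0]
counts = fixed_width_histogram(amounts, bins=5)
print(counts)
```

Even on ten rows, nearly every value lands in the first bin and a single large amount occupies the last one, which is the shape to expect from equal-width bins on heavily skewed data.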
Value | Count | Frequency (%)
0 | 2325 | 23.2%
250000 | 22 | 0.2%
500000 | 22 | 0.2%
300000 | 17 | 0.2%
10000000 | 15 | 0.1%
2000000 | 13 | 0.1%
200000 | 12 | 0.1%
20000000 | 12 | 0.1%
3000000 | 10 | 0.1%
242000 | 10 | 0.1%
Other values (7331) | 7542 | 75.4%

Minimum 10 values
Value | Count | Frequency (%)
-907935489 | 1 | < 0.1%
-311621827 | 1 | < 0.1%
-154519681 | 1 | < 0.1%
-140549450 | 1 | < 0.1%
-133407205 | 1 | < 0.1%
-117108394 | 1 | < 0.1%
-105064837 | 1 | < 0.1%
-102643070 | 1 | < 0.1%
-98397820 | 1 | < 0.1%
-96988750 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
8478499741 | 1 | < 0.1%
7406676518 | 1 | < 0.1%
7318879438 | 1 | < 0.1%
6764052522 | 1 | < 0.1%
6429003861 | 1 | < 0.1%
5552907513 | 1 | < 0.1%
5111190543 | 1 | < 0.1%
4957877933 | 1 | < 0.1%
4651955395 | 1 | < 0.1%
4480501285 | 1 | < 0.1%

Interactions


Correlations

 | 비용명 | 금액
비용명 | 1.000 | 0.632
금액 | 0.632 | 1.000
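
The report does not state which measure produced the 0.632 between the text column 비용명 and the numeric column 금액. One common choice for a categorical/numeric pair is Cramér's V computed after binning the numeric column. The sketch below assumes that approach and omits any bias correction; the function name and the toy labels are hypothetical and are not taken from the dataset.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Cramér's V (no bias correction) between two equal-length label sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row = Counter(xs)
    col = Counter(ys)
    # Pearson chi-square statistic over the full contingency table.
    chi2 = 0.0
    for x in row:
        for y in col:
            expected = row[x] * col[y] / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Toy data: the amount is pre-binned into "zero" / "nonzero".
items = ["예금", "예금", "예수금", "예수금"]
perfect = cramers_v(items, ["nonzero", "nonzero", "zero", "zero"])
independent = cramers_v(items, ["nonzero", "zero", "nonzero", "zero"])
print(perfect, independent)
```

The statistic ranges from 0 (no association) to 1 (each category maps to a single bin), so under this assumption 0.632 would indicate a fairly strong link between the cost item and the size of the amount.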

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
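
The counts behind both nullity views are simple per-column tallies of missing cells. A minimal sketch follows, assuming rows are held as dictionaries and that None or an empty string means missing; the function name is illustrative, and the two rows are taken from the sample table in this report.

```python
def missing_per_column(rows, columns):
    """Count None or empty-string cells for each column in a list of dict rows."""
    return {
        column: sum(1 for row in rows if row.get(column) in (None, ""))
        for column in columns
    }


columns = ["아파트명", "아파트코드", "비용명", "년월일", "금액"]
rows = [
    {"아파트명": "월계주공2단지", "아파트코드": "A13984814",
     "비용명": "기타유형자산감가상각누계액", "년월일": "202204", "금액": 0},
    {"아파트명": "삼성래미안공덕4차", "아파트코드": "A12170601",
     "비용명": "관리비미수금", "년월일": "202204", "금액": 33971934},
]
missing = missing_per_column(rows, columns)
print(missing)
```

A zero amount is a real value rather than a missing one, so it is not counted here. That matches the report, which lists 2325 zeros but 0 missing cells.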

Sample

Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
47104 | 월계주공2단지 | A13984814 | 기타유형자산감가상각누계액 | 202204 | 0
70742 | 신정이펜하우스1단지(총세대 기준) | A15870701 | 시설보수충당부채 | 202204 | 0
12164 | 삼성래미안공덕4차 | A12170601 | 관리비미수금 | 202204 | 33971934
49196 | 신창세방리버하이빌 | A14006001 | 가지급금 | 202204 | 321550
47807 | 중계무지개아파트 | A13986504 | 기타투자자산 | 202204 | 0
31838 | 대치쌍용2차 | A13583402 | 기타유형자산감가상각누계액 | 202204 | -2248900
65036 | 우장산에스케이뷰 | A15701002 | 당기순이익 | 202204 | 8604586
72716 | 은평뉴타운제각말5단지제3관리사무소 | A41279925 | 미수관리비예치금 | 202204 | 0
40213 | 래미안송파파인탑 | A13817001 | 미부과관리비 | 202204 | 203793754
69809 | 신정명지해드는터 | A15807202 | 미처분이익잉여금 | 202204 | 0

Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
43397 | 상계벽산 | A13920506 | 주차장충당부채 | 202204 | 0
56374 | 신림2차푸르지오 | A15101503 | 미부과관리비 | 202204 | 77708080
62575 | 상도동원베네스트 | A15603001 | 장기수선충당부채 | 202204 | 121828177
5689 | 한남아이파크 | A10027071 | 미지급비용 | 202204 | 497310
55250 | 대림현대2차 | A15081602 | 미지급금 | 202204 | 18795881
26160 | 서울숲리버그린동아 | A13385301 | 예금 | 202204 | 140675681
5705 | 래미안에스티움 | A10027073 | 가지급금 | 202204 | 38437772
6396 | 강남더샵포레스트 | A10027446 | 예수금 | 202204 | 4457780
60384 | 구로중앙하이츠아파트 | A15285804 | 기타충당부채 | 202204 | 1887760
3127 | 백련산 sk뷰 아이파크 | A10025310 | 관리비미수금 | 202204 | 54849340