Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15820/S/1/datasetView.do

Alerts

년월일 (date) has constant value "202205" [Constant]
금액 (amount) is highly skewed (γ1 = 22.11784956) [Skewed]
금액 (amount) has 2360 (23.6%) zeros [Zeros]
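
The two 금액 alerts rest on simple quantities: a skewness coefficient and the share of zero values. The sketch below uses the textbook moment definition of skewness; whether the profiling tool applies an additional small-sample correction is not stated in this report, so treat the formula as an assumption. The toy list is hypothetical and is not drawn from the dataset.

```python
from statistics import fmean


def moment_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no small-sample correction)."""
    values = list(values)
    mean = fmean(values)
    m2 = fmean([(v - mean) ** 2 for v in values])  # second central moment
    m3 = fmean([(v - mean) ** 3 for v in values])  # third central moment
    return m3 / m2 ** 1.5


def zero_share(values):
    """Fraction of entries that are exactly zero."""
    values = list(values)
    return sum(1 for v in values if v == 0) / len(values)


# Hypothetical toy amounts, NOT taken from the dataset.
toy_amounts = [0, 0, 1_000, 2_000, 1_000_000]
toy_skew = moment_skewness(toy_amounts)
toy_zero_share = zero_share(toy_amounts)

# The Zeros alert percentage follows from the counts reported above.
reported_zero_pct = round(2360 / 10000 * 100, 1)
```

A single very large value next to many zeros is enough to push the coefficient strongly positive, which is the pattern the Skewed and Zeros alerts describe together.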

Reproduction

Analysis started: 2024-05-11 05:57:58.198287
Analysis finished: 2024-05-11 05:57:59.092075
Duration: 0.89 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
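
A report of this kind can be regenerated along the following lines. This is a sketch under assumptions, not the configuration that produced this report: the CSV path, the output path, the report title, and the idea that the 10000 observations are a random sample of a larger file are all hypothetical. Only `pandas.read_csv`, `DataFrame.sample`, `ProfileReport`, and `ProfileReport.to_file` are relied on.

```python
def build_report(csv_path, out_path="report.html", sample_size=10_000, seed=0):
    """Profile a CSV file with ydata-profiling and write an HTML report.

    Imports are kept inside the function so the module can be loaded
    even where pandas or ydata-profiling is not installed.
    """
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: code-like columns are read as strings so that they are
    # profiled as text or categories rather than as numbers.
    df = pd.read_csv(csv_path, dtype={"아파트코드": str, "년월일": str})

    # Assumption: the profiled rows are a random sample of the full file.
    if len(df) > sample_size:
        df = df.sample(n=sample_size, random_state=seed)

    report = ProfileReport(df, title="Profiling report")  # hypothetical title
    report.to_file(out_path)
    return report
```

Calling `build_report("data.csv")` with a hypothetical file name would write `report.html` next to the script. The file encoding may need to be passed to `read_csv` explicitly for Korean-language exports.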

Variables

아파트명 (apartment name)
Text

Distinct: 2228
Distinct (%): 22.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 20
Mean length: 7.3622
Min length: 2

Characters and Unicode

Total characters: 73622
Distinct characters: 436
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
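
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories over the five sample values of the first variable; the helper name and the label mapping are illustrative, and scripts and blocks are not covered because the standard library does not expose them.

```python
import unicodedata
from collections import Counter

# Two-letter general category codes mapped to the long names used in this report.
CATEGORY_LABELS = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
    "Nl": "Letter Number",
}


def category_counts(strings):
    """Count Unicode general categories over all characters in the strings."""
    counts = Counter()
    for s in strings:
        # Normalise to NFC so each Hangul syllable is a single code point.
        for ch in unicodedata.normalize("NFC", s):
            counts[unicodedata.category(ch)] += 1
    return counts


names = ["방배1차현대", "갈현한솔아파트", "용산센트럴파크", "신동아아파트", "여의도장미"]
counts = category_counts(names)
for code, n in counts.most_common():
    print(CATEGORY_LABELS.get(code, code), n)
```

Hangul syllables fall under Other Letter and ASCII digits under Decimal Number, which is why those two categories dominate the tables for this variable.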

Unique

Unique: 108
Unique (%): 1.1%

Sample

1st row: 방배1차현대
2nd row: 갈현한솔아파트
3rd row: 용산센트럴파크
4th row: 신동아아파트
5th row: 여의도장미

Value | Count | Frequency (%)
아파트 | 163 | 1.5%
래미안 | 43 | 0.4%
e편한세상 | 25 | 0.2%
아이파크 | 24 | 0.2%
푸르지오 | 17 | 0.2%
래미안밤섬리베뉴 | 15 | 0.1%
경남아너스빌 | 15 | 0.1%
고덕 | 15 | 0.1%
은평뉴타운상림마을6단지 | 15 | 0.1%
신트리1단지 | 13 | 0.1%
Other values (2306) | 10394 | 96.8%

Most occurring characters

Value | Count | Frequency (%)
(glyph lost) | 2581 | 3.5%
(glyph lost) | 2516 | 3.4%
(glyph lost) | 2332 | 3.2%
(glyph lost) | 1908 | 2.6%
(glyph lost) | 1734 | 2.4%
(glyph lost) | 1632 | 2.2%
(glyph lost) | 1511 | 2.1%
(glyph lost) | 1424 | 1.9%
(glyph lost) | 1383 | 1.9%
(glyph lost) | 1323 | 1.8%
Other values (426) | 55278 | 75.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 67367 | 91.5%
Decimal Number | 3694 | 5.0%
Uppercase Letter | 860 | 1.2%
Space Separator | 805 | 1.1%
Lowercase Letter | 353 | 0.5%
Close Punctuation | 147 | 0.2%
Open Punctuation | 147 | 0.2%
Dash Punctuation | 145 | 0.2%
Other Punctuation | 96 | 0.1%
Letter Number | 8 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(glyph lost) | 2581 | 3.8%
(glyph lost) | 2516 | 3.7%
(glyph lost) | 2332 | 3.5%
(glyph lost) | 1908 | 2.8%
(glyph lost) | 1734 | 2.6%
(glyph lost) | 1632 | 2.4%
(glyph lost) | 1511 | 2.2%
(glyph lost) | 1424 | 2.1%
(glyph lost) | 1383 | 2.1%
(glyph lost) | 1323 | 2.0%
Other values (381) | 49023 | 72.8%

Uppercase Letter
Value | Count | Frequency (%)
S | 135 | 15.7%
C | 109 | 12.7%
K | 104 | 12.1%
M | 78 | 9.1%
D | 78 | 9.1%
L | 62 | 7.2%
H | 58 | 6.7%
I | 46 | 5.3%
E | 45 | 5.2%
V | 27 | 3.1%
Other values (7) | 118 | 13.7%

Lowercase Letter
Value | Count | Frequency (%)
e | 190 | 53.8%
l | 40 | 11.3%
i | 34 | 9.6%
v | 23 | 6.5%
s | 18 | 5.1%
k | 14 | 4.0%
h | 8 | 2.3%
w | 8 | 2.3%
a | 7 | 2.0%
g | 7 | 2.0%

Decimal Number
Value | Count | Frequency (%)
1 | 1157 | 31.3%
2 | 1042 | 28.2%
3 | 477 | 12.9%
4 | 266 | 7.2%
5 | 209 | 5.7%
6 | 169 | 4.6%
8 | 110 | 3.0%
7 | 98 | 2.7%
9 | 90 | 2.4%
0 | 76 | 2.1%

Other Punctuation
Value | Count | Frequency (%)
, | 77 | 80.2%
. | 19 | 19.8%

Space Separator
Value | Count | Frequency (%)
(space) | 805 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 147 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 147 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 145 | 100.0%

Letter Number
Value | Count | Frequency (%)
(glyph lost) | 8 | 100.0%

Most occurring scripts

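Value | Count | Frequency (%)
Hangul | 67367 | 91.5%
Common | 5034 | 6.8%
Latin | 1221 | 1.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(glyph lost) | 2581 | 3.8%
(glyph lost) | 2516 | 3.7%
(glyph lost) | 2332 | 3.5%
(glyph lost) | 1908 | 2.8%
(glyph lost) | 1734 | 2.6%
(glyph lost) | 1632 | 2.4%
(glyph lost) | 1511 | 2.2%
(glyph lost) | 1424 | 2.1%
(glyph lost) | 1383 | 2.1%
(glyph lost) | 1323 | 2.0%
Other values (381) | 49023 | 72.8%

Latin
Value | Count | Frequency (%)
e | 190 | 15.6%
S | 135 | 11.1%
C | 109 | 8.9%
K | 104 | 8.5%
M | 78 | 6.4%
D | 78 | 6.4%
L | 62 | 5.1%
H | 58 | 4.8%
I | 46 | 3.8%
E | 45 | 3.7%
Other values (19) | 316 | 25.9%

Common
Value | Count | Frequency (%)
1 | 1157 | 23.0%
2 | 1042 | 20.7%
(space) | 805 | 16.0%
3 | 477 | 9.5%
4 | 266 | 5.3%
5 | 209 | 4.2%
6 | 169 | 3.4%
) | 147 | 2.9%
( | 147 | 2.9%
- | 145 | 2.9%
Other values (6) | 470 | 9.3%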

Most occurring blocks

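Value | Count | Frequency (%)
Hangul | 67367 | 91.5%
ASCII | 6247 | 8.5%
Number Forms | 8 | < 0.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(glyph lost) | 2581 | 3.8%
(glyph lost) | 2516 | 3.7%
(glyph lost) | 2332 | 3.5%
(glyph lost) | 1908 | 2.8%
(glyph lost) | 1734 | 2.6%
(glyph lost) | 1632 | 2.4%
(glyph lost) | 1511 | 2.2%
(glyph lost) | 1424 | 2.1%
(glyph lost) | 1383 | 2.1%
(glyph lost) | 1323 | 2.0%
Other values (381) | 49023 | 72.8%

ASCII
Value | Count | Frequency (%)
1 | 1157 | 18.5%
2 | 1042 | 16.7%
(space) | 805 | 12.9%
3 | 477 | 7.6%
4 | 266 | 4.3%
5 | 209 | 3.3%
e | 190 | 3.0%
6 | 169 | 2.7%
) | 147 | 2.4%
( | 147 | 2.4%
Other values (34) | 1638 | 26.2%

Number Forms
Value | Count | Frequency (%)
(glyph lost) | 8 | 100.0%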

아파트코드 (apartment code)
Text

Distinct: 2234
Distinct (%): 22.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 108
Unique (%): 1.1%

Sample

1st row: A13785203
2nd row: A12281801
3rd row: A10024691
4th row: A14082601
5th row: A15001004

Value | Count | Frequency (%)
a15807002 | 13 | 0.1%
a12208204 | 12 | 0.1%
a13204406 | 11 | 0.1%
a13384403 | 11 | 0.1%
a13986701 | 11 | 0.1%
a14381516 | 11 | 0.1%
a15785711 | 11 | 0.1%
a13922111 | 11 | 0.1%
a12013202 | 11 | 0.1%
a13078701 | 10 | 0.1%
Other values (2224) | 9888 | 98.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 18510 | 20.6%
1 | 17579 | 19.5%
A | 9993 | 11.1%
3 | 8833 | 9.8%
2 | 8237 | 9.2%
5 | 6006 | 6.7%
8 | 5576 | 6.2%
7 | 4709 | 5.2%
4 | 4079 | 4.5%
6 | 3373 | 3.7%
Other values (2) | 3105 | 3.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18510 | 23.1%
1 | 17579 | 22.0%
3 | 8833 | 11.0%
2 | 8237 | 10.3%
5 | 6006 | 7.5%
8 | 5576 | 7.0%
7 | 4709 | 5.9%
4 | 4079 | 5.1%
6 | 3373 | 4.2%
9 | 3098 | 3.9%

Uppercase Letter
Value | Count | Frequency (%)
A | 9993 | 99.9%
B | 7 | 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 88.9%
Latin | 10000 | 11.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18510 | 23.1%
1 | 17579 | 22.0%
3 | 8833 | 11.0%
2 | 8237 | 10.3%
5 | 6006 | 7.5%
8 | 5576 | 7.0%
7 | 4709 | 5.9%
4 | 4079 | 5.1%
6 | 3373 | 4.2%
9 | 3098 | 3.9%

Latin
Value | Count | Frequency (%)
A | 9993 | 99.9%
B | 7 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18510 | 20.6%
1 | 17579 | 19.5%
A | 9993 | 11.1%
3 | 8833 | 9.8%
2 | 8237 | 9.2%
5 | 6006 | 6.7%
8 | 5576 | 6.2%
7 | 4709 | 5.2%
4 | 4079 | 4.5%
6 | 3373 | 3.7%
Other values (2) | 3105 | 3.5%

비용명 (cost name)
Text

Distinct: 77
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 10
Mean length: 5.9905
Min length: 2

Characters and Unicode

Total characters: 59905
Distinct characters: 107
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 장기수선충당부채
2nd row: 관리비미수금
3rd row: 장기수선충당예금
4th row: 공동주택적립금
5th row: 예수금

Value | Count | Frequency (%)
예금 | 331 | 3.3%
예수금 | 327 | 3.3%
공동주택적립금 | 319 | 3.2%
연차수당충당부채 | 314 | 3.1%
퇴직급여충당부채 | 310 | 3.1%
관리비미수금 | 306 | 3.1%
당기순이익 | 303 | 3.0%
미처분이익잉여금 | 302 | 3.0%
장기수선충당예금 | 301 | 3.0%
수선유지비충당부채 | 298 | 3.0%
Other values (67) | 6889 | 68.9%

Most occurring characters

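Value | Count | Frequency (%)
(glyph lost) | 4728 | 7.9%
(glyph lost) | 3908 | 6.5%
(glyph lost) | 3176 | 5.3%
(glyph lost) | 3110 | 5.2%
(glyph lost) | 3001 | 5.0%
(glyph lost) | 2926 | 4.9%
(glyph lost) | 2638 | 4.4%
(glyph lost) | 2445 | 4.1%
(glyph lost) | 1877 | 3.1%
(glyph lost) | 1815 | 3.0%
Other values (97) | 30281 | 50.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 59905 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(glyph lost) | 4728 | 7.9%
(glyph lost) | 3908 | 6.5%
(glyph lost) | 3176 | 5.3%
(glyph lost) | 3110 | 5.2%
(glyph lost) | 3001 | 5.0%
(glyph lost) | 2926 | 4.9%
(glyph lost) | 2638 | 4.4%
(glyph lost) | 2445 | 4.1%
(glyph lost) | 1877 | 3.1%
(glyph lost) | 1815 | 3.0%
Other values (97) | 30281 | 50.5%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 59905 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(glyph lost) | 4728 | 7.9%
(glyph lost) | 3908 | 6.5%
(glyph lost) | 3176 | 5.3%
(glyph lost) | 3110 | 5.2%
(glyph lost) | 3001 | 5.0%
(glyph lost) | 2926 | 4.9%
(glyph lost) | 2638 | 4.4%
(glyph lost) | 2445 | 4.1%
(glyph lost) | 1877 | 3.1%
(glyph lost) | 1815 | 3.0%
Other values (97) | 30281 | 50.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 59905 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(glyph lost) | 4728 | 7.9%
(glyph lost) | 3908 | 6.5%
(glyph lost) | 3176 | 5.3%
(glyph lost) | 3110 | 5.2%
(glyph lost) | 3001 | 5.0%
(glyph lost) | 2926 | 4.9%
(glyph lost) | 2638 | 4.4%
(glyph lost) | 2445 | 4.1%
(glyph lost) | 1877 | 3.1%
(glyph lost) | 1815 | 3.0%
Other values (97) | 30281 | 50.5%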

년월일 (date)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
202205: 10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202205
2nd row: 202205
3rd row: 202205
4th row: 202205
5th row: 202205

Common Values

Value | Count | Frequency (%)
202205 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
202205 | 10000 | 100.0%

금액 (amount)
Real number (ℝ)

SKEWED  ZEROS

Distinct: 7308
Distinct (%): 73.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 80679369
Minimum: -4.4797938 × 10^8
Maximum: 2.0760626 × 10^10
Zeros: 2360
Zeros (%): 23.6%
Negative: 343
Negative (%): 3.4%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -4.4797938 × 10^8
5-th percentile: 0
Q1: 0
median: 2974104
Q3: 33994437
95-th percentile: 3.616385 × 10^8
Maximum: 2.0760626 × 10^10
Range: 2.1208606 × 10^10
Interquartile range (IQR): 33994437

Descriptive statistics

Standard deviation: 3.827196 × 10^8
Coefficient of variation (CV): 4.7437109
Kurtosis: 933.27788
Mean: 80679369
Median Absolute Deviation (MAD): 2974104
Skewness: 22.11785
Sum: 8.0679369 × 10^11
Variance: 1.464743 × 10^17
Monotonicity: Not monotonic
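
Several of the figures above are derived from one another, which gives a quick consistency check. The sketch below recomputes the coefficient of variation, range, IQR, sum, and variance from the reported mean, standard deviation, quartiles, and extreme values. The inputs are the rounded values printed in this report, so agreement is only expected up to rounding; the variable names are illustrative.

```python
import math

# Values as printed in this report (rounded for display).
n = 10_000
mean = 80_679_369
std = 3.827196e8
minimum = -447_979_377      # smallest value listed for 금액
maximum = 20_760_626_250    # largest value listed for 금액
q1, q3 = 0, 33_994_437

cv = std / mean                  # coefficient of variation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range
total = mean * n                 # sum of all values
variance = std ** 2              # variance is the squared standard deviation

print(f"CV={cv:.7f} range={value_range:.7e} IQR={iqr} "
      f"sum={total:.7e} variance={variance:.6e}")
```

A standard deviation almost five times the mean, together with a median far below the mean, is consistent with the heavy right tail flagged in the alerts.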
Histogram with fixed size bins (bins=50)
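
Fixed-size binning splits the interval between the minimum and maximum into equally wide bins and counts the values that fall into each. The function below is a minimal pure-Python sketch of that idea, with the last bin closed on the right as in `numpy.histogram`; it is not the plotting code used by the report, and the toy input is hypothetical.

```python
def fixed_width_histogram(values, bins=50):
    """Return (counts, edges) for equal-width bins spanning min..max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        if width == 0:
            idx = 0  # all values identical: everything lands in the first bin
        else:
            # The maximum would index one past the end, so clamp it into the last bin.
            idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return counts, edges


# Hypothetical toy input, not taken from the dataset.
toy_counts, toy_edges = fixed_width_histogram([0, 0, 1, 5, 10], bins=5)
```

With a strongly skewed variable such as 금액, equal-width bins place nearly all observations in the first few bins, which is why a log-scaled axis or quantile-based bins is often easier to read.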
Common values
Value | Count | Frequency (%)
0 | 2360 | 23.6%
500000 | 28 | 0.3%
250000 | 18 | 0.2%
200000 | 15 | 0.1%
300000 | 13 | 0.1%
484000 | 13 | 0.1%
242000 | 13 | 0.1%
100000 | 12 | 0.1%
10000000 | 12 | 0.1%
30000000 | 9 | 0.1%
Other values (7298) | 7507 | 75.1%

Minimum 10 values
Value | Count | Frequency (%)
-447979377 | 1 | < 0.1%
-375250468 | 1 | < 0.1%
-315812936 | 1 | < 0.1%
-280663216 | 1 | < 0.1%
-245626510 | 1 | < 0.1%
-230922000 | 1 | < 0.1%
-201330000 | 1 | < 0.1%
-195595710 | 1 | < 0.1%
-190422700 | 1 | < 0.1%
-154762018 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
20760626250 | 1 | < 0.1%
7751492557 | 1 | < 0.1%
7508019841 | 1 | < 0.1%
7477225598 | 1 | < 0.1%
5759842614 | 1 | < 0.1%
5004400593 | 1 | < 0.1%
4838246619 | 1 | < 0.1%
4835743985 | 1 | < 0.1%
4783509589 | 1 | < 0.1%
4671293406 | 1 | < 0.1%

Interactions


Correlations

 | 비용명 | 금액
비용명 | 1.000 | 0.302
금액 | 0.302 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
37924 | 방배1차현대 | A13785203 | 장기수선충당부채 | 202205 | 326036221
15118 | 갈현한솔아파트 | A12281801 | 관리비미수금 | 202205 | 2030220
1704 | 용산센트럴파크 | A10024691 | 장기수선충당예금 | 202205 | 321966959
49526 | 신동아아파트 | A14082601 | 공동주택적립금 | 202205 | 55188745
52597 | 여의도장미 | A15001004 | 예수금 | 202205 | 1000680
30339 | 수서까치마을 | A13522007 | 가지급금 | 202205 | 531240
32958 | 돈암동일하이빌 | A13603501 | 저장품 | 202205 | 336500
31527 | 역삼개나리푸르지오 | A13579501 | 선수관리비 | 202205 | 81084000
4018 | 연희파크푸르지오 아파트 | A10025822 | 당기순이익 | 202205 | 19185012
32249 | 압구정한양아파트제2단지 | A13590204 | 연차수당충당부채 | 202205 | 69212958

Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
53492 | 신길삼성래미안 | A15005402 | 임대보증금 | 202205 | 0
66467 | 마곡수명산파크7단지 | A15728005 | 기타충당부채 | 202205 | 5870000
28576 | 천호한신 | A13486601 | 선수금 | 202205 | 28455000
9622 | 홍제성원아파트 | A12009201 | 기타투자자산 | 202205 | 56512
11660 | 마포강변힐스테이트 | A12112002 | 기타시설운영충당부채 | 202205 | 4242700
7432 | 강남한신휴플러스 8단지 | A10027909 | 선수수도료 | 202205 | 0
31011 | 도곡렉슬 | A13527203 | 현금 | 202205 | 168110177
55747 | 여의도화랑 | A15088802 | 기타충당예금 | 202205 | 1811790
49863 | 번동금호어울림 | A14206002 | 미처분이익잉여금 | 202205 | 0
59388 | 신개봉삼환 | A15280602 | 임대보증금 | 202205 | 5600000