Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15820/S/1/datasetView.do

Alerts

년월일 (date) has constant value "202207" (Constant)
금액 (amount) is highly skewed (γ1 = 21.57862091) (Skewed)
금액 (amount) has 2504 (25.0%) zeros (Zeros)
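
The Skewed and Zeros alerts can be checked by hand. The sketch below is a minimal standard-library illustration, assuming γ1 is the adjusted Fisher-Pearson sample skewness (the estimator that pandas `Series.skew()` returns). The function names and the toy `amounts` list are hypothetical and are not taken from the dataset.

```python
import math

def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness G1 (needs at least 3 values)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    g1 = m3 / m2 ** 1.5                            # population skewness
    return g1 * math.sqrt(n * (n - 1)) / (n - 2)   # small-sample adjustment

def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)

# Toy data (hypothetical): mostly zeros plus one large positive value.
amounts = [0, 0, 0, 10]
print(round(sample_skewness(amounts), 6))  # 2.0
print(zero_share(amounts))                 # 0.75
```

Applied to the figures in this report, 2504 zeros out of 10000 observations is a share of 0.2504, which the report rounds to 25.0%.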

Reproduction

Analysis started: 2024-05-11 05:57:43.199069
Analysis finished: 2024-05-11 05:57:44.829082
Duration: 1.63 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명 (apartment name)
Text

Distinct: 2259
Distinct (%): 22.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 21
Mean length: 7.3617
Min length: 2

Characters and Unicode

Total characters: 73617
Distinct characters: 436
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
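
As a rough illustration of how such character properties can be read, the sketch below uses Python's standard `unicodedata` module to count general categories in a few apartment names from this report. The `CATEGORY_NAMES` mapping and the `category_counts` helper are hypothetical names introduced here, and this is not necessarily how the profiling tool computes its tables.

```python
import unicodedata
from collections import Counter

# Long names for the two-letter general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Nl": "Letter Number",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
}

def category_counts(text):
    """Count characters in `text` by Unicode general category."""
    codes = (unicodedata.category(ch) for ch in text)
    # Fall back to the raw two-letter code for categories not listed above.
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)

for name in ["구의현대6단지", "길음SHVILLE", "풍납 현대리버빌1차"]:
    print(name, dict(category_counts(name)))
```

Hangul syllables fall under "Other Letter", which is why that category dominates the tables for the Korean text columns.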

Unique

Unique: 125
Unique (%): 1.2%

Sample

1st row: 원효산호
2nd row: 우장산한화꿈에그린
3rd row: 신림동부
4th row: 구의현대6단지
5th row: 천호삼익

Value | Count | Frequency (%)
아파트 | 158 | 1.5%
래미안 | 44 | 0.4%
아이파크 | 21 | 0.2%
e편한세상 | 18 | 0.2%
경남아너스빌 | 17 | 0.2%
신반포 | 17 | 0.2%
sk뷰 | 16 | 0.1%
해모로 | 16 | 0.1%
꿈의숲 | 13 | 0.1%
방화2-2 | 12 | 0.1%
Other values (2339) | 10417 | 96.9%

Most occurring characters

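Value | Count | Frequency (%)
(character lost) | 2568 | 3.5%
(character lost) | 2462 | 3.3%
(character lost) | 2331 | 3.2%
(character lost) | 1842 | 2.5%
(character lost) | 1697 | 2.3%
(character lost) | 1667 | 2.3%
(character lost) | 1509 | 2.0%
(character lost) | 1453 | 2.0%
(character lost) | 1443 | 2.0%
(character lost) | 1434 | 1.9%
Other values (426) | 55211 | 75.0%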

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 67349 | 91.5%
Decimal Number | 3663 | 5.0%
Uppercase Letter | 846 | 1.1%
Space Separator | 832 | 1.1%
Lowercase Letter | 364 | 0.5%
Open Punctuation | 150 | 0.2%
Close Punctuation | 150 | 0.2%
Dash Punctuation | 140 | 0.2%
Other Punctuation | 120 | 0.2%
Letter Number | 3 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(character lost) | 2568 | 3.8%
(character lost) | 2462 | 3.7%
(character lost) | 2331 | 3.5%
(character lost) | 1842 | 2.7%
(character lost) | 1697 | 2.5%
(character lost) | 1667 | 2.5%
(character lost) | 1509 | 2.2%
(character lost) | 1453 | 2.2%
(character lost) | 1443 | 2.1%
(character lost) | 1434 | 2.1%
Other values (381) | 48943 | 72.7%

Uppercase Letter
Value | Count | Frequency (%)
S | 139 | 16.4%
C | 117 | 13.8%
K | 98 | 11.6%
D | 84 | 9.9%
M | 84 | 9.9%
L | 68 | 8.0%
H | 64 | 7.6%
E | 34 | 4.0%
I | 34 | 4.0%
G | 33 | 3.9%
Other values (7) | 91 | 10.8%

Lowercase Letter
Value | Count | Frequency (%)
e | 189 | 51.9%
l | 45 | 12.4%
i | 35 | 9.6%
v | 28 | 7.7%
s | 19 | 5.2%
k | 17 | 4.7%
w | 11 | 3.0%
h | 8 | 2.2%
c | 6 | 1.6%
a | 3 | 0.8%

Decimal Number
Value | Count | Frequency (%)
2 | 1090 | 29.8%
1 | 1044 | 28.5%
3 | 479 | 13.1%
4 | 300 | 8.2%
5 | 234 | 6.4%
6 | 147 | 4.0%
7 | 111 | 3.0%
9 | 97 | 2.6%
8 | 83 | 2.3%
0 | 78 | 2.1%

Other Punctuation
Value | Count | Frequency (%)
, | 96 | 80.0%
. | 24 | 20.0%

Space Separator
Value | Count | Frequency (%)
(space) | 832 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 150 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 150 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 140 | 100.0%

Letter Number
Value | Count | Frequency (%)
(character lost) | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 67349 | 91.5%
Common | 5055 | 6.9%
Latin | 1213 | 1.6%

Most frequent character per script

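Hangul
Value | Count | Frequency (%)
(character lost) | 2568 | 3.8%
(character lost) | 2462 | 3.7%
(character lost) | 2331 | 3.5%
(character lost) | 1842 | 2.7%
(character lost) | 1697 | 2.5%
(character lost) | 1667 | 2.5%
(character lost) | 1509 | 2.2%
(character lost) | 1453 | 2.2%
(character lost) | 1443 | 2.1%
(character lost) | 1434 | 2.1%
Other values (381) | 48943 | 72.7%

Latin
Value | Count | Frequency (%)
e | 189 | 15.6%
S | 139 | 11.5%
C | 117 | 9.6%
K | 98 | 8.1%
D | 84 | 6.9%
M | 84 | 6.9%
L | 68 | 5.6%
H | 64 | 5.3%
l | 45 | 3.7%
i | 35 | 2.9%
Other values (19) | 290 | 23.9%

Common
Value | Count | Frequency (%)
2 | 1090 | 21.6%
1 | 1044 | 20.7%
(space) | 832 | 16.5%
3 | 479 | 9.5%
4 | 300 | 5.9%
5 | 234 | 4.6%
( | 150 | 3.0%
) | 150 | 3.0%
6 | 147 | 2.9%
- | 140 | 2.8%
Other values (6) | 489 | 9.7%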

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 67349 | 91.5%
ASCII | 6265 | 8.5%
Number Forms | 3 | < 0.1%

Most frequent character per block

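Hangul
Value | Count | Frequency (%)
(character lost) | 2568 | 3.8%
(character lost) | 2462 | 3.7%
(character lost) | 2331 | 3.5%
(character lost) | 1842 | 2.7%
(character lost) | 1697 | 2.5%
(character lost) | 1667 | 2.5%
(character lost) | 1509 | 2.2%
(character lost) | 1453 | 2.2%
(character lost) | 1443 | 2.1%
(character lost) | 1434 | 2.1%
Other values (381) | 48943 | 72.7%

ASCII
Value | Count | Frequency (%)
2 | 1090 | 17.4%
1 | 1044 | 16.7%
(space) | 832 | 13.3%
3 | 479 | 7.6%
4 | 300 | 4.8%
5 | 234 | 3.7%
e | 189 | 3.0%
( | 150 | 2.4%
) | 150 | 2.4%
6 | 147 | 2.3%
Other values (34) | 1650 | 26.3%

Number Forms
Value | Count | Frequency (%)
(character lost) | 3 | 100.0%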

아파트코드 (apartment code)
Text

Distinct: 2265
Distinct (%): 22.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 125
Unique (%): 1.2%

Sample

1st row: A14085002
2nd row: A15701004
3rd row: A15101101
4th row: A14383203
5th row: A13486701

Value | Count | Frequency (%)
a13082502 | 12 | 0.1%
a13187406 | 12 | 0.1%
a15785711 | 12 | 0.1%
a13178101 | 12 | 0.1%
a12013003 | 12 | 0.1%
a15886504 | 12 | 0.1%
a15085805 | 11 | 0.1%
a13986306 | 11 | 0.1%
a13528003 | 11 | 0.1%
a12070101 | 11 | 0.1%
Other values (2255) | 9884 | 98.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 18495 | 20.5%
1 | 17565 | 19.5%
A | 9996 | 11.1%
3 | 8804 | 9.8%
2 | 8303 | 9.2%
5 | 6141 | 6.8%
8 | 5623 | 6.2%
7 | 4747 | 5.3%
4 | 4077 | 4.5%
6 | 3264 | 3.6%
Other values (2) | 2985 | 3.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18495 | 23.1%
1 | 17565 | 22.0%
3 | 8804 | 11.0%
2 | 8303 | 10.4%
5 | 6141 | 7.7%
8 | 5623 | 7.0%
7 | 4747 | 5.9%
4 | 4077 | 5.1%
6 | 3264 | 4.1%
9 | 2981 | 3.7%

Uppercase Letter
Value | Count | Frequency (%)
A | 9996 | > 99.9%
B | 4 | < 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 88.9%
Latin | 10000 | 11.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18495 | 23.1%
1 | 17565 | 22.0%
3 | 8804 | 11.0%
2 | 8303 | 10.4%
5 | 6141 | 7.7%
8 | 5623 | 7.0%
7 | 4747 | 5.9%
4 | 4077 | 5.1%
6 | 3264 | 4.1%
9 | 2981 | 3.7%

Latin
Value | Count | Frequency (%)
A | 9996 | > 99.9%
B | 4 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18495 | 20.5%
1 | 17565 | 19.5%
A | 9996 | 11.1%
3 | 8804 | 9.8%
2 | 8303 | 9.2%
5 | 6141 | 6.8%
8 | 5623 | 6.2%
7 | 4747 | 5.3%
4 | 4077 | 4.5%
6 | 3264 | 3.6%
Other values (2) | 2985 | 3.3%

비용명 (expense item name)
Text

Distinct: 77
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 9
Mean length: 6.0515
Min length: 2

Characters and Unicode

Total characters: 60515
Distinct characters: 107
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 기타당좌자산
2nd row: 선급비용
3rd row: 가지급금
4th row: 퇴직급여충당부채
5th row: 가지급금

Value | Count | Frequency (%)
미처분이익잉여금 | 342 | 3.4%
퇴직급여충당부채 | 332 | 3.3%
당기순이익 | 317 | 3.2%
공동주택적립금 | 306 | 3.1%
장기수선충당부채 | 302 | 3.0%
연차수당충당부채 | 298 | 3.0%
선급비용 | 296 | 3.0%
예금 | 295 | 2.9%
가수금 | 293 | 2.9%
비품 | 293 | 2.9%
Other values (67) | 6926 | 69.3%

Most occurring characters

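Value | Count | Frequency (%)
(character lost) | 4541 | 7.5%
(character lost) | 3917 | 6.5%
(character lost) | 3111 | 5.1%
(character lost) | 3055 | 5.0%
(character lost) | 3030 | 5.0%
(character lost) | 2965 | 4.9%
(character lost) | 2685 | 4.4%
(character lost) | 2428 | 4.0%
(character lost) | 1865 | 3.1%
(character lost) | 1713 | 2.8%
Other values (97) | 31205 | 51.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 60515 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(character lost) | 4541 | 7.5%
(character lost) | 3917 | 6.5%
(character lost) | 3111 | 5.1%
(character lost) | 3055 | 5.0%
(character lost) | 3030 | 5.0%
(character lost) | 2965 | 4.9%
(character lost) | 2685 | 4.4%
(character lost) | 2428 | 4.0%
(character lost) | 1865 | 3.1%
(character lost) | 1713 | 2.8%
Other values (97) | 31205 | 51.6%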

Most occurring scripts

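Value | Count | Frequency (%)
Hangul | 60515 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(character lost) | 4541 | 7.5%
(character lost) | 3917 | 6.5%
(character lost) | 3111 | 5.1%
(character lost) | 3055 | 5.0%
(character lost) | 3030 | 5.0%
(character lost) | 2965 | 4.9%
(character lost) | 2685 | 4.4%
(character lost) | 2428 | 4.0%
(character lost) | 1865 | 3.1%
(character lost) | 1713 | 2.8%
Other values (97) | 31205 | 51.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 60515 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(character lost) | 4541 | 7.5%
(character lost) | 3917 | 6.5%
(character lost) | 3111 | 5.1%
(character lost) | 3055 | 5.0%
(character lost) | 3030 | 5.0%
(character lost) | 2965 | 4.9%
(character lost) | 2685 | 4.4%
(character lost) | 2428 | 4.0%
(character lost) | 1865 | 3.1%
(character lost) | 1713 | 2.8%
Other values (97) | 31205 | 51.6%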

년월일
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
202207 | 10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202207
2nd row: 202207
3rd row: 202207
4th row: 202207
5th row: 202207

Common Values

Value | Count | Frequency (%)
202207 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
202207 | 10000 | 100.0%

금액
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 7174
Distinct (%): 71.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 77774383
Minimum: -4.09024 × 10^9
Maximum: 1.9396992 × 10^10
Zeros: 2504
Zeros (%): 25.0%
Negative: 350
Negative (%): 3.5%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -4.09024 × 10^9
5-th percentile: 0
Q1: 0
Median: 2766160.5
Q3: 36842597
95-th percentile: 3.6708453 × 10^8
Maximum: 1.9396992 × 10^10
Range: 2.3487232 × 10^10
Interquartile range (IQR): 36842597

Descriptive statistics

Standard deviation: 3.5903193 × 10^8
Coefficient of variation (CV): 4.6163264
Kurtosis: 916.08661
Mean: 77774383
Median Absolute Deviation (MAD): 2766160.5
Skewness: 21.578621
Sum: 7.7774383 × 10^11
Variance: 1.2890393 × 10^17
Monotonicity: Not monotonic
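
Several of these figures follow arithmetically from others. The sketch below is a minimal consistency check that uses only the rounded values printed in this report, so small rounding differences are expected. The variable names are illustrative, and the formulas are the standard textbook definitions rather than code taken from the profiling tool.

```python
import math

# Rounded values copied from the 금액 statistics in this report.
n = 10_000
mean = 77_774_383
std = 3.5903193e8
minimum = -4.09024e9
maximum = 1.9396992e10
q1, q3 = 0, 36_842_597

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range
total = mean * n                 # sum of all observations

print(f"CV       = {cv:.4f}")
print(f"Variance = {variance:.4e}")
print(f"Range    = {value_range:.4e}")
print(f"IQR      = {iqr}")
print(f"Sum      = {total:.4e}")
```

Because Q1 is 0, the interquartile range equals Q3, which matches the reported value of 36842597.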
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 2504 | 25.0%
500000 | 32 | 0.3%
250000 | 22 | 0.2%
300000 | 19 | 0.2%
484000 | 14 | 0.1%
1000000 | 12 | 0.1%
10000000 | 12 | 0.1%
20000000 | 11 | 0.1%
5000000 | 11 | 0.1%
100000 | 11 | 0.1%
Other values (7164) | 7352 | 73.5%

Minimum 10 values
Value | Count | Frequency (%)
-4090240000 | 1 | < 0.1%
-328099290 | 1 | < 0.1%
-321546036 | 1 | < 0.1%
-163222500 | 1 | < 0.1%
-93463340 | 1 | < 0.1%
-89099256 | 1 | < 0.1%
-85732520 | 1 | < 0.1%
-85361160 | 1 | < 0.1%
-83889710 | 1 | < 0.1%
-81999544 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
19396992307 | 1 | < 0.1%
7637948975 | 1 | < 0.1%
6062532886 | 1 | < 0.1%
5551857570 | 1 | < 0.1%
5451557816 | 1 | < 0.1%
5260505618 | 1 | < 0.1%
5168539258 | 1 | < 0.1%
4877444847 | 1 | < 0.1%
4809640569 | 1 | < 0.1%
4668714394 | 1 | < 0.1%

Interactions


Correlations

 | 비용명 | 금액
비용명 | 1.000 | 0.504
금액 | 0.504 | 1.000
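
The report does not state which coefficient produced this matrix. As far as I understand, the default "auto" setting of ydata-profiling bins numeric columns and uses Cramér's V when one of the two columns is categorical, so the sketch below assumes that measure. It shows the plain (not bias-corrected) Cramér's V using only the standard library; `cramers_v` and the toy pairs are hypothetical and may differ from what the tool actually computes.

```python
import math
from collections import Counter

def cramers_v(pairs):
    """Uncorrected Cramér's V for a list of (x, y) category pairs."""
    n = len(pairs)
    row_totals = Counter(x for x, _ in pairs)
    col_totals = Counter(y for _, y in pairs)
    observed = Counter(pairs)  # missing cells count as 0
    chi2 = 0.0
    for x, row_n in row_totals.items():
        for y, col_n in col_totals.items():
            expected = row_n * col_n / n
            chi2 += (observed[(x, y)] - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Toy data (hypothetical): the label fully determines the amount bin.
perfect = [("a", "low"), ("a", "low"), ("b", "high"), ("b", "high")]
# Toy data (hypothetical): label and amount bin are unrelated.
unrelated = [("a", "low"), ("a", "high"), ("b", "low"), ("b", "high")]
print(cramers_v(perfect))    # 1.0
print(cramers_v(unrelated))  # 0.0
```

On this scale a value of 0.504 would indicate a moderate association between the expense item and the binned amount.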

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
49867 | 원효산호 | A14085002 | 기타당좌자산 | 202207 | 0
65080 | 우장산한화꿈에그린 | A15701004 | 선급비용 | 202207 | 1694900
56404 | 신림동부 | A15101101 | 가지급금 | 202207 | 544930
52567 | 구의현대6단지 | A14383203 | 퇴직급여충당부채 | 202207 | 50903701
28655 | 천호삼익 | A13486701 | 가지급금 | 202207 | 3046200
34221 | 길음SHVILLE | A13611009 | 상여충당부채 | 202207 | 0
53035 | 여의도삼부 | A15001020 | 퇴직급여충당예금 | 202207 | 353547483
42719 | 풍납 현대리버빌1차 | A13887405 | 공동주택적립금 | 202207 | 8952730
62523 | 대방경남아너스빌 | A15602001 | 당기순이익 | 202207 | 7935531
71749 | 신정5차현대 | A15886504 | 비품 | 202207 | 637000

Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
35823 | 정릉우정에쉐르 | A13677807 | 관리비미수금 | 202207 | 12158020
63082 | 노량진쌍용예가 | A15605003 | 선수전기료 | 202207 | 0
33729 | 정릉산장 | A13610004 | 주차장충당예금 | 202207 | 8072802
54387 | 양평삼성래미안 | A15010202 | 당기순이익 | 202207 | 19526424
60181 | 신구로현대 | A15283902 | 비품 | 202207 | 5620000
44069 | 중계한화꿈에그린 | A13922905 | 상여충당부채 | 202207 | 0
47050 | 월계사슴2단지 | A13984409 | 선급비용 | 202207 | 0
43772 | 중계3벽산 | A13922103 | 비품 | 202207 | 4555140
15087 | 북한산래미안 | A12275201 | 비품감가상각누계액 | 202207 | -31344190
7084 | 래미안프레비뉴 | A10027755 | 기타재고자산 | 202207 | 251175744