Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
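
The last two figures are consistent with each other: dividing the total in-memory size by the number of observations gives the average record size. A minimal sketch of that arithmetic, using only the values reported above (the variable names are illustrative):

```python
# Values copied from the dataset statistics above.
total_size_kib = 488.3
n_observations = 10_000

# 1 KiB = 1024 bytes, so convert to bytes before dividing by the row count.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")
```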

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15820/S/1/datasetView.do

Alerts

년월일 has constant value "202210" (Constant)
금액 has 2320 (23.2%) zeros (Zeros)

Reproduction

Analysis started: 2024-05-11 05:57:29.000512
Analysis finished: 2024-05-11 05:57:30.229668
Duration: 1.23 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명 (apartment name)
Text

Distinct: 2232
Distinct (%): 22.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 20
Mean length: 7.4089
Min length: 2

Characters and Unicode

Total characters: 74089
Distinct characters: 435
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
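
As an illustration of such properties, Python's standard `unicodedata` module reports the general category of each code point ("Lo" is Other Letter, "Nd" is Decimal Number). The sketch below counts categories over the five sample values listed for this variable. It is not how the report itself was computed, and `category_counts` is a hypothetical helper. The standard library does not expose script or block properties, so only categories are shown.

```python
import unicodedata
from collections import Counter

# The five sample values of this variable, as listed in the report.
samples = [
    "래미안서초유니빌",
    "상계은빛1단지",
    "송파파인타운6단지",
    "상암월드컵파크7단지",
    "신정푸른마을2단지",
]

def category_counts(values):
    """Count Unicode general categories (e.g. 'Lo', 'Nd') over all characters."""
    counts = Counter()
    for value in values:
        # Normalise to NFC so each Hangul syllable is a single code point.
        for ch in unicodedata.normalize("NFC", value):
            counts[unicodedata.category(ch)] += 1
    return counts

counts = category_counts(samples)
print(counts)
```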

Unique

Unique: 124
Unique (%): 1.2%

Sample

1st row: 래미안서초유니빌
2nd row: 상계은빛1단지
3rd row: 송파파인타운6단지
4th row: 상암월드컵파크7단지
5th row: 신정푸른마을2단지

Value | Count | Frequency (%)
아파트 | 170 | 1.6%
래미안 | 51 | 0.5%
e편한세상 | 33 | 0.3%
아이파크 | 24 | 0.2%
sk뷰 | 23 | 0.2%
고덕 | 17 | 0.2%
송파 | 15 | 0.1%
꿈의숲 | 14 | 0.1%
신반포 | 14 | 0.1%
보라매 | 13 | 0.1%
Other values (2317) | 10448 | 96.5%
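
The counts above appear to be per word rather than per value: the entries are lower-cased (for example `sk뷰`), and the counts add up to more than the 10000 rows. A minimal sketch of that kind of count, assuming simple whitespace tokenisation; the report does not say how the profiler tokenises, and `word_counts` is a hypothetical helper:

```python
from collections import Counter

def word_counts(values):
    """Count lower-cased, whitespace-separated tokens across all values."""
    counts = Counter()
    for value in values:
        counts.update(value.lower().split())
    return counts

# Three apartment names taken from the sample rows at the end of this report.
names = [
    "신내역 금강펜테리움 센트럴파크아파트",
    "북한산힐스테이트7차제2 (임대)",
    "목동파크자이아파트",
]
counts = word_counts(names)
print(counts.most_common(3))
```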

Most occurring characters

ValueCountFrequency (%)
2520
 
3.4%
2502
 
3.4%
2347
 
3.2%
1876
 
2.5%
1752
 
2.4%
1636
 
2.2%
1492
 
2.0%
1456
 
2.0%
1432
 
1.9%
1377
 
1.9%
Other values (425) 55699
75.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 67800 | 91.5%
Decimal Number | 3605 | 4.9%
Space Separator | 907 | 1.2%
Uppercase Letter | 902 | 1.2%
Lowercase Letter | 367 | 0.5%
Open Punctuation | 137 | 0.2%
Close Punctuation | 137 | 0.2%
Dash Punctuation | 120 | 0.2%
Other Punctuation | 109 | 0.1%
Letter Number | 5 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2520
 
3.7%
2502
 
3.7%
2347
 
3.5%
1876
 
2.8%
1752
 
2.6%
1636
 
2.4%
1492
 
2.2%
1456
 
2.1%
1432
 
2.1%
1377
 
2.0%
Other values (380) 49410
72.9%

Uppercase Letter
Value | Count | Frequency (%)
S | 163 | 18.1%
C | 113 | 12.5%
K | 108 | 12.0%
D | 85 | 9.4%
M | 85 | 9.4%
H | 64 | 7.1%
L | 51 | 5.7%
I | 48 | 5.3%
E | 47 | 5.2%
V | 38 | 4.2%
Other values (7) | 100 | 11.1%

Lowercase Letter
Value | Count | Frequency (%)
e | 207 | 56.4%
l | 34 | 9.3%
i | 33 | 9.0%
v | 24 | 6.5%
k | 22 | 6.0%
s | 20 | 5.4%
w | 15 | 4.1%
c | 8 | 2.2%
h | 2 | 0.5%
g | 1 | 0.3%

Decimal Number
Value | Count | Frequency (%)
1 | 1099 | 30.5%
2 | 1031 | 28.6%
3 | 444 | 12.3%
4 | 281 | 7.8%
5 | 227 | 6.3%
6 | 142 | 3.9%
7 | 135 | 3.7%
8 | 104 | 2.9%
9 | 83 | 2.3%
0 | 59 | 1.6%

Other Punctuation
Value | Count | Frequency (%)
, | 80 | 73.4%
. | 29 | 26.6%

Space Separator
Value | Count | Frequency (%)
(space) | 907 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 137 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 137 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 120 | 100.0%

Letter Number
ValueCountFrequency (%)
5
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 67800 | 91.5%
Common | 5015 | 6.8%
Latin | 1274 | 1.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2520
 
3.7%
2502
 
3.7%
2347
 
3.5%
1876
 
2.8%
1752
 
2.6%
1636
 
2.4%
1492
 
2.2%
1456
 
2.1%
1432
 
2.1%
1377
 
2.0%
Other values (380) 49410
72.9%
Latin
ValueCountFrequency (%)
e 207
16.2%
S 163
12.8%
C 113
 
8.9%
K 108
 
8.5%
D 85
 
6.7%
M 85
 
6.7%
H 64
 
5.0%
L 51
 
4.0%
I 48
 
3.8%
E 47
 
3.7%
Other values (19) 303
23.8%
Common
ValueCountFrequency (%)
1 1099
21.9%
2 1031
20.6%
907
18.1%
3 444
8.9%
4 281
 
5.6%
5 227
 
4.5%
6 142
 
2.8%
( 137
 
2.7%
) 137
 
2.7%
7 135
 
2.7%
Other values (6) 475
9.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 67800 | 91.5%
ASCII | 6284 | 8.5%
Number Forms | 5 | < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2520
 
3.7%
2502
 
3.7%
2347
 
3.5%
1876
 
2.8%
1752
 
2.6%
1636
 
2.4%
1492
 
2.2%
1456
 
2.1%
1432
 
2.1%
1377
 
2.0%
Other values (380) 49410
72.9%
ASCII
ValueCountFrequency (%)
1 1099
17.5%
2 1031
16.4%
907
14.4%
3 444
 
7.1%
4 281
 
4.5%
5 227
 
3.6%
e 207
 
3.3%
S 163
 
2.6%
6 142
 
2.3%
( 137
 
2.2%
Other values (34) 1646
26.2%
Number Forms
ValueCountFrequency (%)
5
100.0%

아파트코드 (apartment code)
Text

Distinct: 2237
Distinct (%): 22.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 125
Unique (%): 1.2%

Sample

1st row: A13707010
2nd row: A13983816
3rd row: A13876108
4th row: A12127005
5th row: A15886508

Value | Count | Frequency (%)
a13922114 | 13 | 0.1%
a13519001 | 12 | 0.1%
a15180705 | 12 | 0.1%
a13187302 | 12 | 0.1%
a13671207 | 11 | 0.1%
a13611006 | 11 | 0.1%
a41279909 | 11 | 0.1%
a13984005 | 11 | 0.1%
a13771601 | 11 | 0.1%
a14007002 | 11 | 0.1%
Other values (2227) | 9885 | 98.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 18554 | 20.6%
1 | 17522 | 19.5%
A | 9997 | 11.1%
3 | 8690 | 9.7%
2 | 8387 | 9.3%
5 | 6228 | 6.9%
8 | 5598 | 6.2%
7 | 4738 | 5.3%
4 | 3971 | 4.4%
6 | 3323 | 3.7%
Other values (2) | 2992 | 3.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18554 | 23.2%
1 | 17522 | 21.9%
3 | 8690 | 10.9%
2 | 8387 | 10.5%
5 | 6228 | 7.8%
8 | 5598 | 7.0%
7 | 4738 | 5.9%
4 | 3971 | 5.0%
6 | 3323 | 4.2%
9 | 2989 | 3.7%

Uppercase Letter
Value | Count | Frequency (%)
A | 9997 | > 99.9%
B | 3 | < 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 80000
88.9%
Latin 10000
 
11.1%

Most frequent character per script

Common
ValueCountFrequency (%)
0 18554
23.2%
1 17522
21.9%
3 8690
10.9%
2 8387
10.5%
5 6228
 
7.8%
8 5598
 
7.0%
7 4738
 
5.9%
4 3971
 
5.0%
6 3323
 
4.2%
9 2989
 
3.7%
Latin
ValueCountFrequency (%)
A 9997
> 99.9%
B 3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 90000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 18554
20.6%
1 17522
19.5%
A 9997
11.1%
3 8690
9.7%
2 8387
9.3%
5 6228
 
6.9%
8 5598
 
6.2%
7 4738
 
5.3%
4 3971
 
4.4%
6 3323
 
3.7%
Other values (2) 2992
 
3.3%

비용명 (cost item name)
Text

Distinct: 77
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 10
Mean length: 6.0013
Min length: 2

Characters and Unicode

Total characters: 60013
Distinct characters: 107
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 승강기유지비충당부채
2nd row: 예금
3rd row: 연차수당충당부채
4th row: 기타충당부채
5th row: 예수금

Value | Count | Frequency (%)
공동주택적립금 | 345 | 3.5%
예금 | 329 | 3.3%
관리비미수금 | 325 | 3.2%
장기수선충당부채 | 317 | 3.2%
선급비용 | 317 | 3.2%
미처분이익잉여금 | 314 | 3.1%
연차수당충당부채 | 305 | 3.0%
비품 | 303 | 3.0%
당기순이익 | 302 | 3.0%
퇴직급여충당부채 | 301 | 3.0%
Other values (67) | 6842 | 68.4%

Most occurring characters

ValueCountFrequency (%)
4602
 
7.7%
3887
 
6.5%
3136
 
5.2%
3076
 
5.1%
3035
 
5.1%
2945
 
4.9%
2649
 
4.4%
2526
 
4.2%
1911
 
3.2%
1761
 
2.9%
Other values (97) 30485
50.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 60013
100.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4602
 
7.7%
3887
 
6.5%
3136
 
5.2%
3076
 
5.1%
3035
 
5.1%
2945
 
4.9%
2649
 
4.4%
2526
 
4.2%
1911
 
3.2%
1761
 
2.9%
Other values (97) 30485
50.8%

Most occurring scripts

ValueCountFrequency (%)
Hangul 60013
100.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4602
 
7.7%
3887
 
6.5%
3136
 
5.2%
3076
 
5.1%
3035
 
5.1%
2945
 
4.9%
2649
 
4.4%
2526
 
4.2%
1911
 
3.2%
1761
 
2.9%
Other values (97) 30485
50.8%

Most occurring blocks

ValueCountFrequency (%)
Hangul 60013
100.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
4602
 
7.7%
3887
 
6.5%
3136
 
5.2%
3076
 
5.1%
3035
 
5.1%
2945
 
4.9%
2649
 
4.4%
2526
 
4.2%
1911
 
3.2%
1761
 
2.9%
Other values (97) 30485
50.8%

년월일 (year-month-day)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
202210: 10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202210
2nd row: 202210
3rd row: 202210
4th row: 202210
5th row: 202210

Common Values

Value | Count | Frequency (%)
202210 | 10000 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; 202210 accounts for all 10000 rows (100.0%)]

금액 (amount)
Real number (ℝ)

ZEROS 

Distinct: 7366
Distinct (%): 73.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 77653346
Minimum: -2.5519501 × 10^9
Maximum: 9.0421074 × 10^9
Zeros: 2320
Zeros (%): 23.2%
Negative: 335
Negative (%): 3.4%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -2.5519501 × 10^9
5-th percentile: 0
Q1: 0
Median: 3563675
Q3: 38970549
95-th percentile: 3.869958 × 10^8
Maximum: 9.0421074 × 10^9
Range: 1.1594058 × 10^10
Interquartile range (IQR): 38970549
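
The range and the interquartile range follow directly from the other quantile statistics. A short sketch that recomputes both, using the exact extreme values listed later in this report and the quartiles above (the variable names are illustrative):

```python
# Exact extremes as listed among the smallest and largest values in this report,
# and the quartiles from the quantile statistics.
minimum = -2_551_950_146
maximum = 9_042_107_364
q1 = 0
q3 = 38_970_549

value_range = maximum - minimum  # spread between the two extremes
iqr = q3 - q1                    # spread of the middle 50% of values

print(f"range = {value_range:.7e}")
print(f"IQR   = {iqr}")
```

Because Q1 is 0, the IQR is equal to Q3.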

Descriptive statistics

Standard deviation: 2.9269846 × 10^8
Coefficient of variation (CV): 3.7692961
Kurtosis: 189.46206
Mean: 77653346
Median Absolute Deviation (MAD): 3563675
Skewness: 10.666021
Sum: 7.7653346 × 10^11
Variance: 8.5672386 × 10^16
Monotonicity: Not monotonic
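
Several of these figures are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the square of the standard deviation, and the sum is the mean times the number of observations. A sketch that checks them against the rounded values in this report (small differences in the last digits are expected because the inputs are already rounded):

```python
# Values copied from this report; the standard deviation is already rounded.
mean = 77_653_346
std = 2.9269846e8
n = 10_000

cv = std / mean      # coefficient of variation
variance = std ** 2  # variance is the square of the standard deviation
total = mean * n     # sum of all values

print(f"CV       = {cv:.4f}")
print(f"variance = {variance:.4e}")
print(f"sum      = {total:.7e}")
```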
[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value | Count | Frequency (%)
0 | 2320 | 23.2%
500000 | 27 | 0.3%
250000 | 17 | 0.2%
200000 | 16 | 0.2%
300000 | 13 | 0.1%
484000 | 12 | 0.1%
242000 | 12 | 0.1%
2000000 | 11 | 0.1%
20000000 | 10 | 0.1%
1000000 | 10 | 0.1%
Other values (7356) | 7552 | 75.5%

Minimum 10 values

Value | Count | Frequency (%)
-2551950146 | 1 | < 0.1%
-330224456 | 1 | < 0.1%
-304675700 | 1 | < 0.1%
-300272334 | 1 | < 0.1%
-269170920 | 1 | < 0.1%
-199705516 | 1 | < 0.1%
-190422700 | 1 | < 0.1%
-173712590 | 1 | < 0.1%
-156589520 | 1 | < 0.1%
-156250434 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
9042107364 | 1 | < 0.1%
6454311838 | 1 | < 0.1%
5406157430 | 1 | < 0.1%
5200430666 | 1 | < 0.1%
5046139742 | 1 | < 0.1%
4841341082 | 1 | < 0.1%
4797118437 | 1 | < 0.1%
4429098711 | 1 | < 0.1%
4250990076 | 1 | < 0.1%
3815992140 | 1 | < 0.1%

Interactions


Correlations

 | 비용명 | 금액
비용명 | 1.000 | 0.346
금액 | 0.346 | 1.000
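
The value 0.346 relates a text variable (비용명) to a numeric one (금액), and the report does not state which association measure produced it. One measure commonly used for categorical pairs is Cramér's V, computed from a contingency table; a numeric variable would first have to be binned. The sketch below shows the uncorrected statistic under that assumption, on toy data. `cramers_v` is a hypothetical helper, not the profiler's implementation, and it is not expected to reproduce 0.346.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # observed cell counts
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    for x, rx in row_totals.items():
        for y, cy in col_totals.items():
            expected = rx * cy / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Toy data (not from the dataset): the amount bucket is fully determined by the item.
items = ["예금", "예금", "비품", "비품"]
buckets = ["high", "high", "low", "low"]
print(cramers_v(items, buckets))
```

A value near 1 means one variable almost determines the other, and a value near 0 means they are close to independent.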

Missing values

[Figure: a simple visualization of nullity by column]
[Figure: nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion]

Sample

(index) | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
36469 | 래미안서초유니빌 | A13707010 | 승강기유지비충당부채 | 202210 | 0
46523 | 상계은빛1단지 | A13983816 | 예금 | 202210 | 290489338
41056 | 송파파인타운6단지 | A13876108 | 연차수당충당부채 | 202210 | 7900740
12224 | 상암월드컵파크7단지 | A12127005 | 기타충당부채 | 202210 | 157597642
71227 | 신정푸른마을2단지 | A15886508 | 예수금 | 202210 | 2873450
43610 | 중계주공4단지 | A13922406 | 미지급금 | 202210 | 155929886
53999 | 양평삼호 | A15010304 | 기타의비유동자산 | 202210 | 250000
43647 | 중계대림벽산 | A13922903 | 선수금 | 202210 | 0
22415 | 도봉파크빌2단지 | A13275303 | 선급금 | 202210 | 552460
30337 | 일원샘터마을 | A13523004 | 기타유동부채 | 202210 | 87719110

(index) | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
24907 | 어울림더리버아파트 | A13375906 | 관리비예치금 | 202210 | 74682000
45123 | 공릉풍림아이원 | A13980513 | 저장품 | 202210 | 148510
61312 | 독산한신 | A15383307 | 선수관리비 | 202210 | 159584000
7636 | 강동역신동아파밀리에 | A10027948 | 가지급금 | 202210 | 101754
7937 | 북한산힐스테이트7차제2 (임대) | A10028056 | 기타충당부채 | 202210 | 0
39867 | 송파동부센트레빌 | A13816101 | 연차수당충당부채 | 202210 | 2439930
23504 | 창동한신 | A13292002 | 수선유지비충당부채 | 202210 | 2848610
881 | 신내역 금강펜테리움 센트럴파크아파트 | A10024214 | 비품 | 202210 | 39971500
62034 | 상도동원베네스트 | A15603001 | 기타당좌자산 | 202210 | 0
3951 | 목동파크자이아파트 | A10025729 | 현금 | 202210 | 1180214