Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download (파일 다운로드)
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15820/S/1/datasetView.do

Alerts

년월일 has constant value "201905" (Constant)
금액 has 2084 (20.8%) zeros (Zeros)

Reproduction

Analysis started: 2024-05-11 06:01:44.694491
Analysis finished: 2024-05-11 06:01:46.227522
Duration: 1.53 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명 (apartment name)
Text

Distinct: 2109
Distinct (%): 21.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 22
Median length: 20
Mean length: 7.1287
Min length: 2

Characters and Unicode

Total characters: 71287
Distinct characters: 430
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
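
As a rough illustration of how the category counts in this section can be obtained, the sketch below tallies the Unicode general category of every character using Python's standard `unicodedata` module. The helper name `category_counts` and the three sample strings (taken from the Sample rows of this variable) are illustrative choices; the report itself does not show how it computes these figures.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Nl": "Letter Number",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Sm": "Math Symbol",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the two-letter code for categories not listed above.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Three values from the Sample rows of this variable.
sample = ["염창동아3차", "목동5단지", "래미안서초유니빌"]
counts = category_counts(sample)
for name, count in counts.most_common():
    print(name, count)
```

Hangul syllables fall under "Other Letter", which is why that category dominates the tables that follow. The standard library exposes no script or block property, so those two breakdowns would need a separate lookup table.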

Unique

Unique: 104
Unique (%): 1.0%
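
Distinct and Unique measure different things: Distinct counts how many different values occur, while Unique counts the values that occur exactly once. A minimal sketch of the two counts follows; the function name `distinct_and_unique` and the toy list are hypothetical and not taken from the dataset.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return distinct/unique counts and their share of all observations."""
    counts = Counter(values)
    n = len(values)
    distinct = len(counts)                               # different values
    unique = sum(1 for c in counts.values() if c == 1)   # values seen exactly once
    return {
        "distinct": distinct,
        "distinct_pct": 100 * distinct / n if n else 0.0,
        "unique": unique,
        "unique_pct": 100 * unique / n if n else 0.0,
    }


# Toy input: two names repeat, two appear once.
names = ["래미안", "래미안", "한남하이츠", "목동5단지", "목동5단지", "염창동아3차"]
stats = distinct_and_unique(names)
print(stats)
```

With the figures reported here, 2109 distinct values out of 10000 observations gives 21.1%, and 104 values occurring once gives 1.0%.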

Sample

1st row: 염창동아3차
2nd row: 한남하이츠
3rd row: 목동5단지
4th row: 구로한일유엔아이
5th row: 래미안서초유니빌
Value | Count | Frequency (%)
아파트 | 101 | 1.0%
래미안 | 26 | 0.2%
왕십리 | 14 | 0.1%
올림픽파크한양수자인 | 14 | 0.1%
우리유앤미 | 13 | 0.1%
힐스테이트 | 13 | 0.1%
은평뉴타운상림마을6단지 | 13 | 0.1%
대치동부센트레빌 | 12 | 0.1%
송천센트레빌 | 12 | 0.1%
경남아너스빌 | 12 | 0.1%
Other values (2163) | 10194 | 97.8%

Most occurring characters

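Value | Count | Frequency (%)
(missing) | 2209 | 3.1%
(missing) | 2070 | 2.9%
(missing) | 1906 | 2.7%
(missing) | 1872 | 2.6%
(missing) | 1859 | 2.6%
(missing) | 1720 | 2.4%
(missing) | 1568 | 2.2%
(missing) | 1494 | 2.1%
(missing) | 1486 | 2.1%
(missing) | 1391 | 2.0%
Other values (420) | 53712 | 75.3%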

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 65504 | 91.9%
Decimal Number | 3925 | 5.5%
Uppercase Letter | 596 | 0.8%
Space Separator | 451 | 0.6%
Lowercase Letter | 295 | 0.4%
Dash Punctuation | 147 | 0.2%
Close Punctuation | 128 | 0.2%
Open Punctuation | 128 | 0.2%
Other Punctuation | 107 | 0.2%
Letter Number | 3 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(missing) | 2209 | 3.4%
(missing) | 2070 | 3.2%
(missing) | 1906 | 2.9%
(missing) | 1872 | 2.9%
(missing) | 1859 | 2.8%
(missing) | 1720 | 2.6%
(missing) | 1568 | 2.4%
(missing) | 1494 | 2.3%
(missing) | 1486 | 2.3%
(missing) | 1391 | 2.1%
Other values (374) | 47929 | 73.2%

Uppercase Letter
Value | Count | Frequency (%)
S | 109 | 18.3%
K | 80 | 13.4%
L | 58 | 9.7%
C | 52 | 8.7%
H | 47 | 7.9%
I | 37 | 6.2%
E | 34 | 5.7%
G | 32 | 5.4%
M | 28 | 4.7%
D | 28 | 4.7%
Other values (7) | 91 | 15.3%

Lowercase Letter
Value | Count | Frequency (%)
e | 183 | 62.0%
l | 32 | 10.8%
i | 25 | 8.5%
v | 19 | 6.4%
c | 8 | 2.7%
k | 7 | 2.4%
s | 6 | 2.0%
w | 6 | 2.0%
h | 3 | 1.0%
a | 3 | 1.0%

Decimal Number
Value | Count | Frequency (%)
2 | 1185 | 30.2%
1 | 1174 | 29.9%
3 | 484 | 12.3%
4 | 268 | 6.8%
5 | 220 | 5.6%
6 | 166 | 4.2%
7 | 126 | 3.2%
0 | 115 | 2.9%
8 | 101 | 2.6%
9 | 86 | 2.2%

Other Punctuation
Value | Count | Frequency (%)
, | 88 | 82.2%
. | 19 | 17.8%

Space Separator
Value | Count | Frequency (%)
(space) | 451 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 147 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 128 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 128 | 100.0%

Letter Number
Value | Count | Frequency (%)
(missing) | 3 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 65504 | 91.9%
Common | 4889 | 6.9%
Latin | 894 | 1.3%

Most frequent character per script

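Hangul
Value | Count | Frequency (%)
(missing) | 2209 | 3.4%
(missing) | 2070 | 3.2%
(missing) | 1906 | 2.9%
(missing) | 1872 | 2.9%
(missing) | 1859 | 2.8%
(missing) | 1720 | 2.6%
(missing) | 1568 | 2.4%
(missing) | 1494 | 2.3%
(missing) | 1486 | 2.3%
(missing) | 1391 | 2.1%
Other values (374) | 47929 | 73.2%

Latin
Value | Count | Frequency (%)
e | 183 | 20.5%
S | 109 | 12.2%
K | 80 | 8.9%
L | 58 | 6.5%
C | 52 | 5.8%
H | 47 | 5.3%
I | 37 | 4.1%
E | 34 | 3.8%
l | 32 | 3.6%
G | 32 | 3.6%
Other values (19) | 230 | 25.7%

Common
Value | Count | Frequency (%)
2 | 1185 | 24.2%
1 | 1174 | 24.0%
3 | 484 | 9.9%
(space) | 451 | 9.2%
4 | 268 | 5.5%
5 | 220 | 4.5%
6 | 166 | 3.4%
- | 147 | 3.0%
) | 128 | 2.6%
( | 128 | 2.6%
Other values (7) | 538 | 11.0%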

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 65504 | 91.9%
ASCII | 5780 | 8.1%
Number Forms | 3 | < 0.1%

Most frequent character per block

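Hangul
Value | Count | Frequency (%)
(missing) | 2209 | 3.4%
(missing) | 2070 | 3.2%
(missing) | 1906 | 2.9%
(missing) | 1872 | 2.9%
(missing) | 1859 | 2.8%
(missing) | 1720 | 2.6%
(missing) | 1568 | 2.4%
(missing) | 1494 | 2.3%
(missing) | 1486 | 2.3%
(missing) | 1391 | 2.1%
Other values (374) | 47929 | 73.2%

ASCII
Value | Count | Frequency (%)
2 | 1185 | 20.5%
1 | 1174 | 20.3%
3 | 484 | 8.4%
(space) | 451 | 7.8%
4 | 268 | 4.6%
5 | 220 | 3.8%
e | 183 | 3.2%
6 | 166 | 2.9%
- | 147 | 2.5%
) | 128 | 2.2%
Other values (35) | 1374 | 23.8%

Number Forms
Value | Count | Frequency (%)
(missing) | 3 | 100.0%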

아파트코드 (apartment code)
Text

Distinct: 2116
Distinct (%): 21.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 104
Unique (%): 1.0%

Sample

1st row: A15786227
2nd row: A13375901
3rd row: A15805504
4th row: A15205104
5th row: A13707010
Value | Count | Frequency (%)
a10027354 | 14 | 0.1%
a14272313 | 12 | 0.1%
a13184401 | 12 | 0.1%
a13822004 | 12 | 0.1%
a15606007 | 12 | 0.1%
a13528103 | 12 | 0.1%
a13481305 | 11 | 0.1%
a13922114 | 11 | 0.1%
a15884703 | 11 | 0.1%
a15681106 | 11 | 0.1%
Other values (2106) | 9882 | 98.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 18184 | 20.2%
1 | 17675 | 19.6%
A | 9988 | 11.1%
3 | 8985 | 10.0%
2 | 7940 | 8.8%
5 | 6287 | 7.0%
8 | 5756 | 6.4%
7 | 4813 | 5.3%
4 | 3878 | 4.3%
6 | 3444 | 3.8%
Other values (2) | 3050 | 3.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18184 | 22.7%
1 | 17675 | 22.1%
3 | 8985 | 11.2%
2 | 7940 | 9.9%
5 | 6287 | 7.9%
8 | 5756 | 7.2%
7 | 4813 | 6.0%
4 | 3878 | 4.8%
6 | 3444 | 4.3%
9 | 3038 | 3.8%

Uppercase Letter
Value | Count | Frequency (%)
A | 9988 | 99.9%
B | 12 | 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 88.9%
Latin | 10000 | 11.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18184 | 22.7%
1 | 17675 | 22.1%
3 | 8985 | 11.2%
2 | 7940 | 9.9%
5 | 6287 | 7.9%
8 | 5756 | 7.2%
7 | 4813 | 6.0%
4 | 3878 | 4.8%
6 | 3444 | 4.3%
9 | 3038 | 3.8%

Latin
Value | Count | Frequency (%)
A | 9988 | 99.9%
B | 12 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18184 | 20.2%
1 | 17675 | 19.6%
A | 9988 | 11.1%
3 | 8985 | 10.0%
2 | 7940 | 8.8%
5 | 6287 | 7.0%
8 | 5756 | 6.4%
7 | 4813 | 5.3%
4 | 3878 | 4.3%
6 | 3444 | 3.8%
Other values (2) | 3050 | 3.4%

비용명 (cost item name)
Text

Distinct: 77
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 10
Mean length: 5.976
Min length: 2

Characters and Unicode

Total characters: 59760
Distinct characters: 107
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 연차수당충당부채
2nd row: 장기수선충당예금
3rd row: 기타시설운영충당부채
4th row: 세대배부용비품
5th row: 주차장충당부채
Value | Count | Frequency (%)
예금 | 339 | 3.4%
퇴직급여충당부채 | 330 | 3.3%
미처분이익잉여금 | 327 | 3.3%
예수금 | 326 | 3.3%
관리비미수금 | 323 | 3.2%
공동주택적립금 | 320 | 3.2%
연차수당충당부채 | 319 | 3.2%
선급비용 | 316 | 3.2%
장기수선충당예금 | 308 | 3.1%
당기순이익 | 299 | 3.0%
Other values (67) | 6793 | 67.9%

Most occurring characters

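Value | Count | Frequency (%)
(missing) | 4777 | 8.0%
(missing) | 3764 | 6.3%
(missing) | 3194 | 5.3%
(missing) | 3065 | 5.1%
(missing) | 2935 | 4.9%
(missing) | 2920 | 4.9%
(missing) | 2626 | 4.4%
(missing) | 2318 | 3.9%
(missing) | 1870 | 3.1%
(missing) | 1831 | 3.1%
Other values (97) | 30460 | 51.0%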

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 59760 | 100.0%

Most frequent character per category

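Other Letter
Value | Count | Frequency (%)
(missing) | 4777 | 8.0%
(missing) | 3764 | 6.3%
(missing) | 3194 | 5.3%
(missing) | 3065 | 5.1%
(missing) | 2935 | 4.9%
(missing) | 2920 | 4.9%
(missing) | 2626 | 4.4%
(missing) | 2318 | 3.9%
(missing) | 1870 | 3.1%
(missing) | 1831 | 3.1%
Other values (97) | 30460 | 51.0%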

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 59760 | 100.0%

Most frequent character per script

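Hangul
Value | Count | Frequency (%)
(missing) | 4777 | 8.0%
(missing) | 3764 | 6.3%
(missing) | 3194 | 5.3%
(missing) | 3065 | 5.1%
(missing) | 2935 | 4.9%
(missing) | 2920 | 4.9%
(missing) | 2626 | 4.4%
(missing) | 2318 | 3.9%
(missing) | 1870 | 3.1%
(missing) | 1831 | 3.1%
Other values (97) | 30460 | 51.0%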

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 59760 | 100.0%

Most frequent character per block

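Hangul
Value | Count | Frequency (%)
(missing) | 4777 | 8.0%
(missing) | 3764 | 6.3%
(missing) | 3194 | 5.3%
(missing) | 3065 | 5.1%
(missing) | 2935 | 4.9%
(missing) | 2920 | 4.9%
(missing) | 2626 | 4.4%
(missing) | 2318 | 3.9%
(missing) | 1870 | 3.1%
(missing) | 1831 | 3.1%
Other values (97) | 30460 | 51.0%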

년월일 (date)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 201905
2nd row: 201905
3rd row: 201905
4th row: 201905
5th row: 201905

Common Values

Value | Count | Frequency (%)
201905 | 10000 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
201905 | 10000 | 100.0%

금액 (amount)
Real number (ℝ)

ZEROS

Distinct: 7558
Distinct (%): 75.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 74584700
Minimum: -4.09024 × 10^9
Maximum: 9.6344252 × 10^9
Zeros: 2084
Zeros (%): 20.8%
Negative: 321
Negative (%): 3.2%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -4.09024 × 10^9
5-th percentile: 0
Q1: 14970
Median: 3483560
Q3: 32630885
95-th percentile: 3.5725721 × 10^8
Maximum: 9.6344252 × 10^9
Range: 1.3724665 × 10^10
Interquartile range (IQR): 32615915

Descriptive statistics

Standard deviation: 3.1935374 × 10^8
Coefficient of variation (CV): 4.2817594
Kurtosis: 259.34772
Mean: 74584700
Median Absolute Deviation (MAD): 3483560
Skewness: 12.614355
Sum: 7.45847 × 10^11
Variance: 1.0198681 × 10^17
Monotonicity: Not monotonic
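
Several of the figures above are derived from others, so they can be cross-checked with a few lines of arithmetic. The sketch below copies the reported values for 금액 and recomputes the range, IQR, coefficient of variation, variance, sum, and the zero and negative shares. The variable names are illustrative, and the inputs are the rounded values printed in the report, so the results agree only to the displayed precision.

```python
# Values copied from the report for the 금액 column.
n = 10000
mean = 74584700
std = 3.1935374e8
q1, q3 = 14970, 32630885
minimum, maximum = -4090240000, 9634425246
zeros, negative = 2084, 321

derived = {
    "range": maximum - minimum,         # reported: 1.3724665 × 10^10
    "iqr": q3 - q1,                     # reported: 32615915
    "cv": std / mean,                   # reported: 4.2817594
    "variance": std ** 2,               # reported: 1.0198681 × 10^17
    "sum": mean * n,                    # reported: 7.45847 × 10^11
    "zeros_pct": 100 * zeros / n,       # reported: 20.8%
    "negative_pct": 100 * negative / n, # reported: 3.2%
}

for name, value in derived.items():
    print(f"{name}: {value:.7g}")
```

A coefficient of variation above 4, together with skewness of about 12.6, indicates a heavily right-skewed column in which a small number of very large amounts dominate the mean.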
[Figure: Histogram with fixed size bins (bins=50)]
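
For readers unfamiliar with fixed-size binning, here is a minimal sketch of the idea: the interval from the minimum to the maximum is split into equally wide bins and each value is counted in the bin it falls into. The function name `fixed_width_histogram` is hypothetical, and the convention that the maximum lands in the last bin is an assumption; the report does not document its exact binning rules.

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equally wide intervals spanning min..max."""
    if not values:
        raise ValueError("values must not be empty")
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if lo == hi:
        # All values identical: put everything in the first bin.
        counts[0] = len(values)
        return counts
    width = (hi - lo) / bins
    for v in values:
        # The maximum would index one past the end, so clamp it to the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


# Toy input: the integers 0..10 split into 5 bins of width 2.
toy_counts = fixed_width_histogram(list(range(11)), bins=5)
print(toy_counts)
```

With a range of roughly 1.37 × 10^10 and 50 bins, each bin here spans about 2.7 × 10^8, so nearly all observations fall into one or two bins near zero.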
Common values
Value | Count | Frequency (%)
0 | 2084 | 20.8%
500000 | 29 | 0.3%
250000 | 27 | 0.3%
1000000 | 16 | 0.2%
20000000 | 14 | 0.1%
10000000 | 12 | 0.1%
300000 | 11 | 0.1%
200000 | 10 | 0.1%
484000 | 10 | 0.1%
5000000 | 10 | 0.1%
Other values (7548) | 7777 | 77.8%

Smallest values
Value | Count | Frequency (%)
-4090240000 | 1 | < 0.1%
-626368591 | 1 | < 0.1%
-262574210 | 1 | < 0.1%
-219951880 | 1 | < 0.1%
-161481980 | 1 | < 0.1%
-134212500 | 1 | < 0.1%
-132342706 | 1 | < 0.1%
-124188940 | 1 | < 0.1%
-122789896 | 1 | < 0.1%
-120098530 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
9634425246 | 1 | < 0.1%
8689507436 | 1 | < 0.1%
8398272240 | 1 | < 0.1%
6014828955 | 1 | < 0.1%
5785899932 | 1 | < 0.1%
5364257365 | 1 | < 0.1%
5189197570 | 1 | < 0.1%
5054589351 | 1 | < 0.1%
4739809247 | 1 | < 0.1%
4653135158 | 1 | < 0.1%

Interactions

[Figure: interactions plot]

Correlations

 | 비용명 | 금액
비용명 | 1.000 | 0.658
금액 | 0.658 | 1.000
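
The report does not state which coefficient produced the 0.658 between 비용명 and 금액. One common association measure for pairs that involve a non-numeric column is Cramér's V, computed from a contingency table after the numeric column has been binned. The sketch below shows the uncorrected textbook form purely as an illustration; treating it as the method used here is an assumption, and the function name `cramers_v` and the toy data are hypothetical.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long sequences of labels."""
    n = len(xs)
    pair_counts = Counter(zip(xs, ys))
    x_counts = Counter(xs)
    y_counts = Counter(ys)

    # Pearson chi-square statistic over every cell of the contingency table.
    chi2 = 0.0
    for x, nx in x_counts.items():
        for y, ny in y_counts.items():
            expected = nx * ny / n
            observed = pair_counts.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(x_counts), len(y_counts)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association
    return math.sqrt(chi2 / (n * k))


# Toy data: labels versus an already-binned numeric column.
labels = ["a", "a", "b", "b"]
perfect = cramers_v(labels, ["low", "low", "high", "high"])
independent = cramers_v(labels, ["low", "high", "low", "high"])
print(perfect, independent)
```

The measure ranges from 0 (no association) to 1 (one column fully determines the other), so a value near 0.66 would suggest that the amount depends fairly strongly on the cost item.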

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]
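
The two figures summarise where values are present or absent. A minimal sketch of the underlying computation follows, with rows represented as dictionaries; the helper names `nullity_matrix` and `missing_per_column` are hypothetical. The first row is taken from the Sample table, while the second row with a missing 금액 is invented for illustration, since this dataset reports 0 missing cells.

```python
def nullity_matrix(rows, columns):
    """One list of booleans per row: True where the cell holds a value."""
    return [[row.get(col) is not None for col in columns] for row in rows]


def missing_per_column(rows, columns):
    """Number of missing (None or absent) cells in each column."""
    return {
        col: sum(1 for row in rows if row.get(col) is None)
        for col in columns
    }


columns = ["아파트명", "금액"]
rows = [
    {"아파트명": "염창동아3차", "금액": 9839940},  # from the Sample table
    {"아파트명": "한남하이츠", "금액": None},      # hypothetical missing value
]

matrix = nullity_matrix(rows, columns)
missing = missing_per_column(rows, columns)
print(matrix)
print(missing)
```

For the dataset profiled here every cell of the matrix would be True and every per-column count would be 0.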

Sample

(index) | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
61585 | 염창동아3차 | A15786227 | 연차수당충당부채 | 201905 | 9839940
19775 | 한남하이츠 | A13375901 | 장기수선충당예금 | 201905 | 1922408794
62661 | 목동5단지 | A15805504 | 기타시설운영충당부채 | 201905 | 0
51492 | 구로한일유엔아이 | A15205104 | 세대배부용비품 | 201905 | 22000
31313 | 래미안서초유니빌 | A13707010 | 주차장충당부채 | 201905 | 3138653
44583 | 미아현대 | A14272307 | 수선유지비충당부채 | 201905 | 0
7112 | 토정한강삼성 | A12106001 | 미부과관리비 | 201905 | 88461466
38952 | 공릉화랑타운 | A13980010 | 미수금 | 201905 | 0
36428 | 거여현대2차 | A13881401 | 미처분이익잉여금 | 201905 | 0
7969 | 공덕래미안5차 | A12170603 | 장기수선충당부채적립금 | 201905 | 0

(index) | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
26076 | 래미안대치하이스턴 | A13528007 | 미부과관리비 | 201905 | 86018600
12395 | 래미안허브리츠 | A13070301 | 상여충당부채 | 201905 | 0
9697 | 응암금호 | A12201102 | 선급비용 | 201905 | 3753880
23383 | 강동현대홈타운 | A13485301 | 미처분이익잉여금 | 201905 | 16614240
11502 | 래미안위브 | A13003007 | 기타시설운영충당부채 | 201905 | 417606964
38642 | 상계미도 | A13971501 | 선급금 | 201905 | 3190
7939 | 공덕래미안5차 | A12170603 | 미수관리비예치금 | 201905 | 384000
39636 | 공릉대주파크빌 | A13980706 | 비품 | 201905 | 6160700
45542 | 자양우성7차 | A14319311 | 퇴직급여충당예금 | 201905 | 72881855
50082 | 삼성산주공3단지 | A15101506 | 비품 | 201905 | 39446000