Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
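
As a quick consistency check, the average record size should equal the total size in memory divided by the number of observations. The sketch below assumes 1 KiB = 1024 bytes; the variable names are illustrative and do not come from the report.

```python
# Figures copied from the dataset statistics above.
total_size_kib = 488.3
n_observations = 10_000

# Assumption: 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the result agrees with the reported 50.0 B.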

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15820/S/1/datasetView.do

Alerts

년월일 has constant value "201906" (Constant)
금액 has 2135 (21.3%) zeros (Zeros)
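
The two alerts above can be approximated with a few lines of plain Python. This is a minimal sketch, not ydata-profiling's implementation: the function name `column_alerts` and the `zero_share_threshold` cutoff are hypothetical, and the library's real thresholds are not shown in this report.

```python
def column_alerts(values, zero_share_threshold=0.10):
    """Return simple alert labels for one column of values.

    zero_share_threshold is a hypothetical cutoff chosen for illustration.
    """
    alerts = []

    # "Constant": every row holds the same value.
    if len(set(values)) == 1:
        alerts.append("Constant")

    # "Zeros": a large share of the numeric values is exactly zero.
    numeric = [v for v in values
               if isinstance(v, (int, float)) and not isinstance(v, bool)]
    if numeric:
        zero_share = sum(1 for v in numeric if v == 0) / len(numeric)
        if zero_share > zero_share_threshold:
            alerts.append("Zeros")

    return alerts


print(column_alerts(["201906"] * 5))                  # constant text column
print(column_alerts([0, 0, 5, 7, 9, 1, 2, 3, 4, 6]))  # 20% zeros
```

For the 금액 column, the zero share is 2135 / 10000 = 21.3%, which matches the alert text.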

Reproduction

Analysis started: 2024-05-11 06:01:34.948358
Analysis finished: 2024-05-11 06:01:36.664477
Duration: 1.72 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명
Text

Distinct: 2107
Distinct (%): 21.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 22
Median length: 20
Mean length: 7.1726
Min length: 2

Characters and Unicode

Total characters: 71726
Distinct characters: 429
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
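
To illustrate how such character properties can be read programmatically, the sketch below counts characters by Unicode general category using Python's standard `unicodedata` module. The helper name `category_counts` is illustrative. The two-letter codes map to the category names used in this report (for example `Lo` is Other Letter and `Nd` is Decimal Number). The standard module does not expose script or block properties, so those counts are not reproduced here.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters by Unicode general category (e.g. 'Lo', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)


# One of the sample values from the 아파트명 column.
counts = category_counts("대치우성1차아파트")
print(counts)
```

In this sample value, the eight Hangul syllables fall under `Lo` and the digit under `Nd`.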

Unique

Unique: 96
Unique (%): 1.0%

Sample

1st row: 정릉꿈에그린아파트
2nd row: 정릉쌍용
3rd row: 창전삼성임대
4th row: 대치우성1차아파트
5th row: 헬리오시티아파트

Value | Count | Frequency (%)
아파트 | 90 | 0.9%
래미안 | 17 | 0.2%
대치동부센트레빌 | 14 | 0.1%
경남아너스빌 | 14 | 0.1%
서울숲2차푸르지오임대 | 13 | 0.1%
상도삼호 | 12 | 0.1%
일원청솔빌리지 | 12 | 0.1%
신동아파밀리에 | 12 | 0.1%
대림코오롱 | 12 | 0.1%
남성두산위브트레지움 | 12 | 0.1%
Other values (2159) | 10217 | 98.0%

Most occurring characters

Value | Count | Frequency (%)
[?] | 2191 | 3.1%
[?] | 2165 | 3.0%
[?] | 1895 | 2.6%
[?] | 1887 | 2.6%
[?] | 1852 | 2.6%
[?] | 1650 | 2.3%
[?] | 1570 | 2.2%
[?] | 1552 | 2.2%
[?] | 1480 | 2.1%
[?] | 1386 | 1.9%
Other values (419) | 54098 | 75.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 65920 | 91.9%
Decimal Number | 3866 | 5.4%
Uppercase Letter | 682 | 1.0%
Space Separator | 459 | 0.6%
Lowercase Letter | 284 | 0.4%
Dash Punctuation | 139 | 0.2%
Other Punctuation | 123 | 0.2%
Close Punctuation | 122 | 0.2%
Open Punctuation | 122 | 0.2%
Letter Number | 9 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[?] | 2191 | 3.3%
[?] | 2165 | 3.3%
[?] | 1895 | 2.9%
[?] | 1887 | 2.9%
[?] | 1852 | 2.8%
[?] | 1650 | 2.5%
[?] | 1570 | 2.4%
[?] | 1552 | 2.4%
[?] | 1480 | 2.2%
[?] | 1386 | 2.1%
Other values (374) | 48292 | 73.3%

Uppercase Letter
Value | Count | Frequency (%)
S | 116 | 17.0%
K | 100 | 14.7%
C | 91 | 13.3%
L | 51 | 7.5%
I | 40 | 5.9%
E | 39 | 5.7%
D | 39 | 5.7%
M | 39 | 5.7%
H | 37 | 5.4%
A | 31 | 4.5%
Other values (7) | 99 | 14.5%

Lowercase Letter
Value | Count | Frequency (%)
e | 179 | 63.0%
i | 22 | 7.7%
l | 20 | 7.0%
v | 17 | 6.0%
s | 11 | 3.9%
k | 10 | 3.5%
w | 9 | 3.2%
c | 6 | 2.1%
h | 4 | 1.4%
a | 3 | 1.1%

Decimal Number
Value | Count | Frequency (%)
1 | 1199 | 31.0%
2 | 1168 | 30.2%
3 | 504 | 13.0%
4 | 252 | 6.5%
5 | 215 | 5.6%
6 | 150 | 3.9%
7 | 115 | 3.0%
9 | 95 | 2.5%
0 | 86 | 2.2%
8 | 82 | 2.1%

Other Punctuation
Value | Count | Frequency (%)
, | 104 | 84.6%
. | 19 | 15.4%

Space Separator
Value | Count | Frequency (%)
" " | 459 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 139 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 122 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 122 | 100.0%

Letter Number
Value | Count | Frequency (%)
[?] | 9 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 65920 | 91.9%
Common | 4831 | 6.7%
Latin | 975 | 1.4%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[?] | 2191 | 3.3%
[?] | 2165 | 3.3%
[?] | 1895 | 2.9%
[?] | 1887 | 2.9%
[?] | 1852 | 2.8%
[?] | 1650 | 2.5%
[?] | 1570 | 2.4%
[?] | 1552 | 2.4%
[?] | 1480 | 2.2%
[?] | 1386 | 2.1%
Other values (374) | 48292 | 73.3%

Latin
Value | Count | Frequency (%)
e | 179 | 18.4%
S | 116 | 11.9%
K | 100 | 10.3%
C | 91 | 9.3%
L | 51 | 5.2%
I | 40 | 4.1%
E | 39 | 4.0%
D | 39 | 4.0%
M | 39 | 4.0%
H | 37 | 3.8%
Other values (19) | 244 | 25.0%

Common
Value | Count | Frequency (%)
1 | 1199 | 24.8%
2 | 1168 | 24.2%
3 | 504 | 10.4%
" " | 459 | 9.5%
4 | 252 | 5.2%
5 | 215 | 4.5%
6 | 150 | 3.1%
- | 139 | 2.9%
) | 122 | 2.5%
( | 122 | 2.5%
Other values (6) | 501 | 10.4%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 65920 | 91.9%
ASCII | 5797 | 8.1%
Number Forms | 9 | < 0.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[?] | 2191 | 3.3%
[?] | 2165 | 3.3%
[?] | 1895 | 2.9%
[?] | 1887 | 2.9%
[?] | 1852 | 2.8%
[?] | 1650 | 2.5%
[?] | 1570 | 2.4%
[?] | 1552 | 2.4%
[?] | 1480 | 2.2%
[?] | 1386 | 2.1%
Other values (374) | 48292 | 73.3%

ASCII
Value | Count | Frequency (%)
1 | 1199 | 20.7%
2 | 1168 | 20.1%
3 | 504 | 8.7%
" " | 459 | 7.9%
4 | 252 | 4.3%
5 | 215 | 3.7%
e | 179 | 3.1%
6 | 150 | 2.6%
- | 139 | 2.4%
) | 122 | 2.1%
Other values (34) | 1410 | 24.3%

Number Forms
Value | Count | Frequency (%)
[?] | 9 | 100.0%

아파트코드
Text

Distinct: 2114
Distinct (%): 21.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 96
Unique (%): 1.0%

Sample

1st row: A10028000
2nd row: A13676501
3rd row: A12177802
4th row: A13583403
5th row: A10025850

Value | Count | Frequency (%)
a13528103 | 14 | 0.1%
a13523001 | 12 | 0.1%
a15678102 | 12 | 0.1%
a13204302 | 12 | 0.1%
a15081105 | 12 | 0.1%
a15677501 | 12 | 0.1%
a15792602 | 12 | 0.1%
a13002002 | 12 | 0.1%
a41279932 | 11 | 0.1%
a13817001 | 11 | 0.1%
Other values (2104) | 9880 | 98.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 18302 | 20.3%
1 | 17875 | 19.9%
A | 9992 | 11.1%
3 | 8998 | 10.0%
2 | 8088 | 9.0%
5 | 6209 | 6.9%
8 | 5707 | 6.3%
7 | 4730 | 5.3%
4 | 3775 | 4.2%
6 | 3357 | 3.7%
Other values (2) | 2967 | 3.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18302 | 22.9%
1 | 17875 | 22.3%
3 | 8998 | 11.2%
2 | 8088 | 10.1%
5 | 6209 | 7.8%
8 | 5707 | 7.1%
7 | 4730 | 5.9%
4 | 3775 | 4.7%
6 | 3357 | 4.2%
9 | 2959 | 3.7%

Uppercase Letter
Value | Count | Frequency (%)
A | 9992 | 99.9%
B | 8 | 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 88.9%
Latin | 10000 | 11.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18302 | 22.9%
1 | 17875 | 22.3%
3 | 8998 | 11.2%
2 | 8088 | 10.1%
5 | 6209 | 7.8%
8 | 5707 | 7.1%
7 | 4730 | 5.9%
4 | 3775 | 4.7%
6 | 3357 | 4.2%
9 | 2959 | 3.7%

Latin
Value | Count | Frequency (%)
A | 9992 | 99.9%
B | 8 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18302 | 20.3%
1 | 17875 | 19.9%
A | 9992 | 11.1%
3 | 8998 | 10.0%
2 | 8088 | 9.0%
5 | 6209 | 6.9%
8 | 5707 | 6.3%
7 | 4730 | 5.3%
4 | 3775 | 4.2%
6 | 3357 | 3.7%
Other values (2) | 2967 | 3.3%

비용명
Text

Distinct: 77
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 9
Mean length: 5.9718
Min length: 2

Characters and Unicode

Total characters: 59718
Distinct characters: 107
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 수선유지비충당부채
2nd row: 현금
3rd row: 현금
4th row: 수선유지비충당부채
5th row: 연차수당충당부채

Value | Count | Frequency (%)
예금 | 333 | 3.3%
연차수당충당부채 | 330 | 3.3%
당기순이익 | 325 | 3.2%
관리비미수금 | 322 | 3.2%
예수금 | 317 | 3.2%
가수금 | 317 | 3.2%
공동주택적립금 | 311 | 3.1%
현금 | 311 | 3.1%
장기수선충당부채 | 309 | 3.1%
퇴직급여충당부채 | 309 | 3.1%
Other values (67) | 6816 | 68.2%

Most occurring characters

Value | Count | Frequency (%)
[?] | 4736 | 7.9%
[?] | 3809 | 6.4%
[?] | 3296 | 5.5%
[?] | 3087 | 5.2%
[?] | 2989 | 5.0%
[?] | 2947 | 4.9%
[?] | 2696 | 4.5%
[?] | 2318 | 3.9%
[?] | 1898 | 3.2%
[?] | 1768 | 3.0%
Other values (97) | 30174 | 50.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 59718 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[?] | 4736 | 7.9%
[?] | 3809 | 6.4%
[?] | 3296 | 5.5%
[?] | 3087 | 5.2%
[?] | 2989 | 5.0%
[?] | 2947 | 4.9%
[?] | 2696 | 4.5%
[?] | 2318 | 3.9%
[?] | 1898 | 3.2%
[?] | 1768 | 3.0%
Other values (97) | 30174 | 50.5%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 59718 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[?] | 4736 | 7.9%
[?] | 3809 | 6.4%
[?] | 3296 | 5.5%
[?] | 3087 | 5.2%
[?] | 2989 | 5.0%
[?] | 2947 | 4.9%
[?] | 2696 | 4.5%
[?] | 2318 | 3.9%
[?] | 1898 | 3.2%
[?] | 1768 | 3.0%
Other values (97) | 30174 | 50.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 59718 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[?] | 4736 | 7.9%
[?] | 3809 | 6.4%
[?] | 3296 | 5.5%
[?] | 3087 | 5.2%
[?] | 2989 | 5.0%
[?] | 2947 | 4.9%
[?] | 2696 | 4.5%
[?] | 2318 | 3.9%
[?] | 1898 | 3.2%
[?] | 1768 | 3.0%
Other values (97) | 30174 | 50.5%

년월일
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 201906
2nd row: 201906
3rd row: 201906
4th row: 201906
5th row: 201906

Common Values

Value | Count | Frequency (%)
201906 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
201906 | 10000 | 100.0%

금액
Real number (ℝ)

ZEROS 

Distinct: 7516
Distinct (%): 75.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 71192390
Minimum: -4.09024 × 10^9
Maximum: 1.4399164 × 10^10
Zeros: 2135
Zeros (%): 21.3%
Negative: 328
Negative (%): 3.3%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -4.09024 × 10^9
5-th percentile: 0
Q1: 3030
Median: 3661355.5
Q3: 37447447
95-th percentile: 3.3215551 × 10^8
Maximum: 1.4399164 × 10^10
Range: 1.8489404 × 10^10
Interquartile range (IQR): 37444417
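
The range and interquartile range follow directly from the other quantiles. The short check below reproduces them from the exact minimum and maximum listed in the extreme-value tables of this variable; the variable names are illustrative.

```python
# Exact extremes and quartiles of 금액, copied from this report.
minimum = -4_090_240_000
maximum = 14_399_164_107
q1 = 3_030
q3 = 37_447_447

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # IQR = Q3 - Q1

print(value_range)  # 18489404107, i.e. about 1.8489404 × 10^10
print(iqr)          # 37444417
```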

Descriptive statistics

Standard deviation: 3.0898822 × 10^8
Coefficient of variation (CV): 4.3401861
Kurtosis: 589.76283
Mean: 71192390
Median Absolute Deviation (MAD): 3661355.5
Skewness: 17.458937
Sum: 7.119239 × 10^11
Variance: 9.5473721 × 10^16
Monotonicity: Not monotonic
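
Several of these figures are linked by simple identities: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. The sketch below checks them using the rounded values printed in the report, so agreement is only expected to within rounding error. The variable names are illustrative.

```python
import math

# Rounded values as printed in the report.
mean = 71_192_390
std = 3.0898822e8
n_observations = 10_000

cv = std / mean                # coefficient of variation
variance = std ** 2            # variance is the square of the standard deviation
total = mean * n_observations  # sum of all values

print(f"CV       = {cv:.4f}")
print(f"Variance = {variance:.4e}")
print(f"Sum      = {total:.6e}")

# Compare with the reported figures, allowing for rounding.
print(math.isclose(cv, 4.3401861, rel_tol=1e-6))
print(math.isclose(variance, 9.5473721e16, rel_tol=1e-6))
```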
Histogram with fixed size bins (bins=50)
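
Fixed-size binning can be sketched in plain Python as follows. This is an illustration of the general technique, not the plotting code used to produce the report; the function name `fixed_width_histogram` is hypothetical. With bins=50 and the range reported above, each bin would be roughly 3.7 × 10^8 wide.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equal-width intervals spanning [min, max].

    Returns (counts, edges), where edges has bins + 1 entries.
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        # Degenerate case: all values identical, so everything goes in one bin.
        return [len(values)] + [0] * (bins - 1), [lo] * (bins + 1)

    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum falls exactly on the right edge; clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1

    edges = [lo + i * width for i in range(bins + 1)]
    return counts, edges


counts, edges = fixed_width_histogram([0, 1, 2, 3, 4, 10], bins=5)
print(counts)
```

With a distribution as skewed as 금액 (skewness of about 17.5), most observations would land in the few bins nearest zero, which is why the extreme-value tables are a useful complement to the histogram.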
Value | Count | Frequency (%)
0 | 2135 | 21.3%
500000 | 25 | 0.2%
250000 | 21 | 0.2%
100000 | 15 | 0.1%
242000 | 14 | 0.1%
1000000 | 13 | 0.1%
5000000 | 12 | 0.1%
20000000 | 12 | 0.1%
484000 | 11 | 0.1%
750000 | 11 | 0.1%
Other values (7506) | 7731 | 77.3%

Value | Count | Frequency (%)
-4090240000 | 1 | < 0.1%
-286871240 | 1 | < 0.1%
-282000000 | 1 | < 0.1%
-263354520 | 1 | < 0.1%
-219647748 | 1 | < 0.1%
-205582628 | 1 | < 0.1%
-189484870 | 1 | < 0.1%
-161481980 | 1 | < 0.1%
-120029900 | 1 | < 0.1%
-105052459 | 1 | < 0.1%

Value | Count | Frequency (%)
14399164107 | 1 | < 0.1%
8691289026 | 1 | < 0.1%
5407137016 | 1 | < 0.1%
5371910748 | 1 | < 0.1%
5154565699 | 1 | < 0.1%
5102985779 | 1 | < 0.1%
4937806238 | 1 | < 0.1%
4269617823 | 1 | < 0.1%
4244602016 | 1 | < 0.1%
3850764365 | 1 | < 0.1%


Correlations

 | 비용명 | 금액
비용명 | 1.000 | 0.228
금액 | 0.228 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
3491 | 정릉꿈에그린아파트 | A10028000 | 수선유지비충당부채 | 201906 | 1498930
30227 | 정릉쌍용 | A13676501 | 현금 | 201906 | 416630
8318 | 창전삼성임대 | A12177802 | 현금 | 201906 | 0
26768 | 대치우성1차아파트 | A13583403 | 수선유지비충당부채 | 201906 | 38935460
532 | 헬리오시티아파트 | A10025850 | 연차수당충당부채 | 201906 | 84938430
17499 | 창동상아1차 | A13204507 | 미부과관리비 | 201906 | 106718550
44050 | 번동기산그린 | A14206305 | 미수금 | 201906 | 0
38639 | 상계주공3단지 | A13971502 | 상여충당부채 | 201906 | 0
7182 | 마포쌍용황금 | A12105001 | 공동체활성화단체지원적립금 | 201906 | 0
8885 | 도화현대1차 | A12181406 | 기타유형자산 | 201906 | 11907400

Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
49695 | 여의도은하 | A15089510 | 관리비예치금 | 201906 | 0
56473 | 래미안트윈파크 | A15606007 | 선급비용 | 201906 | 8516408
63645 | 수명산롯데캐슬 | A15809502 | 연차수당충당부채 | 201906 | 3971600
53509 | 구로우성 | A15283809 | 현금 | 201906 | 106415
35035 | 송파파인타운8단지 | A13821006 | 예금 | 201906 | 102708671
29587 | 길음서희스타힐스 | A13613012 | 선급금 | 201906 | 166070
18143 | 방학동양크레오 | A13285503 | 장기수선충당예금 | 201906 | 356989294
8580 | 상암월드컵파크12단지 | A12179505 | 경비비충당부채 | 201906 | 9666008
49507 | 여의도미성 | A15088717 | 미수금 | 201906 | 0
39710 | 상계불암대림 | A13981006 | 기타유동부채 | 201906 | 0