Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
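
The average record size follows from the total memory footprint divided by the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes (the helper name `average_record_size` is hypothetical, not part of the profiling tool):

```python
KIB = 1024  # bytes per kibibyte (assumption: binary units, as the "KiB" label suggests)

def average_record_size(total_kib: float, n_rows: int) -> float:
    """Return the average in-memory size of one row, in bytes."""
    return total_kib * KIB / n_rows

# Figures taken from the dataset statistics above
avg_bytes = average_record_size(488.3, 10_000)
print(f"{avg_bytes:.1f} B")  # prints "50.0 B"
```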

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15820/S/1/datasetView.do

Alerts

년월일 has constant value "201902" (Constant)
금액 has 2052 (20.5%) zeros (Zeros)
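
Both alerts reduce to simple column checks. The sketch below shows one way such checks could be written; the helper names and the toy columns are hypothetical, and the profiling tool's own thresholds and implementation may differ.

```python
def constant_alert(values):
    """True when every value in the column is identical."""
    return len(set(values)) == 1

def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)

# Toy columns for illustration only (not the real dataset)
year_month = ["201902"] * 5
amount = [552000, 0, 173540, 0, -1603010]

print(constant_alert(year_month))   # True
print(f"{zero_share(amount):.1%}")  # 40.0%

# The share reported in the alert: 2052 zeros out of 10000 observations
print(f"{2052 / 10000:.1%}")        # 20.5%
```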

Reproduction

Analysis started: 2024-05-11 06:02:07.657790
Analysis finished: 2024-05-11 06:02:08.957185
Duration: 1.3 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명 (apartment name)
Text

Distinct: 2105
Distinct (%): 21.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 22
Median length: 20
Mean length: 7.1509
Min length: 2

Characters and Unicode

Total characters: 71509
Distinct characters: 431
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
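
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. This is a minimal sketch, not the profiling tool's own code; the helper name `category_counts` is hypothetical. Note that `unicodedata` reports categories as two-letter codes (`Lo` = Other Letter, `Nd` = Decimal Number, `Ps`/`Pe` = Open/Close Punctuation) and does not provide script or block lookups.

```python
import unicodedata
from collections import Counter

def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category (e.g. 'Lo', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)

name = "거여2단지(동아효성)"  # one of the sample values in this report
counts = category_counts(name)
print(counts["Lo"], counts["Nd"], counts["Ps"], counts["Pe"])  # 8 1 1 1
```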

Unique

Unique: 104
Unique (%): 1.0%
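
The report gives different figures for "Distinct" and "Unique". The sketch below assumes the usual reading: distinct counts how many different values occur, while unique counts the values that occur exactly once. The helper name and the toy list are hypothetical.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    n_distinct = len(counts)
    n_unique = sum(1 for c in counts.values() if c == 1)
    return n_distinct, n_unique

# Toy column for illustration only
names = ["래미안", "래미안", "힐스테이트", "원효산호", "원효산호", "대림코오롱"]
print(distinct_and_unique(names))  # (4, 2)
```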

Sample

1st row: 여의도화랑
2nd row: 거여2단지(동아효성)
3rd row: 제기현대
4th row: 정릉풍림아이원
5th row: 목동현대아파트

Value | Count | Frequency (%)
아파트 | 107 | 1.0%
래미안 | 26 | 0.2%
입주자대표회의 | 22 | 0.2%
강변힐스테이트 | 14 | 0.1%
힐스테이트 | 14 | 0.1%
상봉건영캐스빌 | 13 | 0.1%
신도림현대 | 12 | 0.1%
역삼2차아이파크 | 12 | 0.1%
원효산호 | 12 | 0.1%
대림코오롱 | 11 | 0.1%
Other values (2161) | 10211 | 97.7%
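
The counts in this table add up to 10454, more than the 10000 observations, which suggests the table counts words rather than whole cell values. A minimal sketch of such a word count, assuming lower-casing and whitespace splitting (an assumption about the tokenisation, not confirmed by the report; the helper name and toy values are hypothetical):

```python
from collections import Counter

def word_counts(values):
    """Count lower-cased, whitespace-separated tokens across all values."""
    counter = Counter()
    for value in values:
        counter.update(value.lower().split())
    return counter

# Toy values for illustration only
names = ["목동현대아파트", "청솔 아파트", "래미안 아파트", "DMC 래미안"]
counts = word_counts(names)
print(counts["아파트"], counts["래미안"], counts["dmc"])  # 2 2 1

# Total tokens implied by the table: the ten listed counts plus "Other values"
listed = [107, 26, 22, 14, 14, 13, 12, 12, 12, 11]
print(sum(listed) + 10211)  # 10454
```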

Most occurring characters

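Value | Count | Frequency (%)
(not preserved) | 2134 | 3.0%
(not preserved) | 2066 | 2.9%
(not preserved) | 1964 | 2.7%
(not preserved) | 1863 | 2.6%
(not preserved) | 1855 | 2.6%
(not preserved) | 1659 | 2.3%
(not preserved) | 1547 | 2.2%
(not preserved) | 1513 | 2.1%
(not preserved) | 1482 | 2.1%
(not preserved) | 1360 | 1.9%
Other values (421) | 54066 | 75.6%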

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 65574 | 91.7%
Decimal Number | 3916 | 5.5%
Uppercase Letter | 666 | 0.9%
Space Separator | 498 | 0.7%
Lowercase Letter | 300 | 0.4%
Open Punctuation | 146 | 0.2%
Close Punctuation | 146 | 0.2%
Dash Punctuation | 126 | 0.2%
Other Punctuation | 126 | 0.2%
Letter Number | 6 | < 0.1%

Most frequent character per category

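Other Letter
Value | Count | Frequency (%)
(not preserved) | 2134 | 3.3%
(not preserved) | 2066 | 3.2%
(not preserved) | 1964 | 3.0%
(not preserved) | 1863 | 2.8%
(not preserved) | 1855 | 2.8%
(not preserved) | 1659 | 2.5%
(not preserved) | 1547 | 2.4%
(not preserved) | 1513 | 2.3%
(not preserved) | 1482 | 2.3%
(not preserved) | 1360 | 2.1%
Other values (375) | 48131 | 73.4%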

Uppercase Letter
Value | Count | Frequency (%)
S | 119 | 17.9%
K | 85 | 12.8%
C | 66 | 9.9%
L | 57 | 8.6%
H | 56 | 8.4%
M | 40 | 6.0%
D | 40 | 6.0%
I | 37 | 5.6%
E | 35 | 5.3%
G | 30 | 4.5%
Other values (7) | 101 | 15.2%

Lowercase Letter
Value | Count | Frequency (%)
e | 173 | 57.7%
l | 36 | 12.0%
i | 33 | 11.0%
v | 21 | 7.0%
w | 8 | 2.7%
a | 7 | 2.3%
g | 7 | 2.3%
s | 6 | 2.0%
k | 4 | 1.3%
h | 3 | 1.0%

Decimal Number
Value | Count | Frequency (%)
1 | 1240 | 31.7%
2 | 1160 | 29.6%
3 | 530 | 13.5%
4 | 241 | 6.2%
5 | 195 | 5.0%
6 | 154 | 3.9%
7 | 111 | 2.8%
9 | 102 | 2.6%
0 | 93 | 2.4%
8 | 90 | 2.3%

Other Punctuation
Value | Count | Frequency (%)
, | 108 | 85.7%
. | 18 | 14.3%

Space Separator
Value | Count | Frequency (%)
(space) | 498 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 146 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 146 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 126 | 100.0%

Letter Number
Value | Count | Frequency (%)
(not preserved) | 6 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 5 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 65574 | 91.7%
Common | 4963 | 6.9%
Latin | 972 | 1.4%

Most frequent character per script

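Hangul
Value | Count | Frequency (%)
(not preserved) | 2134 | 3.3%
(not preserved) | 2066 | 3.2%
(not preserved) | 1964 | 3.0%
(not preserved) | 1863 | 2.8%
(not preserved) | 1855 | 2.8%
(not preserved) | 1659 | 2.5%
(not preserved) | 1547 | 2.4%
(not preserved) | 1513 | 2.3%
(not preserved) | 1482 | 2.3%
(not preserved) | 1360 | 2.1%
Other values (375) | 48131 | 73.4%

Latin
Value | Count | Frequency (%)
e | 173 | 17.8%
S | 119 | 12.2%
K | 85 | 8.7%
C | 66 | 6.8%
L | 57 | 5.9%
H | 56 | 5.8%
M | 40 | 4.1%
D | 40 | 4.1%
I | 37 | 3.8%
l | 36 | 3.7%
Other values (19) | 263 | 27.1%

Common
Value | Count | Frequency (%)
1 | 1240 | 25.0%
2 | 1160 | 23.4%
3 | 530 | 10.7%
(space) | 498 | 10.0%
4 | 241 | 4.9%
5 | 195 | 3.9%
6 | 154 | 3.1%
( | 146 | 2.9%
) | 146 | 2.9%
- | 126 | 2.5%
Other values (7) | 527 | 10.6%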

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 65574 | 91.7%
ASCII | 5929 | 8.3%
Number Forms | 6 | < 0.1%

Most frequent character per block

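Hangul
Value | Count | Frequency (%)
(not preserved) | 2134 | 3.3%
(not preserved) | 2066 | 3.2%
(not preserved) | 1964 | 3.0%
(not preserved) | 1863 | 2.8%
(not preserved) | 1855 | 2.8%
(not preserved) | 1659 | 2.5%
(not preserved) | 1547 | 2.4%
(not preserved) | 1513 | 2.3%
(not preserved) | 1482 | 2.3%
(not preserved) | 1360 | 2.1%
Other values (375) | 48131 | 73.4%

ASCII
Value | Count | Frequency (%)
1 | 1240 | 20.9%
2 | 1160 | 19.6%
3 | 530 | 8.9%
(space) | 498 | 8.4%
4 | 241 | 4.1%
5 | 195 | 3.3%
e | 173 | 2.9%
6 | 154 | 2.6%
( | 146 | 2.5%
) | 146 | 2.5%
Other values (35) | 1446 | 24.4%

Number Forms
Value | Count | Frequency (%)
(not preserved) | 6 | 100.0%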

아파트코드 (apartment code)
Text

Distinct: 2111
Distinct (%): 21.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 104
Unique (%): 1.0%

Sample

1st row: A15088802
2nd row: A13811202
3rd row: A13006002
4th row: A13610007
5th row: A15807211

Value | Count | Frequency (%)
a15704023 | 14 | 0.1%
a13122001 | 13 | 0.1%
a14085002 | 12 | 0.1%
a13579503 | 12 | 0.1%
a14085501 | 11 | 0.1%
a12013202 | 11 | 0.1%
a13508011 | 11 | 0.1%
a13407104 | 11 | 0.1%
a12187501 | 11 | 0.1%
a15081105 | 11 | 0.1%
Other values (2101) | 9883 | 98.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 18127 | 20.1%
1 | 17619 | 19.6%
A | 9993 | 11.1%
3 | 9228 | 10.3%
2 | 8082 | 9.0%
5 | 6177 | 6.9%
8 | 5766 | 6.4%
7 | 4781 | 5.3%
4 | 3907 | 4.3%
6 | 3268 | 3.6%
Other values (2) | 3052 | 3.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%
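
With every value exactly 9 characters long, 10000 uppercase letters and 80000 digits across 10000 rows are consistent with each code being one letter followed by eight digits. The sketch below checks that layout with a regular expression; the pattern is an assumption inferred from these counts and the sample rows, not a documented format, and `is_valid_code` is a hypothetical helper.

```python
import re

# Assumed layout: one uppercase ASCII letter followed by eight ASCII digits
CODE_PATTERN = re.compile(r"[A-Z][0-9]{8}")

def is_valid_code(code: str) -> bool:
    """True when the whole string matches the assumed code layout."""
    return CODE_PATTERN.fullmatch(code) is not None

# Sample values shown in this report
samples = ["A15088802", "A13811202", "A13006002", "A13610007", "A15807211"]
print(all(is_valid_code(c) for c in samples))  # True
print(is_valid_code("A1508880"))               # False (only 7 digits)
```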

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18127 | 22.7%
1 | 17619 | 22.0%
3 | 9228 | 11.5%
2 | 8082 | 10.1%
5 | 6177 | 7.7%
8 | 5766 | 7.2%
7 | 4781 | 6.0%
4 | 3907 | 4.9%
6 | 3268 | 4.1%
9 | 3045 | 3.8%

Uppercase Letter
Value | Count | Frequency (%)
A | 9993 | 99.9%
B | 7 | 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 88.9%
Latin | 10000 | 11.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18127 | 22.7%
1 | 17619 | 22.0%
3 | 9228 | 11.5%
2 | 8082 | 10.1%
5 | 6177 | 7.7%
8 | 5766 | 7.2%
7 | 4781 | 6.0%
4 | 3907 | 4.9%
6 | 3268 | 4.1%
9 | 3045 | 3.8%

Latin
Value | Count | Frequency (%)
A | 9993 | 99.9%
B | 7 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18127 | 20.1%
1 | 17619 | 19.6%
A | 9993 | 11.1%
3 | 9228 | 10.3%
2 | 8082 | 9.0%
5 | 6177 | 6.9%
8 | 5766 | 6.4%
7 | 4781 | 5.3%
4 | 3907 | 4.3%
6 | 3268 | 3.6%
Other values (2) | 3052 | 3.4%

비용명 (cost item name)
Text

Distinct: 77
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 9
Mean length: 5.9382
Min length: 2

Characters and Unicode

Total characters: 59382
Distinct characters: 107
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 정화조관리비충당부채
2nd row: 예수금
3rd row: 현금
4th row: 공동주택적립금
5th row: 장기수선충당부채

Value | Count | Frequency (%)
당기순이익 | 339 | 3.4%
장기수선충당예금 | 336 | 3.4%
관리비미수금 | 335 | 3.4%
비품 | 331 | 3.3%
예금 | 326 | 3.3%
현금 | 323 | 3.2%
장기수선충당부채 | 316 | 3.2%
예수금 | 309 | 3.1%
연차수당충당부채 | 304 | 3.0%
선급비용 | 301 | 3.0%
Other values (67) | 6780 | 67.8%

Most occurring characters

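Value | Count | Frequency (%)
(not preserved) | 4738 | 8.0%
(not preserved) | 3817 | 6.4%
(not preserved) | 3177 | 5.4%
(not preserved) | 3107 | 5.2%
(not preserved) | 2905 | 4.9%
(not preserved) | 2884 | 4.9%
(not preserved) | 2617 | 4.4%
(not preserved) | 2406 | 4.1%
(not preserved) | 1901 | 3.2%
(not preserved) | 1807 | 3.0%
Other values (97) | 30023 | 50.6%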

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 59382 | 100.0%

Most frequent character per category

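Other Letter
Value | Count | Frequency (%)
(not preserved) | 4738 | 8.0%
(not preserved) | 3817 | 6.4%
(not preserved) | 3177 | 5.4%
(not preserved) | 3107 | 5.2%
(not preserved) | 2905 | 4.9%
(not preserved) | 2884 | 4.9%
(not preserved) | 2617 | 4.4%
(not preserved) | 2406 | 4.1%
(not preserved) | 1901 | 3.2%
(not preserved) | 1807 | 3.0%
Other values (97) | 30023 | 50.6%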

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 59382 | 100.0%

Most frequent character per script

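Hangul
Value | Count | Frequency (%)
(not preserved) | 4738 | 8.0%
(not preserved) | 3817 | 6.4%
(not preserved) | 3177 | 5.4%
(not preserved) | 3107 | 5.2%
(not preserved) | 2905 | 4.9%
(not preserved) | 2884 | 4.9%
(not preserved) | 2617 | 4.4%
(not preserved) | 2406 | 4.1%
(not preserved) | 1901 | 3.2%
(not preserved) | 1807 | 3.0%
Other values (97) | 30023 | 50.6%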

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 59382 | 100.0%

Most frequent character per block

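Hangul
Value | Count | Frequency (%)
(not preserved) | 4738 | 8.0%
(not preserved) | 3817 | 6.4%
(not preserved) | 3177 | 5.4%
(not preserved) | 3107 | 5.2%
(not preserved) | 2905 | 4.9%
(not preserved) | 2884 | 4.9%
(not preserved) | 2617 | 4.4%
(not preserved) | 2406 | 4.1%
(not preserved) | 1901 | 3.2%
(not preserved) | 1807 | 3.0%
Other values (97) | 30023 | 50.6%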

년월일 (year-month-day)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 201902
2nd row: 201902
3rd row: 201902
4th row: 201902
5th row: 201902

Common Values

Value | Count | Frequency (%)
201902 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
201902 | 10000 | 100.0%

금액 (amount)
Real number (ℝ)

ZEROS 

Distinct: 7602
Distinct (%): 76.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 77495400
Minimum: -6.0342072 × 10^8
Maximum: 8.665062 × 10^9
Zeros: 2052
Zeros (%): 20.5%
Negative: 349
Negative (%): 3.5%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -6.0342072 × 10^8
5-th percentile: 0
Q1: 19792.5
Median: 3418405
Q3: 34157213
95-th percentile: 3.7300637 × 10^8
Maximum: 8.665062 × 10^9
Range: 9.2684827 × 10^9
Interquartile range (IQR): 34137421
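
The range and the interquartile range are derived from the other quantiles: range = maximum − minimum and IQR = Q3 − Q1. A short check of that arithmetic, using the exact extreme values listed in this report (variable names are illustrative):

```python
minimum = -603_420_717   # smallest value listed in the report
maximum = 8_665_061_963  # largest value listed in the report
q1, q3 = 19_792.5, 34_157_213

value_range = maximum - minimum  # spread between the extremes
iqr = q3 - q1                    # spread of the middle 50% of values

print(value_range)  # 9268482680
print(iqr)          # 34137420.5
```

Both results agree with the reported figures after rounding (9.2684827 × 10^9 and 34137421).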

Descriptive statistics

Standard deviation: 2.9751704 × 10^8
Coefficient of variation (CV): 3.8391575
Kurtosis: 197.00838
Mean: 77495400
Median Absolute Deviation (MAD): 3418405
Skewness: 11.023574
Sum: 7.74954 × 10^11
Variance: 8.8516391 × 10^16
Monotonicity: Not monotonic
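
Several of these statistics are tied together by simple identities: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. The sketch below checks them against the rounded figures reported here, so small differences in the last digits are expected.

```python
mean = 77_495_400
std = 2.9751704e8  # standard deviation as reported (rounded)
n = 10_000         # number of observations

cv = std / mean       # coefficient of variation
variance = std ** 2   # variance is the squared standard deviation
total = mean * n      # sum of all values

print(f"{cv:.4f}")        # 3.8392
print(f"{variance:.4e}")  # 8.8516e+16
print(f"{total:.5e}")     # 7.74954e+11
```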
Histogram with fixed size bins (bins=50)
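
With fixed-size bins, the value range is split into 50 intervals of equal width. The sketch below shows how a value maps to a bin under that scheme, assuming the bins span exactly from the minimum to the maximum and that the last bin includes its upper edge (the helper name `bin_index` is hypothetical).

```python
def bin_index(value, lo, hi, bins=50):
    """Index of the equal-width bin that contains value (last bin is closed)."""
    width = (hi - lo) / bins
    idx = int((value - lo) / width)
    return min(idx, bins - 1)  # the maximum falls into the last bin

lo, hi = -603_420_717, 8_665_061_963  # extremes listed in the report

print((hi - lo) / 50)          # 185369653.6
print(bin_index(0, lo, hi))    # 3
print(bin_index(hi, lo, hi))   # 49
```

Under these assumptions each bin is about 1.85 × 10^8 wide, so the 2052 zeros and most small amounts fall into the same bin.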

Common values
Value | Count | Frequency (%)
0 | 2052 | 20.5%
250000 | 28 | 0.3%
500000 | 23 | 0.2%
242000 | 18 | 0.2%
484000 | 17 | 0.2%
20000000 | 13 | 0.1%
200000 | 12 | 0.1%
30000000 | 11 | 0.1%
10000000 | 11 | 0.1%
300000 | 10 | 0.1%
Other values (7592) | 7805 | 78.0%

Minimum 10 values
Value | Count | Frequency (%)
-603420717 | 1 | < 0.1%
-388641505 | 1 | < 0.1%
-302145700 | 1 | < 0.1%
-282000000 | 1 | < 0.1%
-269035194 | 1 | < 0.1%
-242139904 | 1 | < 0.1%
-168377396 | 1 | < 0.1%
-147921070 | 1 | < 0.1%
-134212500 | 1 | < 0.1%
-119103910 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
8665061963 | 1 | < 0.1%
7889746955 | 1 | < 0.1%
6218827702 | 1 | < 0.1%
5823064344 | 1 | < 0.1%
4981704436 | 1 | < 0.1%
4838592241 | 1 | < 0.1%
4302804346 | 1 | < 0.1%
3952505624 | 1 | < 0.1%
3815214169 | 1 | < 0.1%
3749981958 | 1 | < 0.1%

Interactions


Correlations

 | 비용명 | 금액
비용명 | 1.000 | 0.537
금액 | 0.537 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
49531 | 여의도화랑 | A15088802 | 정화조관리비충당부채 | 201902 | 552000
33899 | 거여2단지(동아효성) | A13811202 | 예수금 | 201902 | 1557480
11604 | 제기현대 | A13006002 | 현금 | 201902 | 173540
28541 | 정릉풍림아이원 | A13610007 | 공동주택적립금 | 201902 | 102263893
62915 | 목동현대아파트 | A15807211 | 장기수선충당부채 | 201902 | 1164230539
38836 | 청솔아파트8 | A13980004 | 장기수선충당예금 | 201902 | 410832467
11029 | 뉴신사신성 | A12289401 | 퇴직급여충당부채 | 201902 | 30053522
65101 | 은평뉴타운박석고개제12단지아파트 | A41279911 | 미수금 | 201902 | 7370000
19147 | 신금호두산위브 | A13309101 | 공동주택적립금 | 201902 | -1603010
6235 | DMC센트레빌 | A12072801 | 비품감가상각누계액 | 201902 | -16347880

Last rows

Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
41303 | 중계현대2차(4동) | A13985904 | 현금 | 201902 | 69948
15774 | 신내새한아파트 | A13187406 | 현금 | 201902 | 365720
14660 | 면목늘푸른동아아파트 | A13183504 | 선급금 | 201902 | 178380
27527 | 대청 | A13594007 | 주차장충당부채 | 201902 | 108451191
22421 | 고덕리엔파크2단지 | A13410011 | 장기수선충당예금 | 201902 | 131880614
1374 | 서초푸르지오써밋 | A10026941 | 기타충당부채 | 201902 | 550000
34939 | 송파파인타운7단지 | A13821004 | 공동체활성화단체지원적립금 | 201902 | 0
26094 | 대치동부센트레빌 | A13528103 | 수선유지비충당부채 | 201902 | 121690134
33599 | 신반포한신2차 | A13790929 | 현금 | 201902 | 94532
36391 | 거여우방 | A13881601 | 단기보증금 | 201902 | 750000