Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
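The memory and missing-cell figures above follow from simple arithmetic. The sketch below recomputes them from the table values, assuming 1 KiB = 1024 bytes; the constant and function names are hypothetical and are not part of the report.

```python
# Values copied from the "Dataset statistics" table.
N_VARIABLES = 5
N_OBSERVATIONS = 10_000
TOTAL_SIZE_KIB = 488.3
MISSING_CELLS = 0


def average_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Average bytes per row, assuming 1 KiB = 1024 bytes."""
    return total_kib * 1024 / n_rows


def percentage(part: int, whole: int) -> float:
    """Share of `part` in `whole`, expressed in percent."""
    return 100.0 * part / whole


avg_bytes = average_record_size_bytes(TOTAL_SIZE_KIB, N_OBSERVATIONS)
# A "cell" is one value of one variable, so there are rows x columns cells.
missing_pct = percentage(MISSING_CELLS, N_VARIABLES * N_OBSERVATIONS)

print(f"Average record size: {avg_bytes:.1f} B")
print(f"Missing cells: {missing_pct:.1f}%")
```

Rounded to one decimal place, the average record size agrees with the 50.0 B reported above.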

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15821/S/1/datasetView.do

Alerts

년월일 has constant value "202301" (Constant)
금액 has 162 (1.6%) zeros (Zeros)
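The two alerts can be read as simple checks on a column. The sketch below is illustrative only: the helper names are hypothetical, and the report does not state the exact rules or thresholds ydata-profiling applies. It uses the five 년월일 sample rows and the zero count shown in this report.

```python
from typing import Sequence


def is_constant(values: Sequence[str]) -> bool:
    """A column is constant when it holds exactly one distinct value."""
    return len(set(values)) == 1


def zero_share(n_zeros: int, n_rows: int) -> float:
    """Percentage of rows whose value is exactly zero."""
    return 100.0 * n_zeros / n_rows


# The five sample rows listed for 년월일 in this report.
sample_dates = ["202301", "202301", "202301", "202301", "202301"]

print(is_constant(sample_dates))
# 162 zeros out of 10000 observations, as reported for 금액.
print(f"{zero_share(162, 10_000):.1f}%")
```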

Reproduction

Analysis started: 2024-05-11 06:47:27.535440
Analysis finished: 2024-05-11 06:47:31.101300
Duration: 3.57 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
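The reported duration is the difference between the two timestamps. A minimal check using only the standard library (variable names are hypothetical):

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
started = datetime.fromisoformat("2024-05-11 06:47:27.535440")
finished = datetime.fromisoformat("2024-05-11 06:47:31.101300")

duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```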

Variables

아파트명 (apartment name)
Text

Distinct: 2267
Distinct (%): 22.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 20
Mean length: 7.4648
Min length: 2

Characters and Unicode

Total characters: 74648
Distinct characters: 435
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
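As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for the first sample value of this variable. The mapping from two-letter codes to the long names used in this report is written out by hand and covers only the categories that appear here; the function name is hypothetical.

```python
import unicodedata
from collections import Counter

# Two-letter Unicode general category codes -> long names used in the report.
GENERAL_CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Nl": "Letter Number",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
}


def category_counts(text: str) -> Counter:
    """Count characters in `text` by Unicode general category."""
    counts: Counter = Counter()
    for ch in text:
        code = unicodedata.category(ch)
        # Fall back to the raw code for categories not listed above.
        counts[GENERAL_CATEGORY_NAMES.get(code, code)] += 1
    return counts


counts = category_counts("신내9단지임대")
for name, n in counts.most_common():
    print(name, n)
```

Hangul syllables fall under "Other Letter" and ASCII digits under "Decimal Number", which is why those two categories dominate the tables in this report. Script and block lookups are not available in `unicodedata` and are left out of the sketch.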

Unique

Unique: 132
Unique (%): 1.3%
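"Distinct" and "Unique" measure different things: distinct counts how many different values occur, while unique counts the values that occur exactly once. The sketch below illustrates the difference, assuming that reading of the two terms. It uses the ten 비용명 values from the first sample table of this report, so the resulting counts describe that small sample only, not the full column.

```python
from collections import Counter

# 비용명 values of the ten rows in the report's first sample table.
cost_names = [
    "공동난방비", "위탁관리수수료", "교통비", "도서인쇄비", "소독비",
    "세대전기료", "도서인쇄비", "보험료", "재활용품수익", "감가상각비",
]

counts = Counter(cost_names)
n_distinct = len(counts)                                  # different values
n_unique = sum(1 for c in counts.values() if c == 1)      # values seen once

print(f"Distinct: {n_distinct} ({100 * n_distinct / len(cost_names):.1f}%)")
print(f"Unique: {n_unique} ({100 * n_unique / len(cost_names):.1f}%)")
```

Because 도서인쇄비 appears twice in the sample, it counts as distinct but not as unique.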

Sample

1st row: 신내9단지임대
2nd row: 동부아파트
3rd row: 신정5차현대
4th row: 보문아남
5th row: 중계라이프신동아청구아파트

Value  Count  Frequency (%)
아파트  186  1.7%
래미안  51  0.5%
e편한세상  33  0.3%
아이파크  33  0.3%
sk뷰  24  0.2%
푸르지오  19  0.2%
경남아너스빌  19  0.2%
힐스테이트  17  0.2%
왕십리  14  0.1%
은평뉴타운상림마을6단지  14  0.1%
Other values (2352)  10475  96.2%

Most occurring characters

Value  Count  Frequency (%)
[?]  2607  3.5%
[?]  2584  3.5%
[?]  2419  3.2%
[?]  1923  2.6%
[?]  1730  2.3%
[?]  1713  2.3%
[?]  1497  2.0%
[?]  1492  2.0%
[?]  1443  1.9%
[?]  1432  1.9%
Other values (425)  55808  74.8%

Most occurring categories

Value               Count   Frequency (%)
Other Letter        68170   91.3%
Decimal Number      3667    4.9%
Space Separator     957     1.3%
Uppercase Letter    919     1.2%
Lowercase Letter    325     0.4%
Open Punctuation    184     0.2%
Close Punctuation   184     0.2%
Dash Punctuation    132     0.2%
Other Punctuation   106     0.1%
Letter Number       4       < 0.1%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
[?]  2607  3.8%
[?]  2584  3.8%
[?]  2419  3.5%
[?]  1923  2.8%
[?]  1730  2.5%
[?]  1713  2.5%
[?]  1497  2.2%
[?]  1492  2.2%
[?]  1443  2.1%
[?]  1432  2.1%
Other values (380)  49330  72.4%
Uppercase Letter
Value               Count   Frequency (%)
S                   162     17.6%
C                   136     14.8%
K                   112     12.2%
M                   96      10.4%
D                   96      10.4%
L                   57      6.2%
H                   54      5.9%
E                   43      4.7%
I                   41      4.5%
V                   30      3.3%
Other values (7)    92      10.0%
Lowercase Letter
Value               Count   Frequency (%)
e                   191     58.8%
l                   30      9.2%
i                   25      7.7%
k                   19      5.8%
v                   18      5.5%
s                   18      5.5%
c                   8       2.5%
w                   7       2.2%
h                   3       0.9%
a                   3       0.9%
Decimal Number
Value               Count   Frequency (%)
1                   1110    30.3%
2                   1056    28.8%
3                   493     13.4%
4                   261     7.1%
5                   218     5.9%
6                   149     4.1%
9                   102     2.8%
8                   101     2.8%
7                   97      2.6%
0                   80      2.2%
Other Punctuation
Value               Count   Frequency (%)
,                   84      79.2%
.                   22      20.8%

Space Separator
Value               Count   Frequency (%)
(space)             957     100.0%

Open Punctuation
Value               Count   Frequency (%)
(                   184     100.0%

Close Punctuation
Value               Count   Frequency (%)
)                   184     100.0%

Dash Punctuation
Value               Count   Frequency (%)
-                   132     100.0%

Letter Number
Value               Count   Frequency (%)
[?]                 4       100.0%

Most occurring scripts

Value               Count   Frequency (%)
Hangul              68170   91.3%
Common              5230    7.0%
Latin               1248    1.7%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
[?]  2607  3.8%
[?]  2584  3.8%
[?]  2419  3.5%
[?]  1923  2.8%
[?]  1730  2.5%
[?]  1713  2.5%
[?]  1497  2.2%
[?]  1492  2.2%
[?]  1443  2.1%
[?]  1432  2.1%
Other values (380)  49330  72.4%
Latin
Value               Count   Frequency (%)
e                   191     15.3%
S                   162     13.0%
C                   136     10.9%
K                   112     9.0%
M                   96      7.7%
D                   96      7.7%
L                   57      4.6%
H                   54      4.3%
E                   43      3.4%
I                   41      3.3%
Other values (19)   260     20.8%
Common
Value               Count   Frequency (%)
1                   1110    21.2%
2                   1056    20.2%
(space)             957     18.3%
3                   493     9.4%
4                   261     5.0%
5                   218     4.2%
(                   184     3.5%
)                   184     3.5%
6                   149     2.8%
-                   132     2.5%
Other values (6)    486     9.3%

Most occurring blocks

Value               Count   Frequency (%)
Hangul              68170   91.3%
ASCII               6474    8.7%
Number Forms        4       < 0.1%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
[?]  2607  3.8%
[?]  2584  3.8%
[?]  2419  3.5%
[?]  1923  2.8%
[?]  1730  2.5%
[?]  1713  2.5%
[?]  1497  2.2%
[?]  1492  2.2%
[?]  1443  2.1%
[?]  1432  2.1%
Other values (380)  49330  72.4%
ASCII
Value               Count   Frequency (%)
1                   1110    17.1%
2                   1056    16.3%
(space)             957     14.8%
3                   493     7.6%
4                   261     4.0%
5                   218     3.4%
e                   191     3.0%
(                   184     2.8%
)                   184     2.8%
S                   162     2.5%
Other values (34)   1658    25.6%

Number Forms
Value               Count   Frequency (%)
[?]                 4       100.0%

아파트코드 (apartment code)
Text

Distinct: 2272
Distinct (%): 22.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 133
Unique (%): 1.3%

Sample

1st row: A13176902
2nd row: A13186401
3rd row: A15886504
4th row: A13608601
5th row: A13986111

Value                 Count   Frequency (%)
a13671209             13      0.1%
a12109002             12      0.1%
a13523005             12      0.1%
a12175203             12      0.1%
a13923102             12      0.1%
a15089513             11      0.1%
a13813010             11      0.1%
a13993501             11      0.1%
a12084302             11      0.1%
a13986111             11      0.1%
Other values (2262)   9884    98.8%

Most occurring characters

Value               Count   Frequency (%)
0                   18656   20.7%
1                   17404   19.3%
A                   9992    11.1%
3                   8694    9.7%
2                   8478    9.4%
5                   6280    7.0%
8                   5444    6.0%
7                   4547    5.1%
4                   4132    4.6%
6                   3324    3.7%
Other values (2)    3049    3.4%

Most occurring categories

Value               Count   Frequency (%)
Decimal Number      80000   88.9%
Uppercase Letter    10000   11.1%

Most frequent character per category

Decimal Number
Value               Count   Frequency (%)
0                   18656   23.3%
1                   17404   21.8%
3                   8694    10.9%
2                   8478    10.6%
5                   6280    7.8%
8                   5444    6.8%
7                   4547    5.7%
4                   4132    5.2%
6                   3324    4.2%
9                   3041    3.8%

Uppercase Letter
Value               Count   Frequency (%)
A                   9992    99.9%
B                   8       0.1%

Most occurring scripts

Value               Count   Frequency (%)
Common              80000   88.9%
Latin               10000   11.1%

Most frequent character per script

Common
Value               Count   Frequency (%)
0                   18656   23.3%
1                   17404   21.8%
3                   8694    10.9%
2                   8478    10.6%
5                   6280    7.8%
8                   5444    6.8%
7                   4547    5.7%
4                   4132    5.2%
6                   3324    4.2%
9                   3041    3.8%

Latin
Value               Count   Frequency (%)
A                   9992    99.9%
B                   8       0.1%

Most occurring blocks

Value               Count   Frequency (%)
ASCII               90000   100.0%

Most frequent character per block

ASCII
Value               Count   Frequency (%)
0                   18656   20.7%
1                   17404   19.3%
A                   9992    11.1%
3                   8694    9.7%
2                   8478    9.4%
5                   6280    7.0%
8                   5444    6.0%
7                   4547    5.1%
4                   4132    4.6%
6                   3324    3.7%
Other values (2)    3049    3.4%

비용명 (cost name)
Text

Distinct: 83
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 10
Median length: 9
Mean length: 4.692
Min length: 2

Characters and Unicode

Total characters: 46920
Distinct characters: 120
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 공동난방비
2nd row: 위탁관리수수료
3rd row: 교통비
4th row: 도서인쇄비
5th row: 소독비

Value  Count  Frequency (%)
세대전기료  289  2.9%
경비비  276  2.8%
소독비  267  2.7%
청소비  266  2.7%
보험료  254  2.5%
퇴직급여  251  2.5%
통신비  245  2.5%
급여  244  2.4%
세대수도료  241  2.4%
도서인쇄비  241  2.4%
Other values (73)  7426  74.3%

Most occurring characters

Value  Count  Frequency (%)
[?]  5401  11.5%
[?]  3505  7.5%
[?]  2409  5.1%
[?]  1695  3.6%
[?]  1396  3.0%
[?]  1268  2.7%
[?]  1213  2.6%
[?]  989  2.1%
[?]  937  2.0%
[?]  878  1.9%
Other values (110)  27229  58.0%

Most occurring categories

Value               Count   Frequency (%)
Other Letter        46920   100.0%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
[?]  5401  11.5%
[?]  3505  7.5%
[?]  2409  5.1%
[?]  1695  3.6%
[?]  1396  3.0%
[?]  1268  2.7%
[?]  1213  2.6%
[?]  989  2.1%
[?]  937  2.0%
[?]  878  1.9%
Other values (110)  27229  58.0%

Most occurring scripts

Value               Count   Frequency (%)
Hangul              46920   100.0%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
[?]  5401  11.5%
[?]  3505  7.5%
[?]  2409  5.1%
[?]  1695  3.6%
[?]  1396  3.0%
[?]  1268  2.7%
[?]  1213  2.6%
[?]  989  2.1%
[?]  937  2.0%
[?]  878  1.9%
Other values (110)  27229  58.0%

Most occurring blocks

Value               Count   Frequency (%)
Hangul              46920   100.0%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
[?]  5401  11.5%
[?]  3505  7.5%
[?]  2409  5.1%
[?]  1695  3.6%
[?]  1396  3.0%
[?]  1268  2.7%
[?]  1213  2.6%
[?]  989  2.1%
[?]  937  2.0%
[?]  878  1.9%
Other values (110)  27229  58.0%

년월일 (date)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value               Count
202301              10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202301
2nd row: 202301
3rd row: 202301
4th row: 202301
5th row: 202301

Common Values

Value               Count   Frequency (%)
202301              10000   100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value               Count   Frequency (%)
202301              10000   100.0%

금액 (amount)
Real number (ℝ)

ZEROS 

Distinct: 7920
Distinct (%): 79.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5010091.8
Minimum: -563650
Maximum: 7.230593 × 10^8
Zeros: 162
Zeros (%): 1.6%
Negative: 6
Negative (%): 0.1%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -563650
5-th percentile: 5000
Q1: 128358
Median: 446825
Q3: 2129844
95-th percentile: 24052182
Maximum: 7.230593 × 10^8
Range: 7.2362295 × 10^8
Interquartile range (IQR): 2001486
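The range and the interquartile range are both derived from the other quantile statistics. The sketch below recomputes them from the values listed in this report; the variable names are hypothetical, and the exact maximum (723059300) is taken from the table of largest values further down rather than from the rounded 7.230593 × 10^8.

```python
# Quantile statistics copied from the report.
minimum = -563_650
maximum = 723_059_300
q1 = 128_358
q3 = 2_129_844

value_range = maximum - minimum   # spread between the two extremes
iqr = q3 - q1                     # spread of the middle 50% of values

print(f"Range: {value_range}")
print(f"IQR:   {iqr}")
```

The range is several hundred times the IQR, which is consistent with the strongly right-skewed distribution described by the descriptive statistics.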

Descriptive statistics

Standard deviation: 18679505
Coefficient of variation (CV): 3.7283758
Kurtosis: 335.20644
Mean: 5010091.8
Median Absolute Deviation (MAD): 397250
Skewness: 13.650816
Sum: 5.0100918 × 10^10
Variance: 3.4892391 × 10^14
Monotonicity: Not monotonic
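Several of these statistics are tied together by definition: the coefficient of variation is the standard deviation divided by the mean, the variance is the square of the standard deviation, and the sum is the mean times the number of observations. The sketch below checks those relationships using the rounded figures from the report, so small differences in the last digits are expected; variable names are hypothetical.

```python
import math

# Rounded figures copied from the report.
mean = 5_010_091.8
std_dev = 18_679_505
n_observations = 10_000

cv = std_dev / mean             # coefficient of variation
variance = std_dev ** 2         # variance is the squared standard deviation
total = mean * n_observations   # sum of all values

print(f"CV:       {cv:.7f}")
print(f"Variance: {variance:.7e}")
print(f"Sum:      {total:.7e}")

# Compare against the reported values within a loose tolerance.
print(math.isclose(cv, 3.7283758, rel_tol=1e-5))
print(math.isclose(variance, 3.4892391e14, rel_tol=1e-6))
print(math.isclose(total, 5.0100918e10, rel_tol=1e-9))
```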
Histogram with fixed size bins (bins=50)

Value                 Count   Frequency (%)
0                     162     1.6%
200000                103     1.0%
100000                86      0.9%
300000                63      0.6%
150000                45      0.4%
50000                 43      0.4%
400000                41      0.4%
250000                38      0.4%
120000                33      0.3%
110000                32      0.3%
Other values (7910)   9354    93.5%

Value                 Count   Frequency (%)
-563650               1       < 0.1%
-482750               1       < 0.1%
-324660               1       < 0.1%
-34430                1       < 0.1%
-21000                1       < 0.1%
-232                  1       < 0.1%
0                     162     1.6%
3                     2       < 0.1%
4                     1       < 0.1%
5                     1       < 0.1%

Value                 Count   Frequency (%)
723059300             1       < 0.1%
500869574             1       < 0.1%
320679668             1       < 0.1%
317027330             1       < 0.1%
305689814             1       < 0.1%
288033220             1       < 0.1%
266483583             1       < 0.1%
266477058             1       < 0.1%
264551290             1       < 0.1%
262915483             1       < 0.1%

Interactions


Correlations

        비용명  금액
비용명  1.000  0.407
금액  0.407  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

(index)  아파트명  아파트코드  비용명  년월일  금액
24600  신내9단지임대  A13176902  공동난방비  202301  2412180
25692  동부아파트  A13186401  위탁관리수수료  202301  277800
87704  신정5차현대  A15886504  교통비  202301  6000
41905  보문아남  A13608601  도서인쇄비  202301  126500
58773  중계라이프신동아청구아파트  A13986111  소독비  202301  540000
73233  개봉거성푸르뫼2차아피트  A15280303  세대전기료  202301  12466468
28581  벽산1  A13276417  도서인쇄비  202301  265250
38530  도곡경남  A13527008  보험료  202301  770250
50225  송파파인타운5단지  A13821003  재활용품수익  202301  227500
47360  롯데캐슬클래식  A13785607  감가상각비  202301  458330

Last rows

(index)  아파트명  아파트코드  비용명  년월일  금액
84885  목동롯데캐슬위너  A15805303  광고료수익  202301  800000
82184  방화월드메르디앙  A15773501  감가상각비  202301  56060
64241  광장동 광나루현대  A14381407  세대전기료  202301  27068416
8743  서초푸르지오써밋  A10026941  홈네트워크설비유지비  202301  1650000
14587  신촌럭키  A12017001  경비비  202301  32109670
26463  도봉파크빌3단지  A13201202  사무용품비  202301  23890
15436  홍은현대  A12084504  선거관리위원회운영비  202301  360000
22222  래미안크레시티  A13071302  연차수당  202301  2433180
32954  뚝섬중앙하이츠빌  A13384303  장기수선비  202301  4224000
15480  홍제한양  A12085303  기타부대비  202301  224320