Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15820/S/1/datasetView.do

Alerts

년월일 has constant value "201912" (Constant)
금액 has 2344 (23.4%) zeros (Zeros)
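
The two alerts above can be read as simple per-column checks: one column holds a single value throughout, and another contains a large share of exact zeros. The following is a minimal sketch of such checks in plain Python. The function name `column_alerts` and its rules are assumptions made for illustration; this is not ydata-profiling's own implementation, and the short lists are made-up stand-ins for the real columns.

```python
def column_alerts(name, values):
    """Return simple alert strings for one column (illustrative rules only)."""
    alerts = []
    n = len(values)
    # A column with exactly one distinct value is constant.
    if n > 0 and len(set(values)) == 1:
        alerts.append(f'{name} has constant value "{values[0]}"')
    # Count exact zeros among numeric entries and report their share.
    zeros = sum(1 for v in values if isinstance(v, (int, float)) and v == 0)
    if zeros:
        alerts.append(f"{name} has {zeros} ({zeros / n:.1%}) zeros")
    return alerts


# Tiny stand-in columns, not the real dataset.
print(column_alerts("년월일", ["201912"] * 4))
print(column_alerts("금액", [0, 261099700, 9264980, 0]))
```

With 2344 zeros among 10000 values, the same formatting rule yields the 23.4% figure shown in the alert.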

Reproduction

Analysis started: 2024-05-11 06:00:53.179169
Analysis finished: 2024-05-11 06:00:54.228342
Duration: 1.05 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명
Text

Distinct: 2076
Distinct (%): 20.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 22
Median length: 20
Mean length: 7.2187
Min length: 2

Characters and Unicode

Total characters: 72187
Distinct characters: 431
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
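
As an illustration of these character properties, the sketch below counts the characters of a string by Unicode general category using Python's standard `unicodedata` module. The helper name `category_counts` and the sample string are assumptions for illustration. The two-letter codes correspond to the category names used in this report, for example `Lo` for Other Letter, `Nd` for Decimal Number, `Lu` for Uppercase Letter, `Zs` for Space Separator and `Pd` for Dash Punctuation.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# Made-up sample string mixing Hangul, a space, Latin, a hyphen and a digit.
counts = category_counts("래미안 A-1")
for code, count in counts.most_common():
    print(code, count)
```

Hangul syllables fall under `Lo`, which is why Other Letter dominates the category counts for the Korean text columns.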

Unique

Unique: 94
Unique (%): 0.9%
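
The report lists both a Distinct count and a Unique count. A reading consistent with the figures here (the constant column 년월일 has 1 distinct value and 0 unique values) is that "distinct" counts different values, while "unique" counts values that occur exactly once. The sketch below illustrates that reading; the helper name `distinct_and_unique` and the sample list are assumptions for illustration.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Made-up list: two names repeat, two appear once.
names = ["래미안", "래미안", "힐스테이트", "고덕", "고덕", "마포삼성"]
print(distinct_and_unique(names))
```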

Sample

1st row: 고척동아한신
2nd row: 중계그린
3rd row: 은평지웰테라스
4th row: 구로한일유엔아이
5th row: 개봉삼호아파트관리사무소

Value | Count | Frequency (%)
아파트 | 99 | 0.9%
래미안 | 26 | 0.2%
힐스테이트 | 13 | 0.1%
래미안월곡 | 13 | 0.1%
마포삼성 | 12 | 0.1%
목동6단지 | 12 | 0.1%
입주자대표회의 | 12 | 0.1%
월드컵아이파크1단지 | 12 | 0.1%
고덕 | 12 | 0.1%
서울숲2차푸르지오임대 | 12 | 0.1%
Other values (2138) | 10279 | 97.9%

Most occurring characters

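Value | Count | Frequency (%)
(not preserved) | 2208 | 3.1%
(not preserved) | 2151 | 3.0%
(not preserved) | 1926 | 2.7%
(not preserved) | 1881 | 2.6%
(not preserved) | 1857 | 2.6%
(not preserved) | 1639 | 2.3%
(not preserved) | 1614 | 2.2%
(not preserved) | 1536 | 2.1%
(not preserved) | 1502 | 2.1%
(not preserved) | 1377 | 1.9%
Other values (421) | 54496 | 75.5%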

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 65964 | 91.4%
Decimal Number | 3970 | 5.5%
Uppercase Letter | 784 | 1.1%
Space Separator | 562 | 0.8%
Lowercase Letter | 345 | 0.5%
Close Punctuation | 145 | 0.2%
Open Punctuation | 145 | 0.2%
Dash Punctuation | 130 | 0.2%
Other Punctuation | 129 | 0.2%
Letter Number | 8 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 2208 | 3.3%
(not preserved) | 2151 | 3.3%
(not preserved) | 1926 | 2.9%
(not preserved) | 1881 | 2.9%
(not preserved) | 1857 | 2.8%
(not preserved) | 1639 | 2.5%
(not preserved) | 1614 | 2.4%
(not preserved) | 1536 | 2.3%
(not preserved) | 1502 | 2.3%
(not preserved) | 1377 | 2.1%
Other values (375) | 48273 | 73.2%

Uppercase Letter
Value | Count | Frequency (%)
S | 121 | 15.4%
K | 110 | 14.0%
C | 93 | 11.9%
D | 60 | 7.7%
M | 60 | 7.7%
L | 55 | 7.0%
I | 51 | 6.5%
E | 40 | 5.1%
H | 36 | 4.6%
G | 31 | 4.0%
Other values (7) | 127 | 16.2%

Lowercase Letter
Value | Count | Frequency (%)
e | 180 | 52.2%
l | 40 | 11.6%
i | 35 | 10.1%
v | 24 | 7.0%
k | 12 | 3.5%
s | 12 | 3.5%
c | 12 | 3.5%
g | 9 | 2.6%
a | 9 | 2.6%
h | 6 | 1.7%

Decimal Number
Value | Count | Frequency (%)
1 | 1231 | 31.0%
2 | 1156 | 29.1%
3 | 544 | 13.7%
4 | 276 | 7.0%
5 | 185 | 4.7%
6 | 180 | 4.5%
7 | 121 | 3.0%
0 | 98 | 2.5%
9 | 94 | 2.4%
8 | 85 | 2.1%

Other Punctuation
Value | Count | Frequency (%)
, | 104 | 80.6%
. | 25 | 19.4%

Space Separator
Value | Count | Frequency (%)
(space) | 562 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 145 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 145 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 130 | 100.0%

Letter Number
Value | Count | Frequency (%)
(not preserved) | 8 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 5 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 65964 | 91.4%
Common | 5086 | 7.0%
Latin | 1137 | 1.6%

Most frequent character per script

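Hangul
Value | Count | Frequency (%)
(not preserved) | 2208 | 3.3%
(not preserved) | 2151 | 3.3%
(not preserved) | 1926 | 2.9%
(not preserved) | 1881 | 2.9%
(not preserved) | 1857 | 2.8%
(not preserved) | 1639 | 2.5%
(not preserved) | 1614 | 2.4%
(not preserved) | 1536 | 2.3%
(not preserved) | 1502 | 2.3%
(not preserved) | 1377 | 2.1%
Other values (375) | 48273 | 73.2%

Latin
Value | Count | Frequency (%)
e | 180 | 15.8%
S | 121 | 10.6%
K | 110 | 9.7%
C | 93 | 8.2%
D | 60 | 5.3%
M | 60 | 5.3%
L | 55 | 4.8%
I | 51 | 4.5%
E | 40 | 3.5%
l | 40 | 3.5%
Other values (19) | 327 | 28.8%

Common
Value | Count | Frequency (%)
1 | 1231 | 24.2%
2 | 1156 | 22.7%
(space) | 562 | 11.0%
3 | 544 | 10.7%
4 | 276 | 5.4%
5 | 185 | 3.6%
6 | 180 | 3.5%
) | 145 | 2.9%
( | 145 | 2.9%
- | 130 | 2.6%
Other values (7) | 532 | 10.5%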

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 65964 | 91.4%
ASCII | 6215 | 8.6%
Number Forms | 8 | < 0.1%

Most frequent character per block

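Hangul
Value | Count | Frequency (%)
(not preserved) | 2208 | 3.3%
(not preserved) | 2151 | 3.3%
(not preserved) | 1926 | 2.9%
(not preserved) | 1881 | 2.9%
(not preserved) | 1857 | 2.8%
(not preserved) | 1639 | 2.5%
(not preserved) | 1614 | 2.4%
(not preserved) | 1536 | 2.3%
(not preserved) | 1502 | 2.3%
(not preserved) | 1377 | 2.1%
Other values (375) | 48273 | 73.2%

ASCII
Value | Count | Frequency (%)
1 | 1231 | 19.8%
2 | 1156 | 18.6%
(space) | 562 | 9.0%
3 | 544 | 8.8%
4 | 276 | 4.4%
5 | 185 | 3.0%
6 | 180 | 2.9%
e | 180 | 2.9%
) | 145 | 2.3%
( | 145 | 2.3%
Other values (35) | 1611 | 25.9%

Number Forms
Value | Count | Frequency (%)
(not preserved) | 8 | 100.0%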

아파트코드
Text

Distinct: 2082
Distinct (%): 20.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 94
Unique (%): 0.9%

Sample

1st row: A15283706
2nd row: A13986306
3rd row: A10026842
4th row: A15205104
5th row: A15209202

Value | Count | Frequency (%)
a13613007 | 13 | 0.1%
a13579506 | 12 | 0.1%
a41279905 | 12 | 0.1%
a12104005 | 12 | 0.1%
a12171101 | 12 | 0.1%
a12012202 | 12 | 0.1%
a15875103 | 12 | 0.1%
a13920804 | 12 | 0.1%
a15081002 | 11 | 0.1%
a13706001 | 11 | 0.1%
Other values (2072) | 9881 | 98.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 18206 | 20.2%
1 | 17621 | 19.6%
A | 9987 | 11.1%
3 | 8831 | 9.8%
2 | 8151 | 9.1%
5 | 6299 | 7.0%
8 | 5774 | 6.4%
7 | 4852 | 5.4%
4 | 3781 | 4.2%
6 | 3437 | 3.8%
Other values (2) | 3061 | 3.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18206 | 22.8%
1 | 17621 | 22.0%
3 | 8831 | 11.0%
2 | 8151 | 10.2%
5 | 6299 | 7.9%
8 | 5774 | 7.2%
7 | 4852 | 6.1%
4 | 3781 | 4.7%
6 | 3437 | 4.3%
9 | 3048 | 3.8%

Uppercase Letter
Value | Count | Frequency (%)
A | 9987 | 99.9%
B | 13 | 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 88.9%
Latin | 10000 | 11.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18206 | 22.8%
1 | 17621 | 22.0%
3 | 8831 | 11.0%
2 | 8151 | 10.2%
5 | 6299 | 7.9%
8 | 5774 | 7.2%
7 | 4852 | 6.1%
4 | 3781 | 4.7%
6 | 3437 | 4.3%
9 | 3048 | 3.8%

Latin
Value | Count | Frequency (%)
A | 9987 | 99.9%
B | 13 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18206 | 20.2%
1 | 17621 | 19.6%
A | 9987 | 11.1%
3 | 8831 | 9.8%
2 | 8151 | 9.1%
5 | 6299 | 7.0%
8 | 5774 | 6.4%
7 | 4852 | 5.4%
4 | 3781 | 4.2%
6 | 3437 | 3.8%
Other values (2) | 3061 | 3.4%

비용명
Text

Distinct: 77
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 9
Mean length: 6.0387
Min length: 2

Characters and Unicode

Total characters: 60387
Distinct characters: 107
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 장기수선충당부채
2nd row: 선급금
3rd row: 장기수선충당예금
4th row: 미지급비용
5th row: 선급금

Value | Count | Frequency (%)
예금 | 328 | 3.3%
미처분이익잉여금 | 323 | 3.2%
선급비용 | 317 | 3.2%
퇴직급여충당부채 | 315 | 3.1%
장기수선충당예금 | 314 | 3.1%
수선유지비충당부채 | 311 | 3.1%
당기순이익 | 309 | 3.1%
미지급금 | 308 | 3.1%
연차수당충당부채 | 298 | 3.0%
장기수선충당부채 | 292 | 2.9%
Other values (67) | 6885 | 68.8%

Most occurring characters

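Value | Count | Frequency (%)
(not preserved) | 4662 | 7.7%
(not preserved) | 3847 | 6.4%
(not preserved) | 3266 | 5.4%
(not preserved) | 3162 | 5.2%
(not preserved) | 3024 | 5.0%
(not preserved) | 2979 | 4.9%
(not preserved) | 2704 | 4.5%
(not preserved) | 2376 | 3.9%
(not preserved) | 1978 | 3.3%
(not preserved) | 1793 | 3.0%
Other values (97) | 30596 | 50.7%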

Most occurring categories

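Value | Count | Frequency (%)
Other Letter | 60387 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 4662 | 7.7%
(not preserved) | 3847 | 6.4%
(not preserved) | 3266 | 5.4%
(not preserved) | 3162 | 5.2%
(not preserved) | 3024 | 5.0%
(not preserved) | 2979 | 4.9%
(not preserved) | 2704 | 4.5%
(not preserved) | 2376 | 3.9%
(not preserved) | 1978 | 3.3%
(not preserved) | 1793 | 3.0%
Other values (97) | 30596 | 50.7%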

Most occurring scripts

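Value | Count | Frequency (%)
Hangul | 60387 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 4662 | 7.7%
(not preserved) | 3847 | 6.4%
(not preserved) | 3266 | 5.4%
(not preserved) | 3162 | 5.2%
(not preserved) | 3024 | 5.0%
(not preserved) | 2979 | 4.9%
(not preserved) | 2704 | 4.5%
(not preserved) | 2376 | 3.9%
(not preserved) | 1978 | 3.3%
(not preserved) | 1793 | 3.0%
Other values (97) | 30596 | 50.7%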

Most occurring blocks

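Value | Count | Frequency (%)
Hangul | 60387 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not preserved) | 4662 | 7.7%
(not preserved) | 3847 | 6.4%
(not preserved) | 3266 | 5.4%
(not preserved) | 3162 | 5.2%
(not preserved) | 3024 | 5.0%
(not preserved) | 2979 | 4.9%
(not preserved) | 2704 | 4.5%
(not preserved) | 2376 | 3.9%
(not preserved) | 1978 | 3.3%
(not preserved) | 1793 | 3.0%
Other values (97) | 30596 | 50.7%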

년월일
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
201912: 10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 201912
2nd row: 201912
3rd row: 201912
4th row: 201912
5th row: 201912

Common Values

Value | Count | Frequency (%)
201912 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
201912 | 10000 | 100.0%

금액
Real number (ℝ)

ZEROS 

Distinct: 7307
Distinct (%): 73.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 77079389
Minimum: -9.488935 × 10^8
Maximum: 7.8854545 × 10^9
Zeros: 2344
Zeros (%): 23.4%
Negative: 336
Negative (%): 3.4%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -9.488935 × 10^8
5-th percentile: 0
Q1: 0
Median: 3266224
Q3: 37141852
95-th percentile: 3.7729593 × 10^8
Maximum: 7.8854545 × 10^9
Range: 8.834348 × 10^9
Interquartile range (IQR): 37141852

Descriptive statistics

Standard deviation: 3.0489981 × 10^8
Coefficient of variation (CV): 3.9556594
Kurtosis: 189.69972
Mean: 77079389
Median Absolute Deviation (MAD): 3266224
Skewness: 11.378167
Sum: 7.7079389 × 10^11
Variance: 9.2963892 × 10^16
Monotonicity: Not monotonic
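
Several of the figures reported for 금액 are linked by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, the sum is the mean times the number of observations, the range is maximum minus minimum, and the IQR is Q3 minus Q1. The sketch below recomputes them from the values printed in this report as a consistency check. The variable names are assumptions for illustration, and the exact minimum and maximum are taken from the extreme values listed for this variable.

```python
import math

# Figures as printed in this report (powers of ten read as exponents).
n = 10000
mean = 77079389
std = 3.0489981e8
minimum, maximum = -948893502, 7885454519
q1, q3 = 0, 37141852

cv = std / mean                  # coefficient of variation
variance = std ** 2              # squared standard deviation
total = mean * n                 # sum of all observations
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range

print(f"CV       = {cv:.7f}")
print(f"Variance = {variance:.7e}")
print(f"Sum      = {total:.7e}")
print(f"Range    = {value_range:.6e}")
print(f"IQR      = {iqr}")

# Each recomputed figure should agree with the reported one up to rounding.
assert math.isclose(cv, 3.9556594, rel_tol=1e-6)
assert math.isclose(variance, 9.2963892e16, rel_tol=1e-6)
```

A CV close to 4, together with the large skewness and kurtosis, indicates a heavily right-skewed distribution in which a few very large amounts dominate the mean.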
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 2344 | 23.4%
500000 | 26 | 0.3%
250000 | 19 | 0.2%
200000 | 16 | 0.2%
30000000 | 12 | 0.1%
484000 | 11 | 0.1%
300000 | 11 | 0.1%
15000 | 11 | 0.1%
1000000 | 10 | 0.1%
2000000 | 10 | 0.1%
Other values (7297) | 7530 | 75.3%

Minimum 10 values

Value | Count | Frequency (%)
-948893502 | 1 | < 0.1%
-425620475 | 1 | < 0.1%
-153484580 | 1 | < 0.1%
-147498761 | 1 | < 0.1%
-118687890 | 1 | < 0.1%
-112532600 | 1 | < 0.1%
-89020065 | 1 | < 0.1%
-88705225 | 1 | < 0.1%
-83223130 | 1 | < 0.1%
-80573670 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
7885454519 | 1 | < 0.1%
6855390795 | 1 | < 0.1%
6347940798 | 1 | < 0.1%
6196643966 | 1 | < 0.1%
6109423224 | 1 | < 0.1%
5627160542 | 1 | < 0.1%
5620625369 | 1 | < 0.1%
5347095040 | 1 | < 0.1%
5329912476 | 1 | < 0.1%
4892842102 | 1 | < 0.1%

Interactions


Correlations

 | 비용명 | 금액
비용명 | 1.000 | 0.434
금액 | 0.434 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index) | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
52272 | 고척동아한신 | A15283706 | 장기수선충당부채 | 201912 | 261099700
41244 | 중계그린 | A13986306 | 선급금 | 201912 | 9264980
1989 | 은평지웰테라스 | A10026842 | 장기수선충당예금 | 201912 | 32427064
50451 | 구로한일유엔아이 | A15205104 | 미지급비용 | 201912 | 51272960
51113 | 개봉삼호아파트관리사무소 | A15209202 | 선급금 | 201912 | 0
63794 | 은평뉴타운상림마을7단지 | A41279903 | 선수수도료 | 201912 | 39560
18581 | 쌍문삼익 | A13286304 | 경비비충당부채 | 201912 | 117324227
26443 | 역삼2차아이파크 | A13579503 | 기타충당예금 | 201912 | 0
4437 | 신당푸르지오 | A10045001 | 예금 | 201912 | 270964380
44692 | 현대강변 | A14319201 | 장기수선충당부채 | 201912 | 450821732

(index) | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
49930 | 관악국제산장 | A15176701 | 가수금 | 201912 | 5215540
3863 | 래미안첼리투스 | A10027908 | 단기보증금 | 201912 | 5450000
18424 | 쌍문성원 | A13286106 | 예수금 | 201912 | 985275
18071 | 도봉서울가든 | A13281201 | 기타의비유동자산 | 201912 | 0
47504 | 당산쌍용예가클래식 | A15072001 | 단기보증금 | 201912 | 7820000
15687 | 신내건영1차 | A13185603 | 예금 | 201912 | 178067125
14665 | 망우신원 | A13123101 | 가수금 | 201912 | 1563172
55783 | 이수역리가 | A15609007 | 비품 | 201912 | 39851030
27127 | 구현대2 | A13589802 | 수선유지비충당부채 | 201912 | 0
7979 | 창전삼성 | A12119007 | 연차수당충당부채 | 201912 | 15617030