Overview

Dataset statistics

Number of variables: 12
Number of observations: 10000
Missing cells: 10815
Missing cells (%): 9.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.0 MiB
Average record size in memory: 110.0 B
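
The missing-cell figures can be cross-checked against the per-variable missing counts reported in this profile (4657, 2186, 59, and 3913). The following is a minimal sketch, assuming the percentage is taken over all cells (variables × observations); the dictionary and variable names are illustrative only.

```python
# Per-variable missing counts, copied from this report.
missing_per_variable = {
    "현대슈퍼빌": 4657,
    "업무시설,운동시설,근린생활시설": 2186,
    "42": 59,
    "Unnamed: 6": 3913,
}

n_variables = 12
n_observations = 10_000

missing_cells = sum(missing_per_variable.values())
total_cells = n_variables * n_observations  # assumed denominator
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)          # 10815
print(f"{missing_pct:.1f}%")  # 9.0%
```

The four per-variable counts add up to the reported total, so no other variable contributes missing cells.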

Variable types

Text: 5
Categorical: 2
Numeric: 5

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15401/S/1/datasetView.do

Alerts

6992.960000000 is highly overall correlated with 226180.460000000 [High correlation]
226180.460000000 is highly overall correlated with 6992.960000000 and 1 other fields [High correlation]
46 is highly overall correlated with 226180.460000000 and 1 other fields [High correlation]
3 is highly overall correlated with 46 [High correlation]
현대슈퍼빌 has 4657 (46.6%) missing values [Missing]
업무시설,운동시설,근린생활시설 has 2186 (21.9%) missing values [Missing]
Unnamed: 6 has 3913 (39.1%) missing values [Missing]
6992.960000000 is highly skewed (γ1 = 70.88393568) [Skewed]
226180.460000000 is highly skewed (γ1 = 22.44219467) [Skewed]
11000-1 has unique values [Unique]
6992.960000000 has 3016 (30.2%) zeros [Zeros]
226180.460000000 has 1039 (10.4%) zeros [Zeros]
46 has 232 (2.3%) zeros [Zeros]
3 has 4765 (47.6%) zeros [Zeros]

Reproduction

Analysis started: 2024-05-03 21:58:33.535696
Analysis finished: 2024-05-03 21:58:44.831794
Duration: 11.3 seconds
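
The reported duration follows from the two timestamps. A minimal sketch using only the standard library (the format string is an assumption about how the timestamps are laid out, based on the values shown here):

```python
from datetime import datetime

# Timestamp layout assumed from the values printed in this report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-05-03 21:58:33.535696", FMT)
finished = datetime.strptime("2024-05-03 21:58:44.831794", FMT)

duration = (finished - started).total_seconds()
print(f"{duration:.1f} seconds")  # 11.3 seconds
```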
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

11000-1
Text

UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 15
Median length: 15
Mean length: 13.8753
Min length: 7

Characters and Unicode

Total characters: 138753
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10000
Unique (%): 100.0%

Sample

1st row: 11110-4111
2nd row: 11110-100019540
3rd row: 11110-100013544
4th row: 11110-100016571
5th row: 11110-100017393
Value                Count  Frequency (%)
11110-4111           1      < 0.1%
11110-100031435      1      < 0.1%
11110-4557           1      < 0.1%
11110-100006936      1      < 0.1%
11140-100036323      1      < 0.1%
11110-100022443      1      < 0.1%
11110-1648           1      < 0.1%
11110-100016992      1      < 0.1%
11110-100026502      1      < 0.1%
11140-100005592      1      < 0.1%
Other values (9990)  9990   99.9%

Most occurring characters

ValueCountFrequency (%)
1 50727
36.6%
0 38824
28.0%
- 10000
 
7.2%
4 7757
 
5.6%
2 5945
 
4.3%
3 5244
 
3.8%
5 4556
 
3.3%
6 4541
 
3.3%
7 3844
 
2.8%
8 3825
 
2.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 128753
92.8%
Dash Punctuation 10000
 
7.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 50727
39.4%
0 38824
30.2%
4 7757
 
6.0%
2 5945
 
4.6%
3 5244
 
4.1%
5 4556
 
3.5%
6 4541
 
3.5%
7 3844
 
3.0%
8 3825
 
3.0%
9 3490
 
2.7%
Dash Punctuation
ValueCountFrequency (%)
- 10000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 138753
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 50727
36.6%
0 38824
28.0%
- 10000
 
7.2%
4 7757
 
5.6%
2 5945
 
4.3%
3 5244
 
3.8%
5 4556
 
3.3%
6 4541
 
3.3%
7 3844
 
2.8%
8 3825
 
2.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 138753
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 50727
36.6%
0 38824
28.0%
- 10000
 
7.2%
4 7757
 
5.6%
2 5945
 
4.3%
3 5244
 
3.8%
5 4556
 
3.3%
6 4541
 
3.3%
7 3844
 
2.8%
8 3825
 
2.8%

11000-1.1
Text

Distinct: 9102
Distinct (%): 91.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 15
Median length: 15
Mean length: 13.7712
Min length: 7

Characters and Unicode

Total characters: 137712
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8652
Unique (%): 86.5%

Sample

1st row: 11110-3459
2nd row: 11110-100020333
3rd row: 11110-100014710
4th row: 11110-100010627
5th row: 11110-100018477
Value                Count  Frequency (%)
11140-100030872      108    1.1%
11140-100031373      35     0.4%
11110-100028220      16     0.2%
11110-100025921      15     0.1%
11110-100011141      12     0.1%
11110-100016415      11     0.1%
11110-100028263      11     0.1%
11110-100030948      11     0.1%
11000-100004285      9      0.1%
11110-100034290      9      0.1%
Other values (9092)  9763   97.6%

Most occurring characters

ValueCountFrequency (%)
1 51336
37.3%
0 37687
27.4%
- 10000
 
7.3%
4 7518
 
5.5%
2 6405
 
4.7%
3 5655
 
4.1%
5 4301
 
3.1%
7 3934
 
2.9%
8 3716
 
2.7%
9 3714
 
2.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 127712
92.7%
Dash Punctuation 10000
 
7.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 51336
40.2%
0 37687
29.5%
4 7518
 
5.9%
2 6405
 
5.0%
3 5655
 
4.4%
5 4301
 
3.4%
7 3934
 
3.1%
8 3716
 
2.9%
9 3714
 
2.9%
6 3446
 
2.7%
Dash Punctuation
ValueCountFrequency (%)
- 10000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 137712
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 51336
37.3%
0 37687
27.4%
- 10000
 
7.3%
4 7518
 
5.5%
2 6405
 
4.7%
3 5655
 
4.1%
5 4301
 
3.1%
7 3934
 
2.9%
8 3716
 
2.7%
9 3714
 
2.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 137712
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 51336
37.3%
0 37687
27.4%
- 10000
 
7.3%
4 7518
 
5.5%
2 6405
 
4.7%
3 5655
 
4.1%
5 4301
 
3.1%
7 3934
 
2.9%
8 3716
 
2.7%
9 3714
 
2.7%

현대슈퍼빌
Text

MISSING 

Distinct: 1944
Distinct (%): 36.4%
Missing: 4657
Missing (%): 46.6%
Memory size: 156.2 KiB

Length

Max length: 32
Median length: 25
Mean length: 5.2927194
Min length: 1

Characters and Unicode

Total characters: 28279
Distinct characters: 541
Distinct categories: 13
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
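
The reported mean lengths of the three text variables with missing values are consistent with dividing the total character count by the number of non-missing observations. A minimal sketch of that check, with all figures copied from this report; that the report computes the mean this way is an inference from the numbers, not something the report states.

```python
# (total characters, missing count, reported mean length) per text variable.
reported = {
    "현대슈퍼빌": (28279, 4657, 5.2927194),
    "업무시설,운동시설,근린생활시설": (57454, 2186, 7.3527003),
    "Unnamed: 6": (36420, 3913, 5.983243),
}
n_observations = 10_000

mean_lengths = {}
for name, (total_chars, missing, reported_mean) in reported.items():
    non_missing = n_observations - missing
    mean_lengths[name] = total_chars / non_missing
    # Compare the recomputed value with the one printed in the report.
    print(name, round(mean_lengths[name], 6), reported_mean)
```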

Unique

Unique: 1611
Unique (%): 30.2%

Sample

1st row: 주건축물제1동
2nd row: 1
3rd row: 백상빌딩
4th row: 리안빌리지
5th row: 주건축물제1동
ValueCountFrequency (%)
주건축물제1동 1206
 
19.5%
1 561
 
9.1%
1동 146
 
2.4%
132
 
2.1%
2 104
 
1.7%
a동 72
 
1.2%
주택 52
 
0.8%
3 51
 
0.8%
단독주택 49
 
0.8%
2동 44
 
0.7%
Other values (2023) 3775
61.0%

Most occurring characters

ValueCountFrequency (%)
2873
 
10.2%
1 2427
 
8.6%
1798
 
6.4%
1404
 
5.0%
1404
 
5.0%
1401
 
5.0%
1333
 
4.7%
851
 
3.0%
425
 
1.5%
2 424
 
1.5%
Other values (531) 13939
49.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 22044
78.0%
Decimal Number 4002
 
14.2%
Space Separator 851
 
3.0%
Uppercase Letter 590
 
2.1%
Dash Punctuation 273
 
1.0%
Other Punctuation 194
 
0.7%
Lowercase Letter 109
 
0.4%
Close Punctuation 103
 
0.4%
Open Punctuation 103
 
0.4%
Letter Number 5
 
< 0.1%
Other values (3) 5
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2873
 
13.0%
1798
 
8.2%
1404
 
6.4%
1404
 
6.4%
1401
 
6.4%
1333
 
6.0%
425
 
1.9%
405
 
1.8%
276
 
1.3%
268
 
1.2%
Other values (458) 10457
47.4%
Uppercase Letter
ValueCountFrequency (%)
A 116
19.7%
B 85
14.4%
C 43
 
7.3%
E 42
 
7.1%
S 31
 
5.3%
D 31
 
5.3%
T 30
 
5.1%
F 25
 
4.2%
I 23
 
3.9%
O 22
 
3.7%
Other values (14) 142
24.1%
Lowercase Letter
ValueCountFrequency (%)
e 16
14.7%
a 15
13.8%
r 7
 
6.4%
l 7
 
6.4%
t 7
 
6.4%
i 7
 
6.4%
s 6
 
5.5%
n 6
 
5.5%
k 5
 
4.6%
m 5
 
4.6%
Other values (11) 28
25.7%
Decimal Number
ValueCountFrequency (%)
1 2427
60.6%
2 424
 
10.6%
0 260
 
6.5%
3 237
 
5.9%
4 168
 
4.2%
5 144
 
3.6%
6 112
 
2.8%
7 96
 
2.4%
8 73
 
1.8%
9 61
 
1.5%
Other Punctuation
ValueCountFrequency (%)
. 139
71.6%
, 32
 
16.5%
: 8
 
4.1%
/ 7
 
3.6%
' 4
 
2.1%
# 2
 
1.0%
& 2
 
1.0%
Letter Number
ValueCountFrequency (%)
2
40.0%
1
20.0%
1
20.0%
1
20.0%
Space Separator
ValueCountFrequency (%)
851
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 273
100.0%
Close Punctuation
ValueCountFrequency (%)
) 103
100.0%
Open Punctuation
ValueCountFrequency (%)
( 103
100.0%
Math Symbol
ValueCountFrequency (%)
+ 3
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 22043
77.9%
Common 5531
 
19.6%
Latin 704
 
2.5%
Han 1
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2873
 
13.0%
1798
 
8.2%
1404
 
6.4%
1404
 
6.4%
1401
 
6.4%
1333
 
6.0%
425
 
1.9%
405
 
1.8%
276
 
1.3%
268
 
1.2%
Other values (457) 10456
47.4%
Latin
ValueCountFrequency (%)
A 116
16.5%
B 85
 
12.1%
C 43
 
6.1%
E 42
 
6.0%
S 31
 
4.4%
D 31
 
4.4%
T 30
 
4.3%
F 25
 
3.6%
I 23
 
3.3%
O 22
 
3.1%
Other values (39) 256
36.4%
Common
ValueCountFrequency (%)
1 2427
43.9%
851
 
15.4%
2 424
 
7.7%
- 273
 
4.9%
0 260
 
4.7%
3 237
 
4.3%
4 168
 
3.0%
5 144
 
2.6%
. 139
 
2.5%
6 112
 
2.0%
Other values (14) 496
 
9.0%
Han
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 22042
77.9%
ASCII 6230
 
22.0%
Number Forms 5
 
< 0.1%
Compat Jamo 1
 
< 0.1%
CJK 1
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2873
 
13.0%
1798
 
8.2%
1404
 
6.4%
1404
 
6.4%
1401
 
6.4%
1333
 
6.0%
425
 
1.9%
405
 
1.8%
276
 
1.3%
268
 
1.2%
Other values (456) 10455
47.4%
ASCII
ValueCountFrequency (%)
1 2427
39.0%
851
 
13.7%
2 424
 
6.8%
- 273
 
4.4%
0 260
 
4.2%
3 237
 
3.8%
4 168
 
2.7%
5 144
 
2.3%
. 139
 
2.2%
A 116
 
1.9%
Other values (59) 1191
19.1%
Number Forms
ValueCountFrequency (%)
2
40.0%
1
20.0%
1
20.0%
1
20.0%
Compat Jamo
ValueCountFrequency (%)
1
100.0%
CJK
ValueCountFrequency (%)
1
100.0%

02000
Categorical

Distinct: 30
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
01000: 2840
04000: 2021
03000: 1012
14000: 979
02000: 911
Other values (25): 2237

Length

Max length: 5
Median length: 5
Mean length: 4.9879
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 04000
2nd row: 01000
3rd row: 01000
4th row: 04000
5th row: 28000

Common Values

Value              Count  Frequency (%)
01000              2840   28.4%
04000              2021   20.2%
03000              1012   10.1%
14000              979    9.8%
02000              911    9.1%
28000              770    7.7%
15000              224    2.2%
07000              189    1.9%
10000              160    1.6%
18000              131    1.3%
Other values (20)  763    7.6%

Length

Histogram of lengths of the category

업무시설,운동시설,근린생활시설
Text

MISSING

Distinct: 1775
Distinct (%): 22.7%
Missing: 2186
Missing (%): 21.9%
Memory size: 156.2 KiB

Length

Max length: 46
Median length: 39
Mean length: 7.3527003
Min length: 1

Characters and Unicode

Total characters: 57454
Distinct characters: 301
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1239
Unique (%): 15.9%

Sample

1st row: 수리점
2nd row: 주택
3rd row: 사무소
4th row: 임시창고(농산물 단순 가공 및 농자재 보관)
5th row: 휴게음식점
Value                Count  Frequency (%)
주택                 1444   15.0%
근린생활시설         995    10.3%
업무시설             351    3.6%
다세대주택           317    3.3%
사무실               294    3.1%
일반음식점           288    3.0%
다가구주택           283    2.9%
단독주택             237    2.5%
사무소               232    2.4%
제2종근린생활시설    173    1.8%
Other values (1361)  5024   52.1%

Most occurring characters

ValueCountFrequency (%)
4470
 
7.8%
4143
 
7.2%
3172
 
5.5%
3007
 
5.2%
, 2753
 
4.8%
2150
 
3.7%
2109
 
3.7%
2046
 
3.6%
2020
 
3.5%
1843
 
3.2%
Other values (291) 29741
51.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 49767
86.6%
Other Punctuation 2882
 
5.0%
Space Separator 1826
 
3.2%
Decimal Number 1049
 
1.8%
Open Punctuation 939
 
1.6%
Close Punctuation 937
 
1.6%
Dash Punctuation 20
 
< 0.1%
Math Symbol 12
 
< 0.1%
Lowercase Letter 12
 
< 0.1%
Uppercase Letter 10
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4470
 
9.0%
4143
 
8.3%
3172
 
6.4%
3007
 
6.0%
2150
 
4.3%
2109
 
4.2%
2046
 
4.1%
2020
 
4.1%
1843
 
3.7%
1331
 
2.7%
Other values (251) 23476
47.2%
Decimal Number
ValueCountFrequency (%)
2 524
50.0%
1 360
34.3%
3 52
 
5.0%
4 28
 
2.7%
5 28
 
2.7%
6 17
 
1.6%
7 15
 
1.4%
8 10
 
1.0%
9 8
 
0.8%
0 7
 
0.7%
Lowercase Letter
ValueCountFrequency (%)
e 2
16.7%
r 2
16.7%
x 1
8.3%
o 1
8.3%
h 1
8.3%
n 1
8.3%
w 1
8.3%
s 1
8.3%
k 1
8.3%
m 1
8.3%
Other Punctuation
ValueCountFrequency (%)
, 2753
95.5%
/ 83
 
2.9%
. 40
 
1.4%
: 4
 
0.1%
# 1
 
< 0.1%
· 1
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
D 3
30.0%
F 3
30.0%
M 2
20.0%
G 1
 
10.0%
A 1
 
10.0%
Math Symbol
ValueCountFrequency (%)
+ 10
83.3%
> 1
 
8.3%
< 1
 
8.3%
Open Punctuation
ValueCountFrequency (%)
( 938
99.9%
[ 1
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 936
99.9%
] 1
 
0.1%
Space Separator
ValueCountFrequency (%)
1826
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 20
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 49767
86.6%
Common 7665
 
13.3%
Latin 22
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4470
 
9.0%
4143
 
8.3%
3172
 
6.4%
3007
 
6.0%
2150
 
4.3%
2109
 
4.2%
2046
 
4.1%
2020
 
4.1%
1843
 
3.7%
1331
 
2.7%
Other values (251) 23476
47.2%
Common
ValueCountFrequency (%)
, 2753
35.9%
1826
23.8%
( 938
 
12.2%
) 936
 
12.2%
2 524
 
6.8%
1 360
 
4.7%
/ 83
 
1.1%
3 52
 
0.7%
. 40
 
0.5%
4 28
 
0.4%
Other values (15) 125
 
1.6%
Latin
ValueCountFrequency (%)
D 3
13.6%
F 3
13.6%
e 2
 
9.1%
r 2
 
9.1%
M 2
 
9.1%
x 1
 
4.5%
o 1
 
4.5%
h 1
 
4.5%
n 1
 
4.5%
w 1
 
4.5%
Other values (5) 5
22.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 49765
86.6%
ASCII 7686
 
13.4%
Compat Jamo 2
 
< 0.1%
None 1
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
4470
 
9.0%
4143
 
8.3%
3172
 
6.4%
3007
 
6.0%
2150
 
4.3%
2109
 
4.2%
2046
 
4.1%
2020
 
4.1%
1843
 
3.7%
1331
 
2.7%
Other values (249) 23474
47.2%
ASCII
ValueCountFrequency (%)
, 2753
35.8%
1826
23.8%
( 938
 
12.2%
) 936
 
12.2%
2 524
 
6.8%
1 360
 
4.7%
/ 83
 
1.1%
3 52
 
0.7%
. 40
 
0.5%
4 28
 
0.4%
Other values (29) 146
 
1.9%
Compat Jamo
ValueCountFrequency (%)
1
50.0%
1
50.0%
None
ValueCountFrequency (%)
· 1
100.0%

42
Real number (ℝ)

Distinct: 26
Distinct (%): 0.3%
Missing: 59
Missing (%): 0.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.05764
Minimum: 10
Maximum: 99
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10
5-th percentile: 11
Q1: 21
Median: 21
Q3: 39
95-th percentile: 51
Maximum: 99
Range: 89
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 16.065835
Coefficient of variation (CV): 0.57260109
Kurtosis: 4.9349202
Mean: 28.05764
Median Absolute Deviation (MAD): 10
Skewness: 1.8537297
Sum: 278921
Variance: 258.11106
Monotonicity: Not monotonic
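
Several of the descriptive statistics for variable 42 are derived from one another. The sketch below recomputes them from the reported mean, standard deviation, and quantiles; it assumes the usual definitions (CV = standard deviation / mean, variance = standard deviation squared, sum = mean × non-missing count), which the reported figures agree with.

```python
import math

# Figures copied from the statistics reported for variable "42".
mean = 28.05764
std = 16.065835
minimum, maximum = 10, 99
q1, q3 = 21, 39
n_non_missing = 10_000 - 59  # observations minus missing values

cv = std / mean                 # coefficient of variation
variance = std ** 2
value_range = maximum - minimum
iqr = q3 - q1
total = mean * n_non_missing    # should match the reported Sum

# Compare against the values printed in the report.
assert math.isclose(cv, 0.57260109, rel_tol=1e-6)
assert math.isclose(variance, 258.11106, rel_tol=1e-6)
assert (value_range, iqr) == (89, 18)
assert round(total) == 278921
```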
Histogram with fixed size bins (bins=26)
Value              Count  Frequency (%)
21                 4685   46.9%
11                 1459   14.6%
51                 1369   13.7%
32                 680    6.8%
42                 510    5.1%
31                 364    3.6%
39                 350    3.5%
99                 180    1.8%
12                 138    1.4%
19                 87     0.9%
Other values (16)  119    1.2%
(Missing)          59     0.6%
ValueCountFrequency (%)
10 3
 
< 0.1%
11 1459
 
14.6%
12 138
 
1.4%
13 11
 
0.1%
19 87
 
0.9%
20 1
 
< 0.1%
21 4685
46.9%
22 1
 
< 0.1%
29 2
 
< 0.1%
30 1
 
< 0.1%
ValueCountFrequency (%)
99 180
 
1.8%
74 30
 
0.3%
63 3
 
< 0.1%
61 1
 
< 0.1%
52 2
 
< 0.1%
51 1369
13.7%
50 2
 
< 0.1%
49 10
 
0.1%
43 3
 
< 0.1%
42 510
 
5.1%

Unnamed: 6
Text

MISSING 

Distinct: 616
Distinct (%): 10.1%
Missing: 3913
Missing (%): 39.1%
Memory size: 156.2 KiB

Length

Max length: 31
Median length: 26
Mean length: 5.983243
Min length: 1

Characters and Unicode

Total characters: 36420
Distinct characters: 176
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 407
Unique (%): 6.7%

Sample

1st row: 철근콘크리트조
2nd row: 연와조
3rd row: 철근콘크리트구조
4th row: 철근콘크리트
5th row: 철골철근콘크리트조
Value                 Count  Frequency (%)
목조                  1141   16.8%
철근콘크리트조        1085   16.0%
연와조                950    14.0%
철근콘크리트구조      514    7.6%
컨테이너              473    7.0%
철근콘크리트          386    5.7%
철골철근콘크리트구조  211    3.1%
세멘벽돌조            203    3.0%
철골철근콘크리트조    133    2.0%
철골조                119    1.8%
Other values (430)    1566   23.1%

Most occurring characters

ValueCountFrequency (%)
5947
16.3%
3526
 
9.7%
2669
 
7.3%
2668
 
7.3%
2605
 
7.2%
2602
 
7.1%
2595
 
7.1%
1247
 
3.4%
1127
 
3.1%
1075
 
3.0%
Other values (166) 10359
28.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 34596
95.0%
Other Punctuation 908
 
2.5%
Space Separator 694
 
1.9%
Open Punctuation 58
 
0.2%
Close Punctuation 58
 
0.2%
Uppercase Letter 35
 
0.1%
Decimal Number 31
 
0.1%
Math Symbol 19
 
0.1%
Lowercase Letter 16
 
< 0.1%
Dash Punctuation 5
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5947
17.2%
3526
10.2%
2669
 
7.7%
2668
 
7.7%
2605
 
7.5%
2602
 
7.5%
2595
 
7.5%
1247
 
3.6%
1127
 
3.3%
1075
 
3.1%
Other values (128) 8535
24.7%
Uppercase Letter
ValueCountFrequency (%)
R 9
25.7%
C 5
14.3%
F 4
11.4%
P 4
11.4%
A 3
 
8.6%
S 3
 
8.6%
B 3
 
8.6%
E 3
 
8.6%
D 1
 
2.9%
Lowercase Letter
ValueCountFrequency (%)
e 3
18.8%
b 2
12.5%
p 2
12.5%
r 2
12.5%
f 2
12.5%
a 2
12.5%
g 1
 
6.2%
i 1
 
6.2%
t 1
 
6.2%
Decimal Number
ValueCountFrequency (%)
1 9
29.0%
3 6
19.4%
2 5
16.1%
6 4
12.9%
4 3
 
9.7%
8 2
 
6.5%
0 1
 
3.2%
5 1
 
3.2%
Other Punctuation
ValueCountFrequency (%)
, 831
91.5%
/ 31
 
3.4%
. 28
 
3.1%
: 16
 
1.8%
1
 
0.1%
· 1
 
0.1%
Math Symbol
ValueCountFrequency (%)
+ 18
94.7%
~ 1
 
5.3%
Space Separator
ValueCountFrequency (%)
694
100.0%
Open Punctuation
ValueCountFrequency (%)
( 58
100.0%
Close Punctuation
ValueCountFrequency (%)
) 58
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 34596
95.0%
Common 1773
 
4.9%
Latin 51
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5947
17.2%
3526
10.2%
2669
 
7.7%
2668
 
7.7%
2605
 
7.5%
2602
 
7.5%
2595
 
7.5%
1247
 
3.6%
1127
 
3.3%
1075
 
3.1%
Other values (128) 8535
24.7%
Common
ValueCountFrequency (%)
, 831
46.9%
694
39.1%
( 58
 
3.3%
) 58
 
3.3%
/ 31
 
1.7%
. 28
 
1.6%
+ 18
 
1.0%
: 16
 
0.9%
1 9
 
0.5%
3 6
 
0.3%
Other values (10) 24
 
1.4%
Latin
ValueCountFrequency (%)
R 9
17.6%
C 5
9.8%
F 4
 
7.8%
P 4
 
7.8%
A 3
 
5.9%
S 3
 
5.9%
e 3
 
5.9%
B 3
 
5.9%
E 3
 
5.9%
b 2
 
3.9%
Other values (8) 12
23.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 34596
95.0%
ASCII 1822
 
5.0%
None 2
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
5947
17.2%
3526
10.2%
2669
 
7.7%
2668
 
7.7%
2605
 
7.5%
2602
 
7.5%
2595
 
7.5%
1247
 
3.6%
1127
 
3.3%
1075
 
3.1%
Other values (128) 8535
24.7%
ASCII
ValueCountFrequency (%)
, 831
45.6%
694
38.1%
( 58
 
3.2%
) 58
 
3.2%
/ 31
 
1.7%
. 28
 
1.5%
+ 18
 
1.0%
: 16
 
0.9%
1 9
 
0.5%
R 9
 
0.5%
Other values (26) 70
 
3.8%
None
ValueCountFrequency (%)
1
50.0%
· 1
50.0%

10
Categorical

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
10: 6077
20: 1639
<NA>: 1177
90: 1024
30: 82

Length

Max length: 4
Median length: 2
Mean length: 2.2354
Min length: 2

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 90
2nd row: 10
3rd row: 20
4th row: 10
5th row: <NA>

Common Values

Value  Count  Frequency (%)
10     6077   60.8%
20     1639   16.4%
<NA>   1177   11.8%
90     1024   10.2%
30     82     0.8%
39     1      < 0.1%
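
This variable reports 0 missing values, yet `<NA>` appears as one of its categories with a maximum length of 4. The reported mean length is reproduced exactly if `<NA>` is counted as a literal four-character string, which suggests (an inference, not something the report states) that missing entries were converted to text before profiling. A minimal sketch of the check:

```python
# Category counts copied from the Common Values table for variable "10".
counts = {"10": 6077, "20": 1639, "<NA>": 1177, "90": 1024, "30": 82, "39": 1}

n_values = sum(counts.values())
total_length = sum(len(value) * count for value, count in counts.items())
mean_length = total_length / n_values

print(n_values)     # 10000
print(mean_length)  # 2.2354
```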

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
10 6077
60.8%
20 1639
 
16.4%
na 1177
 
11.8%
90 1024
 
10.2%
30 82
 
0.8%
39 1
 
< 0.1%

6992.960000000
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 4989
Distinct (%): 49.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1114.7502
Minimum: -3244.74
Maximum: 4125652
Zeros: 3016
Zeros (%): 30.2%
Negative: 30
Negative (%): 0.3%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -3244.74
5-th percentile: 0
Q1: 0
Median: 46.13
Q3: 148.2125
95-th percentile: 1315.54
Maximum: 4125652
Range: 4128896.7
Interquartile range (IQR): 148.2125

Descriptive statistics

Standard deviation: 55541.925
Coefficient of variation (CV): 49.824546
Kurtosis: 5041.9104
Mean: 1114.7502
Median Absolute Deviation (MAD): 46.13
Skewness: 70.883936
Sum: 11147502
Variance: 3.0849054 × 10^9
Monotonicity: Not monotonic
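
The skewness reported here (γ1 in the Skewed alert) summarises how lopsided the distribution is: a long right tail, such as a few very large values above a mass of zeros, yields a large positive value. The sketch below implements one common estimator, the adjusted Fisher-Pearson coefficient. Whether the report uses exactly this estimator is an assumption, and the toy data is invented for illustration.

```python
import math

def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness coefficient of a sequence of numbers."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # all values identical: no asymmetry
    g1 = m3 / m2 ** 1.5
    # Small-sample correction applied to the moment-based coefficient g1.
    return math.sqrt(n * (n - 1)) / (n - 2) * g1

# Invented toy data, not taken from the dataset.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_tailed = [0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 1000.0]

print(sample_skewness(symmetric))         # 0.0
print(sample_skewness(right_tailed) > 0)  # True
```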
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.0 3016
30.2%
18.0 328
 
3.3%
27.0 130
 
1.3%
9.0 70
 
0.7%
5.76 32
 
0.3%
36.0 25
 
0.2%
4671.76 24
 
0.2%
54.0 22
 
0.2%
12.0 17
 
0.2%
33.06 17
 
0.2%
Other values (4979) 6319
63.2%
ValueCountFrequency (%)
-3244.74 1
< 0.1%
-239.99 2
< 0.1%
-216.66 1
< 0.1%
-164.1 1
< 0.1%
-95.97 1
< 0.1%
-56.2 1
< 0.1%
-37.11 1
< 0.1%
-30.56 2
< 0.1%
-27.4 1
< 0.1%
-25.74 1
< 0.1%
ValueCountFrequency (%)
4125652.0 1
< 0.1%
3715601.0 1
< 0.1%
80409.6 1
< 0.1%
54944.55 2
< 0.1%
54936.68 1
< 0.1%
25500.54 1
< 0.1%
24913.76 1
< 0.1%
23742.37 1
< 0.1%
22793.79 1
< 0.1%
19556.91 1
< 0.1%

226180.460000000
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 6656
Distinct (%): 66.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4305.3519
Minimum: -13432.91
Maximum: 1290999.3
Zeros: 1039
Zeros (%): 10.4%
Negative: 35
Negative (%): 0.4%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -13432.91
5-th percentile: 0
Q1: 29.7225
Median: 126.765
Q3: 588.77
95-th percentile: 18370.068
Maximum: 1290999.3
Range: 1304432.2
Interquartile range (IQR): 559.0475

Descriptive statistics

Standard deviation: 22763.23
Coefficient of variation (CV): 5.2871938
Kurtosis: 1059.0252
Mean: 4305.3519
Median Absolute Deviation (MAD): 126.765
Skewness: 22.442195
Sum: 43053519
Variance: 5.1816464 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.0 1039
 
10.4%
18.0 276
 
2.8%
27.0 102
 
1.0%
36.0 81
 
0.8%
9.0 66
 
0.7%
54.0 51
 
0.5%
33.06 41
 
0.4%
46.28 41
 
0.4%
5.76 33
 
0.3%
49.59 31
 
0.3%
Other values (6646) 8239
82.4%
ValueCountFrequency (%)
-13432.91 1
< 0.1%
-3044.7 1
< 0.1%
-1333.66 1
< 0.1%
-1131.98 1
< 0.1%
-1004.36 2
< 0.1%
-809.24 1
< 0.1%
-196.68 1
< 0.1%
-178.48 1
< 0.1%
-126.12 1
< 0.1%
-82.34 1
< 0.1%
ValueCountFrequency (%)
1290999.29 1
< 0.1%
426635.55 1
< 0.1%
274056.67 1
< 0.1%
267746.04 1
< 0.1%
265791.63 1
< 0.1%
263708.82 1
< 0.1%
262184.08 2
< 0.1%
262143.41 1
< 0.1%
249837.37 1
< 0.1%
246003.09 1
< 0.1%

46
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 49
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.4488
Minimum: 0
Maximum: 72
Zeros: 232
Zeros (%): 2.3%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
Median: 3
Q3: 5
95-th percentile: 16
Maximum: 72
Range: 72
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 5.5607617
Coefficient of variation (CV): 1.2499464
Kurtosis: 14.444624
Mean: 4.4488
Median Absolute Deviation (MAD): 2
Skewness: 3.2326779
Sum: 44488
Variance: 30.922071
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=49)
ValueCountFrequency (%)
1 2571
25.7%
2 1998
20.0%
3 1354
13.5%
4 1049
10.5%
5 883
 
8.8%
6 334
 
3.3%
0 232
 
2.3%
7 190
 
1.9%
10 180
 
1.8%
8 133
 
1.3%
Other values (39) 1076
10.8%
ValueCountFrequency (%)
0 232
 
2.3%
1 2571
25.7%
2 1998
20.0%
3 1354
13.5%
4 1049
10.5%
5 883
 
8.8%
6 334
 
3.3%
7 190
 
1.9%
8 133
 
1.3%
9 114
 
1.1%
ValueCountFrequency (%)
72 1
 
< 0.1%
69 1
 
< 0.1%
51 1
 
< 0.1%
49 1
 
< 0.1%
48 1
 
< 0.1%
46 1
 
< 0.1%
43 1
 
< 0.1%
41 2
< 0.1%
40 3
< 0.1%
39 2
< 0.1%

3
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 12
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9656
Minimum: 0
Maximum: 19
Zeros: 4765
Zeros (%): 47.6%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 1
95-th percentile: 4
Maximum: 19
Range: 19
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.5066
Coefficient of variation (CV): 1.5602734
Kurtosis: 8.5258396
Mean: 0.9656
Median Absolute Deviation (MAD): 1
Skewness: 2.6285462
Sum: 9656
Variance: 2.2698436
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%)
0 4765
47.6%
1 3636
36.4%
2 611
 
6.1%
3 255
 
2.5%
4 238
 
2.4%
6 154
 
1.5%
7 150
 
1.5%
5 147
 
1.5%
8 35
 
0.4%
9 7
 
0.1%
Other values (2) 2
 
< 0.1%
ValueCountFrequency (%)
0 4765
47.6%
1 3636
36.4%
2 611
 
6.1%
3 255
 
2.5%
4 238
 
2.4%
5 147
 
1.5%
6 154
 
1.5%
7 150
 
1.5%
8 35
 
0.4%
9 7
 
0.1%
ValueCountFrequency (%)
19 1
 
< 0.1%
10 1
 
< 0.1%
9 7
 
0.1%
8 35
 
0.4%
7 150
 
1.5%
6 154
 
1.5%
5 147
 
1.5%
4 238
 
2.4%
3 255
2.5%
2 611
6.1%

Interactions

[25 pairwise interaction plots of the numeric variables; images not reproduced]

Correlations

                  02000  42     10     6992.960000000  226180.460000000  46     3
02000             1.000  0.670  0.395  0.000           0.287             0.571  0.568
42                0.670  1.000  0.587  0.000           0.150             0.322  0.364
10                0.395  0.587  1.000  0.000           0.056             0.216  0.199
6992.960000000    0.000  0.000  0.000  1.000           0.000             0.057  0.043
226180.460000000  0.287  0.150  0.056  0.000           1.000             0.431  0.245
46                0.571  0.322  0.216  0.057           0.431             1.000  0.566
3                 0.568  0.364  0.199  0.043           0.245             0.566  1.000

       02000  10
02000  1.000  0.202
10     0.202  1.000

                  42      6992.960000000  226180.460000000  46      3       02000  10
42                1.000   -0.074          -0.190            -0.243  -0.235  0.339  0.448
6992.960000000    -0.074  1.000           0.778             0.409   0.326   0.000  0.000
226180.460000000  -0.190  0.778           1.000             0.505   0.434   0.140  0.021
46                -0.243  0.409           0.505             1.000   0.647   0.254  0.126
3                 -0.235  0.326           0.434             0.647   1.000   0.275  0.128
02000             0.339   0.000           0.140             0.254   0.275   1.000  0.202
10                0.448   0.000           0.021             0.126   0.128   0.202  1.000
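
The High correlation alerts in the Overview can be reproduced from the last correlation matrix in this section by listing the variable pairs whose absolute coefficient reaches a cut-off. The sketch below assumes a cut-off of 0.5; the report does not state the threshold, but this value yields exactly the pairs named in the alerts.

```python
from itertools import combinations

# Last correlation matrix of this section, copied row by row.
names = ["42", "6992.960000000", "226180.460000000", "46", "3", "02000", "10"]
matrix = [
    [1.000, -0.074, -0.190, -0.243, -0.235, 0.339, 0.448],
    [-0.074, 1.000, 0.778, 0.409, 0.326, 0.000, 0.000],
    [-0.190, 0.778, 1.000, 0.505, 0.434, 0.140, 0.021],
    [-0.243, 0.409, 0.505, 1.000, 0.647, 0.254, 0.126],
    [-0.235, 0.326, 0.434, 0.647, 1.000, 0.275, 0.128],
    [0.339, 0.000, 0.140, 0.254, 0.275, 1.000, 0.202],
    [0.448, 0.000, 0.021, 0.126, 0.128, 0.202, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, not stated in the report

# Each unordered pair once, keeping those at or above the cut-off.
high_pairs = [
    (names[i], names[j], matrix[i][j])
    for i, j in combinations(range(len(names)), 2)
    if abs(matrix[i][j]) >= THRESHOLD
]

for left, right, value in high_pairs:
    print(f"{left} ~ {right}: {value:.3f}")
```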

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

11000-111000-1.1현대슈퍼빌02000업무시설,운동시설,근린생활시설42Unnamed: 6106992.960000000226180.460000000463
1506411110-411111110-3459<NA>04000수리점51<NA>9059.559.510
625311110-10001954011110-100020333<NA>01000주택21철근콘크리트조100.00.030
359011110-10001354411110-100014710주건축물제1동01000<NA>51<NA>2033.0633.0610
473611110-10001657111110-100010627<NA>04000사무소21<NA>10115.81491.7651
517511110-10001739311110-100018477128000임시창고(농산물 단순 가공 및 농자재 보관)32<NA><NA>36.036.010
119211110-10000707011110-100008507<NA>03000휴게음식점11연와조900.00.020
267711110-10001093511110-100012880백상빌딩03000제1종근린생활시설, 업무시설21철근콘크리트구조10615.172995.6841
1734211140-10000760611140-100004792리안빌리지02000다세대21<NA>10205.95658.1150
1978511140-10002862411140-100025332<NA>14000업무시설21철근콘크리트102496.0539343.15205
2306611140-10006088611140-100051152<NA>07000판매시설, 업무시설42철골철근콘크리트조102431.1449938.92187
11000-111000-1.1현대슈퍼빌02000업무시설,운동시설,근린생활시설42Unnamed: 6106992.960000000226180.460000000463
635111110-10001978311110-100019643<NA>04000제1종근린생활시설,영업용, 위락시설(전자유기장)21철근콘크리트조10161.16943.3560
1851211140-10001535911140-100013827<NA>14000사무실, 근린생활시설21철근콘크리트10903.715533.661
2256311140-10005406811140-100046032A동14000업무시설, 근린생활시설, 판매시설, 노유자시설42철골철근콘크리트조, 철근콘크리트조108013.45132792.56232
1626911110-536311110-4628창신동 다세대02000다세대주택21경량철골1050.6722.3850
1175011110-10004138511110-100038891주건축물제1동10000문화및집회시설,교육연구시설,근린생활시설41철골콘크리트구조10602.556377.9255
2112611140-10004048511140-100033992<NA>04000<NA>21<NA>10261.161135.2151
37811110-10000514411110-100004265104000<NA>51목조200.00.010
51611110-10000549111110-100005054가설-128000공사용가설39콘테이너<NA>18.018.010
833311110-10002544011110-100025793<NA>04000점포11연와조200.063.1410
1526911110-431711110-3632종로6가 창고18000<NA>31<NA>9067.16134.3220
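
The column names in this report (11000-1, 02000, 42, 6992.960000000, and so on) look like data values rather than field names, which suggests the source file was read with its first record treated as the header. pandas names blank header cells "Unnamed: N" and appends ".1" to a repeated name, which would explain "Unnamed: 6" and "11000-1.1". A minimal sketch of a check that flags such headers; the function name and categories are hypothetical.

```python
def classify_column_name(name):
    """Roughly classify a column name as a placeholder, a number, or text."""
    if name.startswith("Unnamed: "):
        return "placeholder"      # blank header cell filled in by the reader
    try:
        float(name)
    except ValueError:
        return "text"
    return "numeric-looking"      # parses as a number, so probably a data value

# Column names as they appear in this report.
columns = [
    "11000-1", "11000-1.1", "현대슈퍼빌", "02000",
    "업무시설,운동시설,근린생활시설", "42", "Unnamed: 6", "10",
    "6992.960000000", "226180.460000000", "46", "3",
]

summary = {}
for column in columns:
    kind = classify_column_name(column)
    summary[kind] = summary.get(kind, 0) + 1

print(summary)
```

If most names are numeric-looking, re-reading the file without a header row and assigning the real field names before profiling would likely give a more readable report.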