Overview

Dataset statistics

Number of variables: 11
Number of observations: 10000
Missing cells: 24481
Missing cells (%): 22.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 996.1 KiB
Average record size in memory: 102.0 B
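
The derived figures above can be cross-checked with a little arithmetic. The sketch below assumes that "Missing cells (%)" is the missing count divided by variables times observations, and that the average record size is the total memory divided by the number of observations. The per-variable missing counts are the ones reported further down in this report; the name attached to the count of 93 is inferred from column order because its heading did not survive extraction.

```python
# Figures copied from the Overview section of this report.
n_vars = 11
n_obs = 10000
total_kib = 996.1

# Per-variable missing counts as reported in the variable sections.
missing_by_var = {
    "건축_구분_코드": 318,
    "주_용도_코드": 93,  # variable name inferred from column order
    "기타_용도": 5384,
    "구조_코드": 66,
    "기타_구조": 8642,
    "층_일련번호": 9978,
}

total_missing = sum(missing_by_var.values())
missing_pct = round(100 * total_missing / (n_vars * n_obs), 1)
avg_record_bytes = round(total_kib * 1024 / n_obs, 1)

print(total_missing, missing_pct, avg_record_bytes)
```

Under these assumptions the per-variable counts add up to the reported 24481 missing cells, and both derived figures match the table.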

Variable types

Text: 5
Numeric: 5
Categorical: 1

Dataset

Description: 관리_층별_개요_PK, 관리_동별_개요_PK, 층_번호, 건축_구분_코드, 주_용도_코드, 기타_용도, 구조_코드, 기타_구조, 층_면적, 층_구분_코드, 층_일련번호
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15402/S/1/datasetView.do

Alerts

층_번호 is highly overall correlated with 층_일련번호 [High correlation]
건축_구분_코드 is highly overall correlated with 층_일련번호 [High correlation]
층_면적 is highly overall correlated with 층_일련번호 [High correlation]
층_일련번호 is highly overall correlated with 층_번호 and 2 other fields [High correlation]
건축_구분_코드 has 318 (3.2%) missing values [Missing]
기타_용도 has 5384 (53.8%) missing values [Missing]
기타_구조 has 8642 (86.4%) missing values [Missing]
층_일련번호 has 9978 (99.8%) missing values [Missing]
층_면적 is highly skewed (γ1 = 99.26012424) [Skewed]
관리_층별_개요_PK has unique values [Unique]
층_면적 has 377 (3.8%) zeros [Zeros]
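
To give a feel for the skewness alert, here is a minimal sketch of the moment-based skewness coefficient γ1 in plain Python. It uses the population form (third central moment over the 1.5 power of the second). The profiling tool may apply a bias-corrected variant, so this is an illustration rather than the tool's exact computation. The toy data are hypothetical: 99 identical values plus the maximum 층_면적 reported in this document.

```python
import math


def skewness(values):
    """Population skewness: m3 / m2**1.5, using central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# A symmetric sample has zero skewness.
symmetric = skewness([1.0, 2.0, 3.0])

# Hypothetical toy sample: 99 identical values and one extreme value.
toy = [10.0] * 99 + [1217235.0]
one_outlier = skewness(toy)

# For n - 1 equal values plus one outlier, the population skewness
# equals (n - 2) / sqrt(n - 1), regardless of the outlier's size.
bound = (len(toy) - 2) / math.sqrt(len(toy) - 1)
print(symmetric, one_outlier, bound)
```

For n = 10000 the same expression gives roughly 99.99, so the reported γ1 of about 99.26 is close to what a single dominant outlier would produce.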

Reproduction

Analysis started: 2024-05-04 04:30:55.182759
Analysis finished: 2024-05-04 04:31:07.750101
Duration: 12.57 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
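
A report with this layout could be regenerated roughly as follows. This is a minimal sketch, not the original script: the CSV path, the `build_report` helper, and the choice to read `주_용도_코드` as a string (to keep leading zeros such as `01003`) are assumptions. The step that produced the 10000 sampled rows is not shown because the report does not describe it.

```python
# Dataset metadata copied from the "Dataset" section of this report.
DATASET_META = {
    "description": (
        "관리_층별_개요_PK,관리_동별_개요_PK,층_번호,건축_구분_코드,주_용도_코드,"
        "기타_용도,구조_코드,기타_구조,층_면적,층_구분_코드,층_일련번호"
    ),
    "author": "서울특별시",
    "url": "https://data.seoul.go.kr/dataList/OA-15402/S/1/datasetView.do",
}


def build_report(csv_path, out_path="report.html"):
    """Profile a CSV export of the dataset (csv_path is a hypothetical file)."""
    # Imported lazily so this module still loads where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: keep the usage code as text so leading zeros survive.
    df = pd.read_csv(csv_path, dtype={"주_용도_코드": "string"})
    profile = ProfileReport(df, title="Profiling report", dataset=DATASET_META)
    profile.to_file(out_path)
    return out_path
```

Calling `build_report("floors.csv")` with a real export would write an HTML report. The timings and memory figures would differ from those listed above.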

Variables

관리_층별_개요_PK
Text

UNIQUE

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 15
Mean length: 11.7903
Min length: 7

Characters and Unicode

Total characters: 117903
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10000
Unique (%): 100.0%

Sample

1st row: 11230-9851
2nd row: 11215-14567
3rd row: 11260-5592
4th row: 11140-14946
5th row: 11110-9898
ValueCountFrequency (%)
11230-9851 1
 
< 0.1%
11140-19134 1
 
< 0.1%
11140-15770 1
 
< 0.1%
11215-22697 1
 
< 0.1%
11170-9367 1
 
< 0.1%
11215-20355 1
 
< 0.1%
11260-100060149 1
 
< 0.1%
11140-100015574 1
 
< 0.1%
11215-29700 1
 
< 0.1%
11200-2384 1
 
< 0.1%
Other values (9990) 9990
99.9%

Most occurring characters

ValueCountFrequency (%)
1 37962
32.2%
0 22991
19.5%
- 10000
 
8.5%
2 9491
 
8.0%
4 6377
 
5.4%
5 6346
 
5.4%
3 5704
 
4.8%
7 5671
 
4.8%
6 4849
 
4.1%
8 4452
 
3.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 107903
91.5%
Dash Punctuation 10000
 
8.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 37962
35.2%
0 22991
21.3%
2 9491
 
8.8%
4 6377
 
5.9%
5 6346
 
5.9%
3 5704
 
5.3%
7 5671
 
5.3%
6 4849
 
4.5%
8 4452
 
4.1%
9 4060
 
3.8%
Dash Punctuation
ValueCountFrequency (%)
- 10000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 117903
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 37962
32.2%
0 22991
19.5%
- 10000
 
8.5%
2 9491
 
8.0%
4 6377
 
5.4%
5 6346
 
5.4%
3 5704
 
4.8%
7 5671
 
4.8%
6 4849
 
4.1%
8 4452
 
3.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 117903
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 37962
32.2%
0 22991
19.5%
- 10000
 
8.5%
2 9491
 
8.0%
4 6377
 
5.4%
5 6346
 
5.4%
3 5704
 
4.8%
7 5671
 
4.8%
6 4849
 
4.1%
8 4452
 
3.8%

관리_동별_개요_PK
Text

Distinct: 7358
Distinct (%): 73.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 10
Mean length: 11.2709
Min length: 7

Characters and Unicode

Total characters: 112709
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5632
Unique (%): 56.3%

Sample

1st row: 11230-3170
2nd row: 11215-3087
3rd row: 11260-1511
4th row: 11140-2689
5th row: 11110-2102
ValueCountFrequency (%)
11000-100006267 22
 
0.2%
11000-100006268 14
 
0.1%
11000-173 13
 
0.1%
11260-100118116 12
 
0.1%
11680-100196685 12
 
0.1%
11140-100007179 11
 
0.1%
11000-106 11
 
0.1%
11140-3139 10
 
0.1%
11110-776 10
 
0.1%
11170-2373 10
 
0.1%
Other values (7348) 9875
98.8%

Most occurring characters

ValueCountFrequency (%)
1 35730
31.7%
0 22849
20.3%
- 10000
 
8.9%
2 9336
 
8.3%
3 6245
 
5.5%
4 6104
 
5.4%
5 6052
 
5.4%
7 4933
 
4.4%
6 4434
 
3.9%
8 3581
 
3.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 102709
91.1%
Dash Punctuation 10000
 
8.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 35730
34.8%
0 22849
22.2%
2 9336
 
9.1%
3 6245
 
6.1%
4 6104
 
5.9%
5 6052
 
5.9%
7 4933
 
4.8%
6 4434
 
4.3%
8 3581
 
3.5%
9 3445
 
3.4%
Dash Punctuation
ValueCountFrequency (%)
- 10000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 112709
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 35730
31.7%
0 22849
20.3%
- 10000
 
8.9%
2 9336
 
8.3%
3 6245
 
5.5%
4 6104
 
5.4%
5 6052
 
5.4%
7 4933
 
4.4%
6 4434
 
3.9%
8 3581
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 112709
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 35730
31.7%
0 22849
20.3%
- 10000
 
8.9%
2 9336
 
8.3%
3 6245
 
5.5%
4 6104
 
5.4%
5 6052
 
5.4%
7 4933
 
4.4%
6 4434
 
3.9%
8 3581
 
3.2%

층_번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 50
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.2736
Minimum: 0
Maximum: 58
Zeros: 20
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
Median: 2
Q3: 4
95-th percentile: 10
Maximum: 58
Range: 58
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 4.2712255
Coefficient of variation (CV): 1.3047488
Kurtosis: 32.365941
Mean: 3.2736
Median Absolute Deviation (MAD): 1
Skewness: 4.7896568
Sum: 32736
Variance: 18.243367
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 3769
37.7%
2 1976
19.8%
3 1532
15.3%
4 1050
 
10.5%
5 570
 
5.7%
6 230
 
2.3%
7 137
 
1.4%
8 98
 
1.0%
9 88
 
0.9%
10 71
 
0.7%
Other values (40) 479
 
4.8%
ValueCountFrequency (%)
0 20
 
0.2%
1 3769
37.7%
2 1976
19.8%
3 1532
15.3%
4 1050
 
10.5%
5 570
 
5.7%
6 230
 
2.3%
7 137
 
1.4%
8 98
 
1.0%
9 88
 
0.9%
ValueCountFrequency (%)
58 1
 
< 0.1%
56 1
 
< 0.1%
55 1
 
< 0.1%
52 1
 
< 0.1%
51 1
 
< 0.1%
50 1
 
< 0.1%
46 1
 
< 0.1%
45 3
< 0.1%
44 1
 
< 0.1%
42 1
 
< 0.1%

건축_구분_코드
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 12
Distinct (%): 0.1%
Missing: 318
Missing (%): 3.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 317.30138
Minimum: 100
Maximum: 3000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 100
5-th percentile: 100
Q1: 100
Median: 100
Q3: 200
95-th percentile: 900
Maximum: 3000
Range: 2900
Interquartile range (IQR): 100

Descriptive statistics

Standard deviation: 448.72273
Coefficient of variation (CV): 1.4141846
Kurtosis: 7.2269288
Mean: 317.30138
Median Absolute Deviation (MAD): 0
Skewness: 2.7062211
Sum: 3072112
Variance: 201352.09
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%)
100 6139
61.4%
200 1296
 
13.0%
700 1008
 
10.1%
2000 472
 
4.7%
600 432
 
4.3%
900 290
 
2.9%
300 17
 
0.2%
800 11
 
0.1%
500 9
 
0.1%
400 4
 
< 0.1%
Other values (2) 4
 
< 0.1%
(Missing) 318
 
3.2%
ValueCountFrequency (%)
100 6139
61.4%
200 1296
 
13.0%
212 1
 
< 0.1%
300 17
 
0.2%
400 4
 
< 0.1%
500 9
 
0.1%
600 432
 
4.3%
700 1008
 
10.1%
800 11
 
0.1%
900 290
 
2.9%
ValueCountFrequency (%)
3000 3
 
< 0.1%
2000 472
4.7%
900 290
 
2.9%
800 11
 
0.1%
700 1008
10.1%
600 432
4.3%
500 9
 
0.1%
400 4
 
< 0.1%
300 17
 
0.2%
212 1
 
< 0.1%

주_용도_코드
Text

Distinct: 220
Distinct (%): 2.2%
Missing: 93
Missing (%): 0.9%
Memory size: 156.2 KiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 49535
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 58
Unique (%): 0.6%

Sample

1st row: 01003
2nd row: 02003
3rd row: 02003
4th row: 04499
5th row: 02003
ValueCountFrequency (%)
02003 2069
20.9%
01003 1544
15.6%
01001 675
 
6.8%
02001 508
 
5.1%
03001 463
 
4.7%
04402 424
 
4.3%
14202 348
 
3.5%
04001 344
 
3.5%
14204 267
 
2.7%
04499 255
 
2.6%
Other values (210) 3010
30.4%

Most occurring characters

ValueCountFrequency (%)
0 24337
49.1%
1 7253
 
14.6%
2 5245
 
10.6%
3 4794
 
9.7%
4 3716
 
7.5%
9 2861
 
5.8%
5 525
 
1.1%
7 272
 
0.5%
8 217
 
0.4%
6 196
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 49416
99.8%
Uppercase Letter 119
 
0.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 24337
49.2%
1 7253
 
14.7%
2 5245
 
10.6%
3 4794
 
9.7%
4 3716
 
7.5%
9 2861
 
5.8%
5 525
 
1.1%
7 272
 
0.6%
8 217
 
0.4%
6 196
 
0.4%
Uppercase Letter
ValueCountFrequency (%)
Z 119
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 49416
99.8%
Latin 119
 
0.2%

Most frequent character per script

Common
ValueCountFrequency (%)
0 24337
49.2%
1 7253
 
14.7%
2 5245
 
10.6%
3 4794
 
9.7%
4 3716
 
7.5%
9 2861
 
5.8%
5 525
 
1.1%
7 272
 
0.6%
8 217
 
0.4%
6 196
 
0.4%
Latin
ValueCountFrequency (%)
Z 119
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 49535
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 24337
49.1%
1 7253
 
14.6%
2 5245
 
10.6%
3 4794
 
9.7%
4 3716
 
7.5%
9 2861
 
5.8%
5 525
 
1.1%
7 272
 
0.5%
8 217
 
0.4%
6 196
 
0.4%

기타_용도
Text

MISSING 

Distinct: 1261
Distinct (%): 27.3%
Missing: 5384
Missing (%): 53.8%
Memory size: 156.2 KiB

Length

Max length: 42
Median length: 41
Mean length: 6.089688
Min length: 1

Characters and Unicode

Total characters: 28110
Distinct characters: 342
Distinct categories: 13
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 898
Unique (%): 19.5%

Sample

1st row: 2세대
2nd row: 4세대
3rd row: 소매점(37.98),주차장(12.60)
4th row: 공동주택
5th row: 1가구
ValueCountFrequency (%)
1가구 395
 
8.1%
2세대 342
 
7.0%
주차장 331
 
6.8%
계단실 327
 
6.7%
1세대 186
 
3.8%
2가구 163
 
3.3%
계단실(연면적제외 137
 
2.8%
사무소 82
 
1.7%
사무실 71
 
1.5%
3세대 70
 
1.4%
Other values (1191) 2775
56.9%

Most occurring characters

ValueCountFrequency (%)
1476
 
5.3%
( 1277
 
4.5%
) 1273
 
4.5%
1062
 
3.8%
1045
 
3.7%
1022
 
3.6%
992
 
3.5%
906
 
3.2%
1 861
 
3.1%
823
 
2.9%
Other values (332) 17373
61.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 21382
76.1%
Decimal Number 2328
 
8.3%
Open Punctuation 1282
 
4.6%
Close Punctuation 1278
 
4.5%
Other Punctuation 967
 
3.4%
Uppercase Letter 449
 
1.6%
Space Separator 263
 
0.9%
Dash Punctuation 96
 
0.3%
Lowercase Letter 26
 
0.1%
Math Symbol 20
 
0.1%
Other values (3) 19
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1476
 
6.9%
1062
 
5.0%
1045
 
4.9%
1022
 
4.8%
992
 
4.6%
906
 
4.2%
823
 
3.8%
804
 
3.8%
645
 
3.0%
623
 
2.9%
Other values (271) 11984
56.0%
Uppercase Letter
ValueCountFrequency (%)
E 178
39.6%
V 91
20.3%
L 87
19.4%
F 14
 
3.1%
M 14
 
3.1%
D 13
 
2.9%
T 12
 
2.7%
P 9
 
2.0%
I 8
 
1.8%
A 7
 
1.6%
Other values (7) 16
 
3.6%
Lowercase Letter
ValueCountFrequency (%)
m 8
30.8%
a 3
 
11.5%
f 2
 
7.7%
r 2
 
7.7%
e 2
 
7.7%
p 1
 
3.8%
g 1
 
3.8%
k 1
 
3.8%
c 1
 
3.8%
t 1
 
3.8%
Other values (4) 4
15.4%
Decimal Number
ValueCountFrequency (%)
1 861
37.0%
2 809
34.8%
3 224
 
9.6%
4 152
 
6.5%
5 55
 
2.4%
0 53
 
2.3%
6 52
 
2.2%
8 47
 
2.0%
7 39
 
1.7%
9 36
 
1.5%
Other Punctuation
ValueCountFrequency (%)
, 679
70.2%
. 148
 
15.3%
/ 81
 
8.4%
: 53
 
5.5%
? 2
 
0.2%
& 2
 
0.2%
; 1
 
0.1%
' 1
 
0.1%
Math Symbol
ValueCountFrequency (%)
~ 10
50.0%
= 7
35.0%
+ 3
 
15.0%
Open Punctuation
ValueCountFrequency (%)
( 1277
99.6%
[ 5
 
0.4%
Close Punctuation
ValueCountFrequency (%)
) 1273
99.6%
] 5
 
0.4%
Space Separator
ValueCountFrequency (%)
263
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 96
100.0%
Other Symbol
ValueCountFrequency (%)
16
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 21370
76.0%
Common 6253
 
22.2%
Latin 475
 
1.7%
Han 12
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1476
 
6.9%
1062
 
5.0%
1045
 
4.9%
1022
 
4.8%
992
 
4.6%
906
 
4.2%
823
 
3.9%
804
 
3.8%
645
 
3.0%
623
 
2.9%
Other values (267) 11972
56.0%
Latin
ValueCountFrequency (%)
E 178
37.5%
V 91
19.2%
L 87
18.3%
F 14
 
2.9%
M 14
 
2.9%
D 13
 
2.7%
T 12
 
2.5%
P 9
 
1.9%
I 8
 
1.7%
m 8
 
1.7%
Other values (21) 41
 
8.6%
Common
ValueCountFrequency (%)
( 1277
20.4%
) 1273
20.4%
1 861
13.8%
2 809
12.9%
, 679
10.9%
263
 
4.2%
3 224
 
3.6%
4 152
 
2.4%
. 148
 
2.4%
- 96
 
1.5%
Other values (20) 471
 
7.5%
Han
ValueCountFrequency (%)
3
25.0%
3
25.0%
3
25.0%
3
25.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 21370
76.0%
ASCII 6712
 
23.9%
CJK Compat 16
 
0.1%
CJK 12
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1476
 
6.9%
1062
 
5.0%
1045
 
4.9%
1022
 
4.8%
992
 
4.6%
906
 
4.2%
823
 
3.9%
804
 
3.8%
645
 
3.0%
623
 
2.9%
Other values (267) 11972
56.0%
ASCII
ValueCountFrequency (%)
( 1277
19.0%
) 1273
19.0%
1 861
12.8%
2 809
12.1%
, 679
10.1%
263
 
3.9%
3 224
 
3.3%
E 178
 
2.7%
4 152
 
2.3%
. 148
 
2.2%
Other values (50) 848
12.6%
CJK Compat
ValueCountFrequency (%)
16
100.0%
CJK
ValueCountFrequency (%)
3
25.0%
3
25.0%
3
25.0%
3
25.0%

구조_코드
Real number (ℝ)

Distinct: 22
Distinct (%): 0.2%
Missing: 66
Missing (%): 0.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 22.642641
Minimum: 10
Maximum: 99
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10
5-th percentile: 11
Q1: 21
Median: 21
Q3: 21
95-th percentile: 42
Maximum: 99
Range: 89
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 8.8756717
Coefficient of variation (CV): 0.39198923
Kurtosis: 19.163278
Mean: 22.642641
Median Absolute Deviation (MAD): 0
Skewness: 3.2510356
Sum: 224932
Variance: 78.777547
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=22)
ValueCountFrequency (%)
21 7345
73.5%
11 1017
 
10.2%
42 563
 
5.6%
31 313
 
3.1%
32 286
 
2.9%
51 103
 
1.0%
19 83
 
0.8%
41 66
 
0.7%
12 33
 
0.3%
99 29
 
0.3%
Other values (12) 96
 
1.0%
(Missing) 66
 
0.7%
ValueCountFrequency (%)
10 6
 
0.1%
11 1017
 
10.2%
12 33
 
0.3%
19 83
 
0.8%
20 4
 
< 0.1%
21 7345
73.5%
22 15
 
0.1%
27 1
 
< 0.1%
29 7
 
0.1%
30 1
 
< 0.1%
ValueCountFrequency (%)
99 29
 
0.3%
74 28
 
0.3%
63 2
 
< 0.1%
51 103
 
1.0%
50 2
 
< 0.1%
43 1
 
< 0.1%
42 563
5.6%
41 66
 
0.7%
39 28
 
0.3%
33 1
 
< 0.1%

기타_구조
Text

MISSING 

Distinct: 166
Distinct (%): 12.2%
Missing: 8642
Missing (%): 86.4%
Memory size: 156.2 KiB

Length

Max length: 35
Median length: 19
Mean length: 6.3784978
Min length: 1

Characters and Unicode

Total characters: 8662
Distinct characters: 109
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 100
Unique (%): 7.4%

Sample

1st row: 철골철근콘크리트,철골조
2nd row: (벽식구조)
3rd row: 합성보
4th row: (연와조)
5th row: 조적조
ValueCountFrequency (%)
연와조 314
22.8%
철근콘크리트구조 247
17.9%
철근콘크리트조 187
13.6%
철골철근콘크리트구조 94
 
6.8%
철골콘크리트구조 44
 
3.2%
철근콘크리트 39
 
2.8%
철골철근콘크리트조 32
 
2.3%
컨테이너 30
 
2.2%
목조 30
 
2.2%
벽식구조 24
 
1.7%
Other values (139) 337
24.5%

Most occurring characters

ValueCountFrequency (%)
1318
15.2%
964
11.1%
736
8.5%
725
8.4%
724
8.4%
723
8.3%
678
7.8%
493
 
5.7%
346
 
4.0%
346
 
4.0%
Other values (99) 1609
18.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 8382
96.8%
Other Punctuation 89
 
1.0%
Close Punctuation 65
 
0.8%
Open Punctuation 65
 
0.8%
Decimal Number 27
 
0.3%
Space Separator 20
 
0.2%
Uppercase Letter 8
 
0.1%
Other Symbol 4
 
< 0.1%
Math Symbol 2
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1318
15.7%
964
11.5%
736
8.8%
725
8.6%
724
8.6%
723
8.6%
678
8.1%
493
 
5.9%
346
 
4.1%
346
 
4.1%
Other values (75) 1329
15.9%
Decimal Number
ValueCountFrequency (%)
0 6
22.2%
2 4
14.8%
7 4
14.8%
1 4
14.8%
9 3
11.1%
5 2
 
7.4%
6 2
 
7.4%
3 1
 
3.7%
4 1
 
3.7%
Other Punctuation
ValueCountFrequency (%)
, 60
67.4%
. 16
 
18.0%
/ 10
 
11.2%
: 2
 
2.2%
? 1
 
1.1%
Uppercase Letter
ValueCountFrequency (%)
C 3
37.5%
S 2
25.0%
R 1
 
12.5%
A 1
 
12.5%
L 1
 
12.5%
Close Punctuation
ValueCountFrequency (%)
) 65
100.0%
Open Punctuation
ValueCountFrequency (%)
( 65
100.0%
Space Separator
ValueCountFrequency (%)
20
100.0%
Other Symbol
ValueCountFrequency (%)
4
100.0%
Math Symbol
ValueCountFrequency (%)
+ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 8382
96.8%
Common 272
 
3.1%
Latin 8
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1318
15.7%
964
11.5%
736
8.8%
725
8.6%
724
8.6%
723
8.6%
678
8.1%
493
 
5.9%
346
 
4.1%
346
 
4.1%
Other values (75) 1329
15.9%
Common
ValueCountFrequency (%)
) 65
23.9%
( 65
23.9%
, 60
22.1%
20
 
7.4%
. 16
 
5.9%
/ 10
 
3.7%
0 6
 
2.2%
2 4
 
1.5%
7 4
 
1.5%
1 4
 
1.5%
Other values (9) 18
 
6.6%
Latin
ValueCountFrequency (%)
C 3
37.5%
S 2
25.0%
R 1
 
12.5%
A 1
 
12.5%
L 1
 
12.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 8382
96.8%
ASCII 276
 
3.2%
CJK Compat 4
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1318
15.7%
964
11.5%
736
8.8%
725
8.6%
724
8.6%
723
8.6%
678
8.1%
493
 
5.9%
346
 
4.1%
346
 
4.1%
Other values (75) 1329
15.9%
ASCII
ValueCountFrequency (%)
) 65
23.6%
( 65
23.6%
, 60
21.7%
20
 
7.2%
. 16
 
5.8%
/ 10
 
3.6%
0 6
 
2.2%
2 4
 
1.4%
7 4
 
1.4%
1 4
 
1.4%
Other values (13) 22
 
8.0%
CJK Compat
ValueCountFrequency (%)
4
100.0%

층_면적
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 7210
Distinct (%): 72.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 394.58854
Minimum: -422.99
Maximum: 1217235
Zeros: 377
Zeros (%): 3.8%
Negative: 36
Negative (%): 0.4%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -422.99
5-th percentile: 4.8
Q1: 39.97
Median: 92.495
Q3: 178.825
95-th percentile: 1191.5925
Maximum: 1217235
Range: 1217658
Interquartile range (IQR): 138.855

Descriptive statistics

Standard deviation: 12199.989
Coefficient of variation (CV): 30.918256
Kurtosis: 9900.7816
Mean: 394.58854
Median Absolute Deviation (MAD): 63.765
Skewness: 99.260124
Sum: 3945885.4
Variance: 1.4883974 × 10^8
Monotonicity: Not monotonic
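
Several of the 층_면적 statistics follow from others by simple formulas, which gives a quick consistency check. The sketch below assumes the usual definitions: CV is standard deviation over mean, variance is the squared standard deviation, and IQR is Q3 minus Q1. The inputs are the rounded values printed in this report, so agreement is only expected to within rounding.

```python
import math

# Rounded values copied from the 층_면적 statistics in this report.
mean = 394.58854
std = 12199.989
q1, q3 = 39.97, 178.825

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance as the squared standard deviation
iqr = q3 - q1          # interquartile range

print(cv, variance, iqr)
print(math.isclose(cv, 30.918256, rel_tol=1e-6))
print(math.isclose(variance, 1.4883974e8, rel_tol=1e-6))
```

A CV of about 31 means the standard deviation is roughly thirty times the mean, which fits the single very large maximum value in the extreme-value listing.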
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.0 377
 
3.8%
18.0 38
 
0.4%
10.08 29
 
0.3%
10.8 21
 
0.2%
27.0 21
 
0.2%
9.36 19
 
0.2%
10.92 17
 
0.2%
12.48 14
 
0.1%
9.0 13
 
0.1%
11.7 10
 
0.1%
Other values (7200) 9441
94.4%
ValueCountFrequency (%)
-422.99 1
< 0.1%
-334.23 1
< 0.1%
-311.01 1
< 0.1%
-109.38 1
< 0.1%
-106.15 1
< 0.1%
-91.81 1
< 0.1%
-72.5 1
< 0.1%
-71.4 1
< 0.1%
-40.77 1
< 0.1%
-28.8 1
< 0.1%
ValueCountFrequency (%)
1217235.0 1
< 0.1%
26292.115 1
< 0.1%
24515.08 1
< 0.1%
22324.05 1
< 0.1%
22075.03 1
< 0.1%
20547.58 1
< 0.1%
18837.23 1
< 0.1%
16141.21 1
< 0.1%
12200.78 1
< 0.1%
12059.35 1
< 0.1%

층_구분_코드
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20
2nd row: 20
3rd row: 20
4th row: 20
5th row: 20

Common Values

ValueCountFrequency (%)
20 7962
79.6%
10 1392
 
13.9%
30 646
 
6.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
20 7962
79.6%
10 1392
 
13.9%
30 646
 
6.5%

층_일련번호
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 17
Distinct (%): 77.3%
Missing: 9978
Missing (%): 99.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 21
Minimum: 1
Maximum: 63
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 6
Median: 18.5
Q3: 34.5
95-th percentile: 58.9
Maximum: 63
Range: 62
Interquartile range (IQR): 28.5

Descriptive statistics

Standard deviation: 18.026436
Coefficient of variation (CV): 0.8584017
Kurtosis: 0.31455485
Mean: 21
Median Absolute Deviation (MAD): 13
Skewness: 0.99242193
Sum: 462
Variance: 324.95238
Monotonicity: Not monotonic
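
With 99.8% of values missing, every statistic for 층_일련번호 rests on very few rows. The short check below assumes, as the numbers suggest, that "Distinct (%)" and the mean are computed over non-missing values only.

```python
# Figures copied from the 층_일련번호 statistics in this report.
n_rows = 10000
n_missing = 9978
n_distinct = 17
total = 462

n_present = n_rows - n_missing                        # rows with a value
mean = total / n_present                              # mean over present rows
distinct_pct = round(100 * n_distinct / n_present, 1)

print(n_present, mean, distinct_pct)
```

Only 22 rows carry a value, so the correlations reported for this variable are based on at most 22 observations and should be read with caution.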
Histogram with fixed size bins (bins=17)
ValueCountFrequency (%)
4 2
 
< 0.1%
9 2
 
< 0.1%
5 2
 
< 0.1%
6 2
 
< 0.1%
36 2
 
< 0.1%
20 1
 
< 0.1%
30 1
 
< 0.1%
1 1
 
< 0.1%
18 1
 
< 0.1%
38 1
 
< 0.1%
Other values (7) 7
 
0.1%
(Missing) 9978
99.8%
ValueCountFrequency (%)
1 1
< 0.1%
4 2
< 0.1%
5 2
< 0.1%
6 2
< 0.1%
7 1
< 0.1%
9 2
< 0.1%
18 1
< 0.1%
19 1
< 0.1%
20 1
< 0.1%
24 1
< 0.1%
ValueCountFrequency (%)
63 1
< 0.1%
60 1
< 0.1%
38 1
< 0.1%
37 1
< 0.1%
36 2
< 0.1%
30 1
< 0.1%
25 1
< 0.1%
24 1
< 0.1%
20 1
< 0.1%
19 1
< 0.1%

Interactions

[Figure: 25 pairwise interaction plots of the numeric variables; images not preserved in this text export]

Correlations

Variable | 층_번호 | 건축_구분_코드 | 구조_코드 | 층_면적 | 층_구분_코드 | 층_일련번호
층_번호 | 1.000 | 0.000 | 0.241 | 0.263 | 0.175 | 0.801
건축_구분_코드 | 0.000 | 1.000 | 0.276 | 0.013 | 0.076 | 0.537
구조_코드 | 0.241 | 0.276 | 1.000 | 0.000 | 0.154 | 0.404
층_면적 | 0.263 | 0.013 | 0.000 | 1.000 | 0.000 | NaN
층_구분_코드 | 0.175 | 0.076 | 0.154 | 0.000 | 1.000 | 0.000
층_일련번호 | 0.801 | 0.537 | 0.404 | NaN | 0.000 | 1.000

Variable | 층_번호 | 건축_구분_코드 | 구조_코드 | 층_면적 | 층_일련번호 | 층_구분_코드
층_번호 | 1.000 | -0.053 | 0.184 | 0.392 | 0.661 | 0.105
건축_구분_코드 | -0.053 | 1.000 | -0.145 | -0.124 | 0.659 | 0.056
구조_코드 | 0.184 | -0.145 | 1.000 | 0.243 | 0.061 | 0.098
층_면적 | 0.392 | -0.124 | 0.243 | 1.000 | 0.503 | 0.000
층_일련번호 | 0.661 | 0.659 | 0.061 | 0.503 | 1.000 | 0.000
층_구분_코드 | 0.105 | 0.056 | 0.098 | 0.000 | 0.000 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

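(index) | 관리_층별_개요_PK | 관리_동별_개요_PK | 층_번호 | 건축_구분_코드 | 주_용도_코드 | 기타_용도 | 구조_코드 | 기타_구조 | 층_면적 | 층_구분_코드 | 층_일련번호
85232 | 11230-9851 | 11230-3170 | 3 | 200 | 01003 | <NA> | 32 | <NA> | 40.0 | 20 | <NA>
80912 | 11215-14567 | 11215-3087 | 5 | 100 | 02003 | <NA> | 21 | <NA> | 160.36 | 20 | <NA>
98494 | 11260-5592 | 11260-1511 | 5 | 100 | 02003 | 2세대 | 21 | <NA> | 118.41 | 20 | <NA>
15900 | 11140-14946 | 11140-2689 | 3 | 100 | 04499 | <NA> | 31 | <NA> | 270.51 | 20 | <NA>
6538 | 11110-9898 | 11110-2102 | 4 | 100 | 02003 | 4세대 | 21 | <NA> | 115.2 | 20 | <NA>
22126 | 11140-21022 | 11140-3837 | 5 | 600 | 01001 | <NA> | 21 | <NA> | 0.0 | 20 | <NA>
19266 | 11140-4695 | 11140-907 | 1 | 100 | 03000 | 소매점(37.98),주차장(12.60) | 21 | <NA> | 50.58 | 20 | <NA>
249 | 11000-100022002 | 11000-100007368 | 30 | 100 | 14204 | 공동주택 | 21 | 철골철근콘크리트,철골조 | 2661.33 | 20 | <NA>
87627 | 11215-23775 | 11215-5025 | 3 | 100 | 01003 | 1가구 | 21 | <NA> | 113.43 | 20 | <NA>
694 | 11110-5263 | 11110-914 | 2 | 2000 | 02003 | 2세대 | 21 | <NA> | 124.62 | 20 | <NA>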
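
(index) | 관리_층별_개요_PK | 관리_동별_개요_PK | 층_번호 | 건축_구분_코드 | 주_용도_코드 | 기타_용도 | 구조_코드 | 기타_구조 | 층_면적 | 층_구분_코드 | 층_일련번호
30150 | 11140-100007227 | 11140-100005265 | 1 | 700 | 17100 | 인쇄소 | 51 | 목조 | 55.54 | 20 | <NA>
78656 | 11215-15989 | 11215-3346 | 3 | 100 | 02003 | <NA> | 21 | <NA> | 144.92 | 20 | <NA>
55910 | 11170-6146 | 11170-1626 | 3 | 100 | 01003 | <NA> | 21 | <NA> | 160.96 | 20 | <NA>
99630 | 11260-14703 | 11260-3460 | 1 | 700 | 03001 | 업무시설 | 21 | <NA> | 0.0 | 20 | <NA>
57611 | 11200-887 | 11200-251 | 1 | 900 | 01003 | 1가구 | 99 | 연와조 | 34.19 | 20 | <NA>
15169 | 11140-13919 | 11140-2537 | 5 | 2000 | 14202 | <NA> | 21 | <NA> | 345.61 | 20 | <NA>
28813 | 11140-16966 | 11140-3069 | 4 | 100 | 20001 | 기계실,전기실 | 21 | <NA> | 910.27 | 10 | <NA>
14342 | 11110-100011347 | 11110-100006962 | 4 | 100 | 04001 | 한정식집 | 21 | <NA> | 215.94 | 20 | <NA>
79558 | 11110-100055650 | 11110-100028413 | 1 | 600 | 04001 | 대중음식점 | 21 | 철근콘크리트조 | 123.35 | 10 | <NA>
55707 | 11200-1988 | 11200-520 | 1 | 100 | 20001 | <NA> | 39 | <NA> | 15.0 | 20 | <NA>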