Overview

Dataset statistics

Number of variables: 12
Number of observations: 10000
Missing cells: 14023
Missing cells (%): 11.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.0 MiB
Average record size in memory: 110.0 B
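
As a quick consistency check, the missing-cell percentage can be recomputed from the counts in this section, and the total can be compared with the per-variable missing counts listed in the Variables section. This is a minimal sketch; the names `missing_per_variable` and `total_cells` are illustrative and not part of the report.

```python
# Figures copied from the report; variable names are illustrative.
n_variables = 12
n_observations = 10_000
missing_cells = 14_023

# Per-variable missing counts as listed in the Variables section.
missing_per_variable = {
    "건물_명": 2532,
    "기타_용도": 2851,
    "구조_코드": 68,
    "기타_구조": 8060,
    "지붕_코드": 512,
}

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells}")
print(f"missing cells (%): {missing_pct:.1f}%")
print(f"sum of per-variable counts: {sum(missing_per_variable.values())}")
```

The per-variable counts add up to the reported total, so no other variable contributes missing cells.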

Variable types

Text: 5
Categorical: 1
Numeric: 6

Dataset

Description: 관리_동별_개요_PK, 관리_허가대장_PK, 건물_명, 주_용도_코드, 기타_용도, 구조_코드, 기타_구조, 지붕_코드, 건축_면적, 연면적, 지상_층_수, 지하_층_수
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15401/S/1/datasetView.do

Alerts

건축_면적 is highly overall correlated with 연면적 (High correlation)
연면적 is highly overall correlated with 건축_면적 and 1 other field (High correlation)
지상_층_수 is highly overall correlated with 연면적 (High correlation)
건물_명 has 2532 (25.3%) missing values (Missing)
기타_용도 has 2851 (28.5%) missing values (Missing)
기타_구조 has 8060 (80.6%) missing values (Missing)
지붕_코드 has 512 (5.1%) missing values (Missing)
건축_면적 is highly skewed (γ1 = 26.8609218) (Skewed)
연면적 is highly skewed (γ1 = 32.12596318) (Skewed)
관리_동별_개요_PK has unique values (Unique)
건축_면적 has 1174 (11.7%) zeros (Zeros)
연면적 has 273 (2.7%) zeros (Zeros)
지상_층_수 has 341 (3.4%) zeros (Zeros)
지하_층_수 has 4820 (48.2%) zeros (Zeros)

Reproduction

Analysis started: 2024-05-03 21:59:07.072215
Analysis finished: 2024-05-03 21:59:27.319576
Duration: 20.25 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
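
The reported duration is the difference between the two timestamps. A small sketch using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-05-03 21:59:07.072215")
finished = datetime.fromisoformat("2024-05-03 21:59:27.319576")

duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```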

Variables

관리_동별_개요_PK
Text

UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 10
Mean length: 10.9173
Min length: 7

Characters and Unicode

Total characters: 109173
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
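
For illustration, general categories like the ones counted here can be derived with Python's standard `unicodedata` module. The sketch below is not the profiler's own code; it tallies categories for the five sample identifiers listed for this variable (`Nd` is Decimal Number, `Pd` is Dash Punctuation).

```python
import unicodedata
from collections import Counter

# First five sample values of the identifier column, copied from the report.
samples = [
    "11380-100006707",
    "11650-4064",
    "11380-565",
    "11380-2868",
    "11380-10372",
]

# Count the Unicode general category of every character in every value.
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)
total_characters = sum(len(value) for value in samples)

print(category_counts)
print(total_characters)
```

Only two categories appear in these samples, which is consistent with the two distinct categories reported for the full column.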

Unique

Unique: 10000
Unique (%): 100.0%

Sample

1st row: 11380-100006707
2nd row: 11650-4064
3rd row: 11380-565
4th row: 11380-2868
5th row: 11380-10372

Value | Count | Frequency (%)
11380-100006707 | 1 | < 0.1%
11260-100009752 | 1 | < 0.1%
11290-3433 | 1 | < 0.1%
11500-7637 | 1 | < 0.1%
11380-9034 | 1 | < 0.1%
11500-726 | 1 | < 0.1%
11170-2208 | 1 | < 0.1%
11590-3658 | 1 | < 0.1%
11380-10114 | 1 | < 0.1%
11560-3517 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Most occurring characters

ValueCountFrequency (%)
1 29766
27.3%
0 20435
18.7%
- 10000
 
9.2%
5 7988
 
7.3%
2 7772
 
7.1%
3 7503
 
6.9%
4 6830
 
6.3%
6 5734
 
5.3%
8 4592
 
4.2%
7 4389
 
4.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 99173
90.8%
Dash Punctuation 10000
 
9.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 29766
30.0%
0 20435
20.6%
5 7988
 
8.1%
2 7772
 
7.8%
3 7503
 
7.6%
4 6830
 
6.9%
6 5734
 
5.8%
8 4592
 
4.6%
7 4389
 
4.4%
9 4164
 
4.2%
Dash Punctuation
ValueCountFrequency (%)
- 10000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 109173
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 29766
27.3%
0 20435
18.7%
- 10000
 
9.2%
5 7988
 
7.3%
2 7772
 
7.1%
3 7503
 
6.9%
4 6830
 
6.3%
6 5734
 
5.3%
8 4592
 
4.2%
7 4389
 
4.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 109173
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 29766
27.3%
0 20435
18.7%
- 10000
 
9.2%
5 7988
 
7.3%
2 7772
 
7.1%
3 7503
 
6.9%
4 6830
 
6.3%
6 5734
 
5.3%
8 4592
 
4.2%
7 4389
 
4.0%

관리_허가대장_PK
Text

Distinct: 9930
Distinct (%): 99.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 10
Mean length: 10.7993
Min length: 7

Characters and Unicode

Total characters: 107993
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9868
Unique (%): 98.7%

Sample

1st row: 11380-100005519
2nd row: 11650-4013
3rd row: 11380-560
4th row: 11380-2868
5th row: 11380-10220

Value | Count | Frequency (%)
11290-2962 | 6 | 0.1%
11410-2919 | 4 | < 0.1%
11305-3610 | 3 | < 0.1%
11470-5030 | 3 | < 0.1%
11215-5715 | 2 | < 0.1%
11650-100009599 | 2 | < 0.1%
11545-532 | 2 | < 0.1%
11530-3521 | 2 | < 0.1%
11350-100016834 | 2 | < 0.1%
11305-3959 | 2 | < 0.1%
Other values (9920) | 9972 | 99.7%

Most occurring characters

ValueCountFrequency (%)
1 29493
27.3%
0 19427
18.0%
- 10000
 
9.3%
2 7959
 
7.4%
5 7942
 
7.4%
3 7421
 
6.9%
4 7121
 
6.6%
6 5738
 
5.3%
8 4494
 
4.2%
7 4350
 
4.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 97993
90.7%
Dash Punctuation 10000
 
9.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 29493
30.1%
0 19427
19.8%
2 7959
 
8.1%
5 7942
 
8.1%
3 7421
 
7.6%
4 7121
 
7.3%
6 5738
 
5.9%
8 4494
 
4.6%
7 4350
 
4.4%
9 4048
 
4.1%
Dash Punctuation
ValueCountFrequency (%)
- 10000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 107993
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 29493
27.3%
0 19427
18.0%
- 10000
 
9.3%
2 7959
 
7.4%
5 7942
 
7.4%
3 7421
 
6.9%
4 7121
 
6.6%
6 5738
 
5.3%
8 4494
 
4.2%
7 4350
 
4.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 107993
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 29493
27.3%
0 19427
18.0%
- 10000
 
9.3%
2 7959
 
7.4%
5 7942
 
7.4%
3 7421
 
6.9%
4 7121
 
6.6%
6 5738
 
5.3%
8 4494
 
4.2%
7 4350
 
4.0%

건물_명
Text

MISSING 

Distinct: 4147
Distinct (%): 55.5%
Missing: 2532
Missing (%): 25.3%
Memory size: 156.2 KiB

Length

Max length: 26
Median length: 21
Mean length: 5.2609802
Min length: 1

Characters and Unicode

Total characters: 39289
Distinct characters: 612
Distinct categories: 13
Distinct scripts: 4
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3780
Unique (%): 50.6%

Sample

1st row: 삼익하이빌
2nd row: 방배동 다세대 주택
3rd row: 박해룡주택
4th row: 아이언아파트
5th row: 별궁도시형생활주택
ValueCountFrequency (%)
1 1133
 
12.3%
532
 
5.8%
주건축물제1동 334
 
3.6%
다세대주택 271
 
2.9%
1동 237
 
2.6%
주택 137
 
1.5%
다가구주택 136
 
1.5%
신축공사 96
 
1.0%
근린생활시설 89
 
1.0%
근생 78
 
0.8%
Other values (3985) 6153
66.9%

Most occurring characters

ValueCountFrequency (%)
3013
 
7.7%
1 2317
 
5.9%
2142
 
5.5%
1729
 
4.4%
1702
 
4.3%
1068
 
2.7%
1041
 
2.6%
828
 
2.1%
682
 
1.7%
669
 
1.7%
Other values (602) 24098
61.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 30792
78.4%
Decimal Number 4756
 
12.1%
Space Separator 1729
 
4.4%
Other Punctuation 643
 
1.6%
Uppercase Letter 539
 
1.4%
Dash Punctuation 484
 
1.2%
Close Punctuation 114
 
0.3%
Open Punctuation 113
 
0.3%
Lowercase Letter 104
 
0.3%
Letter Number 7
 
< 0.1%
Other values (3) 8
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3013
 
9.8%
2142
 
7.0%
1702
 
5.5%
1068
 
3.5%
1041
 
3.4%
828
 
2.7%
682
 
2.2%
669
 
2.2%
666
 
2.2%
618
 
2.0%
Other values (526) 18363
59.6%
Uppercase Letter
ValueCountFrequency (%)
A 133
24.7%
B 86
16.0%
C 32
 
5.9%
S 31
 
5.8%
E 27
 
5.0%
O 22
 
4.1%
I 20
 
3.7%
T 20
 
3.7%
L 20
 
3.7%
D 17
 
3.2%
Other values (16) 131
24.3%
Lowercase Letter
ValueCountFrequency (%)
a 25
24.0%
e 15
14.4%
l 12
11.5%
o 8
 
7.7%
r 6
 
5.8%
n 4
 
3.8%
y 4
 
3.8%
i 4
 
3.8%
b 3
 
2.9%
w 3
 
2.9%
Other values (12) 20
19.2%
Decimal Number
ValueCountFrequency (%)
1 2317
48.7%
0 581
 
12.2%
2 457
 
9.6%
3 302
 
6.3%
4 234
 
4.9%
5 219
 
4.6%
6 203
 
4.3%
7 156
 
3.3%
8 148
 
3.1%
9 139
 
2.9%
Other Punctuation
ValueCountFrequency (%)
. 514
79.9%
, 63
 
9.8%
/ 37
 
5.8%
* 16
 
2.5%
' 6
 
0.9%
& 5
 
0.8%
; 1
 
0.2%
? 1
 
0.2%
Letter Number
ValueCountFrequency (%)
3
42.9%
3
42.9%
1
 
14.3%
Space Separator
ValueCountFrequency (%)
1729
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 484
100.0%
Close Punctuation
ValueCountFrequency (%)
) 114
100.0%
Open Punctuation
ValueCountFrequency (%)
( 113
100.0%
Math Symbol
ValueCountFrequency (%)
+ 5
100.0%
Other Symbol
ValueCountFrequency (%)
2
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 30791
78.4%
Common 7847
 
20.0%
Latin 650
 
1.7%
Han 1
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3013
 
9.8%
2142
 
7.0%
1702
 
5.5%
1068
 
3.5%
1041
 
3.4%
828
 
2.7%
682
 
2.2%
669
 
2.2%
666
 
2.2%
618
 
2.0%
Other values (525) 18362
59.6%
Latin
ValueCountFrequency (%)
A 133
20.5%
B 86
 
13.2%
C 32
 
4.9%
S 31
 
4.8%
E 27
 
4.2%
a 25
 
3.8%
O 22
 
3.4%
I 20
 
3.1%
T 20
 
3.1%
L 20
 
3.1%
Other values (41) 234
36.0%
Common
ValueCountFrequency (%)
1 2317
29.5%
1729
22.0%
0 581
 
7.4%
. 514
 
6.6%
- 484
 
6.2%
2 457
 
5.8%
3 302
 
3.8%
4 234
 
3.0%
5 219
 
2.8%
6 203
 
2.6%
Other values (15) 807
 
10.3%
Han
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 30779
78.3%
ASCII 8488
 
21.6%
Compat Jamo 12
 
< 0.1%
Number Forms 7
 
< 0.1%
Geometric Shapes 2
 
< 0.1%
CJK 1
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
3013
 
9.8%
2142
 
7.0%
1702
 
5.5%
1068
 
3.5%
1041
 
3.4%
828
 
2.7%
682
 
2.2%
669
 
2.2%
666
 
2.2%
618
 
2.0%
Other values (521) 18350
59.6%
ASCII
ValueCountFrequency (%)
1 2317
27.3%
1729
20.4%
0 581
 
6.8%
. 514
 
6.1%
- 484
 
5.7%
2 457
 
5.4%
3 302
 
3.6%
4 234
 
2.8%
5 219
 
2.6%
6 203
 
2.4%
Other values (62) 1448
17.1%
Compat Jamo
ValueCountFrequency (%)
6
50.0%
4
33.3%
1
 
8.3%
1
 
8.3%
Number Forms
ValueCountFrequency (%)
3
42.9%
3
42.9%
1
 
14.3%
Geometric Shapes
ValueCountFrequency (%)
2
100.0%
CJK
ValueCountFrequency (%)
1
100.0%

주_용도_코드
Categorical

Distinct: 32
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 5
Median length: 5
Mean length: 4.9837
Min length: 4

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: 02000
2nd row: 02000
3rd row: 01000
4th row: <NA>
5th row: 02000

Common Values

Value | Count | Frequency (%)
01000 | 3207 | 32.1%
02000 | 2683 | 26.8%
04000 | 1645 | 16.4%
03000 | 931 | 9.3%
14000 | 357 | 3.6%
<NA> | 163 | 1.6%
28000 | 139 | 1.4%
Z8000 | 135 | 1.4%
17000 | 109 | 1.1%
10000 | 81 | 0.8%
Other values (22) | 550 | 5.5%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
01000 | 3207 | 32.1%
02000 | 2683 | 26.8%
04000 | 1645 | 16.4%
03000 | 931 | 9.3%
14000 | 357 | 3.6%
na | 163 | 1.6%
28000 | 139 | 1.4%
z8000 | 135 | 1.4%
17000 | 109 | 1.1%
10000 | 81 | 0.8%
Other values (22) | 550 | 5.5%

기타_용도
Text

MISSING 

Distinct: 1409
Distinct (%): 19.7%
Missing: 2851
Missing (%): 28.5%
Memory size: 156.2 KiB

Length

Max length: 43
Median length: 35
Mean length: 6.2576584
Min length: 1

Characters and Unicode

Total characters: 44736
Distinct characters: 269
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1058
Unique (%): 14.8%

Sample

1st row: 다세대주택
2nd row: 근린생활시설및아파트
3rd row: 다세대주택 및 도시형생활주택
4th row: 제2종근린생활시설
5th row: 사무실, 근린생활시설, 은행
ValueCountFrequency (%)
다세대주택 1504
18.7%
다가구주택 966
 
12.0%
주택 588
 
7.3%
근린생활시설 456
 
5.7%
단독주택 312
 
3.9%
242
 
3.0%
다세대 163
 
2.0%
다가구 160
 
2.0%
사무소 159
 
2.0%
제2종근린생활시설 131
 
1.6%
Other values (1131) 3361
41.8%

Most occurring characters

ValueCountFrequency (%)
4426
 
9.9%
4373
 
9.8%
3441
 
7.7%
1965
 
4.4%
1960
 
4.4%
1920
 
4.3%
1832
 
4.1%
1684
 
3.8%
1602
 
3.6%
1551
 
3.5%
Other values (259) 19982
44.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 40702
91.0%
Other Punctuation 1261
 
2.8%
Decimal Number 901
 
2.0%
Space Separator 893
 
2.0%
Open Punctuation 470
 
1.1%
Close Punctuation 469
 
1.0%
Dash Punctuation 15
 
< 0.1%
Uppercase Letter 12
 
< 0.1%
Math Symbol 11
 
< 0.1%
Modifier Symbol 2
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4426
 
10.9%
4373
 
10.7%
3441
 
8.5%
1965
 
4.8%
1960
 
4.8%
1920
 
4.7%
1832
 
4.5%
1684
 
4.1%
1602
 
3.9%
1551
 
3.8%
Other values (229) 15948
39.2%
Decimal Number
ValueCountFrequency (%)
2 379
42.1%
1 352
39.1%
3 37
 
4.1%
4 37
 
4.1%
5 24
 
2.7%
6 21
 
2.3%
8 20
 
2.2%
7 16
 
1.8%
0 8
 
0.9%
9 7
 
0.8%
Other Punctuation
ValueCountFrequency (%)
, 1059
84.0%
/ 121
 
9.6%
. 64
 
5.1%
& 10
 
0.8%
: 5
 
0.4%
2
 
0.2%
Uppercase Letter
ValueCountFrequency (%)
G 3
25.0%
N 3
25.0%
C 3
25.0%
T 1
 
8.3%
P 1
 
8.3%
A 1
 
8.3%
Math Symbol
ValueCountFrequency (%)
+ 7
63.6%
< 2
 
18.2%
> 2
 
18.2%
Space Separator
ValueCountFrequency (%)
893
100.0%
Open Punctuation
ValueCountFrequency (%)
( 470
100.0%
Close Punctuation
ValueCountFrequency (%)
) 469
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 15
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 40702
91.0%
Common 4022
 
9.0%
Latin 12
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4426
 
10.9%
4373
 
10.7%
3441
 
8.5%
1965
 
4.8%
1960
 
4.8%
1920
 
4.7%
1832
 
4.5%
1684
 
4.1%
1602
 
3.9%
1551
 
3.8%
Other values (229) 15948
39.2%
Common
ValueCountFrequency (%)
, 1059
26.3%
893
22.2%
( 470
11.7%
) 469
11.7%
2 379
 
9.4%
1 352
 
8.8%
/ 121
 
3.0%
. 64
 
1.6%
3 37
 
0.9%
4 37
 
0.9%
Other values (14) 141
 
3.5%
Latin
ValueCountFrequency (%)
G 3
25.0%
N 3
25.0%
C 3
25.0%
T 1
 
8.3%
P 1
 
8.3%
A 1
 
8.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 40702
91.0%
ASCII 4032
 
9.0%
None 2
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
4426
 
10.9%
4373
 
10.7%
3441
 
8.5%
1965
 
4.8%
1960
 
4.8%
1920
 
4.7%
1832
 
4.5%
1684
 
4.1%
1602
 
3.9%
1551
 
3.8%
Other values (229) 15948
39.2%
ASCII
ValueCountFrequency (%)
, 1059
26.3%
893
22.1%
( 470
11.7%
) 469
11.6%
2 379
 
9.4%
1 352
 
8.7%
/ 121
 
3.0%
. 64
 
1.6%
3 37
 
0.9%
4 37
 
0.9%
Other values (19) 151
 
3.7%
None
ValueCountFrequency (%)
2
100.0%

구조_코드
Real number (ℝ)

Distinct: 22
Distinct (%): 0.2%
Missing: 68
Missing (%): 0.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 21.455497
Minimum: 10
Maximum: 99
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10
5-th percentile: 11
Q1: 21
median: 21
Q3: 21
95-th percentile: 39
Maximum: 99
Range: 89
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 9.4042206
Coefficient of variation (CV): 0.43831287
Kurtosis: 19.801997
Mean: 21.455497
Median Absolute Deviation (MAD): 0
Skewness: 3.2632339
Sum: 213096
Variance: 88.439365
Monotonicity: Not monotonic
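
Several of these figures are derived from the others: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, the range is maximum minus minimum, and the IQR is Q3 minus Q1. A short sketch that rechecks them from the values reported for 구조_코드 (variable names are illustrative):

```python
# Values copied from the statistics reported for 구조_코드.
mean = 21.455497
std = 9.4042206
minimum, maximum = 10, 99
q1, q3 = 21, 21

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range

print(f"CV: {cv:.8f}")
print(f"Variance: {variance:.6f}")
print(f"Range: {value_range}, IQR: {iqr}")
```

An IQR of 0 alongside a range of 89 reflects how concentrated this code is on a single value (21 accounts for 64.1% of rows).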
[Figure: Histogram with fixed size bins (bins=22)]

Value | Count | Frequency (%)
21 | 6409 | 64.1%
11 | 1896 | 19.0%
32 | 619 | 6.2%
31 | 274 | 2.7%
51 | 169 | 1.7%
42 | 142 | 1.4%
19 | 101 | 1.0%
39 | 93 | 0.9%
12 | 82 | 0.8%
99 | 36 | 0.4%
Other values (12) | 111 | 1.1%
(Missing) | 68 | 0.7%

Value | Count | Frequency (%)
10 | 27 | 0.3%
11 | 1896 | 19.0%
12 | 82 | 0.8%
19 | 101 | 1.0%
20 | 1 | < 0.1%
21 | 6409 | 64.1%
22 | 7 | 0.1%
29 | 1 | < 0.1%
30 | 4 | < 0.1%
31 | 274 | 2.7%

Value | Count | Frequency (%)
99 | 36 | 0.4%
74 | 34 | 0.3%
63 | 1 | < 0.1%
61 | 1 | < 0.1%
51 | 169 | 1.7%
50 | 4 | < 0.1%
49 | 2 | < 0.1%
42 | 142 | 1.4%
41 | 23 | 0.2%
39 | 93 | 0.9%

기타_구조
Text

MISSING 

Distinct: 340
Distinct (%): 17.5%
Missing: 8060
Missing (%): 80.6%
Memory size: 156.2 KiB

Length

Max length: 30
Median length: 26
Mean length: 5.9876289
Min length: 1

Characters and Unicode

Total characters: 11616
Distinct characters: 142
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 248
Unique (%): 12.8%

Sample

1st row: 철근콘크리트구조
2nd row: 철골철근콘크리트조
3rd row: 세멘벽돌조
4th row: 철근콘크리트조
5th row: 철근콘크리트구조

Value | Count | Frequency (%)
연와조 | 599 | 28.4%
철근콘크리트구조 | 256 | 12.1%
철근콘크리트조 | 226 | 10.7%
목조 | 71 | 3.4%
경량철골조 | 69 | 3.3%
컨테이너 | 59 | 2.8%
철근콘크리트 | 52 | 2.5%
조적조 | 50 | 2.4%
세멘벽돌조 | 48 | 2.3%
경량철골구조 | 34 | 1.6%
Other values (267) | 643 | 30.5%

Most occurring characters

ValueCountFrequency (%)
2081
17.9%
970
 
8.4%
712
 
6.1%
711
 
6.1%
672
 
5.8%
670
 
5.8%
656
 
5.6%
655
 
5.6%
651
 
5.6%
503
 
4.3%
Other values (132) 3335
28.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 10952
94.3%
Other Punctuation 297
 
2.6%
Space Separator 167
 
1.4%
Close Punctuation 51
 
0.4%
Open Punctuation 51
 
0.4%
Decimal Number 50
 
0.4%
Uppercase Letter 33
 
0.3%
Lowercase Letter 6
 
0.1%
Math Symbol 4
 
< 0.1%
Other Symbol 3
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2081
19.0%
970
 
8.9%
712
 
6.5%
711
 
6.5%
672
 
6.1%
670
 
6.1%
656
 
6.0%
655
 
6.0%
651
 
5.9%
503
 
4.6%
Other values (99) 2671
24.4%
Decimal Number
ValueCountFrequency (%)
2 14
28.0%
3 10
20.0%
1 9
18.0%
6 4
 
8.0%
0 3
 
6.0%
8 3
 
6.0%
7 2
 
4.0%
5 2
 
4.0%
4 2
 
4.0%
9 1
 
2.0%
Uppercase Letter
ValueCountFrequency (%)
R 11
33.3%
C 10
30.3%
A 4
 
12.1%
L 4
 
12.1%
X 1
 
3.0%
S 1
 
3.0%
F 1
 
3.0%
P 1
 
3.0%
Lowercase Letter
ValueCountFrequency (%)
e 2
33.3%
f 1
16.7%
b 1
16.7%
p 1
16.7%
r 1
16.7%
Other Punctuation
ValueCountFrequency (%)
, 238
80.1%
/ 34
 
11.4%
. 24
 
8.1%
* 1
 
0.3%
Space Separator
ValueCountFrequency (%)
167
100.0%
Close Punctuation
ValueCountFrequency (%)
) 51
100.0%
Open Punctuation
ValueCountFrequency (%)
( 51
100.0%
Math Symbol
ValueCountFrequency (%)
+ 4
100.0%
Other Symbol
ValueCountFrequency (%)
3
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 10952
94.3%
Common 625
 
5.4%
Latin 39
 
0.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2081
19.0%
970
 
8.9%
712
 
6.5%
711
 
6.5%
672
 
6.1%
670
 
6.1%
656
 
6.0%
655
 
6.0%
651
 
5.9%
503
 
4.6%
Other values (99) 2671
24.4%
Common
ValueCountFrequency (%)
, 238
38.1%
167
26.7%
) 51
 
8.2%
( 51
 
8.2%
/ 34
 
5.4%
. 24
 
3.8%
2 14
 
2.2%
3 10
 
1.6%
1 9
 
1.4%
6 4
 
0.6%
Other values (10) 23
 
3.7%
Latin
ValueCountFrequency (%)
R 11
28.2%
C 10
25.6%
A 4
 
10.3%
L 4
 
10.3%
e 2
 
5.1%
f 1
 
2.6%
X 1
 
2.6%
S 1
 
2.6%
b 1
 
2.6%
p 1
 
2.6%
Other values (3) 3
 
7.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 10952
94.3%
ASCII 661
 
5.7%
CJK Compat 3
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2081
19.0%
970
 
8.9%
712
 
6.5%
711
 
6.5%
672
 
6.1%
670
 
6.1%
656
 
6.0%
655
 
6.0%
651
 
5.9%
503
 
4.6%
Other values (99) 2671
24.4%
ASCII
ValueCountFrequency (%)
, 238
36.0%
167
25.3%
) 51
 
7.7%
( 51
 
7.7%
/ 34
 
5.1%
. 24
 
3.6%
2 14
 
2.1%
R 11
 
1.7%
C 10
 
1.5%
3 10
 
1.5%
Other values (22) 51
 
7.7%
CJK Compat
ValueCountFrequency (%)
3
100.0%

지붕_코드
Real number (ℝ)

MISSING 

Distinct: 8
Distinct (%): 0.1%
Missing: 512
Missing (%): 5.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 24.484191
Minimum: 10
Maximum: 90
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10
5-th percentile: 10
Q1: 10
median: 10
Q3: 10
95-th percentile: 90
Maximum: 90
Range: 80
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 30.048876
Coefficient of variation (CV): 1.2272767
Kurtosis: 0.94231773
Mean: 24.484191
Median Absolute Deviation (MAD): 0
Skewness: 1.7028258
Sum: 232306
Variance: 902.93492
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=8)]
Value | Count | Frequency (%)
10 | 7232 | 72.3%
90 | 1637 | 16.4%
20 | 492 | 4.9%
30 | 70 | 0.7%
11 | 50 | 0.5%
12 | 4 | < 0.1%
39 | 2 | < 0.1%
40 | 1 | < 0.1%
(Missing) | 512 | 5.1%

Value | Count | Frequency (%)
10 | 7232 | 72.3%
11 | 50 | 0.5%
12 | 4 | < 0.1%
20 | 492 | 4.9%
30 | 70 | 0.7%
39 | 2 | < 0.1%
40 | 1 | < 0.1%
90 | 1637 | 16.4%

Value | Count | Frequency (%)
90 | 1637 | 16.4%
40 | 1 | < 0.1%
39 | 2 | < 0.1%
30 | 70 | 0.7%
20 | 492 | 4.9%
12 | 4 | < 0.1%
11 | 50 | 0.5%
10 | 7232 | 72.3%

건축_면적
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 7046
Distinct (%): 70.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 187.40719
Minimum: -1110.17
Maximum: 36711.68
Zeros: 1174
Zeros (%): 11.7%
Negative: 10
Negative (%): 0.1%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -1110.17
5-th percentile: 0
Q1: 53.905
median: 98.7
Q3: 156.72
95-th percentile: 490.034
Maximum: 36711.68
Range: 37821.85
Interquartile range (IQR): 102.815

Descriptive statistics

Standard deviation: 748.61739
Coefficient of variation (CV): 3.9946033
Kurtosis: 1038.4738
Mean: 187.40719
Median Absolute Deviation (MAD): 50.48
Skewness: 26.860922
Sum: 1874071.9
Variance: 560427.99
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
0.0 | 1174 | 11.7%
18.0 | 63 | 0.6%
27.0 | 32 | 0.3%
9.0 | 11 | 0.1%
99.0 | 7 | 0.1%
30.0 | 7 | 0.1%
115.2 | 6 | 0.1%
98.7 | 6 | 0.1%
15.0 | 6 | 0.1%
77.69 | 6 | 0.1%
Other values (7036) | 8682 | 86.8%

Value | Count | Frequency (%)
-1110.17 | 1 | < 0.1%
-344.44 | 1 | < 0.1%
-19.5 | 1 | < 0.1%
-13.32 | 1 | < 0.1%
-12.74 | 1 | < 0.1%
-10.85 | 1 | < 0.1%
-5.72 | 1 | < 0.1%
-4.86 | 1 | < 0.1%
-4.7 | 1 | < 0.1%
-2.16 | 1 | < 0.1%

Value | Count | Frequency (%)
36711.68 | 1 | < 0.1%
29490.0 | 1 | < 0.1%
27892.98 | 1 | < 0.1%
14386.25 | 1 | < 0.1%
13746.21 | 1 | < 0.1%
10150.58 | 1 | < 0.1%
10007.42 | 1 | < 0.1%
10003.12 | 1 | < 0.1%
9237.39 | 1 | < 0.1%
9162.79 | 2 | < 0.1%

연면적
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 8837
Distinct (%): 88.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1583.1745
Minimum: -344.95
Maximum: 736474
Zeros: 273
Zeros (%): 2.7%
Negative: 9
Negative (%): 0.1%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -344.95
5-th percentile: 13.957
Q1: 131.865
median: 327.615
Q3: 642.11
95-th percentile: 3226.844
Maximum: 736474
Range: 736818.95
Interquartile range (IQR): 510.245

Descriptive statistics

Standard deviation: 11954.394
Coefficient of variation (CV): 7.5509012
Kurtosis: 1610.1478
Mean: 1583.1745
Median Absolute Deviation (MAD): 242.71
Skewness: 32.125963
Sum: 15831745
Variance: 1.4290754 × 10^8
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
0.0 | 273 | 2.7%
18.0 | 50 | 0.5%
27.0 | 27 | 0.3%
36.0 | 18 | 0.2%
9.0 | 12 | 0.1%
632.63 | 7 | 0.1%
30.0 | 7 | 0.1%
9.9 | 6 | 0.1%
54.0 | 6 | 0.1%
63.0 | 6 | 0.1%
Other values (8827) | 9588 | 95.9%

Value | Count | Frequency (%)
-344.95 | 1 | < 0.1%
-61.31 | 1 | < 0.1%
-43.94 | 1 | < 0.1%
-37.94 | 1 | < 0.1%
-33.42 | 1 | < 0.1%
-32.4 | 1 | < 0.1%
-17.49 | 1 | < 0.1%
-4.94 | 1 | < 0.1%
-0.17 | 1 | < 0.1%
0.0 | 273 | 2.7%

Value | Count | Frequency (%)
736474.0 | 1 | < 0.1%
385944.25 | 1 | < 0.1%
260397.0 | 1 | < 0.1%
203903.56 | 1 | < 0.1%
187697.21 | 2 | < 0.1%
174541.83 | 1 | < 0.1%
170671.0 | 1 | < 0.1%
168050.01 | 1 | < 0.1%
165707.58 | 1 | < 0.1%
148969.29 | 2 | < 0.1%

지상_층_수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 39
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.8285
Minimum: 0
Maximum: 69
Zeros: 341
Zeros (%): 3.4%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
median: 4
Q3: 5
95-th percentile: 7
Maximum: 69
Range: 69
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 3.0905409
Coefficient of variation (CV): 0.80724589
Kurtosis: 60.584008
Mean: 3.8285
Median Absolute Deviation (MAD): 1
Skewness: 5.4971396
Sum: 38285
Variance: 9.5514429
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=39)]
Value | Count | Frequency (%)
4 | 2313 | 23.1%
3 | 1799 | 18.0%
5 | 1785 | 17.8%
2 | 1604 | 16.0%
1 | 1033 | 10.3%
6 | 436 | 4.4%
0 | 341 | 3.4%
7 | 243 | 2.4%
8 | 96 | 1.0%
15 | 60 | 0.6%
Other values (29) | 290 | 2.9%

Value | Count | Frequency (%)
0 | 341 | 3.4%
1 | 1033 | 10.3%
2 | 1604 | 16.0%
3 | 1799 | 18.0%
4 | 2313 | 23.1%
5 | 1785 | 17.8%
6 | 436 | 4.4%
7 | 243 | 2.4%
8 | 96 | 1.0%
9 | 47 | 0.5%

Value | Count | Frequency (%)
69 | 1 | < 0.1%
58 | 1 | < 0.1%
49 | 1 | < 0.1%
41 | 1 | < 0.1%
39 | 4 | < 0.1%
37 | 1 | < 0.1%
36 | 3 | < 0.1%
33 | 1 | < 0.1%
32 | 3 | < 0.1%
30 | 6 | 0.1%

지하_층_수
Real number (ℝ)

ZEROS 

Distinct: 10
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.6428
Minimum: 0
Maximum: 9
Zeros: 4820
Zeros (%): 48.2%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 1
95-th percentile: 2
Maximum: 9
Range: 9
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.86132592
Coefficient of variation (CV): 1.3399594
Kurtosis: 16.725061
Mean: 0.6428
Median Absolute Deviation (MAD): 1
Skewness: 3.0980963
Sum: 6428
Variance: 0.74188235
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=10)]
Value | Count | Frequency (%)
0 | 4820 | 48.2%
1 | 4578 | 45.8%
2 | 308 | 3.1%
3 | 125 | 1.2%
4 | 69 | 0.7%
5 | 52 | 0.5%
6 | 24 | 0.2%
7 | 16 | 0.2%
8 | 5 | 0.1%
9 | 3 | < 0.1%

Value | Count | Frequency (%)
0 | 4820 | 48.2%
1 | 4578 | 45.8%
2 | 308 | 3.1%
3 | 125 | 1.2%
4 | 69 | 0.7%
5 | 52 | 0.5%
6 | 24 | 0.2%
7 | 16 | 0.2%
8 | 5 | 0.1%
9 | 3 | < 0.1%

Value | Count | Frequency (%)
9 | 3 | < 0.1%
8 | 5 | 0.1%
7 | 16 | 0.2%
6 | 24 | 0.2%
5 | 52 | 0.5%
4 | 69 | 0.7%
3 | 125 | 1.2%
2 | 308 | 3.1%
1 | 4578 | 45.8%
0 | 4820 | 48.2%

Interactions

[Figure: 36 interaction plots]

Correlations

[Figure: correlation heatmap]

Variable | 주_용도_코드 | 구조_코드 | 지붕_코드 | 건축_면적 | 연면적 | 지상_층_수 | 지하_층_수
주_용도_코드 | 1.000 | 0.659 | 0.365 | 0.407 | 0.236 | 0.447 | 0.449
구조_코드 | 0.659 | 1.000 | 0.530 | 0.183 | 0.098 | 0.227 | 0.318
지붕_코드 | 0.365 | 0.530 | 1.000 | 0.000 | 0.000 | 0.092 | 0.161
건축_면적 | 0.407 | 0.183 | 0.000 | 1.000 | 0.776 | 0.677 | 0.308
연면적 | 0.236 | 0.098 | 0.000 | 0.776 | 1.000 | 0.774 | 0.330
지상_층_수 | 0.447 | 0.227 | 0.092 | 0.677 | 0.774 | 1.000 | 0.591
지하_층_수 | 0.449 | 0.318 | 0.161 | 0.308 | 0.330 | 0.591 | 1.000

[Figure: correlation heatmap]

Variable | 구조_코드 | 지붕_코드 | 건축_면적 | 연면적 | 지상_층_수 | 지하_층_수 | 주_용도_코드
구조_코드 | 1.000 | 0.116 | 0.192 | 0.151 | 0.162 | -0.150 | 0.324
지붕_코드 | 0.116 | 1.000 | -0.244 | -0.327 | -0.380 | -0.131 | 0.166
건축_면적 | 0.192 | -0.244 | 1.000 | 0.861 | 0.493 | 0.176 | 0.170
연면적 | 0.151 | -0.327 | 0.861 | 1.000 | 0.635 | 0.304 | 0.106
지상_층_수 | 0.162 | -0.380 | 0.493 | 0.635 | 1.000 | 0.203 | 0.183
지하_층_수 | -0.150 | -0.131 | 0.176 | 0.304 | 0.203 | 1.000 | 0.174
주_용도_코드 | 0.324 | 0.166 | 0.170 | 0.106 | 0.183 | 0.174 | 1.000
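
The high-correlation alerts in the Overview appear to follow from the second matrix. As a sketch, the snippet below lists the variable pairs whose absolute coefficient reaches a cut-off; the value 0.5 for `THRESHOLD` is an assumption chosen for illustration, not a figure stated in this report.

```python
# Second correlation matrix, copied from the report (row and column order as shown).
labels = ["구조_코드", "지붕_코드", "건축_면적", "연면적",
          "지상_층_수", "지하_층_수", "주_용도_코드"]
matrix = [
    [1.000, 0.116, 0.192, 0.151, 0.162, -0.150, 0.324],
    [0.116, 1.000, -0.244, -0.327, -0.380, -0.131, 0.166],
    [0.192, -0.244, 1.000, 0.861, 0.493, 0.176, 0.170],
    [0.151, -0.327, 0.861, 1.000, 0.635, 0.304, 0.106],
    [0.162, -0.380, 0.493, 0.635, 1.000, 0.203, 0.183],
    [-0.150, -0.131, 0.176, 0.304, 0.203, 1.000, 0.174],
    [0.324, 0.166, 0.170, 0.106, 0.183, 0.174, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off for a "high" correlation

# Scan the upper triangle so each pair is reported once.
high_pairs = [
    (labels[i], labels[j], matrix[i][j])
    for i in range(len(labels))
    for j in range(i + 1, len(labels))
    if abs(matrix[i][j]) >= THRESHOLD
]

for left, right, value in high_pairs:
    print(f"{left} ~ {right}: {value:.3f}")
```

With this cut-off the only pairs flagged are 건축_면적 with 연면적 and 연면적 with 지상_층_수, which matches the three high-correlation alerts.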

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
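
Nullity correlation can be illustrated as the Pearson correlation between the missing-value indicators of two columns. The sketch below uses a hypothetical helper, `nullity_correlation`, and made-up toy columns; it is not the profiler's implementation and the data are not taken from this dataset.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is never (or always) missing has no defined correlation.
        return float("nan")
    return cov / math.sqrt(var_a * var_b)

# Hypothetical toy columns; None marks a missing value.
building_name = ["A", None, "B", None, "C", "D"]
other_use = ["x", None, "y", None, None, "z"]

r = nullity_correlation(building_name, other_use)
print(f"nullity correlation: {r:.4f}")
```

A value near 1 means the two columns tend to be missing in the same rows, a value near -1 means one tends to be present when the other is missing, and a value near 0 means the missingness patterns are unrelated.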

Sample

관리_동별_개요_PK | 관리_허가대장_PK | 건물_명 | 주_용도_코드 | 기타_용도 | 구조_코드 | 기타_구조 | 지붕_코드 | 건축_면적 | 연면적 | 지상_층_수 | 지하_층_수
3844411380-10000670711380-100005519삼익하이빌02000<NA>21<NA>1070.8226.7650
8620211650-406411650-4013방배동 다세대 주택02000다세대주택21<NA>10117.43541.6941
4401311380-56511380-560박해룡주택01000<NA>11<NA>200.060.3820
3467811380-286811380-2868<NA><NA><NA>21<NA><NA>0.0648.5841
3479611380-1037211380-10220아이언아파트02000근린생활시설및아파트21<NA>10379.952692.8171
1479211350-10003230811350-100019187별궁도시형생활주택02000다세대주택 및 도시형생활주택21철근콘크리트구조100.036.7440
527611170-420011170-3888<NA>04000<NA>51<NA>9062.8192.5620
4553811440-10000866511440-100007359<NA>04000제2종근린생활시설51<NA>2022.7617.7621
6577711140-10004064511140-100032811<NA>14000사무실, 근린생활시설, 은행21철골철근콘크리트조10819.6915321.93153
8227711620-4511620-360001000다가구주택11<NA>1075.01209.1921
관리_동별_개요_PK | 관리_허가대장_PK | 건물_명 | 주_용도_코드 | 기타_용도 | 구조_코드 | 기타_구조 | 지붕_코드 | 건축_면적 | 연면적 | 지상_층_수 | 지하_층_수
2162711215-10007601011215-100067293<NA>14000근린생활시설, 업무시설21철근콘크리트조100.00.093
6542911545-279511545-2569나동20000<NA>32<NA>9044.044.000
2250011260-377611260-3613.04000<NA>21<NA>90198.9568.4630
8144511620-759211620-7661101000다중주택21<NA>1081.87286.3331
4814111215-218711215-2141104000업무시설,단독주택31<NA>90264.451174.941
909811140-239911140-2250<NA>04000<NA>50<NA>2079.34145.4620
7761811590-197211590-1975윤찬순씨네다세대주택02000다세대주택21<NA>10188.55902.4441
5795411410-291611410-2835대원빌딩04000단독주택31<NA>903.75520.3530
185411170-10004298011170-100034540<NA>01000주택, 근린생활시설11연와조10137.16262.7821
7044711590-315611590-31443동Z5000<NA>21<NA>90173.34494.9250