Overview

Dataset statistics

Number of variables: 13
Number of observations: 10000
Missing cells: 395
Missing cells (%): 0.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.1 MiB
Average record size in memory: 115.0 B
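
The percentages above follow directly from the counts. A minimal sketch of that arithmetic, using only the figures reported in this section (the variable names are illustrative, not part of the report):

```python
# Figures copied from the dataset statistics above.
n_variables = 13
n_observations = 10_000
missing_cells = 395

# Every row has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 115.0 B per record across all rows implies roughly 1.1 MiB in total.
avg_record_bytes = 115.0
total_mib = avg_record_bytes * n_observations / 2**20

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"approx. total size: {total_mib:.1f} MiB")
```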

Variable types

Text: 5
Categorical: 5
Numeric: 3

Dataset

Description: 전유_공용_면적_PK, 폐쇄말소대장_PK, 전유_공용_구분_코드, 주_부속_구분_코드, 층_구분_코드, 층_번호, 층_번호_명, 구조_코드, 기타_구조, 주_용도_코드, 기타_용도, 면적, 작업_일자
Author: 서울특별시
URL: https://data.seoul.go.kr/dataList/OA-15398/S/1/datasetView.do

Alerts

기타_구조 is highly overall correlated with 구조_코드 [High correlation]
구조_코드 is highly overall correlated with 기타_구조 [High correlation]
전유_공용_구분_코드 is highly overall correlated with 층_구분_코드 [High correlation]
층_구분_코드 is highly overall correlated with 전유_공용_구분_코드 [High correlation]
구조_코드 is highly imbalanced (89.2%) [Imbalance]
기타_구조 is highly imbalanced (81.5%) [Imbalance]
층_번호_명 has 144 (1.4%) missing values [Missing]
주_용도_코드 has 123 (1.2%) missing values [Missing]
기타_용도 has 128 (1.3%) missing values [Missing]
전유_공용_면적_PK has unique values [Unique]
층_번호 has 4627 (46.3%) zeros [Zeros]
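
The imbalance percentages in the alerts can be approximated from a variable's value counts. The sketch below assumes the score is one minus the normalised Shannon entropy of the category frequencies. That definition is an assumption here, not something the report states, although for 구조_코드 it gives a value close to the 89.2% shown. The function name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits
    of the category frequencies and k is the number of categories.
    0 means perfectly balanced; values near 1 mean one category dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # Not meaningful for fewer than two categories; 0.0 is a convention chosen here.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Value counts of 구조_코드 as listed in its "Common Values" table
# (10 distinct values, including the literal "<NA>" category).
structure_counts = [9534, 252, 124, 34, 27, 13, 6, 4, 3, 3]

score = imbalance_score(structure_counts)
print(f"{score:.1%}")
```

A perfectly even split, such as `imbalance_score([5, 5])`, scores 0 under this definition.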

Reproduction

Analysis started: 2024-05-18 05:42:24.734457
Analysis finished: 2024-05-18 05:42:32.827670
Duration: 8.09 seconds
Software version: ydata-profiling v4.5.1
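
The reported duration is the difference between the two timestamps. A small sketch that checks this with the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-05-18 05:42:24.734457")
finished = datetime.fromisoformat("2024-05-18 05:42:32.827670")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```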

Variables

전유_공용_면적_PK
Text

UNIQUE 

Distinct10000
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Memory size156.2 KiB

Length

Max length28
Median length15
Mean length16.6652
Min length9

Characters and Unicode

Total characters166652
Distinct characters11
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique10000
Unique (%)100.0%

Sample

1st row11710-100286453
2nd row11350-100783619
3rd row11260-100954144
4th row11740-102714129
5th row11710-100337118
ValueCountFrequency (%)
11710-100286453 1
 
< 0.1%
11710-100279341 1
 
< 0.1%
11260-100773526 1
 
< 0.1%
11710-100289211 1
 
< 0.1%
11545-100970995 1
 
< 0.1%
11710-100273581 1
 
< 0.1%
11530-105646997 1
 
< 0.1%
11710-100274473 1
 
< 0.1%
11680-105917068 1
 
< 0.1%
11710-100278421 1
 
< 0.1%
Other values (9990) 9990
99.9%

Most occurring characters

ValueCountFrequency (%)
0 48922
29.4%
1 39506
23.7%
7 11843
 
7.1%
2 11624
 
7.0%
- 10000
 
6.0%
5 8123
 
4.9%
6 8044
 
4.8%
9 7992
 
4.8%
4 7686
 
4.6%
8 6457
 
3.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 156652
94.0%
Dash Punctuation 10000
 
6.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 48922
31.2%
1 39506
25.2%
7 11843
 
7.6%
2 11624
 
7.4%
5 8123
 
5.2%
6 8044
 
5.1%
9 7992
 
5.1%
4 7686
 
4.9%
8 6457
 
4.1%
3 6455
 
4.1%
Dash Punctuation
ValueCountFrequency (%)
- 10000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 166652
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 48922
29.4%
1 39506
23.7%
7 11843
 
7.1%
2 11624
 
7.0%
- 10000
 
6.0%
5 8123
 
4.9%
6 8044
 
4.8%
9 7992
 
4.8%
4 7686
 
4.6%
8 6457
 
3.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 166652
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 48922
29.4%
1 39506
23.7%
7 11843
 
7.1%
2 11624
 
7.0%
- 10000
 
6.0%
5 8123
 
4.9%
6 8044
 
4.8%
9 7992
 
4.8%
4 7686
 
4.6%
8 6457
 
3.9%

폐쇄말소대장_PK
Text

Distinct7144
Distinct (%)71.4%
Missing0
Missing (%)0.0%
Memory size156.2 KiB

Length

Max length28
Median length15
Mean length16.6272
Min length10

Characters and Unicode

Total characters166272
Distinct characters11
Distinct categories2
Distinct scripts1
Distinct blocks1

Unique

Unique4905
Unique (%)49.0%

Sample

1st row11710-100160494
2nd row11350-100216753
3rd row11260-100290878
4th row11740-100633340
5th row11710-100190839
ValueCountFrequency (%)
11260-100242930 6
 
0.1%
11710-100191126 5
 
< 0.1%
11260-100290943 5
 
< 0.1%
11680-101077182 5
 
< 0.1%
11200-100120743 5
 
< 0.1%
11500-1000000000000003193950 5
 
< 0.1%
11260-100291032 5
 
< 0.1%
11440-1000000000000002635762 5
 
< 0.1%
11260-100194370 5
 
< 0.1%
11260-100290879 5
 
< 0.1%
Other values (7134) 9949
99.5%

Most occurring characters

ValueCountFrequency (%)
0 50701
30.5%
1 44367
26.7%
2 10307
 
6.2%
- 10000
 
6.0%
7 9548
 
5.7%
6 9437
 
5.7%
5 8324
 
5.0%
9 6513
 
3.9%
4 6395
 
3.8%
3 5887
 
3.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 156272
94.0%
Dash Punctuation 10000
 
6.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 50701
32.4%
1 44367
28.4%
2 10307
 
6.6%
7 9548
 
6.1%
6 9437
 
6.0%
5 8324
 
5.3%
9 6513
 
4.2%
4 6395
 
4.1%
3 5887
 
3.8%
8 4793
 
3.1%
Dash Punctuation
ValueCountFrequency (%)
- 10000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 166272
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 50701
30.5%
1 44367
26.7%
2 10307
 
6.2%
- 10000
 
6.0%
7 9548
 
5.7%
6 9437
 
5.7%
5 8324
 
5.0%
9 6513
 
3.9%
4 6395
 
3.8%
3 5887
 
3.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 166272
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 50701
30.5%
1 44367
26.7%
2 10307
 
6.2%
- 10000
 
6.0%
7 9548
 
5.7%
6 9437
 
5.7%
5 8324
 
5.0%
9 6513
 
3.9%
4 6395
 
3.8%
3 5887
 
3.5%

전유_공용_구분_코드
Categorical

HIGH CORRELATION 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
공용
7780 
전유
2220 

Length

Max length2
Median length2
Mean length2
Min length2

Unique

Unique0
Unique (%)0.0%

Sample

1st row공용
2nd row공용
3rd row공용
4th row공용
5th row공용

Common Values

ValueCountFrequency (%)
공용 7780
77.8%
전유 2220
 
22.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
공용 7780
77.8%
전유 2220
 
22.2%

주_부속_구분_코드
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
주건축물
8522 
부속건축물
1478 

Length

Max length5
Median length4
Mean length4.1478
Min length4

Unique

Unique0
Unique (%)0.0%

Sample

1st row주건축물
2nd row부속건축물
3rd row부속건축물
4th row주건축물
5th row부속건축물

Common Values

ValueCountFrequency (%)
주건축물 8522
85.2%
부속건축물 1478
 
14.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
주건축물 8522
85.2%
부속건축물 1478
 
14.8%

층_구분_코드
Categorical

HIGH CORRELATION 

Distinct6
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
지상
4090 
각층
3668 
지하
1963 
<NA>
 
215
복수층(상층)
 
46

Length

Max length7
Median length2
Mean length2.066
Min length2

Unique

Unique0
Unique (%)0.0%

Sample

1st row각층
2nd row지하
3rd row지상
4th row<NA>
5th row각층

Common Values

ValueCountFrequency (%)
지상 4090
40.9%
각층 3668
36.7%
지하 1963
19.6%
<NA> 215
 
2.1%
복수층(상층) 46
 
0.5%
옥탑 18
 
0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
지상 4090
40.9%
각층 3668
36.7%
지하 1963
19.6%
na 215
 
2.1%
복수층(상층 46
 
0.5%
옥탑 18
 
0.2%

층_번호
Real number (ℝ)

ZEROS 

Distinct29
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.3774
Minimum0
Maximum32
Zeros4627
Zeros (%)46.3%
Negative0
Negative (%)0.0%
Memory size166.0 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median1
Q33
95-th percentile10
Maximum32
Range32
Interquartile range (IQR)3

Descriptive statistics

Standard deviation3.8155504
Coefficient of variation (CV)1.6049257
Kurtosis6.5695051
Mean2.3774
Median Absolute Deviation (MAD)1
Skewness2.3451389
Sum23774
Variance14.558425
MonotonicityNot monotonic
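
The descriptive statistics for 층_번호 are related by simple identities, which makes a quick consistency check possible. A minimal sketch using only the values reported above (variable names are illustrative):

```python
# Values copied from the descriptive statistics of 층_번호.
mean = 2.3774
std = 3.8155504
n = 10_000

cv = std / mean        # coefficient of variation = standard deviation / mean
variance = std ** 2    # variance is the squared standard deviation
total = mean * n       # the sum equals mean times number of observations

print(f"CV = {cv:.7f}")
print(f"variance = {variance:.6f}")
print(f"sum = {total:.0f}")
```

The results agree with the reported CV, variance and sum up to rounding.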
Histogram with fixed size bins (bins=29)
ValueCountFrequency (%)
0 4627
46.3%
1 1785
 
17.8%
2 772
 
7.7%
3 436
 
4.4%
4 416
 
4.2%
6 349
 
3.5%
7 319
 
3.2%
5 262
 
2.6%
8 259
 
2.6%
9 243
 
2.4%
Other values (19) 532
 
5.3%
ValueCountFrequency (%)
0 4627
46.3%
1 1785
 
17.8%
2 772
 
7.7%
3 436
 
4.4%
4 416
 
4.2%
5 262
 
2.6%
6 349
 
3.5%
7 319
 
3.2%
8 259
 
2.6%
9 243
 
2.4%
ValueCountFrequency (%)
32 1
 
< 0.1%
28 1
 
< 0.1%
27 1
 
< 0.1%
25 5
 
0.1%
24 8
 
0.1%
23 9
0.1%
22 7
 
0.1%
21 8
 
0.1%
20 20
0.2%
19 21
0.2%

층_번호_명
Text

MISSING 

Distinct195
Distinct (%)2.0%
Missing144
Missing (%)1.4%
Memory size156.2 KiB

Length

Max length19
Median length14
Mean length4.4079748
Min length1

Characters and Unicode

Total characters43445
Distinct characters35
Distinct categories9
Distinct scripts3
Distinct blocks2

Unique

Unique60
Unique (%)0.6%

Sample

1st row지5-1층,3층-7층,10층,11층
2nd row지1
3rd row1층
4th row각층
5th row지하2층
ValueCountFrequency (%)
각층 1533
 
15.5%
지5-1층,3층-7층,10층,11층 862
 
8.7%
지5층-지1층 808
 
8.2%
1층 769
 
7.8%
지1 444
 
4.5%
지하1층 302
 
3.1%
2층 235
 
2.4%
지2 229
 
2.3%
3 213
 
2.2%
7 211
 
2.1%
Other values (180) 4278
43.3%

Most occurring characters

ValueCountFrequency (%)
11488
26.4%
1 7819
18.0%
6160
14.2%
, 2900
 
6.7%
- 2580
 
5.9%
5 2091
 
4.8%
3 1574
 
3.6%
1549
 
3.6%
2 1460
 
3.4%
7 1225
 
2.8%
Other values (25) 4599
10.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 20014
46.1%
Decimal Number 16921
38.9%
Other Punctuation 2904
 
6.7%
Dash Punctuation 2580
 
5.9%
Math Symbol 932
 
2.1%
Space Separator 35
 
0.1%
Open Punctuation 22
 
0.1%
Close Punctuation 22
 
0.1%
Uppercase Letter 15
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
11488
57.4%
6160
30.8%
1549
 
7.7%
550
 
2.7%
159
 
0.8%
23
 
0.1%
23
 
0.1%
20
 
0.1%
20
 
0.1%
12
 
0.1%
Other values (6) 10
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 7819
46.2%
5 2091
 
12.4%
3 1574
 
9.3%
2 1460
 
8.6%
7 1225
 
7.2%
0 990
 
5.9%
4 826
 
4.9%
6 373
 
2.2%
8 294
 
1.7%
9 269
 
1.6%
Other Punctuation
ValueCountFrequency (%)
, 2900
99.9%
. 4
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
B 8
53.3%
A 7
46.7%
Dash Punctuation
ValueCountFrequency (%)
- 2580
100.0%
Math Symbol
ValueCountFrequency (%)
~ 932
100.0%
Space Separator
ValueCountFrequency (%)
35
100.0%
Open Punctuation
ValueCountFrequency (%)
( 22
100.0%
Close Punctuation
ValueCountFrequency (%)
) 22
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 23416
53.9%
Hangul 20014
46.1%
Latin 15
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
1 7819
33.4%
, 2900
 
12.4%
- 2580
 
11.0%
5 2091
 
8.9%
3 1574
 
6.7%
2 1460
 
6.2%
7 1225
 
5.2%
0 990
 
4.2%
~ 932
 
4.0%
4 826
 
3.5%
Other values (7) 1019
 
4.4%
Hangul
ValueCountFrequency (%)
11488
57.4%
6160
30.8%
1549
 
7.7%
550
 
2.7%
159
 
0.8%
23
 
0.1%
23
 
0.1%
20
 
0.1%
20
 
0.1%
12
 
0.1%
Other values (6) 10
 
< 0.1%
Latin
ValueCountFrequency (%)
B 8
53.3%
A 7
46.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23431
53.9%
Hangul 20014
46.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
11488
57.4%
6160
30.8%
1549
 
7.7%
550
 
2.7%
159
 
0.8%
23
 
0.1%
23
 
0.1%
20
 
0.1%
20
 
0.1%
12
 
0.1%
Other values (6) 10
 
< 0.1%
ASCII
ValueCountFrequency (%)
1 7819
33.4%
, 2900
 
12.4%
- 2580
 
11.0%
5 2091
 
8.9%
3 1574
 
6.7%
2 1460
 
6.2%
7 1225
 
5.2%
0 990
 
4.2%
~ 932
 
4.0%
4 826
 
3.5%
Other values (9) 1034
 
4.4%

구조_코드
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct10
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
철근콘크리트구조
9534 
철골철근콘크리트구조
 
252
벽돌구조
 
124
철골콘크리트구조
 
34
프리케스트콘크리트구조
 
27
Other values (5)
 
29

Length

Max length11
Median length8
Mean length8.0019
Min length4

Unique

Unique0
Unique (%)0.0%

Sample

1st row철근콘크리트구조
2nd row철근콘크리트구조
3rd row철근콘크리트구조
4th row철근콘크리트구조
5th row철근콘크리트구조

Common Values

ValueCountFrequency (%)
철근콘크리트구조 9534
95.3%
철골철근콘크리트구조 252
 
2.5%
벽돌구조 124
 
1.2%
철골콘크리트구조 34
 
0.3%
프리케스트콘크리트구조 27
 
0.3%
일반철골구조 13
 
0.1%
기타조적구조 6
 
0.1%
경량철골구조 4
 
< 0.1%
<NA> 3
 
< 0.1%
블록구조 3
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
철근콘크리트구조 9534
95.3%
철골철근콘크리트구조 252
 
2.5%
벽돌구조 124
 
1.2%
철골콘크리트구조 34
 
0.3%
프리케스트콘크리트구조 27
 
0.3%
일반철골구조 13
 
0.1%
기타조적구조 6
 
0.1%
경량철골구조 4
 
< 0.1%
na 3
 
< 0.1%
블록구조 3
 
< 0.1%

기타_구조
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct40
Distinct (%)0.4%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
철근콘크리트구조
8475 
철근콘크리트조
885 
철골철근콘크리트구조
 
250
연와조
 
75
<NA>
 
57
Other values (35)
 
258

Length

Max length24
Median length8
Mean length7.8949
Min length3

Unique

Unique8
Unique (%)0.1%

Sample

1st row철근콘크리트구조
2nd row철근콘크리트구조
3rd row철근콘크리트구조
4th row철근콘크리트조
5th row철근콘크리트구조

Common Values

ValueCountFrequency (%)
철근콘크리트구조 8475
84.8%
철근콘크리트조 885
 
8.8%
철골철근콘크리트구조 250
 
2.5%
연와조 75
 
0.8%
<NA> 57
 
0.6%
철근콘크리트 51
 
0.5%
철골콘크리트구조 32
 
0.3%
프리케스트콘크리트구조 26
 
0.3%
시멘트벽돌조 26
 
0.3%
조적조 16
 
0.2%
Other values (30) 107
 
1.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
철근콘크리트구조 8475
84.7%
철근콘크리트조 885
 
8.8%
철골철근콘크리트구조 250
 
2.5%
연와조 75
 
0.7%
na 57
 
0.6%
철근콘크리트 55
 
0.5%
철골콘크리트구조 32
 
0.3%
프리케스트콘크리트구조 26
 
0.3%
시멘트벽돌조 26
 
0.3%
조적조 16
 
0.2%
Other values (32) 112
 
1.1%

주_용도_코드
Text

MISSING 

Distinct79
Distinct (%)0.8%
Missing123
Missing (%)1.2%
Memory size156.2 KiB

Length

Max length12
Median length4
Mean length4.2615167
Min length2

Characters and Unicode

Total characters42091
Distinct characters127
Distinct categories5
Distinct scripts2
Distinct blocks2

Unique

Unique14
Unique (%)0.1%

Sample

1st row도매시장
2nd row기타노유자시설
3rd row아파트
4th row아파트
5th row복리시설
ValueCountFrequency (%)
도매시장 3001
30.4%
아파트 2094
21.2%
부대시설 954
 
9.7%
오피스텔 671
 
6.8%
기타공장 431
 
4.4%
다세대주택 400
 
4.0%
기타노유자시설 324
 
3.3%
상점(소매점 292
 
3.0%
기타제2종근린생활시설 204
 
2.1%
복리시설 155
 
1.6%
Other values (69) 1351
13.7%

Most occurring characters

ValueCountFrequency (%)
5207
 
12.4%
3792
 
9.0%
3565
 
8.5%
3010
 
7.2%
2094
 
5.0%
2094
 
5.0%
2094
 
5.0%
2015
 
4.8%
1373
 
3.3%
1371
 
3.3%
Other values (117) 15476
36.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 41214
97.9%
Open Punctuation 298
 
0.7%
Close Punctuation 298
 
0.7%
Decimal Number 278
 
0.7%
Other Punctuation 3
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5207
 
12.6%
3792
 
9.2%
3565
 
8.6%
3010
 
7.3%
2094
 
5.1%
2094
 
5.1%
2094
 
5.1%
2015
 
4.9%
1373
 
3.3%
1371
 
3.3%
Other values (112) 14599
35.4%
Decimal Number
ValueCountFrequency (%)
2 204
73.4%
1 74
 
26.6%
Open Punctuation
ValueCountFrequency (%)
( 298
100.0%
Close Punctuation
ValueCountFrequency (%)
) 298
100.0%
Other Punctuation
ValueCountFrequency (%)
. 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 41214
97.9%
Common 877
 
2.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5207
 
12.6%
3792
 
9.2%
3565
 
8.6%
3010
 
7.3%
2094
 
5.1%
2094
 
5.1%
2094
 
5.1%
2015
 
4.9%
1373
 
3.3%
1371
 
3.3%
Other values (112) 14599
35.4%
Common
ValueCountFrequency (%)
( 298
34.0%
) 298
34.0%
2 204
23.3%
1 74
 
8.4%
. 3
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 41214
97.9%
ASCII 877
 
2.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
5207
 
12.6%
3792
 
9.2%
3565
 
8.6%
3010
 
7.3%
2094
 
5.1%
2094
 
5.1%
2094
 
5.1%
2015
 
4.9%
1373
 
3.3%
1371
 
3.3%
Other values (112) 14599
35.4%
ASCII
ValueCountFrequency (%)
( 298
34.0%
) 298
34.0%
2 204
23.3%
1 74
 
8.4%
. 3
 
0.3%

기타_용도
Text

MISSING 

Distinct477
Distinct (%)4.8%
Missing128
Missing (%)1.3%
Memory size156.2 KiB

Length

Max length84
Median length63
Mean length13.663088
Min length1

Characters and Unicode

Total characters134882
Distinct characters249
Distinct categories11
Distinct scripts3
Distinct blocks3

Unique

Unique189
Unique (%)1.9%

Sample

1st row기계실,전기실,창고,재활용창고,용역원실,휴게실,오락실,주차관제실,체력단련실,검수실,방재센터,경비실,유아실,사무실
2nd row체력단련장,기전실계단
3rd row경로당,맘스카페
4th row계단
5th row보육시설,주민공동시설,문고,노인정,독서실(지1~2층)
ValueCountFrequency (%)
주차장 1333
 
13.3%
기계실,전기실,창고,재활용창고,용역원실,휴게실,오락실,주차관제실,체력단련실,검수실,방재센터,경비실,유아실,사무실 862
 
8.6%
계단실,복도,로비,화장실,공조실 801
 
8.0%
판매시설(도매시장 709
 
7.1%
계단실 402
 
4.0%
지하주차장 326
 
3.3%
벽체 322
 
3.2%
아파트 168
 
1.7%
경비실 163
 
1.6%
공동주택(아파트 149
 
1.5%
Other values (479) 4792
47.8%

Most occurring characters

ValueCountFrequency (%)
, 20191
 
15.0%
17531
 
13.0%
4130
 
3.1%
3810
 
2.8%
3804
 
2.8%
3671
 
2.7%
2985
 
2.2%
2724
 
2.0%
2568
 
1.9%
2375
 
1.8%
Other values (239) 71093
52.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 106822
79.2%
Other Punctuation 20410
 
15.1%
Uppercase Letter 2205
 
1.6%
Close Punctuation 1942
 
1.4%
Open Punctuation 1941
 
1.4%
Decimal Number 1039
 
0.8%
Math Symbol 212
 
0.2%
Space Separator 164
 
0.1%
Dash Punctuation 141
 
0.1%
Other Symbol 5
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
17531
 
16.4%
4130
 
3.9%
3810
 
3.6%
3804
 
3.6%
3671
 
3.4%
2985
 
2.8%
2724
 
2.6%
2568
 
2.4%
2375
 
2.2%
2207
 
2.1%
Other values (203) 61017
57.1%
Uppercase Letter
ValueCountFrequency (%)
D 432
19.6%
F 429
19.5%
M 428
19.4%
E 350
15.9%
V 183
8.3%
L 154
 
7.0%
S 72
 
3.3%
P 65
 
2.9%
T 46
 
2.1%
A 18
 
0.8%
Other values (5) 28
 
1.3%
Decimal Number
ValueCountFrequency (%)
1 481
46.3%
2 352
33.9%
3 92
 
8.9%
6 32
 
3.1%
4 31
 
3.0%
5 26
 
2.5%
7 11
 
1.1%
0 10
 
1.0%
9 4
 
0.4%
Other Punctuation
ValueCountFrequency (%)
, 20191
98.9%
/ 143
 
0.7%
. 70
 
0.3%
# 5
 
< 0.1%
: 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 1942
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1941
100.0%
Math Symbol
ValueCountFrequency (%)
~ 212
100.0%
Space Separator
ValueCountFrequency (%)
164
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 141
100.0%
Other Symbol
ValueCountFrequency (%)
5
100.0%
Control
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 106822
79.2%
Common 25855
 
19.2%
Latin 2205
 
1.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
17531
 
16.4%
4130
 
3.9%
3810
 
3.6%
3804
 
3.6%
3671
 
3.4%
2985
 
2.8%
2724
 
2.6%
2568
 
2.4%
2375
 
2.2%
2207
 
2.1%
Other values (203) 61017
57.1%
Common
ValueCountFrequency (%)
, 20191
78.1%
) 1942
 
7.5%
( 1941
 
7.5%
1 481
 
1.9%
2 352
 
1.4%
~ 212
 
0.8%
164
 
0.6%
/ 143
 
0.6%
- 141
 
0.5%
3 92
 
0.4%
Other values (11) 196
 
0.8%
Latin
ValueCountFrequency (%)
D 432
19.6%
F 429
19.5%
M 428
19.4%
E 350
15.9%
V 183
8.3%
L 154
 
7.0%
S 72
 
3.3%
P 65
 
2.9%
T 46
 
2.1%
A 18
 
0.8%
Other values (5) 28
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 106822
79.2%
ASCII 28055
 
20.8%
CJK Compat 5
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
, 20191
72.0%
) 1942
 
6.9%
( 1941
 
6.9%
1 481
 
1.7%
D 432
 
1.5%
F 429
 
1.5%
M 428
 
1.5%
2 352
 
1.3%
E 350
 
1.2%
~ 212
 
0.8%
Other values (25) 1297
 
4.6%
Hangul
ValueCountFrequency (%)
17531
 
16.4%
4130
 
3.9%
3810
 
3.6%
3804
 
3.6%
3671
 
3.4%
2985
 
2.8%
2724
 
2.6%
2568
 
2.4%
2375
 
2.2%
2207
 
2.1%
Other values (203) 61017
57.1%
CJK Compat
ValueCountFrequency (%)
5
100.0%

면적
Real number (ℝ)

Distinct2518
Distinct (%)25.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean23.500582
Minimum0
Maximum1136.84
Zeros47
Zeros (%)0.5%
Negative0
Negative (%)0.0%
Memory size166.0 KiB

Quantile statistics

Minimum0
5-th percentile0.30499
Q12.41
median13.96285
Q322.86
95-th percentile84.4675
Maximum1136.84
Range1136.84
Interquartile range (IQR)20.45

Descriptive statistics

Standard deviation54.224062
Coefficient of variation (CV)2.3073498
Kurtosis149.42843
Mean23.500582
Median Absolute Deviation (MAD)11.51285
Skewness10.488118
Sum235005.82
Variance2940.2489
MonotonicityNot monotonic
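
The skewness and kurtosis of 면적 point to a long right tail. One common heuristic for quantifying this is Tukey's 1.5 × IQR fence. The report itself does not apply this rule, so the sketch below is an illustration built from the reported quantiles only:

```python
# Quantiles copied from the quantile statistics of 면적.
q1, q3 = 2.41, 22.86
p95 = 84.4675
maximum = 1136.84

iqr = q3 - q1
upper_fence = q3 + 1.5 * iqr  # Tukey's rule of thumb for high outliers

print(f"IQR = {iqr:.2f}, upper fence = {upper_fence:.3f}")
print("95th percentile beyond fence:", p95 > upper_fence)
print("maximum beyond fence:", maximum > upper_fence)
```

Since the 95th percentile already lies above the fence, more than 5% of the rows would count as high outliers under this heuristic.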
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
2.41 505
 
5.1%
21.08 465
 
4.7%
22.68 417
 
4.2%
20.78 119
 
1.2%
20.66 101
 
1.0%
20.62 66
 
0.7%
20.58 51
 
0.5%
2.43 49
 
0.5%
16.3716 48
 
0.5%
0.0 47
 
0.5%
Other values (2508) 8132
81.3%
ValueCountFrequency (%)
0.0 47
0.5%
0.02 5
 
0.1%
0.03 45
0.4%
0.04 20
0.2%
0.05 28
0.3%
0.053 1
 
< 0.1%
0.0558 1
 
< 0.1%
0.0559 1
 
< 0.1%
0.0595 31
0.3%
0.06 36
0.4%
ValueCountFrequency (%)
1136.84 1
< 0.1%
1116.2 1
< 0.1%
1079.59 1
< 0.1%
1056.866 1
< 0.1%
928.52 1
< 0.1%
918.58 1
< 0.1%
907.5 1
< 0.1%
893.95 1
< 0.1%
874.16 1
< 0.1%
855.47 1
< 0.1%

작업_일자
Real number (ℝ)

Distinct54
Distinct (%)0.5%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean20228855
Minimum20200108
Maximum20240227
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size166.0 KiB

Quantile statistics

Minimum20200108
5-th percentile20211029
Q120231104
median20231104
Q320231110
95-th percentile20231124
Maximum20240227
Range40119
Interquartile range (IQR)6

Descriptive statistics

Standard deviation6877.7042
Coefficient of variation (CV)0.00033999474
Kurtosis6.0941699
Mean20228855
Median Absolute Deviation (MAD)6
Skewness-2.6720395
Sum2.0228855 × 10^11
Variance47302815
MonotonicityNot monotonic
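
The values of 작업_일자 look like dates written as YYYYMMDD integers, which the profiler has treated as plain numbers. Under that assumption, statistics such as Range and Mean describe the raw integers rather than elapsed time. A minimal sketch of the difference (the helper `parse_yyyymmdd` is hypothetical):

```python
from datetime import datetime


def parse_yyyymmdd(value: int) -> datetime:
    """Interpret an integer such as 20231104 as the date 2023-11-04."""
    return datetime.strptime(str(value), "%Y%m%d")


# Minimum and Maximum copied from the report.
earliest, latest = 20200108, 20240227

numeric_range = latest - earliest  # what the report lists as "Range"
elapsed_days = (parse_yyyymmdd(latest) - parse_yyyymmdd(earliest)).days

print("numeric range:", numeric_range)
print("calendar days:", elapsed_days)
```

Converting the column to a date type before profiling would make the quantiles and histogram reflect calendar time.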
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
20231104 4191
41.9%
20231124 2059
20.6%
20231110 1983
19.8%
20211029 642
 
6.4%
20230912 453
 
4.5%
20231028 104
 
1.0%
20240227 86
 
0.9%
20211216 77
 
0.8%
20211123 43
 
0.4%
20220211 34
 
0.3%
Other values (44) 328
 
3.3%
ValueCountFrequency (%)
20200108 2
 
< 0.1%
20200110 6
 
0.1%
20200117 5
 
0.1%
20200206 3
 
< 0.1%
20200213 4
 
< 0.1%
20200304 1
 
< 0.1%
20200306 9
 
0.1%
20200324 3
 
< 0.1%
20200331 23
0.2%
20200407 1
 
< 0.1%
ValueCountFrequency (%)
20240227 86
 
0.9%
20231124 2059
20.6%
20231110 1983
19.8%
20231104 4191
41.9%
20231028 104
 
1.0%
20230929 2
 
< 0.1%
20230912 453
 
4.5%
20230831 6
 
0.1%
20230808 2
 
< 0.1%
20230607 3
 
< 0.1%

Interactions

[Figure: pairwise interaction plots]

Correlations

Variable | 전유_공용_구분_코드 | 주_부속_구분_코드 | 층_구분_코드 | 층_번호 | 구조_코드 | 기타_구조 | 주_용도_코드 | 면적 | 작업_일자
전유_공용_구분_코드 | 1.000 | 0.342 | 0.433 | 0.527 | 0.100 | 0.203 | 0.339 | 0.188 | 0.156
주_부속_구분_코드 | 0.342 | 1.000 | 0.124 | 0.301 | 0.076 | 0.262 | 0.716 | 0.051 | 0.259
층_구분_코드 | 0.433 | 0.124 | 1.000 | 0.608 | 0.089 | 0.410 | 0.550 | 0.069 | 0.178
층_번호 | 0.527 | 0.301 | 0.608 | 1.000 | 0.059 | 0.000 | 0.411 | 0.395 | 0.147
구조_코드 | 0.100 | 0.076 | 0.089 | 0.059 | 1.000 | 0.990 | 0.799 | 0.206 | 0.344
기타_구조 | 0.203 | 0.262 | 0.410 | 0.000 | 0.990 | 1.000 | 0.863 | 0.364 | 0.779
주_용도_코드 | 0.339 | 0.716 | 0.550 | 0.411 | 0.799 | 0.863 | 1.000 | 0.687 | 0.609
면적 | 0.188 | 0.051 | 0.069 | 0.395 | 0.206 | 0.364 | 0.687 | 1.000 | 0.102
작업_일자 | 0.156 | 0.259 | 0.178 | 0.147 | 0.344 | 0.779 | 0.609 | 0.102 | 1.000

Variable | 기타_구조 | 주_부속_구분_코드 | 전유_공용_구분_코드 | 구조_코드 | 층_구분_코드
기타_구조 | 1.000 | 0.220 | 0.170 | 0.917 | 0.190
주_부속_구분_코드 | 0.220 | 1.000 | 0.222 | 0.076 | 0.152
전유_공용_구분_코드 | 0.170 | 0.222 | 1.000 | 0.100 | 0.527
구조_코드 | 0.917 | 0.076 | 0.100 | 1.000 | 0.051
층_구분_코드 | 0.190 | 0.152 | 0.527 | 0.051 | 1.000

Variable | 층_번호 | 면적 | 작업_일자 | 전유_공용_구분_코드 | 주_부속_구분_코드 | 층_구분_코드 | 구조_코드 | 기타_구조
층_번호 | 1.000 | 0.190 | 0.009 | 0.406 | 0.231 | 0.296 | 0.027 | 0.000
면적 | 0.190 | 1.000 | -0.061 | 0.144 | 0.039 | 0.029 | 0.095 | 0.134
작업_일자 | 0.009 | -0.061 | 1.000 | 0.112 | 0.186 | 0.121 | 0.178 | 0.468
전유_공용_구분_코드 | 0.406 | 0.144 | 0.112 | 1.000 | 0.222 | 0.527 | 0.100 | 0.170
주_부속_구분_코드 | 0.231 | 0.039 | 0.186 | 0.222 | 1.000 | 0.152 | 0.076 | 0.220
층_구분_코드 | 0.296 | 0.029 | 0.121 | 0.527 | 0.152 | 1.000 | 0.051 | 0.190
구조_코드 | 0.027 | 0.095 | 0.178 | 0.100 | 0.076 | 0.051 | 1.000 | 0.917
기타_구조 | 0.000 | 0.134 | 0.468 | 0.170 | 0.220 | 0.190 | 0.917 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

Index | 전유_공용_면적_PK | 폐쇄말소대장_PK | 전유_공용_구분_코드 | 주_부속_구분_코드 | 층_구분_코드 | 층_번호 | 층_번호_명 | 구조_코드 | 기타_구조 | 주_용도_코드 | 기타_용도 | 면적 | 작업_일자
22732 | 11710-100286453 | 11710-100160494 | 공용 | 주건축물 | 각층 | 0 | 지5-1층,3층-7층,10층,11층 | 철근콘크리트구조 | 철근콘크리트구조 | 도매시장 | 기계실,전기실,창고,재활용창고,용역원실,휴게실,오락실,주차관제실,체력단련실,검수실,방재센터,경비실,유아실,사무실 | 2.41 | 20231104
48215 | 11350-100783619 | 11350-100216753 | 공용 | 부속건축물 | 지하 | 1 | 지1 | 철근콘크리트구조 | 철근콘크리트구조 | 기타노유자시설 | 체력단련장,기전실계단 | 2.4936 | 20231124
55499 | 11260-100954144 | 11260-100290878 | 공용 | 부속건축물 | 지상 | 1 | 1층 | 철근콘크리트구조 | 철근콘크리트구조 | 아파트 | 경로당,맘스카페 | 0.6811 | 20231110
1877 | 11740-102714129 | 11740-100633340 | 공용 | 주건축물 | <NA> | 0 | <NA> | 철근콘크리트구조 | 철근콘크리트조 | 아파트 | 계단 | 0.79 | 20211127
36037 | 11710-100337118 | 11710-100190839 | 공용 | 부속건축물 | 각층 | 0 | 각층 | 철근콘크리트구조 | 철근콘크리트구조 | 복리시설 | 보육시설,주민공동시설,문고,노인정,독서실(지1~2층) | 2.83 | 20231104
41986 | 11680-105916827 | 11680-101076530 | 공용 | 부속건축물 | 지하 | 2 | 지하2층 | 철근콘크리트구조 | 철근콘크리트조 | 부대시설 | 보일러실 | 1.14 | 20211029
15930 | 11710-100291523 | 11710-100159514 | 공용 | 주건축물 | 지상 | 8 | 8 | 철근콘크리트구조 | 철근콘크리트구조 | 도매시장 | 계단실,복도,로비,화장실,공조실 | 20.78 | 20231104
1641 | 11440-1000000000000007907402 | 11440-1000000000000002613654 | 공용 | 주건축물 | 지하 | 0 | 지2~지1층 | 철근콘크리트구조 | 철근콘크리트구조 | 부대시설 | 폐기물보관실,용역원휴게실 | 0.41 | 20231124
50487 | 11350-100784225 | 11350-100216829 | 전유 | 주건축물 | 지상 | 9 | 9층 | 철근콘크리트구조 | 철근콘크리트구조 | 기타노유자시설 | 노인복지주택 | 84.7513 | 20231124
47632 | 11350-100782527 | 11350-100216616 | 공용 | 부속건축물 | 지상 | 1 | 1층 | 철근콘크리트구조 | 철근콘크리트구조 | 기타노유자시설 | 지하주차장 | 0.0766 | 20231124

Index | 전유_공용_면적_PK | 폐쇄말소대장_PK | 전유_공용_구분_코드 | 주_부속_구분_코드 | 층_구분_코드 | 층_번호 | 층_번호_명 | 구조_코드 | 기타_구조 | 주_용도_코드 | 기타_용도 | 면적 | 작업_일자
6637 | 11710-100279051 | 11710-100157175 | 전유 | 주건축물 | 지상 | 2 | 2 | 철근콘크리트구조 | 철근콘크리트구조 | 도매시장 | 판매시설(도매시장) | 22.68 | 20231104
2080 | 11710-100280583 | 11710-100157289 | 전유 | 주건축물 | 지상 | 2 | 2 | 철근콘크리트구조 | 철근콘크리트구조 | 도매시장 | 판매시설(도매시장) | 22.75 | 20231104
33328 | 11710-100276378 | 11710-100156423 | 공용 | 주건축물 | 각층 | 0 | 지5층-지1층 | 철근콘크리트구조 | 철근콘크리트구조 | 도매시장 | 주차장 | 21.08 | 20231104
52761 | 11140-100263671 | 11140-100090619 | 공용 | 주건축물 | 지하 | 2 | 지2 | 철근콘크리트구조 | 철근콘크리트구조 | 기타판매시설 | 계단실,ELEV.,복도,화장실 | 5.6903 | 20231124
33027 | 11260-100951458 | 11260-100290335 | 공용 | 주건축물 | 지상 | 0 | 각층 | 철근콘크리트구조 | 철근콘크리트구조 | 아파트 | 계단실,복도 | 16.3716 | 20231110
19496 | 11710-100279201 | 11710-100157782 | 공용 | 주건축물 | 각층 | 0 | 지5-1층,3층-7층,10층,11층 | 철근콘크리트구조 | 철근콘크리트구조 | 도매시장 | 기계실,전기실,창고,재활용창고,용역원실,휴게실,오락실,주차관제실,체력단련실,검수실,방재센터,경비실,유아실,사무실 | 1.2 | 20231104
51499 | 11260-1000000000000007522809 | 11260-90500 | 전유 | 주건축물 | 지상 | 2 | 2층 | 철근콘크리트구조 | <NA> | 의원 | 근린생활시설(의원) | 181.21 | 20231028
21016 | 11710-100270835 | 11710-100158912 | 전유 | 주건축물 | 지상 | 8 | 8 | 철근콘크리트구조 | 철근콘크리트구조 | 도매시장 | 판매시설(도매시장) | 22.68 | 20231104
25429 | 11260-100957792 | 11260-100291393 | 공용 | 주건축물 | 각층 | 0 | 지4~지1 | 철근콘크리트구조 | 철근콘크리트구조 | 오피스텔 | 지하주차장 | 21.4429 | 20231110
43804 | 11710-100290615 | 11710-100160199 | 공용 | 주건축물 | 지상 | 3 | 3 | 철근콘크리트구조 | 철근콘크리트구조 | 도매시장 | 계단실,복도,로비,화장실,공조실 | 21.2 | 20231104