Overview

Dataset statistics

Number of variables: 14
Number of observations: 10000
Missing cells: 10636
Missing cells (%): 7.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.2 MiB
Average record size in memory: 124.0 B
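
The missing-cell total can be cross-checked against the per-variable missing counts reported in the variable sections of this profile. The following is a minimal Python sketch, not part of the original report. The dictionary and variable names are illustrative, and the attribution of the 98 and 2 counts to 기타_구조 and 층_번호_명 is an assumption inferred from the column order in the dataset description.

```python
# Per-variable missing counts as reported in this profile.
# Variables not listed here report 0 missing values.
# Assumption: the counts 98 and 2 belong to 기타_구조 and 층_번호_명.
missing_by_variable = {
    "관리_주_폐쇄말소대장_PK": 9998,
    "주_용도_코드": 408,
    "기타_용도": 130,
    "기타_구조": 98,
    "층_번호_명": 2,
}

n_variables = 14
n_observations = 10_000

total_missing = sum(missing_by_variable.values())
total_cells = n_variables * n_observations
missing_pct = 100 * total_missing / total_cells

print(total_missing, f"{missing_pct:.1f}%")
```

The percentage is taken over all cells (variables times observations), not over rows.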

Variable types

Text: 7
Categorical: 3
Numeric: 4

Dataset

Description: 관리_층별_개요_PK, 관리_폐쇄말소대장_PK, 관리_주_폐쇄말소대장_PK, 주_부속_구분_코드, 주_부속_일련번호, 층_구분_코드, 층_번호, 층_번호_명, 구조_코드, 기타_구조, 주_용도_코드, 기타_용도, 면적
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15397/S/1/datasetView.do

Alerts

주_부속_구분_코드 is highly imbalanced (89.9%) [Imbalance]
층_구분_코드 is highly imbalanced (51.6%) [Imbalance]
구조_코드 is highly imbalanced (57.0%) [Imbalance]
관리_주_폐쇄말소대장_PK has 9998 (> 99.9%) missing values [Missing]
주_용도_코드 has 408 (4.1%) missing values [Missing]
기타_용도 has 130 (1.3%) missing values [Missing]
관리_층별_개요_PK has unique values [Unique]
주_부속_일련번호 has 1341 (13.4%) zeros [Zeros]
층_번호 has 104 (1.0%) zeros [Zeros]
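
The imbalance percentages in the alerts are consistent with one minus the normalized Shannon entropy of the category counts. The sketch below is an illustration under that assumption, not a copy of the profiler's code. The function name `imbalance_score` is hypothetical, and the counts are taken from the value tables of 주_부속_구분_코드 and 층_구분_코드 in this report. Under this formula the two calls should give roughly the 89.9% and 51.6% reported above.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / ln(k): H is the Shannon entropy of the class
    frequencies, k the number of classes. 0 means perfectly balanced;
    values near 1 mean one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # A single class has no defined normalized entropy;
        # returning 0.0 here is a convention chosen for this sketch.
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)


main_annex_counts = [9771, 226, 3]       # 주건축물, 부속건축물, <NA>
floor_type_counts = [7538, 2050, 410, 2]  # 지상, 지하, 옥탑, <NA>

print(f"{imbalance_score(main_annex_counts):.1%}")
print(f"{imbalance_score(floor_type_counts):.1%}")
```

The score for 구조_코드 cannot be checked the same way from this report alone, because only the ten most frequent of its 19 categories are listed.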

Reproduction

Analysis started: 2024-05-11 00:10:23.815833
Analysis finished: 2024-05-11 00:10:37.007059
Duration: 13.19 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

관리_층별_개요_PK
Text

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 15
Mean length: 14.8546
Min length: 8

Characters and Unicode

Total characters: 148546
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
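
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. This is a minimal sketch, and the helper name `category_counts` is hypothetical. The two-letter codes map to the long names used in the tables of this report: `Nd` is Decimal Number, `Pd` is Dash Punctuation, and `Lo` is Other Letter.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count the characters of `text` by Unicode general category."""
    return Counter(unicodedata.category(ch) for ch in text)


# A key value from this report: digits plus one hyphen.
print(category_counts("11470-100296372"))

# A floor label: one digit and one Hangul syllable.
print(category_counts("1층"))
```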

Unique

Unique: 10000
Unique (%): 100.0%

Sample

1st row11470-100296372
2nd row11305-100214721
3rd row11380-100404316
4th row11200-100178373
5th row11215-52272
ValueCountFrequency (%)
11470-100296372 1
 
< 0.1%
11710-100785455 1
 
< 0.1%
11170-100429810 1
 
< 0.1%
11650-100553065 1
 
< 0.1%
11650-1000000000000002541883 1
 
< 0.1%
11215-100355513 1
 
< 0.1%
11545-100270754 1
 
< 0.1%
11650-100055018 1
 
< 0.1%
11440-100674604 1
 
< 0.1%
11110-100081530 1
 
< 0.1%
Other values (9990) 9990
99.9%

Most occurring characters

ValueCountFrequency (%)
0 37226
25.1%
1 36942
24.9%
2 10032
 
6.8%
- 10000
 
6.7%
5 9442
 
6.4%
4 9432
 
6.3%
3 9318
 
6.3%
6 7757
 
5.2%
7 6385
 
4.3%
9 6026
 
4.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 138546
93.3%
Dash Punctuation 10000
 
6.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 37226
26.9%
1 36942
26.7%
2 10032
 
7.2%
5 9442
 
6.8%
4 9432
 
6.8%
3 9318
 
6.7%
6 7757
 
5.6%
7 6385
 
4.6%
9 6026
 
4.3%
8 5986
 
4.3%
Dash Punctuation
ValueCountFrequency (%)
- 10000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 148546
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 37226
25.1%
1 36942
24.9%
2 10032
 
6.8%
- 10000
 
6.7%
5 9442
 
6.4%
4 9432
 
6.3%
3 9318
 
6.3%
6 7757
 
5.2%
7 6385
 
4.3%
9 6026
 
4.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 148546
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 37226
25.1%
1 36942
24.9%
2 10032
 
6.8%
- 10000
 
6.7%
5 9442
 
6.4%
4 9432
 
6.3%
3 9318
 
6.3%
6 7757
 
5.2%
7 6385
 
4.3%
9 6026
 
4.1%

관리_폐쇄말소대장_PK
Text

Distinct: 7334
Distinct (%): 73.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 15
Mean length: 15.2099
Min length: 9

Characters and Unicode

Total characters: 152099
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5526
Unique (%): 55.3%

Sample

1st row11470-100429679
2nd row11305-100169487
3rd row11380-100352096
4th row11200-100228601
5th row11215-13903
ValueCountFrequency (%)
11110-100034869 16
 
0.2%
11740-1366 15
 
0.1%
11170-100378215 13
 
0.1%
11260-100284459 13
 
0.1%
11590-100114281 13
 
0.1%
11740-4756 12
 
0.1%
11260-100284460 11
 
0.1%
11650-100715880 11
 
0.1%
11140-101026233 10
 
0.1%
11215-3288 10
 
0.1%
Other values (7324) 9876
98.8%

Most occurring characters

ValueCountFrequency (%)
0 39772
26.1%
1 37601
24.7%
2 10037
 
6.6%
- 10000
 
6.6%
5 8929
 
5.9%
3 8919
 
5.9%
4 8360
 
5.5%
6 8341
 
5.5%
7 7184
 
4.7%
8 6727
 
4.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 142099
93.4%
Dash Punctuation 10000
 
6.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 39772
28.0%
1 37601
26.5%
2 10037
 
7.1%
5 8929
 
6.3%
3 8919
 
6.3%
4 8360
 
5.9%
6 8341
 
5.9%
7 7184
 
5.1%
8 6727
 
4.7%
9 6229
 
4.4%
Dash Punctuation
ValueCountFrequency (%)
- 10000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 152099
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 39772
26.1%
1 37601
24.7%
2 10037
 
6.6%
- 10000
 
6.6%
5 8929
 
5.9%
3 8919
 
5.9%
4 8360
 
5.5%
6 8341
 
5.5%
7 7184
 
4.7%
8 6727
 
4.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 152099
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 39772
26.1%
1 37601
24.7%
2 10037
 
6.6%
- 10000
 
6.6%
5 8929
 
5.9%
3 8919
 
5.9%
4 8360
 
5.5%
6 8341
 
5.5%
7 7184
 
4.7%
8 6727
 
4.4%

관리_주_폐쇄말소대장_PK
Text

Distinct: 2
Distinct (%): 100.0%
Missing: 9998
Missing (%): > 99.9%
Memory size: 156.2 KiB

Length

Max length: 15
Median length: 12.5
Mean length: 12.5
Min length: 10

Characters and Unicode

Total characters: 25
Distinct characters: 9
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row11545-100445386
2nd row11560-2668
ValueCountFrequency (%)
11545-100445386 1
50.0%
11560-2668 1
50.0%

Most occurring characters

ValueCountFrequency (%)
1 5
20.0%
5 4
16.0%
6 4
16.0%
4 3
12.0%
0 3
12.0%
- 2
 
8.0%
8 2
 
8.0%
3 1
 
4.0%
2 1
 
4.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 23
92.0%
Dash Punctuation 2
 
8.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 5
21.7%
5 4
17.4%
6 4
17.4%
4 3
13.0%
0 3
13.0%
8 2
 
8.7%
3 1
 
4.3%
2 1
 
4.3%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 25
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 5
20.0%
5 4
16.0%
6 4
16.0%
4 3
12.0%
0 3
12.0%
- 2
 
8.0%
8 2
 
8.0%
3 1
 
4.0%
2 1
 
4.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 25
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 5
20.0%
5 4
16.0%
6 4
16.0%
4 3
12.0%
0 3
12.0%
- 2
 
8.0%
8 2
 
8.0%
3 1
 
4.0%
2 1
 
4.0%

주_부속_구분_코드
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

주건축물: 9771
부속건축물: 226
<NA>: 3

Length

Max length: 5
Median length: 4
Mean length: 4.0226
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row주건축물
2nd row주건축물
3rd row주건축물
4th row주건축물
5th row주건축물

Common Values

ValueCountFrequency (%)
주건축물 9771
97.7%
부속건축물 226
 
2.3%
<NA> 3
 
< 0.1%

Length

Histogram of lengths of the category


주_부속_일련번호
Real number (ℝ)

ZEROS 

Distinct: 49
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.4198
Minimum: 0
Maximum: 97
Zeros: 1341
Zeros (%): 13.4%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 1
Median: 1
Q3: 1
95th percentile: 3
Maximum: 97
Range: 97
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 3.6947169
Coefficient of variation (CV): 2.6022798
Kurtosis: 208.98002
Mean: 1.4198
Median Absolute Deviation (MAD): 0
Skewness: 12.363988
Sum: 14198
Variance: 13.650933
Monotonicity: Not monotonic
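
Several of these descriptive statistics are simple functions of one another, so the reported figures can be sanity-checked. The sketch below assumes the usual definitions (CV as standard deviation divided by mean, variance as squared standard deviation, sum as mean times row count). The variable names are illustrative, and the inputs are the rounded values printed in this report.

```python
# Values reported for 주_부속_일련번호 in this profile.
mean = 1.4198
std = 3.6947169
n_rows = 10_000

cv = std / mean         # coefficient of variation
variance = std ** 2     # variance is the squared standard deviation
total = mean * n_rows   # sum of all values

print(round(cv, 7), round(variance, 6), round(total))
```

Small differences in the last digit are expected because the inputs are already rounded.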
Histogram with fixed size bins (bins=49)
ValueCountFrequency (%)
1 7725
77.2%
0 1341
 
13.4%
2 427
 
4.3%
3 127
 
1.3%
4 57
 
0.6%
6 48
 
0.5%
5 42
 
0.4%
7 24
 
0.2%
9 23
 
0.2%
8 21
 
0.2%
Other values (39) 165
 
1.7%
ValueCountFrequency (%)
0 1341
 
13.4%
1 7725
77.2%
2 427
 
4.3%
3 127
 
1.3%
4 57
 
0.6%
5 42
 
0.4%
6 48
 
0.5%
7 24
 
0.2%
8 21
 
0.2%
9 23
 
0.2%
ValueCountFrequency (%)
97 2
< 0.1%
91 1
 
< 0.1%
71 1
 
< 0.1%
62 1
 
< 0.1%
60 2
< 0.1%
56 1
 
< 0.1%
51 1
 
< 0.1%
48 3
< 0.1%
46 1
 
< 0.1%
45 2
< 0.1%

층_구분_코드
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

지상: 7538
지하: 2050
옥탑: 410
<NA>: 2

Length

Max length: 4
Median length: 2
Mean length: 2.0004
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row지상
2nd row지상
3rd row지상
4th row옥탑
5th row지상

Common Values

ValueCountFrequency (%)
지상 7538
75.4%
지하 2050
 
20.5%
옥탑 410
 
4.1%
<NA> 2
 
< 0.1%

Length

Histogram of lengths of the category


층_번호
Real number (ℝ)

ZEROS 

Distinct: 36
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0886
Minimum: 0
Maximum: 39
Zeros: 104
Zeros (%): 1.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5th percentile: 1
Q1: 1
Median: 1
Q3: 2
95th percentile: 6
Maximum: 39
Range: 39
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 2.4821696
Coefficient of variation (CV): 1.1884371
Kurtosis: 45.830111
Mean: 2.0886
Median Absolute Deviation (MAD): 0
Skewness: 5.5596943
Sum: 20886
Variance: 6.1611662
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=36)
ValueCountFrequency (%)
1 5703
57.0%
2 2078
 
20.8%
3 896
 
9.0%
4 444
 
4.4%
5 272
 
2.7%
6 118
 
1.2%
0 104
 
1.0%
7 70
 
0.7%
8 67
 
0.7%
10 46
 
0.5%
Other values (26) 202
 
2.0%
ValueCountFrequency (%)
0 104
 
1.0%
1 5703
57.0%
2 2078
 
20.8%
3 896
 
9.0%
4 444
 
4.4%
5 272
 
2.7%
6 118
 
1.2%
7 70
 
0.7%
8 67
 
0.7%
9 44
 
0.4%
ValueCountFrequency (%)
39 1
 
< 0.1%
37 1
 
< 0.1%
35 1
 
< 0.1%
33 1
 
< 0.1%
31 1
 
< 0.1%
30 1
 
< 0.1%
29 3
< 0.1%
28 3
< 0.1%
27 1
 
< 0.1%
26 4
< 0.1%

층_번호_명
Text

Distinct: 151
Distinct (%): 1.5%
Missing: 2
Missing (%): < 0.1%
Memory size: 156.2 KiB

Length

Max length: 8
Median length: 2
Mean length: 2.1993399
Min length: 1

Characters and Unicode

Total characters: 21989
Distinct characters: 46
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
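
The mean length above appears to be the total character count divided by the number of non-missing rows. The sketch below checks that reading against the figures in this section. The function name `mean_length` is hypothetical, and the formula is an assumption that happens to match the reported value.

```python
def mean_length(total_chars, n_rows, n_missing):
    """Average string length over the non-missing rows only."""
    return total_chars / (n_rows - n_missing)


# 21989 characters, 10000 rows, 2 of them missing.
print(round(mean_length(21989, 10_000, 2), 7))
```

Dividing by all 10000 rows instead would give a slightly smaller figure, so missing rows appear to be excluded from the average.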
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 71
Unique (%): 0.7%

Sample

1st row1층
2nd row4층
3rd row2층
4th row옥탑
5th row2층
ValueCountFrequency (%)
1층 3417
33.9%
2층 1840
18.3%
3층 797
 
7.9%
지1층 583
 
5.8%
지1 503
 
5.0%
지층 475
 
4.7%
4층 393
 
3.9%
지하1층 254
 
2.5%
5층 239
 
2.4%
옥탑 198
 
2.0%
Other values (135) 1368
13.6%

Most occurring characters

ValueCountFrequency (%)
8956
40.7%
1 5211
23.7%
2131
 
9.7%
2 2097
 
9.5%
3 899
 
4.1%
452
 
2.1%
4 445
 
2.0%
426
 
1.9%
417
 
1.9%
5 270
 
1.2%
Other values (36) 685
 
3.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 12555
57.1%
Decimal Number 9288
42.2%
Space Separator 69
 
0.3%
Open Punctuation 36
 
0.2%
Close Punctuation 36
 
0.2%
Uppercase Letter 5
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
8956
71.3%
2131
 
17.0%
452
 
3.6%
426
 
3.4%
417
 
3.3%
62
 
0.5%
28
 
0.2%
22
 
0.2%
13
 
0.1%
6
 
< 0.1%
Other values (18) 42
 
0.3%
Decimal Number
ValueCountFrequency (%)
1 5211
56.1%
2 2097
22.6%
3 899
 
9.7%
4 445
 
4.8%
5 270
 
2.9%
6 121
 
1.3%
8 74
 
0.8%
7 74
 
0.8%
0 49
 
0.5%
9 48
 
0.5%
Uppercase Letter
ValueCountFrequency (%)
D 1
20.0%
I 1
20.0%
P 1
20.0%
A 1
20.0%
T 1
20.0%
Space Separator
ValueCountFrequency (%)
69
100.0%
Open Punctuation
ValueCountFrequency (%)
( 36
100.0%
Close Punctuation
ValueCountFrequency (%)
) 36
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 12555
57.1%
Common 9429
42.9%
Latin 5
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
8956
71.3%
2131
 
17.0%
452
 
3.6%
426
 
3.4%
417
 
3.3%
62
 
0.5%
28
 
0.2%
22
 
0.2%
13
 
0.1%
6
 
< 0.1%
Other values (18) 42
 
0.3%
Common
ValueCountFrequency (%)
1 5211
55.3%
2 2097
22.2%
3 899
 
9.5%
4 445
 
4.7%
5 270
 
2.9%
6 121
 
1.3%
8 74
 
0.8%
7 74
 
0.8%
69
 
0.7%
0 49
 
0.5%
Other values (3) 120
 
1.3%
Latin
ValueCountFrequency (%)
D 1
20.0%
I 1
20.0%
P 1
20.0%
A 1
20.0%
T 1
20.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 12555
57.1%
ASCII 9434
42.9%

Most frequent character per block

Hangul
ValueCountFrequency (%)
8956
71.3%
2131
 
17.0%
452
 
3.6%
426
 
3.4%
417
 
3.3%
62
 
0.5%
28
 
0.2%
22
 
0.2%
13
 
0.1%
6
 
< 0.1%
Other values (18) 42
 
0.3%
ASCII
ValueCountFrequency (%)
1 5211
55.2%
2 2097
22.2%
3 899
 
9.5%
4 445
 
4.7%
5 270
 
2.9%
6 121
 
1.3%
8 74
 
0.8%
7 74
 
0.8%
69
 
0.7%
0 49
 
0.5%
Other values (8) 125
 
1.3%

구조_코드
Categorical

IMBALANCE 

Distinct: 19
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

철근콘크리트구조: 4402
벽돌구조: 4143
일반목구조: 549
블록구조: 335
철골철근콘크리트구조: 172
Other values (14): 399

Length

Max length: 12
Median length: 11
Mean length: 5.9983
Min length: 3

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row블록구조
2nd row철근콘크리트구조
3rd row벽돌구조
4th row벽돌구조
5th row철근콘크리트구조

Common Values

ValueCountFrequency (%)
철근콘크리트구조 4402
44.0%
벽돌구조 4143
41.4%
일반목구조 549
 
5.5%
블록구조 335
 
3.4%
철골철근콘크리트구조 172
 
1.7%
일반철골구조 119
 
1.2%
경량철골구조 100
 
1.0%
<NA> 97
 
1.0%
프리케스트콘크리트구조 31
 
0.3%
철골콘크리트구조 25
 
0.2%
Other values (9) 27
 
0.3%

Length

Histogram of lengths of the category

기타_구조
Text

Distinct: 325
Distinct (%): 3.3%
Missing: 98
Missing (%): 1.0%
Memory size: 156.2 KiB

Length

Max length: 35
Median length: 28
Mean length: 5.4049687
Min length: 1

Characters and Unicode

Total characters: 53520
Distinct characters: 126
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 180
Unique (%): 1.8%

Sample

1st row세멘블록조
2nd row철근콘크리트구조
3rd row연와조
4th row연와조
5th row철근콘크리트조
ValueCountFrequency (%)
연와조 3534
35.3%
철근콘크리트조 2369
23.6%
철근콘크리트구조 1374
 
13.7%
목조 516
 
5.1%
세멘벽돌조 283
 
2.8%
철근콘크리트 245
 
2.4%
철골철근콘크리트구조 130
 
1.3%
조적조 91
 
0.9%
세멘부록조 83
 
0.8%
철골철근콘크리트조 52
 
0.5%
Other values (287) 1348
 
13.4%

Most occurring characters

ValueCountFrequency (%)
9928
18.6%
5035
9.4%
4657
8.7%
4607
8.6%
4564
8.5%
4549
8.5%
4517
8.4%
3712
 
6.9%
3695
 
6.9%
1750
 
3.3%
Other values (116) 6506
12.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 52774
98.6%
Other Punctuation 364
 
0.7%
Space Separator 123
 
0.2%
Decimal Number 103
 
0.2%
Uppercase Letter 70
 
0.1%
Open Punctuation 33
 
0.1%
Close Punctuation 32
 
0.1%
Other Symbol 20
 
< 0.1%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
9928
18.8%
5035
9.5%
4657
8.8%
4607
8.7%
4564
8.6%
4549
8.6%
4517
8.6%
3712
 
7.0%
3695
 
7.0%
1750
 
3.3%
Other values (94) 5760
10.9%
Decimal Number
ValueCountFrequency (%)
6 16
15.5%
9 14
13.6%
3 14
13.6%
0 14
13.6%
1 13
12.6%
2 10
9.7%
5 8
7.8%
4 5
 
4.9%
7 5
 
4.9%
8 4
 
3.9%
Other Punctuation
ValueCountFrequency (%)
, 213
58.5%
. 114
31.3%
/ 20
 
5.5%
? 16
 
4.4%
: 1
 
0.3%
Uppercase Letter
ValueCountFrequency (%)
R 35
50.0%
C 35
50.0%
Space Separator
ValueCountFrequency (%)
123
100.0%
Open Punctuation
ValueCountFrequency (%)
( 33
100.0%
Close Punctuation
ValueCountFrequency (%)
) 32
100.0%
Other Symbol
ValueCountFrequency (%)
20
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 52774
98.6%
Common 676
 
1.3%
Latin 70
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
9928
18.8%
5035
9.5%
4657
8.8%
4607
8.7%
4564
8.6%
4549
8.6%
4517
8.6%
3712
 
7.0%
3695
 
7.0%
1750
 
3.3%
Other values (94) 5760
10.9%
Common
ValueCountFrequency (%)
, 213
31.5%
123
18.2%
. 114
16.9%
( 33
 
4.9%
) 32
 
4.7%
20
 
3.0%
/ 20
 
3.0%
? 16
 
2.4%
6 16
 
2.4%
9 14
 
2.1%
Other values (10) 75
 
11.1%
Latin
ValueCountFrequency (%)
R 35
50.0%
C 35
50.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 52774
98.6%
ASCII 726
 
1.4%
CJK Compat 20
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
9928
18.8%
5035
9.5%
4657
8.8%
4607
8.7%
4564
8.6%
4549
8.6%
4517
8.6%
3712
 
7.0%
3695
 
7.0%
1750
 
3.3%
Other values (94) 5760
10.9%
ASCII
ValueCountFrequency (%)
, 213
29.3%
123
16.9%
. 114
15.7%
R 35
 
4.8%
C 35
 
4.8%
( 33
 
4.5%
) 32
 
4.4%
/ 20
 
2.8%
? 16
 
2.2%
6 16
 
2.2%
Other values (11) 89
12.3%
CJK Compat
ValueCountFrequency (%)
20
100.0%

주_용도_코드
Text

MISSING 

Distinct: 177
Distinct (%): 1.8%
Missing: 408
Missing (%): 4.1%
Memory size: 156.2 KiB

Length

Max length: 14
Median length: 11
Mean length: 4.6353211
Min length: 2

Characters and Unicode

Total characters: 44462
Distinct characters: 179
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 45
Unique (%): 0.5%

Sample

1st row사무소
2nd row다세대주택
3rd row단독주택
4th row기타제1종근린생활시설
5th row기타제1종근린생활시설
ValueCountFrequency (%)
단독주택 3060
31.8%
다가구주택 1171
 
12.2%
다세대주택 777
 
8.1%
소매점 509
 
5.3%
아파트 497
 
5.2%
사무소 460
 
4.8%
기타제1종근린생활시설 396
 
4.1%
일반음식점 259
 
2.7%
연립주택 205
 
2.1%
기타제2종근린생활시설 192
 
2.0%
Other values (169) 2082
21.7%

Most occurring characters

ValueCountFrequency (%)
5307
 
11.9%
5233
 
11.8%
3094
 
7.0%
3071
 
6.9%
1977
 
4.4%
1252
 
2.8%
1236
 
2.8%
1203
 
2.7%
1185
 
2.7%
1180
 
2.7%
Other values (169) 19724
44.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 43785
98.5%
Decimal Number 589
 
1.3%
Open Punctuation 36
 
0.1%
Close Punctuation 36
 
0.1%
Space Separator 16
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5307
 
12.1%
5233
 
12.0%
3094
 
7.1%
3071
 
7.0%
1977
 
4.5%
1252
 
2.9%
1236
 
2.8%
1203
 
2.7%
1185
 
2.7%
1180
 
2.7%
Other values (164) 19047
43.5%
Decimal Number
ValueCountFrequency (%)
1 396
67.2%
2 193
32.8%
Open Punctuation
ValueCountFrequency (%)
( 36
100.0%
Close Punctuation
ValueCountFrequency (%)
) 36
100.0%
Space Separator
ValueCountFrequency (%)
16
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 43785
98.5%
Common 677
 
1.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5307
 
12.1%
5233
 
12.0%
3094
 
7.1%
3071
 
7.0%
1977
 
4.5%
1252
 
2.9%
1236
 
2.8%
1203
 
2.7%
1185
 
2.7%
1180
 
2.7%
Other values (164) 19047
43.5%
Common
ValueCountFrequency (%)
1 396
58.5%
2 193
28.5%
( 36
 
5.3%
) 36
 
5.3%
16
 
2.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 43785
98.5%
ASCII 677
 
1.5%

Most frequent character per block

Hangul
ValueCountFrequency (%)
5307
 
12.1%
5233
 
12.0%
3094
 
7.1%
3071
 
7.0%
1977
 
4.5%
1252
 
2.9%
1236
 
2.8%
1203
 
2.7%
1185
 
2.7%
1180
 
2.7%
Other values (164) 19047
43.5%
ASCII
ValueCountFrequency (%)
1 396
58.5%
2 193
28.5%
( 36
 
5.3%
) 36
 
5.3%
16
 
2.4%

기타_용도
Text

MISSING 

Distinct: 1323
Distinct (%): 13.4%
Missing: 130
Missing (%): 1.3%
Memory size: 156.2 KiB

Length

Max length: 67
Median length: 48
Mean length: 5.6862209
Min length: 1

Characters and Unicode

Total characters: 56123
Distinct characters: 334
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 872
Unique (%): 8.8%

Sample

1st row사무소
2nd row다세대주택(1세대)
3rd row주택
4th row계단실(연면적제외)
5th row근린생활시설
ValueCountFrequency (%)
주택 2636
25.7%
아파트 407
 
4.0%
근린생활시설 283
 
2.8%
단독주택 269
 
2.6%
다세대주택 229
 
2.2%
주차장 224
 
2.2%
점포 199
 
1.9%
연립주택 166
 
1.6%
사무실 143
 
1.4%
다가구주택(1가구 136
 
1.3%
Other values (1287) 5566
54.3%

Most occurring characters

ValueCountFrequency (%)
5133
 
9.1%
4789
 
8.5%
) 2814
 
5.0%
( 2813
 
5.0%
1835
 
3.3%
1779
 
3.2%
1458
 
2.6%
1360
 
2.4%
1280
 
2.3%
1271
 
2.3%
Other values (324) 31591
56.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 46275
82.5%
Close Punctuation 2818
 
5.0%
Open Punctuation 2817
 
5.0%
Decimal Number 2631
 
4.7%
Other Punctuation 758
 
1.4%
Space Separator 397
 
0.7%
Uppercase Letter 299
 
0.5%
Other Symbol 69
 
0.1%
Dash Punctuation 40
 
0.1%
Math Symbol 15
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5133
 
11.1%
4789
 
10.3%
1835
 
4.0%
1779
 
3.8%
1458
 
3.2%
1360
 
2.9%
1280
 
2.8%
1271
 
2.7%
1265
 
2.7%
1252
 
2.7%
Other values (276) 24853
53.7%
Uppercase Letter
ValueCountFrequency (%)
E 128
42.8%
V 67
22.4%
L 56
18.7%
P 11
 
3.7%
S 8
 
2.7%
D 6
 
2.0%
T 5
 
1.7%
H 3
 
1.0%
O 3
 
1.0%
A 2
 
0.7%
Other values (6) 10
 
3.3%
Decimal Number
ValueCountFrequency (%)
2 1016
38.6%
1 866
32.9%
4 156
 
5.9%
3 154
 
5.9%
5 95
 
3.6%
6 83
 
3.2%
8 79
 
3.0%
0 66
 
2.5%
9 63
 
2.4%
7 53
 
2.0%
Other Punctuation
ValueCountFrequency (%)
, 522
68.9%
. 168
 
22.2%
: 39
 
5.1%
/ 25
 
3.3%
? 3
 
0.4%
1
 
0.1%
Other Symbol
ValueCountFrequency (%)
65
94.2%
2
 
2.9%
1
 
1.4%
1
 
1.4%
Close Punctuation
ValueCountFrequency (%)
) 2814
99.9%
2
 
0.1%
] 2
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 2813
99.9%
2
 
0.1%
[ 2
 
0.1%
Math Symbol
ValueCountFrequency (%)
< 6
40.0%
> 6
40.0%
~ 3
20.0%
Space Separator
ValueCountFrequency (%)
397
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 40
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 46275
82.5%
Common 9549
 
17.0%
Latin 299
 
0.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5133
 
11.1%
4789
 
10.3%
1835
 
4.0%
1779
 
3.8%
1458
 
3.2%
1360
 
2.9%
1280
 
2.8%
1271
 
2.7%
1265
 
2.7%
1252
 
2.7%
Other values (276) 24853
53.7%
Common
ValueCountFrequency (%)
) 2814
29.5%
( 2813
29.5%
2 1016
 
10.6%
1 866
 
9.1%
, 522
 
5.5%
397
 
4.2%
. 168
 
1.8%
4 156
 
1.6%
3 154
 
1.6%
5 95
 
1.0%
Other values (22) 548
 
5.7%
Latin
ValueCountFrequency (%)
E 128
42.8%
V 67
22.4%
L 56
18.7%
P 11
 
3.7%
S 8
 
2.7%
D 6
 
2.0%
T 5
 
1.7%
H 3
 
1.0%
O 3
 
1.0%
A 2
 
0.7%
Other values (6) 10
 
3.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 46275
82.5%
ASCII 9774
 
17.4%
CJK Compat 65
 
0.1%
None 5
 
< 0.1%
Box Drawing 4
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
5133
 
11.1%
4789
 
10.3%
1835
 
4.0%
1779
 
3.8%
1458
 
3.2%
1360
 
2.9%
1280
 
2.8%
1271
 
2.7%
1265
 
2.7%
1252
 
2.7%
Other values (276) 24853
53.7%
ASCII
ValueCountFrequency (%)
) 2814
28.8%
( 2813
28.8%
2 1016
 
10.4%
1 866
 
8.9%
, 522
 
5.3%
397
 
4.1%
. 168
 
1.7%
4 156
 
1.6%
3 154
 
1.6%
E 128
 
1.3%
Other values (31) 740
 
7.6%
CJK Compat
ValueCountFrequency (%)
65
100.0%
None
ValueCountFrequency (%)
2
40.0%
2
40.0%
1
20.0%
Box Drawing
ValueCountFrequency (%)
2
50.0%
1
25.0%
1
25.0%

면적
Real number (ℝ)

Distinct: 6265
Distinct (%): 62.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 162.46379
Minimum: 0
Maximum: 20664.13
Zeros: 35
Zeros (%): 0.4%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5th percentile: 9.72
Q1: 41.57
Median: 74.33
Q3: 132.63
95th percentile: 587.29819
Maximum: 20664.13
Range: 20664.13
Interquartile range (IQR): 91.06

Descriptive statistics

Standard deviation: 412.21928
Coefficient of variation (CV): 2.5372993
Kurtosis: 709.82971
Mean: 162.46379
Median Absolute Deviation (MAD): 40.34
Skewness: 18.758557
Sum: 1624637.9
Variance: 169924.73
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.0 35
 
0.4%
33.06 30
 
0.3%
29.75 27
 
0.3%
39.67 23
 
0.2%
42.98 23
 
0.2%
16.53 22
 
0.2%
49.59 22
 
0.2%
66.12 21
 
0.2%
36.36 19
 
0.2%
19.83 19
 
0.2%
Other values (6255) 9759
97.6%
ValueCountFrequency (%)
0.0 35
0.4%
0.8 1
 
< 0.1%
0.9 1
 
< 0.1%
0.99 1
 
< 0.1%
1.0 5
 
0.1%
1.08 1
 
< 0.1%
1.21 2
 
< 0.1%
1.23 1
 
< 0.1%
1.26 1
 
< 0.1%
1.3 1
 
< 0.1%
ValueCountFrequency (%)
20664.13 1
< 0.1%
11478.802 1
< 0.1%
6886.18 1
< 0.1%
6732.46 1
< 0.1%
6415.0 1
< 0.1%
5719.75 1
< 0.1%
4678.24 1
< 0.1%
4537.0 1
< 0.1%
4393.82 1
< 0.1%
4271.05 1
< 0.1%

작업_일자
Real number (ℝ)

Distinct: 114
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20207967
Minimum: 20200108
Maximum: 20240420
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20200108
5th percentile: 20200114
Q1: 20200411
Median: 20200623
Q3: 20210130
95th percentile: 20240102
Maximum: 20240420
Range: 40312
Interquartile range (IQR): 9719

Descriptive statistics

Standard deviation: 13504.987
Coefficient of variation (CV): 0.00066830017
Kurtosis: 0.45110457
Mean: 20207967
Median Absolute Deviation (MAD): 309
Skewness: 1.4761984
Sum: 2.0207967 × 10^11
Variance: 1.8238469 × 10^8
Monotonicity: Not monotonic
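
The values of 작업_일자 look like dates packed into integers in YYYYMMDD form, so the numeric statistics above (for example the range of 40312) are differences between integers rather than spans of time. The sketch below assumes that YYYYMMDD reading, which the report itself does not confirm. The helper name `parse_yyyymmdd` is hypothetical.

```python
from datetime import datetime


def parse_yyyymmdd(value):
    """Convert an integer such as 20200108 to a datetime.date."""
    return datetime.strptime(str(value), "%Y%m%d").date()


earliest = parse_yyyymmdd(20200108)  # reported minimum
latest = parse_yyyymmdd(20240420)    # reported maximum
span_days = (latest - earliest).days

print(earliest, latest, span_days)
```

Parsing the column as dates before profiling would let it be summarized as a date variable instead of a real number.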
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
20231124 1061
 
10.6%
20200603 557
 
5.6%
20200411 523
 
5.2%
20211029 457
 
4.6%
20200108 442
 
4.4%
20200304 442
 
4.4%
20200610 418
 
4.2%
20200428 324
 
3.2%
20200808 284
 
2.8%
20240102 277
 
2.8%
Other values (104) 5215
52.1%
ValueCountFrequency (%)
20200108 442
4.4%
20200110 52
 
0.5%
20200114 94
 
0.9%
20200117 99
 
1.0%
20200121 47
 
0.5%
20200129 117
 
1.2%
20200131 11
 
0.1%
20200204 75
 
0.8%
20200206 75
 
0.8%
20200208 46
 
0.5%
ValueCountFrequency (%)
20240420 1
 
< 0.1%
20240411 22
 
0.2%
20240327 28
 
0.3%
20240309 41
 
0.4%
20240302 15
 
0.1%
20240227 64
 
0.6%
20240221 77
 
0.8%
20240208 182
1.8%
20240202 1
 
< 0.1%
20240102 277
2.8%

Interactions

[Figures omitted: 16 pairwise interaction plots]

Correlations

variable | 관리_주_폐쇄말소대장_PK | 주_부속_구분_코드 | 주_부속_일련번호 | 층_구분_코드 | 층_번호 | 구조_코드 | 면적 | 작업_일자
관리_주_폐쇄말소대장_PK | 1.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN
주_부속_구분_코드 | NaN | 1.000 | 0.000 | 0.018 | 0.047 | 0.281 | 0.100 | 0.040
주_부속_일련번호 | NaN | 0.000 | 1.000 | 0.063 | 0.141 | 0.000 | 0.000 | 0.010
층_구분_코드 | NaN | 0.018 | 0.063 | 1.000 | 0.219 | 0.303 | 0.070 | 0.120
층_번호 | NaN | 0.047 | 0.141 | 0.219 | 1.000 | 0.316 | 0.090 | 0.189
구조_코드 | NaN | 0.281 | 0.000 | 0.303 | 0.316 | 1.000 | 0.284 | 0.361
면적 | NaN | 0.100 | 0.000 | 0.070 | 0.090 | 0.284 | 1.000 | 0.097
작업_일자 | NaN | 0.040 | 0.010 | 0.120 | 0.189 | 0.361 | 0.097 | 1.000

variable | 구조_코드 | 주_부속_구분_코드 | 층_구분_코드
구조_코드 | 1.000 | 0.221 | 0.146
주_부속_구분_코드 | 0.221 | 1.000 | 0.030
층_구분_코드 | 0.146 | 0.030 | 1.000

variable | 주_부속_일련번호 | 층_번호 | 면적 | 작업_일자 | 주_부속_구분_코드 | 층_구분_코드 | 구조_코드
주_부속_일련번호 | 1.000 | 0.016 | 0.057 | 0.018 | 0.000 | 0.027 | 0.000
층_번호 | 0.016 | 1.000 | 0.413 | 0.107 | 0.036 | 0.133 | 0.126
면적 | 0.057 | 0.413 | 1.000 | 0.040 | 0.072 | 0.029 | 0.117
작업_일자 | 0.018 | 0.107 | 0.040 | 1.000 | 0.027 | 0.049 | 0.151
주_부속_구분_코드 | 0.000 | 0.036 | 0.072 | 0.027 | 1.000 | 0.030 | 0.221
층_구분_코드 | 0.027 | 0.133 | 0.029 | 0.049 | 0.030 | 1.000 | 0.146
구조_코드 | 0.000 | 0.126 | 0.117 | 0.151 | 0.221 | 0.146 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

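(row index) | 관리_층별_개요_PK | 관리_폐쇄말소대장_PK | 관리_주_폐쇄말소대장_PK | 주_부속_구분_코드 | 주_부속_일련번호 | 층_구분_코드 | 층_번호 | 층_번호_명 | 구조_코드 | 기타_구조 | 주_용도_코드 | 기타_용도 | 면적 | 작업_일자
12188 | 11470-100296372 | 11470-100429679 | <NA> | 주건축물 | 1 | 지상 | 1 | 1층 | 블록구조 | 세멘블록조 | 사무소 | 사무소 | 66.12 | 20200218
32110 | 11305-100214721 | 11305-100169487 | <NA> | 주건축물 | 1 | 지상 | 4 | 4층 | 철근콘크리트구조 | 철근콘크리트구조 | 다세대주택 | 다세대주택(1세대) | 85.41 | 20200814
46444 | 11380-100404316 | 11380-100352096 | <NA> | 주건축물 | 1 | 지상 | 2 | 2층 | 벽돌구조 | 연와조 | 단독주택 | 주택 | 31.98 | 20200411
238 | 11200-100178373 | 11200-100228601 | <NA> | 주건축물 | 1 | 옥탑 | 1 | 옥탑 | 벽돌구조 | 연와조 | 기타제1종근린생활시설 | 계단실(연면적제외) | 28.35 | 20231124
1632 | 11215-52272 | 11215-13903 | <NA> | 주건축물 | 1 | 지상 | 2 | 2층 | 철근콘크리트구조 | 철근콘크리트조 | 기타제1종근린생활시설 | 근린생활시설 | 63.12 | 20240208
29465 | 11680-100964525 | 11680-100975713 | <NA> | 주건축물 | 1 | 지상 | 2 | 2층 | 벽돌구조 | 연와조 | 다세대주택 | 다세대주택(1세대) | 97.15 | 20200610
20050 | 11680-100063516 | 11680-100029944 | <NA> | 주건축물 | 5 | 지상 | 21 | 21층 | 철근콘크리트구조 | 철근콘크리트구조 | 아파트 | 아파트(4세대) | 437.184 | 20200108
34103 | 11620-100331766 | 11620-100291551 | <NA> | 주건축물 | 1 | 지하 | 1 | 지하1 | 벽돌구조 | 연와조 | 단독주택 | 주택 | 49.76 | 20200808
39169 | 11710-27172 | 11710-5634 | <NA> | 주건축물 | 1 | 지하 | 1 | 지1 | 벽돌구조 | 연와조 | 단독주택 | 보이라실 | 0.0 | 20200331
6120 | 11260-100260532 | 11260-100282927 | <NA> | 주건축물 | 1 | 지상 | 4 | 4층 | 벽돌구조 | 연와조 | 사무소 | 제2종근린생활시설(사무소) | 72.44 | 20231124

(row index) | 관리_층별_개요_PK | 관리_폐쇄말소대장_PK | 관리_주_폐쇄말소대장_PK | 주_부속_구분_코드 | 주_부속_일련번호 | 층_구분_코드 | 층_번호 | 층_번호_명 | 구조_코드 | 기타_구조 | 주_용도_코드 | 기타_용도 | 면적 | 작업_일자
56065 | 11380-100404719 | 11380-100352323 | <NA> | 주건축물 | 1 | 지상 | 1 | 1층 | 일반목구조 | 목조 | 단독주택 | 주택 | 2.41 | 20200411
666 | 11110-1000000000000002057331 | 11110-29414 | <NA> | 주건축물 | 1 | 지상 | 1 | 1층 | 벽돌구조 | 벽돌구조 | 단독주택 | 단독주택 | 28.09 | 20240102
52975 | 11290-100266457 | 11290-100271595 | <NA> | 주건축물 | 1 | 지하 | 1 | 지1층 | 철근콘크리트구조 | 철근콘크리트조 | 다세대주택 | 다세대주택(2세대) | 134.95 | 20200411
51373 | 11680-100017121 | 11680-100009194 | <NA> | 주건축물 | 1 | 지상 | 1 | 1층 | 벽돌구조 | 연와조 | 다가구주택 | 다가구주택(1가구) | 89.25 | 20200411
52127 | 11740-100040954 | 11740-100020788 | <NA> | 주건축물 | 0 | 지하 | 1 | 지층 | 철근콘크리트구조 | 철근콘크리트조 | 제조업소 | 펌프실 | 7.35 | 20200317
62667 | 11650-100600101 | 11650-100789063 | <NA> | 주건축물 | 1 | 지상 | 1 | 1층 | 벽돌구조 | 연와조 | 단독주택 | 주택 | 81.32 | 20211029
55098 | 11380-100404019 | 11380-100351933 | <NA> | 주건축물 | 1 | 지상 | 1 | 1층 | 블록구조 | 세멘브럭조 | 단독주택 | 주택 | 37.19 | 20200411
55773 | 11560-100481120 | 11560-100722947 | <NA> | 주건축물 | 3 | 지상 | 1 | 1층 | 철근콘크리트구조 | 철근콘크리트평옥개 | 소매점 | 근린생활시설(소매점) | 567.34 | 20200304
30082 | 11680-100963599 | 11680-100974365 | <NA> | 주건축물 | 1 | 지상 | 1 | 1층 | 철근콘크리트구조 | 철근콘크리트조 | 기타제1종근린생활시설 | 주차장 | 23.36 | 20200610
26106 | 11305-100017907 | 11305-100010752 | <NA> | 주건축물 | 0 | 지하 | 1 | 지1 | 벽돌구조 | 연와조 | 단독주택 | 주택 | 8.07 | 20200630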