Overview

Dataset statistics

Number of variables: 12
Number of observations: 10000
Missing cells: 1707
Missing cells (%): 1.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.1 MiB
Average record size in memory: 111.0 B

Variable types

Text: 5
Categorical: 2
Numeric: 5

Dataset

Description: 관리_전유_공용_면적_pk,호별명세_pk,평형_구분_명,전유_공용_구분_코드,주_부속_구분_코드,층_구분_코드,층_번호,구조_코드,주_용도_코드,기타_용도,면적
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15391/S/1/datasetView.do

Alerts

층_구분_코드 is highly overall correlated with 층_번호 and 1 other field (High correlation)
층_번호 is highly overall correlated with 층_구분_코드 (High correlation)
전유_공용_구분_코드 is highly overall correlated with 층_구분_코드 (High correlation)
주_부속_구분_코드 is highly imbalanced (91.6%) (Imbalance)
주_용도_코드 has 208 (2.1%) missing values (Missing)
기타_용도 has 1438 (14.4%) missing values (Missing)
층_번호 is highly skewed (γ1 = 23.37351032) (Skewed)
면적 is highly skewed (γ1 = 50.62897015) (Skewed)
관리_전유_공용_면적_pk has unique values (Unique)
층_번호 has 5348 (53.5%) zeros (Zeros)

Reproduction

Analysis started: 2024-05-03 20:06:07.682373
Analysis finished: 2024-05-03 20:06:23.235780
Duration: 15.55 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
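
The reported duration can be checked from the two timestamps. The sketch below only assumes that both values share the format shown in this block; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block of this report.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2024-05-03 20:06:07.682373", FMT)
finished = datetime.strptime("2024-05-03 20:06:23.235780", FMT)

# Difference between the two timestamps, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```

Rounded to two decimals, the difference agrees with the 15.55 seconds listed above.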

Variables

관리_전유_공용_면적_pk
Text

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 15
Mean length: 14.9564
Min length: 9

Characters and Unicode

Total characters: 149564
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10000
Unique (%): 100.0%

Sample

1st row: 11500-100216589
2nd row: 11545-100124144
3rd row: 11260-100087687
4th row: 11710-100209980
5th row: 11200-100048533

Value | Count | Frequency (%)
11500-100216589 | 1 | < 0.1%
11545-100123450 | 1 | < 0.1%
11530-100084922 | 1 | < 0.1%
11545-100118252 | 1 | < 0.1%
11545-100185188 | 1 | < 0.1%
11500-100219002 | 1 | < 0.1%
11500-100221207 | 1 | < 0.1%
11290-100057760 | 1 | < 0.1%
11230-100070407 | 1 | < 0.1%
11545-100162783 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%
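
The sample keys all look like two runs of digits joined by a single hyphen, which fits the character counts for this variable (11 distinct characters, 10000 hyphens across 10000 values). A minimal sketch of splitting such a key, assuming that pattern holds for every row; `PK_PATTERN` and `split_pk` are hypothetical names, not part of the dataset or the profiler.

```python
import re
from typing import Tuple

# Assumed pattern (inferred from the five sample rows only): digits, one hyphen, digits.
PK_PATTERN = re.compile(r"^(\d+)-(\d+)$")


def split_pk(pk: str) -> Tuple[str, str]:
    """Split a key such as '11500-100216589' into (prefix, serial)."""
    match = PK_PATTERN.match(pk)
    if match is None:
        raise ValueError(f"unexpected key format: {pk!r}")
    return match.group(1), match.group(2)


# The five sample rows listed in this report.
samples = [
    "11500-100216589",
    "11545-100124144",
    "11260-100087687",
    "11710-100209980",
    "11200-100048533",
]
prefixes = sorted({split_pk(s)[0] for s in samples})
print(prefixes)
```

Raising on a mismatch, rather than returning a partial result, makes rows that break the assumed pattern visible early.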

Most occurring characters

Value | Count | Frequency (%)
1 | 40111 | 26.8%
0 | 39557 | 26.4%
5 | 12113 | 8.1%
- | 10000 | 6.7%
2 | 9078 | 6.1%
4 | 7765 | 5.2%
6 | 7250 | 4.8%
7 | 6029 | 4.0%
8 | 5987 | 4.0%
3 | 5891 | 3.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 139564 | 93.3%
Dash Punctuation | 10000 | 6.7%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 40111 | 28.7%
0 | 39557 | 28.3%
5 | 12113 | 8.7%
2 | 9078 | 6.5%
4 | 7765 | 5.6%
6 | 7250 | 5.2%
7 | 6029 | 4.3%
8 | 5987 | 4.3%
3 | 5891 | 4.2%
9 | 5783 | 4.1%

Dash Punctuation
Value | Count | Frequency (%)
- | 10000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 149564 | 100.0%

Most frequent character per script

Common: same values as the "Most occurring characters" table above.

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 149564 | 100.0%

Most frequent character per block

ASCII: same values as the "Most occurring characters" table above.

호별명세_pk
Text

Distinct: 1714
Distinct (%): 17.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 15
Mean length: 14.8228
Min length: 8

Characters and Unicode

Total characters: 148228
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 519
Unique (%): 5.2%

Sample

1st row: 11500-100100041
2nd row: 11545-100053518
3rd row: 11260-100038618
4th row: 11710-100067608
5th row: 11200-100052190

Value | Count | Frequency (%)
11545-100053518 | 695 | 7.0%
11545-100065917 | 264 | 2.6%
11200-100035965 | 189 | 1.9%
11545-100064797 | 184 | 1.8%
11170-100061031 | 178 | 1.8%
11530-100079506 | 174 | 1.7%
11500-100080633 | 155 | 1.6%
11560-100066989 | 117 | 1.2%
11620-100079215 | 109 | 1.1%
11545-100054977 | 104 | 1.0%
Other values (1704) | 7831 | 78.3%

Most occurring characters

Value | Count | Frequency (%)
0 | 41743 | 28.2%
1 | 37220 | 25.1%
5 | 12640 | 8.5%
- | 10000 | 6.7%
6 | 8354 | 5.6%
4 | 6910 | 4.7%
2 | 6838 | 4.6%
7 | 6773 | 4.6%
3 | 6381 | 4.3%
8 | 5817 | 3.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 138228 | 93.3%
Dash Punctuation | 10000 | 6.7%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 41743 | 30.2%
1 | 37220 | 26.9%
5 | 12640 | 9.1%
6 | 8354 | 6.0%
4 | 6910 | 5.0%
2 | 6838 | 4.9%
7 | 6773 | 4.9%
3 | 6381 | 4.6%
8 | 5817 | 4.2%
9 | 5552 | 4.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 10000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 148228 | 100.0%

Most frequent character per script

Common: same values as the "Most occurring characters" table above.

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 148228 | 100.0%

Most frequent character per block

ASCII: same values as the "Most occurring characters" table above.

평형_구분_명
Text

Distinct: 3998
Distinct (%): 40.0%
Missing: 3
Missing (%): < 0.1%
Memory size: 156.2 KiB

Length

Max length: 31
Median length: 20
Mean length: 3.90007
Min length: 1

Characters and Unicode

Total characters: 38989
Distinct characters: 164
Distinct categories: 12
Distinct scripts: 3
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2700
Unique (%): 27.0%

Sample

1st row: 202
2nd row: 1429
3rd row: 주7-17B
4th row: 1804
5th row: 8I

Value | Count | Frequency (%)
b | 169 | 1.6%
a | 159 | 1.5%
c | 142 | 1.4%
202 | 118 | 1.1%
d | 116 | 1.1%
302 | 116 | 1.1%
201 | 114 | 1.1%
301 | 107 | 1.0%
e | 103 | 1.0%
501 | 100 | 1.0%
Other values (3801) | 9089 | 88.0%

Most occurring characters

Value | Count | Frequency (%)
1 | 6268 | 16.1%
0 | 6021 | 15.4%
2 | 3980 | 10.2%
3 | 2981 | 7.6%
4 | 2333 | 6.0%
5 | 1843 | 4.7%
6 | 1632 | 4.2%
7 | 1313 | 3.4%
(not preserved) | 1184 | 3.0%
. | 1169 | 3.0%
Other values (154) | 10265 | 26.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 28619 | 73.4%
Other Letter | 3910 | 10.0%
Uppercase Letter | 3018 | 7.7%
Other Punctuation | 1183 | 3.0%
Dash Punctuation | 936 | 2.4%
Lowercase Letter | 599 | 1.5%
Space Separator | 336 | 0.9%
Open Punctuation | 167 | 0.4%
Close Punctuation | 167 | 0.4%
Math Symbol | 26 | 0.1%
Other values (2) | 28 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 1184 | 30.3%
(not preserved) | 385 | 9.8%
(not preserved) | 359 | 9.2%
(not preserved) | 304 | 7.8%
(not preserved) | 153 | 3.9%
(not preserved) | 144 | 3.7%
(not preserved) | 128 | 3.3%
(not preserved) | 127 | 3.2%
(not preserved) | 92 | 2.4%
(not preserved) | 69 | 1.8%
Other values (81) | 965 | 24.7%

Uppercase Letter
Value | Count | Frequency (%)
B | 806 | 26.7%
A | 615 | 20.4%
C | 266 | 8.8%
E | 209 | 6.9%
F | 208 | 6.9%
D | 161 | 5.3%
T | 119 | 3.9%
P | 98 | 3.2%
O | 80 | 2.7%
Y | 75 | 2.5%
Other values (15) | 381 | 12.6%

Lowercase Letter
Value | Count | Frequency (%)
a | 102 | 17.0%
e | 79 | 13.2%
p | 65 | 10.9%
y | 61 | 10.2%
b | 59 | 9.8%
t | 53 | 8.8%
d | 35 | 5.8%
c | 30 | 5.0%
f | 21 | 3.5%
g | 13 | 2.2%
Other values (14) | 81 | 13.5%

Decimal Number
Value | Count | Frequency (%)
1 | 6268 | 21.9%
0 | 6021 | 21.0%
2 | 3980 | 13.9%
3 | 2981 | 10.4%
4 | 2333 | 8.2%
5 | 1843 | 6.4%
6 | 1632 | 5.7%
7 | 1313 | 4.6%
8 | 1154 | 4.0%
9 | 1094 | 3.8%

Other Punctuation
Value | Count | Frequency (%)
. | 1169 | 98.8%
, | 9 | 0.8%
' | 4 | 0.3%
/ | 1 | 0.1%

Math Symbol
Value | Count | Frequency (%)
~ | 19 | 73.1%
= | 6 | 23.1%
+ | 1 | 3.8%

Other Symbol
Value | Count | Frequency (%)
(not preserved) | 2 | 50.0%
(not preserved) | 2 | 50.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 936 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 336 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 167 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 167 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 24 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 31462 | 80.7%
Hangul | 3910 | 10.0%
Latin | 3617 | 9.3%

Most frequent character per script

Hangul: same values as the "Other Letter" table under "Most frequent character per category" above.

Latin
Value | Count | Frequency (%)
B | 806 | 22.3%
A | 615 | 17.0%
C | 266 | 7.4%
E | 209 | 5.8%
F | 208 | 5.8%
D | 161 | 4.5%
T | 119 | 3.3%
a | 102 | 2.8%
P | 98 | 2.7%
O | 80 | 2.2%
Other values (39) | 953 | 26.3%

Common
Value | Count | Frequency (%)
1 | 6268 | 19.9%
0 | 6021 | 19.1%
2 | 3980 | 12.7%
3 | 2981 | 9.5%
4 | 2333 | 7.4%
5 | 1843 | 5.9%
6 | 1632 | 5.2%
7 | 1313 | 4.2%
. | 1169 | 3.7%
8 | 1154 | 3.7%
Other values (14) | 2768 | 8.8%

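Most occurring blocks

Value | Count | Frequency (%)
ASCII | 35075 | 90.0%
Hangul | 3879 | 9.9%
Compat Jamo | 31 | 0.1%
Enclosed Alphanum | 2 | < 0.1%
CJK Compat | 2 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 6268 | 17.9%
0 | 6021 | 17.2%
2 | 3980 | 11.3%
3 | 2981 | 8.5%
4 | 2333 | 6.7%
5 | 1843 | 5.3%
6 | 1632 | 4.7%
7 | 1313 | 3.7%
. | 1169 | 3.3%
8 | 1154 | 3.3%
Other values (61) | 6381 | 18.2%

Hangul
Value | Count | Frequency (%)
(not preserved) | 1184 | 30.5%
(not preserved) | 385 | 9.9%
(not preserved) | 359 | 9.3%
(not preserved) | 304 | 7.8%
(not preserved) | 153 | 3.9%
(not preserved) | 144 | 3.7%
(not preserved) | 128 | 3.3%
(not preserved) | 127 | 3.3%
(not preserved) | 92 | 2.4%
(not preserved) | 69 | 1.8%
Other values (73) | 934 | 24.1%

Compat Jamo
Value | Count | Frequency (%)
(not preserved) | 17 | 54.8%
(not preserved) | 3 | 9.7%
(not preserved) | 3 | 9.7%
(not preserved) | 2 | 6.5%
(not preserved) | 2 | 6.5%
(not preserved) | 2 | 6.5%
(not preserved) | 1 | 3.2%
(not preserved) | 1 | 3.2%

Enclosed Alphanum
Value | Count | Frequency (%)
(not preserved) | 2 | 100.0%

CJK Compat
Value | Count | Frequency (%)
(not preserved) | 2 | 100.0%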

전유_공용_구분_코드
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 1
Mean length: 1.0003
Min length: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 1
2nd row: 2
3rd row: 2
4th row: 1
5th row: 2

Common Values

Value | Count | Frequency (%)
2 | 6976 | 69.8%
1 | 3023 | 30.2%
<NA> | 1 | < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

주_부속_구분_코드
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9895 | 99.0%
1 | 105 | 1.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]
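
The 91.6% imbalance figure for this variable can be reproduced from the two class counts if the score is taken to be one minus the normalized Shannon entropy. That definition is an assumption here, chosen because it matches the reported number; `imbalance_score` is an illustrative name.

```python
import math


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum possible entropy) for class counts."""
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    # Maximum entropy is log2(number of classes): all classes equally likely.
    return 1.0 - entropy / math.log2(len(counts))


# Class counts from the Common Values table: value 0 and value 1.
score = imbalance_score([9895, 105])
print(f"{score:.1%}")
```

A perfectly balanced column scores 0 under this definition, and the score approaches 1 as one class dominates.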

층_구분_코드
Real number (ℝ)

HIGH CORRELATION 

Distinct: 6
Distinct (%): 0.1%
Missing: 49
Missing (%): 0.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.648176
Minimum: 0
Maximum: 40
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 10
Q1: 20
median: 20
Q3: 40
95-th percentile: 40
Maximum: 40
Range: 40
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 11.617943
Coefficient of variation (CV): 0.43597518
Kurtosis: -1.5856224
Mean: 26.648176
Median Absolute Deviation (MAD): 10
Skewness: 0.084141562
Sum: 265176
Variance: 134.97661
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=6)]

Common values

Value | Count | Frequency (%)
20 | 4254 | 42.5%
40 | 4071 | 40.7%
10 | 1538 | 15.4%
22 | 49 | 0.5%
21 | 38 | 0.4%
0 | 1 | < 0.1%
(Missing) | 49 | 0.5%

Smallest values

Value | Count | Frequency (%)
0 | 1 | < 0.1%
10 | 1538 | 15.4%
20 | 4254 | 42.5%
21 | 38 | 0.4%
22 | 49 | 0.5%
40 | 4071 | 40.7%

Largest values

Value | Count | Frequency (%)
40 | 4071 | 40.7%
22 | 49 | 0.5%
21 | 38 | 0.4%
20 | 4254 | 42.5%
10 | 1538 | 15.4%
0 | 1 | < 0.1%
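
Because this variable takes only six distinct values, its summary statistics can be recomputed from the frequency table alone. The sketch below assumes the variance is the sample variance (n - 1 in the denominator), which is the convention that matches the reported figure.

```python
import math

# Value -> count, copied from the frequency table; the 49 missing rows are excluded.
freq = {20: 4254, 40: 4071, 10: 1538, 22: 49, 21: 38, 0: 1}

n = sum(freq.values())                         # non-missing observations
total = sum(v * c for v, c in freq.items())    # compare with "Sum"
mean = total / n                               # compare with "Mean"
# Sample variance: squared deviations weighted by count, divided by n - 1.
variance = sum(c * (v - mean) ** 2 for v, c in freq.items()) / (n - 1)
std = math.sqrt(variance)

print(n, total, round(mean, 6), round(variance, 5), round(std, 6))
```

The non-missing count comes out at 9951, that is, 10000 observations minus the 49 missing values.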

층_번호
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 31
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.4351
Minimum: 0
Maximum: 503
Zeros: 5348
Zeros (%): 53.5%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 2
95-th percentile: 9
Maximum: 503
Range: 503
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 16.160702
Coefficient of variation (CV): 6.6365663
Kurtosis: 592.55881
Mean: 2.4351
Median Absolute Deviation (MAD): 0
Skewness: 23.37351
Sum: 24351
Variance: 261.1683
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=31)]

Common values

Value | Count | Frequency (%)
0 | 5348 | 53.5%
1 | 1543 | 15.4%
2 | 723 | 7.2%
3 | 693 | 6.9%
4 | 414 | 4.1%
5 | 331 | 3.3%
6 | 232 | 2.3%
7 | 120 | 1.2%
8 | 91 | 0.9%
9 | 80 | 0.8%
Other values (21) | 425 | 4.2%

Smallest values

Value | Count | Frequency (%)
0 | 5348 | 53.5%
1 | 1543 | 15.4%
2 | 723 | 7.2%
3 | 693 | 6.9%
4 | 414 | 4.1%
5 | 331 | 3.3%
6 | 232 | 2.3%
7 | 120 | 1.2%
8 | 91 | 0.9%
9 | 80 | 0.8%

Largest values

Value | Count | Frequency (%)
503 | 1 | < 0.1%
502 | 1 | < 0.1%
501 | 1 | < 0.1%
402 | 1 | < 0.1%
401 | 5 | 0.1%
303 | 1 | < 0.1%
302 | 4 | < 0.1%
301 | 3 | < 0.1%
201 | 2 | < 0.1%
101 | 1 | < 0.1%
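
The largest values (101 to 503) sit far above the bulk of this distribution, which is consistent with the skewness alert. One common way to flag such values is Tukey's 1.5 × IQR rule. The sketch below applies it to the reported quartiles; this rule is not something the profiling report itself uses, and `tukey_fences` is an illustrative name.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return the (lower, upper) fences q1 - k*IQR and q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Q1 and Q3 as reported under "Quantile statistics".
lower, upper = tukey_fences(q1=0, q3=2)

# The ten largest values listed in this section.
largest_values = [503, 502, 501, 402, 401, 303, 302, 301, 201, 101]
flagged = [v for v in largest_values if v > upper]
print(lower, upper, len(flagged))
```

With more than half of the values equal to zero, the IQR is only 2, so the upper fence is low and any use of this rule on this column should be checked against domain knowledge.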

구조_코드
Real number (ℝ)

Distinct: 8
Distinct (%): 0.1%
Missing: 9
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 22.235712
Minimum: 11
Maximum: 43
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 11
5-th percentile: 21
Q1: 21
median: 21
Q3: 21
95-th percentile: 41
Maximum: 43
Range: 32
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 4.8982019
Coefficient of variation (CV): 0.22028536
Kurtosis: 12.185301
Mean: 22.235712
Median Absolute Deviation (MAD): 0
Skewness: 3.7268482
Sum: 222157
Variance: 23.992382
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=8)]

Common values

Value | Count | Frequency (%)
21 | 9336 | 93.4%
42 | 307 | 3.1%
43 | 146 | 1.5%
41 | 98 | 1.0%
31 | 67 | 0.7%
22 | 23 | 0.2%
11 | 8 | 0.1%
40 | 6 | 0.1%
(Missing) | 9 | 0.1%

Smallest values

Value | Count | Frequency (%)
11 | 8 | 0.1%
21 | 9336 | 93.4%
22 | 23 | 0.2%
31 | 67 | 0.7%
40 | 6 | 0.1%
41 | 98 | 1.0%
42 | 307 | 3.1%
43 | 146 | 1.5%

Largest values

Value | Count | Frequency (%)
43 | 146 | 1.5%
42 | 307 | 3.1%
41 | 98 | 1.0%
40 | 6 | 0.1%
31 | 67 | 0.7%
22 | 23 | 0.2%
21 | 9336 | 93.4%
11 | 8 | 0.1%

주_용도_코드
Text

MISSING 

Distinct: 73
Distinct (%): 0.7%
Missing: 208
Missing (%): 2.1%
Memory size: 156.2 KiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 48960
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 19
Unique (%): 0.2%

Sample

1st row: 02003
2nd row: 17999
3rd row: 02001
4th row: 14202
5th row: 17999

Value | Count | Frequency (%)
02003 | 3167 | 32.3%
14202 | 2167 | 22.1%
17999 | 1816 | 18.5%
02001 | 488 | 5.0%
03001 | 292 | 3.0%
04402 | 242 | 2.5%
04001 | 233 | 2.4%
02007 | 193 | 2.0%
02002 | 142 | 1.5%
03999 | 119 | 1.2%
Other values (63) | 933 | 9.5%

Most occurring characters

Value | Count | Frequency (%)
0 | 18046 | 36.9%
2 | 9203 | 18.8%
9 | 6487 | 13.2%
1 | 5515 | 11.3%
3 | 3869 | 7.9%
4 | 3446 | 7.0%
7 | 2183 | 4.5%
5 | 177 | 0.4%
6 | 26 | 0.1%
8 | 6 | < 0.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 48958 | > 99.9%
Uppercase Letter | 2 | < 0.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18046 | 36.9%
2 | 9203 | 18.8%
9 | 6487 | 13.3%
1 | 5515 | 11.3%
3 | 3869 | 7.9%
4 | 3446 | 7.0%
7 | 2183 | 4.5%
5 | 177 | 0.4%
6 | 26 | 0.1%
8 | 6 | < 0.1%

Uppercase Letter
Value | Count | Frequency (%)
Z | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 48958 | > 99.9%
Latin | 2 | < 0.1%

Most frequent character per script

Common: same values as the "Decimal Number" table under "Most frequent character per category" above.

Latin
Value | Count | Frequency (%)
Z | 2 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 48960 | 100.0%

Most frequent character per block

ASCII: same values as the "Most occurring characters" table above.

기타_용도
Text

MISSING 

Distinct: 989
Distinct (%): 11.6%
Missing: 1438
Missing (%): 14.4%
Memory size: 156.2 KiB

Length

Max length: 84
Median length: 55
Mean length: 7.8635833
Min length: 1

Characters and Unicode

Total characters: 67328
Distinct characters: 262
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 415
Unique (%): 4.8%

Sample

1st row: 도시형생활주택
2nd row: 복도,ELV 외
3rd row: 주차장
4th row: 주차장
5th row: 주차장

Value | Count | Frequency (%)
계단실 | 1195 | 12.5%
주차장 | 1071 | 11.2%
기계실 | 382 | 4.0%
(not preserved) | 354 | 3.7%
공장(지식산업센터 | 219 | 2.3%
벽체공용 | 199 | 2.1%
복도,elv | 159 | 1.7%
지하주차장 | 159 | 1.7%
기계식주차장 | 151 | 1.6%
도시형생활주택(단지형다세대 | 139 | 1.5%
Other values (898) | 5556 | 58.0%

Most occurring characters

Value | Count | Frequency (%)
(not preserved) | 6345 | 9.4%
, | 4602 | 6.8%
(not preserved) | 3924 | 5.8%
(not preserved) | 3019 | 4.5%
(not preserved) | 2943 | 4.4%
(not preserved) | 2906 | 4.3%
(not preserved) | 2512 | 3.7%
( | 1825 | 2.7%
) | 1825 | 2.7%
(not preserved) | 1758 | 2.6%
Other values (252) | 35669 | 53.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 53820 | 79.9%
Other Punctuation | 5040 | 7.5%
Uppercase Letter | 2165 | 3.2%
Open Punctuation | 1833 | 2.7%
Close Punctuation | 1833 | 2.7%
Decimal Number | 1173 | 1.7%
Space Separator | 1022 | 1.5%
Math Symbol | 220 | 0.3%
Dash Punctuation | 186 | 0.3%
Lowercase Letter | 36 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 6345 | 11.8%
(not preserved) | 3924 | 7.3%
(not preserved) | 3019 | 5.6%
(not preserved) | 2943 | 5.5%
(not preserved) | 2906 | 5.4%
(not preserved) | 2512 | 4.7%
(not preserved) | 1758 | 3.3%
(not preserved) | 1733 | 3.2%
(not preserved) | 1718 | 3.2%
(not preserved) | 1409 | 2.6%
Other values (201) | 25553 | 47.5%

Uppercase Letter
Value | Count | Frequency (%)
E | 568 | 26.2%
V | 371 | 17.1%
L | 363 | 16.8%
D | 267 | 12.3%
F | 267 | 12.3%
M | 267 | 12.3%
A | 12 | 0.6%
S | 11 | 0.5%
C | 11 | 0.5%
H | 10 | 0.5%
Other values (10) | 18 | 0.8%

Decimal Number
Value | Count | Frequency (%)
1 | 448 | 38.2%
2 | 187 | 15.9%
3 | 183 | 15.6%
4 | 170 | 14.5%
5 | 81 | 6.9%
7 | 44 | 3.8%
6 | 32 | 2.7%
0 | 25 | 2.1%
8 | 2 | 0.2%
9 | 1 | 0.1%

Lowercase Letter
Value | Count | Frequency (%)
f | 8 | 22.2%
m | 8 | 22.2%
d | 8 | 22.2%
e | 6 | 16.7%
v | 3 | 8.3%
l | 3 | 8.3%

Other Punctuation
Value | Count | Frequency (%)
, | 4602 | 91.3%
/ | 277 | 5.5%
. | 145 | 2.9%
; | 8 | 0.2%
: | 8 | 0.2%

Open Punctuation
Value | Count | Frequency (%)
( | 1825 | 99.6%
[ | 7 | 0.4%
{ | 1 | 0.1%

Close Punctuation
Value | Count | Frequency (%)
) | 1825 | 99.6%
] | 7 | 0.4%
} | 1 | 0.1%

Math Symbol
Value | Count | Frequency (%)
~ | 215 | 97.7%
= | 5 | 2.3%

Space Separator
Value | Count | Frequency (%)
(space) | 1022 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 186 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 53820 | 79.9%
Common | 11307 | 16.8%
Latin | 2201 | 3.3%

Most frequent character per script

Hangul: same values as the "Other Letter" table under "Most frequent character per category" above.

Latin
Value | Count | Frequency (%)
E | 568 | 25.8%
V | 371 | 16.9%
L | 363 | 16.5%
D | 267 | 12.1%
F | 267 | 12.1%
M | 267 | 12.1%
A | 12 | 0.5%
S | 11 | 0.5%
C | 11 | 0.5%
H | 10 | 0.5%
Other values (16) | 54 | 2.5%

Common
Value | Count | Frequency (%)
, | 4602 | 40.7%
( | 1825 | 16.1%
) | 1825 | 16.1%
(space) | 1022 | 9.0%
1 | 448 | 4.0%
/ | 277 | 2.4%
~ | 215 | 1.9%
2 | 187 | 1.7%
- | 186 | 1.6%
3 | 183 | 1.6%
Other values (15) | 537 | 4.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 53820 | 79.9%
ASCII | 13508 | 20.1%

Most frequent character per block

Hangul: same values as the "Other Letter" table under "Most frequent character per category" above.

ASCII
Value | Count | Frequency (%)
, | 4602 | 34.1%
( | 1825 | 13.5%
) | 1825 | 13.5%
(space) | 1022 | 7.6%
E | 568 | 4.2%
1 | 448 | 3.3%
V | 371 | 2.7%
L | 363 | 2.7%
/ | 277 | 2.1%
D | 267 | 2.0%
Other values (41) | 1940 | 14.4%

면적
Real number (ℝ)

SKEWED 

Distinct: 4420
Distinct (%): 44.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31.632823
Minimum: 0
Maximum: 17686.89
Zeros: 2
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.52
Q1: 3.29
median: 11.59
Q3: 28.8325
95-th percentile: 75.64
Maximum: 17686.89
Range: 17686.89
Interquartile range (IQR): 25.5425

Descriptive statistics

Standard deviation: 270.19672
Coefficient of variation (CV): 8.541657
Kurtosis: 2979.1702
Mean: 31.632823
Median Absolute Deviation (MAD): 9.97
Skewness: 50.62897
Sum: 316328.23
Variance: 73006.269
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
20.6213 | 59 | 0.6%
0.66 | 57 | 0.6%
1.56 | 48 | 0.5%
11.59 | 44 | 0.4%
7.1233 | 41 | 0.4%
2.4424 | 37 | 0.4%
27.32 | 35 | 0.4%
16.78 | 35 | 0.4%
48.6 | 34 | 0.3%
14.63 | 34 | 0.3%
Other values (4410) | 9576 | 95.8%

Smallest values

Value | Count | Frequency (%)
0.0 | 2 | < 0.1%
0.002 | 1 | < 0.1%
0.01 | 1 | < 0.1%
0.013 | 2 | < 0.1%
0.02 | 7 | 0.1%
0.03 | 3 | < 0.1%
0.04 | 6 | 0.1%
0.048 | 1 | < 0.1%
0.049 | 1 | < 0.1%
0.05 | 3 | < 0.1%

Largest values

Value | Count | Frequency (%)
17686.89 | 1 | < 0.1%
15415.76 | 1 | < 0.1%
8202.8 | 1 | < 0.1%
5664.41 | 1 | < 0.1%
2452.13 | 1 | < 0.1%
1962.182 | 1 | < 0.1%
1838.94 | 2 | < 0.1%
1796.11 | 1 | < 0.1%
1762.63 | 2 | < 0.1%
1759.44 | 1 | < 0.1%
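
Several of the statistics reported for this variable are derived from others, so they can be cross-checked with simple arithmetic. The sketch below uses only the figures printed in this section and allows for rounding in the report; the variable names are illustrative.

```python
import math

# Figures copied from the quantile and descriptive statistics of this variable.
mean, std = 31.632823, 270.19672
q1, q3 = 3.29, 28.8325
minimum, maximum = 0.0, 17686.89

cv = std / mean             # coefficient of variation
variance = std ** 2         # variance is the squared standard deviation
iqr = q3 - q1               # interquartile range
value_range = maximum - minimum

print(round(cv, 6), round(variance, 3), round(iqr, 4), value_range)
```

A standard deviation more than eight times the mean, together with a maximum far above the 95-th percentile of 75.64, is consistent with the skewness alert for this variable.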

작업_일자
Real number (ℝ)

Distinct: 284
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20208653
Minimum: 20200101
Maximum: 20240503
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20200101
5-th percentile: 20200218
Q1: 20200804
median: 20210930
Q3: 20211124
95-th percentile: 20220405
Maximum: 20240503
Range: 40402
Interquartile range (IQR): 10320

Descriptive statistics

Standard deviation: 7979.3605
Coefficient of variation (CV): 0.00039484871
Kurtosis: 0.53159523
Mean: 20208653
Median Absolute Deviation (MAD): 9393
Skewness: 0.73495431
Sum: 2.0208653 × 10^11
Variance: 63670193
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
20211029 | 1833 | 18.3%
20201111 | 389 | 3.9%
20200218 | 374 | 3.7%
20200219 | 365 | 3.6%
20220405 | 361 | 3.6%
20211103 | 245 | 2.5%
20220204 | 223 | 2.2%
20201103 | 208 | 2.1%
20211201 | 177 | 1.8%
20200808 | 154 | 1.5%
Other values (274) | 5671 | 56.7%

Smallest values

Value | Count | Frequency (%)
20200101 | 30 | 0.3%
20200107 | 13 | 0.1%
20200108 | 45 | 0.4%
20200109 | 21 | 0.2%
20200110 | 18 | 0.2%
20200114 | 9 | 0.1%
20200115 | 9 | 0.1%
20200116 | 10 | 0.1%
20200117 | 11 | 0.1%
20200121 | 1 | < 0.1%

Largest values

Value | Count | Frequency (%)
20240503 | 74 | 0.7%
20240417 | 1 | < 0.1%
20240330 | 3 | < 0.1%
20240327 | 4 | < 0.1%
20240227 | 1 | < 0.1%
20240208 | 1 | < 0.1%
20230929 | 2 | < 0.1%
20230831 | 5 | 0.1%
20230808 | 1 | < 0.1%
20230719 | 3 | < 0.1%
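
The values of this variable look like dates written as YYYYMMDD integers, yet it is profiled as a real number, so figures such as the range of 40402 are integer differences rather than spans of time. A minimal sketch of converting the reported minimum and maximum, assuming the YYYYMMDD reading is correct; `parse_yyyymmdd` is a hypothetical helper.

```python
from datetime import date, datetime


def parse_yyyymmdd(value: int) -> date:
    """Interpret an integer such as 20200101 as the date 2020-01-01."""
    return datetime.strptime(str(value), "%Y%m%d").date()


# Minimum and maximum as reported for this variable.
earliest = parse_yyyymmdd(20200101)
latest = parse_yyyymmdd(20240503)
span_days = (latest - earliest).days
print(earliest.isoformat(), latest.isoformat(), span_days)
```

Parsing the column as dates before profiling would likely give more meaningful summary statistics than the numeric ones shown here.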

Interactions

[Figures: 25 interaction plots]

Correlations

 | 전유_공용_구분_코드 | 주_부속_구분_코드 | 층_구분_코드 | 층_번호 | 구조_코드 | 주_용도_코드 | 면적 | 작업_일자
전유_공용_구분_코드 | 1.000 | 0.103 | 0.815 | 0.030 | 0.125 | 0.222 | 0.048 | 0.057
주_부속_구분_코드 | 0.103 | 1.000 | 0.065 | 0.000 | 0.290 | 0.531 | 0.000 | 0.121
층_구분_코드 | 0.815 | 0.065 | 1.000 | 0.024 | 0.141 | 0.320 | 0.020 | 0.080
층_번호 | 0.030 | 0.000 | 0.024 | 1.000 | 0.000 | 0.439 | 0.000 | 0.000
구조_코드 | 0.125 | 0.290 | 0.141 | 0.000 | 1.000 | 0.508 | 0.098 | 0.171
주_용도_코드 | 0.222 | 0.531 | 0.320 | 0.439 | 0.508 | 1.000 | 0.829 | 0.391
면적 | 0.048 | 0.000 | 0.020 | 0.000 | 0.098 | 0.829 | 1.000 | 0.033
작업_일자 | 0.057 | 0.121 | 0.080 | 0.000 | 0.171 | 0.391 | 0.033 | 1.000

 | 전유_공용_구분_코드 | 주_부속_구분_코드
전유_공용_구분_코드 | 1.000 | 0.066
주_부속_구분_코드 | 0.066 | 1.000

 | 층_구분_코드 | 층_번호 | 구조_코드 | 면적 | 작업_일자 | 전유_공용_구분_코드 | 주_부속_구분_코드
층_구분_코드 | 1.000 | -0.629 | 0.098 | 0.028 | 0.099 | 0.608 | 0.042
층_번호 | -0.629 | 1.000 | -0.073 | 0.043 | -0.081 | 0.032 | 0.000
구조_코드 | 0.098 | -0.073 | 1.000 | 0.004 | 0.074 | 0.083 | 0.193
면적 | 0.028 | 0.043 | 0.004 | 1.000 | 0.044 | 0.035 | 0.000
작업_일자 | 0.099 | -0.081 | 0.074 | 0.044 | 1.000 | 0.041 | 0.087
전유_공용_구분_코드 | 0.608 | 0.032 | 0.083 | 0.035 | 0.041 | 1.000 | 0.066
주_부속_구분_코드 | 0.042 | 0.000 | 0.193 | 0.000 | 0.087 | 0.066 | 1.000
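
To read the first (8 × 8) matrix above more easily, the strongly associated pairs can be listed programmatically. The sketch below scans the upper triangle for coefficients at or above a cut-off; the 0.5 threshold is an assumption made for illustration and is not stated in this report.

```python
# Variable names and coefficients copied from the first (8 x 8) correlation matrix.
names = [
    "전유_공용_구분_코드", "주_부속_구분_코드", "층_구분_코드", "층_번호",
    "구조_코드", "주_용도_코드", "면적", "작업_일자",
]
matrix = [
    [1.000, 0.103, 0.815, 0.030, 0.125, 0.222, 0.048, 0.057],
    [0.103, 1.000, 0.065, 0.000, 0.290, 0.531, 0.000, 0.121],
    [0.815, 0.065, 1.000, 0.024, 0.141, 0.320, 0.020, 0.080],
    [0.030, 0.000, 0.024, 1.000, 0.000, 0.439, 0.000, 0.000],
    [0.125, 0.290, 0.141, 0.000, 1.000, 0.508, 0.098, 0.171],
    [0.222, 0.531, 0.320, 0.439, 0.508, 1.000, 0.829, 0.391],
    [0.048, 0.000, 0.020, 0.000, 0.098, 0.829, 1.000, 0.033],
    [0.057, 0.121, 0.080, 0.000, 0.171, 0.391, 0.033, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off for calling a pair "highly correlated"

# Scan the upper triangle only, so each pair is reported once.
pairs = [
    (names[i], names[j], matrix[i][j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(matrix[i][j]) >= THRESHOLD
]
for left, right, value in pairs:
    print(f"{left} ~ {right}: {value:.3f}")
```

Note that the third matrix gives -0.629 for 층_구분_코드 and 층_번호, whereas the first gives 0.024 for the same pair, so which pairs count as strongly associated depends on the matrix being read.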

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
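
The overall missing-cell figures in the overview follow from the per-variable counts. The sketch below sums the "Missing" values reported for each variable that has any and divides by the number of cells (observations × variables).

```python
# Per-variable "Missing" counts copied from the variable sections of this report;
# variables that report 0 missing values are omitted.
missing_by_variable = {
    "평형_구분_명": 3,
    "층_구분_코드": 49,
    "구조_코드": 9,
    "주_용도_코드": 208,
    "기타_용도": 1438,
}

n_rows, n_cols = 10000, 12  # observations and variables from the overview
total_missing = sum(missing_by_variable.values())
missing_pct = 100 * total_missing / (n_rows * n_cols)
print(total_missing, f"{missing_pct:.1f}%")
```

Most of the missing cells come from 기타_용도 alone, which matches the Missing alert raised for that variable.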

Sample

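(index) | 관리_전유_공용_면적_pk | 호별명세_pk | 평형_구분_명 | 전유_공용_구분_코드 | 주_부속_구분_코드 | 층_구분_코드 | 층_번호 | 구조_코드 | 주_용도_코드 | 기타_용도 | 면적 | 작업_일자
46709 | 11500-100216589 | 11500-100100041 | 202 | 1 | 0 | 20 | 2 | 21 | 02003 | 도시형생활주택 | 26.64 | 20211204
74617 | 11545-100124144 | 11545-100053518 | 1429 | 2 | 0 | 40 | 0 | 21 | 17999 | 복도,ELV 외 | 16.78 | 20200218
40104 | 11260-100087687 | 11260-100038618 | 주7-17B | 2 | 0 | 10 | 0 | 21 | 02001 | 주차장 | 8.5754 | 20211029
45293 | 11710-100209980 | 11710-100067608 | 1804 | 1 | 0 | 20 | 18 | 21 | 14202 | <NA> | 15.98 | 20200215
84374 | 11200-100048533 | 11200-100052190 | 8I | 2 | 0 | 21 | 0 | 21 | 17999 | 주차장 | 59.6 | 20200808
30149 | 11620-100116815 | 11620-100078252 | 15.64형 | 2 | 0 | 20 | 1 | 21 | 14202 | 주차장 | 0.7752 | 20220304
40676 | 11710-100220301 | 11710-100083601 | 1201 | 2 | 0 | 20 | 3 | 21 | 14202 | 벽체면적 | 2.04 | 20210128
76628 | 11545-100123054 | 11545-100053518 | 1656 | 1 | 0 | 20 | 3 | 21 | 02007 | 기숙사 | 20.6213 | 20200218
5281 | 11530-100100973 | 11530-100079506 | 327 공장-27 | 2 | 0 | 40 | 0 | 21 | 17999 | 기계전기실,MDF등 | 5.418 | 20211201
30948 | 11680-100092616 | 11680-100119709 | 공공 17-3 | 2 | 0 | 20 | 1 | 21 | 02001 | 관리사무실(방재실) | 0.07 | 20220409
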
(index) | 관리_전유_공용_면적_pk | 호별명세_pk | 평형_구분_명 | 전유_공용_구분_코드 | 주_부속_구분_코드 | 층_구분_코드 | 층_번호 | 구조_코드 | 주_용도_코드 | 기타_용도 | 면적 | 작업_일자
14364 | 11560-100086855 | 11560-100010630 | 오피스29 | 1 | 0 | 20 | 0 | 21 | 14204 | 업무시설(사무소) | 29.1974 | 20201103
54598 | 11380-100110287 | 11380-100080987 | 4lll | 2 | 0 | 10 | 1 | 21 | 02003 | 단지형(기계실,발전기실,통신실) | 0.48 | 20200319
65825 | 11620-100101214 | 11620-100057127 | 79.73 | 2 | 0 | 10 | 3 | 21 | 02001 | 기계실 | 1.19 | 20200221
2442711470-988511470-271760102022103003<NA>132.2520200515
58908 | 11260-100075978 | 11260-100027732 | FB206 | 2 | 0 | 40 | 0 | 42 | 17999 | 주차장(지4~5층) | 55.78 | 20200610
12745 | 11680-100090880 | 11680-100169152 | 101 | 1 | 0 | 20 | 1 | 21 | 03999 | 근린생활시설 | 180.37 | 20220204
74962 | 11290-100055442 | 11290-100065542 | B1 | 2 | 0 | 40 | 0 | 21 | 02003 | 계단실,홀 | 7.11 | 20211029
15780 | 11545-100183707 | 11545-100065917 | 1120호 | 2 | 0 | 40 | 0 | 21 | 17999 | 복도,화장실 | 16.9567 | 20220405
43804 | 11620-100117846 | 11620-100079215 | 305 | 2 | 0 | 10 | 1 | 21 | 04402 | 기계전기통신 | 8.58 | 20220322
22071 | 11380-100110218 | 11380-100080987 | 6p | 1 | 0 | 20 | 5 | 21 | 14202 | <NA> | 21.78 | 20200319