Overview

Dataset statistics

Number of variables: 9
Number of observations: 10000
Missing cells: 19987
Missing cells (%): 22.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 820.3 KiB
Average record size in memory: 84.0 B
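
The missing-cell percentage follows from the other figures in this table: it is the number of missing cells divided by the total cell count (variables × observations). A minimal Python sketch of that arithmetic, using only values reported in this document (the variable names are illustrative):

```python
# Values copied from the dataset statistics in this report.
n_variables = 9
n_observations = 10000
missing_cells = 19987

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Cross-check: the per-variable missing counts reported in the
# Variables section (25, 27, 9935 and 10000) add up to the same total.
per_variable_missing = [25, 27, 9935, 10000]
assert sum(per_variable_missing) == missing_cells

print(f"{missing_pct:.1f}%")
```

Most of the missing cells come from two columns, 관리_건축물대장_참조_pk and 변경_구분_코드, which together account for 19935 of the 19987.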

Variable types

Text: 5
Numeric: 2
Categorical: 1
Unsupported: 1

Dataset

Description: 관리_호별_명세_pk, 관리_동별_개요_pk, 호_번호, 호_명칭, 평형_구분_명, 층_번호, 층_구분_코드, 관리_건축물대장_참조_pk, 변경_구분_코드
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15664/S/1/datasetView.do

Alerts

층_구분_코드 is highly imbalanced (81.4%) [Imbalance]
관리_건축물대장_참조_pk has 9935 (99.4%) missing values [Missing]
변경_구분_코드 has 10000 (100.0%) missing values [Missing]
관리_호별_명세_pk has unique values [Unique]
변경_구분_코드 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
호_번호 has 2529 (25.3%) zeros [Zeros]
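
The 81.4% imbalance figure is consistent with a score of one minus the normalized Shannon entropy of the category counts. The sketch below recomputes it from the counts listed for 층_구분_코드 later in this report (9165, 812, 17, 4, 2). The formula is an assumption about how the score is derived, not something the report states, and `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class distribution and k is the number of classes.
    0 means perfectly balanced, 1 means everything is in one class."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no balance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Counts for the values 20, 10, <NA>, 40 and 30 as reported in this document.
counts = [9165, 812, 17, 4, 2]
score = imbalance_score(counts)
print(f"{100 * score:.1f}%")
```

Note that the `<NA>` entries are counted as a fifth category here, which matches the report listing 5 distinct values and 0 missing for this variable.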

Reproduction

Analysis started: 2024-05-11 08:22:48.057565
Analysis finished: 2024-05-11 08:22:51.895340
Duration: 3.84 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
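
The reported duration can be checked directly from the two timestamps. A small sketch using only the Python standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction block of this report.
started = datetime.fromisoformat("2024-05-11 08:22:48.057565")
finished = datetime.fromisoformat("2024-05-11 08:22:51.895340")

# Subtracting two datetimes yields a timedelta.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```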

Variables

관리_호별_명세_pk
Text

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 15
Mean length: 13.6079
Min length: 7

Characters and Unicode

Total characters: 136079
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10000
Unique (%): 100.0%

Sample

1st row: 11000-100030381
2nd row: 11000-30237
3rd row: 11110-100005523
4th row: 11110-100023617
5th row: 11110-100021234

Value                Count  Frequency (%)
11000-100030381      1      < 0.1%
11110-5281           1      < 0.1%
11000-100032737      1      < 0.1%
11000-27530          1      < 0.1%
11000-100010342      1      < 0.1%
11000-7171           1      < 0.1%
11000-7684           1      < 0.1%
11110-100016064      1      < 0.1%
11110-100014615      1      < 0.1%
11110-100006831      1      < 0.1%
Other values (9990)  9990   99.9%

Most occurring characters

Value  Count  Frequency (%)
0      47963  35.2%
1      41403  30.4%
-      10000  7.3%
4      6008   4.4%
2      5924   4.4%
3      4979   3.7%
9      4020   3.0%
8      3987   2.9%
7      3978   2.9%
5      3934   2.9%

Most occurring categories

Value             Count   Frequency (%)
Decimal Number    126079  92.7%
Dash Punctuation  10000   7.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      47963  38.0%
1      41403  32.8%
4      6008   4.8%
2      5924   4.7%
3      4979   3.9%
9      4020   3.2%
8      3987   3.2%
7      3978   3.2%
5      3934   3.1%
6      3883   3.1%

Dash Punctuation
Value  Count  Frequency (%)
-      10000  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  136079  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      47963  35.2%
1      41403  30.4%
-      10000  7.3%
4      6008   4.4%
2      5924   4.4%
3      4979   3.7%
9      4020   3.0%
8      3987   2.9%
7      3978   2.9%
5      3934   2.9%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  136079  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      47963  35.2%
1      41403  30.4%
-      10000  7.3%
4      6008   4.4%
2      5924   4.4%
3      4979   3.7%
9      4020   3.0%
8      3987   2.9%
7      3978   2.9%
5      3934   2.9%

관리_동별_개요_pk
Text

Distinct: 1090
Distinct (%): 10.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 15
Mean length: 12.6539
Min length: 7

Characters and Unicode

Total characters: 126539
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 463
Unique (%): 4.6%

Sample

1st row: 11000-100005926
2nd row: 11000-159
3rd row: 11110-100015098
4th row: 11110-100071748
5th row: 11110-100067789

Value                Count  Frequency (%)
11000-100005238      494    4.9%
11140-100007179      437    4.4%
11000-131            348    3.5%
11000-100005237      222    2.2%
11000-45             159    1.6%
11110-3036           131    1.3%
11000-82             124    1.2%
11000-5              107    1.1%
11110-1252           104    1.0%
11000-159            99     1.0%
Other values (1080)  7775   77.8%

Most occurring characters

Value  Count  Frequency (%)
0      46385  36.7%
1      39697  31.4%
-      10000  7.9%
2      4659   3.7%
7      4536   3.6%
4      4299   3.4%
3      3869   3.1%
5      3781   3.0%
8      3438   2.7%
6      2990   2.4%

Most occurring categories

Value             Count   Frequency (%)
Decimal Number    116539  92.1%
Dash Punctuation  10000   7.9%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      46385  39.8%
1      39697  34.1%
2      4659   4.0%
7      4536   3.9%
4      4299   3.7%
3      3869   3.3%
5      3781   3.2%
8      3438   3.0%
6      2990   2.6%
9      2885   2.5%

Dash Punctuation
Value  Count  Frequency (%)
-      10000  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  126539  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      46385  36.7%
1      39697  31.4%
-      10000  7.9%
2      4659   3.7%
7      4536   3.6%
4      4299   3.4%
3      3869   3.1%
5      3781   3.0%
8      3438   2.7%
6      2990   2.4%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  126539  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      46385  36.7%
1      39697  31.4%
-      10000  7.9%
2      4659   3.7%
7      4536   3.6%
4      4299   3.4%
3      3869   3.1%
5      3781   3.0%
8      3438   2.7%
6      2990   2.4%

호_번호
Real number (ℝ)

ZEROS 

Distinct: 1897
Distinct (%): 19.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 423.9252
Minimum: 0
Maximum: 5353
Zeros: 2529
Zeros (%): 25.3%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 91
Q3: 349
95-th percentile: 2599.1
Maximum: 5353
Range: 5353
Interquartile range (IQR): 349

Descriptive statistics

Standard deviation: 876.07306
Coefficient of variation (CV): 2.0665746
Kurtosis: 10.078732
Mean: 423.9252
Median Absolute Deviation (MAD): 91
Skewness: 3.1485174
Sum: 4239252
Variance: 767504
Monotonicity: Not monotonic
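
Several of the figures for 호_번호 are derived from others in the same block. The sketch below recomputes the coefficient of variation, interquartile range, range, variance and sum from the reported mean, standard deviation, quartiles and extremes. It is only a consistency check on the printed numbers, not the profiler's own code, and the variable names are illustrative.

```python
# Values copied from the 호_번호 statistics in this report.
n_observations = 10000
mean = 423.9252
std = 876.07306
q1, q3 = 0, 349
minimum, maximum = 0, 5353

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
variance = std ** 2              # variance is the squared standard deviation
total = mean * n_observations    # sum of all values

print(f"CV={cv:.7f} IQR={iqr} range={value_range} "
      f"variance={variance:.0f} sum={total:.0f}")
```

A CV above 2 together with a median of 91 against a mean of about 424 reflects the long right tail: a quarter of the values are zero while a few exceed 5000.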
Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
0                    2529   25.3%
3                    93     0.9%
2                    91     0.9%
5                    83     0.8%
4                    82     0.8%
6                    78     0.8%
1                    77     0.8%
7                    68     0.7%
8                    48     0.5%
9                    47     0.5%
Other values (1887)  6804   68.0%

Minimum 10 values

Value  Count  Frequency (%)
0      2529   25.3%
1      77     0.8%
2      91     0.9%
3      93     0.9%
4      82     0.8%
5      83     0.8%
6      78     0.8%
7      68     0.7%
8      48     0.5%
9      47     0.5%

Maximum 10 values

Value  Count  Frequency (%)
5353   1      < 0.1%
5352   1      < 0.1%
5346   1      < 0.1%
5337   1      < 0.1%
5315   1      < 0.1%
5294   1      < 0.1%
5288   1      < 0.1%
5261   1      < 0.1%
5233   1      < 0.1%
5217   1      < 0.1%

호_명칭
Text

Distinct: 5319
Distinct (%): 53.3%
Missing: 25
Missing (%): 0.2%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 12
Mean length: 4.7851629
Min length: 1

Characters and Unicode

Total characters: 47732
Distinct characters: 103
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
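
For this variable, the mean length is consistent with the total character count divided by the number of non-missing values. A short sketch of that check, assuming the mean is taken over non-missing rows only (the printed figures bear this out; the variable names are illustrative):

```python
# Values copied from this variable's statistics in the report.
total_characters = 47732
n_observations = 10000
n_missing = 25

# Missing values contribute no characters, so they are excluded
# from the denominator.
n_present = n_observations - n_missing
mean_length = total_characters / n_present

print(round(mean_length, 7))
```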

Unique

Unique: 4334
Unique (%): 43.4%

Sample

1st row: 202동1504호
2nd row: 131
3rd row: 302
4th row: 502
5th row: 410

Value                Count  Frequency (%)
301                  130    1.3%
201                  115    1.1%
401                  103    1.0%
오피스텔             99     1.0%
202                  91     0.9%
101                  89     0.9%
302                  87     0.9%
아파트               67     0.7%
102                  66     0.6%
402                  62     0.6%
Other values (5123)  9300   91.1%

Most occurring characters

Value              Count  Frequency (%)
0                  9117   19.1%
1                  9072   19.0%
2                  5322   11.1%
3                  3413   7.2%
-                  2922   6.1%
4                  2707   5.7%
5                  2120   4.4%
6                  1745   3.7%
7                  1605   3.4%
8                  1396   2.9%
Other values (93)  8313   17.4%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     37773  79.1%
Other Letter       4134   8.7%
Dash Punctuation   2922   6.1%
Uppercase Letter   2609   5.5%
Space Separator    234    0.5%
Lowercase Letter   24     0.1%
Close Punctuation  18     < 0.1%
Open Punctuation   18     < 0.1%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
                   1232   29.8%
                   235    5.7%
                   216    5.2%
                   215    5.2%
                   215    5.2%
                   203    4.9%
                   142    3.4%
                   139    3.4%
                   130    3.1%
                   129    3.1%
Other values (53)  1278   30.9%

Uppercase Letter
Value             Count  Frequency (%)
B                 938    36.0%
F                 438    16.8%
A                 336    12.9%
L                 159    6.1%
D                 156    6.0%
T                 153    5.9%
C                 134    5.1%
Y                 130    5.0%
E                 81     3.1%
S                 38     1.5%
Other values (9)  46     1.8%

Decimal Number
Value  Count  Frequency (%)
0      9117   24.1%
1      9072   24.0%
2      5322   14.1%
3      3413   9.0%
4      2707   7.2%
5      2120   5.6%
6      1745   4.6%
7      1605   4.2%
8      1396   3.7%
9      1276   3.4%

Lowercase Letter
Value  Count  Frequency (%)
b      10     41.7%
e      4      16.7%
r      4      16.7%
i      2      8.3%
m      2      8.3%
a      1      4.2%
c      1      4.2%

Dash Punctuation
Value  Count  Frequency (%)
-      2922   100.0%

Space Separator
Value    Count  Frequency (%)
(space)  234    100.0%

Close Punctuation
Value  Count  Frequency (%)
)      18     100.0%

Open Punctuation
Value  Count  Frequency (%)
(      18     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  40965  85.8%
Hangul  4134   8.7%
Latin   2633   5.5%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
                   1232   29.8%
                   235    5.7%
                   216    5.2%
                   215    5.2%
                   215    5.2%
                   203    4.9%
                   142    3.4%
                   139    3.4%
                   130    3.1%
                   129    3.1%
Other values (53)  1278   30.9%

Latin
Value              Count  Frequency (%)
B                  938    35.6%
F                  438    16.6%
A                  336    12.8%
L                  159    6.0%
D                  156    5.9%
T                  153    5.8%
C                  134    5.1%
Y                  130    4.9%
E                  81     3.1%
S                  38     1.4%
Other values (16)  70     2.7%

Common
Value             Count  Frequency (%)
0                 9117   22.3%
1                 9072   22.1%
2                 5322   13.0%
3                 3413   8.3%
-                 2922   7.1%
4                 2707   6.6%
5                 2120   5.2%
6                 1745   4.3%
7                 1605   3.9%
8                 1396   3.4%
Other values (4)  1546   3.8%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   43598  91.3%
Hangul  4134   8.7%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
0                  9117   20.9%
1                  9072   20.8%
2                  5322   12.2%
3                  3413   7.8%
-                  2922   6.7%
4                  2707   6.2%
5                  2120   4.9%
6                  1745   4.0%
7                  1605   3.7%
8                  1396   3.2%
Other values (30)  4179   9.6%

Hangul
Value              Count  Frequency (%)
                   1232   29.8%
                   235    5.7%
                   216    5.2%
                   215    5.2%
                   215    5.2%
                   203    4.9%
                   142    3.4%
                   139    3.4%
                   130    3.1%
                   129    3.1%
Other values (53)  1278   30.9%

평형_구분_명
Text

Distinct: 3655
Distinct (%): 36.6%
Missing: 27
Missing (%): 0.3%
Memory size: 156.2 KiB

Length

Max length: 21
Median length: 16
Mean length: 3.4702697
Min length: 1

Characters and Unicode

Total characters: 34609
Distinct characters: 128
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2464
Unique (%): 24.7%

Sample

1st row: 40B
2nd row: ss023
3rd row: D
4th row: 48.55
5th row: 19오

Value                Count  Frequency (%)
a                    384    3.8%
b                    168    1.6%
c                    143    1.4%
d                    92     0.9%
type                 82     0.8%
1ob                  77     0.8%
oc                   66     0.6%
g                    65     0.6%
17ta                 62     0.6%
32a                  58     0.6%
Other values (3538)  8988   88.2%

Most occurring characters

Value               Count  Frequency (%)
1                   4072   11.8%
2                   2866   8.3%
3                   2693   7.8%
A                   2306   6.7%
4                   2186   6.3%
6                   1997   5.8%
5                   1923   5.6%
0                   1834   5.3%
.                   1718   5.0%
B                   1711   4.9%
Other values (118)  11303  32.7%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     21900  63.3%
Uppercase Letter   8143   23.5%
Other Punctuation  1739   5.0%
Lowercase Letter   1083   3.1%
Other Letter       742    2.1%
Dash Punctuation   690    2.0%
Space Separator    212    0.6%
Open Punctuation   44     0.1%
Close Punctuation  44     0.1%
Math Symbol        12     < 0.1%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
                   141    19.0%
                   105    14.2%
                   63     8.5%
                   49     6.6%
                   34     4.6%
                   30     4.0%
                   30     4.0%
                   25     3.4%
                   25     3.4%
                   24     3.2%
Other values (49)  216    29.1%

Uppercase Letter
Value              Count  Frequency (%)
A                  2306   28.3%
B                  1711   21.0%
C                  831    10.2%
F                  649    8.0%
D                  528    6.5%
O                  519    6.4%
E                  251    3.1%
S                  229    2.8%
T                  228    2.8%
G                  139    1.7%
Other values (16)  752    9.2%

Lowercase Letter
Value              Count  Frequency (%)
a                  209    19.3%
b                  145    13.4%
p                  114    10.5%
e                  110    10.2%
o                  107    9.9%
f                  107    9.9%
y                  95     8.8%
c                  44     4.1%
s                  43     4.0%
t                  30     2.8%
Other values (15)  79     7.3%

Decimal Number
Value  Count  Frequency (%)
1      4072   18.6%
2      2866   13.1%
3      2693   12.3%
4      2186   10.0%
6      1997   9.1%
5      1923   8.8%
0      1834   8.4%
7      1548   7.1%
8      1538   7.0%
9      1243   5.7%

Other Punctuation
Value  Count  Frequency (%)
.      1718   98.8%
*      12     0.7%
,      9      0.5%

Dash Punctuation
Value  Count  Frequency (%)
-      690    100.0%

Space Separator
Value    Count  Frequency (%)
(space)  212    100.0%

Open Punctuation
Value  Count  Frequency (%)
(      44     100.0%

Close Punctuation
Value  Count  Frequency (%)
)      44     100.0%

Math Symbol
Value  Count  Frequency (%)
~      12     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  24641  71.2%
Latin   9226   26.7%
Hangul  742    2.1%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
                   141    19.0%
                   105    14.2%
                   63     8.5%
                   49     6.6%
                   34     4.6%
                   30     4.0%
                   30     4.0%
                   25     3.4%
                   25     3.4%
                   24     3.2%
Other values (49)  216    29.1%

Latin
Value              Count  Frequency (%)
A                  2306   25.0%
B                  1711   18.5%
C                  831    9.0%
F                  649    7.0%
D                  528    5.7%
O                  519    5.6%
E                  251    2.7%
S                  229    2.5%
T                  228    2.5%
a                  209    2.3%
Other values (41)  1765   19.1%

Common
Value             Count  Frequency (%)
1                 4072   16.5%
2                 2866   11.6%
3                 2693   10.9%
4                 2186   8.9%
6                 1997   8.1%
5                 1923   7.8%
0                 1834   7.4%
.                 1718   7.0%
7                 1548   6.3%
8                 1538   6.2%
Other values (8)  2266   9.2%

Most occurring blocks

Value        Count  Frequency (%)
ASCII        33867  97.9%
Hangul       741    2.1%
Compat Jamo  1      < 0.1%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
1                  4072   12.0%
2                  2866   8.5%
3                  2693   8.0%
A                  2306   6.8%
4                  2186   6.5%
6                  1997   5.9%
5                  1923   5.7%
0                  1834   5.4%
.                  1718   5.1%
B                  1711   5.1%
Other values (59)  10561  31.2%

Hangul
Value              Count  Frequency (%)
                   141    19.0%
                   105    14.2%
                   63     8.5%
                   49     6.6%
                   34     4.6%
                   30     4.0%
                   30     4.0%
                   25     3.4%
                   25     3.4%
                   24     3.2%
Other values (48)  215    29.0%

Compat Jamo
Value  Count  Frequency (%)
       1      100.0%

층_번호
Real number (ℝ)

Distinct: 71
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.3235
Minimum: 0
Maximum: 501
Zeros: 63
Zeros (%): 0.6%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 3
Median: 6
Q3: 13
95-th percentile: 27
Maximum: 501
Range: 501
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 13.07531
Coefficient of variation (CV): 1.4024037
Kurtosis: 518.53451
Mean: 9.3235
Median Absolute Deviation (MAD): 4
Skewness: 17.326913
Sum: 93235
Variance: 170.96374
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
1                  1285   12.8%
2                  1119   11.2%
3                  842    8.4%
4                  733    7.3%
5                  562    5.6%
7                  507    5.1%
6                  503    5.0%
8                  406    4.1%
9                  405    4.0%
10                 323    3.2%
Other values (61)  3315   33.1%

Minimum 10 values

Value  Count  Frequency (%)
0      63     0.6%
1      1285   12.8%
2      1119   11.2%
3      842    8.4%
4      733    7.3%
5      562    5.6%
6      503    5.0%
7      507    5.1%
8      406    4.1%
9      405    4.0%

Maximum 10 values

Value  Count  Frequency (%)
501    1      < 0.1%
403    1      < 0.1%
402    1      < 0.1%
401    1      < 0.1%
303    1      < 0.1%
302    1      < 0.1%
301    1      < 0.1%
114    1      < 0.1%
112    1      < 0.1%
67     1      < 0.1%

층_구분_코드
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 2
Mean length: 2.0034
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20
2nd row: 10
3rd row: 20
4th row: 20
5th row: 20

Common Values

Value  Count  Frequency (%)
20     9165   91.6%
10     812    8.1%
<NA>   17     0.2%
40     4      < 0.1%
30     2      < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
20     9165   91.6%
10     812    8.1%
na     17     0.2%
40     4      < 0.1%
30     2      < 0.1%

관리_건축물대장_참조_pk
Text

Distinct: 46
Distinct (%): 70.8%
Missing: 9935
Missing (%): 99.4%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 28
Mean length: 24.6
Min length: 15

Characters and Unicode

Total characters: 1599
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 32
Unique (%): 49.2%

Sample

1st row: 11110-100187986
2nd row: 11110-1000000000000001991263
3rd row: 11110-1000000000000001991272
4th row: 11110-1000000000000001991285
5th row: 11110-1000000000000001991247

Value                         Count  Frequency (%)
11110-1000000000000001991249  4      6.2%
11110-1000000000000001991288  3      4.6%
11110-1000000000000001991242  3      4.6%
11110-1000000000000001991272  3      4.6%
11140-100212875               2      3.1%
11110-1000000000000001991284  2      3.1%
11140-100196161               2      3.1%
11110-1000000000000001991275  2      3.1%
11110-1000000000000001991255  2      3.1%
11110-1000000000000001991262  2      3.1%
Other values (36)             40     61.5%

Most occurring characters

Value  Count  Frequency (%)
0      774    48.4%
1      440    27.5%
9      119    7.4%
-      65     4.1%
2      65     4.1%
4      33     2.1%
6      28     1.8%
7      24     1.5%
8      23     1.4%
5      19     1.2%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    1534   95.9%
Dash Punctuation  65     4.1%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      774    50.5%
1      440    28.7%
9      119    7.8%
2      65     4.2%
4      33     2.2%
6      28     1.8%
7      24     1.6%
8      23     1.5%
5      19     1.2%
3      9      0.6%

Dash Punctuation
Value  Count  Frequency (%)
-      65     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  1599   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      774    48.4%
1      440    27.5%
9      119    7.4%
-      65     4.1%
2      65     4.1%
4      33     2.1%
6      28     1.8%
7      24     1.5%
8      23     1.4%
5      19     1.2%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1599   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      774    48.4%
1      440    27.5%
9      119    7.4%
-      65     4.1%
2      65     4.1%
4      33     2.1%
6      28     1.8%
7      24     1.5%
8      23     1.4%
5      19     1.2%

변경_구분_코드
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 10000
Missing (%): 100.0%
Memory size: 166.0 KiB

Interactions


Correlations

                         호_번호   층_번호   층_구분_코드   관리_건축물대장_참조_pk
호_번호                  1.000    0.000    0.111        NaN
층_번호                  0.000    1.000    0.000        NaN
층_구분_코드             0.111    0.000    1.000        0.735
관리_건축물대장_참조_pk  NaN      NaN      0.735        1.000

               호_번호   층_번호   층_구분_코드
호_번호        1.000    0.199    0.067
층_번호        0.199    1.000    0.000
층_구분_코드   0.067    0.000    1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

관리_호별_명세_pk관리_동별_개요_pk호_번호호_명칭평형_구분_명층_번호층_구분_코드관리_건축물대장_참조_pk변경_구분_코드
1189511000-10003038111000-1000059260202동1504호40B1520<NA><NA>
3611811000-3023711000-15993131ss023110<NA><NA>
5316611110-10000552311110-1000150980302D320<NA><NA>
6486811110-10002361711110-100071748050248.55520<NA><NA>
6262511110-10002123411110-100067789041019오420<NA><NA>
8396011110-969611110-123831601호11D620<NA><NA>
188911000-10000535311000-100005126277A-612F620<NA><NA>
7888311110-499711110-7821상가 B10122.68210<NA><NA>
3412811000-2844711000-145253비-190343C81920<NA><NA>
6213911110-10002050111110-100062408010289.88120<NA><NA>
관리_호별_명세_pk관리_동별_개요_pk호_번호호_명칭평형_구분_명층_번호층_구분_코드관리_건축물대장_참조_pk변경_구분_코드
118911000-10000427011000-10000484733101-40156W420<NA><NA>
7905111110-514711110-782151아파트 3021D320<NA><NA>
4177711000-379111000-21120160256B1620<NA><NA>
9658311140-10001604311140-1000156990B-90159020<NA><NA>
6321711110-10002188911110-100069608010041004 (A)1020<NA><NA>
9884011140-10002061411140-1000304630403b420<NA><NA>
3916311000-3297911000-16711920021B2020<NA><NA>
3434611000-2864311000-1461351605호39B1620<NA><NA>
1024811000-10002735311000-100005295160비-130447B1320<NA><NA>
9975611140-10002231511140-1000347441471012K1020<NA><NA>