Overview

Dataset statistics

Number of variables: 14
Number of observations: 10000
Missing cells: 74832
Missing cells (%): 53.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.2 MiB
Average record size in memory: 125.0 B
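
The missing-cell figures can be cross-checked against the per-column missing counts reported in the Alerts and Variables sections of this report. The sketch below assumes that the percentage is taken over all cells (variables × observations); the dictionary name is illustrative only.

```python
# Per-column missing counts as listed elsewhere in this report.
# Columns not listed here are reported with 0 missing values.
per_column_missing = {
    "시군구명": 47,
    "읍면동명": 64,
    "리명": 6372,
    "도서명": 9988,
    "시작주번지": 7123,
    "시작부번지": 9662,
    "종료주번지": 7123,
    "종료부번지": 9662,
    "아파트건물명": 7146,
    "시작동번호": 8815,
    "종료동번호": 8830,
}

n_variables = 14
n_observations = 10_000

missing_cells = sum(per_column_missing.values())
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(missing_cells, f"{missing_pct:.1f}%")
```

Under that assumption the column counts add up to the reported total, and the rounded percentage agrees with the overview.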

Variable types

Numeric: 5
Categorical: 2
Text: 7

Dataset

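Description: Postal code management data used within the meta system (메타시스템). The fields provided are 우편번호 (postal code), 시도명 (province or metropolitan city), 시군구명 (city, county, or district), 읍면동명 (town, township, or neighborhood), 리명 (village), 도서명 (island), 산번지 (mountain lot indicator), 시작주번지 (start main lot number), 시작부번지 (start sub lot number), 종료주번지 (end main lot number), 종료부번지 (end sub lot number), 아파트건물명 (apartment or building name), 시작동번호 (start building number), and 종료동번호 (end building number).
Author: Ministry of Justice (법무부)
URL: https://www.data.go.kr/data/15087310/fileData.do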

Alerts

시도명 is highly overall correlated with 우편번호 and 1 other field [High correlation]
산번지 is highly overall correlated with 우편번호 and 5 other fields [High correlation]
우편번호 is highly overall correlated with 시도명 and 1 other field [High correlation]
시작주번지 is highly overall correlated with 종료주번지 and 1 other field [High correlation]
시작부번지 is highly overall correlated with 종료부번지 and 1 other field [High correlation]
종료주번지 is highly overall correlated with 시작주번지 and 1 other field [High correlation]
종료부번지 is highly overall correlated with 시작부번지 and 1 other field [High correlation]
산번지 is highly imbalanced (90.6%) [Imbalance]
리명 has 6372 (63.7%) missing values [Missing]
도서명 has 9988 (99.9%) missing values [Missing]
시작주번지 has 7123 (71.2%) missing values [Missing]
시작부번지 has 9662 (96.6%) missing values [Missing]
종료주번지 has 7123 (71.2%) missing values [Missing]
종료부번지 has 9662 (96.6%) missing values [Missing]
아파트건물명 has 7146 (71.5%) missing values [Missing]
시작동번호 has 8815 (88.1%) missing values [Missing]
종료동번호 has 8830 (88.3%) missing values [Missing]
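
The 90.6% imbalance reported for 산번지 is consistent with a score of one minus the normalized Shannon entropy of the category counts (9879 and 121, as listed in the 산번지 section). The sketch below is an assumption about how such a score can be computed, not a copy of the profiler's code; `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum possible entropy) for category counts."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        # Assumption: a single-class column is scored 0 in this sketch.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)


# Category counts reported for 산번지 in this report.
score = imbalance_score([9879, 121])
print(f"{score:.1%}")
```

A perfectly balanced column gives a score of 0 and a column dominated by one category approaches 1, which is why a 98.8% / 1.2% split is flagged.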

Reproduction

Analysis started: 2023-12-12 20:36:10.667567
Analysis finished: 2023-12-12 20:36:15.862352
Duration: 5.19 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
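
The reported duration follows from the two timestamps above. A minimal check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-12 20:36:10.667567")
finished = datetime.fromisoformat("2023-12-12 20:36:15.862352")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```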

Variables

우편번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8523
Distinct (%): 85.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 460923.48
Minimum: 100031
Maximum: 799820
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 100031
5th percentile: 134024.5
Q1: 325912
Median: 469847
Q3: 617017.75
95th percentile: 750873.45
Maximum: 799820
Range: 699789
Interquartile range (IQR): 291105.75

Descriptive statistics

Standard deviation: 197863.73
Coefficient of variation (CV): 0.42927674
Kurtosis: -1.0173305
Mean: 460923.48
Median Absolute Deviation (MAD): 146016
Skewness: -0.26076375
Sum: 4.6092348 × 10^9
Variance: 3.9150056 × 10^10
Monotonicity: Not monotonic
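
Several of the 우편번호 statistics are derived from one another, so they can be checked for internal consistency. The sketch below uses the rounded values printed in this report and the standard definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 − Q1); small differences from the printed figures are rounding effects.

```python
# Values copied from the 우편번호 statistics in this report.
n = 10_000
mean = 460923.48
std = 197863.73
q1, q3 = 325912, 617017.75
minimum, maximum = 100031, 799820

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance from the standard deviation
total = mean * n                 # sum of all values
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range

print(f"CV={cv:.8f}  variance={variance:.7e}  sum={total:.7e}")
print(f"range={value_range}  IQR={iqr}")
```

The recomputed values agree with the reported CV, variance, sum, range, and IQR to the precision shown in the report.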
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
482839 | 10 | 0.1%
701819 | 9 | 0.1%
486859 | 9 | 0.1%
250889 | 8 | 0.1%
487839 | 7 | 0.1%
482869 | 7 | 0.1%
701813 | 6 | 0.1%
601814 | 6 | 0.1%
209839 | 6 | 0.1%
630822 | 6 | 0.1%
Other values (8513) | 9926 | 99.3%

Value | Count | Frequency (%)
100031 | 1 | < 0.1%
100051 | 1 | < 0.1%
100053 | 1 | < 0.1%
100094 | 1 | < 0.1%
100095 | 1 | < 0.1%
100151 | 1 | < 0.1%
100161 | 1 | < 0.1%
100162 | 1 | < 0.1%
100195 | 1 | < 0.1%
100273 | 1 | < 0.1%

Value | Count | Frequency (%)
799820 | 1 | < 0.1%
799811 | 2 | < 0.1%
799803 | 1 | < 0.1%
799800 | 1 | < 0.1%
791948 | 1 | < 0.1%
791945 | 2 | < 0.1%
791942 | 1 | < 0.1%
791941 | 1 | < 0.1%
791923 | 2 | < 0.1%
791921 | 1 | < 0.1%

시도명
Categorical

HIGH CORRELATION 

Distinct: 17
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

경기도: 1645
서울특별시: 1565
경상북도: 952
전라남도: 757
경상남도: 720
Other values (12): 4361

Length

Max length: 7
Median length: 5
Mean length: 4.1905
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 경기도
3rd row: 전라북도
4th row: 충청북도
5th row: 전라북도

Common Values

Value | Count | Frequency (%)
경기도 | 1645 | 16.4%
서울특별시 | 1565 | 15.7%
경상북도 | 952 | 9.5%
전라남도 | 757 | 7.6%
경상남도 | 720 | 7.2%
부산광역시 | 688 | 6.9%
강원도 | 551 | 5.5%
충청남도 | 540 | 5.4%
전라북도 | 529 | 5.3%
대구광역시 | 520 | 5.2%
Other values (7) | 1533 | 15.3%

Length

Histogram of lengths of the category

시군구명
Text

Distinct: 228
Distinct (%): 2.3%
Missing: 47
Missing (%): 0.5%
Memory size: 156.2 KiB

Length

Max length8
Median length3
Mean length3.3281423
Min length2

Characters and Unicode

Total characters: 33125
Distinct characters: 140
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row송파구
2nd row양주시
3rd row순창군
4th row청주시상당구
5th row군산시
ValueCountFrequency (%)
동구 275
 
2.8%
중구 239
 
2.4%
북구 230
 
2.3%
남구 225
 
2.3%
서구 202
 
2.0%
강남구 107
 
1.1%
화성시 94
 
0.9%
노원구 91
 
0.9%
송파구 90
 
0.9%
수성구 89
 
0.9%
Other values (218) 8311
83.5%

Most occurring characters

ValueCountFrequency (%)
4997
 
15.1%
4189
 
12.6%
2410
 
7.3%
1160
 
3.5%
954
 
2.9%
945
 
2.9%
920
 
2.8%
856
 
2.6%
797
 
2.4%
766
 
2.3%
Other values (130) 15131
45.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 33125
100.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4997
 
15.1%
4189
 
12.6%
2410
 
7.3%
1160
 
3.5%
954
 
2.9%
945
 
2.9%
920
 
2.8%
856
 
2.6%
797
 
2.4%
766
 
2.3%
Other values (130) 15131
45.7%

Most occurring scripts

ValueCountFrequency (%)
Hangul 33125
100.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4997
 
15.1%
4189
 
12.6%
2410
 
7.3%
1160
 
3.5%
954
 
2.9%
945
 
2.9%
920
 
2.8%
856
 
2.6%
797
 
2.4%
766
 
2.3%
Other values (130) 15131
45.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 33125
100.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
4997
 
15.1%
4189
 
12.6%
2410
 
7.3%
1160
 
3.5%
954
 
2.9%
945
 
2.9%
920
 
2.8%
856
 
2.6%
797
 
2.4%
766
 
2.3%
Other values (130) 15131
45.7%

읍면동명
Text

Distinct: 3284
Distinct (%): 33.1%
Missing: 64
Missing (%): 0.6%
Memory size: 156.2 KiB

Length

Max length6
Median length3
Mean length3.3447061
Min length2

Characters and Unicode

Total characters: 33233
Distinct characters: 339
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1045
Unique (%): 10.5%

Sample

1st row장지동
2nd row남면
3rd row팔덕면
4th row평촌동
5th row수송동
ValueCountFrequency (%)
남면 41
 
0.4%
서면 30
 
0.3%
북면 27
 
0.3%
부곡동 20
 
0.2%
금곡동 20
 
0.2%
가산동 18
 
0.2%
조례동 17
 
0.2%
중동 17
 
0.2%
중앙동 16
 
0.2%
여의도동 16
 
0.2%
Other values (3274) 9714
97.8%

Most occurring characters

ValueCountFrequency (%)
6233
 
18.8%
2956
 
8.9%
2 1114
 
3.4%
1 1071
 
3.2%
1043
 
3.1%
797
 
2.4%
546
 
1.6%
3 475
 
1.4%
422
 
1.3%
419
 
1.3%
Other values (329) 18157
54.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 30062
90.5%
Decimal Number 3133
 
9.4%
Other Punctuation 38
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
6233
 
20.7%
2956
 
9.8%
1043
 
3.5%
797
 
2.7%
546
 
1.8%
422
 
1.4%
419
 
1.4%
411
 
1.4%
388
 
1.3%
356
 
1.2%
Other values (317) 16491
54.9%
Decimal Number
ValueCountFrequency (%)
2 1114
35.6%
1 1071
34.2%
3 475
15.2%
4 248
 
7.9%
5 97
 
3.1%
6 61
 
1.9%
7 30
 
1.0%
8 20
 
0.6%
9 13
 
0.4%
0 4
 
0.1%
Other Punctuation
ValueCountFrequency (%)
. 32
84.2%
, 6
 
15.8%

Most occurring scripts

ValueCountFrequency (%)
Hangul 30062
90.5%
Common 3171
 
9.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
6233
 
20.7%
2956
 
9.8%
1043
 
3.5%
797
 
2.7%
546
 
1.8%
422
 
1.4%
419
 
1.4%
411
 
1.4%
388
 
1.3%
356
 
1.2%
Other values (317) 16491
54.9%
Common
ValueCountFrequency (%)
2 1114
35.1%
1 1071
33.8%
3 475
15.0%
4 248
 
7.8%
5 97
 
3.1%
6 61
 
1.9%
. 32
 
1.0%
7 30
 
0.9%
8 20
 
0.6%
9 13
 
0.4%
Other values (2) 10
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 30062
90.5%
ASCII 3171
 
9.5%

Most frequent character per block

Hangul
ValueCountFrequency (%)
6233
 
20.7%
2956
 
9.8%
1043
 
3.5%
797
 
2.7%
546
 
1.8%
422
 
1.4%
419
 
1.4%
411
 
1.4%
388
 
1.3%
356
 
1.2%
Other values (317) 16491
54.9%
ASCII
ValueCountFrequency (%)
2 1114
35.1%
1 1071
33.8%
3 475
15.0%
4 248
 
7.8%
5 97
 
3.1%
6 61
 
1.9%
. 32
 
1.0%
7 30
 
0.9%
8 20
 
0.6%
9 13
 
0.4%
Other values (2) 10
 
0.3%

리명
Text

MISSING 

Distinct: 2640
Distinct (%): 72.8%
Missing: 6372
Missing (%): 63.7%
Memory size: 156.2 KiB
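
The percentages for 리명 appear to use two different denominators: Missing (%) is relative to all 10000 observations, while Distinct (%) and Unique (%) match a calculation over the non-missing values only. The sketch below checks that reading against the figures in this section (the Unique count of 2108 is listed further down); it is an inference from the numbers, not a statement about the profiler's source code.

```python
# Figures reported for 리명 in this report.
n_rows = 10_000
missing = 6_372
distinct = 2_640
unique = 2_108  # values that occur exactly once

non_missing = n_rows - missing

missing_pct = 100 * missing / n_rows          # share of all rows
distinct_pct = 100 * distinct / non_missing   # share of non-missing rows
unique_pct = 100 * unique / non_missing       # share of non-missing rows

print(non_missing, f"{missing_pct:.1f}% {distinct_pct:.1f}% {unique_pct:.1f}%")
```

The same reading explains why 도서명, with only 12 non-missing values that are all different, shows Distinct (%) of 100.0% despite 99.9% missing values.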

Length

Max length5
Median length3
Mean length3.1041896
Min length2

Characters and Unicode

Total characters: 11262
Distinct characters: 330
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2108
Unique (%): 58.1%

Sample

1st row구암리
2nd row용산리
3rd row토성리
4th row정곡리
5th row신정리
ValueCountFrequency (%)
동산리 10
 
0.3%
신흥리 9
 
0.2%
상리 9
 
0.2%
용암리 9
 
0.2%
용두리 9
 
0.2%
용산리 9
 
0.2%
금곡리 9
 
0.2%
남산리 8
 
0.2%
봉산리 8
 
0.2%
읍내리 8
 
0.2%
Other values (2630) 3540
97.6%

Most occurring characters

ValueCountFrequency (%)
3638
32.3%
287
 
2.5%
229
 
2.0%
162
 
1.4%
161
 
1.4%
159
 
1.4%
157
 
1.4%
152
 
1.3%
131
 
1.2%
128
 
1.1%
Other values (320) 6058
53.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 10836
96.2%
Decimal Number 426
 
3.8%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3638
33.6%
287
 
2.6%
229
 
2.1%
162
 
1.5%
161
 
1.5%
159
 
1.5%
157
 
1.4%
152
 
1.4%
131
 
1.2%
128
 
1.2%
Other values (310) 5632
52.0%
Decimal Number
ValueCountFrequency (%)
1 117
27.5%
2 102
23.9%
3 58
13.6%
4 54
12.7%
5 31
 
7.3%
6 19
 
4.5%
7 15
 
3.5%
8 13
 
3.1%
0 9
 
2.1%
9 8
 
1.9%

Most occurring scripts

ValueCountFrequency (%)
Hangul 10836
96.2%
Common 426
 
3.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3638
33.6%
287
 
2.6%
229
 
2.1%
162
 
1.5%
161
 
1.5%
159
 
1.5%
157
 
1.4%
152
 
1.4%
131
 
1.2%
128
 
1.2%
Other values (310) 5632
52.0%
Common
ValueCountFrequency (%)
1 117
27.5%
2 102
23.9%
3 58
13.6%
4 54
12.7%
5 31
 
7.3%
6 19
 
4.5%
7 15
 
3.5%
8 13
 
3.1%
0 9
 
2.1%
9 8
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
Hangul 10836
96.2%
ASCII 426
 
3.8%

Most frequent character per block

Hangul
ValueCountFrequency (%)
3638
33.6%
287
 
2.6%
229
 
2.1%
162
 
1.5%
161
 
1.5%
159
 
1.5%
157
 
1.4%
152
 
1.4%
131
 
1.2%
128
 
1.2%
Other values (310) 5632
52.0%
ASCII
ValueCountFrequency (%)
1 117
27.5%
2 102
23.9%
3 58
13.6%
4 54
12.7%
5 31
 
7.3%
6 19
 
4.5%
7 15
 
3.5%
8 13
 
3.1%
0 9
 
2.1%
9 8
 
1.9%

도서명
Text

MISSING 

Distinct12
Distinct (%)100.0%
Missing9988
Missing (%)99.9%
Memory size156.2 KiB

Length

Max length3
Median length3
Mean length2.5833333
Min length2

Characters and Unicode

Total characters: 31
Distinct characters: 19
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 12
Unique (%): 100.0%

Sample

1st row율도
2nd row백야도
3rd row비안도
4th row내도
5th row조도
ValueCountFrequency (%)
율도 1
8.3%
백야도 1
8.3%
비안도 1
8.3%
내도 1
8.3%
조도 1
8.3%
서화도 1
8.3%
분점도 1
8.3%
소도 1
8.3%
서리도 1
8.3%
가우도 1
8.3%
Other values (2) 2
16.7%

Most occurring characters

ValueCountFrequency (%)
12
38.7%
2
 
6.5%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
Other values (9) 9
29.0%

Most occurring categories

ValueCountFrequency (%)
Other Letter 31
100.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
12
38.7%
2
 
6.5%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
Other values (9) 9
29.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 31
100.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
12
38.7%
2
 
6.5%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
Other values (9) 9
29.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 31
100.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
12
38.7%
2
 
6.5%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
Other values (9) 9
29.0%

산번지
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
<NA>
9879 
 
121

Length

Max length4
Median length4
Mean length3.9637
Min length1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row<NA>
2nd row<NA>
3rd row<NA>
4th row<NA>
5th row<NA>

Common Values

ValueCountFrequency (%)
<NA> 9879
98.8%
121
 
1.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
na 9879
98.8%
121
 
1.2%

시작주번지
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct1127
Distinct (%)39.2%
Missing7123
Missing (%)71.2%
Infinite0
Infinite (%)0.0%
Mean554.26521
Minimum1
Maximum9900
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size166.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 118
Median: 373
Q3: 746
95th percentile: 1542.2
Maximum: 9900
Range: 9899
Interquartile range (IQR): 628

Descriptive statistics

Standard deviation765.34125
Coefficient of variation (CV)1.3808214
Kurtosis51.305023
Mean554.26521
Median Absolute Deviation (MAD)281
Skewness5.7260477
Sum1594621
Variance585747.22
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 251
 
2.5%
300 18
 
0.2%
94 17
 
0.2%
301 16
 
0.2%
500 16
 
0.2%
101 15
 
0.1%
100 15
 
0.1%
200 14
 
0.1%
400 14
 
0.1%
600 13
 
0.1%
Other values (1117) 2488
 
24.9%
(Missing) 7123
71.2%
ValueCountFrequency (%)
1 251
2.5%
2 4
 
< 0.1%
3 6
 
0.1%
4 4
 
< 0.1%
5 9
 
0.1%
6 3
 
< 0.1%
7 3
 
< 0.1%
8 2
 
< 0.1%
9 1
 
< 0.1%
10 2
 
< 0.1%
ValueCountFrequency (%)
9900 1
< 0.1%
9800 1
< 0.1%
9300 1
< 0.1%
9100 1
< 0.1%
9000 1
< 0.1%
8300 1
< 0.1%
8000 1
< 0.1%
7343 1
< 0.1%
7314 1
< 0.1%
7288 1
< 0.1%

시작부번지
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct112
Distinct (%)33.1%
Missing9662
Missing (%)96.6%
Infinite0
Infinite (%)0.0%
Mean124.3432
Minimum1
Maximum4600
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size166.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 4
Median: 16
Q3: 41
95th percentile: 638.35
Maximum: 4600
Range: 4599
Interquartile range (IQR): 37

Descriptive statistics

Standard deviation417.41887
Coefficient of variation (CV)3.3569901
Kurtosis52.403146
Mean124.3432
Median Absolute Deviation (MAD)14
Skewness6.4581301
Sum42028
Variance174238.52
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 45
 
0.4%
2 19
 
0.2%
3 14
 
0.1%
4 13
 
0.1%
7 11
 
0.1%
11 10
 
0.1%
16 9
 
0.1%
10 8
 
0.1%
18 8
 
0.1%
5 7
 
0.1%
Other values (102) 194
 
1.9%
(Missing) 9662
96.6%
ValueCountFrequency (%)
1 45
0.4%
2 19
0.2%
3 14
 
0.1%
4 13
 
0.1%
5 7
 
0.1%
6 5
 
0.1%
7 11
 
0.1%
8 5
 
0.1%
9 6
 
0.1%
10 8
 
0.1%
ValueCountFrequency (%)
4600 1
< 0.1%
3271 1
< 0.1%
1898 1
< 0.1%
1894 1
< 0.1%
1781 1
< 0.1%
1780 1
< 0.1%
1768 1
< 0.1%
1676 1
< 0.1%
1589 1
< 0.1%
1342 1
< 0.1%

종료주번지
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct1127
Distinct (%)39.2%
Missing7123
Missing (%)71.2%
Infinite0
Infinite (%)0.0%
Mean554.26521
Minimum1
Maximum9900
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size166.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 118
Median: 373
Q3: 746
95th percentile: 1542.2
Maximum: 9900
Range: 9899
Interquartile range (IQR): 628

Descriptive statistics

Standard deviation765.34125
Coefficient of variation (CV)1.3808214
Kurtosis51.305023
Mean554.26521
Median Absolute Deviation (MAD)281
Skewness5.7260477
Sum1594621
Variance585747.22
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 251
 
2.5%
300 18
 
0.2%
94 17
 
0.2%
301 16
 
0.2%
500 16
 
0.2%
101 15
 
0.1%
100 15
 
0.1%
200 14
 
0.1%
400 14
 
0.1%
600 13
 
0.1%
Other values (1117) 2488
 
24.9%
(Missing) 7123
71.2%
ValueCountFrequency (%)
1 251
2.5%
2 4
 
< 0.1%
3 6
 
0.1%
4 4
 
< 0.1%
5 9
 
0.1%
6 3
 
< 0.1%
7 3
 
< 0.1%
8 2
 
< 0.1%
9 1
 
< 0.1%
10 2
 
< 0.1%
ValueCountFrequency (%)
9900 1
< 0.1%
9800 1
< 0.1%
9300 1
< 0.1%
9100 1
< 0.1%
9000 1
< 0.1%
8300 1
< 0.1%
8000 1
< 0.1%
7343 1
< 0.1%
7314 1
< 0.1%
7288 1
< 0.1%

종료부번지
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct112
Distinct (%)33.1%
Missing9662
Missing (%)96.6%
Infinite0
Infinite (%)0.0%
Mean124.3432
Minimum1
Maximum4600
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size166.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 4
Median: 16
Q3: 41
95th percentile: 638.35
Maximum: 4600
Range: 4599
Interquartile range (IQR): 37

Descriptive statistics

Standard deviation417.41887
Coefficient of variation (CV)3.3569901
Kurtosis52.403146
Mean124.3432
Median Absolute Deviation (MAD)14
Skewness6.4581301
Sum42028
Variance174238.52
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 45
 
0.4%
2 19
 
0.2%
3 14
 
0.1%
4 13
 
0.1%
7 11
 
0.1%
11 10
 
0.1%
16 9
 
0.1%
10 8
 
0.1%
18 8
 
0.1%
5 7
 
0.1%
Other values (102) 194
 
1.9%
(Missing) 9662
96.6%
ValueCountFrequency (%)
1 45
0.4%
2 19
0.2%
3 14
 
0.1%
4 13
 
0.1%
5 7
 
0.1%
6 5
 
0.1%
7 11
 
0.1%
8 5
 
0.1%
9 6
 
0.1%
10 8
 
0.1%
ValueCountFrequency (%)
4600 1
< 0.1%
3271 1
< 0.1%
1898 1
< 0.1%
1894 1
< 0.1%
1781 1
< 0.1%
1780 1
< 0.1%
1768 1
< 0.1%
1676 1
< 0.1%
1589 1
< 0.1%
1342 1
< 0.1%

아파트건물명
Text

MISSING 

Distinct2239
Distinct (%)78.5%
Missing7146
Missing (%)71.5%
Memory size156.2 KiB

Length

Max length19
Median length17
Mean length7.3058865
Min length2

Characters and Unicode

Total characters: 20851
Distinct characters: 538
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2075
Unique (%): 72.7%

Sample

1st row동신아파트
2nd row거제자이아파트
3rd row일산서구청
4th row태성그린시티
5th row화성아파트
ValueCountFrequency (%)
사서함 133
 
4.7%
주공아파트 46
 
1.6%
현대아파트 33
 
1.2%
서울중앙우체국사서함 22
 
0.8%
우성아파트 13
 
0.5%
삼성아파트 11
 
0.4%
현대2차아파트 11
 
0.4%
신동아아파트 10
 
0.4%
한신아파트 10
 
0.4%
청구아파트 10
 
0.4%
Other values (2229) 2555
89.5%

Most occurring characters

ValueCountFrequency (%)
1889
 
9.1%
1777
 
8.5%
1764
 
8.5%
525
 
2.5%
410
 
2.0%
384
 
1.8%
368
 
1.8%
299
 
1.4%
285
 
1.4%
281
 
1.3%
Other values (528) 12869
61.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 19880
95.3%
Decimal Number 703
 
3.4%
Uppercase Letter 174
 
0.8%
Open Punctuation 29
 
0.1%
Close Punctuation 29
 
0.1%
Lowercase Letter 21
 
0.1%
Dash Punctuation 10
 
< 0.1%
Other Punctuation 2
 
< 0.1%
Letter Number 2
 
< 0.1%
Other Number 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1889
 
9.5%
1777
 
8.9%
1764
 
8.9%
525
 
2.6%
410
 
2.1%
384
 
1.9%
368
 
1.9%
299
 
1.5%
285
 
1.4%
281
 
1.4%
Other values (486) 11898
59.8%
Uppercase Letter
ValueCountFrequency (%)
K 27
15.5%
S 26
14.9%
C 21
12.1%
T 16
9.2%
G 16
9.2%
L 13
7.5%
B 11
6.3%
M 6
 
3.4%
I 6
 
3.4%
E 5
 
2.9%
Other values (11) 27
15.5%
Decimal Number
ValueCountFrequency (%)
1 206
29.3%
2 199
28.3%
3 97
13.8%
5 45
 
6.4%
4 43
 
6.1%
6 33
 
4.7%
7 28
 
4.0%
9 22
 
3.1%
8 22
 
3.1%
0 8
 
1.1%
Lowercase Letter
ValueCountFrequency (%)
e 19
90.5%
w 1
 
4.8%
i 1
 
4.8%
Other Punctuation
ValueCountFrequency (%)
& 1
50.0%
' 1
50.0%
Letter Number
ValueCountFrequency (%)
1
50.0%
1
50.0%
Open Punctuation
ValueCountFrequency (%)
( 29
100.0%
Close Punctuation
ValueCountFrequency (%)
) 29
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 10
100.0%
Other Number
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 19880
95.3%
Common 774
 
3.7%
Latin 197
 
0.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1889
 
9.5%
1777
 
8.9%
1764
 
8.9%
525
 
2.6%
410
 
2.1%
384
 
1.9%
368
 
1.9%
299
 
1.5%
285
 
1.4%
281
 
1.4%
Other values (486) 11898
59.8%
Latin
ValueCountFrequency (%)
K 27
13.7%
S 26
13.2%
C 21
10.7%
e 19
9.6%
T 16
8.1%
G 16
8.1%
L 13
 
6.6%
B 11
 
5.6%
M 6
 
3.0%
I 6
 
3.0%
Other values (16) 36
18.3%
Common
ValueCountFrequency (%)
1 206
26.6%
2 199
25.7%
3 97
12.5%
5 45
 
5.8%
4 43
 
5.6%
6 33
 
4.3%
( 29
 
3.7%
) 29
 
3.7%
7 28
 
3.6%
9 22
 
2.8%
Other values (6) 43
 
5.6%

Most occurring blocks

ValueCountFrequency (%)
Hangul 19880
95.3%
ASCII 968
 
4.6%
Number Forms 2
 
< 0.1%
Enclosed Alphanum 1
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1889
 
9.5%
1777
 
8.9%
1764
 
8.9%
525
 
2.6%
410
 
2.1%
384
 
1.9%
368
 
1.9%
299
 
1.5%
285
 
1.4%
281
 
1.4%
Other values (486) 11898
59.8%
ASCII
ValueCountFrequency (%)
1 206
21.3%
2 199
20.6%
3 97
10.0%
5 45
 
4.6%
4 43
 
4.4%
6 33
 
3.4%
( 29
 
3.0%
) 29
 
3.0%
7 28
 
2.9%
K 27
 
2.8%
Other values (29) 232
24.0%
Enclosed Alphanum
ValueCountFrequency (%)
1
100.0%
Number Forms
ValueCountFrequency (%)
1
50.0%
1
50.0%

시작동번호
Text

MISSING 

Distinct88
Distinct (%)7.4%
Missing8815
Missing (%)88.1%
Memory size156.2 KiB

Length

Max length4
Median length3
Mean length2.9696203
Min length1

Characters and Unicode

Total characters: 3519
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 59
Unique (%): 5.0%

Sample

1st row101
2nd row101
3rd row101
4th row1
5th row101
ValueCountFrequency (%)
101 535
45.1%
201 160
 
13.5%
301 93
 
7.8%
501 53
 
4.5%
401 46
 
3.9%
601 39
 
3.3%
1 36
 
3.0%
801 29
 
2.4%
701 27
 
2.3%
901 22
 
1.9%
Other values (78) 145
 
12.2%

Most occurring characters

ValueCountFrequency (%)
1 1777
50.5%
0 1102
31.3%
2 216
 
6.1%
3 136
 
3.9%
5 68
 
1.9%
4 60
 
1.7%
6 50
 
1.4%
7 37
 
1.1%
8 34
 
1.0%
9 26
 
0.7%
Other values (5) 13
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3506
99.6%
Uppercase Letter 8
 
0.2%
Other Letter 5
 
0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1777
50.7%
0 1102
31.4%
2 216
 
6.2%
3 136
 
3.9%
5 68
 
1.9%
4 60
 
1.7%
6 50
 
1.4%
7 37
 
1.1%
8 34
 
1.0%
9 26
 
0.7%
Uppercase Letter
ValueCountFrequency (%)
A 5
62.5%
G 1
 
12.5%
D 1
 
12.5%
E 1
 
12.5%
Other Letter
ValueCountFrequency (%)
5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3506
99.6%
Latin 8
 
0.2%
Hangul 5
 
0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1777
50.7%
0 1102
31.4%
2 216
 
6.2%
3 136
 
3.9%
5 68
 
1.9%
4 60
 
1.7%
6 50
 
1.4%
7 37
 
1.1%
8 34
 
1.0%
9 26
 
0.7%
Latin
ValueCountFrequency (%)
A 5
62.5%
G 1
 
12.5%
D 1
 
12.5%
E 1
 
12.5%
Hangul
ValueCountFrequency (%)
5
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3514
99.9%
Hangul 5
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1777
50.6%
0 1102
31.4%
2 216
 
6.1%
3 136
 
3.9%
5 68
 
1.9%
4 60
 
1.7%
6 50
 
1.4%
7 37
 
1.1%
8 34
 
1.0%
9 26
 
0.7%
Other values (4) 8
 
0.2%
Hangul
ValueCountFrequency (%)
5
100.0%

종료동번호
Text

MISSING 

Distinct297
Distinct (%)25.4%
Missing8830
Missing (%)88.3%
Memory size156.2 KiB

Length

Max length4
Median length3
Mean length2.9974359
Min length1

Characters and Unicode

Total characters: 3507
Distinct characters: 16
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 161
Unique (%): 13.8%

Sample

1st row106
2nd row115
3rd row104
4th row16
5th row103
ValueCountFrequency (%)
104 50
 
4.3%
108 50
 
4.3%
106 49
 
4.2%
105 45
 
3.8%
107 42
 
3.6%
103 38
 
3.2%
110 34
 
2.9%
109 33
 
2.8%
206 27
 
2.3%
111 26
 
2.2%
Other values (287) 776
66.3%

Most occurring characters

ValueCountFrequency (%)
1 1016
29.0%
0 793
22.6%
2 390
 
11.1%
3 276
 
7.9%
4 204
 
5.8%
5 204
 
5.8%
6 181
 
5.2%
8 154
 
4.4%
7 152
 
4.3%
9 128
 
3.6%
Other values (6) 9
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3498
99.7%
Uppercase Letter 5
 
0.1%
Other Letter 4
 
0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1016
29.0%
0 793
22.7%
2 390
 
11.1%
3 276
 
7.9%
4 204
 
5.8%
5 204
 
5.8%
6 181
 
5.2%
8 154
 
4.4%
7 152
 
4.3%
9 128
 
3.7%
Uppercase Letter
ValueCountFrequency (%)
B 3
60.0%
F 1
 
20.0%
E 1
 
20.0%
Other Letter
ValueCountFrequency (%)
2
50.0%
1
25.0%
1
25.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3498
99.7%
Latin 5
 
0.1%
Hangul 4
 
0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1016
29.0%
0 793
22.7%
2 390
 
11.1%
3 276
 
7.9%
4 204
 
5.8%
5 204
 
5.8%
6 181
 
5.2%
8 154
 
4.4%
7 152
 
4.3%
9 128
 
3.7%
Latin
ValueCountFrequency (%)
B 3
60.0%
F 1
 
20.0%
E 1
 
20.0%
Hangul
ValueCountFrequency (%)
2
50.0%
1
25.0%
1
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3503
99.9%
Hangul 4
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1016
29.0%
0 793
22.6%
2 390
 
11.1%
3 276
 
7.9%
4 204
 
5.8%
5 204
 
5.8%
6 181
 
5.2%
8 154
 
4.4%
7 152
 
4.3%
9 128
 
3.7%
Other values (3) 5
 
0.1%
Hangul
ValueCountFrequency (%)
2
50.0%
1
25.0%
1
25.0%

Interactions

[Figures omitted: 25 pairwise interaction plots. No captions or axis labels survive in this text.]

Correlations

 | 우편번호 | 시도명 | 도서명 | 시작주번지 | 시작부번지 | 종료주번지 | 종료부번지 | 시작동번호
우편번호 | 1.000 | 0.969 | 1.000 | 0.197 | 0.000 | 0.197 | 0.000 | 0.000
시도명 | 0.969 | 1.000 | 1.000 | 0.168 | 0.677 | 0.168 | 0.677 | 0.000
도서명 | 1.000 | 1.000 | 1.000 | NaN | NaN | NaN | NaN | NaN
시작주번지 | 0.197 | 0.168 | NaN | 1.000 | 0.765 | 1.000 | 0.765 | 0.000
시작부번지 | 0.000 | 0.677 | NaN | 0.765 | 1.000 | 0.765 | 1.000 | NaN
종료주번지 | 0.197 | 0.168 | NaN | 1.000 | 0.765 | 1.000 | 0.765 | 0.000
종료부번지 | 0.000 | 0.677 | NaN | 0.765 | 1.000 | 0.765 | 1.000 | NaN
시작동번호 | 0.000 | 0.000 | NaN | 0.000 | NaN | 0.000 | NaN | 1.000

 | 시도명 | 산번지
시도명 | 1.000 | 1.000
산번지 | 1.000 | 1.000

 | 우편번호 | 시작주번지 | 시작부번지 | 종료주번지 | 종료부번지 | 시도명 | 산번지
우편번호 | 1.000 | 0.100 | -0.026 | 0.100 | -0.026 | 0.858 | 1.000
시작주번지 | 0.100 | 1.000 | 0.105 | 1.000 | 0.105 | 0.066 | 1.000
시작부번지 | -0.026 | 0.105 | 1.000 | 0.105 | 1.000 | 0.393 | 1.000
종료주번지 | 0.100 | 1.000 | 0.105 | 1.000 | 0.105 | 0.066 | 1.000
종료부번지 | -0.026 | 0.105 | 1.000 | 0.105 | 1.000 | 0.393 | 1.000
시도명 | 0.858 | 0.066 | 0.393 | 0.066 | 0.393 | 1.000 | 1.000
산번지 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
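
Nullity correlation can be illustrated as the Pearson correlation between "is present" indicators of two columns. The sketch below is a plain-Python illustration under that assumption; `nullity_correlation` is a hypothetical helper, and the three short columns are toy data, not rows taken from this dataset.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators (1 = present, 0 = missing)."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is always present or always missing has no defined correlation.
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical toy columns (None marks a missing value).
start_lot = [864, None, None, 1, None]
end_lot = [864, None, None, 1, None]
building = [None, None, "A", None, "B"]

together = nullity_correlation(start_lot, end_lot)   # missing in the same rows
opposed = nullity_correlation(start_lot, building)   # mostly present in different rows
print(together, opposed)
```

A value near 1 means two columns tend to be present or missing in the same rows, which is the pattern suggested by the identical missing counts of 시작주번지 and 종료주번지 (7123 each); a negative value means one column tends to be filled where the other is empty.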

Sample

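(index) | 우편번호 | 시도명 | 시군구명 | 읍면동명 | 리명 | 도서명 | 산번지 | 시작주번지 | 시작부번지 | 종료주번지 | 종료부번지 | 아파트건물명 | 시작동번호 | 종료동번호
4619 | 138926 | 서울특별시 | 송파구 | 장지동 | <NA> | <NA> | <NA> | 864 | <NA> | 864 | <NA> | <NA> | <NA> | <NA>
27798 | 482872 | 경기도 | 양주시 | 남면 | 구암리 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
36282 | 595861 | 전라북도 | 순창군 | 팔덕면 | 용산리 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
15447 | 360186 | 충청북도 | 청주시상당구 | 평촌동 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
35099 | 573799 | 전라북도 | 군산시 | 수송동 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 동신아파트 | 101 | 106
52737 | 791873 | 경상북도 | 포항시북구 | 신광면 | 토성리 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
42202 | 656707 | 경상남도 | 거제시 | 수월동 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 거제자이아파트 | 101 | 115
40320 | 621914 | 경상남도 | 김해시 | 안동 | <NA> | <NA> | <NA> | 1 | <NA> | 1 | <NA> | <NA> | <NA> | <NA>
19549 | 411702 | 경기도 | 고양시일산서구 | 대화동 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 일산서구청 | <NA> | <NA>
40627 | 627873 | 경상남도 | 밀양시 | 무안면 | 정곡리 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>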
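
(index) | 우편번호 | 시도명 | 시군구명 | 읍면동명 | 리명 | 도서명 | 산번지 | 시작주번지 | 시작부번지 | 종료주번지 | 종료부번지 | 아파트건물명 | 시작동번호 | 종료동번호
39725 | 617757 | 부산광역시 | 사상구 | 모라1동 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 우성아파트 | 101 | 108
31517 | 534901 | 전라남도 | 무안군 | 일로읍 | 용산리 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
32764 | 542840 | 전라남도 | 구례군 | 광의면 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
26566 | 472721 | 경기도 | 남양주시 | 별내면 | 청학리 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 거성아파트 | <NA> | <NA>
32439 | 540781 | 전라남도 | 순천시 | 조례동 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 왕지현대1차아파트 | 101 | 104
36308 | 595881 | 전라북도 | 순창군 | 복흥면 | 대방리 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
49870 | 745804 | 경상북도 | 문경시 | 문경읍 | 상리 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
14747 | 350811 | 충청남도 | 홍성군 | 홍동면 | 월현리 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
48868 | 719862 | 경상북도 | 성주군 | 대가면 | 옥련리 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
15045 | 355910 | 충청남도 | 보령시 | 성주면 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>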