Overview

Dataset statistics

Number of variables: 4
Number of observations: 8135
Missing cells: 6
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 278.2 KiB
Average record size in memory: 35.0 B
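
The missing-cell percentage and the average record size follow from the counts above. A minimal sketch of that arithmetic, assuming the total cell count is variables times observations and that 1 KiB is 1024 bytes (the variable names are illustrative and not part of the report):

```python
# Values copied from the dataset statistics above.
n_variables = 4
n_observations = 8135
missing_cells = 6
total_size_kib = 278.2

# Assumption: every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.3f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

The missing share works out to well under 0.1%, which is why the report shows "< 0.1%" rather than a rounded figure.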

Variable types

Numeric: 3
Text: 1

Dataset

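Description: Postal code data from the system of Gyeongnam Provincial Geochang College (경상남도 도립거창대학교). It contains the columns 우편번호1 (postal code 1), 우편번호2 (postal code 2), 주소 (address), and 시도 (province or metropolitan city).
URL: https://www.data.go.kr/data/15049415/fileData.do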

Alerts

우편번호1 is highly overall correlated with 시도 (High correlation)
시도 is highly overall correlated with 우편번호1 (High correlation)

Reproduction

Analysis started: 2023-12-12 00:56:23.415897
Analysis finished: 2023-12-12 00:56:25.636199
Duration: 2.22 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
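
A report of this kind can be regenerated roughly as sketched below. This is a sketch under stated assumptions, not the configuration actually used: the function name `build_profile`, the CSV path, the `cp949` encoding, and the report title are all hypothetical, and the settings in config.json are not reproduced here.

```python
from pathlib import Path


def build_profile(csv_path: str,
                  output_html: str = "report.html",
                  encoding: str = "cp949") -> Path:
    """Profile a CSV file and write the report as HTML.

    The default encoding is an assumption; adjust it to match the file.
    """
    # Imported lazily so that defining this function does not require
    # the third-party packages to be installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    profile = ProfileReport(df, title="Postal code data profile")
    profile.to_file(output_html)
    return Path(output_html)
```

Calling `build_profile("postal_codes.csv")` with a local copy of the file (hypothetical name) would write `report.html` to the working directory.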

Variables

우편번호1
Real number (ℝ)

HIGH CORRELATION 

Distinct: 247
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 422.15157
Minimum: 100
Maximum: 799
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 71.6 KiB

Quantile statistics

Minimum: 100
5-th percentile: 110
Q1: 158
Median: 440
Q3: 608
95-th percentile: 742
Maximum: 799
Range: 699
Interquartile range (IQR): 450

Descriptive statistics

Standard deviation: 215.07366
Coefficient of variation (CV): 0.50947025
Kurtosis: -1.3269863
Mean: 422.15157
Median Absolute Deviation (MAD): 191
Skewness: -0.092199275
Sum: 3434203
Variance: 46256.681
Monotonicity: Not monotonic
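
Several of the figures for 우편번호1 are derived from one another. The sketch below recomputes them from the reported values as a consistency check; it assumes the usual definitions (CV as standard deviation over mean, variance as the squared standard deviation, IQR as Q3 minus Q1), and small differences come from the rounding of the printed inputs.

```python
# Values copied from the 우편번호1 statistics in this report.
std = 215.07366
total = 3434203
n_observations = 8135
q1, q3 = 158, 608
minimum, maximum = 100, 799

mean = total / n_observations   # reported: 422.15157
cv = std / mean                 # reported: 0.50947025
variance = std ** 2             # reported: 46256.681
value_range = maximum - minimum # reported: 699
iqr = q3 - q1                   # reported: 450

print(f"mean={mean:.5f} cv={cv:.8f} variance={variance:.3f}")
print(f"range={value_range} iqr={iqr}")
```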
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
100                 237    2.9%
110                 178    2.2%
135                 156    1.9%
138                 131    1.6%
150                 116    1.4%
137                 109    1.3%
121                 99     1.2%
157                 98     1.2%
139                 90     1.1%
506                 88     1.1%
Other values (237)  6833   84.0%

Value  Count  Frequency (%)
100    237    2.9%
110    178    2.2%
120    68     0.8%
121    99     1.2%
122    38     0.5%
130    60     0.7%
131    49     0.6%
132    58     0.7%
133    48     0.6%
134    65     0.8%

Value  Count  Frequency (%)
799    4      < 0.1%
791    36     0.4%
790    31     0.4%
780    65     0.8%
770    43     0.5%
769    19     0.2%
767    10     0.1%
766    9      0.1%
764    6      0.1%
763    9      0.1%

우편번호2
Real number (ℝ)

Distinct: 620
Distinct (%): 7.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 429.37664
Minimum: 10
Maximum: 990
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 71.6 KiB

Quantile statistics

Minimum: 10
5-th percentile: 20
Q1: 101
Median: 342
Q3: 753
95-th percentile: 890
Maximum: 990
Range: 980
Interquartile range (IQR): 652

Descriptive statistics

Standard deviation: 327.00454
Coefficient of variation (CV): 0.76157972
Kurtosis: -1.6675378
Mean: 429.37664
Median Absolute Deviation (MAD): 302
Skewness: 0.12385224
Sum: 3492979
Variance: 106931.97
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
830                 130    1.6%
820                 130    1.6%
810                 129    1.6%
840                 126    1.5%
850                 120    1.5%
701                 117    1.4%
600                 114    1.4%
50                  111    1.4%
30                  108    1.3%
860                 108    1.3%
Other values (610)  6942   85.3%

Value  Count  Frequency (%)
10     107    1.3%
11     51     0.6%
12     52     0.6%
13     37     0.5%
14     23     0.3%
15     14     0.2%
16     9      0.1%
17     5      0.1%
18     3      < 0.1%
19     5      0.1%

Value  Count  Frequency (%)
990    1      < 0.1%
980    5      0.1%
970    11     0.1%
965    1      < 0.1%
960    14     0.2%
955    2      < 0.1%
950    23     0.3%
945    3      < 0.1%
940    34     0.4%
935    2      < 0.1%

주소
Text

Distinct: 7863
Distinct (%): 96.7%
Missing: 0
Missing (%): 0.0%
Memory size: 63.7 KiB

Length

Max length: 30
Median length: 29
Mean length: 12.677566
Min length: 8

Characters and Unicode

Total characters: 103132
Distinct characters: 483
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7758
Unique (%): 95.4%
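
The report gives both a "Distinct" count (7863) and a "Unique" count (7758) for 주소. Assuming "Distinct" counts the different values present and "Unique" counts the values that occur exactly once, the gap of 105 is the number of addresses that appear more than once. A minimal sketch of the two counts on toy values (the list is illustrative and not taken from the dataset):

```python
from collections import Counter

# Toy values for illustration only; not rows from the dataset.
values = ["A", "B", "B", "C", "C", "C", "D"]

counts = Counter(values)

# Distinct: how many different values occur at all.
n_distinct = len(counts)

# Unique: how many values occur exactly once.
n_unique = sum(1 for c in counts.values() if c == 1)

# Values that repeat are distinct but not unique.
n_repeated = n_distinct - n_unique

print(n_distinct, n_unique, n_repeated)
```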

Sample

1st row: 서울 동대문구 청량리1동 미주아파트
2nd row: 서울 동대문구 이문2동 한국외국어대학교
3rd row: 서울 동대문구 회기동 신현대아파트
4th row: 서울 중랑구 중화동
5th row: 서울 중랑구 중화1동
Value                Count  Frequency (%)
서울                 2070   7.5%
경기                 926    3.4%
경남                 737    2.7%
경북                 628    2.3%
부산                 502    1.8%
전남                 469    1.7%
전북                 461    1.7%
강원도               386    1.4%
중구                 347    1.3%
대구                 335    1.2%
Other values (6874)  20622  75.0%

Most occurring characters

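Value               Count  Frequency (%)
(space)             19793  19.2%
(not preserved)     7184   7.0%
(not preserved)     5334   5.2%
(not preserved)     3518   3.4%
(not preserved)     3516   3.4%
(not preserved)     2571   2.5%
(not preserved)     2527   2.5%
(not preserved)     2399   2.3%
(not preserved)     2089   2.0%
(not preserved)     1872   1.8%
Other values (473)  52329  50.7%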

Most occurring categories

Value              Count  Frequency (%)
Other Letter       80189  77.8%
Space Separator    19793  19.2%
Decimal Number     2892   2.8%
Open Punctuation   80     0.1%
Close Punctuation  80     0.1%
Uppercase Letter   55     0.1%
Other Punctuation  25     < 0.1%
Dash Punctuation   18     < 0.1%
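
The category names above appear to match Unicode general categories (for example, Other Letter is `Lo`, Space Separator is `Zs`, Decimal Number is `Nd`). As a sketch of how such a breakdown can be obtained, the example below counts general categories for the first sample row of 주소 using Python's standard `unicodedata` module; that the report computes its counts in this way is an assumption.

```python
import unicodedata
from collections import Counter

# First sample row of the 주소 column, as listed in this report.
text = "서울 동대문구 청량리1동 미주아파트"

# unicodedata.category returns the two-letter general category code,
# e.g. "Lo" (Other Letter), "Zs" (Space Separator), "Nd" (Decimal Number).
category_counts = Counter(unicodedata.category(ch) for ch in text)

for code, count in category_counts.most_common():
    print(code, count)
```

For this row the Hangul syllables fall under `Lo`, the spaces under `Zs`, and the digit under `Nd`, which mirrors the ordering of the top three categories in the table.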

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
(not preserved)     7184   9.0%
(not preserved)     5334   6.7%
(not preserved)     3518   4.4%
(not preserved)     3516   4.4%
(not preserved)     2571   3.2%
(not preserved)     2527   3.2%
(not preserved)     2399   3.0%
(not preserved)     2089   2.6%
(not preserved)     1872   2.3%
(not preserved)     1610   2.0%
Other values (441)  47569  59.3%

Uppercase Letter
Value             Count  Frequency (%)
L                 17     30.9%
G                 16     29.1%
B                 3      5.5%
C                 3      5.5%
A                 3      5.5%
M                 2      3.6%
K                 2      3.6%
Y                 2      3.6%
I                 1      1.8%
P                 1      1.8%
Other values (5)  5      9.1%

Decimal Number
Value  Count  Frequency (%)
1      905    31.3%
2      868    30.0%
3      485    16.8%
4      228    7.9%
5      118    4.1%
6      85     2.9%
0      62     2.1%
7      58     2.0%
9      50     1.7%
8      33     1.1%

Other Punctuation
Value  Count  Frequency (%)
,      18     72.0%
.      6      24.0%
·      1      4.0%

Space Separator
Value    Count  Frequency (%)
(space)  19793  100.0%

Open Punctuation
Value  Count  Frequency (%)
(      80     100.0%

Close Punctuation
Value  Count  Frequency (%)
)      80     100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      18     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  80189  77.8%
Common  22888  22.2%
Latin   55     0.1%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
(not preserved)     7184   9.0%
(not preserved)     5334   6.7%
(not preserved)     3518   4.4%
(not preserved)     3516   4.4%
(not preserved)     2571   3.2%
(not preserved)     2527   3.2%
(not preserved)     2399   3.0%
(not preserved)     2089   2.6%
(not preserved)     1872   2.3%
(not preserved)     1610   2.0%
Other values (441)  47569  59.3%

Common
Value             Count  Frequency (%)
(space)           19793  86.5%
1                 905    4.0%
2                 868    3.8%
3                 485    2.1%
4                 228    1.0%
5                 118    0.5%
6                 85     0.4%
(                 80     0.3%
)                 80     0.3%
0                 62     0.3%
Other values (7)  184    0.8%

Latin
Value             Count  Frequency (%)
L                 17     30.9%
G                 16     29.1%
B                 3      5.5%
C                 3      5.5%
A                 3      5.5%
M                 2      3.6%
K                 2      3.6%
Y                 2      3.6%
I                 1      1.8%
P                 1      1.8%
Other values (5)  5      9.1%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  80189  77.8%
ASCII   22942  22.2%
None    1      < 0.1%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
(space)            19793  86.3%
1                  905    3.9%
2                  868    3.8%
3                  485    2.1%
4                  228    1.0%
5                  118    0.5%
6                  85     0.4%
(                  80     0.3%
)                  80     0.3%
0                  62     0.3%
Other values (21)  238    1.0%

Hangul
Value               Count  Frequency (%)
(not preserved)     7184   9.0%
(not preserved)     5334   6.7%
(not preserved)     3518   4.4%
(not preserved)     3516   4.4%
(not preserved)     2571   3.2%
(not preserved)     2527   3.2%
(not preserved)     2399   3.0%
(not preserved)     2089   2.6%
(not preserved)     1872   2.3%
(not preserved)     1610   2.0%
Other values (441)  47569  59.3%

None
Value  Count  Frequency (%)
·      1      100.0%

시도
Real number (ℝ)

HIGH CORRELATION 

Distinct: 15
Distinct (%): 0.2%
Missing: 6
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.8576701
Minimum: 1
Maximum: 19
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 71.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 11
Q3: 15
95-th percentile: 18
Maximum: 19
Range: 18
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 6.5799737
Coefficient of variation (CV): 0.74285604
Kurtosis: -1.6459129
Mean: 8.8576701
Median Absolute Deviation (MAD): 6
Skewness: 0.025285003
Sum: 72004
Variance: 43.296054
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)
Value             Count  Frequency (%)
1                 2070   25.4%
11                925    11.4%
17                732    9.0%
18                628    7.7%
2                 502    6.2%
15                469    5.8%
16                461    5.7%
12                386    4.7%
3                 335    4.1%
13                322    4.0%
Other values (5)  1299   16.0%

Value  Count  Frequency (%)
1      2070   25.4%
2      502    6.2%
3      335    4.1%
4      304    3.7%
5      297    3.7%
6      280    3.4%
11     925    11.4%
12     386    4.7%
13     322    4.0%
14     310    3.8%

Value  Count  Frequency (%)
19     108    1.3%
18     628    7.7%
17     732    9.0%
16     461    5.7%
15     469    5.8%
14     310    3.8%
13     322    4.0%
12     386    4.7%
11     925    11.4%
6      280    3.4%

Interactions

[Interaction plots: nine panels; images not preserved]

Correlations

           우편번호1  우편번호2  시도
우편번호1  1.000      0.544      0.903
우편번호2  0.544      1.000      0.454
시도       0.903      0.454      1.000

           우편번호1  우편번호2  시도
우편번호1  1.000      -0.021     0.740
우편번호2  -0.021     1.000      0.141
시도       0.740      0.141      1.000
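
The report does not say which coefficient each matrix uses. As a rough illustration of how one such coefficient is computed, the sketch below applies Pearson's r (an assumption, implemented by hand) to the 14 rows in this report's Sample section that have a non-missing 시도. The result is not comparable to the matrix values, which are based on all 8135 rows.

```python
import math

# (우편번호1, 시도) pairs from the sample rows with a non-missing 시도:
# three rows with 130, seven with 131 (all 시도 = 1), four with 799 (시도 = 18).
pairs = [(130, 1)] * 3 + [(131, 1)] * 7 + [(799, 18)] * 4


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


xs = [p[0] for p in pairs]
ys = [p[1] for p in pairs]
r = pearson(xs, ys)
print(f"Pearson r on {len(pairs)} sample rows: {r:.4f}")
```

Because the sample covers only the first and last rows of the file, it falls into two tight clusters and the coefficient comes out close to 1; the full data is far less extreme.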

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index)  우편번호1  우편번호2  주소  시도
0        130        781        서울 동대문구 청량리1동 미주아파트  1
1        130        791        서울 동대문구 이문2동 한국외국어대학교  1
2        130        792        서울 동대문구 회기동 신현대아파트  1
3        131        120        서울 중랑구 중화동  1
4        131        121        서울 중랑구 중화1동  1
5        131        122        서울 중랑구 중화2동  1
6        131        123        서울 중랑구 중화3동  1
7        131        130        서울 중랑구 신내동  1
8        131        131        서울 중랑구 신내1동  1
9        131        132        서울 중랑구 신내2동  1

(index)  우편번호1  우편번호2  주소  시도
8125     799        800        경북 울릉군 울릉읍  18
8126     799        810        경북 울릉군 서 면  18
8127     799        815        경북 울릉군 서 면  18
8128     799        820        경북 울릉군 북 면  18
8129     660        331        경남 진주시 하대1동  <NA>
8130     660        332        경남 진주시 하대2동  <NA>
8131     445        827        경기 화성시 반송동  <NA>
8132     656        707        경남 거제시 수월 거제자이아파트  <NA>
8133     656        927        경남 거제시 신현읍  <NA>
8134     656        305        경남 거제시 장평동  <NA>
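
The two numeric columns look like the halves of a six-digit postal code, and the last rows show the missing 시도 values. The sketch below joins the halves and lists rows without a 시도. The report does not state the `NNN-NNN` layout or the zero-padding (needed because 우편번호2 goes as low as 10), so both are assumptions, and `format_postal_code` is a hypothetical helper.

```python
# A few rows copied from the sample tables:
# (우편번호1, 우편번호2, 주소, 시도), with None standing in for <NA>.
rows = [
    (130, 781, "서울 동대문구 청량리1동 미주아파트", 1),
    (799, 800, "경북 울릉군 울릉읍", 18),
    (660, 331, "경남 진주시 하대1동", None),
]


def format_postal_code(part1: int, part2: int) -> str:
    """Join the two halves as NNN-NNN, zero-padding each to three digits."""
    return f"{part1:03d}-{part2:03d}"


# Addresses whose 시도 value is missing.
missing_sido = [address for _, _, address, sido in rows if sido is None]

for part1, part2, address, _ in rows:
    print(format_postal_code(part1, part2), address)
print("missing 시도:", missing_sido)
```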