Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 17135
Missing cells (%): 24.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 673.8 KiB
Average record size in memory: 69.0 B
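
The missing-cell and record-size figures above can be cross-checked with a little arithmetic. The sketch below uses only numbers that appear in this report (the per-column missing counts come from the Alerts section); the variable names are illustrative.

```python
# Figures copied from the report; variable names are illustrative.
n_variables = 7
n_observations = 10_000
missing_by_column = {"번지_APT동": 7_135, "사서함등": 10_000}

# Missing cells are counted over every cell in the table (rows x columns).
total_cells = n_variables * n_observations
missing_cells = sum(missing_by_column.values())
missing_pct = 100 * missing_cells / total_cells

# Average record size = total memory / number of rows (1 KiB = 1024 B).
total_kib = 673.8
avg_record_bytes = total_kib * 1024 / n_observations

print(missing_cells, f"{missing_pct:.1f}%", f"{avg_record_bytes:.1f} B")
```

Both results agree with the reported values once rounded to one decimal place.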

Variable types

Numeric: 4
Text: 2
Unsupported: 1

Dataset

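Description: Postal code DB of Gyeongnam Provincial Namhae College, Gyeongsangnam-do. Columns: 우편번호1 (postal code 1), 우편번호2 (postal code 2), 주소 (address).
Author: Gyeongsangnam-do (경상남도)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15039601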

Alerts

순번 is highly overall correlated with G|시도구분|시도코드 (High correlation)
우편번호1 is highly overall correlated with G|시도구분|시도코드 (High correlation)
G|시도구분|시도코드 is highly overall correlated with 순번 and 1 other field (High correlation)
번지_APT동 has 7135 (71.4%) missing values (Missing)
사서함등 has 10000 (100.0%) missing values (Missing)
순번 has unique values (Unique)
사서함등 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)

Reproduction

Analysis started: 2023-12-11 00:33:11.088764
Analysis finished: 2023-12-11 00:33:14.361415
Duration: 3.27 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
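
The reported duration is the difference between the two timestamps above. A minimal check using only the standard library:

```python
from datetime import datetime

# Timestamps copied verbatim from the Reproduction section.
started = datetime.fromisoformat("2023-12-11 00:33:11.088764")
finished = datetime.fromisoformat("2023-12-11 00:33:14.361415")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```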

Variables

순번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20134.244
Minimum: 8
Maximum: 40092
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 8
5-th percentile: 1862.95
Q1: 10099.25
Median: 20283
Q3: 30217.5
95-th percentile: 38081.1
Maximum: 40092
Range: 40084
Interquartile range (IQR): 20118.25

Descriptive statistics

Standard deviation: 11609.617
Coefficient of variation (CV): 0.57661053
Kurtosis: -1.1990775
Mean: 20134.244
Median Absolute Deviation (MAD): 10057
Skewness: -0.018049206
Sum: 2.0134244 × 10^8
Variance: 1.3478321 × 10^8
Monotonicity: Not monotonic
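
Several of the derived statistics for 순번 follow directly from the others. The sketch below recomputes them from the rounded values printed in this report, so small differences in the last digits are expected; the variable names are illustrative.

```python
# Rounded inputs copied from the report for the variable 순번.
minimum, maximum = 8, 40092
q1, q3 = 10099.25, 30217.5
mean, std = 20134.244, 11609.617
n = 10_000

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance is the squared standard deviation
total = mean * n                  # Sum = mean x number of observations

print(value_range, iqr, round(cv, 6), f"{variance:.4e}", f"{total:.7e}")
```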
[Figure: histogram with fixed size bins (bins=50)]

Common values
Value                Count  Frequency (%)
39116                1      < 0.1%
21338                1      < 0.1%
10124                1      < 0.1%
5005                 1      < 0.1%
6090                 1      < 0.1%
11379                1      < 0.1%
19375                1      < 0.1%
11311                1      < 0.1%
29827                1      < 0.1%
13069                1      < 0.1%
Other values (9990)  9990   99.9%

Minimum 10 values
Value  Count  Frequency (%)
8      1      < 0.1%
9      1      < 0.1%
15     1      < 0.1%
18     1      < 0.1%
19     1      < 0.1%
26     1      < 0.1%
27     1      < 0.1%
29     1      < 0.1%
32     1      < 0.1%
33     1      < 0.1%

Maximum 10 values
Value  Count  Frequency (%)
40092  1      < 0.1%
40091  1      < 0.1%
40087  1      < 0.1%
40078  1      < 0.1%
40077  1      < 0.1%
40076  1      < 0.1%
40074  1      < 0.1%
40072  1      < 0.1%
40069  1      < 0.1%
40068  1      < 0.1%
< 0.1%

우편번호1
Real number (ℝ)

HIGH CORRELATION 

Distinct: 249
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 466.2326
Minimum: 100
Maximum: 799
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 100
5-th percentile: 133.95
Q1: 330
Median: 480
Q3: 626
95-th percentile: 757
Maximum: 799
Range: 699
Interquartile range (IQR): 296

Descriptive statistics

Standard deviation: 197.86403
Coefficient of variation (CV): 0.4243891
Kurtosis: -0.98813165
Mean: 466.2326
Median Absolute Deviation (MAD): 146
Skewness: -0.29529388
Sum: 4662326
Variance: 39150.175
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values
Value               Count  Frequency (%)
135                 146    1.5%
560                 97     1.0%
151                 91     0.9%
139                 80     0.8%
702                 80     0.8%
780                 79     0.8%
540                 79     0.8%
742                 78     0.8%
760                 77     0.8%
110                 74     0.7%
Other values (239)  9119   91.2%

Minimum 10 values
Value  Count  Frequency (%)
100    73     0.7%
110    74     0.7%
120    40     0.4%
121    52     0.5%
122    65     0.7%
130    32     0.3%
131    63     0.6%
132    63     0.6%
133    38     0.4%
134    36     0.4%

Maximum 10 values
Value  Count  Frequency (%)
799    5      0.1%
791    54     0.5%
790    45     0.4%
780    79     0.8%
770    58     0.6%
769    61     0.6%
767    24     0.2%
766    34     0.3%
764    19     0.2%
763    27     0.3%

우편번호2
Real number (ℝ)

Distinct: 567
Distinct (%): 5.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 751.772
Minimum: 3
Maximum: 997
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 3
5-th percentile: 92
Q1: 765
Median: 823
Q3: 863
95-th percentile: 932
Maximum: 997
Range: 994
Interquartile range (IQR): 98

Descriptive statistics

Standard deviation: 230.14736
Coefficient of variation (CV): 0.30613984
Kurtosis: 3.6417651
Mean: 751.772
Median Absolute Deviation (MAD): 48
Skewness: -2.2041468
Sum: 7517720
Variance: 52967.807
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values
Value               Count  Frequency (%)
831                 156    1.6%
822                 151    1.5%
821                 149    1.5%
842                 147    1.5%
811                 147    1.5%
812                 146    1.5%
832                 139    1.4%
841                 134    1.3%
851                 121    1.2%
801                 120    1.2%
Other values (557)  8590   85.9%

Minimum 10 values
Value  Count  Frequency (%)
3      6      0.1%
10     35     0.4%
11     12     0.1%
12     9      0.1%
13     9      0.1%
14     3      < 0.1%
15     3      < 0.1%
16     4      < 0.1%
17     2      < 0.1%
18     1      < 0.1%

Maximum 10 values
Value  Count  Frequency (%)
997    1      < 0.1%
994    1      < 0.1%
993    1      < 0.1%
992    3      < 0.1%
991    1      < 0.1%
990    1      < 0.1%
989    1      < 0.1%
988    2      < 0.1%
986    1      < 0.1%
985    2      < 0.1%

주소
Text

Distinct: 8389
Distinct (%): 83.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 36
Median length: 32
Mean length: 13.6666
Min length: 8

Characters and Unicode

Total characters: 136666
Distinct characters: 508
Distinct categories: 8
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
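
As a rough illustration of how such character properties can be read, the sketch below uses Python's standard `unicodedata` module to count general categories in a few values taken from this report's sample rows. It is not the profiler's own code; the helper name `category_counts` and the short-code-to-name mapping are illustrative assumptions.

```python
import unicodedata
from collections import Counter

# Map Unicode general-category codes to the names used in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Nd": "Decimal Number",
    "Sm": "Math Symbol",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}

def category_counts(text):
    """Count characters in `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

# Sample values from the report; "\u223c" is the tilde operator used in ranges.
samples = ["서울 송파구 잠실1동 주공아파트", "352\u223c699", "(148동)"]
for value in samples:
    named = {CATEGORY_NAMES.get(code, code): count
             for code, count in category_counts(value).items()}
    print(value, named)
```

Hangul syllables fall under Other Letter, and the range separator in values such as 352∼699 is a Math Symbol, which is why that category appears in the tables for both text variables.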

Unique

Unique: 7561
Unique (%): 75.6%

Sample

1st row: 충북 보은군 보은읍 중동리
2nd row: 대구 달서구 본리동
3rd row: 경남 진주시 판문동
4th row: 서울 송파구 잠실1동 주공아파트
5th row: 경남 창녕군 대신동 동포동

Value                Count  Frequency (%)
서울                 1534   4.3%
경기                 1359   3.8%
경북                 1060   2.9%
전남                 908    2.5%
경남                 810    2.2%
충남                 705    2.0%
전북                 668    1.9%
부산                 572    1.6%
충북                 525    1.5%
대구                 442    1.2%
Other values (8167)  27486  76.2%

Most occurring characters

ValueCountFrequency (%)
26069
 
19.1%
6668
 
4.9%
5091
 
3.7%
4315
 
3.2%
3889
 
2.8%
3802
 
2.8%
3792
 
2.8%
3552
 
2.6%
3242
 
2.4%
3049
 
2.2%
Other values (498) 73197
53.6%

Most occurring categories

Value              Count   Frequency (%)
Other Letter       105504  77.2%
Space Separator    26069   19.1%
Decimal Number     4330    3.2%
Math Symbol        316     0.2%
Other Punctuation  136     0.1%
Close Punctuation  128     0.1%
Open Punctuation   128     0.1%
Uppercase Letter   55      < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
6668
 
6.3%
5091
 
4.8%
4315
 
4.1%
3889
 
3.7%
3802
 
3.6%
3792
 
3.6%
3552
 
3.4%
3242
 
3.1%
3049
 
2.9%
2966
 
2.8%
Other values (466) 65138
61.7%
Uppercase Letter
ValueCountFrequency (%)
G 13
23.6%
L 13
23.6%
B 5
 
9.1%
S 5
 
9.1%
K 4
 
7.3%
I 3
 
5.5%
D 2
 
3.6%
V 2
 
3.6%
T 2
 
3.6%
A 1
 
1.8%
Other values (5) 5
 
9.1%
Decimal Number
ValueCountFrequency (%)
1 1546
35.7%
2 1341
31.0%
3 646
14.9%
4 327
 
7.6%
5 158
 
3.6%
6 121
 
2.8%
7 76
 
1.8%
8 50
 
1.2%
9 34
 
0.8%
0 31
 
0.7%
Math Symbol
ValueCountFrequency (%)
308
97.5%
~ 8
 
2.5%
Other Punctuation
ValueCountFrequency (%)
, 112
82.4%
· 24
 
17.6%
Space Separator
ValueCountFrequency (%)
26069
100.0%
Close Punctuation
ValueCountFrequency (%)
) 128
100.0%
Open Punctuation
ValueCountFrequency (%)
( 128
100.0%

Most occurring scripts

Value   Count   Frequency (%)
Hangul  105501  77.2%
Common  31107   22.8%
Latin   55      < 0.1%
Han     3       < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
6668
 
6.3%
5091
 
4.8%
4315
 
4.1%
3889
 
3.7%
3802
 
3.6%
3792
 
3.6%
3552
 
3.4%
3242
 
3.1%
3049
 
2.9%
2966
 
2.8%
Other values (463) 65135
61.7%
Common
ValueCountFrequency (%)
26069
83.8%
1 1546
 
5.0%
2 1341
 
4.3%
3 646
 
2.1%
4 327
 
1.1%
308
 
1.0%
5 158
 
0.5%
) 128
 
0.4%
( 128
 
0.4%
6 121
 
0.4%
Other values (7) 335
 
1.1%
Latin
ValueCountFrequency (%)
G 13
23.6%
L 13
23.6%
B 5
 
9.1%
S 5
 
9.1%
K 4
 
7.3%
I 3
 
5.5%
D 2
 
3.6%
V 2
 
3.6%
T 2
 
3.6%
A 1
 
1.8%
Other values (5) 5
 
9.1%
Han
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

Most occurring blocks

Value           Count   Frequency (%)
Hangul          105501  77.2%
ASCII           30830   22.6%
Math Operators  308     0.2%
None            24      < 0.1%
CJK             3       < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
26069
84.6%
1 1546
 
5.0%
2 1341
 
4.3%
3 646
 
2.1%
4 327
 
1.1%
5 158
 
0.5%
) 128
 
0.4%
( 128
 
0.4%
6 121
 
0.4%
, 112
 
0.4%
Other values (20) 254
 
0.8%
Hangul
ValueCountFrequency (%)
6668
 
6.3%
5091
 
4.8%
4315
 
4.1%
3889
 
3.7%
3802
 
3.6%
3792
 
3.6%
3552
 
3.4%
3242
 
3.1%
3049
 
2.9%
2966
 
2.8%
Other values (463) 65135
61.7%
Math Operators
ValueCountFrequency (%)
308
100.0%
None
ValueCountFrequency (%)
· 24
100.0%
CJK
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

G|시도구분|시도코드
Real number (ℝ)

HIGH CORRELATION 

Distinct: 16
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.4225
Minimum: 1
Maximum: 19
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 12
Q3: 16
95-th percentile: 18
Maximum: 19
Range: 18
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 6.1847085
Coefficient of variation (CV): 0.59339971
Kurtosis: -1.3691344
Mean: 10.4225
Median Absolute Deviation (MAD): 5
Skewness: -0.40458586
Sum: 104225
Variance: 38.250619
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=16)]

Common values
Value             Count  Frequency (%)
1                 1534   15.3%
11                1359   13.6%
18                1060   10.6%
15                908    9.1%
17                810    8.1%
13                705    7.0%
16                668    6.7%
2                 572    5.7%
14                525    5.2%
3                 442    4.4%
Other values (6)  1417   14.2%

Minimum 10 values
Value  Count  Frequency (%)
1      1534   15.3%
2      572    5.7%
3      442    4.4%
4      334    3.3%
5      206    2.1%
6      205    2.1%
7      148    1.5%
11     1359   13.6%
12     427    4.3%
13     705    7.0%

Maximum 10 values
Value  Count  Frequency (%)
19     97     1.0%
18     1060   10.6%
17     810    8.1%
16     668    6.7%
15     908    9.1%
14     525    5.2%
13     705    7.0%
12     427    4.3%
11     1359   13.6%
7      148    1.5%

번지_APT동
Text

MISSING 

Distinct: 2637
Distinct (%): 92.0%
Missing: 7135
Missing (%): 71.4%
Memory size: 156.2 KiB

Length

Max length: 16
Median length: 7
Mean length: 7.0387435
Min length: 1

Characters and Unicode

Total characters: 20166
Distinct characters: 62
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2489
Unique (%): 86.9%
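
For 번지_APT동, the percentages and the mean length are consistent with being computed over the 2865 non-missing rows rather than all 10000. This is an inference from the reported figures, not something the report states; the sketch below shows the arithmetic with illustrative variable names.

```python
# Figures copied from the report for the variable 번지_APT동.
n_rows = 10_000
n_missing = 7_135
n_distinct = 2_637
n_unique = 2_489
total_characters = 20_166

# Assumption: the denominators are the non-missing rows only.
n_present = n_rows - n_missing
distinct_pct = 100 * n_distinct / n_present
unique_pct = 100 * n_unique / n_present
mean_length = total_characters / n_present

print(n_present, f"{distinct_pct:.1f}%", f"{unique_pct:.1f}%", round(mean_length, 7))
```

Dividing by 10000 instead would give 26.4% distinct, which does not match the reported 92.0%.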

Sample

1st row: 352∼699
2nd row: 1063∼1078
3rd row: (148동)
4th row: 533∼611
5th row: 348∼374

Value                Count  Frequency (%)
101106동             11     0.4%
1∼200번지            6      0.2%
1∼100번지            6      0.2%
101109동             5      0.2%
1∼199번지            5      0.2%
101105동             5      0.2%
101103동             5      0.2%
201202동             5      0.2%
101107동             4      0.1%
1∼18번지             4      0.1%
Other values (2627)  2809   98.0%

Most occurring characters

ValueCountFrequency (%)
1 3129
15.5%
2326
11.5%
0 2158
10.7%
2 1653
8.2%
3 1425
7.1%
9 1370
6.8%
4 1362
6.8%
6 1301
6.5%
5 1296
6.4%
7 1120
 
5.6%
Other values (52) 3026
15.0%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     15855  78.6%
Math Symbol        2330   11.6%
Other Letter       1394   6.9%
Close Punctuation  287    1.4%
Open Punctuation   287    1.4%
Uppercase Letter   8      < 0.1%
Other Punctuation  4      < 0.1%
Dash Punctuation   1      < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
467
33.5%
460
33.0%
270
19.4%
133
 
9.5%
6
 
0.4%
5
 
0.4%
4
 
0.3%
3
 
0.2%
3
 
0.2%
3
 
0.2%
Other values (34) 40
 
2.9%
Decimal Number
ValueCountFrequency (%)
1 3129
19.7%
0 2158
13.6%
2 1653
10.4%
3 1425
9.0%
9 1370
8.6%
4 1362
8.6%
6 1301
8.2%
5 1296
8.2%
7 1120
 
7.1%
8 1041
 
6.6%
Math Symbol
ValueCountFrequency (%)
2326
99.8%
~ 4
 
0.2%
Uppercase Letter
ValueCountFrequency (%)
B 7
87.5%
A 1
 
12.5%
Close Punctuation
ValueCountFrequency (%)
) 287
100.0%
Open Punctuation
ValueCountFrequency (%)
( 287
100.0%
Other Punctuation
ValueCountFrequency (%)
, 4
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  18764  93.0%
Hangul  1394   6.9%
Latin   8      < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
467
33.5%
460
33.0%
270
19.4%
133
 
9.5%
6
 
0.4%
5
 
0.4%
4
 
0.3%
3
 
0.2%
3
 
0.2%
3
 
0.2%
Other values (34) 40
 
2.9%
Common
ValueCountFrequency (%)
1 3129
16.7%
2326
12.4%
0 2158
11.5%
2 1653
8.8%
3 1425
7.6%
9 1370
7.3%
4 1362
7.3%
6 1301
6.9%
5 1296
6.9%
7 1120
 
6.0%
Other values (6) 1624
8.7%
Latin
ValueCountFrequency (%)
B 7
87.5%
A 1
 
12.5%

Most occurring blocks

Value           Count  Frequency (%)
ASCII           16446  81.6%
Math Operators  2326   11.5%
Hangul          1394   6.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 3129
19.0%
0 2158
13.1%
2 1653
10.1%
3 1425
8.7%
9 1370
8.3%
4 1362
8.3%
6 1301
7.9%
5 1296
7.9%
7 1120
 
6.8%
8 1041
 
6.3%
Other values (7) 591
 
3.6%
Math Operators
ValueCountFrequency (%)
2326
100.0%
Hangul
ValueCountFrequency (%)
467
33.5%
460
33.0%
270
19.4%
133
 
9.5%
6
 
0.4%
5
 
0.4%
4
 
0.3%
3
 
0.2%
3
 
0.2%
3
 
0.2%
Other values (34) 40
 
2.9%

사서함등
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 10000
Missing (%): 100.0%
Memory size: 166.0 KiB

Interactions

[Figures: 16 pairwise scatter plots of the four numeric variables (순번, 우편번호1, 우편번호2, G|시도구분|시도코드)]

Correlations

[Figure: correlation heatmap]
                     순번   우편번호1  우편번호2  G|시도구분|시도코드
순번                 1.000  0.961      0.403      0.923
우편번호1            0.961  1.000      0.349      0.890
우편번호2            0.403  0.349      1.000      0.309
G|시도구분|시도코드  0.923  0.890      0.309      1.000

[Figure: correlation heatmap]
                     순번   우편번호1  우편번호2  G|시도구분|시도코드
순번                 1.000  0.314      0.292      0.802
우편번호1            0.314  1.000      0.089      0.643
우편번호2            0.292  0.089      1.000      0.277
G|시도구분|시도코드  0.802  0.643      0.277      1.000
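
The three High correlation alerts listed in the Alerts section are consistent with the second matrix above if a pair is flagged whenever its absolute coefficient reaches 0.5. That cutoff is an assumption made for this sketch, not a value stated in the report, and the helper `highly_correlated` is hypothetical rather than part of any library.

```python
from itertools import combinations

columns = ["순번", "우편번호1", "우편번호2", "G|시도구분|시도코드"]

# Second correlation matrix from the report, row by row.
matrix = [
    [1.000, 0.314, 0.292, 0.802],
    [0.314, 1.000, 0.089, 0.643],
    [0.292, 0.089, 1.000, 0.277],
    [0.802, 0.643, 0.277, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff; not stated in the report

def highly_correlated(columns, matrix, threshold):
    """Return, for each column, the other columns whose |r| >= threshold."""
    partners = {name: [] for name in columns}
    for i, j in combinations(range(len(columns)), 2):
        if abs(matrix[i][j]) >= threshold:
            partners[columns[i]].append(columns[j])
            partners[columns[j]].append(columns[i])
    # Keep only columns that have at least one highly correlated partner.
    return {name: others for name, others in partners.items() if others}

alerts = highly_correlated(columns, matrix, THRESHOLD)
for name, others in alerts.items():
    print(name, "->", others)
```

Under this assumption 순번 and 우편번호1 are each paired only with G|시도구분|시도코드, which matches the alert wording. Applying the same cutoff to the first matrix would also flag the 순번 and 우편번호1 pair (0.961), which the alerts do not list.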

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows
(index)  순번   우편번호1  우편번호2  주소                             G|시도구분|시도코드  번지_APT동             사서함등
39066    39116  376        806        충북 보은군 보은읍 중동리        14                   <NA>                   <NA>
8835     8873   704        914        대구 달서구 본리동               3                    352∼699                <NA>
22466    22509  660        990        경남 진주시 판문동               17                   1063∼1078              <NA>
4173     4186   138        908        서울 송파구 잠실1동 주공아파트   1                    (148동)                <NA>
23597    23640  635        907        경남 창녕군 대신동 동포동        17                   533∼611                <NA>
540      541    134        812        서울 강동구 길1동                1                    348∼374                <NA>
16205    16243  423        751        경기 광명시 하안2동 하안주공     11                   (고층)2단지(201207동)  <NA>
2369     2382   132        838        서울 도봉구 방학2동              1                    602∼610                <NA>
16511    16549  415        812        경기 김포시 고촌면 신곡리        11                   600∼1200               <NA>
13780    13818  689        881        울산 울주군 서생면 강월리        7                    <NA>                   <NA>

Last rows
(index)  순번   우편번호1  우편번호2  주소                             G|시도구분|시도코드  번지_APT동             사서함등
27837    27884  767        892        경북 울진군 북면 금성리          18                   <NA>                   <NA>
3388     3401   137        836        서울 서초구 방배4동              1                    840∼849                <NA>
16134    16172  423        837        경기 광명시 철산3동              11                   261∼372                <NA>
3581     3594   133        809        서울 성동구 금호동4가            1                    500∼1530               <NA>
12253    12291  500        856        광주 북구 운암3동                5                    201∼345번지            <NA>
16667    16706  472        812        경기 남양주시 별내면 덕송1∼2리   11                   <NA>                   <NA>
39693    39743  365        842        충북 진천군 덕산면 인산리        14                   <NA>                   <NA>
6396     6428   608        702        부산 남구 대연3동 남부경찰서     2                    <NA>                   <NA>
36474    36524  330        824        충남 천안시 입장면 산정리        13                   <NA>                   <NA>
5340     5372   110        727        서울 종로구 수송동 석탄회관빌딩  1                    <NA>                   <NA>