Overview

Dataset statistics

Number of variables: 6
Number of observations: 385
Missing cells: 212
Missing cells (%): 9.2%
Duplicate rows: 1
Duplicate rows (%): 0.3%
Total size in memory: 18.9 KiB
Average record size in memory: 50.3 B
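
The percentages and the average record size above follow from the raw counts. The sketch below recomputes them from the reported figures only. The variable names are illustrative, and it assumes the report rounds each value to one decimal place.

```python
# Figures copied from the "Dataset statistics" block of the report.
n_variables = 6
n_observations = 385
missing_cells = 212
duplicate_rows = 1
total_size_kib = 18.9

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations

missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```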

Variable types

Numeric: 2
Categorical: 2
Text: 2

Dataset

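Description: Directory of addresses, telephone numbers, and related details for city and county offices in Gyeongsangbuk-do (a listing of the si/gun/gu, eup/myeon/dong, postal code, and address of the eup/myeon/dong offices in the cities and counties of Gyeongsangbuk-do).
Author: 경상북도 (Gyeongsangbuk-do)
URL: https://www.data.go.kr/data/15044824/fileData.do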

Alerts

Dataset has 1 (0.3%) duplicate rows [Duplicates]
시군구 is highly overall correlated with 연번 and 2 other fields [High correlation]
시도 is highly overall correlated with 연번 and 2 other fields [High correlation]
연번 is highly overall correlated with 시도 and 1 other field [High correlation]
우편번호 is highly overall correlated with 시도 and 1 other field [High correlation]
연번 has 53 (13.8%) missing values [Missing]
읍면동 has 53 (13.8%) missing values [Missing]
우편번호 has 53 (13.8%) missing values [Missing]
주 소 has 53 (13.8%) missing values [Missing]
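
Four columns report the same count of 53 missing values, and the tail of the sample shows rows in which every field is `<NA>`, so the gaps plausibly come from fully empty trailing rows. The sketch below shows one way to filter such rows before further analysis. The helper names, the toy rows, and the set of missing markers are assumptions for illustration, not part of the report.

```python
MISSING_MARKERS = {"", "<NA>"}  # assumption: how empty cells look after loading


def is_empty_row(row):
    """Return True when every field of the row is missing."""
    return all(
        value is None or str(value).strip() in MISSING_MARKERS
        for value in row.values()
    )


def drop_empty_rows(rows):
    """Keep only rows that carry at least one real value."""
    return [row for row in rows if not is_empty_row(row)]


# Toy data: two rows taken from the sample table plus two fully empty rows.
rows = [
    {"연번": 1, "시도": "경상북도", "시군구": "포항시 남구", "우편번호": 37936},
    {"연번": 2, "시도": "경상북도", "시군구": "포항시 남구", "우편번호": 37852},
    {"연번": None, "시도": "<NA>", "시군구": "<NA>", "우편번호": None},
    {"연번": None, "시도": "<NA>", "시군구": "<NA>", "우편번호": None},
]

cleaned = drop_empty_rows(rows)
removed = len(rows) - len(cleaned)
print(f"{removed} empty rows removed, {len(cleaned)} rows kept")
```

Applied to the full dataset, a filter like this would be expected to leave 332 rows if the 53 missing entries in each column all belong to the same rows.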

Reproduction

Analysis started: 2023-12-12 23:30:30.145176
Analysis finished: 2023-12-12 23:30:31.143869
Duration: 1 second
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 332
Distinct (%): 100.0%
Missing: 53
Missing (%): 13.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 166.5
Minimum: 1
Maximum: 332
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 17.55
Q1: 83.75
Median: 166.5
Q3: 249.25
95-th percentile: 315.45
Maximum: 332
Range: 331
Interquartile range (IQR): 165.5

Descriptive statistics

Standard deviation: 95.984374
Coefficient of variation (CV): 0.57648273
Kurtosis: -1.2
Mean: 166.5
Median Absolute Deviation (MAD): 83
Skewness: 0
Sum: 55278
Variance: 9213
Monotonicity: Strictly increasing
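
The statistics for 연번 are what a plain running index would produce. The sketch below assumes the non-missing values are exactly the integers 1 to 332, which is consistent with the reported minimum, maximum, distinct count, sum, and monotonicity but is not stated by the report. It recomputes the main figures with Python's `statistics` module, using the sample variance (n - 1) and linear-interpolation quartiles.

```python
import statistics

# Assumption: 연번 is the running index 1..332 (the 53 missing rows excluded).
values = list(range(1, 333))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, divides by n - 1
std = statistics.stdev(values)
cv = std / mean                         # coefficient of variation

# "inclusive" interpolates linearly between the min and max of the data.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)

print(f"sum={total} mean={mean} variance={variance} std={std:.6f}")
print(f"cv={cv:.8f} q1={q1} median={median} q3={q3} iqr={iqr} mad={mad}")
```

Under that assumption the results agree with the figures listed above, for example a variance of 9213 and quartiles of 83.75 and 249.25.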
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
230 | 1 | 0.3%
228 | 1 | 0.3%
227 | 1 | 0.3%
226 | 1 | 0.3%
225 | 1 | 0.3%
224 | 1 | 0.3%
223 | 1 | 0.3%
222 | 1 | 0.3%
221 | 1 | 0.3%
220 | 1 | 0.3%
Other values (322) | 322 | 83.6%
(Missing) | 53 | 13.8%

Value | Count | Frequency (%)
1 | 1 | 0.3%
2 | 1 | 0.3%
3 | 1 | 0.3%
4 | 1 | 0.3%
5 | 1 | 0.3%
6 | 1 | 0.3%
7 | 1 | 0.3%
8 | 1 | 0.3%
9 | 1 | 0.3%
10 | 1 | 0.3%

Value | Count | Frequency (%)
332 | 1 | 0.3%
331 | 1 | 0.3%
330 | 1 | 0.3%
329 | 1 | 0.3%
328 | 1 | 0.3%
327 | 1 | 0.3%
326 | 1 | 0.3%
325 | 1 | 0.3%
324 | 1 | 0.3%
323 | 1 | 0.3%

시도
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.1 KiB

Value | Count
경상북도 | 332
<NA> | 53

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 경상북도
2nd row: 경상북도
3rd row: 경상북도
4th row: 경상북도
5th row: 경상북도

Common Values

Value | Count | Frequency (%)
경상북도 | 332 | 86.2%
<NA> | 53 | 13.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
경상북도 | 332 | 86.2%
na | 53 | 13.8%

시군구
Categorical

HIGH CORRELATION 

Distinct: 25
Distinct (%): 6.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.1 KiB

Value | Count
<NA> | 53
구미시 | 27
상주시 | 24
안동시 | 24
경주시 | 23
Other values (20) | 234

Length

Max length: 6
Median length: 3
Mean length: 3.425974
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 포항시 남구
2nd row: 포항시 남구
3rd row: 포항시 남구
4th row: 포항시 남구
5th row: 포항시 남구

Common Values

Value | Count | Frequency (%)
<NA> | 53 | 13.8%
구미시 | 27 | 7.0%
상주시 | 24 | 6.2%
안동시 | 24 | 6.2%
경주시 | 23 | 6.0%
김천시 | 22 | 5.7%
영주시 | 19 | 4.9%
의성군 | 18 | 4.7%
영천시 | 16 | 4.2%
포항시 북구 | 15 | 3.9%
Other values (15) | 144 | 37.4%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
na | 53 | 12.8%
포항시 | 29 | 7.0%
구미시 | 27 | 6.5%
상주시 | 24 | 5.8%
안동시 | 24 | 5.8%
경주시 | 23 | 5.6%
김천시 | 22 | 5.3%
영주시 | 19 | 4.6%
의성군 | 18 | 4.3%
영천시 | 16 | 3.9%
Other values (16) | 159 | 38.4%

읍면동
Text

MISSING 

Distinct: 327
Distinct (%): 98.5%
Missing: 53
Missing (%): 13.8%
Memory size: 3.1 KiB

Length

Max length: 11
Median length: 10
Mean length: 7.8072289
Min length: 5

Characters and Unicode

Total characters: 2592
Distinct characters: 184
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 322
Unique (%): 97.0%

Sample

1st row: 구룡포읍행정복지센터
2nd row: 연일읍행정복지센터
3rd row: 오천읍행정복지센터
4th row: 대송면행정복지센터
5th row: 동해면행정복지센터

Value | Count | Frequency (%)
행정복지센터 | 29 | 8.0%
화북면행정복지센터 | 2 | 0.6%
서면사무소 | 2 | 0.6%
북면사무소 | 2 | 0.6%
중앙동행정복지센터 | 2 | 0.6%
화남면행정복지센터 | 2 | 0.6%
춘산면사무소 | 1 | 0.3%
사곡면사무소 | 1 | 0.3%
옥산면사무소 | 1 | 0.3%
점곡면사무소 | 1 | 0.3%
Other values (318) | 318 | 88.1%

Most occurring characters

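Value | Count | Frequency (%)
(not preserved) | 203 | 7.8%
(not preserved) | 203 | 7.8%
(not preserved) | 201 | 7.8%
(not preserved) | 176 | 6.8%
(not preserved) | 174 | 6.7%
(not preserved) | 170 | 6.6%
(not preserved) | 170 | 6.6%
(not preserved) | 133 | 5.1%
(not preserved) | 132 | 5.1%
(not preserved) | 130 | 5.0%
Other values (174) | 900 | 34.7%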

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2541 | 98.0%
Space Separator | 29 | 1.1%
Decimal Number | 22 | 0.8%

Most frequent character per category

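Other Letter
Value | Count | Frequency (%)
(not preserved) | 203 | 8.0%
(not preserved) | 203 | 8.0%
(not preserved) | 201 | 7.9%
(not preserved) | 176 | 6.9%
(not preserved) | 174 | 6.8%
(not preserved) | 170 | 6.7%
(not preserved) | 170 | 6.7%
(not preserved) | 133 | 5.2%
(not preserved) | 132 | 5.2%
(not preserved) | 130 | 5.1%
Other values (168) | 849 | 33.4%

Decimal Number
Value | Count | Frequency (%)
2 | 9 | 40.9%
1 | 9 | 40.9%
3 | 2 | 9.1%
5 | 1 | 4.5%
4 | 1 | 4.5%

Space Separator
Value | Count | Frequency (%)
(space) | 29 | 100.0%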

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2541 | 98.0%
Common | 51 | 2.0%

Most frequent character per script

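Hangul
Value | Count | Frequency (%)
(not preserved) | 203 | 8.0%
(not preserved) | 203 | 8.0%
(not preserved) | 201 | 7.9%
(not preserved) | 176 | 6.9%
(not preserved) | 174 | 6.8%
(not preserved) | 170 | 6.7%
(not preserved) | 170 | 6.7%
(not preserved) | 133 | 5.2%
(not preserved) | 132 | 5.2%
(not preserved) | 130 | 5.1%
Other values (168) | 849 | 33.4%

Common
Value | Count | Frequency (%)
(space) | 29 | 56.9%
2 | 9 | 17.6%
1 | 9 | 17.6%
3 | 2 | 3.9%
5 | 1 | 2.0%
4 | 1 | 2.0%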

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2541 | 98.0%
ASCII | 51 | 2.0%

Most frequent character per block

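Hangul
Value | Count | Frequency (%)
(not preserved) | 203 | 8.0%
(not preserved) | 203 | 8.0%
(not preserved) | 201 | 7.9%
(not preserved) | 176 | 6.9%
(not preserved) | 174 | 6.8%
(not preserved) | 170 | 6.7%
(not preserved) | 170 | 6.7%
(not preserved) | 133 | 5.2%
(not preserved) | 132 | 5.2%
(not preserved) | 130 | 5.1%
Other values (168) | 849 | 33.4%

ASCII
Value | Count | Frequency (%)
(space) | 29 | 56.9%
2 | 9 | 17.6%
1 | 9 | 17.6%
3 | 2 | 3.9%
5 | 1 | 2.0%
4 | 1 | 2.0%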

우편번호
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 331
Distinct (%): 99.7%
Missing: 53
Missing (%): 13.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 37900.244
Minimum: 36004
Maximum: 40235
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: 36004
5-th percentile: 36170.55
Q1: 36831.75
Median: 37674.5
Q3: 39032.25
95-th percentile: 40041.15
Maximum: 40235
Range: 4231
Interquartile range (IQR): 2200.5

Descriptive statistics

Standard deviation: 1242.1627
Coefficient of variation (CV): 0.03277453
Kurtosis: -1.1836496
Mean: 37900.244
Median Absolute Deviation (MAD): 1024
Skewness: 0.27334986
Sum: 12582881
Variance: 1542968.2
Monotonicity: Not monotonic
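
Several of the 우편번호 figures are derived from others in the same block. The sketch below re-derives the mean, standard deviation, coefficient of variation, range, and IQR from the reported sum, variance, extremes, and quartiles. It is a consistency check on the published numbers only, since the raw postal codes are not reproduced here, and the variable names are illustrative.

```python
import math

# Figures copied from the report for 우편번호.
n_observations = 385
n_missing = 53
total = 12582881
variance = 1542968.2
minimum, maximum = 36004, 40235
q1, q3 = 36831.75, 39032.25

count = n_observations - n_missing  # non-missing values only

mean = total / count
std = math.sqrt(variance)
cv = std / mean                     # coefficient of variation
value_range = maximum - minimum
iqr = q3 - q1

print(f"count={count} mean={mean:.3f} std={std:.4f} cv={cv:.8f}")
print(f"range={value_range} iqr={iqr}")
```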
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
39504 | 2 | 0.5%
37362 | 1 | 0.3%
37358 | 1 | 0.3%
37348 | 1 | 0.3%
37347 | 1 | 0.3%
37324 | 1 | 0.3%
37321 | 1 | 0.3%
37327 | 1 | 0.3%
37337 | 1 | 0.3%
39055 | 1 | 0.3%
Other values (321) | 321 | 83.4%
(Missing) | 53 | 13.8%

Value | Count | Frequency (%)
36004 | 1 | 0.3%
36008 | 1 | 0.3%
36016 | 1 | 0.3%
36030 | 1 | 0.3%
36044 | 1 | 0.3%
36050 | 1 | 0.3%
36057 | 1 | 0.3%
36068 | 1 | 0.3%
36073 | 1 | 0.3%
36089 | 1 | 0.3%

Value | Count | Frequency (%)
40235 | 1 | 0.3%
40221 | 1 | 0.3%
40211 | 1 | 0.3%
40163 | 1 | 0.3%
40152 | 1 | 0.3%
40147 | 1 | 0.3%
40136 | 1 | 0.3%
40127 | 1 | 0.3%
40123 | 1 | 0.3%
40118 | 1 | 0.3%

주 소
Text

MISSING 

Distinct: 332
Distinct (%): 100.0%
Missing: 53
Missing (%): 13.8%
Memory size: 3.1 KiB

Length

Max length: 27
Median length: 25
Mean length: 19.713855
Min length: 14

Characters and Unicode

Total characters: 6545
Distinct characters: 234
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 332
Unique (%): 100.0%

Sample

1st row: 경상북도 포항시 남구 구룡포읍 호미로 133
2nd row: 경상북도 포항시 남구 연일읍 철강로 10
3rd row: 경상북도 포항시 남구 오천읍 세계길 5
4th row: 경상북도 포항시 남구 대송면 장동홍계길 19
5th row: 경상북도 포항시 남구 동해면 일월로 66

Value | Count | Frequency (%)
경상북도 | 317 | 19.9%
포항시 | 29 | 1.8%
구미시 | 27 | 1.7%
안동시 | 24 | 1.5%
상주시 | 24 | 1.5%
경주시 | 23 | 1.4%
김천시 | 22 | 1.4%
영주시 | 19 | 1.2%
의성군 | 18 | 1.1%
영천시 | 16 | 1.0%
Other values (787) | 1075 | 67.4%

Most occurring characters

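Value | Count | Frequency (%)
(space) | 1317 | 20.1%
(not preserved) | 396 | 6.1%
(not preserved) | 367 | 5.6%
(not preserved) | 356 | 5.4%
(not preserved) | 355 | 5.4%
(not preserved) | 225 | 3.4%
(not preserved) | 214 | 3.3%
(not preserved) | 202 | 3.1%
1 | 195 | 3.0%
(not preserved) | 133 | 2.0%
Other values (224) | 2785 | 42.6%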

Most occurring categories

ValueCountFrequency (%)
Other Letter | 4252 | 65.0%
Space Separator | 1317 | 20.1%
Decimal Number | 940 | 14.4%
Dash Punctuation | 26 | 0.4%
Close Punctuation | 5 | 0.1%
Open Punctuation | 5 | 0.1%

Most frequent character per category

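Other Letter
Value | Count | Frequency (%)
(not preserved) | 396 | 9.3%
(not preserved) | 367 | 8.6%
(not preserved) | 356 | 8.4%
(not preserved) | 355 | 8.3%
(not preserved) | 225 | 5.3%
(not preserved) | 214 | 5.0%
(not preserved) | 202 | 4.8%
(not preserved) | 133 | 3.1%
(not preserved) | 129 | 3.0%
(not preserved) | 86 | 2.0%
Other values (210) | 1789 | 42.1%

Decimal Number
Value | Count | Frequency (%)
1 | 195 | 20.7%
2 | 113 | 12.0%
3 | 107 | 11.4%
5 | 104 | 11.1%
4 | 83 | 8.8%
0 | 70 | 7.4%
7 | 70 | 7.4%
9 | 69 | 7.3%
8 | 68 | 7.2%
6 | 61 | 6.5%

Space Separator
Value | Count | Frequency (%)
(space) | 1317 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 26 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 5 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 5 | 100.0%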

Most occurring scripts

ValueCountFrequency (%)
Hangul | 4252 | 65.0%
Common | 2293 | 35.0%

Most frequent character per script

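Hangul
Value | Count | Frequency (%)
(not preserved) | 396 | 9.3%
(not preserved) | 367 | 8.6%
(not preserved) | 356 | 8.4%
(not preserved) | 355 | 8.3%
(not preserved) | 225 | 5.3%
(not preserved) | 214 | 5.0%
(not preserved) | 202 | 4.8%
(not preserved) | 133 | 3.1%
(not preserved) | 129 | 3.0%
(not preserved) | 86 | 2.0%
Other values (210) | 1789 | 42.1%

Common
Value | Count | Frequency (%)
(space) | 1317 | 57.4%
1 | 195 | 8.5%
2 | 113 | 4.9%
3 | 107 | 4.7%
5 | 104 | 4.5%
4 | 83 | 3.6%
0 | 70 | 3.1%
7 | 70 | 3.1%
9 | 69 | 3.0%
8 | 68 | 3.0%
Other values (4) | 97 | 4.2%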

Most occurring blocks

ValueCountFrequency (%)
Hangul | 4252 | 65.0%
ASCII | 2293 | 35.0%

Most frequent character per block

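ASCII
Value | Count | Frequency (%)
(space) | 1317 | 57.4%
1 | 195 | 8.5%
2 | 113 | 4.9%
3 | 107 | 4.7%
5 | 104 | 4.5%
4 | 83 | 3.6%
0 | 70 | 3.1%
7 | 70 | 3.1%
9 | 69 | 3.0%
8 | 68 | 3.0%
Other values (4) | 97 | 4.2%

Hangul
Value | Count | Frequency (%)
(not preserved) | 396 | 9.3%
(not preserved) | 367 | 8.6%
(not preserved) | 356 | 8.4%
(not preserved) | 355 | 8.3%
(not preserved) | 225 | 5.3%
(not preserved) | 214 | 5.0%
(not preserved) | 202 | 4.8%
(not preserved) | 133 | 3.1%
(not preserved) | 129 | 3.0%
(not preserved) | 86 | 2.0%
Other values (210) | 1789 | 42.1%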

Interactions

[Four interaction plots; images not preserved.]

Correlations

Variable | 연번 | 시군구 | 우편번호
연번 | 1.000 | 0.982 | 0.950
시군구 | 0.982 | 1.000 | 0.985
우편번호 | 0.950 | 0.985 | 1.000

Variable | 시군구 | 시도
시군구 | 1.000 | 1.000
시도 | 1.000 | 1.000

Variable | 연번 | 우편번호 | 시도 | 시군구
연번 | 1.000 | -0.130 | 1.000 | 0.874
우편번호 | -0.130 | 1.000 | 1.000 | 0.890
시도 | 1.000 | 1.000 | 1.000 | 1.000
시군구 | 0.874 | 0.890 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
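
Nullity correlation can be sketched as the Pearson correlation between 0/1 indicators of whether each column is present in a row. The example below is a minimal standard-library illustration on toy rows, not the code the report itself runs. The helper names are hypothetical, and `None` is assumed to mark a missing cell.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A column that is never (or always) missing has no nullity variation.
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def nullity_correlation(rows, col_a, col_b):
    """Correlate the presence (1) or absence (0) of two columns across rows."""
    present_a = [0 if row[col_a] is None else 1 for row in rows]
    present_b = [0 if row[col_b] is None else 1 for row in rows]
    return pearson(present_a, present_b)


# Toy rows: two complete records from the sample and two empty ones.
rows = [
    {"읍면동": "구룡포읍행정복지센터", "우편번호": 37936},
    {"읍면동": "연일읍행정복지센터", "우편번호": 37852},
    {"읍면동": None, "우편번호": None},
    {"읍면동": None, "우편번호": None},
]

result = nullity_correlation(rows, "읍면동", "우편번호")
print(result)
```

Columns that are always missing together, as in these toy rows, get a nullity correlation of 1.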

Sample

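First rows

Index | 연번 | 시도 | 시군구 | 읍면동 | 우편번호 | 주 소
0 | 1 | 경상북도 | 포항시 남구 | 구룡포읍행정복지센터 | 37936 | 경상북도 포항시 남구 구룡포읍 호미로 133
1 | 2 | 경상북도 | 포항시 남구 | 연일읍행정복지센터 | 37852 | 경상북도 포항시 남구 연일읍 철강로 10
2 | 3 | 경상북도 | 포항시 남구 | 오천읍행정복지센터 | 37912 | 경상북도 포항시 남구 오천읍 세계길 5
3 | 4 | 경상북도 | 포항시 남구 | 대송면행정복지센터 | 37857 | 경상북도 포항시 남구 대송면 장동홍계길 19
4 | 5 | 경상북도 | 포항시 남구 | 동해면행정복지센터 | 37926 | 경상북도 포항시 남구 동해면 일월로 66
5 | 6 | 경상북도 | 포항시 남구 | 장기면행정복지센터 | 37945 | 경상북도 포항시 남구 장기면 읍내길 99
6 | 7 | 경상북도 | 포항시 남구 | 호미곶면행정복지센터 | 37928 | 경상북도 포항시 남구 호미곶면 해맞이로 242
7 | 8 | 경상북도 | 포항시 남구 | 상대동행정복지센터 | 37766 | 경상북도 포항시 남구 상대로 98
8 | 9 | 경상북도 | 포항시 남구 | 해도동행정복지센터 | 37795 | 경상북도 포항시 남구 상공로 235
9 | 10 | 경상북도 | 포항시 남구 | 송도동행정복지센터 | 37800 | 경상북도 포항시 남구 송림로 25

Last rows

Index | 연번 | 시도 | 시군구 | 읍면동 | 우편번호 | 주 소
375 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
376 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
377 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
378 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
379 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
380 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
381 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
382 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
383 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
384 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>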

Duplicate rows

Most frequently occurring

Index | 연번 | 시도 | 시군구 | 읍면동 | 우편번호 | 주 소 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 53
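
The overview reports 1 duplicate row while this table reports 53 duplicates. One plausible reading, offered here as an assumption rather than a documented rule, is that the overview counts distinct row patterns that repeat, while `# duplicates` counts how often each pattern occurs. The sketch below illustrates that distinction on toy rows with a hypothetical helper.

```python
from collections import Counter


def duplicate_summary(rows):
    """Map each row pattern that occurs more than once to its occurrence count."""
    counts = Counter(tuple(row) for row in rows)
    return {pattern: n for pattern, n in counts.items() if n > 1}


# Toy rows: two distinct records and one empty pattern repeated three times.
rows = [
    (1, "경상북도", "포항시 남구"),
    (2, "경상북도", "포항시 남구"),
    (None, None, None),
    (None, None, None),
    (None, None, None),
]

dupes = duplicate_summary(rows)
distinct_patterns = len(dupes)           # analogous to "Duplicate rows"
occurrences = dupes[(None, None, None)]  # analogous to "# duplicates"
print(distinct_patterns, occurrences)
```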