Overview

Dataset statistics

Number of variables: 9
Number of observations: 500
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 91
Duplicate rows (%): 18.2%
Total size in memory: 37.7 KiB
Average record size in memory: 77.3 B
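
The percentages and sizes above are simple ratios of the reported counts. The sketch below recomputes them from the rounded figures shown in this report; the variable names are illustrative, and the small gap in the average record size (about 77.2 B versus the reported 77.3 B) is assumed to come from the total size having been rounded to 37.7 KiB.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_rows = 500
n_duplicate_rows = 91
total_size_kib = 37.7  # already rounded in the report

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = n_duplicate_rows / n_rows * 100

# Average record size in bytes (1 KiB = 1024 bytes).
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```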

Variable types

Numeric: 2
Categorical: 6
Boolean: 1

Dataset

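Description: This file contains the common, general postal code information of the Korea Credit Guarantee Fund (신용보증기금); please refer to it when using the data.
Author: 신용보증기금 (Korea Credit Guarantee Fund)
URL: https://www.data.go.kr/data/15093107/fileData.do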

Alerts

삭제여부 has a constant value (Constant)
최종수정수 has a constant value (Constant)
Dataset has 91 (18.2%) duplicate rows (Duplicates)
최초처리시각 is highly overall correlated with 우편번호 and 5 other fields (High correlation)
최초처리직원번호 is highly overall correlated with 우편번호 and 5 other fields (High correlation)
처리직원번호 is highly overall correlated with 우편번호 and 5 other fields (High correlation)
처리시각 is highly overall correlated with 우편번호 and 5 other fields (High correlation)
우편번호 is highly overall correlated with 제1주소 and 4 other fields (High correlation)
시도구분코드 is highly overall correlated with 제1주소 and 4 other fields (High correlation)
제1주소 is highly overall correlated with 우편번호 and 5 other fields (High correlation)
제1주소 is highly imbalanced (92.1%) (Imbalance)
처리시각 is highly imbalanced (95.0%) (Imbalance)
처리직원번호 is highly imbalanced (89.5%) (Imbalance)
최초처리시각 is highly imbalanced (95.0%) (Imbalance)
최초처리직원번호 is highly imbalanced (89.5%) (Imbalance)
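
The imbalance percentages in these alerts are consistent with a normalized-entropy score: one minus the Shannon entropy of the category frequencies divided by the maximum possible entropy for that number of categories. The sketch below is a minimal illustration under that assumption, not the profiler's own code; `imbalance_score` is a hypothetical helper name, and the counts are taken from the Common Values tables of this report.

```python
from math import log2


def imbalance_score(counts):
    """Return 1 - H/H_max for a list of category counts (assumed formula)."""
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # a single category has no meaningful entropy ratio
    total = sum(counts)
    entropy = -sum((c / total) * log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / log2(n_classes)


# Category counts as listed in this report.
address_counts = [487, 4, 4, 1, 1, 1, 1, 1]   # 제1주소
time_counts = [493, 2, 1, 1, 1, 1, 1]         # 처리시각 / 최초처리시각
staff_counts = [487, 7, 4, 2]                 # 처리직원번호 / 최초처리직원번호

address_score = imbalance_score(address_counts)
time_score = imbalance_score(time_counts)
staff_score = imbalance_score(staff_counts)

for name, score in [("제1주소", address_score),
                    ("처리시각", time_score),
                    ("처리직원번호", staff_score)]:
    print(f"{name}: {score:.1%}")
```

Under this assumption the three scores round to the 92.1%, 95.0%, and 89.5% shown in the alerts.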

Reproduction

Analysis started: 2023-12-12 15:58:29.707788
Analysis finished: 2023-12-12 15:58:30.952852
Duration: 1.25 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

우편번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 259
Distinct (%): 51.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 223028.15
Minimum: 210003
Maximum: 740220
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 210003
5th percentile: 210199.5
Q1: 210801
Median: 210933
Q3: 219839
95th percentile: 240816
Maximum: 740220
Range: 530217
Interquartile range (IQR): 9038

Descriptive statistics

Standard deviation: 41005.732
Coefficient of variation (CV): 0.18385899
Kurtosis: 85.520206
Mean: 223028.15
Median Absolute Deviation (MAD): 231.5
Skewness: 8.3630917
Sum: 1.1151408 × 10^8
Variance: 1.68147 × 10^9
Monotonicity: Not monotonic
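
Several of the 우편번호 statistics are derived from others in the same listing. As a quick consistency check, the sketch below recomputes the range, IQR, coefficient of variation, variance, and sum from the rounded values printed in this report; the variable names are illustrative, and small differences in the last digits are expected because the inputs are already rounded.

```python
# Values copied from the 우편번호 statistics in this report.
n_rows = 500
mean = 223028.15
std_dev = 41005.732
minimum, maximum = 210003, 740220
q1, q3 = 210801, 219839

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std_dev / mean               # Coefficient of variation
variance = std_dev ** 2           # Variance is the squared standard deviation
total = mean * n_rows             # Sum is the mean times the number of rows

print(value_range, iqr)
print(f"CV = {cv:.8f}")
print(f"Variance = {variance:.5e}")
print(f"Sum = {total:.7e}")
```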
[Figure: histogram with fixed-size bins (bins=50)]
Most frequent values
Value | Count | Frequency (%)
219839 | 20 | 4.0%
219903 | 8 | 1.6%
240814 | 8 | 1.6%
219904 | 8 | 1.6%
219831 | 7 | 1.4%
210954 | 7 | 1.4%
210957 | 7 | 1.4%
219901 | 7 | 1.4%
240815 | 7 | 1.4%
219811 | 6 | 1.2%
Other values (249) | 415 | 83.0%

Smallest 10 values
Value | Count | Frequency (%)
210003 | 1 | 0.2%
210010 | 1 | 0.2%
210020 | 1 | 0.2%
210030 | 1 | 0.2%
210040 | 2 | 0.4%
210050 | 1 | 0.2%
210060 | 1 | 0.2%
210070 | 1 | 0.2%
210080 | 1 | 0.2%
210090 | 1 | 0.2%

Largest 10 values
Value | Count | Frequency (%)
740220 | 1 | 0.2%
660031 | 1 | 0.2%
520350 | 1 | 0.2%
486861 | 1 | 0.2%
380873 | 1 | 0.2%
363951 | 1 | 0.2%
362823 | 1 | 0.2%
362810 | 1 | 0.2%
343060 | 1 | 0.2%
339914 | 1 | 0.2%

제1주소
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 8
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 5
Unique (%): 1.0%

Sample

1st row: 세종
2nd row: 세종
3rd row: 세종
4th row: 경북
5th row: 충북

Common Values

Value | Count | Frequency (%)
강원 | 487 | 97.4%
세종 | 4 | 0.8%
충북 | 4 | 0.8%
경북 | 1 | 0.2%
충남 | 1 | 0.2%
경기 | 1 | 0.2%
경남 | 1 | 0.2%
전남 | 1 | 0.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

시도구분코드
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.178
Minimum: 8
Maximum: 18
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 8
5th percentile: 8
Q1: 8
Median: 8
Q3: 8
95th percentile: 8
Maximum: 18
Range: 10
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.2004992
Coefficient of variation (CV): 0.14679619
Kurtosis: 50.353305
Mean: 8.178
Median Absolute Deviation (MAD): 0
Skewness: 7.0987134
Sum: 4089
Variance: 1.4411984
Monotonicity: Not monotonic
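
Because 시도구분코드 has only eight distinct values, its summary statistics can be rebuilt from the value counts listed in this report. The sketch below does so with Python's `statistics` module; it assumes the reported variance and standard deviation are the sample versions (divisor n - 1), which is what the numbers are consistent with.

```python
import statistics

# Value -> count for 시도구분코드, as listed in this report.
code_counts = {8: 487, 18: 4, 16: 4, 11: 1, 15: 1, 9: 1, 10: 1, 12: 1}

# Expand the counts back into one value per observation.
values = [code for code, count in code_counts.items() for _ in range(count)]

total = sum(values)
mean = statistics.fmean(values)
variance = statistics.variance(values)  # sample variance (n - 1 divisor)
std_dev = statistics.stdev(values)      # sample standard deviation

print(f"n = {len(values)}, sum = {total}, mean = {mean:.3f}")
print(f"variance = {variance:.7f}, std = {std_dev:.7f}")
```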
[Figure: histogram with fixed-size bins (bins=8)]
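Values by frequency
Value | Count | Frequency (%)
8 | 487 | 97.4%
18 | 4 | 0.8%
16 | 4 | 0.8%
11 | 1 | 0.2%
15 | 1 | 0.2%
9 | 1 | 0.2%
10 | 1 | 0.2%
12 | 1 | 0.2%

Values in ascending order
Value | Count | Frequency (%)
8 | 487 | 97.4%
9 | 1 | 0.2%
10 | 1 | 0.2%
11 | 1 | 0.2%
12 | 1 | 0.2%
15 | 1 | 0.2%
16 | 4 | 0.8%
18 | 4 | 0.8%

Values in descending order
Value | Count | Frequency (%)
18 | 4 | 0.8%
16 | 4 | 0.8%
15 | 1 | 0.2%
12 | 1 | 0.2%
11 | 1 | 0.2%
10 | 1 | 0.2%
9 | 1 | 0.2%
8 | 487 | 97.4%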

삭제여부
Boolean

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 632.0 B

Value | Count | Frequency (%)
False | 500 | 100.0%

최종수정수
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 500 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

처리시각
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 7
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 5
Unique (%): 1.0%

Sample

1st row: 00:00.0
2nd row: 00:00.0
3rd row: 04:00.0
4th row: 00:00.0
5th row: 00:00.0

Common Values

Value | Count | Frequency (%)
00:00.0 | 493 | 98.6%
08:59.5 | 2 | 0.4%
04:00.0 | 1 | 0.2%
15:00.0 | 1 | 0.2%
19:00.0 | 1 | 0.2%
12:00.0 | 1 | 0.2%
23:56.3 | 1 | 0.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

처리직원번호
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 5176
2nd row: 5176
3rd row: 4925
4th row: 4925
5th row: 4925

Common Values

Value | Count | Frequency (%)
3513 | 487 | 97.4%
4925 | 7 | 1.4%
4800 | 4 | 0.8%
5176 | 2 | 0.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

최초처리시각
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 7
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 5
Unique (%): 1.0%

Sample

1st row: 00:00.0
2nd row: 00:00.0
3rd row: 04:00.0
4th row: 00:00.0
5th row: 00:00.0

Common Values

Value | Count | Frequency (%)
00:00.0 | 493 | 98.6%
08:59.5 | 2 | 0.4%
04:00.0 | 1 | 0.2%
15:00.0 | 1 | 0.2%
19:00.0 | 1 | 0.2%
12:00.0 | 1 | 0.2%
23:56.3 | 1 | 0.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

최초처리직원번호
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 5176
2nd row: 5176
3rd row: 4925
4th row: 4925
5th row: 4925

Common Values

Value | Count | Frequency (%)
3513 | 487 | 97.4%
4925 | 7 | 1.4%
4800 | 4 | 0.8%
5176 | 2 | 0.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

Interactions

[Figures: four interaction plots; images not preserved in this text version]

Correlations

 | 우편번호 | 제1주소 | 시도구분코드 | 처리시각 | 처리직원번호 | 최초처리시각 | 최초처리직원번호
우편번호 | 1.000 | 0.966 | 0.951 | 0.808 | 0.807 | 0.808 | 0.807
제1주소 | 0.966 | 1.000 | 1.000 | 0.925 | 0.983 | 0.925 | 0.983
시도구분코드 | 0.951 | 1.000 | 1.000 | 0.964 | 0.860 | 0.964 | 0.860
처리시각 | 0.808 | 0.925 | 0.964 | 1.000 | 0.771 | 1.000 | 0.771
처리직원번호 | 0.807 | 0.983 | 0.860 | 0.771 | 1.000 | 0.771 | 1.000
최초처리시각 | 0.808 | 0.925 | 0.964 | 1.000 | 0.771 | 1.000 | 0.771
최초처리직원번호 | 0.807 | 0.983 | 0.860 | 0.771 | 1.000 | 0.771 | 1.000

 | 최초처리시각 | 최초처리직원번호 | 처리직원번호 | 제1주소 | 처리시각
최초처리시각 | 1.000 | 0.655 | 0.655 | 0.812 | 1.000
최초처리직원번호 | 0.655 | 1.000 | 1.000 | 0.823 | 0.655
처리직원번호 | 0.655 | 1.000 | 1.000 | 0.823 | 0.655
제1주소 | 0.812 | 0.823 | 0.823 | 1.000 | 0.812
처리시각 | 1.000 | 0.655 | 0.655 | 0.812 | 1.000
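
The matrix above covers only categorical columns, which suggests a categorical association measure such as Cramér's V; the report text does not name the measure, so this is an assumption. The sketch below implements a bias-corrected Cramér's V in plain Python (`cramers_v` is a hypothetical helper, not the profiler's function) and shows why two columns with identical values, such as 처리직원번호 and 최초처리직원번호, score 1.000.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    r, k = len(row_totals), len(col_totals)
    if n < 2 or r < 2 or k < 2:
        return 0.0
    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, rx in row_totals.items():
        for y, cy in col_totals.items():
            expected = rx * cy / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    phi2 = chi2 / n
    # Bias correction for the statistic and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Staff-number counts from this report; both staff columns hold the same values.
staff = ["3513"] * 487 + ["4925"] * 7 + ["4800"] * 4 + ["5176"] * 2
identical = cramers_v(staff, staff)

# A small independent pair, for contrast.
independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])

print(f"identical columns: {identical:.3f}")
print(f"independent columns: {independent:.3f}")
```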

 | 우편번호 | 시도구분코드 | 제1주소 | 처리시각 | 처리직원번호 | 최초처리시각 | 최초처리직원번호
우편번호 | 1.000 | 0.276 | 0.909 | 0.645 | 0.657 | 0.645 | 0.657
시도구분코드 | 0.276 | 1.000 | 0.999 | 0.702 | 0.794 | 0.702 | 0.794
제1주소 | 0.909 | 0.999 | 1.000 | 0.812 | 0.823 | 0.812 | 0.823
처리시각 | 0.645 | 0.702 | 0.812 | 1.000 | 0.655 | 1.000 | 0.655
처리직원번호 | 0.657 | 0.794 | 0.823 | 0.655 | 1.000 | 0.655 | 1.000
최초처리시각 | 0.645 | 0.702 | 0.812 | 1.000 | 0.655 | 1.000 | 0.655
최초처리직원번호 | 0.657 | 0.794 | 0.823 | 0.655 | 1.000 | 0.655 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

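First rows
Index | 우편번호 | 제1주소 | 시도구분코드 | 삭제여부 | 최종수정수 | 처리시각 | 처리직원번호 | 최초처리시각 | 최초처리직원번호
0 | 339004 | 세종 | 18 | N | 1 | 00:00.0 | 5176 | 00:00.0 | 5176
1 | 339824 | 세종 | 18 | N | 1 | 00:00.0 | 5176 | 00:00.0 | 5176
2 | 339913 | 세종 | 18 | N | 1 | 04:00.0 | 4925 | 04:00.0 | 4925
3 | 740220 | 경북 | 11 | N | 1 | 00:00.0 | 4925 | 00:00.0 | 4925
4 | 380873 | 충북 | 16 | N | 1 | 00:00.0 | 4925 | 00:00.0 | 4925
5 | 343060 | 충남 | 15 | N | 1 | 15:00.0 | 4925 | 15:00.0 | 4925
6 | 339914 | 세종 | 18 | N | 1 | 00:00.0 | 4925 | 00:00.0 | 4925
7 | 486861 | 경기 | 9 | N | 1 | 19:00.0 | 4925 | 19:00.0 | 4925
8 | 660031 | 경남 | 10 | N | 1 | 12:00.0 | 4925 | 12:00.0 | 4925
9 | 362823 | 충북 | 16 | N | 1 | 08:59.5 | 4800 | 08:59.5 | 4800

Last rows
Index | 우편번호 | 제1주소 | 시도구분코드 | 삭제여부 | 최종수정수 | 처리시각 | 처리직원번호 | 최초처리시각 | 최초처리직원번호
490 | 240815 | 강원 | 8 | N | 1 | 00:00.0 | 3513 | 00:00.0 | 3513
491 | 240815 | 강원 | 8 | N | 1 | 00:00.0 | 3513 | 00:00.0 | 3513
492 | 240814 | 강원 | 8 | N | 1 | 00:00.0 | 3513 | 00:00.0 | 3513
493 | 240200 | 강원 | 8 | N | 1 | 00:00.0 | 3513 | 00:00.0 | 3513
494 | 240717 | 강원 | 8 | N | 1 | 00:00.0 | 3513 | 00:00.0 | 3513
495 | 240320 | 강원 | 8 | N | 1 | 00:00.0 | 3513 | 00:00.0 | 3513
496 | 240350 | 강원 | 8 | N | 1 | 00:00.0 | 3513 | 00:00.0 | 3513
497 | 240712 | 강원 | 8 | N | 1 | 00:00.0 | 3513 | 00:00.0 | 3513
498 | 240899 | 강원 | 8 | N | 1 | 00:00.0 | 3513 | 00:00.0 | 3513
499 | 240806 | 강원 | 8 | N | 1 | 00:00.0 | 3513 | 00:00.0 | 3513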

Duplicate rows

Most frequently occurring

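Index | 우편번호 | 제1주소 | 시도구분코드 | 삭제여부 | 최종수정수 | 처리시각 | 처리직원번호 | 최초처리시각 | 최초처리직원번호 | # duplicates
74 | 219839 | 강원 | 8 | N | 1 | 00:00.0 | 3513 | 00:00.0 | 3513 | 20
77 | 219903 | 강원 | 8 | N | 1 | 00:00.0 | 3513 | 00:00.0 | 3513 | 8
78 | 219904 | 강원 | 8 | N | 1 | 00:00.0 | 3513 | 00:00.0 | 3513 | 8
85 | 240814 | 강원 | 8 | N | 1 | 00:00.0 | 3513 | 00:00.0 | 3513 | 8
56 | 210954 | 강원 | 8 | N | 1 | 00:00.0 | 3513 | 00:00.0 | 3513 | 7
59 | 210957 | 강원 | 8 | N | 1 | 00:00.0 | 3513 | 00:00.0 | 3513 | 7
71 | 219831 | 강원 | 8 | N | 1 | 00:00.0 | 3513 | 00:00.0 | 3513 | 7
75 | 219901 | 강원 | 8 | N | 1 | 00:00.0 | 3513 | 00:00.0 | 3513 | 7
86 | 240815 | 강원 | 8 | N | 1 | 00:00.0 | 3513 | 00:00.0 | 3513 | 7
65 | 219811 | 강원 | 8 | N | 1 | 00:00.0 | 3513 | 00:00.0 | 3513 | 6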
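
The table above groups identical rows and reports how often each group occurs. Its row labels run up to at least 86, which suggests that the headline figure of 91 duplicate rows counts distinct rows that occur more than once rather than every surplus copy; that reading is an assumption. The sketch below illustrates the grouping with `collections.Counter` on a small made-up sample (the rows and the `duplicate_groups` helper are hypothetical, not taken from the dataset).

```python
from collections import Counter


def duplicate_groups(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(rows)
    return {row: count for row, count in counts.items() if count > 1}


# Hypothetical toy rows (우편번호, 제1주소); not the real dataset.
toy_rows = [
    ("219839", "강원"),
    ("219839", "강원"),
    ("219903", "강원"),
    ("240814", "강원"),
    ("240814", "강원"),
    ("240814", "강원"),
]

groups = duplicate_groups(toy_rows)
n_groups = len(groups)                       # distinct rows with duplicates
share_pct = n_groups / len(toy_rows) * 100   # relative to all rows

# List the groups from most to least frequent, like the table above.
for row, count in sorted(groups.items(), key=lambda item: -item[1]):
    print(row, count)
print(f"{n_groups} duplicated groups ({share_pct:.1f}% of rows)")
```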