Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 242
Missing cells (%): 0.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 752.0 KiB
Average record size in memory: 77.0 B
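
The two derived figures in this panel can be re-derived from the raw counts. The sketch below assumes that the missing-cell percentage is taken over variables × observations and that 1 KiB is 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Re-derive the percentage and per-record figures from the reported counts.
n_variables = 8
n_observations = 10_000
missing_cells = 242
total_size_kib = 752.0  # "Total size in memory" as reported

# Share of missing cells over the full variables x observations grid.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both values round to the figures shown in the report (0.3% and 77.0 B).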

Variable types

Categorical: 2
Text: 3
Numeric: 3

Dataset

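Description: Address-to-map-sheet matching information from the metadata for the digital maps (digital topographic maps) of the National Geographic Information Institute (국토지리정보원). Fields include scale (축척), city/county/district (시군구), city/county/district code (시군구코드), map sheet code (도엽코드), map sheet name (도엽명), and others.
Author: Ministry of Land, Infrastructure and Transport (국토교통부), National Geographic Information Institute (국토지리정보원)
URL: https://www.data.go.kr/data/15067686/fileData.do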

Alerts

축척 is highly overall correlated with 지도종류 (High correlation)
지도종류 is highly overall correlated with 축척 (High correlation)
축척 is highly imbalanced (60.3%) (Imbalance)
도엽명 has 228 (2.3%) missing values (Missing)

Reproduction

Analysis started: 2023-12-12 11:58:13.300795
Analysis finished: 2023-12-12 11:58:15.514618
Duration: 2.21 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

축척
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 6
Median length: 4
Mean length: 4.0083
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 5000
2nd row: 1000
3rd row: 1000
4th row: 1000
5th row: 5000

Common Values

Value | Count | Frequency (%)
1000 | 7807 | 78.1%
5000 | 2148 | 21.5%
250000 | 38 | 0.4%
25000 | 7 | 0.1%
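
The 60.3% imbalance figure reported for 축척 is consistent with one minus the normalized Shannon entropy of the category counts. The following is a minimal sketch under that assumption, not necessarily the library's exact implementation; `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / ln(k) for k category counts.

    The score is 0 for a uniform distribution and approaches 1 as a
    single category dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # convention chosen for this sketch
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)


# Counts of the four 축척 values as reported: 1000, 5000, 250000, 25000.
counts = [7807, 2148, 38, 7]
score = imbalance_score(counts)
print(f"{score:.1%}")
```

With the reported counts the score rounds to 60.3%, matching the alert.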

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

시군구
Text

Distinct: 233
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 3
Mean length: 3.4483
Min length: 2

Characters and Unicode

Total characters: 34483
Distinct characters: 143
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
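
As an illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using the five sample values of 시군구 shown in this report. It is not the profiler's own code. Note that `unicodedata.category` returns abbreviations (`Lo`, `Zs`) where the report prints the long names (Other Letter, Space Separator).

```python
import unicodedata
from collections import Counter

# The five sample values listed for 시군구 in this report.
samples = ["해남군", "청주시 서원구", "서귀포시", "전주시 완산구", "양양군"]

# Count the general category of every character across all samples.
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)

for category, count in category_counts.most_common():
    print(category, count)
```

Hangul syllables fall under `Lo` and the ASCII space under `Zs`, which matches the two categories reported for this variable.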

Unique

Unique: 9
Unique (%): 0.1%

Sample

1st row: 해남군
2nd row: 청주시 서원구
3rd row: 서귀포시
4th row: 전주시 완산구
5th row: 양양군

Words

Value | Count | Frequency (%)
청주시 | 700 | 6.3%
창원시 | 307 | 2.8%
울주군 | 296 | 2.7%
고양시 | 258 | 2.3%
북구 | 247 | 2.2%
서귀포시 | 246 | 2.2%
제주시 | 229 | 2.1%
화성시 | 223 | 2.0%
서구 | 216 | 1.9%
동구 | 205 | 1.8%
Other values (219) | 8224 | 73.8%

Most occurring characters

ValueCountFrequency (%)
6052
17.6%
3654
 
10.6%
2298
 
6.7%
1904
 
5.5%
1151
 
3.3%
1043
 
3.0%
987
 
2.9%
832
 
2.4%
822
 
2.4%
792
 
2.3%
Other values (133) 14948
43.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 33332 | 96.7%
Space Separator | 1151 | 3.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
6052
18.2%
3654
 
11.0%
2298
 
6.9%
1904
 
5.7%
1043
 
3.1%
987
 
3.0%
832
 
2.5%
822
 
2.5%
792
 
2.4%
741
 
2.2%
Other values (132) 14207
42.6%
Space Separator
ValueCountFrequency (%)
1151
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 33332 | 96.7%
Common | 1151 | 3.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
6052
18.2%
3654
 
11.0%
2298
 
6.9%
1904
 
5.7%
1043
 
3.1%
987
 
3.0%
832
 
2.5%
822
 
2.5%
792
 
2.4%
741
 
2.2%
Other values (132) 14207
42.6%
Common
ValueCountFrequency (%)
1151
100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 33332 | 96.7%
ASCII | 1151 | 3.3%

Most frequent character per block

Hangul
ValueCountFrequency (%)
6052
18.2%
3654
 
11.0%
2298
 
6.9%
1904
 
5.7%
1043
 
3.1%
987
 
3.0%
832
 
2.5%
822
 
2.5%
792
 
2.4%
741
 
2.2%
Other values (132) 14207
42.6%
ASCII
ValueCountFrequency (%)
1151
100.0%

시군구코드
Real number (ℝ)

Distinct: 255
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.9100168 × 10^9
Minimum: 1.111 × 10^9
Maximum: 5.013 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1.111 × 10^9
5-th percentile: 1.162 × 10^9
Q1: 3.171 × 10^9
Median: 4.223 × 10^9
Q3: 4.677 × 10^9
95-th percentile: 4.888 × 10^9
Maximum: 5.013 × 10^9
Range: 3.902 × 10^9
Interquartile range (IQR): 1.506 × 10^9

Descriptive statistics

Standard deviation: 1.0286851 × 10^9
Coefficient of variation (CV): 0.26308968
Kurtosis: 1.2094834
Mean: 3.9100168 × 10^9
Median Absolute Deviation (MAD): 4.68 × 10^8
Skewness: -1.4100795
Sum: 3.9100168 × 10^13
Variance: 1.058193 × 10^18
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
4311000000 | 325 | 3.2%
3171000000 | 296 | 3.0%
5013000000 | 246 | 2.5%
5011000000 | 229 | 2.3%
4159000000 | 223 | 2.2%
4719000000 | 195 | 1.9%
4136000000 | 179 | 1.8%
2771000000 | 152 | 1.5%
2920000000 | 133 | 1.3%
4812000000 | 132 | 1.3%
Other values (245) | 7890 | 78.9%

Minimum 10 values

Value | Count | Frequency (%)
1111000000 | 24 | 0.2%
1114000000 | 9 | 0.1%
1117000000 | 29 | 0.3%
1120000000 | 24 | 0.2%
1121500000 | 32 | 0.3%
1123000000 | 26 | 0.3%
1126000000 | 22 | 0.2%
1129000000 | 27 | 0.3%
1130500000 | 19 | 0.2%
1132000000 | 18 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
5013000000 | 246 | 2.5%
5011000000 | 229 | 2.3%
4889000000 | 19 | 0.2%
4888000000 | 30 | 0.3%
4887000000 | 8 | 0.1%
4886000000 | 18 | 0.2%
4885000000 | 12 | 0.1%
4882000000 | 1 | < 0.1%
4874000000 | 1 | < 0.1%
4873000000 | 13 | 0.1%

도엽코드
Text

Distinct: 9429
Distinct (%): 94.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 8.7728
Min length: 6

Characters and Unicode

Total characters: 87728
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8911
Unique (%): 89.1%

Sample

1st row: 34607057
2nd row: 367062517
3rd row: 336120525
4th row: 357012375
5th row: 38815055

Words

Value | Count | Frequency (%)
ni52-2 | 8 | 0.1%
nj52-10 | 8 | 0.1%
nj52-7 | 7 | 0.1%
377092206 | 4 | < 0.1%
377051646 | 4 | < 0.1%
377090366 | 4 | < 0.1%
376121693 | 4 | < 0.1%
nj52-4 | 3 | < 0.1%
367061909 | 3 | < 0.1%
358031171 | 3 | < 0.1%
Other values (9419) | 9952 | 99.5%

Most occurring characters

Value | Count | Frequency (%)
3 | 14587 | 16.6%
0 | 13837 | 15.8%
1 | 11721 | 13.4%
7 | 10345 | 11.8%
6 | 9871 | 11.3%
5 | 7323 | 8.3%
2 | 5816 | 6.6%
8 | 5453 | 6.2%
9 | 4460 | 5.1%
4 | 4201 | 4.8%
Other values (4) | 114 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 87614 | 99.9%
Uppercase Letter | 76 | 0.1%
Dash Punctuation | 38 | < 0.1%

Most frequent character per category

Decimal Number

Value | Count | Frequency (%)
3 | 14587 | 16.6%
0 | 13837 | 15.8%
1 | 11721 | 13.4%
7 | 10345 | 11.8%
6 | 9871 | 11.3%
5 | 7323 | 8.4%
2 | 5816 | 6.6%
8 | 5453 | 6.2%
9 | 4460 | 5.1%
4 | 4201 | 4.8%

Uppercase Letter

Value | Count | Frequency (%)
N | 38 | 50.0%
J | 25 | 32.9%
I | 13 | 17.1%

Dash Punctuation

Value | Count | Frequency (%)
- | 38 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 87652 | 99.9%
Latin | 76 | 0.1%

Most frequent character per script

Common

Value | Count | Frequency (%)
3 | 14587 | 16.6%
0 | 13837 | 15.8%
1 | 11721 | 13.4%
7 | 10345 | 11.8%
6 | 9871 | 11.3%
5 | 7323 | 8.4%
2 | 5816 | 6.6%
8 | 5453 | 6.2%
9 | 4460 | 5.1%
4 | 4201 | 4.8%

Latin

Value | Count | Frequency (%)
N | 38 | 50.0%
J | 25 | 32.9%
I | 13 | 17.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 87728 | 100.0%

Most frequent character per block

ASCII

Value | Count | Frequency (%)
3 | 14587 | 16.6%
0 | 13837 | 15.8%
1 | 11721 | 13.4%
7 | 10345 | 11.8%
6 | 9871 | 11.3%
5 | 7323 | 8.3%
2 | 5816 | 6.6%
8 | 5453 | 6.2%
9 | 4460 | 5.1%
4 | 4201 | 4.8%
Other values (4) | 114 | 0.1%

도엽명
Text

MISSING 

Distinct: 8956
Distinct (%): 91.6%
Missing: 228
Missing (%): 2.3%
Memory size: 156.2 KiB

Length

Max length: 10
Median length: 6
Mean length: 5.6835858
Min length: 2

Characters and Unicode

Total characters: 55540
Distinct characters: 183
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8434
Unique (%): 86.3%

Sample

1st row: 해남057
2nd row: 청주2517
3rd row: 표선0525
4th row: 전주2375
5th row: 속초055

Words

Value | Count | Frequency (%)
마산 | 41 | 0.4%
창원 | 29 | 0.3%
안산시 | 26 | 0.3%
원주 | 22 | 0.2%
부산 | 21 | 0.2%
화성 | 21 | 0.2%
서귀포시 | 20 | 0.2%
안양 | 17 | 0.2%
예안 | 16 | 0.2%
광주 | 13 | 0.1%
Other values (8947) | 9587 | 97.7%

Most occurring characters

Value | Count | Frequency (%)
0 | 6878 | 12.4%
1 | 6108 | 11.0%
2 | 4316 | 7.8%
3 | 2789 | 5.0%
4 | 2779 | 5.0%
5 | 2748 | 4.9%
8 | 2503 | 4.5%
9 | 2496 | 4.5%
7 | 2477 | 4.5%
6 | 2423 | 4.4%
Other values (173) | 20023 | 36.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 35517 | 63.9%
Other Letter | 19914 | 35.9%
Space Separator | 109 | 0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1499
 
7.5%
1262
 
6.3%
1096
 
5.5%
803
 
4.0%
727
 
3.7%
716
 
3.6%
690
 
3.5%
686
 
3.4%
650
 
3.3%
648
 
3.3%
Other values (162) 11137
55.9%

Decimal Number

Value | Count | Frequency (%)
0 | 6878 | 19.4%
1 | 6108 | 17.2%
2 | 4316 | 12.2%
3 | 2789 | 7.9%
4 | 2779 | 7.8%
5 | 2748 | 7.7%
8 | 2503 | 7.0%
9 | 2496 | 7.0%
7 | 2477 | 7.0%
6 | 2423 | 6.8%

Space Separator
ValueCountFrequency (%)
109
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 35626 | 64.1%
Hangul | 19914 | 35.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1499
 
7.5%
1262
 
6.3%
1096
 
5.5%
803
 
4.0%
727
 
3.7%
716
 
3.6%
690
 
3.5%
686
 
3.4%
650
 
3.3%
648
 
3.3%
Other values (162) 11137
55.9%

Common

Value | Count | Frequency (%)
0 | 6878 | 19.3%
1 | 6108 | 17.1%
2 | 4316 | 12.1%
3 | 2789 | 7.8%
4 | 2779 | 7.8%
5 | 2748 | 7.7%
8 | 2503 | 7.0%
9 | 2496 | 7.0%
7 | 2477 | 7.0%
6 | 2423 | 6.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 35626 | 64.1%
Hangul | 19914 | 35.9%

Most frequent character per block

ASCII

Value | Count | Frequency (%)
0 | 6878 | 19.3%
1 | 6108 | 17.1%
2 | 4316 | 12.1%
3 | 2789 | 7.8%
4 | 2779 | 7.8%
5 | 2748 | 7.7%
8 | 2503 | 7.0%
9 | 2496 | 7.0%
7 | 2477 | 7.0%
6 | 2423 | 6.8%

Hangul
ValueCountFrequency (%)
1499
 
7.5%
1262
 
6.3%
1096
 
5.5%
803
 
4.0%
727
 
3.7%
716
 
3.6%
690
 
3.5%
686
 
3.4%
650
 
3.3%
648
 
3.3%
Other values (162) 11137
55.9%

중간X값
Real number (ℝ)

Distinct: 276
Distinct (%): 2.8%
Missing: 7
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 1005413.3
Minimum: 837025
Maximum: 1339430
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 837025
5-th percentile: 910860
Q1: 943118
Median: 981368
Q3: 1076298
95-th percentile: 1152038
Maximum: 1339430
Range: 502405
Interquartile range (IQR): 133180

Descriptive statistics

Standard deviation: 81376.264
Coefficient of variation (CV): 0.080938125
Kurtosis: -0.80944729
Mean: 1005413.3
Median Absolute Deviation (MAD): 49439
Skewness: 0.54433312
Sum: 1.0047095 × 10^10
Variance: 6.6220964 × 10^9
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
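
Several of the statistics reported for 중간X값 are simple functions of the others. The sketch below re-derives range, IQR, coefficient of variation, and variance from the reported minimum, maximum, quartiles, mean, and standard deviation. It assumes the usual definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared); the variable names are illustrative.

```python
# Summary values for 중간X값 as printed in this report.
minimum, maximum = 837025, 1339430
q1, q3 = 943118, 1076298
mean, std = 1005413.3, 81376.264

value_range = maximum - minimum  # spread between the extremes
iqr = q3 - q1                    # spread of the middle 50%
cv = std / mean                  # relative dispersion, unitless
variance = std ** 2              # squared standard deviation

print("Range:", value_range)
print("IQR:", iqr)
print("CV:", round(cv, 6))
print("Variance:", f"{variance:.4e}")
```

The results agree with the reported Range, IQR, CV, and Variance up to the rounding of the printed inputs.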

Common values

Value | Count | Frequency (%)
1002039 | 325 | 3.2%
1152038 | 296 | 3.0%
912313 | 246 | 2.5%
912423 | 229 | 2.3%
941859 | 222 | 2.2%
1076298 | 195 | 1.9%
976250 | 179 | 1.8%
1092502 | 152 | 1.5%
931929 | 133 | 1.3%
1099843 | 132 | 1.3%
Other values (266) | 7884 | 78.8%

Minimum 10 values

Value | Count | Frequency (%)
837025 | 87 | 0.9%
864750 | 38 | 0.4%
870538 | 52 | 0.5%
879452 | 24 | 0.2%
892285 | 21 | 0.2%
896410 | 47 | 0.5%
896979 | 38 | 0.4%
900746 | 25 | 0.2%
904663 | 97 | 1.0%
904664 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
1339430 | 1 | < 0.1%
1174374 | 35 | 0.4%
1173992 | 42 | 0.4%
1173990 | 1 | < 0.1%
1170048 | 86 | 0.9%
1164497 | 45 | 0.4%
1163388 | 29 | 0.3%
1161258 | 4 | < 0.1%
1160927 | 50 | 0.5%
1157768 | 67 | 0.7%

중간Y값
Real number (ℝ)

Distinct: 277
Distinct (%): 2.8%
Missing: 7
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 1809339.6
Minimum: 1478904
Maximum: 2033039
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1478904
5-th percentile: 1597927
Q1: 1715100
Median: 1827831
Q3: 1925193
95-th percentile: 1962468
Maximum: 2033039
Range: 554135
Interquartile range (IQR): 210093

Descriptive statistics

Standard deviation: 126898.19
Coefficient of variation (CV): 0.070135086
Kurtosis: -0.30305282
Mean: 1809339.6
Median Absolute Deviation (MAD): 103186
Skewness: -0.57339783
Sum: 1.808073 × 10^10
Variance: 1.610315 × 10^10
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
1844048 | 326 | 3.3%
1727047 | 296 | 3.0%
1478904 | 246 | 2.5%
1517440 | 229 | 2.3%
1906899 | 222 | 2.2%
1962468 | 205 | 2.1%
1801900 | 195 | 1.9%
1960416 | 179 | 1.8%
1753601 | 152 | 1.5%
1685716 | 133 | 1.3%
Other values (267) | 7810 | 78.1%

Minimum 10 values

Value | Count | Frequency (%)
1478904 | 246 | 2.5%
1517440 | 229 | 2.3%
1580278 | 19 | 0.2%
1597927 | 52 | 0.5%
1606482 | 84 | 0.8%
1615360 | 23 | 0.2%
1619844 | 49 | 0.5%
1624922 | 16 | 0.2%
1628068 | 87 | 0.9%
1628566 | 10 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
2033039 | 7 | 0.1%
2019906 | 15 | 0.1%
2012719 | 6 | 0.1%
2005105 | 30 | 0.3%
2004423 | 1 | < 0.1%
2001880 | 21 | 0.2%
1995709 | 77 | 0.8%
1991281 | 17 | 0.2%
1986726 | 56 | 0.6%
1986724 | 1 | < 0.1%

지도종류
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 102
2nd row: 101
3rd row: 101
4th row: 101
5th row: 101

Common Values

Value | Count | Frequency (%)
101 | 7083 | 70.8%
102 | 1556 | 15.6%
103 | 1354 | 13.5%
105 | 7 | 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Interactions

[Figures: interaction plots]

Correlations

 | 축척 | 시군구코드 | 중간X값 | 중간Y값 | 지도종류
축척 | 1.000 | 0.426 | 0.287 | 0.358 | 0.915
시군구코드 | 0.426 | 1.000 | 0.835 | 0.753 | 0.481
중간X값 | 0.287 | 0.835 | 1.000 | 0.622 | 0.308
중간Y값 | 0.358 | 0.753 | 0.622 | 1.000 | 0.437
지도종류 | 0.915 | 0.481 | 0.308 | 0.437 | 1.000

 | 축척 | 지도종류
축척 | 1.000 | 0.616
지도종류 | 0.616 | 1.000

 | 시군구코드 | 중간X값 | 중간Y값 | 축척 | 지도종류
시군구코드 | 1.000 | 0.055 | -0.492 | 0.201 | 0.232
중간X값 | 0.055 | 1.000 | -0.123 | 0.224 | 0.212
중간Y값 | -0.492 | -0.123 | 1.000 | 0.231 | 0.296
축척 | 0.201 | 0.224 | 0.231 | 1.000 | 0.616
지도종류 | 0.232 | 0.212 | 0.296 | 0.616 | 1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]
[Figure: Correlation heatmap measuring nullity correlation: how strongly the presence or absence of one variable affects the presence of another.]
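
One common way to quantify nullity correlation is to turn each column into a 0/1 presence indicator and compute the Pearson correlation between indicators. The sketch below illustrates that idea only: the rows are hypothetical toy data, not taken from this dataset, and `pearson` and `presence` are illustrative helper names rather than the profiler's API.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when a column is never/always missing
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy rows; None marks a missing cell.
rows = [
    {"x": 1.0, "y": 2.0, "name": "a"},
    {"x": None, "y": None, "name": "b"},
    {"x": 3.0, "y": 4.0, "name": None},
    {"x": None, "y": None, "name": "d"},
]


def presence(column):
    """Return 1 where the column has a value and 0 where it is missing."""
    return [0 if row[column] is None else 1 for row in rows]


xy = pearson(presence("x"), presence("y"))
x_name = pearson(presence("x"), presence("name"))
print(xy, x_name)
```

In the toy data, `x` and `y` are always missing together, so their nullity correlation is 1, while `name` tends to be missing when `x` is present, giving a negative value.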

Sample

(index) | 축척 | 시군구 | 시군구코드 | 도엽코드 | 도엽명 | 중간X값 | 중간Y값 | 지도종류
80511 | 5000 | 해남군 | 4682000000 | 34607057 | 해남057 | 907123 | 1615360 | 102
73501 | 1000 | 청주시 서원구 | 4311200000 | 367062517 | 청주2517 | 994259 | 1838909 | 101
44208 | 1000 | 서귀포시 | 5013000000 | 336120525 | 표선0525 | 912313 | 1478904 | 101
77105 | 1000 | 전주시 완산구 | 4511100000 | 357012375 | 전주2375 | 966883 | 1754732 | 101
88230 | 5000 | 양양군 | 4283000000 | 38815055 | 속초055 | 1097216 | 2001880 | 101
73927 | 1000 | 용인시 기흥구 | 4146300000 | 377130303 | 용인0303 | 966393 | 1919086 | 101
21133 | 1000 | 천안시 | 4413000000 | 367012434 | 평택2434 | 974479 | 1866532 | 103
78886 | 5000 | 보성군 | 4678000000 | 34705013 | 회천013 | 974904 | 1646371 | 102
61891 | 1000 | 이천시 | 4150000000 | 377102597 | 이천2597 | 998579 | 1911410 | 101
33897 | 1000 | 여수시 | 4613000000 | 347031311 | 광양1311 | 1005636 | 1606482 | 101

(index) | 축척 | 시군구 | 시군구코드 | 도엽코드 | 도엽명 | 중간X값 | 중간Y값 | 지도종류
85457 | 5000 | 서구 | 2826000000 | 37611018 | 인천018 | 925385 | 1950975 | 101
55165 | 1000 | 화성시 | 4159000000 | 376160365 | 남양0365 | 941859 | 1906899 | 101
53619 | 1000 | 유성구 | 3020000000 | 367101231 | 대전1231 | 985051 | 1820544 | 101
43309 | 1000 | 양평군 | 4183000000 | 377100505 | 이천0505 | 1006813 | 1946639 | 101
75765 | 1000 | 중구 | 2711000000 | 358031372 | 대구1372 | 1098785 | 1763960 | 101
90614 | 5000 | 단양군 | 4380000000 | 36802019 | 단양019 | 1083106 | 1887217 | 101
38466 | 1000 | 서귀포시 | 5013000000 | 336101449 | 모슬포1449 | 912313 | 1478904 | 101
29019 | 1000 | 구미시 | 4719000000 | 368151111 | <NA> | 1076298 | 1801900 | 101
18750 | 1000 | 시흥시 | 4139000000 | 376120662 | 안양0662 | 932975 | 1932624 | 103
48822 | 1000 | 남양주시 | 4136000000 | 377061711 | 양수1711 | 976250 | 1960416 | 101