Dataset statistics
Number of variables | 9 |
---|---|
Number of observations | 10000 |
Missing cells | 3019 |
Missing cells (%) | 3.4% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 820.3 KiB |
Average record size in memory | 84.0 B |
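The two derived figures in this table can be checked against the raw ones. The sketch below assumes that "Missing cells (%)" is taken over all rows × columns and that the average record size is total memory divided by the row count; both assumptions reproduce the reported values.

```python
# Figures copied from the "Dataset statistics" table above.
n_variables = 9
n_observations = 10_000
missing_cells = 3_019
total_size_kib = 820.3

# Assumption: the percentage is taken over every cell (rows x columns).
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Assumption: average record size = total bytes / number of rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```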
Variable types
Text | 3 |
---|---|
Categorical | 3 |
DateTime | 1 |
Numeric | 2 |
Dataset
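Description | Official notice (고시) information from the metadata for the digital maps (digital topographic maps) of the National Geographic Information Institute. Includes map sheet number (도엽번호), map sheet name (도엽명), scale (축척), notice date (고시일자), map type (지도종류), and other fields. |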
---|---|
Author | National Geographic Information Institute (국토지리정보원), Ministry of Land, Infrastructure and Transport (국토교통부) |
URL | https://www.data.go.kr/data/15067684/fileData.do |
Alerts
조사연도 is highly correlated overall with 제작연도 and 1 other field | High correlation |
제작연도 is highly correlated overall with 조사연도 and 1 other field | High correlation |
촬영연도 is highly correlated overall with 조사연도 and 1 other field | High correlation |
축척 is highly imbalanced (54.4%) | Imbalance |
도엽명 has 876 (8.8%) missing values | Missing |
고시일자 has 517 (5.2%) missing values | Missing |
조사연도 has 521 (5.2%) missing values | Missing |
제작연도 has 584 (5.8%) missing values | Missing |
고시번호 has 521 (5.2%) missing values | Missing |
Reproduction
Analysis started | 2023-12-12 19:08:37.195661 |
---|---|
Analysis finished | 2023-12-12 19:08:39.458066 |
Duration | 2.26 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
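A report like this one could be regenerated roughly as follows. This is a minimal sketch, not the configuration actually used (that is in config.json): the function name `build_report`, the CSV path, the encoding, the report title, and the choice to read the two identifier columns as text are all assumptions.

```python
def build_report(csv_path: str, output_html: str = "report.html",
                 encoding: str = "utf-8") -> None:
    """Profile a CSV export of the dataset and write an HTML report."""
    # Third-party imports live inside the function so that this sketch can be
    # loaded even where pandas / ydata-profiling are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: identifier-like columns are read as text, not numbers.
    df = pd.read_csv(
        csv_path,
        encoding=encoding,  # the real file's encoding is not stated in the report
        dtype={"도엽번호": str, "고시번호": str},
    )

    profile = ProfileReport(
        df,
        title="Digital map notice metadata",  # hypothetical title
        dataset={
            "description": "Official notice information for digital maps",
            "author": "국토교통부 국토지리정보원",
            "url": "https://www.data.go.kr/data/15067684/fileData.do",
        },
    )
    profile.to_file(output_html)
```

Calling `build_report("notices.csv")` with a local copy of the file (the file name is hypothetical) would write `report.html` to the working directory.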
도엽번호
Text
Distinct | 9296 |
---|---|
Distinct (%) | 93.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Value | Count | Frequency (%) |
377121765 | 2 | < 0.1% |
379140789 | 2 | < 0.1% |
37715043 | 2 | < 0.1% |
359100348 | 2 | < 0.1% |
368140110 | 2 | < 0.1% |
347030800 | 2 | < 0.1% |
377102575 | 2 | < 0.1% |
368151170 | 2 | < 0.1% |
367101968 | 2 | < 0.1% |
34711024 | 2 | < 0.1% |
Other values (9286) | 9980 |
Most occurring characters
Value | Count | Frequency (%) |
3 | 15073 | |
0 | 14360 | |
1 | 11559 | |
6 | 8963 | |
7 | 8468 | |
5 | 6866 | |
8 | 5724 | 6.7% |
2 | 5382 | 6.3% |
9 | 4753 | 5.5% |
4 | 4701 | 5.5% |
Other values (12) | 82 | 0.1% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 85849 | |
Uppercase Letter | 82 | 0.1% |
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
I | 13 | |
H | 13 | |
G | 11 | |
E | 8 | |
J | 7 | |
N | 6 | |
D | 5 | 6.1% |
C | 5 | 6.1% |
A | 5 | 6.1% |
B | 4 | 4.9% |
Other values (2) | 5 | 6.1% |
Decimal Number
Value | Count | Frequency (%) |
3 | 15073 | |
0 | 14360 | |
1 | 11559 | |
6 | 8963 | |
7 | 8468 | |
5 | 6866 | |
8 | 5724 | 6.7% |
2 | 5382 | 6.3% |
9 | 4753 | 5.5% |
4 | 4701 | 5.5% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 85849 | |
Latin | 82 | 0.1% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
I | 13 | |
H | 13 | |
G | 11 | |
E | 8 | |
J | 7 | |
N | 6 | |
D | 5 | 6.1% |
C | 5 | 6.1% |
A | 5 | 6.1% |
B | 4 | 4.9% |
Other values (2) | 5 | 6.1% |
Common
Value | Count | Frequency (%) |
3 | 15073 | |
0 | 14360 | |
1 | 11559 | |
6 | 8963 | |
7 | 8468 | |
5 | 6866 | |
8 | 5724 | 6.7% |
2 | 5382 | 6.3% |
9 | 4753 | 5.5% |
4 | 4701 | 5.5% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 85931 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
3 | 15073 | |
0 | 14360 | |
1 | 11559 | |
6 | 8963 | |
7 | 8468 | |
5 | 6866 | |
8 | 5724 | 6.7% |
2 | 5382 | 6.3% |
9 | 4753 | 5.5% |
4 | 4701 | 5.5% |
Other values (12) | 82 | 0.1% |
도엽명
Text
MISSING
Distinct | 7287 |
---|---|
Distinct (%) | 79.9% |
Missing | 876 |
Missing (%) | 8.8% |
Memory size | 156.2 KiB |
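The 79.9% distinct share is only reproduced when missing values are left out of the denominator. The sketch below checks this from the figures in the table; that the tool computes it this way is an inference from the numbers, not something the report states.

```python
n_rows = 10_000
missing = 876     # "Missing" for 도엽명
distinct = 7_287  # "Distinct" for 도엽명

non_missing = n_rows - missing

# Share of distinct values among non-missing values (matches the report).
distinct_pct = 100 * distinct / non_missing

# Share relative to all rows, shown only for comparison.
naive_pct = 100 * distinct / n_rows

print(f"{distinct_pct:.1f}% of non-missing values, {naive_pct:.1f}% of all rows")
```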
Value | Count | Frequency (%) |
마산 | 53 | 0.6% |
남원 | 42 | 0.5% |
담양 | 38 | 0.4% |
장호원 | 38 | 0.4% |
금산 | 38 | 0.4% |
고창 | 37 | 0.4% |
익산 | 36 | 0.4% |
거창 | 36 | 0.4% |
논산 | 35 | 0.4% |
무풍 | 35 | 0.4% |
Other values (7280) | 8807 |
Most occurring characters
Value | Count | Frequency (%) |
0 | 6146 | 13.0% |
1 | 4593 | 9.7% |
2 | 3363 | 7.1% |
3 | 2324 | 4.9% |
7 | 2129 | 4.5% |
4 | 2125 | 4.5% |
5 | 1979 | 4.2% |
8 | 1974 | 4.2% |
9 | 1972 | 4.2% |
6 | 1819 | 3.8% |
Other values (181) | 18833 |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 28424 | |
Other Letter | 18661 | |
Space Separator | 172 | 0.4% |
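The category names in this table are Unicode general categories. As a rough illustration, the standard-library `unicodedata` module yields the same kind of breakdown; the three input strings are 도엽명 values copied from the sample rows of this report, and the sketch is not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# A few 도엽명 values taken from the sample rows of this report.
names = ["진도049", "춘천", "방어진0699"]

# Count the Unicode general category of every character.
# "Lo" = Other Letter (Hangul syllables), "Nd" = Decimal Number,
# "Zs" = Space Separator (not present in these three values).
category_counts = Counter(
    unicodedata.category(ch) for name in names for ch in name
)

print(category_counts)
```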
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
산 | 1471 | 7.9% |
주 | 1139 | 6.1% |
양 | 935 | 5.0% |
천 | 751 | 4.0% |
포 | 585 | 3.1% |
안 | 525 | 2.8% |
전 | 497 | 2.7% |
대 | 485 | 2.6% |
원 | 468 | 2.5% |
부 | 455 | 2.4% |
Other values (170) | 11350 |
Decimal Number
Value | Count | Frequency (%) |
0 | 6146 | |
1 | 4593 | |
2 | 3363 | |
3 | 2324 | 8.2% |
7 | 2129 | 7.5% |
4 | 2125 | 7.5% |
5 | 1979 | 7.0% |
8 | 1974 | 6.9% |
9 | 1972 | 6.9% |
6 | 1819 | 6.4% |
Space Separator
Value | Count | Frequency (%) |
(space) | 172 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 28596 | |
Hangul | 18661 |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
산 | 1471 | 7.9% |
주 | 1139 | 6.1% |
양 | 935 | 5.0% |
천 | 751 | 4.0% |
포 | 585 | 3.1% |
안 | 525 | 2.8% |
전 | 497 | 2.7% |
대 | 485 | 2.6% |
원 | 468 | 2.5% |
부 | 455 | 2.4% |
Other values (170) | 11350 |
Common
Value | Count | Frequency (%) |
0 | 6146 | |
1 | 4593 | |
2 | 3363 | |
3 | 2324 | 8.1% |
7 | 2129 | 7.4% |
4 | 2125 | 7.4% |
5 | 1979 | 6.9% |
8 | 1974 | 6.9% |
9 | 1972 | 6.9% |
6 | 1819 | 6.4% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 28596 | |
Hangul | 18661 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 6146 | |
1 | 4593 | |
2 | 3363 | |
3 | 2324 | 8.1% |
7 | 2129 | 7.4% |
4 | 2125 | 7.4% |
5 | 1979 | 6.9% |
8 | 1974 | 6.9% |
9 | 1972 | 6.9% |
6 | 1819 | 6.4% |
Hangul
Value | Count | Frequency (%) |
산 | 1471 | 7.9% |
주 | 1139 | 6.1% |
양 | 935 | 5.0% |
천 | 751 | 4.0% |
포 | 585 | 3.1% |
안 | 525 | 2.8% |
전 | 497 | 2.7% |
대 | 485 | 2.6% |
원 | 468 | 2.5% |
부 | 455 | 2.4% |
Other values (170) | 11350 |
축척
Categorical
IMBALANCE
Distinct | 5 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 6 |
---|---|
Median length | 4 |
Mean length | 4.0053 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 5000 |
---|---|
2nd row | 5000 |
3rd row | 1000 |
4th row | 1000 |
5th row | 5000 |
Common Values
Value | Count | Frequency (%) |
1000 | 6025 | |
5000 | 3858 | |
2500 | 70 | 0.7% |
25000 | 41 | 0.4% |
250000 | 6 | 0.1% |
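The 54.4% imbalance reported for 축척 is consistent with one minus the normalized Shannon entropy of the value counts above. The sketch below reproduces the figure under that assumption; the formula is inferred from the numbers rather than stated in the report.

```python
import math

# Value counts for 축척, copied from the "Common Values" table.
counts = {"1000": 6025, "5000": 3858, "2500": 70, "25000": 41, "250000": 6}

total = sum(counts.values())
probs = [c / total for c in counts.values()]

# Shannon entropy of the observed distribution (natural log).
entropy = -sum(p * math.log(p) for p in probs)

# Normalize by the maximum entropy for this many classes, then invert:
# 0 means perfectly balanced, 1 means a single class holds everything.
imbalance = 1 - entropy / math.log(len(counts))

print(f"Imbalance: {imbalance:.1%}")
```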
고시일자
Date
MISSING
Distinct | 58 |
---|---|
Distinct (%) | 0.6% |
Missing | 517 |
Missing (%) | 5.2% |
Memory size | 156.2 KiB |
Minimum | 1899-12-30 00:00:00 |
---|---|
Maximum | 2012-02-27 00:00:00 |
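The minimum of 1899-12-30 is the date that serial value 0 maps to under the usual spreadsheet serial-date conversion, so it may be a placeholder for an empty or zero cell rather than a real notice date. This is an assumption about the source data, not something the report states. A minimal sketch of the conversion and of treating that date as missing (both helper names are hypothetical):

```python
from datetime import date, timedelta
from typing import Optional

# Assumption: serial dates are counted in days from 1899-12-30, the epoch
# commonly used when converting spreadsheet serial numbers.
SERIAL_EPOCH = date(1899, 12, 30)


def serial_to_date(serial: int) -> date:
    """Convert a spreadsheet-style day serial to a calendar date."""
    return SERIAL_EPOCH + timedelta(days=serial)


def clean_notice_date(value: Optional[date]) -> Optional[date]:
    """Treat the epoch date as missing; pass every other value through."""
    return None if value == SERIAL_EPOCH else value


print(serial_to_date(0))                      # the epoch itself
print(clean_notice_date(date(1899, 12, 30)))  # treated as missing
print(clean_notice_date(date(2012, 2, 27)))   # kept unchanged
```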
지도종류
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 1 |
---|---|
Median length | 1 |
Mean length | 1 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 3 |
---|---|
2nd row | 0 |
3rd row | 0 |
4th row | 0 |
5th row | 0 |
Common Values
Value | Count | Frequency (%) |
3 | 5178 | 51.8% |
0 | 4822 | 48.2% |
촬영연도
Categorical
HIGH CORRELATION
Distinct | 12 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 3.9994 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 2008 |
---|---|
2nd row | 2010 |
3rd row | 2003 |
4th row | 2011 |
5th row | 2008 |
Common Values
Value | Count | Frequency (%) |
2010 | 2137 | 21.4% |
2008 | 1926 | 19.3% |
2009 | 1518 | 15.2% |
2005 | 1000 | 10.0% |
2006 | 871 | 8.7% |
2007 | 858 | 8.6% |
<NA> | 522 | 5.2% |
2003 | 342 | 3.4% |
2011 | 336 | 3.4% |
2004 | 279 | 2.8% |
Other values (2) | 211 | 2.1% |
조사연도
Real number (ℝ)
HIGH CORRELATION, MISSING
Distinct | 10 |
---|---|
Distinct (%) | 0.1% |
Missing | 521 |
Missing (%) | 5.2% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 2008.2298 |
Minimum | 2002 |
---|---|
Maximum | 2011 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 2002 |
---|---|
5-th percentile | 2004 |
Q1 | 2006 |
median | 2009 |
Q3 | 2010 |
95-th percentile | 2011 |
Maximum | 2011 |
Range | 9 |
Interquartile range (IQR) | 4 |
Descriptive statistics
Standard deviation | 2.4329272 |
---|---|
Coefficient of variation (CV) | 0.0012114785 |
Kurtosis | -0.76722312 |
Mean | 2008.2298 |
Median Absolute Deviation (MAD) | 2 |
Skewness | -0.60014076 |
Sum | 19036010 |
Variance | 5.9191347 |
Monotonicity | Not monotonic |
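The summary statistics for 조사연도 can be recomputed from the value counts listed for this variable. The sketch below assumes the sample variance (n − 1 in the denominator) and excludes the 521 missing values; under those assumptions it reproduces the reported sum, mean, variance, standard deviation and coefficient of variation.

```python
import math

# Value counts for 조사연도 from the frequency table (missing values excluded).
counts = {
    2002: 79, 2003: 324, 2004: 397, 2005: 834, 2006: 943,
    2007: 914, 2008: 832, 2009: 1120, 2010: 2066, 2011: 1970,
}

n = sum(counts.values())                              # non-missing observations
total = sum(year * c for year, c in counts.items())   # "Sum"
mean = total / n

# Assumption: sample variance, i.e. divide by n - 1.
variance = sum(c * (year - mean) ** 2 for year, c in counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean                                       # coefficient of variation

print(n, total)
print(f"mean={mean:.4f} variance={variance:.4f} std={std:.4f} cv={cv:.6f}")
```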
Value | Count | Frequency (%) |
2010 | 2066 | 20.7% |
2011 | 1970 | 19.7% |
2009 | 1120 | 11.2% |
2006 | 943 | 9.4% |
2007 | 914 | 9.1% |
2005 | 834 | 8.3% |
2008 | 832 | 8.3% |
2004 | 397 | 4.0% |
2003 | 324 | 3.2% |
2002 | 79 | 0.8% |
(Missing) | 521 | 5.2% |
Value | Count | Frequency (%) |
2002 | 79 | 0.8% |
2003 | 324 | 3.2% |
2004 | 397 | 4.0% |
2005 | 834 | |
2006 | 943 | |
2007 | 914 | |
2008 | 832 | |
2009 | 1120 | |
2010 | 2066 | |
2011 | 1970 |
Value | Count | Frequency (%) |
2011 | 1970 | |
2010 | 2066 | |
2009 | 1120 | |
2008 | 832 | |
2007 | 914 | |
2006 | 943 | |
2005 | 834 | |
2004 | 397 | 4.0% |
2003 | 324 | 3.2% |
2002 | 79 | 0.8% |
제작연도
Real number (ℝ)
HIGH CORRELATION, MISSING
Distinct | 10 |
---|---|
Distinct (%) | 0.1% |
Missing | 584 |
Missing (%) | 5.8% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 2008.2004 |
Minimum | 2002 |
---|---|
Maximum | 2011 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 2002 |
---|---|
5-th percentile | 2004 |
Q1 | 2006 |
median | 2009 |
Q3 | 2010 |
95-th percentile | 2011 |
Maximum | 2011 |
Range | 9 |
Interquartile range (IQR) | 4 |
Descriptive statistics
Standard deviation | 2.4267678 |
---|---|
Coefficient of variation (CV) | 0.0012084291 |
Kurtosis | -0.77343126 |
Mean | 2008.2004 |
Median Absolute Deviation (MAD) | 2 |
Skewness | -0.58622743 |
Sum | 18909215 |
Variance | 5.8892022 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
2010 | 2036 | |
2011 | 1892 | |
2009 | 1144 | |
2006 | 943 | |
2007 | 935 | |
2005 | 834 | |
2008 | 832 | |
2004 | 397 | 4.0% |
2003 | 324 | 3.2% |
2002 | 79 | 0.8% |
(Missing) | 584 | 5.8% |
Value | Count | Frequency (%) |
2002 | 79 | 0.8% |
2003 | 324 | 3.2% |
2004 | 397 | 4.0% |
2005 | 834 | |
2006 | 943 | |
2007 | 935 | |
2008 | 832 | |
2009 | 1144 | |
2010 | 2036 | |
2011 | 1892 |
Value | Count | Frequency (%) |
2011 | 1892 | |
2010 | 2036 | |
2009 | 1144 | |
2008 | 832 | |
2007 | 935 | |
2006 | 943 | |
2005 | 834 | |
2004 | 397 | 4.0% |
2003 | 324 | 3.2% |
2002 | 79 | 0.8% |
고시번호
Text
MISSING
Distinct | 58 |
---|---|
Distinct (%) | 0.6% |
Missing | 521 |
Missing (%) | 5.2% |
Memory size | 156.2 KiB |
Value | Count | Frequency (%) |
2011-1080 | 1470 | |
2010-52 | 790 | 8.3% |
2006-755 | 716 | 7.6% |
2010-777 | 704 | 7.4% |
2008-875 | 655 | 6.9% |
2010-953 | 592 | 6.2% |
2012-260 | 468 | 4.9% |
2007-675 | 397 | 4.2% |
2010-907 | 345 | 3.6% |
2005-124 | 292 | 3.1% |
Other values (48) | 3050 |
Most occurring characters
Value | Count | Frequency (%) |
0 | 21111 | |
2 | 12210 | |
- | 9479 | |
1 | 8810 | |
7 | 6283 | 8.2% |
5 | 6064 | 7.9% |
8 | 3941 | 5.2% |
6 | 3150 | 4.1% |
3 | 2616 | 3.4% |
9 | 1488 | 2.0% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 66816 | |
Dash Punctuation | 9479 | 12.4% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 21111 | |
2 | 12210 | |
1 | 8810 | |
7 | 6283 | 9.4% |
5 | 6064 | 9.1% |
8 | 3941 | 5.9% |
6 | 3150 | 4.7% |
3 | 2616 | 3.9% |
9 | 1488 | 2.2% |
4 | 1143 | 1.7% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 9479 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 76295 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 21111 | |
2 | 12210 | |
- | 9479 | |
1 | 8810 | |
7 | 6283 | 8.2% |
5 | 6064 | 7.9% |
8 | 3941 | 5.2% |
6 | 3150 | 4.1% |
3 | 2616 | 3.4% |
9 | 1488 | 2.0% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 76295 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 21111 | |
2 | 12210 | |
- | 9479 | |
1 | 8810 | |
7 | 6283 | 8.2% |
5 | 6064 | 7.9% |
8 | 3941 | 5.2% |
6 | 3150 | 4.1% |
3 | 2616 | 3.4% |
9 | 1488 | 2.0% |
Variable | 축척 | 고시일자 | 지도종류 | 촬영연도 | 조사연도 | 제작연도 | 고시번호 |
---|---|---|---|---|---|---|---|
축척 | 1.000 | 0.890 | 0.064 | 0.543 | 0.540 | 0.538 | 0.937 |
고시일자 | 0.890 | 1.000 | 0.289 | 0.987 | 0.999 | 0.999 | 1.000 |
지도종류 | 0.064 | 0.289 | 1.000 | 0.173 | 0.059 | 0.059 | 0.288 |
촬영연도 | 0.543 | 0.987 | 0.173 | 1.000 | 0.943 | 0.943 | 0.987 |
조사연도 | 0.540 | 0.999 | 0.059 | 0.943 | 1.000 | 1.000 | 1.000 |
제작연도 | 0.538 | 0.999 | 0.059 | 0.943 | 1.000 | 1.000 | 1.000 |
고시번호 | 0.937 | 1.000 | 0.288 | 0.987 | 1.000 | 1.000 | 1.000 |
Variable | 축척 | 촬영연도 | 지도종류 |
---|---|---|---|
축척 | 1.000 | 0.337 | 0.078 |
촬영연도 | 0.337 | 1.000 | 0.166 |
지도종류 | 0.078 | 0.166 | 1.000 |
Variable | 조사연도 | 제작연도 | 축척 | 지도종류 | 촬영연도 |
---|---|---|---|---|---|
조사연도 | 1.000 | 0.999 | 0.350 | 0.105 | 0.787 |
제작연도 | 0.999 | 1.000 | 0.348 | 0.106 | 0.787 |
축척 | 0.350 | 0.348 | 1.000 | 0.078 | 0.337 |
지도종류 | 0.105 | 0.106 | 0.078 | 1.000 | 0.166 |
촬영연도 | 0.787 | 0.787 | 0.337 | 0.166 | 1.000 |
Index | 도엽번호 | 도엽명 | 축척 | 고시일자 | 지도종류 | 촬영연도 | 조사연도 | 제작연도 | 고시번호 |
---|---|---|---|---|---|---|---|---|---|
6499 | 34610049 | 진도049 | 5000 | 2010-01-19 | 3 | 2008 | 2009 | 2009 | 2010-52 |
47680 | 37703072 | 춘천 | 5000 | 2011-12-26 | 0 | 2010 | 2011 | 2011 | 2011-1080 |
10690 | 359100699 | 방어진0699 | 1000 | 2005-02-04 | 0 | 2003 | 2004 | 2004 | 2005-124 |
8024 | 367101886 | 대전1886 | 1000 | 2012-02-27 | 0 | 2011 | 2011 | 2011 | 2012-260 |
965 | 34611079 | <NA> | 5000 | 2010-01-19 | 0 | 2008 | 2009 | 2009 | 2010-52 |
14604 | 37806077 | 봉평077 | 5000 | 2010-12-24 | 3 | 2009 | 2010 | 2010 | 2010-953 |
12834 | 376071978 | 김포1978 | 1000 | 2004-01-05 | 0 | 2002 | 2003 | 2003 | 2004-001 |
37127 | 356041886 | 익산1886 | 1000 | 2005-10-17 | 3 | 2005 | 2005 | 2005 | 2005-643 |
11603 | 359130382 | 부산0382 | 1000 | 2006-12-29 | 0 | 2005 | 2006 | 2006 | 2006-755 |
48025 | 35705073 | 갈담 | 5000 | 2011-12-26 | 3 | 2010 | 2011 | 2011 | 2011-1080 |
Index | 도엽번호 | 도엽명 | 축척 | 고시일자 | 지도종류 | 촬영연도 | 조사연도 | 제작연도 | 고시번호 |
---|---|---|---|---|---|---|---|---|---|
23222 | 336061968 | 한림1968 | 1000 | 2012-02-27 | 0 | 2011 | 2011 | 2011 | 2012-260 |
12625 | 34612074 | <NA> | 5000 | 2010-01-19 | 0 | 2008 | 2009 | 2009 | 2010-52 |
29179 | 336102017 | 모슬포2017 | 1000 | 2006-12-29 | 0 | 2006 | 2006 | 2006 | 2006-755 |
25439 | 347030966 | 광양0966 | 1000 | 2010-11-09 | 0 | 2009 | 2010 | 2010 | 2010-777 |
19565 | 36702071 | 진천 | 5000 | 2011-12-26 | 0 | 2010 | 2011 | 2011 | 2011-1080 |
38200 | 368141835 | 구미1835 | 1000 | 2008-12-30 | 3 | 2008 | 2008 | 2008 | 2008-875 |
30839 | 359130176 | 부산0176 | 1000 | 2006-06-07 | 3 | 2005 | 2006 | 2006 | 2006-353 |
51357 | 35704037 | 무풍 | 5000 | 2011-12-26 | 3 | 2010 | 2011 | 2011 | 2011-1080 |
11101 | 367101310 | 대전1310 | 1000 | 2005-12-14 | 0 | 2005 | 2005 | 2005 | 2005-498 |
49465 | 36713090 | 논산 | 5000 | 2011-12-26 | 3 | 2010 | 2011 | 2011 | 2011-1080 |
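In the sample rows, 고시번호 appears to follow a `YYYY-sequence` pattern whose year matches the year of 고시일자, which would help explain the 1.000 correlation between the two fields. The sketch below checks this on a few (고시일자, 고시번호) pairs copied from the rows above. The format is inferred from these samples only, and the helper `split_notice_number` is hypothetical.

```python
from datetime import date
from typing import Tuple

# (고시일자, 고시번호) pairs copied from the sample rows.
samples = [
    ("2010-01-19", "2010-52"),
    ("2011-12-26", "2011-1080"),
    ("2004-01-05", "2004-001"),
    ("2012-02-27", "2012-260"),
    ("2006-06-07", "2006-353"),
]


def split_notice_number(notice_no: str) -> Tuple[int, int]:
    """Split 'YYYY-N' into (year, sequence). Assumed format, see lead-in."""
    year, seq = notice_no.split("-", 1)
    return int(year), int(seq)


# For each pair, does the notice-number year equal the notice-date year?
year_matches = [
    date.fromisoformat(d).year == split_notice_number(no)[0]
    for d, no in samples
]

print(year_matches)
```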