Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 4488
Missing cells (%): 5.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 761.7 KiB
Average record size in memory: 78.0 B

Variable types

Numeric: 5
Text: 1
Unsupported: 1
Categorical: 1

Dataset

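Description: Map-sheet address matching information, part of the metadata for the digital maps (digital topographic maps) of the National Geographic Information Institute. Columns include 축척 (scale), 도엽명 (sheet name), 도엽번호 (sheet number), 최대값X (maximum X), 최대값Y (maximum Y), and others.
Author: National Geographic Information Institute, Ministry of Land, Infrastructure and Transport (국토교통부 국토지리정보원)
URL: https://www.data.go.kr/data/15067688/fileData.do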

Alerts

최대값X is highly overall correlated with 최소값X [High correlation]
최대값Y is highly overall correlated with 최소값Y [High correlation]
최소값X is highly overall correlated with 최대값X [High correlation]
최소값Y is highly overall correlated with 최대값Y [High correlation]
도엽명 has 444 (4.4%) missing values [Missing]
최대값X has 1011 (10.1%) missing values [Missing]
최대값Y has 1011 (10.1%) missing values [Missing]
최소값X has 1011 (10.1%) missing values [Missing]
최소값Y has 1011 (10.1%) missing values [Missing]
축척 is highly skewed (γ1 = 25.01129627) [Skewed]
도엽번호 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
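
The per-column missing counts in the alerts can be reconciled with the dataset-level figures in the overview. A minimal sketch in Python using only numbers reported above; the name `missing_by_column` is introduced here purely for illustration:

```python
# Per-column missing counts, copied from the "Missing" alerts.
missing_by_column = {
    "도엽명": 444,
    "최대값X": 1011,
    "최대값Y": 1011,
    "최소값X": 1011,
    "최소값Y": 1011,
}

# Shape of the profiled table, from "Dataset statistics".
n_rows, n_columns = 10000, 8
total_cells = n_rows * n_columns

missing_cells = sum(missing_by_column.values())
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)          # 4488
print(f"{missing_pct:.1f}%")  # 5.6%
```

The five columns listed in the alerts account for all 4488 missing cells, so the remaining three columns have no missing values.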

Reproduction

Analysis started: 2023-12-12 08:57:17.155998
Analysis finished: 2023-12-12 08:57:22.778101
Duration: 5.62 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
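
A report of this kind could be regenerated roughly as follows. This is a sketch under stated assumptions, not the configuration actually used: the CSV path, the `cp949` encoding, the report title, and the helper name `build_report` are all hypothetical, and the real settings live in the `config.json` mentioned above. `ProfileReport`, its `dataset` metadata argument, and `to_file` are part of ydata-profiling.

```python
# Dataset metadata as shown in the "Dataset" section of this report.
DATASET_META = {
    "description": "국토지리정보원의 수치지도(수치지형도) 관련 메타데이터 중 도엽주소매칭 정보입니다.",
    "author": "국토교통부 국토지리정보원",
    "url": "https://www.data.go.kr/data/15067688/fileData.do",
}


def build_report(csv_path, output_path="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report (illustrative helper).

    csv_path and encoding are assumptions; adjust them to the actual file.
    """
    # Imported lazily so this module can be loaded without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Map sheet index profile", dataset=DATASET_META)
    report.to_file(output_path)
    return report
```

Calling `build_report("sheet_index.csv")` (a placeholder file name) would write `report.html` next to the script.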

Variables

축척 (scale)
Real number (ℝ)

SKEWED

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3253.25
Minimum: 1000
Maximum: 250000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1000
5-th percentile: 1000
Q1: 1000
Median: 1000
Q3: 5000
95-th percentile: 5000
Maximum: 250000
Range: 249000
Interquartile range (IQR): 4000

Descriptive statistics

Standard deviation: 6351.0844
Coefficient of variation (CV): 1.9522276
Kurtosis: 918.41393
Mean: 3253.25
Median Absolute Deviation (MAD): 0
Skewness: 25.011296
Sum: 32532500
Variance: 40336273
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Common values
Value   Count  Frequency (%)
1000    5515   55.1%
5000    4253   42.5%
25000   131    1.3%
2500    71     0.7%
50000   26     0.3%
250000  4      < 0.1%

Minimum 10 values
Value   Count  Frequency (%)
1000    5515   55.1%
2500    71     0.7%
5000    4253   42.5%
25000   131    1.3%
50000   26     0.3%
250000  4      < 0.1%

Maximum 10 values
Value   Count  Frequency (%)
250000  4      < 0.1%
50000   26     0.3%
25000   131    1.3%
5000    4253   42.5%
2500    71     0.7%
1000    5515   55.1%
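
Because 축척 takes only six distinct values, its summary statistics can be rebuilt from the frequency table alone. The sketch below assumes the sample variance (n - 1 denominator) and the bias-adjusted Fisher-Pearson skewness, which is what pandas computes by default. The report does not state its formulas, so treat this as a consistency check rather than a description of its internals.

```python
import math

# Value -> count, copied from the frequency table for 축척.
scale_counts = {1000: 5515, 2500: 71, 5000: 4253, 25000: 131, 50000: 26, 250000: 4}

n = sum(scale_counts.values())
total = sum(value * count for value, count in scale_counts.items())
mean = total / n

# Sums of squared and cubed deviations from the mean.
ss2 = sum(count * (value - mean) ** 2 for value, count in scale_counts.items())
ss3 = sum(count * (value - mean) ** 3 for value, count in scale_counts.items())

variance = ss2 / (n - 1)   # sample variance
std = math.sqrt(variance)
cv = std / mean            # coefficient of variation

# Bias-adjusted skewness: sqrt(n(n-1)) / (n-2) * m3 / m2^1.5
m2, m3 = ss2 / n, ss3 / n
skewness = math.sqrt(n * (n - 1)) / (n - 2) * m3 / m2 ** 1.5

print(n, total, mean)  # 10000 32532500 3253.25
print(round(variance), round(std, 4), round(cv, 5), round(skewness, 3))
```

The results agree closely with the reported sum, mean, variance, standard deviation, CV and skewness. The large skewness comes from the handful of 250000 and 50000 entries sitting far above a median of 1000.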

도엽명 (sheet name)
Text

MISSING

Distinct: 6539
Distinct (%): 68.4%
Missing: 444
Missing (%): 4.4%
Memory size: 156.2 KiB

Length

Max length: 10
Median length: 9
Mean length: 4.6176224
Min length: 1

Characters and Unicode

Total characters: 44126
Distinct characters: 201
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
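
As a small illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for the five sample values of 도엽명 shown in this report; the variable names are illustrative.

```python
import unicodedata
from collections import Counter

# The five sample values of 도엽명 listed in this report.
sample_names = ["당진0430", "공주077", "삼가", "김포2561", "서울2374"]

# Count the Unicode general category of every character.
# "Lo" is Other Letter (Hangul syllables), "Nd" is Decimal Number (digits).
category_counts = Counter(
    unicodedata.category(ch) for name in sample_names for ch in name
)

print(category_counts)  # Counter({'Nd': 15, 'Lo': 10})
```

The standard library does not expose the script or block properties, so those two breakdowns would need a third-party package or a lookup table.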

Unique

Unique: 5956
Unique (%): 62.3%

Sample

1st row: 당진0430
2nd row: 공주077
3rd row: 삼가
4th row: 김포2561
5th row: 서울2374

Value                Count  Frequency (%)
원주                 68     0.7%
양산                 59     0.6%
김포                 50     0.5%
창원                 48     0.5%
언양                 46     0.5%
마산                 45     0.5%
화성                 38     0.4%
광양                 36     0.4%
구정                 35     0.4%
광주                 34     0.4%
Other values (6529)  9133   95.2%

Most occurring characters

Value               Count  Frequency (%)
0                   5245   11.9%
1                   3853   8.7%
2                   2954   6.7%
3                   2036   4.6%
5                   1815   4.1%
4                   1803   4.1%
8                   1783   4.0%
7                   1724   3.9%
9                   1713   3.9%
6                   1605   3.6%
Other values (191)  19595  44.4%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    24531  55.6%
Other Letter      19448  44.1%
Space Separator   146    0.3%
Dash Punctuation  1      < 0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
[…]                 1206   6.2%
[…]                 1059   5.4%
[…]                 1020   5.2%
[…]                 819    4.2%
[…]                 610    3.1%
[…]                 604    3.1%
[…]                 599    3.1%
[…]                 513    2.6%
[…]                 490    2.5%
[…]                 476    2.4%
Other values (179)  12052  62.0%

Decimal Number
Value  Count  Frequency (%)
0      5245   21.4%
1      3853   15.7%
2      2954   12.0%
3      2036   8.3%
5      1815   7.4%
4      1803   7.3%
8      1783   7.3%
7      1724   7.0%
9      1713   7.0%
6      1605   6.5%

Space Separator
Value    Count  Frequency (%)
(space)  146    100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  24678  55.9%
Hangul  19448  44.1%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
[…]                 1206   6.2%
[…]                 1059   5.4%
[…]                 1020   5.2%
[…]                 819    4.2%
[…]                 610    3.1%
[…]                 604    3.1%
[…]                 599    3.1%
[…]                 513    2.6%
[…]                 490    2.5%
[…]                 476    2.4%
Other values (179)  12052  62.0%

Common
Value             Count  Frequency (%)
0                 5245   21.3%
1                 3853   15.6%
2                 2954   12.0%
3                 2036   8.3%
5                 1815   7.4%
4                 1803   7.3%
8                 1783   7.2%
7                 1724   7.0%
9                 1713   6.9%
6                 1605   6.5%
Other values (2)  147    0.6%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   24678  55.9%
Hangul  19448  44.1%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
0                 5245   21.3%
1                 3853   15.6%
2                 2954   12.0%
3                 2036   8.3%
5                 1815   7.4%
4                 1803   7.3%
8                 1783   7.2%
7                 1724   7.0%
9                 1713   6.9%
6                 1605   6.5%
Other values (2)  147    0.6%

Hangul
Value               Count  Frequency (%)
[…]                 1206   6.2%
[…]                 1059   5.4%
[…]                 1020   5.2%
[…]                 819    4.2%
[…]                 610    3.1%
[…]                 604    3.1%
[…]                 599    3.1%
[…]                 513    2.6%
[…]                 490    2.5%
[…]                 476    2.4%
Other values (179)  12052  62.0%

도엽번호 (sheet number)
Unsupported

REJECTED  UNSUPPORTED

Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
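
The report does not say why 도엽번호 was rejected. One common cause, assumed here purely for illustration, is a column that mixes integers, floats and strings. Normalizing every value to a string before profiling would let such a column be treated as text or categorical. The helper name `normalize_sheet_number` and the mixed-type input list are hypothetical.

```python
def normalize_sheet_number(value):
    """Return a sheet number as a stripped string, or None if it is missing."""
    if value is None:
        return None
    if isinstance(value, float):
        if value != value:          # NaN is the only value not equal to itself
            return None
        if value.is_integer():
            value = int(value)      # 35809092.0 -> 35809092
    return str(value).strip()


# Hypothetical mixed-type input; the digits mirror sheet numbers in the sample rows.
raw = [366030430, "36709077", 35809092.0, " 376072561 ", None]
cleaned = [normalize_sheet_number(v) for v in raw]

print(cleaned)  # ['366030430', '36709077', '35809092', '376072561', None]
```

Keeping the identifier as a string also preserves any leading zeros that a numeric cast would drop.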

최대값X (maximum X)
Real number (ℝ)

HIGH CORRELATION  MISSING

Distinct: 8149
Distinct (%): 90.7%
Missing: 1011
Missing (%): 10.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 1020824
Minimum: 780779
Maximum: 1388222
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 780779
5-th percentile: 901213
Q1: 944723
Median: 1004533
Q3: 1098080
95-th percentile: 1159094.4
Maximum: 1388222
Range: 607443
Interquartile range (IQR): 153357

Descriptive statistics

Standard deviation: 85983.462
Coefficient of variation (CV): 0.084229471
Kurtosis: -1.1349459
Mean: 1020824
Median Absolute Deviation (MAD): 70979
Skewness: 0.20791729
Sum: 9.1761866 × 10^9
Variance: 7.3931558 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                Count  Frequency (%)
1000000              56     0.6%
999553               5      0.1%
1000457              5      0.1%
1000913              5      0.1%
998659               4      < 0.1%
999557               4      < 0.1%
1003653              4      < 0.1%
1071228              3      < 0.1%
1002219              3      < 0.1%
993294               3      < 0.1%
Other values (8139)  8897   89.0%
(Missing)            1011   10.1%

Minimum 10 values
Value   Count  Frequency (%)
780779  1      < 0.1%
789275  2      < 0.1%
805487  1      < 0.1%
809892  1      < 0.1%
814586  2      < 0.1%
816710  1      < 0.1%
827930  1      < 0.1%
846452  1      < 0.1%
848066  1      < 0.1%
848698  1      < 0.1%

Maximum 10 values
Value    Count  Frequency (%)
1388222  1      < 0.1%
1300773  1      < 0.1%
1296250  2      < 0.1%
1293940  1      < 0.1%
1293842  1      < 0.1%
1222460  1      < 0.1%
1207051  1      < 0.1%
1189221  2      < 0.1%
1187087  1      < 0.1%
1185722  1      < 0.1%

최대값Y (maximum Y)
Real number (ℝ)

HIGH CORRELATION  MISSING

Distinct: 7859
Distinct (%): 87.4%
Missing: 1011
Missing (%): 10.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 1791197.6
Minimum: 1465550
Maximum: 2558033
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1465550
5-th percentile: 1507269
Q1: 1690803
Median: 1792722
Q3: 1914299
95-th percentile: 1966722.8
Maximum: 2558033
Range: 1092483
Interquartile range (IQR): 223496

Descriptive statistics

Standard deviation: 130171.3
Coefficient of variation (CV): 0.072672772
Kurtosis: -0.38114472
Mean: 1791197.6
Median Absolute Deviation (MAD): 108612
Skewness: -0.36020926
Sum: 1.6101076 × 10^10
Variance: 1.6944566 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                Count  Frequency (%)
1914018              6      0.1%
1915682              6      0.1%
1661129              6      0.1%
1920119              5      0.1%
1695210              4      < 0.1%
1664456              4      < 0.1%
1921229              4      < 0.1%
1825278              4      < 0.1%
1708880              4      < 0.1%
1642853              4      < 0.1%
Other values (7849)  8942   89.4%
(Missing)            1011   10.1%

Minimum 10 values
Value    Count  Frequency (%)
1465550  1      < 0.1%
1471121  1      < 0.1%
1471432  1      < 0.1%
1471437  1      < 0.1%
1471441  1      < 0.1%
1471446  1      < 0.1%
1471610  1      < 0.1%
1471676  1      < 0.1%
1471964  1      < 0.1%
1471982  1      < 0.1%

Maximum 10 values
Value    Count  Frequency (%)
2558033  1      < 0.1%
2042082  1      < 0.1%
2041971  1      < 0.1%
2036607  1      < 0.1%
2036582  1      < 0.1%
2036557  1      < 0.1%
2036533  1      < 0.1%
2036487  1      < 0.1%
2033833  1      < 0.1%
2033670  1      < 0.1%

최소값X (minimum X)
Real number (ℝ)

HIGH CORRELATION  MISSING

Distinct: 8157
Distinct (%): 90.7%
Missing: 1011
Missing (%): 10.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 1019205.1
Minimum: 680551
Maximum: 1385875
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 680551
5-th percentile: 899582.8
Q1: 943723
Median: 1002283
Q3: 1096088
95-th percentile: 1157767
Maximum: 1385875
Range: 705324
Interquartile range (IQR): 152365

Descriptive statistics

Standard deviation: 86025.71
Coefficient of variation (CV): 0.084404711
Kurtosis: -1.1046918
Mean: 1019205.1
Median Absolute Deviation (MAD): 70326
Skewness: 0.2032475
Sum: 9.1616344 × 10^9
Variance: 7.4004228 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                Count  Frequency (%)
1000000              69     0.7%
1000913              5      0.1%
999113               5      0.1%
999106               5      0.1%
999544               4      < 0.1%
1000456              4      < 0.1%
998212               4      < 0.1%
999543               4      < 0.1%
1003196              4      < 0.1%
1098365              3      < 0.1%
Other values (8147)  8882   88.8%
(Missing)            1011   10.1%

Minimum 10 values
Value   Count  Frequency (%)
680551  1      < 0.1%
778406  1      < 0.1%
786920  2      < 0.1%
793733  1      < 0.1%
807544  1      < 0.1%
812240  2      < 0.1%
814363  1      < 0.1%
825639  1      < 0.1%
838682  1      < 0.1%
840127  1      < 0.1%

Maximum 10 values
Value    Count  Frequency (%)
1385875  1      < 0.1%
1298461  1      < 0.1%
1293940  2      < 0.1%
1291632  1      < 0.1%
1291534  1      < 0.1%
1186909  2      < 0.1%
1185260  1      < 0.1%
1185248  1      < 0.1%
1184774  1      < 0.1%
1182405  1      < 0.1%

최소값Y (minimum Y)
Real number (ℝ)

HIGH CORRELATION  MISSING

Distinct: 7803
Distinct (%): 86.8%
Missing: 1011
Missing (%): 10.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 1789254.2
Minimum: 1462619
Maximum: 2444082
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1462619
5-th percentile: 1506680.2
Q1: 1689863
Median: 1791630
Q3: 1912384
95-th percentile: 1964151.4
Maximum: 2444082
Range: 981463
Interquartile range (IQR): 222521

Descriptive statistics

Standard deviation: 130005.71
Coefficient of variation (CV): 0.072659163
Kurtosis: -0.44566627
Mean: 1789254.2
Median Absolute Deviation (MAD): 108428
Skewness: -0.36628596
Sum: 1.6083606 × 10^10
Variance: 1.6901486 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                Count  Frequency (%)
1656139              6      0.1%
1915127              6      0.1%
1660575              6      0.1%
1711588              5      0.1%
1844686              5      0.1%
1660020              5      0.1%
1919565              5      0.1%
1846904              4      < 0.1%
1847459              4      < 0.1%
1661683              4      < 0.1%
Other values (7793)  8939   89.4%
(Missing)            1011   10.1%

Minimum 10 values
Value    Count  Frequency (%)
1462619  1      < 0.1%
1462751  1      < 0.1%
1470561  1      < 0.1%
1470783  1      < 0.1%
1470847  1      < 0.1%
1470873  1      < 0.1%
1470878  1      < 0.1%
1470883  1      < 0.1%
1470887  1      < 0.1%
1470939  1      < 0.1%

Maximum 10 values
Value    Count  Frequency (%)
2444082  1      < 0.1%
2039284  1      < 0.1%
2033808  1      < 0.1%
2033783  1      < 0.1%
2033759  1      < 0.1%
2033736  1      < 0.1%
2033691  1      < 0.1%
2031034  1      < 0.1%
2030875  1      < 0.1%
2028260  1      < 0.1%

지도종류 (map type)
Categorical

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value  Count
101    4283
102    3987
103    1684
104    44
105    2

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 101
2nd row: 102
3rd row: 101
4th row: 102
5th row: 103

Common Values

Value  Count  Frequency (%)
101    4283   42.8%
102    3987   39.9%
103    1684   16.8%
104    44     0.4%
105    2      < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
101    4283   42.8%
102    3987   39.9%
103    1684   16.8%
104    44     0.4%
105    2      < 0.1%

Interactions

[Figures: 25 pairwise interaction plots; images not included]

Correlations

          축척   최대값X  최대값Y  최소값X  최소값Y  지도종류
축척      1.000  0.651  0.464  0.084  0.464  0.080
최대값X   0.651  1.000  0.619  0.958  0.621  0.074
최대값Y   0.464  0.619  1.000  0.385  0.980  0.085
최소값X   0.084  0.958  0.385  1.000  0.402  0.071
최소값Y   0.464  0.621  0.980  0.402  1.000  0.062
지도종류  0.080  0.074  0.085  0.071  0.062  1.000

          축척   최대값X  최대값Y  최소값X  최소값Y  지도종류
축척      1.000  0.027  0.042  0.012  0.028  0.040
최대값X   0.027  1.000  0.051  0.999  0.052  0.045
최대값Y   0.042  0.051  1.000  0.051  1.000  0.059
최소값X   0.012  0.999  0.051  1.000  0.052  0.045
최소값Y   0.028  0.052  1.000  0.052  1.000  0.043
지도종류  0.040  0.045  0.059  0.045  0.043  1.000
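
The variable pairs behind the High correlation alerts can be read off a matrix like the one above by scanning its upper triangle. The sketch below uses the second matrix and a cut-off of 0.9. The cut-off is an illustrative assumption, not the threshold configured in ydata-profiling, and the variable names are hypothetical.

```python
columns = ["축척", "최대값X", "최대값Y", "최소값X", "최소값Y", "지도종류"]

# Second correlation matrix from this section, row by row.
matrix = [
    [1.000, 0.027, 0.042, 0.012, 0.028, 0.040],
    [0.027, 1.000, 0.051, 0.999, 0.052, 0.045],
    [0.042, 0.051, 1.000, 0.051, 1.000, 0.059],
    [0.012, 0.999, 0.051, 1.000, 0.052, 0.045],
    [0.028, 0.052, 1.000, 0.052, 1.000, 0.043],
    [0.040, 0.045, 0.059, 0.045, 0.043, 1.000],
]

THRESHOLD = 0.9  # illustrative cut-off, not taken from the report's configuration

# Visit each unordered pair once (upper triangle, diagonal excluded).
high_pairs = [
    (columns[i], columns[j], matrix[i][j])
    for i in range(len(columns))
    for j in range(i + 1, len(columns))
    if abs(matrix[i][j]) >= THRESHOLD
]

for left, right, value in high_pairs:
    print(left, right, value)
# 최대값X 최소값X 0.999
# 최대값Y 최소값Y 1.0
```

With the same cut-off, the first matrix yields the same two pairs (0.958 and 0.980), which matches the four symmetric alerts listed at the top of the report.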

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

(index)  축척  도엽명    도엽번호   최대값X  최대값Y  최소값X  최소값Y  지도종류
23554    1000  당진0430  366030430  928809   1888251  928360   1887692  101
49346    5000  공주077   36709077   970829   1814238  968575   1811458  102
787      5000  삼가      35809092   1050036  1697864  1047747  1695079  101
11480    1000  김포2561  376072561  929748   1947046  929301   1946487  102
45618    1000  서울2374  376082374  <NA>     <NA>     <NA>     <NA>     103
68902    5000  동곡      35808033   1119948  1742896  1117648  1740093  101
22891    1000  밀양2144  358122144  1115504  1699019  1115042  1698459  102
70365    1000  관기2063  367122063  1041742  1813727  1041291  1813171  102
66485    1000  창원2190  358112190  1095518  1696567  1095057  1696008  102
75086    1000  대부0198  376150198  914812   1912235  914362   1911676  102

(index)  축척  도엽명    도엽번호   최대값X  최대값Y  최소값X  최소값Y  지도종류
33810    1000  울산1814  359061814  1169417  1734824  1168953  1734261  101
93333    5000  익산076   35604076   945804   1758904  943528   1756117  102
77969    1000  이천2583  377102583  996896   1917901  996453   1917347  102
73688    1000  안양1039  376121039  955324   1937437  954879   1936880  102
60810    5000  순천      34702097   993137   1642281  990847   1639507  101
17766    1000  구미1339  368141339  1080533  1793478  1080078  1792919  101
7184     5000  김천      36813062   1049523  1789365  1047257  1786580  101
72454    1000  거제1802  348031802  1101482  1651156  1101019  1650596  102
95269    5000  함안      35814012   1072824  1692473  1070527  1689683  102
35670    1000  엄정0828  377160828  1034616  1910206  1034170  1909650  101
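
A quick check on a few sample rows suggests why each maximum coordinate tracks its minimum so closely: the difference between them (the sheet extent) is tiny compared with the spread of the coordinates themselves. The figures below are read from the first four sample rows; the tuple layout and variable names are illustrative, and the coordinate units are not stated in the report.

```python
# (축척, 최대값X, 최대값Y, 최소값X, 최소값Y) read from the first four sample rows.
rows = [
    (1000, 928809, 1888251, 928360, 1887692),    # 당진0430
    (5000, 970829, 1814238, 968575, 1811458),    # 공주077
    (5000, 1050036, 1697864, 1047747, 1695079),  # 삼가
    (1000, 929748, 1947046, 929301, 1946487),    # 김포2561
]

# Sheet extent along each axis: maximum minus minimum.
extents = [
    (scale, max_x - min_x, max_y - min_y)
    for scale, max_x, max_y, min_x, min_y in rows
]

for scale, width, height in extents:
    print(scale, width, height)
# 1000 449 559
# 5000 2254 2780
# 5000 2289 2785
# 1000 447 559
```

These extents of a few hundred to a few thousand units compare with a reported range of 607443 for 최대값X, so the minimum is essentially the maximum shifted by a small, scale-dependent offset.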