Overview

Dataset statistics

Number of variables: 10
Number of observations: 4523
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 366.7 KiB
Average record size in memory: 83.0 B
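
The average record size follows from two figures in this section: total memory divided by the number of observations. A quick check, using only values reported here (the variable names are illustrative):

```python
# Values copied from the dataset statistics in this section.
total_size_kib = 366.7      # "Total size in memory"
n_observations = 4523       # "Number of observations"

# 1 KiB = 1024 bytes; divide by the row count for the per-record average.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")  # prints "83.0 B"
```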

Variable types

Text: 1
Categorical: 6
Numeric: 2
DateTime: 1

Dataset

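Description: Status of management areas in Hadong-gun as registered in the Urban Planning Information System. Fields: 현황도형 관리번호 (status shape management number), 도형 대분류코드 (shape major classification code), 도형 속성코드 (shape attribute code), 도형 조서관리 코드 (shape record management code), 결정고시관리코드 (decision notice management code), 라벨명 (label name), 면적(도형) (shape area), 길이(도형) (shape length), 시군구코드 (si/gun/gu code), 현황도형 생성일시 (status shape creation date/time).
Author: Hadong-gun, Gyeongsangnam-do (경상남도 하동군)
URL: https://www.data.go.kr/data/15124043/fileData.do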

Alerts

시군구코드 has constant value "48850" [Constant]
라벨명 is highly overall correlated with 도형 대분류코드 and 2 other fields [High correlation]
도형 속성코드 is highly overall correlated with 도형 대분류코드 and 2 other fields [High correlation]
도형 조서관리 코드 is highly overall correlated with 도형 대분류코드 and 3 other fields [High correlation]
도형 대분류코드 is highly overall correlated with 도형 속성코드 and 2 other fields [High correlation]
결정고시관리코드 is highly overall correlated with 도형 조서관리 코드 [High correlation]
면적(도형) is highly overall correlated with 길이(도형) [High correlation]
길이(도형) is highly overall correlated with 면적(도형) [High correlation]
도형 조서관리 코드 is highly imbalanced (64.9%) [Imbalance]
결정고시관리코드 is highly imbalanced (81.2%) [Imbalance]
면적(도형) is highly skewed (γ1 = 25.65925134) [Skewed]
현황도형 관리번호 has unique values [Unique]

Reproduction

Analysis started: 2023-12-12 23:01:50.240343
Analysis finished: 2023-12-12 23:01:51.463217
Duration: 1.22 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
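
The reported duration is the difference between the two timestamps in this section. A minimal check with the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-12 23:01:50.240343")
finished = datetime.fromisoformat("2023-12-12 23:01:51.463217")

duration = finished - started
print(f"{duration.total_seconds():.2f} s")  # prints "1.22 s"
```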

Variables

현황도형 관리번호 (status shape management number)
Text

UNIQUE

Distinct: 4523
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 35.5 KiB

Length

Max length: 24
Median length: 24
Mean length: 24
Min length: 24

Characters and Unicode

Total characters: 108552
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
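
As an illustration of these character properties, the sketch below classifies the characters of one identifier from this column with Python's standard `unicodedata` module. Scaling the per-value counts by the row count assumes that every identifier has the same mix of digits and letters. The report does not state this directly, but the result matches the category totals it gives for this column.

```python
import unicodedata
from collections import Counter

# One sample value from the identifier column in this report.
sample_id = "48850UQ112PS201202094128"

# unicodedata.category returns the two-letter general category:
# "Nd" is Decimal Number, "Lu" is Uppercase Letter.
category_counts = Counter(unicodedata.category(ch) for ch in sample_id)

# Assumption: all 4523 identifiers share this 20-digit / 4-letter mix.
n_rows = 4523
print(category_counts["Nd"] * n_rows, category_counts["Lu"] * n_rows)  # prints "90460 18092"
```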

Unique

Unique: 4523
Unique (%): 100.0%

Sample

1st row: 48850UQ112PS201202094128
2nd row: 48850UQ112PS201202094124
3rd row: 48850UQ112PS201202094116
4th row: 48850UQ112PS201202094117
5th row: 48850UQ112PS201202094121
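
The sample identifiers appear to share a fixed 24-character layout. The sketch below splits one into fields under an assumed layout: a 5-digit prefix equal to the constant 시군구코드, a 5-character code, a 2-letter code, an 8-digit date, and a 4-digit serial. The field names and their interpretation are guesses from the samples, not a documented specification.

```python
import re
from datetime import date

# Hypothetical layout inferred from the sample rows, not an official format.
ID_PATTERN = re.compile(
    r"(?P<sigungu>\d{5})"        # equals the constant 시군구코드 (48850) in the samples
    r"(?P<kind>[A-Z]{2}\d{3})"   # e.g. UQ112
    r"(?P<shape>[A-Z]{2})"       # e.g. PS
    r"(?P<ymd>\d{8})"            # looks like a YYYYMMDD date
    r"(?P<serial>\d{4})"         # trailing serial number
)

def parse_id(value: str) -> dict:
    """Split an identifier into the assumed fields, or raise ValueError."""
    match = ID_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"unexpected identifier layout: {value!r}")
    fields = match.groupdict()
    ymd = fields.pop("ymd")
    fields["date"] = date(int(ymd[:4]), int(ymd[4:6]), int(ymd[6:]))
    return fields

print(parse_id("48850UQ112PS201202094128"))
```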
Value | Count | Frequency (%)
48850uq112ps201202094128 | 1 | < 0.1%
48850uq112ps201202093901 | 1 | < 0.1%
48850uq112ps201202093906 | 1 | < 0.1%
48850uq112ps201202093907 | 1 | < 0.1%
48850uq112ps201202093902 | 1 | < 0.1%
48850uq112ps201202093896 | 1 | < 0.1%
48850uq112ps201202093905 | 1 | < 0.1%
48850uq112ps201202093900 | 1 | < 0.1%
48850uq112ps201202093904 | 1 | < 0.1%
48850uq112ps201202093894 | 1 | < 0.1%
Other values (4513) | 4513 | 99.8%

Most occurring characters

Value | Count | Frequency (%)
2 | 20725 | 19.1%
0 | 19786 | 18.2%
1 | 16660 | 15.3%
8 | 10332 | 9.5%
4 | 6485 | 6.0%
5 | 5946 | 5.5%
9 | 5456 | 5.0%
U | 4523 | 4.2%
Q | 4523 | 4.2%
P | 4523 | 4.2%
Other values (4) | 9593 | 8.8%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 90460 | 83.3%
Uppercase Letter | 18092 | 16.7%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 20725 | 22.9%
0 | 19786 | 21.9%
1 | 16660 | 18.4%
8 | 10332 | 11.4%
4 | 6485 | 7.2%
5 | 5946 | 6.6%
9 | 5456 | 6.0%
3 | 2474 | 2.7%
6 | 1302 | 1.4%
7 | 1294 | 1.4%

Uppercase Letter
Value | Count | Frequency (%)
U | 4523 | 25.0%
Q | 4523 | 25.0%
P | 4523 | 25.0%
S | 4523 | 25.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 90460 | 83.3%
Latin | 18092 | 16.7%

Most frequent character per script

Common
Value | Count | Frequency (%)
2 | 20725 | 22.9%
0 | 19786 | 21.9%
1 | 16660 | 18.4%
8 | 10332 | 11.4%
4 | 6485 | 7.2%
5 | 5946 | 6.6%
9 | 5456 | 6.0%
3 | 2474 | 2.7%
6 | 1302 | 1.4%
7 | 1294 | 1.4%

Latin
Value | Count | Frequency (%)
U | 4523 | 25.0%
Q | 4523 | 25.0%
P | 4523 | 25.0%
S | 4523 | 25.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 108552 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
2 | 20725 | 19.1%
0 | 19786 | 18.2%
1 | 16660 | 15.3%
8 | 10332 | 9.5%
4 | 6485 | 6.0%
5 | 5946 | 5.5%
9 | 5456 | 5.0%
U | 4523 | 4.2%
Q | 4523 | 4.2%
P | 4523 | 4.2%
Other values (4) | 9593 | 8.8%

도형 대분류코드 (shape major classification code)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 35.5 KiB

Value | Count
UQB300 | 3515
UQB100 | 653
UQB200 | 355

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: UQB300
2nd row: UQB300
3rd row: UQB300
4th row: UQB300
5th row: UQB300

Common Values

Value | Count | Frequency (%)
UQB300 | 3515 | 77.7%
UQB100 | 653 | 14.4%
UQB200 | 355 | 7.8%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
uqb300 | 3515 | 77.7%
uqb100 | 653 | 14.4%
uqb200 | 355 | 7.8%

도형 속성코드 (shape attribute code)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 35.5 KiB

Value | Count
UQB300 | 3515
UQB100 | 653
UQB200 | 355

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: UQB300
2nd row: UQB300
3rd row: UQB300
4th row: UQB300
5th row: UQB300

Common Values

Value | Count | Frequency (%)
UQB300 | 3515 | 77.7%
UQB100 | 653 | 14.4%
UQB200 | 355 | 7.8%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
uqb300 | 3515 | 77.7%
uqb100 | 653 | 14.4%
uqb200 | 355 | 7.8%

도형 조서관리 코드 (shape record management code)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 22
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 35.5 KiB

Value | Count
48850ARZ201202091011 | 3331
48850ARZ201202091009 | 433
48850ARZ201202091010 | 256
48850ARZ202201063186 | 106
48850ARZ201211293137 | 93
Other values (17) | 304

Length

Max length: 20
Median length: 20
Mean length: 20
Min length: 20

Unique

Unique: 7
Unique (%): 0.2%

Sample

1st row: 48850ARZ201202091011
2nd row: 48850ARZ201202091011
3rd row: 48850ARZ201202091011
4th row: 48850ARZ201202091011
5th row: 48850ARZ201202091011

Common Values

Value | Count | Frequency (%)
48850ARZ201202091011 | 3331 | 73.6%
48850ARZ201202091009 | 433 | 9.6%
48850ARZ201202091010 | 256 | 5.7%
48850ARZ202201063186 | 106 | 2.3%
48850ARZ201211293137 | 93 | 2.1%
48850ARZ202201063184 | 91 | 2.0%
48850ARZ202201063185 | 86 | 1.9%
48850ARZ201512313168 | 51 | 1.1%
48850ARZ201211293135 | 37 | 0.8%
48850ARZ201506113165 | 9 | 0.2%
Other values (12) | 30 | 0.7%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
48850arz201202091011 | 3331 | 73.6%
48850arz201202091009 | 433 | 9.6%
48850arz201202091010 | 256 | 5.7%
48850arz202201063186 | 106 | 2.3%
48850arz201211293137 | 93 | 2.1%
48850arz202201063184 | 91 | 2.0%
48850arz202201063185 | 86 | 1.9%
48850arz201512313168 | 51 | 1.1%
48850arz201211293135 | 37 | 0.8%
48850arz201506113165 | 9 | 0.2%
Other values (12) | 30 | 0.7%

결정고시관리코드 (decision notice management code)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 13
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 35.5 KiB

Value | Count
48850NTC201202091000 | 4020
48850NTC202201060806 | 283
48850NTC201211290321 | 137
48850NTC201512310470 | 52
48850NTC201506110443 | 9
Other values (8) | 22

Length

Max length: 20
Median length: 20
Mean length: 20
Min length: 20

Unique

Unique: 4
Unique (%): 0.1%

Sample

1st row: 48850NTC201202091000
2nd row: 48850NTC201202091000
3rd row: 48850NTC201202091000
4th row: 48850NTC201202091000
5th row: 48850NTC201202091000

Common Values

Value | Count | Frequency (%)
48850NTC201202091000 | 4020 | 88.9%
48850NTC202201060806 | 283 | 6.3%
48850NTC201211290321 | 137 | 3.0%
48850NTC201512310470 | 52 | 1.1%
48850NTC201506110443 | 9 | 0.2%
48850NTC201312190378 | 7 | 0.2%
48850NTC201411200408 | 6 | 0.1%
48850NTC201708310528 | 3 | 0.1%
48850NTC201602250480 | 2 | < 0.1%
48850NTC202109160784 | 1 | < 0.1%
Other values (3) | 3 | 0.1%
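
The 81.2% imbalance reported for this column can be reproduced from the Common Values counts as one minus the normalized Shannon entropy. That formula is an assumption about how the profiler scores imbalance; it is used here because it matches the reported figure. The three "Other values" are read as occurring once each, which is consistent with the Unique count of 4.

```python
import math

# Counts from the Common Values table; "Other values (3) 3" is expanded
# into three values that occur once each (an assumption, see above).
counts = [4020, 283, 137, 52, 9, 7, 6, 3, 2, 1, 1, 1, 1]

def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the counts."""
    total = sum(counts)
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    max_entropy = math.log(len(counts))
    return 1.0 - entropy / max_entropy if max_entropy > 0 else 0.0

score = imbalance_score(counts)
print(f"{score:.1%}")  # prints "81.2%"
```

A score near 0 means the categories are evenly used; a score near 1 means a single category dominates, as 48850NTC201202091000 does here with 88.9% of rows.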

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
48850ntc201202091000 | 4020 | 88.9%
48850ntc202201060806 | 283 | 6.3%
48850ntc201211290321 | 137 | 3.0%
48850ntc201512310470 | 52 | 1.1%
48850ntc201506110443 | 9 | 0.2%
48850ntc201312190378 | 7 | 0.2%
48850ntc201411200408 | 6 | 0.1%
48850ntc201708310528 | 3 | 0.1%
48850ntc201602250480 | 2 | < 0.1%
48850ntc202109160784 | 1 | < 0.1%
Other values (3) | 3 | 0.1%

라벨명 (label name)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 35.5 KiB

Value | Count
보전관리지역 (conservation management area) | 3515
계획관리지역 (planned management area) | 653
생산관리지역 (production management area) | 355

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 보전관리지역
2nd row: 보전관리지역
3rd row: 보전관리지역
4th row: 보전관리지역
5th row: 보전관리지역

Common Values

Value | Count | Frequency (%)
보전관리지역 | 3515 | 77.7%
계획관리지역 | 653 | 14.4%
생산관리지역 | 355 | 7.8%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

면적(도형) (shape area)
Real number (ℝ)

HIGH CORRELATION, SKEWED

Distinct: 4455
Distinct (%): 98.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 46384.035
Minimum: 0
Maximum: 14750034
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 39.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 33.109
Q1: 141.84
median: 469.38
Q3: 3050.77
95-th percentile: 161746.89
Maximum: 14750034
Range: 14750034
Interquartile range (IQR): 2908.93

Descriptive statistics

Standard deviation: 319283.23
Coefficient of variation (CV): 6.8834724
Kurtosis: 1033.2542
Mean: 46384.035
Median Absolute Deviation (MAD): 404.92
Skewness: 25.659251
Sum: 2.0979499 × 10^8
Variance: 1.0194178 × 10^11
Monotonicity: Not monotonic
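
Several of these figures are derived from others in the same tables, which gives a quick consistency check: CV is standard deviation over mean, IQR is Q3 minus Q1, the sum is mean times the number of observations, and the variance is the squared standard deviation. A minimal sketch using only the reported values (variable names are illustrative):

```python
# Values copied from the quantile and descriptive statistics for 면적(도형).
n = 4523
mean = 46384.035
std = 319283.23
q1, q3 = 141.84, 3050.77

cv = std / mean        # coefficient of variation, reported as 6.8834724
iqr = q3 - q1          # interquartile range, reported as 2908.93
total = mean * n       # reported Sum is 2.0979499e8
variance = std ** 2    # reported Variance is 1.0194178e11

print(f"CV={cv:.4f} IQR={iqr:.2f} sum={total:.4e} var={variance:.4e}")
```

The mean (46384.035) is roughly 99 times the median (469.38), which is consistent with the strong right skew flagged for this column.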
[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
71.18 | 3 | 0.1%
148.88 | 3 | 0.1%
53.39 | 2 | < 0.1%
81.52 | 2 | < 0.1%
69.22 | 2 | < 0.1%
66.02 | 2 | < 0.1%
97.28 | 2 | < 0.1%
130.93 | 2 | < 0.1%
99.07 | 2 | < 0.1%
392.07 | 2 | < 0.1%
Other values (4445) | 4501 | 99.5%

Minimum 10 values
Value | Count | Frequency (%)
0.0 | 1 | < 0.1%
0.16 | 1 | < 0.1%
0.29 | 1 | < 0.1%
0.4 | 1 | < 0.1%
0.52 | 1 | < 0.1%
0.57 | 1 | < 0.1%
1.32 | 1 | < 0.1%
1.49 | 1 | < 0.1%
1.64 | 1 | < 0.1%
1.67 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
14750033.99 | 1 | < 0.1%
4657860.41 | 1 | < 0.1%
4314148.43 | 1 | < 0.1%
3784089.32 | 1 | < 0.1%
3599932.94 | 1 | < 0.1%
3309746.15 | 1 | < 0.1%
3227822.64 | 1 | < 0.1%
3045720.07 | 1 | < 0.1%
2709974.46 | 1 | < 0.1%
2694047.21 | 1 | < 0.1%

길이(도형) (shape length)
Real number (ℝ)

HIGH CORRELATION

Distinct: 4255
Distinct (%): 94.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 973.87847
Minimum: 0.14
Maximum: 94373.89
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 39.9 KiB

Quantile statistics

Minimum: 0.14
5-th percentile: 28.292
Q1: 59.475
median: 113.69
Q3: 346.81
95-th percentile: 3936.247
Maximum: 94373.89
Range: 94373.75
Interquartile range (IQR): 287.335

Descriptive statistics

Standard deviation: 3746.839
Coefficient of variation (CV): 3.8473373
Kurtosis: 145.60961
Mean: 973.87847
Median Absolute Deviation (MAD): 70.94
Skewness: 9.7660839
Sum: 4404852.3
Variance: 14038803
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
52.22 | 4 | 0.1%
61.38 | 3 | 0.1%
65.66 | 3 | 0.1%
32.89 | 3 | 0.1%
48.95 | 3 | 0.1%
76.8 | 3 | 0.1%
49.24 | 3 | 0.1%
40.19 | 3 | 0.1%
95.4 | 3 | 0.1%
502.87 | 3 | 0.1%
Other values (4245) | 4492 | 99.3%

Minimum 10 values
Value | Count | Frequency (%)
0.14 | 1 | < 0.1%
2.42 | 1 | < 0.1%
3.05 | 1 | < 0.1%
3.69 | 1 | < 0.1%
4.97 | 1 | < 0.1%
6.39 | 1 | < 0.1%
6.63 | 1 | < 0.1%
6.88 | 1 | < 0.1%
7.38 | 1 | < 0.1%
8.21 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
94373.89 | 1 | < 0.1%
57370.48 | 1 | < 0.1%
57251.91 | 1 | < 0.1%
48282.24 | 1 | < 0.1%
45225.06 | 1 | < 0.1%
40962.5 | 1 | < 0.1%
40136.24 | 1 | < 0.1%
39570.47 | 1 | < 0.1%
38461.37 | 1 | < 0.1%
36995.36 | 1 | < 0.1%

시군구코드 (si/gun/gu code)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 35.5 KiB

Value | Count
48850 | 4523

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 48850
2nd row: 48850
3rd row: 48850
4th row: 48850
5th row: 48850

Common Values

Value | Count | Frequency (%)
48850 | 4523 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

현황도형 생성일시 (status shape creation date/time)
DateTime

Distinct: 11
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 35.5 KiB
Minimum: 2013-05-30 00:00:00
Maximum: 2022-03-08 00:00:00

[Figure: Histogram with fixed size bins (bins=11)]

Interactions

[Figures: 4 interaction plots]

Correlations

[Figure: correlation heatmap]

 | 도형 대분류코드 | 도형 속성코드 | 도형 조서관리 코드 | 결정고시관리코드 | 라벨명 | 면적(도형) | 길이(도형) | 현황도형 생성일시
도형 대분류코드 | 1.000 | 1.000 | 1.000 | 0.483 | 1.000 | 0.069 | 0.149 | 0.470
도형 속성코드 | 1.000 | 1.000 | 1.000 | 0.483 | 1.000 | 0.069 | 0.149 | 0.470
도형 조서관리 코드 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.148 | 0.151 | 0.987
결정고시관리코드 | 0.483 | 0.483 | 1.000 | 1.000 | 0.483 | 0.031 | 0.000 | 0.950
라벨명 | 1.000 | 1.000 | 1.000 | 0.483 | 1.000 | 0.069 | 0.149 | 0.470
면적(도형) | 0.069 | 0.069 | 0.148 | 0.031 | 0.069 | 1.000 | 0.890 | 0.048
길이(도형) | 0.149 | 0.149 | 0.151 | 0.000 | 0.149 | 0.890 | 1.000 | 0.000
현황도형 생성일시 | 0.470 | 0.470 | 0.987 | 0.950 | 0.470 | 0.048 | 0.000 | 1.000

[Figure: correlation heatmap]

 | 라벨명 | 도형 속성코드 | 도형 조서관리 코드 | 도형 대분류코드 | 결정고시관리코드
라벨명 | 1.000 | 1.000 | 0.998 | 1.000 | 0.312
도형 속성코드 | 1.000 | 1.000 | 0.998 | 1.000 | 0.312
도형 조서관리 코드 | 0.998 | 0.998 | 1.000 | 0.998 | 0.999
도형 대분류코드 | 1.000 | 1.000 | 0.998 | 1.000 | 0.312
결정고시관리코드 | 0.312 | 0.312 | 0.999 | 0.312 | 1.000

[Figure: correlation heatmap]

 | 면적(도형) | 길이(도형) | 도형 대분류코드 | 도형 속성코드 | 도형 조서관리 코드 | 결정고시관리코드 | 라벨명
면적(도형) | 1.000 | 0.974 | 0.052 | 0.052 | 0.073 | 0.017 | 0.052
길이(도형) | 0.974 | 1.000 | 0.095 | 0.095 | 0.062 | 0.000 | 0.095
도형 대분류코드 | 0.052 | 0.095 | 1.000 | 1.000 | 0.998 | 0.312 | 1.000
도형 속성코드 | 0.052 | 0.095 | 1.000 | 1.000 | 0.998 | 0.312 | 1.000
도형 조서관리 코드 | 0.073 | 0.062 | 0.998 | 0.998 | 1.000 | 0.999 | 0.998
결정고시관리코드 | 0.017 | 0.000 | 0.312 | 0.312 | 0.999 | 1.000 | 0.312
라벨명 | 0.052 | 0.095 | 1.000 | 1.000 | 0.998 | 0.312 | 1.000
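
The 1.000 entries among 도형 대분류코드, 도형 속성코드 and 라벨명 are what a one-to-one mapping between categories produces. The sketch below computes an uncorrected Cramér's V from a contingency table built on the assumption that each code corresponds to exactly one label, as the matching counts (3515 / 653 / 355) and the sample rows suggest. The report does not say whether the profiler uses this statistic or a bias-corrected variant, so treat this as an illustration rather than a reproduction of its method.

```python
import math

def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))

# Assumed cross-tabulation: rows UQB300 / UQB100 / UQB200,
# columns 보전관리지역 / 계획관리지역 / 생산관리지역.
table = [
    [3515, 0, 0],
    [0, 653, 0],
    [0, 0, 355],
]
print(round(cramers_v(table), 3))  # prints 1.0
```

Because such columns carry the same information, a downstream model or summary would usually keep only one of them.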

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

First rows

Index | 현황도형 관리번호 | 도형 대분류코드 | 도형 속성코드 | 도형 조서관리 코드 | 결정고시관리코드 | 라벨명 | 면적(도형) | 길이(도형) | 시군구코드 | 현황도형 생성일시
0 | 48850UQ112PS201202094128 | UQB300 | UQB300 | 48850ARZ201202091011 | 48850NTC201202091000 | 보전관리지역 | 25795.61 | 929.56 | 48850 | 2013-05-30
1 | 48850UQ112PS201202094124 | UQB300 | UQB300 | 48850ARZ201202091011 | 48850NTC201202091000 | 보전관리지역 | 660.95 | 124.33 | 48850 | 2013-05-30
2 | 48850UQ112PS201202094116 | UQB300 | UQB300 | 48850ARZ201202091011 | 48850NTC201202091000 | 보전관리지역 | 65.13 | 40.58 | 48850 | 2013-05-30
3 | 48850UQ112PS201202094117 | UQB300 | UQB300 | 48850ARZ201202091011 | 48850NTC201202091000 | 보전관리지역 | 88.92 | 51.39 | 48850 | 2013-05-30
4 | 48850UQ112PS201202094121 | UQB300 | UQB300 | 48850ARZ201202091011 | 48850NTC201202091000 | 보전관리지역 | 441.36 | 115.91 | 48850 | 2013-05-30
5 | 48850UQ112PS201202094119 | UQB300 | UQB300 | 48850ARZ201202091011 | 48850NTC201202091000 | 보전관리지역 | 730.49 | 122.72 | 48850 | 2013-05-30
6 | 48850UQ112PS201202094107 | UQB300 | UQB300 | 48850ARZ201202091011 | 48850NTC201202091000 | 보전관리지역 | 1326.09 | 249.13 | 48850 | 2013-05-30
7 | 48850UQ112PS201202094113 | UQB300 | UQB300 | 48850ARZ201202091011 | 48850NTC201202091000 | 보전관리지역 | 553.94 | 185.79 | 48850 | 2013-05-30
8 | 48850UQ112PS201202094112 | UQB300 | UQB300 | 48850ARZ201202091011 | 48850NTC201202091000 | 보전관리지역 | 210.44 | 77.68 | 48850 | 2013-05-30
9 | 48850UQ112PS201202094106 | UQB300 | UQB300 | 48850ARZ201202091011 | 48850NTC201202091000 | 보전관리지역 | 147.72 | 51.9 | 48850 | 2013-05-30

Last rows

Index | 현황도형 관리번호 | 도형 대분류코드 | 도형 속성코드 | 도형 조서관리 코드 | 결정고시관리코드 | 라벨명 | 면적(도형) | 길이(도형) | 시군구코드 | 현황도형 생성일시
4513 | 48850UQ112PS202202114337 | UQB100 | UQB100 | 48850ARZ202201063186 | 48850NTC202201060806 | 계획관리지역 | 38.28 | 24.88 | 48850 | 2022-02-11
4514 | 48850UQ112PS202202114339 | UQB100 | UQB100 | 48850ARZ202201063186 | 48850NTC202201060806 | 계획관리지역 | 80.06 | 37.07 | 48850 | 2022-02-11
4515 | 48850UQ112PS202202114335 | UQB100 | UQB100 | 48850ARZ202201063186 | 48850NTC202201060806 | 계획관리지역 | 73.98 | 35.17 | 48850 | 2022-02-11
4516 | 48850UQ112PS202202114340 | UQB100 | UQB100 | 48850ARZ202201063186 | 48850NTC202201060806 | 계획관리지역 | 222.82 | 60.08 | 48850 | 2022-02-11
4517 | 48850UQ112PS202202114334 | UQB100 | UQB100 | 48850ARZ202201063186 | 48850NTC202201060806 | 계획관리지역 | 113.04 | 45.64 | 48850 | 2022-02-11
4518 | 48850UQ112PS202202114344 | UQB100 | UQB100 | 48850ARZ202201063186 | 48850NTC202201060806 | 계획관리지역 | 58.06 | 29.18 | 48850 | 2022-02-11
4519 | 48850UQ112PS202202114351 | UQB100 | UQB100 | 48850ARZ202201063186 | 48850NTC202201060806 | 계획관리지역 | 226.01 | 79.9 | 48850 | 2022-02-11
4520 | 48850UQ112PS202202114355 | UQB100 | UQB100 | 48850ARZ202201063186 | 48850NTC202201060806 | 계획관리지역 | 89.74 | 37.81 | 48850 | 2022-02-11
4521 | 48850UQ112PS202202114352 | UQB100 | UQB100 | 48850ARZ202201063186 | 48850NTC202201060806 | 계획관리지역 | 602.49 | 98.74 | 48850 | 2022-02-11
4522 | 48850UQ112PS202202114356 | UQB100 | UQB100 | 48850ARZ202201063186 | 48850NTC202201060806 | 계획관리지역 | 393.39 | 85.04 | 48850 | 2022-02-11