Overview

Dataset statistics

Number of variables: 10
Number of observations: 1948
Missing cells: 1948
Missing cells (%): 10.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 158.0 KiB
Average record size in memory: 83.1 B
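
The two derived figures in this table follow from the raw counts. The snippet below is a minimal sketch that recomputes them, assuming the reported memory size uses 1 KiB = 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Counts copied from the Dataset statistics table.
n_variables = 10
n_observations = 1948
missing_cells = 1948
total_size_kib = 158.0  # rounded in the report, so the result is approximate

# Share of missing cells over all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

All 1948 missing cells come from a single column (도형 소분류코드), which is why exactly one tenth of the cells are missing.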

Variable types

Text: 2
Categorical: 5
Unsupported: 1
Numeric: 1
DateTime: 1

Dataset

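Description: Status of distribution and supply facilities in Hadong-gun as registered in the urban planning information system. Fields: 현황도형 관리번호 (status-shape management number), 도형 대분류코드 (shape major-category code), 도형 중분류코드 (shape middle-category code), 도형 소분류코드 (shape minor-category code), 도형 속성코드 (shape attribute code), (허가)계획관리번호 ((permit) plan management number), 라벨명 (label name), 면적(도형) (shape area), 시군구코드 (city/county/district code), 현황도형 생성일시 (status-shape creation date and time).
Author: Hadong-gun, Gyeongsangnam-do (경상남도 하동군)
URL: https://www.data.go.kr/data/15123809/fileData.do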

Alerts

도형 대분류코드 has constant value "UQQA00" [Constant]
시군구코드 has constant value "48850" [Constant]
도형 속성코드 is highly overall correlated with 도형 중분류코드 and 1 other field [High correlation]
라벨명 is highly overall correlated with 도형 중분류코드 and 1 other field [High correlation]
도형 중분류코드 is highly overall correlated with 도형 속성코드 and 1 other field [High correlation]
도형 중분류코드 is highly imbalanced (69.4%) [Imbalance]
도형 속성코드 is highly imbalanced (66.2%) [Imbalance]
라벨명 is highly imbalanced (66.2%) [Imbalance]
도형 소분류코드 has 1948 (100.0%) missing values [Missing]
면적(도형) is highly skewed (γ1 = 22.18563398) [Skewed]
현황도형 관리번호 has unique values [Unique]
도형 소분류코드 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
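
The imbalance percentages in these alerts are consistent with one minus the normalized Shannon entropy of the category counts. The sketch below reproduces both reported values from the counts in the Common Values tables of this report. The function name `imbalance_score` is illustrative, and the formula is an assumption about how the score is defined, checked only against these two numbers.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the counts."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # With a single category the ratio is undefined; treated as 0 here.
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)


middle_code_counts = [1593, 320, 25, 5, 4, 1]  # 도형 중분류코드
attribute_code_counts = [1594, 320, 25, 5, 4]  # 도형 속성코드 and 라벨명

middle_score = imbalance_score(middle_code_counts)
attribute_score = imbalance_score(attribute_code_counts)
print(f"{middle_score:.1%}")
print(f"{attribute_score:.1%}")
```

A score of 0 would mean perfectly balanced categories and a score near 1 means one category dominates; here roughly 82% of the rows fall into a single category.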

Reproduction

Analysis started: 2023-12-12 15:03:52.513262
Analysis finished: 2023-12-12 15:03:54.256013
Duration: 1.74 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

현황도형 관리번호
Text

UNIQUE

Distinct: 1948
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 15.3 KiB

Length

Max length: 20
Median length: 20
Mean length: 20
Min length: 20

Characters and Unicode

Total characters: 38960
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1948
Unique (%): 100.0%

Sample

1st row: 48850PPR200712181703
2nd row: 48850PPR200711201704
3rd row: 48850PPR200707161696
4th row: 48850PPR200712031705
5th row: 48850PPR200708101699
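
Every sample value is 20 characters long, starts with 48850 (the constant 시군구코드) and continues with the letters PPR and 12 digits. The sketch below splits a value along those lines. Reading the first 8 of those digits as a YYYYMMDD date and the last 4 as a serial number is an assumption based only on how the sample rows look, and `ManagementNumber` and `parse_management_number` are hypothetical names.

```python
from datetime import date, datetime
from typing import NamedTuple


class ManagementNumber(NamedTuple):
    sigungu_code: str    # first 5 characters; 48850 in every sample row
    prefix: str          # next 3 characters; "PPR" in every sample row
    embedded_date: date  # ASSUMPTION: next 8 digits form a YYYYMMDD date
    serial: str          # ASSUMPTION: last 4 digits are a serial number


def parse_management_number(value: str) -> ManagementNumber:
    """Split a 20-character management number into its assumed parts."""
    if len(value) != 20:
        raise ValueError(f"expected 20 characters, got {len(value)}")
    embedded = datetime.strptime(value[8:16], "%Y%m%d").date()
    return ManagementNumber(value[:5], value[5:8], embedded, value[16:])


parsed = parse_management_number("48850PPR200712181703")
print(parsed)
```

If the date assumption is wrong for some rows, `strptime` raises `ValueError`, which makes the parser a cheap way to test that assumption against the full column.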

Value | Count | Frequency (%)
48850ppr200712181703 | 1 | 0.1%
48850ppr201008100447 | 1 | 0.1%
48850ppr201111150671 | 1 | 0.1%
48850ppr201107250474 | 1 | 0.1%
48850ppr201108170470 | 1 | 0.1%
48850ppr201204100672 | 1 | 0.1%
48850ppr201111300669 | 1 | 0.1%
48850ppr201104280468 | 1 | 0.1%
48850ppr201007140466 | 1 | 0.1%
48850ppr201104280467 | 1 | 0.1%
Other values (1938) | 1938 | 99.5%

Most occurring characters

Value | Count | Frequency (%)
0 | 8986 | 23.1%
1 | 4982 | 12.8%
8 | 4846 | 12.4%
2 | 4137 | 10.6%
P | 3896 | 10.0%
5 | 3003 | 7.7%
4 | 2885 | 7.4%
R | 1948 | 5.0%
7 | 1298 | 3.3%
6 | 1074 | 2.8%
Other values (2) | 1905 | 4.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 33116 | 85.0%
Uppercase Letter | 5844 | 15.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 8986 | 27.1%
1 | 4982 | 15.0%
8 | 4846 | 14.6%
2 | 4137 | 12.5%
5 | 3003 | 9.1%
4 | 2885 | 8.7%
7 | 1298 | 3.9%
6 | 1074 | 3.2%
3 | 1058 | 3.2%
9 | 847 | 2.6%

Uppercase Letter
Value | Count | Frequency (%)
P | 3896 | 66.7%
R | 1948 | 33.3%

Most occurring scripts

Value | Count | Frequency (%)
Common | 33116 | 85.0%
Latin | 5844 | 15.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 8986 | 27.1%
1 | 4982 | 15.0%
8 | 4846 | 14.6%
2 | 4137 | 12.5%
5 | 3003 | 9.1%
4 | 2885 | 8.7%
7 | 1298 | 3.9%
6 | 1074 | 3.2%
3 | 1058 | 3.2%
9 | 847 | 2.6%

Latin
Value | Count | Frequency (%)
P | 3896 | 66.7%
R | 1948 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 38960 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 8986 | 23.1%
1 | 4982 | 12.8%
8 | 4846 | 12.4%
2 | 4137 | 10.6%
P | 3896 | 10.0%
5 | 3003 | 7.7%
4 | 2885 | 7.4%
R | 1948 | 5.0%
7 | 1298 | 3.3%
6 | 1074 | 2.8%
Other values (2) | 1905 | 4.9%

도형 대분류코드
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.3 KiB
UQQA00: 1948

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: UQQA00
2nd row: UQQA00
3rd row: UQQA00
4th row: UQQA00
5th row: UQQA00

Common Values

Value | Count | Frequency (%)
UQQA00 | 1948 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; the plotted counts repeat the Common Values table]

도형 중분류코드
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 6
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 15.3 KiB
UQQA20: 1593
UQQA40: 320
UQQA50: 25
UQQA10: 5
UQQA30: 4

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: UQQA40
2nd row: UQQA40
3rd row: UQQA20
4th row: UQQA40
5th row: UQQA40

Common Values

Value | Count | Frequency (%)
UQQA20 | 1593 | 81.8%
UQQA40 | 320 | 16.4%
UQQA50 | 25 | 1.3%
UQQA10 | 5 | 0.3%
UQQA30 | 4 | 0.2%
UQQA00 | 1 | 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; the plotted counts repeat the Common Values table]

도형 소분류코드
Unsupported

MISSING, REJECTED, UNSUPPORTED

Missing: 1948
Missing (%): 100.0%
Memory size: 17.2 KiB
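
A column that is entirely missing, like this one, or constant, like 도형 대분류코드 and 시군구코드, carries no information for analysis and is a common candidate for removal during cleaning. The sketch below flags both kinds of column using only the standard library. The helper name `flag_columns` is hypothetical, and the three rows are abridged from the Sample section of this report, with `None` standing in for `<NA>`.

```python
def flag_columns(rows):
    """Return (all_missing, constant) column-name lists for a list of dict rows."""
    columns = list(rows[0].keys()) if rows else []
    all_missing, constant = [], []
    for col in columns:
        # Ignore missing cells when looking at the observed values.
        values = [row[col] for row in rows if row[col] is not None]
        if not values:
            all_missing.append(col)
        elif len(set(values)) == 1:
            constant.append(col)
    return all_missing, constant


rows = [
    {"도형 대분류코드": "UQQA00", "도형 중분류코드": "UQQA40", "도형 소분류코드": None, "시군구코드": "48850"},
    {"도형 대분류코드": "UQQA00", "도형 중분류코드": "UQQA40", "도형 소분류코드": None, "시군구코드": "48850"},
    {"도형 대분류코드": "UQQA00", "도형 중분류코드": "UQQA20", "도형 소분류코드": None, "시군구코드": "48850"},
]

all_missing, constant = flag_columns(rows)
print("All missing:", all_missing)
print("Constant:", constant)
```

Whether to drop such columns depends on the downstream use; a constant 시군구코드 is still useful when this file is merged with data from other districts.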

도형 속성코드
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 5
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 15.3 KiB
UQQA20: 1594
UQQA40: 320
UQQA50: 25
UQQA10: 5
UQQA30: 4

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: UQQA40
2nd row: UQQA40
3rd row: UQQA20
4th row: UQQA40
5th row: UQQA40

Common Values

Value | Count | Frequency (%)
UQQA20 | 1594 | 81.8%
UQQA40 | 320 | 16.4%
UQQA50 | 25 | 1.3%
UQQA10 | 5 | 0.3%
UQQA30 | 4 | 0.2%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; the plotted counts repeat the Common Values table]

(허가)계획관리번호
Text

Distinct: 1676
Distinct (%): 86.0%
Missing: 0
Missing (%): 0.0%
Memory size: 15.3 KiB

Length

Max length: 20
Median length: 20
Mean length: 20
Min length: 20

Characters and Unicode

Total characters: 38960
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1494
Unique (%): 76.7%

Sample

1st row: 48850PPR200712180295
2nd row: 48850PPR200711200315
3rd row: 48850PPR200707160182
4th row: 48850PPR200712030312
5th row: 48850PPR200708100419

Value | Count | Frequency (%)
48850ppr201206250970 | 10 | 0.5%
48850ppr201304110729 | 7 | 0.4%
48850ppr201303281235 | 6 | 0.3%
48850ppr201110080834 | 5 | 0.3%
48850ppr201112190566 | 5 | 0.3%
48850ppr201012100473 | 5 | 0.3%
48850ppr201005280229 | 5 | 0.3%
48850ppr201210041282 | 5 | 0.3%
48850ppr201210191080 | 5 | 0.3%
48850ppr201302261398 | 5 | 0.3%
Other values (1666) | 1890 | 97.0%

Most occurring characters

Value | Count | Frequency (%)
0 | 9613 | 24.7%
8 | 4858 | 12.5%
1 | 4377 | 11.2%
2 | 4214 | 10.8%
P | 3896 | 10.0%
5 | 2983 | 7.7%
4 | 2841 | 7.3%
R | 1948 | 5.0%
7 | 1241 | 3.2%
3 | 1116 | 2.9%
Other values (2) | 1873 | 4.8%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 33116 | 85.0%
Uppercase Letter | 5844 | 15.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 9613 | 29.0%
8 | 4858 | 14.7%
1 | 4377 | 13.2%
2 | 4214 | 12.7%
5 | 2983 | 9.0%
4 | 2841 | 8.6%
7 | 1241 | 3.7%
3 | 1116 | 3.4%
6 | 1070 | 3.2%
9 | 803 | 2.4%

Uppercase Letter
Value | Count | Frequency (%)
P | 3896 | 66.7%
R | 1948 | 33.3%

Most occurring scripts

Value | Count | Frequency (%)
Common | 33116 | 85.0%
Latin | 5844 | 15.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 9613 | 29.0%
8 | 4858 | 14.7%
1 | 4377 | 13.2%
2 | 4214 | 12.7%
5 | 2983 | 9.0%
4 | 2841 | 8.6%
7 | 1241 | 3.7%
3 | 1116 | 3.4%
6 | 1070 | 3.2%
9 | 803 | 2.4%

Latin
Value | Count | Frequency (%)
P | 3896 | 66.7%
R | 1948 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 38960 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 9613 | 24.7%
8 | 4858 | 12.5%
1 | 4377 | 11.2%
2 | 4214 | 10.8%
P | 3896 | 10.0%
5 | 2983 | 7.7%
4 | 2841 | 7.3%
R | 1948 | 5.0%
7 | 1241 | 3.2%
3 | 1116 | 2.9%
Other values (2) | 1873 | 4.8%

라벨명
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 5
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 15.3 KiB
토지형질변경 (change of land form and quality): 1594
토지분할 (land subdivision): 320
물건적치 (stacking of goods): 25
공작물설치 (installation of structures): 5
토석채취 (extraction of earth and stone): 4

Length

Max length: 6
Median length: 6
Mean length: 5.639117
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 토지분할
2nd row: 토지분할
3rd row: 토지형질변경
4th row: 토지분할
5th row: 토지분할

Common Values

Value | Count | Frequency (%)
토지형질변경 | 1594 | 81.8%
토지분할 | 320 | 16.4%
물건적치 | 25 | 1.3%
공작물설치 | 5 | 0.3%
토석채취 | 4 | 0.2%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; the plotted counts repeat the Common Values table]

면적(도형)
Real number (ℝ)

SKEWED

Distinct: 1940
Distinct (%): 99.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5370.7096
Minimum: 12.06
Maximum: 1358825.7
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 17.2 KiB

Quantile statistics

Minimum: 12.06
5-th percentile: 111.957
Q1: 400.11
median: 728.8
Q3: 1695.615
95-th percentile: 18959.396
Maximum: 1358825.7
Range: 1358813.6
Interquartile range (IQR): 1295.505

Descriptive statistics

Standard deviation: 46503.525
Coefficient of variation (CV): 8.6587301
Kurtosis: 550.02077
Mean: 5370.7096
Median Absolute Deviation (MAD): 443.83
Skewness: 22.185634
Sum: 10462142
Variance: 2.1625778 × 10^9
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
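
Several of the statistics for 면적(도형) are derived from others in the same tables. The sketch below recomputes the range, IQR, coefficient of variation, and variance from the reported values as a consistency check. Inputs are the rounded figures from this report, so the results agree only up to rounding.

```python
# Values copied from the quantile and descriptive statistics of 면적(도형).
mean = 5370.7096
std = 46503.525
q1, q3 = 400.11, 1695.615
minimum, maximum = 12.06, 1358825.7

value_range = maximum - minimum  # reported as 1358813.6
iqr = q3 - q1                    # reported as 1295.505
cv = std / mean                  # reported as 8.6587301
variance = std ** 2              # reported as 2.1625778 x 10^9

print(f"Range: {value_range:.1f}")
print(f"IQR: {iqr:.3f}")
print(f"CV: {cv:.4f}")
print(f"Variance: {variance:.4e}")
```

The mean (5370.7) is far above the median (728.8) and the standard deviation is more than eight times the mean, which matches the strong right skew flagged in the alerts.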

Common values

Value | Count | Frequency (%)
603.73 | 2 | 0.1%
660.95 | 2 | 0.1%
652.38 | 2 | 0.1%
646.34 | 2 | 0.1%
994.46 | 2 | 0.1%
656.75 | 2 | 0.1%
2059.63 | 2 | 0.1%
803.75 | 2 | 0.1%
2782.5 | 1 | 0.1%
482.42 | 1 | 0.1%
Other values (1930) | 1930 | 99.1%

Minimum 10 values

Value | Count | Frequency (%)
12.06 | 1 | 0.1%
12.44 | 1 | 0.1%
13.37 | 1 | 0.1%
14.22 | 1 | 0.1%
14.25 | 1 | 0.1%
15.1 | 1 | 0.1%
15.56 | 1 | 0.1%
17.59 | 1 | 0.1%
19.99 | 1 | 0.1%
21.67 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
1358825.67 | 1 | 0.1%
1049907.56 | 1 | 0.1%
749147.03 | 1 | 0.1%
610443.38 | 1 | 0.1%
432037.44 | 1 | 0.1%
137917.8 | 1 | 0.1%
118140.12 | 1 | 0.1%
101130.72 | 1 | 0.1%
94740.34 | 1 | 0.1%
89059.86 | 1 | 0.1%

시군구코드
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.3 KiB
48850: 1948

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 48850
2nd row: 48850
3rd row: 48850
4th row: 48850
5th row: 48850

Common Values

Value | Count | Frequency (%)
48850 | 1948 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; the plotted counts repeat the Common Values table]

현황도형 생성일시
DateTime

Distinct: 3
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 15.3 KiB
Minimum: 2012-05-13 00:00:00
Maximum: 2013-05-30 00:00:00

[Figure: Histogram with fixed size bins (bins=3)]

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

Variable | 도형 중분류코드 | 도형 속성코드 | 라벨명 | 면적(도형) | 현황도형 생성일시
도형 중분류코드 | 1.000 | 1.000 | 1.000 | 0.000 | 0.977
도형 속성코드 | 1.000 | 1.000 | 1.000 | 0.000 | 0.473
라벨명 | 1.000 | 1.000 | 1.000 | 0.000 | 0.473
면적(도형) | 0.000 | 0.000 | 0.000 | 1.000 | 0.000
현황도형 생성일시 | 0.977 | 0.473 | 0.473 | 0.000 | 1.000

[Figure: correlation heatmap]

Variable | 도형 속성코드 | 라벨명 | 도형 중분류코드
도형 속성코드 | 1.000 | 1.000 | 1.000
라벨명 | 1.000 | 1.000 | 1.000
도형 중분류코드 | 1.000 | 1.000 | 1.000

[Figure: correlation heatmap]

Variable | 면적(도형) | 도형 중분류코드 | 도형 속성코드 | 라벨명
면적(도형) | 1.000 | 0.000 | 0.000 | 0.000
도형 중분류코드 | 0.000 | 1.000 | 1.000 | 1.000
도형 속성코드 | 0.000 | 1.000 | 1.000 | 1.000
라벨명 | 0.000 | 1.000 | 1.000 | 1.000
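
A coefficient of 1.000 between two categorical columns is what a one-to-one mapping between their values produces, as with 도형 속성코드 and 라벨명 in the sample rows. The sketch below illustrates this with a plain, uncorrected Cramér's V written with the standard library. It is not taken from the report: the source does not state which association measure or correction produced these matrices, and `cramers_v` is an illustrative name.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Plain (uncorrected) Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))
    k = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or k == 0:
        # Undefined when either variable is constant; reported as 0 here.
        return 0.0
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            chi2 += (observed.get((x, y), 0) - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))


# First five sample rows of 도형 속성코드 and 라벨명.
codes = ["UQQA40", "UQQA40", "UQQA20", "UQQA40", "UQQA40"]
labels = ["토지분할", "토지분할", "토지형질변경", "토지분할", "토지분할"]

v = cramers_v(codes, labels)
print(f"Cramér's V: {v:.3f}")
```

Because each code maps to exactly one label, one of the two columns is redundant for modelling purposes.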

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows

Index | 현황도형 관리번호 | 도형 대분류코드 | 도형 중분류코드 | 도형 소분류코드 | 도형 속성코드 | (허가)계획관리번호 | 라벨명 | 면적(도형) | 시군구코드 | 현황도형 생성일시
0 | 48850PPR200712181703 | UQQA00 | UQQA40 | <NA> | UQQA40 | 48850PPR200712180295 | 토지분할 | 814.0 | 48850 | 2012-05-13
1 | 48850PPR200711201704 | UQQA00 | UQQA40 | <NA> | UQQA40 | 48850PPR200711200315 | 토지분할 | 1966.46 | 48850 | 2012-05-13
2 | 48850PPR200707161696 | UQQA00 | UQQA20 | <NA> | UQQA20 | 48850PPR200707160182 | 토지형질변경 | 4374.48 | 48850 | 2012-05-13
3 | 48850PPR200712031705 | UQQA00 | UQQA40 | <NA> | UQQA40 | 48850PPR200712030312 | 토지분할 | 183.81 | 48850 | 2012-05-13
4 | 48850PPR200708101699 | UQQA00 | UQQA40 | <NA> | UQQA40 | 48850PPR200708100419 | 토지분할 | 1069.39 | 48850 | 2012-05-13
5 | 48850PPR200710231702 | UQQA00 | UQQA40 | <NA> | UQQA40 | 48850PPR200710230520 | 토지분할 | 37130.65 | 48850 | 2012-05-13
6 | 48850PPR200708271701 | UQQA00 | UQQA40 | <NA> | UQQA40 | 48850PPR200708270409 | 토지분할 | 15895.82 | 48850 | 2012-05-13
7 | 48850PPR200712281698 | UQQA00 | UQQA40 | <NA> | UQQA40 | 48850PPR200712280276 | 토지분할 | 14183.77 | 48850 | 2012-05-13
8 | 48850PPR200707181706 | UQQA00 | UQQA20 | <NA> | UQQA20 | 48850PPR200707180179 | 토지형질변경 | 784.94 | 48850 | 2012-05-13
9 | 48850PPR200702051707 | UQQA00 | UQQA40 | <NA> | UQQA40 | 48850PPR200702050515 | 토지분할 | 537.29 | 48850 | 2012-05-13

Last rows

Index | 현황도형 관리번호 | 도형 대분류코드 | 도형 중분류코드 | 도형 소분류코드 | 도형 속성코드 | (허가)계획관리번호 | 라벨명 | 면적(도형) | 시군구코드 | 현황도형 생성일시
1938 | 48850PPR200704241931 | UQQA00 | UQQA20 | <NA> | UQQA20 | 48850PPR200704240087 | 토지형질변경 | 247.99 | 48850 | 2012-05-13
1939 | 48850PPR200611151937 | UQQA00 | UQQA20 | <NA> | UQQA20 | 48850PPR200611150546 | 토지형질변경 | 658.44 | 48850 | 2012-05-13
1940 | 48850PPR200707301938 | UQQA00 | UQQA40 | <NA> | UQQA40 | 48850PPR200707300204 | 토지분할 | 2390.3 | 48850 | 2012-05-13
1941 | 48850PPR200709031939 | UQQA00 | UQQA40 | <NA> | UQQA40 | 48850PPR200709030402 | 토지분할 | 2456.36 | 48850 | 2012-05-13
1942 | 48850PPR200511101941 | UQQA00 | UQQA20 | <NA> | UQQA20 | 48850PPR200511100486 | 토지형질변경 | 780.14 | 48850 | 2012-05-13
1943 | 48850PPR200507261947 | UQQA00 | UQQA20 | <NA> | UQQA20 | 48850PPR200507260839 | 토지형질변경 | 664.16 | 48850 | 2012-05-13
1944 | 48850PPR200703121940 | UQQA00 | UQQA40 | <NA> | UQQA40 | 48850PPR200703120395 | 토지분할 | 788.39 | 48850 | 2012-05-13
1945 | 48850PPR200708081946 | UQQA00 | UQQA20 | <NA> | UQQA20 | 48850PPR200708080152 | 토지형질변경 | 665.84 | 48850 | 2012-05-13
1946 | 48850PPR200711131944 | UQQA00 | UQQA40 | <NA> | UQQA40 | 48850PPR200711130321 | 토지분할 | 9715.92 | 48850 | 2012-05-13
1947 | 48850PPR200708101873 | UQQA00 | UQQA40 | <NA> | UQQA40 | 48850PPR200708100422 | 토지분할 | 3218.98 | 48850 | 2012-05-13