Overview

Dataset statistics

Number of variables: 10
Number of observations: 132
Missing cells: 65
Missing cells (%): 4.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 10.8 KiB
Average record size in memory: 84.0 B
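
The headline percentages can be cross-checked from the raw counts. The following is a minimal sketch in Python; the variable names are illustrative, and the per-column missing counts are the five figures reported by the Missing alerts of this report.

```python
# Figures copied from the Dataset statistics block of this report.
n_variables = 10
n_observations = 132
missing_cells = 65
total_size_kib = 10.8  # already rounded in the report

# Missing counts per column, as listed by the "Missing" alerts.
missing_per_column = [2, 2, 56, 3, 2]

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Derived from the rounded total, so it can differ slightly
# from the 84.0 B printed in the report.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {sum(missing_per_column)} ({missing_pct:.1f}%)")
print(f"average record size: {avg_record_bytes:.1f} B")
```

The per-column counts add up to the reported 65 missing cells, and 65 out of 1320 cells gives the reported 4.9%.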

Variable types

Categorical: 5
Text: 2
Numeric: 3

Dataset

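Description: Forest land conversion permit status for Imsil-gun, Jeollabuk-do. The data includes the fields 구분 (category), 면 (township), 리 (village), 토지구분 (land type), 본번 (main lot number), 부번 (sub lot number), 지적 (parcel area), 부지 (site area), 전용목적 (conversion purpose), and 비고 (remarks).
URL: https://www.data.go.kr/data/15114474/fileData.do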

Alerts

전용목적 is highly overall correlated with 지적 and 3 other fields (High correlation)
면 is highly overall correlated with 토지구분 and 1 other field (High correlation)
토지구분 is highly overall correlated with 부번 and 5 other fields (High correlation)
비고 is highly overall correlated with 부번 and 5 other fields (High correlation)
구분 is highly overall correlated with 부번 and 3 other fields (High correlation)
부번 is highly overall correlated with 구분 and 2 other fields (High correlation)
지적 is highly overall correlated with 부지 and 3 other fields (High correlation)
부지 is highly overall correlated with 지적 and 2 other fields (High correlation)
구분 is highly imbalanced (77.6%) (Imbalance)
비고 is highly imbalanced (84.4%) (Imbalance)
리 has 2 (1.5%) missing values (Missing)
본번 has 2 (1.5%) missing values (Missing)
부번 has 56 (42.4%) missing values (Missing)
지적 has 3 (2.3%) missing values (Missing)
부지 has 2 (1.5%) missing values (Missing)
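
The imbalance percentages in the alerts are consistent with one minus the normalized Shannon entropy of the category counts. The sketch below assumes that definition (the report itself does not state the formula) and applies it to the value counts reported for 구분 and 비고. The name `imbalance_score` is illustrative, not a library function.

```python
import math


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum possible entropy).

    0.0 means perfectly balanced categories; values near 1.0 mean
    one category dominates. Assumed definition, see the lead-in.
    """
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(probs) < 2:
        return 0.0  # a single category has no balance to measure
    entropy = -sum(p * math.log(p) for p in probs)
    return 1 - entropy / math.log(len(probs))


# Value counts copied from the Common Values tables of this report.
score_gubun = imbalance_score([121, 8, 1, 1, 1])  # 구분
score_bigo = imbalance_score([129, 3])            # 비고

print(f"구분: {100 * score_gubun:.1f}%")
print(f"비고: {100 * score_bigo:.1f}%")
```

Under this assumption both scores land on the figures in the alerts (about 77.6% and 84.4%).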

Reproduction

Analysis started: 2023-12-12 05:07:04.423487
Analysis finished: 2023-12-12 05:07:06.674508
Duration: 2.25 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
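
A report like this one can be regenerated roughly as follows. This is a sketch under stated assumptions: the CSV path, the output path, the report title, and the `cp949` encoding are all hypothetical (the source does not name the file or its encoding), and `build_report` is an illustrative helper. `ProfileReport` and `to_file` are the ydata-profiling entry points. The duration check uses only the two timestamps printed above.

```python
from datetime import datetime


def analysis_duration_seconds(started: str, finished: str) -> float:
    """Difference between the two timestamps printed by the report."""
    fmt = "%Y-%m-%d %H:%M:%S.%f"
    delta = datetime.strptime(finished, fmt) - datetime.strptime(started, fmt)
    return delta.total_seconds()


def build_report(csv_path: str, html_path: str = "report.html",
                 encoding: str = "cp949"):
    """Profile a CSV file and write the HTML report.

    The paths and the encoding are assumptions; adjust them to the
    actual download. Imports are local so the duration helper above
    works even where pandas / ydata-profiling are not installed.
    """
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Imsil-gun forest land conversion permits")
    report.to_file(html_path)
    return report


duration = analysis_duration_seconds(
    "2023-12-12 05:07:04.423487", "2023-12-12 05:07:06.674508"
)
print(f"{duration:.2f} seconds")
```

Calling `build_report("permits.csv")` with a local copy of the file (hypothetical name) would write `report.html`; timings will differ from run to run.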

Variables

구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB

Length

Max length: 9
Median length: 2
Mean length: 2.0984848
Min length: 2

Unique

Unique: 3
Unique (%): 2.3%
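
The length and uniqueness figures for 구분 follow directly from its value counts. A minimal sketch, assuming string length is counted in Unicode code points (as Python's `len` does) and that "unique" means values occurring exactly once:

```python
from statistics import mean, median

# Value counts for 구분 as reported in this section.
counts = {
    "협의": 121,
    "변경": 8,
    "(치즈역사문화관)": 1,
    "(농업인 주택)": 1,
    "허가": 1,
}

# One length entry per row of the dataset.
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

mean_length = mean(lengths)
unique_count = sum(1 for n in counts.values() if n == 1)
unique_pct = 100 * unique_count / len(lengths)

print(max(lengths), median(lengths), round(mean_length, 7), min(lengths))
print(unique_count, f"{unique_pct:.1f}%")
```

The two parenthesized values (9 and 8 characters) are the only ones longer than 2, which is why the mean sits just above 2.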

Sample

1st row: 협의
2nd row: 협의
3rd row: 협의
4th row: 협의
5th row: 협의

Common Values

Value  Count  Frequency (%)
협의  121  91.7%
변경  8  6.1%
(치즈역사문화관)  1  0.8%
(농업인 주택)  1  0.8%
허가  1  0.8%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Word frequencies (values tokenized on whitespace and punctuation, so 농업인 and 주택 are counted separately):

Value  Count  Frequency (%)
협의  121  91.0%
변경  8  6.0%
치즈역사문화관  1  0.8%
농업인  1  0.8%
주택  1  0.8%
허가  1  0.8%


면
Categorical

HIGH CORRELATION 

Distinct: 13
Distinct (%): 9.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB

Length

Max length: 4
Median length: 2
Mean length: 2.030303
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 임실
2nd row: 임실
3rd row: 관촌
4th row: 관촌
5th row: 관촌

Common Values

Value  Count  Frequency (%)
운암  21  15.9%
덕치  13  9.8%
임실  12  9.1%
관촌  12  9.1%
오수  11  8.3%
신덕  11  8.3%
삼계  11  8.3%
신평  10  7.6%
강진  9  6.8%
성수  8  6.1%
Other values (3)  14  10.6%

Length

[Figure: histogram of lengths of the category]


리
Text

MISSING 

Distinct: 58
Distinct (%): 44.6%
Missing: 2
Missing (%): 1.5%
Memory size: 1.2 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 260
Distinct characters: 64
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 32
Unique (%): 24.6%

Sample

1st row: 오정
2nd row: 오정
3rd row: 슬치
4th row: 슬치
5th row: 상월

Value  Count  Frequency (%)
입석  16  12.3%
오정  7  5.4%
천담  6  4.6%
용두  6  4.6%
장암  5  3.8%
삼길  4  3.1%
성수  4  3.1%
청계  4  3.1%
두월  4  3.1%
용수  4  3.1%
Other values (48)  70  53.8%

Most occurring characters

The character column is not legible in the source; only counts and frequencies remain, in descending order.

Value  Count  Frequency (%)
[illegible]  17  6.5%
[illegible]  17  6.5%
[illegible]  16  6.2%
[illegible]  13  5.0%
[illegible]  13  5.0%
[illegible]  12  4.6%
[illegible]  11  4.2%
[illegible]  11  4.2%
[illegible]  10  3.8%
[illegible]  9  3.5%
Other values (54)  131  50.4%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  260  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  260  100.0%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  260  100.0%

The per-category (Other Letter), per-script (Hangul), and per-block (Hangul) character tables repeat the Most occurring characters table, since all 260 characters fall in a single category, script, and block.

토지구분
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB

Length

Max length: 4
Median length: 4
Mean length: 2.5227273
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: [illegible]
4th row: [illegible]
5th row: [illegible]

Common Values

Value  Count  Frequency (%)
<NA>  67  50.8%
[illegible, 1 character]  65  49.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Word frequencies (values tokenized on whitespace and punctuation):

Value  Count  Frequency (%)
na  67  50.8%
[illegible, 1 character]  65  49.2%

본번
Text

MISSING 

Distinct: 85
Distinct (%): 65.4%
Missing: 2
Missing (%): 1.5%
Memory size: 1.2 KiB

Length

Max length: 5
Median length: 4
Mean length: 2.5923077
Min length: 1

Characters and Unicode

Total characters: 337
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 58
Unique (%): 44.6%

Sample

1st row: 332
2nd row: 332
3rd row: 42
4th row: 42
5th row: 75

Value  Count  Frequency (%)
557  4  3.1%
614  4  3.1%
42  4  3.1%
87  4  3.1%
142  3  2.3%
332  3  2.3%
62  3  2.3%
40  3  2.3%
169  3  2.3%
122  3  2.3%
Other values (75)  96  73.8%

Most occurring characters

Value  Count  Frequency (%)
1  57  16.9%
4  45  13.4%
3  41  12.2%
2  40  11.9%
7  33  9.8%
5  32  9.5%
6  27  8.0%
0  22  6.5%
8  19  5.6%
9  15  4.5%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  331  98.2%
Dash Punctuation  6  1.8%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
1  57  17.2%
4  45  13.6%
3  41  12.4%
2  40  12.1%
7  33  10.0%
5  32  9.7%
6  27  8.2%
0  22  6.6%
8  19  5.7%
9  15  4.5%

Dash Punctuation

Value  Count  Frequency (%)
-  6  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Common  337  100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  337  100.0%

The per-script (Common) and per-block (ASCII) character tables repeat the Most occurring characters table, since all 337 characters fall in a single script and block.

부번
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 24
Distinct (%): 31.6%
Missing: 56
Missing (%): 42.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.6447368
Minimum: 1
Maximum: 120
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 4
Q3: 10.25
95-th percentile: 26.25
Maximum: 120
Range: 119
Interquartile range (IQR): 8.25

Descriptive statistics

Standard deviation: 15.219027
Coefficient of variation (CV): 1.7604962
Kurtosis: 38.875027
Mean: 8.6447368
Median Absolute Deviation (MAD): 3
Skewness: 5.6064882
Sum: 657
Variance: 231.61877
Monotonicity: Not monotonic
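
Several of the 부번 figures are derived from others, and the derivation shows which denominator each percentage uses. A minimal sketch with the reported numbers as inputs; the variable names are illustrative:

```python
# Inputs copied from the 부번 statistics in this report.
n_rows = 132
n_missing = 56
n_distinct = 24
total_sum = 657
std = 15.219027
q1, q3 = 2, 10.25
minimum, maximum = 1, 120

n_valid = n_rows - n_missing              # rows that actually have a value

mean = total_sum / n_valid                # mean over non-missing rows only
cv = std / mean                           # coefficient of variation
variance = std ** 2
iqr = q3 - q1
value_range = maximum - minimum

missing_pct = 100 * n_missing / n_rows    # share of all rows
distinct_pct = 100 * n_distinct / n_valid # share of non-missing rows

print(f"mean={mean:.7f} cv={cv:.7f} variance={variance:.4f}")
print(f"iqr={iqr} range={value_range}")
print(f"missing={missing_pct:.1f}% distinct={distinct_pct:.1f}%")
```

Note that Missing (%) is taken over all 132 rows, while Distinct (%) matches the report only when divided by the 76 non-missing rows.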
[Figure: histogram with fixed size bins (bins=24)]

Common values

Value  Count  Frequency (%)
1  16  12.1%
2  10  7.6%
3  9  6.8%
4  6  4.5%
8  5  3.8%
6  4  3.0%
5  3  2.3%
12  3  2.3%
14  2  1.5%
10  2  1.5%
Other values (14)  16  12.1%
(Missing)  56  42.4%

Minimum 10 values

Value  Count  Frequency (%)
1  16  12.1%
2  10  7.6%
3  9  6.8%
4  6  4.5%
5  3  2.3%
6  4  3.0%
7  1  0.8%
8  5  3.8%
9  1  0.8%
10  2  1.5%

Maximum 10 values

Value  Count  Frequency (%)
120  1  0.8%
44  1  0.8%
31  1  0.8%
30  1  0.8%
25  1  0.8%
20  1  0.8%
19  2  1.5%
17  1  0.8%
16  1  0.8%
15  1  0.8%

지적
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 124
Distinct (%): 96.1%
Missing: 3
Missing (%): 2.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 13946.395
Minimum: 5
Maximum: 421377
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.3 KiB

Quantile statistics

Minimum: 5
5-th percentile: 42.6
Q1: 382
Median: 1587
Q3: 8135
95-th percentile: 55791.2
Maximum: 421377
Range: 421372
Interquartile range (IQR): 7753

Descriptive statistics

Standard deviation: 44578.424
Coefficient of variation (CV): 3.1964119
Kurtosis: 56.797847
Mean: 13946.395
Median Absolute Deviation (MAD): 1508
Skewness: 6.8585558
Sum: 1799085
Variance: 1.9872359 × 10^9
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value  Count  Frequency (%)
33  2  1.5%
60  2  1.5%
136  2  1.5%
628  2  1.5%
380  2  1.5%
8513  1  0.8%
2033  1  0.8%
626  1  0.8%
15183  1  0.8%
102347  1  0.8%
Other values (114)  114  86.4%
(Missing)  3  2.3%

Minimum 10 values

Value  Count  Frequency (%)
5  1  0.8%
11  1  0.8%
20  1  0.8%
27  1  0.8%
29  1  0.8%
33  2  1.5%
57  1  0.8%
60  2  1.5%
73  1  0.8%
79  1  0.8%

Maximum 10 values

Value  Count  Frequency (%)
421377  1  0.8%
189302  1  0.8%
134848  1  0.8%
107867  1  0.8%
102347  1  0.8%
72510  1  0.8%
60192  1  0.8%
49190  1  0.8%
47703  1  0.8%
43329  1  0.8%

부지
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 116
Distinct (%): 89.2%
Missing: 2
Missing (%): 1.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 1735.6408
Minimum: 5
Maximum: 27426
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.3 KiB

Quantile statistics

Minimum: 5
5-th percentile: 18.9
Q1: 99
Median: 506.5
Q3: 1741
95-th percentile: 4992.05
Maximum: 27426
Range: 27421
Interquartile range (IQR): 1642

Descriptive statistics

Standard deviation: 3884.0406
Coefficient of variation (CV): 2.2378136
Kurtosis: 26.352895
Mean: 1735.6408
Median Absolute Deviation (MAD): 445.5
Skewness: 4.8305962
Sum: 225633.3
Variance: 15085771
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value  Count  Frequency (%)
658.0  5  3.8%
659.0  4  3.0%
99.0  4  3.0%
20.0  2  1.5%
42.0  2  1.5%
60.0  2  1.5%
3643.0  2  1.5%
2705.0  1  0.8%
1001.0  1  0.8%
338.0  1  0.8%
Other values (106)  106  80.3%
(Missing)  2  1.5%

Minimum 10 values

Value  Count  Frequency (%)
5.0  1  0.8%
8.0  1  0.8%
10.0  1  0.8%
11.0  1  0.8%
14.5  1  0.8%
15.0  1  0.8%
18.0  1  0.8%
20.0  2  1.5%
23.0  1  0.8%
25.0  1  0.8%

Maximum 10 values

Value  Count  Frequency (%)
27426.0  1  0.8%
23590.0  1  0.8%
21550.0  1  0.8%
9237.0  1  0.8%
8513.0  1  0.8%
4998.0  1  0.8%
4997.0  1  0.8%
4986.0  1  0.8%
4984.0  1  0.8%
4949.0  1  0.8%

전용목적
Categorical

HIGH CORRELATION 

Distinct: 31
Distinct (%): 23.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB

Length

Max length: 14
Median length: 13
Mean length: 7.5984848
Min length: 3

Unique

Unique: 9
Unique (%): 6.8%

Sample

1st row: 개간(농지조성)
2nd row: 개간(농지조성)
3rd row: 개간(농지조성)
4th row: 개간(농지조성)
5th row: 개간(농지조성)

Common Values

Value  Count  Frequency (%)
개간(농지조성)  45  34.1%
단독주택 신축  7  5.3%
하천정비사업  7  5.3%
도로개설(진입로)  7  5.3%
배수지 개선사업  6  4.5%
제2종근린생활시설  5  3.8%
주차장  5  3.8%
단독주택(명의변경)  4  3.0%
도로(소규모 구조개선)  4  3.0%
도로개설  4  3.0%
Other values (21)  38  28.8%

Length

[Figure: histogram of lengths of the category]

Word frequencies (values tokenized on whitespace and punctuation, so multi-word purposes are split and trailing parentheses are dropped):

Value  Count  Frequency (%)
개간(농지조성  45  27.8%
배수지  10  6.2%
단독주택  8  4.9%
신축  7  4.3%
하천정비사업  7  4.3%
도로개설(진입로  7  4.3%
개선사업  6  3.7%
제2종근린생활시설  5  3.1%
주차장  5  3.1%
설치  5  3.1%
Other values (27)  57  35.2%

비고
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value  Count  Frequency (%)
<NA>  129  97.7%
명의변경  3  2.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Word frequencies (values tokenized on whitespace and punctuation):

Value  Count  Frequency (%)
na  129  97.7%
명의변경  3  2.3%

Interactions

[Figures: nine pairwise interaction plots; the images are not preserved in this text]

Correlations

[Figure: correlation heatmap, 8 variables]

Variable  구분  면  리  본번  부번  지적  부지  전용목적
구분  1.000  0.573  0.981  0.968  0.748  0.000  0.000  0.938
면  0.573  1.000  1.000  0.958  0.000  0.105  0.448  0.881
리  0.981  1.000  1.000  0.992  0.674  0.875  0.000  0.983
본번  0.968  0.958  0.992  1.000  0.785  0.796  0.807  0.989
부번  0.748  0.000  0.674  0.785  1.000  0.000  0.595  0.634
지적  0.000  0.105  0.875  0.796  0.000  1.000  0.802  0.914
부지  0.000  0.448  0.000  0.807  0.595  0.802  1.000  0.000
전용목적  0.938  0.881  0.983  0.989  0.634  0.914  0.000  1.000

[Figure: correlation heatmap, 5 categorical variables]

Variable  전용목적  면  토지구분  비고  구분
전용목적  1.000  0.475  1.000  1.000  0.666
면  0.475  1.000  1.000  1.000  0.302
토지구분  1.000  1.000  1.000  NaN  1.000
비고  1.000  1.000  NaN  1.000  1.000
구분  0.666  0.302  1.000  1.000  1.000

[Figure: correlation heatmap, 8 numeric and categorical variables]

Variable  부번  지적  부지  구분  면  토지구분  전용목적  비고
부번  1.000  -0.315  -0.355  0.737  0.000  1.000  0.323  1.000
지적  -0.315  1.000  0.652  0.000  0.030  1.000  0.573  1.000
부지  -0.355  0.652  1.000  0.000  0.185  1.000  0.000  1.000
구분  0.737  0.000  0.000  1.000  0.302  1.000  0.666  1.000
면  0.000  0.030  0.185  0.302  1.000  1.000  0.475  1.000
토지구분  1.000  1.000  1.000  1.000  1.000  1.000  1.000  NaN
전용목적  0.323  0.573  0.000  0.666  0.475  1.000  1.000  1.000
비고  1.000  1.000  1.000  1.000  1.000  NaN  1.000  1.000
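
The report does not say which coefficient each matrix uses. For pairs of categorical columns, one common association measure is Cramér's V. The sketch below is a generic, uncorrected version on small made-up lists; it is not claimed to reproduce the figures above, and `cramers_v` is an illustrative name.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical lists."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    joint = Counter(zip(xs, ys))

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for r, r_count in row_totals.items():
        for c, c_count in col_totals.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return float("nan")  # a constant column has no defined association
    return sqrt(chi2 / (n * k))


# Hypothetical toy data: y follows x exactly, z is independent of x.
x = ["a", "a", "b", "b"]
y = ["u", "u", "v", "v"]
z = ["u", "v", "u", "v"]

print(cramers_v(x, y))
print(cramers_v(x, z))
```

A value of 1 means one column fully determines the other, and 0 means no association. With only two or three rows in a category, as for 비고 here, values of exactly 1.000 are easy to reach and should be read with care.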

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
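
Nullity correlation can be illustrated by correlating presence indicators (1 if a value is present, 0 if missing) for two columns. A minimal sketch on made-up columns, using plain Pearson correlation; the function name and the toy data are hypothetical and are not taken from the dataset:

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # a column that is never (or always) missing
    return cov / sqrt(var_a * var_b)


# Hypothetical columns: None marks a missing cell.
main_lot = [332, None, 42, None]
sub_lot = [25, None, 2, None]     # missing exactly where main_lot is
remarks = [None, "x", None, "y"]  # present exactly where main_lot is missing

print(nullity_correlation(main_lot, sub_lot))
print(nullity_correlation(main_lot, remarks))
```

A value of 1 means the two columns are always missing together, -1 means one is present exactly when the other is missing, and columns with no missing values at all have no defined nullity correlation.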

Sample

First rows (the 토지구분, 본번, 부번, 지적, and 부지 cells are fused in the source and are kept as one run):

Index  구분  면  리  토지구분/본번/부번/지적/부지  전용목적  비고
0  협의  임실  오정  <NA>3322592379237.0  개간(농지조성)  <NA>
1  협의  임실  오정  <NA>33230754754.0  개간(농지조성)  <NA>
2  협의  관촌  슬치  42218971897.0  개간(농지조성)  <NA>
3  협의  관촌  슬치  42319791979.0  개간(농지조성)  <NA>
4  협의  관촌  상월  75136433643.0  개간(농지조성)  <NA>
5  협의  관촌  상월  75<NA>2359023590.0  개간(농지조성)  <NA>
6  변경  강진  용수  <NA>5571355.0  단독주택(명의변경)  <NA>
7  변경  강진  용수  <NA>557147373.0  단독주택(명의변경)  <NA>
8  변경  강진  용수  <NA>55716365365.0  단독주택(명의변경)  <NA>
9  변경  강진  용수  <NA>557172727.0  단독주택(명의변경)  <NA>

Last rows:

Index  구분  면  리  토지구분/본번/부번/지적/부지  전용목적  비고
122  협의  임실  현곡  138<NA>1970015.0  하천정비사업  <NA>
123  협의  운암  입석  4513611738.0  도로(소규모 구조개선)  <NA>
124  협의  운암  입석  4529439124.0  도로(소규모 구조개선)  <NA>
125  협의  운암  입석  44647448.0  도로(소규모 구조개선)  <NA>
126  협의  운암  입석  464813510.0  도로(소규모 구조개선)  <NA>
127  협의  관촌  신전  <NA>158247932705.0  동식물관련시설(축사)  <NA>
128  협의  운암  운종  <NA>4441914725.0  화장실 설치  <NA>
129  협의  운암  운종  <NA>44331237235.0  화장실 설치  <NA>
130  협의  삼계  어은  42193333.0  도로(국도 폭원 정비사업)  <NA>
131  협의  삼계  어은  42201111.0  도로(국도 폭원 정비사업)  <NA>