Overview

Dataset statistics

Number of variables: 3
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 322.3 KiB
Average record size in memory: 33.0 B
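The average record size follows from the total memory footprint divided by the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes (the variable names are illustrative, not part of the report):

```python
# Figures copied from the dataset statistics above.
total_size_kib = 322.3      # total size in memory, in KiB
n_observations = 10_000     # number of observations

# Convert KiB to bytes, then divide by the row count.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the result agrees with the 33.0 B reported above.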

Variable types

Text: 1
Categorical: 1
Numeric: 1

Dataset

Description: Data on land owned by Gochang-gun, Jeollabuk-do (전라북도 고창군), provided as a CSV file with information such as location (소재지), land category (지목), and area (면적).
URL: https://www.data.go.kr/data/15112876/fileData.do

Alerts

지목 is highly imbalanced (52.5%) [Imbalance]
면적 is highly skewed (γ1 = 28.9903347) [Skewed]
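The γ1 value in the skewness alert is a sample skewness coefficient. The sketch below computes the adjusted Fisher-Pearson coefficient in plain Python; that the report uses exactly this estimator (it is the one pandas implements in `Series.skew()`) is an assumption, and the function name and toy data are hypothetical.

```python
import math

def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness (assumed estimator, see lead-in)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 values")
    mean = sum(values) / n
    # Biased second and third central moments.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    g1 = m3 / m2 ** 1.5
    # Small-sample correction factor.
    return math.sqrt(n * (n - 1)) / (n - 2) * g1

# Toy data, not taken from the dataset.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 1, 100]
print(sample_skewness(symmetric))
print(sample_skewness(right_tailed))
```

A symmetric sample gives a coefficient of zero, while a few very large values, as in 면적, push it strongly positive.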

Reproduction

Analysis started: 2023-12-12 22:23:46.896605
Analysis finished: 2023-12-12 22:23:47.456892
Duration: 0.56 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
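The reported duration can be checked directly from the two timestamps. A small sketch using only the standard library (variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the reproduction details above.
started = datetime.fromisoformat("2023-12-12 22:23:46.896605")
finished = datetime.fromisoformat("2023-12-12 22:23:47.456892")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```

The difference is 0.560287 seconds, which rounds to the 0.56 seconds shown.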

Variables

소재지
Text

Distinct: 9996
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 25
Median length: 22
Mean length: 21.9145
Min length: 18

Characters and Unicode

Total characters: 219145
Distinct characters: 131
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
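As an illustration of those character properties, the sketch below uses Python's standard `unicodedata` module to tally the general category of each character in the first sample row of this variable. The category codes `Lo`, `Zs`, `Nd` and `Pd` correspond to Other Letter, Space Separator, Decimal Number and Dash Punctuation. Script and block lookups are not available in the standard library, so they are left out here.

```python
import unicodedata
from collections import Counter

# First sample row of the 소재지 variable.
address = "전라북도 고창군 고창읍 읍내리 324-25"

# Map every character to its Unicode general category and count them.
category_counts = Counter(unicodedata.category(ch) for ch in address)

for category, count in category_counts.most_common():
    print(category, count)
```

For this address the tally is 13 letters, 5 digits, 4 spaces and 1 dash.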

Unique

Unique: 9992
Unique (%): 99.9%
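Distinct and Unique measure different things: Distinct counts the different values, while Unique counts the values that occur exactly once. A minimal sketch of the distinction, using a hypothetical helper and toy data, followed by the same bookkeeping applied to the figures reported above:

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique) for an iterable of hashable values."""
    counts = Counter(values)
    distinct = len(counts)                              # different values
    unique = sum(1 for c in counts.values() if c == 1)  # values seen once
    return distinct, unique

# Toy data, not taken from the dataset.
toy = ["a", "b", "b", "c", "c", "d"]
distinct, unique = distinct_and_unique(toy)
print(distinct, unique)

# Figures reported for this variable.
n_rows, n_distinct, n_unique = 10_000, 9_996, 9_992
repeated_values = n_distinct - n_unique  # values occurring more than once
rows_in_repeats = n_rows - n_unique      # rows holding those values
print(repeated_values, rows_in_repeats)
```

With 4 repeated values covering 8 rows, each repeated address appears exactly twice.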

Sample

1st row: 전라북도 고창군 고창읍 읍내리 324-25
2nd row: 전라북도 고창군 고창읍 교촌리 93-20
3rd row: 전라북도 고창군 고창읍 율계리 94-6
4th row: 전라북도 고창군 신림면 반룡리 607-5
5th row: 전라북도 고창군 해리면 라성리 666-2
Value | Count | Frequency (%)
전라북도 | 10000 | 20.0%
고창군 | 10000 | 20.0%
고창읍 | 1479 | 3.0%
무장면 | 864 | 1.7%
해리면 | 850 | 1.7%
고수면 | 843 | 1.7%
대산면 | 775 | 1.6%
부안면 | 727 | 1.5%
흥덕면 | 672 | 1.3%
아산면 | 651 | 1.3%
Other values (6455) | 23139 | 46.3%
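The counts above sum to 50000, five tokens per address, which is consistent with the values being split on whitespace before counting. A sketch of that tokenization on the five sample rows, assuming a plain whitespace split (the report's exact tokenizer is not shown here):

```python
from collections import Counter

# The five sample rows listed for this variable.
rows = [
    "전라북도 고창군 고창읍 읍내리 324-25",
    "전라북도 고창군 고창읍 교촌리 93-20",
    "전라북도 고창군 고창읍 율계리 94-6",
    "전라북도 고창군 신림면 반룡리 607-5",
    "전라북도 고창군 해리면 라성리 666-2",
]

# Split each address on whitespace and count the resulting tokens.
tokens = [token for row in rows for token in row.split()]
token_counts = Counter(tokens)
total_tokens = len(tokens)

for token, count in token_counts.most_common(3):
    print(token, count, f"{count / total_tokens:.1%}")
```

On these five rows 전라북도 and 고창군 each account for 20.0% of tokens, the same share they hold in the full table.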

Most occurring characters

Value | Count | Frequency (%)
(space) | 40000 | 18.3%
 | 12551 | 5.7%
 | 11547 | 5.3%
 | 10850 | 5.0%
 | 10260 | 4.7%
 | 10127 | 4.6%
 | 10121 | 4.6%
 | 10033 | 4.6%
 | 10000 | 4.6%
- | 9456 | 4.3%
Other values (121) | 84200 | 38.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 130000 | 59.3%
Space Separator | 40000 | 18.3%
Decimal Number | 39689 | 18.1%
Dash Punctuation | 9456 | 4.3%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 12551 | 9.7%
 | 11547 | 8.9%
 | 10850 | 8.3%
 | 10260 | 7.9%
 | 10127 | 7.8%
 | 10121 | 7.8%
 | 10033 | 7.7%
 | 10000 | 7.7%
 | 8521 | 6.6%
 | 3152 | 2.4%
Other values (109) | 32838 | 25.3%

Decimal Number
Value | Count | Frequency (%)
1 | 7124 | 17.9%
2 | 6634 | 16.7%
3 | 4793 | 12.1%
4 | 4019 | 10.1%
5 | 3399 | 8.6%
6 | 3285 | 8.3%
7 | 2896 | 7.3%
8 | 2699 | 6.8%
9 | 2579 | 6.5%
0 | 2261 | 5.7%

Space Separator
Value | Count | Frequency (%)
(space) | 40000 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 9456 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 130000 | 59.3%
Common | 89145 | 40.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 12551 | 9.7%
 | 11547 | 8.9%
 | 10850 | 8.3%
 | 10260 | 7.9%
 | 10127 | 7.8%
 | 10121 | 7.8%
 | 10033 | 7.7%
 | 10000 | 7.7%
 | 8521 | 6.6%
 | 3152 | 2.4%
Other values (109) | 32838 | 25.3%

Common
Value | Count | Frequency (%)
(space) | 40000 | 44.9%
- | 9456 | 10.6%
1 | 7124 | 8.0%
2 | 6634 | 7.4%
3 | 4793 | 5.4%
4 | 4019 | 4.5%
5 | 3399 | 3.8%
6 | 3285 | 3.7%
7 | 2896 | 3.2%
8 | 2699 | 3.0%
Other values (2) | 4840 | 5.4%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 130000 | 59.3%
ASCII | 89145 | 40.7%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 40000 | 44.9%
- | 9456 | 10.6%
1 | 7124 | 8.0%
2 | 6634 | 7.4%
3 | 4793 | 5.4%
4 | 4019 | 4.5%
5 | 3399 | 3.8%
6 | 3285 | 3.7%
7 | 2896 | 3.2%
8 | 2699 | 3.0%
Other values (2) | 4840 | 5.4%

Hangul
Value | Count | Frequency (%)
 | 12551 | 9.7%
 | 11547 | 8.9%
 | 10850 | 8.3%
 | 10260 | 7.9%
 | 10127 | 7.8%
 | 10121 | 7.8%
 | 10033 | 7.7%
 | 10000 | 7.7%
 | 8521 | 6.6%
 | 3152 | 2.4%
Other values (109) | 32838 | 25.3%

지목
Categorical

IMBALANCE 

Distinct: 24
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Value | Count
도로 | 5628
 | 1357
 | 1118
임야 | 808
 | 404
Other values (19) | 685

Length

Max length: 4
Median length: 2
Mean length: 1.7414
Min length: 1

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 도로
2nd row:
3rd row: 도로
4th row: 도로
5th row: 도로

Common Values

Value | Count | Frequency (%)
도로 | 5628 | 56.3%
 | 1357 | 13.6%
 | 1118 | 11.2%
임야 | 808 | 8.1%
 | 404 | 4.0%
유지 | 140 | 1.4%
하천 | 122 | 1.2%
구거 | 108 | 1.1%
잡종지 | 98 | 1.0%
유원지 | 62 | 0.6%
Other values (14) | 155 | 1.6%

Length

Histogram of lengths of the category

면적
Real number (ℝ)

SKEWED 

Distinct: 1861
Distinct (%): 18.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 471.41169
Minimum: 0.4
Maximum: 159812
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0.4
5-th percentile: 7
Q1: 30
Median: 84
Q3: 252
95-th percentile: 1397.24
Maximum: 159812
Range: 159811.6
Interquartile range (IQR): 222
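Range and IQR are derived from the other quantile figures: the range is the maximum minus the minimum, and the IQR is Q3 minus Q1. A short check of both, with illustrative variable names:

```python
# Quantile figures copied from the table above.
minimum, q1, q3, maximum = 0.4, 30.0, 252.0, 159812.0

value_range = maximum - minimum  # spread of the whole variable
iqr = q3 - q1                    # spread of the middle 50% of values

print(f"Range: {value_range:.1f}")
print(f"IQR: {iqr:.0f}")
```

The middle half of the parcels spans only 222 area units, while the full range is roughly 720 times wider, which matches the heavy right tail flagged for this variable.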

Descriptive statistics

Standard deviation: 3175.6952
Coefficient of variation (CV): 6.7365644
Kurtosis: 1120.991
Mean: 471.41169
Median Absolute Deviation (MAD): 68
Skewness: 28.990335
Sum: 4714116.9
Variance: 10085040
Monotonicity: Not monotonic
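Several of these statistics are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. A sketch that recomputes them from the rounded figures above (small differences in the last digits are expected from that rounding):

```python
# Figures copied from the report.
mean = 471.41169
std = 3175.6952
n_observations = 10_000

cv = std / mean                # coefficient of variation
variance = std ** 2            # variance
total = mean * n_observations  # sum of all values

print(f"CV: {cv:.4f}")
print(f"Variance: {variance:.0f}")
print(f"Sum: {total:.1f}")
```

A CV above 6 means the standard deviation is more than six times the mean, another sign that a few very large parcels dominate the spread.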
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
7.0 | 262 | 2.6%
13.0 | 238 | 2.4%
17.0 | 234 | 2.3%
10.0 | 225 | 2.2%
3.0 | 177 | 1.8%
20.0 | 166 | 1.7%
33.0 | 160 | 1.6%
36.0 | 139 | 1.4%
23.0 | 138 | 1.4%
40.0 | 120 | 1.2%
Other values (1851) | 8141 | 81.4%

Minimum 10 values

Value | Count | Frequency (%)
0.4 | 2 | < 0.1%
1.0 | 36 | 0.4%
2.0 | 31 | 0.3%
2.2 | 1 | < 0.1%
2.5 | 1 | < 0.1%
2.8 | 1 | < 0.1%
2.9 | 1 | < 0.1%
3.0 | 177 | 1.8%
3.5 | 1 | < 0.1%
3.7 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
159812.0 | 1 | < 0.1%
114207.0 | 1 | < 0.1%
113061.0 | 1 | < 0.1%
105126.0 | 1 | < 0.1%
62374.4 | 1 | < 0.1%
55676.0 | 1 | < 0.1%
47127.0 | 1 | < 0.1%
45167.0 | 1 | < 0.1%
44350.0 | 1 | < 0.1%
36665.2 | 1 | < 0.1%

Interactions

[Figure: interactions plot]

Correlations

 | 지목 | 면적
지목 | 1.000 | 0.509
면적 | 0.509 | 1.000
 | 면적 | 지목
면적 | 1.000 | 0.246
지목 | 0.246 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

 | 소재지 | 지목 | 면적
481 | 전라북도 고창군 고창읍 읍내리 324-25 | 도로 | 1563.0
1725 | 전라북도 고창군 고창읍 교촌리 93-20 |  | 144.0
5993 | 전라북도 고창군 고창읍 율계리 94-6 | 도로 | 734.0
40532 | 전라북도 고창군 신림면 반룡리 607-5 | 도로 | 1198.0
24249 | 전라북도 고창군 해리면 라성리 666-2 | 도로 | 228.0
18924 | 전라북도 고창군 공음면 건동리 151-2 | 도로 | 17.0
40204 | 전라북도 고창군 신림면 가평리 440-2 | 도로 | 109.0
18552 | 전라북도 고창군 공음면 선동리 600-6 | 임야 | 13.0
27994 | 전라북도 고창군 성송면 학천리 443-2 | 도로 | 165.0
12844 | 전라북도 고창군 아산면 반암리 54-242.0
 | 소재지 | 지목 | 면적
7565 | 전라북도 고창군 고수면 봉산리 16272101.3
2818 | 전라북도 고창군 고창읍 석정리 6991044.1
24023 | 전라북도 고창군 해리면 왕촌리 1338-2 | 도로 | 399.0
32106 | 전라북도 고창군 심원면 주산리 152-4 | 도로 | 13.0
32253 | 전라북도 고창군 심원면 주산리 742-2 | 도로 | 26.0
24247 | 전라북도 고창군 해리면 라성리 664-4266.0
27643 | 전라북도 고창군 성송면 괴치리 42-3 | 임야 | 198.0
17740 | 전라북도 고창군 공음면 석교리 968-1847.0
925 | 전라북도 고창군 고창읍 읍내리 516-5 | 도로 | 18.0
22411 | 전라북도 고창군 상하면 검산리 1017-2 | 도로 | 56.0