Overview

Dataset statistics

Number of variables: 6
Number of observations: 88
Missing cells: 8
Missing cells (%): 1.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.4 KiB
Average record size in memory: 51.5 B
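The missing-cell percentage can be checked by hand. The sketch below assumes it is the number of missing cells divided by observations times variables; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics above.
n_variables = 6
n_observations = 88
missing_cells = 8

# Assumption: the percentage is taken over all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{missing_pct:.1f}%")  # prints 1.5%
```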

Variable types

Categorical: 3
Text: 1
Numeric: 2

Dataset

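Description: Status of industrial complex development in 전북특별자치도 (Jeonbuk Special Self-Governing Province). Columns: 구분 (type: national industrial, general industrial, urban high-tech, or agro-industrial complex), 시군명 (city/county name), 단지명 (complex name), 조성 (development status), 관리 면적 (managed area), 입주업체 수 (number of tenant companies), etc.
Author: 전북특별자치도 (Jeonbuk Special Self-Governing Province)
URL: https://www.data.go.kr/data/3081504/fileData.do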

Alerts

조성 is highly imbalanced (67.6%) [Imbalance]
입주업체 수 has 8 (9.1%) missing values [Missing]
입주업체 수 has 1 (1.1%) zeros [Zeros]
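The 67.6% imbalance figure for 조성 is consistent with one minus the normalized Shannon entropy of the class counts (80, 6 and 2, as listed under that variable). The following is a minimal sketch under that assumption; the function name `imbalance_score` is hypothetical, and the report itself does not state the formula.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    Assumption: this mirrors how the report's imbalance percentage is derived.
    """
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(n_classes)

# Class counts of 조성: 조성 완료 = 80, 조성 중 = 6, 계획 수립 = 2
score = imbalance_score([80, 6, 2])
print(f"{100 * score:.1f}%")
```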

Reproduction

Analysis started: 2024-03-14 18:12:45.223708
Analysis finished: 2024-03-14 18:12:47.805285
Duration: 2.58 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

구분
Categorical

Distinct: 4
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Memory size: 832.0 B

Length

Max length: 8
Median length: 4
Mean length: 4.6818182
Min length: 4

Unique

Unique: 1
Unique (%): 1.1%

Sample

1st row: 국가산업단지
2nd row: 국가산업단지
3rd row: 국가산업단지
4th row: 국가산업단지
5th row: 국가산업단지

Common Values

Value | Count | Frequency (%)
농공단지 | 59 | 67.0%
일반산업단지 | 22 | 25.0%
국가산업단지 | 6 | 6.8%
도시첨단산업단지 | 1 | 1.1%
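The frequency column and the mean length reported for 구분 both follow from the four value counts. A small sketch, assuming frequencies are counts divided by the 88 observations and lengths are counted in Unicode code points:

```python
from collections import Counter

# Value counts of 구분, copied from the Common Values table.
counts = Counter({
    "농공단지": 59,
    "일반산업단지": 22,
    "국가산업단지": 6,
    "도시첨단산업단지": 1,
})

total = sum(counts.values())

# Frequency (%) of each value, most common first.
percentages = [f"{100 * c / total:.1f}" for _, c in counts.most_common()]

# Mean string length, weighting each value's length by its count.
mean_length = sum(len(value) * c for value, c in counts.items()) / total

print(percentages)
print(round(mean_length, 7))
```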

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

시군명
Categorical

Distinct: 14
Distinct (%): 15.9%
Missing: 0
Missing (%): 0.0%
Memory size: 832.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 군산
2nd row: 군산
3rd row: 익산
4th row: 익산
5th row: 군산

Common Values

Value | Count | Frequency (%)
정읍 | 12 | 13.6%
익산 | 10 | 11.4%
김제 | 9 | 10.2%
군산 | 8 | 9.1%
남원 | 8 | 9.1%
전주 | 7 | 8.0%
완주 | 6 | 6.8%
부안 | 5 | 5.7%
고창 | 5 | 5.7%
순창 | 5 | 5.7%
Other values (4) | 13 | 14.8%

Length

[Figure: Histogram of lengths of the category]

단지명
Text

Distinct: 86
Distinct (%): 97.7%
Missing: 0
Missing (%): 0.0%
Memory size: 832.0 B

Length

Max length: 12
Median length: 2
Mean length: 3.3636364
Min length: 2

Characters and Unicode

Total characters: 296
Distinct characters: 120
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
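As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories for two 단지명 values taken from the sample rows of this report; the helper name `category_counts` is hypothetical. The two-letter codes correspond to the category names used in the tables of this section: `Lo` is Other Letter, `Zs` Space Separator, `Nd` Decimal Number, `Ps` Open Punctuation, `Pe` Close Punctuation and `Pd` Dash Punctuation.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count the Unicode general category of every character in text."""
    return Counter(unicodedata.category(ch) for ch in text)

# Two complex names that appear in the sample rows of this report.
for name in ["군산 2", "친환경첨단복합(1단계)"]:
    print(name, dict(category_counts(name)))
```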

Unique

Unique: 84
Unique (%): 95.5%

Sample

1st row: 군산
2nd row: 군산 2
3rd row: 익산
4th row: 국가식품클러스터
5th row: 새만금

Words

Value | Count | Frequency (%)
제2 | 6 | 5.4%
2 | 5 | 4.5%
익산 | 4 | 3.6%
군산 | 3 | 2.7%
제3 | 3 | 2.7%
노암 | 3 | 2.7%
부안 | 3 | 2.7%
전주 | 3 | 2.7%
정읍 | 3 | 2.7%
완주 | 2 | 1.8%
Other values (70) | 77 | 68.8%

Most occurring characters

Value | Count | Frequency (%)
(space) | 24 | 8.1%
… | 15 | 5.1%
… | 14 | 4.7%
2 | 12 | 4.1%
… | 7 | 2.4%
1 | 6 | 2.0%
… | 6 | 2.0%
… | 5 | 1.7%
3 | 5 | 1.7%
… | 5 | 1.7%
Other values (110) | 197 | 66.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 245 | 82.8%
Space Separator | 24 | 8.1%
Decimal Number | 24 | 8.1%
Open Punctuation | 1 | 0.3%
Close Punctuation | 1 | 0.3%
Dash Punctuation | 1 | 0.3%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
… | 15 | 6.1%
… | 14 | 5.7%
… | 7 | 2.9%
… | 6 | 2.4%
… | 5 | 2.0%
… | 5 | 2.0%
… | 5 | 2.0%
… | 4 | 1.6%
… | 4 | 1.6%
… | 4 | 1.6%
Other values (102) | 176 | 71.8%

Decimal Number
Value | Count | Frequency (%)
2 | 12 | 50.0%
1 | 6 | 25.0%
3 | 5 | 20.8%
4 | 1 | 4.2%

Space Separator
Value | Count | Frequency (%)
(space) | 24 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 245 | 82.8%
Common | 51 | 17.2%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
… | 15 | 6.1%
… | 14 | 5.7%
… | 7 | 2.9%
… | 6 | 2.4%
… | 5 | 2.0%
… | 5 | 2.0%
… | 5 | 2.0%
… | 4 | 1.6%
… | 4 | 1.6%
… | 4 | 1.6%
Other values (102) | 176 | 71.8%

Common
Value | Count | Frequency (%)
(space) | 24 | 47.1%
2 | 12 | 23.5%
1 | 6 | 11.8%
3 | 5 | 9.8%
( | 1 | 2.0%
) | 1 | 2.0%
- | 1 | 2.0%
4 | 1 | 2.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 245 | 82.8%
ASCII | 51 | 17.2%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 24 | 47.1%
2 | 12 | 23.5%
1 | 6 | 11.8%
3 | 5 | 9.8%
( | 1 | 2.0%
) | 1 | 2.0%
- | 1 | 2.0%
4 | 1 | 2.0%

Hangul
Value | Count | Frequency (%)
… | 15 | 6.1%
… | 14 | 5.7%
… | 7 | 2.9%
… | 6 | 2.4%
… | 5 | 2.0%
… | 5 | 2.0%
… | 5 | 2.0%
… | 4 | 1.6%
… | 4 | 1.6%
… | 4 | 1.6%
Other values (102) | 176 | 71.8%

조성
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Memory size: 832.0 B

Length

Max length: 5
Median length: 5
Mean length: 4.9318182
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 조성 완료
2nd row: 조성 완료
3rd row: 조성 완료
4th row: 조성 완료
5th row: 조성 중

Common Values

Value | Count | Frequency (%)
조성 완료 | 80 | 90.9%
조성 중 | 6 | 6.8%
계획 수립 | 2 | 2.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Words

Value | Count | Frequency (%)
조성 | 86 | 48.9%
완료 | 80 | 45.5%
중 | 6 | 3.4%
계획 | 2 | 1.1%
수립 | 2 | 1.1%

관리 면적
Real number (ℝ)

Distinct: 79
Distinct (%): 89.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1009.5795
Minimum: 50
Maximum: 18497
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 920.0 B

Quantile statistics

Minimum: 50
5-th percentile: 75.45
Q1: 141.5
median: 231.5
Q3: 424.25
95-th percentile: 3341.5
Maximum: 18497
Range: 18447
Interquartile range (IQR): 282.75

Descriptive statistics

Standard deviation: 2662.732
Coefficient of variation (CV): 2.6374663
Kurtosis: 28.617531
Mean: 1009.5795
Median Absolute Deviation (MAD): 100
Skewness: 5.0921414
Sum: 88843
Variance: 7090141.8
Monotonicity: Not monotonic
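Several of the derived figures for 관리 면적 can be recomputed from the other reported values. A short check, assuming the usual definitions (CV as standard deviation over mean, variance as the squared standard deviation); small differences are expected because the inputs are already rounded.

```python
# Reported summary values for 관리 면적.
mean = 1009.5795
std = 2662.732
q1, q3 = 141.5, 424.25
minimum, maximum = 50, 18497

cv = std / mean                    # coefficient of variation
variance = std ** 2                # variance from the standard deviation
iqr = q3 - q1                      # interquartile range
value_range = maximum - minimum    # range

print(round(cv, 4), round(variance, 1), iqr, value_range)
```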
[Figure: Histogram with fixed size bins (bins=50)]
Common values

Value | Count | Frequency (%)
147 | 4 | 4.5%
329 | 3 | 3.4%
143 | 2 | 2.3%
149 | 2 | 2.3%
140 | 2 | 2.3%
53 | 2 | 2.3%
116 | 1 | 1.1%
50 | 1 | 1.1%
230 | 1 | 1.1%
112 | 1 | 1.1%
Other values (69) | 69 | 78.4%

Minimum 10 values

Value | Count | Frequency (%)
50 | 1 | 1.1%
53 | 2 | 2.3%
57 | 1 | 1.1%
73 | 1 | 1.1%
80 | 1 | 1.1%
83 | 1 | 1.1%
89 | 1 | 1.1%
94 | 1 | 1.1%
98 | 1 | 1.1%
106 | 1 | 1.1%

Maximum 10 values

Value | Count | Frequency (%)
18497 | 1 | 1.1%
14612 | 1 | 1.1%
6828 | 1 | 1.1%
5641 | 1 | 1.1%
3359 | 1 | 1.1%
3309 | 1 | 1.1%
3074 | 1 | 1.1%
2986 | 1 | 1.1%
2740 | 1 | 1.1%
2192 | 1 | 1.1%
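The 5-th and 95-th percentiles reported for 관리 면적 (75.45 and 3341.5) can be reproduced from the smallest and largest values listed above, assuming linear interpolation between neighbouring order statistics. That interpolation rule is an assumption inferred from the numbers matching, and the helper `linear_quantile` is hypothetical.

```python
import math

n = 88  # number of observations (no missing values in this column)

# Smallest values in ascending order (53 occurs twice) and largest in descending order.
smallest = [50, 53, 53, 57, 73, 80, 83, 89, 94, 98, 106]
largest = [18497, 14612, 6828, 5641, 3359, 3309, 3074, 2986, 2740, 2192]

def linear_quantile(q, n, value_at_rank):
    """Quantile by linear interpolation; value_at_rank maps a 0-based ascending rank to a value."""
    pos = q * (n - 1)          # fractional rank
    lo = math.floor(pos)
    frac = pos - lo
    if frac == 0:
        return float(value_at_rank(lo))
    low, high = value_at_rank(lo), value_at_rank(lo + 1)
    return low + frac * (high - low)

p05 = linear_quantile(0.05, n, lambda r: smallest[r])
p95 = linear_quantile(0.95, n, lambda r: largest[n - 1 - r])

print(round(p05, 2), round(p95, 2))
```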

입주업체 수
Real number (ℝ)

MISSING  ZEROS 

Distinct: 45
Distinct (%): 56.2%
Missing: 8
Missing (%): 9.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.2375
Minimum: 0
Maximum: 456
Zeros: 1
Zeros (%): 1.1%
Negative: 0
Negative (%): 0.0%
Memory size: 920.0 B

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 7
median: 17.5
Q3: 42.5
95-th percentile: 154.55
Maximum: 456
Range: 456
Interquartile range (IQR): 35.5

Descriptive statistics

Standard deviation: 63.839443
Coefficient of variation (CV): 1.7143859
Kurtosis: 24.320876
Mean: 37.2375
Median Absolute Deviation (MAD): 11.5
Skewness: 4.3957339
Sum: 2979
Variance: 4075.4745
Monotonicity: Not monotonic
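For 입주업체 수 the eight missing values affect which denominator each statistic uses. The reported numbers are consistent with the mean and Distinct (%) being taken over the 80 non-missing values, while Missing (%) is taken over all 88 observations. A small check under that reading:

```python
# Reported figures for 입주업체 수.
n_observations = 88
n_missing = 8
n_distinct = 45
total = 2979  # reported Sum

n_valid = n_observations - n_missing             # rows with a value

mean = total / n_valid                           # mean over non-missing rows
distinct_pct = 100 * n_distinct / n_valid        # distinct share of non-missing rows
missing_pct = 100 * n_missing / n_observations   # missing share of all rows

print(mean, f"{distinct_pct:.1f}%", f"{missing_pct:.1f}%")
```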
[Figure: Histogram with fixed size bins (bins=45)]
Common values

Value | Count | Frequency (%)
7 | 5 | 5.7%
1 | 4 | 4.5%
17 | 4 | 4.5%
2 | 4 | 4.5%
9 | 3 | 3.4%
11 | 3 | 3.4%
14 | 3 | 3.4%
6 | 3 | 3.4%
20 | 3 | 3.4%
23 | 3 | 3.4%
Other values (35) | 45 | 51.1%
(Missing) | 8 | 9.1%

Minimum 10 values

Value | Count | Frequency (%)
0 | 1 | 1.1%
1 | 4 | 4.5%
2 | 4 | 4.5%
4 | 3 | 3.4%
5 | 1 | 1.1%
6 | 3 | 3.4%
7 | 5 | 5.7%
8 | 2 | 2.3%
9 | 3 | 3.4%
10 | 2 | 2.3%

Maximum 10 values

Value | Count | Frequency (%)
456 | 1 | 1.1%
235 | 1 | 1.1%
190 | 1 | 1.1%
165 | 1 | 1.1%
154 | 1 | 1.1%
119 | 1 | 1.1%
83 | 1 | 1.1%
80 | 1 | 1.1%
68 | 2 | 2.3%
64 | 1 | 1.1%

Interactions

[Figures: interaction plots]

Correlations

[Figure: correlation heatmap]

 | 구분 | 시군명 | 단지명 | 조성 | 관리 면적 | 입주업체 수
구분 | 1.000 | 0.464 | 0.000 | 0.163 | 0.480 | 0.609
시군명 | 0.464 | 1.000 | 1.000 | 0.000 | 0.416 | 0.000
단지명 | 0.000 | 1.000 | 1.000 | 0.000 | 0.898 | 0.000
조성 | 0.163 | 0.000 | 0.000 | 1.000 | 0.263 | 0.000
관리 면적 | 0.480 | 0.416 | 0.898 | 0.263 | 1.000 | 0.757
입주업체 수 | 0.609 | 0.000 | 0.000 | 0.000 | 0.757 | 1.000

[Figure: correlation heatmap]

 | 구분 | 조성 | 시군명
구분 | 1.000 | 0.153 | 0.259
조성 | 0.153 | 1.000 | 0.000
시군명 | 0.259 | 0.000 | 1.000

[Figure: correlation heatmap]

 | 관리 면적 | 입주업체 수 | 구분 | 시군명 | 조성
관리 면적 | 1.000 | 0.488 | 0.407 | 0.215 | 0.201
입주업체 수 | 0.488 | 1.000 | 0.458 | 0.000 | 0.000
구분 | 0.407 | 0.458 | 1.000 | 0.259 | 0.153
시군명 | 0.215 | 0.000 | 0.259 | 1.000 | 0.000
조성 | 0.201 | 0.000 | 0.153 | 0.000 | 1.000

Missing values

[Figure: nullity count by column]
A simple visualization of nullity by column.

[Figure: nullity matrix]
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

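First rows

 | 구분 | 시군명 | 단지명 | 조성 | 관리 면적 | 입주업체 수
0 | 국가산업단지 | 군산 | 군산 | 조성 완료 | 6828 | 154
1 | 국가산업단지 | 군산 | 군산 2 | 조성 완료 | 14612 | 456
2 | 국가산업단지 | 익산 | 익산 | 조성 완료 | 1336 | 235
3 | 국가산업단지 | 익산 | 국가식품클러스터 | 조성 완료 | 2192 | 53
4 | 국가산업단지 | 군산 | 새만금 | 조성 중 | 18497 | 4
5 | 국가산업단지 | 전주 | 전주 탄소 | 계획 수립 | 656 | 0
6 | 일반산업단지 | 전주 | 전주 제1 | 조성 완료 | 1806 | 119
7 | 일반산업단지 | 전주 | 전주 제2 | 조성 완료 | 687 | 30
8 | 일반산업단지 | 전주 | 친환경첨단복합(1단계) | 조성 완료 | 291 | 47
9 | 일반산업단지 | 전주 | 친환경첨단복합3-1 | 조성 완료 | 284 | 2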
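
Last rows

 | 구분 | 시군명 | 단지명 | 조성 | 관리 면적 | 입주업체 수
78 | 농공단지 | 고창 | 고수 | 조성 완료 | 106 | 23
79 | 농공단지 | 고창 | 아산 | 조성 완료 | 140 | 23
80 | 농공단지 | 고창 | 흥덕 | 조성 완료 | 315 | 14
81 | 농공단지 | 고창 | 복분자 | 조성 완료 | 196 | 11
82 | 농공단지 | 부안 | 줄포 | 조성 완료 | 89 | 14
83 | 농공단지 | 부안 | 부안 | 조성 완료 | 149 | 27
84 | 농공단지 | 부안 | 부안 2 | 조성 완료 | 344 | 9
85 | 농공단지 | 부안 | 부안 제3 | 조성 중 | 329 | <NA>
86 | 농공단지 | 정읍 | 철도산업 | 조성 중 | 222 | 1
87 | 농공단지 | 완주 | 완주 | 계획 수립 | 299 | <NA>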