Overview

Dataset statistics

Number of variables: 8
Number of observations: 27
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.9 KiB
Average record size in memory: 72.9 B
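
As a quick consistency check, the total size should be roughly the average record size multiplied by the number of observations. The sketch below uses only the rounded figures printed above, so it reproduces the total only to the displayed precision.

```python
# Rounded values copied from the dataset statistics above.
n_observations = 27
avg_record_bytes = 72.9

# Total size implied by the average record size, in bytes and in KiB.
total_bytes = avg_record_bytes * n_observations
total_kib = total_bytes / 1024

print(f"{total_bytes:.1f} B = {total_kib:.1f} KiB")
```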

Variable types

Categorical: 2
Text: 2
Numeric: 4

Dataset

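Description: Status of tenant companies in the industrial complexes of Busan Metropolitan City. The dataset provides the district (구군명), industrial complex name (산업단지명), complex type (산단구분), project period (사업기간), developed area (조성면적), industrial land area (산업용지), number of tenant companies (입주업체), and number of employees (고용인원).
Author: Busan Metropolitan City (부산광역시)
URL: https://www.data.go.kr/data/15088731/fileData.do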

Alerts

조성면적(만제곱미터) is highly overall correlated with 산업용지(만제곱미터) and 2 other fields (High correlation)
산업용지(만제곱미터) is highly overall correlated with 조성면적(만제곱미터) and 2 other fields (High correlation)
입주업체(개사) is highly overall correlated with 조성면적(만제곱미터) and 3 other fields (High correlation)
고용인원(명) is highly overall correlated with 조성면적(만제곱미터) and 3 other fields (High correlation)
구군명 is highly overall correlated with 입주업체(개사) and 1 other field (High correlation)
산단구분 is highly imbalanced (58.6%) (Imbalance)
산업단지명 has unique values (Unique)
사업기간 has unique values (Unique)
고용인원(명) has unique values (Unique)

Reproduction

Analysis started: 2023-12-11 23:54:07.438304
Analysis finished: 2023-12-11 23:54:09.328091
Duration: 1.89 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
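
A report like this one can be regenerated with ydata-profiling, the software named above. The following is a minimal sketch rather than the configuration actually used: the function name `build_report`, the file paths, the report title, and the UTF-8 encoding are assumptions for illustration, and the `dataset` dictionary simply mirrors the Description, Author, and URL shown in the Dataset section.

```python
def build_report(csv_path: str, output_path: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Third-party imports are kept inside the function so that this module
    # loads even where pandas and ydata-profiling are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the file is a plain CSV. The encoding may need changing
    # (for example to "cp949") depending on how the file was exported.
    df = pd.read_csv(csv_path, encoding="utf-8")

    profile = ProfileReport(
        df,
        title="Busan industrial complexes",  # hypothetical title
        dataset={
            "description": "부산광역시 산업단지 입주기업 현황",
            "author": "부산광역시",
            "url": "https://www.data.go.kr/data/15088731/fileData.do",
        },
    )
    profile.to_file(output_path)
```

Calling `build_report("busan_industrial_complexes.csv")` (a hypothetical file name) would write `report.html`. The exact settings of the original run are the ones recorded in config.json.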

Variables

구군명
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 22.2%
Missing: 0
Missing (%): 0.0%
Memory size: 348.0 B

Length

Max length: 4
Median length: 3
Mean length: 3.037037
Min length: 3

Unique

Unique: 4
Unique (%): 14.8%

Sample

1st row: 사하구
2nd row: 강서구
3rd row: 강서구
4th row: 강서구
5th row: 강서구

Common Values

Value | Count | Frequency (%)
강서구 | 12 | 44.4%
기장군 | 11 | 40.7%
사하구 | 1 | 3.7%
사상구 | 1 | 3.7%
해운대구 | 1 | 3.7%
금정구 | 1 | 3.7%
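
The Distinct, Unique, and Mean length figures for 구군명 can all be recovered from these value counts. In the sketch below, "distinct" counts the different values and "unique" counts the values that occur exactly once; that reading is an interpretation that happens to match the reported numbers, not a statement about the tool's internals.

```python
from collections import Counter

# Value counts of 구군명, copied from the Common Values table above.
counts = Counter({"강서구": 12, "기장군": 11, "사하구": 1,
                  "사상구": 1, "해운대구": 1, "금정구": 1})
n = sum(counts.values())  # number of observations

distinct = len(counts)                              # different values
unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once
mean_length = sum(len(v) * c for v, c in counts.items()) / n

print(distinct, f"{distinct / n:.1%}")
print(unique, f"{unique / n:.1%}")
print(round(mean_length, 6))
```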

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed in the Common Values table]

산업단지명
Text

UNIQUE 

Distinct: 27
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 348.0 B

Length

Max length: 19
Median length: 16
Mean length: 10.407407
Min length: 6

Characters and Unicode

Total characters: 281
Distinct characters: 73
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
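
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below applies it to the first five values of 산업단지명 as they appear in this variable's Sample list. The helper `block_of` is a hypothetical stand-in that covers only the three blocks this report mentions, under their official Unicode names (the report abbreviates them as Hangul, ASCII, and Compat Jamo).

```python
import unicodedata
from collections import Counter

# First five values of 산업단지명, copied from this variable's Sample list.
# "\u318d" is the "ㆍ" separator (Hangul Compatibility Jamo block).
names = [
    "신평\u318d장림일반산업단지",
    "녹산국가산업단지",
    "신호일반산업단지(경자구역)",
    "부산과학일반산업단지(경자구역)",
    "화전일반산업단지(경자구역)",
]

def block_of(ch: str) -> str:
    """Map a character to one of the three blocks that occur in this column."""
    cp = ord(ch)
    if cp <= 0x7F:
        return "ASCII"
    if 0x3130 <= cp <= 0x318F:
        return "Hangul Compatibility Jamo"
    if 0xAC00 <= cp <= 0xD7AF:
        return "Hangul Syllables"
    return "Other"

text = "".join(names)
categories = Counter(unicodedata.category(ch) for ch in text)
blocks = Counter(block_of(ch) for ch in text)

print(categories)  # general categories: Lo (Other Letter), Ps, Pe
print(blocks)
```

The two-letter codes correspond to the category names used in the report: `Lo` is Other Letter, `Ps` is Open Punctuation, `Pe` is Close Punctuation, and `Nd` is Decimal Number.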

Unique

Unique: 27
Unique (%): 100.0%

Sample

1st row: 신평ㆍ장림일반산업단지
2nd row: 녹산국가산업단지
3rd row: 신호일반산업단지(경자구역)
4th row: 부산과학일반산업단지(경자구역)
5th row: 화전일반산업단지(경자구역)

Value | Count | Frequency (%)
신평ㆍ장림일반산업단지 | 1 | 3.7%
기룡1일반산업단지 | 1 | 3.7%
에코장안일반산업단지 | 1 | 3.7%
오리일반산업단지 | 1 | 3.7%
국제산업물류도시(1단계)(경자구역 | 1 | 3.7%
반룡일반산업단지 | 1 | 3.7%
부산신소재일반산업단지 | 1 | 3.7%
명례일반산업단지 | 1 | 3.7%
회동ㆍ석대도시첨단산업단지 | 1 | 3.7%
정관코리일반산업단지 | 1 | 3.7%
Other values (17) | 17 | 63.0%

Most occurring characters

Value | Count | Frequency (%)
 | 29 | 10.3%
 | 29 | 10.3%
 | 27 | 9.6%
 | 26 | 9.3%
 | 23 | 8.2%
 | 22 | 7.8%
( | 7 | 2.5%
) | 7 | 2.5%
 | 6 | 2.1%
 | 6 | 2.1%
Other values (63) | 99 | 35.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 263 | 93.6%
Open Punctuation | 7 | 2.5%
Close Punctuation | 7 | 2.5%
Decimal Number | 4 | 1.4%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 29 | 11.0%
 | 29 | 11.0%
 | 27 | 10.3%
 | 26 | 9.9%
 | 23 | 8.7%
 | 22 | 8.4%
 | 6 | 2.3%
 | 6 | 2.3%
 | 6 | 2.3%
 | 6 | 2.3%
Other values (59) | 83 | 31.6%

Decimal Number
Value | Count | Frequency (%)
2 | 2 | 50.0%
1 | 2 | 50.0%

Open Punctuation
Value | Count | Frequency (%)
( | 7 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 7 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 263 | 93.6%
Common | 18 | 6.4%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 29 | 11.0%
 | 29 | 11.0%
 | 27 | 10.3%
 | 26 | 9.9%
 | 23 | 8.7%
 | 22 | 8.4%
 | 6 | 2.3%
 | 6 | 2.3%
 | 6 | 2.3%
 | 6 | 2.3%
Other values (59) | 83 | 31.6%

Common
Value | Count | Frequency (%)
( | 7 | 38.9%
) | 7 | 38.9%
2 | 2 | 11.1%
1 | 2 | 11.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 261 | 92.9%
ASCII | 18 | 6.4%
Compat Jamo | 2 | 0.7%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 29 | 11.1%
 | 29 | 11.1%
 | 27 | 10.3%
 | 26 | 10.0%
 | 23 | 8.8%
 | 22 | 8.4%
 | 6 | 2.3%
 | 6 | 2.3%
 | 6 | 2.3%
 | 6 | 2.3%
Other values (58) | 81 | 31.0%

ASCII
Value | Count | Frequency (%)
( | 7 | 38.9%
) | 7 | 38.9%
2 | 2 | 11.1%
1 | 2 | 11.1%

Compat Jamo
Value | Count | Frequency (%)
ㆍ | 2 | 100.0%

산단구분
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): 14.8%
Missing: 0
Missing (%): 0.0%
Memory size: 348.0 B

Length

Max length: 4
Median length: 2
Mean length: 2.1481481
Min length: 2

Unique

Unique: 2
Unique (%): 7.4%

Sample

1st row: 일반
2nd row: 국가
3rd row: 일반
4th row: 일반
5th row: 일반

Common Values

Value | Count | Frequency (%)
일반 | 23 | 85.2%
도시첨단 | 2 | 7.4%
국가 | 1 | 3.7%
농공 | 1 | 3.7%
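
The 58.6% imbalance figure reported for 산단구분 is consistent with one minus the normalized Shannon entropy of these class counts. The sketch below shows that calculation; treating this as the score behind the alert is an assumption based on the numbers matching, and `imbalance_score` is a hypothetical helper name.

```python
import math

# Value counts of 산단구분, copied from the Common Values table above.
counts = {"일반": 23, "도시첨단": 2, "국가": 1, "농공": 1}

def imbalance_score(counts: dict) -> float:
    """Return 1 - (Shannon entropy / maximum entropy) for the class counts."""
    total = sum(counts.values())
    probs = [c / total for c in counts.values() if c > 0]
    if len(probs) < 2:
        return 0.0  # convention used here when only one class is present
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(probs))

score = imbalance_score(counts)
print(f"{score:.1%}")
```

A perfectly balanced column scores 0, and the score approaches 1 as a single class dominates, which is the situation here with 일반 covering 23 of 27 rows.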

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed in the Common Values table]

사업기간
Text

UNIQUE 

Distinct: 27
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 348.0 B

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 243
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 27
Unique (%): 100.0%

Sample

1st row: 1980~1990
2nd row: 1989~2002
3rd row: 1993~2007
4th row: 1991~2008
5th row: 2003~2011

Value | Count | Frequency (%)
1980~1990 | 1 | 3.7%
2005~2008 | 1 | 3.7%
2014~2019 | 1 | 3.7%
2013~2020 | 1 | 3.7%
2010~2019 | 1 | 3.7%
2013~2018 | 1 | 3.7%
2013~2017 | 1 | 3.7%
2008~2014 | 1 | 3.7%
2008~2013 | 1 | 3.7%
2010~2013 | 1 | 3.7%
Other values (17) | 17 | 63.0%
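
사업기간 is profiled as text because every value is a nine-character "start~end" year range. If numeric start and end years are needed, the values can be split on the tilde. The sketch below does this for the five sample values shown above; `parse_period` is a hypothetical helper and assumes every value follows the same `YYYY~YYYY` pattern.

```python
# First five values of 사업기간, copied from the Sample list above.
periods = ["1980~1990", "1989~2002", "1993~2007", "1991~2008", "2003~2011"]

def parse_period(period: str) -> tuple:
    """Split a 'YYYY~YYYY' string into (start_year, end_year)."""
    start, end = period.split("~")
    return int(start), int(end)

for p in periods:
    start, end = parse_period(p)
    print(p, "->", start, end, f"({end - start} years)")
```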

Most occurring characters

Value | Count | Frequency (%)
0 | 69 | 28.4%
2 | 51 | 21.0%
1 | 43 | 17.7%
~ | 27 | 11.1%
9 | 17 | 7.0%
8 | 10 | 4.1%
7 | 9 | 3.7%
3 | 7 | 2.9%
4 | 4 | 1.6%
5 | 4 | 1.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 216 | 88.9%
Math Symbol | 27 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 69 | 31.9%
2 | 51 | 23.6%
1 | 43 | 19.9%
9 | 17 | 7.9%
8 | 10 | 4.6%
7 | 9 | 4.2%
3 | 7 | 3.2%
4 | 4 | 1.9%
5 | 4 | 1.9%
6 | 2 | 0.9%

Math Symbol
Value | Count | Frequency (%)
~ | 27 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 243 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 69 | 28.4%
2 | 51 | 21.0%
1 | 43 | 17.7%
~ | 27 | 11.1%
9 | 17 | 7.0%
8 | 10 | 4.1%
7 | 9 | 3.7%
3 | 7 | 2.9%
4 | 4 | 1.6%
5 | 4 | 1.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 243 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 69 | 28.4%
2 | 51 | 21.0%
1 | 43 | 17.7%
~ | 27 | 11.1%
9 | 17 | 7.0%
8 | 10 | 4.1%
7 | 9 | 3.7%
3 | 7 | 2.9%
4 | 4 | 1.6%
5 | 4 | 1.6%

조성면적(만제곱미터)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 23
Distinct (%): 85.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 130.25926
Minimum: 1
Maximum: 700
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 375.0 B

Quantile statistics

Minimum: 1
5-th percentile: 5.3
Q1: 9.5
Median: 55
Q3: 176.5
95-th percentile: 506.2
Maximum: 700
Range: 699
Interquartile range (IQR): 167

Descriptive statistics

Standard deviation: 179.96123
Coefficient of variation (CV): 1.3815619
Kurtosis: 3.557666
Mean: 130.25926
Median Absolute Deviation (MAD): 49
Skewness: 1.9266513
Sum: 3517
Variance: 32386.046
Monotonicity: Not monotonic
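
Several of these statistics are simple functions of the others, which gives a way to sanity-check the report. The sketch below recomputes the mean, variance, coefficient of variation, range, and IQR for 조성면적(만제곱미터) from the rounded figures above, so small differences in the last digits are expected.

```python
# Rounded statistics for 조성면적(만제곱미터), copied from the report above.
n = 27
total = 3517
std = 179.96123
minimum, maximum = 1, 700
q1, q3 = 9.5, 176.5

mean = total / n                 # reported: 130.25926
variance = std ** 2              # reported: 32386.046
cv = std / mean                  # reported: 1.3815619
value_range = maximum - minimum  # reported: 699
iqr = q3 - q1                    # reported: 167

print(round(mean, 5), round(variance, 2), round(cv, 5), value_range, iqr)
```

A CV above 1 together with a mean (130.26) far above the median (55) reflects the strong right skew: a few very large complexes dominate the total area.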
[Figure: Histogram with fixed size bins (bins=23)]

Common values
Value | Count | Frequency (%)
8 | 2 | 7.4%
10 | 2 | 7.4%
6 | 2 | 7.4%
26 | 2 | 7.4%
130 | 1 | 3.7%
9 | 1 | 3.7%
20 | 1 | 3.7%
61 | 1 | 3.7%
571 | 1 | 3.7%
55 | 1 | 3.7%
Other values (13) | 13 | 48.1%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 3.7%
5 | 1 | 3.7%
6 | 2 | 7.4%
8 | 2 | 7.4%
9 | 1 | 3.7%
10 | 2 | 7.4%
20 | 1 | 3.7%
23 | 1 | 3.7%
26 | 2 | 7.4%
55 | 1 | 3.7%

Maximum 10 values
Value | Count | Frequency (%)
700 | 1 | 3.7%
571 | 1 | 3.7%
355 | 1 | 3.7%
312 | 1 | 3.7%
282 | 1 | 3.7%
245 | 1 | 3.7%
196 | 1 | 3.7%
157 | 1 | 3.7%
130 | 1 | 3.7%
121 | 1 | 3.7%

산업용지(만제곱미터)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 22
Distinct (%): 81.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 72.444444
Minimum: 1
Maximum: 411
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 375.0 B

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 5.5
Median: 21
Q3: 89.5
95-th percentile: 281.2
Maximum: 411
Range: 410
Interquartile range (IQR): 84

Descriptive statistics

Standard deviation: 103.29172
Coefficient of variation (CV): 1.425806
Kurtosis: 4.0804102
Mean: 72.444444
Median Absolute Deviation (MAD): 18
Skewness: 2.0366364
Sum: 1956
Variance: 10669.179
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=22)]

Common values
Value | Count | Frequency (%)
5 | 3 | 11.1%
3 | 2 | 7.4%
13 | 2 | 7.4%
6 | 2 | 7.4%
175 | 1 | 3.7%
61 | 1 | 3.7%
4 | 1 | 3.7%
40 | 1 | 3.7%
319 | 1 | 3.7%
33 | 1 | 3.7%
Other values (12) | 12 | 44.4%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 3.7%
3 | 2 | 7.4%
4 | 1 | 3.7%
5 | 3 | 11.1%
6 | 2 | 7.4%
13 | 2 | 7.4%
17 | 1 | 3.7%
19 | 1 | 3.7%
21 | 1 | 3.7%
33 | 1 | 3.7%

Maximum 10 values
Value | Count | Frequency (%)
411 | 1 | 3.7%
319 | 1 | 3.7%
193 | 1 | 3.7%
175 | 1 | 3.7%
171 | 1 | 3.7%
142 | 1 | 3.7%
92 | 1 | 3.7%
87 | 1 | 3.7%
77 | 1 | 3.7%
61 | 1 | 3.7%
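
The quantile statistics for 산업용지(만제곱미터) can be reproduced from the smallest and largest values listed above, because every position needed for the five reported quantiles falls inside those two tables. The sketch below assumes linear interpolation between neighbouring order statistics, which matches the reported figures; the two sorted positions that appear in neither table (15 and 16) are simply left out, and `quantile` is a hypothetical helper.

```python
# Sorted values (0-based positions) read off the tables of smallest and
# largest values of 산업용지(만제곱미터); there are n = 27 observations.
n = 27
smallest = [1, 3, 3, 4, 5, 5, 5, 6, 6, 13, 13, 17, 19, 21, 33]  # positions 0-14
largest = [61, 77, 87, 92, 142, 171, 175, 193, 319, 411]        # positions 17-26

known = dict(enumerate(smallest))
known.update({n - len(largest) + i: v for i, v in enumerate(largest)})

def quantile(q: float) -> float:
    """Linear interpolation between the two nearest order statistics."""
    pos = q * (n - 1)
    lo = int(pos)
    hi = min(lo + 1, n - 1)
    frac = pos - lo
    return known[lo] + frac * (known[hi] - known[lo])

for label, q in [("5%", 0.05), ("Q1", 0.25), ("median", 0.5),
                 ("Q3", 0.75), ("95%", 0.95)]:
    print(label, round(quantile(q), 1))
```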

입주업체(개사)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 25
Distinct (%): 92.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 258.11111
Minimum: 1
Maximum: 2195
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 375.0 B

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 16.5
Median: 76
Q3: 215
95-th percentile: 1245.8
Maximum: 2195
Range: 2194
Interquartile range (IQR): 198.5

Descriptive statistics

Standard deviation: 499.05683
Coefficient of variation (CV): 1.9334961
Kurtosis: 9.5660307
Mean: 258.11111
Median Absolute Deviation (MAD): 72
Skewness: 3.0385274
Sum: 6969
Variance: 249057.72
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=25)]

Common values
Value | Count | Frequency (%)
4 | 2 | 7.4%
2 | 2 | 7.4%
641 | 1 | 3.7%
1505 | 1 | 3.7%
24 | 1 | 3.7%
65 | 1 | 3.7%
567 | 1 | 3.7%
70 | 1 | 3.7%
83 | 1 | 3.7%
117 | 1 | 3.7%
Other values (15) | 15 | 55.6%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 3.7%
2 | 2 | 7.4%
3 | 1 | 3.7%
4 | 2 | 7.4%
16 | 1 | 3.7%
17 | 1 | 3.7%
21 | 1 | 3.7%
24 | 1 | 3.7%
31 | 1 | 3.7%
65 | 1 | 3.7%

Maximum 10 values
Value | Count | Frequency (%)
2195 | 1 | 3.7%
1505 | 1 | 3.7%
641 | 1 | 3.7%
567 | 1 | 3.7%
435 | 1 | 3.7%
306 | 1 | 3.7%
217 | 1 | 3.7%
213 | 1 | 3.7%
167 | 1 | 3.7%
117 | 1 | 3.7%

고용인원(명)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 27
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4027.8889
Minimum: 61
Maximum: 28301
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 375.0 B

Quantile statistics

Minimum: 61
5-th percentile: 86.3
Q1: 184
Median: 1452
Q3: 3846
95-th percentile: 15091.5
Maximum: 28301
Range: 28240
Interquartile range (IQR): 3662

Descriptive statistics

Standard deviation: 6410.8098
Coefficient of variation (CV): 1.5916054
Kurtosis: 7.4819863
Mean: 4027.8889
Median Absolute Deviation (MAD): 1321
Skewness: 2.5874432
Sum: 108753
Variance: 41098483
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=27)]

Common values
Value | Count | Frequency (%)
14059 | 1 | 3.7%
28301 | 1 | 3.7%
61 | 1 | 3.7%
520 | 1 | 3.7%
1167 | 1 | 3.7%
9480 | 1 | 3.7%
1275 | 1 | 3.7%
396 | 1 | 3.7%
1567 | 1 | 3.7%
2440 | 1 | 3.7%
Other values (17) | 17 | 63.0%

Minimum 10 values
Value | Count | Frequency (%)
61 | 1 | 3.7%
68 | 1 | 3.7%
129 | 1 | 3.7%
131 | 1 | 3.7%
149 | 1 | 3.7%
161 | 1 | 3.7%
174 | 1 | 3.7%
194 | 1 | 3.7%
396 | 1 | 3.7%
520 | 1 | 3.7%

Maximum 10 values
Value | Count | Frequency (%)
28301 | 1 | 3.7%
15534 | 1 | 3.7%
14059 | 1 | 3.7%
9480 | 1 | 3.7%
8549 | 1 | 3.7%
5841 | 1 | 3.7%
3955 | 1 | 3.7%
3737 | 1 | 3.7%
3471 | 1 | 3.7%
3075 | 1 | 3.7%

Interactions

[Figures: 16 interaction plots (Matplotlib v3.7.2)]

Correlations

[Figure: correlation heatmap]

Variable | 구군명 | 산업단지명 | 산단구분 | 사업기간 | 조성면적(만제곱미터) | 산업용지(만제곱미터) | 입주업체(개사) | 고용인원(명)
구군명 | 1.000 | 1.000 | 0.640 | 1.000 | 0.000 | 0.000 | 0.671 | 0.723
산업단지명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
산단구분 | 0.640 | 1.000 | 1.000 | 1.000 | 0.693 | 0.554 | 0.552 | 0.554
사업기간 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
조성면적(만제곱미터) | 0.000 | 1.000 | 0.693 | 1.000 | 1.000 | 0.980 | 0.896 | 0.919
산업용지(만제곱미터) | 0.000 | 1.000 | 0.554 | 1.000 | 0.980 | 1.000 | 0.798 | 0.965
입주업체(개사) | 0.671 | 1.000 | 0.552 | 1.000 | 0.896 | 0.798 | 1.000 | 0.938
고용인원(명) | 0.723 | 1.000 | 0.554 | 1.000 | 0.919 | 0.965 | 0.938 | 1.000

[Figure: correlation heatmap]

Variable | 산단구분 | 구군명
산단구분 | 1.000 | 0.442
구군명 | 0.442 | 1.000

[Figure: correlation heatmap]

Variable | 조성면적(만제곱미터) | 산업용지(만제곱미터) | 입주업체(개사) | 고용인원(명) | 구군명 | 산단구분
조성면적(만제곱미터) | 1.000 | 0.981 | 0.746 | 0.852 | 0.000 | 0.321
산업용지(만제곱미터) | 0.981 | 1.000 | 0.713 | 0.823 | 0.000 | 0.377
입주업체(개사) | 0.746 | 0.713 | 1.000 | 0.950 | 0.514 | 0.463
고용인원(명) | 0.852 | 0.823 | 0.950 | 1.000 | 0.517 | 0.377
구군명 | 0.000 | 0.000 | 0.514 | 0.517 | 1.000 | 0.442
산단구분 | 0.321 | 0.377 | 0.463 | 0.377 | 0.442 | 1.000
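
The High correlation alerts in the Alerts section line up with the last of these matrices: counting, for each variable, the other variables whose coefficient is at least 0.5 gives exactly the "and N other fields" totals listed there. The sketch below performs that count. The 0.5 threshold is an assumption chosen because it reproduces the alerts, and `correlated_fields` is a hypothetical helper.

```python
# Last correlation matrix above, copied row by row.
names = ["조성면적(만제곱미터)", "산업용지(만제곱미터)", "입주업체(개사)",
         "고용인원(명)", "구군명", "산단구분"]
matrix = [
    [1.000, 0.981, 0.746, 0.852, 0.000, 0.321],
    [0.981, 1.000, 0.713, 0.823, 0.000, 0.377],
    [0.746, 0.713, 1.000, 0.950, 0.514, 0.463],
    [0.852, 0.823, 0.950, 1.000, 0.517, 0.377],
    [0.000, 0.000, 0.514, 0.517, 1.000, 0.442],
    [0.321, 0.377, 0.463, 0.377, 0.442, 1.000],
]
THRESHOLD = 0.5  # assumption: chosen because it reproduces the alerts

def correlated_fields(threshold: float = THRESHOLD) -> dict:
    """For each variable, list the other variables at or above the threshold."""
    result = {}
    for i, name in enumerate(names):
        result[name] = [
            names[j]
            for j, value in enumerate(matrix[i])
            if j != i and abs(value) >= threshold
        ]
    return result

for name, others in correlated_fields().items():
    print(name, len(others), others)
```

Under this threshold 산단구분 has no partner at or above 0.5, which is consistent with it carrying an Imbalance alert but no High correlation alert.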

Missing values

[Figure: nullity by column]
A simple visualization of nullity by column.

[Figure: nullity matrix]
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

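First rows

Index | 구군명 | 산업단지명 | 산단구분 | 사업기간 | 조성면적(만제곱미터) | 산업용지(만제곱미터) | 입주업체(개사) | 고용인원(명)
0 | 사하구 | 신평ㆍ장림일반산업단지 | 일반 | 1980~1990 | 282 | 175 | 641 | 14059
1 | 강서구 | 녹산국가산업단지 | 국가 | 1989~2002 | 700 | 411 | 1505 | 28301
2 | 강서구 | 신호일반산업단지(경자구역) | 일반 | 1993~2007 | 312 | 171 | 76 | 3075
3 | 강서구 | 부산과학일반산업단지(경자구역) | 일반 | 1991~2008 | 196 | 92 | 167 | 3955
4 | 강서구 | 화전일반산업단지(경자구역) | 일반 | 2003~2011 | 245 | 142 | 306 | 5841
5 | 강서구 | 강서보고일반산업단지 | 일반 | 2010~2014 | 10 | 5 | 21 | 194
6 | 사상구 | 모라도시첨단산업단지 | 도시첨단 | 2012~2015 | 1 | 1 | 217 | 1452
7 | 강서구 | 생곡일반산업단지(경자구역) | 일반 | 2009~2015 | 56 | 34 | 101 | 1037
8 | 강서구 | 성우일반산업단지 | 일반 | 2009~2016 | 6 | 3 | 16 | 161
9 | 강서구 | 풍상일반산업단지 | 일반 | 2011~2016 | 6 | 5 | 3 | 149

Last rows

Index | 구군명 | 산업단지명 | 산단구분 | 사업기간 | 조성면적(만제곱미터) | 산업용지(만제곱미터) | 입주업체(개사) | 고용인원(명)
17 | 기장군 | 기룡2일반산업단지 | 일반 | 2007~2011 | 5 | 3 | 2 | 129
18 | 기장군 | 정관코리일반산업단지 | 일반 | 2010~2013 | 8 | 6 | 4 | 131
19 | 금정구 | 회동ㆍ석대도시첨단산업단지 | 도시첨단 | 2008~2013 | 23 | 13 | 117 | 2440
20 | 기장군 | 명례일반산업단지 | 일반 | 2008~2014 | 157 | 87 | 83 | 1567
21 | 기장군 | 부산신소재일반산업단지 | 일반 | 2013~2017 | 26 | 17 | 4 | 396
22 | 기장군 | 반룡일반산업단지 | 일반 | 2013~2018 | 55 | 33 | 70 | 1275
23 | 강서구 | 국제산업물류도시(1단계)(경자구역) | 일반 | 2010~2019 | 571 | 319 | 567 | 9480
24 | 기장군 | 오리일반산업단지 | 일반 | 2013~2020 | 61 | 40 | 65 | 1167
25 | 기장군 | 에코장안일반산업단지 | 일반 | 2014~2019 | 20 | 13 | 24 | 520
26 | 강서구 | 정주일반산업단지 | 일반 | 2014~2020 | 9 | 4 | 2 | 61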