Overview

Dataset statistics

Number of variables: 4
Number of observations: 57
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.1 KiB
Average record size in memory: 37.3 B
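
The two memory figures above are linked by simple arithmetic: average record size times the number of observations gives roughly the total size. A minimal sketch of that check, assuming 1 KiB = 1024 bytes and that the report rounds both figures to one decimal place:

```python
# Figures copied from the dataset statistics above.
n_observations = 57
avg_record_bytes = 37.3  # reported value, already rounded to one decimal

# Total size implied by the average record size.
total_bytes = avg_record_bytes * n_observations
total_kib = total_bytes / 1024  # assumption: 1 KiB = 1024 bytes

print(f"{total_bytes:.1f} B is about {total_kib:.2f} KiB")
```

Rounded to one decimal place, the result agrees with the reported total size.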

Variable types

Categorical: 2
Numeric: 2

Dataset

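Description: Statistics on establishments and employees by industry, drawn from the Gyeongnam Big Data Hub Platform DB. The columns are 산업구분명 (industry category name), 기준년도 (reference year), 사업체수 (number of establishments), and 종사자수 (number of employees).
Author: 경상남도 (Gyeongsangnam-do)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15123814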

Alerts

사업체수 is highly overall correlated with 종사자수 and 1 other field [High correlation]
종사자수 is highly overall correlated with 사업체수 and 1 other field [High correlation]
산업구분명 is highly overall correlated with 사업체수 and 1 other field [High correlation]
사업체수 has unique values [Unique]
종사자수 has unique values [Unique]

Reproduction

Analysis started: 2023-12-11 00:51:38.936420
Analysis finished: 2023-12-11 00:51:39.534995
Duration: 0.6 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
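
For readers who want to regenerate a report like this one, the following is a minimal sketch using the ydata-profiling package named above. The function name `build_profile`, the CSV path, and the output file name are hypothetical; the actual input file and the settings in config.json are not shown in this report, so default settings are assumed.

```python
def build_profile(csv_path, output_html="report.html", title="Profiling Report"):
    """Profile a CSV file and write the HTML report to output_html.

    csv_path is a hypothetical local copy of the dataset; the report
    itself does not say where the input file lives.
    """
    # Imported lazily so this module still loads when the optional
    # third-party packages (pandas, ydata-profiling) are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)

    # Default settings are assumed here; the original run may have used
    # different options via its config.json.
    report = ProfileReport(df, title=title)
    report.to_file(output_html)
    return report
```

A call such as `build_profile("industry_stats.csv")` (file name assumed) would write `report.html` to the working directory.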

Variables

산업구분명
Categorical

HIGH CORRELATION 

Distinct: 20
Distinct (%): 35.1%
Missing: 0
Missing (%): 0.0%
Memory size: 588.0 B

Value | Count
건설업 | 3
사업시설 관리, 사업 지원 및 임대 서비스업 | 3
협회 및 단체, 수리 및 기타 개인 서비스업 | 3
광업 | 3
교육 서비스업 | 3
Other values (15) | 42

Length

Max length: 24
Median length: 16
Mean length: 12
Min length: 2

Unique

Unique: 1
Unique (%): 1.8%

Sample

1st row: 건설업
2nd row: 건설업
3rd row: 건설업
4th row: 공공행정, 국방 및 사회보장 행정
5th row: 공공행정, 국방 및 사회보장 행정

Common Values

Value | Count | Frequency (%)
건설업 | 3 | 5.3%
사업시설 관리, 사업 지원 및 임대 서비스업 | 3 | 5.3%
협회 및 단체, 수리 및 기타 개인 서비스업 | 3 | 5.3%
광업 | 3 | 5.3%
교육 서비스업 | 3 | 5.3%
금융 및 보험업 | 3 | 5.3%
농업, 임업 및 어업 | 3 | 5.3%
도매 및 소매업 | 3 | 5.3%
보건업 및 사회복지 서비스업 | 3 | 5.3%
부동산업 | 3 | 5.3%
Other values (10) | 27 | 47.4%
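
Each frequency in the table above is the row count divided by the 57 observations. A minimal sketch of that calculation, assuming one-decimal rounding as in the report; the helper name `pct` is introduced here for illustration only:

```python
n_rows = 57                # number of observations in the dataset
top_counts = [3] * 10      # the ten listed categories, 3 rows each
other_count = 27           # the "Other values (10)" row

# The listed counts should account for every row.
assert sum(top_counts) + other_count == n_rows


def pct(count, total):
    """Format count/total as a percentage with one decimal place."""
    return f"{100 * count / total:.1f}%"


print(pct(3, n_rows))            # share of a single listed category
print(pct(other_count, n_rows))  # share of the "Other values" row
```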

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
(no value shown) | 42 | 20.0%
서비스업 | 18 | 8.6%
건설업 | 3 | 1.4%
여가관련 | 3 | 1.4%
원료 | 3 | 1.4%
재생업 | 3 | 1.4%
공공행정 | 3 | 1.4%
국방 | 3 | 1.4%
사회보장 | 3 | 1.4%
행정 | 3 | 1.4%
Other values (43) | 126 | 60.0%

기준년도
Categorical

Distinct: 3
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Memory size: 588.0 B

Value | Count
2017 | 19
2018 | 19
2019 | 19

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2017
2nd row: 2018
3rd row: 2019
4th row: 2017
5th row: 2018

Common Values

Value | Count | Frequency (%)
2017 | 19 | 33.3%
2018 | 19 | 33.3%
2019 | 19 | 33.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Value | Count | Frequency (%)
2017 | 19 | 33.3%
2018 | 19 | 33.3%
2019 | 19 | 33.3%

사업체수
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 57
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14788.193
Minimum: 85
Maximum: 66322
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 645.0 B

Quantile statistics

Minimum: 85
5-th percentile: 143.4
Q1: 1067
Median: 7492
Q3: 20856
95-th percentile: 63146
Maximum: 66322
Range: 66237
Interquartile range (IQR): 19789

Descriptive statistics

Standard deviation: 19675.861
Coefficient of variation (CV): 1.3305115
Kurtosis: 1.6493262
Mean: 14788.193
Median Absolute Deviation (MAD): 6432
Skewness: 1.6565749
Sum: 842927
Variance: 3.8713951 × 10^8
Monotonicity: Not monotonic
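
Several of the 사업체수 statistics above can be derived from one another: the mean is the sum divided by the number of observations, the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the standard deviation squared. A minimal sketch that recomputes them from the reported figures; small differences are expected because the inputs are already rounded:

```python
import math

# Figures copied from the 사업체수 statistics above.
n = 57
total = 842927
minimum, maximum = 85, 66322
q1, q3 = 1067, 20856
std = 19675.861

mean = total / n                 # reported: 14788.193
value_range = maximum - minimum  # reported: 66237
iqr = q3 - q1                    # reported: 19789
cv = std / mean                  # reported: 1.3305115
variance = std ** 2              # reported: 3.8713951 x 10^8

print(f"mean={mean:.3f} range={value_range} iqr={iqr}")
print(f"cv={cv:.7f} variance={variance:.7e}")

# Sanity checks against the reported values, with loose tolerances
# because the inputs are rounded.
assert math.isclose(mean, 14788.193, abs_tol=1e-3)
assert math.isclose(cv, 1.3305115, abs_tol=1e-5)
```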
[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
9973 | 1 | 1.8%
157 | 1 | 1.8%
869 | 1 | 1.8%
916 | 1 | 1.8%
59762 | 1 | 1.8%
61806 | 1 | 1.8%
62642 | 1 | 1.8%
7360 | 1 | 1.8%
7492 | 1 | 1.8%
7555 | 1 | 1.8%
Other values (47) | 47 | 82.5%

Minimum 10 values

Value | Count | Frequency (%)
85 | 1 | 1.8%
87 | 1 | 1.8%
89 | 1 | 1.8%
157 | 1 | 1.8%
203 | 1 | 1.8%
291 | 1 | 1.8%
336 | 1 | 1.8%
366 | 1 | 1.8%
395 | 1 | 1.8%
799 | 1 | 1.8%

Maximum 10 values

Value | Count | Frequency (%)
66322 | 1 | 1.8%
65898 | 1 | 1.8%
65162 | 1 | 1.8%
62642 | 1 | 1.8%
61806 | 1 | 1.8%
59762 | 1 | 1.8%
38352 | 1 | 1.8%
37048 | 1 | 1.8%
36668 | 1 | 1.8%
29827 | 1 | 1.8%

종사자수
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 57
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 73587.491
Minimum: 712
Maximum: 423853
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 645.0 B

Quantile statistics

Minimum: 712
5-th percentile: 2779.2
Q1: 10508
Median: 41238
Q3: 93514
95-th percentile: 223041.2
Maximum: 423853
Range: 423141
Interquartile range (IQR): 83006

Descriptive statistics

Standard deviation: 95096.059
Coefficient of variation (CV): 1.2922856
Kurtosis: 7.0044227
Mean: 73587.491
Median Absolute Deviation (MAD): 31876
Skewness: 2.5569689
Sum: 4194487
Variance: 9.0432604 × 10^9
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
85111 | 1 | 1.8%
4169 | 1 | 1.8%
9362 | 1 | 1.8%
9736 | 1 | 1.8%
147783 | 1 | 1.8%
153097 | 1 | 1.8%
158258 | 1 | 1.8%
23442 | 1 | 1.8%
24650 | 1 | 1.8%
26122 | 1 | 1.8%
Other values (47) | 47 | 82.5%

Minimum 10 values

Value | Count | Frequency (%)
712 | 1 | 1.8%
756 | 1 | 1.8%
828 | 1 | 1.8%
3267 | 1 | 1.8%
3282 | 1 | 1.8%
3372 | 1 | 1.8%
4169 | 1 | 1.8%
4266 | 1 | 1.8%
4535 | 1 | 1.8%
8593 | 1 | 1.8%

Maximum 10 values

Value | Count | Frequency (%)
423853 | 1 | 1.8%
412713 | 1 | 1.8%
409762 | 1 | 1.8%
176361 | 1 | 1.8%
170972 | 1 | 1.8%
169269 | 1 | 1.8%
158258 | 1 | 1.8%
153097 | 1 | 1.8%
147783 | 1 | 1.8%
128395 | 1 | 1.8%
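
The high skewness and kurtosis of 종사자수 point to a heavy right tail, and the largest values bear this out. As an illustration only, the sketch below applies Tukey's 1.5 × IQR rule to the reported quartiles. This rule is not part of the report; it is a common convention assumed here to show how far the top values sit from the bulk of the data.

```python
# Quartiles copied from the 종사자수 quantile statistics above.
q1, q3 = 10508, 93514
iqr = q3 - q1

# Tukey's upper fence (assumed convention, not used by the report itself).
upper_fence = q3 + 1.5 * iqr

# The ten largest 종사자수 values listed above.
largest_values = [
    423853, 412713, 409762, 176361, 170972,
    169269, 158258, 153097, 147783, 128395,
]

# Values above the fence would count as outliers under this rule.
flagged = [v for v in largest_values if v > upper_fence]

print(f"upper fence: {upper_fence:.0f}")
print(f"values above the fence: {flagged}")
```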

Interactions

[Figures: four interaction plots]

Correlations

[Figure: correlation plot]

 | 산업구분명 | 기준년도 | 사업체수 | 종사자수
산업구분명 | 1.000 | 0.000 | 0.990 | 0.962
기준년도 | 0.000 | 1.000 | 0.000 | 0.000
사업체수 | 0.990 | 0.000 | 1.000 | 0.854
종사자수 | 0.962 | 0.000 | 0.854 | 1.000

[Figure: correlation plot]

 | 산업구분명 | 기준년도
산업구분명 | 1.000 | 0.000
기준년도 | 0.000 | 1.000

[Figure: correlation plot]

 | 사업체수 | 종사자수 | 산업구분명 | 기준년도
사업체수 | 1.000 | 0.892 | 0.817 | 0.000
종사자수 | 0.892 | 1.000 | 0.729 | 0.000
산업구분명 | 0.817 | 0.729 | 1.000 | 0.000
기준년도 | 0.000 | 0.000 | 0.000 | 1.000
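
The high-correlation alerts near the top of the report can be related to the first correlation matrix in this section by listing the variable pairs whose coefficient exceeds a cut-off. A minimal sketch follows; the cut-off of 0.5 is an assumption made for illustration, since the report does not state the threshold it used.

```python
from itertools import combinations

# First correlation matrix from this section, in row/column order.
columns = ["산업구분명", "기준년도", "사업체수", "종사자수"]
matrix = [
    [1.000, 0.000, 0.990, 0.962],
    [0.000, 1.000, 0.000, 0.000],
    [0.990, 0.000, 1.000, 0.854],
    [0.962, 0.000, 0.854, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off; not stated in the report

# Collect each unordered pair once (upper triangle of the matrix).
high_pairs = [
    (columns[i], columns[j], matrix[i][j])
    for i, j in combinations(range(len(columns)), 2)
    if abs(matrix[i][j]) >= THRESHOLD
]

for left, right, value in high_pairs:
    print(f"{left} ~ {right}: {value:.3f}")
```

Under this assumption, three pairs are flagged and 기준년도 appears in none of them, which matches the three high-correlation alerts.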

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

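First rows

 | 산업구분명 | 기준년도 | 사업체수 | 종사자수
0 | 건설업 | 2017 | 9973 | 85111
1 | 건설업 | 2018 | 10157 | 82581
2 | 건설업 | 2019 | 10495 | 86226
3 | 공공행정, 국방 및 사회보장 행정 | 2017 | 1060 | 45350
4 | 공공행정, 국방 및 사회보장 행정 | 2018 | 1058 | 48323
5 | 공공행정, 국방 및 사회보장 행정 | 2019 | 1067 | 50422
6 | 광업 | 2017 | 89 | 828
7 | 광업 | 2018 | 85 | 712
8 | 광업 | 2019 | 87 | 756
9 | 교육 서비스업 | 2017 | 13085 | 93514

Last rows

 | 산업구분명 | 기준년도 | 사업체수 | 종사자수
47 | 전문, 과학 및 기술 서비스업 | 2019 | 5563 | 36114
48 | 정보통신업 | 2017 | 1163 | 10074
49 | 정보통신업 | 2018 | 1164 | 10051
50 | 정보통신업 | 2019 | 1250 | 10508
51 | 제조업 | 2017 | 36668 | 423853
52 | 제조업 | 2018 | 37048 | 409762
53 | 제조업 | 2019 | 38352 | 412713
54 | 협회 및 단체, 수리 및 기타 개인 서비스업 | 2017 | 28051 | 54142
55 | 협회 및 단체, 수리 및 기타 개인 서비스업 | 2018 | 29066 | 55660
56 | 협회 및 단체, 수리 및 기타 개인 서비스업 | 2019 | 29827 | 57406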
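
As a usage illustration, the sample rows above can be loaded into plain Python structures and summarised. The sketch below computes employees per establishment (종사자수 divided by 사업체수) for each industry, using only the 20 rows shown in this sample. Industries with fewer than three rows shown are therefore based on partial data, and this ratio is an example calculation rather than something the report provides.

```python
from collections import defaultdict

# (산업구분명, 기준년도, 사업체수, 종사자수) as read from the sample rows.
rows = [
    ("건설업", 2017, 9973, 85111),
    ("건설업", 2018, 10157, 82581),
    ("건설업", 2019, 10495, 86226),
    ("공공행정, 국방 및 사회보장 행정", 2017, 1060, 45350),
    ("공공행정, 국방 및 사회보장 행정", 2018, 1058, 48323),
    ("공공행정, 국방 및 사회보장 행정", 2019, 1067, 50422),
    ("광업", 2017, 89, 828),
    ("광업", 2018, 85, 712),
    ("광업", 2019, 87, 756),
    ("교육 서비스업", 2017, 13085, 93514),
    ("전문, 과학 및 기술 서비스업", 2019, 5563, 36114),
    ("정보통신업", 2017, 1163, 10074),
    ("정보통신업", 2018, 1164, 10051),
    ("정보통신업", 2019, 1250, 10508),
    ("제조업", 2017, 36668, 423853),
    ("제조업", 2018, 37048, 409762),
    ("제조업", 2019, 38352, 412713),
    ("협회 및 단체, 수리 및 기타 개인 서비스업", 2017, 28051, 54142),
    ("협회 및 단체, 수리 및 기타 개인 서비스업", 2018, 29066, 55660),
    ("협회 및 단체, 수리 및 기타 개인 서비스업", 2019, 29827, 57406),
]

# Accumulate [establishments, employees] per industry across the shown years.
totals = defaultdict(lambda: [0, 0])
for industry, _year, establishments, employees in rows:
    totals[industry][0] += establishments
    totals[industry][1] += employees

# Employees per establishment, pooled over the rows shown.
ratio = {
    industry: employees / establishments
    for industry, (establishments, employees) in totals.items()
}

for industry, value in sorted(ratio.items(), key=lambda item: item[1], reverse=True):
    print(f"{industry}: {value:.2f}")
```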