Overview

Dataset statistics

Number of variables: 7
Number of observations: 1264
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 75.4 KiB
Average record size in memory: 61.1 B
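
The average record size appears to be the total in-memory size divided by the number of observations. A minimal check of that arithmetic, using only the figures reported above and assuming 1 KiB = 1024 bytes:

```python
# Figures taken from the dataset statistics above.
total_size_kib = 75.4
n_observations = 1264

# Assumption: 1 KiB = 1024 bytes.
average_record_bytes = total_size_kib * 1024 / n_observations
print(f"{average_record_bytes:.1f} B")
```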

Variable types

Numeric: 5
Categorical: 2

Dataset

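Description: Business attraction status. The dataset provides the city/county name (시군구명), the number of new companies (신규기업수), the number of companies relocated from the Seoul metropolitan area (수도권 이전기업 수), the number of employees at attracted companies (유치기업의 고용자수), and the investment amount of attracted companies (유치기업의 투자금액).
Author: 충청남도 (Chungcheongnam-do)
URL: https://alldam.chungnam.go.kr/bigdata/collect/view.chungnam?menuCd=DOM_000000201001001000&apiIdx=143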

Alerts

High correlation: 신규기업수 is highly overall correlated with 유치기업의 고용자수 and 1 other field
High correlation: 수도권 이전기업 수 is highly overall correlated with 유치기업의 고용자수 and 1 other field
High correlation: 유치기업의 고용자수 is highly overall correlated with 신규기업수 and 2 other fields
High correlation: 유치기업의 투자금액(억원) is highly overall correlated with 신규기업수 and 2 other fields
Imbalance: 승인상태(W : 대기 S : 승인) is highly imbalanced (90.2%)
Zeros: 신규기업수 has 374 (29.6%) zeros
Zeros: 수도권 이전기업 수 has 1033 (81.7%) zeros
Zeros: 유치기업의 고용자수 has 373 (29.5%) zeros
Zeros: 유치기업의 투자금액(억원) has 400 (31.6%) zeros
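
The 90.2% imbalance figure is consistent with one minus the normalized Shannon entropy of the class counts (1248 rows with S, 16 rows with W). The sketch below assumes that definition; the function name `imbalance_score` is hypothetical and is not taken from the profiling tool.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy of the counts.

    Assumed definition: 0 means perfectly balanced classes,
    values near 1 mean one class dominates.
    """
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Class counts for 승인상태 as reported in this document: S = 1248, W = 16.
score = imbalance_score([1248, 16])
print(f"{score:.1%}")
```

Under this definition a 50/50 split scores 0%, so 90.2% indicates that almost all rows share one status.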

Reproduction

Analysis started: 2024-03-13 11:52:15.683762
Analysis finished: 2024-03-13 11:52:20.775648
Duration: 5.09 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
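
The reported duration can be recomputed from the two timestamps. A short sketch using only the Python standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-03-13 11:52:15.683762")
finished = datetime.fromisoformat("2024-03-13 11:52:20.775648")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```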

Variables

기준연월
Real number (ℝ)

Distinct: 79
Distinct (%): 6.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 202025.71
Minimum: 201706
Maximum: 202312
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.2 KiB

Quantile statistics

Minimum: 201706
5-th percentile: 201709.15
Q1: 201901
Median: 202009
Q3: 202205
95-th percentile: 202308.85
Maximum: 202312
Range: 606
Interquartile range (IQR): 304
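
기준연월 is a YYYYMM code that the report treats as an ordinary number, so values such as the range (606) and the fractional percentiles describe the integer encoding rather than elapsed time. The sketch below rebuilds the column under the assumption that every month from 201706 to 202312 appears exactly 16 times (79 months × 16 = 1264 rows) and applies linear-interpolation percentiles. The helper names `month_range` and `percentile` are hypothetical.

```python
def month_range(start, end):
    """Yield YYYYMM integers from start to end, inclusive."""
    year, month = divmod(start, 100)
    while year * 100 + month <= end:
        yield year * 100 + month
        month += 1
        if month > 12:
            year, month = year + 1, 1


def percentile(sorted_values, q):
    """Percentile with linear interpolation between the two nearest ranks."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


months = list(month_range(201706, 202312))
# Assumption: each month occurs 16 times (once per region).
values = sorted(m for m in months for _ in range(16))

print(len(months), len(values))
for q in (0.05, 0.25, 0.50, 0.75, 0.95):
    print(q, round(percentile(values, q), 2))
```

For date arithmetic it is usually safer to parse such codes into year and month first, as `month_range` does with `divmod`.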

Descriptive statistics

Standard deviation: 190.71994
Coefficient of variation (CV): 0.00094403799
Kurtosis: -1.1921494
Mean: 202025.71
Median Absolute Deviation (MAD): 195
Skewness: -0.047979596
Sum: 2.553605 × 10^8
Variance: 36374.097
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
201706 | 16 | 1.3%
201707 | 16 | 1.3%
202204 | 16 | 1.3%
202203 | 16 | 1.3%
202202 | 16 | 1.3%
202201 | 16 | 1.3%
202112 | 16 | 1.3%
202111 | 16 | 1.3%
202110 | 16 | 1.3%
202109 | 16 | 1.3%
Other values (69) | 1104 | 87.3%

Minimum 10 values
Value | Count | Frequency (%)
201706 | 16 | 1.3%
201707 | 16 | 1.3%
201708 | 16 | 1.3%
201709 | 16 | 1.3%
201710 | 16 | 1.3%
201711 | 16 | 1.3%
201712 | 16 | 1.3%
201801 | 16 | 1.3%
201802 | 16 | 1.3%
201803 | 16 | 1.3%

Maximum 10 values
Value | Count | Frequency (%)
202312 | 16 | 1.3%
202311 | 16 | 1.3%
202310 | 16 | 1.3%
202309 | 16 | 1.3%
202308 | 16 | 1.3%
202307 | 16 | 1.3%
202306 | 16 | 1.3%
202305 | 16 | 1.3%
202304 | 16 | 1.3%
202303 | 16 | 1.3%

시군구명
Categorical

Distinct: 16
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 10.0 KiB

충남: 79
천안시: 79
공주시: 79
보령시: 79
아산시: 79
Other values (11): 869

Length

Max length: 3
Median length: 3
Mean length: 2.9375
Min length: 2
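
The mean length of 2.9375 follows from the 16 region names that appear in this document: 충남 has two characters and the other fifteen names have three. Because each name occurs 79 times, the row-weighted mean equals the mean over distinct names. A small sketch, assuming length is counted in Unicode code points (Python's `len`):

```python
# The 16 distinct 시군구명 values that appear in this report.
regions = [
    "충남", "천안시", "공주시", "보령시", "아산시", "서산시", "논산시", "계룡시",
    "당진시", "금산군", "부여군", "서천군", "청양군", "홍성군", "예산군", "태안군",
]

lengths = [len(name) for name in regions]
mean_length = sum(lengths) / len(lengths)

print(min(lengths), max(lengths), mean_length)
```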

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 충남
2nd row: 충남
3rd row: 충남
4th row: 충남
5th row: 충남

Common Values

Value | Count | Frequency (%)
충남 | 79 | 6.2%
천안시 | 79 | 6.2%
공주시 | 79 | 6.2%
보령시 | 79 | 6.2%
아산시 | 79 | 6.2%
서산시 | 79 | 6.2%
논산시 | 79 | 6.2%
계룡시 | 79 | 6.2%
당진시 | 79 | 6.2%
금산군 | 79 | 6.2%
Other values (6) | 474 | 37.5%

Length

[Figure: histogram of lengths of the category]

Words

Value | Count | Frequency (%)
충남 | 79 | 6.2%
천안시 | 79 | 6.2%
공주시 | 79 | 6.2%
보령시 | 79 | 6.2%
아산시 | 79 | 6.2%
서산시 | 79 | 6.2%
논산시 | 79 | 6.2%
계룡시 | 79 | 6.2%
당진시 | 79 | 6.2%
금산군 | 79 | 6.2%
Other values (6) | 474 | 37.5%

신규기업수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 156
Distinct (%): 12.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 34.240506
Minimum: 0
Maximum: 1499
Zeros: 374
Zeros (%): 29.6%
Negative: 0
Negative (%): 0.0%
Memory size: 11.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 2
Q3: 16
95-th percentile: 136.1
Maximum: 1499
Range: 1499
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 135.26375
Coefficient of variation (CV): 3.9504016
Kurtosis: 60.150129
Mean: 34.240506
Median Absolute Deviation (MAD): 2
Skewness: 7.2709205
Sum: 43280
Variance: 18296.283
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
0 | 374 | 29.6%
1 | 159 | 12.6%
2 | 105 | 8.3%
3 | 63 | 5.0%
5 | 51 | 4.0%
4 | 40 | 3.2%
7 | 31 | 2.5%
8 | 29 | 2.3%
6 | 26 | 2.1%
21 | 13 | 1.0%
Other values (146) | 373 | 29.5%

Minimum 10 values
Value | Count | Frequency (%)
0 | 374 | 29.6%
1 | 159 | 12.6%
2 | 105 | 8.3%
3 | 63 | 5.0%
4 | 40 | 3.2%
5 | 51 | 4.0%
6 | 26 | 2.1%
7 | 31 | 2.5%
8 | 29 | 2.3%
9 | 11 | 0.9%

Maximum 10 values
Value | Count | Frequency (%)
1499 | 1 | 0.1%
1452 | 1 | 0.1%
1379 | 1 | 0.1%
1329 | 1 | 0.1%
1287 | 1 | 0.1%
1237 | 1 | 0.1%
1163 | 1 | 0.1%
1114 | 1 | 0.1%
1046 | 1 | 0.1%
976 | 1 | 0.1%
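
Several of the summary figures for 신규기업수 are derived from one another, which makes for a quick consistency check. The sketch below uses only values reported for this variable: the mean is the sum divided by the row count, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation.

```python
# Figures reported for 신규기업수 in this document.
n_observations = 1264
total = 43280
zeros = 374
std = 135.26375

mean = total / n_observations          # reported: 34.240506
cv = std / mean                        # reported: 3.9504016
variance = std ** 2                    # reported: 18296.283
zero_share = zeros / n_observations    # reported: 29.6%

print(f"mean={mean:.6f} cv={cv:.7f} variance={variance:.3f} zeros={zero_share:.1%}")
```

A CV near 4 together with a median of 2 points to a heavily right-skewed column in which a few very large values dominate the mean.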

수도권 이전기업 수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 14
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.66455696
Minimum: 0
Maximum: 21
Zeros: 1033
Zeros (%): 81.7%
Negative: 0
Negative (%): 0.0%
Memory size: 11.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 3
Maximum: 21
Range: 21
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 2.5899147
Coefficient of variation (CV): 3.897205
Kurtosis: 35.212517
Mean: 0.66455696
Median Absolute Deviation (MAD): 0
Skewness: 5.7534242
Sum: 840
Variance: 6.7076581
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=14)]

Common values
Value | Count | Frequency (%)
0 | 1033 | 81.7%
1 | 123 | 9.7%
2 | 38 | 3.0%
3 | 26 | 2.1%
11 | 17 | 1.3%
18 | 6 | 0.5%
5 | 4 | 0.3%
4 | 4 | 0.3%
20 | 4 | 0.3%
17 | 3 | 0.2%
Other values (4) | 6 | 0.5%

Minimum 10 values
Value | Count | Frequency (%)
0 | 1033 | 81.7%
1 | 123 | 9.7%
2 | 38 | 3.0%
3 | 26 | 2.1%
4 | 4 | 0.3%
5 | 4 | 0.3%
6 | 1 | 0.1%
11 | 17 | 1.3%
13 | 1 | 0.1%
17 | 3 | 0.2%

Maximum 10 values
Value | Count | Frequency (%)
21 | 3 | 0.2%
20 | 4 | 0.3%
19 | 1 | 0.1%
18 | 6 | 0.5%
17 | 3 | 0.2%
13 | 1 | 0.1%
11 | 17 | 1.3%
6 | 1 | 0.1%
5 | 4 | 0.3%
4 | 4 | 0.3%
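
Because 수도권 이전기업 수 has only 14 distinct values, the value tables for this variable together list its complete frequency distribution. The sketch below recomputes the sum, mean, variance, and standard deviation from those counts. It assumes the reported variance is the sample variance (divisor n - 1), which is the choice that reproduces the reported figure.

```python
import math

# (value, count) pairs collected from the value tables for 수도권 이전기업 수.
frequency = [
    (0, 1033), (1, 123), (2, 38), (3, 26), (4, 4), (5, 4), (6, 1),
    (11, 17), (13, 1), (17, 3), (18, 6), (19, 1), (20, 4), (21, 3),
]

n = sum(count for _, count in frequency)
total = sum(value * count for value, count in frequency)
mean = total / n

# Assumption: sample variance with divisor n - 1.
squared_deviations = sum(count * (value - mean) ** 2 for value, count in frequency)
variance = squared_deviations / (n - 1)
std = math.sqrt(variance)

print(n, total, round(mean, 8), round(variance, 7), round(std, 7))
```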

유치기업의 고용자수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 436
Distinct (%): 34.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 702.96994
Minimum: 0
Maximum: 29123
Zeros: 373
Zeros (%): 29.5%
Negative: 0
Negative (%): 0.0%
Memory size: 11.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 28.5
Q3: 254
95-th percentile: 2750.8
Maximum: 29123
Range: 29123
Interquartile range (IQR): 254

Descriptive statistics

Standard deviation: 2831.2655
Coefficient of variation (CV): 4.0275769
Kurtosis: 52.103407
Mean: 702.96994
Median Absolute Deviation (MAD): 28.5
Skewness: 6.8299874
Sum: 888554
Variance: 8016064.2
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
0 | 373 | 29.5%
3 | 27 | 2.1%
148 | 17 | 1.3%
5 | 16 | 1.3%
10 | 15 | 1.2%
8 | 14 | 1.1%
4 | 13 | 1.0%
1219 | 12 | 0.9%
15 | 11 | 0.9%
9 | 11 | 0.9%
Other values (426) | 755 | 59.7%

Minimum 10 values
Value | Count | Frequency (%)
0 | 373 | 29.5%
1 | 4 | 0.3%
2 | 11 | 0.9%
3 | 27 | 2.1%
4 | 13 | 1.0%
5 | 16 | 1.3%
6 | 11 | 0.9%
7 | 7 | 0.6%
8 | 14 | 1.1%
9 | 11 | 0.9%

Maximum 10 values
Value | Count | Frequency (%)
29123 | 1 | 0.1%
28423 | 1 | 0.1%
27656 | 1 | 0.1%
26688 | 1 | 0.1%
26157 | 1 | 0.1%
25493 | 1 | 0.1%
24049 | 1 | 0.1%
21285 | 1 | 0.1%
20076 | 1 | 0.1%
19803 | 1 | 0.1%

유치기업의 투자금액(억원)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 507
Distinct (%): 40.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1557.0997
Minimum: 0
Maximum: 69247
Zeros: 400
Zeros (%): 31.6%
Negative: 0
Negative (%): 0.0%
Memory size: 11.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 35
Q3: 478.25
95-th percentile: 7160.4
Maximum: 69247
Range: 69247
Interquartile range (IQR): 478.25

Descriptive statistics

Standard deviation: 6316.5992
Coefficient of variation (CV): 4.0566441
Kurtosis: 58.913445
Mean: 1557.0997
Median Absolute Deviation (MAD): 35
Skewness: 7.1401601
Sum: 1968174
Variance: 39899426
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
0 | 400 | 31.6%
2 | 34 | 2.7%
1 | 23 | 1.8%
184 | 17 | 1.3%
10 | 14 | 1.1%
5 | 14 | 1.1%
3 | 14 | 1.1%
2854 | 12 | 0.9%
7 | 11 | 0.9%
11 | 9 | 0.7%
Other values (497) | 716 | 56.6%

Minimum 10 values
Value | Count | Frequency (%)
0 | 400 | 31.6%
1 | 23 | 1.8%
2 | 34 | 2.7%
3 | 14 | 1.1%
4 | 9 | 0.7%
5 | 14 | 1.1%
6 | 7 | 0.6%
7 | 11 | 0.9%
8 | 6 | 0.5%
9 | 4 | 0.3%

Maximum 10 values
Value | Count | Frequency (%)
69247 | 1 | 0.1%
67635 | 1 | 0.1%
65904 | 1 | 0.1%
62100 | 1 | 0.1%
60740 | 1 | 0.1%
56306 | 1 | 0.1%
53252 | 1 | 0.1%
51573 | 1 | 0.1%
49582 | 1 | 0.1%
42954 | 1 | 0.1%

승인상태(W : 대기 S : 승인)
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 10.0 KiB

S: 1248
W: 16

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: S
2nd row: S
3rd row: S
4th row: S
5th row: S

Common Values

Value | Count | Frequency (%)
S | 1248 | 98.7%
W | 16 | 1.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Words

Value | Count | Frequency (%)
s | 1248 | 98.7%
w | 16 | 1.3%

Interactions

[Figures: 25 pairwise interaction plots between variables]

Correlations

[Figure: correlation heatmap]
Variable | 기준연월 | 시군구명 | 신규기업수 | 수도권 이전기업 수 | 유치기업의 고용자수 | 유치기업의 투자금액(억원) | 승인상태(W : 대기 S : 승인)
기준연월 | 1.000 | 0.000 | 0.300 | 0.224 | 0.287 | 0.257 | 0.343
시군구명 | 0.000 | 1.000 | 0.443 | 0.553 | 0.452 | 0.414 | 0.000
신규기업수 | 0.300 | 0.443 | 1.000 | 0.817 | 0.980 | 0.983 | 0.215
수도권 이전기업 수 | 0.224 | 0.553 | 0.817 | 1.000 | 0.790 | 0.774 | 0.067
유치기업의 고용자수 | 0.287 | 0.452 | 0.980 | 0.790 | 1.000 | 0.971 | 0.223
유치기업의 투자금액(억원) | 0.257 | 0.414 | 0.983 | 0.774 | 0.971 | 1.000 | 0.195
승인상태(W : 대기 S : 승인) | 0.343 | 0.000 | 0.215 | 0.067 | 0.223 | 0.195 | 1.000

[Figure: correlation heatmap]
Variable | 승인상태(W : 대기 S : 승인) | 시군구명
승인상태(W : 대기 S : 승인) | 1.000 | 0.000
시군구명 | 0.000 | 1.000

[Figure: correlation heatmap]
Variable | 기준연월 | 신규기업수 | 수도권 이전기업 수 | 유치기업의 고용자수 | 유치기업의 투자금액(억원) | 시군구명 | 승인상태(W : 대기 S : 승인)
기준연월 | 1.000 | 0.322 | 0.234 | 0.319 | 0.361 | 0.000 | 0.257
신규기업수 | 0.322 | 1.000 | 0.472 | 0.968 | 0.935 | 0.190 | 0.164
수도권 이전기업 수 | 0.234 | 0.472 | 1.000 | 0.507 | 0.523 | 0.291 | 0.072
유치기업의 고용자수 | 0.319 | 0.968 | 0.507 | 1.000 | 0.963 | 0.194 | 0.170
유치기업의 투자금액(억원) | 0.361 | 0.935 | 0.523 | 0.963 | 1.000 | 0.175 | 0.149
시군구명 | 0.000 | 0.190 | 0.291 | 0.194 | 0.175 | 1.000 | 0.000
승인상태(W : 대기 S : 승인) | 0.257 | 0.164 | 0.072 | 0.170 | 0.149 | 0.000 | 1.000
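
The four high-correlation alerts near the top of the report line up with the last correlation matrix above when a cutoff of 0.5 is applied. The sketch below assumes that cutoff (the constant `HIGH_CORRELATION_THRESHOLD` and the helper `highly_correlated` are hypothetical names) and lists, for each variable, the other variables at or above it.

```python
HIGH_CORRELATION_THRESHOLD = 0.5  # assumption: cutoff that reproduces the alerts

columns = [
    "기준연월", "신규기업수", "수도권 이전기업 수", "유치기업의 고용자수",
    "유치기업의 투자금액(억원)", "시군구명", "승인상태(W : 대기 S : 승인)",
]

# Values copied from the last correlation matrix in this section.
matrix = [
    [1.000, 0.322, 0.234, 0.319, 0.361, 0.000, 0.257],
    [0.322, 1.000, 0.472, 0.968, 0.935, 0.190, 0.164],
    [0.234, 0.472, 1.000, 0.507, 0.523, 0.291, 0.072],
    [0.319, 0.968, 0.507, 1.000, 0.963, 0.194, 0.170],
    [0.361, 0.935, 0.523, 0.963, 1.000, 0.175, 0.149],
    [0.000, 0.190, 0.291, 0.194, 0.175, 1.000, 0.000],
    [0.257, 0.164, 0.072, 0.170, 0.149, 0.000, 1.000],
]


def highly_correlated(names, values, threshold):
    """Map each variable to the other variables at or above the threshold."""
    result = {}
    for i, name in enumerate(names):
        partners = [
            names[j]
            for j, value in enumerate(values[i])
            if j != i and abs(value) >= threshold
        ]
        if partners:
            result[name] = partners
    return result


alerts = highly_correlated(columns, matrix, HIGH_CORRELATION_THRESHOLD)
for name, partners in alerts.items():
    print(f"{name} -> {', '.join(partners)}")
```

With this cutoff, 신규기업수 and 수도권 이전기업 수 each have two partners and the two 유치기업 columns have three, matching the "and 1 other field" and "and 2 other fields" wording of the alerts.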

Missing values

[Figure: nullity bar chart] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.

Sample

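First rows
Index | 기준연월 | 시군구명 | 신규기업수, 수도권 이전기업 수, 유치기업의 고용자수, 유치기업의 투자금액(억원) (digits run together) | 승인상태(W : 대기 S : 승인)
0 | 201706 | 충남 | 5531980315209 | S
1 | 201707 | 충남 | 56113241678 | S
2 | 201708 | 충남 | 651920761 | S
3 | 201709 | 충남 | 703726505 | S
4 | 201710 | 충남 | 710818770 | S
5 | 201711 | 충남 | 69011871405 | S
6 | 201712 | 충남 | 68315541538 | S
7 | 201801 | 충남 | 5036892830 | S
8 | 201802 | 충남 | 550719902 | S
9 | 201803 | 충남 | 5728231220 | S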
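
Last rows
Index | 기준연월 | 시군구명 | 신규기업수, 수도권 이전기업 수, 유치기업의 고용자수, 유치기업의 투자금액(억원) (digits run together) | 승인상태(W : 대기 S : 승인)
1254 | 202312 | 논산시 | 170241453 | W
1255 | 202312 | 계룡시 | 50148184 | W
1256 | 202312 | 당진시 | 2681143139588 | W
1257 | 202312 | 금산군 | 950540687 | W
1258 | 202312 | 부여군 | 110177145 | W
1259 | 202312 | 서천군 | 9112342914 | W
1260 | 202312 | 청양군 | 1032 | W
1261 | 202312 | 홍성군 | 5145461899 | W
1262 | 202312 | 예산군 | 4917262233 | W
1263 | 202312 | 태안군 | 201911 | W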