Overview

Dataset statistics

Number of variables: 7
Number of observations: 38
Missing cells: 25
Missing cells (%): 9.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.4 KiB
Average record size in memory: 64.5 B
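The missing-cell percentage follows from the three counts above: missing cells divided by all cells (observations times variables). The sketch below reproduces that arithmetic; the function name `missing_cell_share` is hypothetical and is not part of ydata-profiling.

```python
def missing_cell_share(n_missing: int, n_rows: int, n_cols: int) -> float:
    """Return missing cells as a percentage of all cells in the table."""
    total_cells = n_rows * n_cols
    if total_cells == 0:
        raise ValueError("dataset has no cells")
    return 100.0 * n_missing / total_cells


# Counts copied from the dataset statistics in this report.
share = missing_cell_share(n_missing=25, n_rows=38, n_cols=7)
print(f"{share:.1f}%")
```

The 25 missing cells are consistent with the alerts in this report: five numeric variables with 5 missing values each.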

Variable types

Text: 1
Categorical: 1
Numeric: 5

Dataset

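Description: Provides monthly industrial-trend information for national industrial complexes, covering employment status by industrial complex, current-month changes in employment, and changes relative to the previous month.
Author: 한국산업단지공단 (Korea Industrial Complex Corporation)
URL: https://www.data.go.kr/data/15085890/fileData.do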

Alerts

구분 has constant value "국가" (Constant)
당월_계(명) is highly overall correlated with 당월_남(명) and 2 other fields (High correlation)
당월_남(명) is highly overall correlated with 당월_계(명) and 2 other fields (High correlation)
당월_여(명) is highly overall correlated with 당월_계(명) and 2 other fields (High correlation)
전월(명) is highly overall correlated with 당월_계(명) and 2 other fields (High correlation)
당월_계(명) has 5 (13.2%) missing values (Missing)
당월_남(명) has 5 (13.2%) missing values (Missing)
당월_여(명) has 5 (13.2%) missing values (Missing)
전월(명) has 5 (13.2%) missing values (Missing)
전월대비(퍼센트) has 5 (13.2%) missing values (Missing)
산업단지 has unique values (Unique)
전월대비(퍼센트) has 2 (5.3%) zeros (Zeros)

Reproduction

Analysis started: 2024-04-06 08:21:27.367012
Analysis finished: 2024-04-06 08:21:34.597218
Duration: 7.23 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
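The reported duration is the difference between the two timestamps above. A minimal check using only the Python standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of this report.
started = datetime.fromisoformat("2024-04-06 08:21:27.367012")
finished = datetime.fromisoformat("2024-04-06 08:21:34.597218")

# total_seconds() keeps the microsecond part of the difference.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```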

Variables

산업단지
Text

UNIQUE 

Distinct: 38
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 436.0 B

Length

Max length: 11
Median length: 2
Mean length: 3.2631579
Min length: 2

Characters and Unicode

Total characters: 124
Distinct characters: 79
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
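As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. This is not the profiler's own code, and the helper name `category_counts` is hypothetical. The two-letter codes correspond to the category names used in this report: `Lo` is Other Letter, `Lu` is Uppercase Letter, `Ps` is Open Punctuation, `Pe` is Close Punctuation, and `Nd` is Decimal Number.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Three 산업단지 values that appear in this report.
names = ["서울", "구미(외)", "군산2"]
counts = category_counts(names)
for code, count in counts.most_common():
    print(code, count)
```

Hangul syllables fall under `Lo`, which is why Other Letter dominates the category counts for this variable.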

Unique

Unique: 38
Unique (%): 100.0%

Sample

1st row: 서울
2nd row: 녹산
3rd row: 대구
4th row: 남동
5th row: 부평
Value                  Count  Frequency (%)
서울                   1      2.6%
대불(외                1      2.6%
진해                   1      2.6%
국가식품클러스터(외    1      2.6%
군산                   1      2.6%
군산2                  1      2.6%
익산                   1      2.6%
광양                   1      2.6%
대불                   1      2.6%
구미                   1      2.6%
Other values (28)      28     73.7%

Most occurring characters

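Value              Count  Frequency (%)
[not preserved]    7      5.6%
[not preserved]    4      3.2%
[not preserved]    3      2.4%
[not preserved]    3      2.4%
[not preserved]    3      2.4%
[not preserved]    3      2.4%
)                  3      2.4%
[not preserved]    3      2.4%
(                  3      2.4%
[not preserved]    3      2.4%
Other values (69)  89     71.8%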

Most occurring categories

Value              Count  Frequency (%)
Other Letter       114    91.9%
Close Punctuation  3      2.4%
Open Punctuation   3      2.4%
Uppercase Letter   3      2.4%
Decimal Number     1      0.8%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
[not preserved]    7      6.1%
[not preserved]    4      3.5%
[not preserved]    3      2.6%
[not preserved]    3      2.6%
[not preserved]    3      2.6%
[not preserved]    3      2.6%
[not preserved]    3      2.6%
[not preserved]    3      2.6%
[not preserved]    2      1.8%
[not preserved]    2      1.8%
Other values (63)  81     71.1%

Uppercase Letter
Value  Count  Frequency (%)
M      1      33.3%
T      1      33.3%
V      1      33.3%

Close Punctuation
Value  Count  Frequency (%)
)      3      100.0%

Open Punctuation
Value  Count  Frequency (%)
(      3      100.0%

Decimal Number
Value  Count  Frequency (%)
2      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  114    91.9%
Common  7      5.6%
Latin   3      2.4%

Most frequent character per script

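Hangul
Value              Count  Frequency (%)
[not preserved]    7      6.1%
[not preserved]    4      3.5%
[not preserved]    3      2.6%
[not preserved]    3      2.6%
[not preserved]    3      2.6%
[not preserved]    3      2.6%
[not preserved]    3      2.6%
[not preserved]    3      2.6%
[not preserved]    2      1.8%
[not preserved]    2      1.8%
Other values (63)  81     71.1%

Common
Value  Count  Frequency (%)
)      3      42.9%
(      3      42.9%
2      1      14.3%

Latin
Value  Count  Frequency (%)
M      1      33.3%
T      1      33.3%
V      1      33.3%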

Most occurring blocks

Value        Count  Frequency (%)
Hangul       113    91.1%
ASCII        10     8.1%
Compat Jamo  1      0.8%

Most frequent character per block

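Hangul
Value              Count  Frequency (%)
[not preserved]    7      6.2%
[not preserved]    4      3.5%
[not preserved]    3      2.7%
[not preserved]    3      2.7%
[not preserved]    3      2.7%
[not preserved]    3      2.7%
[not preserved]    3      2.7%
[not preserved]    3      2.7%
[not preserved]    2      1.8%
[not preserved]    2      1.8%
Other values (62)  80     70.8%

ASCII
Value  Count  Frequency (%)
)      3      30.0%
(      3      30.0%
2      1      10.0%
M      1      10.0%
T      1      10.0%
V      1      10.0%

Compat Jamo
Value              Count  Frequency (%)
[not preserved]    1      100.0%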

구분
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 436.0 B
국가: 38

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 국가
2nd row: 국가
3rd row: 국가
4th row: 국가
5th row: 국가

Common Values

Value  Count  Frequency (%)
국가   38     100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

Value  Count  Frequency (%)
국가   38     100.0%

당월_계(명)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 33
Distinct (%): 100.0%
Missing: 5
Missing (%): 13.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 29341.03
Minimum: 48
Maximum: 140684
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 474.0 B

Quantile statistics

Minimum: 48
5-th percentile: 483.2
Q1: 1476
Median: 10704
Q3: 24907
95-th percentile: 122802.4
Maximum: 140684
Range: 140636
Interquartile range (IQR): 23431

Descriptive statistics

Standard deviation: 43495.223
Coefficient of variation (CV): 1.4824027
Kurtosis: 0.96768156
Mean: 29341.03
Median Absolute Deviation (MAD): 9352
Skewness: 1.5700746
Sum: 968254
Variance: 1.8918344 × 10^9
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=33)]
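Several of the figures reported for 당월_계(명) are derived from others: the mean is the sum over the non-missing rows, the range and IQR are differences of quantiles, the CV is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below recomputes them from the reported values as a consistency check. It is not the profiler's implementation, and the variable names are illustrative.

```python
# Figures copied from the 당월_계(명) summary in this report.
n_rows, n_missing = 38, 5
total = 968254
minimum, maximum = 48, 140684
q1, q3 = 1476, 24907
std = 43495.223

n_valid = n_rows - n_missing       # rows that actually carry a value
mean = total / n_valid             # mean over non-missing rows
value_range = maximum - minimum    # Range
iqr = q3 - q1                      # Interquartile range
cv = std / mean                    # Coefficient of variation
variance = std ** 2                # Variance

print(f"n_valid={n_valid} mean={mean:.2f} range={value_range} iqr={iqr}")
print(f"cv={cv:.7f} variance={variance:.7e}")
```

A CV above 1 means the spread is larger than the mean, which matches the wide gap between the median (10704) and the maximum (140684).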
Common values

Value              Count  Frequency (%)
1082               1      2.6%
1119               1      2.6%
4455               1      2.6%
5731               1      2.6%
3237               1      2.6%
14218              1      2.6%
5666               1      2.6%
24907              1      2.6%
28266              1      2.6%
80182              1      2.6%
Other values (23)  23     60.5%
(Missing)          5      13.2%

Minimum 10 values

Value  Count  Frequency (%)
48     1      2.6%
266    1      2.6%
628    1      2.6%
878    1      2.6%
952    1      2.6%
1082   1      2.6%
1119   1      2.6%
1352   1      2.6%
1476   1      2.6%
3237   1      2.6%

Maximum 10 values

Value   Count  Frequency (%)
140684  1      2.6%
129163  1      2.6%
118562  1      2.6%
110455  1      2.6%
96964   1      2.6%
84410   1      2.6%
80182   1      2.6%
28266   1      2.6%
24907   1      2.6%
18889   1      2.6%

당월_남(명)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 33
Distinct (%): 100.0%
Missing: 5
Missing (%): 13.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 23525.939
Minimum: 40
Maximum: 101976
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 474.0 B

Quantile statistics

Minimum: 40
5-th percentile: 305.6
Q1: 1325
Median: 6951
Q3: 21980
95-th percentile: 99818.4
Maximum: 101976
Range: 101936
Interquartile range (IQR): 20655

Descriptive statistics

Standard deviation: 34565.57
Coefficient of variation (CV): 1.4692535
Kurtosis: 0.72245927
Mean: 23525.939
Median Absolute Deviation (MAD): 6161
Skewness: 1.5243209
Sum: 776356
Variance: 1.1947786 × 10^9
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=33)]
Common values

Value              Count  Frequency (%)
910                1      2.6%
614                1      2.6%
3979               1      2.6%
4878               1      2.6%
2278               1      2.6%
13429              1      2.6%
4539               1      2.6%
23020              1      2.6%
21980              1      2.6%
64665              1      2.6%
Other values (23)  23     60.5%
(Missing)          5      13.2%

Minimum 10 values

Value  Count  Frequency (%)
40     1      2.6%
194    1      2.6%
380    1      2.6%
614    1      2.6%
790    1      2.6%
890    1      2.6%
910    1      2.6%
1188   1      2.6%
1325   1      2.6%
2278   1      2.6%

Maximum 10 values

Value   Count  Frequency (%)
101976  1      2.6%
101502  1      2.6%
98696   1      2.6%
90222   1      2.6%
87929   1      2.6%
64665   1      2.6%
61382   1      2.6%
23020   1      2.6%
21980   1      2.6%
13544   1      2.6%

당월_여(명)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 33
Distinct (%): 100.0%
Missing: 5
Missing (%): 13.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 5815.0909
Minimum: 8
Maximum: 41988
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 474.0 B

Quantile statistics

Minimum: 8
5-th percentile: 68
Q1: 475
Median: 1126
Q3: 6286
95-th percentile: 24691.6
Maximum: 41988
Range: 41980
Interquartile range (IQR): 5811

Descriptive statistics

Standard deviation: 9904.3453
Coefficient of variation (CV): 1.7032142
Kurtosis: 5.0306423
Mean: 5815.0909
Median Absolute Deviation (MAD): 975
Skewness: 2.2670518
Sum: 191898
Variance: 98096055
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=33)]
Common values

Value              Count  Frequency (%)
172                1      2.6%
505                1      2.6%
476                1      2.6%
853                1      2.6%
959                1      2.6%
789                1      2.6%
1127               1      2.6%
1887               1      2.6%
6286               1      2.6%
15517              1      2.6%
Other values (23)  23     60.5%
(Missing)          5      13.2%

Minimum 10 values

Value  Count  Frequency (%)
8      1      2.6%
62     1      2.6%
72     1      2.6%
88     1      2.6%
151    1      2.6%
164    1      2.6%
172    1      2.6%
248    1      2.6%
475    1      2.6%
476    1      2.6%

Maximum 10 values

Value  Count  Frequency (%)
41988  1      2.6%
27187  1      2.6%
23028  1      2.6%
22526  1      2.6%
17060  1      2.6%
15517  1      2.6%
6742   1      2.6%
6558   1      2.6%
6286   1      2.6%
3983   1      2.6%

전월(명)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 33
Distinct (%): 100.0%
Missing: 5
Missing (%): 13.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 29313.424
Minimum: 48
Maximum: 140463
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 474.0 B

Quantile statistics

Minimum: 48
5-th percentile: 470.6
Q1: 1485
Median: 10780
Q3: 24890
95-th percentile: 122711.8
Maximum: 140463
Range: 140415
Interquartile range (IQR): 23405

Descriptive statistics

Standard deviation: 43458.655
Coefficient of variation (CV): 1.4825513
Kurtosis: 0.96475876
Mean: 29313.424
Median Absolute Deviation (MAD): 9426
Skewness: 1.5692596
Sum: 967343
Variance: 1.8886547 × 10^9
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=33)]
Common values

Value              Count  Frequency (%)
1081               1      2.6%
1099               1      2.6%
4457               1      2.6%
5714               1      2.6%
3236               1      2.6%
14217              1      2.6%
5283               1      2.6%
24890              1      2.6%
28297              1      2.6%
79908              1      2.6%
Other values (23)  23     60.5%
(Missing)          5      13.2%

Minimum 10 values

Value  Count  Frequency (%)
48     1      2.6%
266    1      2.6%
607    1      2.6%
867    1      2.6%
937    1      2.6%
1081   1      2.6%
1099   1      2.6%
1354   1      2.6%
1485   1      2.6%
3236   1      2.6%

Maximum 10 values

Value   Count  Frequency (%)
140463  1      2.6%
129136  1      2.6%
118429  1      2.6%
110309  1      2.6%
96938   1      2.6%
84558   1      2.6%
79908   1      2.6%
28297   1      2.6%
24890   1      2.6%
18920   1      2.6%

전월대비(퍼센트)
Real number (ℝ)

MISSING  ZEROS 

Distinct: 32
Distinct (%): 97.0%
Missing: 5
Missing (%): 13.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.40967729
Minimum: -0.70500928
Maximum: 7.2496687
Zeros: 2
Zeros (%): 5.3%
Negative: 14
Negative (%): 36.8%
Memory size: 474.0 B

Quantile statistics

Minimum: -0.70500928
5-th percentile: -0.63843241
Q1: -0.12068387
Median: 0.007033833
Q3: 0.15733681
95-th percentile: 2.4757568
Maximum: 7.2496687
Range: 7.954678
Interquartile range (IQR): 0.27802068

Descriptive statistics

Standard deviation: 1.4609085
Coefficient of variation (CV): 3.5659984
Kurtosis: 15.626719
Mean: 0.40967729
Median Absolute Deviation (MAD): 0.14977292
Skewness: 3.7081626
Sum: 13.51935
Variance: 2.1342538
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=32)]
Common values

Value              Count  Frequency (%)
0.0                2      5.3%
7.249668749        1      2.6%
1.819836215        1      2.6%
-0.044873233       1      2.6%
0.297514876        1      2.6%
0.030902349        1      2.6%
0.007033833        1      2.6%
0.092506938        1      2.6%
-0.076849183       1      2.6%
0.068300522        1      2.6%
Other values (22)  22     57.9%
(Missing)          5      13.2%

Minimum 10 values

Value         Count  Frequency (%)
-0.705009276  1      2.6%
-0.686990125  1      2.6%
-0.606060606  1      2.6%
-0.175027792  1      2.6%
-0.16384778   1      2.6%
-0.163354206  1      2.6%
-0.147710487  1      2.6%
-0.142739088  1      2.6%
-0.120683875  1      2.6%
-0.116647225  1      2.6%

Maximum 10 values

Value        Count  Frequency (%)
7.249668749  1      2.6%
3.459637562  1      2.6%
1.819836215  1      2.6%
1.600853789  1      2.6%
1.268742791  1      2.6%
0.342894328  1      2.6%
0.297514876  1      2.6%
0.17281106   1      2.6%
0.157336808  1      2.6%
0.132355474  1      2.6%

Interactions

[Figures: 25 interaction plots]

Correlations

                  산업단지  당월_계(명)  당월_남(명)  당월_여(명)  전월(명)  전월대비(퍼센트)
산업단지          1.000     1.000        1.000        1.000        1.000     1.000
당월_계(명)       1.000     1.000        0.959        0.896        0.998     0.000
당월_남(명)       1.000     0.959        1.000        0.766        0.924     0.000
당월_여(명)       1.000     0.896        0.766        1.000        0.918     0.000
전월(명)          1.000     0.998        0.924        0.918        1.000     0.000
전월대비(퍼센트)  1.000     0.000        0.000        0.000        0.000     1.000

                  당월_계(명)  당월_남(명)  당월_여(명)  전월(명)  전월대비(퍼센트)
당월_계(명)       1.000        0.990        0.932        1.000     -0.108
당월_남(명)       0.990        1.000        0.894        0.990     -0.100
당월_여(명)       0.932        0.894        1.000        0.932     -0.082
전월(명)          1.000        0.990        0.932        1.000     -0.108
전월대비(퍼센트)  -0.108       -0.100       -0.082       -0.108    1.000
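The High correlation alerts in this report can be related back to the numeric part of the first matrix (the one that includes 산업단지). The sketch below lists, for each numeric variable, the other variables whose coefficient reaches a cut-off. The cut-off of 0.5 is an assumption, since the report does not state the value used, and `high_correlations` is a hypothetical helper, not a ydata-profiling function.

```python
THRESHOLD = 0.5  # assumed cut-off; not stated in this report

names = ["당월_계(명)", "당월_남(명)", "당월_여(명)", "전월(명)", "전월대비(퍼센트)"]

# Numeric rows and columns of the first correlation matrix in this report.
matrix = [
    [1.000, 0.959, 0.896, 0.998, 0.000],
    [0.959, 1.000, 0.766, 0.924, 0.000],
    [0.896, 0.766, 1.000, 0.918, 0.000],
    [0.998, 0.924, 0.918, 1.000, 0.000],
    [0.000, 0.000, 0.000, 0.000, 1.000],
]


def high_correlations(names, rows, threshold):
    """Map each variable to the other variables with |coefficient| >= threshold."""
    result = {}
    for i, name in enumerate(names):
        result[name] = [
            names[j]
            for j, value in enumerate(rows[i])
            if j != i and abs(value) >= threshold
        ]
    return result


flagged = high_correlations(names, matrix, THRESHOLD)
for name, others in flagged.items():
    print(name, "->", others)
```

Under this assumed cut-off, each of the four headcount variables is flagged against the other three, which agrees with the alerts ("... and 2 other fields"), while 전월대비(퍼센트) is flagged against none.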

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows

    산업단지      구분  당월_계(명)  당월_남(명)  당월_여(명)  전월(명)  전월대비(퍼센트)
0   서울          국가  140684       98696        41988        140463    0.157337
1   녹산          국가  28266        21980        6286         28297     -0.109552
2   대구          국가  4626         3630         996          4658      -0.68699
3   남동          국가  84410        61382        23028        84558     -0.175028
4   부평          국가  10704        6951         3753         10780     -0.705009
5   주안          국가  13292        9309         3983         13311     -0.142739
6   광주첨단      국가  18889        12331        6558         18920     -0.163848
7   빛그린        국가  1352         1188         164          1354      -0.14771
8   온산          국가  14670        13544        1126         14682     -0.081733
9   울산ㆍ미포    국가  96964        90222        6742         96938     0.026821

Last rows

    산업단지      구분  당월_계(명)  당월_남(명)  당월_여(명)  전월(명)  전월대비(퍼센트)
28  여수          국가  24907        23020        1887         24890     0.068301
29  구미          국가  80182        64665        15517        79908     0.342894
30  구미(외)      국가  3667         3137         530          3673      -0.163354
31  포항          국가  11988        11513        475          12002     -0.116647
32  포항블루밸리  국가  48           40           8            48        0.0
33  경남항공      국가  <NA>         <NA>         <NA>         <NA>      <NA>
34  밀양나노      국가  <NA>         <NA>         <NA>         <NA>      <NA>
35  안정          국가  952          890          62           937       1.600854
36  진해          국가  <NA>         <NA>         <NA>         <NA>      <NA>
37  창원          국가  118562       101502       17060        118429    0.112304
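In the sample rows, 전월대비(퍼센트) is consistent with the percentage change of 당월_계(명) relative to 전월(명), and 당월_계(명) equals 당월_남(명) plus 당월_여(명). The source does not state these formulas, so the sketch below is an inference from the sample values rather than the publisher's definition. The function name `month_over_month_pct` is hypothetical.

```python
from typing import Optional


def month_over_month_pct(
    current: Optional[float], previous: Optional[float]
) -> Optional[float]:
    """Percentage change from the previous month; None when it cannot be computed."""
    if current is None or previous is None or previous == 0:
        return None
    return (current - previous) / previous * 100.0


# (당월_계(명), 전월(명)) pairs copied from the sample rows; None stands for <NA>.
rows = {
    "서울": (140684, 140463),
    "녹산": (28266, 28297),
    "포항블루밸리": (48, 48),
    "경남항공": (None, None),
}

for name, (current, previous) in rows.items():
    pct = month_over_month_pct(current, previous)
    print(name, "NA" if pct is None else f"{pct:.6f}")
```

Returning `None` for missing inputs mirrors the `<NA>` rows, where all five numeric fields are absent together.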