Overview

Dataset statistics

Number of variables: 7
Number of observations: 337
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 19.5 KiB
Average record size in memory: 59.4 B

Variable types

Categorical: 4
Text: 2
Numeric: 1

Dataset

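Description: List of company-jobseeker employment matches made through the Corporate Workforce Support Center (기업인력애로센터) subsidy program, the Smart Manufacturing Enterprise Job Package Program (스마트제조기업 일자리 패키지사업). The program supports companies that have adopted, or plan to adopt, a smart factory with workforce matching, on-site training, and subsidies for labor costs and participation.
Author: Korea SMEs and Startups Agency (중소벤처기업진흥공단)
URL: https://www.data.go.kr/data/15100252/fileData.do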

Alerts

Imbalance: 업종 (industry sector) is highly imbalanced (89.5%)
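
The 89.5% figure can be reproduced from the class counts of 업종 listed later in this report. The sketch below assumes the score is one minus the normalized Shannon entropy of the class distribution; that definition is an assumption about how the profiler derives the alert, and the function name `imbalance_score` is illustrative only.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits.

    The score is 0 for a perfectly uniform distribution and approaches 1
    as a single class dominates. Assumed definition, not confirmed by the report.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Class counts of 업종 taken from this report's Common Values table.
industry_counts = [327, 5, 3, 1, 1]
score = imbalance_score(industry_counts)
print(f"{score:.1%}")  # prints 89.5%
```

Under this assumption, 327 of 337 rows falling into one of five classes is enough to push the score close to its maximum.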

Reproduction

Analysis started: 2024-03-23 05:45:47.261584
Analysis finished: 2024-03-23 05:45:57.963354
Duration: 10.7 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
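
As a small consistency check, the reported duration follows directly from the two timestamps. The snippet is a minimal sketch using only the standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of this report.
started = datetime.fromisoformat("2024-03-23 05:45:47.261584")
finished = datetime.fromisoformat("2024-03-23 05:45:57.963354")

duration = (finished - started).total_seconds()
print(f"{duration:.1f} seconds")  # prints 10.7 seconds
```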

Variables

참여년도 (participation year)
Categorical

Distinct: 2
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 KiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023
2nd row: 2023
3rd row: 2023
4th row: 2023
5th row: 2023

Common Values

Value | Count | Frequency (%)
2022 | 173 | 51.3%
2023 | 164 | 48.7%

Length

[Figure: histogram of lengths of the category]

기업명 (company name)
Text

Distinct: 301
Distinct (%): 89.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 KiB

Length

Max length: 16
Median length: 14
Mean length: 7.5964392
Min length: 2

Characters and Unicode

Total characters: 2560
Distinct characters: 275
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 265
Unique (%): 78.6%

Sample

1st row: 영일엔지니어링(주)
2nd row: (주)드림애드앤프린팅그룹
3rd row: 장한기술(주)
4th row: 스마트엠
5th row: 대신엠씨(주)

Words

Value | Count | Frequency (%)
가스켐테크놀로지(주 | 3 | 0.9%
아성플라스틱밸브(주 | 2 | 0.6%
코엠테크(주 | 2 | 0.6%
주)한백정밀 | 2 | 0.6%
미크론 | 2 | 0.6%
주)위드멤스 | 2 | 0.6%
주)화인트로 | 2 | 0.6%
주)아폴로산업 | 2 | 0.6%
한본인더스트리(주 | 2 | 0.6%
주)페이퍼팩 | 2 | 0.6%
Other values (293) | 320 | 93.8%

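Most occurring characters

Value | Count | Frequency (%)
[not recovered] | 276 | 10.8%
) | 272 | 10.6%
( | 272 | 10.6%
[not recovered] | 84 | 3.3%
[not recovered] | 82 | 3.2%
[not recovered] | 50 | 2.0%
[not recovered] | 48 | 1.9%
[not recovered] | 44 | 1.7%
[not recovered] | 44 | 1.7%
[not recovered] | 41 | 1.6%
Other values (265) | 1347 | 52.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2002 | 78.2%
Close Punctuation | 272 | 10.6%
Open Punctuation | 272 | 10.6%
Space Separator | 5 | 0.2%
Uppercase Letter | 4 | 0.2%
Control | 2 | 0.1%
Decimal Number | 2 | 0.1%
Other Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[not recovered] | 276 | 13.8%
[not recovered] | 84 | 4.2%
[not recovered] | 82 | 4.1%
[not recovered] | 50 | 2.5%
[not recovered] | 48 | 2.4%
[not recovered] | 44 | 2.2%
[not recovered] | 44 | 2.2%
[not recovered] | 41 | 2.0%
[not recovered] | 32 | 1.6%
[not recovered] | 32 | 1.6%
Other values (256) | 1269 | 63.4%

Uppercase Letter
Value | Count | Frequency (%)
C | 2 | 50.0%
K | 1 | 25.0%
H | 1 | 25.0%

Close Punctuation
Value | Count | Frequency (%)
) | 272 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 272 | 100.0%

Space Separator
Value | Count | Frequency (%)
[space] | 5 | 100.0%

Control
Value | Count | Frequency (%)
[control character] | 2 | 100.0%

Decimal Number
Value | Count | Frequency (%)
6 | 2 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
& | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2002 | 78.2%
Common | 554 | 21.6%
Latin | 4 | 0.2%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[not recovered] | 276 | 13.8%
[not recovered] | 84 | 4.2%
[not recovered] | 82 | 4.1%
[not recovered] | 50 | 2.5%
[not recovered] | 48 | 2.4%
[not recovered] | 44 | 2.2%
[not recovered] | 44 | 2.2%
[not recovered] | 41 | 2.0%
[not recovered] | 32 | 1.6%
[not recovered] | 32 | 1.6%
Other values (256) | 1269 | 63.4%

Common
Value | Count | Frequency (%)
) | 272 | 49.1%
( | 272 | 49.1%
[space] | 5 | 0.9%
[control character] | 2 | 0.4%
6 | 2 | 0.4%
& | 1 | 0.2%

Latin
Value | Count | Frequency (%)
C | 2 | 50.0%
K | 1 | 25.0%
H | 1 | 25.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2002 | 78.2%
ASCII | 558 | 21.8%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[not recovered] | 276 | 13.8%
[not recovered] | 84 | 4.2%
[not recovered] | 82 | 4.1%
[not recovered] | 50 | 2.5%
[not recovered] | 48 | 2.4%
[not recovered] | 44 | 2.2%
[not recovered] | 44 | 2.2%
[not recovered] | 41 | 2.0%
[not recovered] | 32 | 1.6%
[not recovered] | 32 | 1.6%
Other values (256) | 1269 | 63.4%

ASCII
Value | Count | Frequency (%)
) | 272 | 48.7%
( | 272 | 48.7%
[space] | 5 | 0.9%
C | 2 | 0.4%
[control character] | 2 | 0.4%
6 | 2 | 0.4%
K | 1 | 0.2%
H | 1 | 0.2%
& | 1 | 0.2%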

사업자번호 (business registration number)
Real number (ℝ)

Distinct: 286
Distinct (%): 84.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.4761642 × 10⁹
Minimum: 1.0288002 × 10⁹
Maximum: 8.9985003 × 10⁹
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.1 KiB

Quantile statistics

Minimum: 1.0288002 × 10⁹
5-th percentile: 1.1221497 × 10⁹
Q1: 1.3186552 × 10⁹
Median: 3.1286233 × 10⁹
Q3: 5.3486002 × 10⁹
95-th percentile: 6.4421011 × 10⁹
Maximum: 8.9985003 × 10⁹
Range: 7.9697001 × 10⁹
Interquartile range (IQR): 4.0299451 × 10⁹

Descriptive statistics

Standard deviation: 2.1493002 × 10⁹
Coefficient of variation (CV): 0.61829652
Kurtosis: -1.1834137
Mean: 3.4761642 × 10⁹
Median Absolute Deviation (MAD): 1.8701012 × 10⁹
Skewness: 0.3978048
Sum: 1.1714673 × 10¹²
Variance: 4.6194914 × 10¹⁸
Monotonicity: Not monotonic
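
Several of the figures above are derived from others, which makes the exponents easy to sanity-check. The sketch below recomputes the range, IQR, coefficient of variation, variance, and sum from the rounded values printed in this report; small differences in the last digit are expected because the inputs are already rounded.

```python
# Rounded values copied from the statistics of 사업자번호 in this report.
n_observations = 337
mean = 3.4761642e9
std = 2.1493002e9
minimum, maximum = 1.0288002e9, 8.9985003e9
q1, q3 = 1.3186552e9, 5.3486002e9

value_range = maximum - minimum   # reported: 7.9697001e9
iqr = q3 - q1                     # reported: 4.0299451e9
cv = std / mean                   # reported: 0.61829652
variance = std ** 2               # reported: 4.6194914e18
total = mean * n_observations     # reported: 1.1714673e12

for name, value in [("range", value_range), ("IQR", iqr), ("CV", cv),
                    ("variance", variance), ("sum", total)]:
    print(f"{name}: {value:.7e}")
```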

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
6078163163 | 2 | 0.6%
1308126673 | 2 | 0.6%
1308162965 | 2 | 0.6%
1308631832 | 2 | 0.6%
1308658485 | 2 | 0.6%
1048149624 | 2 | 0.6%
1318651001 | 2 | 0.6%
6092423242 | 2 | 0.6%
6088162643 | 2 | 0.6%
1348127048 | 2 | 0.6%
Other values (276) | 317 | 94.1%

Minimum 10 values

Value | Count | Frequency (%)
1028800167 | 1 | 0.3%
1048149624 | 2 | 0.6%
1058765032 | 1 | 0.3%
1068125254 | 2 | 0.6%
1071687880 | 2 | 0.6%
1078159849 | 1 | 0.3%
1078171936 | 1 | 0.3%
1078616618 | 1 | 0.3%
1088193646 | 2 | 0.6%
1088700305 | 2 | 0.6%

Maximum 10 values

Value | Count | Frequency (%)
8998500260 | 1 | 0.3%
8328500097 | 2 | 0.6%
8323200301 | 1 | 0.3%
8288801841 | 1 | 0.3%
8085200266 | 1 | 0.3%
7968101680 | 1 | 0.3%
7798600100 | 1 | 0.3%
7758700347 | 1 | 0.3%
7558600818 | 1 | 0.3%
7508701032 | 1 | 0.3%
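
Although the profiler typed 사업자번호 as a real number, every value shown is a 10-digit identifier, so statistics such as the mean or skewness describe the digits rather than any measured quantity. A minimal sketch of treating the values as identifiers is shown below. It assumes the conventional 3-2-5 hyphenated form for Korean business registration numbers, and `format_business_number` is a hypothetical helper, not part of the dataset or of ydata-profiling.

```python
def format_business_number(value: int) -> str:
    """Render a business registration number as XXX-XX-XXXXX.

    Hypothetical helper: zero-pads to 10 digits so that identifiers read
    back from a numeric column keep any leading zeros.
    """
    if not 0 <= value < 10**10:
        raise ValueError(f"expected a non-negative number of at most 10 digits, got {value}")
    digits = f"{value:010d}"
    return f"{digits[:3]}-{digits[3:5]}-{digits[5:]}"


# Smallest and largest values listed in this report.
print(format_business_number(1028800167))  # 102-88-00167
print(format_business_number(8998500260))  # 899-85-00260
```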

취업인원 (number of hires)
Categorical

Distinct: 4
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 0.3%

Sample

1st row: 2
2nd row: 3
3rd row: 2
4th row: 2
5th row: 3

Common Values

Value | Count | Frequency (%)
1 | 148 | 43.9%
3 | 95 | 28.2%
2 | 93 | 27.6%
4 | 1 | 0.3%
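
The frequency column is each count divided by the 337 observations, rounded to one decimal place. A short sketch, with illustrative variable names, that reproduces the percentages for 취업인원:

```python
# Counts copied from the Common Values table for 취업인원.
counts = {"1": 148, "3": 95, "2": 93, "4": 1}

total = sum(counts.values())
shares = {value: round(100 * n / total, 1) for value, n in counts.items()}

for value, share in shares.items():
    print(f"{value}: {counts[value]} rows, {share}%")
```

The counts add up to the full 337 rows, which is consistent with the report showing no missing values for this variable.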

Length

[Figure: histogram of lengths of the category]

지역구분 (region)
Categorical

Distinct: 17
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 대구
2nd row: 경기
3rd row: 충남
4th row: 서울
5th row: 인천

Common Values

Value | Count | Frequency (%)
경기 | 97 | 28.8%
경남 | 42 | 12.5%
충남 | 35 | 10.4%
경북 | 22 | 6.5%
서울 | 20 | 5.9%
부산 | 19 | 5.6%
충북 | 18 | 5.3%
광주 | 16 | 4.7%
인천 | 16 | 4.7%
대구 | 16 | 4.7%
Other values (7) | 36 | 10.7%

Length

[Figure: histogram of lengths of the category]

업종 (industry sector)
Categorical

IMBALANCE

Distinct: 5
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 KiB

Length

Max length: 16
Median length: 3
Mean length: 3.2403561
Min length: 3

Unique

Unique: 2
Unique (%): 0.6%

Sample

1st row: 제조업
2nd row: 제조업
3rd row: 제조업
4th row: 전문, 과학 및 기술 서비스업
5th row: 제조업

Common Values

Value | Count | Frequency (%)
제조업 | 327 | 97.0%
도매 및 소매업 | 5 | 1.5%
전문, 과학 및 기술 서비스업 | 3 | 0.9%
수리 및 기타 개인 서비스업 | 1 | 0.3%
운수 및 창고업 | 1 | 0.3%

Length

[Figure: histogram of lengths of the category]

Words

Value | Count | Frequency (%)
제조업 | 327 | 89.6%
및 | 10 | 2.7%
도매 | 5 | 1.4%
소매업 | 5 | 1.4%
서비스업 | 4 | 1.1%
전문 | 3 | 0.8%
과학 | 3 | 0.8%
기술 | 3 | 0.8%
수리 | 1 | 0.3%
기타 | 1 | 0.3%
Other values (3) | 3 | 0.8%

산업분류 (industry classification)
Text

Distinct: 209
Distinct (%): 62.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 KiB

Length

Max length: 25
Median length: 20
Mean length: 14.10089
Min length: 3

Characters and Unicode

Total characters: 4752
Distinct characters: 245
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 4

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 150
Unique (%): 44.5%

Sample

1st row: 그 외 기타 특수목적용 기계 제조업
2nd row: 기타 인쇄업
3rd row: 금속탱크 및 저장용기 제조업
4th row: 제품 디자인업
5th row: 그 외 기타 특수목적용 기계 제조업

Words

Value | Count | Frequency (%)
제조업 | 279 | 20.5%
[not recovered] | 107 | 7.9%
기타 | 92 | 6.8%
부품 | 35 | 2.6%
[not recovered] | 31 | 2.3%
[not recovered] | 31 | 2.3%
그외 | 28 | 2.1%
기계 | 24 | 1.8%
자동차 | 23 | 1.7%
플라스틱 | 20 | 1.5%
Other values (324) | 690 | 50.7%

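Most occurring characters

Value | Count | Frequency (%)
[space] | 1023 | 21.5%
[not recovered] | 354 | 7.4%
[not recovered] | 344 | 7.2%
[not recovered] | 330 | 6.9%
[not recovered] | 217 | 4.6%
[not recovered] | 144 | 3.0%
[not recovered] | 113 | 2.4%
[not recovered] | 107 | 2.3%
[not recovered] | 93 | 2.0%
[not recovered] | 75 | 1.6%
Other values (235) | 1952 | 41.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3675 | 77.3%
Space Separator | 1023 | 21.5%
Other Punctuation | 37 | 0.8%
Close Punctuation | 5 | 0.1%
Open Punctuation | 5 | 0.1%
Decimal Number | 4 | 0.1%
Uppercase Letter | 3 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[not recovered] | 354 | 9.6%
[not recovered] | 344 | 9.4%
[not recovered] | 330 | 9.0%
[not recovered] | 217 | 5.9%
[not recovered] | 144 | 3.9%
[not recovered] | 113 | 3.1%
[not recovered] | 107 | 2.9%
[not recovered] | 93 | 2.5%
[not recovered] | 75 | 2.0%
[not recovered] | 67 | 1.8%
Other values (226) | 1831 | 49.8%

Uppercase Letter
Value | Count | Frequency (%)
S | 1 | 33.3%
M | 1 | 33.3%
I | 1 | 33.3%

Other Punctuation
Value | Count | Frequency (%)
, | 35 | 94.6%
· | 2 | 5.4%

Space Separator
Value | Count | Frequency (%)
[space] | 1023 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 5 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 5 | 100.0%

Decimal Number
Value | Count | Frequency (%)
1 | 4 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3675 | 77.3%
Common | 1074 | 22.6%
Latin | 3 | 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[not recovered] | 354 | 9.6%
[not recovered] | 344 | 9.4%
[not recovered] | 330 | 9.0%
[not recovered] | 217 | 5.9%
[not recovered] | 144 | 3.9%
[not recovered] | 113 | 3.1%
[not recovered] | 107 | 2.9%
[not recovered] | 93 | 2.5%
[not recovered] | 75 | 2.0%
[not recovered] | 67 | 1.8%
Other values (226) | 1831 | 49.8%

Common
Value | Count | Frequency (%)
[space] | 1023 | 95.3%
, | 35 | 3.3%
) | 5 | 0.5%
( | 5 | 0.5%
1 | 4 | 0.4%
· | 2 | 0.2%

Latin
Value | Count | Frequency (%)
S | 1 | 33.3%
M | 1 | 33.3%
I | 1 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3674 | 77.3%
ASCII | 1075 | 22.6%
None | 2 | < 0.1%
Compat Jamo | 1 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
[space] | 1023 | 95.2%
, | 35 | 3.3%
) | 5 | 0.5%
( | 5 | 0.5%
1 | 4 | 0.4%
S | 1 | 0.1%
M | 1 | 0.1%
I | 1 | 0.1%

Hangul
Value | Count | Frequency (%)
[not recovered] | 354 | 9.6%
[not recovered] | 344 | 9.4%
[not recovered] | 330 | 9.0%
[not recovered] | 217 | 5.9%
[not recovered] | 144 | 3.9%
[not recovered] | 113 | 3.1%
[not recovered] | 107 | 2.9%
[not recovered] | 93 | 2.5%
[not recovered] | 75 | 2.0%
[not recovered] | 67 | 1.8%
Other values (225) | 1830 | 49.8%

None
Value | Count | Frequency (%)
· | 2 | 100.0%

Compat Jamo
Value | Count | Frequency (%)
[not recovered] | 1 | 100.0%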

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

Variable | 참여년도 | 사업자번호 | 취업인원 | 지역구분 | 업종
참여년도 | 1.000 | 0.000 | 0.000 | 0.142 | 0.117
사업자번호 | 0.000 | 1.000 | 0.054 | 0.810 | 0.000
취업인원 | 0.000 | 0.054 | 1.000 | 0.000 | 0.000
지역구분 | 0.142 | 0.810 | 0.000 | 1.000 | 0.304
업종 | 0.117 | 0.000 | 0.000 | 0.304 | 1.000

[Figure: correlation heatmap]

Variable | 취업인원 | 업종 | 지역구분 | 참여년도
취업인원 | 1.000 | 0.000 | 0.000 | 0.000
업종 | 0.000 | 1.000 | 0.158 | 0.143
지역구분 | 0.000 | 0.158 | 1.000 | 0.124
참여년도 | 0.000 | 0.143 | 0.124 | 1.000

[Figure: correlation heatmap]

Variable | 사업자번호 | 참여년도 | 취업인원 | 지역구분 | 업종
사업자번호 | 1.000 | 0.000 | 0.036 | 0.475 | 0.000
참여년도 | 0.000 | 1.000 | 0.000 | 0.124 | 0.143
취업인원 | 0.036 | 0.000 | 1.000 | 0.000 | 0.000
지역구분 | 0.475 | 0.124 | 0.000 | 1.000 | 0.158
업종 | 0.000 | 0.143 | 0.000 | 0.158 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

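First rows

Index | 참여년도 | 기업명 | 사업자번호 | 취업인원 | 지역구분 | 업종 | 산업분류
0 | 2023 | 영일엔지니어링(주) | 1028800167 | 2 | 대구 | 제조업 | 그 외 기타 특수목적용 기계 제조업
1 | 2023 | (주)드림애드앤프린팅그룹 | 1048149624 | 3 | 경기 | 제조업 | 기타 인쇄업
2 | 2023 | 장한기술(주) | 1068125254 | 2 | 충남 | 제조업 | 금속탱크 및 저장용기 제조업
3 | 2023 | 스마트엠 | 1071687880 | 2 | 서울 | 전문, 과학 및 기술 서비스업 | 제품 디자인업
4 | 2023 | 대신엠씨(주) | 1078159849 | 3 | 인천 | 제조업 | 그 외 기타 특수목적용 기계 제조업
5 | 2023 | (주)지에스티산업 | 1078171936 | 3 | 충북 | 제조업 | 탭, 밸브 및 유사장치 제조업
6 | 2023 | (주)아임유 | 1088193646 | 3 | 서울 | 제조업 | 사무용 기계 및 장비 제조업
7 | 2023 | (주)다니엘컴퍼니 | 1088700305 | 2 | 경북 | 제조업 | 커피 가공업
8 | 2023 | 비바코리아 | 1090742664 | 1 | 인천 | 제조업 | 방송장비 제조업
9 | 2023 | (주)사운드캠코리아 | 1128155127 | 2 | 서울 | 도매 및 소매업 | 정밀기기 및 과학기기 도매업

Last rows

Index | 참여년도 | 기업명 | 사업자번호 | 취업인원 | 지역구분 | 업종 | 산업분류
327 | 2022 | (주)코아드 | 4318126854 | 1 | 경기 | 제조업 | 금속 문, 창, 셔터 및 관련제품 제조업
328 | 2022 | 한진실업(주) | 4108128484 | 2 | 광주 | 제조업 | 기계장비 조립용 플라스틱 제조업
329 | 2022 | (주)화인트로 | 6088162643 | 2 | 충남 | 제조업 | 그외 기타 자동차 부품 제조업
330 | 2022 | 스마트엠 | 1071687880 | 1 | 서울 | 제조업 | 기타 편조의복 액세서리 제조업
331 | 2022 | (주)프리텍코리아 | 3128619876 | 1 | 충남 | 제조업 | 공기조화장치 제조업
332 | 2022 | 이노6(주) | 1248731775 | 2 | 경기 | 제조업 | 반도체 제조용 기계 제조업
333 | 2022 | (주)티티엔지 | 5148197018 | 2 | 대구 | 제조업 | 응용소프트웨어 개발 및 공급업
334 | 2022 | (주)에니룩스 | 1308658485 | 2 | 경북 | 제조업 | 전시 및 광고용 조명장치 제조업
335 | 2022 | (주)마이텍 | 6038153907 | 1 | 부산 | 제조업 | 증류기, 열교환기 및 가스발생기 제조업
336 | 2022 | 디랩치과기공소 | 1199133384 | 1 | 서울 | 제조업 | 지과용 기기 제조업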