Overview

Dataset statistics

Number of variables: 3
Number of observations: 1228
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 28.9 KiB
Average record size in memory: 24.1 B
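The average record size follows from the total size and the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 B and rounding to one decimal place (the variable names are illustrative, not taken from the profiler):

```python
# Figures copied from the dataset statistics in this report.
total_size_kib = 28.9
n_observations = 1228

# Assumption: the report uses binary units, so 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")
```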

Variable types

Text: 1
Categorical: 2

Dataset

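Description: List of National Technical Qualification and National Professional Qualification items administered by the Human Resources Development Service of Korea (한국산업인력공단), with the responsible ministry for each item. Qualifications administered by other institutions are not included.
URL: https://www.data.go.kr/data/15112802/fileData.do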

Alerts

자격구분 is highly overall correlated with 주무부처 (High correlation)
주무부처 is highly overall correlated with 자격구분 (High correlation)
자격구분 is highly imbalanced (59.3%) (Imbalance)
종목명 has unique values (Unique)
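The 59.3% imbalance figure is consistent with a score defined as one minus the normalized Shannon entropy of the class counts. That this is the exact formula the profiler applies is an assumption. The sketch below only checks that it matches the 자격구분 counts reported in this document (1128 and 100); the function name is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts."""
    if len(counts) < 2:
        return 0.0  # a single class has no meaningful balance to measure
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    max_entropy = math.log2(len(counts))
    return 1 - entropy / max_entropy

# Class counts for 자격구분 as reported in this document.
score = imbalance_score([1128, 100])
print(f"{score:.1%}")
```

A perfectly balanced variable scores 0 under this definition, and the score approaches 1 as one class dominates.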

Reproduction

Analysis started: 2023-12-12 11:39:59.218840
Analysis finished: 2023-12-12 11:39:59.871572
Duration: 0.65 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
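The reported duration can be checked against the two timestamps. A small sketch using only the standard library (the format string is an assumption about how the timestamps are laid out):

```python
from datetime import datetime

# Assumed timestamp layout: date, time, and microseconds.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-12 11:39:59.218840", FMT)
finished = datetime.strptime("2023-12-12 11:39:59.871572", FMT)

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```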

Variables

종목명 (item name)
Text

UNIQUE

Distinct: 1228
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 9.7 KiB

Length

Max length: 21
Median length: 18
Mean length: 8.3884365
Min length: 3

Characters and Unicode

Total characters: 10301
Distinct characters: 345
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
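As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The two-letter codes map onto the names used in this report: `Lo` is Other Letter, `Nd` is Decimal Number, `Ps` and `Pe` are Open and Close Punctuation, `Po` is Other Punctuation, `Zs` is Space Separator, and `Lu` is Uppercase Letter. This is a sketch of the idea, not necessarily how the profiler computes its counts. The helper name is hypothetical.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories (e.g. 'Lo', 'Nd') over all characters."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

# Two values taken from the Sample rows of 종목명 in this report.
sample = ["세무사", "관광통역안내사(영어)"]
counts = category_counts(sample)
print(dict(counts))
```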

Unique

Unique: 1228
Unique (%): 100.0%

Sample

1st row: 세무사
2nd row: 관세사
3rd row: 관광통역안내사(영어)
4th row: 국내여행안내사
5th row: 호텔경영사
Value | Count | Frequency (%)
청소년상담사 | 3 | 0.2%
청소년지도사 | 3 | 0.2%
1급 | 3 | 0.2%
3급 | 2 | 0.2%
2급 | 2 | 0.2%
세무사 | 1 | 0.1%
시각디자인산업기사 | 1 | 0.1%
자동차검사산업기사 | 1 | 0.1%
승강기산업기사 | 1 | 0.1%
정밀기계기사2급 | 1 | 0.1%
Other values (1217) | 1217 | 98.5%
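Every 종목명 value is unique, yet some entries in the table above have counts greater than one. This is consistent with the table counting whitespace-separated words rather than whole values: the counts sum to 1235, which equals the 1228 values plus the 7 space characters reported under Space Separator. A minimal sketch of such a word count, assuming a plain whitespace split. The input values are hypothetical and are not copied from the dataset.

```python
from collections import Counter

def word_frequencies(values):
    """Split each value on whitespace and count the resulting words."""
    words = [word for value in values for word in value.split()]
    counts = Counter(words)
    total = len(words)
    # Each entry is (word, count, share of all words).
    return [(word, n, n / total) for word, n in counts.most_common()]

# Hypothetical values for illustration only.
values = ["청소년상담사 1급", "청소년상담사 2급", "청소년상담사 3급", "세무사"]
freqs = word_frequencies(values)
for word, n, share in freqs:
    print(f"{word} | {n} | {share:.1%}")
```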

Most occurring characters

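Value | Count | Frequency (%)
(glyph lost) | 1472 | 14.3%
(glyph lost) | 1191 | 11.6%
(glyph lost) | 678 | 6.6%
(glyph lost) | 400 | 3.9%
1 | 256 | 2.5%
(glyph lost) | 227 | 2.2%
(glyph lost) | 184 | 1.8%
(glyph lost) | 172 | 1.7%
(glyph lost) | 165 | 1.6%
(glyph lost) | 162 | 1.6%
Other values (335) | 5394 | 52.4%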

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 9566 | 92.9%
Decimal Number | 433 | 4.2%
Close Punctuation | 142 | 1.4%
Open Punctuation | 142 | 1.4%
Other Punctuation | 10 | 0.1%
Space Separator | 7 | 0.1%
Uppercase Letter | 1 | < 0.1%

Most frequent character per category

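Other Letter
Value | Count | Frequency (%)
(glyph lost) | 1472 | 15.4%
(glyph lost) | 1191 | 12.5%
(glyph lost) | 678 | 7.1%
(glyph lost) | 400 | 4.2%
(glyph lost) | 227 | 2.4%
(glyph lost) | 184 | 1.9%
(glyph lost) | 172 | 1.8%
(glyph lost) | 165 | 1.7%
(glyph lost) | 162 | 1.7%
(glyph lost) | 156 | 1.6%
Other values (318) | 4759 | 49.7%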
Decimal Number
Value | Count | Frequency (%)
1 | 256 | 59.1%
2 | 156 | 36.0%
3 | 8 | 1.8%
4 | 4 | 0.9%
5 | 3 | 0.7%
6 | 2 | 0.5%
8 | 1 | 0.2%
7 | 1 | 0.2%
9 | 1 | 0.2%
0 | 1 | 0.2%
Other Punctuation
Value | Count | Frequency (%)
, | 8 | 80.0%
/ | 1 | 10.0%
· | 1 | 10.0%
Close Punctuation
Value | Count | Frequency (%)
) | 142 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 142 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 7 | 100.0%
Uppercase Letter
Value | Count | Frequency (%)
D | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 9566 | 92.9%
Common | 734 | 7.1%
Latin | 1 | < 0.1%

Most frequent character per script

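Hangul
Value | Count | Frequency (%)
(glyph lost) | 1472 | 15.4%
(glyph lost) | 1191 | 12.5%
(glyph lost) | 678 | 7.1%
(glyph lost) | 400 | 4.2%
(glyph lost) | 227 | 2.4%
(glyph lost) | 184 | 1.9%
(glyph lost) | 172 | 1.8%
(glyph lost) | 165 | 1.7%
(glyph lost) | 162 | 1.7%
(glyph lost) | 156 | 1.6%
Other values (318) | 4759 | 49.7%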
Common
Value | Count | Frequency (%)
1 | 256 | 34.9%
2 | 156 | 21.3%
) | 142 | 19.3%
( | 142 | 19.3%
3 | 8 | 1.1%
, | 8 | 1.1%
(space) | 7 | 1.0%
4 | 4 | 0.5%
5 | 3 | 0.4%
6 | 2 | 0.3%
Other values (6) | 6 | 0.8%
Latin
Value | Count | Frequency (%)
D | 1 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 9566 | 92.9%
ASCII | 734 | 7.1%
None | 1 | < 0.1%

Most frequent character per block

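Hangul
Value | Count | Frequency (%)
(glyph lost) | 1472 | 15.4%
(glyph lost) | 1191 | 12.5%
(glyph lost) | 678 | 7.1%
(glyph lost) | 400 | 4.2%
(glyph lost) | 227 | 2.4%
(glyph lost) | 184 | 1.9%
(glyph lost) | 172 | 1.8%
(glyph lost) | 165 | 1.7%
(glyph lost) | 162 | 1.7%
(glyph lost) | 156 | 1.6%
Other values (318) | 4759 | 49.7%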
ASCII
Value | Count | Frequency (%)
1 | 256 | 34.9%
2 | 156 | 21.3%
) | 142 | 19.3%
( | 142 | 19.3%
3 | 8 | 1.1%
, | 8 | 1.1%
(space) | 7 | 1.0%
4 | 4 | 0.5%
5 | 3 | 0.4%
6 | 2 | 0.3%
Other values (6) | 6 | 0.8%
None
Value | Count | Frequency (%)
· | 1 | 100.0%

자격구분 (qualification type)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 9.7 KiB
국가기술자격 | 1128
국가전문자격 | 100

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 국가전문자격
2nd row: 국가전문자격
3rd row: 국가전문자격
4th row: 국가전문자격
5th row: 국가전문자격

Common Values

Value | Count | Frequency (%)
국가기술자격 | 1128 | 91.9%
국가전문자격 | 100 | 8.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
국가기술자격 | 1128 | 91.9%
국가전문자격 | 100 | 8.1%

주무부처 (responsible ministry)
Categorical

HIGH CORRELATION

Distinct: 35
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Memory size: 9.7 KiB
고용노동부 | 404
국토교통부 | 131
산업통상자원부 | 93
건설교통부 | 50
과학기술정보통신부 | 47
Other values (30) | 503

Length

Max length: 9
Median length: 5
Mean length: 5.3241042
Min length: 3

Unique

Unique: 5
Unique (%): 0.4%

Sample

1st row: 국세청
2nd row: 관세청
3rd row: 문화체육관광부
4th row: 문화체육관광부
5th row: 문화체육관광부

Common Values

Value | Count | Frequency (%)
고용노동부 | 404 | 32.9%
국토교통부 | 131 | 10.7%
산업통상자원부 | 93 | 7.6%
건설교통부 | 50 | 4.1%
과학기술정보통신부 | 47 | 3.8%
산업자원부 | 45 | 3.7%
과학기술부 | 43 | 3.5%
농촌진흥청 | 36 | 2.9%
해양수산부 | 35 | 2.9%
환경부 | 32 | 2.6%
Other values (25) | 312 | 25.4%

Length

Histogram of lengths of the category

Correlations

Variable | 자격구분 | 주무부처
자격구분 | 1.000 | 0.839
주무부처 | 0.839 | 1.000

Variable | 자격구분 | 주무부처
자격구분 | 1.000 | 0.736
주무부처 | 0.736 | 1.000

Variable | 자격구분 | 주무부처
자격구분 | 1.000 | 0.736
주무부처 | 0.736 | 1.000
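Both correlated variables are categorical, and a common association measure for such pairs is Cramér's V, which is derived from the chi-squared statistic of their contingency table. Which measure produced each matrix above is not stated in this report, so treating it as Cramér's V is an assumption. The sketch below shows the uncorrected form on toy labels and does not attempt to reproduce 0.839 or 0.736. The function name and the toy data are hypothetical.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    for x, row_n in row_totals.items():
        for y, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    # Normalize by the smaller table dimension minus one.
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Toy labels (hypothetical): here the second variable fully determines the first.
v_perfect = cramers_v(["A", "A", "B", "B"], ["m1", "m1", "m2", "m2"])
# Toy labels (hypothetical): here the two variables are independent.
v_independent = cramers_v(["A", "B", "A", "B"], ["m1", "m1", "m2", "m2"])
print(v_perfect, v_independent)
```

A value near 1 means that knowing one variable largely determines the other, which fits the High correlation alerts raised for 자격구분 and 주무부처.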

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 종목명 | 자격구분 | 주무부처
0 | 세무사 | 국가전문자격 | 국세청
1 | 관세사 | 국가전문자격 | 관세청
2 | 관광통역안내사(영어) | 국가전문자격 | 문화체육관광부
3 | 국내여행안내사 | 국가전문자격 | 문화체육관광부
4 | 호텔경영사 | 국가전문자격 | 문화체육관광부
5 | 호텔관리사 | 국가전문자격 | 문화체육관광부
6 | 관광통역안내사(불어) | 국가전문자격 | 문화체육관광부
7 | 정수시설운영관리사3급 | 국가전문자격 | 환경부
8 | 정수시설운영관리사2급 | 국가전문자격 | 환경부
9 | 정수시설운영관리사1급 | 국가전문자격 | 환경부

Last rows

Index | 종목명 | 자격구분 | 주무부처
1218 | 바이오화학제품제조산업기사 | 국가기술자격 | 산업통상자원부
1219 | 버섯산업기사 | 국가기술자격 | 농촌진흥청
1220 | 농작업안전보건기사 | 국가기술자격 | 농림축산식품부
1221 | 신발산업기사 | 국가기술자격 | 산업통상자원부
1222 | 로봇하드웨어개발기사 | 국가기술자격 | 산업통상자원부
1223 | 잠수기능장 | 국가기술자격 | 해양수산부
1224 | 환경위해관리기사 | 국가기술자격 | 환경부
1225 | 채소재배기능사2급 | 국가기술자격 | 농촌진흥청
1226 | 측지기사2급 | 국가기술자격 | 건설부
1227로봇기구개발기사국가기술자격산업통상자원부