Overview

Dataset statistics

Number of variables: 8
Number of observations: 75
Missing cells: 140
Missing cells (%): 23.3%
Duplicate rows: 10
Duplicate rows (%): 13.3%
Total size in memory: 4.8 KiB
Average record size in memory: 65.8 B
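
The percentages above follow from the raw counts. A minimal sketch that recomputes them, assuming the missing-cell percentage is taken over all cells (variables times observations) and that the total size is roughly the average record size times the row count; the variable names are illustrative only:

```python
# Figures copied from the dataset statistics above.
n_variables = 8
n_observations = 75
missing_cells = 140
duplicate_rows = 10
avg_record_bytes = 65.8

# Assumption: the missing-cell percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = round(100 * missing_cells / total_cells, 1)

# Duplicate rows are expressed relative to the number of observations.
duplicate_pct = round(100 * duplicate_rows / n_observations, 1)

# Assumption: total size is approximately average record size times row count.
total_kib = round(avg_record_bytes * n_observations / 1024, 1)

print(missing_pct, duplicate_pct, total_kib)
```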

Variable types

Unsupported: 6
Categorical: 2

Dataset

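Description: 경력단절여성국도비직업교육훈련과정 (nationally and provincially funded vocational education and training courses for career-interrupted women)
Author: 전라북도 (Jeollabuk-do)
URL: https://www.bigdatahub.go.kr/opendata/dataSet/detail.nm?contentId=37&rlik=49451aebf056b486&serviceId=201665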

Alerts

Dataset has 10 (13.3%) duplicate rows [Duplicates]
유형 is highly overall correlated with 훈 련 과 정 명 [High correlation]
훈 련 과 정 명 is highly overall correlated with 유형 [High correlation]
연번 has 18 (24.0%) missing values [Missing]
센터명 has 17 (22.7%) missing values [Missing]
교육인원 has 17 (22.7%) missing values [Missing]
교육기간 has 27 (36.0%) missing values [Missing]
교육시간 has 25 (33.3%) missing values [Missing]
Unnamed: 7 has 36 (48.0%) missing values [Missing]
연번 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
센터명 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
교육인원 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
교육기간 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
교육시간 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 7 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
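
The per-column missing counts in the alerts can be cross-checked against the dataset-level total of 140 missing cells. A small sketch, assuming the two categorical columns contribute zero missing cells as reported in their own sections; the dictionary name is illustrative:

```python
# Missing-value counts copied from the alerts above.
missing_by_column = {
    "연번": 18,
    "센터명": 17,
    "교육인원": 17,
    "교육기간": 27,
    "교육시간": 25,
    "Unnamed: 7": 36,
}
n_observations = 75

# The six unsupported columns account for every missing cell in the report.
total_missing = sum(missing_by_column.values())

# Per-column percentage, rounded the same way the report displays it.
missing_pct = {
    column: round(100 * count / n_observations, 1)
    for column, count in missing_by_column.items()
}

print(total_missing)
print(missing_pct)
```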

Reproduction

Analysis started: 2024-03-14 00:50:47.187353
Analysis finished: 2024-03-14 00:50:47.622047
Duration: 0.43 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
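
A report like this one can be regenerated with ydata-profiling. The sketch below is one possible way to do it, not the configuration actually used: the function name `build_report`, the CSV path argument, the report title, and the output filename are all hypothetical, and the `dataset` metadata keys are assumed to correspond to the Description, Author, and URL fields shown above. The imports sit inside the function so that the definition loads even where pandas or ydata-profiling is not installed.

```python
def build_report(csv_path, output_html="report.html"):
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily: both packages are optional third-party dependencies.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the source data is available as a CSV file at csv_path.
    df = pd.read_csv(csv_path)

    report = ProfileReport(
        df,
        title="Training course profile",  # hypothetical title
        dataset={
            "description": "경력단절여성국도비직업교육훈련과정",
            "author": "전라북도",
            "url": "https://www.bigdatahub.go.kr/opendata/dataSet/detail.nm?contentId=37&rlik=49451aebf056b486&serviceId=201665",
        },
    )
    report.to_file(output_html)
    return report
```

Calling `build_report("courses.csv")` with a hypothetical local file would write `report.html` next to the script.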

Variables

연번
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 18
Missing (%): 24.0%
Memory size: 732.0 B

센터명
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 17
Missing (%): 22.7%
Memory size: 732.0 B

유형
Categorical

HIGH CORRELATION 

Distinct: 14
Distinct (%): 18.7%
Missing: 0
Missing (%): 0.0%
Memory size: 732.0 B
Most frequent categories (bar chart): 일반 18, <NA> 15, 전문 11, 일반과정, Other values (9) 15

Length

Max length: 4
Median length: 3
Mean length: 2.72
Min length: 1

Unique

Unique: 6
Unique (%): 8.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: 일반과정
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
일반 | 18 | 24.0%
<NA> | 15 | 20.0%
전문 | 11 | 14.7%
일반과정 | 8 | 10.7%
(blank) | 8 | 10.7%
취약계층 | 5 | 6.7%
소 계 | 2 | 2.7%
창업 | 2 | 2.7%
소계 | 1 | 1.3%
기업맞춤 | 1 | 1.3%
Other values (4) | 4 | 5.3%
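
Some categories in the table above differ only by internal whitespace (for example 소 계 and 소계), and one category is blank. A minimal cleaning sketch, assuming that whitespace inside a label carries no meaning and that a blank label should count as missing; `normalize_label` is a hypothetical helper, not part of the report:

```python
import re
from collections import Counter


def normalize_label(value):
    """Strip all whitespace from a label; return None for blank or missing input."""
    if value is None:
        return None
    collapsed = re.sub(r"\s+", "", value)
    # Assumption: a label that is empty after stripping is treated as missing.
    return collapsed or None


# Two of the counts from the Common Values table above.
raw_counts = {"소 계": 2, "소계": 1}

# Merge counts for labels that normalize to the same text.
merged = Counter()
for label, count in raw_counts.items():
    merged[normalize_label(label)] += count

print(dict(merged))
```

The same helper would also collapse a spaced-out column name such as 훈 련 과 정 명 into 훈련과정명. Whether 일반과정 and 일반 denote the same category is a domain question that this sketch does not try to answer.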

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
일반 | 18 | 23.4%
na | 15 | 19.5%
전문 | 11 | 14.3%
(blank) | 10 | 13.0%
일반과정 | 8 | 10.4%
취약계층 | 5 | 6.5%
(blank) | 2 | 2.6%
창업 | 2 | 2.6%
소계 | 1 | 1.3%
기업맞춤 | 1 | 1.3%
Other values (4) | 4 | 5.2%

훈 련 과 정 명
Categorical

HIGH CORRELATION 

Distinct: 22
Distinct (%): 29.3%
Missing: 0
Missing (%): 0.0%
Memory size: 732.0 B
Most frequent categories (bar chart): <NA> 22, 일반 15, 기업, 이민, 결혼, Other values (17) 21

Length

Max length: 12
Median length: 2
Mean length: 3.6533333
Min length: 2

Unique

Unique: 14
Unique (%): 18.7%

Sample

1st row: <NA>
2nd row: 10개 과정
3rd row: 직업교육훈련 취업담당자
4th row: 직무연수
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 22 | 29.3%
일반 | 15 | 20.0%
기업 | 9 | 12.0%
이민 | 4 | 5.3%
결혼 | 4 | 5.3%
역량 | 3 | 4.0%
창업 | 2 | 2.7%
기술 | 2 | 2.7%
호텔객실관리사 | 1 | 1.3%
직업교육훈련 취업담당자 | 1 | 1.3%
Other values (12) | 12 | 16.0%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
na | 22 | 28.6%
일반 | 15 | 19.5%
기업 | 9 | 11.7%
이민 | 4 | 5.2%
결혼 | 4 | 5.2%
역량 | 3 | 3.9%
창업 | 2 | 2.6%
기술 | 2 | 2.6%
떡공방창업과정 | 1 | 1.3%
과정 | 1 | 1.3%
Other values (14) | 14 | 18.2%

교육인원
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 17
Missing (%): 22.7%
Memory size: 732.0 B

교육기간
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 27
Missing (%): 36.0%
Memory size: 732.0 B

교육시간
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 25
Missing (%): 33.3%
Memory size: 732.0 B

Unnamed: 7
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 36
Missing (%): 48.0%
Memory size: 732.0 B

Correlations

Variable | 유형 | 훈 련 과 정 명
유형 | 1.000 | 0.983
훈 련 과 정 명 | 0.983 | 1.000

Variable | 유형 | 훈 련 과 정 명
유형 | 1.000 | 0.780
훈 련 과 정 명 | 0.780 | 1.000

Variable | 유형 | 훈 련 과 정 명
유형 | 1.000 | 0.780
훈 련 과 정 명 | 0.780 | 1.000
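
The report does not say which association measure produced each matrix. One common measure for two categorical variables is Cramér's V, sketched below in its plain form without bias correction. This is an illustration of the general technique under that assumption, so it should not be expected to reproduce the 0.983 or 0.780 figures exactly.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Plain (uncorrected) Cramér's V for two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-square statistic over the full contingency table.
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    # Normalize by the smaller table dimension minus one.
    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0  # a constant variable carries no association
    return sqrt(chi2 / (n * k))


# Toy inputs: a perfect association and an independent pairing.
perfect = cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"])
independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(perfect, independent)
```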

Missing values

Count: A simple visualization of nullity by column.
Matrix: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
Heatmap: The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

연번센터명유형훈 련 과 정 명교육인원교육기간교육시간Unnamed: 7
0NaNNaN<NA><NA>(명)NaN(시간)NaN
1NaNNaN<NA>10개 과정212NaNNaNNaN
21광역일반과정직업교육훈련 취업담당자3004. 21 ~ 04. 254박5일NaN
3NaN새일<NA>직무연수NaNNaNNaNNaN
4NaN센터<NA><NA>NaNNaNNaNNaN
5NaN전주소계<NA>NaNNaNNaNNaN
62새일일반과정방과후특기적성지도사2004. 04 ~ 07. 11160NaN
73센터일반과정IT사무원2004. 11 ~ 07. 29184NaN
84-3일반과정탄소소재제조생산인력양성2004. 18 ~ 05. 27128NaN
9NaN군산소 계<NA>NaNNaNNaNNaN
연번센터명유형훈 련 과 정 명교육인원교육기간교육시간Unnamed: 7
6530남원새일일반일반방과후아동지도사양성과정2004.04 ~ 09.05219
6631남원새일일반일반사무행정실무과정2004.04 ~ 07.26240
67김제새일NaN<NA>3개 과정NaNNaNNaN
6832김제새일취약계층결혼이민네일아트국가자격증2203. 02 ~ 05. 24184
6933김제새일일반일반로봇과학방과후지도사2404. 18 ~ 06. 20180
7034김제새일일반일반커리어IT실무자2005. 09 ~ 07. 08184
71완주새일NaN<NA>3개 과정NaNNaNNaN
7235완주새일일반일반생산제조품질관리원1505. 18 ~ 06. 16120
7336완주새일일반일반자동차부품제조양성과정1509. 21 ~ 10. 20120
7437완주새일창업창업폐백이야기1504. 21 ~ 05. 27100
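
The training-period values in the sample, such as `04. 21 ~ 04. 25` and `04.04 ~ 09.05`, are free-form strings, which is one likely reason the column is flagged as unsupported. A minimal parsing sketch, assuming every period has the shape month.day ~ month.day and that the year is not recorded in the cell; `parse_period` is a hypothetical helper:

```python
import re

# month "." day "~" month "." day, with optional spaces around each separator
PERIOD_RE = re.compile(r"(\d{1,2})\.\s*(\d{1,2})\s*~\s*(\d{1,2})\.\s*(\d{1,2})")


def parse_period(text):
    """Return ((start_month, start_day), (end_month, end_day)), or None if unparseable."""
    if not isinstance(text, str):
        return None  # covers NaN and other non-string cells
    match = PERIOD_RE.search(text)
    if match is None:
        return None
    start_month, start_day, end_month, end_day = map(int, match.groups())
    return (start_month, start_day), (end_month, end_day)


print(parse_period("04. 21 ~ 04. 25"))
print(parse_period("04.04 ~ 09.05"))
```

Cells that do not match, including missing values, come back as `None` so they can be reviewed by hand.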

Duplicate rows

Most frequently occurring

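Index | 유형 | 훈 련 과 정 명 | # duplicates
3 | 일반 | 일반 | 15
9 | <NA> | <NA> | 10
5 | 전문 | 기업 | 9
0 | (blank) | <NA> | 8
2 | 일반 | 역량 | 3
7 | 취약계층 | 결혼 | 3
8 | <NA> | 이민 | 3
1 | 소 계 | <NA> | 2
4 | 전문 | 기술 | 2
6 | 창업 | 창업 | 2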
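
A table of this kind can be produced by counting identical rows. The sketch below uses only the standard library and toy data; it assumes that "# duplicates" means the number of times a row occurs, which the report does not state explicitly, and `duplicate_groups` is a hypothetical helper:

```python
from collections import Counter


def duplicate_groups(rows):
    """Return (row, occurrences) pairs for rows seen more than once, most frequent first."""
    counts = Counter(rows)
    repeated = [(row, n) for row, n in counts.items() if n > 1]
    return sorted(repeated, key=lambda item: -item[1])


# Toy rows of (유형, 훈 련 과 정 명) pairs, not the real dataset.
rows = (
    [("일반", "일반")] * 3
    + [("전문", "기업")] * 2
    + [("창업", "창업")]
)
groups = duplicate_groups(rows)
print(groups)
```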