Overview

Dataset statistics

Number of variables: 5
Number of observations: 21
Missing cells: 11
Missing cells (%): 10.5%
Duplicate rows: 2
Duplicate rows (%): 9.5%
Total size in memory: 968.0 B
Average record size in memory: 46.1 B
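
The percentages and the average record size follow from the counts listed above. A minimal Python check, assuming the cell total is simply variables times observations (an assumption about how the report derives it, although it reproduces the figures shown):

```python
# Counts copied from the dataset statistics in this report.
n_variables = 5
n_observations = 21
missing_cells = 11
duplicate_rows = 2
total_bytes = 968.0

# Assumption: total cells = variables x observations.
total_cells = n_variables * n_observations

missing_pct = round(100 * missing_cells / total_cells, 1)
duplicate_pct = round(100 * duplicate_rows / n_observations, 1)
avg_record_bytes = round(total_bytes / n_observations, 1)

print(missing_pct, duplicate_pct, avg_record_bytes)
```

With these inputs the three values come out as 10.5, 9.5, and 46.1, matching the report.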

Variable types

Unsupported: 1
Text: 1
Categorical: 3

Dataset

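Description: This data was provided in response to a public data provision request. It describes the personnel within the Ministry of Employment and Labor who hold a nursing license and was extracted from the electronic personnel management system. Registering license holdings is not mandatory, and some individuals do not register the documents that prove their qualification, so the data may differ from actual holdings.
Author: Ministry of Employment and Labor (고용노동부)
URL: https://www.data.go.kr/data/15126369/fileData.do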

Alerts

Dataset has 2 (9.5%) duplicate rows (Duplicates)
Unnamed: 2 is highly overall correlated with Unnamed: 4 (High correlation)
Unnamed: 3 is highly overall correlated with Unnamed: 4 (High correlation)
Unnamed: 4 is highly overall correlated with Unnamed: 2 and 1 other fields (High correlation)
<고용노동부 간호사 자격증 보유 인력 현황('23.11.14.기준)> has 2 (9.5%) missing values (Missing)
Unnamed: 1 has 9 (42.9%) missing values (Missing)
<고용노동부 간호사 자격증 보유 인력 현황('23.11.14.기준)> is an unsupported type, check if it needs cleaning or further analysis (Unsupported)

Reproduction

Analysis started: 2024-03-14 09:12:18.525689
Analysis finished: 2024-03-14 09:12:19.681534
Duration: 1.16 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

<고용노동부 간호사 자격증 보유 인력 현황('23.11.14.기준)>
Unsupported

Missing: 2
Missing (%): 9.5%
Memory size: 296.0 B

Unnamed: 1
Text

MISSING 

Distinct: 12
Distinct (%): 100.0%
Missing: 9
Missing (%): 42.9%
Memory size: 296.0 B

Length

Max length: 14
Median length: 9
Mean length: 10.166667
Min length: 2

Characters and Unicode

Total characters: 122
Distinct characters: 27
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
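
As an illustration of the character properties mentioned here, the general category of each code point can be read with Python's standard `unicodedata` module. This is a minimal sketch, not the profiler's own code; the helper name `category_counts` is hypothetical, and the sample string is one of the values from this column. The category codes `Lo` and `Zs` correspond to the "Other Letter" and "Space Separator" labels used in this report.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters in `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# One value from the Unnamed: 1 column (14 characters, including one space).
sample = "중부지방고용노동청 경기지청"
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```

The standard library exposes the general category but not the script or block of a character, so those two breakdowns would need another data source.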

Unique

Unique: 12
Unique (%): 100.0%

Sample

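1st row: 소속 (affiliation)
2nd row: 고용노동부 (Ministry of Employment and Labor)
3rd row: 서울지방고용노동청 (Seoul Regional Employment and Labor Office)
4th row: 중부지방고용노동청 (Jungbu Regional Employment and Labor Office)
5th row: 중부지방고용노동청 경기지청 (Jungbu Regional Employment and Labor Office, Gyeonggi Branch Office)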
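Value | Count | Frequency (%)
중부지방고용노동청 (Jungbu Regional Employment and Labor Office) | 4 | 23.5%
부산지방고용노동청 (Busan Regional Employment and Labor Office) | 3 | 17.6%
소속 (affiliation) | 1 | 5.9%
고용노동부 (Ministry of Employment and Labor) | 1 | 5.9%
서울지방고용노동청 (Seoul Regional Employment and Labor Office) | 1 | 5.9%
경기지청 (Gyeonggi Branch Office) | 1 | 5.9%
성남지청 (Seongnam Branch Office) | 1 | 5.9%
강원지청 (Gangwon Branch Office) | 1 | 5.9%
양산지청 (Yangsan Branch Office) | 1 | 5.9%
진주지청 (Jinju Branch Office) | 1 | 5.9%
Other values (2) | 2 | 11.8%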

Most occurring characters

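Value | Count | Frequency (%)
(not preserved) | 15 | 12.3%
(not preserved) | 15 | 12.3%
(not preserved) | 11 | 9.0%
(not preserved) | 11 | 9.0%
(not preserved) | 11 | 9.0%
(not preserved) | 11 | 9.0%
(not preserved) | 10 | 8.2%
(not preserved) | 8 | 6.6%
(space) | 5 | 4.1%
(not preserved) | 4 | 3.3%
Other values (17) | 21 | 17.2%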

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 117 | 95.9%
Space Separator | 5 | 4.1%

Most frequent character per category

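Other Letter
Value | Count | Frequency (%)
(not preserved) | 15 | 12.8%
(not preserved) | 15 | 12.8%
(not preserved) | 11 | 9.4%
(not preserved) | 11 | 9.4%
(not preserved) | 11 | 9.4%
(not preserved) | 11 | 9.4%
(not preserved) | 10 | 8.5%
(not preserved) | 8 | 6.8%
(not preserved) | 4 | 3.4%
(not preserved) | 4 | 3.4%
Other values (16) | 17 | 14.5%
Space Separator
Value | Count | Frequency (%)
(space) | 5 | 100.0%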

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 117 | 95.9%
Common | 5 | 4.1%

Most frequent character per script

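Hangul
Value | Count | Frequency (%)
(not preserved) | 15 | 12.8%
(not preserved) | 15 | 12.8%
(not preserved) | 11 | 9.4%
(not preserved) | 11 | 9.4%
(not preserved) | 11 | 9.4%
(not preserved) | 11 | 9.4%
(not preserved) | 10 | 8.5%
(not preserved) | 8 | 6.8%
(not preserved) | 4 | 3.4%
(not preserved) | 4 | 3.4%
Other values (16) | 17 | 14.5%
Common
Value | Count | Frequency (%)
(space) | 5 | 100.0%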

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 117 | 95.9%
ASCII | 5 | 4.1%

Most frequent character per block

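Hangul
Value | Count | Frequency (%)
(not preserved) | 15 | 12.8%
(not preserved) | 15 | 12.8%
(not preserved) | 11 | 9.4%
(not preserved) | 11 | 9.4%
(not preserved) | 11 | 9.4%
(not preserved) | 11 | 9.4%
(not preserved) | 10 | 8.5%
(not preserved) | 8 | 6.8%
(not preserved) | 4 | 3.4%
(not preserved) | 4 | 3.4%
Other values (16) | 17 | 14.5%
ASCII
Value | Count | Frequency (%)
(space) | 5 | 100.0%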

Unnamed: 2
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 14.3%
Missing: 0
Missing (%): 0.0%
Memory size: 296.0 B

Length

Max length: 4
Median length: 2
Mean length: 2.8571429
Min length: 2

Unique

Unique: 1
Unique (%): 4.8%

Sample

1st row: 성별 (sex)
2nd row: 여성 (female)
3rd row: 여성 (female)
4th row: 여성 (female)
5th row: 여성 (female)

Common Values

Value | Count | Frequency (%)
여성 (female) | 11 | 52.4%
<NA> | 9 | 42.9%
성별 (sex) | 1 | 4.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
여성 (female) | 11 | 52.4%
na | 9 | 42.9%
성별 (sex) | 1 | 4.8%

Unnamed: 3
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 33.3%
Missing: 0
Missing (%): 0.0%
Memory size: 296.0 B

Length

Max length: 4
Median length: 3
Mean length: 2.952381
Min length: 2

Unique

Unique: 4
Unique (%): 19.0%

Sample

1st row: 신분 (employment status)
2nd row: 공무원 (civil servant)
3rd row: 공무원 (civil servant)
4th row: 공무원 (civil servant)
5th row: 공무원 (civil servant)

Common Values

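Value | Count | Frequency (%)
공무원 (civil servant) | 11 | 52.4%
<NA> | 4 | 19.0%
1명 (1 person) | 2 | 9.5%
신분 (employment status) | 1 | 4.8%
인원수 (headcount) | 1 | 4.8%
6명 (6 people) | 1 | 4.8%
3명 (3 people) | 1 | 4.8%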

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
공무원 (civil servant) | 11 | 52.4%
na | 4 | 19.0%
1명 (1 person) | 2 | 9.5%
신분 (employment status) | 1 | 4.8%
인원수 (headcount) | 1 | 4.8%
6명 (6 people) | 1 | 4.8%
3명 (3 people) | 1 | 4.8%

Unnamed: 4
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 23.8%
Missing: 0
Missing (%): 0.0%
Memory size: 296.0 B

Length

Max length: 20
Median length: 17
Mean length: 11.52381
Min length: 4

Unique

Unique: 2
Unique (%): 9.5%

Sample

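1st row: 담당업무 (assigned duties)
2nd row: 부내 직원 건강증진 및 산업보건 업무 (ministry staff health promotion and occupational health duties)
3rd row: 부내 직원 건강증진 및 산업보건 업무 (ministry staff health promotion and occupational health duties)
4th row: 부내 직원 건강증진 및 산업보건 업무 (ministry staff health promotion and occupational health duties)
5th row: 산재예방지도(보건분야) 업무 (industrial accident prevention guidance duties, health field)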

Common Values

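Value | Count | Frequency (%)
<NA> | 9 | 42.9%
부내 직원 건강증진 및 산업보건 업무 (ministry staff health promotion and occupational health duties) | 7 | 33.3%
산재예방지도(보건분야) 업무 (industrial accident prevention guidance duties, health field) | 3 | 14.3%
담당업무 (assigned duties) | 1 | 4.8%
외국인 고용허가 관련 민원 업무 (civil petition duties related to foreign worker employment permits) | 1 | 4.8%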

Length

Histogram of lengths of the category

Common Values (Plot)

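Value | Count | Frequency (%)
업무 (duties) | 11 | 17.5%
na | 9 | 14.3%
부내 (within the ministry) | 7 | 11.1%
직원 (staff) | 7 | 11.1%
건강증진 (health promotion) | 7 | 11.1%
및 (and) | 7 | 11.1%
산업보건 (occupational health) | 7 | 11.1%
산재예방지도(보건분야 (industrial accident prevention guidance, health field) | 3 | 4.8%
담당업무 (assigned duties) | 1 | 1.6%
외국인 (foreign national) | 1 | 1.6%
Other values (3) | 3 | 4.8%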

Correlations

 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4
Unnamed: 1 | 1.000 | 1.000 | 1.000 | 1.000
Unnamed: 2 | 1.000 | 1.000 | 0.545 | 1.000
Unnamed: 3 | 1.000 | 0.545 | 1.000 | 1.000
Unnamed: 4 | 1.000 | 1.000 | 1.000 | 1.000

 | Unnamed: 4 | Unnamed: 3 | Unnamed: 2
Unnamed: 4 | 1.000 | 0.894 | 0.894
Unnamed: 3 | 0.894 | 1.000 | 0.357
Unnamed: 2 | 0.894 | 0.357 | 1.000

 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4
Unnamed: 2 | 1.000 | 0.357 | 0.894
Unnamed: 3 | 0.357 | 1.000 | 0.894
Unnamed: 4 | 0.894 | 0.894 | 1.000
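
The report does not state which association measure produced these matrices. For pairs of categorical columns, one common choice is Cramér's V. The sketch below is an uncorrected version in plain Python, offered as an illustration and not as the profiler's exact computation; the function name `cramers_v` and the toy inputs are hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def cramers_v(xs: Sequence[Hashable], ys: Sequence[Hashable]) -> float:
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed counts per (x, y) cell
    row = Counter(xs)              # marginal counts for xs
    col = Counter(ys)              # marginal counts for ys

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, rx in row.items():
        for y, cy in col.items():
            expected = rx * cy / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row), len(col)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association
    return math.sqrt(chi2 / (n * k))


# Toy inputs (hypothetical): perfectly associated vs. independent columns.
perfect = cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"])
independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(perfect, independent)
```

With only 21 rows, most of which are header, blank, or footnote rows, any such coefficient on this dataset reflects the layout of the sheet more than a relationship between the fields.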

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

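(index) | <고용노동부 간호사 자격증 보유 인력 현황('23.11.14.기준)> | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4
0 | 연번 (serial number) | 소속 (affiliation) | 성별 (sex) | 신분 (employment status) | 담당업무 (assigned duties)
1 | 1 | 고용노동부 (Ministry of Employment and Labor) | 여성 (female) | 공무원 (civil servant) | 부내 직원 건강증진 및 산업보건 업무 (ministry staff health promotion and occupational health duties)
2 | 2 | 서울지방고용노동청 (Seoul Regional Employment and Labor Office) | 여성 (female) | 공무원 (civil servant) | 부내 직원 건강증진 및 산업보건 업무 (ministry staff health promotion and occupational health duties)
3 | 3 | 중부지방고용노동청 (Jungbu Regional Employment and Labor Office) | 여성 (female) | 공무원 (civil servant) | 부내 직원 건강증진 및 산업보건 업무 (ministry staff health promotion and occupational health duties)
4 | 4 | 중부지방고용노동청 경기지청 (Jungbu Regional Office, Gyeonggi Branch Office) | 여성 (female) | 공무원 (civil servant) | 산재예방지도(보건분야) 업무 (industrial accident prevention guidance duties, health field)
5 | 5 | 중부지방고용노동청 성남지청 (Jungbu Regional Office, Seongnam Branch Office) | 여성 (female) | 공무원 (civil servant) | 외국인 고용허가 관련 민원 업무 (civil petition duties related to foreign worker employment permits)
6 | 6 | 중부지방고용노동청 강원지청 (Jungbu Regional Office, Gangwon Branch Office) | 여성 (female) | 공무원 (civil servant) | 부내 직원 건강증진 및 산업보건 업무 (ministry staff health promotion and occupational health duties)
7 | 7 | 부산지방고용노동청 (Busan Regional Employment and Labor Office) | 여성 (female) | 공무원 (civil servant) | 부내 직원 건강증진 및 산업보건 업무 (ministry staff health promotion and occupational health duties)
8 | 8 | 부산지방고용노동청 양산지청 (Busan Regional Office, Yangsan Branch Office) | 여성 (female) | 공무원 (civil servant) | 산재예방지도(보건분야) 업무 (industrial accident prevention guidance duties, health field)
9 | 9 | 부산지방고용노동청 진주지청 (Busan Regional Office, Jinju Branch Office) | 여성 (female) | 공무원 (civil servant) | 산재예방지도(보건분야) 업무 (industrial accident prevention guidance duties, health field)

(index) | <고용노동부 간호사 자격증 보유 인력 현황('23.11.14.기준)> | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4
11 | 11 | 대전지방고용노동청 (Daejeon Regional Employment and Labor Office) | 여성 (female) | 공무원 (civil servant) | 부내 직원 건강증진 및 산업보건 업무 (ministry staff health promotion and occupational health duties)
12 | NaN | <NA> | <NA> | <NA> | <NA>
13 | <고용노동부 간호사 자격증 보유 인력 연령별 현황('23.11.14.기준)> (Ministry of Employment and Labor nursing license holders by age group, as of 2023-11-14) | <NA> | <NA> | <NA> | <NA>
14 | 연령 (age) | <NA> | <NA> | 인원수 (headcount) | <NA>
15 | 20대 (20s) | <NA> | <NA> | 1명 (1 person) | <NA>
16 | 30대 (30s) | <NA> | <NA> | 6명 (6 people) | <NA>
17 | 40대 (40s) | <NA> | <NA> | 3명 (3 people) | <NA>
18 | 50대 (50s) | <NA> | <NA> | 1명 (1 person) | <NA>
19 | NaN | <NA> | <NA> | <NA> | <NA>
20 | ※ 해당 자료는 전자인사관리시스템 추출 자료로, 자격증 보유 현황은 필수 등록현황이 아닌 \n 개인이 자격을 증명하는 서류를 제출한 후 시스템에 등록하는 경우에 한하므로 실제와 다를 수 있음 (※ This data was extracted from the electronic personnel management system. License holdings are not a mandatory registration item and are recorded only when an individual submits documents proving the qualification and registers them in the system, so the figures may differ from reality.) | <NA> | <NA> | <NA> | <NA>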
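
The sample rows suggest that the source sheet stacks a column-header row, a staff table, an age-group table, and a footnote in a single column layout, separated by fully empty rows. One way to recover the separate tables before profiling is to split on those empty rows. The following is a minimal sketch under that assumption; the helper `split_blocks` is hypothetical, `None` stands in for `<NA>`/`NaN`, and the rows are an abbreviated stand-in for the sheet, with a placeholder string in place of the footnote text.

```python
from typing import List, Optional

Row = List[Optional[str]]


def split_blocks(rows: List[Row]) -> List[List[Row]]:
    """Split rows into blocks separated by rows whose cells are all missing."""
    blocks: List[List[Row]] = []
    current: List[Row] = []
    for row in rows:
        if all(cell is None for cell in row):
            # An empty row closes the block collected so far, if any.
            if current:
                blocks.append(current)
                current = []
        else:
            current.append(row)
    if current:
        blocks.append(current)
    return blocks


# Abbreviated stand-in for the sheet (values taken from the sample rows).
rows: List[Row] = [
    ["연번", "소속", "성별", "신분", "담당업무"],
    ["1", "고용노동부", "여성", "공무원", "부내 직원 건강증진 및 산업보건 업무"],
    [None, None, None, None, None],
    ["연령", None, None, "인원수", None],
    ["20대", None, None, "1명", None],
    [None, None, None, None, None],
    ["(footnote placeholder)", None, None, None, None],
]

blocks = split_blocks(rows)
for block in blocks:
    print(len(block), block[0][0])
```

In the full sheet the age-group title row sits directly above the age header with no empty row between them, so it would land in the second block and need to be dropped separately.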

Duplicate rows

Most frequently occurring

(index) | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | # duplicates
1 | <NA> | <NA> | <NA> | <NA> | 4
0 | <NA> | <NA> | 1명 (1 person) | <NA> | 2
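
The counts in this table can be reproduced from rows 12 to 20 of the sample by tallying identical rows across the four `Unnamed` columns. This is a minimal sketch, assuming the first column is left out of the comparison (the table header lists only the `Unnamed` columns) and using `None` for `<NA>`:

```python
from collections import Counter

NA = None  # stand-in for <NA>

# Unnamed: 1 .. Unnamed: 4 for sample rows 12-20, copied from the report.
tail_rows = [
    (NA, NA, NA, NA),        # 12: empty separator row
    (NA, NA, NA, NA),        # 13: title of the age-group table (first column only)
    (NA, NA, "인원수", NA),   # 14: age-group header
    (NA, NA, "1명", NA),      # 15: 20s
    (NA, NA, "6명", NA),      # 16: 30s
    (NA, NA, "3명", NA),      # 17: 40s
    (NA, NA, "1명", NA),      # 18: 50s
    (NA, NA, NA, NA),        # 19: empty separator row
    (NA, NA, NA, NA),        # 20: footnote (first column only)
]

# Keep only row patterns that occur more than once.
tally = Counter(tail_rows)
duplicates = {row: n for row, n in tally.items() if n > 1}

for row, n in duplicates.items():
    print(row, n)
```

Two row patterns repeat: the all-missing row four times and the `1명` row twice, which matches the two duplicate rows reported in the overview.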