Overview

Dataset statistics

Number of variables: 3
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 4
Duplicate rows (%): 4.0%
Total size in memory: 2.7 KiB
Average record size in memory: 27.3 B

Variable types

Text: 1
Categorical: 2

Dataset

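Description: Suggestions made by trainees at medical care institutions about their training programs (for transmission). Fields: training program name, suggestion category code, training quarter.
Variable layout:
1. 교육프로그램명 (training program name): name of the training program
2. 제안분류코드 (suggestion category code): 1 = excellent, 2 = suggestion, 3 = opinion, 4 = improvement plan
3. 교육분기 (training quarter): 1 = Q1, 2 = Q2, 3 = Q3, 4 = Q4
URL: https://www.data.go.kr/data/15120953/fileData.do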

Alerts

Dataset has 4 (4.0%) duplicate rows [Duplicates]
교육분기 is highly imbalanced (80.6%) [Imbalance]
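
The 80.6% in the imbalance alert can be reproduced from the class counts this report gives for 교육분기 (97 rows with value 4, 3 rows with value 1). The sketch below assumes the score is one minus the Shannon entropy of the class distribution, divided by the maximum entropy for that number of classes. The report does not state the formula, and `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one
    class dominates. Assumed formula, not quoted from the report.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # a single class leaves nothing to compare
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)

# Counts for 교육분기 taken from the Common Values table: value 4 -> 97, value 1 -> 3.
score = imbalance_score([97, 3])
print(f"{score:.1%}")  # 80.6%
```

Under this assumption a 50/50 split scores 0.0, so 80.6% reflects how close the column is to being constant.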

Reproduction

Analysis started: 2023-12-12 07:15:23.163534
Analysis finished: 2023-12-12 07:15:23.507578
Duration: 0.34 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

교육프로그램명
Text

Distinct: 87
Distinct (%): 87.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 43
Median length: 35
Mean length: 15.79
Min length: 4

Characters and Unicode

Total characters: 1579
Distinct characters: 223
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
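
As a minimal sketch of how such per-category counts can be derived, the example below uses Python's standard `unicodedata` module on one of the program names from this report. The helper name `category_counts` is hypothetical. The module returns two-letter general category codes: `Lo` is Other Letter, `Nd` is Decimal Number, and `Zs` is Space Separator. Scripts and blocks are not exposed by `unicodedata` and are not covered here.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

# A program name that appears in this report's sample rows.
sample = "10월 신규간호사 입사교육"
counts = category_counts(sample)

print(counts["Lo"])  # 10 Hangul syllables (Other Letter)
print(counts["Nd"])  # 2 digits (Decimal Number)
print(counts["Zs"])  # 2 spaces (Space Separator)
```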

Unique

Unique: 79
Unique (%): 79.0%
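
The gap between 87 distinct and 79 unique values is easier to read with the two counts side by side. The sketch below assumes "Distinct" means the number of different values and "Unique" means the number of values that occur exactly once; under that reading, 8 of the 87 distinct program names occur more than once. The list is a toy example built from names seen in this report, not the real column, and `distinct_and_unique` is a hypothetical helper.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    n_distinct = len(counts)
    n_unique = sum(1 for n in counts.values() if n == 1)
    return n_distinct, n_unique

# Toy column (hypothetical frequencies): two names repeat, two occur once.
values = [
    "친절교육", "친절교육", "친절교육",
    "마음챙김", "마음챙김",
    "배액관 관리",
    "CPCR시뮬레이션",
]
n_distinct, n_unique = distinct_and_unique(values)
print(n_distinct, n_unique)  # 4 2
```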

Sample

1st row: 혈액검사의 이해1
2nd row: 10월 신규간호사 입사교육
3rd row: 10월 신규간호사 집체교육
4th row: 혈액검사의 이해2
5th row: Impantable access system

Value                 Count  Frequency (%)
교육                  19     6.2%
22년                  15     4.9%
신규간호사            11     3.6%
(not recovered)       10     3.2%
간호조무사            5      1.6%
11월                  5      1.6%
기본간호술기          4      1.3%
간호사                4      1.3%
간호간병              4      1.3%
이해                  4      1.3%
Other values (170)    227    73.7%

Most occurring characters

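Value                 Count  Frequency (%)
(space)               234    14.8%
2                     72     4.6%
(not recovered)       66     4.2%
1                     64     4.1%
(not recovered)       53     3.4%
(not recovered)       47     3.0%
(not recovered)       47     3.0%
(not recovered)       41     2.6%
(not recovered)       30     1.9%
(not recovered)       26     1.6%
Other values (213)    899    56.9%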

Most occurring categories

Value                 Count  Frequency (%)
Other Letter          1013   64.2%
Space Separator       234    14.8%
Decimal Number        182    11.5%
Uppercase Letter      44     2.8%
Lowercase Letter      39     2.5%
Close Punctuation     25     1.6%
Open Punctuation      25     1.6%
Other Punctuation     11     0.7%
Math Symbol           4      0.3%
Dash Punctuation      2      0.1%

Most frequent character per category

Other Letter
Value                 Count  Frequency (%)
(not recovered)       66     6.5%
(not recovered)       53     5.2%
(not recovered)       47     4.6%
(not recovered)       47     4.6%
(not recovered)       41     4.0%
(not recovered)       30     3.0%
(not recovered)       26     2.6%
(not recovered)       21     2.1%
(not recovered)       19     1.9%
(not recovered)       18     1.8%
Other values (165)    645    63.7%

Lowercase Letter
Value                 Count  Frequency (%)
e                     7      17.9%
a                     5      12.8%
s                     4      10.3%
t                     3      7.7%
n                     3      7.7%
w                     3      7.7%
l                     3      7.7%
r                     2      5.1%
m                     2      5.1%
c                     2      5.1%
Other values (5)      5      12.8%

Uppercase Letter
Value                 Count  Frequency (%)
C                     7      15.9%
R                     7      15.9%
A                     6      13.6%
S                     5      11.4%
P                     5      11.4%
B                     3      6.8%
L                     2      4.5%
D                     2      4.5%
I                     2      4.5%
O                     1      2.3%
Other values (4)      4      9.1%

Decimal Number
Value                 Count  Frequency (%)
2                     72     39.6%
1                     64     35.2%
0                     16     8.8%
8                     6      3.3%
3                     6      3.3%
7                     5      2.7%
9                     4      2.2%
6                     4      2.2%
4                     4      2.2%
5                     1      0.5%

Other Punctuation
Value                 Count  Frequency (%)
.                     8      72.7%
/                     1      9.1%
&                     1      9.1%
:                     1      9.1%

Space Separator
Value                 Count  Frequency (%)
(space)               234    100.0%

Close Punctuation
Value                 Count  Frequency (%)
)                     25     100.0%

Open Punctuation
Value                 Count  Frequency (%)
(                     25     100.0%

Math Symbol
Value                 Count  Frequency (%)
~                     4      100.0%

Dash Punctuation
Value                 Count  Frequency (%)
-                     2      100.0%

Most occurring scripts

Value                 Count  Frequency (%)
Hangul                1013   64.2%
Common                483    30.6%
Latin                 83     5.3%

Most frequent character per script

Hangul
Value                 Count  Frequency (%)
(not recovered)       66     6.5%
(not recovered)       53     5.2%
(not recovered)       47     4.6%
(not recovered)       47     4.6%
(not recovered)       41     4.0%
(not recovered)       30     3.0%
(not recovered)       26     2.6%
(not recovered)       21     2.1%
(not recovered)       19     1.9%
(not recovered)       18     1.8%
Other values (165)    645    63.7%

Latin
Value                 Count  Frequency (%)
C                     7      8.4%
e                     7      8.4%
R                     7      8.4%
A                     6      7.2%
S                     5      6.0%
P                     5      6.0%
a                     5      6.0%
s                     4      4.8%
t                     3      3.6%
n                     3      3.6%
Other values (19)     31     37.3%

Common
Value                 Count  Frequency (%)
(space)               234    48.4%
2                     72     14.9%
1                     64     13.3%
)                     25     5.2%
(                     25     5.2%
0                     16     3.3%
.                     8      1.7%
8                     6      1.2%
3                     6      1.2%
7                     5      1.0%
Other values (9)      22     4.6%

Most occurring blocks

Value                 Count  Frequency (%)
Hangul                1013   64.2%
ASCII                 566    35.8%

Most frequent character per block

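ASCII
Value                 Count  Frequency (%)
(space)               234    41.3%
2                     72     12.7%
1                     64     11.3%
)                     25     4.4%
(                     25     4.4%
0                     16     2.8%
.                     8      1.4%
C                     7      1.2%
e                     7      1.2%
R                     7      1.2%
Other values (38)     101    17.8%

Hangul
Value                 Count  Frequency (%)
(not recovered)       66     6.5%
(not recovered)       53     5.2%
(not recovered)       47     4.6%
(not recovered)       47     4.6%
(not recovered)       41     4.0%
(not recovered)       30     3.0%
(not recovered)       26     2.6%
(not recovered)       21     2.1%
(not recovered)       19     1.9%
(not recovered)       18     1.8%
Other values (165)    645    63.7%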

제안분류코드
Categorical

Distinct: 4
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 4
3rd row: 4
4th row: 1
5th row: 1

Common Values

Value                 Count  Frequency (%)
1                     49     49.0%
4                     22     22.0%
3                     18     18.0%
2                     11     11.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values, showing the same counts as the Common Values table]

교육분기
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4
2nd row: 4
3rd row: 4
4th row: 4
5th row: 4

Common Values

Value                 Count  Frequency (%)
4                     97     97.0%
1                     3      3.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values, showing the same counts as the Common Values table]

Correlations

[Figure: correlation heatmap]

                  교육프로그램명  제안분류코드  교육분기
교육프로그램명    1.000           0.853         1.000
제안분류코드      0.853           1.000         0.426
교육분기          1.000           0.426         1.000

[Figure: correlation heatmap]

                  교육분기  제안분류코드
교육분기          1.000     0.283
제안분류코드      0.283     1.000

[Figure: correlation heatmap]

                  제안분류코드  교육분기
제안분류코드      1.000         0.283
교육분기          0.283         1.000
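
The labels naming which coefficient each matrix reports did not survive in this text, so the measures behind these values are not stated here. As an illustration only, the sketch below computes the plain (uncorrected) Cramér's V, one common measure of association between two categorical columns. It is an assumption that the report uses anything similar, and bias-corrected variants give different numbers, so this is not expected to reproduce 0.283. The function name and the toy data are hypothetical.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length label sequences."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))
    # Pearson chi-squared statistic over every cell of the contingency table.
    chi2 = 0.0
    for x, row_count in row_totals.items():
        for y, col_count in col_totals.items():
            expected = row_count * col_count / n
            chi2 += (observed[(x, y)] - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association
    return math.sqrt(chi2 / (n * k))

# Toy data (hypothetical): the code fully determines the quarter, then not at all.
perfect = cramers_v([1, 1, 2, 2], [4, 4, 1, 1])
independent = cramers_v([1, 1, 2, 2], [4, 1, 4, 1])
print(perfect, independent)  # 1.0 0.0
```

Values range from 0 (no association) to 1 (one column determines the other), and the measure is symmetric in its two arguments.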

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

Index  교육프로그램명  제안분류코드  교육분기
0      혈액검사의 이해1  1  4
1      10월 신규간호사 입사교육  4  4
2      10월 신규간호사 집체교육  4  4
3      혈액검사의 이해2  1  4
4      Impantable access system  1  4
5      암의 병태생리 및 항암화학요법의 이해  4  4
6      12월 신규간호사 입사 교육  4  4
7      11월 신규간호사 집체교육  4  4
8      배액관 관리  4  4
9      연명의료결정제도의 이해  3  4

Index  교육프로그램명  제안분류코드  교육분기
90     마음챙김  1  4
91     마음챙김  1  4
92     프리셉터 의사소통 모니터링 및 업무 지원  2  1
93     환자안전 증진 모니터링  4  1
94     환자의 권리 존중 및 보호  3  4
95     친절교육  1  4
96     친절교육  3  4
97     내시경 부비동 수술의 기본개념  1  4
98     친절교육  2  4
99     CPCR시뮬레이션  2  1

Duplicate rows

Most frequently occurring

Index  교육프로그램명  제안분류코드  교육분기  # duplicates
0      기본간호술기  1  4  4
1      마음챙김  1  4  3
3      의료정보시스템 활용 간호실무교육(수혈 항암) SBAR 및 인수인계  1  4  3
2      신규간호사 배치전 교육  1  4  2
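
The "Duplicate rows: 4" figure in the overview matches the number of distinct rows listed in this table, not the number of extra copies. The sketch below rebuilds those rows and counts both, assuming that "# duplicates" is the number of times each row occurs. `duplicate_groups` is a hypothetical helper, and the one non-duplicated row is taken from the sample rows for contrast.

```python
from collections import Counter

def duplicate_groups(rows):
    """Return {row: occurrences} for every row that occurs more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

# Rows rebuilt from the duplicate table: (program name, category code, quarter).
rows = (
    [("기본간호술기", 1, 4)] * 4
    + [("마음챙김", 1, 4)] * 3
    + [("의료정보시스템 활용 간호실무교육(수혈 항암) SBAR 및 인수인계", 1, 4)] * 3
    + [("신규간호사 배치전 교육", 1, 4)] * 2
    + [("배액관 관리", 4, 4)]  # occurs once, so it is not a duplicate
)

groups = duplicate_groups(rows)
n_groups = len(groups)                                # distinct duplicated rows
n_extra_copies = sum(n - 1 for n in groups.values())  # copies beyond the first
print(n_groups, n_extra_copies)  # 4 8
```

Under that assumption, removing duplicates while keeping the first occurrence of each row would drop 8 rows, even though the report counts 4.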