Overview

Dataset statistics

Number of variables: 3
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 4
Duplicate rows (%): 4.0%
Total size in memory: 2.7 KiB
Average record size in memory: 27.3 B

Variable types

Text: 1
Categorical: 2

Dataset

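Description: Suggestions submitted by trainees at medical care institutions (요양기관) about their education programs. Columns: education program name, suggestion category code, education quarter.
Variable layout:
1. 교육프로그램명 (education program name): name of the education program
2. 제안분류코드 (suggestion category code): 1 = excellent (우수), 2 = suggestion (제안), 3 = opinion (의견), 4 = improvement plan (개선방안)
3. 교육분기 (education quarter): 1 = Q1, 2 = Q2, 3 = Q3, 4 = Q4
URL: https://www.data.go.kr/data/15120952/fileData.do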
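
The description defines small code tables for 제안분류코드 and 교육분기. A minimal sketch of decoding them follows; the English labels are glosses chosen for this example and `decode_row` is a hypothetical helper, not part of the dataset or of any library.

```python
# Code-to-label lookups taken from the dataset description.
# The English labels are this sketch's own glosses, not official names.
SUGGESTION_CATEGORY = {
    1: "excellent",         # 우수
    2: "suggestion",        # 제안
    3: "opinion",           # 의견
    4: "improvement plan",  # 개선방안
}
QUARTER = {1: "Q1", 2: "Q2", 3: "Q3", 4: "Q4"}


def decode_row(program_name, category_code, quarter_code):
    """Return one row with human-readable labels (hypothetical helper)."""
    return {
        "교육프로그램명": program_name,
        "제안분류코드": SUGGESTION_CATEGORY[int(category_code)],
        "교육분기": QUARTER[int(quarter_code)],
    }


decoded = decode_row("CPCR 교육", "1", "4")
print(decoded)
```

The codes are passed through `int()` because the report profiles both code columns as categorical, which suggests they may be stored as strings.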

Alerts

Duplicates: Dataset has 4 (4.0%) duplicate rows
Imbalance: 교육분기 is highly imbalanced (80.6%)
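
The 80.6% imbalance figure is consistent with a normalized-entropy score, 1 - H / log2(k), applied to the 교육분기 counts reported later (97 and 3). The sketch below assumes that formula; `imbalance_score` is a name chosen for this example, not necessarily the profiler's own function.

```python
import math


def imbalance_score(counts):
    """1 - (Shannon entropy / maximum entropy); 0 = balanced, 1 = one class only."""
    n_classes = len(counts)
    if n_classes <= 1:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# 교육분기 value counts from this report: value 4 occurs 97 times, value 1 occurs 3 times.
score = imbalance_score([97, 3])
print(f"{score:.1%}")  # 80.6%
```

A perfectly balanced column scores 0, so a value near 1 means that almost all rows fall into a single class.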

Reproduction

Analysis started: 2023-12-12 10:27:51.959759
Analysis finished: 2023-12-12 10:27:52.324723
Duration: 0.36 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

교육프로그램명
Text

Distinct: 87
Distinct (%): 87.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 43
Median length: 35
Mean length: 15.79
Min length: 4

Characters and Unicode

Total characters: 1579
Distinct characters: 223
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
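
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for one program name from this dataset; it shows the idea only and is not the profiler's own code.

```python
import unicodedata
from collections import Counter

# One 교육프로그램명 value from the sample rows of this report.
text = "22년 11월 3일 4일 11일 17일 호흡기계 흡입기 약물(간호사)"

# Two-letter general categories, e.g. "Lo" = Other Letter, "Nd" = Decimal Number,
# "Zs" = Space Separator, "Ps"/"Pe" = Open/Close Punctuation.
category_counts = Counter(unicodedata.category(ch) for ch in text)

for category, count in category_counts.most_common():
    print(category, count)
```

Hangul syllables fall under "Lo" (Other Letter), which is why that category dominates the report's category table. The standard library does not expose script or block properties, so those would need a third-party package.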

Unique

Unique: 79
Unique (%): 79.0%
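
The report gives 87 distinct values but only 79 unique ones. The usual reading is that "distinct" counts different values, while "unique" counts values that occur exactly once. A small sketch of that distinction, using a made-up list of program names rather than the real column:

```python
from collections import Counter

# Illustrative values only; not the full 100-row column.
values = ["기본간호술기", "마음챙김", "기본간호술기", "CPCR 교육", "배액관 관리"]

counts = Counter(values)
n_distinct = len(counts)                               # different values
n_unique = sum(1 for c in counts.values() if c == 1)   # values seen exactly once

print(n_distinct, n_unique)  # 4 3
```

Under this reading, the gap between 87 and 79 means that 8 program names appear more than once.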

Sample

1st row: 마음챙김
2nd row: CPCR 교육
3rd row: 22년 11월 간호간병통합서비스병동 간호조무사 배치후 교육
4th row: 22년 11월 3일 4일 11일 17일 호흡기계 흡입기 약물(간호사)
5th row: Impantable access system
Value | Count | Frequency (%)
교육 | 19 | 6.2%
22년 | 15 | 4.9%
신규간호사 | 11 | 3.6%
[not preserved] | 10 | 3.2%
간호조무사 | 5 | 1.6%
11월 | 5 | 1.6%
기본간호술기 | 4 | 1.3%
간호간병 | 4 | 1.3%
이해 | 4 | 1.3%
간호사 | 4 | 1.3%
Other values (170) | 227 | 73.7%
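
The percentages in this word table are consistent with each count divided by the total number of tokens (for example, 19 of 308 is about 6.2%). The sketch below reproduces that kind of table with a plain whitespace split, which is an assumption; the profiler's actual tokenization may differ, for instance in how it treats punctuation.

```python
from collections import Counter

# Three program names taken from the sample rows of this report.
names = [
    "CPCR 교육",
    "22년 11월 간호간병통합서비스병동 간호조무사 배치후 교육",
    "22년 11월 8일 15일 수혈 (2~3년차 교육)",
]

# Assumed tokenization: split on whitespace, keep punctuation attached.
tokens = [tok for name in names for tok in name.split()]
freq = Counter(tokens)
total = sum(freq.values())

for word, count in freq.most_common(3):
    print(word, count, f"{count / total:.1%}")
```

With this naive split, "교육)" counts as a different token from "교육"; stripping punctuation first would merge them.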

Most occurring characters

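Value | Count | Frequency (%)
(space) | 234 | 14.8%
2 | 72 | 4.6%
[not preserved] | 66 | 4.2%
1 | 64 | 4.1%
[not preserved] | 53 | 3.4%
[not preserved] | 47 | 3.0%
[not preserved] | 47 | 3.0%
[not preserved] | 41 | 2.6%
[not preserved] | 30 | 1.9%
[not preserved] | 26 | 1.6%
Other values (213) | 899 | 56.9%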

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1013 | 64.2%
Space Separator | 234 | 14.8%
Decimal Number | 182 | 11.5%
Uppercase Letter | 44 | 2.8%
Lowercase Letter | 39 | 2.5%
Close Punctuation | 25 | 1.6%
Open Punctuation | 25 | 1.6%
Other Punctuation | 11 | 0.7%
Math Symbol | 4 | 0.3%
Dash Punctuation | 2 | 0.1%

Most frequent character per category

Other Letter
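Value | Count | Frequency (%)
[not preserved] | 66 | 6.5%
[not preserved] | 53 | 5.2%
[not preserved] | 47 | 4.6%
[not preserved] | 47 | 4.6%
[not preserved] | 41 | 4.0%
[not preserved] | 30 | 3.0%
[not preserved] | 26 | 2.6%
[not preserved] | 21 | 2.1%
[not preserved] | 19 | 1.9%
[not preserved] | 18 | 1.8%
Other values (165) | 645 | 63.7%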
Lowercase Letter
Value | Count | Frequency (%)
e | 7 | 17.9%
a | 5 | 12.8%
s | 4 | 10.3%
w | 3 | 7.7%
n | 3 | 7.7%
t | 3 | 7.7%
l | 3 | 7.7%
r | 2 | 5.1%
m | 2 | 5.1%
c | 2 | 5.1%
Other values (5) | 5 | 12.8%
Uppercase Letter
Value | Count | Frequency (%)
C | 7 | 15.9%
R | 7 | 15.9%
A | 6 | 13.6%
S | 5 | 11.4%
P | 5 | 11.4%
B | 3 | 6.8%
D | 2 | 4.5%
L | 2 | 4.5%
I | 2 | 4.5%
O | 1 | 2.3%
Other values (4) | 4 | 9.1%
Decimal Number
Value | Count | Frequency (%)
2 | 72 | 39.6%
1 | 64 | 35.2%
0 | 16 | 8.8%
3 | 6 | 3.3%
8 | 6 | 3.3%
7 | 5 | 2.7%
6 | 4 | 2.2%
9 | 4 | 2.2%
4 | 4 | 2.2%
5 | 1 | 0.5%
Other Punctuation
Value | Count | Frequency (%)
. | 8 | 72.7%
/ | 1 | 9.1%
: | 1 | 9.1%
& | 1 | 9.1%
Space Separator
Value | Count | Frequency (%)
(space) | 234 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 25 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 25 | 100.0%
Math Symbol
Value | Count | Frequency (%)
~ | 4 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1013 | 64.2%
Common | 483 | 30.6%
Latin | 83 | 5.3%

Most frequent character per script

Hangul
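Value | Count | Frequency (%)
[not preserved] | 66 | 6.5%
[not preserved] | 53 | 5.2%
[not preserved] | 47 | 4.6%
[not preserved] | 47 | 4.6%
[not preserved] | 41 | 4.0%
[not preserved] | 30 | 3.0%
[not preserved] | 26 | 2.6%
[not preserved] | 21 | 2.1%
[not preserved] | 19 | 1.9%
[not preserved] | 18 | 1.8%
Other values (165) | 645 | 63.7%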
Latin
Value | Count | Frequency (%)
C | 7 | 8.4%
e | 7 | 8.4%
R | 7 | 8.4%
A | 6 | 7.2%
S | 5 | 6.0%
P | 5 | 6.0%
a | 5 | 6.0%
s | 4 | 4.8%
w | 3 | 3.6%
B | 3 | 3.6%
Other values (19) | 31 | 37.3%
Common
Value | Count | Frequency (%)
(space) | 234 | 48.4%
2 | 72 | 14.9%
1 | 64 | 13.3%
) | 25 | 5.2%
( | 25 | 5.2%
0 | 16 | 3.3%
. | 8 | 1.7%
3 | 6 | 1.2%
8 | 6 | 1.2%
7 | 5 | 1.0%
Other values (9) | 22 | 4.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1013 | 64.2%
ASCII | 566 | 35.8%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 234 | 41.3%
2 | 72 | 12.7%
1 | 64 | 11.3%
) | 25 | 4.4%
( | 25 | 4.4%
0 | 16 | 2.8%
. | 8 | 1.4%
C | 7 | 1.2%
e | 7 | 1.2%
R | 7 | 1.2%
Other values (38) | 101 | 17.8%
Hangul
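Value | Count | Frequency (%)
[not preserved] | 66 | 6.5%
[not preserved] | 53 | 5.2%
[not preserved] | 47 | 4.6%
[not preserved] | 47 | 4.6%
[not preserved] | 41 | 4.0%
[not preserved] | 30 | 3.0%
[not preserved] | 26 | 2.6%
[not preserved] | 21 | 2.1%
[not preserved] | 19 | 1.9%
[not preserved] | 18 | 1.8%
Other values (165) | 645 | 63.7%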

제안분류코드
Categorical

Distinct: 4
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
1 | 49
4 | 22
3 | 18
2 | 11

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 3
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 49 | 49.0%
4 | 22 | 22.0%
3 | 18 | 18.0%
2 | 11 | 11.0%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: bar chart of common values (image not preserved)

교육분기
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
4 | 97
1 | 3

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4
2nd row: 4
3rd row: 4
4th row: 4
5th row: 4

Common Values

Value | Count | Frequency (%)
4 | 97 | 97.0%
1 | 3 | 3.0%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: bar chart of common values (image not preserved)

Correlations

Figure: correlation heatmap (image not preserved)

 | 교육프로그램명 | 제안분류코드 | 교육분기
교육프로그램명 | 1.000 | 0.853 | 1.000
제안분류코드 | 0.853 | 1.000 | 0.426
교육분기 | 1.000 | 0.426 | 1.000

Figure: correlation heatmap (image not preserved)

 | 교육분기 | 제안분류코드
교육분기 | 1.000 | 0.283
제안분류코드 | 0.283 | 1.000

Figure: correlation heatmap (image not preserved)

 | 제안분류코드 | 교육분기
제안분류코드 | 1.000 | 0.283
교육분기 | 0.283 | 1.000

Missing values

Figure: A simple visualization of nullity by column.
Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.

Sample

 | 교육프로그램명 | 제안분류코드 | 교육분기
0 | 마음챙김 | 1 | 4
1 | CPCR 교육 | 1 | 4
2 | 22년 11월 간호간병통합서비스병동 간호조무사 배치후 교육 | 1 | 4
3 | 22년 11월 3일 4일 11일 17일 호흡기계 흡입기 약물(간호사) | 3 | 4
4 | Impantable access system | 1 | 4
5 | 암의 병태생리 및 항암화학요법의 이해 | 4 | 4
6 | 22년 11월 8일 15일 수혈 (2~3년차 교육) | 1 | 4
7 | 22년 11월11일 18일 민원사례공유(통합병동 간호조무사) | 4 | 4
8 | 22년 11월11일 18일 민원사례공유(통합병동 지원인력 교육) | 4 | 4
9 | 22년 11월23일 신경계사정 및 급성 뇌질환의 증상(간호사 직무교육) | 4 | 4

 | 교육프로그램명 | 제안분류코드 | 교육분기
90 | 기본간호술기 | 1 | 4
91 | 1000 with U | 1 | 4
92 | 의료정보시스템 활용 간호실무교육(수혈 항암) SBAR 및 인수인계 | 1 | 4
93 | 기본간호술기 | 1 | 4
94 | 안전한투약 고주의약물 | 3 | 4
95 | 위관영양과 I/O관리 | 1 | 4
96 | 혈액검사의 이해 | 4 | 4
97 | 배액관 관리 | 1 | 4
98 | 약물용량계산 | 3 | 4
99 | 응급상황 대처능력 향상 교육 | 3 | 4

Duplicate rows

Most frequently occurring

 | 교육프로그램명 | 제안분류코드 | 교육분기 | # duplicates
0 | 기본간호술기 | 1 | 4 | 4
1 | 마음챙김 | 1 | 4 | 3
3 | 의료정보시스템 활용 간호실무교육(수혈 항암) SBAR 및 인수인계 | 1 | 4 | 3
2 | 신규간호사 배치전 교육 | 1 | 4 | 2
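
The table above lists each repeated row together with how many times it occurs. A minimal sketch of producing such a listing with the standard library follows; the rows are a small made-up subset for illustration, not the full dataset.

```python
from collections import Counter

# Illustrative rows as (교육프로그램명, 제안분류코드, 교육분기) tuples.
rows = [
    ("기본간호술기", "1", "4"),
    ("마음챙김", "1", "4"),
    ("기본간호술기", "1", "4"),
    ("배액관 관리", "1", "4"),
    ("마음챙김", "1", "4"),
    ("기본간호술기", "1", "4"),
]

row_counts = Counter(rows)
# Keep only rows that occur more than once, most frequent first.
duplicates = {row: n for row, n in row_counts.most_common() if n > 1}

for row, n in duplicates.items():
    print(row, n)
```

Rows are compared as whole tuples, so two records count as duplicates only when all three columns match.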