Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 400.4 KiB
Average record size in memory: 41.0 B
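
The average record size can be checked against the other two figures. The arithmetic below is a quick sanity check, assuming the report uses 1 KiB = 1024 bytes; the variable names are illustrative only.

```python
# Figures copied from the dataset statistics above.
total_kib = 400.4
n_rows = 10_000

# Assumption: 1 KiB = 1024 bytes.
avg_bytes = total_kib * 1024 / n_rows
print(f"{avg_bytes:.1f} B")  # 41.0 B
```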

Variable types

Text: 2
Categorical: 1
Boolean: 1

Dataset

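Description: A CSV file on the status of education content provided by the Korea Occupational Safety and Health Agency (한국산업안전보건공단), with columns including the education content code, education content description, lecture stage (level), and usage status.
Author: Korea Occupational Safety and Health Agency (한국산업안전보건공단)
URL: https://www.data.go.kr/data/15065588/fileData.do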

Alerts

강의단계 has constant value "1" (Constant)
교육콘텐츠 번호 has unique values (Unique)
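
The two alerts can be reproduced with a simple distinct-count rule. The sketch below is not ydata-profiling's own implementation; `column_alerts` is a hypothetical helper, and the three rows are taken from the sample at the end of this report.

```python
from collections import Counter


def column_alerts(columns):
    """Return {column name: alert} for constant and all-unique columns."""
    alerts = {}
    for name, values in columns.items():
        distinct = len(Counter(values))
        if distinct == 1:
            alerts[name] = "Constant"      # every row holds the same value
        elif distinct == len(values):
            alerts[name] = "Unique"        # no value is repeated
    return alerts


# Three rows copied from the sample section of this report.
sample = {
    "교육콘텐츠 번호": ["CW1211081601616809", "CW18060516236639654", "17432AA2FFELEMACZSQB"],
    "강의단계": ["1", "1", "1"],
    "사용여부": ["Y", "Y", "N"],
}
alerts = column_alerts(sample)
print(alerts)
```

On a sample this small a text column can look unique by chance, so the rule is only meaningful on the full 10000 rows.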

Reproduction

Analysis started: 2024-03-14 21:17:56.855741
Analysis finished: 2024-03-14 21:17:58.091088
Duration: 1.24 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
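
A report of this kind is typically generated along the following lines. This is a minimal sketch, not the configuration actually used here (that lives in config.json): the function name, the file paths, the report title, and the CP949 encoding are all assumptions.

```python
def build_report(csv_path, html_path):
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so that defining the function does not require the packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the file is CP949-encoded; adjust to the real encoding.
    df = pd.read_csv(csv_path, encoding="cp949")

    # The title is a placeholder, not the title of the original report.
    profile = ProfileReport(df, title="Education content status")
    profile.to_file(html_path)
    return profile
```

Calling `build_report("contents.csv", "report.html")` with hypothetical paths would write the HTML file; the timings and memory figures will differ from those listed above.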

Variables

교육콘텐츠 번호
Text

UNIQUE

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 20
Median length: 19
Mean length: 18.6809
Min length: 14
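
The length statistics are plain string-length summaries. As a small illustration, the sketch below applies the same summaries to the five sample rows of this variable only, so its results differ from the full-column figures; the mean over the full column equals total characters divided by rows (186809 / 10000 = 18.6809).

```python
from statistics import mean, median

# The five sample rows listed for this variable.
ids = [
    "CW1211081601616809",
    "CW18060516236639654",
    "CW16011914072837262",
    "CW17091409205135038",
    "17432AA2FFELEMACZSQB",
]

lengths = [len(s) for s in ids]
stats = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": mean(lengths),
    "min": min(lengths),
}
print(lengths, stats)
```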

Characters and Unicode

Total characters: 186809
Distinct characters: 36
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
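
As an illustration of such character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It covers categories only (the standard library exposes neither scripts nor blocks), and it returns the two-letter abbreviations, where `Lu` is Uppercase Letter and `Nd` is Decimal Number. `category_counts` is a hypothetical helper name.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count the Unicode general category of every character in `values`."""
    counts = Counter()
    for value in values:
        counts.update(unicodedata.category(ch) for ch in value)
    return counts


# Two identifiers taken from the sample rows of this report.
counts = category_counts(["CW1211081601616809", "17432AA2FFELEMACZSQB"])
print(counts)
```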

Unique

Unique: 10000
Unique (%): 100.0%

Sample

1st row: CW1211081601616809
2nd row: CW18060516236639654
3rd row: CW16011914072837262
4th row: CW17091409205135038
5th row: 17432AA2FFELEMACZSQB
Value | Count | Frequency (%)
cw1211081601616809 | 1 | < 0.1%
cw16092816125924422 | 1 | < 0.1%
cw16011914070937003 | 1 | < 0.1%
cw1211081601616615 | 1 | < 0.1%
cw15112013054215800 | 1 | < 0.1%
17080d6d30ewoyxqhfhq | 1 | < 0.1%
cw110131100045856 | 1 | < 0.1%
cw110125100042743 | 1 | < 0.1%
cw1305150902148291 | 1 | < 0.1%
cn160115143329 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Most occurring characters

Value | Count | Frequency (%)
1 | 36292 | 19.4%
0 | 21911 | 11.7%
2 | 16902 | 9.0%
3 | 11663 | 6.2%
6 | 11658 | 6.2%
4 | 11080 | 5.9%
8 | 10504 | 5.6%
7 | 10308 | 5.5%
C | 9715 | 5.2%
5 | 9364 | 5.0%
Other values (26) | 37412 | 20.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 147887 | 79.2%
Uppercase Letter | 38922 | 20.8%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
C | 9715 | 25.0%
W | 8175 | 21.0%
E | 1864 | 4.8%
F | 1826 | 4.7%
A | 1587 | 4.1%
D | 1534 | 3.9%
B | 1532 | 3.9%
N | 1271 | 3.3%
L | 678 | 1.7%
Q | 657 | 1.7%
Other values (16) | 10083 | 25.9%
Decimal Number
Value | Count | Frequency (%)
1 | 36292 | 24.5%
0 | 21911 | 14.8%
2 | 16902 | 11.4%
3 | 11663 | 7.9%
6 | 11658 | 7.9%
4 | 11080 | 7.5%
8 | 10504 | 7.1%
7 | 10308 | 7.0%
5 | 9364 | 6.3%
9 | 8205 | 5.5%

Most occurring scripts

Value | Count | Frequency (%)
Common | 147887 | 79.2%
Latin | 38922 | 20.8%

Most frequent character per script

Latin
Value | Count | Frequency (%)
C | 9715 | 25.0%
W | 8175 | 21.0%
E | 1864 | 4.8%
F | 1826 | 4.7%
A | 1587 | 4.1%
D | 1534 | 3.9%
B | 1532 | 3.9%
N | 1271 | 3.3%
L | 678 | 1.7%
Q | 657 | 1.7%
Other values (16) | 10083 | 25.9%
Common
Value | Count | Frequency (%)
1 | 36292 | 24.5%
0 | 21911 | 14.8%
2 | 16902 | 11.4%
3 | 11663 | 7.9%
6 | 11658 | 7.9%
4 | 11080 | 7.5%
8 | 10504 | 7.1%
7 | 10308 | 7.0%
5 | 9364 | 6.3%
9 | 8205 | 5.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 186809 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 36292 | 19.4%
0 | 21911 | 11.7%
2 | 16902 | 9.0%
3 | 11663 | 6.2%
6 | 11658 | 6.2%
4 | 11080 | 5.9%
8 | 10504 | 5.6%
7 | 10308 | 5.5%
C | 9715 | 5.2%
5 | 9364 | 5.0%
Other values (26) | 37412 | 20.0%

교육콘텐츠명
Text

Distinct: 2047
Distinct (%): 20.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 97
Median length: 76
Mean length: 14.7877
Min length: 2

Characters and Unicode

Total characters: 147877
Distinct characters: 532
Distinct categories: 13
Distinct scripts: 3
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 684
Unique (%): 6.8%

Sample

1st row: 직무 스트레스
2nd row: 거푸집동바리 작업안전
3rd row: 공정안전자료2(안전밸브)
4th row: 산업안전보건의 의의 및 산업안전보건법령 개요(new)
5th row: 안전보건관리체제
Value | Count | Frequency (%)
(not shown) | 2313 | 7.5%
관리감독자의 | 350 | 1.1%
안전 | 318 | 1.0%
보호구 | 315 | 1.0%
역할과 | 279 | 0.9%
안전관리의 | 279 | 0.9%
목적 | 279 | 0.9%
작업안전 | 265 | 0.9%
위험성평가 | 262 | 0.9%
책임 | 250 | 0.8%
Other values (2604) | 25827 | 84.0%

Most occurring characters

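Value | Count | Frequency (%)
(space) | 20777 | 14.1%
(not shown) | 5640 | 3.8%
(not shown) | 4645 | 3.1%
(not shown) | 4272 | 2.9%
) | 3291 | 2.2%
( | 3291 | 2.2%
(not shown) | 2774 | 1.9%
(not shown) | 2364 | 1.6%
(not shown) | 2314 | 1.6%
(not shown) | 2280 | 1.5%
Other values (522) | 96229 | 65.1%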

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 109733 | 74.2%
Space Separator | 20777 | 14.1%
Lowercase Letter | 5792 | 3.9%
Close Punctuation | 3302 | 2.2%
Open Punctuation | 3302 | 2.2%
Decimal Number | 2345 | 1.6%
Uppercase Letter | 1220 | 0.8%
Other Punctuation | 998 | 0.7%
Connector Punctuation | 181 | 0.1%
Dash Punctuation | 137 | 0.1%
Other values (3) | 90 | 0.1%

Most frequent character per category

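Other Letter
Value | Count | Frequency (%)
(not shown) | 5640 | 5.1%
(not shown) | 4645 | 4.2%
(not shown) | 4272 | 3.9%
(not shown) | 2774 | 2.5%
(not shown) | 2364 | 2.2%
(not shown) | 2314 | 2.1%
(not shown) | 2280 | 2.1%
(not shown) | 2180 | 2.0%
(not shown) | 2117 | 1.9%
(not shown) | 2030 | 1.8%
Other values (443) | 79117 | 72.1%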
Lowercase Letter
Value | Count | Frequency (%)
e | 806 | 13.9%
n | 644 | 11.1%
a | 584 | 10.1%
t | 540 | 9.3%
d | 321 | 5.5%
s | 316 | 5.5%
o | 306 | 5.3%
l | 305 | 5.3%
i | 295 | 5.1%
r | 232 | 4.0%
Other values (16) | 1443 | 24.9%
Uppercase Letter
Value | Count | Frequency (%)
S | 201 | 16.5%
M | 133 | 10.9%
P | 104 | 8.5%
A | 95 | 7.8%
H | 88 | 7.2%
T | 82 | 6.7%
I | 80 | 6.6%
O | 76 | 6.2%
D | 65 | 5.3%
V | 47 | 3.9%
Other values (14) | 249 | 20.4%
Decimal Number
Value | Count | Frequency (%)
1 | 923 | 39.4%
2 | 684 | 29.2%
3 | 216 | 9.2%
0 | 139 | 5.9%
4 | 100 | 4.3%
5 | 97 | 4.1%
8 | 81 | 3.5%
6 | 45 | 1.9%
7 | 35 | 1.5%
9 | 25 | 1.1%
Other Punctuation
Value | Count | Frequency (%)
, | 409 | 41.0%
: | 250 | 25.1%
· | 247 | 24.7%
! | 51 | 5.1%
/ | 15 | 1.5%
& | 14 | 1.4%
. | 12 | 1.2%
Letter Number
Value | Count | Frequency (%)
(not shown) | 12 | 54.5%
(not shown) | 7 | 31.8%
(not shown) | 3 | 13.6%
Close Punctuation
Value | Count | Frequency (%)
) | 3291 | 99.7%
] | 11 | 0.3%
Open Punctuation
Value | Count | Frequency (%)
( | 3291 | 99.7%
[ | 11 | 0.3%
Space Separator
Value | Count | Frequency (%)
(space) | 20777 | 100.0%
Connector Punctuation
Value | Count | Frequency (%)
_ | 181 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 137 | 100.0%
Control
Value | Count | Frequency (%)
(not shown) | 62 | 100.0%
Final Punctuation
Value | Count | Frequency (%)
(not shown) | 6 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 109733 | 74.2%
Common | 31110 | 21.0%
Latin | 7034 | 4.8%

Most frequent character per script

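Hangul
Value | Count | Frequency (%)
(not shown) | 5640 | 5.1%
(not shown) | 4645 | 4.2%
(not shown) | 4272 | 3.9%
(not shown) | 2774 | 2.5%
(not shown) | 2364 | 2.2%
(not shown) | 2314 | 2.1%
(not shown) | 2280 | 2.1%
(not shown) | 2180 | 2.0%
(not shown) | 2117 | 1.9%
(not shown) | 2030 | 1.8%
Other values (443) | 79117 | 72.1%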
Latin
Value | Count | Frequency (%)
e | 806 | 11.5%
n | 644 | 9.2%
a | 584 | 8.3%
t | 540 | 7.7%
d | 321 | 4.6%
s | 316 | 4.5%
o | 306 | 4.4%
l | 305 | 4.3%
i | 295 | 4.2%
r | 232 | 3.3%
Other values (43) | 2685 | 38.2%
Common
Value | Count | Frequency (%)
(space) | 20777 | 66.8%
) | 3291 | 10.6%
( | 3291 | 10.6%
1 | 923 | 3.0%
2 | 684 | 2.2%
, | 409 | 1.3%
: | 250 | 0.8%
· | 247 | 0.8%
3 | 216 | 0.7%
_ | 181 | 0.6%
Other values (16) | 841 | 2.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 109617 | 74.1%
ASCII | 37869 | 25.6%
None | 247 | 0.2%
Compat Jamo | 116 | 0.1%
Number Forms | 22 | < 0.1%
Punctuation | 6 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 20777 | 54.9%
) | 3291 | 8.7%
( | 3291 | 8.7%
1 | 923 | 2.4%
e | 806 | 2.1%
2 | 684 | 1.8%
n | 644 | 1.7%
a | 584 | 1.5%
t | 540 | 1.4%
, | 409 | 1.1%
Other values (64) | 5920 | 15.6%
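Hangul
Value | Count | Frequency (%)
(not shown) | 5640 | 5.1%
(not shown) | 4645 | 4.2%
(not shown) | 4272 | 3.9%
(not shown) | 2774 | 2.5%
(not shown) | 2364 | 2.2%
(not shown) | 2314 | 2.1%
(not shown) | 2280 | 2.1%
(not shown) | 2180 | 2.0%
(not shown) | 2117 | 1.9%
(not shown) | 2030 | 1.9%
Other values (442) | 79001 | 72.1%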
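None
Value | Count | Frequency (%)
· | 247 | 100.0%
Compat Jamo
Value | Count | Frequency (%)
(not shown) | 116 | 100.0%
Number Forms
Value | Count | Frequency (%)
(not shown) | 12 | 54.5%
(not shown) | 7 | 31.8%
(not shown) | 3 | 13.6%
Punctuation
Value | Count | Frequency (%)
(not shown) | 6 | 100.0%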

강의단계
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
1: 10000

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
1 | 10000 | 100.0%

사용여부
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 87.9 KiB
True: 8738
False: 1262
Value | Count | Frequency (%)
True | 8738 | 87.4%
False | 1262 | 12.6%
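
The sample rows at the end of this report show 사용여부 as Y/N, while the profile reports True/False. The sketch below assumes the straightforward mapping Y to True and N to False (an assumption, not confirmed by the report itself) and recomputes the percentages from the counts above.

```python
from collections import Counter

# Assumption: "Y" is read as True and "N" as False.
FLAG_TO_BOOL = {"Y": True, "N": False}

# 사용여부 values of the first ten sample rows in this report.
flags = ["Y", "Y", "Y", "Y", "N", "Y", "Y", "Y", "Y", "Y"]
sample_counts = Counter(FLAG_TO_BOOL[f] for f in flags)

# Full-column counts copied from the table above.
full_counts = {True: 8738, False: 1262}
total = sum(full_counts.values())
shares = {value: round(100 * n / total, 1) for value, n in full_counts.items()}
print(sample_counts, shares)
```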

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
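
The per-column counts behind a nullity chart can be computed with the standard library alone. This is a minimal sketch: `missing_per_column` is a hypothetical helper, and treating empty or whitespace-only fields as missing is an assumption. The two data rows are taken from the sample in this report.

```python
import csv
import io


def missing_per_column(csv_text):
    """Count empty or absent fields per column in CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = {name: 0 for name in reader.fieldnames}
    for row in reader:
        for name in reader.fieldnames:
            value = row[name]
            # Assumption: an absent or blank field counts as missing.
            if value is None or value.strip() == "":
                missing[name] += 1
    return missing


csv_text = (
    "교육콘텐츠 번호,교육콘텐츠명,강의단계,사용여부\n"
    "CW1211081601616809,직무 스트레스,1,Y\n"
    "17432AA2FFELEMACZSQB,안전보건관리체제,1,N\n"
)
result = missing_per_column(csv_text)
print(result)
```

Every count is zero for these rows, in line with the 0 missing cells reported in the overview.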

Sample

(index) | 교육콘텐츠 번호 | 교육콘텐츠명 | 강의단계 | 사용여부
9287 | CW1211081601616809 | 직무 스트레스 | 1 | Y
36520 | CW18060516236639654 | 거푸집동바리 작업안전 | 1 | Y
15576 | CW16011914072837262 | 공정안전자료2(안전밸브) | 1 | Y
33024 | CW17091409205135038 | 산업안전보건의 의의 및 산업안전보건법령 개요(new) | 1 | Y
47837 | 17432AA2FFELEMACZSQB | 안전보건관리체제 | 1 | N
15442 | CW16011914072618661 | 연소이론 | 1 | Y
27092 | CW17050210180431849 | 사다리 안전작업 | 1 | Y
24336 | CW16110315138726218 | 응급처치 요령 | 1 | Y
18032 | CW15112017055215870 | 항만하역 안전작업 및 하역시설 사용안전 | 1 | Y
18282 | CW15120809068016649 | 선박건조ㆍ수리업 지원부문 안전작업(1) | 1 | Y

(index) | 교육콘텐츠 번호 | 교육콘텐츠명 | 강의단계 | 사용여부
30835 | CW18032523224938204 | 사업장 내 안전 커뮤니케이션 | 1 | Y
24092 | CW16102418131025198 | 재해 발생 원인과 예방대책 | 1 | Y
849 | CN110418140772 | 이행상태 평가 개론(무인트로) | 1 | N
25183 | CW16110117138526141 | 사고극복사례1 | 1 | Y
6391 | CW1105131300651533 | 공정안전자료1(공정도면과 주요설비)(무인트로) | 1 | Y
42356 | 168CBC16BA2IEQQIGIIJ | Understanding and appropriating the system of occupational safety and health management expense | 1 | Y
19911 | CW16090911119623966 | 중량물 취급 | 1 | Y
37766 | 167ED66AFF7SELLZEWHY | 기계안전의 개요와 위험성 및 기본원칙 | 1 | N
36881 | CW18052314233639289 | 발파공사 안전 | 1 | Y
5070 | CW1211081601615803 | 안전활동대화기법(추천8) | 1 | Y