Overview

Dataset statistics

Number of variables: 3
Number of observations: 8485
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 609
Duplicate rows (%): 7.2%
Total size in memory: 207.3 KiB
Average record size in memory: 25.0 B

Variable types

Text: 1
Categorical: 1
Numeric: 1

Dataset

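Description: Provides LMS semester evaluation data for the Smart Vocational Training Platform (STEP) of the Online Lifelong Education Institute at Korea University of Technology and Education (한국기술교육대학교).
Author: Korea University of Technology and Education (한국기술교육대학교)
URL: https://www.data.go.kr/data/15090865/fileData.do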

Alerts

Duplicates: Dataset has 609 (7.2%) duplicate rows
Zeros: 가중치 (weight) has 6790 (80.0%) zeros
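Both alert percentages follow from the counts in the dataset statistics divided by the 8485 observations. A minimal sketch of that arithmetic, using only the Python standard library; the helper name `share` is hypothetical and is not part of ydata-profiling:

```python
def share(count: int, total: int) -> str:
    """Format count/total as a percentage with one decimal place."""
    return f"{100 * count / total:.1f}%"


n_rows = 8485         # number of observations reported in this overview
duplicate_rows = 609  # duplicate rows reported in this overview
zero_weights = 6790   # zeros in 가중치 reported in the alerts

print("duplicate rows:", share(duplicate_rows, n_rows))
print("zero weights:", share(zero_weights, n_rows))
```

Rounding to one decimal place mirrors how the report displays the figures; the report's own rounding rule is assumed, not confirmed.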

Reproduction

Analysis started: 2023-12-12 11:13:34.067871
Analysis finished: 2023-12-12 11:13:34.905436
Duration: 0.84 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

학기 아이디 (semester ID)
Text

Distinct: 1015
Distinct (%): 12.0%
Missing: 0
Missing (%): 0.0%
Memory size: 66.4 KiB

Length

Max length: 55
Median length: 29
Mean length: 12.362169
Min length: 2

Characters and Unicode

Total characters: 104893
Distinct characters: 368
Distinct categories: 12
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
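As an illustration of how such character properties can be read, the sketch below counts characters by Unicode general category with Python's standard `unicodedata` module. The function name `category_counts` is hypothetical, and the report itself may rely on a different library to derive categories, scripts, and blocks.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category code (e.g. 'Lo', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)


# One 학기 아이디 value taken from the sample rows of this report.
sample = "2023-4차(유라코퍼레이션)"
counts = category_counts(sample)

# 'Lo' = Other Letter, 'Nd' = Decimal Number, 'Pd' = Dash Punctuation,
# 'Ps' = Open Punctuation, 'Pe' = Close Punctuation.
for code, count in counts.most_common():
    print(code, count)
```

The two-letter codes map to the category names used in the tables of this section, for example `Lo` to Other Letter and `Nd` to Decimal Number.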

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 상시
2nd row: 상시
3rd row: 상시
4th row: 상시
5th row: 상시
Value | Count | Frequency (%)
직무 | 270 | 2.2%
핵심 | 258 | 2.1%
삼성협력사 | 234 | 1.9%
1기 | 227 | 1.8%
2020-1 | 210 | 1.7%
주)광진 | 150 | 1.2%
2기 | 142 | 1.1%
3기 | 112 | 0.9%
2022-1 | 112 | 0.9%
(not shown) | 105 | 0.8%
Other values (1025) | 10621 | 85.4%

Most occurring characters

Value | Count | Frequency (%)
2 | 12855 | 12.3%
0 | 8352 | 8.0%
1 | 7007 | 6.7%
( | 6043 | 5.8%
) | 6043 | 5.8%
- | 6022 | 5.7%
(space) | 3997 | 3.8%
(not shown) | 3652 | 3.5%
(not shown) | 3143 | 3.0%
3 | 1961 | 1.9%
Other values (358) | 45818 | 43.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 44564 | 42.5%
Decimal Number | 35692 | 34.0%
Open Punctuation | 6084 | 5.8%
Close Punctuation | 6084 | 5.8%
Dash Punctuation | 6022 | 5.7%
Space Separator | 3997 | 3.8%
Uppercase Letter | 1749 | 1.7%
Lowercase Letter | 456 | 0.4%
Other Punctuation | 131 | 0.1%
Connector Punctuation | 60 | 0.1%
Other values (2) | 54 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not shown) | 3652 | 8.2%
(not shown) | 3143 | 7.1%
(not shown) | 1430 | 3.2%
(not shown) | 1335 | 3.0%
(not shown) | 1274 | 2.9%
(not shown) | 1211 | 2.7%
(not shown) | 1203 | 2.7%
(not shown) | 1189 | 2.7%
(not shown) | 876 | 2.0%
(not shown) | 739 | 1.7%
Other values (299) | 28512 | 64.0%

Uppercase Letter
Value | Count | Frequency (%)
S | 328 | 18.8%
P | 248 | 14.2%
L | 216 | 12.3%
I | 165 | 9.4%
C | 154 | 8.8%
T | 150 | 8.6%
K | 105 | 6.0%
A | 101 | 5.8%
H | 55 | 3.1%
B | 36 | 2.1%
Other values (11) | 191 | 10.9%

Lowercase Letter
Value | Count | Frequency (%)
t | 99 | 21.7%
a | 77 | 16.9%
s | 63 | 13.8%
b | 63 | 13.8%
e | 49 | 10.7%
i | 35 | 7.7%
m | 21 | 4.6%
o | 21 | 4.6%
c | 7 | 1.5%
n | 7 | 1.5%
Other values (2) | 14 | 3.1%

Decimal Number
Value | Count | Frequency (%)
2 | 12855 | 36.0%
0 | 8352 | 23.4%
1 | 7007 | 19.6%
3 | 1961 | 5.5%
9 | 1243 | 3.5%
8 | 1185 | 3.3%
7 | 937 | 2.6%
4 | 819 | 2.3%
5 | 703 | 2.0%
6 | 630 | 1.8%

Other Punctuation
Value | Count | Frequency (%)
% | 35 | 26.7%
· | 28 | 21.4%
/ | 27 | 20.6%
. | 14 | 10.7%
, | 14 | 10.7%
& | 7 | 5.3%
! | 6 | 4.6%

Open Punctuation
Value | Count | Frequency (%)
( | 6043 | 99.3%
[ | 41 | 0.7%

Close Punctuation
Value | Count | Frequency (%)
) | 6043 | 99.3%
] | 41 | 0.7%

Dash Punctuation
Value | Count | Frequency (%)
- | 6022 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 3997 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 60 | 100.0%

Math Symbol
Value | Count | Frequency (%)
+ | 42 | 100.0%

Other Symbol
Value | Count | Frequency (%)
(not shown) | 12 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 58112 | 55.4%
Hangul | 44576 | 42.5%
Latin | 2205 | 2.1%

Most frequent character per script

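Hangul
Value | Count | Frequency (%)
(not shown) | 3652 | 8.2%
(not shown) | 3143 | 7.1%
(not shown) | 1430 | 3.2%
(not shown) | 1335 | 3.0%
(not shown) | 1274 | 2.9%
(not shown) | 1211 | 2.7%
(not shown) | 1203 | 2.7%
(not shown) | 1189 | 2.7%
(not shown) | 876 | 2.0%
(not shown) | 739 | 1.7%
Other values (300) | 28524 | 64.0%

Latin
Value | Count | Frequency (%)
S | 328 | 14.9%
P | 248 | 11.2%
L | 216 | 9.8%
I | 165 | 7.5%
C | 154 | 7.0%
T | 150 | 6.8%
K | 105 | 4.8%
A | 101 | 4.6%
t | 99 | 4.5%
a | 77 | 3.5%
Other values (23) | 562 | 25.5%

Common
Value | Count | Frequency (%)
2 | 12855 | 22.1%
0 | 8352 | 14.4%
1 | 7007 | 12.1%
( | 6043 | 10.4%
) | 6043 | 10.4%
- | 6022 | 10.4%
(space) | 3997 | 6.9%
3 | 1961 | 3.4%
9 | 1243 | 2.1%
8 | 1185 | 2.0%
Other values (15) | 3404 | 5.9%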

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 60289 | 57.5%
Hangul | 44564 | 42.5%
None | 40 | < 0.1%

Most frequent character per block

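ASCII
Value | Count | Frequency (%)
2 | 12855 | 21.3%
0 | 8352 | 13.9%
1 | 7007 | 11.6%
( | 6043 | 10.0%
) | 6043 | 10.0%
- | 6022 | 10.0%
(space) | 3997 | 6.6%
3 | 1961 | 3.3%
9 | 1243 | 2.1%
8 | 1185 | 2.0%
Other values (47) | 5581 | 9.3%

Hangul
Value | Count | Frequency (%)
(not shown) | 3652 | 8.2%
(not shown) | 3143 | 7.1%
(not shown) | 1430 | 3.2%
(not shown) | 1335 | 3.0%
(not shown) | 1274 | 2.9%
(not shown) | 1211 | 2.7%
(not shown) | 1203 | 2.7%
(not shown) | 1189 | 2.7%
(not shown) | 876 | 2.0%
(not shown) | 739 | 1.7%
Other values (299) | 28512 | 64.0%

None
Value | Count | Frequency (%)
· | 28 | 70.0%
(not shown) | 12 | 30.0%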

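과목 코드 (course code)
Categorical

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 66.4 KiB

Value | Count
시험 (exam) | 1287
퀴즈 (quiz) | 1287
과제 (assignment) | 1287
토론 (discussion) | 1287
게시판 참여 (board participation) | 1287
Other values (2) | 2050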

Length

Max length: 6
Median length: 2
Mean length: 2.6067177
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 시험
2nd row: 퀴즈
3rd row: 과제
4th row: 토론
5th row: 게시판 참여

Common Values

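Value | Count | Frequency (%)
시험 (exam) | 1287 | 15.2%
퀴즈 (quiz) | 1287 | 15.2%
과제 (assignment) | 1287 | 15.2%
토론 (discussion) | 1287 | 15.2%
게시판 참여 (board participation) | 1287 | 15.2%
출석 (attendance) | 1287 | 15.2%
기타 (other) | 763 | 9.0%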

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
Value | Count | Frequency (%)
시험 | 1287 | 13.2%
퀴즈 | 1287 | 13.2%
과제 | 1287 | 13.2%
토론 | 1287 | 13.2%
게시판 | 1287 | 13.2%
참여 | 1287 | 13.2%
출석 | 1287 | 13.2%
기타 | 763 | 7.8%

가중치 (weight)
Real number (ℝ)

ZEROS

Distinct: 11
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.87802
Minimum: 0
Maximum: 100
Zeros: 6790
Zeros (%): 80.0%
Negative: 0
Negative (%): 0.0%
Memory size: 74.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 60
Maximum: 100
Range: 100
Interquartile range (IQR): 0
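Because 80.0% of the 가중치 values are zero, every quantile up to Q3 is 0 and the IQR collapses to 0. The sketch below recomputes the quantiles from the value counts listed in this report's frequency tables. It assumes linear interpolation between order statistics (the default in NumPy and pandas); whether the report uses the same rule is an assumption. The names `FREQ` and `quantile` are hypothetical.

```python
# Value -> count for 가중치, copied from the frequency tables in this report.
FREQ = {0: 6790, 10: 4, 20: 2, 30: 76, 40: 308, 50: 780,
        60: 300, 70: 60, 80: 2, 90: 4, 100: 159}


def quantile(freq: dict[int, int], q: float) -> float:
    """Quantile of the expanded sample, interpolating linearly between ranks."""
    data = sorted(value for value, count in freq.items() for _ in range(count))
    pos = q * (len(data) - 1)          # fractional rank of the quantile
    lo = int(pos)                      # rank just below
    hi = min(lo + 1, len(data) - 1)    # rank just above, clamped to the end
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


for label, q in [("Q1", 0.25), ("median", 0.5), ("Q3", 0.75), ("95-th", 0.95)]:
    print(label, quantile(FREQ, q))
print("IQR", quantile(FREQ, 0.75) - quantile(FREQ, 0.25))
```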

Descriptive statistics

Standard deviation: 23.075524
Coefficient of variation (CV): 2.1212982
Kurtosis: 3.1837026
Mean: 10.87802
Median Absolute Deviation (MAD): 0
Skewness: 2.0078181
Sum: 92300
Variance: 532.47982
Monotonicity: Not monotonic
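The sum, mean, variance, standard deviation, and coefficient of variation can be cross-checked against the value counts listed in this report's frequency tables. The sketch below assumes the sample variance (denominator n - 1), which is the pandas default and agrees with the reported figure. The name `FREQ` is hypothetical.

```python
import math

# Value -> count for 가중치, copied from the frequency tables in this report.
FREQ = {0: 6790, 10: 4, 20: 2, 30: 76, 40: 308, 50: 780,
        60: 300, 70: 60, 80: 2, 90: 4, 100: 159}

n = sum(FREQ.values())                              # number of observations
total = sum(value * count for value, count in FREQ.items())
mean = total / n

# Sample variance with an (n - 1) denominator; this choice is an assumption.
variance = sum(count * (value - mean) ** 2
               for value, count in FREQ.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean                                     # coefficient of variation

print(f"n={n} sum={total} mean={mean:.5f}")
print(f"variance={variance:.5f} std={std:.6f} cv={cv:.7f}")
```

A CV above 2 reflects the heavy concentration at zero: the standard deviation is more than twice the mean.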
[Figure: Histogram with fixed size bins (bins=11)]
Value | Count | Frequency (%)
0 | 6790 | 80.0%
50 | 780 | 9.2%
40 | 308 | 3.6%
60 | 300 | 3.5%
100 | 159 | 1.9%
30 | 76 | 0.9%
70 | 60 | 0.7%
10 | 4 | < 0.1%
90 | 4 | < 0.1%
20 | 2 | < 0.1%

Value | Count | Frequency (%)
0 | 6790 | 80.0%
10 | 4 | < 0.1%
20 | 2 | < 0.1%
30 | 76 | 0.9%
40 | 308 | 3.6%
50 | 780 | 9.2%
60 | 300 | 3.5%
70 | 60 | 0.7%
80 | 2 | < 0.1%
90 | 4 | < 0.1%

Value | Count | Frequency (%)
100 | 159 | 1.9%
90 | 4 | < 0.1%
80 | 2 | < 0.1%
70 | 60 | 0.7%
60 | 300 | 3.5%
50 | 780 | 9.2%
40 | 308 | 3.6%
30 | 76 | 0.9%
20 | 2 | < 0.1%
10 | 4 | < 0.1%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation plot]

Variable | 과목 코드 | 가중치
과목 코드 | 1.000 | 0.602
가중치 | 0.602 | 1.000

[Figure: correlation plot]

Variable | 가중치 | 과목 코드
가중치 | 1.000 | 0.369
과목 코드 | 0.369 | 1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

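Index | 학기 아이디 (semester ID) | 과목 코드 (course code) | 가중치 (weight)
0 | 상시 | 시험 | 0
1 | 상시 | 퀴즈 | 0
2 | 상시 | 과제 | 0
3 | 상시 | 토론 | 0
4 | 상시 | 게시판 참여 | 0
5 | 상시 | 출석 | 100
6 | 상시 | 시험 | 0
7 | 상시 | 퀴즈 | 0
8 | 상시 | 과제 | 0
9 | 상시 | 토론 | 0

Index | 학기 아이디 (semester ID) | 과목 코드 (course code) | 가중치 (weight)
8475 | 2023-4차(유라코퍼레이션) | 게시판 참여 | 0
8476 | 2023-4차(유라코퍼레이션) | 출석 | 50
8477 | 2023-4차(유라코퍼레이션) | 기타 | 0
8478 | 16기 | 시험 | 50
8479 | 16기 | 퀴즈 | 0
8480 | 16기 | 과제 | 0
8481 | 16기 | 토론 | 0
8482 | 16기 | 게시판 참여 | 0
8483 | 16기 | 출석 | 50
8484 | 16기 | 기타 | 0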

Duplicate rows

Most frequently occurring

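Index | 학기 아이디 (semester ID) | 과목 코드 (course code) | 가중치 (weight) | # duplicates
235 | 2020-1 | 게시판 참여 | 0 | 26
236 | 2020-1 | 과제 | 0 | 26
237 | 2020-1 | 기타 | 0 | 26
238 | 2020-1 | 시험 | 0 | 26
241 | 2020-1 | 퀴즈 | 0 | 26
242 | 2020-1 | 토론 | 0 | 26
239 | 2020-1 | 출석 | 0 | 23
69 | 13기 | 게시판 참여 | 0 | 10
70 | 13기 | 과제 | 0 | 10
78 | 13기 | 퀴즈 | 0 | 10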