Overview

Dataset statistics

Number of variables: 2
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 244.1 KiB
Average record size in memory: 25.0 B
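
The average record size follows from the total memory size and the number of observations. The snippet below is a minimal sketch of that arithmetic, assuming 1 KiB = 1024 B; the function name is hypothetical and is not part of ydata-profiling.

```python
# Figures copied from the dataset statistics above.
TOTAL_SIZE_KIB = 244.1
N_OBSERVATIONS = 10_000


def average_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Convert KiB to bytes (1 KiB = 1024 B) and divide by the row count."""
    return total_kib * 1024 / n_rows


avg_bytes = average_record_size_bytes(TOTAL_SIZE_KIB, N_OBSERVATIONS)
print(f"{avg_bytes:.1f} B")
```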

Variable types

Numeric: 1
Text: 1

Dataset

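Description: Provides course keyword information for the smart vocational training platform (STEP) of the Online Lifelong Education Institute at 한국기술교육대학교 (Korea University of Technology and Education).
Author: 한국기술교육대학교 (Korea University of Technology and Education)
URL: https://www.data.go.kr/data/15091098/fileData.do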

Reproduction

Analysis started: 2023-12-12 07:31:16.364960
Analysis finished: 2023-12-12 07:31:16.833897
Duration: 0.47 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
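
The reported duration can be checked from the two timestamps. This is a small sketch using only the standard library; it assumes both timestamps are in the same time zone.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-12 07:31:16.364960")
finished = datetime.fromisoformat("2023-12-12 07:31:16.833897")

# Subtracting two datetimes yields a timedelta.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```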

Variables

과정 아이디
Real number (ℝ)

Distinct: 7797
Distinct (%): 78.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 63607.498
Minimum: 34
Maximum: 413848
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 34
5-th percentile: 3055
Q1: 5649.75
Median: 9275
Q3: 95059.75
95-th percentile: 296641.4
Maximum: 413848
Range: 413814
Interquartile range (IQR): 89410
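
The range and the interquartile range are both derived from the other quantile statistics. The following sketch reproduces them from the values listed above; the variable names are illustrative only.

```python
# Quantile statistics copied from the report.
minimum, maximum = 34, 413848
q1, q3 = 5649.75, 95059.75

value_range = maximum - minimum  # spread of the whole column
iqr = q3 - q1                    # spread of the middle 50% of values

print(value_range, iqr)
```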

Descriptive statistics

Standard deviation: 99830.891
Coefficient of variation (CV): 1.5694831
Kurtosis: 1.9175251
Mean: 63607.498
Median Absolute Deviation (MAD): 4796.5
Skewness: 1.7147149
Sum: 6.3607498 × 10^8
Variance: 9.9662069 × 10^9
Monotonicity: Not monotonic
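
Several of these statistics are tied together by simple identities: the CV is the standard deviation divided by the mean, the variance is the square of the standard deviation, and the sum is the mean times the number of observations. The sketch below checks the reported figures against those identities, allowing for rounding in the displayed values.

```python
import math

# Figures copied from the report.
mean = 63607.498
std = 99830.891
n_observations = 10_000

cv = std / mean                # coefficient of variation
variance = std ** 2            # variance is the squared standard deviation
total = mean * n_observations  # sum of all values

# Compare with the displayed (rounded) values using a relative tolerance.
print(math.isclose(cv, 1.5694831, rel_tol=1e-6))
print(math.isclose(variance, 9.9662069e9, rel_tol=1e-6))
print(math.isclose(total, 6.3607498e8, rel_tol=1e-6))
```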
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
11918 | 5 | 0.1%
10125 | 4 | < 0.1%
9319 | 4 | < 0.1%
2829 | 4 | < 0.1%
9604 | 4 | < 0.1%
9337 | 4 | < 0.1%
8869 | 4 | < 0.1%
3678 | 4 | < 0.1%
4810 | 4 | < 0.1%
5705 | 4 | < 0.1%
Other values (7787) | 9959 | 99.6%

Minimum 10 values
Value | Count | Frequency (%)
34 | 1 | < 0.1%
122 | 1 | < 0.1%
261 | 1 | < 0.1%
262 | 1 | < 0.1%
308 | 1 | < 0.1%
312 | 1 | < 0.1%
327 | 2 | < 0.1%
332 | 1 | < 0.1%
333 | 2 | < 0.1%
334 | 3 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
413848 | 1 | < 0.1%
413839 | 1 | < 0.1%
413836 | 1 | < 0.1%
413806 | 1 | < 0.1%
413797 | 1 | < 0.1%
413794 | 1 | < 0.1%
413785 | 1 | < 0.1%
413746 | 1 | < 0.1%
413725 | 1 | < 0.1%
413662 | 1 | < 0.1%

키워드
Text

Distinct: 763
Distinct (%): 7.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 42
Median length: 26
Mean length: 3.683
Min length: 1

Characters and Unicode

Total characters: 36830
Distinct characters: 450
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
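
As an illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It uses only the five keyword values shown in this report's sample rows, not the full column, and it is not necessarily how ydata-profiling computes its own tables.

```python
import unicodedata
from collections import Counter

# Keyword values taken from the report's sample rows.
samples = ["네트워크 보안", "모바일", "PH-Lab", "인적자원관리", "추천"]


def category_counts(values):
    """Count characters by Unicode general category (e.g. 'Lo', 'Lu', 'Zs')."""
    counts = Counter()
    for value in values:
        # NFC keeps each Hangul syllable as a single precomposed code point.
        for ch in unicodedata.normalize("NFC", value):
            counts[unicodedata.category(ch)] += 1
    return counts


counts = category_counts(samples)
print(counts.most_common())
```

Here `Lo` corresponds to "Other Letter" (the Hangul syllables), `Lu` and `Ll` to uppercase and lowercase letters, `Zs` to "Space Separator", and `Pd` to "Dash Punctuation".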

Unique

Unique: 131
Unique (%): 1.3%
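
"Distinct" and "Unique" measure different things: distinct counts how many different values occur, while unique counts the values that occur exactly once. The sketch below shows the difference on a hypothetical toy column (not the real data) and then recomputes the reported percentages from the counts.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (distinct, unique): different values vs. values seen exactly once."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


# Hypothetical toy column, not the real data.
toy = ["신규", "모바일", "신규", "plc", "도면", "모바일"]
distinct, unique = distinct_and_unique(toy)
print(distinct, unique)

# Percentages in the report are counts divided by the 10000 observations.
distinct_pct = f"{763 / 10000:.1%}"
unique_pct = f"{131 / 10000:.1%}"
print(distinct_pct, unique_pct)
```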

Sample

1st row: 네트워크 보안
2nd row: 모바일
3rd row: PH-Lab
4th row: 인적자원관리
5th row: 추천

Words
Value | Count | Frequency (%)
신규 | 2304 | 20.4%
모바일 | 605 | 5.4%
plc | 165 | 1.5%
프로그래밍 | 135 | 1.2%
도면 | 124 | 1.1%
자동제어 | 117 | 1.0%
제어 | 111 | 1.0%
설계 | 108 | 1.0%
네트워크 | 101 | 0.9%
13기 | 94 | 0.8%
Other values (764) | 7418 | 65.8%
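
The counts in this table add up to more than the 10000 observations, so it is a word-level rather than row-level frequency table. A rough approximation of such a count is sketched below, assuming lowercasing and whitespace splitting; the tokenization ydata-profiling actually uses may differ, and the toy values are hypothetical.

```python
from collections import Counter


def word_frequencies(values):
    """Return (word, count, share) rows, most frequent first."""
    words = Counter()
    for value in values:
        # Assumption: lowercase and split on whitespace.
        words.update(value.lower().split())
    total = sum(words.values())
    return [(word, count, count / total) for word, count in words.most_common()]


# Hypothetical toy column, not the real data.
toy = ["네트워크 보안", "PLC 제어", "plc", "네트워크"]
rows = word_frequencies(toy)
for word, count, share in rows:
    print(f"{word} | {count} | {share:.1%}")
```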

Most occurring characters

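Value | Count | Frequency (%)
(not preserved) | 2412 | 6.5%
(not preserved) | 2312 | 6.3%
(space) | 1309 | 3.6%
(not preserved) | 920 | 2.5%
(not preserved) | 854 | 2.3%
(not preserved) | 675 | 1.8%
C | 642 | 1.7%
(not preserved) | 637 | 1.7%
(not preserved) | 625 | 1.7%
(not preserved) | 620 | 1.7%
Other values (440) | 25824 | 70.1%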

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 29388 | 79.8%
Uppercase Letter | 3281 | 8.9%
Lowercase Letter | 1954 | 5.3%
Space Separator | 1309 | 3.6%
Decimal Number | 623 | 1.7%
Other Punctuation | 86 | 0.2%
Dash Punctuation | 52 | 0.1%
Open Punctuation | 47 | 0.1%
Close Punctuation | 47 | 0.1%
Math Symbol | 42 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 2412 | 8.2%
(not preserved) | 2312 | 7.9%
(not preserved) | 920 | 3.1%
(not preserved) | 854 | 2.9%
(not preserved) | 675 | 2.3%
(not preserved) | 637 | 2.2%
(not preserved) | 625 | 2.1%
(not preserved) | 620 | 2.1%
(not preserved) | 592 | 2.0%
(not preserved) | 577 | 2.0%
Other values (372) | 19164 | 65.2%

Uppercase Letter
Value | Count | Frequency (%)
C | 642 | 19.6%
D | 343 | 10.5%
P | 322 | 9.8%
L | 272 | 8.3%
M | 271 | 8.3%
A | 224 | 6.8%
S | 187 | 5.7%
I | 126 | 3.8%
T | 125 | 3.8%
H | 118 | 3.6%
Other values (14) | 651 | 19.8%

Lowercase Letter
Value | Count | Frequency (%)
e | 239 | 12.2%
o | 217 | 11.1%
a | 180 | 9.2%
t | 166 | 8.5%
l | 144 | 7.4%
c | 137 | 7.0%
r | 132 | 6.8%
i | 118 | 6.0%
s | 113 | 5.8%
n | 84 | 4.3%
Other values (14) | 424 | 21.7%

Decimal Number
Value | Count | Frequency (%)
1 | 223 | 35.8%
3 | 190 | 30.5%
7 | 86 | 13.8%
2 | 39 | 6.3%
5 | 33 | 5.3%
0 | 22 | 3.5%
8 | 21 | 3.4%
6 | 8 | 1.3%
4 | 1 | 0.2%

Other Punctuation
Value | Count | Frequency (%)
, | 38 | 44.2%
# | 28 | 32.6%
/ | 18 | 20.9%
. | 1 | 1.2%
& | 1 | 1.2%

Space Separator
Value | Count | Frequency (%)
(space) | 1309 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 52 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 47 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 47 | 100.0%

Math Symbol
Value | Count | Frequency (%)
+ | 42 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 29388 | 79.8%
Latin | 5235 | 14.2%
Common | 2207 | 6.0%

Most frequent character per script

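Hangul
Value | Count | Frequency (%)
(not preserved) | 2412 | 8.2%
(not preserved) | 2312 | 7.9%
(not preserved) | 920 | 3.1%
(not preserved) | 854 | 2.9%
(not preserved) | 675 | 2.3%
(not preserved) | 637 | 2.2%
(not preserved) | 625 | 2.1%
(not preserved) | 620 | 2.1%
(not preserved) | 592 | 2.0%
(not preserved) | 577 | 2.0%
Other values (372) | 19164 | 65.2%

Latin
Value | Count | Frequency (%)
C | 642 | 12.3%
D | 343 | 6.6%
P | 322 | 6.2%
L | 272 | 5.2%
M | 271 | 5.2%
e | 239 | 4.6%
A | 224 | 4.3%
o | 217 | 4.1%
S | 187 | 3.6%
a | 180 | 3.4%
Other values (38) | 2338 | 44.7%

Common
Value | Count | Frequency (%)
(space) | 1309 | 59.3%
1 | 223 | 10.1%
3 | 190 | 8.6%
7 | 86 | 3.9%
- | 52 | 2.4%
( | 47 | 2.1%
) | 47 | 2.1%
+ | 42 | 1.9%
2 | 39 | 1.8%
, | 38 | 1.7%
Other values (10) | 134 | 6.1%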

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 29388 | 79.8%
ASCII | 7442 | 20.2%

Most frequent character per block

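Hangul
Value | Count | Frequency (%)
(not preserved) | 2412 | 8.2%
(not preserved) | 2312 | 7.9%
(not preserved) | 920 | 3.1%
(not preserved) | 854 | 2.9%
(not preserved) | 675 | 2.3%
(not preserved) | 637 | 2.2%
(not preserved) | 625 | 2.1%
(not preserved) | 620 | 2.1%
(not preserved) | 592 | 2.0%
(not preserved) | 577 | 2.0%
Other values (372) | 19164 | 65.2%

ASCII
Value | Count | Frequency (%)
(space) | 1309 | 17.6%
C | 642 | 8.6%
D | 343 | 4.6%
P | 322 | 4.3%
L | 272 | 3.7%
M | 271 | 3.6%
e | 239 | 3.2%
A | 224 | 3.0%
1 | 223 | 3.0%
o | 217 | 2.9%
Other values (58) | 3380 | 45.4%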

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

과정 아이디키워드
166457612네트워크 보안
34287128821모바일
210969038PH-Lab
5412620인적자원관리
3004730110추천
159747349CAM
145336898산업설비
38290200517신규
164767565도면
88575126솔리드웍스
과정 아이디키워드
41758309349신규
37055179643신규
134466416공압
94585306박막증착
234859772재무제표
34681140392신규
2565211258자바
2688212254진단장비
2813613520양중기
42261326881신규