Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 23
Duplicate rows (%): 0.2%
Total size in memory: 498.0 KiB
Average record size in memory: 51.0 B
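
The duplicate-row figures above can be reproduced in more than one way, and the report does not state which definition it uses. The sketch below counts both the number of distinct row patterns that repeat and the number of surplus copies. The function name `duplicate_summary` and the toy rows are illustrative and are not taken from the dataset.

```python
from collections import Counter

def duplicate_summary(rows):
    """Return (repeated patterns, surplus copies) for an iterable of rows."""
    counts = Counter(tuple(row) for row in rows)
    repeated = {row: n for row, n in counts.items() if n > 1}
    n_patterns = len(repeated)                          # distinct rows seen at least twice
    n_surplus = sum(n - 1 for n in repeated.values())   # copies beyond the first occurrence
    return n_patterns, n_surplus

# Toy data (hypothetical): ("a", 1) appears three times, ("c", 3) twice.
rows = [("a", 1), ("a", 1), ("b", 2), ("a", 1), ("c", 3), ("c", 3)]
patterns, surplus = duplicate_summary(rows)

# The percentage in the overview is the duplicate count over the row count.
duplicate_pct = 100 * 23 / 10000   # 0.23, shown rounded as 0.2%
```

Either count divided by the number of observations gives a percentage of the kind shown in the overview.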

Variable types

Text: 1
Numeric: 3
DateTime: 1

Dataset

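Description: Provides learner attendance item data for the Smart Vocational Training Platform (STEP) of the Online Lifelong Education Institute at Korea University of Technology and Education (한국기술교육대학교).
Author: Korea University of Technology and Education (한국기술교육대학교)
URL: https://www.data.go.kr/data/15091058/fileData.do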

Alerts

Dataset has 23 (0.2%) duplicate rows. [Duplicates]
레슨 아이디 (lesson ID) is highly overall correlated with 레슨 아이템 아이디 (lesson item ID) and 1 other field. [High correlation]
레슨 아이템 아이디 (lesson item ID) is highly overall correlated with 레슨 아이디 (lesson ID) and 1 other field. [High correlation]
레슨 서브아이템 아이디 (lesson sub-item ID) is highly overall correlated with 레슨 아이디 (lesson ID) and 1 other field. [High correlation]

Reproduction

Analysis started: 2023-12-12 07:27:32.164066
Analysis finished: 2023-12-12 07:27:34.101595
Duration: 1.94 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

과목 배움 콘텐츠 아이디
Text

Distinct: 242
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 53
Median length: 35
Mean length: 15.4389
Min length: 4

Characters and Unicode

Total characters: 154389
Distinct characters: 385
Distinct categories: 12
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
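
As a minimal sketch of how such per-category counts can be obtained, Python's standard `unicodedata` module reports the general category of each code point. The helper name `category_counts` is an assumption for illustration, and the input is one of the sample values shown in this report. Whether the report itself uses this module is not stated.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Tally Unicode general categories over all characters in the given strings."""
    counts = Counter()
    for text in values:
        # Normalise to NFC so each Hangul syllable is a single code point.
        for ch in unicodedata.normalize("NFC", text):
            counts[unicodedata.category(ch)] += 1
    return counts

# Two-letter codes map to the names used in the report, for example
# Lo = Other Letter, Lu = Uppercase Letter, Zs = Space Separator,
# Ps = Open Punctuation, Pe = Close Punctuation.
counts = category_counts(["AVR을 활용한 마이컴 제어(기본)"])
```

Here the Hangul syllables fall under `Lo`, which is why "Other Letter" dominates the category table for this variable.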

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 유니티 기반 실감 콘텐츠 퍼블리싱
2nd row: AVR을 활용한 마이컴 제어(기본)
3rd row: 시스템 소프트웨어 펌웨어 설계
4th row: 계약 관리(건설공사)
5th row: 기계적 재료시험 part 2

Most frequent words

Value | Count | Frequency (%)
part | 2282 | 6.9%
2 | 1098 | 3.3%
1 | 1012 | 3.1%
설계 | 620 | 1.9%
활용한 | 581 | 1.8%
제어 | 375 | 1.1%
(not preserved) | 366 | 1.1%
활용 | 349 | 1.1%
plc | 347 | 1.1%
모델링 | 281 | 0.9%
Other values (497) | 25702 | 77.9%

Most occurring characters

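Value | Count | Frequency (%)
(space) | 23013 | 14.9%
r | 2900 | 1.9%
t | 2870 | 1.9%
(not preserved) | 2868 | 1.9%
2 | 2796 | 1.8%
a | 2616 | 1.7%
p | 2533 | 1.6%
(not preserved) | 2464 | 1.6%
(not preserved) | 2339 | 1.5%
(not preserved) | 2040 | 1.3%
Other values (375) | 107950 | 69.9%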

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 95111 | 61.6%
Space Separator | 23013 | 14.9%
Lowercase Letter | 15391 | 10.0%
Uppercase Letter | 9074 | 5.9%
Decimal Number | 6153 | 4.0%
Open Punctuation | 1863 | 1.2%
Close Punctuation | 1863 | 1.2%
Connector Punctuation | 815 | 0.5%
Other Punctuation | 649 | 0.4%
Dash Punctuation | 258 | 0.2%
Other values (2) | 199 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 2868 | 3.0%
(not preserved) | 2464 | 2.6%
(not preserved) | 2339 | 2.5%
(not preserved) | 2040 | 2.1%
(not preserved) | 1851 | 1.9%
(not preserved) | 1774 | 1.9%
(not preserved) | 1773 | 1.9%
(not preserved) | 1711 | 1.8%
(not preserved) | 1707 | 1.8%
(not preserved) | 1424 | 1.5%
Other values (305) | 75160 | 79.0%

Uppercase Letter
Value | Count | Frequency (%)
C | 1237 | 13.6%
L | 878 | 9.7%
D | 832 | 9.2%
A | 714 | 7.9%
P | 672 | 7.4%
I | 629 | 6.9%
T | 580 | 6.4%
M | 561 | 6.2%
S | 550 | 6.1%
E | 506 | 5.6%
Other values (14) | 1915 | 21.1%

Lowercase Letter
Value | Count | Frequency (%)
r | 2900 | 18.8%
t | 2870 | 18.6%
a | 2616 | 17.0%
p | 2533 | 16.5%
o | 974 | 6.3%
l | 581 | 3.8%
e | 498 | 3.2%
i | 395 | 2.6%
n | 303 | 2.0%
c | 303 | 2.0%
Other values (13) | 1418 | 9.2%

Decimal Number
Value | Count | Frequency (%)
2 | 2796 | 45.4%
1 | 1450 | 23.6%
0 | 808 | 13.1%
3 | 610 | 9.9%
4 | 251 | 4.1%
5 | 85 | 1.4%
9 | 77 | 1.3%
6 | 76 | 1.2%

Other Punctuation
Value | Count | Frequency (%)
· | 332 | 51.2%
, | 142 | 21.9%
/ | 109 | 16.8%
# | 47 | 7.2%
! | 19 | 2.9%

Open Punctuation
Value | Count | Frequency (%)
( | 1432 | 76.9%
[ | 431 | 23.1%

Close Punctuation
Value | Count | Frequency (%)
) | 1432 | 76.9%
] | 431 | 23.1%

Letter Number
Value | Count | Frequency (%)
(not preserved) | 49 | 71.0%
(not preserved) | 20 | 29.0%

Space Separator
Value | Count | Frequency (%)
(space) | 23013 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 815 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 258 | 100.0%

Math Symbol
Value | Count | Frequency (%)
+ | 130 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 95111 | 61.6%
Common | 34744 | 22.5%
Latin | 24534 | 15.9%

Most frequent character per script

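Hangul
Value | Count | Frequency (%)
(not preserved) | 2868 | 3.0%
(not preserved) | 2464 | 2.6%
(not preserved) | 2339 | 2.5%
(not preserved) | 2040 | 2.1%
(not preserved) | 1851 | 1.9%
(not preserved) | 1774 | 1.9%
(not preserved) | 1773 | 1.9%
(not preserved) | 1711 | 1.8%
(not preserved) | 1707 | 1.8%
(not preserved) | 1424 | 1.5%
Other values (305) | 75160 | 79.0%

Latin
Value | Count | Frequency (%)
r | 2900 | 11.8%
t | 2870 | 11.7%
a | 2616 | 10.7%
p | 2533 | 10.3%
C | 1237 | 5.0%
o | 974 | 4.0%
L | 878 | 3.6%
D | 832 | 3.4%
A | 714 | 2.9%
P | 672 | 2.7%
Other values (39) | 8308 | 33.9%

Common
Value | Count | Frequency (%)
(space) | 23013 | 66.2%
2 | 2796 | 8.0%
1 | 1450 | 4.2%
( | 1432 | 4.1%
) | 1432 | 4.1%
_ | 815 | 2.3%
0 | 808 | 2.3%
3 | 610 | 1.8%
[ | 431 | 1.2%
] | 431 | 1.2%
Other values (11) | 1526 | 4.4%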

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 95111 | 61.6%
ASCII | 58877 | 38.1%
None | 332 | 0.2%
Number Forms | 69 | < 0.1%

Most frequent character per block

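ASCII
Value | Count | Frequency (%)
(space) | 23013 | 39.1%
r | 2900 | 4.9%
t | 2870 | 4.9%
2 | 2796 | 4.7%
a | 2616 | 4.4%
p | 2533 | 4.3%
1 | 1450 | 2.5%
( | 1432 | 2.4%
) | 1432 | 2.4%
C | 1237 | 2.1%
Other values (57) | 16598 | 28.2%

Hangul
Value | Count | Frequency (%)
(not preserved) | 2868 | 3.0%
(not preserved) | 2464 | 2.6%
(not preserved) | 2339 | 2.5%
(not preserved) | 2040 | 2.1%
(not preserved) | 1851 | 1.9%
(not preserved) | 1774 | 1.9%
(not preserved) | 1773 | 1.9%
(not preserved) | 1711 | 1.8%
(not preserved) | 1707 | 1.8%
(not preserved) | 1424 | 1.5%
Other values (305) | 75160 | 79.0%

None
Value | Count | Frequency (%)
· | 332 | 100.0%

Number Forms
Value | Count | Frequency (%)
(not preserved) | 49 | 71.0%
(not preserved) | 20 | 29.0%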

레슨 아이디
Real number (ℝ)

HIGH CORRELATION 

Distinct: 403
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6327.7938
Minimum: 647
Maximum: 12082
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 647
5th percentile: 1528
Q1: 2948
Median: 7233
Q3: 8571
95th percentile: 10819
Maximum: 12082
Range: 11435
Interquartile range (IQR): 5623

Descriptive statistics

Standard deviation: 3097.8936
Coefficient of variation (CV): 0.48956931
Kurtosis: -1.121513
Mean: 6327.7938
Median Absolute Deviation (MAD): 2000
Skewness: -0.18890519
Sum: 63277938
Variance: 9596945
Monotonicity: Not monotonic
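
Several of the figures for 레슨 아이디 are derived from one another, which makes them easy to cross-check. The sketch below recomputes the range, IQR, coefficient of variation, variance and sum from the reported minimum, maximum, quartiles, mean and standard deviation. The variable names are illustrative, and small differences are expected because the inputs are already rounded.

```python
# Values copied from the report for 레슨 아이디.
mean, std, n = 6327.7938, 3097.8936, 10000
minimum, maximum = 647, 12082
q1, q3 = 2948, 8571

value_range = maximum - minimum   # reported as 11435
iqr = q3 - q1                     # reported as 5623
cv = std / mean                   # reported as 0.48956931
variance = std ** 2               # reported as 9596945
total = mean * n                  # reported as 63277938
```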
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
8180 | 146 | 1.5%
8202 | 131 | 1.3%
2107 | 125 | 1.2%
8196 | 116 | 1.2%
8206 | 116 | 1.2%
2934 | 112 | 1.1%
2752 | 109 | 1.1%
8204 | 106 | 1.1%
8200 | 105 | 1.1%
2466 | 103 | 1.0%
Other values (393) | 8831 | 88.3%

Minimum 10 values

Value | Count | Frequency (%)
647 | 35 | 0.4%
649 | 42 | 0.4%
651 | 45 | 0.4%
653 | 51 | 0.5%
655 | 34 | 0.3%
1266 | 5 | 0.1%
1272 | 10 | 0.1%
1274 | 12 | 0.1%
1320 | 19 | 0.2%
1333 | 15 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
12082 | 40 | 0.4%
12076 | 27 | 0.3%
12072 | 55 | 0.5%
12068 | 38 | 0.4%
12064 | 27 | 0.3%
12060 | 30 | 0.3%
12055 | 27 | 0.3%
12050 | 42 | 0.4%
11699 | 44 | 0.4%
11673 | 28 | 0.3%

레슨 아이템 아이디
Real number (ℝ)

HIGH CORRELATION 

Distinct: 2483
Distinct (%): 24.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 64042.442
Minimum: 4224
Maximum: 109090
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 4224
5th percentile: 33156.95
Q1: 43600
Median: 68890
Q3: 79029
95th percentile: 95549
Maximum: 109090
Range: 104866
Interquartile range (IQR): 35429

Descriptive statistics

Standard deviation: 22108.145
Coefficient of variation (CV): 0.34521084
Kurtosis: -0.37184436
Mean: 64042.442
Median Absolute Deviation (MAD): 11808.5
Skewness: -0.28179581
Sum: 6.4042442 × 10^8
Variance: 4.8877009 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
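
A fixed-size binning of this kind splits the span between the minimum and maximum into equally wide intervals. The following is a minimal sketch of that idea, not the report's own plotting code. The function name `fixed_width_histogram` is hypothetical, and it assumes every value lies between `lo` and `hi`.

```python
def fixed_width_histogram(values, lo, hi, bins=50):
    """Count values into `bins` equal-width intervals spanning [lo, hi]."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum would land on the right edge, so clip it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts, width

# Range of 레슨 아이템 아이디 from the report: 4224 to 109090.
counts, width = fixed_width_histogram([4224, 37406, 68890, 109090], lo=4224, hi=109090)
```

With 50 bins over this range, each bin covers about 2097 ID values.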

Common values

Value | Count | Frequency (%)
37406 | 51 | 0.5%
57889 | 46 | 0.5%
37405 | 46 | 0.5%
37345 | 44 | 0.4%
37525 | 43 | 0.4%
37425 | 43 | 0.4%
37486 | 41 | 0.4%
37466 | 40 | 0.4%
37446 | 40 | 0.4%
37366 | 39 | 0.4%
Other values (2473) | 9567 | 95.7%

Minimum 10 values

Value | Count | Frequency (%)
4224 | 5 | 0.1%
4225 | 2 | < 0.1%
4226 | 7 | 0.1%
4227 | 6 | 0.1%
4228 | 7 | 0.1%
4229 | 2 | < 0.1%
4230 | 6 | 0.1%
4242 | 9 | 0.1%
4243 | 10 | 0.1%
4244 | 9 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
109090 | 3 | < 0.1%
109087 | 3 | < 0.1%
109081 | 3 | < 0.1%
109078 | 11 | 0.1%
109075 | 9 | 0.1%
109072 | 6 | 0.1%
109069 | 1 | < 0.1%
109066 | 2 | < 0.1%
109063 | 2 | < 0.1%
109027 | 2 | < 0.1%

레슨 서브아이템 아이디
Real number (ℝ)

HIGH CORRELATION 

Distinct: 4781
Distinct (%): 47.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 62933.344
Minimum: 8222
Maximum: 474789
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 8222
5th percentile: 18577.9
Q1: 39225.5
Median: 59265
Q3: 78330.25
95th percentile: 140023.3
Maximum: 474789
Range: 466567
Interquartile range (IQR): 39104.75

Descriptive statistics

Standard deviation: 42314.719
Coefficient of variation (CV): 0.67237361
Kurtosis: 33.131674
Mean: 62933.344
Median Absolute Deviation (MAD): 19118
Skewness: 4.3730792
Sum: 6.2933344 × 10^8
Variance: 1.7905355 × 10^9
Monotonicity: Not monotonic
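
The large positive skewness and kurtosis of 레슨 서브아이템 아이디 reflect a long right tail: the 95th percentile is about 140023 while the maximum is 474789. The sketch below shows one common way to compute these two quantities, using the bias-corrected sample estimators (the convention pandas follows). That the report uses exactly these estimators is an assumption, and the function name and toy inputs are illustrative.

```python
import math

def sample_skew_kurtosis(xs):
    """Bias-corrected sample skewness and excess kurtosis (needs at least 4 values)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n   # central moments
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    g1 = m3 / m2 ** 1.5                         # uncorrected skewness
    g2 = m4 / m2 ** 2 - 3                       # uncorrected excess kurtosis
    skew = math.sqrt(n * (n - 1)) / (n - 2) * g1
    kurt = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)
    return skew, kurt

symmetric = sample_skew_kurtosis([1, 2, 3, 4, 5])     # no tail on either side
right_tail = sample_skew_kurtosis([1, 2, 3, 4, 50])   # one large value
```

A symmetric sample gives zero skewness, while a single large value pushes the skewness above zero, which is the same pattern seen in this variable.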
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
75221 | 16 | 0.2%
75303 | 14 | 0.1%
75197 | 14 | 0.1%
75196 | 13 | 0.1%
75309 | 13 | 0.1%
75200 | 13 | 0.1%
75333 | 13 | 0.1%
75411 | 12 | 0.1%
75086 | 12 | 0.1%
75354 | 12 | 0.1%
Other values (4771) | 9868 | 98.7%

Minimum 10 values

Value | Count | Frequency (%)
8222 | 1 | < 0.1%
8223 | 1 | < 0.1%
8224 | 3 | < 0.1%
8225 | 1 | < 0.1%
8226 | 1 | < 0.1%
8228 | 1 | < 0.1%
8229 | 3 | < 0.1%
8230 | 3 | < 0.1%
8231 | 2 | < 0.1%
8232 | 2 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
474789 | 2 | < 0.1%
474786 | 1 | < 0.1%
474783 | 2 | < 0.1%
474780 | 1 | < 0.1%
474777 | 1 | < 0.1%
451636 | 2 | < 0.1%
451633 | 1 | < 0.1%
451630 | 3 | < 0.1%
451627 | 1 | < 0.1%
451624 | 1 | < 0.1%

상태 코드 수정 일시
DateTime

Distinct: 3206
Distinct (%): 32.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2016-10-14 20:51:04
Maximum: 2017-06-26 11:14:49
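
The time span covered by this variable follows directly from the minimum and maximum. A small sketch using the standard `datetime` module, with the two timestamps copied from the lines above (the variable names are illustrative):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"
earliest = datetime.strptime("2016-10-14 20:51:04", FMT)
latest = datetime.strptime("2017-06-26 11:14:49", FMT)

span = latest - earliest   # a timedelta of 254 days, 14:23:45
```

The records therefore cover a little over eight months.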
Histogram with fixed size bins (bins=50)

Interactions

[Figures: nine pairwise interaction plots of the three numeric variables]

Correlations

First correlation matrix

(variable) | 레슨 아이디 | 레슨 아이템 아이디 | 레슨 서브아이템 아이디
레슨 아이디 | 1.000 | 0.943 | 0.815
레슨 아이템 아이디 | 0.943 | 1.000 | 0.849
레슨 서브아이템 아이디 | 0.815 | 0.849 | 1.000

Second correlation matrix

(variable) | 레슨 아이디 | 레슨 아이템 아이디 | 레슨 서브아이템 아이디
레슨 아이디 | 1.000 | 0.989 | 0.980
레슨 아이템 아이디 | 0.989 | 1.000 | 0.991
레슨 서브아이템 아이디 | 0.980 | 0.991 | 1.000
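
The two matrices carry no method labels in this text. One pattern that produces a lower coefficient in one matrix and a higher one in the other is a linear coefficient alongside a rank-based one, since ranks are insensitive to outliers such as the very large sub-item IDs. Whether that is what the report computed is an assumption. The sketch below implements Pearson and Spearman coefficients on toy data to show the effect. All function names and values are illustrative.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def average_ranks(xs):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Toy data (hypothetical): monotone, but the last y value is an outlier.
x = [1, 2, 3, 4, 5]
y = [10, 20, 30, 40, 500]
linear = pearson(x, y)     # pulled well below 1 by the outlier
ranked = spearman(x, y)    # 1.0, because the ordering is preserved
```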

Missing values

[Figure: count] A simple visualization of nullity by column.
[Figure: matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

(index) | 과목 배움 콘텐츠 아이디 | 레슨 아이디 | 레슨 아이템 아이디 | 레슨 서브아이템 아이디 | 상태 코드 수정 일시
41872 | 유니티 기반 실감 콘텐츠 퍼블리싱 | 4783 | 54949 | 46229 | 2016-11-28 15:37:56
47886 | AVR을 활용한 마이컴 제어(기본) | 6646 | 63515 | 52614 | 2016-11-10 16:15:46
53510 | 시스템 소프트웨어 펌웨어 설계 | 2782 | 34304 | 21235 | 2016-11-10 10:21:37
75166 | 계약 관리(건설공사) | 9282 | 83722 | 474780 | 2017-06-26 11:14:49
66098 | 기계적 재료시험 part 2 | 3862 | 48177 | 42431 | 2016-11-22 13:40:03
7245 | 재미있게 배우는 전기회로 | 10058 | 90423 | 84612 | 2016-11-27 13:55:08
17562 | HRD 실무전문가 과정 | 1744 | 37366 | 28589 | 2016-11-10 13:58:38
67778 | 기계설비 안전대책 수립과 스마트 팩토리(산업용 로봇) 안전 확보 | 8766 | 80657 | 81362 | 2016-11-22 17:32:15
66431 | 빅데이터 분석 기획 | 3864 | 48040 | 42291 | 2016-11-01 19:28:21
77405 | 전기 조명설비 설계 | 1866 | 43480 | 38946 | 2016-11-10 14:45:24

(index) | 과목 배움 콘텐츠 아이디 | 레슨 아이디 | 레슨 아이템 아이디 | 레슨 서브아이템 아이디 | 상태 코드 수정 일시
68339 | 사출성형공정검토 part 1 | 8676 | 79867 | 80336 | 2016-11-13 09:14:28
94885 | 영어회화 2 | 1717 | 30214 | 11356 | 2016-10-28 09:56:04
32895 | 라즈베리 파이 및 아두이노를 활용한 제어 설계 및 코딩 | 8299 | 77486 | 76444 | 2016-11-11 22:53:13
27473 | 프레스금형 2D도면작성 | 8611 | 79331 | 78992 | 2016-11-02 13:37:19
12500 | SolidWorks를 활용한 3D설계 | 10074 | 90513 | 84952 | 2016-11-11 18:36:37
48055 | 네트워크 프로그래밍 구현 part 1_개발환경 분석하기 | 6648 | 63533 | 52651 | 2016-11-04 17:25:28
47058 | 전기자동차 전기장치 정비 | 6640 | 63453 | 52411 | 2016-11-13 21:12:17
72802 | 전자제품 품질보증 part 2 | 2733 | 33039 | 18283 | 2016-11-09 20:48:11
22150 | 2D도면작업(AutoCAD를 이용한 도면화 기초) | 2630 | 37466 | 28883 | 2016-11-10 14:58:58
36984 | L2·L3 스위치 구축 part 1 | 5220 | 57648 | 46879 | 2016-11-02 00:31:21

Duplicate rows

Most frequently occurring

(index) | 과목 배움 콘텐츠 아이디 | 레슨 아이디 | 레슨 아이템 아이디 | 레슨 서브아이템 아이디 | 상태 코드 수정 일시 | # duplicates
14 | 구내통신 설비공사 part 3 | 9278 | 83691 | 316902 | 2017-03-16 13:31:09 | 8
0 | 건축목공시공 도면파악 | 8184 | 76337 | 292472 | 2017-03-08 09:47:36 | 6
17 | 무선통신시스템 설계 part 2 | 5248 | 57888 | 451603 | 2017-04-27 10:05:37 | 6
12 | 구내통신 설비공사 part 3 | 9278 | 83690 | 316929 | 2017-03-16 13:31:09 | 5
3 | 구내통신 설비공사 part 3 | 9278 | 83687 | 316869 | 2017-03-16 13:31:09 | 4
5 | 구내통신 설비공사 part 3 | 9278 | 83687 | 316875 | 2017-03-16 13:31:09 | 3
8 | 구내통신 설비공사 part 3 | 9278 | 83689 | 316887 | 2017-03-16 13:31:09 | 3
9 | 구내통신 설비공사 part 3 | 9278 | 83690 | 316911 | 2017-03-16 13:31:09 | 3
11 | 구내통신 설비공사 part 3 | 9278 | 83690 | 316923 | 2017-03-16 13:31:09 | 3
21 | 무선통신시스템 설계 part 2 | 5248 | 57888 | 451630 | 2017-04-27 10:05:37 | 3