Overview

Dataset statistics

Number of variables: 6
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 733
Duplicate rows (%): 7.3%
Total size in memory: 585.9 KiB
Average record size in memory: 60.0 B
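
The derived figures above follow from the raw counts. A minimal sketch that rechecks them, assuming 1 KiB = 1024 bytes and one-decimal rounding; the variable names are illustrative only:

```python
# Values copied from the dataset statistics above.
n_rows = 10_000
n_duplicates = 733
total_kib = 585.9

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = 100 * n_duplicates / n_rows

# Average record size in bytes, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_kib * 1024 / n_rows

print(f"{duplicate_pct:.1f}%")      # 7.3%
print(f"{avg_record_bytes:.1f} B")  # 60.0 B
```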

Variable types

DateTime: 2
Categorical: 4

Dataset

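Description: Information on online education programs, such as course type, assignments, progress, and learner usage status.
Author: 국가평생교육진흥원 (National Institute for Lifelong Education)
URL: https://www.data.go.kr/data/15072237/fileData.do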

Alerts

진도율 has constant value "" (Constant)
Dataset has 733 (7.3%) duplicate rows (Duplicates)
수료증 출력 제한(1:기본, 2:출력제한,3:출력가능) is highly overall correlated with 출석여부(0:미출석1:출석) (High correlation)
출석여부(0:미출석1:출석) is highly overall correlated with 수료증 출력 제한(1:기본, 2:출력제한,3:출력가능) (High correlation)
0:승인, 1:승인대기, 2:이수완료, 3:취소 is highly imbalanced (83.8%) (Imbalance)
수료증 출력 제한(1:기본, 2:출력제한,3:출력가능) is highly imbalanced (96.3%) (Imbalance)
출석여부(0:미출석1:출석) is highly imbalanced (97.3%) (Imbalance)
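
The percentages in the Imbalance alerts are consistent with a normalized-entropy score: one minus the Shannon entropy of the class frequencies divided by the maximum possible entropy for that number of classes. The sketch below is a reconstruction from the numbers in this report, not a statement of how the library implements it. The function name `imbalance_score` is hypothetical, and returning 0 for a single class is an assumption.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        # Assumption: a single-class column is given a score of 0 here.
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(n_classes)

# Class counts taken from the Common Values tables of the three flagged variables.
status = imbalance_score([9581, 410, 9])
certificate = imbalance_score([9941, 36, 23])
attendance = imbalance_score([9973, 27])

print(f"{status:.1%} {certificate:.1%} {attendance:.1%}")  # 83.8% 96.3% 97.3%
```

A score near 100% means almost every row falls into one class, which is the case for all three flagged variables.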

Reproduction

Analysis started: 2023-12-12 17:47:15.594223
Analysis finished: 2023-12-12 17:47:16.164981
Duration: 0.57 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
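
A report of this kind can be regenerated with ydata-profiling along the following lines. This is a minimal sketch under stated assumptions: the function name `build_report`, the file paths, the report title, and the `cp949` encoding are all hypothetical, and the original run may have used a custom configuration that is not recoverable from this text.

```python
def build_report(csv_path, html_path, encoding="cp949"):
    """Profile a CSV file and write the HTML report to html_path."""
    # Local imports so this module loads even when the optional packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the CSV uses a Korean legacy encoding; pass encoding="utf-8" if not.
    df = pd.read_csv(csv_path, encoding=encoding)

    # The title is illustrative; it is not taken from the original report.
    report = ProfileReport(df, title="Online education program usage")
    report.to_file(html_path)
    return report
```

Usage would look like `build_report("usage.csv", "report.html")`, where both file names are placeholders.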

Variables

등록일 (registration date)
DateTime

Distinct: 8788
Distinct (%): 87.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2013-03-15 02:05:00
Maximum: 2020-10-13 12:59:00
[Figure: histogram with fixed size bins (bins=50)]
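
The caption refers to 50 fixed-size bins. A minimal sketch of what that means for a datetime column, assuming equal-width bins that span the minimum to the maximum with the maximum counted in the last bin (the usual NumPy/Matplotlib convention, which the report does not spell out). The function name `fixed_width_bins` is hypothetical.

```python
from datetime import datetime

def fixed_width_bins(values, n_bins=50):
    """Count datetimes in n_bins equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("all values are identical; bin width would be zero")
    width = (hi - lo) / n_bins  # a timedelta: the span covered by one bin
    counts = [0] * n_bins
    for v in values:
        # The maximum falls on the final edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return width, counts

# Minimum and maximum from the statistics above, plus one timestamp from the Sample section.
values = [
    datetime(2013, 3, 15, 2, 5),
    datetime(2020, 10, 13, 12, 59),
    datetime(2016, 3, 25, 1, 30),
]
width, counts = fixed_width_bins(values)
print(width)        # span of a single bin
print(sum(counts))  # 3
```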

수료일 (completion date)
DateTime

Distinct: 117
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2013-03-22 00:00:00
Maximum: 2020-10-31 00:00:00
[Figure: histogram with fixed size bins (bins=50)]
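
0:승인, 1:승인대기, 2:이수완료, 3:취소 (0: approved, 1: awaiting approval, 2: course completed, 3: cancelled)
Categorical

IMBALANCE

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Value counts: 0 (9581), 3 (410), 1 (9)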

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 3

Common Values

Value | Count | Frequency (%)
0 | 9581 | 95.8%
3 | 410 | 4.1%
1 | 9 | 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

진도율 (progress rate)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Value counts: 0 (10000)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 10000 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
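
수료증 출력 제한(1:기본, 2:출력제한,3:출력가능) (certificate print restriction; 1: default, 2: printing restricted, 3: printing allowed)
Categorical

HIGH CORRELATION  IMBALANCE

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Value counts: 1 (9941), 3 (36), 2 (23)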

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 9941 | 99.4%
3 | 36 | 0.4%
2 | 23 | 0.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

출석여부(0:미출석1:출석) (attendance; 0: absent, 1: present)
Categorical

HIGH CORRELATION  IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Value counts: 0 (9973), 1 (27)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9973 | 99.7%
1 | 27 | 0.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Correlations

[Figure: correlation matrix plot]

(variable) | 0:승인, 1:승인대기, 2:이수완료, 3:취소 | 수료증 출력 제한(1:기본, 2:출력제한,3:출력가능) | 출석여부(0:미출석1:출석)
0:승인, 1:승인대기, 2:이수완료, 3:취소 | 1.000 | 0.162 | 0.000
수료증 출력 제한(1:기본, 2:출력제한,3:출력가능) | 0.162 | 1.000 | 0.596
출석여부(0:미출석1:출석) | 0.000 | 0.596 | 1.000

[Figure: correlation matrix plot]

(variable) | 수료증 출력 제한(1:기본, 2:출력제한,3:출력가능) | 0:승인, 1:승인대기, 2:이수완료, 3:취소 | 출석여부(0:미출석1:출석)
수료증 출력 제한(1:기본, 2:출력제한,3:출력가능) | 1.000 | 0.049 | 0.866
0:승인, 1:승인대기, 2:이수완료, 3:취소 | 0.049 | 1.000 | 0.000
출석여부(0:미출석1:출석) | 0.866 | 0.000 | 1.000

[Figure: correlation matrix plot]

(variable) | 0:승인, 1:승인대기, 2:이수완료, 3:취소 | 수료증 출력 제한(1:기본, 2:출력제한,3:출력가능) | 출석여부(0:미출석1:출석)
0:승인, 1:승인대기, 2:이수완료, 3:취소 | 1.000 | 0.049 | 0.000
수료증 출력 제한(1:기본, 2:출력제한,3:출력가능) | 0.049 | 1.000 | 0.866
출석여부(0:미출석1:출석) | 0.000 | 0.866 | 1.000

Missing values

[Figure: a simple visualization of nullity by column.]
[Figure: nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

(index) | 등록일 | 수료일 | 0:승인, 1:승인대기, 2:이수완료, 3:취소 | 진도율 | 수료증 출력 제한(1:기본, 2:출력제한,3:출력가능) | 출석여부(0:미출석1:출석)
64876 | 2020-06-08 1:55 | 2020-06-30 | 0 | 0 | 1 | 0
31446 | 2016-03-25 1:30 | 2016-03-31 | 0 | 0 | 1 | 0
44655 | 2017-11-08 2:32 | 2017-11-30 | 0 | 0 | 1 | 0
56098 | 2020-04-23 4:47 | 2020-04-30 | 0 | 0 | 1 | 0
57811 | 2020-05-03 4:17 | 2020-05-31 | 3 | 0 | 1 | 0
75096 | 2020-07-06 10:38 | 2020-07-31 | 0 | 0 | 1 | 0
81572 | 2020-07-15 11:13 | 2020-07-31 | 0 | 0 | 1 | 0
88809 | 2020-09-20 5:54 | 2020-09-30 | 0 | 0 | 1 | 0
26744 | 2015-09-04 4:56 | 2015-09-30 | 0 | 0 | 1 | 0
42447 | 2017-09-08 11:32 | 2017-09-30 | 0 | 0 | 1 | 0

(index) | 등록일 | 수료일 | 0:승인, 1:승인대기, 2:이수완료, 3:취소 | 진도율 | 수료증 출력 제한(1:기본, 2:출력제한,3:출력가능) | 출석여부(0:미출석1:출석)
82125 | 2020-07-16 6:39 | 2020-07-31 | 0 | 0 | 1 | 0
66591 | 2020-06-10 9:58 | 2020-06-30 | 0 | 0 | 1 | 0
38923 | 2017-08-23 3:09 | 2017-08-31 | 0 | 0 | 1 | 0
899 | 2013-05-13 12:59 | 2013-05-19 | 0 | 0 | 1 | 0
88359 | 2020-09-17 3:13 | 2020-09-30 | 0 | 0 | 1 | 0
7052 | 2014-09-27 1:40 | 2014-10-31 | 0 | 0 | 1 | 0
49626 | 2019-04-14 8:22 | 2019-04-30 | 0 | 0 | 1 | 0
42662 | 2017-09-11 1:51 | 2017-09-30 | 0 | 0 | 1 | 0
84610 | 2020-08-04 3:26 | 2020-08-31 | 0 | 0 | 1 | 0
49665 | 2019-05-01 12:31 | 2019-05-31 | 3 | 0 | 1 | 0

Duplicate rows

Most frequently occurring

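(index) | 등록일 | 수료일 | 0:승인, 1:승인대기, 2:이수완료, 3:취소 | 진도율 | 수료증 출력 제한(1:기본, 2:출력제한,3:출력가능) | 출석여부(0:미출석1:출석) | # duplicates
554 | 2020-07-02 12:11 | 2020-07-31 | 0 | 0 | 1 | 0 | 14
413 | 2020-05-11 1:27 | 2020-05-31 | 0 | 0 | 1 | 0 | 10
489 | 2020-06-11 9:00 | 2020-06-30 | 0 | 0 | 1 | 0 | 10
598 | 2020-07-06 6:13 | 2020-07-31 | 0 | 0 | 1 | 0 | 10
513 | 2020-07-01 10:22 | 2020-07-31 | 0 | 0 | 1 | 0 | 9
537 | 2020-07-01 8:10 | 2020-07-31 | 0 | 0 | 1 | 0 | 9
560 | 2020-07-02 5:33 | 2020-07-31 | 0 | 0 | 1 | 0 | 9
622 | 2020-07-08 4:39 | 2020-07-31 | 0 | 0 | 1 | 0 | 9
641 | 2020-07-12 1:47 | 2020-07-31 | 0 | 0 | 1 | 0 | 9
660 | 2020-07-15 11:13 | 2020-07-31 | 0 | 0 | 1 | 0 | 9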
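
A table like the one above can be built by grouping identical rows and counting occurrences. The sketch below uses only the standard library. The helper name `duplicate_groups` is hypothetical, and the toy rows reuse a few values from this report with made-up repetition counts. The report does not state whether its "Duplicate rows" figure counts distinct repeated rows or surplus copies, so both are computed.

```python
from collections import Counter

def duplicate_groups(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

# Hypothetical toy data: (등록일, 수료일, status, 진도율, certificate flag, attendance).
rows = [
    ("2020-07-02 12:11", "2020-07-31", 0, 0, 1, 0),
    ("2020-07-02 12:11", "2020-07-31", 0, 0, 1, 0),
    ("2020-07-02 12:11", "2020-07-31", 0, 0, 1, 0),
    ("2020-05-11 1:27", "2020-05-31", 0, 0, 1, 0),
    ("2020-05-11 1:27", "2020-05-31", 0, 0, 1, 0),
    ("2013-05-13 12:59", "2013-05-19", 0, 0, 1, 0),
]

groups = duplicate_groups(rows)

# Two candidate definitions of "number of duplicate rows".
n_groups = len(groups)                           # distinct rows that repeat
n_surplus = sum(n - 1 for n in groups.values())  # copies beyond the first

# Most frequently occurring duplicates first, as in the table above.
most_frequent = sorted(groups.items(), key=lambda item: item[1], reverse=True)

print(n_groups, n_surplus)  # 2 3
print(most_frequent[0][1])  # 3
```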