Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 498.0 KiB
Average record size in memory: 51.0 B
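The size figures above are related by simple arithmetic. The following is a minimal sketch, assuming 1 KiB = 1024 bytes and that the average record size is the total size divided by the number of observations; the variable names are illustrative and not taken from the report.

```python
# Figures copied from the dataset statistics above.
KIB = 1024                      # assumption: binary kibibyte
total_bytes = 498.0 * KIB       # "Total size in memory"
n_observations = 10_000         # "Number of observations"
n_variables = 5                 # "Number of variables"

# Average bytes per row, and the number of cells checked for missing values.
avg_record_bytes = total_bytes / n_observations
total_cells = n_observations * n_variables

print(f"average record size: {avg_record_bytes:.1f} B")
print(f"total cells: {total_cells}")
```

Rounded to one decimal place, the computed average agrees with the 51.0 B shown in the report.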

Variable types

Numeric: 2
Categorical: 3

Dataset

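Description: Provides information on course registrations for the Smart Vocational Training Platform (STEP) of the Online Lifelong Education Institute at Korea University of Technology and Education (한국기술교육대학교).
Author: Korea University of Technology and Education (한국기술교육대학교)
URL: https://www.data.go.kr/data/15091096/fileData.do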

Alerts

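과정 아이디 (course ID) is highly overall correlated with 학습자 사용자 인덱스 (learner user index) [High correlation]
학습자 사용자 인덱스 (learner user index) is highly overall correlated with 과정 아이디 (course ID) [High correlation]
상태 코드 (status code) is highly overall correlated with 등록 횟수 (registration count) [High correlation]
등록 횟수 (registration count) is highly overall correlated with 상태 코드 (status code) [High correlation]
상태 코드 (status code) is highly imbalanced (67.7%) [Imbalance]
등록 횟수 (registration count) is highly imbalanced (98.4%) [Imbalance]
등록국가 (registration country) is highly imbalanced (93.7%) [Imbalance]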
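The imbalance percentages in the alerts above can be reproduced from the category counts listed later in this report. The sketch below assumes the score is one minus the Shannon entropy of the category distribution divided by the logarithm of the number of categories; that definition is an assumption, adopted here because it matches the reported figures. The function name `imbalance_score` is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - normalized Shannon entropy (0 = uniform, 1 = one class only)."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    # Entropy in nats; zero-count classes contribute nothing.
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(n_classes)

# Category counts copied from the Common Values tables of this report.
category_counts = {
    "상태 코드": [8884, 1108, 8],
    "등록 횟수": [9970, 23, 5, 2],
    "등록국가": [9759, 218, 16, 3, 2, 1, 1],
}

scores = {name: round(100 * imbalance_score(c), 1) for name, c in category_counts.items()}
for name, score in scores.items():
    print(f"{name}: {score}%")
```

Under this assumption the three scores come out at 67.7%, 98.4% and 93.7%, the same values shown in the alerts.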

Reproduction

Analysis started: 2023-12-13 00:06:25.121025
Analysis finished: 2023-12-13 00:06:25.906273
Duration: 0.79 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
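A report of this kind could be regenerated along the following lines. This is a sketch rather than the configuration actually used: the CSV path, the `cp949` encoding, the report title, and the function name `build_profile` are all assumptions, and the real settings are in the downloadable config.json.

```python
def build_profile(csv_path, output_html="report.html", encoding="cp949"):
    """Profile a CSV export of the dataset and write an HTML report.

    csv_path, output_html and encoding are hypothetical defaults.
    """
    # Imported lazily so this module can be loaded without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(
        df,
        title="STEP course registrations",  # hypothetical title
        dataset={
            "description": "Course registrations on the STEP platform",
            "url": "https://www.data.go.kr/data/15091096/fileData.do",
        },
    )
    report.to_file(output_html)
    return report
```

Calling `build_profile("registrations.csv")` (a placeholder file name) would write `report.html` next to the script.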

Variables

과정 아이디 (course ID)
Real number (ℝ)

HIGH CORRELATION

Distinct: 2534
Distinct (%): 25.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5523.7445
Minimum: 3
Maximum: 7290
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 3
5-th percentile: 1256
Q1: 4961
Median: 6253
Q3: 6596
95-th percentile: 7154
Maximum: 7290
Range: 7287
Interquartile range (IQR): 1635

Descriptive statistics

Standard deviation: 1719.0004
Coefficient of variation (CV): 0.311202
Kurtosis: 1.3620018
Mean: 5523.7445
Median Absolute Deviation (MAD): 584
Skewness: -1.4955057
Sum: 55237445
Variance: 2954962.2
Monotonicity: Not monotonic
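Several of the figures above are derived from the others. The sketch below recomputes them from the reported minimum, maximum, quartiles, mean and standard deviation, using the standard definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean). The dictionary keys and the function name `derived` are illustrative.

```python
# Summary values copied from the report for 과정 아이디 (course ID).
stats = {
    "min": 3, "max": 7290,
    "q1": 4961, "q3": 6596,
    "mean": 5523.7445, "std": 1719.0004,
    "n": 10_000,
}

def derived(s):
    """Recompute the derived statistics from the primary ones."""
    return {
        "range": s["max"] - s["min"],
        "iqr": s["q3"] - s["q1"],
        "cv": s["std"] / s["mean"],
        "variance": s["std"] ** 2,   # differs slightly from the report: std is rounded
        "sum": s["mean"] * s["n"],
    }

d = derived(stats)
for key, value in d.items():
    print(f"{key}: {value}")
```

The range, IQR and CV match the report exactly; the variance differs in the last digits only because the standard deviation shown in the report is itself rounded.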
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
2600 | 127 | 1.3%
6395 | 109 | 1.1%
6430 | 104 | 1.0%
6239 | 99 | 1.0%
6400 | 96 | 1.0%
6410 | 92 | 0.9%
2597 | 91 | 0.9%
6414 | 85 | 0.9%
7131 | 84 | 0.8%
2596 | 77 | 0.8%
Other values (2524) | 9036 | 90.4%
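Each frequency in the table is the count divided by the 10000 observations. The sketch below shows one way to format such a value, including the "< 0.1%" label used for very rare values. The exact cutoff for that label (a fraction that rounds to zero at three decimals) is an assumption, and `format_frequency` is an illustrative name.

```python
def format_frequency(count, total):
    """Format count/total as a percentage with one decimal place."""
    fraction = count / total
    # Assumed rule: positive fractions that round to 0.000 are shown as "< 0.1%".
    if fraction > 0 and round(fraction, 3) == 0:
        return "< 0.1%"
    return f"{fraction:.1%}"

TOTAL = 10_000  # number of observations in the report
for count in (127, 109, 9036, 1):
    print(count, format_frequency(count, TOTAL))
```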
Minimum 10 values
Value | Count | Frequency (%)
3 | 1 | < 0.1%
31 | 1 | < 0.1%
32 | 1 | < 0.1%
33 | 4 | < 0.1%
37 | 1 | < 0.1%
107 | 1 | < 0.1%
112 | 1 | < 0.1%
124 | 1 | < 0.1%
142 | 1 | < 0.1%
143 | 1 | < 0.1%
Maximum 10 values
Value | Count | Frequency (%)
7290 | 2 | < 0.1%
7289 | 1 | < 0.1%
7288 | 3 | < 0.1%
7287 | 11 | 0.1%
7286 | 10 | 0.1%
7285 | 3 | < 0.1%
7284 | 2 | < 0.1%
7282 | 3 | < 0.1%
7281 | 3 | < 0.1%
7280 | 2 | < 0.1%

학습자 사용자 인덱스 (learner user index)
Real number (ℝ)

HIGH CORRELATION

Distinct: 7989
Distinct (%): 79.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 738572.61
Minimum: 352
Maximum: 21857958
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 352
5-th percentile: 28915
Q1: 80140.75
Median: 151531
Q3: 206105.25
95-th percentile: 1300275.6
Maximum: 21857958
Range: 21857606
Interquartile range (IQR): 125964.5

Descriptive statistics

Standard deviation: 2823014.2
Coefficient of variation (CV): 3.8222569
Kurtosis: 27.216406
Mean: 738572.61
Median Absolute Deviation (MAD): 60050
Skewness: 5.2166864
Sum: 7.3857261 × 10^9
Variance: 7.9694094 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
145881 | 13 | 0.1%
23349 | 10 | 0.1%
57355 | 10 | 0.1%
3307 | 10 | 0.1%
72660 | 9 | 0.1%
30226 | 8 | 0.1%
3251 | 7 | 0.1%
56712 | 7 | 0.1%
363 | 7 | 0.1%
78947 | 6 | 0.1%
Other values (7979) | 9913 | 99.1%

Minimum 10 values
Value | Count | Frequency (%)
352 | 2 | < 0.1%
359 | 3 | < 0.1%
363 | 7 | 0.1%
377 | 1 | < 0.1%
412 | 1 | < 0.1%
463 | 1 | < 0.1%
466 | 4 | < 0.1%
477 | 1 | < 0.1%
484 | 2 | < 0.1%
492 | 2 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
21857958 | 1 | < 0.1%
21765523 | 1 | < 0.1%
21743845 | 1 | < 0.1%
21714416 | 1 | < 0.1%
21713495 | 1 | < 0.1%
21636059 | 1 | < 0.1%
21583580 | 1 | < 0.1%
21403558 | 1 | < 0.1%
21208201 | 1 | < 0.1%
21205789 | 1 | < 0.1%

상태 코드 (status code)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Category counts: 수강 (enrolled) 8884; 수강취소 (enrollment cancelled) 1108; 수강대기 (waitlisted) 8

Length

Max length: 4
Median length: 2
Mean length: 2.2232
Min length: 2
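The length statistics above follow directly from the category counts: each value contributes its character length once per occurrence. A minimal sketch, assuming length is measured in Unicode code points (as Python's `len` does on these precomposed Hangul strings); the names are illustrative.

```python
import statistics

# Category counts copied from the Common Values table for 상태 코드 (status code).
counts = {"수강": 8884, "수강취소": 1108, "수강대기": 8}

# One length entry per observation (10000 small integers).
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

length_stats = {
    "max": max(lengths),
    "median": statistics.median(lengths),
    "mean": sum(lengths) / len(lengths),
    "min": min(lengths),
}
print(length_stats)
```

The mean works out as (8884 × 2 + 1108 × 4 + 8 × 4) / 10000 = 2.2232, the value shown above.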

Unique

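Unique: 0
Unique (%): 0.0%

Sample

1st row: 수강 (enrolled)
2nd row: 수강 (enrolled)
3rd row: 수강 (enrolled)
4th row: 수강 (enrolled)
5th row: 수강취소 (enrollment cancelled)

Common Values

Value | Count | Frequency (%)
수강 (enrolled) | 8884 | 88.8%
수강취소 (enrollment cancelled) | 1108 | 11.1%
수강대기 (waitlisted) | 8 | 0.1%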

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
수강 (enrolled) | 8884 | 88.8%
수강취소 (enrollment cancelled) | 1108 | 11.1%
수강대기 (waitlisted) | 8 | 0.1%

등록 횟수 (registration count)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Category counts: 1 (9970); 2 (23); 0 (5); 3 (2)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 9970 | 99.7%
2 | 23 | 0.2%
0 | 5 | 0.1%
3 | 2 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
1 | 9970 | 99.7%
2 | 23 | 0.2%
0 | 5 | < 0.1%
3 | 2 | < 0.1%

등록국가 (registration country)
Categorical

IMBALANCE

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Category counts: KR 9759; UNKNOWN 218; US 16; CN 3; JP 2; Other values (2) 2

Length

Max length: 7
Median length: 2
Mean length: 2.109
Min length: 2

Unique

Unique: 2
Unique (%): < 0.1%
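"Distinct" and "Unique" measure different things here: distinct counts how many different values occur, while unique counts the values that occur exactly once. A short sketch using the country counts from this report makes the difference concrete; the variable names are illustrative.

```python
from collections import Counter

# Counts copied from the Common Values table for 등록국가 (registration country).
country_counts = Counter(
    {"KR": 9759, "UNKNOWN": 218, "US": 16, "CN": 3, "JP": 2, "SG": 1, "FR": 1}
)

n_observations = sum(country_counts.values())
distinct = len(country_counts)                                 # different values seen
unique = sorted(v for v, c in country_counts.items() if c == 1)  # values seen exactly once

print(f"observations: {n_observations}")
print(f"distinct: {distinct}")
print(f"unique: {len(unique)} {unique}")
```

Only SG and FR appear once, which gives the two unique values reported above, against seven distinct values.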

Sample

1st row: KR
2nd row: KR
3rd row: KR
4th row: KR
5th row: KR

Common Values

Value | Count | Frequency (%)
KR | 9759 | 97.6%
UNKNOWN | 218 | 2.2%
US | 16 | 0.2%
CN | 3 | < 0.1%
JP | 2 | < 0.1%
SG | 1 | < 0.1%
FR | 1 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
kr | 9759 | 97.6%
unknown | 218 | 2.2%
us | 16 | 0.2%
cn | 3 | < 0.1%
jp | 2 | < 0.1%
sg | 1 | < 0.1%
fr | 1 | < 0.1%

Interactions


Correlations

Variable | 과정 아이디 | 학습자 사용자 인덱스 | 상태 코드 | 등록 횟수 | 등록국가
과정 아이디 (course ID) | 1.000 | 0.175 | 0.143 | 0.031 | 0.124
학습자 사용자 인덱스 (learner user index) | 0.175 | 1.000 | 0.151 | 0.135 | 0.385
상태 코드 (status code) | 0.143 | 0.151 | 1.000 | 0.552 | 0.000
등록 횟수 (registration count) | 0.031 | 0.135 | 0.552 | 1.000 | 0.000
등록국가 (registration country) | 0.124 | 0.385 | 0.000 | 0.000 | 1.000

Variable | 등록국가 | 등록 횟수 | 상태 코드
등록국가 (registration country) | 1.000 | 0.000 | 0.000
등록 횟수 (registration count) | 0.000 | 1.000 | 0.559
상태 코드 (status code) | 0.000 | 0.559 | 1.000

Variable | 과정 아이디 | 학습자 사용자 인덱스 | 상태 코드 | 등록 횟수 | 등록국가
과정 아이디 (course ID) | 1.000 | 0.692 | 0.086 | 0.018 | 0.063
학습자 사용자 인덱스 (learner user index) | 0.692 | 1.000 | 0.102 | 0.093 | 0.143
상태 코드 (status code) | 0.086 | 0.102 | 1.000 | 0.559 | 0.000
등록 횟수 (registration count) | 0.018 | 0.093 | 0.559 | 1.000 | 0.000
등록국가 (registration country) | 0.063 | 0.143 | 0.000 | 0.000 | 1.000
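The high-correlation alerts near the top of the report can be traced back to these matrices. The sketch below scans the two five-variable matrices for off-diagonal coefficients at or above a threshold. The threshold of 0.5 is an assumption (it is not stated in this report), and `high_pairs` is an illustrative name.

```python
labels = ["과정 아이디", "학습자 사용자 인덱스", "상태 코드", "등록 횟수", "등록국가"]

# The two 5x5 matrices, copied from the report in the order they appear.
first = [
    [1.000, 0.175, 0.143, 0.031, 0.124],
    [0.175, 1.000, 0.151, 0.135, 0.385],
    [0.143, 0.151, 1.000, 0.552, 0.000],
    [0.031, 0.135, 0.552, 1.000, 0.000],
    [0.124, 0.385, 0.000, 0.000, 1.000],
]
last = [
    [1.000, 0.692, 0.086, 0.018, 0.063],
    [0.692, 1.000, 0.102, 0.093, 0.143],
    [0.086, 0.102, 1.000, 0.559, 0.000],
    [0.018, 0.093, 0.559, 1.000, 0.000],
    [0.063, 0.143, 0.000, 0.000, 1.000],
]

def high_pairs(names, matrix, threshold=0.5):
    """Return the variable pairs whose coefficient is at or above the threshold."""
    pairs = set()
    for i in range(len(names)):
        for j in range(i + 1, len(names)):  # upper triangle only; matrix is symmetric
            if abs(matrix[i][j]) >= threshold:
                pairs.add((names[i], names[j]))
    return pairs

flagged = high_pairs(labels, first) | high_pairs(labels, last)
for pair in sorted(flagged):
    print(pair)
```

With this threshold the flagged pairs are 과정 아이디 with 학습자 사용자 인덱스 and 상태 코드 with 등록 횟수, the same two pairs named in the alerts.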

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

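Row index, 과정 아이디 (course ID), 학습자 사용자 인덱스 (learner user index), digits run together | 상태 코드 (status code) | 등록 횟수 (registration count) | 등록국가 (registration country)
6176964081248145 | 수강 | 1 | KR
27979539188975 | 수강 | 1 | KR
80182598123285 | 수강 | 1 | KR
10972281151588 | 수강 | 1 | KR
270665247105994 | 수강취소 | 1 | KR
957097177215404 | 수강 | 1 | KR
11393288253130 | 수강 | 1 | KR
13214315758951 | 수강 | 1 | KR
867806929199769 | 수강 | 1 | KR
382675852159477 | 수강 | 1 | KR

Row index, 과정 아이디 (course ID), 학습자 사용자 인덱스 (learner user index), digits run together | 상태 코드 (status code) | 등록 횟수 (registration count) | 등록국가 (registration country)
911257039192988 | 수강취소 | 1 | KR
125303069143848 | 수강 | 1 | KR
62828641011057077 | 수강 | 1 | KR
505946247177285 | 수강 | 1 | KR
371515838160760 | 수강 | 1 | KR
928307130399617 | 수강 | 1 | KR
593976402253586 | 수강 | 1 | KR
383065852160748 | 수강 | 1 | KR
833916857197959 | 수강 | 1 | KR
175857231675 | 수강 | 1 | KR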