Overview

Dataset statistics

Number of variables: 4
Number of observations: 3202
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 106.4 KiB
Average record size in memory: 34.0 B

Variable types

Numeric: 2
Boolean: 1
Categorical: 1

Dataset

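Description: Provides learner-group data and related information for the Smart Vocational Training Platform (STEP) of the Online Lifelong Education Institute at Korea University of Technology and Education.
Author: Korea University of Technology and Education (한국기술교육대학교)
URL: https://www.data.go.kr/data/15091065/fileData.do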

Alerts

High correlation: 마이그레이션 원천 구분 is highly overall correlated with 그룹 아이디 and 2 other fields
High correlation: 팀장 여부 is highly overall correlated with 마이그레이션 원천 구분
High correlation: 그룹 아이디 is highly overall correlated with 학습자 사용자 인덱스 and 1 other fields
High correlation: 학습자 사용자 인덱스 is highly overall correlated with 그룹 아이디 and 1 other fields
Imbalance: 팀장 여부 is highly imbalanced (85.9%)
Imbalance: 마이그레이션 원천 구분 is highly imbalanced (93.0%)
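
The two imbalance percentages can be sanity-checked from the class counts given in the variable sections of this profile. The sketch below assumes the score is one minus the Shannon entropy of the class distribution, normalized by log2 of the number of classes. The report does not state its formula, so this definition is an assumption, although it does reproduce both reported figures. The function name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / log2(number of classes)).

    0.0 means perfectly balanced classes; values near 1.0 mean one class
    dominates. Assumed definition, not taken from the report itself.
    """
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # convention chosen for this sketch only
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Class counts copied from the value tables in this profile.
team_leader = imbalance_score([3138, 64])       # 팀장 여부: False / True
migration_source = imbalance_score([3175, 27])  # 마이그레이션 원천 구분: <NA> / OLEIPORTAL

print(f"{team_leader:.1%}")       # 85.9%
print(f"{migration_source:.1%}")  # 93.0%
```

A score this close to 1 means the minority class carries very little information, which is worth keeping in mind when reading the correlation values that involve these two columns.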

Reproduction

Analysis started: 2023-12-12 04:03:08.458518
Analysis finished: 2023-12-12 04:03:09.431483
Duration: 0.97 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

그룹 아이디 (group ID)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1383
Distinct (%): 43.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2665.6724
Minimum: 8
Maximum: 4165
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 28.3 KiB

Quantile statistics

Minimum: 8
5-th percentile: 139
Q1: 1015.75
Median: 3410.5
Q3: 4123
95-th percentile: 4153
Maximum: 4165
Range: 4157
Interquartile range (IQR): 3107.25

Descriptive statistics

Standard deviation: 1610.5867
Coefficient of variation (CV): 0.6041953
Kurtosis: -1.407386
Mean: 2665.6724
Median Absolute Deviation (MAD): 739.5
Skewness: -0.51957348
Sum: 8535483
Variance: 2593989.6
Monotonicity: Increasing
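
Several of the 그룹 아이디 statistics are functions of the others, so they can be cross-checked against each other. The sketch below re-derives them from the rounded figures printed in this profile, using the usual definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared, sum = mean × number of observations). Because the inputs are already rounded, the derived values are expected to agree only to within a small relative tolerance.

```python
import math

# Values copied from the 그룹 아이디 statistics in this profile (already rounded).
n = 3202
mean = 2665.6724
std = 1610.5867
minimum, maximum = 8, 4165
q1, q3 = 1015.75, 4123

derived = {
    "range": maximum - minimum,
    "iqr": q3 - q1,
    "cv": std / mean,
    "variance": std ** 2,
    "sum": mean * n,
}

reported = {
    "range": 4157,
    "iqr": 3107.25,
    "cv": 0.6041953,
    "variance": 2593989.6,
    "sum": 8535483,
}

for name, value in derived.items():
    # The relative tolerance absorbs the rounding of the reported inputs.
    ok = math.isclose(value, reported[name], rel_tol=1e-6)
    print(f"{name}: derived={value:.8g} reported={reported[name]} match={ok}")
```

All five derived values match the reported ones at this tolerance, which suggests the flattened numbers in this section were extracted intact.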
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
4153 | 144 | 4.5%
4120 | 138 | 4.3%
4123 | 138 | 4.3%
4126 | 59 | 1.8%
4129 | 59 | 1.8%
4132 | 59 | 1.8%
4135 | 59 | 1.8%
4138 | 59 | 1.8%
4141 | 59 | 1.8%
4144 | 59 | 1.8%
Other values (1373) | 2369 | 74.0%
Minimum 10 values
Value | Count | Frequency (%)
8 | 2 | 0.1%
10 | 1 | < 0.1%
12 | 1 | < 0.1%
19 | 2 | 0.1%
22 | 1 | < 0.1%
25 | 2 | 0.1%
28 | 4 | 0.1%
31 | 6 | 0.2%
34 | 4 | 0.1%
35 | 4 | 0.1%
Maximum 10 values
Value | Count | Frequency (%)
4165 | 27 | 0.8%
4156 | 26 | 0.8%
4153 | 144 | 4.5%
4150 | 59 | 1.8%
4147 | 59 | 1.8%
4144 | 59 | 1.8%
4141 | 59 | 1.8%
4138 | 59 | 1.8%
4135 | 59 | 1.8%
4132 | 59 | 1.8%

학습자 사용자 인덱스 (learner user index)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 675
Distinct (%): 21.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9685298.8
Minimum: 565
Maximum: 11172644
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 28.3 KiB

Quantile statistics

Minimum: 565
5-th percentile: 526608
Q1: 11119925
Median: 11126020
Q3: 11129135
95-th percentile: 11130110
Maximum: 11172644
Range: 11172079
Interquartile range (IQR): 9210

Descriptive statistics

Standard deviation: 3550256.6
Coefficient of variation (CV): 0.36656139
Kurtosis: 2.5026624
Mean: 9685298.8
Median Absolute Deviation (MAD): 3819
Skewness: -2.1132274
Sum: 3.1012327 × 10^10
Variance: 1.2604322 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
11130245 | 18 | 0.6%
11129460 | 18 | 0.6%
11129406 | 18 | 0.6%
11129378 | 18 | 0.6%
11129351 | 18 | 0.6%
11129324 | 18 | 0.6%
11129297 | 18 | 0.6%
11129270 | 18 | 0.6%
11129243 | 18 | 0.6%
11129216 | 18 | 0.6%
Other values (665) | 3022 | 94.4%
Minimum 10 values
Value | Count | Frequency (%)
565 | 3 | 0.1%
1476 | 3 | 0.1%
26100 | 1 | < 0.1%
29557 | 1 | < 0.1%
45786 | 1 | < 0.1%
46828 | 3 | 0.1%
49608 | 3 | 0.1%
52763 | 3 | 0.1%
72054 | 3 | 0.1%
72271 | 3 | 0.1%
Maximum 10 values
Value | Count | Frequency (%)
11172644 | 1 | < 0.1%
11172617 | 1 | < 0.1%
11172590 | 1 | < 0.1%
11172563 | 1 | < 0.1%
11172536 | 1 | < 0.1%
11172509 | 1 | < 0.1%
11172482 | 1 | < 0.1%
11172455 | 1 | < 0.1%
11172428 | 1 | < 0.1%
11172401 | 1 | < 0.1%

팀장 여부 (team leader flag)
Boolean

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB
Value | Count | Frequency (%)
False | 3138 | 98.0%
True | 64 | 2.0%

마이그레이션 원천 구분 (migration source category)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 25.1 KiB

Length

Max length: 10
Median length: 4
Mean length: 4.0505934
Min length: 4
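
The length statistics follow directly from the two category values and their counts. The sketch below assumes, as the minimum length of 4 suggests, that the missing-value marker is treated as the literal four-character string `<NA>` when lengths are measured. That reading is an inference from the reported numbers rather than something the report states.

```python
import statistics

# Category values and counts copied from the common-values table in this profile.
counts = {"<NA>": 3175, "OLEIPORTAL": 27}

# Expand to one length per observation so the usual summary functions apply.
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

max_length = max(lengths)
min_length = min(lengths)
median_length = statistics.median(lengths)
mean_length = sum(lengths) / len(lengths)

print(max_length, median_length, min_length)
print(f"{mean_length:.7f}")  # 4.0505934
```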

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: OLEIPORTAL
2nd row: OLEIPORTAL
3rd row: OLEIPORTAL
4th row: OLEIPORTAL
5th row: OLEIPORTAL

Common Values

Value | Count | Frequency (%)
<NA> | 3175 | 99.2%
OLEIPORTAL | 27 | 0.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 3175 | 99.2%
oleiportal | 27 | 0.8%

Interactions


Correlations

Variable | 그룹 아이디 | 학습자 사용자 인덱스 | 팀장 여부
그룹 아이디 | 1.000 | 0.721 | 0.381
학습자 사용자 인덱스 | 0.721 | 1.000 | 0.185
팀장 여부 | 0.381 | 0.185 | 1.000

Variable | 마이그레이션 원천 구분 | 팀장 여부
마이그레이션 원천 구분 | 1.000 | 1.000
팀장 여부 | 1.000 | 1.000

Variable | 그룹 아이디 | 학습자 사용자 인덱스 | 팀장 여부 | 마이그레이션 원천 구분
그룹 아이디 | 1.000 | 0.669 | 0.292 | 1.000
학습자 사용자 인덱스 | 0.669 | 1.000 | 0.307 | 1.000
팀장 여부 | 0.292 | 0.307 | 1.000 | 1.000
마이그레이션 원천 구분 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

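First rows
Row | 그룹 아이디 | 학습자 사용자 인덱스 | 팀장 여부 | 마이그레이션 원천 구분
0 | 8 | 227698 | Y | OLEIPORTAL
1 | 8 | 334978 | N | OLEIPORTAL
2 | 10 | 335031 | Y | OLEIPORTAL
3 | 12 | 335051 | N | OLEIPORTAL
4 | 19 | 227698 | Y | OLEIPORTAL
5 | 19 | 334978 | N | OLEIPORTAL
6 | 22 | 335031 | Y | OLEIPORTAL
7 | 25 | 704485 | Y | OLEIPORTAL
8 | 25 | 1106375 | N | OLEIPORTAL
9 | 28 | 1119477 | Y | OLEIPORTAL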
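Last rows
Row | 그룹 아이디 | 학습자 사용자 인덱스 | 팀장 여부 | 마이그레이션 원천 구분
3192 | 4165 | 11172401 | N | <NA>
3193 | 4165 | 11172428 | N | <NA>
3194 | 4165 | 11172455 | N | <NA>
3195 | 4165 | 11172482 | N | <NA>
3196 | 4165 | 11172509 | N | <NA>
3197 | 4165 | 11172536 | N | <NA>
3198 | 4165 | 11172563 | N | <NA>
3199 | 4165 | 11172590 | N | <NA>
3200 | 4165 | 11172617 | N | <NA>
3201 | 4165 | 11172644 | N | <NA>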