Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 400.4 KiB
Average record size in memory: 41.0 B
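
The average record size can be derived from the other figures: total memory divided by the number of observations. A minimal check, assuming 1 KiB = 1024 bytes (the variable names are illustrative only):

```python
# Figures as reported in the dataset statistics.
total_kib = 400.4          # total size in memory, in KiB
n_observations = 10_000    # number of rows

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```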

Variable types

Numeric: 1
Text: 1
DateTime: 1
Categorical: 1

Dataset

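Description: Provides information on courses that users have added to their favorites on the smart vocational training platform (STEP) of the Online Lifelong Education Institute at Korea University of Technology and Education (한국기술교육대학교).
Author: Korea University of Technology and Education (한국기술교육대학교)
URL: https://www.data.go.kr/data/15090900/fileData.do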

Alerts

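사용자 인덱스 (user index) is highly correlated overall with 마이그레이션 원천 구분 (migration source type): High correlation
마이그레이션 원천 구분 (migration source type) is highly correlated overall with 사용자 인덱스 (user index): High correlation
등록 일시 (registration date and time) has unique values: Unique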

Reproduction

Analysis started: 2023-12-12 20:14:55.330473
Analysis finished: 2023-12-12 20:14:56.157470
Duration: 0.83 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
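
The reported duration is the difference between the two timestamps. A short sketch that reproduces it with the standard library (nothing here comes from the profiler's own code):

```python
from datetime import datetime

# Timestamps as reported in the Reproduction section.
started = datetime.fromisoformat("2023-12-12 20:14:55.330473")
finished = datetime.fromisoformat("2023-12-12 20:14:56.157470")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```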

Variables

사용자 인덱스 (user index)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 5509
Distinct (%): 55.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5317560.2
Minimum: 470
Maximum: 14590835
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 470
5th percentile: 114756.9
Q1: 588691.25
Median: 1404968
Q3: 11034808
95th percentile: 14166840
Maximum: 14590835
Range: 14590365
Interquartile range (IQR): 10446117
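
Range and IQR follow directly from the other quantile statistics: range is maximum minus minimum, and IQR is Q3 minus Q1. A minimal check using the reported figures (variable names are illustrative):

```python
# Quantile statistics as reported for this variable.
minimum, maximum = 470, 14_590_835
q1, q3 = 588_691.25, 11_034_808

value_range = maximum - minimum   # spread of the full data
iqr = q3 - q1                     # spread of the middle 50%

print(value_range)
print(round(iqr))
```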

Descriptive statistics

Standard deviation: 5632955.4
Coefficient of variation (CV): 1.059312
Kurtosis: -1.6493836
Mean: 5317560.2
Median Absolute Deviation (MAD): 1190094
Skewness: 0.46916558
Sum: 5.3175602 × 10^10
Variance: 3.1730186 × 10^13
Monotonicity: Not monotonic
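
Several of these descriptive statistics are related by simple identities: CV is the standard deviation divided by the mean, variance is the standard deviation squared, and the sum is the mean times the number of observations. A sketch that checks the reported figures against each other, within rounding (the variable names are illustrative):

```python
import math

# Figures as reported for this variable.
mean = 5_317_560.2
std = 5_632_955.4
n = 10_000

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance from the standard deviation
total = mean * n       # sum from the mean

# Compare with the reported values, allowing for rounding in the report.
print(math.isclose(cv, 1.059312, rel_tol=1e-6))
print(math.isclose(variance, 3.1730186e13, rel_tol=1e-6))
print(math.isclose(total, 5.3175602e10, rel_tol=1e-9))
```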
Histogram with fixed size bins (bins=50)
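
Fixed-size binning splits the observed range into equal-width intervals. The sketch below assumes 50 bins spanning the reported minimum and maximum; `bin_index` is a hypothetical helper written for illustration, and the profiler's own binning may treat bin edges differently.

```python
# Reported minimum and maximum of the variable, and the bin count from the caption.
minimum, maximum = 470, 14_590_835
bins = 50
bin_width = (maximum - minimum) / bins

def bin_index(value: float) -> int:
    """Return the 0-based bin for value; the maximum falls in the last bin."""
    if not minimum <= value <= maximum:
        raise ValueError("value outside the histogram range")
    return min(int((value - minimum) / bin_width), bins - 1)

print(bin_width)
print(bin_index(1_404_968))  # bin that contains the reported median
```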
Value | Count | Frequency (%)
1130510 | 66 | 0.7%
14208155 | 63 | 0.6%
103407 | 39 | 0.4%
889703 | 24 | 0.2%
59787 | 21 | 0.2%
1444200 | 21 | 0.2%
10256256 | 17 | 0.2%
626086 | 17 | 0.2%
12515254 | 17 | 0.2%
261459 | 17 | 0.2%
Other values (5499) | 9698 | 97.0%
Value | Count | Frequency (%)
470 | 1 | < 0.1%
1476 | 2 | < 0.1%
1727 | 1 | < 0.1%
2038 | 2 | < 0.1%
2275 | 2 | < 0.1%
2607 | 1 | < 0.1%
23349 | 1 | < 0.1%
23457 | 3 | < 0.1%
24061 | 3 | < 0.1%
25299 | 1 | < 0.1%
Value | Count | Frequency (%)
14590835 | 3 | < 0.1%
14590576 | 2 | < 0.1%
14589410 | 2 | < 0.1%
14588965 | 1 | < 0.1%
14587836 | 1 | < 0.1%
14586948 | 1 | < 0.1%
14578028 | 2 | < 0.1%
14576621 | 1 | < 0.1%
14573845 | 1 | < 0.1%
14573327 | 1 | < 0.1%

과정 아이디 (course ID)
Text

Distinct: 1404
Distinct (%): 14.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 73
Median length: 38
Mean length: 16.419
Min length: 3

Characters and Unicode

Total characters: 164190
Distinct characters: 605
Distinct categories: 12
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
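
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point as a two-letter code (for example `Lo` for Other Letter, `Lu` for Uppercase Letter, `Zs` for Space Separator). The sketch below counts categories for one sample value; `category_counts` is a hypothetical helper, not part of the profiler.

```python
import unicodedata
from collections import Counter

def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category (e.g. 'Lu', 'Lo', 'Zs')."""
    return Counter(unicodedata.category(ch) for ch in text)

# First sample row of this variable, normalized so Hangul syllables are precomposed.
sample = unicodedata.normalize("NFC", "PLC 기본(XGT/XGK)")
counts = category_counts(sample)
print(counts)
```

Summing such counters over every row gives per-category totals of the kind the report lists under its long category names.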

Unique

Unique: 344
Unique (%): 3.4%
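
Distinct and Unique measure different things: Distinct is the number of different values that occur, while Unique is the number of values that occur exactly once. A minimal sketch on toy data (the helper name and the data are illustrative, not taken from the dataset):

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of distinct values, number of values seen exactly once)."""
    freq = Counter(values)
    distinct = len(freq)
    unique = sum(1 for count in freq.values() if count == 1)
    return distinct, unique

# Toy data: "a" and "d" occur once, "b" twice, "c" three times.
result = distinct_and_unique(["a", "b", "b", "c", "c", "c", "d"])
print(result)  # (4, 2)
```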

Sample

1st row: PLC 기본(XGT/XGK)
2nd row: 실습과 함께하는! 빅데이터 구축과 분석 실무(하둡)
3rd row: 기획 및 프리젠테이션 전략
4th row: 중소기업 기술보호 이해 및 핵심수칙
5th row: 빅데이터 수집 part 1
Value | Count | Frequency (%)
part | 1168 | 3.3%
1 | 643 | 1.8%
활용한 | 591 | 1.7%
(not preserved) | 585 | 1.7%
(not preserved) | 575 | 1.6%
위한 | 425 | 1.2%
2 | 425 | 1.2%
배우는 | 385 | 1.1%
분석 | 353 | 1.0%
설계 | 333 | 0.9%
Other values (2143) | 29618 | 84.4%

Most occurring characters

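Value | Count | Frequency (%)
(space) | 25117 | 15.3%
(not preserved) | 3646 | 2.2%
(not preserved) | 2742 | 1.7%
(not preserved) | 2494 | 1.5%
C | 2212 | 1.3%
1 | 2050 | 1.2%
t | 2036 | 1.2%
2 | 1923 | 1.2%
(not preserved) | 1907 | 1.2%
r | 1888 | 1.1%
Other values (595) | 118175 | 72.0%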

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 100686 | 61.3%
Space Separator | 25117 | 15.3%
Lowercase Letter | 12961 | 7.9%
Uppercase Letter | 11633 | 7.1%
Decimal Number | 6040 | 3.7%
Open Punctuation | 2210 | 1.3%
Close Punctuation | 2210 | 1.3%
Connector Punctuation | 1296 | 0.8%
Other Punctuation | 1285 | 0.8%
Dash Punctuation | 486 | 0.3%
Other values (2) | 266 | 0.2%

Most frequent character per category

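Other Letter
Value | Count | Frequency (%)
(not preserved) | 3646 | 3.6%
(not preserved) | 2742 | 2.7%
(not preserved) | 2494 | 2.5%
(not preserved) | 1907 | 1.9%
(not preserved) | 1733 | 1.7%
(not preserved) | 1651 | 1.6%
(not preserved) | 1637 | 1.6%
(not preserved) | 1587 | 1.6%
(not preserved) | 1585 | 1.6%
(not preserved) | 1483 | 1.5%
Other values (514) | 80221 | 79.7%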
Uppercase Letter
Value | Count | Frequency (%)
C | 2212 | 19.0%
S | 1027 | 8.8%
P | 980 | 8.4%
D | 870 | 7.5%
L | 867 | 7.5%
A | 733 | 6.3%
I | 609 | 5.2%
T | 573 | 4.9%
N | 564 | 4.8%
M | 559 | 4.8%
Other values (16) | 2639 | 22.7%
Lowercase Letter
Value | Count | Frequency (%)
t | 2036 | 15.7%
r | 1888 | 14.6%
a | 1783 | 13.8%
p | 1326 | 10.2%
o | 1002 | 7.7%
e | 856 | 6.6%
i | 525 | 4.1%
n | 507 | 3.9%
l | 496 | 3.8%
c | 407 | 3.1%
Other values (14) | 2135 | 16.5%
Decimal Number
Value | Count | Frequency (%)
1 | 2050 | 33.9%
2 | 1923 | 31.8%
0 | 715 | 11.8%
3 | 542 | 9.0%
4 | 241 | 4.0%
5 | 185 | 3.1%
6 | 148 | 2.5%
9 | 96 | 1.6%
7 | 89 | 1.5%
8 | 51 | 0.8%
Other Punctuation
Value | Count | Frequency (%)
! | 274 | 21.3%
, | 268 | 20.9%
. | 253 | 19.7%
: | 220 | 17.1%
/ | 73 | 5.7%
# | 66 | 5.1%
· | 60 | 4.7%
& | 53 | 4.1%
' | 18 | 1.4%
Math Symbol
Value | Count | Frequency (%)
+ | 142 | 98.6%
> | 1 | 0.7%
< | 1 | 0.7%
Open Punctuation
Value | Count | Frequency (%)
( | 1695 | 76.7%
[ | 515 | 23.3%
Close Punctuation
Value | Count | Frequency (%)
) | 1695 | 76.7%
] | 515 | 23.3%
Letter Number
Value | Count | Frequency (%)
(not preserved) | 108 | 88.5%
(not preserved) | 14 | 11.5%
Space Separator
Value | Count | Frequency (%)
(space) | 25117 | 100.0%
Connector Punctuation
Value | Count | Frequency (%)
_ | 1296 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 486 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 100682 | 61.3%
Common | 38788 | 23.6%
Latin | 24716 | 15.1%
Han | 4 | < 0.1%

Most frequent character per script

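Hangul
Value | Count | Frequency (%)
(not preserved) | 3646 | 3.6%
(not preserved) | 2742 | 2.7%
(not preserved) | 2494 | 2.5%
(not preserved) | 1907 | 1.9%
(not preserved) | 1733 | 1.7%
(not preserved) | 1651 | 1.6%
(not preserved) | 1637 | 1.6%
(not preserved) | 1587 | 1.6%
(not preserved) | 1585 | 1.6%
(not preserved) | 1483 | 1.5%
Other values (512) | 80217 | 79.7%
Latin
Value | Count | Frequency (%)
C | 2212 | 8.9%
t | 2036 | 8.2%
r | 1888 | 7.6%
a | 1783 | 7.2%
p | 1326 | 5.4%
S | 1027 | 4.2%
o | 1002 | 4.1%
P | 980 | 4.0%
D | 870 | 3.5%
L | 867 | 3.5%
Other values (42) | 10725 | 43.4%
Common
Value | Count | Frequency (%)
(space) | 25117 | 64.8%
1 | 2050 | 5.3%
2 | 1923 | 5.0%
( | 1695 | 4.4%
) | 1695 | 4.4%
_ | 1296 | 3.3%
0 | 715 | 1.8%
3 | 542 | 1.4%
[ | 515 | 1.3%
] | 515 | 1.3%
Other values (19) | 2725 | 7.0%
Han
Value | Count | Frequency (%)
(not preserved) | 2 | 50.0%
(not preserved) | 2 | 50.0%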

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 100682 | 61.3%
ASCII | 63322 | 38.6%
Number Forms | 122 | 0.1%
None | 60 | < 0.1%
CJK | 4 | < 0.1%

Most frequent character per block

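ASCII
Value | Count | Frequency (%)
(space) | 25117 | 39.7%
C | 2212 | 3.5%
1 | 2050 | 3.2%
t | 2036 | 3.2%
2 | 1923 | 3.0%
r | 1888 | 3.0%
a | 1783 | 2.8%
( | 1695 | 2.7%
) | 1695 | 2.7%
p | 1326 | 2.1%
Other values (68) | 21597 | 34.1%
Hangul
Value | Count | Frequency (%)
(not preserved) | 3646 | 3.6%
(not preserved) | 2742 | 2.7%
(not preserved) | 2494 | 2.5%
(not preserved) | 1907 | 1.9%
(not preserved) | 1733 | 1.7%
(not preserved) | 1651 | 1.6%
(not preserved) | 1637 | 1.6%
(not preserved) | 1587 | 1.6%
(not preserved) | 1585 | 1.6%
(not preserved) | 1483 | 1.5%
Other values (512) | 80217 | 79.7%
Number Forms
Value | Count | Frequency (%)
(not preserved) | 108 | 88.5%
(not preserved) | 14 | 11.5%
None
Value | Count | Frequency (%)
· | 60 | 100.0%
CJK
Value | Count | Frequency (%)
(not preserved) | 2 | 50.0%
(not preserved) | 2 | 50.0%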

등록 일시 (registration date and time)
Date

UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2016-09-29 08:56:48
Maximum: 2023-09-25 10:59:17
Histogram with fixed size bins (bins=50)

마이그레이션 원천 구분 (migration source type)
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
<NA>: 5871
OLEIPORTAL: 4129

Length

Max length: 10
Median length: 4
Mean length: 6.4774
Min length: 4
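
The reported mean length matches a weighted average in which `<NA>` is counted as a four-character string alongside the ten-character `OLEIPORTAL`. That suggests the placeholder is stored as literal text, which would also be consistent with this variable reporting 0 missing values; that reading is an inference from the numbers, not something the report states. A short check:

```python
# Value counts as reported for this variable.
counts = {"<NA>": 5871, "OLEIPORTAL": 4129}

total = sum(counts.values())
# Weighted average of string lengths, treating "<NA>" as ordinary text.
mean_length = sum(len(value) * count for value, count in counts.items()) / total
print(round(mean_length, 4))
```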

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: OLEIPORTAL
2nd row: OLEIPORTAL
3rd row: OLEIPORTAL
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 5871 | 58.7%
OLEIPORTAL | 4129 | 41.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 5871 | 58.7%
oleiportal | 4129 | 41.3%

Interactions

[Figure: interactions plot]

Correlations

Variable | 사용자 인덱스
사용자 인덱스 | 1.000

Variable | 사용자 인덱스 | 마이그레이션 원천 구분
사용자 인덱스 | 1.000 | 1.000
마이그레이션 원천 구분 | 1.000 | 1.000
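
The report flags 사용자 인덱스 and 마이그레이션 원천 구분 as highly correlated even though one is numeric and the other categorical. One common way to measure association in that situation is to bin the numeric column and compute Cramér's V on the resulting contingency table. The sketch below implements that idea in plain Python; the bin threshold and the toy data are assumptions for illustration, and the profiler's own calculation may differ in detail.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Cramér's V (no bias correction) for two equal-length categorical sequences."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("inputs must be non-empty and of equal length")
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    k = min(len(row_totals), len(col_totals))
    if k < 2:
        return 0.0  # association is undefined with a single category
    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (k - 1)))

# Toy data (not from the dataset): low IDs share one source, high IDs the other.
user_ids = [100, 200, 300, 400, 5_000_000, 6_000_000, 7_000_000, 8_000_000]
sources = ["OLEIPORTAL"] * 4 + ["<NA>"] * 4
binned = ["low" if u < 1_000_000 else "high" for u in user_ids]
print(cramers_v(binned, sources))  # 1.0 for a perfect association
```

A value near 1.0, as in the matrix, would mean the user index range almost fully determines the migration source; a value near 0 would mean no association.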

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

사용자 인덱스 (preceded by row index) | 과정 아이디 | 등록 일시 | 마이그레이션 원천 구분
36595894053 | PLC 기본(XGT/XGK) | 2018-01-31 19:42:41 | OLEIPORTAL
29390698789 | 실습과 함께하는! 빅데이터 구축과 분석 실무(하둡) | 2018-06-12 02:05:05 | OLEIPORTAL
484761371790 | 기획 및 프리젠테이션 전략 | 2019-05-19 21:51:48 | OLEIPORTAL
121847636 | 중소기업 기술보호 이해 및 핵심수칙 | 2021-05-25 15:30:14 | <NA>
8408612519196 | 빅데이터 수집 part 1 | 2020-08-02 23:17:54 | <NA>
518331444200 | 애플리케이션 요구사항 분석 | 2021-09-01 13:03:17 | <NA>
6280510270095 | PLC 기본(SIEMENS) | 2020-05-28 12:41:53 | <NA>
9121313578515 | [기획의 신] 하루 7분, 기획력 | 2020-11-16 11:31:39 | <NA>
7073161639 | 통계 기반 데이터 분석 | 2019-01-28 16:06:16 | OLEIPORTAL
8030011916484 | 파이썬 프로그래밍(2019년) | 2020-05-21 14:19:07 | <NA>
사용자 인덱스 (preceded by row index) | 과정 아이디 | 등록 일시 | 마이그레이션 원천 구분
6393210282456 | IT시스템통합 운영관리 | 2019-08-16 14:09:30 | OLEIPORTAL
9638197001 | 파이썬 프로그래밍(2019년) | 2021-12-01 17:49:03 | <NA>
27516648140 | 전기설비 보호계전시스템 설계 | 2022-09-15 07:38:12 | <NA>
29741703452 | 네트워크 I_1 | 2017-08-09 09:49:58 | OLEIPORTAL
9270113789966 | 현장에서 배우는 공조냉동시스템 자동제어 | 2020-11-29 12:37:23 | <NA>
36970900531 | 공정흐름도 작성 | 2018-02-04 20:51:11 | OLEIPORTAL
9482414119943 | 재미있게 배우는 기초전자회로 | 2021-01-10 19:49:11 | <NA>
8784613118042 | 논리 데이터베이스 설계 | 2022-02-23 08:44:50 | <NA>
37318908983 | 모두의 일러스트레이터 CC | 2019-06-04 16:41:15 | OLEIPORTAL
32191789697 | [NCS]웹 표준에 맞는 HTML5 프로그래밍_1 | 2019-11-11 01:34:00 | <NA>