Overview

Dataset statistics

Number of variables: 6
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 566.4 KiB
Average record size in memory: 58.0 B

Variable types

Numeric: 2
Text: 1
Categorical: 2
DateTime: 1

Dataset

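Description: Provides information on learner registrations for the Smart Vocational Training Platform (STEP) of the Online Lifelong Education Institute at Korea University of Technology and Education.
Author: Korea University of Technology and Education (한국기술교육대학교)
URL: https://www.data.go.kr/data/15091074/fileData.do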

Alerts

아이디 (ID) is highly overall correlated with 과정 아이디 (course ID) [High correlation]
과정 아이디 (course ID) is highly overall correlated with 아이디 (ID) [High correlation]
등록 국가 (registration country) is highly imbalanced (70.5%) [Imbalance]
등록 기기 타입 (registration device type) is highly imbalanced (72.6%) [Imbalance]
아이디 (ID) has unique values [Unique]
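
The two imbalance percentages above are consistent with a score of one minus the normalized Shannon entropy of the category counts. The sketch below illustrates that reading; it is an assumption about how the score is derived, not a copy of the ydata-profiling source. The function name `imbalance_score` is hypothetical, and the counts are copied from the Common Values tables of this report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy of the counts.

    0.0 means perfectly balanced categories; values near 1.0 mean that
    one category dominates.
    """
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single category is scored as 0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(k)


# Counts copied from the report (등록 국가 and 등록 기기 타입)
country_counts = [7536, 2182, 145, 85, 41, 3, 3, 3, 1, 1]
device_counts = [9530, 470]

print(f"{imbalance_score(country_counts):.1%}")
print(f"{imbalance_score(device_counts):.1%}")
```

Under this assumption the two calls print 70.5% and 72.6%, which agree with the alert values.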

Reproduction

Analysis started: 2023-12-12 00:53:39.664180
Analysis finished: 2023-12-12 00:53:41.021885
Duration: 1.36 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아이디 (ID)
Real number (ℝ)

Alerts: High correlation, Unique

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 135906.33
Minimum: 17
Maximum: 276489
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 17
5-th percentile: 13611.7
Q1: 63843.25
Median: 132472
Q3: 206362
95-th percentile: 265776.4
Maximum: 276489
Range: 276472
Interquartile range (IQR): 142518.75

Descriptive statistics

Standard deviation: 80814.735
Coefficient of variation (CV): 0.59463553
Kurtosis: -1.216497
Mean: 135906.33
Median Absolute Deviation (MAD): 71289
Skewness: 0.072401255
Sum: 1.3590633 × 10^9
Variance: 6.5310214 × 10^9
Monotonicity: Not monotonic
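
Several of the figures above are derived from others in the same tables. As a quick consistency check, the following sketch recomputes them from the reported (rounded) values for 아이디; the variable names are illustrative only, and small differences in the last digits are expected because the inputs are already rounded.

```python
# Values copied from the 아이디 statistics in this report
minimum, maximum = 17, 276489
q1, q3 = 63843.25, 206362
mean, std = 135906.33, 80814.735

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # Interquartile range = Q3 - Q1
cv = std / mean                  # Coefficient of variation = std / mean
variance = std ** 2              # Variance = std squared

print(f"range={value_range}, IQR={iqr}, CV={cv:.6f}, variance={variance:.4e}")
```

The same relations apply to the 과정 아이디 statistics later in the report.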
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
211795 | 1 | < 0.1%
115462 | 1 | < 0.1%
131710 | 1 | < 0.1%
226045 | 1 | < 0.1%
103009 | 1 | < 0.1%
132778 | 1 | < 0.1%
47475 | 1 | < 0.1%
269 | 1 | < 0.1%
10423 | 1 | < 0.1%
178849 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Minimum 10 values

Value | Count | Frequency (%)
17 | 1 | < 0.1%
47 | 1 | < 0.1%
54 | 1 | < 0.1%
62 | 1 | < 0.1%
79 | 1 | < 0.1%
82 | 1 | < 0.1%
91 | 1 | < 0.1%
112 | 1 | < 0.1%
124 | 1 | < 0.1%
130 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
276489 | 1 | < 0.1%
276483 | 1 | < 0.1%
276449 | 1 | < 0.1%
276441 | 1 | < 0.1%
276429 | 1 | < 0.1%
276423 | 1 | < 0.1%
276403 | 1 | < 0.1%
276389 | 1 | < 0.1%
276361 | 1 | < 0.1%
276341 | 1 | < 0.1%

코드 (code)
Text

Distinct: 9566
Distinct (%): 95.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 25
Median length: 24
Mean length: 23.2277
Min length: 13

Characters and Unicode

Total characters: 232277
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
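
As an illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiling tool's own code; the helper name `category_counts` is hypothetical, and the input is limited to the five sample values shown for this variable. Note that `unicodedata.category` returns two-letter codes (`Nd`, `Pd`, `Lu`), which correspond to the long names Decimal Number, Dash Punctuation, and Uppercase Letter used in this report.

```python
import unicodedata
from collections import Counter

# Sample values copied from the "Sample" rows of the 코드 variable
codes = [
    "A200000140343-2020211795",
    "A200000090166-2020139420",
    "A201361010018-2020130507",
    "A200000040159-202094042",
    "A200000060370-202097378",
]


def category_counts(values):
    """Count Unicode general-category codes over every character in values."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


counts = category_counts(codes)
for code, n in counts.most_common():
    print(code, n)
```

On these five samples almost every character is a decimal digit, with one uppercase letter and one dash per value, which mirrors the proportions reported for the full column.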

Unique

Unique: 9174
Unique (%): 91.7%

Sample

1st row: A200000140343-2020211795
2nd row: A200000090166-2020139420
3rd row: A201361010018-2020130507
4th row: A200000040159-202094042
5th row: A200000060370-202097378

Value | Count | Frequency (%)
a200000060324-2020213640 | 7 | 0.1%
a200000040430-202079582 | 4 | < 0.1%
a200000020071-202064561 | 4 | < 0.1%
a201271020025-2020127774 | 4 | < 0.1%
a200000000172-2020103180 | 4 | < 0.1%
a201271010012-2020122857 | 4 | < 0.1%
a190000200268-201951619 | 4 | < 0.1%
a201361010016-2020138970 | 4 | < 0.1%
a200000000171-2020254137 | 3 | < 0.1%
a201271010009-2020104386 | 3 | < 0.1%
Other values (9556) | 9959 | 99.6%

Most occurring characters

Value | Count | Frequency (%)
0 | 83971 | 36.2%
2 | 40165 | 17.3%
1 | 27289 | 11.7%
3 | 10283 | 4.4%
9 | 10242 | 4.4%
- | 10000 | 4.3%
5 | 9945 | 4.3%
A | 9549 | 4.1%
7 | 8484 | 3.7%
4 | 8160 | 3.5%
Other values (2) | 14189 | 6.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 212728 | 91.6%
Dash Punctuation | 10000 | 4.3%
Uppercase Letter | 9549 | 4.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 83971 | 39.5%
2 | 40165 | 18.9%
1 | 27289 | 12.8%
3 | 10283 | 4.8%
9 | 10242 | 4.8%
5 | 9945 | 4.7%
7 | 8484 | 4.0%
4 | 8160 | 3.8%
6 | 7451 | 3.5%
8 | 6738 | 3.2%

Dash Punctuation
Value | Count | Frequency (%)
- | 10000 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
A | 9549 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 222728 | 95.9%
Latin | 9549 | 4.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 83971 | 37.7%
2 | 40165 | 18.0%
1 | 27289 | 12.3%
3 | 10283 | 4.6%
9 | 10242 | 4.6%
- | 10000 | 4.5%
5 | 9945 | 4.5%
7 | 8484 | 3.8%
4 | 8160 | 3.7%
6 | 7451 | 3.3%

Latin
Value | Count | Frequency (%)
A | 9549 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 232277 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 83971 | 36.2%
2 | 40165 | 17.3%
1 | 27289 | 11.7%
3 | 10283 | 4.4%
9 | 10242 | 4.4%
- | 10000 | 4.3%
5 | 9945 | 4.3%
A | 9549 | 4.1%
7 | 8484 | 3.7%
4 | 8160 | 3.5%
Other values (2) | 14189 | 6.1%

과정 아이디 (course ID)
Real number (ℝ)

Alerts: High correlation

Distinct: 4086
Distinct (%): 40.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 114306.65
Minimum: 388
Maximum: 164129
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 388
5-th percentile: 21365.95
Q1: 98379.25
Median: 123533.5
Q3: 139264.75
95-th percentile: 154177
Maximum: 164129
Range: 163741
Interquartile range (IQR): 40885.5

Descriptive statistics

Standard deviation: 34290.341
Coefficient of variation (CV): 0.29998554
Kurtosis: 2.1065562
Mean: 114306.65
Median Absolute Deviation (MAD): 18106.5
Skewness: -1.4117844
Sum: 1.1430665 × 10^9
Variance: 1.1758275 × 10^9
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
132448 | 310 | 3.1%
132451 | 277 | 2.8%
142699 | 266 | 2.7%
90337 | 215 | 2.1%
87487 | 171 | 1.7%
10212 | 122 | 1.2%
127546 | 119 | 1.2%
105427 | 116 | 1.2%
105425 | 115 | 1.1%
127555 | 94 | 0.9%
Other values (4076) | 8195 | 82.0%

Minimum 10 values

Value | Count | Frequency (%)
388 | 1 | < 0.1%
2598 | 1 | < 0.1%
3062 | 1 | < 0.1%
3279 | 1 | < 0.1%
3483 | 1 | < 0.1%
3690 | 1 | < 0.1%
3823 | 1 | < 0.1%
3870 | 1 | < 0.1%
4275 | 1 | < 0.1%
4408 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
164129 | 3 | < 0.1%
164127 | 2 | < 0.1%
163858 | 1 | < 0.1%
163813 | 47 | 0.5%
163810 | 36 | 0.4%
163807 | 3 | < 0.1%
162679 | 16 | 0.2%
162676 | 5 | 0.1%
162673 | 5 | 0.1%
162670 | 24 | 0.2%

등록 국가 (registration country)
Categorical

Alerts: Imbalance

Distinct: 10
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
KR | 7536
UNKNOWN | 2182
US | 145
RU | 85
CN | 41
Other values (5) | 11

Length

Max length: 7
Median length: 2
Mean length: 3.091
Min length: 2

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: KR
2nd row: KR
3rd row: KR
4th row: UNKNOWN
5th row: UNKNOWN

Common Values

Value | Count | Frequency (%)
KR | 7536 | 75.4%
UNKNOWN | 2182 | 21.8%
US | 145 | 1.5%
RU | 85 | 0.9%
CN | 41 | 0.4%
JP | 3 | < 0.1%
ES | 3 | < 0.1%
GB | 3 | < 0.1%
VU | 1 | < 0.1%
TW | 1 | < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
kr | 7536 | 75.4%
unknown | 2182 | 21.8%
us | 145 | 1.5%
ru | 85 | 0.9%
cn | 41 | 0.4%
jp | 3 | < 0.1%
es | 3 | < 0.1%
gb | 3 | < 0.1%
vu | 1 | < 0.1%
tw | 1 | < 0.1%

등록 기기 타입 (registration device type)
Categorical

Alerts: Imbalance

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
PC | 9530
모바일 | 470

Length

Max length: 3
Median length: 2
Mean length: 2.047
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PC
2nd row: PC
3rd row: PC
4th row: PC
5th row: PC

Common Values

Value | Count | Frequency (%)
PC | 9530 | 95.3%
모바일 (mobile) | 470 | 4.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
pc | 9530 | 95.3%
모바일 | 470 | 4.7%

코드 등록 일시 (code registration date and time)
DateTime

Distinct: 9991
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2019-09-14 15:07:48
Maximum: 2021-01-03 00:09:43
[Figure: histogram with fixed size bins (bins=50)]

Interactions

[Figures: four interaction plots; images not preserved]

Correlations

[Figure: correlation plot]

 | 아이디 | 과정 아이디 | 등록 국가 | 등록 기기 타입
아이디 | 1.000 | 0.856 | 0.257 | 0.176
과정 아이디 | 0.856 | 1.000 | 0.208 | 0.159
등록 국가 | 0.257 | 0.208 | 1.000 | 0.492
등록 기기 타입 | 0.176 | 0.159 | 0.492 | 1.000

[Figure: correlation plot]

 | 등록 기기 타입 | 등록 국가
등록 기기 타입 | 1.000 | 0.378
등록 국가 | 0.378 | 1.000

[Figure: correlation plot]

 | 아이디 | 과정 아이디 | 등록 국가 | 등록 기기 타입
아이디 | 1.000 | 0.812 | 0.081 | 0.135
과정 아이디 | 0.812 | 1.000 | 0.065 | 0.122
등록 국가 | 0.081 | 0.065 | 1.000 | 0.378
등록 기기 타입 | 0.135 | 0.122 | 0.378 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

(row index) | 아이디 | 코드 | 과정 아이디 | 등록 국가 | 등록 기기 타입 | 코드 등록 일시
72408 | 211795 | A200000140343-2020211795 | 144871 | KR | PC | 2020-10-15 12:56:12
48490 | 139420 | A200000090166-2020139420 | 128863 | KR | PC | 2020-08-03 15:34:28
45519 | 130507 | A201361010018-2020130507 | 124888 | KR | PC | 2020-07-22 17:56:16
33366 | 94042 | A200000040159-202094042 | 111592 | UNKNOWN | PC | 2020-06-14 14:07:28
34478 | 97378 | A200000060370-202097378 | 120847 | UNKNOWN | PC | 2020-06-16 07:48:58
28347 | 78985 | A200000040156-202078985 | 111583 | KR | PC | 2020-05-15 10:20:57
93895 | 273681 | A200000170213-2020248284 | 153163 | UNKNOWN | PC | 2020-12-28 13:03:18
17676 | 48859 | A200000000071-202048857 | 105425 | KR | PC | 2020-02-29 19:17:21
66243 | 193300 | A200000120461-2020193300 | 139798 | UNKNOWN | PC | 2020-09-16 10:09:35
19265 | 52037 | 10245-202050863 | 10245 | KR | PC | 2020-03-10 10:25:06

(row index) | 아이디 | 코드 | 과정 아이디 | 등록 국가 | 등록 기기 타입 | 코드 등록 일시
69675 | 203596 | A200000130337-2020203596 | 141943 | UNKNOWN | PC | 2020-10-05 11:44:40
71999 | 210568 | A201415010001-2020210568 | 146689 | UNKNOWN | PC | 2020-10-15 08:51:50
13394 | 39109 | A191015100019-201939109 | 95197 | KR | PC | 2020-01-16 17:22:17
37744 | 107182 | A200000000172-2020101071 | 113686 | KR | 모바일 | 2020-06-26 13:38:13
8580 | 24658 | A191015090092-201924658 | 91921 | KR | PC | 2019-12-02 16:35:46
14544 | 42562 | A171037060002-201742559 | 20842 | UNKNOWN | PC | 2020-02-09 15:51:07
9236 | 26632 | A191015090012-201926626 | 91678 | KR | PC | 2019-12-05 18:49:37
83581 | 245314 | A200000170192-2020245314 | 153100 | KR | PC | 2020-12-01 08:28:02
5956 | 16786 | A190000140366-201916786 | 88092 | KR | PC | 2019-11-15 11:09:23
90474 | 265993 | A201355030043-2020265993 | 154093 | KR | PC | 2020-12-18 12:36:03