Overview

Dataset statistics

Number of variables: 6
Number of observations: 10000
Missing cells: 365
Missing cells (%): 0.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 576.2 KiB
Average record size in memory: 59.0 B
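
The two derived figures in this table follow from the raw counts. A minimal sketch of the arithmetic, assuming the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; the variable names are illustrative only:

```python
# Values copied from the "Dataset statistics" table.
n_variables = 6
n_observations = 10_000
missing_cells = 365
total_size_kib = 576.2

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average bytes per row, converting KiB to bytes first.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Under these assumptions the results round to the reported 0.6% and 59.0 B.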

Variable types

Numeric: 2
Categorical: 3
DateTime: 1

Dataset

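Description: Message delivery history for the Smart Vocational Training Platform (STEP) of the Online Lifelong Education Institute, Korea University of Technology and Education (한국기술교육대학교 온라인평생교육원).
Author: Korea University of Technology and Education (한국기술교육대학교)
URL: https://www.data.go.kr/data/15090990/fileData.do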

Alerts

카테고리 코드 is highly overall correlated with 과정 아이디 and 2 other fields [High correlation]
등록 국가 is highly overall correlated with 타입코드 and 1 other fields [High correlation]
아이디 is highly overall correlated with 과정 아이디 [High correlation]
과정 아이디 is highly overall correlated with 아이디 and 1 other fields [High correlation]
타입코드 is highly overall correlated with 카테고리 코드 and 1 other fields [High correlation]
타입코드 is highly imbalanced (75.7%) [Imbalance]
카테고리 코드 is highly imbalanced (87.2%) [Imbalance]
등록 국가 is highly imbalanced (88.0%) [Imbalance]
과정 아이디 has 365 (3.6%) missing values [Missing]
아이디 has unique values [Unique]
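
The imbalance percentages in these alerts can be reproduced from the category counts listed in the per-variable sections. The sketch below assumes the score is one minus the Shannon entropy normalized by the maximum entropy for the number of classes. That formula matches the three reported values, but it is an inference from the numbers rather than a statement of how the profiling tool is implemented; `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k): H is the Shannon entropy (bits) of the class
    frequencies and k is the number of classes. 0 means perfectly balanced,
    values near 1 mean one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # convention assumed here for a single-class column
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Category counts as listed in the per-variable sections of this report.
counts_by_column = {
    "타입코드": [9599, 401],
    "카테고리 코드": [9824, 176],
    "등록 국가": [9751, 176, 73],
}

for name, counts in counts_by_column.items():
    print(f"{name}: {imbalance_score(counts):.1%}")
```

With these counts the sketch gives 75.7%, 87.2%, and 88.0%, in line with the alerts.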

Reproduction

Analysis started: 2023-12-12 03:40:04.882045
Analysis finished: 2023-12-12 03:40:06.218177
Duration: 1.34 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
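
The reported duration is the difference between the two timestamps. A small sketch using only the standard library; the format string is an assumption based on how the timestamps are printed here:

```python
from datetime import datetime

# Assumed format, matching the timestamps as printed in this report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-12 03:40:04.882045", FMT)
finished = datetime.strptime("2023-12-12 03:40:06.218177", FMT)

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```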

Variables

아이디 (ID)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 71785.09
Minimum: 43
Maximum: 215163
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 43
5-th percentile: 5452.05
Q1: 25708.75
Median: 51753
Q3: 80748
95-th percentile: 204779.25
Maximum: 215163
Range: 215120
Interquartile range (IQR): 55039.25

Descriptive statistics

Standard deviation: 64545.734
Coefficient of variation (CV): 0.89915237
Kurtosis: -0.034005661
Mean: 71785.09
Median Absolute Deviation (MAD): 26704
Skewness: 1.1585621
Sum: 7.178509 × 10^8
Variance: 4.1661518 × 10^9
Monotonicity: Not monotonic
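
Several of the statistics for 아이디 are simple functions of the others. The sketch below recomputes them from the reported figures as a consistency check; it uses the rounded values printed in this report, so the last digits can differ slightly from what the tool computed on the raw data.

```python
# Reported summary values for 아이디, copied from this report.
minimum, maximum = 43, 215163
q1, q3 = 25708.75, 80748
mean, std = 71785.09, 64545.734

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # Interquartile range = Q3 - Q1
cv = std / mean                  # Coefficient of variation = std / mean
variance = std ** 2              # Variance = std squared

print(f"Range: {value_range}")
print(f"IQR: {iqr}")
print(f"CV: {cv:.8f}")
print(f"Variance: {variance:.7e}")
```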
Histogram with fixed size bins (bins=50)

Common values

Value                  Count   Frequency (%)
77385                  1       < 0.1%
93934                  1       < 0.1%
18935                  1       < 0.1%
49277                  1       < 0.1%
39392                  1       < 0.1%
180752                 1       < 0.1%
35740                  1       < 0.1%
30321                  1       < 0.1%
191754                 1       < 0.1%
193282                 1       < 0.1%
Other values (9990)    9990    99.9%

Minimum 10 values

Value    Count   Frequency (%)
43       1       < 0.1%
44       1       < 0.1%
98       1       < 0.1%
99       1       < 0.1%
104      1       < 0.1%
109      1       < 0.1%
110      1       < 0.1%
112      1       < 0.1%
114      1       < 0.1%
129      1       < 0.1%

Maximum 10 values

Value     Count   Frequency (%)
215163    1       < 0.1%
215153    1       < 0.1%
215142    1       < 0.1%
215134    1       < 0.1%
215120    1       < 0.1%
215074    1       < 0.1%
215059    1       < 0.1%
215052    1       < 0.1%
215046    1       < 0.1%
215037    1       < 0.1%

타입코드 (type code)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Category counts: 단문 (short message) 9599, 장문 (long message) 401

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 단문
2nd row: 단문
3rd row: 단문
4th row: 단문
5th row: 단문

Common Values

Value   Count   Frequency (%)
단문    9599    96.0%
장문    401     4.0%

Length

Histogram of lengths of the category


카테고리 코드 (category code)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Category counts: 1000 (9824), 9001 (176)

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1000
2nd row: 1000
3rd row: 1000
4th row: 1000
5th row: 1000

Common Values

Value   Count   Frequency (%)
1000    9824    98.2%
9001    176     1.8%

Length

Histogram of lengths of the category


과정 아이디 (course ID)
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 2784
Distinct (%): 28.9%
Missing: 365
Missing (%): 3.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 4938.5661
Minimum: 32
Maximum: 12345
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 32
5-th percentile: 2691
Q1: 3810
Median: 4646
Q3: 5533
95-th percentile: 9052
Maximum: 12345
Range: 12313
Interquartile range (IQR): 1723

Descriptive statistics

Standard deviation: 2001.4632
Coefficient of variation (CV): 0.40527214
Kurtosis: 0.70811729
Mean: 4938.5661
Median Absolute Deviation (MAD): 882
Skewness: 0.53707558
Sum: 47583084
Variance: 4005855.1
Monotonicity: Not monotonic
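
Because 과정 아이디 has 365 missing values, its mean and distinct percentage only line up with the reported sum and distinct count when the denominator is the number of non-missing rows. The sketch below checks this; that the tool excludes missing rows is inferred from the figures in this report, not taken from its documentation.

```python
# Reported values for 과정 아이디, copied from this report.
n_observations = 10_000
n_missing = 365
n_distinct = 2784
total = 47_583_084  # the "Sum" row

# Assumption: statistics are computed over rows that have a value.
n_present = n_observations - n_missing
mean = total / n_present
distinct_pct = 100 * n_distinct / n_present

print(f"Non-missing rows: {n_present}")
print(f"Mean: {mean:.4f}")
print(f"Distinct (%): {distinct_pct:.1f}%")
```

Dividing by all 10000 rows instead would give a mean of about 4758.3 and a distinct share of 27.8%, neither of which matches the report.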
Histogram with fixed size bins (bins=50)

Common values

Value                  Count   Frequency (%)
5620                   51      0.5%
5204                   49      0.5%
5442                   48      0.5%
5087                   46      0.5%
5440                   44      0.4%
5640                   43      0.4%
5536                   42      0.4%
5496                   39      0.4%
5528                   37      0.4%
3831                   34      0.3%
Other values (2774)    9202    92.0%
(Missing)              365     3.6%

Minimum 10 values

Value    Count   Frequency (%)
32       2       < 0.1%
33       8       0.1%
38       1       < 0.1%
142      1       < 0.1%
144      2       < 0.1%
145      1       < 0.1%
147      2       < 0.1%
148      4       < 0.1%
149      1       < 0.1%
151      2       < 0.1%

Maximum 10 values

Value    Count   Frequency (%)
12345    1       < 0.1%
11040    1       < 0.1%
11038    1       < 0.1%
10952    1       < 0.1%
10943    1       < 0.1%
10937    1       < 0.1%
10859    1       < 0.1%
10811    1       < 0.1%
10787    1       < 0.1%
10778    1       < 0.1%

등록 국가 (registration country)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Category counts: KR (9751), US (176), UNKNOWN (73)

Length

Max length: 7
Median length: 2
Mean length: 2.0365
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: KR
2nd row: KR
3rd row: KR
4th row: KR
5th row: KR

Common Values

Value      Count   Frequency (%)
KR         9751    97.5%
US         176     1.8%
UNKNOWN    73      0.7%

Length

Histogram of lengths of the category

등록 일시 (registration date/time)
DateTime

Distinct: 9992
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2015-03-25 10:02:47
Maximum: 2016-11-14 10:27:40
Histogram with fixed size bins (bins=50)

Interactions


Correlations

                아이디   타입코드   카테고리 코드   과정 아이디   등록 국가
아이디          1.000    0.409      0.473           0.836         0.454
타입코드        0.409    1.000      0.855           0.204         0.418
카테고리 코드   0.473    0.855      1.000           NaN           1.000
과정 아이디     0.836    0.204      NaN             1.000         0.022
등록 국가       0.454    0.418      1.000           0.022         1.000

                카테고리 코드   타입코드   등록 국가
카테고리 코드   1.000           0.653      1.000
타입코드        0.653           1.000      0.655
등록 국가       1.000           0.655      1.000

                아이디   과정 아이디   타입코드   카테고리 코드   등록 국가
아이디          1.000    0.958         0.314      0.363           0.306
과정 아이디     0.958    1.000         0.204      1.000           0.022
타입코드        0.314    0.204         1.000      0.653           0.655
카테고리 코드   0.363    1.000         0.653      1.000           1.000
등록 국가       0.306    0.022         0.655      1.000           1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index)   아이디    타입코드   카테고리 코드   과정 아이디   등록 국가   등록 일시
72156     77385     단문       1000            5539          KR          2015-11-28 14:28:52
21056     22270     단문       1000            3587          KR          2015-08-11 09:45:37
78851     114260    단문       1000            6037          KR          2015-12-30 12:12:46
61965     65084     단문       1000            4801          KR          2015-11-08 14:49:42
29809     31233     단문       1000            3948          KR          2015-08-31 17:06:02
83553     190888    단문       1000            8620          KR          2016-07-21 13:51:36
23913     25186     단문       1000            4252          KR          2015-08-18 12:14:43
66601     70030     단문       1000            5544          KR          2015-11-16 20:06:38
36329     38045     단문       1000            4483          KR          2015-09-16 14:54:48
23493     24755     단문       1000            3474          KR          2015-08-17 15:45:59

(index)   아이디    타입코드   카테고리 코드   과정 아이디   등록 국가   등록 일시
58966     61808     단문       1000            5087          KR          2015-11-02 14:49:38
2113      2403      단문       1000            408           KR          2015-05-06 14:16:28
61913     65018     단문       1000            4668          KR          2015-11-08 11:15:10
2523      2851      단문       1000            430           KR          2015-05-11 16:04:15
38187     40008     단문       1000            4402          KR          2015-09-21 16:49:49
52729     55324     단문       1000            4629          KR          2015-10-22 11:45:40
11722     12642     단문       1000            2982          KR          2015-07-10 14:05:04
32585     34130     단문       1000            4050          KR          2015-09-07 13:32:33
94312     207950    단문       1000            9345          KR          2016-10-25 10:50:23
6109      6638      단문       1000            444           KR          2015-06-05 11:00:53