Overview

Dataset statistics

Number of variables: 9
Number of observations: 26
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.1 KiB
Average record size in memory: 82.1 B

Variable types

Numeric: 2
Text: 1
Boolean: 2
Categorical: 3
DateTime: 1

Dataset

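Description: Data on the online educational content and course information offered in the online personal information protection portal. It provides the course name, registration date and time, session number, and related details.
Author: 한국인터넷진흥원 (Korea Internet & Security Agency, KISA)
URL: https://www.data.go.kr/data/15070607/fileData.do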

Alerts

조회수 has constant value "0" | Constant
선택이수챕터수 is highly overall correlated with 노출여부 and 2 other fields | High correlation
노출여부 is highly overall correlated with 선택이수챕터수 | High correlation
필수시험여부 is highly overall correlated with 선택이수챕터수 | High correlation
시험과락갯수 is highly overall correlated with 선택이수챕터수 | High correlation
노출여부 is highly imbalanced (60.9%) | Imbalance
선택이수챕터수 is highly imbalanced (60.8%) | Imbalance
필수시험여부 is highly imbalanced (76.5%) | Imbalance
시험과락갯수 is highly imbalanced (76.5%) | Imbalance
인덱스 has unique values | Unique
강의명 has unique values | Unique
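
The imbalance percentages in the alerts are consistent with a score of one minus the normalized Shannon entropy of the class counts. The sketch below assumes that definition; whether ydata-profiling computes it exactly this way is not confirmed by this report, and the function name `imbalance_score` is hypothetical. The class counts are taken from the per-variable value tables in this report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of class counts.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    This definition is an assumption that happens to match the report's figures.
    """
    k = len(counts)
    if k < 2:
        raise ValueError("the score is undefined for fewer than two classes")
    total = sum(counts)
    # Shannon entropy in bits, skipping empty classes
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Class counts as listed in the report's value tables
class_counts = {
    "노출여부": [24, 2],
    "선택이수챕터수": [23, 2, 1],
    "필수시험여부": [25, 1],
    "시험과락갯수": [25, 1],
}

scores = {name: round(100 * imbalance_score(c), 1) for name, c in class_counts.items()}
for name, pct in scores.items():
    print(f"{name}: {pct}%")
```

Under this assumption the four scores come out at 60.9%, 60.8%, 76.5% and 76.5%, which agrees with the alerts above.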

Reproduction

Analysis started: 2023-12-13 00:43:48.866254
Analysis finished: 2023-12-13 00:43:49.570390
Duration: 0.7 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

인덱스
Real number (ℝ)

UNIQUE 

Distinct: 26
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.192308
Minimum: 1
Maximum: 27
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 366.0 B

Quantile statistics

Minimum: 1
5-th percentile: 2.25
Q1: 7.25
Median: 14.5
Q3: 20.75
95-th percentile: 25.75
Maximum: 27
Range: 26
Interquartile range (IQR): 13.5

Descriptive statistics

Standard deviation: 8.0300398
Coefficient of variation (CV): 0.56580226
Kurtosis: -1.2168073
Mean: 14.192308
Median Absolute Deviation (MAD): 7
Skewness: -0.067387075
Sum: 369
Variance: 64.481538
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=26)

Common values

Value | Count | Frequency (%)
1 | 1 | 3.8%
16 | 1 | 3.8%
27 | 1 | 3.8%
26 | 1 | 3.8%
25 | 1 | 3.8%
24 | 1 | 3.8%
23 | 1 | 3.8%
22 | 1 | 3.8%
21 | 1 | 3.8%
20 | 1 | 3.8%
Other values (16) | 16 | 61.5%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 3.8%
2 | 1 | 3.8%
3 | 1 | 3.8%
4 | 1 | 3.8%
5 | 1 | 3.8%
6 | 1 | 3.8%
7 | 1 | 3.8%
8 | 1 | 3.8%
10 | 1 | 3.8%
11 | 1 | 3.8%

Maximum 10 values

Value | Count | Frequency (%)
27 | 1 | 3.8%
26 | 1 | 3.8%
25 | 1 | 3.8%
24 | 1 | 3.8%
23 | 1 | 3.8%
22 | 1 | 3.8%
21 | 1 | 3.8%
20 | 1 | 3.8%
19 | 1 | 3.8%
18 | 1 | 3.8%
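
The summary statistics reported for 인덱스 can be cross-checked with the Python standard library. The sketch below assumes the column holds the integers 1 to 27 with 9 absent, which is what the minimum-values table, the count of 26 and the sum of 369 imply. It also assumes the report uses the sample (n - 1) variance and linearly interpolated quantiles; `method="inclusive"` is the standard-library option that corresponds to that interpolation.

```python
import statistics

# Assumed column contents: 1..27 without 9 (26 values, sum 369)
values = [v for v in range(1, 28) if v != 9]

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std_dev = statistics.stdev(values)
cv = std_dev / mean                     # coefficient of variation

# Quartiles with linear interpolation between closest ranks
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# A strictly increasing column never repeats or decreases
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(f"n={len(values)} sum={sum(values)} mean={mean:.6f}")
print(f"variance={variance:.6f} std={std_dev:.7f} cv={cv:.8f}")
print(f"Q1={q1} median={median} Q3={q3} IQR={iqr}")
print(f"strictly increasing: {strictly_increasing}")
```

With these assumptions the computed mean, variance, quartiles and IQR agree with the figures listed for this variable.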

강의명
Text

UNIQUE 

Distinct: 26
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 31
Median length: 21
Mean length: 16.192308
Min length: 7

Characters and Unicode

Total characters: 421
Distinct characters: 81
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 26
Unique (%): 100.0%

Sample

1st row: 개인정보보호 교육과정1
2nd row: 개인정보보호 교육과정2
3rd row: 정보보호 실무과정
4th row: 정보보호 기초과정
5th row: 개인정보보호교육 - 업종별 교육과정

Value | Count | Frequency (%)
(blank) | 12 | 14.1%
개인정보보호교육 | 10 | 11.8%
교육과정 | 5 | 5.9%
사업자 | 3 | 3.5%
school | 3 | 3.5%
student | 3 | 3.5%
ceo·cpo | 3 | 3.5%
교육 | 2 | 2.4%
위치정보보호 | 2 | 2.4%
개인정보보호 | 2 | 2.4%
Other values (39) | 40 | 47.1%

Most occurring characters

ValueCountFrequency (%)
60
 
14.3%
33
 
7.8%
27
 
6.4%
24
 
5.7%
23
 
5.5%
16
 
3.8%
13
 
3.1%
- 12
 
2.9%
12
 
2.9%
n 10
 
2.4%
Other values (71) 191
45.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 219 | 52.0%
Lowercase Letter | 90 | 21.4%
Space Separator | 60 | 14.3%
Uppercase Letter | 26 | 6.2%
Dash Punctuation | 12 | 2.9%
Decimal Number | 9 | 2.1%
Other Punctuation | 3 | 0.7%
Open Punctuation | 1 | 0.2%
Close Punctuation | 1 | 0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
33
15.1%
27
12.3%
24
11.0%
23
10.5%
16
 
7.3%
13
 
5.9%
12
 
5.5%
10
 
4.6%
4
 
1.8%
4
 
1.8%
Other values (33) 53
24.2%
Lowercase Letter
Value | Count | Frequency (%)
n | 10 | 11.1%
e | 8 | 8.9%
t | 7 | 7.8%
s | 7 | 7.8%
c | 7 | 7.8%
h | 7 | 7.8%
o | 7 | 7.8%
l | 7 | 7.8%
i | 6 | 6.7%
d | 6 | 6.7%
Other values (9) | 18 | 20.0%

Uppercase Letter
Value | Count | Frequency (%)
C | 6 | 23.1%
O | 6 | 23.1%
E | 4 | 15.4%
P | 4 | 15.4%
M | 2 | 7.7%
G | 1 | 3.8%
H | 1 | 3.8%
S | 1 | 3.8%
I | 1 | 3.8%

Decimal Number
Value | Count | Frequency (%)
2 | 3 | 33.3%
1 | 3 | 33.3%
0 | 1 | 11.1%
4 | 1 | 11.1%
3 | 1 | 11.1%

Space Separator
Value | Count | Frequency (%)
(space) | 60 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 12 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
· | 3 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 219 | 52.0%
Latin | 116 | 27.6%
Common | 86 | 20.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
33
15.1%
27
12.3%
24
11.0%
23
10.5%
16
 
7.3%
13
 
5.9%
12
 
5.5%
10
 
4.6%
4
 
1.8%
4
 
1.8%
Other values (33) 53
24.2%
Latin
ValueCountFrequency (%)
n 10
 
8.6%
e 8
 
6.9%
t 7
 
6.0%
s 7
 
6.0%
c 7
 
6.0%
h 7
 
6.0%
o 7
 
6.0%
l 7
 
6.0%
C 6
 
5.2%
O 6
 
5.2%
Other values (18) 44
37.9%
Common
ValueCountFrequency (%)
60
69.8%
- 12
 
14.0%
2 3
 
3.5%
· 3
 
3.5%
1 3
 
3.5%
( 1
 
1.2%
0 1
 
1.2%
4 1
 
1.2%
3 1
 
1.2%
) 1
 
1.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 219 | 52.0%
ASCII | 199 | 47.3%
None | 3 | 0.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
60
30.2%
- 12
 
6.0%
n 10
 
5.0%
e 8
 
4.0%
t 7
 
3.5%
s 7
 
3.5%
c 7
 
3.5%
h 7
 
3.5%
o 7
 
3.5%
l 7
 
3.5%
Other values (27) 67
33.7%
Hangul
ValueCountFrequency (%)
33
15.1%
27
12.3%
24
11.0%
23
10.5%
16
 
7.3%
13
 
5.9%
12
 
5.5%
10
 
4.6%
4
 
1.8%
4
 
1.8%
Other values (33) 53
24.2%
None
ValueCountFrequency (%)
· 3
100.0%

노출여부
Boolean

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 7.7%
Missing: 0
Missing (%): 0.0%
Memory size: 158.0 B

Value | Count | Frequency (%)
True | 24 | 92.3%
False | 2 | 7.7%

조회수
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 26 | 100.0%


등록일자
DateTime

Distinct: 18
Distinct (%): 69.2%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B
Minimum: 2010-11-17 11:52:00
Maximum: 2020-02-18 11:08:00
Histogram with fixed size bins (bins=18)

이수챕터수
Real number (ℝ)

Distinct: 6
Distinct (%): 23.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.2692308
Minimum: 1
Maximum: 10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 366.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 3
Q3: 4.75
95-th percentile: 5
Maximum: 10
Range: 9
Interquartile range (IQR): 2.75

Descriptive statistics

Standard deviation: 1.9911342
Coefficient of variation (CV): 0.60905281
Kurtosis: 3.9534447
Mean: 3.2692308
Median Absolute Deviation (MAD): 1.5
Skewness: 1.4406119
Sum: 85
Variance: 3.9646154
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
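
Common values

Value | Count | Frequency (%)
3 | 9 | 34.6%
5 | 6 | 23.1%
1 | 6 | 23.1%
2 | 2 | 7.7%
4 | 2 | 7.7%
10 | 1 | 3.8%

Minimum values

Value | Count | Frequency (%)
1 | 6 | 23.1%
2 | 2 | 7.7%
3 | 9 | 34.6%
4 | 2 | 7.7%
5 | 6 | 23.1%
10 | 1 | 3.8%

Maximum values

Value | Count | Frequency (%)
10 | 1 | 3.8%
5 | 6 | 23.1%
4 | 2 | 7.7%
3 | 9 | 34.6%
2 | 2 | 7.7%
1 | 6 | 23.1%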
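
Because all six distinct values of 이수챕터수 appear in the frequency table, the full column can be rebuilt from the counts and the headline figures re-derived. The sketch below is a consistency check under that assumption, not the profiler's own code; the median absolute deviation is computed here as the plain median of absolute deviations from the median, with no scaling constant.

```python
import statistics
from collections import Counter

# Value -> count, copied from the frequency table for 이수챕터수
counts = Counter({3: 9, 5: 6, 1: 6, 2: 2, 4: 2, 10: 1})

# Expand the counts back into the 26 individual observations
values = sorted(counts.elements())
n = len(values)

total = sum(values)
mean = total / n
median = statistics.median(values)
# Unscaled median absolute deviation around the median
mad = statistics.median(abs(v - median) for v in values)
distinct_pct = 100 * len(counts) / n
mode_value, mode_count = counts.most_common(1)[0]

print(f"n={n} sum={total} mean={mean:.7f}")
print(f"median={median} MAD={mad}")
print(f"distinct={len(counts)} ({distinct_pct:.1f}%)")
print(f"most common value={mode_value} ({100 * mode_count / n:.1f}%)")
```

The rebuilt column gives a sum of 85, a mean of about 3.2692308, a median of 3 and a MAD of 1.5, in line with the statistics listed for this variable.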

선택이수챕터수
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 11.5%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 3.8%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 1
5th row: 3

Common Values

Value | Count | Frequency (%)
0 | 23 | 88.5%
1 | 2 | 7.7%
3 | 1 | 3.8%


필수시험여부
Boolean

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 7.7%
Missing: 0
Missing (%): 0.0%
Memory size: 158.0 B

Value | Count | Frequency (%)
False | 25 | 96.2%
True | 1 | 3.8%

시험과락갯수
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 7.7%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 3.8%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 5
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 25 | 96.2%
5 | 1 | 3.8%


Interactions

[Four interaction plots; images not preserved.]

Correlations

[Three correlation heatmaps; images not preserved. The corresponding matrices follow.]

 | 인덱스 | 강의명 | 노출여부 | 등록일자 | 이수챕터수 | 선택이수챕터수 | 필수시험여부 | 시험과락갯수
인덱스 | 1.000 | 1.000 | 0.000 | 0.954 | 0.691 | 0.000 | 0.000 | 0.000
강의명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
노출여부 | 0.000 | 1.000 | 1.000 | 1.000 | 0.451 | 1.000 | 0.389 | 0.389
등록일자 | 0.954 | 1.000 | 1.000 | 1.000 | 0.990 | 1.000 | 0.000 | 0.000
이수챕터수 | 0.691 | 1.000 | 0.451 | 0.990 | 1.000 | 0.391 | 0.000 | 0.000
선택이수챕터수 | 0.000 | 1.000 | 1.000 | 1.000 | 0.391 | 1.000 | 0.422 | 0.422
필수시험여부 | 0.000 | 1.000 | 0.389 | 0.000 | 0.000 | 0.422 | 1.000 | 0.000
시험과락갯수 | 0.000 | 1.000 | 0.389 | 0.000 | 0.000 | 0.422 | 0.000 | 1.000

 | 필수시험여부 | 선택이수챕터수 | 노출여부 | 시험과락갯수
필수시험여부 | 1.000 | 0.645 | 0.252 | 0.000
선택이수챕터수 | 0.645 | 1.000 | 0.979 | 0.645
노출여부 | 0.252 | 0.979 | 1.000 | 0.252
시험과락갯수 | 0.000 | 0.645 | 0.252 | 1.000

 | 인덱스 | 이수챕터수 | 노출여부 | 선택이수챕터수 | 필수시험여부 | 시험과락갯수
인덱스 | 1.000 | 0.413 | 0.000 | 0.000 | 0.000 | 0.000
이수챕터수 | 0.413 | 1.000 | 0.285 | 0.137 | 0.000 | 0.000
노출여부 | 0.000 | 0.285 | 1.000 | 0.979 | 0.252 | 0.252
선택이수챕터수 | 0.000 | 0.137 | 0.979 | 1.000 | 0.645 | 0.645
필수시험여부 | 0.000 | 0.000 | 0.252 | 0.645 | 1.000 | 0.000
시험과락갯수 | 0.000 | 0.000 | 0.252 | 0.645 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

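First rows

Row | 인덱스 | 강의명 | 노출여부 | 조회수 | 등록일자 | 이수챕터수 | 선택이수챕터수 | 필수시험여부 | 시험과락갯수
0 | 1 | 개인정보보호 교육과정1 | Y | 0 | 2010-11-17 11:52 | 5 | 0 | N | 0
1 | 2 | 개인정보보호 교육과정2 | Y | 0 | 2010-11-17 11:52 | 5 | 0 | N | 0
2 | 3 | 정보보호 실무과정 | N | 0 | 2010-11-18 15:40 | 1 | 1 | Y | 0
3 | 4 | 정보보호 기초과정 | N | 0 | 2010-11-18 15:40 | 1 | 1 | N | 5
4 | 5 | 개인정보보호교육 - 업종별 교육과정 | Y | 0 | 2012-03-26 15:21 | 1 | 3 | N | 0
5 | 6 | 위치정보보호 교육과정 | Y | 0 | 2012-07-31 16:21 | 2 | 0 | N | 0
6 | 7 | 정보통신망법 신규제도 교육 | Y | 0 | 2012-12-24 13:40 | 3 | 0 | N | 0
7 | 8 | 2014 개인정보보호교육 | Y | 0 | 2014-09-11 13:13 | 2 | 0 | N | 0
8 | 10 | PIMS 교육 | Y | 0 | 2015-04-01 15:18 | 4 | 0 | N | 0
9 | 11 | CEO·CPO 교육과정 - 제1편 통찰 | Y | 0 | 2016-02-24 16:48 | 1 | 0 | N | 0

Last rows

Row | 인덱스 | 강의명 | 노출여부 | 조회수 | 등록일자 | 이수챕터수 | 선택이수챕터수 | 필수시험여부 | 시험과락갯수
16 | 18 | 개인정보보호교육 - 사업자 기본교육 | Y | 0 | 2019-02-22 16:45 | 5 | 0 | N | 0
17 | 19 | 개인정보보호교육 - 사업자 실무교육 | Y | 0 | 2019-02-22 16:45 | 4 | 0 | N | 0
18 | 20 | 개인정보보호교육 - 사업자 전문교육 | Y | 0 | 2019-02-22 16:46 | 5 | 0 | N | 0
19 | 21 | 개인정보보호교육 - 교원 | Y | 0 | 2019-03-06 10:35 | 10 | 0 | N | 0
20 | 22 | Elementary school student | Y | 0 | 2020-02-11 16:13 | 3 | 0 | N | 0
21 | 23 | Middle school student | Y | 0 | 2020-02-11 16:13 | 3 | 0 | N | 0
22 | 24 | High school student | Y | 0 | 2020-02-11 16:13 | 3 | 0 | N | 0
23 | 25 | General public | Y | 0 | 2020-02-11 16:14 | 3 | 0 | N | 0
24 | 26 | ngi nc ngoi nhp c v du hc sinh | Y | 0 | 2020-02-11 16:14 | 5 | 0 | N | 0
25 | 27 | 위치정보보호 교육과정(신) | Y | 0 | 2020-02-18 11:08 | 5 | 0 | N | 0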