Overview

Dataset statistics

Number of variables: 4
Number of observations: 368
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 11.6 KiB
Average record size in memory: 32.4 B
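
The counts above come down to simple tallies over the rows. The following is a minimal sketch, not the profiler's own code: `overview_stats` is a hypothetical helper, a cell is assumed to be missing when it is `None` or an empty string, and every repeat of an earlier row is assumed to count as one duplicate. The three example rows are taken from the Sample section of this report.

```python
from collections import Counter


def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1) if whole else 0.0


def overview_stats(rows):
    """Summarise a list of equal-length row tuples (hypothetical helper)."""
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    # Assumption: None or an empty string marks a missing cell.
    missing = sum(1 for row in rows for cell in row if cell is None or cell == "")
    # Assumption: each repeat of an already-seen row counts as one duplicate.
    duplicates = sum(count - 1 for count in Counter(rows).values())
    return {
        "variables": n_cols,
        "observations": n_rows,
        "missing_cells": missing,
        "missing_cells_pct": percent(missing, n_rows * n_cols),
        "duplicate_rows": duplicates,
        "duplicate_rows_pct": percent(duplicates, n_rows),
    }


rows = [
    ("전공분야", "교양과정인정시험", "국사", "국사.doc"),
    ("전공분야", "교양과정인정시험", "독일어", "독일어.doc"),
    ("전공분야", "교양과정인정시험", "중국어", "중국어.hwp"),
]
print(overview_stats(rows))
```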

Variable types

Categorical: 2
Text: 2

Dataset

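Description: Subject information for the evaluation areas of each major under the Bachelor's Degree Examination for Self-Education (독학학위제). The data provides fields such as major field (전공분야), course name (과정명), subject name (과목명), and attachment file name (첨부파일명).
Author: 국가평생교육진흥원 (National Institute for Lifelong Education)
URL: https://www.data.go.kr/data/15050111/fileData.do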

Alerts

전공분야 is highly overall correlated with 과정명 (High correlation)
과정명 is highly overall correlated with 전공분야 (High correlation)

Reproduction

Analysis started: 2023-12-12 06:15:45.018957
Analysis finished: 2023-12-12 06:15:45.384922
Duration: 0.37 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

전공분야
Categorical

HIGH CORRELATION 

Distinct: 15
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

Value | Count
국어국문학 | 27
영어영문학 | 27
중어중문학 | 27
심리학 | 27
경영학 | 27
Other values (10) | 233

Length

Max length: 5
Median length: 4
Mean length: 3.6331522
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전공분야
2nd row: 전공분야
3rd row: 전공분야
4th row: 전공분야
5th row: 전공분야

Common Values

Value | Count | Frequency (%)
국어국문학 | 27 | 7.3%
영어영문학 | 27 | 7.3%
중어중문학 | 27 | 7.3%
심리학 | 27 | 7.3%
경영학 | 27 | 7.3%
법학 | 27 | 7.3%
행정학 | 27 | 7.3%
수학 | 27 | 7.3%
가정학 | 27 | 7.3%
컴퓨터과학 | 27 | 7.3%
Other values (5) | 98 | 26.6%
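
The percentages in this table are each count divided by the 368 observations reported in the Overview. A minimal sketch of that arithmetic follows; `frequency_percent` is a hypothetical helper, and rounding to one decimal is assumed from the way the report prints its figures.

```python
TOTAL_ROWS = 368  # "Number of observations" from the Overview


def frequency_percent(count, total=TOTAL_ROWS):
    """Share of all rows represented by `count`, as a one-decimal percentage."""
    return round(100 * count / total, 1)


# Counts copied from the 전공분야 table above.
per_major = frequency_percent(27)      # each of the ten listed majors
other_values = frequency_percent(98)   # "Other values (5)"
distinct_pct = frequency_percent(15)   # 15 distinct values -> Distinct (%)

print(per_major, other_values, distinct_pct)
```

The ten listed majors and the "Other values" row together account for 10 × 27 + 98 = 368 rows, which matches the observation count.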

Length

Histogram of lengths of the category

과정명
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

Value | Count
학위취득종합시험 | 154
전공심화과정인정시험 | 104
전공기초과정인정시험 | 88
교양과정인정시험 | 22

Length

Max length: 10
Median length: 10
Mean length: 9.0434783
Min length: 8
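
Because 과정명 has only four distinct values, the length figures can be re-derived from the value counts listed in this section. The sketch below assumes that length means the number of Unicode code points (Python's `len`) and that the Korean text is in precomposed form; `length_stats` is a hypothetical helper, not part of the profiler.

```python
from statistics import mean, median

# Value counts copied from the 과정명 summary in this section.
value_counts = {
    "학위취득종합시험": 154,
    "전공심화과정인정시험": 104,
    "전공기초과정인정시험": 88,
    "교양과정인정시험": 22,
}


def length_stats(counts):
    """Expand value counts into one length per row and summarise them."""
    lengths = [len(value) for value, count in counts.items() for _ in range(count)]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": round(mean(lengths), 7),
        "min": min(lengths),
    }


print(length_stats(value_counts))
```

With 176 rows of length 8 and 192 rows of length 10, the mean works out to 3328 / 368, which rounds to the 9.0434783 shown above.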

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 교양과정인정시험
2nd row: 교양과정인정시험
3rd row: 교양과정인정시험
4th row: 교양과정인정시험
5th row: 교양과정인정시험

Common Values

Value | Count | Frequency (%)
학위취득종합시험 | 154 | 41.8%
전공심화과정인정시험 | 104 | 28.3%
전공기초과정인정시험 | 88 | 23.9%
교양과정인정시험 | 22 | 6.0%

Length

Histogram of lengths of the category

Common Values (Plot)


과목명
Text

Distinct: 248
Distinct (%): 67.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

Length

Max length: 11
Median length: 9.5
Mean length: 4.6902174
Min length: 2

Characters and Unicode

Total characters: 1726
Distinct characters: 191
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
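
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below is not the profiler's implementation; `category_counts` is a hypothetical helper, and the subject names are the five values from the Sample for this variable. Hangul syllables fall in category `Lo` (Other Letter), which is why that category dominates the counts for this column.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


subjects = ["국어", "국사", "독일어", "프랑스어", "중국어"]
print(category_counts(subjects))

# Other categories that appear in this report, shown on a made-up string:
# Lu = Uppercase Letter, Nd = Decimal Number, Po = Other Punctuation,
# Zs = Space Separator.
print(category_counts(["A1. "]))
```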

Unique

Unique: 210
Unique (%): 57.1%

Sample

1st row: 국어
2nd row: 국사
3rd row: 독일어
4th row: 프랑스어
5th row: 중국어

Value | Count | Frequency (%)
국어 | 15 | 4.1%
국사 | 15 | 4.1%
실용중국어 | 12 | 3.3%
실용독일어 | 11 | 3.0%
실용영어 | 11 | 3.0%
실용일본어 | 11 | 3.0%
실용프랑스어 | 11 | 3.0%
독일어 | 4 | 1.1%
프랑스어 | 4 | 1.1%
중국어 | 4 | 1.1%
Other values (239) | 271 | 73.4%

Most occurring characters

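Value | Count | Frequency (%)
[glyph missing] | 119 | 6.9%
[glyph missing] | 102 | 5.9%
[glyph missing] | 81 | 4.7%
[glyph missing] | 74 | 4.3%
[glyph missing] | 58 | 3.4%
[glyph missing] | 57 | 3.3%
[glyph missing] | 46 | 2.7%
[glyph missing] | 42 | 2.4%
[glyph missing] | 38 | 2.2%
[glyph missing] | 35 | 2.0%
Other values (181) | 1074 | 62.2%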

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1698 | 98.4%
Decimal Number | 22 | 1.3%
Uppercase Letter | 4 | 0.2%
Other Punctuation | 1 | 0.1%
Space Separator | 1 | 0.1%

Most frequent character per category

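Other Letter
Value | Count | Frequency (%)
[glyph missing] | 119 | 7.0%
[glyph missing] | 102 | 6.0%
[glyph missing] | 81 | 4.8%
[glyph missing] | 74 | 4.4%
[glyph missing] | 58 | 3.4%
[glyph missing] | 57 | 3.4%
[glyph missing] | 46 | 2.7%
[glyph missing] | 42 | 2.5%
[glyph missing] | 38 | 2.2%
[glyph missing] | 35 | 2.1%
Other values (173) | 1046 | 61.6%

Decimal Number
Value | Count | Frequency (%)
1 | 10 | 45.5%
2 | 8 | 36.4%
0 | 2 | 9.1%
9 | 2 | 9.1%

Uppercase Letter
Value | Count | Frequency (%)
I | 3 | 75.0%
C | 1 | 25.0%

Other Punctuation
Value | Count | Frequency (%)
. | 1 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1 | 100.0%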

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1698 | 98.4%
Common | 24 | 1.4%
Latin | 4 | 0.2%

Most frequent character per script

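Hangul
Value | Count | Frequency (%)
[glyph missing] | 119 | 7.0%
[glyph missing] | 102 | 6.0%
[glyph missing] | 81 | 4.8%
[glyph missing] | 74 | 4.4%
[glyph missing] | 58 | 3.4%
[glyph missing] | 57 | 3.4%
[glyph missing] | 46 | 2.7%
[glyph missing] | 42 | 2.5%
[glyph missing] | 38 | 2.2%
[glyph missing] | 35 | 2.1%
Other values (173) | 1046 | 61.6%

Common
Value | Count | Frequency (%)
1 | 10 | 41.7%
2 | 8 | 33.3%
0 | 2 | 8.3%
9 | 2 | 8.3%
. | 1 | 4.2%
(space) | 1 | 4.2%

Latin
Value | Count | Frequency (%)
I | 3 | 75.0%
C | 1 | 25.0%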

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1698 | 98.4%
ASCII | 28 | 1.6%

Most frequent character per block

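Hangul
Value | Count | Frequency (%)
[glyph missing] | 119 | 7.0%
[glyph missing] | 102 | 6.0%
[glyph missing] | 81 | 4.8%
[glyph missing] | 74 | 4.4%
[glyph missing] | 58 | 3.4%
[glyph missing] | 57 | 3.4%
[glyph missing] | 46 | 2.7%
[glyph missing] | 42 | 2.5%
[glyph missing] | 38 | 2.2%
[glyph missing] | 35 | 2.1%
Other values (173) | 1046 | 61.6%

ASCII
Value | Count | Frequency (%)
1 | 10 | 35.7%
2 | 8 | 28.6%
I | 3 | 10.7%
0 | 2 | 7.1%
9 | 2 | 7.1%
. | 1 | 3.6%
(space) | 1 | 3.6%
C | 1 | 3.6%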

파일명
Text

Distinct: 277
Distinct (%): 75.3%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

Length

Max length: 26
Median length: 16
Mean length: 11.480978
Min length: 6

Characters and Unicode

Total characters: 4225
Distinct characters: 215
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
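
Block membership, one of those properties, depends only on a character's code point range. The sketch below classifies characters into the two blocks this report lists for 파일명. It assumes the report's "Hangul" label corresponds to the Hangul Syllables block (U+AC00 to U+D7AF) and that "ASCII" means U+0000 to U+007F; `block_of` and `block_counts` are hypothetical helpers, and the file names come from the Sample for this variable.

```python
from collections import Counter


def block_of(ch):
    """Name the Unicode block of `ch`, limited to the two blocks in this report."""
    code = ord(ch)
    if code <= 0x7F:
        return "ASCII"
    # Assumption: the report's "Hangul" label means the Hangul Syllables block.
    if 0xAC00 <= code <= 0xD7AF:
        return "Hangul"
    return "Other"


def block_counts(value):
    """Count characters per block in a single string."""
    return Counter(block_of(ch) for ch in value)


print(block_counts("중국어.hwp"))
print(block_counts("교양과정_국어평가영역개정.doc"))
```

Every file name carries an ASCII extension such as `.doc` or `.hwp`, which is consistent with ASCII making up close to half of all characters in this column.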

Unique

Unique: 264
Unique (%): 71.7%

Sample

1st row: 교양과정_국어평가영역개정.doc
2nd row: 국사.doc
3rd row: 독일어.doc
4th row: 프랑스어.doc
5th row: 중국어.hwp

Value | Count | Frequency (%)
국사.doc | 15 | 4.1%
학위취득_교양국어평가영역개정.doc | 14 | 3.8%
실용독일어.hwp | 11 | 3.0%
실용영어.hwp | 11 | 3.0%
실용프랑스어.hwp | 11 | 3.0%
실용중국어.hwp | 11 | 3.0%
실용일본어.hwp | 11 | 3.0%
독일어.doc | 4 | 1.1%
프랑스어.doc | 4 | 1.1%
일본어.doc | 4 | 1.1%
Other values (267) | 272 | 73.9%

Most occurring characters

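Value | Count | Frequency (%)
. | 368 | 8.7%
o | 309 | 7.3%
d | 307 | 7.3%
c | 307 | 7.3%
) | 215 | 5.1%
( | 215 | 5.1%
[glyph missing] | 139 | 3.3%
[glyph missing] | 120 | 2.8%
[glyph missing] | 117 | 2.8%
[glyph missing] | 83 | 2.0%
Other values (205) | 2045 | 48.4%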

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2183 | 51.7%
Lowercase Letter | 1151 | 27.2%
Other Punctuation | 368 | 8.7%
Close Punctuation | 215 | 5.1%
Open Punctuation | 215 | 5.1%
Decimal Number | 59 | 1.4%
Connector Punctuation | 23 | 0.5%
Uppercase Letter | 9 | 0.2%
Dash Punctuation | 2 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[glyph missing] | 139 | 6.4%
[glyph missing] | 120 | 5.5%
[glyph missing] | 117 | 5.4%
[glyph missing] | 83 | 3.8%
[glyph missing] | 72 | 3.3%
[glyph missing] | 66 | 3.0%
[glyph missing] | 58 | 2.7%
[glyph missing] | 57 | 2.6%
[glyph missing] | 52 | 2.4%
[glyph missing] | 51 | 2.3%
Other values (178) | 1368 | 62.7%

Lowercase Letter
Value | Count | Frequency (%)
o | 309 | 26.8%
d | 307 | 26.7%
c | 307 | 26.7%
w | 62 | 5.4%
p | 61 | 5.3%
h | 61 | 5.3%
x | 33 | 2.9%
t | 3 | 0.3%
r | 3 | 0.3%
n | 2 | 0.2%
Other values (3) | 3 | 0.3%

Decimal Number
Value | Count | Frequency (%)
2 | 21 | 35.6%
3 | 16 | 27.1%
4 | 9 | 15.3%
1 | 9 | 15.3%
0 | 2 | 3.4%
9 | 2 | 3.4%

Uppercase Letter
Value | Count | Frequency (%)
I | 6 | 66.7%
L | 2 | 22.2%
C | 1 | 11.1%

Other Punctuation
Value | Count | Frequency (%)
. | 368 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 215 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 215 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 23 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2183 | 51.7%
Latin | 1160 | 27.5%
Common | 882 | 20.9%

Most frequent character per script

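Hangul
Value | Count | Frequency (%)
[glyph missing] | 139 | 6.4%
[glyph missing] | 120 | 5.5%
[glyph missing] | 117 | 5.4%
[glyph missing] | 83 | 3.8%
[glyph missing] | 72 | 3.3%
[glyph missing] | 66 | 3.0%
[glyph missing] | 58 | 2.7%
[glyph missing] | 57 | 2.6%
[glyph missing] | 52 | 2.4%
[glyph missing] | 51 | 2.3%
Other values (178) | 1368 | 62.7%

Latin
Value | Count | Frequency (%)
o | 309 | 26.6%
d | 307 | 26.5%
c | 307 | 26.5%
w | 62 | 5.3%
p | 61 | 5.3%
h | 61 | 5.3%
x | 33 | 2.8%
I | 6 | 0.5%
t | 3 | 0.3%
r | 3 | 0.3%
Other values (6) | 8 | 0.7%

Common
Value | Count | Frequency (%)
. | 368 | 41.7%
) | 215 | 24.4%
( | 215 | 24.4%
_ | 23 | 2.6%
2 | 21 | 2.4%
3 | 16 | 1.8%
4 | 9 | 1.0%
1 | 9 | 1.0%
0 | 2 | 0.2%
9 | 2 | 0.2%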

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2183 | 51.7%
ASCII | 2042 | 48.3%

Most frequent character per block

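ASCII
Value | Count | Frequency (%)
. | 368 | 18.0%
o | 309 | 15.1%
d | 307 | 15.0%
c | 307 | 15.0%
) | 215 | 10.5%
( | 215 | 10.5%
w | 62 | 3.0%
p | 61 | 3.0%
h | 61 | 3.0%
x | 33 | 1.6%
Other values (17) | 104 | 5.1%

Hangul
Value | Count | Frequency (%)
[glyph missing] | 139 | 6.4%
[glyph missing] | 120 | 5.5%
[glyph missing] | 117 | 5.4%
[glyph missing] | 83 | 3.8%
[glyph missing] | 72 | 3.3%
[glyph missing] | 66 | 3.0%
[glyph missing] | 58 | 2.7%
[glyph missing] | 57 | 2.6%
[glyph missing] | 52 | 2.4%
[glyph missing] | 51 | 2.3%
Other values (178) | 1368 | 62.7%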

Correlations

Variable | 전공분야 | 과정명
전공분야 | 1.000 | 0.785
과정명 | 0.785 | 1.000

Variable | 전공분야 | 과정명
전공분야 | 1.000 | 0.571
과정명 | 0.571 | 1.000

Variable | 전공분야 | 과정명
전공분야 | 1.000 | 0.571
과정명 | 0.571 | 1.000
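
The extract does not say which measure produced each of these matrices. For two categorical columns, Cramér's V is one standard association measure, and the sketch below shows the plain, uncorrected form in pure Python. Treating it as the source of any figure above is an assumption. `cramers_v` is a hypothetical helper, and the toy inputs are made up for illustration.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    joint = Counter(zip(xs, ys))
    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association
    return sqrt(chi2 / (n * k))


# Toy data: a perfectly associated pair and an independent pair.
print(cramers_v(list("aabb"), list("xxyy")))
print(cramers_v(list("aabb"), list("xyxy")))
```

The value ranges from 0 (no association) to 1 (one column fully determines the other), so a figure such as 0.785 would indicate a strong but not perfect association.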

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Row | 전공분야 | 과정명 | 과목명 | 파일명
0 | 전공분야 | 교양과정인정시험 | 국어 | 교양과정_국어평가영역개정.doc
1 | 전공분야 | 교양과정인정시험 | 국사 | 국사.doc
2 | 전공분야 | 교양과정인정시험 | 독일어 | 독일어.doc
3 | 전공분야 | 교양과정인정시험 | 프랑스어 | 프랑스어.doc
4 | 전공분야 | 교양과정인정시험 | 중국어 | 중국어.hwp
5 | 전공분야 | 교양과정인정시험 | 일본어 | 일본어.doc
6 | 전공분야 | 교양과정인정시험 | 문학개론 | Liter-Intro.doc
7 | 전공분야 | 교양과정인정시험 | 문화사 | 문화사.doc
8 | 전공분야 | 교양과정인정시험 | 한문 | 한문.doc
9 | 전공분야 | 교양과정인정시험 | 법학개론 | Law-Intro.doc

Last rows

Row | 전공분야 | 과정명 | 과목명 | 파일명
358 | 정보통신학 | 학위취득종합시험 | 실용중국어 | 실용중국어.hwp
359 | 정보통신학 | 학위취득종합시험 | 실용일본어 | 실용일본어.hwp
360 | 정보통신학 | 전공심화과정인정시험 | 회로이론 | 회로이론.docx
361 | 정보통신학 | 전공심화과정인정시험 | 데이터통신 | 데이터통신.docx
362 | 정보통신학 | 전공심화과정인정시험 | 정보통신이론 | 정보통신이론.docx
363 | 정보통신학 | 전공심화과정인정시험 | 임베디드시스템 | 임베디드시스템.docx
364 | 정보통신학 | 전공심화과정인정시험 | 이동통신시스템 | 이동통신시스템.docx
365 | 정보통신학 | 전공심화과정인정시험 | 정보통신기기 | 정보통신기기.docx
366 | 정보통신학 | 전공심화과정인정시험 | 정보보안 | 정보보안.docx
367 | 정보통신학 | 학위취득종합시험 | 정보통신시스템 | 정보통신시스템.docx