Overview

Dataset statistics

Number of variables: 7
Number of observations: 7898
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 455.2 KiB
Average record size in memory: 59.0 B
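
The average record size appears to be the total in-memory size divided by the number of observations. The short check below reproduces the reported figure from the two values above; the variable names are illustrative only.

```python
# Values copied from the dataset statistics above.
total_size_kib = 455.2
n_observations = 7898

# 1 KiB = 1024 bytes; divide the total by the row count.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```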

Variable types

Text: 2
Categorical: 5

Dataset

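Description: Data on officially announced accredited courses (평가인정학습과목). It provides fields such as course name (과목명칭), credits (학점), lecture hours (강의시간), practice hours (실습시간), course type (과목유형), English name (영문이름), and overview.
Author: 국가평생교육진흥원 (National Institute for Lifelong Education)
URL: https://www.data.go.kr/data/15071125/fileData.do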

Alerts

학점 has constant value "3" [Constant]
고시여부 has constant value "고시됨" [Constant]
강의시간 is highly overall correlated with 실습시간 [High correlation]
실습시간 is highly overall correlated with 강의시간 [High correlation]
과목유형 is highly imbalanced (77.9%) [Imbalance]
과목명칭 has unique values [Unique]
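
The 77.9% imbalance figure for 과목유형 is consistent with one minus the normalized Shannon entropy of the class counts listed under Common Values for that variable (7481, 241, 176). The sketch below assumes that definition; the function name `imbalance_score` and the single-class convention are illustrative choices, not taken from the report.

```python
from math import log2

def imbalance_score(counts):
    """Return 1 - H/log2(k), where H is the Shannon entropy (bits) of the
    class distribution and k is the number of non-empty classes.
    0.0 means perfectly balanced, 1.0 means a single dominant class."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(probs) < 2:
        return 0.0  # convention chosen for this sketch only
    entropy = -sum(p * log2(p) for p in probs)
    return 1 - entropy / log2(len(probs))

# Class counts for 과목유형, copied from its Common Values table.
course_type_counts = {"전공": 7481, "교양": 241, "전공이면서 교양": 176}
score = imbalance_score(list(course_type_counts.values()))
print(f"{score:.1%}")
```

A perfectly uniform three-class column would score 0.0 under this definition, so a value near 0.78 reflects how strongly 전공 dominates.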

Reproduction

Analysis started: 2023-12-12 11:46:47.045472
Analysis finished: 2023-12-12 11:46:48.024849
Duration: 0.98 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

과목명칭
Text

UNIQUE 

Distinct: 7898
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 61.8 KiB

Length

Max length: 23
Median length: 20
Mean length: 6.6449734
Min length: 2

Characters and Unicode

Total characters: 52482
Distinct characters: 701
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
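
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for two course names taken from the sample rows of this report; the helper name `category_counts` is illustrative. Hangul syllables fall under `Lo` (Other Letter), ASCII digits under `Nd` (Decimal Number), and the Roman numeral Ⅰ (U+2160) under `Nl` (Letter Number).

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

# Course names from the sample rows; \u2160 is ROMAN NUMERAL ONE.
samples = ["16세기대위법", "희곡분석\u2160"]
for name in samples:
    print(name, dict(category_counts(name)))
```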

Unique

Unique: 7898
Unique (%): 100.0%

Sample

1st row: 16세기대위법
2nd row: 17세기프랑스문학
3rd row: 17ㆍ18세기미국문학
4th row: 17ㆍ18세기영국문학
5th row: 18·19세기독문학

| Value | Count | Frequency (%) |
|---|---|---|
| | 3 | < 0.1% |
| 16세기대위법 | 1 | < 0.1% |
| 작곡이론사 | 1 | < 0.1% |
| 잔놀음실습 | 1 | < 0.1% |
| 작품분석 | 1 | < 0.1% |
| 작품감상 | 1 | < 0.1% |
| 작업환경공학 | 1 | < 0.1% |
| 작업치료학 | 1 | < 0.1% |
| 작업치료세미나 | 1 | < 0.1% |
| 작업치료도구 | 1 | < 0.1% |
| Other values (7905) | 7905 | 99.8% |

Most occurring characters

ValueCountFrequency (%)
2061
 
3.9%
1746
 
3.3%
1511
 
2.9%
1326
 
2.5%
997
 
1.9%
989
 
1.9%
953
 
1.8%
859
 
1.6%
828
 
1.6%
789
 
1.5%
Other values (691) 40423
77.0%

Most occurring categories

| Value | Count | Frequency (%) |
|---|---|---|
| Other Letter | 49352 | 94.0% |
| Letter Number | 2255 | 4.3% |
| Uppercase Letter | 426 | 0.8% |
| Other Punctuation | 114 | 0.2% |
| Lowercase Letter | 86 | 0.2% |
| Decimal Number | 80 | 0.2% |
| Close Punctuation | 66 | 0.1% |
| Open Punctuation | 66 | 0.1% |
| Space Separator | 29 | 0.1% |
| Dash Punctuation | 8 | < 0.1% |

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2061
 
4.2%
1746
 
3.5%
1511
 
3.1%
1326
 
2.7%
953
 
1.9%
859
 
1.7%
828
 
1.7%
789
 
1.6%
756
 
1.5%
714
 
1.4%
Other values (624) 37809
76.6%
Uppercase Letter
ValueCountFrequency (%)
C 83
19.5%
A 54
12.7%
D 53
12.4%
T 34
8.0%
P 31
 
7.3%
V 25
 
5.9%
S 22
 
5.2%
I 22
 
5.2%
M 16
 
3.8%
N 13
 
3.1%
Other values (14) 73
17.1%
Lowercase Letter
ValueCountFrequency (%)
e 11
12.8%
s 9
10.5%
i 9
10.5%
t 8
9.3%
a 7
 
8.1%
n 7
 
8.1%
r 5
 
5.8%
c 4
 
4.7%
o 4
 
4.7%
y 4
 
4.7%
Other values (8) 18
20.9%
Letter Number
ValueCountFrequency (%)
997
44.2%
989
43.9%
178
 
7.9%
68
 
3.0%
10
 
0.4%
7
 
0.3%
3
 
0.1%
3
 
0.1%
Decimal Number
ValueCountFrequency (%)
1 19
23.8%
0 16
20.0%
2 15
18.8%
3 10
12.5%
9 10
12.5%
8 6
 
7.5%
7 3
 
3.8%
6 1
 
1.2%
Other Punctuation
ValueCountFrequency (%)
· 59
51.8%
, 27
23.7%
/ 20
 
17.5%
& 4
 
3.5%
. 4
 
3.5%
Close Punctuation
ValueCountFrequency (%)
) 66
100.0%
Open Punctuation
ValueCountFrequency (%)
( 66
100.0%
Space Separator
ValueCountFrequency (%)
29
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 8
100.0%

Most occurring scripts

| Value | Count | Frequency (%) |
|---|---|---|
| Hangul | 49346 | 94.0% |
| Latin | 2767 | 5.3% |
| Common | 363 | 0.7% |
| Han | 6 | < 0.1% |

Most frequent character per script

Hangul
ValueCountFrequency (%)
2061
 
4.2%
1746
 
3.5%
1511
 
3.1%
1326
 
2.7%
953
 
1.9%
859
 
1.7%
828
 
1.7%
789
 
1.6%
756
 
1.5%
714
 
1.4%
Other values (618) 37803
76.6%
Latin
ValueCountFrequency (%)
997
36.0%
989
35.7%
178
 
6.4%
C 83
 
3.0%
68
 
2.5%
A 54
 
2.0%
D 53
 
1.9%
T 34
 
1.2%
P 31
 
1.1%
V 25
 
0.9%
Other values (40) 255
 
9.2%
Common
ValueCountFrequency (%)
) 66
18.2%
( 66
18.2%
· 59
16.3%
29
8.0%
, 27
7.4%
/ 20
 
5.5%
1 19
 
5.2%
0 16
 
4.4%
2 15
 
4.1%
3 10
 
2.8%
Other values (7) 36
9.9%
Han
ValueCountFrequency (%)
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%

Most occurring blocks

| Value | Count | Frequency (%) |
|---|---|---|
| Hangul | 49342 | 94.0% |
| Number Forms | 2255 | 4.3% |
| ASCII | 816 | 1.6% |
| None | 59 | 0.1% |
| CJK | 5 | < 0.1% |
| Compat Jamo | 4 | < 0.1% |
| CJK Compat Ideographs | 1 | < 0.1% |

Most frequent character per block

Hangul
ValueCountFrequency (%)
2061
 
4.2%
1746
 
3.5%
1511
 
3.1%
1326
 
2.7%
953
 
1.9%
859
 
1.7%
828
 
1.7%
789
 
1.6%
756
 
1.5%
714
 
1.4%
Other values (617) 37799
76.6%
Number Forms
ValueCountFrequency (%)
997
44.2%
989
43.9%
178
 
7.9%
68
 
3.0%
10
 
0.4%
7
 
0.3%
3
 
0.1%
3
 
0.1%
ASCII
ValueCountFrequency (%)
C 83
 
10.2%
) 66
 
8.1%
( 66
 
8.1%
A 54
 
6.6%
D 53
 
6.5%
T 34
 
4.2%
P 31
 
3.8%
29
 
3.6%
, 27
 
3.3%
V 25
 
3.1%
Other values (48) 348
42.6%
None
ValueCountFrequency (%)
· 59
100.0%
Compat Jamo
ValueCountFrequency (%)
4
100.0%
CJK Compat Ideographs
ValueCountFrequency (%)
1
100.0%
CJK
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%

학점
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 61.8 KiB
3
7898 

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 3
3rd row: 3
4th row: 3
5th row: 3

Common Values

ValueCountFrequency (%)
3 7898
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
3 7898
100.0%

강의시간
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 61.8 KiB
3
4098 
1
1489 
2
1172 
<NA>
1139 

Length

Max length: 4
Median length: 1
Mean length: 1.4326412
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 3
3rd row: 3
4th row: 3
5th row: 3

Common Values

ValueCountFrequency (%)
3 4098
51.9%
1 1489
 
18.9%
2 1172
 
14.8%
<NA> 1139
 
14.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
3 4098
51.9%
1 1489
 
18.9%
2 1172
 
14.8%
na 1139
 
14.4%

실습시간
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 61.8 KiB
<NA>
4098 
4
1489 
2
1172 
6
1139 

Length

Max length: 4
Median length: 4
Mean length: 2.5565966
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

ValueCountFrequency (%)
<NA> 4098
51.9%
4 1489
 
18.9%
2 1172
 
14.8%
6 1139
 
14.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
na 4098
51.9%
4 1489
 
18.9%
2 1172
 
14.8%
6 1139
 
14.4%

고시여부
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 61.8 KiB
고시됨
7898 

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 고시됨
2nd row: 고시됨
3rd row: 고시됨
4th row: 고시됨
5th row: 고시됨

Common Values

ValueCountFrequency (%)
고시됨 7898
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
고시됨 7898
100.0%

과목유형
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 61.8 KiB
전공
7481 
교양
 
241
전공이면서 교양
 
176

Length

Max length: 8
Median length: 2
Mean length: 2.1337047
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전공
2nd row: 전공
3rd row: 전공
4th row: 전공
5th row: 전공

Common Values

ValueCountFrequency (%)
전공 7481
94.7%
교양 241
 
3.1%
전공이면서 교양 176
 
2.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
전공 7481
92.7%
교양 417
 
5.2%
전공이면서 176
 
2.2%
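
The plot counts above differ from the Common Values table for 과목유형 (교양 appears as 417 rather than 241). The numbers are consistent with the plot counting whitespace-separated tokens instead of whole values, so that "전공이면서 교양" contributes once to 전공이면서 and once to 교양. The sketch below checks that reading; the tokenization rule is an assumption inferred from the figures, and `token_counts` is an illustrative helper.

```python
from collections import Counter

def token_counts(value_counts):
    """Split each category value on whitespace and add its row count
    to every resulting token."""
    tokens = Counter()
    for value, n_rows in value_counts.items():
        for token in value.split():
            tokens[token] += n_rows
    return tokens

# Whole-value counts from the Common Values table for 과목유형.
value_counts = {"전공": 7481, "교양": 241, "전공이면서 교양": 176}

tokens = token_counts(value_counts)
total_tokens = sum(tokens.values())
shares = {t: round(100 * c / total_tokens, 1) for t, c in tokens.items()}
print(dict(tokens), shares)
```

Under this assumption the percentages are taken over 8074 tokens rather than 7898 rows, which accounts for 전공 dropping from 94.7% to 92.7%.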

영문이름
Text

Distinct: 7795
Distinct (%): 98.7%
Missing: 0
Missing (%): 0.0%
Memory size: 61.8 KiB

Length

Max length: 158
Median length: 84
Mean length: 27.458724
Min length: 1

Characters and Unicode

Total characters: 216869
Distinct characters: 86
Distinct categories: 11
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7699
Unique (%): 97.5%

Sample

1st row: 16th Century Counterpoint
2nd row: 17th French Literature
3rd row: 17th and 18th Century American Literature
4th row: 17th and 18th Century British Literature
5th row: 18·19th Century German Literature

| Value | Count | Frequency (%) |
|---|---|---|
| of | 1546 | 5.5% |
| and | 603 | 2.1% |
| practice | 582 | 2.1% |
| studio | 435 | 1.5% |
| | 378 | 1.3% |
| in | 371 | 1.3% |
| to | 314 | 1.1% |
| the | 307 | 1.1% |
| introduction | 268 | 0.9% |
| design | 258 | 0.9% |
| Other values (3829) | 23194 | 82.1% |

Most occurring characters

ValueCountFrequency (%)
20474
 
9.4%
i 17440
 
8.0%
e 17034
 
7.9%
n 16602
 
7.7%
a 14894
 
6.9%
o 14544
 
6.7%
t 14131
 
6.5%
r 12333
 
5.7%
c 8549
 
3.9%
s 8048
 
3.7%
Other values (76) 72820
33.6%

Most occurring categories

| Value | Count | Frequency (%) |
|---|---|---|
| Lowercase Letter | 168487 | 77.7% |
| Uppercase Letter | 24393 | 11.2% |
| Space Separator | 20474 | 9.4% |
| Letter Number | 2164 | 1.0% |
| Other Punctuation | 732 | 0.3% |
| Dash Punctuation | 383 | 0.2% |
| Decimal Number | 153 | 0.1% |
| Close Punctuation | 42 | < 0.1% |
| Open Punctuation | 38 | < 0.1% |
| Format | 2 | < 0.1% |

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 17440
10.4%
e 17034
10.1%
n 16602
9.9%
a 14894
 
8.8%
o 14544
 
8.6%
t 14131
 
8.4%
r 12333
 
7.3%
c 8549
 
5.1%
s 8048
 
4.8%
l 6469
 
3.8%
Other values (16) 38443
22.8%
Uppercase Letter
ValueCountFrequency (%)
S 2949
12.1%
P 2541
 
10.4%
C 2343
 
9.6%
M 2012
 
8.2%
T 1656
 
6.8%
A 1494
 
6.1%
E 1440
 
5.9%
D 1319
 
5.4%
I 1209
 
5.0%
F 971
 
4.0%
Other values (16) 6459
26.5%
Other Punctuation
ValueCountFrequency (%)
& 368
50.3%
, 153
20.9%
' 75
 
10.2%
. 63
 
8.6%
/ 39
 
5.3%
: 20
 
2.7%
11
 
1.5%
· 2
 
0.3%
; 1
 
0.1%
Decimal Number
ValueCountFrequency (%)
1 42
27.5%
2 32
20.9%
8 22
14.4%
0 17
11.1%
4 14
 
9.2%
3 12
 
7.8%
9 10
 
6.5%
7 3
 
2.0%
6 1
 
0.7%
Letter Number
ValueCountFrequency (%)
955
44.1%
943
43.6%
176
 
8.1%
67
 
3.1%
10
 
0.5%
7
 
0.3%
3
 
0.1%
3
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 40
95.2%
] 2
 
4.8%
Open Punctuation
ValueCountFrequency (%)
( 36
94.7%
[ 2
 
5.3%
Space Separator
ValueCountFrequency (%)
20474
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 383
100.0%
Format
ValueCountFrequency (%)
­ 2
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

| Value | Count | Frequency (%) |
|---|---|---|
| Latin | 195044 | 89.9% |
| Common | 21825 | 10.1% |

Most frequent character per script

Latin
ValueCountFrequency (%)
i 17440
 
8.9%
e 17034
 
8.7%
n 16602
 
8.5%
a 14894
 
7.6%
o 14544
 
7.5%
t 14131
 
7.2%
r 12333
 
6.3%
c 8549
 
4.4%
s 8048
 
4.1%
l 6469
 
3.3%
Other values (50) 65000
33.3%
Common
ValueCountFrequency (%)
20474
93.8%
- 383
 
1.8%
& 368
 
1.7%
, 153
 
0.7%
' 75
 
0.3%
. 63
 
0.3%
1 42
 
0.2%
) 40
 
0.2%
/ 39
 
0.2%
( 36
 
0.2%
Other values (16) 152
 
0.7%

Most occurring blocks

| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 214689 | 99.0% |
| Number Forms | 2164 | 1.0% |
| None | 15 | < 0.1% |
| Punctuation | 1 | < 0.1% |

Most frequent character per block

ASCII
ValueCountFrequency (%)
20474
 
9.5%
i 17440
 
8.1%
e 17034
 
7.9%
n 16602
 
7.7%
a 14894
 
6.9%
o 14544
 
6.8%
t 14131
 
6.6%
r 12333
 
5.7%
c 8549
 
4.0%
s 8048
 
3.7%
Other values (64) 70640
32.9%
Number Forms
ValueCountFrequency (%)
955
44.1%
943
43.6%
176
 
8.1%
67
 
3.1%
10
 
0.5%
7
 
0.3%
3
 
0.1%
3
 
0.1%
None
ValueCountFrequency (%)
11
73.3%
­ 2
 
13.3%
· 2
 
13.3%
Punctuation
ValueCountFrequency (%)
1
100.0%

Correlations

| | 강의시간 | 실습시간 | 과목유형 |
|---|---|---|---|
| 강의시간 | 1.000 | 1.000 | 0.241 |
| 실습시간 | 1.000 | 1.000 | 0.217 |
| 과목유형 | 0.241 | 0.217 | 1.000 |

| | 강의시간 | 실습시간 | 과목유형 |
|---|---|---|---|
| 강의시간 | 1.000 | 0.999 | 0.077 |
| 실습시간 | 0.999 | 1.000 | 0.068 |
| 과목유형 | 0.077 | 0.068 | 1.000 |
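
The near-perfect association between 강의시간 and 실습시간 can be illustrated with Cramér's V, a common association measure for two categorical variables. The sketch below computes the uncorrected statistic from a joint count table. The report does not state which measure produced the matrices above. The pairing of values is also partly an assumption: the sample rows show 3 with <NA> and 1 with 4, while 2 with 2 and <NA> with 6 are inferred only from the matching counts (1172 and 1139) in the two variables' Common Values tables.

```python
from collections import Counter
from math import sqrt

def cramers_v(joint_counts):
    """Uncorrected Cramér's V from a dict mapping (x, y) pairs to counts."""
    n = sum(joint_counts.values())
    row_totals, col_totals = Counter(), Counter()
    for (x, y), count in joint_counts.items():
        row_totals[x] += count
        col_totals[y] += count
    # Pearson chi-squared over every cell of the contingency table.
    chi2 = 0.0
    for x, row_n in row_totals.items():
        for y, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint_counts.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k))

# (강의시간, 실습시간) pairs; the last two pairings are assumed, see above.
paired_counts = {
    ("3", "<NA>"): 4098,
    ("1", "4"): 1489,
    ("2", "2"): 1172,
    ("<NA>", "6"): 1139,
}
v = cramers_v(paired_counts)
print(round(v, 3))
```

When each value of one column maps to exactly one value of the other, as assumed here, the statistic reaches its maximum, which matches the High correlation alert for this pair.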

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

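| | 과목명칭 | 학점 | 강의시간 | 실습시간 | 고시여부 | 과목유형 | 영문이름 |
|---|---|---|---|---|---|---|---|
| 0 | 16세기대위법 | 3 | 3 | <NA> | 고시됨 | 전공 | 16th Century Counterpoint |
| 1 | 17세기프랑스문학 | 3 | 3 | <NA> | 고시됨 | 전공 | 17th French Literature |
| 2 | 17ㆍ18세기미국문학 | 3 | 3 | <NA> | 고시됨 | 전공 | 17th and 18th Century American Literature |
| 3 | 17ㆍ18세기영국문학 | 3 | 3 | <NA> | 고시됨 | 전공 | 17th and 18th Century British Literature |
| 4 | 18·19세기독문학 | 3 | 3 | <NA> | 고시됨 | 전공 | 18·19th Century German Literature |
| 5 | 18세기영소설 | 3 | 3 | <NA> | 고시됨 | 전공 | 18th Century English Novel |
| 6 | 18세기프랑스문학 | 3 | 3 | <NA> | 고시됨 | 전공 | 18th French Literature |
| 7 | 18세기프랑스소설 | 3 | 3 | <NA> | 고시됨 | 전공 | 18th French Novel |
| 8 | 19·20세기독일시 | 3 | 3 | <NA> | 고시됨 | 전공 | 19·20th Century German Poetry |
| 9 | 19세기미국문학 | 3 | 3 | <NA> | 고시됨 | 전공 | 19th Century American Literature |

| | 과목명칭 | 학점 | 강의시간 | 실습시간 | 고시여부 | 과목유형 | 영문이름 |
|---|---|---|---|---|---|---|---|
| 7888 | 휴먼커뮤니케이션 | 3 | 3 | <NA> | 고시됨 | 교양 | Human communication |
| 7889 | 흑백암실실습 | 3 | 1 | 4 | 고시됨 | 전공 | Black & White Darkroom Techniques |
| 7890 | 희곡론 | 3 | 3 | <NA> | 고시됨 | 전공 | Study in Drama |
| 7891 | 희곡분석Ⅰ | 3 | 3 | <NA> | 고시됨 | 전공 | Play AnalysisⅠ |
| 7892 | 희곡분석Ⅱ | 3 | 3 | <NA> | 고시됨 | 전공 | Play AnalysisⅡ |
| 7893 | 희곡창작연구 | 3 | 3 | <NA> | 고시됨 | 전공 | Drama Creation Survey |
| 7894 | 희랍신화의탐구 | 3 | 3 | <NA> | 고시됨 | 교양 | Introduction to Greek Mythology |
| 7895 | 히브리어기초 | 3 | 3 | <NA> | 고시됨 | 전공 | Basic Hebrew |
| 7896 | 힙합Ⅰ | 3 | 1 | 4 | 고시됨 | 전공 | HiphopⅠ |
| 7897 | 힙합Ⅱ | 3 | 1 | 4 | 고시됨 | 전공 | HiphopⅡ |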