Overview

Dataset statistics

Number of variables: 7
Number of observations: 769
Missing cells: 13
Missing cells (%): 0.2%
Duplicate rows: 3
Duplicate rows (%): 0.4%
Total size in memory: 42.9 KiB
Average record size in memory: 57.2 B
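
The percentages and the average record size follow from the raw counts. The sketch below recomputes them from the figures in this section; the variable names are illustrative, and the small gap in the record size (about 57.1 B against the reported 57.2 B) presumably comes from the total size being shown rounded to one decimal.

```python
# Figures copied from the dataset statistics in this section.
n_variables = 7
n_observations = 769
missing_cells = 13
duplicate_rows = 3
total_size_kib = 42.9  # displayed value, already rounded

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```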

Variable types

Categorical: 5
Text: 2

Dataset

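Description: Provides information on the curricula offered by Korea Polytechnics (한국폴리텍대학). The data fields are 과정 (program), 계열 (field), 학과 (department), 캠퍼스 (campus), 이론/실습 (theory/practice), 교과목명 (course title), and 학점 (credits).
Author: 학교법인한국폴리텍 (Korea Polytechnics educational foundation)
URL: https://www.data.go.kr/data/15053552/fileData.do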

Alerts

과정 has constant value "다기능" (Constant)
Dataset has 3 (0.4%) duplicate rows (Duplicates)
학점 is highly overall correlated with 이론실습여부 (High correlation)
이론실습여부 is highly overall correlated with 학점 (High correlation)
계열 is highly overall correlated with 캠퍼스 (High correlation)
캠퍼스 is highly overall correlated with 계열 (High correlation)
교과목명 has 13 (1.7%) missing values (Missing)

Reproduction

Analysis started: 2023-12-12 21:43:20.477453
Analysis finished: 2023-12-12 21:43:21.150723
Duration: 0.67 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
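
The reported duration is the difference between the two timestamps. A minimal check, using only the values printed above:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2023-12-12 21:43:20.477453")
finished = datetime.fromisoformat("2023-12-12 21:43:21.150723")

duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```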

Variables

과정
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.1 KiB
다기능: 769

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 다기능
2nd row: 다기능
3rd row: 다기능
4th row: 다기능
5th row: 다기능

Common Values

Value | Count | Frequency (%)
다기능 | 769 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
다기능 | 769 | 100.0%

계열
Categorical

HIGH CORRELATION 

Distinct: 20
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 6.1 KiB
정보통신IT: 85
자동화: 79
기계: 73
산업설비: 69
전자: 64
Other values (15): 399

Length

Max length: 6
Median length: 5
Mean length: 3.289987
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 기계
2nd row: 기계
3rd row: 기계
4th row: 기계
5th row: 기계

Common Values

Value | Count | Frequency (%)
정보통신IT | 85 | 11.1%
자동화 | 79 | 10.3%
기계 | 73 | 9.5%
산업설비 | 69 | 9.0%
전자 | 64 | 8.3%
미디어 | 54 | 7.0%
전기 | 50 | 6.5%
섬유패션 | 50 | 6.5%
디자인 | 50 | 6.5%
바이오 | 41 | 5.3%
Other values (10) | 154 | 20.0%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
정보통신it | 85 | 11.1%
자동화 | 79 | 10.3%
기계 | 73 | 9.5%
전자 | 73 | 9.5%
산업설비 | 69 | 9.0%
미디어 | 54 | 7.0%
전기 | 50 | 6.5%
섬유패션 | 50 | 6.5%
디자인 | 50 | 6.5%
바이오 | 41 | 5.3%
Other values (9) | 145 | 18.9%

학과
Text

Distinct: 69
Distinct (%): 9.0%
Missing: 0
Missing (%): 0.0%
Memory size: 6.1 KiB

Length

Max length: 19
Median length: 10
Mean length: 6.4941482
Min length: 3

Characters and Unicode

Total characters: 4994
Distinct characters: 118
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
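
As a rough illustration of how per-category character counts like the ones below can be derived, the following sketch uses Python's standard `unicodedata` module. The helper name `category_counts` and the two sample strings (values that appear elsewhere in this report) are chosen for illustration only; this is not necessarily how the profiling tool computes its figures.

```python
import unicodedata
from collections import Counter

# Readable names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Illustrative input: two values taken from this report, not the full column.
sample = ["전기, 전기에너지, 전기에너지시스템", "CAD실습"]
counts = category_counts(sample)
for name, n in counts.most_common():
    print(name, n)
```

Hangul syllables fall under Other Letter, which is consistent with that category dominating the tables in this section.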

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 컴퓨터응용기계
2nd row: 컴퓨터응용기계
3rd row: 컴퓨터응용기계
4th row: 컴퓨터응용기계
5th row: 컴퓨터응용기계

Value | Count | Frequency (%)
전기 | 28 | 3.3%
전기에너지 | 28 | 3.3%
전기에너지시스템 | 28 | 3.3%
산업설비자동화 | 25 | 2.9%
신소재응용 | 24 | 2.8%
스마트소프트웨어 | 23 | 2.7%
금형디자인 | 23 | 2.7%
메카트로닉스 | 22 | 2.6%
컴퓨터응용기계 | 22 | 2.6%
it융합제어 | 21 | 2.4%
Other values (63) | 615 | 71.6%

Most occurring characters

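Value | Count | Frequency (%)
(blank) | 266 | 5.3%
(blank) | 260 | 5.2%
(blank) | 190 | 3.8%
(blank) | 181 | 3.6%
(blank) | 142 | 2.8%
(blank) | 139 | 2.8%
(blank) | 127 | 2.5%
(blank) | 115 | 2.3%
(blank) | 111 | 2.2%
(blank) | 104 | 2.1%
Other values (108) | 3359 | 67.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4727 | 94.7%
Space Separator | 90 | 1.8%
Other Punctuation | 90 | 1.8%
Uppercase Letter | 87 | 1.7%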

Most frequent character per category

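Other Letter
Value | Count | Frequency (%)
(blank) | 266 | 5.6%
(blank) | 260 | 5.5%
(blank) | 190 | 4.0%
(blank) | 181 | 3.8%
(blank) | 142 | 3.0%
(blank) | 139 | 2.9%
(blank) | 127 | 2.7%
(blank) | 115 | 2.4%
(blank) | 111 | 2.3%
(blank) | 104 | 2.2%
Other values (101) | 3092 | 65.4%

Uppercase Letter
Value | Count | Frequency (%)
T | 30 | 34.5%
I | 30 | 34.5%
A | 9 | 10.3%
C | 9 | 10.3%
D | 9 | 10.3%

Space Separator
Value | Count | Frequency (%)
(space) | 90 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 90 | 100.0%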

Most occurring scripts

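Value | Count | Frequency (%)
Hangul | 4727 | 94.7%
Common | 180 | 3.6%
Latin | 87 | 1.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(blank) | 266 | 5.6%
(blank) | 260 | 5.5%
(blank) | 190 | 4.0%
(blank) | 181 | 3.8%
(blank) | 142 | 3.0%
(blank) | 139 | 2.9%
(blank) | 127 | 2.7%
(blank) | 115 | 2.4%
(blank) | 111 | 2.3%
(blank) | 104 | 2.2%
Other values (101) | 3092 | 65.4%

Latin
Value | Count | Frequency (%)
T | 30 | 34.5%
I | 30 | 34.5%
A | 9 | 10.3%
C | 9 | 10.3%
D | 9 | 10.3%

Common
Value | Count | Frequency (%)
(space) | 90 | 50.0%
, | 90 | 50.0%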

Most occurring blocks

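Value | Count | Frequency (%)
Hangul | 4727 | 94.7%
ASCII | 267 | 5.3%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(blank) | 266 | 5.6%
(blank) | 260 | 5.5%
(blank) | 190 | 4.0%
(blank) | 181 | 3.8%
(blank) | 142 | 3.0%
(blank) | 139 | 2.9%
(blank) | 127 | 2.7%
(blank) | 115 | 2.4%
(blank) | 111 | 2.3%
(blank) | 104 | 2.2%
Other values (101) | 3092 | 65.4%

ASCII
Value | Count | Frequency (%)
(space) | 90 | 33.7%
, | 90 | 33.7%
T | 30 | 11.2%
I | 30 | 11.2%
A | 9 | 3.4%
C | 9 | 3.4%
D | 9 | 3.4%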

캠퍼스
Categorical

HIGH CORRELATION 

Distinct: 39
Distinct (%): 5.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.1 KiB
항공: 48
대전: 47
서울강서: 45
바이오: 41
인천: 34
Other values (34): 554

Length

Max length: 69
Median length: 34
Mean length: 12.546164
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울정수, 성남, 춘천, 원주, 청주, 홍성, 김제, 목포, 익산, 대구, 구미, 부산, 울산-컴퓨터응용기계/창원-기계시스템
2nd row: 서울정수, 성남, 춘천, 원주, 청주, 홍성, 김제, 목포, 익산, 대구, 구미, 부산, 울산-컴퓨터응용기계/창원-기계시스템
3rd row: 서울정수, 성남, 춘천, 원주, 청주, 홍성, 김제, 목포, 익산, 대구, 구미, 부산, 울산-컴퓨터응용기계/창원-기계시스템
4th row: 서울정수, 성남, 춘천, 원주, 청주, 홍성, 김제, 목포, 익산, 대구, 구미, 부산, 울산-컴퓨터응용기계/창원-기계시스템
5th row: 서울정수, 성남, 춘천, 원주, 청주, 홍성, 김제, 목포, 익산, 대구, 구미, 부산, 울산-컴퓨터응용기계/창원-기계시스템

Common Values

Value | Count | Frequency (%)
항공 | 48 | 6.2%
대전 | 47 | 6.1%
서울강서 | 45 | 5.9%
바이오 | 41 | 5.3%
인천 | 34 | 4.4%
안성 | 33 | 4.3%
아산 | 29 | 3.8%
섬유패션 | 28 | 3.6%
서울정수, 안성, 춘천, 홍성, 광주, 목포, 구미, 부산, 울산-전기/인천-전기에너지시스템/ 청주-전기에너지 | 28 | 3.6%
춘천 | 25 | 3.3%
Other values (29) | 411 | 53.4%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
인천 | 171 | 7.9%
창원 | 155 | 7.2%
서울정수 | 146 | 6.8%
광주 | 141 | 6.6%
대구 | 130 | 6.0%
성남 | 119 | 5.5%
부산 | 113 | 5.3%
김제 | 106 | 4.9%
목포 | 97 | 4.5%
아산 | 88 | 4.1%
Other values (22) | 885 | 41.1%

이론실습여부
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 6.1 KiB
실습: 431
이론: 325
<NA>: 13

Length

Max length: 4
Median length: 2
Mean length: 2.0338101
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 실습
2nd row: 실습
3rd row: 실습
4th row: 실습
5th row: 실습

Common Values

Value | Count | Frequency (%)
실습 | 431 | 56.0%
이론 | 325 | 42.3%
<NA> | 13 | 1.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
실습 | 431 | 56.0%
이론 | 325 | 42.3%
na | 13 | 1.7%

교과목명
Text

MISSING 

Distinct: 448
Distinct (%): 59.3%
Missing: 13
Missing (%): 1.7%
Memory size: 6.1 KiB

Length

Max length: 15
Median length: 12
Mean length: 6.2156085
Min length: 3

Characters and Unicode

Total characters: 4699
Distinct characters: 293
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 352
Unique (%): 46.6%

Sample

1st row: 3D모델링실습
2nd row: CAD실습
3rd row: CNC선반실습
4th row: 공작기계실습
5th row: 금속가공실습

Value | Count | Frequency (%)
문제원형실습 | 66 | 8.5%
cad실습 | 18 | 2.3%
디지털공학 | 14 | 1.8%
디지털회로실습 | 12 | 1.6%
회로이론 | 10 | 1.3%
프로그래밍실습 | 8 | 1.0%
기계공작법 | 8 | 1.0%
기계제도 | 8 | 1.0%
전기전자공학 | 7 | 0.9%
프로그래밍언어실습 | 6 | 0.8%
Other values (448) | 615 | 79.7%

Most occurring characters

Value | Count | Frequency (%)
(blank) | 394 | 8.4%
(blank) | 377 | 8.0%
(blank) | 208 | 4.4%
(blank) | 138 | 2.9%
(blank) | 132 | 2.8%
(blank) | 132 | 2.8%
(blank) | 131 | 2.8%
(blank) | 128 | 2.7%
(blank) | 90 | 1.9%
(blank) | 86 | 1.8%
Other values (283) | 2883 | 61.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4420 | 94.1%
Uppercase Letter | 237 | 5.0%
Decimal Number | 17 | 0.4%
Space Separator | 16 | 0.3%
Lowercase Letter | 6 | 0.1%
Other Punctuation | 3 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(blank) | 394 | 8.9%
(blank) | 377 | 8.5%
(blank) | 208 | 4.7%
(blank) | 138 | 3.1%
(blank) | 132 | 3.0%
(blank) | 132 | 3.0%
(blank) | 131 | 3.0%
(blank) | 128 | 2.9%
(blank) | 90 | 2.0%
(blank) | 86 | 1.9%
Other values (258) | 2604 | 58.9%

Uppercase Letter
Value | Count | Frequency (%)
C | 76 | 32.1%
A | 51 | 21.5%
D | 46 | 19.4%
I | 14 | 5.9%
T | 12 | 5.1%
N | 10 | 4.2%
P | 9 | 3.8%
L | 8 | 3.4%
M | 5 | 2.1%
H | 1 | 0.4%
Other values (5) | 5 | 2.1%

Decimal Number
Value | Count | Frequency (%)
1 | 10 | 58.8%
3 | 4 | 23.5%
2 | 2 | 11.8%
5 | 1 | 5.9%

Lowercase Letter
Value | Count | Frequency (%)
o | 4 | 66.7%
u | 1 | 16.7%
t | 1 | 16.7%

Other Punctuation
Value | Count | Frequency (%)
/ | 2 | 66.7%
. | 1 | 33.3%

Space Separator
Value | Count | Frequency (%)
(space) | 16 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4420 | 94.1%
Latin | 243 | 5.2%
Common | 36 | 0.8%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(blank) | 394 | 8.9%
(blank) | 377 | 8.5%
(blank) | 208 | 4.7%
(blank) | 138 | 3.1%
(blank) | 132 | 3.0%
(blank) | 132 | 3.0%
(blank) | 131 | 3.0%
(blank) | 128 | 2.9%
(blank) | 90 | 2.0%
(blank) | 86 | 1.9%
Other values (258) | 2604 | 58.9%

Latin
Value | Count | Frequency (%)
C | 76 | 31.3%
A | 51 | 21.0%
D | 46 | 18.9%
I | 14 | 5.8%
T | 12 | 4.9%
N | 10 | 4.1%
P | 9 | 3.7%
L | 8 | 3.3%
M | 5 | 2.1%
o | 4 | 1.6%
Other values (8) | 8 | 3.3%

Common
Value | Count | Frequency (%)
(space) | 16 | 44.4%
1 | 10 | 27.8%
3 | 4 | 11.1%
/ | 2 | 5.6%
2 | 2 | 5.6%
5 | 1 | 2.8%
. | 1 | 2.8%

Most occurring blocks

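Value | Count | Frequency (%)
Hangul | 4420 | 94.1%
ASCII | 279 | 5.9%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(blank) | 394 | 8.9%
(blank) | 377 | 8.5%
(blank) | 208 | 4.7%
(blank) | 138 | 3.1%
(blank) | 132 | 3.0%
(blank) | 132 | 3.0%
(blank) | 131 | 3.0%
(blank) | 128 | 2.9%
(blank) | 90 | 2.0%
(blank) | 86 | 1.9%
Other values (258) | 2604 | 58.9%

ASCII
Value | Count | Frequency (%)
C | 76 | 27.2%
A | 51 | 18.3%
D | 46 | 16.5%
(space) | 16 | 5.7%
I | 14 | 5.0%
T | 12 | 4.3%
1 | 10 | 3.6%
N | 10 | 3.6%
P | 9 | 3.2%
L | 8 | 2.9%
Other values (15) | 27 | 9.7%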

학점
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 6.1 KiB
4: 395
2: 221
3: 140
<NA>: 13

Length

Max length: 4
Median length: 1
Mean length: 1.0507152
Min length: 1
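
The length figures are consistent with the 13 `<NA>` entries being measured as the literal four-character string `<NA>` rather than treated as missing. The sketch below recomputes the mean length from the value counts shown for this variable; the interpretation is an inference from the numbers, not something the report states.

```python
# Value counts copied from the 학점 summary in this section.
value_counts = {"4": 395, "2": 221, "3": 140, "<NA>": 13}

total = sum(value_counts.values())
# Weight each value's string length by how often it occurs.
mean_length = sum(len(value) * n for value, n in value_counts.items()) / total
max_length = max(len(value) for value in value_counts)

print(total)                  # number of observations
print(round(mean_length, 7))  # compare with the reported mean length
print(max_length)
```

The same arithmetic on 이론실습여부 (431 and 325 two-character values plus 13 `<NA>`) gives 1564 / 769, which matches its reported mean length of 2.0338101.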

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4
2nd row: 4
3rd row: 4
4th row: 4
5th row: 4

Common Values

Value | Count | Frequency (%)
4 | 395 | 51.4%
2 | 221 | 28.7%
3 | 140 | 18.2%
<NA> | 13 | 1.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
4 | 395 | 51.4%
2 | 221 | 28.7%
3 | 140 | 18.2%
na | 13 | 1.7%

Correlations

Variable | 계열 | 학과 | 캠퍼스 | 이론실습여부 | 학점
계열 | 1.000 | 1.000 | 0.986 | 0.000 | 0.377
학과 | 1.000 | 1.000 | 1.000 | 0.000 | 0.593
캠퍼스 | 0.986 | 1.000 | 1.000 | 0.000 | 0.474
이론실습여부 | 0.000 | 0.000 | 0.000 | 1.000 | 0.660
학점 | 0.377 | 0.593 | 0.474 | 0.660 | 1.000

Variable | 학점 | 이론실습여부 | 캠퍼스 | 계열
학점 | 1.000 | 0.920 | 0.246 | 0.213
이론실습여부 | 0.920 | 1.000 | 0.000 | 0.000
캠퍼스 | 0.246 | 0.000 | 1.000 | 0.801
계열 | 0.213 | 0.000 | 0.801 | 1.000

Variable | 계열 | 캠퍼스 | 이론실습여부 | 학점
계열 | 1.000 | 0.801 | 0.000 | 0.213
캠퍼스 | 0.801 | 1.000 | 0.000 | 0.246
이론실습여부 | 0.000 | 0.000 | 1.000 | 0.920
학점 | 0.213 | 0.246 | 0.920 | 1.000
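
All variables in these matrices are categorical or text, so a natural measure of association is Cramér's V. The sketch below implements the plain, uncorrected form from scratch. It is an assumption that a statistic of this family underlies the values above; the profiling tool may apply a bias correction or a different measure, so results need not match the tables. The function name `cramers_v` is illustrative.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    rows = Counter(xs)
    cols = Counter(ys)
    joint = Counter(zip(xs, ys))
    k = min(len(rows), len(cols)) - 1
    if n == 0 or k == 0:
        return 0.0  # no association measurable with a constant variable
    chi2 = 0.0
    for r, r_count in rows.items():
        for c, c_count in cols.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return sqrt(chi2 / (n * k))


# Toy data: the first pair is perfectly associated, the second independent.
kind = ["실습", "실습", "이론", "이론"]
credits_linked = ["4", "4", "2", "2"]
credits_mixed = ["4", "2", "4", "2"]
print(cramers_v(kind, credits_linked))
print(cramers_v(kind, credits_mixed))
```

A value near 1, as reported for 학점 against 이론실습여부 in the second and third matrices, indicates that knowing one variable largely determines the other.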

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 과정 | 계열 | 학과 | 캠퍼스 | 이론실습여부 | 교과목명 | 학점
0 | 다기능 | 기계 | 컴퓨터응용기계 | 서울정수, 성남, 춘천, 원주, 청주, 홍성, 김제, 목포, 익산, 대구, 구미, 부산, 울산-컴퓨터응용기계/창원-기계시스템 | 실습 | 3D모델링실습 | 4
1 | 다기능 | 기계 | 컴퓨터응용기계 | 서울정수, 성남, 춘천, 원주, 청주, 홍성, 김제, 목포, 익산, 대구, 구미, 부산, 울산-컴퓨터응용기계/창원-기계시스템 | 실습 | CAD실습 | 4
2 | 다기능 | 기계 | 컴퓨터응용기계 | 서울정수, 성남, 춘천, 원주, 청주, 홍성, 김제, 목포, 익산, 대구, 구미, 부산, 울산-컴퓨터응용기계/창원-기계시스템 | 실습 | CNC선반실습 | 4
3 | 다기능 | 기계 | 컴퓨터응용기계 | 서울정수, 성남, 춘천, 원주, 청주, 홍성, 김제, 목포, 익산, 대구, 구미, 부산, 울산-컴퓨터응용기계/창원-기계시스템 | 실습 | 공작기계실습 | 4
4 | 다기능 | 기계 | 컴퓨터응용기계 | 서울정수, 성남, 춘천, 원주, 청주, 홍성, 김제, 목포, 익산, 대구, 구미, 부산, 울산-컴퓨터응용기계/창원-기계시스템 | 실습 | 금속가공실습 | 4
5 | 다기능 | 기계 | 컴퓨터응용기계 | 서울정수, 성남, 춘천, 원주, 청주, 홍성, 김제, 목포, 익산, 대구, 구미, 부산, 울산-컴퓨터응용기계/창원-기계시스템 | 실습 | 금형제작실습 | 4
6 | 다기능 | 기계 | 컴퓨터응용기계 | 서울정수, 성남, 춘천, 원주, 청주, 홍성, 김제, 목포, 익산, 대구, 구미, 부산, 울산-컴퓨터응용기계/창원-기계시스템 | 실습 | 기계공작실습 | 4
7 | 다기능 | 기계 | 컴퓨터응용기계 | 서울정수, 성남, 춘천, 원주, 청주, 홍성, 김제, 목포, 익산, 대구, 구미, 부산, 울산-컴퓨터응용기계/창원-기계시스템 | 실습 | 도면해독법 | 3
8 | 다기능 | 기계 | 컴퓨터응용기계 | 서울정수, 성남, 춘천, 원주, 청주, 홍성, 김제, 목포, 익산, 대구, 구미, 부산, 울산-컴퓨터응용기계/창원-기계시스템 | 실습 | 머시닝센터실습 | 4
9 | 다기능 | 기계 | 컴퓨터응용기계 | 서울정수, 성남, 춘천, 원주, 청주, 홍성, 김제, 목포, 익산, 대구, 구미, 부산, 울산-컴퓨터응용기계/창원-기계시스템 | 실습 | 정밀측정실습 | 4

Index | 과정 | 계열 | 학과 | 캠퍼스 | 이론실습여부 | 교과목명 | 학점
759 | 다기능 | ICT융합 | 스마트팩토리 | 창원 | 이론 | 기계제도 | 2
760 | 다기능 | ICT융합 | 스마트팩토리 | 창원 | 이론 | 전기전자공학 | 2
761 | 다기능 | ICT융합 | 스마트융합제어 | 울산 | 실습 | IoT융합CAD실습 | 4
762 | 다기능 | ICT융합 | 스마트융합제어 | 울산 | 실습 | 네트워크구축실습 | 4
763 | 다기능 | ICT융합 | 스마트융합제어 | 울산 | 실습 | 디지털회로실습 | 4
764 | 다기능 | ICT융합 | 스마트융합제어 | 울산 | 실습 | 스마트전기제어실습 | 4
765 | 다기능 | ICT융합 | 스마트융합제어 | 울산 | 실습 | 전기전자실습 | 4
766 | 다기능 | ICT융합 | 스마트융합제어 | 울산 | 실습 | 문제원형실습 | 4
767 | 다기능 | ICT융합 | 스마트융합제어 | 울산 | 이론 | 전기전자공학 | 2
768 | 다기능 | ICT융합 | 스마트융합제어 | 울산 | 이론 | 프로그래밍언어 | 3

Duplicate rows

Most frequently occurring

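Index | 과정 | 계열 | 학과 | 캠퍼스 | 이론실습여부 | 교과목명 | 학점 | # duplicates
0 | 다기능 | 미디어 | 스마트소프트웨어 | 대전 | <NA> | <NA> | <NA> | 13
1 | 다기능 | 섬유패션 | 패션디자인 | 서울강서,섬유패션 | 이론 | 의복구성원리 | 3 | 2
2 | 다기능 | 전기 | 전기, 전기에너지, 전기에너지시스템 | 서울정수, 안성, 춘천, 홍성, 광주, 목포, 구미, 부산, 울산-전기/인천-전기에너지시스템/ 청주-전기에너지 | 이론 | 신재생에너지공학 | 2 | 2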
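
The table lists three distinct rows that occur more than once, which appears to be what the summary's "Duplicate rows: 3" counts, with "# duplicates" giving the number of occurrences of each. A minimal sketch of that kind of count, on a handful of hypothetical rows with a reduced set of columns (not the real dataset):

```python
from collections import Counter

# Hypothetical rows (과정, 계열, 교과목명, 학점) for illustration only.
rows = [
    ("다기능", "섬유패션", "의복구성원리", "3"),
    ("다기능", "섬유패션", "의복구성원리", "3"),
    ("다기능", "전기", "신재생에너지공학", "2"),
    ("다기능", "전기", "신재생에너지공학", "2"),
    ("다기능", "기계", "CAD실습", "4"),
]

# Count identical rows, then keep only those seen more than once.
occurrences = Counter(rows)
duplicated = {row: n for row, n in occurrences.items() if n > 1}

for row, n in duplicated.items():
    print(row, n)
print("distinct duplicated rows:", len(duplicated))
```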