Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 26
Duplicate rows (%): 0.3%
Total size in memory: 468.8 KiB
Average record size in memory: 48.0 B
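
The two derived figures above follow from the raw counts. A minimal sketch of the arithmetic, assuming 1 KiB = 1024 bytes and the one-decimal rounding shown in the report (the variable names are illustrative, not taken from the tool):

```python
# Values copied from the dataset statistics above.
n_rows = 10_000
n_duplicates = 26
total_kib = 468.8  # total size in memory, in KiB

# Share of duplicate rows, in percent.
duplicate_pct = 100 * n_duplicates / n_rows

# Average record size in bytes, assuming 1 KiB = 1024 B.
avg_record_bytes = total_kib * 1024 / n_rows

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```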

Variable types

Categorical: 3
Text: 2

Dataset

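Description: Departments (학과), fields (계열), programs (과정), and course names (교과목명) offered by the Polytechnic colleges
Author: Korea Polytechnics educational foundation (학교법인한국폴리텍)
URL: https://www.data.go.kr/data/15053553/fileData.do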

Alerts

Dataset has 26 (0.3%) duplicate rows [Duplicates]
계열 is highly imbalanced (76.3%) [Imbalance]

Reproduction

Analysis started: 2023-12-12 20:37:34.833468
Analysis finished: 2023-12-12 20:37:35.714343
Duration: 0.88 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
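
A report of this kind could be regenerated along the following lines. This is only a sketch: the CSV path, the report title, and the output file name are hypothetical, the exact settings used for this run are in config.json rather than shown here, and the `dataset` metadata keys are assumed to map to the Description, Author, and URL fields listed above.

```python
def build_report(csv_path: str, output_path: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so this module loads even where the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the source data is a CSV file; pass an explicit
    # `encoding` to read_csv if the default does not decode it.
    df = pd.read_csv(csv_path)

    report = ProfileReport(
        df,
        title="Korea Polytechnics courses",  # hypothetical title
        dataset={
            "description": "폴리텍대학에서 운영하는 학과, 계열, 과정, 교과목명",
            "author": "학교법인한국폴리텍",
            "url": "https://www.data.go.kr/data/15053553/fileData.do",
        },
    )
    report.to_file(output_path)
```

Calling `build_report("courses.csv")` (a hypothetical file name) would write the HTML report next to the script.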

Variables

캠퍼스
Categorical

Distinct: 36
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

인천캠퍼스: 848
서울정수캠퍼스: 633
서울강서캠퍼스: 565
창원캠퍼스: 559
광주캠퍼스: 514
Other values (31): 6881

Length

Max length: 7
Median length: 5
Mean length: 5.3524
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 대전캠퍼스
2nd row: 남인천캠퍼스
3rd row: 아산캠퍼스
4th row: 남인천캠퍼스
5th row: 아산캠퍼스

Common Values

Value | Count | Frequency (%)
인천캠퍼스 | 848 | 8.5%
서울정수캠퍼스 | 633 | 6.3%
서울강서캠퍼스 | 565 | 5.7%
창원캠퍼스 | 559 | 5.6%
광주캠퍼스 | 514 | 5.1%
대전캠퍼스 | 428 | 4.3%
성남캠퍼스 | 410 | 4.1%
춘천캠퍼스 | 364 | 3.6%
부산캠퍼스 | 349 | 3.5%
울산캠퍼스 | 328 | 3.3%
Other values (26) | 5002 | 50.0%

Length

Figure: Histogram of lengths of the category

학과
Text

Distinct: 178
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 19
Median length: 17
Mean length: 6.9832
Min length: 3

Characters and Unicode

Total characters: 69832
Distinct characters: 193
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
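
As an illustration of those character properties, the general category of each code point can be looked up with Python's standard `unicodedata` module. This is a sketch of the idea, not the profiler's own implementation; the `CATEGORY_NAMES` mapping and the `category_counts` helper are hypothetical names, and the sample string is one 학과 value from the dataset.

```python
import unicodedata
from collections import Counter

# Long names for the two-letter general-category codes used in this example.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Nd": "Decimal Number",
}


def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category."""
    counts = Counter()
    for ch in text:
        code = unicodedata.category(ch)
        # Fall back to the raw two-letter code for unmapped categories.
        counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


counts = category_counts("베이비부머(예산)")
print(counts)
```

Hangul syllables fall under Other Letter, which is why that category dominates the tables for this variable.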

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 디지털콘텐츠과
2nd row: 스마트전자과
3rd row: 정보통신시스템과
4th row: 스마트전자과
5th row: 메카트로닉스과

Value | Count | Frequency (%)
컴퓨터응용기계과 | 882 | 8.8%
경력단절여성(예산 | 509 | 5.1%
베이비부머(예산 | 446 | 4.5%
자동차과 | 425 | 4.2%
산업설비자동화과 | 383 | 3.8%
전기과 | 368 | 3.7%
금형디자인과 | 318 | 3.2%
정보통신시스템과 | 306 | 3.1%
산업설비과 | 297 | 3.0%
메카트로닉스과 | 291 | 2.9%
Other values (168) | 5775 | 57.8%

Most occurring characters

ValueCountFrequency (%)
9010
 
12.9%
3259
 
4.7%
2600
 
3.7%
2395
 
3.4%
2022
 
2.9%
1906
 
2.7%
1870
 
2.7%
1513
 
2.2%
1508
 
2.2%
1466
 
2.1%
Other values (183) 42283
60.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 66219 | 94.8%
Open Punctuation | 1282 | 1.8%
Close Punctuation | 1282 | 1.8%
Uppercase Letter | 963 | 1.4%
Other Punctuation | 69 | 0.1%
Decimal Number | 17 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
9010
 
13.6%
3259
 
4.9%
2600
 
3.9%
2395
 
3.6%
2022
 
3.1%
1906
 
2.9%
1870
 
2.8%
1513
 
2.3%
1508
 
2.3%
1466
 
2.2%
Other values (169) 38670
58.4%
Uppercase Letter
Value | Count | Frequency (%)
C | 206 | 21.4%
T | 170 | 17.7%
I | 170 | 17.7%
D | 140 | 14.5%
A | 100 | 10.4%
S | 67 | 7.0%
G | 48 | 5.0%
L | 23 | 2.4%
E | 20 | 2.1%
W | 19 | 2.0%
Open Punctuation
Value | Count | Frequency (%)
( | 1282 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 1282 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
& | 69 | 100.0%
Decimal Number
Value | Count | Frequency (%)
3 | 17 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 66219 | 94.8%
Common | 2650 | 3.8%
Latin | 963 | 1.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
9010
 
13.6%
3259
 
4.9%
2600
 
3.9%
2395
 
3.6%
2022
 
3.1%
1906
 
2.9%
1870
 
2.8%
1513
 
2.3%
1508
 
2.3%
1466
 
2.2%
Other values (169) 38670
58.4%
Latin
Value | Count | Frequency (%)
C | 206 | 21.4%
T | 170 | 17.7%
I | 170 | 17.7%
D | 140 | 14.5%
A | 100 | 10.4%
S | 67 | 7.0%
G | 48 | 5.0%
L | 23 | 2.4%
E | 20 | 2.1%
W | 19 | 2.0%
Common
Value | Count | Frequency (%)
( | 1282 | 48.4%
) | 1282 | 48.4%
& | 69 | 2.6%
3 | 17 | 0.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 66219 | 94.8%
ASCII | 3613 | 5.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
9010
 
13.6%
3259
 
4.9%
2600
 
3.9%
2395
 
3.6%
2022
 
3.1%
1906
 
2.9%
1870
 
2.8%
1513
 
2.3%
1508
 
2.3%
1466
 
2.2%
Other values (169) 38670
58.4%
ASCII
Value | Count | Frequency (%)
( | 1282 | 35.5%
) | 1282 | 35.5%
C | 206 | 5.7%
T | 170 | 4.7%
I | 170 | 4.7%
D | 140 | 3.9%
A | 100 | 2.8%
& | 69 | 1.9%
S | 67 | 1.9%
G | 48 | 1.3%
Other values (4) | 79 | 2.2%

계열
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

정보.전기.전자계열: 9219
기계.금속계열: 548
디자인.섬유계: 213
자동화.건축.산업응용계열: 20

Length

Max length: 13
Median length: 10
Mean length: 9.7777
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 디자인.섬유계
2nd row: 정보.전기.전자계열
3rd row: 정보.전기.전자계열
4th row: 정보.전기.전자계열
5th row: 정보.전기.전자계열

Common Values

Value | Count | Frequency (%)
정보.전기.전자계열 | 9219 | 92.2%
기계.금속계열 | 548 | 5.5%
디자인.섬유계 | 213 | 2.1%
자동화.건축.산업응용계열 | 20 | 0.2%
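
The report does not say how the 76.3% imbalance figure for 계열 is derived. One definition that reproduces it from the four counts above is one minus the Shannon entropy of the class distribution, normalized by log2 of the number of classes. The sketch below assumes that definition; `imbalance_score` is a hypothetical helper name.

```python
import math

# Counts from the 계열 "Common Values" table.
counts = [9219, 548, 213, 20]


def imbalance_score(counts: list[int]) -> float:
    """Return 1 - H / log2(k): 0 for a uniform split, near 1 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # Undefined for fewer than two classes; 0 is returned by convention (assumption).
        return 0.0
    # Shannon entropy in bits; empty classes contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


score = imbalance_score(counts)
print(f"{score:.1%}")
```

Under this definition a perfectly even four-way split would score 0%, so a value above 75% reflects how strongly the single largest class (92.2% of rows) dominates.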

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: plot of the common values listed in the Common Values table

과정
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

다기능기술자: 5480
기능사: 4194
학위전공심화: 189
기능장: 137

Length

Max length: 6
Median length: 6
Mean length: 4.7007
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 다기능기술자
2nd row: 기능사
3rd row: 다기능기술자
4th row: 기능사
5th row: 다기능기술자

Common Values

Value | Count | Frequency (%)
다기능기술자 | 5480 | 54.8%
기능사 | 4194 | 41.9%
학위전공심화 | 189 | 1.9%
기능장 | 137 | 1.4%
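
Because 과정 has only four distinct values, its Length statistics can be recomputed from this frequency table alone. A minimal sketch, assuming length is measured in Unicode code points (which is what Python's `len` returns for a string):

```python
import statistics

# Counts from the 과정 "Common Values" table.
value_counts = {
    "다기능기술자": 5480,
    "기능사": 4194,
    "학위전공심화": 189,
    "기능장": 137,
}

# Expand to one length per observation; len() counts code points.
lengths = [len(value) for value, n in value_counts.items() for _ in range(n)]

mean_length = sum(lengths) / len(lengths)
median_length = statistics.median(lengths)
print(mean_length, median_length, min(lengths), max(lengths))
```

The same approach applies to 계열, whose four values have lengths 10, 7, 7, and 13.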

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: plot of the common values listed in the Common Values table

교과목명
Text

Distinct: 4221
Distinct (%): 42.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 27
Median length: 25
Mean length: 6.5238
Min length: 2

Characters and Unicode

Total characters: 65238
Distinct characters: 563
Distinct categories: 13
Distinct scripts: 4
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3114
Unique (%): 31.1%

Sample

1st row: 영상연출
2nd row: 프로젝트실습
3rd row: 통신융합응용실습
4th row: 직업과사회
5th row: 프로젝트실습1
ValueCountFrequency (%)
문제원형실습 158
 
1.4%
프로젝트실습 115
 
1.0%
참人폴리텍 100
 
0.9%
영어 99
 
0.9%
직업과사회 97
 
0.9%
86
 
0.8%
실용영어 83
 
0.8%
건강과능력개발 82
 
0.7%
봉사활동 82
 
0.7%
한국사 77
 
0.7%
Other values (4349) 10038
91.1%

Most occurring characters

ValueCountFrequency (%)
4016
 
6.2%
3550
 
5.4%
1958
 
3.0%
1262
 
1.9%
1239
 
1.9%
1196
 
1.8%
1188
 
1.8%
1142
 
1.8%
C 1121
 
1.7%
1039
 
1.6%
Other values (553) 47527
72.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 56913 | 87.2%
Uppercase Letter | 4097 | 6.3%
Decimal Number | 1738 | 2.7%
Space Separator | 1039 | 1.6%
Lowercase Letter | 394 | 0.6%
Open Punctuation | 314 | 0.5%
Close Punctuation | 313 | 0.5%
Letter Number | 306 | 0.5%
Other Punctuation | 106 | 0.2%
Dash Punctuation | 14 | < 0.1%
Other values (3) | 4 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4016
 
7.1%
3550
 
6.2%
1958
 
3.4%
1262
 
2.2%
1239
 
2.2%
1196
 
2.1%
1188
 
2.1%
1142
 
2.0%
954
 
1.7%
890
 
1.6%
Other values (479) 39518
69.4%
Uppercase Letter
Value | Count | Frequency (%)
C | 1121 | 27.4%
D | 535 | 13.1%
A | 478 | 11.7%
I | 345 | 8.4%
N | 250 | 6.1%
P | 206 | 5.0%
L | 201 | 4.9%
T | 180 | 4.4%
O | 154 | 3.8%
M | 148 | 3.6%
Other values (14) | 479 | 11.7%
Lowercase Letter
Value | Count | Frequency (%)
e | 63 | 16.0%
r | 40 | 10.2%
o | 32 | 8.1%
l | 31 | 7.9%
n | 30 | 7.6%
i | 27 | 6.9%
w | 25 | 6.3%
a | 24 | 6.1%
t | 21 | 5.3%
s | 13 | 3.3%
Other values (14) | 88 | 22.3%
Decimal Number
Value | Count | Frequency (%)
1 | 756 | 43.5%
2 | 674 | 38.8%
3 | 270 | 15.5%
5 | 17 | 1.0%
4 | 16 | 0.9%
0 | 2 | 0.1%
7 | 2 | 0.1%
8 | 1 | 0.1%
Other Punctuation
Value | Count | Frequency (%)
/ | 54 | 50.9%
, | 38 | 35.8%
. | 9 | 8.5%
& | 4 | 3.8%
% | 1 | 0.9%
Letter Number
ValueCountFrequency (%)
125
40.8%
122
39.9%
53
17.3%
6
 
2.0%
Open Punctuation
Value | Count | Frequency (%)
( | 282 | 89.8%
[ | 32 | 10.2%
Close Punctuation
Value | Count | Frequency (%)
) | 281 | 89.8%
] | 32 | 10.2%
Space Separator
ValueCountFrequency (%)
1039
100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 14 | 100.0%
Other Number
ValueCountFrequency (%)
2
100.0%
Math Symbol
ValueCountFrequency (%)
1
100.0%
Connector Punctuation
Value | Count | Frequency (%)
_ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 56803 | 87.1%
Latin | 4797 | 7.4%
Common | 3528 | 5.4%
Han | 110 | 0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4016
 
7.1%
3550
 
6.2%
1958
 
3.4%
1262
 
2.2%
1239
 
2.2%
1196
 
2.1%
1188
 
2.1%
1142
 
2.0%
954
 
1.7%
890
 
1.6%
Other values (478) 39408
69.4%
Latin
Value | Count | Frequency (%)
C | 1121 | 23.4%
D | 535 | 11.2%
A | 478 | 10.0%
I | 345 | 7.2%
N | 250 | 5.2%
P | 206 | 4.3%
L | 201 | 4.2%
T | 180 | 3.8%
O | 154 | 3.2%
M | 148 | 3.1%
Other values (42) | 1179 | 24.6%
Common
ValueCountFrequency (%)
1039
29.5%
1 756
21.4%
2 674
19.1%
( 282
 
8.0%
) 281
 
8.0%
3 270
 
7.7%
/ 54
 
1.5%
, 38
 
1.1%
] 32
 
0.9%
[ 32
 
0.9%
Other values (12) 70
 
2.0%
Han
ValueCountFrequency (%)
110
100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 56803 | 87.1%
ASCII | 8016 | 12.3%
Number Forms | 306 | 0.5%
CJK | 110 | 0.2%
None | 2 | < 0.1%
Math Operators | 1 | < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
4016
 
7.1%
3550
 
6.2%
1958
 
3.4%
1262
 
2.2%
1239
 
2.2%
1196
 
2.1%
1188
 
2.1%
1142
 
2.0%
954
 
1.7%
890
 
1.6%
Other values (478) 39408
69.4%
ASCII
ValueCountFrequency (%)
C 1121
14.0%
1039
13.0%
1 756
 
9.4%
2 674
 
8.4%
D 535
 
6.7%
A 478
 
6.0%
I 345
 
4.3%
( 282
 
3.5%
) 281
 
3.5%
3 270
 
3.4%
Other values (58) 2235
27.9%
Number Forms
ValueCountFrequency (%)
125
40.8%
122
39.9%
53
17.3%
6
 
2.0%
CJK
ValueCountFrequency (%)
110
100.0%
None
ValueCountFrequency (%)
2
100.0%
Math Operators
ValueCountFrequency (%)
1
100.0%

Correlations

Variable | 캠퍼스 | 계열 | 과정
캠퍼스 | 1.000 | 0.542 | 0.672
계열 | 0.542 | 1.000 | 0.474
과정 | 0.672 | 0.474 | 1.000

Variable | 계열 | 과정 | 캠퍼스
계열 | 1.000 | 0.200 | 0.285
과정 | 0.200 | 1.000 | 0.383
캠퍼스 | 0.285 | 0.383 | 1.000

Variable | 캠퍼스 | 계열 | 과정
캠퍼스 | 1.000 | 0.285 | 0.383
계열 | 0.285 | 1.000 | 0.200
과정 | 0.383 | 0.200 | 1.000

Missing values

Figure: A simple visualization of nullity by column.
Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Index | 캠퍼스 | 학과 | 계열 | 과정 | 교과목명
9239 | 대전캠퍼스 | 디지털콘텐츠과 | 디자인.섬유계 | 다기능기술자 | 영상연출
6472 | 남인천캠퍼스 | 스마트전자과 | 정보.전기.전자계열 | 기능사 | 프로젝트실습
10480 | 아산캠퍼스 | 정보통신시스템과 | 정보.전기.전자계열 | 다기능기술자 | 통신융합응용실습
6440 | 남인천캠퍼스 | 스마트전자과 | 정보.전기.전자계열 | 기능사 | 직업과사회
10319 | 아산캠퍼스 | 메카트로닉스과 | 정보.전기.전자계열 | 다기능기술자 | 프로젝트실습1
9835 | 청주캠퍼스 | 컴퓨터응용기계과 | 정보.전기.전자계열 | 다기능기술자 | 실용영어
5005 | 인천캠퍼스 | 정보통신공학과 | 정보.전기.전자계열 | 학위전공심화 | 정보통신망구축실습
10529 | 아산캠퍼스 | 메카트로닉스과 | 정보.전기.전자계열 | 다기능기술자 | 유공압응용실습
13415 | 목포캠퍼스 | 스마트정보통신과 | 정보.전기.전자계열 | 다기능기술자 | 고성능프로세서활용실습
2138 | 서울강서캠퍼스 | 실내건축디자인과 | 정보.전기.전자계열 | 기능사 | 실내건축시공실습1

Last rows

Index | 캠퍼스 | 학과 | 계열 | 과정 | 교과목명
7529 | 춘천캠퍼스 | 산업설비과 | 정보.전기.전자계열 | 기능사 | 배관공학
17763 | 창원캠퍼스 | 스마트전기전자과 | 정보.전기.전자계열 | 기능사 | 디지털회로실습2
5790 | 안성캠퍼스 | 주얼리디자인과 | 정보.전기.전자계열 | 다기능기술자 | 경제학의이해
11260 | 충주캠퍼스 | 전기제어과 | 정보.전기.전자계열 | 기능사 | 전기CAD
19489 | 진주캠퍼스 | 컴퓨터응용기계과 | 정보.전기.전자계열 | 기능사 | CNC공작기계실습
20116 | 바이오캠퍼스 | 바이오배양공정과 | 정보.전기.전자계열 | 기능사 | 바이오산업개론
5081 | 인천캠퍼스 | 기계시스템과 | 기계.금속계열 | 기능장 | 기계재료
9415 | 대전캠퍼스 | 기계시스템과 | 기계.금속계열 | 기능사 | 직업과사회
396 | 서울정수캠퍼스 | 산업디자인과 | 정보.전기.전자계열 | 다기능기술자 | 공간스케치기법
10630 | 아산캠퍼스 | 베이비부머(예산) | 정보.전기.전자계열 | 기능사 | 취업지도

Duplicate rows

Most frequently occurring

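Index | 캠퍼스 | 학과 | 계열 | 과정 | 교과목명 | # duplicates
0 | 강릉캠퍼스 | 자동차과 | 정보.전기.전자계열 | 기능사 | 건강과능력개발 | 2
1 | 광주캠퍼스 | 베이비부머(예산) | 정보.전기.전자계열 | 기능사 | 취업역량교육 | 2
2 | 구미캠퍼스 | IT응용제어과 | 정보.전기.전자계열 | 다기능기술자 | 데이터베이스실습 | 2
3 | 구미캠퍼스 | 스마트전자과 | 정보.전기.전자계열 | 기능사 | 프로젝트실습 | 2
4 | 대전캠퍼스 | 경력단절여성(예산) | 정보.전기.전자계열 | 기능사 | 직무소양교육 | 2
5 | 서울정수캠퍼스 | 자동차과 | 정보.전기.전자계열 | 다기능기술자 | 친환경전기장치실습 | 2
6 | 서울정수캠퍼스 | 컴퓨터응용기계설계과 | 정보.전기.전자계열 | 다기능기술자 | 치공구설계 | 2
7 | 성남캠퍼스 | 베이비부머(예산) | 정보.전기.전자계열 | 기능사 | 전기기기 | 2
8 | 성남캠퍼스 | 베이비부머(예산) | 정보.전기.전자계열 | 기능사 | 직업윤리 | 2
9 | 성남캠퍼스 | 전자정보통신과 | 정보.전기.전자계열 | 다기능기술자 | 디지털공학 | 2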
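
A table like this one can be derived by counting identical rows. The sketch below shows the idea with the standard library only; the three rows are an illustrative subset built from values in the table above, not the full dataset, and this is not necessarily how the profiler computes its own figures.

```python
from collections import Counter

# Illustrative rows as (캠퍼스, 학과, 계열, 과정, 교과목명) tuples.
rows = [
    ("강릉캠퍼스", "자동차과", "정보.전기.전자계열", "기능사", "건강과능력개발"),
    ("강릉캠퍼스", "자동차과", "정보.전기.전자계열", "기능사", "건강과능력개발"),
    ("구미캠퍼스", "스마트전자과", "정보.전기.전자계열", "기능사", "프로젝트실습"),
]

# Count how often each distinct row occurs.
occurrences = Counter(rows)

# Keep only rows that occur more than once, most frequent first.
duplicates = [(row, n) for row, n in occurrences.most_common() if n > 1]

for row, n in duplicates:
    print(" | ".join(row), "|", n)
```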