Overview

Dataset statistics

Number of variables: 2
Number of observations: 695
Missing cells: 1
Missing cells (%): 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 11.0 KiB
Average record size in memory: 16.2 B

Variable types

Text: 2

Dataset

Description: University homepage information (school name, homepage).
Author: 한국장학재단 (Korea Student Aid Foundation)
URL: https://www.data.go.kr/data/15070452/fileData.do

Alerts

학교명 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 08:14:22.602047
Analysis finished: 2023-12-12 08:14:22.914403
Duration: 0.31 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

학교명
Text

UNIQUE 

Distinct: 695
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

Length

Max length: 53
Median length: 44
Mean length: 13.084892
Min length: 5
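
The length figures above can be recomputed from the raw column as a sanity check; the reported median (44) sits well above the mean (13.08), so it is worth verifying against the data. The following is a minimal sketch using only the Python standard library. The function name `length_stats` is hypothetical, and the input is just the five sample rows shown in this report, not the full column.

```python
from statistics import mean, median


def length_stats(values):
    """Return max/median/mean/min of string lengths, skipping missing values."""
    lengths = [len(v) for v in values if v is not None]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }


# The five sample rows listed for 학교명 in this report (not the full column).
sample = [
    "문경대학교(본교)",
    "배재대학교(본교) 학부",
    "배화여자대학교(본교)",
    "백석대학교(본교) 학부",
    "우송대학교(본교) 대학원",
]
stats = length_stats(sample)
```

Running the same function over all 695 values would show whether the median in the report matches a plain per-row computation.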

Characters and Unicode

Total characters: 9094
Distinct characters: 277
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
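
As an illustration of the character properties mentioned above, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The two-letter codes it returns correspond to the category names used in this report (`Lo` is Other Letter, `Ps` is Open Punctuation, `Pe` is Close Punctuation, `Zs` is Space Separator). The helper name `category_counts` is hypothetical, and this is not necessarily how the profiling tool computes its figures.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in the values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Two of the sample rows shown in this report.
counts = category_counts(["문경대학교(본교)", "배재대학교(본교) 학부"])
```

The standard library module does not expose script or block properties, so those two breakdowns would need a third-party package.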

Unique

Unique: 695
Unique (%): 100.0%

Sample

1st row: 문경대학교(본교)
2nd row: 배재대학교(본교) 학부
3rd row: 배화여자대학교(본교)
4th row: 백석대학교(본교) 학부
5th row: 우송대학교(본교) 대학원

Value | Count | Frequency (%)
학부 | 225 | 18.3%
대학원 | 212 | 17.3%
university | 19 | 1.5%
한국폴리텍대학 | 16 | 1.3%
of | 15 | 1.2%
london | 4 | 0.3%
college | 4 | 0.3%
중앙대학교 | 4 | 0.3%
안성캠퍼스 | 3 | 0.2%
아세아연합신학대학교(본교 | 2 | 0.2%
Other values (525) | 724 | 59.0%
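
The word table above looks like a count of lowercased, whitespace-separated tokens (the listed counts sum to 1228, and 225/1228 is about 18.3%). A minimal sketch of that kind of count is shown below. The tokenization rule is an assumption inferred from the table, not taken from the tool's source, and `word_frequencies` is a hypothetical name.

```python
from collections import Counter


def word_frequencies(values):
    """Return (token, count, percent) tuples, most frequent first.

    Assumes tokens are lowercased and split on whitespace.
    """
    counts = Counter()
    for value in values:
        counts.update(value.lower().split())
    total = sum(counts.values())
    return [(word, n, 100.0 * n / total) for word, n in counts.most_common()]


# The five sample rows listed for 학교명 in this report.
freqs = word_frequencies([
    "문경대학교(본교)",
    "배재대학교(본교) 학부",
    "배화여자대학교(본교)",
    "백석대학교(본교) 학부",
    "우송대학교(본교) 대학원",
])
```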

Most occurring characters

ValueCountFrequency (%)
1207
13.3%
1185
13.0%
951
 
10.5%
( 569
 
6.3%
) 569
 
6.3%
533
 
5.9%
532
 
5.9%
317
 
3.5%
254
 
2.8%
119
 
1.3%
Other values (267) 2858
31.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 6813 | 74.9%
Open Punctuation | 569 | 6.3%
Close Punctuation | 569 | 6.3%
Space Separator | 533 | 5.9%
Lowercase Letter | 508 | 5.6%
Uppercase Letter | 86 | 0.9%
Letter Number | 9 | 0.1%
Other Punctuation | 4 | < 0.1%
Decimal Number | 2 | < 0.1%
Final Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1207
17.7%
1185
17.4%
951
14.0%
532
 
7.8%
317
 
4.7%
254
 
3.7%
119
 
1.7%
117
 
1.7%
65
 
1.0%
64
 
0.9%
Other values (212) 2002
29.4%
Lowercase Letter
ValueCountFrequency (%)
i 60
11.8%
e 56
11.0%
o 47
9.3%
n 45
 
8.9%
r 37
 
7.3%
s 35
 
6.9%
t 34
 
6.7%
a 32
 
6.3%
y 27
 
5.3%
l 22
 
4.3%
Other values (14) 113
22.2%
Uppercase Letter
ValueCountFrequency (%)
U 21
24.4%
C 14
16.3%
S 8
 
9.3%
B 5
 
5.8%
L 5
 
5.8%
I 5
 
5.8%
M 4
 
4.7%
N 4
 
4.7%
T 3
 
3.5%
G 3
 
3.5%
Other values (9) 14
16.3%
Letter Number
ValueCountFrequency (%)
3
33.3%
2
22.2%
1
 
11.1%
1
 
11.1%
1
 
11.1%
1
 
11.1%
Open Punctuation
ValueCountFrequency (%)
( 569
100.0%
Close Punctuation
ValueCountFrequency (%)
) 569
100.0%
Space Separator
ValueCountFrequency (%)
533
100.0%
Other Punctuation
ValueCountFrequency (%)
, 4
100.0%
Decimal Number
ValueCountFrequency (%)
2 2
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 6813 | 74.9%
Common | 1678 | 18.5%
Latin | 603 | 6.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1207
17.7%
1185
17.4%
951
14.0%
532
 
7.8%
317
 
4.7%
254
 
3.7%
119
 
1.7%
117
 
1.7%
65
 
1.0%
64
 
0.9%
Other values (212) 2002
29.4%
Latin
ValueCountFrequency (%)
i 60
 
10.0%
e 56
 
9.3%
o 47
 
7.8%
n 45
 
7.5%
r 37
 
6.1%
s 35
 
5.8%
t 34
 
5.6%
a 32
 
5.3%
y 27
 
4.5%
l 22
 
3.6%
Other values (39) 208
34.5%
Common
ValueCountFrequency (%)
( 569
33.9%
) 569
33.9%
533
31.8%
, 4
 
0.2%
2 2
 
0.1%
1
 
0.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 6813 | 74.9%
ASCII | 2271 | 25.0%
Number Forms | 9 | 0.1%
Punctuation | 1 | < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1207
17.7%
1185
17.4%
951
14.0%
532
 
7.8%
317
 
4.7%
254
 
3.7%
119
 
1.7%
117
 
1.7%
65
 
1.0%
64
 
0.9%
Other values (212) 2002
29.4%
ASCII
ValueCountFrequency (%)
( 569
25.1%
) 569
25.1%
533
23.5%
i 60
 
2.6%
e 56
 
2.5%
o 47
 
2.1%
n 45
 
2.0%
r 37
 
1.6%
s 35
 
1.5%
t 34
 
1.5%
Other values (38) 286
12.6%
Number Forms
ValueCountFrequency (%)
3
33.3%
2
22.2%
1
 
11.1%
1
 
11.1%
1
 
11.1%
1
 
11.1%
Punctuation
ValueCountFrequency (%)
1
100.0%

홈페이지주소
Text

Distinct: 525
Distinct (%): 75.6%
Missing: 1
Missing (%): 0.1%
Memory size: 5.6 KiB

Length

Max length: 34
Median length: 29
Mean length: 16.007205
Min length: 10

Characters and Unicode

Total characters: 11109
Distinct characters: 33
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 371
Unique (%): 53.5%
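
The report distinguishes "Distinct" (how many different values occur) from "Unique" (how many values occur exactly once). The percentages here are consistent with a denominator of the 694 non-missing values (525/694 is about 75.6%, 371/694 is about 53.5%). The sketch below shows the distinction on a tiny list; `distinct_and_unique` is a hypothetical helper and the input mixes hostnames from this report into an illustrative list.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (distinct, unique) counts, ignoring missing (None) values.

    distinct: number of different values present
    unique:   number of values that occur exactly once
    """
    counts = Counter(v for v in values if v is not None)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


# Illustrative list: one repeated hostname, two single ones, one missing value.
urls = ["www.cau.ac.kr", "www.cau.ac.kr", "www.mkc.ac.kr", "www.pcu.ac.kr", None]
distinct, unique = distinct_and_unique(urls)
```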

Sample

1st row: www.mkc.ac.kr
2nd row: www.pcu.ac.kr
3rd row: baewha.ac.kr
4th row: www.bu.ac.kr
5th row: www.woosong.ac.kr

Value | Count | Frequency (%)
www.gachon.ac.kr | 4 | 0.6%
www.cau.ac.kr | 4 | 0.6%
www.kopo.ac.kr | 4 | 0.6%
www.dankook.ac.kr | 4 | 0.6%
www.hufs.ac.kr | 3 | 0.4%
www.kangwon.ac.kr | 3 | 0.4%
www.hanyang.ac.kr | 3 | 0.4%
www.smu.ac.kr | 3 | 0.4%
www.yonsei.ac.kr | 3 | 0.4%
www.khu.ac.kr | 3 | 0.4%
Other values (516) | 661 | 95.1%

Most occurring characters

ValueCountFrequency (%)
. 2052
18.5%
w 1965
17.7%
a 931
8.4%
k 897
8.1%
c 873
 
7.9%
r 717
 
6.5%
u 416
 
3.7%
n 390
 
3.5%
t 323
 
2.9%
o 313
 
2.8%
Other values (23) 2232
20.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 8701 | 78.3%
Other Punctuation | 2399 | 21.6%
Decimal Number | 5 | < 0.1%
Dash Punctuation | 3 | < 0.1%
Space Separator | 1 | < 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
w 1965
22.6%
a 931
10.7%
k 897
10.3%
c 873
10.0%
r 717
 
8.2%
u 416
 
4.8%
n 390
 
4.5%
t 323
 
3.7%
o 313
 
3.6%
s 297
 
3.4%
Other values (15) 1579
18.1%
Other Punctuation
ValueCountFrequency (%)
. 2052
85.5%
/ 240
 
10.0%
: 106
 
4.4%
, 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 4
80.0%
5 1
 
20.0%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 8701 | 78.3%
Common | 2408 | 21.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
w 1965
22.6%
a 931
10.7%
k 897
10.3%
c 873
10.0%
r 717
 
8.2%
u 416
 
4.8%
n 390
 
4.5%
t 323
 
3.7%
o 313
 
3.6%
s 297
 
3.4%
Other values (15) 1579
18.1%
Common
ValueCountFrequency (%)
. 2052
85.2%
/ 240
 
10.0%
: 106
 
4.4%
1 4
 
0.2%
- 3
 
0.1%
, 1
 
< 0.1%
1
 
< 0.1%
5 1
 
< 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 11109 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 2052
18.5%
w 1965
17.7%
a 931
8.4%
k 897
8.1%
c 873
 
7.9%
r 717
 
6.5%
u 416
 
3.7%
n 390
 
3.5%
t 323
 
2.9%
o 313
 
2.8%
Other values (23) 2232
20.1%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
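
The nullity figures can also be checked numerically. With 695 rows and 2 columns there are 1390 cells, so a single missing cell is about 0.07%, which the overview rounds to 0.1%. The following is a minimal sketch of a per-column missing count. `missing_summary` is a hypothetical helper, treating an empty string as missing is an assumption, and the third row is invented for illustration.

```python
def missing_summary(rows, columns):
    """Count missing cells per column and overall.

    A cell counts as missing if it is None or an empty string (an assumption).
    Returns (per_column_counts, total_missing, percent_of_all_cells).
    """
    per_column = {c: 0 for c in columns}
    for row in rows:
        for c in columns:
            value = row.get(c)
            if value is None or value == "":
                per_column[c] += 1
    total_cells = len(rows) * len(columns)
    missing = sum(per_column.values())
    pct = 100.0 * missing / total_cells if total_cells else 0.0
    return per_column, missing, pct


columns = ["학교명", "홈페이지주소"]
rows = [
    {"학교명": "문경대학교(본교)", "홈페이지주소": "www.mkc.ac.kr"},
    {"학교명": "배화여자대학교(본교)", "홈페이지주소": "baewha.ac.kr"},
    {"학교명": "예시대학교", "홈페이지주소": None},  # hypothetical row with a missing homepage
]
per_column, missing, pct = missing_summary(rows, columns)
```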

Sample

Index | 학교명 | 홈페이지주소
0 | 문경대학교(본교) | www.mkc.ac.kr
1 | 배재대학교(본교) 학부 | www.pcu.ac.kr
2 | 배화여자대학교(본교) | baewha.ac.kr
3 | 백석대학교(본교) 학부 | www.bu.ac.kr
4 | 우송대학교(본교) 대학원 | www.woosong.ac.kr
5 | 울산과학기술원(본교) 대학원 | www.unist.ac.kr
6 | 울산대학교(본교) 대학원 | www.ulsan.ac.kr
7 | 원광대학교(본교) 대학원 | www.wonkwang.ac.kr
8 | 원불교대학원대학교 | http://www.wonbuddhism.ac.kr/
9 | 웨스트민스터신학대학원대학교 | http://www.wgst.ac.kr

Index | 학교명 | 홈페이지주소
685 | 국립암센터국제암대학원대학교 | http://www.ncc.re.kr
686 | University of Sheffield | https://www.sheffield.ac.uk
687 | Temple University | https://www.temple.edu/
688 | National University of Singapore | https://nus.edu.sg
689 | University of British Colombia (Okanagan Campus) | https://www.ubc.ca
690 | University of Maryland | www.umd.edu
691 | University of Arizona | http://www.arizona.edu
692 | 산동사범대학교 | www.sdnu.edu.cn.
693 | 남서울대학교 대학원 국제캠퍼스 | igs.nsu.ac.kr
694 | Georgia State University’s Perimeter College | perimeter.gsu.edu
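
The sample rows show that 홈페이지주소 mixes bare hostnames, full URLs with a scheme, trailing slashes, and at least one trailing dot. If the column needs to be compared or deduplicated, one possible approach is to reduce every value to a hostname first. The sketch below uses `urllib.parse.urlsplit` from the standard library. `normalize_host` is a hypothetical helper and the normalization rules are assumptions, not part of the dataset or the profiling tool.

```python
from urllib.parse import urlsplit


def normalize_host(raw):
    """Reduce a homepage value to a lowercase hostname.

    Values without a scheme are prefixed with "//" so that urlsplit
    parses them as a network location rather than a path.
    """
    text = raw.strip()
    if "://" not in text:
        text = "//" + text
    host = urlsplit(text).hostname or ""
    return host.rstrip(".")  # drop a trailing dot such as "www.sdnu.edu.cn."


# Values taken from the sample rows in this report.
hosts = [
    normalize_host(v)
    for v in [
        "www.mkc.ac.kr",
        "http://www.wonbuddhism.ac.kr/",
        "https://www.temple.edu/",
        "www.sdnu.edu.cn.",
    ]
]
```

Whether a leading `www.` should also be stripped depends on the use case, so this sketch leaves it in place.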