Overview

Dataset statistics

Number of variables: 4
Number of observations: 1395
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 43.7 KiB
Average record size in memory: 32.1 B

Variable types

Categorical: 1
Text: 3

Dataset

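Description: Provides the Korean and English names of major overseas universities, as defined by the Korea Foundation in its "Guidelines for Standardizing the Korean Names of Overseas Universities" (해외대학 국문명칭 표준화 지침).
Author: Korea Foundation (한국국제교류재단)
URL: https://www.data.go.kr/data/15060038/fileData.do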

Reproduction

Analysis started: 2024-03-14 15:28:59.637388
Analysis finished: 2024-03-14 15:29:00.934922
Duration: 1.3 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
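
A report of this kind can be regenerated roughly as follows. This is a minimal sketch, not the exact invocation used for this report: the file paths, the `cp949` default encoding, and the report title are assumptions, and the function needs `pandas` and `ydata-profiling` to be installed.

```python
def build_profile(csv_path, html_path, encoding="cp949"):
    """Profile a CSV file and write an HTML report.

    csv_path, html_path and the default encoding are hypothetical;
    adjust them to the file actually downloaded from the dataset URL.
    """
    # Imported lazily so that defining the function needs no third-party package.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Overseas university names")
    report.to_file(html_path)
    # (rows, columns); this report lists 1395 observations and 4 variables.
    return df.shape
```

If the CSV is UTF-8 encoded, pass `encoding="utf-8"` instead.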

Variables

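지역 (Region)
Categorical

Distinct: 12
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 11.0 KiB

Value  Count
동북아 (Northeast Asia)  530
북미 (North America)  184
서유럽 (Western Europe)  161
동남아 (Southeast Asia)  127
중유럽 (Central Europe)  91
Other values (7)  302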

Length

Max length: 5
Median length: 3
Mean length: 2.9491039
Min length: 2
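
The length statistics can be recomputed from the raw strings. The sketch below assumes that length means the number of Unicode code points, as returned by Python's `len`. For illustration it runs over the ten region names listed in this report once each, not over the full 1395-row column, so its mean differs from the reported one.

```python
import statistics

def length_stats(values):
    """Return max, median, mean and min string length (in code points)."""
    lengths = [len(v) for v in values]
    return {
        "max": max(lengths),
        "median": statistics.median(lengths),
        "mean": statistics.mean(lengths),
        "min": min(lengths),
    }

# Ten region names that appear in the report, one occurrence each.
regions = ["동북아", "북미", "서유럽", "동남아", "중유럽",
           "유라시아", "중미카리브", "중동", "남미", "서남아"]
stats = length_stats(regions)
```

Applying the same function to a whole column is a quick way to cross-check the figures a profiler reports.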

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 남미 (South America)
2nd row: 남미 (South America)
3rd row: 남미 (South America)
4th row: 남미 (South America)
5th row: 남미 (South America)

Common Values

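Value  Count  Frequency (%)
동북아 (Northeast Asia)  530  38.0%
북미 (North America)  184  13.2%
서유럽 (Western Europe)  161  11.5%
동남아 (Southeast Asia)  127  9.1%
중유럽 (Central Europe)  91  6.5%
유라시아 (Eurasia)  84  6.0%
중미카리브 (Central America and the Caribbean)  50  3.6%
중동 (Middle East)  42  3.0%
남미 (South America)  40  2.9%
서남아 (Southwest Asia)  40  2.9%
Other values (2)  46  3.3%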
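
The frequency column follows directly from the counts: each count is divided by the number of observations (1395) and rounded to one decimal place. A small sketch that reproduces the percentages above; the rounding rule is inferred from the printed values.

```python
# Counts copied from the Common Values table.
counts = {
    "동북아": 530, "북미": 184, "서유럽": 161, "동남아": 127, "중유럽": 91,
    "유라시아": 84, "중미카리브": 50, "중동": 42, "남미": 40, "서남아": 40,
    "Other values (2)": 46,
}

total = sum(counts.values())  # should equal the number of observations

def frequency_pct(count, total):
    """Share of `count` in `total`, as a percentage rounded to one decimal."""
    return round(100 * count / total, 1)

frequencies = {value: frequency_pct(n, total) for value, n in counts.items()}
```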

Length

[Figure: Histogram of lengths of the category]

국가 (Country)
Text

Distinct: 109
Distinct (%): 7.8%
Missing: 0
Missing (%): 0.0%
Memory size: 11.0 KiB

Length

Max length: 10
Median length: 2
Mean length: 2.7146953
Min length: 2

Characters and Unicode

Total characters: 3787
Distinct characters: 133
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
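
As an illustration of such character properties, the sketch below counts general categories with Python's standard `unicodedata` module. The profiler may use a different Unicode library internally, and the `CATEGORY_NAMES` mapping covers only the categories that occur in this report.

```python
import unicodedata
from collections import Counter

# Two-letter general category codes mapped to the labels used in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter", "Lu": "Uppercase Letter", "Ll": "Lowercase Letter",
    "Zs": "Space Separator", "Ps": "Open Punctuation", "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation", "Po": "Other Punctuation", "Nd": "Decimal Number",
}

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

example = category_counts(["산시몬대학교(UMSS)"])
```

Hangul syllables fall under "Other Letter" because Hangul has no upper or lower case.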

Unique

Unique: 31
Unique (%): 2.2%
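
The sketch below shows one way to read the two counters, assuming that "Distinct" is the number of different values and "Unique" is the number of values that occur in exactly one row. Under that reading, 31 of the 109 countries appear only once. The sample list is made up for illustration.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique

# Hypothetical mini-column: one country repeated, two appearing once.
sample = ["볼리비아", "볼리비아", "브라질", "헝가리"]
distinct, unique = distinct_and_unique(sample)
```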

Sample

1st row: 볼리비아 (Bolivia)
2nd row: 볼리비아 (Bolivia)
3rd row: 볼리비아 (Bolivia)
4th row: 볼리비아 (Bolivia)
5th row: 볼리비아 (Bolivia)

Value  Count  Frequency (%)
일본 (Japan)  347  24.9%
미국 (United States)  156  11.2%
중국 (China)  125  9.0%
영국 (United Kingdom)  55  3.9%
대만 (Taiwan)  43  3.1%
러시아 (Russia)  42  3.0%
베트남 (Vietnam)  34  2.4%
태국 (Thailand)  33  2.4%
독일 (Germany)  29  2.1%
인도 (India)  29  2.1%
Other values (100)  503  36.0%

Most occurring characters

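Value  Count  Frequency (%)
[glyph lost]  381  10.1%
[glyph lost]  372  9.8%
[glyph lost]  347  9.2%
[glyph lost]  198  5.2%
[glyph lost]  167  4.4%
[glyph lost]  144  3.8%
[glyph lost]  125  3.3%
[glyph lost]  100  2.6%
[glyph lost]  69  1.8%
[glyph lost]  69  1.8%
Other values (123)  1815  47.9%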

Most occurring categories

Value  Count  Frequency (%)
Other Letter  3786  > 99.9%
Space Separator  1  < 0.1%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
[glyph lost]  381  10.1%
[glyph lost]  372  9.8%
[glyph lost]  347  9.2%
[glyph lost]  198  5.2%
[glyph lost]  167  4.4%
[glyph lost]  144  3.8%
[glyph lost]  125  3.3%
[glyph lost]  100  2.6%
[glyph lost]  69  1.8%
[glyph lost]  69  1.8%
Other values (122)  1814  47.9%

Space Separator
Value  Count  Frequency (%)
(space)  1  100.0%

Most occurring scripts

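Value  Count  Frequency (%)
Hangul  3786  > 99.9%
Common  1  < 0.1%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
[glyph lost]  381  10.1%
[glyph lost]  372  9.8%
[glyph lost]  347  9.2%
[glyph lost]  198  5.2%
[glyph lost]  167  4.4%
[glyph lost]  144  3.8%
[glyph lost]  125  3.3%
[glyph lost]  100  2.6%
[glyph lost]  69  1.8%
[glyph lost]  69  1.8%
Other values (122)  1814  47.9%

Common
Value  Count  Frequency (%)
(space)  1  100.0%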

Most occurring blocks

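Value  Count  Frequency (%)
Hangul  3786  > 99.9%
ASCII  1  < 0.1%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
[glyph lost]  381  10.1%
[glyph lost]  372  9.8%
[glyph lost]  347  9.2%
[glyph lost]  198  5.2%
[glyph lost]  167  4.4%
[glyph lost]  144  3.8%
[glyph lost]  125  3.3%
[glyph lost]  100  2.6%
[glyph lost]  69  1.8%
[glyph lost]  69  1.8%
Other values (122)  1814  47.9%

ASCII
Value  Count  Frequency (%)
(space)  1  100.0%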

기관명(한글) (Institution name, Korean)
Text

Distinct: 1393
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 11.0 KiB

Length

Max length: 38
Median length: 23
Mean length: 8.6193548
Min length: 3

Characters and Unicode

Total characters: 12024
Distinct characters: 573
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
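
For this variable the block breakdown reduces to two code point ranges. The sketch below classifies characters by range, assuming that the report's "Hangul" label corresponds to the Hangul Syllables block (U+AC00 to U+D7AF) and "ASCII" to U+0000 to U+007F. Any other character is reported as "Other", a label introduced here for illustration.

```python
from collections import Counter

def block_of(ch):
    """Classify one character into a coarse Unicode block by code point range."""
    cp = ord(ch)
    if cp <= 0x7F:
        return "ASCII"
    if 0xAC00 <= cp <= 0xD7AF:  # Hangul Syllables block
        return "Hangul"
    return "Other"  # hypothetical catch-all, not a label from the report

def block_counts(values):
    """Count characters per block across all strings."""
    return Counter(block_of(ch) for value in values for ch in value)

example_blocks = block_counts(["산시몬대학교(UMSS)"])
```

Names that carry a Latin abbreviation in parentheses are the source of the ASCII characters in this otherwise Hangul column.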

Unique

Unique: 1391
Unique (%): 99.7%

Sample

1st row: 가브리엘레네모레노자치대학교
2nd row: 산시몬대학교(UMSS)
3rd row: 산안드레스대학교
4th row: 세인트프란시스하비에르대학교
5th row: 후안미사엘사라초자치대학교

Value  Count  Frequency (%)
캘리포니아대학교  8  0.5%
인도공과대학교  7  0.5%
대학교  5  0.3%
펜실베니아  4  0.3%
텍사스대학교  4  0.3%
뉴욕주립대학교  4  0.3%
자이드대학교  3  0.2%
송클라대학교  3  0.2%
쓰촨외국어대학교  3  0.2%
아메리칸대학교  3  0.2%
Other values (1430)  1441  97.0%

Most occurring characters

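Value  Count  Frequency (%)
[glyph lost]  1429  11.9%
[glyph lost]  1343  11.2%
[glyph lost]  1316  10.9%
[glyph lost]  276  2.3%
[glyph lost]  239  2.0%
[glyph lost]  237  2.0%
[glyph lost]  191  1.6%
[glyph lost]  177  1.5%
[glyph lost]  161  1.3%
[glyph lost]  133  1.1%
Other values (563)  6522  54.2%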

Most occurring categories

Value  Count  Frequency (%)
Other Letter  11573  96.2%
Uppercase Letter  217  1.8%
Space Separator  90  0.7%
Open Punctuation  50  0.4%
Close Punctuation  50  0.4%
Lowercase Letter  24  0.2%
Dash Punctuation  13  0.1%
Decimal Number  4  < 0.1%
Other Punctuation  3  < 0.1%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
[glyph lost]  1429  12.3%
[glyph lost]  1343  11.6%
[glyph lost]  1316  11.4%
[glyph lost]  276  2.4%
[glyph lost]  239  2.1%
[glyph lost]  237  2.0%
[glyph lost]  191  1.7%
[glyph lost]  177  1.5%
[glyph lost]  161  1.4%
[glyph lost]  133  1.1%
Other values (522)  6071  52.5%

Uppercase Letter
Value  Count  Frequency (%)
U  45  20.7%
S  21  9.7%
N  20  9.2%
I  18  8.3%
A  16  7.4%
C  16  7.4%
L  11  5.1%
M  10  4.6%
T  10  4.6%
E  7  3.2%
Other values (11)  43  19.8%

Lowercase Letter
Value  Count  Frequency (%)
s  4  16.7%
n  4  16.7%
i  3  12.5%
e  3  12.5%
t  2  8.3%
u  2  8.3%
a  2  8.3%
g  1  4.2%
l  1  4.2%
r  1  4.2%

Decimal Number
Value  Count  Frequency (%)
8  1  25.0%
3  1  25.0%
5  1  25.0%
2  1  25.0%

Space Separator
Value  Count  Frequency (%)
(space)  90  100.0%

Open Punctuation
Value  Count  Frequency (%)
(  50  100.0%

Close Punctuation
Value  Count  Frequency (%)
)  50  100.0%

Dash Punctuation
Value  Count  Frequency (%)
-  13  100.0%

Other Punctuation
Value  Count  Frequency (%)
.  3  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  11573  96.2%
Latin  241  2.0%
Common  210  1.7%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
[glyph lost]  1429  12.3%
[glyph lost]  1343  11.6%
[glyph lost]  1316  11.4%
[glyph lost]  276  2.4%
[glyph lost]  239  2.1%
[glyph lost]  237  2.0%
[glyph lost]  191  1.7%
[glyph lost]  177  1.5%
[glyph lost]  161  1.4%
[glyph lost]  133  1.1%
Other values (522)  6071  52.5%

Latin
Value  Count  Frequency (%)
U  45  18.7%
S  21  8.7%
N  20  8.3%
I  18  7.5%
A  16  6.6%
C  16  6.6%
L  11  4.6%
M  10  4.1%
T  10  4.1%
E  7  2.9%
Other values (22)  67  27.8%

Common
Value  Count  Frequency (%)
(space)  90  42.9%
(  50  23.8%
)  50  23.8%
-  13  6.2%
.  3  1.4%
8  1  0.5%
3  1  0.5%
5  1  0.5%
2  1  0.5%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  11573  96.2%
ASCII  451  3.8%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
[glyph lost]  1429  12.3%
[glyph lost]  1343  11.6%
[glyph lost]  1316  11.4%
[glyph lost]  276  2.4%
[glyph lost]  239  2.1%
[glyph lost]  237  2.0%
[glyph lost]  191  1.7%
[glyph lost]  177  1.5%
[glyph lost]  161  1.4%
[glyph lost]  133  1.1%
Other values (522)  6071  52.5%

ASCII
Value  Count  Frequency (%)
(space)  90  20.0%
(  50  11.1%
)  50  11.1%
U  45  10.0%
S  21  4.7%
N  20  4.4%
I  18  4.0%
A  16  3.5%
C  16  3.5%
-  13  2.9%
Other values (31)  112  24.8%

기관명(영문) (Institution name, English)
Text

Distinct: 1393
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 11.0 KiB

Length

Max length: 105
Median length: 59
Mean length: 27.395699
Min length: 4

Characters and Unicode

Total characters: 38217
Distinct characters: 62
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1391
Unique (%): 99.7%

Sample

1st row: Gabriel Rene Moreno Autonomous University
2nd row: University of San Simon (UMSS)
3rd row: Higher University of San Andres
4th row: The Royal and Pontifical Major University of Saint Francis Xavier of Chuquisaca
5th row: Juan Misael Saracho Autonomous University

Value  Count  Frequency (%)
university  1248  25.4%
of  572  11.6%
college  81  1.6%
state  70  1.4%
technology  70  1.4%
national  66  1.3%
and  56  1.1%
institute  48  1.0%
international  47  1.0%
normal  25  0.5%
Other values (1544)  2638  53.6%
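
A word table like the one above can be approximated by lowercasing each name, splitting on whitespace, and stripping surrounding punctuation. The sketch below does this for the five sample rows. The exact tokenization rules of the profiler are an assumption here, so counts on the full column may differ slightly from the table.

```python
import string
from collections import Counter

def word_counts(names):
    """Count lowercased, punctuation-stripped words across all names."""
    counts = Counter()
    for name in names:
        for token in name.lower().split():
            word = token.strip(string.punctuation)
            if word:
                counts[word] += 1
    return counts

sample_names = [
    "Gabriel Rene Moreno Autonomous University",
    "University of San Simon (UMSS)",
    "Higher University of San Andres",
    "The Royal and Pontifical Major University of Saint Francis Xavier of Chuquisaca",
    "Juan Misael Saracho Autonomous University",
]
words = word_counts(sample_names)
```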

Most occurring characters

Value  Count  Frequency (%)
i  4117  10.8%
(space)  3527  9.2%
n  3030  7.9%
e  2874  7.5%
t  2407  6.3%
a  2342  6.1%
r  2126  5.6%
o  2108  5.5%
s  2028  5.3%
y  1604  4.2%
Other values (52)  12054  31.5%

Most occurring categories

Value  Count  Frequency (%)
Lowercase Letter  29802  78.0%
Uppercase Letter  4586  12.0%
Space Separator  3527  9.2%
Other Punctuation  92  0.2%
Open Punctuation  84  0.2%
Close Punctuation  84  0.2%
Dash Punctuation  41  0.1%
Decimal Number  1  < 0.1%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
i  4117  13.8%
n  3030  10.2%
e  2874  9.6%
t  2407  8.1%
a  2342  7.9%
r  2126  7.1%
o  2108  7.1%
s  2028  6.8%
y  1604  5.4%
v  1335  4.5%
Other values (16)  5831  19.6%

Uppercase Letter
Value  Count  Frequency (%)
U  1356  29.6%
S  401  8.7%
C  335  7.3%
T  270  5.9%
N  229  5.0%
I  202  4.4%
A  193  4.2%
M  187  4.1%
P  144  3.1%
K  136  3.0%
Other values (16)  1133  24.7%

Other Punctuation
Value  Count  Frequency (%)
'  34  37.0%
,  26  28.3%
.  21  22.8%
&  10  10.9%
?  1  1.1%

Space Separator
Value  Count  Frequency (%)
(space)  3527  100.0%

Open Punctuation
Value  Count  Frequency (%)
(  84  100.0%

Close Punctuation
Value  Count  Frequency (%)
)  84  100.0%

Dash Punctuation
Value  Count  Frequency (%)
-  41  100.0%

Decimal Number
Value  Count  Frequency (%)
3  1  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Latin  34388  90.0%
Common  3829  10.0%

Most frequent character per script

Latin
Value  Count  Frequency (%)
i  4117  12.0%
n  3030  8.8%
e  2874  8.4%
t  2407  7.0%
a  2342  6.8%
r  2126  6.2%
o  2108  6.1%
s  2028  5.9%
y  1604  4.7%
U  1356  3.9%
Other values (42)  10396  30.2%

Common
Value  Count  Frequency (%)
(space)  3527  92.1%
(  84  2.2%
)  84  2.2%
-  41  1.1%
'  34  0.9%
,  26  0.7%
.  21  0.5%
&  10  0.3%
3  1  < 0.1%
?  1  < 0.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  38217  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
i  4117  10.8%
(space)  3527  9.2%
n  3030  7.9%
e  2874  7.5%
t  2407  6.3%
a  2342  6.1%
r  2126  5.6%
o  2108  5.5%
s  2028  5.3%
y  1604  4.2%
Other values (52)  12054  31.5%

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
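
The per-column nullity behind these plots can be tallied in a few lines. The sketch below works on rows represented as dictionaries and treats `None` and the empty string as missing; that definition of "missing" is an assumption, and the two rows are copied from the sample in this report.

```python
COLUMNS = ["지역", "국가", "기관명(한글)", "기관명(영문)"]

def missing_per_column(rows, columns):
    """Count, for each column, the rows whose value is None or empty."""
    return {
        col: sum(1 for row in rows if row.get(col) in (None, ""))
        for col in columns
    }

rows = [
    {"지역": "남미", "국가": "볼리비아",
     "기관명(한글)": "산안드레스대학교",
     "기관명(영문)": "Higher University of San Andres"},
    {"지역": "중유럽", "국가": "폴란드",
     "기관명(한글)": "바르샤바대학교",
     "기관명(영문)": "University of Warsaw"},
]
missing = missing_per_column(rows, COLUMNS)
```

For this dataset every column comes out at zero, which matches the "Missing cells: 0" figure in the overview.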

Sample

First rows

(index)  지역  국가  기관명(한글)  기관명(영문)
0  남미  볼리비아  가브리엘레네모레노자치대학교  Gabriel Rene Moreno Autonomous University
1  남미  볼리비아  산시몬대학교(UMSS)  University of San Simon (UMSS)
2  남미  볼리비아  산안드레스대학교  Higher University of San Andres
3  남미  볼리비아  세인트프란시스하비에르대학교  The Royal and Pontifical Major University of Saint Francis Xavier of Chuquisaca
4  남미  볼리비아  후안미사엘사라초자치대학교  Juan Misael Saracho Autonomous University
5  남미  브라질  리우데자네이루연방대학교  Federal University of Rio de Janeiro
6  남미  브라질  미나스제라이스연방대학교  Federal University of Minas Gerais
7  남미  브라질  발리두히우두스시누스대학교(UNISINOS)  University of Vale do Rio dos Sinos (UNISINOS)
8  남미  브라질  브라질리아대학교  University of Brasilia
9  남미  브라질  상파울루대학교(USP)  University of Sao Paulo (USP)

Last rows

(index)  지역  국가  기관명(한글)  기관명(영문)
1385  중유럽  튀르키예  이스탄불대학교  Istanbul University
1386  중유럽  튀르키예  이즈미르경제대학교  Izmir University of Economics
1387  중유럽  튀르키예  중동공과대학교  Middle East Technical University
1388  중유럽  튀르키예  하제테페대학교  Hacettepe University
1389  중유럽  폴란드  바르샤바대학교  University of Warsaw
1390  중유럽  폴란드  브로츠와프대학교  University of Wroclaw
1391  중유럽  폴란드  아담미츠키에비츠대학교  Adam Mickiewicz University
1392  중유럽  폴란드  야기엘론스키대학교  Jagiellonian University
1393  중유럽  헝가리  데브레첸서머대학교  Debrecen Summer University
1394  중유럽  헝가리  중앙유럽대학교(CEU)  Central European University(CEU)