Overview

Dataset statistics

Number of variables: 14
Number of observations: 2191
Missing cells: 5078
Missing cells (%): 16.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 241.9 KiB
Average record size in memory: 113.1 B
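
The derived figures in this block can be checked against the raw counts. The following is a minimal sketch, assuming that the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Values copied from the dataset statistics.
n_variables = 14
n_observations = 2191
missing_cells = 5078
total_size_kib = 241.9

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions both results round to the reported 16.6% and 113.1 B.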

Variable types

Numeric: 1
Text: 9
Categorical: 4

Dataset

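Description: Information on the institutions that registered research outcomes under the Overseas Korean Studies Support Program (해외한국학지원사업)
Author: Academy of Korean Studies (한국학중앙연구원)
URL: https://www.data.go.kr/data/15049067/fileData.do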

Alerts

GANADA_ORGANIZATION_ORI is highly overall correlated with GANADA_ORGANIZATION_KOR and 2 other fields (High correlation)
GANADA_ORGANIZATION_ETC is highly overall correlated with GANADA_ORGANIZATION_ORI (High correlation)
ORGANIZATION_ID is highly overall correlated with GANADA_ORGANIZATION_KOR and 1 other fields (High correlation)
GANADA_ORGANIZATION_KOR is highly overall correlated with ORGANIZATION_ID and 2 other fields (High correlation)
GANADA_ORGANIZATION_ENG is highly overall correlated with ORGANIZATION_ID and 2 other fields (High correlation)
GANADA_ORGANIZATION_ETC is highly imbalanced (86.3%) (Imbalance)
ORGANIZATION_KOR has 250 (11.4%) missing values (Missing)
ORGANIZATION_ENG has 235 (10.7%) missing values (Missing)
ORGANIZATION_ETC has 2043 (93.2%) missing values (Missing)
SORT_ORGANIZATION_KOR has 250 (11.4%) missing values (Missing)
SORT_ORGANIZATION_ENG has 235 (10.7%) missing values (Missing)
SORT_ORGANIZATION_ETC has 2043 (93.2%) missing values (Missing)
ORGANIZATION_ID has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 02:09:17.738799
Analysis finished: 2023-12-12 02:09:20.558884
Duration: 2.82 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
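
A report of this kind is typically produced by passing a pandas DataFrame to ydata-profiling. The sketch below is a rough illustration only: the function name `build_profile`, the file paths, and the report title are hypothetical, and it does not reproduce the settings stored in config.json.

```python
def build_profile(csv_path: str, out_path: str = "report.html") -> str:
    """Profile a CSV file and write an HTML report (paths are placeholders)."""
    # Imported lazily so the function can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    # The title is illustrative; default profiling settings are assumed.
    report = ProfileReport(df, title="Organization dataset profile")
    report.to_file(out_path)
    return out_path
```

Calling `build_profile("organizations.csv")` with a local copy of the data would write `report.html` to the working directory.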

Variables

ORGANIZATION_ID
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 2191
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7279.8129
Minimum: 6154
Maximum: 8608
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.4 KiB

Quantile statistics

Minimum: 6154
5-th percentile: 6263.5
Q1: 6705.5
Median: 7267
Q3: 7817.5
95-th percentile: 8437.5
Maximum: 8608
Range: 2454
Interquartile range (IQR): 1112
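
The range and the interquartile range follow directly from the quantiles listed here. A short check, using only the reported values (the variable names are illustrative):

```python
# Quantiles copied from the report for ORGANIZATION_ID.
minimum, q1, q3, maximum = 6154, 6705.5, 7817.5, 8608

value_range = maximum - minimum  # spread between the extremes
iqr = q3 - q1                    # spread of the middle 50% of values

print(value_range, iqr)
```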

Descriptive statistics

Standard deviation: 668.89804
Coefficient of variation (CV): 0.091883961
Kurtosis: -1.0390722
Mean: 7279.8129
Median Absolute Deviation (MAD): 556
Skewness: 0.12539726
Sum: 15950070
Variance: 447424.59
Monotonicity: Strictly increasing
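
Several of these descriptive statistics are related to one another. The sketch below recomputes the coefficient of variation, the variance, and the sum from the reported mean, standard deviation, and observation count. Because the inputs are already rounded, the results are expected to agree with the report only up to rounding.

```python
# Reported values for ORGANIZATION_ID.
mean = 7279.8129
std = 668.89804
n = 2191

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the squared standard deviation
total = mean * n       # sum of all values

print(f"CV ~ {cv:.9f}")
print(f"Variance ~ {variance:.2f}")
print(f"Sum ~ {total:.0f}")
```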
Histogram with fixed size bins (bins=50)
Value                  Count  Frequency (%)
6154                   1      < 0.1%
7637                   1      < 0.1%
7631                   1      < 0.1%
7632                   1      < 0.1%
7633                   1      < 0.1%
7634                   1      < 0.1%
7635                   1      < 0.1%
7636                   1      < 0.1%
7638                   1      < 0.1%
7612                   1      < 0.1%
Other values (2181)    2181   99.5%
Value  Count  Frequency (%)
6154   1      < 0.1%
6155   1      < 0.1%
6156   1      < 0.1%
6157   1      < 0.1%
6158   1      < 0.1%
6159   1      < 0.1%
6160   1      < 0.1%
6161   1      < 0.1%
6162   1      < 0.1%
6163   1      < 0.1%
Value  Count  Frequency (%)
8608   1      < 0.1%
8607   1      < 0.1%
8602   1      < 0.1%
8601   1      < 0.1%
8600   1      < 0.1%
8598   1      < 0.1%
8596   1      < 0.1%
8593   1      < 0.1%
8590   1      < 0.1%
8588   1      < 0.1%
Distinct: 2100
Distinct (%): 95.8%
Missing: 0
Missing (%): 0.0%
Memory size: 17.2 KiB

Length

Max length: 11
Median length: 10
Mean length: 9.3455043
Min length: 5

Characters and Unicode

Total characters: 20476
Distinct characters: 20
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2025
Unique (%): 92.4%
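
To illustrate what the category counts mean, the sketch below classifies the characters of one sample value from this variable using Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical. The module returns two-letter codes: `Nd` corresponds to Decimal Number, `Lu` to Uppercase Letter, and `Pc` to Connector Punctuation. Script and block lookups are not available in the standard library and are left out.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# "08C19_0019" is the first sample row shown for this variable.
counts = category_counts("08C19_0019")
print(counts)
```

For this value the digits fall under `Nd`, the letter `C` under `Lu`, and the underscore under `Pc`.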

Sample

1st row08C19_0019
2nd row08C09_0004
3rd row06C10_0028
4th row06C10_0028
5th row08C09_0024
Value                  Count  Frequency (%)
09r33_0002             4      0.2%
09r33_0001             4      0.2%
09c12_0018             4      0.2%
09r34                  4      0.2%
09c12_0004             4      0.2%
09c02_0052             3      0.1%
06c13_0016             3      0.1%
07c09_0040             3      0.1%
06p01_0017             3      0.1%
10c15_0016             3      0.1%
Other values (2090)    2156   98.4%

Most occurring characters

ValueCountFrequency (%)
0 7649
37.4%
1 2288
 
11.2%
_ 1887
 
9.2%
C 1585
 
7.7%
2 1042
 
5.1%
6 1022
 
5.0%
9 951
 
4.6%
7 820
 
4.0%
5 686
 
3.4%
8 665
 
3.2%
Other values (10) 1881
 
9.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 16319
79.7%
Uppercase Letter 2263
 
11.1%
Connector Punctuation 1887
 
9.2%
Lowercase Letter 7
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 7649
46.9%
1 2288
 
14.0%
2 1042
 
6.4%
6 1022
 
6.3%
9 951
 
5.8%
7 820
 
5.0%
5 686
 
4.2%
8 665
 
4.1%
3 654
 
4.0%
4 542
 
3.3%
Uppercase Letter
ValueCountFrequency (%)
C 1585
70.0%
R 340
 
15.0%
P 186
 
8.2%
V 150
 
6.6%
M 2
 
0.1%
Lowercase Letter
ValueCountFrequency (%)
b 3
42.9%
a 2
28.6%
d 1
 
14.3%
t 1
 
14.3%
Connector Punctuation
ValueCountFrequency (%)
_ 1887
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 18206
88.9%
Latin 2270
 
11.1%

Most frequent character per script

Common
ValueCountFrequency (%)
0 7649
42.0%
1 2288
 
12.6%
_ 1887
 
10.4%
2 1042
 
5.7%
6 1022
 
5.6%
9 951
 
5.2%
7 820
 
4.5%
5 686
 
3.8%
8 665
 
3.7%
3 654
 
3.6%
Latin
ValueCountFrequency (%)
C 1585
69.8%
R 340
 
15.0%
P 186
 
8.2%
V 150
 
6.6%
b 3
 
0.1%
a 2
 
0.1%
M 2
 
0.1%
d 1
 
< 0.1%
t 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20476
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 7649
37.4%
1 2288
 
11.2%
_ 1887
 
9.2%
C 1585
 
7.7%
2 1042
 
5.1%
6 1022
 
5.0%
9 951
 
4.6%
7 820
 
4.0%
5 686
 
3.4%
8 665
 
3.2%
Other values (10) 1881
 
9.2%
Distinct: 965
Distinct (%): 44.3%
Missing: 11
Missing (%): 0.5%
Memory size: 17.2 KiB

Length

Max length: 107
Median length: 79
Mean length: 18.038991
Min length: 1

Characters and Unicode

Total characters: 39325
Distinct characters: 654
Distinct categories: 13
Distinct scripts: 11
Distinct blocks: 14
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 660
Unique (%): 30.3%

Sample

1st rowGMM Grammy ltd
2nd rowKanazawa University
3rd rowKanazawa University
4th rowTohoku University
5th rowKanazawa University
ValueCountFrequency (%)
university 852
 
16.0%
of 431
 
8.1%
the 118
 
2.2%
national 90
 
1.7%
연변대학교 64
 
1.2%
studies 56
 
1.0%
연변대학 52
 
1.0%
им 47
 
0.9%
중앙민족대학교 46
 
0.9%
state 46
 
0.9%
Other values (1204) 3534
66.2%

Most occurring characters

ValueCountFrequency (%)
3176
 
8.1%
i 2913
 
7.4%
n 2256
 
5.7%
e 2220
 
5.6%
t 1738
 
4.4%
a 1606
 
4.1%
o 1597
 
4.1%
r 1511
 
3.8%
s 1471
 
3.7%
y 1106
 
2.8%
Other values (644) 19731
50.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 25650
65.2%
Other Letter 5879
 
14.9%
Uppercase Letter 4098
 
10.4%
Space Separator 3176
 
8.1%
Other Punctuation 260
 
0.7%
Dash Punctuation 74
 
0.2%
Close Punctuation 51
 
0.1%
Open Punctuation 51
 
0.1%
Decimal Number 36
 
0.1%
Nonspacing Mark 31
 
0.1%
Other values (3) 19
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
640
 
10.9%
588
 
10.0%
418
 
7.1%
273
 
4.6%
256
 
4.4%
155
 
2.6%
137
 
2.3%
118
 
2.0%
116
 
2.0%
96
 
1.6%
Other values (473) 3082
52.4%
Lowercase Letter
ValueCountFrequency (%)
i 2913
 
11.4%
n 2256
 
8.8%
e 2220
 
8.7%
t 1738
 
6.8%
a 1606
 
6.3%
o 1597
 
6.2%
r 1511
 
5.9%
s 1471
 
5.7%
y 1106
 
4.3%
v 975
 
3.8%
Other values (77) 8257
32.2%
Uppercase Letter
ValueCountFrequency (%)
U 940
22.9%
S 472
 
11.5%
C 229
 
5.6%
A 217
 
5.3%
T 189
 
4.6%
N 167
 
4.1%
H 141
 
3.4%
K 139
 
3.4%
M 128
 
3.1%
I 128
 
3.1%
Other values (42) 1348
32.9%
Nonspacing Mark
ValueCountFrequency (%)
9
29.0%
6
19.4%
́ 4
12.9%
3
 
9.7%
3
 
9.7%
3
 
9.7%
1
 
3.2%
1
 
3.2%
1
 
3.2%
Other Punctuation
ValueCountFrequency (%)
. 116
44.6%
, 116
44.6%
' 14
 
5.4%
& 8
 
3.1%
" 2
 
0.8%
: 2
 
0.8%
/ 2
 
0.8%
Decimal Number
ValueCountFrequency (%)
1 18
50.0%
2 10
27.8%
7 4
 
11.1%
3 3
 
8.3%
5 1
 
2.8%
Open Punctuation
ValueCountFrequency (%)
( 50
98.0%
1
 
2.0%
Spacing Mark
ValueCountFrequency (%)
6
66.7%
3
33.3%
Final Punctuation
ValueCountFrequency (%)
4
66.7%
2
33.3%
Initial Punctuation
ValueCountFrequency (%)
3
75.0%
1
 
25.0%
Space Separator
ValueCountFrequency (%)
3176
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 74
100.0%
Close Punctuation
ValueCountFrequency (%)
) 51
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 25253
64.2%
Cyrillic 4494
 
11.4%
Hangul 4083
 
10.4%
Common 3658
 
9.3%
Han 1685
 
4.3%
Khmer 81
 
0.2%
Arabic 30
 
0.1%
Katakana 20
 
0.1%
Thai 16
 
< 0.1%
Inherited 4
 
< 0.1%

Most frequent character per script

Han
ValueCountFrequency (%)
273
 
16.2%
256
 
15.2%
62
 
3.7%
58
 
3.4%
51
 
3.0%
42
 
2.5%
38
 
2.3%
36
 
2.1%
28
 
1.7%
28
 
1.7%
Other values (228) 813
48.2%
Hangul
ValueCountFrequency (%)
640
 
15.7%
588
 
14.4%
418
 
10.2%
155
 
3.8%
137
 
3.4%
118
 
2.9%
116
 
2.8%
96
 
2.4%
90
 
2.2%
83
 
2.0%
Other values (192) 1642
40.2%
Latin
ValueCountFrequency (%)
i 2913
 
11.5%
n 2256
 
8.9%
e 2220
 
8.8%
t 1738
 
6.9%
a 1606
 
6.4%
o 1597
 
6.3%
r 1511
 
6.0%
s 1471
 
5.8%
y 1106
 
4.4%
v 975
 
3.9%
Other values (76) 7860
31.1%
Cyrillic
ValueCountFrequency (%)
и 469
 
10.4%
а 410
 
9.1%
н 334
 
7.4%
т 322
 
7.2%
с 316
 
7.0%
е 292
 
6.5%
о 205
 
4.6%
р 188
 
4.2%
к 185
 
4.1%
в 157
 
3.5%
Other values (42) 1616
36.0%
Common
ValueCountFrequency (%)
3176
86.8%
. 116
 
3.2%
, 116
 
3.2%
- 74
 
2.0%
) 51
 
1.4%
( 50
 
1.4%
1 18
 
0.5%
' 14
 
0.4%
2 10
 
0.3%
& 8
 
0.2%
Other values (11) 25
 
0.7%
Khmer
ValueCountFrequency (%)
9
 
11.1%
6
 
7.4%
6
 
7.4%
6
 
7.4%
6
 
7.4%
6
 
7.4%
6
 
7.4%
6
 
7.4%
3
 
3.7%
3
 
3.7%
Other values (8) 24
29.6%
Thai
ValueCountFrequency (%)
3
18.8%
2
12.5%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
Other values (3) 3
18.8%
Arabic
ValueCountFrequency (%)
ا 7
23.3%
ل 4
13.3%
ة 4
13.3%
ج 2
 
6.7%
ع 2
 
6.7%
ر 2
 
6.7%
د 2
 
6.7%
ن 2
 
6.7%
ي 2
 
6.7%
م 2
 
6.7%
Katakana
ValueCountFrequency (%)
4
20.0%
4
20.0%
4
20.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
Inherited
ValueCountFrequency (%)
́ 4
100.0%
Greek
ValueCountFrequency (%)
γ 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 28758
73.1%
Cyrillic 4494
 
11.4%
Hangul 4083
 
10.4%
CJK 1683
 
4.3%
None 99
 
0.3%
Khmer 81
 
0.2%
Latin Ext Additional 42
 
0.1%
Arabic 30
 
0.1%
Katakana 20
 
0.1%
Thai 16
 
< 0.1%
Other values (4) 19
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3176
 
11.0%
i 2913
 
10.1%
n 2256
 
7.8%
e 2220
 
7.7%
t 1738
 
6.0%
a 1606
 
5.6%
o 1597
 
5.6%
r 1511
 
5.3%
s 1471
 
5.1%
y 1106
 
3.8%
Other values (58) 9164
31.9%
Hangul
ValueCountFrequency (%)
640
 
15.7%
588
 
14.4%
418
 
10.2%
155
 
3.8%
137
 
3.4%
118
 
2.9%
116
 
2.8%
96
 
2.4%
90
 
2.2%
83
 
2.0%
Other values (192) 1642
40.2%
Cyrillic
ValueCountFrequency (%)
и 469
 
10.4%
а 410
 
9.1%
н 334
 
7.4%
т 322
 
7.2%
с 316
 
7.0%
е 292
 
6.5%
о 205
 
4.6%
р 188
 
4.2%
к 185
 
4.1%
в 157
 
3.5%
Other values (42) 1616
36.0%
CJK
ValueCountFrequency (%)
273
 
16.2%
256
 
15.2%
62
 
3.7%
58
 
3.4%
51
 
3.0%
42
 
2.5%
38
 
2.3%
36
 
2.1%
28
 
1.7%
28
 
1.7%
Other values (226) 811
48.2%
None
ValueCountFrequency (%)
ö 12
12.1%
ä 11
11.1%
Đ 11
11.1%
é 9
 
9.1%
à 9
 
9.1%
á 9
 
9.1%
ó 5
 
5.1%
Ü 4
 
4.0%
ü 3
 
3.0%
ș 3
 
3.0%
Other values (14) 23
23.2%
Latin Ext Additional
ValueCountFrequency (%)
10
23.8%
10
23.8%
5
11.9%
5
11.9%
4
 
9.5%
3
 
7.1%
ế 2
 
4.8%
1
 
2.4%
1
 
2.4%
1
 
2.4%
Khmer
ValueCountFrequency (%)
9
 
11.1%
6
 
7.4%
6
 
7.4%
6
 
7.4%
6
 
7.4%
6
 
7.4%
6
 
7.4%
6
 
7.4%
3
 
3.7%
3
 
3.7%
Other values (8) 24
29.6%
Arabic
ValueCountFrequency (%)
ا 7
23.3%
ل 4
13.3%
ة 4
13.3%
ج 2
 
6.7%
ع 2
 
6.7%
ر 2
 
6.7%
د 2
 
6.7%
ن 2
 
6.7%
ي 2
 
6.7%
م 2
 
6.7%
Katakana
ValueCountFrequency (%)
4
20.0%
4
20.0%
4
20.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
Diacriticals
ValueCountFrequency (%)
́ 4
100.0%
Punctuation
ValueCountFrequency (%)
4
36.4%
3
27.3%
2
18.2%
1
 
9.1%
1
 
9.1%
Thai
ValueCountFrequency (%)
3
18.8%
2
12.5%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
Other values (3) 3
18.8%
IPA Ext
ValueCountFrequency (%)
ə 2
100.0%
CJK Compat Ideographs
ValueCountFrequency (%)
1
50.0%
1
50.0%

ORGANIZATION_KOR
Text

MISSING 

Distinct: 514
Distinct (%): 26.5%
Missing: 250
Missing (%): 11.4%
Memory size: 17.2 KiB
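
The percentages for ORGANIZATION_KOR appear to use two different denominators: the missing share is relative to all observations, while the distinct share is relative to non-missing values only. This is inferred from the reported numbers rather than taken from the tool's documentation. The sketch below checks it, and also recomputes the mean length from the total character count reported for this variable (14030).

```python
# Reported values for ORGANIZATION_KOR.
n_observations = 2191
n_missing = 250
n_distinct = 514
total_characters = 14030

n_present = n_observations - n_missing

missing_pct = 100 * n_missing / n_observations  # share of all rows
distinct_pct = 100 * n_distinct / n_present     # share of non-missing rows (inferred)
mean_length = total_characters / n_present      # characters per non-missing value

print(f"{missing_pct:.1f}% missing, {distinct_pct:.1f}% distinct")
print(f"mean length ~ {mean_length:.7f}")
```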

Length

Max length: 18
Median length: 17
Mean length: 7.2282329
Min length: 4

Characters and Unicode

Total characters: 14030
Distinct characters: 393
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 236
Unique (%): 12.2%

Sample

1st rowGMM그래미사
2nd row가나자와대학교
3rd row가나자와대학교
4th row도호쿠대학교
5th row가나자와대학교
Value                 Count  Frequency (%)
연변대학교            146    6.9%
중앙민족대학교        71     3.4%
서울대학교            55     2.6%
조선사회과학원        39     1.8%
한국학중앙연구원      36     1.7%
고려대학교            33     1.6%
푸단대학교            32     1.5%
하와이대학교          26     1.2%
산둥대학교            26     1.2%
중앙대학교            26     1.2%
Other values (540)    1623   76.8%

Most occurring characters

ValueCountFrequency (%)
1920
 
13.7%
1779
 
12.7%
1739
 
12.4%
300
 
2.1%
271
 
1.9%
268
 
1.9%
198
 
1.4%
183
 
1.3%
182
 
1.3%
175
 
1.2%
Other values (383) 7015
50.0%

Most occurring categories

ValueCountFrequency (%)
Other Letter 13775
98.2%
Space Separator 173
 
1.2%
Decimal Number 24
 
0.2%
Close Punctuation 15
 
0.1%
Open Punctuation 15
 
0.1%
Dash Punctuation 14
 
0.1%
Other Punctuation 11
 
0.1%
Uppercase Letter 3
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1920
 
13.9%
1779
 
12.9%
1739
 
12.6%
300
 
2.2%
271
 
2.0%
268
 
1.9%
198
 
1.4%
183
 
1.3%
182
 
1.3%
175
 
1.3%
Other values (370) 6760
49.1%
Decimal Number
ValueCountFrequency (%)
2 9
37.5%
5 6
25.0%
7 4
16.7%
3 3
 
12.5%
0 1
 
4.2%
1 1
 
4.2%
Uppercase Letter
ValueCountFrequency (%)
M 2
66.7%
G 1
33.3%
Space Separator
ValueCountFrequency (%)
173
100.0%
Close Punctuation
ValueCountFrequency (%)
) 15
100.0%
Open Punctuation
ValueCountFrequency (%)
( 15
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 14
100.0%
Other Punctuation
ValueCountFrequency (%)
, 11
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 13775
98.2%
Common 252
 
1.8%
Latin 3
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1920
 
13.9%
1779
 
12.9%
1739
 
12.6%
300
 
2.2%
271
 
2.0%
268
 
1.9%
198
 
1.4%
183
 
1.3%
182
 
1.3%
175
 
1.3%
Other values (370) 6760
49.1%
Common
ValueCountFrequency (%)
173
68.7%
) 15
 
6.0%
( 15
 
6.0%
- 14
 
5.6%
, 11
 
4.4%
2 9
 
3.6%
5 6
 
2.4%
7 4
 
1.6%
3 3
 
1.2%
0 1
 
0.4%
Latin
ValueCountFrequency (%)
M 2
66.7%
G 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 13775
98.2%
ASCII 255
 
1.8%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1920
 
13.9%
1779
 
12.9%
1739
 
12.6%
300
 
2.2%
271
 
2.0%
268
 
1.9%
198
 
1.4%
183
 
1.3%
182
 
1.3%
175
 
1.3%
Other values (370) 6760
49.1%
ASCII
ValueCountFrequency (%)
173
67.8%
) 15
 
5.9%
( 15
 
5.9%
- 14
 
5.5%
, 11
 
4.3%
2 9
 
3.5%
5 6
 
2.4%
7 4
 
1.6%
3 3
 
1.2%
M 2
 
0.8%
Other values (3) 3
 
1.2%

ORGANIZATION_ENG
Text

MISSING 

Distinct: 565
Distinct (%): 28.9%
Missing: 235
Missing (%): 10.7%
Memory size: 17.2 KiB

Length

Max length: 122
Median length: 60
Mean length: 26.858384
Min length: 6

Characters and Unicode

Total characters: 52535
Distinct characters: 78
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 284
Unique (%): 14.5%

Sample

1st rowGMM Grammy ltd
2nd rowKanazawa University
3rd rowKanazawa University
4th rowTohoku University
5th rowKanazawa University
Value                 Count  Frequency (%)
university            1688   25.0%
of                    692    10.2%
the                   196    2.9%
national              169    2.5%
yanbian               159    2.4%
studies               126    1.9%
and                   87     1.3%
for                   86     1.3%
central               75     1.1%
nationalities         73     1.1%
Other values (675)    3404   50.4%

Most occurring characters

ValueCountFrequency (%)
i 5712
 
10.9%
4824
 
9.2%
n 4712
 
9.0%
e 4186
 
8.0%
a 3349
 
6.4%
t 3246
 
6.2%
o 2838
 
5.4%
r 2807
 
5.3%
s 2802
 
5.3%
y 2155
 
4.1%
Other values (68) 15904
30.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 41244
78.5%
Uppercase Letter 6150
 
11.7%
Space Separator 4825
 
9.2%
Other Punctuation 140
 
0.3%
Dash Punctuation 100
 
0.2%
Open Punctuation 34
 
0.1%
Close Punctuation 34
 
0.1%
Final Punctuation 5
 
< 0.1%
Decimal Number 2
 
< 0.1%
Initial Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 5712
13.8%
n 4712
11.4%
e 4186
10.1%
a 3349
8.1%
t 3246
7.9%
o 2838
 
6.9%
r 2807
 
6.8%
s 2802
 
6.8%
y 2155
 
5.2%
v 1794
 
4.3%
Other values (26) 7643
18.5%
Uppercase Letter
ValueCountFrequency (%)
U 1742
28.3%
S 739
12.0%
C 407
 
6.6%
N 361
 
5.9%
T 323
 
5.3%
A 294
 
4.8%
K 290
 
4.7%
Y 228
 
3.7%
F 223
 
3.6%
H 181
 
2.9%
Other values (18) 1362
22.1%
Other Punctuation
ValueCountFrequency (%)
, 91
65.0%
' 25
 
17.9%
& 13
 
9.3%
. 9
 
6.4%
: 2
 
1.4%
Space Separator
ValueCountFrequency (%)
4824
> 99.9%
  1
 
< 0.1%
Final Punctuation
ValueCountFrequency (%)
4
80.0%
1
 
20.0%
Dash Punctuation
ValueCountFrequency (%)
- 100
100.0%
Open Punctuation
ValueCountFrequency (%)
( 34
100.0%
Close Punctuation
ValueCountFrequency (%)
) 34
100.0%
Decimal Number
ValueCountFrequency (%)
3 2
100.0%
Initial Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 47393
90.2%
Common 5141
 
9.8%
Cyrillic 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 5712
12.1%
n 4712
 
9.9%
e 4186
 
8.8%
a 3349
 
7.1%
t 3246
 
6.8%
o 2838
 
6.0%
r 2807
 
5.9%
s 2802
 
5.9%
y 2155
 
4.5%
v 1794
 
3.8%
Other values (53) 13792
29.1%
Common
ValueCountFrequency (%)
4824
93.8%
- 100
 
1.9%
, 91
 
1.8%
( 34
 
0.7%
) 34
 
0.7%
' 25
 
0.5%
& 13
 
0.3%
. 9
 
0.2%
4
 
0.1%
: 2
 
< 0.1%
Other values (4) 5
 
0.1%
Cyrillic
ValueCountFrequency (%)
К 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 52462
99.9%
None 66
 
0.1%
Punctuation 6
 
< 0.1%
Cyrillic 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 5712
 
10.9%
4824
 
9.2%
n 4712
 
9.0%
e 4186
 
8.0%
a 3349
 
6.4%
t 3246
 
6.2%
o 2838
 
5.4%
r 2807
 
5.4%
s 2802
 
5.3%
y 2155
 
4.1%
Other values (52) 15831
30.2%
None
ValueCountFrequency (%)
ö 18
27.3%
é 15
22.7%
ä 10
15.2%
á 10
15.2%
ş 3
 
4.5%
É 2
 
3.0%
ó 2
 
3.0%
ü 2
 
3.0%
  1
 
1.5%
ç 1
 
1.5%
Other values (2) 2
 
3.0%
Punctuation
ValueCountFrequency (%)
4
66.7%
1
 
16.7%
1
 
16.7%
Cyrillic
ValueCountFrequency (%)
К 1
100.0%

ORGANIZATION_ETC
Text

MISSING 

Distinct: 116
Distinct (%): 78.4%
Missing: 2043
Missing (%): 93.2%
Memory size: 17.2 KiB

Length

Max length: 107
Median length: 62
Mean length: 34.912162
Min length: 4

Characters and Unicode

Total characters: 5167
Distinct characters: 185
Distinct categories: 9
Distinct scripts: 5
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 99
Unique (%): 66.9%

Sample

1st row民族出版社
2nd rowСахалинский государственный университет
3rd rowСахалинский государственный университет
4th row上海市档案馆
5th row中央研究院
Value                 Count  Frequency (%)
им                    44     7.0%
университет           37     5.9%
государственный       34     5.4%
институт              29     4.6%
и                     22     3.5%
арабаева              16     2.6%
ташкентский           15     2.4%
казахский             11     1.8%
востоковедения        10     1.6%
кыргызский            10     1.6%
Other values (199)    398    63.6%

Most occurring characters

ValueCountFrequency (%)
478
 
9.3%
и 418
 
8.1%
а 376
 
7.3%
н 301
 
5.8%
т 289
 
5.6%
с 282
 
5.5%
е 262
 
5.1%
о 193
 
3.7%
р 170
 
3.3%
к 166
 
3.2%
Other values (175) 2232
43.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3839
74.3%
Uppercase Letter 592
 
11.5%
Space Separator 478
 
9.3%
Other Punctuation 128
 
2.5%
Other Letter 114
 
2.2%
Dash Punctuation 6
 
0.1%
Open Punctuation 4
 
0.1%
Close Punctuation 4
 
0.1%
Decimal Number 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
и 418
 
10.9%
а 376
 
9.8%
н 301
 
7.8%
т 289
 
7.5%
с 282
 
7.3%
е 262
 
6.8%
о 193
 
5.0%
р 170
 
4.4%
к 166
 
4.3%
в 141
 
3.7%
Other values (60) 1241
32.3%
Other Letter
ValueCountFrequency (%)
8
 
7.0%
6
 
5.3%
5
 
4.4%
4
 
3.5%
3
 
2.6%
3
 
2.6%
3
 
2.6%
3
 
2.6%
3
 
2.6%
3
 
2.6%
Other values (57) 73
64.0%
Uppercase Letter
ValueCountFrequency (%)
Г 69
11.7%
У 69
11.7%
И 61
10.3%
К 60
 
10.1%
А 49
 
8.3%
Т 39
 
6.6%
Н 30
 
5.1%
М 24
 
4.1%
Р 23
 
3.9%
В 18
 
3.0%
Other values (29) 150
25.3%
Other Punctuation
ValueCountFrequency (%)
. 93
72.7%
, 34
 
26.6%
& 1
 
0.8%
Decimal Number
ValueCountFrequency (%)
1 1
50.0%
2 1
50.0%
Space Separator
ValueCountFrequency (%)
478
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Cyrillic 4088
79.1%
Common 622
 
12.0%
Latin 343
 
6.6%
Han 111
 
2.1%
Katakana 3
 
0.1%

Most frequent character per script

Han
ValueCountFrequency (%)
8
 
7.2%
6
 
5.4%
5
 
4.5%
4
 
3.6%
3
 
2.7%
3
 
2.7%
3
 
2.7%
3
 
2.7%
3
 
2.7%
3
 
2.7%
Other values (54) 70
63.1%
Latin
ValueCountFrequency (%)
i 32
 
9.3%
a 25
 
7.3%
c 21
 
6.1%
h 20
 
5.8%
n 19
 
5.5%
g 16
 
4.7%
N 12
 
3.5%
u 11
 
3.2%
t 11
 
3.2%
10
 
2.9%
Other values (47) 166
48.4%
Cyrillic
ValueCountFrequency (%)
и 418
 
10.2%
а 376
 
9.2%
н 301
 
7.4%
т 289
 
7.1%
с 282
 
6.9%
е 262
 
6.4%
о 193
 
4.7%
р 170
 
4.2%
к 166
 
4.1%
в 141
 
3.4%
Other values (42) 1490
36.4%
Common
ValueCountFrequency (%)
478
76.8%
. 93
 
15.0%
, 34
 
5.5%
- 6
 
1.0%
( 4
 
0.6%
) 4
 
0.6%
1 1
 
0.2%
2 1
 
0.2%
& 1
 
0.2%
Katakana
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

Most occurring blocks

ValueCountFrequency (%)
Cyrillic 4088
79.1%
ASCII 895
 
17.3%
CJK 111
 
2.1%
Latin Ext Additional 41
 
0.8%
None 29
 
0.6%
Katakana 3
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
478
53.4%
. 93
 
10.4%
, 34
 
3.8%
i 32
 
3.6%
a 25
 
2.8%
c 21
 
2.3%
h 20
 
2.2%
n 19
 
2.1%
g 16
 
1.8%
N 12
 
1.3%
Other values (36) 145
 
16.2%
Cyrillic
ValueCountFrequency (%)
и 418
 
10.2%
а 376
 
9.2%
н 301
 
7.4%
т 289
 
7.1%
с 282
 
6.9%
е 262
 
6.4%
о 193
 
4.7%
р 170
 
4.2%
к 166
 
4.1%
в 141
 
3.4%
Other values (42) 1490
36.4%
Latin Ext Additional
ValueCountFrequency (%)
10
24.4%
10
24.4%
5
12.2%
5
12.2%
3
 
7.3%
3
 
7.3%
ế 2
 
4.9%
1
 
2.4%
1
 
2.4%
1
 
2.4%
None
ValueCountFrequency (%)
Đ 10
34.5%
à 4
 
13.8%
ư 3
 
10.3%
ê 3
 
10.3%
ô 2
 
6.9%
á 2
 
6.9%
Á 2
 
6.9%
í 1
 
3.4%
ò 1
 
3.4%
ơ 1
 
3.4%
CJK
ValueCountFrequency (%)
8
 
7.2%
6
 
5.4%
5
 
4.5%
4
 
3.6%
3
 
2.7%
3
 
2.7%
3
 
2.7%
3
 
2.7%
3
 
2.7%
3
 
2.7%
Other values (54) 70
63.1%
Katakana
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%
Distinct: 937
Distinct (%): 43.0%
Missing: 11
Missing (%): 0.5%
Memory size: 17.2 KiB

Length

Max length: 100
Median length: 78
Mean length: 17.649541
Min length: 1

Characters and Unicode

Total characters: 38476
Distinct characters: 593
Distinct categories: 9
Distinct scripts: 11
Distinct blocks: 13
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 632
Unique (%): 29.0%

Sample

1st rowGMM GRAMMY LTD
2nd rowKANAZAWA UNIVERSITY
3rd rowKANAZAWA UNIVERSITY
4th rowTOHOKU UNIVERSITY
5th rowKANAZAWA UNIVERSITY
ValueCountFrequency (%)
university 852
 
16.3%
of 431
 
8.3%
national 90
 
1.7%
연변대학교 64
 
1.2%
studies 56
 
1.1%
연변대학 52
 
1.0%
им 47
 
0.9%
state 46
 
0.9%
중앙민족대학교 46
 
0.9%
университет 45
 
0.9%
Other values (1201) 3495
66.9%

Most occurring characters

ValueCountFrequency (%)
3059
 
8.0%
I 3041
 
7.9%
N 2423
 
6.3%
E 2222
 
5.8%
S 1943
 
5.0%
T 1829
 
4.8%
A 1823
 
4.7%
O 1642
 
4.3%
R 1563
 
4.1%
U 1468
 
3.8%
Other values (583) 17463
45.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 29454
76.6%
Other Letter 5879
 
15.3%
Space Separator 3059
 
8.0%
Decimal Number 36
 
0.1%
Nonspacing Mark 31
 
0.1%
Spacing Mark 9
 
< 0.1%
Initial Punctuation 4
 
< 0.1%
Final Punctuation 3
 
< 0.1%
Open Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
640
 
10.9%
588
 
10.0%
418
 
7.1%
273
 
4.6%
256
 
4.4%
155
 
2.6%
137
 
2.3%
118
 
2.0%
116
 
2.0%
96
 
1.6%
Other values (473) 3082
52.4%
Uppercase Letter
ValueCountFrequency (%)
I 3041
 
10.3%
N 2423
 
8.2%
E 2222
 
7.5%
S 1943
 
6.6%
T 1829
 
6.2%
A 1823
 
6.2%
O 1642
 
5.6%
R 1563
 
5.3%
U 1468
 
5.0%
Y 1144
 
3.9%
Other values (78) 10356
35.2%
Nonspacing Mark
ValueCountFrequency (%)
9
29.0%
6
19.4%
́ 4
12.9%
3
 
9.7%
3
 
9.7%
3
 
9.7%
1
 
3.2%
1
 
3.2%
1
 
3.2%
Decimal Number
ValueCountFrequency (%)
1 18
50.0%
2 10
27.8%
7 4
 
11.1%
3 3
 
8.3%
5 1
 
2.8%
Spacing Mark
ValueCountFrequency (%)
6
66.7%
3
33.3%
Initial Punctuation
ValueCountFrequency (%)
3
75.0%
1
 
25.0%
Final Punctuation
ValueCountFrequency (%)
2
66.7%
1
33.3%
Space Separator
ValueCountFrequency (%)
3059
100.0%
Open Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 24959
64.9%
Cyrillic 4494
 
11.7%
Hangul 4083
 
10.6%
Common 3103
 
8.1%
Han 1685
 
4.4%
Khmer 81
 
0.2%
Arabic 30
 
0.1%
Katakana 20
 
0.1%
Thai 16
 
< 0.1%
Inherited 4
 
< 0.1%

Most frequent character per script

Han
ValueCountFrequency (%)
273
 
16.2%
256
 
15.2%
62
 
3.7%
58
 
3.4%
51
 
3.0%
42
 
2.5%
38
 
2.3%
36
 
2.1%
28
 
1.7%
28
 
1.7%
Other values (228) 813
48.2%
Hangul
ValueCountFrequency (%)
640
 
15.7%
588
 
14.4%
418
 
10.2%
155
 
3.8%
137
 
3.4%
118
 
2.9%
116
 
2.8%
96
 
2.4%
90
 
2.2%
83
 
2.0%
Other values (192) 1642
40.2%
Latin
ValueCountFrequency (%)
I 3041
12.2%
N 2423
9.7%
E 2222
 
8.9%
S 1943
 
7.8%
T 1829
 
7.3%
A 1823
 
7.3%
O 1642
 
6.6%
R 1563
 
6.3%
U 1468
 
5.9%
Y 1144
 
4.6%
Other values (47) 5861
23.5%
Cyrillic
ValueCountFrequency (%)
И 530
11.8%
А 461
 
10.3%
Н 366
 
8.1%
Т 362
 
8.1%
С 333
 
7.4%
Е 296
 
6.6%
К 251
 
5.6%
О 213
 
4.7%
Р 213
 
4.7%
У 189
 
4.2%
Other values (20) 1280
28.5%
Khmer
ValueCountFrequency (%)
9
 
11.1%
6
 
7.4%
6
 
7.4%
6
 
7.4%
6
 
7.4%
6
 
7.4%
6
 
7.4%
6
 
7.4%
3
 
3.7%
3
 
3.7%
Other values (8) 24
29.6%
Thai
ValueCountFrequency (%)
3
18.8%
2
12.5%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
Other values (3) 3
18.8%
Common
ValueCountFrequency (%)
3059
98.6%
1 18
 
0.6%
2 10
 
0.3%
7 4
 
0.1%
3 3
 
0.1%
3
 
0.1%
2
 
0.1%
1
 
< 0.1%
5 1
 
< 0.1%
1
 
< 0.1%
Arabic
ValueCountFrequency (%)
ا 7
23.3%
ل 4
13.3%
ة 4
13.3%
ج 2
 
6.7%
م 2
 
6.7%
ع 2
 
6.7%
ر 2
 
6.7%
د 2
 
6.7%
ي 2
 
6.7%
ن 2
 
6.7%
Katakana
ValueCountFrequency (%)
4
20.0%
4
20.0%
4
20.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
Inherited
ValueCountFrequency (%)
́ 4
100.0%
Greek
ValueCountFrequency (%)
Γ 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 27912
72.5%
Cyrillic 4494
 
11.7%
Hangul 4083
 
10.6%
CJK 1683
 
4.4%
None 101
 
0.3%
Khmer 81
 
0.2%
Latin Ext Additional 42
 
0.1%
Arabic 30
 
0.1%
Katakana 20
 
0.1%
Thai 16
 
< 0.1%
Other values (3) 14
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3059
11.0%
I 3041
10.9%
N 2423
 
8.7%
E 2222
 
8.0%
S 1943
 
7.0%
T 1829
 
6.6%
A 1823
 
6.5%
O 1642
 
5.9%
R 1563
 
5.6%
U 1468
 
5.3%
Other values (22) 6899
24.7%
Hangul
ValueCountFrequency (%)
640
 
15.7%
588
 
14.4%
418
 
10.2%
155
 
3.8%
137
 
3.4%
118
 
2.9%
116
 
2.8%
96
 
2.4%
90
 
2.2%
83
 
2.0%
Other values (192) 1642
40.2%
Cyrillic
ValueCountFrequency (%)
И 530
11.8%
А 461
 
10.3%
Н 366
 
8.1%
Т 362
 
8.1%
С 333
 
7.4%
Е 296
 
6.6%
К 251
 
5.6%
О 213
 
4.7%
Р 213
 
4.7%
У 189
 
4.2%
Other values (20) 1280
28.5%
CJK
ValueCountFrequency (%)
273
 
16.2%
256
 
15.2%
62
 
3.7%
58
 
3.4%
51
 
3.0%
42
 
2.5%
38
 
2.3%
36
 
2.1%
28
 
1.7%
28
 
1.7%
Other values (226) 811
48.2%
None
ValueCountFrequency (%)
Ö 12
11.9%
Ä 11
10.9%
Đ 11
10.9%
Á 11
10.9%
É 10
9.9%
À 9
8.9%
Ü 7
 
6.9%
Ó 5
 
5.0%
Ư 3
 
3.0%
Ž 3
 
3.0%
Other values (12) 19
18.8%
Latin Ext Additional
ValueCountFrequency (%)
10
23.8%
10
23.8%
5
11.9%
5
11.9%
4
 
9.5%
3
 
7.1%
2
 
4.8%
1
 
2.4%
1
 
2.4%
1
 
2.4%
Khmer
ValueCountFrequency (%)
9
 
11.1%
6
 
7.4%
6
 
7.4%
6
 
7.4%
6
 
7.4%
6
 
7.4%
6
 
7.4%
6
 
7.4%
3
 
3.7%
3
 
3.7%
Other values (8) 24
29.6%
Arabic
ValueCountFrequency (%)
ا 7
23.3%
ل 4
13.3%
ة 4
13.3%
ج 2
 
6.7%
م 2
 
6.7%
ع 2
 
6.7%
ر 2
 
6.7%
د 2
 
6.7%
ي 2
 
6.7%
ن 2
 
6.7%
Katakana
ValueCountFrequency (%)
4
20.0%
4
20.0%
4
20.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
Diacriticals
ValueCountFrequency (%)
́ 4
100.0%
Punctuation
ValueCountFrequency (%)
3
37.5%
2
25.0%
1
 
12.5%
1
 
12.5%
1
 
12.5%
Thai
ValueCountFrequency (%)
3
18.8%
2
12.5%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
Other values (3) 3
18.8%
CJK Compat Ideographs
ValueCountFrequency (%)
1
50.0%
1
50.0%

SORT_ORGANIZATION_KOR
Text

MISSING 

Distinct: 513
Distinct (%): 26.4%
Missing: 250
Missing (%): 11.4%
Memory size: 17.2 KiB

Length

Max length: 18
Median length: 17
Mean length: 7.1993818
Min length: 4

Characters and Unicode

Total characters: 13974
Distinct characters: 389
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 234
Unique (%): 12.1%

Sample

1st rowGMM그래미사
2nd row가나자와대학교
3rd row가나자와대학교
4th row도호쿠대학교
5th row가나자와대학교
ValueCountFrequency (%)
연변대학교 146
 
6.9%
중앙민족대학교 71
 
3.4%
서울대학교 55
 
2.6%
조선사회과학원 39
 
1.8%
한국학중앙연구원 36
 
1.7%
고려대학교 33
 
1.6%
푸단대학교 32
 
1.5%
하와이대학교 26
 
1.2%
산둥대학교 26
 
1.2%
중앙대학교 26
 
1.2%
Other values (540) 1623
76.8%

Most occurring characters

ValueCountFrequency (%)
1920
 
13.7%
1779
 
12.7%
1739
 
12.4%
300
 
2.1%
271
 
1.9%
268
 
1.9%
198
 
1.4%
183
 
1.3%
182
 
1.3%
175
 
1.3%
Other values (379) 6959
49.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 13775
98.6%
Space Separator 172
 
1.2%
Decimal Number 24
 
0.2%
Uppercase Letter 3
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1920
 
13.9%
1779
 
12.9%
1739
 
12.6%
300
 
2.2%
271
 
2.0%
268
 
1.9%
198
 
1.4%
183
 
1.3%
182
 
1.3%
175
 
1.3%
Other values (370) 6760
49.1%
Decimal Number
ValueCountFrequency (%)
2 9
37.5%
5 6
25.0%
7 4
16.7%
3 3
 
12.5%
0 1
 
4.2%
1 1
 
4.2%
Uppercase Letter
ValueCountFrequency (%)
M 2
66.7%
G 1
33.3%
Space Separator
ValueCountFrequency (%)
172
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 13775
98.6%
Common 196
 
1.4%
Latin 3
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1920
 
13.9%
1779
 
12.9%
1739
 
12.6%
300
 
2.2%
271
 
2.0%
268
 
1.9%
198
 
1.4%
183
 
1.3%
182
 
1.3%
175
 
1.3%
Other values (370) 6760
49.1%
Common
ValueCountFrequency (%)
172
87.8%
2 9
 
4.6%
5 6
 
3.1%
7 4
 
2.0%
3 3
 
1.5%
0 1
 
0.5%
1 1
 
0.5%
Latin
ValueCountFrequency (%)
M 2
66.7%
G 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 13775
98.6%
ASCII 199
 
1.4%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 1920 | 13.9%
 | 1779 | 12.9%
 | 1739 | 12.6%
 | 300 | 2.2%
 | 271 | 2.0%
 | 268 | 1.9%
 | 198 | 1.4%
 | 183 | 1.3%
 | 182 | 1.3%
 | 175 | 1.3%
Other values (370) | 6760 | 49.1%

ASCII
Value | Count | Frequency (%)
(space) | 172 | 86.4%
2 | 9 | 4.5%
5 | 6 | 3.0%
7 | 4 | 2.0%
3 | 3 | 1.5%
M | 2 | 1.0%
0 | 1 | 0.5%
1 | 1 | 0.5%
G | 1 | 0.5%

SORT_ORGANIZATION_ENG
Text

MISSING 

Distinct: 542
Distinct (%): 27.7%
Missing: 235
Missing (%): 10.7%
Memory size: 17.2 KiB

Length

Max length: 121
Median length: 58
Mean length: 26.322597
Min length: 6

Characters and Unicode

Total characters: 51487
Distinct characters: 43
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 260
Unique (%): 13.3%
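
The percentages for this variable appear to use two different denominators: the missing share is relative to all 2191 observations, while the distinct and unique shares are consistent with the 1956 non-missing values. The short check below reproduces the reported figures under that assumption; the function name `pct` is introduced here for illustration only.

```python
def pct(part, whole):
    """Return a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


n_rows = 2191      # number of observations (from the report overview)
n_missing = 235    # missing values in SORT_ORGANIZATION_ENG
n_distinct = 542   # distinct values
n_unique = 260     # values that occur exactly once

n_present = n_rows - n_missing  # 1956 non-missing values

print(pct(n_missing, n_rows))      # missing share of all rows
print(pct(n_distinct, n_present))  # distinct share of non-missing values
print(pct(n_unique, n_present))    # unique share of non-missing values
```

The three printed values are 10.7, 27.7 and 13.3, matching the figures reported for SORT_ORGANIZATION_ENG.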

Sample

1st row: GMM GRAMMY LTD
2nd row: KANAZAWA UNIVERSITY
3rd row: KANAZAWA UNIVERSITY
4th row: TOHOKU UNIVERSITY
5th row: KANAZAWA UNIVERSITY
Value | Count | Frequency (%)
university | 1688 | 25.7%
of | 692 | 10.5%
national | 169 | 2.6%
yanbian | 159 | 2.4%
studies | 126 | 1.9%
and | 87 | 1.3%
for | 86 | 1.3%
central | 75 | 1.1%
nationalities | 73 | 1.1%
academy | 72 | 1.1%
Other values (673) | 3336 | 50.8%
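
The word table above looks like a count over lower-cased, whitespace-separated tokens, with each percentage taken over the total number of tokens. A minimal sketch of that idea follows; the exact tokenizer used by the report is not stated, so `str.split` and the helper name `word_frequencies` are assumptions.

```python
from collections import Counter


def word_frequencies(values):
    """Count lower-cased whitespace-separated tokens, ignoring missing values."""
    counts = Counter()
    for value in values:
        if value is not None:
            counts.update(value.lower().split())
    return counts


# The five sample rows shown for SORT_ORGANIZATION_ENG.
rows = [
    "GMM GRAMMY LTD",
    "KANAZAWA UNIVERSITY",
    "KANAZAWA UNIVERSITY",
    "TOHOKU UNIVERSITY",
    "KANAZAWA UNIVERSITY",
]
freq = word_frequencies(rows)
total = sum(freq.values())
for word, count in freq.most_common(3):
    # Share of all tokens, formatted like the report's Frequency (%) column.
    print(word, count, f"{100 * count / total:.1f}%")
```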

Most occurring characters

Value | Count | Frequency (%)
I | 5883 | 11.4%
N | 5073 | 9.9%
(space) | 4621 | 9.0%
E | 4141 | 8.0%
A | 3643 | 7.1%
S | 3541 | 6.9%
T | 3391 | 6.6%
O | 2909 | 5.6%
R | 2875 | 5.6%
U | 2794 | 5.4%
Other values (33) | 12616 | 24.5%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 46860 | 91.0%
Space Separator | 4622 | 9.0%
Decimal Number | 2 | < 0.1%
Final Punctuation | 2 | < 0.1%
Initial Punctuation | 1 | < 0.1%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
I | 5883 | 12.6%
N | 5073 | 10.8%
E | 4141 | 8.8%
A | 3643 | 7.8%
S | 3541 | 7.6%
T | 3391 | 7.2%
O | 2909 | 6.2%
R | 2875 | 6.1%
U | 2794 | 6.0%
Y | 2383 | 5.1%
Other values (27) | 10227 | 21.8%

Space Separator
Value | Count | Frequency (%)
(space) | 4621 | > 99.9%
(non-ASCII space) | 1 | < 0.1%

Final Punctuation
Value | Count | Frequency (%)
 | 1 | 50.0%
 | 1 | 50.0%

Decimal Number
Value | Count | Frequency (%)
3 | 2 | 100.0%

Initial Punctuation
Value | Count | Frequency (%)
 | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 46859 | 91.0%
Common | 4627 | 9.0%
Cyrillic | 1 | < 0.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
I | 5883 | 12.6%
N | 5073 | 10.8%
E | 4141 | 8.8%
A | 3643 | 7.8%
S | 3541 | 7.6%
T | 3391 | 7.2%
O | 2909 | 6.2%
R | 2875 | 6.1%
U | 2794 | 6.0%
Y | 2383 | 5.1%
Other values (26) | 10226 | 21.8%

Common
Value | Count | Frequency (%)
(space) | 4621 | 99.9%
3 | 2 | < 0.1%
(non-ASCII space) | 1 | < 0.1%
 | 1 | < 0.1%
 | 1 | < 0.1%
 | 1 | < 0.1%

Cyrillic
Value | Count | Frequency (%)
К | 1 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 51417 | 99.9%
None | 66 | 0.1%
Punctuation | 3 | < 0.1%
Cyrillic | 1 | < 0.1%
< 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
I | 5883 | 11.4%
N | 5073 | 9.9%
(space) | 4621 | 9.0%
E | 4141 | 8.1%
A | 3643 | 7.1%
S | 3541 | 6.9%
T | 3391 | 6.6%
O | 2909 | 5.7%
R | 2875 | 5.6%
U | 2794 | 5.4%
Other values (18) | 12546 | 24.4%

None
Value | Count | Frequency (%)
Ö | 18 | 27.3%
É | 17 | 25.8%
Á | 10 | 15.2%
Ä | 10 | 15.2%
Ş | 3 | 4.5%
Ó | 2 | 3.0%
Ü | 2 | 3.0%
(non-ASCII space) | 1 | 1.5%
Ç | 1 | 1.5%
Ě | 1 | 1.5%

Cyrillic
Value | Count | Frequency (%)
К | 1 | 100.0%

Punctuation
Value | Count | Frequency (%)
 | 1 | 33.3%
 | 1 | 33.3%
 | 1 | 33.3%

SORT_ORGANIZATION_ETC
Text

MISSING 

Distinct: 106
Distinct (%): 71.6%
Missing: 2043
Missing (%): 93.2%
Memory size: 17.2 KiB

Length

Max length: 100
Median length: 62
Mean length: 33.925676
Min length: 4

Characters and Unicode

Total characters: 5021
Distinct characters: 144
Distinct categories: 4
Distinct scripts: 5
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
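
The length statistics can be cross-checked against the character total: the mean length times the number of non-missing values (2191 − 2043 = 148) should equal the total character count. The sketch below shows that check and how such statistics could be derived from raw strings; `length_stats` is a hypothetical helper, not part of the profiling tool.

```python
from statistics import mean, median


def length_stats(values):
    """Length statistics over non-missing strings."""
    lengths = [len(v) for v in values if v is not None]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
        "total": sum(lengths),
    }


# Consistency check using the figures reported for SORT_ORGANIZATION_ETC.
n_present = 2191 - 2043
print(round(33.925676 * n_present))  # 5021, the reported total characters

# Three values from this variable's sample rows; None stands in for a missing value.
print(length_stats(["民族出版社", None, "上海市档案馆", "中央研究院"]))
```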

Unique

Unique: 87
Unique (%): 58.8%

Sample

1st row: 民族出版社
2nd row: САХАЛИНСКИЙ ГОСУДАРСТВЕННЫЙ УНИВЕРСИТЕТ
3rd row: САХАЛИНСКИЙ ГОСУДАРСТВЕННЫЙ УНИВЕРСИТЕТ
4th row: 上海市档案馆
5th row: 中央研究院

Value | Count | Frequency (%)
им | 44 | 7.1%
университет | 37 | 6.0%
государственный | 34 | 5.5%
институт | 29 | 4.7%
и | 22 | 3.5%
арабаева | 16 | 2.6%
ташкентский | 15 | 2.4%
казахский | 11 | 1.8%
кыргызский | 10 | 1.6%
кгу | 10 | 1.6%
Other values (197) | 393 | 63.3%

Most occurring characters

Value | Count | Frequency (%)
И | 479 | 9.5%
(space) | 477 | 9.5%
А | 425 | 8.5%
Н | 331 | 6.6%
Т | 328 | 6.5%
С | 293 | 5.8%
Е | 266 | 5.3%
К | 226 | 4.5%
О | 200 | 4.0%
Р | 193 | 3.8%
Other values (134) | 1803 | 35.9%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 4428 | 88.2%
Space Separator | 477 | 9.5%
Other Letter | 114 | 2.3%
Decimal Number | 2 | < 0.1%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
И | 479 | 10.8%
А | 425 | 9.6%
Н | 331 | 7.5%
Т | 328 | 7.4%
С | 293 | 6.6%
Е | 266 | 6.0%
К | 226 | 5.1%
О | 200 | 4.5%
Р | 193 | 4.4%
У | 172 | 3.9%
Other values (64) | 1515 | 34.2%

Other Letter
Value | Count | Frequency (%)
 | 8 | 7.0%
 | 6 | 5.3%
 | 5 | 4.4%
 | 4 | 3.5%
 | 3 | 2.6%
 | 3 | 2.6%
 | 3 | 2.6%
 | 3 | 2.6%
 | 3 | 2.6%
 | 3 | 2.6%
Other values (57) | 73 | 64.0%

Decimal Number
Value | Count | Frequency (%)
2 | 1 | 50.0%
1 | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 477 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Cyrillic | 4088 | 81.4%
Common | 479 | 9.5%
Latin | 340 | 6.8%
Han | 111 | 2.2%
Katakana | 3 | 0.1%

Most frequent character per script

Han
Value | Count | Frequency (%)
 | 8 | 7.2%
 | 6 | 5.4%
 | 5 | 4.5%
 | 4 | 3.6%
 | 3 | 2.7%
 | 3 | 2.7%
 | 3 | 2.7%
 | 3 | 2.7%
 | 3 | 2.7%
 | 3 | 2.7%
Other values (54) | 70 | 63.1%

Latin
Value | Count | Frequency (%)
I | 32 | 9.4%
N | 31 | 9.1%
H | 29 | 8.5%
A | 26 | 7.6%
C | 22 | 6.5%
T | 17 | 5.0%
G | 16 | 4.7%
U | 13 | 3.8%
Đ | 10 | 2.9%
 | 10 | 2.9%
Other values (34) | 134 | 39.4%

Cyrillic
Value | Count | Frequency (%)
И | 479 | 11.7%
А | 425 | 10.4%
Н | 331 | 8.1%
Т | 328 | 8.0%
С | 293 | 7.2%
Е | 266 | 6.5%
К | 226 | 5.5%
О | 200 | 4.9%
Р | 193 | 4.7%
У | 172 | 4.2%
Other values (20) | 1175 | 28.7%

Common
Value | Count | Frequency (%)
(space) | 477 | 99.6%
2 | 1 | 0.2%
1 | 1 | 0.2%

Katakana
Value | Count | Frequency (%)
 | 1 | 33.3%
 | 1 | 33.3%
 | 1 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
Cyrillic | 4088 | 81.4%
ASCII | 749 | 14.9%
CJK | 111 | 2.2%
Latin Ext Additional | 41 | 0.8%
None | 29 | 0.6%
Katakana | 3 | 0.1%

Most frequent character per block

Cyrillic
Value | Count | Frequency (%)
И | 479 | 11.7%
А | 425 | 10.4%
Н | 331 | 8.1%
Т | 328 | 8.0%
С | 293 | 7.2%
Е | 266 | 6.5%
К | 226 | 5.5%
О | 200 | 4.9%
Р | 193 | 4.7%
У | 172 | 4.2%
Other values (20) | 1175 | 28.7%

ASCII
Value | Count | Frequency (%)
(space) | 477 | 63.7%
I | 32 | 4.3%
N | 31 | 4.1%
H | 29 | 3.9%
A | 26 | 3.5%
C | 22 | 2.9%
T | 17 | 2.3%
G | 16 | 2.1%
U | 13 | 1.7%
K | 10 | 1.3%
Other values (18) | 76 | 10.1%

None
Value | Count | Frequency (%)
Đ | 10 | 34.5%
Á | 4 | 13.8%
À | 4 | 13.8%
Ê | 3 | 10.3%
Ư | 3 | 10.3%
Ô | 2 | 6.9%
Ơ | 1 | 3.4%
Ò | 1 | 3.4%
Í | 1 | 3.4%

Latin Ext Additional
Value | Count | Frequency (%)
 | 10 | 24.4%
 | 10 | 24.4%
 | 5 | 12.2%
 | 5 | 12.2%
 | 3 | 7.3%
 | 3 | 7.3%
 | 2 | 4.9%
 | 1 | 2.4%
 | 1 | 2.4%
 | 1 | 2.4%

CJK
Value | Count | Frequency (%)
 | 8 | 7.2%
 | 6 | 5.4%
 | 5 | 4.5%
 | 4 | 3.6%
 | 3 | 2.7%
 | 3 | 2.7%
 | 3 | 2.7%
 | 3 | 2.7%
 | 3 | 2.7%
 | 3 | 2.7%
Other values (54) | 70 | 63.1%

Katakana
Value | Count | Frequency (%)
 | 1 | 33.3%
 | 1 | 33.3%
 | 1 | 33.3%

GANADA_ORGANIZATION_ORI
Categorical

HIGH CORRELATION 

Distinct: 41
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 17.2 KiB

Value | Count
ETC | 452
U | 350
 | 151
 | 140
S | 116
Other values (36) | 982

Length

Max length: 4
Median length: 1
Mean length: 1.4276586
Min length: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: G
2nd row: K
3rd row: K
4th row: T
5th row: K

Common Values

Value | Count | Frequency (%)
ETC | 452 | 20.6%
U | 350 | 16.0%
 | 151 | 6.9%
 | 140 | 6.4%
S | 116 | 5.3%
 | 101 | 4.6%
C | 80 | 3.7%
K | 80 | 3.7%
 | 58 | 2.6%
 | 55 | 2.5%
Other values (31) | 608 | 27.7%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
etc | 452 | 20.6%
u | 350 | 16.0%
 | 151 | 6.9%
 | 140 | 6.4%
s | 116 | 5.3%
 | 101 | 4.6%
c | 80 | 3.7%
k | 80 | 3.7%
 | 58 | 2.6%
 | 55 | 2.5%
Other values (31) | 608 | 27.7%

GANADA_ORGANIZATION_KOR
Categorical

HIGH CORRELATION 

Distinct: 16
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 17.2 KiB

Value | Count
 | 415
 | 286
<NA> | 250
 | 228
 | 175
Other values (11) | 837

Length

Max length: 4
Median length: 1
Mean length: 1.3423094
Min length: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: G
2nd row:
3rd row:
4th row:
5th row:

Common Values

Value | Count | Frequency (%)
 | 415 | 18.9%
 | 286 | 13.1%
<NA> | 250 | 11.4%
 | 228 | 10.4%
 | 175 | 8.0%
 | 154 | 7.0%
 | 144 | 6.6%
 | 118 | 5.4%
 | 80 | 3.7%
 | 76 | 3.5%
Other values (6) | 265 | 12.1%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
 | 415 | 18.9%
 | 286 | 13.1%
na | 250 | 11.4%
 | 228 | 10.4%
 | 175 | 8.0%
 | 154 | 7.0%
 | 144 | 6.6%
 | 118 | 5.4%
 | 80 | 3.7%
 | 76 | 3.5%
Other values (6) | 265 | 12.1%

GANADA_ORGANIZATION_ENG
Categorical

HIGH CORRELATION 

Distinct: 27
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 17.2 KiB

Value | Count
U | 389
S | 272
<NA> | 235
Y | 206
C | 198
Other values (22) | 891

Length

Max length: 4
Median length: 1
Mean length: 1.3245094
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: G
2nd row: K
3rd row: K
4th row: T
5th row: K

Common Values

Value | Count | Frequency (%)
U | 389 | 17.8%
S | 272 | 12.4%
<NA> | 235 | 10.7%
Y | 206 | 9.4%
C | 198 | 9.0%
K | 153 | 7.0%
A | 96 | 4.4%
H | 63 | 2.9%
B | 57 | 2.6%
T | 46 | 2.1%
Other values (17) | 476 | 21.7%
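
In this table the `<NA>` entries are counted as an ordinary category (the variable reports 0 missing values), so the percentages are consistent with a denominator of all 2191 rows. The sketch below rebuilds the top of the table from the reported counts under that assumption; the dictionary is a hand-copied excerpt, not the underlying data.

```python
n_rows = 2191  # number of observations in the dataset

# Counts copied from the Common Values table (top five entries only).
counts = {"U": 389, "S": 272, "<NA>": 235, "Y": 206, "C": 198}

for value, count in counts.items():
    share = round(100 * count / n_rows, 1)
    print(f"{value} | {count} | {share}%")

# Rows that fall outside the top five categories.
remaining = n_rows - sum(counts.values())
print("remaining rows:", remaining)
```

The remaining-row count comes out at 891, which agrees with the "Other values (22)" figure in this variable's overview list.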

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
u | 389 | 17.8%
s | 272 | 12.4%
na | 235 | 10.7%
y | 206 | 9.4%
c | 198 | 9.0%
k | 153 | 7.0%
a | 96 | 4.4%
h | 63 | 2.9%
b | 57 | 2.6%
t | 46 | 2.1%
Other values (17) | 476 | 21.7%

GANADA_ORGANIZATION_ETC
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 7
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 17.2 KiB

Value | Count
<NA> | 2043
ETC | 140
V | 3
T | 2
N | 1
Other values (2) | 2

Length

Max length: 4
Median length: 4
Mean length: 3.9251483
Min length: 1

Unique

Unique: 3
Unique (%): 0.1%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 2043 | 93.2%
ETC | 140 | 6.4%
V | 3 | 0.1%
T | 2 | 0.1%
N | 1 | < 0.1%
Q | 1 | < 0.1%
A | 1 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 2043 | 93.2%
etc | 140 | 6.4%
v | 3 | 0.1%
t | 2 | 0.1%
n | 1 | < 0.1%
q | 1 | < 0.1%
a | 1 | < 0.1%

Interactions


Correlations

Variable | ORGANIZATION_ID | GANADA_ORGANIZATION_ORI | GANADA_ORGANIZATION_KOR | GANADA_ORGANIZATION_ENG | GANADA_ORGANIZATION_ETC
ORGANIZATION_ID | 1.000 | 0.874 | 0.912 | 0.849 | 0.000
GANADA_ORGANIZATION_ORI | 0.874 | 1.000 | 0.935 | 0.984 | 1.000
GANADA_ORGANIZATION_KOR | 0.912 | 0.935 | 1.000 | 0.927 | 0.302
GANADA_ORGANIZATION_ENG | 0.849 | 0.984 | 0.927 | 1.000 | 0.193
GANADA_ORGANIZATION_ETC | 0.000 | 1.000 | 0.302 | 0.193 | 1.000

Variable | GANADA_ORGANIZATION_KOR | GANADA_ORGANIZATION_ORI | GANADA_ORGANIZATION_ENG | GANADA_ORGANIZATION_ETC
GANADA_ORGANIZATION_KOR | 1.000 | 0.604 | 0.610 | 0.236
GANADA_ORGANIZATION_ORI | 0.604 | 1.000 | 0.716 | 0.982
GANADA_ORGANIZATION_ENG | 0.610 | 0.716 | 1.000 | 0.000
GANADA_ORGANIZATION_ETC | 0.236 | 0.982 | 0.000 | 1.000

Variable | ORGANIZATION_ID | GANADA_ORGANIZATION_ORI | GANADA_ORGANIZATION_KOR | GANADA_ORGANIZATION_ENG | GANADA_ORGANIZATION_ETC
ORGANIZATION_ID | 1.000 | 0.494 | 0.688 | 0.511 | 0.000
GANADA_ORGANIZATION_ORI | 0.494 | 1.000 | 0.604 | 0.716 | 0.982
GANADA_ORGANIZATION_KOR | 0.688 | 0.604 | 1.000 | 0.610 | 0.236
GANADA_ORGANIZATION_ENG | 0.511 | 0.716 | 0.610 | 1.000 | 0.000
GANADA_ORGANIZATION_ETC | 0.000 | 0.982 | 0.236 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completeness.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
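
To make the idea of nullity correlation concrete, the sketch below computes the Pearson correlation between the missing-value indicators of two columns. It uses the last ten rows listed in the Sample section (indices 2181 to 2190) as input and plain Python rather than the profiler's own code; `pearson` and the indicator lists are introduced here for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# 1 = missing, 0 = present, for rows 2181-2190 of the Sample section.
organization_kor = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
sort_organization_kor = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
organization_eng = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]

# Columns that are always missing together correlate at (approximately) 1.
print(pearson(organization_kor, sort_organization_kor))
# Columns whose gaps tend not to coincide give a negative value.
print(pearson(organization_kor, organization_eng))
```

On these ten rows ORGANIZATION_KOR and SORT_ORGANIZATION_KOR share the same missingness pattern, in line with both columns reporting 250 missing values overall.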

Sample

Row | ORGANIZATION_ID | CATALOG_ID | ORGANIZATION_ORI | ORGANIZATION_KOR | ORGANIZATION_ENG | ORGANIZATION_ETC | SORT_ORGANIZATION_ORI | SORT_ORGANIZATION_KOR | SORT_ORGANIZATION_ENG | SORT_ORGANIZATION_ETC | GANADA_ORGANIZATION_ORI | GANADA_ORGANIZATION_KOR | GANADA_ORGANIZATION_ENG | GANADA_ORGANIZATION_ETC
0 | 6154 | 08C19_0019 | GMM Grammy ltd | GMM그래미사 | GMM Grammy ltd | <NA> | GMM GRAMMY LTD | GMM그래미사 | GMM GRAMMY LTD | <NA> | G | G | G | <NA>
1 | 6155 | 08C09_0004 | Kanazawa University | 가나자와대학교 | Kanazawa University | <NA> | KANAZAWA UNIVERSITY | 가나자와대학교 | KANAZAWA UNIVERSITY | <NA> | K |  | K | <NA>
2 | 6156 | 06C10_0028 | Kanazawa University | 가나자와대학교 | Kanazawa University | <NA> | KANAZAWA UNIVERSITY | 가나자와대학교 | KANAZAWA UNIVERSITY | <NA> | K |  | K | <NA>
3 | 6157 | 06C10_0028 | Tohoku University | 도호쿠대학교 | Tohoku University | <NA> | TOHOKU UNIVERSITY | 도호쿠대학교 | TOHOKU UNIVERSITY | <NA> | T |  | T | <NA>
4 | 6158 | 08C09_0024 | Kanazawa University | 가나자와대학교 | Kanazawa University | <NA> | KANAZAWA UNIVERSITY | 가나자와대학교 | KANAZAWA UNIVERSITY | <NA> | K |  | K | <NA>
5 | 6159 | 08C09_0024 | University of Oregon | 오리건대학교 | University of Oregon | <NA> | UNIVERSITY OF OREGON | 오리건대학교 | UNIVERSITY OF OREGON | <NA> | U |  | U | <NA>
6 | 6160 | 10C11_0005 | 学習院大学 | 가쿠슈인대학교 | Gakushuin University | <NA> | 学習院大学 | 가쿠슈인대학교 | GAKUSHUIN UNIVERSITY | <NA> | ETC |  | G | <NA>
7 | 6161 | 10C11_0006 | 学習院大学 | 가쿠슈인대학교 | Gakushuin University | <NA> | 学習院大学 | 가쿠슈인대학교 | GAKUSHUIN UNIVERSITY | <NA> | ETC |  | G | <NA>
8 | 6162 | 06C06_0028 | 가톨릭대학교 | 가톨릭대학교 | The Catholic University of Korea | <NA> | 가톨릭대학교 | 가톨릭대학교 | CATHOLIC UNIVERSITY OF KOREA | <NA> |  |  | C | <NA>
9 | 6163 | 07C06_0002 | 카톨릭대학교 | 가톨릭대학교 | The Catholic University of Korea | <NA> | 카톨릭대학교 | 가톨릭대학교 | CATHOLIC UNIVERSITY OF KOREA | <NA> |  |  | C | <NA>

Row | ORGANIZATION_ID | CATALOG_ID | ORGANIZATION_ORI | ORGANIZATION_KOR | ORGANIZATION_ENG | ORGANIZATION_ETC | SORT_ORGANIZATION_ORI | SORT_ORGANIZATION_KOR | SORT_ORGANIZATION_ENG | SORT_ORGANIZATION_ETC | GANADA_ORGANIZATION_ORI | GANADA_ORGANIZATION_KOR | GANADA_ORGANIZATION_ENG | GANADA_ORGANIZATION_ETC
2181 | 8588 | 12R15_0001 | Purdue University | <NA> | Purdue University | <NA> | PURDUE UNIVERSITY | <NA> | PURDUE UNIVERSITY | <NA> | P | <NA> | P | <NA>
2182 | 8590 | 11R61_0001 | 연변대학교 | 연변대학교 | Yanbian University | <NA> | 연변대학교 | 연변대학교 | YANBIAN UNIVERSITY | <NA> |  |  | Y | <NA>
2183 | 8593 | 11R08 | 延边大学 | 연변대학교 | Yanbian University | <NA> | 延边大学 | 연변대학교 | YANBIAN UNIVERSITY | <NA> | ETC |  | Y | <NA>
2184 | 8596 | 11R61_0003 | 연변대학교 | 연변대학교 | Yanbian University | <NA> | 연변대학교 | 연변대학교 | YANBIAN UNIVERSITY | <NA> |  |  | Y | <NA>
2185 | 8598 | 12R15_0002 | Purdue University | <NA> | Purdue University | <NA> | PURDUE UNIVERSITY | <NA> | PURDUE UNIVERSITY | <NA> | P | <NA> | P | <NA>
2186 | 8600 | 11R61 | 延边大学 | 연변대학교 | Yanbian University | <NA> | 延边大学 | 연변대학교 | YANBIAN UNIVERSITY | <NA> | ETC |  | Y | <NA>
2187 | 8601 | 09R84 | University of Wisconsin-Madison | <NA> | University of Wisconsin-Madison | <NA> | UNIVERSITY OF WISCONSINMADISON | <NA> | UNIVERSITY OF WISCONSINMADISON | <NA> | U | <NA> | U | <NA>
2188 | 8602 | 07R73 | Russian State University for the Humanities | <NA> | Russian State University for the Humanities | <NA> | RUSSIAN STATE UNIVERSITY FOR THE HUMANITIES | <NA> | RUSSIAN STATE UNIVERSITY FOR THE HUMANITIES | <NA> | R | <NA> | R | <NA>
2189 | 8607 | 07R12 | 대련대학 한국학연구원 | 대련대학 한국학연구원 | <NA> | <NA> | 대련대학 한국학연구원 | 대련대학 한국학연구원 | <NA> | <NA> |  |  | <NA> | <NA>
2190 | 8608 | 07R62 | 延边大学 | 연변대학교 | Yanbian University | <NA> | 延边大学 | 연변대학교 | YANBIAN UNIVERSITY | <NA> | ETC |  | Y | <NA>