Dataset statistics
Number of variables | 13 |
---|---|
Number of observations | 2270 |
Missing cells | 1837 |
Missing cells (%) | 6.2% |
Duplicate rows | 2 |
Duplicate rows (%) | 0.1% |
Total size in memory | 230.7 KiB |
Average record size in memory | 104.1 B |
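The percentages and the average record size in this table follow from the raw counts. A minimal sketch of that arithmetic, using only figures reported above (the variable names are illustrative and are not part of ydata-profiling):

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 13
n_observations = 2270
missing_cells = 1837
duplicate_rows = 2
total_size_kib = 230.7

# Missing cells are measured against every cell in the table (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Duplicate rows are measured against the number of observations.
duplicate_pct = 100 * duplicate_rows / n_observations

# Average record size: total memory in bytes divided by the row count.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the three results agree with the reported 6.2%, 0.1% and 104.1 B.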
Variable types
Text | 7 |
---|---|
Categorical | 4 |
Boolean | 2 |
Dataset
Description | List of faculty members in Korean studies-related departments overseas |
---|---|
Author | The Academy of Korean Studies (한국학중앙연구원) |
URL | https://www.data.go.kr/data/15049054/fileData.do |
Alerts
Dataset has 2 (0.1%) duplicate rows | Duplicates |
ERASER is highly overall correlated with MODIFIER and 1 other fields | High correlation |
ERASE_YN is highly overall correlated with ERASE_DT and 1 other fields | High correlation |
ERASE_DT is highly overall correlated with MODIFIER and 1 other fields | High correlation |
MODIFIER is highly overall correlated with ERASE_DT and 1 other fields | High correlation |
CODE_ORDER is highly imbalanced (50.4%) | Imbalance |
ERASE_DT is highly imbalanced (84.3%) | Imbalance |
ERASER is highly imbalanced (71.1%) | Imbalance |
MEMBER_NUM has 1484 (65.4%) missing values | Missing |
MEMBER_ID has 103 (4.5%) missing values | Missing |
MODIFY_DT has 210 (9.3%) missing values | Missing |
Reproduction
Analysis started | 2023-12-12 03:03:59.075526 |
---|---|
Analysis finished | 2023-12-12 03:04:01.298183 |
Duration | 2.22 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
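The reported duration is the difference between the two timestamps above. A short sketch of that check, using only the standard library (the format string `FMT` is an assumption about how the timestamps are laid out, based on how they are printed here):

```python
from datetime import datetime

# Timestamp layout as printed in the "Reproduction" table.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-12 03:03:59.075526", FMT)
finished = datetime.strptime("2023-12-12 03:04:01.298183", FMT)

# Elapsed wall-clock time of the profiling run, in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```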
INSTITUTION_FACULTY_ID
Text
Distinct | 2267 |
---|---|
Distinct (%) | 99.9% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 17.9 KiB |
Value | Count | Frequency (%) |
한국어교육학(korean | 2 | 0.1% |
education | 2 | 0.1% |
문법론(grammar | 2 | 0.1% |
번역학(translation | 2 | 0.1% |
studies | 2 | 0.1% |
language | 2 | 0.1% |
3216 | 1 | < 0.1% |
3217 | 1 | < 0.1% |
1538 | 1 | < 0.1% |
3221 | 1 | < 0.1% |
Other values (2260) | 2260 |
Most occurring characters
Value | Count | Frequency (%) |
2 | 1588 | |
4 | 1352 | |
1 | 1051 | |
3 | 802 | |
7 | 743 | |
5 | 726 | |
9 | 724 | |
8 | 711 | |
6 | 711 | |
0 | 648 | |
Other values (34) | 140 | 1.5% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 9056 | |
Lowercase Letter | 84 | 0.9% |
Other Letter | 24 | 0.3% |
Uppercase Letter | 12 | 0.1% |
Space Separator | 8 | 0.1% |
Close Punctuation | 6 | 0.1% |
Open Punctuation | 6 | 0.1% |
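The category names in this table are Unicode general categories. As a rough illustration of how such a tally can be produced, the sketch below counts categories with Python's standard `unicodedata` module for two values that appear in this report. The helper `category_counts` and the `CATEGORY_NAMES` mapping are hypothetical names introduced here, and ydata-profiling's own implementation may differ.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the two-letter code for categories not listed above.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Two values taken from this report: a numeric ID and a mixed-script label.
sample = ["1538", "문법론(Grammar)"]
counts = category_counts(sample)
for name, count in counts.most_common():
    print(f"{name} | {count}")
```

Hangul syllables fall under "Other Letter", which is why that category and the Hangul script report the same count (24) for this variable.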
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
a | 16 | |
n | 10 | |
r | 8 | |
t | 6 | 7.1% |
i | 6 | 7.1% |
u | 6 | 7.1% |
o | 6 | 7.1% |
e | 6 | 7.1% |
g | 4 | 4.8% |
m | 4 | 4.8% |
Other values (4) | 12 |
Other Letter
Value | Count | Frequency (%) |
학 | 4 | |
어 | 2 | |
교 | 2 | |
육 | 2 | |
역 | 2 | |
번 | 2 | |
론 | 2 | |
법 | 2 | |
문 | 2 | |
국 | 2 |
Decimal Number
Value | Count | Frequency (%) |
2 | 1588 | |
4 | 1352 | |
1 | 1051 | |
3 | 802 | |
7 | 743 | |
5 | 726 | |
9 | 724 | |
8 | 711 | |
6 | 711 | |
0 | 648 |
Uppercase Letter
Value | Count | Frequency (%) |
E | 2 | |
S | 2 | |
K | 2 | |
L | 2 | |
T | 2 | |
G | 2 |
Space Separator
Value | Count | Frequency (%) |
(space) | 8 |
Close Punctuation
Value | Count | Frequency (%) |
) | 6 |
Open Punctuation
Value | Count | Frequency (%) |
( | 6 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 9076 | |
Latin | 96 | 1.0% |
Hangul | 24 | 0.3% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
a | 16 | |
n | 10 | |
r | 8 | 8.3% |
t | 6 | 6.2% |
i | 6 | 6.2% |
u | 6 | 6.2% |
o | 6 | 6.2% |
e | 6 | 6.2% |
g | 4 | 4.2% |
m | 4 | 4.2% |
Other values (10) | 24 |
Common
Value | Count | Frequency (%) |
2 | 1588 | |
4 | 1352 | |
1 | 1051 | |
3 | 802 | |
7 | 743 | |
5 | 726 | |
9 | 724 | |
8 | 711 | |
6 | 711 | |
0 | 648 | |
Other values (3) | 20 | 0.2% |
Hangul
Value | Count | Frequency (%) |
학 | 4 | |
어 | 2 | |
교 | 2 | |
육 | 2 | |
역 | 2 | |
번 | 2 | |
론 | 2 | |
법 | 2 | |
문 | 2 | |
국 | 2 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 9172 | |
Hangul | 24 | 0.3% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
2 | 1588 | |
4 | 1352 | |
1 | 1051 | |
3 | 802 | |
7 | 743 | |
5 | 726 | |
9 | 724 | |
8 | 711 | |
6 | 711 | |
0 | 648 | |
Other values (23) | 116 | 1.3% |
Hangul
Value | Count | Frequency (%) |
학 | 4 | |
어 | 2 | |
교 | 2 | |
육 | 2 | |
역 | 2 | |
번 | 2 | |
론 | 2 | |
법 | 2 | |
문 | 2 | |
국 | 2 |
INSTITUTION_NUM
Text
Distinct | 291 |
---|---|
Distinct (%) | 12.8% |
Missing | 5 |
Missing (%) | 0.2% |
Memory size | 17.9 KiB |
Value | Count | Frequency (%) |
100029 | 45 | 2.0% |
100212 | 36 | 1.6% |
100006 | 35 | 1.5% |
100085 | 33 | 1.5% |
100426 | 33 | 1.5% |
100294 | 32 | 1.4% |
100971 | 30 | 1.3% |
100653 | 29 | 1.3% |
100379 | 28 | 1.2% |
100576 | 28 | 1.2% |
Other values (281) | 1936 |
Most occurring characters
Value | Count | Frequency (%) |
0 | 5507 | |
1 | 2902 | |
5 | 809 | 5.9% |
2 | 793 | 5.8% |
4 | 732 | 5.4% |
8 | 729 | 5.4% |
7 | 581 | 4.3% |
6 | 537 | 3.9% |
3 | 517 | 3.8% |
9 | 485 | 3.6% |
Other values (10) | 14 | 0.1% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 13592 | |
Uppercase Letter | 12 | 0.1% |
Connector Punctuation | 2 | < 0.1% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 5507 | |
1 | 2902 | |
5 | 809 | 6.0% |
2 | 793 | 5.8% |
4 | 732 | 5.4% |
8 | 729 | 5.4% |
7 | 581 | 4.3% |
6 | 537 | 4.0% |
3 | 517 | 3.8% |
9 | 485 | 3.6% |
Uppercase Letter
Value | Count | Frequency (%) |
A | 3 | |
E | 2 | |
N | 1 | 8.3% |
G | 1 | 8.3% |
T | 1 | 8.3% |
I | 1 | 8.3% |
H | 1 | 8.3% |
C | 1 | 8.3% |
R | 1 | 8.3% |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 2 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 13594 | |
Latin | 12 | 0.1% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 5507 | |
1 | 2902 | |
5 | 809 | 6.0% |
2 | 793 | 5.8% |
4 | 732 | 5.4% |
8 | 729 | 5.4% |
7 | 581 | 4.3% |
6 | 537 | 4.0% |
3 | 517 | 3.8% |
9 | 485 | 3.6% |
Latin
Value | Count | Frequency (%) |
A | 3 | |
E | 2 | |
N | 1 | 8.3% |
G | 1 | 8.3% |
T | 1 | 8.3% |
I | 1 | 8.3% |
H | 1 | 8.3% |
C | 1 | 8.3% |
R | 1 | 8.3% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 13606 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 5507 | |
1 | 2902 | |
5 | 809 | 5.9% |
2 | 793 | 5.8% |
4 | 732 | 5.4% |
8 | 729 | 5.4% |
7 | 581 | 4.3% |
6 | 537 | 3.9% |
3 | 517 | 3.8% |
9 | 485 | 3.6% |
Other values (10) | 14 | 0.1% |
PROFILE_ID
Text
Distinct | 326 |
---|---|
Distinct (%) | 14.4% |
Missing | 4 |
Missing (%) | 0.2% |
Memory size | 17.9 KiB |
Length
Max length | 15 |
---|---|
Median length | 10 |
Mean length | 10.000883 |
Min length | 7 |
Characters and Unicode
Total characters | 22662 |
---|---|
Distinct characters | 26 |
Distinct categories | 7 |
Distinct scripts | 3 |
Distinct blocks | 2 |
Unique
Unique | 61 |
---|---|
Unique (%) | 2.7% |
Sample
1st row | 1000001385 |
---|---|
2nd row | 1000001385 |
3rd row | 1000001385 |
4th row | 1000001385 |
5th row | 1000001386 |
Value | Count | Frequency (%) |
1000001412 | 40 | 1.8% |
1000001390 | 35 | 1.5% |
1000001646 | 33 | 1.5% |
1000001495 | 32 | 1.4% |
1000002514 | 31 | 1.4% |
1000002666 | 30 | 1.3% |
1000001504 | 28 | 1.2% |
1000001544 | 28 | 1.2% |
1000002453 | 27 | 1.2% |
1000002394 | 25 | 1.1% |
Other values (316) | 1957 |
Most occurring characters
Value | Count | Frequency (%) |
0 | 11820 | |
1 | 4385 | 19.3% |
5 | 1210 | 5.3% |
2 | 1178 | 5.2% |
4 | 1102 | 4.9% |
6 | 860 | 3.8% |
3 | 606 | 2.7% |
9 | 600 | 2.6% |
7 | 573 | 2.5% |
8 | 311 | 1.4% |
Other values (16) | 17 | 0.1% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 22645 | |
Lowercase Letter | 9 | < 0.1% |
Other Letter | 3 | < 0.1% |
Other Punctuation | 2 | < 0.1% |
Close Punctuation | 1 | < 0.1% |
Open Punctuation | 1 | < 0.1% |
Uppercase Letter | 1 | < 0.1% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 11820 | |
1 | 4385 | 19.4% |
5 | 1210 | 5.3% |
2 | 1178 | 5.2% |
4 | 1102 | 4.9% |
6 | 860 | 3.8% |
3 | 606 | 2.7% |
9 | 600 | 2.6% |
7 | 573 | 2.5% |
8 | 311 | 1.4% |
Lowercase Letter
Value | Count | Frequency (%) |
a | 2 | |
s | 1 | |
m | 1 | |
c | 1 | |
i | 1 | |
t | 1 | |
g | 1 | |
r | 1 |
Other Letter
Value | Count | Frequency (%) |
론 | 1 | |
용 | 1 | |
화 | 1 |
Other Punctuation
Value | Count | Frequency (%) |
: | 1 | |
. | 1 |
Close Punctuation
Value | Count | Frequency (%) |
) | 1 |
Open Punctuation
Value | Count | Frequency (%) |
( | 1 |
Uppercase Letter
Value | Count | Frequency (%) |
P | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 22649 | |
Latin | 10 | < 0.1% |
Hangul | 3 | < 0.1% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 11820 | |
1 | 4385 | 19.4% |
5 | 1210 | 5.3% |
2 | 1178 | 5.2% |
4 | 1102 | 4.9% |
6 | 860 | 3.8% |
3 | 606 | 2.7% |
9 | 600 | 2.6% |
7 | 573 | 2.5% |
8 | 311 | 1.4% |
Other values (4) | 4 | < 0.1% |
Latin
Value | Count | Frequency (%) |
a | 2 | |
s | 1 | |
m | 1 | |
c | 1 | |
i | 1 | |
t | 1 | |
g | 1 | |
r | 1 | |
P | 1 |
Hangul
Value | Count | Frequency (%) |
론 | 1 | |
용 | 1 | |
화 | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 22659 | |
Hangul | 3 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 11820 | |
1 | 4385 | 19.4% |
5 | 1210 | 5.3% |
2 | 1178 | 5.2% |
4 | 1102 | 4.9% |
6 | 860 | 3.8% |
3 | 606 | 2.7% |
9 | 600 | 2.6% |
7 | 573 | 2.5% |
8 | 311 | 1.4% |
Other values (13) | 14 | 0.1% |
Hangul
Value | Count | Frequency (%) |
론 | 1 | |
용 | 1 | |
화 | 1 |
MEMBER_NUM
Text
Alerts: Missing
Distinct | 756 |
---|---|
Distinct (%) | 96.2% |
Missing | 1484 |
Missing (%) | 65.4% |
Memory size | 17.9 KiB |
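The Distinct (%) figure here is consistent with a denominator of non-missing values rather than all 2270 rows: 756 / (2270 - 1484) is about 96.2%, whereas 756 / 2270 would be about 33.3%. A small sketch of that arithmetic (the helper name `distinct_pct` is hypothetical, and the formula is inferred from the reported numbers, not taken from the tool's documentation):

```python
def distinct_pct(distinct: int, total_rows: int, missing: int) -> float:
    """Share of distinct values among the non-missing values, in percent."""
    non_missing = total_rows - missing
    return 100 * distinct / non_missing


# MEMBER_NUM: 756 distinct values, 1484 of 2270 rows missing.
member_num_pct = distinct_pct(756, 2270, 1484)

# INSTITUTION_NUM: 291 distinct values, 5 of 2270 rows missing.
institution_num_pct = distinct_pct(291, 2270, 5)

print(f"MEMBER_NUM distinct: {member_num_pct:.1f}%")
print(f"INSTITUTION_NUM distinct: {institution_num_pct:.1f}%")
```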
Value | Count | Frequency (%) |
101074 | 3 | 0.4% |
101075 | 3 | 0.4% |
100777 | 2 | 0.3% |
101289 | 2 | 0.3% |
100728 | 2 | 0.3% |
101052 | 2 | 0.3% |
100176 | 2 | 0.3% |
100687 | 2 | 0.3% |
100058 | 2 | 0.3% |
100958 | 2 | 0.3% |
Other values (746) | 764 |
Most occurring characters
Value | Count | Frequency (%) |
0 | 1738 | |
1 | 1179 | |
5 | 257 | 5.4% |
8 | 249 | 5.3% |
6 | 248 | 5.3% |
7 | 244 | 5.2% |
9 | 236 | 5.0% |
2 | 203 | 4.3% |
3 | 191 | 4.0% |
4 | 170 | 3.6% |
Other values (2) | 2 | < 0.1% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 4715 | |
Other Punctuation | 2 | < 0.1% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 1738 | |
1 | 1179 | |
5 | 257 | 5.5% |
8 | 249 | 5.3% |
6 | 248 | 5.3% |
7 | 244 | 5.2% |
9 | 236 | 5.0% |
2 | 203 | 4.3% |
3 | 191 | 4.1% |
4 | 170 | 3.6% |
Other Punctuation
Value | Count | Frequency (%) |
: | 1 | |
. | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 4717 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 1738 | |
1 | 1179 | |
5 | 257 | 5.4% |
8 | 249 | 5.3% |
6 | 248 | 5.3% |
7 | 244 | 5.2% |
9 | 236 | 5.0% |
2 | 203 | 4.3% |
3 | 191 | 4.0% |
4 | 170 | 3.6% |
Other values (2) | 2 | < 0.1% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 4717 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 1738 | |
1 | 1179 | |
5 | 257 | 5.4% |
8 | 249 | 5.3% |
6 | 248 | 5.3% |
7 | 244 | 5.2% |
9 | 236 | 5.0% |
2 | 203 | 4.3% |
3 | 191 | 4.0% |
4 | 170 | 3.6% |
Other values (2) | 2 | < 0.1% |
MEMBER_ID
Text
Alerts: Missing
Distinct | 1734 |
---|---|
Distinct (%) | 80.0% |
Missing | 103 |
Missing (%) | 4.5% |
Memory size | 17.9 KiB |
Length
Max length | 48 |
---|---|
Median length | 41 |
Mean length | 19.010152 |
Min length | 1 |
Characters and Unicode
Total characters | 41195 |
---|---|
Distinct characters | 82 |
Distinct categories | 11 |
Distinct scripts | 3 |
Distinct blocks | 2 |
Unique
Unique | 1698 |
---|---|
Unique (%) | 78.4% |
Sample
1st row | pori.park@asu.edu |
---|---|
2nd row | youngoh@asu.edu |
3rd row | yysys@uchicago.edu |
4th row | Cypark@asu.edu |
5th row | garethmc@bu.edu |
Value | Count | Frequency (%) |
ksnet@aks.ac.kr | 295 | 13.4% |
temp.aks.ac.kr | 101 | 4.6% |
educopar@gmail.com | 5 | 0.2% |
unsw.edu.au | 4 | 0.2% |
(blank) | 4 | 0.2% |
edu.au | 4 | 0.2% |
cencon@hindustanuniv.ac.in | 3 | 0.1% |
princeton.edu | 3 | 0.1% |
icfks-msu@yandex.ru | 3 | 0.1% |
hegartyn@stjohns.edu | 2 | 0.1% |
Other values (1743) | 1780 |
Most occurring characters
Value | Count | Frequency (%) |
. | 3984 | 9.7% |
a | 3936 | 9.6% |
e | 2810 | 6.8% |
n | 2290 | 5.6% |
k | 2262 | 5.5% |
@ | 2166 | 5.3% |
u | 2045 | 5.0% |
s | 2010 | 4.9% |
o | 1992 | 4.8% |
c | 1885 | 4.6% |
Other values (72) | 15815 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 32997 | |
Other Punctuation | 6162 | 15.0% |
Decimal Number | 1541 | 3.7% |
Uppercase Letter | 221 | 0.5% |
Dash Punctuation | 128 | 0.3% |
Connector Punctuation | 93 | 0.2% |
Space Separator | 40 | 0.1% |
Other Letter | 10 | < 0.1% |
Close Punctuation | 1 | < 0.1% |
Open Punctuation | 1 | < 0.1% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
a | 3936 | 11.9% |
e | 2810 | 8.5% |
n | 2290 | 6.9% |
k | 2262 | 6.9% |
u | 2045 | 6.2% |
s | 2010 | 6.1% |
o | 1992 | 6.0% |
c | 1885 | 5.7% |
m | 1744 | 5.3% |
i | 1704 | 5.2% |
Other values (16) | 10319 |
Uppercase Letter
Value | Count | Frequency (%) |
S | 23 | 10.4% |
C | 23 | 10.4% |
H | 22 | 10.0% |
J | 18 | 8.1% |
K | 17 | 7.7% |
A | 15 | 6.8% |
M | 13 | 5.9% |
L | 11 | 5.0% |
I | 9 | 4.1% |
E | 8 | 3.6% |
Other values (13) | 62 |
Decimal Number
Value | Count | Frequency (%) |
2 | 246 | |
1 | 203 | |
3 | 193 | |
0 | 166 | |
4 | 154 | |
7 | 136 | |
8 | 114 | |
5 | 111 | |
6 | 111 | |
9 | 107 |
Other Letter
Value | Count | Frequency (%) |
육 | 1 | |
교 | 1 | |
어 | 1 | |
국 | 1 | |
희 | 1 | |
선 | 1 | |
전 | 1 | |
강 | 1 | |
사 | 1 | |
학 | 1 |
Other Punctuation
Value | Count | Frequency (%) |
. | 3984 | |
@ | 2166 | |
, | 8 | 0.1% |
: | 1 | < 0.1% |
/ | 1 | < 0.1% |
? | 1 | < 0.1% |
; | 1 | < 0.1% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 128 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 93 |
Space Separator
Value | Count | Frequency (%) |
(space) | 40 |
Close Punctuation
Value | Count | Frequency (%) |
) | 1 |
Open Punctuation
Value | Count | Frequency (%) |
( | 1 |
Math Symbol
Value | Count | Frequency (%) |
+ | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 33218 | |
Common | 7967 | 19.3% |
Hangul | 10 | < 0.1% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
a | 3936 | 11.8% |
e | 2810 | 8.5% |
n | 2290 | 6.9% |
k | 2262 | 6.8% |
u | 2045 | 6.2% |
s | 2010 | 6.1% |
o | 1992 | 6.0% |
c | 1885 | 5.7% |
m | 1744 | 5.3% |
i | 1704 | 5.1% |
Other values (39) | 10540 |
Common
Value | Count | Frequency (%) |
. | 3984 | |
@ | 2166 | |
2 | 246 | 3.1% |
1 | 203 | 2.5% |
3 | 193 | 2.4% |
0 | 166 | 2.1% |
4 | 154 | 1.9% |
7 | 136 | 1.7% |
- | 128 | 1.6% |
8 | 114 | 1.4% |
Other values (13) | 477 | 6.0% |
Hangul
Value | Count | Frequency (%) |
육 | 1 | |
교 | 1 | |
어 | 1 | |
국 | 1 | |
희 | 1 | |
선 | 1 | |
전 | 1 | |
강 | 1 | |
사 | 1 | |
학 | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 41185 | |
Hangul | 10 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
. | 3984 | 9.7% |
a | 3936 | 9.6% |
e | 2810 | 6.8% |
n | 2290 | 5.6% |
k | 2262 | 5.5% |
@ | 2166 | 5.3% |
u | 2045 | 5.0% |
s | 2010 | 4.9% |
o | 1992 | 4.8% |
c | 1885 | 4.6% |
Other values (62) | 15805 |
Hangul
Value | Count | Frequency (%) |
육 | 1 | |
교 | 1 | |
어 | 1 | |
국 | 1 | |
희 | 1 | |
선 | 1 | |
전 | 1 | |
강 | 1 | |
사 | 1 | |
학 | 1 |
CODE_ORDER
Categorical
Alerts: Imbalance
Distinct | 33 |
---|---|
Distinct (%) | 1.5% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 17.9 KiB |
Length
Max length | 4 |
---|---|
Median length | 1 |
Mean length | 1.1079295 |
Min length | 1 |
Unique
Unique | 2 |
---|---|
Unique (%) | 0.1% |
Sample
1st row | 0 |
---|---|
2nd row | 1 |
3rd row | 2 |
4th row | 3 |
5th row | 0 |
Common Values
Value | Count | Frequency (%) |
0 | 1414 | |
1 | 112 | 4.9% |
2 | 100 | 4.4% |
3 | 92 | 4.1% |
4 | 81 | 3.6% |
5 | 70 | 3.1% |
6 | 63 | 2.8% |
7 | 50 | 2.2% |
8 | 38 | 1.7% |
9 | 36 | 1.6% |
Other values (23) | 214 | 9.4% |
EMAIL_OPEN
Boolean
Distinct | 2 |
---|---|
Distinct (%) | 0.1% |
Missing | 17 |
Missing (%) | 0.7% |
Memory size | 4.6 KiB |
Value | Count | Frequency (%) |
True | 1759 | |
False | 494 | 21.8% |
(Missing) | 17 | 0.7% |
REGISTER_DT
Text
Distinct | 581 |
---|---|
Distinct (%) | 25.7% |
Missing | 7 |
Missing (%) | 0.3% |
Memory size | 17.9 KiB |
Value | Count | Frequency (%) |
15:23.8 | 1449 | |
32:34.7 | 13 | 0.6% |
31:31.6 | 12 | 0.5% |
52:04.9 | 12 | 0.5% |
06:19.7 | 11 | 0.5% |
51:27.6 | 11 | 0.5% |
33:37.9 | 9 | 0.4% |
06:37.7 | 8 | 0.4% |
40:37.6 | 8 | 0.4% |
28:40.6 | 8 | 0.4% |
Other values (571) | 722 |
Most occurring characters
Value | Count | Frequency (%) |
: | 2263 | |
. | 2263 | |
3 | 1999 | |
5 | 1930 | |
1 | 1917 | |
2 | 1915 | |
8 | 1710 | |
4 | 503 | 3.2% |
0 | 494 | 3.1% |
7 | 303 | 1.9% |
Other values (2) | 544 | 3.4% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 11315 | |
Other Punctuation | 4526 | 28.6% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
3 | 1999 | |
5 | 1930 | |
1 | 1917 | |
2 | 1915 | |
8 | 1710 | |
4 | 503 | 4.4% |
0 | 494 | 4.4% |
7 | 303 | 2.7% |
6 | 278 | 2.5% |
9 | 266 | 2.4% |
Other Punctuation
Value | Count | Frequency (%) |
: | 2263 | |
. | 2263 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 15841 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
: | 2263 | |
. | 2263 | |
3 | 1999 | |
5 | 1930 | |
1 | 1917 | |
2 | 1915 | |
8 | 1710 | |
4 | 503 | 3.2% |
0 | 494 | 3.1% |
7 | 303 | 1.9% |
Other values (2) | 544 | 3.4% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 15841 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
: | 2263 | |
. | 2263 | |
3 | 1999 | |
5 | 1930 | |
1 | 1917 | |
2 | 1915 | |
8 | 1710 | |
4 | 503 | 3.2% |
0 | 494 | 3.1% |
7 | 303 | 1.9% |
Other values (2) | 544 | 3.4% |
MODIFY_DT
Text
Alerts: Missing
Distinct | 110 |
---|---|
Distinct (%) | 5.3% |
Missing | 210 |
Missing (%) | 9.3% |
Memory size | 17.9 KiB |
Value | Count | Frequency (%) |
00:00.0 | 1473 | |
05:32.3 | 25 | 1.2% |
50:47.0 | 19 | 0.9% |
24:39.0 | 17 | 0.8% |
53:29.6 | 15 | 0.7% |
15:51.2 | 14 | 0.7% |
29:57.7 | 13 | 0.6% |
21:29.6 | 12 | 0.6% |
27:17.9 | 12 | 0.6% |
20:35.6 | 11 | 0.5% |
Other values (100) | 449 | 21.8% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 7659 | |
: | 2060 | 14.3% |
. | 2060 | 14.3% |
2 | 500 | 3.5% |
1 | 400 | 2.8% |
5 | 394 | 2.7% |
3 | 330 | 2.3% |
4 | 248 | 1.7% |
7 | 230 | 1.6% |
6 | 220 | 1.5% |
Other values (2) | 319 | 2.2% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 10300 | |
Other Punctuation | 4120 | 28.6% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 7659 | |
2 | 500 | 4.9% |
1 | 400 | 3.9% |
5 | 394 | 3.8% |
3 | 330 | 3.2% |
4 | 248 | 2.4% |
7 | 230 | 2.2% |
6 | 220 | 2.1% |
9 | 215 | 2.1% |
8 | 104 | 1.0% |
Other Punctuation
Value | Count | Frequency (%) |
: | 2060 | |
. | 2060 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 14420 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 7659 | |
: | 2060 | 14.3% |
. | 2060 | 14.3% |
2 | 500 | 3.5% |
1 | 400 | 2.8% |
5 | 394 | 2.7% |
3 | 330 | 2.3% |
4 | 248 | 1.7% |
7 | 230 | 1.6% |
6 | 220 | 1.5% |
Other values (2) | 319 | 2.2% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 14420 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 7659 | |
: | 2060 | 14.3% |
. | 2060 | 14.3% |
2 | 500 | 3.5% |
1 | 400 | 2.8% |
5 | 394 | 2.7% |
3 | 330 | 2.3% |
4 | 248 | 1.7% |
7 | 230 | 1.6% |
6 | 220 | 1.5% |
Other values (2) | 319 | 2.2% |
MODIFIER
Categorical
Alerts: High correlation
Distinct | 4 |
---|---|
Distinct (%) | 0.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 17.9 KiB |
Length
Max length | 20 |
---|---|
Median length | 15 |
Mean length | 13.698238 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | ksnet@aks.ac.kr |
---|---|
2nd row | ksnet@aks.ac.kr |
3rd row | ksnet@aks.ac.kr |
4th row | ksnet@aks.ac.kr |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
ksnet@aks.ac.kr | 1699 | |
oic@aks.ac.kr | 350 | 15.4% |
<NA> | 210 | 9.3% |
goodday2me@gmail.com | 11 | 0.5% |
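The length statistics for this variable can be reproduced from the Common Values table if the `<NA>` placeholder is counted as an ordinary four-character string. That would also explain why the variable reports 0 missing values and a minimum length of 4. The sketch below is an inference from the reported numbers, not a statement about how the tool handles missing values internally.

```python
# Value counts copied from the MODIFIER "Common Values" table.
value_counts = {
    "ksnet@aks.ac.kr": 1699,
    "oic@aks.ac.kr": 350,
    "<NA>": 210,  # placeholder treated as a literal 4-character string
    "goodday2me@gmail.com": 11,
}

total = sum(value_counts.values())

# Count-weighted mean of the string lengths.
mean_length = sum(len(v) * c for v, c in value_counts.items()) / total
min_length = min(len(v) for v in value_counts)
max_length = max(len(v) for v in value_counts)

print(f"rows={total} mean={mean_length:.6f} min={min_length} max={max_length}")
```

The result matches the reported mean length of 13.698238, with minimum 4 and maximum 20.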
ERASE_YN
Boolean
Alerts: High correlation
Distinct | 2 |
---|---|
Distinct (%) | 0.1% |
Missing | 7 |
Missing (%) | 0.3% |
Memory size | 4.6 KiB |
Value | Count | Frequency (%) |
False | 2002 | |
True | 261 | 11.5% |
(Missing) | 7 | 0.3% |
ERASE_DT
Categorical
Alerts: High correlation, Imbalance
Distinct | 35 |
---|---|
Distinct (%) | 1.5% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 17.9 KiB |
Length
Max length | 7 |
---|---|
Median length | 4 |
Mean length | 4.3013216 |
Min length | 4 |
Unique
Unique | 12 |
---|---|
Unique (%) | 0.5% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | <NA> |
5th row | 00:00.0 |
Common Values
Value | Count | Frequency (%) |
<NA> | 2042 | |
00:00.0 | 108 | 4.8% |
08:33.2 | 16 | 0.7% |
54:38.4 | 13 | 0.6% |
06:21.1 | 10 | 0.4% |
57:29.0 | 8 | 0.4% |
54:53.3 | 8 | 0.4% |
31:04.2 | 7 | 0.3% |
36:55.1 | 5 | 0.2% |
17:49.7 | 4 | 0.2% |
Other values (25) | 49 | 2.2% |
ERASER
Categorical
Alerts: High correlation, Imbalance
Distinct | 4 |
---|---|
Distinct (%) | 0.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 17.9 KiB |
Length
Max length | 20 |
---|---|
Median length | 4 |
Mean length | 5.2132159 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | <NA> |
5th row | oic@aks.ac.kr |
Common Values
Value | Count | Frequency (%) |
<NA> | 2042 | |
ksnet@aks.ac.kr | 162 | 7.1% |
goodday2me@gmail.com | 54 | 2.4% |
oic@aks.ac.kr | 12 | 0.5% |
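The 71.1% imbalance figure reported for ERASER can be reproduced from these four counts if the imbalance score is taken to be one minus the Shannon entropy of the class distribution, normalised by the maximum entropy for that number of classes. That definition is an assumption here: it matches the reported number, but the report itself does not state the formula. The function name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the classes.

    0.0 means perfectly balanced classes; values near 1.0 mean that one
    class dominates.
    """
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in counts if c > 0
    )
    return 1 - entropy / math.log2(n_classes)


# Counts copied from the ERASER "Common Values" table:
# <NA>, ksnet@aks.ac.kr, goodday2me@gmail.com, oic@aks.ac.kr
eraser_counts = [2042, 162, 54, 12]
score = imbalance_score(eraser_counts)
print(f"ERASER imbalance: {100 * score:.1f}%")
```

Because roughly 90% of the rows carry the `<NA>` placeholder, the entropy is low and the score is high.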
Correlations
Variable | CODE_ORDER | EMAIL_OPEN | MODIFIER | ERASE_YN | ERASE_DT | ERASER |
---|---|---|---|---|---|---|
CODE_ORDER | 1.000 | 0.046 | 0.401 | 0.000 | 0.735 | 0.466 |
EMAIL_OPEN | 0.046 | 1.000 | 0.049 | 0.000 | 0.397 | 0.163 |
MODIFIER | 0.401 | 0.049 | 1.000 | 0.092 | 0.813 | 0.344 |
ERASE_YN | 0.000 | 0.000 | 0.092 | 1.000 | NaN | NaN |
ERASE_DT | 0.735 | 0.397 | 0.813 | NaN | 1.000 | 0.523 |
ERASER | 0.466 | 0.163 | 0.344 | NaN | 0.523 | 1.000 |
Variable | ERASER | ERASE_YN | CODE_ORDER | MODIFIER | EMAIL_OPEN | ERASE_DT |
---|---|---|---|---|---|---|
ERASER | 1.000 | 1.000 | 0.235 | 0.549 | 0.269 | 0.286 |
ERASE_YN | 1.000 | 1.000 | 0.000 | 0.152 | 0.000 | 1.000 |
CODE_ORDER | 0.235 | 0.000 | 1.000 | 0.220 | 0.039 | 0.275 |
MODIFIER | 0.549 | 0.152 | 0.220 | 1.000 | 0.082 | 0.622 |
EMAIL_OPEN | 0.269 | 0.000 | 0.039 | 0.082 | 1.000 | 0.312 |
ERASE_DT | 0.286 | 1.000 | 0.275 | 0.622 | 0.312 | 1.000 |
Variable | CODE_ORDER | EMAIL_OPEN | MODIFIER | ERASE_YN | ERASE_DT | ERASER |
---|---|---|---|---|---|---|
CODE_ORDER | 1.000 | 0.039 | 0.220 | 0.000 | 0.275 | 0.235 |
EMAIL_OPEN | 0.039 | 1.000 | 0.082 | 0.000 | 0.312 | 0.269 |
MODIFIER | 0.220 | 0.082 | 1.000 | 0.152 | 0.622 | 0.549 |
ERASE_YN | 0.000 | 0.000 | 0.152 | 1.000 | 1.000 | 1.000 |
ERASE_DT | 0.275 | 0.312 | 0.622 | 1.000 | 1.000 | 0.286 |
ERASER | 0.235 | 0.269 | 0.549 | 1.000 | 0.286 | 1.000 |
First rows
Row | INSTITUTION_FACULTY_ID | INSTITUTION_NUM | PROFILE_ID | MEMBER_NUM | MEMBER_ID | CODE_ORDER | EMAIL_OPEN | REGISTER_DT | MODIFY_DT | MODIFIER | ERASE_YN | ERASE_DT | ERASER |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1538 | 100001 | 1000001385 | <NA> | pori.park@asu.edu | 0 | Y | 15:23.8 | 43:05.1 | ksnet@aks.ac.kr | N | <NA> | <NA> |
1 | 1539 | 100001 | 1000001385 | 101146 | youngoh@asu.edu | 1 | Y | 15:23.8 | 43:05.1 | ksnet@aks.ac.kr | N | <NA> | <NA> |
2 | 1540 | 100001 | 1000001385 | <NA> | yysys@uchicago.edu | 2 | Y | 15:23.8 | 43:05.1 | ksnet@aks.ac.kr | N | <NA> | <NA> |
3 | 1541 | 100001 | 1000001385 | <NA> | Cypark@asu.edu | 3 | Y | 15:23.8 | 43:05.1 | ksnet@aks.ac.kr | N | <NA> | <NA> |
4 | 1543 | 100002 | 1000001386 | 100004 | garethmc@bu.edu | 0 | Y | 15:23.8 | <NA> | <NA> | Y | 00:00.0 | oic@aks.ac.kr |
5 | 1544 | 100003 | 1000001387 | 100005 | hye-sook_wang@brown.edu | 0 | Y | 15:23.8 | 00:00.0 | ksnet@aks.ac.kr | N | <NA> | <NA> |
6 | 1545 | 100003 | 1000001387 | <NA> | Samuel_Perry@brown.edu | 0 | Y | 15:23.8 | 00:00.0 | ksnet@aks.ac.kr | N | <NA> | <NA> |
7 | 1546 | 100003 | 1000001387 | <NA> | James_McClain@brown.edu | 0 | Y | 15:23.8 | 00:00.0 | ksnet@aks.ac.kr | N | <NA> | <NA> |
8 | 1548 | 100004 | 1000001388 | 100968 | jkh25@columbia.edu | 0 | Y | 15:23.8 | 36:02.6 | ksnet@aks.ac.kr | N | <NA> | <NA> |
9 | 1549 | 100004 | 1000001388 | <NA> | bl355@columbia.edu | 1 | Y | 15:23.8 | 36:02.6 | ksnet@aks.ac.kr | N | <NA> | <NA> |
Last rows
Row | INSTITUTION_FACULTY_ID | INSTITUTION_NUM | PROFILE_ID | MEMBER_NUM | MEMBER_ID | CODE_ORDER | EMAIL_OPEN | REGISTER_DT | MODIFY_DT | MODIFIER | ERASE_YN | ERASE_DT | ERASER |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2260 | 5016 | 100328 | 1000002222 | <NA> | ksnet@aks.ac.kr | 7 | Y | 31:08.5 | 33:25.3 | ksnet@aks.ac.kr | N | <NA> | <NA> |
2261 | 5017 | 100328 | 1000002222 | <NA> | ksnet@aks.ac.kr | 8 | Y | 31:08.5 | 33:25.3 | ksnet@aks.ac.kr | N | <NA> | <NA> |
2262 | 5018 | 100653 | 1000002747 | <NA> | ksnet@aks.ac.kr | 0 | Y | 04:36.6 | <NA> | <NA> | N | <NA> | <NA> |
2263 | 5019 | 100653 | 1000002747 | <NA> | ksnet@aks.ac.kr | 1 | Y | 04:36.6 | <NA> | <NA> | N | <NA> | <NA> |
2264 | 5020 | 100653 | 1000002747 | <NA> | ksnet@aks.ac.kr | 2 | Y | 04:36.6 | <NA> | <NA> | N | <NA> | <NA> |
2265 | 5021 | 100653 | 1000002747 | <NA> | ksnet@aks.ac.kr | 3 | Y | 04:36.6 | <NA> | <NA> | N | <NA> | <NA> |
2266 | 5022 | 100653 | 1000002747 | <NA> | ksnet@aks.ac.kr | 4 | Y | 04:36.6 | <NA> | <NA> | N | <NA> | <NA> |
2267 | 5023 | 100653 | 1000002747 | <NA> | ksnet@aks.ac.kr | 5 | Y | 04:36.6 | <NA> | <NA> | N | <NA> | <NA> |
2268 | 5024 | 100653 | 1000002747 | <NA> | ksnet@aks.ac.kr | 6 | Y | 04:36.7 | <NA> | <NA> | N | <NA> | <NA> |
2269 | 5025 | 100653 | 1000002747 | <NA> | ksnet@aks.ac.kr | 7 | Y | 04:36.7 | <NA> | <NA> | N | <NA> | <NA> |
Most frequently occurring
Row | INSTITUTION_FACULTY_ID | INSTITUTION_NUM | PROFILE_ID | MEMBER_NUM | MEMBER_ID | CODE_ORDER | EMAIL_OPEN | REGISTER_DT | MODIFY_DT | MODIFIER | ERASE_YN | ERASE_DT | ERASER | # duplicates |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 문법론(Grammar) | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2 |
1 | 번역학(Translation Studies) | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2 |