Overview

Dataset statistics

Number of variables: 8
Number of observations: 160
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 11.2 KiB
Average record size in memory: 71.8 B

Variable types

Text: 1
Numeric: 6
Categorical: 1

Dataset

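Description: Status of government-invited foreign scholarship students by country and by program, as recorded in the Overseas Human Resources Management System (국외인적자원관리시스템). Note: figures are based on the data registered in the system and may differ slightly from the records held by the department that runs the program.
Author: National Institute for International Education, Ministry of Education (교육부 국립국제교육원)
URL: https://www.data.go.kr/data/15052777/fileData.do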

Alerts

학사 is highly overall correlated with 석사 and 2 other fields (High correlation)
석사 is highly overall correlated with 학사 and 2 other fields (High correlation)
박사 is highly overall correlated with 학사 and 3 other fields (High correlation)
연구 is highly overall correlated with 박사 (High correlation)
기타 is highly overall correlated with 학사 and 2 other fields (High correlation)
석박사 is highly imbalanced (75.6%) (Imbalance)
구분 has unique values (Unique)
학사 has 75 (46.9%) zeros (Zeros)
석사 has 7 (4.4%) zeros (Zeros)
박사 has 34 (21.2%) zeros (Zeros)
연구 has 108 (67.5%) zeros (Zeros)
연수 has 131 (81.9%) zeros (Zeros)
기타 has 16 (10.0%) zeros (Zeros)
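
The Zeros alerts can be checked by hand: each percentage is the zero count divided by the 160 observations. A minimal Python sketch using only the counts listed in the alerts (the names `zero_counts` and `zero_share` are illustrative):

```python
# Zero counts per column, copied from the Zeros alerts.
N_OBSERVATIONS = 160

zero_counts = {
    "학사": 75,
    "석사": 7,
    "박사": 34,
    "연구": 108,
    "연수": 131,
    "기타": 16,
}


def zero_share(count: int, total: int = N_OBSERVATIONS) -> float:
    """Return the share of zeros as a percentage of all observations."""
    return 100.0 * count / total


for column, count in zero_counts.items():
    print(f"{column}: {count} zeros ({zero_share(count):.1f}%)")
```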

Reproduction

Analysis started: 2023-12-12 04:54:05.199623
Analysis finished: 2023-12-12 04:54:09.193334
Duration: 3.99 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
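
A report of this kind can be regenerated with the `ProfileReport` class from ydata-profiling. The sketch below is a hedged example, not the exact command used for this report: the CSV file name, the `cp949` encoding, and the report title are assumptions. The imports are deferred so that the function can be defined without the third-party packages installed.

```python
def build_report(csv_path: str, html_path: str = "report.html") -> None:
    """Profile a CSV file and write the HTML report to html_path."""
    # Imported lazily: pandas and ydata-profiling are third-party packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the source file is cp949-encoded; change this if it is UTF-8.
    df = pd.read_csv(csv_path, encoding="cp949")

    # The title is illustrative; the original report's title is not shown here.
    profile = ProfileReport(df, title="Scholarship students by country and program")
    profile.to_file(html_path)


# Example call with a hypothetical file name:
# build_report("scholarship_students.csv")
```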

Variables

구분
Text

UNIQUE 

Distinct: 160
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 KiB

Length

Max length: 11
Median length: 9
Mean length: 3.81875
Min length: 2

Characters and Unicode

Total characters: 611
Distinct characters: 155
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
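
As a rough illustration of those properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories for the five sample values of 구분 plus a space and a middle dot (U+00B7, assumed here to be the punctuation mark the report counts). The long category names come from a hand-written mapping, not from `unicodedata` itself.

```python
import unicodedata
from collections import Counter

# Hand-written labels for the three general categories reported for this column.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Sample values from the report, plus a space and U+00B7 (an assumption).
sample = ["가나", "가봉", "가이아나", "감비아", "과테말라", " ", "\u00b7"]
print(category_counts(sample))
```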

Unique

Unique: 160
Unique (%): 100.0%

Sample

1st row: 가나
2nd row: 가봉
3rd row: 가이아나
4th row: 감비아
5th row: 과테말라

Value | Count | Frequency (%)
기니 | 2 | 1.2%
가나 | 1 | 0.6%
중국 | 1 | 0.6%
우루과이 | 1 | 0.6%
우즈베키스탄 | 1 | 0.6%
우크라이나 | 1 | 0.6%
유고슬라비아 | 1 | 0.6%
이라크 | 1 | 0.6%
이란 | 1 | 0.6%
이스라엘 | 1 | 0.6%
Other values (152) | 152 | 93.3%

Most occurring characters

Value | Count | Frequency (%)
| 48 | 7.9%
| 28 | 4.6%
| 24 | 3.9%
| 20 | 3.3%
| 20 | 3.3%
| 19 | 3.1%
| 19 | 3.1%
| 13 | 2.1%
| 13 | 2.1%
| 12 | 2.0%
Other values (145) | 395 | 64.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 607 | 99.3%
Space Separator | 3 | 0.5%
Other Punctuation | 1 | 0.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
| 48 | 7.9%
| 28 | 4.6%
| 24 | 4.0%
| 20 | 3.3%
| 20 | 3.3%
| 19 | 3.1%
| 19 | 3.1%
| 13 | 2.1%
| 13 | 2.1%
| 12 | 2.0%
Other values (143) | 391 | 64.4%

Space Separator
Value | Count | Frequency (%)
(space) | 3 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
· | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 607 | 99.3%
Common | 4 | 0.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
| 48 | 7.9%
| 28 | 4.6%
| 24 | 4.0%
| 20 | 3.3%
| 20 | 3.3%
| 19 | 3.1%
| 19 | 3.1%
| 13 | 2.1%
| 13 | 2.1%
| 12 | 2.0%
Other values (143) | 391 | 64.4%

Common
Value | Count | Frequency (%)
(space) | 3 | 75.0%
· | 1 | 25.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 607 | 99.3%
ASCII | 3 | 0.5%
None | 1 | 0.2%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
| 48 | 7.9%
| 28 | 4.6%
| 24 | 4.0%
| 20 | 3.3%
| 20 | 3.3%
| 19 | 3.1%
| 19 | 3.1%
| 13 | 2.1%
| 13 | 2.1%
| 12 | 2.0%
Other values (143) | 391 | 64.4%

ASCII
Value | Count | Frequency (%)
(space) | 3 | 100.0%

None
Value | Count | Frequency (%)
· | 1 | 100.0%

학사
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 45
Distinct (%): 28.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.925
Minimum: 0
Maximum: 128
Zeros: 75
Zeros (%): 46.9%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 21.25
95-th percentile: 62.2
Maximum: 128
Range: 128
Interquartile range (IQR): 21.25

Descriptive statistics

Standard deviation: 23.038265
Coefficient of variation (CV): 1.6544535
Kurtosis: 7.0395443
Mean: 13.925
Median Absolute Deviation (MAD): 1
Skewness: 2.4607369
Sum: 2228
Variance: 530.76164
Monotonicity: Not monotonic
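
Several of these figures follow from one another, which gives a quick consistency check. The sketch below recomputes the coefficient of variation, variance, sum, range, and IQR for 학사 from the other reported values. The variable names are illustrative, and small differences are expected because the report rounds its output.

```python
# Values reported for 학사 in this section.
n = 160
mean = 13.925
std = 23.038265
q1, q3 = 0.0, 21.25
minimum, maximum = 0, 128

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
total = mean * n                 # sum of all values
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range

print(f"CV={cv:.7f}  variance={variance:.2f}  sum={total:.0f}  "
      f"range={value_range}  IQR={iqr}")
```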
Histogram with fixed size bins (bins=45)

Common values

Value | Count | Frequency (%)
0 | 75 | 46.9%
1 | 7 | 4.4%
2 | 5 | 3.1%
26 | 4 | 2.5%
19 | 3 | 1.9%
15 | 3 | 1.9%
30 | 3 | 1.9%
36 | 3 | 1.9%
4 | 3 | 1.9%
43 | 2 | 1.2%
Other values (35) | 52 | 32.5%

Minimum 10 values

Value | Count | Frequency (%)
0 | 75 | 46.9%
1 | 7 | 4.4%
2 | 5 | 3.1%
3 | 2 | 1.2%
4 | 3 | 1.9%
5 | 2 | 1.2%
6 | 2 | 1.2%
8 | 2 | 1.2%
9 | 2 | 1.2%
10 | 2 | 1.2%

Maximum 10 values

Value | Count | Frequency (%)
128 | 1 | 0.6%
111 | 1 | 0.6%
104 | 1 | 0.6%
93 | 1 | 0.6%
83 | 1 | 0.6%
75 | 1 | 0.6%
67 | 1 | 0.6%
66 | 1 | 0.6%
62 | 1 | 0.6%
60 | 1 | 0.6%

석사
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 81
Distinct (%): 50.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 52.55625
Minimum: 0
Maximum: 413
Zeros: 7
Zeros (%): 4.4%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 9
Median: 20.5
Q3: 58
95-th percentile: 239.8
Maximum: 413
Range: 413
Interquartile range (IQR): 49

Descriptive statistics

Standard deviation: 79.398267
Coefficient of variation (CV): 1.5107293
Kurtosis: 6.3725476
Mean: 52.55625
Median Absolute Deviation (MAD): 15.5
Skewness: 2.5367698
Sum: 8409
Variance: 6304.0849
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 7 | 4.4%
4 | 6 | 3.8%
11 | 6 | 3.8%
10 | 6 | 3.8%
16 | 6 | 3.8%
7 | 5 | 3.1%
18 | 5 | 3.1%
2 | 5 | 3.1%
26 | 5 | 3.1%
12 | 5 | 3.1%
Other values (71) | 104 | 65.0%

Minimum 10 values

Value | Count | Frequency (%)
0 | 7 | 4.4%
1 | 3 | 1.9%
2 | 5 | 3.1%
3 | 1 | 0.6%
4 | 6 | 3.8%
5 | 3 | 1.9%
6 | 4 | 2.5%
7 | 5 | 3.1%
8 | 3 | 1.9%
9 | 5 | 3.1%

Maximum 10 values

Value | Count | Frequency (%)
413 | 1 | 0.6%
389 | 1 | 0.6%
355 | 1 | 0.6%
297 | 1 | 0.6%
285 | 1 | 0.6%
274 | 1 | 0.6%
266 | 1 | 0.6%
255 | 1 | 0.6%
239 | 1 | 0.6%
238 | 1 | 0.6%

박사
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 44
Distinct (%): 27.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.74375
Minimum: 0
Maximum: 206
Zeros: 34
Zeros (%): 21.2%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 3
Q3: 15
95-th percentile: 69.05
Maximum: 206
Range: 206
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 27.871052
Coefficient of variation (CV): 2.0279074
Kurtosis: 18.197645
Mean: 13.74375
Median Absolute Deviation (MAD): 3
Skewness: 3.8365849
Sum: 2199
Variance: 776.79556
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=44)

Common values

Value | Count | Frequency (%)
0 | 34 | 21.2%
1 | 22 | 13.8%
2 | 18 | 11.2%
3 | 17 | 10.6%
4 | 5 | 3.1%
6 | 4 | 2.5%
8 | 4 | 2.5%
15 | 4 | 2.5%
17 | 3 | 1.9%
9 | 3 | 1.9%
Other values (34) | 46 | 28.7%

Minimum 10 values

Value | Count | Frequency (%)
0 | 34 | 21.2%
1 | 22 | 13.8%
2 | 18 | 11.2%
3 | 17 | 10.6%
4 | 5 | 3.1%
5 | 2 | 1.2%
6 | 4 | 2.5%
7 | 1 | 0.6%
8 | 4 | 2.5%
9 | 3 | 1.9%

Maximum 10 values

Value | Count | Frequency (%)
206 | 1 | 0.6%
134 | 1 | 0.6%
124 | 1 | 0.6%
111 | 1 | 0.6%
103 | 1 | 0.6%
90 | 1 | 0.6%
80 | 1 | 0.6%
70 | 1 | 0.6%
69 | 1 | 0.6%
65 | 1 | 0.6%

석박사
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 KiB

Value | Count
0 | 147
1 | 10
2 | 2
4 | 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 0.6%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 147 | 91.9%
1 | 10 | 6.2%
2 | 2 | 1.2%
4 | 1 | 0.6%
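
The 75.6% imbalance figure in the alerts can be reproduced from these counts if the score is taken to be one minus the normalized Shannon entropy of the class distribution. That definition is an assumption here, chosen because it matches the reported number. The function name is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for class counts.

    The score is 0 for a perfectly balanced column and approaches 1
    as a single class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no balance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Counts for the 석박사 values 0, 1, 2 and 4, taken from the Common Values table.
score = imbalance_score([147, 10, 2, 1])
print(f"{score:.1%}")
```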

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 147 | 91.9%
1 | 10 | 6.2%
2 | 2 | 1.2%
4 | 1 | 0.6%

연구
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 11
Distinct (%): 6.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1
Minimum: 0
Maximum: 17
Zeros: 108
Zeros (%): 67.5%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1
95-th percentile: 5
Maximum: 17
Range: 17
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 2.452056
Coefficient of variation (CV): 2.452056
Kurtosis: 20.711369
Mean: 1
Median Absolute Deviation (MAD): 0
Skewness: 4.1782162
Sum: 160
Variance: 6.0125786
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)

Common values

Value | Count | Frequency (%)
0 | 108 | 67.5%
1 | 25 | 15.6%
2 | 7 | 4.4%
4 | 6 | 3.8%
3 | 5 | 3.1%
5 | 3 | 1.9%
8 | 2 | 1.2%
6 | 1 | 0.6%
15 | 1 | 0.6%
13 | 1 | 0.6%

Minimum 10 values

Value | Count | Frequency (%)
0 | 108 | 67.5%
1 | 25 | 15.6%
2 | 7 | 4.4%
3 | 5 | 3.1%
4 | 6 | 3.8%
5 | 3 | 1.9%
6 | 1 | 0.6%
8 | 2 | 1.2%
13 | 1 | 0.6%
15 | 1 | 0.6%

Maximum 10 values

Value | Count | Frequency (%)
17 | 1 | 0.6%
15 | 1 | 0.6%
13 | 1 | 0.6%
8 | 2 | 1.2%
6 | 1 | 0.6%
5 | 3 | 1.9%
4 | 6 | 3.8%
3 | 5 | 3.1%
2 | 7 | 4.4%
1 | 25 | 15.6%

연수
Real number (ℝ)

ZEROS 

Distinct: 6
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3
Minimum: 0
Maximum: 5
Zeros: 131
Zeros (%): 81.9%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 2
Maximum: 5
Range: 5
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.75901293
Coefficient of variation (CV): 2.5300431
Kurtosis: 13.019067
Mean: 0.3
Median Absolute Deviation (MAD): 0
Skewness: 3.2806792
Sum: 48
Variance: 0.57610063
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)

Common values

Value | Count | Frequency (%)
0 | 131 | 81.9%
1 | 16 | 10.0%
2 | 10 | 6.2%
4 | 1 | 0.6%
3 | 1 | 0.6%
5 | 1 | 0.6%

Minimum 10 values

Value | Count | Frequency (%)
0 | 131 | 81.9%
1 | 16 | 10.0%
2 | 10 | 6.2%
3 | 1 | 0.6%
4 | 1 | 0.6%
5 | 1 | 0.6%

Maximum 10 values

Value | Count | Frequency (%)
5 | 1 | 0.6%
4 | 1 | 0.6%
3 | 1 | 0.6%
2 | 10 | 6.2%
1 | 16 | 10.0%
0 | 131 | 81.9%

기타
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 44
Distinct (%): 27.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.1
Minimum: 0
Maximum: 154
Zeros: 16
Zeros (%): 10.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
Median: 4
Q3: 13
95-th percentile: 58.45
Maximum: 154
Range: 154
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 21.601945
Coefficient of variation (CV): 1.6490034
Kurtosis: 13.030563
Mean: 13.1
Median Absolute Deviation (MAD): 3
Skewness: 3.1530826
Sum: 2096
Variance: 466.64403
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=44)

Common values

Value | Count | Frequency (%)
1 | 22 | 13.8%
0 | 16 | 10.0%
2 | 16 | 10.0%
3 | 15 | 9.4%
4 | 12 | 7.5%
7 | 8 | 5.0%
11 | 7 | 4.4%
6 | 7 | 4.4%
10 | 4 | 2.5%
5 | 4 | 2.5%
Other values (34) | 49 | 30.6%

Minimum 10 values

Value | Count | Frequency (%)
0 | 16 | 10.0%
1 | 22 | 13.8%
2 | 16 | 10.0%
3 | 15 | 9.4%
4 | 12 | 7.5%
5 | 4 | 2.5%
6 | 7 | 4.4%
7 | 8 | 5.0%
8 | 4 | 2.5%
9 | 4 | 2.5%

Maximum 10 values

Value | Count | Frequency (%)
154 | 1 | 0.6%
96 | 1 | 0.6%
80 | 1 | 0.6%
74 | 1 | 0.6%
72 | 1 | 0.6%
71 | 1 | 0.6%
69 | 1 | 0.6%
67 | 1 | 0.6%
58 | 1 | 0.6%
56 | 2 | 1.2%

Interactions

[Figure: 36 pairwise interaction plots of the six numeric variables]

Correlations

 | 학사 | 석사 | 박사 | 석박사 | 연구 | 연수 | 기타
학사 | 1.000 | 0.954 | 0.804 | 0.644 | 0.635 | 0.668 | 0.801
석사 | 0.954 | 1.000 | 0.815 | 0.642 | 0.772 | 0.721 | 0.819
박사 | 0.804 | 0.815 | 1.000 | 0.655 | 0.937 | 0.496 | 0.862
석박사 | 0.644 | 0.642 | 0.655 | 1.000 | 0.465 | 0.118 | 0.823
연구 | 0.635 | 0.772 | 0.937 | 0.465 | 1.000 | 0.707 | 0.832
연수 | 0.668 | 0.721 | 0.496 | 0.118 | 0.707 | 1.000 | 0.355
기타 | 0.801 | 0.819 | 0.862 | 0.823 | 0.832 | 0.355 | 1.000

 | 학사 | 석사 | 박사 | 연구 | 연수 | 기타 | 석박사
학사 | 1.000 | 0.666 | 0.526 | 0.182 | 0.038 | 0.741 | 0.436
석사 | 0.666 | 1.000 | 0.804 | 0.477 | 0.289 | 0.849 | 0.434
박사 | 0.526 | 0.804 | 1.000 | 0.516 | 0.231 | 0.732 | 0.362
연구 | 0.182 | 0.477 | 0.516 | 1.000 | 0.383 | 0.329 | 0.219
연수 | 0.038 | 0.289 | 0.231 | 0.383 | 1.000 | 0.162 | 0.074
기타 | 0.741 | 0.849 | 0.732 | 0.329 | 0.162 | 1.000 | 0.485
석박사 | 0.436 | 0.434 | 0.362 | 0.219 | 0.074 | 0.485 | 1.000
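
The High correlation alerts in the Overview are consistent with applying a cutoff of 0.5 to the matrix directly above. The cutoff is an assumption inferred from the numbers, not a setting stated in this report. A minimal sketch that lists, for each column, the other columns at or above that cutoff (the function name is illustrative):

```python
# Coefficients copied from the matrix above; rows and columns share one order.
columns = ["학사", "석사", "박사", "연구", "연수", "기타", "석박사"]
matrix = [
    [1.000, 0.666, 0.526, 0.182, 0.038, 0.741, 0.436],
    [0.666, 1.000, 0.804, 0.477, 0.289, 0.849, 0.434],
    [0.526, 0.804, 1.000, 0.516, 0.231, 0.732, 0.362],
    [0.182, 0.477, 0.516, 1.000, 0.383, 0.329, 0.219],
    [0.038, 0.289, 0.231, 0.383, 1.000, 0.162, 0.074],
    [0.741, 0.849, 0.732, 0.329, 0.162, 1.000, 0.485],
    [0.436, 0.434, 0.362, 0.219, 0.074, 0.485, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, inferred from the alerts


def highly_correlated(columns, matrix, threshold=THRESHOLD):
    """Map each column to the other columns whose coefficient meets the threshold."""
    result = {}
    for i, name in enumerate(columns):
        partners = [
            columns[j]
            for j, value in enumerate(matrix[i])
            if j != i and abs(value) >= threshold
        ]
        if partners:
            result[name] = partners
    return result


for name, partners in highly_correlated(columns, matrix).items():
    print(f"{name}: {', '.join(partners)}")
```

Under this assumption 연수 and 석박사 have no partner at or above the cutoff, which agrees with the absence of a High correlation alert for either column.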

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

구분 | 학사 | 석사 | 박사 | 석박사 | 연구 | 연수 | 기타
0가나26942300026
1가봉2323130107
2가이아나0300000
3감비아1910003
4과테말라2217300011
5그리스01630214
6기니11230004
7기니비사우2000002
8나미비아0100001
9나이지리아28773700029

구분 | 학사 | 석사 | 박사 | 석박사 | 연구 | 연수 | 기타
150팔레스타인0720003
151페루2971200116
152포르투갈01130001
153폴란드1853210126
154프랑스361190157
155피지3740004
156핀란드12130314
157필리핀432173605180
158헝가리248140404
159홍콩11800000