Overview

Dataset statistics

Number of variables: 4
Number of observations: 482
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 16.6 KiB
Average record size in memory: 35.3 B
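
The average record size follows from the total size in memory and the number of observations. A minimal arithmetic sketch, assuming 1 KiB = 1024 B and that the report rounds to one decimal place (the variable names are illustrative, not part of the report):

```python
# Figures copied from the "Dataset statistics" block of the report.
total_size_kib = 16.6   # "Total size in memory" (already rounded by the report)
n_observations = 482    # "Number of observations"

# Assumption: 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")  # prints "35.3 B"
```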

Variable types

Categorical: 1
Numeric: 3

Dataset

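Description: The National Institute for International Education (NIIED) manages global talent data, including the government-invited scholarship students it selects and administers, in the Overseas Human Resources Management System (HURIK). This dataset reports the number of people managed, aggregated by program and by year.
Author: Ministry of Education, National Institute for International Education (교육부 국립국제교육원)
URL: https://www.data.go.kr/data/15069776/fileData.do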

Alerts

년도별선발인원 is highly overall correlated with 누적선발인원 (High correlation)
누적선발인원 is highly overall correlated with 년도별선발인원 (High correlation)

Reproduction

Analysis started: 2023-12-12 21:17:36.872993
Analysis finished: 2023-12-12 21:17:38.160389
Duration: 1.29 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
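
A report of this kind can be regenerated from the source CSV. The following is a minimal sketch, not the configuration actually used for this report: the function name `build_profile`, the file names, the report title, and the encoding are all assumptions, and `config.json` is not applied here. It needs the third-party packages `pandas` and `ydata-profiling`.

```python
def build_profile(csv_path, html_path="report.html", encoding="utf-8"):
    """Profile a CSV file and write the HTML report to html_path."""
    # Imported lazily so this module loads even when the third-party
    # packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    # The title is a placeholder; the original report's title is not known.
    report = ProfileReport(df, title="Profiling report")
    report.to_file(html_path)
    return report


# Hypothetical usage (file name and encoding are assumptions):
# build_profile("hurik_status.csv", encoding="cp949")
```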

Variables

사업명 (program name)
Categorical

Distinct: 40
Distinct (%): 8.3%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 KiB

Value | Count
장기교육과정 | 59
정부초청 외국인 장학생 관리 | 56
국비유학생 선발파견 | 46
단기교육후기과정 | 28
단기교육전기과정 | 26
Other values (35) | 267

Length

Max length: 30
Median length: 19
Mean length: 12.178423
Min length: 6

Unique

Unique: 4
Unique (%): 0.8%

Sample

1st row: BRICs and ABC 국가 대학생 초청연수
2nd row: BRICs and ABC 국가 대학생 초청연수
3rd row: BRICs and ABC 국가 대학생 초청연수
4th row: BRICs and ABC 국가 대학생 초청연수
5th row: EPIK(English Program In Korea)

Common Values

Value | Count | Frequency (%)
장기교육과정 | 59 | 12.2%
정부초청 외국인 장학생 관리 | 56 | 11.6%
국비유학생 선발파견 | 46 | 9.5%
단기교육후기과정 | 28 | 5.8%
단기교육전기과정 | 26 | 5.4%
외국정부초청 장학생 관리 | 22 | 4.6%
문화협정에 의한 제2외국어 교원 연수 | 22 | 4.6%
한일공동이공계학부유학생선발파견사업 | 20 | 4.1%
한일 학술문화 청소년 교류 | 17 | 3.5%
한일중고생 교류 | 16 | 3.3%
Other values (30) | 170 | 35.3%
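
Each frequency in the table is the row count divided by the 482 observations. A short sketch of that arithmetic, assuming the report rounds to one decimal place (the helper `frequency_pct` is illustrative, not part of the report):

```python
N_OBSERVATIONS = 482  # "Number of observations" in the overview

# (value, count) pairs copied from the "Common Values" table.
common_values = [
    ("장기교육과정", 59),
    ("정부초청 외국인 장학생 관리", 56),
    ("국비유학생 선발파견", 46),
    ("단기교육후기과정", 28),
    ("단기교육전기과정", 26),
    ("외국정부초청 장학생 관리", 22),
    ("문화협정에 의한 제2외국어 교원 연수", 22),
    ("한일공동이공계학부유학생선발파견사업", 20),
    ("한일 학술문화 청소년 교류", 17),
    ("한일중고생 교류", 16),
]
other_count = 170  # "Other values (30)"


def frequency_pct(count, total=N_OBSERVATIONS):
    """Share of rows, in percent, that carry a given value."""
    return 100 * count / total


for value, count in common_values:
    print(f"{value}: {frequency_pct(count):.1f}%")

# The listed counts plus the "Other values" row should cover every observation.
total_count = sum(count for _, count in common_values) + other_count
```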

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
장학생 | 78 | 6.4%
관리 | 78 | 6.4%
장기교육과정 | 59 | 4.9%
정부초청 | 56 | 4.6%
외국인 | 56 | 4.6%
연수 | 55 | 4.5%
교류 | 47 | 3.9%
선발파견 | 46 | 3.8%
국비유학생 | 46 | 3.8%
대학생 | 45 | 3.7%
Other values (67) | 648 | 53.4%

선발년도 (selection year)
Real number (ℝ)

Distinct: 61
Distinct (%): 12.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2003.4544
Minimum: 1962
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.4 KiB

Quantile statistics

Minimum: 1962
5-th percentile: 1976.05
Q1: 1997
Median: 2006
Q3: 2014
95-th percentile: 2019.95
Maximum: 2022
Range: 60
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 13.435261
Coefficient of variation (CV): 0.0067060478
Kurtosis: 0.29859873
Mean: 2003.4544
Median Absolute Deviation (MAD): 8
Skewness: -0.95512157
Sum: 965665
Variance: 180.50623
Monotonicity: Not monotonic
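
Several of the statistics for 선발년도 are tied together by simple identities (mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean, range = max - min, IQR = Q3 - Q1). The sketch below checks the reported figures against those identities. The relative tolerance of 1e-6 is an assumption that allows for the report's rounding, and the `check` helper is illustrative only.

```python
import math

N_OBSERVATIONS = 482  # "Number of observations" in the overview

# Figures copied from the 선발년도 statistics in the report.
stats = {
    "total": 965665,
    "mean": 2003.4544,
    "std": 13.435261,
    "variance": 180.50623,
    "cv": 0.0067060478,
    "minimum": 1962,
    "maximum": 2022,
    "value_range": 60,
    "q1": 1997,
    "q3": 2014,
    "iqr": 17,
}


def check(s, n, rel_tol=1e-6):
    """Return a mapping of identity name -> whether the reported figures satisfy it."""
    return {
        "mean = sum / n": math.isclose(s["total"] / n, s["mean"], rel_tol=rel_tol),
        "variance = std ** 2": math.isclose(s["std"] ** 2, s["variance"], rel_tol=rel_tol),
        "cv = std / mean": math.isclose(s["std"] / s["mean"], s["cv"], rel_tol=rel_tol),
        "range = max - min": s["maximum"] - s["minimum"] == s["value_range"],
        "iqr = q3 - q1": s["q3"] - s["q1"] == s["iqr"],
    }


results = check(stats, N_OBSERVATIONS)
for name, ok in results.items():
    print(f"{name}: {'ok' if ok else 'mismatch'}")
```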
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
2006 | 24 | 5.0%
2005 | 20 | 4.1%
2007 | 18 | 3.7%
2018 | 18 | 3.7%
2017 | 18 | 3.7%
2012 | 17 | 3.5%
2008 | 16 | 3.3%
2010 | 16 | 3.3%
2019 | 16 | 3.3%
2013 | 16 | 3.3%
Other values (51) | 303 | 62.9%

Smallest 10 values

Value | Count | Frequency (%)
1962 | 1 | 0.2%
1963 | 1 | 0.2%
1964 | 1 | 0.2%
1965 | 1 | 0.2%
1966 | 1 | 0.2%
1967 | 2 | 0.4%
1968 | 2 | 0.4%
1969 | 2 | 0.4%
1970 | 2 | 0.4%
1971 | 2 | 0.4%

Largest 10 values

Value | Count | Frequency (%)
2022 | 9 | 1.9%
2021 | 10 | 2.1%
2020 | 6 | 1.2%
2019 | 16 | 3.3%
2018 | 18 | 3.7%
2017 | 18 | 3.7%
2016 | 15 | 3.1%
2015 | 15 | 3.1%
2014 | 15 | 3.1%
2013 | 16 | 3.3%

년도별선발인원 (number selected per year)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 215
Distinct (%): 44.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 154.50622
Minimum: 1
Maximum: 7419
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 11.05
Q1: 41.25
Median: 81
Q3: 131
95-th percentile: 562.9
Maximum: 7419
Range: 7418
Interquartile range (IQR): 89.75

Descriptive statistics

Standard deviation: 390.20629
Coefficient of variation (CV): 2.5255053
Kurtosis: 251.09729
Mean: 154.50622
Median Absolute Deviation (MAD): 43
Skewness: 13.955964
Sum: 74472
Variance: 152260.94
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
100 | 16 | 3.3%
50 | 16 | 3.3%
49 | 8 | 1.7%
40 | 8 | 1.7%
30 | 7 | 1.5%
15 | 7 | 1.5%
99 | 7 | 1.5%
117 | 6 | 1.2%
60 | 6 | 1.2%
23 | 6 | 1.2%
Other values (205) | 395 | 82.0%

Smallest 10 values

Value | Count | Frequency (%)
1 | 4 | 0.8%
2 | 2 | 0.4%
3 | 1 | 0.2%
4 | 2 | 0.4%
5 | 2 | 0.4%
6 | 1 | 0.2%
7 | 2 | 0.4%
8 | 2 | 0.4%
9 | 2 | 0.4%
10 | 4 | 0.8%

Largest 10 values

Value | Count | Frequency (%)
7419 | 1 | 0.2%
1410 | 1 | 0.2%
1351 | 1 | 0.2%
1318 | 1 | 0.2%
1272 | 1 | 0.2%
1184 | 1 | 0.2%
1065 | 1 | 0.2%
1024 | 1 | 0.2%
941 | 1 | 0.2%
905 | 1 | 0.2%

누적선발인원 (cumulative number selected)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 427
Distinct (%): 88.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1453.5104
Minimum: 1
Maximum: 13823
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 30
Q1: 221.75
Median: 682
Q3: 1899.75
95-th percentile: 5371.3
Maximum: 13823
Range: 13822
Interquartile range (IQR): 1678

Descriptive statistics

Standard deviation: 1913.2409
Coefficient of variation (CV): 1.3162898
Kurtosis: 8.5296708
Mean: 1453.5104
Median Absolute Deviation (MAD): 582
Skewness: 2.5252137
Sum: 700592
Variance: 3660490.6
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
100 | 5 | 1.0%
118 | 3 | 0.6%
1 | 3 | 0.6%
81 | 3 | 0.6%
60 | 3 | 0.6%
64 | 3 | 0.6%
30 | 3 | 0.6%
302 | 3 | 0.6%
241 | 2 | 0.4%
55 | 2 | 0.4%
Other values (417) | 452 | 93.8%

Smallest 10 values

Value | Count | Frequency (%)
1 | 3 | 0.6%
2 | 1 | 0.2%
7 | 2 | 0.4%
8 | 1 | 0.2%
9 | 1 | 0.2%
11 | 1 | 0.2%
12 | 1 | 0.2%
14 | 1 | 0.2%
15 | 1 | 0.2%
17 | 1 | 0.2%

Largest 10 values

Value | Count | Frequency (%)
13823 | 1 | 0.2%
12413 | 1 | 0.2%
11062 | 1 | 0.2%
9744 | 1 | 0.2%
8890 | 1 | 0.2%
8107 | 1 | 0.2%
7666 | 1 | 0.2%
7598 | 1 | 0.2%
7534 | 1 | 0.2%
7491 | 1 | 0.2%

Interactions

Figure: interaction plots (nine panels)

Correlations

Variable | 사업명 | 선발년도 | 년도별선발인원 | 누적선발인원
사업명 | 1.000 | 0.610 | 0.737 | 0.718
선발년도 | 0.610 | 1.000 | 0.095 | 0.338
년도별선발인원 | 0.737 | 0.095 | 1.000 | 0.598
누적선발인원 | 0.718 | 0.338 | 0.598 | 1.000

Variable | 선발년도 | 년도별선발인원 | 누적선발인원 | 사업명
선발년도 | 1.000 | 0.358 | 0.420 | 0.235
년도별선발인원 | 0.358 | 1.000 | 0.618 | 0.487
누적선발인원 | 0.420 | 0.618 | 1.000 | 0.305
사업명 | 0.235 | 0.487 | 0.305 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First 10 rows

Index | 사업명 | 선발년도 | 년도별선발인원 | 누적선발인원
0 | BRICs and ABC 국가 대학생 초청연수 | 2005 | 20 | 20
1 | BRICs and ABC 국가 대학생 초청연수 | 2006 | 30 | 50
2 | BRICs and ABC 국가 대학생 초청연수 | 2007 | 35 | 85
3 | BRICs and ABC 국가 대학생 초청연수 | 2008 | 35 | 120
4 | EPIK(English Program In Korea) | 2007 | 300 | 300
5 | EPIK(English Program In Korea) | 2008 | 1184 | 1484
6 | EPIK(English Program In Korea) | 2009 | 1272 | 2756
7 | GKS 우수교환학생 선발관리 | 2010 | 718 | 718
8 | GKS 우수교환학생 선발관리 | 2011 | 297 | 1015
9 | GKS 우수교환학생 선발관리 | 2012 | 289 | 1304
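
Last 10 rows

Index | 사업명 | 선발년도 | 년도별선발인원 | 누적선발인원
472 | 한일중고생 교류 | 2016 | 200 | 3781
473 | 한일중고생 교류 | 2017 | 54 | 3835
474 | 한일중고생 교류 | 2018 | 196 | 4031
475 | 한일중고생 교류 | 2019 | 100 | 4131
476 | 한중중학생교류연수 | 2011 | 100 | 100
477 | 한중중학생교류연수 | 2012 | 200 | 300
478 | 한중중학생교류연수 | 2013 | 99 | 399
479 | 한중중학생교류연수 | 2017 | 139 | 538
480 | 한중중학생교류연수 | 2018 | 142 | 680
481 | 한중중학생교류연수 | 2019 | 100 | 780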
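
In the sampled rows, 누적선발인원 appears to be a running total of 년도별선발인원 within each 사업명. The sketch below checks that relationship on the last ten rows only. It says nothing about the other 472 rows, and the helper name `running_total_holds` is illustrative.

```python
# Last 10 sampled rows: (사업명, 선발년도, 년도별선발인원, 누적선발인원)
tail_rows = [
    ("한일중고생 교류", 2016, 200, 3781),
    ("한일중고생 교류", 2017, 54, 3835),
    ("한일중고생 교류", 2018, 196, 4031),
    ("한일중고생 교류", 2019, 100, 4131),
    ("한중중학생교류연수", 2011, 100, 100),
    ("한중중학생교류연수", 2012, 200, 300),
    ("한중중학생교류연수", 2013, 99, 399),
    ("한중중학생교류연수", 2017, 139, 538),
    ("한중중학생교류연수", 2018, 142, 680),
    ("한중중학생교류연수", 2019, 100, 780),
]


def running_total_holds(rows):
    """True if, for consecutive rows of the same program, the cumulative
    count grows by exactly that row's yearly count."""
    for prev, curr in zip(rows, rows[1:]):
        if prev[0] != curr[0]:
            continue  # a new program starts; there is no previous row to compare
        if curr[3] - prev[3] != curr[2]:
            return False
    return True


print(running_total_holds(tail_rows))  # True for the rows shown
```

If the same pattern holds across the full dataset, it may partly explain the high-correlation alert between the two columns, since one would then be derived from the other.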