Overview

Dataset statistics

Number of variables: 4
Number of observations: 192
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.7 KiB
Average record size in memory: 35.7 B
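
The average record size follows from the total size and the number of observations. A minimal check, assuming 1 KiB = 1024 bytes and that the reported total is rounded to one decimal place:

```python
# Values copied from the Dataset statistics above.
total_size_kib = 6.7
n_observations = 192

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```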

Variable types

Numeric: 3
Categorical: 1

Dataset

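Description: The National Institute for International Education (국립국제교육원) manages the numbers of people selected for its various scholarship programs, including government-invited scholarship students, in the Overseas Human Resources Management System (국외인적자원관리시스템). This dataset gives the study-completion status of government-invited foreign scholarship students by selection year.
Author: Ministry of Education, National Institute for International Education (교육부 국립국제교육원)
URL: https://www.data.go.kr/data/15069773/fileData.do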

Alerts

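선발년도 (selection year) is highly overall correlated with 선발인원수 (number selected) [High correlation]
선발인원수 (number selected) is highly overall correlated with 선발년도 (selection year) and 1 other field [High correlation]
수학종료인원수 (number who completed studies) is highly overall correlated with 선발인원수 (number selected) [High correlation]
수학종료인원수 (number who completed studies) has 14 (7.3%) zeros [Zeros]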

Reproduction

Analysis started: 2023-12-12 18:19:53.780701
Analysis finished: 2023-12-12 18:19:55.047148
Duration: 1.27 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
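
A report of this kind can be regenerated from the source CSV. The sketch below is not the configuration used for this run (that is recorded in config.json); the function name, the report title, and the CP949 default encoding are assumptions made for illustration.

```python
def build_report(csv_path: str, html_path: str, encoding: str = "cp949") -> None:
    """Profile a CSV file and write the result as an HTML report."""
    # Imported lazily so this sketch can be loaded even where the
    # libraries are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the downloaded file is CP949-encoded; pass
    # encoding="utf-8" if it is not.
    df = pd.read_csv(csv_path, encoding=encoding)

    # The title is a hypothetical placeholder, not taken from the report.
    report = ProfileReport(df, title="Scholarship completion profile")
    report.to_file(html_path)
```

Calling `build_report("input.csv", "report.html")` with a locally downloaded copy of the file writes the HTML report next to it; both file names are placeholders.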

Variables

선발년도 (selection year)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 56
Distinct (%): 29.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1997.8906
Minimum: 1967
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 KiB

Quantile statistics

Minimum: 1967
5-th percentile: 1972.55
Q1: 1984
Median: 1999
Q3: 2012
95-th percentile: 2020.45
Maximum: 2022
Range: 55
Interquartile range (IQR): 28

Descriptive statistics

Standard deviation: 15.877592
Coefficient of variation (CV): 0.0079471777
Kurtosis: -1.1623964
Mean: 1997.8906
Median Absolute Deviation (MAD): 14
Skewness: -0.1767374
Sum: 383595
Variance: 252.09792
Monotonicity: Increasing
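
Several of the figures above are linked by simple identities. The sketch below recomputes the mean, CV, variance, IQR, and range for 선발년도 from the other reported values; it is a consistency check on rounded figures, not the profiler's own computation.

```python
# Values copied from the statistics reported for 선발년도.
n = 192
total = 383595                   # Sum
std = 15.877592                  # Standard deviation
q1, q3 = 1984, 2012              # Q1, Q3
minimum, maximum = 1967, 2022

mean = total / n                 # Mean = Sum / number of observations
cv = std / mean                  # CV = standard deviation / mean
variance = std ** 2              # Variance = standard deviation squared
iqr = q3 - q1                    # Interquartile range
value_range = maximum - minimum  # Range

print(f"mean={mean:.4f} cv={cv:.10f} variance={variance:.4f}")
print(f"iqr={iqr} range={value_range}")
```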
Histogram with fixed size bins (bins=50)
Common values

Value               Count   Frequency (%)
2020                6       3.1%
2006                6       3.1%
2017                5       2.6%
2022                5       2.6%
2021                5       2.6%
2019                5       2.6%
2018                5       2.6%
1993                5       2.6%
2015                5       2.6%
1980                5       2.6%
Other values (46)   140     72.9%

Minimum 10 values

Value   Count   Frequency (%)
1967    2       1.0%
1968    2       1.0%
1969    2       1.0%
1970    1       0.5%
1971    1       0.5%
1972    2       1.0%
1973    2       1.0%
1974    3       1.6%
1975    4       2.1%
1976    4       2.1%

Maximum 10 values

Value   Count   Frequency (%)
2022    5       2.6%
2021    5       2.6%
2020    6       3.1%
2019    5       2.6%
2018    5       2.6%
2017    5       2.6%
2016    4       2.1%
2015    5       2.6%
2014    3       1.6%
2013    3       1.6%

초청과정 (invited program)
Categorical

Distinct: 8
Distinct (%): 4.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 KiB

Value              Count
석사               51
박사               47
연구               38
학사               23
연수               21
Other values (3)   12

Length

Max length: 4
Median length: 2
Mean length: 2.1145833
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 석사
2nd row: 연구
3rd row: 연구
4th row: 연수
5th row: 석사

Common Values

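Value                                        Count   Frequency (%)
석사 (master's)                              51      26.6%
박사 (doctoral)                              47      24.5%
연구 (research)                              38      19.8%
학사 (bachelor's)                            23      12.0%
연수 (training)                              21      10.9%
전문학사 (associate degree)                  8       4.2%
석박사 (combined master's and doctoral)      2       1.0%
어학연수 (language training)                 2       1.0%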
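
The Frequency (%) column and the Length statistics reported for this variable can both be recomputed from the counts alone. A minimal sketch, assuming that category length means the number of characters in the label:

```python
import statistics

# Counts copied from the Common Values table for 초청과정.
counts = {
    "석사": 51, "박사": 47, "연구": 38, "학사": 23,
    "연수": 21, "전문학사": 8, "석박사": 2, "어학연수": 2,
}
n = sum(counts.values())  # 192 observations

# Frequency (%) = count / number of observations, to one decimal place.
frequency = {label: round(100 * c / n, 1) for label, c in counts.items()}

# One length per observation (assumption: length = character count).
lengths = [len(label) for label, c in counts.items() for _ in range(c)]
mean_length = sum(lengths) / n
median_length = statistics.median(lengths)

for label, pct in frequency.items():
    print(f"{label}: {pct}%")
print(f"mean={mean_length:.7f} median={median_length} "
      f"min={min(lengths)} max={max(lengths)}")
```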

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of the category frequencies; the plotted counts and percentages repeat the Common Values table above.]

선발인원수 (number selected)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 81
Distinct (%): 42.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 71.942708
Minimum: 1
Maximum: 937
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 7
Q3: 51.25
95-th percentile: 408.05
Maximum: 937
Range: 936
Interquartile range (IQR): 48.25

Descriptive statistics

Standard deviation: 157.67546
Coefficient of variation (CV): 2.1916809
Kurtosis: 13.475455
Mean: 71.942708
Median Absolute Deviation (MAD): 6
Skewness: 3.493128
Sum: 13813
Variance: 24861.552
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value               Count   Frequency (%)
1                   26      13.5%
2                   18      9.4%
4                   17      8.9%
3                   15      7.8%
6                   11      5.7%
5                   7       3.6%
8                   5       2.6%
13                  4       2.1%
11                  3       1.6%
9                   3       1.6%
Other values (71)   83      43.2%

Minimum 10 values

Value   Count   Frequency (%)
1       26      13.5%
2       18      9.4%
3       15      7.8%
4       17      8.9%
5       7       3.6%
6       11      5.7%
7       3       1.6%
8       5       2.6%
9       3       1.6%
10      1       0.5%

Maximum 10 values

Value   Count   Frequency (%)
937     1       0.5%
929     1       0.5%
870     1       0.5%
615     1       0.5%
569     1       0.5%
554     1       0.5%
547     1       0.5%
545     1       0.5%
506     1       0.5%
457     1       0.5%

수학종료인원수 (number who completed studies)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 67
Distinct (%): 34.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 43.229167
Minimum: 0
Maximum: 548
Zeros: 14
Zeros (%): 7.3%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
Median: 5
Q3: 28.5
95-th percentile: 235.85
Maximum: 548
Range: 548
Interquartile range (IQR): 26.5

Descriptive statistics

Standard deviation: 99.561918
Coefficient of variation (CV): 2.3031191
Kurtosis: 12.273198
Mean: 43.229167
Median Absolute Deviation (MAD): 4
Skewness: 3.4822128
Sum: 8300
Variance: 9912.5755
Monotonicity: Not monotonic
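
The report does not state an overall completion ratio, but one can be derived from the two Sum values, alongside the zero share that triggers the Zeros alert. The sketch below is an illustration only: rows for recently selected cohorts, whose students may still be enrolled, pull the ratio down, so it should not be read as a completion rate per cohort.

```python
# Values copied from the report.
sum_selected = 13813   # Sum of 선발인원수
sum_completed = 8300   # Sum of 수학종료인원수
n_observations = 192
n_zero_completed = 14  # Zeros in 수학종료인원수

# Derived figure, not reported by the profiler.
completion_ratio = sum_completed / sum_selected

# Share of rows with zero completions, as in the Zeros alert.
zero_share_pct = round(100 * n_zero_completed / n_observations, 1)

print(f"overall completion ratio: {completion_ratio:.1%}")
print(f"rows with zero completions: {zero_share_pct}%")
```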
Histogram with fixed size bins (bins=50)
Common values

Value               Count   Frequency (%)
1                   25      13.0%
2                   19      9.9%
3                   17      8.9%
4                   15      7.8%
0                   14      7.3%
6                   10      5.2%
5                   7       3.6%
8                   5       2.6%
13                  5       2.6%
30                  4       2.1%
Other values (57)   71      37.0%

Minimum 10 values

Value   Count   Frequency (%)
0       14      7.3%
1       25      13.0%
2       19      9.9%
3       17      8.9%
4       15      7.8%
5       7       3.6%
6       10      5.2%
7       2       1.0%
8       5       2.6%
9       3       1.6%

Maximum 10 values

Value   Count   Frequency (%)
548     1       0.5%
501     1       0.5%
495     1       0.5%
484     1       0.5%
458     1       0.5%
450     1       0.5%
396     1       0.5%
341     1       0.5%
320     1       0.5%
243     1       0.5%

Interactions

[Figures: nine interaction plots; the images are not recoverable.]

Correlations

                  선발년도   초청과정   선발인원수   수학종료인원수
선발년도          1.000      0.297      0.476        0.536
초청과정          0.297      1.000      0.542        0.138
선발인원수        0.476      0.542      1.000        0.932
수학종료인원수    0.536      0.138      0.932        1.000

                  선발년도   선발인원수   수학종료인원수   초청과정
선발년도          1.000      0.699        0.323            0.149
선발인원수        0.699      1.000        0.708            0.207
수학종료인원수    0.323      0.708        1.000            0.064
초청과정          0.149      0.207        0.064            1.000
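
The High correlation alerts near the top of the report are consistent with thresholding the second matrix above. The sketch below reproduces them under the assumption of a 0.5 cut-off on the absolute coefficient; the threshold is an assumption and is not stated in the report.

```python
from itertools import combinations

# Second correlation matrix, copied from the report.
columns = ["선발년도", "선발인원수", "수학종료인원수", "초청과정"]
matrix = [
    [1.000, 0.699, 0.323, 0.149],
    [0.699, 1.000, 0.708, 0.207],
    [0.323, 0.708, 1.000, 0.064],
    [0.149, 0.207, 0.064, 1.000],
]

THRESHOLD = 0.5  # assumption: not taken from the report

# For each column, collect the other columns it is highly correlated with.
flagged = {name: [] for name in columns}
for i, j in combinations(range(len(columns)), 2):
    if abs(matrix[i][j]) >= THRESHOLD:
        flagged[columns[i]].append(columns[j])
        flagged[columns[j]].append(columns[i])

for name, others in flagged.items():
    if others:
        print(f"{name} is highly correlated with {', '.join(others)}")
```

Under this assumption, 선발인원수 is flagged against two fields and 초청과정 against none, which matches the alert list.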

Missing values

[Figure: a simple visualization of nullity by column.]
[Figure: nullity matrix, a data-dense display for quickly picking out patterns in data completion.]

Sample

First rows

      선발년도   초청과정   선발인원수   수학종료인원수
0     1967       석사       6            6
1     1967       연구       1            1
2     1968       연구       1            1
3     1968       연수       1            1
4     1969       석사       3            3
5     1969       연구       5            5
6     1970       연구       1            1
7     1971       연구       3            3
8     1972       석사       3            3
9     1972       연구       2            2

Last rows

      선발년도   초청과정   선발인원수   수학종료인원수
182   2021       박사       191          0
183   2021       석사       937          0
184   2021       연구       4            3
185   2021       전문학사   35           0
186   2021       학사       184          0
187   2022       박사       159          0
188   2022       석사       929          0
189   2022       연구       6            0
190   2022       전문학사   50           0
191   2022       학사       266          0