Dataset statistics
Number of variables | 10 |
---|---|
Number of observations | 10000 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 918.0 KiB |
Average record size in memory | 94.0 B |
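The average record size follows from the total memory divided by the number of observations. A quick check, assuming the report uses binary units (1 KiB = 1024 bytes); the variable names are illustrative and not part of the report:

```python
# Values copied from the "Dataset statistics" table above.
total_size_kib = 918.0
n_observations = 10_000

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")  # 94.0 B
```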
Variable types
Categorical | 7 |
---|---|
Numeric | 3 |
Dataset
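Description | Provides information for analyzing the performance of candidates in the national medical licensing examination (year, occupation, exam round, serial number, subject name, score per subject, total score, pass/fail status, sex, age group). |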
---|---|
URL | https://www.data.go.kr/data/15060446/fileData.do |
Alerts
직종 has constant value "의사" | Constant |
회차 is highly overall correlated with 일련번호 and 2 other fields | High correlation |
연도 is highly overall correlated with 일련번호 and 2 other fields | High correlation |
일련번호 is highly overall correlated with 연도 and 1 other fields | High correlation |
과목별점수 is highly overall correlated with 과목명 | High correlation |
총점 is highly overall correlated with 합격여부 | High correlation |
과목명 is highly overall correlated with 과목별점수 and 2 other fields | High correlation |
합격여부 is highly overall correlated with 총점 | High correlation |
합격여부 is highly imbalanced (69.5%) | Imbalance |
연령대 is highly imbalanced (68.2%) | Imbalance |
과목별점수 has 136 (1.4%) zeros | Zeros |
총점 has 131 (1.3%) zeros | Zeros |
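The report does not state how the imbalance percentages are computed. One reading that matches both figures is one minus the Shannon entropy of the class frequencies, normalized by the maximum entropy for that number of classes. The sketch below works under that assumption, using the class counts listed under Common Values for 합격여부 and 연령대; `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (bits) of the class frequencies."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

pass_status_counts = [8761, 1108, 104, 27]   # 합격여부
age_group_counts = [8279, 1587, 120, 12, 2]  # 연령대

print(f"{imbalance_score(pass_status_counts):.1%}")
print(f"{imbalance_score(age_group_counts):.1%}")
```

A score of 0% means perfectly balanced classes, and values near 100% mean almost all rows fall in a single class.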
Reproduction
Analysis started | 2023-12-12 17:09:53.131164 |
---|---|
Analysis finished | 2023-12-12 17:09:55.748675 |
Duration | 2.62 seconds |
Software version | ydata-profiling v4.5.1 |
연도
Categorical
HIGH CORRELATION
 
Distinct | 3 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 4 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 2003 |
---|---|
2nd row | 2003 |
3rd row | 2003 |
4th row | 2001 |
5th row | 2001 |
Common Values
Value | Count | Frequency (%) |
2003 | 3830 | 38.3% |
2002 | 3696 | 37.0% |
2001 | 2474 | 24.7% |
직종
Categorical
CONSTANT
 
Distinct | 1 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 2 |
---|---|
Median length | 2 |
Mean length | 2 |
Min length | 2 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 의사 |
---|---|
2nd row | 의사 |
3rd row | 의사 |
4th row | 의사 |
5th row | 의사 |
Common Values
Value | Count | Frequency (%) |
의사 | 10000 | 100.0% |
회차
Categorical
HIGH CORRELATION
 
Distinct | 3 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 2 |
---|---|
Median length | 2 |
Mean length | 2 |
Min length | 2 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 67 |
---|---|
2nd row | 67 |
3rd row | 67 |
4th row | 65 |
5th row | 65 |
Common Values
Value | Count | Frequency (%) |
67 | 3830 | 38.3% |
66 | 3696 | 37.0% |
65 | 2474 | 24.7% |
일련번호
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 6512 |
---|---|
Distinct (%) | 65.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 5566.0937 |
Minimum | 1 |
---|---|
Maximum | 10192 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 655.95 |
Q1 | 3321.5 |
median | 5718.5 |
Q3 | 8089.5 |
95-th percentile | 9774.15 |
Maximum | 10192 |
Range | 10191 |
Interquartile range (IQR) | 4768 |
Descriptive statistics
Standard deviation | 2884.4749 |
---|---|
Coefficient of variation (CV) | 0.51822248 |
Kurtosis | -1.1068471 |
Mean | 5566.0937 |
Median Absolute Deviation (MAD) | 2383.5 |
Skewness | -0.1927618 |
Sum | 55660937 |
Variance | 8320195.3 |
Monotonicity | Not monotonic |
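Several of the figures above are derived from others in the same tables. As a minimal consistency check, the range, IQR, coefficient of variation, and variance can be recomputed from the reported minimum, maximum, quartiles, mean, and standard deviation. The inputs are the rounded values shown in the report, so the last digits may differ slightly.

```python
# Values copied from the 일련번호 tables above.
minimum, maximum = 1, 10192
q1, q3 = 3321.5, 8089.5
mean, std = 5566.0937, 2884.4749

value_range = maximum - minimum  # 10191
iqr = q3 - q1                    # 4768.0
cv = std / mean                  # coefficient of variation
variance = std ** 2              # square of the standard deviation

print(value_range, iqr, cv, variance)
```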
Common Values
Value | Count | Frequency (%) |
7088 | 6 | 0.1% |
8496 | 6 | 0.1% |
10046 | 5 | 0.1% |
7052 | 5 | 0.1% |
3575 | 5 | 0.1% |
9299 | 5 | 0.1% |
9980 | 5 | 0.1% |
8854 | 5 | 0.1% |
6960 | 5 | 0.1% |
9086 | 5 | 0.1% |
Other values (6502) | 9948 | 99.5% |
Minimum 10 values
Value | Count | Frequency (%) |
1 | 1 | < 0.1% |
3 | 2 | < 0.1% |
7 | 2 | < 0.1% |
8 | 1 | < 0.1% |
10 | 1 | < 0.1% |
13 | 1 | < 0.1% |
14 | 1 | < 0.1% |
15 | 1 | < 0.1% |
16 | 1 | < 0.1% |
17 | 1 | < 0.1% |
Maximum 10 values
Value | Count | Frequency (%) |
10192 | 2 | < 0.1% |
10191 | 3 | < 0.1% |
10190 | 4 | < 0.1% |
10188 | 1 | < 0.1% |
10187 | 1 | < 0.1% |
10186 | 2 | < 0.1% |
10185 | 2 | < 0.1% |
10181 | 1 | < 0.1% |
10180 | 3 | < 0.1% |
10179 | 1 | < 0.1% |
과목명
Categorical
HIGH CORRELATION
 
Distinct | 20 |
---|---|
Distinct (%) | 0.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 9 |
---|---|
Median length | 5 |
Mean length | 5.6481 |
Min length | 2 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 의학각론1(R형) |
---|---|
2nd row | 의학총론2(R형) |
3rd row | 의학총론2(R형) |
4th row | 예방의학 |
5th row | 산부인과 |
Common Values
Value | Count | Frequency (%) |
보건의약관계 법규 | 1142 | 11.4% |
의학각론2 | 739 | 7.4% |
의학각론4 | 724 | 7.2% |
의학각론3 | 712 | 7.1% |
의학각론5 | 704 | 7.0% |
의학총론2 | 699 | 7.0% |
의학총론1 | 699 | 7.0% |
의학각론2(R형) | 375 | 3.8% |
의학각론1 | 372 | 3.7% |
의학각론1(R형) | 362 | 3.6% |
Other values (10) | 3472 | 34.7% |
Words
Value | Count | Frequency (%) |
보건의약관계 | 1142 | 10.2% |
법규 | 1142 | 10.2% |
의학각론2 | 739 | 6.6% |
의학각론4 | 724 | 6.5% |
의학각론3 | 712 | 6.4% |
의학각론5 | 704 | 6.3% |
의학총론2 | 699 | 6.3% |
의학총론1 | 699 | 6.3% |
의학각론2(r형 | 375 | 3.4% |
의학각론1 | 372 | 3.3% |
Other values (11) | 3834 | 34.4% |
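The table above appears to count word tokens rather than whole values: 보건의약관계 and 법규 each occur 1142 times, and 의학각론2(R형) shows up as 의학각론2(r형. The sketch below reproduces that pattern by lowercasing each value, splitting on whitespace, and stripping ASCII punctuation from both ends of each token. This is an assumption about the tokenization inferred from the table, not a statement of the report's implementation, and `word_counts` is a hypothetical helper.

```python
import string
from collections import Counter

def word_counts(value_counts):
    """Count lowercased, whitespace-split, punctuation-stripped tokens, weighted by value frequency."""
    words = Counter()
    for value, count in value_counts.items():
        for token in value.lower().split():
            token = token.strip(string.punctuation)
            if token:
                words[token] += count
    return words

# Three of the values listed under Common Values for 과목명.
sample = {"보건의약관계 법규": 1142, "의학각론2": 739, "의학각론2(R형)": 375}
counts = word_counts(sample)
print(counts["보건의약관계"], counts["법규"], counts["의학각론2(r형"])  # 1142 1142 375
```

Because one value can contribute several tokens, the percentages in the token table are taken over the total number of tokens, not the 10000 rows.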
과목별점수
Real number (ℝ)
HIGH CORRELATION
  ZEROS
 
Distinct | 139 |
---|---|
Distinct (%) | 1.4% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 33.1104 |
Minimum | 0 |
---|---|
Maximum | 145 |
Zeros | 136 |
Zeros (%) | 1.4% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 4 |
Q1 | 16 |
median | 34 |
Q3 | 46 |
95-th percentile | 57 |
Maximum | 145 |
Range | 145 |
Interquartile range (IQR) | 30 |
Descriptive statistics
Standard deviation | 21.750587 |
---|---|
Coefficient of variation (CV) | 0.65691102 |
Kurtosis | 3.2906015 |
Mean | 33.1104 |
Median Absolute Deviation (MAD) | 13 |
Skewness | 1.1590389 |
Sum | 331104 |
Variance | 473.08802 |
Monotonicity | Not monotonic |
Common Values
Value | Count | Frequency (%) |
46.0 | 281 | 2.8% |
45.0 | 280 | 2.8% |
44.0 | 279 | 2.8% |
43.0 | 262 | 2.6% |
42.0 | 261 | 2.6% |
8.0 | 258 | 2.6% |
48.0 | 258 | 2.6% |
47.0 | 256 | 2.6% |
41.0 | 243 | 2.4% |
7.0 | 242 | 2.4% |
Other values (129) | 7380 | 73.8% |
Minimum 10 values
Value | Count | Frequency (%) |
0.0 | 136 | 1.4% |
1.0 | 23 | 0.2% |
2.0 | 133 | 1.3% |
3.0 | 165 | 1.7% |
4.0 | 116 | 1.2% |
4.5 | 10 | 0.1% |
5.0 | 127 | 1.3% |
5.5 | 14 | 0.1% |
6.0 | 180 | 1.8% |
6.5 | 83 | 0.8% |
Maximum 10 values
Value | Count | Frequency (%) |
145.0 | 1 | < 0.1% |
138.0 | 1 | < 0.1% |
137.0 | 1 | < 0.1% |
135.0 | 1 | < 0.1% |
134.0 | 2 | < 0.1% |
133.0 | 3 | < 0.1% |
132.0 | 1 | < 0.1% |
131.0 | 2 | < 0.1% |
130.0 | 2 | < 0.1% |
129.0 | 3 | < 0.1% |
총점
Real number (ℝ)
HIGH CORRELATION
  ZEROS
 
Distinct | 449 |
---|---|
Distinct (%) | 4.5% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 314.94945 |
Minimum | 0 |
---|---|
Maximum | 430.5 |
Zeros | 131 |
Zeros (%) | 1.3% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 246.5 |
Q1 | 294 |
median | 321 |
Q3 | 344.5 |
95-th percentile | 377.5 |
Maximum | 430.5 |
Range | 430.5 |
Interquartile range (IQR) | 50.5 |
Descriptive statistics
Standard deviation | 52.955226 |
---|---|
Coefficient of variation (CV) | 0.16813881 |
Kurtosis | 14.665391 |
Mean | 314.94945 |
Median Absolute Deviation (MAD) | 25.5 |
Skewness | -2.8533447 |
Sum | 3149494.5 |
Variance | 2804.256 |
Monotonicity | Not monotonic |
Common Values
Value | Count | Frequency (%) |
0.0 | 131 | 1.3% |
316.0 | 76 | 0.8% |
322.5 | 73 | 0.7% |
335.0 | 73 | 0.7% |
319.5 | 72 | 0.7% |
337.0 | 72 | 0.7% |
339.5 | 72 | 0.7% |
333.5 | 71 | 0.7% |
313.5 | 70 | 0.7% |
330.5 | 69 | 0.7% |
Other values (439) | 9221 | 92.2% |
Minimum 10 values
Value | Count | Frequency (%) |
0.0 | 131 | 1.3% |
19.0 | 1 | < 0.1% |
23.0 | 1 | < 0.1% |
55.0 | 1 | < 0.1% |
84.5 | 2 | < 0.1% |
122.5 | 2 | < 0.1% |
138.0 | 1 | < 0.1% |
140.0 | 2 | < 0.1% |
157.5 | 1 | < 0.1% |
162.0 | 2 | < 0.1% |
Maximum 10 values
Value | Count | Frequency (%) |
430.5 | 1 | < 0.1% |
426.0 | 1 | < 0.1% |
424.0 | 1 | < 0.1% |
421.5 | 2 | < 0.1% |
421.0 | 3 | < 0.1% |
420.0 | 1 | < 0.1% |
419.5 | 3 | < 0.1% |
418.5 | 1 | < 0.1% |
417.5 | 2 | < 0.1% |
415.5 | 3 | < 0.1% |
합격여부
Categorical
HIGH CORRELATION
  IMBALANCE
 
Distinct | 4 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 4 |
---|---|
Median length | 2 |
Mean length | 2.1162 |
Min length | 2 |
Unique
Unique | 0 ? |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 합격 |
---|---|
2nd row | 합격 |
3rd row | 합격 |
4th row | 합격 |
5th row | 합격 |
Common Values
Value | Count | Frequency (%) |
합격 | 8761 | 87.6% |
불합격 | 1108 | 11.1% |
결시 | 104 | 1.0% |
응시결격 | 27 | 0.3% |
성별
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 1 |
---|---|
Median length | 1 |
Mean length | 1 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 남 |
---|---|
2nd row | 여 |
3rd row | 남 |
4th row | 남 |
5th row | 남 |
Common Values
Value | Count | Frequency (%) |
남 | 7295 | 73.0% |
여 | 2705 | 27.1% |
연령대
Categorical
IMBALANCE
 
Distinct | 5 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 2 |
---|---|
Median length | 2 |
Mean length | 2 |
Min length | 2 |
Unique
Unique | 0 ? |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 20 |
---|---|
2nd row | 20 |
3rd row | 20 |
4th row | 20 |
5th row | 20 |
Common Values
Value | Count | Frequency (%) |
20 | 8279 | 82.8% |
30 | 1587 | 15.9% |
40 | 120 | 1.2% |
50 | 12 | 0.1% |
60 | 2 | < 0.1% |
Variable | 연도 | 회차 | 일련번호 | 과목명 | 과목별점수 | 총점 | 합격여부 | 성별 | 연령대 |
---|---|---|---|---|---|---|---|---|---|
연도 | 1.000 | 1.000 | 0.952 | 0.899 | 0.373 | 0.446 | 0.090 | 0.000 | 0.013 |
회차 | 1.000 | 1.000 | 0.952 | 0.899 | 0.373 | 0.446 | 0.090 | 0.000 | 0.013 |
일련번호 | 0.952 | 0.952 | 1.000 | 0.759 | 0.346 | 0.443 | 0.158 | 0.074 | 0.270 |
과목명 | 0.899 | 0.899 | 0.759 | 1.000 | 0.905 | 0.323 | 0.097 | 0.039 | 0.000 |
과목별점수 | 0.373 | 0.373 | 0.346 | 0.905 | 1.000 | 0.439 | 0.302 | 0.118 | 0.171 |
총점 | 0.446 | 0.446 | 0.443 | 0.323 | 0.439 | 1.000 | 0.870 | 0.294 | 0.495 |
합격여부 | 0.090 | 0.090 | 0.158 | 0.097 | 0.302 | 0.870 | 1.000 | 0.196 | 0.275 |
성별 | 0.000 | 0.000 | 0.074 | 0.039 | 0.118 | 0.294 | 0.196 | 1.000 | 0.093 |
연령대 | 0.013 | 0.013 | 0.270 | 0.000 | 0.171 | 0.495 | 0.275 | 0.093 | 1.000 |
Variable | 성별 | 과목명 | 연령대 | 회차 | 합격여부 | 연도 |
---|---|---|---|---|---|---|
성별 | 1.000 | 0.031 | 0.114 | 0.000 | 0.130 | 0.000 |
과목명 | 0.031 | 1.000 | 0.000 | 0.772 | 0.046 | 0.772 |
연령대 | 0.114 | 0.000 | 1.000 | 0.009 | 0.228 | 0.009 |
회차 | 0.000 | 0.772 | 0.009 | 1.000 | 0.085 | 1.000 |
합격여부 | 0.130 | 0.046 | 0.228 | 0.085 | 1.000 | 0.085 |
연도 | 0.000 | 0.772 | 0.009 | 1.000 | 0.085 | 1.000 |
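The 1.000 between 연도 and 회차 in the matrix above is what a perfect one-to-one mapping between two categorical variables produces, and it fits the identical counts (3830, 3696, 2474) reported for both. The report does not name the statistic used for this matrix. The sketch below assumes Cramér's V, a common association measure for categorical pairs, in its plain form without bias correction. The cross-tabulation is hypothetical: it assumes every 2001 row belongs to round 65, every 2002 row to round 66, and every 2003 row to round 67.

```python
import math

def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals))
    return math.sqrt(chi2 / (n * (k - 1)))

# Hypothetical 연도 x 회차 table (rows: 2001, 2002, 2003; columns: 65, 66, 67).
year_by_round = [[2474, 0, 0], [0, 3696, 0], [0, 0, 3830]]
print(round(cramers_v(year_by_round), 3))  # 1.0
```

A value of 0 indicates no association, so the near-zero entries for 성별 against 연도 and 회차 suggest the sex split is essentially the same in every exam round.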
Variable | 일련번호 | 과목별점수 | 총점 | 연도 | 회차 | 과목명 | 합격여부 | 성별 | 연령대 |
---|---|---|---|---|---|---|---|---|---|
일련번호 | 1.000 | -0.102 | 0.368 | 0.946 | 0.946 | 0.346 | 0.095 | 0.057 | 0.115 |
과목별점수 | -0.102 | 1.000 | 0.175 | 0.240 | 0.240 | 0.541 | 0.188 | 0.086 | 0.071 |
총점 | 0.368 | 0.175 | 1.000 | 0.300 | 0.300 | 0.107 | 0.730 | 0.225 | 0.227 |
연도 | 0.946 | 0.240 | 0.300 | 1.000 | 1.000 | 0.772 | 0.085 | 0.000 | 0.009 |
회차 | 0.946 | 0.240 | 0.300 | 1.000 | 1.000 | 0.772 | 0.085 | 0.000 | 0.009 |
과목명 | 0.346 | 0.541 | 0.107 | 0.772 | 0.772 | 1.000 | 0.046 | 0.031 | 0.000 |
합격여부 | 0.095 | 0.188 | 0.730 | 0.085 | 0.085 | 0.046 | 1.000 | 0.130 | 0.228 |
성별 | 0.057 | 0.086 | 0.225 | 0.000 | 0.000 | 0.031 | 0.130 | 1.000 | 0.114 |
연령대 | 0.115 | 0.071 | 0.227 | 0.009 | 0.009 | 0.000 | 0.228 | 0.114 | 1.000 |
Index | 연도 | 직종 | 회차 | 일련번호 | 과목명 | 과목별점수 | 총점 | 합격여부 | 성별 | 연령대 |
---|---|---|---|---|---|---|---|---|---|---|
79430 | 2003 | 의사 | 67 | 8751 | 의학각론1(R형) | 13.0 | 378.5 | 합격 | 남 | 20 |
91495 | 2003 | 의사 | 67 | 9847 | 의학총론2(R형) | 5.0 | 330.0 | 합격 | 여 | 20 |
60981 | 2003 | 의사 | 67 | 7073 | 의학총론2(R형) | 5.0 | 321.0 | 합격 | 남 | 20 |
9888 | 2001 | 의사 | 65 | 1413 | 예방의학 | 33.0 | 314.0 | 합격 | 남 | 20 |
10873 | 2001 | 의사 | 65 | 1554 | 산부인과 | 47.0 | 330.0 | 합격 | 남 | 20 |
10017 | 2001 | 의사 | 65 | 1432 | 내과학 | 119.0 | 300.5 | 합격 | 남 | 20 |
75001 | 2003 | 의사 | 67 | 8348 | 의학각론5 | 49.0 | 348.5 | 합격 | 여 | 20 |
9626 | 2001 | 의사 | 65 | 1376 | 보건의약관계 법규 | 8.0 | 325.0 | 합격 | 여 | 20 |
34148 | 2002 | 의사 | 66 | 4406 | 의학각론3 | 48.0 | 309.0 | 합격 | 남 | 20 |
22187 | 2001 | 의사 | 65 | 3170 | 예방의학 | 28.0 | 328.0 | 합격 | 여 | 20 |
Index | 연도 | 직종 | 회차 | 일련번호 | 과목명 | 과목별점수 | 총점 | 합격여부 | 성별 | 연령대 |
---|---|---|---|---|---|---|---|---|---|---|
74442 | 2003 | 의사 | 67 | 8297 | 의학각론7 | 37.0 | 326.5 | 합격 | 남 | 20 |
26052 | 2002 | 의사 | 66 | 3596 | 의학총론2 | 22.0 | 318.0 | 합격 | 남 | 20 |
66143 | 2003 | 의사 | 67 | 7543 | 의학각론2 | 24.0 | 327.5 | 합격 | 남 | 20 |
66609 | 2003 | 의사 | 67 | 7585 | 의학각론6 | 29.0 | 214.5 | 불합격 | 남 | 30 |
80202 | 2003 | 의사 | 67 | 8821 | 의학각론3 | 34.0 | 305.0 | 합격 | 남 | 20 |
23814 | 2002 | 의사 | 66 | 3373 | 보건의약관계 법규 | 5.5 | 296.5 | 합격 | 남 | 20 |
53031 | 2002 | 의사 | 66 | 6294 | 의학총론1 | 40.0 | 315.5 | 합격 | 남 | 20 |
26791 | 2002 | 의사 | 66 | 3670 | 의학총론1 | 37.0 | 338.5 | 합격 | 여 | 30 |
25206 | 2002 | 의사 | 66 | 3512 | 의학각론2 | 27.0 | 316.0 | 합격 | 여 | 20 |
54368 | 2002 | 의사 | 66 | 6428 | 의학각론3 | 45.0 | 332.5 | 합격 | 여 | 20 |