Overview

Dataset statistics

Number of variables: 10
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 918.0 KiB
Average record size in memory: 94.0 B
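
As a quick consistency check on the two memory figures, the average record size should equal the total size divided by the number of observations. The sketch below assumes 1 KiB = 1024 bytes; the function name is illustrative and not part of any library.

```python
def average_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Average bytes per row, assuming 1 KiB = 1024 bytes."""
    return total_kib * 1024 / n_rows


avg = average_record_size_bytes(918.0, 10000)
print(f"{avg:.1f} B")  # prints "94.0 B"
```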

Variable types

Categorical: 6
Numeric: 4

Dataset

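Description: Provides information for analyzing the score results of candidates in the national pharmacist licensing examination: year (연도), occupation type (직종), exam round (회차), serial number (일련번호), subject name (과목명), score per subject (과목별점수), total score (총점), pass/fail result (합격여부), sex (성별), and age group (연령대).
URL: https://www.data.go.kr/data/15083505/fileData.do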

Alerts

직종 has constant value "약사(4년제)" (Constant)
회차 is highly overall correlated with 일련번호 and 1 other field (High correlation)
일련번호 is highly overall correlated with 회차 and 1 other field (High correlation)
과목별점수 is highly overall correlated with 총점 and 1 other field (High correlation)
총점 is highly overall correlated with 과목별점수 and 1 other field (High correlation)
연도 is highly overall correlated with 회차 and 1 other field (High correlation)
합격여부 is highly overall correlated with 과목별점수 and 1 other field (High correlation)
연령대 is highly imbalanced (60.8%) (Imbalance)
과목별점수 has 1562 (15.6%) zeros (Zeros)
총점 has 1546 (15.5%) zeros (Zeros)
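
The report does not state how the 60.8% imbalance figure for 연령대 is derived. One definition that reproduces it is one minus the Shannon entropy of the class frequencies divided by the maximum possible entropy for that number of classes. The sketch below is written under that assumption (it is not taken from the tool's documentation), and the function name is illustrative.

```python
import math

# Category counts for 연령대, copied from that variable's frequency table.
counts = {"20": 7909, "30": 1744, "40": 260, "50": 72, "60": 15}


def imbalance_score(class_counts: dict) -> float:
    """Return 1 - H / log(k) for k >= 2 classes (0 means perfectly balanced)."""
    k = len(class_counts)
    if k < 2:
        raise ValueError("imbalance is only defined here for at least two classes")
    total = sum(class_counts.values())
    # Shannon entropy of the observed class frequencies (natural log).
    entropy = -sum(
        (c / total) * math.log(c / total) for c in class_counts.values() if c > 0
    )
    return 1.0 - entropy / math.log(k)


score = imbalance_score(counts)
print(f"{score:.1%}")  # prints "60.8%"
```

Because the entropy is divided by log(k), the base of the logarithm cancels out, so natural log and log base 2 give the same score.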

Reproduction

Analysis started: 2023-12-12 06:39:45.072894
Analysis finished: 2023-12-12 06:39:48.024174
Duration: 2.95 seconds
Software version: ydata-profiling v4.5.1
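
A report of this kind can be regenerated along the following lines. This is a minimal sketch, not the configuration actually used for this run: the file paths, the report title and the default encoding are hypothetical, and only `pandas.read_csv`, `ProfileReport` and `ProfileReport.to_file` are real library calls. The second helper simply rechecks the reported duration from the two timestamps.

```python
from datetime import datetime


def build_report(csv_path: str, html_path: str, encoding: str = "utf-8") -> None:
    """Profile a CSV file and write an HTML report (paths are hypothetical)."""
    # Imported lazily so the timing helper below runs without these packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    ProfileReport(df, title="Profiling report").to_file(html_path)


def duration_seconds(started: str, finished: str) -> float:
    """Elapsed seconds between two timestamps in the report's format."""
    fmt = "%Y-%m-%d %H:%M:%S.%f"
    delta = datetime.strptime(finished, fmt) - datetime.strptime(started, fmt)
    return delta.total_seconds()


elapsed = duration_seconds("2023-12-12 06:39:45.072894", "2023-12-12 06:39:48.024174")
print(f"{elapsed:.2f} seconds")  # prints "2.95 seconds"
```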

Variables

연도
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
2000 | 3694
2001 | 1965
2002 | 1935
2003 | 1887
2004 | 519

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2000
2nd row: 2001
3rd row: 2000
4th row: 2000
5th row: 2000

Common Values

Value | Count | Frequency (%)
2000 | 3694 | 36.9%
2001 | 1965 | 19.7%
2002 | 1935 | 19.4%
2003 | 1887 | 18.9%
2004 | 519 | 5.2%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

직종
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
약사(4년제) | 10000

Length

Max length: 36
Median length: 36
Mean length: 36
Min length: 36

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 약사(4년제)
2nd row: 약사(4년제)
3rd row: 약사(4년제)
4th row: 약사(4년제)
5th row: 약사(4년제)

Common Values

Value | Count | Frequency (%)
약사(4년제) | 10000 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

회차
Real number (ℝ)

HIGH CORRELATION 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 52.1374
Minimum: 50
Maximum: 55
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 50
5-th percentile: 50
Q1: 51
median: 52
Q3: 53
95-th percentile: 55
Maximum: 55
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.5574222
Coefficient of variation (CV): 0.029871497
Kurtosis: -1.1807253
Mean: 52.1374
Median Absolute Deviation (MAD): 1
Skewness: 0.051393876
Sum: 521374
Variance: 2.4255638
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=6)]
Most frequent values

Value | Count | Frequency (%)
50 | 2198 | 22.0%
52 | 1965 | 19.7%
53 | 1935 | 19.4%
54 | 1887 | 18.9%
51 | 1496 | 15.0%
55 | 519 | 5.2%

Smallest values

Value | Count | Frequency (%)
50 | 2198 | 22.0%
51 | 1496 | 15.0%
52 | 1965 | 19.7%
53 | 1935 | 19.4%
54 | 1887 | 18.9%
55 | 519 | 5.2%

Largest values

Value | Count | Frequency (%)
55 | 519 | 5.2%
54 | 1887 | 18.9%
53 | 1935 | 19.4%
52 | 1965 | 19.7%
51 | 1496 | 15.0%
50 | 2198 | 22.0%
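
Because the frequency table for 회차 lists all six distinct values, the summary statistics can be rechecked from the counts alone. The sketch below does that in plain Python. Dividing by n - 1 (sample variance) is an assumption; it is used here because it reproduces the reported variance, whereas dividing by n gives about 2.4253.

```python
import math

# (value, count) pairs for 회차, copied from the frequency table above.
freq = [(50, 2198), (51, 1496), (52, 1965), (53, 1935), (54, 1887), (55, 519)]

n = sum(count for _, count in freq)
total = sum(value * count for value, count in freq)
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1.
variance = sum(count * (value - mean) ** 2 for value, count in freq) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

print(n, total, round(mean, 4))  # prints "10000 521374 52.1374"
print(round(variance, 4), round(std, 4), round(cv, 4))  # prints "2.4256 1.5574 0.0299"
```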

일련번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 5883
Distinct (%): 58.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3962.6133
Minimum: 1
Maximum: 7941
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 392.95
Q1: 1971.5
median: 3959
Q3: 5937.25
95-th percentile: 7556
Maximum: 7941
Range: 7940
Interquartile range (IQR): 3965.75

Descriptive statistics

Standard deviation: 2291.0541
Coefficient of variation (CV): 0.57816747
Kurtosis: -1.1977711
Mean: 3962.6133
Median Absolute Deviation (MAD): 1982
Skewness: 0.0091735263
Sum: 39626133
Variance: 5248928.8
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
Most frequent values

Value | Count | Frequency (%)
4369 | 7 | 0.1%
5992 | 6 | 0.1%
3880 | 6 | 0.1%
6541 | 6 | 0.1%
6763 | 5 | 0.1%
4110 | 5 | 0.1%
1206 | 5 | 0.1%
7475 | 5 | 0.1%
4234 | 5 | 0.1%
5900 | 5 | 0.1%
Other values (5873) | 9945 | 99.5%

Smallest values

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
7 | 1 | < 0.1%
9 | 1 | < 0.1%
11 | 2 | < 0.1%
12 | 3 | < 0.1%
13 | 1 | < 0.1%
14 | 1 | < 0.1%
15 | 3 | < 0.1%

Largest values

Value | Count | Frequency (%)
7941 | 2 | < 0.1%
7940 | 2 | < 0.1%
7939 | 3 | < 0.1%
7938 | 3 | < 0.1%
7937 | 3 | < 0.1%
7936 | 1 | < 0.1%
7935 | 1 | < 0.1%
7933 | 1 | < 0.1%
7931 | 1 | < 0.1%
7929 | 2 | < 0.1%

과목명
Categorical

Distinct: 13
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
정량분석학 | 861
위생화학 | 839
생화학 | 839
약전 | 836
유기약품제조학 | 833
Other values (8) | 5792

Length

Max length: 18
Median length: 6
Mean length: 4.4005
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 정성분석학
2nd row: 무기약품제조학
3rd row: 정량분석학
4th row: 약전
5th row: 약물학

Common Values

Value | Count | Frequency (%)
정량분석학 | 861 | 8.6%
위생화학 | 839 | 8.4%
생화학 | 839 | 8.4%
약전 | 836 | 8.4%
유기약품제조학 | 833 | 8.3%
약물학 | 832 | 8.3%
정성분석학 | 825 | 8.2%
약사관계법규 | 825 | 8.2%
미생물학 | 820 | 8.2%
생약학 | 819 | 8.2%
Other values (3) | 1671 | 16.7%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
정량분석학 | 861 | 8.4%
위생화학 | 839 | 8.2%
생화학 | 839 | 8.2%
약전 | 836 | 8.2%
유기약품제조학 | 833 | 8.1%
약물학 | 832 | 8.1%
정성분석학 | 825 | 8.1%
약사관계법규 | 825 | 8.1%
미생물학 | 820 | 8.0%
생약학 | 819 | 8.0%
Other values (8) | 1916 | 18.7%

과목별점수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 26
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.5092
Minimum: 0
Maximum: 25
Zeros: 1562
Zeros (%): 15.6%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 13
median: 18
Q3: 21
95-th percentile: 24
Maximum: 25
Range: 25
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 7.7098417
Coefficient of variation (CV): 0.49711408
Kurtosis: -0.12992912
Mean: 15.5092
Median Absolute Deviation (MAD): 4
Skewness: -1.0712501
Sum: 155092
Variance: 59.44166
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=26)]
Most frequent values

Value | Count | Frequency (%)
0 | 1562 | 15.6%
21 | 930 | 9.3%
20 | 873 | 8.7%
22 | 869 | 8.7%
19 | 824 | 8.2%
18 | 713 | 7.1%
23 | 657 | 6.6%
17 | 582 | 5.8%
16 | 566 | 5.7%
24 | 424 | 4.2%
Other values (16) | 2000 | 20.0%

Smallest values

Value | Count | Frequency (%)
0 | 1562 | 15.6%
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 7 | 0.1%
4 | 23 | 0.2%
5 | 38 | 0.4%
6 | 40 | 0.4%
7 | 70 | 0.7%
8 | 71 | 0.7%
9 | 89 | 0.9%

Largest values

Value | Count | Frequency (%)
25 | 172 | 1.7%
24 | 424 | 4.2%
23 | 657 | 6.6%
22 | 869 | 8.7%
21 | 930 | 9.3%
20 | 873 | 8.7%
19 | 824 | 8.2%
18 | 713 | 7.1%
17 | 582 | 5.8%
16 | 566 | 5.7%

총점
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 225
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 186.5266
Minimum: 0
Maximum: 289
Zeros: 1546
Zeros (%): 15.5%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 166
median: 222
Q3: 247
95-th percentile: 267
Maximum: 289
Range: 289
Interquartile range (IQR): 81

Descriptive statistics

Standard deviation: 88.2877
Coefficient of variation (CV): 0.47332498
Kurtosis: 0.28327079
Mean: 186.5266
Median Absolute Deviation (MAD): 32
Skewness: -1.299229
Sum: 1865266
Variance: 7794.718
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
Most frequent values

Value | Count | Frequency (%)
0 | 1546 | 15.5%
243 | 128 | 1.3%
240 | 125 | 1.2%
234 | 123 | 1.2%
232 | 121 | 1.2%
241 | 118 | 1.2%
245 | 115 | 1.1%
228 | 113 | 1.1%
239 | 112 | 1.1%
237 | 112 | 1.1%
Other values (215) | 7387 | 73.9%

Smallest values

Value | Count | Frequency (%)
0 | 1546 | 15.5%
15 | 2 | < 0.1%
21 | 2 | < 0.1%
24 | 1 | < 0.1%
28 | 1 | < 0.1%
40 | 1 | < 0.1%
46 | 2 | < 0.1%
54 | 3 | < 0.1%
58 | 2 | < 0.1%
65 | 3 | < 0.1%

Largest values

Value | Count | Frequency (%)
289 | 1 | < 0.1%
288 | 1 | < 0.1%
286 | 8 | 0.1%
285 | 11 | 0.1%
284 | 5 | 0.1%
283 | 6 | 0.1%
282 | 11 | 0.1%
281 | 15 | 0.1%
280 | 26 | 0.3%
279 | 13 | 0.1%

합격여부
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
합격 | 6885
불합격 | 1572
결시 | 1540
응시결격 | 3

Length

Max length: 4
Median length: 2
Mean length: 2.1578
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 결시
2nd row: 합격
3rd row: 결시
4th row: 합격
5th row: 결시

Common Values

Value | Count | Frequency (%)
합격 | 6885 | 68.8%
불합격 | 1572 | 15.7%
결시 | 1540 | 15.4%
응시결격 | 3 | < 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

성별
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
6659 
3341 

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

Value | Count | Frequency (%)
 | 6659 | 66.6%
 | 3341 | 33.4%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

연령대
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
20 | 7909
30 | 1744
40 | 260
50 | 72
60 | 15

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20
2nd row: 20
3rd row: 20
4th row: 30
5th row: 20

Common Values

Value | Count | Frequency (%)
20 | 7909 | 79.1%
30 | 1744 | 17.4%
40 | 260 | 2.6%
50 | 72 | 0.7%
60 | 15 | 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Interactions

[Figures: 16 interaction plots]

Correlations

Variable | 연도 | 회차 | 일련번호 | 과목명 | 과목별점수 | 총점 | 합격여부 | 성별 | 연령대
연도 | 1.000 | 1.000 | 0.985 | 0.275 | 0.480 | 0.490 | 0.307 | 0.031 | 0.127
회차 | 1.000 | 1.000 | 0.934 | 0.269 | 0.495 | 0.503 | 0.537 | 0.054 | 0.070
일련번호 | 0.985 | 0.934 | 1.000 | 0.170 | 0.578 | 0.592 | 0.559 | 0.136 | 0.181
과목명 | 0.275 | 0.269 | 0.170 | 1.000 | 0.163 | 0.020 | 0.000 | 0.000 | 0.000
과목별점수 | 0.480 | 0.495 | 0.578 | 0.163 | 1.000 | 0.888 | 0.840 | 0.282 | 0.351
총점 | 0.490 | 0.503 | 0.592 | 0.020 | 0.888 | 1.000 | 0.889 | 0.354 | 0.461
합격여부 | 0.307 | 0.537 | 0.559 | 0.000 | 0.840 | 0.889 | 1.000 | 0.344 | 0.222
성별 | 0.031 | 0.054 | 0.136 | 0.000 | 0.282 | 0.354 | 0.344 | 1.000 | 0.231
연령대 | 0.127 | 0.070 | 0.181 | 0.000 | 0.351 | 0.461 | 0.222 | 0.231 | 1.000

Variable | 연도 | 합격여부 | 연령대 | 과목명 | 성별
연도 | 1.000 | 0.255 | 0.048 | 0.152 | 0.038
합격여부 | 0.255 | 1.000 | 0.183 | 0.000 | 0.230
연령대 | 0.048 | 0.183 | 1.000 | 0.000 | 0.282
과목명 | 0.152 | 0.000 | 0.000 | 1.000 | 0.000
성별 | 0.038 | 0.230 | 0.282 | 0.000 | 1.000

Variable | 회차 | 일련번호 | 과목별점수 | 총점 | 연도 | 과목명 | 합격여부 | 성별 | 연령대
회차 | 1.000 | 0.982 | 0.346 | 0.363 | 1.000 | 0.137 | 0.375 | 0.039 | 0.047
일련번호 | 0.982 | 1.000 | 0.362 | 0.384 | 0.825 | 0.071 | 0.369 | 0.105 | 0.076
과목별점수 | 0.346 | 0.362 | 1.000 | 0.849 | 0.220 | 0.068 | 0.682 | 0.218 | 0.152
총점 | 0.363 | 0.384 | 0.849 | 1.000 | 0.225 | 0.008 | 0.763 | 0.272 | 0.209
연도 | 1.000 | 0.825 | 0.220 | 0.225 | 1.000 | 0.152 | 0.255 | 0.038 | 0.048
과목명 | 0.137 | 0.071 | 0.068 | 0.008 | 0.152 | 1.000 | 0.000 | 0.000 | 0.000
합격여부 | 0.375 | 0.369 | 0.682 | 0.763 | 0.255 | 0.000 | 1.000 | 0.230 | 0.183
성별 | 0.039 | 0.105 | 0.218 | 0.272 | 0.038 | 0.000 | 0.230 | 1.000 | 0.282
연령대 | 0.047 | 0.076 | 0.152 | 0.209 | 0.048 | 0.000 | 0.183 | 0.282 | 1.000
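
The "High correlation" alerts in the overview can be related to the first correlation matrix of this section by scanning each row for off-diagonal coefficients at or above a cutoff. The sketch below uses a cutoff of 0.8, which is an assumption: it was picked because it reproduces exactly the six alerts listed, not because it is a documented default of the tool. The function and constant names are illustrative.

```python
# Column order and values of the first correlation matrix in this section.
NAMES = ["연도", "회차", "일련번호", "과목명", "과목별점수", "총점", "합격여부", "성별", "연령대"]
MATRIX = [
    [1.000, 1.000, 0.985, 0.275, 0.480, 0.490, 0.307, 0.031, 0.127],
    [1.000, 1.000, 0.934, 0.269, 0.495, 0.503, 0.537, 0.054, 0.070],
    [0.985, 0.934, 1.000, 0.170, 0.578, 0.592, 0.559, 0.136, 0.181],
    [0.275, 0.269, 0.170, 1.000, 0.163, 0.020, 0.000, 0.000, 0.000],
    [0.480, 0.495, 0.578, 0.163, 1.000, 0.888, 0.840, 0.282, 0.351],
    [0.490, 0.503, 0.592, 0.020, 0.888, 1.000, 0.889, 0.354, 0.461],
    [0.307, 0.537, 0.559, 0.000, 0.840, 0.889, 1.000, 0.344, 0.222],
    [0.031, 0.054, 0.136, 0.000, 0.282, 0.354, 0.344, 1.000, 0.231],
    [0.127, 0.070, 0.181, 0.000, 0.351, 0.461, 0.222, 0.231, 1.000],
]
THRESHOLD = 0.8  # assumption: chosen to match the listed alerts, not a confirmed default


def highly_correlated(names, matrix, threshold):
    """Map each variable to the other variables whose coefficient meets the threshold."""
    flagged = {}
    for i, name in enumerate(names):
        partners = [
            names[j]
            for j, value in enumerate(matrix[i])
            if j != i and value >= threshold
        ]
        if partners:
            flagged[name] = partners
    return flagged


flagged = highly_correlated(NAMES, MATRIX, THRESHOLD)
for name, partners in flagged.items():
    print(name, "->", ", ".join(partners))
```

With this cutoff every flagged variable has exactly two partners, which matches the "and 1 other field" wording of the alerts; 과목명, 성별 and 연령대 are not flagged.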

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly pick out patterns in data completion.]

Sample

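Index | 연도 | 직종 | 회차 | 일련번호 | 과목명 | 과목별점수 | 총점 | 합격여부 | 성별 | 연령대
3255 | 2000 | 약사(4년제) | 50 | 272 | 정성분석학 | 0 | 0 | 결시 |  | 20
47477 | 2001 | 약사(4년제) | 52 | 3957 | 무기약품제조학 | 18 | 221 | 합격 |  | 20
15432 | 2000 | 약사(4년제) | 50 | 1287 | 정량분석학 | 0 | 0 | 결시 |  | 20
414 | 2000 | 약사(4년제) | 50 | 35 | 약전 | 20 | 209 | 합격 |  | 30
17891 | 2000 | 약사(4년제) | 50 | 1491 | 약물학 | 0 | 0 | 결시 |  | 20
87177 | 2003 | 약사(4년제) | 54 | 7265 | 약제학 | 21 | 241 | 합격 |  | 20
30183 | 2000 | 약사(4년제) | 51 | 2516 | 정성분석학 | 21 | 247 | 합격 |  | 20
68883 | 2002 | 약사(4년제) | 53 | 5741 | 정성분석학 | 20 | 220 | 합격 |  | 20
52000 | 2001 | 약사(4년제) | 52 | 4334 | 미생물학 | 25 | 279 | 합격 |  | 20
33901 | 2000 | 약사(4년제) | 51 | 2826 | 위생화학 | 22 | 253 | 합격 |  | 20

Index | 연도 | 직종 | 회차 | 일련번호 | 과목명 | 과목별점수 | 총점 | 합격여부 | 성별 | 연령대
85822 | 2003 | 약사(4년제) | 54 | 7152 | 약사관계법규 | 18 | 218 | 합격 |  | 20
41107 | 2001 | 약사(4년제) | 52 | 3426 | 생화학 | 23 | 265 | 합격 |  | 20
54036 | 2002 | 약사(4년제) | 53 | 4504 | 정량분석학 | 0 | 0 | 결시 |  | 30
24610 | 2000 | 약사(4년제) | 51 | 2051 | 약사관계법규 | 13 | 141 | 불합격 |  | 20
67324 | 2002 | 약사(4년제) | 53 | 5611 | 미생물학 | 19 | 246 | 합격 |  | 20
60691 | 2002 | 약사(4년제) | 53 | 5058 | 생화학 | 20 | 227 | 합격 |  | 20
25371 | 2000 | 약사(4년제) | 51 | 2115 | 정성분석학 | 21 | 257 | 합격 |  | 20
38837 | 2001 | 약사(4년제) | 52 | 3237 | 무기약품제조학 | 17 | 217 | 합격 |  | 20
7727 | 2000 | 약사(4년제) | 50 | 644 | 약물학 | 0 | 0 | 결시 |  | 20
79846 | 2003 | 약사(4년제) | 54 | 6654 | 약사관계법규 | 21 | 239 | 합격 |  | 20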