Overview

Dataset statistics

Number of variables: 10
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 918.0 KiB
Average record size in memory: 94.0 B

Variable types

Categorical: 7
Numeric: 3

Dataset

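Description: Provides information for analyzing the score results of candidates in the national licensing examination for clinical laboratory technologists (임상병리사): year (연도), occupation (직종), exam round (회차), serial number (일련번호), subject name (과목명), score per subject (과목별점수), total score (총점), pass/fail status (합격여부), sex (성별), and age group (연령대).
Author: 공공데이터포털 (Public Data Portal)
URL: https://www.data.go.kr/data/15083507/fileData.do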

Alerts

직종 has constant value "임상병리사" [Constant]
연도 is highly overall correlated with 일련번호 and 1 other field [High correlation]
회차 is highly overall correlated with 일련번호 and 1 other field [High correlation]
일련번호 is highly overall correlated with 연도 and 1 other field [High correlation]
총점 is highly overall correlated with 합격여부 [High correlation]
합격여부 is highly overall correlated with 총점 [High correlation]
성별 is highly imbalanced (53.3%) [Imbalance]
연령대 is highly imbalanced (90.3%) [Imbalance]
과목별점수 has 518 (5.2%) zeros [Zeros]
총점 has 504 (5.0%) zeros [Zeros]
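
The imbalance percentages in the alerts appear consistent with a normalized-entropy score, 1 - H / log2(k), where H is the Shannon entropy (base 2) of the category frequencies and k is the number of categories. The sketch below is an assumption about how such a score can be computed, not a statement of the profiler's exact implementation. The function name `imbalance_score` is hypothetical, and the counts are the ones listed for 성별 and 연령대 in this report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform column, close to 1 for a dominated one.

    Hypothetical helper; `counts` holds the number of rows per category.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0  # a single category carries no entropy to normalize
    # Shannon entropy in bits, skipping empty categories (0 * log 0 is taken as 0).
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Category counts as listed in this report (the "<NA>" string counts as a category).
sex_counts = [7916, 2082, 2]      # 성별
age_counts = [9718, 270, 10, 2]   # 연령대

print(f"성별 imbalance: {imbalance_score(sex_counts):.1%}")
print(f"연령대 imbalance: {imbalance_score(age_counts):.1%}")
```

With these counts the score comes out at roughly 53.3% and 90.3%, in line with the two Imbalance alerts.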

Reproduction

Analysis started: 2024-04-22 00:10:28.456694
Analysis finished: 2024-04-22 00:10:32.436399
Duration: 3.98 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연도
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
2001 | 4261
2002 | 2184
2000 | 2148
2003 | 1407

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2001
2nd row: 2001
3rd row: 2001
4th row: 2000
5th row: 2001

Common Values

Value | Count | Frequency (%)
2001 | 4261 | 42.6%
2002 | 2184 | 21.8%
2000 | 2148 | 21.5%
2003 | 1407 | 14.1%
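
The Distinct count and the Common Values table can be reproduced from raw column values with a plain frequency count. This is a minimal sketch, assuming the column is available as a list of strings. `frequency_table` is a hypothetical helper, and the toy input is the five sample rows of 연도 listed in this section, not the full column.

```python
from collections import Counter


def frequency_table(values):
    """Return (distinct_count, rows); rows are (value, count, percent) by descending count."""
    counts = Counter(values)
    n = len(values)
    rows = [(value, count, 100.0 * count / n) for value, count in counts.most_common()]
    return len(counts), rows


# The five sample rows of 연도 from this report (not the full 10000-row column).
sample = ["2001", "2001", "2001", "2000", "2001"]
distinct, rows = frequency_table(sample)
print("Distinct:", distinct)
for value, count, pct in rows:
    print(f"{value} | {count} | {pct:.1f}%")
```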

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Value | Count | Frequency (%)
2001 | 4261 | 42.6%
2002 | 2184 | 21.8%
2000 | 2148 | 21.5%
2003 | 1407 | 14.1%

직종
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
임상병리사 | 10000

Length

Max length: 35
Median length: 35
Mean length: 35
Min length: 35

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 임상병리사
2nd row: 임상병리사
3rd row: 임상병리사
4th row: 임상병리사
5th row: 임상병리사

Common Values

Value | Count | Frequency (%)
임상병리사 | 10000 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Value | Count | Frequency (%)
임상병리사 | 10000 | 100.0%

회차
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
30 | 2184
27 | 2148
29 | 2139
28 | 2122
31 | 1407

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 28
2nd row: 29
3rd row: 28
4th row: 27
5th row: 28

Common Values

Value | Count | Frequency (%)
30 | 2184 | 21.8%
27 | 2148 | 21.5%
29 | 2139 | 21.4%
28 | 2122 | 21.2%
31 | 1407 | 14.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Value | Count | Frequency (%)
30 | 2184 | 21.8%
27 | 2148 | 21.5%
29 | 2139 | 21.4%
28 | 2122 | 21.2%
31 | 1407 | 14.1%

일련번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 6780
Distinct (%): 67.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5533.9345
Minimum: 1
Maximum: 11110
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 519.9
Q1: 2745
Median: 5518
Q3: 8336.75
95-th percentile: 10599
Maximum: 11110
Range: 11109
Interquartile range (IQR): 5591.75

Descriptive statistics

Standard deviation: 3233.932
Coefficient of variation (CV): 0.58438205
Kurtosis: -1.2067365
Mean: 5533.9345
Median Absolute Deviation (MAD): 2798.5
Skewness: 0.010268932
Sum: 55339345
Variance: 10458316
Monotonicity: Not monotonic
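
Several of the statistics above are derived from others: the range is the maximum minus the minimum, the interquartile range is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below checks these relations against the figures reported for 일련번호. The variable names are illustrative, and small differences can arise because the inputs are already rounded.

```python
# Values copied from the 일련번호 statistics in this report.
minimum, maximum = 1, 11110
q1, q3 = 2745, 8336.75
mean, std = 5533.9345, 3233.932

value_range = maximum - minimum   # reported Range: 11109
iqr = q3 - q1                     # reported IQR: 5591.75
cv = std / mean                   # reported CV: 0.58438205
variance = std ** 2               # reported Variance: 10458316 (std is rounded here)

print(value_range, iqr, round(cv, 6), round(variance))
```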
[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
1867 | 6 | 0.1%
7950 | 6 | 0.1%
1391 | 6 | 0.1%
187 | 5 | 0.1%
9064 | 5 | 0.1%
7349 | 5 | 0.1%
5832 | 5 | 0.1%
300 | 5 | 0.1%
5950 | 5 | 0.1%
9356 | 5 | 0.1%
Other values (6770) | 9947 | 99.5%

Value | Count | Frequency (%)
1 | 2 | < 0.1%
2 | 2 | < 0.1%
4 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
12 | 3 | < 0.1%
14 | 2 | < 0.1%
15 | 2 | < 0.1%
16 | 1 | < 0.1%
17 | 2 | < 0.1%

Value | Count | Frequency (%)
11110 | 1 | < 0.1%
11109 | 2 | < 0.1%
11107 | 1 | < 0.1%
11105 | 1 | < 0.1%
11104 | 1 | < 0.1%
11103 | 2 | < 0.1%
11102 | 1 | < 0.1%
11101 | 1 | < 0.1%
11100 | 1 | < 0.1%
11097 | 3 | < 0.1%

과목명
Categorical

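Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
임상생리학 | 1130
임상화학 | 1122
임상미생물학 | 1122
임상병리사 실기 | 1119
혈액학 | 1114
Other values (4) | 4393

Length

Max length: 8
Median length: 6
Mean length: 5.885
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 공중보건학 개론
2nd row: 임상화학
3rd row: 조직병리학
4th row: 임상병리사 실기
5th row: 혈액학

Common Values

Value | Count | Frequency (%)
임상생리학 (Clinical Physiology) | 1130 | 11.3%
임상화학 (Clinical Chemistry) | 1122 | 11.2%
임상미생물학 (Clinical Microbiology) | 1122 | 11.2%
임상병리사 실기 (Clinical Laboratory Technologist Practical Exam) | 1119 | 11.2%
혈액학 (Hematology) | 1114 | 11.1%
해부생리학 개론 (Introduction to Anatomy and Physiology) | 1109 | 11.1%
의료관계법규 (Medical Laws and Regulations) | 1106 | 11.1%
공중보건학 개론 (Introduction to Public Health) | 1096 | 11.0%
조직병리학 (Histopathology) | 1082 | 10.8%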

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Value | Count | Frequency (%)
개론 | 2205 | 16.5%
임상생리학 | 1130 | 8.5%
임상화학 | 1122 | 8.4%
임상미생물학 | 1122 | 8.4%
임상병리사 | 1119 | 8.4%
실기 | 1119 | 8.4%
혈액학 | 1114 | 8.4%
해부생리학 | 1109 | 8.3%
의료관계법규 | 1106 | 8.3%
공중보건학 | 1096 | 8.2%

과목별점수
Real number (ℝ)

ZEROS 

Distinct: 69
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 21.0538
Minimum: 0
Maximum: 96
Zeros: 518
Zeros (%): 5.2%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 12
Median: 16
Q3: 23
95-th percentile: 70
Maximum: 96
Range: 96
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 17.976964
Coefficient of variation (CV): 0.85385839
Kurtosis: 3.7467239
Mean: 21.0538
Median Absolute Deviation (MAD): 5
Skewness: 2.0307634
Sum: 210538
Variance: 323.17122
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
15 | 671 | 6.7%
14 | 657 | 6.6%
13 | 630 | 6.3%
16 | 615 | 6.2%
17 | 579 | 5.8%
12 | 545 | 5.5%
0 | 518 | 5.2%
11 | 510 | 5.1%
18 | 473 | 4.7%
10 | 390 | 3.9%
Other values (59) | 4412 | 44.1%

Value | Count | Frequency (%)
0 | 518 | 5.2%
1 | 1 | < 0.1%
2 | 14 | 0.1%
3 | 19 | 0.2%
4 | 47 | 0.5%
5 | 68 | 0.7%
6 | 127 | 1.3%
7 | 166 | 1.7%
8 | 246 | 2.5%
9 | 316 | 3.2%

Value | Count | Frequency (%)
96 | 1 | < 0.1%
94 | 2 | < 0.1%
92 | 8 | 0.1%
90 | 8 | 0.1%
88 | 17 | 0.2%
86 | 20 | 0.2%
84 | 35 | 0.4%
82 | 53 | 0.5%
80 | 46 | 0.5%
78 | 65 | 0.7%

총점
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 221
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 173.8833
Minimum: 0
Maximum: 284
Zeros: 504
Zeros (%): 5.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 111
Median: 203
Q3: 226
95-th percentile: 251
Maximum: 284
Range: 284
Interquartile range (IQR): 115

Descriptive statistics

Standard deviation: 69.847254
Coefficient of variation (CV): 0.40169041
Kurtosis: -0.24308165
Mean: 173.8833
Median Absolute Deviation (MAD): 32
Skewness: -0.89680237
Sum: 1738833
Variance: 4878.6389
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
0 | 504 | 5.0%
211 | 137 | 1.4%
224 | 136 | 1.4%
232 | 134 | 1.3%
205 | 122 | 1.2%
225 | 121 | 1.2%
209 | 119 | 1.2%
215 | 119 | 1.2%
217 | 118 | 1.2%
220 | 116 | 1.2%
Other values (211) | 8374 | 83.7%

Value | Count | Frequency (%)
0 | 504 | 5.0%
6 | 1 | < 0.1%
30 | 1 | < 0.1%
34 | 5 | 0.1%
40 | 2 | < 0.1%
41 | 3 | < 0.1%
43 | 6 | 0.1%
45 | 2 | < 0.1%
46 | 1 | < 0.1%
47 | 3 | < 0.1%

Value | Count | Frequency (%)
284 | 3 | < 0.1%
279 | 1 | < 0.1%
278 | 2 | < 0.1%
277 | 1 | < 0.1%
275 | 1 | < 0.1%
273 | 1 | < 0.1%
272 | 1 | < 0.1%
271 | 1 | < 0.1%
270 | 6 | 0.1%
269 | 12 | 0.1%

합격여부
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
합격 | 6071
불합격 | 3425
결시 | 504

Length

Max length: 3
Median length: 2
Mean length: 2.3425
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 합격
2nd row: 합격
3rd row: 합격
4th row: 결시
5th row: 합격

Common Values

Value | Count | Frequency (%)
합격 (pass) | 6071 | 60.7%
불합격 (fail) | 3425 | 34.2%
결시 (absent) | 504 | 5.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Value | Count | Frequency (%)
합격 | 6071 | 60.7%
불합격 | 3425 | 34.2%
결시 | 504 | 5.0%
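
The 504 zero values of 총점 coincide with the 504 rows marked 결시 (absent) in 합격여부. This suggests, though the report does not state it, that absent candidates are recorded with a total score of 0. Under that assumption, a pass rate restricted to rows where the exam was actually taken can be sketched as follows. The counts come from the 합격여부 Common Values table, and `pass_rate_excluding_absent` is a hypothetical helper.

```python
def pass_rate_excluding_absent(counts, absent_label="결시", pass_label="합격"):
    """Share of passing rows among rows whose status is not the absent label."""
    sat = sum(n for label, n in counts.items() if label != absent_label)
    if sat == 0:
        return 0.0
    return counts.get(pass_label, 0) / sat


# Row counts per status as listed in this report.
status_counts = {"합격": 6071, "불합격": 3425, "결시": 504}

overall = status_counts["합격"] / sum(status_counts.values())
among_sat = pass_rate_excluding_absent(status_counts)
print(f"pass rate, all rows: {overall:.1%}")
print(f"pass rate, excluding 결시: {among_sat:.1%}")
```

Note that these are rates over rows of the profiled sample (one row per subject score), not over distinct candidates.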

성별
Categorical

IMBALANCE 

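Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
(not shown) | 7916
(not shown) | 2082
<NA> | 2

Length

Max length: 4
Median length: 1
Mean length: 1.0006
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: (not shown)
2nd row: (not shown)
3rd row: (not shown)
4th row: (not shown)
5th row: (not shown)

Common Values

Value | Count | Frequency (%)
(not shown) | 7916 | 79.2%
(not shown) | 2082 | 20.8%
<NA> | 2 | < 0.1%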

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Value | Count | Frequency (%)
(not shown) | 7916 | 79.2%
(not shown) | 2082 | 20.8%
na | 2 | < 0.1%

연령대
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
20 | 9718
30 | 270
40 | 10
<NA> | 2

Length

Max length: 4
Median length: 2
Mean length: 2.0004
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20
2nd row: 20
3rd row: 20
4th row: 20
5th row: 20

Common Values

Value | Count | Frequency (%)
20 | 9718 | 97.2%
30 | 270 | 2.7%
40 | 10 | 0.1%
<NA> | 2 | < 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Value | Count | Frequency (%)
20 | 9718 | 97.2%
30 | 270 | 2.7%
40 | 10 | 0.1%
na | 2 | < 0.1%

Interactions

[Figures: nine interaction plots]

Correlations

[Figure: correlation heatmap]

 | 연도 | 회차 | 일련번호 | 과목명 | 과목별점수 | 총점 | 합격여부 | 성별 | 연령대
연도 | 1.000 | 1.000 | 0.970 | 0.020 | 0.091 | 0.162 | 0.092 | 0.072 | 0.000
회차 | 1.000 | 1.000 | 0.995 | 0.021 | 0.128 | 0.228 | 0.118 | 0.040 | 0.000
일련번호 | 0.970 | 0.995 | 1.000 | 0.020 | 0.146 | 0.255 | 0.208 | 0.049 | 0.072
과목명 | 0.020 | 0.021 | 0.020 | 1.000 | 0.729 | 0.022 | 0.000 | 0.000 | 0.042
과목별점수 | 0.091 | 0.128 | 0.146 | 0.729 | 1.000 | 0.666 | 0.627 | 0.042 | 0.146
총점 | 0.162 | 0.228 | 0.255 | 0.022 | 0.666 | 1.000 | 0.956 | 0.111 | 0.208
합격여부 | 0.092 | 0.118 | 0.208 | 0.000 | 0.627 | 0.956 | 1.000 | 0.028 | 0.338
성별 | 0.072 | 0.040 | 0.049 | 0.000 | 0.042 | 0.111 | 0.028 | 1.000 | 0.067
연령대 | 0.000 | 0.000 | 0.072 | 0.042 | 0.146 | 0.208 | 0.338 | 0.067 | 1.000

[Figure: correlation heatmap]

 | 성별 | 연도 | 합격여부 | 과목명 | 회차 | 연령대
성별 | 1.000 | 0.048 | 0.046 | 0.000 | 0.049 | 0.111
연도 | 0.048 | 1.000 | 0.087 | 0.013 | 1.000 | 0.000
합격여부 | 0.046 | 0.087 | 1.000 | 0.000 | 0.089 | 0.117
과목명 | 0.000 | 0.013 | 0.000 | 1.000 | 0.012 | 0.018
회차 | 0.049 | 1.000 | 0.089 | 0.012 | 1.000 | 0.000
연령대 | 0.111 | 0.000 | 0.117 | 0.018 | 0.000 | 1.000

[Figure: correlation heatmap]

 | 일련번호 | 과목별점수 | 총점 | 연도 | 회차 | 과목명 | 합격여부 | 성별 | 연령대
일련번호 | 1.000 | 0.050 | 0.138 | 0.913 | 0.902 | 0.009 | 0.126 | 0.038 | 0.043
과목별점수 | 0.050 | 1.000 | 0.482 | 0.055 | 0.054 | 0.442 | 0.473 | 0.032 | 0.087
총점 | 0.138 | 0.482 | 1.000 | 0.098 | 0.097 | 0.010 | 0.953 | 0.085 | 0.126
연도 | 0.913 | 0.055 | 0.098 | 1.000 | 1.000 | 0.013 | 0.087 | 0.048 | 0.000
회차 | 0.902 | 0.054 | 0.097 | 1.000 | 1.000 | 0.012 | 0.089 | 0.049 | 0.000
과목명 | 0.009 | 0.442 | 0.010 | 0.013 | 0.012 | 1.000 | 0.000 | 0.000 | 0.018
합격여부 | 0.126 | 0.473 | 0.953 | 0.087 | 0.089 | 0.000 | 1.000 | 0.046 | 0.117
성별 | 0.038 | 0.032 | 0.085 | 0.048 | 0.049 | 0.000 | 0.046 | 1.000 | 0.111
연령대 | 0.043 | 0.087 | 0.126 | 0.000 | 0.000 | 0.018 | 0.117 | 0.111 | 1.000
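
The value 1.000 between 연도 and 회차 in the matrices above is what one would expect if every exam round falls in exactly one year. One common association measure for two categorical columns that behaves this way is Cramér's V. The report does not state which measure produced each matrix, so the sketch below is illustrative only. The contingency table is an assumption reconstructed from the reported counts and sample rows (rounds 27 to 31 mapped onto years 2000 to 2003), and `cramers_v` is a hypothetical helper without bias correction.

```python
import math


def cramers_v(table):
    """Cramér's V for a contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if n > 0 and k > 0 else 0.0


# Rows: 연도 2000, 2001, 2002, 2003; columns: 회차 27, 28, 29, 30, 31.
# Assumed mapping: 27 -> 2000, 28 and 29 -> 2001, 30 -> 2002, 31 -> 2003.
year_by_round = [
    [2148, 0, 0, 0, 0],
    [0, 2122, 2139, 0, 0],
    [0, 0, 0, 2184, 0],
    [0, 0, 0, 0, 1407],
]
print(round(cramers_v(year_by_round), 3))
```

Because each column of the assumed table has a single non-zero cell, knowing 회차 determines 연도 completely and the statistic reaches its maximum of 1.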

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

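Index | 연도 | 직종 | 회차 | 일련번호 | 과목명 | 과목별점수 | 총점 | 합격여부 | 성별 | 연령대
36063 | 2001 | 임상병리사 | 28 | 4008 | 공중보건학 개론 | 16 | 182 | 합격 | | 20
46580 | 2001 | 임상병리사 | 29 | 5176 | 임상화학 | 33 | 223 | 합격 | | 20
40332 | 2001 | 임상병리사 | 28 | 4482 | 조직병리학 | 11 | 198 | 합격 | | 20
7053 | 2000 | 임상병리사 | 27 | 784 | 임상병리사 실기 | 0 | 0 | 결시 | | 20
30238 | 2001 | 임상병리사 | 28 | 3360 | 혈액학 | 15 | 188 | 합격 | | 20
6423 | 2000 | 임상병리사 | 27 | 714 | 임상병리사 실기 | 58 | 84 | 불합격 | | 20
37440 | 2001 | 임상병리사 | 28 | 4161 | 공중보건학 개론 | 1 or 11 | 180 or 80 | 불합격 | | 20
63157 | 2001 | 임상병리사 | 29 | 7018 | 임상생리학 | 2 | 51 | 불합격 | | 20
29083 | 2001 | 임상병리사 | 28 | 3232 | 임상생리학 | 10 | 184 | 불합격 | | 20
15855 | 2000 | 임상병리사 | 27 | 1762 | 임상병리사 실기 | 64 | 200 | 합격 | | 20

Index | 연도 | 직종 | 회차 | 일련번호 | 과목명 | 과목별점수 | 총점 | 합격여부 | 성별 | 연령대
67729 | 2002 | 임상병리사 | 30 | 7526 | 임상생리학 | 14 | 231 | 합격 | | 20
10383 | 2000 | 임상병리사 | 27 | 1154 | 임상병리사 실기 | 70 | 222 | 합격 | | 20
41238 | 2001 | 임상병리사 | 28 | 4583 | 공중보건학 개론 | 12 | 202 | 합격 | | 20
14608 | 2000 | 임상병리사 | 27 | 1624 | 해부생리학 개론 | 18 | 203 | 합격 | | 20
51736 | 2001 | 임상병리사 | 29 | 5749 | 임상생리학 | 11 | 210 | 합격 | | 20
81440 | 2002 | 임상병리사 | 30 | 9049 | 임상미생물학 | 2 or 21 | 198 or 98 | 불합격 | | 20
4938 | 2000 | 임상병리사 | 27 | 549 | 임상병리사 실기 | 64 | 107 | 불합격 | | 20
28067 | 2001 | 임상병리사 | 28 | 3119 | 임상화학 | 28 | 191 | 합격 | | 20
33566 | 2001 | 임상병리사 | 28 | 3730 | 임상화학 | 15 | 99 | 불합격 | | 20
30719 | 2001 | 임상병리사 | 28 | 3414 | 의료관계법규 | 18 | 239 | 합격 | | 20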