Overview

Dataset statistics

Number of variables: 10
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 918.0 KiB
Average record size in memory: 94.0 B

Variable types

Numeric: 5
Categorical: 5

Dataset

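Description: Provides information for analyzing the exam results of candidates for the national examination for Health Information Managers (보건의료정보관리사): year (연도), occupation (직종), exam round (회차), serial number (일련번호), subject name (과목명), score per subject (과목별점수), total score (총점), pass/fail result (합격여부), gender (성별), and age group (연령대).
URL: https://www.data.go.kr/data/15083515/fileData.do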

Alerts

직종 has constant value "" [Constant]
연도 is highly overall correlated with 회차 and 1 other field [High correlation]
회차 is highly overall correlated with 연도 and 1 other field [High correlation]
일련번호 is highly overall correlated with 연도 and 1 other field [High correlation]
과목별점수 is highly overall correlated with 총점 and 2 other fields [High correlation]
총점 is highly overall correlated with 과목별점수 and 1 other field [High correlation]
과목명 is highly overall correlated with 과목별점수 [High correlation]
합격여부 is highly overall correlated with 과목별점수 and 1 other field [High correlation]
연령대 is highly imbalanced (82.7%) [Imbalance]
과목별점수 has 1446 (14.5%) zeros [Zeros]
총점 has 1436 (14.4%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-12 07:29:34.492813
Analysis finished: 2023-12-12 07:29:39.017425
Duration: 4.52 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연도
Real number (ℝ)

HIGH CORRELATION 

Distinct: 10
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2004.7602
Minimum: 2000
Maximum: 2009
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2000
5-th percentile: 2000
Q1: 2003
Median: 2005
Q3: 2007
95-th percentile: 2009
Maximum: 2009
Range: 9
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.7384386
Coefficient of variation (CV): 0.0013659681
Kurtosis: -1.0967759
Mean: 2004.7602
Median Absolute Deviation (MAD): 2
Skewness: -0.13675609
Sum: 20047602
Variance: 7.4990459
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
2003 | 1463 | 14.6%
2008 | 1212 | 12.1%
2006 | 1173 | 11.7%
2007 | 1112 | 11.1%
2005 | 1026 | 10.3%
2004 | 896 | 9.0%
2009 | 870 | 8.7%
2000 | 857 | 8.6%
2002 | 760 | 7.6%
2001 | 631 | 6.3%

Minimum 10 values

Value | Count | Frequency (%)
2000 | 857 | 8.6%
2001 | 631 | 6.3%
2002 | 760 | 7.6%
2003 | 1463 | 14.6%
2004 | 896 | 9.0%
2005 | 1026 | 10.3%
2006 | 1173 | 11.7%
2007 | 1112 | 11.1%
2008 | 1212 | 12.1%
2009 | 870 | 8.7%

Maximum 10 values

Value | Count | Frequency (%)
2009 | 870 | 8.7%
2008 | 1212 | 12.1%
2007 | 1112 | 11.1%
2006 | 1173 | 11.7%
2005 | 1026 | 10.3%
2004 | 896 | 9.0%
2003 | 1463 | 14.6%
2002 | 760 | 7.6%
2001 | 631 | 6.3%
2000 | 857 | 8.6%
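
Because 연도 has only 10 distinct values, its summary statistics can be recomputed from the frequency table alone. The following is a minimal sketch, not part of the generated report: it assumes the variance is the sample variance (dividing by n - 1), an assumption that happens to reproduce the reported figure, and the variable names are illustrative.

```python
import math

# Year -> row count, copied from the 연도 frequency table.
year_counts = {
    2000: 857, 2001: 631, 2002: 760, 2003: 1463, 2004: 896,
    2005: 1026, 2006: 1173, 2007: 1112, 2008: 1212, 2009: 870,
}

n = sum(year_counts.values())                                  # number of rows
total = sum(year * count for year, count in year_counts.items())
mean = total / n

# Sample variance (n - 1 in the denominator); this is an assumption
# that matches the value shown under "Descriptive statistics".
sum_sq = sum(count * (year - mean) ** 2 for year, count in year_counts.items())
variance = sum_sq / (n - 1)
std = math.sqrt(variance)
cv = std / mean                                                # coefficient of variation

print(f"n={n} sum={total} mean={mean:.4f}")
print(f"variance={variance:.7f} std={std:.7f} cv={cv:.10f}")
```

The same check works for 회차, whose 11 distinct values are all listed across its value tables.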

직종
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
보건의료정보관리사: 10000

Length

Max length: 31
Median length: 31
Mean length: 31
Min length: 31

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 보건의료정보관리사
2nd row: 보건의료정보관리사
3rd row: 보건의료정보관리사
4th row: 보건의료정보관리사
5th row: 보건의료정보관리사

Common Values

Value | Count | Frequency (%)
보건의료정보관리사 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
보건의료정보관리사 | 10000 | 100.0%

회차
Real number (ℝ)

HIGH CORRELATION 

Distinct: 11
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 21.4592
Minimum: 16
Maximum: 26
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 16
5-th percentile: 16
Q1: 19
Median: 22
Q3: 24
95-th percentile: 26
Maximum: 26
Range: 10
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.1125398
Coefficient of variation (CV): 0.14504454
Kurtosis: -1.1200816
Mean: 21.4592
Median Absolute Deviation (MAD): 3
Skewness: -0.27364296
Sum: 214592
Variance: 9.6879042
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=11)

Common values

Value | Count | Frequency (%)
25 | 1212 | 12.1%
23 | 1173 | 11.7%
24 | 1112 | 11.1%
22 | 1026 | 10.3%
21 | 896 | 9.0%
26 | 870 | 8.7%
16 | 857 | 8.6%
19 | 762 | 7.6%
18 | 760 | 7.6%
20 | 701 | 7.0%

Minimum 10 values

Value | Count | Frequency (%)
16 | 857 | 8.6%
17 | 631 | 6.3%
18 | 760 | 7.6%
19 | 762 | 7.6%
20 | 701 | 7.0%
21 | 896 | 9.0%
22 | 1026 | 10.3%
23 | 1173 | 11.7%
24 | 1112 | 11.1%
25 | 1212 | 12.1%

Maximum 10 values

Value | Count | Frequency (%)
26 | 870 | 8.7%
25 | 1212 | 12.1%
24 | 1112 | 11.1%
23 | 1173 | 11.7%
22 | 1026 | 10.3%
21 | 896 | 9.0%
20 | 701 | 7.0%
19 | 762 | 7.6%
18 | 760 | 7.6%
17 | 631 | 6.3%

일련번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8067
Distinct (%): 80.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9545.1471
Minimum: 2
Maximum: 19058
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2
5-th percentile: 911.95
Q1: 4818.5
Median: 9520.5
Q3: 14349.25
95-th percentile: 18108.15
Maximum: 19058
Range: 19056
Interquartile range (IQR): 9530.75

Descriptive statistics

Standard deviation: 5518.6805
Coefficient of variation (CV): 0.5781661
Kurtosis: -1.2028492
Mean: 9545.1471
Median Absolute Deviation (MAD): 4757
Skewness: -0.010150831
Sum: 95451471
Variance: 30455834
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
13164 | 4 | < 0.1%
12341 | 4 | < 0.1%
3628 | 4 | < 0.1%
3056 | 4 | < 0.1%
10197 | 4 | < 0.1%
16732 | 4 | < 0.1%
938 | 4 | < 0.1%
13729 | 4 | < 0.1%
15462 | 3 | < 0.1%
5770 | 3 | < 0.1%
Other values (8057) | 9962 | 99.6%

Minimum 10 values

Value | Count | Frequency (%)
2 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
8 | 2 | < 0.1%
9 | 1 | < 0.1%
13 | 1 | < 0.1%
15 | 1 | < 0.1%
16 | 1 | < 0.1%
19 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
19058 | 1 | < 0.1%
19056 | 1 | < 0.1%
19054 | 2 | < 0.1%
19053 | 1 | < 0.1%
19045 | 1 | < 0.1%
19043 | 1 | < 0.1%
19041 | 2 | < 0.1%
19036 | 1 | < 0.1%
19035 | 1 | < 0.1%
19032 | 1 | < 0.1%

과목명
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
의무기록관리학: 2035
공중보건학 개론: 2032
의료관계법규: 2014
의학용어: 1991
의무기록사 실기: 1928

Length

Max length: 8
Median length: 7
Mean length: 6.5973
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 공중보건학 개론
2nd row: 공중보건학 개론
3rd row: 의료관계법규
4th row: 공중보건학 개론
5th row: 의무기록사 실기

Common Values

Value | Count | Frequency (%)
의무기록관리학 | 2035 | 20.3%
공중보건학 개론 | 2032 | 20.3%
의료관계법규 | 2014 | 20.1%
의학용어 | 1991 | 19.9%
의무기록사 실기 | 1928 | 19.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
의무기록관리학 | 2035 | 14.6%
공중보건학 | 2032 | 14.6%
개론 | 2032 | 14.6%
의료관계법규 | 2014 | 14.4%
의학용어 | 1991 | 14.3%
의무기록사 | 1928 | 13.8%
실기 | 1928 | 13.8%
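
The percentages in this last table differ from the Common Values table because they appear to be word-level frequencies: each category string is split on whitespace, and each word's share is taken over the total number of words rather than the number of rows. The sketch below reproduces the figures under that assumption; the tokenization rule is inferred from the numbers, not taken from the report.

```python
from collections import Counter

# Category -> row count, copied from the 과목명 Common Values table.
subject_counts = {
    "의무기록관리학": 2035,
    "공중보건학 개론": 2032,
    "의료관계법규": 2014,
    "의학용어": 1991,
    "의무기록사 실기": 1928,
}

# Split each category on whitespace and credit every word with the row count.
word_counts = Counter()
for subject, count in subject_counts.items():
    for word in subject.split():
        word_counts[word] += count

total_words = sum(word_counts.values())
for word, count in word_counts.most_common():
    print(f"{word} | {count} | {100 * count / total_words:.1f}%")
```

The two multi-word subjects (공중보건학 개론 and 의무기록사 실기) each contribute two words, which is why the table has seven entries for five categories.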

과목별점수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 115
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.5237
Minimum: 0
Maximum: 100
Zeros: 1446
Zeros (%): 14.5%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 11
Median: 19
Q3: 56
95-th percentile: 80
Maximum: 100
Range: 100
Interquartile range (IQR): 45

Descriptive statistics

Standard deviation: 27.276752
Coefficient of variation (CV): 0.83867309
Kurtosis: -1.072017
Mean: 32.5237
Median Absolute Deviation (MAD): 19
Skewness: 0.52733743
Sum: 325237
Variance: 744.02119
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0.0 | 1446 | 14.5%
14.0 | 451 | 4.5%
15.0 | 388 | 3.9%
12.0 | 384 | 3.8%
13.0 | 378 | 3.8%
16.0 | 331 | 3.3%
11.0 | 305 | 3.0%
10.0 | 277 | 2.8%
17.0 | 259 | 2.6%
9.0 | 193 | 1.9%
Other values (105) | 5588 | 55.9%

Minimum 10 values

Value | Count | Frequency (%)
0.0 | 1446 | 14.5%
2.0 | 2 | < 0.1%
3.0 | 5 | 0.1%
4.0 | 15 | 0.1%
5.0 | 29 | 0.3%
6.0 | 61 | 0.6%
7.0 | 98 | 1.0%
8.0 | 157 | 1.6%
9.0 | 193 | 1.9%
10.0 | 277 | 2.8%

Maximum 10 values

Value | Count | Frequency (%)
100.0 | 3 | < 0.1%
97.5 | 5 | 0.1%
97.0 | 1 | < 0.1%
96.0 | 3 | < 0.1%
95.0 | 17 | 0.2%
94.0 | 7 | 0.1%
93.0 | 10 | 0.1%
92.5 | 19 | 0.2%
92.0 | 9 | 0.1%
91.0 | 8 | 0.1%

총점
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 345
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 142.95445
Minimum: 0
Maximum: 292
Zeros: 1436
Zeros (%): 14.4%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 89
Median: 119
Q3: 217.5
95-th percentile: 251
Maximum: 292
Range: 292
Interquartile range (IQR): 128.5

Descriptive statistics

Standard deviation: 83.192236
Coefficient of variation (CV): 0.58194926
Kurtosis: -1.1170461
Mean: 142.95445
Median Absolute Deviation (MAD): 83.5
Skewness: -0.32323754
Sum: 1429544.5
Variance: 6920.9481
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0.0 | 1436 | 14.4%
116.0 | 124 | 1.2%
117.0 | 107 | 1.1%
110.0 | 107 | 1.1%
118.0 | 103 | 1.0%
112.0 | 101 | 1.0%
107.0 | 97 | 1.0%
115.0 | 97 | 1.0%
114.0 | 92 | 0.9%
109.0 | 92 | 0.9%
Other values (335) | 7644 | 76.4%

Minimum 10 values

Value | Count | Frequency (%)
0.0 | 1436 | 14.4%
28.0 | 3 | < 0.1%
31.0 | 1 | < 0.1%
33.0 | 2 | < 0.1%
35.0 | 1 | < 0.1%
36.0 | 1 | < 0.1%
37.0 | 1 | < 0.1%
41.0 | 1 | < 0.1%
43.0 | 1 | < 0.1%
44.0 | 5 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
292.0 | 1 | < 0.1%
291.5 | 1 | < 0.1%
290.5 | 2 | < 0.1%
289.5 | 1 | < 0.1%
289.0 | 1 | < 0.1%
288.0 | 1 | < 0.1%
287.5 | 2 | < 0.1%
287.0 | 1 | < 0.1%
286.0 | 1 | < 0.1%
283.5 | 2 | < 0.1%

합격여부
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
합격: 4421
불합격: 4143
결시: 1436

Length

Max length: 3
Median length: 2
Mean length: 2.4143
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 합격
2nd row: 불합격
3rd row: 합격
4th row: 결시
5th row: 합격

Common Values

Value | Count | Frequency (%)
합격 | 4421 | 44.2%
불합격 | 4143 | 41.4%
결시 | 1436 | 14.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
합격 | 4421 | 44.2%
불합격 | 4143 | 41.4%
결시 | 1436 | 14.4%
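
The three outcomes are 합격 (pass), 불합격 (fail), and 결시 (absent). The 결시 count (1436) equals the number of zeros reported for 총점, which suggests, though the report does not confirm it, that a zero total marks an absence rather than an earned score. The sketch below shows how the shares change if absent rows are excluded. The counts are rows in this 10,000-row dataset, not necessarily distinct candidates, and the variable names are illustrative.

```python
# Outcome -> row count, copied from the 합격여부 Common Values table.
outcome_counts = {"합격": 4421, "불합격": 4143, "결시": 1436}

total_rows = sum(outcome_counts.values())
shares = {outcome: 100 * count / total_rows
          for outcome, count in outcome_counts.items()}

# Pass rate among rows where the exam was actually taken (absences excluded).
attended = outcome_counts["합격"] + outcome_counts["불합격"]
pass_rate_attended = 100 * outcome_counts["합격"] / attended

for outcome, share in shares.items():
    print(f"{outcome}: {share:.1f}% of all rows")
print(f"pass rate excluding 결시: {pass_rate_attended:.1f}%")
```

Excluding absences raises the pass share from 44.2% of all rows to about 51.6% of attended rows, so any downstream summary should state which denominator it uses.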

성별
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
(label not preserved): 8394
(label not preserved): 1606

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

Value | Count | Frequency (%)
(label not preserved) | 8394 | 83.9%
(label not preserved) | 1606 | 16.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
(label not preserved) | 8394 | 83.9%
(label not preserved) | 1606 | 16.1%

연령대
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
20: 9575
30: 381
40: 44

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20
2nd row: 30
3rd row: 20
4th row: 20
5th row: 20

Common Values

Value | Count | Frequency (%)
20 | 9575 | 95.8%
30 | 381 | 3.8%
40 | 44 | 0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
20 | 9575 | 95.8%
30 | 381 | 3.8%
40 | 44 | 0.4%
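
The 82.7% figure in the Imbalance alert is consistent with one minus the normalized Shannon entropy of the class shares, where 0% means perfectly balanced classes and 100% means a single class holds every row. The sketch below recomputes it from the counts above. The formula is inferred from the matching number and should be read as an assumption about how the score is defined, and `imbalance_score` is a hypothetical helper name, not a library function.

```python
import math

# Age band -> row count, copied from the 연령대 Common Values table.
age_counts = {"20": 9575, "30": 381, "40": 44}


def imbalance_score(counts):
    """Return 1 - H / H_max, with H the Shannon entropy of the class shares."""
    if len(counts) < 2:
        # A single class has no spread to measure; return 0.0 in this sketch.
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    max_entropy = math.log2(len(counts))  # entropy of a uniform split
    return 1 - entropy / max_entropy


score = imbalance_score(list(age_counts.values()))
print(f"{100 * score:.1f}%")
```

With 95.8% of rows in the 20 band, the entropy is only about 17% of its maximum for three classes, which gives the reported score.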

Interactions

[Figure: 5 × 5 grid of pairwise interaction plots for the numeric variables (연도, 회차, 일련번호, 과목별점수, 총점); images not preserved]

Correlations

Variable | 연도 | 회차 | 일련번호 | 과목명 | 과목별점수 | 총점 | 합격여부 | 성별 | 연령대
연도 | 1.000 | 0.999 | 0.940 | 0.000 | 0.178 | 0.291 | 0.371 | 0.052 | 0.025
회차 | 0.999 | 1.000 | 0.987 | 0.000 | 0.262 | 0.374 | 0.302 | 0.064 | 0.043
일련번호 | 0.940 | 0.987 | 1.000 | 0.000 | 0.255 | 0.378 | 0.277 | 0.057 | 0.050
과목명 | 0.000 | 0.000 | 0.000 | 1.000 | 0.868 | 0.028 | 0.028 | 0.000 | 0.000
과목별점수 | 0.178 | 0.262 | 0.255 | 0.868 | 1.000 | 0.794 | 0.768 | 0.029 | 0.071
총점 | 0.291 | 0.374 | 0.378 | 0.028 | 0.794 | 1.000 | 0.962 | 0.081 | 0.088
합격여부 | 0.371 | 0.302 | 0.277 | 0.028 | 0.768 | 0.962 | 1.000 | 0.023 | 0.174
성별 | 0.052 | 0.064 | 0.057 | 0.000 | 0.029 | 0.081 | 0.023 | 1.000 | 0.043
연령대 | 0.025 | 0.043 | 0.050 | 0.000 | 0.071 | 0.088 | 0.174 | 0.043 | 1.000
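
The High correlation alerts in the Overview can be cross-checked against this first matrix. The sketch below lists, for each variable, the other variables whose coefficient meets a cutoff. The cutoff of 0.5 is an assumption (it is not stated in the report), chosen because it reproduces the partner counts in the alerts. `high_correlation_partners` is a hypothetical helper written for this illustration.

```python
names = ["연도", "회차", "일련번호", "과목명", "과목별점수",
         "총점", "합격여부", "성별", "연령대"]

# Rows copied from the first correlation matrix, in the same column order.
matrix = [
    [1.000, 0.999, 0.940, 0.000, 0.178, 0.291, 0.371, 0.052, 0.025],
    [0.999, 1.000, 0.987, 0.000, 0.262, 0.374, 0.302, 0.064, 0.043],
    [0.940, 0.987, 1.000, 0.000, 0.255, 0.378, 0.277, 0.057, 0.050],
    [0.000, 0.000, 0.000, 1.000, 0.868, 0.028, 0.028, 0.000, 0.000],
    [0.178, 0.262, 0.255, 0.868, 1.000, 0.794, 0.768, 0.029, 0.071],
    [0.291, 0.374, 0.378, 0.028, 0.794, 1.000, 0.962, 0.081, 0.088],
    [0.371, 0.302, 0.277, 0.028, 0.768, 0.962, 1.000, 0.023, 0.174],
    [0.052, 0.064, 0.057, 0.000, 0.029, 0.081, 0.023, 1.000, 0.043],
    [0.025, 0.043, 0.050, 0.000, 0.071, 0.088, 0.174, 0.043, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff; not stated in the report


def high_correlation_partners(names, matrix, threshold):
    """Map each variable to the other variables at or above the threshold."""
    partners = {}
    for i, name in enumerate(names):
        hits = [other for j, other in enumerate(names)
                if i != j and matrix[i][j] >= threshold]
        if hits:
            partners[name] = hits
    return partners


partners = high_correlation_partners(names, matrix, THRESHOLD)
for name, hits in partners.items():
    print(f"{name}: {', '.join(hits)}")
```

Under this cutoff, seven variables are flagged and 성별 and 연령대 are not, which matches the alert list: 과목별점수 has three partners, 과목명 has one, and the rest have two each.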

Variable | 과목명 | 합격여부 | 연령대 | 성별
과목명 | 1.000 | 0.021 | 0.000 | 0.000
합격여부 | 0.021 | 1.000 | 0.054 | 0.038
연령대 | 0.000 | 0.054 | 1.000 | 0.071
성별 | 0.000 | 0.038 | 0.071 | 1.000

Variable | 연도 | 회차 | 일련번호 | 과목별점수 | 총점 | 과목명 | 합격여부 | 성별 | 연령대
연도 | 1.000 | 0.999 | 0.994 | 0.083 | 0.063 | 0.000 | 0.179 | 0.049 | 0.012
회차 | 0.999 | 1.000 | 0.995 | 0.086 | 0.069 | 0.000 | 0.195 | 0.045 | 0.023
일련번호 | 0.994 | 0.995 | 1.000 | 0.086 | 0.068 | 0.000 | 0.172 | 0.044 | 0.030
과목별점수 | 0.083 | 0.086 | 0.086 | 1.000 | 0.590 | 0.534 | 0.610 | 0.024 | 0.039
총점 | 0.063 | 0.069 | 0.068 | 0.590 | 1.000 | 0.012 | 0.963 | 0.062 | 0.052
과목명 | 0.000 | 0.000 | 0.000 | 0.534 | 0.012 | 1.000 | 0.021 | 0.000 | 0.000
합격여부 | 0.179 | 0.195 | 0.172 | 0.610 | 0.963 | 0.021 | 1.000 | 0.038 | 0.054
성별 | 0.049 | 0.045 | 0.044 | 0.024 | 0.062 | 0.000 | 0.038 | 1.000 | 0.071
연령대 | 0.012 | 0.023 | 0.030 | 0.039 | 0.052 | 0.000 | 0.054 | 0.071 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

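First rows

Index | 연도 | 직종 | 회차 | 일련번호 | 과목명 | 과목별점수 | 총점 | 합격여부 | 성별 | 연령대
70471 | 2007 | 보건의료정보관리사 | 24 | 14095 | 공중보건학 개론 | 16.0 | 275.5 | 합격 |  | 20
79856 | 2008 | 보건의료정보관리사 | 25 | 15972 | 공중보건학 개론 | 10.0 | 85.0 | 불합격 |  | 30
92152 | 2009 | 보건의료정보관리사 | 26 | 18431 | 의료관계법규 | 17.0 | 183.5 | 합격 |  | 20
9471 | 2001 | 보건의료정보관리사 | 17 | 1895 | 공중보건학 개론 | 0.0 | 0.0 | 결시 |  | 20
30764 | 2003 | 보건의료정보관리사 | 20 | 6153 | 의무기록사 실기 | 85.0 | 214.0 | 합격 |  | 20
436 | 2000 | 보건의료정보관리사 | 16 | 88 | 공중보건학 개론 | 14.0 | 200.0 | 합격 |  | 20
71717 | 2007 | 보건의료정보관리사 | 24 | 14344 | 의료관계법규 | 15.0 | 261.0 | 합격 |  | 20
55816 | 2006 | 보건의료정보관리사 | 23 | 11164 | 공중보건학 개론 | 11.0 | 94.0 | 불합격 |  | 20
92666 | 2009 | 보건의료정보관리사 | 26 | 18534 | 공중보건학 개론 | 10.0 | 94.0 | 불합격 |  | 20
71157 | 2007 | 보건의료정보관리사 | 24 | 14232 | 의료관계법규 | 11.0 | 202.0 | 합격 |  | 20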
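
Last rows

Index | 연도 | 직종 | 회차 | 일련번호 | 과목명 | 과목별점수 | 총점 | 합격여부 | 성별 | 연령대
66853 | 2007 | 보건의료정보관리사 | 24 | 13371 | 의무기록관리학 | 61.0 | 104.0 | 불합격 |  | 20
33613 | 2003 | 보건의료정보관리사 | 20 | 6723 | 의무기록관리학 | 0.0 | 0.0 | 결시 |  | 20
64444 | 2006 | 보건의료정보관리사 | 23 | 12889 | 의무기록사 실기 | 77.5 | 218.5 | 합격 |  | 20
77704 | 2008 | 보건의료정보관리사 | 25 | 15541 | 의무기록사 실기 | 47.5 | 79.0 | 불합격 |  | 20
78712 | 2008 | 보건의료정보관리사 | 25 | 15743 | 의료관계법규 | 15.0 | 224.0 | 합격 |  | 20
58266 | 2006 | 보건의료정보관리사 | 23 | 11654 | 공중보건학 개론 | 16.0 | 235.0 | 합격 |  | 20
9092 | 2001 | 보건의료정보관리사 | 17 | 1819 | 의료관계법규 | 15.0 | 211.5 | 합격 |  | 20
66714 | 2007 | 보건의료정보관리사 | 24 | 13343 | 의무기록사 실기 | 55.0 | 195.0 | 불합격 |  | 20
85414 | 2008 | 보건의료정보관리사 | 25 | 17083 | 의무기록사 실기 | 77.5 | 207.5 | 합격 |  | 20
21865 | 2003 | 보건의료정보관리사 | 19 | 4374 | 의학용어 | 26.0 | 112.0 | 불합격 |  | 20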