Overview

Dataset statistics

Number of variables: 10
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 918.0 KiB
Average record size in memory: 94.0 B

Variable types

Categorical: 5
Numeric: 5

Dataset

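Description: Provides information for analyzing the scores of candidates who took the Level 3 Health Education Specialist (보건교육사 3급) national examination: year (연도), occupation (직종), exam round (회차), serial number (일련번호), subject name (과목명), score per subject (과목별점수), total score (총점), pass/fail result (합격여부), gender (성별), and age group (연령대).
URL: https://www.data.go.kr/data/15083526/fileData.do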

Alerts

직종 has constant value "" [Constant]
회차 is highly overall correlated with 일련번호 and 1 other field [High correlation]
일련번호 is highly overall correlated with 회차 and 1 other field [High correlation]
과목별점수 is highly overall correlated with 총점 and 2 other fields [High correlation]
총점 is highly overall correlated with 과목별점수 and 1 other field [High correlation]
연도 is highly overall correlated with 회차 and 1 other field [High correlation]
과목명 is highly overall correlated with 과목별점수 [High correlation]
합격여부 is highly overall correlated with 과목별점수 and 1 other field [High correlation]
성별 is highly imbalanced (59.0%) [Imbalance]
과목별점수 has 1527 (15.3%) zeros [Zeros]
총점 has 1521 (15.2%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-12 21:47:18.735082
Analysis finished: 2023-12-12 21:47:22.787566
Duration: 4.05 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연도
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
2012: 3253
2011: 2668
2010: 2579
2014: 807
2013: 693

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2010
2nd row: 2011
3rd row: 2012
4th row: 2012
5th row: 2012

Common Values

Value | Count | Frequency (%)
2012 | 3253 | 32.5%
2011 | 2668 | 26.7%
2010 | 2579 | 25.8%
2014 | 807 | 8.1%
2013 | 693 | 6.9%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

직종
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
보건교육사 3급: 10000

Length

Max length: 34
Median length: 34
Mean length: 34
Min length: 34

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 보건교육사 3급
2nd row: 보건교육사 3급
3rd row: 보건교육사 3급
4th row: 보건교육사 3급
5th row: 보건교육사 3급

Common Values

Value | Count | Frequency (%)
보건교육사 3급 | 10000 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Value | Count | Frequency (%)
보건교육사 | 10000 | 50.0%
3급 | 10000 | 50.0%

회차
Real number (ℝ)

HIGH CORRELATION 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.7101
Minimum: 1
Maximum: 6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 2
Q3: 4
95-th percentile: 6
Maximum: 6
Range: 5
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.5264636
Coefficient of variation (CV): 0.5632499
Kurtosis: -0.43077704
Mean: 2.7101
Median Absolute Deviation (MAD): 1
Skewness: 0.72163517
Sum: 27101
Variance: 2.330091
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=6)]

Common values (by frequency)

Value | Count | Frequency (%)
2 | 2668 | 26.7%
1 | 2579 | 25.8%
3 | 2133 | 21.3%
4 | 1120 | 11.2%
6 | 807 | 8.1%
5 | 693 | 6.9%

Smallest values (ascending)

Value | Count | Frequency (%)
1 | 2579 | 25.8%
2 | 2668 | 26.7%
3 | 2133 | 21.3%
4 | 1120 | 11.2%
5 | 693 | 6.9%
6 | 807 | 8.1%

Largest values (descending)

Value | Count | Frequency (%)
6 | 807 | 8.1%
5 | 693 | 6.9%
4 | 1120 | 11.2%
3 | 2133 | 21.3%
2 | 2668 | 26.7%
1 | 2579 | 25.8%
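
The descriptive statistics reported for 회차 can be cross-checked from the value counts alone. The following is a minimal sketch, assuming the reported variance is the sample variance (denominator n - 1); the helper name `stats_from_counts` is illustrative and not part of ydata-profiling.

```python
import math

# Value counts for 회차, copied from the frequency table above.
counts = {1: 2579, 2: 2668, 3: 2133, 4: 1120, 5: 693, 6: 807}


def stats_from_counts(counts):
    """Return (n, total, mean, sample variance, CV) for a value -> count mapping."""
    n = sum(counts.values())
    total = sum(value * count for value, count in counts.items())
    mean = total / n
    # Assumption: sample variance (n - 1 in the denominator).
    squared_dev = sum(count * (value - mean) ** 2 for value, count in counts.items())
    variance = squared_dev / (n - 1)
    # Coefficient of variation = standard deviation / mean.
    cv = math.sqrt(variance) / mean
    return n, total, mean, variance, cv


n, total, mean, variance, cv = stats_from_counts(counts)
print(n, total, mean, round(variance, 6), round(cv, 7))
```

Under that assumption the results agree with the Sum, Mean, Variance, and Coefficient of variation listed for this variable.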

일련번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8525
Distinct (%): 85.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11911.452
Minimum: 6
Maximum: 23823
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 6
5-th percentile: 1151.95
Q1: 5947.5
median: 11899.5
Q3: 17900.25
95-th percentile: 22621.15
Maximum: 23823
Range: 23817
Interquartile range (IQR): 11952.75

Descriptive statistics

Standard deviation: 6894.0685
Coefficient of variation (CV): 0.57877648
Kurtosis: -1.2079843
Mean: 11911.452
Median Absolute Deviation (MAD): 5972.5
Skewness: -0.0034285601
Sum: 1.1911452 × 10^8
Variance: 47528181
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values (by frequency)

Value | Count | Frequency (%)
12829 | 4 | < 0.1%
1464 | 4 | < 0.1%
15142 | 3 | < 0.1%
11630 | 3 | < 0.1%
1536 | 3 | < 0.1%
6788 | 3 | < 0.1%
4462 | 3 | < 0.1%
16653 | 3 | < 0.1%
4017 | 3 | < 0.1%
5088 | 3 | < 0.1%
Other values (8515) | 9968 | 99.7%

Smallest values (ascending)

Value | Count | Frequency (%)
6 | 2 | < 0.1%
13 | 2 | < 0.1%
14 | 1 | < 0.1%
18 | 1 | < 0.1%
19 | 1 | < 0.1%
20 | 1 | < 0.1%
22 | 1 | < 0.1%
25 | 1 | < 0.1%
33 | 1 | < 0.1%
34 | 1 | < 0.1%

Largest values (descending)

Value | Count | Frequency (%)
23823 | 1 | < 0.1%
23819 | 1 | < 0.1%
23815 | 2 | < 0.1%
23813 | 1 | < 0.1%
23810 | 1 | < 0.1%
23808 | 1 | < 0.1%
23806 | 1 | < 0.1%
23805 | 1 | < 0.1%
23796 | 1 | < 0.1%
23794 | 1 | < 0.1%

과목명
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
보건학 (public health): 2527
보건의료법규 (health and medical law): 2503
보건교육학 (health education): 2488
보건프로그램 개발 및 평가 (health program development and evaluation): 2482

Length

Max length: 14
Median length: 6
Mean length: 6.9787
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 보건의료법규
2nd row: 보건학
3rd row: 보건학
4th row: 보건의료법규
5th row: 보건교육학

Common Values

Value | Count | Frequency (%)
보건학 | 2527 | 25.3%
보건의료법규 | 2503 | 25.0%
보건교육학 | 2488 | 24.9%
보건프로그램 개발 및 평가 | 2482 | 24.8%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Value | Count | Frequency (%)
보건학 | 2527 | 14.5%
보건의료법규 | 2503 | 14.3%
보건교육학 | 2488 | 14.3%
보건프로그램 | 2482 | 14.2%
개발 | 2482 | 14.2%
및 | 2482 | 14.2%
평가 | 2482 | 14.2%
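
The percentages in the last table of this section are smaller than those under Common Values, which suggests it counts whitespace-separated words rather than whole categories. The sketch below tests that reading; the whitespace splitting rule is an assumption inferred from the figures, not something the report states.

```python
from collections import Counter

# Category counts for 과목명, copied from the Common Values table above.
category_counts = {
    "보건학": 2527,
    "보건의료법규": 2503,
    "보건교육학": 2488,
    "보건프로그램 개발 및 평가": 2482,
}

# Assumption: each category is split on whitespace and every resulting
# word inherits the count of the category it came from.
word_counts = Counter()
for category, count in category_counts.items():
    for word in category.split():
        word_counts[word] += count

total_words = sum(word_counts.values())
word_share = {
    word: round(100 * count / total_words, 1)
    for word, count in word_counts.items()
}

for word, count in word_counts.most_common():
    print(word, count, f"{word_share[word]}%")
```

The four-word category contributes four separate entries, so the denominator becomes the total number of words instead of the number of rows.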

과목별점수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 49
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.29095
Minimum: 0
Maximum: 38
Zeros: 1527
Zeros (%): 15.3%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 6
median: 12
Q3: 19
95-th percentile: 27
Maximum: 38
Range: 38
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 8.5546795
Coefficient of variation (CV): 0.69601451
Kurtosis: -0.71816197
Mean: 12.29095
Median Absolute Deviation (MAD): 6
Skewness: 0.29861695
Sum: 122909.5
Variance: 73.182541
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=49)]

Common values (by frequency)

Value | Count | Frequency (%)
0.0 | 1527 | 15.3%
12.0 | 432 | 4.3%
13.0 | 410 | 4.1%
11.0 | 395 | 4.0%
14.0 | 393 | 3.9%
7.0 | 361 | 3.6%
16.0 | 352 | 3.5%
15.0 | 340 | 3.4%
17.0 | 337 | 3.4%
18.0 | 336 | 3.4%
Other values (39) | 5117 | 51.2%

Smallest values (ascending)

Value | Count | Frequency (%)
0.0 | 1527 | 15.3%
0.5 | 1 | < 0.1%
1.0 | 4 | < 0.1%
1.5 | 6 | 0.1%
2.0 | 12 | 0.1%
2.5 | 24 | 0.2%
3.0 | 61 | 0.6%
3.5 | 79 | 0.8%
4.0 | 140 | 1.4%
4.5 | 165 | 1.7%

Largest values (descending)

Value | Count | Frequency (%)
38.0 | 2 | < 0.1%
37.0 | 1 | < 0.1%
36.0 | 5 | 0.1%
35.0 | 14 | 0.1%
34.0 | 21 | 0.2%
33.0 | 21 | 0.2%
32.0 | 47 | 0.5%
31.0 | 75 | 0.8%
30.0 | 75 | 0.8%
29.0 | 97 | 1.0%

총점
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 137
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49.23425
Minimum: 0
Maximum: 99
Zeros: 1521
Zeros (%): 15.2%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 45
median: 55.5
Q3: 64
95-th percentile: 76
Maximum: 99
Range: 99
Interquartile range (IQR): 19

Descriptive statistics

Standard deviation: 23.185524
Coefficient of variation (CV): 0.47092265
Kurtosis: 0.41462949
Mean: 49.23425
Median Absolute Deviation (MAD): 9
Skewness: -1.1866702
Sum: 492342.5
Variance: 537.56851
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values (by frequency)

Value | Count | Frequency (%)
0.0 | 1521 | 15.2%
61.0 | 187 | 1.9%
56.0 | 181 | 1.8%
58.0 | 169 | 1.7%
59.0 | 168 | 1.7%
57.0 | 165 | 1.7%
60.0 | 163 | 1.6%
56.5 | 161 | 1.6%
52.0 | 160 | 1.6%
54.0 | 159 | 1.6%
Other values (127) | 6966 | 69.7%

Smallest values (ascending)

Value | Count | Frequency (%)
0.0 | 1521 | 15.2%
10.0 | 1 | < 0.1%
12.0 | 1 | < 0.1%
15.0 | 1 | < 0.1%
20.0 | 1 | < 0.1%
20.5 | 1 | < 0.1%
22.5 | 4 | < 0.1%
26.0 | 1 | < 0.1%
26.5 | 2 | < 0.1%
27.0 | 1 | < 0.1%

Largest values (descending)

Value | Count | Frequency (%)
99.0 | 1 | < 0.1%
97.0 | 1 | < 0.1%
95.0 | 1 | < 0.1%
93.0 | 1 | < 0.1%
92.0 | 5 | 0.1%
91.0 | 6 | 0.1%
90.0 | 8 | 0.1%
89.0 | 6 | 0.1%
88.0 | 5 | 0.1%
87.0 | 12 | 0.1%

합격여부
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
불합격 (fail): 5069
합격 (pass): 3412
결시 (absent): 1519

Length

Max length: 3
Median length: 3
Mean length: 2.5069
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 합격
2nd row: 합격
3rd row: 결시
4th row: 불합격
5th row: 결시

Common Values

Value | Count | Frequency (%)
불합격 | 5069 | 50.7%
합격 | 3412 | 34.1%
결시 | 1519 | 15.2%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

성별
Categorical

IMBALANCE 

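Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
(value not shown): 9177
(value not shown): 823

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: (value not shown)
2nd row: (value not shown)
3rd row: (value not shown)
4th row: (value not shown)
5th row: (value not shown)

Common Values

Value | Count | Frequency (%)
(value not shown) | 9177 | 91.8%
(value not shown) | 823 | 8.2%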

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]
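
The 59.0% imbalance figure reported for 성별 is consistent with one minus the normalized Shannon entropy of the class counts. The sketch below works through that calculation; treating this as the formula behind the alert is an assumption, and the function name `imbalance_score` is illustrative.

```python
import math

# Class counts for 성별, copied from the Common Values table above.
counts = [9177, 823]


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits.

    The score is 0 for a perfectly even split and approaches 1 as a
    single class dominates. With fewer than two classes it returns 0.0
    (a convention chosen for this sketch).
    """
    k = len(counts)
    if k < 2:
        return 0.0
    n = sum(counts)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


score = imbalance_score(counts)
print(f"{100 * score:.1f}%")
```

A 50/50 split would score 0, so the 9177 versus 823 split sits well toward the imbalanced end of the scale.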

연령대
Real number (ℝ)

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.948
Minimum: 10
Maximum: 60
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10
5-th percentile: 20
Q1: 20
median: 30
Q3: 40
95-th percentile: 50
Maximum: 60
Range: 50
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 9.6302944
Coefficient of variation (CV): 0.33267564
Kurtosis: -0.56103844
Mean: 28.948
Median Absolute Deviation (MAD): 10
Skewness: 0.70584777
Sum: 289480
Variance: 92.74257
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=6)]

Common values (by frequency)

Value | Count | Frequency (%)
20 | 4502 | 45.0%
30 | 2724 | 27.2%
40 | 2118 | 21.2%
50 | 626 | 6.3%
60 | 28 | 0.3%
10 | 2 | < 0.1%

Smallest values (ascending)

Value | Count | Frequency (%)
10 | 2 | < 0.1%
20 | 4502 | 45.0%
30 | 2724 | 27.2%
40 | 2118 | 21.2%
50 | 626 | 6.3%
60 | 28 | 0.3%

Largest values (descending)

Value | Count | Frequency (%)
60 | 28 | 0.3%
50 | 626 | 6.3%
40 | 2118 | 21.2%
30 | 2724 | 27.2%
20 | 4502 | 45.0%
10 | 2 | < 0.1%
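
The quantile statistics for 연령대 can likewise be recovered from the value counts. This is a small sketch, assuming the report interpolates linearly between order statistics, which is what `statistics.quantiles` does with `method="inclusive"`.

```python
import statistics

# Value counts for 연령대, copied from the frequency table above.
counts = {10: 2, 20: 4502, 30: 2724, 40: 2118, 50: 626, 60: 28}

# Expand the counts back into a sorted list of observations.
values = sorted(value for value, count in counts.items() for _ in range(count))

# method="inclusive" interpolates linearly between order statistics and
# treats the sample minimum and maximum as the 0th and 100th percentiles.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
twentieths = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]
iqr = q3 - q1

print(p5, q1, median, q3, p95, iqr)
```

Because the variable takes only six distinct values, every quantile lands exactly on one of the age bands and no interpolation between different values occurs.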

Interactions

[Figures: 25 interaction plots]

Correlations

[Figure: correlation heatmap]

Variable | 연도 | 회차 | 일련번호 | 과목명 | 과목별점수 | 총점 | 합격여부 | 성별 | 연령대
연도 | 1.000 | 1.000 | 0.988 | 0.007 | 0.268 | 0.454 | 0.154 | 0.048 | 0.267
회차 | 1.000 | 1.000 | 0.928 | 0.000 | 0.241 | 0.399 | 0.439 | 0.082 | 0.427
일련번호 | 0.988 | 0.928 | 1.000 | 0.000 | 0.246 | 0.420 | 0.272 | 0.077 | 0.311
과목명 | 0.007 | 0.000 | 0.000 | 1.000 | 0.774 | 0.000 | 0.000 | 0.008 | 0.000
과목별점수 | 0.268 | 0.241 | 0.246 | 0.774 | 1.000 | 0.780 | 0.801 | 0.042 | 0.110
총점 | 0.454 | 0.399 | 0.420 | 0.000 | 0.780 | 1.000 | 0.959 | 0.085 | 0.180
합격여부 | 0.154 | 0.439 | 0.272 | 0.000 | 0.801 | 0.959 | 1.000 | 0.036 | 0.246
성별 | 0.048 | 0.082 | 0.077 | 0.008 | 0.042 | 0.085 | 0.036 | 1.000 | 0.259
연령대 | 0.267 | 0.427 | 0.311 | 0.000 | 0.110 | 0.180 | 0.246 | 0.259 | 1.000

[Figure: correlation heatmap]

Variable | 연도 | 합격여부 | 과목명 | 성별
연도 | 1.000 | 0.117 | 0.006 | 0.059
합격여부 | 0.117 | 1.000 | 0.000 | 0.061
과목명 | 0.006 | 0.000 | 1.000 | 0.005
성별 | 0.059 | 0.061 | 0.005 | 1.000

[Figure: correlation heatmap]

Variable | 회차 | 일련번호 | 과목별점수 | 총점 | 연령대 | 연도 | 과목명 | 합격여부 | 성별
회차 | 1.000 | 0.976 | 0.061 | 0.115 | -0.313 | 1.000 | 0.000 | 0.202 | 0.059
일련번호 | 0.976 | 1.000 | 0.059 | 0.107 | -0.308 | 0.844 | 0.000 | 0.169 | 0.059
과목별점수 | 0.061 | 0.059 | 1.000 | 0.596 | -0.107 | 0.115 | 0.589 | 0.692 | 0.035
총점 | 0.115 | 0.107 | 0.596 | 1.000 | -0.187 | 0.205 | 0.000 | 0.957 | 0.065
연령대 | -0.313 | -0.308 | -0.107 | -0.187 | 1.000 | 0.184 | 0.000 | 0.105 | 0.186
연도 | 1.000 | 0.844 | 0.115 | 0.205 | 0.184 | 1.000 | 0.006 | 0.117 | 0.059
과목명 | 0.000 | 0.000 | 0.589 | 0.000 | 0.000 | 0.006 | 1.000 | 0.000 | 0.005
합격여부 | 0.202 | 0.169 | 0.692 | 0.957 | 0.105 | 0.117 | 0.000 | 1.000 | 0.061
성별 | 0.059 | 0.059 | 0.035 | 0.065 | 0.186 | 0.059 | 0.005 | 0.061 | 1.000
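
The "High correlation" alerts near the top of the report can be reproduced from the first correlation matrix in this section. The sketch below flags every off-diagonal entry whose absolute value is at least 0.5; that cut-off is an assumption chosen because it matches the alerts, and `high_correlations` is a hypothetical helper.

```python
# Variable order and values copied from the first correlation matrix above.
names = ["연도", "회차", "일련번호", "과목명", "과목별점수",
         "총점", "합격여부", "성별", "연령대"]
matrix = [
    [1.000, 1.000, 0.988, 0.007, 0.268, 0.454, 0.154, 0.048, 0.267],
    [1.000, 1.000, 0.928, 0.000, 0.241, 0.399, 0.439, 0.082, 0.427],
    [0.988, 0.928, 1.000, 0.000, 0.246, 0.420, 0.272, 0.077, 0.311],
    [0.007, 0.000, 0.000, 1.000, 0.774, 0.000, 0.000, 0.008, 0.000],
    [0.268, 0.241, 0.246, 0.774, 1.000, 0.780, 0.801, 0.042, 0.110],
    [0.454, 0.399, 0.420, 0.000, 0.780, 1.000, 0.959, 0.085, 0.180],
    [0.154, 0.439, 0.272, 0.000, 0.801, 0.959, 1.000, 0.036, 0.246],
    [0.048, 0.082, 0.077, 0.008, 0.042, 0.085, 0.036, 1.000, 0.259],
    [0.267, 0.427, 0.311, 0.000, 0.110, 0.180, 0.246, 0.259, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, not stated in the report


def high_correlations(names, matrix, threshold):
    """Map each variable to the other variables it is strongly correlated with."""
    result = {}
    for i, name in enumerate(names):
        partners = [
            names[j]
            for j, value in enumerate(matrix[i])
            if j != i and abs(value) >= threshold
        ]
        if partners:
            result[name] = partners
    return result


high = high_correlations(names, matrix, THRESHOLD)
for name, partners in high.items():
    print(name, "->", ", ".join(partners))
```

With this cut-off, seven variables are flagged and 성별 and 연령대 are not, which lines up with the alert list.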

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

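(index) | 연도 | 직종 | 회차 | 일련번호 | 과목명 | 과목별점수 | 총점 | 합격여부 | 성별 | 연령대
10478 | 2010 | 보건교육사 3급 | 1 | 2620 | 보건의료법규 | 7.5 | 65.5 | 합격 |  | 30
30173 | 2011 | 보건교육사 3급 | 2 | 7544 | 보건학 | 19.0 | 60.5 | 합격 |  | 30
78457 | 2012 | 보건교육사 3급 | 4 | 19615 | 보건학 | 0.0 | 0.0 | 결시 |  | 50
58766 | 2012 | 보건교육사 3급 | 3 | 14692 | 보건의료법규 | 3.0 | 43.0 | 불합격 |  | 20
64568 | 2012 | 보건교육사 3급 | 3 | 16143 | 보건교육학 | 0.0 | 0.0 | 결시 |  | 20
46213 | 2011 | 보건교육사 3급 | 2 | 11554 | 보건학 | 12.0 | 50.0 | 불합격 |  | 20
44148 | 2011 | 보건교육사 3급 | 2 | 11038 | 보건교육학 | 21.0 | 52.0 | 불합격 |  | 40
81115 | 2013 | 보건교육사 3급 | 5 | 20279 | 보건프로그램 개발 및 평가 | 17.0 | 91.0 | 합격 |  | 20
12720 | 2010 | 보건교육사 3급 | 1 | 3181 | 보건교육학 | 0.0 | 0.0 | 결시 |  | 20
37589 | 2011 | 보건교육사 3급 | 2 | 9398 | 보건학 | 21.0 | 65.0 | 합격 |  | 30

(index) | 연도 | 직종 | 회차 | 일련번호 | 과목명 | 과목별점수 | 총점 | 합격여부 | 성별 | 연령대
58993 | 2012 | 보건교육사 3급 | 3 | 14749 | 보건학 | 17.0 | 40.5 | 불합격 |  | 30
72343 | 2012 | 보건교육사 3급 | 4 | 18086 | 보건프로그램 개발 및 평가 | 11.0 | 65.0 | 합격 |  | 30
67067 | 2012 | 보건교육사 3급 | 3 | 16767 | 보건프로그램 개발 및 평가 | 8.0 | 56.0 | 불합격 |  | 30
56954 | 2012 | 보건교육사 3급 | 3 | 14239 | 보건의료법규 | 6.0 | 48.0 | 불합격 |  | 50
44843 | 2011 | 보건교육사 3급 | 2 | 11211 | 보건프로그램 개발 및 평가 | 11.0 | 49.0 | 불합격 |  | 20
25570 | 2011 | 보건교육사 3급 | 2 | 6393 | 보건의료법규 | 4.0 | 51.0 | 불합격 |  | 20
32039 | 2011 | 보건교육사 3급 | 2 | 8010 | 보건프로그램 개발 및 평가 | 11.0 | 58.5 | 불합격 |  | 30
43014 | 2011 | 보건교육사 3급 | 2 | 10754 | 보건의료법규 | 7.5 | 61.5 | 합격 |  | 20
59745 | 2012 | 보건교육사 3급 | 3 | 14937 | 보건학 | 22.0 | 60.5 | 합격 |  | 40
63314 | 2012 | 보건교육사 3급 | 3 | 15829 | 보건의료법규 | 4.0 | 40.0 | 불합격 |  | 30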