Overview

Dataset statistics

Number of variables: 10
Number of observations: 2948
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 247.7 KiB
Average record size in memory: 86.0 B
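
The average record size follows from the two figures above it. A minimal check, assuming the report uses 1 KiB = 1024 bytes (the function name is illustrative, not part of any library):

```python
# Figures copied from the dataset statistics in this report.
total_size_kib = 247.7
n_observations = 2948


def average_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Average bytes per row, taking 1 KiB = 1024 bytes."""
    return total_kib * 1024 / n_rows


avg = average_record_size_bytes(total_size_kib, n_observations)
print(f"{avg:.1f} B")  # compare with the reported 86.0 B
```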

Variable types

Numeric: 5
Categorical: 5

Dataset

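Description: Provides information for analyzing the score records of candidates for the national midwife examination: year (연도), occupation (직종), exam round (회차), serial number (일련번호), subject name (과목명), subject score (과목별점수), total score (총점), pass/fail result (합격여부), sex (성별), and age group (연령대).
URL: https://www.data.go.kr/data/15060449/fileData.do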

Alerts

직종 has constant value "" [Constant]
성별 has constant value "" [Constant]
연도 is highly overall correlated with 회차 and 1 other field [High correlation]
회차 is highly overall correlated with 연도 and 1 other field [High correlation]
일련번호 is highly overall correlated with 연도 and 1 other field [High correlation]
과목별점수 is highly overall correlated with 과목명 [High correlation]
총점 is highly overall correlated with 합격여부 [High correlation]
과목명 is highly overall correlated with 과목별점수 [High correlation]
합격여부 is highly overall correlated with 총점 [High correlation]
합격여부 is highly imbalanced (82.1%) [Imbalance]
과목별점수 has 44 (1.5%) zeros [Zeros]
총점 has 44 (1.5%) zeros [Zeros]
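
The 82.1% imbalance figure for 합격여부 can be reproduced from the class counts listed later in this report (2828, 76, 44). The sketch below uses one minus the normalized Shannon entropy. That this is the exact formula behind the alert is an assumption, but it does yield the same percentage for these counts.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k): 0 for a uniform column, near 1 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)


# Class counts for 합격여부 as listed in this report: 합격, 불합격, 결시.
pass_fail_counts = [2828, 76, 44]
score = imbalance_score(pass_fail_counts)
print(f"{score:.1%}")
```

A perfectly balanced column scores 0 under this definition, so a value above 0.8 means a single class (here 합격) accounts for nearly all rows.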

Reproduction

Analysis started: 2023-12-12 16:37:07.892186
Analysis finished: 2023-12-12 16:37:12.191361
Duration: 4.3 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
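
A report of this kind can be regenerated roughly as follows. This is a sketch under stated assumptions: the CSV has been downloaded from the dataset URL, its encoding is cp949 (common for data.go.kr files but not confirmed here), and the function name, title, and file paths are hypothetical. The actual run used the settings stored in config.json, which are not reproduced.

```python
def build_report(csv_path: str, html_path: str, encoding: str = "cp949") -> None:
    """Profile a CSV file and write the HTML report to html_path."""
    # Imported lazily so this module loads even where the third-party
    # packages (pandas, ydata-profiling) are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)  # encoding is an assumption
    profile = ProfileReport(df, title="Midwife national examination scores")
    profile.to_file(html_path)
```

Calling `build_report("midwife_exam_scores.csv", "midwife_exam_report.html")` with a local copy of the file would write an HTML report with the default profiling settings.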

Variables

연도
Real number (ℝ)

HIGH CORRELATION 

Distinct: 24
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2007.1072
Minimum: 2000
Maximum: 2023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 26.0 KiB

Quantile statistics

Minimum: 2000
5th percentile: 2000
Q1: 2002
Median: 2004
Q3: 2012
95th percentile: 2020
Maximum: 2023
Range: 23
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 6.5238496
Coefficient of variation (CV): 0.0032503743
Kurtosis: -0.44586949
Mean: 2007.1072
Median Absolute Deviation (MAD): 3
Skewness: 0.87198399
Sum: 5916952
Variance: 42.560613
Monotonicity: Increasing
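
Several of these figures are derived from others in the same block. A short check using only numbers printed above for 연도 (the variable names are illustrative):

```python
# Values copied from the 연도 statistics in this report.
minimum, maximum = 2000, 2023
q1, q3 = 2002, 2012
mean, std = 2007.1072, 6.5238496

value_range = maximum - minimum  # reported Range: 23
iqr = q3 - q1                    # reported IQR: 10
cv = std / mean                  # reported CV: 0.0032503743
variance = std ** 2              # reported Variance: 42.560613

print(value_range, iqr)
print(f"{cv:.10f}")
print(f"{variance:.6f}")
```

The same relations (range, IQR, CV = standard deviation / mean, variance = standard deviation squared) apply to the other numeric variables in this report.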
Histogram with fixed size bins (bins=24)

Common values

Value              Count  Frequency (%)
2001               312    10.6%
2002               304    10.3%
2000               300    10.2%
2003               300    10.2%
2004               260    8.8%
2005               152    5.2%
2009               148    5.0%
2007               108    3.7%
2008               108    3.7%
2006               104    3.5%
Other values (14)  852    28.9%

Minimum 10 values

Value  Count  Frequency (%)
2000   300    10.2%
2001   312    10.6%
2002   304    10.3%
2003   300    10.2%
2004   260    8.8%
2005   152    5.2%
2006   104    3.5%
2007   108    3.7%
2008   108    3.7%
2009   148    5.0%

Maximum 10 values

Value  Count  Frequency (%)
2023   40     1.4%
2022   48     1.6%
2021   48     1.6%
2020   52     1.8%
2019   56     1.9%
2018   84     2.8%
2017   64     2.2%
2016   76     2.6%
2015   76     2.6%
2014   68     2.3%

직종
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 23.2 KiB
Value counts: 조산사 (midwife) 2948

Length

Max length: 37
Median length: 37
Mean length: 37
Min length: 37

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 조산사
2nd row: 조산사
3rd row: 조산사
4th row: 조산사
5th row: 조산사

Common Values

Value   Count  Frequency (%)
조산사  2948   100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of the value counts listed under Common Values]

회차
Real number (ℝ)

HIGH CORRELATION 

Distinct: 24
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.107191
Minimum: 11
Maximum: 34
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 26.0 KiB

Quantile statistics

Minimum: 11
5th percentile: 11
Q1: 13
Median: 15
Q3: 23
95th percentile: 31
Maximum: 34
Range: 23
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 6.5238496
Coefficient of variation (CV): 0.36029053
Kurtosis: -0.44586949
Mean: 18.107191
Median Absolute Deviation (MAD): 3
Skewness: 0.87198399
Sum: 53380
Variance: 42.560613
Monotonicity: Increasing
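
회차 has the same standard deviation, skewness, and kurtosis as 연도, which is what a constant shift between the two columns would produce. The sketch below infers that shift from the (연도, 회차) pairs in the first and last sample rows of this report. The offset of 1989 is an inference from those rows, not something the data provider documents.

```python
# (연도, 회차) pairs read off the first and last sample rows of this report.
observed_pairs = [(2000, 11), (2023, 34)]

OFFSET = 1989  # inferred from the pairs above; an assumption, not documented


def round_from_year(year: int) -> int:
    """Exam round implied by a constant year-to-round offset."""
    return year - OFFSET


for year, exam_round in observed_pairs:
    print(year, exam_round, round_from_year(year) == exam_round)

# The same offset links the reported means (2007.1072 and 18.107191).
print(round(2007.1072 - OFFSET, 4))
```

If the shift holds for every row, the two columns carry the same information, which is consistent with the correlation of 1.000 reported between them.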
Histogram with fixed size bins (bins=24)

Common values

Value              Count  Frequency (%)
12                 312    10.6%
13                 304    10.3%
11                 300    10.2%
14                 300    10.2%
15                 260    8.8%
16                 152    5.2%
20                 148    5.0%
18                 108    3.7%
19                 108    3.7%
17                 104    3.5%
Other values (14)  852    28.9%

Minimum 10 values

Value  Count  Frequency (%)
11     300    10.2%
12     312    10.6%
13     304    10.3%
14     300    10.2%
15     260    8.8%
16     152    5.2%
17     104    3.5%
18     108    3.7%
19     108    3.7%
20     148    5.0%

Maximum 10 values

Value  Count  Frequency (%)
34     40     1.4%
33     48     1.6%
32     48     1.6%
31     52     1.8%
30     56     1.9%
29     84     2.8%
28     64     2.2%
27     76     2.6%
26     76     2.6%
25     68     2.3%

일련번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 737
Distinct (%): 25.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 369
Minimum: 1
Maximum: 737
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 26.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 37.35
Q1: 185
Median: 369
Q3: 553
95th percentile: 700.65
Maximum: 737
Range: 736
Interquartile range (IQR): 368

Descriptive statistics

Standard deviation: 212.78947
Coefficient of variation (CV): 0.57666524
Kurtosis: -1.2000041
Mean: 369
Median Absolute Deviation (MAD): 184
Skewness: 0
Sum: 1087812
Variance: 45279.359
Monotonicity: Increasing
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
1                   4      0.1%
496                 4      0.1%
487                 4      0.1%
488                 4      0.1%
489                 4      0.1%
490                 4      0.1%
491                 4      0.1%
492                 4      0.1%
493                 4      0.1%
494                 4      0.1%
Other values (727)  2908   98.6%

Minimum 10 values

Value  Count  Frequency (%)
1      4      0.1%
2      4      0.1%
3      4      0.1%
4      4      0.1%
5      4      0.1%
6      4      0.1%
7      4      0.1%
8      4      0.1%
9      4      0.1%
10     4      0.1%

Maximum 10 values

Value  Count  Frequency (%)
737    4      0.1%
736    4      0.1%
735    4      0.1%
734    4      0.1%
733    4      0.1%
732    4      0.1%
731    4      0.1%
730    4      0.1%
729    4      0.1%
728    4      0.1%

과목명
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 23.2 KiB
Value counts: 모자보건법 737, 모자보건학 737, 신생아간호학 737, 조산학 737

Length

Max length: 6
Median length: 5.5
Mean length: 4.75
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 모자보건법
2nd row: 모자보건학
3rd row: 신생아간호학
4th row: 조산학
5th row: 모자보건법

Common Values

Value                                   Count  Frequency (%)
모자보건법 (Mother and Child Health Act)  737    25.0%
모자보건학 (maternal and child health)    737    25.0%
신생아간호학 (neonatal nursing)           737    25.0%
조산학 (midwifery)                       737    25.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of the value counts listed under Common Values]

과목별점수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 98
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 36.778833
Minimum: 0
Maximum: 129
Zeros: 44
Zeros (%): 1.5%
Negative: 0
Negative (%): 0.0%
Memory size: 26.0 KiB

Quantile statistics

Minimum: 0
5th percentile: 7
Q1: 10
Median: 23
Q3: 34
95th percentile: 106
Maximum: 129
Range: 129
Interquartile range (IQR): 24

Descriptive statistics

Standard deviation: 35.94238
Coefficient of variation (CV): 0.97725722
Kurtosis: -0.46857413
Mean: 36.778833
Median Absolute Deviation (MAD): 13
Skewness: 1.0973984
Sum: 108424
Variance: 1291.8547
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
9                  245    8.3%
8                  166    5.6%
10                 160    5.4%
7                  119    4.0%
29                 97     3.3%
27                 96     3.3%
26                 86     2.9%
28                 85     2.9%
16                 82     2.8%
25                 81     2.7%
Other values (88)  1731   58.7%

Minimum 10 values

Value  Count  Frequency (%)
0      44     1.5%
2      1      < 0.1%
3      5      0.2%
4      10     0.3%
5      25     0.8%
6      60     2.0%
7      119    4.0%
8      166    5.6%
9      245    8.3%
10     160    5.4%

Maximum 10 values

Value  Count  Frequency (%)
129    1      < 0.1%
126    1      < 0.1%
124    2      0.1%
123    3      0.1%
121    2      0.1%
119    3      0.1%
118    2      0.1%
117    1      < 0.1%
116    10     0.3%
115    2      0.1%

총점
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 82
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 147.11533
Minimum: 0
Maximum: 186
Zeros: 44
Zeros (%): 1.5%
Negative: 0
Negative (%): 0.0%
Memory size: 26.0 KiB

Quantile statistics

Minimum: 0
5th percentile: 117.7
Q1: 141
Median: 152
Q3: 159
95th percentile: 169
Maximum: 186
Range: 186
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 23.569697
Coefficient of variation (CV): 0.16021238
Kurtosis: 20.659886
Mean: 147.11533
Median Absolute Deviation (MAD): 9
Skewness: -3.8405367
Sum: 433696
Variance: 555.53064
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
155                144    4.9%
152                132    4.5%
153                112    3.8%
148                112    3.8%
159                100    3.4%
149                96     3.3%
158                96     3.3%
161                92     3.1%
151                92     3.1%
164                88     3.0%
Other values (72)  1884   63.9%

Minimum 10 values

Value  Count  Frequency (%)
0      44     1.5%
64     4      0.1%
79     8      0.3%
85     4      0.1%
91     4      0.1%
95     4      0.1%
96     4      0.1%
98     8      0.3%
104    4      0.1%
106    4      0.1%

Maximum 10 values

Value  Count  Frequency (%)
186    4      0.1%
185    4      0.1%
182    8      0.3%
178    8      0.3%
177    4      0.1%
176    16     0.5%
175    4      0.1%
174    8      0.3%
173    16     0.5%
172    24     0.8%

합격여부
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 23.2 KiB
Value counts: 합격 2828, 불합격 76, 결시 44

Length

Max length: 3
Median length: 2
Mean length: 2.0257802
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 합격
2nd row: 합격
3rd row: 합격
4th row: 합격
5th row: 합격

Common Values

Value          Count  Frequency (%)
합격 (pass)     2828   95.9%
불합격 (fail)   76     2.6%
결시 (absent)   44     1.5%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of the value counts listed under Common Values]

성별
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 23.2 KiB
Value counts: (blank) 2948

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: (blank)
2nd row: (blank)
3rd row: (blank)
4th row: (blank)
5th row: (blank)

Common Values

Value    Count  Frequency (%)
(blank)  2948   100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of the value counts listed under Common Values]

연령대
Categorical

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 23.2 KiB
Value counts: 20 2204, 30 592, 40 116, 50 36

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20
2nd row: 20
3rd row: 20
4th row: 20
5th row: 20

Common Values

Value  Count  Frequency (%)
20     2204   74.8%
30     592    20.1%
40     116    3.9%
50     36     1.2%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of the value counts listed under Common Values]

Interactions

[Figures: interaction plots for each pair of the 5 numeric variables (25 plots)]

Correlations

(variable)  연도   회차   일련번호  과목명  과목별점수  총점   합격여부  연령대
연도        1.000  1.000  0.942  0.000  0.411  0.496  0.202  0.438
회차        1.000  1.000  0.981  0.000  0.419  0.489  0.197  0.430
일련번호    0.942  0.981  1.000  0.000  0.398  0.460  0.192  0.411
과목명      0.000  0.000  0.000  1.000  0.914  0.000  0.000  0.000
과목별점수  0.411  0.419  0.398  0.914  1.000  0.526  0.436  0.196
총점        0.496  0.489  0.460  0.000  0.526  1.000  0.881  0.593
합격여부    0.202  0.197  0.192  0.000  0.436  0.881  1.000  0.267
연령대      0.438  0.430  0.411  0.000  0.196  0.593  0.267  1.000

(variable)  과목명  합격여부  연령대
과목명      1.000  0.000  0.000
합격여부    0.000  1.000  0.255
연령대      0.000  0.255  1.000

(variable)  연도    회차    일련번호  과목별점수  총점    과목명  합격여부  연령대
연도        1.000   1.000   0.997   -0.074  -0.252  0.000  0.119  0.270
회차        1.000   1.000   0.997   -0.074  -0.252  0.000  0.119  0.270
일련번호    0.997   0.997   1.000   -0.073  -0.245  0.000  0.116  0.257
과목별점수  -0.074  -0.074  -0.073  1.000   0.200   0.806  0.291  0.118
총점        -0.252  -0.252  -0.245  0.200   1.000   0.000  0.851  0.298
과목명      0.000   0.000   0.000   0.806   0.000   1.000  0.000  0.000
합격여부    0.119   0.119   0.116   0.291   0.851   0.000  1.000  0.255
연령대      0.270   0.270   0.257   0.118   0.298   0.000  0.255  1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

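First rows

Row  연도  직종  회차  일련번호  과목명  과목별점수  총점  합격여부  성별  연령대
0  2000  조산사  11  1  모자보건법  5  138  합격  (blank)  20
1  2000  조산사  11  1  모자보건학  25  138  합격  (blank)  20
2  2000  조산사  11  1  신생아간호학  27  138  합격  (blank)  20
3  2000  조산사  11  1  조산학  81  138  합격  (blank)  20
4  2000  조산사  11  2  모자보건법  7  151  합격  (blank)  20
5  2000  조산사  11  2  모자보건학  29  151  합격  (blank)  20
6  2000  조산사  11  2  신생아간호학  21  151  합격  (blank)  20
7  2000  조산사  11  2  조산학  94  151  합격  (blank)  20
8  2000  조산사  11  3  모자보건법  8  146  합격  (blank)  20
9  2000  조산사  11  3  모자보건학  28  146  합격  (blank)  20

Last rows

Row  연도  직종  회차  일련번호  과목명  과목별점수  총점  합격여부  성별  연령대
2938  2023  조산사  34  735  신생아간호학  0  0  결시  (blank)  50
2939  2023  조산사  34  735  조산학  0  0  결시  (blank)  50
2940  2023  조산사  34  736  모자보건법  10  137  합격  (blank)  20
2941  2023  조산사  34  736  모자보건학  12  137  합격  (blank)  20
2942  2023  조산사  34  736  신생아간호학  24  137  합격  (blank)  20
2943  2023  조산사  34  736  조산학  91  137  합격  (blank)  20
2944  2023  조산사  34  737  모자보건법  10  141  합격  (blank)  30
2945  2023  조산사  34  737  모자보건학  12  141  합격  (blank)  30
2946  2023  조산사  34  737  신생아간호학  24  141  합격  (blank)  30
2947  2023  조산사  34  737  조산학  95  141  합격  (blank)  30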
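
The sample rows suggest a long layout: each examinee (일련번호) occupies four rows, one per subject, and 총점 repeats the sum of that examinee's four 과목별점수 values. The sketch below checks this on the four examinees whose rows are fully visible in the sample. Whether it holds for every examinee is an assumption that would need the full file to confirm.

```python
from collections import defaultdict

# (일련번호, 과목별점수, 총점) for sample rows with all four subjects visible.
sample_rows = [
    (1, 5, 138), (1, 25, 138), (1, 27, 138), (1, 81, 138),
    (2, 7, 151), (2, 29, 151), (2, 21, 151), (2, 94, 151),
    (736, 10, 137), (736, 12, 137), (736, 24, 137), (736, 91, 137),
    (737, 10, 141), (737, 12, 141), (737, 24, 141), (737, 95, 141),
]

subject_sum = defaultdict(int)  # per-examinee sum of subject scores
reported_total = {}             # per-examinee 총점 as printed in the sample
for serial, subject_score, total in sample_rows:
    subject_sum[serial] += subject_score
    reported_total[serial] = total

for serial in sorted(subject_sum):
    print(serial, subject_sum[serial], reported_total[serial])

# Four subject rows per examinee also accounts for the row count.
print(737 * 4)
```

Because 총점 is repeated on all four rows of an examinee, its statistics (for example the 44 zeros, which correspond to 11 examinees) count each examinee four times.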