Overview

Dataset statistics

Number of variables: 10
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 918.0 KiB
Average record size in memory: 94.0 B
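
The average record size appears to be the total in-memory size divided by the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes (the variable names are illustrative, not taken from the report):

```python
# Figures copied from the dataset statistics above.
TOTAL_SIZE_KIB = 918.0
N_OBSERVATIONS = 10_000

# Assumption: KiB means 1024 bytes.
total_bytes = TOTAL_SIZE_KIB * 1024
avg_record_bytes = total_bytes / N_OBSERVATIONS

print(f"{avg_record_bytes:.1f} B per record")
```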

Variable types

Numeric: 5
Categorical: 5

Dataset

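Description: Provides information for analyzing the exam results of candidates for the national dentist licensing examination: year (연도), occupation (직종), exam round (회차), serial number (일련번호), subject name (과목명), score by subject (과목별점수), total score (총점), pass/fail result (합격여부), sex (성별), and age group (연령대).
URL: https://www.data.go.kr/data/15060447/fileData.do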

Alerts

직종 has constant value "치과의사" [Constant]
연도 is highly overall correlated with 회차 and 1 other field [High correlation]
회차 is highly overall correlated with 연도 and 1 other field [High correlation]
일련번호 is highly overall correlated with 연도 and 1 other field [High correlation]
과목별점수 is highly overall correlated with 합격여부 [High correlation]
총점 is highly overall correlated with 합격여부 [High correlation]
합격여부 is highly overall correlated with 과목별점수 and 1 other field [High correlation]
합격여부 is highly imbalanced (54.1%) [Imbalance]
과목별점수 has 280 (2.8%) zeros [Zeros]
총점 has 274 (2.7%) zeros [Zeros]
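
The 54.1% imbalance figure for 합격여부 is consistent with one minus the normalized Shannon entropy of the class counts. The sketch below reproduces it from the counts listed for 합격여부 in this report (7714, 2002, 232, 52). The formula is inferred from the numbers, not stated in the report, and `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class distribution and k is the number of classes.
    0.0 means perfectly balanced; values near 1.0 mean one class dominates."""
    k = len(counts)
    if k <= 1:
        return 0.0
    n = sum(counts)
    probs = [c / n for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(k)

# Class counts for 합격여부: 합격, 불합격, 결시, 응시결격.
pass_fail_counts = [7714, 2002, 232, 52]
score = imbalance_score(pass_fail_counts)
print(f"imbalance: {score:.1%}")
```

Under this reading, the alert reflects how far the four outcomes are from a uniform 25% split, not the share of the majority class alone.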

Reproduction

Analysis started: 2023-12-12 13:39:55.663147
Analysis finished: 2023-12-12 13:40:01.242583
Duration: 5.58 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연도
Real number (ℝ)

HIGH CORRELATION 

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2002.8531
Minimum: 2000
Maximum: 2006
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2000
5-th percentile: 2000
Q1: 2001
Median: 2003
Q3: 2004
95-th percentile: 2006
Maximum: 2006
Range: 6
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.9342426
Coefficient of variation (CV): 0.00096574363
Kurtosis: -1.191389
Mean: 2002.8531
Median Absolute Deviation (MAD): 2
Skewness: 0.041979105
Sum: 20028531
Variance: 3.7412945
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Most frequent values:
Value | Count | Frequency (%)
2003 | 1571 | 15.7%
2004 | 1521 | 15.2%
2000 | 1515 | 15.2%
2001 | 1503 | 15.0%
2005 | 1445 | 14.4%
2002 | 1416 | 14.2%
2006 | 1029 | 10.3%

Smallest values:
Value | Count | Frequency (%)
2000 | 1515 | 15.2%
2001 | 1503 | 15.0%
2002 | 1416 | 14.2%
2003 | 1571 | 15.7%
2004 | 1521 | 15.2%
2005 | 1445 | 14.4%
2006 | 1029 | 10.3%

Largest values:
Value | Count | Frequency (%)
2006 | 1029 | 10.3%
2005 | 1445 | 14.4%
2004 | 1521 | 15.2%
2003 | 1571 | 15.7%
2002 | 1416 | 14.2%
2001 | 1503 | 15.0%
2000 | 1515 | 15.2%

직종
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

치과의사: 10000

Length

Max length: 36
Median length: 36
Mean length: 36
Min length: 36

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 치과의사
2nd row: 치과의사
3rd row: 치과의사
4th row: 치과의사
5th row: 치과의사

Common Values

Value | Count | Frequency (%)
치과의사 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)


회차
Real number (ℝ)

HIGH CORRELATION 

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 54.8531
Minimum: 52
Maximum: 58
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 52
5-th percentile: 52
Q1: 53
Median: 55
Q3: 56
95-th percentile: 58
Maximum: 58
Range: 6
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.9342426
Coefficient of variation (CV): 0.03526223
Kurtosis: -1.191389
Mean: 54.8531
Median Absolute Deviation (MAD): 2
Skewness: 0.041979105
Sum: 548531
Variance: 3.7412945
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Most frequent values:
Value | Count | Frequency (%)
55 | 1571 | 15.7%
56 | 1521 | 15.2%
52 | 1515 | 15.2%
53 | 1503 | 15.0%
57 | 1445 | 14.4%
54 | 1416 | 14.2%
58 | 1029 | 10.3%

Smallest values:
Value | Count | Frequency (%)
52 | 1515 | 15.2%
53 | 1503 | 15.0%
54 | 1416 | 14.2%
55 | 1571 | 15.7%
56 | 1521 | 15.2%
57 | 1445 | 14.4%
58 | 1029 | 10.3%

Largest values:
Value | Count | Frequency (%)
58 | 1029 | 10.3%
57 | 1445 | 14.4%
56 | 1521 | 15.2%
55 | 1571 | 15.7%
54 | 1416 | 14.2%
53 | 1503 | 15.0%
52 | 1515 | 15.2%

일련번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 5564
Distinct (%): 55.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3632.638
Minimum: 1
Maximum: 7327
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 377
Q1: 1809.5
Median: 3638.5
Q3: 5441
95-th percentile: 6935
Maximum: 7327
Range: 7326
Interquartile range (IQR): 3631.5

Descriptive statistics

Standard deviation: 2104.4087
Coefficient of variation (CV): 0.57930592
Kurtosis: -1.1894507
Mean: 3632.638
Median Absolute Deviation (MAD): 1814
Skewness: 0.01596127
Sum: 36326380
Variance: 4428535.9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values:
Value | Count | Frequency (%)
1704 | 7 | 0.1%
5496 | 6 | 0.1%
6527 | 6 | 0.1%
4838 | 6 | 0.1%
6957 | 6 | 0.1%
4955 | 6 | 0.1%
3769 | 5 | 0.1%
3814 | 5 | 0.1%
6703 | 5 | 0.1%
5130 | 5 | 0.1%
Other values (5554) | 9943 | 99.4%

Smallest values:
Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 2 | < 0.1%
4 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 2 | < 0.1%
10 | 1 | < 0.1%
11 | 2 | < 0.1%
12 | 2 | < 0.1%

Largest values:
Value | Count | Frequency (%)
7327 | 1 | < 0.1%
7326 | 2 | < 0.1%
7325 | 1 | < 0.1%
7324 | 1 | < 0.1%
7323 | 1 | < 0.1%
7321 | 1 | < 0.1%
7320 | 2 | < 0.1%
7316 | 1 | < 0.1%
7314 | 2 | < 0.1%
7313 | 2 | < 0.1%

과목명
Categorical

Distinct: 13
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

구강악안면방사선학: 835
치과보존학: 801
구강악안면외과학: 793
치과재료학: 791
치주과학: 791
Other values (8): 5989

Length

Max length: 9
Median length: 5
Mean length: 5.804
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 구강악안면방사선학
2nd row: 구강악안면외과학
3rd row: 치과재료학
4th row: 치과보철학
5th row: 구강악안면외과학

Common Values

Value | Count | Frequency (%)
구강악안면방사선학 | 835 | 8.3%
치과보존학 | 801 | 8.0%
구강악안면외과학 | 793 | 7.9%
치과재료학 | 791 | 7.9%
치주과학 | 791 | 7.9%
보건의약관계 법규 | 778 | 7.8%
구강내과학 | 765 | 7.6%
구강병리학 | 758 | 7.6%
소아치과학 | 749 | 7.5%
치과교정학 | 748 | 7.5%
Other values (3) | 2191 | 21.9%

Length

Histogram of lengths of the category
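
Word frequencies (labels split on whitespace, so "보건의약관계 법규" counts as two words):
Value | Count | Frequency (%)
구강악안면방사선학 | 835 | 7.7%
치과보존학 | 801 | 7.4%
구강악안면외과학 | 793 | 7.4%
치과재료학 | 791 | 7.3%
치주과학 | 791 | 7.3%
보건의약관계 | 778 | 7.2%
법규 | 778 | 7.2%
구강내과학 | 765 | 7.1%
구강병리학 | 758 | 7.0%
소아치과학 | 749 | 6.9%
Other values (4) | 2939 | 27.3%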
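
The percentages in this last table are lower than in the Common Values table even though the counts match. That is consistent with frequencies computed over whitespace-separated words instead of rows: "보건의약관계 법규" contributes two words, so the denominator becomes 10000 + 778 = 10778. The sketch below reproduces this. The three subjects not listed individually are lumped into a hypothetical `OTHER_SUBJECTS` token and assumed to be single words.

```python
from collections import Counter

# Row counts per subject label, copied from the Common Values table.
label_counts = {
    "구강악안면방사선학": 835, "치과보존학": 801, "구강악안면외과학": 793,
    "치과재료학": 791, "치주과학": 791, "보건의약관계 법규": 778,
    "구강내과학": 765, "구강병리학": 758, "소아치과학": 749, "치과교정학": 748,
    # Assumption: the 3 unlisted subjects (2191 rows) are single-word labels.
    "OTHER_SUBJECTS": 2191,
}

def word_counts(labels):
    """Split each label on whitespace and count every word once per row."""
    words = Counter()
    for label, rows in labels.items():
        for word in label.split():
            words[word] += rows
    return words

words = word_counts(label_counts)
total_words = sum(words.values())
word_share = {w: round(100 * n / total_words, 1) for w, n in words.items()}

print(total_words)
print(word_share["구강악안면방사선학"], word_share["법규"])
```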

과목별점수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 57
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.20715
Minimum: 0
Maximum: 38
Zeros: 280
Zeros (%): 2.8%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 6
Q1: 11
Median: 16
Q3: 23
95-th percentile: 32
Maximum: 38
Range: 38
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 8.155447
Coefficient of variation (CV): 0.47395687
Kurtosis: -0.46977275
Mean: 17.20715
Median Absolute Deviation (MAD): 5
Skewness: 0.32203503
Sum: 172071.5
Variance: 66.511315
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values:
Value | Count | Frequency (%)
13.0 | 618 | 6.2%
12.0 | 564 | 5.6%
11.0 | 552 | 5.5%
14.0 | 514 | 5.1%
10.0 | 475 | 4.8%
17.0 | 438 | 4.4%
16.0 | 438 | 4.4%
15.0 | 423 | 4.2%
18.0 | 414 | 4.1%
19.0 | 363 | 3.6%
Other values (47) | 5201 | 52.0%

Smallest values:
Value | Count | Frequency (%)
0.0 | 280 | 2.8%
1.0 | 4 | < 0.1%
2.0 | 12 | 0.1%
3.0 | 30 | 0.3%
3.5 | 1 | < 0.1%
4.0 | 47 | 0.5%
4.5 | 1 | < 0.1%
5.0 | 105 | 1.1%
5.5 | 5 | 0.1%
6.0 | 130 | 1.3%

Largest values:
Value | Count | Frequency (%)
38.0 | 10 | 0.1%
37.0 | 12 | 0.1%
36.0 | 41 | 0.4%
35.0 | 85 | 0.9%
34.0 | 122 | 1.2%
33.0 | 129 | 1.3%
32.0 | 172 | 1.7%
31.0 | 217 | 2.2%
30.0 | 234 | 2.3%
29.0 | 247 | 2.5%

총점
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 413
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 224.2821
Minimum: 0
Maximum: 305
Zeros: 274
Zeros (%): 2.7%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 125.5
Q1: 209.5
Median: 239.5
Q3: 257.5
95-th percentile: 277
Maximum: 305
Range: 305
Interquartile range (IQR): 48

Descriptive statistics

Standard deviation: 54.127483
Coefficient of variation (CV): 0.24133662
Kurtosis: 6.1469761
Mean: 224.2821
Median Absolute Deviation (MAD): 21
Skewness: -2.238398
Sum: 2242821
Variance: 2929.7844
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values:
Value | Count | Frequency (%)
0.0 | 274 | 2.7%
251.0 | 99 | 1.0%
245.0 | 93 | 0.9%
246.0 | 91 | 0.9%
250.5 | 90 | 0.9%
255.5 | 86 | 0.9%
260.0 | 86 | 0.9%
263.0 | 85 | 0.9%
253.5 | 81 | 0.8%
254.0 | 81 | 0.8%
Other values (403) | 8934 | 89.3%

Smallest values:
Value | Count | Frequency (%)
0.0 | 274 | 2.7%
28.0 | 2 | < 0.1%
33.0 | 1 | < 0.1%
46.0 | 3 | < 0.1%
72.5 | 2 | < 0.1%
76.5 | 4 | < 0.1%
79.0 | 3 | < 0.1%
81.5 | 1 | < 0.1%
82.5 | 3 | < 0.1%
87.0 | 5 | 0.1%

Largest values:
Value | Count | Frequency (%)
305.0 | 1 | < 0.1%
303.0 | 1 | < 0.1%
302.0 | 1 | < 0.1%
300.5 | 1 | < 0.1%
300.0 | 1 | < 0.1%
299.0 | 1 | < 0.1%
298.5 | 2 | < 0.1%
297.5 | 7 | 0.1%
297.0 | 1 | < 0.1%
296.5 | 1 | < 0.1%

합격여부
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

합격: 7714
불합격: 2002
결시: 232
응시결격: 52

Length

Max length: 4
Median length: 2
Mean length: 2.2106
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 합격
2nd row: 불합격
3rd row: 합격
4th row: 합격
5th row: 합격

Common Values

Value | Count | Frequency (%)
합격 | 7714 | 77.1%
불합격 | 2002 | 20.0%
결시 | 232 | 2.3%
응시결격 | 52 | 0.5%

Length

Histogram of lengths of the category

Common Values (Plot)


성별
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
7236
2764

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row:
2nd row:
3rd row:
4th row:
5th row:

Common Values

Value | Count | Frequency (%)
 | 7236 | 72.4%
 | 2764 | 27.6%

Length

Histogram of lengths of the category

Common Values (Plot)


연령대
Categorical

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

20: 6501
30: 2569
40: 826
50: 94
60: 10

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20
2nd row: 30
3rd row: 20
4th row: 30
5th row: 20

Common Values

Value | Count | Frequency (%)
20 | 6501 | 65.0%
30 | 2569 | 25.7%
40 | 826 | 8.3%
50 | 94 | 0.9%
60 | 10 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)


Interactions


Correlations

Matrix 1:
 | 연도 | 회차 | 일련번호 | 과목명 | 과목별점수 | 총점 | 합격여부 | 성별 | 연령대
연도 | 1.000 | 1.000 | 0.988 | 0.000 | 0.138 | 0.147 | 0.107 | 0.071 | 0.075
회차 | 1.000 | 1.000 | 0.957 | 0.000 | 0.134 | 0.170 | 0.119 | 0.047 | 0.073
일련번호 | 0.988 | 0.957 | 1.000 | 0.000 | 0.229 | 0.389 | 0.310 | 0.099 | 0.343
과목명 | 0.000 | 0.000 | 0.000 | 1.000 | 0.692 | 0.000 | 0.000 | 0.021 | 0.000
과목별점수 | 0.138 | 0.134 | 0.229 | 0.692 | 1.000 | 0.787 | 0.742 | 0.155 | 0.351
총점 | 0.147 | 0.170 | 0.389 | 0.000 | 0.787 | 1.000 | 0.887 | 0.337 | 0.646
합격여부 | 0.107 | 0.119 | 0.310 | 0.000 | 0.742 | 0.887 | 1.000 | 0.241 | 0.411
성별 | 0.071 | 0.047 | 0.099 | 0.021 | 0.155 | 0.337 | 0.241 | 1.000 | 0.151
연령대 | 0.075 | 0.073 | 0.343 | 0.000 | 0.351 | 0.646 | 0.411 | 0.151 | 1.000

Matrix 2 (categorical variables only):
 | 성별 | 연령대 | 과목명 | 합격여부
성별 | 1.000 | 0.184 | 0.019 | 0.160
연령대 | 0.184 | 1.000 | 0.000 | 0.346
과목명 | 0.019 | 0.000 | 1.000 | 0.000
합격여부 | 0.160 | 0.346 | 0.000 | 1.000

Matrix 3:
 | 연도 | 회차 | 일련번호 | 과목별점수 | 총점 | 과목명 | 합격여부 | 성별 | 연령대
연도 | 1.000 | 1.000 | 0.989 | 0.045 | 0.113 | 0.000 | 0.082 | 0.050 | 0.046
회차 | 1.000 | 1.000 | 0.989 | 0.045 | 0.113 | 0.000 | 0.082 | 0.050 | 0.046
일련번호 | 0.989 | 0.989 | 1.000 | 0.044 | 0.113 | 0.000 | 0.190 | 0.076 | 0.149
과목별점수 | 0.045 | 0.045 | 0.044 | 1.000 | 0.410 | 0.372 | 0.550 | 0.119 | 0.153
총점 | 0.113 | 0.113 | 0.113 | 0.410 | 1.000 | 0.000 | 0.759 | 0.258 | 0.322
과목명 | 0.000 | 0.000 | 0.000 | 0.372 | 0.000 | 1.000 | 0.000 | 0.019 | 0.000
합격여부 | 0.082 | 0.082 | 0.190 | 0.550 | 0.759 | 0.000 | 1.000 | 0.160 | 0.346
성별 | 0.050 | 0.050 | 0.076 | 0.119 | 0.258 | 0.019 | 0.160 | 1.000 | 0.184
연령대 | 0.046 | 0.046 | 0.149 | 0.153 | 0.322 | 0.000 | 0.346 | 0.184 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

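(index) | 연도 | 직종 | 회차 | 일련번호 | 과목명 | 과목별점수 | 총점 | 합격여부 | 성별 | 연령대
90861 | 2006 | 치과의사 | 58 | 6990 | 구강악안면방사선학 | 21.0 | 237.0 | 합격 | | 20
84141 | 2005 | 치과의사 | 57 | 6473 | 구강악안면외과학 | 24.0 | 180.5 | 불합격 | | 30
67793 | 2004 | 치과의사 | 56 | 5215 | 치과재료학 | 13.0 | 247.5 | 합격 | | 20
92622 | 2006 | 치과의사 | 58 | 7125 | 치과보철학 | 31.0 | 238.5 | 합격 | | 30
93800 | 2006 | 치과의사 | 58 | 7216 | 구강악안면외과학 | 26.0 | 259.0 | 합격 | | 20
101 | 2000 | 치과의사 | 52 | 8 | 치과보철학 | 29.0 | 228.0 | 합격 | | 20
46947 | 2003 | 치과의사 | 55 | 3612 | 구강악안면방사선학 | 14.0 | 205.0 | 합격 | | 20
35039 | 2002 | 치과의사 | 54 | 2696 | 구강악안면방사선학 | 18.0 | 218.5 | 합격 | | 30
90643 | 2006 | 치과의사 | 58 | 6973 | 소아치과학 | 18.0 | 236.0 | 합격 | | 20
55324 | 2003 | 치과의사 | 55 | 4256 | 치과보존학 | 23.0 | 163.5 | 불합격 | | 40

(index) | 연도 | 직종 | 회차 | 일련번호 | 과목명 | 과목별점수 | 총점 | 합격여부 | 성별 | 연령대
84475 | 2005 | 치과의사 | 57 | 6499 | 구강병리학 | 7.0 | 143.5 | 불합격 | | 30
35886 | 2002 | 치과의사 | 54 | 2761 | 보건의약관계 법규 | 16.0 | 247.5 | 합격 | | 30
25334 | 2001 | 치과의사 | 53 | 1949 | 치과보철학 | 31.0 | 257.5 | 합격 | | 20
35431 | 2002 | 치과의사 | 54 | 2726 | 보건의약관계 법규 | 17.0 | 264.5 | 합격 | | 20
19746 | 2001 | 치과의사 | 53 | 1519 | 치주과학 | 24.0 | 272.0 | 합격 | | 20
36051 | 2002 | 치과의사 | 54 | 2774 | 구강보건학 | 13.0 | 267.0 | 합격 | | 20
73803 | 2005 | 치과의사 | 57 | 5678 | 구강보건학 | 13.0 | 216.5 | 합격 | | 20
2303 | 2000 | 치과의사 | 52 | 178 | 구강보건학 | 8.0 | 195.5 | 불합격 | | 40
13912 | 2000 | 치과의사 | 52 | 1071 | 구강보건학 | 12.0 | 239.0 | 합격 | | 20
94156 | 2006 | 치과의사 | 58 | 7243 | 치과보철학 | 32.0 | 273.5 | 합격 | | 20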