Overview

Dataset statistics

Number of variables: 10
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 918.0 KiB
Average record size in memory: 94.0 B

Variable types

Numeric: 5
Categorical: 5

Dataset

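Description: Provides information for analyzing the results of candidates in the national dental technician (치과기공사) examination: year (연도), occupation (직종), exam round (회차), serial number (일련번호), subject name (과목명), score per subject (과목별 점수), total score (총점), pass/fail status (합격여부), gender (성별), and age group (연령대).
URL: https://www.data.go.kr/data/15083509/fileData.do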

Alerts

직종 has constant value "치과기공사" [Constant]
연도 is highly overall correlated with 회차 and 1 other field [High correlation]
회차 is highly overall correlated with 연도 and 1 other field [High correlation]
일련번호 is highly overall correlated with 연도 and 1 other field [High correlation]
총점 is highly overall correlated with 합격여부 [High correlation]
합격여부 is highly overall correlated with 총점 [High correlation]
합격여부 is highly imbalanced (51.0%) [Imbalance]
연령대 is highly imbalanced (70.2%) [Imbalance]
과목별점수 has 497 (5.0%) zeros [Zeros]
총점 has 486 (4.9%) zeros [Zeros]
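
The two imbalance percentages can be reproduced from the class counts listed in the 합격여부 and 연령대 sections. The sketch below assumes the score is one minus the Shannon entropy of the class shares divided by the maximum possible entropy, log(k) for k classes. That formula matches the reported 51.0% and 70.2%, but whether the profiling tool computes it in exactly this way is an assumption, and the function name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the class shares.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)


# Class counts copied from the report: 합격, 불합격, 결시, 응시결격
pass_status_counts = [7572, 1946, 475, 7]
# Class counts copied from the report: age groups 20, 30, 40, 50
age_group_counts = [8838, 971, 187, 4]

print(f"합격여부: {imbalance_score(pass_status_counts):.1%}")
print(f"연령대: {imbalance_score(age_group_counts):.1%}")
```

A score of 0 would correspond to equally sized classes, so 연령대, where the 20s age group holds 88.4% of the rows, scores much higher than 합격여부.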

Reproduction

Analysis started: 2023-12-12 16:08:57.070194
Analysis finished: 2023-12-12 16:09:00.509639
Duration: 3.44 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
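
As a quick consistency check, the reported duration should equal the difference between the two timestamps. The sketch below uses only the standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block above
started = datetime.fromisoformat("2023-12-12 16:08:57.070194")
finished = datetime.fromisoformat("2023-12-12 16:09:00.509639")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```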

Variables

연도
Real number (ℝ)

HIGH CORRELATION 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2002.2941
Minimum: 2000
Maximum: 2005
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2000
5-th percentile: 2000
Q1: 2001
Median: 2003
Q3: 2003
95-th percentile: 2004
Maximum: 2005
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.3524748
Coefficient of variation (CV): 0.00067546262
Kurtosis: -1.0462871
Mean: 2002.2941
Median Absolute Deviation (MAD): 1
Skewness: -0.40921641
Sum: 20022941
Variance: 1.8291881
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Most frequent values
Value | Count | Frequency (%)
2003 | 3443 | 34.4%
2004 | 1975 | 19.8%
2002 | 1550 | 15.5%
2000 | 1524 | 15.2%
2001 | 1482 | 14.8%
2005 | 26 | 0.3%

Smallest values
Value | Count | Frequency (%)
2000 | 1524 | 15.2%
2001 | 1482 | 14.8%
2002 | 1550 | 15.5%
2003 | 3443 | 34.4%
2004 | 1975 | 19.8%
2005 | 26 | 0.3%

Largest values
Value | Count | Frequency (%)
2005 | 26 | 0.3%
2004 | 1975 | 19.8%
2003 | 3443 | 34.4%
2002 | 1550 | 15.5%
2001 | 1482 | 14.8%
2000 | 1524 | 15.2%

직종
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
치과기공사: 10000

Length

Max length: 35
Median length: 35
Mean length: 35
Min length: 35

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 치과기공사
2nd row: 치과기공사
3rd row: 치과기공사
4th row: 치과기공사
5th row: 치과기공사

Common Values

Value | Count | Frequency (%)
치과기공사 | 10000 | 100.0%

Length

Histogram of lengths of the category

회차
Real number (ℝ)

HIGH CORRELATION 

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.6736
Minimum: 27
Maximum: 33
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 27
5-th percentile: 27
Q1: 28
Median: 30
Q3: 31
95-th percentile: 32
Maximum: 33
Range: 6
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.7337715
Coefficient of variation (CV): 0.058428081
Kurtosis: -1.2685672
Mean: 29.6736
Median Absolute Deviation (MAD): 2
Skewness: -0.12435121
Sum: 296736
Variance: 3.0059636
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Most frequent values
Value | Count | Frequency (%)
32 | 1975 | 19.8%
31 | 1794 | 17.9%
30 | 1649 | 16.5%
29 | 1550 | 15.5%
27 | 1524 | 15.2%
28 | 1482 | 14.8%
33 | 26 | 0.3%

Smallest values
Value | Count | Frequency (%)
27 | 1524 | 15.2%
28 | 1482 | 14.8%
29 | 1550 | 15.5%
30 | 1649 | 16.5%
31 | 1794 | 17.9%
32 | 1975 | 19.8%
33 | 26 | 0.3%

Largest values
Value | Count | Frequency (%)
33 | 26 | 0.3%
32 | 1975 | 19.8%
31 | 1794 | 17.9%
30 | 1649 | 16.5%
29 | 1550 | 15.5%
28 | 1482 | 14.8%
27 | 1524 | 15.2%

일련번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 6075
Distinct (%): 60.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4323.8835
Minimum: 1
Maximum: 8665
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 410.95
Q1: 2162.25
Median: 4331
Q3: 6517.5
95-th percentile: 8251
Maximum: 8665
Range: 8664
Interquartile range (IQR): 4355.25

Descriptive statistics

Standard deviation: 2521.5017
Coefficient of variation (CV): 0.58315673
Kurtosis: -1.2084262
Mean: 4323.8835
Median Absolute Deviation (MAD): 2179.5
Skewness: 0.0056701263
Sum: 43238835
Variance: 6357971
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value | Count | Frequency (%)
3858 | 6 | 0.1%
5461 | 5 | 0.1%
6554 | 5 | 0.1%
4883 | 5 | 0.1%
1665 | 5 | 0.1%
6118 | 5 | 0.1%
1984 | 5 | 0.1%
653 | 5 | 0.1%
434 | 5 | 0.1%
634 | 5 | 0.1%
Other values (6065) | 9949 | 99.5%

Smallest values
Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 2 | < 0.1%
3 | 4 | < 0.1%
4 | 1 | < 0.1%
6 | 1 | < 0.1%
8 | 2 | < 0.1%
9 | 1 | < 0.1%
12 | 2 | < 0.1%
13 | 3 | < 0.1%
14 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
8665 | 1 | < 0.1%
8663 | 1 | < 0.1%
8662 | 1 | < 0.1%
8661 | 3 | < 0.1%
8660 | 3 | < 0.1%
8659 | 2 | < 0.1%
8658 | 2 | < 0.1%
8653 | 1 | < 0.1%
8652 | 2 | < 0.1%
8646 | 1 | < 0.1%

과목명
Categorical

Distinct: 11
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
의료관계법규: 992
치과재료학 개론: 946
구강해부학 개론: 925
치과기공 주관식 실기: 922
치과기공 객관식 실기: 910
Other values (6): 5305

Length

Max length: 12
Median length: 10
Mean length: 8.4351
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 치과기공 객관식 실기
2nd row: 총의치기공학
3rd row: 국부의치기공학
4th row: 치과충전기공학
5th row: 치과기공 주관식 실기

Common Values

Value | Count | Frequency (%)
의료관계법규 | 992 | 9.9%
치과재료학 개론 | 946 | 9.5%
구강해부학 개론 | 925 | 9.2%
치과기공 주관식 실기 | 922 | 9.2%
치과기공 객관식 실기 | 910 | 9.1%
공중구강보건학 개론 | 907 | 9.1%
총의치기공학 | 887 | 8.9%
치과충전기공학 | 886 | 8.9%
관교의치기공학 | 886 | 8.9%
국부의치기공학 | 877 | 8.8%
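
The reported mean length of 8.4351 characters can be checked against these counts. The table lists only ten of the eleven distinct subjects; the sketch below assumes the eleventh is 가철성치열교정장치기공학 (it appears in the sample rows at the end of the report) and infers its count as the remainder of the 10000 observations. Both the identification and the inferred count are assumptions, not values stated in the report.

```python
# Subject counts copied from the Common Values table above
subject_counts = {
    "의료관계법규": 992,
    "치과재료학 개론": 946,
    "구강해부학 개론": 925,
    "치과기공 주관식 실기": 922,
    "치과기공 객관식 실기": 910,
    "공중구강보건학 개론": 907,
    "총의치기공학": 887,
    "치과충전기공학": 886,
    "관교의치기공학": 886,
    "국부의치기공학": 877,
}

n_observations = 10000
# Assumption: the one unlisted subject accounts for all remaining rows
remaining = n_observations - sum(subject_counts.values())
subject_counts["가철성치열교정장치기공학"] = remaining

# Count-weighted mean of the string lengths, spaces included
mean_length = sum(len(name) * count for name, count in subject_counts.items()) / n_observations
lengths = [len(name) for name in subject_counts]

print(f"inferred count: {remaining}")
print(f"mean length: {mean_length:.4f}, min: {min(lengths)}, max: {max(lengths)}")
```

Under these assumptions the mean, minimum, and maximum lengths agree with the Length block above.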

Length

Histogram of lengths of the category

Word frequencies
Value | Count | Frequency (%)
개론 | 2778 | 16.9%
치과기공 | 1832 | 11.1%
실기 | 1832 | 11.1%
의료관계법규 | 992 | 6.0%
치과재료학 | 946 | 5.8%
구강해부학 | 925 | 5.6%
주관식 | 922 | 5.6%
객관식 | 910 | 5.5%
공중구강보건학 | 907 | 5.5%
총의치기공학 | 887 | 5.4%
Other values (4) | 3511 | 21.4%

과목별점수
Real number (ℝ)

ZEROS 

Distinct: 95
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.91535
Minimum: 0
Maximum: 97
Zeros: 497
Zeros (%): 5.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1.95
Q1: 12
Median: 19
Q3: 29
95-th percentile: 43
Maximum: 97
Range: 97
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 12.022536
Coefficient of variation (CV): 0.57481876
Kurtosis: 0.57295708
Mean: 20.91535
Median Absolute Deviation (MAD): 8
Skewness: 0.70723318
Sum: 209153.5
Variance: 144.54136
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value | Count | Frequency (%)
12.0 | 571 | 5.7%
13.0 | 549 | 5.5%
0.0 | 497 | 5.0%
11.0 | 458 | 4.6%
14.0 | 438 | 4.4%
10.0 | 373 | 3.7%
15.0 | 358 | 3.6%
32.0 | 313 | 3.1%
30.0 | 304 | 3.0%
18.0 | 298 | 3.0%
Other values (85) | 5841 | 58.4%

Smallest values
Value | Count | Frequency (%)
0.0 | 497 | 5.0%
1.0 | 3 | < 0.1%
2.0 | 6 | 0.1%
3.0 | 19 | 0.2%
4.0 | 30 | 0.3%
5.0 | 36 | 0.4%
6.0 | 85 | 0.9%
7.0 | 138 | 1.4%
8.0 | 202 | 2.0%
9.0 | 280 | 2.8%

Largest values
Value | Count | Frequency (%)
97.0 | 1 | < 0.1%
85.0 | 1 | < 0.1%
60.0 | 1 | < 0.1%
59.5 | 3 | < 0.1%
59.0 | 10 | 0.1%
58.5 | 3 | < 0.1%
58.0 | 27 | 0.3%
57.5 | 6 | 0.1%
57.0 | 31 | 0.3%
56.5 | 12 | 0.1%

총점
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 351
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 219.6291
Minimum: 0
Maximum: 309.5
Zeros: 486
Zeros (%): 4.9%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 56.95
Q1: 213
Median: 247.5
Q3: 267
95-th percentile: 286
Maximum: 309.5
Range: 309.5
Interquartile range (IQR): 54

Descriptive statistics

Standard deviation: 73.656035
Coefficient of variation (CV): 0.33536556
Kurtosis: 1.7428146
Mean: 219.6291
Median Absolute Deviation (MAD): 23.5
Skewness: -1.5979243
Sum: 2196291
Variance: 5425.2115
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value | Count | Frequency (%)
0.0 | 486 | 4.9%
266.0 | 139 | 1.4%
263.0 | 130 | 1.3%
269.0 | 130 | 1.3%
253.0 | 122 | 1.2%
256.0 | 121 | 1.2%
259.0 | 119 | 1.2%
270.0 | 119 | 1.2%
262.0 | 112 | 1.1%
254.0 | 110 | 1.1%
Other values (341) | 8412 | 84.1%

Smallest values
Value | Count | Frequency (%)
0.0 | 486 | 4.9%
35.0 | 1 | < 0.1%
37.0 | 2 | < 0.1%
53.0 | 1 | < 0.1%
54.0 | 3 | < 0.1%
55.0 | 5 | 0.1%
56.0 | 2 | < 0.1%
57.0 | 6 | 0.1%
61.0 | 1 | < 0.1%
62.0 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
309.5 | 1 | < 0.1%
308.0 | 2 | < 0.1%
306.0 | 3 | < 0.1%
305.0 | 7 | 0.1%
304.0 | 3 | < 0.1%
303.5 | 1 | < 0.1%
302.5 | 1 | < 0.1%
302.0 | 7 | 0.1%
301.0 | 1 | < 0.1%
300.5 | 3 | < 0.1%

합격여부
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
합격: 7572
불합격: 1946
결시: 475
응시결격: 7

Length

Max length: 4
Median length: 2
Mean length: 2.196
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 합격
2nd row: 합격
3rd row: 합격
4th row: 합격
5th row: 불합격

Common Values

Value | Count | Frequency (%)
합격 | 7572 | 75.7%
불합격 | 1946 | 19.5%
결시 | 475 | 4.8%
응시결격 | 7 | 0.1%

Length

Histogram of lengths of the category

성별
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
5603
4397

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

Value | Count | Frequency (%)
| 5603 | 56.0%
| 4397 | 44.0%

Length

Histogram of lengths of the category

연령대
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
20: 8838
30: 971
40: 187
50: 4

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20
2nd row: 20
3rd row: 20
4th row: 20
5th row: 20

Common Values

Value | Count | Frequency (%)
20 | 8838 | 88.4%
30 | 971 | 9.7%
40 | 187 | 1.9%
50 | 4 | < 0.1%

Length

Histogram of lengths of the category

Interactions

[25 interaction plots not reproduced here]

Correlations

Variable | 연도 | 회차 | 일련번호 | 과목명 | 과목별점수 | 총점 | 합격여부 | 성별 | 연령대
연도 | 1.000 | 1.000 | 0.918 | 0.000 | 0.265 | 0.212 | 0.062 | 0.024 | 0.088
회차 | 1.000 | 1.000 | 0.934 | 0.000 | 0.243 | 0.205 | 0.071 | 0.036 | 0.102
일련번호 | 0.918 | 0.934 | 1.000 | 0.000 | 0.151 | 0.284 | 0.172 | 0.330 | 0.187
과목명 | 0.000 | 0.000 | 0.000 | 1.000 | 0.700 | 0.000 | 0.000 | 0.000 | 0.000
과목별점수 | 0.265 | 0.243 | 0.151 | 0.700 | 1.000 | 0.483 | 0.533 | 0.104 | 0.158
총점 | 0.212 | 0.205 | 0.284 | 0.000 | 0.483 | 1.000 | 0.898 | 0.247 | 0.252
합격여부 | 0.062 | 0.071 | 0.172 | 0.000 | 0.533 | 0.898 | 1.000 | 0.239 | 0.330
성별 | 0.024 | 0.036 | 0.330 | 0.000 | 0.104 | 0.247 | 0.239 | 1.000 | 0.364
연령대 | 0.088 | 0.102 | 0.187 | 0.000 | 0.158 | 0.252 | 0.330 | 0.364 | 1.000

Variable | 성별 | 연령대 | 과목명 | 합격여부
성별 | 1.000 | 0.243 | 0.000 | 0.159
연령대 | 0.243 | 1.000 | 0.000 | 0.134
과목명 | 0.000 | 0.000 | 1.000 | 0.000
합격여부 | 0.159 | 0.134 | 0.000 | 1.000

Variable | 연도 | 회차 | 일련번호 | 과목별점수 | 총점 | 과목명 | 합격여부 | 성별 | 연령대
연도 | 1.000 | 0.984 | 0.970 | -0.014 | -0.084 | 0.000 | 0.047 | 0.036 | 0.070
회차 | 0.984 | 1.000 | 0.986 | -0.006 | -0.066 | 0.000 | 0.049 | 0.039 | 0.070
일련번호 | 0.970 | 0.986 | 1.000 | 0.013 | -0.029 | 0.000 | 0.103 | 0.253 | 0.112
과목별점수 | -0.014 | -0.006 | 0.013 | 1.000 | 0.395 | 0.412 | 0.371 | 0.104 | 0.102
총점 | -0.084 | -0.066 | -0.029 | 0.395 | 1.000 | 0.000 | 0.778 | 0.189 | 0.153
과목명 | 0.000 | 0.000 | 0.000 | 0.412 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000
합격여부 | 0.047 | 0.049 | 0.103 | 0.371 | 0.778 | 0.000 | 1.000 | 0.159 | 0.134
성별 | 0.036 | 0.039 | 0.253 | 0.104 | 0.189 | 0.000 | 0.159 | 1.000 | 0.243
연령대 | 0.070 | 0.070 | 0.112 | 0.102 | 0.153 | 0.000 | 0.134 | 0.243 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

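Index | 연도 | 직종 | 회차 | 일련번호 | 과목명 | 과목별점수 | 총점 | 합격여부 | 성별 | 연령대
76124 | 2003 | 치과기공사 | 31 | 6921 | 치과기공 객관식 실기 | 38.0 | 277.0 | 합격 | | 20
28041 | 2001 | 치과기공사 | 28 | 2550 | 총의치기공학 | 22.0 | 266.5 | 합격 | | 20
8286 | 2000 | 치과기공사 | 27 | 754 | 국부의치기공학 | 22.0 | 282.0 | 합격 | | 20
10603 | 2000 | 치과기공사 | 27 | 964 | 치과충전기공학 | 15.0 | 282.0 | 합격 | | 20
49562 | 2003 | 치과기공사 | 30 | 4506 | 치과기공 주관식 실기 | 45.5 | 118.0 | 불합격 | | 20
10781 | 2000 | 치과기공사 | 27 | 981 | 구강해부학 개론 | 30.0 | 265.0 | 합격 | | 30
1521 | 2000 | 치과기공사 | 27 | 139 | 국부의치기공학 | 17.0 | 159.0 | 불합격 | | 20
32987 | 2002 | 치과기공사 | 29 | 2999 | 치과재료학 개론 | 31.0 | 257.0 | 합격 | | 20
85355 | 2004 | 치과기공사 | 32 | 7760 | 공중구강보건학 개론 | 12.0 | 286.0 | 합격 | | 20
25476 | 2001 | 치과기공사 | 28 | 2317 | 가철성치열교정장치기공학 | 14.0 | 261.5 | 합격 | | 20

Index | 연도 | 직종 | 회차 | 일련번호 | 과목명 | 과목별점수 | 총점 | 합격여부 | 성별 | 연령대
17504 | 2001 | 치과기공사 | 28 | 1592 | 국부의치기공학 | 17.0 | 264.5 | 합격 | | 20
6619 | 2000 | 치과기공사 | 27 | 602 | 의료관계법규 | 12.0 | 280.0 | 합격 | | 20
47990 | 2003 | 치과기공사 | 30 | 4363 | 의료관계법규 | 16.0 | 141.0 | 불합격 | | 30
64840 | 2003 | 치과기공사 | 31 | 5895 | 공중구강보건학 개론 | 13.0 | 275.0 | 합격 | | 20
27574 | 2001 | 치과기공사 | 28 | 2507 | 의료관계법규 | 0.0 | 0.0 | 결시 | | 20
31011 | 2002 | 치과기공사 | 29 | 2820 | 총의치기공학 | 23.0 | 256.0 | 합격 | | 20
60533 | 2003 | 치과기공사 | 31 | 5504 | 가철성치열교정장치기공학 | 10.0 | 226.0 | 합격 | | 30
36254 | 2002 | 치과기공사 | 29 | 3296 | 치과재료학 개론 | 34.0 | 235.0 | 합격 | | 20
64522 | 2003 | 치과기공사 | 31 | 5866 | 치과기공 주관식 실기 | 41.0 | 258.0 | 합격 | | 20
53649 | 2003 | 치과기공사 | 30 | 4878 | 총의치기공학 | 22.0 | 216.5 | 합격 | | 20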