Overview

Dataset statistics

Number of variables: 10
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 918.0 KiB
Average record size in memory: 94.0 B
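The average record size follows from the total memory size and the number of observations. A minimal check, assuming 1 KiB = 1024 bytes (the variable names are illustrative):

```python
# Figures copied from the dataset statistics above.
total_size_kib = 918.0
n_observations = 10_000

# Assumption: 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")
```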

Variable types

Numeric: 5
Categorical: 5

Dataset

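Description: Provides information for analyzing the score results of candidates for the Level 2 speech-language pathologist (2급 언어재활사) national examination: year, occupation, exam round, serial number, subject name, score per subject, total score, pass/fail status, and gender.
URL: https://www.data.go.kr/data/15083528/fileData.do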

Alerts

직종 has constant value "" [Constant]
연도 is highly overall correlated with 회차 and 1 other field [High correlation]
회차 is highly overall correlated with 연도 and 1 other field [High correlation]
일련번호 is highly overall correlated with 연도 and 1 other field [High correlation]
과목별점수 is highly overall correlated with 총점 and 1 other field [High correlation]
총점 is highly overall correlated with 과목별점수 and 1 other field [High correlation]
합격여부 is highly overall correlated with 과목별점수 and 1 other field [High correlation]
합격여부 is highly imbalanced (52.6%) [Imbalance]
성별 is highly imbalanced (52.9%) [Imbalance]
과목별점수 has 326 (3.3%) zeros [Zeros]
총점 has 323 (3.2%) zeros [Zeros]
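
The percentages in the two imbalance alerts are consistent with a normalized-entropy score, 1 - H / log2(k), where H is the Shannon entropy of the class shares in bits and k is the number of classes. The sketch below is a reconstruction from the reported numbers, not a quotation of the library's code. `imbalance_score` is a hypothetical helper name, and the class counts are those reported for 합격여부 and 성별 in the Variables section.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of class counts (0 = balanced, 1 = one class)."""
    total = sum(counts)
    shares = [c / total for c in counts if c > 0]
    if len(shares) < 2:
        # Design choice for this sketch: a single class gets a score of 0.
        return 0.0
    entropy_bits = -sum(p * math.log2(p) for p in shares)
    return 1 - entropy_bits / math.log2(len(shares))

pass_status_counts = [7622, 2031, 321, 26]  # 합격, 불합격, 결시, 응시결격
gender_counts = [8993, 1007]                # the two 성별 categories

print(f"{imbalance_score(pass_status_counts):.1%}")  # 52.6%
print(f"{imbalance_score(gender_counts):.1%}")       # 52.9%
```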

Reproduction

Analysis started: 2023-12-12 13:11:24.291352
Analysis finished: 2023-12-12 13:11:28.971917
Duration: 4.68 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
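
A report of this kind could be regenerated along the following lines. This is a sketch under stated assumptions: the CSV path, the output file name, the report title, and the CP949 encoding are all illustrative and are not taken from the report itself. `build_report` is a hypothetical helper.

```python
def build_report(csv_path, html_path="report.html", encoding="cp949"):
    """Profile a CSV export of the dataset and write an HTML report."""
    # Imported lazily so this module loads even without the optional dependencies.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the file is CP949-encoded; pass encoding="utf-8" if it is not.
    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="2급 언어재활사 국가시험 성적 현황")
    report.to_file(html_path)
    return report
```

Calling `build_report("exam_scores.csv")` (a hypothetical file name) would write `report.html` next to the script. Timings and memory figures will differ from the ones listed above.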

Variables

연도
Real number (ℝ)

HIGH CORRELATION 

Distinct: 10
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2017.0007
Minimum: 2013
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2013
5-th percentile: 2014
Q1: 2014
Median: 2016
Q3: 2020
95-th percentile: 2022
Maximum: 2022
Range: 9
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 2.8606849
Coefficient of variation (CV): 0.0014182865
Kurtosis: -1.2756077
Mean: 2017.0007
Median Absolute Deviation (MAD): 2
Skewness: 0.41436281
Sum: 20170007
Variance: 8.1835179
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)

Common values
Value | Count | Frequency (%)
2014 | 3191 | 31.9%
2015 | 946 | 9.5%
2022 | 909 | 9.1%
2016 | 899 | 9.0%
2021 | 847 | 8.5%
2020 | 795 | 8.0%
2018 | 793 | 7.9%
2017 | 780 | 7.8%
2019 | 770 | 7.7%
2013 | 70 | 0.7%

Minimum 10 values
Value | Count | Frequency (%)
2013 | 70 | 0.7%
2014 | 3191 | 31.9%
2015 | 946 | 9.5%
2016 | 899 | 9.0%
2017 | 780 | 7.8%
2018 | 793 | 7.9%
2019 | 770 | 7.7%
2020 | 795 | 8.0%
2021 | 847 | 8.5%
2022 | 909 | 9.1%

Maximum 10 values
Value | Count | Frequency (%)
2022 | 909 | 9.1%
2021 | 847 | 8.5%
2020 | 795 | 8.0%
2019 | 770 | 7.7%
2018 | 793 | 7.9%
2017 | 780 | 7.8%
2016 | 899 | 9.0%
2015 | 946 | 9.5%
2014 | 3191 | 31.9%
2013 | 70 | 0.7%
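
Because 연도 has only 10 distinct values, its descriptive statistics can be re-derived from the frequency counts alone. The sketch below is a cross-check rather than the library's own computation; it assumes the reported variance uses the sample convention (divisor n - 1), which is the choice that matches the reported figure. `value_at_rank` is a hypothetical helper.

```python
import math

# Year -> count, copied from the 연도 frequency counts above.
year_counts = {
    2013: 70, 2014: 3191, 2015: 946, 2016: 899, 2017: 780,
    2018: 793, 2019: 770, 2020: 795, 2021: 847, 2022: 909,
}

n = sum(year_counts.values())
mean = sum(year * c for year, c in year_counts.items()) / n
# Sample variance (divisor n - 1): the convention that matches the report.
variance = sum(c * (year - mean) ** 2 for year, c in year_counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean

def value_at_rank(rank):
    """Return the value found at a 0-based rank in the sorted data."""
    seen = 0
    for year in sorted(year_counts):
        seen += year_counts[year]
        if rank < seen:
            return year
    raise IndexError(rank)

# n is even, so the median averages the two middle ranks.
median = (value_at_rank(n // 2 - 1) + value_at_rank(n // 2)) / 2

print(n, round(mean, 4), round(variance, 7), round(std, 7), median)
```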

직종
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
2급언어재활사 | 10000

Length

Max length: 34
Median length: 34
Mean length: 34
Min length: 34

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2급언어재활사
2nd row: 2급언어재활사
3rd row: 2급언어재활사
4th row: 2급언어재활사
5th row: 2급언어재활사

Common Values

Value | Count | Frequency (%)
2급언어재활사 | 10000 | 100.0%

Length

Histogram of lengths of the category

회차
Real number (ℝ)

HIGH CORRELATION 

Distinct: 11
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.7682
Minimum: 1
Maximum: 11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 3
Median: 5
Q3: 9
95-th percentile: 11
Maximum: 11
Range: 10
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.1259312
Coefficient of variation (CV): 0.5419249
Kurtosis: -1.3075636
Mean: 5.7682
Median Absolute Deviation (MAD): 3
Skewness: 0.25383527
Sum: 57682
Variance: 9.7714459
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)

Common values
Value | Count | Frequency (%)
2 | 2255 | 22.6%
4 | 946 | 9.5%
3 | 936 | 9.4%
11 | 909 | 9.1%
5 | 899 | 9.0%
10 | 847 | 8.5%
9 | 795 | 8.0%
7 | 793 | 7.9%
6 | 780 | 7.8%
8 | 770 | 7.7%

Minimum 10 values
Value | Count | Frequency (%)
1 | 70 | 0.7%
2 | 2255 | 22.6%
3 | 936 | 9.4%
4 | 946 | 9.5%
5 | 899 | 9.0%
6 | 780 | 7.8%
7 | 793 | 7.9%
8 | 770 | 7.7%
9 | 795 | 8.0%
10 | 847 | 8.5%

Maximum 10 values
Value | Count | Frequency (%)
11 | 909 | 9.1%
10 | 847 | 8.5%
9 | 795 | 8.0%
8 | 770 | 7.7%
7 | 793 | 7.9%
6 | 780 | 7.8%
5 | 899 | 9.0%
4 | 946 | 9.5%
3 | 936 | 9.4%
2 | 2255 | 22.6%

일련번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 7862
Distinct (%): 78.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8478.6392
Minimum: 1
Maximum: 16975
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 872.9
Q1: 4275.25
Median: 8366.5
Q3: 12774.5
95-th percentile: 16189.05
Maximum: 16975
Range: 16974
Interquartile range (IQR): 8499.25

Descriptive statistics

Standard deviation: 4916.0113
Coefficient of variation (CV): 0.57981135
Kurtosis: -1.2092886
Mean: 8478.6392
Median Absolute Deviation (MAD): 4255
Skewness: 0.021434157
Sum: 84786392
Variance: 24167167
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
9503 | 4 | < 0.1%
897 | 4 | < 0.1%
6266 | 4 | < 0.1%
9996 | 4 | < 0.1%
10863 | 4 | < 0.1%
15544 | 4 | < 0.1%
11860 | 4 | < 0.1%
11375 | 4 | < 0.1%
12055 | 4 | < 0.1%
1542 | 4 | < 0.1%
Other values (7852) | 9960 | 99.6%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
5 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
12 | 1 | < 0.1%
15 | 1 | < 0.1%
18 | 2 | < 0.1%
19 | 3 | < 0.1%
21 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
16975 | 1 | < 0.1%
16971 | 3 | < 0.1%
16970 | 2 | < 0.1%
16964 | 1 | < 0.1%
16963 | 1 | < 0.1%
16962 | 1 | < 0.1%
16960 | 1 | < 0.1%
16958 | 1 | < 0.1%
16956 | 1 | < 0.1%
16955 | 1 | < 0.1%

과목명
Categorical

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
신경언어장애 | 2038
언어발달장애 | 1997
조음음운장애 | 1996
음성장애 | 1995
유창성장애 | 1974

Length

Max length: 6
Median length: 6
Mean length: 5.4036
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 신경언어장애
2nd row: 조음음운장애
3rd row: 신경언어장애
4th row: 신경언어장애
5th row: 조음음운장애

Common Values

Value | Count | Frequency (%)
신경언어장애 | 2038 | 20.4%
언어발달장애 | 1997 | 20.0%
조음음운장애 | 1996 | 20.0%
음성장애 | 1995 | 20.0%
유창성장애 | 1974 | 19.7%

Length

Histogram of lengths of the category

과목별점수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 34
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.8393
Minimum: 0
Maximum: 35
Zeros: 326
Zeros (%): 3.3%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 9
Q1: 17
Median: 21
Q3: 26
95-th percentile: 31
Maximum: 35
Range: 35
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 6.9698015
Coefficient of variation (CV): 0.33445469
Kurtosis: 0.93370572
Mean: 20.8393
Median Absolute Deviation (MAD): 4
Skewness: -0.74550692
Sum: 208393
Variance: 48.578133
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=34)

Common values
Value | Count | Frequency (%)
21 | 668 | 6.7%
23 | 651 | 6.5%
22 | 647 | 6.5%
19 | 643 | 6.4%
20 | 618 | 6.2%
24 | 595 | 5.9%
18 | 488 | 4.9%
25 | 479 | 4.8%
17 | 435 | 4.3%
26 | 427 | 4.3%
Other values (24) | 4349 | 43.5%

Minimum 10 values
Value | Count | Frequency (%)
0 | 326 | 3.3%
3 | 1 | < 0.1%
4 | 5 | 0.1%
5 | 12 | 0.1%
6 | 27 | 0.3%
7 | 36 | 0.4%
8 | 69 | 0.7%
9 | 87 | 0.9%
10 | 129 | 1.3%
11 | 162 | 1.6%

Maximum 10 values
Value | Count | Frequency (%)
35 | 15 | 0.1%
34 | 77 | 0.8%
33 | 142 | 1.4%
32 | 195 | 1.9%
31 | 235 | 2.4%
30 | 299 | 3.0%
29 | 344 | 3.4%
28 | 380 | 3.8%
27 | 411 | 4.1%
26 | 427 | 4.3%

총점
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 118
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 104.1544
Minimum: 0
Maximum: 148
Zeros: 323
Zeros (%): 3.2%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 56
Q1: 91
Median: 111
Q3: 124
95-th percentile: 136
Maximum: 148
Range: 148
Interquartile range (IQR): 33

Descriptive statistics

Standard deviation: 28.535595
Coefficient of variation (CV): 0.27397398
Kurtosis: 3.5686569
Mean: 104.1544
Median Absolute Deviation (MAD): 15
Skewness: -1.668047
Sum: 1041544
Variance: 814.28019
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 323 | 3.2%
121 | 233 | 2.3%
128 | 212 | 2.1%
127 | 209 | 2.1%
118 | 205 | 2.1%
129 | 204 | 2.0%
119 | 202 | 2.0%
120 | 202 | 2.0%
126 | 200 | 2.0%
125 | 199 | 2.0%
Other values (108) | 7811 | 78.1%

Minimum 10 values
Value | Count | Frequency (%)
0 | 323 | 3.2%
25 | 1 | < 0.1%
32 | 1 | < 0.1%
33 | 3 | < 0.1%
34 | 3 | < 0.1%
35 | 3 | < 0.1%
36 | 2 | < 0.1%
37 | 2 | < 0.1%
39 | 1 | < 0.1%
40 | 4 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
148 | 1 | < 0.1%
147 | 3 | < 0.1%
146 | 3 | < 0.1%
145 | 6 | 0.1%
144 | 24 | 0.2%
143 | 18 | 0.2%
142 | 25 | 0.2%
141 | 61 | 0.6%
140 | 40 | 0.4%
139 | 83 | 0.8%

합격여부
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
합격 | 7622
불합격 | 2031
결시 | 321
응시결격 | 26

Length

Max length: 4
Median length: 2
Mean length: 2.2083
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 불합격
2nd row: 합격
3rd row: 합격
4th row: 합격
5th row: 합격

Common Values

Value | Count | Frequency (%)
합격 | 7622 | 76.2%
불합격 | 2031 | 20.3%
결시 | 321 | 3.2%
응시결격 | 26 | 0.3%

Length

Histogram of lengths of the category

성별
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
 | 8993
 | 1007

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row:
2nd row:
3rd row:
4th row:
5th row:

Common Values

Value | Count | Frequency (%)
 | 8993 | 89.9%
 | 1007 | 10.1%

Length

Histogram of lengths of the category

연령대
Categorical

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
20 | 6842
30 | 1712
40 | 1081
50 | 338
60 | 27

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20
2nd row: 40
3rd row: 20
4th row: 20
5th row: 30

Common Values

Value | Count | Frequency (%)
20 | 6842 | 68.4%
30 | 1712 | 17.1%
40 | 1081 | 10.8%
50 | 338 | 3.4%
60 | 27 | 0.3%

Length

Histogram of lengths of the category

Interactions

Scatter plots for each pair of the 5 numeric variables (25 panels).

Correlations

Variable | 연도 | 회차 | 일련번호 | 과목명 | 과목별점수 | 총점 | 합격여부 | 성별 | 연령대
연도 | 1.000 | 0.999 | 0.946 | 0.000 | 0.122 | 0.175 | 0.108 | 0.036 | 0.131
회차 | 0.999 | 1.000 | 0.985 | 0.000 | 0.194 | 0.260 | 0.157 | 0.068 | 0.199
일련번호 | 0.946 | 0.985 | 1.000 | 0.000 | 0.162 | 0.234 | 0.122 | 0.056 | 0.189
과목명 | 0.000 | 0.000 | 0.000 | 1.000 | 0.641 | 0.000 | 0.000 | 0.000 | 0.000
과목별점수 | 0.122 | 0.194 | 0.162 | 0.641 | 1.000 | 0.878 | 0.840 | 0.079 | 0.253
총점 | 0.175 | 0.260 | 0.234 | 0.000 | 0.878 | 1.000 | 0.910 | 0.083 | 0.283
합격여부 | 0.108 | 0.157 | 0.122 | 0.000 | 0.840 | 0.910 | 1.000 | 0.074 | 0.150
성별 | 0.036 | 0.068 | 0.056 | 0.000 | 0.079 | 0.083 | 0.074 | 1.000 | 0.028
연령대 | 0.131 | 0.199 | 0.189 | 0.000 | 0.253 | 0.283 | 0.150 | 0.028 | 1.000

Variable | 성별 | 과목명 | 합격여부 | 연령대
성별 | 1.000 | 0.000 | 0.049 | 0.034
과목명 | 0.000 | 1.000 | 0.000 | 0.000
합격여부 | 0.049 | 0.000 | 1.000 | 0.123
연령대 | 0.034 | 0.000 | 0.123 | 1.000

Variable | 연도 | 회차 | 일련번호 | 과목별점수 | 총점 | 과목명 | 합격여부 | 성별 | 연령대
연도 | 1.000 | 0.990 | 0.981 | -0.081 | -0.106 | 0.000 | 0.083 | 0.052 | 0.082
회차 | 0.990 | 1.000 | 0.991 | -0.090 | -0.115 | 0.000 | 0.083 | 0.034 | 0.079
일련번호 | 0.981 | 0.991 | 1.000 | -0.093 | -0.119 | 0.000 | 0.073 | 0.043 | 0.079
과목별점수 | -0.081 | -0.090 | -0.093 | 1.000 | 0.722 | 0.320 | 0.682 | 0.069 | 0.106
총점 | -0.106 | -0.115 | -0.119 | 0.722 | 1.000 | 0.000 | 0.799 | 0.063 | 0.121
과목명 | 0.000 | 0.000 | 0.000 | 0.320 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000
합격여부 | 0.083 | 0.083 | 0.073 | 0.682 | 0.799 | 0.000 | 1.000 | 0.049 | 0.123
성별 | 0.052 | 0.034 | 0.043 | 0.069 | 0.063 | 0.000 | 0.049 | 1.000 | 0.034
연령대 | 0.082 | 0.079 | 0.079 | 0.106 | 0.121 | 0.000 | 0.123 | 0.034 | 1.000
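
The high-correlation alerts listed near the top of the report are consistent with flagging, in the last matrix above, every off-diagonal coefficient whose absolute value reaches 0.5. The sketch below checks this. The 0.5 cutoff is an assumption inferred from the alerts rather than a value stated in the report, and `high_correlations` is a hypothetical helper.

```python
THRESHOLD = 0.5  # assumption: cutoff inferred from the alerts, not stated in the report

names = ["연도", "회차", "일련번호", "과목별점수", "총점", "과목명", "합격여부", "성별", "연령대"]
# Coefficients copied row by row from the last correlation matrix above.
matrix = [
    [1.000, 0.990, 0.981, -0.081, -0.106, 0.000, 0.083, 0.052, 0.082],
    [0.990, 1.000, 0.991, -0.090, -0.115, 0.000, 0.083, 0.034, 0.079],
    [0.981, 0.991, 1.000, -0.093, -0.119, 0.000, 0.073, 0.043, 0.079],
    [-0.081, -0.090, -0.093, 1.000, 0.722, 0.320, 0.682, 0.069, 0.106],
    [-0.106, -0.115, -0.119, 0.722, 1.000, 0.000, 0.799, 0.063, 0.121],
    [0.000, 0.000, 0.000, 0.320, 0.000, 1.000, 0.000, 0.000, 0.000],
    [0.083, 0.083, 0.073, 0.682, 0.799, 0.000, 1.000, 0.049, 0.123],
    [0.052, 0.034, 0.043, 0.069, 0.063, 0.000, 0.049, 1.000, 0.034],
    [0.082, 0.079, 0.079, 0.106, 0.121, 0.000, 0.123, 0.034, 1.000],
]

def high_correlations(names, matrix, threshold=THRESHOLD):
    """Map each variable to the other variables it is strongly correlated with."""
    flagged = {}
    for i, row in enumerate(matrix):
        partners = [names[j] for j, v in enumerate(row) if j != i and abs(v) >= threshold]
        if partners:
            flagged[names[i]] = partners
    return flagged

flagged = high_correlations(names, matrix)
for name, partners in flagged.items():
    print(name, "->", ", ".join(partners))
```

Under that assumption, six variables are flagged with two partners each, which matches the "and 1 other field" wording of the alerts.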

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

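(index) | 연도 | 직종 | 회차 | 일련번호 | 과목명 | 과목별점수 | 총점 | 합격여부 | 성별 | 연령대
904 | 2014 | 2급언어재활사 | 2 | 181 | 신경언어장애 | 16 | 80 | 불합격 |  | 20
58485 | 2019 | 2급언어재활사 | 8 | 11698 | 조음음운장애 | 32 | 129 | 합격 |  | 40
48024 | 2017 | 2급언어재활사 | 6 | 9605 | 신경언어장애 | 24 | 132 | 합격 |  | 20
39564 | 2016 | 2급언어재활사 | 5 | 7913 | 신경언어장애 | 30 | 139 | 합격 |  | 20
50445 | 2018 | 2급언어재활사 | 7 | 10090 | 조음음운장애 | 27 | 119 | 합격 |  | 30
58757 | 2019 | 2급언어재활사 | 8 | 11752 | 음성장애 | 17 | 118 | 합격 |  | 20
82706 | 2022 | 2급언어재활사 | 11 | 16542 | 언어발달장애 | 0 | 0 | 결시 |  | 20
72507 | 2021 | 2급언어재활사 | 10 | 14502 | 음성장애 | 15 | 116 | 합격 |  | 20
8754 | 2014 | 2급언어재활사 | 2 | 1751 | 신경언어장애 | 24 | 115 | 합격 |  | 20
61993 | 2019 | 2급언어재활사 | 8 | 12399 | 유창성장애 | 22 | 115 | 합격 |  | 20

(index) | 연도 | 직종 | 회차 | 일련번호 | 과목명 | 과목별점수 | 총점 | 합격여부 | 성별 | 연령대
59287 | 2019 | 2급언어재활사 | 8 | 11858 | 음성장애 | 19 | 126 | 합격 |  | 20
9409 | 2014 | 2급언어재활사 | 2 | 1882 | 신경언어장애 | 24 | 131 | 합격 |  | 20
42792 | 2016 | 2급언어재활사 | 5 | 8559 | 음성장애 | 11 | 77 | 불합격 |  | 30
4303 | 2014 | 2급언어재활사 | 2 | 861 | 유창성장애 | 20 | 99 | 합격 |  | 20
35379 | 2016 | 2급언어재활사 | 5 | 7076 | 신경언어장애 | 22 | 98 | 합격 |  | 20
61918 | 2019 | 2급언어재활사 | 8 | 12384 | 유창성장애 | 5 | 37 | 응시결격 |  | 20
9630 | 2014 | 2급언어재활사 | 2 | 1927 | 음성장애 | 23 | 136 | 합격 |  | 20
64924 | 2020 | 2급언어재활사 | 9 | 12985 | 신경언어장애 | 29 | 120 | 합격 |  | 20
46713 | 2017 | 2급언어재활사 | 6 | 9343 | 유창성장애 | 17 | 94 | 합격 |  | 40
17215 | 2014 | 2급언어재활사 | 2 | 3444 | 조음음운장애 | 32 | 128 | 합격 |  | 20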