Overview

Dataset statistics

Number of variables: 10
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 918.0 KiB
Average record size in memory: 94.0 B
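The average record size follows from the total memory size and the number of observations. A minimal arithmetic check, assuming KiB means 1024 bytes (the variable names are illustrative only):

```python
# Figures taken from the dataset statistics above.
total_size_kib = 918.0
n_observations = 10_000

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the result agrees with the reported average record size.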

Variable types

Numeric: 5
Categorical: 5

Dataset

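Description: Provides information for analyzing the score status of candidates for the Level 1 Speech-Language Pathologist (1급 언어재활사) national examination: year (연도), occupation (직종), exam round (회차), serial number (일련번호), subject name (과목명), score per subject (과목별 점수), total score (총점), pass/fail result (합격여부), and gender (성별).
URL: https://www.data.go.kr/data/15083527/fileData.do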

Alerts

직종 has constant value "" [Constant]
연도 is highly overall correlated with 회차 and 1 other field [High correlation]
회차 is highly overall correlated with 연도 and 1 other field [High correlation]
일련번호 is highly overall correlated with 연도 and 1 other field [High correlation]
과목별점수 is highly overall correlated with 총점 and 1 other field [High correlation]
총점 is highly overall correlated with 과목별점수 and 1 other field [High correlation]
합격여부 is highly overall correlated with 과목별점수 and 1 other field [High correlation]
성별 is highly imbalanced (61.6%) [Imbalance]
과목별점수 has 496 (5.0%) zeros [Zeros]
총점 has 493 (4.9%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-12 19:15:52.269499
Analysis finished: 2023-12-12 19:15:57.439496
Duration: 5.17 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
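A report of this kind is typically produced from a pandas DataFrame with ydata-profiling. The following is only a sketch of how that might look: the function name `build_report`, the file paths passed to it, the report title, and the `cp949` encoding are assumptions for illustration and are not taken from the report or its config.json.

```python
def build_report(csv_path: str, html_path: str, encoding: str = "cp949") -> None:
    """Profile a CSV export and write an HTML report (illustrative sketch)."""
    # Imported inside the function so this module loads without the libraries.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the source file is a CSV; the encoding is a guess.
    df = pd.read_csv(csv_path, encoding=encoding)

    report = ProfileReport(
        df,
        title="Exam score profile",  # hypothetical title
        dataset={
            # Shown in the "Dataset" block of the overview.
            "description": "Score status of 1급 언어재활사 national exam candidates",
            "url": "https://www.data.go.kr/data/15083527/fileData.do",
        },
    )
    report.to_file(html_path)
```

A call such as `build_report("scores.csv", "report.html")` would then write the report; both file names are placeholders.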

Variables

연도
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2019.3686
Minimum: 2015
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2015
5-th percentile: 2016
Q1: 2018
Median: 2019
Q3: 2021
95-th percentile: 2022
Maximum: 2022
Range: 7
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.9353316
Coefficient of variation (CV): 0.00095838454
Kurtosis: -1.0386896
Mean: 2019.3686
Median Absolute Deviation (MAD): 2
Skewness: -0.19800239
Sum: 20193686
Variance: 3.7455086
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)

Common values

Value | Count | Frequency (%)
2022 | 1855 | 18.6%
2021 | 1623 | 16.2%
2018 | 1615 | 16.2%
2019 | 1538 | 15.4%
2020 | 1363 | 13.6%
2017 | 1303 | 13.0%
2016 | 545 | 5.5%
2015 | 158 | 1.6%

Minimum values

Value | Count | Frequency (%)
2015 | 158 | 1.6%
2016 | 545 | 5.5%
2017 | 1303 | 13.0%
2018 | 1615 | 16.2%
2019 | 1538 | 15.4%
2020 | 1363 | 13.6%
2021 | 1623 | 16.2%
2022 | 1855 | 18.6%

Maximum values

Value | Count | Frequency (%)
2022 | 1855 | 18.6%
2021 | 1623 | 16.2%
2020 | 1363 | 13.6%
2019 | 1538 | 15.4%
2018 | 1615 | 16.2%
2017 | 1303 | 13.0%
2016 | 545 | 5.5%
2015 | 158 | 1.6%
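
Because 연도 has only eight distinct values, its summary statistics can be recomputed from the value counts alone. The sketch below does that with the standard library. Dividing by n - 1 (sample variance) is an assumption; it is the choice that matches the reported variance.

```python
import math

# Value counts for 연도, copied from the frequency table above.
year_counts = {2015: 158, 2016: 545, 2017: 1303, 2018: 1615,
               2019: 1538, 2020: 1363, 2021: 1623, 2022: 1855}

n = sum(year_counts.values())
total = sum(value * count for value, count in year_counts.items())
mean = total / n

# Assumption: sample variance with an n - 1 denominator.
variance = sum(count * (value - mean) ** 2
               for value, count in year_counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

print(n, total, mean, round(variance, 7), round(std, 7), cv)
```

The recomputed sum, mean, variance, standard deviation, and CV agree with the figures listed under the descriptive statistics to the precision shown there.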

직종
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 34
Median length: 34
Mean length: 34
Min length: 34

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1급언어재활사
2nd row: 1급언어재활사
3rd row: 1급언어재활사
4th row: 1급언어재활사
5th row: 1급언어재활사

Common Values

Value | Count | Frequency (%)
1급언어재활사 | 10000 | 100.0%

[Figure: Length, histogram of lengths of the category]
[Figure: Common Values (Plot)]

회차
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.3686
Minimum: 4
Maximum: 11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 4
5-th percentile: 5
Q1: 7
Median: 8
Q3: 10
95-th percentile: 11
Maximum: 11
Range: 7
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.9353316
Coefficient of variation (CV): 0.2312611
Kurtosis: -1.0386896
Mean: 8.3686
Median Absolute Deviation (MAD): 2
Skewness: -0.19800239
Sum: 83686
Variance: 3.7455086
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)

Common values

Value | Count | Frequency (%)
11 | 1855 | 18.6%
10 | 1623 | 16.2%
7 | 1615 | 16.2%
8 | 1538 | 15.4%
9 | 1363 | 13.6%
6 | 1303 | 13.0%
5 | 545 | 5.5%
4 | 158 | 1.6%

Minimum values

Value | Count | Frequency (%)
4 | 158 | 1.6%
5 | 545 | 5.5%
6 | 1303 | 13.0%
7 | 1615 | 16.2%
8 | 1538 | 15.4%
9 | 1363 | 13.6%
10 | 1623 | 16.2%
11 | 1855 | 18.6%

Maximum values

Value | Count | Frequency (%)
11 | 1855 | 18.6%
10 | 1623 | 16.2%
9 | 1363 | 13.6%
8 | 1538 | 15.4%
7 | 1615 | 16.2%
6 | 1303 | 13.0%
5 | 545 | 5.5%
4 | 158 | 1.6%
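
The value counts of 회차 mirror those of 연도, and the two variables share the same standard deviation, skewness, and kurtosis. This is consistent with 회차 being 연도 shifted by a constant. The sketch below checks that reading against the two frequency tables. The offset of 2011 is inferred from the difference between the two means, not stated in the report, and matching marginal counts alone do not prove the relation holds row by row.

```python
# Value counts copied from the 연도 and 회차 frequency tables.
year_counts = {2015: 158, 2016: 545, 2017: 1303, 2018: 1615,
               2019: 1538, 2020: 1363, 2021: 1623, 2022: 1855}
round_counts = {4: 158, 5: 545, 6: 1303, 7: 1615,
                8: 1538, 9: 1363, 10: 1623, 11: 1855}

# Assumption: 회차 = 연도 - OFFSET, with OFFSET inferred from the means.
OFFSET = 2011

shifted = {year - OFFSET: count for year, count in year_counts.items()}
counts_match = shifted == round_counts
print(counts_match)
```

If the relation holds for every row, one of the two columns is redundant for modeling purposes, which is what the high-correlation alert points to.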

일련번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 4876
Distinct (%): 48.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2773.8725
Minimum: 1
Maximum: 5534
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 273.95
Q1: 1368.75
Median: 2789.5
Q3: 4161.25
95-th percentile: 5257
Maximum: 5534
Range: 5533
Interquartile range (IQR): 2792.5

Descriptive statistics

Standard deviation: 1601.4245
Coefficient of variation (CV): 0.57732448
Kurtosis: -1.2101405
Mean: 2773.8725
Median Absolute Deviation (MAD): 1401.5
Skewness: -0.0024704013
Sum: 27738725
Variance: 2564560.5
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
3178 | 6 | 0.1%
954 | 6 | 0.1%
3108 | 6 | 0.1%
1388 | 6 | 0.1%
4636 | 6 | 0.1%
5409 | 5 | 0.1%
5533 | 5 | 0.1%
789 | 5 | 0.1%
1664 | 5 | 0.1%
4937 | 5 | 0.1%
Other values (4866) | 9945 | 99.5%

Minimum 10 values

Value | Count | Frequency (%)
1 | 2 | < 0.1%
2 | 2 | < 0.1%
3 | 1 | < 0.1%
5 | 2 | < 0.1%
6 | 1 | < 0.1%
7 | 4 | < 0.1%
8 | 5 | 0.1%
9 | 5 | 0.1%
10 | 1 | < 0.1%
11 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
5534 | 4 | < 0.1%
5533 | 5 | 0.1%
5532 | 2 | < 0.1%
5531 | 2 | < 0.1%
5530 | 3 | < 0.1%
5529 | 1 | < 0.1%
5527 | 2 | < 0.1%
5526 | 2 | < 0.1%
5525 | 1 | < 0.1%
5524 | 2 | < 0.1%

과목명
Categorical

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 8
Median length: 6
Mean length: 5.8248
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 조음음운장애
2nd row: 언어발달장애
3rd row: 언어발달장애
4th row: 유창성장애
5th row: 유창성장애

Common Values

Value | Count | Frequency (%)
음성장애 | 1692 | 16.9%
언어발달장애 | 1688 | 16.9%
조음음운장애 | 1685 | 16.9%
유창성장애 | 1648 | 16.5%
신경언어장애 | 1647 | 16.5%
언어재활현장실무 | 1640 | 16.4%

[Figure: Length, histogram of lengths of the category]
[Figure: Common Values (Plot)]
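
The length statistics for 과목명 can be cross-checked against the value counts, since each subject name contributes its character count once per row. A minimal sketch, assuming lengths are counted in Unicode code points of precomposed Hangul syllables (which is what Python's `len` returns for these literals):

```python
# Subject names and counts copied from the Common Values table.
subject_counts = {
    "음성장애": 1692,
    "언어발달장애": 1688,
    "조음음운장애": 1685,
    "유창성장애": 1648,
    "신경언어장애": 1647,
    "언어재활현장실무": 1640,
}

n_rows = sum(subject_counts.values())
total_chars = sum(len(name) * count for name, count in subject_counts.items())
mean_length = total_chars / n_rows

lengths = [len(name) for name in subject_counts]
print(min(lengths), max(lengths), mean_length)
```

The minimum, maximum, and mean length computed this way match the values in the Length block.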

과목별점수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 23
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.3729
Minimum: 0
Maximum: 24
Zeros: 496
Zeros (%): 5.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4
Q1: 12
Median: 15
Q3: 18
95-th percentile: 21
Maximum: 24
Range: 24
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 4.7960344
Coefficient of variation (CV): 0.33368592
Kurtosis: 1.8483055
Mean: 14.3729
Median Absolute Deviation (MAD): 3
Skewness: -1.1382112
Sum: 143729
Variance: 23.001946
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=23)

Common values

Value | Count | Frequency (%)
15 | 1049 | 10.5%
16 | 1037 | 10.4%
14 | 951 | 9.5%
17 | 861 | 8.6%
13 | 835 | 8.3%
18 | 765 | 7.6%
12 | 719 | 7.2%
19 | 646 | 6.5%
11 | 564 | 5.6%
0 | 496 | 5.0%
Other values (13) | 2077 | 20.8%

Minimum 10 values

Value | Count | Frequency (%)
0 | 496 | 5.0%
3 | 2 | < 0.1%
4 | 12 | 0.1%
5 | 21 | 0.2%
6 | 50 | 0.5%
7 | 87 | 0.9%
8 | 172 | 1.7%
9 | 251 | 2.5%
10 | 369 | 3.7%
11 | 564 | 5.6%

Maximum 10 values

Value | Count | Frequency (%)
24 | 16 | 0.2%
23 | 93 | 0.9%
22 | 191 | 1.9%
21 | 342 | 3.4%
20 | 471 | 4.7%
19 | 646 | 6.5%
18 | 765 | 7.6%
17 | 861 | 8.6%
16 | 1037 | 10.4%
15 | 1049 | 10.5%
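
Taken together, the three frequency listings above (most common, lowest, and highest values) cover all 23 distinct scores of 과목별점수, so the full distribution can be rebuilt and used to cross-check the summary figures. The sketch below uses a simple nearest-rank quantile rule. That rule is an assumption and may differ from the interpolation the profiler uses, although both give the same answers on this data.

```python
# Full distribution of 과목별점수, merged from the three frequency listings.
score_counts = {
    0: 496, 3: 2, 4: 12, 5: 21, 6: 50, 7: 87, 8: 172, 9: 251, 10: 369,
    11: 564, 12: 719, 13: 835, 14: 951, 15: 1049, 16: 1037, 17: 861,
    18: 765, 19: 646, 20: 471, 21: 342, 22: 191, 23: 93, 24: 16,
}

n = sum(score_counts.values())
score_sum = sum(score * count for score, count in score_counts.items())

# "Other values": everything outside the ten most frequent scores.
top10 = sorted(score_counts.values(), reverse=True)[:10]
other_rows = n - sum(top10)
other_distinct = len(score_counts) - 10

def nearest_rank(percent: int) -> int:
    """Smallest score whose cumulative count reaches percent% of the rows."""
    target = percent * n / 100
    running = 0
    for score in sorted(score_counts):
        running += score_counts[score]
        if running >= target:
            return score
    raise ValueError("percent must be between 0 and 100")

quantiles = [nearest_rank(p) for p in (5, 25, 50, 75, 95)]
print(n, score_sum, other_distinct, other_rows, quantiles)
```

The rebuilt distribution reproduces the reported sum, the "Other values (13)" row, and the five quantiles.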

총점
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 91
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 86.0226
Minimum: 0
Maximum: 133
Zeros: 493
Zeros (%): 4.9%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 46
Q1: 77
Median: 90
Q3: 101
95-th percentile: 115
Maximum: 133
Range: 133
Interquartile range (IQR): 24

Descriptive statistics

Standard deviation: 24.60311
Coefficient of variation (CV): 0.28600751
Kurtosis: 4.7942502
Mean: 86.0226
Median Absolute Deviation (MAD): 12
Skewness: -1.9339619
Sum: 860226
Variance: 605.31302
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 493 | 4.9%
92 | 249 | 2.5%
104 | 243 | 2.4%
100 | 242 | 2.4%
90 | 242 | 2.4%
93 | 241 | 2.4%
88 | 240 | 2.4%
101 | 237 | 2.4%
91 | 235 | 2.4%
89 | 233 | 2.3%
Other values (81) | 7345 | 73.5%

Minimum 10 values

Value | Count | Frequency (%)
0 | 493 | 4.9%
23 | 1 | < 0.1%
42 | 1 | < 0.1%
45 | 1 | < 0.1%
46 | 11 | 0.1%
47 | 3 | < 0.1%
48 | 3 | < 0.1%
49 | 14 | 0.1%
50 | 12 | 0.1%
51 | 5 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
133 | 3 | < 0.1%
132 | 3 | < 0.1%
130 | 3 | < 0.1%
129 | 9 | 0.1%
128 | 3 | < 0.1%
127 | 11 | 0.1%
126 | 13 | 0.1%
125 | 6 | 0.1%
124 | 9 | 0.1%
123 | 24 | 0.2%

합격여부
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 2
Mean length: 2.345
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 합격
2nd row: 합격
3rd row: 불합격
4th row: 합격
5th row: 합격

Common Values

Value | Count | Frequency (%)
합격 | 6087 | 60.9%
불합격 | 3396 | 34.0%
결시 | 490 | 4.9%
응시결격 | 27 | 0.3%

[Figure: Length, histogram of lengths of the category]
[Figure: Common Values (Plot)]

성별
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row:
2nd row:
3rd row:
4th row:
5th row:

Common Values

Value | Count | Frequency (%)
 | 9251 | 92.5%
 | 749 | 7.5%

[Figure: Length, histogram of lengths of the category]
[Figure: Common Values (Plot)]
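
The Imbalance alert reports 61.6% for 성별, whose two categories have 9251 and 749 rows. One formula that reproduces this figure is one minus the Shannon entropy of the class proportions, normalized by the maximum entropy for that number of classes. The sketch below implements that formula; treating it as the profiler's exact definition is an assumption, and `imbalance_score` is a name chosen here for illustration.

```python
import math

def imbalance_score(counts: list[int]) -> float:
    """1 - (Shannon entropy / maximum entropy); 0 is balanced, 1 is a single class."""
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# Category counts for 성별 from the Common Values table.
gender_score = imbalance_score([9251, 749])
print(f"{gender_score:.1%}")
```

A perfectly balanced two-class column would score 0 under this formula, so a score above 0.6 indicates that one category dominates.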

연령대
Categorical

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 30
2nd row: 30
3rd row: 50
4th row: 30
5th row: 40

Common Values

Value | Count | Frequency (%)
30 | 4976 | 49.8%
20 | 3360 | 33.6%
40 | 1345 | 13.5%
50 | 307 | 3.1%
60 | 12 | 0.1%

[Figure: Length, histogram of lengths of the category]
[Figure: Common Values (Plot)]

Interactions


Correlations

Variable | 연도 | 회차 | 일련번호 | 과목명 | 과목별점수 | 총점 | 합격여부 | 성별 | 연령대
연도 | 1.000 | 1.000 | 0.947 | 0.000 | 0.190 | 0.246 | 0.216 | 0.035 | 0.107
회차 | 1.000 | 1.000 | 0.933 | 0.000 | 0.195 | 0.257 | 0.331 | 0.054 | 0.116
일련번호 | 0.947 | 0.933 | 1.000 | 0.000 | 0.213 | 0.218 | 0.203 | 0.072 | 0.170
과목명 | 0.000 | 0.000 | 0.000 | 1.000 | 0.243 | 0.000 | 0.000 | 0.000 | 0.000
과목별점수 | 0.190 | 0.195 | 0.213 | 0.243 | 1.000 | 0.745 | 0.826 | 0.054 | 0.207
총점 | 0.246 | 0.257 | 0.218 | 0.000 | 0.745 | 1.000 | 0.856 | 0.055 | 0.200
합격여부 | 0.216 | 0.331 | 0.203 | 0.000 | 0.826 | 0.856 | 1.000 | 0.067 | 0.135
성별 | 0.035 | 0.054 | 0.072 | 0.000 | 0.054 | 0.055 | 0.067 | 1.000 | 0.118
연령대 | 0.107 | 0.116 | 0.170 | 0.000 | 0.207 | 0.200 | 0.135 | 0.118 | 1.000
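
The High correlation alerts in the overview can be read off the first correlation matrix by listing, for each variable, the other variables whose coefficient exceeds a cutoff. The sketch below does this with a cutoff of 0.5. The cutoff is an assumption (the value actually used is set in the report's config.json, which is not reproduced here); it is chosen because it yields the same variable pairs as the alerts.

```python
labels = ["연도", "회차", "일련번호", "과목명", "과목별점수",
          "총점", "합격여부", "성별", "연령대"]

# First correlation matrix, row by row, in the label order above.
matrix = [
    [1.000, 1.000, 0.947, 0.000, 0.190, 0.246, 0.216, 0.035, 0.107],
    [1.000, 1.000, 0.933, 0.000, 0.195, 0.257, 0.331, 0.054, 0.116],
    [0.947, 0.933, 1.000, 0.000, 0.213, 0.218, 0.203, 0.072, 0.170],
    [0.000, 0.000, 0.000, 1.000, 0.243, 0.000, 0.000, 0.000, 0.000],
    [0.190, 0.195, 0.213, 0.243, 1.000, 0.745, 0.826, 0.054, 0.207],
    [0.246, 0.257, 0.218, 0.000, 0.745, 1.000, 0.856, 0.055, 0.200],
    [0.216, 0.331, 0.203, 0.000, 0.826, 0.856, 1.000, 0.067, 0.135],
    [0.035, 0.054, 0.072, 0.000, 0.054, 0.055, 0.067, 1.000, 0.118],
    [0.107, 0.116, 0.170, 0.000, 0.207, 0.200, 0.135, 0.118, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, not taken from the report

# For each variable, collect the other variables above the cutoff.
high = {
    row_label: [col_label
                for col_label, value in zip(labels, row)
                if col_label != row_label and abs(value) > THRESHOLD]
    for row_label, row in zip(labels, matrix)
}

for name, partners in high.items():
    if partners:
        print(name, "->", ", ".join(partners))
```

Two groups emerge: 연도, 회차, and 일련번호 on one side, and 과목별점수, 총점, and 합격여부 on the other, with two partners per variable in each group.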

Variable | 과목명 | 성별 | 연령대 | 합격여부
과목명 | 1.000 | 0.000 | 0.000 | 0.000
성별 | 0.000 | 1.000 | 0.144 | 0.045
연령대 | 0.000 | 0.144 | 1.000 | 0.110
합격여부 | 0.000 | 0.045 | 0.110 | 1.000

Variable | 연도 | 회차 | 일련번호 | 과목별점수 | 총점 | 과목명 | 합격여부 | 성별 | 연령대
연도 | 1.000 | 1.000 | 0.988 | -0.003 | -0.004 | 0.000 | 0.153 | 0.040 | 0.071
회차 | 1.000 | 1.000 | 0.988 | -0.003 | -0.004 | 0.000 | 0.153 | 0.040 | 0.071
일련번호 | 0.988 | 0.988 | 1.000 | -0.015 | -0.020 | 0.000 | 0.123 | 0.056 | 0.071
과목별점수 | -0.003 | -0.003 | -0.015 | 1.000 | 0.738 | 0.128 | 0.661 | 0.043 | 0.088
총점 | -0.004 | -0.004 | -0.020 | 0.738 | 1.000 | 0.000 | 0.747 | 0.055 | 0.117
과목명 | 0.000 | 0.000 | 0.000 | 0.128 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000
합격여부 | 0.153 | 0.153 | 0.123 | 0.661 | 0.747 | 0.000 | 1.000 | 0.045 | 0.110
성별 | 0.040 | 0.040 | 0.056 | 0.043 | 0.055 | 0.000 | 0.045 | 1.000 | 0.144
연령대 | 0.071 | 0.071 | 0.071 | 0.088 | 0.117 | 0.000 | 0.110 | 0.144 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

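Index | 연도 | 직종 | 회차 | 일련번호 | 과목명 | 과목별점수 | 총점 | 합격여부 | 성별 | 연령대
7355 | 2018 | 1급언어재활사 | 7 | 1226 | 조음음운장애 | 20 | 105 | 합격 |  | 30
23031 | 2021 | 1급언어재활사 | 10 | 3839 | 언어발달장애 | 17 | 85 | 합격 |  | 30
3921 | 2017 | 1급언어재활사 | 6 | 654 | 언어발달장애 | 8 | 58 | 불합격 |  | 50
30334 | 2022 | 1급언어재활사 | 11 | 5056 | 유창성장애 | 19 | 99 | 합격 |  | 30
18143 | 2020 | 1급언어재활사 | 9 | 3024 | 유창성장애 | 14 | 97 | 합격 |  | 40
452 | 2015 | 1급언어재활사 | 4 | 76 | 신경언어장애 | 13 | 97 | 합격 |  | 50
27386 | 2022 | 1급언어재활사 | 11 | 4565 | 신경언어장애 | 13 | 101 | 합격 |  | 20
15921 | 2019 | 1급언어재활사 | 8 | 2654 | 언어발달장애 | 16 | 94 | 합격 |  | 30
2689 | 2017 | 1급언어재활사 | 6 | 449 | 언어재활현장실무 | 15 | 96 | 응시결격 |  | 30
9875 | 2018 | 1급언어재활사 | 7 | 1646 | 유창성장애 | 14 | 92 | 합격 |  | 30

Index | 연도 | 직종 | 회차 | 일련번호 | 과목명 | 과목별점수 | 총점 | 합격여부 | 성별 | 연령대
32174 | 2022 | 1급언어재활사 | 11 | 5363 | 음성장애 | 13 | 79 | 불합격 |  | 40
32630 | 2022 | 1급언어재활사 | 11 | 5439 | 음성장애 | 15 | 102 | 합격 |  | 20
4521 | 2017 | 1급언어재활사 | 6 | 754 | 언어발달장애 | 15 | 77 | 불합격 |  | 40
17659 | 2020 | 1급언어재활사 | 9 | 2944 | 언어재활현장실무 | 14 | 82 | 불합격 |  | 30
6776 | 2018 | 1급언어재활사 | 7 | 1130 | 음성장애 | 15 | 84 | 불합격 |  | 30
8393 | 2018 | 1급언어재활사 | 7 | 1399 | 조음음운장애 | 11 | 72 | 불합격 |  | 30
579 | 2016 | 1급언어재활사 | 5 | 97 | 언어발달장애 | 21 | 103 | 합격 |  | 30
23040 | 2021 | 1급언어재활사 | 10 | 3841 | 신경언어장애 | 13 | 114 | 합격 |  | 20
16590 | 2019 | 1급언어재활사 | 8 | 2766 | 신경언어장애 | 16 | 80 | 불합격 |  | 20
17139 | 2019 | 1급언어재활사 | 8 | 2857 | 언어발달장애 | 12 | 59 | 불합격 |  | 40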