Overview

Dataset statistics

Number of variables: 10
Number of observations: 4043
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 331.8 KiB
Average record size in memory: 84.0 B
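
The average record size can be checked by dividing the total memory footprint by the number of observations. The sketch below is a minimal check in Python; it assumes 1 KiB = 1024 bytes and uses the rounded figures reported above, and the variable names are illustrative only.

```python
# Figures copied from the dataset statistics (rounded as reported).
total_size_kib = 331.8
n_observations = 4043

# Convert KiB to bytes (assumption: 1 KiB = 1024 B), then divide by the row count.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{avg_record_bytes:.1f} B")
```

Because the inputs are themselves rounded, the result only agrees with the reported value to one decimal place.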

Variable types

Numeric: 4
Categorical: 6

Dataset

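Description: Provides information for analyzing the status of candidates for the national prosthetist-orthotist (의지·보조기기사) licensing examination (year, occupation, exam round, sex, age group, exam region, graduation status, pass/fail result, school location) in a form that cannot identify individuals.
URL: https://www.data.go.kr/data/15083489/fileData.do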

Alerts

직종 has constant value "의지·보조기기사" (Constant)
연도 is highly overall correlated with 회차 and 1 other field (High correlation)
회차 is highly overall correlated with 연도 and 1 other field (High correlation)
일련번호 is highly overall correlated with 연도 and 1 other field (High correlation)
응시지역 is highly imbalanced (99.0%) (Imbalance)
일련번호 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 18:15:54.520051
Analysis finished: 2023-12-12 18:15:57.553546
Duration: 3.03 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연도 (year)
Real number (ℝ)

HIGH CORRELATION

Distinct: 22
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2008.6973
Minimum: 2000
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 35.7 KiB

Quantile statistics

Minimum: 2000
5th percentile: 2000
Q1: 2000
Median: 2009
Q3: 2015
95th percentile: 2021
Maximum: 2022
Range: 22
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 7.5158984
Coefficient of variation (CV): 0.003741678
Kurtosis: -1.3138833
Mean: 2008.6973
Median Absolute Deviation (MAD): 8
Skewness: 0.25210249
Sum: 8121163
Variance: 56.488728
Monotonicity: Increasing
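
Several of the descriptive statistics are derived from one another: the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. The following Python sketch checks both relationships using the rounded values reported for 연도; it is an illustrative consistency check, not the profiler's own code.

```python
import math

# Values copied from the descriptive statistics for 연도 (rounded as reported).
std = 7.5158984
mean = 2008.6973
variance = 56.488728

# Coefficient of variation: standard deviation relative to the mean.
cv = std / mean

# The variance should equal the squared standard deviation, up to rounding.
variance_matches = math.isclose(std ** 2, variance, rel_tol=1e-6)

print(f"CV = {cv:.9f}")
print(f"variance consistent with std: {variance_matches}")
```

The very small CV reflects that calendar years cluster tightly around a large mean, so CV is not an informative measure of spread for this variable.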
Histogram with fixed size bins (bins=22)

Common values

Value              Count  Frequency (%)
2000               1022   25.3%
2013               258    6.4%
2001               255    6.3%
2002               206    5.1%
2009               189    4.7%
2011               170    4.2%
2022               165    4.1%
2010               162    4.0%
2020               149    3.7%
2021               145    3.6%
Other values (12)  1322   32.7%

Minimum 10 values

Value  Count  Frequency (%)
2000   1022   25.3%
2001   255    6.3%
2002   206    5.1%
2004   55     1.4%
2005   75     1.9%
2006   107    2.6%
2007   129    3.2%
2008   130    3.2%
2009   189    4.7%
2010   162    4.0%

Maximum 10 values

Value  Count  Frequency (%)
2022   165    4.1%
2021   145    3.6%
2020   149    3.7%
2019   127    3.1%
2018   129    3.2%
2017   113    2.8%
2016   113    2.8%
2015   116    2.9%
2014   88     2.2%
2013   258    6.4%

직종 (occupation)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 31.7 KiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 의지·보조기기사
2nd row: 의지·보조기기사
3rd row: 의지·보조기기사
4th row: 의지·보조기기사
5th row: 의지·보조기기사

Common Values

Value                                    Count  Frequency (%)
의지·보조기기사 (prosthetist-orthotist)  4043   100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values]

회차 (exam round)
Real number (ℝ)

HIGH CORRELATION

Distinct: 23
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.3771952
Minimum: 1
Maximum: 23
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 35.7 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 9
Q3: 16
95th percentile: 22
Maximum: 23
Range: 22
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 7.51136
Coefficient of variation (CV): 0.80102417
Kurtosis: -1.256271
Mean: 9.3771952
Median Absolute Deviation (MAD): 7
Skewness: 0.36552678
Sum: 37912
Variance: 56.420529
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=23)

Common values

Value              Count  Frequency (%)
1                  1022   25.3%
2                  255    6.3%
3                  206    5.1%
9                  189    4.7%
11                 170    4.2%
23                 165    4.1%
10                 162    4.0%
21                 149    3.7%
22                 145    3.6%
12                 140    3.5%
Other values (13)  1440   35.6%

Minimum 10 values

Value  Count  Frequency (%)
1      1022   25.3%
2      255    6.3%
3      206    5.1%
4      55     1.4%
5      75     1.9%
6      107    2.6%
7      129    3.2%
8      130    3.2%
9      189    4.7%
10     162    4.0%

Maximum 10 values

Value  Count  Frequency (%)
23     165    4.1%
22     145    3.6%
21     149    3.7%
20     127    3.1%
19     129    3.2%
18     113    2.8%
17     113    2.8%
16     116    2.9%
15     88     2.2%
14     121    3.0%

일련번호 (serial number)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 4043
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2022
Minimum: 1
Maximum: 4043
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 35.7 KiB

Quantile statistics

Minimum: 1
5th percentile: 203.1
Q1: 1011.5
Median: 2022
Q3: 3032.5
95th percentile: 3840.9
Maximum: 4043
Range: 4042
Interquartile range (IQR): 2021

Descriptive statistics

Standard deviation: 1167.2579
Coefficient of variation (CV): 0.57727888
Kurtosis: -1.2
Mean: 2022
Median Absolute Deviation (MAD): 1011
Skewness: 0
Sum: 8174946
Variance: 1362491
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
1                    1      < 0.1%
2807                 1      < 0.1%
2666                 1      < 0.1%
2801                 1      < 0.1%
2802                 1      < 0.1%
2667                 1      < 0.1%
2668                 1      < 0.1%
2803                 1      < 0.1%
2669                 1      < 0.1%
2670                 1      < 0.1%
Other values (4033)  4033   99.8%

Minimum 10 values

Value  Count  Frequency (%)
1      1      < 0.1%
2      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
5      1      < 0.1%
6      1      < 0.1%
7      1      < 0.1%
8      1      < 0.1%
9      1      < 0.1%
10     1      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
4043   1      < 0.1%
4042   1      < 0.1%
4041   1      < 0.1%
4040   1      < 0.1%
4039   1      < 0.1%
4038   1      < 0.1%
4037   1      < 0.1%
4036   1      < 0.1%
4035   1      < 0.1%
4034   1      < 0.1%
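
Because 일련번호 takes each integer from 1 to 4043 exactly once, its statistics follow from closed forms for consecutive integers: the sum is n(n+1)/2 and the sample variance is n(n+1)/12. The Python sketch below reproduces the reported sum, mean, variance, and two of the quantiles. It assumes the quantiles use linear interpolation between order statistics, which matches the reported figures; the `percentile` helper is a hypothetical illustration, not part of any library.

```python
n = 4043
values = range(1, n + 1)  # the serial numbers 1..4043, already sorted

total = sum(values)       # equals n * (n + 1) / 2
mean = total / n

# Sample variance (divisor n - 1); for 1..n this equals n * (n + 1) / 12.
variance = sum((v - mean) ** 2 for v in values) / (n - 1)


def percentile(sorted_vals, q):
    """Linear-interpolation percentile for q in [0, 1] (illustrative helper)."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


p05 = percentile(values, 0.05)
q1 = percentile(values, 0.25)

print(total, mean, variance, round(p05, 1), q1)
```

The zero skewness and the kurtosis of -1.2 are also what a discrete uniform sequence would be expected to produce, so the column behaves like a plain row counter rather than a measured quantity.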

성별 (sex)
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 31.7 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: (not shown)
2nd row: (not shown)
3rd row: (not shown)
4th row: (not shown)
5th row: (not shown)

Common Values

Value        Count  Frequency (%)
(not shown)  2599   64.3%
(not shown)  1444   35.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values]

연령대 (age group)
Real number (ℝ)

Distinct: 7
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.673015
Minimum: 10
Maximum: 70
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 35.7 KiB

Quantile statistics

Minimum: 10
5th percentile: 20
Q1: 20
Median: 20
Q3: 30
95th percentile: 40
Maximum: 70
Range: 60
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 7.3868795
Coefficient of variation (CV): 0.31203797
Kurtosis: 5.4815615
Mean: 23.673015
Median Absolute Deviation (MAD): 0
Skewness: 2.2809603
Sum: 95710
Variance: 54.565989
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=7)

Common values

Value  Count  Frequency (%)
20     3026   74.8%
30     663    16.4%
40     250    6.2%
50     75     1.9%
60     24     0.6%
10     4      0.1%
70     1      < 0.1%

Minimum 10 values

Value  Count  Frequency (%)
10     4      0.1%
20     3026   74.8%
30     663    16.4%
40     250    6.2%
50     75     1.9%
60     24     0.6%
70     1      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
70     1      < 0.1%
60     24     0.6%
50     75     1.9%
40     250    6.2%
30     663    16.4%
20     3026   74.8%
10     4      0.1%
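
Since 연령대 has only seven distinct values, its summary statistics can be recomputed directly from the frequency table. The Python sketch below does so, assuming the reported variance is the sample variance (divisor n - 1), which is the choice that reproduces the reported figure; the variable names are illustrative.

```python
# Frequency table for 연령대: age-group value -> count, copied from the report.
freq = {10: 4, 20: 3026, 30: 663, 40: 250, 50: 75, 60: 24, 70: 1}

n = sum(freq.values())                        # number of observations
total = sum(v * c for v, c in freq.items())   # weighted sum of values
mean = total / n

# Sample variance (divisor n - 1), weighted by the counts.
variance = sum(c * (v - mean) ** 2 for v, c in freq.items()) / (n - 1)
std = variance ** 0.5

print(n, total, round(mean, 6), round(variance, 6), round(std, 7))
```

Because the values are decade labels rather than exact ages, the mean of about 23.7 describes the distribution of age groups, not the average age of candidates.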

응시지역 (exam region)
Categorical

IMBALANCE

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 31.7 KiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 서울특별시
2nd row: 서울특별시
3rd row: 서울특별시
4th row: 서울특별시
5th row: 서울특별시

Common Values

Value                 Count  Frequency (%)
서울특별시 (Seoul)    4035   99.8%
대구광역시 (Daegu)    3      0.1%
광주광역시 (Gwangju)  3      0.1%
부산광역시 (Busan)    1      < 0.1%
대전광역시 (Daejeon)  1      < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values]
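
The 99.0% imbalance reported for 응시지역 can be reproduced from the category counts. The Python sketch below assumes the imbalance score is defined as one minus the Shannon entropy of the category distribution, normalized by the maximum entropy for that number of categories. That definition is an assumption made here because it reproduces the reported value; it is not quoted from the report.

```python
import math

# Category counts for 응시지역, copied from the Common Values table.
counts = [4035, 3, 3, 1, 1]
n = sum(counts)

# Shannon entropy (base 2) of the observed category distribution.
entropy = -sum((c / n) * math.log2(c / n) for c in counts)

# Normalize by the maximum possible entropy, log2(number of categories),
# and subtract from 1 so that 0 means balanced and 1 means fully imbalanced.
imbalance = 1 - entropy / math.log2(len(counts))

print(f"{imbalance:.1%}")
```

With 4035 of 4043 records in a single category, the entropy is close to zero, so the score is close to 1.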

졸업여부 (graduation status)
Categorical

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 31.7 KiB

Length

Max length: 4
Median length: 4
Mean length: 3.1387583
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 졸업
2nd row: 졸업
3rd row: 졸업
4th row: 졸업
5th row: 졸업

Common Values

Value                            Count  Frequency (%)
졸업예정 (expected to graduate)  2402   59.4%
졸업 (graduated)                 1441   35.6%
(not shown)                      200    4.9%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

The plot lists only the two labelled categories, so its percentages are shares of those 3843 records rather than of all 4043.

Value                            Count  Frequency (%)
졸업예정 (expected to graduate)  2402   62.5%
졸업 (graduated)                 1441   37.5%
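
The two 졸업여부 tables report different percentages for the same counts because they use different denominators. The Python sketch below recomputes both sets, assuming the plot table excludes the 200 records whose label is blank in the report; the key `"unlabelled"` is a hypothetical placeholder for that category.

```python
# Counts from the Common Values table; "unlabelled" stands in for the
# category whose label is blank in the report (hypothetical key name).
counts = {"졸업예정": 2402, "졸업": 1441, "unlabelled": 200}

total = sum(counts.values())             # all records
labelled = total - counts["unlabelled"]  # records with a visible label

shares = {}
for label in ("졸업예정", "졸업"):
    share_of_all = round(100 * counts[label] / total, 1)
    share_of_labelled = round(100 * counts[label] / labelled, 1)
    shares[label] = (share_of_all, share_of_labelled)
    print(label, share_of_all, share_of_labelled)
```

When comparing the two tables, the first percentage in each pair corresponds to the Common Values table and the second to the plot.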

합격여부 (pass/fail result)
Categorical

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 31.7 KiB

Length

Max length: 4
Median length: 3
Mean length: 2.5152115
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 합격
2nd row: 합격
3rd row: 합격
4th row: 합격
5th row: 합격

Common Values

Value                    Count  Frequency (%)
불합격 (failed)          2073   51.3%
합격 (passed)            1665   41.2%
결시 (absent)            300    7.4%
응시결격 (disqualified)  5      0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values]

학교소재지 (school location)
Categorical

Distinct: 17
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 31.7 KiB

Length

Max length: 5
Median length: 4
Mean length: 3.5560228
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 기타
2nd row: 기타
3rd row: 기타
4th row: 기타
5th row: 기타

Common Values

Value                        Count  Frequency (%)
충청남도 (Chungcheongnam-do)  1119   27.7%
경기도 (Gyeonggi-do)          764    18.9%
경상북도 (Gyeongsangbuk-do)   643    15.9%
기타 (other)                  569    14.1%
전라남도 (Jeollanam-do)       309    7.6%
충청북도 (Chungcheongbuk-do)  241    6.0%
강원도 (Gangwon-do)           98     2.4%
서울특별시 (Seoul)            87     2.2%
광주광역시 (Gwangju)          64     1.6%
전라북도 (Jeollabuk-do)       41     1.0%
Other values (7)              108    2.7%

Length

[Figure: histogram of lengths of the category]

Interactions

[Figures: 16 pairwise interaction plots]

Correlations

[Figure: correlation heatmap]

            연도   회차   일련번호  성별   연령대  응시지역  졸업여부  합격여부  학교소재지
연도        1.000  0.997  0.938     0.268  0.373   NaN       0.551     0.203     0.672
회차        0.997  1.000  0.976     0.162  0.312   0.000     0.475     0.192     0.620
일련번호    0.938  0.976  1.000     0.228  0.363   0.090     0.515     0.289     0.681
성별        0.268  0.162  0.228     1.000  0.290   0.002     0.118     0.177     0.340
연령대      0.373  0.312  0.363     0.290  1.000   0.000     0.490     0.166     0.501
응시지역    NaN    0.000  0.090     0.002  0.000   1.000     0.000     0.033     0.268
졸업여부    0.551  0.475  0.515     0.118  0.490   0.000     1.000     0.134     0.624
합격여부    0.203  0.192  0.289     0.177  0.166   0.033     0.134     1.000     0.445
학교소재지  0.672  0.620  0.681     0.340  0.501   0.268     0.624     0.445     1.000

[Figure: correlation heatmap]

            성별   합격여부  학교소재지  응시지역  졸업여부
성별        1.000  0.117     0.304       0.003     0.195
합격여부    0.117  1.000     0.262       0.027     0.127
학교소재지  0.304  0.262     1.000       0.141     0.424
응시지역    0.003  0.027     0.141       1.000     0.000
졸업여부    0.195  0.127     0.424       0.000     1.000

[Figure: correlation heatmap]

            연도    회차    일련번호  연령대  성별   응시지역  졸업여부  합격여부  학교소재지
연도        1.000   1.000   0.991     -0.339  0.121  0.000     0.325     0.123     0.296
회차        1.000   1.000   0.991     -0.339  0.123  0.000     0.325     0.118     0.295
일련번호    0.991   0.991   1.000     -0.375  0.175  0.037     0.360     0.176     0.345
연령대      -0.339  -0.339  -0.375    1.000   0.310  0.000     0.377     0.115     0.252
성별        0.121   0.123   0.175     0.310   1.000  0.003     0.195     0.117     0.304
응시지역    0.000   0.000   0.037     0.000   0.003  1.000     0.000     0.027     0.141
졸업여부    0.325   0.325   0.360     0.377   0.195  0.000     1.000     0.127     0.424
합격여부    0.123   0.118   0.176     0.115   0.117  0.027     0.127     1.000     0.262
학교소재지  0.296   0.295   0.345     0.252   0.304  0.141     0.424     0.262     1.000

Missing values

[Figure: count] A simple visualization of nullity by column.
[Figure: matrix] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

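First rows

Index  연도  직종             회차  일련번호  성별         연령대  응시지역    졸업여부  합격여부  학교소재지
0      2000  의지·보조기기사  1     1         (not shown)  40      서울특별시  졸업      합격      기타
1      2000  의지·보조기기사  1     2         (not shown)  60      서울특별시  졸업      합격      기타
2      2000  의지·보조기기사  1     3         (not shown)  40      서울특별시  졸업      합격      기타
3      2000  의지·보조기기사  1     4         (not shown)  40      서울특별시  졸업      합격      기타
4      2000  의지·보조기기사  1     5         (not shown)  30      서울특별시  졸업      합격      기타
5      2000  의지·보조기기사  1     6         (not shown)  30      서울특별시  졸업      합격      기타
6      2000  의지·보조기기사  1     7         (not shown)  40      서울특별시  졸업      합격      기타
7      2000  의지·보조기기사  1     8         (not shown)  50      서울특별시  졸업      합격      기타
8      2000  의지·보조기기사  1     9         (not shown)  30      서울특별시  졸업      합격      기타
9      2000  의지·보조기기사  1     10        (not shown)  30      서울특별시  졸업      합격      기타

Last rows

Index  연도  직종             회차  일련번호  성별         연령대  응시지역    졸업여부  합격여부  학교소재지
4033   2022  의지·보조기기사  23    4034      (not shown)  20      서울특별시  졸업예정  합격      경기도
4034   2022  의지·보조기기사  23    4035      (not shown)  20      서울특별시  졸업      불합격    충청남도
4035   2022  의지·보조기기사  23    4036      (not shown)  20      서울특별시  졸업예정  합격      경기도
4036   2022  의지·보조기기사  23    4037      (not shown)  20      서울특별시  졸업예정  불합격    충청북도
4037   2022  의지·보조기기사  23    4038      (not shown)  20      서울특별시  졸업예정  합격      충청남도
4038   2022  의지·보조기기사  23    4039      (not shown)  20      서울특별시  졸업예정  불합격    경기도
4039   2022  의지·보조기기사  23    4040      (not shown)  20      서울특별시  졸업예정  결시      충청남도
4040   2022  의지·보조기기사  23    4041      (not shown)  20      서울특별시  졸업예정  합격      경기도
4041   2022  의지·보조기기사  23    4042      (not shown)  20      서울특별시  졸업      불합격    전라남도
4042   2022  의지·보조기기사  23    4043      (not shown)  20      서울특별시  졸업예정  불합격    충청남도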