Overview

Dataset statistics

Number of variables: 10
Number of observations: 2598
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 213.2 KiB
Average record size in memory: 84.1 B

Variable types

Numeric: 3
Categorical: 7

Dataset

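Description: Provides information for analyzing the status of applicants for the dental hygienist national examination (year, occupation, exam round, sex, age group, test region, graduation status, pass/fail status, school location) in a form that cannot identify individuals.
URL: https://www.data.go.kr/data/15083497/fileData.do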

Alerts

Constant: 직종 has constant value "1급 장애인재활상담사"
High correlation: 연도 is highly overall correlated with 회차 and 1 other field
High correlation: 회차 is highly overall correlated with 연도 and 1 other field
High correlation: 일련번호 is highly overall correlated with 연도 and 2 other fields
High correlation: 응시지역 is highly overall correlated with 일련번호 and 1 other field
High correlation: 학교소재지 is highly overall correlated with 응시지역
Imbalance: 졸업여부 is highly imbalanced (85.0%)
Unique: 일련번호 has unique values

Reproduction

Analysis started: 2023-12-12 21:49:01.204949
Analysis finished: 2023-12-12 21:49:02.869282
Duration: 1.66 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
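
A report of this kind can be regenerated with ydata-profiling. The following is a minimal sketch rather than the exact procedure used here: the function name, the CSV path, the output path and the `cp949` encoding are assumptions that do not appear in this report. The imports are deferred into the function so that it can be defined without the packages installed.

```python
def build_report(csv_path, html_path="report.html", encoding="cp949"):
    """Profile a CSV file and write the result as an HTML report.

    csv_path, html_path and encoding are illustrative assumptions.
    """
    # Imported lazily so that defining the function needs no third-party package.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Profiling report")
    report.to_file(html_path)
    return report
```

Calling `build_report("exam_applicants.csv")` (a hypothetical file name) would write `report.html` to the working directory.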

Variables

연도 (year)
Real number (ℝ)

HIGH CORRELATION

Distinct: 6
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2020.4911
Minimum: 2018
Maximum: 2023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.0 KiB

Quantile statistics

Minimum: 2018
5th percentile: 2018
Q1: 2019
Median: 2021
Q3: 2022
95th percentile: 2023
Maximum: 2023
Range: 5
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.927977
Coefficient of variation (CV): 0.00095421204
Kurtosis: -1.5468782
Mean: 2020.4911
Median Absolute Deviation (MAD): 2
Skewness: -0.028465192
Sum: 5249236
Variance: 3.7170953
Monotonicity: Increasing

[Figure: histogram with fixed size bins (bins=6)]

Common values
Value | Count | Frequency (%)
2018 | 649 | 25.0%
2023 | 580 | 22.3%
2022 | 450 | 17.3%
2019 | 365 | 14.0%
2021 | 299 | 11.5%
2020 | 255 | 9.8%

Extreme values (lowest)
Value | Count | Frequency (%)
2018 | 649 | 25.0%
2019 | 365 | 14.0%
2020 | 255 | 9.8%
2021 | 299 | 11.5%
2022 | 450 | 17.3%
2023 | 580 | 22.3%

Extreme values (highest)
Value | Count | Frequency (%)
2023 | 580 | 22.3%
2022 | 450 | 17.3%
2021 | 299 | 11.5%
2020 | 255 | 9.8%
2019 | 365 | 14.0%
2018 | 649 | 25.0%
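
The descriptive statistics for 연도 can be recomputed from the frequency table alone. The sketch below uses only the Python standard library and assumes the report uses the sample variance (n - 1 denominator), an assumption that reproduces the reported figures. Variable names are illustrative.

```python
import math

# Value counts for 연도, copied from the frequency table above.
year_counts = {2018: 649, 2019: 365, 2020: 255, 2021: 299, 2022: 450, 2023: 580}

n = sum(year_counts.values())
total = sum(value * count for value, count in year_counts.items())
mean = total / n

# Sample variance with n - 1 in the denominator (assumed convention).
squared_dev = sum(count * (value - mean) ** 2 for value, count in year_counts.items())
variance = squared_dev / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

print(n, total)            # 2598 5249236
print(round(mean, 4))      # 2020.4911
print(round(variance, 4))  # 3.7171
print(round(std, 4))       # 1.928
```

The coefficient of variation is tiny because the mean (about 2020) is large relative to the spread of under two years.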

직종 (occupation)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 20.4 KiB

1급 장애인재활상담사 (Level 1 rehabilitation counselor for persons with disabilities): 2598

Length

Max length: 11
Median length: 11
Mean length: 11
Min length: 11

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1급 장애인재활상담사
2nd row: 1급 장애인재활상담사
3rd row: 1급 장애인재활상담사
4th row: 1급 장애인재활상담사
5th row: 1급 장애인재활상담사

Common Values

Value | Count | Frequency (%)
1급 장애인재활상담사 | 2598 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Words
Value | Count | Frequency (%)
1급 | 2598 | 50.0%
장애인재활상담사 | 2598 | 50.0%

회차 (exam round)
Real number (ℝ)

HIGH CORRELATION

Distinct: 7
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.3645112
Minimum: 1
Maximum: 7
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 3
Median: 5
Q3: 6
95th percentile: 7
Maximum: 7
Range: 6
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.1116148
Coefficient of variation (CV): 0.48381474
Kurtosis: -1.3596355
Mean: 4.3645112
Median Absolute Deviation (MAD): 2
Skewness: -0.20618337
Sum: 11339
Variance: 4.4589172
Monotonicity: Increasing

[Figure: histogram with fixed size bins (bins=7)]

Common values
Value | Count | Frequency (%)
7 | 580 | 22.3%
6 | 450 | 17.3%
3 | 365 | 14.0%
1 | 329 | 12.7%
2 | 320 | 12.3%
5 | 299 | 11.5%
4 | 255 | 9.8%

Extreme values (lowest)
Value | Count | Frequency (%)
1 | 329 | 12.7%
2 | 320 | 12.3%
3 | 365 | 14.0%
4 | 255 | 9.8%
5 | 299 | 11.5%
6 | 450 | 17.3%
7 | 580 | 22.3%

Extreme values (highest)
Value | Count | Frequency (%)
7 | 580 | 22.3%
6 | 450 | 17.3%
5 | 299 | 11.5%
4 | 255 | 9.8%
3 | 365 | 14.0%
2 | 320 | 12.3%
1 | 329 | 12.7%
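
The quantile statistics and the MAD for 회차 can likewise be derived from the value counts without expanding them into 2598 rows. The sketch below assumes quantiles are taken by linear interpolation between order statistics. The helper `weighted_quantile` is a hypothetical name, not part of any library.

```python
from itertools import accumulate

# Value counts for 회차, copied from the frequency table above.
round_counts = {1: 329, 2: 320, 3: 365, 4: 255, 5: 299, 6: 450, 7: 580}

def weighted_quantile(counts, q):
    """Quantile of the expanded sample, interpolating linearly between order statistics."""
    values = sorted(counts)
    cum = list(accumulate(counts[v] for v in values))
    n = cum[-1]

    def order_stat(k):
        # Value at 0-based rank k in the sorted, expanded sample.
        for value, upper in zip(values, cum):
            if k < upper:
                return value
        raise IndexError(k)

    pos = q * (n - 1)
    lo = int(pos)
    hi = min(lo + 1, n - 1)
    return order_stat(lo) + (pos - lo) * (order_stat(hi) - order_stat(lo))

q1 = weighted_quantile(round_counts, 0.25)
median = weighted_quantile(round_counts, 0.5)
q3 = weighted_quantile(round_counts, 0.75)

# Median absolute deviation: the median of |x - median|, again from counts.
abs_dev_counts = {}
for value, count in round_counts.items():
    dev = abs(value - median)
    abs_dev_counts[dev] = abs_dev_counts.get(dev, 0) + count
mad = weighted_quantile(abs_dev_counts, 0.5)

print(q1, median, q3, q3 - q1, mad)  # 3.0 5.0 6.0 3.0 2.0
```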

일련번호 (serial number)
Real number (ℝ)

HIGH CORRELATION  UNIQUE

Distinct: 2598
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1299.5
Minimum: 1
Maximum: 2598
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 130.85
Q1: 650.25
Median: 1299.5
Q3: 1948.75
95th percentile: 2468.15
Maximum: 2598
Range: 2597
Interquartile range (IQR): 1298.5

Descriptive statistics

Standard deviation: 750.12232
Coefficient of variation (CV): 0.57723919
Kurtosis: -1.2
Mean: 1299.5
Median Absolute Deviation (MAD): 649.5
Skewness: 0
Sum: 3376101
Variance: 562683.5
Monotonicity: Strictly increasing

[Figure: histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
1 | 1 | < 0.1%
1747 | 1 | < 0.1%
1729 | 1 | < 0.1%
1730 | 1 | < 0.1%
1731 | 1 | < 0.1%
1732 | 1 | < 0.1%
1733 | 1 | < 0.1%
1734 | 1 | < 0.1%
1735 | 1 | < 0.1%
1736 | 1 | < 0.1%
Other values (2588) | 2588 | 99.6%

Extreme values (lowest)
Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%

Extreme values (highest)
Value | Count | Frequency (%)
2598 | 1 | < 0.1%
2597 | 1 | < 0.1%
2596 | 1 | < 0.1%
2595 | 1 | < 0.1%
2594 | 1 | < 0.1%
2593 | 1 | < 0.1%
2592 | 1 | < 0.1%
2591 | 1 | < 0.1%
2590 | 1 | < 0.1%
2589 | 1 | < 0.1%
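
Because 일련번호 takes every integer from 1 to 2598 exactly once, its statistics follow from closed-form expressions for consecutive integers. The sketch below is illustrative: it assumes the sample variance (n - 1 denominator) and linearly interpolated quantiles, both of which reproduce the reported values.

```python
import math
import statistics

n = 2598  # 일련번호 takes each integer from 1 to n exactly once

total = n * (n + 1) // 2     # sum of 1..n
mean = (n + 1) / 2
variance = n * (n + 1) / 12  # sample variance (n - 1 denominator) of 1..n
std = math.sqrt(variance)

def quantile(q):
    """Quantile of 1..n with linear interpolation between order statistics."""
    return 1 + q * (n - 1)

print(total, mean, variance)  # 3376101 1299.5 562683.5
print(round(std, 5))          # 750.12232
print([round(quantile(q), 2) for q in (0.05, 0.25, 0.5, 0.75, 0.95)])
# [130.85, 650.25, 1299.5, 1948.75, 2468.15]

# Brute-force cross-check of the closed-form variance.
brute_force = statistics.variance(range(1, n + 1))
print(math.isclose(brute_force, variance))  # True
```

A row index like this carries no information about the applicants, so its high correlation with 연도 and 회차 only reflects that the rows are sorted chronologically.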

성별 (sex)
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 20.4 KiB
1576 
1022 

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

Value | Count | Frequency (%)
 | 1576 | 60.7%
 | 1022 | 39.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

연령대 (age group)
Categorical

Distinct: 5
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 20.4 KiB

30: 860
40: 772
20: 671
50: 278
60: 17

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 30
2nd row: 30
3rd row: 30
4th row: 50
5th row: 30

Common Values

Value | Count | Frequency (%)
30 | 860 | 33.1%
40 | 772 | 29.7%
20 | 671 | 25.8%
50 | 278 | 10.7%
60 | 17 | 0.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

응시지역 (test region)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 20.4 KiB

서울특별시: 1829
대구광역시: 407
전주: 362

Length

Max length: 5
Median length: 5
Mean length: 4.5819861
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 서울특별시
3rd row: 서울특별시
4th row: 서울특별시
5th row: 서울특별시

Common Values

Value | Count | Frequency (%)
서울특별시 | 1829 | 70.4%
대구광역시 | 407 | 15.7%
전주 | 362 | 13.9%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
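
The length statistics for 응시지역 follow directly from the category counts, since each value contributes its character count once per occurrence. A minimal sketch, assuming lengths are counted in Unicode characters (which is what Python's `len` does for `str`):

```python
# Category counts for 응시지역, copied from the Common Values table.
region_counts = {"서울특별시": 1829, "대구광역시": 407, "전주": 362}

n = sum(region_counts.values())
mean_length = sum(len(value) * count for value, count in region_counts.items()) / n
min_length = min(len(value) for value in region_counts)
max_length = max(len(value) for value in region_counts)

print(min_length, max_length)  # 2 5
print(round(mean_length, 7))   # 4.5819861
```

The mean sits close to 5 because only the 362 rows for 전주 have a two-character value.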

졸업여부 (graduation status)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 20.4 KiB

졸업: 2542
졸업예정: 56

Length

Max length: 4
Median length: 2
Mean length: 2.0431101
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 졸업
2nd row: 졸업
3rd row: 졸업
4th row: 졸업
5th row: 졸업

Common Values

Value | Count | Frequency (%)
졸업 (graduated) | 2542 | 97.8%
졸업예정 (expected to graduate) | 56 | 2.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
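
The 85.0% imbalance figure reported for 졸업여부 is consistent with a score defined as one minus the normalized Shannon entropy of the class distribution. That definition is an assumption here, chosen because it reproduces the reported value; the function name is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class counts.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    if len(counts) < 2:
        raise ValueError("need at least two classes")
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(counts))

# Counts for 졸업 and 졸업예정 from the Common Values table.
score = imbalance_score([2542, 56])
print(f"{score:.1%}")  # 85.0%
```

A perfectly even split such as `imbalance_score([50, 50])` gives 0.0.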

합격여부 (pass/fail status)
Categorical

Distinct: 4
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 20.4 KiB

합격: 1692
불합격: 621
결시: 279
응시결격: 6

Length

Max length: 4
Median length: 2
Mean length: 2.243649
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 불합격
2nd row: 합격
3rd row: 불합격
4th row: 결시
5th row: 합격

Common Values

Value | Count | Frequency (%)
합격 (passed) | 1692 | 65.1%
불합격 (failed) | 621 | 23.9%
결시 (absent) | 279 | 10.7%
응시결격 (disqualified from sitting the exam) | 6 | 0.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

학교소재지 (school location)
Categorical

HIGH CORRELATION

Distinct: 22
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 20.4 KiB

경상북도: 555
경기도: 472
충청남도: 380
서울특별시: 322
전라북도: 293
Other values (17): 576

Length

Max length: 7
Median length: 4
Mean length: 3.9996151
Min length: 2

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 충청북도
2nd row: 충청남도
3rd row: 전라북도
4th row: 경기도
5th row: 충청남도

Common Values

Value | Count | Frequency (%)
경상북도 | 555 | 21.4%
경기도 | 472 | 18.2%
충청남도 | 380 | 14.6%
서울특별시 | 322 | 12.4%
전라북도 | 293 | 11.3%
부산광역시 | 126 | 4.8%
광주광역시 | 98 | 3.8%
충청북도 | 49 | 1.9%
대전광역시 | 47 | 1.8%
전라남도 | 46 | 1.8%
Other values (12) | 210 | 8.1%

Length

[Figure: histogram of lengths of the category]
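
The "Other values" row for 학교소재지 is the remainder after the ten most frequent categories are listed. A short sketch of that arithmetic, using the counts from the Common Values table together with the reported totals (2598 observations, 22 distinct values):

```python
# Ten most frequent 학교소재지 values, copied from the Common Values table.
top_counts = {
    "경상북도": 555, "경기도": 472, "충청남도": 380, "서울특별시": 322,
    "전라북도": 293, "부산광역시": 126, "광주광역시": 98, "충청북도": 49,
    "대전광역시": 47, "전라남도": 46,
}
n_observations = 2598
n_distinct = 22

other_count = n_observations - sum(top_counts.values())
other_categories = n_distinct - len(top_counts)
other_share = other_count / n_observations

print(f"Other values ({other_categories}) {other_count} {other_share:.1%}")
# Other values (12) 210 8.1%
```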

Interactions

[Figures: 9 interaction plots; images not preserved]

Correlations

[Figure: correlation heatmap]
Variable | 연도 | 회차 | 일련번호 | 성별 | 연령대 | 응시지역 | 졸업여부 | 합격여부 | 학교소재지
연도 | 1.000 | 1.000 | 0.949 | 0.005 | 0.548 | 0.079 | 0.126 | 0.202 | 0.315
회차 | 1.000 | 1.000 | 0.956 | 0.028 | 0.386 | 0.372 | 0.156 | 0.296 | 0.309
일련번호 | 0.949 | 0.956 | 1.000 | 0.032 | 0.544 | 0.655 | 0.218 | 0.322 | 0.413
성별 | 0.005 | 0.028 | 0.032 | 1.000 | 0.113 | 0.000 | 0.102 | 0.164 | 0.000
연령대 | 0.548 | 0.386 | 0.544 | 0.113 | 1.000 | 0.108 | 0.189 | 0.182 | 0.425
응시지역 | 0.079 | 0.372 | 0.655 | 0.000 | 0.108 | 1.000 | 0.000 | 0.052 | 0.833
졸업여부 | 0.126 | 0.156 | 0.218 | 0.102 | 0.189 | 0.000 | 1.000 | 0.093 | 0.100
합격여부 | 0.202 | 0.296 | 0.322 | 0.164 | 0.182 | 0.052 | 0.093 | 1.000 | 0.147
학교소재지 | 0.315 | 0.309 | 0.413 | 0.000 | 0.425 | 0.833 | 0.100 | 0.147 | 1.000

[Figure: correlation heatmap]
Variable | 성별 | 합격여부 | 학교소재지 | 연령대 | 응시지역 | 졸업여부
성별 | 1.000 | 0.109 | 0.000 | 0.138 | 0.000 | 0.065
합격여부 | 0.109 | 1.000 | 0.080 | 0.150 | 0.049 | 0.061
학교소재지 | 0.000 | 0.080 | 1.000 | 0.223 | 0.579 | 0.088
연령대 | 0.138 | 0.150 | 0.223 | 1.000 | 0.081 | 0.231
응시지역 | 0.000 | 0.049 | 0.579 | 0.081 | 1.000 | 0.000
졸업여부 | 0.065 | 0.061 | 0.088 | 0.231 | 0.000 | 1.000

[Figure: correlation heatmap]
Variable | 연도 | 회차 | 일련번호 | 성별 | 연령대 | 응시지역 | 졸업여부 | 합격여부 | 학교소재지
연도 | 1.000 | 0.994 | 0.981 | 0.034 | 0.259 | 0.271 | 0.167 | 0.187 | 0.132
회차 | 0.994 | 1.000 | 0.987 | 0.029 | 0.259 | 0.270 | 0.166 | 0.207 | 0.120
일련번호 | 0.981 | 0.987 | 1.000 | 0.025 | 0.256 | 0.504 | 0.167 | 0.197 | 0.165
성별 | 0.034 | 0.029 | 0.025 | 1.000 | 0.138 | 0.000 | 0.065 | 0.109 | 0.000
연령대 | 0.259 | 0.259 | 0.256 | 0.138 | 1.000 | 0.081 | 0.231 | 0.150 | 0.223
응시지역 | 0.271 | 0.270 | 0.504 | 0.000 | 0.081 | 1.000 | 0.000 | 0.049 | 0.579
졸업여부 | 0.167 | 0.166 | 0.167 | 0.065 | 0.231 | 0.000 | 1.000 | 0.061 | 0.088
합격여부 | 0.187 | 0.207 | 0.197 | 0.109 | 0.150 | 0.049 | 0.061 | 1.000 | 0.080
학교소재지 | 0.132 | 0.120 | 0.165 | 0.000 | 0.223 | 0.579 | 0.088 | 0.080 | 1.000
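
The High correlation alerts listed near the top of the report can be reproduced from the last of the three matrices in this section by flagging every off-diagonal coefficient of at least 0.5. Both the threshold and the choice of matrix are assumptions inferred from the numbers, not stated in the report. The function name is illustrative.

```python
names = ["연도", "회차", "일련번호", "성별", "연령대",
         "응시지역", "졸업여부", "합격여부", "학교소재지"]

# Coefficients copied from the last correlation matrix in this section.
matrix = [
    [1.000, 0.994, 0.981, 0.034, 0.259, 0.271, 0.167, 0.187, 0.132],
    [0.994, 1.000, 0.987, 0.029, 0.259, 0.270, 0.166, 0.207, 0.120],
    [0.981, 0.987, 1.000, 0.025, 0.256, 0.504, 0.167, 0.197, 0.165],
    [0.034, 0.029, 0.025, 1.000, 0.138, 0.000, 0.065, 0.109, 0.000],
    [0.259, 0.259, 0.256, 0.138, 1.000, 0.081, 0.231, 0.150, 0.223],
    [0.271, 0.270, 0.504, 0.000, 0.081, 1.000, 0.000, 0.049, 0.579],
    [0.167, 0.166, 0.167, 0.065, 0.231, 0.000, 1.000, 0.061, 0.088],
    [0.187, 0.207, 0.197, 0.109, 0.150, 0.049, 0.061, 1.000, 0.080],
    [0.132, 0.120, 0.165, 0.000, 0.223, 0.579, 0.088, 0.080, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off for a "high" correlation

def high_correlation_alerts(names, matrix, threshold):
    """Map each variable to the other variables it correlates with at or above threshold."""
    alerts = {}
    for i, name in enumerate(names):
        partners = [names[j] for j, value in enumerate(matrix[i])
                    if j != i and value >= threshold]
        if partners:
            alerts[name] = partners
    return alerts

alerts = high_correlation_alerts(names, matrix, THRESHOLD)
for name, partners in alerts.items():
    print(name, "->", ", ".join(partners))
```

Under these assumptions the result flags 연도, 회차, 일련번호, 응시지역 and 학교소재지, the same five variables that carry a High correlation alert.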

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

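First rows
Index | 연도 | 직종 | 회차 | 일련번호 | 성별 | 연령대 | 응시지역 | 졸업여부 | 합격여부 | 학교소재지
0 | 2018 | 1급 장애인재활상담사 | 1 | 1 |  | 30 | 서울특별시 | 졸업 | 불합격 | 충청북도
1 | 2018 | 1급 장애인재활상담사 | 1 | 2 |  | 30 | 서울특별시 | 졸업 | 합격 | 충청남도
2 | 2018 | 1급 장애인재활상담사 | 1 | 3 |  | 30 | 서울특별시 | 졸업 | 불합격 | 전라북도
3 | 2018 | 1급 장애인재활상담사 | 1 | 4 |  | 50 | 서울특별시 | 졸업 | 결시 | 경기도
4 | 2018 | 1급 장애인재활상담사 | 1 | 5 |  | 30 | 서울특별시 | 졸업 | 합격 | 충청남도
5 | 2018 | 1급 장애인재활상담사 | 1 | 6 |  | 30 | 서울특별시 | 졸업 | 합격 | 서울특별시
6 | 2018 | 1급 장애인재활상담사 | 1 | 7 |  | 40 | 서울특별시 | 졸업 | 합격 | 전라북도
7 | 2018 | 1급 장애인재활상담사 | 1 | 8 |  | 30 | 서울특별시 | 졸업 | 합격 | 충청남도
8 | 2018 | 1급 장애인재활상담사 | 1 | 9 |  | 40 | 서울특별시 | 졸업 | 합격 | 서울특별시
9 | 2018 | 1급 장애인재활상담사 | 1 | 10 |  | 40 | 서울특별시 | 졸업 | 합격 | 강원도

Last rows
Index | 연도 | 직종 | 회차 | 일련번호 | 성별 | 연령대 | 응시지역 | 졸업여부 | 합격여부 | 학교소재지
2588 | 2023 | 1급 장애인재활상담사 | 7 | 2589 |  | 30 | 전주 | 졸업 | 결시 | 전라북도
2589 | 2023 | 1급 장애인재활상담사 | 7 | 2590 |  | 40 | 전주 | 졸업 | 불합격 | 광주광역시
2590 | 2023 | 1급 장애인재활상담사 | 7 | 2591 |  | 40 | 전주 | 졸업 | 합격 | 충청남도
2591 | 2023 | 1급 장애인재활상담사 | 7 | 2592 |  | 30 | 전주 | 졸업 | 합격 | 광주광역시
2592 | 2023 | 1급 장애인재활상담사 | 7 | 2593 |  | 30 | 전주 | 졸업 | 응시결격 | 전라북도
2593 | 2023 | 1급 장애인재활상담사 | 7 | 2594 |  | 20 | 전주 | 졸업 | 합격 | 전라북도
2594 | 2023 | 1급 장애인재활상담사 | 7 | 2595 |  | 20 | 전주 | 졸업예정 | 합격 | 전라북도
2595 | 2023 | 1급 장애인재활상담사 | 7 | 2596 |  | 20 | 전주 | 졸업 | 합격 | 전라북도
2596 | 2023 | 1급 장애인재활상담사 | 7 | 2597 |  | 20 | 전주 | 졸업 | 합격 | 전라북도
2597 | 2023 | 1급 장애인재활상담사 | 7 | 2598 |  | 20 | 전주 | 졸업 | 합격 | 전라북도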