Overview

Dataset statistics

Number of variables: 9
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 820.3 KiB
Average record size in memory: 84.0 B
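
The two memory figures agree with each other: dividing the total size by the number of observations gives the average record size. A minimal check, assuming 1 KiB = 1024 bytes (the variable names are illustrative only):

```python
# Figures copied from the dataset statistics.
total_size_kib = 820.3
n_observations = 10_000

# Assumption: 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")
```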

Variable types

Numeric: 2
Categorical: 7

Dataset

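Description: Provides information for analyzing candidates for the care worker (요양보호사) qualification exam (year, occupation, exam round, sex, age group, exam region, pass/fail result, school location) in a form that cannot identify individuals.
URL: https://www.data.go.kr/data/15101120/fileData.do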

Alerts

연도 has constant value "2020" [Constant]
직종 has constant value "요양보호사" [Constant]
일련번호 is highly overall correlated with 회차 and 2 other fields [High correlation]
회차 is highly overall correlated with 일련번호 and 1 other field [High correlation]
응시지역 is highly overall correlated with 일련번호 and 2 other fields [High correlation]
학교소재지 is highly overall correlated with 일련번호 and 1 other field [High correlation]
합격여부 is highly imbalanced (59.8%) [Imbalance]
일련번호 has unique values [Unique]
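
The 59.8% imbalance figure for 합격여부 can be reproduced from the category counts reported in that variable's section (8275, 1303, 419, 3). The sketch below assumes the score is one minus the Shannon entropy divided by its maximum, log2 of the number of classes. That definition is an assumption that happens to match the reported value, and `imbalance_score` is a hypothetical helper name, not a function of the profiling tool.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0.0 for a perfectly balanced column,
    approaching 1.0 as a single class dominates."""
    k = len(counts)
    if k < 2:
        raise ValueError("need at least two classes")
    total = sum(counts)
    # Shannon entropy in bits; empty classes contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Counts for 합격, 불합격, 결시, 응시결격, copied from the report.
counts = [8275, 1303, 419, 3]
score = imbalance_score(counts)
print(f"{score:.1%}")
```

A balanced two-class column such as `[5, 5]` scores 0.0 under this definition, so the score reads as "distance from a uniform distribution".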

Reproduction

Analysis started: 2023-12-12 10:35:56.260484
Analysis finished: 2023-12-12 10:35:58.409036
Duration: 2.15 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
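
The reported duration is simply the difference between the two timestamps, rounded to two decimals. A small check with the standard library (the format string is an assumption about how the timestamps are written):

```python
from datetime import datetime

# Timestamp layout assumed from the two values shown in the report.
fmt = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2023-12-12 10:35:56.260484", fmt)
finished = datetime.strptime("2023-12-12 10:35:58.409036", fmt)

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```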

Variables

일련번호 (serial number)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 72758.077
Minimum: 2
Maximum: 186378
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2
5-th percentile: 4576.9
Q1: 23681.75
Median: 47106
Q3: 162172.75
95-th percentile: 181645.65
Maximum: 186378
Range: 186376
Interquartile range (IQR): 138491

Descriptive statistics

Standard deviation: 64577.99
Coefficient of variation (CV): 0.88757143
Kurtosis: -1.0783933
Mean: 72758.077
Median Absolute Deviation (MAD): 25963
Skewness: 0.78447414
Sum: 7.2758077 × 10^8
Variance: 4.1703168 × 10^9
Monotonicity: Not monotonic
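
Several of these figures follow directly from others in the report: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the standard deviation squared. A short check using the values for 일련번호 (small differences are expected because the inputs are already rounded):

```python
# Values copied from the statistics for 일련번호.
minimum, maximum = 2, 186378
q1, q3 = 23681.75, 162172.75
mean, std = 72758.077, 64577.99
n = 10_000

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance
total = mean * n                  # Sum

print(value_range, iqr, round(cv, 6), f"{variance:.4e}", f"{total:.4e}")
```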
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
38195 | 1 | < 0.1%
43411 | 1 | < 0.1%
166964 | 1 | < 0.1%
12959 | 1 | < 0.1%
47805 | 1 | < 0.1%
170362 | 1 | < 0.1%
10846 | 1 | < 0.1%
181551 | 1 | < 0.1%
169884 | 1 | < 0.1%
182286 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Minimum 10 values

Value | Count | Frequency (%)
2 | 1 | < 0.1%
34 | 1 | < 0.1%
46 | 1 | < 0.1%
64 | 1 | < 0.1%
65 | 1 | < 0.1%
77 | 1 | < 0.1%
82 | 1 | < 0.1%
83 | 1 | < 0.1%
89 | 1 | < 0.1%
93 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
186378 | 1 | < 0.1%
186354 | 1 | < 0.1%
186350 | 1 | < 0.1%
186339 | 1 | < 0.1%
186336 | 1 | < 0.1%
186332 | 1 | < 0.1%
186315 | 1 | < 0.1%
186310 | 1 | < 0.1%
186300 | 1 | < 0.1%
186287 | 1 | < 0.1%

연도 (year)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

2020: 10000

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020
2nd row: 2020
3rd row: 2020
4th row: 2020
5th row: 2020

Common Values

Value | Count | Frequency (%)
2020 | 10000 | 100.0%

[Figure: histogram of lengths of the category]
[Figure: common values plot]

직종 (occupation)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

요양보호사 (care worker): 10000

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 요양보호사
2nd row: 요양보호사
3rd row: 요양보호사
4th row: 요양보호사
5th row: 요양보호사

Common Values

Value | Count | Frequency (%)
요양보호사 | 10000 | 100.0%

[Figure: histogram of lengths of the category]
[Figure: common values plot]

회차 (exam round)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

30: 5817
32: 4183

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 30
2nd row: 32
3rd row: 32
4th row: 32
5th row: 30

Common Values

Value | Count | Frequency (%)
30 | 5817 | 58.2%
32 | 4183 | 41.8%

[Figure: histogram of lengths of the category]
[Figure: common values plot]

성별 (sex)
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

8862
1138

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row:
2nd row:
3rd row:
4th row:
5th row:

Common Values

Value | Count | Frequency (%)
 | 8862 | 88.6%
 | 1138 | 11.4%

[Figure: histogram of lengths of the category]
[Figure: common values plot]

연령대 (age group)
Real number (ℝ)

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 51.826
Minimum: 10
Maximum: 80
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10
5-th percentile: 40
Q1: 50
Median: 50
Q3: 60
95-th percentile: 70
Maximum: 80
Range: 70
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 9.6630772
Coefficient of variation (CV): 0.18645231
Kurtosis: 0.75459114
Mean: 51.826
Median Absolute Deviation (MAD): 10
Skewness: -0.51633661
Sum: 518260
Variance: 93.375062
Monotonicity: Not monotonic
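
Because 연령대 takes only eight distinct values, the sum, mean, variance, standard deviation and CV can all be recomputed from the value counts listed for this variable. The sketch below assumes the variance is the sample variance (dividing by n - 1); that choice is an assumption, made because it reproduces the reported figure.

```python
import math

# Value -> count, copied from the 연령대 frequency table.
freq = {10: 1, 20: 123, 30: 339, 40: 1571, 50: 4067, 60: 3387, 70: 475, 80: 37}

n = sum(freq.values())
total = sum(value * count for value, count in freq.items())
mean = total / n

# Assumption: sample variance (n - 1 in the denominator).
variance = sum(count * (value - mean) ** 2 for value, count in freq.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean

print(n, total, mean, round(variance, 6), round(std, 7), round(cv, 8))
```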
Histogram with fixed size bins (bins=8)
Common values

Value | Count | Frequency (%)
50 | 4067 | 40.7%
60 | 3387 | 33.9%
40 | 1571 | 15.7%
70 | 475 | 4.8%
30 | 339 | 3.4%
20 | 123 | 1.2%
80 | 37 | 0.4%
10 | 1 | < 0.1%

Minimum 10 values

Value | Count | Frequency (%)
10 | 1 | < 0.1%
20 | 123 | 1.2%
30 | 339 | 3.4%
40 | 1571 | 15.7%
50 | 4067 | 40.7%
60 | 3387 | 33.9%
70 | 475 | 4.8%
80 | 37 | 0.4%

Maximum 10 values

Value | Count | Frequency (%)
80 | 37 | 0.4%
70 | 475 | 4.8%
60 | 3387 | 33.9%
50 | 4067 | 40.7%
40 | 1571 | 15.7%
30 | 339 | 3.4%
20 | 123 | 1.2%
10 | 1 | < 0.1%

응시지역 (exam region)
Categorical

HIGH CORRELATION

Distinct: 21
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

서울특별시: 2559
인천광역시: 1182
부산광역시: 1150
대구광역시: 1092
광주광역시: 739
Other values (16): 3278

Length

Max length: 7
Median length: 5
Mean length: 4.6268
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 홍성(충남)
2nd row: 대구광역시
3rd row: 부산광역시
4th row: 인천광역시
5th row: 수원,화성

Common Values

Value | Count | Frequency (%)
서울특별시 | 2559 | 25.6%
인천광역시 | 1182 | 11.8%
부산광역시 | 1150 | 11.5%
대구광역시 | 1092 | 10.9%
광주광역시 | 739 | 7.4%
수원,화성 | 558 | 5.6%
의정부,양주 | 458 | 4.6%
창원 | 365 | 3.6%
대전광역시 | 292 | 2.9%
전주 | 273 | 2.7%
Other values (11) | 1332 | 13.3%

[Figure: histogram of lengths of the category]

합격여부 (pass/fail result)
Categorical

IMBALANCE

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

합격 (pass): 8275
불합격 (fail): 1303
결시 (absent): 419
응시결격 (ineligible): 3

Length

Max length: 4
Median length: 2
Mean length: 2.1309
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 불합격
2nd row: 합격
3rd row: 불합격
4th row: 합격
5th row: 합격

Common Values

Value | Count | Frequency (%)
합격 | 8275 | 82.8%
불합격 | 1303 | 13.0%
결시 | 419 | 4.2%
응시결격 | 3 | < 0.1%

[Figure: histogram of lengths of the category]
[Figure: common values plot]

학교소재지 (school location)
Categorical

HIGH CORRELATION

Distinct: 19
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

서울특별시: 2183
수원: 1292
부산광역시: 1021
인천광역시: 739
경상남도: 680
Other values (14): 4085

Length

Max length: 7
Median length: 5
Mean length: 4.2463
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 충청남도
2nd row: 경상북도
3rd row: 부산광역시
4th row: 인천광역시
5th row: 수원

Common Values

Value | Count | Frequency (%)
서울특별시 | 2183 | 21.8%
수원 | 1292 | 12.9%
부산광역시 | 1021 | 10.2%
인천광역시 | 739 | 7.4%
경상남도 | 680 | 6.8%
광주광역시 | 669 | 6.7%
대구광역시 | 666 | 6.7%
의정부 | 522 | 5.2%
경상북도 | 518 | 5.2%
전라남도 | 360 | 3.6%
Other values (9) | 1350 | 13.5%

[Figure: histogram of lengths of the category]

Interactions

[Figure: 4 interaction plots]

Correlations

 | 일련번호 | 회차 | 성별 | 연령대 | 응시지역 | 합격여부 | 학교소재지
일련번호 | 1.000 | 1.000 | 0.033 | 0.078 | 0.912 | 0.092 | 0.844
회차 | 1.000 | 1.000 | 0.016 | 0.091 | 0.665 | 0.074 | 0.532
성별 | 0.033 | 0.016 | 1.000 | 0.272 | 0.062 | 0.043 | 0.062
연령대 | 0.078 | 0.091 | 0.272 | 1.000 | 0.165 | 0.281 | 0.163
응시지역 | 0.912 | 0.665 | 0.062 | 0.165 | 1.000 | 0.128 | 0.981
합격여부 | 0.092 | 0.074 | 0.043 | 0.281 | 0.128 | 1.000 | 0.137
학교소재지 | 0.844 | 0.532 | 0.062 | 0.163 | 0.981 | 0.137 | 1.000

 | 회차 | 합격여부 | 학교소재지 | 응시지역 | 성별
회차 | 1.000 | 0.049 | 0.474 | 0.593 | 0.010
합격여부 | 0.049 | 1.000 | 0.075 | 0.069 | 0.028
학교소재지 | 0.474 | 0.075 | 1.000 | 0.826 | 0.055
응시지역 | 0.593 | 0.069 | 0.826 | 1.000 | 0.055
성별 | 0.010 | 0.028 | 0.055 | 0.055 | 1.000

 | 일련번호 | 연령대 | 회차 | 성별 | 응시지역 | 합격여부 | 학교소재지
일련번호 | 1.000 | 0.019 | 0.982 | 0.024 | 0.704 | 0.060 | 0.596
연령대 | 0.019 | 1.000 | 0.068 | 0.204 | 0.069 | 0.129 | 0.069
회차 | 0.982 | 0.068 | 1.000 | 0.010 | 0.593 | 0.049 | 0.474
성별 | 0.024 | 0.204 | 0.010 | 1.000 | 0.055 | 0.028 | 0.055
응시지역 | 0.704 | 0.069 | 0.593 | 0.055 | 1.000 | 0.069 | 0.826
합격여부 | 0.060 | 0.129 | 0.049 | 0.028 | 0.069 | 1.000 | 0.075
학교소재지 | 0.596 | 0.069 | 0.474 | 0.055 | 0.826 | 0.075 | 1.000
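
The High correlation alerts can be related back to the first matrix by listing, for each variable, the other variables whose coefficient reaches a cut-off. The report does not state the cut-off it used, so the 0.6 below is purely an assumption; any value above 0.532 and up to 0.665 yields the same groups, and those groups match the alert counts ("회차 and 2 other fields" for 일련번호, and so on). `highly_correlated` is a hypothetical helper, not part of the profiling tool.

```python
# First correlation matrix from the report; rows and columns follow `names`.
names = ["일련번호", "회차", "성별", "연령대", "응시지역", "합격여부", "학교소재지"]
matrix = [
    [1.000, 1.000, 0.033, 0.078, 0.912, 0.092, 0.844],
    [1.000, 1.000, 0.016, 0.091, 0.665, 0.074, 0.532],
    [0.033, 0.016, 1.000, 0.272, 0.062, 0.043, 0.062],
    [0.078, 0.091, 0.272, 1.000, 0.165, 0.281, 0.163],
    [0.912, 0.665, 0.062, 0.165, 1.000, 0.128, 0.981],
    [0.092, 0.074, 0.043, 0.281, 0.128, 1.000, 0.137],
    [0.844, 0.532, 0.062, 0.163, 0.981, 0.137, 1.000],
]

# Assumption: illustrative cut-off, not taken from the report's configuration.
THRESHOLD = 0.6

def highly_correlated(names, matrix, threshold):
    """Map each variable to the other variables whose coefficient meets the threshold."""
    result = {}
    for i, name in enumerate(names):
        partners = [
            names[j]
            for j, value in enumerate(matrix[i])
            if j != i and value >= threshold
        ]
        if partners:
            result[name] = partners
    return result

flagged = highly_correlated(names, matrix, THRESHOLD)
for name, partners in flagged.items():
    print(name, "->", ", ".join(partners))
```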

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

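First rows

Row index + 일련번호 | 연도 | 직종 | 회차 | 성별 | 연령대 | 응시지역 | 합격여부 | 학교소재지
4270938195 | 2020 | 요양보호사 | 30 |  | 50 | 홍성(충남) | 불합격 | 충청남도
8310367321 | 2020 | 요양보호사 | 32 |  | 50 | 대구광역시 | 합격 | 경상북도
7196656184 | 2020 | 요양보호사 | 32 |  | 60 | 부산광역시 | 불합격 | 부산광역시
90399181485 | 2020 | 요양보호사 | 32 |  | 40 | 인천광역시 | 합격 | 인천광역시
5226346880 | 2020 | 요양보호사 | 30 |  | 60 | 수원,화성 | 합격 | 수원
1299050012 | 2020 | 요양보호사 | 30 |  | 60 | 대구광역시 | 합격 | 대구광역시
4549440111 | 2020 | 요양보호사 | 30 |  | 60 | 의정부,양주 | 합격 | 서울특별시
89185180271 | 2020 | 요양보호사 | 32 |  | 50 | 인천광역시 | 합격 | 수원
2400419490 | 2020 | 요양보호사 | 30 |  | 60 | 광주광역시 | 합격 | 광주광역시
3773433220 | 2020 | 요양보호사 | 30 |  | 50 | 창원 | 합격 | 경상남도

Last rows

Row index + 일련번호 | 연도 | 직종 | 회차 | 성별 | 연령대 | 응시지역 | 합격여부 | 학교소재지
85548176634 | 2020 | 요양보호사 | 32 |  | 60 | 인천광역시 | 합격 | 인천광역시
5097345590 | 2020 | 요양보호사 | 30 |  | 50 | 수원,화성 | 합격 | 수원
5358848205 | 2020 | 요양보호사 | 30 |  | 50 | 수원,화성 | 합격 | 수원
58498162883 | 2020 | 요양보호사 | 32 |  | 50 | 서울특별시 | 합격 | 수원
86714177800 | 2020 | 요양보호사 | 32 |  | 50 | 인천광역시 | 합격 | 수원
1425351275 | 2020 | 요양보호사 | 30 |  | 70 | 대구광역시 | 불합격 | 대구광역시
1237312374 | 2020 | 요양보호사 | 30 |  | 50 | 부산광역시 | 합격 | 부산광역시
87088709 | 2020 | 요양보호사 | 30 |  | 50 | 서울특별시 | 합격 | 의정부
5024544862 | 2020 | 요양보호사 | 30 |  | 40 | 수원,화성 | 합격 | 수원
8298367201 | 2020 | 요양보호사 | 32 |  | 40 | 대구광역시 | 합격 | 경상북도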