Overview

Dataset statistics

Number of variables: 7
Number of observations: 6668
Missing cells: 242
Missing cells (%): 0.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 390.8 KiB
Average record size in memory: 60.0 B
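
The percentages and per-record size above follow from the raw counts. A minimal sketch of the arithmetic, assuming one cell per variable per observation and 1 KiB = 1024 B (the variable names are illustrative, not part of the report):

```python
# Figures copied from the "Dataset statistics" block above.
n_variables = 7
n_observations = 6668
missing_cells = 242
total_size_kib = 390.8

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations

# Share of all cells that are missing, as a percentage.
missing_pct = 100 * missing_cells / total_cells

# Average record size: total bytes divided by the number of rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the results agree with the reported 0.5% and 60.0 B.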

Variable types

Numeric: 4
Categorical: 2
Text: 1

Dataset

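Description: Provides information on the test sites where national examinations for health and medical personnel were held (year 연도, occupation 직종, session 회차, test region 응시지역, test site name 시험장명, number of exam rooms 시험실수, seating capacity 수용인원수).
URL: https://www.data.go.kr/data/3068128/fileData.do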

Alerts

회차 is highly overall correlated with 직종 (High correlation)
시험실수 is highly overall correlated with 수용인원수 (High correlation)
수용인원수 is highly overall correlated with 시험실수 (High correlation)
직종 is highly overall correlated with 회차 (High correlation)
시험실수 has 121 (1.8%) missing values (Missing)
수용인원수 has 121 (1.8%) missing values (Missing)
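
The two "Missing" alerts can be reconciled with the dataset-level count of 242 missing cells. A small sketch, using only the counts quoted in the alerts (the dictionary name is illustrative):

```python
n_observations = 6668

# Missing counts per column, as stated in the alerts above.
missing_per_column = {"시험실수": 121, "수용인원수": 121}

for column, n_missing in missing_per_column.items():
    pct = 100 * n_missing / n_observations
    print(f"{column}: {n_missing} missing ({pct:.1f}%)")

# The two columns together account for every missing cell in the dataset.
total_missing = sum(missing_per_column.values())
print(f"total missing cells: {total_missing}")
```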

Reproduction

Analysis started: 2023-12-12 22:45:58.437224
Analysis finished: 2023-12-12 22:46:01.388837
Duration: 2.95 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
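
A report of this kind could be regenerated roughly as follows. This is a sketch under assumptions, not the configuration actually used: the function name `build_report`, the CSV path, the output file name, and the title are hypothetical, and the source file may need an explicit `encoding` argument. `ProfileReport` and `to_file` are the ydata-profiling entry points; the `dataset` dictionary is assumed to populate the "Dataset" block shown above.

```python
def build_report(csv_path: str, output_html: str = "report.html") -> None:
    """Profile a CSV file and write the result as an HTML report."""
    # Imported lazily so the function can be defined without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the data is a plain CSV readable with pandas defaults.
    df = pd.read_csv(csv_path)

    report = ProfileReport(
        df,
        title="Profiling report",  # hypothetical title
        dataset={
            "description": "Test sites for national health personnel examinations",
            "url": "https://www.data.go.kr/data/3068128/fileData.do",
        },
    )
    report.to_file(output_html)


# Example call (hypothetical file name):
# build_report("test_sites.csv")
```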

Variables

연도
Real number (ℝ)

Distinct: 13
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2018.3329
Minimum: 2011
Maximum: 2023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 58.7 KiB

Quantile statistics

Minimum: 2011
5-th percentile: 2012
Q1: 2016
Median: 2019
Q3: 2021
95-th percentile: 2022
Maximum: 2023
Range: 12
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.2945243
Coefficient of variation (CV): 0.0016322997
Kurtosis: -0.83770313
Mean: 2018.3329
Median Absolute Deviation (MAD): 2
Skewness: -0.58051554
Sum: 13458244
Variance: 10.85389
Monotonicity: Increasing

Histogram with fixed size bins (bins=13)

Common values

Value | Count | Frequency (%)
2022 | 1167 | 17.5%
2021 | 960 | 14.4%
2020 | 842 | 12.6%
2019 | 631 | 9.5%
2018 | 504 | 7.6%
2017 | 438 | 6.6%
2016 | 410 | 6.1%
2015 | 406 | 6.1%
2014 | 381 | 5.7%
2012 | 318 | 4.8%
Other values (3) | 611 | 9.2%

Minimum 10 values

Value | Count | Frequency (%)
2011 | 135 | 2.0%
2012 | 318 | 4.8%
2013 | 317 | 4.8%
2014 | 381 | 5.7%
2015 | 406 | 6.1%
2016 | 410 | 6.1%
2017 | 438 | 6.6%
2018 | 504 | 7.6%
2019 | 631 | 9.5%
2020 | 842 | 12.6%

Maximum 10 values

Value | Count | Frequency (%)
2023 | 159 | 2.4%
2022 | 1167 | 17.5%
2021 | 960 | 14.4%
2020 | 842 | 12.6%
2019 | 631 | 9.5%
2018 | 504 | 7.6%
2017 | 438 | 6.6%
2016 | 410 | 6.1%
2015 | 406 | 6.1%
2014 | 381 | 5.7%
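
Because 연도 has only 13 distinct values, its summary statistics can be recomputed from the frequency tables alone. The sketch below does so, assuming the reported variance is the sample variance (dividing by n - 1, the pandas default); the counts are copied from the tables above, and the variable names are illustrative.

```python
from math import sqrt

# Year counts copied from the frequency tables above
# (2011, 2013 and 2023 come from the minimum/maximum tables).
year_counts = {
    2011: 135, 2012: 318, 2013: 317, 2014: 381, 2015: 406,
    2016: 410, 2017: 438, 2018: 504, 2019: 631, 2020: 842,
    2021: 960, 2022: 1167, 2023: 159,
}

n = sum(year_counts.values())                                # number of observations
total = sum(year * count for year, count in year_counts.items())
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1.
variance = sum(
    count * (year - mean) ** 2 for year, count in year_counts.items()
) / (n - 1)
std = sqrt(variance)

print(n, total)
print(round(mean, 4), round(variance, 4), round(std, 4))
```

The recomputed count, sum, mean, variance, and standard deviation agree with the reported 6668, 13458244, 2018.3329, 10.85389, and 3.2945243 to the precision shown.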

직종
Categorical

HIGH CORRELATION 

Distinct: 43
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 52.2 KiB

Value | Count
요양보호사 | 3305
간호조무사 | 937
간호사 | 382
위생사 | 238
영양사 | 170
Other values (38) | 1636

Length

Max length: 17
Median length: 5
Mean length: 5.0038992
Min length: 2

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 간호조무사
2nd row: 간호조무사
3rd row: 간호조무사
4th row: 간호조무사
5th row: 간호조무사

Common Values

Value | Count | Frequency (%)
요양보호사 | 3305 | 49.6%
간호조무사 | 937 | 14.1%
간호사 | 382 | 5.7%
위생사 | 238 | 3.6%
영양사 | 170 | 2.5%
치과위생사 | 137 | 2.1%
물리치료사 | 126 | 1.9%
의사 | 103 | 1.5%
방사선사 | 99 | 1.5%
보건의료정보관리사 | 96 | 1.4%
Other values (33) | 1075 | 16.1%

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
요양보호사 | 3305 | 46.6%
간호조무사 | 937 | 13.2%
간호사 | 382 | 5.4%
위생사 | 238 | 3.4%
보건교육사 | 199 | 2.8%
영양사 | 170 | 2.4%
2급 | 159 | 2.2%
치과위생사 | 137 | 1.9%
1급 | 129 | 1.8%
물리치료사 | 126 | 1.8%
Other values (26) | 1307 | 18.4%

회차
Real number (ℝ)

HIGH CORRELATION 

Distinct: 86
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31.523395
Minimum: 1
Maximum: 87
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 58.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 6
Q1: 19
Median: 32
Q3: 40
95-th percentile: 63
Maximum: 87
Range: 86
Interquartile range (IQR): 21

Descriptive statistics

Standard deviation: 17.031365
Coefficient of variation (CV): 0.54027699
Kurtosis: 0.43030501
Mean: 31.523395
Median Absolute Deviation (MAD): 10
Skewness: 0.54924257
Sum: 210198
Variance: 290.0674
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
32 | 308 | 4.6%
41 | 302 | 4.5%
39 | 279 | 4.2%
40 | 249 | 3.7%
38 | 243 | 3.6%
37 | 221 | 3.3%
36 | 206 | 3.1%
35 | 202 | 3.0%
33 | 153 | 2.3%
34 | 142 | 2.1%
Other values (76) | 4363 | 65.4%

Minimum 10 values

Value | Count | Frequency (%)
1 | 26 | 0.4%
2 | 53 | 0.8%
3 | 74 | 1.1%
4 | 62 | 0.9%
5 | 108 | 1.6%
6 | 100 | 1.5%
7 | 104 | 1.6%
8 | 93 | 1.4%
9 | 105 | 1.6%
10 | 107 | 1.6%

Maximum 10 values

Value | Count | Frequency (%)
87 | 15 | 0.2%
86 | 16 | 0.2%
85 | 8 | 0.1%
84 | 8 | 0.1%
83 | 8 | 0.1%
82 | 7 | 0.1%
81 | 7 | 0.1%
80 | 7 | 0.1%
79 | 7 | 0.1%
78 | 14 | 0.2%
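
Several of the 회차 statistics are derived from others in the same block. A short sketch of those relationships, using the reported values as inputs (variable names are illustrative):

```python
# Values copied from the 회차 statistics above.
mean = 31.523395
std = 17.031365
q1, q3 = 19, 40
minimum, maximum = 1, 87

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV={cv:.4f} variance={variance:.2f} IQR={iqr} range={value_range}")
```

The derived values agree with the reported CV (0.54027699), variance (290.0674), IQR (21), and range (86) up to rounding.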

응시지역
Categorical

Distinct: 33
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 52.2 KiB

Value | Count
서울특별시 | 1284
부산광역시 | 619
대구광역시 | 608
광주광역시 | 578
대전광역시 | 466
Other values (28) | 3113

Length

Max length: 7
Median length: 5
Mean length: 4.2075585
Min length: 2

Unique

Unique: 4
Unique (%): 0.1%

Sample

1st row: 광주광역시
2nd row: 광주광역시
3rd row: 구미
4th row: 대구광역시
5th row: 대전광역시

Common Values

Value | Count | Frequency (%)
서울특별시 | 1284 | 19.3%
부산광역시 | 619 | 9.3%
대구광역시 | 608 | 9.1%
광주광역시 | 578 | 8.7%
대전광역시 | 466 | 7.0%
인천광역시 | 391 | 5.9%
전주 | 386 | 5.8%
창원 | 341 | 5.1%
수원,화성 | 277 | 4.2%
제주도 | 191 | 2.9%
Other values (23) | 1527 | 22.9%

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
서울특별시 | 1284 | 19.3%
부산광역시 | 619 | 9.3%
대구광역시 | 608 | 9.1%
광주광역시 | 578 | 8.7%
대전광역시 | 466 | 7.0%
인천광역시 | 391 | 5.9%
전주 | 386 | 5.8%
창원 | 341 | 5.1%
수원,화성 | 277 | 4.2%
제주도 | 191 | 2.9%
Other values (23) | 1527 | 22.9%

시험장명
Text

Distinct: 1174
Distinct (%): 17.6%
Missing: 0
Missing (%): 0.0%
Memory size: 52.2 KiB

Length

Max length: 32
Median length: 26
Mean length: 8.334883
Min length: 5

Characters and Unicode

Total characters: 55577
Distinct characters: 281
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
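
As an illustration of these character properties, the sketch below counts Unicode general categories for one test-site name from the sample rows, using Python's standard `unicodedata` module. The two-letter codes correspond to the category names used in this report, for example `Lo` (Other Letter), `Ps` (Open Punctuation), and `Pe` (Close Punctuation). It is not how the report itself was computed.

```python
import unicodedata
from collections import Counter

# A test-site name taken from the sample rows of this report.
name = "상일중학교(오전)"

# unicodedata.category returns the two-letter general category of a character.
category_counts = Counter(unicodedata.category(ch) for ch in name)

for category, count in category_counts.most_common():
    print(category, count)
```

Here the seven Hangul syllables fall under `Lo` and the two parentheses under `Ps` and `Pe`.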

Unique

Unique: 435
Unique (%): 6.5%

Sample

1st row: 상일중학교(오전)
2nd row: 전남중학교
3rd row: 선주중학교
4th row: 동부공업고등학교(대구일마이스터고등학교)
5th row: 대전노은중학교

Words

Value | Count | Frequency (%)
전주서신중학교 | 94 | 1.4%
대구공업고등학교 | 82 | 1.2%
전주공업고등학교 | 80 | 1.2%
경북기계공업고등학교 | 77 | 1.1%
부산전자공업고등학교 | 74 | 1.1%
한라중학교 | 74 | 1.1%
오주중학교 | 67 | 1.0%
여명중학교 | 67 | 1.0%
성동공업고등학교 | 61 | 0.9%
대구서부공업고등학교 | 60 | 0.9%
Other values (1195) | 6081 | 89.2%

Most occurring characters

ValueCountFrequency (%)
6938
 
12.5%
6733
 
12.1%
4287
 
7.7%
2341
 
4.2%
2311
 
4.2%
2212
 
4.0%
) 2109
 
3.8%
( 2109
 
3.8%
1809
 
3.3%
1309
 
2.4%
Other values (271) 23419
42.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 50780 | 91.4%
Close Punctuation | 2109 | 3.8%
Open Punctuation | 2109 | 3.8%
Dash Punctuation | 349 | 0.6%
Space Separator | 149 | 0.3%
Decimal Number | 52 | 0.1%
Uppercase Letter | 18 | < 0.1%
Other Punctuation | 11 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
6938
 
13.7%
6733
 
13.3%
4287
 
8.4%
2341
 
4.6%
2311
 
4.6%
2212
 
4.4%
1809
 
3.6%
1309
 
2.6%
1235
 
2.4%
1188
 
2.3%
Other values (253) 20417
40.2%
Decimal Number
Value | Count | Frequency (%)
2 | 24 | 46.2%
1 | 14 | 26.9%
3 | 7 | 13.5%
0 | 3 | 5.8%
4 | 2 | 3.8%
8 | 1 | 1.9%
9 | 1 | 1.9%
Uppercase Letter
Value | Count | Frequency (%)
I | 7 | 38.9%
T | 7 | 38.9%
C | 1 | 5.6%
O | 1 | 5.6%
U | 1 | 5.6%
E | 1 | 5.6%
Close Punctuation
Value | Count | Frequency (%)
) | 2109 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 2109 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 349 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 149 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
, | 11 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 50780 | 91.4%
Common | 4779 | 8.6%
Latin | 18 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
6938
 
13.7%
6733
 
13.3%
4287
 
8.4%
2341
 
4.6%
2311
 
4.6%
2212
 
4.4%
1809
 
3.6%
1309
 
2.6%
1235
 
2.4%
1188
 
2.3%
Other values (253) 20417
40.2%
Common
Value | Count | Frequency (%)
) | 2109 | 44.1%
( | 2109 | 44.1%
- | 349 | 7.3%
(space) | 149 | 3.1%
2 | 24 | 0.5%
1 | 14 | 0.3%
, | 11 | 0.2%
3 | 7 | 0.1%
0 | 3 | 0.1%
4 | 2 | < 0.1%
Other values (2) | 2 | < 0.1%
Latin
Value | Count | Frequency (%)
I | 7 | 38.9%
T | 7 | 38.9%
C | 1 | 5.6%
O | 1 | 5.6%
U | 1 | 5.6%
E | 1 | 5.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 50780 | 91.4%
ASCII | 4797 | 8.6%

Most frequent character per block

Hangul
ValueCountFrequency (%)
6938
 
13.7%
6733
 
13.3%
4287
 
8.4%
2341
 
4.6%
2311
 
4.6%
2212
 
4.4%
1809
 
3.6%
1309
 
2.6%
1235
 
2.4%
1188
 
2.3%
Other values (253) 20417
40.2%
ASCII
Value | Count | Frequency (%)
) | 2109 | 44.0%
( | 2109 | 44.0%
- | 349 | 7.3%
(space) | 149 | 3.1%
2 | 24 | 0.5%
1 | 14 | 0.3%
, | 11 | 0.2%
3 | 7 | 0.1%
I | 7 | 0.1%
T | 7 | 0.1%
Other values (8) | 11 | 0.2%

시험실수
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 55
Distinct (%): 0.8%
Missing: 121
Missing (%): 1.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.207881
Minimum: 1
Maximum: 55
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 58.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 15
Median: 20
Q3: 26
95-th percentile: 35
Maximum: 55
Range: 54
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 9.3208261
Coefficient of variation (CV): 0.46124707
Kurtosis: 0.052985474
Mean: 20.207881
Median Absolute Deviation (MAD): 6
Skewness: 0.058302952
Sum: 132301
Variance: 86.877799
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
20 | 432 | 6.5%
19 | 302 | 4.5%
23 | 291 | 4.4%
18 | 287 | 4.3%
24 | 280 | 4.2%
25 | 272 | 4.1%
22 | 272 | 4.1%
21 | 269 | 4.0%
15 | 267 | 4.0%
26 | 243 | 3.6%
Other values (45) | 3632 | 54.5%

Minimum 10 values

Value | Count | Frequency (%)
1 | 182 | 2.7%
2 | 95 | 1.4%
3 | 81 | 1.2%
4 | 91 | 1.4%
5 | 81 | 1.2%
6 | 73 | 1.1%
7 | 89 | 1.3%
8 | 81 | 1.2%
9 | 107 | 1.6%
10 | 131 | 2.0%

Maximum 10 values

Value | Count | Frequency (%)
55 | 2 | < 0.1%
54 | 1 | < 0.1%
53 | 1 | < 0.1%
52 | 2 | < 0.1%
51 | 2 | < 0.1%
50 | 1 | < 0.1%
49 | 4 | 0.1%
48 | 5 | 0.1%
47 | 10 | 0.1%
46 | 5 | 0.1%

수용인원수
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 277
Distinct (%): 4.2%
Missing: 121
Missing (%): 1.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 511.09577
Minimum: 11
Maximum: 1650
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 58.7 KiB

Quantile statistics

Minimum: 11
5-th percentile: 90
Q1: 340
Median: 500
Q3: 660
95-th percentile: 950
Maximum: 1650
Range: 1639
Interquartile range (IQR): 320

Descriptive statistics

Standard deviation: 255.90923
Coefficient of variation (CV): 0.500707
Kurtosis: 0.41588533
Mean: 511.09577
Median Absolute Deviation (MAD): 160
Skewness: 0.44032307
Sum: 3346144
Variance: 65489.535
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
600 | 284 | 4.3%
500 | 216 | 3.2%
400 | 208 | 3.1%
300 | 191 | 2.9%
540 | 187 | 2.8%
480 | 174 | 2.6%
420 | 157 | 2.4%
450 | 155 | 2.3%
750 | 153 | 2.3%
360 | 153 | 2.3%
Other values (267) | 4669 | 70.0%

Minimum 10 values

Value | Count | Frequency (%)
11 | 1 | < 0.1%
15 | 1 | < 0.1%
20 | 38 | 0.6%
25 | 33 | 0.5%
28 | 1 | < 0.1%
30 | 92 | 1.4%
35 | 7 | 0.1%
40 | 18 | 0.3%
45 | 1 | < 0.1%
46 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
1650 | 1 | < 0.1%
1620 | 1 | < 0.1%
1530 | 1 | < 0.1%
1500 | 1 | < 0.1%
1470 | 3 | < 0.1%
1440 | 3 | < 0.1%
1410 | 8 | 0.1%
1380 | 1 | < 0.1%
1350 | 8 | 0.1%
1325 | 1 | < 0.1%
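
For the two columns with missing values, the reported means are consistent with averaging over the non-missing observations only. A brief check, using the sums and missing counts reported above (the dictionary names are illustrative):

```python
n_observations = 6668
n_missing = 121                      # same count for both columns
n_valid = n_observations - n_missing

# Sums copied from the descriptive statistics of each variable.
sums = {"시험실수": 132301, "수용인원수": 3346144}

# Mean over non-missing rows only.
means = {column: total / n_valid for column, total in sums.items()}

for column, value in means.items():
    print(f"{column}: mean over {n_valid} rows = {value:.4f}")
```

Dividing by all 6668 rows instead would give noticeably lower values than the reported 20.207881 and 511.09577.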

Interactions

[Figure: 16 pairwise interaction plots; images not preserved.]

Correlations

Variable | 연도 | 직종 | 회차 | 응시지역 | 시험실수 | 수용인원수
연도 | 1.000 | 0.296 | 0.763 | 0.274 | 0.155 | 0.378
직종 | 0.296 | 1.000 | 0.922 | 0.484 | 0.628 | 0.619
회차 | 0.763 | 0.922 | 1.000 | 0.430 | 0.423 | 0.477
응시지역 | 0.274 | 0.484 | 0.430 | 1.000 | 0.562 | 0.526
시험실수 | 0.155 | 0.628 | 0.423 | 0.562 | 1.000 | 0.950
수용인원수 | 0.378 | 0.619 | 0.477 | 0.526 | 0.950 | 1.000

Variable | 직종 | 응시지역
직종 | 1.000 | 0.114
응시지역 | 0.114 | 1.000

Variable | 연도 | 회차 | 시험실수 | 수용인원수 | 직종 | 응시지역
연도 | 1.000 | 0.447 | 0.045 | -0.255 | 0.108 | 0.099
회차 | 0.447 | 1.000 | -0.002 | -0.147 | 0.640 | 0.166
시험실수 | 0.045 | -0.002 | 1.000 | 0.905 | 0.270 | 0.235
수용인원수 | -0.255 | -0.147 | 0.905 | 1.000 | 0.264 | 0.214
직종 | 0.108 | 0.640 | 0.270 | 0.264 | 1.000 | 0.114
응시지역 | 0.099 | 0.166 | 0.235 | 0.214 | 0.114 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
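
One common way to compute a nullity correlation is the Pearson correlation between per-column presence indicators (1 where a value is present, 0 where it is missing). The sketch below applies that idea to the first ten sample rows of this report; it illustrates the concept and is not necessarily the exact routine the report uses. The helper `pearson` and the list names are illustrative.

```python
from math import sqrt

# First ten sample rows of this report; None marks a missing value (<NA>).
rooms = [7, 22, None, 9, 26, None, None, None, None, None]
capacity = [210, 770, None, 270, 780, None, None, None, None, None]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Presence indicators: 1 if the value is present, 0 if it is missing.
present_rooms = [0 if v is None else 1 for v in rooms]
present_capacity = [0 if v is None else 1 for v in capacity]

nullity_corr = pearson(present_rooms, present_capacity)
print(nullity_corr)
```

In these rows the two columns are missing together, so their presence indicators are identical and the nullity correlation is 1 up to floating-point error.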

Sample

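First rows

Index | 연도 | 직종 | 회차 | 응시지역 | 시험장명 | 시험실수 | 수용인원수
0 | 2011 | 간호조무사 | 2 | 광주광역시 | 상일중학교(오전) | 7 | 210
1 | 2011 | 간호조무사 | 2 | 광주광역시 | 전남중학교 | 22 | 770
2 | 2011 | 간호조무사 | 2 | 구미 | 선주중학교 | <NA> | <NA>
3 | 2011 | 간호조무사 | 2 | 대구광역시 | 동부공업고등학교(대구일마이스터고등학교) | 9 | 270
4 | 2011 | 간호조무사 | 2 | 대전광역시 | 대전노은중학교 | 26 | 780
5 | 2011 | 간호조무사 | 2 | 대전광역시 | 대전만년중학교 | <NA> | <NA>
6 | 2011 | 간호조무사 | 2 | 목포 | 목포항도여자중학교(오전) | <NA> | <NA>
7 | 2011 | 간호조무사 | 2 | 부산광역시 | 동현중학교 | <NA> | <NA>
8 | 2011 | 간호조무사 | 2 | 부산광역시 | 부산진중학교 | <NA> | <NA>
9 | 2011 | 간호조무사 | 2 | 부산광역시 | 연산중학교 | <NA> | <NA>

Last rows

Index | 연도 | 직종 | 회차 | 응시지역 | 시험장명 | 시험실수 | 수용인원수
6658 | 2023 | 의사 | 87 | 대전광역시 | 우송정보대학교 동캠퍼스(국제경영관) | 4 | 100
6659 | 2023 | 의사 | 87 | 부산광역시 | 부산경남 시험센터 | 5 | 220
6660 | 2023 | 의사 | 87 | 부산광역시 | 부산대학교(공과대학) | 6 | 179
6661 | 2023 | 의사 | 87 | 서울특별시 | 경기성남 시험센터 | 6 | 270
6662 | 2023 | 의사 | 87 | 서울특별시 | 명지전문대학 | 13 | 413
6663 | 2023 | 의사 | 87 | 서울특별시 | 서울구로 시험센터 | 6 | 270
6664 | 2023 | 의사 | 87 | 서울특별시 | 서일대학교(배양관) | 22 | 792
6665 | 2023 | 의사 | 87 | 전주 | 전북대학교(정보전산원) | 5 | 161
6666 | 2023 | 의사 | 87 | 전주 | 전북전주 시험센터 | 2 | 90
6667 | 2023 | 의사 | 87 | 제주도 | 제주 시험센터 | 1 | 46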