Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 732.4 KiB
Average record size in memory: 75.0 B
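
The two memory figures are related by simple arithmetic: the average record size is the total in-memory size divided by the number of observations. A quick check, assuming 1 KiB = 1024 bytes (the variable names are illustrative only):

```python
# Relate the reported total memory size to the average record size.
# Assumption: 1 KiB = 1024 bytes.
n_observations = 10_000
total_kib = 732.4

avg_record_bytes = total_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")  # 75.0 B
```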

Variable types

Numeric: 2
Categorical: 5
Text: 1

Dataset

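Description: Provides information for analysing the licence-acquisition status of candidates for the national nursing assistant examination from 2018 to 2021: year (연도), sex (성별), age (나이), school attended (출신학교), department or major (학과(전공)), graduation status (졸업여부), and training institution (이수기관). The data is provided in a form that does not allow individuals to be identified.
Author: 한국보건의료인국가시험원 (Korea Health Personnel Licensing Examination Institute)
URL: https://www.data.go.kr/data/15098753/fileData.do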

Alerts

이수기관 has constant value "간호조무사학원" (Constant)
성별 is highly imbalanced (73.2%) (Imbalance)
졸업여부 is highly imbalanced (95.2%) (Imbalance)
순번 has unique values (Unique)
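
The imbalance percentages are consistent with one minus the normalised Shannon entropy of the class counts. That definition is an assumption here, not something stated in the report; the counts are the ones listed under Common Values for 성별 and 졸업여부. A minimal sketch:

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k), with H the Shannon entropy of the class counts.

    Assumed definition: 0 means perfectly balanced classes, 1 means a single
    class holds every observation.
    """
    k = len(counts)
    if k < 2:
        raise ValueError("need at least two classes")
    total = sum(counts)
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)

print(f"{imbalance_score([9542, 458]):.1%}")    # 73.2%
print(f"{imbalance_score([9908, 91, 1]):.1%}")  # 95.2%
```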

Reproduction

Analysis started: 2023-12-12 18:20:29.281531
Analysis finished: 2023-12-12 18:20:31.143721
Duration: 1.86 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
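
A report of this kind can be regenerated with ydata-profiling. The sketch below is a minimal example and not the exact invocation behind this report: the CSV file name, its encoding, the report title, and the output path are all hypothetical.

```python
def build_report(csv_path, out_path="report.html", encoding="utf-8"):
    """Profile a CSV file and write an HTML report (minimal sketch)."""
    # Imported lazily so the function can be defined without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Profiling report")  # title is a placeholder
    report.to_file(out_path)
    return out_path

# Hypothetical usage; the file name and encoding are assumptions, not the real download:
# build_report("nursing_assistant_exam_2018_2021.csv", encoding="cp949")
```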

Variables

순번
Real number (ℝ)

UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 47596.353
Minimum: 10
Maximum: 95292
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10
5th percentile: 4571.2
Q1: 23306.75
Median: 47916.5
Q3: 71623.25
95th percentile: 90684.8
Maximum: 95292
Range: 95282
Interquartile range (IQR): 48316.5

Descriptive statistics

Standard deviation: 27716.417
Coefficient of variation (CV): 0.58232228
Kurtosis: -1.2183219
Mean: 47596.353
Median Absolute Deviation (MAD): 24155
Skewness: -0.0066198523
Sum: 4.7596353 × 10^8
Variance: 7.6819975 × 10^8
Monotonicity: Not monotonic
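
Several of the figures above are derived from others: the range is maximum minus minimum, the IQR is Q3 minus Q1, and the coefficient of variation is the standard deviation divided by the mean. A short check using the values reported for 순번 (the variable names are illustrative):

```python
# Summary statistics reported for 순번.
minimum, maximum = 10, 95292
q1, q3 = 23306.75, 71623.25
mean, std = 47596.353, 27716.417

value_range = maximum - minimum  # reported Range: 95282
iqr = q3 - q1                    # reported IQR: 48316.5
cv = std / mean                  # reported CV: 0.58232228

print(value_range, iqr, round(cv, 4))
```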
[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
60538 | 1 | < 0.1%
68182 | 1 | < 0.1%
78694 | 1 | < 0.1%
88225 | 1 | < 0.1%
45638 | 1 | < 0.1%
7897 | 1 | < 0.1%
62925 | 1 | < 0.1%
77564 | 1 | < 0.1%
18957 | 1 | < 0.1%
18343 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Minimum 10 values
Value | Count | Frequency (%)
10 | 1 | < 0.1%
30 | 1 | < 0.1%
37 | 1 | < 0.1%
43 | 1 | < 0.1%
44 | 1 | < 0.1%
80 | 1 | < 0.1%
87 | 1 | < 0.1%
95 | 1 | < 0.1%
99 | 1 | < 0.1%
101 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
95292 | 1 | < 0.1%
95282 | 1 | < 0.1%
95281 | 1 | < 0.1%
95270 | 1 | < 0.1%
95269 | 1 | < 0.1%
95254 | 1 | < 0.1%
95243 | 1 | < 0.1%
95238 | 1 | < 0.1%
95218 | 1 | < 0.1%
95196 | 1 | < 0.1%

연도
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
2018 | 2811
2021 | 2405
2019 | 2404
2020 | 2380

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2019
2nd row: 2018
3rd row: 2019
4th row: 2018
5th row: 2018

Common Values

Value | Count | Frequency (%)
2018 | 2811 | 28.1%
2021 | 2405 | 24.1%
2019 | 2404 | 24.0%
2020 | 2380 | 23.8%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of common values]

Words

Value | Count | Frequency (%)
2018 | 2811 | 28.1%
2021 | 2405 | 24.1%
2019 | 2404 | 24.0%
2020 | 2380 | 23.8%

성별
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
(not preserved) | 9542
(not preserved) | 458

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

Value | Count | Frequency (%)
(not preserved) | 9542 | 95.4%
(not preserved) | 458 | 4.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of common values]

Words

Value | Count | Frequency (%)
(not preserved) | 9542 | 95.4%
(not preserved) | 458 | 4.6%

나이
Real number (ℝ)

Distinct: 55
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 36.755
Minimum: 18
Maximum: 76
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 18
5th percentile: 21
Q1: 26
Median: 36
Q3: 47
95th percentile: 55
Maximum: 76
Range: 58
Interquartile range (IQR): 21

Descriptive statistics

Standard deviation: 11.567176
Coefficient of variation (CV): 0.31471026
Kurtosis: -1.2322375
Mean: 36.755
Median Absolute Deviation (MAD): 11
Skewness: 0.20615725
Sum: 367550
Variance: 133.79955
Monotonicity: Not monotonic
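
The Sum and Variance rows can be cross-checked against the other figures: the sum is the mean times the number of observations, and the variance is the square of the standard deviation. A brief check with the values reported for 나이 (small differences come from rounding in the displayed numbers):

```python
# Figures reported for 나이.
n = 10_000
mean, std = 36.755, 11.567176

total = mean * n     # reported Sum: 367550
variance = std ** 2  # reported Variance: 133.79955

print(round(total), round(variance, 3))
```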
[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
23 | 479 | 4.8%
24 | 477 | 4.8%
21 | 437 | 4.4%
25 | 436 | 4.4%
26 | 409 | 4.1%
22 | 391 | 3.9%
27 | 364 | 3.6%
28 | 317 | 3.2%
48 | 310 | 3.1%
49 | 309 | 3.1%
Other values (45) | 6071 | 60.7%

Minimum 10 values
Value | Count | Frequency (%)
18 | 2 | < 0.1%
19 | 20 | 0.2%
20 | 177 | 1.8%
21 | 437 | 4.4%
22 | 391 | 3.9%
23 | 479 | 4.8%
24 | 477 | 4.8%
25 | 436 | 4.4%
26 | 409 | 4.1%
27 | 364 | 3.6%

Maximum 10 values
Value | Count | Frequency (%)
76 | 1 | < 0.1%
71 | 1 | < 0.1%
70 | 1 | < 0.1%
69 | 2 | < 0.1%
68 | 2 | < 0.1%
67 | 7 | 0.1%
66 | 3 | < 0.1%
65 | 4 | < 0.1%
64 | 8 | 0.1%
63 | 17 | 0.2%

출신학교
Categorical

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
고등학교 | 5273
(대)학교 또는 대학원 | 3447
고등학교 또는 동등학력 | 1241
기타 | 26
외국대 | 7

Length

Max length: 12
Median length: 4
Mean length: 7.7433
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 고등학교
2nd row: 고등학교
3rd row: 고등학교
4th row: 고등학교 또는 동등학력
5th row: 고등학교 또는 동등학력

Common Values

Value | Count | Frequency (%)
고등학교 | 5273 | 52.7%
(대)학교 또는 대학원 | 3447 | 34.5%
고등학교 또는 동등학력 | 1241 | 12.4%
기타 | 26 | 0.3%
외국대 | 7 | 0.1%
학원 | 6 | 0.1%
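
The length statistics for 출신학교 can be reproduced from these counts, assuming length means the number of characters in each category label (including spaces and parentheses). A small sketch:

```python
# Category counts from the Common Values table of 출신학교.
counts = {
    "고등학교": 5273,
    "(대)학교 또는 대학원": 3447,
    "고등학교 또는 동등학력": 1241,
    "기타": 26,
    "외국대": 7,
    "학원": 6,
}

n = sum(counts.values())
lengths = [len(value) for value in counts]
# Weighted mean of label lengths, weighted by how often each label occurs.
mean_length = sum(len(value) * count for value, count in counts.items()) / n

print(n, max(lengths), min(lengths), mean_length)
```

Under that assumption the result matches the reported Max length (12), Min length (2), and Mean length (7.7433).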

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of common values]

Words

Value | Count | Frequency (%)
고등학교 | 6514 | 33.6%
또는 | 4688 | 24.2%
대)학교 | 3447 | 17.8%
대학원 | 3447 | 17.8%
동등학력 | 1241 | 6.4%
기타 | 26 | 0.1%
외국대 | 7 | < 0.1%
학원 | 6 | < 0.1%
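
These token counts can be reproduced from the category counts by splitting each label on whitespace and stripping parentheses from the ends of each token. The tokenizer actually used by the tool is not shown in this report, so the rule below is an assumption that happens to match the figures:

```python
from collections import Counter

# Category counts from the Common Values table of 출신학교.
counts = {
    "고등학교": 5273,
    "(대)학교 또는 대학원": 3447,
    "고등학교 또는 동등학력": 1241,
    "기타": 26,
    "외국대": 7,
    "학원": 6,
}

word_counts = Counter()
for value, count in counts.items():
    for token in value.split():
        # Assumed rule: drop parentheses at the start or end of a token only.
        word_counts[token.strip("()")] += count

total = sum(word_counts.values())
for word, count in word_counts.most_common():
    print(word, count, f"{count / total:.1%}")
```

The same rule would explain why "(대)학교" appears as "대)학교": only the leading parenthesis sits at the edge of the token.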

학과(전공)
Text

Distinct: 214
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 24
Median length: 2
Mean length: 2.5202
Min length: 2

Characters and Unicode

Total characters: 25202
Distinct characters: 152
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 101
Unique (%): 1.0%

Sample

1st row: 기타
2nd row: 기타
3rd row: 기타
4th row: 기타
5th row: 기타

Value | Count | Frequency (%)
기타 | 8974 | 89.7%
사회복지학과 | 146 | 1.5%
유아교육과 | 65 | 0.7%
미용학과 | 45 | 0.4%
식품)영양(학)과 | 26 | 0.3%
법학과,영어학과 | 25 | 0.2%
행정학과 | 24 | 0.2%
호텔관광경영학과 | 22 | 0.2%
호텔)경영학과 | 22 | 0.2%
의무행정전공 | 19 | 0.2%
Other values (204) | 632 | 6.3%

Most occurring characters

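Value | Count | Frequency (%)
(not preserved) | 8984 | 35.6%
(not preserved) | 8974 | 35.6%
(not preserved) | 986 | 3.9%
(not preserved) | 817 | 3.2%
) | 380 | 1.5%
( | 380 | 1.5%
(not preserved) | 195 | 0.8%
(not preserved) | 166 | 0.7%
(not preserved) | 164 | 0.7%
(not preserved) | 163 | 0.6%
Other values (142) | 3993 | 15.8%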

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 24323 | 96.5%
Close Punctuation | 380 | 1.5%
Open Punctuation | 380 | 1.5%
Other Punctuation | 119 | 0.5%

Most frequent character per category

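Other Letter
Value | Count | Frequency (%)
(not preserved) | 8984 | 36.9%
(not preserved) | 8974 | 36.9%
(not preserved) | 986 | 4.1%
(not preserved) | 817 | 3.4%
(not preserved) | 195 | 0.8%
(not preserved) | 166 | 0.7%
(not preserved) | 164 | 0.7%
(not preserved) | 163 | 0.7%
(not preserved) | 162 | 0.7%
(not preserved) | 161 | 0.7%
Other values (137) | 3551 | 14.6%

Other Punctuation
Value | Count | Frequency (%)
, | 111 | 93.3%
· | 4 | 3.4%
. | 4 | 3.4%

Close Punctuation
Value | Count | Frequency (%)
) | 380 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 380 | 100.0%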

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 24322 | 96.5%
Common | 879 | 3.5%
Han | 1 | < 0.1%

Most frequent character per script

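Hangul
Value | Count | Frequency (%)
(not preserved) | 8984 | 36.9%
(not preserved) | 8974 | 36.9%
(not preserved) | 986 | 4.1%
(not preserved) | 817 | 3.4%
(not preserved) | 195 | 0.8%
(not preserved) | 166 | 0.7%
(not preserved) | 164 | 0.7%
(not preserved) | 163 | 0.7%
(not preserved) | 162 | 0.7%
(not preserved) | 161 | 0.7%
Other values (136) | 3550 | 14.6%

Common
Value | Count | Frequency (%)
) | 380 | 43.2%
( | 380 | 43.2%
, | 111 | 12.6%
· | 4 | 0.5%
. | 4 | 0.5%

Han
Value | Count | Frequency (%)
(not preserved) | 1 | 100.0%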

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 24322 | 96.5%
ASCII | 875 | 3.5%
None | 4 | < 0.1%
CJK Compat Ideographs | 1 | < 0.1%

Most frequent character per block

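Hangul
Value | Count | Frequency (%)
(not preserved) | 8984 | 36.9%
(not preserved) | 8974 | 36.9%
(not preserved) | 986 | 4.1%
(not preserved) | 817 | 3.4%
(not preserved) | 195 | 0.8%
(not preserved) | 166 | 0.7%
(not preserved) | 164 | 0.7%
(not preserved) | 163 | 0.7%
(not preserved) | 162 | 0.7%
(not preserved) | 161 | 0.7%
Other values (136) | 3550 | 14.6%

ASCII
Value | Count | Frequency (%)
) | 380 | 43.4%
( | 380 | 43.4%
, | 111 | 12.7%
. | 4 | 0.5%

None
Value | Count | Frequency (%)
· | 4 | 100.0%

CJK Compat Ideographs
Value | Count | Frequency (%)
(not preserved) | 1 | 100.0%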

졸업여부
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
졸업(수습) | 9908
졸업(수습)예정 | 91
<NA> | 1

Length

Max length: 8
Median length: 6
Mean length: 6.018
Min length: 4

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 졸업(수습)
2nd row: 졸업(수습)
3rd row: 졸업(수습)
4th row: 졸업(수습)
5th row: 졸업(수습)

Common Values

Value | Count | Frequency (%)
졸업(수습) | 9908 | 99.1%
졸업(수습)예정 | 91 | 0.9%
<NA> | 1 | < 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of common values]

Words

Value | Count | Frequency (%)
졸업(수습 | 9908 | 99.1%
졸업(수습)예정 | 91 | 0.9%
na | 1 | < 0.1%

이수기관
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
간호조무사학원 | 10000

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 간호조무사학원
2nd row: 간호조무사학원
3rd row: 간호조무사학원
4th row: 간호조무사학원
5th row: 간호조무사학원

Common Values

Value | Count | Frequency (%)
간호조무사학원 | 10000 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of common values]

Words

Value | Count | Frequency (%)
간호조무사학원 | 10000 | 100.0%

Interactions

[Figures: four interaction plots]

Correlations

[Figure: correlation heatmap]
Variable | 순번 | 연도 | 성별 | 나이 | 출신학교 | 졸업여부
순번 | 1.000 | 0.060 | 0.000 | 0.135 | 0.116 | 0.059
연도 | 0.060 | 1.000 | 0.028 | 0.085 | 0.517 | 0.068
성별 | 0.000 | 0.028 | 1.000 | 0.177 | 0.000 | 0.030
나이 | 0.135 | 0.085 | 0.177 | 1.000 | 0.211 | 0.227
출신학교 | 0.116 | 0.517 | 0.000 | 0.211 | 1.000 | 0.115
졸업여부 | 0.059 | 0.068 | 0.030 | 0.227 | 0.115 | 1.000

[Figure: correlation heatmap]
Variable | 졸업여부 | 출신학교 | 성별 | 연도
졸업여부 | 1.000 | 0.083 | 0.019 | 0.045
출신학교 | 0.083 | 1.000 | 0.000 | 0.359
성별 | 0.019 | 0.000 | 1.000 | 0.018
연도 | 0.045 | 0.359 | 0.018 | 1.000

[Figure: correlation heatmap]
Variable | 순번 | 나이 | 연도 | 성별 | 출신학교 | 졸업여부
순번 | 1.000 | -0.024 | 0.036 | 0.000 | 0.061 | 0.045
나이 | -0.024 | 1.000 | 0.051 | 0.136 | 0.112 | 0.174
연도 | 0.036 | 0.051 | 1.000 | 0.018 | 0.359 | 0.045
성별 | 0.000 | 0.136 | 0.018 | 1.000 | 0.000 | 0.019
출신학교 | 0.061 | 0.112 | 0.359 | 0.000 | 1.000 | 0.083
졸업여부 | 0.045 | 0.174 | 0.045 | 0.019 | 0.083 | 1.000

Missing values

[Figure: nullity count by column] A simple visualization of nullity by column.
[Figure: nullity matrix] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

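Index | 순번 | 연도 | 성별 | 나이 | 출신학교 | 학과(전공) | 졸업여부 | 이수기관
60537 | 60538 | 2019 | (not preserved) | 21 | 고등학교 | 기타 | 졸업(수습) | 간호조무사학원
84147 | 84148 | 2018 | (not preserved) | 51 | 고등학교 | 기타 | 졸업(수습) | 간호조무사학원
69807 | 69808 | 2019 | (not preserved) | 29 | 고등학교 | 기타 | 졸업(수습) | 간호조무사학원
47883 | 47884 | 2018 | (not preserved) | 21 | 고등학교 또는 동등학력 | 기타 | 졸업(수습) | 간호조무사학원
48242 | 48243 | 2018 | (not preserved) | 39 | 고등학교 또는 동등학력 | 기타 | 졸업(수습) | 간호조무사학원
17189 | 17190 | 2019 | (not preserved) | 25 | (대)학교 또는 대학원 | 기타 | 졸업(수습) | 간호조무사학원
40656 | 40657 | 2019 | (not preserved) | 47 | 고등학교 | 기타 | 졸업(수습) | 간호조무사학원
28879 | 28880 | 2021 | (not preserved) | 33 | 고등학교 | 기타 | 졸업(수습) | 간호조무사학원
13900 | 13901 | 2020 | (not preserved) | 28 | 고등학교 | 기타 | 졸업(수습) | 간호조무사학원
29285 | 29286 | 2018 | (not preserved) | 21 | 고등학교 또는 동등학력 | 기타 | 졸업(수습) | 간호조무사학원

Index | 순번 | 연도 | 성별 | 나이 | 출신학교 | 학과(전공) | 졸업여부 | 이수기관
6256 | 6257 | 2021 | (not preserved) | 55 | 고등학교 | 기타 | 졸업(수습) | 간호조무사학원
3367 | 3368 | 2020 | (not preserved) | 44 | 고등학교 | 기타 | 졸업(수습) | 간호조무사학원
37429 | 37430 | 2020 | (not preserved) | 26 | (대)학교 또는 대학원 | 기타 | 졸업(수습) | 간호조무사학원
39243 | 39244 | 2019 | (not preserved) | 47 | (대)학교 또는 대학원 | 기타 | 졸업(수습) | 간호조무사학원
71029 | 71030 | 2019 | (not preserved) | 43 | 고등학교 | 기타 | 졸업(수습) | 간호조무사학원
10711 | 10712 | 2018 | (not preserved) | 22 | 고등학교 또는 동등학력 | 기타 | 졸업(수습) | 간호조무사학원
5462 | 5463 | 2021 | (not preserved) | 31 | (대)학교 또는 대학원 | 기타 | 졸업(수습) | 간호조무사학원
42712 | 42713 | 2020 | (not preserved) | 23 | (대)학교 또는 대학원 | 기타 | 졸업(수습) | 간호조무사학원
41905 | 41906 | 2020 | (not preserved) | 31 | 고등학교 | 기타 | 졸업(수습) | 간호조무사학원
14184 | 14185 | 2018 | (not preserved) | 25 | 고등학교 | 기타 | 졸업(수습) | 간호조무사학원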