Overview

Dataset statistics

Number of variables: 8
Number of observations: 54
Missing cells: 16
Missing cells (%): 3.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.7 KiB
Average record size in memory: 69.4 B
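The percentages and sizes above can be rechecked from the raw counts. The following is a minimal sketch that uses only figures copied from this report; the variable names are illustrative. It assumes that all 16 missing cells sit in the SORT_SEQ column, as the Alerts section states.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 8
n_observations = 54
missing_cells = 16
avg_record_bytes = 69.4  # rounded in the report, so results are approximate

# Share of missing cells over the whole table (8 x 54 = 432 cells).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# All missing cells are reported for SORT_SEQ, so the per-column rate
# divides by the number of observations instead of the number of cells.
sort_seq_missing_pct = 100 * missing_cells / n_observations

# Total memory estimated from the average record size, in KiB.
total_kib = avg_record_bytes * n_observations / 1024

print(f"Missing cells: {missing_pct:.1f}%")
print(f"SORT_SEQ missing: {sort_seq_missing_pct:.1f}%")
print(f"Total size: {total_kib:.1f} KiB")
```

The two missing-value percentages differ only in their denominator: 432 cells for the dataset-level figure, 54 rows for the column-level figure.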

Variable types

Numeric: 2
Text: 3
Categorical: 3

Dataset

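Description: RA exam subjects (exam course group code, exam course group name, exam course type (practical/written), group classification code, exam group grade)
Author: 한국의료기기안전정보원 (National Institute of Medical Device Safety Information)
URL: https://www.data.go.kr/data/15065991/fileData.do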

Alerts

GROUP_SUBJECT is highly overall correlated with GROUP_GRADE (High correlation)
GROUP_GRADE is highly overall correlated with SORT_SEQ and 1 other field (High correlation)
SORT_SEQ is highly overall correlated with GROUP_GBN and 1 other field (High correlation)
GROUP_GBN is highly overall correlated with SORT_SEQ (High correlation)
SORT_SEQ has 16 (29.6%) missing values (Missing)
GROUP_CODE has unique values (Unique)
GROUP_NAME has unique values (Unique)
IN_DTIME has unique values (Unique)
UP_DTIME has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 13:41:33.118763
Analysis finished: 2023-12-12 13:41:34.150736
Duration: 1.03 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

GROUP_CODE
Real number (ℝ)

UNIQUE 

Distinct: 54
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1796.9444
Minimum: 1402
Maximum: 2081
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 618.0 B

Quantile statistics

Minimum: 1402
5-th percentile: 1507.85
Q1: 1653.5
Median: 1806.5
Q3: 1948.5
95-th percentile: 2054.5
Maximum: 2081
Range: 679
Interquartile range (IQR): 295

Descriptive statistics

Standard deviation: 180.02237
Coefficient of variation (CV): 0.10018249
Kurtosis: -0.98040901
Mean: 1796.9444
Median Absolute Deviation (MAD): 150
Skewness: -0.20718798
Sum: 97035
Variance: 32408.053
Monotonicity: Strictly increasing

Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
1402               1      1.9%
1961               1      1.9%
1841               1      1.9%
1851               1      1.9%
1861               1      1.9%
1871               1      1.9%
1902               1      1.9%
1903               1      1.9%
1911               1      1.9%
1921               1      1.9%
Other values (44)  44     81.5%

Value  Count  Frequency (%)
1402   1      1.9%
1501   1      1.9%
1502   1      1.9%
1511   1      1.9%
1521   1      1.9%
1531   1      1.9%
1551   1      1.9%
1601   1      1.9%
1602   1      1.9%
1611   1      1.9%

Value  Count  Frequency (%)
2081   1      1.9%
2071   1      1.9%
2061   1      1.9%
2051   1      1.9%
2041   1      1.9%
2031   1      1.9%
2021   1      1.9%
2011   1      1.9%
2003   1      1.9%
2002   1      1.9%
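
Several of the GROUP_CODE statistics are derived from one another, so they can be cross-checked. The sketch below recomputes the range, IQR, coefficient of variation, variance and mean from the other reported figures. It assumes the usual definitions (CV = standard deviation / mean, variance = standard deviation squared); the report itself does not spell these out.

```python
# Values reported for GROUP_CODE in this report.
minimum, maximum = 1402, 2081
q1, q3 = 1653.5, 1948.5
mean, std = 1796.9444, 180.02237
total, n = 97035, 54

value_range = maximum - minimum   # reported as 679
iqr = q3 - q1                     # reported as 295
cv = std / mean                   # reported as 0.10018249
variance = std ** 2               # reported as 32408.053
recomputed_mean = total / n       # reported as 1796.9444

print(value_range, iqr)
print(f"{cv:.6f} {variance:.2f} {recomputed_mean:.4f}")
```

Because the inputs are already rounded in the report, the recomputed values agree with the reported ones only up to rounding.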

GROUP_NAME
Text

UNIQUE 

Distinct: 54
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 564.0 B

Length

Max length: 24
Median length: 18
Mean length: 14.740741
Min length: 11

Characters and Unicode

Total characters: 796
Distinct characters: 42
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
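
As an illustration of those character properties, the sketch below tallies the Unicode general category of each character in one GROUP_NAME sample value. It uses Python's standard `unicodedata` module, which is an assumption made for this example and may not be the library the profiler uses internally. The helper name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter

# Long names for the two-letter general category codes seen in this column.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Zs": "Space Separator",
    "Lu": "Uppercase Letter",
}

def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

# Fourth sample row of GROUP_NAME, copied from this report.
sample = "[품질관리(GMP)] 필기(15)"
counts = category_counts(sample)

for code, count in sorted(counts.items()):
    print(f"{CATEGORY_NAMES.get(code, code)}: {count}")
```

Hangul syllables fall under Other Letter, which is why that category dominates the column-level table even though the names also contain digits, brackets and Latin capitals.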

Unique

Unique: 54
Unique (%): 100.0%

Sample

1st row: 2014년 의료기기 RA 전문가 2급
2nd row: [인허가] 필기(15)
3rd row: 2015년 의료기기 RA 전문가 2급
4th row: [품질관리(GMP)] 필기(15)
5th row: [임상] 필기(15)

Value              Count  Frequency (%)
임상               11     8.1%
인허가             11     8.1%
품질관리(gmp       11     8.1%
해외인증           11     8.1%
ra                 9      6.7%
전문가             9      6.7%
의료기기           9      6.7%
2급                7      5.2%
필기(19            4      3.0%
실기(17            4      3.0%
Other values (19)  49     36.3%

Most occurring characters

ValueCountFrequency (%)
83
 
10.4%
63
 
7.9%
) 58
 
7.3%
( 58
 
7.3%
[ 45
 
5.7%
] 45
 
5.7%
1 44
 
5.5%
2 28
 
3.5%
24
 
3.0%
22
 
2.8%
Other values (32) 326
41.0%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       318    39.9%
Decimal Number     135    17.0%
Close Punctuation  103    12.9%
Open Punctuation   103    12.9%
Space Separator    83     10.4%
Uppercase Letter   54     6.8%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
63
19.8%
24
 
7.5%
22
 
6.9%
21
 
6.6%
20
 
6.3%
11
 
3.5%
11
 
3.5%
11
 
3.5%
11
 
3.5%
11
 
3.5%
Other values (13) 113
35.5%
Decimal Number
ValueCountFrequency (%)
1 44
32.6%
2 28
20.7%
0 19
14.1%
9 10
 
7.4%
7 9
 
6.7%
6 9
 
6.7%
8 9
 
6.7%
5 6
 
4.4%
4 1
 
0.7%
Uppercase Letter
ValueCountFrequency (%)
G 12
22.2%
M 12
22.2%
P 12
22.2%
A 9
16.7%
R 9
16.7%
Close Punctuation
ValueCountFrequency (%)
) 58
56.3%
] 45
43.7%
Open Punctuation
ValueCountFrequency (%)
( 58
56.3%
[ 45
43.7%
Space Separator
ValueCountFrequency (%)
83
100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  424    53.3%
Hangul  318    39.9%
Latin   54     6.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
63
19.8%
24
 
7.5%
22
 
6.9%
21
 
6.6%
20
 
6.3%
11
 
3.5%
11
 
3.5%
11
 
3.5%
11
 
3.5%
11
 
3.5%
Other values (13) 113
35.5%
Common
ValueCountFrequency (%)
83
19.6%
) 58
13.7%
( 58
13.7%
[ 45
10.6%
] 45
10.6%
1 44
10.4%
2 28
 
6.6%
0 19
 
4.5%
9 10
 
2.4%
7 9
 
2.1%
Other values (4) 25
 
5.9%
Latin
ValueCountFrequency (%)
G 12
22.2%
M 12
22.2%
P 12
22.2%
A 9
16.7%
R 9
16.7%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   478    60.1%
Hangul  318    39.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
83
17.4%
) 58
12.1%
( 58
12.1%
[ 45
9.4%
] 45
9.4%
1 44
9.2%
2 28
 
5.9%
0 19
 
4.0%
G 12
 
2.5%
M 12
 
2.5%
Other values (9) 74
15.5%
Hangul
ValueCountFrequency (%)
63
19.8%
24
 
7.5%
22
 
6.9%
21
 
6.6%
20
 
6.3%
11
 
3.5%
11
 
3.5%
11
 
3.5%
11
 
3.5%
11
 
3.5%
Other values (13) 113
35.5%

GROUP_GBN
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 3.7%
Missing: 0
Missing (%): 0.0%
Memory size: 564.0 B
필기 (written): 33
실기 (practical): 21

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 필기
2nd row: 필기
3rd row: 필기
4th row: 필기
5th row: 필기

Common Values

Value  Count  Frequency (%)
필기   33     61.1%
실기   21     38.9%

Length

Histogram of lengths of the category

GROUP_SUBJECT
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 9.3%
Missing: 0
Missing (%): 0.0%
Memory size: 564.0 B
4: 12
1: 11
3: 11
2: 11
R: 9

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: R
2nd row: 1
3rd row: R
4th row: 4
5th row: 3

Common Values

Value  Count  Frequency (%)
4      12     22.2%
1      11     20.4%
3      11     20.4%
2      11     20.4%
R      9      16.7%

Length

Histogram of lengths of the category

GROUP_GRADE
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 3.7%
Missing: 0
Missing (%): 0.0%
Memory size: 564.0 B
1: 45
2: 9
Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 1
3rd row: 2
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      45     83.3%
2      9      16.7%

Length

Histogram of lengths of the category

SORT_SEQ
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 33
Distinct (%): 86.8%
Missing: 16
Missing (%): 29.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.842105
Minimum: 1
Maximum: 99
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 618.0 B

Quantile statistics

Minimum: 1
5-th percentile: 2.85
Q1: 7
Median: 14.5
Q3: 24.75
95-th percentile: 33.15
Maximum: 99
Range: 98
Interquartile range (IQR): 17.75

Descriptive statistics

Standard deviation: 16.771157
Coefficient of variation (CV): 0.93997633
Kurtosis: 14.43375
Mean: 17.842105
Median Absolute Deviation (MAD): 8.5
Skewness: 3.1508633
Sum: 678
Variance: 281.27169
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=33)
Value              Count  Frequency (%)
4                  2      3.7%
5                  2      3.7%
6                  2      3.7%
7                  2      3.7%
9                  2      3.7%
29                 1      1.9%
24                 1      1.9%
25                 1      1.9%
26                 1      1.9%
27                 1      1.9%
Other values (23)  23     42.6%
(Missing)          16     29.6%

Value  Count  Frequency (%)
1      1      1.9%
2      1      1.9%
3      1      1.9%
4      2      3.7%
5      2      3.7%
6      2      3.7%
7      2      3.7%
8      1      1.9%
9      2      3.7%
10     1      1.9%

Value  Count  Frequency (%)
99     1      1.9%
34     1      1.9%
33     1      1.9%
32     1      1.9%
31     1      1.9%
29     1      1.9%
28     1      1.9%
27     1      1.9%
26     1      1.9%
25     1      1.9%
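
The SORT_SEQ figures mix two denominators: Distinct (%) and the mean are consistent with the 38 non-missing values, while the frequency tables divide by all 54 rows. The sketch below rechecks this from the reported counts. It then applies Tukey's fence (Q3 + 1.5 × IQR) to the maximum of 99. That rule is added here for illustration and is not something the report itself computes.

```python
# Figures copied from the SORT_SEQ section of this report.
n_observations = 54
n_missing = 16
n_distinct = 33
total = 678
q1, q3, maximum = 7, 24.75, 99

# Distinct (%) and the mean are computed over non-missing values only.
n_present = n_observations - n_missing
distinct_pct = 100 * n_distinct / n_present   # reported as 86.8%
mean = total / n_present                      # reported as 17.842105

# Frequency tables use all rows: a value seen twice is 2 / 54 of the column.
freq_pct_of_two = 100 * 2 / n_observations    # reported as 3.7%

# Tukey's rule (an added check): values above Q3 + 1.5 * IQR are flagged.
iqr = q3 - q1
upper_fence = q3 + 1.5 * iqr
is_max_outlier = maximum > upper_fence

print(f"{distinct_pct:.1f}% distinct, mean {mean:.6f}")
print(f"upper fence {upper_fence}, max flagged: {is_max_outlier}")
```

Under this rule the single value 99 lies well above the fence, which is consistent with the high skewness and kurtosis reported for the column. Whether 99 is a sentinel or a genuine sort position cannot be determined from the report alone.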

IN_DTIME
Text

UNIQUE 

Distinct: 54
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 564.0 B

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 378
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 54
Unique (%): 100.0%

Sample

1st row: 11:01.6
2nd row: 49:38.0
3rd row: 44:14.6
4th row: 42:07.8
5th row: 42:45.7

Value              Count  Frequency (%)
11:01.6            1      1.9%
59:07.0            1      1.9%
25:12.0            1      1.9%
19:18.0            1      1.9%
43:32.0            1      1.9%
43:55.0            1      1.9%
44:25.0            1      1.9%
45:25.0            1      1.9%
51:35.0            1      1.9%
59:21.0            1      1.9%
Other values (44)  44     81.5%

Most occurring characters

ValueCountFrequency (%)
0 57
15.1%
: 54
14.3%
. 54
14.3%
5 42
11.1%
2 38
10.1%
3 31
8.2%
4 30
7.9%
1 28
7.4%
9 14
 
3.7%
8 12
 
3.2%
Other values (2) 18
 
4.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 270
71.4%
Other Punctuation 108
 
28.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 57
21.1%
5 42
15.6%
2 38
14.1%
3 31
11.5%
4 30
11.1%
1 28
10.4%
9 14
 
5.2%
8 12
 
4.4%
7 10
 
3.7%
6 8
 
3.0%
Other Punctuation
ValueCountFrequency (%)
: 54
50.0%
. 54
50.0%

Most occurring scripts

ValueCountFrequency (%)
Common 378
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 57
15.1%
: 54
14.3%
. 54
14.3%
5 42
11.1%
2 38
10.1%
3 31
8.2%
4 30
7.9%
1 28
7.4%
9 14
 
3.7%
8 12
 
3.2%
Other values (2) 18
 
4.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 378
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 57
15.1%
: 54
14.3%
. 54
14.3%
5 42
11.1%
2 38
10.1%
3 31
8.2%
4 30
7.9%
1 28
7.4%
9 14
 
3.7%
8 12
 
3.2%
Other values (2) 18
 
4.8%

UP_DTIME
Text

UNIQUE 

Distinct: 54
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 564.0 B

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 378
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 54
Unique (%): 100.0%

Sample

1st row: 44:24.0
2nd row: 52:02.0
3rd row: 02:08.5
4th row: 42:59.0
5th row: 43:05.0

Value              Count  Frequency (%)
44:24.0            1      1.9%
59:07.0            1      1.9%
25:12.0            1      1.9%
19:18.0            1      1.9%
43:32.0            1      1.9%
43:55.0            1      1.9%
44:25.0            1      1.9%
45:25.0            1      1.9%
45:11.0            1      1.9%
45:20.0            1      1.9%
Other values (44)  44     81.5%

Most occurring characters

ValueCountFrequency (%)
0 71
18.8%
: 54
14.3%
. 54
14.3%
5 40
10.6%
4 39
10.3%
2 36
9.5%
1 34
9.0%
3 18
 
4.8%
7 9
 
2.4%
8 9
 
2.4%
Other values (2) 14
 
3.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 270
71.4%
Other Punctuation 108
 
28.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 71
26.3%
5 40
14.8%
4 39
14.4%
2 36
13.3%
1 34
12.6%
3 18
 
6.7%
7 9
 
3.3%
8 9
 
3.3%
9 8
 
3.0%
6 6
 
2.2%
Other Punctuation
ValueCountFrequency (%)
: 54
50.0%
. 54
50.0%

Most occurring scripts

ValueCountFrequency (%)
Common 378
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 71
18.8%
: 54
14.3%
. 54
14.3%
5 40
10.6%
4 39
10.3%
2 36
9.5%
1 34
9.0%
3 18
 
4.8%
7 9
 
2.4%
8 9
 
2.4%
Other values (2) 14
 
3.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 378
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 71
18.8%
: 54
14.3%
. 54
14.3%
5 40
10.6%
4 39
10.3%
2 36
9.5%
1 34
9.0%
3 18
 
4.8%
7 9
 
2.4%
8 9
 
2.4%
Other values (2) 14
 
3.7%
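
IN_DTIME and UP_DTIME are typed as Text, and every value is exactly 7 characters long in what looks like a minutes:seconds.tenths pattern. The sketch below parses such strings into `timedelta` objects. It is an assumption that the values are truncated timestamps of this form; the report does not say what the columns encode, and the date and hour parts are not recoverable from it. The function name is hypothetical.

```python
from datetime import timedelta

def parse_minutes_seconds(value: str) -> timedelta:
    """Parse an 'MM:SS.f' string into a timedelta (assumed format)."""
    minutes, rest = value.split(":")
    seconds, tenths = rest.split(".")
    return timedelta(
        minutes=int(minutes),
        seconds=int(seconds),
        milliseconds=100 * int(tenths),
    )

# First sample rows of IN_DTIME and UP_DTIME, copied from this report.
in_dtime = parse_minutes_seconds("11:01.6")
up_dtime = parse_minutes_seconds("44:24.0")

print(in_dtime.total_seconds())
print(up_dtime.total_seconds())
```

Since only the minute and second survive, differences between the two columns are not meaningful as durations unless both timestamps fall within the same hour.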

Interactions

(4 interaction plots)

Correlations

Variable       GROUP_CODE  GROUP_NAME  GROUP_GBN  GROUP_SUBJECT  GROUP_GRADE  SORT_SEQ  IN_DTIME  UP_DTIME
GROUP_CODE     1.000       1.000       0.335      0.000          0.000        0.743     1.000     1.000
GROUP_NAME     1.000       1.000       1.000      1.000          1.000        1.000     1.000     1.000
GROUP_GBN      0.335       1.000       1.000      0.199          0.421        0.448     1.000     1.000
GROUP_SUBJECT  0.000       1.000       0.199      1.000          1.000        0.572     1.000     1.000
GROUP_GRADE    0.000       1.000       0.421      1.000          1.000        0.531     1.000     1.000
SORT_SEQ       0.743       1.000       0.448      0.572          0.531        1.000     1.000     1.000
IN_DTIME       1.000       1.000       1.000      1.000          1.000        1.000     1.000     1.000
UP_DTIME       1.000       1.000       1.000      1.000          1.000        1.000     1.000     1.000

Variable       GROUP_GBN  GROUP_SUBJECT  GROUP_GRADE
GROUP_GBN      1.000      0.233          0.276
GROUP_SUBJECT  0.233      1.000          0.971
GROUP_GRADE    0.276      0.971          1.000

Variable       GROUP_CODE  SORT_SEQ  GROUP_GBN  GROUP_SUBJECT  GROUP_GRADE
GROUP_CODE     1.000       0.458     0.295      0.000          0.223
SORT_SEQ       0.458       1.000     0.520      0.239          0.615
GROUP_GBN      0.295       0.520     1.000      0.233          0.276
GROUP_SUBJECT  0.000       0.239     0.233      1.000          0.971
GROUP_GRADE    0.223       0.615     0.276      0.971          1.000
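
The report does not name the measure behind the 0.971 between GROUP_SUBJECT and GROUP_GRADE. One candidate is the bias-corrected Cramér's V, sketched below with the standard library only. The contingency table is an assumption: it is inferred from the category counts (4: 12, 1: 11, 3: 11, 2: 11, R: 9; grade 1: 45, grade 2: 9) and from the sample rows, where subject R always has grade 2 and the other subjects have grade 1. The function name is hypothetical.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic (no continuity correction).
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Bias correction of phi^2 and of the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(r_corr - 1, k_corr - 1))

# ASSUMED cross-tabulation, not printed in the report:
# rows = GROUP_SUBJECT (4, 1, 3, 2, R), columns = GROUP_GRADE (1, 2).
table = [[12, 0], [11, 0], [11, 0], [11, 0], [0, 9]]
v = cramers_v_corrected(table)
print(round(v, 3))
```

Under that assumed table GROUP_GRADE is fully determined by GROUP_SUBJECT, so the uncorrected statistic would be 1; the small-sample correction is what pulls the value slightly below 1.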

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

#   GROUP_CODE  GROUP_NAME                           GROUP_GBN  GROUP_SUBJECT  GROUP_GRADE  SORT_SEQ  IN_DTIME  UP_DTIME
0   1402        2014년 의료기기 RA 전문가 2급        필기       R              2            1         11:01.6   44:24.0
1   1501        [인허가] 필기(15)                    필기       1              1            99        49:38.0   52:02.0
2   1502        2015년 의료기기 RA 전문가 2급        필기       R              2            2         44:14.6   02:08.5
3   1511        [품질관리(GMP)] 필기(15)             필기       4              1            4         42:07.8   42:59.0
4   1521        [임상] 필기(15)                      필기       3              1            5         42:45.7   43:05.0
5   1531        [해외인증] 필기(15)                  필기       2              1            6         43:18.9   43:11.0
6   1551        [GMP] 실기(15)                       실기       4              1            7         40:24.7   15:10.0
7   1601        [인허가] 필기(16)                    필기       1              1            9         28:36.1   43:22.0
8   1602        2016년 의료기기 RA 전문가 2급        필기       R              2            3         39:48.3   44:43.0
9   1611        [품질관리(GMP)] 필기(16)             필기       4              1            10        29:06.5   43:29.0

#   GROUP_CODE  GROUP_NAME                           GROUP_GBN  GROUP_SUBJECT  GROUP_GRADE  SORT_SEQ  IN_DTIME  UP_DTIME
44  2002        2020년 의료기기 RA 전문가 2급        필기       R              2            8         37:04.0   46:09.0
45  2003        2020년 의료기기 RA 전문가 2급(완화)  필기       R              2            9         38:01.0   46:17.0
46  2011        [인허가] 필기(20)                    필기       1              1            <NA>      21:35.0   21:35.0
47  2021        [품질관리(GMP)] 필기(20)             필기       4              1            <NA>      22:18.0   22:18.0
48  2031        [임상] 필기(20)                      필기       3              1            <NA>      22:50.0   22:50.0
49  2041        [해외인증] 필기(20)                  필기       2              1            <NA>      23:19.0   23:19.0
50  2051        [인허가] 실기(20)                    실기       1              1            <NA>      23:55.0   23:55.0
51  2061        [품질관리(GMP)] 실기(20)             실기       4              1            <NA>      24:44.0   24:44.0
52  2071        [임상] 실기(20)                      실기       3              1            <NA>      25:12.0   25:12.0
53  2081        [해외인증] 실기(20)                  실기       2              1            <NA>      25:45.0   25:45.0