Overview

Dataset statistics

Number of variables: 4
Number of observations: 208
Missing cells: 2
Missing cells (%): 0.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.8 KiB
Average record size in memory: 33.6 B
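
The missing-cell percentage can be checked from the other figures in this block. The sketch below assumes the percentage is taken over all cells (variables times observations); the variable names are illustrative only.

```python
# Figures copied from the dataset statistics above.
n_variables = 4
n_observations = 208
missing_cells = 2

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations   # 832 cells
missing_share = missing_cells / total_cells  # roughly 0.0024

print(f"Missing cells (%): {missing_share:.1%}")  # Missing cells (%): 0.2%
```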

Variable types

Categorical: 3
Text: 1

Dataset

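Description:
Content: Number of applications by application year, major category, and subcategory, for treatment-material requests among customized research applications.
Target: Customized research applicants.
1. 신청연도 (application year): Year in which the treatment-material research was requested. Studies whose req number starts with 0000 are grouped as "2020년 이전" (2020 or earlier); the identifiable years 2021 and 2022 are listed separately.
2. 대분류 (major category): Major category of the treatment material.
3. 중분류 (subcategory): Subcategory of the treatment material.
4. 신청건수 (number of applications): Number of research applications per treatment-material subcategory.
Note: Individual applications that were not submitted through the system may be missing.
URL: https://www.data.go.kr/data/15122155/fileData.do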

Alerts

신청연도 is highly overall correlated with 대분류 and 1 other field (High correlation)
대분류 is highly overall correlated with 신청연도 and 1 other field (High correlation)
신청건수 is highly overall correlated with 신청연도 and 1 other field (High correlation)
신청건수 is highly imbalanced (55.9%) (Imbalance)
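
The 55.9% imbalance figure is consistent with an entropy-based score, 1 - H / log2(k), applied to the two 신청건수 class counts reported later (189 and 19). The sketch below is an assumption about how such a score can be computed, not a quotation of the profiler's code; `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the base-2 Shannon entropy of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no meaningful imbalance score
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts for 신청건수 as listed in this report: value 1 -> 189 rows, value 2 -> 19 rows.
score = imbalance_score([189, 19])
print(f"{score:.1%}")  # 55.9%
```

A perfectly balanced variable scores 0 under this definition, and the score approaches 1 as one class dominates.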

Reproduction

Analysis started: 2023-12-12 21:50:09.928758
Analysis finished: 2023-12-12 21:50:10.301153
Duration: 0.37 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
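
The reported duration follows directly from the two timestamps. A minimal check using only the standard library:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the Reproduction block above.
started = datetime.strptime("2023-12-12 21:50:09.928758", FMT)
finished = datetime.strptime("2023-12-12 21:50:10.301153", FMT)

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 0.37 seconds
```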

Variables

신청연도
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

Value | Count
2020년 이전 | 94
2021 | 46
2022 | 34
2023 | 34

Length

Max length: 8
Median length: 4
Mean length: 5.8076923
Min length: 4
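
These length statistics can be reproduced from the category counts alone, since every row holds one of four strings. The sketch below assumes length means the number of characters (code points) in each value.

```python
from statistics import mean, median

# Category counts taken from the 신청연도 summary in this report.
counts = {"2020년 이전": 94, "2021": 46, "2022": 34, "2023": 34}

# Expand to one character length per observation (208 values in total).
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

print("Max length:", max(lengths))
print("Median length:", median(lengths))
print("Mean length:", round(mean(lengths), 7))
print("Min length:", min(lengths))
```

The mean works out to (94 × 8 + 114 × 4) / 208 = 1208 / 208, which matches the reported 5.8076923.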

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020년 이전
2nd row: 2020년 이전
3rd row: 2020년 이전
4th row: 2020년 이전
5th row: 2020년 이전

Common Values

Value | Count | Frequency (%)
2020년 이전 | 94 | 45.2%
2021 | 46 | 22.1%
2022 | 34 | 16.3%
2023 | 34 | 16.3%
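
The frequency column is each count divided by the 208 observations. A short check, with the counts copied from the table above:

```python
# Counts for 신청연도 as reported in the Common Values table.
counts = {"2020년 이전": 94, "2021": 46, "2022": 34, "2023": 34}

total = sum(counts.values())  # 208 observations
frequencies = {value: f"{n / total:.1%}" for value, n in counts.items()}

for value, share in frequencies.items():
    print(value, counts[value], share)
```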

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2020년 | 94 | 31.1%
이전 | 94 | 31.1%
2021 | 46 | 15.2%
2022 | 34 | 11.3%
2023 | 34 | 11.3%
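
The table above splits "2020년 이전" into two entries, which suggests that it counts whitespace-separated tokens rather than whole values. The sketch below reproduces the numbers under that assumption (lowercasing and splitting on whitespace); the profiler's actual tokenization may differ.

```python
from collections import Counter

# Category counts for 신청연도 as reported in this document.
counts = {"2020년 이전": 94, "2021": 46, "2022": 34, "2023": 34}

# Assumption: each value is lowercased and split on whitespace,
# and every token inherits the count of the value it came from.
words = Counter()
for value, n in counts.items():
    for token in value.lower().split():
        words[token] += n

total_tokens = sum(words.values())  # 302 tokens
for token, n in words.most_common():
    print(token, n, f"{n / total_tokens:.1%}")
```

Because the total is now 302 tokens instead of 208 rows, the percentages differ from those in the value-level table.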

대분류
Categorical

HIGH CORRELATION 

Distinct: 49
Distinct (%): 23.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

Value | Count
245 | 22
J8 | 16
G0: 인조혈관류 | 15
H3: 신경자극기류 | 15
H3: 신경자극기 류 | 15
Other values (44) | 125

Length

Max length: 34
Median length: 26
Mean length: 11.783654
Min length: 1

Unique

Unique: 21
Unique (%): 10.1%

Sample

1st row: 245
2nd row: 245
3rd row: 245
4th row: 245
5th row: 245

Common Values

Value | Count | Frequency (%)
245 | 22 | 10.6%
J8 | 16 | 7.7%
G0: 인조혈관류 | 15 | 7.2%
H3: 신경자극기류 | 15 | 7.2%
H3: 신경자극기 류 | 15 | 7.2%
KCD20(질병이환 및 사망의 외인) | 13 | 6.2%
중재적시술용군 | 11 | 5.3%
G8 : PACEMAKER CRT ICD & LEAD 류 | 9 | 4.3%
G8: Pacemaker CRT ICD & Lead류 | 9 | 4.3%
G | 9 | 4.3%
Other values (39) | 74 | 35.6%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
(blank) | 52 | 9.6%
(blank) | 32 | 5.9%
icd | 30 | 5.6%
h3 | 30 | 5.6%
pacemaker | 26 | 4.8%
crt | 26 | 4.8%
g8 | 26 | 4.8%
245 | 22 | 4.1%
lead | 17 | 3.2%
j8 | 16 | 3.0%
Other values (59) | 262 | 48.6%

중분류
Text

Distinct: 171
Distinct (%): 83.0%
Missing: 2
Missing (%): 1.0%
Memory size: 1.8 KiB

Length

Max length: 64
Median length: 40
Mean length: 17.257282
Min length: 1

Characters and Unicode

Total characters: 3555
Distinct characters: 187
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 146
Unique (%): 70.9%
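
The figures reported for the text variable (중분류 in the sample table) are mutually consistent only if the distinct and unique percentages, and the mean length, are taken over the 206 non-missing values rather than all 208 rows. The sketch below checks that reading; it is an inference from the numbers, not a statement about the profiler's internals.

```python
# Figures copied from the text-variable summary in this report.
n_observations = 208
n_missing = 2
distinct = 171
unique = 146
total_characters = 3555

n_present = n_observations - n_missing  # 206 non-missing values

missing_pct = f"{n_missing / n_observations:.1%}"  # share of all rows
distinct_pct = f"{distinct / n_present:.1%}"       # share of non-missing rows
unique_pct = f"{unique / n_present:.1%}"           # share of non-missing rows
mean_length = total_characters / n_present

print(missing_pct, distinct_pct, unique_pct, round(mean_length, 6))
```

Dividing 171 by 208 instead would give about 82.2%, which does not match the reported 83.0%.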

Sample

1st row: 116502BIJ
2nd row: 116530BIJ
3rd row: 142001BIJ
4th row: 142030BIJ
5th row: 142201BIJ

Value | Count | Frequency (%)
척수신경자극기 | 41 | 8.2%
heart | 20 | 4.0%
valve | 19 | 3.8%
촬영가능 | 15 | 3.0%
ipg | 14 | 2.8%
straight | 12 | 2.4%
type | 10 | 2.0%
tissue | 10 | 2.0%
chamber | 8 | 1.6%
다친 | 8 | 1.6%
Other values (220) | 340 | 68.4%

Most occurring characters

Value | Count | Frequency (%)
(space) | 315 | 8.9%
0 | 127 | 3.6%
A | 125 | 3.5%
E | 124 | 3.5%
T | 116 | 3.3%
R | 114 | 3.2%
1 | 102 | 2.9%
I | 102 | 2.9%
) | 83 | 2.3%
( | 83 | 2.3%
Other values (177) | 2264 | 63.7%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 1261 | 35.5%
Other Letter | 878 | 24.7%
Decimal Number | 586 | 16.5%
Space Separator | 315 | 8.9%
Lowercase Letter | 291 | 8.2%
Close Punctuation | 83 | 2.3%
Open Punctuation | 83 | 2.3%
Dash Punctuation | 29 | 0.8%
Connector Punctuation | 17 | 0.5%
Other Punctuation | 12 | 0.3%
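
The category names above are Unicode general categories. As a rough illustration of how such a tally can be produced, the sketch below uses Python's `unicodedata.category` on three 중분류 values that appear in this report's sample rows. `CATEGORY_NAMES` and `category_counts` are hypothetical names, and the result covers only these three strings, not the full column.

```python
import unicodedata
from collections import Counter

# Long names for the ten general-category codes listed in this report.
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter", "Ll": "Lowercase Letter", "Lo": "Other Letter",
    "Nd": "Decimal Number", "Zs": "Space Separator",
    "Ps": "Open Punctuation", "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation", "Pc": "Connector Punctuation",
    "Po": "Other Punctuation",
}

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# Three values taken from the sample rows of this report.
sample = ["116502BIJ", "CRRT KIT", "혈장용 PACK"]
result = category_counts(sample)
print(result.most_common())
```

Hangul syllables fall under "Other Letter" because the script has no upper or lower case.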

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(blank) | 80 | 9.1%
(blank) | 53 | 6.0%
(blank) | 52 | 5.9%
(blank) | 50 | 5.7%
(blank) | 44 | 5.0%
(blank) | 41 | 4.7%
(blank) | 41 | 4.7%
(blank) | 21 | 2.4%
(blank) | 20 | 2.3%
(blank) | 18 | 2.1%
Other values (115) | 458 | 52.2%

Uppercase Letter
Value | Count | Frequency (%)
A | 125 | 9.9%
E | 124 | 9.8%
T | 116 | 9.2%
R | 114 | 9.0%
I | 102 | 8.1%
P | 67 | 5.3%
L | 62 | 4.9%
G | 56 | 4.4%
M | 56 | 4.4%
D | 54 | 4.3%
Other values (15) | 385 | 30.5%

Lowercase Letter
Value | Count | Frequency (%)
c | 38 | 13.1%
a | 38 | 13.1%
e | 38 | 13.1%
m | 26 | 8.9%
l | 21 | 7.2%
t | 19 | 6.5%
v | 16 | 5.5%
i | 16 | 5.5%
r | 15 | 5.2%
n | 14 | 4.8%
Other values (9) | 50 | 17.2%

Decimal Number
Value | Count | Frequency (%)
0 | 127 | 21.7%
1 | 102 | 17.4%
2 | 67 | 11.4%
4 | 66 | 11.3%
3 | 51 | 8.7%
8 | 39 | 6.7%
6 | 39 | 6.7%
5 | 35 | 6.0%
7 | 34 | 5.8%
9 | 26 | 4.4%

Other Punctuation
Value | Count | Frequency (%)
/ | 9 | 75.0%
& | 2 | 16.7%
: | 1 | 8.3%

Space Separator
Value | Count | Frequency (%)
(space) | 315 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 83 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 83 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 29 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 17 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1552 | 43.7%
Common | 1125 | 31.6%
Hangul | 878 | 24.7%

Most frequent character per script

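Hangul
Value | Count | Frequency (%)
(blank) | 80 | 9.1%
(blank) | 53 | 6.0%
(blank) | 52 | 5.9%
(blank) | 50 | 5.7%
(blank) | 44 | 5.0%
(blank) | 41 | 4.7%
(blank) | 41 | 4.7%
(blank) | 21 | 2.4%
(blank) | 20 | 2.3%
(blank) | 18 | 2.1%
Other values (115) | 458 | 52.2%

Latin
Value | Count | Frequency (%)
A | 125 | 8.1%
E | 124 | 8.0%
T | 116 | 7.5%
R | 114 | 7.3%
I | 102 | 6.6%
P | 67 | 4.3%
L | 62 | 4.0%
G | 56 | 3.6%
M | 56 | 3.6%
D | 54 | 3.5%
Other values (34) | 676 | 43.6%

Common
Value | Count | Frequency (%)
(space) | 315 | 28.0%
0 | 127 | 11.3%
1 | 102 | 9.1%
) | 83 | 7.4%
( | 83 | 7.4%
2 | 67 | 6.0%
4 | 66 | 5.9%
3 | 51 | 4.5%
8 | 39 | 3.5%
6 | 39 | 3.5%
Other values (8) | 153 | 13.6%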

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2677 | 75.3%
Hangul | 878 | 24.7%
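
A block tally like the one above can be approximated by checking code-point ranges. The sketch below is a simplified stand-in that knows only the two blocks named in this report, treating code points below U+0080 as "ASCII" and the Hangul Syllables range U+AC00 to U+D7AF as "Hangul". `block_of` is a hypothetical helper, and the sample strings come from this report's sample rows.

```python
from collections import Counter

def block_of(ch):
    """Classify a character into the two blocks named in this report (simplified)."""
    cp = ord(ch)
    if cp < 0x80:
        return "ASCII"
    if 0xAC00 <= cp <= 0xD7AF:  # Hangul Syllables block
        return "Hangul"
    return "Other"

# Values taken from the sample rows of this report.
sample = ["CRRT KIT", "혈장용 PACK"]
blocks = Counter(block_of(ch) for value in sample for ch in value)
print(blocks)

# The reported shares follow from the reported counts: 2677 + 878 = 3555 characters.
ascii_share = f"{2677 / 3555:.1%}"
hangul_share = f"{878 / 3555:.1%}"
print(ascii_share, hangul_share)
```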

Most frequent character per block

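ASCII
Value | Count | Frequency (%)
(space) | 315 | 11.8%
0 | 127 | 4.7%
A | 125 | 4.7%
E | 124 | 4.6%
T | 116 | 4.3%
R | 114 | 4.3%
1 | 102 | 3.8%
I | 102 | 3.8%
) | 83 | 3.1%
( | 83 | 3.1%
Other values (52) | 1386 | 51.8%

Hangul
Value | Count | Frequency (%)
(blank) | 80 | 9.1%
(blank) | 53 | 6.0%
(blank) | 52 | 5.9%
(blank) | 50 | 5.7%
(blank) | 44 | 5.0%
(blank) | 41 | 4.7%
(blank) | 41 | 4.7%
(blank) | 21 | 2.4%
(blank) | 20 | 2.3%
(blank) | 18 | 2.1%
Other values (115) | 458 | 52.2%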

신청건수
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

Value | Count
1 | 189
2 | 19

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 189 | 90.9%
2 | 19 | 9.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
1 | 189 | 90.9%
2 | 19 | 9.1%

Correlations

Variable | 신청연도 | 대분류 | 신청건수
신청연도 | 1.000 | 0.983 | 0.828
대분류 | 0.983 | 1.000 | 0.978
신청건수 | 0.828 | 0.978 | 1.000

Variable | 신청연도 | 신청건수 | 대분류
신청연도 | 1.000 | 0.620 | 0.799
신청건수 | 0.620 | 1.000 | 0.821
대분류 | 0.799 | 0.821 | 1.000

Variable | 신청연도 | 대분류 | 신청건수
신청연도 | 1.000 | 0.799 | 0.620
대분류 | 0.799 | 1.000 | 0.821
신청건수 | 0.620 | 0.821 | 1.000
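
The report does not say which association measure each matrix uses. For pairs of categorical variables, Cramér's V is a common choice, so the sketch below shows the plain (uncorrected) statistic computed from a contingency table. The function name and the toy tables are hypothetical and are not derived from this dataset, so the output is not expected to match the values above.

```python
import math

def cramers_v(table):
    """Plain (uncorrected) Cramér's V for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Pearson chi-squared statistic: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if n > 0 and k > 0 else 0.0

# Toy tables (hypothetical): perfect association and complete independence.
perfect = cramers_v([[10, 0], [0, 10]])
independent = cramers_v([[5, 5], [5, 5]])
print(perfect, independent)
```

Values near 1 indicate that one variable almost determines the other, which would fit the high-correlation alerts raised for 신청연도, 대분류, and 신청건수.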

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

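Index | 신청연도 | 대분류 | 중분류 | 신청건수
0 | 2020년 이전 | 245 | 116502BIJ | 1
1 | 2020년 이전 | 245 | 116530BIJ | 1
2 | 2020년 이전 | 245 | 142001BIJ | 1
3 | 2020년 이전 | 245 | 142030BIJ | 1
4 | 2020년 이전 | 245 | 142201BIJ | 1
5 | 2020년 이전 | 245 | 142230BIJ | 1
6 | 2020년 이전 | 245 | 142232BIJ | 1
7 | 2020년 이전 | 245 | 171201BIJ | 1
8 | 2020년 이전 | 245 | 171202BIJ | 1
9 | 2020년 이전 | 245 | 193502BIJ | 1

Index | 신청연도 | 대분류 | 중분류 | 신청건수
198 | 2023 | Heart valve conduit | Tissue heart valve conduit | 1
199 | 2023 | J6 | 106012 | 1
200 | 2023 | J8 | 108014 | 1
201 | 2023 | J8 | 108039 | 1
202 | 2023 | J8 | 108045 | 1
203 | 2023 | L군 | CRRT KIT | 2
204 | 2023 | L군 | 혈장용 PACK | 1
205 | 2023 | 경피적 대동맥판 삽입용 | 경피적 대동맥판 삽입용 | 1
206 | 2023 | 경피적 폐동맥판 삽입용 | 경피적 폐동맥판 삽입용 | 1
207 | 2023 | 비봉합 대동맥판막치환술용 | 비봉합 대동맥판막치환술용 | 1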