Overview

Dataset statistics

Number of variables: 2
Number of observations: 238
Missing cells: 1
Missing cells (%): 0.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.8 KiB
Average record size in memory: 16.6 B
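
The overview figures follow from simple counts over the table. The sketch below is one way to reproduce them with the standard library, assuming rows are tuples and a missing cell is represented as `None`; the function name `dataset_stats` is hypothetical, and the duplicate rule (every repeat of an earlier row counts once) is an assumption rather than the profiler's documented definition.

```python
from collections import Counter


def dataset_stats(rows):
    """Return overview statistics for a list of equal-length row tuples."""
    n_obs = len(rows)
    n_vars = len(rows[0]) if rows else 0
    total_cells = n_obs * n_vars
    missing = sum(1 for row in rows for cell in row if cell is None)
    # Assumption: a row counts as a duplicate each time it repeats an earlier row.
    duplicates = sum(count - 1 for count in Counter(rows).values())
    return {
        "variables": n_vars,
        "observations": n_obs,
        "missing_cells": missing,
        "missing_cells_pct": 100 * missing / total_cells if total_cells else 0.0,
        "duplicate_rows": duplicates,
    }


# The report's own figures: 1 missing cell out of 238 * 2 = 476 cells.
report_pct = 100 * 1 / (238 * 2)
print(f"{report_pct:.1f}%")  # 0.2%
```

The percentage is taken over all cells (observations times variables), which is why one missing value appears as 0.2% here but as 0.4% in the per-variable statistics.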

Variable types

Text: 2

Dataset

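Description: Country code master data published by the National Cancer Center (국립암센터) through its website, covering the period up to September 2019
Author: National Cancer Center (국립암센터)
URL: https://www.data.go.kr/data/15049630/fileData.do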

Alerts

NAME has unique values (Unique)

Reproduction

Analysis started: 2023-12-11 22:46:23.020934
Analysis finished: 2023-12-11 22:46:23.283601
Duration: 0.26 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

CODE
Text

Distinct: 237
Distinct (%): 100.0%
Missing: 1
Missing (%): 0.4%
Memory size: 2.0 KiB
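
The report does not say which CODE value is missing. One possible cause, offered here only as an assumption, is that a two-letter code such as `NA` was read as a missing-value marker when the file was loaded. The sketch below scans raw CSV text for cells that a NaN-aware parser might drop; the helper name `na_like_cells`, the token set, and the example rows are all hypothetical.

```python
import csv
import io

# Assumption: a small subset of the strings that pandas.read_csv converts
# to NaN by default; the full default list is longer.
NA_LIKE_TOKENS = {"", "NA", "N/A", "NULL", "NaN", "nan", "null"}


def na_like_cells(csv_text, column):
    """Return (line_number, raw_value) for cells a NaN-aware parser may drop."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Line 1 is the header, so data rows start at line 2.
    return [
        (line_no, row[column])
        for line_no, row in enumerate(reader, start=2)
        if row[column] in NA_LIKE_TOKENS
    ]


# Hypothetical rows for illustration; not copied from the published file.
example = "CODE,NAME\nGH,Ghana\nNA,Namibia\nGA,Gabon\n"
print(na_like_cells(example, "CODE"))  # [(3, 'NA')]
```

If a check like this does flag a real code, loading the file with `keep_default_na=False` in `pandas.read_csv` would keep the raw string instead of converting it.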

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2
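
The length summary can be sketched with the standard `statistics` module, assuming missing values are skipped and length means the number of characters in each value. The function name `length_stats` is hypothetical; the input is the first five CODE values listed under Sample.

```python
from statistics import mean, median


def length_stats(values):
    """Summarise string lengths, ignoring missing (None) values."""
    lengths = [len(v) for v in values if v is not None]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }


# First five CODE values from the report's sample rows.
sample_codes = ["GH", "GA", "GY", "GM", "GP"]
print(length_stats(sample_codes))
```

Because every code has exactly two characters, all four figures coincide, as the report shows for this variable.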

Characters and Unicode

Total characters: 474
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
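
As an illustration of these character properties, the general category of each code point is available through Python's standard `unicodedata` module. The sketch below counts categories across a list of values; the function name `category_counts` is hypothetical. The standard library has no lookup for script or block, so those two breakdowns are not reproduced here.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in values."""
    counts = Counter()
    for value in values:
        for ch in value:
            # e.g. "Lu" = Uppercase Letter, "Lo" = Other Letter,
            # "Zs" = Space Separator, "Po" = Other Punctuation,
            # "Pd" = Dash Punctuation
            counts[unicodedata.category(ch)] += 1
    return counts


print(category_counts(["GH", "GA", "GY"]))      # codes from the sample rows
print(category_counts(["가나", "가봉", "가이아나"]))  # names from the sample rows
```

Latin capitals fall under `Lu` and Hangul syllables under `Lo`, which matches the "Uppercase Letter" and "Other Letter" labels in the category tables.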

Unique

Unique: 237
Unique (%): 100.0%
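
The two counts can be told apart with a short sketch, assuming "Distinct" means the number of different non-missing values and "Unique" means the number of values that occur exactly once. The function name `distinct_and_unique` is hypothetical.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (distinct, unique) counts over non-missing values."""
    counts = Counter(v for v in values if v is not None)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# "GH" repeats, so it is distinct but not unique; None is ignored.
print(distinct_and_unique(["GH", "GA", "GH", None]))  # (2, 1)
```

Under these assumptions the percentages are taken over non-missing values, which is consistent with 237 distinct codes out of 238 - 1 = 237 non-missing observations being reported as 100.0%.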

Sample

1st row: GH
2nd row: GA
3rd row: GY
4th row: GM
5th row: GP
Value               Count  Frequency (%)
gh                  1      0.4%
vg                  1      0.4%
ky                  1      0.4%
ye                  1      0.4%
om                  1      0.4%
at                  1      0.4%
hn                  1      0.4%
wf                  1      0.4%
jo                  1      0.4%
ug                  1      0.4%
Other values (227)  227    95.8%

Most occurring characters

Value              Count  Frequency (%)
M                  36     7.6%
G                  30     6.3%
S                  29     6.1%
T                  28     5.9%
A                  27     5.7%
C                  25     5.3%
N                  23     4.9%
B                  23     4.9%
E                  20     4.2%
K                  20     4.2%
Other values (16)  213    44.9%

Most occurring categories

Value             Count  Frequency (%)
Uppercase Letter  474    100.0%

Most frequent character per category

Uppercase Letter
Value              Count  Frequency (%)
M                  36     7.6%
G                  30     6.3%
S                  29     6.1%
T                  28     5.9%
A                  27     5.7%
C                  25     5.3%
N                  23     4.9%
B                  23     4.9%
E                  20     4.2%
K                  20     4.2%
Other values (16)  213    44.9%

Most occurring scripts

Value  Count  Frequency (%)
Latin  474    100.0%

Most frequent character per script

Latin
Value              Count  Frequency (%)
M                  36     7.6%
G                  30     6.3%
S                  29     6.1%
T                  28     5.9%
A                  27     5.7%
C                  25     5.3%
N                  23     4.9%
B                  23     4.9%
E                  20     4.2%
K                  20     4.2%
Other values (16)  213    44.9%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  474    100.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
M                  36     7.6%
G                  30     6.3%
S                  29     6.1%
T                  28     5.9%
A                  27     5.7%
C                  25     5.3%
N                  23     4.9%
B                  23     4.9%
E                  20     4.2%
K                  20     4.2%
Other values (16)  213    44.9%

NAME
Text

UNIQUE 

Distinct: 238
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB

Length

Max length: 15
Median length: 14
Mean length: 4.3655462
Min length: 1

Characters and Unicode

Total characters: 1039
Distinct characters: 213
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 238
Unique (%): 100.0%

Sample

1st row: 가나
2nd row: 가봉
3rd row: 가이아나
4th row: 감비아
5th row: 과델로프
Value               Count  Frequency (%)
군도                11     3.8%
세인트              5      1.7%
불령                4      1.4%
아일랜드            4      1.4%
(not shown)         3      1.0%
영령                3      1.0%
버진군도            2      0.7%
사모아              2      0.7%
네덜란드            2      0.7%
도미니카            2      0.7%
Other values (251)  253    86.9%

Most occurring characters

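Value               Count  Frequency (%)
(not shown)         66     6.4%
(space)             53     5.1%
(not shown)         37     3.6%
(not shown)         33     3.2%
(not shown)         30     2.9%
(not shown)         30     2.9%
(not shown)         25     2.4%
(not shown)         24     2.3%
(not shown)         22     2.1%
(not shown)         20     1.9%
Other values (203)  699    67.3%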

Most occurring categories

Value              Count  Frequency (%)
Other Letter       981    94.4%
Space Separator    53     5.1%
Other Punctuation  4      0.4%
Dash Punctuation   1      0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
(not shown)         66     6.7%
(not shown)         37     3.8%
(not shown)         33     3.4%
(not shown)         30     3.1%
(not shown)         30     3.1%
(not shown)         25     2.5%
(not shown)         24     2.4%
(not shown)         22     2.2%
(not shown)         20     2.0%
(not shown)         20     2.0%
Other values (199)  674    68.7%

Other Punctuation
Value  Count  Frequency (%)
&      3      75.0%
,      1      25.0%

Space Separator
Value    Count  Frequency (%)
(space)  53     100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  981    94.4%
Common  58     5.6%

Most frequent character per script

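Hangul
Value               Count  Frequency (%)
(not shown)         66     6.7%
(not shown)         37     3.8%
(not shown)         33     3.4%
(not shown)         30     3.1%
(not shown)         30     3.1%
(not shown)         25     2.5%
(not shown)         24     2.4%
(not shown)         22     2.2%
(not shown)         20     2.0%
(not shown)         20     2.0%
Other values (199)  674    68.7%

Common
Value    Count  Frequency (%)
(space)  53     91.4%
&        3      5.2%
,        1      1.7%
-        1      1.7%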

Most occurring blocks

Value   Count  Frequency (%)
Hangul  981    94.4%
ASCII   58     5.6%

Most frequent character per block

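Hangul
Value               Count  Frequency (%)
(not shown)         66     6.7%
(not shown)         37     3.8%
(not shown)         33     3.4%
(not shown)         30     3.1%
(not shown)         30     3.1%
(not shown)         25     2.5%
(not shown)         24     2.4%
(not shown)         22     2.2%
(not shown)         20     2.0%
(not shown)         20     2.0%
Other values (199)  674    68.7%

ASCII
Value    Count  Frequency (%)
(space)  53     91.4%
&        3      5.2%
,        1      1.7%
-        1      1.7%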

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
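
The idea behind both displays can be sketched without a plotting library: a nullity matrix is a grid of present/missing flags, and the per-column view is the column-wise sum of the missing flags. The sketch assumes a missing cell is `None`; the function names and the placeholder row are hypothetical.

```python
def nullity_matrix(rows):
    """Return one list of booleans per row: True where a cell is present."""
    return [[cell is not None for cell in row] for row in rows]


def missing_per_column(rows, columns):
    """Count missing cells for each named column."""
    return {
        name: sum(1 for row in rows if row[i] is None)
        for i, name in enumerate(columns)
    }


# Two sample rows from the report plus one hypothetical row with a missing code.
rows = [("GH", "가나"), (None, "(hypothetical)"), ("GA", "가봉")]
for present in nullity_matrix(rows):
    # "#" marks a present cell, "." a missing one.
    print("".join("#" if p else "." for p in present))
print(missing_per_column(rows, ["CODE", "NAME"]))
```

For this dataset the matrix would be almost entirely filled, with a single gap in the CODE column.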

Sample

     CODE  NAME
0    GH    가나
1    GA    가봉
2    GY    가이아나
3    GM    감비아
4    GP    과델로프
5    GT    과테말라
6    GU    (not shown)
7    VA    교황청
8    GD    그레나다
9    GE    그루지아

     CODE  NAME
228  PR    푸에르토리코
229  FR    프랑스
230  FJ    피지
231  PN    피트카이른
232  FI    핀란드
233  PH    필리핀
234  KR    한국
235  HU    헝가리
236  AU    호주
237  HK    홍콩