Overview

Dataset statistics

Number of variables: 3
Number of observations: 4749
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 111.4 KiB
Average record size in memory: 24.0 B
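
The average record size follows from the total memory figure and the row count. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes and rounding to one decimal place as the report appears to do (the variable names are illustrative only):

```python
# Figures copied from the dataset statistics in this report.
total_kib = 111.4   # total size in memory, in KiB
n_rows = 4749       # number of observations

# Convert KiB to bytes, then divide by the number of rows.
avg_record_bytes = total_kib * 1024 / n_rows
print(f"{avg_record_bytes:.1f} B")
```

This agrees with the reported 24.0 B per record.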

Variable types

Text: 2
Categorical: 1

Dataset

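Description: Notice codes released by the National Cancer Center (국립암센터) through its website, covering the period up to September 2019
Author: National Cancer Center (국립암센터)
URL: https://www.data.go.kr/data/15049634/fileData.do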

Alerts

NTC_IDX has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 05:39:44.135020
Analysis finished: 2023-12-12 05:39:44.435951
Duration: 0.3 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

NTC_IDX
Text

UNIQUE 

Distinct: 4749
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 37.2 KiB

Length

Max length: 6
Median length: 5
Mean length: 5.2219415
Min length: 1
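
The length figures are summary statistics over the character length of each value. A rough sketch of how they could be computed with the standard library is shown here; the function name `length_stats` and the sample values are hypothetical and are not taken from the profiling tool or the real column:

```python
from statistics import mean, median

def length_stats(values):
    """Return max, median, mean, and min string length for a list of text values."""
    lengths = [len(v) for v in values]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }

# Hypothetical sample: IDs stored as text, some with thousands separators.
sample = ["358", "359", "8,611", "16,584", "7"]
stats = length_stats(sample)
print(stats)
```

Note that a comma counts as a character, so "16,584" has length 6.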

Characters and Unicode

Total characters: 24799
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
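
As an illustration of those character properties, the sketch below counts characters by Unicode general category using Python's standard `unicodedata` module. It is not the profiler's own code, the helper name `category_counts` is hypothetical, and the sample values only imitate the style of this column. The standard library exposes the general category but not the script or block, so those two properties are left out.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count characters by Unicode general category, e.g. 'Nd' or 'Po'."""
    return Counter(unicodedata.category(ch) for v in values for ch in v)

# Hypothetical values in the style of NTC_IDX.
sample = ["358", "8,611", "16,584"]
counts = category_counts(sample)
print(counts)
```

Here `Nd` is Decimal Number and `Po` is Other Punctuation, the two categories this report lists.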

Unique

Unique: 4749
Unique (%): 100.0%

Sample

1st row: 358
2nd row: 359
3rd row: 360
4th row: 361
5th row: 362

Value                Count  Frequency (%)
358                  1      < 0.1%
8,611                1      < 0.1%
8,607                1      < 0.1%
8,609                1      < 0.1%
8,596                1      < 0.1%
8,597                1      < 0.1%
8,595                1      < 0.1%
8,591                1      < 0.1%
8,569                1      < 0.1%
8,559                1      < 0.1%
Other values (4739)  4739   99.8%

Most occurring characters

Value  Count  Frequency (%)
1      4343   17.5%
,      4202   16.9%
3      2053   8.3%
2      1994   8.0%
4      1992   8.0%
5      1886   7.6%
6      1886   7.6%
0      1676   6.8%
9      1631   6.6%
7      1611   6.5%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     20597  83.1%
Other Punctuation  4202   16.9%
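
The Other Punctuation share consists entirely of commas, which suggests the identifiers were exported as text with thousands separators (for example 8,611). Assuming that reading is correct, a small sketch of converting such values back to integers might look like this; `parse_grouped_int` is a hypothetical helper name:

```python
def parse_grouped_int(text):
    """Convert a string such as '8,611' to the integer 8611."""
    return int(text.replace(",", ""))

# Values in the formats seen in this column.
sample = ["358", "8,611", "16,584"]
parsed = [parse_grouped_int(s) for s in sample]
print(parsed)
```

Parsing the column this way would let a profiler treat it as numeric rather than text.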

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      4343   21.1%
3      2053   10.0%
2      1994   9.7%
4      1992   9.7%
5      1886   9.2%
6      1886   9.2%
0      1676   8.1%
9      1631   7.9%
7      1611   7.8%
8      1525   7.4%

Other Punctuation
Value  Count  Frequency (%)
,      4202   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  24799  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
1      4343   17.5%
,      4202   16.9%
3      2053   8.3%
2      1994   8.0%
4      1992   8.0%
5      1886   7.6%
6      1886   7.6%
0      1676   6.8%
9      1631   6.6%
7      1611   6.5%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  24799  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1      4343   17.5%
,      4202   16.9%
3      2053   8.3%
2      1994   8.0%
4      1992   8.0%
5      1886   7.6%
6      1886   7.6%
0      1676   6.8%
9      1631   6.6%
7      1611   6.5%

NTC_ID
Text

Distinct: 4739
Distinct (%): 99.8%
Missing: 0
Missing (%): 0.0%
Memory size: 37.2 KiB

Length

Max length: 5
Median length: 5
Mean length: 4.6420299
Min length: 1

Characters and Unicode

Total characters: 22045
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4730
Unique (%): 99.6%
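
Distinct and Unique measure different things: Distinct counts how many different values occur, while Unique counts values that occur exactly once. The sketch below reproduces the reported figures from a synthetic column, not the real data. Its shape (4730 values occurring once, eight occurring twice, one occurring three times) is inferred from the counts in this section and its value-frequency listing; the helper name and placeholder values are hypothetical.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Synthetic column with the same repetition pattern as NTC_ID.
values = [f"once-{i}" for i in range(4730)]
values += [f"twice-{i}" for i in range(8) for _ in range(2)]
values += ["thrice"] * 3

distinct, unique = distinct_and_unique(values)
print(len(values), distinct, unique)
```

The result is 4749 rows, 4739 distinct values, and 4730 unique values, matching the report.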

Sample

1st row: 449
2nd row: 450
3rd row: 452
4th row: 451
5th row: 454

Value                Count  Frequency (%)
1,834                3      0.1%
1,693                2      < 0.1%
1,698                2      < 0.1%
3,028                2      < 0.1%
1,696                2      < 0.1%
1,106                2      < 0.1%
1,879                2      < 0.1%
1,013                2      < 0.1%
4,862                2      < 0.1%
2,521                1      < 0.1%
Other values (4729)  4729   99.6%

Most occurring characters

Value  Count  Frequency (%)
,      3944   17.9%
4      2400   10.9%
1      2376   10.8%
3      2370   10.8%
2      2359   10.7%
5      1543   7.0%
9      1442   6.5%
8      1412   6.4%
0      1412   6.4%
7      1400   6.4%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     18101  82.1%
Other Punctuation  3944   17.9%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
4      2400   13.3%
1      2376   13.1%
3      2370   13.1%
2      2359   13.0%
5      1543   8.5%
9      1442   8.0%
8      1412   7.8%
0      1412   7.8%
7      1400   7.7%
6      1387   7.7%

Other Punctuation
Value  Count  Frequency (%)
,      3944   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  22045  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
,      3944   17.9%
4      2400   10.9%
1      2376   10.8%
3      2370   10.8%
2      2359   10.7%
5      1543   7.0%
9      1442   6.5%
8      1412   6.4%
0      1412   6.4%
7      1400   6.4%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  22045  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
,      3944   17.9%
4      2400   10.9%
1      2376   10.8%
3      2370   10.8%
2      2359   10.7%
5      1543   7.0%
9      1442   6.5%
8      1412   6.4%
0      1412   6.4%
7      1400   6.4%

NTC_CODE
Categorical

Distinct: 25
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 37.2 KiB

Value              Count
005_002            2006
005_001            1120
005_005            359
005_006            261
005_004            152
Other values (20)  851

Length

Max length: 9
Median length: 7
Mean length: 7.0593809
Min length: 7

Unique

Unique: 3
Unique (%): 0.1%

Sample

1st row: 005_002
2nd row: 005_002
3rd row: 005_002
4th row: 005_002
5th row: 005_002

Common Values

Value              Count  Frequency (%)
005_002            2006   42.2%
005_001            1120   23.6%
005_005            359    7.6%
005_006            261    5.5%
005_004            152    3.2%
005_001_1          139    2.9%
005_003            117    2.5%
005_010            85     1.8%
005_102            83     1.7%
005_027            80     1.7%
Other values (15)  347    7.3%
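
The frequency column and the "Other values" row can be checked against the counts. The following sketch redoes that arithmetic from the figures in the Common Values table, assuming percentages are rounded to one decimal place; the variable names are illustrative only:

```python
n_rows = 4749  # number of observations reported for the dataset

# Counts copied from the Common Values table.
top10 = {
    "005_002": 2006, "005_001": 1120, "005_005": 359, "005_006": 261,
    "005_004": 152, "005_001_1": 139, "005_003": 117, "005_010": 85,
    "005_102": 83, "005_027": 80,
}

# Rows not covered by the ten listed codes fall into "Other values".
other = n_rows - sum(top10.values())
share = {code: round(100 * n / n_rows, 1) for code, n in top10.items()}
other_share = round(100 * other / n_rows, 1)
print(other, other_share, share["005_002"])
```

The remainder is 347 rows (7.3%), which matches the table, and 25 distinct codes minus the ten listed leaves the 15 grouped under "Other values".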

Length

Histogram of lengths of the category

Value              Count  Frequency (%)
005_002            2006   42.2%
005_001            1122   23.6%
005_005            359    7.6%
005_006            261    5.5%
005_004            152    3.2%
005_001_1          139    2.9%
005_003            117    2.5%
005_010            85     1.8%
005_102            83     1.7%
005_027            80     1.7%
Other values (14)  345    7.3%

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
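
Both views are built from per-column counts of missing cells. As a rough illustration of that underlying count, and not the profiler's implementation, the sketch below treats `None` and the empty string as missing. The helper name `missing_per_column` is hypothetical, and the second row is invented to show a non-zero result; the dataset profiled here has no missing cells.

```python
def missing_per_column(rows, columns):
    """Count cells that are None or an empty string in each column."""
    return {c: sum(1 for r in rows if r.get(c) in (None, "")) for c in columns}

columns = ["NTC_IDX", "NTC_ID", "NTC_CODE"]

# Hypothetical rows for illustration; the second one lacks an NTC_ID.
rows = [
    {"NTC_IDX": "358", "NTC_ID": "449", "NTC_CODE": "005_002"},
    {"NTC_IDX": "359", "NTC_ID": None, "NTC_CODE": "005_002"},
]
missing = missing_per_column(rows, columns)
print(missing)
```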

Sample

      NTC_IDX  NTC_ID  NTC_CODE
0     358      449     005_002
1     359      450     005_002
2     360      452     005_002
3     361      451     005_002
4     362      454     005_002
5     363      455     005_003
6     364      456     005_002
7     365      457     005_002
8     366      460     005_002
9     367      459     005_002

      NTC_IDX  NTC_ID  NTC_CODE
4739  16,320   5,058   005_002
4740  16,322   5,059   005_002
4741  16,357   5,063   005_002
4742  16,463   5,081   005_002
4743  16,420   5,077   005_002
4744  16,439   5,080   005_002
4745  16,578   5,124   005_002
4746  16,464   5,084   005_002
4747  16,584   5,128   005_002
4748  16,462   5,083   005_002