Overview

Dataset statistics

Number of variables: 5
Number of observations: 910
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 24
Duplicate rows (%): 2.6%
Total size in memory: 36.6 KiB
Average record size in memory: 41.1 B

Variable types

Text: 2
Numeric: 1
Categorical: 2

Dataset

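Description: Patent information from food and distribution R&D in the agriculture, forestry, fisheries, and food sector (농림수산식품 식품·유통R&D 특허 정보)
Author: 농림식품기술기획평가원
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20220211000000001842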

Alerts

Dataset has 24 (2.6%) duplicate rows [Duplicates]
APLC_REGIST_NATION is highly overall correlated with NATION_ISO_CODE [High correlation]
NATION_ISO_CODE is highly overall correlated with APLC_REGIST_NATION [High correlation]
APLC_REGIST_NATION is highly imbalanced (93.7%) [Imbalance]
NATION_ISO_CODE is highly imbalanced (93.7%) [Imbalance]
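
The 93.7% imbalance figure can be reproduced from the category counts reported for APLC_REGIST_NATION (895, 9, 4, 1, 1). The sketch below assumes the score is one minus the normalised Shannon entropy of the class distribution, an assumption that happens to match the reported value. The function name `imbalance_score` is illustrative and not part of any library API.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy (in bits) of the counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    Assumption: this mirrors the alert's score; it is not taken from the report itself.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Category counts taken from the Common Values table of APLC_REGIST_NATION.
nation_counts = [895, 9, 4, 1, 1]
score = imbalance_score(nation_counts)
print(f"{score:.1%}")
```

NATION_ISO_CODE has the same counts, so it receives the same score.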

Reproduction

Analysis started: 2023-12-11 03:32:49.130367
Analysis finished: 2023-12-11 03:32:50.069260
Duration: 0.94 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

PATENT_NM
Text

Distinct: 793
Distinct (%): 87.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.2 KiB

Length

Max length: 169
Median length: 65
Mean length: 32.908791
Min length: 3

Characters and Unicode

Total characters: 29947
Distinct characters: 683
Distinct categories: 10
Distinct scripts: 5
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 692
Unique (%): 76.0%

Sample

1st row: 다시마 엽체를 이용한 미용 마스크 팩의 제조방법
2nd row: NAD7 인트론4 영역에 특이적인 고려인삼 천풍종 구별용 SNP 프라이머 및 이를 이용한 고려인삼 천풍종 구별방법
3rd row: 온실용 탄산가스 발생 장치
4th row: 밀순 추출물을 유효성분으로 포함하는 당뇨병 예방 및 치료용 조성물
5th row: 스트렙토코커스 마세도니커스 LC743 균주를 함유하는 모짜렐라 치즈 및 그 제조방법
Value | Count | Frequency (%)
 | 534 | 7.4%
제조방법 | 345 | 4.8%
이용한 | 214 | 3.0%
조성물 | 193 | 2.7%
방법 | 168 | 2.3%
포함하는 | 137 | 1.9%
이를 | 101 | 1.4%
이의 | 93 | 1.3%
함유하는 | 85 | 1.2%
 | 84 | 1.2%
Other values (2290) | 5266 | 72.9%

Most occurring characters

ValueCountFrequency (%)
6318
 
21.1%
736
 
2.5%
713
 
2.4%
669
 
2.2%
614
 
2.1%
614
 
2.1%
551
 
1.8%
542
 
1.8%
536
 
1.8%
529
 
1.8%
Other values (673) 18125
60.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 21877 | 73.1%
Space Separator | 6318 | 21.1%
Lowercase Letter | 736 | 2.5%
Uppercase Letter | 621 | 2.1%
Decimal Number | 225 | 0.8%
Dash Punctuation | 69 | 0.2%
Other Punctuation | 54 | 0.2%
Open Punctuation | 23 | 0.1%
Close Punctuation | 23 | 0.1%
Letter Number | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
736
 
3.4%
713
 
3.3%
669
 
3.1%
614
 
2.8%
614
 
2.8%
551
 
2.5%
542
 
2.5%
536
 
2.5%
529
 
2.4%
491
 
2.2%
Other values (551) 15882
72.6%
Uppercase Letter
ValueCountFrequency (%)
P 39
 
6.3%
A 38
 
6.1%
36
 
5.8%
N 34
 
5.5%
30
 
4.8%
E 30
 
4.8%
R 26
 
4.2%
T 23
 
3.7%
C 23
 
3.7%
M 21
 
3.4%
Other values (40) 321
51.7%
Lowercase Letter
Value | Count | Frequency (%)
i | 69 | 9.4%
o | 65 | 8.8%
t | 61 | 8.3%
e | 58 | 7.9%
a | 57 | 7.7%
n | 52 | 7.1%
r | 41 | 5.6%
s | 36 | 4.9%
m | 32 | 4.3%
l | 31 | 4.2%
Other values (29) | 234 | 31.8%
Decimal Number
ValueCountFrequency (%)
1 47
20.9%
2 31
13.8%
4 25
11.1%
5 20
8.9%
3 20
8.9%
0 15
 
6.7%
7 14
 
6.2%
9 9
 
4.0%
8
 
3.6%
6 7
 
3.1%
Other values (9) 29
12.9%
Dash Punctuation
ValueCountFrequency (%)
- 62
89.9%
6
 
8.7%
1
 
1.4%
Other Punctuation
ValueCountFrequency (%)
, 47
87.0%
/ 5
 
9.3%
2
 
3.7%
Open Punctuation
ValueCountFrequency (%)
( 19
82.6%
2
 
8.7%
{ 2
 
8.7%
Close Punctuation
ValueCountFrequency (%)
) 19
82.6%
2
 
8.7%
} 2
 
8.7%
Space Separator
ValueCountFrequency (%)
6318
100.0%
Letter Number
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 21877 | 73.1%
Common | 6712 | 22.4%
Latin | 1356 | 4.5%
Cyrillic | 1 | < 0.1%
Greek | 1 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
736
 
3.4%
713
 
3.3%
669
 
3.1%
614
 
2.8%
614
 
2.8%
551
 
2.5%
542
 
2.5%
536
 
2.5%
529
 
2.4%
491
 
2.2%
Other values (551) 15882
72.6%
Latin
ValueCountFrequency (%)
i 69
 
5.1%
o 65
 
4.8%
t 61
 
4.5%
e 58
 
4.3%
a 57
 
4.2%
n 52
 
3.8%
r 41
 
3.0%
P 39
 
2.9%
A 38
 
2.8%
36
 
2.7%
Other values (78) 840
61.9%
Common
ValueCountFrequency (%)
6318
94.1%
- 62
 
0.9%
1 47
 
0.7%
, 47
 
0.7%
2 31
 
0.5%
4 25
 
0.4%
5 20
 
0.3%
3 20
 
0.3%
( 19
 
0.3%
) 19
 
0.3%
Other values (22) 104
 
1.5%
Cyrillic
ValueCountFrequency (%)
Р 1
100.0%
Greek
ValueCountFrequency (%)
β 1
100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 21876 | 73.0%
ASCII | 7739 | 25.8%
None | 323 | 1.1%
Punctuation | 6 | < 0.1%
Cyrillic | 1 | < 0.1%
Number Forms | 1 | < 0.1%
Compat Jamo | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6318
81.6%
i 69
 
0.9%
o 65
 
0.8%
- 62
 
0.8%
t 61
 
0.8%
e 58
 
0.7%
a 57
 
0.7%
n 52
 
0.7%
1 47
 
0.6%
, 47
 
0.6%
Other values (55) 903
 
11.7%
Hangul
ValueCountFrequency (%)
736
 
3.4%
713
 
3.3%
669
 
3.1%
614
 
2.8%
614
 
2.8%
551
 
2.5%
542
 
2.5%
536
 
2.5%
529
 
2.4%
491
 
2.2%
Other values (550) 15881
72.6%
None
ValueCountFrequency (%)
36
 
11.1%
30
 
9.3%
21
 
6.5%
19
 
5.9%
19
 
5.9%
16
 
5.0%
13
 
4.0%
11
 
3.4%
10
 
3.1%
8
 
2.5%
Other values (44) 140
43.3%
Punctuation
ValueCountFrequency (%)
6
100.0%
Cyrillic
ValueCountFrequency (%)
Р 1
100.0%
Number Forms
ValueCountFrequency (%)
1
100.0%
Compat Jamo
ValueCountFrequency (%)
1
100.0%

APLC_REGIST_PSN
Text

Distinct: 237
Distinct (%): 26.0%
Missing: 0
Missing (%): 0.0%
Memory size: 7.2 KiB

Length

Max length: 74
Median length: 69
Mean length: 9.2978022
Min length: 2

Characters and Unicode

Total characters: 8461
Distinct characters: 251
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 117
Unique (%): 12.9%

Sample

1st row: 대한민국(관리부서:국립수산과학원)
2nd row: 경희대학교산학협력단
3rd row: 서울대학교산학협력단
4th row: 전북대학교
5th row: 한국식품연구원
Value | Count | Frequency (%)
한국식품연구원 | 227 | 22.9%
농촌진흥청 | 58 | 5.9%
대한민국(농촌진흥청장 | 54 | 5.4%
산학협력단 | 39 | 3.9%
건국대학교산학협력단 | 20 | 2.0%
동국대학교산학협력단 | 18 | 1.8%
충남대학교산학협력단 | 15 | 1.5%
경희대학교산학협력단 | 15 | 1.5%
대한민국 | 14 | 1.4%
주식회사 | 14 | 1.4%
Other values (247) | 517 | 52.2%

Most occurring characters

ValueCountFrequency (%)
682
 
8.1%
423
 
5.0%
422
 
5.0%
374
 
4.4%
364
 
4.3%
341
 
4.0%
317
 
3.7%
317
 
3.7%
313
 
3.7%
310
 
3.7%
Other values (241) 4598
54.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 7949 | 93.9%
Open Punctuation | 172 | 2.0%
Close Punctuation | 172 | 2.0%
Space Separator | 81 | 1.0%
Other Punctuation | 60 | 0.7%
Math Symbol | 17 | 0.2%
Other Symbol | 9 | 0.1%
Decimal Number | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
682
 
8.6%
423
 
5.3%
422
 
5.3%
374
 
4.7%
364
 
4.6%
341
 
4.3%
317
 
4.0%
317
 
4.0%
313
 
3.9%
310
 
3.9%
Other values (230) 4086
51.4%
Other Punctuation
ValueCountFrequency (%)
; 34
56.7%
: 12
 
20.0%
, 7
 
11.7%
. 6
 
10.0%
/ 1
 
1.7%
Open Punctuation
ValueCountFrequency (%)
( 172
100.0%
Close Punctuation
ValueCountFrequency (%)
) 172
100.0%
Space Separator
ValueCountFrequency (%)
81
100.0%
Math Symbol
ValueCountFrequency (%)
| 17
100.0%
Other Symbol
ValueCountFrequency (%)
9
100.0%
Decimal Number
ValueCountFrequency (%)
1 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 7958 | 94.1%
Common | 503 | 5.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
682
 
8.6%
423
 
5.3%
422
 
5.3%
374
 
4.7%
364
 
4.6%
341
 
4.3%
317
 
4.0%
317
 
4.0%
313
 
3.9%
310
 
3.9%
Other values (231) 4095
51.5%
Common
ValueCountFrequency (%)
( 172
34.2%
) 172
34.2%
81
16.1%
; 34
 
6.8%
| 17
 
3.4%
: 12
 
2.4%
, 7
 
1.4%
. 6
 
1.2%
1 1
 
0.2%
/ 1
 
0.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 7949 | 93.9%
ASCII | 503 | 5.9%
None | 9 | 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
682
 
8.6%
423
 
5.3%
422
 
5.3%
374
 
4.7%
364
 
4.6%
341
 
4.3%
317
 
4.0%
317
 
4.0%
313
 
3.9%
310
 
3.9%
Other values (230) 4086
51.4%
ASCII
ValueCountFrequency (%)
( 172
34.2%
) 172
34.2%
81
16.1%
; 34
 
6.8%
| 17
 
3.4%
: 12
 
2.4%
, 7
 
1.4%
. 6
 
1.2%
1 1
 
0.2%
/ 1
 
0.2%
None
ValueCountFrequency (%)
9
100.0%

APLC_REGIST_DE
Real number (ℝ)

Distinct: 512
Distinct (%): 56.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20113471
Minimum: 20090105
Maximum: 20131227
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.1 KiB

Quantile statistics

Minimum: 20090105
5-th percentile: 20091013
Q1: 20110116
Median: 20111212
Q3: 20121017
95-th percentile: 20130704
Maximum: 20131227
Range: 41122
Interquartile range (IQR): 10900.5
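
The values of APLC_REGIST_DE look like dates encoded as YYYYMMDD integers, so the range (41122) and IQR above are differences between encoded integers rather than durations. The following is a minimal sketch, assuming every value is a valid calendar date in that encoding, that converts the reported minimum and maximum to dates and measures the span in days. `parse_yyyymmdd` is a hypothetical helper name.

```python
from datetime import date, datetime


def parse_yyyymmdd(value: int) -> date:
    """Convert an integer such as 20090105 to a date (assumes YYYYMMDD encoding)."""
    return datetime.strptime(str(value), "%Y%m%d").date()


# Minimum and maximum as reported in the quantile statistics.
earliest = parse_yyyymmdd(20090105)
latest = parse_yyyymmdd(20131227)

# Calendar span between the first and last date, in days.
span_days = (latest - earliest).days
print(earliest.isoformat(), latest.isoformat(), span_days)
```

Profiling the column after such a conversion would yield date-based statistics instead of integer ones.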

Descriptive statistics

Standard deviation: 11127.513
Coefficient of variation (CV): 0.00055323683
Kurtosis: -0.58499534
Mean: 20113471
Median Absolute Deviation (MAD): 9804
Skewness: -0.38300966
Sum: 1.8303259 × 10^10
Variance: 1.2382155 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
20130704 | 18 | 2.0%
20110714 | 16 | 1.8%
20121120 | 13 | 1.4%
20121203 | 11 | 1.2%
20121116 | 7 | 0.8%
20110207 | 7 | 0.8%
20121017 | 6 | 0.7%
20121204 | 6 | 0.7%
20121130 | 6 | 0.7%
20121127 | 6 | 0.7%
Other values (502) | 814 | 89.5%

Value | Count | Frequency (%)
20090105 | 1 | 0.1%
20090114 | 1 | 0.1%
20090119 | 1 | 0.1%
20090204 | 1 | 0.1%
20090209 | 1 | 0.1%
20090227 | 2 | 0.2%
20090312 | 1 | 0.1%
20090313 | 2 | 0.2%
20090317 | 1 | 0.1%
20090324 | 1 | 0.1%

Value | Count | Frequency (%)
20131227 | 1 | 0.1%
20131226 | 1 | 0.1%
20131220 | 1 | 0.1%
20131219 | 1 | 0.1%
20131218 | 1 | 0.1%
20131217 | 1 | 0.1%
20131129 | 1 | 0.1%
20131125 | 1 | 0.1%
20131115 | 1 | 0.1%
20131031 | 1 | 0.1%

APLC_REGIST_NATION
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 7.2 KiB

Length

Max length: 4
Median length: 4
Mean length: 3.9692308
Min length: 2

Unique

Unique: 2
Unique (%): 0.2%

Sample

1st row: 대한민국
2nd row: 대한민국
3rd row: 대한민국
4th row: 대한민국
5th row: 대한민국

Common Values

Value | Count | Frequency (%)
대한민국 | 895 | 98.4%
국제 | 9 | 1.0%
미국 | 4 | 0.4%
뉴질랜드 | 1 | 0.1%
중국 | 1 | 0.1%

Length

Histogram of lengths of the category


NATION_ISO_CODE
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 7.2 KiB

Length

Max length: 4
Median length: 2
Mean length: 2.0197802
Min length: 2

Unique

Unique: 2
Unique (%): 0.2%

Sample

1st row: KR
2nd row: KR
3rd row: KR
4th row: KR
5th row: KR

Common Values

Value | Count | Frequency (%)
KR | 895 | 98.4%
<NA> | 9 | 1.0%
US | 4 | 0.4%
NZ | 1 | 0.1%
CN | 1 | 0.1%

Length

Histogram of lengths of the category


Interactions


Correlations

 | APLC_REGIST_DE | APLC_REGIST_NATION | NATION_ISO_CODE
APLC_REGIST_DE | 1.000 | 0.000 | 0.000
APLC_REGIST_NATION | 0.000 | 1.000 | 1.000
NATION_ISO_CODE | 0.000 | 1.000 | 1.000

 | APLC_REGIST_NATION | NATION_ISO_CODE
APLC_REGIST_NATION | 1.000 | 1.000
NATION_ISO_CODE | 1.000 | 1.000

 | APLC_REGIST_DE | APLC_REGIST_NATION | NATION_ISO_CODE
APLC_REGIST_DE | 1.000 | 0.000 | 0.000
APLC_REGIST_NATION | 0.000 | 1.000 | 1.000
NATION_ISO_CODE | 0.000 | 1.000 | 1.000
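
A correlation of 1.000 between APLC_REGIST_NATION and NATION_ISO_CODE is what a one-to-one mapping between the two columns would produce. The sketch below computes a plain (uncorrected) Cramér's V from a contingency table. The diagonal table assumes each nation value pairs with exactly one ISO code, inferred from the matching counts (895, 9, 4, 1, 1). The report does not state which statistic the tool used, so this illustrates the idea rather than reproducing its method.

```python
import math


def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals))
    return math.sqrt(chi2 / (n * (k - 1)))


# Assumed one-to-one pairing: every row of the dataset falls on the diagonal.
counts = [895, 9, 4, 1, 1]
table = [[c if i == j else 0 for j in range(len(counts))] for i, c in enumerate(counts)]
v = cramers_v(table)
print(round(v, 3))
```

Because one column fully determines the other, keeping both adds no information for modelling.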

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

 | PATENT_NM | APLC_REGIST_PSN | APLC_REGIST_DE | APLC_REGIST_NATION | NATION_ISO_CODE
0 | 다시마 엽체를 이용한 미용 마스크 팩의 제조방법 | 대한민국(관리부서:국립수산과학원) | 20090114 | 대한민국 | KR
1 | NAD7 인트론4 영역에 특이적인 고려인삼 천풍종 구별용 SNP 프라이머 및 이를 이용한 고려인삼 천풍종 구별방법 | 경희대학교산학협력단 | 20090105 | 대한민국 | KR
2 | 온실용 탄산가스 발생 장치 | 서울대학교산학협력단 | 20130104 | 대한민국 | KR
3 | 밀순 추출물을 유효성분으로 포함하는 당뇨병 예방 및 치료용 조성물 | 전북대학교 | 20130115 | 대한민국 | KR
4 | 스트렙토코커스 마세도니커스 LC743 균주를 함유하는 모짜렐라 치즈 및 그 제조방법 | 한국식품연구원 | 20130107 | 대한민국 | KR
5 | 사카로미세스 세르바찌 MY7 및 락토바실러스 쿠르바투스 ML17함유 스타터를 이용한 묵은지의 제조방법 | 재단법인전라남도생물산업진흥재단 | 20130412 | 대한민국 | KR
6 | 고혈압 환자식용 저염단호박 고추장 소스 및 이의 제조방법 | 경희대학교산학협력단 | 20130118 | 대한민국 | KR
7 | 바실러스 속 KH-15의 콩 발효물을 유효성분으로 포함하는 당뇨병 예방 및 치료용 조성물 | 대구대학교산학협력단 | 20130121 | 대한민국 | KR
8 | 근원섬유 단백질의 가용화 방법 및 식품용 단백질의 제조방법 | 한국식품연구원 | 20130306 | 대한민국 | KR
9 | 인삼씨 배유 추출박을 포함하는 항산화 활성이 향상된 기능성 식품 및 이의 제조방법 | 한국식품연구원 | 20130307 | 대한민국 | KR

 | PATENT_NM | APLC_REGIST_PSN | APLC_REGIST_DE | APLC_REGIST_NATION | NATION_ISO_CODE
900 | 수수에 균주로서 아스퍼질러스 오리제를 접종하여 누룩 및 상기누룩을 이용한 발효주의 제조방법 | 한국식품연구원 | 20130704 | 대한민국 | KR
901 | 찹쌀흑미에 균주로서 아스퍼질러스 오리제를 접종하여 누룩 및 상기누룩을 이용한 발효주의 제조방법 | 한국식품연구원 | 20130704 | 대한민국 | KR
902 | 흑미에 균주로서 아스퍼질러스 나이거를 접종하여 누룩 및 상기누룩을 이용한 발효주의 제조방법 | 한국식품연구원 | 20130704 | 대한민국 | KR
903 | 메밀에 균주로서 아스퍼질러스 오리제를 접종하여 누룩 및 상기 누룩을 이용한 발효주의 제조방법 | 한국식품연구원 | 20130704 | 대한민국 | KR
904 | 주요 병원성 미생물 신속검출용 멀티플렉스 PCR 방법 및 이를 위한 PCR 프라이머 | 건국대학교산학협력단 | 20130430 | 대한민국 | KR
905 | 심근 단백질의 가용화 방법 및 식품용 단백질의 제조방법 | 한국식품연구원 | 20130703 | 대한민국 | KR
906 | 내부젤화 기술을 이용한 생리활성 물질 함유 베타락토글로불린 / 알긴산 나노에멀젼 전달체 및 그 제조방법 | 경상대학교산학협력단 | 20130704 | 대한민국 | KR
907 | 찹쌀에 균주로서 아스퍼질러스 오리제를 접종하여 누룩 및 상기누룩을 이용한 발효주의 제조방법 | 한국식품연구원 | 20130704 | 대한민국 | KR
908 | 현미에 균주로서 아스퍼질러스 오리제를 접종하여 누룩 및 상기 누룩을 이용한 발효주의 제조방법 | 한국식품연구원 | 20130704 | 대한민국 | KR
909 | 연근 가공 방법 | 목포대학교산학협력단 | 20090410 | 대한민국 | KR

Duplicate rows

Most frequently occurring

 | PATENT_NM | APLC_REGIST_PSN | APLC_REGIST_DE | APLC_REGIST_NATION | NATION_ISO_CODE | # duplicates
0 | 가스치환을 통한 생강의 보존 방법 | 한국식품연구원 | 20110324 | 대한민국 | KR | 2
1 | 근원섬유 단백질의 가용화 방법 및 식품용 단백질의 제조방법 | 한국식품연구원 | 20110131 | 대한민국 | KR | 2
2 | 글리세올린을 유효성분으로 함유하는 항산화 조성물 | 경북대학교산학협력단 | 20100209 | 대한민국 | KR | 2
3 | 김치로부터 분리된 유산균 및 상기 유산균을 이용한 발효식품 | 조선대학교산학협력단 | 20110420 | 대한민국 | KR | 2
4 | 노화방지용 화장료 조성물 | 대한민국(농촌진흥청장) | 20121221 | 대한민국 | KR | 2
5 | 마늘 예건 겸용 저온 저장고 | 대한민국 | 20120530 | 대한민국 | KR | 2
6 | 미립자 밀기울-중력분 혼합체 및 이의 제조방법 | 한국식품연구원 | 20110407 | 대한민국 | KR | 2
7 | 배추에서 분리된 신규 miRNA | 충남대학교산학협력단 | 20120806 | 대한민국 | KR | 2
8 | 배추에서 분리된 신규 miRNA | 충남대학교산학협력단 | 20120903 | 대한민국 | KR | 2
9 | 생강 절편 및 이의 제조방법 | 한국식품연구원 | 20110131 | 대한민국 | KR | 2