Overview

Dataset statistics

Number of variables: 8
Number of observations: 137
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 8.7 KiB
Average record size in memory: 65.0 B

Variable types

Text: 2
Categorical: 6

Dataset

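Description: Origin certification records for agricultural products, managed by the National Agricultural Products Quality Management Service (국립농산물품질관리원). Columns: certification number (인증번호), initial certification date (최초인증일자), certification validity period (인증유효기간), certified company name (인증업체명), product name (제품명), origin (원산지), ratio (비율), certification body (인증기관).
Author: National Agricultural Products Quality Management Service (국립농산물품질관리원)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20220609000000002101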

Alerts

원산지 has constant value "국산" (domestic) (Constant)
최초인증일자 is highly overall correlated with 인증유효기간 and 3 other fields (High correlation)
인증업체명 is highly overall correlated with 최초인증일자 and 3 other fields (High correlation)
인증유효기간 is highly overall correlated with 최초인증일자 and 3 other fields (High correlation)
비율 is highly overall correlated with 최초인증일자 and 2 other fields (High correlation)
인증기관 is highly overall correlated with 최초인증일자 and 2 other fields (High correlation)
인증기관 is highly imbalanced (59.7%) (Imbalance)
인증번호 has unique values (Unique)
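
The 59.7% imbalance figure for 인증기관 can be reproduced from the class counts this report gives for that column (126 and 11 rows). The sketch below assumes the score is one minus the Shannon entropy divided by its maximum for the number of classes. That formula matches the reported value, but the report itself does not state it, and the function name `imbalance_score` is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class shares.

    0.0 means perfectly balanced classes; 1.0 means a single class.
    The formula is an assumption that happens to reproduce the reported figure.
    """
    total = sum(counts)
    shares = [c / total for c in counts if c > 0]
    if len(shares) < 2:
        return 1.0  # a single class carries no entropy
    entropy = -sum(p * math.log2(p) for p in shares)
    return 1.0 - entropy / math.log2(len(shares))

# Class counts for 인증기관 taken from this report: 126 vs 11 rows.
score = imbalance_score([126, 11])
print(f"{score:.1%}")  # prints 59.7%
```

A perfectly balanced two-class column would score 0% under the same formula.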

Reproduction

Analysis started: 2024-03-23 07:48:52.379758
Analysis finished: 2024-03-23 07:48:53.796114
Duration: 1.42 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
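
A report of this kind can be regenerated with the `ProfileReport` class from ydata-profiling. The sketch below is a minimal outline and not the invocation actually used for this report: the function name, CSV path, output filename, and title are all hypothetical, and the third-party imports are deferred so the function can be defined without the libraries installed.

```python
def build_report(csv_path, html_path="origin_certification_report.html"):
    """Profile a CSV export of the dataset and write an HTML report.

    csv_path and html_path are hypothetical; point them at your own files.
    """
    # Deferred imports: pandas and ydata-profiling are third-party packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Read every column as text so dates and "95%"-style values stay as
    # published (an assumption; the source file's encoding may also differ).
    df = pd.read_csv(csv_path, dtype=str)

    report = ProfileReport(
        df,
        title="Origin certification profile",  # illustrative title
        # Dataset metadata; key names assumed from the 4.x documentation.
        dataset={
            "description": "Agricultural product origin certification records",
            "url": "https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20220609000000002101",
        },
    )
    report.to_file(html_path)
    return html_path
```

Calling `build_report("origin_certifications.csv")` with a local export of the dataset (the filename is again hypothetical) would write the HTML file and return its path.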

Variables

인증번호
Text

UNIQUE 

Distinct: 137
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB

Length

Max length: 13
Median length: 13
Mean length: 12.416058
Min length: 10

Characters and Unicode

Total characters: 1701
Distinct characters: 21
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
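
As a rough illustration of how such per-character properties can be looked up, the sketch below uses Python's standard `unicodedata` module to count general categories in the first sample value of 인증번호. The two-letter codes `Lo`, `Nd`, and `Zs` correspond to the "Other Letter", "Decimal Number", and "Space Separator" categories; the helper name `category_counts` is illustrative and is not part of the profiling tool.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count characters by Unicode general category (e.g. Lo, Nd, Zs)."""
    return Counter(unicodedata.category(ch) for ch in text)

# First sample value of 인증번호 from this report.
counts = category_counts("푸름 원산지 제1호")
print(counts["Lo"], counts["Nd"], counts["Zs"])  # prints 7 1 2
```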

Unique

Unique: 137
Unique (%): 100.0%

Sample

1st row: 푸름 원산지 제1호
2nd row: 푸름 원산지 제2호
3rd row: 푸름 원산지 제3호
4th row: 푸름 원산지 제4호
5th row: 푸름 원산지 제5호

Words

Value | Count | Frequency (%)
원산지 | 137 | 33.3%
식품연 | 126 | 30.7%
푸름 | 11 | 2.7%
제190호 | 1 | 0.2%
제152호 | 1 | 0.2%
제141호 | 1 | 0.2%
제142호 | 1 | 0.2%
제156호 | 1 | 0.2%
제155호 | 1 | 0.2%
제154호 | 1 | 0.2%
Other values (130) | 130 | 31.6%
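
The word shares for 인증번호 are consistent with counts divided by the total number of words: 137 values of three words each give 411 words, so 137/411 is 33.3% and 126/411 is 30.7%. The sketch below reproduces that calculation on the five sample values, assuming plain whitespace tokenization (the profiler's own tokenizer may differ); `word_frequencies` is an illustrative name.

```python
from collections import Counter

def word_frequencies(values):
    """Split each value on whitespace and return (word, count, share) rows."""
    words = Counter(word for value in values for word in value.split())
    total = sum(words.values())
    return [(w, n, n / total) for w, n in words.most_common()]

# The five sample values of 인증번호 shown in this report.
sample = [f"푸름 원산지 제{i}호" for i in range(1, 6)]
for word, count, share in word_frequencies(sample):
    print(word, count, f"{share:.1%}")
```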

Most occurring characters

Value | Count | Frequency (%)
(space) | 274 | 16.1%
원, 산, 지, 제, 호 (each) | 137 | 8.1%
식, 품, 연 (each) | 126 | 7.4%
1 | 99 | 5.8%
Other values (11) | 265 | 15.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1085 | 63.8%
Decimal Number | 342 | 20.1%
Space Separator | 274 | 16.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
원, 산, 지, 제, 호 (each) | 137 | 12.6%
식, 품, 연 (each) | 126 | 11.6%
푸, 름 (each) | 11 | 1.0%

Decimal Number
Value | Count | Frequency (%)
1 | 99 | 28.9%
6 | 34 | 9.9%
7 | 33 | 9.6%
3 | 32 | 9.4%
5 | 31 | 9.1%
4 | 29 | 8.5%
8 | 29 | 8.5%
2 | 24 | 7.0%
9 | 16 | 4.7%
0 | 15 | 4.4%

Space Separator
Value | Count | Frequency (%)
(space) | 274 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1085 | 63.8%
Common | 616 | 36.2%

Most frequent character per script

Common
Value | Count | Frequency (%)
(space) | 274 | 44.5%
1 | 99 | 16.1%
6 | 34 | 5.5%
7 | 33 | 5.4%
3 | 32 | 5.2%
5 | 31 | 5.0%
4 | 29 | 4.7%
8 | 29 | 4.7%
2 | 24 | 3.9%
9 | 16 | 2.6%

Hangul
Value | Count | Frequency (%)
원, 산, 지, 제, 호 (each) | 137 | 12.6%
식, 품, 연 (each) | 126 | 11.6%
푸, 름 (each) | 11 | 1.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1085 | 63.8%
ASCII | 616 | 36.2%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 274 | 44.5%
1 | 99 | 16.1%
6 | 34 | 5.5%
7 | 33 | 5.4%
3 | 32 | 5.2%
5 | 31 | 5.0%
4 | 29 | 4.7%
8 | 29 | 4.7%
2 | 24 | 3.9%
9 | 16 | 2.6%

Hangul
Value | Count | Frequency (%)
원, 산, 지, 제, 호 (each) | 137 | 12.6%
식, 품, 연 (each) | 126 | 11.6%
푸, 름 (each) | 11 | 1.0%

최초인증일자
Categorical

HIGH CORRELATION 

Distinct: 24
Distinct (%): 17.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB
Top values: 2019-12-09 (36), 2018-12-18 (18), 2020-01-21 (16), 2021-02-08 (11), 2020-01-20 (9), Other values (19): 47

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 10
Unique (%): 7.3%
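
"Distinct" and "Unique" measure different things here: 24 different dates occur in the column (24/137 = 17.5%), while only 10 of them occur exactly once (10/137 = 7.3%). The sketch below illustrates the distinction on the five sample values of 최초인증일자; the helper name `distinct_and_unique` is illustrative.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique): different values vs. values seen exactly once."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique

# The five sample values of 최초인증일자 shown in this report.
sample = ["2021-05-27", "2021-07-12", "2021-07-12", "2021-08-02", "2021-08-02"]
print(distinct_and_unique(sample))  # prints (3, 1)
```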

Sample

1st row: 2021-05-27
2nd row: 2021-07-12
3rd row: 2021-07-12
4th row: 2021-08-02
5th row: 2021-08-02

Common Values

Value | Count | Frequency (%)
2019-12-09 | 36 | 26.3%
2018-12-18 | 18 | 13.1%
2020-01-21 | 16 | 11.7%
2021-02-08 | 11 | 8.0%
2020-01-20 | 9 | 6.6%
2020-02-25 | 8 | 5.8%
2021-07-12 | 6 | 4.4%
2020-03-26 | 5 | 3.6%
2020-11-09 | 5 | 3.6%
2021-11-22 | 4 | 2.9%
Other values (14) | 19 | 13.9%

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
2019-12-09 | 36 | 26.3%
2018-12-18 | 18 | 13.1%
2020-01-21 | 16 | 11.7%
2021-02-08 | 11 | 8.0%
2020-01-20 | 9 | 6.6%
2020-02-25 | 8 | 5.8%
2021-07-12 | 6 | 4.4%
2020-03-26 | 5 | 3.6%
2020-11-09 | 5 | 3.6%
2021-11-22 | 4 | 2.9%
Other values (14) | 19 | 13.9%

인증유효기간
Categorical

HIGH CORRELATION 

Distinct: 24
Distinct (%): 17.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB
Top values: 2022-12-08 (36), 2024-12-17 (18), 2023-01-20 (16), 2024-02-07 (11), 2023-01-19 (9), Other values (19): 47

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 10
Unique (%): 7.3%

Sample

1st row: 2024-05-26
2nd row: 2024-07-11
3rd row: 2024-07-11
4th row: 2024-08-01
5th row: 2024-08-01
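
In the sample rows, each 인증유효기간 value falls exactly one day before the third anniversary of the matching 최초인증일자 value, which would help explain why the two columns are reported as perfectly correlated. The sketch below checks that relationship on the five sample pairs. The three-year term is inferred from those rows only, is not stated by the source, and may not hold for every record; `expected_expiry` is an illustrative name.

```python
from datetime import date, timedelta

def expected_expiry(start, years=3):
    """Return the day before the `years`-th anniversary of `start`."""
    try:
        anniversary = start.replace(year=start.year + years)
    except ValueError:
        # 29 February has no anniversary in a non-leap year; use 1 March.
        anniversary = date(start.year + years, 3, 1)
    return anniversary - timedelta(days=1)

# (최초인증일자, 인증유효기간) pairs from the first five sample rows.
pairs = [
    ("2021-05-27", "2024-05-26"),
    ("2021-07-12", "2024-07-11"),
    ("2021-07-12", "2024-07-11"),
    ("2021-08-02", "2024-08-01"),
    ("2021-08-02", "2024-08-01"),
]
for start, end in pairs:
    ok = expected_expiry(date.fromisoformat(start)) == date.fromisoformat(end)
    print(start, end, ok)
```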

Common Values

Value | Count | Frequency (%)
2022-12-08 | 36 | 26.3%
2024-12-17 | 18 | 13.1%
2023-01-20 | 16 | 11.7%
2024-02-07 | 11 | 8.0%
2023-01-19 | 9 | 6.6%
2023-02-24 | 8 | 5.8%
2024-07-11 | 6 | 4.4%
2023-03-25 | 5 | 3.6%
2023-11-08 | 5 | 3.6%
2024-11-21 | 4 | 2.9%
Other values (14) | 19 | 13.9%

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
2022-12-08 | 36 | 26.3%
2024-12-17 | 18 | 13.1%
2023-01-20 | 16 | 11.7%
2024-02-07 | 11 | 8.0%
2023-01-19 | 9 | 6.6%
2023-02-24 | 8 | 5.8%
2024-07-11 | 6 | 4.4%
2023-03-25 | 5 | 3.6%
2023-11-08 | 5 | 3.6%
2024-11-21 | 4 | 2.9%
Other values (14) | 19 | 13.9%

인증업체명
Categorical

HIGH CORRELATION 

Distinct: 35
Distinct (%): 25.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB
Top values: ㈜농가식품 (24), 서안동농협 풍산김치공장 (18), 고삼농협 안성마춤 푸드센터 (10), ㈜송원식품 (8), ㈜한성식품 서산지사 (7), Other values (30): 70

Length

Max length: 21
Median length: 15
Mean length: 9.189781
Min length: 3

Unique

Unique: 14
Unique (%): 10.2%

Sample

1st row: 농업회사법인 온샘㈜
2nd row: 태장고
3rd row: 태장고
4th row: 지보농협참기름가공공장
5th row: 지보농협참기름가공공장

Common Values

Value | Count | Frequency (%)
㈜농가식품 | 24 | 17.5%
서안동농협 풍산김치공장 | 18 | 13.1%
고삼농협 안성마춤 푸드센터 | 10 | 7.3%
㈜송원식품 | 8 | 5.8%
㈜한성식품 서산지사 | 7 | 5.1%
㈜한성식품 부천공장 | 7 | 5.1%
㈜효원 | 6 | 4.4%
안동제비원전통식품 | 5 | 3.6%
황금터영농조합법인 | 5 | 3.6%
서일농원 | 4 | 2.9%
Other values (25) | 43 | 31.4%

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
㈜농가식품 | 24 | 12.0%
풍산김치공장 | 18 | 9.0%
서안동농협 | 18 | 9.0%
㈜한성식품 | 15 | 7.5%
고삼농협 | 10 | 5.0%
안성마춤 | 10 | 5.0%
푸드센터 | 10 | 5.0%
㈜송원식품 | 8 | 4.0%
서산지사 | 7 | 3.5%
부천공장 | 7 | 3.5%
Other values (31) | 73 | 36.5%

제품명
Text

Distinct: 129
Distinct (%): 94.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB

Length

Max length: 20
Median length: 16
Mean length: 7.4160584
Min length: 3

Characters and Unicode

Total characters: 1016
Distinct characters: 223
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 123
Unique (%): 89.8%

Sample

1st row: 안동참마분말
2nd row: 새뜸된장
3rd row: 새뜸고추장
4th row: 참기름
5th row: 들기름

Words

Value | Count | Frequency (%)
포기김치 | 6 | 3.6%
애터미 | 5 | 3.0%
서분례명인 | 4 | 2.4%
안동제비원 | 2 | 1.2%
청결고춧가루 | 2 | 1.2%
맛김치 | 2 | 1.2%
시리얼 | 2 | 1.2%
구운요술콩 | 2 | 1.2%
위트밀 | 2 | 1.2%
오색소반 | 2 | 1.2%
Other values (135) | 138 | 82.6%

Most occurring characters

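Value | Count | Frequency (%)
[not preserved] | 57 | 5.6%
[not preserved] | 57 | 5.6%
(space) | 30 | 3.0%
[not preserved] | 29 | 2.9%
[not preserved] | 29 | 2.9%
[not preserved] | 28 | 2.8%
[not preserved] | 27 | 2.7%
[not preserved] | 27 | 2.7%
[not preserved] | 26 | 2.6%
[not preserved] | 20 | 2.0%
Other values (213) | 686 | 67.5%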

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 977 | 96.2%
Space Separator | 30 | 3.0%
Decimal Number | 4 | 0.4%
Lowercase Letter | 2 | 0.2%
Uppercase Letter | 1 | 0.1%
Open Punctuation | 1 | 0.1%
Close Punctuation | 1 | 0.1%

Most frequent character per category

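Other Letter
Value | Count | Frequency (%)
[not preserved] | 57 | 5.8%
[not preserved] | 57 | 5.8%
[not preserved] | 29 | 3.0%
[not preserved] | 29 | 3.0%
[not preserved] | 28 | 2.9%
[not preserved] | 27 | 2.8%
[not preserved] | 27 | 2.8%
[not preserved] | 26 | 2.7%
[not preserved] | 20 | 2.0%
[not preserved] | 16 | 1.6%
Other values (204) | 661 | 67.7%

Decimal Number
Value | Count | Frequency (%)
1 | 2 | 50.0%
0 | 1 | 25.0%
6 | 1 | 25.0%

Lowercase Letter
Value | Count | Frequency (%)
e | 1 | 50.0%
h | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 30 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
T | 1 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1 | 100.0%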

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 977 | 96.2%
Common | 36 | 3.5%
Latin | 3 | 0.3%

Most frequent character per script

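Hangul
Value | Count | Frequency (%)
[not preserved] | 57 | 5.8%
[not preserved] | 57 | 5.8%
[not preserved] | 29 | 3.0%
[not preserved] | 29 | 3.0%
[not preserved] | 28 | 2.9%
[not preserved] | 27 | 2.8%
[not preserved] | 27 | 2.8%
[not preserved] | 26 | 2.7%
[not preserved] | 20 | 2.0%
[not preserved] | 16 | 1.6%
Other values (204) | 661 | 67.7%

Common
Value | Count | Frequency (%)
(space) | 30 | 83.3%
1 | 2 | 5.6%
( | 1 | 2.8%
) | 1 | 2.8%
0 | 1 | 2.8%
6 | 1 | 2.8%

Latin
Value | Count | Frequency (%)
T | 1 | 33.3%
e | 1 | 33.3%
h | 1 | 33.3%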

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 977 | 96.2%
ASCII | 39 | 3.8%

Most frequent character per block

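Hangul
Value | Count | Frequency (%)
[not preserved] | 57 | 5.8%
[not preserved] | 57 | 5.8%
[not preserved] | 29 | 3.0%
[not preserved] | 29 | 3.0%
[not preserved] | 28 | 2.9%
[not preserved] | 27 | 2.8%
[not preserved] | 27 | 2.8%
[not preserved] | 26 | 2.7%
[not preserved] | 20 | 2.0%
[not preserved] | 16 | 1.6%
Other values (204) | 661 | 67.7%

ASCII
Value | Count | Frequency (%)
(space) | 30 | 76.9%
1 | 2 | 5.1%
T | 1 | 2.6%
( | 1 | 2.6%
e | 1 | 2.6%
h | 1 | 2.6%
) | 1 | 2.6%
0 | 1 | 2.6%
6 | 1 | 2.6%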

원산지
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB
Top values: 국산 (137)

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 국산
2nd row: 국산
3rd row: 국산
4th row: 국산
5th row: 국산

Common Values

Value | Count | Frequency (%)
국산 | 137 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
국산 | 137 | 100.0%

비율
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB
Top values: 100% (73), 95% (64)

Length

Max length: 4
Median length: 4
Mean length: 3.5328467
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 100%
2nd row: 100%
3rd row: 100%
4th row: 100%
5th row: 100%

Common Values

Value | Count | Frequency (%)
100% | 73 | 53.3%
95% | 64 | 46.7%
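
The figures reported for 비율 follow directly from the two value counts. As a quick arithmetic check, the sketch below recomputes the frequency shares and the mean string length (4 characters for "100%", 3 for "95%") from the counts in this report; the variable names are illustrative.

```python
# Value counts for 비율 as reported: "100%" in 73 rows, "95%" in 64 rows.
counts = {"100%": 73, "95%": 64}
total = sum(counts.values())

# Share of each value, as shown in the Frequency (%) column.
shares = {value: n / total for value, n in counts.items()}

# Mean string length, weighted by how often each value occurs.
mean_length = sum(len(value) * n for value, n in counts.items()) / total

print({value: f"{share:.1%}" for value, share in shares.items()})
print(round(mean_length, 7))
```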

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
100 | 73 | 53.3%
95 | 64 | 46.7%

인증기관
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB
Top values: 한국식품연구원 (126), 주식회사 푸름인증원 (11)

Length

Max length: 10
Median length: 7
Mean length: 7.2408759
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 주식회사 푸름인증원
2nd row: 주식회사 푸름인증원
3rd row: 주식회사 푸름인증원
4th row: 주식회사 푸름인증원
5th row: 주식회사 푸름인증원

Common Values

Value | Count | Frequency (%)
한국식품연구원 | 126 | 92.0%
주식회사 푸름인증원 | 11 | 8.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
한국식품연구원 | 126 | 85.1%
주식회사 | 11 | 7.4%
푸름인증원 | 11 | 7.4%

Correlations

 | 최초인증일자 | 인증유효기간 | 인증업체명 | 비율 | 인증기관
최초인증일자 | 1.000 | 1.000 | 0.998 | 0.824 | 0.989
인증유효기간 | 1.000 | 1.000 | 0.998 | 0.824 | 0.989
인증업체명 | 0.998 | 0.998 | 1.000 | 0.927 | 0.985
비율 | 0.824 | 0.824 | 0.927 | 1.000 | 0.362
인증기관 | 0.989 | 0.989 | 0.985 | 0.362 | 1.000

 | 최초인증일자 | 인증업체명 | 인증기관 | 인증유효기간 | 비율
최초인증일자 | 1.000 | 0.911 | 0.839 | 1.000 | 0.624
인증업체명 | 0.911 | 1.000 | 0.840 | 0.911 | 0.744
인증기관 | 0.839 | 0.840 | 1.000 | 0.839 | 0.235
인증유효기간 | 1.000 | 0.911 | 0.839 | 1.000 | 0.624
비율 | 0.624 | 0.744 | 0.235 | 0.624 | 1.000

 | 최초인증일자 | 인증유효기간 | 인증업체명 | 비율 | 인증기관
최초인증일자 | 1.000 | 1.000 | 0.911 | 0.624 | 0.839
인증유효기간 | 1.000 | 1.000 | 0.911 | 0.624 | 0.839
인증업체명 | 0.911 | 0.911 | 1.000 | 0.744 | 0.840
비율 | 0.624 | 0.624 | 0.744 | 1.000 | 0.235
인증기관 | 0.839 | 0.839 | 0.840 | 0.235 | 1.000
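
The report does not name the association measures behind these matrices. One common choice for pairs of categorical columns is Cramér's V, so the sketch below shows the plain variant without bias correction, purely as an assumed illustration; the profiler may use a corrected or different statistic. It is applied to the 최초인증일자 and 인증유효기간 values of the first ten sample rows, where each start date maps to exactly one end date.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Plain (uncorrected) Cramér's V between two categorical sequences."""
    n = len(xs)
    x_counts, y_counts = Counter(xs), Counter(ys)
    pair_counts = Counter(zip(xs, ys))
    k = min(len(x_counts), len(y_counts)) - 1
    if n == 0 or k == 0:
        return 0.0  # undefined for a constant column; report no association
    chi2 = 0.0
    for x, nx in x_counts.items():
        for y, ny in y_counts.items():
            expected = nx * ny / n
            observed = pair_counts.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))

# 최초인증일자 and 인증유효기간 from the first ten sample rows of this report.
start = ["2021-05-27", "2021-07-12", "2021-07-12", "2021-08-02", "2021-08-02",
         "2021-08-02", "2021-08-25", "2021-11-22", "2021-11-22", "2021-11-22"]
end = ["2024-05-26", "2024-07-11", "2024-07-11", "2024-08-01", "2024-08-01",
       "2024-08-01", "2024-08-24", "2024-11-21", "2024-11-21", "2024-11-21"]
print(round(cramers_v(start, end), 3))  # prints 1.0
```

A one-to-one mapping between two columns gives the maximum value of 1, which is in line with the 1.000 entries for this pair in all three matrices.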

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 인증번호 | 최초인증일자 | 인증유효기간 | 인증업체명 | 제품명 | 원산지 | 비율 | 인증기관
0 | 푸름 원산지 제1호 | 2021-05-27 | 2024-05-26 | 농업회사법인 온샘㈜ | 안동참마분말 | 국산 | 100% | 주식회사 푸름인증원
1 | 푸름 원산지 제2호 | 2021-07-12 | 2024-07-11 | 태장고 | 새뜸된장 | 국산 | 100% | 주식회사 푸름인증원
2 | 푸름 원산지 제3호 | 2021-07-12 | 2024-07-11 | 태장고 | 새뜸고추장 | 국산 | 100% | 주식회사 푸름인증원
3 | 푸름 원산지 제4호 | 2021-08-02 | 2024-08-01 | 지보농협참기름가공공장 | 참기름 | 국산 | 100% | 주식회사 푸름인증원
4 | 푸름 원산지 제5호 | 2021-08-02 | 2024-08-01 | 지보농협참기름가공공장 | 들기름 | 국산 | 100% | 주식회사 푸름인증원
5 | 푸름 원산지 제6호 | 2021-08-02 | 2024-08-01 | 지보농협참기름가공공장 | 볶음참깨 | 국산 | 100% | 주식회사 푸름인증원
6 | 푸름 원산지 제7호 | 2021-08-25 | 2024-08-24 | 대풍년영농조합법인 | 친정엄마꾸러미 요리 앤 고춧가루 | 국산 | 100% | 주식회사 푸름인증원
7 | 푸름 원산지 제8호 | 2021-11-22 | 2024-11-21 | 농업회사법인 어울림(유) | 구운요술콩 백태 | 국산 | 100% | 주식회사 푸름인증원
8 | 푸름 원산지 제9호 | 2021-11-22 | 2024-11-21 | 농업회사법인 어울림(유) | 구운요술콩 서리태 | 국산 | 100% | 주식회사 푸름인증원
9 | 푸름 원산지 제10호 | 2021-11-22 | 2024-11-21 | 농업회사법인 어울림(유) | 위트밀 구운통곡물 시리얼 | 국산 | 100% | 주식회사 푸름인증원

Last rows

Index | 인증번호 | 최초인증일자 | 인증유효기간 | 인증업체명 | 제품명 | 원산지 | 비율 | 인증기관
127 | 식품연 원산지 제183호 | 2021-03-02 | 2024-03-01 | 남영양농협가공사업소 | 햇살촌영양청결고춧가루 | 국산 | 100% | 한국식품연구원
128 | 식품연 원산지 제184호 | 2021-03-02 | 2024-03-01 | 남영양농협가공사업소 | 참고춧가루 | 국산 | 100% | 한국식품연구원
129 | 식품연 원산지 제185호 | 2021-03-09 | 2024-03-08 | 신태인농협청결고춧가루가공공장 | 단풍고춧가루 | 국산 | 100% | 한국식품연구원
130 | 식품연 원산지 제186호 | 2021-05-10 | 2024-05-09 | 안면도농협고추가공공장 | 안면도태양초고춧가루 | 국산 | 100% | 한국식품연구원
131 | 식품연 원산지 제187호 | 2021-06-14 | 2024-06-13 | 맑은샘자연교육농원 | 조금자채소잡곡 | 국산 | 100% | 한국식품연구원
132 | 식품연 원산지 제188호 | 2021-06-14 | 2024-06-13 | 맑은샘자연교육농원 | 조금자채소볼 | 국산 | 100% | 한국식품연구원
133 | 식품연 원산지 제189호 | 2021-07-12 | 2024-07-11 | 서일농원 | 서분례명인 청국장 | 국산 | 100% | 한국식품연구원
134 | 식품연 원산지 제190호 | 2021-07-12 | 2024-07-11 | 서일농원 | 서분례명인 매운청국장 | 국산 | 100% | 한국식품연구원
135 | 식품연 원산지 제191호 | 2021-07-12 | 2024-07-11 | 서일농원 | 서분례명인 마늘청국장 | 국산 | 100% | 한국식품연구원
136 | 식품연 원산지 제192호 | 2021-07-12 | 2024-07-11 | 서일농원 | 서분례명인 들깨청국장 | 국산 | 100% | 한국식품연구원