Overview

Dataset statistics

Number of variables: 9
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 4
Duplicate rows (%): < 0.1%
Total size in memory: 810.5 KiB
Average record size in memory: 83.0 B
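
The average record size can be cross-checked from the other figures in this panel: total memory divided by the number of observations. A minimal check, assuming 1 KiB = 1024 bytes (the report does not spell out the unit convention):

```python
# Figures copied from the dataset statistics panel.
total_size_kib = 810.5
n_observations = 10000

# Assumption: KiB means 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```

The result rounds to the 83.0 B shown in the report.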

Variable types

Numeric: 3
Categorical: 3
Text: 1
DateTime: 2

Dataset

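Description: Certification information for antibiotic-free livestock products. Fields: certification number (인증번호), certification type name (인증종류명), certified farm (인증농가), certified item name (인증품목명), cultivation (workplace) area or head of livestock raised (재배(작업장)면적(사육두수)), planned production (import) volume (생산(수입)계획량), certification period start date (인증기간(시작일)), certification period end date (인증기간(종료일)), raw material certification category (원재료인증구분).
Author: 국립농산물품질관리원 (National Agricultural Products Quality Management Service)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20220711000000002154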

Alerts

Dataset has 4 (< 0.1%) duplicate rows (Duplicates)
원재료인증구분 is highly overall correlated with 재배(작업장)면적(사육두수) and 2 other fields (High correlation)
인증종류명 is highly overall correlated with 인증번호 and 1 other fields (High correlation)
인증번호 is highly overall correlated with 인증종류명 (High correlation)
재배(작업장)면적(사육두수) is highly overall correlated with 생산(수입)계획량 and 1 other fields (High correlation)
생산(수입)계획량 is highly overall correlated with 재배(작업장)면적(사육두수) (High correlation)
인증품목명 is highly overall correlated with 원재료인증구분 (High correlation)
원재료인증구분 is highly imbalanced (66.0%) (Imbalance)
생산(수입)계획량 is highly skewed (γ1 = 98.17438369) (Skewed)

Reproduction

Analysis started: 2024-01-05 22:18:42.002601
Analysis finished: 2024-01-05 22:18:47.418061
Duration: 5.42 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

인증번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8064
Distinct (%): 80.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12969751
Minimum: 1600002
Maximum: 18600124
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1600002
5-th percentile: 4500023
Q1: 10502981
median: 13502554
Q3: 15502836
95-th percentile: 17501876
Maximum: 18600124
Range: 17000122
Interquartile range (IQR): 4999855.5

Descriptive statistics

Standard deviation: 3669846.7
Coefficient of variation (CV): 0.28295428
Kurtosis: 1.0998589
Mean: 12969751
Median Absolute Deviation (MAD): 2097748
Skewness: -1.1085185
Sum: 1.2969751 × 10^11
Variance: 1.3467775 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
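
Several of the 인증번호 statistics are derived from others in the same panel, so they can be recomputed as a sanity check. The sketch below uses only the rounded values printed in the report, so small differences from the reported figures are expected (for example, the reported IQR of 4999855.5 suggests the displayed quartiles are themselves rounded).

```python
# Values copied from the 인증번호 quantile and descriptive statistics.
minimum, maximum = 1600002, 18600124
q1, q3 = 10502981, 15502836
mean, std = 12969751, 3669846.7

value_range = maximum - minimum   # reported: 17000122
iqr = q3 - q1                     # reported: 4999855.5 (from unrounded quartiles)
cv = std / mean                   # reported: 0.28295428
variance = std ** 2               # reported: 1.3467775 × 10^13

print(value_range, iqr, round(cv, 6), f"{variance:.4e}")
```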

Common values
Value    Count    Frequency (%)
17501679    75    0.8%
12501871    52    0.5%
17501796    48    0.5%
17501795    40    0.4%
16502137    23    0.2%
15501734    21    0.2%
13501838    21    0.2%
15500004    18    0.2%
10502700    17    0.2%
16501986    15    0.1%
Other values (8054)    9670    96.7%

Minimum 10 values
Value    Count    Frequency (%)
1600002    1    < 0.1%
1600010    1    < 0.1%
1600011    2    < 0.1%
1600013    4    < 0.1%
1600017    1    < 0.1%
1600018    1    < 0.1%
1600020    2    < 0.1%
1600021    1    < 0.1%
1600022    1    < 0.1%
1600024    1    < 0.1%

Maximum 10 values
Value    Count    Frequency (%)
18600124    1    < 0.1%
18600122    1    < 0.1%
18600120    1    < 0.1%
18600115    2    < 0.1%
18600113    1    < 0.1%
18600112    1    < 0.1%
18600107    1    < 0.1%
18600105    1    < 0.1%
18600093    2    < 0.1%
18600089    2    < 0.1%

인증종류명
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
무항생제축산물    7663
취급자    2337

Length

Max length: 7
Median length: 7
Mean length: 6.0652
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 무항생제축산물
2nd row: 무항생제축산물
3rd row: 무항생제축산물
4th row: 취급자
5th row: 무항생제축산물

Common Values

Value    Count    Frequency (%)
무항생제축산물    7663    76.6%
취급자    2337    23.4%

Length

Histogram of lengths of the category


인증농가
Text

Distinct: 7349
Distinct (%): 73.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 25
Median length: 3
Mean length: 4.1973
Min length: 2

Characters and Unicode

Total characters: 41973
Distinct characters: 470
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5755
Unique (%): 57.6%

Sample

1st row: 성원농장(공준식)
2nd row: 김영진
3rd row: 금정농장 김희숙
4th row: 권안나
5th row: 정지연

Value    Count    Frequency (%)
강희석    128    1.2%
농업회사법인    40    0.4%
이제훈    39    0.4%
박홍진    35    0.3%
주식회사    19    0.2%
인용식    18    0.2%
권민석    18    0.2%
예당한우    18    0.2%
이범호,안형철    17    0.2%
권혁수    14    0.1%
Other values (7851)    10590    96.8%

Most occurring characters

ValueCountFrequency (%)
2094
 
5.0%
1716
 
4.1%
1335
 
3.2%
1076
 
2.6%
1015
 
2.4%
1009
 
2.4%
) 1001
 
2.4%
( 1001
 
2.4%
938
 
2.2%
870
 
2.1%
Other values (460) 29918
71.3%

Most occurring categories

Value    Count    Frequency (%)
Other Letter    38660    92.1%
Close Punctuation    1001    2.4%
Open Punctuation    1001    2.4%
Space Separator    938    2.2%
Other Punctuation    244    0.6%
Decimal Number    83    0.2%
Uppercase Letter    45    0.1%
Dash Punctuation    1    < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2094
 
5.4%
1716
 
4.4%
1335
 
3.5%
1076
 
2.8%
1015
 
2.6%
1009
 
2.6%
870
 
2.3%
707
 
1.8%
571
 
1.5%
564
 
1.5%
Other values (428) 27703
71.7%
Uppercase Letter
ValueCountFrequency (%)
F 7
15.6%
H 6
13.3%
S 6
13.3%
B 5
11.1%
M 5
11.1%
D 3
6.7%
N 3
6.7%
G 2
 
4.4%
C 2
 
4.4%
U 1
 
2.2%
Other values (5) 5
11.1%
Decimal Number
ValueCountFrequency (%)
2 39
47.0%
1 31
37.3%
3 5
 
6.0%
5 3
 
3.6%
4 2
 
2.4%
0 2
 
2.4%
8 1
 
1.2%
Other Punctuation
ValueCountFrequency (%)
, 217
88.9%
/ 16
 
6.6%
& 5
 
2.0%
. 3
 
1.2%
· 2
 
0.8%
: 1
 
0.4%
Close Punctuation
ValueCountFrequency (%)
) 1001
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1001
100.0%
Space Separator
ValueCountFrequency (%)
938
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

Value    Count    Frequency (%)
Hangul    38660    92.1%
Common    3268    7.8%
Latin    45    0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2094
 
5.4%
1716
 
4.4%
1335
 
3.5%
1076
 
2.8%
1015
 
2.6%
1009
 
2.6%
870
 
2.3%
707
 
1.8%
571
 
1.5%
564
 
1.5%
Other values (428) 27703
71.7%
Common
ValueCountFrequency (%)
) 1001
30.6%
( 1001
30.6%
938
28.7%
, 217
 
6.6%
2 39
 
1.2%
1 31
 
0.9%
/ 16
 
0.5%
& 5
 
0.2%
3 5
 
0.2%
5 3
 
0.1%
Other values (7) 12
 
0.4%
Latin
ValueCountFrequency (%)
F 7
15.6%
H 6
13.3%
S 6
13.3%
B 5
11.1%
M 5
11.1%
D 3
6.7%
N 3
6.7%
G 2
 
4.4%
C 2
 
4.4%
U 1
 
2.2%
Other values (5) 5
11.1%

Most occurring blocks

Value    Count    Frequency (%)
Hangul    38660    92.1%
ASCII    3311    7.9%
None    2    < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2094
 
5.4%
1716
 
4.4%
1335
 
3.5%
1076
 
2.8%
1015
 
2.6%
1009
 
2.6%
870
 
2.3%
707
 
1.8%
571
 
1.5%
564
 
1.5%
Other values (428) 27703
71.7%
ASCII
ValueCountFrequency (%)
) 1001
30.2%
( 1001
30.2%
938
28.3%
, 217
 
6.6%
2 39
 
1.2%
1 31
 
0.9%
/ 16
 
0.5%
F 7
 
0.2%
H 6
 
0.2%
S 6
 
0.2%
Other values (21) 49
 
1.5%
None
ValueCountFrequency (%)
· 2
100.0%

인증품목명
Categorical

HIGH CORRELATION 

Distinct: 21
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
한우(식육)    4689
돼지(식육)    1715
육계(식육)    1321
산란계(알)    848
오리(식육)    676
Other values (16)    751

Length

Max length: 9
Median length: 6
Mean length: 5.9865
Min length: 2

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: 한우(식육)
2nd row: 한우(식육)
3rd row: 산란계(알)
4th row: 한우(식육)
5th row: 육계(식육)

Common Values

Value    Count    Frequency (%)
한우(식육)    4689    46.9%
돼지(식육)    1715    17.2%
육계(식육)    1321    13.2%
산란계(알)    848    8.5%
오리(식육)    676    6.8%
육우(식육)    227    2.3%
젖소(시유)    225    2.2%
산란육성계    157    1.6%
메추리 알    73    0.7%
재래 산양(염소)    34    0.3%
Other values (11)    35    0.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
한우(식육 4689
46.3%
돼지(식육 1715
 
16.9%
육계(식육 1321
 
13.1%
산란계(알 848
 
8.4%
오리(식육 676
 
6.7%
육우(식육 227
 
2.2%
젖소(시유 225
 
2.2%
산란육성계 157
 
1.6%
메추리 76
 
0.8%
73
 
0.7%
Other values (14) 112
 
1.1%

재배(작업장)면적(사육두수)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 2084
Distinct (%): 20.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22554.142
Minimum: 0
Maximum: 2220000
Zeros: 4
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 26
Q1: 84
median: 210
Q3: 7000
95-th percentile: 111000
Maximum: 2220000
Range: 2220000
Interquartile range (IQR): 6916

Descriptive statistics

Standard deviation: 77690.727
Coefficient of variation (CV): 3.4446323
Kurtosis: 181.5306
Mean: 22554.142
Median Absolute Deviation (MAD): 170
Skewness: 10.534571
Sum: 2.2554142 × 10^8
Variance: 6.035849 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value    Count    Frequency (%)
100    103    1.0%
60    94    0.9%
70    79    0.8%
40    76    0.8%
50    74    0.7%
200    65    0.7%
80    63    0.6%
30    61    0.6%
150    58    0.6%
130    56    0.6%
Other values (2074)    9271    92.7%

Minimum 10 values
Value    Count    Frequency (%)
0    4    < 0.1%
1    13    0.1%
2    11    0.1%
3    14    0.1%
4    14    0.1%
5    11    0.1%
6    10    0.1%
7    17    0.2%
8    19    0.2%
9    7    0.1%

Maximum 10 values
Value    Count    Frequency (%)
2220000    1    < 0.1%
1840000    1    < 0.1%
1750000    1    < 0.1%
1380000    1    < 0.1%
1330000    1    < 0.1%
1285200    1    < 0.1%
1276224    1    < 0.1%
1050000    2    < 0.1%
1025610    1    < 0.1%
960000    1    < 0.1%

생산(수입)계획량
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 2904
Distinct (%): 29.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1551924.6
Minimum: 0
Maximum: 9.6808062 × 10^9
Zeros: 23
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 735
Q1: 8000
median: 37100
Q3: 388050
95-th percentile: 1500000
Maximum: 9.6808062 × 10^9
Range: 9.6808062 × 10^9
Interquartile range (IQR): 380050

Descriptive statistics

Standard deviation: 97436454
Coefficient of variation (CV): 62.784269
Kurtosis: 9743.2117
Mean: 1551924.6
Median Absolute Deviation (MAD): 35900
Skewness: 98.174384
Sum: 1.5519246 × 10^10
Variance: 9.4938626 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
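
The skewness of 98.17 reported for 생산(수입)계획량 reflects a handful of very large values (the maximum is about 9.68 × 10^9 against a median of 37100). As a rough illustration of how such a figure arises, the sketch below implements the bias-adjusted Fisher-Pearson sample skewness in plain Python. That the profiler uses this exact estimator is an assumption (it matches the definition used by pandas `Series.skew`); the input lists are hypothetical toy data, not values from the dataset.

```python
import math

def sample_skewness(values):
    """Bias-adjusted Fisher-Pearson skewness (G1) of a sequence of numbers."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least 3 observations")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant data: treat as unskewed
    g1 = m3 / m2 ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1

symmetric = [1, 2, 3, 4, 5]        # hypothetical toy data
long_tail = [1, 1, 1, 1, 100]      # one extreme value drags the tail right

print(sample_skewness(symmetric))  # symmetric data gives zero skewness
print(sample_skewness(long_tail))  # a single outlier gives a large positive value
```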

Common values
Value    Count    Frequency (%)
10000.0    266    2.7%
1.0    184    1.8%
20000.0    157    1.6%
7000.0    148    1.5%
1000.0    147    1.5%
5000.0    145    1.5%
50000.0    143    1.4%
2000.0    136    1.4%
30000.0    128    1.3%
3500.0    120    1.2%
Other values (2894)    8426    84.3%

Minimum 10 values
Value    Count    Frequency (%)
0.0    23    0.2%
0.1    7    0.1%
1.0    184    1.8%
2.0    43    0.4%
3.0    21    0.2%
4.0    10    0.1%
5.0    4    < 0.1%
6.0    2    < 0.1%
10.0    2    < 0.1%
30.0    1    < 0.1%

Maximum 10 values
Value    Count    Frequency (%)
9680806200.0    1    < 0.1%
1000000000.0    1    < 0.1%
329000000.0    1    < 0.1%
300000000.0    1    < 0.1%
73000000.0    1    < 0.1%
70000000.0    1    < 0.1%
60000000.0    2    < 0.1%
49000000.0    1    < 0.1%
36000000.0    2    < 0.1%
32850000.0    1    < 0.1%

인증기간(시작일)
DateTime

Distinct: 362
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2022-06-17 00:00:00
Maximum: 2023-06-16 00:00:00
Histogram with fixed size bins (bins=50)

인증기간(종료일)
DateTime

Distinct: 362
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2023-06-16 00:00:00
Maximum: 2024-06-15 00:00:00
Histogram with fixed size bins (bins=50)

원재료인증구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
<NA>    7673
무항생제축산물    2321
취급자    3
유기농산물    2
무농약원료가공식품    1

Length

Max length: 9
Median length: 4
Mean length: 4.6967
Min length: 3

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: 무항생제축산물
5th row: <NA>

Common Values

Value    Count    Frequency (%)
<NA>    7673    76.7%
무항생제축산물    2321    23.2%
취급자    3    < 0.1%
유기농산물    2    < 0.1%
무농약원료가공식품    1    < 0.1%
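
The 66.0% imbalance figure in the alert for 원재료인증구분 can be reproduced from the counts above. The sketch below assumes the score is one minus the Shannon entropy of the class distribution, normalised by the maximum possible entropy for that number of classes. The report does not state the formula, so treat it as an assumption that happens to agree with the reported value.

```python
import math

def imbalance_score(counts):
    """1 - H(p) / log2(k): 0 for a uniform distribution, near 1 when one class dominates."""
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Counts copied from the Common Values table (the literal "<NA>" string is one class).
counts = [7673, 2321, 3, 2, 1]
score = imbalance_score(counts)
print(f"{score:.1%}")
```

With these counts the score comes out at roughly 0.66, in line with the alert. Note that "<NA>" is counted as a category here because the report shows 0 missing values for this field, which suggests the placeholder was stored as text.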

Length

Histogram of lengths of the category


Correlations

Field    인증번호    인증종류명    인증품목명    재배(작업장)면적(사육두수)    생산(수입)계획량    원재료인증구분
인증번호    1.000    0.560    0.312    0.010    0.000    0.133
인증종류명    0.560    1.000    0.414    0.084    0.013    NaN
인증품목명    0.312    0.414    1.000    0.239    0.000    0.784
재배(작업장)면적(사육두수)    0.010    0.084    0.239    1.000    0.000    NaN
생산(수입)계획량    0.000    0.013    0.000    0.000    1.000    0.000
원재료인증구분    0.133    NaN    0.784    NaN    0.000    1.000

Field    원재료인증구분    인증품목명    인증종류명
원재료인증구분    1.000    0.576    1.000
인증품목명    0.576    1.000    0.364
인증종류명    1.000    0.364    1.000
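
The matrix over the three categorical fields reports a value of 1.000 between 원재료인증구분 and 인증종류명. The extracted text does not name the association measure; a common choice for categorical pairs is Cramér's V, which is assumed here purely for illustration. The sketch below computes the uncorrected form in plain Python (profilers often apply a bias correction, omitted here) on the five sample rows shown for these two fields.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals, col_totals = Counter(xs), Counter(ys)
    chi2 = 0.0
    for r, r_count in row_totals.items():
        for c, c_count in col_totals.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return float("nan")  # undefined when one variable is constant
    return math.sqrt(chi2 / (n * k))

# First five sample rows of 인증종류명 and 원재료인증구분 from this report.
cert_type = ["무항생제축산물", "무항생제축산물", "무항생제축산물", "취급자", "무항생제축산물"]
raw_material = ["<NA>", "<NA>", "<NA>", "무항생제축산물", "<NA>"]

print(round(cramers_v(cert_type, raw_material), 3))
```

In these five rows each certification type maps to exactly one raw material category, which is the pattern that drives an association of 1.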

Field    인증번호    재배(작업장)면적(사육두수)    생산(수입)계획량    인증종류명    인증품목명    원재료인증구분
인증번호    1.000    0.112    0.079    0.563    0.125    0.085
재배(작업장)면적(사육두수)    0.112    1.000    0.767    0.065    0.090    1.000
생산(수입)계획량    0.079    0.767    1.000    0.021    0.000    0.000
인증종류명    0.563    0.065    0.021    1.000    0.364    1.000
인증품목명    0.125    0.090    0.000    0.364    1.000    0.576
원재료인증구분    0.085    1.000    0.000    1.000    0.576    1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

인증번호인증종류명인증농가인증품목명재배(작업장)면적(사육두수)생산(수입)계획량인증기간(시작일)인증기간(종료일)원재료인증구분
977610500050무항생제축산물성원농장(공준식)한우(식육)19518000.02022-08-272023-08-26<NA>
730016502289무항생제축산물김영진한우(식육)12532000.02022-09-162023-09-15<NA>
615114501908무항생제축산물금정농장 김희숙산란계(알)42000735840.02022-07-292023-07-28<NA>
41167600141취급자권안나한우(식육)314116800.02022-07-202023-07-19무항생제축산물
873614502187무항생제축산물정지연육계(식육)55000727650.02022-10-172023-10-16<NA>
575810600688취급자유애자육우(식육)481000.02023-04-232024-04-22무항생제축산물
617510600815취급자광성유통(지현구)산란계(알)4951100000.02022-07-152023-07-14무항생제축산물
819213502458무항생제축산물정상준육계(식육)37000439062.02023-05-062024-05-05<NA>
878414502194무항생제축산물인용식한우(식육)5161.02022-11-032023-11-02<NA>
350615500050무항생제축산물유명순한우(식육)7610000.02022-09-062023-09-05<NA>
인증번호인증종류명인증농가인증품목명재배(작업장)면적(사육두수)생산(수입)계획량인증기간(시작일)인증기간(종료일)원재료인증구분
221315502418무항생제축산물김동구한우(식육)1397350.02023-04-302024-04-29<NA>
169816500147무항생제축산물안외숙육계(식육)66000785400.02022-08-072023-08-06<NA>
375411500123무항생제축산물홍근협돼지(식육)1300198825.02022-12-072023-12-06<NA>
173510501519무항생제축산물김남용육계(식육)35000336600.02022-10-172023-10-16<NA>
414010500088무항생제축산물허원준, 김혜곤돼지(식육)1600377800.02022-08-302023-08-29<NA>
660114600161취급자최규완육계(식육)3151000000.02022-12-242023-12-23무항생제축산물
434510500322무항생제축산물남양주축산농협한우(식육)45047040.02022-11-192023-11-18<NA>
37087500011무항생제축산물김진연한우(식육)1128000.02022-12-142023-12-13<NA>
95153600362취급자이은정돼지(식육)1462000.02023-04-062024-04-05무항생제축산물
340215502166무항생제축산물김승근한우(식육)2004500.02023-03-022024-03-01<NA>

Duplicate rows

Most frequently occurring

인증번호인증종류명인증농가인증품목명재배(작업장)면적(사육두수)생산(수입)계획량인증기간(시작일)인증기간(종료일)원재료인증구분# duplicates
010502015무항생제축산물주은농장(백정기)육계(식육)17500208250.02023-02-182024-02-17<NA>2
114501181무항생제축산물이동준육계(식육)65000791700.02022-12-062023-12-05<NA>2
215500284무항생제축산물박남주육계(식육)31500362000.02022-09-302023-09-29<NA>2
316502026무항생제축산물권민석한우(식육)446200700.02023-04-252024-04-24<NA>2
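
The duplicate-row table above lists each repeated record with the number of times it occurs. A minimal sketch of the same count using only the standard library follows. The records are truncated to four fields (certification number, type, farm, item) taken from this report, and the choice of rows is illustrative rather than a reproduction of the profiler's own logic.

```python
from collections import Counter

# Illustrative records: (인증번호, 인증종류명, 인증농가, 인증품목명).
rows = [
    (10502015, "무항생제축산물", "주은농장(백정기)", "육계(식육)"),
    (14501181, "무항생제축산물", "이동준", "육계(식육)"),
    (10502015, "무항생제축산물", "주은농장(백정기)", "육계(식육)"),
    (10500050, "무항생제축산물", "성원농장(공준식)", "한우(식육)"),
    (14501181, "무항생제축산물", "이동준", "육계(식육)"),
]

# Count identical records and keep those that occur more than once.
occurrences = Counter(rows)
duplicated = {row: n for row, n in occurrences.items() if n > 1}

for row, n in duplicated.items():
    print(row[0], row[2], n)
```

Before deduplicating the real dataset, it is worth checking whether rows that agree on every field are true repeats or separate certifications that happen to share all recorded values.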