Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 752.0 KiB
Average record size in memory: 77.0 B
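
The average record size appears to be the total memory size divided by the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes (the variable names are illustrative, not taken from the report):

```python
# Figures copied from the dataset statistics above.
total_size_kib = 752.0
n_observations = 10_000

# Assumption: "KiB" means 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```

The result rounds to the 77.0 B shown in the report.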

Variable types

Text: 1
Categorical: 2
Numeric: 5

Dataset

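Description: Agricultural and livestock product distribution inspection data managed by the National Agricultural Products Quality Management Service (국립농산물품질관리원). Columns: 처분년월 (disposition year-month), 업무구분명 (task category), 시도명 (province/city), 조사장소수 (number of locations inspected), 위반업소수 (number of violating businesses), 형사처벌건수 (number of criminal penalties), 고발건수 (number of accusations filed), 과태료부과건수 (number of administrative fines imposed).
Author: National Agricultural Products Quality Management Service (국립농산물품질관리원)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20220204000000001683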

Alerts

조사장소수 is highly overall correlated with 과태료부과건수 (High correlation)
위반업소수 is highly overall correlated with 형사처벌건수 and 1 other field (High correlation)
형사처벌건수 is highly overall correlated with 위반업소수 (High correlation)
과태료부과건수 is highly overall correlated with 조사장소수 and 1 other field (High correlation)
위반업소수 has 4163 (41.6%) zeros (Zeros)
형사처벌건수 has 6175 (61.8%) zeros (Zeros)
고발건수 has 9527 (95.3%) zeros (Zeros)
과태료부과건수 has 5555 (55.5%) zeros (Zeros)

Reproduction

Analysis started: 2024-03-23 07:51:21.845403
Analysis finished: 2024-03-23 07:51:31.135188
Duration: 9.29 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
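
The reported duration can be checked against the two timestamps. A small sketch using only the standard library, assuming both timestamps are in the same time zone (the format string is inferred from how the values are printed):

```python
from datetime import datetime

# Timestamp layout inferred from the report's start/finish values.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-03-23 07:51:21.845403", FMT)
finished = datetime.strptime("2024-03-23 07:51:31.135188", FMT)

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```

The difference rounds to the 9.29 seconds listed above.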

Variables

처분년월
Text

Distinct: 262
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 60000
Distinct characters: 33
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
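
As an illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own implementation; the helper name `category_counts` is hypothetical, and the input is the five sample values shown in this report.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    return Counter(unicodedata.category(ch) for v in values for ch in v)


# The five sample values listed for this variable.
sample = ["Apr-01", "Sep-18", "Aug-07", "Mar-08", "Jun-05"]
counts = category_counts(sample)
total = sum(counts.values())

# Lu = Uppercase Letter, Ll = Lowercase Letter,
# Nd = Decimal Number, Pd = Dash Punctuation.
for code, n in counts.most_common():
    print(code, n, f"{n / total:.1%}")
```

Each value of this shape contributes one uppercase letter, two lowercase letters, one dash and two digits, which is consistent with the 16.7% / 33.3% / 16.7% / 33.3% category split the report lists.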

Unique

Unique: 7
Unique (%): 0.1%

Sample

1st row: Apr-01
2nd row: Sep-18
3rd row: Aug-07
4th row: Mar-08
5th row: Jun-05
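
The sample values look like an English month abbreviation followed by a two-digit year, which fits the column name 처분년월 (year-month). The sketch below parses that shape under two assumptions that the report does not confirm: that the values really are month-year pairs, and that two-digit years of 69 or above belong to the 1900s (the usual POSIX pivot). The function name `parse_year_month` is illustrative.

```python
MONTHS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}


def parse_year_month(text, pivot=69):
    """Parse 'Mon-yy' (e.g. 'Apr-01') into a (year, month) tuple.

    Assumption: yy >= pivot maps to 19yy, otherwise to 20yy.
    """
    month_abbr, yy = text.split("-")
    yy = int(yy)
    year = 1900 + yy if yy >= pivot else 2000 + yy
    return year, MONTHS[month_abbr]


for raw in ["Apr-01", "Sep-18", "Aug-07", "Mar-08", "Jun-05"]:
    print(raw, parse_year_month(raw))
```

An explicit month table is used instead of `datetime.strptime` with `%b` so that the result does not depend on the process locale.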
Value               Count  Frequency (%)
oct-11                 74  0.7%
jul-12                 74  0.7%
dec-12                 74  0.7%
nov-11                 73  0.7%
jan-12                 73  0.7%
may-12                 72  0.7%
jul-11                 71  0.7%
sep-12                 71  0.7%
apr-12                 71  0.7%
sep-11                 69  0.7%
Other values (252)   9278  92.8%

Most occurring characters

Value              Count  Frequency (%)
-                  10000  16.7%
1                   7749  12.9%
0                   4008  6.7%
a                   2536  4.2%
u                   2533  4.2%
J                   2529  4.2%
e                   2469  4.1%
A                   1715  2.9%
r                   1688  2.8%
M                   1684  2.8%
Other values (23)  23089  38.5%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    20000  33.3%
Lowercase Letter  20000  33.3%
Dash Punctuation  10000  16.7%
Uppercase Letter  10000  16.7%

Most frequent character per category

Lowercase Letter
Value             Count  Frequency (%)
a                  2536  12.7%
u                  2533  12.7%
e                  2469  12.3%
r                  1688  8.4%
n                  1665  8.3%
p                  1661  8.3%
c                  1609  8.0%
l                   864  4.3%
b                   860  4.3%
g                   856  4.3%
Other values (4)   3259  16.3%

Decimal Number
Value  Count  Frequency (%)
1       7749  38.7%
0       4008  20.0%
2       1416  7.1%
9       1215  6.1%
8       1127  5.6%
7       1004  5.0%
6        964  4.8%
3        864  4.3%
5        855  4.3%
4        798  4.0%

Uppercase Letter
Value  Count  Frequency (%)
J       2529  25.3%
A       1715  17.2%
M       1684  16.8%
F        860  8.6%
D        807  8.1%
S        802  8.0%
O        802  8.0%
N        801  8.0%

Dash Punctuation
Value  Count  Frequency (%)
-      10000  100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  30000  50.0%
Latin   30000  50.0%

Most frequent character per script

Latin
Value              Count  Frequency (%)
a                   2536  8.5%
u                   2533  8.4%
J                   2529  8.4%
e                   2469  8.2%
A                   1715  5.7%
r                   1688  5.6%
M                   1684  5.6%
n                   1665  5.5%
p                   1661  5.5%
c                   1609  5.4%
Other values (12)   9911  33.0%

Common
Value  Count  Frequency (%)
-      10000  33.3%
1       7749  25.8%
0       4008  13.4%
2       1416  4.7%
9       1215  4.0%
8       1127  3.8%
7       1004  3.3%
6        964  3.2%
3        864  2.9%
5        855  2.9%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  60000  100.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
-                  10000  16.7%
1                   7749  12.9%
0                   4008  6.7%
a                   2536  4.2%
u                   2533  4.2%
J                   2529  4.2%
e                   2469  4.1%
A                   1715  2.9%
r                   1688  2.8%
M                   1684  2.8%
Other values (23)  23089  38.5%

업무구분명
Categorical

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

원산지단속  3514
양곡표시    2134
축산물이력  1893
GMO         1244
미검사품    1206

Length

Max length: 5
Median length: 5
Mean length: 4.4172
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 원산지단속
2nd row: 양곡표시
3rd row: GMO
4th row: 원산지단속
5th row: 원산지단속

Common Values

Value       Count  Frequency (%)
원산지단속   3514  35.1%
양곡표시     2134  21.3%
축산물이력   1893  18.9%
GMO          1244  12.4%
미검사품     1206  12.1%
재사용화환      9  0.1%
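
The reported mean length of 4.4172 can be reproduced from the counts above. A minimal sketch, assuming length is measured in Unicode code points (one per precomposed Hangul syllable), which is what Python's `len` returns for these strings:

```python
# Category counts copied from the Common Values table above.
counts = {
    "원산지단속": 3514,
    "양곡표시": 2134,
    "축산물이력": 1893,
    "GMO": 1244,
    "미검사품": 1206,
    "재사용화환": 9,
}

total = sum(counts.values())

# Count-weighted mean of the string lengths.
mean_length = sum(len(value) * n for value, n in counts.items()) / total

# Share of each category, as fractions of all rows.
shares = {value: n / total for value, n in counts.items()}

print(total, mean_length)
```

The counts add up to the 10000 observations, and the weighted mean matches the figure in the Length block.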

Length

Histogram of lengths of the category

Common Values (Plot)

Value       Count  Frequency (%)
원산지단속   3514  35.1%
양곡표시     2134  21.3%
축산물이력   1893  18.9%
gmo          1244  12.4%
미검사품     1206  12.1%
재사용화환      9  0.1%

시도명
Categorical

Distinct: 17
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

전라남도            635
전라북도            629
경상북도            621
경기도              615
충청북도            615
Other values (12)  6885

Length

Max length: 7
Median length: 5
Mean length: 4.6001
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전라북도
2nd row: 인천광역시
3rd row: 강원도
4th row: 충청남도
5th row: 대전광역시

Common Values

Value             Count  Frequency (%)
전라남도            635  6.3%
전라북도            629  6.3%
경상북도            621  6.2%
경기도              615  6.2%
충청북도            615  6.2%
충청남도            613  6.1%
강원도              609  6.1%
경상남도            604  6.0%
서울특별시          592  5.9%
제주특별자치도      591  5.9%
Other values (7)   3876  38.8%

Length

Histogram of lengths of the category

조사장소수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1996
Distinct (%): 20.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 474.2994
Minimum: 0
Maximum: 13387
Zeros: 22
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 31
Median: 147
Q3: 482
95-th percentile: 2306.6
Maximum: 13387
Range: 13387
Interquartile range (IQR): 451

Descriptive statistics

Standard deviation: 858.98611
Coefficient of variation (CV): 1.811063
Kurtosis: 19.339293
Mean: 474.2994
Median Absolute Deviation (MAD): 135
Skewness: 3.6136975
Sum: 4742994
Variance: 737857.14
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
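
Several of the figures above are derived from others in the same block. The sketch below recomputes them from the reported quartiles, extremes, mean and standard deviation, using the textbook definitions (IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared). It illustrates those relationships and is not the profiler's own code.

```python
# Values copied from the 조사장소수 statistics above.
q1, q3 = 31, 482
minimum, maximum = 0, 13387
mean, std = 474.2994, 858.98611

iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance from the standard deviation

print(iqr, value_range, round(cv, 6), round(variance, 2))
```

The recomputed values agree with the reported IQR (451), range (13387), CV (1.811063) and variance (737857.14) to the precision shown.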
Common Values
Value                Count  Frequency (%)
1                      211  2.1%
2                      148  1.5%
3                      124  1.2%
4                      100  1.0%
7                       99  1.0%
6                       98  1.0%
5                       95  0.9%
11                      90  0.9%
10                      89  0.9%
8                       85  0.9%
Other values (1986)   8861  88.6%

Minimum 10 values
Value  Count  Frequency (%)
0         22  0.2%
1        211  2.1%
2        148  1.5%
3        124  1.2%
4        100  1.0%
5         95  0.9%
6         98  1.0%
7         99  1.0%
8         85  0.9%
9         73  0.7%

Maximum 10 values
Value  Count  Frequency (%)
13387      1  < 0.1%
10421      1  < 0.1%
8803       1  < 0.1%
8055       1  < 0.1%
7923       1  < 0.1%
7658       1  < 0.1%
7540       1  < 0.1%
7167       1  < 0.1%
6690       1  < 0.1%
6637       1  < 0.1%

위반업소수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 102
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.6384
Minimum: 0
Maximum: 189
Zeros: 4163
Zeros (%): 41.6%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 9
95-th percentile: 38
Maximum: 189
Range: 189
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 13.91101
Coefficient of variation (CV): 1.8211942
Kurtosis: 15.268332
Mean: 7.6384
Median Absolute Deviation (MAD): 1
Skewness: 3.1512628
Sum: 76384
Variance: 193.5162
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Common Values
Value              Count  Frequency (%)
0                   4163  41.6%
1                    941  9.4%
2                    623  6.2%
3                    459  4.6%
4                    345  3.5%
5                    304  3.0%
6                    241  2.4%
7                    193  1.9%
8                    187  1.9%
9                    175  1.8%
Other values (92)   2369  23.7%

Minimum 10 values
Value  Count  Frequency (%)
0       4163  41.6%
1        941  9.4%
2        623  6.2%
3        459  4.6%
4        345  3.5%
5        304  3.0%
6        241  2.4%
7        193  1.9%
8        187  1.9%
9        175  1.8%

Maximum 10 values
Value  Count  Frequency (%)
189        1  < 0.1%
169        1  < 0.1%
141        1  < 0.1%
138        1  < 0.1%
132        1  < 0.1%
131        1  < 0.1%
128        1  < 0.1%
123        1  < 0.1%
116        1  < 0.1%
110        1  < 0.1%

형사처벌건수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 79
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.5517
Minimum: 0
Maximum: 131
Zeros: 6175
Zeros (%): 61.8%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 4
95-th percentile: 25
Maximum: 131
Range: 131
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 9.5090993
Coefficient of variation (CV): 2.0891314
Kurtosis: 14.587192
Mean: 4.5517
Median Absolute Deviation (MAD): 0
Skewness: 3.1883696
Sum: 45517
Variance: 90.422969
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Common Values
Value              Count  Frequency (%)
0                   6175  61.8%
1                    571  5.7%
2                    305  3.0%
3                    260  2.6%
4                    225  2.2%
5                    186  1.9%
6                    162  1.6%
7                    154  1.5%
9                    140  1.4%
10                   123  1.2%
Other values (69)   1699  17.0%

Minimum 10 values
Value  Count  Frequency (%)
0       6175  61.8%
1        571  5.7%
2        305  3.0%
3        260  2.6%
4        225  2.2%
5        186  1.9%
6        162  1.6%
7        154  1.5%
8        122  1.2%
9        140  1.4%

Maximum 10 values
Value  Count  Frequency (%)
131        1  < 0.1%
93         1  < 0.1%
91         2  < 0.1%
86         1  < 0.1%
84         1  < 0.1%
81         1  < 0.1%
80         1  < 0.1%
74         1  < 0.1%
73         1  < 0.1%
70         2  < 0.1%

고발건수
Real number (ℝ)

ZEROS 

Distinct: 19
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1199
Minimum: 0
Maximum: 19
Zeros: 9527
Zeros (%): 95.3%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 19
Range: 19
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.8111657
Coefficient of variation (CV): 6.7653519
Kurtosis: 182.62246
Mean: 0.1199
Median Absolute Deviation (MAD): 0
Skewness: 11.925742
Sum: 1199
Variance: 0.65798979
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=19)
Common Values
Value             Count  Frequency (%)
0                  9527  95.3%
1                   233  2.3%
2                   117  1.2%
3                    43  0.4%
4                    17  0.2%
5                    15  0.1%
6                    13  0.1%
9                     8  0.1%
8                     4  < 0.1%
10                    4  < 0.1%
Other values (9)     19  0.2%

Minimum 10 values
Value  Count  Frequency (%)
0       9527  95.3%
1        233  2.3%
2        117  1.2%
3         43  0.4%
4         17  0.2%
5         15  0.1%
6         13  0.1%
7          4  < 0.1%
8          4  < 0.1%
9          8  0.1%

Maximum 10 values
Value  Count  Frequency (%)
19         1  < 0.1%
18         1  < 0.1%
17         1  < 0.1%
15         1  < 0.1%
14         4  < 0.1%
13         3  < 0.1%
12         2  < 0.1%
11         2  < 0.1%
10         4  < 0.1%
9          8  0.1%
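
Because 고발건수 has only 19 distinct values, the smallest-value and largest-value listings above together cover its whole distribution. The sketch below rebuilds the frequency table from those listings and recomputes the summary statistics. Using the sample variance (dividing by n − 1) is an assumption; it is the choice that reproduces the reported figure.

```python
import math

# value -> count, pieced together from the value tables above.
freq = {
    0: 9527, 1: 233, 2: 117, 3: 43, 4: 17, 5: 15, 6: 13, 7: 4, 8: 4, 9: 8,
    10: 4, 11: 2, 12: 2, 13: 3, 14: 4, 15: 1, 17: 1, 18: 1, 19: 1,
}

n = sum(freq.values())                           # number of observations
total = sum(v * c for v, c in freq.items())      # column sum
mean = total / n

# Assumption: sample variance (ddof = 1).
variance = sum(c * (v - mean) ** 2 for v, c in freq.items()) / (n - 1)
std = math.sqrt(variance)

zeros_share = freq[0] / n                        # fraction of zero rows

print(n, total, mean, round(variance, 8), round(std, 7), f"{zeros_share:.1%}")
```

The rebuilt table has 19 distinct values and 10000 rows, and the recomputed sum, mean, variance and standard deviation line up with the reported 1199, 0.1199, 0.65798979 and 0.8111657. With more than 95% of rows at zero, every reported quantile up to the 95th percentile is 0.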

과태료부과건수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 68
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.9668
Minimum: 0
Maximum: 182
Zeros: 5555
Zeros (%): 55.5%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 3
95-th percentile: 14.05
Maximum: 182
Range: 182
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 6.9511531
Coefficient of variation (CV): 2.34298
Kurtosis: 110.9794
Mean: 2.9668
Median Absolute Deviation (MAD): 0
Skewness: 7.313083
Sum: 29668
Variance: 48.31853
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Common Values
Value              Count  Frequency (%)
0                   5555  55.5%
1                    989  9.9%
2                    670  6.7%
3                    485  4.9%
4                    379  3.8%
5                    286  2.9%
6                    204  2.0%
7                    200  2.0%
8                    159  1.6%
9                    132  1.3%
Other values (58)    941  9.4%

Minimum 10 values
Value  Count  Frequency (%)
0       5555  55.5%
1        989  9.9%
2        670  6.7%
3        485  4.9%
4        379  3.8%
5        286  2.9%
6        204  2.0%
7        200  2.0%
8        159  1.6%
9        132  1.3%

Maximum 10 values
Value  Count  Frequency (%)
182        1  < 0.1%
159        1  < 0.1%
136        1  < 0.1%
120        1  < 0.1%
93         1  < 0.1%
89         1  < 0.1%
87         1  < 0.1%
86         1  < 0.1%
72         1  < 0.1%
68         1  < 0.1%
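
The Sum values reported for the count columns allow one consistency check that the report itself does not state: the total of 위반업소수 equals the combined totals of 형사처벌건수, 고발건수 and 과태료부과건수. The sketch below verifies this for the column totals only; whether the same relation holds for every individual row is an assumption the report does not confirm.

```python
# "Sum" values copied from each variable's descriptive statistics.
sums = {
    "위반업소수": 76384,
    "형사처벌건수": 45517,
    "고발건수": 1199,
    "과태료부과건수": 29668,
}

# Combined total of the three outcome columns.
outcome_total = sums["형사처벌건수"] + sums["고발건수"] + sums["과태료부과건수"]

print(outcome_total, outcome_total == sums["위반업소수"])
```

A relation of this kind would also help explain why 위반업소수 appears in the high-correlation alerts alongside 형사처벌건수 and 과태료부과건수.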

Interactions


Correlations

                업무구분명  시도명  조사장소수  위반업소수  형사처벌건수  고발건수  과태료부과건수
업무구분명      1.000       0.079   0.367       0.423       0.431         0.163     0.170
시도명          0.079       1.000   0.309       0.235       0.228         0.138     0.137
조사장소수      0.367       0.309   1.000       0.381       0.445         0.212     0.232
위반업소수      0.423       0.235   0.381       1.000       0.722         0.346     0.971
형사처벌건수    0.431       0.228   0.445       0.722       1.000         0.163     0.237
고발건수        0.163       0.138   0.212       0.346       0.163         1.000     0.207
과태료부과건수  0.170       0.137   0.232       0.971       0.237         0.207     1.000

            업무구분명  시도명
업무구분명  1.000       0.037
시도명      0.037       1.000

                조사장소수  위반업소수  형사처벌건수  고발건수  과태료부과건수  업무구분명  시도명
조사장소수      1.000       0.490       0.367         0.177     0.596           0.192       0.128
위반업소수      0.490       1.000       0.835         0.274     0.757           0.238       0.093
형사처벌건수    0.367       0.835       1.000         0.268     0.407           0.230       0.093
고발건수        0.177       0.274       0.268         1.000     0.197           0.086       0.054
과태료부과건수  0.596       0.757       0.407         0.197     1.000           0.090       0.053
업무구분명      0.192       0.238       0.230         0.086     0.090           1.000       0.037
시도명          0.128       0.093       0.093         0.054     0.053           0.037       1.000
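
The high-correlation alerts at the top of the report can be related to the last matrix above. The sketch below lists every variable pair whose coefficient reaches a cutoff. The cutoff of 0.5 is an assumption (the report does not state the value it used), and `high_pairs` is an illustrative helper, not part of any library.

```python
names = ["조사장소수", "위반업소수", "형사처벌건수", "고발건수",
         "과태료부과건수", "업무구분명", "시도명"]

# Coefficients copied from the last correlation matrix above.
matrix = [
    [1.000, 0.490, 0.367, 0.177, 0.596, 0.192, 0.128],
    [0.490, 1.000, 0.835, 0.274, 0.757, 0.238, 0.093],
    [0.367, 0.835, 1.000, 0.268, 0.407, 0.230, 0.093],
    [0.177, 0.274, 0.268, 1.000, 0.197, 0.086, 0.054],
    [0.596, 0.757, 0.407, 0.197, 1.000, 0.090, 0.053],
    [0.192, 0.238, 0.230, 0.086, 0.090, 1.000, 0.037],
    [0.128, 0.093, 0.093, 0.054, 0.053, 0.037, 1.000],
]

THRESHOLD = 0.5  # assumption: cutoff is not stated in the report


def high_pairs(names, matrix, threshold):
    """Return (name_a, name_b, coefficient) for pairs at or above threshold."""
    n = len(names)
    return [
        (names[i], names[j], matrix[i][j])
        for i in range(n)
        for j in range(i + 1, n)  # upper triangle only, skipping the diagonal
        if abs(matrix[i][j]) >= threshold
    ]


pairs = high_pairs(names, matrix, THRESHOLD)
for a, b, r in pairs:
    print(a, b, r)
```

With this cutoff the three pairs found (조사장소수 with 과태료부과건수, 위반업소수 with 형사처벌건수, 위반업소수 with 과태료부과건수) are the same pairings named in the Alerts section.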

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

Index  처분년월  업무구분명  시도명          조사장소수  위반업소수  형사처벌건수  고발건수  과태료부과건수
11727  Apr-01    원산지단속  전라북도        46          24          24            0         0
1442   Sep-18    양곡표시    인천광역시      392         0           0             0         0
10039  Aug-07    GMO         강원도          247         0           0             0         0
9799   Mar-08    원산지단속  충청남도        536         17          9             0         8
10858  Jun-05    원산지단속  대전광역시      2           2           2             0         0
6105   Oct-12    양곡표시    충청북도        254         0           0             0         0
2196   Sep-17    양곡표시    세종특별자치시  40          0           0             0         0
6230   Aug-12    GMO         인천광역시      39          0           0             0         0
9932   Dec-07    원산지단속  전라남도        1619        9           2             0         7
12013  May-98    원산지단속  부산광역시      1           1           0             1         0

Index  처분년월  업무구분명  시도명          조사장소수  위반업소수  형사처벌건수  고발건수  과태료부과건수
3041   Aug-16    원산지단속  대구광역시      707         43          27            0         16
2974   Sep-16    원산지단속  경상남도        4719        37          20            0         17
4190   Feb-15    양곡표시    인천광역시      71          1           0             0         1
4621   Jul-14    미검사품    광주광역시      2           0           0             0         0
6196   Sep-12    원산지단속  인천광역시      873         18          14            0         4
6487   May-12    미검사품    경상남도        231         3           3             0         0
7267   Aug-11    양곡표시    서울특별시      758         5           1             0         4
4877   Mar-14    미검사품    인천광역시      13          0           0             0         0
10353  Dec-06    원산지단속  제주특별자치도  55          0           0             0         0
6152   Sep-12    GMO         제주특별자치도  153         0           0             0         0