Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 752.0 KiB
Average record size in memory: 77.0 B
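
The two memory figures are related by a simple division. A minimal sanity check, using only the numbers reported above and assuming that 1 KiB means 1024 bytes:

```python
# Values copied from the dataset statistics above.
total_size_kib = 752.0
n_observations = 10_000

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under that assumption the result rounds to the reported 77.0 B.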

Variable types

DateTime: 1
Categorical: 3
Numeric: 4

Dataset

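Description: Agricultural and livestock product distribution inspection data managed by the National Agricultural Products Quality Management Service (국립농산물품질관리원). Columns: 처분년월 (disposition year and month), 업무구분명 (task category), 시도명 (province or metropolitan city), 조사장소수 (number of sites inspected), 위반업소수 (number of violating businesses), 형사처벌건수 (number of criminal penalty cases), 고발건수 (number of cases reported for prosecution), 과태료부과건수 (number of administrative fines imposed).
Author: National Agricultural Products Quality Management Service (국립농산물품질관리원)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20220204000000001683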

Alerts

위반업소수 is highly overall correlated with 형사처벌건수 and 1 other field [High correlation]
형사처벌건수 is highly overall correlated with 위반업소수 [High correlation]
과태료부과건수 is highly overall correlated with 위반업소수 [High correlation]
고발건수 is highly imbalanced (99.0%) [Imbalance]
조사장소수 has 229 (2.3%) zeros [Zeros]
위반업소수 has 8172 (81.7%) zeros [Zeros]
형사처벌건수 has 8993 (89.9%) zeros [Zeros]
과태료부과건수 has 8929 (89.3%) zeros [Zeros]
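
The 99.0% in the imbalance alert is not the share of the dominant class of 고발건수 (that share is 99.8%). It is consistent with an entropy-based score. The sketch below assumes the score is 1 minus the Shannon entropy of the class distribution divided by the maximum possible entropy; the report itself does not state the formula, and the function name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - normalized Shannon entropy (0 = balanced, 1 = single class)."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Class counts of 고발건수 for the values 0, 1, 2, 3, 8, copied from the report.
complaint_counts = [9979, 15, 4, 1, 1]
score = imbalance_score(complaint_counts)
print(f"Imbalance: {score:.1%}")
```

With these counts the assumed formula gives a score that rounds to the 99.0% shown in the alert.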

Reproduction

Analysis started: 2024-03-23 07:52:19.552399
Analysis finished: 2024-03-23 07:52:26.970234
Duration: 7.42 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
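
A report of this kind can be regenerated with ydata-profiling. The following is a minimal sketch, not the configuration actually used: the CSV path, the output path, and the helper name `build_report` are hypothetical, and the real settings are in the config.json referenced above.

```python
def build_report(csv_path, html_path="report.html"):
    """Profile a CSV export of the dataset and write an HTML report (sketch)."""
    # Imported lazily so the function can be defined without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the file has a 처분년월 column that parses as a date.
    df = pd.read_csv(csv_path, parse_dates=["처분년월"])

    report = ProfileReport(
        df,
        title="농축산물 유통조사 정보",
        # Dataset metadata shown in the Overview section of the report.
        dataset={
            "description": "국립농산물품질관리원에서 관리하는 농축산물 유통조사 정보",
            "author": "국립농산물품질관리원",
            "url": "https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20220204000000001683",
        },
    )
    report.to_file(html_path)
    return report


# Usage (hypothetical file name):
# build_report("distribution_inspection.csv", "report.html")
```

Timings and memory figures will differ from run to run and between library versions.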

Variables

처분년월
DateTime

Distinct: 2413
Distinct (%): 24.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2015-04-21 00:00:00
Maximum: 2024-03-15 00:00:00
Histogram with fixed size bins (bins=50)

업무구분명
Categorical

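Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
원산지단속 (country-of-origin enforcement) | 3577
축산물이력 (livestock traceability) | 2750
양곡표시 (grain labeling) | 2698
미검사품 (uninspected goods) | 649
재사용화환 (reused floral wreaths) | 326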

Length

Max length: 5
Median length: 5
Mean length: 4.6653
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 재사용화환
2nd row: 원산지단속
3rd row: 축산물이력
4th row: 양곡표시
5th row: 원산지단속

Common Values

Value | Count | Frequency (%)
원산지단속 | 3577 | 35.8%
축산물이력 | 2750 | 27.5%
양곡표시 | 2698 | 27.0%
미검사품 | 649 | 6.5%
재사용화환 | 326 | 3.3%

Length

Histogram of lengths of the category


시도명
Categorical

Distinct: 17
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
경상북도 | 824
전라남도 | 791
충청남도 | 778
충청북도 | 771
경기도 | 749
Other values (12) | 6087

Length

Max length: 7
Median length: 5
Mean length: 4.8993
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 충청남도
2nd row: 인천광역시
3rd row: 강원특별자치도
4th row: 충청남도
5th row: 경상북도

Common Values

Value | Count | Frequency (%)
경상북도 | 824 | 8.2%
전라남도 | 791 | 7.9%
충청남도 | 778 | 7.8%
충청북도 | 771 | 7.7%
경기도 | 749 | 7.5%
경상남도 | 725 | 7.2%
강원특별자치도 | 722 | 7.2%
전북특별자치도 | 690 | 6.9%
서울특별시 | 566 | 5.7%
제주특별자치도 | 552 | 5.5%
Other values (7) | 2832 | 28.3%

Length

Histogram of lengths of the category

조사장소수
Real number (ℝ)

ZEROS 

Distinct: 261
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27.3196
Minimum: 0
Maximum: 555
Zeros: 229
Zeros (%): 2.3%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 3
median: 10
Q3: 32
95-th percentile: 113
Maximum: 555
Range: 555
Interquartile range (IQR): 29

Descriptive statistics

Standard deviation: 41.924287
Coefficient of variation (CV): 1.5345864
Kurtosis: 14.485094
Mean: 27.3196
Median Absolute Deviation (MAD): 8
Skewness: 3.1007678
Sum: 273196
Variance: 1757.6458
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
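
Several of the 조사장소수 statistics follow from one another. The sketch below recomputes the derived figures from the reported inputs only (no raw data is used), as a way of reading the table: the mean is the sum divided by the number of observations, the CV is the standard deviation divided by the mean, the variance is the square of the standard deviation, and the IQR and range are differences of quantiles.

```python
# Values copied from the 조사장소수 statistics above.
n_observations = 10_000
total = 273_196
std = 41.924287
q1, q3 = 3, 32
minimum, maximum = 0, 555

mean = total / n_observations      # reported as 27.3196
cv = std / mean                    # reported as 1.5345864
variance = std ** 2                # reported as 1757.6458
iqr = q3 - q1                      # reported as 29
value_range = maximum - minimum    # reported as 555

print(mean, round(cv, 7), round(variance, 4), iqr, value_range)
```

A CV above 1 together with a mean (27.3) well above the median (10) points to a strongly right-skewed distribution, in line with the reported skewness of about 3.1.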
Common values

Value | Count | Frequency (%)
1 | 1021 | 10.2%
2 | 732 | 7.3%
3 | 610 | 6.1%
4 | 485 | 4.9%
5 | 462 | 4.6%
6 | 358 | 3.6%
7 | 313 | 3.1%
8 | 310 | 3.1%
10 | 274 | 2.7%
9 | 269 | 2.7%
Other values (251) | 5166 | 51.7%

Minimum 10 values

Value | Count | Frequency (%)
0 | 229 | 2.3%
1 | 1021 | 10.2%
2 | 732 | 7.3%
3 | 610 | 6.1%
4 | 485 | 4.9%
5 | 462 | 4.6%
6 | 358 | 3.6%
7 | 313 | 3.1%
8 | 310 | 3.1%
9 | 269 | 2.7%

Maximum 10 values

Value | Count | Frequency (%)
555 | 1 | < 0.1%
467 | 1 | < 0.1%
433 | 1 | < 0.1%
428 | 1 | < 0.1%
394 | 2 | < 0.1%
357 | 1 | < 0.1%
337 | 1 | < 0.1%
333 | 1 | < 0.1%
323 | 1 | < 0.1%
320 | 1 | < 0.1%

위반업소수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 18
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.358
Minimum: 0
Maximum: 22
Zeros: 8172
Zeros (%): 81.7%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 2
Maximum: 22
Range: 22
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.1113773
Coefficient of variation (CV): 3.1044059
Kurtosis: 72.955458
Mean: 0.358
Median Absolute Deviation (MAD): 0
Skewness: 6.7184858
Sum: 3580
Variance: 1.2351595
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=18)
Common values

Value | Count | Frequency (%)
0 | 8172 | 81.7%
1 | 1074 | 10.7%
2 | 378 | 3.8%
3 | 183 | 1.8%
4 | 69 | 0.7%
5 | 43 | 0.4%
6 | 23 | 0.2%
7 | 17 | 0.2%
8 | 14 | 0.1%
9 | 7 | 0.1%
Other values (8) | 20 | 0.2%

Minimum 10 values

Value | Count | Frequency (%)
0 | 8172 | 81.7%
1 | 1074 | 10.7%
2 | 378 | 3.8%
3 | 183 | 1.8%
4 | 69 | 0.7%
5 | 43 | 0.4%
6 | 23 | 0.2%
7 | 17 | 0.2%
8 | 14 | 0.1%
9 | 7 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
22 | 1 | < 0.1%
21 | 1 | < 0.1%
17 | 1 | < 0.1%
16 | 4 | < 0.1%
15 | 1 | < 0.1%
13 | 5 | 0.1%
12 | 2 | < 0.1%
10 | 5 | 0.1%
9 | 7 | 0.1%
8 | 14 | 0.1%
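
Because 위반업소수 takes only 18 distinct values, the value tables above together list every value with its count, so the summary statistics can be rebuilt without the raw data. The sketch below merges those tables into one frequency dictionary and recomputes the sum, mean, zero share, and variance. Using the sample variance (n − 1 in the denominator) is an assumption; it reproduces the reported figure.

```python
# value -> count for 위반업소수, merged from the value tables above.
freq = {
    0: 8172, 1: 1074, 2: 378, 3: 183, 4: 69, 5: 43, 6: 23, 7: 17, 8: 14, 9: 7,
    10: 5, 12: 2, 13: 5, 15: 1, 16: 4, 17: 1, 21: 1, 22: 1,
}

n = sum(freq.values())                          # number of observations
total = sum(v * c for v, c in freq.items())     # reported Sum: 3580
mean = total / n                                # reported Mean: 0.358
zeros_pct = 100 * freq[0] / n                   # reported Zeros (%): 81.7%

# Assumption: sample variance with n - 1 in the denominator.
variance = sum(c * (v - mean) ** 2 for v, c in freq.items()) / (n - 1)

print(n, total, mean, round(zeros_pct, 1), round(variance, 7))
```

The same approach works for 형사처벌건수 and 과태료부과건수, whose value tables are also complete.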

형사처벌건수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 13
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1783
Minimum: 0
Maximum: 16
Zeros: 8993
Zeros (%): 89.9%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 16
Range: 16
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.7135545
Coefficient of variation (CV): 4.0019882
Kurtosis: 86.55911
Mean: 0.1783
Median Absolute Deviation (MAD): 0
Skewness: 7.4550554
Sum: 1783
Variance: 0.50916003
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)
Common values

Value | Count | Frequency (%)
0 | 8993 | 89.9%
1 | 625 | 6.2%
2 | 215 | 2.1%
3 | 84 | 0.8%
4 | 32 | 0.3%
5 | 21 | 0.2%
7 | 10 | 0.1%
8 | 8 | 0.1%
6 | 6 | 0.1%
10 | 2 | < 0.1%
Other values (3) | 4 | < 0.1%

Minimum 10 values

Value | Count | Frequency (%)
0 | 8993 | 89.9%
1 | 625 | 6.2%
2 | 215 | 2.1%
3 | 84 | 0.8%
4 | 32 | 0.3%
5 | 21 | 0.2%
6 | 6 | 0.1%
7 | 10 | 0.1%
8 | 8 | 0.1%
10 | 2 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
16 | 1 | < 0.1%
13 | 1 | < 0.1%
12 | 2 | < 0.1%
10 | 2 | < 0.1%
8 | 8 | 0.1%
7 | 10 | 0.1%
6 | 6 | 0.1%
5 | 21 | 0.2%
4 | 32 | 0.3%
3 | 84 | 0.8%

고발건수
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
0 | 9979
1 | 15
2 | 4
3 | 1
8 | 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9979 | 99.8%
1 | 15 | 0.1%
2 | 4 | < 0.1%
3 | 1 | < 0.1%
8 | 1 | < 0.1%

Length

Histogram of lengths of the category


과태료부과건수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 16
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1763
Minimum: 0
Maximum: 22
Zeros: 8929
Zeros (%): 89.3%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 22
Range: 22
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.7474451
Coefficient of variation (CV): 4.2396205
Kurtosis: 214.33904
Mean: 0.1763
Median Absolute Deviation (MAD): 0
Skewness: 11.119256
Sum: 1763
Variance: 0.55867418
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=16)
Common values

Value | Count | Frequency (%)
0 | 8929 | 89.3%
1 | 751 | 7.5%
2 | 190 | 1.9%
3 | 59 | 0.6%
4 | 26 | 0.3%
5 | 13 | 0.1%
6 | 11 | 0.1%
8 | 6 | 0.1%
7 | 5 | 0.1%
9 | 2 | < 0.1%
Other values (6) | 8 | 0.1%

Minimum 10 values

Value | Count | Frequency (%)
0 | 8929 | 89.3%
1 | 751 | 7.5%
2 | 190 | 1.9%
3 | 59 | 0.6%
4 | 26 | 0.3%
5 | 13 | 0.1%
6 | 11 | 0.1%
7 | 5 | 0.1%
8 | 6 | 0.1%
9 | 2 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
22 | 1 | < 0.1%
21 | 1 | < 0.1%
17 | 1 | < 0.1%
15 | 1 | < 0.1%
12 | 2 | < 0.1%
10 | 2 | < 0.1%
9 | 2 | < 0.1%
8 | 6 | 0.1%
7 | 5 | 0.1%
6 | 11 | 0.1%

Interactions

[16 interaction plots; images not included]

Correlations

Variable | 업무구분명 | 시도명 | 조사장소수 | 위반업소수 | 형사처벌건수 | 고발건수 | 과태료부과건수
업무구분명 | 1.000 | 0.209 | 0.543 | 0.207 | 0.302 | 0.062 | 0.098
시도명 | 0.209 | 1.000 | 0.213 | 0.096 | 0.109 | 0.035 | 0.035
조사장소수 | 0.543 | 0.213 | 1.000 | 0.219 | 0.340 | 0.110 | 0.081
위반업소수 | 0.207 | 0.096 | 0.219 | 1.000 | 0.756 | 0.189 | 0.938
형사처벌건수 | 0.302 | 0.109 | 0.340 | 0.756 | 1.000 | 0.043 | 0.141
고발건수 | 0.062 | 0.035 | 0.110 | 0.189 | 0.043 | 1.000 | 0.000
과태료부과건수 | 0.098 | 0.035 | 0.081 | 0.938 | 0.141 | 0.000 | 1.000

Variable | 고발건수 | 업무구분명 | 시도명
고발건수 | 1.000 | 0.023 | 0.018
업무구분명 | 0.023 | 1.000 | 0.109
시도명 | 0.018 | 0.109 | 1.000

Variable | 조사장소수 | 위반업소수 | 형사처벌건수 | 과태료부과건수 | 업무구분명 | 시도명 | 고발건수
조사장소수 | 1.000 | 0.256 | 0.347 | 0.070 | 0.255 | 0.084 | 0.046
위반업소수 | 0.256 | 1.000 | 0.725 | 0.737 | 0.121 | 0.038 | 0.110
형사처벌건수 | 0.347 | 0.725 | 1.000 | 0.166 | 0.131 | 0.044 | 0.020
과태료부과건수 | 0.070 | 0.737 | 0.166 | 1.000 | 0.056 | 0.014 | 0.000
업무구분명 | 0.255 | 0.121 | 0.131 | 0.056 | 1.000 | 0.109 | 0.023
시도명 | 0.084 | 0.038 | 0.044 | 0.014 | 0.109 | 1.000 | 0.018
고발건수 | 0.046 | 0.110 | 0.020 | 0.000 | 0.023 | 0.018 | 1.000
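
The report does not name the coefficient behind each matrix. The matrix that covers only the three categorical variables is consistent with a categorical association measure such as Cramér's V, which is an assumption here. The sketch below shows the plain (uncorrected) Cramér's V computed from a contingency table; the tables in the example are made-up illustrations, not data from this report, and a profiler may apply a bias correction that this version omits.

```python
import math


def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Made-up 2 x 2 tables: perfect association and no association.
print(cramers_v([[10, 0], [0, 10]]))
print(cramers_v([[5, 5], [5, 5]]))
```

Values near 0, such as the 0.018 to 0.109 range in the categorical-only matrix, indicate that the categories are close to independent of one another.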

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

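First rows

Index | 처분년월 | 업무구분명 | 시도명 | 조사장소수 | 위반업소수 | 형사처벌건수 | 고발건수 | 과태료부과건수
6270 | 2023-08-22 | 재사용화환 | 충청남도 | 6 | 0 | 0 | 0 | 0
45558 | 2020-02-28 | 원산지단속 | 인천광역시 | 2 | 0 | 0 | 0 | 0
8249 | 2023-06-16 | 축산물이력 | 강원특별자치도 | 4 | 0 | 0 | 0 | 0
23077 | 2022-03-08 | 양곡표시 | 충청남도 | 4 | 0 | 0 | 0 | 0
32074 | 2021-05-20 | 원산지단속 | 경상북도 | 23 | 1 | 1 | 0 | 0
69420 | 2017-12-05 | 원산지단속 | 전북특별자치도 | 91 | 1 | 1 | 0 | 0
10031 | 2023-04-21 | 축산물이력 | 충청북도 | 9 | 0 | 0 | 0 | 0
91818 | 2015-12-18 | 축산물이력 | 충청북도 | 6 | 0 | 0 | 0 | 0
67937 | 2018-01-24 | 원산지단속 | 대구광역시 | 7 | 6 | 1 | 0 | 5
85921 | 2016-06-29 | 양곡표시 | 경기도 | 88 | 0 | 0 | 0 | 0

Last rows

Index | 처분년월 | 업무구분명 | 시도명 | 조사장소수 | 위반업소수 | 형사처벌건수 | 고발건수 | 과태료부과건수
14984 | 2022-11-15 | 재사용화환 | 경상북도 | 3 | 0 | 0 | 0 | 0
76351 | 2017-04-26 | 축산물이력 | 전라남도 | 6 | 0 | 0 | 0 | 0
55838 | 2019-03-08 | 축산물이력 | 전북특별자치도 | 2 | 0 | 0 | 0 | 0
15894 | 2022-10-20 | 양곡표시 | 강원특별자치도 | 27 | 0 | 0 | 0 | 0
8833 | 2023-05-26 | 축산물이력 | 경상북도 | 3 | 1 | 0 | 0 | 1
28648 | 2021-09-03 | 원산지단속 | 광주광역시 | 3 | 0 | 0 | 0 | 0
60896 | 2018-09-13 | 양곡표시 | 전라남도 | 25 | 0 | 0 | 0 | 0
18386 | 2022-08-01 | 축산물이력 | 부산광역시 | 2 | 0 | 0 | 0 | 0
97935 | 2015-06-23 | 원산지단속 | 경기도 | 73 | 1 | 1 | 0 | 0
25951 | 2021-11-30 | 재사용화환 | 대전광역시 | 3 | 0 | 0 | 0 | 0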