Overview

Dataset statistics

Number of variables: 8
Number of observations: 2794
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 191.1 KiB
Average record size in memory: 70.0 B
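
The average record size follows from the total memory size and the row count. A minimal check, assuming the usual 1 KiB = 1024 bytes convention (the variable names are illustrative):

```python
# Figures copied from the dataset statistics above.
total_kib = 191.1   # total size in memory, in KiB
n_rows = 2794       # number of observations

# Convert KiB to bytes, then divide by the row count.
avg_bytes = total_kib * 1024 / n_rows

print(f"{avg_bytes:.1f} B per record")
```

The result rounds to the 70.0 B shown in the report.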

Variable types

DateTime: 1
Categorical: 3
Numeric: 4

Dataset

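Description: Enforcement data from agricultural product distribution inspections of suppliers of rice for processing, covering use outside the designated purpose, country-of-origin labeling, and keeping of the management ledger. Columns: 단속년월 (enforcement year and month), 시도명 (province or metropolitan city), 조사건수 (number of inspections), 위반업체수 (number of violating businesses), 지정용도외 사용건수 (cases of use outside the designated purpose), 표시위반건수 (labeling violation cases), 관리대장미비치건수 (cases of management ledger not kept), 기타 (other).
Author: 국립농산물품질관리원 (National Agricultural Products Quality Management Service)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20170912000000000790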

Alerts

위반업체수 is highly overall correlated with 관리대장미비치건수 and 1 other field [High correlation]
관리대장미비치건수 is highly overall correlated with 위반업체수 [High correlation]
기타 is highly overall correlated with 위반업체수 [High correlation]
지정용도외 사용건수 is highly imbalanced (96.3%) [Imbalance]
표시위반건수 is highly imbalanced (99.1%) [Imbalance]
위반업체수 has 2446 (87.5%) zeros [Zeros]
관리대장미비치건수 has 2615 (93.6%) zeros [Zeros]
기타 has 2600 (93.1%) zeros [Zeros]
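
The zero shares are plain ratios against the 2794 observations. The imbalance percentages are consistent with one minus the normalized Shannon entropy of the category counts. Whether the profiler computes them exactly this way is an assumption, and the function names are illustrative. The category counts come from the variable sections of this report.

```python
import math

N_ROWS = 2794  # number of observations reported in the overview


def zero_share(zeros, n=N_ROWS):
    """Percentage of rows whose value is zero."""
    return 100 * zeros / n


def imbalance(counts):
    """1 - (Shannon entropy / log2(number of classes)); 1.0 means one class only."""
    n = sum(counts)
    probs = [c / n for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(counts))


# Zero counts from the alerts: 위반업체수, 관리대장미비치건수, 기타.
for zeros in (2446, 2615, 2600):
    print(f"{zero_share(zeros):.1f}% zeros")

# Category counts: 지정용도외 사용건수 (values 0, 1, 3, 2) and 표시위반건수 (values 0, 1).
print(f"{100 * imbalance([2771, 21, 1, 1]):.1f}% imbalance")
print(f"{100 * imbalance([2792, 2]):.1f}% imbalance")
```

Under these assumptions the script reproduces the percentages in the alerts.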

Reproduction

Analysis started: 2024-03-23 07:22:46.441803
Analysis finished: 2024-03-23 07:22:52.236807
Duration: 5.8 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

단속년월 (enforcement year and month)
DateTime

Distinct: 256
Distinct (%): 9.2%
Missing: 0
Missing (%): 0.0%
Memory size: 22.0 KiB
Minimum: 1999-12-01 00:00:00
Maximum: 2022-02-01 00:00:00
[Figure: histogram with fixed size bins (bins=50)]
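
The date range and the distinct count together hint at how complete the monthly coverage is. A small sketch, assuming every distinct value is the first day of a calendar month (the sample rows suggest this, but the report does not state it):

```python
# Values copied from the 단속년월 summary above.
first_year, first_month = 1999, 12   # Minimum: 1999-12-01
last_year, last_month = 2022, 2      # Maximum: 2022-02-01
distinct_values = 256

# Inclusive count of calendar months between the two endpoints.
months_in_range = (last_year - first_year) * 12 + (last_month - first_month) + 1

# Assumption: each distinct value corresponds to exactly one calendar month.
months_without_rows = months_in_range - distinct_values

print(months_in_range, months_without_rows)
```

Under that assumption, the range spans 267 months, of which 11 have no rows.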

시도명 (province or metropolitan city)
Categorical

Distinct: 18
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 22.0 KiB

Value              Count
경기도             252
강원도             244
충청남도           238
경상남도           230
전라남도           215
Other values (13)  1615

Length

Max length: 7
Median length: 5
Mean length: 4.2491052
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전라남도
2nd row: 충청북도
3rd row: 전라남도
4th row: 인천광역시
5th row: 경기도

Common Values

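Value                           Count  Frequency (%)
경기도 (Gyeonggi-do)            252    9.0%
강원도 (Gangwon-do)             244    8.7%
충청남도 (Chungcheongnam-do)    238    8.5%
경상남도 (Gyeongsangnam-do)     230    8.2%
전라남도 (Jeollanam-do)         215    7.7%
경상북도 (Gyeongsangbuk-do)     212    7.6%
충청북도 (Chungcheongbuk-do)    205    7.3%
전라북도 (Jeollabuk-do)         200    7.2%
인천광역시 (Incheon)            172    6.2%
서울특별시 (Seoul)              153    5.5%
Other values (8)                673    24.1%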
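
Each frequency is the count divided by the 2794 observations, and the "Other values" row is whatever remains after the ten listed categories. A minimal sketch of that arithmetic (the helper name `share` is illustrative):

```python
n_rows = 2794  # number of observations

# Counts copied from the common values table above.
top10 = {
    "경기도": 252, "강원도": 244, "충청남도": 238, "경상남도": 230,
    "전라남도": 215, "경상북도": 212, "충청북도": 205, "전라북도": 200,
    "인천광역시": 172, "서울특별시": 153,
}


def share(count, n=n_rows):
    """Frequency as a percentage rounded to one decimal place."""
    return round(100 * count / n, 1)


# Rows not covered by the ten listed categories.
other = n_rows - sum(top10.values())

print(share(top10["경기도"]), other, share(other))
```

The remainder comes to 673 rows, matching the "Other values (8)" line.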

Length

[Figure: histogram of lengths of the category]

조사건수 (number of inspections)
Real number (ℝ)

Distinct: 116
Distinct (%): 4.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.637079
Minimum: 1
Maximum: 229
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 24.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 5
Median: 13
Q3: 24
95-th percentile: 57
Maximum: 229
Range: 228
Interquartile range (IQR): 19

Descriptive statistics

Standard deviation: 20.469726
Coefficient of variation (CV): 1.0983333
Kurtosis: 12.464927
Mean: 18.637079
Median Absolute Deviation (MAD): 9
Skewness: 2.7943272
Sum: 52072
Variance: 419.00967
Monotonicity: Not monotonic
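
Several of these figures are tied together by simple identities: the mean is the sum over the row count, the coefficient of variation is the standard deviation over the mean, the variance is the squared standard deviation, and the range and IQR are differences of quantiles. A quick consistency check on the reported numbers (variable names are illustrative):

```python
# Figures copied from the 조사건수 statistics above.
n_rows = 2794
total = 52072
std = 20.469726
q1, q3 = 5, 24
minimum, maximum = 1, 229

mean = total / n_rows        # sum divided by number of observations
cv = std / mean              # coefficient of variation
variance = std ** 2          # variance is the squared standard deviation
iqr = q3 - q1                # interquartile range
value_range = maximum - minimum

print(round(mean, 6), round(cv, 6), round(variance, 3), iqr, value_range)
```

The derived values agree with the report to the precision shown.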
[Figure: histogram with fixed size bins (bins=50)]
Common values

Value               Count  Frequency (%)
1                   199    7.1%
3                   160    5.7%
4                   135    4.8%
2                   127    4.5%
5                   113    4.0%
7                   109    3.9%
6                   105    3.8%
13                  103    3.7%
10                  92     3.3%
12                  90     3.2%
Other values (106)  1561   55.9%

Extreme values (smallest first)

Value  Count  Frequency (%)
1      199    7.1%
2      127    4.5%
3      160    5.7%
4      135    4.8%
5      113    4.0%
6      105    3.8%
7      109    3.9%
8      84     3.0%
9      81     2.9%
10     92     3.3%

Extreme values (largest first)

Value  Count  Frequency (%)
229    1      < 0.1%
178    1      < 0.1%
169    1      < 0.1%
147    1      < 0.1%
144    1      < 0.1%
141    2      0.1%
132    1      < 0.1%
125    2      0.1%
124    1      < 0.1%
122    2      0.1%

위반업체수 (number of violating businesses)
Real number (ℝ)

Alerts: HIGH CORRELATION, ZEROS

Distinct: 9
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.18396564
Minimum: 0
Maximum: 9
Zeros: 2446
Zeros (%): 87.5%
Negative: 0
Negative (%): 0.0%
Memory size: 24.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1
Maximum: 9
Range: 9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.60946164
Coefficient of variation (CV): 3.3129101
Kurtosis: 43.46241
Mean: 0.18396564
Median Absolute Deviation (MAD): 0
Skewness: 5.43509
Sum: 514
Variance: 0.37144349
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=9)]
Common values

Value  Count  Frequency (%)
0      2446   87.5%
1      256    9.2%
2      52     1.9%
3      24     0.9%
4      7      0.3%
5      5      0.2%
7      2      0.1%
9      1      < 0.1%
6      1      < 0.1%

Extreme values (smallest first)

Value  Count  Frequency (%)
0      2446   87.5%
1      256    9.2%
2      52     1.9%
3      24     0.9%
4      7      0.3%
5      5      0.2%
6      1      < 0.1%
7      2      0.1%
9      1      < 0.1%

Extreme values (largest first)

Value  Count  Frequency (%)
9      1      < 0.1%
7      2      0.1%
6      1      < 0.1%
5      5      0.2%
4      7      0.3%
3      24     0.9%
2      52     1.9%
1      256    9.2%
0      2446   87.5%
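
Because 위반업체수 takes only nine distinct values, its sum, mean, variance, and standard deviation can be recomputed from the frequency table alone. The sketch below uses the sample variance (divisor n - 1). That choice is inferred from the agreement with the reported figure, not stated in the report.

```python
import math

# Frequency table for 위반업체수, copied from this section: value -> count.
freq = {0: 2446, 1: 256, 2: 52, 3: 24, 4: 7, 5: 5, 6: 1, 7: 2, 9: 1}

n = sum(freq.values())
total = sum(value * count for value, count in freq.items())
mean = total / n

# Sum of squared deviations from the mean, weighted by count.
sum_sq_dev = sum(count * (value - mean) ** 2 for value, count in freq.items())
variance = sum_sq_dev / (n - 1)   # sample variance (assumed divisor)
std = math.sqrt(variance)

print(n, total, round(mean, 6), round(variance, 6), round(std, 6))
```

The recomputed values match the reported sum (514), mean, variance, and standard deviation to the precision shown.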

지정용도외 사용건수 (cases of use outside the designated purpose)
Categorical

Alerts: IMBALANCE

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.0 KiB

Value  Count
0      2771
1      21
3      1
2      1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 2
Unique (%): 0.1%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      2771   99.2%
1      21     0.8%
3      1      < 0.1%
2      1      < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

관리대장미비치건수 (cases of management ledger not kept)
Real number (ℝ)

Alerts: HIGH CORRELATION, ZEROS

Distinct: 8
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.10093057
Minimum: 0
Maximum: 9
Zeros: 2615
Zeros (%): 93.6%
Negative: 0
Negative (%): 0.0%
Memory size: 24.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1
Maximum: 9
Range: 9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.48667809
Coefficient of variation (CV): 4.8219098
Kurtosis: 86.570862
Mean: 0.10093057
Median Absolute Deviation (MAD): 0
Skewness: 7.8539628
Sum: 282
Variance: 0.23685556
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=8)]
Common values

Value  Count  Frequency (%)
0      2615   93.6%
1      124    4.4%
2      33     1.2%
3      10     0.4%
4      5      0.2%
6      3      0.1%
5      3      0.1%
9      1      < 0.1%

Extreme values (smallest first)

Value  Count  Frequency (%)
0      2615   93.6%
1      124    4.4%
2      33     1.2%
3      10     0.4%
4      5      0.2%
5      3      0.1%
6      3      0.1%
9      1      < 0.1%

Extreme values (largest first)

Value  Count  Frequency (%)
9      1      < 0.1%
6      3      0.1%
5      3      0.1%
4      5      0.2%
3      10     0.4%
2      33     1.2%
1      124    4.4%
0      2615   93.6%

표시위반건수 (labeling violation cases)
Categorical

Alerts: IMBALANCE

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.0 KiB

Value  Count
0      2792
1      2

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      2792   99.9%
1      2      0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

기타 (other)
Real number (ℝ)

Alerts: HIGH CORRELATION, ZEROS

Distinct: 9
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.10665712
Minimum: 0
Maximum: 10
Zeros: 2600
Zeros (%): 93.1%
Negative: 0
Negative (%): 0.0%
Memory size: 24.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1
Maximum: 10
Range: 10
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.5028441
Coefficient of variation (CV): 4.7145853
Kurtosis: 100.64506
Mean: 0.10665712
Median Absolute Deviation (MAD): 0
Skewness: 8.275435
Sum: 298
Variance: 0.25285219
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=9)]
Common values

Value  Count  Frequency (%)
0      2600   93.1%
1      141    5.0%
2      31     1.1%
3      9      0.3%
4      6      0.2%
5      3      0.1%
6      2      0.1%
10     1      < 0.1%
7      1      < 0.1%

Extreme values (smallest first)

Value  Count  Frequency (%)
0      2600   93.1%
1      141    5.0%
2      31     1.1%
3      9      0.3%
4      6      0.2%
5      3      0.1%
6      2      0.1%
7      1      < 0.1%
10     1      < 0.1%

Extreme values (largest first)

Value  Count  Frequency (%)
10     1      < 0.1%
7      1      < 0.1%
6      2      0.1%
5      3      0.1%
4      6      0.2%
3      9      0.3%
2      31     1.1%
1      141    5.0%
0      2600   93.1%

Interactions

[Figures: 16 pairwise interaction plots]

Correlations

[Figure: correlation heatmap 1]
Correlation matrix 1. Columns, in order: 시도명, 조사건수, 위반업체수, 지정용도외 사용건수, 관리대장미비치건수, 표시위반건수, 기타

시도명               1.000  0.459  0.206  0.000  0.163  0.000  0.164
조사건수             0.459  1.000  0.415  0.000  0.306  0.093  0.322
위반업체수           0.206  0.415  1.000  0.292  0.836  0.187  0.908
지정용도외 사용건수  0.000  0.000  0.292  1.000  0.000  0.000  0.034
관리대장미비치건수   0.163  0.306  0.836  0.000  1.000  0.042  0.296
표시위반건수         0.000  0.093  0.187  0.000  0.042  1.000  0.013
기타                 0.164  0.322  0.908  0.034  0.296  0.013  1.000

[Figure: correlation heatmap 2]
Correlation matrix 2. Columns, in order: 지정용도외 사용건수, 표시위반건수, 시도명

지정용도외 사용건수  1.000  0.000  0.000
표시위반건수         0.000  1.000  0.000
시도명               0.000  0.000  1.000

[Figure: correlation heatmap 3]
Correlation matrix 3. Columns, in order: 조사건수, 위반업체수, 관리대장미비치건수, 기타, 시도명, 지정용도외 사용건수, 표시위반건수

조사건수             1.000  0.238  0.190  0.153  0.169  0.000  0.092
위반업체수           0.238  1.000  0.700  0.727  0.070  0.190  0.187
관리대장미비치건수   0.190  0.700  1.000  0.167  0.069  0.000  0.032
기타                 0.153  0.727  0.167  1.000  0.031  0.000  0.000
시도명               0.169  0.070  0.069  0.031  1.000  0.000  0.000
지정용도외 사용건수  0.000  0.190  0.000  0.000  0.000  1.000  0.000
표시위반건수         0.092  0.187  0.032  0.000  0.000  0.000  1.000
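
The high-correlation alerts near the top of the report can be read off the first (7 by 7) matrix by listing the variable pairs whose coefficient reaches a cutoff. The sketch below uses a cutoff of 0.5. That value is an assumption, since the report does not state the threshold it applied, and `pairs_above` is an illustrative helper.

```python
# Column order and values copied from the first correlation matrix in this section.
names = ["시도명", "조사건수", "위반업체수", "지정용도외 사용건수",
         "관리대장미비치건수", "표시위반건수", "기타"]
matrix = [
    [1.000, 0.459, 0.206, 0.000, 0.163, 0.000, 0.164],
    [0.459, 1.000, 0.415, 0.000, 0.306, 0.093, 0.322],
    [0.206, 0.415, 1.000, 0.292, 0.836, 0.187, 0.908],
    [0.000, 0.000, 0.292, 1.000, 0.000, 0.000, 0.034],
    [0.163, 0.306, 0.836, 0.000, 1.000, 0.042, 0.296],
    [0.000, 0.093, 0.187, 0.000, 0.042, 1.000, 0.013],
    [0.164, 0.322, 0.908, 0.034, 0.296, 0.013, 1.000],
]

THRESHOLD = 0.5  # assumption: the cutoff is not stated in this report


def pairs_above(names, matrix, threshold):
    """Return (name_a, name_b, value) for each upper-triangle cell >= threshold."""
    found = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if matrix[i][j] >= threshold:
                found.append((names[i], names[j], matrix[i][j]))
    return found


pairs = pairs_above(names, matrix, THRESHOLD)
for a, b, value in pairs:
    print(f"{a} ~ {b}: {value:.3f}")
```

With this cutoff only two pairs are flagged, both involving 위반업체수, which is consistent with the three high-correlation alerts.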

Missing values

[Figure: nullity bar chart] A simple visualization of nullity by column.
[Figure: nullity matrix] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index  단속년월    시도명      조사건수  위반업체수  지정용도외 사용건수  관리대장미비치건수  표시위반건수  기타
0      1999-12-01  전라남도    1         0           0                    0                   0             0
1      2000-09-01  충청북도    1         0           0                    0                   0             0
2      2000-09-01  전라남도    1         0           0                    0                   0             0
3      2001-01-01  인천광역시  1         0           0                    0                   0             0
4      2001-01-01  경기도      7         0           0                    0                   0             0
5      2001-01-01  강원도      9         0           0                    0                   0             0
6      2001-01-01  충청북도    22        0           0                    0                   0             0
7      2001-01-01  충청남도    4         0           0                    0                   0             0
8      2001-01-01  전라남도    2         0           0                    0                   0             0
9      2001-01-01  경상북도    12        0           0                    0                   0             0

Last rows

Index  단속년월    시도명      조사건수  위반업체수  지정용도외 사용건수  관리대장미비치건수  표시위반건수  기타
2784   2022-01-01  경상북도    11        1           0                    0                   0             1
2785   2022-01-01  경상남도    5         0           0                    0                   0             0
2786   2022-02-01  광주광역시  3         0           0                    0                   0             0
2787   2022-02-01  울산광역시  1         0           0                    0                   0             0
2788   2022-02-01  경기도      8         0           0                    0                   0             0
2789   2022-02-01  강원도      1         0           0                    0                   0             0
2790   2022-02-01  충청북도    1         0           0                    0                   0             0
2791   2022-02-01  충청남도    3         0           0                    0                   0             0
2792   2022-02-01  전라남도    3         0           0                    0                   0             0
2793   2022-02-01  경상남도    9         0           0                    0                   0             0