Overview

Dataset statistics

Number of variables: 8
Number of observations: 2806
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 191.9 KiB
Average record size in memory: 70.0 B

Variable types

Text: 1
Categorical: 3
Numeric: 4

Dataset

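Description: Enforcement data on suppliers of rice for processing, collected as part of agricultural product distribution oversight. It covers use outside the designated purpose, country-of-origin labeling, and the keeping of management ledgers. Fields: enforcement year-month (단속년월), province or city name (시도명), number of inspections (조사건수), number of violating businesses (위반업체수), cases of use outside the designated purpose (지정용도외 사용 건수), labeling violations (표시위반 건수), cases of management ledger not kept (관리대장미비치 건수), and other (기타).
Author: 국립농산물품질관리원 (National Agricultural Products Quality Management Service)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20170912000000000790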

Alerts

위반업체수 is highly overall correlated with 관리대장미비치 건수 and 1 other field (High correlation)
관리대장미비치 건수 is highly overall correlated with 위반업체수 (High correlation)
기타 is highly overall correlated with 위반업체수 (High correlation)
지정용도외 사용 건수 is highly imbalanced (96.4%) (Imbalance)
표시위반 건수 is highly imbalanced (98.8%) (Imbalance)
위반업체수 has 2456 (87.5%) zeros (Zeros)
관리대장미비치 건수 has 2625 (93.5%) zeros (Zeros)
기타 has 2612 (93.1%) zeros (Zeros)
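
The two imbalance percentages in the alerts are consistent with one minus the normalized Shannon entropy of each column's value counts. The sketch below recomputes both figures from the counts listed under Common Values in this report. Treating this as the formula the tool uses is an assumption, and `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(probs) < 2:
        return 0.0  # single class: no meaningful score (assumed convention)
    entropy = -sum(p * math.log(p) for p in probs)
    return 1.0 - entropy / math.log(len(probs))

# Value counts copied from the Common Values tables of this report.
designated_use = [2783, 21, 1, 1]   # 지정용도외 사용 건수
labeling = [2803, 3]                # 표시위반 건수

print(f"{imbalance_score(designated_use):.1%}")
print(f"{imbalance_score(labeling):.1%}")
```

With these counts the scores round to 96.4% and 98.8%, matching the alerts.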

Reproduction

Analysis started: 2024-03-23 07:23:01.515947
Analysis finished: 2024-03-23 07:23:07.930844
Duration: 6.41 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
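
A report of this kind can be regenerated from a CSV export of the dataset. The following is a minimal sketch, not the configuration actually used for this run: the file paths are placeholders, `build_report` is a hypothetical helper, and the file encoding is not stated in this report, so an `encoding` argument may need to be added to `read_csv`.

```python
def build_report(csv_path, html_path, title="Profiling Report"):
    """Profile a CSV export of the dataset and write an HTML report."""
    # Imported lazily so the function can be defined without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Read 단속년월 as a string so values such as 199912 keep their YYYYMM form
    # (this report types that column as Text).
    df = pd.read_csv(csv_path, dtype={"단속년월": str})
    report = ProfileReport(df, title=title)
    report.to_file(html_path)
    return report

# Hypothetical usage; both paths are placeholders:
# build_report("rice_enforcement.csv", "report.html")
```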

Variables

단속년월
Text

Distinct: 258
Distinct (%): 9.2%
Missing: 0
Missing (%): 0.0%
Memory size: 22.1 KiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 16836
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 0.1%

Sample

1st row: 199912
2nd row: 200009
3rd row: 200009
4th row: 200101
5th row: 200101
Value | Count | Frequency (%)
201101 | 17 | 0.6%
201104 | 17 | 0.6%
201301 | 17 | 0.6%
201009 | 17 | 0.6%
201302 | 17 | 0.6%
200507 | 16 | 0.6%
200602 | 16 | 0.6%
201209 | 16 | 0.6%
200502 | 16 | 0.6%
201208 | 16 | 0.6%
Other values (249) | 2642 | 94.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 6573 | 39.0%
2 | 3826 | 22.7%
1 | 2888 | 17.2%
5 | 546 | 3.2%
6 | 541 | 3.2%
9 | 530 | 3.1%
8 | 521 | 3.1%
7 | 520 | 3.1%
4 | 453 | 2.7%
3 | 437 | 2.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 16835 | > 99.9%
Space Separator | 1 | < 0.1%
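
A single Space Separator character among 16835 digits suggests that one 단속년월 value is not a clean six-digit YYYYMM string. The sketch below shows one way to locate such values. The input list is made up for illustration, since the offending value itself does not appear in this report, and `find_malformed` is a hypothetical helper name.

```python
import re

# Six digits: a four-digit year followed by a month from 01 to 12.
YYYYMM = re.compile(r"(\d{4})(0[1-9]|1[0-2])")

def find_malformed(values):
    """Return the values that are not well-formed YYYYMM strings."""
    return [v for v in values if YYYYMM.fullmatch(v) is None]

# Illustrative input only; "2005 7" is an invented malformed value.
sample = ["199912", "200009", "2005 7", "202206"]
print(find_malformed(sample))
```

Running the same check on the real column would show whether the stray space is a padding error in a month or something else.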

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 6573 | 39.0%
2 | 3826 | 22.7%
1 | 2888 | 17.2%
5 | 546 | 3.2%
6 | 541 | 3.2%
9 | 530 | 3.1%
8 | 521 | 3.1%
7 | 520 | 3.1%
4 | 453 | 2.7%
3 | 437 | 2.6%

Space Separator
Value | Count | Frequency (%)
(space) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 16836 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 6573 | 39.0%
2 | 3826 | 22.7%
1 | 2888 | 17.2%
5 | 546 | 3.2%
6 | 541 | 3.2%
9 | 530 | 3.1%
8 | 521 | 3.1%
7 | 520 | 3.1%
4 | 453 | 2.7%
3 | 437 | 2.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 16836 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 6573 | 39.0%
2 | 3826 | 22.7%
1 | 2888 | 17.2%
5 | 546 | 3.2%
6 | 541 | 3.2%
9 | 530 | 3.1%
8 | 521 | 3.1%
7 | 520 | 3.1%
4 | 453 | 2.7%
3 | 437 | 2.6%

시도명
Categorical

Distinct: 18
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 22.1 KiB

Value | Count
경기도 | 253
강원도 | 245
충청남도 | 239
경상남도 | 231
전라남도 | 216
Other values (13) | 1622

Length

Max length: 7
Median length: 5
Mean length: 4.2494654
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전라남도
2nd row: 충청북도
3rd row: 전라남도
4th row: 인천광역시
5th row: 경기도

Common Values

Value | Count | Frequency (%)
경기도 | 253 | 9.0%
강원도 | 245 | 8.7%
충청남도 | 239 | 8.5%
경상남도 | 231 | 8.2%
전라남도 | 216 | 7.7%
경상북도 | 213 | 7.6%
충청북도 | 206 | 7.3%
전라북도 | 201 | 7.2%
인천광역시 | 172 | 6.1%
서울특별시 | 154 | 5.5%
Other values (8) | 676 | 24.1%

Length

Histogram of lengths of the category

조사건수
Real number (ℝ)

Distinct: 116
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.635424
Minimum: 1
Maximum: 229
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 24.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 5
Median: 13
Q3: 24
95-th percentile: 57
Maximum: 229
Range: 228
Interquartile range (IQR): 19

Descriptive statistics

Standard deviation: 20.46753
Coefficient of variation (CV): 1.0983131
Kurtosis: 12.421564
Mean: 18.635424
Median Absolute Deviation (MAD): 9
Skewness: 2.7895214
Sum: 52291
Variance: 418.9198
Monotonicity: Not monotonic
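
Several of the 조사건수 figures are derived from one another, which allows a quick consistency check: mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean, IQR = Q3 minus Q1, and range = maximum minus minimum. The sketch below recomputes them from the values printed in this section and in the overview. It only checks the arithmetic and does not touch the raw data.

```python
# Values copied from this report.
n = 2806                   # Number of observations
total = 52291              # Sum
std = 20.46753             # Standard deviation
q1, q3 = 5, 24             # Q1, Q3
minimum, maximum = 1, 229  # Minimum, Maximum

mean = total / n           # reported: 18.635424
variance = std ** 2        # reported: 418.9198
cv = std / mean            # reported: 1.0983131
iqr = q3 - q1              # reported: 19
value_range = maximum - minimum  # reported: 228

print(round(mean, 6), round(variance, 4), round(cv, 5), iqr, value_range)
```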
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
1 | 200 | 7.1%
3 | 161 | 5.7%
4 | 136 | 4.8%
2 | 127 | 4.5%
5 | 114 | 4.1%
7 | 109 | 3.9%
6 | 105 | 3.7%
13 | 104 | 3.7%
10 | 92 | 3.3%
12 | 92 | 3.3%
Other values (106) | 1566 | 55.8%

Minimum values
Value | Count | Frequency (%)
1 | 200 | 7.1%
2 | 127 | 4.5%
3 | 161 | 5.7%
4 | 136 | 4.8%
5 | 114 | 4.1%
6 | 105 | 3.7%
7 | 109 | 3.9%
8 | 84 | 3.0%
9 | 81 | 2.9%
10 | 92 | 3.3%

Maximum values
Value | Count | Frequency (%)
229 | 1 | < 0.1%
178 | 1 | < 0.1%
169 | 1 | < 0.1%
147 | 1 | < 0.1%
144 | 1 | < 0.1%
141 | 2 | 0.1%
132 | 1 | < 0.1%
125 | 2 | 0.1%
124 | 1 | < 0.1%
122 | 2 | 0.1%

위반업체수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 9
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1849608
Minimum: 0
Maximum: 9
Zeros: 2456
Zeros (%): 87.5%
Negative: 0
Negative (%): 0.0%
Memory size: 24.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1
Maximum: 9
Range: 9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.61153655
Coefficient of variation (CV): 3.3063036
Kurtosis: 42.79175
Mean: 0.1849608
Median Absolute Deviation (MAD): 0
Skewness: 5.3961183
Sum: 519
Variance: 0.37397695
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Common values
Value | Count | Frequency (%)
0 | 2456 | 87.5%
1 | 256 | 9.1%
2 | 53 | 1.9%
3 | 25 | 0.9%
4 | 7 | 0.2%
5 | 5 | 0.2%
7 | 2 | 0.1%
9 | 1 | < 0.1%
6 | 1 | < 0.1%

Minimum values
Value | Count | Frequency (%)
0 | 2456 | 87.5%
1 | 256 | 9.1%
2 | 53 | 1.9%
3 | 25 | 0.9%
4 | 7 | 0.2%
5 | 5 | 0.2%
6 | 1 | < 0.1%
7 | 2 | 0.1%
9 | 1 | < 0.1%

Maximum values
Value | Count | Frequency (%)
9 | 1 | < 0.1%
7 | 2 | 0.1%
6 | 1 | < 0.1%
5 | 5 | 0.2%
4 | 7 | 0.2%
3 | 25 | 0.9%
2 | 53 | 1.9%
1 | 256 | 9.1%
0 | 2456 | 87.5%

지정용도외 사용 건수
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.1 KiB

Value | Count
0 | 2783
1 | 21
3 | 1
2 | 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 2
Unique (%): 0.1%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 2783 | 99.2%
1 | 21 | 0.7%
3 | 1 | < 0.1%
2 | 1 | < 0.1%

Length

Histogram of lengths of the category


관리대장미비치 건수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 8
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.10192445
Minimum: 0
Maximum: 9
Zeros: 2625
Zeros (%): 93.5%
Negative: 0
Negative (%): 0.0%
Memory size: 24.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1
Maximum: 9
Range: 9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.48831246
Coefficient of variation (CV): 4.7909257
Kurtosis: 85.098787
Mean: 0.10192445
Median Absolute Deviation (MAD): 0
Skewness: 7.7779058
Sum: 286
Variance: 0.23844906
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
Common values
Value | Count | Frequency (%)
0 | 2625 | 93.5%
1 | 124 | 4.4%
2 | 35 | 1.2%
3 | 10 | 0.4%
4 | 5 | 0.2%
6 | 3 | 0.1%
5 | 3 | 0.1%
9 | 1 | < 0.1%

Minimum values
Value | Count | Frequency (%)
0 | 2625 | 93.5%
1 | 124 | 4.4%
2 | 35 | 1.2%
3 | 10 | 0.4%
4 | 5 | 0.2%
5 | 3 | 0.1%
6 | 3 | 0.1%
9 | 1 | < 0.1%

Maximum values
Value | Count | Frequency (%)
9 | 1 | < 0.1%
6 | 3 | 0.1%
5 | 3 | 0.1%
4 | 5 | 0.2%
3 | 10 | 0.4%
2 | 35 | 1.2%
1 | 124 | 4.4%
0 | 2625 | 93.5%

표시위반 건수
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.1 KiB

Value | Count
0 | 2803
1 | 3

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 2803 | 99.9%
1 | 3 | 0.1%

Length

Histogram of lengths of the category


기타
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 9
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.106201
Minimum: 0
Maximum: 10
Zeros: 2612
Zeros (%): 93.1%
Negative: 0
Negative (%): 0.0%
Memory size: 24.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1
Maximum: 10
Range: 10
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.50181563
Coefficient of variation (CV): 4.7251499
Kurtosis: 101.07958
Mean: 0.106201
Median Absolute Deviation (MAD): 0
Skewness: 8.293462
Sum: 298
Variance: 0.25181893
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Common values
Value | Count | Frequency (%)
0 | 2612 | 93.1%
1 | 141 | 5.0%
2 | 31 | 1.1%
3 | 9 | 0.3%
4 | 6 | 0.2%
5 | 3 | 0.1%
6 | 2 | 0.1%
10 | 1 | < 0.1%
7 | 1 | < 0.1%

Minimum values
Value | Count | Frequency (%)
0 | 2612 | 93.1%
1 | 141 | 5.0%
2 | 31 | 1.1%
3 | 9 | 0.3%
4 | 6 | 0.2%
5 | 3 | 0.1%
6 | 2 | 0.1%
7 | 1 | < 0.1%
10 | 1 | < 0.1%

Maximum values
Value | Count | Frequency (%)
10 | 1 | < 0.1%
7 | 1 | < 0.1%
6 | 2 | 0.1%
5 | 3 | 0.1%
4 | 6 | 0.2%
3 | 9 | 0.3%
2 | 31 | 1.1%
1 | 141 | 5.0%
0 | 2612 | 93.1%

Interactions

[Figures: 16 pairwise interaction plots of the 4 numeric variables]

Correlations

Variable | 시도명 | 조사건수 | 위반업체수 | 지정용도외 사용 건수 | 관리대장미비치 건수 | 표시위반 건수 | 기타
시도명 | 1.000 | 0.458 | 0.207 | 0.000 | 0.164 | 0.000 | 0.163
조사건수 | 0.458 | 1.000 | 0.413 | 0.000 | 0.306 | 0.081 | 0.322
위반업체수 | 0.207 | 0.413 | 1.000 | 0.289 | 0.835 | 0.187 | 0.907
지정용도외 사용 건수 | 0.000 | 0.000 | 0.289 | 1.000 | 0.000 | 0.000 | 0.034
관리대장미비치 건수 | 0.164 | 0.306 | 0.835 | 0.000 | 1.000 | 0.125 | 0.294
표시위반 건수 | 0.000 | 0.081 | 0.187 | 0.000 | 0.125 | 1.000 | 0.000
기타 | 0.163 | 0.322 | 0.907 | 0.034 | 0.294 | 0.000 | 1.000

Variable | 지정용도외 사용 건수 | 표시위반 건수 | 시도명
지정용도외 사용 건수 | 1.000 | 0.000 | 0.000
표시위반 건수 | 0.000 | 1.000 | 0.000
시도명 | 0.000 | 0.000 | 1.000

Variable | 조사건수 | 위반업체수 | 관리대장미비치 건수 | 기타 | 시도명 | 지정용도외 사용 건수 | 표시위반 건수
조사건수 | 1.000 | 0.238 | 0.191 | 0.153 | 0.169 | 0.000 | 0.081
위반업체수 | 0.238 | 1.000 | 0.702 | 0.724 | 0.070 | 0.188 | 0.186
관리대장미비치 건수 | 0.191 | 0.702 | 1.000 | 0.166 | 0.069 | 0.000 | 0.094
기타 | 0.153 | 0.724 | 0.166 | 1.000 | 0.031 | 0.000 | 0.000
시도명 | 0.169 | 0.070 | 0.069 | 0.031 | 1.000 | 0.000 | 0.000
지정용도외 사용 건수 | 0.000 | 0.188 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000
표시위반 건수 | 0.081 | 0.186 | 0.094 | 0.000 | 0.000 | 0.000 | 1.000
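
The High correlation alerts can be cross-checked against the first correlation matrix in this section by listing the off-diagonal pairs whose coefficient reaches a cutoff. The sketch below uses a cutoff of 0.5, which is an assumption made for illustration and not a value stated in this report; any cutoff above 0.458 and at most 0.835 selects the same pairs. `high_pairs` is a hypothetical helper name.

```python
# Column order and values copied from the first correlation matrix in this section.
columns = ["시도명", "조사건수", "위반업체수", "지정용도외 사용 건수",
           "관리대장미비치 건수", "표시위반 건수", "기타"]
matrix = [
    [1.000, 0.458, 0.207, 0.000, 0.164, 0.000, 0.163],
    [0.458, 1.000, 0.413, 0.000, 0.306, 0.081, 0.322],
    [0.207, 0.413, 1.000, 0.289, 0.835, 0.187, 0.907],
    [0.000, 0.000, 0.289, 1.000, 0.000, 0.000, 0.034],
    [0.164, 0.306, 0.835, 0.000, 1.000, 0.125, 0.294],
    [0.000, 0.081, 0.187, 0.000, 0.125, 1.000, 0.000],
    [0.163, 0.322, 0.907, 0.034, 0.294, 0.000, 1.000],
]

def high_pairs(columns, matrix, threshold=0.5):
    """List (column_a, column_b, value) for off-diagonal entries at or above threshold."""
    pairs = []
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):  # upper triangle only; the matrix is symmetric
            if abs(matrix[i][j]) >= threshold:
                pairs.append((columns[i], columns[j], matrix[i][j]))
    return pairs

for a, b, value in high_pairs(columns, matrix):
    print(a, b, value)
```

With this cutoff only 위반업체수 paired with 관리대장미비치 건수 (0.835) and with 기타 (0.907) are selected, which agrees with the three High correlation alerts.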

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

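First rows
Index | 단속년월 | 시도명 | 조사건수 | 위반업체수 | 지정용도외 사용 건수 | 관리대장미비치 건수 | 표시위반 건수 | 기타
0 | 199912 | 전라남도 | 1 | 0 | 0 | 0 | 0 | 0
1 | 200009 | 충청북도 | 1 | 0 | 0 | 0 | 0 | 0
2 | 200009 | 전라남도 | 1 | 0 | 0 | 0 | 0 | 0
3 | 200101 | 인천광역시 | 1 | 0 | 0 | 0 | 0 | 0
4 | 200101 | 경기도 | 7 | 0 | 0 | 0 | 0 | 0
5 | 200101 | 강원도 | 9 | 0 | 0 | 0 | 0 | 0
6 | 200101 | 충청북도 | 22 | 0 | 0 | 0 | 0 | 0
7 | 200101 | 충청남도 | 4 | 0 | 0 | 0 | 0 | 0
8 | 200101 | 전라남도 | 2 | 0 | 0 | 0 | 0 | 0
9 | 200101 | 경상북도 | 12 | 0 | 0 | 0 | 0 | 0

Last rows
Index | 단속년월 | 시도명 | 조사건수 | 위반업체수 | 지정용도외 사용 건수 | 관리대장미비치 건수 | 표시위반 건수 | 기타
2796 | 202206 | 울산광역시 | 3 | 0 | 0 | 0 | 0 | 0
2797 | 202206 | 부산광역시 | 12 | 0 | 0 | 0 | 0 | 0
2798 | 202206 | 경기도 | 53 | 3 | 0 | 2 | 1 | 0
2799 | 202206 | 강원도 | 4 | 0 | 0 | 0 | 0 | 0
2800 | 202206 | 충청북도 | 13 | 2 | 0 | 2 | 0 | 0
2801 | 202206 | 충청남도 | 22 | 0 | 0 | 0 | 0 | 0
2802 | 202206 | 전라북도 | 15 | 0 | 0 | 0 | 0 | 0
2803 | 202206 | 전라남도 | 68 | 0 | 0 | 0 | 0 | 0
2804 | 202206 | 경상북도 | 5 | 0 | 0 | 0 | 0 | 0
2805 | 202206 | 경상남도 | 11 | 0 | 0 | 0 | 0 | 0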