Overview

Dataset statistics

Number of variables: 4
Number of observations: 36
Missing cells: 4
Missing cells (%): 2.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.4 KiB
Average record size in memory: 38.7 B
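
The percentages in this block follow from the raw counts. A minimal sketch of the arithmetic, assuming the missing-cell percentage is taken over every cell (variables × observations); the variable names are illustrative only:

```python
# Figures copied from the dataset statistics in this report.
n_variables = 4
n_observations = 36
missing_cells = 4
duplicate_rows = 0

# Assumption: the missing-cell percentage is computed over all cells in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```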

Variable types

Text: 1
Numeric: 2
Categorical: 1

Dataset

Description: Inspection status of distributed foods, agricultural products, and fishery products in the province
Author: Jeollabuk-do (전라북도)
URL: https://www.bigdatahub.go.kr/index.jeonbuk?startPage=24&menuCd=DOM_000000103007001000&pListTypeStr=&pId=3084463

Alerts

비율 is highly overall correlated with 유통식품 and 1 other field [High correlation]
유통식품 is highly overall correlated with 비율 and 1 other field [High correlation]
부적합 is highly overall correlated with 비율 and 1 other field [High correlation]
부적합 is highly imbalanced (69.1%) [Imbalance]
유통식품 has 4 (11.1%) missing values [Missing]
식품유형 has unique values [Unique]
비율 has 4 (11.1%) zeros [Zeros]

Reproduction

Analysis started: 2024-03-14 03:24:55.201795
Analysis finished: 2024-03-14 03:24:55.692913
Duration: 0.49 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
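
A report of this kind can be regenerated with ydata-profiling. The following is a sketch only: `build_report`, `csv_path`, and the output file name are hypothetical, and it assumes the source data is available as a CSV file. The metadata keys mirror the Dataset section of this report.

```python
# Dataset metadata as shown in the "Dataset" section of this report.
DATASET_METADATA = {
    "description": "도내 유통식품, 농산물, 수산물 검사 현황",
    "author": "전라북도",
    "url": (
        "https://www.bigdatahub.go.kr/index.jeonbuk?startPage=24"
        "&menuCd=DOM_000000103007001000&pListTypeStr=&pId=3084463"
    ),
}


def build_report(csv_path, output_path="report.html"):
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so the metadata can be inspected without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    report = ProfileReport(df, dataset=DATASET_METADATA)
    report.to_file(output_path)
    return output_path
```

Calling `build_report("food_inspection.csv")` (a placeholder file name) would write `report.html` next to the script.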

Variables

식품유형 (food type)
Text

UNIQUE

Distinct: 36
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 420.0 B

Length

Max length: 22
Median length: 7
Mean length: 4.5555556
Min length: 2

Characters and Unicode

Total characters: 164
Distinct characters: 82
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
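
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. This sketch is not the profiler's own code; it tallies categories for one 22-character 식품유형 value taken from the sample rows of this report. `category_counts` is an illustrative helper name.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters in `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# Lo = Other Letter, Zs = Space Separator,
# Ps = Open Punctuation, Pe = Close Punctuation.
value = "식품접객업소(집단금식소 포함)의 조리식품"
counts = category_counts(value)
print(len(value), dict(counts))
```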

Unique

Unique: 36
Unique (%): 100.0%

Sample

1st row: 과자류 (confectionery)
2nd row: 빵또는 떡류 (breads or rice cakes)
3rd row: 초콜릿류 (chocolates)
4th row: 잼류 (jams)
5th row: 설탕 (sugar)
Value  Count  Frequency (%)
과자류  1  2.4%
기타가공품  1  2.4%
드레싱류  1  2.4%
기타식품류  1  2.4%
김치류  1  2.4%
젓갈류  1  2.4%
절임식품  1  2.4%
조림식품  1  2.4%
주류  1  2.4%
건포류  1  2.4%
Other values (32)  32  76.2%

Most occurring characters

(Character column not preserved in this text version.)
Count  Frequency (%)
19  11.6%
14  8.5%
13  7.9%
6  3.7%
6  3.7%
4  2.4%
4  2.4%
3  1.8%
3  1.8%
3  1.8%
Other values (72)  89  54.3%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  156  95.1%
Space Separator  6  3.7%
Open Punctuation  1  0.6%
Close Punctuation  1  0.6%

Most frequent character per category

Other Letter
(Character column not preserved in this text version.)
Count  Frequency (%)
19  12.2%
14  9.0%
13  8.3%
6  3.8%
4  2.6%
4  2.6%
3  1.9%
3  1.9%
3  1.9%
3  1.9%
Other values (69)  84  53.8%

Space Separator
Value  Count  Frequency (%)
(space)  6  100.0%

Open Punctuation
Value  Count  Frequency (%)
(  1  100.0%

Close Punctuation
Value  Count  Frequency (%)
)  1  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  156  95.1%
Common  8  4.9%

Most frequent character per script

Hangul
(Character column not preserved in this text version.)
Count  Frequency (%)
19  12.2%
14  9.0%
13  8.3%
6  3.8%
4  2.6%
4  2.6%
3  1.9%
3  1.9%
3  1.9%
3  1.9%
Other values (69)  84  53.8%

Common
Value  Count  Frequency (%)
(space)  6  75.0%
(  1  12.5%
)  1  12.5%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  156  95.1%
ASCII  8  4.9%

Most frequent character per block

Hangul
(Character column not preserved in this text version.)
Count  Frequency (%)
19  12.2%
14  9.0%
13  8.3%
6  3.8%
4  2.6%
4  2.6%
3  1.9%
3  1.9%
3  1.9%
3  1.9%
Other values (69)  84  53.8%

ASCII
Value  Count  Frequency (%)
(space)  6  75.0%
(  1  12.5%
)  1  12.5%

비율 (ratio)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 23
Distinct (%): 63.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.7805556
Minimum: 0
Maximum: 23.5
Zeros: 4
Zeros (%): 11.1%
Negative: 0
Negative (%): 0.0%
Memory size: 456.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.2
Median: 0.85
Q3: 2.85
95-th percentile: 12.325
Maximum: 23.5
Range: 23.5
Interquartile range (IQR): 2.65

Descriptive statistics

Standard deviation: 4.8734745
Coefficient of variation (CV): 1.7526981
Kurtosis: 9.7377647
Mean: 2.7805556
Median Absolute Deviation (MAD): 0.75
Skewness: 2.9687576
Sum: 100.1
Variance: 23.750754
Monotonicity: Not monotonic
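
Several of the 비율 statistics are derived from one another, so they can be cross-checked. A small sketch using only the figures reported here and the standard definitions (mean = sum / n, variance = std², CV = std / mean, IQR = Q3 − Q1); the tolerances are an arbitrary choice to absorb rounding in the displayed values:

```python
import math

# Values copied from the 비율 statistics in this report.
n = 36
total = 100.1
mean = 2.7805556
std = 4.8734745
variance = 23.750754
cv = 1.7526981
q1, q3, iqr = 0.2, 2.85, 2.65
minimum, maximum, value_range = 0, 23.5, 23.5

checks = {
    "mean = sum / n": math.isclose(total / n, mean, rel_tol=1e-6),
    "variance = std ** 2": math.isclose(std ** 2, variance, rel_tol=1e-6),
    "CV = std / mean": math.isclose(std / mean, cv, rel_tol=1e-6),
    "IQR = Q3 - Q1": math.isclose(q3 - q1, iqr, rel_tol=1e-9),
    "range = max - min": math.isclose(maximum - minimum, value_range),
}
for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'inconsistent'}")
```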
Histogram with fixed size bins (bins=23)
Value  Count  Frequency (%)
0.1  4  11.1%
0.0  4  11.1%
0.3  4  11.1%
0.4  2  5.6%
1.6  2  5.6%
0.2  2  5.6%
2.4  2  5.6%
2.8  1  2.8%
3.6  1  2.8%
23.5  1  2.8%
Other values (13)  13  36.1%

Minimum 10 values
Value  Count  Frequency (%)
0.0  4  11.1%
0.1  4  11.1%
0.2  2  5.6%
0.3  4  11.1%
0.4  2  5.6%
0.5  1  2.8%
0.6  1  2.8%
1.1  1  2.8%
1.2  1  2.8%
1.4  1  2.8%

Maximum 10 values
Value  Count  Frequency (%)
23.5  1  2.8%
15.4  1  2.8%
11.3  1  2.8%
7.6  1  2.8%
6.0  1  2.8%
5.0  1  2.8%
4.0  1  2.8%
3.6  1  2.8%
3.0  1  2.8%
2.8  1  2.8%

유통식품 (distributed foods)
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 28
Distinct (%): 87.5%
Missing: 4
Missing (%): 11.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 85.46875
Minimum: 2
Maximum: 644
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 456.0 B

Quantile statistics

Minimum: 2
5-th percentile: 2.55
Q1: 7.75
Median: 35.5
Q3: 85.25
95-th percentile: 358.75
Maximum: 644
Range: 642
Interquartile range (IQR): 77.5

Descriptive statistics

Standard deviation: 138.80503
Coefficient of variation (CV): 1.6240443
Kurtosis: 8.645448
Mean: 85.46875
Median Absolute Deviation (MAD): 30.5
Skewness: 2.8096172
Sum: 2735
Variance: 19266.838
Monotonicity: Not monotonic
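
Because 유통식품 has missing values, the reported figures use two different denominators. The sketch below, based only on the numbers in this section, shows that Missing (%) is consistent with a denominator of all 36 rows, while Distinct (%) and the mean are consistent with the 32 non-missing rows. The variable names are illustrative.

```python
# Values copied from the 유통식품 statistics in this report.
n_observations = 36
n_missing = 4
n_distinct = 28
total = 2735

n_valid = n_observations - n_missing              # rows with a value
missing_pct = 100 * n_missing / n_observations    # over all rows
distinct_pct = 100 * n_distinct / n_valid         # over non-missing rows
mean = total / n_valid                            # over non-missing rows

print(f"Non-missing rows: {n_valid}")
print(f"Missing (%): {missing_pct:.1f}%")
print(f"Distinct (%): {distinct_pct:.1f}%")
print(f"Mean: {mean}")
```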
Histogram with fixed size bins (bins=28)
Value  Count  Frequency (%)
3  2  5.6%
2  2  5.6%
7  2  5.6%
8  2  5.6%
10  1  2.8%
6  1  2.8%
98  1  2.8%
644  1  2.8%
307  1  2.8%
43  1  2.8%
Other values (18)  18  50.0%
(Missing)  4  11.1%

Minimum 10 values
Value  Count  Frequency (%)
2  2  5.6%
3  2  5.6%
5  1  2.8%
6  1  2.8%
7  2  5.6%
8  2  5.6%
10  1  2.8%
12  1  2.8%
13  1  2.8%
17  1  2.8%

Maximum 10 values
Value  Count  Frequency (%)
644  1  2.8%
422  1  2.8%
307  1  2.8%
208  1  2.8%
164  1  2.8%
136  1  2.8%
110  1  2.8%
98  1  2.8%
81  1  2.8%
76  1  2.8%

부적합 (nonconforming)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 3
Distinct (%): 8.3%
Missing: 0
Missing (%): 0.0%
Memory size: 420.0 B

Value  Count
<NA>  33
1  2
3  1

Length

Max length: 4
Median length: 4
Mean length: 3.75
Min length: 1

Unique

Unique: 1
Unique (%): 2.8%

Sample

1st row: 1
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value  Count  Frequency (%)
<NA>  33  91.7%
1  2  5.6%
3  1  2.8%
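
The report does not state how the 69.1% imbalance figure for 부적합 is computed. One reading that reproduces it, offered here as an assumption rather than the tool's documented formula, is one minus the Shannon entropy of the class counts normalized by its maximum, with `<NA>` counted as its own class. `imbalance_score` is an illustrative name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class counts."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    # H_max = log(n_classes) is reached when all classes are equally frequent.
    return 1 - entropy / math.log(n_classes)


counts = [33, 2, 1]  # counts for <NA>, 1, 3 from the Common Values table
score = imbalance_score(counts)
print(f"{100 * score:.1f}%")  # prints 69.1%
```

A score of 0 would mean perfectly balanced classes; values near 1 mean a single class dominates, as `<NA>` does here.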

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
na  33  91.7%
1  2  5.6%
3  1  2.8%

Interactions

[Four interaction plots; images not included in this text version]

Correlations

          식품유형  비율   유통식품  부적합
식품유형  1.000     1.000  1.000     1.000
비율      1.000     1.000  1.000     1.000
유통식품  1.000     1.000  1.000     1.000
부적합    1.000     1.000  1.000     1.000

          비율   유통식품  부적합
비율      1.000  0.998     1.000
유통식품  0.998  1.000     1.000
부적합    1.000  1.000     1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index  식품유형  비율  유통식품  부적합
0  과자류  15.4  422  1
1  빵또는 떡류  2.4  67  <NA>
2  초콜릿류  3.0  81  <NA>
3  잼류  0.1  3  <NA>
4  설탕  0.1  2  <NA>
5  과당  0.0  <NA>  <NA>
6  엿류  0.3  7  <NA>
7  올리고당류  0.5  13  <NA>
8  식육또는 알가공품  0.0  <NA>  <NA>
9  어육가공품  1.6  45  <NA>

Last rows

Index  식품유형  비율  유통식품  부적합
26  기타식품류  5.0  136  <NA>
27  기타가공품  2.4  66  <NA>
28  장기보존식품  1.6  43  <NA>
29  건강기능식품  0.3  8  <NA>
30  식품첨가물  0.0  <NA>  <NA>
31  기구및용기포장  0.3  8  <NA>
32  식품접객업소(집단금식소 포함)의 조리식품  11.3  307  3
33  농산물  23.5  644  <NA>
34  수산물  3.6  98  <NA>
35  위생용품  0.2  6  <NA>
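
The report does not define 비율. The sample rows are consistent with one reading, stated here strictly as an assumption: 비율 is each row's 유통식품 count as a percentage of the column total (2735, the Sum reported for 유통식품). The sketch below checks that reading against the 17 sample rows that have a 유통식품 value; the 0.1 tolerance is an arbitrary allowance for rounding.

```python
# (유통식품, 비율) pairs from the sample rows of this report; rows with <NA> are omitted.
rows = [
    (422, 15.4), (67, 2.4), (81, 3.0), (3, 0.1), (2, 0.1), (7, 0.3),
    (13, 0.5), (45, 1.6), (136, 5.0), (66, 2.4), (43, 1.6), (8, 0.3),
    (8, 0.3), (307, 11.3), (644, 23.5), (98, 3.6), (6, 0.2),
]
TOTAL = 2735  # Sum of 유통식품 as reported in its descriptive statistics

# Assumption under test: 비율 is approximately 100 * 유통식품 / TOTAL.
gaps = [abs(100 * count / TOTAL - ratio) for count, ratio in rows]
max_gap = max(gaps)
print(f"Largest gap between computed share and reported 비율: {max_gap:.3f}")
```

If that reading is right, it would also account for the near-perfect 0.998 correlation between 비율 and 유통식품.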