Overview

Dataset statistics

Number of variables: 4
Number of observations: 30
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.2 KiB
Average record size in memory: 39.4 B

Variable types

Text: 1
Numeric: 2
Categorical: 1

Dataset

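Description: Inspection status of food products, agricultural products, and fishery products distributed within the province
Author: Jeollabuk-do (전라북도)
URL: https://www.bigdatahub.go.kr/index.jeonbuk?startPage=24&menuCd=DOM_000000103007001000&pListTypeStr=&pId=3084463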

Alerts

자가품질건수 (self-quality inspection count) is highly overall correlated with 적합 (compliant) and 1 other field [High correlation]
적합 is highly overall correlated with 자가품질건수 and 1 other field [High correlation]
부적합 (noncompliant) is highly overall correlated with 자가품질건수 and 1 other field [High correlation]
부적합 is highly imbalanced (57.4%) [Imbalance]
식품 유형 (food type) has unique values [Unique]
자가품질건수 has 4 (13.3%) zeros [Zeros]
적합 has 4 (13.3%) zeros [Zeros]
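
The 57.4% imbalance figure matches one minus the normalized Shannon entropy of the class frequencies. The sketch below assumes that formula (it is inferred from the number, not stated in the report) and applies it to the value counts reported for 부적합: 26 zeros, 3 ones, and 1 three. The function name `imbalance_score` is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the class
    frequencies and k is the number of classes (assumed formula)."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return 1.0 - entropy / math.log(k)

# Value counts of 부적합 as reported: 0 -> 26, 1 -> 3, 3 -> 1
score = imbalance_score([26, 3, 1])
print(f"{score:.1%}")
```

A score of 0 would mean perfectly balanced classes and a score of 1 a single dominant class, so 57.4% reflects that 26 of the 30 rows share the value 0.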

Reproduction

Analysis started: 2024-03-14 03:24:52.469140
Analysis finished: 2024-03-14 03:24:53.090616
Duration: 0.62 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
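
A report like this one can be regenerated along the following lines. This is a minimal sketch, assuming the dataset has been saved as a CSV file; the function name `build_report`, the file names, and the default title are hypothetical, and the exact settings used for the original run are in config.json rather than here.

```python
def build_report(csv_path, html_path, title="Profiling report"):
    """Profile a CSV file and write the result as an HTML report."""
    # Imported lazily so this module loads even where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)                # load the raw table
    profile = ProfileReport(df, title=title)  # compute the statistics
    profile.to_file(html_path)                # render to HTML
    return html_path
```

Usage would look like `build_report("food_inspection.csv", "report.html")`, where both file names are placeholders.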

Variables

식품 유형 (food type)
Text

Alerts: UNIQUE

Distinct: 30
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 13
Median length: 8
Mean length: 4.7
Min length: 2

Characters and Unicode

Total characters: 141
Distinct characters: 76
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
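
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using the five values listed under Sample for this variable. It covers only those five values, so the totals differ from the full-column figures in the report; the helper name `category_counts` is illustrative.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over all characters in `values`."""
    counts = Counter()
    for value in values:
        # NFC keeps each Hangul syllable as one precomposed code point.
        for ch in unicodedata.normalize("NFC", value):
            counts[unicodedata.category(ch)] += 1
    return counts

# First five values of 식품 유형 as listed in the report's sample.
samples = ["과자류", "빵또는 떡류", "코코아가공품 및 초콜릿류", "잼류", "올리고당류"]
counts = category_counts(samples)
print(dict(counts))  # "Lo" = Other Letter, "Zs" = Space Separator
```

Hangul syllables fall in category `Lo` (Other Letter) and the ASCII space in `Zs` (Space Separator), which are the two categories the report lists.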

Unique

Unique: 30
Unique (%): 100.0%

Sample

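1st row: 과자류 (confectionery)
2nd row: 빵또는 떡류 (bread or rice cakes)
3rd row: 코코아가공품 및 초콜릿류 (processed cocoa products and chocolates)
4th row: 잼류 (jams)
5th row: 올리고당류 (oligosaccharides)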
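
Most frequent words

Value | Count | Frequency (%)
과자류 (confectionery) | 1 | 2.8%
빵또는 (bread or) | 1 | 2.8%
젓갈류 (salted seafood) | 1 | 2.8%
절임식품 (pickled foods) | 1 | 2.8%
조림식품 (braised foods) | 1 | 2.8%
주류 (alcoholic beverages) | 1 | 2.8%
건포류 (dried fillets) | 1 | 2.8%
기타식품류 (other foods) | 1 | 2.8%
규격외일반가공품 (general processed foods outside the standards) | 1 | 2.8%
장기보존식품 (long shelf-life foods) | 1 | 2.8%
Other values (26) | 26 | 72.2%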

Most occurring characters

Value (glyph not preserved) | Count | Frequency (%)
- | 18 | 12.8%
- | 12 | 8.5%
- | 10 | 7.1%
- | 6 | 4.3%
- | 5 | 3.5%
- | 3 | 2.1%
- | 3 | 2.1%
- | 3 | 2.1%
- | 3 | 2.1%
- | 3 | 2.1%
Other values (66) | 75 | 53.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 135 | 95.7%
Space Separator | 6 | 4.3%

Most frequent character per category

Other Letter
Value (glyph not preserved) | Count | Frequency (%)
- | 18 | 13.3%
- | 12 | 8.9%
- | 10 | 7.4%
- | 5 | 3.7%
- | 3 | 2.2%
- | 3 | 2.2%
- | 3 | 2.2%
- | 3 | 2.2%
- | 3 | 2.2%
- | 2 | 1.5%
Other values (65) | 73 | 54.1%

Space Separator
Value (glyph not preserved) | Count | Frequency (%)
- | 6 | 100.0%

Most occurring scripts

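Value | Count | Frequency (%)
Hangul | 135 | 95.7%
Common | 6 | 4.3%

Most frequent character per script

Hangul
Value (glyph not preserved) | Count | Frequency (%)
- | 18 | 13.3%
- | 12 | 8.9%
- | 10 | 7.4%
- | 5 | 3.7%
- | 3 | 2.2%
- | 3 | 2.2%
- | 3 | 2.2%
- | 3 | 2.2%
- | 3 | 2.2%
- | 2 | 1.5%
Other values (65) | 73 | 54.1%

Common
Value (glyph not preserved) | Count | Frequency (%)
- | 6 | 100.0%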

Most occurring blocks

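Value | Count | Frequency (%)
Hangul | 135 | 95.7%
ASCII | 6 | 4.3%

Most frequent character per block

Hangul
Value (glyph not preserved) | Count | Frequency (%)
- | 18 | 13.3%
- | 12 | 8.9%
- | 10 | 7.4%
- | 5 | 3.7%
- | 3 | 2.2%
- | 3 | 2.2%
- | 3 | 2.2%
- | 3 | 2.2%
- | 3 | 2.2%
- | 2 | 1.5%
Other values (65) | 73 | 54.1%

ASCII
Value (glyph not preserved) | Count | Frequency (%)
- | 6 | 100.0%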

자가품질건수 (self-quality inspection count)
Real number (ℝ)

Alerts: HIGH CORRELATION, ZEROS

Distinct: 23
Distinct (%): 76.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 54.5
Minimum: 0
Maximum: 395
Zeros: 4
Zeros (%): 13.3%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 2.75
Median: 19
Q3: 60.75
95th percentile: 293.75
Maximum: 395
Range: 395
Interquartile range (IQR): 58

Descriptive statistics

Standard deviation: 97.45901
Coefficient of variation (CV): 1.7882387
Kurtosis: 6.1050754
Mean: 54.5
Median Absolute Deviation (MAD): 18
Skewness: 2.5873196
Sum: 1635
Variance: 9498.2586
Monotonicity: Not monotonic
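
Several of these figures are derived from one another, which gives a quick consistency check. The sketch below only re-derives values from numbers already printed in the report for 자가품질건수 (mean, standard deviation, sum, Q1, Q3); the variable names are illustrative, and the tolerances allow for the rounding in the printed values.

```python
import math

# Figures copied from the report for 자가품질건수.
n = 30
total = 1635
mean = 54.5
std = 97.45901
q1, q3 = 2.75, 60.75

cv = std / mean        # coefficient of variation = std / mean
variance = std ** 2    # variance = std squared
iqr = q3 - q1          # interquartile range = Q3 - Q1

print(f"mean={total / n}, cv={cv:.7f}, variance={variance:.4f}, iqr={iqr}")

# Compare against the values printed in the report.
assert math.isclose(total / n, mean)
assert math.isclose(cv, 1.7882387, abs_tol=1e-6)
assert math.isclose(variance, 9498.2586, abs_tol=1e-3)
assert iqr == 58
```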
[Figure: histogram with fixed size bins (bins=23)]

Common values

Value | Count | Frequency (%)
0 | 4 | 13.3%
1 | 3 | 10.0%
65 | 2 | 6.7%
5 | 2 | 6.7%
291 | 1 | 3.3%
8 | 1 | 3.3%
45 | 1 | 3.3%
395 | 1 | 3.3%
6 | 1 | 3.3%
296 | 1 | 3.3%
Other values (13) | 13 | 43.3%

Minimum 10 values

Value | Count | Frequency (%)
0 | 4 | 13.3%
1 | 3 | 10.0%
2 | 1 | 3.3%
5 | 2 | 6.7%
6 | 1 | 3.3%
7 | 1 | 3.3%
8 | 1 | 3.3%
11 | 1 | 3.3%
17 | 1 | 3.3%
21 | 1 | 3.3%

Maximum 10 values

Value | Count | Frequency (%)
395 | 1 | 3.3%
296 | 1 | 3.3%
291 | 1 | 3.3%
103 | 1 | 3.3%
65 | 2 | 6.7%
63 | 1 | 3.3%
62 | 1 | 3.3%
57 | 1 | 3.3%
45 | 1 | 3.3%
36 | 1 | 3.3%
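
The quantile statistics reported for 자가품질건수 appear consistent with linear interpolation between order statistics, the default method in NumPy and pandas. That is an inference from the numbers, not something the report states. The sketch below assumes that the two value tables for this variable list its 10 smallest and 10 largest distinct values with their counts, which fixes 27 of the 30 sorted positions. The three middle positions are left unknown rather than guessed, and the quantiles checked here do not depend on them. The names `known` and `quantile` are illustrative.

```python
import math

N = 30  # number of observations

# Sorted values recoverable from the smallest-values and largest-values tables.
lowest = [0, 0, 0, 0, 1, 1, 1, 2, 5, 5, 6, 7, 8, 11, 17, 21]  # ranks 0..15
highest = [36, 45, 57, 62, 63, 65, 65, 103, 291, 296, 395]     # ranks 19..29

# Map 0-based rank -> value; ranks 16..18 are unknown and stay absent.
known = {i: v for i, v in enumerate(lowest)}
known.update({N - len(highest) + i: v for i, v in enumerate(highest)})

def quantile(q):
    """Linear-interpolation quantile; raises KeyError if an unknown rank is needed."""
    pos = q * (N - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, N - 1)
    frac = pos - lo
    return known[lo] + frac * (known[hi] - known[lo])

for label, q in [("Q1", 0.25), ("median", 0.5), ("Q3", 0.75), ("95th", 0.95)]:
    print(label, round(quantile(q), 2))
```

For example, Q1 sits at position 0.25 × 29 = 7.25, between the values 2 and 5 at ranks 7 and 8, giving 2 + 0.25 × 3 = 2.75.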

적합 (compliant)
Real number (ℝ)

Alerts: HIGH CORRELATION, ZEROS

Distinct: 23
Distinct (%): 76.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 54.3
Minimum: 0
Maximum: 395
Zeros: 4
Zeros (%): 13.3%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 2.75
Median: 18.5
Q3: 60.75
95th percentile: 291.65
Maximum: 395
Range: 395
Interquartile range (IQR): 58

Descriptive statistics

Standard deviation: 97.114987
Coefficient of variation (CV): 1.7884896
Kurtosis: 6.1597587
Mean: 54.3
Median Absolute Deviation (MAD): 17.5
Skewness: 2.5934632
Sum: 1629
Variance: 9431.3207
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=23)]

Common values

Value | Count | Frequency (%)
0 | 4 | 13.3%
1 | 3 | 10.0%
65 | 2 | 6.7%
5 | 2 | 6.7%
290 | 1 | 3.3%
8 | 1 | 3.3%
45 | 1 | 3.3%
395 | 1 | 3.3%
6 | 1 | 3.3%
293 | 1 | 3.3%
Other values (13) | 13 | 43.3%

Minimum 10 values

Value | Count | Frequency (%)
0 | 4 | 13.3%
1 | 3 | 10.0%
2 | 1 | 3.3%
5 | 2 | 6.7%
6 | 1 | 3.3%
7 | 1 | 3.3%
8 | 1 | 3.3%
11 | 1 | 3.3%
17 | 1 | 3.3%
20 | 1 | 3.3%

Maximum 10 values

Value | Count | Frequency (%)
395 | 1 | 3.3%
293 | 1 | 3.3%
290 | 1 | 3.3%
102 | 1 | 3.3%
65 | 2 | 6.7%
63 | 1 | 3.3%
62 | 1 | 3.3%
57 | 1 | 3.3%
45 | 1 | 3.3%
36 | 1 | 3.3%

부적합 (noncompliant)
Categorical

Alerts: HIGH CORRELATION, IMBALANCE

Distinct: 3
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B
Value counts: 0 (26), 1 (3), 3 (1)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 3.3%

Sample

1st row: 1
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 26 | 86.7%
1 | 3 | 10.0%
3 | 1 | 3.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Value | Count | Frequency (%)
0 | 26 | 86.7%
1 | 3 | 10.0%
3 | 1 | 3.3%

Interactions

[Figures: four interaction plots]

Correlations

(variable) | 식품 유형 | 자가품질건수 | 적합 | 부적합
식품 유형 | 1.000 | 1.000 | 1.000 | 1.000
자가품질건수 | 1.000 | 1.000 | 1.000 | 0.669
적합 | 1.000 | 1.000 | 1.000 | 0.669
부적합 | 1.000 | 0.669 | 0.669 | 1.000

(variable) | 자가품질건수 | 적합 | 부적합
자가품질건수 | 1.000 | 1.000 | 0.615
적합 | 1.000 | 1.000 | 0.615
부적합 | 0.615 | 0.615 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

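First rows

Index | 식품 유형 (food type) | 자가품질건수 | 적합 | 부적합
0 | 과자류 (confectionery) | 291 | 290 | 1
1 | 빵또는 떡류 (bread or rice cakes) | 36 | 36 | 0
2 | 코코아가공품 및 초콜릿류 (processed cocoa products and chocolates) | 65 | 65 | 0
3 | 잼류 (jams) | 5 | 5 | 0
4 | 올리고당류 (oligosaccharides) | 7 | 7 | 0
5 | 두부류 또는 묵류 (tofu or muk jelly) | 11 | 11 | 0
6 | 식용유지류 (edible fats and oils) | 63 | 63 | 0
7 | 면류 (noodles) | 24 | 24 | 0
8 | 다류 (teas) | 21 | 20 | 1
9 | 커피 (coffee) | 0 | 0 | 0

Last rows

Index | 식품 유형 (food type) | 자가품질건수 | 적합 | 부적합
20 | 기타식품류 (other foods) | 103 | 102 | 1
21 | 규격외일반가공품 (general processed foods outside the standards) | 65 | 65 | 0
22 | 장기보존식품 (long shelf-life foods) | 1 | 1 | 0
23 | 건강기능식품 (health functional foods) | 57 | 57 | 0
24 | 식품첨가물 (food additives) | 0 | 0 | 0
25 | 기구및용기포장 (utensils, containers and packaging) | 0 | 0 | 0
26 | 식품접객업소 조리판매식품 (foods prepared and sold at food service establishments) | 296 | 293 | 3
27 | 위생용품 (sanitary products) | 6 | 6 | 0
28 | 농산물 (agricultural products) | 395 | 395 | 0
29 | 수산물 (fishery products) | 45 | 45 | 0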
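
In every displayed row, 자가품질건수 equals 적합 plus 부적합, which would explain the 1.000 correlation between the first two columns. The sketch below checks that identity on the 20 rows shown above (rows 10 to 19 are not part of the sample, so they are not covered) and lists the food types with at least one noncompliant result. The tuple layout and variable names are illustrative.

```python
# (식품 유형, 자가품질건수, 적합, 부적합) for the 20 rows shown in the sample.
rows = [
    ("과자류", 291, 290, 1), ("빵또는 떡류", 36, 36, 0),
    ("코코아가공품 및 초콜릿류", 65, 65, 0), ("잼류", 5, 5, 0),
    ("올리고당류", 7, 7, 0), ("두부류 또는 묵류", 11, 11, 0),
    ("식용유지류", 63, 63, 0), ("면류", 24, 24, 0),
    ("다류", 21, 20, 1), ("커피", 0, 0, 0),
    ("기타식품류", 103, 102, 1), ("규격외일반가공품", 65, 65, 0),
    ("장기보존식품", 1, 1, 0), ("건강기능식품", 57, 57, 0),
    ("식품첨가물", 0, 0, 0), ("기구및용기포장", 0, 0, 0),
    ("식품접객업소 조리판매식품", 296, 293, 3), ("위생용품", 6, 6, 0),
    ("농산물", 395, 395, 0), ("수산물", 45, 45, 0),
]

# Rows where the inspection count is not the sum of the two outcomes.
mismatches = [name for name, total, ok, bad in rows if total != ok + bad]

# Noncompliance rate per food type, only where something failed.
failure_rates = {name: bad / total for name, total, ok, bad in rows if bad > 0}

print("mismatches:", mismatches)
for name, rate in failure_rates.items():
    print(f"{name}: {rate:.2%}")
```

The noncompliant counts in these rows sum to 6, the same as the difference between the two reported column sums (1635 and 1629).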