Overview

Dataset statistics

Number of variables: 4
Number of observations: 28
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.1 KiB
Average record size in memory: 39.7 B

Variable types

Text: 1
Numeric: 2
Categorical: 1

Dataset

Description: Self-quality inspection status, first half of 2015 (자가품질검사현황2015년상반기)
Author: Jeollabuk-do (전라북도)
URL: https://www.bigdatahub.go.kr/opendata/dataSet/detail.nm?contentId=37&rlik=49451aebf056b486&serviceId=202417

Alerts

자기품질검사건수 (number of self-quality inspections) is highly overall correlated with 적합 (compliant) and 1 other field [High correlation]
적합 (compliant) is highly overall correlated with 자기품질검사건수 and 1 other field [High correlation]
부적합 (non-compliant) is highly overall correlated with 자기품질검사건수 and 1 other field [High correlation]
부적합 is highly imbalanced (66.9%) [Imbalance]
식품 유형 (food type) has unique values [Unique]

Reproduction

Analysis started: 2024-03-14 02:34:41.007175
Analysis finished: 2024-03-14 02:34:41.615722
Duration: 0.61 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

식품 유형
Text

UNIQUE 

Distinct: 28
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 356.0 B

Length

Max length: 13
Median length: 8
Mean length: 4.4642857
Min length: 2

Characters and Unicode

Total characters: 125
Distinct characters: 70
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
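
As a rough illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It uses only the five values of 식품 유형 listed in this report's Sample rows, so its counts cover those five strings, not the full column. The helper name `category_counts` is made up for this example and is not part of ydata-profiling.

```python
import unicodedata
from collections import Counter

# First five values of 식품 유형, copied from the report's sample rows.
samples = ["과자류", "빵또는 떡류", "코코아가공품 및 초콜릿류", "잼류", "올리고당류"]


def category_counts(values):
    """Count Unicode general categories (e.g. 'Lo', 'Zs') over all characters."""
    counts = Counter()
    for value in values:
        # Normalise to NFC so each Hangul syllable is a single code point.
        for ch in unicodedata.normalize("NFC", value):
            counts[unicodedata.category(ch)] += 1
    return counts


counts = category_counts(samples)
print(counts)  # 'Lo' = Other Letter, 'Zs' = Space Separator
```

For these five strings the only categories that appear are `Lo` and `Zs`, the same two categories the report lists for the whole column.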

Unique

Unique: 28
Unique (%): 100.0%

Sample

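1st row: 과자류 (confectionery)
2nd row: 빵또는 떡류 (bread or rice cakes)
3rd row: 코코아가공품 및 초콜릿류 (processed cocoa products and chocolates)
4th row: 잼류 (jams)
5th row: 올리고당류 (oligosaccharides)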
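
Value | Count | Frequency (%)
과자류 (confectionery) | 1 | 3.0%
조미식품 (seasoning foods) | 1 | 3.0%
위생용품 (sanitary products) | 1 | 3.0%
기구및용기포장 (utensils, containers and packaging) | 1 | 3.0%
식품첨가물 (food additives) | 1 | 3.0%
건강기능식품 (health functional foods) | 1 | 3.0%
장기보존식품 (long-term preserved foods) | 1 | 3.0%
규격외일반가공품 (non-standard general processed foods) | 1 | 3.0%
기타식품류 (other foods) | 1 | 3.0%
건포류 (dried fish and shellfish fillets) | 1 | 3.0%
Other values (23) | 23 | 69.7%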

Most occurring characters

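Value | Count | Frequency (%)
[lost] | 18 | 14.4%
[lost] | 10 | 8.0%
[lost] | 8 | 6.4%
[lost] | 5 | 4.0%
[lost] | 5 | 4.0%
[lost] | 3 | 2.4%
[lost] | 3 | 2.4%
[lost] | 3 | 2.4%
[lost] | 2 | 1.6%
[lost] | 2 | 1.6%
Other values (60) | 66 | 52.8%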

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 120 | 96.0%
Space Separator | 5 | 4.0%

Most frequent character per category

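Other Letter

Value | Count | Frequency (%)
[lost] | 18 | 15.0%
[lost] | 10 | 8.3%
[lost] | 8 | 6.7%
[lost] | 5 | 4.2%
[lost] | 3 | 2.5%
[lost] | 3 | 2.5%
[lost] | 3 | 2.5%
[lost] | 2 | 1.7%
[lost] | 2 | 1.7%
[lost] | 2 | 1.7%
Other values (59) | 64 | 53.3%

Space Separator

Value | Count | Frequency (%)
(space) | 5 | 100.0%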

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 120 | 96.0%
Common | 5 | 4.0%

Most frequent character per script

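Hangul

Value | Count | Frequency (%)
[lost] | 18 | 15.0%
[lost] | 10 | 8.3%
[lost] | 8 | 6.7%
[lost] | 5 | 4.2%
[lost] | 3 | 2.5%
[lost] | 3 | 2.5%
[lost] | 3 | 2.5%
[lost] | 2 | 1.7%
[lost] | 2 | 1.7%
[lost] | 2 | 1.7%
Other values (59) | 64 | 53.3%

Common

Value | Count | Frequency (%)
(space) | 5 | 100.0%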

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 120 | 96.0%
ASCII | 5 | 4.0%

Most frequent character per block

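Hangul

Value | Count | Frequency (%)
[lost] | 18 | 15.0%
[lost] | 10 | 8.3%
[lost] | 8 | 6.7%
[lost] | 5 | 4.2%
[lost] | 3 | 2.5%
[lost] | 3 | 2.5%
[lost] | 3 | 2.5%
[lost] | 2 | 1.7%
[lost] | 2 | 1.7%
[lost] | 2 | 1.7%
Other values (59) | 64 | 53.3%

ASCII

Value | Count | Frequency (%)
(space) | 5 | 100.0%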

자기품질검사건수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 19
Distinct (%): 67.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.428571
Minimum: 1
Maximum: 524
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 384.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3.75
Median: 10.5
Q3: 25.5
95-th percentile: 107.5
Maximum: 524
Range: 523
Interquartile range (IQR): 21.75

Descriptive statistics

Standard deviation: 99.171168
Coefficient of variation (CV): 2.6496114
Kurtosis: 23.494876
Mean: 37.428571
Median Absolute Deviation (MAD): 9.5
Skewness: 4.7218287
Sum: 1048
Variance: 9834.9206
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=19)

Common values

Value | Count | Frequency (%)
1 | 6 | 21.4%
14 | 2 | 7.1%
10 | 2 | 7.1%
4 | 2 | 7.1%
15 | 2 | 7.1%
45 | 1 | 3.6%
132 | 1 | 3.6%
524 | 1 | 3.6%
8 | 1 | 3.6%
27 | 1 | 3.6%
Other values (9) | 9 | 32.1%

Minimum 10 values

Value | Count | Frequency (%)
1 | 6 | 21.4%
3 | 1 | 3.6%
4 | 2 | 7.1%
7 | 1 | 3.6%
8 | 1 | 3.6%
9 | 1 | 3.6%
10 | 2 | 7.1%
11 | 1 | 3.6%
14 | 2 | 7.1%
15 | 2 | 7.1%

Maximum 10 values

Value | Count | Frequency (%)
524 | 1 | 3.6%
132 | 1 | 3.6%
62 | 1 | 3.6%
54 | 1 | 3.6%
45 | 1 | 3.6%
28 | 1 | 3.6%
27 | 1 | 3.6%
25 | 1 | 3.6%
21 | 1 | 3.6%
15 | 2 | 7.1%
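
The smallest and largest value counts listed for 자기품질검사건수 together account for all 28 observations, so the quantile and descriptive statistics can be rechecked by hand. The sketch below rebuilds the column from those counts and recomputes a few figures with Python's standard `statistics` module. It assumes the report's quartiles use linear interpolation between order statistics, which is what `method="inclusive"` does. The variable names are illustrative only.

```python
import statistics

# Value -> count, combined from the smallest and largest values listed above.
value_counts = {1: 6, 3: 1, 4: 2, 7: 1, 8: 1, 9: 1, 10: 2, 11: 1, 14: 2, 15: 2,
                21: 1, 25: 1, 27: 1, 28: 1, 45: 1, 54: 1, 62: 1, 132: 1, 524: 1}
data = sorted(v for v, n in value_counts.items() for _ in range(n))

mean = statistics.fmean(data)
median = statistics.median(data)
# method="inclusive" interpolates linearly between order statistics.
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
p95 = statistics.quantiles(data, n=20, method="inclusive")[-1]
stdev = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
mad = statistics.median(abs(x - median) for x in data)

print(f"n={len(data)} sum={sum(data)} mean={mean:.6f}")
print(f"Q1={q1} median={median} Q3={q3} IQR={q3 - q1} p95={p95}")
print(f"std={stdev:.6f} CV={stdev / mean:.7f} MAD={mad}")
```

Under that interpolation assumption the recomputed values agree with the figures reported above (for example Q1 3.75, Q3 25.5 and a standard deviation of about 99.17).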

적합
Real number (ℝ)

HIGH CORRELATION 

Distinct: 19
Distinct (%): 67.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.214286
Minimum: 1
Maximum: 521
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 384.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3.75
Median: 10.5
Q3: 25.5
95-th percentile: 105.85
Maximum: 521
Range: 520
Interquartile range (IQR): 21.75

Descriptive statistics

Standard deviation: 98.546608
Coefficient of variation (CV): 2.6480854
Kurtosis: 23.554805
Mean: 37.214286
Median Absolute Deviation (MAD): 9.5
Skewness: 4.7286816
Sum: 1042
Variance: 9711.4339
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=19)

Common values

Value | Count | Frequency (%)
1 | 6 | 21.4%
14 | 2 | 7.1%
10 | 2 | 7.1%
4 | 2 | 7.1%
15 | 2 | 7.1%
45 | 1 | 3.6%
130 | 1 | 3.6%
521 | 1 | 3.6%
8 | 1 | 3.6%
27 | 1 | 3.6%
Other values (9) | 9 | 32.1%

Minimum 10 values

Value | Count | Frequency (%)
1 | 6 | 21.4%
3 | 1 | 3.6%
4 | 2 | 7.1%
7 | 1 | 3.6%
8 | 1 | 3.6%
9 | 1 | 3.6%
10 | 2 | 7.1%
11 | 1 | 3.6%
14 | 2 | 7.1%
15 | 2 | 7.1%

Maximum 10 values

Value | Count | Frequency (%)
521 | 1 | 3.6%
130 | 1 | 3.6%
61 | 1 | 3.6%
54 | 1 | 3.6%
45 | 1 | 3.6%
28 | 1 | 3.6%
27 | 1 | 3.6%
25 | 1 | 3.6%
21 | 1 | 3.6%
15 | 2 | 7.1%

부적합
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 14.3%
Missing: 0
Missing (%): 0.0%
Memory size: 356.0 B

Value | Count
<NA> | 25
2 | 1
1 | 1
3 | 1

Length

Max length: 4
Median length: 4
Mean length: 3.6785714
Min length: 1

Unique

Unique: 3
Unique (%): 10.7%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 25 | 89.3%
2 | 1 | 3.6%
1 | 1 | 3.6%
3 | 1 | 3.6%
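
The 66.9% imbalance figure in the Alerts section can be reproduced from these four counts with a normalised-entropy score, 1 - H / log2(k), where H is the Shannon entropy in bits and k is the number of classes. The sketch below is a hedged reconstruction: the formula matches the reported number, but this report does not itself state how the score is defined, and the function name `imbalance_score` is chosen for this example.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform split, near 1 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class is scored as 0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Counts for 부적합: "<NA>" x 25, and the values 2, 1, 3 once each.
score = imbalance_score([25, 1, 1, 1])
print(f"{score:.1%}")  # 66.9%
```

Because 25 of the 28 entries are `<NA>`, the entropy is far below the 2-bit maximum for four classes, which is what pushes the score up.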

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 25 | 89.3%
2 | 1 | 3.6%
1 | 1 | 3.6%
3 | 1 | 3.6%

Interactions

[Figures: four interaction plots, images not included]

Correlations

[Figure: correlation heatmap, image not included]

Variable | 식품 유형 | 자기품질검사건수 | 적합 | 부적합
식품 유형 | 1.000 | 1.000 | 1.000 | 1.000
자기품질검사건수 | 1.000 | 1.000 | 1.000 | 1.000
적합 | 1.000 | 1.000 | 1.000 | 1.000
부적합 | 1.000 | 1.000 | 1.000 | 1.000

[Figure: correlation heatmap, image not included]

Variable | 자기품질검사건수 | 적합 | 부적합
자기품질검사건수 | 1.000 | 1.000 | 1.000
적합 | 1.000 | 1.000 | 1.000
부적합 | 1.000 | 1.000 | 1.000
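
To give a feel for why the 자기품질검사건수 and 적합 entry rounds to 1.000, the sketch below computes a plain Pearson coefficient on columns rebuilt from the value counts in the variable sections. It rests on two assumptions: the 25 rows where 부적합 is `<NA>` hold the same value in both columns, and Pearson is only an illustrative choice, since the report does not say which coefficient it uses. The names `pearson`, `shared` and `differing` are made up for this example.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Assumed identical in both columns (the 25 rows where 부적합 is <NA>).
shared = [1] * 6 + [3, 4, 4, 7, 8, 9, 10, 10, 11, 14, 14, 15, 15,
                    21, 25, 27, 28, 45, 54]
# (자기품질검사건수, 적합) for the three rows with a 부적합 count.
differing = [(132, 130), (62, 61), (524, 521)]

inspected = shared + [a for a, _ in differing]
passed = shared + [b for _, b in differing]

r = pearson(inspected, passed)
print(f"{r:.3f}")
```

The two columns differ in only three rows, and by at most 3 units, so the coefficient sits just below 1 and rounds to 1.000 at three decimals.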

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

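First rows

Index | 식품 유형 | 자기품질검사건수 | 적합 | 부적합
0 | 과자류 (confectionery) | 45 | 45 | <NA>
1 | 빵또는 떡류 (bread or rice cakes) | 25 | 25 | <NA>
2 | 코코아가공품 및 초콜릿류 (processed cocoa products and chocolates) | 1 | 1 | <NA>
3 | 잼류 (jams) | 1 | 1 | <NA>
4 | 올리고당류 (oligosaccharides) | 14 | 14 | <NA>
5 | 두부류 또는 묵류 (tofu or starch jellies) | 9 | 9 | <NA>
6 | 식용유지류 (edible oils and fats) | 10 | 10 | <NA>
7 | 면류 (noodles) | 4 | 4 | <NA>
8 | 다류 (teas) | 15 | 15 | <NA>
9 | 커피 (coffee) | 11 | 11 | <NA>

Last rows

Index | 식품 유형 | 자기품질검사건수 | 적합 | 부적합
18 | 주류 (alcoholic beverages) | 54 | 54 | <NA>
19 | 건포류 (dried fish and shellfish fillets) | 7 | 7 | <NA>
20 | 기타식품류 (other foods) | 132 | 130 | 2
21 | 규격외일반가공품 (non-standard general processed foods) | 62 | 61 | 1
22 | 장기보존식품 (long-term preserved foods) | 4 | 4 | <NA>
23 | 건강기능식품 (health functional foods) | 1 | 1 | <NA>
24 | 식품첨가물 (food additives) | 28 | 28 | <NA>
25 | 기구및용기포장 (utensils, containers and packaging) | 27 | 27 | <NA>
26 | 위생용품 (sanitary products) | 8 | 8 | <NA>
27 | 총합계 (grand total) | 524 | 521 | 3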
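
The sample rows allow two small arithmetic checks, sketched below with values copied from this report. The first tests whether 자기품질검사건수 equals 적합 plus 부적합 in the three rows that have a 부적합 count. The second compares the 총합계 ("grand total") row against the column sums from the descriptive statistics, which is consistent with that row having been profiled as one of the 28 observations. The helper name `is_consistent` is hypothetical.

```python
# (자기품질검사건수, 적합, 부적합) for the sample rows where 부적합 is not <NA>.
rows = {
    "기타식품류": (132, 130, 2),
    "규격외일반가공품": (62, 61, 1),
    "총합계": (524, 521, 3),
}


def is_consistent(inspected, passed, failed):
    """True when the inspection count equals passes plus failures."""
    return inspected == passed + failed


checks = {name: is_consistent(*values) for name, values in rows.items()}

# Column sums as reported in the descriptive statistics sections.
reported_sum_inspected = 1048
reported_sum_passed = 1042

total_inspected, total_passed, _ = rows["총합계"]
# If 총합계 is the sum of the other 27 rows, removing it leaves an equal amount.
total_row_is_sum = (
    reported_sum_inspected - total_inspected == total_inspected
    and reported_sum_passed - total_passed == total_passed
)

print(checks)
print(total_row_is_sum)
```

If the total row is indeed counted as an observation, it would roughly double the sums and dominate the maximum, skewness and correlation figures, so dropping it before profiling may be worth considering.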