Overview

Dataset statistics

Number of variables: 4
Number of observations: 28
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.1 KiB
Average record size in memory: 38.7 B

Variable types

Text: 1
Numeric: 2
Categorical: 1

Alerts

High correlation: 자기품질검사건수 (number of self-quality inspections) is highly correlated overall with 적합 (compliant) and 1 other field
High correlation: 적합 is highly correlated overall with 자기품질검사건수 and 1 other field
High correlation: 부적합 (non-compliant) is highly correlated overall with 자기품질검사건수 and 1 other field
Imbalance: 부적합 is highly imbalanced (53.9%)
Unique: 식품 유형 (food type) has unique values
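
The 53.9% in the imbalance alert can be reproduced from the class counts listed for 부적합 later in this report (23, 3, 1, 1). The sketch below assumes the score is one minus the normalized Shannon entropy of the class distribution. That definition happens to match the reported figure, but it is an assumption here, not a statement of how the tool computes it. The function name `imbalance_score` is hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean a single
    class dominates. Assumed definition, chosen for illustration.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    # Shannon entropy in bits of the empirical class distribution.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Class counts for 부적합 as listed in this report: "-" x23, "1" x3, "2" x1, "5" x1.
score = imbalance_score([23, 3, 1, 1])
print(f"{score:.1%}")
```

With these counts the score rounds to 53.9%, in line with the alert.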

Reproduction

Analysis started: 2024-03-14 02:47:40.715884
Analysis finished: 2024-03-14 02:47:41.291249
Duration: 0.58 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
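
As a quick consistency check, the reported duration follows from the two timestamps above. This is a minimal sketch using only the Python standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block above.
started = datetime.fromisoformat("2024-03-14 02:47:40.715884")
finished = datetime.fromisoformat("2024-03-14 02:47:41.291249")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```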

Variables

식품 유형 (food type)
Text

Alert: Unique

Distinct: 28
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 356.0 B

Length

Max length: 9
Median length: 6.5
Mean length: 3.8928571
Min length: 2

Characters and Unicode

Total characters: 109
Distinct characters: 58
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
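
To illustrate how such character properties can be read, the sketch below tallies the Unicode general category of each character in the five sample values shown under Sample, using Python's standard `unicodedata` module. Script and block lookups are not part of the standard library, so Hangul is detected here through the character name prefix, which is a simplification and not necessarily how the profiling tool does it.

```python
import unicodedata
from collections import Counter

# The five sample values listed for 식품 유형 in this report.
samples = ["과자류", "빵또는 떡류", "포도당", "과당", "엿류"]

category_counts = Counter()
hangul_count = 0
for text in samples:
    for ch in text:
        # General category code, e.g. "Lo" (Other Letter) or "Zs" (Space Separator).
        category_counts[unicodedata.category(ch)] += 1
        # Simplified script check: Hangul character names start with "HANGUL".
        if unicodedata.name(ch, "").startswith("HANGUL"):
            hangul_count += 1

print(dict(category_counts), hangul_count)
```

On these five values the tally is 15 Other Letter characters and 1 Space Separator, the same two categories the report lists for the full column.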

Unique

Unique: 28
Unique (%): 100.0%

Sample

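1st row: 과자류 (confectionery)
2nd row: 빵또는 떡류 (bread or rice cakes)
3rd row: 포도당 (glucose)
4th row: 과당 (fructose)
5th row: 엿류 (yeot: starch syrups and taffy)

Value  Count  Frequency (%)
과자류 (confectionery)  1  3.2%
드레싱류 (dressings)  1  3.2%
위생용품 (sanitary products)  1  3.2%
기구및용기포장 (utensils, containers, and packaging)  1  3.2%
식품첨가물 (food additives)  1  3.2%
건강기능식품 (health functional foods)  1  3.2%
장기보존식품 (long-term preserved foods)  1  3.2%
기타가공품 (other processed products)  1  3.2%
기타식품류 (other foods)  1  3.2%
건포류 (dried fish fillets)  1  3.2%
Other values (21)  21  67.7%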

Most occurring characters

[Character column not preserved; counts and frequencies only]
Count  Frequency (%)
16  14.7%
9  8.3%
8  7.3%
6  5.5%
3  2.8%
3  2.8%
3  2.8%
3  2.8%
2  1.8%
2  1.8%
Other values (48)  54  49.5%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  106  97.2%
Space Separator  3  2.8%

Most frequent character per category

Other Letter
[Character column not preserved; counts and frequencies only]
Count  Frequency (%)
16  15.1%
9  8.5%
8  7.5%
6  5.7%
3  2.8%
3  2.8%
3  2.8%
2  1.9%
2  1.9%
2  1.9%
Other values (47)  52  49.1%

Space Separator
Value  Count  Frequency (%)
(space)  3  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  106  97.2%
Common  3  2.8%

Most frequent character per script

Hangul
[Character column not preserved; counts and frequencies only]
Count  Frequency (%)
16  15.1%
9  8.5%
8  7.5%
6  5.7%
3  2.8%
3  2.8%
3  2.8%
2  1.9%
2  1.9%
2  1.9%
Other values (47)  52  49.1%

Common
Value  Count  Frequency (%)
(space)  3  100.0%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  106  97.2%
ASCII  3  2.8%

Most frequent character per block

Hangul
[Character column not preserved; counts and frequencies only]
Count  Frequency (%)
16  15.1%
9  8.5%
8  7.5%
6  5.7%
3  2.8%
3  2.8%
3  2.8%
2  1.9%
2  1.9%
2  1.9%
Other values (47)  52  49.1%

ASCII
Value  Count  Frequency (%)
(space)  3  100.0%

자기품질검사건수 (number of self-quality inspections)
Real number (ℝ)

Alert: High correlation

Distinct: 23
Distinct (%): 82.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 55.214286
Minimum: 1
Maximum: 773
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 384.0 B

Quantile statistics

Minimum: 1
5th percentile: 1.35
Q1: 8.25
Median: 22.5
Q3: 46
95th percentile: 103.1
Maximum: 773
Range: 772
Interquartile range (IQR): 37.75

Descriptive statistics

Standard deviation: 143.27527
Coefficient of variation (CV): 2.594895
Kurtosis: 25.790309
Mean: 55.214286
Median Absolute Deviation (MAD): 17.5
Skewness: 4.9966389
Sum: 1546
Variance: 20527.804
Monotonicity: Not monotonic

[Figure: Histogram with fixed-size bins (bins=23)]
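
Several of the figures above are derived from others, so they can be cross-checked with plain arithmetic. The sketch below recomputes the mean, range, IQR, coefficient of variation, and variance from the reported values for 자기품질검사건수. The variable names are illustrative, and the inputs are the rounded values printed in this report, so small rounding differences are expected.

```python
# Values copied from the report for 자기품질검사건수.
n_observations = 28
total_sum = 1546
minimum, maximum = 1, 773
q1, q3 = 8.25, 46
std_dev = 143.27527

mean = total_sum / n_observations   # sum divided by number of observations
value_range = maximum - minimum     # maximum minus minimum
iqr = q3 - q1                       # third quartile minus first quartile
cv = std_dev / mean                 # standard deviation relative to the mean
variance = std_dev ** 2             # square of the standard deviation

print(f"mean={mean:.6f} range={value_range} iqr={iqr} cv={cv:.6f} variance={variance:.3f}")
```

The recomputed values agree with the reported mean (55.214286), range (772), IQR (37.75), CV (2.594895), and variance (20527.804) up to rounding.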
Common values

Value  Count  Frequency (%)
1  2  7.1%
46  2  7.1%
15  2  7.1%
5  2  7.1%
25  2  7.1%
17  1  3.6%
773  1  3.6%
66  1  3.6%
40  1  3.6%
3  1  3.6%
Other values (13)  13  46.4%

Minimum 10 values

Value  Count  Frequency (%)
1  2  7.1%
2  1  3.6%
3  1  3.6%
5  2  7.1%
6  1  3.6%
9  1  3.6%
11  1  3.6%
12  1  3.6%
15  2  7.1%
17  1  3.6%

Maximum 10 values

Value  Count  Frequency (%)
773  1  3.6%
108  1  3.6%
94  1  3.6%
66  1  3.6%
54  1  3.6%
47  1  3.6%
46  2  7.1%
41  1  3.6%
40  1  3.6%
32  1  3.6%

적합 (compliant)
Real number (ℝ)

Alert: High correlation

Distinct: 23
Distinct (%): 82.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 54.857143
Minimum: 1
Maximum: 768
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 384.0 B

Quantile statistics

Minimum: 1
5th percentile: 1.35
Q1: 8.25
Median: 22.5
Q3: 45.25
95th percentile: 101.8
Maximum: 768
Range: 767
Interquartile range (IQR): 37

Descriptive statistics

Standard deviation: 142.33417
Coefficient of variation (CV): 2.5946333
Kurtosis: 25.801985
Mean: 54.857143
Median Absolute Deviation (MAD): 17.5
Skewness: 4.9980887
Sum: 1536
Variance: 20259.016
Monotonicity: Not monotonic

[Figure: Histogram with fixed-size bins (bins=23)]
Common values

Value  Count  Frequency (%)
1  2  7.1%
15  2  7.1%
5  2  7.1%
40  2  7.1%
25  2  7.1%
17  1  3.6%
768  1  3.6%
66  1  3.6%
3  1  3.6%
54  1  3.6%
Other values (13)  13  46.4%

Minimum 10 values

Value  Count  Frequency (%)
1  2  7.1%
2  1  3.6%
3  1  3.6%
5  2  7.1%
6  1  3.6%
9  1  3.6%
11  1  3.6%
12  1  3.6%
15  2  7.1%
17  1  3.6%

Maximum 10 values

Value  Count  Frequency (%)
768  1  3.6%
106  1  3.6%
94  1  3.6%
66  1  3.6%
54  1  3.6%
47  1  3.6%
46  1  3.6%
45  1  3.6%
40  2  7.1%
32  1  3.6%

부적합 (non-compliant)
Categorical

Alerts: High correlation, Imbalance

Distinct: 4
Distinct (%): 14.3%
Missing: 0
Missing (%): 0.0%
Memory size: 356.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 2
Unique (%): 7.1%

Sample

1st row: -
2nd row: -
3rd row: -
4th row: -
5th row: -

Common Values

Value  Count  Frequency (%)
-  23  82.1%
1  3  10.7%
2  1  3.6%
5  1  3.6%
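
The column is typed as categorical because most cells hold the string "-" instead of a number. In the Sample rows of this report, "-" appears wherever 자기품질검사건수 equals 적합, which suggests it stands for zero non-compliant results. Under that assumption, the following sketch converts the raw cells to integers; `parse_count` is a hypothetical helper, not part of any library.

```python
def parse_count(raw):
    """Convert a raw 부적합 cell to an int, treating "-" as zero (assumption)."""
    raw = raw.strip()
    return 0 if raw == "-" else int(raw)


# Raw cell values reconstructed from the Common Values table:
# "-" x23, "1" x3, "2" x1, "5" x1.
raw_values = ["-"] * 23 + ["1"] * 3 + ["2", "5"]
counts = [parse_count(v) for v in raw_values]

print(len(counts), counts.count(0), sum(counts))
```

After conversion the column could be profiled as numeric like the other two count columns.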

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; image not preserved]

Interactions

[Figures: four interaction plots; images not preserved]

Correlations

[Figure: correlation heatmap; image not preserved]

(columns: 식품 유형, 자기품질검사건수, 적합, 부적합)
식품 유형  1.000  1.000  1.000  1.000
자기품질검사건수  1.000  1.000  1.000  0.785
적합  1.000  1.000  1.000  0.785
부적합  1.000  0.785  0.785  1.000

[Figure: correlation heatmap; image not preserved]

(columns: 자기품질검사건수, 적합, 부적합)
자기품질검사건수  1.000  1.000  0.825
적합  1.000  1.000  0.825
부적합  0.825  0.825  1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

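First rows

Index  식품 유형  자기품질검사건수  적합  부적합
0  과자류 (confectionery)  94  94  -
1  빵또는 떡류 (bread or rice cakes)  25  25  -
2  포도당 (glucose)  6  6  -
3  과당 (fructose)  2  2  -
4  엿류 (yeot: starch syrups and taffy)  32  32  -
5  두부류 또는 묵류 (tofu or muk)  9  9  -
6  식용유지류 (edible oils and fats)  46  45  1
7  면류 (noodles)  15  15  -
8  다류 (teas)  27  26  1
9  커피 (coffee)  11  11  -

Last rows

Index  식품 유형  자기품질검사건수  적합  부적합
18  주류 (alcoholic beverages)  46  46  -
19  건포류 (dried fish fillets)  5  5  -
20  기타식품류 (other foods)  108  106  2
21  기타가공품 (other processed products)  54  54  -
22  장기보존식품 (long-term preserved foods)  3  3  -
23  건강기능식품 (health functional foods)  1  1  -
24  식품첨가물 (food additives)  40  40  -
25  기구및용기포장 (utensils, containers, and packaging)  66  66  -
26  위생용품 (sanitary products)  15  15  -
27  총합계 (grand total)  773  768  5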
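
Row 27 is labeled 총합계 (grand total), and its values (773, 768, 5) are exactly half of the sums reported for 자기품질검사건수 (1546) and 적합 (1536). This suggests the total row was profiled as if it were an observation, which would inflate the maximum, mean, skewness, and correlations. The sketch below shows one way to exclude such a row, applied to the last rows shown above. `TOTAL_LABEL`, `last_rows`, and the other names are illustrative, and the interpretation of row 27 as an aggregate is an assumption based on its label.

```python
TOTAL_LABEL = "총합계"  # label of the suspected grand-total row

# (식품 유형, 자기품질검사건수, 적합, 부적합) for indices 18 to 27 of this report.
last_rows = [
    ("주류", 46, 46, "-"),
    ("건포류", 5, 5, "-"),
    ("기타식품류", 108, 106, "2"),
    ("기타가공품", 54, 54, "-"),
    ("장기보존식품", 3, 3, "-"),
    ("건강기능식품", 1, 1, "-"),
    ("식품첨가물", 40, 40, "-"),
    ("기구및용기포장", 66, 66, "-"),
    ("위생용품", 15, 15, "-"),
    ("총합계", 773, 768, "5"),
]

# Keep only per-category rows.
detail_rows = [row for row in last_rows if row[0] != TOTAL_LABEL]

# Recompute the mean of 자기품질검사건수 without the total row,
# using the reported sum (1546) and observation count (28).
reported_sum, total_value, n_observations = 1546, 773, 28
mean_without_total = (reported_sum - total_value) / (n_observations - 1)

print(len(detail_rows), round(mean_without_total, 2))
```

Under this assumption the mean drops from 55.21 to about 28.63, so the summary statistics above are best read with the total row in mind.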