Overview

Dataset statistics

Number of variables: 7
Number of observations: 3529
Missing cells: 12
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 200.0 KiB
Average record size in memory: 58.0 B
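
The derived figures in this table follow from the raw counts. The sketch below redoes the arithmetic, assuming that the missing-cell percentage is taken over all cells (rows × columns) and that 1 KiB = 1024 B. The variable names are illustrative.

```python
# Values copied from the Dataset statistics table.
n_variables = 7
n_observations = 3529
missing_cells = 12
total_size_kib = 200.0

# Share of missing cells over every cell in the table (assumed definition).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.3f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the missing share comes out well below 0.1% and the record size rounds to 58.0 B, in line with the table.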

Variable types

Numeric: 2
Text: 2
Categorical: 3

Dataset

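Description: Provides the representative food name, representative food code, food classification, food type, and the number of traditional foods within each representative food, in order to describe the relationship between representative foods and traditional foods.
Author: Korea Food Research Institute (한국식품연구원)
URL: https://www.data.go.kr/data/15047799/fileData.do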

Alerts

식품분류(중) is highly overall correlated with 식품분류(대) (High correlation)
식품분류(대) is highly overall correlated with 식품분류(중) (High correlation)
식품유형 is highly imbalanced (79.4%) (Imbalance)
대표식품코드 has unique values (Unique)
대표식품명 has unique values (Unique)
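
The 79.4% imbalance figure can be reproduced from the 식품유형 value counts reported later in this profile (3287, 199, 39, 4). The sketch below assumes the score is one minus the Shannon entropy of the class distribution divided by its maximum possible value, log2(k) for k classes. That definition is an assumption about how the alert is computed, and `imbalance_score` is an illustrative name, not a library function.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform split, approaching 1 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    # Shannon entropy in bits; empty classes contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Counts of 식품유형 as reported in this profile: 복합식품, 단일식품, 기술, 기타.
food_type_counts = [3287, 199, 39, 4]
score = imbalance_score(food_type_counts)
print(f"imbalance: {score:.1%}")
```

With these counts the score rounds to 79.4%, which agrees with the alert. A perfectly balanced variable would score 0.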

Reproduction

Analysis started: 2023-12-12 07:44:50.040926
Analysis finished: 2023-12-12 07:44:51.611423
Duration: 1.57 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
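
A minimal sketch of how a report like this one might be regenerated with ydata-profiling. The CSV path and its encoding are assumptions (this report does not give the file name), and `build_report` is an illustrative helper, not part of the library. The `dataset` argument carries the metadata shown in the Dataset section.

```python
from typing import Optional


def build_report(csv_path: str,
                 output_path: str = "report.html",
                 encoding: str = "utf-8",
                 dataset_info: Optional[dict] = None) -> None:
    """Profile a CSV file and write an HTML report (requires pandas and ydata-profiling)."""
    # Imported lazily so the helper can be defined without the optional dependencies.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    profile = ProfileReport(
        df,
        title="Profiling report",
        # Dataset metadata, e.g. {"description": ..., "author": ..., "url": ...}.
        dataset=dataset_info or {},
    )
    profile.to_file(output_path)
```

A call such as `build_report("foods.csv", encoding="cp949")` would then write `report.html`. Both the file name and the encoding in that call are hypothetical.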

Variables

대표식품코드
Real number (ℝ)

UNIQUE 

Distinct: 3529
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 102510.68
Minimum: 100003
Maximum: 105005
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 31.1 KiB

Quantile statistics

Minimum: 100003
5th percentile: 100271.4
Q1: 101307
Median: 102497
Q3: 103735
95th percentile: 104693.6
Maximum: 105005
Range: 5002
Interquartile range (IQR): 2428

Descriptive statistics

Standard deviation: 1410.5941
Coefficient of variation (CV): 0.01376046
Kurtosis: -1.1882233
Mean: 102510.68
Median Absolute Deviation (MAD): 1215
Skewness: -0.018502152
Sum: 3.6176017 × 10^8
Variance: 1989775.7
Monotonicity: Strictly increasing
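
Several of these statistics are functions of the others, so they can be cross-checked. The sketch below recomputes the range, IQR, coefficient of variation, variance, and sum for 대표식품코드 from the reported minimum, maximum, quartiles, mean, standard deviation, and row count. The one-in-a-million relative tolerance is an assumption chosen to absorb rounding in the printed values.

```python
import math

# Statistics for 대표식품코드 copied from the tables in this report.
n = 3529
mean = 102510.68
std = 1410.5941
minimum, maximum = 100003, 105005
q1, q3 = 101307, 103735

reported = {
    "range": 5002,
    "iqr": 2428,
    "cv": 0.01376046,
    "variance": 1989775.7,
    "sum": 3.6176017e8,
}

# Each derived value uses the standard definition of the statistic.
derived = {
    "range": maximum - minimum,
    "iqr": q3 - q1,
    "cv": std / mean,
    "variance": std ** 2,
    "sum": mean * n,
}

# Compare with a relative tolerance because the report prints rounded values.
checks = {
    name: math.isclose(derived[name], reported[name], rel_tol=1e-6)
    for name in reported
}
for name, ok in checks.items():
    print(f"{name}: derived={derived[name]:.8g} reported={reported[name]:.8g} ok={ok}")
```

The same check applies to any numeric variable in the report by swapping in its values.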
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
100003 | 1 | < 0.1%
103334 | 1 | < 0.1%
103319 | 1 | < 0.1%
103320 | 1 | < 0.1%
103321 | 1 | < 0.1%
103322 | 1 | < 0.1%
103323 | 1 | < 0.1%
103324 | 1 | < 0.1%
103325 | 1 | < 0.1%
103326 | 1 | < 0.1%
Other values (3519) | 3519 | 99.7%
Minimum 10 values
Value | Count | Frequency (%)
100003 | 1 | < 0.1%
100004 | 1 | < 0.1%
100005 | 1 | < 0.1%
100009 | 1 | < 0.1%
100010 | 1 | < 0.1%
100016 | 1 | < 0.1%
100017 | 1 | < 0.1%
100018 | 1 | < 0.1%
100019 | 1 | < 0.1%
100020 | 1 | < 0.1%
Maximum 10 values
Value | Count | Frequency (%)
105005 | 1 | < 0.1%
105003 | 1 | < 0.1%
105002 | 1 | < 0.1%
105001 | 1 | < 0.1%
104926 | 1 | < 0.1%
104925 | 1 | < 0.1%
104924 | 1 | < 0.1%
104923 | 1 | < 0.1%
104922 | 1 | < 0.1%
104921 | 1 | < 0.1%

대표식품명
Text

UNIQUE 

Distinct: 3529
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 27.7 KiB

Length

Max length: 11
Median length: 9
Mean length: 3.54293
Min length: 1

Characters and Unicode

Total characters: 12503
Distinct characters: 485
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3529
Unique (%): 100.0%

Sample

1st row: 가루전병
2nd row: 가물치국
3rd row: 가물치회
4th row: 가시연꽃열매
5th row: 가시연밥

Value | Count | Frequency (%)
가루전병 | 1 | < 0.1%
염통구이 | 1 | < 0.1%
연저찜 | 1 | < 0.1%
연해주 | 1 | < 0.1%
영계찜 | 1 | < 0.1%
연행인과 | 1 | < 0.1%
연화주 | 1 | < 0.1%
열구자탕 | 1 | < 0.1%
열무김치 | 1 | < 0.1%
열무장아찌 | 1 | < 0.1%
Other values (3520) | 3520 | 99.7%

Most occurring characters

ValueCountFrequency (%)
310
 
2.5%
259
 
2.1%
233
 
1.9%
218
 
1.7%
216
 
1.7%
209
 
1.7%
206
 
1.6%
199
 
1.6%
193
 
1.5%
186
 
1.5%
Other values (475) 10274
82.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 12502 | > 99.9%
Space Separator | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
310
 
2.5%
259
 
2.1%
233
 
1.9%
218
 
1.7%
216
 
1.7%
209
 
1.7%
206
 
1.6%
199
 
1.6%
193
 
1.5%
186
 
1.5%
Other values (474) 10273
82.2%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 12502 | > 99.9%
Common | 1 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
310
 
2.5%
259
 
2.1%
233
 
1.9%
218
 
1.7%
216
 
1.7%
209
 
1.7%
206
 
1.6%
199
 
1.6%
193
 
1.5%
186
 
1.5%
Other values (474) 10273
82.2%
Common
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 12502 | > 99.9%
ASCII | 1 | < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
310
 
2.5%
259
 
2.1%
233
 
1.9%
218
 
1.7%
216
 
1.7%
209
 
1.7%
206
 
1.6%
199
 
1.6%
193
 
1.5%
186
 
1.5%
Other values (474) 10273
82.2%
ASCII
ValueCountFrequency (%)
1
100.0%

식품유형
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 27.7 KiB
복합식품: 3287
단일식품: 199
기술: 39
기타: 4

Length

Max length: 4
Median length: 4
Mean length: 3.9756305
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 복합식품
2nd row: 복합식품
3rd row: 복합식품
4th row: 단일식품
5th row: 복합식품

Common Values

Value | Count | Frequency (%)
복합식품 | 3287 | 93.1%
단일식품 | 199 | 5.6%
기술 | 39 | 1.1%
기타 | 4 | 0.1%

Length

Histogram of lengths of the category

식품분류(대)
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 27.7 KiB
부식: 1393
기호식: 613
기타: 566
주식: 386
주류: 351

Length

Max length: 3
Median length: 2
Mean length: 2.2360442
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 부식
2nd row: 부식
3rd row: 부식
4th row: 조미식
5th row: 주식

Common Values

Value | Count | Frequency (%)
부식 | 1393 | 39.5%
기호식 | 613 | 17.4%
기타 | 566 | 16.0%
주식 | 386 | 10.9%
주류 | 351 | 9.9%
조미식 | 220 | 6.2%

Length

Histogram of lengths of the category

식품분류(중)
Categorical

HIGH CORRELATION 

Distinct: 21
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 27.7 KiB
기타: 672
떡류: 273
구이류: 254
국류: 241
찜류: 239
Other values (16): 1850

Length

Max length: 4
Median length: 2
Mean length: 2.5021252
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 구이류
2nd row: 국류
3rd row: 회류
4th row: 양념류
5th row: 밥류

Common Values

Value | Count | Frequency (%)
기타 | 672 | 19.0%
떡류 | 273 | 7.7%
구이류 | 254 | 7.2%
국류 | 241 | 6.8%
찜류 | 239 | 6.8%
양조곡주 | 223 | 6.3%
죽류 | 207 | 5.9%
한과류 | 202 | 5.7%
장아찌류 | 154 | 4.4%
음청류 | 138 | 3.9%
Other values (11) | 926 | 26.2%

Length

Histogram of lengths of the category

식품분류(소)
Text

Distinct: 70
Distinct (%): 2.0%
Missing: 12
Missing (%): 0.3%
Memory size: 27.7 KiB

Length

Max length: 5
Median length: 2
Mean length: 2.2101223
Min length: 1

Characters and Unicode

Total characters: 7773
Distinct characters: 95
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 0.1%

Sample

1st row:
2nd row: 토장국
3rd row: 생회
4th row: 양념류
5th row: 기타

Value | Count | Frequency (%)
기타 | 1183 | 33.6%
 | 162 | 4.6%
순곡주류 | 134 | 3.8%
구이 | 120 | 3.4%
 | 105 | 3.0%
찐떡 | 97 | 2.8%
탕류 | 97 | 2.8%
양념류 | 87 | 2.5%
혼양곡주류 | 86 | 2.4%
찌개 | 74 | 2.1%
Other values (60) | 1372 | 39.0%

Most occurring characters

ValueCountFrequency (%)
1183
 
15.2%
1183
 
15.2%
527
 
6.8%
236
 
3.0%
227
 
2.9%
190
 
2.4%
173
 
2.2%
164
 
2.1%
153
 
2.0%
152
 
2.0%
Other values (85) 3585
46.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 7773 | 100.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1183
 
15.2%
1183
 
15.2%
527
 
6.8%
236
 
3.0%
227
 
2.9%
190
 
2.4%
173
 
2.2%
164
 
2.1%
153
 
2.0%
152
 
2.0%
Other values (85) 3585
46.1%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 7773 | 100.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1183
 
15.2%
1183
 
15.2%
527
 
6.8%
236
 
3.0%
227
 
2.9%
190
 
2.4%
173
 
2.2%
164
 
2.1%
153
 
2.0%
152
 
2.0%
Other values (85) 3585
46.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 7773 | 100.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1183
 
15.2%
1183
 
15.2%
527
 
6.8%
236
 
3.0%
227
 
2.9%
190
 
2.4%
173
 
2.2%
164
 
2.1%
153
 
2.0%
152
 
2.0%
Other values (85) 3585
46.1%

대표식품 내 전통식품수
Real number (ℝ)

Distinct: 40
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.1195806
Minimum: 1
Maximum: 153
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 31.1 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 1
Q3: 3
95th percentile: 10.6
Maximum: 153
Range: 152
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 5.2010469
Coefficient of variation (CV): 1.6672263
Kurtosis: 209.94844
Mean: 3.1195806
Median Absolute Deviation (MAD): 0
Skewness: 9.7059834
Sum: 11009
Variance: 27.050889
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=40)
Common values
Value | Count | Frequency (%)
1 | 1936 | 54.9%
2 | 523 | 14.8%
3 | 266 | 7.5%
4 | 189 | 5.4%
5 | 125 | 3.5%
6 | 102 | 2.9%
7 | 69 | 2.0%
8 | 55 | 1.6%
9 | 54 | 1.5%
10 | 33 | 0.9%
Other values (30) | 177 | 5.0%
Minimum 10 values
Value | Count | Frequency (%)
1 | 1936 | 54.9%
2 | 523 | 14.8%
3 | 266 | 7.5%
4 | 189 | 5.4%
5 | 125 | 3.5%
6 | 102 | 2.9%
7 | 69 | 2.0%
8 | 55 | 1.6%
9 | 54 | 1.5%
10 | 33 | 0.9%
Maximum 10 values
Value | Count | Frequency (%)
153 | 1 | < 0.1%
57 | 1 | < 0.1%
49 | 1 | < 0.1%
45 | 1 | < 0.1%
41 | 2 | 0.1%
36 | 4 | 0.1%
35 | 2 | 0.1%
34 | 1 | < 0.1%
32 | 3 | 0.1%
31 | 2 | 0.1%

Interactions


Correlations

Variable | 대표식품코드 | 식품유형 | 식품분류(대) | 식품분류(중) | 식품분류(소) | 대표식품 내 전통식품수
대표식품코드 | 1.000 | 0.098 | 0.088 | 0.146 | 0.248 | 0.000
식품유형 | 0.098 | 1.000 | 0.294 | 0.502 | 0.327 | 0.000
식품분류(대) | 0.088 | 0.294 | 1.000 | 0.994 | 0.969 | 0.081
식품분류(중) | 0.146 | 0.502 | 0.994 | 1.000 | 0.996 | 0.080
식품분류(소) | 0.248 | 0.327 | 0.969 | 0.996 | 1.000 | 0.086
대표식품 내 전통식품수 | 0.000 | 0.000 | 0.081 | 0.080 | 0.086 | 1.000

Variable | 식품분류(중) | 식품분류(대) | 식품유형
식품분류(중) | 1.000 | 0.951 | 0.293
식품분류(대) | 0.951 | 1.000 | 0.193
식품유형 | 0.293 | 0.193 | 1.000

Variable | 대표식품코드 | 대표식품 내 전통식품수 | 식품유형 | 식품분류(대) | 식품분류(중)
대표식품코드 | 1.000 | -0.049 | 0.059 | 0.046 | 0.054
대표식품 내 전통식품수 | -0.049 | 1.000 | 0.000 | 0.055 | 0.039
식품유형 | 0.059 | 0.000 | 1.000 | 0.193 | 0.293
식품분류(대) | 0.046 | 0.055 | 0.193 | 1.000 | 0.951
식품분류(중) | 0.054 | 0.039 | 0.293 | 0.951 | 1.000
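
This report does not state which association measure produced each matrix. For pairs of categorical variables, Cramér's V is a common choice, and the sketch below shows the plain (not bias-corrected) form, under the assumption that something of this kind underlies the 0.951 between 식품분류(대) and 식품분류(중). The contingency tables are hypothetical toy data, not taken from the dataset. They illustrate why a classification whose sub-classes nest inside its top-level classes scores close to 1.

```python
import math


def cramers_v(table):
    """Plain Cramér's V for a contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    min_dim = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or min_dim == 0:
        return 0.0
    # Pearson chi-squared statistic: squared deviation from the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * min_dim))


# Hypothetical: two top-level classes (rows), four sub-classes (columns),
# where every sub-class belongs to exactly one top-level class.
nested = [[30, 20, 0, 0],
          [0, 0, 25, 25]]
# Hypothetical: two variables with no association at all.
independent = [[10, 10],
               [10, 10]]

print(f"nested: {cramers_v(nested):.3f}")
print(f"independent: {cramers_v(independent):.3f}")
```

If each sub-class determines its parent class, the statistic reaches its maximum. That would make a high-correlation alert between two levels of the same hierarchy expected rather than surprising.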

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

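First rows
Index | 대표식품코드 | 대표식품명 | 식품유형 | 식품분류(대) | 식품분류(중) | 식품분류(소) | 대표식품 내 전통식품수
0 | 100003 | 가루전병 | 복합식품 | 부식 | 구이류 |  | 1
1 | 100004 | 가물치국 | 복합식품 | 부식 | 국류 | 토장국 | 3
2 | 100005 | 가물치회 | 복합식품 | 부식 | 회류 | 생회 | 5
3 | 100009 | 가시연꽃열매 | 단일식품 | 조미식 | 양념류 | 양념류 | 1
4 | 100010 | 가시연밥 | 복합식품 | 주식 | 밥류 | 기타 | 9
5 | 100016 | 가오리어채 | 복합식품 | 부식 | 회류 | 기타 | 1
6 | 100017 | 가이주 | 복합식품 | 주류 | 기타 | 기타 | 1
7 | 100018 | 가자미 | 단일식품 | 기타 | 기타 | 기타 | 1
8 | 100019 | 가자미국 | 복합식품 | 부식 | 국류 | 기타 | 1
9 | 100020 | 가자미젓 | 복합식품 | 부식 | 젓갈류 | 발효젓 | 2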
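
Last rows
Index | 대표식품코드 | 대표식품명 | 식품유형 | 식품분류(대) | 식품분류(중) | 식품분류(소) | 대표식품 내 전통식품수
3519 | 104921 | 흑두즙 | 복합식품 | 주식 | 죽류 |  | 2
3520 | 104922 | 흑임자다식 | 복합식품 | 기호식 | 한과류 | 다식 | 11
3521 | 104923 | 흑임자주 | 복합식품 | 주류 | 양조곡주 | 혼양곡주류 | 1
3522 | 104924 | 흑임자죽 | 복합식품 | 주식 | 죽류 |  | 5
3523 | 104925 | 희렴초 | 복합식품 | 부식 | 찜류 |  | 1
3524 | 104926 | 힘줄요리 | 복합식품 | 기타 | 기타 | 기타 | 1
3525 | 105001 | 총백죽 | 복합식품 | 주식 | 죽류 |  | 1
3526 | 105002 | 총시죽 | 복합식품 | 주식 | 죽류 |  | 1
3527 | 105003 | 쇠비름죽 | 복합식품 | 주식 | 죽류 |  | 1
3528 | 105005 | 삼계탕 | 복합식품 | 주식 | 죽류 | 기타 | 1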