Overview

Dataset statistics

Number of variables: 9
Number of observations: 3098
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 352
Duplicate rows (%): 11.4%
Total size in memory: 218.0 KiB
Average record size in memory: 72.0 B

Variable types

Categorical: 8
Text: 1

Alerts

요오드(131I)(기준:100) has constant value "불검출" [Constant]
판정 has constant value "적합" [Constant]
Dataset has 352 (11.4%) duplicate rows [Duplicates]
세슘(134Cs+137Cs)(기준:100) is highly overall correlated with 제조원산지 and 1 other field [High correlation]
비고 is highly overall correlated with 제조원산지 and 1 other field [High correlation]
분류 is highly overall correlated with 수거지역 [High correlation]
제조원산지 is highly overall correlated with 세슘(134Cs+137Cs)(기준:100) and 1 other field [High correlation]
수거지역 is highly overall correlated with 분류 [High correlation]
제조원산지 is highly imbalanced (64.1%) [Imbalance]
세슘(134Cs+137Cs)(기준:100) is highly imbalanced (99.6%) [Imbalance]
비고 is highly imbalanced (59.8%) [Imbalance]
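
The report does not say how the imbalance percentages are derived. One plausible reading, taken here as an assumption, is one minus the Shannon entropy of the category counts normalized by the maximum entropy for that number of categories. The sketch below applies that formula to the counts reported for 세슘(134Cs+137Cs)(기준:100) (3097 and 1); the function name `imbalance_score` is hypothetical.

```python
from math import log2


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for category counts.

    0.0 means perfectly balanced categories, values near 1.0 mean that
    one category dominates. This formula is an assumption about how the
    report derives its percentages.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0  # a single category is treated as "not imbalanced" in this sketch
    entropy = -sum((c / total) * log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / log2(k)


# Counts reported for 세슘(134Cs+137Cs)(기준:100): 3097 x "불검출", 1 x "1"
cesium_score = imbalance_score([3097, 1])
print(f"{cesium_score:.1%}")  # prints 99.6%
```

Under this assumption the result agrees with the 99.6% in the alert. The other two percentages cannot be checked the same way because the report lists only the most frequent values of those columns.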

Reproduction

Analysis started: 2024-05-10 20:30:54.575009
Analysis finished: 2024-05-10 20:30:56.811192
Duration: 2.24 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

검사기간 [inspection period]
Categorical

Distinct: 40
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 24.3 KiB

Value | Count
2023.5.22.~2023.6.4. | 183
2023.4.10.~2023.4.23. | 154
2023.6.5.~2023.6.18. | 139
2023.10.16.~2023.10.22 | 134
2023.4.24.~2023.5.7. | 127
Other values (35) | 2361

Length

Max length: 22
Median length: 21
Mean length: 21.074564
Min length: 20

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023.5.22.~2023.6.4.
2nd row: 2023.5.22.~2023.6.4.
3rd row: 2023.5.22.~2023.6.4.
4th row: 2023.5.22.~2023.6.4.
5th row: 2023.5.22.~2023.6.4.

Common Values

Value | Count | Frequency (%)
2023.5.22.~2023.6.4. | 183 | 5.9%
2023.4.10.~2023.4.23. | 154 | 5.0%
2023.6.5.~2023.6.18. | 139 | 4.5%
2023.10.16.~2023.10.22 | 134 | 4.3%
2023.4.24.~2023.5.7. | 127 | 4.1%
2023.2.27.~2023.3.12. | 116 | 3.7%
2023.09.04.~2023.09.08 | 113 | 3.6%
2023.7.7.~2023.7.13. | 111 | 3.6%
2023.6.19.~2023.6.23. | 104 | 3.4%
2023.2.13.~2023.2.26. | 103 | 3.3%
Other values (30) | 1814 | 58.6%
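
The 검사기간 values listed above mix several notations: some end with a trailing dot, some do not, and some zero-pad the month and day. A minimal sketch of how such strings could be normalized into start and end dates follows. The helper name `parse_period` and the regular expression are assumptions made for illustration and cover only the notations visible in this report.

```python
import re
from datetime import date

# Matches "YYYY.M.D.~YYYY.M.D." with optional zero padding and optional
# trailing dots, e.g. "2023.5.22.~2023.6.4." or "2023.09.04.~2023.09.08".
_PERIOD = re.compile(
    r"^\s*(\d{4})\.(\d{1,2})\.(\d{1,2})\.?\s*~\s*"
    r"(\d{4})\.(\d{1,2})\.(\d{1,2})\.?\s*$"
)


def parse_period(text):
    """Parse an inspection-period string into a (start, end) pair of dates."""
    match = _PERIOD.match(text)
    if match is None:
        raise ValueError(f"unrecognized inspection period: {text!r}")
    y1, m1, d1, y2, m2, d2 = map(int, match.groups())
    return date(y1, m1, d1), date(y2, m2, d2)


start, end = parse_period("2023.5.22.~2023.6.4.")
print(start.isoformat(), end.isoformat())
```

Parsing the column this way would let the periods be sorted chronologically and would merge values that differ only in punctuation or padding.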

Length

[Figure: histogram of lengths of the category]

분류 [classification]
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.3 KiB

Value | Count
가공식품 | 1449
수산물 | 1139
농산물 | 510

Length

Max length: 4
Median length: 3
Mean length: 3.4677211
Min length: 3
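
The length statistics above follow directly from the value counts of the column, since each value contributes its character length once per occurrence. The sketch below recomputes them from the three counts reported for 분류; the variable names are illustrative only.

```python
from statistics import mean, median

# Value counts reported for 분류
value_counts = {"가공식품": 1449, "수산물": 1139, "농산물": 510}

# One length entry per observation (3098 in total)
lengths = [len(value) for value, n in value_counts.items() for _ in range(n)]

length_stats = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": mean(lengths),
    "min": min(lengths),
}
print(length_stats)
```

The mean works out to (1449 × 4 + 1139 × 3 + 510 × 3) / 3098 = 10743 / 3098, which matches the reported 3.4677211.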

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 가공식품
2nd row: 가공식품
3rd row: 가공식품
4th row: 가공식품
5th row: 가공식품

Common Values

Value | Count | Frequency (%)
가공식품 | 1449 | 46.8%
수산물 | 1139 | 36.8%
농산물 | 510 | 16.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

품목명(식품유형) [item name (food type)]
Text

Distinct: 222
Distinct (%): 7.2%
Missing: 0
Missing (%): 0.0%
Memory size: 24.3 KiB

Length

Max length: 16
Median length: 12
Mean length: 4.7220788
Min length: 1

Characters and Unicode

Total characters: 14629
Distinct characters: 249
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 61
Unique (%): 2.0%

Sample

1st row: 기타 건포류
2nd row: 기타 건포류
3rd row: 기타 건포류
4th row: 기타 건포류
5th row: 기타 건포류

Value | Count | Frequency (%)
기타 | 1046 | 25.2%
수산물가공품 | 965 | 23.2%
고등어 | 79 | 1.9%
건포류 | 74 | 1.8%
건어포 | 59 | 1.4%
멸치 | 54 | 1.3%
넙치 | 52 | 1.3%
가리비 | 51 | 1.2%
새우 | 50 | 1.2%
가자미 | 49 | 1.2%
Other values (216) | 1673 | 40.3%

Most occurring characters

Value | Count | Frequency (%)
 | 1112 | 7.6%
 | 1104 | 7.5%
 | 1064 | 7.3%
(space) | 1054 | 7.2%
 | 1012 | 6.9%
 | 1000 | 6.8%
 | 999 | 6.8%
 | 984 | 6.7%
 | 971 | 6.6%
 | 392 | 2.7%
Other values (239) | 4937 | 33.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 13523 | 92.4%
Space Separator | 1054 | 7.2%
Open Punctuation | 22 | 0.2%
Close Punctuation | 22 | 0.2%
Other Punctuation | 8 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 1112 | 8.2%
 | 1104 | 8.2%
 | 1064 | 7.9%
 | 1012 | 7.5%
 | 1000 | 7.4%
 | 999 | 7.4%
 | 984 | 7.3%
 | 971 | 7.2%
 | 392 | 2.9%
 | 210 | 1.6%
Other values (235) | 4675 | 34.6%

Space Separator
Value | Count | Frequency (%)
(space) | 1054 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 22 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 22 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 8 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 13523 | 92.4%
Common | 1106 | 7.6%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 1112 | 8.2%
 | 1104 | 8.2%
 | 1064 | 7.9%
 | 1012 | 7.5%
 | 1000 | 7.4%
 | 999 | 7.4%
 | 984 | 7.3%
 | 971 | 7.2%
 | 392 | 2.9%
 | 210 | 1.6%
Other values (235) | 4675 | 34.6%

Common
Value | Count | Frequency (%)
(space) | 1054 | 95.3%
( | 22 | 2.0%
) | 22 | 2.0%
. | 8 | 0.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 13523 | 92.4%
ASCII | 1106 | 7.6%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 1112 | 8.2%
 | 1104 | 8.2%
 | 1064 | 7.9%
 | 1012 | 7.5%
 | 1000 | 7.4%
 | 999 | 7.4%
 | 984 | 7.3%
 | 971 | 7.2%
 | 392 | 2.9%
 | 210 | 1.6%
Other values (235) | 4675 | 34.6%

ASCII
Value | Count | Frequency (%)
(space) | 1054 | 95.3%
( | 22 | 2.0%
) | 22 | 2.0%
. | 8 | 0.7%

제조원산지 [origin of manufacture]
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 48
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 24.3 KiB

Value | Count
국내산 | 2112
러시아 | 213
일본 | 192
미국 | 150
중국 | 146
Other values (43) | 285

Length

Max length: 7
Median length: 3
Mean length: 2.8392511
Min length: 2

Unique

Unique: 14
Unique (%): 0.5%

Sample

1st row: 국내산
2nd row: 국내산
3rd row: 국내산
4th row: 국내산
5th row: 국내산

Common Values

Value | Count | Frequency (%)
국내산 | 2112 | 68.2%
러시아 | 213 | 6.9%
일본 | 192 | 6.2%
미국 | 150 | 4.8%
중국 | 146 | 4.7%
베트남 | 80 | 2.6%
칠레 | 33 | 1.1%
노르웨이 | 27 | 0.9%
태국 | 17 | 0.5%
이탈리아 | 14 | 0.5%
Other values (38) | 114 | 3.7%

Length

[Figure: histogram of lengths of the category]

수거지역 [collection area]
Categorical

HIGH CORRELATION

Distinct: 50
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 24.3 KiB

Value | Count
화성시 | 319
광주시 | 308
용인시 | 305
수원시 | 290
군포시 | 193
Other values (45) | 1683

Length

Max length: 4
Median length: 3
Mean length: 3.006133
Min length: 1

Unique

Unique: 4
Unique (%): 0.1%

Sample

1st row: -
2nd row: -
3rd row: -
4th row: -
5th row: -

Common Values

Value | Count | Frequency (%)
화성시 | 319 | 10.3%
광주시 | 308 | 9.9%
용인시 | 305 | 9.8%
수원시 | 290 | 9.4%
군포시 | 193 | 6.2%
성남시 | 155 | 5.0%
안산시 | 150 | 4.8%
김포시 | 139 | 4.5%
하남시 | 136 | 4.4%
시흥시 | 128 | 4.1%
Other values (40) | 975 | 31.5%

Length

[Figure: histogram of lengths of the category]

요오드(131I)(기준:100) [iodine (131I) (standard: 100)]
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.3 KiB

Value | Count
불검출 | 3098

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 불검출
2nd row: 불검출
3rd row: 불검출
4th row: 불검출
5th row: 불검출

Common Values

Value | Count | Frequency (%)
불검출 | 3098 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

세슘(134Cs+137Cs)(기준:100) [cesium (134Cs+137Cs) (standard: 100)]
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.3 KiB

Value | Count
불검출 | 3097
1 | 1

Length

Max length: 3
Median length: 3
Mean length: 2.9993544
Min length: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 불검출
2nd row: 불검출
3rd row: 불검출
4th row: 불검출
5th row: 불검출

Common Values

Value | Count | Frequency (%)
불검출 | 3097 | > 99.9%
1 | 1 | < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

판정 [verdict]
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.3 KiB

Value | Count
적합 | 3098

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 적합
2nd row: 적합
3rd row: 적합
4th row: 적합
5th row: 적합

Common Values

Value | Count | Frequency (%)
적합 | 3098 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

비고 [remarks]
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 49
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 24.3 KiB

Value | Count
<NA> | 2155
명태 | 89
멸치 | 83
오징어 | 82
삼치 | 80
Other values (44) | 609

Length

Max length: 5
Median length: 4
Mean length: 3.5100065
Min length: 1

Unique

Unique: 15
Unique (%): 0.5%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 2155 | 69.6%
명태 | 89 | 2.9%
멸치 | 83 | 2.7%
오징어 | 82 | 2.6%
삼치 | 80 | 2.6%
고등어 | 75 | 2.4%
임연수 | 56 | 1.8%
가자미 | 51 | 1.6%
새우 | 49 | 1.6%
다시마 | 39 | 1.3%
Other values (39) | 339 | 10.9%

Length

[Figure: histogram of lengths of the category]

Correlations

Variable | 검사기간 | 분류 | 제조원산지 | 수거지역 | 세슘(134Cs+137Cs)(기준:100) | 비고
검사기간 | 1.000 | 0.609 | 0.441 | 0.862 | 0.000 | 0.770
분류 | 0.609 | 1.000 | 0.522 | 0.783 | 0.000 | 0.583
제조원산지 | 0.441 | 0.522 | 1.000 | 0.000 | 0.837 | 0.962
수거지역 | 0.862 | 0.783 | 0.000 | 1.000 | 0.000 | 0.787
세슘(134Cs+137Cs)(기준:100) | 0.000 | 0.000 | 0.837 | 0.000 | 1.000 | 1.000
비고 | 0.770 | 0.583 | 0.962 | 0.787 | 1.000 | 1.000

Variable | 세슘(134Cs+137Cs)(기준:100) | 분류 | 비고 | 검사기간 | 수거지역 | 제조원산지
세슘(134Cs+137Cs)(기준:100) | 1.000 | 0.000 | 0.975 | 0.000 | 0.000 | 0.696
분류 | 0.000 | 1.000 | 0.457 | 0.376 | 0.550 | 0.284
비고 | 0.975 | 0.457 | 1.000 | 0.227 | 0.258 | 0.648
검사기간 | 0.000 | 0.376 | 0.227 | 1.000 | 0.300 | 0.093
수거지역 | 0.000 | 0.550 | 0.258 | 0.300 | 1.000 | 0.000
제조원산지 | 0.696 | 0.284 | 0.648 | 0.093 | 0.000 | 1.000

Variable | 검사기간 | 분류 | 제조원산지 | 수거지역 | 세슘(134Cs+137Cs)(기준:100) | 비고
검사기간 | 1.000 | 0.376 | 0.093 | 0.300 | 0.000 | 0.227
분류 | 0.376 | 1.000 | 0.284 | 0.550 | 0.000 | 0.457
제조원산지 | 0.093 | 0.284 | 1.000 | 0.000 | 0.696 | 0.648
수거지역 | 0.300 | 0.550 | 0.000 | 1.000 | 0.000 | 0.258
세슘(134Cs+137Cs)(기준:100) | 0.000 | 0.000 | 0.696 | 0.000 | 1.000 | 0.975
비고 | 0.227 | 0.457 | 0.648 | 0.258 | 0.975 | 1.000
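
The report does not name the association measure behind the matrices above. Cramér's V is a common choice for pairs of categorical columns, so the sketch below uses it as an assumption, in its plain form without bias correction. The function name `cramers_v` and the toy inputs are hypothetical and not taken from the dataset.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Plain (uncorrected) Cramér's V between two categorical sequences.

    Returns a value in [0, 1]: 0 for no association, 1 when one column
    fully determines the other.
    """
    if len(xs) != len(ys):
        raise ValueError("both sequences must have the same length")
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    if len(row_totals) < 2 or len(col_totals) < 2:
        return 0.0  # a constant column carries no association in this sketch
    observed = Counter(zip(xs, ys))
    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            chi2 += (observed.get((r, c), 0) - expected) ** 2 / expected
    return sqrt(chi2 / (n * (min(len(row_totals), len(col_totals)) - 1)))


perfect = cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"])
independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(perfect, independent)
```

With columns that are dominated by one value, such as 세슘(134Cs+137Cs)(기준:100) with a single deviating row, a measure of this kind can be driven almost entirely by that one row, so coefficients near 1.000 involving that column deserve a second look.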

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

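Index | 검사기간 | 분류 | 품목명(식품유형) | 제조원산지 | 수거지역 | 요오드(131I)(기준:100) | 세슘(134Cs+137Cs)(기준:100) | 판정 | 비고
0 | 2023.5.22.~2023.6.4. | 가공식품 | 기타 건포류 | 국내산 | - | 불검출 | 불검출 | 적합 | <NA>
1 | 2023.5.22.~2023.6.4. | 가공식품 | 기타 건포류 | 국내산 | - | 불검출 | 불검출 | 적합 | <NA>
2 | 2023.5.22.~2023.6.4. | 가공식품 | 기타 건포류 | 국내산 | - | 불검출 | 불검출 | 적합 | <NA>
3 | 2023.5.22.~2023.6.4. | 가공식품 | 기타 건포류 | 국내산 | - | 불검출 | 불검출 | 적합 | <NA>
4 | 2023.5.22.~2023.6.4. | 가공식품 | 기타 건포류 | 국내산 | - | 불검출 | 불검출 | 적합 | <NA>
5 | 2023.5.22.~2023.6.4. | 가공식품 | 기타 건포류 | 국내산 | - | 불검출 | 불검출 | 적합 | <NA>
6 | 2023.5.22.~2023.6.4. | 가공식품 | 기타 건포류 | 국내산 | - | 불검출 | 불검출 | 적합 | <NA>
7 | 2023.5.22.~2023.6.4. | 수산물 | 밴댕이 | 국내산 | 이천시 | 불검출 | 불검출 | 적합 | <NA>
8 | 2023.5.22.~2023.6.4. | 가공식품 | 기타 수산물가공품 | 국내산 | 이천시 | 불검출 | 불검출 | 적합 | 새우
9 | 2023.5.22.~2023.6.4. | 수산물 | 멸치 | 국내산 | 이천시 | 불검출 | 불검출 | 적합 | <NA>

Index | 검사기간 | 분류 | 품목명(식품유형) | 제조원산지 | 수거지역 | 요오드(131I)(기준:100) | 세슘(134Cs+137Cs)(기준:100) | 판정 | 비고
3088 | 2023.12.11.~2023.12.17 | 수산물 | 오징어 | 페루 | 이천시 | 불검출 | 불검출 | 적합 | <NA>
3089 | 2023.12.11.~2023.12.17 | 수산물 | 가리비 | 일본 | 김포시 | 불검출 | 불검출 | 적합 | <NA>
3090 | 2023.12.11.~2023.12.17 | 수산물 | 대합 | 러시아 | 김포시 | 불검출 | 불검출 | 적합 | <NA>
3091 | 2023.12.11.~2023.12.17 | 수산물 | 낙지 | 중국 | 김포시 | 불검출 | 불검출 | 적합 | <NA>
3092 | 2023.12.11.~2023.12.17 | 수산물 | 대합 | 중국 | 김포시 | 불검출 | 불검출 | 적합 | <NA>
3093 | 2023.12.11.~2023.12.17 | 수산물 | 멍게 | 국내산 | 김포시 | 불검출 | 불검출 | 적합 | <NA>
3094 | 2023.12.11.~2023.12.17 | 수산물 | 전복 | 국내산 | 김포시 | 불검출 | 불검출 | 적합 | <NA>
3095 | 2023.12.11.~2023.12.17 | 가공식품 | 기타 수산물가공품 | 국내산 | 용인시 | 불검출 | 불검출 | 적합 | 삼치
3096 | 2023.12.11.~2023.12.17 | 가공식품 | 기타 수산물가공품 | 러시아 | 용인시 | 불검출 | 불검출 | 적합 | 명태
3097 | 2023.12.11.~2023.12.17 | 가공식품 | 기타 수산물가공품 | 미국 | 용인시 | 불검출 | 불검출 | 적합 | 가자미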

Duplicate rows

Most frequently occurring

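Index | 검사기간 | 분류 | 품목명(식품유형) | 제조원산지 | 수거지역 | 요오드(131I)(기준:100) | 세슘(134Cs+137Cs)(기준:100) | 판정 | 비고 | # duplicates
297 | 2023.6.5.~2023.6.18. | 가공식품 | 천일염 | 국내산 | 수원시 | 불검출 | 불검출 | 적합 | <NA> | 17
94 | 2023.10.30.~2023.11.05 | 가공식품 | 기타 건포류 | 국내산 | 수원시 | 불검출 | 불검출 | 적합 | <NA> | 12
245 | 2023.5.22.~2023.6.4. | 가공식품 | 기타 건포류 | 국내산 | - | 불검출 | 불검출 | 적합 | <NA> | 12
248 | 2023.5.22.~2023.6.4. | 가공식품 | 기타 수산물가공품 | 국내산 | - | 불검출 | 불검출 | 적합 | 멸치 | 11
315 | 2023.7.28.~2023.8.4. | 가공식품 | 기타 수산물가공품 | 국내산 | 수원시 | 불검출 | 불검출 | 적합 | 멸치 | 11
198 | 2023.3.27.~2023.4.9. | 농산물 |  | 국내산 | 안성시 | 불검출 | 불검출 | 적합 | <NA> | 10
210 | 2023.4.10.~2023.4.23. | 농산물 | 사과 | 국내산 | 문경시 | 불검출 | 불검출 | 적합 | <NA> | 10
265 | 2023.6.19.~2023.6.23. | 가공식품 | 기타 수산물가공품 | 국내산 | 화성시 | 불검출 | 불검출 | 적합 | <NA> | 10
306 | 2023.6.5.~2023.6.18. | 농산물 | 토마토 | 국내산 | 춘천시 | 불검출 | 불검출 | 적합 | <NA> | 10
196 | 2023.3.13.~2023.3.26. | 가공식품 | 기타 수산물가공품 | 국내산 | 수원시 | 불검출 | 불검출 | 적합 | 멸치 | 9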
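
A "duplicate rows" figure can be counted in more than one way: the number of distinct row patterns that occur more than once, the total number of rows belonging to such patterns, or the number of surplus copies beyond the first. The report does not state which of these the headline figure of 352 refers to. The sketch below computes all three on a few hypothetical rows shortened to three columns; `duplicate_groups` is an illustrative helper, not part of any library.

```python
from collections import Counter


def duplicate_groups(rows):
    """Map each row pattern that occurs more than once to its occurrence count."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Hypothetical rows (검사기간, 분류, 품목명), shortened for illustration
rows = [
    ("2023.6.5.~2023.6.18.", "가공식품", "천일염"),
    ("2023.6.5.~2023.6.18.", "가공식품", "천일염"),
    ("2023.6.5.~2023.6.18.", "가공식품", "천일염"),
    ("2023.4.10.~2023.4.23.", "농산물", "사과"),
    ("2023.4.10.~2023.4.23.", "농산물", "사과"),
    ("2023.12.11.~2023.12.17", "수산물", "전복"),
]

groups = duplicate_groups(rows)
n_patterns = len(groups)               # distinct patterns seen more than once
n_rows_involved = sum(groups.values())  # all rows belonging to those patterns
n_surplus = n_rows_involved - n_patterns  # copies beyond the first of each pattern
print(n_patterns, n_rows_involved, n_surplus)  # prints 2 5 3
```

Per-pattern occurrence counts like the values in `groups` are what the "# duplicates" column above appears to list.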