Overview

Dataset statistics

Number of variables: 10
Number of observations: 784
Missing cells: 141
Missing cells (%): 1.8%
Duplicate rows: 6
Duplicate rows (%): 0.8%
Total size in memory: 63.7 KiB
Average record size in memory: 83.2 B
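
The percentages and the average record size follow from the raw counts in this block. A minimal sketch of that arithmetic, assuming the percentages are taken over all cells (variables times observations) and that 1 KiB is 1024 bytes; the variable names are illustrative and not part of the report:

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 10
n_observations = 784
missing_cells = 141
duplicate_rows = 6
total_size_kib = 63.7

total_cells = n_variables * n_observations            # 7840 cells overall
missing_pct = 100 * missing_cells / total_cells       # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%):   {missing_pct:.1f}%")
print(f"Duplicate rows (%):  {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

The three printed values round to the 1.8%, 0.8% and 83.2 B reported in this block.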

Variable types

Numeric: 1
Categorical: 6
Text: 2
DateTime: 1

Dataset

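Description: 부산광역시_식품방사능검사현황_20211231 (Busan Metropolitan City, food radioactivity inspection status, 2021-12-31)
Author: 부산광역시 (Busan Metropolitan City)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15083358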

Alerts

세슘검출량(Bq_kg) (cesium detected, Bq/kg) has constant value "0" [Constant]
적부판정 (compliance verdict) has constant value "적합" (compliant) [Constant]
Dataset has 6 (0.8%) duplicate rows [Duplicates]
요오드검출량(Bq_kg) (iodine detected, Bq/kg) is highly overall correlated with 연번 (serial number) and 3 other fields [High correlation]
분류 (category) is highly overall correlated with 원산지 (origin) and 1 other field [High correlation]
수입국 (country of import) is highly overall correlated with 원산지 (origin) and 1 other field [High correlation]
원산지 (origin) is highly overall correlated with 분류 (category) and 2 other fields [High correlation]
연번 (serial number) is highly overall correlated with 요오드검출량(Bq_kg) [High correlation]
수입국 (country of import) is highly imbalanced (65.1%) [Imbalance]
요오드검출량(Bq_kg) (iodine detected, Bq/kg) is highly imbalanced (98.6%) [Imbalance]
연번 (serial number) has 141 (18.0%) missing values [Missing]
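
The two imbalance percentages are consistent with a score of one minus the Shannon entropy of the class counts divided by the maximum possible entropy. The sketch below assumes that definition; the function name is hypothetical, and the counts are taken from the Common Values tables of 요오드검출량(Bq_kg) and 수입국 in this report (eight values of 수입국 occur exactly once):

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform split, near 1 for one dominant class."""
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Class counts as listed in the report's Common Values tables.
iodine_counts = [783, 1]
import_country_counts = [493, 241, 13, 9, 9, 6, 3, 2] + [1] * 8

print(f"요오드검출량(Bq_kg): {100 * imbalance_score(iodine_counts):.1f}%")
print(f"수입국: {100 * imbalance_score(import_country_counts):.1f}%")
```

Under this assumption the scores come out at 98.6% and 65.1%, matching the two Imbalance alerts.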

Reproduction

Analysis started: 2024-03-30 09:11:11.345553
Analysis finished: 2024-03-30 09:11:14.615147
Duration: 3.27 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 643
Distinct (%): 100.0%
Missing: 141
Missing (%): 18.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 322
Minimum: 1
Maximum: 643
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 33.1
Q1: 161.5
Median: 322
Q3: 482.5
95-th percentile: 610.9
Maximum: 643
Range: 642
Interquartile range (IQR): 321

Descriptive statistics

Standard deviation: 185.76239
Coefficient of variation (CV): 0.57690184
Kurtosis: -1.2
Mean: 322
Median Absolute Deviation (MAD): 161
Skewness: 0
Sum: 207046
Variance: 34507.667
Monotonicity: Strictly increasing
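
These figures are what one would expect if the 643 non-missing values of 연번 are simply the integers 1 through 643, each appearing once. A short check using only the Python standard library, under that assumption (the report itself only states the minimum, maximum, distinct count and strict monotonicity):

```python
import statistics

# Assumption: 연번 takes every integer from 1 to 643 exactly once.
values = list(range(1, 644))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
std = statistics.stdev(values)
cv = std / mean                          # coefficient of variation
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")

# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, round(variance, 3), round(std, 5), round(cv, 8))
print(q1, median, q3, q3 - q1, mad)
```

The sum, mean, variance, standard deviation, quartiles, IQR and MAD computed this way agree with the values listed in the report.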
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
404 | 1 | 0.1%
426 | 1 | 0.1%
427 | 1 | 0.1%
428 | 1 | 0.1%
429 | 1 | 0.1%
430 | 1 | 0.1%
431 | 1 | 0.1%
432 | 1 | 0.1%
433 | 1 | 0.1%
434 | 1 | 0.1%
Other values (633) | 633 | 80.7%
(Missing) | 141 | 18.0%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 1 | 0.1%
7 | 1 | 0.1%
8 | 1 | 0.1%
9 | 1 | 0.1%
10 | 1 | 0.1%

Maximum 10 values
Value | Count | Frequency (%)
643 | 1 | 0.1%
642 | 1 | 0.1%
641 | 1 | 0.1%
640 | 1 | 0.1%
639 | 1 | 0.1%
638 | 1 | 0.1%
637 | 1 | 0.1%
636 | 1 | 0.1%
635 | 1 | 0.1%
634 | 1 | 0.1%

분류 (category)
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 6.3 KiB

Value | Count
수산물 | 406
가공식품 | 301
농산물 | 50
축산물 | 14
기타수산물가공품 | 12

Length

Max length: 8
Median length: 3
Mean length: 3.4617347
Min length: 3

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 가공식품
2nd row: 수산물
3rd row: 가공식품
4th row: 가공식품
5th row: 가공식품

Common Values

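Value | Count | Frequency (%)
수산물 (fishery products) | 406 | 51.8%
가공식품 (processed food) | 301 | 38.4%
농산물 (agricultural products) | 50 | 6.4%
축산물 (livestock products) | 14 | 1.8%
기타수산물가공품 (other processed fishery products) | 12 | 1.5%
수산물 | 1 | 0.1%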
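
The label 수산물 appears twice in this table, once with 406 rows and once with a single row. One plausible explanation, which the report does not confirm, is that the single row carries stray whitespace and is therefore counted as a separate category. The sketch below uses a hypothetical four-row sample to show how stripping whitespace would merge such variants:

```python
from collections import Counter

# Hypothetical sample: the trailing space in the last label is an assumption
# about how a duplicate-looking category could arise; it is not taken from the data.
labels = ["수산물", "가공식품", "수산물", "수산물 "]

raw_counts = Counter(labels)
clean_counts = Counter(label.strip() for label in labels)

print(len(raw_counts), "distinct labels before stripping")
print(len(clean_counts), "distinct labels after stripping")
print(clean_counts["수산물"], "rows labelled 수산물 after stripping")
```

If the same cause applies to the real column, normalising the labels before profiling would reduce the distinct count of 분류 from 6 to 5.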

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
수산물 | 407 | 51.9%
가공식품 | 301 | 38.4%
농산물 | 50 | 6.4%
축산물 | 14 | 1.8%
기타수산물가공품 | 12 | 1.5%

제품명 (product name)
Text

Distinct: 529
Distinct (%): 67.5%
Missing: 0
Missing (%): 0.0%
Memory size: 6.3 KiB

Length

Max length: 45
Median length: 35
Mean length: 7.8469388
Min length: 1

Characters and Unicode

Total characters: 6152
Distinct characters: 468
Distinct categories: 9
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 434
Unique (%): 55.4%

Sample

1st row: 에스앤비골드카레매운맛
2nd row: 노르웨이생물연어
3rd row: 폰즈유즈카
4th row: 폰샤브
5th row: 오르시샤브샤브타래

Value | Count | Frequency (%)
고등어 | 31 | 2.6%
miso | 20 | 1.7%
tsuyu | 20 | 1.7%
kikkoman | 19 | 1.6%
가자미 | 16 | 1.3%
삼치 | 15 | 1.2%
소스 | 15 | 1.2%
sauce | 13 | 1.1%
hon | 12 | 1.0%
sinsueti | 12 | 1.0%
Other values (681) | 1032 | 85.6%

Most occurring characters

Value | Count | Frequency (%)
(space) | 424 | 6.9%
S | 225 | 3.7%
A | 212 | 3.4%
O | 192 | 3.1%
I | 179 | 2.9%
U | 164 | 2.7%
E | 147 | 2.4%
[character lost] | 130 | 2.1%
K | 117 | 1.9%
M | 103 | 1.7%
Other values (458) | 4259 | 69.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3439 | 55.9%
Uppercase Letter | 1992 | 32.4%
Space Separator | 424 | 6.9%
Open Punctuation | 93 | 1.5%
Close Punctuation | 93 | 1.5%
Lowercase Letter | 54 | 0.9%
Decimal Number | 38 | 0.6%
Other Punctuation | 17 | 0.3%
Dash Punctuation | 2 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[character lost] | 130 | 3.8%
[character lost] | 92 | 2.7%
[character lost] | 85 | 2.5%
[character lost] | 77 | 2.2%
[character lost] | 75 | 2.2%
[character lost] | 65 | 1.9%
[character lost] | 60 | 1.7%
[character lost] | 55 | 1.6%
[character lost] | 53 | 1.5%
[character lost] | 52 | 1.5%
Other values (398) | 2695 | 78.4%

Uppercase Letter
Value | Count | Frequency (%)
S | 225 | 11.3%
A | 212 | 10.6%
O | 192 | 9.6%
I | 179 | 9.0%
U | 164 | 8.2%
E | 147 | 7.4%
K | 117 | 5.9%
M | 103 | 5.2%
T | 103 | 5.2%
N | 99 | 5.0%
Other values (15) | 451 | 22.6%

Lowercase Letter
Value | Count | Frequency (%)
a | 7 | 13.0%
m | 7 | 13.0%
p | 5 | 9.3%
i | 5 | 9.3%
k | 4 | 7.4%
g | 4 | 7.4%
u | 3 | 5.6%
c | 2 | 3.7%
r | 2 | 3.7%
t | 2 | 3.7%
Other values (7) | 13 | 24.1%

Decimal Number
Value | Count | Frequency (%)
0 | 18 | 47.4%
1 | 7 | 18.4%
5 | 5 | 13.2%
2 | 4 | 10.5%
9 | 2 | 5.3%
4 | 1 | 2.6%
3 | 1 | 2.6%

Other Punctuation
Value | Count | Frequency (%)
% | 4 | 23.5%
: | 4 | 23.5%
. | 3 | 17.6%
; | 2 | 11.8%
& | 2 | 11.8%
! | 1 | 5.9%
, | 1 | 5.9%

Space Separator
Value | Count | Frequency (%)
(space) | 424 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 93 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 93 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3437 | 55.9%
Latin | 2046 | 33.3%
Common | 667 | 10.8%
Han | 2 | < 0.1%

Most frequent character per script

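Hangul
Value | Count | Frequency (%)
[character lost] | 130 | 3.8%
[character lost] | 92 | 2.7%
[character lost] | 85 | 2.5%
[character lost] | 77 | 2.2%
[character lost] | 75 | 2.2%
[character lost] | 65 | 1.9%
[character lost] | 60 | 1.7%
[character lost] | 55 | 1.6%
[character lost] | 53 | 1.5%
[character lost] | 52 | 1.5%
Other values (396) | 2693 | 78.4%

Latin
Value | Count | Frequency (%)
S | 225 | 11.0%
A | 212 | 10.4%
O | 192 | 9.4%
I | 179 | 8.7%
U | 164 | 8.0%
E | 147 | 7.2%
K | 117 | 5.7%
M | 103 | 5.0%
T | 103 | 5.0%
N | 99 | 4.8%
Other values (32) | 505 | 24.7%

Common
Value | Count | Frequency (%)
(space) | 424 | 63.6%
( | 93 | 13.9%
) | 93 | 13.9%
0 | 18 | 2.7%
1 | 7 | 1.0%
5 | 5 | 0.7%
% | 4 | 0.6%
: | 4 | 0.6%
2 | 4 | 0.6%
. | 3 | 0.4%
Other values (8) | 12 | 1.8%

Han
Value | Count | Frequency (%)
[character lost] | 1 | 50.0%
[character lost] | 1 | 50.0%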

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3436 | 55.9%
ASCII | 2713 | 44.1%
CJK | 2 | < 0.1%
Compat Jamo | 1 | < 0.1%

Most frequent character per block

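ASCII
Value | Count | Frequency (%)
(space) | 424 | 15.6%
S | 225 | 8.3%
A | 212 | 7.8%
O | 192 | 7.1%
I | 179 | 6.6%
U | 164 | 6.0%
E | 147 | 5.4%
K | 117 | 4.3%
M | 103 | 3.8%
T | 103 | 3.8%
Other values (50) | 847 | 31.2%

Hangul
Value | Count | Frequency (%)
[character lost] | 130 | 3.8%
[character lost] | 92 | 2.7%
[character lost] | 85 | 2.5%
[character lost] | 77 | 2.2%
[character lost] | 75 | 2.2%
[character lost] | 65 | 1.9%
[character lost] | 60 | 1.7%
[character lost] | 55 | 1.6%
[character lost] | 53 | 1.5%
[character lost] | 52 | 1.5%
Other values (395) | 2692 | 78.3%

CJK
Value | Count | Frequency (%)
[character lost] | 1 | 50.0%
[character lost] | 1 | 50.0%

Compat Jamo
Value | Count | Frequency (%)
[character lost] | 1 | 100.0%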

품목(또는 식품유형) (item or food type)
Text

Distinct: 164
Distinct (%): 20.9%
Missing: 0
Missing (%): 0.0%
Memory size: 6.3 KiB

Length

Max length: 10
Median length: 8
Mean length: 3.5714286
Min length: 1

Characters and Unicode

Total characters: 2800
Distinct characters: 184
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 78
Unique (%): 9.9%

Sample

1st row: 카레
2nd row: 연어
3rd row: 소스
4th row: 소스
5th row: 소스

Value | Count | Frequency (%)
소스 | 102 | 13.0%
기타수산물가공품 | 81 | 10.3%
기타수산물활 | 37 | 4.7%
고등어 | 32 | 4.1%
혼합장 | 28 | 3.6%
가자미 | 23 | 2.9%
캔디류 | 17 | 2.2%
카레 | 17 | 2.2%
삼치 | 16 | 2.0%
수산물 | 15 | 1.9%
Other values (152) | 416 | 53.1%

Most occurring characters

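Value | Count | Frequency (%)
[character lost] | 167 | 6.0%
[character lost] | 163 | 5.8%
[character lost] | 161 | 5.8%
[character lost] | 152 | 5.4%
[character lost] | 137 | 4.9%
[character lost] | 127 | 4.5%
[character lost] | 109 | 3.9%
[character lost] | 108 | 3.9%
[character lost] | 98 | 3.5%
[character lost] | 91 | 3.2%
Other values (174) | 1487 | 53.1%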

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2786 | 99.5%
Close Punctuation | 5 | 0.2%
Open Punctuation | 5 | 0.2%
Space Separator | 2 | 0.1%
Other Punctuation | 2 | 0.1%

Most frequent character per category

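Other Letter
Value | Count | Frequency (%)
[character lost] | 167 | 6.0%
[character lost] | 163 | 5.9%
[character lost] | 161 | 5.8%
[character lost] | 152 | 5.5%
[character lost] | 137 | 4.9%
[character lost] | 127 | 4.6%
[character lost] | 109 | 3.9%
[character lost] | 108 | 3.9%
[character lost] | 98 | 3.5%
[character lost] | 91 | 3.3%
Other values (170) | 1473 | 52.9%

Close Punctuation
Value | Count | Frequency (%)
) | 5 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 5 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 2 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 2 | 100.0%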

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2786 | 99.5%
Common | 14 | 0.5%

Most frequent character per script

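Hangul
Value | Count | Frequency (%)
[character lost] | 167 | 6.0%
[character lost] | 163 | 5.9%
[character lost] | 161 | 5.8%
[character lost] | 152 | 5.5%
[character lost] | 137 | 4.9%
[character lost] | 127 | 4.6%
[character lost] | 109 | 3.9%
[character lost] | 108 | 3.9%
[character lost] | 98 | 3.5%
[character lost] | 91 | 3.3%
Other values (170) | 1473 | 52.9%

Common
Value | Count | Frequency (%)
) | 5 | 35.7%
( | 5 | 35.7%
(space) | 2 | 14.3%
. | 2 | 14.3%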

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2786 | 99.5%
ASCII | 14 | 0.5%

Most frequent character per block

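Hangul
Value | Count | Frequency (%)
[character lost] | 167 | 6.0%
[character lost] | 163 | 5.9%
[character lost] | 161 | 5.8%
[character lost] | 152 | 5.5%
[character lost] | 137 | 4.9%
[character lost] | 127 | 4.6%
[character lost] | 109 | 3.9%
[character lost] | 108 | 3.9%
[character lost] | 98 | 3.5%
[character lost] | 91 | 3.3%
Other values (170) | 1473 | 52.9%

ASCII
Value | Count | Frequency (%)
) | 5 | 35.7%
( | 5 | 35.7%
(space) | 2 | 14.3%
. | 2 | 14.3%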

수거일 (collection date)
DateTime

Distinct: 88
Distinct (%): 11.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.3 KiB
Minimum: 2020-12-07 00:00:00
Maximum: 2023-02-23 00:00:00
Histogram with fixed size bins (bins=50)

원산지 (origin)
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 6.3 KiB

Value | Count
국내 | 496
국외 | 288

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 국외
2nd row: 국외
3rd row: 국외
4th row: 국외
5th row: 국외

Common Values

Value | Count | Frequency (%)
국내 (domestic) | 496 | 63.3%
국외 (foreign) | 288 | 36.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
국내 | 496 | 63.3%
국외 | 288 | 36.7%

수입국 (country of import)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 16
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 6.3 KiB

Value | Count
대한민국 | 493
일본 | 241
미국 | 13
러시아 | 9
중국 | 9
Other values (11) | 19

Length

Max length: 5
Median length: 4
Mean length: 3.2920918
Min length: 2

Unique

Unique: 8
Unique (%): 1.0%

Sample

1st row: 일본
2nd row: 노르웨이
3rd row: 일본
4th row: 일본
5th row: 일본

Common Values

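Value | Count | Frequency (%)
대한민국 (Republic of Korea) | 493 | 62.9%
일본 (Japan) | 241 | 30.7%
미국 (United States) | 13 | 1.7%
러시아 (Russia) | 9 | 1.1%
중국 (China) | 9 | 1.1%
베트남 (Vietnam) | 6 | 0.8%
국내 (domestic) | 3 | 0.4%
노르웨이 (Norway) | 2 | 0.3%
말레이사아 (Malaysia, as spelled in the data) | 1 | 0.1%
이슬람 (literally "Islam") | 1 | 0.1%
Other values (6) | 6 | 0.8%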

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
대한민국 | 493 | 62.9%
일본 | 241 | 30.7%
미국 | 13 | 1.7%
러시아 | 9 | 1.1%
중국 | 9 | 1.1%
베트남 | 6 | 0.8%
국내 | 3 | 0.4%
노르웨이 | 2 | 0.3%
말레이사아 | 1 | 0.1%
이슬람 | 1 | 0.1%
Other values (6) | 6 | 0.8%

세슘검출량(Bq_kg) (cesium detected, Bq/kg)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.3 KiB

Value | Count
0 | 784

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 784 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 784 | 100.0%

요오드검출량(Bq_kg) (iodine detected, Bq/kg)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 6.3 KiB

Value | Count
0 | 783
<NA> | 1

Length

Max length: 4
Median length: 1
Mean length: 1.0038265
Min length: 1

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 783 | 99.9%
<NA> | 1 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 783 | 99.9%
na | 1 | 0.1%

적부판정 (compliance verdict)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.3 KiB

Value | Count
적합 | 784

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 적합
2nd row: 적합
3rd row: 적합
4th row: 적합
5th row: 적합

Common Values

Value | Count | Frequency (%)
적합 (compliant) | 784 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
적합 | 784 | 100.0%

Interactions


Correlations

Variable | 연번 | 분류 | 수거일 | 원산지 | 수입국
연번 | 1.000 | 0.303 | 0.993 | 0.256 | 0.237
분류 | 0.303 | 1.000 | 0.866 | 0.717 | 0.635
수거일 | 0.993 | 0.866 | 1.000 | 0.739 | 0.764
원산지 | 0.256 | 0.717 | 0.739 | 1.000 | 1.000
수입국 | 0.237 | 0.635 | 0.764 | 1.000 | 1.000

Variable | 요오드검출량(Bq_kg) | 분류 | 수입국 | 원산지
요오드검출량(Bq_kg) | 1.000 | 1.000 | 1.000 | 1.000
분류 | 1.000 | 1.000 | 0.367 | 0.528
수입국 | 1.000 | 0.367 | 1.000 | 0.991
원산지 | 1.000 | 0.528 | 0.991 | 1.000

Variable | 연번 | 분류 | 원산지 | 수입국 | 요오드검출량(Bq_kg)
연번 | 1.000 | 0.185 | 0.195 | 0.109 | 1.000
분류 | 0.185 | 1.000 | 0.528 | 0.367 | 1.000
원산지 | 0.195 | 0.528 | 1.000 | 0.991 | 1.000
수입국 | 0.109 | 0.367 | 0.991 | 1.000 | 1.000
요오드검출량(Bq_kg) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
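
The near-perfect association between 원산지 and 수입국 is what one would expect when the country of import almost fully determines whether a product is domestic or foreign. The report does not label which coefficient each matrix uses, so the following is only a sketch of one common measure for two categorical columns, the uncorrected Cramér's V. The function name and the four-row toy data are hypothetical, and the result is not meant to reproduce the 0.991 shown in the matrices:

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_totals = Counter(xs)
    y_totals = Counter(ys)
    chi2 = 0.0
    for x, x_n in x_totals.items():
        for y, y_n in y_totals.items():
            expected = x_n * y_n / n          # count expected under independence
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(x_totals), len(y_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Toy data (hypothetical): each country maps to exactly one origin value.
origin = ["국내", "국내", "국외", "국외"]
country = ["대한민국", "대한민국", "일본", "미국"]

print(cramers_v(origin, country))
```

When every country maps to a single origin value, as in the toy data, the measure reaches its maximum of 1; a few inconsistent rows (for example 국내 recorded as a country) would pull it slightly below 1.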

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

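First rows
Index | 연번 | 분류 | 제품명 | 품목(또는 식품유형) | 수거일 | 원산지 | 수입국 | 세슘검출량(Bq_kg) | 요오드검출량(Bq_kg) | 적부판정
0 | 1 | 가공식품 | 에스앤비골드카레매운맛 | 카레 | 2020-12-07 | 국외 | 일본 | 0 | 0 | 적합
1 | 2 | 수산물 | 노르웨이생물연어 | 연어 | 2020-12-18 | 국외 | 노르웨이 | 0 | 0 | 적합
2 | 3 | 가공식품 | 폰즈유즈카 | 소스 | 2020-12-15 | 국외 | 일본 | 0 | 0 | 적합
3 | 4 | 가공식품 | 폰샤브 | 소스 | 2020-12-15 | 국외 | 일본 | 0 | 0 | 적합
4 | 5 | 가공식품 | 오르시샤브샤브타래 | 소스 | 2020-12-15 | 국외 | 일본 | 0 | 0 | 적합
5 | 6 | 가공식품 | 폰즈위즈카 | 소스 | 2020-12-15 | 국외 | 일본 | 0 | 0 | 적합
6 | 7 | 가공식품 | 기꼬만혼쯔유(코이다시) | 소스 | 2020-12-15 | 국외 | 일본 | 0 | 0 | 적합
7 | 8 | 가공식품 | 농축쯔유 | 소스 | 2020-12-15 | 국외 | 일본 | 0 | 0 | 적합
8 | 9 | 가공식품 | 시주일미된장(적된장) | 혼합장 | 2020-12-15 | 국외 | 일본 | 0 | 0 | 적합
9 | 10 | 가공식품 | 마루쿠메군코시 | 혼합장 | 2020-12-15 | 국외 | 일본 | 0 | 0 | 적합

Last rows
Index | 연번 | 분류 | 제품명 | 품목(또는 식품유형) | 수거일 | 원산지 | 수입국 | 세슘검출량(Bq_kg) | 요오드검출량(Bq_kg) | 적부판정
774 | <NA> | 수산물 | 고등어 | 고등어 | 2021-12-14 | 국내 | 대한민국 | 0 | 0 | 적합
775 | <NA> | 수산물 | 삼치 | 삼치 | 2021-12-14 | 국내 | 대한민국 | 0 | 0 | 적합
776 | <NA> | 수산물 | 대구 | 대구 | 2021-12-14 | 국내 | 대한민국 | 0 | 0 | 적합
777 | <NA> | 기타수산물가공품 | 다시멸치 | 건조멸치 | 2021-12-16 | 국내 | 대한민국 | 0 | 0 | 적합
778 | <NA> | 기타수산물가공품 | 보리새우 | 건조새우 | 2021-12-16 | 국내 | 대한민국 | 0 | 0 | 적합
779 | <NA> | 기타수산물가공품 | 황태채 | 명태 | 2021-12-16 | 국외 | 러시아 | 0 | 0 | 적합
780 | <NA> | 기타수산물가공품 | 건멸치 | 건멸치 | 2021-12-16 | 국내 | 대한민국 | 0 | 0 | 적합
781 | <NA> | 기타수산물가공품 | 분홍새우 | 건새우 | 2021-12-16 | 국외 | 중국 | 0 | 0 | 적합
782 | <NA> | 기타수산물가공품 | 건다시마 | 건다시마 | 2021-12-16 | 국내 | 대한민국 | 0 | 0 | 적합
783 | <NA> | 수산물 | 고등어 | 고등어 | 2021-12-14 | 국내 | 대한민국 | 0 | 0 | 적합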

Duplicate rows

Most frequently occurring

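Index | 연번 | 분류 | 제품명 | 품목(또는 식품유형) | 수거일 | 원산지 | 수입국 | 세슘검출량(Bq_kg) | 요오드검출량(Bq_kg) | 적부판정 | # duplicates
0 | <NA> | 가공식품 | KIKKOMAN KOIDASHI HON TSUYU | 소스 | 2021-11-16 | 국외 | 일본 | 0 | 0 | 적합 | 2
1 | <NA> | 수산물 | 고등어 | 고등어 | 2021-12-14 | 국내 | 대한민국 | 0 | 0 | 적합 | 2
2 | <NA> | 수산물 | 생고등어 | 고등어 | 2021-12-07 | 국내 | 대한민국 | 0 | 0 | 적합 | 2
3 | <NA> | 수산물 | 오징어 | 수산물 | 2021-11-16 | 국내 | 대한민국 | 0 | 0 | 적합 | 2
4 | <NA> | 축산물 | 한돈갈비 | 돼지고기 | 2021-11-02 | 국내 | 대한민국 | 0 | 0 | 적합 | 2
5 | <NA> | 축산물 | 한돈사태살 | 돼지고기 | 2021-11-02 | 국내 | 대한민국 | 0 | 0 | 적합 | 2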
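
All six duplicated rows have a missing 연번, so rows become indistinguishable once the serial number is absent. A minimal sketch of how such repeats can be counted with the standard library, using three rows from the tail of the Sample section (indices 774, 775 and 783); treating a missing 연번 as None, and missing values as equal to each other, is an assumption of this sketch:

```python
from collections import Counter

# Rows 774, 775 and 783 from the Sample section; None stands for a missing 연번.
rows = [
    (None, "수산물", "고등어", "고등어", "2021-12-14", "국내", "대한민국", "0", "0", "적합"),
    (None, "수산물", "삼치", "삼치", "2021-12-14", "국내", "대한민국", "0", "0", "적합"),
    (None, "수산물", "고등어", "고등어", "2021-12-14", "국내", "대한민국", "0", "0", "적합"),
]

counts = Counter(rows)
repeated = {row: n for row, n in counts.items() if n > 1}   # rows seen more than once
extra_copies = sum(n - 1 for n in repeated.values())        # copies beyond the first

for row, n in repeated.items():
    print(row[2], row[4], "occurs", n, "times")
print("extra copies:", extra_copies)
```

Here rows 774 and 783 collapse into one entry that occurs twice, which corresponds to the 고등어 / 2021-12-14 line in the duplicate table.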