Dataset statistics
Number of variables | 11 |
---|---|
Number of observations | 1696 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 150.8 KiB |
Average record size in memory | 91.1 B |
Variable types
Numeric | 2 |
---|---|
Categorical | 5 |
Text | 1 |
DateTime | 2 |
Boolean | 1 |
Dataset
Description | 20201221 |
---|---|
Author | 부산시공공데이터포털 (Busan City Public Data Portal) |
URL | https://bigdata.busan.go.kr/data/bigDataDetailView.do?menuCode=M00000000007&hdfs_file_sn=20230901054901241000 |
test_result has constant value "적합" (compliant) | Constant |
apr_at has constant value "False" | Constant |
skey is highly overall correlated with test_year | High correlation |
test_month is highly overall correlated with test_year | High correlation |
test_year is highly overall correlated with skey and 1 other fields | High correlation |
detec_result is highly overall correlated with origin | High correlation |
origin is highly overall correlated with detec_result | High correlation |
detec_result is highly imbalanced (98.7%) | Imbalance |
origin is highly imbalanced (63.9%) | Imbalance |
skey has unique values | Unique |
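The 98.7% imbalance reported for detec_result can be reproduced from its three value counts (1693, 2 and 1). The sketch below assumes the score is one minus the Shannon entropy normalised by the logarithm of the number of classes; the helper name `imbalance_score` is hypothetical and is not taken from the report. The origin score cannot be checked the same way because only 10 of its 40 class counts are listed.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the base-2 Shannon entropy of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # normalised entropy is undefined for a single class
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# detec_result value counts as listed in the report (assumed formula, see above)
detec_counts = [1693, 2, 1]
score = imbalance_score(detec_counts)
print(f"{score:.1%}")
```

A score near 100% means almost all observations fall in one class, which is the case here: 1693 of 1696 rows share a single value.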
Reproduction
Analysis started | 2023-10-09 06:39:18.773451 |
---|---|
Analysis finished | 2023-10-09 06:39:21.419175 |
Duration | 2.65 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
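A report of this kind can be regenerated with ydata-profiling along the following lines. This is a minimal sketch, not the configuration used for this run: the function name `build_report`, the CSV path, the output file name and the report title are all placeholders, and the original config.json is not reproduced here.

```python
def build_report(csv_path, output_html="report.html"):
    """Profile a CSV export and write an HTML report (file names are placeholders)."""
    # Imported lazily so the module still loads where the libraries are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    report = ProfileReport(df, title="Dataset profile")
    report.to_file(output_html)
    return report
```

Calling `build_report("data.csv")` on a local copy of the dataset would write `report.html` next to the script; run time and memory figures will differ from those listed above.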
skey
Real number (ℝ)
HIGH CORRELATION, UNIQUE
Distinct | 1696 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 3825.5 |
Minimum | 2978 |
---|---|
Maximum | 4673 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 15.0 KiB |
Quantile statistics
Minimum | 2978 |
---|---|
5-th percentile | 3062.75 |
Q1 | 3401.75 |
median | 3825.5 |
Q3 | 4249.25 |
95-th percentile | 4588.25 |
Maximum | 4673 |
Range | 1695 |
Interquartile range (IQR) | 847.5 |
Descriptive statistics
Standard deviation | 489.73734 |
---|---|
Coefficient of variation (CV) | 0.12801917 |
Kurtosis | -1.2 |
Mean | 3825.5 |
Median Absolute Deviation (MAD) | 424 |
Skewness | 0 |
Sum | 6488048 |
Variance | 239842.67 |
Monotonicity | Strictly increasing |
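The skey statistics are consistent with a plain running index. The sketch below assumes skey takes every integer from 2978 to 4673 exactly once (an assumption suggested by the minimum, maximum, 1696 distinct values and strict monotonicity, not something the report states outright) and recomputes the summary figures with the standard library. The `percentile` helper is hypothetical and uses linear interpolation between neighbouring values.

```python
import statistics

# Assumption: skey is the consecutive integer sequence 2978..4673.
skey = list(range(2978, 4674))


def percentile(sorted_values, q):
    """Linearly interpolated quantile for q in [0, 1] over sorted data."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


count = len(skey)
total = sum(skey)
mean = statistics.fmean(skey)
variance = statistics.variance(skey)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(skey)
median = statistics.median(skey)
mad = statistics.median([abs(v - median) for v in skey])  # median absolute deviation
p05 = percentile(skey, 0.05)
q1, q3 = percentile(skey, 0.25), percentile(skey, 0.75)

print(count, total, mean, median, mad)
print(round(variance, 2), round(std_dev, 5), round(std_dev / mean, 8))
print(round(p05, 2), q1, q3, q3 - q1)
```

Under that assumption the recomputed count, sum, mean, variance, standard deviation, quartiles and MAD agree with the values in the tables above, which suggests skey carries no information beyond row order.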
Common Values
Value | Count | Frequency (%) |
2978 | 1 | 0.1% |
4118 | 1 | 0.1% |
4116 | 1 | 0.1% |
4115 | 1 | 0.1% |
4114 | 1 | 0.1% |
4113 | 1 | 0.1% |
4112 | 1 | 0.1% |
4111 | 1 | 0.1% |
4110 | 1 | 0.1% |
4109 | 1 | 0.1% |
Other values (1686) | 1686 | 99.4% |
Minimum 10 values
Value | Count | Frequency (%) |
2978 | 1 | 0.1% |
2979 | 1 | 0.1% |
2980 | 1 | 0.1% |
2981 | 1 | 0.1% |
2982 | 1 | 0.1% |
2983 | 1 | 0.1% |
2984 | 1 | 0.1% |
2985 | 1 | 0.1% |
2986 | 1 | 0.1% |
2987 | 1 | 0.1% |
Maximum 10 values
Value | Count | Frequency (%) |
4673 | 1 | 0.1% |
4672 | 1 | 0.1% |
4671 | 1 | 0.1% |
4670 | 1 | 0.1% |
4669 | 1 | 0.1% |
4668 | 1 | 0.1% |
4667 | 1 | 0.1% |
4666 | 1 | 0.1% |
4665 | 1 | 0.1% |
4664 | 1 | 0.1% |
test_year
Categorical
HIGH CORRELATION
 
Distinct | 2 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 13.4 KiB |
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 4 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 2019 |
---|---|
2nd row | 2019 |
3rd row | 2019 |
4th row | 2019 |
5th row | 2019 |
Common Values
Value | Count | Frequency (%) |
2019 | 1204 | 71.0% |
2020 | 492 | 29.0% |
test_month
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 12 |
---|---|
Distinct (%) | 0.7% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 6.1892689 |
Minimum | 1 |
---|---|
Maximum | 12 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 15.0 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 1 |
Q1 | 3 |
median | 6 |
Q3 | 9 |
95-th percentile | 12 |
Maximum | 12 |
Range | 11 |
Interquartile range (IQR) | 6 |
Descriptive statistics
Standard deviation | 3.3507138 |
---|---|
Coefficient of variation (CV) | 0.54137474 |
Kurtosis | -1.0486548 |
Mean | 6.1892689 |
Median Absolute Deviation (MAD) | 3 |
Skewness | 0.18489647 |
Sum | 10497 |
Variance | 11.227283 |
Monotonicity | Not monotonic |
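The test_month summary statistics can be cross-checked against the per-month counts listed for this variable. The sketch below copies those twelve counts from the report and recomputes the sum, mean, sample variance and standard deviation; the variable names are illustrative only.

```python
# Observations per month, copied from the value-count tables for test_month.
month_counts = {1: 186, 2: 24, 3: 222, 4: 162, 5: 229, 6: 121,
                7: 158, 8: 128, 9: 106, 10: 92, 11: 142, 12: 126}

n = sum(month_counts.values())                       # number of observations
total = sum(m * c for m, c in month_counts.items())  # sum of all month values
mean = total / n
# Sample variance: squared deviations weighted by count, divided by n - 1.
sum_sq_dev = sum(c * (m - mean) ** 2 for m, c in month_counts.items())
variance = sum_sq_dev / (n - 1)
std_dev = variance ** 0.5

print(n, total)
print(round(mean, 7), round(variance, 6), round(std_dev, 7))
```

The counts add up to the 1696 observations and the recomputed figures agree with the reported sum, mean, variance and standard deviation. February, with 24 rows, is far less represented than any other month.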
Common Values
Value | Count | Frequency (%) |
5 | 229 | 13.5% |
3 | 222 | 13.1% |
1 | 186 | 11.0% |
4 | 162 | 9.6% |
7 | 158 | 9.3% |
11 | 142 | 8.4% |
8 | 128 | 7.5% |
12 | 126 | 7.4% |
6 | 121 | 7.1% |
9 | 106 | 6.2% |
Other values (2) | 116 | 6.8% |
Minimum 10 values
Value | Count | Frequency (%) |
1 | 186 | 11.0% |
2 | 24 | 1.4% |
3 | 222 | 13.1% |
4 | 162 | 9.6% |
5 | 229 | 13.5% |
6 | 121 | 7.1% |
7 | 158 | 9.3% |
8 | 128 | 7.5% |
9 | 106 | 6.2% |
10 | 92 | 5.4% |
Maximum 10 values
Value | Count | Frequency (%) |
12 | 126 | 7.4% |
11 | 142 | 8.4% |
10 | 92 | 5.4% |
9 | 106 | 6.2% |
8 | 128 | 7.5% |
7 | 158 | 9.3% |
6 | 121 | 7.1% |
5 | 229 | 13.5% |
4 | 162 | 9.6% |
3 | 222 | 13.1% |
spec_name
Text
Distinct | 604 |
---|---|
Distinct (%) | 35.6% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 13.4 KiB |
Value | Count | Frequency (%) |
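고등어 (mackerel) | 88 | 4.0% |
삼치 (Spanish mackerel) | 45 | 2.0% |
우럭 (rockfish) | 29 | 1.3% |
가자미 (flounder) | 29 | 1.3% |
오징어 (squid) | 21 | 0.9% |
소스 (sauce) | 21 | 0.9% |
생대구 (fresh cod) | 19 | 0.9% |
갈치 (hairtail) | 18 | 0.8% |
아와세미소 (awase miso) | 17 | 0.8% |
제주갈치 (Jeju hairtail) | 17 | 0.8% |
Other values (705) | 1912 | 86.3% |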
Most occurring characters
Value | Count | Frequency (%) |
(space) | 520 | 5.4% |
어 | 289 | 3.0% |
미 | 229 | 2.4% |
장 | 201 | 2.1% |
고 | 198 | 2.1% |
치 | 193 | 2.0% |
스 | 190 | 2.0% |
기 | 177 | 1.8% |
생 | 146 | 1.5% |
등 | 144 | 1.5% |
Other values (452) | 7337 | 76.2% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 8699 | 90.4% |
Space Separator | 520 | 5.4% |
Open Punctuation | 112 | 1.2% |
Close Punctuation | 112 | 1.2% |
Lowercase Letter | 72 | 0.7% |
Decimal Number | 63 | 0.7% |
Uppercase Letter | 31 | 0.3% |
Other Punctuation | 8 | 0.1% |
Dash Punctuation | 7 | 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
어 | 289 | 3.3% |
미 | 229 | 2.6% |
장 | 201 | 2.3% |
고 | 198 | 2.3% |
치 | 193 | 2.2% |
스 | 190 | 2.2% |
기 | 177 | 2.0% |
생 | 146 | 1.7% |
등 | 144 | 1.7% |
가 | 140 | 1.6% |
Other values (409) | 6792 | 78.1% |
Lowercase Letter
Value | Count | Frequency (%) |
j | 8 | 11.1% |
u | 7 | 9.7% |
i | 6 | 8.3% |
r | 6 | 8.3% |
o | 5 | 6.9% |
a | 5 | 6.9% |
e | 5 | 6.9% |
t | 4 | 5.6% |
s | 4 | 5.6% |
l | 4 | 5.6% |
Other values (8) | 18 | 25.0% |
Uppercase Letter
Value | Count | Frequency (%) |
S | 6 | 19.4% |
B | 4 | 12.9% |
M | 4 | 12.9% |
P | 3 | 9.7% |
T | 3 | 9.7% |
G | 3 | 9.7% |
C | 2 | 6.5% |
N | 2 | 6.5% |
E | 2 | 6.5% |
L | 1 | 3.2% |
Decimal Number
Value | Count | Frequency (%) |
0 | 19 | 30.2% |
5 | 17 | 27.0% |
1 | 11 | 17.5% |
6 | 6 | 9.5% |
3 | 5 | 7.9% |
2 | 3 | 4.8% |
9 | 2 | 3.2% |
Other Punctuation
Value | Count | Frequency (%) |
& | 6 | 75.0% |
/ | 1 | 12.5% |
, | 1 | 12.5% |
Space Separator
Value | Count | Frequency (%) |
(space) | 520 | 100.0% |
Open Punctuation
Value | Count | Frequency (%) |
( | 112 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 112 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 7 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 8699 | 90.4% |
Common | 822 | 8.5% |
Latin | 103 | 1.1% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
어 | 289 | 3.3% |
미 | 229 | 2.6% |
장 | 201 | 2.3% |
고 | 198 | 2.3% |
치 | 193 | 2.2% |
스 | 190 | 2.2% |
기 | 177 | 2.0% |
생 | 146 | 1.7% |
등 | 144 | 1.7% |
가 | 140 | 1.6% |
Other values (409) | 6792 | 78.1% |
Latin
Value | Count | Frequency (%) |
j | 8 | 7.8% |
u | 7 | 6.8% |
i | 6 | 5.8% |
r | 6 | 5.8% |
S | 6 | 5.8% |
o | 5 | 4.9% |
a | 5 | 4.9% |
e | 5 | 4.9% |
t | 4 | 3.9% |
s | 4 | 3.9% |
Other values (19) | 47 | 45.6% |
Common
Value | Count | Frequency (%) |
(space) | 520 | 63.3% |
( | 112 | 13.6% |
) | 112 | 13.6% |
0 | 19 | 2.3% |
5 | 17 | 2.1% |
1 | 11 | 1.3% |
- | 7 | 0.9% |
6 | 6 | 0.7% |
& | 6 | 0.7% |
3 | 5 | 0.6% |
Other values (4) | 7 | 0.9% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 8699 | 90.4% |
ASCII | 925 | 9.6% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 520 | 56.2% |
( | 112 | 12.1% |
) | 112 | 12.1% |
0 | 19 | 2.1% |
5 | 17 | 1.8% |
1 | 11 | 1.2% |
j | 8 | 0.9% |
- | 7 | 0.8% |
u | 7 | 0.8% |
i | 6 | 0.6% |
Other values (33) | 106 | 11.5% |
Hangul
Value | Count | Frequency (%) |
어 | 289 | 3.3% |
미 | 229 | 2.6% |
장 | 201 | 2.3% |
고 | 198 | 2.3% |
치 | 193 | 2.2% |
스 | 190 | 2.2% |
기 | 177 | 2.0% |
생 | 146 | 1.7% |
등 | 144 | 1.7% |
가 | 140 | 1.6% |
Other values (409) | 6792 | 78.1% |
kind
Categorical
Distinct | 4 |
---|---|
Distinct (%) | 0.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 13.4 KiB |
Length
Max length | 4 |
---|---|
Median length | 3 |
Mean length | 3.4522406 |
Min length | 3 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 가공식품 |
---|---|
2nd row | 가공식품 |
3rd row | 가공식품 |
4th row | 수산물 |
5th row | 수산물 |
Common Values
Value | Count | Frequency (%) |
수산물 (seafood) | 795 | 46.9% |
가공식품 (processed food) | 767 | 45.2% |
농산물 (agricultural products) | 118 | 7.0% |
축산물 (livestock products) | 16 | 0.9% |
test_result
Categorical
CONSTANT
 
Distinct | 1 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 13.4 KiB |
Length
Max length | 2 |
---|---|
Median length | 2 |
Mean length | 2 |
Min length | 2 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 적합 |
---|---|
2nd row | 적합 |
3rd row | 적합 |
4th row | 적합 |
5th row | 적합 |
Common Values
Value | Count | Frequency (%) |
적합 (compliant) | 1696 | 100.0% |
detec_result
Categorical
HIGH CORRELATION, IMBALANCE
Distinct | 3 |
---|---|
Distinct (%) | 0.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 13.4 KiB |
Length
Max length | 19 |
---|---|
Median length | 3 |
Mean length | 3.0141509 |
Min length | 3 |
Unique
Unique | 1 |
---|---|
Unique (%) | 0.1% |
Sample
1st row | 불검출 |
---|---|
2nd row | 불검출 |
3rd row | 불검출 |
4th row | 불검출 |
5th row | 불검출 |
Common Values
Value | Count | Frequency (%) |
불검출 (not detected) | 1693 | 99.8% |
2 Bq/kg | 2 | 0.1% |
137Cs, 9.8 Bq/kg 검출 | 1 | 0.1% |
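Because detec_result mixes a text label with measured activities, it is profiled as a categorical variable. One way to recover a numeric column is sketched below. It is an illustration only: the helper `parse_activity` is hypothetical, and mapping 불검출 (not detected) to `None` rather than to zero is an assumption, since a non-detection only means the activity is below the detection limit.

```python
import re

# Matches a number directly followed by the unit, for example "9.8 Bq/kg".
ACTIVITY_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*Bq/kg")


def parse_activity(value):
    """Return the activity in Bq/kg, or None when no measured value is present."""
    match = ACTIVITY_PATTERN.search(value)
    return float(match.group(1)) if match else None


# The three distinct detec_result values that appear in the report.
values = ["불검출", "2 Bq/kg", "137Cs, 9.8 Bq/kg 검출"]
parsed = [parse_activity(v) for v in values]
print(parsed)
```

The pattern requires the unit to follow the number, so the nuclide label 137Cs is not mistaken for a measurement.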
Words
Value | Count | Frequency (%) |
불검출 (not detected) | 1693 | 99.5% |
bq/kg | 3 | 0.2% |
2 | 2 | 0.1% |
137cs | 1 | 0.1% |
9.8 | 1 | 0.1% |
검출 (detected) | 1 | 0.1% |
origin
Categorical
HIGH CORRELATION, IMBALANCE
Distinct | 40 |
---|---|
Distinct (%) | 2.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 13.4 KiB |
Length
Max length | 13 |
---|---|
Median length | 3 |
Mean length | 3.0689858 |
Min length | 2 |
Unique
Unique | 13 |
---|---|
Unique (%) | 0.8% |
Sample
1st row | 일본산 |
---|---|
2nd row | 일본산 |
3rd row | 일본산 |
4th row | 국내산 |
5th row | 국내산 |
Common Values
Value | Count | Frequency (%) |
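국내산 (domestic) | 992 | 58.5% |
일본산 (Japan) | 471 | 27.8% |
국산 (domestic) | 46 | 2.7% |
러시아산 (Russia) | 41 | 2.4% |
미국산 (USA) | 22 | 1.3% |
노르웨이산 (Norway) | 15 | 0.9% |
원양산 (distant-water catch) | 12 | 0.7% |
러시아 (Russia) | 12 | 0.7% |
포르투칼산 (Portugal) | 12 | 0.7% |
중국산 (China) | 11 | 0.6% |
Other values (30) | 62 | 3.7% |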
Words
Value | Count | Frequency (%) |
국내산 (domestic) | 992 | 58.3% |
일본산 (Japan) | 471 | 27.7% |
국산 (domestic) | 46 | 2.7% |
러시아산 (Russia) | 41 | 2.4% |
미국산 (USA) | 22 | 1.3% |
노르웨이산 (Norway) | 15 | 0.9% |
원양산 (distant-water catch) | 12 | 0.7% |
러시아 (Russia) | 12 | 0.7% |
포르투칼산 (Portugal) | 12 | 0.7% |
중국산 (China) | 11 | 0.6% |
Other values (32) | 69 | 4.1% |
data_day
Date
Distinct | 18 |
---|---|
Distinct (%) | 1.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 13.4 KiB |
Minimum | 2019-03-30 00:00:00 |
---|---|
Maximum | 2020-08-10 00:00:00 |
apr_at
Boolean
CONSTANT
 
Distinct | 1 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.8 KiB |
Value | Count | Frequency (%) |
False | 1696 | 100.0% |
last_load_dttm
Date
Distinct | 2 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 13.4 KiB |
Minimum | 2020-12-21 12:00:43 |
---|---|
Maximum | 2020-12-21 12:00:44 |
Correlations
Variable | skey | test_year | test_month | kind | detec_result | origin | data_day | last_load_dttm |
---|---|---|---|---|---|---|---|---|
skey | 1.000 | 0.927 | 0.884 | 0.230 | 0.000 | 0.285 | 0.896 | 0.964 |
test_year | 0.927 | 1.000 | 0.736 | 0.337 | 0.016 | 0.338 | 1.000 | 0.167 |
test_month | 0.884 | 0.736 | 1.000 | 0.292 | 0.096 | 0.442 | 1.000 | 0.521 |
kind | 0.230 | 0.337 | 0.292 | 1.000 | 0.057 | 0.639 | 0.376 | 0.171 |
detec_result | 0.000 | 0.016 | 0.096 | 0.057 | 1.000 | 0.889 | 0.205 | 0.000 |
origin | 0.285 | 0.338 | 0.442 | 0.639 | 0.889 | 1.000 | 0.525 | 0.000 |
data_day | 0.896 | 1.000 | 1.000 | 0.376 | 0.205 | 0.525 | 1.000 | 0.657 |
last_load_dttm | 0.964 | 0.167 | 0.521 | 0.171 | 0.000 | 0.000 | 0.657 | 1.000 |
Variable | test_year | origin | detec_result | kind |
---|---|---|---|---|
test_year | 1.000 | 0.266 | 0.027 | 0.225 |
origin | 0.266 | 1.000 | 0.706 | 0.352 |
detec_result | 0.027 | 0.706 | 1.000 | 0.054 |
kind | 0.225 | 0.352 | 0.054 | 1.000 |
Variable | skey | test_month | test_year | kind | detec_result | origin |
---|---|---|---|---|---|---|
skey | 1.000 | -0.050 | 0.771 | 0.139 | 0.000 | 0.094 |
test_month | -0.050 | 1.000 | 0.575 | 0.178 | 0.057 | 0.155 |
test_year | 0.771 | 0.575 | 1.000 | 0.225 | 0.027 | 0.266 |
kind | 0.139 | 0.178 | 0.225 | 1.000 | 0.054 | 0.352 |
detec_result | 0.000 | 0.057 | 0.027 | 0.054 | 1.000 | 0.706 |
origin | 0.094 | 0.155 | 0.266 | 0.352 | 0.706 | 1.000 |
First rows
Index | skey | test_year | test_month | spec_name | kind | test_result | detec_result | origin | data_day | apr_at | last_load_dttm |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2978 | 2019 | 12 | 멜론향소다음료 | 가공식품 | 적합 | 불검출 | 일본산 | 2019-12-31 | N | 2020-12-21 12:00:43 |
1 | 2979 | 2019 | 12 | 라무바틀탄산음료 | 가공식품 | 적합 | 불검출 | 일본산 | 2019-12-31 | N | 2020-12-21 12:00:43 |
2 | 2980 | 2019 | 12 | 쿠리이리도라야키 | 가공식품 | 적합 | 불검출 | 일본산 | 2019-12-31 | N | 2020-12-21 12:00:43 |
3 | 2981 | 2019 | 12 | 기장돌미역 | 수산물 | 적합 | 불검출 | 국내산 | 2019-12-31 | N | 2020-12-21 12:00:43 |
4 | 2982 | 2019 | 12 | 기장다시마 | 수산물 | 적합 | 불검출 | 국내산 | 2019-12-31 | N | 2020-12-21 12:00:43 |
5 | 2983 | 2019 | 12 | 기장 재래미역 | 수산물 | 적합 | 불검출 | 국내산 | 2019-12-31 | N | 2020-12-21 12:00:43 |
6 | 2984 | 2019 | 12 | 기꼬만혼쯔유 | 가공식품 | 적합 | 불검출 | 일본산 | 2019-12-31 | N | 2020-12-21 12:00:43 |
7 | 2985 | 2019 | 12 | 계란또리에뿌리는 간장소스 | 가공식품 | 적합 | 불검출 | 일본산 | 2019-12-31 | N | 2020-12-21 12:00:43 |
8 | 2986 | 2019 | 12 | 끄유노모토 | 가공식품 | 적합 | 불검출 | 일본산 | 2019-12-31 | N | 2020-12-21 12:00:43 |
9 | 2987 | 2019 | 12 | 고등어 | 수산물 | 적합 | 불검출 | 국내산 | 2019-12-31 | N | 2020-12-21 12:00:43 |
Last rows
Index | skey | test_year | test_month | spec_name | kind | test_result | detec_result | origin | data_day | apr_at | last_load_dttm |
---|---|---|---|---|---|---|---|---|---|---|---|
1686 | 4664 | 2020 | 7 | 한치 | 수산물 | 적합 | 불검출 | 국내산 | 2020-08-10 | N | 2020-12-21 12:00:44 |
1687 | 4665 | 2020 | 7 | 사양벌꿀 | 가공식품 | 적합 | 불검출 | 국내산 | 2020-08-10 | N | 2020-12-21 12:00:44 |
1688 | 4666 | 2020 | 7 | 복음자리 포도잼 | 가공식품 | 적합 | 불검출 | 국내산 | 2020-08-10 | N | 2020-12-21 12:00:44 |
1689 | 4667 | 2020 | 7 | 오가닉스토리유기농딸기잼 | 가공식품 | 적합 | 불검출 | 국내산 | 2020-08-10 | N | 2020-12-21 12:00:44 |
1690 | 4668 | 2020 | 7 | 제주갈치 | 수산물 | 적합 | 불검출 | 국내산 | 2020-08-10 | N | 2020-12-21 12:00:44 |
1691 | 4669 | 2020 | 7 | 참가자미 | 수산물 | 적합 | 불검출 | 국내산 | 2020-08-10 | N | 2020-12-21 12:00:44 |
1692 | 4670 | 2020 | 7 | 사시미소유 | 가공식품 | 적합 | 불검출 | 일본산 | 2020-08-10 | N | 2020-12-21 12:00:44 |
1693 | 4671 | 2020 | 7 | 컵미소 아와세 | 가공식품 | 적합 | 불검출 | 일본산 | 2020-08-10 | N | 2020-12-21 12:00:44 |
1694 | 4672 | 2020 | 7 | 사시미 소유 | 가공식품 | 적합 | 불검출 | 일본산 | 2020-08-10 | N | 2020-12-21 12:00:44 |
1695 | 4673 | 2020 | 7 | 영상가이석태 | 수산물 | 적합 | 불검출 | 국내산 | 2020-08-10 | N | 2020-12-21 12:00:44 |