Dataset statistics
Number of variables | 11 |
---|---|
Number of observations | 3691 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 328.1 KiB |
Average record size in memory | 91.0 B |
Variable types
Numeric | 2 |
---|---|
Categorical | 5 |
Text | 1 |
DateTime | 2 |
Boolean | 1 |
Dataset
Description | 2021-03-01 |
---|---|
Author | 부산시공공데이터포털 |
URL | https://bigdata.busan.go.kr/data/bigDataDetailView.do?menuCode=M00000000007&hdfs_file_sn=20230901054901241000 |
test_result has constant value "" | Constant |
apr_at has constant value "" | Constant |
last_load_dttm has constant value "" | Constant |
skey is highly overall correlated with test_year | High correlation |
test_year is highly overall correlated with skey | High correlation |
detec_result is highly overall correlated with origin | High correlation |
origin is highly overall correlated with detec_result | High correlation |
detec_result is highly imbalanced (98.9%) | Imbalance |
origin is highly imbalanced (64.9%) | Imbalance |
skey has unique values | Unique |
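The 98.9% imbalance reported for detec_result can be reproduced from the four category counts listed for that variable (3684, 4, 2, 1). The sketch below assumes the score is one minus the Shannon entropy of the class distribution divided by its maximum possible value (the log of the number of classes). That definition is an assumption about how the profiler derives the figure, and the function name `imbalance_score` is illustrative only.

```python
import math


def imbalance_score(counts):
    """Return 1 - H/H_max for a list of class counts (0 = balanced, 1 = one class dominates)."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # assumption: a single class is reported as 0 rather than undefined
    # Shannon entropy in bits, skipping empty classes
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Category counts of detec_result as listed in this report
detec_counts = [3684, 4, 2, 1]
score = imbalance_score(detec_counts)
print(f"{score:.1%}")
```

Under this definition a perfectly balanced variable scores 0, so a value close to 1 means almost all rows fall into a single category (here 불검출).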
Reproduction
Analysis started | 2023-12-10 08:45:25.303468 |
---|---|
Analysis finished | 2023-12-10 08:45:28.495531 |
Duration | 3.19 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
skey
Real number (ℝ)
HIGH CORRELATION, UNIQUE
Distinct | 3691 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 4823 |
Minimum | 2978 |
---|---|
Maximum | 6668 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 32.6 KiB |
Quantile statistics
Minimum | 2978 |
---|---|
5-th percentile | 3162.5 |
Q1 | 3900.5 |
median | 4823 |
Q3 | 5745.5 |
95-th percentile | 6483.5 |
Maximum | 6668 |
Range | 3690 |
Interquartile range (IQR) | 1845 |
Descriptive statistics
Standard deviation | 1065.6442 |
---|---|
Coefficient of variation (CV) | 0.2209505 |
Kurtosis | -1.2 |
Mean | 4823 |
Median Absolute Deviation (MAD) | 923 |
Skewness | 0 |
Sum | 17801693 |
Variance | 1135597.7 |
Monotonicity | Not monotonic |
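Because skey takes every integer from 2978 to 6668 exactly once (3691 distinct values, range 3690), the summary statistics above follow directly from that sequence. A minimal check using only Python's standard library, assuming the reported variance and standard deviation are the sample (n - 1) versions:

```python
import statistics

# skey is assumed to be the consecutive integers from the reported minimum to maximum
skey = list(range(2978, 6668 + 1))

n = len(skey)
mean = statistics.mean(skey)
median = statistics.median(skey)
variance = statistics.variance(skey)  # sample variance (divides by n - 1)
stdev = statistics.stdev(skey)        # sample standard deviation
cv = stdev / mean                     # coefficient of variation
# Median absolute deviation: median of |x - median|
mad = statistics.median(abs(x - median) for x in skey)

print(n, sum(skey), mean, median, mad)
print(round(variance, 1), round(stdev, 4), round(cv, 7))
```

The zero skewness and the kurtosis of -1.2 are also what a discrete uniform distribution over a long run of integers would give, which fits skey being a sequential key rather than a measurement.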
Common Values
Value | Count | Frequency (%) |
4674 | 1 | < 0.1% |
4052 | 1 | < 0.1% |
4054 | 1 | < 0.1% |
4055 | 1 | < 0.1% |
4056 | 1 | < 0.1% |
4057 | 1 | < 0.1% |
4058 | 1 | < 0.1% |
4059 | 1 | < 0.1% |
4060 | 1 | < 0.1% |
4061 | 1 | < 0.1% |
Other values (3681) | 3681 | 99.7% |
Minimum 10 values
Value | Count | Frequency (%) |
2978 | 1 | < 0.1% |
2979 | 1 | < 0.1% |
2980 | 1 | < 0.1% |
2981 | 1 | < 0.1% |
2982 | 1 | < 0.1% |
2983 | 1 | < 0.1% |
2984 | 1 | < 0.1% |
2985 | 1 | < 0.1% |
2986 | 1 | < 0.1% |
2987 | 1 | < 0.1% |
Maximum 10 values
Value | Count | Frequency (%) |
6668 | 1 | < 0.1% |
6667 | 1 | < 0.1% |
6666 | 1 | < 0.1% |
6665 | 1 | < 0.1% |
6664 | 1 | < 0.1% |
6663 | 1 | < 0.1% |
6662 | 1 | < 0.1% |
6661 | 1 | < 0.1% |
6660 | 1 | < 0.1% |
6659 | 1 | < 0.1% |
test_year
Categorical
HIGH CORRELATION
Distinct | 2 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 29.0 KiB |
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 4 |
Min length | 4 |
Unique
Unique | 0 ? |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 2019 |
---|---|
2nd row | 2019 |
3rd row | 2019 |
4th row | 2019 |
5th row | 2019 |
Common Values
Value | Count | Frequency (%) |
2019 | 2408 | 65.2% |
2020 | 1283 | 34.8% |
Words
Value | Count | Frequency (%) |
2019 | 2408 | 65.2% |
2020 | 1283 | 34.8% |
test_month
Real number (ℝ)
Distinct | 12 |
---|---|
Distinct (%) | 0.3% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 6.4968843 |
Minimum | 1 |
---|---|
Maximum | 12 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 32.6 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 1 |
Q1 | 4 |
median | 6 |
Q3 | 9 |
95-th percentile | 12 |
Maximum | 12 |
Range | 11 |
Interquartile range (IQR) | 5 |
Descriptive statistics
Standard deviation | 3.3986572 |
---|---|
Coefficient of variation (CV) | 0.5231211 |
Kurtosis | -1.1395352 |
Mean | 6.4968843 |
Median Absolute Deviation (MAD) | 3 |
Skewness | 0.050389878 |
Sum | 23980 |
Variance | 11.550871 |
Monotonicity | Not monotonic |
Common Values
Value | Count | Frequency (%) |
5 | 458 | 12.4% |
3 | 444 | 12.0% |
1 | 372 | 10.1% |
11 | 359 | 9.7% |
4 | 324 | 8.8% |
8 | 317 | 8.6% |
7 | 316 | 8.6% |
12 | 306 | 8.3% |
9 | 277 | 7.5% |
6 | 242 | 6.6% |
Other values (2) | 276 | 7.5% |
Minimum 10 values
Value | Count | Frequency (%) |
1 | 372 | 10.1% |
2 | 48 | 1.3% |
3 | 444 | 12.0% |
4 | 324 | 8.8% |
5 | 458 | 12.4% |
6 | 242 | 6.6% |
7 | 316 | 8.6% |
8 | 317 | 8.6% |
9 | 277 | 7.5% |
10 | 228 | 6.2% |
Maximum 10 values
Value | Count | Frequency (%) |
12 | 306 | 8.3% |
11 | 359 | 9.7% |
10 | 228 | 6.2% |
9 | 277 | 7.5% |
8 | 317 | 8.6% |
7 | 316 | 8.6% |
6 | 242 | 6.6% |
5 | 458 | 12.4% |
4 | 324 | 8.8% |
3 | 444 | 12.0% |
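The mean, sum, variance and standard deviation reported for test_month can be recomputed from the twelve monthly counts in the frequency tables. The sketch below uses only the standard library and assumes the report's variance uses the sample (n - 1) denominator; the variable names are illustrative.

```python
import math

# Monthly counts for test_month, taken from the frequency tables in this report
month_counts = {
    1: 372, 2: 48, 3: 444, 4: 324, 5: 458, 6: 242,
    7: 316, 8: 317, 9: 277, 10: 228, 11: 359, 12: 306,
}

n = sum(month_counts.values())                       # number of observations
total = sum(m * c for m, c in month_counts.items())  # the "Sum" statistic
mean = total / n
# Sample variance (n - 1 denominator) computed from the grouped counts
variance = sum(c * (m - mean) ** 2 for m, c in month_counts.items()) / (n - 1)
stdev = math.sqrt(variance)

print(n, total)
print(round(mean, 7), round(variance, 4), round(stdev, 4))
```

The counts also show that February is heavily under-represented (48 rows), which is worth keeping in mind before treating test_month as evenly sampled.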
spec_name
Text
Distinct | 733 |
---|---|
Distinct (%) | 19.9% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 29.0 KiB |
Value | Count | Frequency (%) |
고등어 | 182 | 3.8% |
삼치 | 91 | 1.9% |
우럭 | 59 | 1.2% |
가자미 | 59 | 1.2% |
오징어 | 45 | 0.9% |
소스 | 45 | 0.9% |
명란 | 39 | 0.8% |
생대구 | 38 | 0.8% |
갈치 | 37 | 0.8% |
기꼬만 | 36 | 0.7% |
Other values (847) | 4222 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 1162 | 5.5% |
어 | 608 | 2.9% |
미 | 498 | 2.3% |
장 | 457 | 2.1% |
스 | 444 | 2.1% |
고 | 422 | 2.0% |
치 | 402 | 1.9% |
기 | 387 | 1.8% |
소 | 325 | 1.5% |
등 | 301 | 1.4% |
Other values (498) | 16308 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 19244 | |
Space Separator | 1162 | 5.5% |
Open Punctuation | 246 | 1.2% |
Close Punctuation | 246 | 1.2% |
Lowercase Letter | 152 | 0.7% |
Decimal Number | 149 | 0.7% |
Uppercase Letter | 75 | 0.4% |
Dash Punctuation | 18 | 0.1% |
Other Punctuation | 18 | 0.1% |
Other Symbol | 4 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
어 | 608 | 3.2% |
미 | 498 | 2.6% |
장 | 457 | 2.4% |
스 | 444 | 2.3% |
고 | 422 | 2.2% |
치 | 402 | 2.1% |
기 | 387 | 2.0% |
소 | 325 | 1.7% |
등 | 301 | 1.6% |
이 | 300 | 1.6% |
Other values (450) | 15100 |
Lowercase Letter
Value | Count | Frequency (%) |
j | 24 | |
u | 14 | 9.2% |
i | 12 | 7.9% |
r | 12 | 7.9% |
a | 10 | 6.6% |
e | 10 | 6.6% |
o | 10 | 6.6% |
s | 8 | 5.3% |
t | 8 | 5.3% |
l | 8 | 5.3% |
Other values (8) | 36 |
Uppercase Letter
Value | Count | Frequency (%) |
S | 12 | |
P | 8 | |
M | 8 | |
G | 8 | |
B | 8 | |
T | 6 | |
N | 6 | |
A | 4 | 5.3% |
C | 4 | 5.3% |
E | 4 | 5.3% |
Other values (3) | 7 |
Decimal Number
Value | Count | Frequency (%) |
0 | 42 | |
5 | 38 | |
1 | 29 | |
6 | 16 | 10.7% |
3 | 10 | 6.7% |
2 | 7 | 4.7% |
9 | 5 | 3.4% |
7 | 2 | 1.3% |
Other Punctuation
Value | Count | Frequency (%) |
& | 13 | |
, | 2 | 11.1% |
/ | 2 | 11.1% |
? | 1 | 5.6% |
Space Separator
Value | Count | Frequency (%) |
(space) | 1162 |
Open Punctuation
Value | Count | Frequency (%) |
( | 246 |
Close Punctuation
Value | Count | Frequency (%) |
) | 246 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 18 |
Other Symbol
Value | Count | Frequency (%) |
� | 4 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 19239 | |
Common | 1843 | 8.6% |
Latin | 227 | 1.1% |
Han | 4 | < 0.1% |
Katakana | 1 | < 0.1% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
어 | 608 | 3.2% |
미 | 498 | 2.6% |
장 | 457 | 2.4% |
스 | 444 | 2.3% |
고 | 422 | 2.2% |
치 | 402 | 2.1% |
기 | 387 | 2.0% |
소 | 325 | 1.7% |
등 | 301 | 1.6% |
이 | 300 | 1.6% |
Other values (445) | 15095 |
Latin
Value | Count | Frequency (%) |
j | 24 | 10.6% |
u | 14 | 6.2% |
i | 12 | 5.3% |
S | 12 | 5.3% |
r | 12 | 5.3% |
a | 10 | 4.4% |
e | 10 | 4.4% |
o | 10 | 4.4% |
P | 8 | 3.5% |
s | 8 | 3.5% |
Other values (21) | 107 |
Common
Value | Count | Frequency (%) |
(space) | 1162 | 63.0% |
( | 246 | 13.3% |
) | 246 | 13.3% |
0 | 42 | 2.3% |
5 | 38 | 2.1% |
1 | 29 | 1.6% |
- | 18 | 1.0% |
6 | 16 | 0.9% |
& | 13 | 0.7% |
3 | 10 | 0.5% |
Other values (7) | 23 | 1.2% |
Han
Value | Count | Frequency (%) |
幣 | 1 | |
祺 | 1 | |
穎 | 1 | |
宙 | 1 |
Katakana
Value | Count | Frequency (%) |
シ | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 19239 | |
ASCII | 2066 | 9.7% |
Specials | 4 | < 0.1% |
CJK | 4 | < 0.1% |
Katakana | 1 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 1162 | 56.2% |
( | 246 | 11.9% |
) | 246 | 11.9% |
0 | 42 | 2.0% |
5 | 38 | 1.8% |
1 | 29 | 1.4% |
j | 24 | 1.2% |
- | 18 | 0.9% |
6 | 16 | 0.8% |
u | 14 | 0.7% |
Other values (37) | 231 | 11.2% |
Hangul
Value | Count | Frequency (%) |
어 | 608 | 3.2% |
미 | 498 | 2.6% |
장 | 457 | 2.4% |
스 | 444 | 2.3% |
고 | 422 | 2.2% |
치 | 402 | 2.1% |
기 | 387 | 2.0% |
소 | 325 | 1.7% |
등 | 301 | 1.6% |
이 | 300 | 1.6% |
Other values (445) | 15095 |
Specials
Value | Count | Frequency (%) |
� | 4 |
CJK
Value | Count | Frequency (%) |
幣 | 1 | |
祺 | 1 | |
穎 | 1 | |
宙 | 1 |
Katakana
Value | Count | Frequency (%) |
シ | 1 |
kind
Categorical
Distinct | 4 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 29.0 KiB |
Length
Max length | 4 |
---|---|
Median length | 3 |
Mean length | 3.4811704 |
Min length | 3 |
Unique
Unique | 0 ? |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 가공식품 |
---|---|
2nd row | 가공식품 |
3rd row | 가공식품 |
4th row | 가공식품 |
5th row | 가공식품 |
Common Values
Value | Count | Frequency (%) |
가공식품 | 1776 | 48.1% |
수산물 | 1631 | 44.2% |
농산물 | 251 | 6.8% |
축산물 | 33 | 0.9% |
Words
Value | Count | Frequency (%) |
가공식품 | 1776 | 48.1% |
수산물 | 1631 | 44.2% |
농산물 | 251 | 6.8% |
축산물 | 33 | 0.9% |
test_result
Categorical
CONSTANT
Distinct | 1 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 29.0 KiB |
Length
Max length | 2 |
---|---|
Median length | 2 |
Mean length | 2 |
Min length | 2 |
Unique
Unique | 0 ? |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 적합 |
---|---|
2nd row | 적합 |
3rd row | 적합 |
4th row | 적합 |
5th row | 적합 |
Common Values
Value | Count | Frequency (%) |
적합 | 3691 | 100.0% |
Words
Value | Count | Frequency (%) |
적합 | 3691 | 100.0% |
detec_result
Categorical
HIGH CORRELATION, IMBALANCE
Distinct | 4 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 29.0 KiB |
Length
Max length | 19 |
---|---|
Median length | 3 |
Mean length | 3.0162558 |
Min length | 3 |
Unique
Unique | 1 ? |
---|---|
Unique (%) | < 0.1% |
Sample
1st row | 불검출 |
---|---|
2nd row | 불검출 |
3rd row | 불검출 |
4th row | 불검출 |
5th row | 불검출 |
Common Values
Value | Count | Frequency (%) |
불검출 | 3684 | 99.8% |
2 Bq/kg | 4 | 0.1% |
137Cs, 9.8 Bq/kg 검출 | 2 | 0.1% |
137Cs 0.9 Bq/kg | 1 | < 0.1% |
Words
Value | Count | Frequency (%) |
불검출 | 3684 | 99.5% |
bq/kg | 7 | 0.2% |
2 | 4 | 0.1% |
137cs | 3 | 0.1% |
9.8 | 2 | 0.1% |
검출 | 2 | 0.1% |
0.9 | 1 | < 0.1% |
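The last table of this section does not list whole values: its rows (bq/kg, 137cs, 9.8 and so on) look like lowercased, whitespace-separated tokens of the detec_result strings. The sketch below reproduces those counts from the four distinct values and their frequencies. The tokenization rule (lowercase, split on whitespace, strip a trailing comma) is an assumption inferred from the table, not a documented behaviour of the profiler.

```python
from collections import Counter

# Distinct detec_result values and their row counts, as listed in this report
value_counts = {
    "불검출": 3684,
    "2 Bq/kg": 4,
    "137Cs, 9.8 Bq/kg 검출": 2,
    "137Cs 0.9 Bq/kg": 1,
}

word_counts = Counter()
for value, count in value_counts.items():
    # Assumed tokenization: lowercase, split on whitespace, drop trailing commas
    for token in value.lower().split():
        word_counts[token.rstrip(",")] += count

for word, count in word_counts.most_common():
    print(word, count)
```

The seven rows with a detection are free text in three different layouts, so extracting the nuclide and the activity in Bq/kg would need explicit parsing before any numeric analysis.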
origin
Categorical
HIGH CORRELATION, IMBALANCE
Distinct | 43 |
---|---|
Distinct (%) | 1.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 29.0 KiB |
Length
Max length | 13 |
---|---|
Median length | 3 |
Mean length | 3.0671905 |
Min length | 2 |
Unique
Unique | 3 ? |
---|---|
Unique (%) | 0.1% |
Sample
1st row | 국내산 |
---|---|
2nd row | 국내산 |
3rd row | 국내산 |
4th row | 국내산 |
5th row | 국내산 |
Common Values
Value | Count | Frequency (%) |
국내산 | 2139 | 58.0% |
일본산 | 1064 | 28.8% |
국산 | 92 | 2.5% |
러시아산 | 86 | 2.3% |
미국산 | 47 | 1.3% |
노르웨이산 | 30 | 0.8% |
중국산 | 29 | 0.8% |
포르투칼산 | 25 | 0.7% |
원양산 | 24 | 0.7% |
러시아 | 24 | 0.7% |
Other values (33) | 131 | 3.5% |
Words
Value | Count | Frequency (%) |
국내산 | 2139 | 57.7% |
일본산 | 1064 | 28.7% |
국산 | 92 | 2.5% |
러시아산 | 86 | 2.3% |
미국산 | 47 | 1.3% |
노르웨이산 | 30 | 0.8% |
중국산 | 29 | 0.8% |
포르투칼산 | 25 | 0.7% |
원양산 | 24 | 0.6% |
러시아 | 24 | 0.6% |
Other values (35) | 145 | 3.9% |
data_day
Date
Distinct | 23 |
---|---|
Distinct (%) | 0.6% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 29.0 KiB |
Minimum | 2019-03-30 00:00:00 |
---|---|
Maximum | 2020-12-23 00:00:00 |
apr_at
Boolean
CONSTANT
Distinct | 1 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 3.7 KiB |
Value | Count | Frequency (%) |
False | 3691 |
last_load_dttm
Date
CONSTANT
Distinct | 1 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 29.0 KiB |
Minimum | 2021-03-01 05:49:03 |
---|---|
Maximum | 2021-03-01 05:49:03 |
variable | skey | test_year | test_month | kind | detec_result | origin | data_day |
---|---|---|---|---|---|---|---|
skey | 1.000 | 0.929 | 0.806 | 0.225 | 0.016 | 0.231 | 0.882 |
test_year | 0.929 | 1.000 | 0.518 | 0.344 | 0.054 | 0.311 | 1.000 |
test_month | 0.806 | 0.518 | 1.000 | 0.302 | 0.111 | 0.467 | 1.000 |
kind | 0.225 | 0.344 | 0.302 | 1.000 | 0.117 | 0.613 | 0.441 |
detec_result | 0.016 | 0.054 | 0.111 | 0.117 | 1.000 | 0.829 | 0.201 |
origin | 0.231 | 0.311 | 0.467 | 0.613 | 0.829 | 1.000 | 0.595 |
data_day | 0.882 | 1.000 | 1.000 | 0.441 | 0.201 | 0.595 | 1.000 |
variable | origin | detec_result | test_year | kind |
---|---|---|---|---|
origin | 1.000 | 0.580 | 0.259 | 0.355 |
detec_result | 0.580 | 1.000 | 0.036 | 0.047 |
test_year | 0.259 | 0.036 | 1.000 | 0.230 |
kind | 0.355 | 0.047 | 0.230 | 1.000 |
variable | skey | test_month | test_year | kind | detec_result | origin |
---|---|---|---|---|---|---|
skey | 1.000 | 0.168 | 0.775 | 0.136 | 0.010 | 0.081 |
test_month | 0.168 | 1.000 | 0.398 | 0.184 | 0.066 | 0.179 |
test_year | 0.775 | 0.398 | 1.000 | 0.230 | 0.036 | 0.259 |
kind | 0.136 | 0.184 | 0.230 | 1.000 | 0.047 | 0.355 |
detec_result | 0.010 | 0.066 | 0.036 | 0.047 | 1.000 | 0.580 |
origin | 0.081 | 0.179 | 0.259 | 0.355 | 0.580 | 1.000 |
First rows
index | skey | test_year | test_month | spec_name | kind | test_result | detec_result | origin | data_day | apr_at | last_load_dttm |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 4674 | 2019 | 1 | 아리담배추김치 | 가공식품 | 적합 | 불검출 | 국내산 | 2019-03-30 | N | 2021-03-01 05:49:03 |
1 | 4675 | 2019 | 1 | 땅콩맛전병 | 가공식품 | 적합 | 불검출 | 국내산 | 2019-03-30 | N | 2021-03-01 05:49:03 |
2 | 4676 | 2019 | 1 | 잣맛전병 | 가공식품 | 적합 | 불검출 | 국내산 | 2019-03-30 | N | 2021-03-01 05:49:03 |
3 | 4677 | 2019 | 1 | 김파래맛전병 | 가공식품 | 적합 | 불검출 | 국내산 | 2019-03-30 | N | 2021-03-01 05:49:03 |
4 | 4678 | 2019 | 1 | 깨땅콩맛전병 | 가공식품 | 적합 | 불검출 | 국내산 | 2019-03-30 | N | 2021-03-01 05:49:03 |
5 | 4679 | 2019 | 1 | 유씨씨블랙넌슈가PET | 가공식품 | 적합 | 불검출 | 일본산 | 2019-03-30 | N | 2021-03-01 05:49:03 |
6 | 4680 | 2019 | 1 | 송로 | 가공식품 | 적합 | 불검출 | 일본산 | 2019-03-30 | N | 2021-03-01 05:49:03 |
7 | 4681 | 2019 | 1 | 치즈크래커 | 가공식품 | 적합 | 불검출 | 일본산 | 2019-03-30 | N | 2021-03-01 05:49:03 |
8 | 4682 | 2019 | 1 | 제주갈치 | 수산물 | 적합 | 불검출 | 국내산 | 2019-03-30 | N | 2021-03-01 05:49:03 |
9 | 4683 | 2019 | 1 | 아구 | 수산물 | 적합 | 불검출 | 국내산 | 2019-03-30 | N | 2021-03-01 05:49:03 |
Last rows
index | skey | test_year | test_month | spec_name | kind | test_result | detec_result | origin | data_day | apr_at | last_load_dttm |
---|---|---|---|---|---|---|---|---|---|---|---|
3681 | 6659 | 2020 | 12 | 전갱이 | 수산물 | 적합 | 불검출 | 국내산 | 2020-12-23 | N | 2021-03-01 05:49:03 |
3682 | 6660 | 2020 | 12 | 백조기 | 수산물 | 적합 | 불검출 | 국내산 | 2020-12-23 | N | 2021-03-01 05:49:03 |
3683 | 6661 | 2020 | 12 | 동태 | 수산물 | 적합 | 불검출 | 러시아산 | 2020-12-23 | N | 2021-03-01 05:49:03 |
3684 | 6662 | 2020 | 12 | 소바가게 소바쯔유 | 가공식품 | 적합 | 불검출 | 일본산 | 2020-12-23 | N | 2021-03-01 05:49:03 |
3685 | 6663 | 2020 | 12 | 기꼬만 환대두생간장 | 가공식품 | 적합 | 불검출 | 일본산 | 2020-12-23 | N | 2021-03-01 05:49:03 |
3686 | 6664 | 2020 | 12 | 다시마장유 | 가공식품 | 적합 | 불검출 | 일본산 | 2020-12-23 | N | 2021-03-01 05:49:03 |
3687 | 6665 | 2020 | 12 | 기꼬만혼쯔유 | 가공식품 | 적합 | 불검출 | 일본산 | 2020-12-23 | N | 2021-03-01 05:49:03 |
3688 | 6666 | 2020 | 12 | 컵미소-아와세 | 가공식품 | 적합 | 불검출 | 일본산 | 2020-12-23 | N | 2021-03-01 05:49:03 |
3689 | 6667 | 2020 | 12 | 한우등심 | 축산물 | 적합 | 불검출 | 국내산 | 2020-12-23 | N | 2021-03-01 05:49:03 |
3690 | 6668 | 2020 | 12 | 에스앤비 골든카레 매운맛 | 가공식품 | 적합 | 불검출 | 일본산 | 2020-12-23 | N | 2021-03-01 05:49:03 |