Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 11
Duplicate rows (%): 0.1%
Total size in memory: 498.0 KiB
Average record size in memory: 51.0 B
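
The two derived figures above (duplicate share and average record size) can be checked from the raw counts. The following is a minimal sketch, assuming the report uses 1 KiB = 1024 bytes; the variable names are illustrative and do not come from the report.

```python
# Figures copied from the "Dataset statistics" table.
n_rows = 10_000
n_duplicate_rows = 11
total_size_kib = 498.0

# Share of duplicate rows, as a percentage.
duplicate_pct = 100 * n_duplicate_rows / n_rows

# Average record size in bytes (ASSUMPTION: 1 KiB = 1024 bytes).
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"Duplicate rows: {duplicate_pct:.2f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Rounded to one decimal place, both values agree with the table (0.1% and 51.0 B).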

Variable types

Numeric: 2
Categorical: 2
Text: 1

Dataset

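Description: Information identifying the unit codes in the 2013 agricultural, livestock, and fishery products standard codes that have the same meaning as the unit codes in the standard codes established and revised in 2015.
Author: 농림수산식품교육문화정보원 (Korea Agency of Education, Promotion and Information Service in Food, Agriculture, Forestry and Fisheries, EPIS)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20220209000000001764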

Alerts

B2EN_TESTUPDT_DE has constant value "" (Constant)
Dataset has 11 (0.1%) duplicate rows (Duplicates)
STD_UNIT_NEW_CODE is highly overall correlated with STD_UNIT_CODE and 1 other field (High correlation)
STD_UNIT_CODE is highly overall correlated with STD_UNIT_NEW_CODE and 1 other field (High correlation)
STD_UNIT_NEW_NM is highly overall correlated with STD_UNIT_NEW_CODE and 1 other field (High correlation)

Reproduction

Analysis started: 2023-12-11 03:53:19.322267
Analysis finished: 2023-12-11 03:53:20.292935
Duration: 0.97 seconds
Software version: ydata-profiling v4.5.1

Variables

STD_UNIT_NEW_CODE
Real number (ℝ)

HIGH CORRELATION 

Distinct: 16
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 47.4269
Minimum: 11
Maximum: 85
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 11
5th percentile: 11
Q1: 13
Median: 71
Q3: 72
95th percentile: 73
Maximum: 85
Range: 74
Interquartile range (IQR): 59

Descriptive statistics

Standard deviation: 29.161545
Coefficient of variation (CV): 0.61487352
Kurtosis: -1.8125123
Mean: 47.4269
Median Absolute Deviation (MAD): 2
Skewness: -0.27266242
Sum: 474269
Variance: 850.3957
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=16)

Common values

Value             Count  Frequency (%)
72                1712   17.1%
71                1684   16.8%
73                1658   16.6%
13                1236   12.4%
11                1235   12.3%
12                1213   12.1%
83                197    2.0%
85                195    1.9%
32                193    1.9%
34                192    1.9%
Other values (6)  485    4.9%

Minimum 10 values

Value  Count  Frequency (%)
11     1235   12.3%
12     1213   12.1%
13     1236   12.4%
31     191    1.9%
32     193    1.9%
33     188    1.9%
34     192    1.9%
41     3      < 0.1%
42     4      < 0.1%
43     4      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
85     195    1.9%
84     95     0.9%
83     197    2.0%
73     1658   16.6%
72     1712   17.1%
71     1684   16.8%
43     4      < 0.1%
42     4      < 0.1%
41     3      < 0.1%
34     192    1.9%
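
The smallest and largest value listings for STD_UNIT_NEW_CODE together cover all 16 distinct values, so the summary statistics can be recomputed from the counts alone. The sketch below does that with the Python standard library. It assumes the reported standard deviation and variance are sample statistics (n - 1 in the denominator), which is an assumption about the tool rather than something the report states.

```python
import statistics

# Value counts for STD_UNIT_NEW_CODE, read off the frequency listings above.
counts = {
    11: 1235, 12: 1213, 13: 1236,
    31: 191, 32: 193, 33: 188, 34: 192,
    41: 3, 42: 4, 43: 4,
    71: 1684, 72: 1712, 73: 1658,
    83: 197, 84: 95, 85: 195,
}

# Expand the counts back into one value per observation, in sorted order.
values = [code for code, n in sorted(counts.items()) for _ in range(n)]

n = len(values)
total = sum(values)
mean = total / n
median = statistics.median(values)
# ASSUMPTION: sample variance / standard deviation (n - 1 denominator).
variance = statistics.variance(values)
std = statistics.stdev(values)
cv = std / mean  # coefficient of variation
value_range = max(values) - min(values)

print(n, total, mean, median, value_range)
print(round(variance, 4), round(std, 6), round(cv, 8))
```

Under that assumption the recomputed sum, mean, median, range, variance, standard deviation, and coefficient of variation line up with the figures reported for this variable.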

STD_UNIT_NEW_NM
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 3
Median length: 2
Mean length: 1.9294
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: kg
2nd row: (not recovered)
3rd row: ton
4th row: kg
5th row: g

Common Values

Value            Count  Frequency (%)
kg               3122   31.2%
g                3113   31.1%
ton              3086   30.9%
(not recovered)  197    2.0%
(not recovered)  195    1.9%
l                192    1.9%
(not recovered)  95     0.9%

Length

Histogram of lengths of the category


STD_UNIT_CODE
Real number (ℝ)

HIGH CORRELATION 

Distinct: 18
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 47.3772
Minimum: 11
Maximum: 85
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 11
5th percentile: 11
Q1: 13
Median: 71
Q3: 72
95th percentile: 73
Maximum: 85
Range: 74
Interquartile range (IQR): 59

Descriptive statistics

Standard deviation: 29.100979
Coefficient of variation (CV): 0.61424017
Kurtosis: -1.8169112
Mean: 47.3772
Median Absolute Deviation (MAD): 2
Skewness: -0.27686512
Sum: 473772
Variance: 846.86701
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=18)

Common values

Value             Count  Frequency (%)
72                1712   17.1%
71                1684   16.8%
73                1658   16.6%
13                1236   12.4%
11                1235   12.3%
12                1213   12.1%
32                193    1.9%
34                192    1.9%
31                191    1.9%
33                188    1.9%
Other values (8)  498    5.0%

Minimum 10 values

Value  Count  Frequency (%)
11     1235   12.3%
12     1213   12.1%
13     1236   12.4%
31     191    1.9%
32     193    1.9%
33     188    1.9%
34     192    1.9%
41     3      < 0.1%
42     4      < 0.1%
43     4      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
85     96     1.0%
84     95     0.9%
83     96     1.0%
82     101    1.0%
81     99     1.0%
73     1658   16.6%
72     1712   17.1%
71     1684   16.8%
43     4      < 0.1%
42     4      < 0.1%
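
Comparing the value counts of STD_UNIT_CODE with those of STD_UNIT_NEW_CODE suggests how the two code sets relate. Codes 81 and 82 appear only in STD_UNIT_CODE, while the counts for 85 and 83 in STD_UNIT_NEW_CODE equal 96 + 99 and 96 + 101. The sketch below tests the hypothesis that 81 maps to 85, 82 maps to 83, and every other code is unchanged. This mapping is inferred from the counts and sample rows in this report; it is not documented by the data provider, and the `remap` name is illustrative.

```python
from collections import Counter

# Value counts for STD_UNIT_CODE, read off the frequency listings above.
old_counts = {
    11: 1235, 12: 1213, 13: 1236,
    31: 191, 32: 193, 33: 188, 34: 192,
    41: 3, 42: 4, 43: 4,
    71: 1684, 72: 1712, 73: 1658,
    81: 99, 82: 101, 83: 96, 84: 95, 85: 96,
}

# Value counts reported for STD_UNIT_NEW_CODE.
new_counts = {
    11: 1235, 12: 1213, 13: 1236,
    31: 191, 32: 193, 33: 188, 34: 192,
    41: 3, 42: 4, 43: 4,
    71: 1684, 72: 1712, 73: 1658,
    83: 197, 84: 95, 85: 195,
}

# ASSUMPTION (inferred, not documented): 81 -> 85, 82 -> 83, others unchanged.
remap = {81: 85, 82: 83}

# Apply the hypothesised mapping to the old counts and aggregate.
remapped = Counter()
for code, n in old_counts.items():
    remapped[remap.get(code, code)] += n

print(dict(remapped) == new_counts)
```

Matching counts are consistent with the hypothesis but do not prove it; a row-level comparison of the two columns would be needed to confirm the mapping.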

STD_UNIT_NM
Text

Distinct: 9272
Distinct (%): 92.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 26
Median length: 23
Mean length: 9.661
Min length: 1

Characters and Unicode

Total characters: 96610
Distinct characters: 95
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
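
As an illustration of the character properties mentioned above, the sketch below tallies the Unicode general category of each character in the five sample values this report lists for the text variable. It uses only `unicodedata.category` from the Python standard library. It is not necessarily how the profiling tool computes its tables, and it does not cover scripts or blocks, which the standard library does not expose.

```python
import unicodedata
from collections import Counter

# The five sample values of the text variable shown in this report.
samples = [
    "kg 접 100내",
    "개 12개 3급",
    "ton PP대 60내",
    "kg 두름 150내",
    "g 코 80내",
]

# Tally the Unicode general category of every character.
# Lo = Other Letter, Ll = Lowercase Letter, Lu = Uppercase Letter,
# Nd = Decimal Number, Zs = Space Separator.
category_counts = Counter(
    unicodedata.category(ch) for text in samples for ch in text
)

for category, n in category_counts.most_common():
    print(category, n)
```

Hangul syllables fall under Other Letter (`Lo`), which is why that category is prominent in the report's category table; the multiplication sign `×` that appears in some values is a Math Symbol (`Sm`).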

Unique

Unique: 8571
Unique (%): 85.7%

Sample

1st row: kg 접 100내
2nd row: 개 12개 3급
3rd row: ton PP대 60내
4th row: kg 두름 150내
5th row: g 코 80내

Words

Value               Count  Frequency (%)
g                   3167   10.8%
kg                  3122   10.7%
ton                 3086   10.6%
기타                733    2.5%
(not recovered)     523    1.8%
상자                500    1.7%
그물망              482    1.6%
pp대                477    1.6%
(not recovered)     432    1.5%
(not recovered)     431    1.5%
Other values (181)  16287  55.7%

Most occurring characters

Value              Count  Frequency (%)
(space)            19240  19.9%
g                  6235   6.5%
0                  5290   5.5%
1                  4163   4.3%
(not recovered)    3673   3.8%
k                  3118   3.2%
n                  3086   3.2%
o                  3086   3.2%
t                  3082   3.2%
2                  2379   2.5%
Other values (85)  43258  44.8%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       24627  25.5%
Lowercase Letter   20872  21.6%
Decimal Number     20498  21.2%
Space Separator    19240  19.9%
Uppercase Letter   6027   6.2%
Other Punctuation  1889   2.0%
Open Punctuation   1101   1.1%
Close Punctuation  1101   1.1%
Math Symbol        710    0.7%
Dash Punctuation   545    0.6%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
(not recovered)    3673   14.9%
(not recovered)    2124   8.6%
(not recovered)    1711   6.9%
(not recovered)    1224   5.0%
(not recovered)    955    3.9%
(not recovered)    917    3.7%
(not recovered)    912    3.7%
(not recovered)    733    3.0%
(not recovered)    724    2.9%
(not recovered)    599    2.4%
Other values (44)  11055  44.9%

Uppercase Letter
Value             Count  Frequency (%)
P                 2108   35.0%
T                 522    8.7%
B                 476    7.9%
M                 457    7.6%
S                 450    7.5%
A                 263    4.4%
N                 263    4.4%
C                 259    4.3%
D                 255    4.2%
E                 231    3.8%
Other values (5)  743    12.3%

Decimal Number
Value  Count  Frequency (%)
0      5290   25.8%
1      4163   20.3%
2      2379   11.6%
5      2068   10.1%
3      1911   9.3%
4      1406   6.9%
8      987    4.8%
7      938    4.6%
6      733    3.6%
9      623    3.0%

Lowercase Letter
Value  Count  Frequency (%)
g      6235   29.9%
k      3118   14.9%
n      3086   14.8%
o      3086   14.8%
t      3082   14.8%
m      1363   6.5%
c      710    3.4%
l      192    0.9%

Other Punctuation
Value  Count  Frequency (%)
/      1018   53.9%
.      819    43.4%
,      52     2.8%

Space Separator
Value    Count  Frequency (%)
(space)  19240  100.0%

Open Punctuation
Value  Count  Frequency (%)
(      1101   100.0%

Close Punctuation
Value  Count  Frequency (%)
)      1101   100.0%

Math Symbol
Value  Count  Frequency (%)
×      710    100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      545    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  45084  46.7%
Latin   26899  27.8%
Hangul  24627  25.5%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
(not recovered)    3673   14.9%
(not recovered)    2124   8.6%
(not recovered)    1711   6.9%
(not recovered)    1224   5.0%
(not recovered)    955    3.9%
(not recovered)    917    3.7%
(not recovered)    912    3.7%
(not recovered)    733    3.0%
(not recovered)    724    2.9%
(not recovered)    599    2.4%
Other values (44)  11055  44.9%

Latin
Value              Count  Frequency (%)
g                  6235   23.2%
k                  3118   11.6%
n                  3086   11.5%
o                  3086   11.5%
t                  3082   11.5%
P                  2108   7.8%
m                  1363   5.1%
c                  710    2.6%
T                  522    1.9%
B                  476    1.8%
Other values (13)  3113   11.6%

Common
Value             Count  Frequency (%)
(space)           19240  42.7%
0                 5290   11.7%
1                 4163   9.2%
2                 2379   5.3%
5                 2068   4.6%
3                 1911   4.2%
4                 1406   3.1%
(                 1101   2.4%
)                 1101   2.4%
/                 1018   2.3%
Other values (8)  5407   12.0%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   71273  73.8%
Hangul  24627  25.5%
None    710    0.7%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
(space)            19240  27.0%
g                  6235   8.7%
0                  5290   7.4%
1                  4163   5.8%
k                  3118   4.4%
n                  3086   4.3%
o                  3086   4.3%
t                  3082   4.3%
2                  2379   3.3%
P                  2108   3.0%
Other values (30)  19486  27.3%

Hangul
Value              Count  Frequency (%)
(not recovered)    3673   14.9%
(not recovered)    2124   8.6%
(not recovered)    1711   6.9%
(not recovered)    1224   5.0%
(not recovered)    955    3.9%
(not recovered)    917    3.7%
(not recovered)    912    3.7%
(not recovered)    733    3.0%
(not recovered)    724    2.9%
(not recovered)    599    2.4%
Other values (44)  11055  44.9%

None
Value  Count  Frequency (%)
×      710    100.0%

B2EN_TESTUPDT_DE
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20160128
2nd row: 20160128
3rd row: 20160128
4th row: 20160128
5th row: 20160128

Common Values

Value     Count  Frequency (%)
20160128  10000  100.0%

Length

Histogram of lengths of the category



Correlations

                   STD_UNIT_NEW_CODE  STD_UNIT_NEW_NM  STD_UNIT_CODE
STD_UNIT_NEW_CODE  1.000              0.798            1.000
STD_UNIT_NEW_NM    0.798              1.000            0.798
STD_UNIT_CODE      1.000              0.798            1.000

                   STD_UNIT_NEW_CODE  STD_UNIT_CODE  STD_UNIT_NEW_NM
STD_UNIT_NEW_CODE  1.000              1.000          0.632
STD_UNIT_CODE      1.000              1.000          0.632
STD_UNIT_NEW_NM    0.632              0.632          1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index  STD_UNIT_NEW_CODE  STD_UNIT_NEW_NM  STD_UNIT_CODE  STD_UNIT_NM                 B2EN_TESTUPDT_DE
2342   12                 kg               12             kg 접 100내                 20160128
11603  85                 (not recovered)  81             개 12개 3급                 20160128
3154   13                 ton              13             ton PP대 60내               20160128
8212   72                 kg               72             kg 두름 150내               20160128
6637   71                 g                71             g 코 80내                   20160128
8441   72                 kg               72             kg 깡 6통                   20160128
3629   13                 ton              13             ton 단 50내                 20160128
2431   12                 kg               12             kg 채 25내                  20160128
3518   13                 ton              13             ton 봉지 15내               20160128
9294   73                 ton              73             ton 상자 190내              20160128

Last rows

Index  STD_UNIT_NEW_CODE  STD_UNIT_NEW_NM  STD_UNIT_CODE  STD_UNIT_NM                 B2EN_TESTUPDT_DE
10691  73                 ton              73             ton 포 3방                  20160128
8829   72                 kg               72             kg 축 3방                   20160128
5783   71                 g                71             g C/T(B/T) 5단              20160128
11226  83                 (not recovered)  82             단 16개이상 기타            20160128
1853   12                 kg               12             kg 트럭 17개                20160128
4827   33                 ton              33             ton 트럭 3.9cm×5.1cm×2.7m   20160128
871    11                 g                11             g 개 10내(5단위)            20160128
6447   71                 g                71             g 미(마리) 140내            20160128
5886   71                 g                71             g S/P 10단                  20160128
4811   33                 ton              33             ton 그물망 30cm×3.6m이상    20160128

Duplicate rows

Most frequently occurring

Index  STD_UNIT_NEW_CODE  STD_UNIT_NEW_NM  STD_UNIT_CODE  STD_UNIT_NM  B2EN_TESTUPDT_DE  # duplicates
0      11                 g                11             g 기타       20160128          2
1      12                 kg               12             kg 기타      20160128          2
2      13                 ton              13             ton 기타     20160128          2
3      33                 ton              33             ton 기타     20160128          2
4      34                 l                34             l 기타       20160128          2
5      72                 kg               72             kg 기타      20160128          2
6      73                 ton              73             ton 기타     20160128          2
7      83                 (not recovered)  82             단 기타      20160128          2
8      83                 (not recovered)  83             속 기타      20160128          2
9      85                 (not recovered)  81             개 기타      20160128          2