Overview

Dataset statistics

Number of variables: 6
Number of observations: 54
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.8 KiB
Average record size in memory: 52.4 B

Variable types

Numeric: 2
Categorical: 3
Text: 1

Dataset

Description: Analysis-item data for crop diagnosis and prescription in Gyeongsangnam-do.
Author: Gyeongsangnam-do (경상남도)
URL: https://www.data.go.kr/data/15049542/fileData.do

Alerts

분류코드(소) is highly overall correlated with 비용 and 2 other fields (High correlation)
비용 is highly overall correlated with 분류코드(소) and 3 other fields (High correlation)
분류코드(대) is highly overall correlated with 분류코드(소) and 3 other fields (High correlation)
분석기준 is highly overall correlated with 분류코드(소) and 3 other fields (High correlation)
단위 is highly overall correlated with 비용 and 2 other fields (High correlation)
분류코드(소) has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 02:53:01.527434
Analysis finished: 2023-12-12 02:53:03.039543
Duration: 1.51 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
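
For readers who want to regenerate a report like this one, the following is a minimal sketch rather than the exact procedure used here. It assumes the CSV has been downloaded from the URL above; the function name `build_report`, the file paths and the `encoding` default are hypothetical and must be adapted to the actual file. The report's own `config.json` is not reproduced, so default settings are assumed.

```python
# Dataset metadata as published in the "Dataset" section of this report
# (description and author in the original Korean).
DATASET_METADATA = {
    "description": "경상남도 농작물진단처방 분석항목 데이터입니다.",
    "author": "경상남도",
    "url": "https://www.data.go.kr/data/15049542/fileData.do",
}


def build_report(csv_path, html_path="report.html", encoding="utf-8"):
    """Profile a CSV file and write an HTML report.

    csv_path, html_path and encoding are hypothetical placeholders; the
    real file name and encoding depend on the download from the portal.
    """
    # Imported lazily so this module loads even without the heavy dependencies.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    profile = ProfileReport(df, title="Profiling report", dataset=DATASET_METADATA)
    profile.to_file(html_path)
    return profile
```

Calling `build_report("analysis_items.csv")` (hypothetical file name) would write `report.html` next to the script. Timings and memory figures will differ from the ones listed above.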

Variables

분류코드(소)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 54
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27.5
Minimum: 1
Maximum: 54
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 618.0 B

Quantile statistics

Minimum: 1
5-th percentile: 3.65
Q1: 14.25
Median: 27.5
Q3: 40.75
95-th percentile: 51.35
Maximum: 54
Range: 53
Interquartile range (IQR): 26.5

Descriptive statistics

Standard deviation: 15.732133
Coefficient of variation (CV): 0.57207755
Kurtosis: -1.2
Mean: 27.5
Median Absolute Deviation (MAD): 13.5
Skewness: 0
Sum: 1485
Variance: 247.5
Monotonicity: Strictly increasing
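
The figures above are what one would expect if 분류코드(소) is simply the running index 1 to 54. The sketch below checks this with the Python standard library only. It assumes the column holds exactly the consecutive integers 1..54, that percentiles use linear interpolation, and that the variance uses the sample (n - 1) denominator. The helper `percentile` is illustrative and not part of any library.

```python
import statistics

# Assumption: the column holds the consecutive integers 1..54, consistent with
# Minimum 1, Maximum 54, 54 distinct values and a strictly increasing order.
values = list(range(1, 55))


def percentile(sorted_values, q):
    """Linear-interpolation percentile for q in [0, 1] (illustrative helper)."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std_dev = statistics.stdev(values)
median = statistics.median(values)
# Median absolute deviation: median of the distances from the median.
mad = statistics.median(abs(v - median) for v in values)

p05 = percentile(values, 0.05)
q1 = percentile(values, 0.25)
q3 = percentile(values, 0.75)
p95 = percentile(values, 0.95)

print("mean", mean, "variance", variance, "std", round(std_dev, 6))
print("CV", round(std_dev / mean, 8), "MAD", mad, "sum", sum(values))
print("p05", round(p05, 2), "Q1", q1, "Q3", q3, "p95", round(p95, 2), "IQR", q3 - q1)
```

Under these assumptions the script reproduces the mean, variance, standard deviation, quartiles and MAD listed for this variable.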
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1 | 1 | 1.9%
42 | 1 | 1.9%
31 | 1 | 1.9%
32 | 1 | 1.9%
33 | 1 | 1.9%
34 | 1 | 1.9%
35 | 1 | 1.9%
36 | 1 | 1.9%
37 | 1 | 1.9%
38 | 1 | 1.9%
Other values (44) | 44 | 81.5%

Minimum values

Value | Count | Frequency (%)
1 | 1 | 1.9%
2 | 1 | 1.9%
3 | 1 | 1.9%
4 | 1 | 1.9%
5 | 1 | 1.9%
6 | 1 | 1.9%
7 | 1 | 1.9%
8 | 1 | 1.9%
9 | 1 | 1.9%
10 | 1 | 1.9%

Maximum values

Value | Count | Frequency (%)
54 | 1 | 1.9%
53 | 1 | 1.9%
52 | 1 | 1.9%
51 | 1 | 1.9%
50 | 1 | 1.9%
49 | 1 | 1.9%
48 | 1 | 1.9%
47 | 1 | 1.9%
46 | 1 | 1.9%
45 | 1 | 1.9%

분류코드(대)
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 9.3%
Missing: 0
Missing (%): 0.0%
Memory size: 564.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: A
2nd row: A
3rd row: A
4th row: A
5th row: A

Common Values

Value | Count | Frequency (%)
C | 16 | 29.6%
E | 12 | 22.2%
A | 10 | 18.5%
B | 8 | 14.8%
D | 8 | 14.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
c | 16 | 29.6%
e | 12 | 22.2%
a | 10 | 18.5%
b | 8 | 14.8%
d | 8 | 14.8%

항목명
Text

Distinct: 38
Distinct (%): 70.4%
Missing: 0
Missing (%): 0.0%
Memory size: 564.0 B

Length

Max length: 20
Median length: 12
Mean length: 5.4259259
Min length: 1

Characters and Unicode

Total characters: 293
Distinct characters: 79
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
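
As a rough illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It uses only the five sample rows shown for this variable, so the counts are far smaller than the column totals in this report. The helper name `category_counts` is an assumption made for the example. The two-letter codes map to the category names used here: `Lo` is Other Letter, `Lu` Uppercase Letter, `Ll` Lowercase Letter, `Nd` Decimal Number, `Ps` Open Punctuation, `Pe` Close Punctuation and `Pd` Dash Punctuation.

```python
import unicodedata
from collections import Counter

# The five sample rows listed for the text variable in this report.
samples = ["수소이온(pH)", "전기전도도(EC)", "질산성질소(NO3-N)", "칼륨(K)", "칼슘(Ca)"]


def category_counts(strings):
    """Count characters per Unicode general category (illustrative helper)."""
    counts = Counter()
    for text in strings:
        # NFC keeps each Hangul syllable as a single precomposed code point.
        for char in unicodedata.normalize("NFC", text):
            counts[unicodedata.category(char)] += 1
    return counts


counts = category_counts(samples)
for code, count in counts.most_common():
    print(code, count)
```

Python's standard library exposes the general category but not the script or block of a character, so those two breakdowns would need a third-party package or a lookup table.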

Unique

Unique: 29
Unique (%): 53.7%

Sample

1st row: 수소이온(pH)
2nd row: 전기전도도(EC)
3rd row: 질산성질소(NO3-N)
4th row: 칼륨(K)
5th row: 칼슘(Ca)

Words

Value | Count | Frequency (%)
수은 | 3 | 5.4%
(blank) | 3 | 5.4%
니켈 | 3 | 5.4%
카드뮴 | 3 | 5.4%
비소 | 3 | 5.4%
구리 | 3 | 5.4%
아연 | 3 | 5.4%
6가크롬 | 2 | 3.6%
전기전도도(ec | 2 | 3.6%
v/v | 2 | 3.6%
Other values (29) | 29 | 51.8%

Most occurring characters

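Value | Count | Frequency (%)
( | 21 | 7.2%
) | 21 | 7.2%
(blank) | 9 | 3.1%
(blank) | 9 | 3.1%
(blank) | 9 | 3.1%
(blank) | 8 | 2.7%
(blank) | 8 | 2.7%
C | 8 | 2.7%
(blank) | 7 | 2.4%
(blank) | 6 | 2.0%
Other values (69) | 187 | 63.8%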

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 180 | 61.4%
Uppercase Letter | 31 | 10.6%
Open Punctuation | 23 | 7.8%
Close Punctuation | 23 | 7.8%
Lowercase Letter | 15 | 5.1%
Decimal Number | 11 | 3.8%
Other Punctuation | 6 | 2.0%
Dash Punctuation | 2 | 0.7%
Space Separator | 2 | 0.7%

Most frequent character per category

Other Letter

Value | Count | Frequency (%)
(blank) | 9 | 5.0%
(blank) | 9 | 5.0%
(blank) | 9 | 5.0%
(blank) | 8 | 4.4%
(blank) | 8 | 4.4%
(blank) | 7 | 3.9%
(blank) | 6 | 3.3%
(blank) | 6 | 3.3%
(blank) | 6 | 3.3%
(blank) | 6 | 3.3%
Other values (40) | 106 | 58.9%

Uppercase Letter

Value | Count | Frequency (%)
C | 8 | 25.8%
O | 5 | 16.1%
H | 4 | 12.9%
N | 4 | 12.9%
M | 3 | 9.7%
E | 3 | 9.7%
K | 2 | 6.5%
P | 1 | 3.2%
S | 1 | 3.2%

Decimal Number

Value | Count | Frequency (%)
5 | 3 | 27.3%
1 | 2 | 18.2%
3 | 2 | 18.2%
6 | 2 | 18.2%
2 | 1 | 9.1%
4 | 1 | 9.1%

Lowercase Letter

Value | Count | Frequency (%)
v | 4 | 26.7%
a | 4 | 26.7%
p | 3 | 20.0%
l | 2 | 13.3%
g | 2 | 13.3%

Other Punctuation

Value | Count | Frequency (%)
: | 2 | 33.3%
/ | 2 | 33.3%
, | 2 | 33.3%

Open Punctuation

Value | Count | Frequency (%)
( | 21 | 91.3%
[ | 2 | 8.7%

Close Punctuation

Value | Count | Frequency (%)
) | 21 | 91.3%
] | 2 | 8.7%

Dash Punctuation

Value | Count | Frequency (%)
- | 2 | 100.0%

Space Separator

Value | Count | Frequency (%)
(space) | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 180 | 61.4%
Common | 67 | 22.9%
Latin | 46 | 15.7%

Most frequent character per script

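Hangul

Value | Count | Frequency (%)
(blank) | 9 | 5.0%
(blank) | 9 | 5.0%
(blank) | 9 | 5.0%
(blank) | 8 | 4.4%
(blank) | 8 | 4.4%
(blank) | 7 | 3.9%
(blank) | 6 | 3.3%
(blank) | 6 | 3.3%
(blank) | 6 | 3.3%
(blank) | 6 | 3.3%
Other values (40) | 106 | 58.9%

Common

Value | Count | Frequency (%)
( | 21 | 31.3%
) | 21 | 31.3%
5 | 3 | 4.5%
1 | 2 | 3.0%
3 | 2 | 3.0%
: | 2 | 3.0%
/ | 2 | 3.0%
, | 2 | 3.0%
- | 2 | 3.0%
6 | 2 | 3.0%
Other values (5) | 8 | 11.9%

Latin

Value | Count | Frequency (%)
C | 8 | 17.4%
O | 5 | 10.9%
H | 4 | 8.7%
v | 4 | 8.7%
N | 4 | 8.7%
a | 4 | 8.7%
M | 3 | 6.5%
E | 3 | 6.5%
p | 3 | 6.5%
K | 2 | 4.3%
Other values (4) | 6 | 13.0%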

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 180 | 61.4%
ASCII | 113 | 38.6%

Most frequent character per block

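ASCII

Value | Count | Frequency (%)
( | 21 | 18.6%
) | 21 | 18.6%
C | 8 | 7.1%
O | 5 | 4.4%
H | 4 | 3.5%
v | 4 | 3.5%
N | 4 | 3.5%
a | 4 | 3.5%
M | 3 | 2.7%
5 | 3 | 2.7%
Other values (19) | 36 | 31.9%

Hangul

Value | Count | Frequency (%)
(blank) | 9 | 5.0%
(blank) | 9 | 5.0%
(blank) | 9 | 5.0%
(blank) | 8 | 4.4%
(blank) | 8 | 4.4%
(blank) | 7 | 3.9%
(blank) | 6 | 3.3%
(blank) | 6 | 3.3%
(blank) | 6 | 3.3%
(blank) | 6 | 3.3%
Other values (40) | 106 | 58.9%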

비용
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8
Distinct (%): 14.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12072.222
Minimum: 7500
Maximum: 43800
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 618.0 B

Quantile statistics

Minimum: 7500
5-th percentile: 8500
Q1: 9900
Median: 11700
Q3: 11700
95-th percentile: 16000
Maximum: 43800
Range: 36300
Interquartile range (IQR): 1800

Descriptive statistics

Standard deviation: 5253.4774
Coefficient of variation (CV): 0.4351707
Kurtosis: 25.288801
Mean: 12072.222
Median Absolute Deviation (MAD): 1800
Skewness: 4.4127823
Sum: 651900
Variance: 27599025
Monotonicity: Not monotonic
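
Because 비용 takes only eight distinct values, its summary statistics can be recomputed from the value counts listed for this variable. The sketch below does so with the Python standard library. It assumes a sample (n - 1) denominator for the variance, and the name `cost_counts` is chosen for the example.

```python
import statistics

# Value counts taken from the 비용 frequency table in this report.
cost_counts = {
    7500: 1, 8500: 10, 9900: 15, 11700: 16,
    14300: 1, 16000: 9, 21600: 1, 43800: 1,
}

# Expand the counts back into one entry per observation (54 in total).
costs = sorted(value for value, count in cost_counts.items() for _ in range(count))

total = sum(costs)
mean = statistics.mean(costs)
median = statistics.median(costs)
variance = statistics.variance(costs)  # sample variance, n - 1 denominator
std_dev = statistics.stdev(costs)
# Median absolute deviation: median of the distances from the median.
mad = statistics.median(abs(value - median) for value in costs)

print("n", len(costs), "sum", total, "mean", round(mean, 3))
print("median", median, "MAD", mad)
print("std", round(std_dev, 4), "CV", round(std_dev / mean, 7))
```

The single 43800 entry lies far above the other values, which is consistent with the large positive skewness and kurtosis reported for this variable.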
Histogram with fixed size bins (bins=8)

Common values

Value | Count | Frequency (%)
11700 | 16 | 29.6%
9900 | 15 | 27.8%
8500 | 10 | 18.5%
16000 | 9 | 16.7%
21600 | 1 | 1.9%
14300 | 1 | 1.9%
7500 | 1 | 1.9%
43800 | 1 | 1.9%

Minimum values

Value | Count | Frequency (%)
7500 | 1 | 1.9%
8500 | 10 | 18.5%
9900 | 15 | 27.8%
11700 | 16 | 29.6%
14300 | 1 | 1.9%
16000 | 9 | 16.7%
21600 | 1 | 1.9%
43800 | 1 | 1.9%

Maximum values

Value | Count | Frequency (%)
43800 | 1 | 1.9%
21600 | 1 | 1.9%
16000 | 9 | 16.7%
14300 | 1 | 1.9%
11700 | 16 | 29.6%
9900 | 15 | 27.8%
8500 | 10 | 18.5%
7500 | 1 | 1.9%

분석기준
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 9.3%
Missing: 0
Missing (%): 0.0%
Memory size: 564.0 B

Length

Max length: 16
Median length: 14
Mean length: 12.592593
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 수질오염공정시험기준
2nd row: 수질오염공정시험기준
3rd row: 수질오염공정시험기준
4th row: 수질오염공정시험기준
5th row: 수질오염공정시험기준

Common Values

Value | Count | Frequency (%)
토양오염공정시험기준 | 16 | 29.6%
농촌진흥청토양및식물체분석법 | 15 | 27.8%
농촌진흥청고시 제2017-19 | 13 | 24.1%
수질오염공정시험기준 | 8 | 14.8%
STD. Method | 2 | 3.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
토양오염공정시험기준 | 16 | 23.2%
농촌진흥청토양및식물체분석법 | 15 | 21.7%
농촌진흥청고시 | 13 | 18.8%
제2017-19 | 13 | 18.8%
수질오염공정시험기준 | 8 | 11.6%
std | 2 | 2.9%
method | 2 | 2.9%

단위
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 14.8%
Missing: 0
Missing (%): 0.0%
Memory size: 564.0 B

Length

Max length: 8
Median length: 7
Mean length: 4.7592593
Min length: 1

Unique

Unique: 1
Unique (%): 1.9%

Sample

1st row: <NA>
2nd row: dS/m
3rd row: mg/L
4th row: mg/L
5th row: mg/L

Common Values

Value | Count | Frequency (%)
mg/kg | 25 | 46.3%
mg/L | 9 | 16.7%
<NA> | 4 | 7.4%
cmolc/L | 4 | 7.4%
cmolc/kg | 4 | 7.4%
% | 4 | 7.4%
dS/m | 3 | 5.6%
g/kg | 1 | 1.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
mg/kg | 25 | 46.3%
mg/l | 9 | 16.7%
na | 4 | 7.4%
cmolc/l | 4 | 7.4%
cmolc/kg | 4 | 7.4%
(blank) | 4 | 7.4%
ds/m | 3 | 5.6%
g/kg | 1 | 1.9%

Interactions


Correlations

Variable | 분류코드(소) | 분류코드(대) | 항목명 | 비용 | 분석기준 | 단위
분류코드(소) | 1.000 | 0.989 | 0.000 | 0.923 | 0.983 | 0.746
분류코드(대) | 0.989 | 1.000 | 0.000 | 0.904 | 0.968 | 0.756
항목명 | 0.000 | 0.000 | 1.000 | 0.862 | 0.874 | 1.000
비용 | 0.923 | 0.904 | 0.862 | 1.000 | 0.929 | 0.838
분석기준 | 0.983 | 0.968 | 0.874 | 0.929 | 1.000 | 0.757
단위 | 0.746 | 0.756 | 1.000 | 0.838 | 0.757 | 1.000

Variable | 분류코드(대) | 단위 | 분석기준
분류코드(대) | 1.000 | 0.603 | 0.746
단위 | 0.603 | 1.000 | 0.605
분석기준 | 0.746 | 0.605 | 1.000

Variable | 분류코드(소) | 비용 | 분류코드(대) | 분석기준 | 단위
분류코드(소) | 1.000 | 0.809 | 0.806 | 0.770 | 0.483
비용 | 0.809 | 1.000 | 0.550 | 0.611 | 0.724
분류코드(대) | 0.806 | 0.550 | 1.000 | 0.746 | 0.603
분석기준 | 0.770 | 0.611 | 0.746 | 1.000 | 0.605
단위 | 0.483 | 0.724 | 0.603 | 0.605 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

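First rows

Index | 분류코드(소) | 분류코드(대) | 항목명 | 비용 | 분석기준 | 단위
0 | 1 | A | 수소이온(pH) | 8500 | 수질오염공정시험기준 | <NA>
1 | 2 | A | 전기전도도(EC) | 8500 | 수질오염공정시험기준 | dS/m
2 | 3 | A | 질산성질소(NO3-N) | 8500 | 수질오염공정시험기준 | mg/L
3 | 4 | A | 칼륨(K) | 8500 | 수질오염공정시험기준 | mg/L
4 | 5 | A | 칼슘(Ca) | 8500 | 수질오염공정시험기준 | mg/L
5 | 6 | A | 마그네슘(Mg) | 8500 | 수질오염공정시험기준 | mg/L
6 | 7 | A | 나트륨(Na) | 8500 | 수질오염공정시험기준 | mg/L
7 | 8 | A | 염소이온(Cl-) | 8500 | 수질오염공정시험기준 | mg/L
8 | 9 | A | 황산이온(SO4) | 8500 | STD. Method | mg/L
9 | 10 | A | 중탄산(HCO3) | 8500 | STD. Method | mg/L

Last rows

Index | 분류코드(소) | 분류코드(대) | 항목명 | 비용 | 분석기준 | 단위
44 | 45 | E | 비소 | 16000 | 농촌진흥청고시 제2017-19 | mg/kg
45 | 46 | E | 니켈 | 16000 | 농촌진흥청고시 제2017-19 | mg/kg
46 | 47 | E | 카드뮴 | 16000 | 농촌진흥청고시 제2017-19 | mg/kg
47 | 48 | E | 구리 | 16000 | 농촌진흥청고시 제2017-19 | mg/kg
48 | 49 | E | 크롬 | 16000 | 농촌진흥청고시 제2017-19 | mg/kg
49 | 50 | E | 수은 | 16000 | 농촌진흥청고시 제2017-19 | mg/kg
50 | 51 | E | 아연 | 16000 | 농촌진흥청고시 제2017-19 | mg/kg
51 | 52 | E | (blank) | 16000 | 농촌진흥청고시 제2017-19 | mg/kg
52 | 53 | E | 부숙도(콤백법) | 43800 | 농촌진흥청고시 제2017-19 | <NA>
53 | 54 | B | 염소Cl | 9900 | 농촌진흥청고시 제2017-19 | %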