Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1306
Duplicate rows (%): 13.1%
Total size in memory: 732.4 KiB
Average record size in memory: 75.0 B
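
The two derived figures above can be checked from the raw counts. This is a minimal sketch, assuming the percentage is duplicate rows over observations and the record size is total memory over observations; the report does not state these formulas, but both reproduce the reported values.

```python
# Figures copied from the Dataset statistics block.
n_rows = 10000
duplicate_rows = 1306
total_kib = 732.4

# Assumed derivations (not stated in the report itself).
duplicate_pct = 100 * duplicate_rows / n_rows
avg_record_bytes = total_kib * 1024 / n_rows

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```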

Variable types

Categorical: 5
Numeric: 3

Dataset

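Description: Livestock manure transfer records exchanged between the Livestock Manure Electronic Transfer Management System, operated by the Ministry of Environment, and the Agriculture and Forestry Project Information System (Agrix), operated by the Ministry of Agriculture, Food and Rural Affairs. The linked data covers livestock manure, liquid fertilizer, and electronic transfer information.
Author: Korea Environment Corporation (한국환경공단)
URL: https://www.data.go.kr/data/15041934/fileData.do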

Alerts

Dataset has 1306 (13.1%) duplicate rows [Duplicates]
처리시설방법 is highly overall correlated with 배출자위탁량 and 3 other fields [High correlation]
배출자처리방법 is highly overall correlated with 배출자위탁량 and 3 other fields [High correlation]
배출자위탁량 is highly overall correlated with 운반자인수량 and 3 other fields [High correlation]
운반자인수량 is highly overall correlated with 배출자위탁량 and 3 other fields [High correlation]
처리자인수량 is highly overall correlated with 배출자위탁량 and 3 other fields [High correlation]
입력구분 is highly imbalanced (99.3%) [Imbalance]
축종 is highly imbalanced (95.0%) [Imbalance]
축분 is highly imbalanced (76.0%) [Imbalance]
배출자위탁량 is highly skewed (γ1 = 99.82343467) [Skewed]
운반자인수량 is highly skewed (γ1 = 99.82343467) [Skewed]
처리자인수량 is highly skewed (γ1 = 99.82334824) [Skewed]
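
The imbalance percentages in these alerts are consistent with one minus the normalized Shannon entropy of the category counts. The sketch below assumes that definition (the report does not spell it out) and uses the counts listed in each variable's Common Values table; `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """1 - normalized Shannon entropy: 0 = perfectly balanced, 1 = a single class."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Category counts copied from the Common Values tables of this report.
counts_by_column = {
    "입력구분": [9994, 6],
    "축종": [9889, 80, 21, 10],
    "축분": [8943, 737, 198, 103, 18, 1],
}
scores = {name: imbalance_score(c) for name, c in counts_by_column.items()}
for name, score in scores.items():
    print(f"{name}: {score:.1%}")
```

Under this assumed definition, a column where one category holds nearly every row scores close to 100%, which matches the three Imbalance alerts.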

Reproduction

Analysis started: 2023-12-12 09:42:53.624228
Analysis finished: 2023-12-12 09:42:55.810864
Duration: 2.19 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
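
The reported duration follows from the two timestamps. A small sketch using only the standard library, assuming both timestamps share the same time zone:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2023-12-12 09:42:53.624228")
finished = datetime.fromisoformat("2023-12-12 09:42:55.810864")

duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```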

Variables

입력구분 (input type)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

I: 9994
U: 6

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: I
2nd row: I
3rd row: I
4th row: I
5th row: I

Common Values

Value | Count | Frequency (%)
I | 9994 | 99.9%
U | 6 | 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

축종 (livestock species)
Categorical

IMBALANCE

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

돼지 (pig): 9889
(unlabeled): 80
(unlabeled): 21
(unlabeled): 10

Length

Max length: 2
Median length: 2
Mean length: 1.9889
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 돼지
2nd row: 돼지
3rd row: 돼지
4th row: 돼지
5th row: 돼지

Common Values

Value | Count | Frequency (%)
돼지 | 9889 | 98.9%
(unlabeled) | 80 | 0.8%
(unlabeled) | 21 | 0.2%
(unlabeled) | 10 | 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

축분 (manure type)
Categorical

IMBALANCE

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

혼합 (mixed): 8943
(unlabeled): 737
(unlabeled): 198
액비 (liquid fertilizer): 103
퇴비 (compost): 18

Length

Max length: 4
Median length: 2
Mean length: 1.9067
Min length: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 혼합
2nd row: 혼합
3rd row: 혼합
4th row: (unlabeled)
5th row: 혼합

Common Values

Value | Count | Frequency (%)
혼합 | 8943 | 89.4%
(unlabeled) | 737 | 7.4%
(unlabeled) | 198 | 2.0%
액비 | 103 | 1.0%
퇴비 | 18 | 0.2%
중간퇴비 | 1 | < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

배출자위탁량 (quantity consigned by the discharger)
Real number (ℝ)

HIGH CORRELATION, SKEWED

Distinct: 1815
Distinct (%): 18.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.726512
Minimum: 0.5
Maximum: 23640
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0.5
5th percentile: 5
Q1: 7.54
Median: 15
Q3: 22.89
95th percentile: 25
Maximum: 23640
Range: 23639.5
Interquartile range (IQR): 15.35

Descriptive statistics

Standard deviation: 236.38558
Coefficient of variation (CV): 13.335143
Kurtosis: 9976.4556
Mean: 17.726512
Median Absolute Deviation (MAD): 7.57
Skewness: 99.823435
Sum: 177265.12
Variance: 55878.142
Monotonicity: Not monotonic
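
Several of these statistics are simple functions of the others, which makes the block easy to sanity-check. The sketch below recomputes them from the reported mean, standard deviation, extremes, and quartiles for 배출자위탁량; the relationships are the standard textbook definitions, and it is an assumption that the report uses the same ones.

```python
# Values copied from the report for 배출자위탁량.
n = 10000
mean = 17.726512
std = 236.38558
minimum, maximum = 0.5, 23640
q1, q3 = 7.54, 22.89

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range
total = mean * n                 # sum of all observations

print(f"CV: {cv:.6f}")
print(f"Variance: {variance:.3f}")
print(f"Range: {value_range}")
print(f"IQR: {iqr:.2f}")
print(f"Sum: {total:.2f}")
```

A coefficient of variation above 13 means the standard deviation is more than thirteen times the mean, which is already a strong hint that a few extreme records dominate the spread.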
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
23.0 | 529 | 5.3%
15.0 | 458 | 4.6%
24.0 | 401 | 4.0%
20.0 | 344 | 3.4%
5.0 | 333 | 3.3%
25.0 | 304 | 3.0%
8.0 | 269 | 2.7%
6.0 | 223 | 2.2%
21.0 | 172 | 1.7%
22.0 | 149 | 1.5%
Other values (1805) | 6818 | 68.2%

Minimum 10 values

Value | Count | Frequency (%)
0.5 | 10 | 0.1%
1.0 | 52 | 0.5%
1.18 | 1 | < 0.1%
1.5 | 11 | 0.1%
1.95 | 1 | < 0.1%
2.0 | 30 | 0.3%
2.35 | 1 | < 0.1%
2.5 | 11 | 0.1%
2.53 | 1 | < 0.1%
2.63 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
23640.0 | 1 | < 0.1%
99.0 | 1 | < 0.1%
87.0 | 12 | 0.1%
81.4 | 1 | < 0.1%
78.0 | 1 | < 0.1%
73.0 | 1 | < 0.1%
70.0 | 1 | < 0.1%
65.0 | 9 | 0.1%
64.0 | 1 | < 0.1%
63.0 | 1 | < 0.1%
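
The largest value (23640.0) is more than two hundred times the next one (99.0), and a single record of that size is enough to explain a skewness near 100. The sketch below uses synthetic data, not the real column: 9999 records at the reported median plus one record at the reported maximum. It computes the uncorrected moment-based skewness; the report's estimator may apply a sample-size correction, so only the order of magnitude should be compared.

```python
import math

def population_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5, without sample-size correction."""
    n = len(values)
    mean = math.fsum(values) / n
    m2 = math.fsum((v - mean) ** 2 for v in values) / n
    m3 = math.fsum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Synthetic illustration only: 9999 values at the median, one at the maximum.
synthetic = [15.0] * 9999 + [23640.0]
g1 = population_skewness(synthetic)

# Closed form for one outlier among n otherwise identical values.
n = len(synthetic)
bound = (n - 2) / math.sqrt(n - 1)

print(f"g1 = {g1:.3f}, closed form = {bound:.3f}")
```

Both numbers come out just under 100, close to the reported γ1 of 99.82, which suggests the Skewed alert is driven almost entirely by that one record.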

배출자처리방법 (discharger treatment method)
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

<NA>: 4882
액비화 (liquid fertilizer production): 3653
정화처리 (purification treatment): 1016
퇴비화 (composting): 295
바이오에너지화 (bioenergy conversion): 154

Length

Max length: 7
Median length: 4
Mean length: 3.6514
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 정화처리
2nd row: <NA>
3rd row: <NA>
4th row: 퇴비화
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 4882 | 48.8%
액비화 | 3653 | 36.5%
정화처리 | 1016 | 10.2%
퇴비화 | 295 | 2.9%
바이오에너지화 | 154 | 1.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

운반자인수량 (quantity received by the transporter)
Real number (ℝ)

HIGH CORRELATION, SKEWED

Distinct: 1815
Distinct (%): 18.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.726512
Minimum: 0.5
Maximum: 23640
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0.5
5th percentile: 5
Q1: 7.54
Median: 15
Q3: 22.89
95th percentile: 25
Maximum: 23640
Range: 23639.5
Interquartile range (IQR): 15.35

Descriptive statistics

Standard deviation: 236.38558
Coefficient of variation (CV): 13.335143
Kurtosis: 9976.4556
Mean: 17.726512
Median Absolute Deviation (MAD): 7.57
Skewness: 99.823435
Sum: 177265.12
Variance: 55878.142
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
23.0 | 529 | 5.3%
15.0 | 458 | 4.6%
24.0 | 401 | 4.0%
20.0 | 344 | 3.4%
5.0 | 333 | 3.3%
25.0 | 304 | 3.0%
8.0 | 269 | 2.7%
6.0 | 223 | 2.2%
21.0 | 172 | 1.7%
22.0 | 149 | 1.5%
Other values (1805) | 6818 | 68.2%

Minimum 10 values

Value | Count | Frequency (%)
0.5 | 10 | 0.1%
1.0 | 52 | 0.5%
1.18 | 1 | < 0.1%
1.5 | 11 | 0.1%
1.95 | 1 | < 0.1%
2.0 | 30 | 0.3%
2.35 | 1 | < 0.1%
2.5 | 11 | 0.1%
2.53 | 1 | < 0.1%
2.63 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
23640.0 | 1 | < 0.1%
99.0 | 1 | < 0.1%
87.0 | 12 | 0.1%
81.4 | 1 | < 0.1%
78.0 | 1 | < 0.1%
73.0 | 1 | < 0.1%
70.0 | 1 | < 0.1%
65.0 | 9 | 0.1%
64.0 | 1 | < 0.1%
63.0 | 1 | < 0.1%

처리자인수량 (quantity received by the processor)
Real number (ℝ)

HIGH CORRELATION, SKEWED

Distinct: 1816
Distinct (%): 18.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.721712
Minimum: 0
Maximum: 23640
Zeros: 2
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5th percentile: 5
Q1: 7.53
Median: 15
Q3: 22.88
95th percentile: 25
Maximum: 23640
Range: 23640
Interquartile range (IQR): 15.35

Descriptive statistics

Standard deviation: 236.3857
Coefficient of variation (CV): 13.338762
Kurtosis: 9976.4441
Mean: 17.721712
Median Absolute Deviation (MAD): 7.57
Skewness: 99.823348
Sum: 177217.12
Variance: 55878.197
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
23.0 | 529 | 5.3%
15.0 | 458 | 4.6%
24.0 | 399 | 4.0%
20.0 | 344 | 3.4%
5.0 | 333 | 3.3%
25.0 | 304 | 3.0%
8.0 | 269 | 2.7%
6.0 | 223 | 2.2%
21.0 | 172 | 1.7%
22.0 | 149 | 1.5%
Other values (1806) | 6820 | 68.2%

Minimum 10 values

Value | Count | Frequency (%)
0.0 | 2 | < 0.1%
0.5 | 10 | 0.1%
1.0 | 52 | 0.5%
1.18 | 1 | < 0.1%
1.5 | 11 | 0.1%
1.95 | 1 | < 0.1%
2.0 | 30 | 0.3%
2.35 | 1 | < 0.1%
2.5 | 11 | 0.1%
2.53 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
23640.0 | 1 | < 0.1%
99.0 | 1 | < 0.1%
87.0 | 12 | 0.1%
81.4 | 1 | < 0.1%
78.0 | 1 | < 0.1%
73.0 | 1 | < 0.1%
70.0 | 1 | < 0.1%
65.0 | 9 | 0.1%
64.0 | 1 | < 0.1%
63.0 | 1 | < 0.1%

처리시설방법 (treatment facility method)
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

<NA>: 4882
액비화 (liquid fertilizer production): 3653
정화처리 (purification treatment): 1016
퇴비화 (composting): 295
바이오에너지화 (bioenergy conversion): 154

Length

Max length: 7
Median length: 4
Mean length: 3.6514
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 정화처리
2nd row: <NA>
3rd row: <NA>
4th row: 퇴비화
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 4882 | 48.8%
액비화 | 3653 | 36.5%
정화처리 | 1016 | 10.2%
퇴비화 | 295 | 2.9%
바이오에너지화 | 154 | 1.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Interactions

[Figures: nine interaction plots]

Correlations

[Figure: correlation heatmap]

 | 입력구분 | 축종 | 축분 | 배출자위탁량 | 배출자처리방법 | 운반자인수량 | 처리자인수량 | 처리시설방법
입력구분 | 1.000 | 0.000 | 0.000 | 0.000 | 0.074 | 0.000 | 0.000 | 0.074
축종 | 0.000 | 1.000 | 0.523 | 0.000 | 0.389 | 0.000 | 0.000 | 0.389
축분 | 0.000 | 0.523 | 1.000 | 0.000 | 0.563 | 0.000 | 0.000 | 0.563
배출자위탁량 | 0.000 | 0.000 | 0.000 | 1.000 | NaN | 0.707 | 0.707 | NaN
배출자처리방법 | 0.074 | 0.389 | 0.563 | NaN | 1.000 | NaN | NaN | 1.000
운반자인수량 | 0.000 | 0.000 | 0.000 | 0.707 | NaN | 1.000 | 0.707 | NaN
처리자인수량 | 0.000 | 0.000 | 0.000 | 0.707 | NaN | 0.707 | 1.000 | NaN
처리시설방법 | 0.074 | 0.389 | 0.563 | NaN | 1.000 | NaN | NaN | 1.000

[Figure: correlation heatmap]

 | 입력구분 | 축분 | 처리시설방법 | 배출자처리방법 | 축종
입력구분 | 1.000 | 0.000 | 0.049 | 0.049 | 0.000
축분 | 0.000 | 1.000 | 0.398 | 0.398 | 0.364
처리시설방법 | 0.049 | 0.398 | 1.000 | 1.000 | 0.379
배출자처리방법 | 0.049 | 0.398 | 1.000 | 1.000 | 0.379
축종 | 0.000 | 0.364 | 0.379 | 0.379 | 1.000

[Figure: correlation heatmap]

 | 배출자위탁량 | 운반자인수량 | 처리자인수량 | 입력구분 | 축종 | 축분 | 배출자처리방법 | 처리시설방법
배출자위탁량 | 1.000 | 1.000 | 0.999 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000
운반자인수량 | 1.000 | 1.000 | 0.999 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000
처리자인수량 | 0.999 | 0.999 | 1.000 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000
입력구분 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.049 | 0.049
축종 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.364 | 0.379 | 0.379
축분 | 0.000 | 0.000 | 0.000 | 0.000 | 0.364 | 1.000 | 0.398 | 0.398
배출자처리방법 | 1.000 | 1.000 | 1.000 | 0.049 | 0.379 | 0.398 | 1.000 | 1.000
처리시설방법 | 1.000 | 1.000 | 1.000 | 0.049 | 0.379 | 0.398 | 1.000 | 1.000
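
In every matrix, 배출자처리방법 and 처리시설방법 score 1.000 against each other, and their value counts elsewhere in this report are identical. That is what a categorical association measure returns when one column fully determines the other. The report does not name the measure behind each matrix, so the sketch below assumes an uncorrected Cramér's V purely for illustration; the `cramers_v` helper and both contingency tables are hypothetical.

```python
import math

def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 and n > 0 else 0.0

# Hypothetical 2x2 tables, not taken from the dataset.
identical = [[50, 0], [0, 50]]      # one column fully determines the other
independent = [[25, 25], [25, 25]]  # no association at all

v_identical = cramers_v(identical)
v_independent = cramers_v(independent)
print(v_identical, v_independent)
```

Values such as 0.379 or 0.398 sit between these two extremes, indicating a moderate association between the categorical columns.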

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

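Index | 입력구분 | 축종 | 축분 | 배출자위탁량 | 배출자처리방법 | 운반자인수량 | 처리자인수량 | 처리시설방법
56634 | I | 돼지 | 혼합 | 5.0 | 정화처리 | 5.0 | 5.0 | 정화처리
90539 | I | 돼지 | 혼합 | 8.39 | <NA> | 8.39 | 8.39 | <NA>
51872 | I | 돼지 | 혼합 | 14.0 | <NA> | 14.0 | 14.0 | <NA>
77949 | I | 돼지 | | 12.82 | 퇴비화 | 12.82 | 12.82 | 퇴비화
6952 | I | 돼지 | 혼합 | 5.12 | <NA> | 5.12 | 5.12 | <NA>
16043 | I | 돼지 | 혼합 | 13.26 | <NA> | 13.26 | 13.26 | <NA>
84398 | I | 돼지 | 혼합 | 7.59 | <NA> | 7.59 | 7.59 | <NA>
75651 | I | 돼지 | 혼합 | 6.0 | 액비화 | 6.0 | 6.0 | 액비화
84315 | I | 돼지 | 혼합 | 23.95 | <NA> | 23.95 | 23.95 | <NA>
83981 | I | 돼지 | 혼합 | 20.0 | 액비화 | 20.0 | 20.0 | 액비화

Index | 입력구분 | 축종 | 축분 | 배출자위탁량 | 배출자처리방법 | 운반자인수량 | 처리자인수량 | 처리시설방법
23807 | I | 돼지 | 혼합 | 8.0 | <NA> | 8.0 | 8.0 | <NA>
72981 | I | 돼지 | 혼합 | 20.0 | 액비화 | 20.0 | 20.0 | 액비화
40431 | I | 돼지 | 혼합 | 4.61 | <NA> | 4.61 | 4.61 | <NA>
72586 | I | 돼지 | 혼합 | 6.88 | <NA> | 6.88 | 6.88 | <NA>
10946 | I | 돼지 | 혼합 | 20.6 | <NA> | 20.6 | 20.6 | <NA>
33982 | I | 돼지 | 혼합 | 7.46 | <NA> | 7.46 | 7.46 | <NA>
36860 | I | 돼지 | 혼합 | 7.62 | <NA> | 7.62 | 7.62 | <NA>
1145 | I | 돼지 | 혼합 | 20.88 | <NA> | 20.88 | 20.88 | <NA>
65282 | I | 돼지 | 혼합 | 22.5 | 액비화 | 22.5 | 22.5 | 액비화
34989 | I | 돼지 | 혼합 | 7.97 | <NA> | 7.97 | 7.97 | <NA>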

Duplicate rows

Most frequently occurring

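Index | 입력구분 | 축종 | 축분 | 배출자위탁량 | 배출자처리방법 | 운반자인수량 | 처리자인수량 | 처리시설방법 | # duplicates
1023 | I | 돼지 | 혼합 | 23.0 | 액비화 | 23.0 | 23.0 | 액비화 | 394
1171 | I | 돼지 | 혼합 | 24.0 | 액비화 | 24.0 | 24.0 | 액비화 | 314
656 | I | 돼지 | 혼합 | 15.0 | 액비화 | 15.0 | 15.0 | 액비화 | 240
777 | I | 돼지 | 혼합 | 20.0 | 액비화 | 20.0 | 20.0 | 액비화 | 227
133 | I | 돼지 | 혼합 | 5.0 | 정화처리 | 5.0 | 5.0 | 정화처리 | 204
1267 | I | 돼지 | 혼합 | 25.0 | 액비화 | 25.0 | 25.0 | 액비화 | 198
837 | I | 돼지 | 혼합 | 21.0 | 액비화 | 21.0 | 21.0 | 액비화 | 154
446 | I | 돼지 | 혼합 | 8.0 | <NA> | 8.0 | 8.0 | <NA> | 121
219 | I | 돼지 | 혼합 | 6.0 | 액비화 | 6.0 | 6.0 | 액비화 | 107
601 | I | 돼지 | 혼합 | 14.0 | 액비화 | 14.0 | 14.0 | 액비화 | 93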
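
The ten patterns listed here already account for more rows than the 1306 "duplicate rows" reported in the Overview, so that headline figure presumably counts distinct duplicated row patterns rather than individual rows. The sketch below illustrates the difference between the two counts on a handful of hypothetical rows shaped like this dataset; `duplicate_groups` is an illustrative helper, not part of the profiling tool.

```python
from collections import Counter

def duplicate_groups(rows):
    """Return {row: count} for every row pattern that occurs more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

# Hypothetical rows, shaped like the report's eight columns.
rows = [
    ("I", "돼지", "혼합", 23.0, "액비화", 23.0, 23.0, "액비화"),
    ("I", "돼지", "혼합", 23.0, "액비화", 23.0, 23.0, "액비화"),
    ("I", "돼지", "혼합", 24.0, "액비화", 24.0, 24.0, "액비화"),
    ("I", "돼지", "혼합", 23.0, "액비화", 23.0, 23.0, "액비화"),
    ("I", "돼지", "혼합", 5.0, "정화처리", 5.0, 5.0, "정화처리"),
    ("I", "돼지", "혼합", 5.0, "정화처리", 5.0, 5.0, "정화처리"),
]
groups = duplicate_groups(rows)
n_patterns = len(groups)               # distinct duplicated patterns
n_rows_covered = sum(groups.values())  # rows belonging to those patterns

# '# duplicates' column of the ten most frequent patterns in the report.
top10 = [394, 314, 240, 227, 204, 198, 154, 121, 107, 93]
top10_rows = sum(top10)

print(n_patterns, n_rows_covered, top10_rows)
```

Since the top ten patterns alone cover 2052 rows, dropping exact duplicates would remove noticeably more than 13.1% of the records if every repeated occurrence is discarded.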