Overview

Dataset statistics

Number of variables: 9
Number of observations: 10000
Missing cells: 134
Missing cells (%): 0.1%
Duplicate rows: 1577
Duplicate rows (%): 15.8%
Total size in memory: 820.3 KiB
Average record size in memory: 84.0 B
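
The percentages and the average record size follow from the raw counts. The sketch below recomputes them from the figures in this block; the assumption that missing cells are measured against rows × columns is inferred from the numbers, not stated by the report.

```python
# Counts copied from the "Dataset statistics" block.
n_vars = 9
n_obs = 10_000
missing_cells = 134
duplicate_rows = 1577
total_size_kib = 820.3

# Assumption: missing cells are measured against every cell (rows x columns).
missing_pct = 100 * missing_cells / (n_obs * n_vars)
# Duplicate rows are measured against the number of observations.
duplicate_pct = 100 * duplicate_rows / n_obs
# Average record size: total bytes divided by the number of rows.
avg_record_bytes = total_size_kib * 1024 / n_obs

print(f"missing cells: {missing_pct:.1f}%")
print(f"duplicate rows: {duplicate_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

All three printed values agree with the report at one decimal place.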

Variable types

Numeric: 3
DateTime: 3
Categorical: 2
Boolean: 1

Dataset

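Description: Information registered for handover slips covering the transport of liquid fertilizer, one category of the livestock manure managed in the Livestock Manure Electronic Handover Management System (가축분뇨 전자인계관리시스템).
Author: Korea Environment Corporation (한국환경공단)
URL: https://www.data.go.kr/data/15041900/fileData.do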

Alerts

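Dataset has 1577 (15.8%) duplicate rows [Duplicates]
인수량(톤) (received amount, tons) is highly overall correlated with 살포량(톤) (applied amount, tons) [High correlation]
살포량(톤) (applied amount, tons) is highly overall correlated with 인수량(톤) (received amount, tons) [High correlation]
보관장소 경유여부 (routed via storage site) is highly imbalanced (70.9%) [Imbalance]
살포일자 (application date) has 134 (1.3%) missing values [Missing]
살포량(톤) (applied amount, tons) has 167 (1.7%) zeros [Zeros]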
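
The 70.9% imbalance figure can be reproduced from the class counts reported for 보관장소 경유여부 (False 9488, True 512). The sketch below assumes the score is one minus the Shannon entropy normalized by its maximum. That formula is an assumption that happens to match the reported value, and `imbalance_score` is a hypothetical helper, not a function of the profiling library.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    max_entropy = math.log2(len(counts))
    # A single-class column has max_entropy == 0; treat it as score 0 here.
    return 1 - entropy / max_entropy if max_entropy > 0 else 0.0

score = imbalance_score([9488, 512])
print(f"{score:.1%}")
```

Under this definition a perfectly balanced column scores 0 and a column dominated by one class approaches 1.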

Reproduction

Analysis started: 2023-12-12 05:58:32.113444
Analysis finished: 2023-12-12 05:58:34.170936
Duration: 2.06 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

운반업체번호 (transport company number)
Real number (ℝ)

Distinct: 342
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0156373 × 10^9
Minimum: 2.0130001 × 10^9
Maximum: 2.021 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2.0130001 × 10^9
5-th percentile: 2.0130004 × 10^9
Q1: 2.0150003 × 10^9
median: 2.0160008 × 10^9
Q3: 2.0160033 × 10^9
95-th percentile: 2.0180105 × 10^9
Maximum: 2.021 × 10^9
Range: 7999876
Interquartile range (IQR): 1002948

Descriptive statistics

Standard deviation: 1532742.8
Coefficient of variation (CV): 0.00076042592
Kurtosis: 2.6708767
Mean: 2.0156373 × 10^9
Median Absolute Deviation (MAD): 1000399
Skewness: 0.89062485
Sum: 2.0156373 × 10^13
Variance: 2.3493006 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
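
As a rough illustration of what "fixed size bins" means, the sketch below splits the 운반업체번호 range into 50 equal-width bins. The endpoints 2013000149 and 2021000025 are the extreme values listed for this variable and are consistent with the reported range of 7999876. The helper names are hypothetical, and the profiler's exact binning rule is not confirmed by the report.

```python
def fixed_bin_edges(minimum, maximum, bins):
    """Return bins + 1 equally spaced edges covering [minimum, maximum]."""
    width = (maximum - minimum) / bins
    return [minimum + i * width for i in range(bins + 1)]

def bin_index(value, minimum, maximum, bins):
    """Map a value to its bin index; the maximum falls into the last bin."""
    if not minimum <= value <= maximum:
        raise ValueError("value outside histogram range")
    width = (maximum - minimum) / bins
    return min(int((value - minimum) / width), bins - 1)

LO, HI, BINS = 2013000149, 2021000025, 50

edges = fixed_bin_edges(LO, HI, BINS)
print(len(edges), edges[1] - edges[0])      # number of edges and bin width
print(bin_index(2016000829, LO, HI, BINS))  # bin of the most frequent value
```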
Value | Count | Frequency (%)
2016000829 | 255 | 2.5%
2021000007 | 237 | 2.4%
2015000203 | 200 | 2.0%
2015000311 | 149 | 1.5%
2013000410 | 132 | 1.3%
2013000369 | 130 | 1.3%
2017000168 | 125 | 1.2%
2016001702 | 122 | 1.2%
2016002010 | 119 | 1.2%
2015000094 | 117 | 1.2%
Other values (332) | 8414 | 84.1%

Value | Count | Frequency (%)
2013000149 | 4 | < 0.1%
2013000160 | 51 | 0.5%
2013000202 | 1 | < 0.1%
2013000217 | 7 | 0.1%
2013000238 | 1 | < 0.1%
2013000256 | 1 | < 0.1%
2013000259 | 4 | < 0.1%
2013000277 | 17 | 0.2%
2013000286 | 1 | < 0.1%
2013000297 | 2 | < 0.1%

Value | Count | Frequency (%)
2021000025 | 10 | 0.1%
2021000007 | 237 | 2.4%
2020000737 | 9 | 0.1%
2020000645 | 2 | < 0.1%
2020000642 | 7 | 0.1%
2020000618 | 3 | < 0.1%
2020000530 | 1 | < 0.1%
2020000527 | 1 | < 0.1%
2020000476 | 8 | 0.1%
2020000413 | 20 | 0.2%

인수일자 (receipt date)
Date

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2021-01-01 00:00:00
Maximum: 2021-04-01 00:00:00
Histogram with fixed size bins (bins=4)

인수량(톤) (received amount, tons)
Real number (ℝ)

HIGH CORRELATION

Distinct: 856
Distinct (%): 8.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.133698
Minimum: 2
Maximum: 253
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2
5-th percentile: 6.14
Q1: 13.2925
median: 16.6
Q3: 23
95-th percentile: 25
Maximum: 253
Range: 251
Interquartile range (IQR): 9.7075

Descriptive statistics

Standard deviation: 12.216639
Coefficient of variation (CV): 0.67369817
Kurtosis: 119.46595
Mean: 18.133698
Median Absolute Deviation (MAD): 6.4
Skewness: 8.387076
Sum: 181336.98
Variance: 149.24627
Monotonicity: Not monotonic
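
Several of the figures reported for 인수량(톤) are derived from others. The sketch below recomputes the coefficient of variation, variance, sum, interquartile range and range from the reported mean, standard deviation, quartiles and extremes. The inputs are the rounded values printed in the report, so agreement is only expected to the displayed precision.

```python
# Values copied from the 인수량(톤) statistics.
n = 10_000
mean = 18.133698
std = 12.216639
q1, q3 = 13.2925, 23.0
minimum, maximum = 2.0, 253.0

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
total = mean * n                 # sum of all observations
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV={cv:.6f} variance={variance:.4f} sum={total:.2f} "
      f"IQR={iqr:.4f} range={value_range:g}")
```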
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
15.0 | 1259 | 12.6%
23.0 | 1136 | 11.4%
24.0 | 869 | 8.7%
22.0 | 463 | 4.6%
8.0 | 458 | 4.6%
14.0 | 452 | 4.5%
20.0 | 450 | 4.5%
16.0 | 446 | 4.5%
21.0 | 370 | 3.7%
7.0 | 356 | 3.6%
Other values (846) | 3741 | 37.4%

Value | Count | Frequency (%)
2.0 | 5 | 0.1%
2.46 | 1 | < 0.1%
3.0 | 4 | < 0.1%
3.5 | 1 | < 0.1%
3.8 | 1 | < 0.1%
3.9 | 1 | < 0.1%
4.0 | 22 | 0.2%
4.4 | 1 | < 0.1%
4.5 | 20 | 0.2%
4.89 | 1 | < 0.1%

Value | Count | Frequency (%)
253.0 | 2 | < 0.1%
240.0 | 1 | < 0.1%
230.0 | 3 | < 0.1%
225.0 | 1 | < 0.1%
200.0 | 2 | < 0.1%
192.0 | 2 | < 0.1%
184.0 | 2 | < 0.1%
180.0 | 1 | < 0.1%
175.0 | 1 | < 0.1%
168.0 | 1 | < 0.1%

살포일자 (application date)
Date

MISSING

Distinct: 4
Distinct (%): < 0.1%
Missing: 134
Missing (%): 1.3%
Memory size: 156.2 KiB
Minimum: 2021-01-01 00:00:00
Maximum: 2021-04-01 00:00:00
Histogram with fixed size bins (bins=4)

살포량(톤) (applied amount, tons)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 857
Distinct (%): 8.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.838678
Minimum: 0
Maximum: 253
Zeros: 167
Zeros (%): 1.7%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 6
Q1: 13
median: 16
Q3: 23
95-th percentile: 25
Maximum: 253
Range: 253
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 12.380399
Coefficient of variation (CV): 0.69401997
Kurtosis: 113.75996
Mean: 17.838678
Median Absolute Deviation (MAD): 7
Skewness: 8.0517992
Sum: 178386.78
Variance: 153.27427
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
15.0 | 1222 | 12.2%
23.0 | 1126 | 11.3%
24.0 | 848 | 8.5%
8.0 | 458 | 4.6%
14.0 | 449 | 4.5%
16.0 | 446 | 4.5%
20.0 | 439 | 4.4%
22.0 | 431 | 4.3%
21.0 | 360 | 3.6%
7.0 | 356 | 3.6%
Other values (847) | 3865 | 38.6%

Value | Count | Frequency (%)
0.0 | 167 | 1.7%
2.0 | 5 | 0.1%
2.46 | 1 | < 0.1%
3.0 | 4 | < 0.1%
3.5 | 1 | < 0.1%
3.8 | 1 | < 0.1%
3.9 | 1 | < 0.1%
4.0 | 22 | 0.2%
4.4 | 1 | < 0.1%
4.5 | 19 | 0.2%

Value | Count | Frequency (%)
253.0 | 2 | < 0.1%
240.0 | 1 | < 0.1%
230.0 | 3 | < 0.1%
225.0 | 1 | < 0.1%
200.0 | 2 | < 0.1%
192.0 | 2 | < 0.1%
184.0 | 2 | < 0.1%
180.0 | 1 | < 0.1%
175.0 | 1 | < 0.1%
168.0 | 1 | < 0.1%

마감처리일자 (closing date)
Date

Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2021-01-01 00:00:00
Maximum: 2022-12-01 00:00:00
Histogram with fixed size bins (bins=9)

인계량입력업체구분 (type of company entering the handover amount)
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

E1: 6677
T1: 3323

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: E1
2nd row: E1
3rd row: E1
4th row: E1
5th row: E1

Common Values

Value | Count | Frequency (%)
E1 | 6677 | 66.8%
T1 | 3323 | 33.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
e1 | 6677 | 66.8%
t1 | 3323 | 33.2%

인계서입력구분 (handover slip entry type)
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

2: 6256
1: 3686
4: 45
5: 13

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 2
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
2 | 6256 | 62.6%
1 | 3686 | 36.9%
4 | 45 | 0.4%
5 | 13 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2 | 6256 | 62.6%
1 | 3686 | 36.9%
4 | 45 | 0.4%
5 | 13 | 0.1%

보관장소 경유여부 (routed via storage site)
Boolean

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 87.9 KiB

False: 9488
True: 512

Value | Count | Frequency (%)
False | 9488 | 94.9%
True | 512 | 5.1%

Interactions


Correlations

Variable | 운반업체번호 | 인수일자 | 인수량(톤) | 살포일자 | 살포량(톤) | 마감처리일자 | 인계량입력업체구분 | 인계서입력구분 | 보관장소 경유여부
운반업체번호 | 1.000 | 0.128 | 0.076 | 0.128 | 0.082 | 0.225 | 0.318 | 0.198 | 0.215
인수일자 | 0.128 | 1.000 | 0.038 | 1.000 | 0.054 | 0.890 | 0.060 | 0.119 | 0.017
인수량(톤) | 0.076 | 0.038 | 1.000 | 0.040 | 0.999 | 0.000 | 0.056 | 0.054 | 0.074
살포일자 | 0.128 | 1.000 | 0.040 | 1.000 | 0.054 | 0.891 | 0.056 | 0.116 | 0.035
살포량(톤) | 0.082 | 0.054 | 0.999 | 0.054 | 1.000 | 0.185 | 0.093 | 0.062 | 0.083
마감처리일자 | 0.225 | 0.890 | 0.000 | 0.891 | 0.185 | 1.000 | 0.090 | 0.158 | 0.029
인계량입력업체구분 | 0.318 | 0.060 | 0.056 | 0.056 | 0.093 | 0.090 | 1.000 | 0.617 | 0.094
인계서입력구분 | 0.198 | 0.119 | 0.054 | 0.116 | 0.062 | 0.158 | 0.617 | 1.000 | 0.033
보관장소 경유여부 | 0.215 | 0.017 | 0.074 | 0.035 | 0.083 | 0.029 | 0.094 | 0.033 | 1.000

Variable | 보관장소 경유여부 | 인계서입력구분 | 인계량입력업체구분
보관장소 경유여부 | 1.000 | 0.022 | 0.060
인계서입력구분 | 0.022 | 1.000 | 0.428
인계량입력업체구분 | 0.060 | 0.428 | 1.000

Variable | 운반업체번호 | 인수량(톤) | 살포량(톤) | 인계량입력업체구분 | 인계서입력구분 | 보관장소 경유여부
운반업체번호 | 1.000 | -0.060 | -0.058 | 0.318 | 0.128 | 0.214
인수량(톤) | -0.060 | 1.000 | 0.969 | 0.043 | 0.032 | 0.057
살포량(톤) | -0.058 | 0.969 | 1.000 | 0.071 | 0.037 | 0.063
인계량입력업체구분 | 0.318 | 0.043 | 0.071 | 1.000 | 0.428 | 0.060
인계서입력구분 | 0.128 | 0.032 | 0.037 | 0.428 | 1.000 | 0.022
보관장소 경유여부 | 0.214 | 0.057 | 0.063 | 0.060 | 0.022 | 1.000
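
To read a matrix like the last one quickly, it can help to list only the strongest off-diagonal pairs. The sketch below scans the six-variable matrix for entries whose absolute value reaches a cutoff. The cutoff of 0.5 and the helper `strong_pairs` are illustrative assumptions and are not confirmed as the rule behind the report's High correlation alerts.

```python
names = ["운반업체번호", "인수량(톤)", "살포량(톤)",
         "인계량입력업체구분", "인계서입력구분", "보관장소 경유여부"]

# Values copied from the six-variable correlation matrix.
matrix = [
    [1.000, -0.060, -0.058, 0.318, 0.128, 0.214],
    [-0.060, 1.000, 0.969, 0.043, 0.032, 0.057],
    [-0.058, 0.969, 1.000, 0.071, 0.037, 0.063],
    [0.318, 0.043, 0.071, 1.000, 0.428, 0.060],
    [0.128, 0.032, 0.037, 0.428, 1.000, 0.022],
    [0.214, 0.057, 0.063, 0.060, 0.022, 1.000],
]

def strong_pairs(names, matrix, cutoff):
    """Return (name_a, name_b, value) for upper-triangle entries with |value| >= cutoff."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(matrix[i][j]) >= cutoff:
                pairs.append((names[i], names[j], matrix[i][j]))
    # Strongest relationships first.
    return sorted(pairs, key=lambda p: -abs(p[2]))

CUTOFF = 0.5  # hypothetical cutoff chosen for illustration
for a, b, value in strong_pairs(names, matrix, CUTOFF):
    print(f"{a} <-> {b}: {value:.3f}")
```

With this cutoff the only pair returned is 인수량(톤) and 살포량(톤), the same pair named in the alerts.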

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
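
The two displays described here can be approximated in text form. The sketch below counts missing values per column and renders a small nullity matrix, one line per row. The rows are hypothetical toy data, not records from the dataset, and both helper functions are illustrative names.

```python
def null_counts(rows, columns):
    """Count missing (None) values per column."""
    return {col: sum(1 for row in rows if row.get(col) is None) for col in columns}

def nullity_matrix(rows, columns):
    """Render one text line per row: '#' for a present value, '.' for a missing one."""
    return ["".join("." if row.get(col) is None else "#" for col in columns)
            for row in rows]

# Hypothetical toy rows; only 살포일자 has a gap, mirroring the report's Missing alert.
columns = ["인수일자", "살포일자", "살포량(톤)"]
rows = [
    {"인수일자": "2021-01", "살포일자": "2021-01", "살포량(톤)": 7.0},
    {"인수일자": "2021-02", "살포일자": None, "살포량(톤)": 0.0},
    {"인수일자": "2021-03", "살포일자": "2021-03", "살포량(톤)": 15.0},
]

print(null_counts(rows, columns))
print("\n".join(nullity_matrix(rows, columns)))
```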

Sample

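(index) | 운반업체번호 | 인수일자 | 인수량(톤) | 살포일자 | 살포량(톤) | 마감처리일자 | 인계량입력업체구분 | 인계서입력구분 | 보관장소 경유여부
16291 | 2015000443 | 2021-01 | 7.0 | 2021-01 | 7.0 | 2021-02 | E1 | 1 | N
35652 | 2014000241 | 2021-02 | 16.0 | 2021-02 | 16.0 | 2021-02 | E1 | 1 | N
29332 | 2013000378 | 2021-02 | 15.0 | 2021-02 | 15.0 | 2021-02 | E1 | 2 | N
92330 | 2016000826 | 2021-04 | 23.43 | 2021-04 | 23.43 | 2021-05 | E1 | 1 | N
48370 | 2017000624 | 2021-03 | 15.0 | 2021-03 | 15.0 | 2021-03 | E1 | 1 | N
35047 | 2016000783 | 2021-02 | 8.0 | 2021-02 | 8.0 | 2021-02 | E1 | 2 | N
77366 | 2016001702 | 2021-03 | 7.0 | 2021-03 | 7.0 | 2021-04 | E1 | 2 | N
250 | 2016000744 | 2021-01 | 23.0 | 2021-01 | 23.0 | 2021-01 | E1 | 2 | N
86415 | 2015000418 | 2021-04 | 16.0 | 2021-04 | 16.0 | 2021-04 | E1 | 1 | N
94017 | 2013000379 | 2021-04 | 20.0 | 2021-04 | 20.0 | 2021-04 | E1 | 1 | N

(index) | 운반업체번호 | 인수일자 | 인수량(톤) | 살포일자 | 살포량(톤) | 마감처리일자 | 인계량입력업체구분 | 인계서입력구분 | 보관장소 경유여부
39248 | 2014000276 | 2021-02 | 22.0 | 2021-02 | 22.0 | 2021-02 | T1 | 2 | Y
63574 | 2017001023 | 2021-03 | 23.0 | 2021-03 | 23.0 | 2021-03 | T1 | 2 | N
12893 | 2016002854 | 2021-01 | 15.0 | 2021-01 | 15.0 | 2021-01 | E1 | 1 | N
46464 | 2017001023 | 2021-02 | 23.0 | 2021-02 | 23.0 | 2021-03 | T1 | 2 | N
1503 | 2015000366 | 2021-01 | 16.0 | 2021-01 | 16.0 | 2021-01 | T1 | 2 | N
58170 | 2016000963 | 2021-03 | 15.0 | 2021-03 | 15.0 | 2021-03 | E1 | 1 | N
27549 | 2016002010 | 2021-02 | 22.24 | 2021-02 | 22.24 | 2021-02 | E1 | 1 | N
2617 | 2016000829 | 2021-01 | 6.13 | 2021-01 | 6.13 | 2021-01 | E1 | 2 | N
3291 | 2015000352 | 2021-01 | 24.0 | 2021-01 | 24.0 | 2021-01 | T1 | 2 | N
94764 | 2016004105 | 2021-04 | 16.0 | 2021-04 | 16.0 | 2021-04 | E1 | 1 | N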

Duplicate rows

Most frequently occurring

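(index) | 운반업체번호 | 인수일자 | 인수량(톤) | 살포일자 | 살포량(톤) | 마감처리일자 | 인계량입력업체구분 | 인계서입력구분 | 보관장소 경유여부 | # duplicates
36 | 2013000369 | 2021-03 | 16.0 | 2021-03 | 16.0 | 2021-03 | E1 | 2 | N | 42
1152 | 2016003180 | 2021-03 | 7.4 | 2021-03 | 7.4 | 2021-03 | T1 | 2 | N | 41
113 | 2013000410 | 2021-02 | 23.0 | 2021-02 | 23.0 | 2021-02 | E1 | 2 | N | 38
1355 | 2017000624 | 2021-03 | 15.0 | 2021-03 | 15.0 | 2021-03 | E1 | 1 | N | 34
219 | 2014000243 | 2021-03 | 15.0 | 2021-03 | 15.0 | 2021-03 | E1 | 2 | N | 33
394 | 2015000311 | 2021-02 | 15.0 | 2021-02 | 15.0 | 2021-02 | T1 | 2 | Y | 33
398 | 2015000311 | 2021-03 | 15.0 | 2021-03 | 15.0 | 2021-03 | T1 | 2 | Y | 33
39 | 2013000369 | 2021-04 | 16.0 | 2021-04 | 16.0 | 2021-04 | E1 | 2 | N | 32
910 | 2016001154 | 2021-03 | 8.0 | 2021-03 | 8.0 | 2021-03 | T1 | 2 | N | 31
285 | 2014000278 | 2021-03 | 15.0 | 2021-03 | 15.0 | 2021-03 | E1 | 1 | N | 30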
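
A table of this shape, with each repeated row listed once alongside its number of occurrences, can be built by counting identical rows. The sketch below does this with `collections.Counter` on hypothetical toy tuples (company number, month, tons). It is one plausible way to tabulate duplicates, not necessarily how the report computes its Duplicate rows figure.

```python
from collections import Counter

def duplicate_groups(rows):
    """Return (row, count) pairs for rows seen more than once, most frequent first."""
    counts = Counter(rows)
    repeated = [(row, n) for row, n in counts.items() if n > 1]
    return sorted(repeated, key=lambda item: -item[1])

# Hypothetical toy rows; rows must be hashable, so tuples are used.
rows = [
    (2013000369, "2021-03", 16.0),
    (2016003180, "2021-03", 7.4),
    (2013000369, "2021-03", 16.0),
    (2013000369, "2021-03", 16.0),
    (2016003180, "2021-03", 7.4),
    (2017000624, "2021-03", 15.0),
]

for row, n in duplicate_groups(rows):
    print(row, n)
```

Rows that occur only once, such as the last toy row, are left out of the result.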