Overview

Dataset statistics

Number of variables: 12
Number of observations: 330
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1
Duplicate rows (%): 0.3%
Total size in memory: 33.6 KiB
Average record size in memory: 104.4 B

Variable types

Numeric: 3
Categorical: 5
Text: 1
Boolean: 3

Dataset

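Description: Information registered as discharge and treatment plans for manure, out of the livestock manure and liquid fertilizer managed in the Livestock Manure Electronic Transfer Management System.
Author: Korea Environment Corporation (한국환경공단)
URL: https://www.data.go.kr/data/15041920/fileData.do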

Alerts

축종 has constant value "1" (Constant)
Dataset has 1 (0.3%) duplicate row (Duplicates)
업체(개인)번호 is highly overall correlated with 운반업체번호 and 2 other fields (High correlation)
운반업체번호 is highly overall correlated with 업체(개인)번호 and 2 other fields (High correlation)
처리업체번호 is highly overall correlated with 업체(개인)번호 and 2 other fields (High correlation)
운반업체신뢰여부 is highly overall correlated with 처리업체신뢰여부 (High correlation)
처리업체신뢰여부 is highly overall correlated with 운반업체신뢰여부 (High correlation)
대행작성허가여부 is highly overall correlated with 업체(개인)번호 and 2 other fields (High correlation)
처리방법 is highly imbalanced (67.0%) (Imbalance)
배출량_일 is highly imbalanced (92.2%) (Imbalance)
운반업체신뢰여부 is highly imbalanced (90.5%) (Imbalance)
처리업체신뢰여부 is highly imbalanced (90.5%) (Imbalance)
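
The report does not say how the imbalance percentages are computed. A plausible reading, offered here as an assumption, is one minus the Shannon entropy of the class counts divided by the maximum possible entropy for that number of classes. The sketch below applies that formula to the value counts listed in this report. The function name `imbalance_score` and the single-class convention are hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k): H is the Shannon entropy of the counts, k the class count."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumed convention: a single class gets no imbalance score
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)

# Value counts as listed in the Common Values tables of this report.
reported_counts = {
    "처리방법": [310, 20],
    "배출량_일": [325, 4, 1],
    "운반업체신뢰여부": [326, 4],
}

scores = {name: imbalance_score(c) for name, c in reported_counts.items()}
for name, score in scores.items():
    print(f"{name}: {score:.1%}")
```

Under this assumption the three scores come out at 67.0%, 92.2% and 90.5%, which agrees with the alerts above. 처리업체신뢰여부 has the same counts as 운반업체신뢰여부, so it gets the same score.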

Reproduction

Analysis started: 2023-12-12 18:02:39.275086
Analysis finished: 2023-12-12 18:02:41.058437
Duration: 1.78 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
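
The reported duration can be checked against the two timestamps. This is a minimal sketch using only the Python standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-12 18:02:39.275086")
finished = datetime.fromisoformat("2023-12-12 18:02:41.058437")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```

The difference is 1.783351 seconds, which rounds to the 1.78 seconds shown.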

Variables

업체(개인)번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 136
Distinct (%): 41.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0158769 × 10^9
Minimum: 2.0140001 × 10^9
Maximum: 2.0220001 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.0 KiB

Quantile statistics

Minimum: 2.0140001 × 10^9
5-th percentile: 2.0140002 × 10^9
Q1: 2.0140003 × 10^9
Median: 2.0160008 × 10^9
Q3: 2.0170007 × 10^9
95-th percentile: 2.0200007 × 10^9
Maximum: 2.0220001 × 10^9
Range: 8000061
Interquartile range (IQR): 3000400.5

Descriptive statistics

Standard deviation: 2003354.9
Coefficient of variation (CV): 0.00099378833
Kurtosis: 1.1903084
Mean: 2.0158769 × 10^9
Median Absolute Deviation (MAD): 1000810
Skewness: 1.2252156
Sum: 6.6523938 × 10^11
Variance: 4.013431 × 10^12
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value                Count  Frequency (%)
2014000278           84     25.5%
2016003575           27     8.2%
2017000692           19     5.8%
2015000301           6      1.8%
2016000815           6      1.8%
2017000712           6      1.8%
2014000326           6      1.8%
2016000814           6      1.8%
2022000011           5      1.5%
2015000004           4      1.2%
Other values (126)   161    48.8%

Value                Count  Frequency (%)
2014000058           1      0.3%
2014000104           1      0.3%
2014000106           1      0.3%
2014000109           2      0.6%
2014000111           1      0.3%
2014000121           1      0.3%
2014000130           2      0.6%
2014000145           1      0.3%
2014000146           1      0.3%
2014000182           2      0.6%

Value                Count  Frequency (%)
2022000119           1      0.3%
2022000100           1      0.3%
2022000058           1      0.3%
2022000051           1      0.3%
2022000011           5      1.5%
2021000499           1      0.3%
2021000385           1      0.3%
2021000251           1      0.3%
2021000162           1      0.3%
2020000752           4      1.2%
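
Several of the statistics reported for 업체(개인)번호 are tied together by simple identities: CV is standard deviation over mean, variance is the square of the standard deviation, sum is mean times the number of observations, and range is maximum minus minimum. The sketch below re-derives them from the rounded figures printed in this report, so small differences in the last digits are expected. The variable names are illustrative.

```python
import math

n = 330                      # number of observations
mean = 2.0158769e9           # reported mean (rounded)
std = 2003354.9              # reported standard deviation
minimum = 2014000058         # smallest value listed in the tables
maximum = 2022000119         # largest value listed in the tables

cv = std / mean              # coefficient of variation
variance = std ** 2
total = mean * n
value_range = maximum - minimum

print(f"CV       = {cv:.8g}")
print(f"Variance = {variance:.7g}")
print(f"Sum      = {total:.8g}")
print(f"Range    = {value_range}")

# Compare with the reported figures, allowing for rounding in the inputs.
assert math.isclose(cv, 0.00099378833, rel_tol=1e-6)
assert math.isclose(variance, 4.013431e12, rel_tol=1e-6)
assert math.isclose(total, 6.6523938e11, rel_tol=1e-6)
```

Because the values are ten-digit identifiers that begin with what looks like a four-digit year, these statistics describe the spread of ID numbers rather than any measured quantity. That also explains the very small CV.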

축종
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      330    100.0%

축분
Categorical

Distinct: 4
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4
2nd row: 4
3rd row: 3
4th row: 3
5th row: 4

Common Values

Value  Count  Frequency (%)
3      215    65.2%
4      104    31.5%
1      6      1.8%
2      5      1.5%

처리방법
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      310    93.9%
4      20     6.1%

운반업체번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 58
Distinct (%): 17.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0154527 × 10^9
Minimum: 2.0140002 × 10^9
Maximum: 2.0210005 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.0 KiB

Quantile statistics

Minimum: 2.0140002 × 10^9
5-th percentile: 2.0140002 × 10^9
Q1: 2.0140003 × 10^9
Median: 2.0150004 × 10^9
Q3: 2.0160036 × 10^9
95-th percentile: 2.0190004 × 10^9
Maximum: 2.0210005 × 10^9
Range: 7000262
Interquartile range (IQR): 2003297

Descriptive statistics

Standard deviation: 1710159
Coefficient of variation (CV): 0.00084852354
Kurtosis: 1.0228829
Mean: 2.0154527 × 10^9
Median Absolute Deviation (MAD): 1000129
Skewness: 1.248864
Sum: 6.6509937 × 10^11
Variance: 2.9246439 × 10^12
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value                Count  Frequency (%)
2014000278           94     28.5%
2016003575           43     13.0%
2014000275           24     7.3%
2016002849           22     6.7%
2019000409           19     5.8%
2016000814           15     4.5%
2016000952           10     3.0%
2014000237           10     3.0%
2017000620           8      2.4%
2014000247           7      2.1%
Other values (48)    78     23.6%

Value                Count  Frequency (%)
2014000237           10     3.0%
2014000244           1      0.3%
2014000247           7      2.1%
2014000272           2      0.6%
2014000275           24     7.3%
2014000278           94     28.5%
2014000279           2      0.6%
2014000308           2      0.6%
2014000320           1      0.3%
2014000326           4      1.2%

Value                Count  Frequency (%)
2021000499           1      0.3%
2021000384           1      0.3%
2021000162           1      0.3%
2020000752           4      1.2%
2020000618           1      0.3%
2020000562           2      0.6%
2020000519           1      0.3%
2020000290           1      0.3%
2019000409           19     5.8%
2019000257           1      0.3%

처리업체번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 66
Distinct (%): 20.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0153345 × 10^9
Minimum: 2.0140001 × 10^9
Maximum: 2.0220001 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.0 KiB

Quantile statistics

Minimum: 2.0140001 × 10^9
5-th percentile: 2.0140002 × 10^9
Q1: 2.0140003 × 10^9
Median: 2.0160006 × 10^9
Q3: 2.0160036 × 10^9
95-th percentile: 2.0170012 × 10^9
Maximum: 2.0220001 × 10^9
Range: 8000064
Interquartile range (IQR): 2003297

Descriptive statistics

Standard deviation: 1470831.3
Coefficient of variation (CV): 0.00072981996
Kurtosis: 3.829213
Mean: 2.0153345 × 10^9
Median Absolute Deviation (MAD): 1000327.5
Skewness: 1.5483613
Sum: 6.6506038 × 10^11
Variance: 2.1633448 × 10^12
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value                Count  Frequency (%)
2014000278           90     27.3%
2016003575           43     13.0%
2014000275           23     7.0%
2016003640           21     6.4%
2016000782           19     5.8%
2016000814           12     3.6%
2016000952           10     3.0%
2014000237           10     3.0%
2017000620           8      2.4%
2016000815           6      1.8%
Other values (56)    88     26.7%

Value                Count  Frequency (%)
2014000058           2      0.6%
2014000130           1      0.3%
2014000237           10     3.0%
2014000243           1      0.3%
2014000244           1      0.3%
2014000247           4      1.2%
2014000271           1      0.3%
2014000275           23     7.0%
2014000278           90     27.3%
2014000279           1      0.3%

Value                Count  Frequency (%)
2022000122           1      0.3%
2022000058           1      0.3%
2021000384           1      0.3%
2021000162           1      0.3%
2020000752           2      0.6%
2020000562           2      0.6%
2020000290           2      0.6%
2019000257           1      0.3%
2019000229           2      0.6%
2019000175           2      0.6%

저장조번호
Text

Distinct: 57
Distinct (%): 17.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 1320
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 18
Unique (%): 5.5%

Sample

1st row: R001
2nd row: R001
3rd row: R001
4th row: R021
5th row: R001

Value               Count  Frequency (%)
r001                79     23.9%
r002                57     17.3%
r049                23     7.0%
r007                13     3.9%
r005                8      2.4%
r003                8      2.4%
r014                7      2.1%
r017                7      2.1%
r009                7      2.1%
r008                7      2.1%
Other values (47)   114    34.5%

Most occurring characters

Value  Count  Frequency (%)
0      536    40.6%
R      330    25.0%
1      141    10.7%
2      101    7.7%
4      51     3.9%
9      38     2.9%
3      38     2.9%
7      25     1.9%
5      22     1.7%
8      21     1.6%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    990    75.0%
Uppercase Letter  330    25.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      536    54.1%
1      141    14.2%
2      101    10.2%
4      51     5.2%
9      38     3.8%
3      38     3.8%
7      25     2.5%
5      22     2.2%
8      21     2.1%
6      17     1.7%

Uppercase Letter
Value  Count  Frequency (%)
R      330    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  990    75.0%
Latin   330    25.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      536    54.1%
1      141    14.2%
2      101    10.2%
4      51     5.2%
9      38     3.8%
3      38     3.8%
7      25     2.5%
5      22     2.2%
8      21     2.1%
6      17     1.7%

Latin
Value  Count  Frequency (%)
R      330    100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1320   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      536    40.6%
R      330    25.0%
1      141    10.7%
2      101    7.7%
4      51     3.9%
9      38     2.9%
3      38     2.9%
7      25     1.9%
5      22     1.7%
8      21     1.6%

저장조유입구분
Categorical

Distinct: 2
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 2
4th row: 1
5th row: 2

Common Values

Value  Count  Frequency (%)
1      199    60.3%
2      131    39.7%

배출량_일
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

Length

Max length: 4
Median length: 3
Mean length: 3.0151515
Min length: 3

Unique

Unique: 1
Unique (%): 0.3%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value  Count  Frequency (%)
0.0    325    98.5%
16.0   4      1.2%
1.67   1      0.3%

운반업체신뢰여부
Boolean

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 462.0 B

Value  Count  Frequency (%)
True   326    98.8%
False  4      1.2%

처리업체신뢰여부
Boolean

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 462.0 B

Value  Count  Frequency (%)
True   326    98.8%
False  4      1.2%

대행작성허가여부
Boolean

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 462.0 B

Value  Count  Frequency (%)
False  201    60.9%
True   129    39.1%

Interactions

[Figure: pairwise interaction plots, 9 panels]

Correlations

 | 업체(개인)번호 | 축분 | 처리방법 | 운반업체번호 | 처리업체번호 | 저장조번호 | 저장조유입구분 | 배출량_일 | 운반업체신뢰여부 | 처리업체신뢰여부 | 대행작성허가여부
업체(개인)번호 | 1.000 | 0.402 | 0.215 | 0.842 | 0.812 | 0.000 | 0.456 | 0.000 | 0.244 | 0.244 | 0.518
축분 | 0.402 | 1.000 | 0.244 | 0.490 | 0.243 | 0.000 | 0.227 | 0.000 | 0.268 | 0.268 | 0.272
처리방법 | 0.215 | 0.244 | 1.000 | 0.342 | 0.262 | 0.562 | 0.159 | 0.000 | 0.211 | 0.211 | 0.338
운반업체번호 | 0.842 | 0.490 | 0.342 | 1.000 | 0.960 | 0.000 | 0.383 | 0.000 | 0.365 | 0.365 | 0.824
처리업체번호 | 0.812 | 0.243 | 0.262 | 0.960 | 1.000 | 0.000 | 0.414 | 0.000 | 0.248 | 0.248 | 0.719
저장조번호 | 0.000 | 0.000 | 0.562 | 0.000 | 0.000 | 1.000 | 0.157 | 0.000 | 0.000 | 0.000 | 0.373
저장조유입구분 | 0.456 | 0.227 | 0.159 | 0.383 | 0.414 | 0.157 | 1.000 | 0.073 | 0.000 | 0.000 | 0.431
배출량_일 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.073 | 1.000 | 0.000 | 0.000 | 0.049
운반업체신뢰여부 | 0.244 | 0.268 | 0.211 | 0.365 | 0.248 | 0.000 | 0.000 | 0.000 | 1.000 | 0.980 | 0.000
처리업체신뢰여부 | 0.244 | 0.268 | 0.211 | 0.365 | 0.248 | 0.000 | 0.000 | 0.000 | 0.980 | 1.000 | 0.000
대행작성허가여부 | 0.518 | 0.272 | 0.338 | 0.824 | 0.719 | 0.373 | 0.431 | 0.049 | 0.000 | 0.000 | 1.000

 | 대행작성허가여부 | 저장조유입구분 | 배출량_일 | 운반업체신뢰여부 | 처리방법 | 축분 | 처리업체신뢰여부
대행작성허가여부 | 1.000 | 0.283 | 0.080 | 0.000 | 0.219 | 0.180 | 0.000
저장조유입구분 | 0.283 | 1.000 | 0.121 | 0.000 | 0.101 | 0.150 | 0.000
배출량_일 | 0.080 | 0.121 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000
운반업체신뢰여부 | 0.000 | 0.000 | 0.000 | 1.000 | 0.135 | 0.177 | 0.873
처리방법 | 0.219 | 0.101 | 0.000 | 0.135 | 1.000 | 0.162 | 0.135
축분 | 0.180 | 0.150 | 0.000 | 0.177 | 0.162 | 1.000 | 0.177
처리업체신뢰여부 | 0.000 | 0.000 | 0.000 | 0.873 | 0.135 | 0.177 | 1.000

 | 업체(개인)번호 | 운반업체번호 | 처리업체번호 | 축분 | 처리방법 | 저장조유입구분 | 배출량_일 | 운반업체신뢰여부 | 처리업체신뢰여부 | 대행작성허가여부
업체(개인)번호 | 1.000 | 0.639 | 0.650 | 0.265 | 0.213 | 0.454 | 0.000 | 0.241 | 0.241 | 0.515
운반업체번호 | 0.639 | 1.000 | 0.805 | 0.240 | 0.259 | 0.288 | 0.000 | 0.274 | 0.274 | 0.601
처리업체번호 | 0.650 | 0.805 | 1.000 | 0.113 | 0.196 | 0.301 | 0.000 | 0.185 | 0.185 | 0.539
축분 | 0.265 | 0.240 | 0.113 | 1.000 | 0.162 | 0.150 | 0.000 | 0.177 | 0.177 | 0.180
처리방법 | 0.213 | 0.259 | 0.196 | 0.162 | 1.000 | 0.101 | 0.000 | 0.135 | 0.135 | 0.219
저장조유입구분 | 0.454 | 0.288 | 0.301 | 0.150 | 0.101 | 1.000 | 0.121 | 0.000 | 0.000 | 0.283
배출량_일 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.121 | 1.000 | 0.000 | 0.000 | 0.080
운반업체신뢰여부 | 0.241 | 0.274 | 0.185 | 0.177 | 0.135 | 0.000 | 0.000 | 1.000 | 0.873 | 0.000
처리업체신뢰여부 | 0.241 | 0.274 | 0.185 | 0.177 | 0.135 | 0.000 | 0.000 | 0.873 | 1.000 | 0.000
대행작성허가여부 | 0.515 | 0.601 | 0.539 | 0.180 | 0.219 | 0.283 | 0.080 | 0.000 | 0.000 | 1.000
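
The "High correlation" alerts at the top of the report can be read off the first (11-variable) matrix. The sketch below is illustrative only and rests on two assumptions the report does not state: an alert threshold of 0.5, and no alert for the Text variable 저장조번호. The function name `high_correlations` is hypothetical.

```python
# Column order and values of the first correlation matrix in this report.
names = [
    "업체(개인)번호", "축분", "처리방법", "운반업체번호", "처리업체번호", "저장조번호",
    "저장조유입구분", "배출량_일", "운반업체신뢰여부", "처리업체신뢰여부", "대행작성허가여부",
]
matrix = [
    [1.000, 0.402, 0.215, 0.842, 0.812, 0.000, 0.456, 0.000, 0.244, 0.244, 0.518],
    [0.402, 1.000, 0.244, 0.490, 0.243, 0.000, 0.227, 0.000, 0.268, 0.268, 0.272],
    [0.215, 0.244, 1.000, 0.342, 0.262, 0.562, 0.159, 0.000, 0.211, 0.211, 0.338],
    [0.842, 0.490, 0.342, 1.000, 0.960, 0.000, 0.383, 0.000, 0.365, 0.365, 0.824],
    [0.812, 0.243, 0.262, 0.960, 1.000, 0.000, 0.414, 0.000, 0.248, 0.248, 0.719],
    [0.000, 0.000, 0.562, 0.000, 0.000, 1.000, 0.157, 0.000, 0.000, 0.000, 0.373],
    [0.456, 0.227, 0.159, 0.383, 0.414, 0.157, 1.000, 0.073, 0.000, 0.000, 0.431],
    [0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.073, 1.000, 0.000, 0.000, 0.049],
    [0.244, 0.268, 0.211, 0.365, 0.248, 0.000, 0.000, 0.000, 1.000, 0.980, 0.000],
    [0.244, 0.268, 0.211, 0.365, 0.248, 0.000, 0.000, 0.000, 0.980, 1.000, 0.000],
    [0.518, 0.272, 0.338, 0.824, 0.719, 0.373, 0.431, 0.049, 0.000, 0.000, 1.000],
]

THRESHOLD = 0.5        # assumed cut-off; not stated in the report
SKIP = {"저장조번호"}   # assumed: the Text variable raises no correlation alert

def high_correlations(names, matrix, threshold, skip):
    """Map each variable to the other variables it correlates with at or above threshold."""
    result = {}
    for i, row_name in enumerate(names):
        if row_name in skip:
            continue
        partners = [
            names[j]
            for j, value in enumerate(matrix[i])
            if j != i and names[j] not in skip and value >= threshold
        ]
        if partners:
            result[row_name] = partners
    return result

high = high_correlations(names, matrix, THRESHOLD, SKIP)
for name, partners in high.items():
    print(name, "->", ", ".join(partners))
```

With these assumptions the sketch flags the same six variables as the alerts, each with the same number of partner fields. Without the skip set, 처리방법 and 저장조번호 (0.562) would also be flagged, although the report shows no such alert.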

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

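 | 업체(개인)번호 | 축종 | 축분 | 처리방법 | 운반업체번호 | 처리업체번호 | 저장조번호 | 저장조유입구분 | 배출량_일 | 운반업체신뢰여부 | 처리업체신뢰여부 | 대행작성허가여부
0 | 2014000145 | 1 | 4 | 1 | 2016003575 | 2016003575 | R001 | 2 | 0.0 | Y | Y | N
1 | 2014000146 | 1 | 4 | 1 | 2016003575 | 2016003575 | R001 | 2 | 0.0 | Y | Y | N
2 | 2014000182 | 1 | 3 | 1 | 2014000237 | 2014000237 | R001 | 2 | 0.0 | Y | Y | Y
3 | 2014000182 | 1 | 3 | 1 | 2014000237 | 2014000237 | R021 | 1 | 0.0 | Y | Y | Y
4 | 2014000188 | 1 | 4 | 1 | 2016003575 | 2016003575 | R001 | 2 | 0.0 | Y | Y | N
5 | 2014000192 | 1 | 4 | 1 | 2016003575 | 2016003575 | R001 | 2 | 0.0 | Y | Y | N
6 | 2014000194 | 1 | 3 | 1 | 2014000237 | 2014000237 | R021 | 1 | 0.0 | Y | Y | Y
7 | 2014000058 | 1 | 3 | 1 | 2014000247 | 2014000247 | R002 | 2 | 0.0 | Y | Y | N
8 | 2014000104 | 1 | 4 | 1 | 2016003575 | 2016003575 | R001 | 2 | 0.0 | Y | Y | N
9 | 2014000106 | 1 | 4 | 1 | 2016003575 | 2016003575 | R002 | 2 | 0.0 | Y | Y | N

 | 업체(개인)번호 | 축종 | 축분 | 처리방법 | 운반업체번호 | 처리업체번호 | 저장조번호 | 저장조유입구분 | 배출량_일 | 운반업체신뢰여부 | 처리업체신뢰여부 | 대행작성허가여부
320 | 2020000562 | 1 | 3 | 1 | 2020000562 | 2020000562 | R002 | 2 | 0.0 | Y | Y | N
321 | 2020000562 | 1 | 3 | 1 | 2020000562 | 2020000562 | R001 | 2 | 0.0 | Y | Y | N
322 | 2018010750 | 1 | 3 | 1 | 2015000431 | 2016000633 | R001 | 1 | 0.0 | Y | Y | Y
323 | 2020000581 | 1 | 4 | 1 | 2016003575 | 2016003575 | R001 | 2 | 0.0 | Y | Y | N
324 | 2019000292 | 1 | 3 | 1 | 2016000814 | 2016000814 | R008 | 1 | 0.0 | Y | Y | Y
325 | 2019000292 | 1 | 3 | 1 | 2016000814 | 2016000814 | R002 | 1 | 0.0 | Y | Y | Y
326 | 2019000292 | 1 | 3 | 1 | 2016000814 | 2016000814 | R001 | 1 | 0.0 | Y | Y | Y
327 | 2018010610 | 1 | 3 | 4 | 2015000358 | 2016000044 | R001 | 1 | 0.0 | Y | Y | Y
328 | 2018010604 | 1 | 4 | 1 | 2016000952 | 2016000952 | R002 | 1 | 0.0 | Y | Y | Y
329 | 2019000257 | 1 | 3 | 1 | 2019000257 | 2019000257 | R001 | 1 | 0.0 | Y | Y | N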

Duplicate rows

Most frequently occurring

 | 업체(개인)번호 | 축종 | 축분 | 처리방법 | 운반업체번호 | 처리업체번호 | 저장조번호 | 저장조유입구분 | 배출량_일 | 운반업체신뢰여부 | 처리업체신뢰여부 | 대행작성허가여부 | # duplicates
0 | 2014000247 | 1 | 3 | 1 | 2014000247 | 2014000247 | R001 | 1 | 0.0 | Y | Y | N | 2
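
A duplicate-row count like the one above can be reproduced by counting identical records. The sketch below uses a toy subset, not the full 330-row dataset: the duplicated record from this table appears twice, next to two distinct records copied from the sample rows. The variable names are illustrative.

```python
from collections import Counter

# Toy subset: each tuple holds the 12 column values of one record, as strings.
duplicated_record = ("2014000247", "1", "3", "1", "2014000247", "2014000247",
                     "R001", "1", "0.0", "Y", "Y", "N")
rows = [
    duplicated_record,
    ("2014000145", "1", "4", "1", "2016003575", "2016003575",
     "R001", "2", "0.0", "Y", "Y", "N"),
    duplicated_record,
    ("2014000146", "1", "4", "1", "2016003575", "2016003575",
     "R001", "2", "0.0", "Y", "Y", "N"),
]

counts = Counter(rows)
# Keep only records that occur more than once, with their occurrence count.
duplicated = {row: n for row, n in counts.items() if n > 1}

print("distinct duplicated records:", len(duplicated))
for row, n in duplicated.items():
    print(row[0], row[6], "occurs", n, "times")
```

On the full dataset, one duplicated record out of 330 observations gives 1 / 330 ≈ 0.3%, which is the figure in the overview.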