Overview

Dataset statistics

Number of variables: 3
Number of observations: 2697
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 68.6 KiB
Average record size in memory: 26.0 B

Variable types

Numeric: 1
Categorical: 2

Dataset

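Description: Data managed in the Livestock Manure Electronic Transfer Management System; specifically, the information registered as the discharge plans of liquid-fertilizer dischargers (business number, discharged-material code, report/permit management number, etc.).
Author: 한국환경공단 (Korea Environment Corporation)
URL: https://www.data.go.kr/data/15041923/fileData.do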

Alerts

Imbalance: 배출물 is highly imbalanced (96.8%)
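The report does not state how the 96.8% figure is derived. It is consistent with an imbalance score defined as one minus the Shannon entropy of the class counts, normalized by the maximum entropy for that number of classes. The sketch below assumes that definition; `imbalance_score` is a hypothetical helper name, and the counts (2688 and 9) are the 배출물 counts listed in this report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (base 2)
    of the class distribution and k is the number of classes.

    0.0 means perfectly balanced classes; values near 1.0 mean that
    one class dominates. This definition is an assumption, not a
    formula quoted from the report.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in counts if c > 0
    )
    return 1.0 - entropy / math.log2(k)


# Class counts for the 배출물 column as listed in this report.
score = imbalance_score([2688, 9])
print(f"{score:.1%}")
```

Under this assumption the two counts reproduce the alert value after rounding to one decimal place.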

Reproduction

Analysis started: 2023-12-11 23:33:34.564379
Analysis finished: 2023-12-11 23:33:34.859271
Duration: 0.29 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
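A report of this kind can be regenerated roughly as sketched below. This is a minimal sketch, not the configuration actually used: the CSV path, the output path, the `cp949` encoding, and the report title are all assumptions, and `build_report` is a hypothetical helper name. The dataset metadata values are copied from the Dataset section of this report.

```python
def build_report(csv_path, output_path="report.html", encoding="cp949"):
    """Profile a CSV file with ydata-profiling and write an HTML report.

    Imports are local so that defining this function does not require
    pandas or ydata-profiling to be installed.
    """
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the file is a Korean public-data CSV in cp949 encoding.
    df = pd.read_csv(csv_path, encoding=encoding)

    report = ProfileReport(
        df,
        title="Profiling report",  # hypothetical title
        dataset={
            "author": "한국환경공단",
            "url": "https://www.data.go.kr/data/15041923/fileData.do",
        },
    )
    report.to_file(output_path)
    return report
```

Calling `build_report("data.csv")` with a local copy of the dataset (the file name here is a placeholder) would write `report.html` next to the script.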

Variables

업체(개인)번호 (business or individual number)
Real number (ℝ)

Distinct: 2601
Distinct (%): 96.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0163149 × 10^9
Minimum: 2.0130001 × 10^9
Maximum: 2.0220003 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.8 KiB

Quantile statistics

Minimum: 2.0130001 × 10^9
5-th percentile: 2.014 × 10^9
Q1: 2.0160004 × 10^9
Median: 2.016003 × 10^9
Q3: 2.0170005 × 10^9
95-th percentile: 2.0200001 × 10^9
Maximum: 2.0220003 × 10^9
Range: 9000150
Interquartile range (IQR): 1000087
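The range and IQR follow directly from the other quantile statistics. The sketch below recomputes both as a sanity check. The exact minimum and maximum are the extreme values listed for this variable elsewhere in the report; the quartiles are only available here rounded to eight significant digits, so the IQR can be matched only approximately.

```python
# Exact extreme values of 업체(개인)번호 as listed in this report.
minimum = 2013000135
maximum = 2022000285

# Range is simply maximum minus minimum.
value_range = maximum - minimum
print(value_range)

# Quartiles as printed in the report (rounded), so this IQR is approximate.
q1_rounded = 2.0160004e9
q3_rounded = 2.0170005e9
iqr_approx = q3_rounded - q1_rounded
print(iqr_approx)
```

The recomputed range equals the reported 9000150 exactly, while the approximate IQR differs from the reported 1000087 only by the rounding of the quartiles.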

Descriptive statistics

Standard deviation: 1537168.6
Coefficient of variation (CV): 0.00076236534
Kurtosis: 2.7205696
Mean: 2.0163149 × 10^9
Median Absolute Deviation (MAD): 997047
Skewness: 0.87169758
Sum: 5.4380014 × 10^12
Variance: 2.3628874 × 10^12
Monotonicity: Not monotonic
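Several of these statistics are tied together by simple identities: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. The sketch below checks the reported figures against those identities. Because the inputs are the rounded values printed above, agreement is expected only to within rounding.

```python
n = 2697                 # number of observations
mean = 2.0163149e9       # reported mean (rounded)
std = 1537168.6          # reported standard deviation (rounded)

cv = std / mean          # coefficient of variation
variance = std ** 2      # variance is the square of the standard deviation
total = mean * n         # sum of all values

print(f"CV       = {cv:.8g}")
print(f"Variance = {variance:.8g}")
print(f"Sum      = {total:.8g}")
```

The very small CV reflects that the values are identifiers clustered around 2 × 10^9, so their spread is tiny relative to their magnitude.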
Histogram with fixed size bins (bins=50)
Value                  Count   Frequency (%)
2015000388             3       0.1%
2014000244             2       0.1%
2015000275             2       0.1%
2017001532             2       0.1%
2017001555             2       0.1%
2013000209             2       0.1%
2017001104             2       0.1%
2017001242             2       0.1%
2017001039             2       0.1%
2015000203             2       0.1%
Other values (2591)    2676    99.2%

Minimum 10 values

Value         Count   Frequency (%)
2013000135    2       0.1%
2013000136    1       < 0.1%
2013000149    1       < 0.1%
2013000151    1       < 0.1%
2013000160    1       < 0.1%
2013000168    1       < 0.1%
2013000174    1       < 0.1%
2013000177    1       < 0.1%
2013000178    1       < 0.1%
2013000196    1       < 0.1%

Maximum 10 values

Value         Count   Frequency (%)
2022000285    1       < 0.1%
2022000239    1       < 0.1%
2022000221    1       < 0.1%
2022000191    1       < 0.1%
2022000170    1       < 0.1%
2022000164    1       < 0.1%
2022000163    1       < 0.1%
2022000162    1       < 0.1%
2022000144    1       < 0.1%
2022000139    1       < 0.1%

배출물 (discharged material)
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.2 KiB

Value   Count
90      2688
91      9

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 90
2nd row: 90
3rd row: 90
4th row: 90
5th row: 90

Common Values

Value   Count   Frequency (%)
90      2688    99.7%
91      9       0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count   Frequency (%)
90      2688    99.7%
91      9       0.3%

신고허가관리번호 (report/permit management number)
Categorical

Distinct: 8
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 21.2 KiB

Value              Count
P02                1493
P01                650
P03                259
P04                175
P05                77
Other values (3)   43

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: P02
2nd row: P05
3rd row: P04
4th row: P01
5th row: P06

Common Values

Value   Count   Frequency (%)
P02     1493    55.4%
P01     650     24.1%
P03     259     9.6%
P04     175     6.5%
P05     77      2.9%
P06     35      1.3%
P07     6       0.2%
P08     2       0.1%
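The Frequency (%) column is each count divided by the number of observations, rounded to one decimal place. The sketch below recomputes the shares from the counts listed above, along with the "Other values (3)" total of 43 reported for the three least common categories. The variable names are illustrative only.

```python
# Category counts for 신고허가관리번호 as listed in this report.
counts = {
    "P02": 1493, "P01": 650, "P03": 259, "P04": 175,
    "P05": 77, "P06": 35, "P07": 6, "P08": 2,
}

total = sum(counts.values())  # should equal the number of observations

# Share of each category in percent, rounded to one decimal place.
shares = {code: round(100 * c / total, 1) for code, c in counts.items()}

# The summary groups everything after the five most common categories.
ranked = sorted(counts.values(), reverse=True)
other = sum(ranked[5:])

for code, share in shares.items():
    print(f"{code}: {counts[code]:>5}  {share}%")
print(f"Other values ({len(ranked) - 5}): {other}")
```

The counts add up to the 2697 observations, which also confirms that this column has no missing values.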

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count   Frequency (%)
P02     1493    55.4%
P01     650     24.1%
P03     259     9.6%
P04     175     6.5%
P05     77      2.9%
P06     35      1.3%
P07     6       0.2%
P08     2       0.1%

Interactions


Correlations

                    업체(개인)번호   배출물   신고허가관리번호
업체(개인)번호        1.000            0.082    0.276
배출물                0.082            1.000    0.000
신고허가관리번호      0.276            0.000    1.000

                    배출물   신고허가관리번호
배출물                1.000    0.000
신고허가관리번호      0.000    1.000

                    업체(개인)번호   배출물   신고허가관리번호
업체(개인)번호        1.000            0.063    0.135
배출물                0.063            1.000    0.000
신고허가관리번호      0.135            0.000    1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

      업체(개인)번호   배출물   신고허가관리번호
0     2013000315       90       P02
1     2013000319       90       P05
2     2013000319       90       P04
3     2013000326       90       P01
4     2013000327       90       P06
5     2013000333       90       P02
6     2013000336       90       P02
7     2013000336       90       P01
8     2013000338       90       P02
9     2013000343       90       P01

      업체(개인)번호   배출물   신고허가관리번호
2687  2019000269       90       P01
2688  2018010522       90       P02
2689  2020000741       90       P01
2690  2018010318       90       P01
2691  2018010478       90       P02
2692  2018010719       90       P04
2693  2019000040       90       P02
2694  2019000190       90       P02
2695  2020000099       90       P01
2696  2020000176       90       P01