Overview

Dataset statistics

Number of variables: 12
Number of observations: 1122
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 110.8 KiB
Average record size in memory: 101.1 B

Variable types

Categorical: 8
Numeric: 3
Boolean: 1

Dataset

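Description: Emission activity (greenhouse gas) data from Korea District Heating Corporation (한국지역난방공사). It provides information such as fuel, prescribed calculation tier (규정산정등급), applied calculation tier (적용산정등급), and industry group (산업군).
Author: 한국지역난방공사 (Korea District Heating Corporation)
URL: https://www.data.go.kr/data/15124175/fileData.do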

Alerts

바이오매스사용관련활동여부 has constant value "" [Constant]
연료코드ID is highly overall correlated with 산정식ID순번 and 6 other fields [High correlation]
규정산정등급코드ID is highly overall correlated with 연료코드ID and 1 other field [High correlation]
배출활동코드ID is highly overall correlated with 산정식ID순번 and 2 other fields [High correlation]
적용산정등급코드ID is highly overall correlated with 연료코드ID and 1 other field [High correlation]
산정식ID is highly overall correlated with 산정식ID순번 and 2 other fields [High correlation]
산정식ID순번 is highly overall correlated with 배출활동코드ID and 2 other fields [High correlation]
배출활동순번 is highly overall correlated with 연료코드ID [High correlation]
산업군코드ID is highly overall correlated with 연료코드ID [High correlation]
배출활동순번 is highly imbalanced (67.3%) [Imbalance]
규정산정등급코드ID is highly imbalanced (68.9%) [Imbalance]
적용산정등급코드ID is highly imbalanced (68.9%) [Imbalance]
산업군코드ID is highly imbalanced (64.3%) [Imbalance]
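
The imbalance percentages in these alerts are consistent with one minus the normalized Shannon entropy of the category counts. The sketch below assumes that definition (the report itself does not state the formula) and applies it to the category counts listed in the Common Values tables of this report. The function name `imbalance_score` is illustrative only.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k): H is the Shannon entropy of the class
    proportions, k the number of classes (0 = balanced, near 1 = skewed)."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0  # assumption: the score is undefined here, so report 0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)

# Category counts copied from the Common Values tables of this report.
report_counts = {
    "배출활동순번": [1018, 80, 24],
    "규정산정등급코드ID": [1021, 85, 16],
    "적용산정등급코드ID": [1021, 85, 16],
    "산업군코드ID": [1046, 76],
}

scores = {name: round(100 * imbalance_score(c), 1) for name, c in report_counts.items()}
for name, pct in scores.items():
    print(f"{name}: {pct}%")
```

Under this assumed definition the four scores come out at 67.3%, 68.9%, 68.9%, and 64.3%, matching the alert values.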

Reproduction

Analysis started: 2023-12-12 16:22:28.283524
Analysis finished: 2023-12-12 16:22:30.105568
Duration: 1.82 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
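
A report of this kind can be regenerated roughly as follows. This is a minimal sketch, not the exact invocation used here: the CSV path, file encoding, and report title are hypothetical, and passing the metadata through the `dataset` option is an assumption based on the Description, Author, and URL fields shown above. The imports sit inside the function so the snippet loads even where pandas and ydata-profiling are not installed.

```python
def build_report(csv_path, output_html="report.html", encoding="utf-8"):
    """Profile a CSV file and write an HTML report.

    Requires pandas and ydata-profiling at call time.
    """
    # Lazy imports: the module can be loaded without the optional packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(
        df,
        title="Emission activity profile",  # hypothetical title
        # Assumption: dataset metadata is supplied via the `dataset` option.
        dataset={
            "author": "한국지역난방공사",
            "url": "https://www.data.go.kr/data/15124175/fileData.do",
        },
    )
    report.to_file(output_html)
    return output_html
```

Calling `build_report("emissions.csv")` (a hypothetical file name) would write `report.html` next to the script.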

Variables

기준연도
Categorical

Distinct: 4
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 8.9 KiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020
2nd row: 2020
3rd row: 2020
4th row: 2020
5th row: 2020

Common Values

Value | Count | Frequency (%)
2023 | 287 | 25.6%
2022 | 283 | 25.2%
2021 | 281 | 25.0%
2020 | 271 | 24.2%
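
The frequency column is each count divided by the number of observations (1122). A small sketch using only the standard library rebuilds the 기준연도 table from a synthetic column with the same per-year counts; the helper name `frequency_table` is illustrative, not part of the profiling tool.

```python
from collections import Counter

def frequency_table(values):
    """Return (value, count, percent) rows sorted by descending count."""
    counts = Counter(values)
    total = sum(counts.values())
    return [(v, c, round(100 * c / total, 1)) for v, c in counts.most_common()]

# Synthetic column with the same per-year counts as the report
# (row order does not matter for a frequency table).
years = ["2023"] * 287 + ["2022"] * 283 + ["2021"] * 281 + ["2020"] * 271
table = frequency_table(years)
for value, count, pct in table:
    print(f"{value} | {count} | {pct}%")
```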

Length

Histogram of lengths of the category

Common Values (Plot)

사업장순번
Real number (ℝ)

Distinct: 22
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.043672
Minimum: 3
Maximum: 26
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.0 KiB

Quantile statistics

Minimum: 3
5-th percentile: 4
Q1: 8
median: 13
Q3: 19
95-th percentile: 23
Maximum: 26
Range: 23
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 6.1370651
Coefficient of variation (CV): 0.47050134
Kurtosis: -1.0201953
Mean: 13.043672
Median Absolute Deviation (MAD): 5
Skewness: 0.2255296
Sum: 14635
Variance: 37.663568
Monotonicity: Not monotonic
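
Several of these figures are tied together by standard definitions: mean = sum / n, variance = (standard deviation)², CV = standard deviation / mean, range = max - min, and IQR = Q3 - Q1. The sketch below recomputes them from the values reported for 사업장순번 as a consistency check. It does not recompute anything from the raw data, which is not part of this report.

```python
# Figures copied from the 사업장순번 block of this report.
n = 1122            # number of observations
total = 14635       # Sum
std = 6.1370651     # Standard deviation
q1, q3 = 8, 19      # Q1 and Q3
minimum, maximum = 3, 26

mean = total / n                 # Mean = Sum / number of observations
variance = std ** 2              # Variance = (standard deviation)^2
cv = std / mean                  # Coefficient of variation = std / mean
value_range = maximum - minimum  # Range = max - min
iqr = q3 - q1                    # IQR = Q3 - Q1

print(f"mean={mean:.6f} variance={variance:.6f} cv={cv:.8f} "
      f"range={value_range} iqr={iqr}")
```

The recomputed values agree with the reported Mean (13.043672), Variance (37.663568), CV (0.47050134), Range (23), and IQR (11) to the precision shown.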
Histogram with fixed size bins (bins=22)

Common values

Value | Count | Frequency (%)
13 | 119 | 10.6%
5 | 100 | 8.9%
19 | 84 | 7.5%
9 | 84 | 7.5%
14 | 76 | 6.8%
4 | 72 | 6.4%
16 | 68 | 6.1%
22 | 65 | 5.8%
8 | 56 | 5.0%
11 | 44 | 3.9%
Other values (12) | 354 | 31.6%

Minimum 10 values

Value | Count | Frequency (%)
3 | 8 | 0.7%
4 | 72 | 6.4%
5 | 100 | 8.9%
6 | 29 | 2.6%
7 | 39 | 3.5%
8 | 56 | 5.0%
9 | 84 | 7.5%
10 | 42 | 3.7%
11 | 44 | 3.9%
12 | 42 | 3.7%

Maximum 10 values

Value | Count | Frequency (%)
26 | 14 | 1.2%
25 | 8 | 0.7%
24 | 27 | 2.4%
23 | 24 | 2.1%
22 | 65 | 5.8%
21 | 40 | 3.6%
20 | 44 | 3.9%
19 | 84 | 7.5%
16 | 68 | 6.1%
15 | 37 | 3.3%

배출시설순번
Real number (ℝ)

Distinct: 58
Distinct (%): 5.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.299465
Minimum: 1
Maximum: 82
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 6
median: 13
Q3: 23
95-th percentile: 50
Maximum: 82
Range: 81
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 15.868252
Coefficient of variation (CV): 0.91726833
Kurtosis: 3.317428
Mean: 17.299465
Median Absolute Deviation (MAD): 7.5
Skewness: 1.7700624
Sum: 19410
Variance: 251.80141
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
7 | 58 | 5.2%
3 | 56 | 5.0%
4 | 56 | 5.0%
6 | 51 | 4.5%
8 | 48 | 4.3%
5 | 47 | 4.2%
2 | 43 | 3.8%
10 | 43 | 3.8%
9 | 40 | 3.6%
1 | 40 | 3.6%
Other values (48) | 640 | 57.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 40 | 3.6%
2 | 43 | 3.8%
3 | 56 | 5.0%
4 | 56 | 5.0%
5 | 47 | 4.2%
6 | 51 | 4.5%
7 | 58 | 5.2%
8 | 48 | 4.3%
9 | 40 | 3.6%
10 | 43 | 3.8%

Maximum 10 values

Value | Count | Frequency (%)
82 | 4 | 0.4%
80 | 7 | 0.6%
70 | 8 | 0.7%
67 | 4 | 0.4%
66 | 8 | 0.7%
64 | 4 | 0.4%
63 | 4 | 0.4%
62 | 4 | 0.4%
61 | 4 | 0.4%
60 | 4 | 0.4%

배출활동순번
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 8.9 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 1018 | 90.7%
2 | 80 | 7.1%
3 | 24 | 2.1%

Length

Histogram of lengths of the category

Common Values (Plot)

배출활동코드ID
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 8.9 KiB

Length

Max length: 23
Median length: 6
Mean length: 8.7959002
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 기체연료연소
2nd row: 기체연료연소
3rd row: 기체연료연소
4th row: 기체연료연소
5th row: 기체연료연소

Common Values

Value | Count | Frequency (%)
기체연료연소 | 406 | 36.2%
간접배출(외부전기사용) | 314 | 28.0%
액체연료연소 | 305 | 27.2%
오존층파괴물질의 대체물질 사용(전기 설비) | 73 | 6.5%
석회 생산 | 12 | 1.1%
고체연료연소 | 8 | 0.7%
탄산염의 기타 공정사용 | 4 | 0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
기체연료연소 | 406 | 29.8%
간접배출(외부전기사용 | 314 | 23.1%
액체연료연소 | 305 | 22.4%
오존층파괴물질의 | 73 | 5.4%
대체물질 | 73 | 5.4%
사용(전기 | 73 | 5.4%
설비 | 73 | 5.4%
석회 | 12 | 0.9%
생산 | 12 | 0.9%
고체연료연소 | 8 | 0.6%
Other values (3) | 12 | 0.9%

연료코드ID
Categorical

HIGH CORRELATION 

Distinct: 18
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 8.9 KiB

Length

Max length: 12
Median length: 11
Mean length: 6.3074866
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 도시가스(LNG)
2nd row: 도시가스(LNG)
3rd row: 도시가스(LNG)
4th row: 도시가스(LNG)
5th row: 도시가스(LNG)

Common Values

Value | Count | Frequency (%)
도시가스(LNG) | 321 | 28.6%
전기 | 314 | 28.0%
가스/디젤 오일(경유) | 121 | 10.8%
SF6 | 73 | 6.5%
부생연료 1호 | 64 | 5.7%
B-C유 | 60 | 5.3%
천연가스(LNG) | 41 | 3.7%
도시가스(LPG) | 24 | 2.1%
매립지가스(LFG) | 20 | 1.8%
석회석 | 16 | 1.4%
Other values (8) | 68 | 6.1%

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
도시가스(lng | 321 | 24.5%
전기 | 314 | 24.0%
가스/디젤 | 121 | 9.2%
오일(경유 | 121 | 9.2%
sf6 | 73 | 5.6%
부생연료 | 64 | 4.9%
1호 | 64 | 4.9%
b-c유 | 60 | 4.6%
천연가스(lng | 41 | 3.1%
도시가스(lpg | 24 | 1.8%
Other values (11) | 108 | 8.2%

규정산정등급코드ID
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 8.9 KiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Tier1
2nd row: Tier1
3rd row: Tier1
4th row: Tier1
5th row: Tier1

Common Values

Value | Count | Frequency (%)
Tier1 | 1021 | 91.0%
Tier2 | 85 | 7.6%
Tier3 | 16 | 1.4%

Length

Histogram of lengths of the category

Common Values (Plot)

적용산정등급코드ID
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 8.9 KiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Tier1
2nd row: Tier1
3rd row: Tier1
4th row: Tier1
5th row: Tier1

Common Values

Value | Count | Frequency (%)
Tier1 | 1021 | 91.0%
Tier2 | 85 | 7.6%
Tier3 | 16 | 1.4%

Length

Histogram of lengths of the category

Common Values (Plot)

산정식ID
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 8.9 KiB

Length

Max length: 12
Median length: 7
Mean length: 4.8983957
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 제조업/건설업
2nd row: 제조업/건설업
3rd row: 제조업/건설업
4th row: 제조업/건설업
5th row: 제조업/건설업

Common Values

Value | Count | Frequency (%)
제조업/건설업 | 406 | 36.2%
어선 | 314 | 28.0%
상업/공공 | 305 | 27.2%
소각보일러 | 73 | 6.5%
공정배출 | 12 | 1.1%
에너지산업 | 8 | 0.7%
기타배출(탈루,폐기물) | 4 | 0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

산정식ID순번
Real number (ℝ)

HIGH CORRELATION 

Distinct: 7
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.587344
Minimum: 1
Maximum: 7
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 2
Q3: 3
95-th percentile: 6
Maximum: 7
Range: 6
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.3647598
Coefficient of variation (CV): 0.52747521
Kurtosis: 0.29256186
Mean: 2.587344
Median Absolute Deviation (MAD): 1
Skewness: 0.82428544
Sum: 2903
Variance: 1.8625694
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=7)

Common values

Value | Count | Frequency (%)
2 | 337 | 30.0%
1 | 272 | 24.2%
3 | 239 | 21.3%
4 | 200 | 17.8%
6 | 62 | 5.5%
5 | 8 | 0.7%
7 | 4 | 0.4%

Minimum 10 values

Value | Count | Frequency (%)
1 | 272 | 24.2%
2 | 337 | 30.0%
3 | 239 | 21.3%
4 | 200 | 17.8%
5 | 8 | 0.7%
6 | 62 | 5.5%
7 | 4 | 0.4%

Maximum 10 values

Value | Count | Frequency (%)
7 | 4 | 0.4%
6 | 62 | 5.5%
5 | 8 | 0.7%
4 | 200 | 17.8%
3 | 239 | 21.3%
2 | 337 | 30.0%
1 | 272 | 24.2%

바이오매스사용관련활동여부
Boolean

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB

Value | Count | Frequency (%)
False | 1122 | 100.0%

산업군코드ID
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.9 KiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 상업/공공
2nd row: 상업/공공
3rd row: 상업/공공
4th row: 에너지산업
5th row: 에너지산업

Common Values

Value | Count | Frequency (%)
에너지산업 | 1046 | 93.2%
상업/공공 | 76 | 6.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Correlations

Variable | 기준연도 | 사업장순번 | 배출시설순번 | 배출활동순번 | 배출활동코드ID | 연료코드ID | 규정산정등급코드ID | 적용산정등급코드ID | 산정식ID | 산정식ID순번 | 산업군코드ID
기준연도 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
사업장순번 | 0.000 | 1.000 | 0.678 | 0.398 | 0.427 | 0.638 | 0.406 | 0.406 | 0.427 | 0.517 | 0.162
배출시설순번 | 0.000 | 0.678 | 1.000 | 0.337 | 0.457 | 0.636 | 0.339 | 0.339 | 0.457 | 0.464 | 0.147
배출활동순번 | 0.000 | 0.398 | 0.337 | 1.000 | 0.415 | 0.982 | 0.516 | 0.516 | 0.415 | 0.403 | 0.045
배출활동코드ID | 0.000 | 0.427 | 0.457 | 0.415 | 1.000 | 0.976 | 0.286 | 0.286 | 1.000 | 0.953 | 0.193
연료코드ID | 0.000 | 0.638 | 0.636 | 0.982 | 0.976 | 1.000 | 0.817 | 0.817 | 0.976 | 0.888 | 0.748
규정산정등급코드ID | 0.000 | 0.406 | 0.339 | 0.516 | 0.286 | 0.817 | 1.000 | 1.000 | 0.286 | 0.329 | 0.044
적용산정등급코드ID | 0.000 | 0.406 | 0.339 | 0.516 | 0.286 | 0.817 | 1.000 | 1.000 | 0.286 | 0.329 | 0.044
산정식ID | 0.000 | 0.427 | 0.457 | 0.415 | 1.000 | 0.976 | 0.286 | 0.286 | 1.000 | 0.953 | 0.193
산정식ID순번 | 0.000 | 0.517 | 0.464 | 0.403 | 0.953 | 0.888 | 0.329 | 0.329 | 0.953 | 1.000 | 0.260
산업군코드ID | 0.000 | 0.162 | 0.147 | 0.045 | 0.193 | 0.748 | 0.044 | 0.044 | 0.193 | 0.260 | 1.000

Variable | 산업군코드ID | 연료코드ID | 기준연도 | 규정산정등급코드ID | 배출활동코드ID | 적용산정등급코드ID | 배출활동순번 | 산정식ID
산업군코드ID | 1.000 | 0.602 | 0.000 | 0.074 | 0.207 | 0.074 | 0.075 | 0.207
연료코드ID | 0.602 | 1.000 | 0.000 | 0.554 | 0.904 | 0.554 | 0.841 | 0.904
기준연도 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
규정산정등급코드ID | 0.074 | 0.554 | 0.000 | 1.000 | 0.200 | 1.000 | 0.213 | 0.200
배출활동코드ID | 0.207 | 0.904 | 0.000 | 0.200 | 1.000 | 0.200 | 0.307 | 1.000
적용산정등급코드ID | 0.074 | 0.554 | 0.000 | 1.000 | 0.200 | 1.000 | 0.213 | 0.200
배출활동순번 | 0.075 | 0.841 | 0.000 | 0.213 | 0.307 | 0.213 | 1.000 | 0.307
산정식ID | 0.207 | 0.904 | 0.000 | 0.200 | 1.000 | 0.200 | 0.307 | 1.000

Variable | 사업장순번 | 배출시설순번 | 산정식ID순번 | 기준연도 | 배출활동순번 | 배출활동코드ID | 연료코드ID | 규정산정등급코드ID | 적용산정등급코드ID | 산정식ID | 산업군코드ID
사업장순번 | 1.000 | -0.110 | 0.260 | 0.000 | 0.261 | 0.232 | 0.304 | 0.266 | 0.266 | 0.232 | 0.124
배출시설순번 | -0.110 | 1.000 | 0.168 | 0.000 | 0.214 | 0.252 | 0.302 | 0.215 | 0.215 | 0.252 | 0.113
산정식ID순번 | 0.260 | 0.168 | 1.000 | 0.000 | 0.296 | 0.668 | 0.663 | 0.234 | 0.234 | 0.668 | 0.278
기준연도 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
배출활동순번 | 0.261 | 0.214 | 0.296 | 0.000 | 1.000 | 0.307 | 0.841 | 0.213 | 0.213 | 0.307 | 0.075
배출활동코드ID | 0.232 | 0.252 | 0.668 | 0.000 | 0.307 | 1.000 | 0.904 | 0.200 | 0.200 | 1.000 | 0.207
연료코드ID | 0.304 | 0.302 | 0.663 | 0.000 | 0.841 | 0.904 | 1.000 | 0.554 | 0.554 | 0.904 | 0.602
규정산정등급코드ID | 0.266 | 0.215 | 0.234 | 0.000 | 0.213 | 0.200 | 0.554 | 1.000 | 1.000 | 0.200 | 0.074
적용산정등급코드ID | 0.266 | 0.215 | 0.234 | 0.000 | 0.213 | 0.200 | 0.554 | 1.000 | 1.000 | 0.200 | 0.074
산정식ID | 0.232 | 0.252 | 0.668 | 0.000 | 0.307 | 1.000 | 0.904 | 0.200 | 0.200 | 1.000 | 0.207
산업군코드ID | 0.124 | 0.113 | 0.278 | 0.000 | 0.075 | 0.207 | 0.602 | 0.074 | 0.074 | 0.207 | 1.000
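
Many of the coefficients above relate pairs of categorical fields. One widely used association measure for such pairs is Cramér's V, derived from the chi-squared statistic of the contingency table. Whether this report computed exactly this variant (for example with or without bias correction) is not stated, so the sketch below is an illustration of the uncorrected measure only. The example columns are hypothetical: they assume the two tier fields agree on every row, which the report suggests (identical counts, coefficient 1.000) but does not show directly.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row = Counter(xs)
    col = Counter(ys)
    k = min(len(row), len(col))
    if n == 0 or k < 2:
        return 0.0  # assumption: report 0 when either column is constant
    chi2 = 0.0
    for r, r_count in row.items():
        for c, c_count in col.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (k - 1)))

# Hypothetical columns: both tier fields take the same value on every row.
prescribed = ["Tier1"] * 1021 + ["Tier2"] * 85 + ["Tier3"] * 16
applied = list(prescribed)
v_identical = cramers_v(prescribed, applied)

# A small independent pair for contrast: every combination occurs once.
v_independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])

print(f"identical columns: {v_identical:.3f}, independent columns: {v_independent:.3f}")
```

Two columns that determine each other give a value of 1, and statistically independent columns give 0, which is the pattern seen for 규정산정등급코드ID against 적용산정등급코드ID and for 기준연도 against every other field.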

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

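First rows

Index | 기준연도 | 사업장순번 | 배출시설순번 | 배출활동순번 | 배출활동코드ID | 연료코드ID | 규정산정등급코드ID | 적용산정등급코드ID | 산정식ID | 산정식ID순번 | 바이오매스사용관련활동여부 | 산업군코드ID
0 | 2020 | 10 | 12 | 1 | 기체연료연소 | 도시가스(LNG) | Tier1 | Tier1 | 제조업/건설업 | 4 | N | 상업/공공
1 | 2020 | 22 | 4 | 1 | 기체연료연소 | 도시가스(LNG) | Tier1 | Tier1 | 제조업/건설업 | 6 | N | 상업/공공
2 | 2020 | 19 | 29 | 1 | 기체연료연소 | 도시가스(LNG) | Tier1 | Tier1 | 제조업/건설업 | 4 | N | 상업/공공
3 | 2020 | 19 | 32 | 1 | 기체연료연소 | 도시가스(LNG) | Tier1 | Tier1 | 제조업/건설업 | 4 | N | 에너지산업
4 | 2020 | 19 | 33 | 1 | 기체연료연소 | 도시가스(LNG) | Tier1 | Tier1 | 제조업/건설업 | 4 | N | 에너지산업
5 | 2020 | 19 | 35 | 1 | 기체연료연소 | 도시가스(LNG) | Tier1 | Tier1 | 제조업/건설업 | 4 | N | 에너지산업
6 | 2020 | 11 | 17 | 1 | 기체연료연소 | 도시가스(LNG) | Tier1 | Tier1 | 제조업/건설업 | 3 | N | 에너지산업
7 | 2020 | 11 | 18 | 1 | 기체연료연소 | 도시가스(LNG) | Tier1 | Tier1 | 제조업/건설업 | 3 | N | 에너지산업
8 | 2020 | 11 | 19 | 1 | 기체연료연소 | 도시가스(LNG) | Tier1 | Tier1 | 제조업/건설업 | 3 | N | 에너지산업
9 | 2020 | 11 | 20 | 1 | 기체연료연소 | 도시가스(LNG) | Tier1 | Tier1 | 제조업/건설업 | 3 | N | 에너지산업

Last rows

Index | 기준연도 | 사업장순번 | 배출시설순번 | 배출활동순번 | 배출활동코드ID | 연료코드ID | 규정산정등급코드ID | 적용산정등급코드ID | 산정식ID | 산정식ID순번 | 바이오매스사용관련활동여부 | 산업군코드ID
1112 | 2021 | 5 | 28 | 1 | 석회 생산 | 석회석 | Tier1 | Tier1 | 공정배출 | 5 | N | 에너지산업
1113 | 2020 | 8 | 5 | 1 | 석회 생산 | 석회석 | Tier1 | Tier1 | 공정배출 | 5 | N | 에너지산업
1114 | 2021 | 4 | 45 | 1 | 석회 생산 | 석회석 | Tier1 | Tier1 | 공정배출 | 7 | N | 에너지산업
1115 | 2021 | 8 | 5 | 1 | 석회 생산 | 석회석 | Tier1 | Tier1 | 공정배출 | 5 | N | 에너지산업
1116 | 2022 | 4 | 45 | 1 | 석회 생산 | 석회석 | Tier1 | Tier1 | 공정배출 | 7 | N | 에너지산업
1117 | 2022 | 8 | 5 | 1 | 석회 생산 | 석회석 | Tier1 | Tier1 | 공정배출 | 5 | N | 에너지산업
1118 | 2022 | 5 | 28 | 1 | 석회 생산 | 석회석 | Tier1 | Tier1 | 공정배출 | 5 | N | 에너지산업
1119 | 2023 | 5 | 28 | 1 | 석회 생산 | 석회석 | Tier1 | Tier1 | 공정배출 | 5 | N | 에너지산업
1120 | 2023 | 4 | 45 | 1 | 석회 생산 | 석회석 | Tier1 | Tier1 | 공정배출 | 7 | N | 에너지산업
1121 | 2023 | 8 | 5 | 1 | 석회 생산 | 석회석 | Tier1 | Tier1 | 공정배출 | 5 | N | 에너지산업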