Overview

Dataset statistics

Number of variables: 7
Number of observations: 637
Missing cells: 1
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 36.2 KiB
Average record size in memory: 58.2 B
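
The derived figures in this block can be checked from the raw counts. The following is a minimal sketch, assuming the missing-cell percentage is taken over all cells (variables × observations) and the average record size is the total size divided by the number of observations; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics above.
n_variables = 7
n_observations = 637
missing_cells = 1
total_size_kib = 36.2

# Assumption: the percentage is computed over every cell (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes, averaged over the observations.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.3f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions a single missing cell is well below the 0.1% display threshold, which is consistent with the "< 0.1%" shown above.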

Variable types

Categorical: 5
Numeric: 2

Dataset

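Description: Capacity information for the equipment (gas turbines, steam turbines, boilers, heat storage tanks, etc.) installed at each branch office of the Korea District Heating Corporation (한국지역난방공사). Columns: 사업구분 (business division), 설비명 (equipment name), 형식 (type), 용량 (capacity), 단위 (unit), 수량 (quantity).
URL: https://www.data.go.kr/data/15002757/fileData.do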

Alerts

High correlation: 설비명 is highly overall correlated with 수량 and 2 other fields
High correlation: 형식 is highly overall correlated with 설비명 and 1 other field
High correlation: 단위 is highly overall correlated with 설비명 and 1 other field
High correlation: 수량 is highly overall correlated with 설비명
Imbalance: 형식 is highly imbalanced (56.5%)

Reproduction

Analysis started: 2023-12-12 01:17:08.063786
Analysis finished: 2023-12-12 01:17:09.473377
Duration: 1.41 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

기준일
Categorical

Distinct: 8
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

2023-05-31: 87
2022-05-31: 84
2022-02-28: 82
2021-06-30: 82
2020-06-30: 82
Other values (3): 220

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-05-31
2nd row: 2023-05-31
3rd row: 2023-05-31
4th row: 2023-05-31
5th row: 2023-05-31

Common Values

Value        Count  Frequency (%)
2023-05-31   87     13.7%
2022-05-31   84     13.2%
2022-02-28   82     12.9%
2021-06-30   82     12.9%
2020-06-30   82     12.9%
2016-06-30   81     12.7%
2015-12-31   76     11.9%
2014-12-31   63     9.9%
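
The Frequency (%) column is each count divided by the total number of rows. A short sketch of that calculation follows, with the counts copied from the table above; the names `counts` and `frequencies` are illustrative only.

```python
from collections import Counter

# Counts per 기준일 value, copied from the Common Values table above.
counts = Counter({
    "2023-05-31": 87,
    "2022-05-31": 84,
    "2022-02-28": 82,
    "2021-06-30": 82,
    "2020-06-30": 82,
    "2016-06-30": 81,
    "2015-12-31": 76,
    "2014-12-31": 63,
})

total = sum(counts.values())  # should equal the number of observations

# Frequency (%) = count / total, rounded to one decimal place.
frequencies = {value: round(100 * n / total, 1) for value, n in counts.items()}

# The three least common values together form the "Other values (3)" bucket.
other_three = sum(n for _, n in counts.most_common()[-3:])

for value, n in counts.most_common():
    print(f"{value}  {n:>3}  {frequencies[value]:.1f}%")
```

The counts sum to the 637 observations, and the three least frequent dates account for the 220 rows grouped under "Other values (3)" in the variable summary.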

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

사업구분
Categorical

Distinct: 49
Distinct (%): 7.7%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

판교: 43
청주: 40
수원: 40
대구: 38
광교: 36
Other values (44): 440

Length

Max length: 21
Median length: 2
Mean length: 3.4348509
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 동탄
2nd row: 파주
3rd row: 화성
4th row: 광교
5th row: 판교

Common Values

Value              Count  Frequency (%)
판교               43     6.8%
청주               40     6.3%
수원               40     6.3%
대구               38     6.0%
광교               36     5.7%
파주               35     5.5%
삼송               35     5.5%
화성               32     5.0%
강남(일원)         25     3.9%
광주전남           25     3.9%
Other values (39)  288    45.2%

Length

[Figure: Histogram of lengths of the category]

Words

Value              Count  Frequency (%)
판교               43     6.8%
수원               40     6.3%
청주               40     6.3%
대구               38     6.0%
광교               36     5.7%
파주               35     5.5%
삼송               35     5.5%
화성               32     5.0%
강남(일원          25     3.9%
광주전남           25     3.9%
Other values (38)  288    45.2%

설비명
Categorical

HIGH CORRELATION 

Distinct: 17
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

보일러: 308
축열조: 134
스팀터빈: 84
가스터빈: 49
DH Pump: 17
Other values (12): 45

Length

Max length: 11
Median length: 3
Mean length: 3.3877551
Min length: 3

Unique

Unique: 8
Unique (%): 1.3%

Sample

1st row: 가스터빈
2nd row: 가스터빈
3rd row: 가스터빈
4th row: 가스터빈
5th row: 가스터빈

Common Values

Value             Count  Frequency (%)
보일러            308    48.4%
축열조            134    21.0%
스팀터빈          84     13.2%
가스터빈          49     7.7%
DH Pump           17     2.7%
변압기            15     2.4%
연료탱크          12     1.9%
집진설비          8      1.3%
냉동기            2      0.3%
팽창탱크          1      0.2%
Other values (7)  7      1.1%

Length

[Figure: Histogram of lengths of the category]

Words

Value             Count  Frequency (%)
보일러            309    47.0%
축열조            134    20.4%
스팀터빈          84     12.8%
가스터빈          49     7.4%
dh                19     2.9%
pump              17     2.6%
변압기            15     2.3%
연료탱크          12     1.8%
집진설비          8      1.2%
냉동기            2      0.3%
Other values (9)  9      1.4%

형식
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 32
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

<NA>: 266
열전용: 230
발전용: 74
양흡입, 원심형: 20
전기식: 7
Other values (27): 40

Length

Max length: 25
Median length: 11
Mean length: 3.8838305
Min length: 2

Unique

Unique: 22
Unique (%): 3.5%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value              Count  Frequency (%)
<NA>               266    41.8%
열전용             230    36.1%
발전용             74     11.6%
양흡입, 원심형     20     3.1%
전기식             7      1.1%
FO/BC              6      0.9%
등유               6      0.9%
발전용(LP)         2      0.3%
발전용(HP)         2      0.3%
22.9/6.9kV         2      0.3%
Other values (22)  22     3.5%
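
The 56.5% imbalance reported for 형식 can be approximated from these counts. The sketch below assumes an entropy-based score, 1 - H / log2(k), where H is the Shannon entropy of the class distribution and k is the number of distinct values; this is an assumption about how the figure is derived, and the function name `imbalance_score` is hypothetical. The 22 values grouped under "Other values" are each taken to occur once, in line with the Unique count of 22.

```python
import math

# Counts for 형식 from the Common Values table above; the 22 remaining
# values are assumed to occur once each (Unique = 22).
counts = [266, 230, 74, 20, 7, 6, 6, 2, 2, 2] + [1] * 22


def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform distribution, near 1 when
    one class dominates. H is the Shannon entropy in bits and k is the
    number of distinct classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


score = imbalance_score(counts)
print(f"{score:.1%}")
```

With 32 distinct values and two of them covering almost 78% of the rows, the score lands at roughly 0.565, which agrees with the alert.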

Length

[Figure: Histogram of lengths of the category]

Words

Value              Count  Frequency (%)
na                 266    39.9%
열전용             230    34.5%
발전용             74     11.1%
양흡입             20     3.0%
원심형             20     3.0%
전기식             7      1.1%
fo/bc              6      0.9%
등유               6      0.9%
22.9/6.9kv         2      0.3%
발전용(hp          2      0.3%
Other values (31)  33     5.0%

용량
Real number (ℝ)

Distinct: 98
Distinct (%): 15.4%
Missing: 1
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 6777.6104
Minimum: 1
Maximum: 267840
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 9.5
Q1: 49.4
Median: 103
Q3: 1837.5
95-th percentile: 30000
Maximum: 267840
Range: 267839
Interquartile range (IQR): 1788.1

Descriptive statistics

Standard deviation: 19467.77
Coefficient of variation (CV): 2.8723648
Kurtosis: 74.278083
Mean: 6777.6104
Median Absolute Deviation (MAD): 81.505
Skewness: 7.2173988
Sum: 4310560.2
Variance: 3.7899406 × 10^8
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
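
Several of the 용량 statistics are simple functions of one another, which gives a quick consistency check. The sketch below is illustrative only: it assumes the mean is taken over non-missing values, that CV is the standard deviation divided by the mean, and that the variance is the square of the standard deviation. All inputs are copied from the figures above.

```python
# Figures copied from the 용량 statistics above.
n_observations = 637
n_missing = 1
total_sum = 4310560.2
std_dev = 19467.77
q1, q3 = 49.4, 1837.5
minimum, maximum = 1, 267840

# Assumption: missing values are excluded before averaging.
n_valid = n_observations - n_missing
mean = total_sum / n_valid

cv = std_dev / mean            # coefficient of variation
variance = std_dev ** 2        # variance as the squared standard deviation
iqr = q3 - q1                  # interquartile range
value_range = maximum - minimum

print(f"mean     = {mean:.4f}")
print(f"CV       = {cv:.7f}")
print(f"variance = {variance:.7e}")
print(f"IQR      = {iqr:.1f}")
print(f"range    = {value_range}")
```

The mean only matches the reported value when the sum is divided by 636 rather than 637, which is consistent with the single missing 용량 cell. A CV close to 2.9 together with a median of 103 against a mean near 6778 reflects the heavy right skew of this column.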

Common values

Value              Count  Frequency (%)
103.0              55     8.6%
20000.0            48     7.5%
34.0               40     6.3%
68.0               23     3.6%
100.0              22     3.5%
200.0              22     3.5%
25000.0            21     3.3%
30000.0            16     2.5%
150.0              15     2.4%
10000.0            14     2.2%
Other values (88)  360    56.5%

Minimum 10 values

Value  Count  Frequency (%)
1.0    1      0.2%
2.0    1      0.2%
3.0    3      0.5%
5.0    4      0.6%
5.2    7      1.1%
6.0    9      1.4%
8.0    7      1.1%
10.0   2      0.3%
13.3   7      1.1%
15.2   7      1.1%

Maximum 10 values

Value     Count  Frequency (%)
267840.0  1      0.2%
183200.0  1      0.2%
167720.0  1      0.2%
161400.0  1      0.2%
73400.0   2      0.3%
60000.0   6      0.9%
50000.0   7      1.1%
37000.0   7      1.1%
30000.0   16     2.5%
25000.0   21     3.3%

단위
Categorical

HIGH CORRELATION 

Distinct: 14
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

G/h: 168
T/h: 140
(blank): 135
MW: 133
㎥/h: 20
Other values (9): 41

Length

Max length: 7
Median length: 3
Mean length: 2.3783359
Min length: 1

Unique

Unique: 5
Unique (%): 0.8%

Sample

1st row: MW
2nd row: MW
3rd row: MW
4th row: MW
5th row: MW

Common Values

Value             Count  Frequency (%)
G/h               168    26.4%
T/h               140    22.0%
(blank)           135    21.2%
MW                133    20.9%
㎥/h              20     3.1%
MVA               15     2.4%
kl                12     1.9%
N㎥/h             7      1.1%
USRT              2      0.3%
Gcal/h            1      0.2%
Other values (4)  4      0.6%

Length

[Figure: Histogram of lengths of the category]

Words

Value             Count  Frequency (%)
g/h               168    26.4%
t/h               140    22.0%
(blank)           135    21.2%
mw                133    20.9%
㎥/h              20     3.1%
mva               15     2.4%
kl                12     1.9%
n㎥/h             7      1.1%
usrt              2      0.3%
gcal/h            1      0.2%
Other values (4)  4      0.6%

수량
Real number (ℝ)

HIGH CORRELATION 

Distinct: 12
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.8414443
Minimum: 1
Maximum: 18
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 2
Q3: 2
95-th percentile: 4
Maximum: 18
Range: 17
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.4756865
Coefficient of variation (CV): 0.80137453
Kurtosis: 44.030697
Mean: 1.8414443
Median Absolute Deviation (MAD): 1
Skewness: 5.4669662
Sum: 1173
Variance: 2.1776507
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=12)]

Common values

Value             Count  Frequency (%)
1                 301    47.3%
2                 257    40.3%
4                 33     5.2%
3                 31     4.9%
7                 4      0.6%
5                 3      0.5%
10                2      0.3%
6                 2      0.3%
14                1      0.2%
15                1      0.2%
Other values (2)  2      0.3%

Minimum 10 values

Value  Count  Frequency (%)
1      301    47.3%
2      257    40.3%
3      31     4.9%
4      33     5.2%
5      3      0.5%
6      2      0.3%
7      4      0.6%
10     2      0.3%
11     1      0.2%
14     1      0.2%

Maximum 10 values

Value  Count  Frequency (%)
18     1      0.2%
15     1      0.2%
14     1      0.2%
11     1      0.2%
10     2      0.3%
7      4      0.6%
6      2      0.3%
5      3      0.5%
4      33     5.2%
3      31     4.9%
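
Because 수량 has only 12 distinct values, its full distribution can be assembled from the frequency tables above and used to recompute the summary statistics. The sketch below assumes the report uses the sample variance (n - 1 in the denominator); the dictionary `freq` is an illustrative reconstruction, not part of the report.

```python
# value -> count for 수량, assembled from the frequency tables above.
freq = {1: 301, 2: 257, 3: 31, 4: 33, 5: 3, 6: 2,
        7: 4, 10: 2, 11: 1, 14: 1, 15: 1, 18: 1}

n = sum(freq.values())                        # number of observations
total = sum(v * c for v, c in freq.items())   # weighted sum of the values
mean = total / n

# Assumption: sample variance, i.e. divide by n - 1 rather than n.
variance = sum(c * (v - mean) ** 2 for v, c in freq.items()) / (n - 1)
std_dev = variance ** 0.5

print(f"n = {n}, sum = {total}")
print(f"mean = {mean:.7f}")
print(f"variance = {variance:.7f}")
print(f"std dev = {std_dev:.7f}")
```

The recomputed sum, mean, variance, and standard deviation agree with the reported values, which suggests the n - 1 convention is the one in use here.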

Interactions

[Figures: four interaction plots]

Correlations

[Figure: correlation heatmap]

          기준일  사업구분  설비명  형식   용량   단위   수량
기준일    1.000   0.514     0.654   0.604  0.128  0.631  0.443
사업구분  0.514   1.000     0.840   0.879  0.565  0.832  0.755
설비명    0.654   0.840     1.000   0.988  0.686  0.993  0.856
형식      0.604   0.879     0.988   1.000  0.730  0.997  0.682
용량      0.128   0.565     0.686   0.730  1.000  0.671  0.205
단위      0.631   0.832     0.993   0.997  0.671  1.000  0.702
수량      0.443   0.755     0.856   0.682  0.205  0.702  1.000

[Figure: correlation heatmap]

          사업구분  설비명  기준일  형식   단위
사업구분  1.000     0.378   0.213   0.352  0.409
설비명    0.378     1.000   0.341   0.856  0.953
기준일    0.213     0.341   1.000   0.275  0.347
형식      0.352     0.856   0.275   1.000  0.949
단위      0.409     0.953   0.347   0.949  1.000

[Figure: correlation heatmap]

          용량   수량   기준일  사업구분  설비명  형식   단위
용량      1.000  0.010  0.078   0.287     0.435   0.456  0.442
수량      0.010  1.000  0.161   0.388     0.578   0.332  0.411
기준일    0.078  0.161  1.000   0.213     0.341   0.275  0.347
사업구분  0.287  0.388  0.213   1.000     0.378   0.352  0.409
설비명    0.435  0.578  0.341   0.378     1.000   0.856  0.953
형식      0.456  0.332  0.275   0.352     0.856   1.000  0.949
단위      0.442  0.411  0.347   0.409     0.953   0.949  1.000
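
The "High correlation" alerts in the Overview line up with the last (third) matrix above when a cut-off of 0.5 is applied. The sketch below illustrates that reading. The 0.5 threshold is an assumption chosen because it reproduces the listed alerts, not a value stated in this report, and `highly_correlated` is a hypothetical helper.

```python
# Third correlation matrix above, rows and columns in header order.
names = ["용량", "수량", "기준일", "사업구분", "설비명", "형식", "단위"]
matrix = [
    [1.000, 0.010, 0.078, 0.287, 0.435, 0.456, 0.442],
    [0.010, 1.000, 0.161, 0.388, 0.578, 0.332, 0.411],
    [0.078, 0.161, 1.000, 0.213, 0.341, 0.275, 0.347],
    [0.287, 0.388, 0.213, 1.000, 0.378, 0.352, 0.409],
    [0.435, 0.578, 0.341, 0.378, 1.000, 0.856, 0.953],
    [0.456, 0.332, 0.275, 0.352, 0.856, 1.000, 0.949],
    [0.442, 0.411, 0.347, 0.409, 0.953, 0.949, 1.000],
]

THRESHOLD = 0.5  # assumption: picked because it reproduces the listed alerts


def highly_correlated(names, matrix, threshold):
    """Map each variable to the other variables whose coefficient with it
    exceeds the threshold; variables with no such partner are omitted."""
    result = {}
    for i, name in enumerate(names):
        partners = [names[j] for j, value in enumerate(matrix[i])
                    if j != i and value > threshold]
        if partners:
            result[name] = partners
    return result


alerts = highly_correlated(names, matrix, THRESHOLD)
for name, partners in alerts.items():
    print(f"{name} is highly correlated with {', '.join(partners)}")
```

Under this assumption only 수량, 설비명, 형식, and 단위 are flagged, with 설비명 paired to three other fields, matching the four correlation alerts.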

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

     기준일      사업구분      설비명    형식         용량   단위  수량
0    2023-05-31  동탄          가스터빈  <NA>         246.4  MW    2
1    2023-05-31  파주          가스터빈  <NA>         163.4  MW    2
2    2023-05-31  화성          가스터빈  <NA>         160.8  MW    2
3    2023-05-31  광교          가스터빈  <NA>         102.6  MW    1
4    2023-05-31  판교          가스터빈  <NA>         77.9   MW    1
5    2023-05-31  양산          가스터빈  <NA>         78.0   MW    1
6    2023-05-31  강남(동남권)  가스터빈  <NA>         33.8   MW    1
7    2023-05-31  삼송          가스터빈  <NA>         25.4   MW    2
8    2023-05-31  동탄          스팀터빈  <NA>         131.9  MW    2
9    2023-05-31  파주          스팀터빈  <NA>         188.7  MW    1

Last rows

     기준일      사업구분      설비명    형식         용량   단위  수량
627  2014-12-31  서초          변압기    22.9/6.13kV  1.0    MVA   1
628  2014-12-31  대구          변압기    22.9/6.14kV  37.0   MVA   1
629  2014-12-31  수원          변압기    22.9/6.15kV  37.0   MVA   1
630  2014-12-31  장안          변압기    22.9/6.16kV  3.0    MVA   2
631  2014-12-31  청주          변압기    22.9/6.17kV  52.0   MVA   1
632  2014-12-31  김해          변압기    22.9/6.18kV  3.0    MVA   2
633  2014-12-31  상암          변압기    22.9/6.19kV  20.0   MVA   3
634  2014-12-31  양산          변압기    22.9/6.20kV  3.0    MVA   2
635  2014-12-31  용인          변압기    22.9/6.21kV  5.0    MVA   2
636  2014-12-31  동백          변압기    22.9/6.22kV  5.0    MVA   2