Overview

Dataset statistics

Number of variables: 8
Number of observations: 150
Missing cells: 3
Missing cells (%): 0.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 9.9 KiB
Average record size in memory: 67.9 B
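
The missing-cell percentage follows from the counts in this block: 8 variables times 150 observations gives 1,200 cells, of which 3 are missing. A minimal Python sketch of that arithmetic is shown here; the variable names are illustrative and not part of the report. The exact value is 0.25%, which the report shows with one decimal as 0.2%.

```python
# Counts taken from the "Dataset statistics" block of this report.
n_variables = 8
n_observations = 150
missing_cells = 3

total_cells = n_variables * n_observations        # 8 * 150 cells in the table
missing_pct = 100 * missing_cells / total_cells   # share of cells that are missing

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.2f}%")
```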

Variable types

Categorical: 4
Text: 2
Numeric: 2

Dataset

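Description: Regional district heating supply status for 한국지역난방공사 (Korea District Heating Corporation) and district heating operators nationwide. (Company name, region category, supply area, households supplied, ratio)
URL: https://www.data.go.kr/data/3070435/fileData.do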

Alerts

세대수단위 has constant value "" (Constant)
비율단위 has constant value "" (Constant)
공급세대수 is highly overall correlated with 비율 (High correlation)
비율 is highly overall correlated with 공급세대수 (High correlation)
공급세대수 has 2 (1.3%) missing values (Missing)
공급세대수 has 2 (1.3%) zeros (Zeros)
비율 has 15 (10.0%) zeros (Zeros)
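
Alerts of this kind (constant, missing, zeros) can be derived from simple per-column counts. The sketch below is only an illustration under stated assumptions: `column_flags` is a hypothetical helper, the toy column is invented for the example, and none of this is taken from ydata-profiling's own implementation.

```python
from typing import Sequence

def column_flags(values: Sequence[object]) -> dict:
    """Count missing values and zeros, and check whether a column is constant."""
    n = len(values)
    present = [v for v in values if v is not None]   # None stands in for a missing cell
    n_missing = n - len(present)
    n_zeros = sum(1 for v in present if v == 0)
    return {
        "missing": n_missing,
        "missing_pct": 100 * n_missing / n,
        "zeros": n_zeros,
        "zeros_pct": 100 * n_zeros / n,
        "constant": len(set(present)) == 1,
    }

# Hypothetical toy column (not the real data): 10 rows, one missing, two zeros.
toy = [0.0, 1.4, None, 38.1, 0.0, 11.8, 1.4, 2.1, 0.5, 2.0]
flags = column_flags(toy)
print(flags)

# A column holding the same unit string in every row is flagged as constant.
print(column_flags(["%"] * 150)["constant"])
```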

Reproduction

Analysis started: 2023-12-12 16:44:29.141789
Analysis finished: 2023-12-12 16:44:30.229986
Duration: 1.09 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
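
The reported duration can be checked against the two timestamps. A small sketch using only the Python standard library; the format string is an assumption about how the timestamps are written, based on how they appear here.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"   # assumed layout of the timestamps in this report
started = datetime.strptime("2023-12-12 16:44:29.141789", FMT)
finished = datetime.strptime("2023-12-12 16:44:30.229986", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} s")
```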

Variables

기준년도
Categorical

Distinct: 4
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Value | Count
2021 | 39
2019 | 39
2017 | 39
2013 | 33

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021
2nd row: 2021
3rd row: 2021
4th row: 2021
5th row: 2021

Common Values

Value | Count | Frequency (%)
2021 | 39 | 26.0%
2019 | 39 | 26.0%
2017 | 39 | 26.0%
2013 | 33 | 22.0%

Length

Histogram of lengths of the category


회사명
Text

Distinct: 53
Distinct (%): 35.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 17
Median length: 9
Mean length: 5.94
Min length: 3

Characters and Unicode

Total characters: 891
Distinct characters: 111
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
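
As a rough illustration of the character properties summarized in this section, the sketch below uses Python's standard `unicodedata` module to tally general categories for two 회사명 values that appear in the sample rows of this report. It is not how ydata-profiling computes these tables, only an illustration; the helper name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter

def category_counts(text: str) -> Counter:
    """Tally Unicode general-category codes (e.g. 'Lo', 'Lu', 'Ps') per character."""
    return Counter(unicodedata.category(ch) for ch in text)

# Two company names taken from the sample rows of this report.
for name in ["대성산업(신도림)", "LH공사"]:
    print(name, dict(category_counts(name)))
```

In these codes, `Lo` corresponds to Other Letter, `Lu` to Uppercase Letter, and `Ps` / `Pe` to Open / Close Punctuation.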

Unique

Unique: 14
Unique (%): 9.3%

Sample

1st row: 서울에너지공사
2nd row: 부산광역시
3rd row: 한국지역난방공사
4th row: 한국지역난방공사
5th row: LH공사

Value | Count | Frequency (%)
한국지역난방공사 | 8 | 5.1%
대성산업(신도림 | 4 | 2.5%
충남도시가스 | 4 | 2.5%
부산광역시 | 4 | 2.5%
휴세스 | 4 | 2.5%
대성에너지 | 4 | 2.5%
부산정관에너지 | 4 | 2.5%
수완에너지 | 4 | 2.5%
별내에너지 | 4 | 2.5%
한국ces | 4 | 2.5%
Other values (46) | 113 | 72.0%

Most occurring characters

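Value | Count | Frequency (%)
[character not preserved] | 70 | 7.9%
[character not preserved] | 63 | 7.1%
[character not preserved] | 62 | 7.0%
[character not preserved] | 29 | 3.3%
[character not preserved] | 23 | 2.6%
[character not preserved] | 21 | 2.4%
[character not preserved] | 19 | 2.1%
[character not preserved] | 19 | 2.1%
[character not preserved] | 19 | 2.1%
[character not preserved] | 18 | 2.0%
Other values (101) | 548 | 61.5%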

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 798 | 89.6%
Uppercase Letter | 67 | 7.5%
Close Punctuation | 9 | 1.0%
Open Punctuation | 9 | 1.0%
Space Separator | 7 | 0.8%
Other Symbol | 1 | 0.1%

Most frequent character per category

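Other Letter
Value | Count | Frequency (%)
[character not preserved] | 70 | 8.8%
[character not preserved] | 63 | 7.9%
[character not preserved] | 62 | 7.8%
[character not preserved] | 29 | 3.6%
[character not preserved] | 23 | 2.9%
[character not preserved] | 21 | 2.6%
[character not preserved] | 19 | 2.4%
[character not preserved] | 19 | 2.4%
[character not preserved] | 19 | 2.4%
[character not preserved] | 18 | 2.3%
Other values (87) | 455 | 57.0%

Uppercase Letter
Value | Count | Frequency (%)
S | 15 | 22.4%
C | 11 | 16.4%
E | 8 | 11.9%
G | 7 | 10.4%
O | 6 | 9.0%
I | 6 | 9.0%
L | 4 | 6.0%
H | 4 | 6.0%
D | 3 | 4.5%
M | 3 | 4.5%

Close Punctuation
Value | Count | Frequency (%)
) | 9 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 9 | 100.0%

Space Separator
Value | Count | Frequency (%)
[space] | 7 | 100.0%

Other Symbol
Value | Count | Frequency (%)
[character not preserved] | 1 | 100.0%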

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 799 | 89.7%
Latin | 67 | 7.5%
Common | 25 | 2.8%

Most frequent character per script

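Hangul
Value | Count | Frequency (%)
[character not preserved] | 70 | 8.8%
[character not preserved] | 63 | 7.9%
[character not preserved] | 62 | 7.8%
[character not preserved] | 29 | 3.6%
[character not preserved] | 23 | 2.9%
[character not preserved] | 21 | 2.6%
[character not preserved] | 19 | 2.4%
[character not preserved] | 19 | 2.4%
[character not preserved] | 19 | 2.4%
[character not preserved] | 18 | 2.3%
Other values (88) | 456 | 57.1%

Latin
Value | Count | Frequency (%)
S | 15 | 22.4%
C | 11 | 16.4%
E | 8 | 11.9%
G | 7 | 10.4%
O | 6 | 9.0%
I | 6 | 9.0%
L | 4 | 6.0%
H | 4 | 6.0%
D | 3 | 4.5%
M | 3 | 4.5%

Common
Value | Count | Frequency (%)
) | 9 | 36.0%
( | 9 | 36.0%
[space] | 7 | 28.0%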

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 798 | 89.6%
ASCII | 92 | 10.3%
None | 1 | 0.1%

Most frequent character per block

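Hangul
Value | Count | Frequency (%)
[character not preserved] | 70 | 8.8%
[character not preserved] | 63 | 7.9%
[character not preserved] | 62 | 7.8%
[character not preserved] | 29 | 3.6%
[character not preserved] | 23 | 2.9%
[character not preserved] | 21 | 2.6%
[character not preserved] | 19 | 2.4%
[character not preserved] | 19 | 2.4%
[character not preserved] | 19 | 2.4%
[character not preserved] | 18 | 2.3%
Other values (87) | 455 | 57.0%

ASCII
Value | Count | Frequency (%)
S | 15 | 16.3%
C | 11 | 12.0%
) | 9 | 9.8%
( | 9 | 9.8%
E | 8 | 8.7%
G | 7 | 7.6%
[space] | 7 | 7.6%
O | 6 | 6.5%
I | 6 | 6.5%
L | 4 | 4.3%
Other values (3) | 10 | 10.9%

None
Value | Count | Frequency (%)
[character not preserved] | 1 | 100.0%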

지역구분
Categorical

Distinct: 2
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Value | Count
수도권 | 75
지방 | 75

Length

Max length: 3
Median length: 2.5
Mean length: 2.5
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 수도권
2nd row: 지방
3rd row: 수도권
4th row: 지방
5th row: 지방

Common Values

Value | Count | Frequency (%)
수도권 | 75 | 50.0%
지방 | 75 | 50.0%

Length

Histogram of lengths of the category


공급지역
Text

Distinct: 60
Distinct (%): 40.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 20
Median length: 15
Mean length: 8.9066667
Min length: 2

Characters and Unicode

Total characters: 1336
Distinct characters: 130
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
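
The four categories reported for this variable all show up in a single 공급지역 value. The sketch below, again using only Python's `unicodedata`, counts them for the first sample row and also checks that the reported mean length appears consistent with total characters divided by observations. That relationship is an inference from the numbers in this report, not a documented formula.

```python
import unicodedata
from collections import Counter

value = "노원, 신정3, 목동, 마곡"   # first sample row of 공급지역 in this report
counts = Counter(unicodedata.category(ch) for ch in value)
print(len(value), dict(counts))

# Assumption: mean length = total characters / number of observations.
mean_length = 1336 / 150
print(round(mean_length, 7))
```

Here `Lo` is Other Letter, `Po` is Other Punctuation (the commas), `Zs` is Space Separator, and `Nd` is Decimal Number.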

Unique

Unique: 21
Unique (%): 14.0%

Sample

1st row: 노원, 신정3, 목동, 마곡
2nd row: 해운대
3rd row: 파주, 판교, 화성 등
4th row: 대구, 청주, 김해 등
5th row: 대전서남부, 아산배방탕정

Value | Count | Frequency (%)
[value not preserved] | 20 | 6.7%
대구 | 5 | 1.7%
충남도청이전신도시 | 4 | 1.3%
노은3 | 4 | 1.3%
강일1,2 | 4 | 1.3%
화성향남1,2 | 4 | 1.3%
호매실 | 4 | 1.3%
안산 | 4 | 1.3%
송산그린시티 | 4 | 1.3%
서창2 | 4 | 1.3%
Other values (90) | 242 | 80.9%

Most occurring characters

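Value | Count | Frequency (%)
[space] | 153 | 11.5%
, | 130 | 9.7%
[character not preserved] | 51 | 3.8%
[character not preserved] | 36 | 2.7%
[character not preserved] | 34 | 2.5%
[character not preserved] | 34 | 2.5%
[character not preserved] | 33 | 2.5%
2 | 32 | 2.4%
[character not preserved] | 30 | 2.2%
[character not preserved] | 28 | 2.1%
Other values (120) | 775 | 58.0%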

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 989 | 74.0%
Space Separator | 153 | 11.5%
Other Punctuation | 130 | 9.7%
Decimal Number | 64 | 4.8%

Most frequent character per category

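Other Letter
Value | Count | Frequency (%)
[character not preserved] | 51 | 5.2%
[character not preserved] | 36 | 3.6%
[character not preserved] | 34 | 3.4%
[character not preserved] | 34 | 3.4%
[character not preserved] | 33 | 3.3%
[character not preserved] | 30 | 3.0%
[character not preserved] | 28 | 2.8%
[character not preserved] | 23 | 2.3%
[character not preserved] | 21 | 2.1%
[character not preserved] | 20 | 2.0%
Other values (114) | 679 | 68.7%

Decimal Number
Value | Count | Frequency (%)
2 | 32 | 50.0%
1 | 18 | 28.1%
3 | 11 | 17.2%
4 | 3 | 4.7%

Space Separator
Value | Count | Frequency (%)
[space] | 153 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 130 | 100.0%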

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 989 | 74.0%
Common | 347 | 26.0%

Most frequent character per script

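Hangul
Value | Count | Frequency (%)
[character not preserved] | 51 | 5.2%
[character not preserved] | 36 | 3.6%
[character not preserved] | 34 | 3.4%
[character not preserved] | 34 | 3.4%
[character not preserved] | 33 | 3.3%
[character not preserved] | 30 | 3.0%
[character not preserved] | 28 | 2.8%
[character not preserved] | 23 | 2.3%
[character not preserved] | 21 | 2.1%
[character not preserved] | 20 | 2.0%
Other values (114) | 679 | 68.7%

Common
Value | Count | Frequency (%)
[space] | 153 | 44.1%
, | 130 | 37.5%
2 | 32 | 9.2%
1 | 18 | 5.2%
3 | 11 | 3.2%
4 | 3 | 0.9%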

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 989 | 74.0%
ASCII | 347 | 26.0%

Most frequent character per block

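ASCII
Value | Count | Frequency (%)
[space] | 153 | 44.1%
, | 130 | 37.5%
2 | 32 | 9.2%
1 | 18 | 5.2%
3 | 11 | 3.2%
4 | 3 | 0.9%

Hangul
Value | Count | Frequency (%)
[character not preserved] | 51 | 5.2%
[character not preserved] | 36 | 3.6%
[character not preserved] | 34 | 3.4%
[character not preserved] | 34 | 3.4%
[character not preserved] | 33 | 3.3%
[character not preserved] | 30 | 3.0%
[character not preserved] | 28 | 2.8%
[character not preserved] | 23 | 2.3%
[character not preserved] | 21 | 2.1%
[character not preserved] | 20 | 2.0%
Other values (114) | 679 | 68.7%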

공급세대수
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 121
Distinct (%): 81.8%
Missing: 2
Missing (%): 1.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 77551.77
Minimum: 0
Maximum: 1239792
Zeros: 2
Zeros (%): 1.3%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 820
Q1: 7091.5
Median: 21038.5
Q3: 45902.75
95-th percentile: 346418.75
Maximum: 1239792
Range: 1239792
Interquartile range (IQR): 38811.25

Descriptive statistics

Standard deviation: 196342.86
Coefficient of variation (CV): 2.531765
Kurtosis: 22.686614
Mean: 77551.77
Median Absolute Deviation (MAD): 17151.5
Skewness: 4.6106799
Sum: 11477662
Variance: 3.8550517 × 10^10
Monotonicity: Not monotonic
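
Several of these statistics can be cross-checked from the others. The sketch below recomputes the mean, coefficient of variation, variance, and IQR from figures reported for 공급세대수. It assumes that the mean is taken over the 148 non-missing rows, which matches the reported value; the variable names are illustrative.

```python
# Figures copied from the 공급세대수 statistics in this report.
n_rows, n_missing = 150, 2
total = 11477662            # Sum
std = 196342.86             # Standard deviation
q1, q3 = 7091.5, 45902.75   # Q1 and Q3

n_valid = n_rows - n_missing   # non-missing values (assumed denominator)
mean = total / n_valid         # Sum / count
cv = std / mean                # coefficient of variation
variance = std ** 2            # standard deviation squared
iqr = q3 - q1                  # interquartile range

print(f"mean={mean:.2f} cv={cv:.4f} variance={variance:.4e} iqr={iqr}")
```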
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
820 | 4 | 2.7%
524 | 4 | 2.7%
2795 | 4 | 2.7%
5988 | 3 | 2.0%
1887 | 3 | 2.0%
3953 | 3 | 2.0%
5726 | 3 | 2.0%
33099 | 2 | 1.3%
7572 | 2 | 1.3%
7429 | 2 | 1.3%
Other values (111) | 118 | 78.7%

Minimum 10 values
Value | Count | Frequency (%)
0 | 2 | 1.3%
482 | 1 | 0.7%
524 | 4 | 2.7%
820 | 4 | 2.7%
885 | 1 | 0.7%
980 | 1 | 0.7%
1277 | 1 | 0.7%
1786 | 1 | 0.7%
1887 | 3 | 2.0%
2795 | 4 | 2.7%

Maximum 10 values
Value | Count | Frequency (%)
1239792 | 1 | 0.7%
1203860 | 1 | 0.7%
1131619 | 1 | 0.7%
987747 | 1 | 0.7%
384666 | 1 | 0.7%
369207 | 1 | 0.7%
359213 | 1 | 0.7%
346977 | 1 | 0.7%
345382 | 1 | 0.7%
326610 | 1 | 0.7%

세대수단위
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Value | Count
[value not preserved] | 150

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

Value | Count | Frequency (%)
[value not preserved] | 150 | 100.0%

Length

Histogram of lengths of the category


비율
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 46
Distinct (%): 30.9%
Missing: 1
Missing (%): 0.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.6808725
Minimum: 0
Maximum: 44.5
Zeros: 15
Zeros (%): 10.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 KiB
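
The percentages in this block do not all appear to share one denominator. A short sketch of the arithmetic, under the assumption (inferred from the numbers, not from ydata-profiling documentation) that Missing (%) and Zeros (%) are taken over all 150 rows while Distinct (%) is taken over non-missing rows only:

```python
n_rows = 150
missing, zeros, distinct = 1, 15, 46   # 비율 figures from this report

missing_pct = 100 * missing / n_rows
zeros_pct = 100 * zeros / n_rows
# Assumption: Distinct (%) uses non-missing values only, since 46 / 149
# matches the reported 30.9% while 46 / 150 would give 30.7%.
distinct_pct = 100 * distinct / (n_rows - missing)

print(f"{missing_pct:.1f}% missing, {zeros_pct:.1f}% zeros, {distinct_pct:.1f}% distinct")
```

The same assumption reproduces the 공급세대수 figure: 121 distinct values over 148 non-missing rows gives 81.8%.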

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.2
Median: 0.8
Q3: 1.5
95-th percentile: 11.6
Maximum: 44.5
Range: 44.5
Interquartile range (IQR): 1.3

Descriptive statistics

Standard deviation: 6.8700722
Coefficient of variation (CV): 2.5626255
Kurtosis: 22.706152
Mean: 2.6808725
Median Absolute Deviation (MAD): 0.6
Skewness: 4.6193497
Sum: 399.45
Variance: 47.197892
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=46)

Common values
Value | Count | Frequency (%)
0.3 | 15 | 10.0%
0.0 | 15 | 10.0%
0.1 | 13 | 8.7%
0.2 | 13 | 8.7%
1.2 | 8 | 5.3%
0.8 | 6 | 4.0%
1.5 | 5 | 3.3%
0.5 | 5 | 3.3%
0.6 | 5 | 3.3%
0.9 | 5 | 3.3%
Other values (36) | 59 | 39.3%

Minimum 10 values
Value | Count | Frequency (%)
0.0 | 15 | 10.0%
0.1 | 13 | 8.7%
0.2 | 13 | 8.7%
0.3 | 15 | 10.0%
0.4 | 3 | 2.0%
0.45 | 1 | 0.7%
0.5 | 5 | 3.3%
0.6 | 5 | 3.3%
0.7 | 4 | 2.7%
0.8 | 6 | 4.0%

Maximum 10 values
Value | Count | Frequency (%)
44.5 | 1 | 0.7%
39.1 | 1 | 0.7%
38.8 | 1 | 0.7%
38.1 | 1 | 0.7%
13.5 | 1 | 0.7%
12.0 | 1 | 0.7%
11.9 | 1 | 0.7%
11.8 | 1 | 0.7%
11.3 | 1 | 0.7%
11.1 | 1 | 0.7%

비율단위
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Value | Count
% | 150

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: %
2nd row: %
3rd row: %
4th row: %
5th row: %

Common Values

Value | Count | Frequency (%)
% | 150 | 100.0%

Length

Histogram of lengths of the category



Correlations

Variable | 기준년도 | 회사명 | 지역구분 | 공급지역 | 공급세대수 | 비율
기준년도 | 1.000 | 0.000 | 0.000 | 0.000 | 0.028 | 0.000
회사명 | 0.000 | 1.000 | 0.987 | 0.999 | 0.575 | 0.595
지역구분 | 0.000 | 0.987 | 1.000 | 1.000 | 0.202 | 0.182
공급지역 | 0.000 | 0.999 | 1.000 | 1.000 | 0.830 | 0.831
공급세대수 | 0.028 | 0.575 | 0.202 | 0.830 | 1.000 | 0.979
비율 | 0.000 | 0.595 | 0.182 | 0.831 | 0.979 | 1.000

Variable | 기준년도 | 지역구분
기준년도 | 1.000 | 0.000
지역구분 | 0.000 | 1.000

Variable | 공급세대수 | 비율 | 기준년도 | 지역구분
공급세대수 | 1.000 | 0.992 | 0.009 | 0.143
비율 | 0.992 | 1.000 | 0.000 | 0.150
기준년도 | 0.009 | 0.000 | 1.000 | 0.000
지역구분 | 0.143 | 0.150 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

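First rows

(index) | 기준년도 | 회사명 | 지역구분 | 공급지역 | 공급세대수 | 세대수단위 | 비율 | 비율단위
0 | 2021 | 서울에너지공사 | 수도권 | 노원, 신정3, 목동, 마곡 | 256953 |  | 7.9 | %
1 | 2021 | 부산광역시 | 지방 | 해운대 | 44271 |  | 1.4 | %
2 | 2021 | 한국지역난방공사 | 수도권 | 파주, 판교, 화성 등 | 1239792 |  | 38.1 | %
3 | 2021 | 한국지역난방공사 | 지방 | 대구, 청주, 김해 등 | 384666 |  | 11.8 | %
4 | 2021 | LH공사 | 지방 | 대전서남부, 아산배방탕정 | 47198 |  | 1.4 | %
5 | 2021 | GS파워 | 수도권 | 안양, 부천 | 359213 |  | 11.0 | %
6 | 2021 | 안산도시개발 | 수도권 | 안산, 송산그린시티, 시흥군자 | 100741 |  | 3.1 | %
7 | 2021 | 인천공항에너지 | 수도권 | 인천공항신도시 | 16778 |  | 0.5 | %
8 | 2021 | 미래엔인천에너지 | 수도권 | 인천논현2, 서창2 | 69184 |  | 2.1 | %
9 | 2021 | 인천종합에너지 | 수도권 | 송도국제도시 | 64547 |  | 2.0 | %

Last rows

(index) | 기준년도 | 회사명 | 지역구분 | 공급지역 | 공급세대수 | 세대수단위 | 비율 | 비율단위
140 | 2013 | 경기CES | 수도권 | 양주 고읍 | 7572 |  | 0.3 | %
141 | 2013 | 대성산업(신도림) | 수도권 | 신도림 디큐브시티 | 524 |  | 0.0 | %
142 | 2013 | 별내에너지 | 수도권 | 남양주 별내 | 7282 |  | 0.3 | %
143 | 2013 | 롯데건설(주) | 지방 | 충남도청이전신도시 | 885 |  | 0.0 | %
144 | 2013 | 평택에너지서비스(주) | 수도권 | 평택소사벌 | 3110 |  | 0.1 | %
145 | 2013 | ㈜대륜에너지 | 수도권 | 의정부민락2, 고산 | 1786 |  | 0.1 | %
146 | 2013 | 대구그린파워(주) | 지방 | 대구혁신도시 | 0 |  | 0.0 | %
147 | 2013 | 대성산업 코젠 | 수도권 | 오산 운암 | 25972 |  | 1.2 | %
148 | 2013 | 대전열병합 | 지방 | 대전 송강 | 32533 |  | 1.5 | %
149 | 2013 | 전북에너지 | 지방 | 익산 배산 | 7790 |  | 0.4 | %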
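
The high correlation between 공급세대수 and 비율 flagged in the alerts can be illustrated on the first ten sample rows. The sketch below computes a plain Pearson coefficient by hand; the report does not state which correlation measure it used, so Pearson is a substitute chosen for illustration, and the `pearson` helper is hypothetical. It also derives the yearly total implied by each row under the assumption that 비율 is a percentage share of all households supplied.

```python
import math

# (공급세대수, 비율) pairs from the first ten sample rows of this report.
rows = [
    (256953, 7.9), (44271, 1.4), (1239792, 38.1), (384666, 11.8),
    (47198, 1.4), (359213, 11.0), (100741, 3.1), (16778, 0.5),
    (69184, 2.1), (64547, 2.0),
]

def pearson(pairs):
    """Plain Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)

r = pearson(rows)
# Assumption: 비율 is a percentage share, so households / (ratio / 100) ~ yearly total.
implied_totals = [x / (y / 100) for x, y in rows]

print(f"r = {r:.4f}")
print(f"implied totals: {min(implied_totals):.0f} to {max(implied_totals):.0f}")
```

If the share interpretation holds, the two columns carry nearly the same information within a given year, which would be consistent with the correlation alert.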