Overview

Dataset statistics

Number of variables: 7
Number of observations: 9371
Missing cells: 2
Missing cells (%): < 0.1%
Duplicate rows: 2219
Duplicate rows (%): 23.7%
Total size in memory: 521.8 KiB
Average record size in memory: 57.0 B
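
The figures above can be cross-checked with a little arithmetic. The following is a minimal sketch, assuming the table is available as a list of equal-length tuples with None marking a missing cell; `overview` is a hypothetical helper written for illustration, not part of the profiling tool.

```python
def overview(rows, memory_bytes=None):
    """Summarize a table given as a list of equal-length tuples (None = missing)."""
    n_obs = len(rows)
    n_vars = len(rows[0]) if rows else 0
    n_cells = n_obs * n_vars
    missing = sum(cell is None for row in rows for cell in row)
    stats = {
        "n_vars": n_vars,
        "n_obs": n_obs,
        "missing_cells": missing,
        "missing_pct": 100.0 * missing / n_cells if n_cells else 0.0,
    }
    if memory_bytes is not None and n_obs:
        stats["avg_record_bytes"] = memory_bytes / n_obs
    return stats


# Figures quoted in the report: 2 missing cells in a 9371 x 7 table, 521.8 KiB in memory.
missing_pct = 100.0 * 2 / (9371 * 7)   # well under 0.1%
avg_record = 521.8 * 1024 / 9371       # bytes per observation
print(f"{avg_record:.1f} B")           # 57.0 B
```

The average record size is simply total memory divided by the number of observations, which is why it matches the 57.0 B shown above.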

Variable types

Categorical: 5
Text: 1
Numeric: 1

Dataset

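Description: Reference information on the district heating heat meters installed at each branch and district of 한국지역난방공사 (Korea District Heating Corporation), covering supply type, heat meter type, and heat supply start date.
Author: 한국지역난방공사 (Korea District Heating Corporation)
URL: https://www.data.go.kr/data/15124051/fileData.do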

Alerts

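Dataset has 2219 (23.7%) duplicate rows [Duplicates]
설치종별 (installation class) is highly overall correlated with 열량계종류 (heat meter type) [High correlation]
열량계종류 (heat meter type) is highly overall correlated with 설치종별 (installation class) [High correlation]
열량계종류 (heat meter type) is highly imbalanced (91.8%) [Imbalance]
열공급개시일 (heat supply start date) is highly skewed (γ1 = -25.1990986) [Skewed]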
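
The 91.8% imbalance figure is consistent with a score defined as one minus the normalized Shannon entropy of the class counts. That definition is an assumption here, since this extract does not state the formula. A minimal sketch using the five 열량계종류 counts listed later in the report:

```python
import math


def imbalance_score(counts):
    """1 - H / log(k): 0 for a uniform distribution, approaching 1 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)


# Counts of the five 열량계종류 values as reported in the Variables section.
heat_meter_type_counts = [9158, 115, 56, 41, 1]
score = imbalance_score(heat_meter_type_counts)
print(f"{score:.1%}")  # 91.8%
```

Under this definition a single class holding 97.7% of the rows is enough to push the score above 90%, even with five classes present.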

Reproduction

Analysis started: 2023-12-12 14:19:02.583960
Analysis finished: 2023-12-12 14:19:03.515447
Duration: 0.93 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

지사 (branch)
Categorical

Distinct: 19
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 73.3 KiB

Value | Count
고양사업소 | 1255
분당사업소 | 1183
강남지사 | 893
용인지사 | 715
동탄지사 | 714
Other values (14) | 4611

Length

Max length: 6
Median length: 4
Mean length: 4.3502294
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 중앙지사
2nd row: 중앙지사
3rd row: 중앙지사
4th row: 중앙지사
5th row: 중앙지사

Common Values

Value | Count | Frequency (%)
고양사업소 | 1255 | 13.4%
분당사업소 | 1183 | 12.6%
강남지사 | 893 | 9.5%
용인지사 | 715 | 7.6%
동탄지사 | 714 | 7.6%
세종지사 | 669 | 7.1%
중앙지사 | 591 | 6.3%
수원사업소 | 533 | 5.7%
판교지사 | 426 | 4.5%
화성지사 | 365 | 3.9%
Other values (9) | 2027 | 21.6%

Length

[Figure: Histogram of lengths of the category]

지구 (district)
Text

Distinct: 112
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 73.3 KiB

Length

Max length: 11
Median length: 4
Mean length: 5.0002134
Min length: 2

Characters and Unicode

Total characters: 46857
Distinct characters: 134
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
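
As an illustration of how such per-character properties can be tallied, here is a minimal sketch using Python's standard `unicodedata` module. The sample strings are taken from values that appear in this report; `category_counts` and the `CATEGORY_NAMES` mapping are illustrative names, not part of the profiling tool.

```python
import unicodedata
from collections import Counter

# Two-letter general-category codes mapped to the long names used in the report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Lu": "Uppercase Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Zs": "Space Separator",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


sample = ["동탄2지구", "광주?전남혁신도시", "여의도"]
counts = category_counts(sample)
for name, n in counts.most_common():
    print(name, n)
```

Hangul syllables fall under "Other Letter" because the script has no upper or lower case, which is why that category dominates the tables that follow.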

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: 여의도
2nd row: 여의도
3rd row: 여의도
4th row: 여의도
5th row: 여의도

Value | Count | Frequency (%)
분당지구 | 1091 | 11.6%
동탄2지구 | 714 | 7.6%
행정중심복합도시 | 669 | 7.1%
일산지구 | 644 | 6.9%
용인기존지구 | 366 | 3.9%
광교지구 | 338 | 3.6%
강남기존 | 268 | 2.9%
동탄지구 | 232 | 2.5%
수원기존지구 | 220 | 2.3%
송파지구 | 212 | 2.3%
Other values (103) | 4621 | 49.3%

Most occurring characters

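Value | Count | Frequency (%)
(not preserved) | 7420 | 15.8%
(not preserved) | 7280 | 15.5%
(not preserved) | 1431 | 3.1%
(not preserved) | 1160 | 2.5%
(not preserved) | 1138 | 2.4%
(not preserved) | 1119 | 2.4%
(not preserved) | 1091 | 2.3%
(not preserved) | 1091 | 2.3%
(not preserved) | 994 | 2.1%
(not preserved) | 958 | 2.0%
Other values (124) | 23175 | 49.5%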

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 44605 | 95.2%
Decimal Number | 1115 | 2.4%
Close Punctuation | 346 | 0.7%
Open Punctuation | 346 | 0.7%
Uppercase Letter | 336 | 0.7%
Other Punctuation | 105 | 0.2%
Space Separator | 4 | < 0.1%

Most frequent character per category

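Other Letter
Value | Count | Frequency (%)
(not preserved) | 7420 | 16.6%
(not preserved) | 7280 | 16.3%
(not preserved) | 1431 | 3.2%
(not preserved) | 1160 | 2.6%
(not preserved) | 1138 | 2.6%
(not preserved) | 1119 | 2.5%
(not preserved) | 1091 | 2.4%
(not preserved) | 1091 | 2.4%
(not preserved) | 994 | 2.2%
(not preserved) | 958 | 2.1%
Other values (113) | 20923 | 46.9%

Decimal Number
Value | Count | Frequency (%)
2 | 947 | 84.9%
1 | 80 | 7.2%
3 | 62 | 5.6%
4 | 26 | 2.3%

Uppercase Letter
Value | Count | Frequency (%)
M | 112 | 33.3%
C | 112 | 33.3%
D | 112 | 33.3%

Close Punctuation
Value | Count | Frequency (%)
) | 346 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 346 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
? | 105 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 4 | 100.0%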

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 44605 | 95.2%
Common | 1916 | 4.1%
Latin | 336 | 0.7%

Most frequent character per script

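Hangul
Value | Count | Frequency (%)
(not preserved) | 7420 | 16.6%
(not preserved) | 7280 | 16.3%
(not preserved) | 1431 | 3.2%
(not preserved) | 1160 | 2.6%
(not preserved) | 1138 | 2.6%
(not preserved) | 1119 | 2.5%
(not preserved) | 1091 | 2.4%
(not preserved) | 1091 | 2.4%
(not preserved) | 994 | 2.2%
(not preserved) | 958 | 2.1%
Other values (113) | 20923 | 46.9%

Common
Value | Count | Frequency (%)
2 | 947 | 49.4%
) | 346 | 18.1%
( | 346 | 18.1%
? | 105 | 5.5%
1 | 80 | 4.2%
3 | 62 | 3.2%
4 | 26 | 1.4%
(space) | 4 | 0.2%

Latin
Value | Count | Frequency (%)
M | 112 | 33.3%
C | 112 | 33.3%
D | 112 | 33.3%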

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 44605 | 95.2%
ASCII | 2252 | 4.8%

Most frequent character per block

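Hangul
Value | Count | Frequency (%)
(not preserved) | 7420 | 16.6%
(not preserved) | 7280 | 16.3%
(not preserved) | 1431 | 3.2%
(not preserved) | 1160 | 2.6%
(not preserved) | 1138 | 2.6%
(not preserved) | 1119 | 2.5%
(not preserved) | 1091 | 2.4%
(not preserved) | 1091 | 2.4%
(not preserved) | 994 | 2.2%
(not preserved) | 958 | 2.1%
Other values (113) | 20923 | 46.9%

ASCII
Value | Count | Frequency (%)
2 | 947 | 42.1%
) | 346 | 15.4%
( | 346 | 15.4%
M | 112 | 5.0%
C | 112 | 5.0%
D | 112 | 5.0%
? | 105 | 4.7%
1 | 80 | 3.6%
3 | 62 | 2.8%
4 | 26 | 1.2%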

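설치종별 (installation class)
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 73.3 KiB

Value | Count
주택용 (residential) | 4646
업무용 (business) | 3622
공공용 (public) | 987
냉수용 (chilled water) | 116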

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 주택용
2nd row: 주택용
3rd row: 주택용
4th row: 주택용
5th row: 주택용

Common Values

Value | Count | Frequency (%)
주택용 | 4646 | 49.6%
업무용 | 3622 | 38.7%
공공용 | 987 | 10.5%
냉수용 | 116 | 1.2%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

기계실번호 (machine room number)
Categorical

Distinct: 28
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 73.3 KiB

Value | Count
5*** | 1143
3*** | 982
6**** | 880
1*** | 874
4*** | 863
Other values (23) | 4629

Length

Max length: 5
Median length: 4
Mean length: 4.0532494
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2***
2nd row: 2***
3rd row: 2***
4th row: 2***
5th row: 2***

Common Values

Value | Count | Frequency (%)
5*** | 1143 | 12.2%
3*** | 982 | 10.5%
6**** | 880 | 9.4%
1*** | 874 | 9.3%
4*** | 863 | 9.2%
2*** | 859 | 9.2%
5**** | 820 | 8.8%
4**** | 688 | 7.3%
6*** | 630 | 6.7%
6** | 195 | 2.1%
Other values (18) | 1437 | 15.3%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
5 | 2181 | 23.3%
4 | 1767 | 18.9%
6 | 1732 | 18.5%
3 | 1197 | 12.8%
1 | 1075 | 11.5%
2 | 1075 | 11.5%
7 | 119 | 1.3%
8 | 110 | 1.2%
9 | 107 | 1.1%
(not preserved) | 8 | 0.1%

설치번호 (installation number)
Categorical

Distinct: 30
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 73.3 KiB

Value | Count
5**** | 952
5*** | 846
1*** | 844
4*** | 842
3*** | 819
Other values (25) | 5068

Length

Max length: 5
Median length: 4
Mean length: 4.0460997
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 5***
2nd row: 5***
3rd row: 5***
4th row: 5***
5th row: 5***

Common Values

Value | Count | Frequency (%)
5**** | 952 | 10.2%
5*** | 846 | 9.0%
1*** | 844 | 9.0%
4*** | 842 | 9.0%
3*** | 819 | 8.7%
2*** | 818 | 8.7%
6**** | 738 | 7.9%
6*** | 669 | 7.1%
7*** | 578 | 6.2%
4**** | 508 | 5.4%
Other values (20) | 1757 | 18.7%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
5 | 2008 | 21.4%
6 | 1581 | 16.9%
4 | 1558 | 16.6%
1 | 1040 | 11.1%
3 | 1013 | 10.8%
2 | 1009 | 10.8%
7 | 944 | 10.1%
8 | 109 | 1.2%
9 | 101 | 1.1%
(not preserved) | 8 | 0.1%

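열량계종류 (heat meter type)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 73.3 KiB

Value | Count
중온수 회수 S형 (medium-temperature water, return, S type) | 9158
냉수 S형 (chilled water, S type) | 115
중온수 공급 S형 (medium-temperature water, supply, S type) | 56
중온수 회수 M형 (medium-temperature water, return, M type) | 41
냉수 M형 (chilled water, M type) | 1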

Length

Max length: 9
Median length: 9
Mean length: 8.9504855
Min length: 5

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 중온수 회수 S형
2nd row: 중온수 회수 S형
3rd row: 중온수 회수 S형
4th row: 중온수 회수 S형
5th row: 중온수 회수 S형

Common Values

Value | Count | Frequency (%)
중온수 회수 S형 | 9158 | 97.7%
냉수 S형 | 115 | 1.2%
중온수 공급 S형 | 56 | 0.6%
중온수 회수 M형 | 41 | 0.4%
냉수 M형 | 1 | < 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Value | Count | Frequency (%)
s형 | 9329 | 33.3%
중온수 | 9255 | 33.1%
회수 | 9199 | 32.9%
냉수 | 116 | 0.4%
공급 | 56 | 0.2%
m형 | 42 | 0.2%

열공급개시일 (heat supply start date)
Real number (ℝ)

SKEWED

Distinct: 3656
Distinct (%): 39.0%
Missing: 2
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 20050795
Minimum: 0
Maximum: 20230920
Zeros: 14
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 82.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 19930117
Q1: 19991105
Median: 20090101
Q3: 20171115
95-th percentile: 20220621
Maximum: 20230920
Range: 20230920
Interquartile range (IQR): 180010

Descriptive statistics

Standard deviation: 781962.33
Coefficient of variation (CV): 0.038999069
Kurtosis: 643.4606
Mean: 20050795
Median Absolute Deviation (MAD): 89572
Skewness: -25.199099
Sum: 1.8785589 × 10^11
Variance: 6.1146508 × 10^11
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
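
The statistics above treat 열공급개시일 as a plain number. Judging by values such as 19871110 and 20230920, the column appears to hold dates encoded as YYYYMMDD integers, with 0 standing in for an unknown date; both points are assumptions, not something the report states. Under that reading, the 14 zeros sit about twenty million units below every real value, which would account for the extreme negative skewness. A minimal sketch of decoding the values before analysis (`parse_yyyymmdd` is a hypothetical helper):

```python
from datetime import datetime


def parse_yyyymmdd(value):
    """Return a date for an integer like 19871110, or None for 0, missing, or invalid input."""
    if value is None:
        return None
    try:
        number = int(value)
        if number == 0:  # assumed placeholder for "unknown date"
            return None
        return datetime.strptime(str(number), "%Y%m%d").date()
    except ValueError:
        return None


# Quartiles reported above, reinterpreted as calendar dates.
q1 = parse_yyyymmdd(19991105)
q3 = parse_yyyymmdd(20171115)
iqr_days = (q3 - q1).days
print(q1, q3, iqr_days)
```

Read as dates, the interquartile range spans roughly 18 years, whereas the numeric IQR of 180010 has no direct calendar meaning.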
Common values
Value | Count | Frequency (%)
19871109 | 64 | 0.7%
19990901 | 25 | 0.3%
19871110 | 23 | 0.2%
19980901 | 23 | 0.2%
20150831 | 20 | 0.2%
20010901 | 18 | 0.2%
19990907 | 17 | 0.2%
19971118 | 17 | 0.2%
20130830 | 17 | 0.2%
19930325 | 17 | 0.2%
Other values (3646) | 9128 | 97.4%

Minimum 10 values
Value | Count | Frequency (%)
0 | 14 | 0.1%
19871102 | 8 | 0.1%
19871107 | 2 | < 0.1%
19871109 | 64 | 0.7%
19871110 | 23 | 0.2%
19871111 | 10 | 0.1%
19871112 | 1 | < 0.1%
19871125 | 4 | < 0.1%
19871128 | 1 | < 0.1%
19871201 | 3 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
20230920 | 4 | < 0.1%
20230919 | 1 | < 0.1%
20230915 | 2 | < 0.1%
20230911 | 2 | < 0.1%
20230907 | 2 | < 0.1%
20230905 | 1 | < 0.1%
20230904 | 2 | < 0.1%
20230901 | 1 | < 0.1%
20230830 | 1 | < 0.1%
20230828 | 1 | < 0.1%

Interactions


Correlations

Variable | 지사 | 설치종별 | 기계실번호 | 설치번호 | 열량계종류 | 열공급개시일
지사 | 1.000 | 0.493 | 0.834 | 0.860 | 0.368 | 0.125
설치종별 | 0.493 | 1.000 | 0.528 | 0.467 | 0.647 | 0.052
기계실번호 | 0.834 | 0.528 | 1.000 | 0.900 | 0.235 | 0.136
설치번호 | 0.860 | 0.467 | 0.900 | 1.000 | 0.258 | 0.423
열량계종류 | 0.368 | 0.647 | 0.235 | 0.258 | 1.000 | 0.000
열공급개시일 | 0.125 | 0.052 | 0.136 | 0.423 | 0.000 | 1.000

Variable | 기계실번호 | 설치종별 | 열량계종류 | 지사 | 설치번호
기계실번호 | 1.000 | 0.276 | 0.115 | 0.389 | 0.438
설치종별 | 0.276 | 1.000 | 0.577 | 0.291 | 0.259
열량계종류 | 0.115 | 0.577 | 1.000 | 0.192 | 0.114
지사 | 0.389 | 0.291 | 0.192 | 1.000 | 0.422
설치번호 | 0.438 | 0.259 | 0.114 | 0.422 | 1.000

Variable | 열공급개시일 | 지사 | 설치종별 | 기계실번호 | 설치번호 | 열량계종류
열공급개시일 | 1.000 | 0.111 | 0.035 | 0.107 | 0.337 | 0.000
지사 | 0.111 | 1.000 | 0.291 | 0.389 | 0.422 | 0.192
설치종별 | 0.035 | 0.291 | 1.000 | 0.276 | 0.259 | 0.577
기계실번호 | 0.107 | 0.389 | 0.276 | 1.000 | 0.438 | 0.115
설치번호 | 0.337 | 0.422 | 0.259 | 0.438 | 1.000 | 0.114
열량계종류 | 0.000 | 0.192 | 0.577 | 0.115 | 0.114 | 1.000
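
This extract does not say which association measure produced each matrix. For pairs of categorical variables, one widely used choice is Cramér's V, which is derived from the chi-squared statistic of the contingency table. The following is a minimal sketch of the uncorrected form on hypothetical data; `cramers_v` is an illustrative helper, not the profiler's implementation, and a bias-corrected variant would give somewhat different numbers.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length sequences of category labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    for x, rx in row_totals.items():
        for y, cy in col_totals.items():
            expected = rx * cy / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or k == 0:
        return 0.0
    return math.sqrt(chi2 / (n * k))


# Hypothetical labels: the second variable is fully determined by the first.
install_class = ["주택용", "주택용", "냉수용", "냉수용"]
meter_type = ["중온수 회수 S형", "중온수 회수 S형", "냉수 S형", "냉수 S형"]
print(cramers_v(install_class, meter_type))
```

A value near 1 means one variable largely determines the other, and a value near 0 means the two are close to independent.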

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

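First rows
Index | 지사 | 지구 | 설치종별 | 기계실번호 | 설치번호 | 열량계종류 | 열공급개시일
0 | 중앙지사 | 여의도 | 주택용 | 2*** | 5*** | 중온수 회수 S형 | 19871110
1 | 중앙지사 | 여의도 | 주택용 | 2*** | 5*** | 중온수 회수 S형 | 19871110
2 | 중앙지사 | 여의도 | 주택용 | 2*** | 5*** | 중온수 회수 S형 | 19871110
3 | 중앙지사 | 여의도 | 주택용 | 2*** | 5*** | 중온수 회수 S형 | 19871110
4 | 중앙지사 | 여의도 | 주택용 | 2*** | 5*** | 중온수 회수 S형 | 19871110
5 | 중앙지사 | 여의도 | 주택용 | 2*** | 5*** | 중온수 회수 S형 | 19871110
6 | 중앙지사 | 여의도 | 주택용 | 2*** | 5*** | 중온수 회수 S형 | 19871110
7 | 중앙지사 | 여의도 | 주택용 | 2*** | 5*** | 중온수 회수 S형 | 19871110
8 | 중앙지사 | 여의도 | 주택용 | 2*** | 5*** | 중온수 회수 S형 | 19871110
9 | 중앙지사 | 여의도 | 주택용 | 2*** | 5*** | 중온수 회수 S형 | 19871110

Last rows
Index | 지사 | 지구 | 설치종별 | 기계실번호 | 설치번호 | 열량계종류 | 열공급개시일
9361 | 광주전남지사 | 광주?전남혁신도시 | 업무용 | 6**** | 6**** | 중온수 회수 S형 | 20220728
9362 | 광주전남지사 | 광주?전남혁신도시 | 업무용 | 6**** | 6**** | 중온수 회수 S형 | 20220728
9363 | 광주전남지사 | 광주?전남혁신도시 | 공공용 | 2*** | 1*** | 중온수 회수 S형 | 20121221
9364 | 광주전남지사 | 광주?전남혁신도시 | 공공용 | 2*** | 1*** | 중온수 회수 S형 | 20121221
9365 | 광주전남지사 | 광주?전남혁신도시 | 공공용 | 4**** | 4**** | 중온수 회수 S형 | 20190221
9366 | 광주전남지사 | 광주?전남혁신도시 | 공공용 | 4**** | 4**** | 중온수 회수 S형 | 20180528
9367 | 광주전남지사 | 광주?전남혁신도시 | 공공용 | 5*** | 6*** | 중온수 회수 S형 | 20131119
9368 | 광주전남지사 | 광주?전남혁신도시 | 공공용 | 5*** | 6*** | 중온수 회수 S형 | 20131116
9369 | 광주전남지사 | 광주?전남혁신도시 | 공공용 | 5*** | 6*** | 중온수 회수 S형 | 20140512
9370 | 광주전남지사 | 광주?전남혁신도시 | 공공용 | 5*** | 6*** | 중온수 회수 S형 | 20140512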

Duplicate rows

Most frequently occurring

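Index | 지사 | 지구 | 설치종별 | 기계실번호 | 설치번호 | 열량계종류 | 열공급개시일 | # duplicates
1793 | 중앙지사 | 용산지구 | 주택용 | 2*** | 5*** | 중온수 회수 S형 | 19871109 | 27
1787 | 중앙지사 | 여의도 | 주택용 | 2*** | 5*** | 중온수 회수 S형 | 19871110 | 22
1786 | 중앙지사 | 여의도 | 주택용 | 2*** | 5*** | 중온수 회수 S형 | 19871109 | 21
183 | 강남지사 | 송파지구 | 주택용 | 5** | 9** | 중온수 회수 S형 | 19990907 | 17
640 | 대구지사 | 성서지구 | 주택용 | 3*** | 1*** | 중온수 회수 S형 | 19971118 | 17
166 | 강남지사 | 송파지구 | 주택용 | 4*** | 5*** | 중온수 회수 S형 | 19921014 | 16
169 | 강남지사 | 송파지구 | 주택용 | 4*** | 6** | 중온수 회수 S형 | 19970910 | 16
1716 | 중앙지사 | 가재울뉴타운 | 주택용 | 6*** | 7*** | 중온수 회수 S형 | 20150831 | 15
44 | 강남지사 | 강남기존 | 주택용 | 6**** | 7**** | 중온수 회수 S형 | 0 | 14
45 | 강남지사 | 강남기존 | 주택용 | 6**** | 7**** | 중온수 회수 S형 | 20230721 | 14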
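
A ranking like the one above can be produced by counting identical rows and keeping those that occur more than once. The following is a minimal sketch on a hypothetical mini-table with shortened rows; `most_frequent_duplicates` is an illustrative helper. This extract does not state whether the headline figure of 2219 counts distinct duplicated rows or every row involved, so the sketch makes no claim about that.

```python
from collections import Counter


def most_frequent_duplicates(rows, top=10):
    """Return (row, count) pairs for rows occurring more than once, most frequent first."""
    counts = Counter(rows)
    return [(row, n) for row, n in counts.most_common() if n > 1][:top]


# Hypothetical mini-table: (지사, 지구, 열공급개시일) only.
rows = [
    ("중앙지사", "용산지구", 19871109),
    ("중앙지사", "여의도", 19871110),
    ("중앙지사", "용산지구", 19871109),
    ("강남지사", "강남기존", 0),
    ("중앙지사", "용산지구", 19871109),
    ("중앙지사", "여의도", 19871110),
]
duplicates = most_frequent_duplicates(rows)
for row, n in duplicates:
    print(n, row)
```

Because the identifying columns are masked (for example `2***`), distinct meters can collapse into identical rows, so a high duplicate count here does not necessarily indicate repeated records in the source system.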