Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 498.0 KiB
Average record size in memory: 51.0 B
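
The average record size can be recovered from the total memory size and the number of observations. The following is a minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics above.
total_size_kib = 498.0
n_observations = 10_000
n_variables = 5
missing_cells = 0

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations
missing_pct = 100 * missing_cells / (n_observations * n_variables)

print(f"{avg_record_bytes:.1f} B per record")  # 51.0 B per record
print(f"{missing_pct:.1f}% missing cells")     # 0.0% missing cells
```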

Variable types

Numeric: 2
Categorical: 2
Text: 1

Dataset

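Description: District heating (heat) usage by address category for 2022 to 2023, provided by Korea District Heating Corporation (unit: Gcal).
Author: Korea District Heating Corporation (한국지역난방공사)
URL: https://www.data.go.kr/data/15127129/fileData.do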

Alerts

단위 (unit) has constant value "Gcal" [Constant]
순번 (sequence number) is highly overall correlated with 연도 (year) [High correlation]
연도 (year) is highly overall correlated with 순번 (sequence number) [High correlation]
순번 (sequence number) has unique values [Unique]
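
The Constant and Unique alerts can be illustrated with a small check on the number of distinct values in a column. This is a sketch under simple assumed rules, not the profiling tool's own implementation; `column_alerts` is a hypothetical helper.

```python
def column_alerts(values):
    """Return simple alert labels for one column (illustrative rules only)."""
    distinct = set(values)
    alerts = []
    if len(distinct) == 1:
        alerts.append("Constant")  # every row holds the same value
    if len(values) > 1 and len(distinct) == len(values):
        alerts.append("Unique")    # no value occurs twice
    return alerts


unit_column = ["Gcal"] * 5                      # like the 단위 column
sequence_column = [4772, 8888, 1939, 5994, 2034]  # a few 순번 values

print(column_alerts(unit_column))      # ['Constant']
print(column_alerts(sequence_column))  # ['Unique']
```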

Reproduction

Analysis started: 2024-03-16 04:17:36.933821
Analysis finished: 2024-03-16 04:17:38.923213
Duration: 1.99 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

순번 (sequence number)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5724.985
Minimum: 1
Maximum: 11444
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 571.95
Q1: 2854.75
Median: 5722.5
Q3: 8602.25
95th percentile: 10866.05
Maximum: 11444
Range: 11443
Interquartile range (IQR): 5747.5
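
The range and the interquartile range follow directly from the other quantile statistics. A minimal sketch of the arithmetic, using the values reported above (variable names are illustrative):

```python
# Quantile statistics for 순번, copied from the report.
minimum, maximum = 1, 11444
q1, q3 = 2854.75, 8602.25

value_range = maximum - minimum  # spread between the extremes
iqr = q3 - q1                    # spread of the middle 50% of rows

print(value_range)  # 11443
print(iqr)          # 5747.5
```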

Descriptive statistics

Standard deviation: 3309.7205
Coefficient of variation (CV): 0.57811863
Kurtosis: -1.2068672
Mean: 5724.985
Median Absolute Deviation (MAD): 2873.5
Skewness: -0.0025100006
Sum: 57249850
Variance: 10954250
Monotonicity: Not monotonic
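
Several of these descriptive statistics are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. The sketch below checks those relationships against the reported (rounded) figures; it is a consistency check, not the tool's computation.

```python
import math

# Descriptive statistics for 순번, copied from the report.
mean = 5724.985
std = 3309.7205
n_observations = 10_000

cv = std / mean                # coefficient of variation
variance = std ** 2            # variance from the standard deviation
total = mean * n_observations  # sum from the mean

print(round(cv, 4))  # 0.5781
# The report rounds its figures, so compare with a small tolerance.
print(math.isclose(variance, 10954250, rel_tol=1e-6))  # True
print(math.isclose(total, 57249850, rel_tol=1e-9))     # True
```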
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
4772                 1      < 0.1%
8888                 1      < 0.1%
1939                 1      < 0.1%
5994                 1      < 0.1%
2034                 1      < 0.1%
7912                 1      < 0.1%
10387                1      < 0.1%
7992                 1      < 0.1%
6384                 1      < 0.1%
2208                 1      < 0.1%
Other values (9990)  9990   99.9%

Value  Count  Frequency (%)
1      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
5      1      < 0.1%
6      1      < 0.1%
7      1      < 0.1%
8      1      < 0.1%
9      1      < 0.1%
10     1      < 0.1%
11     1      < 0.1%

Value  Count  Frequency (%)
11444  1      < 0.1%
11443  1      < 0.1%
11442  1      < 0.1%
11440  1      < 0.1%
11437  1      < 0.1%
11435  1      < 0.1%
11434  1      < 0.1%
11433  1      < 0.1%
11432  1      < 0.1%
11431  1      < 0.1%

연도 (year)
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
2023: 5197
2022: 4803

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022
2nd row: 2022
3rd row: 2022
4th row: 2022
5th row: 2022

Common Values

Value  Count  Frequency (%)
2023   5197   52.0%
2022   4803   48.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
2023   5197   52.0%
2022   4803   48.0%

주소 (address)
Text

Distinct: 1521
Distinct (%): 15.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 23
Median length: 21
Mean length: 15.6762
Min length: 7

Characters and Unicode

Total characters: 156762
Distinct characters: 323
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
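
As an illustration of these character properties, the sketch below counts Unicode general categories for one address taken from the sample rows, using Python's standard `unicodedata` module. It shows the idea only and is not the profiling tool's implementation; the Hangul count relies on character names as a rough stand-in for the script property, which `unicodedata` does not expose directly.

```python
import unicodedata
from collections import Counter

address = "경기도 화성시 동탄첨단산업2로"  # first sample row of the 주소 column

# General category per character: Lo = Other Letter,
# Zs = Space Separator, Nd = Decimal Number.
categories = Counter(unicodedata.category(ch) for ch in address)
print(dict(categories))  # {'Lo': 13, 'Zs': 2, 'Nd': 1}

# Rough script check: Hangul syllables have names starting with "HANGUL".
hangul_chars = sum(unicodedata.name(ch, "").startswith("HANGUL") for ch in address)
print(hangul_chars)  # 13
```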

Unique

Unique: 147
Unique (%): 1.5%

Sample

1st row: 경기도 화성시 동탄첨단산업2로
2nd row: 충청북도 청주시 서원구 구룡산로52번길
3rd row: 서울특별시 강남구 영동대로
4th row: 서울특별시 서초구 나루터로
5th row: 서울특별시 강남구 자곡로

Value                Count  Frequency (%)
경기도                6728   19.2%
고양시                1842   5.3%
성남시                1815   5.2%
분당구                1660   4.7%
서울특별시            1450   4.1%
용인시                980    2.8%
화성시                944    2.7%
수원시                806    2.3%
세종특별자치시        687    2.0%
일산동구              674    1.9%
Other values (1492)  17432  49.8%
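
The table above counts whitespace-separated words across all addresses. A minimal sketch of that kind of word count, applied only to the five sample rows shown earlier (so the counts differ from the full-column figures), assuming plain whitespace tokenization:

```python
from collections import Counter

# The five sample rows of the 주소 column.
sample_addresses = [
    "경기도 화성시 동탄첨단산업2로",
    "충청북도 청주시 서원구 구룡산로52번길",
    "서울특별시 강남구 영동대로",
    "서울특별시 서초구 나루터로",
    "서울특별시 강남구 자곡로",
]

# Split each address on whitespace and count every word.
word_counts = Counter(
    word for address in sample_addresses for word in address.split()
)
total_words = sum(word_counts.values())

for word, count in word_counts.most_common(2):
    print(word, count, f"{100 * count / total_words:.1f}%")
```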

Most occurring characters

ValueCountFrequency (%)
25025
 
16.0%
10109
 
6.4%
9590
 
6.1%
8155
 
5.2%
7656
 
4.9%
7373
 
4.7%
7133
 
4.6%
3261
 
2.1%
2988
 
1.9%
2979
 
1.9%
Other values (313) 72493
46.2%

Most occurring categories

Value             Count   Frequency (%)
Other Letter      124384  79.3%
Space Separator   25025   16.0%
Decimal Number    7351    4.7%
Open Punctuation  2       < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
10109
 
8.1%
9590
 
7.7%
8155
 
6.6%
7656
 
6.2%
7373
 
5.9%
7133
 
5.7%
3261
 
2.6%
2988
 
2.4%
2979
 
2.4%
2957
 
2.4%
Other values (301) 62183
50.0%
Decimal Number
Value  Count  Frequency (%)
1      1574   21.4%
2      1060   14.4%
3      903    12.3%
5      719    9.8%
4      684    9.3%
6      586    8.0%
7      522    7.1%
8      471    6.4%
0      460    6.3%
9      372    5.1%
Space Separator
Value    Count  Frequency (%)
(space)  25025  100.0%
Open Punctuation
Value  Count  Frequency (%)
(      2      100.0%

Most occurring scripts

Value   Count   Frequency (%)
Hangul  124384  79.3%
Common  32378   20.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
10109
 
8.1%
9590
 
7.7%
8155
 
6.6%
7656
 
6.2%
7373
 
5.9%
7133
 
5.7%
3261
 
2.6%
2988
 
2.4%
2979
 
2.4%
2957
 
2.4%
Other values (301) 62183
50.0%
Common
Value             Count  Frequency (%)
(space)           25025  77.3%
1                 1574   4.9%
2                 1060   3.3%
3                 903    2.8%
5                 719    2.2%
4                 684    2.1%
6                 586    1.8%
7                 522    1.6%
8                 471    1.5%
0                 460    1.4%
Other values (2)  374    1.2%

Most occurring blocks

Value   Count   Frequency (%)
Hangul  124384  79.3%
ASCII   32378   20.7%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
(space)           25025  77.3%
1                 1574   4.9%
2                 1060   3.3%
3                 903    2.8%
5                 719    2.2%
4                 684    2.1%
6                 586    1.8%
7                 522    1.6%
8                 471    1.5%
0                 460    1.4%
Other values (2)  374    1.2%
Hangul
ValueCountFrequency (%)
10109
 
8.1%
9590
 
7.7%
8155
 
6.6%
7656
 
6.2%
7373
 
5.9%
7133
 
5.7%
3261
 
2.6%
2988
 
2.4%
2979
 
2.4%
2957
 
2.4%
Other values (301) 62183
50.0%

열 사용량 (heat usage)
Real number (ℝ)

Distinct: 8671
Distinct (%): 86.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2626.1328
Minimum: 0
Maximum: 65331.4
Zeros: 2
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5th percentile: 40.4
Q1: 289.375
Median: 1281.65
Q3: 3721.575
95th percentile: 8451.685
Maximum: 65331.4
Range: 65331.4
Interquartile range (IQR): 3432.2

Descriptive statistics

Standard deviation: 3993.5535
Coefficient of variation (CV): 1.5206975
Kurtosis: 57.834601
Mean: 2626.1328
Median Absolute Deviation (MAD): 1149.85
Skewness: 5.6261619
Sum: 26261328
Variance: 15948470
Monotonicity: Not monotonic
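
The positive skewness is also visible in simpler figures: the mean is roughly twice the median, and the standard deviation exceeds the mean. A short sketch of those two ratios, using the reported values (variable names are illustrative):

```python
# Statistics for 열 사용량 (Gcal), copied from the report.
mean = 2626.1328
median = 1281.65
std = 3993.5535

mean_to_median = mean / median  # above 1 suggests a long right tail
cv = std / mean                 # coefficient of variation

print(round(mean_to_median, 2))  # 2.05
print(round(cv, 4))              # 1.5207
```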
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
7.6                  6      0.1%
24.9                 6      0.1%
281.0                5      0.1%
45.5                 5      0.1%
46.9                 5      0.1%
202.0                5      0.1%
66.3                 5      0.1%
181.7                4      < 0.1%
150.6                4      < 0.1%
51.4                 4      < 0.1%
Other values (8661)  9951   99.5%

Value  Count  Frequency (%)
0.0    2      < 0.1%
0.1    3      < 0.1%
0.2    2      < 0.1%
0.3    3      < 0.1%
0.4    3      < 0.1%
0.5    4      < 0.1%
0.6    1      < 0.1%
0.8    1      < 0.1%
0.9    3      < 0.1%
1.1    2      < 0.1%

Value    Count  Frequency (%)
65331.4  1      < 0.1%
65321.5  1      < 0.1%
60605.2  1      < 0.1%
59616.4  1      < 0.1%
58303.6  1      < 0.1%
56769.4  1      < 0.1%
56000.2  1      < 0.1%
55549.2  1      < 0.1%
54829.6  1      < 0.1%
50899.0  1      < 0.1%

단위 (unit)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Gcal: 10000

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Gcal
2nd row: Gcal
3rd row: Gcal
4th row: Gcal
5th row: Gcal

Common Values

Value  Count  Frequency (%)
Gcal   10000  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
gcal   10000  100.0%

Interactions

[Figures: four pairwise interaction plots]

Correlations

           순번    연도    열 사용량
순번       1.000  0.999  0.101
연도       0.999  1.000  0.050
열 사용량  0.101  0.050  1.000

           순번    열 사용량  연도
순번       1.000   -0.050    0.967
열 사용량  -0.050  1.000     0.039
연도       0.967   0.039     1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
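
Both views summarize where values are missing. A minimal sketch of a per-column nullity count, assuming rows are held as dictionaries and missing values are `None`; `nullity_by_column` is a hypothetical helper, and the two rows come from the sample in this report.

```python
def nullity_by_column(rows, columns):
    """Count missing (None) entries per column; an illustrative helper."""
    return {col: sum(row.get(col) is None for row in rows) for col in columns}


columns = ["순번", "연도", "주소", "열 사용량", "단위"]
rows = [
    {"순번": 4772, "연도": 2022, "주소": "경기도 화성시 동탄첨단산업2로",
     "열 사용량": 550.1, "단위": "Gcal"},
    {"순번": 3053, "연도": 2022, "주소": "충청북도 청주시 서원구 구룡산로52번길",
     "열 사용량": 3974.9, "단위": "Gcal"},
]

missing = nullity_by_column(rows, columns)
print(missing)  # every count is 0, matching the report's 0 missing cells
```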

Sample

First rows

(index)  순번   연도  주소                                  열 사용량  단위
4771     4772   2022  경기도 화성시 동탄첨단산업2로          550.1      Gcal
3052     3053   2022  충청북도 청주시 서원구 구룡산로52번길  3974.9     Gcal
4982     4983   2022  서울특별시 강남구 영동대로             115.5      Gcal
1893     1894   2022  서울특별시 서초구 나루터로             1186.1     Gcal
4288     4289   2022  서울특별시 강남구 자곡로               2014.7     Gcal
10005    10006  2023  경기도 화성시 동탄순환대로11길         3479.8     Gcal
9225     9226   2023  서울특별시 송파구 법원로               2223.7     Gcal
93       94     2022  충청북도 청주시 서원구 분평로          4898.1     Gcal
748      749    2022  경기도 성남시 분당구 벌말로30번길      1321.8     Gcal
6251     6252   2023  경기도 성남시 분당구 판교로            1925.4     Gcal

Last rows

(index)  순번   연도  주소                                  열 사용량  단위
3026     3027   2022  경기도 고양시 일산서구 강성로          263.2      Gcal
2431     2432   2022  경기도 수원시 영통구 광교호수로        1646.5     Gcal
5270     5271   2022  경기도 성남시 분당구 판교로256번길     1449.8     Gcal
628      629    2022  경기도 성남시 분당구 서현로180번길     115.5      Gcal
5512     5513   2023  경기도 성남시 분당구 구미로174번길     598.3      Gcal
4555     4556   2022  경기도 파주시 경의로                   471.0      Gcal
3557     3558   2022  서울특별시 마포구 독막로20나길         2985.6     Gcal
8490     8491   2023  경기도 화성시 삼성1로                  603.8      Gcal
9906     9907   2023  경기도 고양시 일산동구 장백로          899.7      Gcal
2537     2538   2022  서울특별시 서초구 효령로68길           1906.0     Gcal