Overview

Dataset statistics

Number of variables: 9
Number of observations: 2625
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 735
Duplicate rows (%): 28.0%
Total size in memory: 200.1 KiB
Average record size in memory: 78.1 B
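
The two derived figures in this block (duplicate-row percentage and average record size) follow from the raw counts by simple arithmetic. A minimal sketch of that check, assuming 1 KiB = 1024 bytes; the variable names are illustrative only:

```python
# Raw figures copied from the dataset statistics above.
n_rows = 2625
n_duplicate_rows = 735
total_size_kib = 200.1

# Share of duplicate rows, in percent.
duplicate_pct = 100 * n_duplicate_rows / n_rows

# Average record size in bytes, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")       # 28.0%
print(f"Average record size: {avg_record_bytes:.1f} B")  # 78.1 B
```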

Variable types

Numeric: 5
Categorical: 4

Alerts

시도명 has constant value "충청남도" (Constant)
Dataset has 735 (28.0%) duplicate rows (Duplicates)
전력사용량 is highly overall correlated with 전기요금 (High correlation)
평균판매단가 is highly overall correlated with 조회연도 (High correlation)
전기요금 is highly overall correlated with 전력사용량 (High correlation)
조회연도 is highly overall correlated with 평균판매단가 (High correlation)

Reproduction

Analysis started: 2024-01-09 22:56:06.229937
Analysis finished: 2024-01-09 22:56:09.562450
Duration: 3.33 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
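
The report does not show the command that produced it. As a minimal sketch, assuming the data was loaded from a CSV file into a pandas DataFrame, a report of this kind can be generated with ydata-profiling's `ProfileReport`. The helper name `build_report` and the file names in the usage comment are hypothetical, not taken from the report.

```python
def build_report(csv_path: str, html_path: str) -> None:
    """Profile a CSV file and write the result as an HTML report (sketch)."""
    # Imported inside the function so the optional dependencies are only
    # needed when a report is actually built.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)  # assumption: the source data is a CSV file
    profile = ProfileReport(df, title="Profiling Report")
    profile.to_file(html_path)


# Hypothetical usage (file names are placeholders):
# build_report("electricity_usage.csv", "report.html")
```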

Variables

조회월
Real number (ℝ)

Distinct: 11
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.24
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 3
Median: 6
Q3: 8
95-th percentile: 11
Maximum: 12
Range: 11
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.063654
Coefficient of variation (CV): 0.49097019
Kurtosis: -1.0172453
Mean: 6.24
Median Absolute Deviation (MAD): 3
Skewness: 0.023356163
Sum: 16380
Variance: 9.3859756
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)

Common values

Value  Count  Frequency (%)
2      315    12.0%
3      315    12.0%
5      315    12.0%
6      315    12.0%
7      315    12.0%
8      315    12.0%
10     315    12.0%
1      105    4.0%
9      105    4.0%
11     105    4.0%

Minimum 10 values

Value  Count  Frequency (%)
1      105    4.0%
2      315    12.0%
3      315    12.0%
5      315    12.0%
6      315    12.0%
7      315    12.0%
8      315    12.0%
9      105    4.0%
10     315    12.0%
11     105    4.0%

Maximum 10 values

Value  Count  Frequency (%)
12     105    4.0%
11     105    4.0%
10     315    12.0%
9      105    4.0%
8      315    12.0%
7      315    12.0%
6      315    12.0%
5      315    12.0%
3      315    12.0%
2      315    12.0%
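
Because 조회월 has only 11 distinct values, its summary statistics can be recomputed from the value counts listed in this section. The sketch below uses only the Python standard library and assumes the sample (n - 1) form for variance and standard deviation, which is the form that matches the reported figures.

```python
from statistics import mean, median, stdev, variance

# Value counts for 조회월 as listed in this section (month 4 does not occur).
counts = {1: 105, 2: 315, 3: 315, 5: 315, 6: 315, 7: 315,
          8: 315, 9: 105, 10: 315, 11: 105, 12: 105}

# Expand the frequency table back into the full column of observations.
values = [month for month, n in counts.items() for _ in range(n)]

n_obs = len(values)
total = sum(values)
avg = mean(values)
var = variance(values)   # sample variance (divides by n - 1)
std = stdev(values)      # sample standard deviation
cv = std / avg           # coefficient of variation
med = median(values)
mad = median(abs(v - med) for v in values)  # median absolute deviation

print(n_obs, total, avg)                   # 2625 16380 6.24
print(round(var, 7), round(std, 6))        # 9.3859756 3.063654
print(med, mad, max(values) - min(values)) # 6 3 11
```

The recomputed values agree with the Sum, Mean, Variance, Standard deviation, Median, MAD and Range entries reported for this variable.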

조회연도
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 20.6 KiB
2022: 1995
2023: 630

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022
2nd row: 2022
3rd row: 2022
4th row: 2022
5th row: 2022

Common Values

Value  Count  Frequency (%)
2022   1995   76.0%
2023   630    24.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Bar chart of the value counts listed under Common Values]

시군구명
Categorical

Distinct: 15
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 20.6 KiB
계룡시: 175
공주시: 175
금산군: 175
논산시: 175
당진시: 175
Other values (10): 1750

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 계룡시
2nd row: 계룡시
3rd row: 계룡시
4th row: 계룡시
5th row: 계룡시

Common Values

Value             Count  Frequency (%)
계룡시            175    6.7%
공주시            175    6.7%
금산군            175    6.7%
논산시            175    6.7%
당진시            175    6.7%
보령시            175    6.7%
부여군            175    6.7%
서산시            175    6.7%
서천군            175    6.7%
아산시            175    6.7%
Other values (5)  875    33.3%

Length

Histogram of lengths of the category

시도명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 20.6 KiB
충청남도: 2625

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 충청남도
2nd row: 충청남도
3rd row: 충청남도
4th row: 충청남도
5th row: 충청남도

Common Values

Value     Count  Frequency (%)
충청남도  2625   100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Bar chart of the value counts listed under Common Values]

전력사용량
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1785
Distinct (%): 68.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 39491647
Minimum: 182829
Maximum: 1.1034753 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 182829
5-th percentile: 463745
Q1: 1620123
Median: 8933471
Q3: 20579479
95-th percentile: 92944266
Maximum: 1.1034753 × 10^9
Range: 1.1032924 × 10^9
Interquartile range (IQR): 18959356

Descriptive statistics

Standard deviation: 1.3340112 × 10^8
Coefficient of variation (CV): 3.3779579
Kurtosis: 31.049769
Mean: 39491647
Median Absolute Deviation (MAD): 7736703
Skewness: 5.4647319
Sum: 1.0366557 × 10^11
Variance: 1.779586 × 10^16
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
3335721              3      0.1%
20675446             3      0.1%
1254491              3      0.1%
1587063              3      0.1%
15142245             3      0.1%
407862818            3      0.1%
3842217              3      0.1%
27791083             3      0.1%
21491496             3      0.1%
961344               3      0.1%
Other values (1775)  2595   98.9%

Minimum 10 values

Value   Count  Frequency (%)
182829  2      0.1%
186935  2      0.1%
190232  1      < 0.1%
194351  1      < 0.1%
197615  2      0.1%
198780  2      0.1%
206265  1      < 0.1%
208453  1      < 0.1%
219474  1      < 0.1%
220647  1      < 0.1%

Maximum 10 values

Value       Count  Frequency (%)
1103475252  2      0.1%
1050298434  1      < 0.1%
1031520071  2      0.1%
1014684020  2      0.1%
1002420341  2      0.1%
986575177   1      < 0.1%
986182081   1      < 0.1%
983258690   1      < 0.1%
975167408   2      0.1%
959372999   3      0.1%

계약종별
Categorical

Distinct: 7
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 20.6 KiB
가로등: 375
교육용: 375
농사용: 375
산업용: 375
심 야: 375
Other values (2): 750

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 가로등
2nd row: 교육용
3rd row: 농사용
4th row: 산업용
5th row: 심 야

Common Values

Value   Count  Frequency (%)
가로등  375    14.3%
교육용  375    14.3%
농사용  375    14.3%
산업용  375    14.3%
심 야   375    14.3%
일반용  375    14.3%
주택용  375    14.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
가로등  375    12.5%
교육용  375    12.5%
농사용  375    12.5%
산업용  375    12.5%
        375    12.5%
        375    12.5%
일반용  375    12.5%
주택용  375    12.5%

평균판매단가
Real number (ℝ)

HIGH CORRELATION 

Distinct: 960
Distinct (%): 36.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 117.11063
Minimum: 44.5
Maximum: 206.3
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 44.5
5-th percentile: 57.8
Q1: 87.8
Median: 124.2
Q3: 141.4
95-th percentile: 169.48
Maximum: 206.3
Range: 161.8
Interquartile range (IQR): 53.6

Descriptive statistics

Standard deviation: 35.661018
Coefficient of variation (CV): 0.3045071
Kurtosis: -0.77986948
Mean: 117.11063
Median Absolute Deviation (MAD): 22.9
Skewness: -0.2556845
Sum: 307415.4
Variance: 1271.7082
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
138.8               12     0.5%
67.1                12     0.5%
137.0               11     0.4%
124.1               11     0.4%
112.4               11     0.4%
128.0               11     0.4%
131.2               11     0.4%
125.9               10     0.4%
70.8                10     0.4%
126.3               10     0.4%
Other values (950)  2516   95.8%

Minimum 10 values

Value  Count  Frequency (%)
44.5   2      0.1%
45.3   4      0.2%
46.2   2      0.1%
46.6   1      < 0.1%
46.7   1      < 0.1%
46.8   2      0.1%
47.0   2      0.1%
47.4   4      0.2%
47.5   4      0.2%
47.8   1      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
206.3  1      < 0.1%
205.1  1      < 0.1%
203.4  1      < 0.1%
200.0  1      < 0.1%
197.8  1      < 0.1%
197.2  1      < 0.1%
196.5  3      0.1%
196.2  1      < 0.1%
195.9  1      < 0.1%
195.3  1      < 0.1%

전기요금
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1785
Distinct (%): 68.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.8750866 × 10^9
Minimum: 16827517
Maximum: 1.5902492 × 10^11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 16827517
5-th percentile: 56761767
Q1: 1.713811 × 10^8
Median: 8.2390744 × 10^8
Q3: 2.2128832 × 10^9
95-th percentile: 1.2401494 × 10^10
Maximum: 1.5902492 × 10^11
Range: 1.590081 × 10^11
Interquartile range (IQR): 2.0415021 × 10^9

Descriptive statistics

Standard deviation: 1.6580614 × 10^10
Coefficient of variation (CV): 3.4010911
Kurtosis: 33.488486
Mean: 4.8750866 × 10^9
Median Absolute Deviation (MAD): 7.0930283 × 10^8
Skewness: 5.5767732
Sum: 1.2797102 × 10^13
Variance: 2.7491675 × 10^20
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
238036840            3      0.1%
2814695674           3      0.1%
160137442            3      0.1%
188990202            3      0.1%
978479125            3      0.1%
49184224336          3      0.1%
271176269            3      0.1%
3855776856           3      0.1%
2714920620           3      0.1%
120470190            3      0.1%
Other values (1775)  2595   98.9%

Minimum 10 values

Value     Count  Frequency (%)
16827517  2      0.1%
17168165  2      0.1%
18790321  2      0.1%
19910422  2      0.1%
20295197  3      0.1%
20361799  1      < 0.1%
20374999  3      0.1%
21035378  1      < 0.1%
21042203  1      < 0.1%
21551705  2      0.1%

Maximum 10 values

Value         Count  Frequency (%)
159024923371  1      < 0.1%
158888550038  1      < 0.1%
150645572912  1      < 0.1%
138083394438  1      < 0.1%
134598767541  1      < 0.1%
132734074666  1      < 0.1%
128610497161  2      0.1%
127037121425  2      0.1%
126233032880  1      < 0.1%
122730542254  1      < 0.1%

고객호수
Real number (ℝ)

Distinct: 1453
Distinct (%): 55.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14496.406
Minimum: 16
Maximum: 181131
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 16
5-th percentile: 53
Q1: 1376
Median: 6687
Q3: 16108
95-th percentile: 50667.6
Maximum: 181131
Range: 181115
Interquartile range (IQR): 14732

Descriptive statistics

Standard deviation: 23676.98
Coefficient of variation (CV): 1.6333
Kurtosis: 22.535396
Mean: 14496.406
Median Absolute Deviation (MAD): 6088
Skewness: 4.1265017
Sum: 38053066
Variance: 5.6059936 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
76                   31     1.2%
43                   25     1.0%
28                   25     1.0%
84                   25     1.0%
54                   24     0.9%
48                   19     0.7%
17                   18     0.7%
38                   17     0.6%
82                   15     0.6%
136                  15     0.6%
Other values (1443)  2411   91.8%

Minimum 10 values

Value  Count  Frequency (%)
16     7      0.3%
17     18     0.7%
28     25     1.0%
38     17     0.6%
39     8      0.3%
43     25     1.0%
47     5      0.2%
48     19     0.7%
49     1      < 0.1%
52     2      0.1%

Maximum 10 values

Value   Count  Frequency (%)
181131  1      < 0.1%
181092  1      < 0.1%
181015  1      < 0.1%
180894  1      < 0.1%
180404  1      < 0.1%
180276  1      < 0.1%
180166  1      < 0.1%
180142  1      < 0.1%
180028  3      0.1%
179939  1      < 0.1%

Interactions

[25 scatter-plot panels: pairwise interactions among the five numeric variables]

Correlations

              조회월  조회연도  시군구명  전력사용량  계약종별  평균판매단가  전기요금  고객호수
조회월        1.000   0.319     0.000     0.000       0.000     0.357         0.048     0.000
조회연도      0.319   1.000     0.000     0.000       0.000     0.751         0.079     0.000
시군구명      0.000   0.000     1.000     0.536       0.000     0.097         0.461     0.615
전력사용량    0.000   0.000     0.536     1.000       0.381     0.143         0.856     0.293
계약종별      0.000   0.000     0.000     0.381       1.000     0.719         0.386     0.745
평균판매단가  0.357   0.751     0.097     0.143       0.719     1.000         0.253     0.264
전기요금      0.048   0.079     0.461     0.856       0.386     0.253         1.000     0.255
고객호수      0.000   0.000     0.615     0.293       0.745     0.264         0.255     1.000

          계약종별  조회연도  시군구명
계약종별  1.000     0.000     0.000
조회연도  0.000     1.000     0.000
시군구명  0.000     0.000     1.000

              조회월  전력사용량  평균판매단가  전기요금  고객호수  조회연도  시군구명  계약종별
조회월        1.000   -0.055      0.113         -0.034    0.002     0.319     0.000     0.000
전력사용량    -0.055  1.000       0.048         0.969     0.308     0.000     0.250     0.212
평균판매단가  0.113   0.048       1.000         0.253     -0.063    0.590     0.036     0.472
전기요금      -0.034  0.969       0.253         1.000     0.274     0.060     0.190     0.206
고객호수      0.002   0.308       -0.063        0.274     1.000     0.000     0.334     0.345
조회연도      0.319   0.000       0.590         0.060     0.000     1.000     0.000     0.000
시군구명      0.000   0.250       0.036         0.190     0.334     0.000     1.000     0.000
계약종별      0.000   0.212       0.472         0.206     0.345     0.000     0.000     1.000
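
To see which variable pairs stand out in the first (eight-variable) correlation matrix above, the upper triangle can be scanned against a cut-off. The sketch below is illustrative: the 0.75 threshold is an assumption chosen for this example, not a documented setting of the profiler, and `high_pairs` is a hypothetical helper.

```python
# First correlation matrix from this section, row by row.
columns = ["조회월", "조회연도", "시군구명", "전력사용량",
           "계약종별", "평균판매단가", "전기요금", "고객호수"]
matrix = [
    [1.000, 0.319, 0.000, 0.000, 0.000, 0.357, 0.048, 0.000],
    [0.319, 1.000, 0.000, 0.000, 0.000, 0.751, 0.079, 0.000],
    [0.000, 0.000, 1.000, 0.536, 0.000, 0.097, 0.461, 0.615],
    [0.000, 0.000, 0.536, 1.000, 0.381, 0.143, 0.856, 0.293],
    [0.000, 0.000, 0.000, 0.381, 1.000, 0.719, 0.386, 0.745],
    [0.357, 0.751, 0.097, 0.143, 0.719, 1.000, 0.253, 0.264],
    [0.048, 0.079, 0.461, 0.856, 0.386, 0.253, 1.000, 0.255],
    [0.000, 0.000, 0.615, 0.293, 0.745, 0.264, 0.255, 1.000],
]

THRESHOLD = 0.75  # assumption: illustrative cut-off, not the library's setting


def high_pairs(names, values, threshold):
    """Return (a, b, coefficient) for each pair above the threshold."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):  # upper triangle only
            if abs(values[i][j]) >= threshold:
                pairs.append((names[i], names[j], values[i][j]))
    return pairs


pairs = high_pairs(columns, matrix, THRESHOLD)
print(pairs)
# [('조회연도', '평균판매단가', 0.751), ('전력사용량', '전기요금', 0.856)]
```

With this cut-off, the two pairs returned are the same pairs named in the Alerts section.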

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

      조회월  조회연도  시군구명  시도명    전력사용량  계약종별  평균판매단가  전기요금    고객호수
0     1       2022      계룡시    충청남도  298403      가로등    108.5         32383373    1367
1     1       2022      계룡시    충청남도  522500      교육용    100.8         52648905    17
2     1       2022      계룡시    충청남도  398228      농사용    51.1          20361799    605
3     1       2022      계룡시    충청남도  5392303     산업용    130.0         700763834   135
4     1       2022      계룡시    충청남도  695256      심 야     76.9          53484858    270
5     1       2022      계룡시    충청남도  13009738    일반용    128.6         1673090557  2020
6     1       2022      계룡시    충청남도  6110496     주택용    114.1         697213967   4272
7     1       2022      공주시    충청남도  1403657     가로등    118.9         166848372   10740
8     1       2022      공주시    충청남도  4258117     교육용    96.9          412482591   84
9     1       2022      공주시    충청남도  14440716    농사용    49.0          707901846   16661

Last rows

      조회월  조회연도  시군구명  시도명    전력사용량  계약종별  평균판매단가  전기요금    고객호수
2615  8       2022      태안군    충청남도  1732684     심 야     68.1          118025886   4736
2616  8       2022      태안군    충청남도  24162021    일반용    155.6         3759840634  9776
2617  8       2022      태안군    충청남도  9142378     주택용    139.7         1277425705  33736
2618  8       2022      홍성군    충청남도  702949      가로등    132.3         93028745    12266
2619  8       2022      홍성군    충청남도  1633008     교육용    142.1         232023069   65
2620  8       2022      홍성군    충청남도  26621032    농사용    58.3          1552245574  13924
2621  8       2022      홍성군    충청남도  26235347    산업용    146.4         3841628571  948
2622  8       2022      홍성군    충청남도  1055708     심 야     67.9          71670340    4593
2623  8       2022      홍성군    충청남도  22035929    일반용    158.0         3482382276  8318
2624  8       2022      홍성군    충청남도  17347418    주택용    127.6         2212883226  41962

Duplicate rows

Most frequently occurring

      조회월  조회연도  시군구명  시도명    전력사용량  계약종별  평균판매단가  전기요금    고객호수  # duplicates
630   10      2022      계룡시    충청남도  239886      가로등    129.7         31105785    1377      3
631   10      2022      계룡시    충청남도  255136      심 야     79.9          20374999    257       3
632   10      2022      계룡시    충청남도  256912      교육용    116.9         30040893    17        3
633   10      2022      계룡시    충청남도  286602      농사용    70.8          20295197    634       3
634   10      2022      계룡시    충청남도  4538666     산업용    134.8         611631966   136       3
635   10      2022      계룡시    충청남도  5108845     주택용    119.4         610007881   4315      3
636   10      2022      계룡시    충청남도  10062847    일반용    135.0         1358161334  2146      3
637   10      2022      공주시    충청남도  1344014     가로등    129.4         173902660   11374     3
638   10      2022      공주시    충청남도  2261061     교육용    106.1         240002156   84        3
639   10      2022      공주시    충청남도  2930381     심 야     71.6          209738681   6223      3
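
The "# duplicates" column counts how often each repeated row occurs. A minimal sketch of that tally using only the standard library follows. The rows are a small hand-picked subset (two rows from the table above, each repeated three times, plus one row from the Sample section) with only some columns kept, so this illustrates the idea and is not the profiler's own implementation.

```python
from collections import Counter

# Columns kept for brevity: (조회월, 조회연도, 시군구명, 계약종별, 전력사용량).
# Hand-picked illustration data, not the full dataset.
repeated_a = (10, 2022, "계룡시", "가로등", 239886)
repeated_b = (10, 2022, "계룡시", "교육용", 256912)
single = (1, 2022, "계룡시", "가로등", 298403)
rows = [repeated_a] * 3 + [repeated_b] * 3 + [single]

# Count how many times each distinct row occurs.
occurrences = Counter(rows)

# Keep only rows that occur more than once, most frequent first.
duplicates = [(row, n) for row, n in occurrences.most_common() if n > 1]

for row, n in duplicates:
    print(row, n)
```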