Overview

Dataset statistics

Number of variables: 6
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 566.4 KiB
Average record size in memory: 58.0 B

Variable types

Categorical: 4
Numeric: 1
Text: 1

Dataset

Description: Nationwide utility pole computerized numbers (pole numbers)
Author: Korea Electric Power Corporation (한국전력공사)
URL: https://www.data.go.kr/data/15069462/fileData.do

Alerts

1차본부 is highly overall correlated with 2차순번 and 2 other fields [High correlation]
2차사업소 is highly overall correlated with 2차순번 and 2 other fields [High correlation]
1차순번 is highly overall correlated with 2차순번 and 2 other fields [High correlation]
2차순번 is highly overall correlated with 1차순번 and 2 other fields [High correlation]
1차순번 is highly imbalanced (80.1%) [Imbalance]
1차본부 is highly imbalanced (80.1%) [Imbalance]
전산화번호 has unique values [Unique]
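
The 80.1% in the Imbalance alerts can be reproduced from the class counts reported for 1차본부 and 1차순번 (9690 and 310). The sketch below assumes the score is one minus the Shannon entropy of the class distribution, normalized by the maximum entropy log2(k). That assumption matches the reported figure but is not stated anywhere in this report, and the function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits.

    Assumed definition: 0 means perfectly balanced classes,
    values near 1 mean one class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Class counts reported for 1차본부 (and 1차순번) in this report.
score = imbalance_score([9690, 310])
print(f"{score:.1%}")
```

Under this definition, a 96.9% / 3.1% split between two classes rounds to the 80.1% shown in the alerts.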

Reproduction

Analysis started: 2023-12-12 01:21:31.473603
Analysis finished: 2023-12-12 01:21:32.664972
Duration: 1.19 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

1차순번
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value  Count
1  9690
2  310

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1  9690  96.9%
2  310  3.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

1차본부
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value  Count
서울본부  9690
남서울본부  310

Length

Max length: 5
Median length: 4
Mean length: 4.031
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울본부
2nd row: 서울본부
3rd row: 서울본부
4th row: 서울본부
5th row: 서울본부

Common Values

Value  Count  Frequency (%)
서울본부  9690  96.9%
남서울본부  310  3.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

2차순번
Real number (ℝ)

HIGH CORRELATION

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.1673
Minimum: 1
Maximum: 10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 2
Median: 4
Q3: 6
95th percentile: 7
Maximum: 10
Range: 9
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.0941225
Coefficient of variation (CV): 0.50251303
Kurtosis: 0.002807377
Mean: 4.1673
Median Absolute Deviation (MAD): 2
Skewness: 0.52329655
Sum: 41673
Variance: 4.3853492
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=8)]

Value  Count  Frequency (%)
4  1801  18.0%
2  1605  16.1%
3  1582  15.8%
5  1352  13.5%
6  1307  13.1%
7  1128  11.3%
1  915  9.2%
10  310  3.1%
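
The descriptive statistics for 2차순번 can be cross-checked against its frequency table. The following is a minimal sketch that rebuilds the sum, mean, variance, standard deviation and coefficient of variation from the value counts alone. It assumes the reported variance is the sample variance (dividing by n - 1), which is an inference from the numbers rather than something the report states.

```python
import math

# Value -> count, copied from the 2차순번 frequency table in this report.
counts = {1: 915, 2: 1605, 3: 1582, 4: 1801, 5: 1352, 6: 1307, 7: 1128, 10: 310}

n = sum(counts.values())                       # number of observations
total = sum(v * c for v, c in counts.items())  # sum of all values
mean = total / n

# Assumption: sample variance (n - 1 in the denominator).
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

print(n, total, mean)
print(round(variance, 7), round(std, 7), round(cv, 8))
```

With n - 1 in the denominator the results agree with the reported Sum (41673), Mean (4.1673) and Variance (4.3853492) to the precision shown.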

2차사업소
Categorical

HIGH CORRELATION

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value  Count
강북성북지사  1801
동대문중랑지사  1605
서대문은평지사  1582
광진성동지사  1352
마포용산지사  1307
Other values (3)  2353

Length

Max length: 7
Median length: 6
Mean length: 6.3497
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 마포용산지사
2nd row: 서대문은평지사
3rd row: 강북성북지사
4th row: 광진성동지사
5th row: 마포용산지사

Common Values

Value  Count  Frequency (%)
강북성북지사  1801  18.0%
동대문중랑지사  1605  16.1%
서대문은평지사  1582  15.8%
광진성동지사  1352  13.5%
마포용산지사  1307  13.1%
노원도봉지사  1128  11.3%
서울본부직할  915  9.2%
남서울본부직할  310  3.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

전산화번호
Text

UNIQUE

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 80000
Distinct characters: 26
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
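
As a small illustration of those character properties, the sketch below counts characters by Unicode general category using Python's standard `unicodedata` module. The input is just the five sample values listed for 전산화번호 in this report, not the full column, and the helper name `category_counts` is hypothetical. Script and block properties are not exposed by `unicodedata`, so only the general category is shown.

```python
import unicodedata
from collections import Counter

# The five sample values shown for 전산화번호 in this report.
samples = ["0023B511", "9728B772", "0228P631", "0226Y182", "9625A171"]

def category_counts(values):
    """Count characters by Unicode general category.

    'Nd' is Decimal Number and 'Lu' is Uppercase Letter.
    """
    return Counter(unicodedata.category(ch) for value in values for ch in value)

counts = category_counts(samples)
print(dict(counts))
```

Each of the five samples contains seven digits and one uppercase letter, so the counts split into 35 `Nd` and 5 `Lu` characters.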

Unique

Unique: 10000
Unique (%): 100.0%

Sample

1st row: 0023B511
2nd row: 9728B772
3rd row: 0228P631
4th row: 0226Y182
5th row: 9625A171

Value  Count  Frequency (%)
0023b511  1  < 0.1%
9924a321  1  < 0.1%
9629s541  1  < 0.1%
9827c932  1  < 0.1%
0324b721  1  < 0.1%
0031f531  1  < 0.1%
0128p433  1  < 0.1%
9728g853  1  < 0.1%
9623c252  1  < 0.1%
9628z367  1  < 0.1%
Other values (9990)  9990  99.9%

Most occurring characters

Value  Count  Frequency (%)
2  15530  19.4%
0  10148  12.7%
1  7872  9.8%
9  6781  8.5%
3  6602  8.3%
6  4955  6.2%
4  4818  6.0%
7  4642  5.8%
8  4361  5.5%
5  4286  5.4%
Other values (16)  10005  12.5%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  69995  87.5%
Uppercase Letter  10005  12.5%

Most frequent character per category

Uppercase Letter

Value  Count  Frequency (%)
H  692  6.9%
Q  687  6.9%
A  664  6.6%
C  660  6.6%
R  657  6.6%
G  646  6.5%
F  646  6.5%
P  643  6.4%
Z  633  6.3%
X  629  6.3%
Other values (6)  3448  34.5%

Decimal Number

Value  Count  Frequency (%)
2  15530  22.2%
0  10148  14.5%
1  7872  11.2%
9  6781  9.7%
3  6602  9.4%
6  4955  7.1%
4  4818  6.9%
7  4642  6.6%
8  4361  6.2%
5  4286  6.1%

Most occurring scripts

Value  Count  Frequency (%)
Common  69995  87.5%
Latin  10005  12.5%

Most frequent character per script

Latin: same characters, counts and percentages as the Uppercase Letter table under "Most frequent character per category".
Common: same characters, counts and percentages as the Decimal Number table under "Most frequent character per category".

Most occurring blocks

Value  Count  Frequency (%)
ASCII  80000  100.0%

Most frequent character per block

ASCII: same characters, counts and percentages as the table under "Most occurring characters".

지역구분
Categorical

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value  Count
주택가  5404
<NA>  3371
번화가  1218
농어촌  5
공란  1

Length

Max length: 4
Median length: 3
Mean length: 3.3371
Min length: 2

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: <NA>
2nd row: 주택가
3rd row: 번화가
4th row: 주택가
5th row: <NA>

Common Values

Value  Count  Frequency (%)
주택가  5404  54.0%
<NA>  3371  33.7%
번화가  1218  12.2%
농어촌  5  0.1%
공란  1  < 0.1%
야외도로  1  < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

(variable)  1차순번  1차본부  2차순번  2차사업소  지역구분
1차순번  1.000  1.000  1.000  1.000  0.000
1차본부  1.000  1.000  1.000  1.000  0.000
2차순번  1.000  1.000  1.000  1.000  0.127
2차사업소  1.000  1.000  1.000  1.000  0.127
지역구분  0.000  0.000  0.127  0.127  1.000

[Figure: correlation heatmap]

(variable)  지역구분  1차본부  2차사업소  1차순번
지역구분  1.000  0.000  0.078  0.000
1차본부  0.000  1.000  1.000  0.998
2차사업소  0.078  1.000  1.000  1.000
1차순번  0.000  0.998  1.000  1.000

[Figure: correlation heatmap]

(variable)  2차순번  1차순번  1차본부  2차사업소  지역구분
2차순번  1.000  1.000  1.000  1.000  0.078
1차순번  1.000  1.000  0.998  1.000  0.000
1차본부  1.000  0.998  1.000  1.000  0.000
2차사업소  1.000  1.000  1.000  1.000  0.078
지역구분  0.078  0.000  0.000  0.078  1.000
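
The 0.998 between 1차본부 and 1차순번 in the correlation matrices is what a bias-corrected Cramér's V gives for a perfectly associated 2x2 table when the chi-square statistic includes Yates' continuity correction. The sketch below is an illustration under two assumptions that the report does not confirm: that this is the measure behind those cells, and that the two variables map one-to-one (서울본부 with 1, 남서울본부 with 2), as their identical counts and the sample rows suggest. The function name `cramers_v_2x2` is hypothetical.

```python
import math

def cramers_v_2x2(table):
    """Bias-corrected Cramér's V for a 2x2 table with Yates' correction."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            # Yates' continuity correction: shrink |observed - expected| by 0.5.
            diff = max(abs(table[i][j] - expected) - 0.5, 0.0)
            chi2 += diff ** 2 / expected

    phi2 = chi2 / n
    r = k = 2
    # Bias correction of phi^2 and of the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

# Assumed cross-tabulation of 1차본부 (rows) against 1차순번 (columns).
v = cramers_v_2x2([[9690, 0], [0, 310]])
print(round(v, 3))
```

Without the continuity correction the same table yields exactly 1, which would explain why a one-to-one pair can show 0.998 in one matrix and 1.000 elsewhere.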

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

(index)  1차순번  1차본부  2차순번  2차사업소  전산화번호  지역구분
72892  1  서울본부  6  마포용산지사  0023B511  <NA>
34212  1  서울본부  3  서대문은평지사  9728B772  주택가
48903  1  서울본부  4  강북성북지사  0228P631  번화가
62696  1  서울본부  5  광진성동지사  0226Y182  주택가
80294  1  서울본부  6  마포용산지사  9625A171  <NA>
96056  1  서울본부  7  노원도봉지사  0232Z195  주택가
65014  1  서울본부  5  광진성동지사  0325H251  번화가
22822  1  서울본부  2  동대문중랑지사  0327X944  번화가
84201  1  서울본부  6  마포용산지사  9725R936  <NA>
87751  1  서울본부  7  노원도봉지사  0329D273  <NA>

(index)  1차순번  1차본부  2차순번  2차사업소  전산화번호  지역구분
25998  1  서울본부  3  서대문은평지사  9725E901  번화가
91017  1  서울본부  7  노원도봉지사  0231R221  주택가
91602  1  서울본부  7  노원도봉지사  0231C812  주택가
64592  1  서울본부  5  광진성동지사  0325E523  주택가
8007  1  서울본부  1  서울본부직할  9926Q271  <NA>
88461  1  서울본부  7  노원도봉지사  0431C521  야외도로
73974  1  서울본부  6  마포용산지사  0024W372  주택가
12044  1  서울본부  2  동대문중랑지사  0227Y812  번화가
67412  1  서울본부  5  광진성동지사  0425B493  주택가
12241  1  서울본부  2  동대문중랑지사  0228Z711  주택가