Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 336
Missing cells (%): 0.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 625.0 KiB
Average record size in memory: 64.0 B
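
The derived figures in this table can be checked with a little arithmetic. The sketch below uses only numbers stated in this report (the per-column missing counts come from the Alerts section) and assumes that 1 KiB is 1024 bytes and that the missing-cell percentage is taken over all cells (variables times observations).

```python
# Figures copied from the dataset statistics of this report.
n_variables = 7
n_observations = 10_000
missing_cells = 336
total_bytes = 625.0 * 1024  # 625.0 KiB, assuming 1 KiB = 1024 bytes

# Per-column missing counts reported in the alerts (부서, 급전분소명).
missing_by_column = {"부서": 196, "급전분소명": 140}

missing_share = missing_cells / (n_variables * n_observations)
avg_record_bytes = total_bytes / n_observations

print(f"Missing cells: {missing_share:.1%}")
print(f"Average record size: {avg_record_bytes:.1f} B")
print("Column counts add up:", sum(missing_by_column.values()) == missing_cells)
```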

Variable types

DateTime: 1
Text: 3
Categorical: 3

Dataset

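Description: Key power-customer information, large-capacity customer material information. The columns are registration date (등록일자), facility name (설비명), facility grade (설비등급), facility type (설비구분), operation mode (운영방식), department (부서), and dispatch branch office name (급전분소명).
Author: Korea Electric Power Corporation (한국전력공사)
URL: https://www.data.go.kr/data/15069029/fileData.do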

Alerts

설비구분 is highly imbalanced (60.2%) [Imbalance]
운영방식 is highly imbalanced (60.1%) [Imbalance]
부서 has 196 (2.0%) missing values [Missing]
급전분소명 has 140 (1.4%) missing values [Missing]
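
The imbalance percentages above are consistent with an entropy-based score, 1 - H / log2(k), where H is the Shannon entropy of the class distribution in bits and k is the number of classes. The sketch below is an illustration under that assumption, not a copy of the profiler's code. The function name `imbalance_score` is hypothetical, and the class counts are the ones listed for 설비구분 and 운영방식 in this report.

```python
from math import log2


def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of class counts.

    H is the Shannon entropy (in bits) of the class distribution and k the
    number of classes. 0 means perfectly balanced; values close to 1 mean
    that a single class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # Assumption: a single-class column is treated as not imbalanced.
        return 0.0
    entropy = -sum((c / total) * log2(c / total) for c in counts if c > 0)
    return 1 - entropy / log2(k)


# Class counts taken from the Common Values tables of this report.
facility_type = imbalance_score([9212, 788])        # 설비구분: 변전소, SWYD
operation_mode = imbalance_score([8414, 1584, 2])   # 운영방식: 무인, 유인, 8

print(f"설비구분: {facility_type:.1%}")
print(f"운영방식: {operation_mode:.1%}")
```

Note that 운영방식 scores almost as high as 설비구분 even though its majority class is smaller (84.1% against 92.1%): the two-row class "8" raises k to 3 while adding almost no entropy.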

Reproduction

Analysis started: 2023-12-12 10:49:24.698148
Analysis finished: 2023-12-12 10:49:25.715605
Duration: 1.02 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
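
The reported duration follows directly from the two timestamps. A minimal check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the reproduction details of this report.
started = datetime.fromisoformat("2023-12-12 10:49:24.698148")
finished = datetime.fromisoformat("2023-12-12 10:49:25.715605")

duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```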

Variables

등록일자
DateTime

Distinct: 116
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2016-03-25 00:00:00
Maximum: 2020-10-05 00:00:00
Histogram with fixed size bins (bins=50)

설비명
Text

Distinct: 883
Distinct (%): 8.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 11
Median length: 5
Mean length: 5.1962
Min length: 2

Characters and Unicode

Total characters: 51962
Distinct characters: 279
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
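
As an illustration of these character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The five input strings are the sample rows listed for this variable; the helper name `category_counts` is hypothetical. The two-letter codes map to the category names used in this report: `Lo` is Other Letter, `Lu` is Uppercase Letter, and `Po` is Other Punctuation. Scripts and blocks are not exposed by the standard library, so they are left out.

```python
import unicodedata
from collections import Counter

# The first five sample values of this text variable, copied from the report.
samples = ["신고리S/S", "영동T/P", "부강S/S", "일동S/S", "호남화력"]


def category_counts(values):
    """Count characters by Unicode general category (e.g. 'Lo', 'Lu', 'Po')."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


counts = category_counts(samples)
for code, n in counts.most_common():
    print(code, n)
```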

Unique

Unique: 29
Unique (%): 0.3%

Sample

1st row: 신고리S/S
2nd row: 영동T/P
3rd row: 부강S/S
4th row: 일동S/S
5th row: 호남화력

Value | Count | Frequency (%)
s/y | 40 | 0.4%
안좌s/s | 30 | 0.3%
삼척s/s | 23 | 0.2%
부강s/s | 22 | 0.2%
동영월s/s | 22 | 0.2%
구리s/s | 21 | 0.2%
성거s/s | 21 | 0.2%
염곡s/s | 20 | 0.2%
신포천s/s | 20 | 0.2%
양지s/s | 20 | 0.2%
Other values (876) | 9837 | 97.6%

Most occurring characters

Value | Count | Frequency (%)
S | 18222 | 35.1%
/ | 9365 | 18.0%
(unrecovered) | 872 | 1.7%
(unrecovered) | 795 | 1.5%
(unrecovered) | 694 | 1.3%
(unrecovered) | 649 | 1.2%
(unrecovered) | 542 | 1.0%
(unrecovered) | 519 | 1.0%
(unrecovered) | 500 | 1.0%
(unrecovered) | 481 | 0.9%
Other values (269) | 19323 | 37.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 23370 | 45.0%
Uppercase Letter | 18863 | 36.3%
Other Punctuation | 9413 | 18.1%
Decimal Number | 104 | 0.2%
Letter Number | 90 | 0.2%
Space Separator | 76 | 0.1%
Lowercase Letter | 18 | < 0.1%
Close Punctuation | 14 | < 0.1%
Open Punctuation | 14 | < 0.1%

Most frequent character per category

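Other Letter
Value | Count | Frequency (%)
(unrecovered) | 872 | 3.7%
(unrecovered) | 795 | 3.4%
(unrecovered) | 694 | 3.0%
(unrecovered) | 649 | 2.8%
(unrecovered) | 542 | 2.3%
(unrecovered) | 519 | 2.2%
(unrecovered) | 500 | 2.1%
(unrecovered) | 481 | 2.1%
(unrecovered) | 452 | 1.9%
(unrecovered) | 444 | 1.9%
Other values (243) | 17422 | 74.5%

Uppercase Letter
Value | Count | Frequency (%)
S | 18222 | 96.6%
C | 268 | 1.4%
P | 119 | 0.6%
Y | 60 | 0.3%
H | 49 | 0.3%
T | 37 | 0.2%
N | 23 | 0.1%
G | 22 | 0.1%
E | 20 | 0.1%
V | 18 | 0.1%
Other values (2) | 25 | 0.1%

Decimal Number
Value | Count | Frequency (%)
2 | 42 | 40.4%
6 | 30 | 28.8%
1 | 26 | 25.0%
5 | 3 | 2.9%
4 | 3 | 2.9%

Letter Number
Value | Count | Frequency (%)
(unrecovered) | 31 | 34.4%
(unrecovered) | 31 | 34.4%
(unrecovered) | 28 | 31.1%

Other Punctuation
Value | Count | Frequency (%)
/ | 9365 | 99.5%
# | 48 | 0.5%

Space Separator
Value | Count | Frequency (%)
(space) | 76 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
k | 18 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 14 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 14 | 100.0%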

Most occurring scripts

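Value | Count | Frequency (%)
Hangul | 23370 | 45.0%
Latin | 18971 | 36.5%
Common | 9621 | 18.5%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(unrecovered) | 872 | 3.7%
(unrecovered) | 795 | 3.4%
(unrecovered) | 694 | 3.0%
(unrecovered) | 649 | 2.8%
(unrecovered) | 542 | 2.3%
(unrecovered) | 519 | 2.2%
(unrecovered) | 500 | 2.1%
(unrecovered) | 481 | 2.1%
(unrecovered) | 452 | 1.9%
(unrecovered) | 444 | 1.9%
Other values (243) | 17422 | 74.5%

Latin
Value | Count | Frequency (%)
S | 18222 | 96.1%
C | 268 | 1.4%
P | 119 | 0.6%
Y | 60 | 0.3%
H | 49 | 0.3%
T | 37 | 0.2%
(unrecovered) | 31 | 0.2%
(unrecovered) | 31 | 0.2%
(unrecovered) | 28 | 0.1%
N | 23 | 0.1%
Other values (6) | 103 | 0.5%

Common
Value | Count | Frequency (%)
/ | 9365 | 97.3%
(space) | 76 | 0.8%
# | 48 | 0.5%
2 | 42 | 0.4%
6 | 30 | 0.3%
1 | 26 | 0.3%
) | 14 | 0.1%
( | 14 | 0.1%
5 | 3 | < 0.1%
4 | 3 | < 0.1%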

Most occurring blocks

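Value | Count | Frequency (%)
ASCII | 28502 | 54.9%
Hangul | 23370 | 45.0%
Number Forms | 90 | 0.2%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
S | 18222 | 63.9%
/ | 9365 | 32.9%
C | 268 | 0.9%
P | 119 | 0.4%
(space) | 76 | 0.3%
Y | 60 | 0.2%
H | 49 | 0.2%
# | 48 | 0.2%
2 | 42 | 0.1%
T | 37 | 0.1%
Other values (13) | 216 | 0.8%

Hangul
Value | Count | Frequency (%)
(unrecovered) | 872 | 3.7%
(unrecovered) | 795 | 3.4%
(unrecovered) | 694 | 3.0%
(unrecovered) | 649 | 2.8%
(unrecovered) | 542 | 2.3%
(unrecovered) | 519 | 2.2%
(unrecovered) | 500 | 2.1%
(unrecovered) | 481 | 2.1%
(unrecovered) | 452 | 1.9%
(unrecovered) | 444 | 1.9%
Other values (243) | 17422 | 74.5%

Number Forms
Value | Count | Frequency (%)
(unrecovered) | 31 | 34.4%
(unrecovered) | 31 | 34.4%
(unrecovered) | 28 | 31.1%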

설비등급
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
2 | 6745
1 | 1766
<NA> | 879
- | 610

Length

Max length: 4
Median length: 1
Mean length: 1.2637
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 2
3rd row: 2
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
2 | 6745 | 67.5%
1 | 1766 | 17.7%
<NA> | 879 | 8.8%
- | 610 | 6.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2 | 6745 | 67.5%
1 | 1766 | 17.7%
na | 879 | 8.8%
(unrecovered) | 610 | 6.1%

설비구분
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
변전소 | 9212
SWYD | 788

Length

Max length: 4
Median length: 3
Mean length: 3.0788
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 변전소
2nd row: SWYD
3rd row: 변전소
4th row: 변전소
5th row: SWYD

Common Values

Value | Count | Frequency (%)
변전소 | 9212 | 92.1%
SWYD | 788 | 7.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
변전소 | 9212 | 92.1%
swyd | 788 | 7.9%

운영방식
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
무인 | 8414
유인 | 1584
8 | 2

Length

Max length: 2
Median length: 2
Mean length: 1.9998
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 유인
2nd row: 무인
3rd row: 무인
4th row: 무인
5th row: 무인

Common Values

Value | Count | Frequency (%)
무인 | 8414 | 84.1%
유인 | 1584 | 15.8%
8 | 2 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
무인 | 8414 | 84.1%
유인 | 1584 | 15.8%
8 | 2 | < 0.1%

부서
Text

MISSING 

Distinct: 200
Distinct (%): 2.0%
Missing: 196
Missing (%): 2.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 8
Mean length: 3.4695022
Min length: 1

Characters and Unicode

Total characters: 34015
Distinct characters: 142
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
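
The mean lengths reported for the text variables match the total character count divided by the number of non-missing rows. This is an observation from the figures in this report rather than a documented definition, and the helper name `mean_length` is hypothetical.

```python
def mean_length(total_characters, n_rows, n_missing):
    """Average string length over the non-missing rows of a column."""
    return total_characters / (n_rows - n_missing)


# (total characters, rows, missing) copied from this report.
facility_name = mean_length(51962, 10_000, 0)    # 설비명, reported 5.1962
department = mean_length(34015, 10_000, 196)     # 부서, reported 3.4695022
dispatch_branch = mean_length(39227, 10_000, 140)  # 급전분소명, reported 3.9783976

print(f"{facility_name:.4f} {department:.7f} {dispatch_branch:.7f}")
```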

Unique

Unique: 17
Unique (%): 0.2%

Sample

1st row: 해당없음
2nd row: 강릉
3rd row: 서청주
4th row: 순회센터(변전부)
5th row: 해당없음

Value | Count | Frequency (%)
해당없음 | 1674 | 17.0%
순회센터(변전부 | 407 | 4.1%
직할 | 271 | 2.8%
파주 | 164 | 1.7%
제주 | 144 | 1.5%
송정순회팀 | 139 | 1.4%
영암 | 114 | 1.2%
미금 | 110 | 1.1%
원주 | 109 | 1.1%
전주순회진단팀 | 109 | 1.1%
Other values (187) | 6602 | 67.1%

Most occurring characters

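Value | Count | Frequency (%)
(unrecovered) | 2357 | 6.9%
(unrecovered) | 2286 | 6.7%
(unrecovered) | 1867 | 5.5%
(unrecovered) | 1773 | 5.2%
(unrecovered) | 1732 | 5.1%
(unrecovered) | 1712 | 5.0%
(unrecovered) | 1697 | 5.0%
(unrecovered) | 1010 | 3.0%
(unrecovered) | 976 | 2.9%
(unrecovered) | 790 | 2.3%
Other values (132) | 17815 | 52.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 33089 | 97.3%
Open Punctuation | 407 | 1.2%
Close Punctuation | 407 | 1.2%
Decimal Number | 64 | 0.2%
Space Separator | 39 | 0.1%
Uppercase Letter | 6 | < 0.1%
Other Punctuation | 3 | < 0.1%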

Most frequent character per category

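Other Letter
Value | Count | Frequency (%)
(unrecovered) | 2357 | 7.1%
(unrecovered) | 2286 | 6.9%
(unrecovered) | 1867 | 5.6%
(unrecovered) | 1773 | 5.4%
(unrecovered) | 1732 | 5.2%
(unrecovered) | 1712 | 5.2%
(unrecovered) | 1697 | 5.1%
(unrecovered) | 1010 | 3.1%
(unrecovered) | 976 | 2.9%
(unrecovered) | 790 | 2.4%
Other values (123) | 16889 | 51.0%

Decimal Number
Value | Count | Frequency (%)
1 | 33 | 51.6%
2 | 24 | 37.5%
0 | 7 | 10.9%

Uppercase Letter
Value | Count | Frequency (%)
S | 4 | 66.7%
Y | 2 | 33.3%

Open Punctuation
Value | Count | Frequency (%)
( | 407 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 407 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 39 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 3 | 100.0%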

Most occurring scripts

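Value | Count | Frequency (%)
Hangul | 33089 | 97.3%
Common | 920 | 2.7%
Latin | 6 | < 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(unrecovered) | 2357 | 7.1%
(unrecovered) | 2286 | 6.9%
(unrecovered) | 1867 | 5.6%
(unrecovered) | 1773 | 5.4%
(unrecovered) | 1732 | 5.2%
(unrecovered) | 1712 | 5.2%
(unrecovered) | 1697 | 5.1%
(unrecovered) | 1010 | 3.1%
(unrecovered) | 976 | 2.9%
(unrecovered) | 790 | 2.4%
Other values (123) | 16889 | 51.0%

Common
Value | Count | Frequency (%)
( | 407 | 44.2%
) | 407 | 44.2%
(space) | 39 | 4.2%
1 | 33 | 3.6%
2 | 24 | 2.6%
0 | 7 | 0.8%
/ | 3 | 0.3%

Latin
Value | Count | Frequency (%)
S | 4 | 66.7%
Y | 2 | 33.3%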

Most occurring blocks

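Value | Count | Frequency (%)
Hangul | 33089 | 97.3%
ASCII | 926 | 2.7%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(unrecovered) | 2357 | 7.1%
(unrecovered) | 2286 | 6.9%
(unrecovered) | 1867 | 5.6%
(unrecovered) | 1773 | 5.4%
(unrecovered) | 1732 | 5.2%
(unrecovered) | 1712 | 5.2%
(unrecovered) | 1697 | 5.1%
(unrecovered) | 1010 | 3.1%
(unrecovered) | 976 | 2.9%
(unrecovered) | 790 | 2.4%
Other values (123) | 16889 | 51.0%

ASCII
Value | Count | Frequency (%)
( | 407 | 44.0%
) | 407 | 44.0%
(space) | 39 | 4.2%
1 | 33 | 3.6%
2 | 24 | 2.6%
0 | 7 | 0.8%
S | 4 | 0.4%
/ | 3 | 0.3%
Y | 2 | 0.2%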

급전분소명
Text

MISSING 

Distinct: 108
Distinct (%): 1.1%
Missing: 140
Missing (%): 1.4%
Memory size: 156.2 KiB

Length

Max length: 15
Median length: 13
Mean length: 3.9783976
Min length: 1

Characters and Unicode

Total characters: 39227
Distinct characters: 88
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: 해당없음
2nd row: 강릉
3rd row: 청주
4th row: 군포
5th row: 순천

Value | Count | Frequency (%)
해당없음 | 1481 | 14.3%
직할 | 448 | 4.3%
급전분소 | 423 | 4.1%
직할급전분소 | 412 | 4.0%
광역계통운영센터 | 203 | 2.0%
울산급전분소 | 200 | 1.9%
구리 | 198 | 1.9%
원주 | 191 | 1.8%
신김제 | 189 | 1.8%
북부산급전분소 | 186 | 1.8%
Other values (102) | 6460 | 62.2%

Most occurring characters

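Value | Count | Frequency (%)
(unrecovered) | 3011 | 7.7%
(unrecovered) | 2932 | 7.5%
(unrecovered) | 2891 | 7.4%
(unrecovered) | 2890 | 7.4%
(unrecovered) | 1815 | 4.6%
(unrecovered) | 1493 | 3.8%
(unrecovered) | 1493 | 3.8%
(unrecovered) | 1481 | 3.8%
(unrecovered) | 1272 | 3.2%
(unrecovered) | 1083 | 2.8%
Other values (78) | 18866 | 48.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 38456 | 98.0%
Space Separator | 531 | 1.4%
Uppercase Letter | 72 | 0.2%
Open Punctuation | 62 | 0.2%
Close Punctuation | 62 | 0.2%
Decimal Number | 31 | 0.1%
Other Punctuation | 13 | < 0.1%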

Most frequent character per category

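Other Letter
Value | Count | Frequency (%)
(unrecovered) | 3011 | 7.8%
(unrecovered) | 2932 | 7.6%
(unrecovered) | 2891 | 7.5%
(unrecovered) | 2890 | 7.5%
(unrecovered) | 1815 | 4.7%
(unrecovered) | 1493 | 3.9%
(unrecovered) | 1493 | 3.9%
(unrecovered) | 1481 | 3.9%
(unrecovered) | 1272 | 3.3%
(unrecovered) | 1083 | 2.8%
Other values (62) | 18095 | 47.1%

Uppercase Letter
Value | Count | Frequency (%)
E | 18 | 25.0%
T | 18 | 25.0%
C | 9 | 12.5%
N | 9 | 12.5%
R | 9 | 12.5%
D | 9 | 12.5%

Decimal Number
Value | Count | Frequency (%)
3 | 8 | 25.8%
8 | 8 | 25.8%
0 | 7 | 22.6%
7 | 4 | 12.9%
5 | 4 | 12.9%

Other Punctuation
Value | Count | Frequency (%)
& | 9 | 69.2%
, | 4 | 30.8%

Space Separator
Value | Count | Frequency (%)
(space) | 531 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 62 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 62 | 100.0%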

Most occurring scripts

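Value | Count | Frequency (%)
Hangul | 38456 | 98.0%
Common | 699 | 1.8%
Latin | 72 | 0.2%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(unrecovered) | 3011 | 7.8%
(unrecovered) | 2932 | 7.6%
(unrecovered) | 2891 | 7.5%
(unrecovered) | 2890 | 7.5%
(unrecovered) | 1815 | 4.7%
(unrecovered) | 1493 | 3.9%
(unrecovered) | 1493 | 3.9%
(unrecovered) | 1481 | 3.9%
(unrecovered) | 1272 | 3.3%
(unrecovered) | 1083 | 2.8%
Other values (62) | 18095 | 47.1%

Common
Value | Count | Frequency (%)
(space) | 531 | 76.0%
( | 62 | 8.9%
) | 62 | 8.9%
& | 9 | 1.3%
3 | 8 | 1.1%
8 | 8 | 1.1%
0 | 7 | 1.0%
7 | 4 | 0.6%
, | 4 | 0.6%
5 | 4 | 0.6%

Latin
Value | Count | Frequency (%)
E | 18 | 25.0%
T | 18 | 25.0%
C | 9 | 12.5%
N | 9 | 12.5%
R | 9 | 12.5%
D | 9 | 12.5%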

Most occurring blocks

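Value | Count | Frequency (%)
Hangul | 38456 | 98.0%
ASCII | 771 | 2.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(unrecovered) | 3011 | 7.8%
(unrecovered) | 2932 | 7.6%
(unrecovered) | 2891 | 7.5%
(unrecovered) | 2890 | 7.5%
(unrecovered) | 1815 | 4.7%
(unrecovered) | 1493 | 3.9%
(unrecovered) | 1493 | 3.9%
(unrecovered) | 1481 | 3.9%
(unrecovered) | 1272 | 3.3%
(unrecovered) | 1083 | 2.8%
Other values (62) | 18095 | 47.1%

ASCII
Value | Count | Frequency (%)
(space) | 531 | 68.9%
( | 62 | 8.0%
) | 62 | 8.0%
E | 18 | 2.3%
T | 18 | 2.3%
& | 9 | 1.2%
C | 9 | 1.2%
N | 9 | 1.2%
R | 9 | 1.2%
D | 9 | 1.2%
Other values (6) | 35 | 4.5%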

Correlations

 | 설비등급 | 설비구분 | 운영방식
설비등급 | 1.000 | 0.024 | 0.000
설비구분 | 0.024 | 1.000 | 0.004
운영방식 | 0.000 | 0.004 | 1.000

 | 설비등급 | 운영방식 | 설비구분
설비등급 | 1.000 | 0.000 | 0.039
운영방식 | 0.000 | 1.000 | 0.007
설비구분 | 0.039 | 0.007 | 1.000

 | 설비등급 | 설비구분 | 운영방식
설비등급 | 1.000 | 0.039 | 0.000
설비구분 | 0.039 | 1.000 | 0.007
운영방식 | 0.000 | 0.007 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
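
To make nullity correlation concrete, the sketch below computes the Pearson correlation between the present/missing indicators of two columns. It assumes that nullity correlation is defined this way. The rows are hypothetical toy data, not records from this dataset, and the helper names `nullity` and `pearson` are illustrative.

```python
from math import sqrt

# Hypothetical toy rows (not taken from the dataset); None marks a missing cell.
rows = [
    {"부서": "강릉", "급전분소명": "강릉"},
    {"부서": None, "급전분소명": None},
    {"부서": "미금", "급전분소명": "구리"},
    {"부서": None, "급전분소명": "순천"},
    {"부서": "영암", "급전분소명": "강진"},
    {"부서": None, "급전분소명": None},
]


def nullity(rows, column):
    """Return 1 where the value is present and 0 where it is missing."""
    return [0 if row[column] is None else 1 for row in rows]


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Missing cells per column, as in the nullity bar chart.
missing_counts = {col: nullity(rows, col).count(0) for col in ("부서", "급전분소명")}

# Nullity correlation between the two columns, as in the heatmap.
nullity_corr = pearson(nullity(rows, "부서"), nullity(rows, "급전분소명"))

print(missing_counts, round(nullity_corr, 3))
```

A value near 1 means the two columns tend to be missing in the same rows; a value near 0 means missingness in one says little about the other.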

Sample

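 | 등록일자 | 설비명 | 설비등급 | 설비구분 | 운영방식 | 부서 | 급전분소명
64165 | 2017-01-02 | 신고리S/S | 1 | 변전소 | 유인 | 해당없음 | 해당없음
56493 | 2017-01-30 | 영동T/P | 2 | SWYD | 무인 | 강릉 | 강릉
76384 | 2018-04-30 | 부강S/S | 2 | 변전소 | 무인 | 서청주 | 청주
38407 | 2016-09-19 | 일동S/S | 1 | 변전소 | 무인 | 순회센터(변전부) | 군포
66694 | 2017-08-28 | 호남화력 | 1 | SWYD | 무인 | 해당없음 | 순천
23156 | 2016-04-15 | 완암S/S | 2 | 변전소 | 무인 | 완암 순회점검팀 | 신김해급전분소
56288 | 2017-02-27 | 일곡S/S | 2 | 변전소 | 무인 | 북광주 | 광주
15115 | 2016-04-13 | 신김제S/S | 2 | 변전소 | 유인 | 해당없음 | 해당없음
40747 | 2016-11-28 | 남창S/S | 1 | 변전소 | 무인 | 영암 | 강진
38216 | 2016-10-24 | 남안산S/S | 1 | 변전소 | 무인 | 순회센터(변전부) | 군포

 | 등록일자 | 설비명 | 설비등급 | 설비구분 | 운영방식 | 부서 | 급전분소명
74889 | 2018-05-28 | 경주S/S | 2 | 변전소 | 유인 | 해당없음 | 해당없음
5682 | 2016-03-27 | 중리S/S | 2 | 변전소 | 무인 | 서마산 순회진단팀 | 함안급전분소
59266 | 2017-02-27 | 지제S/S | 2 | 변전소 | 무인 | 미금 | 구리
78157 | 2020-06-29 | 고강S/S | - | 변전소 | 무인 | 중동순회진단팀 | 신부평급전분소
92944 | 2019-12-30 | 청평양수 | 2 | SWYD | 무인 | 미금 | 구리
50823 | 2016-11-07 | 노포S/S | 1 | 변전소 | 무인 | 노포순회팀 | 동부산 급전분소
79068 | 2018-11-26 | 나주S/S | 2 | 변전소 | 유인 | 해당없음 | 해당없음
86347 | 2020-06-29 | 동탄C/C | <NA> | SWYD | 무인 | 농서순회진단팀 | 성남급전분소
12300 | 2016-04-06 | 신김제S/S | 2 | 변전소 | 유인 | 해당없음 | 해당없음
78681 | 2019-06-24 | 동춘S/S | 2 | 변전소 | 무인 | 남동 | 신시흥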