Overview

Dataset statistics

Number of variables: 7
Number of observations: 659
Missing cells: 207
Missing cells (%): 4.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 36.8 KiB
Average record size in memory: 57.2 B
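
The two derived figures in this block follow from the counts above by simple arithmetic. The sketch below recomputes them from the reported values only. It assumes the missing-cell share is taken over all cells (rows times columns) and that 1 KiB is 1024 bytes; the variable names are illustrative.

```python
# Figures copied from the "Dataset statistics" block of the report.
n_variables = 7
n_observations = 659
missing_cells = 207
total_size_kib = 36.8

# Share of missing cells over all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size: total memory (KiB -> bytes) divided by row count.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```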

Variable types

DateTime: 2
Text: 2
Numeric: 1
Categorical: 1
Boolean: 1

Dataset

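Description: Registration data for electricity businesses in Incheon Metropolitan City, covering permit date (허가일자), business name (상호), installed capacity (설비용량), installation location (설치장소), prime mover type (원동력 종류), whether operation has started (사업개시유무), operation start date (사업개시일), and so on.
Author: Incheon Metropolitan City (인천광역시)
URL: https://www.data.go.kr/data/15030592/fileData.do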

Alerts

설비용량(킬로와트) is highly overall correlated with 원동력 종류 [High correlation]
원동력 종류 is highly overall correlated with 설비용량(킬로와트) [High correlation]
원동력 종류 is highly imbalanced (94.9%) [Imbalance]
사업개시일 has 207 (31.4%) missing values [Missing]
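
The 94.9% imbalance figure is consistent with an entropy-based score, 1 - H / log2(k), where H is the Shannon entropy of the class frequencies and k is the number of classes. The sketch below applies that formula to the eight category counts reported for 원동력 종류 and also recomputes the missing share for 사업개시일. Whether the profiler computes its score exactly this way is an assumption, and `imbalance_score` is a hypothetical helper name.

```python
import math

# Category counts for 원동력 종류, copied from the "Common Values" table.
counts = [649, 3, 2, 1, 1, 1, 1, 1]

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no balance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

score = imbalance_score(counts)
missing_share = 207 / 659  # 사업개시일: missing values over observations

print(f"imbalance: {score:.1%}")
print(f"missing:   {missing_share:.1%}")
```

A score of 0 would mean perfectly balanced classes; a score near 1 means a single class dominates, as 태양광 does here.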

Reproduction

Analysis started: 2023-12-12 02:44:39.343005
Analysis finished: 2023-12-12 02:44:40.124666
Duration: 0.78 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
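
A report of this kind can be regenerated roughly as follows. This is a minimal sketch, not the configuration actually used (that is in config.json). It assumes a local CSV copy of the file from the URL in the Dataset section, that the CSV uses the column names shown in this report, and that cp949 is a suitable encoding, which is only a guess. `build_report` is a hypothetical helper name.

```python
def build_report(csv_path, html_path="report.html", encoding="cp949"):
    """Profile a CSV file and write the report as HTML.

    Assumptions: `csv_path` is a local copy of the dataset file, and
    `encoding` fits that file (cp949 is only a guess).
    """
    # Imported lazily so this module loads even without the optional packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(
        csv_path,
        encoding=encoding,
        parse_dates=["허가일자", "사업개시일"],  # the two date columns
    )
    report = ProfileReport(df, title="Incheon electricity business registrations")
    report.to_file(html_path)
    return report
```

Calling `build_report("incheon.csv")` (the file name is hypothetical) would write `report.html` next to the script.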

Variables

허가일자 (permit date)
Date

Distinct: 373
Distinct (%): 56.6%
Missing: 0
Missing (%): 0.0%
Memory size: 5.3 KiB
Minimum: 2005-09-30 00:00:00
Maximum: 2020-12-31 00:00:00
Histogram with fixed size bins (bins=50)

상호 (business name)
Text

Distinct: 651
Distinct (%): 98.8%
Missing: 0
Missing (%): 0.0%
Memory size: 5.3 KiB

Length

Max length: 24
Median length: 21
Mean length: 10.99393
Min length: 5

Characters and Unicode

Total characters: 7245
Distinct characters: 382
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
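
As an illustration of those character properties, the sketch below counts Unicode general categories for two 상호 values that appear in the Sample section, using Python's standard `unicodedata` module. The two-letter codes map onto the category names used in this report: `Lo` is Other Letter, `Zs` is Space Separator, and `So` is Other Symbol. `category_counts` is a hypothetical helper, and this is not necessarily how the profiler derives its tables.

```python
import unicodedata
from collections import Counter

# Two 상호 values taken from the Sample section of this report.
names = ["기린에코 발전소", "수성밸브공업㈜ 태양광발전소"]

def category_counts(values):
    """Count characters by Unicode general category (e.g. Lo, Zs, So)."""
    counts = Counter()
    for value in values:
        counts.update(unicodedata.category(ch) for ch in value)
    return counts

counts = category_counts(names)
lengths = [len(name) for name in names]

print(dict(counts))
print("max length:", max(lengths), "min length:", min(lengths))
```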

Unique

Unique: 643
Unique (%): 97.6%

Sample

1st row: 동검발전소
2nd row: 기린에코 발전소
3rd row: 영흥도 태양광발전소
4th row: 영흥 소수력발전소
5th row: 해성 태양광발전소
Value | Count | Frequency (%)
태양광발전소 | 453 | 35.4%
발전소 | 25 | 2.0%
2호 | 11 | 0.9%
1호 | 9 | 0.7%
다주 | 8 | 0.6%
햇빛발전소 | 6 | 0.5%
인천 | 5 | 0.4%
태양광발전 | 4 | 0.3%
2호기 | 4 | 0.3%
제2태양광발전소 | 4 | 0.3%
Other values (693) | 750 | 58.6%

Most occurring characters

Value | Count | Frequency (%)
 | 660 | 9.1%
 | 654 | 9.0%
 | 648 | 8.9%
 | 620 | 8.6%
 | 620 | 8.6%
 | 612 | 8.4%
 | 611 | 8.4%
 | 164 | 2.3%
2 | 83 | 1.1%
1 | 66 | 0.9%
Other values (372) | 2507 | 34.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 6229 | 86.0%
Space Separator | 620 | 8.6%
Decimal Number | 234 | 3.2%
Uppercase Letter | 78 | 1.1%
Close Punctuation | 23 | 0.3%
Open Punctuation | 23 | 0.3%
Lowercase Letter | 21 | 0.3%
Other Symbol | 11 | 0.2%
Dash Punctuation | 3 | < 0.1%
Other Punctuation | 3 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
660
 
10.6%
654
 
10.5%
648
 
10.4%
620
 
10.0%
612
 
9.8%
611
 
9.8%
164
 
2.6%
65
 
1.0%
60
 
1.0%
47
 
0.8%
Other values (320) 2088
33.5%
Uppercase Letter
ValueCountFrequency (%)
S 15
19.2%
G 7
 
9.0%
E 5
 
6.4%
P 5
 
6.4%
K 5
 
6.4%
H 5
 
6.4%
N 4
 
5.1%
A 4
 
5.1%
O 4
 
5.1%
I 3
 
3.8%
Other values (12) 21
26.9%
Lowercase Letter
ValueCountFrequency (%)
n 7
33.3%
o 3
14.3%
g 2
 
9.5%
i 1
 
4.8%
d 1
 
4.8%
a 1
 
4.8%
r 1
 
4.8%
y 1
 
4.8%
l 1
 
4.8%
h 1
 
4.8%
Other values (2) 2
 
9.5%
Decimal Number
ValueCountFrequency (%)
2 83
35.5%
1 66
28.2%
3 35
15.0%
4 16
 
6.8%
5 13
 
5.6%
6 7
 
3.0%
8 4
 
1.7%
0 4
 
1.7%
7 4
 
1.7%
9 2
 
0.9%
Other Punctuation
ValueCountFrequency (%)
, 1
33.3%
. 1
33.3%
& 1
33.3%
Space Separator
ValueCountFrequency (%)
620
100.0%
Close Punctuation
ValueCountFrequency (%)
) 23
100.0%
Open Punctuation
ValueCountFrequency (%)
( 23
100.0%
Other Symbol
ValueCountFrequency (%)
11
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 6240 | 86.1%
Common | 906 | 12.5%
Latin | 99 | 1.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
660
 
10.6%
654
 
10.5%
648
 
10.4%
620
 
9.9%
612
 
9.8%
611
 
9.8%
164
 
2.6%
65
 
1.0%
60
 
1.0%
47
 
0.8%
Other values (321) 2099
33.6%
Latin
ValueCountFrequency (%)
S 15
 
15.2%
G 7
 
7.1%
n 7
 
7.1%
E 5
 
5.1%
P 5
 
5.1%
K 5
 
5.1%
H 5
 
5.1%
N 4
 
4.0%
A 4
 
4.0%
O 4
 
4.0%
Other values (24) 38
38.4%
Common
ValueCountFrequency (%)
620
68.4%
2 83
 
9.2%
1 66
 
7.3%
3 35
 
3.9%
) 23
 
2.5%
( 23
 
2.5%
4 16
 
1.8%
5 13
 
1.4%
6 7
 
0.8%
8 4
 
0.4%
Other values (7) 16
 
1.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 6229 | 86.0%
ASCII | 1005 | 13.9%
None | 11 | 0.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
660
 
10.6%
654
 
10.5%
648
 
10.4%
620
 
10.0%
612
 
9.8%
611
 
9.8%
164
 
2.6%
65
 
1.0%
60
 
1.0%
47
 
0.8%
Other values (320) 2088
33.5%
ASCII
ValueCountFrequency (%)
620
61.7%
2 83
 
8.3%
1 66
 
6.6%
3 35
 
3.5%
) 23
 
2.3%
( 23
 
2.3%
4 16
 
1.6%
S 15
 
1.5%
5 13
 
1.3%
6 7
 
0.7%
Other values (41) 104
 
10.3%
None
ValueCountFrequency (%)
11
100.0%

설비용량(킬로와트) (installed capacity, kW)
Real number (ℝ)

HIGH CORRELATION

Distinct: 394
Distinct (%): 59.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 203.90896
Minimum: 3
Maximum: 3000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.9 KiB

Quantile statistics

Minimum: 3
5-th percentile: 15
Q1: 34.18
Median: 97.28
Q3: 124.325
95-th percentile: 995.58
Maximum: 3000
Range: 2997
Interquartile range (IQR): 90.145

Descriptive statistics

Standard deviation: 397.16029
Coefficient of variation (CV): 1.9477334
Kurtosis: 22.37265
Mean: 203.90896
Median Absolute Deviation (MAD): 55.84
Skewness: 4.3055522
Sum: 134376.01
Variance: 157736.29
Monotonicity: Not monotonic
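
Several of the statistics above are derived from others, so they can be cross-checked without the raw data. The sketch below recomputes the mean, range, IQR, coefficient of variation, and variance from the reported values, using the standard definitions. The dictionary name `derived` is illustrative, and small differences in the last digit are expected because the inputs are already rounded.

```python
# Values copied from the quantile and descriptive statistics above.
n = 659
total = 134376.01
mean = 203.90896
std = 397.16029
q1, q3 = 34.18, 124.325
minimum, maximum = 3, 3000

# Each derived statistic recomputed from the primary ones.
derived = {
    "mean": total / n,             # sum / number of observations
    "range": maximum - minimum,    # max - min
    "iqr": q3 - q1,                # Q3 - Q1
    "cv": std / mean,              # coefficient of variation
    "variance": std ** 2,          # square of the standard deviation
}

for name, value in derived.items():
    print(f"{name}: {value:.5f}")
```

The mean (about 204 kW) sits well above the median (97.28 kW), which matches the strong positive skewness reported above.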
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
15.0 | 25 | 3.8%
97.92 | 23 | 3.5%
99.9 | 21 | 3.2%
99.0 | 20 | 3.0%
30.0 | 16 | 2.4%
97.2 | 11 | 1.7%
96.0 | 9 | 1.4%
99.2 | 8 | 1.2%
99.6 | 7 | 1.1%
98.55 | 7 | 1.1%
Other values (384) | 512 | 77.7%

Minimum 10 values

Value | Count | Frequency (%)
3.0 | 2 | 0.3%
6.0 | 1 | 0.2%
9.0 | 6 | 0.9%
10.26 | 1 | 0.2%
10.8 | 1 | 0.2%
10.88 | 1 | 0.2%
12.0 | 4 | 0.6%
12.3 | 1 | 0.2%
12.45 | 1 | 0.2%
13.28 | 2 | 0.3%

Maximum 10 values

Value | Count | Frequency (%)
3000.0 | 2 | 0.3%
2999.7 | 1 | 0.2%
2990.13 | 1 | 0.2%
2929.2 | 1 | 0.2%
2460.0 | 1 | 0.2%
2400.0 | 1 | 0.2%
2081.75 | 1 | 0.2%
1901.0 | 1 | 0.2%
1900.0 | 1 | 0.2%
1818.18 | 1 | 0.2%

설치장소 (installation location)
Text

Distinct: 616
Distinct (%): 93.5%
Missing: 0
Missing (%): 0.0%
Memory size: 5.3 KiB

Length

Max length: 49
Median length: 42
Mean length: 25.693475
Min length: 14

Characters and Unicode

Total characters: 16932
Distinct characters: 268
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 577
Unique (%): 87.6%

Sample

1st row: 인천광역시 강화군 길상면 동검리 555
2nd row: 인천광역시 옹진군 영흥면 내리 511 - 8
3rd row: 인천광역시 옹진군 영흥면 내6리 1703 - 2
4th row: 인천광역시 옹진군 영흥면 외리 산168
5th row: 인천광역시 미추홀구 용현4동 183-6
Value | Count | Frequency (%)
인천광역시 | 659 | 19.8%
강화군 | 296 | 8.9%
서구 | 157 | 4.7%
중구 | 57 | 1.7%
남동구 | 55 | 1.6%
하점면 | 45 | 1.3%
건물위 | 45 | 1.3%
길상면 | 43 | 1.3%
옹진군 | 34 | 1.0%
불은면 | 32 | 1.0%
Other values (1062) | 1913 | 57.3%

Most occurring characters

Value | Count | Frequency (%)
 | 2683 | 15.8%
1 | 719 | 4.2%
 | 678 | 4.0%
 | 672 | 4.0%
 | 660 | 3.9%
 | 659 | 3.9%
 | 659 | 3.9%
- | 478 | 2.8%
3 | 406 | 2.4%
 | 395 | 2.3%
Other values (258) | 8923 | 52.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 9740 | 57.5%
Decimal Number | 3205 | 18.9%
Space Separator | 2683 | 15.8%
Dash Punctuation | 478 | 2.8%
Open Punctuation | 308 | 1.8%
Close Punctuation | 307 | 1.8%
Other Punctuation | 188 | 1.1%
Uppercase Letter | 21 | 0.1%
Other Symbol | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
678
 
7.0%
672
 
6.9%
660
 
6.8%
659
 
6.8%
659
 
6.8%
395
 
4.1%
345
 
3.5%
333
 
3.4%
330
 
3.4%
325
 
3.3%
Other values (230) 4684
48.1%
Decimal Number
ValueCountFrequency (%)
1 719
22.4%
3 406
12.7%
2 355
11.1%
4 308
9.6%
5 283
 
8.8%
7 279
 
8.7%
6 266
 
8.3%
8 212
 
6.6%
9 191
 
6.0%
0 186
 
5.8%
Uppercase Letter
ValueCountFrequency (%)
B 5
23.8%
J 4
19.0%
L 4
19.0%
K 2
 
9.5%
A 1
 
4.8%
C 1
 
4.8%
G 1
 
4.8%
N 1
 
4.8%
T 1
 
4.8%
O 1
 
4.8%
Open Punctuation
ValueCountFrequency (%)
( 304
98.7%
[ 4
 
1.3%
Close Punctuation
ValueCountFrequency (%)
) 303
98.7%
] 4
 
1.3%
Space Separator
ValueCountFrequency (%)
2683
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 478
100.0%
Other Punctuation
ValueCountFrequency (%)
, 188
100.0%
Other Symbol
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 9742 | 57.5%
Common | 7169 | 42.3%
Latin | 21 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
678
 
7.0%
672
 
6.9%
660
 
6.8%
659
 
6.8%
659
 
6.8%
395
 
4.1%
345
 
3.5%
333
 
3.4%
330
 
3.4%
325
 
3.3%
Other values (231) 4686
48.1%
Common
ValueCountFrequency (%)
2683
37.4%
1 719
 
10.0%
- 478
 
6.7%
3 406
 
5.7%
2 355
 
5.0%
4 308
 
4.3%
( 304
 
4.2%
) 303
 
4.2%
5 283
 
3.9%
7 279
 
3.9%
Other values (7) 1051
 
14.7%
Latin
ValueCountFrequency (%)
B 5
23.8%
J 4
19.0%
L 4
19.0%
K 2
 
9.5%
A 1
 
4.8%
C 1
 
4.8%
G 1
 
4.8%
N 1
 
4.8%
T 1
 
4.8%
O 1
 
4.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 9740 | 57.5%
ASCII | 7190 | 42.5%
None | 2 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2683
37.3%
1 719
 
10.0%
- 478
 
6.6%
3 406
 
5.6%
2 355
 
4.9%
4 308
 
4.3%
( 304
 
4.2%
) 303
 
4.2%
5 283
 
3.9%
7 279
 
3.9%
Other values (17) 1072
 
14.9%
Hangul
ValueCountFrequency (%)
678
 
7.0%
672
 
6.9%
660
 
6.8%
659
 
6.8%
659
 
6.8%
395
 
4.1%
345
 
3.5%
333
 
3.4%
330
 
3.4%
325
 
3.3%
Other values (230) 4684
48.1%
None
ValueCountFrequency (%)
2
100.0%

원동력 종류 (prime mover type)
Categorical

HIGH CORRELATION  IMBALANCE

Distinct: 8
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 5.3 KiB

Value | Count
태양광 | 649
풍력 | 3
바이오가스 | 2
소수력 | 1
바이오가스 (LFG,혐기성소화조) | 1
Other values (3) | 3

Length

Max length: 18
Median length: 3
Mean length: 3.030349
Min length: 2

Unique

Unique: 5
Unique (%): 0.8%

Sample

1st row: 태양광
2nd row: 태양광
3rd row: 태양광
4th row: 소수력
5th row: 태양광

Common Values

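Value | English gloss | Count | Frequency (%)
태양광 | solar PV | 649 | 98.5%
풍력 | wind | 3 | 0.5%
바이오가스 | biogas | 2 | 0.3%
소수력 | small hydro | 1 | 0.2%
바이오가스 (LFG,혐기성소화조) | biogas (landfill gas, anaerobic digester) | 1 | 0.2%
화력 | thermal | 1 | 0.2%
태양광, 풍력 | solar PV and wind | 1 | 0.2%
연료전지 | fuel cell | 1 | 0.2%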

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
태양광 | 650 | 98.3%
풍력 | 4 | 0.6%
바이오가스 | 3 | 0.5%
소수력 | 1 | 0.2%
lfg,혐기성소화조 | 1 | 0.2%
화력 | 1 | 0.2%
연료전지 | 1 | 0.2%

사업개시유무 (whether operation has started)
Boolean

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 791.0 B

Value | Count | Frequency (%)
True | 452 | 68.6%
False | 207 | 31.4%

사업개시일 (operation start date)
Date

MISSING

Distinct: 329
Distinct (%): 72.8%
Missing: 207
Missing (%): 31.4%
Memory size: 5.3 KiB
Minimum: 2006-03-27 00:00:00
Maximum: 2021-07-30 00:00:00
Histogram with fixed size bins (bins=50)

Interactions


Correlations

Variable | 설비용량(킬로와트) | 원동력 종류 | 사업개시유무
설비용량(킬로와트) | 1.000 | 0.830 | 0.161
원동력 종류 | 0.830 | 1.000 | 0.047
사업개시유무 | 0.161 | 0.047 | 1.000

Variable | 사업개시유무 | 원동력 종류
사업개시유무 | 1.000 | 0.035
원동력 종류 | 0.035 | 1.000

Variable | 설비용량(킬로와트) | 원동력 종류 | 사업개시유무
설비용량(킬로와트) | 1.000 | 0.594 | 0.123
원동력 종류 | 0.594 | 1.000 | 0.035
사업개시유무 | 0.123 | 0.035 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 허가일자 | 상호 | 설비용량(킬로와트) | 설치장소 | 원동력 종류 | 사업개시유무 | 사업개시일
0 | 2005-09-30 | 동검발전소 | 3.0 | 인천광역시 강화군 길상면 동검리 555 | 태양광 | Y | 2015-11-23
1 | 2005-12-12 | 기린에코 발전소 | 3.0 | 인천광역시 옹진군 영흥면 내리 511 - 8 | 태양광 | Y | 2006-03-27
2 | 2006-02-27 | 영흥도 태양광발전소 | 1000.0 | 인천광역시 옹진군 영흥면 내6리 1703 - 2 | 태양광 | Y | 2006-10-04
3 | 2006-08-07 | 영흥 소수력발전소 | 3000.0 | 인천광역시 옹진군 영흥면 외리 산168 | 소수력 | Y | 2008-03-15
4 | 2007-11-29 | 해성 태양광발전소 | 6.0 | 인천광역시 미추홀구 용현4동 183-6 | 태양광 | Y | 2008-01-02
5 | 2007-12-28 | 바이오가스발전소 | 1900.0 | 인천광역시 서구 백석동 58 | 바이오가스 (LFG,혐기성소화조) | Y | 2008-05-02
6 | 2008-04-04 | 고명자 태양광발전소 | 30.0 | 인천광역시 강화군 길상면 온수리 산 129-2 | 태양광 | Y | 2008-05-02
7 | 2008-04-04 | 이동근 태양광발전소 | 30.0 | 인천광역시 강화군 길상면 온수리 산 129-3 | 태양광 | Y | 2008-05-02
8 | 2008-04-17 | 이건태양광발전소 | 27.44 | 인천광역시 미추홀구 도화동 967-3 | 태양광 | Y | 2008-05-15
9 | 2008-07-10 | 고용민태양광발전소 | 9.0 | 인천광역시 강화군 길상면 온수리 380-4 | 태양광 | Y | 2008-09-16

Last rows

Index | 허가일자 | 상호 | 설비용량(킬로와트) | 설치장소 | 원동력 종류 | 사업개시유무 | 사업개시일
649 | 2020-10-19 | 승원 태양광발전소 | 35.28 | 인천광역시 강화군 길상면 장흥리 623번지 ,623-1(건물위) | 태양광 | Y | 2020-12-29
650 | 2020-10-20 | 햇빛나눔 인천1호 태양광발전소 | 99.75 | 인천광역시 서구 북항로120번길 13-18 (원창동) | 태양광 | Y | 2021-01-06
651 | 2020-10-27 | 남동빌딩 태양광발전소 | 15.12 | 인천광역시 남동구 만수동 987-11 남동빌딩(건물위) | 태양광 | N | <NA>
652 | 2020-10-27 | 금분태양광발전소 | 18.0 | 인천광역시 남동구 석정로477번길 9, (건물위) (간석동) | 태양광 | Y | 2021-03-26
653 | 2020-11-05 | 전앤유에너지 발전소 | 19.95 | 인천광역시 서구 북항로120번길 13-26, (건물위) (원창동) | 태양광 | N | <NA>
654 | 2020-12-14 | 훈민 태양광발전소 | 79.0 | 인천광역시 부평구 청천동 423-1(건물 위) | 태양광 | N | <NA>
655 | 2020-12-14 | 늘푸른 태양광발전소 | 79.8 | 인천광역시 남동구 도림동 491(건물 위) | 태양광 | N | <NA>
656 | 2020-12-14 | 늘푸른2 태양광발전소 | 44.1 | 인천광역시 남동구 서창동 740-1(건물 위) | 태양광 | Y | 2021-04-27
657 | 2020-12-22 | 주5 SOLAR 발전 | 18.48 | 인천광역시 미추홀구 주안동 16-81(건물 위) | 태양광 | Y | 2021-05-04
658 | 2020-12-31 | 수성밸브공업㈜ 태양광발전소 | 99.96 | 인천광역시 서구 마중5로 14 다동(건물위) | 태양광 | Y | 2021-04-13
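
The sample rows pair a permit date (허가일자) with an operation start date (사업개시일), and rows where 사업개시유무 is N show `<NA>` for the start date. As a small illustration of working with these records, the sketch below computes the days from permit to start for three rows copied from the sample, using only the standard library. The `rows` mapping and `days_to_start` helper are hypothetical names, and `None` stands in for `<NA>`.

```python
from datetime import date

# (permit date, operation start date) for three rows of the sample;
# None stands for the <NA> shown where operation has not started.
rows = {
    "기린에코 발전소": ("2005-12-12", "2006-03-27"),
    "영흥도 태양광발전소": ("2006-02-27", "2006-10-04"),
    "남동빌딩 태양광발전소": ("2020-10-27", None),
}

def days_to_start(permit, start):
    """Days from permit to operation start, or None if not started."""
    if start is None:
        return None
    return (date.fromisoformat(start) - date.fromisoformat(permit)).days

lead_times = {name: days_to_start(p, s) for name, (p, s) in rows.items()}
print(lead_times)
```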