Overview

Dataset statistics

Number of variables: 6
Number of observations: 73
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.7 KiB
Average record size in memory: 51.8 B

Variable types

Categorical: 3
Text: 1
Numeric: 2

Dataset

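Description: Records of pre-use inspections carried out at power plants in 2014, provided by major category (대분류: steam power, combined cycle), sub-category (중분류: generator, transformer, circuit breaker, boiler, steam turbine), attribute (구분), value (내용), number of installed units (설치대수), and share (점유율).
Author: 한국전기안전공사 (Korea Electrical Safety Corporation)
URL: https://www.data.go.kr/data/15002356/fileData.do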

Alerts

설치대수 is highly overall correlated with 점유율 (High correlation)
점유율 is highly overall correlated with 설치대수 (High correlation)
중분류 is highly overall correlated with 구분 (High correlation)
구분 is highly overall correlated with 중분류 (High correlation)
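
To illustrate why the 설치대수 / 점유율 alert fires, the sketch below computes a plain Pearson coefficient on the ten head rows listed in the Sample section. It is only an illustration: the report's own coefficients are computed over all 73 rows and may use a different measure, and the helper name `pearson` is an assumption introduced here.

```python
from math import sqrt

# First ten rows of the Sample section as (설치대수, 점유율) pairs
rows = [(36, 53.7), (9, 13.4), (6, 9.0), (16, 23.9), (6, 9.0),
        (39, 58.2), (4, 6.0), (18, 26.9), (45, 67.2), (19, 28.4)]

def pearson(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)

r = pearson(rows)
print(f"Pearson r on the ten head rows: {r:.4f}")
```

Within a single group the two columns move almost in lockstep, which is consistent with the alert.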

Reproduction

Analysis started: 2023-12-12 05:48:32.499498
Analysis finished: 2023-12-12 05:48:33.483810
Duration: 0.98 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
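
A report of this kind can be regenerated roughly as sketched below. This is a minimal sketch, not the configuration actually used (that is in config.json): the function name `build_report`, the CSV file name in the usage line, the report title, and the CP949 encoding are all assumptions.

```python
def build_report(csv_path, out_path="report.html"):
    """Profile a CSV file and write an HTML report (sketch)."""
    # Imported lazily so this module still loads where the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the file is CP949-encoded; use "utf-8" if it is not.
    df = pd.read_csv(csv_path, encoding="cp949")
    report = ProfileReport(df, title="Power plant pre-use inspections (2014)")
    report.to_file(out_path)
    return report
```

Usage, with a hypothetical file name: `build_report("inspections_2014.csv")`.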

Variables

대분류
Categorical

Distinct: 2
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Memory size: 716.0 B
기력설비: 51
복합화력설비: 22

Length

Max length: 6
Median length: 4
Mean length: 4.6027397
Min length: 4
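
The length figures can be cross-checked from the category counts alone. The sketch below assumes length means the number of Unicode code points in each value, which matches the reported numbers.

```python
from statistics import mean, median

# Category counts from the 대분류 summary
counts = {"기력설비": 51, "복합화력설비": 22}

# One length per row: each value's length repeated by its count
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

stats = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": round(mean(lengths), 7),
    "min": min(lengths),
}
print(stats)
```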

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 기력설비
2nd row: 기력설비
3rd row: 기력설비
4th row: 기력설비
5th row: 기력설비

Common Values

Value | Count | Frequency (%)
기력설비 | 51 | 69.9%
복합화력설비 | 22 | 30.1%
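
The Frequency (%) column is each count divided by the 73 observations. A minimal sketch of that arithmetic, with variable names chosen here for illustration:

```python
# Counts from the Common Values table for 대분류
counts = {"기력설비": 51, "복합화력설비": 22}
total = sum(counts.values())  # 73 observations

# Share of each value, rounded to one decimal place as in the report
percentages = {value: round(100 * n / total, 1) for value, n in counts.items()}
print(percentages)
```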

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; counts are the same as in the Common Values table]

중분류
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 6.8%
Missing: 0
Missing (%): 0.0%
Memory size: 716.0 B
발전기: 21
변압기: 15
보일러: 15
증기터빈: 13
차단기: 9

Length

Max length: 4
Median length: 3
Mean length: 3.1780822
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 발전기
2nd row: 발전기
3rd row: 발전기
4th row: 발전기
5th row: 발전기

Common Values

Value | Count | Frequency (%)
발전기 | 21 | 28.8%
변압기 | 15 | 20.5%
보일러 | 15 | 20.5%
증기터빈 | 13 | 17.8%
차단기 | 9 | 12.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; counts are the same as in the Common Values table]

구분
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 11.0%
Missing: 0
Missing (%): 0.0%
Memory size: 716.0 B
제작사: 35
전압: 9
소내변압기 용량비율: 8
형식: 8
증발량: 5
Other values (3): 8

Length

Max length: 10
Median length: 3
Mean length: 3.5890411
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 제작사
2nd row: 제작사
3rd row: 제작사
4th row: 제작사
5th row: 전압

Common Values

Value | Count | Frequency (%)
제작사 | 35 | 47.9%
전압 | 9 | 12.3%
소내변압기 용량비율 | 8 | 11.0%
형식 | 8 | 11.0%
증발량 | 5 | 6.8%
접지방식 | 3 | 4.1%
조작방식 | 3 | 4.1%
종류 | 2 | 2.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

The table below counts whitespace-separated words rather than whole values, so 소내변압기 용량비율 appears as two entries and the percentages are taken over 81 words instead of 73 rows.

Value | Count | Frequency (%)
제작사 | 35 | 43.2%
전압 | 9 | 11.1%
소내변압기 | 8 | 9.9%
용량비율 | 8 | 9.9%
형식 | 8 | 9.9%
증발량 | 5 | 6.2%
접지방식 | 3 | 3.7%
조작방식 | 3 | 3.7%
종류 | 2 | 2.5%

내용
Text

Distinct: 51
Distinct (%): 69.9%
Missing: 0
Missing (%): 0.0%
Memory size: 716.0 B

Length

Max length: 9
Median length: 8
Mean length: 4.6438356
Min length: 2

Characters and Unicode

Total characters: 339
Distinct characters: 86
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 38
Unique (%): 52.1%
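
Distinct and Unique measure different things: Distinct is the number of different values, while Unique is the number of values that occur exactly once. The sketch below shows the difference on the twenty 내용 values visible in the Sample section, so its results (19 and 18) are not the full-dataset figures of 51 and 38.

```python
from collections import Counter

# 내용 values from the ten head rows and ten tail rows of the Sample section
values = ["두산중공업", "히타치", "웨스팅하우스", "기타(7개업체)", "25kV",
          "22kV", "19kV", "기타", "저항접지", "변압기접지",
          "일진", "3%이하", "6%이하", "10%이하", "10%초과",
          "GE", "MHI", "지멘스", "ABB", "두산중공업"]

freq = Counter(values)
distinct = len(freq)                            # different values
unique = sum(1 for n in freq.values() if n == 1)  # values seen exactly once
print(distinct, unique)
```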

Sample

1st row: 두산중공업
2nd row: 히타치
3rd row: 웨스팅하우스
4th row: 기타(7개업체)
5th row: 25kV

Words

Value | Count | Frequency (%)
두산중공업 | 4 | 5.1%
기타 | 4 | 5.1%
히타치 | 3 | 3.8%
ge | 3 | 3.8%
abb | 3 | 3.8%
지멘스 | 3 | 3.8%
현대중공업 | 3 | 3.8%
효성중공업 | 3 | 3.8%
드럼형 | 3 | 3.8%
관류형 | 2 | 2.5%
Other values (43) | 48 | 60.8%

Most occurring characters

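Value | Count | Frequency (%)
(not preserved) | 16 | 4.7%
0 | 15 | 4.4%
(not preserved) | 13 | 3.8%
(not preserved) | 12 | 3.5%
9 | 11 | 3.2%
(not preserved) | 11 | 3.2%
1 | 11 | 3.2%
(not preserved) | 9 | 2.7%
% | 8 | 2.4%
(not preserved) | 8 | 2.4%
Other values (76) | 225 | 66.4%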

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 190 | 56.0%
Decimal Number | 63 | 18.6%
Uppercase Letter | 47 | 13.9%
Other Punctuation | 11 | 3.2%
Space Separator | 7 | 2.1%
Lowercase Letter | 7 | 2.1%
Close Punctuation | 4 | 1.2%
Open Punctuation | 4 | 1.2%
Dash Punctuation | 3 | 0.9%
Math Symbol | 3 | 0.9%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 16 | 8.4%
(not preserved) | 13 | 6.8%
(not preserved) | 12 | 6.3%
(not preserved) | 11 | 5.8%
(not preserved) | 9 | 4.7%
(not preserved) | 8 | 4.2%
(not preserved) | 8 | 4.2%
(not preserved) | 6 | 3.2%
(not preserved) | 6 | 3.2%
(not preserved) | 5 | 2.6%
Other values (45) | 96 | 50.5%

Uppercase Letter
Value | Count | Frequency (%)
V | 8 | 17.0%
B | 8 | 17.0%
C | 5 | 10.6%
G | 5 | 10.6%
E | 4 | 8.5%
F | 3 | 6.4%
T | 3 | 6.4%
A | 3 | 6.4%
H | 3 | 6.4%
I | 3 | 6.4%
Other values (2) | 2 | 4.3%

Decimal Number
Value | Count | Frequency (%)
0 | 15 | 23.8%
9 | 11 | 17.5%
1 | 11 | 17.5%
2 | 7 | 11.1%
3 | 7 | 11.1%
4 | 3 | 4.8%
6 | 3 | 4.8%
5 | 3 | 4.8%
8 | 2 | 3.2%
7 | 1 | 1.6%

Other Punctuation
Value | Count | Frequency (%)
% | 8 | 72.7%
/ | 2 | 18.2%
. | 1 | 9.1%

Space Separator
Value | Count | Frequency (%)
(space) | 7 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
k | 7 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 4 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 4 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 3 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 190 | 56.0%
Common | 95 | 28.0%
Latin | 54 | 15.9%

Most frequent character per script

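Hangul
Value | Count | Frequency (%)
(not preserved) | 16 | 8.4%
(not preserved) | 13 | 6.8%
(not preserved) | 12 | 6.3%
(not preserved) | 11 | 5.8%
(not preserved) | 9 | 4.7%
(not preserved) | 8 | 4.2%
(not preserved) | 8 | 4.2%
(not preserved) | 6 | 3.2%
(not preserved) | 6 | 3.2%
(not preserved) | 5 | 2.6%
Other values (45) | 96 | 50.5%

Common
Value | Count | Frequency (%)
0 | 15 | 15.8%
9 | 11 | 11.6%
1 | 11 | 11.6%
% | 8 | 8.4%
(space) | 7 | 7.4%
2 | 7 | 7.4%
3 | 7 | 7.4%
) | 4 | 4.2%
( | 4 | 4.2%
- | 3 | 3.2%
Other values (8) | 18 | 18.9%

Latin
Value | Count | Frequency (%)
V | 8 | 14.8%
B | 8 | 14.8%
k | 7 | 13.0%
C | 5 | 9.3%
G | 5 | 9.3%
E | 4 | 7.4%
F | 3 | 5.6%
T | 3 | 5.6%
A | 3 | 5.6%
H | 3 | 5.6%
Other values (3) | 5 | 9.3%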

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 190 | 56.0%
ASCII | 149 | 44.0%

Most frequent character per block

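Hangul
Value | Count | Frequency (%)
(not preserved) | 16 | 8.4%
(not preserved) | 13 | 6.8%
(not preserved) | 12 | 6.3%
(not preserved) | 11 | 5.8%
(not preserved) | 9 | 4.7%
(not preserved) | 8 | 4.2%
(not preserved) | 8 | 4.2%
(not preserved) | 6 | 3.2%
(not preserved) | 6 | 3.2%
(not preserved) | 5 | 2.6%
Other values (45) | 96 | 50.5%

ASCII
Value | Count | Frequency (%)
0 | 15 | 10.1%
9 | 11 | 7.4%
1 | 11 | 7.4%
% | 8 | 5.4%
V | 8 | 5.4%
B | 8 | 5.4%
(space) | 7 | 4.7%
k | 7 | 4.7%
2 | 7 | 4.7%
3 | 7 | 4.7%
Other values (21) | 60 | 40.3%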

설치대수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 36
Distinct (%): 49.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.342466
Minimum: 1
Maximum: 69
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 789.0 B

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 7
median: 12
Q3: 25
95-th percentile: 51
Maximum: 69
Range: 68
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 16.238859
Coefficient of variation (CV): 0.88531493
Kurtosis: 1.024467
Mean: 18.342466
Median Absolute Deviation (MAD): 7
Skewness: 1.2969291
Sum: 1339
Variance: 263.70053
Monotonicity: Not monotonic
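
Several of these statistics are tied together by simple identities (mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1, range = maximum - minimum). The sketch below re-derives them from the reported 설치대수 figures as a consistency check; the same identities apply to 점유율. Variable names are chosen here for illustration.

```python
# Figures reported for 설치대수
n = 73            # number of observations
total = 1339      # Sum
std = 16.238859   # Standard deviation
q1, q3 = 7, 25
minimum, maximum = 1, 69

mean = total / n                 # reported as 18.342466
cv = std / mean                  # reported as 0.88531493
variance = std ** 2              # reported as 263.70053
iqr = q3 - q1                    # reported as 18
value_range = maximum - minimum  # reported as 68

print(round(mean, 6), round(cv, 6), round(variance, 3), iqr, value_range)
```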

[Figure: histogram with fixed size bins (bins=36)]
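
Common values

Value | Count | Frequency (%)
8 | 5 | 6.8%
2 | 5 | 6.8%
11 | 5 | 6.8%
6 | 4 | 5.5%
3 | 4 | 5.5%
10 | 4 | 5.5%
40 | 3 | 4.1%
19 | 3 | 4.1%
9 | 3 | 4.1%
16 | 2 | 2.7%
Other values (26) | 35 | 47.9%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 1.4%
2 | 5 | 6.8%
3 | 4 | 5.5%
4 | 2 | 2.7%
5 | 2 | 2.7%
6 | 4 | 5.5%
7 | 1 | 1.4%
8 | 5 | 6.8%
9 | 3 | 4.1%
10 | 4 | 5.5%

Maximum 10 values

Value | Count | Frequency (%)
69 | 1 | 1.4%
66 | 1 | 1.4%
52 | 1 | 1.4%
51 | 2 | 2.7%
48 | 1 | 1.4%
45 | 1 | 1.4%
44 | 1 | 1.4%
40 | 3 | 4.1%
39 | 1 | 1.4%
36 | 1 | 1.4%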

점유율
Real number (ℝ)

HIGH CORRELATION 

Distinct: 49
Distinct (%): 67.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 24.650685
Minimum: 1.9
Maximum: 77.6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 789.0 B

Quantile statistics

Minimum: 1.9
5-th percentile: 3
Q1: 9
median: 16.4
Q3: 37.3
95-th percentile: 64.62
Maximum: 77.6
Range: 75.7
Interquartile range (IQR): 28.3

Descriptive statistics

Standard deviation: 20.40723
Coefficient of variation (CV): 0.82785651
Kurtosis: -0.066559361
Mean: 24.650685
Median Absolute Deviation (MAD): 10.4
Skewness: 1.0084244
Sum: 1799.5
Variance: 416.45503
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=49)]

Common values

Value | Count | Frequency (%)
11.9 | 5 | 6.8%
8.3 | 3 | 4.1%
14.9 | 3 | 4.1%
9.0 | 3 | 4.1%
4.5 | 3 | 4.1%
16.4 | 3 | 4.1%
37.0 | 2 | 2.7%
22.4 | 2 | 2.7%
1.9 | 2 | 2.7%
3.0 | 2 | 2.7%
Other values (39) | 45 | 61.6%

Minimum 10 values

Value | Count | Frequency (%)
1.9 | 2 | 2.7%
2.8 | 1 | 1.4%
3.0 | 2 | 2.7%
4.5 | 3 | 4.1%
5.5 | 2 | 2.7%
6.0 | 2 | 2.7%
6.2 | 1 | 1.4%
7.5 | 1 | 1.4%
8.3 | 3 | 4.1%
9.0 | 3 | 4.1%

Maximum 10 values

Value | Count | Frequency (%)
77.6 | 1 | 1.4%
76.2 | 1 | 1.4%
67.2 | 1 | 1.4%
65.7 | 1 | 1.4%
63.9 | 1 | 1.4%
61.1 | 2 | 2.7%
59.7 | 1 | 1.4%
58.2 | 1 | 1.4%
53.7 | 1 | 1.4%
50.7 | 1 | 1.4%

Interactions

[Figures: four interaction plots; images not preserved]

Correlations

[Figure: correlation heatmap, all six variables]

Variable | 대분류 | 중분류 | 구분 | 내용 | 설치대수 | 점유율
대분류 | 1.000 | 0.339 | 0.437 | 0.000 | 0.000 | 0.000
중분류 | 0.339 | 1.000 | 0.779 | 0.628 | 0.121 | 0.000
구분 | 0.437 | 0.779 | 1.000 | 0.998 | 0.188 | 0.401
내용 | 0.000 | 0.628 | 0.998 | 1.000 | 0.578 | 0.611
설치대수 | 0.000 | 0.121 | 0.188 | 0.578 | 1.000 | 0.884
점유율 | 0.000 | 0.000 | 0.401 | 0.611 | 0.884 | 1.000

[Figure: correlation heatmap, categorical variables]

Variable | 대분류 | 중분류 | 구분
대분류 | 1.000 | 0.403 | 0.313
중분류 | 0.403 | 1.000 | 0.613
구분 | 0.313 | 0.613 | 1.000

[Figure: correlation heatmap, numeric and categorical variables]

Variable | 설치대수 | 점유율 | 대분류 | 중분류 | 구분
설치대수 | 1.000 | 0.958 | 0.000 | 0.055 | 0.083
점유율 | 0.958 | 1.000 | 0.000 | 0.000 | 0.196
대분류 | 0.000 | 0.000 | 1.000 | 0.403 | 0.313
중분류 | 0.055 | 0.000 | 0.403 | 1.000 | 0.613
구분 | 0.083 | 0.196 | 0.313 | 0.613 | 1.000

Missing values

[Figure: count plot] A simple visualization of nullity by column.
[Figure: nullity matrix] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

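First rows

Row | 대분류 | 중분류 | 구분 | 내용 | 설치대수 | 점유율
0 | 기력설비 | 발전기 | 제작사 | 두산중공업 | 36 | 53.7
1 | 기력설비 | 발전기 | 제작사 | 히타치 | 9 | 13.4
2 | 기력설비 | 발전기 | 제작사 | 웨스팅하우스 | 6 | 9.0
3 | 기력설비 | 발전기 | 제작사 | 기타(7개업체) | 16 | 23.9
4 | 기력설비 | 발전기 | 전압 | 25kV | 6 | 9.0
5 | 기력설비 | 발전기 | 전압 | 22kV | 39 | 58.2
6 | 기력설비 | 발전기 | 전압 | 19kV | 4 | 6.0
7 | 기력설비 | 발전기 | 전압 | 기타 | 18 | 26.9
8 | 기력설비 | 발전기 | 접지방식 | 저항접지 | 45 | 67.2
9 | 기력설비 | 발전기 | 접지방식 | 변압기접지 | 19 | 28.4

Last rows

Row | 대분류 | 중분류 | 구분 | 내용 | 설치대수 | 점유율
63 | 복합화력설비 | 변압기 | 제작사 | 일진 | 2 | 1.9
64 | 복합화력설비 | 변압기 | 소내변압기 용량비율 | 3%이하 | 9 | 8.3
65 | 복합화력설비 | 변압기 | 소내변압기 용량비율 | 6%이하 | 16 | 14.8
66 | 복합화력설비 | 변압기 | 소내변압기 용량비율 | 10%이하 | 69 | 63.9
67 | 복합화력설비 | 변압기 | 소내변압기 용량비율 | 10%초과 | 14 | 13.0
68 | 복합화력설비 | 증기터빈 | 제작사 | GE | 22 | 61.1
69 | 복합화력설비 | 증기터빈 | 제작사 | MHI | 8 | 22.2
70 | 복합화력설비 | 증기터빈 | 제작사 | 지멘스 | 3 | 8.3
71 | 복합화력설비 | 증기터빈 | 제작사 | ABB | 2 | 5.5
72 | 복합화력설비 | 증기터빈 | 제작사 | 두산중공업 | 1 | 2.8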
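
In the rows shown, 점유율 appears to be each row's 설치대수 as a percentage of the total for its 대분류 / 중분류 / 구분 group. The dataset description does not state this, so treat it as an assumption; the sketch below checks it for the four 제작사 rows of 기력설비 / 발전기 (rows 0 to 3).

```python
# 설치대수 for 기력설비 / 발전기 / 제작사 (rows 0-3 of the sample)
units = {"두산중공업": 36, "히타치": 9, "웨스팅하우스": 6, "기타(7개업체)": 16}

group_total = sum(units.values())  # 67 units in this group

# Share of each manufacturer, rounded to one decimal place
shares = {name: round(100 * n / group_total, 1) for name, n in units.items()}
print(shares)
```

The computed shares match the 점유율 column for these rows (53.7, 13.4, 9.0, 23.9).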