Overview

Dataset statistics

Number of variables: 4
Number of observations: 253
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 8.3 KiB
Average record size in memory: 33.5 B

Variable types

Categorical: 1
Numeric: 1
Text: 1
DateTime: 1

Dataset

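Description: Provides data on power plant tours at Korea East-West Power (한국동서발전), broken down by plant. Each record consists of four fields: 구분 (category, here the power plant), 인원 (number of visitors), 참여그룹 (participating group), and 신청날짜 (application date).
URL: https://www.data.go.kr/data/15009799/fileData.do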

Alerts

구분 is highly imbalanced (72.7%). Alert type: Imbalance
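The 72.7% figure can be reproduced from the per-plant counts reported for 구분 in this report. The sketch below assumes the score is one minus the normalized Shannon entropy of the class shares. That formula is an assumption that happens to match the reported value, not a confirmed description of how the profiling tool computes it, and `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k): 0 for a uniform split, 1 when one class holds everything."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    # Shannon entropy of the class shares (natural log; the base cancels in the ratio).
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)

# Counts of 구분 as listed in this report (253 observations in total).
plant_counts = {"당진발전": 228, "동해발전": 11, "울산발전": 8, "일산발전": 5, "호남발전": 1}

score = imbalance_score(list(plant_counts.values()))
print(f"{score:.1%}")  # 72.7%
```

A score near 100% means almost every row falls in a single category; here 당진발전 alone accounts for 228 of 253 rows.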

Reproduction

Analysis started: 2023-12-12 21:08:00.539122
Analysis finished: 2023-12-12 21:08:00.932485
Duration: 0.39 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
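The reported duration follows directly from the two timestamps. A minimal check with the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2023-12-12 21:08:00.539122")
finished = datetime.fromisoformat("2023-12-12 21:08:00.932485")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 0.39 seconds
```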

Variables

구분
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB

Value     Count
당진발전  228
동해발전  11
울산발전  8
일산발전  5
호남발전  1

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 1
Unique (%): 0.4%
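The distinct and unique figures can be recomputed from the value counts of 구분. The sketch assumes that "Distinct" counts the different values and "Unique" counts the values that occur exactly once, which is consistent with the numbers reported here (only 호남발전 appears a single time).

```python
# Counts of 구분 as listed in this report.
plant_counts = {"당진발전": 228, "동해발전": 11, "울산발전": 8, "일산발전": 5, "호남발전": 1}

n_rows = sum(plant_counts.values())                         # 253 observations
distinct = len(plant_counts)                                # number of different values
unique = sum(1 for c in plant_counts.values() if c == 1)    # values seen exactly once

print(f"Distinct: {distinct} ({distinct / n_rows:.1%})")    # Distinct: 5 (2.0%)
print(f"Unique: {unique} ({unique / n_rows:.1%})")          # Unique: 1 (0.4%)
```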

Sample

1st row: 일산발전
2nd row: 울산발전
3rd row: 당진발전
4th row: 동해발전
5th row: 동해발전

Common Values

Value     Count  Frequency (%)
당진발전  228    90.1%
동해발전  11     4.3%
울산발전  8      3.2%
일산발전  5      2.0%
호남발전  1      0.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values of 구분; the underlying counts are those of the Common Values table]

인원
Real number (ℝ)

Distinct: 43
Distinct (%): 17.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.885375
Minimum: 1
Maximum: 120
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 4
Median: 11
Q3: 28
95-th percentile: 42.8
Maximum: 120
Range: 119
Interquartile range (IQR): 24
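The range and IQR are simple differences of the quantiles above. The sketch also applies Tukey's 1.5 × IQR rule as one possible reading of those numbers; that rule is an illustrative addition and is not part of the report. The largest values and their counts are copied from the extreme values listed for 인원 in this report.

```python
# Quantiles of 인원 as reported.
minimum, q1, q3, maximum = 1, 4, 28, 120

value_range = maximum - minimum      # 119
iqr = q3 - q1                        # 24

# Tukey fences (illustrative assumption, not computed by the report).
lower_fence = q1 - 1.5 * iqr         # -32.0, below the minimum, so no low outliers
upper_fence = q3 + 1.5 * iqr         # 64.0

# Ten largest distinct values of 인원 and their counts, as listed in the report.
largest = {120: 1, 110: 1, 90: 3, 80: 2, 60: 2, 50: 3, 44: 1, 42: 2, 40: 8, 38: 4}
n_above = sum(count for value, count in largest.items() if value > upper_fence)

print(value_range, iqr, upper_fence, n_above)  # 119 24 64.0 7
```

Under that rule, 7 of the 253 tours (group sizes 80 and above) would be flagged as unusually large, which fits the strong right skew reported in the descriptive statistics.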

Descriptive statistics

Standard deviation: 18.452607
Coefficient of variation (CV): 1.0317148
Kurtosis: 7.8491982
Mean: 17.885375
Median Absolute Deviation (MAD): 8
Skewness: 2.3499632
Sum: 4525
Variance: 340.49871
Monotonicity: Not monotonic
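Several of these statistics are derived from one another, so they can be cross-checked with plain arithmetic. The sketch uses only the reported sum, observation count, and standard deviation; small differences in the last digits come from the rounding of the reported standard deviation.

```python
# Reported values for 인원.
total = 4525          # Sum
n_rows = 253          # Number of observations
std = 18.452607       # Standard deviation

mean = total / n_rows         # should match the reported mean, 17.885375
variance = std ** 2           # should match the reported variance, about 340.4987
cv = std / mean               # coefficient of variation, about 1.0317

print(f"mean={mean:.6f} variance={variance:.4f} cv={cv:.4f}")
```

A CV above 1 means the standard deviation exceeds the mean, which is typical for a right-skewed count variable such as group size.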
[Figure: histogram with fixed-size bins (bins=43)]
Common values

Value              Count  Frequency (%)
3                  25     9.9%
4                  21     8.3%
10                 19     7.5%
2                  17     6.7%
30                 15     5.9%
8                  11     4.3%
12                 10     4.0%
7                  9      3.6%
40                 8      3.2%
24                 8      3.2%
Other values (33)  110    43.5%

Minimum 10 values

Value  Count  Frequency (%)
1      3      1.2%
2      17     6.7%
3      25     9.9%
4      21     8.3%
5      7      2.8%
6      7      2.8%
7      9      3.6%
8      11     4.3%
9      4      1.6%
10     19     7.5%

Maximum 10 values

Value  Count  Frequency (%)
120    1      0.4%
110    1      0.4%
90     3      1.2%
80     2      0.8%
60     2      0.8%
50     3      1.2%
44     1      0.4%
42     2      0.8%
40     8      3.2%
38     4      1.6%

참여그룹
Text

Distinct: 238
Distinct (%): 94.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB

Length

Max length: 24
Median length: 19
Mean length: 8.9288538
Min length: 2

Characters and Unicode

Total characters: 2259
Distinct characters: 301
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
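As a small illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch classifies one sample value of 참여그룹 from this report; the category codes `Lo`, `Lu`, `Zs`, and `Po` correspond to Other Letter, Uppercase Letter, Space Separator, and Other Punctuation. It is not the profiling tool's own code, only a way to see where such counts come from.

```python
import unicodedata
from collections import Counter

# One sample value of 참여그룹 taken from this report.
text = "대원연료전지 주주단, SK 건설"

# Count characters by Unicode general category.
categories = Counter(unicodedata.category(ch) for ch in text)

for code, count in categories.most_common():
    print(code, count)
# Lo 11  (Hangul syllables are "Other Letter")
# Zs 3   (spaces)
# Lu 2   (S, K)
# Po 1   (comma)
```

The standard library does not expose the script or block properties, so reproducing the script and block tables would need an additional data source.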

Unique

Unique: 226
Unique (%): 89.3%

Sample

1st row: 일산화력발전소 견학
2nd row: 서울시 강동구청 녹색에너지과
3rd row: 전북대학교 공과대학
4th row: 대원연료전지 주주단, SK 건설
5th row: 강원도 고성군 간성읍 금수리 주민 외

Words

Value                            Count  Frequency (%)
당진시                           4      0.9%
에너지캠퍼스                     4      0.9%
동서발전                         4      0.9%
한국가스공사                     4      0.9%
서울도시과학기술고등학교(홍보관  3      0.7%
한전                             3      0.7%
주민                             3      0.7%
인도네시아                       3      0.7%
본사                             3      0.7%
당진발전본부                     3      0.7%
Other values (357)               408    92.3%

Most occurring characters

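Value               Count  Frequency (%)
(space)             193    8.5%
*                   53     2.3%
(Hangul character)  53     2.3%
)                   46     2.0%
(                   46     2.0%
(Hangul character)  42     1.9%
(Hangul character)  41     1.8%
(Hangul character)  41     1.8%
(Hangul character)  40     1.8%
(Hangul character)  39     1.7%
Other values (291)  1665   73.7%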

Most occurring categories

Value              Count  Frequency (%)
Other Letter       1860   82.3%
Space Separator    193    8.5%
Other Punctuation  58     2.6%
Close Punctuation  46     2.0%
Open Punctuation   46     2.0%
Decimal Number     23     1.0%
Lowercase Letter   15     0.7%
Uppercase Letter   13     0.6%
Dash Punctuation   4      0.2%
Math Symbol        1      < 0.1%

Most frequent character per category

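Other Letter
Value               Count  Frequency (%)
(Hangul character)  53     2.8%
(Hangul character)  42     2.3%
(Hangul character)  41     2.2%
(Hangul character)  41     2.2%
(Hangul character)  40     2.2%
(Hangul character)  39     2.1%
(Hangul character)  38     2.0%
(Hangul character)  36     1.9%
(Hangul character)  36     1.9%
(Hangul character)  35     1.9%
Other values (256)  1459   78.4%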

Lowercase Letter
Value  Count  Frequency (%)
m      3      20.0%
a      2      13.3%
i      2      13.3%
v      1      6.7%
k      1      6.7%
s      1      6.7%
t      1      6.7%
u      1      6.7%
p      1      6.7%
o      1      6.7%

Uppercase Letter
Value  Count  Frequency (%)
S      3      23.1%
P      2      15.4%
E      2      15.4%
W      1      7.7%
K      1      7.7%
A      1      7.7%
B      1      7.7%
C      1      7.7%
T      1      7.7%

Decimal Number
Value  Count  Frequency (%)
1      5      21.7%
7      5      21.7%
2      4      17.4%
3      2      8.7%
5      2      8.7%
6      2      8.7%
0      2      8.7%
4      1      4.3%

Other Punctuation
Value  Count  Frequency (%)
*      53     91.4%
,      5      8.6%

Space Separator
Value    Count  Frequency (%)
(space)  193    100.0%

Close Punctuation
Value  Count  Frequency (%)
)      46     100.0%

Open Punctuation
Value  Count  Frequency (%)
(      46     100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      4      100.0%

Math Symbol
Value  Count  Frequency (%)
~      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  1860   82.3%
Common  371    16.4%
Latin   28     1.2%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
(Hangul character)  53     2.8%
(Hangul character)  42     2.3%
(Hangul character)  41     2.2%
(Hangul character)  41     2.2%
(Hangul character)  40     2.2%
(Hangul character)  39     2.1%
(Hangul character)  38     2.0%
(Hangul character)  36     1.9%
(Hangul character)  36     1.9%
(Hangul character)  35     1.9%
Other values (256)  1459   78.4%

Latin
Value              Count  Frequency (%)
m                  3      10.7%
S                  3      10.7%
a                  2      7.1%
i                  2      7.1%
P                  2      7.1%
E                  2      7.1%
W                  1      3.6%
K                  1      3.6%
v                  1      3.6%
k                  1      3.6%
Other values (10)  10     35.7%

Common
Value             Count  Frequency (%)
(space)           193    52.0%
*                 53     14.3%
)                 46     12.4%
(                 46     12.4%
1                 5      1.3%
,                 5      1.3%
7                 5      1.3%
-                 4      1.1%
2                 4      1.1%
3                 2      0.5%
Other values (5)  8      2.2%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  1860   82.3%
ASCII   399    17.7%

Most frequent character per block

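ASCII
Value              Count  Frequency (%)
(space)            193    48.4%
*                  53     13.3%
)                  46     11.5%
(                  46     11.5%
1                  5      1.3%
,                  5      1.3%
7                  5      1.3%
-                  4      1.0%
2                  4      1.0%
m                  3      0.8%
Other values (25)  35     8.8%

Hangul
Value               Count  Frequency (%)
(Hangul character)  53     2.8%
(Hangul character)  42     2.3%
(Hangul character)  41     2.2%
(Hangul character)  41     2.2%
(Hangul character)  40     2.2%
(Hangul character)  39     2.1%
(Hangul character)  38     2.0%
(Hangul character)  36     1.9%
(Hangul character)  36     1.9%
(Hangul character)  35     1.9%
Other values (256)  1459   78.4%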

신청날짜
DateTime

Distinct: 166
Distinct (%): 65.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB
Minimum: 2020-10-08 00:00:00
Maximum: 2023-07-27 00:00:00
[Figure: histogram with fixed-size bins (bins=50)]

Interactions


Correlations

[Figure: correlation plot]

       구분   인원
구분  1.000  0.000
인원  0.000  1.000

[Figure: correlation plot]

       인원   구분
인원  1.000  0.000
구분  0.000  1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly pick out patterns in data completion.

Sample

First rows

     구분      인원  참여그룹                              신청날짜
0    일산발전  10    일산화력발전소 견학                  2020-10-08
1    울산발전  10    서울시 강동구청 녹색에너지과          2020-10-30
2    당진발전  15    전북대학교 공과대학                  2020-10-30
3    동해발전  9     대원연료전지 주주단, SK 건설          2020-12-08
4    동해발전  16    강원도 고성군 간성읍 금수리 주민 외  2020-12-14
5    당진발전  10    재화엔지니어링                      2021-01-13
6    호남발전  10    포스코에너지                        2021-03-15
7    동해발전  10    한일병원                            2021-06-28
8    동해발전  10    한일병원                            2021-06-29
9    울산발전  10    부산시설공단                        2021-07-06

Last rows

     구분      인원  참여그룹                            신청날짜
243  당진발전  38    한전인재개발원 송,변전부(홍보관)    2023-06-23
244  당진발전  14    행복플러스보호작업장(빌전소)        2023-06-23
245  당진발전  14    행복플러스보호작업장(홍보관)        2023-06-23
246  당진발전  9     오*미(홍보관)                      2023-06-24
247  당진발전  7     전*자(홍보관)                      2023-06-24
248  동해발전  10    가스공사 삼척본부                  2023-06-27
249  당진발전  7     청렴시민감사관(홍보관)              2023-06-27
250  당진발전  5     남양주 다산행복지원센터(홍보관)    2023-06-28
251  당진발전  4     남양주 다산행복지원센터(홍보관)    2023-06-29
252  동해발전  10    강원대 학생 견학                    2023-07-27