Overview

Dataset statistics

Number of variables: 5
Number of observations: 452
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 18.7 KiB
Average record size in memory: 42.3 B

Variable types

Numeric: 2
Categorical: 2
Text: 1

Dataset

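Description: Software acquisition list of Korea Western Power (한국서부발전). The data provides the columns No, 대분류 (major category), 중분류 (subcategory), 소프트웨어명 (software name), and 도입년도 (year of introduction). Example record: 1,기타,기타,(주)글로텍 / EBS-C,2017
URL: https://www.data.go.kr/data/15089539/fileData.do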

Alerts

순번 is highly overall correlated with 도입년도 (High correlation)
도입년도 is highly overall correlated with 순번 (High correlation)
대분류 is highly overall correlated with 중분류 (High correlation)
중분류 is highly overall correlated with 대분류 (High correlation)
순번 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 00:55:42.906681
Analysis finished: 2023-12-12 00:55:43.885595
Duration: 0.98 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
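
A report of this kind can be regenerated along the following lines. This is a minimal sketch, not the exact invocation used for this report: the file names, the report title, and the `encoding` default are assumptions, and the settings stored in config.json are not reproduced here.

```python
def build_report(csv_path, html_path, encoding="utf-8"):
    """Profile a CSV file with ydata-profiling and write an HTML report.

    csv_path, html_path and encoding are illustrative parameters; the
    encoding of the published file is an assumption and may need changing.
    """
    # Imported lazily so this module loads even where the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Software acquisition list")
    report.to_file(html_path)
    return report

# Hypothetical usage, assuming the CSV was downloaded from the URL above:
# build_report("software_list.csv", "report.html")
```

Wrapping the calls in a function keeps the sketch importable without the dependencies installed; the profiling only runs when `build_report` is called.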

Variables

순번 (sequence number)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 452
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 226.5
Minimum: 1
Maximum: 452
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 23.55
Q1: 113.75
Median: 226.5
Q3: 339.25
95-th percentile: 429.45
Maximum: 452
Range: 451
Interquartile range (IQR): 225.5

Descriptive statistics

Standard deviation: 130.62542
Coefficient of variation (CV): 0.57671267
Kurtosis: -1.2
Mean: 226.5
Median Absolute Deviation (MAD): 113
Skewness: 0
Sum: 102378
Variance: 17063
Monotonicity: Strictly increasing
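
Because 순번 is unique, strictly increasing, and runs from 1 to 452, its statistics can be checked by hand. The sketch below assumes the column is exactly the integers 1 through 452, that quantiles use linear interpolation, and that the variance is the sample variance (n - 1 denominator); under those assumptions it reproduces the figures above.

```python
import math
import statistics

def quantile(sorted_values, q):
    """Quantile by linear interpolation between neighbouring order statistics."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])

# Assumption: the column is exactly 1, 2, ..., 452.
values = [float(i) for i in range(1, 453)]

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)      # sample variance, n - 1 denominator
std = math.sqrt(variance)
cv = std / mean                             # coefficient of variation
median = statistics.median(values)
mad = statistics.median([abs(v - median) for v in values])
q05, q1, q3, q95 = (quantile(values, q) for q in (0.05, 0.25, 0.75, 0.95))
iqr = q3 - q1
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(mean, variance, round(std, 5), round(cv, 8), mad)
print(round(q05, 2), q1, median, q3, round(q95, 2), iqr, strictly_increasing)
```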
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1 | 1 | 0.2%
285 | 1 | 0.2%
311 | 1 | 0.2%
310 | 1 | 0.2%
309 | 1 | 0.2%
308 | 1 | 0.2%
307 | 1 | 0.2%
306 | 1 | 0.2%
305 | 1 | 0.2%
304 | 1 | 0.2%
Other values (442) | 442 | 97.8%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 0.2%
2 | 1 | 0.2%
3 | 1 | 0.2%
4 | 1 | 0.2%
5 | 1 | 0.2%
6 | 1 | 0.2%
7 | 1 | 0.2%
8 | 1 | 0.2%
9 | 1 | 0.2%
10 | 1 | 0.2%

Maximum 10 values
Value | Count | Frequency (%)
452 | 1 | 0.2%
451 | 1 | 0.2%
450 | 1 | 0.2%
449 | 1 | 0.2%
448 | 1 | 0.2%
447 | 1 | 0.2%
446 | 1 | 0.2%
445 | 1 | 0.2%
444 | 1 | 0.2%
443 | 1 | 0.2%

대분류 (major category)
Categorical

HIGH CORRELATION

Distinct: 8
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 KiB

Category | Count
개발용 | 122
멀티미디어 | 86
사무용 | 69
유틸리티 | 44
기타 | 41
Other values (3) | 90

Length

Max length: 5
Median length: 4
Mean length: 3.4137168
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 개발용
2nd row: 기타
3rd row: 통신
4th row: 시스템
5th row: 시스템

Common Values

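Value | Count | Frequency (%)
개발용 (development) | 122 | 27.0%
멀티미디어 (multimedia) | 86 | 19.0%
사무용 (office) | 69 | 15.3%
유틸리티 (utility) | 44 | 9.7%
기타 (other) | 41 | 9.1%
시스템 (system) | 34 | 7.5%
운영체제 (operating system) | 34 | 7.5%
통신 (communications) | 22 | 4.9%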
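
The frequency column and the reported mean length follow directly from the counts. The sketch below takes the eight counts from the table above as given and recomputes both; string length is taken as the number of Unicode code points, which is an assumption about how the profiler measures it.

```python
# Counts per 대분류 value, copied from the Common Values table.
counts = {
    "개발용": 122,
    "멀티미디어": 86,
    "사무용": 69,
    "유틸리티": 44,
    "기타": 41,
    "시스템": 34,
    "운영체제": 34,
    "통신": 22,
}

total = sum(counts.values())

# Share of each category in percent, rounded to one decimal place.
shares = {value: round(100 * n / total, 1) for value, n in counts.items()}

# Count-weighted mean of the category name lengths (code points).
mean_length = sum(len(value) * n for value, n in counts.items()) / total

for value, share in shares.items():
    print(value, counts[value], f"{share}%")
print("mean length:", round(mean_length, 7))
```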

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot]

Words
Value | Count | Frequency (%)
개발용 | 122 | 27.0%
멀티미디어 | 86 | 19.0%
사무용 | 69 | 15.3%
유틸리티 | 44 | 9.7%
기타 | 41 | 9.1%
시스템 | 34 | 7.5%
운영체제 | 34 | 7.5%
통신 | 22 | 4.9%

중분류 (subcategory)
Categorical

HIGH CORRELATION

Distinct: 30
Distinct (%): 6.6%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 KiB

Category | Count
기타 | 125
그래픽 | 65
DBMS | 48
파일편집 | 28
웹페이지 | 27
Other values (25) | 159

Length

Max length: 16
Median length: 13
Mean length: 4.079646
Min length: 2

Unique

Unique: 4
Unique (%): 0.9%

Sample

1st row: DBMS
2nd row: 기타
3rd row: Telnet
4th row: 보안
5th row: V3

Common Values

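Value | Count | Frequency (%)
기타 (other) | 125 | 27.7%
그래픽 (graphics) | 65 | 14.4%
DBMS | 48 | 10.6%
파일편집 (file editing) | 28 | 6.2%
웹페이지 (web page) | 27 | 6.0%
Windows 계열 (Windows family) | 26 | 5.8%
MS-Visual 시리즈 (MS-Visual series) | 14 | 3.1%
동영상 (video) | 12 | 2.7%
MS-Office | 11 | 2.4%
한글과 컴퓨터 (Hangul and Computer) | 10 | 2.2%
Other values (20) | 86 | 19.0%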

Length

Histogram of lengths of the category

Words
Value | Count | Frequency (%)
기타 | 125 | 24.5%
그래픽 | 65 | 12.7%
dbms | 48 | 9.4%
계열 | 32 | 6.3%
파일편집 | 28 | 5.5%
웹페이지 | 27 | 5.3%
windows | 26 | 5.1%
시리즈 | 17 | 3.3%
ms-visual | 15 | 2.9%
동영상 | 12 | 2.3%
Other values (22) | 116 | 22.7%

소프트웨어명 (software name)
Text

Distinct: 363
Distinct (%): 80.3%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 KiB

Length

Max length: 87
Median length: 44
Mean length: 24.09292
Min length: 3

Characters and Unicode

Total characters: 10890
Distinct characters: 223
Distinct categories: 13
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
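
As an illustration of what such character properties look like, the sketch below counts Unicode general categories for one of the sample values using Python's standard `unicodedata` module. It is not the profiler's own implementation; the `CATEGORY_NAMES` mapping is a hand-written subset covering only the categories that occur in this string.

```python
import unicodedata
from collections import Counter

# Long names for the two-letter general category codes used below
# (a hand-written subset, not a complete table).
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
}

def category_counts(text):
    """Count characters of `text` per Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

sample = "(주)글로텍 / EBS-C"   # second sample row of 소프트웨어명
counts = category_counts(sample)

for code, n in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), n)
```

Hangul syllables fall under "Other Letter" because the script has no case distinction, which is why that category and the Hangul script report the same total in the tables that follow.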

Unique

Unique: 300
Unique (%): 66.4%

Sample

1st row: QUEST / Toad7
2nd row: (주)글로텍 / EBS-C
3rd row: (주)넷사랑 / Xmanager 1
4th row: (주)안철수연구소 / Client Security Leader
5th row: (주)안철수연구소 / V3 Internet Security 2007 Platinum

Words
Value | Count | Frequency (%)
(value not preserved) | 390 | 19.0%
ms | 86 | 4.2%
adobe | 75 | 3.7%
pro | 56 | 2.7%
server | 38 | 1.9%
oracle | 29 | 1.4%
photoshop | 27 | 1.3%
for | 25 | 1.2%
windows | 23 | 1.1%
acrobat | 22 | 1.1%
Other values (513) | 1282 | 62.4%

Most occurring characters

Value | Count | Frequency (%)
(space) | 1601 | 14.7%
e | 608 | 5.6%
o | 560 | 5.1%
r | 510 | 4.7%
t | 412 | 3.8%
i | 402 | 3.7%
/ | 399 | 3.7%
a | 349 | 3.2%
S | 320 | 2.9%
n | 280 | 2.6%
Other values (213) | 5449 | 50.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 4880 | 44.8%
Uppercase Letter | 2148 | 19.7%
Space Separator | 1601 | 14.7%
Other Letter | 794 | 7.3%
Decimal Number | 787 | 7.2%
Other Punctuation | 497 | 4.6%
Close Punctuation | 79 | 0.7%
Open Punctuation | 79 | 0.7%
Math Symbol | 10 | 0.1%
Dash Punctuation | 7 | 0.1%
Other values (3) | 8 | 0.1%

Most frequent character per category

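Other Letter
Value | Count | Frequency (%)
(character not preserved) | 55 | 6.9%
(character not preserved) | 42 | 5.3%
(character not preserved) | 40 | 5.0%
(character not preserved) | 38 | 4.8%
(character not preserved) | 34 | 4.3%
(character not preserved) | 26 | 3.3%
(character not preserved) | 23 | 2.9%
(character not preserved) | 18 | 2.3%
(character not preserved) | 17 | 2.1%
(character not preserved) | 16 | 2.0%
Other values (138) | 485 | 61.1%

Uppercase Letter
Value | Count | Frequency (%)
S | 320 | 14.9%
A | 228 | 10.6%
P | 168 | 7.8%
C | 148 | 6.9%
M | 130 | 6.1%
E | 121 | 5.6%
D | 117 | 5.4%
T | 101 | 4.7%
O | 88 | 4.1%
L | 80 | 3.7%
Other values (16) | 647 | 30.1%

Lowercase Letter
Value | Count | Frequency (%)
e | 608 | 12.5%
o | 560 | 11.5%
r | 510 | 10.5%
t | 412 | 8.4%
i | 402 | 8.2%
a | 349 | 7.2%
n | 280 | 5.7%
d | 246 | 5.0%
s | 228 | 4.7%
l | 180 | 3.7%
Other values (15) | 1105 | 22.6%

Decimal Number
Value | Count | Frequency (%)
0 | 177 | 22.5%
2 | 175 | 22.2%
1 | 104 | 13.2%
6 | 68 | 8.6%
3 | 65 | 8.3%
4 | 61 | 7.8%
5 | 48 | 6.1%
9 | 31 | 3.9%
8 | 31 | 3.9%
7 | 27 | 3.4%

Other Punctuation
Value | Count | Frequency (%)
/ | 399 | 80.3%
. | 52 | 10.5%
, | 44 | 8.9%
# | 1 | 0.2%
@ | 1 | 0.2%

Math Symbol
Value | Count | Frequency (%)
+ | 7 | 70.0%
~ | 3 | 30.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1601 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 79 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 79 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 7 | 100.0%

Modifier Symbol
Value | Count | Frequency (%)
` | 4 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 3 | 100.0%

Other Symbol
Value | Count | Frequency (%)
(character not preserved) | 1 | 100.0%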

Most occurring scripts

Value | Count | Frequency (%)
Latin | 7028 | 64.5%
Common | 3068 | 28.2%
Hangul | 794 | 7.3%

Most frequent character per script

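Hangul
Value | Count | Frequency (%)
(character not preserved) | 55 | 6.9%
(character not preserved) | 42 | 5.3%
(character not preserved) | 40 | 5.0%
(character not preserved) | 38 | 4.8%
(character not preserved) | 34 | 4.3%
(character not preserved) | 26 | 3.3%
(character not preserved) | 23 | 2.9%
(character not preserved) | 18 | 2.3%
(character not preserved) | 17 | 2.1%
(character not preserved) | 16 | 2.0%
Other values (138) | 485 | 61.1%

Latin
Value | Count | Frequency (%)
e | 608 | 8.7%
o | 560 | 8.0%
r | 510 | 7.3%
t | 412 | 5.9%
i | 402 | 5.7%
a | 349 | 5.0%
S | 320 | 4.6%
n | 280 | 4.0%
d | 246 | 3.5%
A | 228 | 3.2%
Other values (41) | 3113 | 44.3%

Common
Value | Count | Frequency (%)
(space) | 1601 | 52.2%
/ | 399 | 13.0%
0 | 177 | 5.8%
2 | 175 | 5.7%
1 | 104 | 3.4%
) | 79 | 2.6%
( | 79 | 2.6%
6 | 68 | 2.2%
3 | 65 | 2.1%
4 | 61 | 2.0%
Other values (14) | 260 | 8.5%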

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 10095 | 92.7%
Hangul | 794 | 7.3%
Letterlike Symbols | 1 | < 0.1%

Most frequent character per block

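ASCII
Value | Count | Frequency (%)
(space) | 1601 | 15.9%
e | 608 | 6.0%
o | 560 | 5.5%
r | 510 | 5.1%
t | 412 | 4.1%
i | 402 | 4.0%
/ | 399 | 4.0%
a | 349 | 3.5%
S | 320 | 3.2%
n | 280 | 2.8%
Other values (64) | 4654 | 46.1%

Hangul
Value | Count | Frequency (%)
(character not preserved) | 55 | 6.9%
(character not preserved) | 42 | 5.3%
(character not preserved) | 40 | 5.0%
(character not preserved) | 38 | 4.8%
(character not preserved) | 34 | 4.3%
(character not preserved) | 26 | 3.3%
(character not preserved) | 23 | 2.9%
(character not preserved) | 18 | 2.3%
(character not preserved) | 17 | 2.1%
(character not preserved) | 16 | 2.0%
Other values (138) | 485 | 61.1%

Letterlike Symbols
Value | Count | Frequency (%)
(character not preserved) | 1 | 100.0%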

도입년도 (year of introduction)
Real number (ℝ)

HIGH CORRELATION

Distinct: 24
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2010.9469
Minimum: 2000
Maximum: 2023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 2000
5-th percentile: 2001
Q1: 2005
Median: 2010
Q3: 2017
95-th percentile: 2021
Maximum: 2023
Range: 23
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 7.0224104
Coefficient of variation (CV): 0.0034920914
Kurtosis: -1.3554158
Mean: 2010.9469
Median Absolute Deviation (MAD): 7
Skewness: -0.036909591
Sum: 908948
Variance: 49.314248
Monotonicity: Increasing
Histogram with fixed size bins (bins=24)

Common values
Value | Count | Frequency (%)
2001 | 77 | 17.0%
2016 | 43 | 9.5%
2017 | 39 | 8.6%
2008 | 37 | 8.2%
2019 | 24 | 5.3%
2009 | 22 | 4.9%
2021 | 21 | 4.6%
2005 | 20 | 4.4%
2010 | 20 | 4.4%
2018 | 17 | 3.8%
Other values (14) | 132 | 29.2%

Minimum 10 values
Value | Count | Frequency (%)
2000 | 1 | 0.2%
2001 | 77 | 17.0%
2002 | 12 | 2.7%
2003 | 8 | 1.8%
2004 | 7 | 1.5%
2005 | 20 | 4.4%
2006 | 16 | 3.5%
2007 | 12 | 2.7%
2008 | 37 | 8.2%
2009 | 22 | 4.9%

Maximum 10 values
Value | Count | Frequency (%)
2023 | 7 | 1.5%
2022 | 12 | 2.7%
2021 | 21 | 4.6%
2020 | 15 | 3.3%
2019 | 24 | 5.3%
2018 | 17 | 3.8%
2017 | 39 | 8.6%
2016 | 43 | 9.5%
2015 | 4 | 0.9%
2014 | 11 | 2.4%

Interactions

[Figure: four interaction scatter plots; images not preserved]

Correlations

Variable | 순번 | 대분류 | 중분류 | 도입년도
순번 | 1.000 | 0.293 | 0.532 | 0.985
대분류 | 0.293 | 1.000 | 0.971 | 0.227
중분류 | 0.532 | 0.971 | 1.000 | 0.426
도입년도 | 0.985 | 0.227 | 0.426 | 1.000

Variable | 대분류 | 중분류
대분류 | 1.000 | 0.825
중분류 | 0.825 | 1.000

Variable | 순번 | 도입년도 | 대분류 | 중분류
순번 | 1.000 | 0.996 | 0.143 | 0.192
도입년도 | 0.996 | 1.000 | 0.111 | 0.144
대분류 | 0.143 | 0.111 | 1.000 | 0.825
중분류 | 0.192 | 0.144 | 0.825 | 1.000
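
The matrices above do not name the association measure behind each table. For two categorical variables such as 대분류 and 중분류, one common choice is Cramér's V, which rescales the chi-squared statistic of the contingency table to the range 0 to 1. The sketch below is the plain, uncorrected form, shown only to illustrate the idea; whether the report uses this measure or applies a bias correction is an assumption, and the toy inputs are invented for the example.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))     # observed cell counts
    rows = Counter(xs)               # row marginals
    cols = Counter(ys)               # column marginals

    chi2 = 0.0
    for r, r_count in rows.items():
        for c, c_count in cols.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(rows), len(cols)) - 1
    if k == 0:
        return 0.0                   # a constant variable carries no association
    return math.sqrt(chi2 / (n * k))

# Toy data: y is fully determined by x in the first case, unrelated in the second.
perfect = cramers_v(["a", "a", "b", "b"], ["p", "p", "q", "q"])
independent = cramers_v(["a", "a", "b", "b"], ["p", "q", "p", "q"])
print(perfect, independent)
```

A value near 1, as reported for 대분류 and 중분류, is what one would expect when each subcategory belongs almost exclusively to one major category.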

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows
Index | 순번 | 대분류 | 중분류 | 소프트웨어명 | 도입년도
0 | 1 | 개발용 | DBMS | QUEST / Toad7 | 2000
1 | 2 | 기타 | 기타 | (주)글로텍 / EBS-C | 2001
2 | 3 | 통신 | Telnet | (주)넷사랑 / Xmanager 1 | 2001
3 | 4 | 시스템 | 보안 | (주)안철수연구소 / Client Security Leader | 2001
4 | 5 | 시스템 | V3 | (주)안철수연구소 / V3 Internet Security 2007 Platinum | 2001
5 | 6 | 개발용 | 기타 | (주)포시에스 / OZ 2 | 2001
6 | 7 | 유틸리티 | 파일편집 | Adobe / Acrobat 5 | 2001
7 | 8 | 멀티미디어 | 동영상 | Adobe / Flash 5 | 2001
8 | 9 | 멀티미디어 | 그래픽 | Adobe / Photoshop 5 | 2001
9 | 10 | 멀티미디어 | 그래픽 | Adobe / Photoshop 6 | 2001

Last rows
Index | 순번 | 대분류 | 중분류 | 소프트웨어명 | 도입년도
442 | 443 | 멀티미디어 | 그래픽 | Adobe / Photoshop CC (64Bit) | 2022
443 | 444 | 유틸리티 | 파일편집 | Adobe / Acrobat 2020 Pro | 2022
444 | 445 | 사무용 | 기타 | ABBYY / FineReader PDF 15 Corporate | 2022
445 | 446 | 유틸리티 | 파일편집 | Adobe / Acrobat Pro DC(~`24.4.1까지) | 2023
446 | 447 | 사무용 | MS-Office | MS / Office 2021 Pro | 2023
447 | 448 | 멀티미디어 | 그래픽 | Trimble / SketchUp Pro 2021 | 2023
448 | 449 | 멀티미디어 | 그래픽 | Adobe / Photoshop CC (64Bit)(~`24.4.1까지) | 2023
449 | 450 | 사무용 | 기타 | MS / Visio 2021 pro | 2023
450 | 451 | 멀티미디어 | 그래픽 | Wondershare/EdrawMax | 2023
451 | 452 | 시스템 | 기타 | 닥터소프트 / 넷클라이언트(NetClinet6) | 2023