Overview

Dataset statistics

Number of variables: 4
Number of observations: 1000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 33.3 KiB
Average record size in memory: 34.1 B
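
The average record size is the total size in memory divided by the number of observations. A minimal check of that arithmetic, assuming 1 KiB = 1024 bytes (the variable names are illustrative, not taken from the profiler):

```python
# Figures copied from the dataset statistics.
total_size_kib = 33.3
n_observations = 1000

# Assumption: the report uses binary units, 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")
```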

Variable types

Numeric: 2
Text: 1
Categorical: 1

Dataset

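Description: Statistics on the world's top 50 companies by R&D investment, giving ranking data for the companies that invest the most in R&D. Each record lists the company name and country for one of the top 50 ranks. Column names: 구분 (category), 순위 (rank), 회사명 (company name), 국가 (country).
URL: https://www.data.go.kr/data/15070040/fileData.do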

Reproduction

Analysis started: 2023-12-12 22:09:54.109572
Analysis finished: 2023-12-12 22:09:55.285379
Duration: 1.18 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
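
The reported duration follows from the two timestamps. A small sketch using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-12 22:09:54.109572")
finished = datetime.fromisoformat("2023-12-12 22:09:55.285379")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```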

Variables

구분 (category)
Real number (ℝ)

Distinct: 20
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2011.5
Minimum: 2002
Maximum: 2021
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 2002
5-th percentile: 2002.95
Q1: 2006.75
Median: 2011.5
Q3: 2016.25
95-th percentile: 2020.05
Maximum: 2021
Range: 19
Interquartile range (IQR): 9.5
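
The fractional percentiles (2002.95, 2006.75, and so on) are what linear interpolation between neighbouring sorted values produces. The sketch below rebuilds the column from its value counts (20 values from 2002 to 2021, 50 rows each) and assumes linear interpolation; the `percentile` helper is written for this illustration and is not the profiler's own code.

```python
# Rebuild the sorted column from the value counts: 2002..2021, 50 rows each.
years_sorted = sorted(year for year in range(2002, 2022) for _ in range(50))

def percentile(sorted_values, q):
    """Return the q-quantile (0 <= q <= 1) using linear interpolation."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

quantiles = {q: percentile(years_sorted, q) for q in (0.05, 0.25, 0.5, 0.75, 0.95)}
iqr = quantiles[0.75] - quantiles[0.25]
value_range = years_sorted[-1] - years_sorted[0]

for q, value in quantiles.items():
    print(f"{q:.2f}: {value:.2f}")
print("IQR:", iqr, "Range:", value_range)
```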

Descriptive statistics

Standard deviation: 5.7691666
Coefficient of variation (CV): 0.0028680918
Kurtosis: -1.2060428
Mean: 2011.5
Median Absolute Deviation (MAD): 5
Skewness: 0
Sum: 2011500
Variance: 33.283283
Monotonicity: Decreasing
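
Several of these figures can be checked with the standard library. The sketch below rebuilds the column from its value counts and assumes the sample (n - 1) variance, which matches the reported numbers; kurtosis and skewness are left out because their exact estimator is not stated in the report.

```python
import statistics

# Rebuild the column from the value counts: 2021 down to 2002, 50 rows each.
years = [year for year in range(2021, 2001, -1) for _ in range(50)]

mean = statistics.fmean(years)
variance = statistics.variance(years)   # sample variance, divides by n - 1
std = statistics.stdev(years)           # square root of the sample variance
cv = std / mean                         # coefficient of variation
median = statistics.median(years)
mad = statistics.median(abs(y - median) for y in years)  # median absolute deviation
total = sum(years)

print(mean, variance, std, cv, mad, total)
```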
Histogram with fixed size bins (bins=20)

Common values
Value              Count  Frequency (%)
2021               50     5.0%
2010               50     5.0%
2002               50     5.0%
2003               50     5.0%
2004               50     5.0%
2005               50     5.0%
2006               50     5.0%
2007               50     5.0%
2008               50     5.0%
2009               50     5.0%
Other values (10)  500    50.0%

Minimum 10 values
Value  Count  Frequency (%)
2002   50     5.0%
2003   50     5.0%
2004   50     5.0%
2005   50     5.0%
2006   50     5.0%
2007   50     5.0%
2008   50     5.0%
2009   50     5.0%
2010   50     5.0%
2011   50     5.0%

Maximum 10 values
Value  Count  Frequency (%)
2021   50     5.0%
2020   50     5.0%
2019   50     5.0%
2018   50     5.0%
2017   50     5.0%
2016   50     5.0%
2015   50     5.0%
2014   50     5.0%
2013   50     5.0%
2012   50     5.0%

순위 (rank)
Real number (ℝ)

Distinct: 50
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.5
Minimum: 1
Maximum: 50
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 13
Median: 25.5
Q3: 38
95-th percentile: 48
Maximum: 50
Range: 49
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 14.438091
Coefficient of variation (CV): 0.56619963
Kurtosis: -1.2009628
Mean: 25.5
Median Absolute Deviation (MAD): 12.5
Skewness: 0
Sum: 25500
Variance: 208.45846
Monotonicity: Not monotonic
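
The monotonicity labels depend on row order. The sketch below assumes the rows are ordered as in the Sample section (years from 2021 down to 2002, ranks 1 to 50 within each year); the `monotonicity` helper is a simplified illustration and does not distinguish strict from non-strict order.

```python
# Assumed row order, based on the first and last sample rows of the report.
years = [year for year in range(2021, 2001, -1) for _ in range(50)]
ranks = [rank for _ in range(20) for rank in range(1, 51)]

def monotonicity(values):
    """Classify a sequence by comparing each value with its successor."""
    pairs = list(zip(values, values[1:]))
    if all(a <= b for a, b in pairs):
        return "Increasing"
    if all(a >= b for a, b in pairs):
        return "Decreasing"
    return "Not monotonic"

print(monotonicity(years))  # the year column never goes up
print(monotonicity(ranks))  # the rank restarts at 1 for every year
```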
Histogram with fixed size bins (bins=50)

Common values
Value              Count  Frequency (%)
1                  20     2.0%
39                 20     2.0%
29                 20     2.0%
30                 20     2.0%
31                 20     2.0%
32                 20     2.0%
33                 20     2.0%
34                 20     2.0%
35                 20     2.0%
36                 20     2.0%
Other values (40)  800    80.0%

Minimum 10 values
Value  Count  Frequency (%)
1      20     2.0%
2      20     2.0%
3      20     2.0%
4      20     2.0%
5      20     2.0%
6      20     2.0%
7      20     2.0%
8      20     2.0%
9      20     2.0%
10     20     2.0%

Maximum 10 values
Value  Count  Frequency (%)
50     20     2.0%
49     20     2.0%
48     20     2.0%
47     20     2.0%
46     20     2.0%
45     20     2.0%
44     20     2.0%
43     20     2.0%
42     20     2.0%
41     20     2.0%

회사 (company)
Text

Distinct: 152
Distinct (%): 15.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Length

Max length: 60
Median length: 30
Mean length: 9.926
Min length: 2

Characters and Unicode

Total characters: 9926
Distinct characters: 48
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
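
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories over the five sample company names from this report; it is not the profiler's implementation, and it covers categories only (the standard library has no script or block lookup).

```python
import unicodedata
from collections import Counter

# The five sample values listed for this variable.
names = [
    "ALPHABET",
    "META",
    "MICROSOFT",
    "HUAWEI INVESTMENT & HOLDING",
    "APPLE",
]

# Two-letter general categories: Lu = Uppercase Letter,
# Zs = Space Separator, Po = Other Punctuation.
category_counts = Counter(
    unicodedata.category(ch) for name in names for ch in name
)

for category, count in category_counts.most_common():
    print(category, count)
```

The same call maps the other characters in this column to the categories the report lists: `-` is Dash Punctuation (`Pd`), `(` and `)` are Open and Close Punctuation (`Ps`, `Pe`), and `_` is Connector Punctuation (`Pc`).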

Unique

Unique: 59
Unique (%): 5.9%
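
"Distinct" and "Unique" measure different things: distinct counts how many different values occur, while unique counts the values that occur exactly once. A minimal sketch on a small hypothetical list (not the real column):

```python
from collections import Counter

# Hypothetical values, chosen only to illustrate the two counts.
values = ["ALPHABET", "META", "APPLE", "ALPHABET", "APPLE", "ROCHE"]

counts = Counter(values)
n_distinct = len(counts)                                  # different values
n_unique = sum(1 for c in counts.values() if c == 1)      # values seen once

print(n_distinct, n_unique)
```

Here four different names occur, but only META and ROCHE appear once, so the unique count is lower than the distinct count, just as 59 is lower than 152 in the report.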

Sample

1st row: ALPHABET
2nd row: META
3rd row: MICROSOFT
4th row: HUAWEI INVESTMENT & HOLDING
5th row: APPLE
Value                 Count  Frequency (%)
motor                 80     5.7%
johnson               38     2.7%
general               38     2.7%
electronics           26     1.9%
systems               24     1.7%
electric              24     1.7%
(value not captured)  24     1.7%
motors                22     1.6%
pfizer                21     1.5%
ford                  20     1.4%
Other values (137)    1088   77.4%
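
The table above counts words rather than whole company names. A rough sketch of such a count, assuming lowercasing and whitespace splitting (the report shows the words in lowercase, but the exact tokenisation rule is an assumption), applied to four names from the Sample section:

```python
from collections import Counter

# Company names taken from the Sample section of this report.
names = [
    "JOHNSON & JOHNSON",
    "NISSAN MOTOR",
    "GENERAL ELECTRIC",
    "SAMSUNG ELECTRONICS",
]

# Assumption: words are lowercased and split on whitespace.
word_counts = Counter(word for name in names for word in name.lower().split())
total_words = sum(word_counts.values())

for word, count in word_counts.most_common(3):
    print(f"{word}: {count} ({100 * count / total_words:.1f}%)")
```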

Most occurring characters

Value              Count  Frequency (%)
E                  953    9.6%
O                  921    9.3%
S                  771    7.8%
A                  755    7.6%
N                  682    6.9%
I                  677    6.8%
T                  660    6.6%
R                  630    6.3%
L                  501    5.0%
C                  479    4.8%
Other values (38)  2897   29.2%

Most occurring categories

Value                  Count  Frequency (%)
Uppercase Letter       9307   93.8%
Space Separator        467    4.7%
Dash Punctuation       50     0.5%
Lowercase Letter       37     0.4%
Other Punctuation      25     0.3%
Open Punctuation       17     0.2%
Close Punctuation      17     0.2%
Other Letter           4      < 0.1%
Connector Punctuation  2      < 0.1%

Most frequent character per category

Uppercase Letter
Value              Count  Frequency (%)
E                  953    10.2%
O                  921    9.9%
S                  771    8.3%
A                  755    8.1%
N                  682    7.3%
I                  677    7.3%
T                  660    7.1%
R                  630    6.8%
L                  501    5.4%
C                  479    5.1%
Other values (16)  2278   24.5%

Lowercase Letter
Value             Count  Frequency (%)
o                 9      24.3%
n                 6      16.2%
w                 5      13.5%
t                 3      8.1%
i                 3      8.1%
r                 2      5.4%
a                 2      5.4%
p                 1      2.7%
s                 1      2.7%
d                 1      2.7%
Other values (4)  4      10.8%

Other Letter
Value                     Count  Frequency (%)
(character not captured)  2      50.0%
(character not captured)  2      50.0%

Space Separator
Value    Count  Frequency (%)
(space)  467    100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      50     100.0%

Other Punctuation
Value  Count  Frequency (%)
&      25     100.0%

Open Punctuation
Value  Count  Frequency (%)
(      17     100.0%

Close Punctuation
Value  Count  Frequency (%)
)      17     100.0%

Connector Punctuation
Value  Count  Frequency (%)
_      2      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   9344   94.1%
Common  578    5.8%
Hangul  4      < 0.1%

Most frequent character per script

Latin
Value              Count  Frequency (%)
E                  953    10.2%
O                  921    9.9%
S                  771    8.3%
A                  755    8.1%
N                  682    7.3%
I                  677    7.2%
T                  660    7.1%
R                  630    6.7%
L                  501    5.4%
C                  479    5.1%
Other values (30)  2315   24.8%

Common
Value    Count  Frequency (%)
(space)  467    80.8%
-        50     8.7%
&        25     4.3%
(        17     2.9%
)        17     2.9%
_        2      0.3%

Hangul
Value                     Count  Frequency (%)
(character not captured)  2      50.0%
(character not captured)  2      50.0%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   9922   > 99.9%
Hangul  4      < 0.1%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
E                  953    9.6%
O                  921    9.3%
S                  771    7.8%
A                  755    7.6%
N                  682    6.9%
I                  677    6.8%
T                  660    6.7%
R                  630    6.3%
L                  501    5.0%
C                  479    4.8%
Other values (36)  2893   29.2%

Hangul
Value                     Count  Frequency (%)
(character not captured)  2      50.0%
(character not captured)  2      50.0%

국가 (country)
Categorical

Distinct: 16
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Value                 Count
미국 (United States)    394
일본 (Japan)            189
독일 (Germany)          146
프랑스 (France)          56
영국 (United Kingdom)   44
Other values (11)     171

Length

Max length: 4
Median length: 2
Mean length: 2.217
Min length: 2

Unique

Unique: 3
Unique (%): 0.3%

Sample

1st row: 미국 (United States)
2nd row: 미국 (United States)
3rd row: 미국 (United States)
4th row: 중국 (China)
5th row: 미국 (United States)

Common Values

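Value                 Count  Frequency (%)
미국 (United States)    394    39.4%
일본 (Japan)            189    18.9%
독일 (Germany)          146    14.6%
프랑스 (France)          56     5.6%
영국 (United Kingdom)   44     4.4%
스위스 (Switzerland)     40     4.0%
네덜란드 (Netherlands)    30     3.0%
한국 (South Korea)      26     2.6%
스웨덴 (Sweden)          21     2.1%
중국 (China)            20     2.0%
Other values (6)      34     3.4%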
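
The "Other values" row is the remainder after the ten most frequent categories. A short sketch of that bookkeeping, using the counts listed above together with the totals from the variable summary (1000 rows, 16 distinct values); the variable names are illustrative:

```python
# Counts of the ten most frequent categories, copied from the table.
top_counts = {
    "미국": 394, "일본": 189, "독일": 146, "프랑스": 56, "영국": 44,
    "스위스": 40, "네덜란드": 30, "한국": 26, "스웨덴": 21, "중국": 20,
}
n_rows = 1000      # number of observations
n_distinct = 16    # distinct categories in this column

# Whatever is not covered by the top ten ends up in the "Other values" row.
other_count = n_rows - sum(top_counts.values())
other_categories = n_distinct - len(top_counts)

frequencies = {value: f"{100 * count / n_rows:.1f}%" for value, count in top_counts.items()}
other_frequency = f"{100 * other_count / n_rows:.1f}%"

print(f"Other values ({other_categories}): {other_count} ({other_frequency})")
```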

Length

Histogram of lengths of the category

Interactions


Correlations

       구분      순위      국가
구분     1.000   0.000   0.000
순위     0.000   1.000   0.375
국가     0.000   0.375   1.000

       구분      순위      국가
구분     1.000   0.000   0.000
순위     0.000   1.000   0.156
국가     0.000   0.156   1.000
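
The zero between 구분 and 순위 is what a balanced layout produces. The sketch below assumes every one of the 20 year values carries each rank from 1 to 50 exactly once, which is consistent with the value counts but not confirmed row by row, and computes a plain Pearson coefficient. It does not reproduce the 0.375 and 0.156 entries: those involve the categorical 국가 column and depend on the association measures the profiler selected.

```python
from math import sqrt

# Assumed balanced grid: each year 2002..2021 paired with each rank 1..50.
years = [year for year in range(2021, 2001, -1) for _ in range(50)]
ranks = [rank for _ in range(20) for rank in range(1, 51)]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

r = pearson(years, ranks)
print(f"{r:.3f}")
```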

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

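First rows
Index  구분    순위  회사                                  국가
0      2021  1   ALPHABET                            미국 (United States)
1      2021  2   META                                미국 (United States)
2      2021  3   MICROSOFT                           미국 (United States)
3      2021  4   HUAWEI INVESTMENT & HOLDING         중국 (China)
4      2021  5   APPLE                               미국 (United States)
5      2021  6   SAMSUNG ELECTRONICS                 한국 (South Korea)
6      2021  7   VOLKSWAGEN                          독일 (Germany)
7      2021  8   INTEL                               미국 (United States)
8      2021  9   ROCHE                               스위스 (Switzerland)
9      2021  10  JOHNSON & JOHNSON                   미국 (United States)

Last rows
Index  구분    순위  회사                                  국가
990    2002  41  GENERAL ELECTRIC                    미국 (United States)
991    2002  42  BOEING                              미국 (United States)
992    2002  43  AMERICAN HOME PRODUCTS (now WYETH)  미국 (United States)
993    2002  44  EADS                                네덜란드 (Netherlands)
994    2002  45  PROCTER & GAMBLE                    미국 (United States)
995    2002  46  NISSAN MOTOR                        일본 (Japan)
996    2002  47  RENAULT                             프랑스 (France)
997    2002  48  DELPHI AUTOMOTIVE (now DELPHI)      미국 (United States)
998    2002  49  CANON                               일본 (Japan)
999    2002  50  BMW                                 독일 (Germany)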