Overview

Dataset statistics

Number of variables: 5
Number of observations: 3259
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 140.2 KiB
Average record size in memory: 44.0 B

Variable types

Numeric: 4
Text: 1

Dataset

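Description: Year-end overseas equity holdings of the National Pension Service (국민연금), giving each stock's valuation, weight within the asset class, and ownership stake. Holdings valued below KRW 1 billion are excluded. (Units: KRW 100 million (억 원), %)
Author: National Pension Service (국민연금공단)
URL: https://www.data.go.kr/data/3070517/fileData.do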

Alerts

번호 is highly overall correlated with 평가액(억 원) and 2 other fields [High correlation]
평가액(억 원) is highly overall correlated with 번호 and 2 other fields [High correlation]
자산군 내 비중(퍼센트) is highly overall correlated with 번호 and 2 other fields [High correlation]
지분율(퍼센트) is highly overall correlated with 번호 and 2 other fields [High correlation]
지분율(퍼센트) is highly skewed (γ1 = 48.53613154) [Skewed]
번호 has unique values [Unique]
자산군 내 비중(퍼센트) has 1896 (58.2%) zeros [Zeros]
지분율(퍼센트) has 35 (1.1%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-12 10:46:17.713072
Analysis finished: 2023-12-12 10:46:21.383363
Duration: 3.67 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

번호 (row number)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 3259
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1630
Minimum: 1
Maximum: 3259
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 28.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 163.9
Q1: 815.5
Median: 1630
Q3: 2444.5
95-th percentile: 3096.1
Maximum: 3259
Range: 3258
Interquartile range (IQR): 1629

Descriptive statistics

Standard deviation: 940.93659
Coefficient of variation (CV): 0.57726171
Kurtosis: -1.2
Mean: 1630
Median Absolute Deviation (MAD): 815
Skewness: 0
Sum: 5312170
Variance: 885361.67
Monotonicity: Strictly increasing
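
The figures above are what one would expect if 번호 is simply the integer sequence 1 to 3259. The following sketch recomputes them under that assumption (the column contents themselves are not shown in the report), using the sample variance and a linear-interpolation percentile. The `percentile` helper is illustrative and not part of any library.

```python
import math
import statistics

# Assumption: the column is the integer sequence 1..3259, as suggested by
# the "Unique" and "Strictly increasing" flags and the Minimum/Maximum.
values = list(range(1, 3260))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (divides by n - 1)
std_dev = math.sqrt(variance)
cv = std_dev / mean                     # coefficient of variation


def percentile(sorted_values, q):
    """Percentile by linear interpolation between neighbours, q in [0, 1]."""
    position = q * (len(sorted_values) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


p5 = percentile(values, 0.05)
q1 = percentile(values, 0.25)
q3 = percentile(values, 0.75)
iqr = q3 - q1

print(total, mean, round(variance, 2), round(std_dev, 5), round(cv, 8))
print(round(p5, 1), q1, q3, iqr)
```

Because the column is an evenly spaced sequence, its skewness is 0 and it carries no information beyond row order.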
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
1 | 1 | < 0.1%
2178 | 1 | < 0.1%
2168 | 1 | < 0.1%
2169 | 1 | < 0.1%
2170 | 1 | < 0.1%
2171 | 1 | < 0.1%
2172 | 1 | < 0.1%
2173 | 1 | < 0.1%
2174 | 1 | < 0.1%
2175 | 1 | < 0.1%
Other values (3249) | 3249 | 99.7%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
3259 | 1 | < 0.1%
3258 | 1 | < 0.1%
3257 | 1 | < 0.1%
3256 | 1 | < 0.1%
3255 | 1 | < 0.1%
3254 | 1 | < 0.1%
3253 | 1 | < 0.1%
3252 | 1 | < 0.1%
3251 | 1 | < 0.1%
3250 | 1 | < 0.1%

종목명 (stock name)
Text

Distinct: 3166
Distinct (%): 97.1%
Missing: 0
Missing (%): 0.0%
Memory size: 25.6 KiB

Length

Max length: 30
Median length: 23
Mean length: 19.629027
Min length: 5

Characters and Unicode

Total characters: 63971
Distinct characters: 41
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
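
As a small illustration of these character properties, the sketch below uses Python's standard `unicodedata` module to count the general categories in one name taken from the sample rows of this report. The `category_counts` helper is a hypothetical name, and this is not necessarily how the profiling tool computes its own tables.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters by Unicode general category code (e.g. 'Lu', 'Zs')."""
    return Counter(unicodedata.category(ch) for ch in text)


# "JOHNSON + JOHNSON" appears in the sample rows of this report.
counts = category_counts("JOHNSON + JOHNSON")
for code, n in sorted(counts.items()):
    print(code, n)
```

Here `Lu` is Uppercase Letter, `Zs` is Space Separator, and `Sm` is Math Symbol, which is why `+` is reported under Math Symbol rather than punctuation.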

Unique

Unique: 3074
Unique (%): 94.3%

Sample

1st row: APPLE INC
2nd row: MICROSOFT CORP
3rd row: ALPHABET INC CL A
4th row: AMAZON.COM INC
5th row: UNITEDHEALTH GROUP INC

Words
Value | Count | Frequency (%)
inc | 646 | 5.8%
ltd | 609 | 5.5%
a | 451 | 4.0%
co | 449 | 4.0%
corp | 385 | 3.5%
group | 276 | 2.5%
holdings | 177 | 1.6%
plc | 165 | 1.5%
sa | 132 | 1.2%
bank | 117 | 1.0%
Other values (3826) | 7745 | 69.4%
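
A word-frequency table of this kind can be approximated by lowercasing each name and splitting it on whitespace. The sketch below applies that idea to the five sample rows listed above. It is an assumption about the tokenisation, not the report's exact tokenizer, and `word_counts` is a hypothetical helper.

```python
from collections import Counter

# The five sample rows shown in this report.
names = [
    "APPLE INC",
    "MICROSOFT CORP",
    "ALPHABET INC CL A",
    "AMAZON.COM INC",
    "UNITEDHEALTH GROUP INC",
]


def word_counts(values):
    """Count lowercase, whitespace-separated tokens across all values."""
    counter = Counter()
    for value in values:
        counter.update(value.lower().split())
    return counter


counts = word_counts(names)
for word, n in counts.most_common(3):
    print(word, n)
```

Legal-form suffixes such as `inc`, `ltd`, and `corp` dominate because nearly every name carries one, so they say little about the holdings themselves.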

Most occurring characters

Value | Count | Frequency (%)
(space) | 8062 | 12.6%
A | 5100 | 8.0%
N | 4762 | 7.4%
I | 4654 | 7.3%
O | 4398 | 6.9%
E | 4355 | 6.8%
C | 4004 | 6.3%
R | 3726 | 5.8%
T | 3339 | 5.2%
S | 3284 | 5.1%
Other values (31) | 18287 | 28.6%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 55616 | 86.9%
Space Separator | 8062 | 12.6%
Other Punctuation | 121 | 0.2%
Math Symbol | 109 | 0.2%
Decimal Number | 41 | 0.1%
Close Punctuation | 11 | < 0.1%
Open Punctuation | 11 | < 0.1%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
A | 5100 | 9.2%
N | 4762 | 8.6%
I | 4654 | 8.4%
O | 4398 | 7.9%
E | 4355 | 7.8%
C | 4004 | 7.2%
R | 3726 | 6.7%
T | 3339 | 6.0%
S | 3284 | 5.9%
L | 3157 | 5.7%
Other values (16) | 14837 | 26.7%

Decimal Number
Value | Count | Frequency (%)
3 | 9 | 22.0%
0 | 7 | 17.1%
2 | 7 | 17.1%
5 | 5 | 12.2%
6 | 4 | 9.8%
4 | 3 | 7.3%
7 | 3 | 7.3%
1 | 2 | 4.9%
8 | 1 | 2.4%

Other Punctuation
Value | Count | Frequency (%)
. | 62 | 51.2%
/ | 59 | 48.8%

Space Separator
Value | Count | Frequency (%)
(space) | 8062 | 100.0%

Math Symbol
Value | Count | Frequency (%)
+ | 109 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 11 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 11 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 55616 | 86.9%
Common | 8355 | 13.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
A | 5100 | 9.2%
N | 4762 | 8.6%
I | 4654 | 8.4%
O | 4398 | 7.9%
E | 4355 | 7.8%
C | 4004 | 7.2%
R | 3726 | 6.7%
T | 3339 | 6.0%
S | 3284 | 5.9%
L | 3157 | 5.7%
Other values (16) | 14837 | 26.7%

Common
Value | Count | Frequency (%)
(space) | 8062 | 96.5%
+ | 109 | 1.3%
. | 62 | 0.7%
/ | 59 | 0.7%
) | 11 | 0.1%
( | 11 | 0.1%
3 | 9 | 0.1%
0 | 7 | 0.1%
2 | 7 | 0.1%
5 | 5 | 0.1%
Other values (5) | 13 | 0.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 63971 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 8062 | 12.6%
A | 5100 | 8.0%
N | 4762 | 7.4%
I | 4654 | 7.3%
O | 4398 | 6.9%
E | 4355 | 6.8%
C | 4004 | 6.3%
R | 3726 | 5.8%
T | 3339 | 5.2%
S | 3284 | 5.1%
Other values (31) | 18287 | 28.6%

평가액(억 원) (valuation, KRW 100 million)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1078
Distinct (%): 33.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 727.73704
Minimum: 10
Maximum: 74030
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 28.8 KiB

Quantile statistics

Minimum: 10
5-th percentile: 12
Q1: 26
Median: 75
Q3: 484
95-th percentile: 3426.3
Maximum: 74030
Range: 74020
Interquartile range (IQR): 458

Descriptive statistics

Standard deviation: 2523.676
Coefficient of variation (CV): 3.4678405
Kurtosis: 354.29552
Mean: 727.73704
Median Absolute Deviation (MAD): 60
Skewness: 14.904686
Sum: 2371695
Variance: 6368940.4
Monotonicity: Decreasing
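
Since this column is expressed in units of 억 원 (KRW 100 million), a short conversion helps in reading the figures. The sketch below converts the reported Minimum and Sum to won. The constant and function names are illustrative only.

```python
EOK = 100_000_000          # 1 억 원 = 10^8 KRW
JO = 1_000_000_000_000     # 1 조 원 = 10^12 KRW (one trillion)


def eok_to_krw(amount_eok):
    """Convert an amount in 억 원 to won."""
    return amount_eok * EOK


minimum_krw = eok_to_krw(10)         # reported Minimum
total_krw = eok_to_krw(2_371_695)    # reported Sum

print(minimum_krw)       # the smallest listed holding, in won
print(total_krw / JO)    # the listed holdings in total, in trillions of won
```

The minimum of 10 equals KRW 1 billion, which matches the exclusion threshold stated in the dataset description, and the listed holdings sum to roughly KRW 237 trillion.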
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
11 | 90 | 2.8%
15 | 63 | 1.9%
17 | 62 | 1.9%
12 | 61 | 1.9%
13 | 58 | 1.8%
16 | 56 | 1.7%
14 | 50 | 1.5%
21 | 49 | 1.5%
19 | 48 | 1.5%
20 | 47 | 1.4%
Other values (1068) | 2675 | 82.1%

Minimum 10 values
Value | Count | Frequency (%)
10 | 32 | 1.0%
11 | 90 | 2.8%
12 | 61 | 1.9%
13 | 58 | 1.8%
14 | 50 | 1.5%
15 | 63 | 1.9%
16 | 56 | 1.7%
17 | 62 | 1.9%
18 | 40 | 1.2%
19 | 48 | 1.5%

Maximum 10 values
Value | Count | Frequency (%)
74030 | 1 | < 0.1%
63855 | 1 | < 0.1%
26432 | 1 | < 0.1%
25673 | 1 | < 0.1%
25065 | 1 | < 0.1%
22584 | 1 | < 0.1%
22209 | 1 | < 0.1%
16809 | 1 | < 0.1%
16531 | 1 | < 0.1%
16094 | 1 | < 0.1%

자산군 내 비중(퍼센트) (weight within asset class, %)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 60
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.029404725
Minimum: 0
Maximum: 3.07
Zeros: 1896
Zeros (%): 58.2%
Negative: 0
Negative (%): 0.0%
Memory size: 28.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0.02
95-th percentile: 0.14
Maximum: 3.07
Range: 3.07
Interquartile range (IQR): 0.02

Descriptive statistics

Standard deviation: 0.1050263
Coefficient of variation (CV): 3.571749
Kurtosis: 350.28692
Mean: 0.029404725
Median Absolute Deviation (MAD): 0
Skewness: 14.799913
Sum: 95.83
Variance: 0.011030523
Monotonicity: Decreasing
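
One plausible reading of the 58.2% zeros is rounding: if each weight is the holding's valuation divided by the asset-class total and then rounded to two decimal places, small holdings display as 0.00. The sketch below estimates that cut-off from the largest holding (valuation 74030, weight 3.07). Both the formula and the rounding rule are assumptions, not something the report states.

```python
# Largest holding as reported: valuation in 억 원 and weight in percent.
largest_value = 74030
largest_weight_pct = 3.07

# Assumption: weight (%) = valuation / asset-class total * 100.
asset_class_total = largest_value / (largest_weight_pct / 100)

# Assumption: weights are rounded to two decimals, so anything below
# 0.005% of the asset-class total would be displayed as 0.00.
zero_threshold = asset_class_total * 0.005 / 100

print(round(asset_class_total))
print(round(zero_threshold, 1))
```

Under these assumptions the cut-off lands a little above 100 억 원. The median valuation is 75, so it is unsurprising that more than half of the weights appear as zero.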
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0.0 | 1896 | 58.2%
0.01 | 415 | 12.7%
0.02 | 246 | 7.5%
0.03 | 118 | 3.6%
0.04 | 91 | 2.8%
0.05 | 81 | 2.5%
0.06 | 59 | 1.8%
0.07 | 38 | 1.2%
0.08 | 30 | 0.9%
0.09 | 29 | 0.9%
Other values (50) | 256 | 7.9%

Minimum 10 values
Value | Count | Frequency (%)
0.0 | 1896 | 58.2%
0.01 | 415 | 12.7%
0.02 | 246 | 7.5%
0.03 | 118 | 3.6%
0.04 | 91 | 2.8%
0.05 | 81 | 2.5%
0.06 | 59 | 1.8%
0.07 | 38 | 1.2%
0.08 | 30 | 0.9%
0.09 | 29 | 0.9%

Maximum 10 values
Value | Count | Frequency (%)
3.07 | 1 | < 0.1%
2.65 | 1 | < 0.1%
1.1 | 1 | < 0.1%
1.07 | 1 | < 0.1%
1.04 | 1 | < 0.1%
0.94 | 1 | < 0.1%
0.92 | 1 | < 0.1%
0.7 | 1 | < 0.1%
0.69 | 1 | < 0.1%
0.67 | 2 | 0.1%

지분율(퍼센트) (ownership stake, %)
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 145
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.22233507
Minimum: 0
Maximum: 80.43
Zeros: 35
Zeros (%): 1.1%
Negative: 0
Negative (%): 0.0%
Memory size: 28.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.01
Q1: 0.05
Median: 0.11
Q3: 0.23
95-th percentile: 0.61
Maximum: 80.43
Range: 80.43
Interquartile range (IQR): 0.18

Descriptive statistics

Standard deviation: 1.4946246
Coefficient of variation (CV): 6.7223971
Kurtosis: 2561.2553
Mean: 0.22233507
Median Absolute Deviation (MAD): 0.08
Skewness: 48.536132
Sum: 724.59
Variance: 2.2339028
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0.01 | 268 | 8.2%
0.04 | 171 | 5.2%
0.03 | 170 | 5.2%
0.05 | 165 | 5.1%
0.07 | 146 | 4.5%
0.06 | 142 | 4.4%
0.02 | 142 | 4.4%
0.08 | 125 | 3.8%
0.09 | 108 | 3.3%
0.11 | 100 | 3.1%
Other values (135) | 1722 | 52.8%

Minimum 10 values
Value | Count | Frequency (%)
0.0 | 35 | 1.1%
0.01 | 268 | 8.2%
0.02 | 142 | 4.4%
0.03 | 170 | 5.2%
0.04 | 171 | 5.2%
0.05 | 165 | 5.1%
0.06 | 142 | 4.4%
0.07 | 146 | 4.5%
0.08 | 125 | 3.8%
0.09 | 108 | 3.3%

Maximum 10 values
Value | Count | Frequency (%)
80.43 | 1 | < 0.1%
21.74 | 1 | < 0.1%
9.13 | 1 | < 0.1%
8.49 | 1 | < 0.1%
6.53 | 1 | < 0.1%
3.28 | 1 | < 0.1%
2.88 | 1 | < 0.1%
2.48 | 1 | < 0.1%
2.26 | 1 | < 0.1%
2.18 | 1 | < 0.1%

Interactions

[Figure: pairwise interaction plots for the numeric variables (16 panels); images not preserved]

Correlations

Variable | 번호 | 평가액(억 원) | 자산군 내 비중(퍼센트) | 지분율(퍼센트)
번호 | 1.000 | 0.312 | 0.318 | 0.051
평가액(억 원) | 0.312 | 1.000 | 1.000 | 0.304
자산군 내 비중(퍼센트) | 0.318 | 1.000 | 1.000 | 0.304
지분율(퍼센트) | 0.051 | 0.304 | 0.304 | 1.000

Variable | 번호 | 평가액(억 원) | 자산군 내 비중(퍼센트) | 지분율(퍼센트)
번호 | 1.000 | -1.000 | -0.895 | -0.717
평가액(억 원) | -1.000 | 1.000 | 0.895 | 0.717
자산군 내 비중(퍼센트) | -0.895 | 0.895 | 1.000 | 0.710
지분율(퍼센트) | -0.717 | 0.717 | 0.710 | 1.000
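
The method labels for the two matrices did not survive, so the following is only a guess at how to read the second one. A coefficient of -1.000 between 번호 and 평가액(억 원) is what a rank-based measure such as Spearman's would give when rows are numbered in descending order of valuation. The sketch below computes Spearman's coefficient in plain Python for the ten largest holdings listed in this report. All function names are illustrative.

```python
def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (spread_x * spread_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Row numbers and valuations (억 원) of the ten largest holdings.
numbers = list(range(1, 11))
valuations = [74030, 63855, 26432, 25673, 25065, 22584, 22209, 16809, 16531, 16094]

rho = spearman(numbers, valuations)
print(round(rho, 3))
```

If that reading is right, the strong correlations involving 번호 reflect the sort order of the file rather than a relationship between investment variables.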

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
(index) | 번호 | 종목명 | 평가액(억 원) | 자산군 내 비중(퍼센트) | 지분율(퍼센트)
0 | 1 | APPLE INC | 74030 | 3.07 | 0.28
1 | 2 | MICROSOFT CORP | 63855 | 2.65 | 0.28
2 | 3 | ALPHABET INC CL A | 26432 | 1.1 | 0.4
3 | 4 | AMAZON.COM INC | 25673 | 1.07 | 0.24
4 | 5 | UNITEDHEALTH GROUP INC | 25065 | 1.04 | 0.4
5 | 6 | ALPHABET INC CL C | 22584 | 0.94 | 0.33
6 | 7 | INVESCO PUREBETA MSCI USA ETF | 22209 | 0.92 | 80.43
7 | 8 | EXXON MOBIL CORP | 16809 | 0.7 | 0.29
8 | 9 | JOHNSON + JOHNSON | 16531 | 0.69 | 0.28
9 | 10 | NESTLE SA REG | 16094 | 0.67 | 0.4

Last rows
(index) | 번호 | 종목명 | 평가액(억 원) | 자산군 내 비중(퍼센트) | 지분율(퍼센트)
3249 | 3250 | AECOM | 10 | 0.0 | 0.01
3250 | 3251 | HUIZHOU DESAY SV AUTOMOTIV A | 10 | 0.0 | 0.01
3251 | 3252 | SECUNET SECURITY NETWORKS AG | 10 | 0.0 | 0.06
3252 | 3253 | CHINA EVERGRANDE GROUP | 10 | 0.0 | 0.03
3253 | 3254 | HARBORONE BANCORP INC | 10 | 0.0 | 0.12
3254 | 3255 | METSO OUTOTEC OYJ | 10 | 0.0 | 0.01
3255 | 3256 | XVIVO PERFUSION AB | 10 | 0.0 | 0.15
3256 | 3257 | JIANGSU ZHONGTIAN TECHNOLO A | 10 | 0.0 | 0.01
3257 | 3258 | NEW HOPE LIUHE CO LTD A | 10 | 0.0 | 0.01
3258 | 3259 | MITSUBISHI SHOKUHIN CO LTD | 10 | 0.0 | 0.08