Overview

Dataset statistics

Number of variables: 3
Number of observations: 144
Missing cells: 121
Missing cells (%): 28.0%
Duplicate rows: 4
Duplicate rows (%): 2.8%
Total size in memory: 3.6 KiB
Average record size in memory: 25.9 B
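
The percentages above follow directly from the counts. As a minimal sketch (the variable names are illustrative and not part of the report), the missing-cell share is taken over all cells, that is variables times observations, while the duplicate share is taken over rows:

```python
# Counts copied from the dataset statistics above.
n_variables = 3
n_observations = 144
missing_cells = 121
duplicate_rows = 4

# Missing cells are measured against every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Duplicate rows are measured against the number of rows.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"duplicate rows: {duplicate_pct:.1f}%")
```

The same 121 missing values amount to 84.0% when measured against the 144 rows of a single column, which is the figure reported per variable.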

Variable types

Categorical: 2
Numeric: 1

Dataset

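Description: Data from the retirement pension record management system DB held by the Corporation (operating status of retirement pension workplaces), as of 2019.08. Fields include currency code, country code, bank code, disclosed rate of return, etc.
Author: Korea Workers' Compensation and Welfare Service (근로복지공단)
URL: https://www.data.go.kr/data/15051514/fileData.do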

Alerts

[Duplicates] Dataset has 4 (2.8%) duplicate rows
[High correlation] 공시수익률_기준일자 is highly overall correlated with 장기수수료할인기본_적용시작일자
[High correlation] 장기수수료할인기본_적용시작일자 is highly overall correlated with 공시수익률_기준일자
[Imbalance] 공시수익률_기준일자 is highly imbalanced (52.4%)
[Imbalance] 장기수수료할인기본_적용시작일자 is highly imbalanced (59.6%)
[Missing] 공시수익률_공시수익률 has 121 (84.0%) missing values
[Zeros] 공시수익률_공시수익률 has 8 (5.6%) zeros
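
The two imbalance percentages are consistent with a score of one minus the normalized Shannon entropy of the category frequencies. The sketch below assumes that definition; the function name `imbalance_score` is hypothetical, and the counts are taken from the value counts reported for each categorical variable.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k) for a list of class counts.

    H is the Shannon entropy of the class frequencies and k the number
    of classes. 0 means perfectly balanced; values near 1 mean one class
    dominates. This definition is an assumption, not quoted from the report.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)


# Value counts for 공시수익률_기준일자: <NA>, 2016-12-31, 2017-03-31
base_date_score = imbalance_score([121, 18, 5])

# Value counts for 장기수수료할인기본_적용시작일자: <NA> plus four dates
start_date_score = imbalance_score([121, 9, 6, 4, 4])

print(f"{base_date_score:.3f} {start_date_score:.3f}")
```

Under this assumption, both scores are driven almost entirely by the 121 `<NA>` entries, so the imbalance alerts largely restate the missing-data pattern.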

Reproduction

Analysis started: 2023-12-12 23:15:12.862532
Analysis finished: 2023-12-12 23:15:13.199899
Duration: 0.34 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

공시수익률_기준일자 (disclosed rate of return: reference date)
Categorical

Alerts: High correlation, Imbalance

Distinct: 3
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Value | Count
<NA> | 121
2016-12-31 | 18
2017-03-31 | 5

Length

Max length: 10
Median length: 4
Mean length: 4.9583333
Min length: 4
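
A median and minimum length of 4, together with a missing count of 0, suggest that `<NA>` is stored in this column as a literal four-character string rather than as a true null. That is an inference from the numbers, not something the report states. A minimal sketch that rebuilds the length statistics from the value counts under that assumption:

```python
from statistics import mean, median

# Value counts as reported for this variable; "<NA>" is treated as a
# literal string (an assumption inferred from the length statistics).
value_counts = {"<NA>": 121, "2016-12-31": 18, "2017-03-31": 5}

# One length entry per row.
lengths = [len(value) for value, n in value_counts.items() for _ in range(n)]

print(f"max={max(lengths)} median={median(lengths)} "
      f"mean={mean(lengths):.7f} min={min(lengths)}")
```

If the assumption holds, converting the literal `<NA>` strings to real nulls before profiling would move these 121 rows from the value counts into the missing count.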

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2016-12-31
2nd row: 2016-12-31
3rd row: 2016-12-31
4th row: 2016-12-31
5th row: 2016-12-31

Common Values

Value | Count | Frequency (%)
<NA> | 121 | 84.0%
2016-12-31 | 18 | 12.5%
2017-03-31 | 5 | 3.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of common values: <NA> (labelled "na") 121 (84.0%), 2016-12-31 18 (12.5%), 2017-03-31 5 (3.5%)]

공시수익률_공시수익률 (disclosed rate of return: rate value)
Real number (ℝ)

Alerts: Missing, Zeros

Distinct: 12
Distinct (%): 52.2%
Missing: 121
Missing (%): 84.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.18695652
Minimum: -2.62
Maximum: 2.53
Zeros: 8
Zeros (%): 5.6%
Negative: 6
Negative (%): 4.2%
Memory size: 1.4 KiB

Quantile statistics

Minimum: -2.62
5-th percentile: -1.8
Q1: -0.155
Median: 0
Q3: 0.41
95-th percentile: 2.324
Maximum: 2.53
Range: 5.15
Interquartile range (IQR): 0.565

Descriptive statistics

Standard deviation: 1.2338684
Coefficient of variation (CV): 6.5997612
Kurtosis: 0.53771919
Mean: 0.18695652
Median Absolute Deviation (MAD): 0.39
Skewness: -0.036254429
Sum: 4.3
Variance: 1.5224312
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=12)]

Common Values

Value | Count | Frequency (%)
0.0 | 8 | 5.6%
-1.08 | 2 | 1.4%
0.41 | 2 | 1.4%
-0.31 | 2 | 1.4%
1.73 | 2 | 1.4%
0.39 | 1 | 0.7%
-1.88 | 1 | 0.7%
1.63 | 1 | 0.7%
-2.62 | 1 | 0.7%
0.36 | 1 | 0.7%
Other values (2) | 2 | 1.4%
(Missing) | 121 | 84.0%

Minimum 10 values

Value | Count | Frequency (%)
-2.62 | 1 | 0.7%
-1.88 | 1 | 0.7%
-1.08 | 2 | 1.4%
-0.31 | 2 | 1.4%
0.0 | 8 | 5.6%
0.36 | 1 | 0.7%
0.39 | 1 | 0.7%
0.41 | 2 | 1.4%
1.63 | 1 | 0.7%
1.73 | 2 | 1.4%

Maximum 10 values

Value | Count | Frequency (%)
2.53 | 1 | 0.7%
2.39 | 1 | 0.7%
1.73 | 2 | 1.4%
1.63 | 1 | 0.7%
0.41 | 2 | 1.4%
0.39 | 1 | 0.7%
0.36 | 1 | 0.7%
0.0 | 8 | 5.6%
-0.31 | 2 | 1.4%
-1.08 | 2 | 1.4%
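
The value counts for this variable cover all 23 non-missing observations (12 distinct values), so the summary statistics can be rebuilt from them. The sketch below assumes a sample standard deviation (divisor n - 1) and linear-interpolation quantiles, which is how the reported figures appear to have been derived; the helper name `quantile` is illustrative.

```python
import math
import statistics

# Distinct values and their counts, taken from the value tables above.
value_counts = {
    -2.62: 1, -1.88: 1, -1.08: 2, -0.31: 2, 0.0: 8, 0.36: 1,
    0.39: 1, 0.41: 2, 1.63: 1, 1.73: 2, 2.39: 1, 2.53: 1,
}
values = sorted(v for v, n in value_counts.items() for _ in range(n))


def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two nearest ranks."""
    pos = q * (len(sorted_values) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


total = sum(values)
mean = total / len(values)
std = statistics.stdev(values)            # sample standard deviation (n - 1)
cv = std / mean                           # coefficient of variation
median = statistics.median(values)
mad = statistics.median([abs(v - median) for v in values])
q1, q3 = quantile(values, 0.25), quantile(values, 0.75)

print(f"n={len(values)} sum={total:.2f} mean={mean:.8f} std={std:.7f}")
print(f"cv={cv:.7f} median={median} mad={mad:.2f} q1={q1:.3f} q3={q3:.3f}")
```

Because the mean is close to zero relative to the spread, the coefficient of variation is large and sensitive to small changes in the mean; the standard deviation and IQR are the more stable spread measures for this column.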

장기수수료할인기본_적용시작일자 (long-term fee discount, basic: application start date)
Categorical

Alerts: High correlation, Imbalance

Distinct: 5
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Value | Count
<NA> | 121
2015-09-15 | 9
2015-09-01 | 6
2018-07-01 | 4
2018-07-02 | 4

Length

Max length: 10
Median length: 4
Mean length: 4.9583333
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2015-09-15
2nd row: 2015-09-01
3rd row: 2015-09-01
4th row: 2015-09-15
5th row: 2015-09-01

Common Values

Value | Count | Frequency (%)
<NA> | 121 | 84.0%
2015-09-15 | 9 | 6.2%
2015-09-01 | 6 | 4.2%
2018-07-01 | 4 | 2.8%
2018-07-02 | 4 | 2.8%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of common values: <NA> (labelled "na") 121 (84.0%), 2015-09-15 9 (6.2%), 2015-09-01 6 (4.2%), 2018-07-01 4 (2.8%), 2018-07-02 4 (2.8%)]

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap 1 of 3]

 | 공시수익률_기준일자 | 공시수익률_공시수익률 | 장기수수료할인기본_적용시작일자
공시수익률_기준일자 | 1.000 | 0.440 | 0.983
공시수익률_공시수익률 | 0.440 | 1.000 | 0.000
장기수수료할인기본_적용시작일자 | 0.983 | 0.000 | 1.000

[Figure: correlation heatmap 2 of 3]

 | 공시수익률_기준일자 | 장기수수료할인기본_적용시작일자
공시수익률_기준일자 | 1.000 | 0.839
장기수수료할인기본_적용시작일자 | 0.839 | 1.000

[Figure: correlation heatmap 3 of 3]

 | 공시수익률_공시수익률 | 공시수익률_기준일자 | 장기수수료할인기본_적용시작일자
공시수익률_공시수익률 | 1.000 | 0.399 | 0.000
공시수익률_기준일자 | 0.399 | 1.000 | 0.839
장기수수료할인기본_적용시작일자 | 0.000 | 0.839 | 1.000

Missing values

[Figure: count plot] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Index | 공시수익률_기준일자 | 공시수익률_공시수익률 | 장기수수료할인기본_적용시작일자
0 | 2016-12-31 | 0.39 | 2015-09-15
1 | 2016-12-31 | 0.0 | 2015-09-01
2 | 2016-12-31 | -1.08 | 2015-09-01
3 | 2016-12-31 | 0.0 | 2015-09-15
4 | 2016-12-31 | -1.88 | 2015-09-01
5 | 2016-12-31 | 0.41 | 2015-09-15
6 | 2016-12-31 | 0.41 | 2015-09-15
7 | 2016-12-31 | 0.0 | 2015-09-15
8 | 2016-12-31 | -1.08 | 2015-09-15
9 | 2016-12-31 | 1.63 | 2015-09-01

Last rows

Index | 공시수익률_기준일자 | 공시수익률_공시수익률 | 장기수수료할인기본_적용시작일자
134 | <NA> | <NA> | <NA>
135 | <NA> | <NA> | <NA>
136 | <NA> | <NA> | <NA>
137 | <NA> | <NA> | <NA>
138 | <NA> | <NA> | <NA>
139 | <NA> | <NA> | <NA>
140 | <NA> | <NA> | <NA>
141 | <NA> | <NA> | <NA>
142 | <NA> | <NA> | <NA>
143 | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

Index | 공시수익률_기준일자 | 공시수익률_공시수익률 | 장기수수료할인기본_적용시작일자 | # duplicates
3 | <NA> | <NA> | <NA> | 121
0 | 2016-12-31 | 0.0 | 2015-09-15 | 4
1 | 2016-12-31 | 0.41 | 2015-09-15 | 2
2 | 2017-03-31 | 0.0 | 2018-07-02 | 2