Overview

Dataset statistics

Number of variables: 3
Number of observations: 163
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.3 KiB
Average record size in memory: 26.8 B

Variable types

Text: 1
Numeric: 1
Categorical: 1

Dataset

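Description: Public open data related to the work of the Securitized Assets Department of the Korea Housing Finance Corporation (source data that can be disclosed from the databases related to that department's work)
Author: Korea Housing Finance Corporation (한국주택금융공사)
URL: https://www.data.go.kr/data/15072825/fileData.do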

Alerts

MSPRTC_SEQ is highly imbalanced (66.7%) (alert type: Imbalance)
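The 66.7% figure is consistent with an imbalance score defined as one minus the normalized Shannon entropy of the class shares. The sketch below is a minimal illustration under that assumption, not a copy of the profiler's code; the function name `imbalance_score` is hypothetical, and the counts 153 and 10 are the MSPRTC_SEQ frequencies reported in this document.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits
    of the class shares and k is the number of classes.

    0.0 means perfectly balanced, 1.0 means a single class dominates.
    Assumption: this matches the score shown in the alert.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class is reported as not imbalanced
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# MSPRTC_SEQ: value 1 occurs 153 times, value 2 occurs 10 times.
score = imbalance_score([153, 10])
print(f"{score:.1%}")  # 66.7%
```

A 50/50 split would give a score of 0, so a score of 66.7% reflects the 93.9% share held by the value 1.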

Reproduction

Analysis started: 2023-12-12 16:25:52.483902
Analysis finished: 2023-12-12 16:25:52.831712
Duration: 0.35 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

DBTR_JUMIN_NO
Text

Distinct: 97
Distinct (%): 59.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 KiB

Length

Max length: 13
Median length: 13
Mean length: 13
Min length: 13

Characters and Unicode

Total characters: 2119
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
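As a small illustration of these character properties, the sketch below counts Unicode general categories for the first sample value in this report, using Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical. The standard library exposes the general category (`Nd` for Decimal Number, `Po` for Other Punctuation) but not scripts or blocks, so those are left out.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters in `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


sample = "8408252******"  # first sample row of the masked text column
counts = category_counts(sample)

# Seven digits fall under Nd (Decimal Number); six asterisks under Po (Other Punctuation).
print(dict(counts))
```

Applied to the whole column, the same tally would correspond to the "Most occurring categories" table, which reports 1141 Decimal Number and 978 Other Punctuation characters.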

Unique

Unique: 62
Unique (%): 38.0%
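The report distinguishes "Distinct" (how many different values occur, 97 here) from "Unique" (how many values occur exactly once, 62 here). The sketch below shows that distinction on the five sample rows listed in this report. The function name `distinct_and_unique` is hypothetical and the definitions are stated as an assumption about how the two counters are meant to be read.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (distinct, unique): the number of different values,
    and the number of values that occur exactly once."""
    freq = Counter(values)
    distinct = len(freq)
    unique = sum(1 for n in freq.values() if n == 1)
    return distinct, unique


# The five sample rows of the masked text column.
rows = [
    "8408252******",
    "8302211******",
    "8002052******",
    "8002052******",
    "7911132******",
]
print(distinct_and_unique(rows))  # (4, 3)
```

Here `8002052******` appears twice, so it counts toward the distinct total but not toward the unique total.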

Sample

1st row: 8408252******
2nd row: 8302211******
3rd row: 8002052******
4th row: 8002052******
5th row: 7911132******

Value               Count   Frequency (%)
6602101             11      6.7%
6408201             5       3.1%
7911132             5       3.1%
7307202             5       3.1%
5612201             4       2.5%
6508122             4       2.5%
6903231             4       2.5%
6312222             3       1.8%
7010171             3       1.8%
6101161             3       1.8%
Other values (87)   116     71.2%

Most occurring characters

Value   Count   Frequency (%)
*       978     46.2%
1       280     13.2%
0       227     10.7%
2       181     8.5%
6       121     5.7%
7       96      4.5%
5       67      3.2%
3       59      2.8%
8       46      2.2%
4       39      1.8%

Most occurring categories

Value               Count   Frequency (%)
Decimal Number      1141    53.8%
Other Punctuation   978     46.2%

Most frequent character per category

Decimal Number
Value   Count   Frequency (%)
1       280     24.5%
0       227     19.9%
2       181     15.9%
6       121     10.6%
7       96      8.4%
5       67      5.9%
3       59      5.2%
8       46      4.0%
4       39      3.4%
9       25      2.2%

Other Punctuation
Value   Count   Frequency (%)
*       978     100.0%

Most occurring scripts

Value    Count   Frequency (%)
Common   2119    100.0%

Most frequent character per script

Common
Value   Count   Frequency (%)
*       978     46.2%
1       280     13.2%
0       227     10.7%
2       181     8.5%
6       121     5.7%
7       96      4.5%
5       67      3.2%
3       59      2.8%
8       46      2.2%
4       39      1.8%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   2119    100.0%

Most frequent character per block

ASCII
Value   Count   Frequency (%)
*       978     46.2%
1       280     13.2%
0       227     10.7%
2       181     8.5%
6       121     5.7%
7       96      4.5%
5       67      3.2%
3       59      2.8%
8       46      2.2%
4       39      1.8%

ASSET_NO
Real number (ℝ)

Distinct: 11
Distinct (%): 6.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.2515337
Minimum: 1
Maximum: 11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 3
Q3: 4
95-th percentile: 6
Maximum: 11
Range: 10
Interquartile range (IQR): 2
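The quantile figures above can be re-derived from the ASSET_NO value counts that this report lists. The sketch below rebuilds the 163 observations from those counts and applies linear interpolation between order statistics. That interpolation rule is an assumption (it is the common default in NumPy and pandas), and the names `VALUE_COUNTS` and `percentile` are hypothetical.

```python
import math

# ASSET_NO value -> count, taken from the frequency tables in this report.
VALUE_COUNTS = {1: 20, 2: 38, 3: 48, 4: 28, 5: 14, 6: 8, 7: 2, 8: 2, 9: 1, 10: 1, 11: 1}

# Expand the counts back into a sorted list of 163 observations.
data = sorted(v for v, n in VALUE_COUNTS.items() for _ in range(n))


def percentile(sorted_values, q):
    """Quantile q in [0, 1] using linear interpolation between neighbours."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


q1 = percentile(data, 0.25)
median = percentile(data, 0.50)
q3 = percentile(data, 0.75)
iqr = q3 - q1
value_range = data[-1] - data[0]

print(q1, median, q3, iqr, value_range)
```

Because the data are small integers with many ties, other common interpolation rules would likely give the same quartiles here, but that is not guaranteed for other datasets.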

Descriptive statistics

Standard deviation: 1.7474622
Coefficient of variation (CV): 0.53742706
Kurtosis: 3.3914823
Mean: 3.2515337
Median Absolute Deviation (MAD): 1
Skewness: 1.4418556
Sum: 530
Variance: 3.0536242
Monotonicity: Not monotonic
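Several of these descriptive statistics can be checked against the ASSET_NO value counts reported in this document. The sketch below uses only Python's `statistics` module and assumes the sample (n - 1) form of variance and standard deviation, which reproduces the reported figures. `VALUE_COUNTS` is a hypothetical name for the counts copied from the frequency tables. Skewness and kurtosis are left out because their exact estimator is not stated in the report.

```python
import statistics

# ASSET_NO value -> count, taken from the frequency tables in this report.
VALUE_COUNTS = {1: 20, 2: 38, 3: 48, 4: 28, 5: 14, 6: 8, 7: 2, 8: 2, 9: 1, 10: 1, 11: 1}

# Expand the counts back into the 163 individual observations.
data = [v for v, n in VALUE_COUNTS.items() for _ in range(n)]

total = sum(data)                      # Sum
mean = statistics.fmean(data)          # Mean
variance = statistics.variance(data)   # sample variance (divides by n - 1)
std = statistics.stdev(data)           # sample standard deviation
cv = std / mean                        # coefficient of variation
med = statistics.median(data)
mad = statistics.median(abs(x - med) for x in data)  # median absolute deviation

print(total, round(mean, 7), round(variance, 7), round(std, 7), round(cv, 8), mad)
```

The coefficient of variation is simply the standard deviation divided by the mean, so a value near 0.54 indicates a spread of roughly half the mean.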

[Figure: Histogram with fixed size bins (bins=11)]

Common values

Value   Count   Frequency (%)
3       48      29.4%
2       38      23.3%
4       28      17.2%
1       20      12.3%
5       14      8.6%
6       8       4.9%
7       2       1.2%
8       2       1.2%
11      1       0.6%
10      1       0.6%

Minimum 10 values

Value   Count   Frequency (%)
1       20      12.3%
2       38      23.3%
3       48      29.4%
4       28      17.2%
5       14      8.6%
6       8       4.9%
7       2       1.2%
8       2       1.2%
9       1       0.6%
10      1       0.6%

Maximum 10 values

Value   Count   Frequency (%)
11      1       0.6%
10      1       0.6%
9       1       0.6%
8       2       1.2%
7       2       1.2%
6       8       4.9%
5       14      8.6%
4       28      17.2%
3       48      29.4%
2       38      23.3%

MSPRTC_SEQ
Categorical

Alert: Imbalance

Distinct: 2
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 KiB
Category counts: 1 (153), 2 (10)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 2
4th row: 1
5th row: 1

Common Values

Value   Count   Frequency (%)
1       153     93.9%
2       10      6.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count   Frequency (%)
1       153     93.9%
2       10      6.1%

Interactions


Correlations

                DBTR_JUMIN_NO   ASSET_NO   MSPRTC_SEQ
DBTR_JUMIN_NO   1.000           0.000      0.000
ASSET_NO        0.000           1.000      0.000
MSPRTC_SEQ      0.000           0.000      1.000

             ASSET_NO   MSPRTC_SEQ
ASSET_NO     1.000      0.000
MSPRTC_SEQ   0.000      1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index   DBTR_JUMIN_NO   ASSET_NO   MSPRTC_SEQ
0       8408252******   2          1
1       8302211******   2          1
2       8002052******   6          2
3       8002052******   5          1
4       7911132******   7          1
5       7911132******   6          1
6       7911132******   5          1
7       7911132******   4          1
8       7911132******   3          1
9       7907091******   1          1

Last rows

Index   DBTR_JUMIN_NO   ASSET_NO   MSPRTC_SEQ
153     5210191******   1          1
154     5202282******   1          1
155     5108271******   2          1
156     5104021******   1          1
157     5007201******   3          1
158     4901231******   5          1
159     4803181******   3          1
160     4306101******   3          1
161     4111152******   2          1
162     4111152******   1          1