Overview

Dataset statistics

Number of variables: 4
Number of observations: 536
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 18.4 KiB
Average record size in memory: 35.2 B

Variable types

Text: 1
Numeric: 1
Categorical: 2

Dataset

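Description: Collateral-detail batch data provided by the Bond Management Department of the Korea Housing Finance Corporation. Fields include the guarantee number, performance claim date, processing sequence number, and collateral detail sequence number.
Author: Korea Housing Finance Corporation (한국주택금융공사)
URL: https://www.data.go.kr/data/15073033/fileData.do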

Alerts

PROCESS_SEQ has constant value "1" (Constant)
SCRTY_CONT_SEQ is highly imbalanced (96.5%) (Imbalance)
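
The 96.5% imbalance figure can be checked against the value counts reported for SCRTY_CONT_SEQ (534 rows with value 1 and 2 rows with value 2). The following is a minimal sketch, assuming the score is one minus the Shannon entropy of the class frequencies divided by the maximum possible entropy. The function name imbalance_score is illustrative and not taken from any library.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max for class counts (0 = balanced, near 1 = one class dominates)."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # convention chosen for this sketch: a single class gets score 0
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)  # Shannon entropy in bits
    return 1.0 - entropy / math.log2(n_classes)      # normalize by the maximum entropy

score = imbalance_score([534, 2])
print(f"{score:.1%}")  # 96.5%
```

Under this assumption the result agrees with the alert: two rows out of 536 carry almost no entropy, so the column is scored as nearly constant.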

Reproduction

Analysis started: 2023-12-12 23:46:27.250337
Analysis finished: 2023-12-12 23:46:27.597342
Duration: 0.35 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

GUARNT_NO
Text

Distinct: 421
Distinct (%): 78.5%
Missing: 0
Missing (%): 0.0%
Memory size: 4.3 KiB

Length

Max length: 13
Median length: 13
Mean length: 13
Min length: 13

Characters and Unicode

Total characters: 6968
Distinct characters: 24
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 341
Unique (%): 63.6%
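
"Distinct" and "Unique" measure different things: distinct counts how many different values occur, while unique counts the values that occur exactly once. With 421 distinct values of which 341 are unique, 80 guarantee numbers appear more than once. A small sketch of the difference, using the five sample rows listed for this column (variable names are illustrative):

```python
from collections import Counter

# First five sample rows of GUARNT_NO as listed in this report.
sample = [
    "THO2014029259",
    "TBA2018000923",
    "TPB2017003479",
    "TAB2013027419",
    "TBA2018000923",
]

counts = Counter(sample)
distinct = len(counts)                              # different values present
unique = sum(1 for n in counts.values() if n == 1)  # values seen exactly once

print(distinct, unique)  # 4 3
```

TBA2018000923 occurs twice, so it counts toward distinct but not toward unique.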

Sample

1st row: THO2014029259
2nd row: TBA2018000923
3rd row: TPB2017003479
4th row: TAB2013027419
5th row: TBA2018000923
Value               Count  Frequency (%)
tha2015051284           7  1.3%
tlb2017009751           5  0.9%
tho2016057593           4  0.7%
tpa2014031195           4  0.7%
tlb2015009657           4  0.7%
tac2015032789           4  0.7%
tab2014005247           4  0.7%
tho2014040037           3  0.6%
tlb2015020348           3  0.6%
tho2014056885           3  0.6%
Other values (411)    495  92.4%

Most occurring characters

Value              Count  Frequency (%)
0                   1343  19.3%
2                    947  13.6%
1                    832  11.9%
A                    454  6.5%
5                    424  6.1%
T                    421  6.0%
3                    413  5.9%
4                    383  5.5%
8                    278  4.0%
7                    259  3.7%
Other values (14)   1214  17.4%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number     5360  76.9%
Uppercase Letter   1608  23.1%
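
These totals are consistent with a fixed layout: 1608 uppercase letters and 5360 digits across 536 rows average out to 3 letters and 10 digits per 13-character value. The sketch below checks values against that layout. The pattern is inferred from the counts and the sample rows, not documented by the data provider, and the function name is illustrative.

```python
import re

# Inferred layout (an assumption): 3 uppercase letters followed by 10 digits.
GUARNT_NO_PATTERN = re.compile(r"[A-Z]{3}[0-9]{10}")

def looks_like_guarnt_no(value: str) -> bool:
    """Return True if the whole string matches the inferred layout."""
    return GUARNT_NO_PATTERN.fullmatch(value) is not None

# Values taken from the sample rows of this report.
samples = ["THO2014029259", "TBA2018000923", "QAC2002064084"]
print(all(looks_like_guarnt_no(s) for s in samples))  # True
print(looks_like_guarnt_no("tho2014029259"))          # False: lowercase letters

# Character totals implied by the layout for 536 rows.
rows = 536
print(rows * 3, rows * 10, rows * 13)  # 1608 5360 6968
```

A check like this is stricter than the character statistics, which only constrain totals and would not reveal a row with, say, two letters and eleven digits.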

Most frequent character per category

Uppercase Letter
Value             Count  Frequency (%)
A                   454  28.2%
T                   421  26.2%
H                   181  11.3%
Q                   125  7.8%
D                   122  7.6%
B                   117  7.3%
O                   117  7.3%
C                    36  2.2%
L                    16  1.0%
P                    12  0.7%
Other values (4)      7  0.4%

Decimal Number
Value  Count  Frequency (%)
0       1343  25.1%
2        947  17.7%
1        832  15.5%
5        424  7.9%
3        413  7.7%
4        383  7.1%
8        278  5.2%
7        259  4.8%
9        247  4.6%
6        234  4.4%

Most occurring scripts

Value   Count  Frequency (%)
Common   5360  76.9%
Latin    1608  23.1%

Most frequent character per script

Latin
Value             Count  Frequency (%)
A                   454  28.2%
T                   421  26.2%
H                   181  11.3%
Q                   125  7.8%
D                   122  7.6%
B                   117  7.3%
O                   117  7.3%
C                    36  2.2%
L                    16  1.0%
P                    12  0.7%
Other values (4)      7  0.4%

Common
Value  Count  Frequency (%)
0       1343  25.1%
2        947  17.7%
1        832  15.5%
5        424  7.9%
3        413  7.7%
4        383  7.1%
8        278  5.2%
7        259  4.8%
9        247  4.6%
6        234  4.4%

Most occurring blocks

Value  Count  Frequency (%)
ASCII   6968  100.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
0                   1343  19.3%
2                    947  13.6%
1                    832  11.9%
A                    454  6.5%
5                    424  6.1%
T                    421  6.0%
3                    413  5.9%
4                    383  5.5%
8                    278  4.0%
7                    259  3.7%
Other values (14)   1214  17.4%

DISCHRG_DEMND_DY
Real number (ℝ)

Distinct: 387
Distinct (%): 72.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20176599
Minimum: 20090731
Maximum: 20201023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.8 KiB

Quantile statistics

Minimum: 20090731
5-th percentile: 20150614
Q1: 20160904
median: 20180461
Q3: 20190916
95-th percentile: 20200824
Maximum: 20201023
Range: 110292
Interquartile range (IQR): 30012.25
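
The values of DISCHRG_DEMND_DY look like dates written as YYYYMMDD integers (the dataset description calls the field a performance claim date), so these statistics are computed on the integer encoding, not on calendar dates. The sketch below, which assumes that encoding, contrasts the reported integer range with the actual span in days and shows that the reported median is not a valid calendar date. The helper names are illustrative.

```python
from datetime import date, datetime

def parse_yyyymmdd(value: int) -> date:
    """Parse an integer such as 20201023 as a calendar date."""
    return datetime.strptime(str(value), "%Y%m%d").date()

def is_calendar_date(value: int) -> bool:
    """Return True if the integer decodes to a real calendar date."""
    try:
        parse_yyyymmdd(value)
        return True
    except ValueError:
        return False

minimum = parse_yyyymmdd(20090731)
maximum = parse_yyyymmdd(20201023)

print(20201023 - 20090731)         # 110292, the integer "Range" in the report
print((maximum - minimum).days)    # 4102 calendar days
print(is_calendar_date(20180461))  # False: the median is an interpolated integer
```

If the column is to be analysed as a date, converting it before profiling would give a range, quantiles, and histogram on a real time axis.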

Descriptive statistics

Standard deviation: 21747.966
Coefficient of variation (CV): 0.0010778807
Kurtosis: 4.0413468
Mean: 20176599
Median Absolute Deviation (MAD): 10666.5
Skewness: -1.6048367
Sum: 1.0814657 × 10^10
Variance: 4.7297401 × 10^8
Monotonicity: Not monotonic
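
Several of these figures are linked by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. The sketch below recomputes them from the rounded values shown in this report, so agreement is only expected up to rounding.

```python
import math

n = 536           # number of observations
mean = 20176599   # reported mean (rounded)
std = 21747.966   # reported standard deviation

cv = std / mean       # coefficient of variation
variance = std ** 2   # variance as squared standard deviation
total = mean * n      # sum of all values

# Compare against the reported values with a loose relative tolerance.
checks = {
    "cv": math.isclose(cv, 0.0010778807, rel_tol=1e-6),
    "variance": math.isclose(variance, 4.7297401e8, rel_tol=1e-6),
    "sum": math.isclose(total, 1.0814657e10, rel_tol=1e-6),
}
print(cv, variance, total)
print(checks)
```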
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
20190607                6  1.1%
20160518                5  0.9%
20160715                4  0.7%
20180315                4  0.7%
20190830                4  0.7%
20200306                4  0.7%
20200617                4  0.7%
20170628                4  0.7%
20181129                3  0.6%
20160127                3  0.6%
Other values (377)    495  92.4%

Value     Count  Frequency (%)
20090731      2  0.4%
20090806      1  0.2%
20090819      1  0.2%
20090921      1  0.2%
20091209      2  0.4%
20091210      1  0.2%
20091214      1  0.2%
20100120      1  0.2%
20100310      2  0.4%
20100325      1  0.2%

Value     Count  Frequency (%)
20201023      1  0.2%
20201021      2  0.4%
20201019      1  0.2%
20201016      3  0.6%
20201015      3  0.6%
20201014      2  0.4%
20201008      2  0.4%
20200929      2  0.4%
20200928      1  0.2%
20200925      1  0.2%

PROCESS_SEQ
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.3 KiB

Value  Count
1        536

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1        536  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
1        536  100.0%

SCRTY_CONT_SEQ
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.3 KiB

Value  Count
1        534
2          2

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1        534  99.6%
2          2  0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
1        534  99.6%
2          2  0.4%

Interactions


Correlations

                  DISCHRG_DEMND_DY  SCRTY_CONT_SEQ
DISCHRG_DEMND_DY  1.000             0.000
SCRTY_CONT_SEQ    0.000             1.000

                  DISCHRG_DEMND_DY  SCRTY_CONT_SEQ
DISCHRG_DEMND_DY  1.000             0.140
SCRTY_CONT_SEQ    0.140             1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

     GUARNT_NO      DISCHRG_DEMND_DY  PROCESS_SEQ  SCRTY_CONT_SEQ
0    THO2014029259  20201023          1            1
1    TBA2018000923  20201021          1            1
2    TPB2017003479  20201021          1            1
3    TAB2013027419  20201019          1            1
4    TBA2018000923  20201016          1            1
5    THO2012000842  20201016          1            1
6    THB2015040480  20201016          1            1
7    TAD2015007294  20201015          1            1
8    TAC2015043842  20201015          1            1
9    THA2013030069  20201015          1            1

     GUARNT_NO      DISCHRG_DEMND_DY  PROCESS_SEQ  SCRTY_CONT_SEQ
526  TLA2002008445  20100326          1            1
527  THA2002023086  20090921          1            1
528  THA2002020834  20090819          1            1
529  THA2002067127  20090806          1            1
530  THO2002116849  20091214          1            1
531  THO2003039955  20100607          1            1
532  THA2002024282  20101013          1            1
533  QAC2002064084  20090731          1            2
534  QAC2002064084  20090731          1            1
535  TLA2002008445  20100325          1            1