Overview

Dataset statistics

Number of variables: 9
Number of observations: 1000
Missing cells: 277
Missing cells (%): 3.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 74.3 KiB
Average record size in memory: 76.1 B
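The two derived figures in this block can be checked by hand. The sketch below assumes that the missing-cell percentage is taken over all cells (variables × observations) and that the average record size is the total size divided by the number of rows; the report does not state either definition, but both reproduce the printed values.

```python
# Values copied from the "Dataset statistics" block of the report.
n_variables = 9
n_observations = 1000
missing_cells = 277
total_size_kib = 74.3

# Assumption: percentages are relative to the total number of cells.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes, averaged over all rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```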

Variable types

Numeric: 3
Categorical: 3
Text: 2
DateTime: 1

Dataset

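Description: Public open data related to the work of the Debt Management Department (채권관리부) of the Korea Housing Finance Corporation (source data that can be disclosed from the databases related to that department's work)
Author: Korea Housing Finance Corporation (한국주택금융공사)
URL: https://www.data.go.kr/data/15072848/fileData.do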

Alerts

REG_BRCD is highly overall correlated with REG_ENO and 1 other field (High correlation)
UPDT_BRCD is highly overall correlated with REG_ENO and 1 other field (High correlation)
REG_ENO is highly overall correlated with UPDT_BRCD and 1 other field (High correlation)
LAWST_CLSS_DVCD is highly imbalanced (94.6%) (Imbalance)
UPDT_ENO has 277 (27.7%) missing values (Missing)
ACPT_PTNO has unique values (Unique)

Reproduction

Analysis started: 2023-12-13 01:01:09.796395
Analysis finished: 2023-12-13 01:01:11.060968
Duration: 1.26 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
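The reported duration follows from the two timestamps. A minimal sketch using only the standard library, with the timestamps copied from the lines above:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the "Reproduction" block.
started = datetime.strptime("2023-12-13 01:01:09.796395", FMT)
finished = datetime.strptime("2023-12-13 01:01:11.060968", FMT)

duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```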

Variables

ACPT_PTNO
Real number (ℝ)

UNIQUE

Distinct: 1000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0187854 × 10^10
Minimum: 2.0081325 × 10^10
Maximum: 2.0201305 × 10^10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 2.0081325 × 10^10
5-th percentile: 2.0141303 × 10^10
Q1: 2.0181303 × 10^10
Median: 2.0201303 × 10^10
Q3: 2.0201304 × 10^10
95-th percentile: 2.0201304 × 10^10
Maximum: 2.0201305 × 10^10
Range: 1.1997962 × 10^8
Interquartile range (IQR): 20000837

Descriptive statistics

Standard deviation: 23393652
Coefficient of variation (CV): 0.0011587983
Kurtosis: 3.3185368
Mean: 2.0187854 × 10^10
Median Absolute Deviation (MAD): 1053
Skewness: -1.9019477
Sum: 2.0187854 × 10^13
Variance: 5.4726296 × 10^14
Monotonicity: Not monotonic
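Several of the ACPT_PTNO statistics are derived from others, so they can be cross-checked. The sketch below assumes the usual definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 − Q1, range = maximum − minimum). Because the report prints rounded inputs, the IQR and range only agree with the reported values in their leading digits.

```python
# Rounded values copied from the ACPT_PTNO block of the report.
mean = 2.0187854e10
std = 23393652
q1, q3 = 2.0181303e10, 2.0201304e10
minimum, maximum = 2.0081325e10, 2.0201305e10

cv = std / mean                  # reported: 0.0011587983
variance = std ** 2              # reported: 5.4726296 × 10^14
iqr = q3 - q1                    # reported: 20000837 (inputs here are rounded)
value_range = maximum - minimum  # reported: 1.1997962 × 10^8 (inputs rounded)

print(f"CV       = {cv:.7f}")
print(f"Variance = {variance:.4e}")
print(f"IQR      = {iqr:.0f}")
print(f"Range    = {value_range:.0f}")
```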
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
20201304540         1      0.1%
20151300566         1      0.1%
20161303480         1      0.1%
20161303280         1      0.1%
20161302326         1      0.1%
20161302203         1      0.1%
20151307381         1      0.1%
20151310425         1      0.1%
20151307008         1      0.1%
20151306722         1      0.1%
Other values (990)  990    99.0%

Minimum 10 values

Value               Count  Frequency (%)
20081324924         1      0.1%
20091307783         1      0.1%
20091307788         1      0.1%
20091309090         1      0.1%
20091312608         1      0.1%
20091312923         1      0.1%
20091314996         1      0.1%
20091324424         1      0.1%
20091330184         1      0.1%
20091344024         1      0.1%

Maximum 10 values

Value               Count  Frequency (%)
20201304540         1      0.1%
20201304538         1      0.1%
20201304537         1      0.1%
20201304534         1      0.1%
20201304533         1      0.1%
20201304532         1      0.1%
20201304531         1      0.1%
20201304530         1      0.1%
20201304529         1      0.1%
20201304527         1      0.1%

MDBTR_CUST_NO
Real number (ℝ)

Distinct: 982
Distinct (%): 98.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 76000863
Minimum: 782315
Maximum: 1.362029 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 782315
5-th percentile: 19700945
Q1: 56385943
Median: 83014542
Q3: 96164870
95-th percentile: 1.1854137 × 10^8
Maximum: 1.362029 × 10^8
Range: 1.3542059 × 10^8
Interquartile range (IQR): 39778927

Descriptive statistics

Standard deviation: 29807196
Coefficient of variation (CV): 0.39219549
Kurtosis: -0.53195007
Mean: 76000863
Median Absolute Deviation (MAD): 14887968
Skewness: -0.59069041
Sum: 7.6000863 × 10^10
Variance: 8.8846893 × 10^14
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
28530743            3      0.3%
109309795           3      0.3%
10996887            2      0.2%
93908202            2      0.2%
92335494            2      0.2%
80001882            2      0.2%
89555355            2      0.2%
38155558            2      0.2%
111778424           2      0.2%
101837414           2      0.2%
Other values (972)  978    97.8%

Minimum 10 values

Value               Count  Frequency (%)
782315              1      0.1%
3792483             1      0.1%
6867603             1      0.1%
6946906             1      0.1%
6975375             1      0.1%
7859348             1      0.1%
8631277             1      0.1%
8763695             1      0.1%
9392557             1      0.1%
9832350             1      0.1%

Maximum 10 values

Value               Count  Frequency (%)
136202904           1      0.1%
135204725           1      0.1%
130928776           1      0.1%
128943598           1      0.1%
128881391           1      0.1%
128050287           1      0.1%
127947126           1      0.1%
126454173           1      0.1%
126430236           1      0.1%
124993018           1      0.1%

LAWST_CLSS_DVCD
Categorical

IMBALANCE

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Value  Count
1      990
2      9
3      1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      990    99.0%
2      9      0.9%
3      1      0.1%
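The 94.6% imbalance figure in the Alerts section can be reproduced from these three counts. The report does not state the formula; the sketch below assumes the score is one minus the Shannon entropy of the class frequencies divided by the logarithm of the number of classes, so that 0 means perfectly balanced and values near 1 mean a single class dominates. The function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Assumed definition: 1 - H / log(k), with H the Shannon entropy of the
    class frequencies and k the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # Convention chosen for this sketch: a single class has score 0.
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)

# Counts copied from the Common Values table for LAWST_CLSS_DVCD.
counts = [990, 9, 1]
score = imbalance_score(counts)
print(f"Imbalance: {score:.1%}")
```

Under this assumed definition the dominant value `1` (990 of 1000 rows) drives the entropy close to zero, which is why the score is so high.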

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; 1: 990 (99.0%), 2: 9 (0.9%), 3: 1 (0.1%)]

UPDT_TS
Text

Distinct: 609
Distinct (%): 60.9%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Length

Max length: 19
Median length: 16
Mean length: 16.74
Min length: 15

Characters and Unicode

Total characters: 16740
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 577
Unique (%): 57.7%

Sample

1st row: 0001/01/01 01:01:01
2nd row: 0001/01/01 01:01:01
3rd row: 0001/01/01 01:01:01
4th row: 0001/01/01 01:01:01
5th row: 0001/01/01 01:01:01
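UPDT_TS is profiled as text because it mixes two layouts: the placeholder `0001/01/01 01:01:01` shown in the sample rows above, and values such as `2018-07-05 10:03` that appear in the Sample section at the end of the report. The sketch below is one possible way to normalise the column. It assumes the placeholder means "never updated", which the report does not say; the function name `parse_updt_ts` is hypothetical.

```python
from datetime import datetime
from typing import Optional

# Placeholder value seen in the UPDT_TS sample rows.
SENTINEL = "0001/01/01 01:01:01"

def parse_updt_ts(raw: str) -> Optional[datetime]:
    """Parse an UPDT_TS string; return None for the placeholder value.

    Assumption: the placeholder stands for "no update recorded".
    """
    raw = raw.strip()
    if raw == SENTINEL:
        return None
    # Real values carry no seconds, and the hour may be a single digit.
    return datetime.strptime(raw, "%Y-%m-%d %H:%M")

print(parse_updt_ts("0001/01/01 01:01:01"))
print(parse_updt_ts("2018-07-05 10:03"))
print(parse_updt_ts("2018-08-29 8:25"))
```

Returning `None` rather than a year-1 datetime keeps the placeholder out of any later minimum, maximum, or histogram computation.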

Words

Value               Count  Frequency (%)
0001/01/01          277    13.9%
01:01:01            277    13.9%
2020-08-25          69     3.5%
17:01               33     1.7%
17:02               32     1.6%
2020-10-05          32     1.6%
13:48               23     1.1%
2020-08-21          22     1.1%
2020-10-15          21     1.1%
2020-09-18          18     0.9%
Other values (678)  1196   59.8%

Most occurring characters

Value               Count  Frequency (%)
0                   4657   27.8%
1                   3396   20.3%
2                   1765   10.5%
-                   1446   8.6%
:                   1277   7.6%
(space)             1000   6.0%
/                   554    3.3%
5                   436    2.6%
4                   435    2.6%
9                   409    2.4%
Other values (4)    1365   8.2%

Most occurring categories

Value               Count  Frequency (%)
Decimal Number      12463  74.5%
Other Punctuation   1831   10.9%
Dash Punctuation    1446   8.6%
Space Separator     1000   6.0%

Most frequent character per category

Decimal Number

Value               Count  Frequency (%)
0                   4657   37.4%
1                   3396   27.2%
2                   1765   14.2%
5                   436    3.5%
4                   435    3.5%
9                   409    3.3%
8                   408    3.3%
3                   380    3.0%
7                   321    2.6%
6                   256    2.1%

Other Punctuation

Value               Count  Frequency (%)
:                   1277   69.7%
/                   554    30.3%

Dash Punctuation

Value               Count  Frequency (%)
-                   1446   100.0%

Space Separator

Value               Count  Frequency (%)
(space)             1000   100.0%

Most occurring scripts

Value               Count  Frequency (%)
Common              16740  100.0%

Most frequent character per script

Common

Value               Count  Frequency (%)
0                   4657   27.8%
1                   3396   20.3%
2                   1765   10.5%
-                   1446   8.6%
:                   1277   7.6%
(space)             1000   6.0%
/                   554    3.3%
5                   436    2.6%
4                   435    2.6%
9                   409    2.4%
Other values (4)    1365   8.2%

Most occurring blocks

Value               Count  Frequency (%)
ASCII               16740  100.0%

Most frequent character per block

ASCII

Value               Count  Frequency (%)
0                   4657   27.8%
1                   3396   20.3%
2                   1765   10.5%
-                   1446   8.6%
:                   1277   7.6%
(space)             1000   6.0%
/                   554    3.3%
5                   436    2.6%
4                   435    2.6%
9                   409    2.4%
Other values (4)    1365   8.2%

UPDT_ENO
Text

MISSING

Distinct: 162
Distinct (%): 22.4%
Missing: 277
Missing (%): 27.7%
Memory size: 7.9 KiB

Length

Max length: 5
Median length: 4
Mean length: 4.1728907
Min length: 4

Characters and Unicode

Total characters: 3017
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 52
Unique (%): 7.2%
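The distinct and unique percentages for UPDT_ENO do not use the same denominator as the missing percentage. The figures are consistent with distinct and unique counts being divided by the number of non-missing rows (1000 − 277 = 723), while the missing count is divided by all 1000 rows. This is an inference from the numbers, not something the report states.

```python
# Counts copied from the UPDT_ENO block of the report.
n_rows = 1000
n_missing = 277
n_distinct = 162
n_unique = 52

# Assumption: distinct/unique percentages are relative to non-missing rows.
n_present = n_rows - n_missing
distinct_pct = 100 * n_distinct / n_present
unique_pct = 100 * n_unique / n_present

# The missing percentage is relative to all rows.
missing_pct = 100 * n_missing / n_rows

print(f"Distinct (%): {distinct_pct:.1f}%")
print(f"Unique (%):   {unique_pct:.1f}%")
print(f"Missing (%):  {missing_pct:.1f}%")
```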

Sample

1st row: 1935
2nd row: 1985
3rd row: 1616
4th row: 1921
5th row: 1647

Words

Value               Count  Frequency (%)
batch               88     12.2%
1935                28     3.9%
1890                27     3.7%
1484                17     2.4%
1785                17     2.4%
1908                15     2.1%
1688                14     1.9%
1872                14     1.9%
1726                13     1.8%
1592                12     1.7%
Other values (152)  478    66.1%

Most occurring characters

Value               Count  Frequency (%)
1                   737    24.4%
5                   299    9.9%
8                   289    9.6%
6                   211    7.0%
3                   210    7.0%
9                   203    6.7%
0                   162    5.4%
7                   161    5.3%
4                   157    5.2%
2                   148    4.9%
Other values (5)    440    14.6%

Most occurring categories

Value               Count  Frequency (%)
Decimal Number      2577   85.4%
Lowercase Letter    440    14.6%

Most frequent character per category

Decimal Number

Value               Count  Frequency (%)
1                   737    28.6%
5                   299    11.6%
8                   289    11.2%
6                   211    8.2%
3                   210    8.1%
9                   203    7.9%
0                   162    6.3%
7                   161    6.2%
4                   157    6.1%
2                   148    5.7%

Lowercase Letter

Value               Count  Frequency (%)
b                   88     20.0%
a                   88     20.0%
t                   88     20.0%
c                   88     20.0%
h                   88     20.0%

Most occurring scripts

Value               Count  Frequency (%)
Common              2577   85.4%
Latin               440    14.6%

Most frequent character per script

Common

Value               Count  Frequency (%)
1                   737    28.6%
5                   299    11.6%
8                   289    11.2%
6                   211    8.2%
3                   210    8.1%
9                   203    7.9%
0                   162    6.3%
7                   161    6.2%
4                   157    6.1%
2                   148    5.7%

Latin

Value               Count  Frequency (%)
b                   88     20.0%
a                   88     20.0%
t                   88     20.0%
c                   88     20.0%
h                   88     20.0%

Most occurring blocks

Value               Count  Frequency (%)
ASCII               3017   100.0%

Most frequent character per block

ASCII

Value               Count  Frequency (%)
1                   737    24.4%
5                   299    9.9%
8                   289    9.6%
6                   211    7.0%
3                   210    7.0%
9                   203    6.7%
0                   162    5.4%
7                   161    5.3%
4                   157    5.2%
2                   148    4.9%
Other values (5)    440    14.6%

UPDT_BRCD
Categorical

HIGH CORRELATION

Distinct: 28
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Value              Count
<NA>               337
ACS                125
0                  88
TLB                70
TAC                65
Other values (23)  315

Length

Max length: 4
Median length: 3
Mean length: 3.161
Min length: 1

Unique

Unique: 3
Unique (%): 0.3%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value               Count  Frequency (%)
<NA>                337    33.7%
ACS                 125    12.5%
0                   88     8.8%
TLB                 70     7.0%
TAC                 65     6.5%
TBA                 51     5.1%
TAB                 49     4.9%
TAD                 28     2.8%
THB                 25     2.5%
THO                 24     2.4%
Other values (18)   138    13.8%

Length

[Figure: histogram of lengths of the category]

Words

Value               Count  Frequency (%)
na                  337    33.7%
acs                 125    12.5%
0                   88     8.8%
tlb                 70     7.0%
tac                 65     6.5%
tba                 51     5.1%
tab                 49     4.9%
tad                 28     2.8%
thb                 25     2.5%
tho                 24     2.4%
Other values (18)   138    13.8%

REG_TS
Date

Distinct: 767
Distinct (%): 76.7%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB
Minimum: 2008-08-08 10:05:00
Maximum: 2020-10-26 10:12:00

[Figure: histogram with fixed size bins (bins=50)]

REG_ENO
Real number (ℝ)

HIGH CORRELATION

Distinct: 91
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1219.599
Minimum: 1009
Maximum: 1499
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 1009
5-th percentile: 1086
Q1: 1142
Median: 1195
Q3: 1331
95-th percentile: 1424
Maximum: 1499
Range: 490
Interquartile range (IQR): 189

Descriptive statistics

Standard deviation: 106.80782
Coefficient of variation (CV): 0.087576178
Kurtosis: -0.6653066
Mean: 1219.599
Median Absolute Deviation (MAD): 57
Skewness: 0.54889076
Sum: 1219599
Variance: 11407.91
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
1221                93     9.3%
1424                92     9.2%
1339                71     7.1%
1157                52     5.2%
1201                51     5.1%
1166                34     3.4%
1195                34     3.4%
1088                32     3.2%
1142                29     2.9%
1141                28     2.8%
Other values (81)   484    48.4%

Minimum 10 values

Value               Count  Frequency (%)
1009                13     1.3%
1011                1      0.1%
1021                3      0.3%
1023                5      0.5%
1031                1      0.1%
1035                2      0.2%
1061                1      0.1%
1064                1      0.1%
1070                4      0.4%
1074                5      0.5%

Maximum 10 values

Value               Count  Frequency (%)
1499                1      0.1%
1424                92     9.2%
1419                1      0.1%
1394                3      0.3%
1393                6      0.6%
1385                1      0.1%
1377                3      0.3%
1375                2      0.2%
1369                4      0.4%
1367                8      0.8%

REG_BRCD
Categorical

HIGH CORRELATION

Distinct: 26
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Value              Count
ACS                259
THO                86
TAC                85
TLB                82
TBA                70
Other values (21)  418

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: QAD
2nd row: QAD
3rd row: TPA
4th row: TPA
5th row: TBB

Common Values

Value               Count  Frequency (%)
ACS                 259    25.9%
THO                 86     8.6%
TAC                 85     8.5%
TLB                 82     8.2%
TBA                 70     7.0%
TAB                 61     6.1%
QAD                 43     4.3%
THA                 42     4.2%
THB                 41     4.1%
TNA                 34     3.4%
Other values (16)   197    19.7%

Length

[Figure: histogram of lengths of the category]

Words

Value               Count  Frequency (%)
acs                 259    25.9%
tho                 86     8.6%
tac                 85     8.5%
tlb                 82     8.2%
tba                 70     7.0%
tab                 61     6.1%
qad                 43     4.3%
tha                 42     4.2%
thb                 41     4.1%
tna                 34     3.4%
Other values (16)   197    19.7%

Interactions

[Figures: 9 interaction plots]

Correlations

[Figure: correlation heatmap]

                 ACPT_PTNO  MDBTR_CUST_NO  LAWST_CLSS_DVCD  UPDT_BRCD  REG_ENO  REG_BRCD
ACPT_PTNO        1.000      0.352          0.000            0.817      0.530    0.805
MDBTR_CUST_NO    0.352      1.000          0.000            0.411      0.289    0.447
LAWST_CLSS_DVCD  0.000      0.000          1.000            0.000      0.000    0.000
UPDT_BRCD        0.817      0.411          0.000            1.000      0.855    0.997
REG_ENO          0.530      0.289          0.000            0.855      1.000    0.866
REG_BRCD         0.805      0.447          0.000            0.997      0.866    1.000

[Figure: correlation heatmap]

                 LAWST_CLSS_DVCD  REG_BRCD  UPDT_BRCD
LAWST_CLSS_DVCD  1.000            0.000     0.000
REG_BRCD         0.000            1.000     0.948
UPDT_BRCD        0.000            0.948     1.000

[Figure: correlation heatmap]

                 ACPT_PTNO  MDBTR_CUST_NO  REG_ENO  LAWST_CLSS_DVCD  UPDT_BRCD  REG_BRCD
ACPT_PTNO        1.000      0.344          0.291    0.000            0.427      0.458
MDBTR_CUST_NO    0.344      1.000          0.075    0.000            0.159      0.177
REG_ENO          0.291      0.075          1.000    0.000            0.513      0.537
LAWST_CLSS_DVCD  0.000      0.000          0.000    1.000            0.000      0.000
UPDT_BRCD        0.427      0.159          0.513    0.000            1.000      0.948
REG_BRCD         0.458      0.177          0.537    0.000            0.948      1.000

Missing values

[Figure: a simple visualization of nullity by column.]
[Figure: nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows

     ACPT_PTNO    MDBTR_CUST_NO  LAWST_CLSS_DVCD  UPDT_TS              UPDT_ENO  UPDT_BRCD  REG_TS            REG_ENO  REG_BRCD
0    20201304540  87879059       1                0001/01/01 01:01:01  <NA>      <NA>       2020-10-26 10:12  1142     QAD
1    20201304514  69772527       1                0001/01/01 01:01:01  <NA>      <NA>       2020-10-26 10:03  1142     QAD
2    20201304538  109950050      1                0001/01/01 01:01:01  <NA>      <NA>       2020-10-26 9:39   1186     TPA
3    20201304537  68126433       1                0001/01/01 01:01:01  <NA>      <NA>       2020-10-26 9:39   1186     TPA
4    20201304534  118826890      1                0001/01/01 01:01:01  <NA>      <NA>       2020-10-23 14:43  1205     TBB
5    20201303856  122314437      1                0001/01/01 01:01:01  <NA>      <NA>       2020-10-23 14:43  1205     TBB
6    20201304531  66148112       1                0001/01/01 01:01:01  <NA>      <NA>       2020-10-23 13:50  1424     ACS
7    20201304533  106274137      1                0001/01/01 01:01:01  <NA>      <NA>       2020-10-23 11:10  1195     TNA
8    20201304532  67491208       1                0001/01/01 01:01:01  <NA>      <NA>       2020-10-23 11:10  1195     TNA
9    20201304527  84039724       1                0001/01/01 01:01:01  <NA>      <NA>       2020-10-23 9:07   1221     ACS

Last rows

     ACPT_PTNO    MDBTR_CUST_NO  LAWST_CLSS_DVCD  UPDT_TS              UPDT_ENO  UPDT_BRCD  REG_TS            REG_ENO  REG_BRCD
990  20171306673  91891829       1                2018-07-05 10:03     1345      ACS        2017-09-22 16:35  1070     ACS
991  20181303451  96997773       1                2018-09-17 14:04     1628      TAA        2018-06-28 14:30  1121     TAA
992  20181302716  36682654       1                2019-01-04 14:31     1331      ACS        2018-05-24 16:36  1093     ACS
993  20181302001  36950140       1                2018-05-21 13:22     1414      QAD        2018-04-19 14:40  1322     QAD
994  20181301276  19151759       1                2018-06-29 18:43     1798      ACS        2018-03-21 14:49  1212     ACS
995  20181301054  93804733       1                2018-04-09 13:14     1815      TAA        2018-03-09 16:56  1121     TAA
996  20171307717  47950728       1                2018-02-21 17:26     1815      TAA        2017-11-20 9:22   1143     TAA
997  20181300398  88454664       1                2018-08-29 8:25      1331      ACS        2018-01-23 16:22  1070     ACS
998  20161300650  89393821       1                2017-12-20 12:11     1256      ACS        2016-01-22 15:31  1070     ACS
999  20171307469  28560577       1                2018-01-04 10:07     1330      QAD        2017-11-07 14:52  1322     QAD