Overview

Dataset statistics

Number of variables: 8
Number of observations: 1000
Missing cells: 889
Missing cells (%): 11.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 66.5 KiB
Average record size in memory: 68.1 B
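
The percentage and per-record figures follow from the raw counts. A minimal Python sketch of that arithmetic, assuming the report rounds to one decimal place and that 1 KiB is 1024 bytes (the variable names are illustrative and not part of the report):

```python
# Figures copied from the dataset statistics in this report.
n_variables = 8
n_observations = 1000
missing_cells = 889
total_size_kib = 66.5

# Share of missing cells over all cells (variables x observations).
total_cells = n_variables * n_observations
missing_pct = round(100 * missing_cells / total_cells, 1)

# Average record size in bytes, assuming 1 KiB = 1024 bytes.
avg_record_bytes = round(total_size_kib * 1024 / n_observations, 1)

print(missing_pct, avg_record_bytes)
```

Both results agree with the reported 11.1% and 68.1 B.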

Variable types

Text: 1
Numeric: 4
Categorical: 2
DateTime: 1

Dataset

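Description: Open public data on the statute-of-limitations records handled by the Claims Management Department (채권관리부) of the Korea Housing Finance Corporation. It is the source data that can be disclosed from the database supporting that department's work.
Author: Korea Housing Finance Corporation (한국주택금융공사)
URL: https://www.data.go.kr/data/15072960/fileData.do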

Alerts

High correlation: 채무관계자고객번호 is highly correlated overall with 주채무자고객번호
High correlation: 주채무자고객번호 is highly correlated overall with 채무관계자고객번호
High correlation: 변경자사번 is highly correlated overall with 변경부점코드
High correlation: 등록자사번 is highly correlated overall with 등록부점코드
High correlation: 변경부점코드 is highly correlated overall with 변경자사번 and 1 other field
High correlation: 등록부점코드 is highly correlated overall with 등록자사번 and 1 other field
Imbalance: 변경부점코드 is highly imbalanced (75.0%)
Missing: 변경자사번 has 838 (83.8%) missing values
Missing: 등록자사번 has 51 (5.1%) missing values

Reproduction

Analysis started: 2023-12-12 18:46:02.048880
Analysis finished: 2023-12-12 18:46:06.114879
Duration: 4.07 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
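
The reported duration can be checked against the two timestamps. A small sketch using only the Python standard library, assuming both timestamps share the same time zone:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-12 18:46:02.048880")
finished = datetime.fromisoformat("2023-12-12 18:46:06.114879")

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()

print(round(duration_seconds, 2))
```

Rounded to two decimals, the difference matches the reported 4.07 seconds.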

Variables

보증번호
Text

Distinct: 822
Distinct (%): 82.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Length

Max length: 13
Median length: 13
Mean length: 13
Min length: 13

Characters and Unicode

Total characters: 13000
Distinct characters: 27
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 717
Unique (%): 71.7%

Sample

1st row: TOA2000000763
2nd row: TOA2010011407
3rd row: THA2010055042
4th row: QAC2001080430
5th row: QAC2001108266
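
The sample rows, the constant length of 13, and the character counts reported for this variable (3000 uppercase letters and 10000 digits over 1000 values) are all consistent with a fixed layout of three uppercase letters followed by ten digits. That layout is an inference from this report, not a documented specification. A hedged Python sketch of a format check under that assumption (the pattern and the function name are hypothetical):

```python
import re

# Assumed layout: 3 uppercase ASCII letters followed by 10 digits (13 characters).
GUARANTEE_NO_PATTERN = re.compile(r"[A-Z]{3}[0-9]{10}")


def is_valid_guarantee_no(value: str) -> bool:
    """Return True if the value matches the assumed 보증번호 layout."""
    return GUARANTEE_NO_PATTERN.fullmatch(value) is not None


# The five sample rows shown in this report.
samples = [
    "TOA2000000763",
    "TOA2010011407",
    "THA2010055042",
    "QAC2001080430",
    "QAC2001108266",
]
all_valid = all(is_valid_guarantee_no(s) for s in samples)
print(all_valid)
```

A check like this can flag rows that deviate from the layout before the column is used as a join key.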
Value               Count  Frequency (%)
tab2012032155       12     1.2%
thb2012041934       9      0.9%
tho2007003647       9      0.9%
toa2011004107       9      0.9%
taa2012075024       8      0.8%
tla2012012148       7      0.7%
tma2014007207       5      0.5%
thb2013032610       5      0.5%
qad2014014490       5      0.5%
qad2006059244       5      0.5%
Other values (812)  926    92.6%

Most occurring characters

Value              Count  Frequency (%)
0                  2896   22.3%
2                  1681   12.9%
1                  1525   11.7%
A                  980    7.5%
T                  822    6.3%
4                  638    4.9%
3                  633    4.9%
7                  558    4.3%
6                  546    4.2%
8                  543    4.2%
Other values (17)  2178   16.8%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    10000  76.9%
Uppercase Letter  3000   23.1%

Most frequent character per category

Uppercase Letter
Value             Count  Frequency (%)
A                 980    32.7%
T                 822    27.4%
Q                 220    7.3%
H                 205    6.8%
B                 183    6.1%
D                 176    5.9%
O                 138    4.6%
C                 90     3.0%
P                 65     2.2%
N                 32     1.1%
Other values (7)  89     3.0%

Decimal Number
Value  Count  Frequency (%)
0      2896   29.0%
2      1681   16.8%
1      1525   15.2%
4      638    6.4%
3      633    6.3%
7      558    5.6%
6      546    5.5%
8      543    5.4%
9      490    4.9%
5      490    4.9%

Most occurring scripts

Value   Count  Frequency (%)
Common  10000  76.9%
Latin   3000   23.1%

Most frequent character per script

Latin
Value             Count  Frequency (%)
A                 980    32.7%
T                 822    27.4%
Q                 220    7.3%
H                 205    6.8%
B                 183    6.1%
D                 176    5.9%
O                 138    4.6%
C                 90     3.0%
P                 65     2.2%
N                 32     1.1%
Other values (7)  89     3.0%

Common
Value  Count  Frequency (%)
0      2896   29.0%
2      1681   16.8%
1      1525   15.2%
4      638    6.4%
3      633    6.3%
7      558    5.6%
6      546    5.5%
8      543    5.4%
9      490    4.9%
5      490    4.9%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  13000  100.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
0                  2896   22.3%
2                  1681   12.9%
1                  1525   11.7%
A                  980    7.5%
T                  822    6.3%
4                  638    4.9%
3                  633    4.9%
7                  558    4.3%
6                  546    4.2%
8                  543    4.2%
Other values (17)  2178   16.8%

채무관계자고객번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 901
Distinct (%): 90.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 85690709
Minimum: 7010718
Maximum: 1.4530602 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 7010718
5-th percentile: 21854280
Q1: 62378803
Median: 89057387
Q3: 1.1325799 × 10^8
95-th percentile: 1.4501209 × 10^8
Maximum: 1.4530602 × 10^8
Range: 1.3829531 × 10^8
Interquartile range (IQR): 50879184

Descriptive statistics

Standard deviation: 36331009
Coefficient of variation (CV): 0.42397839
Kurtosis: -0.76694522
Mean: 85690709
Median Absolute Deviation (MAD): 25157986
Skewness: -0.33064852
Sum: 8.5690709 × 10^10
Variance: 1.3199422 × 10^15
Monotonicity: Not monotonic
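
Several of these statistics are derived from one another. A minimal Python sketch that re-derives the coefficient of variation, the variance, and the interquartile range from the figures reported for 채무관계자고객번호. It assumes the standard definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1). Because the report prints only about eight significant digits, the results match up to rounding:

```python
# Figures copied from the statistics for 채무관계자고객번호.
std = 36331009
mean = 85690709
q1 = 62378803
q3 = 1.1325799e8  # printed with 8 significant digits in the report

# Coefficient of variation: spread relative to the mean.
cv = std / mean

# Variance is the square of the standard deviation.
variance = std ** 2

# Interquartile range: width of the middle 50% of the values.
iqr = q3 - q1

print(f"CV={cv:.8f} variance={variance:.7e} IQR={iqr:.0f}")
```

The recomputed IQR differs from the reported 50879184 by a few units because Q3 is shown in rounded form.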
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
18880092            4      0.4%
76668569            3      0.3%
69153285            3      0.3%
23783870            3      0.3%
64109425            2      0.2%
108367194           2      0.2%
69584508            2      0.2%
92799522            2      0.2%
15380605            2      0.2%
97549544            2      0.2%
Other values (891)  975    97.5%

Minimum 10 values
Value     Count  Frequency (%)
7010718   1      0.1%
7900417   1      0.1%
7901513   2      0.2%
9005970   1      0.1%
9370728   1      0.1%
9515965   1      0.1%
10199464  1      0.1%
11006255  1      0.1%
11338936  1      0.1%
11706458  1      0.1%

Maximum 10 values
Value      Count  Frequency (%)
145306024  1      0.1%
145279533  2      0.2%
145279371  2      0.2%
145279216  2      0.2%
145278301  2      0.2%
145278259  2      0.2%
145271023  2      0.2%
145191352  1      0.1%
145172083  1      0.1%
145172070  1      0.1%

주채무자고객번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 798
Distinct (%): 79.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 78303049
Minimum: 7900417
Maximum: 1.3810488 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 7900417
5-th percentile: 22712866
Q1: 56645238
Median: 84097795
Q3: 1.0166293 × 10^8
95-th percentile: 1.2229328 × 10^8
Maximum: 1.3810488 × 10^8
Range: 1.3020446 × 10^8
Interquartile range (IQR): 45017688

Descriptive statistics

Standard deviation: 31872776
Coefficient of variation (CV): 0.40704387
Kurtosis: -0.89355179
Mean: 78303049
Median Absolute Deviation (MAD): 23879610
Skewness: -0.39820605
Sum: 7.8303049 × 10^10
Variance: 1.0158739 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
87194051            12     1.2%
45612356            9      0.9%
62949865            9      0.9%
81984665            9      0.9%
57348248            8      0.8%
89860464            7      0.7%
77924969            5      0.5%
93819113            5      0.5%
96504917            5      0.5%
61500399            5      0.5%
Other values (788)  926    92.6%

Minimum 10 values
Value     Count  Frequency (%)
7900417   1      0.1%
7901513   2      0.2%
9005970   1      0.1%
9515965   1      0.1%
10199464  1      0.1%
11006255  1      0.1%
11338936  1      0.1%
13136631  1      0.1%
14929656  1      0.1%
14939361  1      0.1%

Maximum 10 values
Value      Count  Frequency (%)
138104880  1      0.1%
131765404  1      0.1%
130954009  1      0.1%
130715235  1      0.1%
129984093  1      0.1%
129722161  1      0.1%
129424993  1      0.1%
129304796  1      0.1%
128782319  1      0.1%
128660460  1      0.1%

변경자사번
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 54
Distinct (%): 33.3%
Missing: 838
Missing (%): 83.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 1523.8889
Minimum: 1007
Maximum: 8889
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB
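
Note that 54 distinct values out of 1000 rows would be 5.4%, not 33.3%. The reported figures are consistent with Distinct (%) and the mean being computed over the non-missing values only. This is an inference from the numbers in this report, not something the report states. A short Python check under that assumption:

```python
# Figures copied from the statistics for 변경자사번.
n_observations = 1000
n_missing = 838
n_distinct = 54
total_sum = 246870  # "Sum" from the descriptive statistics

# Rows that actually carry a value.
n_present = n_observations - n_missing

# Missing share is taken over all rows.
missing_pct = round(100 * n_missing / n_observations, 1)

# Distinct share and mean are taken over non-missing rows (assumption).
distinct_pct = round(100 * n_distinct / n_present, 1)
mean = round(total_sum / n_present, 4)

print(n_present, missing_pct, distinct_pct, mean)
```

With 162 non-missing rows, the distinct share and the mean both reproduce the reported 33.3% and 1523.8889.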

Quantile statistics

Minimum: 1007
5-th percentile: 1122.05
Q1: 1221
Median: 1503.5
Q3: 1711
95-th percentile: 1921
Maximum: 8889
Range: 7882
Interquartile range (IQR): 490

Descriptive statistics

Standard deviation: 646.81839
Coefficient of variation (CV): 0.42445246
Kurtosis: 105.38463
Mean: 1523.8889
Median Absolute Deviation (MAD): 280.5
Skewness: 9.2703925
Sum: 246870
Variance: 418374.02
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value              Count  Frequency (%)
1253               26     2.6%
1221               17     1.7%
1520               13     1.3%
1890               13     1.3%
1603               9      0.9%
1842               5      0.5%
1201               4      0.4%
1921               4      0.4%
1339               3      0.3%
1166               3      0.3%
Other values (44)  65     6.5%
(Missing)          838    83.8%

Minimum 10 values
Value  Count  Frequency (%)
1007   1      0.1%
1032   2      0.2%
1037   1      0.1%
1086   1      0.1%
1103   1      0.1%
1108   1      0.1%
1121   2      0.2%
1142   2      0.2%
1149   1      0.1%
1157   1      0.1%

Maximum 10 values
Value  Count  Frequency (%)
8889   1      0.1%
1978   3      0.3%
1958   2      0.2%
1935   1      0.1%
1934   1      0.1%
1921   4      0.4%
1890   13     1.3%
1883   1      0.1%
1872   2      0.2%
1869   1      0.1%

변경부점코드
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 24
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Value              Count
<NA>               838
ACS                72
TOA                15
QAD                14
TAC                8
Other values (19)  53

Length

Max length: 4
Median length: 4
Mean length: 3.838
Min length: 3

Unique

Unique: 9
Unique (%): 0.9%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: ACS
5th row: <NA>

Common Values

Value              Count  Frequency (%)
<NA>               838    83.8%
ACS                72     7.2%
TOA                15     1.5%
QAD                14     1.4%
TAC                8      0.8%
TBA                8      0.8%
THA                7      0.7%
TAB                5      0.5%
TMA                5      0.5%
TAA                5      0.5%
Other values (14)  23     2.3%

Length

Histogram of lengths of the category

Value              Count  Frequency (%)
na                 838    83.8%
acs                72     7.2%
toa                15     1.5%
qad                14     1.4%
tac                8      0.8%
tba                8      0.8%
tha                7      0.7%
tab                5      0.5%
tma                5      0.5%
taa                5      0.5%
Other values (14)  23     2.3%

등록일시
DateTime

Distinct: 622
Distinct (%): 62.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB
Minimum: 2010-07-21 13:47:00
Maximum: 2020-10-27 18:06:00
Histogram with fixed size bins (bins=50)

등록자사번
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 145
Distinct (%): 15.3%
Missing: 51
Missing (%): 5.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 2030.5553
Minimum: 1088
Maximum: 52042
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 1088
5-th percentile: 1253
Q1: 1544
Median: 1648
Q3: 1867
95-th percentile: 1973
Maximum: 52042
Range: 50954
Interquartile range (IQR): 323

Descriptive statistics

Standard deviation: 3704.0655
Coefficient of variation (CV): 1.8241638
Kurtosis: 164.89458
Mean: 2030.5553
Median Absolute Deviation (MAD): 150
Skewness: 12.614862
Sum: 1926997
Variance: 13720102
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
1690                50     5.0%
1520                50     5.0%
1605                48     4.8%
1603                47     4.7%
1872                41     4.1%
1890                41     4.1%
1253                40     4.0%
1842                31     3.1%
1867                31     3.1%
1590                28     2.8%
Other values (135)  542    54.2%
(Missing)           51     5.1%

Minimum 10 values
Value  Count  Frequency (%)
1088   1      0.1%
1121   1      0.1%
1127   1      0.1%
1144   1      0.1%
1148   1      0.1%
1156   1      0.1%
1158   2      0.2%
1163   1      0.1%
1170   2      0.2%
1184   1      0.1%

Maximum 10 values
Value  Count  Frequency (%)
52042  1      0.1%
51646  1      0.1%
51641  1      0.1%
51010  1      0.1%
50711  1      0.1%
8889   13     1.3%
7403   1      0.1%
5003   1      0.1%
2002   1      0.1%
1995   1      0.1%

등록부점코드
Categorical

HIGH CORRELATION 

Distinct: 28
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Value              Count
ACS                170
QAD                122
TAC                81
TAA                78
TOA                65
Other values (23)  484

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 2
Unique (%): 0.2%

Sample

1st row: TOA
2nd row: TOA
3rd row: TOA
4th row: ACS
5th row: ACS

Common Values

Value              Count  Frequency (%)
ACS                170    17.0%
QAD                122    12.2%
TAC                81     8.1%
TAA                78     7.8%
TOA                65     6.5%
THA                52     5.2%
TAD                51     5.1%
TAB                45     4.5%
TPA                44     4.4%
TBA                41     4.1%
Other values (18)  251    25.1%

Length

Histogram of lengths of the category

Value              Count  Frequency (%)
acs                170    17.0%
qad                122    12.2%
tac                81     8.1%
taa                78     7.8%
toa                65     6.5%
tha                52     5.2%
tad                51     5.1%
tab                45     4.5%
tpa                44     4.4%
tba                41     4.1%
Other values (18)  251    25.1%

Interactions

[16 interaction plots; images not preserved in this export]

Correlations

Variable  채무관계자고객번호  주채무자고객번호  변경자사번  변경부점코드  등록자사번  등록부점코드
채무관계자고객번호  1.000  0.972  0.000  0.601  0.115  0.488
주채무자고객번호  0.972  1.000  0.000  0.532  0.112  0.470
변경자사번  0.000  0.000  1.000  0.907  0.000  0.551
변경부점코드  0.601  0.532  0.907  1.000  0.000  0.980
등록자사번  0.115  0.112  0.000  0.000  1.000  0.855
등록부점코드  0.488  0.470  0.551  0.980  0.855  1.000

Variable  변경부점코드  등록부점코드
변경부점코드  1.000  0.797
등록부점코드  0.797  1.000

Variable  채무관계자고객번호  주채무자고객번호  변경자사번  등록자사번  변경부점코드  등록부점코드
채무관계자고객번호  1.000  0.748  0.004  0.054  0.256  0.195
주채무자고객번호  0.748  1.000  0.042  0.175  0.224  0.186
변경자사번  0.004  0.042  1.000  0.139  0.715  0.307
등록자사번  0.054  0.175  0.139  1.000  0.000  0.667
변경부점코드  0.256  0.224  0.715  0.000  1.000  0.797
등록부점코드  0.195  0.186  0.307  0.667  0.797  1.000
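
The high-correlation alerts at the top of the report can be related to the last matrix above. A hedged Python sketch that lists the variable pairs whose coefficient reaches a cutoff. The cutoff of 0.5 is an assumption chosen because it reproduces the alerts in this report; the report itself does not state the threshold it used.

```python
# Column order and values transcribed from the last correlation matrix.
columns = [
    "채무관계자고객번호", "주채무자고객번호", "변경자사번",
    "등록자사번", "변경부점코드", "등록부점코드",
]
matrix = [
    [1.000, 0.748, 0.004, 0.054, 0.256, 0.195],
    [0.748, 1.000, 0.042, 0.175, 0.224, 0.186],
    [0.004, 0.042, 1.000, 0.139, 0.715, 0.307],
    [0.054, 0.175, 0.139, 1.000, 0.000, 0.667],
    [0.256, 0.224, 0.715, 0.000, 1.000, 0.797],
    [0.195, 0.186, 0.307, 0.667, 0.797, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, not taken from the report

# Walk the upper triangle so each unordered pair is reported once.
high_pairs = [
    (columns[i], columns[j], matrix[i][j])
    for i in range(len(columns))
    for j in range(i + 1, len(columns))
    if matrix[i][j] >= THRESHOLD
]

for left, right, value in high_pairs:
    print(f"{left} ~ {right}: {value:.3f}")
```

Under this assumption four pairs qualify, and 변경부점코드 and 등록부점코드 each appear in two of them, which matches the "and 1 other field" wording of their alerts.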

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows
Index  보증번호  채무관계자고객번호  주채무자고객번호  변경자사번  변경부점코드  등록일시  등록자사번  등록부점코드
0  TOA2000000763  21694709  21694709  <NA>  <NA>  2020-10-27 18:06  1520  TOA
1  TOA2010011407  79181700  79181700  <NA>  <NA>  2020-10-27 18:06  1520  TOA
2  THA2010055042  79974632  79974632  <NA>  <NA>  2020-10-27 18:06  1520  TOA
3  QAC2001080430  32282058  32282003  1424  ACS  2012-10-17 12:36  51010  ACS
4  QAC2001108266  33354215  33354215  <NA>  <NA>  2020-10-27 17:42  1935  ACS
5  TQH2000000326  22686855  22686855  <NA>  <NA>  2020-10-27 17:42  1935  ACS
6  TOA2013021265  95805251  95805251  <NA>  <NA>  2020-10-27 16:55  1520  TOA
7  TOA2013013420  94256733  94256733  <NA>  <NA>  2020-10-27 16:55  1520  TOA
8  TOA2009001097  73056556  73056200  <NA>  <NA>  2020-10-27 16:53  1520  TOA
9  TOA2009001097  73056200  73056200  <NA>  <NA>  2020-10-27 16:53  1520  TOA

Last rows
Index  보증번호  채무관계자고객번호  주채무자고객번호  변경자사번  변경부점코드  등록일시  등록자사번  등록부점코드
990  THA2011064740  84737372  84737372  1842  THA  2015-11-10 11:42  1402  THA
991  QAD2010046192  37239789  37239789  <NA>  <NA>  2020-10-07 16:14  1872  TAC
992  QAD2010046192  37239789  37239789  1872  TAC  2015-10-27 13:26  1428  TAC
993  TNA2015018437  109004359  109004359  <NA>  <NA>  2020-10-07 15:57  1915  TNA
994  TNA2015005266  95465565  95465565  <NA>  <NA>  2020-10-07 15:57  1915  TNA
995  TNA2018009705  122477248  122477248  <NA>  <NA>  2020-10-07 15:57  1915  TNA
996  TNA2015011935  97966431  97966431  <NA>  <NA>  2020-10-07 15:57  1915  TNA
997  TNA2014014896  99214741  99214741  <NA>  <NA>  2020-10-07 15:57  1915  TNA
998  TNA2018017254  124730938  124730938  <NA>  <NA>  2020-10-07 15:57  1915  TNA
999  TNA2011009119  83993573  83993573  <NA>  <NA>  2020-10-07 15:57  1915  TNA