Overview

Dataset statistics

Number of variables: 11
Number of observations: 1000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 91.9 KiB
Average record size in memory: 94.1 B
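
The average record size follows from the total memory footprint divided by the number of observations. A minimal Python sketch of that arithmetic, assuming 1 KiB = 1024 bytes (the variable names are illustrative and not part of the report):

```python
# Values copied from the Dataset statistics above.
total_size_kib = 91.9
n_observations = 1000

# Assumption: KiB is the binary unit, i.e. 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")
```

The result agrees with the reported average record size to the printed precision.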

Variable types

Text: 1
Numeric: 4
Categorical: 3
DateTime: 3

Dataset

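Description: Open public data from the Housing Pension Department (주택연금부) of the Korea Housing Finance Corporation, containing individual withdrawal records (개별인출내역). It is the publicly releasable source data from the department's business database.
Author: Korea Housing Finance Corporation (한국주택금융공사)
URL: https://www.data.go.kr/data/15072808/fileData.do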

Alerts

취급자팀코드 is highly overall correlated with 취급부점코드 (High correlation)
취급부점코드 is highly overall correlated with 취급자팀코드 (High correlation)
취급자사번 is highly overall correlated with 등록사번 and 1 other fields (High correlation)
등록사번 is highly overall correlated with 취급자사번 and 1 other fields (High correlation)
최종수정자사번 is highly overall correlated with 취급자사번 and 1 other fields (High correlation)
순번 is highly imbalanced (98.9%) (Imbalance)
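
The 98.9% imbalance reported for 순번 is consistent with a score of one minus the normalised Shannon entropy of the class frequencies. The sketch below assumes that definition (the report itself does not state the formula) and uses the counts 999 and 1 from the 순번 value counts; the function name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for class counts.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    Assumed definition, not taken from the report.
    """
    k = len(counts)
    if k < 2:
        raise ValueError("need at least two classes")
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(k)


# Counts for 순번: value 1 occurs 999 times, value 2 occurs once.
score = imbalance_score([999, 1])
print(f"{score:.1%}")
```

Under this assumed definition the two-class split 999 to 1 rounds to the same percentage as the alert.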

Reproduction

Analysis started: 2023-12-13 00:26:35.817308
Analysis finished: 2023-12-13 00:26:37.574918
Duration: 1.76 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
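
The reported duration can be cross-checked from the two timestamps. A small sketch using only the Python standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-13 00:26:35.817308")
finished = datetime.fromisoformat("2023-12-13 00:26:37.574918")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```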

Variables

보증번호
Text

Distinct: 860
Distinct (%): 86.0%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Length

Max length: 14
Median length: 14
Mean length: 14
Min length: 14

Characters and Unicode

Total characters: 14000
Distinct characters: 24
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
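
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It uses only the five sample values of this variable shown in the report, so the totals cover 70 characters rather than the full 14000; `Lu` is the code for Uppercase Letter and `Nd` for Decimal Number.

```python
import unicodedata
from collections import Counter

# The five sample values listed for this variable in the report.
samples = [
    "RTAD2018000231",
    "RTOA2019000215",
    "RTOA2020000221",
    "RTBA2012000239",
    "RTQB2020000051",
]

# Count the general category of every character across the samples.
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)

for category, count in category_counts.most_common():
    print(category, count)
```

The standard library does not expose script or block properties, so this sketch covers categories only.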

Unique

Unique: 769
Unique (%): 76.9%

Sample

1st row: RTAD2018000231
2nd row: RTOA2019000215
3rd row: RTOA2020000221
4th row: RTBA2012000239
5th row: RTQB2020000051

Value               Count  Frequency (%)
rtab2018000639      22     2.2%
rtaa2019000598      6      0.6%
rtba2018001035      6      0.6%
rqad2007000193      6      0.6%
rtad2018000268      5      0.5%
rtho2018000106      4      0.4%
rqad2016000291      3      0.3%
rtad2019000616      3      0.3%
rtba2018000674      3      0.3%
rtpa2019000587      3      0.3%
Other values (850)  939    93.9%

Most occurring characters

Value              Count  Frequency (%)
0                  4542   32.4%
2                  1525   10.9%
1                  1247   8.9%
R                  1002   7.2%
A                  926    6.6%
T                  898    6.4%
9                  464    3.3%
8                  412    2.9%
7                  386    2.8%
6                  382    2.7%
Other values (14)  2216   15.8%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    10000  71.4%
Uppercase Letter  4000   28.6%

Most frequent character per category

Uppercase Letter
Value             Count  Frequency (%)
R                 1002   25.1%
A                 926    23.2%
T                 898    22.4%
B                 364    9.1%
H                 235    5.9%
D                 163    4.1%
Q                 136    3.4%
O                 98     2.5%
C                 83     2.1%
P                 35     0.9%
Other values (4)  60     1.5%

Decimal Number
Value  Count  Frequency (%)
0      4542   45.4%
2      1525   15.2%
1      1247   12.5%
9      464    4.6%
8      412    4.1%
7      386    3.9%
6      382    3.8%
5      380    3.8%
3      361    3.6%
4      301    3.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  10000  71.4%
Latin   4000   28.6%

Most frequent character per script

Latin
Value             Count  Frequency (%)
R                 1002   25.1%
A                 926    23.2%
T                 898    22.4%
B                 364    9.1%
H                 235    5.9%
D                 163    4.1%
Q                 136    3.4%
O                 98     2.5%
C                 83     2.1%
P                 35     0.9%
Other values (4)  60     1.5%

Common
Value  Count  Frequency (%)
0      4542   45.4%
2      1525   15.2%
1      1247   12.5%
9      464    4.6%
8      412    4.1%
7      386    3.9%
6      382    3.8%
5      380    3.8%
3      361    3.6%
4      301    3.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  14000  100.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
0                  4542   32.4%
2                  1525   10.9%
1                  1247   8.9%
R                  1002   7.2%
A                  926    6.6%
T                  898    6.4%
9                  464    3.3%
8                  412    2.9%
7                  386    2.8%
6                  382    2.7%
Other values (14)  2216   15.8%

개별순번
Real number (ℝ)

Distinct: 36
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.74
Minimum: 1
Maximum: 36
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 2
Q3: 3
95-th percentile: 7
Maximum: 36
Range: 35
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 4.0411299
Coefficient of variation (CV): 1.4748649
Kurtosis: 30.95977
Mean: 2.74
Median Absolute Deviation (MAD): 1
Skewness: 5.1820006
Sum: 2740
Variance: 16.330731
Monotonicity: Not monotonic
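
Several of these descriptive statistics are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. A short Python check using the values reported for 개별순번 (because the report rounds its figures, agreement is only expected to the printed precision):

```python
# Values copied from the report for 개별순번.
mean = 2.74
std = 4.0411299
n_observations = 1000

cv = std / mean                 # coefficient of variation
variance = std ** 2             # variance from the standard deviation
total = mean * n_observations   # sum of all values

print(f"CV       = {cv:.7f}")
print(f"Variance = {variance:.6f}")
print(f"Sum      = {total:.0f}")
```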
Histogram with fixed size bins (bins=36)

Common values
Value              Count  Frequency (%)
1                  411    41.1%
2                  335    33.5%
3                  108    10.8%
4                  42     4.2%
5                  26     2.6%
6                  19     1.9%
7                  10     1.0%
9                  7      0.7%
8                  6      0.6%
12                 4      0.4%
Other values (26)  32     3.2%

Minimum 10 values
Value  Count  Frequency (%)
1      411    41.1%
2      335    33.5%
3      108    10.8%
4      42     4.2%
5      26     2.6%
6      19     1.9%
7      10     1.0%
8      6      0.6%
9      7      0.7%
10     2      0.2%

Maximum 10 values
Value  Count  Frequency (%)
36     1      0.1%
35     1      0.1%
34     1      0.1%
33     1      0.1%
32     1      0.1%
31     1      0.1%
30     1      0.1%
29     1      0.1%
28     1      0.1%
27     1      0.1%

순번
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Value  Count
1      999
2      1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      999    99.9%
2      1      0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Bar plot of the value counts: 1 (999, 99.9%) and 2 (1, 0.1%)

사용일자
DateTime

Distinct: 212
Distinct (%): 21.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB
Minimum: 2000-03-10 00:00:00
Maximum: 2022-05-22 00:00:00
Histogram with fixed size bins (bins=50)

취급부점코드
Categorical

HIGH CORRELATION 

Distinct: 25
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Value              Count
TAB                119
TAA                101
TBA                97
THB                94
TAC                91
Other values (20)  498

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 2
Unique (%): 0.2%

Sample

1st row: TAD
2nd row: TOA
3rd row: TOA
4th row: TBA
5th row: TQB

Common Values

Value              Count  Frequency (%)
TAB                119    11.9%
TAA                101    10.1%
TBA                97     9.7%
THB                94     9.4%
TAC                91     9.1%
QAD                85     8.5%
THA                82     8.2%
TAD                82     8.2%
THO                66     6.6%
TOA                29     2.9%
Other values (15)  154    15.4%

Length

Histogram of lengths of the category

취급자팀코드
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Value  Count
1      485
3      267
2      248

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 2
3rd row: 2
4th row: 3
5th row: 1

Common Values

Value  Count  Frequency (%)
1      485    48.5%
3      267    26.7%
2      248    24.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Bar plot of the value counts: 1 (485, 48.5%), 3 (267, 26.7%) and 2 (248, 24.8%)

취급자사번
Real number (ℝ)

HIGH CORRELATION 

Distinct: 100
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1767.63
Minimum: 1174
Maximum: 6018
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 1174
5-th percentile: 1385
Q1: 1569
Median: 1689
Q3: 1860.75
95-th percentile: 2001
Maximum: 6018
Range: 4844
Interquartile range (IQR): 291.75

Descriptive statistics

Standard deviation: 558.63335
Coefficient of variation (CV): 0.31603523
Kurtosis: 47.669491
Mean: 1767.63
Median Absolute Deviation (MAD): 132
Skewness: 6.5684115
Sum: 1767630
Variance: 312071.22
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value              Count  Frequency (%)
1689               73     7.3%
1656               47     4.7%
2001               45     4.5%
1799               45     4.5%
1753               44     4.4%
1475               41     4.1%
1406               36     3.6%
1650               35     3.5%
1686               35     3.5%
1691               30     3.0%
Other values (90)  569    56.9%

Minimum 10 values
Value  Count  Frequency (%)
1174   22     2.2%
1304   6      0.6%
1371   9      0.9%
1375   2      0.2%
1385   27     2.7%
1406   36     3.6%
1410   3      0.3%
1446   1      0.1%
1469   1      0.1%
1475   41     4.1%

Maximum 10 values
Value  Count  Frequency (%)
6018   15     1.5%
2003   14     1.4%
2002   1      0.1%
2001   45     4.5%
2000   6      0.6%
1999   2      0.2%
1997   1      0.1%
1987   4      0.4%
1982   4      0.4%
1978   6      0.6%

등록사번
Real number (ℝ)

HIGH CORRELATION 

Distinct: 100
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1767.63
Minimum: 1174
Maximum: 6018
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 1174
5-th percentile: 1385
Q1: 1569
Median: 1689
Q3: 1860.75
95-th percentile: 2001
Maximum: 6018
Range: 4844
Interquartile range (IQR): 291.75

Descriptive statistics

Standard deviation: 558.63335
Coefficient of variation (CV): 0.31603523
Kurtosis: 47.669491
Mean: 1767.63
Median Absolute Deviation (MAD): 132
Skewness: 6.5684115
Sum: 1767630
Variance: 312071.22
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value              Count  Frequency (%)
1689               73     7.3%
1656               47     4.7%
2001               45     4.5%
1799               45     4.5%
1753               44     4.4%
1475               41     4.1%
1406               36     3.6%
1650               35     3.5%
1686               35     3.5%
1691               30     3.0%
Other values (90)  569    56.9%

Minimum 10 values
Value  Count  Frequency (%)
1174   22     2.2%
1304   6      0.6%
1371   9      0.9%
1375   2      0.2%
1385   27     2.7%
1406   36     3.6%
1410   3      0.3%
1446   1      0.1%
1469   1      0.1%
1475   41     4.1%

Maximum 10 values
Value  Count  Frequency (%)
6018   15     1.5%
2003   14     1.4%
2002   1      0.1%
2001   45     4.5%
2000   6      0.6%
1999   2      0.2%
1997   1      0.1%
1987   4      0.4%
1982   4      0.4%
1978   6      0.6%

등록일시
DateTime

Distinct: 996
Distinct (%): 99.6%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB
Minimum: 2019-12-23 10:37:00
Maximum: 2020-10-26 11:30:00
Histogram with fixed size bins (bins=50)

최종수정자사번
Real number (ℝ)

HIGH CORRELATION 

Distinct: 100
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1751.256
Minimum: 1174
Maximum: 6018
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 1174
5-th percentile: 1375
Q1: 1554
Median: 1689
Q3: 1859
95-th percentile: 2001
Maximum: 6018
Range: 4844
Interquartile range (IQR): 305

Descriptive statistics

Standard deviation: 563.41954
Coefficient of variation (CV): 0.32172312
Kurtosis: 46.729375
Mean: 1751.256
Median Absolute Deviation (MAD): 142
Skewness: 6.4759782
Sum: 1751256
Variance: 317441.58
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value              Count  Frequency (%)
1689               73     7.3%
1656               47     4.7%
2001               46     4.6%
1799               45     4.5%
1753               44     4.4%
1375               42     4.2%
1475               41     4.1%
1406               36     3.6%
1650               35     3.5%
1686               35     3.5%
Other values (90)  556    55.6%

Minimum 10 values
Value  Count  Frequency (%)
1174   22     2.2%
1304   6      0.6%
1371   9      0.9%
1375   42     4.2%
1385   27     2.7%
1406   36     3.6%
1410   3      0.3%
1446   1      0.1%
1469   1      0.1%
1475   41     4.1%

Maximum 10 values
Value  Count  Frequency (%)
6018   15     1.5%
2003   14     1.4%
2002   1      0.1%
2001   46     4.6%
2000   7      0.7%
1999   2      0.2%
1997   1      0.1%
1987   4      0.4%
1982   4      0.4%
1978   3      0.3%

갱신일자
DateTime

Distinct: 981
Distinct (%): 98.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB
Minimum: 2019-12-23 10:43:00
Maximum: 2020-10-26 11:30:00
Histogram with fixed size bins (bins=50)

Interactions

Scatter plots for each pair of the four numeric variables (16 panels).

Correlations

                개별순번  순번   취급부점코드  취급자팀코드  취급자사번  등록사번  최종수정자사번
개별순번        1.000     0.000  0.203         0.271         0.120       0.120     0.124
순번            0.000     1.000  0.000         0.000         0.000       0.000     0.000
취급부점코드    0.203     0.000  1.000         0.971         0.637       0.637     0.643
취급자팀코드    0.271     0.000  0.971         1.000         0.525       0.525     0.462
취급자사번      0.120     0.000  0.637         0.525         1.000       1.000     0.999
등록사번        0.120     0.000  0.637         0.525         1.000       1.000     0.999
최종수정자사번  0.124     0.000  0.643         0.462         0.999       0.999     1.000

                취급자팀코드  취급부점코드  순번
취급자팀코드    1.000         0.905         0.000
취급부점코드    0.905         1.000         0.000
순번            0.000         0.000         1.000

                개별순번  취급자사번  등록사번  최종수정자사번  순번   취급부점코드  취급자팀코드
개별순번        1.000     0.054       0.054     0.037           0.000  0.052         0.157
취급자사번      0.054     1.000       1.000     0.885           0.000  0.420         0.230
등록사번        0.054     1.000       1.000     0.885           0.000  0.420         0.230
최종수정자사번  0.037     0.885       0.885     1.000           0.000  0.423         0.190
순번            0.000     0.000       0.000     0.000           1.000  0.000         0.000
취급부점코드    0.052     0.420       0.420     0.423           0.000  1.000         0.905
취급자팀코드    0.157     0.230       0.230     0.190           0.000  0.905         1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

     보증번호        개별순번  순번  사용일자    취급부점코드  취급자팀코드  취급자사번  등록사번  등록일시          최종수정자사번  갱신일자
0    RTAD2018000231  2         1     2020-10-23  TAD           1             1656        1656      2020-10-23 17:45  1656            2020-10-23 17:46
1    RTOA2019000215  2         1     2020-10-26  TOA           2             2000        2000      2020-10-26 11:30  2000            2020-10-26 11:30
2    RTOA2020000221  1         1     2020-10-23  TOA           2             2000        2000      2020-10-23 17:02  2000            2020-10-23 17:05
3    RTBA2012000239  1         1     2020-10-23  TBA           3             1702        1702      2020-10-23 13:54  1375            2020-10-23 13:55
4    RTQB2020000051  2         1     2020-10-26  TQB           1             1817        1817      2020-10-23 16:31  1817            2020-10-23 16:55
5    RTOB2016000053  4         1     2020-10-26  TOB           1             1965        1965      2020-10-23 13:49  1965            2020-10-23 13:57
6    RTAC2019000464  1         1     2020-10-26  TAC           3             1788        1788      2020-10-23 13:51  1788            2020-10-23 15:29
7    RTAB2019000564  3         1     2020-10-26  TAB           3             1689        1689      2020-10-23 12:24  1689            2020-10-23 14:07
8    RTBA2020000578  1         1     2020-10-26  TBA           2             1720        1720      2020-10-23 11:04  1375            2020-10-23 11:14
9    RTAC2016000763  1         1     2020-10-26  TAC           3             1753        1753      2020-10-23 09:52  1753            2020-10-23 13:48

Last rows

     보증번호        개별순번  순번  사용일자    취급부점코드  취급자팀코드  취급자사번  등록사번  등록일시          최종수정자사번  갱신일자
990  RTAD2018000479  1         1     2019-12-24  TAD           1             1650        1650      2019-12-23 18:12  1650            2019-12-23 18:14
991  RTLB2015000021  2         1     2019-12-24  TLB           1             1476        1476      2019-12-23 17:37  1476            2019-12-23 17:40
992  RTHB2017000329  1         1     2019-12-24  THB           2             1773        1773      2019-12-23 17:12  1773            2019-12-23 17:15
993  RTMA2017000180  3         1     2019-12-24  TMA           1             1921        1921      2019-12-23 17:04  1921            2019-12-23 17:29
994  RTBA2011000032  1         1     2019-12-24  TBA           3             1859        1859      2019-12-23 16:22  1859            2019-12-23 17:38
995  RQAD2017000324  1         1     2019-12-24  QAD           1             1686        1686      2019-12-23 16:06  1686            2019-12-23 16:57
996  RTAB2016000082  12        1     2019-12-24  TAB           3             1532        1532      2019-12-23 16:01  1532            2019-12-23 17:21
997  RTQA2019000346  3         1     2019-12-23  TQA           1             1846        1846      2019-12-23 15:09  1846            2019-12-23 19:49
998  RTAB2015000611  1         1     2019-12-24  TAB           3             1513        1513      2019-12-23 10:37  1513            2019-12-23 10:43
999  RTPA2019000592  1         1     2019-12-24  TPA           3             1655        1655      2019-12-24 16:12  1655            2019-12-24 16:36