Overview

Dataset statistics

Number of variables: 9
Number of observations: 1000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 74.3 KiB
Average record size in memory: 76.1 B
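
The average record size appears to be the total in-memory size divided by the number of observations. A minimal Python check of that relationship, assuming 1 KiB = 1024 bytes and rounding to one decimal place (both are assumptions, not stated in the report):

```python
# Figures copied from the dataset statistics above.
total_size_kib = 74.3   # "Total size in memory"
n_observations = 1000   # "Number of observations"

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{avg_record_bytes:.1f} B")
```

Under these assumptions the result agrees with the reported 76.1 B.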

Variable types

Categorical: 4
DateTime: 2
Numeric: 3

Dataset

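Description: Public open data on the work of the Trust Assets Department (신탁자산부) of the Korea Housing Finance Corporation (한국주택금융공사): source data from the department's work-related databases that can be released publicly.
URL: https://www.data.go.kr/data/15073315/fileData.do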

Alerts

Constant: 유동화계획코드 has constant value "KHFCMB2020S-34"
High correlation: 수수료종류 is highly overall correlated with 수수료율 and 4 other fields
High correlation: 처리자사번 is highly overall correlated with 수수료율 and 3 other fields
High correlation: 관리수수료코드 is highly overall correlated with 수수료율 and 2 other fields
High correlation: 수수료율 is highly overall correlated with 수수료종류 and 2 other fields
High correlation: 계산기준금액 is highly overall correlated with 지급금액 and 2 other fields
High correlation: 지급금액 is highly overall correlated with 계산기준금액 and 1 other field
Imbalance: 수수료종류 is highly imbalanced (71.2%)
Imbalance: 관리수수료코드 is highly imbalanced (58.3%)
Imbalance: 처리자사번 is highly imbalanced (58.7%)
Zeros: 수수료율 has 63 (6.3%) zeros

Reproduction

Analysis started: 2023-12-12 22:27:18.435024
Analysis finished: 2023-12-12 22:27:20.091558
Duration: 1.66 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
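
The duration follows from the two timestamps. A short sketch using only the Python standard library, with the timestamps copied from this block:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2023-12-12 22:27:18.435024")
finished = datetime.fromisoformat("2023-12-12 22:27:20.091558")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```

Rounded to two decimals, the difference matches the reported duration.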

Variables

유동화계획코드
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

KHFCMB2020S-34  1000

Length

Max length: 14
Median length: 14
Mean length: 14
Min length: 14

Unique

Unique: 0
Unique (%): 0.0%
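
Distinct is 1 while Unique is 0 because the two counts measure different things. The sketch below assumes that "distinct" counts the different values present and "unique" counts the values that occur exactly once. That reading is an assumption, but it is consistent with the figures for this column.

```python
from collections import Counter

# The column as profiled: 1000 rows, all holding the same plan code.
column = ["KHFCMB2020S-34"] * 1000

counts = Counter(column)
n_distinct = len(counts)                              # different values present
n_unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

distinct_pct = round(100 * n_distinct / len(column), 1)
unique_pct = round(100 * n_unique / len(column), 1)

print(n_distinct, distinct_pct, n_unique, unique_pct)
```

A constant column therefore has one distinct value and no unique values, since its single value repeats on every row.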

Sample

1st row: KHFCMB2020S-34
2nd row: KHFCMB2020S-34
3rd row: KHFCMB2020S-34
4th row: KHFCMB2020S-34
5th row: KHFCMB2020S-34

Common Values

Value           Count  Frequency (%)
KHFCMB2020S-34  1000   100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure]

Value           Count  Frequency (%)
khfcmb2020s-34  1000   100.0%

수수료종류
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 37
Distinct (%): 3.7%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

B01003             720
B03503             197
B08108             6
B02008             5
B08111             4
Other values (32)  68

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 18
Unique (%): 1.8%

Sample

1st row: B02018
2nd row: B08117
3rd row: B03717
4th row: B03117
5th row: B02017

Common Values

Value              Count  Frequency (%)
B01003             720    72.0%
B03503             197    19.7%
B08108             6      0.6%
B02008             5      0.5%
B08111             4      0.4%
B00308             4      0.4%
B01008             4      0.4%
B03108             4      0.4%
B03708             4      0.4%
B00311             4      0.4%
Other values (27)  48     4.8%

Length

[Figure: Histogram of lengths of the category]

Value              Count  Frequency (%)
b01003             720    72.0%
b03503             197    19.7%
b08108             6      0.6%
b02008             5      0.5%
b08111             4      0.4%
b00308             4      0.4%
b01008             4      0.4%
b03108             4      0.4%
b00311             4      0.4%
b00411             4      0.4%
Other values (27)  48     4.8%

발생일자
DateTime

Distinct: 361
Distinct (%): 36.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB
Minimum: 2020-10-07 00:00:00
Maximum: 2050-09-07 00:00:00

[Figure: Histogram with fixed size bins (bins=50)]

관리수수료코드
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 14
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

C1                557
CL                360
1T                10
2T                9
6A                7
Other values (9)  57

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 9T
2nd row: 6A
3rd row: 6A
4th row: 6A
5th row: 6A

Common Values

Value             Count  Frequency (%)
C1                557    55.7%
CL                360    36.0%
1T                10     1.0%
2T                9      0.9%
6A                7      0.7%
6T                7      0.7%
3A                7      0.7%
8A                7      0.7%
5A                7      0.7%
1A                7      0.7%
Other values (4)  22     2.2%

Length

[Figure: Histogram of lengths of the category]

Value             Count  Frequency (%)
c1                557    55.7%
cl                360    36.0%
1t                10     1.0%
2t                9      0.9%
6a                7      0.7%
6t                7      0.7%
3a                7      0.7%
8a                7      0.7%
5a                7      0.7%
1a                7      0.7%
Other values (4)  22     2.2%

수수료율
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 7
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.12675
Minimum: 0
Maximum: 0.6
Zeros: 63
Zeros (%): 6.3%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0.1
Median: 0.15
Q3: 0.15
95th percentile: 0.15
Maximum: 0.6
Range: 0.6
Interquartile range (IQR): 0.05

Descriptive statistics

Standard deviation: 0.054634444
Coefficient of variation (CV): 0.43104097
Kurtosis: 15.200443
Mean: 0.12675
Median Absolute Deviation (MAD): 0
Skewness: 1.9123682
Sum: 126.75
Variance: 0.0029849224
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=7)]

Common Values

Value  Count  Frequency (%)
0.15   557    55.7%
0.1    363    36.3%
0.0    63     6.3%
0.4    13     1.3%
0.45   2      0.2%
0.2    1      0.1%
0.6    1      0.1%

Extreme Values (smallest)

Value  Count  Frequency (%)
0.0    63     6.3%
0.1    363    36.3%
0.15   557    55.7%
0.2    1      0.1%
0.4    13     1.3%
0.45   2      0.2%
0.6    1      0.1%

Extreme Values (largest)

Value  Count  Frequency (%)
0.6    1      0.1%
0.45   2      0.2%
0.4    13     1.3%
0.2    1      0.1%
0.15   557    55.7%
0.1    363    36.3%
0.0    63     6.3%
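
Because 수수료율 takes only seven distinct values, its summary statistics can be rebuilt from the value counts alone. The sketch below does that in plain Python. It assumes the reported variance is the sample variance (n - 1 in the denominator), an assumption that agrees with the reported figure to the precision shown.

```python
import math

# Value counts for 수수료율, copied from the frequency table above.
value_counts = {0.15: 557, 0.1: 363, 0.0: 63, 0.4: 13, 0.45: 2, 0.2: 1, 0.6: 1}

n = sum(value_counts.values())
total = sum(v * c for v, c in value_counts.items())
mean = total / n

# Assumption: sample variance, i.e. divide by n - 1 rather than n.
variance = sum(c * (v - mean) ** 2 for v, c in value_counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean                       # coefficient of variation
zeros_pct = 100 * value_counts[0.0] / n

print(f"n={n} sum={total:.2f} mean={mean:.5f}")
print(f"variance={variance:.10f} std={std:.9f} cv={cv:.8f} zeros={zeros_pct:.1f}%")
```

The same pattern works for any low-cardinality numeric column whose full frequency table is listed.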

계산기준금액
Real number (ℝ)

HIGH CORRELATION 

Distinct: 586
Distinct (%): 58.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.1577118 × 10^9
Minimum: 0
Maximum: 2.17 × 10^11
Zeros: 3
Zeros (%): 0.3%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 0
5th percentile: 72543350
Q1: 3.6792375 × 10^8
Median: 8.4177296 × 10^8
Q3: 1.8283388 × 10^9
95th percentile: 9.7734845 × 10^9
Maximum: 2.17 × 10^11
Range: 2.17 × 10^11
Interquartile range (IQR): 1.460415 × 10^9

Descriptive statistics

Standard deviation: 2.1127138 × 10^10
Coefficient of variation (CV): 5.0814339
Kurtosis: 82.422208
Mean: 4.1577118 × 10^9
Median Absolute Deviation (MAD): 6.1755577 × 10^8
Skewness: 8.9174448
Sum: 4.1577118 × 10^12
Variance: 4.4635596 × 10^20
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
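
Several of the 계산기준금액 statistics are derived from others, so the rounded figures can be cross-checked against each other. This is a sketch only: the relationships used are the standard definitions (sum = mean × n, variance = std², CV = std / mean, IQR = Q3 - Q1, range = max - min), and the tolerance is an assumption chosen to absorb the report's rounding.

```python
import math

# Figures copied from the 계산기준금액 statistics above.
n = 1000
mean = 4.1577118e9
std = 2.1127138e10
q1, q3 = 3.6792375e8, 1.8283388e9
minimum, maximum = 0.0, 2.17e11

derived = {
    "sum": mean * n,
    "variance": std ** 2,
    "cv": std / mean,
    "iqr": q3 - q1,
    "range": maximum - minimum,
}

reported = {
    "sum": 4.1577118e12,
    "variance": 4.4635596e20,
    "cv": 5.0814339,
    "iqr": 1.460415e9,
    "range": 2.17e11,
}

for name, value in derived.items():
    # The report shows about 8 significant digits, so compare loosely.
    ok = math.isclose(value, reported[name], rel_tol=1e-6)
    print(f"{name}: derived={value:.7e} reported={reported[name]:.7e} match={ok}")
```

A CV above 5 together with a median far below the mean points to a few very large amounts dominating the distribution, which the extreme values confirm.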

Common Values

Value               Count  Frequency (%)
30503345497         5      0.5%
100100000           5      0.5%
409121162           5      0.5%
15122515075         5      0.5%
21176840809         5      0.5%
15711966412         5      0.5%
9736983720          5      0.5%
195000000000        5      0.5%
208207666           5      0.5%
8337076050          5      0.5%
Other values (576)  950    95.0%

Extreme Values (smallest)

Value     Count  Frequency (%)
0         3      0.3%
378000    1      0.1%
1014000   1      0.1%
1832742   2      0.2%
3813500   1      0.1%
6746638   2      0.2%
8056500   1      0.1%
11459500  1      0.1%
12905218  2      0.2%
14862500  1      0.1%

Extreme Values (largest)

Value         Count  Frequency (%)
217000000000  4      0.4%
216000000000  1      0.1%
195000000000  5      0.5%
71903000000   1      0.1%
43446740267   4      0.4%
41433188524   1      0.1%
41387395085   5      0.5%
33121000000   1      0.1%
30503345497   5      0.5%
29540000000   1      0.1%

지급금액
Real number (ℝ)

HIGH CORRELATION 

Distinct: 988
Distinct (%): 98.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5770601
Minimum: 0
Maximum: 9.424216 × 10^8
Zeros: 10
Zeros (%): 1.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 0
5th percentile: 6809.55
Q1: 40704.25
Median: 87676.5
Q3: 184569.75
95th percentile: 1226038.8
Maximum: 9.424216 × 10^8
Range: 9.424216 × 10^8
Interquartile range (IQR): 143865.5

Descriptive statistics

Standard deviation: 53966890
Coefficient of variation (CV): 9.3520399
Kurtosis: 206.41267
Mean: 5770601
Median Absolute Deviation (MAD): 61733.5
Skewness: 13.840619
Sum: 5.770601 × 10^9
Variance: 2.9124253 × 10^15
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value               Count  Frequency (%)
0                   10     1.0%
1080000             2      0.2%
27038               2      0.2%
30000               2      0.2%
122914              1      0.1%
119617              1      0.1%
124295              1      0.1%
120953              1      0.1%
125675              1      0.1%
127959              1      0.1%
Other values (978)  978    97.8%

Extreme Values (smallest)

Value  Count  Frequency (%)
0      10     1.0%
48     1      0.1%
129    1      0.1%
151    1      0.1%
226    1      0.1%
470    1      0.1%
573    1      0.1%
860    1      0.1%
1026   1      0.1%
1096   1      0.1%

Extreme Values (largest)

Value      Count  Frequency (%)
942421604  1      0.1%
787130436  1      0.1%
776159565  1      0.1%
719252379  1      0.1%
285888000  1      0.1%
178079457  1      0.1%
170883232  1      0.1%
169714317  1      0.1%
157313842  1      0.1%
134763113  1      0.1%

처리자사번
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

1478  917
1769  83

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1769
2nd row: 1769
3rd row: 1769
4th row: 1769
5th row: 1769

Common Values

Value  Count  Frequency (%)
1478   917    91.7%
1769   83     8.3%
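
The report does not define the imbalance percentage in the alerts. One reading that reproduces the 58.7% shown for 처리자사번 is one minus the normalized Shannon entropy of the class frequencies. The function below is a sketch under that assumption. `imbalance_score` is a name chosen for this example, not an identifier taken from the profiling library.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the class
    frequencies and k is the number of classes.
    0 means perfectly balanced; values near 1 mean one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # Convention chosen for this sketch when only one class exists.
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)

# Counts for 처리자사번 from the Common Values table above.
score = imbalance_score([917, 83])
print(f"{score:.1%}")
```

With two employee IDs split 917 to 83, the score is well above zero even though both classes are present. A 500/500 split would score 0.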

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure]

Value  Count  Frequency (%)
1478   917    91.7%
1769   83     8.3%

처리일시
DateTime

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB
Minimum: 2020-10-28 11:06:00
Maximum: 2020-10-28 11:22:00

[Figure: Histogram with fixed size bins (bins=3)]

Interactions

[Figures: 9 interaction plots]

Correlations

[Figure: correlation matrix plot]

                수수료종류  관리수수료코드  수수료율  계산기준금액  지급금액  처리자사번  처리일시
수수료종류      1.000       0.950           0.914     0.976         0.833     1.000       0.912
관리수수료코드  0.950       1.000           0.987     0.648         0.646     1.000       0.916
수수료율        0.914       0.987           1.000     0.466         0.480     0.919       0.851
계산기준금액    0.976       0.648           0.466     1.000         0.940     0.765       0.721
지급금액        0.833       0.646           0.480     0.940         1.000     0.559       0.569
처리자사번      1.000       1.000           0.919     0.765         0.559     1.000       1.000
처리일시        0.912       0.916           0.851     0.721         0.569     1.000       1.000

[Figure: correlation matrix plot]

                수수료종류  처리자사번  관리수수료코드
수수료종류      1.000       0.982       0.670
처리자사번      0.982       1.000       0.994
관리수수료코드  0.670       0.994       1.000

[Figure: correlation matrix plot]

                수수료율  계산기준금액  지급금액  수수료종류  관리수수료코드  처리자사번
수수료율        1.000     -0.246        -0.070    0.670       0.822           0.978
계산기준금액    -0.246    1.000         0.906     0.853       0.387           0.570
지급금액        -0.070    0.906         1.000     0.538       0.385           0.404
수수료종류      0.670     0.853         0.538     1.000       0.670           0.982
관리수수료코드  0.822     0.387         0.385     0.670       1.000           0.994
처리자사번      0.978     0.570         0.404     0.982       0.994           1.000
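
To read a matrix like these programmatically, one option is to check that it is symmetric and then list each strongly associated pair once. The sketch below uses the three-variable matrix (수수료종류, 처리자사번, 관리수수료코드) from this section. The 0.9 cut-off is a hypothetical value chosen for illustration and is not the threshold the report uses for its alerts.

```python
# Three-variable correlation matrix copied from this section.
labels = ["수수료종류", "처리자사번", "관리수수료코드"]
matrix = [
    [1.000, 0.982, 0.670],
    [0.982, 1.000, 0.994],
    [0.670, 0.994, 1.000],
]

THRESHOLD = 0.9  # Hypothetical cut-off chosen for this illustration.

n = len(labels)

# A correlation matrix should be symmetric with ones on the diagonal.
is_symmetric = all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(n))
has_unit_diagonal = all(matrix[i][i] == 1.0 for i in range(n))

# Walk the upper triangle only so each pair is listed once.
strong_pairs = [
    (labels[i], labels[j], matrix[i][j])
    for i in range(n)
    for j in range(i + 1, n)
    if abs(matrix[i][j]) >= THRESHOLD
]

print(is_symmetric, has_unit_diagonal)
for a, b, value in strong_pairs:
    print(f"{a} ~ {b}: {value:.3f}")
```

Taking the absolute value keeps the check usable for matrices with negative entries, such as the six-variable one.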

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

First rows

     유동화계획코드  수수료종류  발생일자    관리수수료코드  수수료율  계산기준금액  지급금액  처리자사번  처리일시
0    KHFCMB2020S-34  B02018      2020-11-20  9T              0.2       32690587      65381     1769        2020-10-28 11:22
1    KHFCMB2020S-34  B08117      2020-11-20  6A              0.0       15122515075   1725000   1769        2020-10-28 11:22
2    KHFCMB2020S-34  B03717      2020-11-20  6A              0.0       100100000     30000     1769        2020-10-28 11:22
3    KHFCMB2020S-34  B03117      2020-11-20  6A              0.0       409121162     60000     1769        2020-10-28 11:22
4    KHFCMB2020S-34  B02017      2020-11-20  6A              0.0       11823604998   1080000   1769        2020-10-28 11:22
5    KHFCMB2020S-34  B01017      2020-11-20  6A              0.0       21176840809   2805000   1769        2020-10-28 11:22
6    KHFCMB2020S-34  B00417      2020-11-20  6A              0.0       15711966412   1665000   1769        2020-10-28 11:22
7    KHFCMB2020S-34  B00317      2020-11-20  6A              0.0       9736983720    1125000   1769        2020-10-28 11:22
8    KHFCMB2020S-34  B08114      2020-11-20  6T              0.0       195000000000  18405000  1769        2020-10-28 11:22
9    KHFCMB2020S-34  B03714      2020-11-20  6T              0.0       208207666     30000     1769        2020-10-28 11:22

Last rows

     유동화계획코드  수수료종류  발생일자    관리수수료코드  수수료율  계산기준금액  지급금액  처리자사번  처리일시
990  KHFCMB2020S-34  B03503      2035-01-07  C1              0.15      669455500     77033     1478        2020-10-28 11:06
991  KHFCMB2020S-34  B03503      2034-12-07  C1              0.15      674390500     85916     1478        2020-10-28 11:06
992  KHFCMB2020S-34  B03503      2034-11-07  C1              0.15      679325500     86544     1478        2020-10-28 11:06
993  KHFCMB2020S-34  B03503      2034-10-07  C1              0.15      684260500     84361     1478        2020-10-28 11:06
994  KHFCMB2020S-34  B03503      2034-09-07  C1              0.15      689195500     87802     1478        2020-10-28 11:06
995  KHFCMB2020S-34  B03503      2034-08-07  C1              0.15      694130500     85578     1478        2020-10-28 11:06
996  KHFCMB2020S-34  B03503      2034-07-07  C1              0.15      699065500     89059     1478        2020-10-28 11:06
997  KHFCMB2020S-34  B03503      2034-06-07  C1              0.15      704000500     89688     1478        2020-10-28 11:06
998  KHFCMB2020S-34  B03503      2034-05-07  C1              0.15      708935500     87403     1478        2020-10-28 11:06
999  KHFCMB2020S-34  B03503      2034-04-07  C1              0.15      713870500     90945     1478        2020-10-28 11:06