Overview

Dataset statistics

Number of variables: 3
Number of observations: 1000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 156
Duplicate rows (%): 15.6%
Total size in memory: 25.5 KiB
Average record size in memory: 26.1 B
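The duplicate percentage and the average record size follow from the raw counts. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes (the variable names are illustrative and not part of the report):

```python
# Figures copied from the Dataset statistics listing.
n_rows = 1000
n_duplicate_rows = 156
total_size_kib = 25.5

# Share of duplicate rows as a percentage of all observations.
duplicate_pct = n_duplicate_rows / n_rows * 100

# Average record size in bytes, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"{duplicate_pct:.1f}%")
print(f"{avg_record_bytes:.1f} B")
```

Both results round to the values shown in the listing (15.6% and 26.1 B).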

Variable types

Text: 1
Numeric: 1
Categorical: 1

Dataset

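Description: Public data related to the cost-details work of the Credit Management Department of the Korea Housing Finance Corporation (publicly releasable source data from the database used for that department's work).
Author: Korea Housing Finance Corporation (한국주택금융공사)
URL: https://www.data.go.kr/data/15072952/fileData.do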

Alerts

PROCESS_SEQ has constant value "1" (Constant)
Dataset has 156 (15.6%) duplicate rows (Duplicates)
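A constant-value alert of this kind can be reproduced by checking whether a column holds exactly one distinct value. The sketch below is an assumption about how such a check might look, not the profiler's own implementation; `is_constant` and `process_seq` are hypothetical names.

```python
def is_constant(values):
    """Return True when a non-empty column holds exactly one distinct value."""
    return len(set(values)) == 1


# Hypothetical stand-in for the PROCESS_SEQ column: 1000 rows, all "1".
process_seq = ["1"] * 1000

print(is_constant(process_seq))
```

A column flagged this way carries no information for modelling and is usually a candidate for removal.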

Reproduction

Analysis started: 2023-12-12 19:48:37.519245
Analysis finished: 2023-12-12 19:48:37.933352
Duration: 0.41 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
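The reported duration is the difference between the two timestamps. A short check using only the standard library, with the timestamps copied from the listing:

```python
from datetime import datetime

# Timestamps copied from the Reproduction listing.
started = datetime.fromisoformat("2023-12-12 19:48:37.519245")
finished = datetime.fromisoformat("2023-12-12 19:48:37.933352")

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()

print(f"{duration_seconds:.2f} seconds")
```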

Variables

GUARNT_NO
Text

Distinct: 699
Distinct (%): 69.9%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Length

Max length: 13
Median length: 13
Mean length: 13
Min length: 13

Characters and Unicode

Total characters: 13000
Distinct characters: 25
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
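As an illustration of that kind of analysis, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for three of the sample identifiers; it covers categories only, since the standard library does not expose scripts or blocks. `sample_ids` is an illustrative name.

```python
import unicodedata
from collections import Counter

# Three identifiers taken from the Sample listing of this variable.
sample_ids = ["TLA2012005060", "TAD2018047084", "TAC2016050579"]

# "Lu" is Uppercase Letter and "Nd" is Decimal Number in Unicode terms.
category_counts = Counter(
    unicodedata.category(ch) for value in sample_ids for ch in value
)

print(category_counts)
```

Each identifier contributes three uppercase letters and ten digits, which is consistent with the 3000 to 10000 split reported for the full 1000 rows.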

Unique

Unique: 500
Unique (%): 50.0%

Sample

1st row: TLA2012005060
2nd row: TAD2018047084
3rd row: TAC2016050579
4th row: THB2018049406
5th row: TAC2018087561

Value               Count  Frequency (%)
tma2011019772       10     1.0%
taa2010075463       10     1.0%
tha2018025254       6      0.6%
tba2015031242       6      0.6%
tha2016068933       6      0.6%
taa2011077993       6      0.6%
qad2013031910       5      0.5%
tba2019002283       5      0.5%
tab2013036041       5      0.5%
tba2013034710       4      0.4%
Other values (689)  937    93.7%

Most occurring characters

Value              Count  Frequency (%)
0                  2621   20.2%
1                  1673   12.9%
2                  1571   12.1%
A                  991    7.6%
T                  894    6.9%
3                  626    4.8%
5                  618    4.8%
7                  611    4.7%
4                  608    4.7%
6                  597    4.6%
Other values (15)  2190   16.8%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    10000  76.9%
Uppercase Letter  3000   23.1%

Most frequent character per category

Uppercase Letter
Value             Count  Frequency (%)
A                 991    33.0%
T                 894    29.8%
B                 261    8.7%
H                 180    6.0%
Q                 166    5.5%
D                 145    4.8%
P                 107    3.6%
O                 71     2.4%
C                 68     2.3%
L                 42     1.4%
Other values (5)  75     2.5%

Decimal Number
Value  Count  Frequency (%)
0      2621   26.2%
1      1673   16.7%
2      1571   15.7%
3      626    6.3%
5      618    6.2%
7      611    6.1%
4      608    6.1%
6      597    6.0%
8      590    5.9%
9      485    4.9%

Most occurring scripts

Value   Count  Frequency (%)
Common  10000  76.9%
Latin   3000   23.1%

Most frequent character per script

Latin
Value             Count  Frequency (%)
A                 991    33.0%
T                 894    29.8%
B                 261    8.7%
H                 180    6.0%
Q                 166    5.5%
D                 145    4.8%
P                 107    3.6%
O                 71     2.4%
C                 68     2.3%
L                 42     1.4%
Other values (5)  75     2.5%

Common
Value  Count  Frequency (%)
0      2621   26.2%
1      1673   16.7%
2      1571   15.7%
3      626    6.3%
5      618    6.2%
7      611    6.1%
4      608    6.1%
6      597    6.0%
8      590    5.9%
9      485    4.9%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  13000  100.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
0                  2621   20.2%
1                  1673   12.9%
2                  1571   12.1%
A                  991    7.6%
T                  894    6.9%
3                  626    4.8%
5                  618    4.8%
7                  611    4.7%
4                  608    4.7%
6                  597    4.6%
Other values (15)  2190   16.8%

DISCHRG_DEMND_DY
Real number (ℝ)

Distinct: 212
Distinct (%): 21.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20199891
Minimum: 20191206
Maximum: 20201023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 20191206
5-th percentile: 20191220
Q1: 20200310
Median: 20200601
Q3: 20200810
95-th percentile: 20201007
Maximum: 20201023
Range: 9817
Interquartile range (IQR): 500
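The values of DISCHRG_DEMND_DY look like calendar dates encoded as YYYYMMDD integers. That is an assumption based on the value pattern, not something the report states. Under that assumption, the Range and IQR above are differences between integers rather than day counts. The sketch below contrasts the two readings; `parse_yyyymmdd` is a hypothetical helper.

```python
from datetime import datetime


def parse_yyyymmdd(value):
    """Parse an integer such as 20201023 as a date (assumed YYYYMMDD encoding)."""
    return datetime.strptime(str(value), "%Y%m%d").date()


# Quantiles copied from the listing.
minimum, maximum = 20191206, 20201023
q1, q3 = 20200310, 20200810

# Integer difference, which matches the reported Range.
numeric_range = maximum - minimum

# Calendar differences in days under the date interpretation.
calendar_span = (parse_yyyymmdd(maximum) - parse_yyyymmdd(minimum)).days
calendar_iqr = (parse_yyyymmdd(q3) - parse_yyyymmdd(q1)).days

print(numeric_range, calendar_span, calendar_iqr)
```

If the date reading is correct, the column spans 322 days and the middle half of the values covers 153 days. Converting the column to a date type before profiling would make the summary statistics directly interpretable.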

Descriptive statistics

Standard deviation: 2466.2944
Coefficient of variation (CV): 0.00012209444
Kurtosis: 8.4016826
Mean: 20199891
Median Absolute Deviation (MAD): 225
Skewness: -3.1990739
Sum: 2.0199891 × 10^10
Variance: 6082608.2
Monotonicity: Decreasing
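Several of these figures are related by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the standard deviation squared, and the sum is the mean times the number of observations. A quick consistency check on the rounded values from the listing (small discrepancies from rounding are expected):

```python
# Rounded values copied from the listing.
mean = 20199891
std = 2466.2944
n_rows = 1000

# Coefficient of variation: standard deviation relative to the mean.
cv = std / mean

# Variance is the square of the standard deviation.
variance = std ** 2

# Sum recovered from the rounded mean, so only approximate.
approx_sum = mean * n_rows

print(f"CV = {cv:.8g}")
print(f"variance = {variance:.1f}")
print(f"sum = {approx_sum:.7e}")
```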
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
20200623            16     1.6%
20200514            13     1.3%
20200303            12     1.2%
20200723            11     1.1%
20200513            11     1.1%
20200615            10     1.0%
20200921            10     1.0%
20200618            10     1.0%
20200428            10     1.0%
20200727            10     1.0%
Other values (202)  887    88.7%

Value     Count  Frequency (%)
20191206  6      0.6%
20191209  4      0.4%
20191210  5      0.5%
20191211  1      0.1%
20191212  2      0.2%
20191213  5      0.5%
20191216  3      0.3%
20191217  5      0.5%
20191218  9      0.9%
20191219  6      0.6%

Value     Count  Frequency (%)
20201023  4      0.4%
20201022  6      0.6%
20201020  6      0.6%
20201019  5      0.5%
20201016  5      0.5%
20201015  8      0.8%
20201013  1      0.1%
20201012  6      0.6%
20201008  4      0.4%
20201007  9      0.9%

PROCESS_SEQ
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      1000   100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
1      1000   100.0%

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

     GUARNT_NO      DISCHRG_DEMND_DY  PROCESS_SEQ
0    TLA2012005060  20201023          1
1    TAD2018047084  20201023          1
2    TAC2016050579  20201023          1
3    THB2018049406  20201023          1
4    TAC2018087561  20201022          1
5    QAD2013087981  20201022          1
6    QAD2015086581  20201022          1
7    TAA2014062865  20201022          1
8    TAA2018091318  20201022          1
9    TAA2014037479  20201022          1

     GUARNT_NO      DISCHRG_DEMND_DY  PROCESS_SEQ
990  TQA2018025519  20191209          1
991  TQA2018025519  20191209          1
992  TAB2015053078  20191209          1
993  TAB2015053078  20191209          1
994  TPA2015018732  20191206          1
995  THB2014000983  20191206          1
996  THB2014000983  20191206          1
997  THO2017029139  20191206          1
998  THO2017029139  20191206          1
999  TAC2015060728  20191206          1

Duplicate rows

Most frequently occurring

     GUARNT_NO      DISCHRG_DEMND_DY  PROCESS_SEQ  # duplicates
79   TBA2019002283  20200901          1            5
22   TAA2010075463  20200401          1            4
23   TAA2010075463  20200402          1            4
36   TAA2015001184  20200911          1            4
116  TMA2011019772  20200513          1            4
11   QAD2015045026  20191226          1            3
43   TAB2011080660  20200727          1            3
47   TAB2013036041  20200520          1            3
55   TAB2015049582  20200626          1            3
57   TAB2016005155  20200923          1            3
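A listing like this can be approximated by counting identical rows. The sketch below uses rows 990 to 994 from the Sample section as input and only the standard library; it is one plausible way to find repeated rows, not the profiler's own code. Whether the report's headline duplicate count refers to distinct repeated rows or to surplus copies is not stated here.

```python
from collections import Counter

# Rows 990 to 994 from the Sample section:
# (GUARNT_NO, DISCHRG_DEMND_DY, PROCESS_SEQ).
rows = [
    ("TQA2018025519", 20191209, 1),
    ("TQA2018025519", 20191209, 1),
    ("TAB2015053078", 20191209, 1),
    ("TAB2015053078", 20191209, 1),
    ("TPA2015018732", 20191206, 1),
]

# Count how often each full row occurs, then keep those seen more than once.
row_counts = Counter(rows)
duplicated = {row: n for row, n in row_counts.items() if n > 1}

# Most frequently repeated rows first.
for row, n in sorted(duplicated.items(), key=lambda item: -item[1]):
    print(row, n)
```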