Dataset statistics
Number of variables | 4 |
---|---|
Number of observations | 100 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 3.4 KiB |
Average record size in memory | 35.3 B |
Variable types
Text | 1 |
---|---|
Categorical | 1 |
Numeric | 1 |
DateTime | 1 |
Dataset
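Description | Public open data related to the work of the Housing Pension Department (주택연금부) of the Korea Housing Finance Corporation (한국주택금융공사): source data from the department's work-related database that can be made public |
---|---|
Author | Korea Housing Finance Corporation (한국주택금융공사) |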
URL | https://www.data.go.kr/data/15073020/fileData.do |
Alerts
SEQ is highly imbalanced (65.6%) | Imbalance |
---|---|
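The 65.6% figure is consistent with an imbalance score defined as one minus the normalized Shannon entropy of the class distribution. The sketch below assumes that definition (it is not stated in this report) and uses the SEQ counts listed under Common Values (86, 12, 1, 1). `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k): 0 for a uniform distribution, near 1 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class is not scored
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return 1.0 - entropy / math.log(k)

seq_counts = [86, 12, 1, 1]  # SEQ values 1, 2, 4, 3
score = imbalance_score(seq_counts)
print(f"{score:.1%}")
```

Under this definition a perfectly balanced four-class column would score 0%, so 65.6% reflects how strongly the value 1 dominates.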
Reproduction
Analysis started | 2023-12-12 22:20:53.139642 |
---|---|
Analysis finished | 2023-12-12 22:20:53.522705 |
Duration | 0.38 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
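A report of this kind can be regenerated roughly as follows. This is a minimal sketch, assuming the source file has been downloaded from the URL above as a CSV with the four columns shown in this report. `build_profile`, the report title, and the file paths are hypothetical, and the file encoding may need to be set explicitly.

```python
from pathlib import Path

def build_profile(csv_path, out_path="report.html"):
    """Profile a CSV export with ydata-profiling and write an HTML report."""
    # Imported lazily so this module loads without the optional dependencies.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # parse_dates converts REG_TS so it is profiled as a DateTime column.
    df = pd.read_csv(csv_path, parse_dates=["REG_TS"])
    profile = ProfileReport(
        df,
        title="Housing pension data profile",  # hypothetical title
        dataset={
            "author": "한국주택금융공사",
            "url": "https://www.data.go.kr/data/15073020/fileData.do",
        },
    )
    profile.to_file(Path(out_path))
    return profile
```

Calling `build_profile("data.csv")` with a local copy of the file (hypothetical name) would write `report.html` next to the script.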
GUARNT_NO
Text
Distinct | 86 |
---|---|
Distinct (%) | 86.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Length
Max length | 14 |
---|---|
Median length | 14 |
Mean length | 14 |
Min length | 14 |
Characters and Unicode
Total characters | 1400 |
---|---|
Distinct characters | 24 |
Distinct categories | 2 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 74 |
---|---|
Unique (%) | 74.0% |
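"Distinct" counts the different values in the column (86), while "Unique" counts only the values that occur exactly once (74). The sketch below illustrates the difference on the GUARNT_NO values of the first ten rows listed at the end of this report.

```python
from collections import Counter

# GUARNT_NO values from the first ten rows of the report's sample.
ids = [
    "RTPA2020000519", "RTHO2020000496", "RTHB2020000590", "RTOB2020000067",
    "RTPB2020000155", "RTAD2020000688", "RTPB2020000155", "RTHO2020000499",
    "RTAD2020000688", "RTPA2020000531",
]

counts = Counter(ids)
distinct = len(counts)                              # different values
unique = sum(1 for n in counts.values() if n == 1)  # values seen exactly once

print(distinct, unique)
```

In this ten-row slice two identifiers appear twice, so the distinct count exceeds the unique count by two.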
Sample
1st row | RTPA2020000519 |
---|---|
2nd row | RTHO2020000496 |
3rd row | RTHB2020000590 |
4th row | RTOB2020000067 |
5th row | RTPB2020000155 |
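Every identifier is 14 characters long, and the character totals (400 uppercase letters and 1000 digits over 100 rows) suggest four letters followed by ten digits. The sketch below validates that shape. Splitting the digits into a 4-digit year and a 6-digit serial number is an assumption inferred from the samples, and `parse_guarnt_no` is a hypothetical helper.

```python
import re

# Assumed layout, inferred from the samples only: 4 uppercase letters,
# then a 4-digit year, then a 6-digit serial number.
GUARNT_NO_RE = re.compile(
    r"(?P<prefix>[A-Z]{4})(?P<year>[0-9]{4})(?P<serial>[0-9]{6})"
)

def parse_guarnt_no(value):
    """Split an identifier into its assumed parts, or return None if it does not fit."""
    match = GUARNT_NO_RE.fullmatch(value)
    if match is None:
        return None
    return match.groupdict()

print(parse_guarnt_no("RTPA2020000519"))
```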
Words
Value | Count | Frequency (%) |
rtna2020000220 | 4 | 4.0% |
rtma2020000256 | 2 | 2.0% |
rtma2020000251 | 2 | 2.0% |
rtha2020000742 | 2 | 2.0% |
rtad2020000616 | 2 | 2.0% |
rtab2020000727 | 2 | 2.0% |
rtba2020000625 | 2 | 2.0% |
rtpb2020000155 | 2 | 2.0% |
rtac2020000662 | 2 | 2.0% |
rtad2020000688 | 2 | 2.0% |
Other values (76) | 78 | 78.0% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 518 | 37.0% |
2 | 242 | 17.3% |
R | 100 | 7.1% |
T | 95 | 6.8% |
A | 83 | 5.9% |
6 | 58 | 4.1% |
1 | 37 | 2.6% |
7 | 34 | 2.4% |
B | 33 | 2.4% |
4 | 31 | 2.2% |
Other values (14) | 169 | 12.1% |
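Each Frequency (%) value in the character tables is the count divided by the total number of characters in the column (1400). The sketch below recomputes the top three rows. `frequency_pct` is a hypothetical helper, and the one-decimal formatting is assumed from the values shown.

```python
def frequency_pct(count, total):
    """Format count / total as a percentage with one decimal place."""
    return f"{count / total:.1%}"

total_characters = 1400  # 100 identifiers x 14 characters
for char, count in [("0", 518), ("2", 242), ("R", 100)]:
    print(char, count, frequency_pct(count, total_characters))
```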
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 1000 | 71.4% |
Uppercase Letter | 400 | 28.6% |
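The two categories correspond to the Unicode general categories `Nd` (Decimal Number) and `Lu` (Uppercase Letter). As a rough illustration, the standard-library `unicodedata` module can reproduce the breakdown. The sketch uses only the five sample rows shown earlier, so its totals are smaller than the full-column counts.

```python
import unicodedata
from collections import Counter

# Readable names for the two general categories that occur in this column.
CATEGORY_NAMES = {"Lu": "Uppercase Letter", "Nd": "Decimal Number"}

samples = [
    "RTPA2020000519", "RTHO2020000496", "RTHB2020000590",
    "RTOB2020000067", "RTPB2020000155",
]

category_counts = Counter(
    CATEGORY_NAMES.get(unicodedata.category(ch), unicodedata.category(ch))
    for value in samples
    for ch in value
)
print(dict(category_counts))
```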
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
R | 100 | 25.0% |
T | 95 | 23.8% |
A | 83 | 20.8% |
B | 33 | 8.2% |
H | 24 | 6.0% |
D | 14 | 3.5% |
O | 11 | 2.8% |
C | 9 | 2.2% |
P | 9 | 2.2% |
Q | 7 | 1.8% |
Other values (4) | 15 | 3.8% |
Decimal Number
Value | Count | Frequency (%) |
0 | 518 | 51.8% |
2 | 242 | 24.2% |
6 | 58 | 5.8% |
1 | 37 | 3.7% |
7 | 34 | 3.4% |
4 | 31 | 3.1% |
5 | 29 | 2.9% |
9 | 24 | 2.4% |
8 | 16 | 1.6% |
3 | 11 | 1.1% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 1000 | 71.4% |
Latin | 400 | 28.6% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
R | 100 | 25.0% |
T | 95 | 23.8% |
A | 83 | 20.8% |
B | 33 | 8.2% |
H | 24 | 6.0% |
D | 14 | 3.5% |
O | 11 | 2.8% |
C | 9 | 2.2% |
P | 9 | 2.2% |
Q | 7 | 1.8% |
Other values (4) | 15 | 3.8% |
Common
Value | Count | Frequency (%) |
0 | 518 | 51.8% |
2 | 242 | 24.2% |
6 | 58 | 5.8% |
1 | 37 | 3.7% |
7 | 34 | 3.4% |
4 | 31 | 3.1% |
5 | 29 | 2.9% |
9 | 24 | 2.4% |
8 | 16 | 1.6% |
3 | 11 | 1.1% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 1400 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 518 | 37.0% |
2 | 242 | 17.3% |
R | 100 | 7.1% |
T | 95 | 6.8% |
A | 83 | 5.9% |
6 | 58 | 4.1% |
1 | 37 | 2.6% |
7 | 34 | 2.4% |
B | 33 | 2.4% |
4 | 31 | 2.2% |
Other values (14) | 169 | 12.1% |
SEQ
Categorical
IMBALANCE
Distinct | 4 |
---|---|
Distinct (%) | 4.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
1 | 86 |
---|---|
2 | 12 |
4 | 1 |
3 | 1 |
Length
Max length | 1 |
---|---|
Median length | 1 |
Mean length | 1 |
Min length | 1 |
Unique
Unique | 2 |
---|---|
Unique (%) | 2.0% |
Sample
1st row | 1 |
---|---|
2nd row | 1 |
3rd row | 1 |
4th row | 1 |
5th row | 2 |
Common Values
Value | Count | Frequency (%) |
1 | 86 | 86.0% |
2 | 12 | 12.0% |
4 | 1 | 1.0% |
3 | 1 | 1.0% |
REG_ENO
Real number (ℝ)
Distinct | 43 |
---|---|
Distinct (%) | 43.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1702.7 |
Minimum | 1174 |
---|---|
Maximum | 2003 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.0 KiB |
Quantile statistics
Minimum | 1174 |
---|---|
5-th percentile | 1304 |
Q1 | 1557 |
median | 1690 |
Q3 | 1917 |
95-th percentile | 1982.25 |
Maximum | 2003 |
Range | 829 |
Interquartile range (IQR) | 360 |
Descriptive statistics
Standard deviation | 221.72011 |
---|---|
Coefficient of variation (CV) | 0.13021678 |
Kurtosis | -0.3826661 |
Mean | 1702.7 |
Median Absolute Deviation (MAD) | 152.5 |
Skewness | -0.50995131 |
Sum | 170270 |
Variance | 49159.808 |
Monotonicity | Not monotonic |
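Several of the statistics above follow directly from the others, which gives a quick consistency check. The sketch below recomputes them from the reported mean, standard deviation, quartiles, and extremes, taking n = 100 from the number of observations.

```python
n = 100                        # number of observations
mean, std = 1702.7, 221.72011  # reported mean and standard deviation
q1, q3 = 1557, 1917            # reported quartiles
minimum, maximum = 1174, 2003  # reported extremes

derived = {
    "Sum": mean * n,
    "Variance": std ** 2,
    "Coefficient of variation (CV)": std / mean,
    "Range": maximum - minimum,
    "Interquartile range (IQR)": q3 - q1,
}
for name, value in derived.items():
    print(f"{name}: {value:.8g}")
```

The recomputed values agree with the table up to the rounding of the reported inputs.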
Common Values
Value | Count | Frequency (%) |
1656 | 8 | 8.0% |
1970 | 6 | 6.0% |
1557 | 5 | 5.0% |
1689 | 5 | 5.0% |
1956 | 5 | 5.0% |
1753 | 5 | 5.0% |
1917 | 4 | 4.0% |
1174 | 4 | 4.0% |
1691 | 4 | 4.0% |
1385 | 4 | 4.0% |
Other values (33) | 50 | 50.0% |
Minimum 10 values
Value | Count | Frequency (%) |
1174 | 4 | 4.0% |
1304 | 2 | 2.0% |
1371 | 3 | 3.0% |
1385 | 4 | 4.0% |
1406 | 2 | 2.0% |
1475 | 3 | 3.0% |
1521 | 1 | 1.0% |
1554 | 2 | 2.0% |
1557 | 5 | 5.0% |
1569 | 1 | 1.0% |
Maximum 10 values
Value | Count | Frequency (%) |
2003 | 1 | 1.0% |
2001 | 2 | 2.0% |
2000 | 1 | 1.0% |
1987 | 1 | 1.0% |
1982 | 1 | 1.0% |
1980 | 2 | 2.0% |
1977 | 1 | 1.0% |
1970 | 6 | 6.0% |
1968 | 2 | 2.0% |
1956 | 5 | 5.0% |
REG_TS
Date
Distinct | 87 |
---|---|
Distinct (%) | 87.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Minimum | 2020-09-17 15:34:12 |
---|---|
Maximum | 2020-10-22 15:51:00 |
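The minimum and maximum give the time span covered by the 100 records. The sketch below computes it with the standard library and also parses the slash-separated form used in the sample rows of this report.

```python
from datetime import datetime

earliest = datetime.fromisoformat("2020-09-17 15:34:12")
latest = datetime.fromisoformat("2020-10-22 15:51:00")
span = latest - earliest
print(span)

# The sample rows use slashes, which need an explicit format string.
row_ts = datetime.strptime("2020/10/22 15:51:00", "%Y/%m/%d %H:%M:%S")
print(row_ts == latest)
```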
Correlations
| | GUARNT_NO | SEQ | REG_ENO | REG_TS |
---|---|---|---|---|
GUARNT_NO | 1.000 | 0.000 | 1.000 | 1.000 |
SEQ | 0.000 | 1.000 | 0.000 | 0.000 |
REG_ENO | 1.000 | 0.000 | 1.000 | 1.000 |
REG_TS | 1.000 | 0.000 | 1.000 | 1.000 |
| | REG_ENO | SEQ |
---|---|---|
REG_ENO | 1.000 | 0.000 |
SEQ | 0.000 | 1.000 |
First rows
| | GUARNT_NO | SEQ | REG_ENO | REG_TS |
---|---|---|---|---|
0 | RTPA2020000519 | 1 | 1982 | 2020/10/22 15:51:00 |
1 | RTHO2020000496 | 1 | 1917 | 2020/10/22 13:53:17 |
2 | RTHB2020000590 | 1 | 1371 | 2020/10/22 14:31:27 |
3 | RTOB2020000067 | 1 | 1620 | 2020/10/22 11:45:18 |
4 | RTPB2020000155 | 2 | 1980 | 2020/10/22 10:58:30 |
5 | RTAD2020000688 | 2 | 1656 | 2020/10/22 10:34:13 |
6 | RTPB2020000155 | 1 | 1980 | 2020/10/22 10:58:30 |
7 | RTHO2020000499 | 1 | 1691 | 2020/10/22 09:51:06 |
8 | RTAD2020000688 | 1 | 1656 | 2020/10/22 10:34:13 |
9 | RTPA2020000531 | 1 | 1889 | 2020/10/22 09:26:53 |
Last rows
| | GUARNT_NO | SEQ | REG_ENO | REG_TS |
---|---|---|---|---|
90 | RTAB2020000727 | 1 | 1385 | 2020/09/21 17:35:17 |
91 | RTAB2020000724 | 1 | 1689 | 2020/09/23 11:53:16 |
92 | RTAA2020000554 | 1 | 1799 | 2020/09/18 14:58:14 |
93 | RTAC2020000673 | 1 | 1788 | 2020/09/18 14:55:00 |
94 | RTBA2020000607 | 1 | 1720 | 2020/09/22 11:23:05 |
95 | RTAC2020000662 | 2 | 1753 | 2020/09/21 16:49:15 |
96 | RTBB2020000178 | 1 | 1304 | 2020/09/18 14:16:57 |
97 | RTAC2020000662 | 1 | 1753 | 2020/09/21 16:49:15 |
98 | RTQA2020000271 | 1 | 1874 | 2020/09/18 10:27:21 |
99 | RTHA2020000645 | 1 | 1569 | 2020/09/17 15:34:12 |
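In the rows above, a repeated GUARNT_NO (for example RTPB2020000155) appears with different SEQ values. That would explain why the report finds no duplicate rows although only 86 of the 100 identifiers are distinct. The sketch below checks the first ten rows. Treating (GUARNT_NO, SEQ) as a composite key is an assumption, not something the report states.

```python
from collections import Counter

# (GUARNT_NO, SEQ) pairs from the first ten rows of the sample.
rows = [
    ("RTPA2020000519", 1), ("RTHO2020000496", 1), ("RTHB2020000590", 1),
    ("RTOB2020000067", 1), ("RTPB2020000155", 2), ("RTAD2020000688", 2),
    ("RTPB2020000155", 1), ("RTHO2020000499", 1), ("RTAD2020000688", 1),
    ("RTPA2020000531", 1),
]

id_counts = Counter(guarnt_no for guarnt_no, _ in rows)
repeated_ids = sorted(g for g, n in id_counts.items() if n > 1)
repeated_pairs = [pair for pair, n in Counter(rows).items() if n > 1]

print(repeated_ids)    # identifiers that occur more than once
print(repeated_pairs)  # (GUARNT_NO, SEQ) pairs that occur more than once
```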