Dataset statistics
Number of variables | 10 |
---|---|
Number of observations | 500 |
Missing cells | 2500 |
Missing cells (%) | 50.0% |
Duplicate rows | 3 |
Duplicate rows (%) | 0.6% |
Total size in memory | 42.6 KiB |
Average record size in memory | 87.3 B |
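The percentages in this table follow directly from the counts. A minimal Python sketch of the arithmetic, with illustrative variable names that are not part of the report; the reading that five fully empty columns account for every missing cell is taken from the "Missing" alerts in this report:

```python
# Counts copied from the "Dataset statistics" table.
n_variables = 10
n_observations = 500
missing_cells = 2500
duplicate_rows = 3

total_cells = n_variables * n_observations            # 5000 cells in total
missing_pct = 100 * missing_cells / total_cells        # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_observations  # share of duplicate rows

# Five columns are reported as 100% missing, which alone accounts
# for every missing cell (5 * 500 = 2500).
fully_missing_columns = missing_cells // n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Fully empty columns: {fully_missing_columns}")
```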
Variable types
Text | 1 |
---|---|
Categorical | 2 |
Unsupported | 5 |
Numeric | 2 |
Dataset
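Description | This file provides information on the system management common code master history (시스템관리공통코드마스터이력) of the Korea Credit Guarantee Fund (신용보증기금); please refer to it when using the data. |
---|---|
Author | 신용보증기금 (Korea Credit Guarantee Fund) |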
URL | https://www.data.go.kr/data/15093316/fileData.do |
Alerts
코드유형구분코드 has constant value "C" | Constant |
Dataset has 3 (0.6%) duplicate rows | Duplicates |
처리직원번호 is highly overall correlated with 주제영역코드 | High correlation |
주제영역코드 is highly overall correlated with 처리직원번호 | High correlation |
주제영역코드 is highly imbalanced (81.2%) | Imbalance |
목록코드테이블명 has 500 (100.0%) missing values | Missing |
목록코드테이블논리명 has 500 (100.0%) missing values | Missing |
목록코드컬럼명 has 500 (100.0%) missing values | Missing |
목록코드컬럼논리명 has 500 (100.0%) missing values | Missing |
처리시각 has 500 (100.0%) missing values | Missing |
목록코드테이블명 is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
목록코드테이블논리명 is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
목록코드컬럼명 is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
목록코드컬럼논리명 is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
처리시각 is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
Reproduction
Analysis started | 2023-12-12 07:55:44.368141 |
---|---|
Analysis finished | 2023-12-12 07:55:45.506328 |
Duration | 1.14 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
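The reported duration can be checked against the two timestamps. A short sketch using only Python's standard library, with the timestamp strings copied from the table:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
started = datetime.fromisoformat("2023-12-12 07:55:44.368141")
finished = datetime.fromisoformat("2023-12-12 07:55:45.506328")

# Elapsed wall-clock time between the two timestamps, in seconds.
duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```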
코드값
Text
Distinct | 462 |
---|---|
Distinct (%) | 92.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 4.0 KiB |
Value | Count | Frequency (%) |
v4 | 4 | 0.8% |
736 | 4 | 0.8% |
1157 | 4 | 0.8% |
2946 | 3 | 0.6% |
g203 | 3 | 0.6% |
158 | 3 | 0.6% |
g184 | 3 | 0.6% |
2 | 3 | 0.6% |
1137 | 2 | 0.4% |
1 | 2 | 0.4% |
Other values (452) | 469 | 93.8% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 746 | 27.3% |
2 | 438 | 16.1% |
1 | 406 | 14.9% |
4 | 228 | 8.4% |
G | 187 | 6.9% |
3 | 127 | 4.7% |
7 | 87 | 3.2% |
8 | 76 | 2.8% |
5 | 74 | 2.7% |
6 | 74 | 2.7% |
Other values (21) | 285 | 10.4% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 2329 | 85.4% |
Uppercase Letter | 399 | 14.6% |
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
G | 187 | 46.9% |
C | 31 | 7.8% |
H | 26 | 6.5% |
R | 22 | 5.5% |
N | 18 | 4.5% |
D | 13 | 3.3% |
A | 12 | 3.0% |
S | 12 | 3.0% |
I | 12 | 3.0% |
F | 12 | 3.0% |
Other values (11) | 54 | 13.5% |
Decimal Number
Value | Count | Frequency (%) |
0 | 746 | 32.0% |
2 | 438 | 18.8% |
1 | 406 | 17.4% |
4 | 228 | 9.8% |
3 | 127 | 5.5% |
7 | 87 | 3.7% |
8 | 76 | 3.3% |
5 | 74 | 3.2% |
6 | 74 | 3.2% |
9 | 73 | 3.1% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 2329 | 85.4% |
Latin | 399 | 14.6% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
G | 187 | 46.9% |
C | 31 | 7.8% |
H | 26 | 6.5% |
R | 22 | 5.5% |
N | 18 | 4.5% |
D | 13 | 3.3% |
A | 12 | 3.0% |
S | 12 | 3.0% |
I | 12 | 3.0% |
F | 12 | 3.0% |
Other values (11) | 54 | 13.5% |
Common
Value | Count | Frequency (%) |
0 | 746 | 32.0% |
2 | 438 | 18.8% |
1 | 406 | 17.4% |
4 | 228 | 9.8% |
3 | 127 | 5.5% |
7 | 87 | 3.7% |
8 | 76 | 3.3% |
5 | 74 | 3.2% |
6 | 74 | 3.2% |
9 | 73 | 3.1% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 2728 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 746 | 27.3% |
2 | 438 | 16.1% |
1 | 406 | 14.9% |
4 | 228 | 8.4% |
G | 187 | 6.9% |
3 | 127 | 4.7% |
7 | 87 | 3.2% |
8 | 76 | 2.8% |
5 | 74 | 2.7% |
6 | 74 | 2.7% |
Other values (21) | 285 | 10.4% |
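The category tables in this section group each character of 코드값 by its Unicode general category. The sketch below shows one way to produce such a tally with Python's `unicodedata` module; it is not necessarily how the profiling tool computes it, and the sample values are just a handful of codes that appear elsewhere in this report. The script breakdown (Common, Latin) is not covered, because the standard library does not expose Unicode script names.

```python
import unicodedata
from collections import Counter

# A few code values that appear elsewhere in this report, used only as a sample.
sample_values = ["G184", "G203", "G204", "1137", "28"]

# Tally the Unicode general category of every character.
category_counts = Counter(
    unicodedata.category(ch) for value in sample_values for ch in value
)

# "Lu" is Uppercase Letter and "Nd" is Decimal Number in Unicode's
# general-category scheme.
total = sum(category_counts.values())
for category, count in category_counts.most_common():
    print(f"{category}: {count} ({100 * count / total:.1f}%)")
```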
주제영역코드
Categorical
HIGH CORRELATION
  IMBALANCE
 
Distinct | 7 |
---|---|
Distinct (%) | 1.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 4.0 KiB |
B | 463 |
---|---|
A | 16 |
K | 12 |
T | 4 |
G | 2 |
Other values (2) | 3 |
Length
Max length | 1 |
---|---|
Median length | 1 |
Mean length | 1 |
Min length | 1 |
Unique
Unique | 1 |
---|---|
Unique (%) | 0.2% |
Sample
1st row | B |
---|---|
2nd row | B |
3rd row | B |
4th row | B |
5th row | B |
Common Values
Value | Count | Frequency (%) |
B | 463 | 92.6% |
A | 16 | 3.2% |
K | 12 | 2.4% |
T | 4 | 0.8% |
G | 2 | 0.4% |
Z | 2 | 0.4% |
I | 1 | 0.2% |
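The 81.2% imbalance score flagged for 주제영역코드 is consistent with one minus the normalized Shannon entropy of these counts. That formula is an assumption about how the score is defined rather than something stated in the report, but it reproduces the reported figure:

```python
import math

# Value counts copied from the "Common Values" table for 주제영역코드.
counts = {"B": 463, "A": 16, "K": 12, "T": 4, "G": 2, "Z": 2, "I": 1}

total = sum(counts.values())
probabilities = [c / total for c in counts.values()]

# Shannon entropy of the observed distribution.
entropy = -sum(p * math.log(p) for p in probabilities)

# Entropy of a perfectly balanced variable with the same number of classes.
max_entropy = math.log(len(counts))

# Assumed definition: 0 means perfectly balanced, 1 means a single class.
imbalance = 1 - entropy / max_entropy
print(f"Imbalance: {100 * imbalance:.1f}%")
```

Because 463 of the 500 rows share the value B, the entropy is far below its maximum and the score lands close to 1.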
Common Values (Plot)
Value | Count | Frequency (%) |
b | 463 | 92.6% |
a | 16 | 3.2% |
k | 12 | 2.4% |
t | 4 | 0.8% |
g | 2 | 0.4% |
z | 2 | 0.4% |
i | 1 | 0.2% |
코드유형구분코드
Categorical
CONSTANT
 
Distinct | 1 |
---|---|
Distinct (%) | 0.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 4.0 KiB |
C | 500 |
---|---|
Length
Max length | 1 |
---|---|
Median length | 1 |
Mean length | 1 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | C |
---|---|
2nd row | C |
3rd row | C |
4th row | C |
5th row | C |
Common Values
Value | Count | Frequency (%) |
C | 500 | 100.0% |
Common Values (Plot)
Value | Count | Frequency (%) |
c | 500 | 100.0% |
목록코드테이블명
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 500 |
---|---|
Missing (%) | 100.0% |
Memory size | 4.5 KiB |
목록코드테이블논리명
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 500 |
---|---|
Missing (%) | 100.0% |
Memory size | 4.5 KiB |
목록코드컬럼명
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 500 |
---|---|
Missing (%) | 100.0% |
Memory size | 4.5 KiB |
목록코드컬럼논리명
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 500 |
---|---|
Missing (%) | 100.0% |
Memory size | 4.5 KiB |
최종수정수
Real number (ℝ)
Distinct | 20 |
---|---|
Distinct (%) | 4.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 2.37 |
Minimum | 1 |
---|---|
Maximum | 36 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 4.5 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 1 |
Q1 | 1 |
median | 1 |
Q3 | 2 |
95-th percentile | 4 |
Maximum | 36 |
Range | 35 |
Interquartile range (IQR) | 1 |
Descriptive statistics
Standard deviation | 5.2406133 |
---|---|
Coefficient of variation (CV) | 2.2112293 |
Kurtosis | 25.600953 |
Mean | 2.37 |
Median Absolute Deviation (MAD) | 0 |
Skewness | 5.1455663 |
Sum | 1185 |
Variance | 27.464028 |
Monotonicity | Not monotonic |
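The descriptive statistics for 최종수정수 can be reproduced from the value counts listed in this section. The sketch below rebuilds the 500 observations from the smallest-value and largest-value tables, which together cover all 20 distinct values, and recomputes the mean, sample variance, standard deviation and coefficient of variation. The use of the sample (n - 1) variance is an assumption, though it matches the reported numbers.

```python
import statistics

# Frequency table for 최종수정수 (value -> count), assembled from the
# value tables in this section. The counts sum to the 500 observations.
frequencies = {
    1: 324, 2: 142, 3: 8, 4: 4, 5: 3, 7: 1, 10: 1, 23: 1, 24: 2, 25: 1,
    27: 1, 28: 1, 29: 1, 30: 2, 31: 2, 32: 1, 33: 2, 34: 1, 35: 1, 36: 1,
}

# Expand the table back into one entry per observation.
values = [v for v, n in frequencies.items() for _ in range(n)]

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)
cv = std / mean                         # coefficient of variation

print(f"n={len(values)} sum={sum(values)} mean={mean:.2f}")
print(f"variance={variance:.6f} std={std:.7f} cv={cv:.7f}")
```

The large coefficient of variation reflects a few records with 23 to 36 revisions pulling the mean well above the median of 1.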
Common values
Value | Count | Frequency (%) |
1 | 324 | 64.8% |
2 | 142 | 28.4% |
3 | 8 | 1.6% |
4 | 4 | 0.8% |
5 | 3 | 0.6% |
24 | 2 | 0.4% |
31 | 2 | 0.4% |
30 | 2 | 0.4% |
33 | 2 | 0.4% |
23 | 1 | 0.2% |
Other values (10) | 10 | 2.0% |
Minimum 10 values
Value | Count | Frequency (%) |
1 | 324 | 64.8% |
2 | 142 | 28.4% |
3 | 8 | 1.6% |
4 | 4 | 0.8% |
5 | 3 | 0.6% |
7 | 1 | 0.2% |
10 | 1 | 0.2% |
23 | 1 | 0.2% |
24 | 2 | 0.4% |
25 | 1 | 0.2% |
Maximum 10 values
Value | Count | Frequency (%) |
36 | 1 | 0.2% |
35 | 1 | 0.2% |
34 | 1 | 0.2% |
33 | 2 | 0.4% |
32 | 1 | 0.2% |
31 | 2 | 0.4% |
30 | 2 | 0.4% |
29 | 1 | 0.2% |
28 | 1 | 0.2% |
27 | 1 | 0.2% |
처리시각
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 500 |
---|---|
Missing (%) | 100.0% |
Memory size | 4.5 KiB |
처리직원번호
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 16 |
---|---|
Distinct (%) | 3.2% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 5329.58 |
Minimum | 4509 |
---|---|
Maximum | 6105 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 4.5 KiB |
Quantile statistics
Minimum | 4509 |
---|---|
5-th percentile | 5099 |
Q1 | 5099 |
median | 5099 |
Q3 | 5220 |
95-th percentile | 6105 |
Maximum | 6105 |
Range | 1596 |
Interquartile range (IQR) | 121 |
Descriptive statistics
Standard deviation | 377.02799 |
---|---|
Coefficient of variation (CV) | 0.070742533 |
Kurtosis | 0.12382107 |
Mean | 5329.58 |
Median Absolute Deviation (MAD) | 0 |
Skewness | 1.2808986 |
Sum | 2664790 |
Variance | 142150.1 |
Monotonicity | Not monotonic |
Common values
Value | Count | Frequency (%) |
5099 | 255 | 51.0% |
5220 | 126 | 25.2% |
6105 | 51 | 10.2% |
6099 | 24 | 4.8% |
5803 | 16 | 3.2% |
5823 | 6 | 1.2% |
5742 | 6 | 1.2% |
6009 | 3 | 0.6% |
4509 | 3 | 0.6% |
5544 | 2 | 0.4% |
Other values (6) | 8 | 1.6% |
Minimum 10 values
Value | Count | Frequency (%) |
4509 | 3 | 0.6% |
4917 | 1 | 0.2% |
5099 | 255 | 51.0% |
5113 | 1 | 0.2% |
5220 | 126 | 25.2% |
5222 | 1 | 0.2% |
5476 | 1 | 0.2% |
5544 | 2 | 0.4% |
5742 | 6 | 1.2% |
5803 | 16 | 3.2% |
Maximum 10 values
Value | Count | Frequency (%) |
6105 | 51 | 10.2% |
6099 | 24 | 4.8% |
6009 | 3 | 0.6% |
5873 | 2 | 0.4% |
5870 | 2 | 0.4% |
5823 | 6 | 1.2% |
5803 | 16 | 3.2% |
5742 | 6 | 1.2% |
5544 | 2 | 0.4% |
5476 | 1 | 0.2% |
 | 주제영역코드 | 최종수정수 | 처리직원번호 |
---|---|---|---|
주제영역코드 | 1.000 | 0.776 | 0.904 |
최종수정수 | 0.776 | 1.000 | 0.580 |
처리직원번호 | 0.904 | 0.580 | 1.000 |
 | 최종수정수 | 처리직원번호 | 주제영역코드 |
---|---|---|---|
최종수정수 | 1.000 | -0.094 | 0.373 |
처리직원번호 | -0.094 | 1.000 | 0.650 |
주제영역코드 | 0.373 | 0.650 | 1.000 |
First rows
 | 코드값 | 주제영역코드 | 코드유형구분코드 | 목록코드테이블명 | 목록코드테이블논리명 | 목록코드컬럼명 | 목록코드컬럼논리명 | 최종수정수 | 처리시각 | 처리직원번호 |
---|---|---|---|---|---|---|---|---|---|---|
0 | 28 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099 |
1 | 1137 | B | C | <NA> | <NA> | <NA> | <NA> | 4 | <NA> | 5099 |
2 | 27 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099 |
3 | 40 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099 |
4 | 1137 | B | C | <NA> | <NA> | <NA> | <NA> | 3 | <NA> | 5099 |
5 | 45 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099 |
6 | 44 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099 |
7 | 43 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099 |
8 | 42 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099 |
9 | 41 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099 |
Last rows
 | 코드값 | 주제영역코드 | 코드유형구분코드 | 목록코드테이블명 | 목록코드테이블논리명 | 목록코드컬럼명 | 목록코드컬럼논리명 | 최종수정수 | 처리시각 | 처리직원번호 |
---|---|---|---|---|---|---|---|---|---|---|
490 | 110101 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099 |
491 | 100305 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099 |
492 | 100304 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099 |
493 | 100303 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099 |
494 | 100302 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099 |
495 | 100301 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099 |
496 | 100204 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099 |
497 | 100203 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099 |
498 | 100202 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099 |
499 | 100201 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099 |
Most frequently occurring
 | 코드값 | 주제영역코드 | 코드유형구분코드 | 최종수정수 | 처리직원번호 | # duplicates |
---|---|---|---|---|---|---|
0 | G184 | B | C | 1 | 6105 | 3 |
1 | G203 | B | C | 1 | 5099 | 3 |
2 | G204 | B | C | 1 | 5099 | 2 |
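The table lists three duplicated row patterns, which agrees with the "Duplicate rows" count of 3 if that figure counts distinct duplicated patterns rather than surplus rows (an assumption). A minimal sketch of this kind of tally using `collections.Counter`; the rows are illustrative, with the three duplicated patterns taken from the table and one hypothetical unique row (`G999`) added for contrast:

```python
from collections import Counter

# Illustrative rows: (코드값, 주제영역코드, 코드유형구분코드, 최종수정수, 처리직원번호).
# The three duplicated patterns come from the table; the last row is a
# made-up unique row that should not be reported as a duplicate.
rows = (
    [("G184", "B", "C", 1, 6105)] * 3
    + [("G203", "B", "C", 1, 5099)] * 3
    + [("G204", "B", "C", 1, 5099)] * 2
    + [("G999", "B", "C", 1, 5099)]
)

# Count identical rows and keep only patterns seen more than once.
pattern_counts = Counter(rows)
duplicates = {row: n for row, n in pattern_counts.items() if n > 1}

for row, n in sorted(duplicates.items(), key=lambda item: -item[1]):
    print(row, n)
print("Duplicated patterns:", len(duplicates))
```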