Overview

Dataset statistics

Number of variables: 8
Number of observations: 500
Missing cells: 2000
Missing cells (%): 50.0%
Duplicate rows: 16
Duplicate rows (%): 3.2%
Total size in memory: 33.8 KiB
Average record size in memory: 69.3 B
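
The two percentages above follow directly from the counts. As a quick sanity check, here is a minimal sketch in plain Python. The variable names are illustrative and not part of the report; the figure of four fully missing columns is taken from the Alerts section.

```python
# Counts copied from the dataset statistics in this report.
n_variables = 8
n_observations = 500
missing_cells = 2000
duplicate_rows = 16
fully_missing_columns = 4  # the four columns flagged as 100% missing

total_cells = n_variables * n_observations  # 8 * 500 = 4000 cells
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
# All missing cells are explained by the four empty columns.
print(fully_missing_columns * n_observations == missing_cells)
```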

Variable types

Categorical: 2
Unsupported: 4
Boolean: 1
Numeric: 1

Dataset

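Description: This file provides information on the system management common code master of the Korea Credit Guarantee Fund (신용보증기금). Please refer to it when using the data.
Author: Korea Credit Guarantee Fund (신용보증기금)
URL: https://www.data.go.kr/data/15093315/fileData.do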

Alerts

코드유형구분코드 has constant value "C" [Constant]
Dataset has 16 (3.2%) duplicate rows [Duplicates]
최종수정수 is highly overall correlated with 최초처리직원번호 [High correlation]
최초처리직원번호 is highly overall correlated with 최종수정수 [High correlation]
삭제여부 is highly imbalanced (74.9%) [Imbalance]
목록코드테이블명 has 500 (100.0%) missing values [Missing]
목록코드테이블논리명 has 500 (100.0%) missing values [Missing]
목록코드컬럼명 has 500 (100.0%) missing values [Missing]
목록코드컬럼논리명 has 500 (100.0%) missing values [Missing]
목록코드테이블명 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
목록코드테이블논리명 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
목록코드컬럼명 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
목록코드컬럼논리명 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
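
The 74.9% imbalance figure for 삭제여부 is consistent with a score of one minus the normalized Shannon entropy of the class frequencies (479 False, 21 True, as listed in that variable's value counts). The sketch below assumes this definition; the report itself does not state the formula, and the function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class shares.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    Assumed definition, not taken from the report itself.
    """
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    shares = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in shares)
    return 1 - entropy / math.log2(len(counts))

# Class counts for 삭제여부: 479 False, 21 True.
score = imbalance_score([479, 21])
print(f"{score:.1%}")  # prints 74.9%
```

A perfectly balanced 250/250 split would score 0 under the same definition.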

Reproduction

Analysis started: 2023-12-12 14:39:54.057272
Analysis finished: 2023-12-12 14:39:54.604388
Duration: 0.55 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

코드유형구분코드
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: C
2nd row: C
3rd row: C
4th row: C
5th row: C

Common Values

Value | Count | Frequency (%)
C | 500 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
c | 500 | 100.0%

목록코드테이블명
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 500
Missing (%): 100.0%
Memory size: 4.5 KiB

목록코드테이블논리명
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 500
Missing (%): 100.0%
Memory size: 4.5 KiB

목록코드컬럼명
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 500
Missing (%): 100.0%
Memory size: 4.5 KiB

목록코드컬럼논리명
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 500
Missing (%): 100.0%
Memory size: 4.5 KiB

삭제여부
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 632.0 B

Value | Count | Frequency (%)
False | 479 | 95.8%
True | 21 | 4.2%

최종수정수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 13
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.736
Minimum: 1
Maximum: 36
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 2
95-th percentile: 2
Maximum: 36
Range: 35
Interquartile range (IQR): 1
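
These quantiles can be reproduced from the frequency tables of 최종수정수 in this report. The sketch below assumes linear interpolation between order statistics, which is a common default but is not stated in the report; the helper `percentile` is hypothetical.

```python
# Value -> count for 최종수정수, taken from the frequency tables in this report.
value_counts = {1: 335, 2: 144, 3: 6, 4: 4, 5: 3, 7: 1, 10: 1,
                24: 1, 25: 1, 30: 1, 31: 1, 33: 1, 36: 1}

# Expand the table into a sorted list of the 500 observations.
sorted_values = sorted(v for v, n in value_counts.items() for _ in range(n))

def percentile(data, q):
    """Quantile q (0..1) of sorted data, using linear interpolation."""
    pos = q * (len(data) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (pos - lo) * (data[hi] - data[lo])

q1 = percentile(sorted_values, 0.25)
q3 = percentile(sorted_values, 0.75)
p95 = percentile(sorted_values, 0.95)
value_range = sorted_values[-1] - sorted_values[0]
print(q1, q3, p95, q3 - q1, value_range)
```

Because 479 of the 500 observations are 1 or 2, even the 95th percentile stays at 2 while the maximum reaches 36.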

Descriptive statistics

Standard deviation: 3.2260065
Coefficient of variation (CV): 1.8582987
Kurtosis: 76.5679
Mean: 1.736
Median Absolute Deviation (MAD): 0
Skewness: 8.5576571
Sum: 868
Variance: 10.407118
Monotonicity: Not monotonic
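
Several of these statistics can be cross-checked from the frequency tables of 최종수정수 in this report. The following is a minimal sketch using only the Python standard library. It assumes the sample variance (dividing by n - 1), which matches the reported value; variable names are illustrative.

```python
import statistics

# Value -> count for 최종수정수, taken from the frequency tables in this report.
value_counts = {1: 335, 2: 144, 3: 6, 4: 4, 5: 3, 7: 1, 10: 1,
                24: 1, 25: 1, 30: 1, 31: 1, 33: 1, 36: 1}

# Expand the frequency table back into the 500 individual observations.
values = [v for v, n in value_counts.items() for _ in range(n)]

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, divides by n - 1
std = statistics.stdev(values)
cv = std / mean                         # coefficient of variation
median = statistics.median(values)
mad = statistics.median([abs(v - median) for v in values])

print(total, mean, round(variance, 6), round(std, 7), round(cv, 7), mad)
```

The MAD of 0 reflects that more than half of the observations equal the median of 1.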
Histogram with fixed size bins (bins=13)
Value | Count | Frequency (%)
1 | 335 | 67.0%
2 | 144 | 28.8%
3 | 6 | 1.2%
4 | 4 | 0.8%
5 | 3 | 0.6%
7 | 1 | 0.2%
24 | 1 | 0.2%
25 | 1 | 0.2%
31 | 1 | 0.2%
30 | 1 | 0.2%
Other values (3) | 3 | 0.6%

Value | Count | Frequency (%)
1 | 335 | 67.0%
2 | 144 | 28.8%
3 | 6 | 1.2%
4 | 4 | 0.8%
5 | 3 | 0.6%
7 | 1 | 0.2%
10 | 1 | 0.2%
24 | 1 | 0.2%
25 | 1 | 0.2%
30 | 1 | 0.2%

Value | Count | Frequency (%)
36 | 1 | 0.2%
33 | 1 | 0.2%
31 | 1 | 0.2%
30 | 1 | 0.2%
25 | 1 | 0.2%
24 | 1 | 0.2%
10 | 1 | 0.2%
7 | 1 | 0.2%
5 | 3 | 0.6%
4 | 4 | 0.8%

최초처리직원번호
Categorical

HIGH CORRELATION 

Distinct: 21
Distinct (%): 4.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 5
Median length: 4
Mean length: 4.06
Min length: 4

Unique

Unique: 4
Unique (%): 0.8%

Sample

1st row: BATCH
2nd row: 5423
3rd row: BATCH
4th row: 4444
5th row: 4444

Common Values

Value | Count | Frequency (%)
5099 | 251 | 50.2%
5220 | 128 | 25.6%
6105 | 45 | 9.0%
BATCH | 25 | 5.0%
5803 | 9 | 1.8%
4444 | 6 | 1.2%
5823 | 6 | 1.2%
5423 | 4 | 0.8%
6009 | 3 | 0.6%
5222 | 3 | 0.6%
Other values (11) | 20 | 4.0%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
5099 | 251 | 50.2%
5220 | 128 | 25.6%
6105 | 45 | 9.0%
batch | 25 | 5.0%
5803 | 9 | 1.8%
4444 | 6 | 1.2%
5823 | 6 | 1.2%
5423 | 4 | 0.8%
4509 | 3 | 0.6%
exc41 | 3 | 0.6%
Other values (11) | 20 | 4.0%

Interactions


Correlations

 | 삭제여부 | 최종수정수 | 최초처리직원번호
삭제여부 | 1.000 | 0.000 | 0.069
최종수정수 | 0.000 | 1.000 | 0.929
최초처리직원번호 | 0.069 | 0.929 | 1.000

 | 삭제여부 | 최초처리직원번호
삭제여부 | 1.000 | 0.058
최초처리직원번호 | 0.058 | 1.000

 | 최종수정수 | 삭제여부 | 최초처리직원번호
최종수정수 | 1.000 | 0.000 | 0.731
삭제여부 | 0.000 | 1.000 | 0.058
최초처리직원번호 | 0.731 | 0.058 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

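 | 코드유형구분코드 | 목록코드테이블명 | 목록코드테이블논리명 | 목록코드컬럼명 | 목록코드컬럼논리명 | 삭제여부 | 최종수정수 | 최초처리직원번호
0 | C | <NA> | <NA> | <NA> | <NA> | N | 2 | BATCH
1 | C | <NA> | <NA> | <NA> | <NA> | N | 4 | 5423
2 | C | <NA> | <NA> | <NA> | <NA> | N | 2 | BATCH
3 | C | <NA> | <NA> | <NA> | <NA> | N | 2 | 4444
4 | C | <NA> | <NA> | <NA> | <NA> | N | 2 | 4444
5 | C | <NA> | <NA> | <NA> | <NA> | N | 2 | 4444
6 | C | <NA> | <NA> | <NA> | <NA> | N | 2 | 4444
7 | C | <NA> | <NA> | <NA> | <NA> | N | 2 | 4444
8 | C | <NA> | <NA> | <NA> | <NA> | N | 2 | 4444
9 | C | <NA> | <NA> | <NA> | <NA> | N | 1 | 6009
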
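 | 코드유형구분코드 | 목록코드테이블명 | 목록코드테이블논리명 | 목록코드컬럼명 | 목록코드컬럼논리명 | 삭제여부 | 최종수정수 | 최초처리직원번호
490 | C | <NA> | <NA> | <NA> | <NA> | Y | 1 | 5099
491 | C | <NA> | <NA> | <NA> | <NA> | Y | 1 | 5099
492 | C | <NA> | <NA> | <NA> | <NA> | Y | 1 | 5099
493 | C | <NA> | <NA> | <NA> | <NA> | Y | 1 | 5099
494 | C | <NA> | <NA> | <NA> | <NA> | Y | 1 | 5099
495 | C | <NA> | <NA> | <NA> | <NA> | Y | 1 | 5099
496 | C | <NA> | <NA> | <NA> | <NA> | Y | 1 | 5099
497 | C | <NA> | <NA> | <NA> | <NA> | Y | 1 | 5099
498 | C | <NA> | <NA> | <NA> | <NA> | Y | 1 | 5099
499 | C | <NA> | <NA> | <NA> | <NA> | Y | 1 | 5099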

Duplicate rows

Most frequently occurring

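 | 코드유형구분코드 | 삭제여부 | 최종수정수 | 최초처리직원번호 | # duplicates
1 | C | N | 1 | 5099 | 129
2 | C | N | 1 | 5220 | 126
10 | C | N | 2 | 5099 | 100
8 | C | N | 1 | 6105 | 39
13 | C | N | 2 | BATCH | 24
15 | C | Y | 1 | 5099 | 21
11 | C | N | 2 | 5803 | 7
9 | C | N | 2 | 4444 | 6
4 | C | N | 1 | 5823 | 5
12 | C | N | 2 | 6105 | 4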
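
Duplicate groups like those listed above can be found by counting identical rows. Here is a minimal sketch that uses only the first ten sample rows shown in this report, not the full 500-row dataset. The four all-missing columns are left out for brevity, and the variable names are illustrative.

```python
from collections import Counter

# (코드유형구분코드, 삭제여부, 최종수정수, 최초처리직원번호) for sample rows 0-9.
rows = [
    ("C", "N", 2, "BATCH"),
    ("C", "N", 4, "5423"),
    ("C", "N", 2, "BATCH"),
    ("C", "N", 2, "4444"),
    ("C", "N", 2, "4444"),
    ("C", "N", 2, "4444"),
    ("C", "N", 2, "4444"),
    ("C", "N", 2, "4444"),
    ("C", "N", 2, "4444"),
    ("C", "N", 1, "6009"),
]

# Count how often each distinct row occurs and keep those seen more than once.
row_counts = Counter(rows)
duplicate_groups = {row: n for row, n in row_counts.items() if n > 1}

# List the groups from most to least frequent.
for row, n in sorted(duplicate_groups.items(), key=lambda item: -item[1]):
    print(row, n)
```

In this subset the row with 최초처리직원번호 4444 occurs six times, which agrees with the count of 6 shown for that combination in the table above.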