"TKG102_%EF%BF%BD%EF%BF%BD%EF%BF%BD%EF%BF%BD%EF%BF%BD%EF%BF%BD%EF%BF%BD%EF%BF%BD%EF%BF%BD_%EF%BF%BD%EF%BF%BD%EF%BF%BD%EF%BF%BD%EF%BF%BD%EF%BF%BD_%EF%BF%BD%EF%BF%BD%EF%BF%BD%EF%BF%BD%EF%BF%BD%EF%BF%BD_%EF%BF%BD%EF%BF%BD%EF%BF%BD%EF%BF%BD%EF%BF%BD%EF%BF%BD_%EF%BF%BD%EF%BF%BD%EF%BF%BD%EF%BF%BD%EF%BF%BD%EF%BF%BD%EF%BF%BD%EF%BF%BD%EF%BF%BD_%EF%BF%BD%EF%BF%BD%EF%BF%BD%EF%BF%BD%EF%BF%BD%EF%BF%BD.csv"의 파일명이 "TKG102_시스템_관리_공통_코드_마스터_이력.csv"으로 변경 됨.

Overview

Dataset statistics

Number of variables: 10
Number of observations: 500
Missing cells: 2500
Missing cells (%): 50.0%
Duplicate rows: 3
Duplicate rows (%): 0.6%
Total size in memory: 42.6 KiB
Average record size in memory: 87.3 B
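
The percentages above follow directly from the counts. As a minimal check, assuming the percentages are taken over all cells (variables × observations) and over all observations respectively, the arithmetic can be sketched as:

```python
# Counts copied from the dataset statistics above.
n_variables = 10
n_observations = 500
missing_cells = 2500
duplicate_rows = 3

total_cells = n_variables * n_observations          # 10 columns x 500 rows
missing_pct = 100 * missing_cells / total_cells     # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_observations

# The Alerts section reports five columns that are 100% missing;
# five fully empty columns account for every missing cell.
fully_missing_columns = 5
all_missing_explained = fully_missing_columns * n_observations == missing_cells

print(f"missing: {missing_pct:.1f}%  duplicates: {duplicate_pct:.1f}%")
print("five empty columns explain all missing cells:", all_missing_explained)
```

In other words, the 50.0% missing rate is not spread across the table: the remaining five columns have no missing values at all.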

Variable types

Text: 1
Categorical: 2
Unsupported: 5
Numeric: 2

Dataset

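Description: This file contains the system management common code master history (시스템관리공통코드마스터이력) of the Korea Credit Guarantee Fund (신용보증기금). Please refer to it when using the data.
Author: Korea Credit Guarantee Fund (신용보증기금)
URL: https://www.data.go.kr/data/15093316/fileData.do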

Alerts

코드유형구분코드 has constant value "C" (Constant)
Dataset has 3 (0.6%) duplicate rows (Duplicates)
처리직원번호 is highly overall correlated with 주제영역코드 (High correlation)
주제영역코드 is highly overall correlated with 처리직원번호 (High correlation)
주제영역코드 is highly imbalanced (81.2%) (Imbalance)
목록코드테이블명 has 500 (100.0%) missing values (Missing)
목록코드테이블논리명 has 500 (100.0%) missing values (Missing)
목록코드컬럼명 has 500 (100.0%) missing values (Missing)
목록코드컬럼논리명 has 500 (100.0%) missing values (Missing)
처리시각 has 500 (100.0%) missing values (Missing)
목록코드테이블명 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)
목록코드테이블논리명 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)
목록코드컬럼명 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)
목록코드컬럼논리명 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)
처리시각 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)

Reproduction

Analysis started: 2023-12-12 07:55:44.368141
Analysis finished: 2023-12-12 07:55:45.506328
Duration: 1.14 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

코드값
Text

Distinct: 462
Distinct (%): 92.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 9
Median length: 8
Mean length: 5.456
Min length: 1

Characters and Unicode

Total characters: 2728
Distinct characters: 31
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
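
As a small illustration of these character properties, the sketch below uses Python's standard `unicodedata` module to count general categories for a few 코드값 values that appear elsewhere in this report. The selection of values is arbitrary and only for demonstration. `Nd` is the Unicode short name for "Decimal Number" and `Lu` for "Uppercase Letter", the two categories listed for this variable.

```python
import unicodedata
from collections import Counter

# A handful of 코드값 values that appear in this report's sample and
# duplicate-rows tables (an arbitrary selection, for illustration only).
values = ["28", "1137", "G184", "G203"]

# General category of every character: "Nd" = Decimal Number,
# "Lu" = Uppercase Letter.
category_counts = Counter(
    unicodedata.category(ch) for value in values for ch in value
)

for category, count in category_counts.most_common():
    print(category, count)
```

Script and block properties are not exposed by `unicodedata`, so this sketch covers categories only.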

Unique

Unique: 435
Unique (%): 87.0%

Sample

1st row: 28
2nd row: 1137
3rd row: 27
4th row: 40
5th row: 1137

Value | Count | Frequency (%)
v4 | 4 | 0.8%
736 | 4 | 0.8%
1157 | 4 | 0.8%
2946 | 3 | 0.6%
g203 | 3 | 0.6%
158 | 3 | 0.6%
g184 | 3 | 0.6%
2 | 3 | 0.6%
1137 | 2 | 0.4%
1 | 2 | 0.4%
Other values (452) | 469 | 93.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 746 | 27.3%
2 | 438 | 16.1%
1 | 406 | 14.9%
4 | 228 | 8.4%
G | 187 | 6.9%
3 | 127 | 4.7%
7 | 87 | 3.2%
8 | 76 | 2.8%
5 | 74 | 2.7%
6 | 74 | 2.7%
Other values (21) | 285 | 10.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 2329 | 85.4%
Uppercase Letter | 399 | 14.6%

Most frequent character per category

Uppercase Letter

Value | Count | Frequency (%)
G | 187 | 46.9%
C | 31 | 7.8%
H | 26 | 6.5%
R | 22 | 5.5%
N | 18 | 4.5%
D | 13 | 3.3%
A | 12 | 3.0%
S | 12 | 3.0%
I | 12 | 3.0%
F | 12 | 3.0%
Other values (11) | 54 | 13.5%

Decimal Number

Value | Count | Frequency (%)
0 | 746 | 32.0%
2 | 438 | 18.8%
1 | 406 | 17.4%
4 | 228 | 9.8%
3 | 127 | 5.5%
7 | 87 | 3.7%
8 | 76 | 3.3%
5 | 74 | 3.2%
6 | 74 | 3.2%
9 | 73 | 3.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2329 | 85.4%
Latin | 399 | 14.6%

Most frequent character per script

Latin

Value | Count | Frequency (%)
G | 187 | 46.9%
C | 31 | 7.8%
H | 26 | 6.5%
R | 22 | 5.5%
N | 18 | 4.5%
D | 13 | 3.3%
A | 12 | 3.0%
S | 12 | 3.0%
I | 12 | 3.0%
F | 12 | 3.0%
Other values (11) | 54 | 13.5%

Common

Value | Count | Frequency (%)
0 | 746 | 32.0%
2 | 438 | 18.8%
1 | 406 | 17.4%
4 | 228 | 9.8%
3 | 127 | 5.5%
7 | 87 | 3.7%
8 | 76 | 3.3%
5 | 74 | 3.2%
6 | 74 | 3.2%
9 | 73 | 3.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2728 | 100.0%

Most frequent character per block

ASCII

Value | Count | Frequency (%)
0 | 746 | 27.3%
2 | 438 | 16.1%
1 | 406 | 14.9%
4 | 228 | 8.4%
G | 187 | 6.9%
3 | 127 | 4.7%
7 | 87 | 3.2%
8 | 76 | 2.8%
5 | 74 | 2.7%
6 | 74 | 2.7%
Other values (21) | 285 | 10.4%

주제영역코드
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 7
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: B
2nd row: B
3rd row: B
4th row: B
5th row: B

Common Values

Value | Count | Frequency (%)
B | 463 | 92.6%
A | 16 | 3.2%
K | 12 | 2.4%
T | 4 | 0.8%
G | 2 | 0.4%
Z | 2 | 0.4%
I | 1 | 0.2%
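
The 81.2% imbalance figure reported for 주제영역코드 can be reproduced from the counts above. The sketch below assumes the score is one minus the Shannon entropy of the class distribution divided by the maximum entropy for that number of classes; this formula is an assumption that happens to match the reported value, not something the report itself states.

```python
import math

# Counts from the Common Values table for 주제영역코드.
counts = {"B": 463, "A": 16, "K": 12, "T": 4, "G": 2, "Z": 2, "I": 1}
total = sum(counts.values())

# Shannon entropy of the observed distribution, in bits.
entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())

# Entropy of a perfectly balanced variable with the same number of classes.
max_entropy = math.log2(len(counts))

# Assumed imbalance score: 0 for a uniform distribution, 1 for a single class.
imbalance = 1 - entropy / max_entropy
print(f"{imbalance:.1%}")
```

A score this close to 1 reflects that a single class, B, covers 92.6% of the rows.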

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
b | 463 | 92.6%
a | 16 | 3.2%
k | 12 | 2.4%
t | 4 | 0.8%
g | 2 | 0.4%
z | 2 | 0.4%
i | 1 | 0.2%

코드유형구분코드
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: C
2nd row: C
3rd row: C
4th row: C
5th row: C

Common Values

Value | Count | Frequency (%)
C | 500 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
c | 500 | 100.0%

목록코드테이블명
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 500
Missing (%): 100.0%
Memory size: 4.5 KiB

목록코드테이블논리명
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 500
Missing (%): 100.0%
Memory size: 4.5 KiB

목록코드컬럼명
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 500
Missing (%): 100.0%
Memory size: 4.5 KiB

목록코드컬럼논리명
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 500
Missing (%): 100.0%
Memory size: 4.5 KiB

최종수정수
Real number (ℝ)

Distinct: 20
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.37
Minimum: 1
Maximum: 36
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 2
95-th percentile: 4
Maximum: 36
Range: 35
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 5.2406133
Coefficient of variation (CV): 2.2112293
Kurtosis: 25.600953
Mean: 2.37
Median Absolute Deviation (MAD): 0
Skewness: 5.1455663
Sum: 1185
Variance: 27.464028
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=20)

Common values

Value | Count | Frequency (%)
1 | 324 | 64.8%
2 | 142 | 28.4%
3 | 8 | 1.6%
4 | 4 | 0.8%
5 | 3 | 0.6%
24 | 2 | 0.4%
31 | 2 | 0.4%
30 | 2 | 0.4%
33 | 2 | 0.4%
23 | 1 | 0.2%
Other values (10) | 10 | 2.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 324 | 64.8%
2 | 142 | 28.4%
3 | 8 | 1.6%
4 | 4 | 0.8%
5 | 3 | 0.6%
7 | 1 | 0.2%
10 | 1 | 0.2%
23 | 1 | 0.2%
24 | 2 | 0.4%
25 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
36 | 1 | 0.2%
35 | 1 | 0.2%
34 | 1 | 0.2%
33 | 2 | 0.4%
32 | 1 | 0.2%
31 | 2 | 0.4%
30 | 2 | 0.4%
29 | 1 | 0.2%
28 | 1 | 0.2%
27 | 1 | 0.2%
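
Because 최종수정수 has only 20 distinct values, the last two tables above (the ten smallest and the ten largest values) together list the whole distribution. As a consistency check, the sketch below rebuilds the 500 observations from those counts and recomputes the summary statistics with Python's `statistics` module. It assumes the reported variance and standard deviation are the sample (n − 1) versions, which is what makes the numbers agree.

```python
import statistics

# (value, count) pairs copied from the two extreme-value tables for 최종수정수.
lowest = [(1, 324), (2, 142), (3, 8), (4, 4), (5, 3),
          (7, 1), (10, 1), (23, 1), (24, 2), (25, 1)]
highest = [(36, 1), (35, 1), (34, 1), (33, 2), (32, 1),
           (31, 2), (30, 2), (29, 1), (28, 1), (27, 1)]

# The two tables do not overlap, so merging them gives all 20 distinct values.
freq = dict(lowest)
freq.update(highest)

# Expand the frequency table back into individual observations.
values = [value for value, count in freq.items() for _ in range(count)]

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)       # sample standard deviation
median = statistics.median(values)
cv = std_dev / mean                      # coefficient of variation

print(len(values), total, mean, median)
print(round(variance, 6), round(std_dev, 7), round(cv, 7))
```

The large gap between the median (1) and the maximum (36) is what drives the high skewness and kurtosis reported for this variable.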

처리시각
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 500
Missing (%): 100.0%
Memory size: 4.5 KiB

처리직원번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 16
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5329.58
Minimum: 4509
Maximum: 6105
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 4509
5-th percentile: 5099
Q1: 5099
Median: 5099
Q3: 5220
95-th percentile: 6105
Maximum: 6105
Range: 1596
Interquartile range (IQR): 121

Descriptive statistics

Standard deviation: 377.02799
Coefficient of variation (CV): 0.070742533
Kurtosis: 0.12382107
Mean: 5329.58
Median Absolute Deviation (MAD): 0
Skewness: 1.2808986
Sum: 2664790
Variance: 142150.1
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=16)

Common values

Value | Count | Frequency (%)
5099 | 255 | 51.0%
5220 | 126 | 25.2%
6105 | 51 | 10.2%
6099 | 24 | 4.8%
5803 | 16 | 3.2%
5823 | 6 | 1.2%
5742 | 6 | 1.2%
6009 | 3 | 0.6%
4509 | 3 | 0.6%
5544 | 2 | 0.4%
Other values (6) | 8 | 1.6%

Minimum 10 values

Value | Count | Frequency (%)
4509 | 3 | 0.6%
4917 | 1 | 0.2%
5099 | 255 | 51.0%
5113 | 1 | 0.2%
5220 | 126 | 25.2%
5222 | 1 | 0.2%
5476 | 1 | 0.2%
5544 | 2 | 0.4%
5742 | 6 | 1.2%
5803 | 16 | 3.2%

Maximum 10 values

Value | Count | Frequency (%)
6105 | 51 | 10.2%
6099 | 24 | 4.8%
6009 | 3 | 0.6%
5873 | 2 | 0.4%
5870 | 2 | 0.4%
5823 | 6 | 1.2%
5803 | 16 | 3.2%
5742 | 6 | 1.2%
5544 | 2 | 0.4%
5476 | 1 | 0.2%


Correlations

Variable | 주제영역코드 | 최종수정수 | 처리직원번호
주제영역코드 | 1.000 | 0.776 | 0.904
최종수정수 | 0.776 | 1.000 | 0.580
처리직원번호 | 0.904 | 0.580 | 1.000

Variable | 최종수정수 | 처리직원번호 | 주제영역코드
최종수정수 | 1.000 | -0.094 | 0.373
처리직원번호 | -0.094 | 1.000 | 0.650
주제영역코드 | 0.373 | 0.650 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

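First rows

Index | 코드값 | 주제영역코드 | 코드유형구분코드 | 목록코드테이블명 | 목록코드테이블논리명 | 목록코드컬럼명 | 목록코드컬럼논리명 | 최종수정수 | 처리시각 | 처리직원번호
0 | 28 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099
1 | 1137 | B | C | <NA> | <NA> | <NA> | <NA> | 4 | <NA> | 5099
2 | 27 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099
3 | 40 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099
4 | 1137 | B | C | <NA> | <NA> | <NA> | <NA> | 3 | <NA> | 5099
5 | 45 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099
6 | 44 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099
7 | 43 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099
8 | 42 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099
9 | 41 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099

Last rows

Index | 코드값 | 주제영역코드 | 코드유형구분코드 | 목록코드테이블명 | 목록코드테이블논리명 | 목록코드컬럼명 | 목록코드컬럼논리명 | 최종수정수 | 처리시각 | 처리직원번호
490 | 110101 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099
491 | 100305 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099
492 | 100304 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099
493 | 100303 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099
494 | 100302 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099
495 | 100301 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099
496 | 100204 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099
497 | 100203 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099
498 | 100202 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099
499 | 100201 | B | C | <NA> | <NA> | <NA> | <NA> | 2 | <NA> | 5099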

Duplicate rows

Most frequently occurring

Index | 코드값 | 주제영역코드 | 코드유형구분코드 | 최종수정수 | 처리직원번호 | # duplicates
0 | G184 | B | C | 1 | 6105 | 3
1 | G203 | B | C | 1 | 5099 | 3
2 | G204 | B | C | 1 | 5099 | 2
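
The table lists three distinct rows that occur 3, 3 and 2 times. That suggests the "Duplicate rows: 3 (0.6%)" figure in the dataset statistics counts distinct duplicated rows rather than redundant copies; this is an inference from the numbers, not something the report states. The sketch below rebuilds just these rows, plus one non-duplicated row from the sample for contrast, and counts them both ways with `collections.Counter`.

```python
from collections import Counter

# Rows rebuilt from the duplicate-rows table; repetition counts come from its
# "# duplicates" column. Tuple fields: 코드값, 주제영역코드, 코드유형구분코드,
# 최종수정수, 처리직원번호.
rows = (
    [("G184", "B", "C", 1, 6105)] * 3
    + [("G203", "B", "C", 1, 5099)] * 3
    + [("G204", "B", "C", 1, 5099)] * 2
    + [("28", "B", "C", 2, 5099)]  # first sample row, not duplicated
)

counts = Counter(rows)
duplicated = {row: n for row, n in counts.items() if n > 1}

distinct_duplicated = len(duplicated)                   # distinct rows that repeat
rows_involved = sum(duplicated.values())                # every occurrence of them
redundant_copies = rows_involved - distinct_duplicated  # copies beyond the first

print(distinct_duplicated, rows_involved, redundant_copies)
```

Under this reading, deduplicating the full dataset would remove five rows, not three.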