Overview

Dataset statistics

Number of variables: 9
Number of observations: 500
Missing cells: 479
Missing cells (%): 10.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 37.2 KiB
Average record size in memory: 76.3 B
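
The missing-cell percentage follows directly from the counts above: 479 missing cells out of 500 × 9 cells. A quick check in plain Python (simple arithmetic, not the profiler's own code):

```python
# Counts taken from the dataset statistics above.
n_variables = 9
n_observations = 500
missing_cells = 479

total_cells = n_variables * n_observations        # 4500 cells in total
missing_pct = 100 * missing_cells / total_cells   # share of empty cells

print(f"{missing_pct:.1f}%")  # prints 10.6%
```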

Variable types

Categorical: 6
Text: 2
Numeric: 1

Dataset

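Description: This file contains detailed credit information on companies that received guarantee consultations from the Korea Credit Guarantee Fund (신용보증기금). Please refer to it when using the data.
Author: 신용보증기금 (Korea Credit Guarantee Fund)
URL: https://www.data.go.kr/data/15093227/fileData.do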

Alerts

이력일련번호 has constant value "1" (Constant)
최종수정수 has constant value "1" (Constant)
상담기업개요ID is highly overall correlated with 처리직원번호 (High correlation)
처리직원번호 is highly overall correlated with 상담기업개요ID (High correlation)
여부항목여부 is highly imbalanced (78.3%) (Imbalance)
일자항목일자 is highly imbalanced (89.4%) (Imbalance)
문자항목명 has 479 (95.8%) missing values (Missing)
숫자항목값 is highly skewed (γ1 = 20.6393162) (Skewed)
숫자항목값 has 493 (98.6%) zeros (Zeros)
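
The two imbalance percentages above are consistent with a score of one minus the normalised Shannon entropy of the value counts. The sketch below reproduces them from the counts listed in the variable sections. The helper name `imbalance_score` and the handling of the single-class case are assumptions made for illustration, not necessarily the profiler's exact implementation.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform split, close to 1 when one class dominates."""
    k = len(counts)
    if k < 2:
        return 0.0  # degenerate case; convention chosen for this sketch
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)  # Shannon entropy in bits
    return 1 - entropy / math.log2(k)

yn_score = imbalance_score([472, 24, 4])  # 여부항목여부: blank, N, Y
date_score = imbalance_score([493, 7])    # 일자항목일자: two distinct values

print(f"{yn_score:.1%} {date_score:.1%}")  # prints 78.3% 89.4%
```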

Reproduction

Analysis started: 2023-12-12 03:21:08.658730
Analysis finished: 2023-12-12 03:21:10.082794
Duration: 1.42 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
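
The report does not include the script that produced it. The following is a minimal sketch of how a comparable report might be generated with ydata-profiling. The `build_report` helper, the file paths, the report title, and the `encoding` default are all hypothetical; the `dataset` keys mirror the Dataset section above.

```python
def build_report(csv_path, out_path="report.html", encoding="utf-8"):
    """Profile a CSV file and write an HTML report (a sketch, not the original script)."""
    # Imported lazily so this module still loads where the libraries are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding=encoding)

    profile = ProfileReport(
        df,
        title="Guarantee consultation credit information",  # hypothetical title
        dataset={
            "description": "Detailed credit information on guarantee-consultation companies",
            "author": "신용보증기금",
            "url": "https://www.data.go.kr/data/15093227/fileData.do",
        },
    )
    profile.to_file(out_path)  # writes the HTML report to disk
    return profile
```

Calling `build_report("credit_info.csv")` with a local copy of the file (the file name is a placeholder) would write `report.html` next to the script.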

Variables

상담기업개요ID
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Value | Count
9bLqoOA34p | 113
9bLqlSJQvi | 113
9bLqkoCoJ6 | 113
9bLqjgGybW | 113
9bLqhQ42Cv | 48

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 9bLqoOA34p
2nd row: 9bLqoOA34p
3rd row: 9bLqoOA34p
4th row: 9bLqoOA34p
5th row: 9bLqoOA34p

Common Values

Value | Count | Frequency (%)
9bLqoOA34p | 113 | 22.6%
9bLqlSJQvi | 113 | 22.6%
9bLqkoCoJ6 | 113 | 22.6%
9bLqjgGybW | 113 | 22.6%
9bLqhQ42Cv | 48 | 9.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
9blqooa34p | 113 | 22.6%
9blqlsjqvi | 113 | 22.6%
9blqkocoj6 | 113 | 22.6%
9blqjggybw | 113 | 22.6%
9blqhq42cv | 48 | 9.6%

상담신용정보항목코드
Text

Distinct: 113
Distinct (%): 22.6%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 3500
Distinct characters: 21
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CA001YN
2nd row: CH007DT
3rd row: CH006DC
4th row: CH005DC
5th row: CH004DT

Words

Value | Count | Frequency (%)
ca001yn | 5 | 1.0%
cd005ch | 5 | 1.0%
cb008dc | 5 | 1.0%
cb005ch | 5 | 1.0%
cb009ch | 5 | 1.0%
cc001yn | 5 | 1.0%
cc002ch | 5 | 1.0%
cc004ch | 5 | 1.0%
cc005ch | 5 | 1.0%
cc006ch | 5 | 1.0%
Other values (103) | 450 | 90.0%

Most occurring characters

Value | Count | Frequency (%)
C | 953 | 27.2%
0 | 838 | 23.9%
H | 288 | 8.2%
D | 265 | 7.6%
1 | 242 | 6.9%
A | 95 | 2.7%
T | 90 | 2.6%
F | 84 | 2.4%
E | 84 | 2.4%
2 | 79 | 2.3%
Other values (11) | 482 | 13.8%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 2000 | 57.1%
Decimal Number | 1500 | 42.9%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
C | 953 | 47.6%
H | 288 | 14.4%
D | 265 | 13.2%
A | 95 | 4.8%
T | 90 | 4.5%
F | 84 | 4.2%
E | 84 | 4.2%
B | 45 | 2.2%
N | 32 | 1.6%
Y | 32 | 1.6%

Decimal Number
Value | Count | Frequency (%)
0 | 838 | 55.9%
1 | 242 | 16.1%
2 | 79 | 5.3%
3 | 58 | 3.9%
5 | 54 | 3.6%
4 | 54 | 3.6%
6 | 48 | 3.2%
7 | 48 | 3.2%
9 | 40 | 2.7%
8 | 39 | 2.6%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 2000 | 57.1%
Common | 1500 | 42.9%

Most frequent character per script

Latin
Value | Count | Frequency (%)
C | 953 | 47.6%
H | 288 | 14.4%
D | 265 | 13.2%
A | 95 | 4.8%
T | 90 | 4.5%
F | 84 | 4.2%
E | 84 | 4.2%
B | 45 | 2.2%
N | 32 | 1.6%
Y | 32 | 1.6%

Common
Value | Count | Frequency (%)
0 | 838 | 55.9%
1 | 242 | 16.1%
2 | 79 | 5.3%
3 | 58 | 3.9%
5 | 54 | 3.6%
4 | 54 | 3.6%
6 | 48 | 3.2%
7 | 48 | 3.2%
9 | 40 | 2.7%
8 | 39 | 2.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 3500 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
C | 953 | 27.2%
0 | 838 | 23.9%
H | 288 | 8.2%
D | 265 | 7.6%
1 | 242 | 6.9%
A | 95 | 2.7%
T | 90 | 2.6%
F | 84 | 2.4%
E | 84 | 2.4%
2 | 79 | 2.3%
Other values (11) | 482 | 13.8%

이력일련번호
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Value | Count
1 | 500

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 500 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
1 | 500 | 100.0%

여부항목여부
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Value | Count
(blank) | 472
N | 24
Y | 4

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: N
2nd row: (blank)
3rd row: (blank)
4th row: (blank)
5th row: (blank)

Common Values

Value | Count | Frequency (%)
(blank) | 472 | 94.4%
N | 24 | 4.8%
Y | 4 | 0.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
n | 24 | 85.7%
y | 4 | 14.3%

문자항목명
Text

MISSING 

Distinct: 18
Distinct (%): 85.7%
Missing: 479
Missing (%): 95.8%
Memory size: 4.0 KiB

Length

Max length: 26
Median length: 8
Mean length: 5.3809524
Min length: 1

Characters and Unicode

Total characters: 113
Distinct characters: 41
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 15
Unique (%): 71.4%

Sample

1st row: CBR-3
2nd row: CBR-2
3rd row: 감사
4th row: 9bgBEBVMY9
5th row: 9bpEsCUSjo

Words

Value | Count | Frequency (%)
6.60e+12 | 2 | 9.5%
cbr-2 | 2 | 9.5%
2 | 2 | 9.5%
92 | 1 | 4.8%
4 | 1 | 4.8%
2010-10-15-19.20.22.129735 | 1 | 4.8%
thy | 1 | 4.8%
박홍준 | 1 | 4.8%
1 | 1 | 4.8%
51 | 1 | 4.8%
Other values (8) | 8 | 38.1%

Most occurring characters

Value | Count | Frequency (%)
2 | 13 | 11.5%
1 | 10 | 8.8%
9 | 9 | 8.0%
- | 7 | 6.2%
0 | 6 | 5.3%
B | 6 | 5.3%
. | 5 | 4.4%
C | 5 | 4.4%
R | 4 | 3.5%
6 | 4 | 3.5%
Other values (31) | 44 | 38.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 50 | 44.2%
Uppercase Letter | 31 | 27.4%
Lowercase Letter | 10 | 8.8%
Other Letter | 8 | 7.1%
Dash Punctuation | 7 | 6.2%
Other Punctuation | 5 | 4.4%
Math Symbol | 2 | 1.8%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
B | 6 | 19.4%
C | 5 | 16.1%
R | 4 | 12.9%
E | 4 | 12.9%
H | 2 | 6.5%
Y | 2 | 6.5%
T | 2 | 6.5%
S | 1 | 3.2%
U | 1 | 3.2%
M | 1 | 3.2%
Other values (3) | 3 | 9.7%

Decimal Number
Value | Count | Frequency (%)
2 | 13 | 26.0%
1 | 10 | 20.0%
9 | 9 | 18.0%
0 | 6 | 12.0%
6 | 4 | 8.0%
5 | 3 | 6.0%
3 | 2 | 4.0%
4 | 2 | 4.0%
7 | 1 | 2.0%

Lowercase Letter
Value | Count | Frequency (%)
b | 3 | 30.0%
o | 1 | 10.0%
j | 1 | 10.0%
s | 1 | 10.0%
p | 1 | 10.0%
g | 1 | 10.0%
q | 1 | 10.0%
l | 1 | 10.0%

Other Letter
Value | Count | Frequency (%)
8 distinct Hangul characters, each | 1 | 12.5%

Dash Punctuation
Value | Count | Frequency (%)
- | 7 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 5 | 100.0%

Math Symbol
Value | Count | Frequency (%)
+ | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 64 | 56.6%
Latin | 41 | 36.3%
Hangul | 8 | 7.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
B | 6 | 14.6%
C | 5 | 12.2%
R | 4 | 9.8%
E | 4 | 9.8%
b | 3 | 7.3%
H | 2 | 4.9%
Y | 2 | 4.9%
T | 2 | 4.9%
o | 1 | 2.4%
S | 1 | 2.4%
Other values (11) | 11 | 26.8%

Common
Value | Count | Frequency (%)
2 | 13 | 20.3%
1 | 10 | 15.6%
9 | 9 | 14.1%
- | 7 | 10.9%
0 | 6 | 9.4%
. | 5 | 7.8%
6 | 4 | 6.2%
5 | 3 | 4.7%
+ | 2 | 3.1%
3 | 2 | 3.1%
Other values (2) | 3 | 4.7%

Hangul
Value | Count | Frequency (%)
8 distinct Hangul characters, each | 1 | 12.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 105 | 92.9%
Hangul | 8 | 7.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
2 | 13 | 12.4%
1 | 10 | 9.5%
9 | 9 | 8.6%
- | 7 | 6.7%
0 | 6 | 5.7%
B | 6 | 5.7%
. | 5 | 4.8%
C | 5 | 4.8%
R | 4 | 3.8%
6 | 4 | 3.8%
Other values (23) | 36 | 34.3%

Hangul
Value | Count | Frequency (%)
8 distinct Hangul characters, each | 1 | 12.5%

숫자항목값
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 6
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 724613.87
Minimum: 0
Maximum: 2.88 × 10⁸
Zeros: 493
Zeros (%): 98.6%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 2.88 × 10⁸
Range: 2.88 × 10⁸
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 13295093
Coefficient of variation (CV): 18.347831
Kurtosis: 441.01461
Mean: 724613.87
Median Absolute Deviation (MAD): 0
Skewness: 20.639316
Sum: 3.6230693 × 10⁸
Variance: 1.7675949 × 10¹⁴
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=6)
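
Common values

Value | Count | Frequency (%)
0 | 493 | 98.6%
1 | 3 | 0.6%
74306903 | 1 | 0.2%
288000000 | 1 | 0.2%
26 | 1 | 0.2%
2 | 1 | 0.2%

Extreme values (minimum)

Value | Count | Frequency (%)
0 | 493 | 98.6%
1 | 3 | 0.6%
2 | 1 | 0.2%
26 | 1 | 0.2%
74306903 | 1 | 0.2%
288000000 | 1 | 0.2%

Extreme values (maximum)

Value | Count | Frequency (%)
288000000 | 1 | 0.2%
74306903 | 1 | 0.2%
26 | 1 | 0.2%
2 | 1 | 0.2%
1 | 3 | 0.6%
0 | 493 | 98.6%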
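
Because 숫자항목값 has only six distinct values, the descriptive statistics above can be rechecked from the frequency table alone. The sketch below rebuilds the 500 observations from the value counts and recomputes the mean, variance, standard deviation, and coefficient of variation with the standard library. It assumes the report uses the sample (n − 1) variance, which agrees with the figures shown.

```python
import statistics

# Value counts copied from the frequency table above.
value_counts = {0: 493, 1: 3, 2: 1, 26: 1, 74306903: 1, 288000000: 1}

# Expand the counts back into the 500 individual observations.
values = [value for value, count in value_counts.items() for _ in range(count)]

mean = statistics.fmean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std = statistics.stdev(values)          # sample standard deviation
cv = std / mean                         # coefficient of variation
zero_share = value_counts[0] / len(values)

print(f"mean={mean:.2f} std={std:.0f} cv={cv:.4f} zeros={zero_share:.1%}")
```

Two large values (74306903 and 288000000) account for almost the entire sum, which is why the mean sits far above the median of 0 and the skewness is so high.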

일자항목일자
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Value | Count
0001-01-01 00:00:00.000000 | 493
00:00.0 | 7

Length

Max length: 26
Median length: 26
Mean length: 25.734
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0001-01-01 00:00:00.000000
2nd row: 00:00.0
3rd row: 0001-01-01 00:00:00.000000
4th row: 0001-01-01 00:00:00.000000
5th row: 0001-01-01 00:00:00.000000

Common Values

Value | Count | Frequency (%)
0001-01-01 00:00:00.000000 | 493 | 98.6%
00:00.0 | 7 | 1.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
0001-01-01 | 493 | 49.6%
00:00:00.000000 | 493 | 49.6%
00:00.0 | 7 | 0.7%

최종수정수
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Value | Count
1 | 500

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 500 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
1 | 500 | 100.0%

처리직원번호
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Value | Count
94211 | 161
3491 | 113
3471 | 113
3452 | 113

Length

Max length: 5
Median length: 4
Mean length: 4.322
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3491
2nd row: 3491
3rd row: 3491
4th row: 3491
5th row: 3491

Common Values

Value | Count | Frequency (%)
94211 | 161 | 32.2%
3491 | 113 | 22.6%
3471 | 113 | 22.6%
3452 | 113 | 22.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
94211 | 161 | 32.2%
3491 | 113 | 22.6%
3471 | 113 | 22.6%
3452 | 113 | 22.6%

Interactions

[Figure: interactions plot]

Correlations

Variable | 상담기업개요ID | 여부항목여부 | 문자항목명 | 숫자항목값 | 일자항목일자 | 처리직원번호
상담기업개요ID | 1.000 | 0.032 | 0.780 | 0.000 | 0.000 | 1.000
여부항목여부 | 0.032 | 1.000 | NaN | 0.000 | 0.000 | 0.053
문자항목명 | 0.780 | NaN | 1.000 | NaN | NaN | 0.880
숫자항목값 | 0.000 | 0.000 | NaN | 1.000 | 0.000 | 0.031
일자항목일자 | 0.000 | 0.000 | NaN | 0.000 | 1.000 | 0.013
처리직원번호 | 1.000 | 0.053 | 0.880 | 0.031 | 0.013 | 1.000

Variable | 여부항목여부 | 상담기업개요ID | 처리직원번호 | 일자항목일자
여부항목여부 | 1.000 | 0.023 | 0.050 | 0.000
상담기업개요ID | 0.023 | 1.000 | 0.999 | 0.000
처리직원번호 | 0.050 | 0.999 | 1.000 | 0.008
일자항목일자 | 0.000 | 0.000 | 0.008 | 1.000

Variable | 숫자항목값 | 상담기업개요ID | 여부항목여부 | 일자항목일자 | 처리직원번호
숫자항목값 | 1.000 | 0.000 | 0.000 | 0.000 | 0.029
상담기업개요ID | 0.000 | 1.000 | 0.023 | 0.000 | 0.999
여부항목여부 | 0.000 | 0.023 | 1.000 | 0.000 | 0.050
일자항목일자 | 0.000 | 0.000 | 0.000 | 1.000 | 0.008
처리직원번호 | 0.029 | 0.999 | 0.050 | 0.008 | 1.000
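
The near-perfect association between 상담기업개요ID and 처리직원번호 is what one would expect if every company ID is handled by a single employee. The sketch below computes the plain (uncorrected) Cramér's V for a hypothetical contingency table that matches the reported marginal counts. Only the first and last rows are confirmed by the sample rows; the three middle rows are assumptions. The report's own matrices may use a different or bias-corrected estimator, which could explain values such as 0.999.

```python
import math

# Hypothetical contingency table. Rows: 상담기업개요ID values.
# Columns: 처리직원번호 values in the order 3491, 3471, 3452, 94211.
table = [
    [113, 0, 0, 0],  # 9bLqoOA34p -> 3491 (seen in the sample rows)
    [0, 113, 0, 0],  # assumed mapping
    [0, 0, 113, 0],  # assumed mapping
    [0, 0, 0, 113],  # assumed mapping
    [0, 0, 0, 48],   # 9bLqhQ42Cv -> 94211 (seen in the sample rows)
]

def cramers_v(table):
    """Uncorrected Cramér's V computed from a contingency table of counts."""
    n = sum(map(sum, table))
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))

v = cramers_v(table)
print(f"{v:.3f}")
```

With a one-to-one or many-to-one mapping like this, the statistic reaches its maximum, so either column could be dropped with little loss of information.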

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

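First rows

Index | 상담기업개요ID | 상담신용정보항목코드 | 이력일련번호 | 여부항목여부 | 문자항목명 | 숫자항목값 | 일자항목일자 | 최종수정수 | 처리직원번호
0 | 9bLqoOA34p | CA001YN | 1 | N | <NA> | 0 | 0001-01-01 00:00:00.000000 | 1 | 3491
1 | 9bLqoOA34p | CH007DT | 1 | (blank) | <NA> | 0 | 00:00.0 | 1 | 3491
2 | 9bLqoOA34p | CH006DC | 1 | (blank) | <NA> | 74306903 | 0001-01-01 00:00:00.000000 | 1 | 3491
3 | 9bLqoOA34p | CH005DC | 1 | (blank) | <NA> | 1 | 0001-01-01 00:00:00.000000 | 1 | 3491
4 | 9bLqoOA34p | CH004DT | 1 | (blank) | <NA> | 0 | 0001-01-01 00:00:00.000000 | 1 | 3491
5 | 9bLqoOA34p | CH003DC | 1 | (blank) | <NA> | 0 | 0001-01-01 00:00:00.000000 | 1 | 3491
6 | 9bLqoOA34p | CH002DC | 1 | (blank) | <NA> | 0 | 0001-01-01 00:00:00.000000 | 1 | 3491
7 | 9bLqoOA34p | CH001DC | 1 | (blank) | <NA> | 0 | 0001-01-01 00:00:00.000000 | 1 | 3491
8 | 9bLqoOA34p | CG008DC | 1 | (blank) | <NA> | 0 | 0001-01-01 00:00:00.000000 | 1 | 3491
9 | 9bLqoOA34p | CG007DC | 1 | (blank) | <NA> | 0 | 0001-01-01 00:00:00.000000 | 1 | 3491

Last rows

Index | 상담기업개요ID | 상담신용정보항목코드 | 이력일련번호 | 여부항목여부 | 문자항목명 | 숫자항목값 | 일자항목일자 | 최종수정수 | 처리직원번호
490 | 9bLqhQ42Cv | CA012CH | 1 | (blank) | <NA> | 0 | 0001-01-01 00:00:00.000000 | 1 | 94211
491 | 9bLqhQ42Cv | CA011DT | 1 | (blank) | <NA> | 0 | 0001-01-01 00:00:00.000000 | 1 | 94211
492 | 9bLqhQ42Cv | CA010CH | 1 | (blank) | <NA> | 0 | 0001-01-01 00:00:00.000000 | 1 | 94211
493 | 9bLqhQ42Cv | CA009DT | 1 | (blank) | <NA> | 0 | 0001-01-01 00:00:00.000000 | 1 | 94211
494 | 9bLqhQ42Cv | CA007DC | 1 | (blank) | <NA> | 0 | 0001-01-01 00:00:00.000000 | 1 | 94211
495 | 9bLqhQ42Cv | CA006CH | 1 | (blank) | <NA> | 0 | 0001-01-01 00:00:00.000000 | 1 | 94211
496 | 9bLqhQ42Cv | CA005DT | 1 | (blank) | <NA> | 0 | 0001-01-01 00:00:00.000000 | 1 | 94211
497 | 9bLqhQ42Cv | CA004CH | 1 | (blank) | <NA> | 0 | 0001-01-01 00:00:00.000000 | 1 | 94211
498 | 9bLqhQ42Cv | CA003DT | 1 | (blank) | <NA> | 0 | 0001-01-01 00:00:00.000000 | 1 | 94211
499 | 9bLqhQ42Cv | CA002CH | 1 | (blank) | <NA> | 0 | 0001-01-01 00:00:00.000000 | 1 | 94211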