Overview

Dataset statistics

Number of variables: 7
Number of observations: 500
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 29.9 KiB
Average record size in memory: 61.3 B

Variable types

Text: 2
Categorical: 2
Numeric: 3

Dataset

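Description: This file contains information on the insurance consultation customer relationships of the Korea Credit Guarantee Fund (신용보증기금); please refer to it when using the data.
Author: 신용보증기금 (Korea Credit Guarantee Fund)
URL: https://www.data.go.kr/data/15092979/fileData.do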

Alerts

High correlation: 최종수정수 is highly overall correlated with 이력일련번호
High correlation: 처리직원번호 is highly overall correlated with 최초처리직원번호
High correlation: 최초처리직원번호 is highly overall correlated with 처리직원번호
High correlation: 이력일련번호 is highly overall correlated with 최종수정수
Imbalance: 이력일련번호 is highly imbalanced (52.8%)
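
The 52.8% imbalance figure for 이력일련번호 can be reproduced from its value counts (384, 76, 26, 10, 4) if imbalance is taken to be one minus the normalized Shannon entropy of the class shares. That definition is an assumption here, since the report does not state its formula, and the function name `imbalance_score` is purely illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class shares.

    Assumed definition (not stated in the report): 0 means perfectly
    balanced classes, values near 1 mean one class dominates.
    """
    total = sum(counts)
    shares = [c / total for c in counts if c > 0]
    if len(shares) < 2:
        return 0.0  # a single class cannot be normalized; treated as 0 in this sketch
    entropy = -sum(p * math.log2(p) for p in shares)
    return 1.0 - entropy / math.log2(len(shares))


# Value counts of 이력일련번호 as listed in the report (values 1, 3, 5, 7, 9).
score = imbalance_score([384, 76, 26, 10, 4])
print(f"{score:.1%}")
```

Under this assumed definition the score rounds to the reported 52.8%, and a perfectly even split such as the 250/250 split of the role code scores 0.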

Reproduction

Analysis started: 2023-12-12 15:51:42.706949
Analysis finished: 2023-12-12 15:51:44.204135
Duration: 1.5 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
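
A report of this kind can be regenerated along the following lines. This is a minimal sketch, assuming the dataset has been downloaded as a CSV file; the function name `build_report`, the file paths and the report title are hypothetical, and the original `config.json` settings are not reproduced.

```python
from pathlib import Path


def build_report(csv_path: str, out_path: str = "report.html") -> Path:
    """Profile a CSV file and write the result as an HTML report (sketch)."""
    # Imported lazily so this module still loads where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The file encoding may need adjusting for the actual download.
    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Overview")
    profile.to_file(out_path)
    return Path(out_path)
```

Calling `build_report("data.csv")` with a real path writes `report.html` next to the script; timings and memory figures will differ from the run recorded above.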

Variables

고객아이디(ID) (customer ID)
Text

Distinct: 324
Distinct (%): 64.8%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 5000
Distinct characters: 62
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
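
As a rough illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using the first two sample values of this column. The helper name `category_counts` is illustrative. The standard library exposes general categories only, so scripts and blocks are not covered here.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character of every value."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


# First two sample values of the customer ID column.
sample = ["9dnMMcHjZv", "9dnMMcH1t5"]
counts = category_counts(sample)

# 'Ll' = Lowercase Letter, 'Lu' = Uppercase Letter, 'Nd' = Decimal Number
for code, n in counts.most_common():
    print(code, n)
```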

Unique

Unique: 218
Unique (%): 43.6%
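
The report gives both a Distinct and a Unique count for this column. A minimal sketch of the difference, assuming Distinct means the number of different values and Unique means the number of values that occur exactly once (the report does not spell out either definition); `distinct_and_unique` is an illustrative name.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


# The five sample rows shown for this column: two IDs, each repeated.
first_rows = ["9dnMMcHjZv", "9dnMMcH1t5", "9dnMMcHjZv", "9dnMMcH1t5", "9dnMMcHjZv"]
result = distinct_and_unique(first_rows)
print(result)
```

On these five rows there are two distinct values and no unique ones, because both IDs repeat.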

Sample

1st row: 9dnMMcHjZv
2nd row: 9dnMMcH1t5
3rd row: 9dnMMcHjZv
4th row: 9dnMMcH1t5
5th row: 9dnMMcHjZv

Value               Count  Frequency (%)
9bbnkrsvx6          8      1.6%
9bbnkrsdil          8      1.6%
9c1bsuvowc          5      1.0%
9dihqbtpoi          5      1.0%
9c1bsuwjgn          5      1.0%
9defbtkkeh          5      1.0%
9defbtj3nc          5      1.0%
9dihqbu7y1          5      1.0%
9cxjf47dao          5      1.0%
9cxjf47puf          5      1.0%
Other values (314)  444    88.8%

Most occurring characters

Value              Count  Frequency (%)
9                  491    9.8%
a                  404    8.1%
c                  293    5.9%
d                  210    4.2%
b                  186    3.7%
S                  93     1.9%
m                  83     1.7%
G                  83     1.7%
k                  80     1.6%
H                  77     1.5%
Other values (52)  3000   60.0%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  2429   48.6%
Uppercase Letter  1567   31.3%
Decimal Number    1004   20.1%

Most frequent character per category

Lowercase Letter

Value              Count  Frequency (%)
a                  404    16.6%
c                  293    12.1%
d                  210    8.6%
b                  186    7.7%
m                  83     3.4%
k                  80     3.3%
f                  76     3.1%
j                  73     3.0%
n                  71     2.9%
v                  71     2.9%
Other values (16)  882    36.3%

Uppercase Letter

Value              Count  Frequency (%)
S                  93     5.9%
G                  83     5.3%
H                  77     4.9%
U                  77     4.9%
M                  71     4.5%
N                  71     4.5%
C                  70     4.5%
V                  64     4.1%
B                  62     4.0%
R                  62     4.0%
Other values (16)  837    53.4%

Decimal Number

Value  Count  Frequency (%)
9      491    48.9%
7      66     6.6%
1      65     6.5%
6      65     6.5%
0      59     5.9%
2      59     5.9%
5      59     5.9%
4      55     5.5%
3      43     4.3%
8      42     4.2%

Most occurring scripts

Value   Count  Frequency (%)
Latin   3996   79.9%
Common  1004   20.1%

Most frequent character per script

Latin

Value              Count  Frequency (%)
a                  404    10.1%
c                  293    7.3%
d                  210    5.3%
b                  186    4.7%
S                  93     2.3%
m                  83     2.1%
G                  83     2.1%
k                  80     2.0%
H                  77     1.9%
U                  77     1.9%
Other values (42)  2410   60.3%

Common

Value  Count  Frequency (%)
9      491    48.9%
7      66     6.6%
1      65     6.5%
6      65     6.5%
0      59     5.9%
2      59     5.9%
5      59     5.9%
4      55     5.5%
3      43     4.3%
8      42     4.2%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  5000   100.0%

Most frequent character per block

ASCII

Value              Count  Frequency (%)
9                  491    9.8%
a                  404    8.1%
c                  293    5.9%
d                  210    4.2%
b                  186    3.7%
S                  93     1.9%
m                  83     1.7%
G                  83     1.7%
k                  80     1.6%
H                  77     1.5%
Other values (52)  3000   60.0%

고객상담역할관계코드 (customer consultation role relationship code)
Categorical

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Value  Count
1      250
3      250

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 3
3rd row: 1
4th row: 3
5th row: 1

Common Values

Value  Count  Frequency (%)
1      250    50.0%
3      250    50.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

상담아이디(ID) (consultation ID)
Text

Distinct: 192
Distinct (%): 38.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 5000
Distinct characters: 62
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 9dnS0sGKIU
2nd row: 9dnS0sGKIU
3rd row: 9dnS0pEiOe
4th row: 9dnS0pEiOe
5th row: 9dnS0lO4OS

Value               Count  Frequency (%)
9dm0kl3fuk          10     2.0%
9dnlyc5cqi          8      1.6%
9dnsgzgxmd          8      1.6%
9dnsoaufq4          8      1.6%
9dnst9gnmm          6      1.2%
9dnoefjsna          6      1.2%
9dnswh90lb          6      1.2%
9dnsjb2hkb          6      1.2%
9dnmzrzqmr          6      1.2%
9dnsyjacep          6      1.2%
Other values (182)  430    86.0%

Most occurring characters

Value              Count  Frequency (%)
9                  552    11.0%
n                  532    10.6%
d                  532    10.6%
S                  288    5.8%
O                  248    5.0%
L                  88     1.8%
N                  86     1.7%
M                  84     1.7%
m                  80     1.6%
U                  74     1.5%
Other values (52)  2436   48.7%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  2224   44.5%
Uppercase Letter  1844   36.9%
Decimal Number    932    18.6%

Most frequent character per category

Lowercase Letter

Value              Count  Frequency (%)
n                  532    23.9%
d                  532    23.9%
m                  80     3.6%
a                  70     3.1%
q                  70     3.1%
k                  60     2.7%
u                  58     2.6%
z                  56     2.5%
r                  54     2.4%
c                  50     2.2%
Other values (16)  662    29.8%

Uppercase Letter

Value              Count  Frequency (%)
S                  288    15.6%
O                  248    13.4%
L                  88     4.8%
N                  86     4.7%
M                  84     4.6%
U                  74     4.0%
F                  68     3.7%
K                  68     3.7%
W                  60     3.3%
Y                  60     3.3%
Other values (16)  720    39.0%

Decimal Number

Value  Count  Frequency (%)
9      552    59.2%
0      52     5.6%
8      48     5.2%
4      48     5.2%
5      46     4.9%
3      46     4.9%
2      44     4.7%
6      40     4.3%
1      36     3.9%
7      20     2.1%

Most occurring scripts

Value   Count  Frequency (%)
Latin   4068   81.4%
Common  932    18.6%

Most frequent character per script

Latin

Value              Count  Frequency (%)
n                  532    13.1%
d                  532    13.1%
S                  288    7.1%
O                  248    6.1%
L                  88     2.2%
N                  86     2.1%
M                  84     2.1%
m                  80     2.0%
U                  74     1.8%
a                  70     1.7%
Other values (42)  1986   48.8%

Common

Value  Count  Frequency (%)
9      552    59.2%
0      52     5.6%
8      48     5.2%
4      48     5.2%
5      46     4.9%
3      46     4.9%
2      44     4.7%
6      40     4.3%
1      36     3.9%
7      20     2.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  5000   100.0%

Most frequent character per block

ASCII

Value              Count  Frequency (%)
9                  552    11.0%
n                  532    10.6%
d                  532    10.6%
S                  288    5.8%
O                  248    5.0%
L                  88     1.8%
N                  86     1.7%
M                  84     1.7%
m                  80     1.6%
U                  74     1.5%
Other values (52)  2436   48.7%

이력일련번호 (history serial number)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Value  Count
1      384
3      76
5      26
7      10
9      4

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      384    76.8%
3      76     15.2%
5      26     5.2%
7      10     2.0%
9      4      0.8%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

최종수정수 (final modification count)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.988
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 1
Q3: 2
95th percentile: 5
Maximum: 9
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.5902559
Coefficient of variation (CV): 0.79992752
Kurtosis: 4.4395363
Mean: 1.988
Median Absolute Deviation (MAD): 0
Skewness: 2.0669953
Sum: 994
Variance: 2.5289138
Monotonicity: Not monotonic
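
Several of these figures can be checked directly from the value counts the report lists for 최종수정수. The sketch below rebuilds the 500 observations from those counts and recomputes the sum, mean, variance, standard deviation and coefficient of variation with Python's standard `statistics` module. It assumes the report uses the sample variance (n - 1 denominator) and defines CV as standard deviation divided by mean; the reported figures are consistent with both assumptions.

```python
import statistics

# Value counts for 최종수정수 as listed in the report (value: count).
value_counts = {1: 290, 2: 86, 3: 56, 4: 26, 5: 18, 6: 10, 7: 6, 8: 4, 9: 4}

# Expand the counts back into the 500 individual observations.
values = [value for value, count in value_counts.items() for _ in range(count)]

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std_dev = statistics.stdev(values)
cv = std_dev / mean                     # coefficient of variation

print(total, mean, round(variance, 7), round(std_dev, 7), round(cv, 8))
```

Skewness and kurtosis are left out because the report does not say which bias correction it applies.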
[Figure: histogram with fixed size bins (bins=9)]

Common values

Value  Count  Frequency (%)
1      290    58.0%
2      86     17.2%
3      56     11.2%
4      26     5.2%
5      18     3.6%
6      10     2.0%
7      6      1.2%
9      4      0.8%
8      4      0.8%

Minimum 10 values

Value  Count  Frequency (%)
1      290    58.0%
2      86     17.2%
3      56     11.2%
4      26     5.2%
5      18     3.6%
6      10     2.0%
7      6      1.2%
8      4      0.8%
9      4      0.8%

Maximum 10 values

Value  Count  Frequency (%)
9      4      0.8%
8      4      0.8%
7      6      1.2%
6      10     2.0%
5      18     3.6%
4      26     5.2%
3      56     11.2%
2      86     17.2%
1      290    58.0%

처리직원번호 (processing employee number)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 109
Distinct (%): 21.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5223.8
Minimum: 3290
Maximum: 6185
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 3290
5th percentile: 3620
Q1: 4913
Median: 5408.5
Q3: 5757
95th percentile: 6115
Maximum: 6185
Range: 2895
Interquartile range (IQR): 844

Descriptive statistics

Standard deviation: 718.29043
Coefficient of variation (CV): 0.13750343
Kurtosis: 0.23659812
Mean: 5223.8
Median Absolute Deviation (MAD): 407.5
Skewness: -0.94903762
Sum: 2611900
Variance: 515941.14
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
3620               28     5.6%
5173               26     5.2%
5495               24     4.8%
5076               16     3.2%
5406               12     2.4%
4621               12     2.4%
5040               12     2.4%
5684               10     2.0%
5001               10     2.0%
4913               10     2.0%
Other values (99)  340    68.0%

Minimum 10 values

Value  Count  Frequency (%)
3290   2      0.4%
3447   2      0.4%
3548   2      0.4%
3555   2      0.4%
3590   4      0.8%
3613   8      1.6%
3620   28     5.6%
3860   2      0.4%
3977   2      0.4%
4053   2      0.4%

Maximum 10 values

Value  Count  Frequency (%)
6185   2      0.4%
6175   4      0.8%
6147   6      1.2%
6139   4      0.8%
6129   6      1.2%
6121   2      0.4%
6115   6      1.2%
6098   2      0.4%
6092   2      0.4%
6078   10     2.0%

최초처리직원번호 (initial processing employee number)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 109
Distinct (%): 21.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5223.8
Minimum: 3290
Maximum: 6185
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 3290
5th percentile: 3620
Q1: 4913
Median: 5408.5
Q3: 5757
95th percentile: 6115
Maximum: 6185
Range: 2895
Interquartile range (IQR): 844

Descriptive statistics

Standard deviation: 718.29043
Coefficient of variation (CV): 0.13750343
Kurtosis: 0.23659812
Mean: 5223.8
Median Absolute Deviation (MAD): 407.5
Skewness: -0.94903762
Sum: 2611900
Variance: 515941.14
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
3620               28     5.6%
5173               26     5.2%
5495               24     4.8%
5076               16     3.2%
5406               12     2.4%
4621               12     2.4%
5040               12     2.4%
5684               10     2.0%
5001               10     2.0%
4913               10     2.0%
Other values (99)  340    68.0%

Minimum 10 values

Value  Count  Frequency (%)
3290   2      0.4%
3447   2      0.4%
3548   2      0.4%
3555   2      0.4%
3590   4      0.8%
3613   8      1.6%
3620   28     5.6%
3860   2      0.4%
3977   2      0.4%
4053   2      0.4%

Maximum 10 values

Value  Count  Frequency (%)
6185   2      0.4%
6175   4      0.8%
6147   6      1.2%
6139   4      0.8%
6129   6      1.2%
6121   2      0.4%
6115   6      1.2%
6098   2      0.4%
6092   2      0.4%
6078   10     2.0%

Interactions

[Figures: 9 pairwise interaction plots]

Correlations

Variable  고객상담역할관계코드  이력일련번호  최종수정수  처리직원번호  최초처리직원번호
고객상담역할관계코드  1.000  0.000  0.000  0.000  0.000
이력일련번호  0.000  1.000  0.985  0.310  0.310
최종수정수  0.000  0.985  1.000  0.297  0.297
처리직원번호  0.000  0.310  0.297  1.000  1.000
최초처리직원번호  0.000  0.310  0.297  1.000  1.000

Variable  고객상담역할관계코드  이력일련번호
고객상담역할관계코드  1.000  0.000
이력일련번호  0.000  1.000

Variable  최종수정수  처리직원번호  최초처리직원번호  고객상담역할관계코드  이력일련번호
최종수정수  1.000  0.108  0.108  0.000  0.978
처리직원번호  0.108  1.000  1.000  0.000  0.134
최초처리직원번호  0.108  1.000  1.000  0.000  0.134
고객상담역할관계코드  0.000  0.000  0.000  1.000  0.000
이력일련번호  0.978  0.134  0.134  0.000  1.000
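
The high-correlation alerts near the top of the report can be related to the first matrix in this section with a simple threshold scan. The sketch below is illustrative only: the cutoff of 0.5 and the function name `highly_correlated_pairs` are assumptions, since the report does not state the threshold it applies.

```python
# Variable names and coefficients copied from the first correlation matrix.
names = ["고객상담역할관계코드", "이력일련번호", "최종수정수", "처리직원번호", "최초처리직원번호"]
matrix = [
    [1.000, 0.000, 0.000, 0.000, 0.000],
    [0.000, 1.000, 0.985, 0.310, 0.310],
    [0.000, 0.985, 1.000, 0.297, 0.297],
    [0.000, 0.310, 0.297, 1.000, 1.000],
    [0.000, 0.310, 0.297, 1.000, 1.000],
]


def highly_correlated_pairs(names, matrix, threshold=0.5):
    """List each pair of distinct variables whose |coefficient| meets the threshold."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):  # upper triangle only, skips the diagonal
            if abs(matrix[i][j]) >= threshold:
                pairs.append((names[i], names[j], matrix[i][j]))
    return pairs


pairs = highly_correlated_pairs(names, matrix)
for left, right, coefficient in pairs:
    print(left, right, coefficient)
```

With the assumed cutoff, the scan returns the same two pairs that the alerts name: 이력일련번호 with 최종수정수, and 처리직원번호 with 최초처리직원번호.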

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Row  고객아이디(ID)  고객상담역할관계코드  상담아이디(ID)  이력일련번호  최종수정수  처리직원번호  최초처리직원번호
0  9dnMMcHjZv  1  9dnS0sGKIU  1  1  5617  5617
1  9dnMMcH1t5  3  9dnS0sGKIU  1  1  5617  5617
2  9dnMMcHjZv  1  9dnS0pEiOe  1  1  5617  5617
3  9dnMMcH1t5  3  9dnS0pEiOe  1  1  5617  5617
4  9dnMMcHjZv  1  9dnS0lO4OS  1  1  5617  5617
5  9dnMMcH1t5  3  9dnS0lO4OS  1  1  5617  5617
6  9dmUffYYxf  1  9dnSZP2WXb  1  1  5763  5763
7  9dmUffZf7u  3  9dnSZP2WXb  1  1  5763  5763
8  9dlZByAXhp  1  9dnSZGak7j  1  1  5076  5076
9  9dlZByBuRc  3  9dnSZGak7j  1  1  5076  5076

Last rows

Row  고객아이디(ID)  고객상담역할관계코드  상담아이디(ID)  이력일련번호  최종수정수  처리직원번호  최초처리직원번호
490  aaaaadCTQz  1  9dnN85ELH9  1  1  5625  5625
491  aaaaab9NGz  3  9dnN85ELH9  1  1  5625  5625
492  9cCSwaUNSc  1  9dnN7HkbJs  1  1  5409  5409
493  9cCSwaVcDb  3  9dnN7HkbJs  1  1  5409  5409
494  9cULwNVSmv  1  9dnN7DZ3Lx  1  1  4668  4668
495  9cULwNV5fI  3  9dnN7DZ3Lx  1  1  4668  4668
496  9deB0cakhn  1  9dnN7Aucra  1  1  6038  6038
497  9deB0cb39C  3  9dnN7Aucra  1  1  6038  6038
498  9coZBDREjn  1  9dnN7pMvib  1  1  6147  6147
499  9coZBDRX2M  3  9dnN7pMvib  1  1  6147  6147