Overview

Dataset statistics

Number of variables: 10
Number of observations: 500
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 41.1 KiB
Average record size in memory: 84.3 B

Variable types

Text: 2
Numeric: 2
Categorical: 4
Boolean: 2

Dataset

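Description: This file contains the common general branch-relationship information (공통일반부점관계정보) of the Korea Credit Guarantee Fund; please refer to it when using the data.
Author: Korea Credit Guarantee Fund (신용보증기금)
URL: https://www.data.go.kr/data/15093125/fileData.do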

Alerts

부점관계코드 is highly overall correlated with 성과평가여부 and 1 other field (High correlation)
성과평가여부 is highly overall correlated with 부점관계코드 and 1 other field (High correlation)
회계처리여부 is highly overall correlated with 부점관계코드 and 1 other field (High correlation)
이력일련번호 is highly imbalanced (84.1%) (Imbalance)
팀코드 is highly imbalanced (67.2%) (Imbalance)
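
The two imbalance percentages can be checked against the value counts reported further down. The sketch below assumes the score is one minus the normalized Shannon entropy of the class shares, 1 - H / log2(k); that formula is an assumption chosen because it reproduces the alert values, not something this report states. The function and variable names are illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (in bits) of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no defined imbalance under this formula
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Value counts copied from the Common Values tables of this report.
history_seq_counts = [481, 17, 2]               # 이력일련번호
team_code_counts = [429, 24, 15, 13, 10, 7, 2]  # 팀코드

history_pct = 100 * imbalance_score(history_seq_counts)
team_pct = 100 * imbalance_score(team_code_counts)
print(f"{history_pct:.1f}% {team_pct:.1f}%")
```

Under that assumption both scores land within a tenth of a percentage point of the 84.1% and 67.2% shown in the alerts. A score near 100% means one class dominates; a score near 0% means the classes are evenly spread.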

Reproduction

Analysis started: 2023-12-12 11:57:17.995144
Analysis finished: 2023-12-12 11:57:19.612928
Duration: 1.62 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

모점부점코드
Text

Distinct: 153
Distinct (%): 30.6%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 1500
Distinct characters: 24
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 104
Unique (%): 20.8%

Sample

1st row: QAG
2nd row: TDA
3rd row: QAF
4th row: TBR
5th row: QAR

Value               Count  Frequency (%)
taa                 30     6.0%
tid                 27     5.4%
tab                 26     5.2%
jma                 24     4.8%
the                 21     4.2%
jpa                 16     3.2%
tpb                 15     3.0%
tmd                 15     3.0%
jac                 12     2.4%
tbc                 12     2.4%
Other values (143)  302    60.4%

Most occurring characters

Value              Count  Frequency (%)
A                  286    19.1%
T                  215    14.3%
J                  190    12.7%
B                  106    7.1%
H                  99     6.6%
D                  76     5.1%
M                  63     4.2%
N                  60     4.0%
E                  56     3.7%
I                  56     3.7%
Other values (14)  293    19.5%

Most occurring categories

Value             Count  Frequency (%)
Uppercase Letter  1500   100.0%

Most occurring scripts

Value  Count  Frequency (%)
Latin  1500   100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1500   100.0%

Every character belongs to a single category (Uppercase Letter), script (Latin), and block (ASCII), so the most frequent characters per category, per script, and per block are identical to the Most occurring characters table.

부점관계코드
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.716
Minimum: 3
Maximum: 13
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 3
5-th percentile: 4
Q1: 5
Median: 7
Q3: 11
95-th percentile: 11
Maximum: 13
Range: 10
Interquartile range (IQR): 6
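
These quantiles can be reproduced from the value counts reported for 부점관계코드. The sketch below rebuilds the 500-row column from those counts and uses linear interpolation between order statistics; whether the profiler uses exactly this interpolation rule is an assumption, and the variable names are illustrative.

```python
import statistics

# Value -> count, copied from the frequency table of 부점관계코드 in this report.
value_counts = {3: 3, 4: 32, 5: 130, 7: 147, 9: 18, 10: 18, 11: 133, 12: 10, 13: 9}

# Expand the counts back into a sorted column of 500 observations.
column = sorted(v for v, c in value_counts.items() for _ in range(c))

# method="inclusive" interpolates linearly between order statistics,
# treating the data as the whole population rather than a sample.
q1, median, q3 = statistics.quantiles(column, n=4, method="inclusive")
twentiles = statistics.quantiles(column, n=20, method="inclusive")
p5, p95 = twentiles[0], twentiles[-1]

iqr = q3 - q1
value_range = max(column) - min(column)
print(p5, q1, median, q3, p95, iqr, value_range)
```

Because the column holds only nine distinct values, each quantile falls on a long run of identical values, which is why Q3 and the 95-th percentile are both 11.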

Descriptive statistics

Standard deviation: 2.6618631
Coefficient of variation (CV): 0.34497966
Kurtosis: -1.3580806
Mean: 7.716
Median Absolute Deviation (MAD): 2
Skewness: 0.26262968
Sum: 3858
Variance: 7.085515
Monotonicity: Not monotonic
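
As a worked check, the sum, mean, variance, standard deviation, CV, and MAD can be recomputed from the value counts reported for 부점관계코드. The sketch assumes the variance uses the sample (n - 1) denominator, which is the choice that matches the reported 7.085515; the variable names are illustrative.

```python
import math
import statistics

# Value -> count, copied from the frequency table of 부점관계코드 in this report.
value_counts = {3: 3, 4: 32, 5: 130, 7: 147, 9: 18, 10: 18, 11: 133, 12: 10, 13: 9}
column = [v for v, c in value_counts.items() for _ in range(c)]

n = len(column)
total = sum(column)
mean = total / n
variance = statistics.variance(column)   # sample variance, divides by n - 1
std = math.sqrt(variance)
cv = std / mean                          # coefficient of variation
median = statistics.median(column)
# Median absolute deviation: median distance of each value from the median.
mad = statistics.median(abs(x - median) for x in column)

print(total, mean, round(variance, 6), round(std, 7), round(cv, 8), mad)
```

Dividing by n instead of n - 1 would give a variance of about 7.0713, so the reported figure is consistent with the sample estimator.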
Histogram with fixed size bins (bins=9)

Values by frequency

Value  Count  Frequency (%)
7      147    29.4%
11     133    26.6%
5      130    26.0%
4      32     6.4%
10     18     3.6%
9      18     3.6%
12     10     2.0%
13     9      1.8%
3      3      0.6%

Values in ascending order

Value  Count  Frequency (%)
3      3      0.6%
4      32     6.4%
5      130    26.0%
7      147    29.4%
9      18     3.6%
10     18     3.6%
11     133    26.6%
12     10     2.0%
13     9      1.8%

Values in descending order

Value  Count  Frequency (%)
13     9      1.8%
12     10     2.0%
11     133    26.6%
10     18     3.6%
9      18     3.6%
7      147    29.4%
5      130    26.0%
4      32     6.4%
3      3      0.6%

관계부점코드
Text

Distinct: 230
Distinct (%): 46.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 1500
Distinct characters: 25
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 72
Unique (%): 14.4%

Sample

1st row: QAG
2nd row: QAG
3rd row: QAF
4th row: TBR
5th row: QAR

Value               Count  Frequency (%)
tpk                 5      1.0%
jra                 5      1.0%
jja                 5      1.0%
jml                 5      1.0%
jlb                 5      1.0%
jqh                 5      1.0%
jla                 5      1.0%
jjd                 5      1.0%
joi                 5      1.0%
jph                 5      1.0%
Other values (220)  450    90.0%

Most occurring characters

Value              Count  Frequency (%)
T                  241    16.1%
A                  158    10.5%
N                  118    7.9%
H                  115    7.7%
J                  115    7.7%
B                  89     5.9%
V                  78     5.2%
I                  75     5.0%
P                  63     4.2%
D                  55     3.7%
Other values (15)  393    26.2%

Most occurring categories

Value             Count  Frequency (%)
Uppercase Letter  1500   100.0%

Most occurring scripts

Value  Count  Frequency (%)
Latin  1500   100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1500   100.0%

Every character belongs to a single category (Uppercase Letter), script (Latin), and block (ASCII), so the most frequent characters per category, per script, and per block are identical to the Most occurring characters table.

이력일련번호
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      481    96.2%
2      17     3.4%
3      2      0.4%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the Common Values table]

팀코드
Categorical

IMBALANCE 

Distinct: 7
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: (blank)
2nd row: (blank)
3rd row: (blank)
4th row: (blank)
5th row: (blank)

Common Values

Value    Count  Frequency (%)
(blank)  429    85.8%
1        24     4.8%
3        15     3.0%
2        13     2.6%
4        10     2.0%
5        7      1.4%
6        2      0.4%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the Common Values table]

Value counts excluding blank entries (frequencies are shares of the 71 non-blank rows):

Value  Count  Frequency (%)
1      24     33.8%
3      15     21.1%
2      13     18.3%
4      10     14.1%
5      7      9.9%
6      2      2.8%

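
The two 팀코드 frequency tables list the same counts but different percentages. A minimal sketch of the arithmetic, assuming the second table simply drops the 429 blank rows from the denominator (the dictionary and variable names are illustrative):

```python
# Counts copied from the Common Values table of 팀코드; "" stands for the blank value.
team_code_counts = {"": 429, "1": 24, "3": 15, "2": 13, "4": 10, "5": 7, "6": 2}

all_rows = sum(team_code_counts.values())
non_blank_rows = all_rows - team_code_counts[""]

# Share of every value among all 500 rows (first table).
pct_all = {k: round(100 * c / all_rows, 1) for k, c in team_code_counts.items()}

# Share of every non-blank value among the non-blank rows only (second table).
pct_non_blank = {
    k: round(100 * c / non_blank_rows, 1)
    for k, c in team_code_counts.items()
    if k != ""
}

print(non_blank_rows, pct_all["1"], pct_non_blank["1"])
```

With 71 non-blank rows, value 1 accounts for 4.8% of all rows but 33.8% of the rows that carry a team code.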

부점지역구분코드
Categorical

Distinct: 12
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 2
Median length: 2
Mean length: 1.954
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: T1
2nd row: T1
3rd row: T1
4th row: C1
5th row: A2

Common Values

Value             Count  Frequency (%)
A1                77     15.4%
A2                74     14.8%
F1                59     11.8%
G1                56     11.2%
D1                48     9.6%
C1                44     8.8%
E1                40     8.0%
H1                39     7.8%
B1                25     5.0%
(blank)           23     4.6%
Other values (2)  15     3.0%

Length

[Figure: Histogram of lengths of the category]

Value counts excluding blank entries (values shown lowercased; frequencies are shares of the 477 non-blank rows):

Value  Count  Frequency (%)
a1     77     16.1%
a2     74     15.5%
f1     59     12.4%
g1     56     11.7%
d1     48     10.1%
c1     44     9.2%
e1     40     8.4%
h1     39     8.2%
b1     25     5.2%
t1     12     2.5%

성과평가여부
Boolean

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 632.0 B

Value  Count  Frequency (%)
True   342    68.4%
False  158    31.6%

회계처리여부
Boolean

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 632.0 B

Value  Count  Frequency (%)
True   256    51.2%
False  244    48.8%

최종수정수
Categorical

Distinct: 3
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      393    78.6%
2      98     19.6%
3      9      1.8%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the Common Values table]

처리직원번호
Real number (ℝ)

Distinct: 11
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4897.76
Minimum: 3682
Maximum: 5803
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 3682
5-th percentile: 4169
Q1: 4800
Median: 5099
Q3: 5099
95-th percentile: 5099
Maximum: 5803
Range: 2121
Interquartile range (IQR): 299

Descriptive statistics

Standard deviation: 309.49729
Coefficient of variation (CV): 0.0631916
Kurtosis: 1.1452889
Mean: 4897.76
Median Absolute Deviation (MAD): 0
Skewness: -1.2729847
Sum: 2448880
Variance: 95788.572
Monotonicity: Not monotonic
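
A MAD of 0 next to a standard deviation of about 309 can look contradictory. The sketch below rebuilds 처리직원번호 from the value counts in this report and shows that the MAD collapses to zero because more than half of the rows equal the median. Linear interpolation for the quartiles is an assumption, and the variable names are illustrative.

```python
import statistics

# Value -> count, combined from the frequency tables of 처리직원번호 in this report.
value_counts = {
    3682: 1, 4062: 2, 4169: 43, 4172: 1, 4451: 32, 4800: 120,
    4917: 4, 4925: 9, 5099: 280, 5314: 6, 5803: 2,
}
column = sorted(v for v, c in value_counts.items() for _ in range(c))

median = statistics.median(column)
# Median absolute deviation: median distance of each value from the median.
mad = statistics.median(abs(x - median) for x in column)
# Fraction of rows that sit exactly on the median value.
share_at_median = column.count(median) / len(column)

q1, _, q3 = statistics.quantiles(column, n=4, method="inclusive")
iqr = q3 - q1
print(median, mad, share_at_median, iqr, sum(column))
```

Since 56% of the rows carry the value 5099, at least half of the absolute deviations are zero, so their median is zero as well. The same concentration explains why the median, Q3, and the 95-th percentile coincide.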
Histogram with fixed size bins (bins=11)

Values by frequency

Value  Count  Frequency (%)
5099   280    56.0%
4800   120    24.0%
4169   43     8.6%
4451   32     6.4%
4925   9      1.8%
5314   6      1.2%
4917   4      0.8%
5803   2      0.4%
4062   2      0.4%
4172   1      0.2%

Values in ascending order

Value  Count  Frequency (%)
3682   1      0.2%
4062   2      0.4%
4169   43     8.6%
4172   1      0.2%
4451   32     6.4%
4800   120    24.0%
4917   4      0.8%
4925   9      1.8%
5099   280    56.0%
5314   6      1.2%

Values in descending order

Value  Count  Frequency (%)
5803   2      0.4%
5314   6      1.2%
5099   280    56.0%
4925   9      1.8%
4917   4      0.8%
4800   120    24.0%
4451   32     6.4%
4172   1      0.2%
4169   43     8.6%
4062   2      0.4%

Interactions

[Figures: four interaction plots]

Correlations

                  부점관계코드  이력일련번호  팀코드  부점지역구분코드  성과평가여부  회계처리여부  최종수정수  처리직원번호
부점관계코드      1.000  0.475  0.641  0.609  0.986  0.939  0.536  0.475
이력일련번호      0.475  1.000  0.264  0.728  0.075  0.000  0.285  0.000
팀코드            0.641  0.264  1.000  0.278  0.239  0.280  0.425  0.209
부점지역구분코드  0.609  0.728  0.278  1.000  0.065  0.230  0.622  0.618
성과평가여부      0.986  0.075  0.239  0.065  1.000  0.885  0.074  0.270
회계처리여부      0.939  0.000  0.280  0.230  0.885  1.000  0.028  0.345
최종수정수        0.536  0.285  0.425  0.622  0.074  0.028  1.000  0.507
처리직원번호      0.475  0.000  0.209  0.618  0.270  0.345  0.507  1.000

                  팀코드  회계처리여부  성과평가여부  최종수정수  이력일련번호  부점지역구분코드
팀코드            1.000  0.298  0.254  0.315  0.182  0.138
회계처리여부      0.298  1.000  0.691  0.047  0.000  0.177
성과평가여부      0.254  0.691  1.000  0.123  0.125  0.050
최종수정수        0.315  0.047  0.123  1.000  0.094  0.349
이력일련번호      0.182  0.000  0.125  0.094  1.000  0.443
부점지역구분코드  0.138  0.177  0.050  0.349  0.443  1.000

                  부점관계코드  처리직원번호  이력일련번호  팀코드  부점지역구분코드  성과평가여부  회계처리여부  최종수정수
부점관계코드      1.000  0.331  0.342  0.397  0.305  0.891  0.770  0.393
처리직원번호      0.331  1.000  0.000  0.059  0.331  0.286  0.367  0.391
이력일련번호      0.342  0.000  1.000  0.182  0.443  0.125  0.000  0.094
팀코드            0.397  0.059  0.182  1.000  0.138  0.254  0.298  0.315
부점지역구분코드  0.305  0.331  0.443  0.138  1.000  0.050  0.177  0.349
성과평가여부      0.891  0.286  0.125  0.254  0.050  1.000  0.691  0.123
회계처리여부      0.770  0.367  0.000  0.298  0.177  0.691  1.000  0.047
최종수정수        0.393  0.391  0.094  0.315  0.349  0.123  0.047  1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

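First rows

(index)  모점부점코드  부점관계코드  관계부점코드  이력일련번호  팀코드  부점지역구분코드  성과평가여부  회계처리여부  최종수정수  처리직원번호
0    QAG  5   QAG  1  (blank)  T1       Y  Y  1  5803
1    TDA  7   QAG  1  (blank)  T1       Y  Y  1  5803
2    QAF  5   QAF  1  (blank)  T1       Y  Y  1  5314
3    TBR  5   TBR  1  (blank)  C1       Y  Y  1  5314
4    QAR  5   QAR  1  (blank)  A2       Y  Y  1  5314
5    TDA  7   QAF  1  (blank)  T1       Y  Y  1  5314
6    TBC  7   TBR  1  (blank)  C1       N  N  1  5314
7    TAB  7   QAR  1  (blank)  A2       N  N  1  5314
8    EME  13  TID  1  (blank)  (blank)  Y  Y  1  5099
9    ECE  13  TID  1  (blank)  (blank)  Y  Y  2  5099

Last rows

(index)  모점부점코드  부점관계코드  관계부점코드  이력일련번호  팀코드  부점지역구분코드  성과평가여부  회계처리여부  최종수정수  처리직원번호
490  THO  5   THO  1  (blank)  H1       Y  Y  2  4451
491  THN  5   THN  1  (blank)  H1       Y  Y  2  4451
492  THK  5   THK  1  (blank)  H1       Y  Y  2  4451
493  THG  5   THG  1  (blank)  G1       Y  Y  2  4451
494  THE  7   VHQ  1  (blank)  H1       N  N  2  4451
495  THE  7   VHP  1  (blank)  H1       N  N  2  4451
496  THE  7   VHO  1  (blank)  H1       N  N  2  4451
497  THE  7   TIA  1  (blank)  H1       N  N  2  4451
498  THE  7   THT  1  (blank)  H1       N  N  2  4451
499  THE  7   THO  1  (blank)  H1       N  N  2  4451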