Overview

Dataset statistics

Number of variables: 7
Number of observations: 408
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 43
Duplicate rows (%): 10.5%
Total size in memory: 23.6 KiB
Average record size in memory: 59.3 B

Variable types

Categorical: 4
Boolean: 1
Numeric: 2

Dataset

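Description: This file contains quality audit (품질감리) information from 신용보증기금 (Korea Credit Guarantee Fund). Please refer to it when using the data.
Author: 신용보증기금 (Korea Credit Guarantee Fund)
URL: https://www.data.go.kr/data/15093007/fileData.do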

Alerts

Dataset has 43 (10.5%) duplicate rows [Duplicates]
삭제여부 is highly overall correlated with 최종수정수 and 4 other fields [High correlation]
최초처리시각 is highly overall correlated with 처리직원번호 and 3 other fields [High correlation]
감리전자결재상태코드 is highly overall correlated with 처리직원번호 and 3 other fields [High correlation]
최초처리직원번호 is highly overall correlated with 처리직원번호 and 3 other fields [High correlation]
최종수정수 is highly overall correlated with 삭제여부 [High correlation]
처리직원번호 is highly overall correlated with 감리전자결재상태코드 and 3 other fields [High correlation]
감리전자결재상태코드 is highly imbalanced (88.2%) [Imbalance]
삭제여부 is highly imbalanced (84.7%) [Imbalance]
최초처리시각 is highly imbalanced (86.9%) [Imbalance]
최초처리직원번호 is highly imbalanced (86.4%) [Imbalance]
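
The imbalance percentages above are consistent with one minus the normalized Shannon entropy of each column's value counts. The sketch below rests on that assumption and is not copied from the ydata-profiling source. `imbalance_score` is a hypothetical helper name, and the counts are taken from the value tables in this report.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k): H is the Shannon entropy of the counts,
    k the number of distinct values (0 = balanced, close to 1 = one value dominates)."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: treat a constant column as 0 in this sketch
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)

# Counts taken from this report.
deleted_flag = [399, 9]            # 삭제여부: False, True
approval_status = [395, 10, 2, 1]  # 감리전자결재상태코드

print(f"{imbalance_score(deleted_flag):.1%}")     # 84.7%
print(f"{imbalance_score(approval_status):.1%}")  # 88.2%
```

A perfectly balanced column scores 0, so values near 90% indicate that one category holds almost all rows.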

Reproduction

Analysis started: 2023-12-12 23:09:56.138770
Analysis finished: 2023-12-12 23:09:57.136327
Duration: 1 second
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

감리구분코드
Categorical

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB
2: 275
1: 133

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value | Count | Frequency (%)
2 | 275 | 67.4%
1 | 133 | 32.6%
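
Each Frequency (%) figure is the value's count divided by the 408 observations. As a minimal sketch of that arithmetic (the helper name `frequency_table` is hypothetical, and the column is rebuilt from the counts shown here rather than read from the source file):

```python
from collections import Counter

def frequency_table(values):
    """Return (value, count, percent) rows sorted by descending count."""
    counts = Counter(values)
    total = sum(counts.values())
    return [(v, n, round(100 * n / total, 1)) for v, n in counts.most_common()]

# Rebuild the column from the counts in this report: 275 twos and 133 ones.
column = ["2"] * 275 + ["1"] * 133
for value, count, pct in frequency_table(column):
    print(value, count, f"{pct}%")
```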

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
Value | Count | Frequency (%)
2 | 275 | 67.4%
1 | 133 | 32.6%

감리전자결재상태코드
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB
23: 395
13: 10
12: 2
(blank): 1

Length

Max length: 2
Median length: 2
Mean length: 1.997549
Min length: 1

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: 23
2nd row: 23
3rd row: 23
4th row: 23
5th row: 23
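
The mean length follows from the value counts: 407 rows hold a two-character code and one row holds a one-character value. A small sketch of that weighted mean, where `mean_length` is a hypothetical helper and the length split is inferred from the counts in this report:

```python
def mean_length(length_counts):
    """length_counts maps string length -> number of values with that length."""
    total = sum(length_counts.values())
    return sum(length * n for length, n in length_counts.items()) / total

# 감리전자결재상태코드: 407 two-character codes and 1 one-character value.
print(round(mean_length({2: 407, 1: 1}), 6))  # 1.997549
```

The same calculation matches the other categorical columns in this report, for example 383 values of length 26 and 25 of length 7 for 최초처리시각.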

Common Values

Value | Count | Frequency (%)
23 | 395 | 96.8%
13 | 10 | 2.5%
12 | 2 | 0.5%
(blank) | 1 | 0.2%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
Value | Count | Frequency (%)
23 | 395 | 97.1%
13 | 10 | 2.5%
12 | 2 | 0.5%

삭제여부
Boolean

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 540.0 B
False: 399
True: 9
Value | Count | Frequency (%)
False | 399 | 97.8%
True | 9 | 2.2%

최종수정수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 25
Distinct (%): 6.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.5980392
Minimum: 1
Maximum: 39
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 5
median: 6
Q3: 8.25
95-th percentile: 17
Maximum: 39
Range: 38
Interquartile range (IQR): 3.25

Descriptive statistics

Standard deviation: 4.4853237
Coefficient of variation (CV): 0.59032648
Kurtosis: 8.7535103
Mean: 7.5980392
Median Absolute Deviation (MAD): 1
Skewness: 2.4292669
Sum: 3100
Variance: 20.118129
Monotonicity: Not monotonic
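
Several of these figures are derived from the others: the mean is the sum over the 408 observations, the CV is the standard deviation over the mean, the variance is the squared standard deviation, and the range and IQR are simple differences. A quick consistency check as a sketch (`derived_stats` is a hypothetical helper; the inputs are the values reported for 최종수정수):

```python
def derived_stats(n, total, std, q1, q3, minimum, maximum):
    """Recompute the statistics that follow directly from other reported values."""
    mean = total / n
    return {
        "mean": mean,
        "cv": std / mean,        # coefficient of variation
        "variance": std ** 2,
        "iqr": q3 - q1,
        "range": maximum - minimum,
    }

stats = derived_stats(n=408, total=3100, std=4.4853237,
                      q1=5, q3=8.25, minimum=1, maximum=39)
for name, value in stats.items():
    print(f"{name}: {value:.7g}")
```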
[Figure: Histogram with fixed size bins (bins=25)]

Most frequent values:
Value | Count | Frequency (%)
5 | 156 | 38.2%
6 | 58 | 14.2%
7 | 35 | 8.6%
8 | 32 | 7.8%
9 | 19 | 4.7%
10 | 13 | 3.2%
11 | 12 | 2.9%
13 | 12 | 2.9%
4 | 12 | 2.9%
2 | 10 | 2.5%
Other values (15) | 49 | 12.0%

10 smallest values:
Value | Count | Frequency (%)
1 | 3 | 0.7%
2 | 10 | 2.5%
4 | 12 | 2.9%
5 | 156 | 38.2%
6 | 58 | 14.2%
7 | 35 | 8.6%
8 | 32 | 7.8%
9 | 19 | 4.7%
10 | 13 | 3.2%
11 | 12 | 2.9%

10 largest values:
Value | Count | Frequency (%)
39 | 1 | 0.2%
30 | 1 | 0.2%
28 | 1 | 0.2%
24 | 1 | 0.2%
22 | 1 | 0.2%
21 | 1 | 0.2%
20 | 4 | 1.0%
19 | 4 | 1.0%
18 | 6 | 1.5%
17 | 6 | 1.5%

처리직원번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 16
Distinct (%): 3.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3869.277
Minimum: 2969
Maximum: 5470
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 KiB

Quantile statistics

Minimum: 2969
5-th percentile: 3559
Q1: 3559
median: 3559
Q3: 4138
95-th percentile: 4597
Maximum: 5470
Range: 2501
Interquartile range (IQR): 579

Descriptive statistics

Standard deviation: 436.73655
Coefficient of variation (CV): 0.11287291
Kurtosis: -0.29197203
Mean: 3869.277
Median Absolute Deviation (MAD): 0
Skewness: 0.96243005
Sum: 1578665
Variance: 190738.81
Monotonicity: Not monotonic
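
A MAD of 0 is expected here: more than half of the rows (249 of 408) share the median value 3559, so the median absolute deviation collapses to zero even though the standard deviation is large. The sketch below rebuilds the column from the frequency tables in this report and checks it. `median_abs_deviation` is a hypothetical helper that computes the unscaled MAD.

```python
from statistics import median

# Value -> count for 처리직원번호, pieced together from the frequency tables in this report.
counts = {
    3559: 249, 4509: 72, 4042: 50, 4597: 15, 4875: 5,
    4129: 3, 3513: 3, 4293: 2, 5470: 2, 4964: 1,
    2969: 1, 4165: 1, 4172: 1, 4436: 1, 4606: 1, 4632: 1,
}
values = [v for v, n in counts.items() for _ in range(n)]

def median_abs_deviation(data):
    """Median of absolute deviations from the median (unscaled)."""
    med = median(data)
    return median(abs(x - med) for x in data)

print(len(values), sum(values))      # 408 1578665
print(median(values))                # 3559.0
print(median_abs_deviation(values))  # 0.0
```

The rebuilt column reproduces the reported observation count and Sum, which suggests the frequency tables were read correctly.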
[Figure: Histogram with fixed size bins (bins=16)]

Most frequent values:
Value | Count | Frequency (%)
3559 | 249 | 61.0%
4509 | 72 | 17.6%
4042 | 50 | 12.3%
4597 | 15 | 3.7%
4875 | 5 | 1.2%
4129 | 3 | 0.7%
3513 | 3 | 0.7%
4293 | 2 | 0.5%
5470 | 2 | 0.5%
4964 | 1 | 0.2%
Other values (6) | 6 | 1.5%

10 smallest values:
Value | Count | Frequency (%)
2969 | 1 | 0.2%
3513 | 3 | 0.7%
3559 | 249 | 61.0%
4042 | 50 | 12.3%
4129 | 3 | 0.7%
4165 | 1 | 0.2%
4172 | 1 | 0.2%
4293 | 2 | 0.5%
4436 | 1 | 0.2%
4509 | 72 | 17.6%

10 largest values:
Value | Count | Frequency (%)
5470 | 2 | 0.5%
4964 | 1 | 0.2%
4875 | 5 | 1.2%
4632 | 1 | 0.2%
4606 | 1 | 0.2%
4597 | 15 | 3.7%
4509 | 72 | 17.6%
4436 | 1 | 0.2%
4293 | 2 | 0.5%
4172 | 1 | 0.2%

최초처리시각
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 26
Distinct (%): 6.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB
0001-01-01 00:00:00.000000: 383
32:00.2: 1
07:27.9: 1
14:53.8: 1
38:57.6: 1
Other values (21): 21

Length

Max length: 26
Median length: 26
Mean length: 24.835784
Min length: 7

Unique

Unique: 25
Unique (%): 6.1%

Sample

1st row: 32:00.2
2nd row: 19:09.9
3rd row: 07:27.9
4th row: 14:53.8
5th row: 38:57.6

Common Values

Value | Count | Frequency (%)
0001-01-01 00:00:00.000000 | 383 | 93.9%
32:00.2 | 1 | 0.2%
07:27.9 | 1 | 0.2%
14:53.8 | 1 | 0.2%
38:57.6 | 1 | 0.2%
06:29.2 | 1 | 0.2%
34:41.9 | 1 | 0.2%
44:37.0 | 1 | 0.2%
04:26.4 | 1 | 0.2%
14:38.6 | 1 | 0.2%
Other values (16) | 16 | 3.9%
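
The column mixes two shapes of value: the placeholder timestamp `0001-01-01 00:00:00.000000` (383 rows) and 7-character strings such as `32:00.2` that look like minutes and seconds with the date and hour missing. Before treating the column as a datetime, it may help to separate the two. The sketch below is one possible way to do that. The category names and the `classify_timestamp` helper are hypothetical, and reading the short strings as `mm:ss.s` is an assumption.

```python
import re

SENTINEL = "0001-01-01 00:00:00.000000"       # placeholder value seen in this report
SHORT_TIME = re.compile(r"\d{2}:\d{2}\.\d")   # assumed mm:ss.s shape, e.g. 32:00.2

def classify_timestamp(value):
    """Label a raw 최초처리시각 string as 'sentinel', 'short_time', or 'other'."""
    if value == SENTINEL:
        return "sentinel"
    if SHORT_TIME.fullmatch(value):
        return "short_time"
    return "other"

for raw in ["0001-01-01 00:00:00.000000", "32:00.2", "2023-12-12 23:09:56"]:
    print(raw, "->", classify_timestamp(raw))
```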

Length

[Figure: Histogram of lengths of the category]

Token frequencies (values split on whitespace):
Value | Count | Frequency (%)
0001-01-01 | 383 | 48.4%
00:00:00.000000 | 383 | 48.4%
12:26.9 | 1 | 0.1%
19:09.9 | 1 | 0.1%
23:02.9 | 1 | 0.1%
17:12.7 | 1 | 0.1%
54:24.4 | 1 | 0.1%
39:24.9 | 1 | 0.1%
40:53.3 | 1 | 0.1%
00:12.1 | 1 | 0.1%
Other values (17) | 17 | 2.1%

최초처리직원번호
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 17
Distinct (%): 4.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB
BATCH: 383
4875: 6
5470: 3
4168: 2
4444: 2
Other values (12): 12

Length

Max length: 5
Median length: 5
Mean length: 4.9387255
Min length: 4

Unique

Unique: 12
Unique (%): 2.9%

Sample

1st row: 6105
2nd row: 5921
3rd row: 5873
4th row: 4432
5th row: 5107

Common Values

Value | Count | Frequency (%)
BATCH | 383 | 93.9%
4875 | 6 | 1.5%
5470 | 3 | 0.7%
4168 | 2 | 0.5%
4444 | 2 | 0.5%
5921 | 1 | 0.2%
5873 | 1 | 0.2%
4432 | 1 | 0.2%
5107 | 1 | 0.2%
5573 | 1 | 0.2%
Other values (7) | 7 | 1.7%

Length

[Figure: Histogram of lengths of the category]

Token frequencies (values lowercased):
Value | Count | Frequency (%)
batch | 383 | 93.9%
4875 | 6 | 1.5%
5470 | 3 | 0.7%
4168 | 2 | 0.5%
4444 | 2 | 0.5%
4964 | 1 | 0.2%
6105 | 1 | 0.2%
4064 | 1 | 0.2%
4436 | 1 | 0.2%
5314 | 1 | 0.2%
Other values (7) | 7 | 1.7%

Interactions

[Figures: 4 interaction plots]

Correlations

[Figure: correlation heatmap]

Variable | 감리구분코드 | 감리전자결재상태코드 | 삭제여부 | 최종수정수 | 처리직원번호 | 최초처리시각 | 최초처리직원번호
감리구분코드 | 1.000 | 0.000 | 0.020 | 0.134 | 0.334 | 0.000 | 0.000
감리전자결재상태코드 | 0.000 | 1.000 | 0.968 | 0.554 | 0.744 | 0.987 | 0.970
삭제여부 | 0.020 | 0.968 | 1.000 | 0.573 | 0.970 | 1.000 | 0.929
최종수정수 | 0.134 | 0.554 | 0.573 | 1.000 | 0.491 | 0.683 | 0.583
처리직원번호 | 0.334 | 0.744 | 0.970 | 0.491 | 1.000 | 0.899 | 0.849
최초처리시각 | 0.000 | 0.987 | 1.000 | 0.683 | 0.899 | 1.000 | 1.000
최초처리직원번호 | 0.000 | 0.970 | 0.929 | 0.583 | 0.849 | 1.000 | 1.000

[Figure: correlation heatmap]

Variable | 삭제여부 | 최초처리시각 | 감리전자결재상태코드 | 감리구분코드 | 최초처리직원번호
삭제여부 | 1.000 | 0.970 | 0.836 | 0.012 | 0.890
최초처리시각 | 0.970 | 1.000 | 0.918 | 0.000 | 0.988
감리전자결재상태코드 | 0.836 | 0.918 | 1.000 | 0.000 | 0.902
감리구분코드 | 0.012 | 0.000 | 0.000 | 1.000 | 0.000
최초처리직원번호 | 0.890 | 0.988 | 0.902 | 0.000 | 1.000

[Figure: correlation heatmap]

Variable | 최종수정수 | 처리직원번호 | 감리구분코드 | 감리전자결재상태코드 | 삭제여부 | 최초처리시각 | 최초처리직원번호
최종수정수 | 1.000 | 0.003 | 0.130 | 0.387 | 0.572 | 0.320 | 0.268
처리직원번호 | 0.003 | 1.000 | 0.244 | 0.577 | 0.839 | 0.597 | 0.544
감리구분코드 | 0.130 | 0.244 | 1.000 | 0.000 | 0.012 | 0.000 | 0.000
감리전자결재상태코드 | 0.387 | 0.577 | 0.000 | 1.000 | 0.836 | 0.918 | 0.902
삭제여부 | 0.572 | 0.839 | 0.012 | 0.836 | 1.000 | 0.970 | 0.890
최초처리시각 | 0.320 | 0.597 | 0.000 | 0.918 | 0.970 | 1.000 | 0.988
최초처리직원번호 | 0.268 | 0.544 | 0.000 | 0.902 | 0.890 | 0.988 | 1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

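First rows:
Index | 감리구분코드 | 감리전자결재상태코드 | 삭제여부 | 최종수정수 | 처리직원번호 | 최초처리시각 | 최초처리직원번호
0 | 2 | 23 | N | 18 | 4597 | 32:00.2 | 6105
1 | 2 | 23 | N | 12 | 4597 | 19:09.9 | 5921
2 | 2 | 23 | N | 10 | 4597 | 07:27.9 | 5873
3 | 2 | 23 | N | 13 | 4597 | 14:53.8 | 4432
4 | 2 | 23 | N | 8 | 4597 | 38:57.6 | 5107
5 | 2 | 23 | N | 13 | 4597 | 06:29.2 | 5573
6 | 2 | 23 | N | 18 | 4597 | 34:41.9 | 5470
7 | 2 | 23 | N | 18 | 4597 | 44:37.0 | 4444
8 | 2 | 12 | N | 1 | 4964 | 04:26.4 | 4964
9 | 2 | 23 | N | 9 | 4597 | 14:38.6 | 4875

Last rows:
Index | 감리구분코드 | 감리전자결재상태코드 | 삭제여부 | 최종수정수 | 처리직원번호 | 최초처리시각 | 최초처리직원번호
398 | 1 | 23 | N | 5 | 3559 | 0001-01-01 00:00:00.000000 | BATCH
399 | 2 | 23 | N | 8 | 3559 | 0001-01-01 00:00:00.000000 | BATCH
400 | 1 | 23 | N | 5 | 3559 | 0001-01-01 00:00:00.000000 | BATCH
401 | 2 | 23 | N | 5 | 4509 | 0001-01-01 00:00:00.000000 | BATCH
402 | 2 | 23 | N | 5 | 3559 | 0001-01-01 00:00:00.000000 | BATCH
403 | 1 | 23 | N | 5 | 4509 | 0001-01-01 00:00:00.000000 | BATCH
404 | 2 | 23 | N | 6 | 3559 | 0001-01-01 00:00:00.000000 | BATCH
405 | 1 | 23 | N | 7 | 3559 | 0001-01-01 00:00:00.000000 | BATCH
406 | 2 | 23 | N | 15 | 3559 | 0001-01-01 00:00:00.000000 | BATCH
407 | 2 | 23 | N | 5 | 3559 | 0001-01-01 00:00:00.000000 | BATCH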

Duplicate rows

Most frequently occurring

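Index | 감리구분코드 | 감리전자결재상태코드 | 삭제여부 | 최종수정수 | 처리직원번호 | 최초처리시각 | 최초처리직원번호 | # duplicates
17 | 2 | 23 | N | 5 | 3559 | 0001-01-01 00:00:00.000000 | BATCH | 60
0 | 1 | 23 | N | 5 | 3559 | 0001-01-01 00:00:00.000000 | BATCH | 42
19 | 2 | 23 | N | 5 | 4509 | 0001-01-01 00:00:00.000000 | BATCH | 24
2 | 1 | 23 | N | 6 | 3559 | 0001-01-01 00:00:00.000000 | BATCH | 19
20 | 2 | 23 | N | 6 | 3559 | 0001-01-01 00:00:00.000000 | BATCH | 19
18 | 2 | 23 | N | 5 | 4042 | 0001-01-01 00:00:00.000000 | BATCH | 17
23 | 2 | 23 | N | 7 | 3559 | 0001-01-01 00:00:00.000000 | BATCH | 16
1 | 1 | 23 | N | 5 | 4509 | 0001-01-01 00:00:00.000000 | BATCH | 13
25 | 2 | 23 | N | 8 | 3559 | 0001-01-01 00:00:00.000000 | BATCH | 13
6 | 1 | 23 | N | 8 | 3559 | 0001-01-01 00:00:00.000000 | BATCH | 9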
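
The ten groups listed above already account for 232 rows, far more than the 43 reported under "Duplicate rows". That suggests the 43 counts distinct row patterns that occur more than once rather than the rows themselves. This is an inference from the numbers in this report, not a statement about the tool's internals. The sketch below shows the different ways of counting on toy data. `duplicate_summary` and the sample rows are hypothetical.

```python
from collections import Counter

def duplicate_summary(rows):
    """Count duplicates three ways for a list of row sequences."""
    counts = Counter(tuple(r) for r in rows)
    repeated = {row: n for row, n in counts.items() if n > 1}
    return {
        "distinct_repeated_patterns": len(repeated),         # patterns seen more than once
        "rows_in_repeated_patterns": sum(repeated.values()),  # every row belonging to such a pattern
        "redundant_rows": sum(n - 1 for n in repeated.values()),  # rows beyond the first copy
    }

# Toy rows shaped like (감리구분코드, 감리전자결재상태코드, 최종수정수, 처리직원번호).
rows = [
    ("2", "23", 5, 3559), ("2", "23", 5, 3559), ("2", "23", 5, 3559),
    ("1", "23", 5, 3559), ("1", "23", 5, 3559),
    ("2", "12", 1, 4964),
]
print(duplicate_summary(rows))
```

On the toy data this reports 2 repeated patterns covering 5 rows, 3 of which are redundant copies.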