Overview

Dataset statistics

Number of variables: 6
Number of observations: 500
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 26.0 KiB
Average record size in memory: 53.3 B

Variable types

Numeric: 3
Categorical: 2
Boolean: 1

Dataset

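Description: This file contains the common electronic document form program information of the Korea Credit Guarantee Fund (신용보증기금); please refer to it when using the data.
Author: Korea Credit Guarantee Fund (신용보증기금)
URL: https://www.data.go.kr/data/15093169/fileData.do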

Alerts

컬럼상위레벨값 has constant value "" (Constant)
전자문서서식프로그램 is highly overall correlated with 최종수정수 and 2 other fields (High correlation)
최종수정수 is highly overall correlated with 전자문서서식프로그램 and 1 other fields (High correlation)
처리직원번호 is highly overall correlated with 전자문서서식프로그램 and 1 other fields (High correlation)
컬럼레벨값 is highly overall correlated with 전자문서서식프로그램 (High correlation)
컬럼레벨값 is highly imbalanced (63.4%) (Imbalance)
삭제여부 is highly imbalanced (59.1%) (Imbalance)
전자문서서식프로그램 has unique values (Unique)
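
The imbalance percentages in the alerts above are consistent with a score of one minus the normalized Shannon entropy of the class counts. The sketch below checks that reading in plain Python. The function name `imbalance_score` and the single-class convention are assumptions made for illustration, not the profiler's documented API; the counts are taken from the value tables of 컬럼레벨값 and 삭제여부 in this report.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits) of the counts."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumed convention: a single class has no imbalance score
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

level_score = imbalance_score([465, 35])    # 컬럼레벨값: "1" -> 465, "0" -> 35
deleted_score = imbalance_score([459, 41])  # 삭제여부: False -> 459, True -> 41
print(f"{level_score:.1%}, {deleted_score:.1%}")
```

Under this assumption a perfectly balanced column scores 0% and a column dominated by one class approaches 100%.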

Reproduction

Analysis started: 2023-12-12 12:53:55.910586
Analysis finished: 2023-12-12 12:53:57.181419
Duration: 1.27 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
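
As a quick consistency check, the reported duration can be recomputed from the two timestamps above. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2023-12-12 12:53:55.910586", FMT)
finished = datetime.strptime("2023-12-12 12:53:57.181419", FMT)

# Elapsed wall-clock time between the start and the end of the analysis.
duration_seconds = (finished - started).total_seconds()
print(round(duration_seconds, 2))
```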

Variables

전자문서서식프로그램
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 500
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 651.222
Minimum: 379
Maximum: 908
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 379
5th percentile: 425.95
Q1: 525.75
Median: 650.5
Q3: 775.25
95th percentile: 883.05
Maximum: 908
Range: 529
Interquartile range (IQR): 249.5

Descriptive statistics

Standard deviation: 145.83184
Coefficient of variation (CV): 0.22393568
Kurtosis: -1.1715046
Mean: 651.222
Median Absolute Deviation (MAD): 125
Skewness: 0.021354357
Sum: 325611
Variance: 21266.927
Monotonicity: Not monotonic
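
Several of the figures reported for 전자문서서식프로그램 are linked by simple identities: CV is the standard deviation divided by the mean, the variance is the standard deviation squared, the range and IQR are differences of quantiles, and the sum is the mean times the number of observations. The sketch below re-derives them from the reported values. It is a sanity check on the printed numbers, not a recomputation from the raw data, and the variable names are illustrative.

```python
import math

n = 500
mean, std, variance = 651.222, 145.83184, 21266.927
minimum, maximum, q1, q3 = 379, 908, 525.75, 775.25

cv = std / mean                   # coefficient of variation
value_range = maximum - minimum   # range
iqr = q3 - q1                     # interquartile range
total = mean * n                  # sum of all observations

# The reported variance should equal std squared up to rounding.
variance_matches = math.isclose(std ** 2, variance, rel_tol=1e-6)
print(round(cv, 6), value_range, iqr, round(total), variance_matches)
```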
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
867                 1      0.2%
572                 1      0.2%
559                 1      0.2%
560                 1      0.2%
561                 1      0.2%
562                 1      0.2%
553                 1      0.2%
564                 1      0.2%
565                 1      0.2%
566                 1      0.2%
Other values (490)  490    98.0%

Minimum values
Value  Count  Frequency (%)
379    1      0.2%
393    1      0.2%
403    1      0.2%
404    1      0.2%
405    1      0.2%
406    1      0.2%
407    1      0.2%
408    1      0.2%
409    1      0.2%
410    1      0.2%

Maximum values
Value  Count  Frequency (%)
908    1      0.2%
907    1      0.2%
906    1      0.2%
905    1      0.2%
904    1      0.2%
903    1      0.2%
902    1      0.2%
901    1      0.2%
900    1      0.2%
899    1      0.2%

컬럼레벨값
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      465    93.0%
0      35     7.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
1      465    93.0%
0      35     7.0%

컬럼상위레벨값
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      500    100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0      500    100.0%

삭제여부
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 632.0 B

Value  Count  Frequency (%)
False  459    91.8%
True   41     8.2%

최종수정수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 13
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.788
Minimum: 1
Maximum: 13
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 2
Median: 2
Q3: 8
95th percentile: 12
Maximum: 13
Range: 12
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 4.1022815
Coefficient of variation (CV): 0.85678394
Kurtosis: -0.93677587
Mean: 4.788
Median Absolute Deviation (MAD): 1
Skewness: 0.84152022
Sum: 2394
Variance: 16.828713
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)
Common values
Value             Count  Frequency (%)
2                 175    35.0%
1                 89     17.8%
12                86     17.2%
8                 49     9.8%
3                 32     6.4%
4                 31     6.2%
10                15     3.0%
6                 12     2.4%
5                 5      1.0%
13                2      0.4%
Other values (3)  4      0.8%

Minimum values
Value  Count  Frequency (%)
1      89     17.8%
2      175    35.0%
3      32     6.4%
4      31     6.2%
5      5      1.0%
6      12     2.4%
7      1      0.2%
8      49     9.8%
9      1      0.2%
10     15     3.0%

Maximum values
Value  Count  Frequency (%)
13     2      0.4%
12     86     17.2%
11     2      0.4%
10     15     3.0%
9      1      0.2%
8      49     9.8%
7      1      0.2%
6      12     2.4%
5      5      1.0%
4      31     6.2%
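
Because 최종수정수 has only 13 distinct values, its quantile statistics can be reconstructed from the frequency tables alone. The sketch below expands the counts into 500 sorted observations and applies a linear-interpolation quantile rule (position `q * (n - 1)`, the same rule NumPy uses by default). That this is the rule the profiler applies is an assumption; the helper name `quantile` is illustrative.

```python
from itertools import chain, repeat

# Counts per value, copied from the 최종수정수 frequency tables.
counts = {1: 89, 2: 175, 3: 32, 4: 31, 5: 5, 6: 12, 7: 1,
          8: 49, 9: 1, 10: 15, 11: 2, 12: 86, 13: 2}

# Expand the table into the sorted list of individual observations.
values = sorted(chain.from_iterable(repeat(v, c) for v, c in counts.items()))

def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two nearest order statistics."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])

summary = {q: quantile(values, q) for q in (0.05, 0.25, 0.5, 0.75, 0.95)}
print(len(values), sum(values), summary)
```

The expanded list has 500 entries summing to 2394, and the five quantiles come out as 1, 2, 2, 8, and 12, matching the statistics reported for this variable.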

처리직원번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 6
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5002.588
Minimum: 4917
Maximum: 5536
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 4917
5th percentile: 4917
Q1: 4917
Median: 4917
Q3: 4917
95th percentile: 5536
Maximum: 5536
Range: 619
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 183.91641
Coefficient of variation (CV): 0.036764252
Kurtosis: 3.4295915
Mean: 5002.588
Median Absolute Deviation (MAD): 0
Skewness: 2.1714267
Sum: 2501294
Variance: 33825.245
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Common values
Value  Count  Frequency (%)
4917   385    77.0%
5536   44     8.8%
5093   41     8.2%
5222   18     3.6%
5176   11     2.2%
4920   1      0.2%

Minimum values
Value  Count  Frequency (%)
4917   385    77.0%
4920   1      0.2%
5093   41     8.2%
5176   11     2.2%
5222   18     3.6%
5536   44     8.8%

Maximum values
Value  Count  Frequency (%)
5536   44     8.8%
5222   18     3.6%
5176   11     2.2%
5093   41     8.2%
4920   1      0.2%
4917   385    77.0%
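
The six values of 처리직원번호 and their counts are enough to recompute the sum, mean, variance, and standard deviation reported above. The sketch below does so with the standard library. Using the sample (n - 1) denominator is an assumption about how the report computes variance, though it reproduces the printed figures; the variable names are illustrative.

```python
import statistics

# Counts per value, copied from the 처리직원번호 frequency tables.
counts = {4917: 385, 4920: 1, 5093: 41, 5176: 11, 5222: 18, 5536: 44}

# Expand the table into individual observations.
values = [v for v, c in counts.items() for _ in range(c)]

total = sum(values)
mean = statistics.fmean(values)
sample_variance = statistics.variance(values)  # n - 1 denominator
sample_std = statistics.stdev(values)          # square root of the sample variance
print(total, round(mean, 3), round(sample_variance, 3), round(sample_std, 5))
```

With 77% of the rows sharing the value 4917, the median and both quartiles collapse onto that value, which is why the IQR and MAD are both 0 even though the standard deviation is about 184.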

Interactions

[Figures: nine interaction plots, not reproduced]

Correlations

Variable  전자문서서식프로그램  컬럼레벨값  삭제여부  최종수정수  처리직원번호
전자문서서식프로그램  1.000  0.697  0.465  0.897  0.607
컬럼레벨값  0.697  1.000  0.080  0.204  0.635
삭제여부  0.465  0.080  1.000  0.449  0.000
최종수정수  0.897  0.204  0.449  1.000  0.125
처리직원번호  0.607  0.635  0.000  0.125  1.000

Variable  삭제여부  컬럼레벨값
삭제여부  1.000  0.051
컬럼레벨값  0.051  1.000

Variable  전자문서서식프로그램  최종수정수  처리직원번호  컬럼레벨값  삭제여부
전자문서서식프로그램  1.000  -0.910  0.702  0.539  0.345
최종수정수  -0.910  1.000  -0.599  0.155  0.345
처리직원번호  0.702  -0.599  1.000  0.055  0.105
컬럼레벨값  0.539  0.155  0.055  1.000  0.051
삭제여부  0.345  0.345  0.105  0.051  1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows
Index  전자문서서식프로그램  컬럼레벨값  컬럼상위레벨값  삭제여부  최종수정수  처리직원번호
0      867  1  0  N  3  5536
1      812  1  0  N  2  5536
2      811  1  0  N  2  5536
3      908  1  0  N  2  5536
4      907  1  0  N  1  5536
5      906  1  0  N  3  5536
6      903  1  0  N  3  5536
7      905  1  0  N  3  5536
8      904  1  0  N  3  5536
9      902  1  0  N  1  5536

Last rows
Index  전자문서서식프로그램  컬럼레벨값  컬럼상위레벨값  삭제여부  최종수정수  처리직원번호
490    412  1  0  N  12  4917
491    411  1  0  N  12  4917
492    410  1  0  N  12  4917
493    409  1  0  N  12  4917
494    408  1  0  N  12  4917
495    407  1  0  N  12  4917
496    406  1  0  N  12  4917
497    405  1  0  N  12  4917
498    404  1  0  N  12  4917
499    393  1  0  N  12  4917