Overview

Dataset statistics

Number of variables: 5
Number of observations: 31
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.4 KiB
Average record size in memory: 45.3 B
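
As a quick consistency check, the average record size multiplied by the number of observations should land near the reported total size. The sketch below only redoes that arithmetic with the figures listed above; the variable names are illustrative and are not part of the report.

```python
# Figures copied from the dataset statistics above.
n_observations = 31
avg_record_bytes = 45.3

# Total size implied by the average record size.
total_bytes = n_observations * avg_record_bytes
total_kib = total_bytes / 1024  # 1 KiB = 1024 bytes

print(f"{total_bytes:.1f} B is about {total_kib:.1f} KiB")
```

The result rounds to the 1.4 KiB shown in the report.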

Variable types

Categorical: 4
Text: 1

Dataset

Description: Sample data
Author: 한국신용데이터 (Korea Credit Data)
URL: https://bigdata-region.kr/#/dataset/dd9db859-5339-4cb0-9c22-011e030db735

Alerts

코로나직전대비율 has constant value "110.25" (Constant)
유형명 is highly overall correlated with 기준년월 and 1 other field, 업종명 (High correlation)
기준년월 is highly overall correlated with 유형명 (High correlation)
업종명 is highly overall correlated with 유형명 (High correlation)
기준년월 is highly imbalanced (79.4%) (Imbalance)
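
The 79.4% imbalance figure for 기준년월 is consistent with an entropy-based score: one minus the Shannon entropy of the class frequencies divided by the maximum possible entropy for that number of classes. The sketch below assumes that definition (the report itself does not state the formula) and uses the 30/1 split between Jan-22 and Feb-22; `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (base 2)
    of the class frequencies and k is the number of classes.
    0.0 means perfectly balanced; values near 1.0 mean highly imbalanced."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no balance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Value counts for 기준년월: Jan-22 appears 30 times, Feb-22 once.
score = imbalance_score([30, 1])
print(f"{score:.1%}")
```

Under this assumed definition the score for a 30/1 split rounds to 79.4%, matching the alert.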

Reproduction

Analysis started: 2023-12-10 14:24:14.091490
Analysis finished: 2023-12-10 14:24:14.410065
Duration: 0.32 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

유형명 (type name)
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 12.9%
Missing: 0
Missing (%): 0.0%
Memory size: 380.0 B

Length

Max length: 5
Median length: 2
Mean length: 2.5806452
Min length: 2

Unique

Unique: 1
Unique (%): 3.2%

Sample

1st row: 통합
2nd row: 업종
3rd row: 업종
4th row: 업종
5th row: 업종

Common Values

Value | Count | Frequency (%)
지역 | 17 | 54.8%
업종 | 7 | 22.6%
지역X업종 | 6 | 19.4%
통합 | 1 | 3.2%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
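
The summary figures for 유형명 can be recomputed from its value counts alone. The sketch below is a minimal stdlib illustration, not the profiler's own code: it rebuilds the 31 values from the Common Values table and derives the distinct count, the unique count (values occurring exactly once), the length statistics, and the frequencies. The `summary` dictionary and its keys are illustrative names.

```python
from collections import Counter
from statistics import mean, median

# Value counts copied from the Common Values table for 유형명.
counts = Counter({"지역": 17, "업종": 7, "지역X업종": 6, "통합": 1})

values = list(counts.elements())      # one entry per observation
n = len(values)                       # number of observations
lengths = [len(v) for v in values]    # string length of each value

summary = {
    "distinct": len(counts),
    "distinct_pct": round(100 * len(counts) / n, 1),
    # "Unique" counts the values that occur exactly once.
    "unique": sum(1 for c in counts.values() if c == 1),
    "max_length": max(lengths),
    "median_length": median(lengths),
    "mean_length": round(mean(lengths), 7),
    "min_length": min(lengths),
    "frequency_pct": {v: round(100 * c / n, 1) for v, c in counts.items()},
}

for key, value in summary.items():
    print(key, value)
```

The same approach applies to the other categorical variables in this report, since each lists its full value counts.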

기준년월 (reference year-month)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 6.5%
Missing: 0
Missing (%): 0.0%
Memory size: 380.0 B

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 1
Unique (%): 3.2%

Sample

1st row: Feb-22
2nd row: Jan-22
3rd row: Jan-22
4th row: Jan-22
5th row: Jan-22

Common Values

Value | Count | Frequency (%)
Jan-22 | 30 | 96.8%
Feb-22 | 1 | 3.2%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

업종명 (industry name)
Categorical

HIGH CORRELATION

Distinct: 8
Distinct (%): 25.8%
Missing: 0
Missing (%): 0.0%
Memory size: 380.0 B

Length

Max length: 5
Median length: 2
Mean length: 2.6129032
Min length: 2

Unique

Unique: 5
Unique (%): 16.1%

Sample

1st row: 전체
2nd row: 서비스업
3rd row: 건설업
4th row: 외식업
5th row: 제조업

Common Values

Value | Count | Frequency (%)
전체 | 18 | 58.1%
유통업 | 5 | 16.1%
서비스업 | 3 | 9.7%
건설업 | 1 | 3.2%
외식업 | 1 | 3.2%
제조업 | 1 | 3.2%
부동산업 | 1 | 3.2%
정보통신업 | 1 | 3.2%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

시도명 (province/city name)
Text

Distinct: 18
Distinct (%): 58.1%
Missing: 0
Missing (%): 0.0%
Memory size: 380.0 B

Length

Max length: 7
Median length: 5
Mean length: 3.9032258
Min length: 2

Characters and Unicode

Total characters: 121
Distinct characters: 32
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 11
Unique (%): 35.5%

Sample

1st row: 전국
2nd row: 전국
3rd row: 전국
4th row: 전국
5th row: 전국

Value | Count | Frequency (%)
전국 | 8 | 25.8%
경상남도 | 2 | 6.5%
부산광역시 | 2 | 6.5%
충청남도 | 2 | 6.5%
강원도 | 2 | 6.5%
인천광역시 | 2 | 6.5%
울산광역시 | 2 | 6.5%
전라남도 | 1 | 3.2%
대구광역시 | 1 | 3.2%
광주광역시 | 1 | 3.2%
Other values (8) | 8 | 25.8%

Most occurring characters

Value | Count | Frequency (%)
[missing] | 12 | 9.9%
[missing] | 11 | 9.1%
[missing] | 11 | 9.1%
[missing] | 10 | 8.3%
[missing] | 9 | 7.4%
[missing] | 8 | 6.6%
[missing] | 5 | 4.1%
[missing] | 4 | 3.3%
[missing] | 4 | 3.3%
[missing] | 3 | 2.5%
Other values (22) | 44 | 36.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 121 | 100.0%

Most frequent character per category

Other Letter: same counts as under Most occurring characters.

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 121 | 100.0%

Most frequent character per script

Hangul: same counts as under Most occurring characters.

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 121 | 100.0%

Most frequent character per block

Hangul: same counts as under Most occurring characters.
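
The character-level statistics above rest on Unicode properties of each code point. The sketch below shows one way to derive similar counts with Python's standard `unicodedata` module. It uses only five of the 시도명 values visible in this report, so its totals will not match the full 121 characters. The script label is approximated from the first word of the Unicode character name, which is an assumption of this sketch because the standard library has no script lookup.

```python
import unicodedata
from collections import Counter

# A handful of 시도명 values that appear in the report's sample rows.
values = ["전국", "충청북도", "강원도", "세종특별자치시", "울산광역시"]

# Flatten into individual characters.
chars = [ch for value in values for ch in value]

# Frequency of each character.
char_counts = Counter(chars)

# General category per character; "Lo" is "Letter, other" (Other Letter).
categories = Counter(unicodedata.category(ch) for ch in chars)

# Approximate script: first word of the Unicode name, e.g.
# "HANGUL SYLLABLE JEON" -> "HANGUL". Assumption, not a real script lookup.
scripts = Counter(unicodedata.name(ch).split()[0] for ch in chars)

print("total characters:", len(chars))
print("distinct characters:", len(char_counts))
print("categories:", dict(categories))
print("scripts:", dict(scripts))
print("most common:", char_counts.most_common(3))
```

Every character in these values falls into a single category and a single script, which is the same pattern the report shows for the whole column.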

코로나직전대비율 (ratio relative to just before COVID-19)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Memory size: 380.0 B

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 110.25
2nd row: 110.25
3rd row: 110.25
4th row: 110.25
5th row: 110.25

Common Values

Value | Count | Frequency (%)
110.25 | 31 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Correlations

[Figure: correlation heatmap]

 | 유형명 | 기준년월 | 업종명 | 시도명
유형명 | 1.000 | 1.000 | 0.907 | 0.000
기준년월 | 1.000 | 1.000 | 0.000 | 0.000
업종명 | 0.907 | 0.000 | 1.000 | 0.000
시도명 | 0.000 | 0.000 | 0.000 | 1.000

[Figure: correlation heatmap]

 | 유형명 | 기준년월 | 업종명
유형명 | 1.000 | 0.965 | 0.560
기준년월 | 0.965 | 1.000 | 0.000
업종명 | 0.560 | 0.000 | 1.000
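
The 0.965 entry for 유형명 against 기준년월 is consistent with a bias-corrected Cramér's V, a common association measure for pairs of categorical variables. The report does not name the measure, so treating it as Cramér's V is an assumption, and `cramers_v_corrected` is a hypothetical helper. The pairing below assumes that the single Feb-22 row is the 통합 row, as in the sample rows, and that every other row is Jan-22.

```python
import math
from collections import Counter


def cramers_v_corrected(pairs):
    """Bias-corrected Cramér's V for a list of (a, b) category pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    row_totals = Counter(a for a, _ in pairs)
    col_totals = Counter(b for _, b in pairs)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    # Bias correction for small samples.
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (r - 1) * (k - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0


# (유형명, 기준년월) pairs rebuilt from the value counts in this report.
pairs = (
    [("통합", "Feb-22")]
    + [("업종", "Jan-22")] * 7
    + [("지역", "Jan-22")] * 17
    + [("지역X업종", "Jan-22")] * 6
)

print(round(cramers_v_corrected(pairs), 3))
```

With only 31 rows and a single Feb-22 observation, this high value is driven entirely by one row, so it says little about a real relationship between the two fields.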

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

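First rows

Row | 유형명 | 기준년월 | 업종명 | 시도명 | 코로나직전대비율
0 | 통합 | Feb-22 | 전체 | 전국 | 110.25
1 | 업종 | Jan-22 | 서비스업 | 전국 | 110.25
2 | 업종 | Jan-22 | 건설업 | 전국 | 110.25
3 | 업종 | Jan-22 | 외식업 | 전국 | 110.25
4 | 업종 | Jan-22 | 제조업 | 전국 | 110.25
5 | 업종 | Jan-22 | 부동산업 | 전국 | 110.25
6 | 업종 | Jan-22 | 유통업 | 전국 | 110.25
7 | 업종 | Jan-22 | 정보통신업 | 전국 | 110.25
8 | 지역 | Jan-22 | 전체 | 충청북도 | 110.25
9 | 지역 | Jan-22 | 전체 | 강원도 | 110.25

Last rows

Row | 유형명 | 기준년월 | 업종명 | 시도명 | 코로나직전대비율
21 | 지역 | Jan-22 | 전체 | 세종특별자치시 | 110.25
22 | 지역 | Jan-22 | 전체 | 울산광역시 | 110.25
23 | 지역 | Jan-22 | 전체 | 대전광역시 | 110.25
24 | 지역 | Jan-22 | 전체 | 충청남도 | 110.25
25 | 지역X업종 | Jan-22 | 서비스업 | 울산광역시 | 110.25
26 | 지역X업종 | Jan-22 | 유통업 | 강원도 | 110.25
27 | 지역X업종 | Jan-22 | 유통업 | 부산광역시 | 110.25
28 | 지역X업종 | Jan-22 | 유통업 | 인천광역시 | 110.25
29 | 지역X업종 | Jan-22 | 유통업 | 충청남도 | 110.25
30 | 지역X업종 | Jan-22 | 서비스업 | 경상남도 | 110.25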