Overview

Dataset statistics

Number of variables: 5
Number of observations: 30
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.3 KiB
Average record size in memory: 45.4 B

Variable types

Categorical: 4
Text: 1

Dataset

Description: Sample data
Author: 한국신용데이터 (Korea Credit Data)
URL: https://bigdata-region.kr/#/dataset/63957284-a24f-4b12-b4ef-b042fa9e29b0

Alerts

결제단가금액 has constant value "10000" (Constant)
기준년월 is highly overall correlated with 유형명 (High correlation)
유형명 is highly overall correlated with 기준년월 (High correlation)
기준년월 is highly imbalanced (78.9%) (Imbalance)
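
The 78.9% imbalance figure for 기준년월 can be reproduced by hand from its two value counts (29 and 1). The sketch below assumes the score is one minus the Shannon entropy of the class frequencies, divided by the maximum entropy for that number of classes. The function name `imbalance_score` is illustrative and does not come from the report.

```python
import math


def imbalance_score(counts):
    """Return 1 - normalized Shannon entropy of the class counts.

    Assumption: this mirrors the measure behind the Imbalance alert.
    0.0 means perfectly balanced; values near 1.0 mean one class dominates.
    """
    n_classes = len(counts)
    if n_classes <= 1:
        # Assumption: the score is not meaningful for a single class,
        # so 0.0 is returned as a placeholder.
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Value counts of 기준년월 as reported: "22-Jan" x 29 and "Jan-22" x 1.
score = imbalance_score([29, 1])
print(f"{score:.1%}")  # 78.9%
```

Under this assumption, a 29-to-1 split has an entropy of about 0.21 bits out of a possible 1 bit, which gives the reported value.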

Reproduction

Analysis started: 2023-12-10 14:13:48.136039
Analysis finished: 2023-12-10 14:13:49.083685
Duration: 0.95 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

유형명 (type name)
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 13.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 5
Median length: 2
Mean length: 2.3
Min length: 2

Unique

Unique: 1
Unique (%): 3.3%

Sample

1st row: 통합
2nd row: 업종
3rd row: 업종
4th row: 업종
5th row: 업종

Common Values

Value | Count | Frequency (%)
지역 (region) | 17 | 56.7%
업종 (industry) | 9 | 30.0%
지역X업종 (region × industry) | 3 | 10.0%
통합 (combined) | 1 | 3.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Plot of the value counts listed under Common Values.
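
The Distinct and Unique figures for 유형명 measure different things: Distinct counts how many different values occur, while Unique counts the values that occur exactly once. A minimal sketch, using only the value counts listed under Common Values (the variable names are illustrative):

```python
from collections import Counter

# Value counts of 유형명 as listed under Common Values.
counts = Counter({"지역": 17, "업종": 9, "지역X업종": 3, "통합": 1})
n_rows = sum(counts.values())

# Distinct: number of different values present in the column.
distinct = len(counts)
# Unique: number of values that appear in exactly one row.
unique = sum(1 for c in counts.values() if c == 1)

print(distinct, f"{distinct / n_rows:.1%}")  # 4 13.3%
print(unique, f"{unique / n_rows:.1%}")      # 1 3.3%
```

Only 통합 appears once, so Unique is 1 even though four distinct values exist.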

기준년월 (reference year-month)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 6.7%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 1
Unique (%): 3.3%

Sample

1st row: Jan-22
2nd row: 22-Jan
3rd row: 22-Jan
4th row: 22-Jan
5th row: 22-Jan

Common Values

Value | Count | Frequency (%)
22-Jan | 29 | 96.7%
Jan-22 | 1 | 3.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Plot of the value counts listed under Common Values.
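
The two values of 기준년월, "22-Jan" and "Jan-22", look like two spellings of the same month rather than two different periods. The sketch below normalizes both to a single `YYYY-MM` string. It assumes that each spelling combines a two-digit year with an English month abbreviation, so both mean January 2022; the report itself does not confirm this reading. The helper name `normalize_year_month` is hypothetical.

```python
from datetime import datetime

# Assumption: both layouts encode a two-digit year and an English month
# abbreviation, so "22-Jan" and "Jan-22" both mean January 2022.
FORMATS = ("%y-%b", "%b-%y")


def normalize_year_month(raw):
    """Return the value as "YYYY-MM", trying each known layout in turn."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(raw, fmt).strftime("%Y-%m")
        except ValueError:
            continue
    raise ValueError(f"unrecognized year-month value: {raw!r}")


# The column as profiled: one "Jan-22" row and 29 "22-Jan" rows.
values = ["Jan-22"] + ["22-Jan"] * 29
normalized = {normalize_year_month(v) for v in values}
print(normalized)  # {'2022-01'}
```

If that assumption holds, the normalized column has a single value, and the imbalance and high-correlation alerts for 기준년월 would be artifacts of the inconsistent formatting rather than properties of the data.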

업종명 (industry name)
Categorical

Distinct: 10
Distinct (%): 33.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 8
Median length: 2
Mean length: 2.6666667
Min length: 2

Unique

Unique: 7
Unique (%): 23.3%

Sample

1st row: 전체
2nd row: 정보통신업
3rd row: 부동산업
4th row: 외식업
5th row: 제조업

Common Values

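Value | Count | Frequency (%)
전체 (all) | 18 | 60.0%
유통업 (distribution/retail) | 3 | 10.0%
외식업 (food service) | 2 | 6.7%
정보통신업 (information and communications) | 1 | 3.3%
부동산업 (real estate) | 1 | 3.3%
제조업 (manufacturing) | 1 | 3.3%
농업/임업/어업 (agriculture/forestry/fishing) | 1 | 3.3%
건설업 (construction) | 1 | 3.3%
기타 (other) | 1 | 3.3%
서비스업 (services) | 1 | 3.3%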

Length

Histogram of lengths of the category

Common Values (Plot)

Plot of the value counts listed under Common Values.

시도명 (province/city name)
Text

Distinct: 18
Distinct (%): 60.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 7
Median length: 5
Mean length: 3.6666667
Min length: 2

Characters and Unicode

Total characters: 110
Distinct characters: 32
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 14
Unique (%): 46.7%

Sample

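1st row: 전국
2nd row: 전국
3rd row: 전국
4th row: 전국
5th row: 전국

Value | Count | Frequency (%)
전국 (nationwide) | 10 | 33.3%
강원도 (Gangwon-do) | 2 | 6.7%
경기도 (Gyeonggi-do) | 2 | 6.7%
대전광역시 (Daejeon Metropolitan City) | 2 | 6.7%
서울특별시 (Seoul Special City) | 1 | 3.3%
울산광역시 (Ulsan Metropolitan City) | 1 | 3.3%
충청남도 (Chungcheongnam-do) | 1 | 3.3%
제주특별자치도 (Jeju Special Self-Governing Province) | 1 | 3.3%
전라북도 (Jeollabuk-do) | 1 | 3.3%
전라남도 (Jeollanam-do) | 1 | 3.3%
Other values (8) | 8 | 26.7%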

Most occurring characters

Value | Count | Frequency (%)
 | 14 | 12.7%
 | 11 | 10.0%
 | 10 | 9.1%
 | 9 | 8.2%
 | 8 | 7.3%
 | 7 | 6.4%
 | 4 | 3.6%
 | 3 | 2.7%
 | 3 | 2.7%
 | 3 | 2.7%
Other values (22) | 38 | 34.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 110 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 14 | 12.7%
 | 11 | 10.0%
 | 10 | 9.1%
 | 9 | 8.2%
 | 8 | 7.3%
 | 7 | 6.4%
 | 4 | 3.6%
 | 3 | 2.7%
 | 3 | 2.7%
 | 3 | 2.7%
Other values (22) | 38 | 34.5%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 110 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 14 | 12.7%
 | 11 | 10.0%
 | 10 | 9.1%
 | 9 | 8.2%
 | 8 | 7.3%
 | 7 | 6.4%
 | 4 | 3.6%
 | 3 | 2.7%
 | 3 | 2.7%
 | 3 | 2.7%
Other values (22) | 38 | 34.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 110 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 14 | 12.7%
 | 11 | 10.0%
 | 10 | 9.1%
 | 9 | 8.2%
 | 8 | 7.3%
 | 7 | 6.4%
 | 4 | 3.6%
 | 3 | 2.7%
 | 3 | 2.7%
 | 3 | 2.7%
Other values (22) | 38 | 34.5%

결제단가금액 (payment unit price amount)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 10000
2nd row: 10000
3rd row: 10000
4th row: 10000
5th row: 10000

Common Values

Value | Count | Frequency (%)
10000 | 30 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Plot of the value counts listed under Common Values.

Correlations

Variable | 유형명 | 기준년월 | 업종명 | 시도명
유형명 | 1.000 | 1.000 | 0.722 | 0.000
기준년월 | 1.000 | 1.000 | 0.000 | 0.000
업종명 | 0.722 | 0.000 | 1.000 | 0.000
시도명 | 0.000 | 0.000 | 0.000 | 1.000

Variable | 기준년월 | 업종명 | 유형명
기준년월 | 1.000 | 0.000 | 0.964
업종명 | 0.000 | 1.000 | 0.452
유형명 | 0.964 | 0.452 | 1.000

Variable | 유형명 | 기준년월 | 업종명
유형명 | 1.000 | 0.964 | 0.452
기준년월 | 0.964 | 1.000 | 0.000
업종명 | 0.452 | 0.000 | 1.000
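
The extracted text does not say which association measure each matrix uses. As a hedged check, the sketch below computes a bias-corrected Cramér's V for 유형명 against 기준년월, which yields the 0.964 entry. The contingency table is derived from the reported value counts plus the sample rows, where the single "Jan-22" row is the 통합 row. The function name `cramers_v_corrected` is illustrative, and the choice of this measure is an assumption.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows)."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic without continuity correction.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # Bias correction applied to phi-squared and to the table dimensions.
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Rows: 통합, 업종, 지역, 지역X업종; columns: "Jan-22", "22-Jan".
table = [[1, 0], [0, 9], [0, 17], [0, 3]]
v = cramers_v_corrected(table)
print(round(v, 3))  # 0.964
```

Without the bias correction the same table gives exactly 1.0, because 기준년월 is fully determined by 유형명. That would be consistent with the 1.000 entry in the first matrix, although the report does not state this.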

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

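First rows

Index | 유형명 | 기준년월 | 업종명 | 시도명 | 결제단가금액
0 | 통합 | Jan-22 | 전체 | 전국 | 10000
1 | 업종 | 22-Jan | 정보통신업 | 전국 | 10000
2 | 업종 | 22-Jan | 부동산업 | 전국 | 10000
3 | 업종 | 22-Jan | 외식업 | 전국 | 10000
4 | 업종 | 22-Jan | 제조업 | 전국 | 10000
5 | 업종 | 22-Jan | 농업/임업/어업 | 전국 | 10000
6 | 업종 | 22-Jan | 건설업 | 전국 | 10000
7 | 업종 | 22-Jan | 기타 | 전국 | 10000
8 | 업종 | 22-Jan | 유통업 | 전국 | 10000
9 | 업종 | 22-Jan | 서비스업 | 전국 | 10000

Last rows

Index | 유형명 | 기준년월 | 업종명 | 시도명 | 결제단가금액
20 | 지역 | 22-Jan | 전체 | 울산광역시 | 10000
21 | 지역 | 22-Jan | 전체 | 인천광역시 | 10000
22 | 지역 | 22-Jan | 전체 | 전라남도 | 10000
23 | 지역 | 22-Jan | 전체 | 전라북도 | 10000
24 | 지역 | 22-Jan | 전체 | 제주특별자치도 | 10000
25 | 지역 | 22-Jan | 전체 | 충청남도 | 10000
26 | 지역 | 22-Jan | 전체 | 충청북도 | 10000
27 | 지역X업종 | 22-Jan | 외식업 | 경기도 | 10000
28 | 지역X업종 | 22-Jan | 유통업 | 강원도 | 10000
29 | 지역X업종 | 22-Jan | 유통업 | 대전광역시 | 10000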