Overview

Dataset statistics

Number of variables: 7
Number of observations: 29
Missing cells: 10
Missing cells (%): 4.9%
Duplicate rows: 1
Duplicate rows (%): 3.4%
Total size in memory: 1.8 KiB
Average record size in memory: 62.6 B
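
The two percentages above follow from the raw counts. The short check below is only a sketch: it assumes that the missing-cell share is taken over all cells (variables times observations) and the duplicate share over all observations, which is consistent with the reported figures. The variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" block of the report.
n_variables = 7
n_observations = 29
missing_cells = 10
duplicate_rows = 1

# Assumption: the missing share is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: the duplicate share is computed over the number of rows.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```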

Variable types

Categorical: 6
DateTime: 1

Dataset

Description: Sample data
Author: 한국신용데이터 (Korea Credit Data)
URL: https://bigdata-region.kr/#/dataset/b58af84d-1a7b-41b5-a077-89964ed68292

Alerts

2022-09 has constant value "" (Constant)
Dataset has 1 (3.4%) duplicate rows (Duplicates)
20 is highly overall correlated with 통합 and 4 other fields (High correlation)
전국 is highly overall correlated with 통합 and 1 other fields (High correlation)
1천만원 미만 is highly overall correlated with 20 and 1 other fields (High correlation)
4040000 is highly overall correlated with 1천만원 미만 and 1 other fields (High correlation)
통합 is highly overall correlated with 전체 and 2 other fields (High correlation)
전체 is highly overall correlated with 통합 and 1 other fields (High correlation)
2022-09 has 10 (34.5%) missing values (Missing)

Reproduction

Analysis started: 2023-12-22 20:39:36.567315
Analysis finished: 2023-12-22 20:40:09.174422
Duration: 32.61 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
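
The reported duration can be cross-checked against the two timestamps. This is a minimal sketch using only the Python standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block of the report.
started = datetime.fromisoformat("2023-12-22 20:39:36.567315")
finished = datetime.fromisoformat("2023-12-22 20:40:09.174422")

# Elapsed wall-clock time between start and finish, in seconds.
duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```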

Variables

통합
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 17.2%
Missing: 0
Missing (%): 0.0%
Memory size: 364.0 B

Length

Max length: 5
Median length: 4
Mean length: 3.2068966
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 통합
2nd row: 통합
3rd row: 통합
4th row: 통합
5th row: 지역

Common Values

Value | Count | Frequency (%)
<NA> | 10 | 34.5%
지역 (region) | 5 | 17.2%
업종 (industry) | 5 | 17.2%
지역X업종 (region × industry) | 5 | 17.2%
통합 (combined) | 4 | 13.8%
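
The length statistics reported for 통합 can be re-derived from the value counts above. The sketch below assumes that "<NA>" is measured as a literal 4-character string, which is what the reported mean implies; the variable names are illustrative.

```python
from statistics import mean, median

# Value counts copied from the Common Values table for column 통합.
# Assumption: "<NA>" is treated as an ordinary 4-character string.
value_counts = {"<NA>": 10, "지역": 5, "업종": 5, "지역X업종": 5, "통합": 4}

# Expand the counts into one string length per row (29 rows in total).
lengths = [len(value) for value, count in value_counts.items() for _ in range(count)]

print("Max length:", max(lengths))
print("Median length:", median(lengths))
print("Mean length:", round(mean(lengths), 7))
print("Min length:", min(lengths))
```

Because the placeholder string contributes to every length statistic, these figures describe the text as stored and not only the populated categories.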

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Value | Count | Frequency (%)
na | 10 | 34.5%
지역 | 5 | 17.2%
업종 | 5 | 17.2%
지역x업종 | 5 | 17.2%
통합 | 4 | 13.8%

전체
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 10.3%
Missing: 0
Missing (%): 0.0%
Memory size: 364.0 B

Length

Max length: 4
Median length: 3
Mean length: 3.0344828
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전체
2nd row: 전체
3rd row: 전체
4th row: 전체
5th row: 전체

Common Values

Value | Count | Frequency (%)
유통업 (distribution industry) | 10 | 34.5%
<NA> | 10 | 34.5%
전체 (all) | 9 | 31.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Value | Count | Frequency (%)
유통업 | 10 | 34.5%
na | 10 | 34.5%
전체 | 9 | 31.0%

전국
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 10.3%
Missing: 0
Missing (%): 0.0%
Memory size: 364.0 B

Length

Max length: 5
Median length: 4
Mean length: 3.7241379
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전국
2nd row: 전국
3rd row: 전국
4th row: 전국
5th row: 서울특별시

Common Values

Value | Count | Frequency (%)
서울특별시 (Seoul) | 10 | 34.5%
<NA> | 10 | 34.5%
전국 (nationwide) | 9 | 31.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Value | Count | Frequency (%)
서울특별시 | 10 | 34.5%
na | 10 | 34.5%
전국 | 9 | 31.0%

1천만원 미만
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 20.7%
Missing: 0
Missing (%): 0.0%
Memory size: 364.0 B

Length

Max length: 17
Median length: 7
Mean length: 10.103448
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1천만원 이상 - 2천만원 미만
2nd row: 2천만원 이상 - 3천만원 미만
3rd row: 3천만원 이상 - 5천만원 미만
4th row: 5천만원 이상
5th row: 1천만원 미만

Common Values

Value | Count | Frequency (%)
<NA> | 10 | 34.5%
1천만원 이상 - 2천만원 미만 (10 million won or more, under 20 million won) | 4 | 13.8%
2천만원 이상 - 3천만원 미만 (20 million won or more, under 30 million won) | 4 | 13.8%
3천만원 이상 - 5천만원 미만 (30 million won or more, under 50 million won) | 4 | 13.8%
5천만원 이상 (50 million won or more) | 4 | 13.8%
1천만원 미만 (under 10 million won) | 3 | 10.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Value | Count | Frequency (%)
이상 | 16 | 19.0%
미만 | 15 | 17.9%
(empty) | 12 | 14.3%
na | 10 | 11.9%
2천만원 | 8 | 9.5%
3천만원 | 8 | 9.5%
5천만원 | 8 | 9.5%
1천만원 | 7 | 8.3%
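
The second table for this column counts individual words, not whole category values. The sketch below reproduces those counts under an assumed normalisation (split on whitespace, lowercase, strip ASCII punctuation). That rule is inferred from the numbers and is not stated in the report; the variable names are illustrative.

```python
import string
from collections import Counter

# Category counts copied from the Common Values table for column "1천만원 미만".
value_counts = {
    "<NA>": 10,
    "1천만원 이상 - 2천만원 미만": 4,
    "2천만원 이상 - 3천만원 미만": 4,
    "3천만원 이상 - 5천만원 미만": 4,
    "5천만원 이상": 4,
    "1천만원 미만": 3,
}

word_counts = Counter()
for value, count in value_counts.items():
    for token in value.split():
        # Assumed normalisation: lowercase, then strip ASCII punctuation.
        # "<NA>" becomes "na" and the lone "-" becomes an empty token.
        word = token.lower().strip(string.punctuation)
        word_counts[word] += count

total = sum(word_counts.values())
for word, count in word_counts.most_common():
    print(f"{word!r}: {count} ({100 * count / total:.1f}%)")
```

Under this assumption the row with a blank value and a count of 12 corresponds to the "-" separator, which is reduced to an empty token once punctuation is stripped.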

20
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 6.9%
Missing: 0
Missing (%): 0.0%
Memory size: 364.0 B

Length

Max length: 4
Median length: 2
Mean length: 2.6896552
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20
2nd row: 20
3rd row: 20
4th row: 20
5th row: 20

Common Values

Value | Count | Frequency (%)
20 | 19 | 65.5%
<NA> | 10 | 34.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Value | Count | Frequency (%)
20 | 19 | 65.5%
na | 10 | 34.5%

4040000
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 20.7%
Missing: 0
Missing (%): 0.0%
Memory size: 364.0 B

Length

Max length: 11
Median length: 10
Mean length: 7.9655172
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 10600000.0
2nd row: 20777777.78
3rd row: 32500000.0
4th row: 60000000.0
5th row: 4040000.0

Common Values

Value | Count | Frequency (%)
<NA> | 10 | 34.5%
10600000.0 | 4 | 13.8%
20777777.78 | 4 | 13.8%
32500000.0 | 4 | 13.8%
60000000.0 | 4 | 13.8%
4040000.0 | 3 | 10.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Value | Count | Frequency (%)
na | 10 | 34.5%
10600000.0 | 4 | 13.8%
20777777.78 | 4 | 13.8%
32500000.0 | 4 | 13.8%
60000000.0 | 4 | 13.8%
4040000.0 | 3 | 10.3%

2022-09
Date

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 5.3%
Missing: 10
Missing (%): 34.5%
Memory size: 364.0 B
Minimum: 2022-09-01 00:00:00
Maximum: 2022-09-01 00:00:00

[Figure: Histogram with fixed size bins (bins=1)]

Correlations

 | 통합 | 전체 | 전국 | 1천만원 미만 | 4040000
통합 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000
전체 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000
전국 | 1.000 | 0.000 | 1.000 | 0.000 | 0.000
1천만원 미만 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000
4040000 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000

 | 20 | 전국 | 1천만원 미만 | 4040000 | 통합 | 전체
20 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
전국 | 1.000 | 1.000 | 0.000 | 0.000 | 0.939 | 0.000
1천만원 미만 | 1.000 | 0.000 | 1.000 | 1.000 | 0.000 | 0.000
4040000 | 1.000 | 0.000 | 1.000 | 1.000 | 0.000 | 0.000
통합 | 1.000 | 0.939 | 0.000 | 0.000 | 1.000 | 0.939
전체 | 1.000 | 0.000 | 0.000 | 0.000 | 0.939 | 1.000

 | 통합 | 전체 | 전국 | 1천만원 미만 | 20 | 4040000
통합 | 1.000 | 0.939 | 0.939 | 0.000 | 1.000 | 0.000
전체 | 0.939 | 1.000 | 0.000 | 0.000 | 1.000 | 0.000
전국 | 0.939 | 0.000 | 1.000 | 0.000 | 1.000 | 0.000
1천만원 미만 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000
20 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
4040000 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000
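
The "High correlation" alerts near the top of the report appear to match the last correlation table above. The sketch below lists, for each field, the other fields whose coefficient reaches a cut-off. The cut-off of 0.5 is an assumption made for illustration; the report does not state which threshold was used. The variable names are also illustrative.

```python
# Last correlation table from the report, copied row by row.
names = ["통합", "전체", "전국", "1천만원 미만", "20", "4040000"]
matrix = [
    [1.000, 0.939, 0.939, 0.000, 1.000, 0.000],
    [0.939, 1.000, 0.000, 0.000, 1.000, 0.000],
    [0.939, 0.000, 1.000, 0.000, 1.000, 0.000],
    [0.000, 0.000, 0.000, 1.000, 1.000, 1.000],
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000],
    [0.000, 0.000, 0.000, 1.000, 1.000, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, not stated in the report

# For each field, collect the other fields at or above the cut-off.
high = {
    name: [
        other
        for other, value in zip(names, row)
        if other != name and value >= THRESHOLD
    ]
    for name, row in zip(names, matrix)
}

for name, others in high.items():
    print(f"{name}: {len(others)} field(s): {', '.join(others)}")
```

With this cut-off, the field 20 is paired with all five other fields and 통합 with three, which agrees with the counts given in the alerts.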

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

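(index) | 통합 | 전체 | 전국 | 1천만원 미만 | 20 | 4040000 | 2022-09
0 | 통합 | 전체 | 전국 | 1천만원 이상 - 2천만원 미만 | 20 | 10600000.0 | 2022-09
1 | 통합 | 전체 | 전국 | 2천만원 이상 - 3천만원 미만 | 20 | 20777777.78 | 2022-09
2 | 통합 | 전체 | 전국 | 3천만원 이상 - 5천만원 미만 | 20 | 32500000.0 | 2022-09
3 | 통합 | 전체 | 전국 | 5천만원 이상 | 20 | 60000000.0 | 2022-09
4 | 지역 | 전체 | 서울특별시 | 1천만원 미만 | 20 | 4040000.0 | 2022-09
5 | 지역 | 전체 | 서울특별시 | 1천만원 이상 - 2천만원 미만 | 20 | 10600000.0 | 2022-09
6 | 지역 | 전체 | 서울특별시 | 2천만원 이상 - 3천만원 미만 | 20 | 20777777.78 | 2022-09
7 | 지역 | 전체 | 서울특별시 | 3천만원 이상 - 5천만원 미만 | 20 | 32500000.0 | 2022-09
8 | 지역 | 전체 | 서울특별시 | 5천만원 이상 | 20 | 60000000.0 | 2022-09
9 | 업종 | 유통업 | 전국 | 1천만원 미만 | 20 | 4040000.0 | 2022-09

(index) | 통합 | 전체 | 전국 | 1천만원 미만 | 20 | 4040000 | 2022-09
19 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
20 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
21 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
22 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
23 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
24 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
25 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
26 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
27 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
28 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>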

Duplicate rows

Most frequently occurring

(index) | 통합 | 전체 | 전국 | 1천만원 미만 | 20 | 4040000 | 2022-09 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 10
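
The duplicate figures can look inconsistent at first: the overview reports 1 duplicate row (3.4%), while the table above shows 10 occurrences. The sketch below illustrates one reading that fits both numbers, namely that the overview counts distinct rows that repeat while the table counts their occurrences. The 19 populated rows are replaced by hypothetical placeholders, since the report does not list them all, and the variable names are illustrative.

```python
from collections import Counter

N_COLUMNS = 7
NA_ROW = ("<NA>",) * N_COLUMNS  # the all-missing row shown in the duplicates table

# Hypothetical placeholders for the 19 populated records; only the fact that
# they are distinct from one another matters for this illustration.
populated_rows = [(f"row-{i}",) * N_COLUMNS for i in range(19)]
rows = populated_rows + [NA_ROW] * 10

counts = Counter(rows)
duplicated = {row: n for row, n in counts.items() if n > 1}

print("Distinct rows that repeat:", len(duplicated))
print("Occurrences of the repeated row:", duplicated[NA_ROW])
print(f"Share of observations: {100 * len(duplicated) / len(rows):.1f}%")
```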