Overview

Dataset statistics

Number of variables: 7
Number of observations: 29
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1
Duplicate rows (%): 3.4%
Total size in memory: 1.8 KiB
Average record size in memory: 62.6 B
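
The derived figures in these statistics follow from the raw counts. A minimal sketch of the arithmetic, assuming the duplicate percentage is duplicate rows over observations and that the total size is roughly the average record size times the number of observations (the variable names are illustrative and not part of the report):

```python
# Values copied from the dataset statistics.
n_observations = 29
n_duplicate_rows = 1
avg_record_size_bytes = 62.6

# Share of duplicate rows as a percentage of all observations.
duplicate_pct = 100 * n_duplicate_rows / n_observations

# Approximate total size, assuming total = average record size * rows.
total_kib = avg_record_size_bytes * n_observations / 1024

print(f"{duplicate_pct:.1f}%")
print(f"{total_kib:.1f} KiB")
```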

Variable types

Categorical: 7

Dataset

Description: Sample data
Author: 한국신용데이터 (Korea Credit Data)
URL: https://bigdata-region.kr/#/dataset/0ac8fd71-cdb0-4ebe-aff3-a67053482be8

Alerts

Duplicates: Dataset has 1 (3.4%) duplicate rows
High correlation: 2022-09 is highly overall correlated with 통합 and 5 other fields
High correlation: 4040000 is highly overall correlated with 1천만원 미만 and 2 other fields
High correlation: 통합 is highly overall correlated with 전체 and 3 other fields
High correlation: 전체 is highly overall correlated with 통합 and 2 other fields
High correlation: 20 is highly overall correlated with 통합 and 5 other fields
High correlation: 1천만원 미만 is highly overall correlated with 20 and 2 other fields
High correlation: 전국 is highly overall correlated with 통합 and 2 other fields

Reproduction

Analysis started: 2023-12-22 20:43:16.717009
Analysis finished: 2023-12-22 20:43:18.577024
Duration: 1.86 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

통합 (Combined)
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 17.2%
Missing: 0
Missing (%): 0.0%
Memory size: 364.0 B

Length

Max length: 5
Median length: 4
Mean length: 3.2068966
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 통합
2nd row: 통합
3rd row: 통합
4th row: 통합
5th row: 지역

Common Values

Value  Count  Frequency (%)
<NA>  10  34.5%
지역 (Region)  5  17.2%
업종 (Industry)  5  17.2%
지역X업종 (Region x Industry)  5  17.2%
통합 (Combined)  4  13.8%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]

Value  Count  Frequency (%)
na  10  34.5%
지역  5  17.2%
업종  5  17.2%
지역x업종  5  17.2%
통합  4  13.8%
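
The Length and Distinct figures reported for 통합 can be recomputed from its Common Values counts. The sketch below is a check, not the profiler's own code. It assumes that `<NA>` is counted as the literal 4-character string, an assumption that reproduces the reported mean and may also explain why the report shows 0 missing cells.

```python
from statistics import mean, median

# Value counts for 통합, copied from its Common Values table.
value_counts = {"<NA>": 10, "지역": 5, "업종": 5, "지역X업종": 5, "통합": 4}

# One length per observation; "<NA>" is treated as a 4-character string
# (an assumption, see the note above).
lengths = [len(value) for value, count in value_counts.items() for _ in range(count)]

n_rows = len(lengths)
distinct_pct = 100 * len(value_counts) / n_rows

print("max:", max(lengths))
print("median:", median(lengths))
print("mean:", round(mean(lengths), 7))
print("min:", min(lengths))
print(f"distinct: {distinct_pct:.1f}%")
```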

전체 (All)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 10.3%
Missing: 0
Missing (%): 0.0%
Memory size: 364.0 B

Length

Max length: 4
Median length: 3
Mean length: 3.0344828
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전체
2nd row: 전체
3rd row: 전체
4th row: 전체
5th row: 전체

Common Values

Value  Count  Frequency (%)
유통업 (Distribution industry)  10  34.5%
<NA>  10  34.5%
전체 (All)  9  31.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]

Value  Count  Frequency (%)
유통업  10  34.5%
na  10  34.5%
전체  9  31.0%

전국 (Nationwide)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 10.3%
Missing: 0
Missing (%): 0.0%
Memory size: 364.0 B

Length

Max length: 5
Median length: 4
Mean length: 3.7241379
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전국
2nd row: 전국
3rd row: 전국
4th row: 전국
5th row: 서울특별시

Common Values

Value  Count  Frequency (%)
서울특별시 (Seoul)  10  34.5%
<NA>  10  34.5%
전국 (Nationwide)  9  31.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]

Value  Count  Frequency (%)
서울특별시  10  34.5%
na  10  34.5%
전국  9  31.0%

1천만원 미만 (under 10 million KRW)
Categorical

HIGH CORRELATION

Distinct: 6
Distinct (%): 20.7%
Missing: 0
Missing (%): 0.0%
Memory size: 364.0 B

Length

Max length: 17
Median length: 7
Mean length: 10.103448
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1천만원 이상 - 2천만원 미만
2nd row: 2천만원 이상 - 3천만원 미만
3rd row: 3천만원 이상 - 5천만원 미만
4th row: 5천만원 이상
5th row: 1천만원 미만

Common Values

Value  Count  Frequency (%)
<NA>  10  34.5%
1천만원 이상 - 2천만원 미만 (10 million to under 20 million KRW)  4  13.8%
2천만원 이상 - 3천만원 미만 (20 million to under 30 million KRW)  4  13.8%
3천만원 이상 - 5천만원 미만 (30 million to under 50 million KRW)  4  13.8%
5천만원 이상 (50 million KRW or more)  4  13.8%
1천만원 미만 (under 10 million KRW)  3  10.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]

Value  Count  Frequency (%)
이상  16  19.0%
미만  15  17.9%
(empty)  12  14.3%
na  10  11.9%
2천만원  8  9.5%
3천만원  8  9.5%
5천만원  8  9.5%
1천만원  7  8.3%
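
The last table for this variable counts individual words rather than whole values. The counts are consistent with lowercasing each value, splitting it on whitespace, and stripping punctuation from each token, in which case the blank entry with count 12 would be the "-" separator. The sketch below shows that reading. The tokenization rule is an assumption inferred from the numbers, not taken from the ydata-profiling source.

```python
from collections import Counter
import string

# Value counts for the 1천만원 미만 column, copied from its Common Values table.
value_counts = {
    "<NA>": 10,
    "1천만원 이상 - 2천만원 미만": 4,
    "2천만원 이상 - 3천만원 미만": 4,
    "3천만원 이상 - 5천만원 미만": 4,
    "5천만원 이상": 4,
    "1천만원 미만": 3,
}

token_counts = Counter()
for value, count in value_counts.items():
    for token in value.lower().split():
        # Strip ASCII punctuation: "<na>" becomes "na" and "-" becomes "".
        token_counts[token.strip(string.punctuation)] += count

total = sum(token_counts.values())
for token, count in token_counts.most_common():
    print(repr(token), count, f"{100 * count / total:.1f}%")
```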

20
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 6.9%
Missing: 0
Missing (%): 0.0%
Memory size: 364.0 B

Length

Max length: 4
Median length: 2
Mean length: 2.6896552
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20
2nd row: 20
3rd row: 20
4th row: 20
5th row: 20

Common Values

Value  Count  Frequency (%)
20  19  65.5%
<NA>  10  34.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]

Value  Count  Frequency (%)
20  19  65.5%
na  10  34.5%

4040000
Categorical

HIGH CORRELATION

Distinct: 6
Distinct (%): 20.7%
Missing: 0
Missing (%): 0.0%
Memory size: 364.0 B

Length

Max length: 11
Median length: 10
Mean length: 7.9655172
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 10600000.0
2nd row: 20777777.78
3rd row: 32500000.0
4th row: 60000000.0
5th row: 4040000.0

Common Values

Value  Count  Frequency (%)
<NA>  10  34.5%
10600000.0  4  13.8%
20777777.78  4  13.8%
32500000.0  4  13.8%
60000000.0  4  13.8%
4040000.0  3  10.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]

Value  Count  Frequency (%)
na  10  34.5%
10600000.0  4  13.8%
20777777.78  4  13.8%
32500000.0  4  13.8%
60000000.0  4  13.8%
4040000.0  3  10.3%

2022-09
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 6.9%
Missing: 0
Missing (%): 0.0%
Memory size: 364.0 B

Length

Max length: 7
Median length: 7
Mean length: 5.9655172
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022-09
2nd row: 2022-09
3rd row: 2022-09
4th row: 2022-09
5th row: 2022-09

Common Values

Value  Count  Frequency (%)
2022-09  19  65.5%
<NA>  10  34.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]

Value  Count  Frequency (%)
2022-09  19  65.5%
na  10  34.5%

Correlations

              통합   전체   전국   1천만원 미만  4040000
통합          1.000  1.000  1.000  0.000         0.000
전체          1.000  1.000  0.000  0.000         0.000
전국          1.000  0.000  1.000  0.000         0.000
1천만원 미만  0.000  0.000  0.000  1.000         1.000
4040000       0.000  0.000  0.000  1.000         1.000

              2022-09  4040000  통합   전체   20     1천만원 미만  전국
2022-09       1.000    1.000    1.000  1.000  1.000  1.000         1.000
4040000       1.000    1.000    0.000  0.000  1.000  1.000         0.000
통합          1.000    0.000    1.000  0.939  1.000  0.000         0.939
전체          1.000    0.000    0.939  1.000  1.000  0.000         0.000
20            1.000    1.000    1.000  1.000  1.000  1.000         1.000
1천만원 미만  1.000    1.000    0.000  0.000  1.000  1.000         0.000
전국          1.000    0.000    0.939  0.000  1.000  0.000         1.000

              통합   전체   전국   1천만원 미만  20     4040000  2022-09
통합          1.000  0.939  0.939  0.000         1.000  0.000    1.000
전체          0.939  1.000  0.000  0.000         1.000  0.000    1.000
전국          0.939  0.000  1.000  0.000         1.000  0.000    1.000
1천만원 미만  0.000  0.000  0.000  1.000         1.000  1.000    1.000
20            1.000  1.000  1.000  1.000         1.000  1.000    1.000
4040000       0.000  0.000  0.000  1.000         1.000  1.000    1.000
2022-09       1.000  1.000  1.000  1.000         1.000  1.000    1.000
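
The report does not name the association measure behind these matrices. The 0.939 entry for 통합 against 전체 is consistent with a bias-corrected Cramér's V computed over the 19 rows that are not `<NA>`. The sketch below works through that reading. The contingency table is an assumption: it is inferred from the Common Values counts and the Sample rows (the 지역X업종 row in particular is not shown in the Sample), and `corrected_cramers_v` is an illustrative helper, not a ydata-profiling function.

```python
from math import sqrt

# Assumed contingency table of 통합 (rows) against 전체 (columns) over the
# 19 rows that are not "<NA>".
# Rows: 통합, 지역, 업종, 지역X업종. Columns: 전체, 유통업.
table = [
    [4, 0],
    [5, 0],
    [0, 5],
    [0, 5],
]


def corrected_cramers_v(table):
    """Bias-corrected Cramér's V for a contingency table of counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # Bias correction of phi^2 and of the table dimensions.
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return 1.0 if denom == 0 else sqrt(phi2_corr / denom)


v = corrected_cramers_v(table)
print(round(v, 3))
```

Under these assumptions 전체 is fully determined by 통합, so the uncorrected statistic would be 1; the value below 1 comes only from the small-sample correction.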

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

    통합  전체    전국        1천만원 미만                 20  4040000      2022-09
0   통합  전체    전국        1천만원 이상 - 2천만원 미만  20  10600000.0   2022-09
1   통합  전체    전국        2천만원 이상 - 3천만원 미만  20  20777777.78  2022-09
2   통합  전체    전국        3천만원 이상 - 5천만원 미만  20  32500000.0   2022-09
3   통합  전체    전국        5천만원 이상                 20  60000000.0   2022-09
4   지역  전체    서울특별시  1천만원 미만                 20  4040000.0    2022-09
5   지역  전체    서울특별시  1천만원 이상 - 2천만원 미만  20  10600000.0   2022-09
6   지역  전체    서울특별시  2천만원 이상 - 3천만원 미만  20  20777777.78  2022-09
7   지역  전체    서울특별시  3천만원 이상 - 5천만원 미만  20  32500000.0   2022-09
8   지역  전체    서울특별시  5천만원 이상                 20  60000000.0   2022-09
9   업종  유통업  전국        1천만원 미만                 20  4040000.0    2022-09

    통합  전체  전국  1천만원 미만  20    4040000  2022-09
19  <NA>  <NA>  <NA>  <NA>          <NA>  <NA>     <NA>
20  <NA>  <NA>  <NA>  <NA>          <NA>  <NA>     <NA>
21  <NA>  <NA>  <NA>  <NA>          <NA>  <NA>     <NA>
22  <NA>  <NA>  <NA>  <NA>          <NA>  <NA>     <NA>
23  <NA>  <NA>  <NA>  <NA>          <NA>  <NA>     <NA>
24  <NA>  <NA>  <NA>  <NA>          <NA>  <NA>     <NA>
25  <NA>  <NA>  <NA>  <NA>          <NA>  <NA>     <NA>
26  <NA>  <NA>  <NA>  <NA>          <NA>  <NA>     <NA>
27  <NA>  <NA>  <NA>  <NA>          <NA>  <NA>     <NA>
28  <NA>  <NA>  <NA>  <NA>          <NA>  <NA>     <NA>

Duplicate rows

Most frequently occurring

    통합  전체  전국  1천만원 미만  20    4040000  2022-09  # duplicates
0   <NA>  <NA>  <NA>  <NA>          <NA>  <NA>     <NA>     10
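
The overview reports 1 duplicate row while this table reports 10 duplicates. The two agree if the overview counts distinct duplicated row patterns and this table counts occurrences of each pattern. A minimal sketch of that grouping follows, using only the 20 rows printed in the Sample section; rows 10 to 18 are not shown in the report and are left out rather than guessed.

```python
from collections import Counter

NA = "<NA>"

# Rows 0-9 as printed in the Sample section.
head_rows = [
    ("통합", "전체", "전국", "1천만원 이상 - 2천만원 미만", "20", "10600000.0", "2022-09"),
    ("통합", "전체", "전국", "2천만원 이상 - 3천만원 미만", "20", "20777777.78", "2022-09"),
    ("통합", "전체", "전국", "3천만원 이상 - 5천만원 미만", "20", "32500000.0", "2022-09"),
    ("통합", "전체", "전국", "5천만원 이상", "20", "60000000.0", "2022-09"),
    ("지역", "전체", "서울특별시", "1천만원 미만", "20", "4040000.0", "2022-09"),
    ("지역", "전체", "서울특별시", "1천만원 이상 - 2천만원 미만", "20", "10600000.0", "2022-09"),
    ("지역", "전체", "서울특별시", "2천만원 이상 - 3천만원 미만", "20", "20777777.78", "2022-09"),
    ("지역", "전체", "서울특별시", "3천만원 이상 - 5천만원 미만", "20", "32500000.0", "2022-09"),
    ("지역", "전체", "서울특별시", "5천만원 이상", "20", "60000000.0", "2022-09"),
    ("업종", "유통업", "전국", "1천만원 미만", "20", "4040000.0", "2022-09"),
]
# Rows 19-28 as printed in the Sample section: every cell is "<NA>".
tail_rows = [(NA,) * 7 for _ in range(10)]

row_counts = Counter(head_rows + tail_rows)

# Keep only row patterns that occur more than once.
duplicates = {row: count for row, count in row_counts.items() if count > 1}

for row, count in duplicates.items():
    print(row, count)
```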