Overview

Dataset statistics

Number of variables: 7
Number of observations: 29
Missing cells: 10
Missing cells (%): 4.9%
Duplicate rows: 1
Duplicate rows (%): 3.4%
Total size in memory: 1.8 KiB
Average record size in memory: 62.6 B
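
The two percentages follow directly from the raw counts. A minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (variables × observations) and the duplicate percentage over all rows; the variable names are illustrative only:

```python
# Counts copied from the Dataset statistics listed above.
n_variables = 7
n_observations = 29
missing_cells = 10
duplicate_rows = 1

# Assumption: percentages are relative to all cells and all rows respectively.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```

Both values round to the figures reported in the table.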

Variable types

Categorical: 6
DateTime: 1

Dataset

Description: Sample data
Author: 한국신용데이터 (Korea Credit Data)
URL: https://bigdata-region.kr/#/dataset/24595c42-37f3-4ad5-b169-110c0bb75727

Alerts

2022-09 has constant value "" (Constant)
Dataset has 1 (3.4%) duplicate rows (Duplicates)
전체 is highly overall correlated with 통합 and 1 other field (High correlation)
전국 is highly overall correlated with 통합 and 1 other field (High correlation)
1천만원 미만 is highly overall correlated with 20 and 1 other field (High correlation)
20 is highly overall correlated with 통합 and 4 other fields (High correlation)
4040000 is highly overall correlated with 1천만원 미만 and 1 other field (High correlation)
통합 is highly overall correlated with 전체 and 2 other fields (High correlation)
2022-09 has 10 (34.5%) missing values (Missing)

Reproduction

Analysis started: 2023-12-22 20:42:07.701362
Analysis finished: 2023-12-22 20:42:09.708894
Duration: 2.01 seconds
Software version: ydata-profiling v4.5.1

Variables

통합
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 17.2%
Missing: 0
Missing (%): 0.0%
Memory size: 364.0 B

Length

Max length: 5
Median length: 4
Mean length: 3.2068966
Min length: 2
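
These length statistics can be re-derived from the value counts listed for 통합 under Common Values. The sketch below is an illustration, assuming that lengths are counted in characters and that `<NA>` is stored as a literal four-character string (which would also explain why the column reports 0 missing values); the names `value_counts` and `length_stats` are hypothetical:

```python
from statistics import mean, median

# Value counts for the 통합 column, copied from its Common Values table.
value_counts = {"<NA>": 10, "지역": 5, "업종": 5, "지역X업종": 5, "통합": 4}

# Expand the counts into one string length per row (29 rows in total).
lengths = [len(value) for value, count in value_counts.items() for _ in range(count)]

length_stats = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": mean(lengths),
    "min": min(lengths),
}
print(length_stats)
```

Under these assumptions the result matches the reported maximum, median, mean, and minimum.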

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 통합
2nd row: 통합
3rd row: 통합
4th row: 통합
5th row: 지역

Common Values

Value | Count | Frequency (%)
<NA> | 10 | 34.5%
지역 | 5 | 17.2%
업종 | 5 | 17.2%
지역X업종 | 5 | 17.2%
통합 | 4 | 13.8%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 10 | 34.5%
지역 | 5 | 17.2%
업종 | 5 | 17.2%
지역x업종 | 5 | 17.2%
통합 | 4 | 13.8%

전체
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 10.3%
Missing: 0
Missing (%): 0.0%
Memory size: 364.0 B

Length

Max length: 4
Median length: 3
Mean length: 3.0344828
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전체
2nd row: 전체
3rd row: 전체
4th row: 전체
5th row: 전체

Common Values

Value | Count | Frequency (%)
유통업 | 10 | 34.5%
<NA> | 10 | 34.5%
전체 | 9 | 31.0%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
유통업 | 10 | 34.5%
na | 10 | 34.5%
전체 | 9 | 31.0%

전국
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 10.3%
Missing: 0
Missing (%): 0.0%
Memory size: 364.0 B

Length

Max length: 5
Median length: 4
Mean length: 3.7241379
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전국
2nd row: 전국
3rd row: 전국
4th row: 전국
5th row: 서울특별시

Common Values

Value | Count | Frequency (%)
서울특별시 | 10 | 34.5%
<NA> | 10 | 34.5%
전국 | 9 | 31.0%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
서울특별시 | 10 | 34.5%
na | 10 | 34.5%
전국 | 9 | 31.0%

1천만원 미만
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 20.7%
Missing: 0
Missing (%): 0.0%
Memory size: 364.0 B

Length

Max length: 17
Median length: 7
Mean length: 10.103448
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1천만원 이상 - 2천만원 미만
2nd row: 2천만원 이상 - 3천만원 미만
3rd row: 3천만원 이상 - 5천만원 미만
4th row: 5천만원 이상
5th row: 1천만원 미만

Common Values

Value | Count | Frequency (%)
<NA> | 10 | 34.5%
1천만원 이상 - 2천만원 미만 | 4 | 13.8%
2천만원 이상 - 3천만원 미만 | 4 | 13.8%
3천만원 이상 - 5천만원 미만 | 4 | 13.8%
5천만원 이상 | 4 | 13.8%
1천만원 미만 | 3 | 10.3%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
이상 | 16 | 19.0%
미만 | 15 | 17.9%
- | 12 | 14.3%
na | 10 | 11.9%
2천만원 | 8 | 9.5%
3천만원 | 8 | 9.5%
5천만원 | 8 | 9.5%
1천만원 | 7 | 8.3%

20
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 6.9%
Missing: 0
Missing (%): 0.0%
Memory size: 364.0 B

Length

Max length: 4
Median length: 2
Mean length: 2.6896552
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20
2nd row: 20
3rd row: 20
4th row: 20
5th row: 20

Common Values

Value | Count | Frequency (%)
20 | 19 | 65.5%
<NA> | 10 | 34.5%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
20 | 19 | 65.5%
na | 10 | 34.5%

4040000
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 20.7%
Missing: 0
Missing (%): 0.0%
Memory size: 364.0 B

Length

Max length: 11
Median length: 10
Mean length: 7.9655172
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 10600000.0
2nd row: 20777777.78
3rd row: 32500000.0
4th row: 60000000.0
5th row: 4040000.0

Common Values

Value | Count | Frequency (%)
<NA> | 10 | 34.5%
10600000.0 | 4 | 13.8%
20777777.78 | 4 | 13.8%
32500000.0 | 4 | 13.8%
60000000.0 | 4 | 13.8%
4040000.0 | 3 | 10.3%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 10 | 34.5%
10600000.0 | 4 | 13.8%
20777777.78 | 4 | 13.8%
32500000.0 | 4 | 13.8%
60000000.0 | 4 | 13.8%
4040000.0 | 3 | 10.3%

2022-09
Date

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 5.3%
Missing: 10
Missing (%): 34.5%
Memory size: 364.0 B
Minimum: 2022-09-01 00:00:00
Maximum: 2022-09-01 00:00:00

Figure: Histogram with fixed size bins (bins=1)

Correlations

 | 통합 | 전체 | 전국 | 1천만원 미만 | 4040000
통합 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000
전체 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000
전국 | 1.000 | 0.000 | 1.000 | 0.000 | 0.000
1천만원 미만 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000
4040000 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000

 | 전체 | 전국 | 1천만원 미만 | 20 | 4040000 | 통합
전체 | 1.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.939
전국 | 0.000 | 1.000 | 0.000 | 1.000 | 0.000 | 0.939
1천만원 미만 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000 | 0.000
20 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
4040000 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000 | 0.000
통합 | 0.939 | 0.939 | 0.000 | 1.000 | 0.000 | 1.000

 | 통합 | 전체 | 전국 | 1천만원 미만 | 20 | 4040000
통합 | 1.000 | 0.939 | 0.939 | 0.000 | 1.000 | 0.000
전체 | 0.939 | 1.000 | 0.000 | 0.000 | 1.000 | 0.000
전국 | 0.939 | 0.000 | 1.000 | 0.000 | 1.000 | 0.000
1천만원 미만 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000
20 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
4040000 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000
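
The matrices above give pairwise association scores between columns. One common measure for pairs of categorical columns is Cramér's V. The sketch below is an illustration only: it implements the plain, uncorrected statistic, and it is not claimed to reproduce the report's exact method, which may use a different or bias-corrected variant. The function name `cramers_v` is hypothetical; the `bins` and `amounts` lists are the first ten rows of the 1천만원 미만 and 4040000 columns as shown in the Sample section.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length sequences of categories."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared over the full contingency table, including empty cells.
    chi2 = 0.0
    for r, r_count in row_totals.items():
        for c, c_count in col_totals.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association information
    return math.sqrt(chi2 / (n * k))


# First ten rows of two columns, copied from the Sample section of this report.
bins = [
    "1천만원 이상 - 2천만원 미만", "2천만원 이상 - 3천만원 미만",
    "3천만원 이상 - 5천만원 미만", "5천만원 이상", "1천만원 미만",
    "1천만원 이상 - 2천만원 미만", "2천만원 이상 - 3천만원 미만",
    "3천만원 이상 - 5천만원 미만", "5천만원 이상", "1천만원 미만",
]
amounts = [
    "10600000.0", "20777777.78", "32500000.0", "60000000.0", "4040000.0",
    "10600000.0", "20777777.78", "32500000.0", "60000000.0", "4040000.0",
]

v_paired = cramers_v(bins, amounts)

# Toy data (hypothetical) in which the two columns are independent.
v_independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])

print(round(v_paired, 3), round(v_independent, 3))
```

Because each bracket in `bins` maps to exactly one value in `amounts`, the paired score comes out at 1, in line with the 1.000 entries for that pair in the matrices.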

Missing values

Figure: A simple visualization of nullity by column.
Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.
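
Both figures are derived from a per-cell "is this value missing?" test. A minimal sketch of that idea on three rows taken from the Sample section, assuming that `<NA>` in the six categorical columns is a literal string while the 2022-09 column holds a true missing value (modelled here as `None`); the names `null_counts` and `nullity_matrix` are illustrative:

```python
columns = ["통합", "전체", "전국", "1천만원 미만", "20", "4040000", "2022-09"]

# Rows 0, 4 and 19 of the Sample section; None marks a truly missing cell (assumption).
rows = [
    ("통합", "전체", "전국", "1천만원 이상 - 2천만원 미만", "20", "10600000.0", "2022-09"),
    ("지역", "전체", "서울특별시", "1천만원 미만", "20", "4040000.0", "2022-09"),
    ("<NA>", "<NA>", "<NA>", "<NA>", "<NA>", "<NA>", None),
]

# Nullity by column: number of missing cells per column.
null_counts = {
    name: sum(row[i] is None for row in rows) for i, name in enumerate(columns)
}

# Nullity matrix: True where a value is present, False where it is missing.
nullity_matrix = [[value is not None for value in row] for row in rows]

print(null_counts)
```

Under this assumption only 2022-09 registers missing values, which is consistent with the report listing 10 missing cells in total and 0 missing for every categorical column.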

Sample

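Index | 통합 | 전체 | 전국 | 1천만원 미만 | 20 | 4040000 | 2022-09
0 | 통합 | 전체 | 전국 | 1천만원 이상 - 2천만원 미만 | 20 | 10600000.0 | 2022-09
1 | 통합 | 전체 | 전국 | 2천만원 이상 - 3천만원 미만 | 20 | 20777777.78 | 2022-09
2 | 통합 | 전체 | 전국 | 3천만원 이상 - 5천만원 미만 | 20 | 32500000.0 | 2022-09
3 | 통합 | 전체 | 전국 | 5천만원 이상 | 20 | 60000000.0 | 2022-09
4 | 지역 | 전체 | 서울특별시 | 1천만원 미만 | 20 | 4040000.0 | 2022-09
5 | 지역 | 전체 | 서울특별시 | 1천만원 이상 - 2천만원 미만 | 20 | 10600000.0 | 2022-09
6 | 지역 | 전체 | 서울특별시 | 2천만원 이상 - 3천만원 미만 | 20 | 20777777.78 | 2022-09
7 | 지역 | 전체 | 서울특별시 | 3천만원 이상 - 5천만원 미만 | 20 | 32500000.0 | 2022-09
8 | 지역 | 전체 | 서울특별시 | 5천만원 이상 | 20 | 60000000.0 | 2022-09
9 | 업종 | 유통업 | 전국 | 1천만원 미만 | 20 | 4040000.0 | 2022-09

Index | 통합 | 전체 | 전국 | 1천만원 미만 | 20 | 4040000 | 2022-09
19 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
20 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
21 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
22 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
23 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
24 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
25 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
26 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
27 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
28 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>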

Duplicate rows

Most frequently occurring

Index | 통합 | 전체 | 전국 | 1천만원 미만 | 20 | 4040000 | 2022-09 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 10
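
The table lists one distinct row that occurs 10 times. A minimal sketch of how such a tally can be produced, assuming rows are compared as whole tuples; the two non-empty rows are rows 0 and 4 of the Sample section, and the names `row_counts` and `duplicate_groups` are illustrative:

```python
from collections import Counter

# The all-<NA> row from the tail of the sample; None stands in for the missing date (assumption).
na_row = ("<NA>",) * 6 + (None,)

rows = [
    ("통합", "전체", "전국", "1천만원 이상 - 2천만원 미만", "20", "10600000.0", "2022-09"),
    ("지역", "전체", "서울특별시", "1천만원 미만", "20", "4040000.0", "2022-09"),
] + [na_row] * 10

# Count identical rows and keep only those that occur more than once.
row_counts = Counter(rows)
duplicate_groups = {row: count for row, count in row_counts.items() if count > 1}

for row, count in duplicate_groups.items():
    print(row, count)
```

Read this way, a single duplicated row with ten occurrences would account for both the "# duplicates" value of 10 here and the single duplicate row counted in the Overview, though the report itself does not state how the two figures are related.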