Overview

Dataset statistics

Number of variables: 7
Number of observations: 30
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.9 KiB
Average record size in memory: 65.4 B

Variable types

Numeric: 2
Categorical: 5

Dataset

Description: Sample data
Author: 경기도경제과학진흥원 (Gyeonggido Business & Science Accelerator)
URL: https://bigdata-region.kr/#/dataset/cf6c467e-18b9-4d3a-bd10-5e8473434d36

Alerts

성별코드 is highly overall correlated with 분석인덱스 and 3 other fields (High correlation)
가맹점우편번호 is highly overall correlated with 분석인덱스 and 3 other fields (High correlation)
분석인덱스 is highly overall correlated with 가맹점우편번호 and 3 other fields (High correlation)
연령대코드 is highly overall correlated with 분석인덱스 and 3 other fields (High correlation)
결제수 is highly overall correlated with 분석인덱스 and 3 other fields (High correlation)
분석인덱스 has unique values (Unique)
분석인덱스 has 1 (3.3%) zeros (Zeros)

Reproduction

Analysis started: 2023-12-10 13:56:42.857133
Analysis finished: 2023-12-10 13:56:44.418922
Duration: 1.56 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

분석인덱스 (analysis index)
Real number (ℝ)

HIGH CORRELATION, UNIQUE, ZEROS

Distinct: 30
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.5
Minimum: 0
Maximum: 29
Zeros: 1
Zeros (%): 3.3%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 0
5-th percentile: 1.45
Q1: 7.25
Median: 14.5
Q3: 21.75
95-th percentile: 27.55
Maximum: 29
Range: 29
Interquartile range (IQR): 14.5
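
The quantile figures above can be checked by hand. The following is a minimal sketch, assuming the column holds exactly the integers 0 through 29 (suggested by the distinct count, minimum, maximum, and sum, but not stated outright in the report) and assuming linear interpolation between order statistics, which Python's `statistics.quantiles` provides with `method="inclusive"`. The variable names are illustrative only.

```python
import statistics

# Assumption: 분석인덱스 holds the integers 0..29 (30 distinct values, min 0, max 29).
values = list(range(30))

# "inclusive" interpolates linearly between order statistics,
# treating the data as covering the full range of the population.
quartiles = statistics.quantiles(values, n=4, method="inclusive")
vigintiles = statistics.quantiles(values, n=20, method="inclusive")

q1, median, q3 = quartiles
p5, p95 = vigintiles[0], vigintiles[-1]
iqr = q3 - q1
value_range = max(values) - min(values)

print(f"5%={p5} Q1={q1} median={median} Q3={q3} 95%={p95}")
print(f"range={value_range} IQR={iqr}")
```

Under these assumptions the sketch yields the same quartiles, percentiles, range, and IQR as the report.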

Descriptive statistics

Standard deviation: 8.8034084
Coefficient of variation (CV): 0.60713162
Kurtosis: -1.2
Mean: 14.5
Median Absolute Deviation (MAD): 7.5
Skewness: 0
Sum: 435
Variance: 77.5
Monotonicity: Strictly increasing

[Figure: histogram with fixed size bins (bins=30)]
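
As a sanity check on the descriptive statistics for 분석인덱스, here is a small sketch using only the standard library. It assumes the column is the integer sequence 0..29 and that the reported variance is the sample variance (divisor n - 1); both are inferences from the figures above rather than statements in the report.

```python
import math
import statistics

# Assumption: 분석인덱스 is the integer sequence 0..29.
values = list(range(30))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (divides by n - 1)
std_dev = math.sqrt(variance)
cv = std_dev / mean                      # coefficient of variation

# Median absolute deviation: median of |x - median(x)|.
median = statistics.median(values)
mad = statistics.median([abs(v - median) for v in values])

print(f"sum={total} mean={mean} variance={variance}")
print(f"std={std_dev:.7f} cv={cv:.8f} mad={mad}")
```

With the n - 1 divisor the variance comes out to 77.5, matching the report; a population variance (divisor n) would give a smaller value, which suggests the report uses the sample form.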

Common Values

Value | Count | Frequency (%)
0 | 1 | 3.3%
16 | 1 | 3.3%
29 | 1 | 3.3%
28 | 1 | 3.3%
27 | 1 | 3.3%
26 | 1 | 3.3%
25 | 1 | 3.3%
24 | 1 | 3.3%
23 | 1 | 3.3%
22 | 1 | 3.3%
Other values (20) | 20 | 66.7%

Minimum 10 values

Value | Count | Frequency (%)
0 | 1 | 3.3%
1 | 1 | 3.3%
2 | 1 | 3.3%
3 | 1 | 3.3%
4 | 1 | 3.3%
5 | 1 | 3.3%
6 | 1 | 3.3%
7 | 1 | 3.3%
8 | 1 | 3.3%
9 | 1 | 3.3%

Maximum 10 values

Value | Count | Frequency (%)
29 | 1 | 3.3%
28 | 1 | 3.3%
27 | 1 | 3.3%
26 | 1 | 3.3%
25 | 1 | 3.3%
24 | 1 | 3.3%
23 | 1 | 3.3%
22 | 1 | 3.3%
21 | 1 | 3.3%
20 | 1 | 3.3%

가맹점우편번호 (merchant postal code)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 6.7%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

10200: 18
10125: 12

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 10125
2nd row: 10125
3rd row: 10125
4th row: 10125
5th row: 10125

Common Values

Value | Count | Frequency (%)
10200 | 18 | 60.0%
10125 | 12 | 40.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot; image not preserved]

Words

Value | Count | Frequency (%)
10200 | 18 | 60.0%
10125 | 12 | 40.0%

성별코드 (gender code)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 6.7%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

F: 18
M: 12

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: M
2nd row: M
3rd row: M
4th row: M
5th row: M

Common Values

Value | Count | Frequency (%)
F | 18 | 60.0%
M | 12 | 40.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot; image not preserved]

Words (lowercased)

Value | Count | Frequency (%)
f | 18 | 60.0%
m | 12 | 40.0%

연령대코드 (age group code)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

40: 14
20: 8
30: 8

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 40
2nd row: 40
3rd row: 40
4th row: 40
5th row: 40

Common Values

Value | Count | Frequency (%)
40 | 14 | 46.7%
20 | 8 | 26.7%
30 | 8 | 26.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot; image not preserved]

Words

Value | Count | Frequency (%)
40 | 14 | 46.7%
20 | 8 | 26.7%
30 | 8 | 26.7%

결제수 (payment count)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

1: 20
10: 8
5: 2

Length

Max length: 2
Median length: 1
Mean length: 1.2666667
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 20 | 66.7%
10 | 8 | 26.7%
5 | 2 | 6.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot; image not preserved]

Words

Value | Count | Frequency (%)
1 | 20 | 66.7%
10 | 8 | 26.7%
5 | 2 | 6.7%

업종대분류명 (industry major category name)
Categorical

Distinct: 13
Distinct (%): 43.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

C제조업: 4
E하수·폐기물처리;원료재생및환경복원업: 3
F건설업: 3
G도매및소매업: 3
H운수업: 3
Other values (8): 14

Length

Max length: 20
Median length: 18
Mean length: 8.8333333
Min length: 4

Unique

Unique: 5
Unique (%): 16.7%

Sample

1st row: C제조업
2nd row: E하수·폐기물처리;원료재생및환경복원업
3rd row: F건설업
4th row: G도매및소매업
5th row: H운수업

Common Values

Value | Count | Frequency (%)
C제조업 | 4 | 13.3%
E하수·폐기물처리;원료재생및환경복원업 | 3 | 10.0%
F건설업 | 3 | 10.0%
G도매및소매업 | 3 | 10.0%
H운수업 | 3 | 10.0%
I숙박및음식점업 | 3 | 10.0%
P교육서비스업 | 3 | 10.0%
A농업;임업및어업 | 3 | 10.0%
L부동산업및임대업 | 1 | 3.3%
M전문;과학및기술서비스업 | 1 | 3.3%
Other values (3) | 3 | 10.0%

Length

[Figure: histogram of lengths of the category]

Words (lowercased)

Value | Count | Frequency (%)
c제조업 | 4 | 13.3%
e하수·폐기물처리;원료재생및환경복원업 | 3 | 10.0%
f건설업 | 3 | 10.0%
g도매및소매업 | 3 | 10.0%
h운수업 | 3 | 10.0%
i숙박및음식점업 | 3 | 10.0%
p교육서비스업 | 3 | 10.0%
a농업;임업및어업 | 3 | 10.0%
l부동산업및임대업 | 1 | 3.3%
m전문;과학및기술서비스업 | 1 | 3.3%
Other values (3) | 3 | 10.0%

상가수 (number of stores)
Real number (ℝ)

Distinct: 13
Distinct (%): 43.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9
Minimum: 1
Maximum: 69
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1.25
Median: 4
Q3: 8.75
95-th percentile: 35.35
Maximum: 69
Range: 68
Interquartile range (IQR): 7.5

Descriptive statistics

Standard deviation: 14.895035
Coefficient of variation (CV): 1.6550039
Kurtosis: 10.819319
Mean: 9
Median Absolute Deviation (MAD): 3
Skewness: 3.2427727
Sum: 270
Variance: 221.86207
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=13)]

Common Values

Value | Count | Frequency (%)
1 | 8 | 26.7%
3 | 4 | 13.3%
14 | 3 | 10.0%
4 | 3 | 10.0%
7 | 3 | 10.0%
2 | 2 | 6.7%
15 | 1 | 3.3%
5 | 1 | 3.3%
52 | 1 | 3.3%
9 | 1 | 3.3%
Other values (3) | 3 | 10.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 8 | 26.7%
2 | 2 | 6.7%
3 | 4 | 13.3%
4 | 3 | 10.0%
5 | 1 | 3.3%
7 | 3 | 10.0%
8 | 1 | 3.3%
9 | 1 | 3.3%
13 | 1 | 3.3%
14 | 3 | 10.0%

Maximum 10 values

Value | Count | Frequency (%)
69 | 1 | 3.3%
52 | 1 | 3.3%
15 | 1 | 3.3%
14 | 3 | 10.0%
13 | 1 | 3.3%
9 | 1 | 3.3%
8 | 1 | 3.3%
7 | 3 | 10.0%
5 | 1 | 3.3%
4 | 3 | 10.0%
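
Because 상가수 has only 13 distinct values, the full column can be rebuilt from the value counts listed for it. The sketch below merges the counts from those frequency tables into one dictionary (the name `value_counts` is illustrative, and the merge assumes the three tables together cover every distinct value), then recomputes the count, sum, mean, sample variance, and coefficient of variation for comparison with the summary figures.

```python
import math
import statistics

# Counts per distinct value, merged from the frequency tables for 상가수.
# Assumption: these 13 entries are the complete set of distinct values.
value_counts = {
    1: 8, 2: 2, 3: 4, 4: 3, 5: 1, 7: 3, 8: 1,
    9: 1, 13: 1, 14: 3, 15: 1, 52: 1, 69: 1,
}

# Expand the counts back into a sorted list of 30 observations.
values = [v for v, count in sorted(value_counts.items()) for _ in range(count)]

n = len(values)
total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (n - 1 divisor)
cv = math.sqrt(variance) / mean

print(f"n={n} distinct={len(value_counts)} sum={total} mean={mean}")
print(f"variance={variance:.5f} cv={cv:.7f}")
```

The reconstructed list has 30 observations summing to 270, which agrees with the reported sum and mean and supports reading the tables as a complete enumeration.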

Interactions

[Figures: four interaction plots; images not preserved]

Correlations

[Figure: correlation plot; image not preserved]

 | 분석인덱스 | 가맹점우편번호 | 성별코드 | 연령대코드 | 결제수 | 업종대분류명 | 상가수
분석인덱스 | 1.000 | 1.000 | 1.000 | 0.920 | 0.862 | 0.000 | 0.396
가맹점우편번호 | 1.000 | 1.000 | 0.994 | 0.594 | 0.333 | 0.000 | 0.344
성별코드 | 1.000 | 0.994 | 1.000 | 0.594 | 0.333 | 0.000 | 0.344
연령대코드 | 0.920 | 0.594 | 0.594 | 1.000 | 0.941 | 0.000 | 0.000
결제수 | 0.862 | 0.333 | 0.333 | 0.941 | 1.000 | 0.000 | 0.000
업종대분류명 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000
상가수 | 0.396 | 0.344 | 0.344 | 0.000 | 0.000 | 0.000 | 1.000

[Figure: correlation plot; image not preserved]

 | 성별코드 | 업종대분류명 | 연령대코드 | 결제수 | 가맹점우편번호
성별코드 | 1.000 | 0.000 | 0.847 | 0.523 | 0.928
업종대분류명 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000
연령대코드 | 0.847 | 0.000 | 1.000 | 0.703 | 0.847
결제수 | 0.523 | 0.000 | 0.703 | 1.000 | 0.523
가맹점우편번호 | 0.928 | 0.000 | 0.847 | 0.523 | 1.000
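
The report does not name the measure behind the categorical-only matrix above. One reading consistent with the 0.928 entry for 성별코드 and 가맹점우편번호 is a bias-corrected Cramér's V computed from a chi-squared statistic with Yates' continuity correction. The sketch below is an illustration under that assumption, not the library's confirmed method. It also assumes the two columns coincide row for row (M with 10125 in 12 rows, F with 10200 in 18 rows), which the sample rows and counts suggest but the hidden middle rows cannot confirm. The function name `cramers_v_corrected` is hypothetical.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows).

    Applies Yates' continuity correction when the table has one degree
    of freedom. Assumes every row and column total is positive.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    r, k = len(row_totals), len(col_totals)
    use_yates = (r - 1) * (k - 1) == 1

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if use_yates:
                diff = max(0.0, diff - 0.5)  # shrink each deviation by up to 0.5
            chi2 += diff ** 2 / expected

    # Bias correction of phi^2 and of the table dimensions.
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0

# Assumed table: rows = 성별코드 (M, F), columns = 가맹점우편번호 (10125, 10200).
v = cramers_v_corrected([[12, 0], [0, 18]])
print(round(v, 3))  # 0.928
```

Without the two corrections, a perfectly associated 2x2 table would give exactly 1.0, so the value below 1 in the matrix is what points toward the corrected form.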

[Figure: correlation plot; image not preserved]

 | 분석인덱스 | 상가수 | 가맹점우편번호 | 성별코드 | 연령대코드 | 결제수 | 업종대분류명
분석인덱스 | 1.000 | -0.357 | 0.845 | 0.845 | 0.762 | 0.674 | 0.000
상가수 | -0.357 | 1.000 | 0.391 | 0.391 | 0.000 | 0.000 | 0.000
가맹점우편번호 | 0.845 | 0.391 | 1.000 | 0.928 | 0.847 | 0.523 | 0.000
성별코드 | 0.845 | 0.391 | 0.928 | 1.000 | 0.847 | 0.523 | 0.000
연령대코드 | 0.762 | 0.000 | 0.847 | 0.847 | 1.000 | 0.703 | 0.000
결제수 | 0.674 | 0.000 | 0.523 | 0.523 | 0.703 | 1.000 | 0.000
업종대분류명 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000

Missing values

[Figure: nullity count plot; image not preserved]
A simple visualization of nullity by column.
[Figure: nullity matrix; image not preserved]
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

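First rows

Row | 분석인덱스 | 가맹점우편번호 | 성별코드 | 연령대코드 | 결제수 | 업종대분류명 | 상가수
0 | 0 | 10125 | M | 40 | 1 | C제조업 | 15
1 | 1 | 10125 | M | 40 | 1 | E하수·폐기물처리;원료재생및환경복원업 | 1
2 | 2 | 10125 | M | 40 | 1 | F건설업 | 5
3 | 3 | 10125 | M | 40 | 1 | G도매및소매업 | 52
4 | 4 | 10125 | M | 40 | 1 | H운수업 | 9
5 | 5 | 10125 | M | 40 | 1 | I숙박및음식점업 | 69
6 | 6 | 10125 | M | 40 | 1 | L부동산업및임대업 | 14
7 | 7 | 10125 | M | 40 | 1 | M전문;과학및기술서비스업 | 4
8 | 8 | 10125 | M | 40 | 1 | N사업시설관리및사업지원서비스업 | 1
9 | 9 | 10125 | M | 40 | 1 | P교육서비스업 | 13

Last rows

Row | 분석인덱스 | 가맹점우편번호 | 성별코드 | 연령대코드 | 결제수 | 업종대분류명 | 상가수
20 | 20 | 10200 | F | 30 | 10 | A농업;임업및어업 | 3
21 | 21 | 10200 | F | 30 | 10 | C제조업 | 7
22 | 22 | 10200 | F | 30 | 10 | E하수·폐기물처리;원료재생및환경복원업 | 1
23 | 23 | 10200 | F | 30 | 10 | F건설업 | 1
24 | 24 | 10200 | F | 30 | 10 | G도매및소매업 | 14
25 | 25 | 10200 | F | 30 | 10 | H운수업 | 2
26 | 26 | 10200 | F | 30 | 10 | I숙박및음식점업 | 4
27 | 27 | 10200 | F | 30 | 10 | P교육서비스업 | 1
28 | 28 | 10200 | F | 40 | 5 | A농업;임업및어업 | 3
29 | 29 | 10200 | F | 40 | 5 | C제조업 | 7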