Overview

Dataset statistics

Number of variables: 8
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.9 KiB
Average record size in memory: 70.3 B
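
The two memory figures can be cross-checked against each other: the average record size times the number of observations should come out close to the total size. A minimal sketch, assuming 1 KiB = 1024 bytes and that both figures are rounded for display:

```python
# Relate "Average record size in memory" to "Total size in memory".
# Assumption: 1 KiB = 1024 bytes; both report figures are rounded.
n_observations = 100
avg_record_bytes = 70.3

total_bytes = n_observations * avg_record_bytes
total_kib = total_bytes / 1024

print(f"{total_bytes:.0f} B is about {total_kib:.1f} KiB")
```

The small gap between 70.3 B and 6.9 KiB / 100 is consistent with display rounding.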

Variable types

Numeric: 5
Categorical: 3

Dataset

Description: Sample data
Author: 지디에스컨설팅그룹 (GDS Consulting Group)
URL: https://www.bigdata-environment.kr/user/data_market/detail.do?id=d5a6d330-2dfc-11ea-9c1b-71bfb969ab02

Alerts

법정동코드 (legal district code) is highly overall correlated with 업종코드 and 1 other field [High correlation]
업종코드 (business category code) is highly overall correlated with 법정동코드 and 3 other fields [High correlation]
금액 (amount) is highly overall correlated with 업종코드 [High correlation]
업종명 (business category name) is highly overall correlated with 법정동코드 and 2 other fields [High correlation]
성별 (sex) is highly overall correlated with 업종코드 and 2 other fields [High correlation]
연령 (age group) is highly overall correlated with 성별 [High correlation]

Reproduction

Analysis started: 2023-12-10 13:14:27.810347
Analysis finished: 2023-12-10 13:14:32.253625
Duration: 4.44 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
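
The reported duration follows directly from the two timestamps. A short sketch that recomputes it, using only the start and finish values shown above:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2023-12-10 13:14:27.810347")
finished = datetime.fromisoformat("2023-12-10 13:14:32.253625")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```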

Variables

법정동코드 (legal district code)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9
Distinct (%): 9.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11110109
Minimum: 11110106
Maximum: 11110117
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 11110106
5-th percentile: 11110107
Q1: 11110107
Median: 11110108
Q3: 11110112
95-th percentile: 11110114
Maximum: 11110117
Range: 11
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.0158504
Coefficient of variation (CV): 2.7145101 × 10^-7
Kurtosis: -0.13549702
Mean: 11110109
Median Absolute Deviation (MAD): 1
Skewness: 0.9715702
Sum: 1.1110109 × 10^9
Variance: 9.0953535
Monotonicity: Increasing
Histogram with fixed size bins (bins=9)
Common values

Value       Count   Frequency (%)
11110107    41      41.0%
11110112    19      19.0%
11110108    17      17.0%
11110113    8       8.0%
11110117    5       5.0%
11110106    4       4.0%
11110114    3       3.0%
11110109    2       2.0%
11110110    1       1.0%

Minimum 10 values

Value       Count   Frequency (%)
11110106    4       4.0%
11110107    41      41.0%
11110108    17      17.0%
11110109    2       2.0%
11110110    1       1.0%
11110112    19      19.0%
11110113    8       8.0%
11110114    3       3.0%
11110117    5       5.0%

Maximum 10 values

Value       Count   Frequency (%)
11110117    5       5.0%
11110114    3       3.0%
11110113    8       8.0%
11110112    19      19.0%
11110110    1       1.0%
11110109    2       2.0%
11110108    17      17.0%
11110107    41      41.0%
11110106    4       4.0%
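
Because the frequency table lists every distinct value of 법정동코드, the descriptive statistics can be recomputed from it. The sketch below assumes the report uses the sample variance (dividing by n - 1), which is what reproduces the printed figures; the variable names are illustrative only.

```python
import statistics

# Value -> count, copied from the frequency table above.
counts = {
    11110106: 4, 11110107: 41, 11110108: 17, 11110109: 2, 11110110: 1,
    11110112: 19, 11110113: 8, 11110114: 3, 11110117: 5,
}
# Expand the table back into one value per observation.
values = [value for value, count in counts.items() for _ in range(count)]

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(values)      # square root of the sample variance
cv = std_dev / mean                     # coefficient of variation

print(len(values), sum(values))
print(mean, variance, std_dev, cv)
```

The coefficient of variation is tiny because the codes differ only in their last two digits while the mean is on the order of 10^7.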

년월 (year-month)
Real number (ℝ)

Distinct: 12
Distinct (%): 12.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1884.98
Minimum: 1810
Maximum: 1909
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1810
5-th percentile: 1810
Q1: 1901
Median: 1905
Q3: 1908
95-th percentile: 1909
Maximum: 1909
Range: 99
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 39.743618
Coefficient of variation (CV): 0.021084371
Kurtosis: -0.13037503
Mean: 1884.98
Median Absolute Deviation (MAD): 3
Skewness: -1.3617937
Sum: 188498
Variance: 1579.5552
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Common values

Value               Count   Frequency (%)
1909                16      16.0%
1906                13      13.0%
1810                12      12.0%
1908                10      10.0%
1907                9       9.0%
1904                8       8.0%
1905                8       8.0%
1811                6       6.0%
1901                5       5.0%
1902                5       5.0%
Other values (2)    8       8.0%

Minimum 10 values

Value   Count   Frequency (%)
1810    12      12.0%
1811    6       6.0%
1812    4       4.0%
1901    5       5.0%
1902    5       5.0%
1903    4       4.0%
1904    8       8.0%
1905    8       8.0%
1906    13      13.0%
1907    9       9.0%

Maximum 10 values

Value   Count   Frequency (%)
1909    16      16.0%
1908    10      10.0%
1907    9       9.0%
1906    13      13.0%
1905    8       8.0%
1904    8       8.0%
1903    4       4.0%
1902    5       5.0%
1901    5       5.0%
1812    4       4.0%
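
The 12 distinct 년월 values look like two-digit-year, two-digit-month codes (YYMM). The report does not say so; it is only an assumption suggested by the values. If it holds, the numeric statistics above (for example a Range of 99) describe the encoding rather than elapsed time. The sketch below uses a hypothetical helper, `parse_yymm`, to check that the codes form 12 consecutive months.

```python
# Assumption: 년월 values encode YYMM (e.g. 1810 -> October 2018).
codes = [1810, 1811, 1812, 1901, 1902, 1903, 1904, 1905, 1906, 1907, 1908, 1909]

def parse_yymm(code: int) -> tuple[int, int]:
    """Split a YYMM code into (year, month), assuming years 2000-2099."""
    year, month = divmod(code, 100)
    if not 1 <= month <= 12:
        raise ValueError(f"{code} is not a valid YYMM code")
    return 2000 + year, month

months = [parse_yymm(code) for code in codes]
# Count months on a single axis so consecutive months differ by exactly 1.
ordinals = [year * 12 + (month - 1) for year, month in months]
gaps = [later - earlier for earlier, later in zip(ordinals, ordinals[1:])]

print(months[0], months[-1])
print(all(gap == 1 for gap in gaps))
```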

업종코드 (business category code)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8
Distinct (%): 8.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4403.88
Minimum: 2002
Maximum: 8201
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 2002
5-th percentile: 2004
Q1: 2104
Median: 3401
Q3: 8201
95-th percentile: 8201
Maximum: 8201
Range: 6199
Interquartile range (IQR): 6097

Descriptive statistics

Standard deviation: 2571.7261
Coefficient of variation (CV): 0.58396825
Kurtosis: -1.4026567
Mean: 4403.88
Median Absolute Deviation (MAD): 1297
Skewness: 0.57061428
Sum: 440388
Variance: 6613775
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
Common values

Value   Count   Frequency (%)
2104    30      30.0%
8201    27      27.0%
5610    11      11.0%
2499    11      11.0%
4112    9       9.0%
2004    7       7.0%
3401    4       4.0%
2002    1       1.0%

Minimum 10 values

Value   Count   Frequency (%)
2002    1       1.0%
2004    7       7.0%
2104    30      30.0%
2499    11      11.0%
3401    4       4.0%
4112    9       9.0%
5610    11      11.0%
8201    27      27.0%

Maximum 10 values

Value   Count   Frequency (%)
8201    27      27.0%
5610    11      11.0%
4112    9       9.0%
3401    4       4.0%
2499    11      11.0%
2104    30      30.0%
2004    7       7.0%
2002    1       1.0%
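
The median and the Median Absolute Deviation (MAD) of 업종코드 can be recomputed from the frequency table. This sketch assumes the unscaled definition of MAD, the median of absolute deviations from the median, which is what matches the reported value.

```python
import statistics

# Value -> count, copied from the frequency table above.
counts = {2002: 1, 2004: 7, 2104: 30, 2499: 11, 3401: 4, 4112: 9, 5610: 11, 8201: 27}
values = [value for value, count in counts.items() for _ in range(count)]

median = statistics.median(values)
# Unscaled MAD: median of |x - median(x)| (assumed definition).
mad = statistics.median(abs(value - median) for value in values)

print(sum(values), median, mad)
```

Since these values are category codes rather than measurements, the median and MAD say more about how the codes are numbered than about the businesses themselves.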

업종명 (business category name)
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 8.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

한식 (Korean food): 30
용역서비스업 (contract services): 27
주차장 (parking lot): 11
기타 식품 (other food): 11
편의점 (convenience store): 9
Other values (3): 12

Length

Max length: 6
Median length: 5
Mean length: 3.93
Min length: 2

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 한식 (Korean food)
2nd row: 한식 (Korean food)
3rd row: 한식 (Korean food)
4th row: 한식 (Korean food)
5th row: 용역서비스업 (contract services)

Common Values

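Value                               Count   Frequency (%)
한식 (Korean food)                  30      30.0%
용역서비스업 (contract services)    27      27.0%
주차장 (parking lot)                11      11.0%
기타 식품 (other food)              11      11.0%
편의점 (convenience store)          9       9.0%
커피전문점 (coffee shop)            7       7.0%
건축자재 (building materials)       4       4.0%
휴게음식점 (snack bar)              1       1.0%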

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of the most common values]

Words

Value                               Count   Frequency (%)
한식 (Korean food)                  30      27.0%
용역서비스업 (contract services)    27      24.3%
주차장 (parking lot)                11      9.9%
기타 (other)                        11      9.9%
식품 (food)                         11      9.9%
편의점 (convenience store)          9       8.1%
커피전문점 (coffee shop)            7       6.3%
건축자재 (building materials)       4       3.6%
휴게음식점 (snack bar)              1       0.9%
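
The percentages in this last table do not match the "Common Values" table. They are consistent with counting whitespace-separated words instead of whole category values, so that 기타 식품 contributes two words and the denominator becomes 111 rather than 100. A minimal sketch of that counting, assuming simple whitespace tokenization:

```python
from collections import Counter

# Category -> count, copied from the "Common Values" table above.
category_counts = {
    "한식": 30, "용역서비스업": 27, "주차장": 11, "기타 식품": 11,
    "편의점": 9, "커피전문점": 7, "건축자재": 4, "휴게음식점": 1,
}

word_counts = Counter()
for category, count in category_counts.items():
    for word in category.split():  # split on whitespace
        word_counts[word] += count

total_words = sum(word_counts.values())
for word, count in word_counts.most_common():
    print(f"{word}\t{count}\t{100 * count / total_words:.1f}%")
```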

성별 (sex)
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

2.여성 (female): 51
1.남성 (male): 43
0.법인 (corporation): 6

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2.여성 (female)
2nd row: 2.여성 (female)
3rd row: 1.남성 (male)
4th row: 2.여성 (female)
5th row: 2.여성 (female)

Common Values

Value                   Count   Frequency (%)
2.여성 (female)         51      51.0%
1.남성 (male)           43      43.0%
0.법인 (corporation)    6       6.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of the most common values]

Words

Value                   Count   Frequency (%)
2.여성 (female)         51      51.0%
1.남성 (male)           43      43.0%
0.법인 (corporation)    6       6.0%

연령 (age group)
Categorical

HIGH CORRELATION 

Distinct: 11
Distinct (%): 11.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

09.50세미만 (under 50): 27
10.55세미만 (under 55): 16
08.45세미만 (under 45): 14
11.60세미만 (under 60): 11
07.40세미만 (under 40): 11
Other values (6): 21

Length

Max length: 8
Median length: 8
Mean length: 7.82
Min length: 5

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 10.55세미만 (under 55)
2nd row: 08.45세미만 (under 45)
3rd row: 12.65세미만 (under 65)
4th row: 10.55세미만 (under 55)
5th row: 11.60세미만 (under 60)

Common Values

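Value                     Count   Frequency (%)
09.50세미만 (under 50)    27      27.0%
10.55세미만 (under 55)    16      16.0%
08.45세미만 (under 45)    14      14.0%
11.60세미만 (under 60)    11      11.0%
07.40세미만 (under 40)    11      11.0%
12.65세미만 (under 65)    6       6.0%
99.기타 (other)           6       6.0%
06.35세미만 (under 35)    4       4.0%
05.30세미만 (under 30)    2       2.0%
13.70세미만 (under 70)    2       2.0%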

Length

Histogram of lengths of the category
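
Words

Value                     Count   Frequency (%)
09.50세미만 (under 50)    27      27.0%
10.55세미만 (under 55)    16      16.0%
08.45세미만 (under 45)    14      14.0%
11.60세미만 (under 60)    11      11.0%
07.40세미만 (under 40)    11      11.0%
12.65세미만 (under 65)    6       6.0%
99.기타 (other)           6       6.0%
06.35세미만 (under 35)    4       4.0%
05.30세미만 (under 30)    2       2.0%
13.70세미만 (under 70)    2       2.0%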

금액 (amount)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 89
Distinct (%): 89.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 62038.56
Minimum: 4700
Maximum: 296000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 4700
5-th percentile: 10355
Q1: 19775
Median: 36305
Q3: 81530
95-th percentile: 203400
Maximum: 296000
Range: 291300
Interquartile range (IQR): 61755

Descriptive statistics

Standard deviation: 63713.476
Coefficient of variation (CV): 1.026998
Kurtosis: 2.320859
Mean: 62038.56
Median Absolute Deviation (MAD): 19305
Skewness: 1.7129248
Sum: 6203856
Variance: 4.059407 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                Count   Frequency (%)
25000                3       3.0%
20000                3       3.0%
16400                2       2.0%
22800                2       2.0%
18500                2       2.0%
50000                2       2.0%
19000                2       2.0%
12500                2       2.0%
89000                2       2.0%
83000                1       1.0%
Other values (79)    79      79.0%

Minimum 10 values

Value    Count   Frequency (%)
4700     1       1.0%
6500     1       1.0%
7900     1       1.0%
8160     1       1.0%
9500     1       1.0%
10400    1       1.0%
11000    1       1.0%
11400    1       1.0%
12500    2       2.0%
13800    1       1.0%

Maximum 10 values

Value     Count   Frequency (%)
296000    1       1.0%
260000    1       1.0%
221000    1       1.0%
219500    1       1.0%
211000    1       1.0%
203000    1       1.0%
201000    1       1.0%
200000    1       1.0%
193000    1       1.0%
162400    1       1.0%
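
Several of the 금액 statistics are simple functions of the others, which makes for a quick sanity check on the report. The sketch below recomputes the range, IQR, coefficient of variation, and variance from the printed summary values only; it does not use the underlying data.

```python
# Summary values copied from the 금액 statistics above.
minimum, maximum = 4700, 296000
q1, q3 = 19775, 81530
mean, std_dev = 62038.56, 63713.476

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std_dev / mean              # Coefficient of variation
variance = std_dev ** 2          # Variance is the squared standard deviation

print(value_range, iqr)
print(f"{cv:.6f} {variance:.6e}")
```

A CV slightly above 1 together with a mean well above the median (62038.56 versus 36305) is consistent with the positive skewness reported for this variable.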

이용건수 (usage count)
Real number (ℝ)

Distinct: 9
Distinct (%): 9.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.52
Minimum: 3
Maximum: 13
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 3
5-th percentile: 3
Q1: 3
Median: 4
Q3: 5
95-th percentile: 10.05
Maximum: 13
Range: 10
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 2.2449944
Coefficient of variation (CV): 0.49668018
Kurtosis: 4.3187808
Mean: 4.52
Median Absolute Deviation (MAD): 1
Skewness: 2.0856684
Sum: 452
Variance: 5.04
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Common values

Value   Count   Frequency (%)
3       45      45.0%
4       23      23.0%
5       12      12.0%
6       6       6.0%
7       4       4.0%
8       4       4.0%
11      3       3.0%
13      2       2.0%
10      1       1.0%

Minimum 10 values

Value   Count   Frequency (%)
3       45      45.0%
4       23      23.0%
5       12      12.0%
6       6       6.0%
7       4       4.0%
8       4       4.0%
10      1       1.0%
11      3       3.0%
13      2       2.0%

Maximum 10 values

Value   Count   Frequency (%)
13      2       2.0%
11      3       3.0%
10      1       1.0%
8       4       4.0%
7       4       4.0%
6       6       6.0%
5       12      12.0%
4       23      23.0%
3       45      45.0%
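
The 95-th percentile of 10.05 for 이용건수 is not an observed value, which suggests the quantiles are interpolated. The sketch below rebuilds the 100 observations from the frequency table and applies linear interpolation between the two nearest ranks. That interpolation rule is an assumption; it is the one that reproduces the printed quantiles. The `quantile` helper is hypothetical.

```python
import math

# Value -> count, copied from the frequency table above.
counts = {3: 45, 4: 23, 5: 12, 6: 6, 7: 4, 8: 4, 10: 1, 11: 3, 13: 2}
values = sorted(value for value, count in counts.items() for _ in range(count))

def quantile(sorted_values, q):
    """Quantile with linear interpolation between the two nearest ranks."""
    position = q * (len(sorted_values) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

for label, q in [("5-th percentile", 0.05), ("Q1", 0.25), ("Median", 0.5),
                 ("Q3", 0.75), ("95-th percentile", 0.95)]:
    print(label, quantile(values, q))
```

For q = 0.95 the position is 0.95 × 99 = 94.05, which falls between the single observation of 10 and the first observation of 11.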

Interactions

[Figures: 25 pairwise interaction plots]

Correlations

              법정동코드   년월     업종코드   업종명   성별     연령     금액     이용건수
법정동코드    1.000        0.000    0.756      0.927    0.515    0.424    0.443    0.429
년월          0.000        1.000    0.000      0.000    0.000    0.000    0.300    0.026
업종코드      0.756        0.000    1.000      1.000    0.656    0.653    0.796    0.153
업종명        0.927        0.000    1.000      1.000    0.748    0.626    0.590    0.479
성별          0.515        0.000    0.656      0.748    1.000    0.848    0.643    0.288
연령          0.424        0.000    0.653      0.626    0.848    1.000    0.511    0.000
금액          0.443        0.300    0.796      0.590    0.643    0.511    1.000    0.417
이용건수      0.429        0.026    0.153      0.479    0.288    0.000    0.417    1.000

          성별     연령     업종명
성별      1.000    0.717    0.633
연령      0.717    1.000    0.350
업종명    0.633    0.350    1.000

              법정동코드   년월      업종코드   금액      이용건수   업종명   성별     연령
법정동코드    1.000        0.151     -0.613     0.387     0.027      0.565    0.355    0.213
년월          0.151        1.000     -0.189     0.067     0.068      0.033    0.101    0.000
업종코드      -0.613       -0.189    1.000      -0.606    0.036      0.984    0.619    0.417
금액          0.387        0.067     -0.606     1.000     0.200      0.328    0.474    0.239
이용건수      0.027        0.068     0.036      0.200     1.000      0.265    0.145    0.000
업종명        0.565        0.033     0.984      0.328     0.265      1.000    0.633    0.350
성별          0.355        0.101     0.619      0.474     0.145      0.633    1.000    0.717
연령          0.213        0.000     0.417      0.239     0.000      0.350    0.717    1.000
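
The "High correlation" entries in the Alerts section can be reproduced from the last matrix above by flagging every pair whose absolute correlation reaches a cut-off. The report does not state the cut-off; 0.5 is an assumption that happens to reproduce all six alerts. The `high_correlations` helper is hypothetical.

```python
# Last correlation matrix above; rows and columns share the same order.
columns = ["법정동코드", "년월", "업종코드", "금액", "이용건수", "업종명", "성별", "연령"]
matrix = [
    [1.000, 0.151, -0.613, 0.387, 0.027, 0.565, 0.355, 0.213],
    [0.151, 1.000, -0.189, 0.067, 0.068, 0.033, 0.101, 0.000],
    [-0.613, -0.189, 1.000, -0.606, 0.036, 0.984, 0.619, 0.417],
    [0.387, 0.067, -0.606, 1.000, 0.200, 0.328, 0.474, 0.239],
    [0.027, 0.068, 0.036, 0.200, 1.000, 0.265, 0.145, 0.000],
    [0.565, 0.033, 0.984, 0.328, 0.265, 1.000, 0.633, 0.350],
    [0.355, 0.101, 0.619, 0.474, 0.145, 0.633, 1.000, 0.717],
    [0.213, 0.000, 0.417, 0.239, 0.000, 0.350, 0.717, 1.000],
]
THRESHOLD = 0.5  # assumed cut-off; not stated in the report

def high_correlations(columns, matrix, threshold):
    """Map each column to the other columns with |correlation| >= threshold."""
    result = {}
    for i, name in enumerate(columns):
        partners = [
            columns[j]
            for j, value in enumerate(matrix[i])
            if j != i and abs(value) >= threshold
        ]
        if partners:
            result[name] = partners
    return result

alerts = high_correlations(columns, matrix, THRESHOLD)
for name, partners in alerts.items():
    print(f"{name}: {len(partners)} field(s): {', '.join(partners)}")
```

Using the absolute value matters here: 법정동코드 and 금액 are flagged against 업종코드 because of negative coefficients (-0.613 and -0.606).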

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

      법정동코드   년월    업종코드   업종명          성별      연령           금액      이용건수
0     11110106     1909    2104       한식            2.여성    10.55세미만    83000     5
1     11110106     1907    2104       한식            2.여성    08.45세미만    18500     3
2     11110106     1903    2104       한식            1.남성    12.65세미만    84000     3
3     11110106     1810    2104       한식            2.여성    10.55세미만    50500     3
4     11110107     1906    8201       용역서비스업    2.여성    11.60세미만    11400     3
5     11110107     1903    8201       용역서비스업    1.남성    08.45세미만    25000     3
6     11110107     1904    5610       주차장          1.남성    11.60세미만    4700      3
7     11110107     1909    8201       용역서비스업    2.여성    07.40세미만    15000     3
8     11110107     1907    8201       용역서비스업    1.남성    11.60세미만    6500      3
9     11110107     1901    5610       주차장          1.남성    10.55세미만    10400     4

Last rows

      법정동코드   년월    업종코드   업종명          성별      연령           금액      이용건수
90    11110113     1810    4112       편의점          2.여성    10.55세미만    19000     3
91    11110113     1909    4112       편의점          2.여성    10.55세미만    20420     3
92    11110114     1908    2004       커피전문점      2.여성    06.35세미만    21900     3
93    11110114     1909    2002       휴게음식점      2.여성    12.65세미만    19000     3
94    11110114     1909    2104       한식            2.여성    10.55세미만    200000    4
95    11110117     1811    2104       한식            2.여성    05.30세미만    49500     4
96    11110117     1907    4112       편의점          1.남성    09.50세미만    44550     11
97    11110117     1907    2104       한식            2.여성    06.35세미만    41000     4
98    11110117     1907    2104       한식            1.남성    11.60세미만    138000    4
99    11110117     1906    4112       편의점          1.남성    09.50세미만    22800     4