Dataset statistics
Number of variables | 7 |
---|---|
Number of observations | 10000 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 2150 |
Duplicate rows (%) | 21.5% |
Total size in memory | 683.6 KiB |
Average record size in memory | 70.0 B |
Variable types
Categorical | 1 |
---|---|
Numeric | 6 |
Dataset
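Description | Baseline data for analyzing fine dust and floating population across all commercial districts in Seodaemun-gu. It provides, by date and by commercial district, the time slot, floating population, fine dust, and ultrafine dust values. |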
---|---|
Author | Seodaemun-gu, Seoul Metropolitan City (서울특별시 서대문구) |
URL | https://www.data.go.kr/data/15097162/fileData.do |
Dataset has 2150 (21.5%) duplicate rows | Duplicates |
우편번호 is highly overall correlated with 상권코드 | High correlation |
상권코드 is highly overall correlated with 우편번호 and 1 other fields | High correlation |
유동인구 is highly overall correlated with 상권코드 | High correlation |
미세먼지 is highly overall correlated with 초미세먼지 and 1 other fields | High correlation |
초미세먼지 is highly overall correlated with 미세먼지 and 1 other fields | High correlation |
연월일 is highly overall correlated with 미세먼지 and 1 other fields | High correlation |
Reproduction
Analysis started | 2023-12-12 07:52:00.490423 |
---|---|
Analysis finished | 2023-12-12 07:52:06.503843 |
Duration | 6.01 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
연월일
Categorical
HIGH CORRELATION
 
Distinct | 34 |
---|---|
Distinct (%) | 0.3% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
2020-01-13 | 331 |
---|---|
2020-01-30 | 331 |
2020-01-07 | 330 |
2020-01-08 | 325 |
2020-02-01 | 320 |
Other values (29) | 8363 |
Length
Max length | 10 |
---|---|
Median length | 10 |
Mean length | 10 |
Min length | 10 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 2020-01-13 |
---|---|
2nd row | 2020-01-25 |
3rd row | 2020-01-05 |
4th row | 2020-01-31 |
5th row | 2020-01-03 |
Common Values
Value | Count | Frequency (%) |
2020-01-13 | 331 | 3.3% |
2020-01-30 | 331 | 3.3% |
2020-01-07 | 330 | 3.3% |
2020-01-08 | 325 | 3.2% |
2020-02-01 | 320 | 3.2% |
2020-01-11 | 318 | 3.2% |
2020-01-20 | 316 | 3.2% |
2020-01-04 | 315 | 3.1% |
2020-01-23 | 315 | 3.1% |
2020-01-03 | 313 | 3.1% |
Other values (24) | 6786 |
시간
Real number (ℝ)
Distinct | 6 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 3.9169 |
Minimum | 1 |
---|---|
Maximum | 6 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 1 |
Q1 | 3 |
median | 4 |
Q3 | 5 |
95-th percentile | 6 |
Maximum | 6 |
Range | 5 |
Interquartile range (IQR) | 2 |
Descriptive statistics
Standard deviation | 1.4532741 |
---|---|
Coefficient of variation (CV) | 0.3710266 |
Kurtosis | -0.88608407 |
Mean | 3.9169 |
Median Absolute Deviation (MAD) | 1 |
Skewness | -0.26008839 |
Sum | 39169 |
Variance | 2.1120056 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
5 | 2430 | |
3 | 2071 | |
4 | 2065 | |
6 | 1567 | |
2 | 1277 | |
1 | 590 | 5.9% |
Value | Count | Frequency (%) |
1 | 590 | 5.9% |
2 | 1277 | |
3 | 2071 | |
4 | 2065 | |
5 | 2430 | |
6 | 1567 |
Value | Count | Frequency (%) |
6 | 1567 | |
5 | 2430 | |
4 | 2065 | |
3 | 2071 | |
2 | 1277 | |
1 | 590 | 5.9% |
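The summary statistics for 시간 can be cross-checked from the frequency table alone. The following is a minimal sketch using only the Python standard library; the `describe` helper and its key names are illustrative, not part of ydata-profiling. It assumes the report uses the sample (n - 1) variance and an inclusive quantile definition, which is consistent with the figures shown above.

```python
from statistics import mean, median, quantiles, stdev, variance

# Value counts for 시간, copied from the frequency table above.
value_counts = {1: 590, 2: 1277, 3: 2071, 4: 2065, 5: 2430, 6: 1567}

# Expand the counts back into a flat list of observations.
values = [v for v, n in value_counts.items() for _ in range(n)]


def describe(xs):
    """Recompute the summary statistics listed for a numeric column."""
    q1, med, q3 = quantiles(xs, n=4, method="inclusive")
    m = mean(xs)
    sd = stdev(xs)  # sample standard deviation (n - 1 denominator)
    return {
        "n": len(xs),
        "sum": sum(xs),
        "mean": m,
        "variance": variance(xs),  # sample variance (n - 1 denominator)
        "std": sd,
        "cv": sd / m,  # coefficient of variation
        "median": med,
        "mad": median(abs(x - med) for x in xs),  # median absolute deviation
        "range": max(xs) - min(xs),
        "iqr": q3 - q1,
    }


stats = describe(values)
for name, value in stats.items():
    print(f"{name}: {value}")
```

The same helper applies to any of the numeric columns once the raw values are available; here the full distribution happens to be recoverable because 시간 has only six distinct values.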
우편번호
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 81 |
---|---|
Distinct (%) | 0.8% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 3706.6328 |
Minimum | 3605 |
---|---|
Maximum | 3789 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 3605 |
---|---|
5-th percentile | 3625 |
Q1 | 3665 |
median | 3711 |
Q3 | 3757 |
95-th percentile | 3788 |
Maximum | 3789 |
Range | 184 |
Interquartile range (IQR) | 92 |
Descriptive statistics
Standard deviation | 53.711577 |
---|---|
Coefficient of variation (CV) | 0.014490666 |
Kurtosis | -1.1945723 |
Mean | 3706.6328 |
Median Absolute Deviation (MAD) | 46 |
Skewness | -0.11060314 |
Sum | 37066328 |
Variance | 2884.9335 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
3712 | 432 | 4.3% |
3735 | 379 | 3.8% |
3646 | 339 | 3.4% |
3734 | 310 | 3.1% |
3789 | 275 | 2.8% |
3766 | 269 | 2.7% |
3692 | 261 | 2.6% |
3628 | 260 | 2.6% |
3788 | 252 | 2.5% |
3777 | 230 | 2.3% |
Other values (71) | 6993 |
Value | Count | Frequency (%) |
3605 | 119 | |
3606 | 9 | 0.1% |
3607 | 17 | 0.2% |
3611 | 36 | 0.4% |
3612 | 31 | 0.3% |
3615 | 75 | 0.8% |
3616 | 88 | 0.9% |
3624 | 124 | |
3625 | 73 | 0.7% |
3628 | 260 |
Value | Count | Frequency (%) |
3789 | 275 | |
3788 | 252 | |
3787 | 148 | |
3780 | 166 | |
3779 | 219 | |
3778 | 66 | 0.7% |
3777 | 230 | |
3776 | 205 | |
3767 | 142 | |
3766 | 269 |
상권코드
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 32 |
---|---|
Distinct (%) | 0.3% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 12.1638 |
Minimum | 1 |
---|---|
Maximum | 32 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 1 |
Q1 | 3 |
median | 11 |
Q3 | 20 |
95-th percentile | 29 |
Maximum | 32 |
Range | 31 |
Interquartile range (IQR) | 17 |
Descriptive statistics
Standard deviation | 9.1892989 |
---|---|
Coefficient of variation (CV) | 0.75546284 |
Kurtosis | -0.99042997 |
Mean | 12.1638 |
Median Absolute Deviation (MAD) | 8 |
Skewness | 0.42332724 |
Sum | 121638 |
Variance | 84.443214 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
1 | 1561 | 15.6% |
2 | 656 | 6.6% |
7 | 637 | 6.4% |
11 | 489 | 4.9% |
16 | 482 | 4.8% |
23 | 471 | 4.7% |
18 | 431 | 4.3% |
6 | 381 | 3.8% |
5 | 374 | 3.7% |
15 | 364 | 3.6% |
Other values (22) | 4154 |
Value | Count | Frequency (%) |
1 | 1561 | |
2 | 656 | |
3 | 331 | 3.3% |
4 | 202 | 2.0% |
5 | 374 | 3.7% |
6 | 381 | 3.8% |
7 | 637 | |
8 | 195 | 1.9% |
9 | 230 | 2.3% |
10 | 304 | 3.0% |
Value | Count | Frequency (%) |
32 | 139 | 1.4% |
31 | 227 | |
30 | 37 | 0.4% |
29 | 157 | 1.6% |
28 | 145 | 1.5% |
27 | 128 | 1.3% |
26 | 96 | 1.0% |
25 | 167 | 1.7% |
24 | 278 | |
23 | 471 |
유동인구
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 4249 |
---|---|
Distinct (%) | 42.5% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 21939.672 |
Minimum | 530.5 |
---|---|
Maximum | 120282.46 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 530.5 |
---|---|
5-th percentile | 1745.2905 |
Q1 | 5885.945 |
median | 13882.37 |
Q3 | 20806.812 |
95-th percentile | 96644.27 |
Maximum | 120282.46 |
Range | 119751.96 |
Interquartile range (IQR) | 14920.868 |
Descriptive statistics
Standard deviation | 26776.396 |
---|---|
Coefficient of variation (CV) | 1.2204557 |
Kurtosis | 3.6568421 |
Mean | 21939.672 |
Median Absolute Deviation (MAD) | 7745.84 |
Skewness | 2.1022279 |
Sum | 2.1939672 × 10^8 |
Variance | 7.1697539 × 10^8 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
60116.7 | 18 | 0.2% |
115588.49 | 16 | 0.2% |
80846.86 | 15 | 0.1% |
100553.89 | 14 | 0.1% |
92007.78 | 14 | 0.1% |
114246.86 | 14 | 0.1% |
41857.76 | 14 | 0.1% |
97066.13 | 14 | 0.1% |
96749.89 | 13 | 0.1% |
92034.96 | 13 | 0.1% |
Other values (4239) | 9855 |
Value | Count | Frequency (%) |
530.5 | 1 | |
537.07 | 1 | |
543.92 | 1 | |
552.65 | 1 | |
573.11 | 2 | |
576.4 | 1 | |
588.11 | 1 | |
591.28 | 1 | |
608.67 | 2 | |
624.18 | 1 |
Value | Count | Frequency (%) |
120282.46 | 8 | |
119357.06 | 9 | |
118475.28 | 10 | |
118426.52 | 11 | |
118218.57 | 11 | |
117806.23 | 13 | |
117521.14 | 7 | |
116896.56 | 8 | |
115588.49 | 16 | |
115042.86 | 7 |
미세먼지
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 144 |
---|---|
Distinct (%) | 1.4% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 42.682675 |
Minimum | 3 |
---|---|
Maximum | 94.75 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 3 |
---|---|
5-th percentile | 6.75 |
Q1 | 25.75 |
median | 43 |
Q3 | 58 |
95-th percentile | 78.25 |
Maximum | 94.75 |
Range | 91.75 |
Interquartile range (IQR) | 32.25 |
Descriptive statistics
Standard deviation | 22.720099 |
---|---|
Coefficient of variation (CV) | 0.53230259 |
Kurtosis | -0.7190023 |
Mean | 42.682675 |
Median Absolute Deviation (MAD) | 16.5 |
Skewness | 0.087178051 |
Sum | 426826.75 |
Variance | 516.20288 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
64.5 | 283 | 2.8% |
3.0 | 272 | 2.7% |
40.25 | 251 | 2.5% |
26.0 | 206 | 2.1% |
13.75 | 184 | 1.8% |
19.75 | 169 | 1.7% |
39.0 | 144 | 1.4% |
55.5 | 142 | 1.4% |
57.25 | 141 | 1.4% |
40.5 | 132 | 1.3% |
Other values (134) | 8076 |
Value | Count | Frequency (%) |
3.0 | 272 | |
3.25 | 60 | 0.6% |
3.5 | 17 | 0.2% |
4.25 | 81 | 0.8% |
5.25 | 51 | 0.5% |
5.5 | 18 | 0.2% |
6.75 | 57 | 0.6% |
7.0 | 70 | 0.7% |
7.25 | 66 | 0.7% |
7.5 | 101 | 1.0% |
Value | Count | Frequency (%) |
94.75 | 43 | |
92.75 | 81 | |
92.25 | 54 | |
91.0 | 63 | |
87.25 | 52 | |
84.5 | 60 | |
83.75 | 67 | |
80.5 | 72 | |
78.25 | 76 | |
77.5 | 49 |
초미세먼지
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 135 |
---|---|
Distinct (%) | 1.4% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 26.8539 |
Minimum | 1 |
---|---|
Maximum | 76.75 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 3 |
Q1 | 14.5 |
median | 26.5 |
Q3 | 34.25 |
95-th percentile | 60.25 |
Maximum | 76.75 |
Range | 75.75 |
Interquartile range (IQR) | 19.75 |
Descriptive statistics
Standard deviation | 16.534254 |
---|---|
Coefficient of variation (CV) | 0.61571146 |
Kurtosis | 0.33043849 |
Mean | 26.8539 |
Median Absolute Deviation (MAD) | 9.5 |
Skewness | 0.67034466 |
Sum | 268539 |
Variance | 273.38156 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
34.25 | 213 | 2.1% |
1.0 | 209 | 2.1% |
40.0 | 208 | 2.1% |
3.0 | 194 | 1.9% |
18.75 | 187 | 1.9% |
31.0 | 178 | 1.8% |
32.5 | 178 | 1.8% |
29.5 | 169 | 1.7% |
34.0 | 164 | 1.6% |
10.25 | 162 | 1.6% |
Other values (125) | 8138 |
Value | Count | Frequency (%) |
1.0 | 209 | |
1.25 | 39 | 0.4% |
1.5 | 122 | |
1.75 | 60 | 0.6% |
2.25 | 51 | 0.5% |
2.5 | 11 | 0.1% |
3.0 | 194 | |
3.75 | 73 | 0.7% |
4.0 | 45 | 0.4% |
4.25 | 47 | 0.5% |
Value | Count | Frequency (%) |
76.75 | 54 | |
74.5 | 43 | |
72.25 | 81 | |
67.0 | 67 | |
66.5 | 28 | 0.3% |
65.75 | 60 | |
63.0 | 18 | 0.2% |
62.0 | 46 | |
61.25 | 65 | |
60.25 | 58 |
Variable | 연월일 | 시간 | 우편번호 | 상권코드 | 유동인구 | 미세먼지 | 초미세먼지 |
---|---|---|---|---|---|---|---|
연월일 | 1.000 | 0.067 | 0.046 | 0.043 | 0.322 | 0.920 | 0.909 |
시간 | 0.067 | 1.000 | 0.104 | 0.077 | 0.390 | 0.338 | 0.310 |
우편번호 | 0.046 | 0.104 | 1.000 | 0.934 | 0.821 | 0.020 | 0.000 |
상권코드 | 0.043 | 0.077 | 0.934 | 1.000 | 0.732 | 0.041 | 0.053 |
유동인구 | 0.322 | 0.390 | 0.821 | 0.732 | 1.000 | 0.199 | 0.234 |
미세먼지 | 0.920 | 0.338 | 0.020 | 0.041 | 0.199 | 1.000 | 0.943 |
초미세먼지 | 0.909 | 0.310 | 0.000 | 0.053 | 0.234 | 0.943 | 1.000 |
Variable | 시간 | 우편번호 | 상권코드 | 유동인구 | 미세먼지 | 초미세먼지 | 연월일 |
---|---|---|---|---|---|---|---|
시간 | 1.000 | -0.021 | 0.031 | -0.021 | -0.028 | -0.035 | 0.029 |
우편번호 | -0.021 | 1.000 | -0.613 | 0.395 | -0.034 | -0.035 | 0.017 |
상권코드 | 0.031 | -0.613 | 1.000 | -0.575 | 0.019 | 0.018 | 0.015 |
유동인구 | -0.021 | 0.395 | -0.575 | 1.000 | -0.018 | -0.014 | 0.119 |
미세먼지 | -0.028 | -0.034 | 0.019 | -0.018 | 1.000 | 0.947 | 0.645 |
초미세먼지 | -0.035 | -0.035 | 0.018 | -0.014 | 0.947 | 1.000 | 0.617 |
연월일 | 0.029 | 0.017 | 0.015 | 0.119 | 0.645 | 0.617 | 1.000 |
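The report does not state which coefficient produced the signed matrix above. As an illustration only, the sketch below computes a Spearman rank correlation, one common choice for monotonic association, with the standard library. The function names are hypothetical, and the input is the 미세먼지 and 초미세먼지 columns of the ten sample rows listed further down in this report, so the result describes that small sample rather than the full 10000-row dataset.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# 미세먼지 and 초미세먼지 from the ten sample rows in this report.
pm10 = [26.0, 28.75, 45.25, 48.5, 77.0, 83.75, 50.0, 92.75, 39.0, 25.75]
pm25 = [17.0, 29.0, 23.0, 40.5, 42.5, 67.0, 29.5, 72.25, 19.25, 19.5]

rho_sample = spearman(pm10, pm25)
print(f"Spearman rho on the 10-row sample: {rho_sample:.3f}")
```

Ranking first makes the coefficient insensitive to the heavy right tail seen in columns such as 유동인구, which is one reason rank-based measures are often preferred for data like this.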
Index | 연월일 | 시간 | 우편번호 | 상권코드 | 유동인구 | 미세먼지 | 초미세먼지 |
---|---|---|---|---|---|---|---|
36783 | 2020-01-13 | 6 | 3675 | 6 | 6104.3 | 26.0 | 17.0 |
69791 | 2020-01-25 | 3 | 3788 | 1 | 34444.61 | 28.75 | 29.0 |
14023 | 2020-01-05 | 6 | 3665 | 16 | 8320.12 | 45.25 | 23.0 |
87068 | 2020-01-31 | 6 | 3605 | 28 | 2135.36 | 48.5 | 40.5 |
5474 | 2020-01-03 | 1 | 3788 | 1 | 46016.53 | 77.0 | 42.5 |
90962 | 2020-02-02 | 3 | 3735 | 3 | 28291.19 | 83.75 | 67.0 |
22571 | 2020-01-08 | 6 | 3632 | 10 | 11030.98 | 50.0 | 29.5 |
89611 | 2020-02-01 | 5 | 3692 | 32 | 1005.09 | 92.75 | 72.25 |
56722 | 2020-01-20 | 5 | 3726 | 14 | 5664.35 | 39.0 | 19.25 |
42908 | 2020-01-15 | 6 | 3714 | 12 | 6182.76 | 25.75 | 19.5 |
Index | 연월일 | 시간 | 우편번호 | 상권코드 | 유동인구 | 미세먼지 | 초미세먼지 |
---|---|---|---|---|---|---|---|
27621 | 2020-01-10 | 5 | 3712 | 31 | 853.76 | 52.5 | 26.25 |
41583 | 2020-01-15 | 4 | 3711 | 7 | 15430.33 | 27.75 | 17.5 |
18979 | 2020-01-07 | 5 | 3789 | 1 | 93400.91 | 3.0 | 1.0 |
67160 | 2020-01-24 | 3 | 3777 | 1 | 47279.93 | 71.75 | 57.0 |
87770 | 2020-02-01 | 2 | 3624 | 11 | 17912.48 | 51.25 | 42.0 |
16638 | 2020-01-06 | 6 | 3665 | 16 | 8101.46 | 7.5 | 3.75 |
13325 | 2020-01-05 | 5 | 3616 | 9 | 16125.37 | 57.25 | 31.0 |
25946 | 2020-01-10 | 2 | 3682 | 18 | 19113.84 | 64.75 | 39.25 |
13549 | 2020-01-05 | 5 | 3640 | 26 | 11440.52 | 57.25 | 31.0 |
43716 | 2020-01-16 | 3 | 3739 | 21 | 11487.87 | 47.5 | 28.5 |
Most frequently occurring
Index | 연월일 | 시간 | 우편번호 | 상권코드 | 유동인구 | 미세먼지 | 초미세먼지 | # duplicates |
---|---|---|---|---|---|---|---|---|
535 | 2020-01-09 | 4 | 3646 | 23 | 17171.81 | 41.0 | 23.25 | 6 |
712 | 2020-01-11 | 6 | 3646 | 23 | 21778.01 | 58.0 | 32.5 | 6 |
1340 | 2020-01-21 | 4 | 3665 | 16 | 6273.42 | 44.75 | 26.75 | 6 |
1677 | 2020-01-27 | 5 | 3646 | 23 | 20917.06 | 7.0 | 3.0 | 6 |
1718 | 2020-01-28 | 4 | 3665 | 16 | 6379.14 | 7.75 | 5.0 | 6 |
247 | 2020-01-04 | 6 | 3789 | 1 | 63589.78 | 48.25 | 26.5 | 5 |
271 | 2020-01-05 | 4 | 3632 | 10 | 8596.87 | 56.0 | 30.0 | 5 |
272 | 2020-01-05 | 4 | 3646 | 23 | 20940.13 | 56.0 | 30.0 | 5 |
461 | 2020-01-08 | 3 | 3646 | 23 | 17031.27 | 19.25 | 12.25 | 5 |
471 | 2020-01-08 | 3 | 3777 | 1 | 109943.77 | 19.25 | 12.25 | 5 |
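A table like the one above can be derived by grouping identical rows and keeping the groups that occur more than once. The sketch below shows the idea with the standard library. The input is illustrative: two rows copied from the table above, repeated as many times as their `# duplicates` value, plus one row from the sample tables. Whether the report's "Duplicate rows" figure counts distinct duplicated patterns or every repeated row is not stated in the report, so both counts are computed.

```python
from collections import Counter

# Illustrative input built from rows that appear in this report.
row_a = ("2020-01-09", 4, 3646, 23, 17171.81, 41.0, 23.25)
row_b = ("2020-01-04", 6, 3789, 1, 63589.78, 48.25, 26.5)
row_c = ("2020-01-13", 6, 3675, 6, 6104.3, 26.0, 17.0)
rows = [row_a] * 6 + [row_b] * 5 + [row_c]


def duplicate_groups(records):
    """Return (row, count) pairs for rows seen more than once, most frequent first."""
    counts = Counter(records)
    dupes = [(row, n) for row, n in counts.items() if n > 1]
    return sorted(dupes, key=lambda item: item[1], reverse=True)


groups = duplicate_groups(rows)
n_groups = len(groups)                        # distinct duplicated patterns
n_rows_in_groups = sum(n for _, n in groups)  # rows belonging to those patterns

for row, n in groups:
    print(n, row)
print(f"{n_groups} duplicated patterns covering {n_rows_in_groups} of {len(rows)} rows")
```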