Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 2150
Duplicate rows (%): 21.5%
Total size in memory: 683.6 KiB
Average record size in memory: 70.0 B
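The two derived figures in this table can be checked against the raw counts. The following is a minimal sketch, assuming the percentage is duplicate rows over observations and that the average record size is the total memory divided by the number of observations with 1 KiB = 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Recompute the derived dataset statistics from the raw figures reported above.
n_observations = 10_000
n_duplicate_rows = 2_150
total_memory_kib = 683.6  # "Total size in memory", in KiB

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = 100 * n_duplicate_rows / n_observations

# Average record size in bytes (assumption: 1 KiB = 1024 bytes).
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```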

Variable types

Categorical: 1
Numeric: 6

Dataset

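Description: Baseline data for analyzing fine dust (미세먼지) and floating population (유동인구) by commercial district across Seodaemun-gu. It provides, by date and by commercial district, the time (시간), floating population, fine dust, and ultrafine dust (초미세먼지) data items.
Author: Seodaemun-gu, Seoul (서울특별시 서대문구)
URL: https://www.data.go.kr/data/15097162/fileData.do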

Alerts

Dataset has 2150 (21.5%) duplicate rows [Duplicates]
우편번호 is highly overall correlated with 상권코드 [High correlation]
상권코드 is highly overall correlated with 우편번호 and 1 other field [High correlation]
유동인구 is highly overall correlated with 상권코드 [High correlation]
미세먼지 is highly overall correlated with 초미세먼지 and 1 other field [High correlation]
초미세먼지 is highly overall correlated with 미세먼지 and 1 other field [High correlation]
연월일 is highly overall correlated with 미세먼지 and 1 other field [High correlation]

Reproduction

Analysis started: 2023-12-12 07:52:00.490423
Analysis finished: 2023-12-12 07:52:06.503843
Duration: 6.01 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
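The reported duration follows directly from the two timestamps. A minimal sketch using only the standard library; the format string is an assumption based on how the timestamps are printed here.

```python
from datetime import datetime

# Timestamp layout assumed from the values printed in the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-12 07:52:00.490423", FMT)
finished = datetime.strptime("2023-12-12 07:52:06.503843", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```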

Variables

연월일 (date)
Categorical

HIGH CORRELATION 

Distinct: 34
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value              Count
2020-01-13         331
2020-01-30         331
2020-01-07         330
2020-01-08         325
2020-02-01         320
Other values (29)  8363

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020-01-13
2nd row: 2020-01-25
3rd row: 2020-01-05
4th row: 2020-01-31
5th row: 2020-01-03

Common Values

Value              Count  Frequency (%)
2020-01-13         331    3.3%
2020-01-30         331    3.3%
2020-01-07         330    3.3%
2020-01-08         325    3.2%
2020-02-01         320    3.2%
2020-01-11         318    3.2%
2020-01-20         316    3.2%
2020-01-04         315    3.1%
2020-01-23         315    3.1%
2020-01-03         313    3.1%
Other values (24)  6786   67.9%
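The "Other values" row is the remainder once the ten listed dates are accounted for. A minimal sketch that re-derives it from the counts above together with the totals reported for this variable (10000 observations, 34 distinct values); the variable names are illustrative.

```python
# Counts of the ten most frequent dates, as listed in the table above.
top_counts = {
    "2020-01-13": 331, "2020-01-30": 331, "2020-01-07": 330,
    "2020-01-08": 325, "2020-02-01": 320, "2020-01-11": 318,
    "2020-01-20": 316, "2020-01-04": 315, "2020-01-23": 315,
    "2020-01-03": 313,
}
n_observations = 10_000  # from the dataset statistics
n_distinct = 34          # distinct dates reported for this variable

# Everything not covered by the listed values falls into "Other values".
other_count = n_observations - sum(top_counts.values())
other_distinct = n_distinct - len(top_counts)
other_pct = 100 * other_count / n_observations

print(f"Other values ({other_distinct}) {other_count} {other_pct:.1f}%")
```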

Length

[Figure: Histogram of lengths of the category]

시간 (time)
Real number (ℝ)

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.9169
Minimum: 1
Maximum: 6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 4
Q3: 5
95-th percentile: 6
Maximum: 6
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.4532741
Coefficient of variation (CV): 0.3710266
Kurtosis: -0.88608407
Mean: 3.9169
Median Absolute Deviation (MAD): 1
Skewness: -0.26008839
Sum: 39169
Variance: 2.1120056
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=6)]
Most frequent values
Value  Count  Frequency (%)
5      2430   24.3%
3      2071   20.7%
4      2065   20.6%
6      1567   15.7%
2      1277   12.8%
1      590    5.9%

Smallest values
Value  Count  Frequency (%)
1      590    5.9%
2      1277   12.8%
3      2071   20.7%
4      2065   20.6%
5      2430   24.3%
6      1567   15.7%

Largest values
Value  Count  Frequency (%)
6      1567   15.7%
5      2430   24.3%
4      2065   20.6%
3      2071   20.7%
2      1277   12.8%
1      590    5.9%
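Because 시간 takes only six distinct values, its descriptive statistics can be recomputed exactly from the frequency table. A minimal sketch, assuming the variance uses the sample convention (division by n - 1), which is what reproduces the reported figure; the variable names are illustrative.

```python
import math

# Frequency table for 시간 as reported above: value -> count.
counts = {1: 590, 2: 1277, 3: 2071, 4: 2065, 5: 2430, 6: 1567}

n = sum(counts.values())                         # number of observations
total = sum(v * c for v, c in counts.items())    # "Sum"
mean = total / n                                 # "Mean"

# Sample variance: squared deviations weighted by count, divided by n - 1
# (assumed convention; it matches the reported variance).
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = math.sqrt(variance)                        # "Standard deviation"
cv = std / mean                                  # "Coefficient of variation (CV)"

print(f"Sum: {total}")
print(f"Mean: {mean:.4f}")
print(f"Variance: {variance:.7f}")
print(f"Standard deviation: {std:.7f}")
print(f"CV: {cv:.7f}")
```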

우편번호 (postal code)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 81
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3706.6328
Minimum: 3605
Maximum: 3789
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 3605
5-th percentile: 3625
Q1: 3665
Median: 3711
Q3: 3757
95-th percentile: 3788
Maximum: 3789
Range: 184
Interquartile range (IQR): 92

Descriptive statistics

Standard deviation: 53.711577
Coefficient of variation (CV): 0.014490666
Kurtosis: -1.1945723
Mean: 3706.6328
Median Absolute Deviation (MAD): 46
Skewness: -0.11060314
Sum: 37066328
Variance: 2884.9335
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Most frequent values
Value              Count  Frequency (%)
3712               432    4.3%
3735               379    3.8%
3646               339    3.4%
3734               310    3.1%
3789               275    2.8%
3766               269    2.7%
3692               261    2.6%
3628               260    2.6%
3788               252    2.5%
3777               230    2.3%
Other values (71)  6993   69.9%

Smallest values
Value  Count  Frequency (%)
3605   119    1.2%
3606   9      0.1%
3607   17     0.2%
3611   36     0.4%
3612   31     0.3%
3615   75     0.8%
3616   88     0.9%
3624   124    1.2%
3625   73     0.7%
3628   260    2.6%

Largest values
Value  Count  Frequency (%)
3789   275    2.8%
3788   252    2.5%
3787   148    1.5%
3780   166    1.7%
3779   219    2.2%
3778   66     0.7%
3777   230    2.3%
3776   205    2.1%
3767   142    1.4%
3766   269    2.7%

상권코드 (commercial district code)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 32
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.1638
Minimum: 1
Maximum: 32
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 11
Q3: 20
95-th percentile: 29
Maximum: 32
Range: 31
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 9.1892989
Coefficient of variation (CV): 0.75546284
Kurtosis: -0.99042997
Mean: 12.1638
Median Absolute Deviation (MAD): 8
Skewness: 0.42332724
Sum: 121638
Variance: 84.443214
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=32)]
Most frequent values
Value              Count  Frequency (%)
1                  1561   15.6%
2                  656    6.6%
7                  637    6.4%
11                 489    4.9%
16                 482    4.8%
23                 471    4.7%
18                 431    4.3%
6                  381    3.8%
5                  374    3.7%
15                 364    3.6%
Other values (22)  4154   41.5%

Smallest values
Value  Count  Frequency (%)
1      1561   15.6%
2      656    6.6%
3      331    3.3%
4      202    2.0%
5      374    3.7%
6      381    3.8%
7      637    6.4%
8      195    1.9%
9      230    2.3%
10     304    3.0%

Largest values
Value  Count  Frequency (%)
32     139    1.4%
31     227    2.3%
30     37     0.4%
29     157    1.6%
28     145    1.5%
27     128    1.3%
26     96     1.0%
25     167    1.7%
24     278    2.8%
23     471    4.7%

유동인구 (floating population)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 4249
Distinct (%): 42.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 21939.672
Minimum: 530.5
Maximum: 120282.46
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 530.5
5-th percentile: 1745.2905
Q1: 5885.945
Median: 13882.37
Q3: 20806.812
95-th percentile: 96644.27
Maximum: 120282.46
Range: 119751.96
Interquartile range (IQR): 14920.868

Descriptive statistics

Standard deviation: 26776.396
Coefficient of variation (CV): 1.2204557
Kurtosis: 3.6568421
Mean: 21939.672
Median Absolute Deviation (MAD): 7745.84
Skewness: 2.1022279
Sum: 2.1939672 × 10^8
Variance: 7.1697539 × 10^8
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Most frequent values
Value                Count  Frequency (%)
60116.7              18     0.2%
115588.49            16     0.2%
80846.86             15     0.1%
100553.89            14     0.1%
92007.78             14     0.1%
114246.86            14     0.1%
41857.76             14     0.1%
97066.13             14     0.1%
96749.89             13     0.1%
92034.96             13     0.1%
Other values (4239)  9855   98.6%

Smallest values
Value   Count  Frequency (%)
530.5   1      < 0.1%
537.07  1      < 0.1%
543.92  1      < 0.1%
552.65  1      < 0.1%
573.11  2      < 0.1%
576.4   1      < 0.1%
588.11  1      < 0.1%
591.28  1      < 0.1%
608.67  2      < 0.1%
624.18  1      < 0.1%

Largest values
Value      Count  Frequency (%)
120282.46  8      0.1%
119357.06  9      0.1%
118475.28  10     0.1%
118426.52  11     0.1%
118218.57  11     0.1%
117806.23  13     0.1%
117521.14  7      0.1%
116896.56  8      0.1%
115588.49  16     0.2%
115042.86  7      0.1%
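The range, interquartile range, and coefficient of variation reported for 유동인구 are simple functions of the other statistics listed for this variable. A minimal sketch using those rounded figures as inputs, so the last digit can differ slightly from the report; the variable names are illustrative.

```python
# Statistics for 유동인구 as reported in this section (already rounded).
minimum, maximum = 530.5, 120282.46
q1, q3 = 5885.945, 20806.812
mean, std = 21939.672, 26776.396

value_range = maximum - minimum   # "Range"
iqr = q3 - q1                     # "Interquartile range (IQR)"
cv = std / mean                   # "Coefficient of variation (CV)"

print(f"Range: {value_range:.2f}")
print(f"IQR: {iqr:.3f}")
print(f"CV: {cv:.6f}")
```

A coefficient of variation above 1, together with a mean (21939.672) well above the median (13882.37), is consistent with the strongly positive skewness reported for this variable.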

미세먼지 (fine dust)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 144
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 42.682675
Minimum: 3
Maximum: 94.75
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 3
5-th percentile: 6.75
Q1: 25.75
Median: 43
Q3: 58
95-th percentile: 78.25
Maximum: 94.75
Range: 91.75
Interquartile range (IQR): 32.25

Descriptive statistics

Standard deviation: 22.720099
Coefficient of variation (CV): 0.53230259
Kurtosis: -0.7190023
Mean: 42.682675
Median Absolute Deviation (MAD): 16.5
Skewness: 0.087178051
Sum: 426826.75
Variance: 516.20288
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Most frequent values
Value               Count  Frequency (%)
64.5                283    2.8%
3.0                 272    2.7%
40.25               251    2.5%
26.0                206    2.1%
13.75               184    1.8%
19.75               169    1.7%
39.0                144    1.4%
55.5                142    1.4%
57.25               141    1.4%
40.5                132    1.3%
Other values (134)  8076   80.8%

Smallest values
Value  Count  Frequency (%)
3.0    272    2.7%
3.25   60     0.6%
3.5    17     0.2%
4.25   81     0.8%
5.25   51     0.5%
5.5    18     0.2%
6.75   57     0.6%
7.0    70     0.7%
7.25   66     0.7%
7.5    101    1.0%

Largest values
Value  Count  Frequency (%)
94.75  43     0.4%
92.75  81     0.8%
92.25  54     0.5%
91.0   63     0.6%
87.25  52     0.5%
84.5   60     0.6%
83.75  67     0.7%
80.5   72     0.7%
78.25  76     0.8%
77.5   49     0.5%

초미세먼지 (ultrafine dust)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 135
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.8539
Minimum: 1
Maximum: 76.75
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 14.5
Median: 26.5
Q3: 34.25
95-th percentile: 60.25
Maximum: 76.75
Range: 75.75
Interquartile range (IQR): 19.75

Descriptive statistics

Standard deviation: 16.534254
Coefficient of variation (CV): 0.61571146
Kurtosis: 0.33043849
Mean: 26.8539
Median Absolute Deviation (MAD): 9.5
Skewness: 0.67034466
Sum: 268539
Variance: 273.38156
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Most frequent values
Value               Count  Frequency (%)
34.25               213    2.1%
1.0                 209    2.1%
40.0                208    2.1%
3.0                 194    1.9%
18.75               187    1.9%
31.0                178    1.8%
32.5                178    1.8%
29.5                169    1.7%
34.0                164    1.6%
10.25               162    1.6%
Other values (125)  8138   81.4%

Smallest values
Value  Count  Frequency (%)
1.0    209    2.1%
1.25   39     0.4%
1.5    122    1.2%
1.75   60     0.6%
2.25   51     0.5%
2.5    11     0.1%
3.0    194    1.9%
3.75   73     0.7%
4.0    45     0.4%
4.25   47     0.5%

Largest values
Value  Count  Frequency (%)
76.75  54     0.5%
74.5   43     0.4%
72.25  81     0.8%
67.0   67     0.7%
66.5   28     0.3%
65.75  60     0.6%
63.0   18     0.2%
62.0   46     0.5%
61.25  65     0.7%
60.25  58     0.6%

Interactions

[Figures: 36 interaction plots; images not reproduced]

Correlations

[Figure: correlation heatmap; values in the first matrix below]

            연월일   시간    우편번호  상권코드  유동인구  미세먼지  초미세먼지
연월일      1.000   0.067   0.046    0.043    0.322    0.920    0.909
시간        0.067   1.000   0.104    0.077    0.390    0.338    0.310
우편번호    0.046   0.104   1.000    0.934    0.821    0.020    0.000
상권코드    0.043   0.077   0.934    1.000    0.732    0.041    0.053
유동인구    0.322   0.390   0.821    0.732    1.000    0.199    0.234
미세먼지    0.920   0.338   0.020    0.041    0.199    1.000    0.943
초미세먼지  0.909   0.310   0.000    0.053    0.234    0.943    1.000

[Figure: correlation heatmap; values in the second matrix below]

            시간     우편번호  상권코드  유동인구  미세먼지  초미세먼지  연월일
시간        1.000   -0.021    0.031   -0.021   -0.028   -0.035     0.029
우편번호   -0.021    1.000   -0.613    0.395   -0.034   -0.035     0.017
상권코드    0.031   -0.613    1.000   -0.575    0.019    0.018     0.015
유동인구   -0.021    0.395   -0.575    1.000   -0.018   -0.014     0.119
미세먼지   -0.028   -0.034    0.019   -0.018    1.000    0.947     0.645
초미세먼지 -0.035   -0.035    0.018   -0.014    0.947    1.000     0.617
연월일      0.029    0.017    0.015    0.119    0.645    0.617     1.000
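The high-correlation alerts listed at the top of the report appear consistent with applying a cut-off to the absolute values of the second matrix. The following is a minimal sketch under that reading; the threshold of 0.5 and the helper `high_correlations` are assumptions for illustration and are not stated anywhere in the report.

```python
# Second correlation matrix from this section, rows and columns in the same order.
columns = ["시간", "우편번호", "상권코드", "유동인구", "미세먼지", "초미세먼지", "연월일"]
matrix = [
    [1.000, -0.021, 0.031, -0.021, -0.028, -0.035, 0.029],
    [-0.021, 1.000, -0.613, 0.395, -0.034, -0.035, 0.017],
    [0.031, -0.613, 1.000, -0.575, 0.019, 0.018, 0.015],
    [-0.021, 0.395, -0.575, 1.000, -0.018, -0.014, 0.119],
    [-0.028, -0.034, 0.019, -0.018, 1.000, 0.947, 0.645],
    [-0.035, -0.035, 0.018, -0.014, 0.947, 1.000, 0.617],
    [0.029, 0.017, 0.015, 0.119, 0.645, 0.617, 1.000],
]
THRESHOLD = 0.5  # assumed cut-off; the report does not state it


def high_correlations(columns, matrix, threshold):
    """Map each column to the other columns whose |correlation| >= threshold."""
    flagged = {}
    for i, name in enumerate(columns):
        partners = [
            columns[j]
            for j, value in enumerate(matrix[i])
            if j != i and abs(value) >= threshold
        ]
        if partners:
            flagged[name] = partners
    return flagged


flagged = high_correlations(columns, matrix, THRESHOLD)
for name, partners in flagged.items():
    print(f"{name} is highly correlated with {', '.join(partners)}")
```

Under this assumed threshold, 시간 is the only variable with no partner, which agrees with it being the only variable without a high-correlation alert.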

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

연월일시간우편번호상권코드유동인구미세먼지초미세먼지
367832020-01-136367566104.326.017.0
697912020-01-2533788134444.6128.7529.0
140232020-01-0563665168320.1245.2523.0
870682020-01-3163605282135.3648.540.5
54742020-01-0313788146016.5377.042.5
909622020-02-0233735328291.1983.7567.0
225712020-01-08636321011030.9850.029.5
896112020-02-0153692321005.0992.7572.25
567222020-01-2053726145664.3539.019.25
429082020-01-1563714126182.7625.7519.5
연월일시간우편번호상권코드유동인구미세먼지초미세먼지
276212020-01-105371231853.7652.526.25
415832020-01-1543711715430.3327.7517.5
189792020-01-0753789193400.913.01.0
671602020-01-2433777147279.9371.7557.0
877702020-02-01236241117912.4851.2542.0
166382020-01-0663665168101.467.53.75
133252020-01-0553616916125.3757.2531.0
259462020-01-10236821819113.8464.7539.25
135492020-01-05536402611440.5257.2531.0
437162020-01-16337392111487.8747.528.5

Duplicate rows

Most frequently occurring

연월일시간우편번호상권코드유동인구미세먼지초미세먼지# duplicates
5352020-01-09436462317171.8141.023.256
7122020-01-11636462321778.0158.032.56
13402020-01-2143665166273.4244.7526.756
16772020-01-27536462320917.067.03.06
17182020-01-2843665166379.147.755.06
2472020-01-0463789163589.7848.2526.55
2712020-01-0543632108596.8756.030.05
2722020-01-05436462320940.1356.030.05
4612020-01-08336462317031.2719.2512.255
4712020-01-08337771109943.7719.2512.255
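The "# duplicates" column appears to count how many times an identical row occurs across all seven columns. A minimal sketch of that grouping using only the standard library; the rows below are hypothetical values made up for illustration and are not taken from the dataset.

```python
from collections import Counter

# Hypothetical rows (연월일, 시간, 우편번호, 상권코드, 유동인구, 미세먼지, 초미세먼지);
# illustrative only, not actual records from the dataset.
rows = [
    ("2020-01-01", 1, 3600, 5, 100.0, 10.0, 5.0),
    ("2020-01-01", 1, 3600, 5, 100.0, 10.0, 5.0),
    ("2020-01-01", 1, 3600, 5, 100.0, 10.0, 5.0),
    ("2020-01-02", 2, 3601, 6, 200.0, 20.0, 9.0),
    ("2020-01-03", 3, 3602, 7, 300.0, 30.0, 12.0),
    ("2020-01-03", 3, 3602, 7, 300.0, 30.0, 12.0),
]

# Count occurrences of each full row, then keep only rows seen more than once.
occurrences = Counter(rows)
duplicates = {row: n for row, n in occurrences.items() if n > 1}

# Order by occurrence count, most frequent first, as in the table above.
most_frequent = sorted(duplicates.items(), key=lambda item: item[1], reverse=True)
for row, n in most_frequent:
    print(row, n)
```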