Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 761.7 KiB
Average record size in memory: 78.0 B
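
The average record size can be cross-checked from the total memory footprint and the number of observations. A minimal sketch, assuming the report's KiB means 1024 bytes:

```python
# Figures copied from the dataset statistics above.
total_kib = 761.7
n_rows = 10_000

# Assumption: 1 KiB = 1024 bytes.
avg_bytes = total_kib * 1024 / n_rows
print(f"{avg_bytes:.1f} B")  # prints "78.0 B"
```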

Variable types

Categorical: 2
Numeric: 4
Text: 2

Dataset

Description: Estimated sales of developed alley commercial districts in Gyeonggi-do (경기도 발달 골목 상권 추정 매출 현황)
Author: 경기도시장상권진흥원
URL: https://data.gg.go.kr/portal/data/service/selectServicePage.do?&infId=B7BIA8NM4VIXSPPQQV0132089283&infSeq=1

Alerts

기준연도 has constant value "2023" (Constant)
기준분기 has constant value "1" (Constant)
매출금액 is highly overall correlated with 매출건수 (High correlation)
매출건수 is highly overall correlated with 매출금액 (High correlation)
매출금액 is highly skewed (γ1 = 21.87633998) (Skewed)
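
The γ1 value in the skewness alert is a sample skewness coefficient. The sketch below computes the bias-adjusted Fisher-Pearson coefficient in plain Python. That the report uses this exact estimator is an assumption, and the input lists are hypothetical toy data, not values from the dataset.

```python
import math


def sample_skewness(xs):
    """Bias-adjusted Fisher-Pearson skewness (assumed estimator)."""
    n = len(xs)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in xs) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant data has no skew
    g1 = m3 / m2 ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


# Hypothetical toy data: a symmetric list and one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 1, 100]
print(sample_skewness(symmetric), sample_skewness(right_tailed))
```

A large positive value, as reported for 매출금액, indicates that a few very large observations pull the right tail far from the bulk of the data.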

Reproduction

Analysis started: 2024-03-12 23:35:50.717621
Analysis finished: 2024-03-12 23:35:52.825239
Duration: 2.11 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
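
The reported duration is the difference between the two timestamps. A small check using only the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the reproduction block above.
started = datetime.strptime("2024-03-12 23:35:50.717621", FMT)
finished = datetime.strptime("2024-03-12 23:35:52.825239", FMT)

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")  # prints "2.11 seconds"
```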

Variables

기준연도
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023
2nd row: 2023
3rd row: 2023
4th row: 2023
5th row: 2023

Common Values

Value | Count | Frequency (%)
2023 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2023 | 10000 | 100.0%

기준분기
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
1 | 10000 | 100.0%

상권ID
Real number (ℝ)

Distinct: 1623
Distinct (%): 16.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 804.9223
Minimum: 1
Maximum: 1865
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 67
Q1: 343
Median: 704
Q3: 1269
95-th percentile: 1735
Maximum: 1865
Range: 1864
Interquartile range (IQR): 926

Descriptive statistics

Standard deviation: 543.07291
Coefficient of variation (CV): 0.67468986
Kurtosis: -1.125771
Mean: 804.9223
Median Absolute Deviation (MAD): 442
Skewness: 0.35083869
Sum: 8049223
Variance: 294928.19
Monotonicity: Not monotonic
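
Several of the 상권ID statistics are derived from others, so they can be cross-checked with simple arithmetic. A short sketch using the values reported above; small tolerances are needed because the report rounds its figures:

```python
# Values copied from the quantile and descriptive statistics for 상권ID.
minimum, maximum = 1, 1865
q1, q3 = 343, 1269
mean, std = 804.9223, 543.07291
n_rows = 10_000

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance
total = mean * n_rows            # Sum

print(value_range, iqr, round(cv, 6), round(variance, 2), round(total))
```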
Histogram with fixed size bins (bins=50)
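
A histogram with fixed-size bins splits the range from minimum to maximum into equal-width intervals and counts the observations in each. The sketch below is a plain-Python illustration of that idea, not the report's plotting code; the function name and the toy input are hypothetical.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    if not values:
        raise ValueError("values must be non-empty")
    lo, hi = min(values), max(values)
    span = hi - lo
    # Fall back to a unit width when every value is identical.
    width = span / bins if span > 0 else 1.0
    counts = [0] * bins
    for v in values:
        # The maximum value is placed in the last bin instead of overflowing.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return counts, edges


# Hypothetical toy input: the integers 1..100 in 10 bins.
counts, edges = fixed_width_histogram(list(range(1, 101)), bins=10)
print(counts)
```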
Value | Count | Frequency (%)
34 | 22 | 0.2%
124 | 21 | 0.2%
888 | 19 | 0.2%
143 | 19 | 0.2%
93 | 19 | 0.2%
767 | 19 | 0.2%
366 | 19 | 0.2%
363 | 18 | 0.2%
82 | 18 | 0.2%
489 | 18 | 0.2%
Other values (1613) | 9808 | 98.1%
Value | Count | Frequency (%)
1 | 9 | 0.1%
3 | 3 | < 0.1%
4 | 10 | 0.1%
5 | 7 | 0.1%
6 | 13 | 0.1%
7 | 9 | 0.1%
8 | 1 | < 0.1%
9 | 16 | 0.2%
10 | 11 | 0.1%
11 | 6 | 0.1%
Value | Count | Frequency (%)
1865 | 14 | 0.1%
1864 | 5 | 0.1%
1863 | 8 | 0.1%
1862 | 8 | 0.1%
1861 | 8 | 0.1%
1860 | 6 | 0.1%
1858 | 3 | < 0.1%
1857 | 4 | < 0.1%
1856 | 1 | < 0.1%
1854 | 4 | < 0.1%

상권명
Text

Distinct: 1544
Distinct (%): 15.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 15
Median length: 12
Mean length: 6.2699
Min length: 2

Characters and Unicode

Total characters: 62699
Distinct characters: 367
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
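
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for one of the sample values; the mapping from two-letter codes to the long names used in this report (Lo = Other Letter, Nd = Decimal Number, Pc = Connector Punctuation, Lu = Uppercase Letter) follows the Unicode Standard, while the helper name is hypothetical.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count Unicode general categories (two-letter codes) in a string."""
    return Counter(unicodedata.category(ch) for ch in text)


# One of the sample values shown in this report.
sample = "인덕원역_2번출구"
counts = category_counts(sample)
print(dict(counts))  # {'Lo': 7, 'Pc': 1, 'Nd': 1}
```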

Unique

Unique: 102
Unique (%): 1.0%

Sample

1st row: 김포대로319번길
2nd row: 인덕원역_2번출구
3rd row: 보정동주민센터
4th row: 서정마을2로7번길
5th row: 역전로
Value | Count | Frequency (%)
중앙로 | 42 | 0.4%
중앙로_2 | 27 | 0.3%
사강장사강시장 | 22 | 0.2%
산성대로 | 21 | 0.2%
경안동주민센터 | 21 | 0.2%
광명로 | 21 | 0.2%
중앙로_1 | 21 | 0.2%
영통로 | 20 | 0.2%
매화로 | 19 | 0.2%
엘에스로_2 | 19 | 0.2%
Other values (1534) | 9767 | 97.7%

Most occurring characters

ValueCountFrequency (%)
4912
 
7.8%
2427
 
3.9%
2407
 
3.8%
1 2389
 
3.8%
_ 1785
 
2.8%
2 1549
 
2.5%
1423
 
2.3%
1136
 
1.8%
1104
 
1.8%
957
 
1.5%
Other values (357) 42610
68.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 52867 | 84.3%
Decimal Number | 7852 | 12.5%
Connector Punctuation | 1785 | 2.8%
Uppercase Letter | 195 | 0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4912
 
9.3%
2427
 
4.6%
2407
 
4.6%
1423
 
2.7%
1136
 
2.1%
1104
 
2.1%
957
 
1.8%
885
 
1.7%
843
 
1.6%
815
 
1.5%
Other values (341) 35958
68.0%
Decimal Number
Value | Count | Frequency (%)
1 | 2389 | 30.4%
2 | 1549 | 19.7%
3 | 875 | 11.1%
4 | 576 | 7.3%
5 | 505 | 6.4%
6 | 468 | 6.0%
9 | 431 | 5.5%
7 | 412 | 5.2%
8 | 330 | 4.2%
0 | 317 | 4.0%
Uppercase Letter
Value | Count | Frequency (%)
C | 69 | 35.4%
V | 51 | 26.2%
G | 51 | 26.2%
N | 18 | 9.2%
I | 6 | 3.1%
Connector Punctuation
Value | Count | Frequency (%)
_ | 1785 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 52867 | 84.3%
Common | 9637 | 15.4%
Latin | 195 | 0.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4912
 
9.3%
2427
 
4.6%
2407
 
4.6%
1423
 
2.7%
1136
 
2.1%
1104
 
2.1%
957
 
1.8%
885
 
1.7%
843
 
1.6%
815
 
1.5%
Other values (341) 35958
68.0%
Common
Value | Count | Frequency (%)
1 | 2389 | 24.8%
_ | 1785 | 18.5%
2 | 1549 | 16.1%
3 | 875 | 9.1%
4 | 576 | 6.0%
5 | 505 | 5.2%
6 | 468 | 4.9%
9 | 431 | 4.5%
7 | 412 | 4.3%
8 | 330 | 3.4%
Latin
Value | Count | Frequency (%)
C | 69 | 35.4%
V | 51 | 26.2%
G | 51 | 26.2%
N | 18 | 9.2%
I | 6 | 3.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 52867 | 84.3%
ASCII | 9832 | 15.7%

Most frequent character per block

Hangul
ValueCountFrequency (%)
4912
 
9.3%
2427
 
4.6%
2407
 
4.6%
1423
 
2.7%
1136
 
2.1%
1104
 
2.1%
957
 
1.8%
885
 
1.7%
843
 
1.6%
815
 
1.5%
Other values (341) 35958
68.0%
ASCII
Value | Count | Frequency (%)
1 | 2389 | 24.3%
_ | 1785 | 18.2%
2 | 1549 | 15.8%
3 | 875 | 8.9%
4 | 576 | 5.9%
5 | 505 | 5.1%
6 | 468 | 4.8%
9 | 431 | 4.4%
7 | 412 | 4.2%
8 | 330 | 3.4%
Other values (6) | 512 | 5.2%

산업분류코드
Real number (ℝ)

Distinct: 68
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 64140.843
Minimum: 47121
Maximum: 96912
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 47121
5-th percentile: 47122
Q1: 47511
Median: 56111
Q3: 86202
95-th percentile: 96113
Maximum: 96912
Range: 49791
Interquartile range (IQR): 38691

Descriptive statistics

Standard deviation: 19809.472
Coefficient of variation (CV): 0.30884334
Kurtosis: -1.39729
Mean: 64140.843
Median Absolute Deviation (MAD): 8799
Skewness: 0.65631651
Sum: 6.4140843 × 10^8
Variance: 3.9241519 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
56123 | 407 | 4.1%
56111 | 399 | 4.0%
96112 | 356 | 3.6%
47122 | 329 | 3.3%
47219 | 319 | 3.2%
56194 | 314 | 3.1%
96113 | 259 | 2.6%
47811 | 253 | 2.5%
56219 | 248 | 2.5%
91223 | 241 | 2.4%
Other values (58) | 6875 | 68.8%
Value | Count | Frequency (%)
47121 | 224 | 2.2%
47122 | 329 | 3.3%
47129 | 164 | 1.6%
47211 | 44 | 0.4%
47212 | 205 | 2.1%
47217 | 131 | 1.3%
47219 | 319 | 3.2%
47311 | 159 | 1.6%
47312 | 143 | 1.4%
47320 | 133 | 1.3%
Value | Count | Frequency (%)
96912 | 222 | 2.2%
96119 | 102 | 1.0%
96113 | 259 | 2.6%
96112 | 356 | 3.6%
95310 | 112 | 1.1%
95213 | 96 | 1.0%
95212 | 215 | 2.1%
91223 | 241 | 2.4%
91222 | 185 | 1.8%
91136 | 81 | 0.8%

산업분류코드명
Text

Distinct: 68
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 23
Median length: 18
Mean length: 9.6644
Min length: 3

Characters and Unicode

Total characters: 96644
Distinct characters: 160
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 슈퍼마켓
2nd row: 의약품 및 의료용품 소매업
3rd row: 중식 음식점업
4th row: 기타 식료품 소매업
5th row: 한식 일반 음식점업
ValueCountFrequency (%)
소매업 3873
 
13.6%
2509
 
8.8%
기타 1745
 
6.1%
음식점업 1543
 
5.4%
운영업 936
 
3.3%
미용업 717
 
2.5%
일반 579
 
2.0%
가정용 466
 
1.6%
서양식 407
 
1.4%
한식 402
 
1.4%
Other values (115) 15258
53.7%

Most occurring characters

ValueCountFrequency (%)
18435
19.1%
8394
 
8.7%
4032
 
4.2%
3873
 
4.0%
3545
 
3.7%
2731
 
2.8%
2632
 
2.7%
2509
 
2.6%
2346
 
2.4%
2328
 
2.4%
Other values (150) 45819
47.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 77470 | 80.2%
Space Separator | 18435 | 19.1%
Other Punctuation | 739 | 0.8%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
8394
 
10.8%
4032
 
5.2%
3873
 
5.0%
3545
 
4.6%
2731
 
3.5%
2632
 
3.4%
2509
 
3.2%
2346
 
3.0%
2328
 
3.0%
1745
 
2.3%
Other values (148) 43335
55.9%
Space Separator
ValueCountFrequency (%)
18435
100.0%
Other Punctuation
ValueCountFrequency (%)
, 739
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 77470 | 80.2%
Common | 19174 | 19.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
8394
 
10.8%
4032
 
5.2%
3873
 
5.0%
3545
 
4.6%
2731
 
3.5%
2632
 
3.4%
2509
 
3.2%
2346
 
3.0%
2328
 
3.0%
1745
 
2.3%
Other values (148) 43335
55.9%
Common
ValueCountFrequency (%)
18435
96.1%
, 739
 
3.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 77306 | 80.0%
ASCII | 19174 | 19.8%
Compat Jamo | 164 | 0.2%
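
A Unicode block is a contiguous range of code points, so the block counts above can be approximated with a simple range lookup. The sketch below assumes that the report's labels Hangul, Compat Jamo and ASCII correspond to the Hangul Syllables, Hangul Compatibility Jamo and Basic Latin blocks; the helper name and the "Other" fallback are hypothetical.

```python
from collections import Counter

# (label used in this report, first code point, last code point).
# The label-to-block mapping is an assumption.
BLOCKS = [
    ("ASCII", 0x0000, 0x007F),        # Basic Latin
    ("Compat Jamo", 0x3130, 0x318F),  # Hangul Compatibility Jamo
    ("Hangul", 0xAC00, 0xD7AF),       # Hangul Syllables
]


def block_of(ch):
    """Return the block label for a single character, or "Other"."""
    cp = ord(ch)
    for label, first, last in BLOCKS:
        if first <= cp <= last:
            return label
    return "Other"


# One of the sample values shown in this report.
sample = "기타 식료품 소매업"
block_counts = Counter(block_of(ch) for ch in sample)
print(dict(block_counts))  # {'Hangul': 8, 'ASCII': 2}
```

The two ASCII characters in the sample are the spaces, which is consistent with the Space Separator category accounting for most of the ASCII block in this variable.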

Most frequent character per block

ASCII
ValueCountFrequency (%)
18435
96.1%
, 739
 
3.9%
Hangul
ValueCountFrequency (%)
8394
 
10.9%
4032
 
5.2%
3873
 
5.0%
3545
 
4.6%
2731
 
3.5%
2632
 
3.4%
2509
 
3.2%
2346
 
3.0%
2328
 
3.0%
1745
 
2.3%
Other values (147) 43171
55.8%
Compat Jamo
ValueCountFrequency (%)
164
100.0%

매출금액
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 9997
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.2381335 × 10^8
Minimum: 6
Maximum: 1.9401094 × 10^10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 6
5-th percentile: 611005.85
Q1: 7039251.5
Median: 32974990
Q3: 1.1827653 × 10^8
95-th percentile: 5.0755985 × 10^8
Maximum: 1.9401094 × 10^10
Range: 1.9401094 × 10^10
Interquartile range (IQR): 1.1123728 × 10^8

Descriptive statistics

Standard deviation: 3.4606227 × 10^8
Coefficient of variation (CV): 2.7950319
Kurtosis: 1018.3753
Mean: 1.2381335 × 10^8
Median Absolute Deviation (MAD): 30719238
Skewness: 21.87634
Sum: 1.2381335 × 10^12
Variance: 1.1975909 × 10^17
Monotonicity: Not monotonic
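
For a variable this skewed, the median absolute deviation is far smaller than the standard deviation because it is not dominated by the largest values. A minimal sketch of the unscaled MAD (median of absolute deviations from the median); that the report uses the unscaled form is an assumption, and the input list is hypothetical toy data.

```python
from statistics import median, pstdev


def median_absolute_deviation(xs):
    """Unscaled MAD: median of |x - median(xs)|."""
    med = median(xs)
    return median(abs(x - med) for x in xs)


# Hypothetical toy data with one extreme value.
values = [1, 2, 3, 4, 100]
mad = median_absolute_deviation(values)
print(mad, round(pstdev(values), 2))
```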
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
285122 | 2 | < 0.1%
291529 | 2 | < 0.1%
3850020 | 2 | < 0.1%
7797062 | 1 | < 0.1%
129795844 | 1 | < 0.1%
1000646855 | 1 | < 0.1%
881782 | 1 | < 0.1%
23233393 | 1 | < 0.1%
104922135 | 1 | < 0.1%
306050663 | 1 | < 0.1%
Other values (9987) | 9987 | 99.9%
Value | Count | Frequency (%)
6 | 1 | < 0.1%
27 | 1 | < 0.1%
1277 | 1 | < 0.1%
1398 | 1 | < 0.1%
2039 | 1 | < 0.1%
2836 | 1 | < 0.1%
2960 | 1 | < 0.1%
5509 | 1 | < 0.1%
5673 | 1 | < 0.1%
6244 | 1 | < 0.1%
Value | Count | Frequency (%)
19401094394 | 1 | < 0.1%
8243625581 | 1 | < 0.1%
5005607514 | 1 | < 0.1%
4951757501 | 1 | < 0.1%
4591174666 | 1 | < 0.1%
4043437183 | 1 | < 0.1%
3758113640 | 1 | < 0.1%
3620629784 | 1 | < 0.1%
3594047276 | 1 | < 0.1%
3376282422 | 1 | < 0.1%

매출건수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 2954
Distinct (%): 29.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1543.8451
Minimum: 1
Maximum: 112783
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 30
Median: 164
Q3: 943.25
95-th percentile: 7867.05
Maximum: 112783
Range: 112782
Interquartile range (IQR): 913.25

Descriptive statistics

Standard deviation: 4658.6147
Coefficient of variation (CV): 3.0175402
Kurtosis: 109.29248
Mean: 1543.8451
Median Absolute Deviation (MAD): 157
Skewness: 8.3454143
Sum: 15438451
Variance: 21702691
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
2 | 217 | 2.2%
4 | 161 | 1.6%
1 | 150 | 1.5%
3 | 143 | 1.4%
6 | 128 | 1.3%
9 | 123 | 1.2%
5 | 115 | 1.1%
7 | 107 | 1.1%
8 | 100 | 1.0%
10 | 97 | 1.0%
Other values (2944) | 8659 | 86.6%
Value | Count | Frequency (%)
1 | 150 | 1.5%
2 | 217 | 2.2%
3 | 143 | 1.4%
4 | 161 | 1.6%
5 | 115 | 1.1%
6 | 128 | 1.3%
7 | 107 | 1.1%
8 | 100 | 1.0%
9 | 123 | 1.2%
10 | 97 | 1.0%
Value | Count | Frequency (%)
112783 | 1 | < 0.1%
82058 | 1 | < 0.1%
79086 | 1 | < 0.1%
78446 | 1 | < 0.1%
77147 | 1 | < 0.1%
75061 | 1 | < 0.1%
74420 | 1 | < 0.1%
66105 | 1 | < 0.1%
64369 | 1 | < 0.1%
60610 | 1 | < 0.1%

Interactions

16 interaction plots (Matplotlib v3.7.2, SVG); images not included.

Correlations

Variable | 상권ID | 산업분류코드 | 산업분류코드명 | 매출금액 | 매출건수
상권ID | 1.000 | 0.058 | 0.149 | 0.055 | 0.061
산업분류코드 | 0.058 | 1.000 | 1.000 | 0.032 | 0.077
산업분류코드명 | 0.149 | 1.000 | 1.000 | 0.098 | 0.300
매출금액 | 0.055 | 0.032 | 0.098 | 1.000 | 0.250
매출건수 | 0.061 | 0.077 | 0.300 | 0.250 | 1.000

Variable | 상권ID | 산업분류코드 | 매출금액 | 매출건수
상권ID | 1.000 | -0.013 | -0.159 | -0.118
산업분류코드 | -0.013 | 1.000 | 0.005 | -0.069
매출금액 | -0.159 | 0.005 | 1.000 | 0.640
매출건수 | -0.118 | -0.069 | 0.640 | 1.000
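
The two matrices report different values for the same pair (0.250 and 0.640 for 매출금액 and 매출건수), which is consistent with each matrix using a different correlation measure; the source does not say which. As an illustration only, the sketch below compares Pearson correlation with Spearman rank correlation on hypothetical toy data, showing how much the two can differ when one variable is heavily skewed.

```python
import math


def average_ranks(xs):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical toy data: monotonic but strongly non-linear.
xs = [1, 2, 3, 4, 5]
ys = [1, 10, 100, 1000, 10 ** 6]
print(round(pearson(xs, ys), 3), round(spearman(xs, ys), 3))
```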

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
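
The per-column nullity summarized by these plots amounts to counting missing entries in each column. A minimal sketch in plain Python, assuming rows are dictionaries and a missing value is represented by `None`; the helper name and the third row are hypothetical and only there to show a non-zero count (this dataset reports 0 missing cells).

```python
def missing_counts(rows, columns):
    """Count None entries per column across a list of row dictionaries."""
    return {col: sum(1 for row in rows if row.get(col) is None) for col in columns}


rows = [
    {"상권ID": 1409, "상권명": "김포대로319번길"},
    {"상권ID": 426, "상권명": "인덕원역_2번출구"},
    {"상권ID": 9999, "상권명": None},  # hypothetical row with a gap
]
nullity = missing_counts(rows, ["상권ID", "상권명"])
print(nullity)  # {'상권ID': 0, '상권명': 1}
```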

Sample

index | 기준연도 | 기준분기 | 상권ID | 상권명 | 산업분류코드 | 산업분류코드명 | 매출금액매출건수
15382 | 2023 | 1 | 1409 | 김포대로319번길 | 47121 | 슈퍼마켓 | 7797062659
2666 | 2023 | 1 | 426 | 인덕원역_2번출구 | 47811 | 의약품 및 의료용품 소매업 | 1050059995851
31273 | 2023 | 1 | 217 | 보정동주민센터 | 56121 | 중식 음식점업 | 1950535662139
11528 | 2023 | 1 | 1754 | 서정마을2로7번길 | 47219 | 기타 식료품 소매업 | 645633571582
7468 | 2023 | 1 | 1148 | 역전로 | 56111 | 한식 일반 음식점업 | 70797234664
9702 | 2023 | 1 | 1537 | 경의로 | 55901 | 기숙사 및 고시원 운영업 | 13483822
39100 | 2023 | 1 | 833 | 철산역_4번출구 | 85629 | 기타 예술학원 | 10076797185
6027 | 2023 | 1 | 1081 | 부부로2길 | 47813 | 화장품, 비누 및 방향제 소매업 | 191976412
26134 | 2023 | 1 | 1664 | 평화로 | 56121 | 중식 음식점업 | 475074347
30909 | 2023 | 1 | 569 | 오리로_3 | 56194 | 김밥 및 기타 간이 음식점업 | 9329038226

index | 기준연도 | 기준분기 | 상권ID | 상권명 | 산업분류코드 | 산업분류코드명 | 매출금액매출건수
4851 | 2023 | 1 | 1015 | 율마로438번길 | 56123 | 서양식 음식점업 | 3168337578
14506 | 2023 | 1 | 1478 | 다리간2길 | 47411 | 남자용 겉옷 소매업 | 61088945
23735 | 2023 | 1 | 151 | 광덕4로_1 | 56111 | 한식 일반 음식점업 | 3563294135247
37911 | 2023 | 1 | 830 | 쇠재안길 | 47311 | 컴퓨터 및 주변장치, 소프트웨어 소매업 | 488056337
1745 | 2023 | 1 | 416 | 시흥정왕동우체국 | 47122 | 체인화 편의점 | 112692539166105
4399 | 2023 | 1 | 655 | 포천우체국 | 47429 | 섬유 원단, 실 및 기타 섬유제품 소매업 | 1061921310
7613 | 2023 | 1 | 1091 | 수목원로468번길 | 91135 | 당구장 운영업 | 15000780140
6841 | 2023 | 1 | 1043 | 한글세계평화지도전시관 | 91135 | 당구장 운영업 | 14465193136
15586 | 2023 | 1 | 1213 | 직행고속버스정류소 | 56111 | 한식 일반 음식점업 | 335225231231
4964 | 2023 | 1 | 858 | 대화역_4번출구 | 47312 | 통신기기 소매업 | 184966666