Overview

Dataset statistics

Number of variables: 6
Number of observations: 427
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 21.8 KiB
Average record size in memory: 52.3 B
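
The average record size appears to be the total in-memory size divided by the number of observations. The sketch below checks that arithmetic; the function name `average_record_size` is hypothetical and 1 KiB is taken as 1024 bytes.

```python
KIB = 1024  # bytes per KiB


def average_record_size(total_kib: float, n_rows: int) -> float:
    """Return the average bytes per row for a dataset of total_kib KiB."""
    return total_kib * KIB / n_rows


# Figures reported in the dataset statistics above.
avg_bytes = average_record_size(21.8, 427)
print(f"{avg_bytes:.1f} B")
```

The result agrees with the reported 52.3 B to one decimal place.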

Variable types

Categorical: 2
Text: 1
Numeric: 3

Dataset

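Description: One of the statistical indices developed by Gimhae City to assess urban conditions on a statistical basis. It consists of six fields: 통계연도 (statistics year), 시도명 (province or metropolitan city name), 시군구명 (city, county, or district name), 소상공인비율(퍼센트) (small-business ratio, in percent), 전체 기업수(개) (total number of enterprises), and 소상공인 수(개) (number of small businesses). Because the index is centered on Gimhae City, information for regions outside Gimhae may be missing owing to difficulties in data collection and processing.
Author: Gimhae City, Gyeongsangnam-do (경상남도 김해시)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15110105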

Alerts

High correlation: 소상공인비율(퍼센트) is highly overall correlated with 전체 기업수(개) and 1 other field.
High correlation: 전체 기업수(개) is highly overall correlated with 소상공인비율(퍼센트) and 1 other field.
High correlation: 소상공인 수(개) is highly overall correlated with 소상공인비율(퍼센트) and 1 other field.
Unique: 전체 기업수(개) has unique values.

Reproduction

Analysis started: 2023-12-10 23:24:38.995985
Analysis finished: 2023-12-10 23:24:40.137696
Duration: 1.14 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
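
The reported duration can be checked against the two timestamps. This is a minimal sketch using only the standard library; the format string `FMT` is an assumption about how the timestamps are laid out, inferred from the values shown.

```python
from datetime import datetime

# Assumed layout of the timestamps printed in the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-10 23:24:38.995985", FMT)
finished = datetime.strptime("2023-12-10 23:24:40.137696", FMT)

# Elapsed wall-clock time between the two timestamps, in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```

The difference is 1.141711 seconds, which rounds to the 1.14 seconds shown.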

Variables

통계연도
Categorical

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 KiB
2020: 214
2019: 213

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2019
2nd row: 2019
3rd row: 2019
4th row: 2019
5th row: 2019

Common Values

Value | Count | Frequency (%)
2020 | 214 | 50.1%
2019 | 213 | 49.9%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

시도명
Categorical

Distinct: 16
Distinct (%): 3.7%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 KiB
경기도: 80
서울특별시: 50
경상북도: 39
경상남도: 36
전라남도: 34
Other values (11): 188

Length

Max length: 7
Median length: 5
Mean length: 4.1217799
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 서울특별시
3rd row: 서울특별시
4th row: 서울특별시
5th row: 서울특별시

Common Values

Value | Count | Frequency (%)
경기도 | 80 | 18.7%
서울특별시 | 50 | 11.7%
경상북도 | 39 | 9.1%
경상남도 | 36 | 8.4%
전라남도 | 34 | 8.0%
부산광역시 | 32 | 7.5%
충청남도 | 28 | 6.6%
강원도 | 24 | 5.6%
충청북도 | 20 | 4.7%
인천광역시 | 18 | 4.2%
Other values (6) | 66 | 15.5%

Length

[Figure: Histogram of lengths of the category]

시군구명
Text

Distinct: 198
Distinct (%): 46.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 KiB

Length

Max length: 8
Median length: 3
Mean length: 3.4613583
Min length: 2

Characters and Unicode

Total characters: 1478
Distinct characters: 135
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10
Unique (%): 2.3%

Sample

1st row: 종로구
2nd row: 중구
3rd row: 용산구
4th row: 성동구
5th row: 광진구

Words

Value | Count | Frequency (%)
중구 | 12 | 2.5%
동구 | 11 | 2.3%
북구 | 10 | 2.1%
서구 | 9 | 1.9%
남구 | 9 | 1.9%
청주시 | 8 | 1.7%
수원시 | 8 | 1.7%
성남시 | 6 | 1.2%
용인시 | 6 | 1.2%
고양시 | 6 | 1.2%
Other values (196) | 396 | 82.3%
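
The counts in this table sum to 481, not 427, which suggests the frequencies are computed over whitespace-separated words rather than over rows. The sketch below checks that reading against the reported figures. It assumes each of the 54 space characters separates exactly two words within a row; the helper `pct` is hypothetical.

```python
n_rows = 427          # observations in the dataset
total_chars = 1478    # total characters reported for this variable
n_spaces = 54         # Space Separator count reported for this variable

# Mean length is total characters divided by the number of rows.
mean_length = total_chars / n_rows

# Assumption: every space splits one row into one additional word.
n_words = n_rows + n_spaces


def pct(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)


print(round(mean_length, 7))  # compare with the reported mean length
print(n_words)                # compare with the sum of the Count column
print(pct(12, n_words))       # compare with the frequency listed for 중구
```

Under that assumption the word total is 481, the mean length matches 3.4613583, and 12 of 481 words gives the 2.5% listed for 중구.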

Most occurring characters

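Value | Count | Frequency (%)
[missing] | 209 | 14.1%
[missing] | 185 | 12.5%
[missing] | 103 | 7.0%
(space) | 54 | 3.7%
[missing] | 45 | 3.0%
[missing] | 44 | 3.0%
[missing] | 38 | 2.6%
[missing] | 36 | 2.4%
[missing] | 36 | 2.4%
[missing] | 34 | 2.3%
Other values (125) | 694 | 47.0%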

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1424 | 96.3%
Space Separator | 54 | 3.7%

Most frequent character per category

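Other Letter
Value | Count | Frequency (%)
[missing] | 209 | 14.7%
[missing] | 185 | 13.0%
[missing] | 103 | 7.2%
[missing] | 45 | 3.2%
[missing] | 44 | 3.1%
[missing] | 38 | 2.7%
[missing] | 36 | 2.5%
[missing] | 36 | 2.5%
[missing] | 34 | 2.4%
[missing] | 31 | 2.2%
Other values (124) | 663 | 46.6%

Space Separator
Value | Count | Frequency (%)
(space) | 54 | 100.0%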

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1424 | 96.3%
Common | 54 | 3.7%

Most frequent character per script

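Hangul
Value | Count | Frequency (%)
[missing] | 209 | 14.7%
[missing] | 185 | 13.0%
[missing] | 103 | 7.2%
[missing] | 45 | 3.2%
[missing] | 44 | 3.1%
[missing] | 38 | 2.7%
[missing] | 36 | 2.5%
[missing] | 36 | 2.5%
[missing] | 34 | 2.4%
[missing] | 31 | 2.2%
Other values (124) | 663 | 46.6%

Common
Value | Count | Frequency (%)
(space) | 54 | 100.0%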

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1424 | 96.3%
ASCII | 54 | 3.7%

Most frequent character per block

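Hangul
Value | Count | Frequency (%)
[missing] | 209 | 14.7%
[missing] | 185 | 13.0%
[missing] | 103 | 7.2%
[missing] | 45 | 3.2%
[missing] | 44 | 3.1%
[missing] | 38 | 2.7%
[missing] | 36 | 2.5%
[missing] | 36 | 2.5%
[missing] | 34 | 2.4%
[missing] | 31 | 2.2%
Other values (124) | 663 | 46.6%

ASCII
Value | Count | Frequency (%)
(space) | 54 | 100.0%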

소상공인비율(퍼센트)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 307
Distinct (%): 71.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 94.309508
Minimum: 84.68
Maximum: 97.71
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.9 KiB

Quantile statistics

Minimum: 84.68
5th percentile: 91.451
Q1: 93.39
Median: 94.46
Q3: 95.475
95th percentile: 96.927
Maximum: 97.71
Range: 13.03
Interquartile range (IQR): 2.085

Descriptive statistics

Standard deviation: 1.7911864
Coefficient of variation (CV): 0.018992639
Kurtosis: 3.6334594
Mean: 94.309508
Median Absolute Deviation (MAD): 1.04
Skewness: -1.1844932
Sum: 40270.16
Variance: 3.2083488
Monotonicity: Not monotonic
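
Several of the figures above are derived from others: range and IQR from the quantiles, CV and variance from the standard deviation and mean, and the mean from the sum. The sketch below recomputes them from the reported values as a consistency check. The variable names are hypothetical, and small differences are expected because the inputs are already rounded.

```python
# Reported primary statistics for 소상공인비율(퍼센트).
n_rows = 427
total = 40270.16
mean = 94.309508
std = 1.7911864
minimum, maximum = 84.68, 97.71
q1, q3 = 93.39, 95.475

# Quantities that can be rederived from the primary statistics.
derived = {
    "mean": total / n_rows,        # sum divided by number of observations
    "range": maximum - minimum,    # spread between the extremes
    "iqr": q3 - q1,                # spread of the middle half
    "cv": std / mean,              # coefficient of variation
    "variance": std ** 2,          # square of the standard deviation
}

for name, value in derived.items():
    print(f"{name}: {value:.7g}")
```

Each recomputed value agrees with the corresponding reported statistic up to rounding of the inputs.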
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
94.86 | 4 | 0.9%
95.5 | 4 | 0.9%
95.21 | 4 | 0.9%
95.31 | 4 | 0.9%
94.99 | 3 | 0.7%
94.44 | 3 | 0.7%
93.66 | 3 | 0.7%
94.85 | 3 | 0.7%
95.42 | 3 | 0.7%
93.87 | 3 | 0.7%
Other values (297) | 393 | 92.0%

Minimum 10 values

Value | Count | Frequency (%)
84.68 | 1 | 0.2%
85.52 | 1 | 0.2%
87.28 | 1 | 0.2%
87.72 | 1 | 0.2%
87.96 | 1 | 0.2%
88.5 | 1 | 0.2%
89.62 | 1 | 0.2%
89.91 | 1 | 0.2%
89.98 | 1 | 0.2%
90.13 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
97.71 | 2 | 0.5%
97.65 | 1 | 0.2%
97.63 | 1 | 0.2%
97.55 | 1 | 0.2%
97.47 | 2 | 0.5%
97.41 | 1 | 0.2%
97.23 | 1 | 0.2%
97.21 | 1 | 0.2%
97.16 | 1 | 0.2%
97.14 | 1 | 0.2%

전체 기업수(개)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 427
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31895.616
Minimum: 1532
Maximum: 148846
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.9 KiB

Quantile statistics

Minimum: 1532
5th percentile: 4775.1
Q1: 13335
Median: 28929
Q3: 44757.5
95th percentile: 75766.6
Maximum: 148846
Range: 147314
Interquartile range (IQR): 31422.5

Descriptive statistics

Standard deviation: 23861.963
Coefficient of variation (CV): 0.74812673
Kurtosis: 3.1939565
Mean: 31895.616
Median Absolute Deviation (MAD): 15761
Skewness: 1.4527273
Sum: 13619428
Variance: 5.6939328 × 10^8
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
61395 | 1 | 0.2%
110483 | 1 | 0.2%
27975 | 1 | 0.2%
51386 | 1 | 0.2%
68744 | 1 | 0.2%
31092 | 1 | 0.2%
28826 | 1 | 0.2%
39263 | 1 | 0.2%
32419 | 1 | 0.2%
41003 | 1 | 0.2%
Other values (417) | 417 | 97.7%

Minimum 10 values

Value | Count | Frequency (%)
1532 | 1 | 0.2%
2155 | 1 | 0.2%
2315 | 1 | 0.2%
2923 | 1 | 0.2%
3015 | 1 | 0.2%
3215 | 1 | 0.2%
3221 | 1 | 0.2%
3230 | 1 | 0.2%
3261 | 1 | 0.2%
3276 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
148846 | 1 | 0.2%
139768 | 1 | 0.2%
133629 | 1 | 0.2%
119985 | 1 | 0.2%
110483 | 1 | 0.2%
105286 | 1 | 0.2%
102623 | 1 | 0.2%
98824 | 1 | 0.2%
98299 | 1 | 0.2%
96805 | 1 | 0.2%

소상공인 수(개)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 426
Distinct (%): 99.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29829.719
Minimum: 1485
Maximum: 127286
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.9 KiB

Quantile statistics

Minimum: 1485
5th percentile: 4605.1
Q1: 12783
Median: 27311
Q3: 41917
95th percentile: 70315.1
Maximum: 127286
Range: 125801
Interquartile range (IQR): 29134

Descriptive statistics

Standard deviation: 21805.078
Coefficient of variation (CV): 0.73098503
Kurtosis: 2.4510544
Mean: 29829.719
Median Absolute Deviation (MAD): 14564
Skewness: 1.3191492
Sum: 12737290
Variance: 4.7546142 × 10^8
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
10300 | 2 | 0.5%
56456 | 1 | 0.2%
44315 | 1 | 0.2%
49174 | 1 | 0.2%
62995 | 1 | 0.2%
28933 | 1 | 0.2%
27176 | 1 | 0.2%
36747 | 1 | 0.2%
30663 | 1 | 0.2%
38608 | 1 | 0.2%
Other values (416) | 416 | 97.4%

Minimum 10 values

Value | Count | Frequency (%)
1485 | 1 | 0.2%
2104 | 1 | 0.2%
2262 | 1 | 0.2%
2856 | 1 | 0.2%
2944 | 1 | 0.2%
3080 | 1 | 0.2%
3102 | 1 | 0.2%
3121 | 1 | 0.2%
3124 | 1 | 0.2%
3135 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
127286 | 1 | 0.2%
121975 | 1 | 0.2%
118355 | 1 | 0.2%
109117 | 1 | 0.2%
104069 | 1 | 0.2%
98717 | 1 | 0.2%
94200 | 1 | 0.2%
91265 | 1 | 0.2%
90431 | 1 | 0.2%
88822 | 1 | 0.2%

Interactions

[Figures: nine interaction plots, not reproduced]

Correlations

[Figure: correlation heatmap]

Variable | 통계연도 | 시도명 | 소상공인비율(퍼센트) | 전체 기업수(개) | 소상공인 수(개)
통계연도 | 1.000 | 0.000 | 0.023 | 0.000 | 0.000
시도명 | 0.000 | 1.000 | 0.362 | 0.570 | 0.527
소상공인비율(퍼센트) | 0.023 | 0.362 | 1.000 | 0.844 | 0.799
전체 기업수(개) | 0.000 | 0.570 | 0.844 | 1.000 | 0.980
소상공인 수(개) | 0.000 | 0.527 | 0.799 | 0.980 | 1.000

[Figure: correlation heatmap]

Variable | 통계연도 | 시도명
통계연도 | 1.000 | 0.000
시도명 | 0.000 | 1.000

[Figure: correlation heatmap]

Variable | 소상공인비율(퍼센트) | 전체 기업수(개) | 소상공인 수(개) | 통계연도 | 시도명
소상공인비율(퍼센트) | 1.000 | -0.542 | -0.527 | 0.016 | 0.149
전체 기업수(개) | -0.542 | 1.000 | 1.000 | 0.000 | 0.262
소상공인 수(개) | -0.527 | 1.000 | 1.000 | 0.000 | 0.236
통계연도 | 0.016 | 0.000 | 0.000 | 1.000 | 0.000
시도명 | 0.149 | 0.262 | 0.236 | 0.000 | 1.000
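
The high-correlation alerts near the top of the report can be related to the first (five-variable) matrix by flagging every off-diagonal entry above a cutoff. The sketch below does that. The cutoff of 0.75 and the function name `highly_correlated` are assumptions made for illustration; the report does not state the threshold the profiler actually used.

```python
names = ["통계연도", "시도명", "소상공인비율(퍼센트)", "전체 기업수(개)", "소상공인 수(개)"]

# Values copied from the first correlation matrix in this section.
matrix = [
    [1.000, 0.000, 0.023, 0.000, 0.000],
    [0.000, 1.000, 0.362, 0.570, 0.527],
    [0.023, 0.362, 1.000, 0.844, 0.799],
    [0.000, 0.570, 0.844, 1.000, 0.980],
    [0.000, 0.527, 0.799, 0.980, 1.000],
]

CUTOFF = 0.75  # hypothetical threshold, not taken from the report


def highly_correlated(names, matrix, cutoff):
    """Map each variable to the other variables whose |r| meets the cutoff."""
    flagged = {}
    for i, name in enumerate(names):
        partners = [
            names[j]
            for j, value in enumerate(matrix[i])
            if j != i and abs(value) >= cutoff
        ]
        if partners:
            flagged[name] = partners
    return flagged


flagged = highly_correlated(names, matrix, CUTOFF)
for name, partners in flagged.items():
    print(f"{name}: {', '.join(partners)}")
```

With this illustrative cutoff, each of the three numeric variables is flagged with two partners and neither categorical variable is flagged, which matches the pattern of the alerts. A lower cutoff such as 0.5 would also flag 시도명.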

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

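First rows

Index | 통계연도 | 시도명 | 시군구명 | 소상공인비율(퍼센트) | 전체 기업수(개) | 소상공인 수(개)
0 | 2019 | 서울특별시 | 종로구 | 91.96 | 61395 | 56456
1 | 2019 | 서울특별시 | 중구 | 92.84 | 98299 | 91265
2 | 2019 | 서울특별시 | 용산구 | 91.43 | 39695 | 36293
3 | 2019 | 서울특별시 | 성동구 | 89.62 | 47098 | 42207
4 | 2019 | 서울특별시 | 광진구 | 93.94 | 43696 | 41047
5 | 2019 | 서울특별시 | 동대문구 | 94.62 | 52350 | 49534
6 | 2019 | 서울특별시 | 중랑구 | 96.25 | 44825 | 43146
7 | 2019 | 서울특별시 | 성북구 | 95.61 | 42787 | 40908
8 | 2019 | 서울특별시 | 강북구 | 95.7 | 32035 | 30658
9 | 2019 | 서울특별시 | 도봉구 | 96.12 | 30524 | 29341

Last rows

Index | 통계연도 | 시도명 | 시군구명 | 소상공인비율(퍼센트) | 전체 기업수(개) | 소상공인 수(개)
417 | 2020 | 경상남도 | 의령군 | 95.71 | 3261 | 3121
418 | 2020 | 경상남도 | 함안군 | 92.15 | 11532 | 10627
419 | 2020 | 경상남도 | 창녕군 | 94.1 | 7508 | 7065
420 | 2020 | 경상남도 | 고성군 | 95.46 | 7579 | 7235
421 | 2020 | 경상남도 | 남해군 | 97.41 | 6768 | 6593
422 | 2020 | 경상남도 | 하동군 | 97.23 | 6715 | 6529
423 | 2020 | 경상남도 | 함양군 | 96.33 | 5118 | 4930
424 | 2020 | 제주특별자치도 | 제주시 | 93.75 | 84394 | 79121
425 | 2020 | 제주특별자치도 | 서귀포시 | 94.72 | 32770 | 31041
426 | 2020 | 인천광역시 | 미추홀구 | 95.42 | 50438 | 48127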
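
In the sample rows, 소상공인비율(퍼센트) appears to equal 소상공인 수(개) divided by 전체 기업수(개), expressed as a percentage and rounded to two decimals. The source does not state this formula, so the sketch below treats it as an assumption and checks it against a few of the rows shown. The function name `small_business_ratio` is hypothetical.

```python
def small_business_ratio(total_enterprises: int, small_businesses: int) -> float:
    """Return small businesses as a percentage of all enterprises (2 decimals)."""
    return round(100 * small_businesses / total_enterprises, 2)


# (전체 기업수, 소상공인 수, reported 소상공인비율) for sample rows 0, 8, 417, 424.
rows = [
    (61395, 56456, 91.96),
    (32035, 30658, 95.7),
    (3261, 3121, 95.71),
    (84394, 79121, 93.75),
]

for total, small, reported in rows:
    computed = small_business_ratio(total, small)
    print(f"{total:>6} {small:>6} reported={reported} computed={computed}")
```

For these rows the computed ratio matches the reported value, which is consistent with the assumed formula.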