Overview

Dataset statistics

Number of variables: 6
Number of observations: 1286
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 65.4 KiB
Average record size in memory: 52.1 B
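
As a quick sanity check, the average record size can be reproduced from the total memory size and the number of observations. The sketch below assumes that 1 KiB is 1024 bytes and that the report simply divides one figure by the other; the variable names are illustrative only.

```python
KIB = 1024  # bytes per KiB (assumption about the unit used by the report)

total_size_kib = 65.4   # "Total size in memory" from the report
n_observations = 1286   # "Number of observations" from the report

# Average bytes per row, assuming a plain division of the two figures
avg_record_bytes = total_size_kib * KIB / n_observations
print(f"{avg_record_bytes:.1f} B")
```

The result rounds to the 52.1 B shown in the report.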

Variable types

Categorical: 2
Text: 1
Numeric: 3

Dataset

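Description: One of the statistical indices developed by Gimhae City to assess the state of the city on a statistical basis. It consists of the statistics year (통계연도), province name (시도명), city/county/district name (시군구명), foreigner ratio in percent (외국인비율(퍼센트)), registered Korean resident population in persons (주민등록(한국인)인구수(명)), and number of registered foreigners in persons (등록외국인수(명)). Because the index is centered on Gimhae City, information for regions outside Gimhae may be missing owing to difficulties in data collection and processing.
Author: Gimhae City, Gyeongsangnam-do (경상남도 김해시)
URL: https://www.data.go.kr/data/15110120/fileData.do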

Alerts

외국인비율(퍼센트) is highly correlated overall with 등록외국인수(명) (High correlation)
주민등록(한국인)인구수(명) is highly correlated overall with 등록외국인수(명) (High correlation)
등록외국인수(명) is highly correlated overall with 외국인비율(퍼센트) and 주민등록(한국인)인구수(명) (High correlation)

Reproduction

Analysis started: 2023-12-12 07:09:31.519670
Analysis finished: 2023-12-12 07:09:32.597600
Duration: 1.08 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
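
A report of this kind could be regenerated roughly as follows. This is a minimal sketch, not the configuration actually used: the CSV path, the output path, the report title, and the `encoding` default are all hypothetical, and the original `config.json` is not reproduced here. It relies only on `pandas.read_csv` and the `ProfileReport` class of ydata-profiling.

```python
def build_report(csv_path, html_path="report.html", encoding="utf-8"):
    """Profile a CSV file and write the HTML report (paths are placeholders)."""
    # Imported lazily so that defining this function does not require the libraries
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding of the published file is an assumption; adjust as needed
    df = pd.read_csv(csv_path, encoding=encoding)

    # Default settings; the title is an illustrative choice
    report = ProfileReport(df, title="Foreigner ratio by region")
    report.to_file(html_path)
    return html_path
```

Calling `build_report("foreigner_ratio.csv")` with a local copy of the dataset (file name hypothetical) would write `report.html` next to the script.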

Variables

통계연도
Categorical

Distinct: 5
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 10.2 KiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2017
2nd row: 2017
3rd row: 2017
4th row: 2017
5th row: 2017

Common Values

Value  Count  Frequency (%)
2017   258    20.1%
2018   257    20.0%
2019   257    20.0%
2020   257    20.0%
2021   257    20.0%
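
The Frequency (%) column is each count divided by the total number of observations. A small sketch using the counts reported for 통계연도 (the dictionary and variable names are illustrative):

```python
# Counts per year as listed in the Common Values table
counts = {"2017": 258, "2018": 257, "2019": 257, "2020": 257, "2021": 257}

total = sum(counts.values())  # should match the number of observations

# Percentage share of each year, rounded to one decimal as in the report
freq = {year: round(100 * n / total, 1) for year, n in counts.items()}

for year, pct in freq.items():
    print(year, counts[year], f"{pct}%")
```

The counts add up to the 1286 observations stated in the overview.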

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

시도명
Categorical

Distinct: 16
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 10.2 KiB

Length

Max length: 7
Median length: 5
Mean length: 4.0474339
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강원도
2nd row: 강원도
3rd row: 강원도
4th row: 강원도
5th row: 강원도

Common Values

Value             Count  Frequency (%)
경기도            240    18.7%
경상북도          125    9.7%
서울특별시        125    9.7%
경상남도          115    8.9%
전라남도          110    8.6%
강원도            90     7.0%
전라북도          80     6.2%
충청남도          80     6.2%
부산광역시        80     6.2%
충청북도          75     5.8%
Other values (6)  166    12.9%

Length

[Figure: Histogram of lengths of the category]

시군구명
Text

Distinct: 233
Distinct (%): 18.1%
Missing: 0
Missing (%): 0.0%
Memory size: 10.2 KiB

Length

Max length: 5
Median length: 3
Mean length: 2.9525661
Min length: 2

Characters and Unicode

Total characters: 3797
Distinct characters: 137
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
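
The mean length reported for this variable agrees with the total character count divided by the number of observations. A brief check, assuming that this is how the mean is obtained (variable names are illustrative):

```python
total_characters = 3797  # "Total characters" from the report
n_observations = 1286    # "Number of observations" from the overview

# Average number of characters per value
mean_length = total_characters / n_observations
print(round(mean_length, 7))
```

The quotient matches the reported mean length of 2.9525661 to seven decimal places.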

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강릉시
2nd row: 고성군
3rd row: 동해시
4th row: 삼척시
5th row: 속초시

Value               Count  Frequency (%)
중구                30     2.3%
동구                30     2.3%
남구                26     2.0%
서구                25     1.9%
북구                25     1.9%
고성군              10     0.8%
강서구              10     0.8%
금산군              5      0.4%
완산구              5      0.4%
덕진구              5      0.4%
Other values (223)  1115   86.7%

Most occurring characters

Value               Count  Frequency (%)
                    521    13.7%
                    425    11.2%
                    385    10.1%
                    110    2.9%
                    110    2.9%
                    100    2.6%
                    100    2.6%
                    95     2.5%
                    90     2.4%
                    80     2.1%
Other values (127)  1781   46.9%

Most occurring categories

Value         Count  Frequency (%)
Other Letter  3797   100.0%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
                    521    13.7%
                    425    11.2%
                    385    10.1%
                    110    2.9%
                    110    2.9%
                    100    2.6%
                    100    2.6%
                    95     2.5%
                    90     2.4%
                    80     2.1%
Other values (127)  1781   46.9%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  3797   100.0%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
                    521    13.7%
                    425    11.2%
                    385    10.1%
                    110    2.9%
                    110    2.9%
                    100    2.6%
                    100    2.6%
                    95     2.5%
                    90     2.4%
                    80     2.1%
Other values (127)  1781   46.9%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  3797   100.0%

Most frequent character per block

Hangul
Value               Count  Frequency (%)
                    521    13.7%
                    425    11.2%
                    385    10.1%
                    110    2.9%
                    110    2.9%
                    100    2.6%
                    100    2.6%
                    95     2.5%
                    90     2.4%
                    80     2.1%
Other values (127)  1781   46.9%

외국인비율(퍼센트)
Real number (ℝ)

HIGH CORRELATION

Distinct: 437
Distinct (%): 34.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.1669051
Minimum: 0.25
Maximum: 12.83
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.4 KiB

Quantile statistics

Minimum: 0.25
5-th percentile: 0.64
Q1: 1.07
Median: 1.63
Q3: 2.71
95-th percentile: 5.9675
Maximum: 12.83
Range: 12.58
Interquartile range (IQR): 1.64

Descriptive statistics

Standard deviation: 1.6785779
Coefficient of variation (CV): 0.774643
Kurtosis: 6.4976739
Mean: 2.1669051
Median Absolute Deviation (MAD): 0.67
Skewness: 2.2118018
Sum: 2786.64
Variance: 2.8176237
Monotonicity: Not monotonic
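
Several of these figures are derived from others in the same block. The sketch below recomputes the range, IQR, coefficient of variation, and variance for 외국인비율(퍼센트) from the reported minimum, maximum, quartiles, mean, and standard deviation, using the standard textbook definitions; the variable names are illustrative.

```python
# Values copied from the report for 외국인비율(퍼센트)
minimum, maximum = 0.25, 12.83
q1, q3 = 1.07, 2.71
mean, std = 2.1669051, 1.6785779

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # Interquartile range = Q3 - Q1
cv = std / mean                  # Coefficient of variation = std / mean
variance = std ** 2              # Variance = std squared

print(round(value_range, 2), round(iqr, 2), round(cv, 6), round(variance, 7))
```

Up to rounding of the inputs, the recomputed values agree with the Range, IQR, CV, and Variance rows above.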
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
1.22                12     0.9%
1.72                12     0.9%
1.09                11     0.9%
0.98                11     0.9%
1.0                 11     0.9%
1.14                11     0.9%
0.94                10     0.8%
1.19                10     0.8%
1.31                9      0.7%
1.04                9      0.7%
Other values (427)  1180   91.8%

Minimum 10 values

Value  Count  Frequency (%)
0.25   1      0.1%
0.26   1      0.1%
0.3    1      0.1%
0.31   1      0.1%
0.33   2      0.2%
0.34   1      0.1%
0.35   3      0.2%
0.36   1      0.1%
0.37   3      0.2%
0.38   2      0.2%

Maximum 10 values

Value  Count  Frequency (%)
12.83  1      0.1%
12.48  1      0.1%
12.31  1      0.1%
11.43  1      0.1%
10.41  1      0.1%
9.05   1      0.1%
8.88   1      0.1%
8.82   1      0.1%
8.53   1      0.1%
8.33   1      0.1%

주민등록(한국인)인구수(명)
Real number (ℝ)

HIGH CORRELATION

Distinct: 1285
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 233086.49
Minimum: 8867
Maximum: 1202628
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.4 KiB

Quantile statistics

Minimum: 8867
5-th percentile: 27391.5
Q1: 62280.75
Median: 185626.5
Q3: 342607.75
95-th percentile: 624071
Maximum: 1202628
Range: 1193761
Interquartile range (IQR): 280327

Descriptive statistics

Standard deviation: 211189.69
Coefficient of variation (CV): 0.9060572
Kurtosis: 3.4824657
Mean: 233086.49
Median Absolute Deviation (MAD): 132007.5
Skewness: 1.6286931
Sum: 2.9974922 × 10^8
Variance: 4.4601086 × 10^10
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                Count  Frequency (%)
122499               2      0.2%
213952               1      0.1%
61502                1      0.1%
263728               1      0.1%
44006                1      0.1%
39080                1      0.1%
64182                1      0.1%
44785                1      0.1%
128293               1      0.1%
1036738              1      0.1%
Other values (1275)  1275   99.1%

Minimum 10 values

Value  Count  Frequency (%)
8867   1      0.1%
9077   1      0.1%
9617   1      0.1%
9832   1      0.1%
9975   1      0.1%
16320  1      0.1%
16692  1      0.1%
16993  1      0.1%
17356  1      0.1%
17479  1      0.1%

Maximum 10 values

Value    Count  Frequency (%)
1202628  1      0.1%
1201166  1      0.1%
1194465  1      0.1%
1186078  1      0.1%
1183714  1      0.1%
1079353  1      0.1%
1079216  1      0.1%
1077508  1      0.1%
1074176  1      0.1%
1066351  1      0.1%

등록외국인수(명)
Real number (ℝ)

HIGH CORRELATION

Distinct: 1177
Distinct (%): 91.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5343.5762
Minimum: 142
Maximum: 56787
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.4 KiB

Quantile statistics

Minimum: 142
5-th percentile: 283
Q1: 998.75
Median: 2869
Q3: 6143.75
95-th percentile: 17889.5
Maximum: 56787
Range: 56645
Interquartile range (IQR): 5145

Descriptive statistics

Standard deviation: 7275.5918
Coefficient of variation (CV): 1.3615585
Kurtosis: 12.995932
Mean: 5343.5762
Median Absolute Deviation (MAD): 2075
Skewness: 3.1540256
Sum: 6871839
Variance: 52934236
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                Count  Frequency (%)
773                  4      0.3%
1502                 3      0.2%
258                  3      0.2%
635                  3      0.2%
682                  3      0.2%
265                  3      0.2%
183                  3      0.2%
504                  3      0.2%
1500                 3      0.2%
931                  3      0.2%
Other values (1167)  1255   97.6%

Minimum 10 values

Value  Count  Frequency (%)
142    1      0.1%
150    1      0.1%
151    1      0.1%
166    1      0.1%
168    2      0.2%
173    1      0.1%
178    1      0.1%
179    1      0.1%
182    1      0.1%
183    3      0.2%

Maximum 10 values

Value  Count  Frequency (%)
56787  1      0.1%
56467  1      0.1%
53733  1      0.1%
51270  1      0.1%
47412  1      0.1%
44734  1      0.1%
43688  1      0.1%
43085  1      0.1%
41046  1      0.1%
40557  1      0.1%

Interactions

[Figures: nine interaction plots, images not reproduced]

Correlations

[Figure: correlation heatmap]

                            통계연도  시도명  외국인비율(퍼센트)  주민등록(한국인)인구수(명)  등록외국인수(명)
통계연도                    1.000     0.000   0.000               0.000                       0.000
시도명                      0.000     1.000   0.437               0.572                       0.453
외국인비율(퍼센트)          0.000     0.437   1.000               0.216                       0.675
주민등록(한국인)인구수(명)  0.000     0.572   0.216               1.000                       0.780
등록외국인수(명)            0.000     0.453   0.675               0.780                       1.000

[Figure: correlation heatmap]

          통계연도  시도명
통계연도  1.000     0.000
시도명    0.000     1.000

[Figure: correlation heatmap]

                            외국인비율(퍼센트)  주민등록(한국인)인구수(명)  등록외국인수(명)  통계연도  시도명
외국인비율(퍼센트)          1.000               0.025                       0.547             0.000     0.196
주민등록(한국인)인구수(명)  0.025               1.000                       0.829             0.000     0.265
등록외국인수(명)            0.547               0.829                       1.000             0.000     0.195
통계연도                    0.000               0.000                       0.000             1.000     0.000
시도명                      0.196               0.265                       0.195             0.000     1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Index  통계연도  시도명  시군구명  외국인비율(퍼센트)  주민등록(한국인)인구수(명)  등록외국인수(명)
0      2017      강원도  강릉시    0.88                213952                      1904
1      2017      강원도  고성군    2.73                30029                       842
2      2017      강원도  동해시    0.86                92851                       806
3      2017      강원도  삼척시    1.19                68514                       825
4      2017      강원도  속초시    1.13                82273                       940
5      2017      강원도  양구군    1.09                23835                       263
6      2017      강원도  양양군    1.03                27207                       283
7      2017      강원도  영월군    0.64                40067                       260
8      2017      강원도  원주시    0.95                341337                      3277
9      2017      강원도  인제군    0.97                32582                       319

Last rows

Index  통계연도  시도명          시군구명  외국인비율(퍼센트)  주민등록(한국인)인구수(명)  등록외국인수(명)
1276   2021      인천광역시      동구      1.36                61486                       849
1277   2021      인천광역시      서구      1.99                555380                      11296
1278   2021      인천광역시      중구      2.64                143633                      3902
1279   2021      인천광역시      강화군    1.08                69693                       763
1280   2021      인천광역시      남동구    2.07                518272                      10928
1281   2021      인천광역시      부평구    2.75                486765                      13781
1282   2021      인천광역시      연수구    3.1                 389644                      12459
1283   2021      인천광역시      옹진군    0.94                20342                       193
1284   2021      제주특별자치도  제주시    2.71                493096                      13747
1285   2021      제주특별자치도  서귀포시  3.66                183663                      6970
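
The sample rows suggest how the ratio column relates to the two population columns. The source does not state the formula; the sketch below assumes the ratio is registered foreigners divided by the sum of Korean residents and registered foreigners, which is an inference from the data. The four rows are read from the sample tables, and the function name is hypothetical.

```python
# (시군구명, reported ratio in %, Korean residents, registered foreigners)
rows = [
    ("강릉시", 0.88, 213952, 1904),
    ("고성군", 2.73, 30029, 842),
    ("연수구", 3.1, 389644, 12459),
    ("서귀포시", 3.66, 183663, 6970),
]

def foreigner_ratio(koreans, foreigners):
    """Share of foreigners in the combined population, in percent (assumed formula)."""
    return 100 * foreigners / (koreans + foreigners)

# Largest gap between the reported ratio and the recomputed one
max_gap = max(abs(foreigner_ratio(k, f) - reported) for _, reported, k, f in rows)

for name, reported, k, f in rows:
    print(name, reported, round(foreigner_ratio(k, f), 3))
```

For these rows the recomputed values stay within 0.01 percentage points of the reported ratio, which is consistent with the assumed formula but does not prove it.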