Overview

Dataset statistics

Number of variables: 3
Number of observations: 2441
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 62.1 KiB
Average record size in memory: 26.1 B
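The average record size can be cross-checked against the total size and the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes and that the figure is simply total size divided by row count (the variable names are illustrative; the reported total is itself rounded, so only the first decimal is meaningful):

```python
# Figures copied from the dataset statistics above.
total_size_kib = 62.1
n_observations = 2441

# Assumption: 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")  # matches the reported 26.1 B
```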

Variable types

Numeric: 2
Text: 1

Dataset

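Description: Yearly counts, by nationality (region), of foreign nationals staying in Korea. The figures cover short-term visitors who enter for purposes such as tourism or visiting relatives and stay no more than 90 days, as well as long-term residents, including registered foreigners and overseas Koreans of foreign nationality who have filed a domestic residence report.
URL: https://www.data.go.kr/data/15100012/fileData.do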

Reproduction

Analysis started: 2023-12-12 11:01:01.475929
Analysis finished: 2023-12-12 11:01:02.612470
Duration: 1.14 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
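The reported duration is the difference between the two timestamps. A small sketch using only the Python standard library, assuming both timestamps share the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the reproduction details above.
started = datetime.strptime("2023-12-12 11:01:01.475929", FMT)
finished = datetime.strptime("2023-12-12 11:01:02.612470", FMT)

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")  # rounds to the reported 1.14
```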

Variables


Real number (ℝ)

Distinct: 12
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2016.5023
Minimum: 2011
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.6 KiB

Quantile statistics

Minimum: 2011
5th percentile: 2011
Q1: 2013
Median: 2017
Q3: 2019
95th percentile: 2022
Maximum: 2022
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.456417
Coefficient of variation (CV): 0.0017140655
Kurtosis: -1.2209825
Mean: 2016.5023
Median Absolute Deviation (MAD): 3
Skewness: -0.0074862981
Sum: 4922282
Variance: 11.946819
Monotonicity: Increasing
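Several of the statistics above are derived from others. The sketch below recomputes the range, the interquartile range, and the coefficient of variation from the reported values, assuming the usual definitions (range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean):

```python
# Values copied from the quantile and descriptive statistics above.
minimum, maximum = 2011, 2022
q1, q3 = 2013, 2019
mean, std = 2016.5023, 3.456417

value_range = maximum - minimum   # spread between the extremes
iqr = q3 - q1                     # spread of the middle 50% of rows
cv = std / mean                   # dispersion relative to the mean

print(value_range, iqr, f"{cv:.10f}")
```

The tiny CV reflects that the values sit in a narrow band around 2016, far from zero, rather than any particular property of the distribution.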
Histogram with fixed size bins (bins=12)
Common values
Value             Count  Frequency (%)
2019              209    8.6%
2011              206    8.4%
2016              205    8.4%
2021              205    8.4%
2012              204    8.4%
2018              204    8.4%
2020              204    8.4%
2013              203    8.3%
2014              201    8.2%
2022              201    8.2%
Other values (2)  399    16.3%
Minimum 10 values
Value  Count  Frequency (%)
2011   206    8.4%
2012   204    8.4%
2013   203    8.3%
2014   201    8.2%
2015   199    8.2%
2016   205    8.4%
2017   200    8.2%
2018   204    8.4%
2019   209    8.6%
2020   204    8.4%
Maximum 10 values
Value  Count  Frequency (%)
2022   201    8.2%
2021   205    8.4%
2020   204    8.4%
2019   209    8.6%
2018   204    8.4%
2017   200    8.2%
2016   205    8.4%
2015   199    8.2%
2014   201    8.2%
2013   203    8.3%
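The value tables above are enough to rebuild the headline statistics for this variable. The following sketch uses only the per-value counts listed in those tables; the dictionary and variable names are illustrative, and nothing is read from the original file:

```python
from statistics import median

# Value -> row count, copied from the value tables above.
year_counts = {
    2011: 206, 2012: 204, 2013: 203, 2014: 201, 2015: 199, 2016: 205,
    2017: 200, 2018: 204, 2019: 209, 2020: 204, 2021: 205, 2022: 201,
}

n_rows = sum(year_counts.values())                        # number of observations
total = sum(value * n for value, n in year_counts.items())  # the reported "Sum"
mean = total / n_rows

# Expand the counts back into one value per row to take the median.
values = [v for v, n in sorted(year_counts.items()) for _ in range(n)]
med = median(values)

# 2015 and 2017 are the two values that do not appear among the ten most
# frequent ones, so together they account for "Other values (2)".
other_values = year_counts[2015] + year_counts[2017]

print(n_rows, total, round(mean, 4), med, other_values)
```

The counts add up to the 2441 observations and reproduce the reported sum, mean, and median, which is a quick way to confirm that the tables and the summary statistics describe the same column.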
국적지역 (nationality/region)
Text

Distinct: 229
Distinct (%): 9.4%
Missing: 0
Missing (%): 0.0%
Memory size: 19.2 KiB

Length

Max length: 11
Median length: 10
Mean length: 4.0434248
Min length: 2

Characters and Unicode

Total characters: 9870
Distinct characters: 201
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
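As an illustration of these character properties, the following minimal sketch tallies the general category of each character in a short string. The string is a hypothetical example, not a value taken from the dataset, and the mapping from two-letter category codes to the names used in this report is written out by hand. Python's standard unicodedata module exposes the general category and the character name but not the script or block, so those two properties are left out here.

```python
import unicodedata
from collections import Counter

# Hand-written mapping for the four categories that occur in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}

text = "가-나(다)"  # hypothetical sample string, not from the dataset

# Count characters per Unicode general category.
categories = Counter(unicodedata.category(ch) for ch in text)

for code, count in categories.items():
    print(CATEGORY_NAMES.get(code, code), count)

# The character name hints at the script: Hangul syllables are named
# "HANGUL SYLLABLE ...".
first_name = unicodedata.name(text[0])
print(first_name)
```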

Unique

Unique: 12
Unique (%): 0.5%
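The report lists 229 distinct values but only 12 unique ones. The sketch below shows one way the two counts can differ, on the assumption that "distinct" counts different values while "unique" counts values that occur exactly once. The short list is a hypothetical column built from a few of the sample values, not the real data.

```python
from collections import Counter

# Hypothetical column: two values repeat, two occur once.
values = ["중국", "중국", "베트남", "일본", "일본", "필리핀"]

counts = Counter(values)
n_distinct = len(counts)                              # different values
n_unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

print(n_distinct, n_unique)

# The percentages in the report are relative to the number of observations.
n_observations = 2441
distinct_pct = 100 * 229 / n_observations
unique_pct = 100 * 12 / n_observations
print(f"{distinct_pct:.1f}% {unique_pct:.1f}%")
```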

Sample

1st row: 중국 (China)
2nd row: 한국계중국인 (ethnic Korean Chinese)
3rd row: 베트남 (Vietnam)
4th row: 일본 (Japan)
5th row: 필리핀 (Philippines)
Value                   Count  Frequency (%)
중국 (China)             12     0.5%
아르메니아 (Armenia)      12     0.5%
리투아니아 (Lithuania)    12     0.5%
몰도바 (Moldova)         12     0.5%
세르비아 (Serbia)        12     0.5%
라트비아 (Latvia)        12     0.5%
알바니아 (Albania)       12     0.5%
슬로베니아 (Slovenia)     12     0.5%
감비아 (Gambia)          12     0.5%
몰타 (Malta)            12     0.5%
Other values (219)      2321   95.1%

Most occurring characters

Value               Count  Frequency (%)
                    680    6.9%
                    378    3.8%
                    373    3.8%
                    295    3.0%
                    291    2.9%
                    271    2.7%
                    235    2.4%
                    200    2.0%
                    184    1.9%
                    177    1.8%
Other values (191)  6786   68.8%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       9854   99.8%
Dash Punctuation   12     0.1%
Open Punctuation   2      < 0.1%
Close Punctuation  2      < 0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
                    680    6.9%
                    378    3.8%
                    373    3.8%
                    295    3.0%
                    291    3.0%
                    271    2.8%
                    235    2.4%
                    200    2.0%
                    184    1.9%
                    177    1.8%
Other values (188)  6770   68.7%

Dash Punctuation
Value  Count  Frequency (%)
-      12     100.0%

Open Punctuation
Value  Count  Frequency (%)
(      2      100.0%

Close Punctuation
Value  Count  Frequency (%)
)      2      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  9854   99.8%
Common  16     0.2%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
                    680    6.9%
                    378    3.8%
                    373    3.8%
                    295    3.0%
                    291    3.0%
                    271    2.8%
                    235    2.4%
                    200    2.0%
                    184    1.9%
                    177    1.8%
Other values (188)  6770   68.7%

Common
Value  Count  Frequency (%)
-      12     75.0%
(      2      12.5%
)      2      12.5%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  9854   99.8%
ASCII   16     0.2%

Most frequent character per block

Hangul
Value               Count  Frequency (%)
                    680    6.9%
                    378    3.8%
                    373    3.8%
                    295    3.0%
                    291    3.0%
                    271    2.8%
                    235    2.4%
                    200    2.0%
                    184    1.9%
                    177    1.8%
Other values (188)  6770   68.7%

ASCII
Value  Count  Frequency (%)
-      12     75.0%
(      2      12.5%
)      2      12.5%

체류외국인수 (number of foreign nationals staying in Korea)
Real number (ℝ)

Distinct: 1076
Distinct (%): 44.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9616.5621
Minimum: 1
Maximum: 708082
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.6 KiB

Quantile statistics

Minimum: 1
5th percentile: 2
Q1: 16
Median: 104
Q3: 783
95th percentile: 37012
Maximum: 708082
Range: 708081
Interquartile range (IQR): 767

Descriptive statistics

Standard deviation: 51249.48
Coefficient of variation (CV): 5.3292934
Kurtosis: 102.12757
Mean: 9616.5621
Median Absolute Deviation (MAD): 101
Skewness: 9.4045991
Sum: 23474028
Variance: 2.6265092 × 10^9
Monotonicity: Not monotonic
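The figures for this variable can be checked against one another, and the gap between mean and median gives a feel for how skewed the counts are. A minimal sketch using the reported values and the 2441 observations of the dataset, assuming the mean is Sum / observations and the variance is the square of the standard deviation:

```python
# Values copied from the statistics above and from the dataset overview.
n_observations = 2441
total = 23474028
std = 51249.48
median_value = 104

mean = total / n_observations        # about 9616.56
variance = std ** 2                  # about 2.6265e9
cv = std / mean                      # about 5.33
mean_to_median = mean / median_value # how far the mean sits above the median

print(f"{mean:.4f} {variance:.7e} {cv:.4f} {mean_to_median:.1f}")
```

A mean roughly ninety times the median is consistent with the large positive skewness reported above: a handful of very large counts pull the mean far above the typical row.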
Histogram with fixed size bins (bins=50)
Common values
Value                Count  Frequency (%)
1                    101    4.1%
2                    64     2.6%
7                    48     2.0%
6                    47     1.9%
5                    47     1.9%
8                    42     1.7%
3                    41     1.7%
9                    35     1.4%
10                   32     1.3%
11                   29     1.2%
Other values (1066)  1955   80.1%
Minimum 10 values
Value  Count  Frequency (%)
1      101    4.1%
2      64     2.6%
3      41     1.7%
4      28     1.1%
5      47     1.9%
6      47     1.9%
7      48     2.0%
8      42     1.7%
9      35     1.4%
10     32     1.3%
Maximum 10 values
Value   Count  Frequency (%)
708082  1      < 0.1%
701098  1      < 0.1%
679729  1      < 0.1%
647576  1      < 0.1%
627004  1      < 0.1%
626655  1      < 0.1%
614665  1      < 0.1%
602907  1      < 0.1%
590856  1      < 0.1%
497989  1      < 0.1%
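The percentage column in the tables above can be reproduced from the counts and the 2441 observations. The sketch below assumes shares are rounded to one decimal place and that any non-zero share below 0.05% is displayed as "< 0.1%". The function format_share is a hypothetical helper written for this illustration, not a ydata-profiling function.

```python
def format_share(count: int, total: int) -> str:
    """Format count/total the way the tables above appear to (assumed rule)."""
    share = count / total
    # Assumption: tiny non-zero shares are shown as "< 0.1%" instead of "0.0%".
    if 0 < share < 0.0005:
        return "< 0.1%"
    return f"{share * 100:.1f}%"


n_observations = 2441
for count in (101, 28, 1, 1955):
    print(count, format_share(count, n_observations))
```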

Interactions

[Figures: interaction plots for each pair of the two numeric variables]

Correlations

              (year)  체류외국인수
(year)        1.000   0.000
체류외국인수    0.000   1.000

              (year)  체류외국인수
(year)        1.000   0.085
체류외국인수    0.085   1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
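The missing-cell counts behind these displays amount to a per-column tally of empty values. A minimal sketch on three rows taken from the sample table, assuming missing values are represented as None; the key "year" is a hypothetical name for the first column, whose real name does not appear in this report:

```python
# Three rows copied from the sample table; "year" is a hypothetical column name.
rows = [
    {"year": 2011, "국적지역": "중국", "체류외국인수": 207384},
    {"year": 2011, "국적지역": "한국계중국인", "체류외국인수": 470570},
    {"year": 2011, "국적지역": "베트남", "체류외국인수": 116219},
]

columns = list(rows[0])

# Count missing cells per column, then across the whole table.
missing_per_column = {
    col: sum(1 for row in rows if row.get(col) is None) for col in columns
}
missing_cells = sum(missing_per_column.values())
missing_pct = 100 * missing_cells / (len(rows) * len(columns))

print(missing_per_column, missing_cells, f"{missing_pct:.1f}%")
```

With no empty cells every count is zero, which matches the 0 missing cells reported for the full dataset.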

Sample

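First rows
Index  (year)  국적지역 (nationality/region)   체류외국인수 (foreign nationals staying)
0      2011    중국 (China)                    207384
1      2011    한국계중국인 (ethnic Korean Chinese)  470570
2      2011    베트남 (Vietnam)                 116219
3      2011    일본 (Japan)                    58169
4      2011    필리핀 (Philippines)             47542
5      2011    태국 (Thailand)                 45634
6      2011    인도네시아 (Indonesia)            36971
7      2011    우즈베키스탄 (Uzbekistan)          29742
8      2011    몽골 (Mongolia)                 28634
9      2011    타이완 (Taiwan)                  26316

Last rows
Index  (year)  국적지역 (nationality/region)   체류외국인수 (foreign nationals staying)
2431   2022    적도기니 (Equatorial Guinea)      10
2432   2022    니제르 (Niger)                   9
2433   2022    자이르 (Zaire)                   7
2434   2022    레소토 (Lesotho)                 6
2435   2022    지부티 (Djibouti)                4
2436   2022    코모로 (Comoros)                 3
2437   2022    세이셸 (Seychelles)              3
2438   2022    상투메프린시페 (São Tomé and Príncipe)  2
2439   2022    나미비아 (Namibia)               2
2440   2022    카보베르데 (Cabo Verde)           1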