Overview

Dataset statistics

Number of variables: 4
Number of observations: 5620
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 192.2 KiB
Average record size in memory: 35.0 B
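
The two derived figures in these statistics follow from the raw counts. The sketch below recomputes them, assuming that 1 KiB means 1024 bytes; the variable names are illustrative and do not come from the report.

```python
# Figures copied from the dataset statistics in this report.
n_rows = 5620
n_duplicate_rows = 1
total_size_kib = 192.2  # assumption: 1 KiB = 1024 bytes

# Average record size in bytes: total memory divided by the row count.
avg_record_bytes = total_size_kib * 1024 / n_rows

# Share of duplicate rows as a percentage of all rows.
duplicate_pct = 100 * n_duplicate_rows / n_rows

print(f"{avg_record_bytes:.1f} B")
print(f"{duplicate_pct:.3f}%")
```

The duplicate share is far below 0.1%, which is why the report shows it as "< 0.1%" rather than as a rounded number.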

Variable types

Categorical: 1
Numeric: 2
Text: 1

Dataset

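Description: Provides monthly figures, by nationality (region), for short-term foreign visitors who enter for purposes such as tourism or visiting relatives and stay in Korea for up to 90 days, and for long-term foreign residents, including registered foreigners and overseas Koreans with foreign nationality who have filed a domestic residence report.
Author: Ministry of Justice (법무부)
URL: https://www.data.go.kr/data/15100013/fileData.do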

Alerts

Dataset has 1 (< 0.1%) duplicate rows [Duplicates]

Reproduction

Analysis started: 2024-05-04 07:29:24.476137
Analysis finished: 2024-05-04 07:29:27.234873
Duration: 2.76 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
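
The reported duration is the difference between the two timestamps. A minimal check using only the Python standard library (the format string is an assumption based on how the timestamps are printed here):

```python
from datetime import datetime

# Timestamp layout as printed in the report (assumed format string).
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-05-04 07:29:24.476137", FMT)
finished = datetime.strptime("2024-05-04 07:29:27.234873", FMT)

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```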

Variables


(variable name not preserved; values are years)
Categorical

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 44.0 KiB

Value  Count
2022   2503
2023   2495
2024   622

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022
2nd row: 2022
3rd row: 2022
4th row: 2022
5th row: 2022

Common Values

Value  Count  Frequency (%)
2022   2503   44.5%
2023   2495   44.4%
2024   622    11.1%
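
The frequency column is each count divided by the number of observations. A short sketch that recomputes the percentages from the counts listed for this variable (the dictionary name is illustrative):

```python
# Counts per category, copied from the common values of this variable.
counts = {"2022": 2503, "2023": 2495, "2024": 622}

# The counts add up to the number of observations in the dataset.
total = sum(counts.values())

# Frequency of each category as a percentage, rounded to one decimal place.
freq_pct = {value: round(100 * count / total, 1) for value, count in counts.items()}

for value, pct in freq_pct.items():
    print(value, f"{pct}%")
```

The 2024 share is much smaller than the other two, which is consistent with the last sample rows being dated 2024, month 3.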

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; image not preserved]


(variable name not preserved; values are months, 1 to 12)
Real number (ℝ)

Distinct: 12
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.9991103
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 49.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 6
Q3: 9
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6
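
These quantiles can be recovered from the value counts reported for this variable, without the raw rows. The sketch below assumes linear interpolation between neighbouring sorted positions; the report does not state which method was used, and the helper names are hypothetical.

```python
import math
from bisect import bisect_right
from itertools import accumulate

# Value counts for this variable, taken from the frequency tables in this report.
month_counts = {1: 622, 2: 622, 3: 626, 4: 417, 5: 417, 6: 417,
                7: 421, 8: 418, 9: 416, 10: 416, 11: 415, 12: 413}

values = sorted(month_counts)
cum = list(accumulate(month_counts[v] for v in values))  # cumulative counts
n = cum[-1]  # number of observations


def value_at(i):
    """Return the value at 0-based position i of the sorted data."""
    return values[bisect_right(cum, i)]


def quantile(q):
    """Quantile with linear interpolation (an assumed method)."""
    pos = q * (n - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return value_at(lo) + (pos - lo) * (value_at(hi) - value_at(lo))


q1, median, q3 = quantile(0.25), quantile(0.5), quantile(0.75)
iqr = q3 - q1
print(quantile(0.05), q1, median, q3, quantile(0.95), iqr)
```

Because the variable takes only twelve integer values, every quantile falls inside a run of identical values and the interpolation never produces a fraction.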

Descriptive statistics

Standard deviation: 3.5535963
Coefficient of variation (CV): 0.59235389
Kurtosis: -1.2723279
Mean: 5.9991103
Median Absolute Deviation (MAD): 3
Skewness: 0.17725896
Sum: 33715
Variance: 12.628047
Monotonicity: Not monotonic
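
The sum, mean, variance, standard deviation and coefficient of variation are linked by simple identities, and all of them can be rebuilt from the value counts of this variable. The sketch below assumes the sample variance (division by n - 1), which is the convention that reproduces the reported figure; the names are illustrative.

```python
import math

# Value counts for this variable, taken from the frequency tables in this report.
month_counts = {1: 622, 2: 622, 3: 626, 4: 417, 5: 417, 6: 417,
                7: 421, 8: 418, 9: 416, 10: 416, 11: 415, 12: 413}

n = sum(month_counts.values())
total = sum(v * c for v, c in month_counts.items())  # the "Sum" statistic
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1 (assumed).
variance = sum(c * (v - mean) ** 2 for v, c in month_counts.items()) / (n - 1)
std = math.sqrt(variance)

# Coefficient of variation: standard deviation relative to the mean.
cv = std / mean

print(total, round(mean, 7), round(variance, 6), round(std, 7), round(cv, 8))
```

Dividing by n instead of n - 1 would give a variance of about 12.626, which does not match the report.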
[Figure: Histogram with fixed size bins (bins=12)]

Common values

Value             Count  Frequency (%)
3                 626    11.1%
1                 622    11.1%
2                 622    11.1%
7                 421    7.5%
8                 418    7.4%
4                 417    7.4%
5                 417    7.4%
6                 417    7.4%
9                 416    7.4%
10                416    7.4%
Other values (2)  828    14.7%

Minimum 10 values

Value  Count  Frequency (%)
1      622    11.1%
2      622    11.1%
3      626    11.1%
4      417    7.4%
5      417    7.4%
6      417    7.4%
7      421    7.5%
8      418    7.4%
9      416    7.4%
10     416    7.4%

Maximum 10 values

Value  Count  Frequency (%)
12     413    7.3%
11     415    7.4%
10     416    7.4%
9      416    7.4%
8      418    7.4%
7      421    7.5%
6      417    7.4%
5      417    7.4%
4      417    7.4%
3      626    11.1%

국적지역 (nationality/region)
Text

Distinct: 224
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 44.0 KiB

Length

Max length: 11
Median length: 10
Mean length: 4.0895018
Min length: 2

Characters and Unicode

Total characters: 22983
Distinct characters: 202
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
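
As an illustration of these character properties, the sketch below looks up the Unicode general category of each character with Python's standard `unicodedata` module. The sample strings are the five sample rows shown for this variable; this is not the profiler's own code, only a minimal demonstration of the idea.

```python
import unicodedata
from collections import Counter

# The five sample values shown for this variable in the report.
samples = ["한국계중국인", "중국", "베트남", "태국", "우즈베키스탄"]

# Count the general category of every character ("Lo" = Other Letter).
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)
print(category_counts)

# The four remaining categories in the report correspond to these ASCII characters.
punctuation = {ch: unicodedata.category(ch) for ch in "()- "}
print(punctuation)

# Mean length is total characters divided by the number of observations.
mean_length = 22983 / 5620
print(round(mean_length, 7))
```

Hangul syllables fall in the category "Lo" (Other Letter), while "(", ")", "-" and the space map to "Ps", "Pe", "Pd" and "Zs", the codes for Open Punctuation, Close Punctuation, Dash Punctuation and Space Separator.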

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 한국계중국인 (Korean-Chinese)
2nd row: 중국 (China)
3rd row: 베트남 (Vietnam)
4th row: 태국 (Thailand)
5th row: 우즈베키스탄 (Uzbekistan)

Value                                      Count  Frequency (%)
국제연합 (United Nations)                   37     0.7%
한국계중국인 (Korean-Chinese)               27     0.5%
파푸아뉴기니 (Papua New Guinea)             27     0.5%
바누아투 (Vanuatu)                          27     0.5%
영국외지시민 (British Overseas Citizen)     27     0.5%
스발바르 (Svalbard)                         27     0.5%
세르비아몬테네그로 (Serbia and Montenegro)  27     0.5%
산마리노 (San Marino)                       27     0.5%
오스트레일리아 (Australia)                  27     0.5%
뉴질랜드 (New Zealand)                      27     0.5%
Other values (214)                         5350   95.0%

Most occurring characters

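Value               Count  Frequency (%)
(not preserved)     1519   6.6%
(not preserved)     846    3.7%
(not preserved)     823    3.6%
(not preserved)     693    3.0%
(not preserved)     661    2.9%
(not preserved)     621    2.7%
(not preserved)     549    2.4%
(not preserved)     523    2.3%
(not preserved)     425    1.8%
(not preserved)     421    1.8%
Other values (192)  15902  69.2%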

Most occurring categories

Value              Count  Frequency (%)
Other Letter       22907  99.7%
Open Punctuation   22     0.1%
Close Punctuation  22     0.1%
Dash Punctuation   22     0.1%
Space Separator    10     < 0.1%

Most frequent character per category

Other Letter

Value               Count  Frequency (%)
(not preserved)     1519   6.6%
(not preserved)     846    3.7%
(not preserved)     823    3.6%
(not preserved)     693    3.0%
(not preserved)     661    2.9%
(not preserved)     621    2.7%
(not preserved)     549    2.4%
(not preserved)     523    2.3%
(not preserved)     425    1.9%
(not preserved)     421    1.8%
Other values (188)  15826  69.1%

Open Punctuation

Value  Count  Frequency (%)
(      22     100.0%

Close Punctuation

Value  Count  Frequency (%)
)      22     100.0%

Dash Punctuation

Value  Count  Frequency (%)
-      22     100.0%

Space Separator

Value    Count  Frequency (%)
(space)  10     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  22907  99.7%
Common  76     0.3%

Most frequent character per script

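Hangul

Value               Count  Frequency (%)
(not preserved)     1519   6.6%
(not preserved)     846    3.7%
(not preserved)     823    3.6%
(not preserved)     693    3.0%
(not preserved)     661    2.9%
(not preserved)     621    2.7%
(not preserved)     549    2.4%
(not preserved)     523    2.3%
(not preserved)     425    1.9%
(not preserved)     421    1.8%
Other values (188)  15826  69.1%

Common

Value    Count  Frequency (%)
(        22     28.9%
)        22     28.9%
-        22     28.9%
(space)  10     13.2%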

Most occurring blocks

Value   Count  Frequency (%)
Hangul  22907  99.7%
ASCII   76     0.3%

Most frequent character per block

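Hangul

Value               Count  Frequency (%)
(not preserved)     1519   6.6%
(not preserved)     846    3.7%
(not preserved)     823    3.6%
(not preserved)     693    3.0%
(not preserved)     661    2.9%
(not preserved)     621    2.7%
(not preserved)     549    2.4%
(not preserved)     523    2.3%
(not preserved)     425    1.9%
(not preserved)     421    1.8%
Other values (188)  15826  69.1%

ASCII

Value    Count  Frequency (%)
(        22     28.9%
)        22     28.9%
-        22     28.9%
(space)  10     13.2%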

체류외국인 수 (number of foreign residents)
Real number (ℝ)

Distinct: 2086
Distinct (%): 37.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10827.875
Minimum: 1
Maximum: 631259
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 49.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 23
Median: 141
Q3: 991.25
95-th percentile: 47368.65
Maximum: 631259
Range: 631258
Interquartile range (IQR): 968.25

Descriptive statistics

Standard deviation: 52608.529
Coefficient of variation (CV): 4.85862
Kurtosis: 85.457687
Mean: 10827.875
Median Absolute Deviation (MAD): 138
Skewness: 8.532955
Sum: 60852656
Variance: 2.7676573 × 10^9
Monotonicity: Not monotonic
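
Several of these summary figures are derived from one another, so they can be cross-checked without the raw data. The sketch below recomputes the range, IQR, coefficient of variation, variance and mean from the other reported values for 체류외국인 수; the dictionary and its keys are illustrative, and small differences are expected because the inputs are already rounded.

```python
# Reported summary statistics for 체류외국인 수, copied from this report.
stats = {
    "n": 5620, "sum": 60852656,
    "min": 1, "max": 631259,
    "q1": 23, "q3": 991.25,
    "mean": 10827.875, "std": 52608.529,
}

value_range = stats["max"] - stats["min"]   # Range
iqr = stats["q3"] - stats["q1"]             # Interquartile range
cv = stats["std"] / stats["mean"]           # Coefficient of variation
variance = stats["std"] ** 2                # Variance is the squared standard deviation
mean_from_sum = stats["sum"] / stats["n"]   # Mean is the sum over the number of rows

print(value_range, iqr, round(cv, 5), f"{variance:.7e}", round(mean_from_sum, 3))
```

A coefficient of variation near 4.9, together with a mean (10827.875) far above the median (141), reflects the strong right skew that the skewness and kurtosis values also indicate.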
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                Count  Frequency (%)
1                    229    4.1%
2                    133    2.4%
5                    125    2.2%
7                    75     1.3%
15                   74     1.3%
6                    73     1.3%
10                   68     1.2%
3                    67     1.2%
9                    61     1.1%
4                    61     1.1%
Other values (2076)  4654   82.8%

Minimum 10 values

Value  Count  Frequency (%)
1      229    4.1%
2      133    2.4%
3      67     1.2%
4      61     1.1%
5      125    2.2%
6      73     1.3%
7      75     1.3%
8      43     0.8%
9      61     1.1%
10     68     1.2%

Maximum 10 values

Value   Count  Frequency (%)
631259  1      < 0.1%
629572  1      < 0.1%
627780  1      < 0.1%
627450  1      < 0.1%
623912  1      < 0.1%
623484  1      < 0.1%
622805  1      < 0.1%
618886  1      < 0.1%
617656  1      < 0.1%
614917  1      < 0.1%

Interactions

[Four interaction plots; images not preserved]

Correlations

[Figure: correlation plot; image not preserved]

                   (name lost)  (name lost)  체류외국인 수
(name lost)        1.000        0.505        0.078
(name lost)        0.505        1.000        0.000
체류외국인 수      0.078        0.000        1.000

[Figure: correlation plot; image not preserved]

                   (name lost)  체류외국인 수  (name lost)
(name lost)        1.000        0.021          0.351
체류외국인 수      0.021        1.000          0.052
(name lost)        0.351        0.052          1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

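(The names of the first two data columns were not preserved; their values are the year and the month.)

First rows

#     (year)  (month)  국적지역 (nationality/region)  체류외국인 수 (number of foreign residents)
0     2022    1        한국계중국인 (Korean-Chinese)  612303
1     2022    1        중국 (China)                   223121
2     2022    1        베트남 (Vietnam)               206728
3     2022    1        태국 (Thailand)                171364
4     2022    1        우즈베키스탄 (Uzbekistan)      66612
5     2022    1        필리핀 (Philippines)           46591
6     2022    1        캄보디아 (Cambodia)            41673
7     2022    1        네팔 (Nepal)                   37116
8     2022    1        몽골 (Mongolia)                37010
9     2022    1        인도네시아 (Indonesia)         34565

Last rows

#     (year)  (month)  국적지역 (nationality/region)            체류외국인 수 (number of foreign residents)
5610  2024    3        포르투갈 (Portugal)                      556
5611  2024    3        폴란드 (Poland)                          1238
5612  2024    3        프랑스 (France)                          9951
5613  2024    3        피지 (Fiji)                              88
5614  2024    3        핀란드 (Finland)                         631
5615  2024    3        필리핀 (Philippines)                     67401
5616  2024    3        한국계중국인 (Korean-Chinese)            631259
5617  2024    3        헝가리 (Hungary)                         522
5618  2024    3        홍콩 (Hong Kong)                         18090
5619  2024    3        홍콩거주난민 (refugee residing in Hong Kong)  15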
Duplicate rows

Most frequently occurring

#  (year)  (month)  국적지역 (nationality/region)  체류외국인 수 (number of foreign residents)  # duplicates
0  2022    4        미상 (Unknown)                 1                                          2