Overview

Dataset statistics

Number of variables: 4
Number of observations: 1196
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 39.8 KiB
Average record size in memory: 34.1 B
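
As a quick consistency check, the average record size should be roughly the total in-memory size divided by the number of observations. The sketch below redoes that arithmetic from the rounded figures reported above. It assumes 1 KiB = 1024 B, and the variable names are illustrative only.

```python
# Figures copied from the dataset statistics above (rounded as reported).
total_size_kib = 39.8
n_observations = 1196

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B per record")
```

The result agrees with the reported average record size to one decimal place.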

Variable types

Numeric: 2
Categorical: 1
Text: 1

Dataset

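Description: Annual figures on foreign residents in Korea who applied for Korean nationality, broken down by application type (naturalization, reinstatement of nationality) and by nationality (region).
Author: Ministry of Justice (법무부)
URL: https://www.data.go.kr/data/15100047/fileData.do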

Reproduction

Analysis started: 2023-12-12 16:18:20.221198
Analysis finished: 2023-12-12 16:18:20.972546
Duration: 0.75 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
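
The reported duration can be rederived from the two timestamps. This is a minimal sketch using only the Python standard library; the format string is an assumption based on how the timestamps are printed in this report.

```python
from datetime import datetime

# Assumed timestamp layout, matching the values printed in the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-12 16:18:20.221198", FMT)
finished = datetime.strptime("2023-12-12 16:18:20.972546", FMT)

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```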

Variables

년도 (year)
Real number (ℝ)

Distinct: 12
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2016.5978
Minimum: 2011
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.6 KiB

Quantile statistics

Minimum: 2011
5-th percentile: 2011
Q1: 2014
Median: 2017
Q3: 2020
95-th percentile: 2022
Maximum: 2022
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.4376155
Coefficient of variation (CV): 0.0017046609
Kurtosis: -1.2158707
Mean: 2016.5978
Median Absolute Deviation (MAD): 3
Skewness: -0.045750889
Sum: 2411851
Variance: 11.8172
Monotonicity: Increasing
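
Because the report lists the observation count for every one of the 12 years, the 년도 column can be rebuilt from those counts and the summary figures rechecked. The sketch below does this with the standard library only. It assumes the report uses the sample variance (n - 1 denominator) and linearly interpolated quartiles, which is what `statistics.variance` and `statistics.quantiles(..., method="inclusive")` compute; the variable names are illustrative.

```python
import statistics

# Observations per year, copied from the report's value-count tables for 년도.
year_counts = {
    2011: 96, 2012: 89, 2013: 105, 2014: 97, 2015: 97, 2016: 94,
    2017: 103, 2018: 99, 2019: 107, 2020: 109, 2021: 101, 2022: 99,
}

# Expand the counts back into one value per observation, in ascending order.
years = [year for year, count in sorted(year_counts.items()) for _ in range(count)]

mean = statistics.mean(years)
variance = statistics.variance(years)  # sample variance (n - 1 denominator)
std = statistics.stdev(years)
cv = std / mean                        # coefficient of variation
q1, median, q3 = statistics.quantiles(years, n=4, method="inclusive")

print(len(years), sum(years))
print(round(mean, 4), round(variance, 4), round(std, 7))
print(q1, median, q3, "IQR =", q3 - q1)
```

Under these assumptions the recomputed count, sum, mean, variance, quartiles, and IQR match the values reported for this variable.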
Histogram with fixed size bins (bins=12)
Common values
Value  Count  Frequency (%)
2020  109  9.1%
2019  107  8.9%
2013  105  8.8%
2017  103  8.6%
2021  101  8.4%
2018  99  8.3%
2022  99  8.3%
2014  97  8.1%
2015  97  8.1%
2011  96  8.0%
Other values (2)  183  15.3%

Minimum 10 values
Value  Count  Frequency (%)
2011  96  8.0%
2012  89  7.4%
2013  105  8.8%
2014  97  8.1%
2015  97  8.1%
2016  94  7.9%
2017  103  8.6%
2018  99  8.3%
2019  107  8.9%
2020  109  9.1%

Maximum 10 values
Value  Count  Frequency (%)
2022  99  8.3%
2021  101  8.4%
2020  109  9.1%
2019  107  8.9%
2018  99  8.3%
2017  103  8.6%
2016  94  7.9%
2015  97  8.1%
2014  97  8.1%
2013  105  8.8%

유형 (type)
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 9.5 KiB
귀화 (naturalization): 626
국적회복 (reinstatement of nationality): 570

Length

Max length: 4
Median length: 2
Mean length: 2.9531773
Min length: 2
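
The length statistics follow directly from the two category labels and their counts: 귀화 has 2 characters and 국적회복 has 4. The sketch below recomputes them, along with each category's share of the rows. It is an illustration built from the counts in this report, not the profiler's own code.

```python
import statistics

# Category counts copied from the report for the 유형 column.
category_counts = {"귀화": 626, "국적회복": 570}
total = sum(category_counts.values())

# One label length per observation.
lengths = [len(label) for label, count in category_counts.items() for _ in range(count)]

mean_length = statistics.mean(lengths)
median_length = statistics.median(lengths)

# Percentage of rows in each category.
shares = {label: count / total * 100 for label, count in category_counts.items()}

print(min(lengths), median_length, round(mean_length, 7), max(lengths))
print({label: round(share, 1) for label, share in shares.items()})
```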

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 귀화
2nd row: 귀화
3rd row: 귀화
4th row: 귀화
5th row: 귀화

Common Values

Value  Count  Frequency (%)
귀화  626  52.3%
국적회복  570  47.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Plot of the value counts for 귀화 (626, 52.3%) and 국적회복 (570, 47.7%).

국적 (nationality)
Text

Distinct: 138
Distinct (%): 11.5%
Missing: 0
Missing (%): 0.0%
Memory size: 9.5 KiB

Length

Max length: 10
Median length: 9
Mean length: 3.6764214
Min length: 2

Characters and Unicode

Total characters: 4397
Distinct characters: 156
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
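
To illustrate how such character properties can be looked up, the sketch below classifies characters with Python's standard `unicodedata` module. The general-category codes `Lo`, `Ps`, and `Pe` correspond to the names Other Letter, Open Punctuation, and Close Punctuation used in this report. The first five strings are the sample rows of this column; the last string is a hypothetical value added only to exercise the punctuation categories. This is not necessarily how the profiler itself performs the lookup.

```python
import unicodedata
from collections import Counter

# Five sample values from the 국적 column, plus one hypothetical value
# ("예시(가상)") that contains parentheses.
values = ["방글라데시", "미얀마", "캄보디아", "스리랑카", "중국", "예시(가상)"]

# Count characters by Unicode general category.
category_counts = Counter(
    unicodedata.category(ch) for value in values for ch in value
)
print(category_counts)

# Mean length is total characters divided by the number of observations
# (both figures copied from the report).
mean_length = 4397 / 1196
print(round(mean_length, 7))
```

In the report itself, Close Punctuation has 5 occurrences (four `)` and one `}`) against 4 for Open Punctuation (four `(`), so the lone `}` may be a mistyped closing parenthesis in one value; that would need to be checked against the source data.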

Unique

Unique: 24
Unique (%): 2.0%

Sample

1st row: 방글라데시
2nd row: 미얀마
3rd row: 캄보디아
4th row: 스리랑카
5th row: 중국

Value  Count  Frequency (%)
몽골  24  2.0%
우즈베키스탄  24  2.0%
중국  24  2.0%
타이완  24  2.0%
미국  24  2.0%
베트남  24  2.0%
인도네시아  24  2.0%
이란  24  2.0%
태국  24  2.0%
일본  24  2.0%
Other values (127)  956  79.9%

Most occurring characters

Count  Frequency (%)
256  5.8%
252  5.7%
188  4.3%
146  3.3%
137  3.1%
122  2.8%
113  2.6%
98  2.2%
90  2.0%
86  2.0%
Other values (146)  2909  66.2%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  4388  99.8%
Close Punctuation  5  0.1%
Open Punctuation  4  0.1%

Most frequent character per category

Other Letter
Count  Frequency (%)
256  5.8%
252  5.7%
188  4.3%
146  3.3%
137  3.1%
122  2.8%
113  2.6%
98  2.2%
90  2.1%
86  2.0%
Other values (143)  2900  66.1%

Close Punctuation
Value  Count  Frequency (%)
)  4  80.0%
}  1  20.0%

Open Punctuation
Value  Count  Frequency (%)
(  4  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  4388  99.8%
Common  9  0.2%

Most frequent character per script

Hangul
Count  Frequency (%)
256  5.8%
252  5.7%
188  4.3%
146  3.3%
137  3.1%
122  2.8%
113  2.6%
98  2.2%
90  2.1%
86  2.0%
Other values (143)  2900  66.1%

Common
Value  Count  Frequency (%)
)  4  44.4%
(  4  44.4%
}  1  11.1%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  4388  99.8%
ASCII  9  0.2%

Most frequent character per block

Hangul
Count  Frequency (%)
256  5.8%
252  5.7%
188  4.3%
146  3.3%
137  3.1%
122  2.8%
113  2.6%
98  2.2%
90  2.1%
86  2.0%
Other values (143)  2900  66.1%

ASCII
Value  Count  Frequency (%)
)  4  44.4%
(  4  44.4%
}  1  11.1%

건수 (count)
Real number (ℝ)

Distinct: 200
Distinct (%): 16.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 139.64548
Minimum: 1
Maximum: 7730
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 4
Q3: 19
95-th percentile: 394.25
Maximum: 7730
Range: 7729
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 633.40469
Coefficient of variation (CV): 4.535805
Kurtosis: 48.000014
Mean: 139.64548
Median Absolute Deviation (MAD): 3
Skewness: 6.5051988
Sum: 167016
Variance: 401201.5
Monotonicity: Not monotonic
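
Several of these figures are derived from one another, which makes them easy to cross-check. The sketch below recomputes the mean from the sum and row count, the standard deviation from the variance, and the coefficient of variation from both. It uses the rounded values printed in this report, so agreement is only expected to within rounding; the variable names are illustrative.

```python
import math

# Values copied from the report for the 건수 column.
n_observations = 1196
total = 167016
variance = 401201.5   # printed with limited precision in the report
median = 4

mean = total / n_observations
std = math.sqrt(variance)
cv = std / mean               # coefficient of variation
mean_to_median = mean / median

print(round(mean, 5), round(std, 3), round(cv, 4), round(mean_to_median, 1))
```

A mean roughly 35 times the median is consistent with the strong right skew reported above: most rows have a count of only a few, while a handful run into the thousands.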
Histogram with fixed size bins (bins=50)
Common values
Value  Count  Frequency (%)
1  367  30.7%
2  124  10.4%
3  93  7.8%
4  49  4.1%
5  49  4.1%
6  45  3.8%
7  33  2.8%
8  29  2.4%
10  16  1.3%
12  14  1.2%
Other values (190)  377  31.5%

Minimum 10 values
Value  Count  Frequency (%)
1  367  30.7%
2  124  10.4%
3  93  7.8%
4  49  4.1%
5  49  4.1%
6  45  3.8%
7  33  2.8%
8  29  2.4%
9  14  1.2%
10  16  1.3%

Maximum 10 values
Value  Count  Frequency (%)
7730  1  0.1%
6031  1  0.1%
4940  1  0.1%
4852  1  0.1%
4849  1  0.1%
4838  1  0.1%
4781  1  0.1%
4431  1  0.1%
4225  1  0.1%
4217  1  0.1%

Interactions

Four interaction plots for the numeric variables 년도 and 건수.

Correlations

        년도    유형    건수
년도    1.000  0.000  0.000
유형    0.000  1.000  0.116
건수    0.000  0.116  1.000

        년도    건수    유형
년도    1.000  0.013  0.000
건수    0.013  1.000  0.116
유형    0.000  0.116  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
Index  년도  유형  국적  건수
0  2011  귀화  방글라데시  23
1  2011  귀화  미얀마  8
2  2011  귀화  캄보디아  488
3  2011  귀화  스리랑카  2
4  2011  귀화  중국  3219
5  2011  귀화  타이완  240
6  2011  귀화  홍콩  1
7  2011  귀화  한국계중국인  7730
8  2011  귀화  인도  3
9  2011  귀화  인도네시아  15

Last rows
Index  년도  유형  국적  건수
1186  2022  국적회복  아일랜드  1
1187  2022  국적회복  영국  16
1188  2022  국적회복  오스트리아  1
1189  2022  국적회복  이탈리아  3
1190  2022  국적회복  프랑스  6
1191  2022  국적회복  한국계러시아인  8
1192  2022  국적회복  뉴질랜드  67
1193  2022  국적회복  오스트레일리아  130
1194  2022  국적회복  피지  2
1195  2022  국적회복  가봉  1