Overview

Dataset statistics

Number of variables: 4
Number of observations: 2994
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 102.5 KiB
Average record size in memory: 35.0 B
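
The derived figures in this block follow from the raw counts. A minimal Python sketch of that arithmetic, using only numbers stated above and assuming that KiB means 1024 bytes (the variable names are illustrative and not part of the report):

```python
# Values copied from the "Dataset statistics" block of this report.
n_variables = 4
n_observations = 2994
missing_cells = 0
total_size_kib = 102.5  # assumption: 1 KiB = 1024 bytes

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size = total memory / number of rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"cells: {total_cells}")
print(f"missing: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.2f} B")
```

The recomputed average lands at roughly 35 B, consistent with the reported 35.0 B given that the 102.5 KiB total is itself a rounded figure.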

Variable types

Categorical: 1
Numeric: 2
Text: 1

Dataset

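Description: Monthly updated counts of foreign nationals applying for Korea Electronic Travel Authorization (K-ETA), broken down by nationality (region). Fields: year-month, nationality (region), number of applications.
Author: Ministry of Justice (법무부)
URL: https://www.data.go.kr/data/15099998/fileData.do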

Reproduction

Analysis started: 2024-04-29 22:57:49.539466
Analysis finished: 2024-04-29 22:57:51.552195
Duration: 2.01 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables


Categorical

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 23.5 KiB

Value  Count
2023   1335
2022   1104
2024   329
2021   226

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021
2nd row: 2021
3rd row: 2021
4th row: 2021
5th row: 2021

Common Values

Value  Count  Frequency (%)
2023   1335   44.6%
2022   1104   36.9%
2024   329    11.0%
2021   226    7.5%
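
The Frequency (%) column is each count divided by the number of observations. A short Python sketch of that calculation, using the four counts above (the dictionary names are illustrative):

```python
# Counts per category, copied from the Common Values table.
counts = {"2023": 1335, "2022": 1104, "2024": 329, "2021": 226}

# The counts should add up to the 2994 observations of the dataset.
total = sum(counts.values())

# Frequency (%) = 100 * count / total, rounded to one decimal as in the report.
frequencies = {value: round(100 * n / total, 1) for value, n in counts.items()}

for value, pct in frequencies.items():
    print(f"{value}: {counts[value]} rows, {pct}%")
```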

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]


Real number (ℝ)

Distinct: 12
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.5197061
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 26.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 7
Q3: 10
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 3.5305614
Coefficient of variation (CV): 0.54152156
Kurtosis: -1.293712
Mean: 6.5197061
Median Absolute Deviation (MAD): 3
Skewness: -0.016188191
Sum: 19520
Variance: 12.464864
Monotonicity: Not monotonic
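
Because this variable takes only 12 distinct values, its descriptive statistics can be rebuilt from the per-value counts listed in the frequency tables of this section. The sketch below does that with Python's standard `statistics` module; it assumes the report uses the sample (n - 1) variance, and the variable names are illustrative.

```python
import statistics

# Rows per value (1..12), copied from the frequency tables of this variable.
counts = {1: 256, 2: 261, 3: 305, 4: 208, 5: 222, 6: 223,
          7: 231, 8: 234, 9: 266, 10: 262, 11: 268, 12: 258}

# Expand the frequency table back into one entry per observation.
values = [v for v, n in counts.items() for _ in range(n)]

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)          # square root of the sample variance
cv = std / mean                         # coefficient of variation
median = statistics.median(values)

print(f"n={len(values)} sum={total} mean={mean:.7f}")
print(f"variance={variance:.6f} std={std:.7f} cv={cv:.8f} median={median}")
```

Under that assumption the recomputed sum, mean, variance, standard deviation and CV agree with the figures listed above.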
[Figure: Histogram with fixed size bins (bins=12)]

Common values

Value             Count  Frequency (%)
3                 305    10.2%
11                268    9.0%
9                 266    8.9%
10                262    8.8%
2                 261    8.7%
12                258    8.6%
1                 256    8.6%
8                 234    7.8%
7                 231    7.7%
6                 223    7.4%
Other values (2)  430    14.4%

Minimum 10 values

Value  Count  Frequency (%)
1      256    8.6%
2      261    8.7%
3      305    10.2%
4      208    6.9%
5      222    7.4%
6      223    7.4%
7      231    7.7%
8      234    7.8%
9      266    8.9%
10     262    8.8%

Maximum 10 values

Value  Count  Frequency (%)
12     258    8.6%
11     268    9.0%
10     262    8.8%
9      266    8.9%
8      234    7.8%
7      231    7.7%
6      223    7.4%
5      222    7.4%
4      208    6.9%
3      305    10.2%

국적지역 (Nationality/region)
Text

Distinct: 121
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 23.5 KiB

Length

Max length: 11
Median length: 10
Mean length: 4.1903808
Min length: 2

Characters and Unicode

Total characters: 12546
Distinct characters: 153
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
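
As an illustration of those character properties, the sketch below classifies the characters of one value from this dataset with Python's standard `unicodedata` module. It is not how the report itself was computed: the module exposes general categories but not scripts, so the block check is a manual code point range test.

```python
import unicodedata
from collections import Counter

# One value from this dataset; it mixes Hangul with ASCII punctuation.
name = "보스니아-헤르체고비나"

# General category per character: "Lo" = Other Letter, "Pd" = Dash Punctuation.
categories = Counter(unicodedata.category(ch) for ch in name)

# Characters that fall in the Hangul Syllables block (U+AC00..U+D7AF).
in_hangul_block = sum(0xAC00 <= ord(ch) <= 0xD7AF for ch in name)

print(len(name), dict(categories), in_hangul_block)

# The parentheses that appear in other values map to Open/Close Punctuation.
print(unicodedata.category("("), unicodedata.category(")"))
```

The string has 11 characters, which matches the reported maximum length: ten Hangul syllables in category Other Letter and one hyphen in category Dash Punctuation.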

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 미국 (United States)
2nd row: 영국 (United Kingdom)
3rd row: 교황청 (Holy See)
4th row: 멕시코 (Mexico)
5th row: 모나코 (Monaco)

Value                                          Count  Frequency (%)
미국 (United States)                            35     1.2%
베네수엘라 (Venezuela)                           35     1.2%
도미니카연방 (Dominica)                          35     1.2%
영국 (United Kingdom)                           35     1.2%
아일랜드 (Ireland)                              35     1.2%
멕시코 (Mexico)                                 35     1.2%
알바니아 (Albania)                              34     1.1%
영국속국민 (British subject)                     34     1.1%
세인트크리스토퍼네비스 (Saint Kitts and Nevis)     34     1.1%
슬로베니아 (Slovenia)                            34     1.1%
Other values (111)                             2648   88.4%

Most occurring characters

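Value               Count  Frequency (%)
[not preserved]     833    6.6%
[not preserved]     570    4.5%
[not preserved]     446    3.6%
[not preserved]     407    3.2%
[not preserved]     359    2.9%
[not preserved]     300    2.4%
[not preserved]     292    2.3%
[not preserved]     290    2.3%
[not preserved]     259    2.1%
[not preserved]     252    2.0%
Other values (143)  8538   68.1%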

Most occurring categories

Value              Count  Frequency (%)
Other Letter       12480  99.5%
Dash Punctuation   22     0.2%
Open Punctuation   22     0.2%
Close Punctuation  22     0.2%

Most frequent character per category

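Other Letter

Value               Count  Frequency (%)
[not preserved]     833    6.7%
[not preserved]     570    4.6%
[not preserved]     446    3.6%
[not preserved]     407    3.3%
[not preserved]     359    2.9%
[not preserved]     300    2.4%
[not preserved]     292    2.3%
[not preserved]     290    2.3%
[not preserved]     259    2.1%
[not preserved]     252    2.0%
Other values (140)  8472   67.9%

Dash Punctuation

Value  Count  Frequency (%)
-      22     100.0%

Open Punctuation

Value  Count  Frequency (%)
(      22     100.0%

Close Punctuation

Value  Count  Frequency (%)
)      22     100.0%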

Most occurring scripts

Value   Count  Frequency (%)
Hangul  12480  99.5%
Common  66     0.5%

Most frequent character per script

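Hangul

Value               Count  Frequency (%)
[not preserved]     833    6.7%
[not preserved]     570    4.6%
[not preserved]     446    3.6%
[not preserved]     407    3.3%
[not preserved]     359    2.9%
[not preserved]     300    2.4%
[not preserved]     292    2.3%
[not preserved]     290    2.3%
[not preserved]     259    2.1%
[not preserved]     252    2.0%
Other values (140)  8472   67.9%

Common

Value  Count  Frequency (%)
-      22     33.3%
(      22     33.3%
)      22     33.3%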

Most occurring blocks

Value   Count  Frequency (%)
Hangul  12480  99.5%
ASCII   66     0.5%

Most frequent character per block

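Hangul

Value               Count  Frequency (%)
[not preserved]     833    6.7%
[not preserved]     570    4.6%
[not preserved]     446    3.6%
[not preserved]     407    3.3%
[not preserved]     359    2.9%
[not preserved]     300    2.4%
[not preserved]     292    2.3%
[not preserved]     290    2.3%
[not preserved]     259    2.1%
[not preserved]     252    2.0%
Other values (140)  8472   67.9%

ASCII

Value  Count  Frequency (%)
-      22     33.3%
(      22     33.3%
)      22     33.3%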

합계 (Total)
Real number (ℝ)

Distinct: 942
Distinct (%): 31.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1551.2859
Minimum: 1
Maximum: 138659
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 26.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 12
Median: 63
Q3: 341.75
95-th percentile: 6077.6
Maximum: 138659
Range: 138658
Interquartile range (IQR): 329.75

Descriptive statistics

Standard deviation: 7282.7308
Coefficient of variation (CV): 4.6946413
Kurtosis: 91.627551
Mean: 1551.2859
Median Absolute Deviation (MAD): 60
Skewness: 8.4144576
Sum: 4644550
Variance: 53038168
Monotonicity: Not monotonic
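
Several of these statistics are functions of the others, which gives a quick consistency check on the report. The sketch below recomputes them in Python from the figures printed in this section; the dictionary keys are illustrative names, and the tolerance is an assumption chosen to absorb the rounding of the printed values.

```python
import math

# Figures copied from the statistics reported for 합계.
reported = {
    "n": 2994, "sum": 4644550, "mean": 1551.2859, "std": 7282.7308,
    "variance": 53038168, "cv": 4.6946413,
    "q1": 12, "q3": 341.75, "iqr": 329.75,
    "min": 1, "max": 138659, "range": 138658,
}

# Each derived statistic recomputed from the more basic ones.
derived = {
    "mean": reported["sum"] / reported["n"],
    "variance": reported["std"] ** 2,
    "cv": reported["std"] / reported["mean"],
    "iqr": reported["q3"] - reported["q1"],
    "range": reported["max"] - reported["min"],
}

for name, value in derived.items():
    # A loose relative tolerance absorbs the rounding of the printed figures.
    ok = math.isclose(value, reported[name], rel_tol=1e-6)
    print(f"{name}: recomputed {value:.8g}, reported {reported[name]}, match={ok}")
```

A coefficient of variation near 4.7, together with a median of 63 against a mean of about 1551, indicates a heavily right-skewed distribution, in line with the reported skewness of 8.41.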
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
1                   129    4.3%
2                   118    3.9%
3                   84     2.8%
4                   83     2.8%
6                   62     2.1%
5                   60     2.0%
7                   52     1.7%
8                   42     1.4%
10                  40     1.3%
12                  36     1.2%
Other values (932)  2288   76.4%

Minimum 10 values

Value  Count  Frequency (%)
1      129    4.3%
2      118    3.9%
3      84     2.8%
4      83     2.8%
5      60     2.0%
6      62     2.1%
7      52     1.7%
8      42     1.4%
9      36     1.2%
10     40     1.3%

Maximum 10 values

Value   Count  Frequency (%)
138659  1      < 0.1%
99160   1      < 0.1%
78601   1      < 0.1%
74693   1      < 0.1%
72692   1      < 0.1%
67826   1      < 0.1%
67742   1      < 0.1%
66981   1      < 0.1%
66752   1      < 0.1%
65736   1      < 0.1%
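
With a median of 63 and a maximum of 138659, a histogram with 50 fixed-size bins places almost every row in its first few bins. The sketch below shows equal-width binning in Python. It assumes the bins span exactly the reported minimum and maximum and that the top edge belongs to the last bin (the convention of `numpy.histogram`); `bin_index` is a hypothetical helper, not part of ydata-profiling.

```python
def bin_index(value, lo, hi, bins):
    """Return the 0-based index of the equal-width bin that holds `value`.

    The top edge `hi` is counted in the last bin.
    """
    width = (hi - lo) / bins
    return min(int((value - lo) // width), bins - 1)


# Minimum, maximum and bin count as reported for 합계.
LO, HI, BINS = 1, 138659, 50

for label, value in [("median", 63), ("Q3", 341.75),
                     ("95th percentile", 6077.6), ("maximum", 138659)]:
    print(f"{label}: bin {bin_index(value, LO, HI, BINS)} of {BINS}")
```

Under these assumptions each bin is about 2773 wide, so the median and Q3 fall in the first bin and the 95th percentile in the third, which leaves the remaining bins to a small number of large monthly totals.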

Interactions

[Figures: four interaction plots]

Correlations

[Figure: correlation heatmap]

                      합계
        1.000  0.546  0.000
        0.546  1.000  0.000
합계    0.000  0.000  1.000

[Figure: correlation heatmap]

               합계
        1.000  0.022  0.358
합계    0.022  1.000  0.000
        0.358  0.000  1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

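First rows

               국적지역                  합계
0     2021  5  미국 (United States)      3941
1     2021  5  영국 (United Kingdom)     79
2     2021  5  교황청 (Holy See)         1
3     2021  5  멕시코 (Mexico)           35
4     2021  5  모나코 (Monaco)           1
5     2021  5  안도라 (Andorra)          1
6     2021  5  팔라우 (Palau)            1
7     2021  5  아일랜드 (Ireland)        12
8     2021  5  알바니아 (Albania)        2
9     2021  5  바베이도스 (Barbados)     1

Last rows

               국적지역                                                   합계
2984  2024  3  도미니카공화국 (Dominican Republic)                        122
2985  2024  3  미이크로네시아 (Micronesia)                                7
2986  2024  3  사우디아라비아 (Saudi Arabia)                              327
2987  2024  3  오스트레일리아 (Australia)                                 554
2988  2024  3  남아프리카공화국 (South Africa)                            362
2989  2024  3  아랍에미리트연합 (United Arab Emirates)                    32
2990  2024  3  트리니다드토바고 (Trinidad and Tobago)                     24
2991  2024  3  세인트빈센트그레나딘 (Saint Vincent and the Grenadines)    7
2992  2024  3  보스니아-헤르체고비나 (Bosnia and Herzegovina)             48
2993  2024  3  세인트크리스토퍼네비스 (Saint Kitts and Nevis)             45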