Overview

Dataset statistics

Number of variables: 4
Number of observations: 4355
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 149.0 KiB
Average record size in memory: 35.0 B
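
The average record size follows from the two figures above it. A minimal check, assuming the report uses binary units (1 KiB = 1024 bytes):

```python
# Figures copied from the dataset statistics above.
total_size_kib = 149.0   # "Total size in memory"
n_observations = 4355    # "Number of observations"

# Convert KiB to bytes (assumed 1 KiB = 1024 B), then divide by the row count.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```

The result agrees with the reported 35.0 B to one decimal place.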

Variable types

Categorical: 1
Numeric: 2
Text: 1

Dataset

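Description: Monthly counts, by nationality (region), of foreign residents holding employment-eligible visas for professional and non-professional (simple-skill) labor (C-4, E-1 to E-7, E-9 to E-10, H-2)
Author: Ministry of Justice (법무부)
URL: https://www.data.go.kr/data/15100029/fileData.do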

Reproduction

Analysis started: 2024-04-29 22:59:11.403690
Analysis finished: 2024-04-29 22:59:13.373908
Duration: 1.97 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
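
A report of this kind could be regenerated roughly as follows. This is a sketch, not the original run: the function name `build_report`, the CSV path, the output path, and the `encoding` default are all hypothetical, and it assumes `pandas` and `ydata-profiling` are installed.

```python
def build_report(csv_path, out_path="report.html", encoding="utf-8"):
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so the helper can be defined without the libraries loaded.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding=encoding)

    report = ProfileReport(
        df,
        title="Profiling report",  # placeholder title
        dataset={
            "url": "https://www.data.go.kr/data/15100029/fileData.do",
        },
    )
    report.to_file(out_path)
    return report
```

Calling `build_report("data.csv")` with a local copy of the dataset would write `report.html`; timings and memory figures will differ from the run recorded above.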

Variables


Categorical

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 34.2 KiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022
2nd row: 2022
3rd row: 2022
4th row: 2022
5th row: 2022

Common Values

Value | Count | Frequency (%)
2023 | 1958 | 45.0%
2022 | 1912 | 43.9%
2024 | 485 | 11.1%
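
The frequency column is each count divided by the number of observations. A small sketch that recomputes it from the counts listed above:

```python
# Counts copied from the common-values table of the categorical variable.
year_counts = {"2023": 1958, "2022": 1912, "2024": 485}

# The counts should add up to the number of observations (4355).
total = sum(year_counts.values())

# Share of each value as a percentage, rounded to one decimal like the report.
shares = {year: round(100 * count / total, 1) for year, count in year_counts.items()}

for year, share in shares.items():
    print(year, f"{share}%")
```

The smaller share for 2024 is consistent with the 2024 rows in the sample covering only part of the year.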

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]


Real number (ℝ)

Distinct: 12
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.9972445
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 38.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 6
Q3: 9
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.5557754
Coefficient of variation (CV): 0.59290152
Kurtosis: -1.2724156
Mean: 5.9972445
Median Absolute Deviation (MAD): 3
Skewness: 0.17920823
Sum: 26118
Variance: 12.643539
Monotonicity: Not monotonic
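
These descriptive statistics can be recomputed from the per-value counts of this variable (values 1 to 12, as listed in its frequency tables). The sketch below assumes the report uses the sample variance with an n - 1 denominator, which is what matches the reported figure:

```python
import math

# Value -> count, copied from the frequency tables of this variable.
month_counts = {
    1: 482, 2: 483, 3: 488, 4: 322, 5: 322, 6: 322,
    7: 326, 8: 325, 9: 321, 10: 322, 11: 319, 12: 323,
}

n = sum(month_counts.values())                        # number of observations
total = sum(m * c for m, c in month_counts.items())   # "Sum"
mean = total / n

# Sample variance (n - 1 denominator, assumed), then the derived statistics.
squared_dev = sum(c * (m - mean) ** 2 for m, c in month_counts.items())
variance = squared_dev / (n - 1)
std = math.sqrt(variance)
cv = std / mean   # coefficient of variation

print(n, total, round(mean, 7), round(variance, 6), round(std, 7), round(cv, 8))
```

The higher counts for values 1 to 3 pull the mean slightly below 6.5, the midpoint of 1 to 12, which fits the small positive skewness.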
[Figure: Histogram with fixed size bins (bins=12)]

Common values

Value | Count | Frequency (%)
3 | 488 | 11.2%
2 | 483 | 11.1%
1 | 482 | 11.1%
7 | 326 | 7.5%
8 | 325 | 7.5%
12 | 323 | 7.4%
4 | 322 | 7.4%
5 | 322 | 7.4%
6 | 322 | 7.4%
10 | 322 | 7.4%
Other values (2) | 640 | 14.7%

Minimum 10 values

Value | Count | Frequency (%)
1 | 482 | 11.1%
2 | 483 | 11.1%
3 | 488 | 11.2%
4 | 322 | 7.4%
5 | 322 | 7.4%
6 | 322 | 7.4%
7 | 326 | 7.5%
8 | 325 | 7.5%
9 | 321 | 7.4%
10 | 322 | 7.4%

Maximum 10 values

Value | Count | Frequency (%)
12 | 323 | 7.4%
11 | 319 | 7.3%
10 | 322 | 7.4%
9 | 321 | 7.4%
8 | 325 | 7.5%
7 | 326 | 7.5%
6 | 322 | 7.4%
5 | 322 | 7.4%
4 | 322 | 7.4%
3 | 488 | 11.2%

국적지역 (nationality/region)
Text

Distinct: 183
Distinct (%): 4.2%
Missing: 0
Missing (%): 0.0%
Memory size: 34.2 KiB

Length

Max length: 11
Median length: 9
Mean length: 3.9671642
Min length: 2

Characters and Unicode

Total characters: 17277
Distinct characters: 170
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
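
As an illustration of these character properties, Python's standard `unicodedata` module reports the general category of each code point. In the sketch below the first two strings come from the sample rows of this variable; the third, `"가(나-다)"`, is a made-up string added only to show the punctuation categories.

```python
import unicodedata
from collections import Counter

# Two real sample values plus one hypothetical string with punctuation.
samples = ["한국계중국인", "베트남", "가(나-다)"]

# Tally the Unicode general category of every character.
category_counts = Counter(
    unicodedata.category(ch) for text in samples for ch in text
)

# Lo = Other Letter, Ps = Open Punctuation,
# Pe = Close Punctuation, Pd = Dash Punctuation.
for category, count in category_counts.most_common():
    print(category, count)
```

Hangul syllables fall under Other Letter (`Lo`), which is why that category accounts for nearly all characters in this variable.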

Unique

Unique: 3
Unique (%): 0.1%

Sample

1st row: 한국계중국인 (Korean-Chinese)
2nd row: 베트남 (Vietnam)
3rd row: 캄보디아 (Cambodia)
4th row: 네팔 (Nepal)
5th row: 인도네시아 (Indonesia)

Value | Count | Frequency (%)
한국계중국인 (Korean-Chinese) | 27 | 0.6%
베트남 (Vietnam) | 27 | 0.6%
조지아 (Georgia) | 27 | 0.6%
몰도바 (Moldova) | 27 | 0.6%
키프로스 (Cyprus) | 27 | 0.6%
과테말라 (Guatemala) | 27 | 0.6%
아제르바이잔 (Azerbaijan) | 27 | 0.6%
에스토니아 (Estonia) | 27 | 0.6%
파나마 (Panama) | 27 | 0.6%
레바논 (Lebanon) | 27 | 0.6%
Other values (173) | 4085 | 93.8%

Most occurring characters

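Value | Count | Frequency (%)
(character lost) | 1304 | 7.5%
(character lost) | 691 | 4.0%
(character lost) | 659 | 3.8%
(character lost) | 511 | 3.0%
(character lost) | 506 | 2.9%
(character lost) | 492 | 2.8%
(character lost) | 477 | 2.8%
(character lost) | 473 | 2.7%
(character lost) | 325 | 1.9%
(character lost) | 301 | 1.7%
Other values (160) | 11538 | 66.8%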

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 17202 | 99.6%
Close Punctuation | 27 | 0.2%
Open Punctuation | 27 | 0.2%
Dash Punctuation | 21 | 0.1%

Most frequent character per category

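Other Letter
Value | Count | Frequency (%)
(character lost) | 1304 | 7.6%
(character lost) | 691 | 4.0%
(character lost) | 659 | 3.8%
(character lost) | 511 | 3.0%
(character lost) | 506 | 2.9%
(character lost) | 492 | 2.9%
(character lost) | 477 | 2.8%
(character lost) | 473 | 2.7%
(character lost) | 325 | 1.9%
(character lost) | 301 | 1.7%
Other values (157) | 11463 | 66.6%

Close Punctuation
Value | Count | Frequency (%)
) | 27 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 27 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 21 | 100.0%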

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 17202 | 99.6%
Common | 75 | 0.4%

Most frequent character per script

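Hangul
Value | Count | Frequency (%)
(character lost) | 1304 | 7.6%
(character lost) | 691 | 4.0%
(character lost) | 659 | 3.8%
(character lost) | 511 | 3.0%
(character lost) | 506 | 2.9%
(character lost) | 492 | 2.9%
(character lost) | 477 | 2.8%
(character lost) | 473 | 2.7%
(character lost) | 325 | 1.9%
(character lost) | 301 | 1.7%
Other values (157) | 11463 | 66.6%

Common
Value | Count | Frequency (%)
) | 27 | 36.0%
( | 27 | 36.0%
- | 21 | 28.0%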

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 17202 | 99.6%
ASCII | 75 | 0.4%

Most frequent character per block

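Hangul
Value | Count | Frequency (%)
(character lost) | 1304 | 7.6%
(character lost) | 691 | 4.0%
(character lost) | 659 | 3.8%
(character lost) | 511 | 3.0%
(character lost) | 506 | 2.9%
(character lost) | 492 | 2.9%
(character lost) | 477 | 2.8%
(character lost) | 473 | 2.7%
(character lost) | 325 | 1.9%
(character lost) | 301 | 1.7%
Other values (157) | 11463 | 66.6%

ASCII
Value | Count | Frequency (%)
) | 27 | 36.0%
( | 27 | 36.0%
- | 21 | 28.0%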

취업자격자수 (number of employment-eligible visa holders)
Real number (ℝ)

Distinct: 996
Distinct (%): 22.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2872.0999
Minimum: 1
Maximum: 105662
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 38.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 15
Q3: 124.5
95-th percentile: 23271.9
Maximum: 105662
Range: 105661
Interquartile range (IQR): 121.5

Descriptive statistics

Standard deviation: 10702.974
Coefficient of variation (CV): 3.7265328
Kurtosis: 35.193795
Mean: 2872.0999
Median Absolute Deviation (MAD): 14
Skewness: 5.4344411
Sum: 12507995
Variance: 1.1455366 × 10^8
Monotonicity: Not monotonic
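
Several of these figures are tied together by simple identities. The sketch below rechecks them from the reported sum, standard deviation, and median, using the 4355 observations stated in the overview; it needs no access to the underlying data.

```python
# Figures copied from the statistics reported for 취업자격자수.
n = 4355              # number of observations
total = 12507995      # "Sum"
std = 10702.974       # "Standard deviation"
median = 15           # "median"

mean = total / n                   # mean = sum / n
cv = std / mean                    # coefficient of variation
variance = std ** 2                # variance is the squared standard deviation
mean_to_median = mean / median     # rough indicator of right skew

print(round(mean, 4), round(cv, 4), f"{variance:.4e}", round(mean_to_median, 1))
```

A mean far above the median, together with a CV above 3, is what a heavily right-skewed count variable looks like: most nationality and month combinations have only a handful of people, while a few exceed 90,000.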
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
1 | 586 | 13.5%
2 | 285 | 6.5%
3 | 227 | 5.2%
5 | 156 | 3.6%
4 | 155 | 3.6%
7 | 122 | 2.8%
6 | 117 | 2.7%
9 | 103 | 2.4%
8 | 99 | 2.3%
12 | 78 | 1.8%
Other values (986) | 2427 | 55.7%

Minimum 10 values

Value | Count | Frequency (%)
1 | 586 | 13.5%
2 | 285 | 6.5%
3 | 227 | 5.2%
4 | 155 | 3.6%
5 | 156 | 3.6%
6 | 117 | 2.7%
7 | 122 | 2.8%
8 | 99 | 2.3%
9 | 103 | 2.4%
10 | 55 | 1.3%

Maximum 10 values

Value | Count | Frequency (%)
105662 | 1 | < 0.1%
103780 | 1 | < 0.1%
102031 | 1 | < 0.1%
101128 | 1 | < 0.1%
100162 | 1 | < 0.1%
98258 | 1 | < 0.1%
96677 | 1 | < 0.1%
94998 | 1 | < 0.1%
93936 | 1 | < 0.1%
92679 | 1 | < 0.1%

Interactions

[Figures: four interaction plots]

Correlations

[Figure: correlation plot]
Columns follow the same order as the rows.
(unnamed) | 1.000 | 0.505 | 0.134
(unnamed) | 0.505 | 1.000 | 0.000
취업자격자수 | 0.134 | 0.000 | 1.000

[Figure: correlation plot]
Columns follow the same order as the rows.
(unnamed) | 1.000 | 0.007 | 0.351
취업자격자수 | 0.007 | 1.000 | 0.059
(unnamed) | 0.351 | 0.059 | 1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

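First rows

(index) | (unnamed) | (unnamed) | 국적지역 | 취업자격자수
0 | 2022 | 1 | 한국계중국인 (Korean-Chinese) | 105662
1 | 2022 | 1 | 베트남 (Vietnam) | 40071
2 | 2022 | 1 | 캄보디아 (Cambodia) | 33574
3 | 2022 | 1 | 네팔 (Nepal) | 29587
4 | 2022 | 1 | 인도네시아 (Indonesia) | 28074
5 | 2022 | 1 | 태국 (Thailand) | 23061
6 | 2022 | 1 | 필리핀 (Philippines) | 21431
7 | 2022 | 1 | 미얀마 (Myanmar) | 21278
8 | 2022 | 1 | 우즈베키스탄 (Uzbekistan) | 20875
9 | 2022 | 1 | 스리랑카 (Sri Lanka) | 17080

Last rows

(index) | (unnamed) | (unnamed) | 국적지역 | 취업자격자수
4345 | 2024 | 3 | 포르투갈 (Portugal) | 46
4346 | 2024 | 3 | 폴란드 (Poland) | 132
4347 | 2024 | 3 | 프랑스 (France) | 1569
4348 | 2024 | 3 | 피지 (Fiji) | 4
4349 | 2024 | 3 | 핀란드 (Finland) | 29
4350 | 2024 | 3 | 필리핀 (Philippines) | 29576
4351 | 2024 | 3 | 한국계러시아인 (Korean-Russian) | 9
4352 | 2024 | 3 | 한국계중국인 (Korean-Chinese) | 89161
4353 | 2024 | 3 | 헝가리 (Hungary) | 60
4354 | 2024 | 3 | 홍콩 (Hong Kong) | 212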