Overview

Dataset statistics

Number of variables: 7
Number of observations: 31
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.0 KiB
Average record size in memory: 65.3 B
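
The memory figures above are consistent with one another, which a quick arithmetic check makes visible. This is a minimal sketch that uses only the reported numbers; the variable names are illustrative.

```python
# Figures copied from the dataset statistics above.
n_variables = 7
n_observations = 31
avg_record_bytes = 65.3  # reported average record size, in bytes

# Number of cells that the "Missing cells" count is measured against.
total_cells = n_variables * n_observations

# Total size implied by the average record size, converted to KiB.
total_kib = avg_record_bytes * n_observations / 1024

print(total_cells)          # 217
print(round(total_kib, 1))  # 2.0
```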

Variable types

Text: 1
Categorical: 2
Numeric: 4

Dataset

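Description: Provides, by country and by sex, the number of visa issuance confirmations issued under the Employment Permit System. A visa issuance confirmation is a certificate, issued in Korea, that authorizes entry so that a foreign worker who has signed an employment contract can obtain the visa required for entry at an overseas diplomatic mission without further procedures.
URL: https://www.data.go.kr/data/3075820/fileData.do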

Alerts

E9_01(제조업) is highly overall correlated with E9_03(농업) and 1 other field (High correlation)
E9_02(건설업) is highly overall correlated with E9_03(농업) (High correlation)
E9_03(농업) is highly overall correlated with E9_01(제조업) and 1 other field (High correlation)
성별 is highly overall correlated with E9_01(제조업) (High correlation)
E9_05(서비스업) is highly imbalanced (65.0%) (Imbalance)
E9_01(제조업) has unique values (Unique)
E9_02(건설업) has 26 (83.9%) zeros (Zeros)
E9_03(농업) has 20 (64.5%) zeros (Zeros)
E9_04(어업) has 24 (77.4%) zeros (Zeros)
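
The percentages in the Imbalance and Zeros alerts can be recomputed from counts given elsewhere in this report. The sketch below assumes the imbalance score is one minus the normalized Shannon entropy of the class frequencies. That definition is an assumption here, although it does reproduce the reported 65.0%. The function names are illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the base-2 Shannon entropy of the
    class frequencies and k the number of classes (assumed definition)."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


def zero_share(n_zeros, n_rows):
    """Percentage of rows whose value is zero."""
    return 100.0 * n_zeros / n_rows


# E9_05(서비스업): counts of the values 0, 77, 27, 2 and 1.
e9_05_counts = [27, 1, 1, 1, 1]
score = imbalance_score(e9_05_counts)
print(f"{score:.1%}")  # 65.0%

# Zero counts taken from the alerts; the dataset has 31 rows.
zero_counts = {"E9_02": 26, "E9_03": 20, "E9_04": 24}
zero_shares = {name: round(zero_share(n, 31), 1) for name, n in zero_counts.items()}
print(zero_shares)  # {'E9_02': 83.9, 'E9_03': 64.5, 'E9_04': 77.4}
```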

Reproduction

Analysis started: 2023-12-12 02:48:57.088603
Analysis finished: 2023-12-12 02:48:59.474110
Duration: 2.39 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
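
The reported duration follows directly from the two timestamps. A minimal check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the reproduction details above.
started = datetime.fromisoformat("2023-12-12 02:48:57.088603")
finished = datetime.fromisoformat("2023-12-12 02:48:59.474110")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")  # 2.39 seconds
```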

Variables

국적 (Nationality)
Text

Distinct: 16
Distinct (%): 51.6%
Missing: 0
Missing (%): 0.0%
Memory size: 380.0 B

Length

Max length: 8
Median length: 6
Mean length: 3.7741935
Min length: 2

Characters and Unicode

Total characters: 117
Distinct characters: 45
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 3.2%

Sample

1st row: 네팔
2nd row: 네팔
3rd row: 인도네시아
4th row: 인도네시아
5th row: 베트남

Value             Count  Frequency (%)
네팔              2      6.5%
인도네시아        2      6.5%
베트남            2      6.5%
캄보디아          2      6.5%
필리핀            2      6.5%
스리랑카          2      6.5%
타이              2      6.5%
방글라데시        2      6.5%
우즈베키스탄      2      6.5%
파키스탄          2      6.5%
Other values (6)  11     35.5%

Most occurring characters

Count  Frequency (%)
8      6.8%
6      5.1%
4      3.4%
4      3.4%
4      3.4%
4      3.4%
4      3.4%
4      3.4%
4      3.4%
4      3.4%
Other values (35): 71  60.7%

Most occurring categories

Value         Count  Frequency (%)
Other Letter  117    100.0%

Most frequent character per category

Other Letter
Count  Frequency (%)
8      6.8%
6      5.1%
4      3.4%
4      3.4%
4      3.4%
4      3.4%
4      3.4%
4      3.4%
4      3.4%
4      3.4%
Other values (35): 71  60.7%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  117    100.0%

Most frequent character per script

Hangul
Count  Frequency (%)
8      6.8%
6      5.1%
4      3.4%
4      3.4%
4      3.4%
4      3.4%
4      3.4%
4      3.4%
4      3.4%
4      3.4%
Other values (35): 71  60.7%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  117    100.0%

Most frequent character per block

Hangul
Count  Frequency (%)
8      6.8%
6      5.1%
4      3.4%
4      3.4%
4      3.4%
4      3.4%
4      3.4%
4      3.4%
4      3.4%
4      3.4%
Other values (35): 71  60.7%

성별 (Sex)
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 6.5%
Missing: 0
Missing (%): 0.0%
Memory size: 380.0 B
남성 (male): 16
여성 (female): 15

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 남성
2nd row: 여성
3rd row: 남성
4th row: 여성
5th row: 남성

Common Values

Value  Count  Frequency (%)
남성   16     51.6%
여성   15     48.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
남성   16     51.6%
여성   15     48.4%

E9_01(제조업, Manufacturing)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 31
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2325.0968
Minimum: 2
Maximum: 8934
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 411.0 B

Quantile statistics

Minimum: 2
5-th percentile: 12
Q1: 96.5
Median: 560
Q3: 4648.5
95-th percentile: 8363.5
Maximum: 8934
Range: 8932
Interquartile range (IQR): 4552

Descriptive statistics

Standard deviation: 3094.4732
Coefficient of variation (CV): 1.3309008
Kurtosis: -0.41902481
Mean: 2325.0968
Median Absolute Deviation (MAD): 535
Skewness: 1.094198
Sum: 72078
Variance: 9575764.4
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)

Common Values

Value              Count  Frequency (%)
8934               1      3.2%
320                1      3.2%
6                  1      3.2%
164                1      3.2%
52                 1      3.2%
653                1      3.2%
141                1      3.2%
571                1      3.2%
25                 1      3.2%
862                1      3.2%
Other values (21)  21     67.7%

Minimum values

Value  Count  Frequency (%)
2      1      3.2%
6      1      3.2%
18     1      3.2%
19     1      3.2%
25     1      3.2%
45     1      3.2%
46     1      3.2%
52     1      3.2%
141    1      3.2%
164    1      3.2%

Maximum values

Value  Count  Frequency (%)
8934   1      3.2%
8864   1      3.2%
7863   1      3.2%
7299   1      3.2%
6659   1      3.2%
6188   1      3.2%
5896   1      3.2%
5719   1      3.2%
3578   1      3.2%
3424   1      3.2%
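
Several of the E9_01(제조업) statistics are derived from others, so they can be cross-checked against one another. The sketch below uses only the reported figures; because the standard deviation is a rounded value, the derived variance agrees with the report only up to rounding.

```python
# Reported figures for E9_01(제조업).
q1, q3 = 96.5, 4648.5
minimum, maximum = 2, 8934
total, n = 72078, 31
std = 3094.4732

iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
mean = total / n                 # mean from sum and row count
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance from the standard deviation

print(iqr)             # 4552.0
print(value_range)     # 8932
print(round(mean, 4))  # 2325.0968
print(round(cv, 4))    # 1.3309
```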

E9_02(건설업, Construction)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 6
Distinct (%): 19.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 57.774194
Minimum: 0
Maximum: 1015
Zeros: 26
Zeros (%): 83.9%
Negative: 0
Negative (%): 0.0%
Memory size: 411.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 326.5
Maximum: 1015
Range: 1015
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 195.58335
Coefficient of variation (CV): 3.3853065
Kurtosis: 20.338606
Mean: 57.774194
Median Absolute Deviation (MAD): 0
Skewness: 4.3383672
Sum: 1791
Variance: 38252.847
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)

Common Values

Value  Count  Frequency (%)
0      26     83.9%
65     1      3.2%
331    1      3.2%
322    1      3.2%
58     1      3.2%
1015   1      3.2%

Minimum values

Value  Count  Frequency (%)
0      26     83.9%
58     1      3.2%
65     1      3.2%
322    1      3.2%
331    1      3.2%
1015   1      3.2%

Maximum values

Value  Count  Frequency (%)
1015   1      3.2%
331    1      3.2%
322    1      3.2%
65     1      3.2%
58     1      3.2%
0      26     83.9%
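
Because E9_02(건설업) has only six distinct values, the whole column can be rebuilt from its value counts and the summary statistics recomputed. The sketch below assumes the reported variance is the sample variance (divisor n - 1); that assumption reproduces the reported figures.

```python
import statistics

# Value counts as listed for E9_02(건설업).
value_counts = {0: 26, 58: 1, 65: 1, 322: 1, 331: 1, 1015: 1}

# Expand the counts back into a list of 31 observations.
values = [value for value, count in value_counts.items() for _ in range(count)]

n = len(values)
total = sum(values)
mean = statistics.fmean(values)
variance = statistics.variance(values)  # sample variance, divisor n - 1
std = statistics.stdev(values)          # sample standard deviation
cv = std / mean                         # coefficient of variation

print(n, total)            # 31 1791
print(round(mean, 6))      # 57.774194
print(round(variance, 3))  # 38252.847
print(round(cv, 4))        # 3.3853
```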

E9_03(농업, Agriculture)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 12
Distinct (%): 38.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 433.29032
Minimum: 0
Maximum: 4147
Zeros: 20
Zeros (%): 64.5%
Negative: 0
Negative (%): 0.0%
Memory size: 411.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 489.5
95-th percentile: 2140
Maximum: 4147
Range: 4147
Interquartile range (IQR): 489.5

Descriptive statistics

Standard deviation: 937.53806
Coefficient of variation (CV): 2.1637641
Kurtosis: 8.6712569
Mean: 433.29032
Median Absolute Deviation (MAD): 0
Skewness: 2.8477487
Sum: 13432
Variance: 878977.61
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)

Common Values

Value             Count  Frequency (%)
0                 20     64.5%
4147              1      3.2%
1419              1      3.2%
727               1      3.2%
326               1      3.2%
1357              1      3.2%
2861              1      3.2%
653               1      3.2%
6                 1      3.2%
1071              1      3.2%
Other values (2)  2      6.5%

Minimum values

Value  Count  Frequency (%)
0      20     64.5%
3      1      3.2%
6      1      3.2%
326    1      3.2%
653    1      3.2%
727    1      3.2%
862    1      3.2%
1071   1      3.2%
1357   1      3.2%
1419   1      3.2%

Maximum values

Value  Count  Frequency (%)
4147   1      3.2%
2861   1      3.2%
1419   1      3.2%
1357   1      3.2%
1071   1      3.2%
862    1      3.2%
727    1      3.2%
653    1      3.2%
326    1      3.2%
6      1      3.2%
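
The quantiles of E9_03(농업) can be reproduced from the values listed above: 20 zeros plus the 11 nonzero values that appear in the minimum and maximum lists. The sketch assumes linear interpolation between the two closest ranks; that rule is an assumption, although it matches the reported Q3 and 95-th percentile. The `percentile` helper is illustrative.

```python
def percentile(sorted_values, pct):
    """Percentile by linear interpolation at position pct * (n - 1) / 100."""
    position = pct * (len(sorted_values) - 1) / 100
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (
        sorted_values[upper] - sorted_values[lower]
    )


# 20 zeros plus every nonzero value listed for E9_03(농업).
nonzero = [3, 6, 326, 653, 727, 862, 1071, 1357, 1419, 2861, 4147]
values = sorted([0] * 20 + nonzero)

q1 = percentile(values, 25)
median = percentile(values, 50)
q3 = percentile(values, 75)
p95 = percentile(values, 95)

print(len(values), sum(values))  # 31 13432
print(q1, median, q3, p95)       # 0.0 0.0 489.5 2140.0
```

More than half of the rows are zero, so Q1 and the median are both 0, and the interquartile range equals Q3.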

E9_04(어업, Fishery)
Real number (ℝ)

ZEROS 

Distinct: 8
Distinct (%): 25.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 232
Minimum: 0
Maximum: 3185
Zeros: 24
Zeros (%): 77.4%
Negative: 0
Negative (%): 0.0%
Memory size: 411.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1605
Maximum: 3185
Range: 3185
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 701.3391
Coefficient of variation (CV): 3.0230134
Kurtosis: 11.68435
Mean: 232
Median Absolute Deviation (MAD): 0
Skewness: 3.3929277
Sum: 7192
Variance: 491876.53
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)

Common Values

Value  Count  Frequency (%)
0      24     77.4%
3185   1      3.2%
3      1      3.2%
2108   1      3.2%
4      1      3.2%
1102   1      3.2%
1      1      3.2%
789    1      3.2%

Minimum values

Value  Count  Frequency (%)
0      24     77.4%
1      1      3.2%
3      1      3.2%
4      1      3.2%
789    1      3.2%
1102   1      3.2%
2108   1      3.2%
3185   1      3.2%

Maximum values

Value  Count  Frequency (%)
3185   1      3.2%
2108   1      3.2%
1102   1      3.2%
789    1      3.2%
4      1      3.2%
3      1      3.2%
1      1      3.2%
0      24     77.4%
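
The median and MAD of 0 for E9_04(어업) follow from the dominance of zeros: with 24 of 31 values equal to zero, the middle value is zero, and so is the middle absolute deviation from it. A minimal sketch that rebuilds the column from the value counts above:

```python
import statistics

# 24 zeros plus the seven nonzero values listed for E9_04(어업).
nonzero = [1, 3, 4, 789, 1102, 2108, 3185]
values = [0] * 24 + nonzero

median = statistics.median(values)
# Median absolute deviation: median of |x - median|.
mad = statistics.median([abs(v - median) for v in values])
mean = sum(values) / len(values)

print(median, mad)  # 0 0
print(mean)         # 232.0
```

The mean of 232 is driven entirely by the seven nonzero rows, which is why it sits so far from the median.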

E9_05(서비스업, Services)
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): 16.1%
Missing: 0
Missing (%): 0.0%
Memory size: 380.0 B
0: 27
77: 1
27: 1
2: 1
1: 1

Length

Max length: 2
Median length: 1
Mean length: 1.0645161
Min length: 1

Unique

Unique: 4
Unique (%): 12.9%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      27     87.1%
77     1      3.2%
27     1      3.2%
2      1      3.2%
1      1      3.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0      27     87.1%
77     1      3.2%
27     1      3.2%
2      1      3.2%
1      1      3.2%
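
E9_05(서비스업) was profiled as Categorical, so its length statistics describe the values as strings rather than as numbers. The sketch below recomputes the length, distinct and unique figures from the value counts above; treating each value as its decimal string is an assumption that matches the reported numbers.

```python
from collections import Counter

# Value counts for E9_05(서비스업), with each value treated as a string.
counts = Counter({"0": 27, "77": 1, "27": 1, "2": 1, "1": 1})

n = sum(counts.values())
mean_length = sum(len(value) * count for value, count in counts.items()) / n
distinct = len(counts)                                # different values
unique = sum(1 for c in counts.values() if c == 1)    # values seen exactly once

print(round(mean_length, 7))          # 1.0645161
print(distinct, f"{distinct / n:.1%}")  # 5 16.1%
print(unique, f"{unique / n:.1%}")      # 4 12.9%
```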

Interactions

[Figure: 16 scatter-plot panels, one per ordered pair of the four numeric variables]

Correlations

Variable         국적   성별   E9_01(제조업)  E9_02(건설업)  E9_03(농업)  E9_04(어업)  E9_05(서비스업)
국적             1.000  0.000  0.694          0.681          0.809        0.000        0.223
성별             0.000  1.000  0.790          0.119          0.000        0.087        0.015
E9_01(제조업)    0.694  0.790  1.000          0.896          0.643        0.519        0.348
E9_02(건설업)    0.681  0.119  0.896          1.000          0.893        0.000        0.000
E9_03(농업)      0.809  0.000  0.643          0.893          1.000        0.000        0.000
E9_04(어업)      0.000  0.087  0.519          0.000          0.000        1.000        0.000
E9_05(서비스업)  0.223  0.015  0.348          0.000          0.000        0.000        1.000

Variable         E9_05(서비스업)  성별
E9_05(서비스업)  1.000            0.000
성별             0.000            1.000

Variable         E9_01(제조업)  E9_02(건설업)  E9_03(농업)  E9_04(어업)  성별   E9_05(서비스업)
E9_01(제조업)    1.000          0.472          0.506        0.232        0.540  0.188
E9_02(건설업)    0.472          1.000          0.534        0.176        0.187  0.000
E9_03(농업)      0.506          0.534          1.000        0.018        0.000  0.000
E9_04(어업)      0.232          0.176          0.018        1.000        0.076  0.000
성별             0.540          0.187          0.000        0.076        1.000  0.000
E9_05(서비스업)  0.188          0.000          0.000        0.000        0.000  1.000
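
The High correlation alerts at the top of this report line up with the last (six-variable) matrix above when a cutoff of 0.5 is applied. The sketch below makes that visible. The 0.5 threshold and the choice of this matrix are assumptions inferred from the numbers, not settings read from config.json, and `high_correlations` is an illustrative helper.

```python
names = ["E9_01(제조업)", "E9_02(건설업)", "E9_03(농업)",
         "E9_04(어업)", "성별", "E9_05(서비스업)"]

# Six-variable correlation matrix copied from this report.
matrix = [
    [1.000, 0.472, 0.506, 0.232, 0.540, 0.188],
    [0.472, 1.000, 0.534, 0.176, 0.187, 0.000],
    [0.506, 0.534, 1.000, 0.018, 0.000, 0.000],
    [0.232, 0.176, 0.018, 1.000, 0.076, 0.000],
    [0.540, 0.187, 0.000, 0.076, 1.000, 0.000],
    [0.188, 0.000, 0.000, 0.000, 0.000, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff; it happens to reproduce the alerts


def high_correlations(names, matrix, threshold):
    """Map each variable to the other variables whose correlation
    with it exceeds the threshold in absolute value."""
    flagged = {}
    for i, row in enumerate(matrix):
        partners = [names[j] for j, value in enumerate(row)
                    if j != i and abs(value) > threshold]
        if partners:
            flagged[names[i]] = partners
    return flagged


flagged = high_correlations(names, matrix, THRESHOLD)
for variable, partners in flagged.items():
    print(variable, "->", ", ".join(partners))
```

E9_04(어업) and E9_05(서비스업) have no off-diagonal entry above 0.5, which agrees with the absence of a High correlation alert for either of them.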

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index  국적              성별  E9_01(제조업)  E9_02(건설업)  E9_03(농업)  E9_04(어업)  E9_05(서비스업)
0      네팔              남성  8934           0              4147         0            0
1      네팔              여성  320            0              1419         0            0
2      인도네시아        남성  8864           0              0            3185         0
3      인도네시아        여성  271            0              0            3            0
4      베트남            남성  7863           65             727          2108         0
5      베트남            여성  832            0              326          4            0
6      캄보디아          남성  6188           331            1357         0            0
7      캄보디아          여성  560            0              2861         0            0
8      필리핀            남성  7299           0              0            0            0
9      필리핀            여성  464            0              0            0            0

Last rows

Index  국적              성별  E9_01(제조업)  E9_02(건설업)  E9_03(농업)  E9_04(어업)  E9_05(서비스업)
21     티모르민주공화국  남성  256            0              0            789          0
22     티모르민주공화국  여성  19             0              0            0            0
23     키르기즈          남성  862            0              0            0            0
24     키르기즈          여성  25             0              0            0            0
25     몽골              남성  571            0              0            0            27
26     몽골              여성  141            0              0            0            2
27     라오스            남성  653            0              0            0            0
28     라오스            여성  52             0              0            0            0
29     중국              남성  164            0              0            0            1
30     중국              여성  6              0              0            0            0