Overview

Dataset statistics

Number of variables: 13
Number of observations: 231
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 24.0 KiB
Average record size in memory: 106.6 B

Variable types

Numeric: 2
Categorical: 10
Text: 1

Dataset

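Description: Placement status, by year and by language, of interpretation and translation support staff at Multicultural Family Support Centers nationwide. The file data provides the following fields: 연번 (serial number), 시도명 (province or city name), 센터명 (center name), 통번역지원사 배치인원 (number of interpretation and translation support staff placed), 베트남어 (Vietnamese), 중국어 (Chinese), 캄보디아어 (Cambodian), 몽골어 (Mongolian), 일본어 (Japanese), 필리핀어 (Filipino), 러시아어 (Russian), and 네팔어 (Nepali).
Author: 한국건강가정진흥원 (Korean Institute for Healthy Family)
URL: https://www.data.go.kr/data/3081602/fileData.do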

Alerts

High correlation: 연번 is highly overall correlated with 시도명
High correlation: 통번역지원사 배치인원 is highly overall correlated with 중국어 and 2 other fields
High correlation: 시도명 is highly overall correlated with 연번
High correlation: 중국어 is highly overall correlated with 통번역지원사 배치인원
High correlation: 일본어 is highly overall correlated with 통번역지원사 배치인원
High correlation: 러시아어 is highly overall correlated with 통번역지원사 배치인원
Imbalance: 일본어 is highly imbalanced (76.2%)
Imbalance: 태국어 is highly imbalanced (92.8%)
Imbalance: 러시아어 is highly imbalanced (76.2%)
Imbalance: 네팔어 is highly imbalanced (96.0%)
Imbalance: 몽골어 is highly imbalanced (72.4%)
Imbalance: 캄보디아어 is highly imbalanced (74.3%)
Imbalance: 필리핀어 is highly imbalanced (57.5%)
Unique: 연번 has unique values
Zeros: 통번역지원사 배치인원 has 22 (9.5%) zeros
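
The imbalance percentages in the alerts above are consistent with one minus the normalized Shannon entropy of each column's value counts. The sketch below is an assumption about how such a score can be derived, not a confirmed description of the profiler's internals. The function name `imbalance_score` is hypothetical, and the counts are copied from the Common Values tables of this report.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of class counts (assumed formula)."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0  # a single class or an empty column is treated as score 0 here
    # Shannon entropy in bits, skipping empty classes
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Value counts (X, O) copied from this report
columns = {
    "일본어": [222, 9],
    "태국어": [229, 2],
    "네팔어": [230, 1],
    "필리핀어": [211, 20],
}

# Scores as percentages, rounded to one decimal like the alerts
scores = {name: round(100 * imbalance_score(c), 1) for name, c in columns.items()}
for name, score in scores.items():
    print(name, score)
```

A perfectly balanced column scores 0 and a column dominated by one value approaches 100, which is why 네팔어 (a single "O" among 231 rows) has the highest score.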

Reproduction

Analysis started: 2023-12-12 21:37:09.649359
Analysis finished: 2023-12-12 21:37:10.839047
Duration: 1.19 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
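
A report of this kind can be regenerated with ydata-profiling. The sketch below is illustrative only: the helper name `build_report`, the CSV path, the `cp949` encoding, the report title, and the output file name are all assumptions, since the report does not record how the data was loaded.

```python
def build_report(csv_path, output_path="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so this module still loads where the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the source file is a CSV in a Korean legacy encoding.
    df = pd.read_csv(csv_path, encoding=encoding)
    profile = ProfileReport(df, title="Profiling Report")
    profile.to_file(output_path)
    return profile
```

Calling `build_report("centers.csv")` with a hypothetical local copy of the dataset would write `report.html` next to the script.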

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 231
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 116
Minimum: 1
Maximum: 231
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.2 KiB

Quantile statistics

Minimum: 1
5th percentile: 12.5
Q1: 58.5
Median: 116
Q3: 173.5
95th percentile: 219.5
Maximum: 231
Range: 230
Interquartile range (IQR): 115

Descriptive statistics

Standard deviation: 66.828138
Coefficient of variation (CV): 0.57610464
Kurtosis: -1.2
Mean: 116
Median Absolute Deviation (MAD): 58
Skewness: 0
Sum: 26796
Variance: 4466
Monotonicity: Strictly increasing
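
Because 연번 is a gap-free sequence from 1 to 231 (231 distinct values, minimum 1, maximum 231, strictly increasing), the figures above can be rechecked without the source file. The sketch below uses Python's standard `statistics` module. The choice of sample variance and inclusive quantile interpolation is an assumption; it happens to reproduce the reported values.

```python
import statistics

values = list(range(1, 232))  # 연번 runs from 1 to 231 with no gaps

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
stdev = statistics.stdev(values)
cv = stdev / mean                       # coefficient of variation

# Quartiles with linear interpolation between the two nearest ranks
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")

# Median absolute deviation around the median
mad = statistics.median(abs(v - median) for v in values)

print(mean, variance, round(stdev, 6), round(cv, 8))
print(q1, median, q3, q3 - q1, mad, sum(values))
```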
Figure: Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | 0.4%
160 | 1 | 0.4%
148 | 1 | 0.4%
149 | 1 | 0.4%
150 | 1 | 0.4%
151 | 1 | 0.4%
152 | 1 | 0.4%
153 | 1 | 0.4%
154 | 1 | 0.4%
155 | 1 | 0.4%
Other values (221) | 221 | 95.7%

Value | Count | Frequency (%)
1 | 1 | 0.4%
2 | 1 | 0.4%
3 | 1 | 0.4%
4 | 1 | 0.4%
5 | 1 | 0.4%
6 | 1 | 0.4%
7 | 1 | 0.4%
8 | 1 | 0.4%
9 | 1 | 0.4%
10 | 1 | 0.4%

Value | Count | Frequency (%)
231 | 1 | 0.4%
230 | 1 | 0.4%
229 | 1 | 0.4%
228 | 1 | 0.4%
227 | 1 | 0.4%
226 | 1 | 0.4%
225 | 1 | 0.4%
224 | 1 | 0.4%
223 | 1 | 0.4%
222 | 1 | 0.4%

시도명
Categorical

HIGH CORRELATION 

Distinct: 17
Distinct (%): 7.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Value | Count
경기도 | 31
서울특별시 | 26
경상북도 | 24
전라남도 | 22
경상남도 | 19
Other values (12) | 109

Length

Max length: 7
Median length: 5
Mean length: 4.1428571
Min length: 3

Unique

Unique: 1
Unique (%): 0.4%

Sample

1st row: 서울특별시
2nd row: 서울특별시
3rd row: 서울특별시
4th row: 서울특별시
5th row: 서울특별시

Common Values

Value | Count | Frequency (%)
경기도 | 31 | 13.4%
서울특별시 | 26 | 11.3%
경상북도 | 24 | 10.4%
전라남도 | 22 | 9.5%
경상남도 | 19 | 8.2%
강원도 | 18 | 7.8%
충청남도 | 15 | 6.5%
전라북도 | 14 | 6.1%
부산광역시 | 14 | 6.1%
충청북도 | 12 | 5.2%
Other values (7) | 36 | 15.6%

Length

Figure: Histogram of lengths of the category

센터명
Text

Distinct: 215
Distinct (%): 93.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Length

Max length: 15
Median length: 8
Mean length: 8.4675325
Min length: 7

Characters and Unicode

Total characters: 1956
Distinct characters: 142
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 208
Unique (%): 90.0%

Sample

1st row: 강남구 가족센터
2nd row: 강동구 가족센터
3rd row: 강북구 가족센터
4th row: 강서구 가족센터
5th row: 관악구 가족센터
Value | Count | Frequency (%)
가족센터 | 208 | 45.0%
다문화가족지원센터 | 23 | 5.0%
동구 | 6 | 1.3%
서구 | 5 | 1.1%
중구 | 5 | 1.1%
북구 | 4 | 0.9%
남구 | 4 | 0.9%
고성군 | 2 | 0.4%
순창군 | 1 | 0.2%
장수군 | 1 | 0.2%
Other values (203) | 203 | 43.9%

Most occurring characters

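Value | Count | Frequency (%)
(missing) | 232 | 11.9%
(missing) | 231 | 11.8%
(missing) | 231 | 11.8%
(missing) | 231 | 11.8%
(missing) | 231 | 11.8%
(missing) | 85 | 4.3%
(missing) | 82 | 4.2%
(missing) | 72 | 3.7%
(missing) | 31 | 1.6%
(missing) | 28 | 1.4%
Other values (132) | 502 | 25.7%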

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1723 | 88.1%
Space Separator | 231 | 11.8%
Close Punctuation | 1 | 0.1%
Open Punctuation | 1 | 0.1%

Most frequent character per category

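Other Letter
Value | Count | Frequency (%)
(missing) | 232 | 13.5%
(missing) | 231 | 13.4%
(missing) | 231 | 13.4%
(missing) | 231 | 13.4%
(missing) | 85 | 4.9%
(missing) | 82 | 4.8%
(missing) | 72 | 4.2%
(missing) | 31 | 1.8%
(missing) | 28 | 1.6%
(missing) | 26 | 1.5%
Other values (129) | 474 | 27.5%

Space Separator
Value | Count | Frequency (%)
(space) | 231 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1 | 100.0%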

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1723 | 88.1%
Common | 233 | 11.9%

Most frequent character per script

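Hangul
Value | Count | Frequency (%)
(missing) | 232 | 13.5%
(missing) | 231 | 13.4%
(missing) | 231 | 13.4%
(missing) | 231 | 13.4%
(missing) | 85 | 4.9%
(missing) | 82 | 4.8%
(missing) | 72 | 4.2%
(missing) | 31 | 1.8%
(missing) | 28 | 1.6%
(missing) | 26 | 1.5%
Other values (129) | 474 | 27.5%

Common
Value | Count | Frequency (%)
(space) | 231 | 99.1%
) | 1 | 0.4%
( | 1 | 0.4%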

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1723 | 88.1%
ASCII | 233 | 11.9%

Most frequent character per block

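Hangul
Value | Count | Frequency (%)
(missing) | 232 | 13.5%
(missing) | 231 | 13.4%
(missing) | 231 | 13.4%
(missing) | 231 | 13.4%
(missing) | 85 | 4.9%
(missing) | 82 | 4.8%
(missing) | 72 | 4.2%
(missing) | 31 | 1.8%
(missing) | 28 | 1.6%
(missing) | 26 | 1.5%
Other values (129) | 474 | 27.5%

ASCII
Value | Count | Frequency (%)
(space) | 231 | 99.1%
) | 1 | 0.4%
( | 1 | 0.4%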

통번역지원사 배치인원
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 6
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.3506494
Minimum: 0
Maximum: 5
Zeros: 22
Zeros (%): 9.5%
Negative: 0
Negative (%): 0.0%
Memory size: 2.2 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 1
Median: 1
Q3: 2
95th percentile: 3
Maximum: 5
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.91032102
Coefficient of variation (CV): 0.67398768
Kurtosis: 2.9021499
Mean: 1.3506494
Median Absolute Deviation (MAD): 0
Skewness: 1.4449901
Sum: 312
Variance: 0.82868436
Monotonicity: Not monotonic
Figure: Histogram with fixed size bins (bins=6)
Value | Count | Frequency (%)
1 | 139 | 60.2%
2 | 49 | 21.2%
0 | 22 | 9.5%
3 | 11 | 4.8%
4 | 8 | 3.5%
5 | 2 | 0.9%

Value | Count | Frequency (%)
0 | 22 | 9.5%
1 | 139 | 60.2%
2 | 49 | 21.2%
3 | 11 | 4.8%
4 | 8 | 3.5%
5 | 2 | 0.9%

Value | Count | Frequency (%)
5 | 2 | 0.9%
4 | 8 | 3.5%
3 | 11 | 4.8%
2 | 49 | 21.2%
1 | 139 | 60.2%
0 | 22 | 9.5%
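
The summary statistics for 통번역지원사 배치인원 can be recomputed from its frequency table alone. The sketch below is a minimal check; the use of the sample variance (n - 1 denominator) is an assumption that reproduces the reported figure.

```python
import math

# Frequency table copied from this report: value -> count
freq = {0: 22, 1: 139, 2: 49, 3: 11, 4: 8, 5: 2}

n = sum(freq.values())                        # number of observations
total = sum(v * c for v, c in freq.items())   # reported as Sum
mean = total / n
# Sample variance, weighting each squared deviation by its count
variance = sum(c * (v - mean) ** 2 for v, c in freq.items()) / (n - 1)
stdev = math.sqrt(variance)
zeros_pct = 100 * freq[0] / n                 # share of centers with no staff

print(n, total, round(mean, 7), round(variance, 8), round(stdev, 8))
print(round(zeros_pct, 1))
```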

베트남어
Categorical

Distinct: 3
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 0.4%

Sample

1st row: X
2nd row: O
3rd row: O
4th row: O
5th row: O

Common Values

Value | Count | Frequency (%)
O | 174 | 75.3%
X | 56 | 24.2%
2 | 1 | 0.4%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: Common values plot

중국어
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: O
2nd row: X
3rd row: X
4th row: X
5th row: X

Common Values

Value | Count | Frequency (%)
X | 159 | 68.8%
O | 70 | 30.3%
2 | 2 | 0.9%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: Common values plot

일본어
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: X
2nd row: X
3rd row: X
4th row: X
5th row: X

Common Values

Value | Count | Frequency (%)
X | 222 | 96.1%
O | 9 | 3.9%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: Common values plot

태국어
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: X
2nd row: X
3rd row: X
4th row: X
5th row: X

Common Values

Value | Count | Frequency (%)
X | 229 | 99.1%
O | 2 | 0.9%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: Common values plot

러시아어
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: X
2nd row: X
3rd row: X
4th row: X
5th row: X

Common Values

Value | Count | Frequency (%)
X | 222 | 96.1%
O | 9 | 3.9%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: Common values plot

네팔어
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 0.4%

Sample

1st row: X
2nd row: X
3rd row: X
4th row: X
5th row: X

Common Values

Value | Count | Frequency (%)
X | 230 | 99.6%
O | 1 | 0.4%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: Common values plot

몽골어
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: X
2nd row: X
3rd row: X
4th row: X
5th row: X

Common Values

Value | Count | Frequency (%)
X | 220 | 95.2%
O | 11 | 4.8%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: Common values plot

캄보디아어
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: X
2nd row: X
3rd row: X
4th row: X
5th row: X

Common Values

Value | Count | Frequency (%)
X | 221 | 95.7%
O | 10 | 4.3%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: Common values plot

필리핀어
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: X
2nd row: X
3rd row: X
4th row: X
5th row: X

Common Values

Value | Count | Frequency (%)
X | 211 | 91.3%
O | 20 | 8.7%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: Common values plot

Interactions

Figure: Interaction plots (four panels)

Correlations

Figure: Correlation heatmap

Variable | 연번 | 시도명 | 통번역지원사 배치인원 | 베트남어 | 중국어 | 일본어 | 태국어 | 러시아어 | 네팔어 | 몽골어 | 캄보디아어 | 필리핀어
연번 | 1.000 | 0.971 | 0.000 | 0.000 | 0.226 | 0.085 | 0.000 | 0.101 | 0.025 | 0.235 | 0.109 | 0.124
시도명 | 0.971 | 1.000 | 0.655 | 0.000 | 0.375 | 0.445 | 0.000 | 0.422 | 0.000 | 0.163 | 0.000 | 0.495
통번역지원사 배치인원 | 0.000 | 0.655 | 1.000 | 0.741 | 0.882 | 0.844 | 0.291 | 0.757 | 0.000 | 0.439 | 0.537 | 0.668
베트남어 | 0.000 | 0.000 | 0.741 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.042 | 0.000 | 0.000 | 0.000
중국어 | 0.226 | 0.375 | 0.882 | 0.000 | 1.000 | 0.145 | 0.000 | 0.124 | 0.000 | 0.000 | 0.000 | 0.024
일본어 | 0.085 | 0.445 | 0.844 | 0.000 | 0.145 | 1.000 | 0.122 | 0.368 | 0.000 | 0.000 | 0.000 | 0.000
태국어 | 0.000 | 0.000 | 0.291 | 0.000 | 0.000 | 0.122 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
러시아어 | 0.101 | 0.422 | 0.757 | 0.000 | 0.124 | 0.368 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.188
네팔어 | 0.025 | 0.000 | 0.000 | 0.042 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000
몽골어 | 0.235 | 0.163 | 0.439 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000
캄보디아어 | 0.109 | 0.000 | 0.537 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.164
필리핀어 | 0.124 | 0.495 | 0.668 | 0.000 | 0.024 | 0.000 | 0.000 | 0.188 | 0.000 | 0.000 | 0.164 | 1.000

Figure: Correlation heatmap

Variable | 네팔어 | 일본어 | 시도명 | 중국어 | 베트남어 | 필리핀어 | 몽골어 | 러시아어 | 캄보디아어 | 태국어
네팔어 | 1.000 | 0.000 | 0.000 | 0.000 | 0.070 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
일본어 | 0.000 | 1.000 | 0.387 | 0.238 | 0.000 | 0.000 | 0.000 | 0.240 | 0.000 | 0.078
시도명 | 0.000 | 0.387 | 1.000 | 0.211 | 0.000 | 0.431 | 0.140 | 0.366 | 0.000 | 0.000
중국어 | 0.000 | 0.238 | 0.211 | 1.000 | 0.000 | 0.039 | 0.000 | 0.204 | 0.000 | 0.000
베트남어 | 0.070 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
필리핀어 | 0.000 | 0.000 | 0.431 | 0.039 | 0.000 | 1.000 | 0.000 | 0.120 | 0.105 | 0.000
몽골어 | 0.000 | 0.000 | 0.140 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000
러시아어 | 0.000 | 0.240 | 0.366 | 0.204 | 0.000 | 0.120 | 0.000 | 1.000 | 0.000 | 0.000
캄보디아어 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.105 | 0.000 | 0.000 | 1.000 | 0.000
태국어 | 0.000 | 0.078 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000

Figure: Correlation heatmap

Variable | 연번 | 통번역지원사 배치인원 | 시도명 | 베트남어 | 중국어 | 일본어 | 태국어 | 러시아어 | 네팔어 | 몽골어 | 캄보디아어 | 필리핀어
연번 | 1.000 | 0.000 | 0.859 | 0.000 | 0.117 | 0.075 | 0.000 | 0.076 | 0.014 | 0.169 | 0.080 | 0.094
통번역지원사 배치인원 | 0.000 | 1.000 | 0.370 | 0.417 | 0.587 | 0.645 | 0.207 | 0.560 | 0.000 | 0.313 | 0.385 | 0.485
시도명 | 0.859 | 0.370 | 1.000 | 0.000 | 0.211 | 0.387 | 0.000 | 0.366 | 0.000 | 0.140 | 0.000 | 0.431
베트남어 | 0.000 | 0.417 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.070 | 0.000 | 0.000 | 0.000
중국어 | 0.117 | 0.587 | 0.211 | 0.000 | 1.000 | 0.238 | 0.000 | 0.204 | 0.000 | 0.000 | 0.000 | 0.039
일본어 | 0.075 | 0.645 | 0.387 | 0.000 | 0.238 | 1.000 | 0.078 | 0.240 | 0.000 | 0.000 | 0.000 | 0.000
태국어 | 0.000 | 0.207 | 0.000 | 0.000 | 0.000 | 0.078 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
러시아어 | 0.076 | 0.560 | 0.366 | 0.000 | 0.204 | 0.240 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.120
네팔어 | 0.014 | 0.000 | 0.000 | 0.070 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000
몽골어 | 0.169 | 0.313 | 0.140 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000
캄보디아어 | 0.080 | 0.385 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.105
필리핀어 | 0.094 | 0.485 | 0.431 | 0.000 | 0.039 | 0.000 | 0.000 | 0.120 | 0.000 | 0.000 | 0.105 | 1.000
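
The alert "통번역지원사 배치인원 is highly overall correlated with 중국어 and 2 other fields" appears to follow from the last correlation matrix above. The sketch below filters that variable's row with a cutoff of 0.5. The cutoff is an assumption: the report does not state it, but it reproduces the three fields named in the alerts.

```python
# Row for 통번역지원사 배치인원, copied from the last correlation matrix above
# (the diagonal entry is omitted)
row = {
    "연번": 0.000, "시도명": 0.370, "베트남어": 0.417, "중국어": 0.587,
    "일본어": 0.645, "태국어": 0.207, "러시아어": 0.560, "네팔어": 0.000,
    "몽골어": 0.313, "캄보디아어": 0.385, "필리핀어": 0.485,
}

THRESHOLD = 0.5  # assumed cutoff; not stated in the report itself

# Keep only the variables whose coefficient exceeds the cutoff
flagged = [name for name, value in row.items() if value > THRESHOLD]
print(flagged)
```

Under the same assumed cutoff, the 연번 and 시도명 pair (0.859) is also flagged, which matches the first and third alerts.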

Missing values

Figure: A simple visualization of nullity by column.
Figure: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

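(index) | 연번 | 시도명 | 센터명 | 통번역지원사 배치인원 | 베트남어 | 중국어 | 일본어 | 태국어 | 러시아어 | 네팔어 | 몽골어 | 캄보디아어 | 필리핀어
0 | 1 | 서울특별시 | 강남구 가족센터 | 1 | X | O | X | X | X | X | X | X | X
1 | 2 | 서울특별시 | 강동구 가족센터 | 1 | O | X | X | X | X | X | X | X | X
2 | 3 | 서울특별시 | 강북구 가족센터 | 1 | O | X | X | X | X | X | X | X | X
3 | 4 | 서울특별시 | 강서구 가족센터 | 1 | O | X | X | X | X | X | X | X | X
4 | 5 | 서울특별시 | 관악구 가족센터 | 1 | O | X | X | X | X | X | X | X | X
5 | 6 | 서울특별시 | 광진구 가족센터 | 1 | X | O | X | X | X | X | X | X | X
6 | 7 | 서울특별시 | 구로구 가족센터 | 3 | O | O | X | X | X | X | X | O | X
7 | 8 | 서울특별시 | 금천구 가족센터 | 1 | O | X | X | X | X | X | X | X | X
8 | 9 | 서울특별시 | 노원구 가족센터 | 2 | O | O | X | X | X | X | X | X | X
9 | 10 | 서울특별시 | 도봉구 가족센터 | 2 | O | O | X | X | X | X | X | X | X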
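
(index) | 연번 | 시도명 | 센터명 | 통번역지원사 배치인원 | 베트남어 | 중국어 | 일본어 | 태국어 | 러시아어 | 네팔어 | 몽골어 | 캄보디아어 | 필리핀어
221 | 222 | 경상남도 | 창녕군 가족센터 | 1 | O | X | X | X | X | X | X | X | X
222 | 223 | 경상남도 | 창원시 가족센터 | 1 | O | X | X | X | X | X | X | X | X
223 | 224 | 경상남도 | 창원시마산 가족센터 | 1 | O | X | X | X | X | X | X | X | X
224 | 225 | 경상남도 | 통영시 가족센터 | 1 | O | X | X | X | X | X | X | X | X
225 | 226 | 경상남도 | 하동군 가족센터 | 1 | O | X | X | X | X | X | X | X | X
226 | 227 | 경상남도 | 함안군 가족센터 | 1 | X | X | X | X | X | X | X | X | O
227 | 228 | 경상남도 | 함양군 가족센터 | 1 | O | X | X | X | X | X | X | X | X
228 | 229 | 경상남도 | 합천군 가족센터 | 1 | O | X | X | X | X | X | X | X | X
229 | 230 | 제주특별자치도 | 서귀포시 가족센터 | 3 | O | O | X | X | X | X | X | X | O
230 | 231 | 제주특별자치도 | 제주시 가족센터 | 3 | O | O | X | X | X | X | X | X | O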
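
In the sample rows above, 통번역지원사 배치인원 equals the number of languages flagged "O" for each center. The sketch below checks this on a few of those rows. Reading "O" as one staff member, "X" as none, and a digit such as "2" as a head count is an assumption; the report does not document the encoding, and the helper name `staff_from_flags` is hypothetical.

```python
# (배치인원, flags) for rows copied from the sample tables above.
# Flag order: 베트남어, 중국어, 일본어, 태국어, 러시아어, 네팔어, 몽골어, 캄보디아어, 필리핀어
rows = {
    "강남구 가족센터": (1, "XOXXXXXXX"),
    "구로구 가족센터": (3, "OOXXXXXOX"),
    "노원구 가족센터": (2, "OOXXXXXXX"),
    "함안군 가족센터": (1, "XXXXXXXXO"),
    "제주시 가족센터": (3, "OOXXXXXXO"),
}

def staff_from_flags(flags):
    """Count the staff implied by the language flags (assumed encoding)."""
    total = 0
    for flag in flags:
        if flag == "O":
            total += 1          # one staff member for this language
        elif flag.isdigit():
            total += int(flag)  # assumption: a digit is a head count
        # "X" means no staff for this language
    return total

# Centers whose flag total disagrees with the reported head count
mismatches = [name for name, (count, flags) in rows.items()
              if staff_from_flags(flags) != count]
print(mismatches)
```

Running the same check over the full dataset would show whether the rule holds for the 22 centers with zero staff and for the rows that contain the value "2".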