Overview

Dataset statistics

Number of variables: 3
Number of observations: 84
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.3 KiB
Average record size in memory: 27.6 B

Variable types

Text: 1
Numeric: 2

Dataset

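Description: Information, as of December 31, 2021, on participation in domestic and overseas training by regular (permanent) science and technology R&D personnel at public research institutes, for the women in science and technology workforce.
URL: https://www.data.go.kr/data/15053991/fileData.do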

Alerts

여성 (women) is highly overall correlated with 전체 (total): High correlation
전체 (total) is highly overall correlated with 여성 (women): High correlation
구분 (category) has unique values: Unique

Reproduction

Analysis started: 2023-12-12 05:52:53.207443
Analysis finished: 2023-12-12 05:52:54.260986
Duration: 1.05 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
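
The reported duration follows from the two timestamps above. A minimal sketch of that arithmetic, using only the values printed in this report (the variable names are illustrative):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the Reproduction section.
started = datetime.strptime("2023-12-12 05:52:53.207443", FMT)
finished = datetime.strptime("2023-12-12 05:52:54.260986", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```

Rounded to two decimals, this agrees with the duration shown in the report.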

Variables

구분 (category)
Text

UNIQUE

Distinct: 84
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 804.0 B

Length

Max length: 21
Median length: 14
Mean length: 16.333333
Min length: 14
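
The length figures can be checked without the raw file. The sketch below rebuilds the 84 labels under an assumption: every year from 2008 to 2021 appears with both locations and all three durations, following the pattern of the sample rows in this report. The names YEARS, PLACES and DURATIONS are illustrative and not part of the report.

```python
import statistics
from itertools import product

# Assumed label components (inferred from the sample rows, not from the raw file).
YEARS = range(2021, 2007, -1)                 # 2021 down to 2008
PLACES = ("국내", "해외")                      # domestic, overseas
DURATIONS = ("1개월 미만", "1개월 이상~6개월 미만", "6개월 이상")

labels = [f"{y}-{p} {d}" for y, p, d in product(YEARS, PLACES, DURATIONS)]

# len() counts Unicode code points, so each Hangul syllable counts as one.
lengths = [len(label) for label in labels]

print("max:", max(lengths))
print("median:", statistics.median(lengths))
print("mean:", round(statistics.mean(lengths), 6))
print("min:", min(lengths))
print("total characters:", sum(lengths))
```

Under that assumption the values match the Length table and the total character count reported further down.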

Characters and Unicode

Total characters: 1372
Distinct characters: 23
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 84
Unique (%): 100.0%

Sample

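1st row: 2021-국내 1개월 미만 (2021, domestic, under 1 month)
2nd row: 2021-국내 1개월 이상~6개월 미만 (2021, domestic, 1 month to under 6 months)
3rd row: 2021-국내 6개월 이상 (2021, domestic, 6 months or more)
4th row: 2021-해외 1개월 미만 (2021, overseas, under 1 month)
5th row: 2021-해외 1개월 이상~6개월 미만 (2021, overseas, 1 month to under 6 months)
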
Value                Count    Frequency (%)
미만                  56       20.0%
1개월                 56       20.0%
이상~6개월             28       10.0%
6개월                 28       10.0%
이상                  28       10.0%
2021-국내             3        1.1%
2012-해외             3        1.1%
2011-국내             3        1.1%
2014-해외             3        1.1%
2013-국내             3        1.1%
Other values (23)    69       24.6%
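
The word counts above are consistent with splitting each label on whitespace and tallying the tokens. A minimal sketch, assuming the 84 labels cover 2008 to 2021 with both locations and all three durations (an inference from the sample rows; the constant names are illustrative):

```python
from collections import Counter
from itertools import product

# Assumed label components (inferred from the sample rows, not from the raw file).
YEARS = range(2021, 2007, -1)
PLACES = ("국내", "해외")
DURATIONS = ("1개월 미만", "1개월 이상~6개월 미만", "6개월 이상")

labels = [f"{y}-{p} {d}" for y, p, d in product(YEARS, PLACES, DURATIONS)]

# Tokenise on whitespace and count every token across all labels.
counts = Counter(word for label in labels for word in label.split())
total = sum(counts.values())

for word, count in counts.most_common(5):
    print(word, count, f"{count / total:.1%}")
```

Each year-location token such as 2021-국내 occurs three times, once per duration, which is why those rows all show a count of 3.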

Most occurring characters

Value                        Count    Frequency (%)
(space)                      196      14.3%
1                            128      9.3%
개                            112      8.2%
월                            112      8.2%
0                            108      7.9%
2                            102      7.4%
-                            84       6.1%
6                            62       4.5%
(Hangul character, lost)     56       4.1%
(Hangul character, lost)     56       4.1%
Other values (13)            356      25.9%

Most occurring categories

Value               Count    Frequency (%)
Other Letter        616      44.9%
Decimal Number      448      32.7%
Space Separator     196      14.3%
Dash Punctuation    84       6.1%
Math Symbol         28       2.0%
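
The category counts above correspond to Unicode general categories, which Python exposes through the standard `unicodedata` module. The sketch below is not the profiler's own code. It assumes the 84 labels cover 2008 to 2021 with both locations and all three durations (inferred from the sample rows), and the NAMES mapping is an illustrative lookup for the five category codes involved.

```python
import unicodedata
from collections import Counter
from itertools import product

# Assumed label components (inferred from the sample rows, not from the raw file).
YEARS = range(2021, 2007, -1)
PLACES = ("국내", "해외")
DURATIONS = ("1개월 미만", "1개월 이상~6개월 미만", "6개월 이상")

labels = [f"{y}-{p} {d}" for y, p, d in product(YEARS, PLACES, DURATIONS)]
text = "".join(labels)

# Readable names for the two-letter general category codes seen in this column.
NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
    "Sm": "Math Symbol",
}

categories = Counter(unicodedata.category(ch) for ch in text)

for code, count in categories.most_common():
    print(NAMES.get(code, code), count, f"{count / len(text):.1%}")
```

Hangul syllables fall under Other Letter (Lo), the hyphen under Dash Punctuation (Pd), and the tilde under Math Symbol (Sm).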

Most frequent character per category

Decimal Number
Value    Count    Frequency (%)
1        128      28.6%
0        108      24.1%
2        102      22.8%
6        62       13.8%
9        12       2.7%
8        12       2.7%
3        6        1.3%
5        6        1.3%
7        6        1.3%
4        6        1.3%

Other Letter
Value    Count    Frequency (%)
개        112      18.2%
월        112      18.2%
미        56       9.1%
만        56       9.1%
이        56       9.1%
상        56       9.1%
국        42       6.8%
내        42       6.8%
해        42       6.8%
외        42       6.8%

Space Separator
Value      Count    Frequency (%)
(space)    196      100.0%

Dash Punctuation
Value    Count    Frequency (%)
-        84       100.0%

Math Symbol
Value    Count    Frequency (%)
~        28       100.0%

Most occurring scripts

Value     Count    Frequency (%)
Common    756      55.1%
Hangul    616      44.9%

Most frequent character per script

Common
Value               Count    Frequency (%)
(space)             196      25.9%
1                   128      16.9%
0                   108      14.3%
2                   102      13.5%
-                   84       11.1%
6                   62       8.2%
~                   28       3.7%
9                   12       1.6%
8                   12       1.6%
3                   6        0.8%
Other values (3)    18       2.4%

Hangul
Value    Count    Frequency (%)
개        112      18.2%
월        112      18.2%
미        56       9.1%
만        56       9.1%
이        56       9.1%
상        56       9.1%
국        42       6.8%
내        42       6.8%
해        42       6.8%
외        42       6.8%

Most occurring blocks

Value     Count    Frequency (%)
ASCII     756      55.1%
Hangul    616      44.9%

Most frequent character per block

ASCII
Value               Count    Frequency (%)
(space)             196      25.9%
1                   128      16.9%
0                   108      14.3%
2                   102      13.5%
-                   84       11.1%
6                   62       8.2%
~                   28       3.7%
9                   12       1.6%
8                   12       1.6%
3                   6        0.8%
Other values (3)    18       2.4%

Hangul
Value    Count    Frequency (%)
개        112      18.2%
월        112      18.2%
미        56       9.1%
만        56       9.1%
이        56       9.1%
상        56       9.1%
국        42       6.8%
내        42       6.8%
해        42       6.8%
외        42       6.8%

여성 (women)
Real number (ℝ)

HIGH CORRELATION

Distinct: 70
Distinct (%): 83.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2293.0833
Minimum: 1
Maximum: 19202
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 888.0 B

Quantile statistics

Minimum: 1
5th percentile: 5.15
Q1: 14
Median: 119
Q3: 1077.25
95th percentile: 13187.95
Maximum: 19202
Range: 19201
Interquartile range (IQR): 1063.25

Descriptive statistics

Standard deviation: 4751.3214
Coefficient of variation (CV): 2.072023
Kurtosis: 3.2281044
Mean: 2293.0833
Median Absolute Deviation (MAD): 109
Skewness: 2.1334889
Sum: 192619
Variance: 22575055
Monotonicity: Not monotonic
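
Several of the figures above are derived from one another: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, and the IQR and range are simple differences. A small sketch of those relationships, using only the rounded numbers printed in this report (so agreement is expected only to the displayed precision):

```python
# Figures copied from the 여성 summary in this report.
n = 84                       # number of observations
total = 192619               # Sum
std = 4751.3214              # Standard deviation (as displayed, rounded)
q1, q3 = 14, 1077.25         # Q1 and Q3
minimum, maximum = 1, 19202

mean = total / n                     # arithmetic mean
cv = std / mean                      # coefficient of variation
iqr = q3 - q1                        # interquartile range
value_range = maximum - minimum      # range

print("mean:", round(mean, 4))
print("CV:", round(cv, 6))
print("IQR:", iqr)
print("range:", value_range)
```

A CV above 2 together with a mean far above the median (2293 versus 119) reflects the strong right skew reported for this variable.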
Histogram with fixed size bins (bins=50)

Common values
Value                Count    Frequency (%)
12                   5        6.0%
10                   3        3.6%
18                   2        2.4%
7                    2        2.4%
3                    2        2.4%
14                   2        2.4%
9                    2        2.4%
119                  2        2.4%
24                   2        2.4%
21                   2        2.4%
Other values (60)    60       71.4%

Minimum 10 values
Value    Count    Frequency (%)
1        1        1.2%
3        2        2.4%
4        1        1.2%
5        1        1.2%
6        1        1.2%
7        2        2.4%
9        2        2.4%
10       3        3.6%
11       1        1.2%
12       5        6.0%

Maximum 10 values
Value    Count    Frequency (%)
19202    1        1.2%
16805    1        1.2%
15436    1        1.2%
13455    1        1.2%
13204    1        1.2%
13097    1        1.2%
12944    1        1.2%
12882    1        1.2%
12877    1        1.2%
10102    1        1.2%

전체 (total)
Real number (ℝ)

HIGH CORRELATION

Distinct: 81
Distinct (%): 96.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12058.631
Minimum: 6
Maximum: 88140
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 888.0 B

Quantile statistics

Minimum: 6
5th percentile: 45.95
Q1: 124.5
Median: 646.5
Q3: 5273
95th percentile: 74156.8
Maximum: 88140
Range: 88134
Interquartile range (IQR): 5148.5

Descriptive statistics

Standard deviation: 24540.811
Coefficient of variation (CV): 2.0351241
Kurtosis: 2.7243569
Mean: 12058.631
Median Absolute Deviation (MAD): 576.5
Skewness: 2.058108
Sum: 1012925
Variance: 6.022514 × 10^8
Monotonicity: Not monotonic
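
The variance is the square of the standard deviation, so the two figures above can be cross-checked. The sketch below uses the rounded standard deviation printed in this report, so it can only confirm the variance to the displayed precision (roughly 6.02 times ten to the eighth power):

```python
# Standard deviation copied from the 전체 summary in this report (rounded as displayed).
std = 24540.811

# Variance is the squared standard deviation.
variance = std ** 2

# Scientific notation with six decimals, comparable to the report's display.
print(f"{variance:.6e}")
```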
Histogram with fixed size bins (bins=50)

Common values
Value                Count    Frequency (%)
143                  2        2.4%
58                   2        2.4%
125                  2        2.4%
42142                1        1.2%
128                  1        1.2%
71321                1        1.2%
127                  1        1.2%
1395                 1        1.2%
894                  1        1.2%
10357                1        1.2%
Other values (71)    71       84.5%

Minimum 10 values
Value    Count    Frequency (%)
6        1        1.2%
19       1        1.2%
31       1        1.2%
36       1        1.2%
44       1        1.2%
57       1        1.2%
58       2        2.4%
61       1        1.2%
68       1        1.2%
69       1        1.2%

Maximum 10 values
Value    Count    Frequency (%)
88140    1        1.2%
82534    1        1.2%
81306    1        1.2%
76753    1        1.2%
74452    1        1.2%
72484    1        1.2%
71321    1        1.2%
64611    1        1.2%
62642    1        1.2%
55573    1        1.2%

Interactions

[Four interaction plots; images not reproduced here.]

Correlations

        구분      여성      전체
구분     1.000    1.000    1.000
여성     1.000    1.000    0.908
전체     1.000    0.908    1.000

        여성      전체
여성     1.000    0.978
전체     0.978    1.000
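
The High correlation alerts earlier in this report come from coefficients like the ones above. As a rough illustration of how such a flag can be derived from a correlation matrix, the sketch below lists variable pairs whose absolute coefficient meets a cut-off. The THRESHOLD value and the `high_pairs` helper are hypothetical and chosen only for this example; the report does not state the threshold it applies.

```python
from itertools import combinations

# Coefficients copied from the two-variable matrix in this report.
corr = {
    "여성": {"여성": 1.000, "전체": 0.978},
    "전체": {"여성": 0.978, "전체": 1.000},
}

THRESHOLD = 0.9  # hypothetical cut-off, for illustration only


def high_pairs(matrix, threshold):
    """Return (a, b, r) for each distinct pair with |r| at or above threshold."""
    return [
        (a, b, matrix[a][b])
        for a, b in combinations(matrix, 2)
        if abs(matrix[a][b]) >= threshold
    ]


flagged = high_pairs(corr, THRESHOLD)
print(flagged)
```

Using `combinations` visits each unordered pair once and skips the diagonal, so a variable is never reported as correlated with itself.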

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Index    구분                            여성      전체
0        2021-국내 1개월 미만             9243     42142
1        2021-국내 1개월 이상~6개월 미만    1024     5321
2        2021-국내 6개월 이상             63       386
3        2021-해외 1개월 미만             3        36
4        2021-해외 1개월 이상~6개월 미만    1        6
5        2021-해외 6개월 이상             14       61
6        2020-국내 1개월 미만             13455    41248
7        2020-국내 1개월 이상~6개월 미만    659      5018
8        2020-국내 6개월 이상             32       125
9        2020-해외 1개월 미만             4        19

Last rows

Index    구분                            여성      전체
74       2009-국내 6개월 이상             180      1024
75       2009-해외 1개월 미만             68       645
76       2009-해외 1개월 이상~6개월 미만    12       77
77       2009-해외 6개월 이상             17       126
78       2008-국내 1개월 미만             3874     25831
79       2008-국내 1개월 이상~6개월 미만    132      878
80       2008-국내 6개월 이상             12       116
81       2008-해외 1개월 미만             121      753
82       2008-해외 1개월 이상~6개월 미만    12       125
83       2008-해외 6개월 이상             26       221