Overview

Dataset statistics

Number of variables: 4
Number of observations: 104
Missing cells: 100
Missing cells (%): 24.0%
Duplicate rows: 1
Duplicate rows (%): 1.0%
Total size in memory: 3.7 KiB
Average record size in memory: 36.3 B
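
The missing-cell percentage follows from the counts in this table. A minimal sketch of the arithmetic, with illustrative variable names that are not taken from the report:

```python
# Figures copied from the Dataset statistics table.
n_variables = 4
n_observations = 104
missing_cells = 100

total_cells = n_variables * n_observations        # every cell in the table
missing_pct = 100 * missing_cells / total_cells   # share of cells that are missing

print(f"{missing_pct:.1f}%")  # 24.0%
```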

Variable types

Text: 1
Categorical: 3

Dataset

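Description: Results of the 2022 Daejeon Comprehensive Job Fair (대전일자리종합박람회): number of participating companies, number of job openings offered by those companies, and number of job seekers hired.
URL: https://www.data.go.kr/data/15081175/fileData.do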

Alerts

Dataset has 1 (1.0%) duplicate row [Duplicates]
참여기업 is highly overall correlated with 구인인원 and 1 other field [High correlation]
구인인원 is highly overall correlated with 참여기업 and 1 other field [High correlation]
취업인원 is highly overall correlated with 참여기업 and 1 other field [High correlation]
참여기업 is highly imbalanced (86.6%) [Imbalance]
구인인원 is highly imbalanced (86.6%) [Imbalance]
취업인원 is highly imbalanced (86.6%) [Imbalance]
구 분 has 100 (96.2%) missing values [Missing]
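
The report does not say how the 86.6% imbalance figure is derived. One reading that reproduces it is a normalized-entropy score (1 minus the Shannon entropy of the value counts divided by its maximum); the sketch below assumes that definition and uses the value counts listed for the three categorical columns.

```python
from math import log2

# Value counts shared by the three categorical columns:
# one dominant "<NA>" value plus four values that occur once each.
counts = [100, 1, 1, 1, 1]

total = sum(counts)
entropy = -sum((c / total) * log2(c / total) for c in counts)

# Assumed definition: 0 for perfectly balanced classes, 1 for a single class.
imbalance = 1 - entropy / log2(len(counts))

print(f"{imbalance:.1%}")  # 86.6%
```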

Reproduction

Analysis started: 2023-12-12 09:38:50.734673
Analysis finished: 2023-12-12 09:38:51.191423
Duration: 0.46 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

구 분 (category)
Text

MISSING

Distinct: 4
Distinct (%): 100.0%
Missing: 100
Missing (%): 96.2%
Memory size: 964.0 B

Length

Max length: 7
Median length: 6.5
Mean length: 5.75
Min length: 4

Characters and Unicode

Total characters: 23
Distinct characters: 19
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
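
As an illustration of such character properties, the sketch below uses Python's standard `unicodedata` module to count characters and general categories for the four non-missing values of this column. It is not the report's own implementation; the `values` list is copied from the Sample rows.

```python
import unicodedata
from collections import Counter

# The four non-missing values of the column, copied from the Sample rows.
values = ["청년특화기업", "공사, 공단", "IT 정보통신", "일반기업"]

chars = "".join(values)

# General category per character: "Lo" = Other Letter, "Lu" = Uppercase Letter,
# "Zs" = Space Separator, "Po" = Other Punctuation.
categories = Counter(unicodedata.category(ch) for ch in chars)

print(len(chars), len(set(chars)))  # total and distinct characters
print(dict(categories))
```

The totals agree with the figures above: 23 characters, 19 of them distinct, spread over 4 categories.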

Unique

Unique: 4
Unique (%): 100.0%

Sample

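1st row: 청년특화기업 (youth-specialized companies)
2nd row: 공사, 공단 (public corporations and agencies)
3rd row: IT 정보통신 (IT and telecommunications)
4th row: 일반기업 (general companies)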

Value         Count  Frequency (%)
청년특화기업  1      16.7%
공사          1      16.7%
공단          1      16.7%
it            1      16.7%
정보통신      1      16.7%
일반기업      1      16.7%
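
The six entries look like word tokens rather than whole cell values. The sketch below reproduces them under an assumed tokenization (lowercase, then split on non-word characters); the report does not state its actual rule.

```python
import re
from collections import Counter

# The four non-missing values of the column, copied from the Sample rows.
values = ["청년특화기업", "공사, 공단", "IT 정보통신", "일반기업"]

# Assumed rule: lowercase each value and keep runs of word characters.
tokens = [tok for v in values for tok in re.findall(r"\w+", v.lower())]
freq = Counter(tokens)

# Share of each token among all tokens, in percent.
shares = {tok: round(100 * n / len(tokens), 1) for tok, n in freq.items()}
print(shares)
```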

Most occurring characters

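Value             Count  Frequency (%)
(character lost)  2      8.7%
(character lost)  2      8.7%
(character lost)  2      8.7%
(character lost)  2      8.7%
I                 1      4.3%
(character lost)  1      4.3%
(character lost)  1      4.3%
(character lost)  1      4.3%
(character lost)  1      4.3%
(character lost)  1      4.3%
Other values (9)  9      39.1%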

Most occurring categories

Value              Count  Frequency (%)
Other Letter       18     78.3%
Space Separator    2      8.7%
Uppercase Letter   2      8.7%
Other Punctuation  1      4.3%

Most frequent character per category

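Other Letter
Value             Count  Frequency (%)
(character lost)  2      11.1%
(character lost)  2      11.1%
(character lost)  2      11.1%
(character lost)  1      5.6%
(character lost)  1      5.6%
(character lost)  1      5.6%
(character lost)  1      5.6%
(character lost)  1      5.6%
(character lost)  1      5.6%
(character lost)  1      5.6%
Other values (5)  5      27.8%

Uppercase Letter
Value  Count  Frequency (%)
I      1      50.0%
T      1      50.0%

Space Separator
Value    Count  Frequency (%)
(space)  2      100.0%

Other Punctuation
Value  Count  Frequency (%)
,      1      100.0%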

Most occurring scripts

Value   Count  Frequency (%)
Hangul  18     78.3%
Common  3      13.0%
Latin   2      8.7%

Most frequent character per script

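Hangul
Value             Count  Frequency (%)
(character lost)  2      11.1%
(character lost)  2      11.1%
(character lost)  2      11.1%
(character lost)  1      5.6%
(character lost)  1      5.6%
(character lost)  1      5.6%
(character lost)  1      5.6%
(character lost)  1      5.6%
(character lost)  1      5.6%
(character lost)  1      5.6%
Other values (5)  5      27.8%

Common
Value    Count  Frequency (%)
(space)  2      66.7%
,        1      33.3%

Latin
Value  Count  Frequency (%)
I      1      50.0%
T      1      50.0%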

Most occurring blocks

Value   Count  Frequency (%)
Hangul  18     78.3%
ASCII   5      21.7%

Most frequent character per block

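ASCII
Value    Count  Frequency (%)
(space)  2      40.0%
I        1      20.0%
T        1      20.0%
,        1      20.0%

Hangul
Value             Count  Frequency (%)
(character lost)  2      11.1%
(character lost)  2      11.1%
(character lost)  2      11.1%
(character lost)  1      5.6%
(character lost)  1      5.6%
(character lost)  1      5.6%
(character lost)  1      5.6%
(character lost)  1      5.6%
(character lost)  1      5.6%
(character lost)  1      5.6%
Other values (5)  5      27.8%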

참여기업 (participating companies)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 5
Distinct (%): 4.8%
Missing: 0
Missing (%): 0.0%
Memory size: 964.0 B

Length

Max length: 4
Median length: 4
Mean length: 3.9230769
Min length: 2
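
These length statistics only add up if the 100 `<NA>` entries are counted as ordinary four-character strings. A short check of that reading, using value counts copied from the Common Values tables of the three categorical columns (the helper name `mean_length` is illustrative):

```python
# Value counts copied from the Common Values tables; "<NA>" is treated as a string.
columns = {
    "참여기업": {"<NA>": 100, "39": 1, "11": 1, "12": 1, "71": 1},
    "구인인원": {"<NA>": 100, "108": 1, "31": 1, "44": 1, "383": 1},
    "취업인원": {"<NA>": 100, "27": 1, "4": 1, "2": 1, "131": 1},
}

def mean_length(counts):
    """Average string length over all rows, weighting each value by its count."""
    rows = sum(counts.values())
    return sum(len(value) * n for value, n in counts.items()) / rows

mean_lengths = {name: mean_length(counts) for name, counts in columns.items()}
for name, value in mean_lengths.items():
    print(name, f"{value:.7f}")
```

The three results match the mean lengths the report gives for these columns (3.9230769, 3.9423077 and 3.9134615).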

Unique

Unique: 4
Unique (%): 3.8%

Sample

1st row: 39
2nd row: 11
3rd row: 12
4th row: 71
5th row: <NA>

Common Values

Value  Count  Frequency (%)
<NA>   100    96.2%
39     1      1.0%
11     1      1.0%
12     1      1.0%
71     1      1.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of common values]
Value  Count  Frequency (%)
na     100    96.2%
39     1      1.0%
11     1      1.0%
12     1      1.0%
71     1      1.0%

구인인원 (job openings)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 5
Distinct (%): 4.8%
Missing: 0
Missing (%): 0.0%
Memory size: 964.0 B

Length

Max length: 4
Median length: 4
Mean length: 3.9423077
Min length: 2

Unique

Unique: 4
Unique (%): 3.8%

Sample

1st row: 108
2nd row: 31
3rd row: 44
4th row: 383
5th row: <NA>

Common Values

Value  Count  Frequency (%)
<NA>   100    96.2%
108    1      1.0%
31     1      1.0%
44     1      1.0%
383    1      1.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of common values]
Value  Count  Frequency (%)
na     100    96.2%
108    1      1.0%
31     1      1.0%
44     1      1.0%
383    1      1.0%

취업인원 (job seekers hired)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 5
Distinct (%): 4.8%
Missing: 0
Missing (%): 0.0%
Memory size: 964.0 B

Length

Max length: 4
Median length: 4
Mean length: 3.9134615
Min length: 1

Unique

Unique: 4
Unique (%): 3.8%

Sample

1st row: 27
2nd row: 4
3rd row: 2
4th row: 131
5th row: <NA>

Common Values

Value  Count  Frequency (%)
<NA>   100    96.2%
27     1      1.0%
4      1      1.0%
2      1      1.0%
131    1      1.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of common values]
Value  Count  Frequency (%)
na     100    96.2%
27     1      1.0%
4      1      1.0%
2      1      1.0%
131    1      1.0%

Correlations

[Figure: correlation heatmap]
          구 분  참여기업  구인인원  취업인원
구 분     1.000  1.000     1.000     1.000
참여기업  1.000  1.000     1.000     1.000
구인인원  1.000  1.000     1.000     1.000
취업인원  1.000  1.000     1.000     1.000

[Figure: correlation heatmap]
          참여기업  구인인원  취업인원
참여기업  1.000     1.000     1.000
구인인원  1.000     1.000     1.000
취업인원  1.000     1.000     1.000

[Figure: correlation heatmap]
          참여기업  구인인원  취업인원
참여기업  1.000     1.000     1.000
구인인원  1.000     1.000     1.000
취업인원  1.000     1.000     1.000
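
The extracted text does not name the coefficient behind these matrices. For categorical columns, one common association measure is Cramér's V; the sketch below computes its uncorrected form for two of the columns, as an assumption about the kind of measure involved rather than a reproduction of ydata-profiling's calculation. The function name `cramers_v` is illustrative.

```python
from collections import Counter
from math import sqrt

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    for x, rx in row_totals.items():
        for y, cy in col_totals.items():
            expected = rx * cy / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals))
    return sqrt(chi2 / (n * (k - 1)))

# Column contents as shown in the report: four real rows and 100 "<NA>" rows.
companies = ["39", "11", "12", "71"] + ["<NA>"] * 100
openings = ["108", "31", "44", "383"] + ["<NA>"] * 100

v = cramers_v(companies, openings)
print(round(v, 3))
```

Because every value of one column pairs with exactly one value of the other, any such association measure reaches its maximum, which is consistent with a matrix of 1.000 throughout.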

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

     구 분         참여기업  구인인원  취업인원
0    청년특화기업  39        108       27
1    공사, 공단    11        31        4
2    IT 정보통신   12        44        2
3    일반기업      71        383       131
4    <NA>          <NA>      <NA>      <NA>
5    <NA>          <NA>      <NA>      <NA>
6    <NA>          <NA>      <NA>      <NA>
7    <NA>          <NA>      <NA>      <NA>
8    <NA>          <NA>      <NA>      <NA>
9    <NA>          <NA>      <NA>      <NA>

Last rows

     구 분  참여기업  구인인원  취업인원
94   <NA>   <NA>      <NA>      <NA>
95   <NA>   <NA>      <NA>      <NA>
96   <NA>   <NA>      <NA>      <NA>
97   <NA>   <NA>      <NA>      <NA>
98   <NA>   <NA>      <NA>      <NA>
99   <NA>   <NA>      <NA>      <NA>
100  <NA>   <NA>      <NA>      <NA>
101  <NA>   <NA>      <NA>      <NA>
102  <NA>   <NA>      <NA>      <NA>
103  <NA>   <NA>      <NA>      <NA>

Duplicate rows

Most frequently occurring

   구 분  참여기업  구인인원  취업인원  # duplicates
0  <NA>   <NA>      <NA>      <NA>      100
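
Since the 100 duplicated rows are entirely `<NA>`, a cleaning step would usually drop them and convert the three count columns to integers before any analysis. The following is a minimal sketch on a hand-copied subset of the Sample table; how missing values are encoded in the original file is not known, so the `"<NA>"` marker here simply mirrors how the report renders them.

```python
NA = "<NA>"  # assumption: mirrors the report's rendering of missing values

# The four populated rows plus two of the all-missing rows from the Sample table.
rows = [
    ("청년특화기업", "39", "108", "27"),
    ("공사, 공단", "11", "31", "4"),
    ("IT 정보통신", "12", "44", "2"),
    ("일반기업", "71", "383", "131"),
    (NA, NA, NA, NA),
    (NA, NA, NA, NA),
]

# Drop rows in which every cell is missing, then cast the three counts to int.
cleaned = [
    (category, int(companies), int(openings), int(hired))
    for category, companies, openings, hired in rows
    if not all(cell == NA for cell in (category, companies, openings, hired))
]

for row in cleaned:
    print(row)
```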