Overview

Dataset statistics

Number of variables: 15
Number of observations: 3643
Missing cells: 14922
Missing cells (%): 27.3%
Duplicate rows: 289
Duplicate rows (%): 7.9%
Total size in memory: 455.5 KiB
Average record size in memory: 128.0 B
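
The percentages in this block follow directly from the raw counts. As a quick sanity check, here is a minimal Python sketch that re-derives them; the variable names are illustrative and not part of the report.

```python
# Re-derive the overview percentages from the raw counts reported above.
n_variables = 15
n_observations = 3643
missing_cells = 14922
duplicate_rows = 289
total_size_kib = 455.5

total_cells = n_variables * n_observations            # every cell in the table
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```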

Variable types

Numeric: 6
Categorical: 3
Boolean: 3
Text: 3

Dataset

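Description: Data from Gyeongnam Provincial Geochang College (경남도립거창대학), Gyeongsangnam-do, released under the mid- to long-term data opening plan. It includes fields such as school name (학교명), school type (학교구분), unemployable-status category (취업불가능자구분), employed-person category (취업자구분), occupation classification (직업분류), and company name (회사명).
URL: https://www.data.go.kr/data/15066708/fileData.do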

Alerts

Dataset has 289 (7.9%) duplicate rows [Duplicates]
학교구분 is highly overall correlated with 취업구분 and 6 other fields [High correlation]
전공일치여부 is highly overall correlated with 취업구분 and 2 other fields [High correlation]
학교명 is highly overall correlated with 취업구분 and 5 other fields [High correlation]
건강보험적용여부 is highly overall correlated with 취업구분 and 2 other fields [High correlation]
취업불가능자구분 is highly overall correlated with 취업구분 and 6 other fields [High correlation]
건강보험직장가입제외여부 is highly overall correlated with 취업구분 and 2 other fields [High correlation]
취업구분 is highly overall correlated with 취업자구분 and 7 other fields [High correlation]
학교코드 is highly overall correlated with 학교명 and 2 other fields [High correlation]
취업자구분 is highly overall correlated with 취업구분 and 2 other fields [High correlation]
직업분류 is highly overall correlated with 학교구분 and 1 other field [High correlation]
회사코드 is highly overall correlated with 취업구분 and 2 other fields [High correlation]
학교명 is highly imbalanced (87.6%) [Imbalance]
학교구분 is highly imbalanced (74.3%) [Imbalance]
취업불가능자구분 is highly imbalanced (99.3%) [Imbalance]
건강보험직장가입제외여부 is highly imbalanced (76.9%) [Imbalance]
학교코드 has 3385 (92.9%) missing values [Missing]
전공일치여부 has 1328 (36.5%) missing values [Missing]
직업분류 has 1328 (36.5%) missing values [Missing]
직업명 has 1328 (36.5%) missing values [Missing]
회사명 has 1328 (36.5%) missing values [Missing]
건강보험적용여부 has 1328 (36.5%) missing values [Missing]
건강보험직장가입제외여부 has 1328 (36.5%) missing values [Missing]
기타내용 has 3569 (98.0%) missing values [Missing]
취업자구분 has 1328 (36.5%) zeros [Zeros]
회사코드 has 1328 (36.5%) zeros [Zeros]
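
The report does not say how the imbalance percentages are computed. Three of them match one minus the normalized Shannon entropy of the class counts, so the sketch below assumes that definition. This is inferred from the numbers, not quoted from the tool, and `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (bits) of the class counts."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts copied from the variable sections of this report
# (missing values are excluded for the boolean column).
reported = {
    "학교구분": [3384, 229, 30],
    "취업불가능자구분": [3641, 2],
    "건강보험직장가입제외여부": [2228, 87],
}
for name, counts in reported.items():
    print(f"{name}: {100 * imbalance_score(counts):.1f}%")
```

A score near 100% means almost every row falls into a single class, which is the case for 취업불가능자구분 (3641 of 3643 rows are 0).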

Reproduction

Analysis started: 2023-12-12 08:38:54.032645
Analysis finished: 2023-12-12 08:39:00.539852
Duration: 6.51 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
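
A report of this kind can be regenerated along the following lines. This is a sketch under stated assumptions, not the exact invocation used here: `build_report` is a hypothetical helper, the file path and title are placeholders, and the encoding of the published CSV is not given in the report.

```python
def build_report(csv_path, output_path="report.html", encoding="utf-8"):
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so this module still loads where the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; Korean public datasets are often
    # published as cp949 or euc-kr, so pass that in if UTF-8 decoding fails.
    df = pd.read_csv(csv_path, encoding=encoding)
    profile = ProfileReport(df, title="Dataset profile")
    profile.to_file(output_path)
    return output_path
```

For example, `build_report("employment.csv", encoding="cp949")` would write `report.html` next to the script, where `employment.csv` stands in for the file downloaded from the URL above.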

Variables

회차
Real number (ℝ)

Distinct: 9
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20141.233
Minimum: 20111
Maximum: 20155
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 32.1 KiB

Quantile statistics

Minimum: 20111
5-th percentile: 20111
Q1: 20131
Median: 20151
Q3: 20153
95-th percentile: 20155
Maximum: 20155
Range: 44
Interquartile range (IQR): 22

Descriptive statistics

Standard deviation: 15.483882
Coefficient of variation (CV): 0.00076876533
Kurtosis: -0.84924232
Mean: 20141.233
Median Absolute Deviation (MAD): 4
Skewness: -0.84063676
Sum: 73374513
Variance: 239.7506
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Value | Count | Frequency (%)
20121 | 447 | 12.3%
20151 | 419 | 11.5%
20152 | 419 | 11.5%
20153 | 419 | 11.5%
20154 | 419 | 11.5%
20155 | 419 | 11.5%
20111 | 392 | 10.8%
20131 | 359 | 9.9%
20141 | 350 | 9.6%

Value | Count | Frequency (%)
20111 | 392 | 10.8%
20121 | 447 | 12.3%
20131 | 359 | 9.9%
20141 | 350 | 9.6%
20151 | 419 | 11.5%
20152 | 419 | 11.5%
20153 | 419 | 11.5%
20154 | 419 | 11.5%
20155 | 419 | 11.5%

Value | Count | Frequency (%)
20155 | 419 | 11.5%
20154 | 419 | 11.5%
20153 | 419 | 11.5%
20152 | 419 | 11.5%
20151 | 419 | 11.5%
20141 | 350 | 9.6%
20131 | 359 | 9.9%
20121 | 447 | 12.3%
20111 | 392 | 10.8%
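
The sum and mean reported for 회차 can be recovered from its frequency table, as the sketch below shows. The values also look like a four-digit year followed by a one-digit round number (for example 20155 as year 2015, round 5). That reading is an assumption and is not documented in the report, and `split_round_code` is a hypothetical helper. If it holds, statistics such as the mean or skewness of the raw code have little meaning.

```python
# Value counts for 회차, copied from the frequency table above.
counts = {20111: 392, 20121: 447, 20131: 359, 20141: 350,
          20151: 419, 20152: 419, 20153: 419, 20154: 419, 20155: 419}

n = sum(counts.values())                                   # number of observations
total = sum(value * count for value, count in counts.items())
mean = total / n

def split_round_code(code):
    """Split a code such as 20155 into (2015, 5); the encoding is assumed."""
    return divmod(code, 10)

print(n, total, round(mean, 3))
print(split_round_code(20155))
```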

취업구분
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.1976393
Minimum: 0
Maximum: 9
Zeros: 9
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 32.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
Median: 1
Q3: 5
95-th percentile: 5
Maximum: 9
Range: 9
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 1.769044
Coefficient of variation (CV): 0.80497468
Kurtosis: -0.9062273
Mean: 2.1976393
Median Absolute Deviation (MAD): 0
Skewness: 0.95275957
Sum: 8006
Variance: 3.1295167
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
Value | Count | Frequency (%)
1 | 2315 | 63.5%
5 | 934 | 25.6%
2 | 259 | 7.1%
3 | 85 | 2.3%
6 | 37 | 1.0%
0 | 9 | 0.2%
9 | 2 | 0.1%
4 | 2 | 0.1%

Value | Count | Frequency (%)
0 | 9 | 0.2%
1 | 2315 | 63.5%
2 | 259 | 7.1%
3 | 85 | 2.3%
4 | 2 | 0.1%
5 | 934 | 25.6%
6 | 37 | 1.0%
9 | 2 | 0.1%

Value | Count | Frequency (%)
9 | 2 | 0.1%
6 | 37 | 1.0%
5 | 934 | 25.6%
4 | 2 | 0.1%
3 | 85 | 2.3%
2 | 259 | 7.1%
1 | 2315 | 63.5%
0 | 9 | 0.2%

학교코드
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 38
Distinct (%): 14.7%
Missing: 3385
Missing (%): 92.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 15278869
Minimum: 11110401
Maximum: 24821102
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 32.1 KiB

Quantile statistics

Minimum: 11110401
5-th percentile: 12110101
Q1: 14467651
Median: 14810101
Q3: 14810501
95-th percentile: 24810102
Maximum: 24821102
Range: 13710701
Interquartile range (IQR): 342850

Descriptive statistics

Standard deviation: 3456880.7
Coefficient of variation (CV): 0.22625239
Kurtosis: 2.905412
Mean: 15278869
Median Absolute Deviation (MAD): 10000
Skewness: 1.9221424
Sum: 3.9419483 × 10^9
Variance: 1.1950024 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=38)
Value | Count | Frequency (%)
14810101 | 58 | 1.6%
14810404 | 48 | 1.3%
14810501 | 22 | 0.6%
24810102 | 15 | 0.4%
14820101 | 12 | 0.3%
12620101 | 11 | 0.3%
14820301 | 8 | 0.2%
12110101 | 8 | 0.2%
12120601 | 7 | 0.2%
11110505 | 5 | 0.1%
Other values (28) | 64 | 1.8%
(Missing) | 3385 | 92.9%

Value | Count | Frequency (%)
11110401 | 1 | < 0.1%
11110505 | 5 | 0.1%
11121901 | 2 | 0.1%
11123801 | 1 | < 0.1%
12110101 | 8 | 0.2%
12110301 | 5 | 0.1%
12110401 | 5 | 0.1%
12120401 | 1 | < 0.1%
12120501 | 4 | 0.1%
12120601 | 7 | 0.2%

Value | Count | Frequency (%)
24821102 | 1 | < 0.1%
24820802 | 5 | 0.1%
24810102 | 15 | 0.4%
24721602 | 1 | < 0.1%
24720802 | 1 | < 0.1%
22620302 | 1 | < 0.1%
22220602 | 5 | 0.1%
22120108 | 1 | < 0.1%
14820401 | 1 | < 0.1%
14820301 | 8 | 0.2%

학교명
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 40
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 28.6 KiB

<NA>: 3384
경상대학교: 58
진주산업대학교: 48
창원대학교: 22
거창전문대학: 15
Other values (35): 116

Length

Max length: 22
Median length: 4
Mean length: 4.132034
Min length: 4

Unique

Unique: 17
Unique (%): 0.5%

Sample

1st row: <NA>
2nd row: 진주산업대학교
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 3384 | 92.9%
경상대학교 | 58 | 1.6%
진주산업대학교 | 48 | 1.3%
창원대학교 | 22 | 0.6%
거창전문대학 | 15 | 0.4%
경남대학교 | 12 | 0.3%
울산대학교 | 11 | 0.3%
인제대학교 | 8 | 0.2%
부경대학교 | 8 | 0.2%
동의대학교 | 7 | 0.2%
Other values (30) | 70 | 1.9%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
na | 3384 | 92.8%
경상대학교 | 58 | 1.6%
진주산업대학교 | 48 | 1.3%
창원대학교 | 22 | 0.6%
거창전문대학 | 15 | 0.4%
경남대학교 | 12 | 0.3%
울산대학교 | 11 | 0.3%
인제대학교 | 8 | 0.2%
부경대학교 | 8 | 0.2%
동의대학교 | 7 | 0.2%
Other values (34) | 74 | 2.0%

학교구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 28.6 KiB

0: 3384
2: 229
1: 30

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 2
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 3384 | 92.9%
2 | 229 | 6.3%
1 | 30 | 0.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 3384 | 92.9%
2 | 229 | 6.3%
1 | 30 | 0.8%

취업불가능자구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 28.6 KiB

0: 3641
3: 2

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 3641 | 99.9%
3 | 2 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 3641 | 99.9%
3 | 2 | 0.1%

취업자구분
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 6
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.82130113
Minimum: 0
Maximum: 5
Zeros: 1328
Zeros (%): 36.5%
Negative: 0
Negative (%): 0.0%
Memory size: 32.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 1
95-th percentile: 3
Maximum: 5
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.89213216
Coefficient of variation (CV): 1.0862425
Kurtosis: 5.949137
Mean: 0.82130113
Median Absolute Deviation (MAD): 0
Skewness: 2.0102939
Sum: 2992
Variance: 0.79589979
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value | Count | Frequency (%)
1 | 1975 | 54.2%
0 | 1328 | 36.5%
2 | 132 | 3.6%
3 | 118 | 3.2%
4 | 51 | 1.4%
5 | 39 | 1.1%

Value | Count | Frequency (%)
0 | 1328 | 36.5%
1 | 1975 | 54.2%
2 | 132 | 3.6%
3 | 118 | 3.2%
4 | 51 | 1.4%
5 | 39 | 1.1%

Value | Count | Frequency (%)
5 | 39 | 1.1%
4 | 51 | 1.4%
3 | 118 | 3.2%
2 | 132 | 3.6%
1 | 1975 | 54.2%
0 | 1328 | 36.5%

전공일치여부
Boolean

HIGH CORRELATION  MISSING 

Distinct: 2
Distinct (%): 0.1%
Missing: 1328
Missing (%): 36.5%
Memory size: 7.2 KiB

Value | Count | Frequency (%)
True | 1800 | 49.4%
False | 515 | 14.1%
(Missing) | 1328 | 36.5%

직업분류
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 244
Distinct (%): 10.5%
Missing: 1328
Missing (%): 36.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 43266.641
Minimum: 12010
Maximum: 99999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 32.1 KiB

Quantile statistics

Minimum: 12010
5-th percentile: 13402
Q1: 23537
Median: 27250
Q3: 76212
95-th percentile: 86405.5
Maximum: 99999
Range: 87989
Interquartile range (IQR): 52675

Descriptive statistics

Standard deviation: 27882.956
Coefficient of variation (CV): 0.64444466
Kurtosis: -1.2210194
Mean: 43266.641
Median Absolute Deviation (MAD): 12139
Skewness: 0.71093981
Sum: 1.0016227 × 10^8
Variance: 7.7745921 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
24302 | 307 | 8.4%
85432 | 241 | 6.6%
24720 | 164 | 4.5%
42220 | 83 | 2.3%
13310 | 69 | 1.9%
76212 | 61 | 1.7%
31320 | 59 | 1.6%
23113 | 57 | 1.6%
23119 | 56 | 1.5%
42231 | 50 | 1.4%
Other values (234) | 1168 | 32.1%
(Missing) | 1328 | 36.5%

Value | Count | Frequency (%)
12010 | 1 | < 0.1%
12022 | 3 | 0.1%
12023 | 10 | 0.3%
12024 | 4 | 0.1%
12090 | 1 | < 0.1%
13201 | 7 | 0.2%
13202 | 11 | 0.3%
13310 | 69 | 1.9%
13320 | 7 | 0.2%
13402 | 6 | 0.2%

Value | Count | Frequency (%)
99999 | 6 | 0.2%
99991 | 1 | < 0.1%
99101 | 6 | 0.2%
95391 | 3 | 0.1%
95310 | 4 | 0.1%
95210 | 2 | 0.1%
94219 | 1 | < 0.1%
93009 | 20 | 0.5%
93003 | 10 | 0.3%
93001 | 4 | 0.1%

직업명
Text

MISSING 

Distinct: 388
Distinct (%): 16.8%
Missing: 1328
Missing (%): 36.5%
Memory size: 28.6 KiB

Length

Max length: 15
Median length: 14
Mean length: 4.1390929
Min length: 1

Characters and Unicode

Total characters: 9582
Distinct characters: 260
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
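
The mean length reported for 직업명 equals the total character count divided by the number of non-missing rows, which the sketch below checks (`mean_length` is a hypothetical helper). The reported median length of 14 is harder to reconcile: with a minimum of 1, a median of 14 would force the mean above 7, so the median is presumably computed on a different basis than the mean.

```python
import math

n_observations = 3643

def mean_length(total_characters, n_missing):
    """Average string length over the non-missing rows."""
    return total_characters / (n_observations - n_missing)

# 직업명: 9582 characters in total, 1328 missing rows.
job_title_mean = mean_length(9582, 1328)
print(round(job_title_mean, 7))
```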

Unique

Unique: 237
Unique (%): 10.2%

Sample

1st row: 경리
2nd row: 은행사무원
3rd row: 경리사무원
4th row: 회계사무원
5th row: 경리사무원

Value | Count | Frequency (%)
간호사 | 342 | 13.2%
사원 | 170 | 6.6%
보육교사 | 164 | 6.3%
설계 | 98 | 3.8%
조선 | 85 | 3.3%
선박 | 82 | 3.2%
미용사 | 76 | 2.9%
건조 | 75 | 2.9%
관리사무원 | 69 | 2.7%
피부관리사 | 50 | 1.9%
Other values (395) | 1377 | 53.2%

Most occurring characters

ValueCountFrequency (%)
1260
 
13.1%
699
 
7.3%
349
 
3.6%
349
 
3.6%
334
 
3.5%
303
 
3.2%
299
 
3.1%
291
 
3.0%
279
 
2.9%
274
 
2.9%
Other values (250) 5145
53.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 9068
94.6%
Space Separator 273
 
2.8%
Lowercase Letter 103
 
1.1%
Close Punctuation 56
 
0.6%
Open Punctuation 56
 
0.6%
Uppercase Letter 19
 
0.2%
Other Punctuation 4
 
< 0.1%
Decimal Number 2
 
< 0.1%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1260
 
13.9%
699
 
7.7%
349
 
3.8%
349
 
3.8%
334
 
3.7%
303
 
3.3%
299
 
3.3%
291
 
3.2%
279
 
3.1%
274
 
3.0%
Other values (227) 4631
51.1%
Uppercase Letter
ValueCountFrequency (%)
D 5
26.3%
B 3
15.8%
K 3
15.8%
O 2
 
10.5%
P 1
 
5.3%
T 1
 
5.3%
I 1
 
5.3%
L 1
 
5.3%
C 1
 
5.3%
F 1
 
5.3%
Lowercase Letter
ValueCountFrequency (%)
r 27
26.2%
o 25
24.3%
e 13
12.6%
a 13
12.6%
t 13
12.6%
p 12
11.7%
Other Punctuation
ValueCountFrequency (%)
, 3
75.0%
. 1
 
25.0%
Space Separator
ValueCountFrequency (%)
273
100.0%
Close Punctuation
ValueCountFrequency (%)
) 56
100.0%
Open Punctuation
ValueCountFrequency (%)
( 56
100.0%
Decimal Number
ValueCountFrequency (%)
1 2
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 9068
94.6%
Common 392
 
4.1%
Latin 122
 
1.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1260
 
13.9%
699
 
7.7%
349
 
3.8%
349
 
3.8%
334
 
3.7%
303
 
3.3%
299
 
3.3%
291
 
3.2%
279
 
3.1%
274
 
3.0%
Other values (227) 4631
51.1%
Latin
ValueCountFrequency (%)
r 27
22.1%
o 25
20.5%
e 13
10.7%
a 13
10.7%
t 13
10.7%
p 12
9.8%
D 5
 
4.1%
B 3
 
2.5%
K 3
 
2.5%
O 2
 
1.6%
Other values (6) 6
 
4.9%
Common
ValueCountFrequency (%)
273
69.6%
) 56
 
14.3%
( 56
 
14.3%
, 3
 
0.8%
1 2
 
0.5%
- 1
 
0.3%
. 1
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 9068
94.6%
ASCII 514
 
5.4%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1260
 
13.9%
699
 
7.7%
349
 
3.8%
349
 
3.8%
334
 
3.7%
303
 
3.3%
299
 
3.3%
291
 
3.2%
279
 
3.1%
274
 
3.0%
Other values (227) 4631
51.1%
ASCII
ValueCountFrequency (%)
273
53.1%
) 56
 
10.9%
( 56
 
10.9%
r 27
 
5.3%
o 25
 
4.9%
e 13
 
2.5%
a 13
 
2.5%
t 13
 
2.5%
p 12
 
2.3%
D 5
 
1.0%
Other values (13) 21
 
4.1%

회사코드
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 862
Distinct (%): 23.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2331.5611
Minimum: 0
Maximum: 65387
Zeros: 1328
Zeros (%): 36.5%
Negative: 0
Negative (%): 0.0%
Memory size: 32.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 228
Q3: 641
95-th percentile: 11956
Maximum: 65387
Range: 65387
Interquartile range (IQR): 641

Descriptive statistics

Standard deviation: 7100.9469
Coefficient of variation (CV): 3.0455762
Kurtosis: 32.304593
Mean: 2331.5611
Median Absolute Deviation (MAD): 228
Skewness: 5.2532382
Sum: 8493877
Variance: 50423448
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1328 | 36.5%
120 | 75 | 2.1%
8154 | 69 | 1.9%
6222 | 65 | 1.8%
5060 | 27 | 0.7%
588 | 25 | 0.7%
235 | 19 | 0.5%
505 | 19 | 0.5%
31 | 18 | 0.5%
44 | 16 | 0.4%
Other values (852) | 1982 | 54.4%

Value | Count | Frequency (%)
0 | 1328 | 36.5%
1 | 3 | 0.1%
3 | 2 | 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 4 | 0.1%
7 | 2 | 0.1%
8 | 1 | < 0.1%
10 | 2 | 0.1%
11 | 9 | 0.2%

Value | Count | Frequency (%)
65387 | 1 | < 0.1%
65243 | 1 | < 0.1%
63745 | 1 | < 0.1%
63671 | 1 | < 0.1%
62507 | 5 | 0.1%
59818 | 1 | < 0.1%
59692 | 1 | < 0.1%
59645 | 1 | < 0.1%
58783 | 2 | 0.1%
57950 | 2 | 0.1%
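
회사코드 is 0 in exactly 1328 rows, the same number of rows in which 회사명 and the other employment details are missing. That suggests 0 is a placeholder for "no company" rather than a real code, although the report does not confirm this. Under that assumption, a cleaning step might map the placeholder to a missing value before computing statistics. `zeros_to_missing` below is a hypothetical helper and the two rows are illustrative.

```python
def zeros_to_missing(rows, columns):
    """Return copies of the rows with 0 replaced by None in the given columns."""
    return [
        {key: (None if key in columns and value == 0 else value)
         for key, value in row.items()}
        for row in rows
    ]

# Two illustrative rows shaped like the records in this dataset.
rows = [
    {"취업구분": 1, "회사코드": 7446, "회사명": "(주)신쿨"},
    {"취업구분": 5, "회사코드": 0, "회사명": None},
]
cleaned = zeros_to_missing(rows, columns={"회사코드"})
print(cleaned[1]["회사코드"])
```

The original rows are left untouched, so the raw codes stay available for comparison.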

회사명
Text

MISSING 

Distinct: 866
Distinct (%): 37.4%
Missing: 1328
Missing (%): 36.5%
Memory size: 28.6 KiB

Length

Max length: 20
Median length: 15
Mean length: 7.249676
Min length: 2

Characters and Unicode

Total characters: 16783
Distinct characters: 498
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 533
Unique (%): 23.0%

Sample

1st row: (주)신쿨
2nd row: 거창신용협동조합
3rd row: (주)금아스틸
4th row: 오영수회계사무소
5th row: 형제석재산업

Value | Count | Frequency (%)
대우조선해양주식회사 | 75 | 2.9%
에스피피해양조선(주 | 69 | 2.7%
성동조선해양(주 | 65 | 2.5%
약손명가 | 34 | 1.3%
부민병원 | 27 | 1.1%
서부엔지니어링 | 25 | 1.0%
거창서경병원 | 19 | 0.7%
성균관대학교 | 19 | 0.7%
삼성창원병원 | 19 | 0.7%
경상대학교병원 | 18 | 0.7%
Other values (923) | 2194 | 85.6%

Most occurring characters

ValueCountFrequency (%)
787
 
4.7%
) 680
 
4.1%
( 680
 
4.1%
510
 
3.0%
399
 
2.4%
391
 
2.3%
360
 
2.1%
343
 
2.0%
288
 
1.7%
287
 
1.7%
Other values (488) 12058
71.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 14682
87.5%
Close Punctuation 680
 
4.1%
Open Punctuation 680
 
4.1%
Uppercase Letter 426
 
2.5%
Space Separator 249
 
1.5%
Lowercase Letter 43
 
0.3%
Decimal Number 15
 
0.1%
Other Punctuation 5
 
< 0.1%
Other Symbol 1
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
787
 
5.4%
510
 
3.5%
399
 
2.7%
391
 
2.7%
360
 
2.5%
343
 
2.3%
288
 
2.0%
287
 
2.0%
280
 
1.9%
266
 
1.8%
Other values (441) 10771
73.4%
Uppercase Letter
ValueCountFrequency (%)
K 45
10.6%
G 41
 
9.6%
S 41
 
9.6%
B 35
 
8.2%
L 34
 
8.0%
N 28
 
6.6%
A 26
 
6.1%
T 23
 
5.4%
E 21
 
4.9%
C 20
 
4.7%
Other values (12) 112
26.3%
Lowercase Letter
ValueCountFrequency (%)
s 8
18.6%
p 6
14.0%
a 4
9.3%
i 4
9.3%
c 4
9.3%
t 3
 
7.0%
y 3
 
7.0%
m 2
 
4.7%
l 2
 
4.7%
e 2
 
4.7%
Other values (4) 5
11.6%
Decimal Number
ValueCountFrequency (%)
2 8
53.3%
1 6
40.0%
4 1
 
6.7%
Other Punctuation
ValueCountFrequency (%)
& 4
80.0%
. 1
 
20.0%
Close Punctuation
ValueCountFrequency (%)
) 680
100.0%
Open Punctuation
ValueCountFrequency (%)
( 680
100.0%
Space Separator
ValueCountFrequency (%)
249
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 14683
87.5%
Common 1631
 
9.7%
Latin 469
 
2.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
787
 
5.4%
510
 
3.5%
399
 
2.7%
391
 
2.7%
360
 
2.5%
343
 
2.3%
288
 
2.0%
287
 
2.0%
280
 
1.9%
266
 
1.8%
Other values (442) 10772
73.4%
Latin
ValueCountFrequency (%)
K 45
 
9.6%
G 41
 
8.7%
S 41
 
8.7%
B 35
 
7.5%
L 34
 
7.2%
N 28
 
6.0%
A 26
 
5.5%
T 23
 
4.9%
E 21
 
4.5%
C 20
 
4.3%
Other values (26) 155
33.0%
Common
ValueCountFrequency (%)
) 680
41.7%
( 680
41.7%
249
 
15.3%
2 8
 
0.5%
1 6
 
0.4%
& 4
 
0.2%
+ 1
 
0.1%
- 1
 
0.1%
4 1
 
0.1%
. 1
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 14682
87.5%
ASCII 2100
 
12.5%
None 1
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
787
 
5.4%
510
 
3.5%
399
 
2.7%
391
 
2.7%
360
 
2.5%
343
 
2.3%
288
 
2.0%
287
 
2.0%
280
 
1.9%
266
 
1.8%
Other values (441) 10771
73.4%
ASCII
ValueCountFrequency (%)
) 680
32.4%
( 680
32.4%
249
 
11.9%
K 45
 
2.1%
G 41
 
2.0%
S 41
 
2.0%
B 35
 
1.7%
L 34
 
1.6%
N 28
 
1.3%
A 26
 
1.2%
Other values (36) 241
 
11.5%
None
ValueCountFrequency (%)
1
100.0%

건강보험적용여부
Boolean

HIGH CORRELATION  MISSING 

Distinct: 2
Distinct (%): 0.1%
Missing: 1328
Missing (%): 36.5%
Memory size: 7.2 KiB

Value | Count | Frequency (%)
True | 2057 | 56.5%
False | 258 | 7.1%
(Missing) | 1328 | 36.5%

건강보험직장가입제외여부
Boolean

HIGH CORRELATION  IMBALANCE  MISSING 

Distinct: 2
Distinct (%): 0.1%
Missing: 1328
Missing (%): 36.5%
Memory size: 7.2 KiB

Value | Count | Frequency (%)
False | 2228 | 61.2%
True | 87 | 2.4%
(Missing) | 1328 | 36.5%

기타내용
Text

MISSING 

Distinct: 57
Distinct (%): 77.0%
Missing: 3569
Missing (%): 98.0%
Memory size: 28.6 KiB

Length

Max length: 96
Median length: 35
Mean length: 23.743243
Min length: 3

Characters and Unicode

Total characters: 1757
Distinct characters: 225
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 51
Unique (%): 68.9%

Sample

1st row: 취업일 : 2010.11.29 건강보험자격득실확인서 있음
2nd row: 취업일 : 2010.09.13 의료보험증 사본 있음
3rd row: 엘지 -> 취업일 : 2010.11.02 의료보험증 사본 있음 삼성SDI -> 취업일 : 2011.5.17 충남 천안근무
4th row: 취업일자 : 2010.11.29
5th row: 취업일 : 2010.11.29
ValueCountFrequency (%)
16
 
5.2%
회사명 10
 
3.2%
취업일 8
 
2.6%
회사검색불가로 5
 
1.6%
홈플러스-서면점 5
 
1.6%
현장 5
 
1.6%
소재 5
 
1.6%
청주시 5
 
1.6%
lg화학-충북 5
 
1.6%
입력불가 5
 
1.6%
Other values (150) 239
77.6%

Most occurring characters

ValueCountFrequency (%)
234
 
13.3%
45
 
2.6%
. 45
 
2.6%
1 43
 
2.4%
0 36
 
2.0%
34
 
1.9%
33
 
1.9%
28
 
1.6%
) 28
 
1.6%
( 28
 
1.6%
Other values (215) 1203
68.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 1160
66.0%
Space Separator 234
 
13.3%
Decimal Number 164
 
9.3%
Other Punctuation 84
 
4.8%
Close Punctuation 28
 
1.6%
Open Punctuation 28
 
1.6%
Uppercase Letter 28
 
1.6%
Dash Punctuation 27
 
1.5%
Math Symbol 2
 
0.1%
Lowercase Letter 2
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
45
 
3.9%
34
 
2.9%
33
 
2.8%
28
 
2.4%
24
 
2.1%
23
 
2.0%
19
 
1.6%
19
 
1.6%
19
 
1.6%
19
 
1.6%
Other values (186) 897
77.3%
Decimal Number
ValueCountFrequency (%)
1 43
26.2%
0 36
22.0%
2 25
15.2%
3 13
 
7.9%
9 13
 
7.9%
6 10
 
6.1%
5 8
 
4.9%
7 8
 
4.9%
4 5
 
3.0%
8 3
 
1.8%
Uppercase Letter
ValueCountFrequency (%)
G 11
39.3%
L 10
35.7%
A 3
 
10.7%
S 1
 
3.6%
I 1
 
3.6%
D 1
 
3.6%
B 1
 
3.6%
Other Punctuation
ValueCountFrequency (%)
. 45
53.6%
: 24
28.6%
, 7
 
8.3%
* 5
 
6.0%
/ 3
 
3.6%
Lowercase Letter
ValueCountFrequency (%)
k 1
50.0%
j 1
50.0%
Space Separator
ValueCountFrequency (%)
234
100.0%
Close Punctuation
ValueCountFrequency (%)
) 28
100.0%
Open Punctuation
ValueCountFrequency (%)
( 28
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 27
100.0%
Math Symbol
ValueCountFrequency (%)
> 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1160
66.0%
Common 567
32.3%
Latin 30
 
1.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
45
 
3.9%
34
 
2.9%
33
 
2.8%
28
 
2.4%
24
 
2.1%
23
 
2.0%
19
 
1.6%
19
 
1.6%
19
 
1.6%
19
 
1.6%
Other values (186) 897
77.3%
Common
ValueCountFrequency (%)
234
41.3%
. 45
 
7.9%
1 43
 
7.6%
0 36
 
6.3%
) 28
 
4.9%
( 28
 
4.9%
- 27
 
4.8%
2 25
 
4.4%
: 24
 
4.2%
3 13
 
2.3%
Other values (10) 64
 
11.3%
Latin
ValueCountFrequency (%)
G 11
36.7%
L 10
33.3%
A 3
 
10.0%
k 1
 
3.3%
j 1
 
3.3%
S 1
 
3.3%
I 1
 
3.3%
D 1
 
3.3%
B 1
 
3.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1160
66.0%
ASCII 597
34.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
234
39.2%
. 45
 
7.5%
1 43
 
7.2%
0 36
 
6.0%
) 28
 
4.7%
( 28
 
4.7%
- 27
 
4.5%
2 25
 
4.2%
: 24
 
4.0%
3 13
 
2.2%
Other values (19) 94
15.7%
Hangul
ValueCountFrequency (%)
45
 
3.9%
34
 
2.9%
33
 
2.8%
28
 
2.4%
24
 
2.1%
23
 
2.0%
19
 
1.6%
19
 
1.6%
19
 
1.6%
19
 
1.6%
Other values (186) 897
77.3%

Interactions

[36 interaction plots; images not reproduced in this text export.]

Correlations

Variable | 회차 | 취업구분 | 학교코드 | 학교명 | 학교구분 | 취업불가능자구분 | 취업자구분 | 전공일치여부 | 직업분류 | 회사코드 | 건강보험적용여부 | 건강보험직장가입제외여부 | 기타내용
회차 | 1.000 | 0.088 | 0.000 | 0.674 | 0.029 | 0.048 | 0.202 | 0.093 | 0.210 | 0.169 | 0.417 | 0.346 | 1.000
취업구분 | 0.088 | 1.000 | NaN | NaN | 0.788 | 1.000 | 0.662 | NaN | NaN | 0.168 | NaN | NaN | NaN
학교코드 | 0.000 | NaN | 1.000 | 1.000 | 1.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
학교명 | 0.674 | NaN | 1.000 | 1.000 | 1.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
학교구분 | 0.029 | 0.788 | 1.000 | 1.000 | 1.000 | 0.000 | 0.529 | NaN | NaN | 0.070 | NaN | NaN | NaN
취업불가능자구분 | 0.048 | 1.000 | NaN | NaN | 0.000 | 1.000 | 0.000 | NaN | NaN | 0.000 | NaN | NaN | NaN
취업자구분 | 0.202 | 0.662 | NaN | NaN | 0.529 | 0.000 | 1.000 | 0.234 | 0.450 | 0.258 | 0.213 | 0.153 | 1.000
전공일치여부 | 0.093 | NaN | NaN | NaN | NaN | NaN | 0.234 | 1.000 | 0.541 | 0.224 | 0.131 | 0.000 | 1.000
직업분류 | 0.210 | NaN | NaN | NaN | NaN | NaN | 0.450 | 0.541 | 1.000 | 0.323 | 0.331 | 0.193 | 0.996
회사코드 | 0.169 | 0.168 | NaN | NaN | 0.070 | 0.000 | 0.258 | 0.224 | 0.323 | 1.000 | 0.112 | 0.056 | 1.000
건강보험적용여부 | 0.417 | NaN | NaN | NaN | NaN | NaN | 0.213 | 0.131 | 0.331 | 0.112 | 1.000 | 0.695 | 1.000
건강보험직장가입제외여부 | 0.346 | NaN | NaN | NaN | NaN | NaN | 0.153 | 0.000 | 0.193 | 0.056 | 0.695 | 1.000 | 1.000
기타내용 | 1.000 | NaN | NaN | NaN | NaN | NaN | 1.000 | 1.000 | 0.996 | 1.000 | 1.000 | 1.000 | 1.000

Variable | 학교구분 | 전공일치여부 | 학교명 | 건강보험적용여부 | 취업불가능자구분 | 건강보험직장가입제외여부
학교구분 | 1.000 | 1.000 | 0.925 | 1.000 | 0.000 | 1.000
전공일치여부 | 1.000 | 1.000 | NaN | 0.084 | 1.000 | 0.000
학교명 | 0.925 | NaN | 1.000 | NaN | 1.000 | NaN
건강보험적용여부 | 1.000 | 0.084 | NaN | 1.000 | 1.000 | 0.489
취업불가능자구분 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
건강보험직장가입제외여부 | 1.000 | 0.000 | NaN | 0.489 | 1.000 | 1.000

Variable | 회차 | 취업구분 | 학교코드 | 취업자구분 | 직업분류 | 회사코드 | 학교명 | 학교구분 | 취업불가능자구분 | 전공일치여부 | 건강보험적용여부 | 건강보험직장가입제외여부
회차 | 1.000 | 0.056 | 0.075 | -0.081 | 0.055 | 0.060 | 0.392 | 0.029 | 0.030 | 0.069 | 0.264 | 0.212
취업구분 | 0.056 | 1.000 | NaN | -0.904 | NaN | -0.825 | 1.000 | 0.706 | 0.999 | 1.000 | 1.000 | 1.000
학교코드 | 0.075 | NaN | 1.000 | NaN | NaN | NaN | 0.933 | 0.994 | 1.000 | 0.000 | 0.000 | 0.000
취업자구분 | -0.081 | -0.904 | NaN | 1.000 | 0.041 | 0.783 | 1.000 | 0.256 | 0.000 | 0.285 | 0.260 | 0.187
직업분류 | 0.055 | NaN | NaN | 0.041 | 1.000 | 0.174 | 0.000 | 1.000 | 1.000 | 0.417 | 0.253 | 0.148
회사코드 | 0.060 | -0.825 | NaN | 0.783 | 0.174 | 1.000 | 1.000 | 0.041 | 0.000 | 0.171 | 0.086 | 0.043
학교명 | 0.392 | 1.000 | 0.933 | 1.000 | 0.000 | 1.000 | 1.000 | 0.925 | 1.000 | 0.000 | 0.000 | 0.000
학교구분 | 0.029 | 0.706 | 0.994 | 0.256 | 1.000 | 0.041 | 0.925 | 1.000 | 0.000 | 1.000 | 1.000 | 1.000
취업불가능자구분 | 0.030 | 0.999 | 1.000 | 0.000 | 1.000 | 0.000 | 1.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000
전공일치여부 | 0.069 | 1.000 | 0.000 | 0.285 | 0.417 | 0.171 | 0.000 | 1.000 | 1.000 | 1.000 | 0.084 | 0.000
건강보험적용여부 | 0.264 | 1.000 | 0.000 | 0.260 | 0.253 | 0.086 | 0.000 | 1.000 | 1.000 | 0.084 | 1.000 | 0.489
건강보험직장가입제외여부 | 0.212 | 1.000 | 0.000 | 0.187 | 0.148 | 0.043 | 0.000 | 1.000 | 1.000 | 0.000 | 0.489 | 1.000
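
This text export does not say which coefficient each matrix holds. For pairs of categorical or boolean columns, Cramér's V is one common measure of association, and the sketch below shows the uncorrected form on toy data. Whether the tool used this measure here, or applied a bias correction, is an assumption, so the output need not match the matrices above. `cramers_v` is a hypothetical helper.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_totals = Counter(xs)
    y_totals = Counter(ys)
    k = min(len(x_totals), len(y_totals))
    if n == 0 or k < 2:
        return float("nan")  # undefined when a column has a single level
    chi2 = 0.0
    for x, x_count in x_totals.items():
        for y, y_count in y_totals.items():
            expected = x_count * y_count / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (k - 1)))

# Toy data: matched labels are perfectly associated, crossed labels are not.
print(cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"]))
print(cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"]))
```

Values of 1.000 between columns that are almost constant, such as 취업불가능자구분 with only two non-zero rows, should be read with caution, since a handful of rows can drive the coefficient.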

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
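
Nullity correlation, as described above, can be sketched as the Pearson correlation between the missing-value indicators of two columns. The helper name and the toy columns are illustrative, and `None` stands in for a missing cell.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # a column that is never (or always) missing
    return cov / math.sqrt(var_a * var_b)

# Toy columns: the first two are missing in the same rows, the third in the opposite rows.
job = ["경리", None, "미용사", None]
company = ["(주)신쿨", None, "약손명가", None]
school = [None, "경상대학교", None, "창원대학교"]
print(nullity_correlation(job, company))
print(nullity_correlation(job, school))
```

In this dataset the seven employment-detail columns each have 1328 missing values, which is the pattern such a heatmap is designed to surface.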

Sample

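Index | 회차 | 취업구분 | 학교코드 | 학교명 | 학교구분 | 취업불가능자구분 | 취업자구분 | 전공일치여부 | 직업분류 | 직업명 | 회사코드 | 회사명 | 건강보험적용여부 | 건강보험직장가입제외여부 | 기타내용
0 | 20111 | 1 | <NA> | <NA> | 0 | 0 | 1 | N | 31320 | 경리 | 7446 | (주)신쿨 | Y | N | <NA>
1 | 20111 | 2 | 14810404 | 진주산업대학교 | 2 | 0 | 0 | <NA> | <NA> | <NA> | 0 | <NA> | <NA> | <NA> | <NA>
2 | 20111 | 1 | <NA> | <NA> | 0 | 0 | 1 | N | 32031 | 은행사무원 | 65 | 거창신용협동조합 | Y | N | <NA>
3 | 20111 | 1 | <NA> | <NA> | 0 | 0 | 1 | Y | 31320 | 경리사무원 | 1243 | (주)금아스틸 | Y | N | <NA>
4 | 20111 | 1 | <NA> | <NA> | 0 | 0 | 1 | Y | 31310 | 회계사무원 | 67 | 오영수회계사무소 | Y | N | <NA>
5 | 20111 | 1 | <NA> | <NA> | 0 | 0 | 1 | Y | 31320 | 경리사무원 | 168 | 형제석재산업 | Y | N | <NA>
6 | 20111 | 1 | <NA> | <NA> | 0 | 0 | 1 | Y | 78021 | 통신장비기사 | 66 | 영재정보통신(주) | Y | N | <NA>
7 | 20111 | 1 | <NA> | <NA> | 0 | 0 | 1 | Y | 31320 | 경리사무원 | 169 | 진용전력(주) | Y | N | <NA>
8 | 20111 | 1 | <NA> | <NA> | 0 | 0 | 1 | N | 24650 | 간호조무사 | 68 | 성모치과의원 | Y | N | <NA>
9 | 20111 | 1 | <NA> | <NA> | 0 | 0 | 1 | N | 41120 | 소방관 | 69 | 거창소방서 | Y | N | <NA>

Index | 회차 | 취업구분 | 학교코드 | 학교명 | 학교구분 | 취업불가능자구분 | 취업자구분 | 전공일치여부 | 직업분류 | 직업명 | 회사코드 | 회사명 | 건강보험적용여부 | 건강보험직장가입제외여부 | 기타내용
3633 | 20155 | 1 | <NA> | <NA> | 0 | 0 | 1 | Y | 42220 | 미용사 | 634 | 리챠드프로헤어 | Y | N | <NA>
3634 | 20155 | 5 | <NA> | <NA> | 0 | 0 | 0 | <NA> | <NA> | <NA> | 0 | <NA> | <NA> | <NA> | <NA>
3635 | 20155 | 1 | <NA> | <NA> | 0 | 0 | 1 | Y | 42220 | 미용사 | 639 | 스왕헤어클리닉 | Y | N | <NA>
3636 | 20155 | 5 | <NA> | <NA> | 0 | 0 | 0 | <NA> | <NA> | <NA> | 0 | <NA> | <NA> | <NA> | <NA>
3637 | 20155 | 1 | <NA> | <NA> | 0 | 0 | 3 | Y | 31310 | 사원 | 3420 | 동원회계법인 | Y | N | <NA>
3638 | 20155 | 1 | <NA> | <NA> | 0 | 0 | 1 | N | 89909 | 생산사원 | 725 | 고려화공고성공장 | Y | N | <NA>
3639 | 20155 | 5 | <NA> | <NA> | 0 | 0 | 0 | <NA> | <NA> | <NA> | 0 | <NA> | <NA> | <NA> | <NA>
3640 | 20155 | 1 | <NA> | <NA> | 0 | 0 | 1 | Y | 31310 | 회계사무원 | 593 | 한울회계법인 | Y | N | <NA>
3641 | 20155 | 5 | <NA> | <NA> | 0 | 0 | 0 | <NA> | <NA> | <NA> | 0 | <NA> | <NA> | <NA> | <NA>
3642 | 20155 | 1 | <NA> | <NA> | 0 | 0 | 1 | Y | 31320 | 경리사무원 | 12424 | (주)하이원 | Y | N | <NA>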

Duplicate rows

Most frequently occurring

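Index | 회차 | 취업구분 | 학교코드 | 학교명 | 학교구분 | 취업불가능자구분 | 취업자구분 | 전공일치여부 | 직업분류 | 직업명 | 회사코드 | 회사명 | 건강보험적용여부 | 건강보험직장가입제외여부 | 기타내용 | # duplicates
156 | 20151 | 5 | <NA> | <NA> | 0 | 0 | 0 | <NA> | <NA> | <NA> | 0 | <NA> | <NA> | <NA> | <NA> | 143
188 | 20152 | 5 | <NA> | <NA> | 0 | 0 | 0 | <NA> | <NA> | <NA> | 0 | <NA> | <NA> | <NA> | <NA> | 133
221 | 20153 | 5 | <NA> | <NA> | 0 | 0 | 0 | <NA> | <NA> | <NA> | 0 | <NA> | <NA> | <NA> | <NA> | 119
254 | 20154 | 5 | <NA> | <NA> | 0 | 0 | 0 | <NA> | <NA> | <NA> | 0 | <NA> | <NA> | <NA> | <NA> | 97
58 | 20121 | 5 | <NA> | <NA> | 0 | 0 | 0 | <NA> | <NA> | <NA> | 0 | <NA> | <NA> | <NA> | <NA> | 94
287 | 20155 | 5 | <NA> | <NA> | 0 | 0 | 0 | <NA> | <NA> | <NA> | 0 | <NA> | <NA> | <NA> | <NA> | 93
93 | 20131 | 5 | <NA> | <NA> | 0 | 0 | 0 | <NA> | <NA> | <NA> | 0 | <NA> | <NA> | <NA> | <NA> | 87
124 | 20141 | 5 | <NA> | <NA> | 0 | 0 | 0 | <NA> | <NA> | <NA> | 0 | <NA> | <NA> | <NA> | <NA> | 85
31 | 20111 | 5 | <NA> | <NA> | 0 | 0 | 0 | <NA> | <NA> | <NA> | 0 | <NA> | <NA> | <NA> | <NA> | 83
54 | 20121 | 2 | 14810101 | 경상대학교 | 2 | 0 | 0 | <NA> | <NA> | <NA> | 0 | <NA> | <NA> | <NA> | <NA> | 15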
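
The ten groups listed above already cover 949 rows, and their index values run as high as 287, so the headline figure of 289 duplicate rows appears to count distinct duplicated row patterns rather than surplus rows. The nine groups with 취업구분 = 5 add up to 934, the full count of that category, and the only field that distinguishes them is 회차. The sketch below separates the two ways of counting on toy data; `duplicate_groups` is a hypothetical helper, and the rows use three columns instead of the real fifteen.

```python
from collections import Counter

def duplicate_groups(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(tuple(row) for row in rows)
    return {row: count for row, count in counts.items() if count > 1}

# Toy rows of (회차, 취업구분, 회사코드).
rows = [
    (20151, 5, 0), (20151, 5, 0), (20151, 5, 0),
    (20152, 5, 0), (20152, 5, 0),
    (20151, 1, 634),
]
groups = duplicate_groups(rows)
n_patterns = len(groups)               # distinct duplicated patterns
n_rows_in_groups = sum(groups.values())  # rows belonging to some duplicated pattern
print(n_patterns, n_rows_in_groups)
```

Because these records carry no personal identifier, identical rows may well describe different people, so dropping duplicates could discard valid observations.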