Overview

Dataset statistics

Number of variables: 4
Number of observations: 601
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 18.9 KiB
Average record size in memory: 32.2 B

Variable types

Text: 3
Categorical: 1

Dataset

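Description: Public classification-code data from Gyeongnam Provincial Geochang College (경남도립거창대학). It contains the classification type (구분종류), classification code (구분코드), classification name (구분명), and remarks (비고).
Author: 공공데이터포털 (Public Data Portal)
URL: https://www.data.go.kr/data/15097838/fileData.do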

Alerts

Imbalance: 비고 is highly imbalanced (87.1%)

Reproduction

Analysis started: 2024-04-21 23:58:17.204653
Analysis finished: 2024-04-21 23:58:17.546371
Duration: 0.34 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

구분종류
Text

Distinct: 65
Distinct (%): 10.8%
Missing: 0
Missing (%): 0.0%
Memory size: 4.8 KiB

Length

Max length: 5
Median length: 4
Mean length: 3.9650582
Min length: 2
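The mean length can be cross-checked from figures printed elsewhere in this report, since it should equal total characters divided by the number of observations. The sketch below does that arithmetic for the three Text variables. The column names attached to each total are an assumption inferred from the column order of the sample table, and `mean_length` is a hypothetical helper, not part of ydata-profiling.

```python
# Totals as printed in this report: (total characters, mean length as shown).
# Column names are inferred from the sample table's column order (assumption).
reported = {
    "구분종류": (2383, "3.9650582"),
    "구분코드": (846, "1.4076539"),
    "구분명": (2574, "4.2828619"),
}
n_observations = 601  # "Number of observations" in the overview


def mean_length(total_characters, n_rows):
    """Mean string length = total characters / number of rows."""
    return total_characters / n_rows


for name, (total, shown) in reported.items():
    computed = f"{mean_length(total, n_observations):.7f}"
    print(name, computed, "matches" if computed == shown else "differs")
```

All three reported means agree with this division to seven decimal places.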

Characters and Unicode

Total characters: 2383
Distinct characters: 94
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
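As a rough illustration of how per-character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own code. The first two sample strings come from this report; `"a_b"` is a made-up value added only to show the Lowercase Letter and Connector Punctuation categories. The standard library exposes general categories but not script or block properties, so only categories are shown.

```python
import unicodedata
from collections import Counter

# Two-letter general category codes and the names used in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Ll": "Lowercase Letter",
    "Pc": "Connector Punctuation",
}


def category_counts(values):
    """Count Unicode general categories over every character in values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# "a_b" is a hypothetical value, not taken from the dataset.
sample = ["장학구분", "교수구분", "a_b"]
counts = category_counts(sample)
for code, count in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```

Hangul syllables fall under Other Letter (`Lo`), which is why that category dominates the tables for the Korean-valued columns.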

Unique

Unique: 2
Unique (%): 0.3%
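The report lists both "Distinct" and "Unique" counts, which are easy to confuse. A minimal sketch of the usual reading, assuming "distinct" means the number of different values and "unique" means the number of values that occur exactly once, is shown below on the five sample rows listed for this variable. `distinct_and_unique` is a hypothetical helper.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# The five sample rows shown for this variable in the report.
sample_rows = ["장학구분", "장학구분", "교수구분", "이수구분", "이수구분"]
distinct, unique = distinct_and_unique(sample_rows)
print(distinct, unique)
```

On these five rows there are three different values, of which only 교수구분 appears once.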

Sample

1st row: 장학구분
2nd row: 장학구분
3rd row: 교수구분
4th row: 이수구분
5th row: 이수구분
Value  Count  Frequency (%)
장학구분  99  16.5%
소속부서  24  4.0%
계급  22  3.7%
취업직업  21  3.5%
변동구분  21  3.5%
교수구분  17  2.8%
시도구분  17  2.8%
지망구분  16  2.7%
취업업종  16  2.7%
업종구분  16  2.7%
Other values (55)  332  55.2%

Most occurring characters

ValueCountFrequency (%)
325
 
13.6%
323
 
13.6%
154
 
6.5%
111
 
4.7%
107
 
4.5%
64
 
2.7%
59
 
2.5%
58
 
2.4%
55
 
2.3%
41
 
1.7%
Other values (84) 1086
45.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 2378
99.8%
Lowercase Letter 4
 
0.2%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
325
 
13.7%
323
 
13.6%
154
 
6.5%
111
 
4.7%
107
 
4.5%
64
 
2.7%
59
 
2.5%
58
 
2.4%
55
 
2.3%
41
 
1.7%
Other values (79) 1081
45.5%
Lowercase Letter
ValueCountFrequency (%)
g 1
25.0%
b 1
25.0%
j 1
25.0%
r 1
25.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 2378
99.8%
Latin 4
 
0.2%
Common 1
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
325
 
13.7%
323
 
13.6%
154
 
6.5%
111
 
4.7%
107
 
4.5%
64
 
2.7%
59
 
2.5%
58
 
2.4%
55
 
2.3%
41
 
1.7%
Other values (79) 1081
45.5%
Latin
ValueCountFrequency (%)
g 1
25.0%
b 1
25.0%
j 1
25.0%
r 1
25.0%
Common
ValueCountFrequency (%)
_ 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 2378
99.8%
ASCII 5
 
0.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
325
 
13.7%
323
 
13.6%
154
 
6.5%
111
 
4.7%
107
 
4.5%
64
 
2.7%
59
 
2.5%
58
 
2.4%
55
 
2.3%
41
 
1.7%
Other values (79) 1081
45.5%
ASCII
ValueCountFrequency (%)
g 1
20.0%
b 1
20.0%
_ 1
20.0%
j 1
20.0%
r 1
20.0%

구분코드
Text

Distinct: 105
Distinct (%): 17.5%
Missing: 0
Missing (%): 0.0%
Memory size: 4.8 KiB

Length

Max length: 2
Median length: 1
Mean length: 1.4076539
Min length: 1

Characters and Unicode

Total characters: 846
Distinct characters: 16
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 50
Unique (%): 8.3%

Sample

1st row: 82
2nd row: 29
3rd row: 13
4th row: 1
5th row: 2
Value  Count  Frequency (%)
1  60  10.0%
2  58  9.7%
3  53  8.8%
4  45  7.5%
5  36  6.0%
6  32  5.3%
7  26  4.3%
8  25  4.2%
12  16  2.7%
11  16  2.7%
Other values (95)  234  38.9%

Most occurring characters

ValueCountFrequency (%)
1 201
23.8%
2 123
14.5%
3 103
12.2%
4 82
9.7%
5 78
 
9.2%
6 62
 
7.3%
7 60
 
7.1%
8 53
 
6.3%
9 45
 
5.3%
0 33
 
3.9%
Other values (6) 6
 
0.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 840
99.3%
Uppercase Letter 4
 
0.5%
Lowercase Letter 2
 
0.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 201
23.9%
2 123
14.6%
3 103
12.3%
4 82
9.8%
5 78
 
9.3%
6 62
 
7.4%
7 60
 
7.1%
8 53
 
6.3%
9 45
 
5.4%
0 33
 
3.9%
Uppercase Letter
ValueCountFrequency (%)
B 1
25.0%
D 1
25.0%
C 1
25.0%
A 1
25.0%
Lowercase Letter
ValueCountFrequency (%)
g 1
50.0%
b 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Common 840
99.3%
Latin 6
 
0.7%

Most frequent character per script

Common
ValueCountFrequency (%)
1 201
23.9%
2 123
14.6%
3 103
12.3%
4 82
9.8%
5 78
 
9.3%
6 62
 
7.4%
7 60
 
7.1%
8 53
 
6.3%
9 45
 
5.4%
0 33
 
3.9%
Latin
ValueCountFrequency (%)
B 1
16.7%
g 1
16.7%
b 1
16.7%
D 1
16.7%
C 1
16.7%
A 1
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 846
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 201
23.8%
2 123
14.5%
3 103
12.2%
4 82
9.7%
5 78
 
9.2%
6 62
 
7.3%
7 60
 
7.1%
8 53
 
6.3%
9 45
 
5.3%
0 33
 
3.9%
Other values (6) 6
 
0.7%

구분명
Text

Distinct: 538
Distinct (%): 89.5%
Missing: 0
Missing (%): 0.0%
Memory size: 4.8 KiB

Length

Max length: 11
Median length: 9
Mean length: 4.2828619
Min length: 1

Characters and Unicode

Total characters: 2574
Distinct characters: 288
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 485
Unique (%): 80.7%

Sample

1st row: 국가다자녀장학감면
2nd row: 포항지진장학
3rd row: 명예교수
4th row: 교양
5th row: 교선
Value  Count  Frequency (%)
기타  6  1.0%
가능  4  0.6%
기타(직접입력  4  0.6%
정상  3  0.5%
미등록제적  3  0.5%
미복학제적  3  0.5%
군인  3  0.5%
세무/회계직  2  0.3%
의료기관  2  0.3%
교육기관  2  0.3%
Other values (540)  591  94.9%

Most occurring characters

ValueCountFrequency (%)
156
 
6.1%
87
 
3.4%
59
 
2.3%
54
 
2.1%
52
 
2.0%
48
 
1.9%
47
 
1.8%
46
 
1.8%
( 46
 
1.8%
) 46
 
1.8%
Other values (278) 1933
75.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 2388
92.8%
Open Punctuation 46
 
1.8%
Close Punctuation 46
 
1.8%
Decimal Number 31
 
1.2%
Space Separator 25
 
1.0%
Other Punctuation 17
 
0.7%
Uppercase Letter 12
 
0.5%
Math Symbol 4
 
0.2%
Lowercase Letter 4
 
0.2%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
156
 
6.5%
87
 
3.6%
59
 
2.5%
54
 
2.3%
52
 
2.2%
48
 
2.0%
47
 
2.0%
46
 
1.9%
39
 
1.6%
36
 
1.5%
Other values (251) 1764
73.9%
Uppercase Letter
ValueCountFrequency (%)
A 4
33.3%
B 2
16.7%
R 1
 
8.3%
K 1
 
8.3%
P 1
 
8.3%
C 1
 
8.3%
T 1
 
8.3%
O 1
 
8.3%
Decimal Number
ValueCountFrequency (%)
2 12
38.7%
1 8
25.8%
3 4
 
12.9%
4 3
 
9.7%
5 2
 
6.5%
6 1
 
3.2%
7 1
 
3.2%
Lowercase Letter
ValueCountFrequency (%)
m 1
25.0%
n 1
25.0%
b 1
25.0%
g 1
25.0%
Other Punctuation
ValueCountFrequency (%)
/ 15
88.2%
& 1
 
5.9%
. 1
 
5.9%
Open Punctuation
ValueCountFrequency (%)
( 46
100.0%
Close Punctuation
ValueCountFrequency (%)
) 46
100.0%
Space Separator
ValueCountFrequency (%)
25
100.0%
Math Symbol
ValueCountFrequency (%)
+ 4
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 2388
92.8%
Common 170
 
6.6%
Latin 16
 
0.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
156
 
6.5%
87
 
3.6%
59
 
2.5%
54
 
2.3%
52
 
2.2%
48
 
2.0%
47
 
2.0%
46
 
1.9%
39
 
1.6%
36
 
1.5%
Other values (251) 1764
73.9%
Common
ValueCountFrequency (%)
( 46
27.1%
) 46
27.1%
25
14.7%
/ 15
 
8.8%
2 12
 
7.1%
1 8
 
4.7%
+ 4
 
2.4%
3 4
 
2.4%
4 3
 
1.8%
5 2
 
1.2%
Other values (5) 5
 
2.9%
Latin
ValueCountFrequency (%)
A 4
25.0%
B 2
12.5%
R 1
 
6.2%
K 1
 
6.2%
m 1
 
6.2%
P 1
 
6.2%
C 1
 
6.2%
T 1
 
6.2%
n 1
 
6.2%
b 1
 
6.2%
Other values (2) 2
12.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 2388
92.8%
ASCII 186
 
7.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
156
 
6.5%
87
 
3.6%
59
 
2.5%
54
 
2.3%
52
 
2.2%
48
 
2.0%
47
 
2.0%
46
 
1.9%
39
 
1.6%
36
 
1.5%
Other values (251) 1764
73.9%
ASCII
ValueCountFrequency (%)
( 46
24.7%
) 46
24.7%
25
13.4%
/ 15
 
8.1%
2 12
 
6.5%
1 8
 
4.3%
A 4
 
2.2%
+ 4
 
2.2%
3 4
 
2.2%
4 3
 
1.6%
Other values (17) 19
10.2%

비고
Categorical

IMBALANCE 

Distinct: 13
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.8 KiB

<NA>: 568
군휴학: 9
REJECT: 5
일반휴학: 4
질병휴학: 4
Other values (8): 11

Length

Max length: 6
Median length: 4
Mean length: 3.9950083
Min length: 2

Unique

Unique: 7
Unique (%): 1.2%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value  Count  Frequency (%)
<NA>  568  94.5%
군휴학  9  1.5%
REJECT  5  0.8%
일반휴학  4  0.7%
질병휴학  4  0.7%
군예정휴학  4  0.7%
체능계열  1  0.2%
하사관  1  0.2%
사회계열  1  0.2%
경북  1  0.2%
Other values (3)  3  0.5%
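
The 87.1% figure in the imbalance alert can be reproduced from the counts above. The report does not state the formula, so the sketch below assumes the score is one minus the Shannon entropy of the value counts divided by the maximum possible entropy, log2 of the number of classes. It also assumes that the three values hidden under "Other values (3)" each occur once, which is consistent with 13 distinct values, 7 of them unique, and 601 rows. `imbalance_score` is a hypothetical helper name.

```python
from math import log2


def imbalance_score(counts):
    """1 - normalized Shannon entropy: 0 = perfectly balanced, 1 = one class only."""
    n_classes = len(counts)
    if n_classes <= 1:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * log2(c / total) for c in counts if c > 0)
    return 1 - entropy / log2(n_classes)


# Counts for 비고 from the table above; the seven singletons include the
# three values hidden under "Other values (3)" (assumed to occur once each).
bigo_counts = [568, 9, 5, 4, 4, 4] + [1] * 7
score = imbalance_score(bigo_counts)
print(f"{score:.1%}")
```

Under these assumptions the score rounds to 87.1%, matching the alert. The imbalance is driven almost entirely by the `<NA>` placeholder, which accounts for 568 of the 601 rows.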

Length

Histogram of lengths of the category
Value  Count  Frequency (%)
na  568  94.5%
군휴학  9  1.5%
reject  5  0.8%
일반휴학  4  0.7%
질병휴학  4  0.7%
군예정휴학  4  0.7%
체능계열  1  0.2%
하사관  1  0.2%
사회계열  1  0.2%
경북  1  0.2%
Other values (3)  3  0.5%

Correlations

          구분종류  비고
구분종류  1.000  0.990
비고  0.990  1.000
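
The report does not say which association measure produced the 0.990 above. For two categorical columns a common choice is Cramér's V, so the sketch below assumes that measure and implements its plain (not bias-corrected) form from a chi-squared statistic. The toy rows are hypothetical and are not meant to reproduce 0.990, which would need the full dataset.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Plain Cramér's V between two equally long sequences of category labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association
    return sqrt(chi2 / (n * k))


# Hypothetical toy rows: the remark is fully determined by the type.
kind = ["휴학구분", "휴학구분", "장학구분", "장학구분"]
remark = ["군휴학", "군휴학", "<NA>", "<NA>"]
print(cramers_v(kind, remark))
```

A value near 1 means one column is almost fully predictable from the other, which fits a lookup table where remarks appear only for a few classification types.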

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index  구분종류  구분코드  구분명  비고
0  장학구분  82  국가다자녀장학감면  <NA>
1  장학구분  29  포항지진장학  <NA>
2  교수구분  13  명예교수  <NA>
3  이수구분  1  교양  <NA>
4  이수구분  2  교선  <NA>
5  이수구분  3  전필  <NA>
6  이수구분  4  전선  <NA>
7  이수구분  5  교직  <NA>
8  입학구분  1  당초입학  <NA>
9  입학구분  2  편입학  <NA>

Last rows

Index  구분종류  구분코드  구분명  비고
591  소속부서  51  건축인테리어(심화)  <NA>
592  소속부서  55  드론토목학부(심화)  <NA>
593  장학구분  28  다문화가족 장학  <NA>
594  장학구분  83  장애학생 장학  <NA>
595  장학구분  84  다자녀생활비지원금  <NA>
596  장학구분  89  국가다자녀장학지급  <NA>
597  장학구분  6  학생자치회층장장학  <NA>
598  장학구분  7  학생자치회학생장장학  <NA>
599  장학구분  8  체육우수장학  <NA>
600  정외세부  62  만학도  <NA>