Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 752.0 KiB
Average record size in memory: 77.0 B
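
The average record size follows from the total memory and the number of observations, and the missing-cell percentage is taken over all cells (observations times variables). A minimal check in Python, using only the figures reported above; the variable names are illustrative and not part of the report:

```python
# Figures copied from the dataset statistics above.
total_size_kib = 752.0
n_observations = 10_000
n_variables = 8
missing_cells = 0

# 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

# Share of missing cells over all cells in the table.
missing_pct = 100 * missing_cells / (n_observations * n_variables)

print(f"Average record size: {avg_record_bytes:.1f} B")
print(f"Missing cells: {missing_pct:.1f}%")
```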

Variable types

Numeric: 3
Categorical: 5

Dataset

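Description: Data from Gyeongnam Provincial Geochang College (경남도립거창대학), Gyeongsangnam-do, released under the province's mid- to long-term open data plan. It includes fields such as academic year (학년도), department code (학과코드), major code (전공코드), year of study (학년), course code (과목코드), course section (과목반), and course type (과정종류).
Author: Gyeongsangnam-do (경상남도)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15066691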

Alerts

과목반 is highly correlated overall with 전공코드 and 1 other field (High correlation)
과정종류 is highly correlated overall with 학과코드 and 3 other fields (High correlation)
학과코드 is highly correlated overall with 과목코드 and 1 other field (High correlation)
과목코드 is highly correlated overall with 학과코드 and 1 other field (High correlation)
전공코드 is highly correlated overall with 과목반 and 1 other field (High correlation)
전공코드 is highly imbalanced (72.2%) (Imbalance)
과정종류 is highly imbalanced (75.5%) (Imbalance)
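
The report does not state how the imbalance percentage is computed. One formula that reproduces the 75.5% shown for 과정종류 is one minus the Shannon entropy of the class shares, normalized by the maximum entropy for that number of classes. The sketch below assumes that formula and uses the counts listed for 과정종류 under Variables (9242, 757, 1); the function name is hypothetical:

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class shares.

    Assumed formula: 0 means perfectly balanced classes, values near 1 mean
    one class dominates.
    """
    total = sum(counts)
    shares = [c / total for c in counts if c > 0]
    if len(shares) < 2:
        return 0.0
    entropy = -sum(p * math.log2(p) for p in shares)
    return 1 - entropy / math.log2(len(shares))

# Counts of A, B and 1 in 과정종류, copied from this report.
course_type_counts = [9242, 757, 1]
score = imbalance_score(course_type_counts)
print(f"{score:.1%}")
```

The 72.2% figure for 전공코드 cannot be checked the same way from this report alone, because only part of its 20 category counts are listed.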

Reproduction

Analysis started: 2023-12-11 00:35:37.975384
Analysis finished: 2023-12-11 00:35:39.476566
Duration: 1.5 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

년도
Real number (ℝ)

Distinct: 25
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2010.6327
Minimum: 1999
Maximum: 2023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1999
5th percentile: 2000
Q1: 2004
Median: 2011
Q3: 2017
95th percentile: 2022
Maximum: 2023
Range: 24
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 7.135929
Coefficient of variation (CV): 0.0035490963
Kurtosis: -1.2155133
Mean: 2010.6327
Median Absolute Deviation (MAD): 6
Skewness: 0.042956259
Sum: 20106327
Variance: 50.921483
Monotonicity: Not monotonic
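
Several of the statistics for 년도 are derived from the others: the range and IQR are differences of quantiles, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. A short consistency check using only the figures reported here (variable names are illustrative):

```python
# Figures copied from the 년도 statistics in this report.
minimum, maximum = 1999, 2023
q1, q3 = 2004, 2017
mean = 2010.6327
std = 7.135929
n_observations = 10_000

value_range = maximum - minimum       # reported as Range
iqr = q3 - q1                         # reported as Interquartile range (IQR)
cv = std / mean                       # reported as Coefficient of variation (CV)
variance = std ** 2                   # reported as Variance
total = round(mean * n_observations)  # reported as Sum

print(value_range, iqr, round(cv, 10), round(variance, 6), total)
```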
[Figure: histogram with fixed size bins (bins=25)]

Common values
Value | Count | Frequency (%)
2004 | 470 | 4.7%
2001 | 443 | 4.4%
2000 | 440 | 4.4%
2005 | 433 | 4.3%
2002 | 432 | 4.3%
2022 | 429 | 4.3%
2014 | 425 | 4.2%
2003 | 422 | 4.2%
2015 | 418 | 4.2%
2021 | 417 | 4.2%
Other values (15) | 5671 | 56.7%

Minimum 10 values
Value | Count | Frequency (%)
1999 | 382 | 3.8%
2000 | 440 | 4.4%
2001 | 443 | 4.4%
2002 | 432 | 4.3%
2003 | 422 | 4.2%
2004 | 470 | 4.7%
2005 | 433 | 4.3%
2006 | 386 | 3.9%
2007 | 400 | 4.0%
2008 | 371 | 3.7%

Maximum 10 values
Value | Count | Frequency (%)
2023 | 226 | 2.3%
2022 | 429 | 4.3%
2021 | 417 | 4.2%
2020 | 393 | 3.9%
2019 | 365 | 3.6%
2018 | 390 | 3.9%
2017 | 376 | 3.8%
2016 | 379 | 3.8%
2015 | 418 | 4.2%
2014 | 425 | 4.2%

학과코드
Real number (ℝ)

HIGH CORRELATION 

Distinct: 23
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.7179
Minimum: 1
Maximum: 99
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 3
Median: 6
Q3: 8
95th percentile: 14
Maximum: 99
Range: 98
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 5.6780229
Coefficient of variation (CV): 0.84520801
Kurtosis: 22.113352
Mean: 6.7179
Median Absolute Deviation (MAD): 3
Skewness: 3.5156571
Sum: 67179
Variance: 32.239944
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=23)]

Common values
Value | Count | Frequency (%)
5 | 1355 | 13.6%
6 | 1209 | 12.1%
3 | 1063 | 10.6%
1 | 1010 | 10.1%
7 | 889 | 8.9%
8 | 773 | 7.7%
2 | 745 | 7.4%
4 | 592 | 5.9%
9 | 443 | 4.4%
14 | 414 | 4.1%
Other values (13) | 1507 | 15.1%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1010 | 10.1%
2 | 745 | 7.4%
3 | 1063 | 10.6%
4 | 592 | 5.9%
5 | 1355 | 13.6%
6 | 1209 | 12.1%
7 | 889 | 8.9%
8 | 773 | 7.7%
9 | 443 | 4.4%
10 | 169 | 1.7%

Maximum 10 values
Value | Count | Frequency (%)
99 | 1 | < 0.1%
55 | 1 | < 0.1%
51 | 4 | < 0.1%
42 | 10 | 0.1%
41 | 85 | 0.9%
31 | 86 | 0.9%
17 | 13 | 0.1%
16 | 22 | 0.2%
15 | 167 | 1.7%
14 | 414 | 4.1%

전공코드
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 20
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
(blank) | 8184
01 | 598
02 | 515
11 | 112
12 | 112
Other values (15) | 479

Length

Max length: 2
Median length: 1
Mean length: 1.1815
Min length: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: (blank)
2nd row: 10
3rd row: (blank)
4th row: 01
5th row: (blank)

Common Values

Value | Count | Frequency (%)
(blank) | 8184 | 81.8%
01 | 598 | 6.0%
02 | 515 | 5.1%
11 | 112 | 1.1%
12 | 112 | 1.1%
22 | 84 | 0.8%
21 | 83 | 0.8%
03 | 80 | 0.8%
31 | 78 | 0.8%
32 | 74 | 0.7%
Other values (10) | 80 | 0.8%

Length

[Figure: histogram of lengths of the category]

Non-blank values (percentages are relative to the 1816 non-blank rows, not to all 10000 rows)
Value | Count | Frequency (%)
01 | 598 | 32.9%
02 | 515 | 28.4%
11 | 112 | 6.2%
12 | 112 | 6.2%
22 | 84 | 4.6%
21 | 83 | 4.6%
03 | 80 | 4.4%
31 | 78 | 4.3%
32 | 74 | 4.1%
33 | 15 | 0.8%
Other values (9) | 65 | 3.6%
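
The two frequency tables for 전공코드 report different percentages for the same counts because they use different denominators: the first divides by all 10000 rows, the second by the rows that remain once the 8184 blank values are excluded. A small sketch of both calculations for the value 01, using only counts from this report (variable names are illustrative):

```python
# Counts copied from the 전공코드 tables in this report.
total_rows = 10_000
blank_count = 8184
count_01 = 598

# Rows that carry an actual major code.
non_blank_rows = total_rows - blank_count

share_of_all = count_01 / total_rows
share_of_non_blank = count_01 / non_blank_rows

print(f"{share_of_all:.1%} of all rows")
print(f"{share_of_non_blank:.1%} of non-blank rows")
```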

학년
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
1 | 4762
2 | 4675
3 | 456
4 | 107

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 2
5th row: 2

Common Values

Value | Count | Frequency (%)
1 | 4762 | 47.6%
2 | 4675 | 46.8%
3 | 456 | 4.6%
4 | 107 | 1.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values listed above]

학기
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
1 | 5047
2 | 4953

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 2
4th row: 1
5th row: 2

Common Values

Value | Count | Frequency (%)
1 | 5047 | 50.5%
2 | 4953 | 49.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values listed above]

과목코드
Real number (ℝ)

HIGH CORRELATION 

Distinct: 3881
Distinct (%): 38.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 123006.7
Minimum: 9999
Maximum: 584068
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 9999
5th percentile: 14048.95
Q1: 51014
Median: 74009
Q3: 131063.25
95th percentile: 554004
Maximum: 584068
Range: 574069
Interquartile range (IQR): 80049.25

Descriptive statistics

Standard deviation: 141117.6
Coefficient of variation (CV): 1.1472351
Kurtosis: 4.3121108
Mean: 123006.7
Median Absolute Deviation (MAD): 39008
Skewness: 2.280695
Sum: 1.230067 × 10^9
Variance: 1.9914176 × 10^10
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
53015 | 31 | 0.3%
71013 | 21 | 0.2%
35001 | 20 | 0.2%
65005 | 20 | 0.2%
31013 | 19 | 0.2%
13016 | 19 | 0.2%
73022 | 18 | 0.2%
75001 | 18 | 0.2%
74026 | 18 | 0.2%
65006 | 17 | 0.2%
Other values (3871) | 9799 | 98.0%

Minimum 10 values
Value | Count | Frequency (%)
9999 | 1 | < 0.1%
11003 | 2 | < 0.1%
11004 | 3 | < 0.1%
11005 | 8 | 0.1%
11006 | 2 | < 0.1%
11007 | 1 | < 0.1%
11008 | 2 | < 0.1%
11010 | 2 | < 0.1%
11011 | 8 | 0.1%
11012 | 17 | 0.2%

Maximum 10 values
Value | Count | Frequency (%)
584068 | 1 | < 0.1%
584067 | 1 | < 0.1%
584065 | 1 | < 0.1%
584062 | 1 | < 0.1%
584061 | 1 | < 0.1%
584057 | 1 | < 0.1%
584056 | 1 | < 0.1%
584054 | 1 | < 0.1%
584051 | 2 | < 0.1%
584050 | 1 | < 0.1%

과목반
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
A | 6398
B | 2716
E | 737
C | 125
D | 23

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: B
2nd row: A
3rd row: A
4th row: A
5th row: B

Common Values

Value | Count | Frequency (%)
A | 6398 | 64.0%
B | 2716 | 27.2%
E | 737 | 7.4%
C | 125 | 1.2%
D | 23 | 0.2%
a | 1 | < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of common values]
Value (lower-cased) | Count | Frequency (%)
a | 6399 | 64.0%
b | 2716 | 27.2%
e | 737 | 7.4%
c | 125 | 1.2%
d | 23 | 0.2%
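
The plot table lists lower-cased labels, and its count for a (6399) equals the 6398 rows of A plus the single a row in the Common Values table, which suggests the plot groups values case-insensitively. A sketch of that merge, assuming plain lower-casing and using the counts from this report:

```python
from collections import Counter

# Counts copied from the 과목반 Common Values table.
section_counts = Counter({"A": 6398, "B": 2716, "E": 737, "C": 125, "D": 23, "a": 1})

# Merge labels that differ only by case.
normalized = Counter()
for label, count in section_counts.items():
    normalized[label.lower()] += count

print(len(section_counts), "distinct labels before,", len(normalized), "after")
print(normalized.most_common())
```

If the lone lower-case a is a data-entry variant of A, normalizing case before analysis reduces the six distinct values to five.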

과정종류
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
A | 9242
B | 757
1 | 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: A
2nd row: A
3rd row: A
4th row: A
5th row: A

Common Values

Value | Count | Frequency (%)
A | 9242 | 92.4%
B | 757 | 7.6%
1 | 1 | < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of common values]
Value (lower-cased) | Count | Frequency (%)
a | 9242 | 92.4%
b | 757 | 7.6%
1 | 1 | < 0.1%

Interactions

[Figures: nine interaction plots]

Correlations

[Figure: correlation heatmap]
Variable | 년도 | 학과코드 | 전공코드 | 학년 | 학기 | 과목코드 | 과목반 | 과정종류
년도 | 1.000 | 0.377 | 0.541 | 0.202 | 0.111 | 0.416 | 0.294 | 0.307
학과코드 | 0.377 | 1.000 | 0.760 | 0.143 | 0.055 | 0.888 | 0.845 | 0.944
전공코드 | 0.541 | 0.760 | 1.000 | 0.138 | 0.109 | 0.686 | 0.871 | 0.864
학년 | 0.202 | 0.143 | 0.138 | 1.000 | 0.030 | 0.226 | 0.073 | 0.040
학기 | 0.111 | 0.055 | 0.109 | 0.030 | 1.000 | 0.045 | 0.000 | 0.000
과목코드 | 0.416 | 0.888 | 0.686 | 0.226 | 0.045 | 1.000 | 0.729 | 0.968
과목반 | 0.294 | 0.845 | 0.871 | 0.073 | 0.000 | 0.729 | 1.000 | 1.000
과정종류 | 0.307 | 0.944 | 0.864 | 0.040 | 0.000 | 0.968 | 1.000 | 1.000

[Figure: correlation heatmap]
Variable | 과목반 | 전공코드 | 학년 | 학기 | 과정종류
과목반 | 1.000 | 0.641 | 0.047 | 0.000 | 0.999
전공코드 | 0.641 | 1.000 | 0.065 | 0.086 | 0.712
학년 | 0.047 | 0.065 | 1.000 | 0.020 | 0.038
학기 | 0.000 | 0.086 | 0.020 | 1.000 | 0.000
과정종류 | 0.999 | 0.712 | 0.038 | 0.000 | 1.000

[Figure: correlation heatmap]
Variable | 년도 | 학과코드 | 과목코드 | 전공코드 | 학년 | 학기 | 과목반 | 과정종류
년도 | 1.000 | 0.333 | 0.102 | 0.200 | 0.123 | 0.083 | 0.159 | 0.188
학과코드 | 0.333 | 1.000 | 0.561 | 0.481 | 0.093 | 0.040 | 0.464 | 0.713
과목코드 | 0.102 | 0.561 | 1.000 | 0.350 | 0.146 | 0.045 | 0.477 | 0.704
전공코드 | 0.200 | 0.481 | 0.350 | 1.000 | 0.065 | 0.086 | 0.641 | 0.712
학년 | 0.123 | 0.093 | 0.146 | 0.065 | 1.000 | 0.020 | 0.047 | 0.038
학기 | 0.083 | 0.040 | 0.045 | 0.086 | 0.020 | 1.000 | 0.000 | 0.000
과목반 | 0.159 | 0.464 | 0.477 | 0.641 | 0.047 | 0.000 | 1.000 | 0.999
과정종류 | 0.188 | 0.713 | 0.704 | 0.712 | 0.038 | 0.000 | 0.999 | 1.000
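
The report does not say which coefficient each matrix uses. One standard measure of association between two categorical columns is Cramér's V, which ranges from 0 (no association) to 1 (one column determines the other). The sketch below computes the uncorrected form from a contingency table; the function name and the two small tables are hypothetical and are not taken from this dataset, so it will not necessarily reproduce the values above:

```python
import math

def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table (list of rows of counts)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Pearson chi-squared statistic: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected

    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi2 / (n * k)) if n > 0 and k > 0 else 0.0

# Hypothetical 2x2 tables for illustration only.
perfectly_associated = [[50, 0], [0, 50]]
independent = [[25, 25], [25, 25]]

print(cramers_v(perfectly_associated))
print(cramers_v(independent))
```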

Missing values

[Figure: bar chart of non-missing counts by column]
A simple visualization of nullity by column.
[Figure: nullity matrix]
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

년도학과코드전공코드학년학기과목코드과목반과정종류
8317200571274037BA
14076201621012224171AA
1364120171212124050AA
4590201220121224093AA
9537200472274084BA
11036200232231011AA
15084201862263202AA
11454200341141015AA
13583201782284119AA
101372004312533050EB
년도학과코드전공코드학년학기과목코드과목반과정종류
11172200351154038AA
17182202382184187AA
8268200571174079AA
1223201472173022AA
5904201091294063AA
6898200852154085AA
11632201663163173AA
8425200681284003AA
66292009112514041EB
1983201541244219AA