Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 752.0 KiB
Average record size in memory: 77.0 B
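
The average record size follows from the total size and the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes (the variable names are illustrative, not taken from the tool):

```python
# Values copied from the dataset statistics above.
KIB = 1024                 # bytes per KiB (assumed binary prefix)
total_size_kib = 752.0     # "Total size in memory"
n_observations = 10000     # "Number of observations"

# Average bytes per row = total bytes / number of rows.
average_record_bytes = total_size_kib * KIB / n_observations
print(f"{average_record_bytes:.1f} B")
```

Rounded to one decimal place, the result agrees with the 77.0 B reported above.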

Variable types

Numeric: 3
Categorical: 5

Dataset

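Description: Data from Gyeongnam Provincial Geochang College (경남도립거창대학), Gyeongsangnam-do, released under the mid- to long-term open data plan. It includes fields such as academic year (학년도), department code (학과코드), major code (전공코드), year of study (학년), course code (과목코드), course section (과목반), and course type (과정종류).
URL: https://www.data.go.kr/data/15066691/fileData.do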

Alerts

과정종류 is highly overall correlated with 학과코드 and 3 other fields (High correlation)
과목반 is highly overall correlated with 전공코드 and 1 other field (High correlation)
학과코드 is highly overall correlated with 과목코드 and 1 other field (High correlation)
과목코드 is highly overall correlated with 학과코드 and 1 other field (High correlation)
전공코드 is highly overall correlated with 과목반 and 1 other field (High correlation)
전공코드 is highly imbalanced (72.1%) (Imbalance)
과정종류 is highly imbalanced (75.0%) (Imbalance)
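
A minimal sketch of how an imbalance percentage of this kind can be computed, assuming the score is one minus the Shannon entropy of the category frequencies divided by the maximum possible entropy (the log of the number of categories). That formula is an assumption about the tool rather than something this report states, but applied to the 과정종류 counts listed later in the report (A: 9220, B: 779, 1: 1) it reproduces the 75.0% shown above. The function name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k): 0 for a uniform column, near 1 for a dominated one."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single category carries no class balance to measure
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)


# Counts copied from the 과정종류 "Common Values" table in this report.
course_type_counts = {"A": 9220, "B": 779, "1": 1}
score = imbalance_score(list(course_type_counts.values()))
print(f"{score:.1%}")
```

The 72.1% figure for 전공코드 cannot be checked the same way, because the report lists only its most frequent values and groups the rest under "Other values".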

Reproduction

Analysis started: 2023-12-12 05:38:00.162510
Analysis finished: 2023-12-12 05:38:02.299108
Duration: 2.14 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
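
The reported duration is the difference between the two timestamps. A short sketch using only the standard library; the format string is an assumption matching how the timestamps are printed above:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed layout of the timestamps in this report

started = datetime.strptime("2023-12-12 05:38:00.162510", FMT)
finished = datetime.strptime("2023-12-12 05:38:02.299108", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```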

Variables

년도
Real number (ℝ)

Distinct: 25
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2010.633
Minimum: 1999
Maximum: 2023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1999
5th percentile: 2000
Q1: 2004
Median: 2011
Q3: 2017
95th percentile: 2022
Maximum: 2023
Range: 24
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 7.1226108
Coefficient of variation (CV): 0.0035424718
Kurtosis: -1.2128153
Mean: 2010.633
Median Absolute Deviation (MAD): 6
Skewness: 0.042907825
Sum: 20106330
Variance: 50.731584
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=25)
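
Several of the 년도 statistics are simple functions of the others, which gives a quick consistency check. The sketch below recomputes them from the printed values; because the inputs are already rounded in the report, the results are only expected to agree to the displayed precision.

```python
import math

# Values copied from the 년도 statistics above.
n = 10000
mean, std = 2010.633, 7.1226108
minimum, maximum = 1999, 2023
q1, q3 = 2004, 2017

derived = {
    "Range": maximum - minimum,              # max - min
    "Interquartile range (IQR)": q3 - q1,    # Q3 - Q1
    "Coefficient of variation (CV)": std / mean,
    "Variance": std ** 2,                    # square of the standard deviation
    "Sum": mean * n,                         # mean times number of observations
}

for name, value in derived.items():
    print(f"{name}: {value:.8g}")
```

The same relations hold for the other numeric variables in this report, for example Q3 - Q1 = 134007.25 - 51047 = 82960.25 for 과목코드.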
Value              Count  Frequency (%)
2004               458    4.6%
2015               455    4.5%
2002               447    4.5%
2003               442    4.4%
2001               440    4.4%
2012               431    4.3%
2022               429    4.3%
2005               426    4.3%
2010               411    4.1%
2000               409    4.1%
Other values (15)  5652   56.5%

Value              Count  Frequency (%)
1999               391    3.9%
2000               409    4.1%
2001               440    4.4%
2002               447    4.5%
2003               442    4.4%
2004               458    4.6%
2005               426    4.3%
2006               404    4.0%
2007               372    3.7%
2008               380    3.8%

Value              Count  Frequency (%)
2023               222    2.2%
2022               429    4.3%
2021               408    4.1%
2020               403    4.0%
2019               368    3.7%
2018               370    3.7%
2017               392    3.9%
2016               361    3.6%
2015               455    4.5%
2014               409    4.1%

학과코드
Real number (ℝ)

HIGH CORRELATION 

Distinct: 23
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.7453
Minimum: 1
Maximum: 99
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 3
Median: 6
Q3: 8
95th percentile: 14
Maximum: 99
Range: 98
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 5.680533
Coefficient of variation (CV): 0.84214682
Kurtosis: 22.267391
Mean: 6.7453
Median Absolute Deviation (MAD): 3
Skewness: 3.524183
Sum: 67453
Variance: 32.268455
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=23)
Value              Count  Frequency (%)
5                  1411   14.1%
6                  1190   11.9%
3                  1052   10.5%
1                  985    9.8%
7                  902    9.0%
8                  767    7.7%
2                  749    7.5%
4                  563    5.6%
14                 475    4.8%
9                  461    4.6%
Other values (13)  1445   14.4%

Value              Count  Frequency (%)
1                  985    9.8%
2                  749    7.5%
3                  1052   10.5%
4                  563    5.6%
5                  1411   14.1%
6                  1190   11.9%
7                  902    9.0%
8                  767    7.7%
9                  461    4.6%
10                 151    1.5%

Value              Count  Frequency (%)
99                 1      < 0.1%
55                 2      < 0.1%
51                 4      < 0.1%
42                 13     0.1%
41                 79     0.8%
31                 89     0.9%
17                 6      0.1%
16                 15     0.1%
15                 152    1.5%
14                 475    4.8%
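
The high skewness (3.52) and kurtosis (22.27) of 학과코드 come from a thin tail of large codes. As an illustration only, the sketch below applies Tukey's 1.5 × IQR fences, a rule that this report does not itself use, to the quartiles and the ten largest values listed above. Since the smallest listed values (14 and 15) already fall below the upper fence, every value above the fence is covered by that table.

```python
# Quartiles copied from the 학과코드 quantile statistics.
q1, q3 = 3, 8
iqr = q3 - q1

# Tukey fences (a conventional rule of thumb, not part of the report).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Ten largest distinct values and their counts, copied from the table above.
largest_values = {99: 1, 55: 2, 51: 4, 42: 13, 41: 79,
                  31: 89, 17: 6, 16: 15, 15: 152, 14: 475}

above = {value: count for value, count in largest_values.items()
         if value > upper_fence}
n_above = sum(above.values())

print(f"fences: [{lower_fence}, {upper_fence}]")
print(f"rows above the upper fence: {n_above} of 10000")
```

Whether these codes are errors or simply rarely used departments cannot be determined from the report; 학과코드 is also an identifier stored as a number, so distance-based rules like this one should be read with caution.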

전공코드
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 20
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
(blank): 8171
01: 635
02: 506
12: 104
11: 102
Other values (15): 482

Length

Max length: 2
Median length: 1
Mean length: 1.1828
Min length: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 01
2nd row: (blank)
3rd row: (blank)
4th row: (blank)
5th row: (blank)

Common Values

Value              Count  Frequency (%)
(blank)            8171   81.7%
01                 635    6.3%
02                 506    5.1%
12                 104    1.0%
11                 102    1.0%
03                 88     0.9%
22                 86     0.9%
31                 80     0.8%
21                 79     0.8%
32                 57     0.6%
Other values (10)  92     0.9%

Length

Histogram of lengths of the category

Value              Count  Frequency (%)
01                 635    34.7%
02                 506    27.7%
12                 104    5.7%
11                 102    5.6%
03                 88     4.8%
22                 86     4.7%
31                 80     4.4%
21                 79     4.3%
32                 57     3.1%
14                 18     1.0%
Other values (9)   74     4.0%

학년
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
1: 4758
2: 4683
3: 481
4: 78

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 1
3rd row: 2
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      4758   47.6%
2      4683   46.8%
3      481    4.8%
4      78     0.8%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot]

Value  Count  Frequency (%)
1      4758   47.6%
2      4683   46.8%
3      481    4.8%
4      78     0.8%

학기
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
1: 5003
2: 4997

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 2
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      5003   50.0%
2      4997   50.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot]

Value  Count  Frequency (%)
1      5003   50.0%
2      4997   50.0%

과목코드
Real number (ℝ)

HIGH CORRELATION 

Distinct: 3861
Distinct (%): 38.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 124423.7
Minimum: 9999
Maximum: 584068
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 9999
5th percentile: 14047
Q1: 51047
Median: 74026
Q3: 134007.25
95th percentile: 554005
Maximum: 584068
Range: 574069
Interquartile range (IQR): 82960.25

Descriptive statistics

Standard deviation: 142167.59
Coefficient of variation (CV): 1.1426086
Kurtosis: 4.1451225
Mean: 124423.7
Median Absolute Deviation (MAD): 39419.5
Skewness: 2.2514834
Sum: 1.244237 × 10^9
Variance: 2.0211624 × 10^10
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
53015                26     0.3%
13016                22     0.2%
14003                20     0.2%
11012                19     0.2%
14045                18     0.2%
75001                18     0.2%
71013                17     0.2%
35002                17     0.2%
65006                17     0.2%
31013                17     0.2%
Other values (3851)  9809   98.1%

Value                Count  Frequency (%)
9999                 1      < 0.1%
11003                2      < 0.1%
11004                2      < 0.1%
11005                5      0.1%
11006                2      < 0.1%
11007                2      < 0.1%
11008                2      < 0.1%
11009                2      < 0.1%
11010                2      < 0.1%
11011                6      0.1%

Value                Count  Frequency (%)
584068               1      < 0.1%
584063               1      < 0.1%
584062               1      < 0.1%
584060               1      < 0.1%
584059               1      < 0.1%
584057               1      < 0.1%
584055               1      < 0.1%
584053               1      < 0.1%
584052               1      < 0.1%
584051               2      < 0.1%

과목반
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
A: 6426
B: 2653
E: 753
C: 140
D: 27

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: A
2nd row: A
3rd row: E
4th row: B
5th row: E

Common Values

Value  Count  Frequency (%)
A      6426   64.3%
B      2653   26.5%
E      753    7.5%
C      140    1.4%
D      27     0.3%
a      1      < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot]

Value  Count  Frequency (%)
a      6427   64.3%
b      2653   26.5%
e      753    7.5%
c      140    1.4%
d      27     0.3%
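
The Common Values table lists a single lowercase `a` next to 6426 uppercase `A` entries, while the lowercased table reports 6427 for `a`. This suggests the lone `a` is a casing variant of `A` rather than a sixth section. A minimal sketch of folding the two together, using only the counts shown above (treating `a` as a typo is an assumption, not something the report confirms):

```python
from collections import Counter

# Counts copied from the 과목반 "Common Values" table.
raw_counts = Counter({"A": 6426, "B": 2653, "E": 753, "C": 140, "D": 27, "a": 1})

# Fold case so that "a" and "A" are counted as the same section.
normalized = Counter()
for value, count in raw_counts.items():
    normalized[value.upper()] += count

for value, count in normalized.most_common():
    print(value, count)
```

After normalization the column has five distinct values instead of six, and the single unique value flagged under "Unique" disappears.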

과정종류
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
A: 9220
B: 779
1: 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: A
2nd row: A
3rd row: B
4th row: A
5th row: B

Common Values

Value  Count  Frequency (%)
A      9220   92.2%
B      779    7.8%
1      1      < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot]

Value  Count  Frequency (%)
a      9220   92.2%
b      779    7.8%
1      1      < 0.1%

Interactions

[Figures: nine interaction plots, rendered with Matplotlib v3.7.2]

Correlations

          년도     학과코드  전공코드  학년     학기     과목코드  과목반   과정종류
년도      1.000  0.376  0.532  0.178  0.084  0.412  0.292  0.310
학과코드  0.376  1.000  0.760  0.146  0.049  0.884  0.846  0.944
전공코드  0.532  0.760  1.000  0.134  0.101  0.680  0.865  0.865
학년      0.178  0.146  0.134  1.000  0.024  0.222  0.068  0.035
학기      0.084  0.049  0.101  0.024  1.000  0.037  0.023  0.004
과목코드  0.412  0.884  0.680  0.222  0.037  1.000  0.731  0.966
과목반    0.292  0.846  0.865  0.068  0.023  0.731  1.000  1.000
과정종류  0.310  0.944  0.865  0.035  0.004  0.966  1.000  1.000

          전공코드  과정종류  학기     과목반   학년
전공코드  1.000  0.712  0.080  0.631  0.064
과정종류  0.712  1.000  0.006  1.000  0.033
학기      0.080  0.006  1.000  0.016  0.016
과목반    0.631  1.000  0.016  1.000  0.044
학년      0.064  0.033  0.016  0.044  1.000

          년도     학과코드  과목코드  전공코드  학년     학기     과목반   과정종류
년도      1.000  0.334  0.105  0.196  0.109  0.063  0.157  0.190
학과코드  0.334  1.000  0.549  0.481  0.094  0.035  0.465  0.713
과목코드  0.105  0.549  1.000  0.345  0.143  0.037  0.480  0.704
전공코드  0.196  0.481  0.345  1.000  0.064  0.080  0.631  0.712
학년      0.109  0.094  0.143  0.064  1.000  0.016  0.044  0.033
학기      0.063  0.035  0.037  0.080  0.016  1.000  0.016  0.006
과목반    0.157  0.465  0.480  0.631  0.044  0.016  1.000  1.000
과정종류  0.190  0.713  0.704  0.712  0.033  0.006  1.000  1.000
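
The High correlation alerts at the top of the report appear to follow from the matrix immediately above: flagging every off-diagonal value of at least 0.5 yields the same five variables and the same partner counts as the alerts. The 0.5 threshold is inferred from these numbers and is an assumption, not a setting shown in this report; the function name `highly_correlated` is likewise illustrative.

```python
# Labels and values copied from the eight-variable matrix directly above.
LABELS = ["년도", "학과코드", "과목코드", "전공코드", "학년", "학기", "과목반", "과정종류"]
MATRIX = [
    [1.000, 0.334, 0.105, 0.196, 0.109, 0.063, 0.157, 0.190],
    [0.334, 1.000, 0.549, 0.481, 0.094, 0.035, 0.465, 0.713],
    [0.105, 0.549, 1.000, 0.345, 0.143, 0.037, 0.480, 0.704],
    [0.196, 0.481, 0.345, 1.000, 0.064, 0.080, 0.631, 0.712],
    [0.109, 0.094, 0.143, 0.064, 1.000, 0.016, 0.044, 0.033],
    [0.063, 0.035, 0.037, 0.080, 0.016, 1.000, 0.016, 0.006],
    [0.157, 0.465, 0.480, 0.631, 0.044, 0.016, 1.000, 1.000],
    [0.190, 0.713, 0.704, 0.712, 0.033, 0.006, 1.000, 1.000],
]
THRESHOLD = 0.5  # assumed cut-off, chosen because it reproduces the alerts


def highly_correlated(labels, matrix, threshold):
    """Map each variable to the other variables whose correlation meets the threshold."""
    result = {}
    for i, name in enumerate(labels):
        partners = [labels[j] for j, value in enumerate(matrix[i])
                    if j != i and value >= threshold]
        if partners:
            result[name] = partners
    return result


alerts = highly_correlated(LABELS, MATRIX, THRESHOLD)
for name, partners in alerts.items():
    print(f"{name}: {', '.join(partners)}")
```

For example, 과정종류 is paired with 학과코드, 과목코드, 전공코드 and 과목반, which matches the alert "학과코드 and 3 other fields". The other eight-variable matrix in this section has more entries above 0.5 and would not reproduce the alert counts.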

Missing values

[Figure: nullity count by column]
A simple visualization of nullity by column.
[Figure: nullity matrix]
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.

Sample

년도  학과코드  전공코드  학년  학기  과목코드  과목반  과정종류
4975201020122224124AA
9573200481284043AA
84562006822584040EB
5083201051154056BA
79152005511554031EB
895200020122214031AA
15770202191194181AA
65192009511554052EB
1684220231711174010AA
10196200433133071AA

년도  학과코드  전공코드  학년  학기  과목코드  과목반  과정종류
1631120205212253030AA
10730200220121214015AA
1572220203112311032AA
4895201163265007AA
189620151411144017AA
1791920231711171053AA
5692200951254150BA
2956201352154189AA
1740199912111010AA
595520101111114001AA