Overview

Dataset statistics

Number of variables: 11
Number of observations: 10000
Missing cells: 4253
Missing cells (%): 3.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1005.9 KiB
Average record size in memory: 103.0 B
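The two derived figures above can be recomputed from the raw counts. The following is a minimal sketch, assuming the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 11
n_observations = 10_000
missing_cells = 4_253
total_size_kib = 1005.9

# Assumption: the percentage is relative to every cell in the table.
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Both values round to the figures reported above (3.9% and 103.0 B).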

Variable types

Numeric: 3
Categorical: 6
Text: 1
DateTime: 1

Dataset

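Description: Data from Gyeongnam Provincial Namhae College (경남도립남해대학), Gyeongsangnam-do, released under the mid- to long-term data disclosure plan. (수강반 [enrolled class section], 소속반 [home class section], 사용날짜 [date of use])
Author: Gyeongsangnam-do (경상남도)
URL: https://www.data.go.kr/data/15067552/fileData.do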

Alerts

과목전공 has constant value "" (Constant)
소속전공 has constant value "" (Constant)
과목계열 is highly overall correlated with 소속계열 (High correlation)
소속계열 is highly overall correlated with 과목계열 (High correlation)
수강반 is highly imbalanced (58.4%) (Imbalance)
사용날짜 has 4253 (42.5%) missing values (Missing)
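The 58.4% imbalance figure for 수강반 is consistent with a normalized-entropy score, 1 − H / log2(k), where H is the Shannon entropy of the category frequencies and k is the number of categories. The sketch below is an illustration under that assumption, not a statement of how the profiling tool is implemented; the function name `imbalance_score` is hypothetical, and the counts are copied from the Common Values table of 수강반.

```python
import math

# Category counts for 수강반, copied from its "Common Values" table.
counts = {"A": 7154, "B": 2668, "C": 162, "D": 14, "E": 2}


def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform column, closer to 1 as one class dominates."""
    total = sum(counts.values())
    k = len(counts)
    if k < 2:
        # Undefined for a single class; treated as 0 in this sketch.
        return 0.0
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in counts.values() if c > 0
    )
    return 1 - entropy / math.log2(k)


score = imbalance_score(counts)
print(f"Imbalance: {100 * score:.1f}%")
```

A perfectly balanced column would score 0%, so 58.4% reflects that category A alone accounts for 71.5% of the rows.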

Reproduction

Analysis started: 2023-12-12 19:03:54.180310
Analysis finished: 2023-12-12 19:03:56.381082
Duration: 2.2 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
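A report of this kind can be regenerated along the following lines. This is a hedged sketch only: the report does not record how the data was loaded, so the CSV path, the encoding, the report title, and the helper name `build_report` are all assumptions. It relies on `pandas.read_csv` and on `ProfileReport` and its `to_file` method from ydata-profiling.

```python
def build_report(csv_path, output_html="report.html", encoding="utf-8"):
    """Profile a CSV file and write an HTML report; returns the output path."""
    # Imported lazily so that defining this helper does not require
    # the third-party packages to be installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the source is a CSV file readable with the given encoding.
    df = pd.read_csv(csv_path, encoding=encoding)

    # Assumption: default profiling settings; the title is a placeholder.
    profile = ProfileReport(df, title="Dataset profile")
    profile.to_file(output_html)
    return output_html
```

Calling `build_report("data.csv")` with a local copy of the dataset (the file name is hypothetical) would write `report.html` in the working directory.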

Variables

년도 (year)
Real number (ℝ)

Distinct: 20
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2011.7704
Minimum: 2001
Maximum: 2020
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2001
5th percentile: 2002
Q1: 2007
Median: 2012
Q3: 2017
95th percentile: 2020
Maximum: 2020
Range: 19
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 5.5567591
Coefficient of variation (CV): 0.0027621239
Kurtosis: -1.1166647
Mean: 2011.7704
Median Absolute Deviation (MAD): 5
Skewness: -0.26095107
Sum: 20117704
Variance: 30.877572
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=20)
Common values

Value              Count  Frequency (%)
2018               749    7.5%
2017               648    6.5%
2019               631    6.3%
2015               625    6.2%
2016               618    6.2%
2020               578    5.8%
2011               541    5.4%
2014               529    5.3%
2010               525    5.2%
2013               525    5.2%
Other values (10)  4031   40.3%

Minimum 10 values

Value  Count  Frequency (%)
2001   178    1.8%
2002   416    4.2%
2003   399    4.0%
2004   419    4.2%
2005   403    4.0%
2006   395    4.0%
2007   388    3.9%
2008   452    4.5%
2009   470    4.7%
2010   525    5.2%

Maximum 10 values

Value  Count  Frequency (%)
2020   578    5.8%
2019   631    6.3%
2018   749    7.5%
2017   648    6.5%
2016   618    6.2%
2015   625    6.2%
2014   529    5.3%
2013   525    5.2%
2012   511    5.1%
2011   541    5.4%
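Because the minimum and maximum value tables together list all 20 distinct years, the descriptive statistics for 년도 can be recomputed from the counts alone. The sketch below does this in plain Python; it assumes the reported variance is the sample variance (n − 1 in the denominator), which is the convention that reproduces the reported value.

```python
import math

# Year -> count, merged from the minimum and maximum value tables for 년도.
year_counts = {
    2001: 178, 2002: 416, 2003: 399, 2004: 419, 2005: 403,
    2006: 395, 2007: 388, 2008: 452, 2009: 470, 2010: 525,
    2011: 541, 2012: 511, 2013: 525, 2014: 529, 2015: 625,
    2016: 618, 2017: 648, 2018: 749, 2019: 631, 2020: 578,
}

n = sum(year_counts.values())                        # number of observations
total = sum(y * c for y, c in year_counts.items())   # "Sum" in the report
mean = total / n

# Assumption: sample variance, dividing by n - 1.
sum_sq_dev = sum(c * (y - mean) ** 2 for y, c in year_counts.items())
variance = sum_sq_dev / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

print(n, total, round(mean, 4), round(variance, 6), round(std, 7))
```

The recomputed sum (20117704), mean (2011.7704) and variance (about 30.877572) agree with the figures reported for this variable.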

학기 (semester)
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 2
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value  Count  Frequency (%)
2      5055   50.5%
1      4945   49.5%

Length

Histogram of lengths of the category

과목계열 (course division)
Real number (ℝ)

HIGH CORRELATION

Distinct: 19
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.2056
Minimum: 1
Maximum: 99
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 3
Median: 5
Q3: 7
95th percentile: 12
Maximum: 99
Range: 98
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 15.036061
Coefficient of variation (CV): 2.0867188
Kurtosis: 31.022788
Mean: 7.2056
Median Absolute Deviation (MAD): 2
Skewness: 5.634362
Sum: 72056
Variance: 226.08314
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=19)
Common values

Value             Count  Frequency (%)
4                 1333   13.3%
6                 1333   13.3%
7                 1317   13.2%
3                 1310   13.1%
1                 1173   11.7%
2                 1152   11.5%
5                 1115   11.2%
11                511    5.1%
98                170    1.7%
12                110    1.1%
Other values (9)  476    4.8%

Minimum 10 values

Value  Count  Frequency (%)
1      1173   11.7%
2      1152   11.5%
3      1310   13.1%
4      1333   13.3%
5      1115   11.2%
6      1333   13.3%
7      1317   13.2%
8      95     0.9%
9      103    1.0%
10     20     0.2%

Maximum 10 values

Value  Count  Frequency (%)
99     6      0.1%
98     170    1.7%
97     83     0.8%
16     16     0.2%
15     34     0.3%
14     79     0.8%
13     40     0.4%
12     110    1.1%
11     511    5.1%
10     20     0.2%

과목전공 (course major)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      10000  100.0%

Length

Histogram of lengths of the category

과목학년 (course year level)
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 1
4th row: 2
5th row: 2

Common Values

Value  Count  Frequency (%)
1      5100   51.0%
2      4832   48.3%
4      39     0.4%
3      29     0.3%

Length

Histogram of lengths of the category

과목 (course code)
Text

Distinct: 3660
Distinct (%): 36.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 60000
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1384
Unique (%): 13.8%

Sample

1st row: C70037
2nd row: N04066
3rd row: A30057
4th row: C20182
5th row: C30293

Value                Count  Frequency (%)
a30052               23     0.2%
a20024               22     0.2%
c60009               19     0.2%
a70039               19     0.2%
c60033               18     0.2%
a70036               17     0.2%
a40040               17     0.2%
c60032               17     0.2%
d10001               17     0.2%
c30127               16     0.2%
Other values (3650)  9815   98.2%

Most occurring characters

Value             Count  Frequency (%)
0                 18219  30.4%
1                 7456   12.4%
C                 6586   11.0%
2                 4939   8.2%
3                 3700   6.2%
4                 3378   5.6%
6                 3092   5.2%
7                 2944   4.9%
5                 2929   4.9%
A                 1824   3.0%
Other values (7)  4933   8.2%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    50178  83.6%
Uppercase Letter  9822   16.4%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
0      18219  36.3%
1      7456   14.9%
2      4939   9.8%
3      3700   7.4%
4      3378   6.7%
6      3092   6.2%
7      2944   5.9%
5      2929   5.8%
9      1762   3.5%
8      1759   3.5%

Uppercase Letter

Value  Count  Frequency (%)
C      6586   67.1%
A      1824   18.6%
N      553    5.6%
B      543    5.5%
E      129    1.3%
F      108    1.1%
D      79     0.8%

Most occurring scripts

Value   Count  Frequency (%)
Common  50178  83.6%
Latin   9822   16.4%

Most frequent character per script

Common

Value  Count  Frequency (%)
0      18219  36.3%
1      7456   14.9%
2      4939   9.8%
3      3700   7.4%
4      3378   6.7%
6      3092   6.2%
7      2944   5.9%
5      2929   5.8%
9      1762   3.5%
8      1759   3.5%

Latin

Value  Count  Frequency (%)
C      6586   67.1%
A      1824   18.6%
N      553    5.6%
B      543    5.5%
E      129    1.3%
F      108    1.1%
D      79     0.8%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  60000  100.0%

Most frequent character per block

ASCII

Value             Count  Frequency (%)
0                 18219  30.4%
1                 7456   12.4%
C                 6586   11.0%
2                 4939   8.2%
3                 3700   6.2%
4                 3378   5.6%
6                 3092   5.2%
7                 2944   4.9%
5                 2929   4.9%
A                 1824   3.0%
Other values (7)  4933   8.2%

수강반 (enrolled class section)
Categorical

IMBALANCE

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: B
2nd row: B
3rd row: A
4th row: A
5th row: A

Common Values

Value  Count  Frequency (%)
A      7154   71.5%
B      2668   26.7%
C      162    1.6%
D      14     0.1%
E      2      < 0.1%

Length

Histogram of lengths of the category

소속계열 (home division)
Real number (ℝ)

HIGH CORRELATION

Distinct: 19
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.2062
Minimum: 1
Maximum: 99
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 3
Median: 5
Q3: 7
95th percentile: 12
Maximum: 99
Range: 98
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 15.036253
Coefficient of variation (CV): 2.0865716
Kurtosis: 31.020162
Mean: 7.2062
Median Absolute Deviation (MAD): 2
Skewness: 5.6340396
Sum: 72062
Variance: 226.08889
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=19)
Common values

Value             Count  Frequency (%)
4                 1333   13.3%
6                 1333   13.3%
7                 1317   13.2%
3                 1310   13.1%
1                 1173   11.7%
2                 1152   11.5%
5                 1115   11.2%
11                508    5.1%
98                170    1.7%
12                110    1.1%
Other values (9)  479    4.8%

Minimum 10 values

Value  Count  Frequency (%)
1      1173   11.7%
2      1152   11.5%
3      1310   13.1%
4      1333   13.3%
5      1115   11.2%
6      1333   13.3%
7      1317   13.2%
8      95     0.9%
9      103    1.0%
10     20     0.2%

Maximum 10 values

Value  Count  Frequency (%)
99     6      0.1%
98     170    1.7%
97     83     0.8%
16     16     0.2%
15     34     0.3%
14     79     0.8%
13     43     0.4%
12     110    1.1%
11     508    5.1%
10     20     0.2%
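The value counts of 소속계열 and 과목계열 are almost identical, which fits the high-correlation alert raised for the pair. The sketch below compares the two count tables; the counts are copied from this report, and the variable names are illustrative. Counts alone only give a lower bound on how many rows differ, so this does not establish row-level agreement.

```python
# Value -> count for 과목계열, merged from its minimum/maximum value tables.
course = {
    1: 1173, 2: 1152, 3: 1310, 4: 1333, 5: 1115, 6: 1333, 7: 1317,
    8: 95, 9: 103, 10: 20, 11: 511, 12: 110, 13: 40, 14: 79,
    15: 34, 16: 16, 97: 83, 98: 170, 99: 6,
}
# 소속계열 shows the same counts except for the values 11 and 13.
home = {**course, 11: 508, 13: 43}

# Values whose counts differ between the two columns.
diff = {v: home[v] - course[v] for v in course if home[v] != course[v]}

# Lower bound on differing rows: half the total absolute change in counts.
min_rows_differing = sum(abs(d) for d in diff.values()) // 2

# Gap between the two column sums, to compare with the reported "Sum" values.
sum_gap = sum(v * c for v, c in home.items()) - sum(v * c for v, c in course.items())

print(diff, min_rows_differing, sum_gap)
```

The gap of 6 matches the difference between the reported sums (72062 and 72056), and is what three rows moving from value 11 to value 13 would produce.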

소속전공 (home major)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      10000  100.0%

Length

Histogram of lengths of the category

소속반 (home class section)
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: B
2nd row: B
3rd row: A
4th row: B
5th row: A

Common Values

Value  Count  Frequency (%)
A      5276   52.8%
B      4604   46.0%
C      119    1.2%
D      1      < 0.1%

Length

Histogram of lengths of the category

사용날짜 (date of use)
Date

MISSING

Distinct: 203
Distinct (%): 3.5%
Missing: 4253
Missing (%): 42.5%
Memory size: 156.2 KiB
Minimum: 2001-12-27 00:00:00
Maximum: 2020-09-02 00:00:00
Histogram with fixed size bins (bins=50)
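The percentages for 사용날짜 use two different denominators. The short check below suggests that the missing percentage is relative to all rows, while the distinct percentage only matches the reported 3.5% when computed over non-missing rows. That second point is an inference from the numbers in this report, not a documented rule.

```python
# Figures copied from the 사용날짜 block of this report.
n_rows = 10_000
n_missing = 4_253
n_distinct = 203

n_present = n_rows - n_missing  # rows that carry a date

missing_pct = 100 * n_missing / n_rows

# Assumption: distinct percentage is relative to non-missing rows only.
distinct_pct_present = 100 * n_distinct / n_present
# For comparison: the same count relative to all rows.
distinct_pct_all = 100 * n_distinct / n_rows

print(f"{missing_pct:.1f}% missing, "
      f"{distinct_pct_present:.1f}% distinct of non-missing, "
      f"{distinct_pct_all:.1f}% distinct of all rows")
```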

Interactions

[Figures: nine pairwise interaction plots; images not reproduced here]

Correlations

          년도   학기   과목계열  과목학년  수강반  소속계열  소속반
년도      1.000  0.000  0.370     0.120     0.179   0.370     0.131
학기      0.000  1.000  0.036     0.153     0.092   0.036     0.000
과목계열  0.370  0.036  1.000     0.221     0.147   1.000     0.093
과목학년  0.120  0.153  0.221     1.000     0.125   0.221     0.113
수강반    0.179  0.092  0.147     0.125     1.000   0.147     0.500
소속계열  0.370  0.036  1.000     0.221     0.147   1.000     0.093
소속반    0.131  0.000  0.093     0.113     0.500   0.093     1.000

          소속반  과목학년  학기   수강반
소속반    1.000   0.045     0.000  0.428
과목학년  0.045   1.000     0.102  0.102
학기      0.000   0.102     1.000  0.113
수강반    0.428   0.102     0.113  1.000

          년도   과목계열  소속계열  학기   과목학년  수강반  소속반
년도      1.000  0.160     0.160     0.071  0.073     0.076   0.078
과목계열  0.160  1.000     1.000     0.059  0.210     0.111   0.087
소속계열  0.160  1.000     1.000     0.059  0.210     0.111   0.087
학기      0.071  0.059     0.059     1.000  0.102     0.113   0.000
과목학년  0.073  0.210     0.210     0.102  1.000     0.102   0.045
수강반    0.076  0.111     0.111     0.113  0.102     1.000   0.428
소속반    0.078  0.087     0.087     0.000  0.045     0.428   1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

       년도  학기  과목계열  과목전공  과목학년  과목    수강반  소속계열  소속전공  소속반  사용날짜
1200   2003  1     7         0         2         C70037  B       7         0         B       2003-02-25
15684  2020  2     4         0         2         N04066  B       4         0         B       <NA>
14440  2019  2     3         0         1         A30057  A       3         0         A       <NA>
7219   2011  2     2         0         2         C20182  A       2         0         B       2011-08-22
11552  2016  2     3         0         2         C30293  A       3         0         A       <NA>
15746  2020  2     1         0         1         C01268  A       1         0         B       2020-08-19
135    2001  2     1         0         1         601106  A       1         0         A       2001-12-27
412    2002  1     1         0         1         C10003  B       1         0         B       2002-02-27
2662   2005  2     2         0         2         C20071  A       2         0         A       <NA>
12042  2017  1     7         0         2         C70164  A       7         0         A       <NA>

       년도  학기  과목계열  과목전공  과목학년  과목    수강반  소속계열  소속전공  소속반  사용날짜
11613  2016  2     4         0         1         C40237  A       4         0         B       2016-08-22
2513   2005  1     6         0         1         C60052  A       6         0         A       <NA>
15216  2020  1     11        0         2         N11032  A       11        0         B       2020-03-06
5231   2009  1     7         0         2         C70072  A       7         0         B       2009-02-23
15270  2020  1     6         0         2         A60067  A       6         0         B       2020-03-06
3526   2006  2     3         0         2         C30126  A       3         0         B       2006-07-12
1255   2003  2     1         0         2         C10021  A       1         0         B       2003-08-18
15037  2020  1     4         0         1         N04053  A       4         0         A       <NA>
7699   2012  1     4         0         2         A40021  B       4         0         B       2012-02-20
1132   2003  1     5         0         2         C50038  A       5         0         A       2003-02-25