Overview

Dataset statistics

Number of variables: 7
Number of observations: 116
Missing cells: 165
Missing cells (%): 20.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.6 KiB
Average record size in memory: 58.1 B
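
The missing-cell percentage can be checked from the other figures in this block. The sketch below assumes the percentage is missing cells divided by variables times observations, which matches the reported value; the function name is illustrative and not part of ydata-profiling.

```python
def missing_cell_percentage(missing_cells: int, n_variables: int, n_observations: int) -> float:
    """Return missing cells as a percentage of all cells in the table."""
    total_cells = n_variables * n_observations
    if total_cells == 0:
        raise ValueError("the table has no cells")
    return 100.0 * missing_cells / total_cells


# Figures taken from the dataset statistics above.
pct = missing_cell_percentage(missing_cells=165, n_variables=7, n_observations=116)
print(f"{pct:.1f}%")  # 20.3%
```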

Variable types

Text: 2
Boolean: 2
Numeric: 1
Categorical: 2

Dataset

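Description: Data from Gyeongnam Provincial Geochang College (경남도립거창대학), Gyeongsangnam-do, published under the mid- to long-term data release plan. It includes fields such as program name (프로그램명), group flag (그룹여부), sort order (정렬순서), description (설명), and in-use flag (사용여부).
URL: https://www.data.go.kr/data/15066696/fileData.do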

Alerts

생성일시 is highly overall correlated with 정렬순서 and 2 other fields [High correlation]
수정일시 is highly overall correlated with 정렬순서 and 2 other fields [High correlation]
사용여부 is highly overall correlated with 그룹여부 and 2 other fields [High correlation]
그룹여부 is highly overall correlated with 사용여부 [High correlation]
정렬순서 is highly overall correlated with 생성일시 and 1 other field [High correlation]
그룹여부 is highly imbalanced (88.1%) [Imbalance]
사용여부 is highly imbalanced (82.7%) [Imbalance]
그룹여부 has 54 (46.6%) missing values [Missing]
설명 has 111 (95.7%) missing values [Missing]
정렬순서 has 13 (11.2%) zeros [Zeros]
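
The two imbalance percentages are consistent with a score of one minus the normalized Shannon entropy of the class counts. The sketch below is an assumption about how the report derives them, checked only against the counts given for 그룹여부 (61 True, 1 False, missing values excluded) and 사용여부 (113 True, 3 False); `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """1 - (Shannon entropy / log2(number of classes)).

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    counts = [c for c in counts if c > 0]
    if len(counts) < 2:
        raise ValueError("need at least two non-empty classes")
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts)
    return 1.0 - entropy / math.log2(len(counts))


# Class counts from the variable sections of this report.
score_group = imbalance_score([61, 1])    # 그룹여부, missing rows excluded
score_in_use = imbalance_score([113, 3])  # 사용여부
print(f"그룹여부: {score_group:.1%}")  # 88.1%
print(f"사용여부: {score_in_use:.1%}")  # 82.7%
```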

Reproduction

Analysis started: 2023-12-12 07:20:20.947459
Analysis finished: 2023-12-12 07:20:21.753130
Duration: 0.81 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
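
The reported duration follows from the two timestamps. A minimal check with the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2023-12-12 07:20:20.947459")
finished = datetime.fromisoformat("2023-12-12 07:20:21.753130")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")  # 0.81 seconds
```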

Variables

프로그램명
Text

Distinct: 115
Distinct (%): 99.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

Length

Max length: 16
Median length: 13
Mean length: 7.4137931
Min length: 3

Characters and Unicode

Total characters: 860
Distinct characters: 152
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 114
Unique (%): 98.3%

Sample

1st row: 권한 관리
2nd row: 재적학생수
3rd row: 연령별 학생현황
4th row: 연령별 졸업자
5th row: 자격증 취득현황
Value | Count | Frequency (%)
현황 | 16 | 6.6%
관리 | 10 | 4.1%
찾기 | 7 | 2.9%
학생 | 6 | 2.5%
외국인 | 6 | 2.5%
졸업생 | 6 | 2.5%
자격증 | 5 | 2.1%
연령별 | 5 | 2.1%
 | 5 | 2.1%
분포 | 4 | 1.7%
Other values (133) | 171 | 71.0%

Most occurring characters

Value | Count | Frequency (%)
(space) | 125 | 14.5%
 | 44 | 5.1%
 | 35 | 4.1%
 | 32 | 3.7%
 | 30 | 3.5%
 | 29 | 3.4%
 | 21 | 2.4%
 | 18 | 2.1%
 | 18 | 2.1%
 | 18 | 2.1%
Other values (142) | 490 | 57.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 720 | 83.7%
Space Separator | 125 | 14.5%
Decimal Number | 8 | 0.9%
Close Punctuation | 3 | 0.3%
Open Punctuation | 3 | 0.3%
Other Punctuation | 1 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 44 | 6.1%
 | 35 | 4.9%
 | 32 | 4.4%
 | 30 | 4.2%
 | 29 | 4.0%
 | 21 | 2.9%
 | 18 | 2.5%
 | 18 | 2.5%
 | 18 | 2.5%
 | 16 | 2.2%
Other values (131) | 459 | 63.7%

Decimal Number
Value | Count | Frequency (%)
1 | 2 | 25.0%
2 | 1 | 12.5%
3 | 1 | 12.5%
4 | 1 | 12.5%
5 | 1 | 12.5%
6 | 1 | 12.5%
7 | 1 | 12.5%

Space Separator
Value | Count | Frequency (%)
(space) | 125 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 3 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 3 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 720 | 83.7%
Common | 140 | 16.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 44 | 6.1%
 | 35 | 4.9%
 | 32 | 4.4%
 | 30 | 4.2%
 | 29 | 4.0%
 | 21 | 2.9%
 | 18 | 2.5%
 | 18 | 2.5%
 | 18 | 2.5%
 | 16 | 2.2%
Other values (131) | 459 | 63.7%

Common
Value | Count | Frequency (%)
(space) | 125 | 89.3%
) | 3 | 2.1%
( | 3 | 2.1%
1 | 2 | 1.4%
2 | 1 | 0.7%
3 | 1 | 0.7%
4 | 1 | 0.7%
5 | 1 | 0.7%
6 | 1 | 0.7%
7 | 1 | 0.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 720 | 83.7%
ASCII | 140 | 16.3%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 125 | 89.3%
) | 3 | 2.1%
( | 3 | 2.1%
1 | 2 | 1.4%
2 | 1 | 0.7%
3 | 1 | 0.7%
4 | 1 | 0.7%
5 | 1 | 0.7%
6 | 1 | 0.7%
7 | 1 | 0.7%

Hangul
Value | Count | Frequency (%)
 | 44 | 6.1%
 | 35 | 4.9%
 | 32 | 4.4%
 | 30 | 4.2%
 | 29 | 4.0%
 | 21 | 2.9%
 | 18 | 2.5%
 | 18 | 2.5%
 | 18 | 2.5%
 | 16 | 2.2%
Other values (131) | 459 | 63.7%

그룹여부
Boolean

HIGH CORRELATION  IMBALANCE  MISSING 

Distinct: 2
Distinct (%): 3.2%
Missing: 54
Missing (%): 46.6%
Memory size: 364.0 B

Value | Count | Frequency (%)
True | 61 | 52.6%
False | 1 | 0.9%
(Missing) | 54 | 46.6%
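
The percentages for 그룹여부 use two different denominators: the frequency table divides by all 116 rows, while Distinct (%) is consistent with dividing by the 62 non-missing rows only. The sketch below reproduces both under that assumption; `frequency_table` is an illustrative helper, not a ydata-profiling function.

```python
def frequency_table(counts, n_missing=0):
    """Return (label, count, percent) rows, with percentages over all rows including missing."""
    total = sum(counts.values()) + n_missing
    rows = [(label, n, 100.0 * n / total) for label, n in counts.items()]
    if n_missing:
        rows.append(("(Missing)", n_missing, 100.0 * n_missing / total))
    return rows


# Counts copied from the 그룹여부 table.
rows = frequency_table({"True": 61, "False": 1}, n_missing=54)
for label, n, pct in rows:
    print(f"{label} | {n} | {pct:.1f}%")

# Distinct (%) appears to be relative to non-missing rows: 2 distinct values / 62 rows.
distinct_pct = 100.0 * 2 / (61 + 1)
print(f"Distinct (%): {distinct_pct:.1f}%")
```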

정렬순서
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 36
Distinct (%): 31.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.698276
Minimum: 0
Maximum: 99
Zeros: 13
Zeros (%): 11.2%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 3
Median: 7.5
Q3: 16
95-th percentile: 94.25
Maximum: 99
Range: 99
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 23.515848
Coefficient of variation (CV): 1.5999052
Kurtosis: 7.6747893
Mean: 14.698276
Median Absolute Deviation (MAD): 5.5
Skewness: 2.9058339
Sum: 1705
Variance: 552.99513
Monotonicity: Not monotonic
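
Several of the statistics for 정렬순서 are derived from one another, so they can be cross-checked without the raw data. The sketch below uses only the reported figures; small differences in the last digits come from the report's rounding of the inputs.

```python
# Reported values for 정렬순서.
mean = 14.698276
std = 23.515848
q1, q3 = 3, 16
minimum, maximum = 0, 99
n_observations = 116

cv = std / mean               # coefficient of variation
variance = std ** 2           # variance is the squared standard deviation
iqr = q3 - q1                 # interquartile range
value_range = maximum - minimum
total = mean * n_observations  # sum of all values

print(f"CV={cv:.7f} variance={variance:.5f} IQR={iqr} range={value_range} sum={total:.0f}")
```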
Histogram with fixed size bins (bins=36)
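Common values
Value | Count | Frequency (%)
0 | 13 | 11.2%
3 | 8 | 6.9%
4 | 7 | 6.0%
1 | 7 | 6.0%
2 | 7 | 6.0%
5 | 6 | 5.2%
6 | 5 | 4.3%
7 | 5 | 4.3%
8 | 5 | 4.3%
9 | 5 | 4.3%
Other values (26) | 48 | 41.4%

Minimum 10 values
Value | Count | Frequency (%)
0 | 13 | 11.2%
1 | 7 | 6.0%
2 | 7 | 6.0%
3 | 8 | 6.9%
4 | 7 | 6.0%
5 | 6 | 5.2%
6 | 5 | 4.3%
7 | 5 | 4.3%
8 | 5 | 4.3%
9 | 5 | 4.3%

Maximum 10 values
Value | Count | Frequency (%)
99 | 2 | 1.7%
98 | 1 | 0.9%
97 | 1 | 0.9%
96 | 1 | 0.9%
95 | 1 | 0.9%
94 | 1 | 0.9%
93 | 1 | 0.9%
29 | 1 | 0.9%
27 | 1 | 0.9%
26 | 1 | 0.9%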

설명
Text

MISSING 

Distinct: 4
Distinct (%): 80.0%
Missing: 111
Missing (%): 95.7%
Memory size: 1.0 KiB

Length

Max length: 16
Median length: 13
Mean length: 13.6
Min length: 10

Characters and Unicode

Total characters: 68
Distinct characters: 30
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
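
Because 설명 has only five non-missing values, all of which appear in this variable's Sample, the character statistics can be reproduced with the standard library. The sketch below counts Unicode general categories with `unicodedata.category`; it assumes the quotes in the data are plain ASCII apostrophes, which agrees with the ASCII block counts in this section. Script and block properties are not available in the standard library, so only categories are shown.

```python
import unicodedata
from collections import Counter

# The five non-missing 설명 values, copied from the Sample list.
values = [
    "정원내/정원외 재적학생수",
    "공통화면으로 사용여부는 'N'",
    "공통화면으로 사용여부는 'N'",
    "취업정보 보기/추가/수정",
    "학적정보 보기/수정",
]

# General category of every character: Lo = Other Letter, Po = Other Punctuation,
# Zs = Space Separator, Lu = Uppercase Letter.
category_counts = Counter(unicodedata.category(ch) for value in values for ch in value)
total_characters = sum(category_counts.values())
distinct_characters = len({ch for value in values for ch in value})

print(dict(category_counts), total_characters, distinct_characters)
```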

Unique

Unique: 3
Unique (%): 60.0%

Sample

1st row: 정원내/정원외 재적학생수
2nd row: 공통화면으로 사용여부는 'N'
3rd row: 공통화면으로 사용여부는 'N'
4th row: 취업정보 보기/추가/수정
5th row: 학적정보 보기/수정
Value | Count | Frequency (%)
공통화면으로 | 2 | 16.7%
사용여부는 | 2 | 16.7%
n | 2 | 16.7%
정원내/정원외 | 1 | 8.3%
재적학생수 | 1 | 8.3%
취업정보 | 1 | 8.3%
보기/추가/수정 | 1 | 8.3%
학적정보 | 1 | 8.3%
보기/수정 | 1 | 8.3%
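
The word table above is consistent with a simple tokenization: lowercase each value, split on whitespace, and strip punctuation from the ends of each token (so `'N'` becomes `n` while the slashes inside `보기/수정` survive). The sketch below is an assumption that reproduces the reported counts, not a statement of the tool's exact implementation.

```python
import string
from collections import Counter

# The five non-missing 설명 values, copied from the Sample list.
values = [
    "정원내/정원외 재적학생수",
    "공통화면으로 사용여부는 'N'",
    "공통화면으로 사용여부는 'N'",
    "취업정보 보기/추가/수정",
    "학적정보 보기/수정",
]

words = Counter()
for value in values:
    for token in value.lower().split():
        token = token.strip(string.punctuation)  # trims only leading/trailing punctuation
        if token:
            words[token] += 1

total_words = sum(words.values())
for word, count in words.most_common():
    print(f"{word} | {count} | {100.0 * count / total_words:.1f}%")
```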

Most occurring characters

Value | Count | Frequency (%)
(space) | 7 | 10.3%
 | 6 | 8.8%
' | 4 | 5.9%
 | 4 | 5.9%
/ | 4 | 5.9%
 | 3 | 4.4%
 | 2 | 2.9%
 | 2 | 2.9%
 | 2 | 2.9%
 | 2 | 2.9%
Other values (20) | 32 | 47.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 51 | 75.0%
Other Punctuation | 8 | 11.8%
Space Separator | 7 | 10.3%
Uppercase Letter | 2 | 2.9%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 6 | 11.8%
 | 4 | 7.8%
 | 3 | 5.9%
 | 2 | 3.9%
 | 2 | 3.9%
 | 2 | 3.9%
 | 2 | 3.9%
 | 2 | 3.9%
 | 2 | 3.9%
 | 2 | 3.9%
Other values (16) | 24 | 47.1%

Other Punctuation
Value | Count | Frequency (%)
' | 4 | 50.0%
/ | 4 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 7 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
N | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 51 | 75.0%
Common | 15 | 22.1%
Latin | 2 | 2.9%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 6 | 11.8%
 | 4 | 7.8%
 | 3 | 5.9%
 | 2 | 3.9%
 | 2 | 3.9%
 | 2 | 3.9%
 | 2 | 3.9%
 | 2 | 3.9%
 | 2 | 3.9%
 | 2 | 3.9%
Other values (16) | 24 | 47.1%

Common
Value | Count | Frequency (%)
(space) | 7 | 46.7%
' | 4 | 26.7%
/ | 4 | 26.7%

Latin
Value | Count | Frequency (%)
N | 2 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 51 | 75.0%
ASCII | 17 | 25.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 7 | 41.2%
' | 4 | 23.5%
/ | 4 | 23.5%
N | 2 | 11.8%

Hangul
Value | Count | Frequency (%)
 | 6 | 11.8%
 | 4 | 7.8%
 | 3 | 5.9%
 | 2 | 3.9%
 | 2 | 3.9%
 | 2 | 3.9%
 | 2 | 3.9%
 | 2 | 3.9%
 | 2 | 3.9%
 | 2 | 3.9%
Other values (16) | 24 | 47.1%

사용여부
Boolean

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 248.0 B

Value | Count | Frequency (%)
True | 113 | 97.4%
False | 3 | 2.6%

생성일시
Categorical

HIGH CORRELATION 

Distinct: 30
Distinct (%): 25.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 12
Unique (%): 10.3%

Sample

1st row: 2011-03-03
2nd row: 2011-03-04
3rd row: 2011-03-04
4th row: 2011-03-04
5th row: 2011-03-05

Common Values

Value | Count | Frequency (%)
2011-03-03 | 36 | 31.0%
2011-03-05 | 8 | 6.9%
2015-06-09 | 8 | 6.9%
2011-04-06 | 6 | 5.2%
2011-03-09 | 5 | 4.3%
2011-03-16 | 5 | 4.3%
2011-03-17 | 5 | 4.3%
2011-03-10 | 5 | 4.3%
2011-03-22 | 4 | 3.4%
2011-03-04 | 3 | 2.6%
Other values (20) | 31 | 26.7%

Length

Histogram of lengths of the category

수정일시
Categorical

HIGH CORRELATION 

Distinct: 30
Distinct (%): 25.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 12
Unique (%): 10.3%

Sample

1st row: 2011-03-03
2nd row: 2011-03-04
3rd row: 2011-03-04
4th row: 2011-03-04
5th row: 2011-03-05

Common Values

Value | Count | Frequency (%)
2011-03-03 | 36 | 31.0%
2011-03-05 | 8 | 6.9%
2015-06-09 | 8 | 6.9%
2011-04-06 | 6 | 5.2%
2011-03-09 | 5 | 4.3%
2011-03-16 | 5 | 4.3%
2011-03-17 | 5 | 4.3%
2011-03-10 | 5 | 4.3%
2011-03-22 | 4 | 3.4%
2011-03-04 | 3 | 2.6%
Other values (20) | 31 | 26.7%

Length

Histogram of lengths of the category

Interactions


Correlations

 | 그룹여부 | 정렬순서 | 설명 | 사용여부 | 생성일시 | 수정일시
그룹여부 | 1.000 | 0.000 | NaN | NaN | 0.618 | 0.618
정렬순서 | 0.000 | 1.000 | NaN | 0.000 | 0.913 | 0.913
설명 | NaN | NaN | 1.000 | NaN | 1.000 | 1.000
사용여부 | NaN | 0.000 | NaN | 1.000 | 0.811 | 0.811
생성일시 | 0.618 | 0.913 | 1.000 | 0.811 | 1.000 | 1.000
수정일시 | 0.618 | 0.913 | 1.000 | 0.811 | 1.000 | 1.000

 | 생성일시 | 수정일시 | 사용여부 | 그룹여부
생성일시 | 1.000 | 1.000 | 0.583 | 0.483
수정일시 | 1.000 | 1.000 | 0.583 | 0.483
사용여부 | 0.583 | 0.583 | 1.000 | 1.000
그룹여부 | 0.483 | 0.483 | 1.000 | 1.000

 | 정렬순서 | 그룹여부 | 사용여부 | 생성일시 | 수정일시
정렬순서 | 1.000 | 0.000 | 0.000 | 0.643 | 0.643
그룹여부 | 0.000 | 1.000 | 1.000 | 0.483 | 0.483
사용여부 | 0.000 | 1.000 | 1.000 | 0.583 | 0.583
생성일시 | 0.643 | 0.483 | 0.583 | 1.000 | 1.000
수정일시 | 0.643 | 0.483 | 0.583 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
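
Nullity correlation can be read as the Pearson correlation between the "is missing" indicators of two columns. The sketch below illustrates this on the first ten rows shown in the Sample section only, so its result is not the value plotted in the report's heatmap, which covers all 116 rows. The `pearson` helper and the variable names are illustrative.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# 그룹여부 and 설명 cells of rows 0-9 in the Sample section; None stands for <NA>.
group_flag = [None, "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y"]
description = [None, "정원내/정원외 재적학생수", None, None, None,
               None, None, None, None, "공통화면으로 사용여부는 'N'"]

# 1 where the cell is missing, 0 where it is present.
is_null_group = [int(v is None) for v in group_flag]
is_null_description = [int(v is None) for v in description]

nullity_corr = pearson(is_null_group, is_null_description)
print(f"{nullity_corr:.3f}")
```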

Sample

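First rows
Row | 프로그램명 | 그룹여부 | 정렬순서 | 설명 | 사용여부 | 생성일시 | 수정일시
0 | 권한 관리 | <NA> | 4 | <NA> | Y | 2011-03-03 | 2011-03-03
1 | 재적학생수 | Y | 1 | 정원내/정원외 재적학생수 | Y | 2011-03-04 | 2011-03-04
2 | 연령별 학생현황 | Y | 3 | <NA> | Y | 2011-03-04 | 2011-03-04
3 | 연령별 졸업자 | Y | 7 | <NA> | Y | 2011-03-04 | 2011-03-04
4 | 자격증 취득현황 | Y | 8 | <NA> | Y | 2011-03-05 | 2011-03-05
5 | 교원자격증 발급현황 | Y | 9 | <NA> | Y | 2011-03-05 | 2011-03-05
6 | 재적 학생 현황 | Y | 9 | <NA> | Y | 2011-03-05 | 2011-03-05
7 | 졸업생 현황 | Y | 15 | <NA> | Y | 2011-03-05 | 2011-03-05
8 | 장학금급여 및 학비감면상황 | Y | 12 | <NA> | Y | 2011-03-06 | 2011-03-06
9 | 우편번호 찾기 | Y | 0 | 공통화면으로 사용여부는 'N' | Y | 2011-03-03 | 2011-03-03

Last rows
Row | 프로그램명 | 그룹여부 | 정렬순서 | 설명 | 사용여부 | 생성일시 | 수정일시
106 | 상담2 | <NA> | 98 | <NA> | Y | 2015-06-09 | 2015-06-09
107 | 상담3 | <NA> | 97 | <NA> | Y | 2015-06-09 | 2015-06-09
108 | 상담4 | <NA> | 96 | <NA> | Y | 2015-06-09 | 2015-06-09
109 | 상담5 | <NA> | 95 | <NA> | Y | 2015-06-09 | 2015-06-09
110 | 상담6 | <NA> | 94 | <NA> | Y | 2015-06-09 | 2015-06-09
111 | 상담7 | <NA> | 93 | <NA> | Y | 2015-06-09 | 2015-06-09
112 | 제증명발급대장 | <NA> | 5 | <NA> | Y | 2015-06-09 | 2015-06-09
113 | 학적관리 | <NA> | 16 | <NA> | Y | 2015-06-11 | 2015-06-11
114 | 경남지역별재적생현황 | Y | 27 | <NA> | Y | 2015-06-11 | 2015-06-11
115 | 제적생및재적생현황 | Y | 29 | <NA> | Y | 2015-06-12 | 2015-06-12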