Overview

Dataset statistics

Number of variables: 10
Number of observations: 623
Missing cells: 1887
Missing cells (%): 30.3%
Duplicate rows: 15
Duplicate rows (%): 2.4%
Total size in memory: 52.5 KiB
Average record size in memory: 86.2 B
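
The two percentages in this table follow from the raw counts. A minimal sketch of the arithmetic, using only the figures listed above (the variable names are illustrative, not part of the report):

```python
# Figures copied from the Dataset statistics table.
n_variables = 10
n_observations = 623
missing_cells = 1887
duplicate_rows = 15

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"total cells:    {total_cells}")
print(f"missing cells:  {missing_pct:.1f}%")
print(f"duplicate rows: {duplicate_pct:.1f}%")
```

Both results round to the values shown in the table (30.3% and 2.4%).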

Variable types

Categorical: 6
DateTime: 3
Boolean: 1

Dataset

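Description: Data from Gyeongnam Provincial Namhae College (경남도립남해대학), Gyeongsangnam-do, released under the mid-to-long-term data disclosure plan. It includes the year (년도), endorsed subject (표시과목), qualification criteria (자격기준), issue date (발급일), pass status (합격유무), practical-skills teacher certification application date (실기교사자격신청일), and submission date of the list of expected completers (이수예정자명부제출일), among other fields.
Author: Gyeongsangnam-do (경상남도)
URL: https://www.data.go.kr/data/15067554/fileData.do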

Alerts

자격기준(조) has constant value "21" (Constant)
자격기준(항) has constant value "2" (Constant)
자격기준(호) has constant value "2" (Constant)
합격유무 has constant value "True" (Constant)
Dataset has 15 (2.4%) duplicate rows (Duplicates)
년도 is highly imbalanced (63.2%) (Imbalance)
발급일 has 580 (93.1%) missing values (Missing)
합격유무 has 578 (92.8%) missing values (Missing)
실기교사자격신청일 has 351 (56.3%) missing values (Missing)
이수예정자명부제출일 has 378 (60.7%) missing values (Missing)
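
The four Missing alerts account for every missing cell in the dataset. The sketch below sums the per-column counts and recomputes each percentage against the 623 observations. The dictionary layout is an illustrative choice; the numbers are taken from the alerts above.

```python
# Per-column missing counts, copied from the Missing alerts.
n_observations = 623
missing_by_column = {
    "발급일": 580,
    "합격유무": 578,
    "실기교사자격신청일": 351,
    "이수예정자명부제출일": 378,
}

# Share of rows with no value, per column, rounded as in the report.
missing_pct = {
    column: round(100 * count / n_observations, 1)
    for column, count in missing_by_column.items()
}
total_missing = sum(missing_by_column.values())

for column, pct in missing_pct.items():
    print(f"{column}: {pct}%")
print(f"total missing cells: {total_missing}")
```

The total matches the "Missing cells" figure in the dataset statistics, so the other six columns contain no missing values.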

Reproduction

Analysis started: 2023-12-12 22:52:09.016281
Analysis finished: 2023-12-12 22:52:09.737022
Duration: 0.72 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
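
The reported duration is the difference between the two timestamps. A small check using the standard library, with the timestamps copied from the lines above:

```python
from datetime import datetime

# Timestamps as printed in the Reproduction section.
started = datetime.fromisoformat("2023-12-12 22:52:09.016281")
finished = datetime.fromisoformat("2023-12-12 22:52:09.737022")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```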

Variables

구분
Categorical

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      351    56.3%
2      272    43.7%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

년도
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 5.0 KiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2000
2nd row: 2000
3rd row: 2000
4th row: 2001
5th row: 2001

Common Values

Value  Count  Frequency (%)
2002   543    87.2%
2001   77     12.4%
2000   3      0.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]
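
The report does not say how the 63.2% imbalance figure for 년도 is computed. One formula that reproduces it from the counts above is one minus the normalized Shannon entropy of the class distribution. The sketch below assumes that definition; the function name `imbalance_score` is illustrative.

```python
from math import log

def imbalance_score(counts):
    """Return 1 - H / log(k) for k classes.

    0 means a perfectly uniform split; values near 1 mean a single
    class dominates. Assumed definition, not confirmed by the report.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    # Shannon entropy of the class proportions (natural log).
    entropy = -sum((c / total) * log(c / total) for c in counts if c > 0)
    return 1 - entropy / log(k)

# Value counts for 년도, copied from the Common Values table.
year_counts = {"2002": 543, "2001": 77, "2000": 3}
score = imbalance_score(list(year_counts.values()))
print(f"{score:.1%}")
```

With 543 of 623 rows in a single year, the entropy is only about 37% of its maximum for three classes, which gives the high score.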

표시과목
Categorical

Distinct: 3
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 5.0 KiB

Length

Max length: 2
Median length: 1
Mean length: 1.1653291
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 11
2nd row: 11
3rd row: 11
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      375    60.2%
2      145    23.3%
11     103    16.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

자격기준(조)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 5.0 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 21
2nd row: 21
3rd row: 21
4th row: 21
5th row: 21

Common Values

Value  Count  Frequency (%)
21     623    100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

자격기준(항)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 5.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value  Count  Frequency (%)
2      623    100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

자격기준(호)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 5.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value  Count  Frequency (%)
2      623    100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

발급일
Date

MISSING 

Distinct: 5
Distinct (%): 11.6%
Missing: 580
Missing (%): 93.1%
Memory size: 5.0 KiB
Minimum: 2001-12-01 00:00:00
Maximum: 2002-05-30 00:00:00

[Figure: Histogram with fixed size bins (bins=5)]

합격유무
Boolean

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 2.2%
Missing: 578
Missing (%): 92.8%
Memory size: 1.3 KiB

Value      Count  Frequency (%)
True       45     7.2%
(Missing)  578    92.8%

[Figure: bar chart of the value counts]

실기교사자격신청일
Date

MISSING 

Distinct: 4
Distinct (%): 1.5%
Missing: 351
Missing (%): 56.3%
Memory size: 5.0 KiB
Minimum: 2001-12-01 00:00:00
Maximum: 2002-03-20 00:00:00

[Figure: Histogram with fixed size bins (bins=4)]

이수예정자명부제출일
Date

MISSING 

Distinct: 2
Distinct (%): 0.8%
Missing: 378
Missing (%): 60.7%
Memory size: 5.0 KiB
Minimum: 2002-01-07 00:00:00
Maximum: 2002-03-20 00:00:00

[Figure: Histogram with fixed size bins (bins=2)]

Correlations

[Figure: correlation heatmap]

(variable)  구분  년도  표시과목  발급일  실기교사자격신청일  이수예정자명부제출일
구분  1.000  0.243  0.240  1.000  NaN  NaN
년도  0.243  1.000  0.549  1.000  1.000  NaN
표시과목  0.240  0.549  1.000  NaN  0.746  1.000
발급일  1.000  1.000  NaN  1.000  1.000  NaN
실기교사자격신청일  NaN  1.000  0.746  1.000  1.000  NaN
이수예정자명부제출일  NaN  NaN  1.000  NaN  NaN  1.000

[Figure: correlation heatmap]

(variable)  구분  년도  표시과목
구분  1.000  0.397  0.392
년도  0.397  1.000  0.235
표시과목  0.392  0.235  1.000
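
The report does not name the association measure behind these matrices. For pairs of categorical columns such as 구분, 년도 and 표시과목, Cramér's V is a common choice, so the sketch below assumes it. The function name `cramers_v` and the toy inputs are hypothetical and are not taken from the dataset.

```python
from collections import Counter
from math import sqrt

def cramers_v(xs, ys):
    """Cramér's V (without bias correction) for two categorical sequences."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    joint = Counter(zip(xs, ys))
    k = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or k == 0:
        return float("nan")  # undefined when either variable is constant
    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return sqrt(chi2 / (n * k))

# Hypothetical toy inputs: one perfectly associated pair, one independent pair.
perfect = cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"])
independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(perfect, independent)
```

A value of 1.000 in a matrix like the ones above would then mean that one column fully determines the other on the rows where both are present, and NaN would mean the measure is undefined for that pair.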

Missing values

[Figure: nullity count] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure: nullity correlation heatmap] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
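
Nullity correlation can be illustrated as the Pearson correlation between presence indicators of two columns. The sketch below assumes that definition and uses only the ten first rows shown in the Sample section, so its result will not match the heatmap computed over all 623 rows. The helper `pearson` is illustrative.

```python
from math import sqrt

# Presence flags (1 = value present, 0 = <NA>) for sample rows 0-9 only.
present = {
    "발급일": [0, 0, 0, 1, 0, 1, 0, 0, 0, 0],
    "합격유무": [0, 0, 0, 1, 0, 1, 0, 1, 1, 0],
    "실기교사자격신청일": [0, 0, 0, 0, 0, 0, 1, 1, 1, 0],
    "이수예정자명부제출일": [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
}

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Positive: 합격유무 tends to be filled in when 발급일 is.
r_issue_pass = pearson(present["발급일"], present["합격유무"])
# Negative: these two date columns are never both present in the sample rows.
r_apply_roster = pearson(present["실기교사자격신청일"], present["이수예정자명부제출일"])
print(round(r_issue_pass, 3), round(r_apply_roster, 3))
```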

Sample

First rows

(index)  구분  년도  표시과목  자격기준(조)  자격기준(항)  자격기준(호)  발급일  합격유무  실기교사자격신청일  이수예정자명부제출일
0  1  2000  11  21  2  2  <NA>  <NA>  <NA>  <NA>
1  1  2000  11  21  2  2  <NA>  <NA>  <NA>  <NA>
2  1  2000  11  21  2  2  <NA>  <NA>  <NA>  <NA>
3  1  2001  1  21  2  2  2001-12-12  Y  <NA>  <NA>
4  1  2001  1  21  2  2  <NA>  <NA>  <NA>  <NA>
5  1  2001  1  21  2  2  2001-12-12  Y  <NA>  <NA>
6  2  2001  1  21  2  2  <NA>  <NA>  2001-12-11  <NA>
7  2  2001  1  21  2  2  <NA>  Y  2001-12-11  <NA>
8  2  2001  1  21  2  2  <NA>  Y  2001-12-11  <NA>
9  1  2002  1  21  2  2  <NA>  <NA>  <NA>  2002-03-20

Last rows

(index)  구분  년도  표시과목  자격기준(조)  자격기준(항)  자격기준(호)  발급일  합격유무  실기교사자격신청일  이수예정자명부제출일
613  2  2002  1  21  2  2  <NA>  <NA>  2002-03-20  <NA>
614  2  2002  1  21  2  2  <NA>  <NA>  2002-03-20  <NA>
615  2  2002  1  21  2  2  <NA>  <NA>  2002-03-20  <NA>
616  2  2002  1  21  2  2  <NA>  <NA>  2002-03-20  <NA>
617  2  2002  1  21  2  2  <NA>  <NA>  2002-03-20  <NA>
618  2  2002  1  21  2  2  <NA>  <NA>  2002-03-20  <NA>
619  2  2002  1  21  2  2  <NA>  <NA>  2002-03-20  <NA>
620  2  2002  1  21  2  2  <NA>  <NA>  2002-03-20  <NA>
621  2  2002  1  21  2  2  <NA>  <NA>  2002-03-20  <NA>
622  2  2002  1  21  2  2  <NA>  <NA>  2002-03-20  <NA>

Duplicate rows

Most frequently occurring

(index)  구분  년도  표시과목  자격기준(조)  자격기준(항)  자격기준(호)  발급일  합격유무  실기교사자격신청일  이수예정자명부제출일  # duplicates
3  1  2002  1  21  2  2  <NA>  <NA>  <NA>  2002-03-20  158
4  1  2002  2  21  2  2  <NA>  <NA>  <NA>  <NA>  76
5  1  2002  11  21  2  2  <NA>  <NA>  <NA>  2002-01-07  76
12  2  2002  1  21  2  2  <NA>  <NA>  2002-01-07  <NA>  70
14  2  2002  2  21  2  2  <NA>  <NA>  2002-01-07  <NA>  69
9  2  2001  1  21  2  2  <NA>  <NA>  2001-12-01  <NA>  54
13  2  2002  1  21  2  2  <NA>  <NA>  2002-03-20  <NA>  46
6  1  2002  11  21  2  2  <NA>  <NA>  <NA>  <NA>  24
7  2  2001  1  21  2  2  2001-12-01  Y  2001-12-01  <NA>  17
2  1  2002  1  21  2  2  2002-05-30  Y  <NA>  2002-03-20  11