Overview

Dataset statistics

Number of variables: 5
Number of observations: 5287
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 386
Duplicate rows (%): 7.3%
Total size in memory: 232.5 KiB
Average record size in memory: 45.0 B
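
The two derived figures in this block follow from the raw counts by simple arithmetic. The sketch below recomputes them from the numbers printed above; the variable names are illustrative, and it assumes 1 KiB = 1024 bytes.

```python
# Figures copied from the "Dataset statistics" block.
n_rows = 5287
n_duplicate_rows = 386
total_size_kib = 232.5

# Share of duplicate rows, in percent.
duplicate_pct = 100 * n_duplicate_rows / n_rows

# Average record size in bytes, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```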

Variable types

Numeric: 3
Categorical: 2

Dataset

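Description: Provides data items on university library human resources, that is, personnel information about university library staff (such as concurrent positions).
Author: 한국교육학술정보원 (Korea Education and Research Information Service, KERIS)
URL: https://www.data.go.kr/data/15071920/fileData.do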

Alerts

Dataset has 386 (7.3%) duplicate rows [Duplicates]
4.2 직원_정규직_겸직_합계 is highly imbalanced (93.1%) [Imbalance]
4.2 직원_비정규직_겸직_합계 is highly imbalanced (98.3%) [Imbalance]
4.2 직원_정규직_전담_합계 is highly skewed (γ1 = 23.78594779) [Skewed]
4.2 직원_정규직_전담_합계 has 4918 (93.0%) zeros [Zeros]
직원-합계-계 has 154 (2.9%) zeros [Zeros]
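
The report does not say how the imbalance percentages are derived. One formula that matches the 93.1% and 98.3% alerts is one minus the Shannon entropy of the category counts, normalised by the maximum entropy for that number of categories. The sketch below applies it to the category counts listed in the two categorical variable sections. The function name `imbalance_score` is illustrative, and the formula is an assumption checked only against these two values.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the entropy (in bits) of the counts
    and k is the number of categories. 0 means balanced, 1 means one class only."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Category counts for values 0, 1, 2, 4 as listed in the report.
regular = imbalance_score([5194, 81, 11, 1])      # 4.2 직원_정규직_겸직_합계
non_regular = imbalance_score([5270, 13, 3, 1])   # 4.2 직원_비정규직_겸직_합계

print(f"{regular:.1%}, {non_regular:.1%}")
```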

Reproduction

Analysis started: 2023-12-12 15:11:42.380787
Analysis finished: 2023-12-12 15:11:43.957462
Duration: 1.58 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
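
The reported duration is the difference between the two timestamps. A minimal check, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the "Reproduction" block.
started = datetime.strptime("2023-12-12 15:11:42.380787", FMT)
finished = datetime.strptime("2023-12-12 15:11:43.957462", FMT)

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```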

Variables

4.2 직원_정규직_전담_합계
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 35
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.38840552
Minimum: 0
Maximum: 131
Zeros: 4918
Zeros (%): 93.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 2
Maximum: 131
Range: 131
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 2.8059366
Coefficient of variation (CV): 7.2242449
Kurtosis: 935.66296
Mean: 0.38840552
Median Absolute Deviation (MAD): 0
Skewness: 23.785948
Sum: 2053.5
Variance: 7.8732802
Monotonicity: Not monotonic
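
Several of these statistics are tied together by simple identities: the mean is the sum divided by the number of observations, the variance is the square of the standard deviation, and the coefficient of variation is the standard deviation divided by the mean. The sketch below checks the values reported for 4.2 직원_정규직_전담_합계 against each other; the variable names are illustrative.

```python
# Values copied from the report for 4.2 직원_정규직_전담_합계.
n = 5287           # number of observations
total = 2053.5     # Sum
std = 2.8059366    # Standard deviation
zeros = 4918       # Zeros

mean = total / n               # should match the reported Mean
variance = std ** 2            # should match the reported Variance
cv = std / mean                # should match the reported CV
zeros_pct = 100 * zeros / n    # should match the reported Zeros (%)

print(f"mean={mean:.5f} variance={variance:.4f} cv={cv:.4f} zeros={zeros_pct:.1f}%")
```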
Histogram with fixed size bins (bins=35)

Common values
Value | Count | Frequency (%)
0.0 | 4918 | 93.0%
1.0 | 95 | 1.8%
2.0 | 61 | 1.2%
3.0 | 49 | 0.9%
4.0 | 40 | 0.8%
5.0 | 22 | 0.4%
6.0 | 19 | 0.4%
9.0 | 12 | 0.2%
7.0 | 9 | 0.2%
11.0 | 9 | 0.2%
Other values (25) | 53 | 1.0%

Minimum 10 values
Value | Count | Frequency (%)
0.0 | 4918 | 93.0%
0.5 | 1 | < 0.1%
1.0 | 95 | 1.8%
2.0 | 61 | 1.2%
3.0 | 49 | 0.9%
4.0 | 40 | 0.8%
5.0 | 22 | 0.4%
6.0 | 19 | 0.4%
7.0 | 9 | 0.2%
8.0 | 5 | 0.1%

Maximum 10 values
Value | Count | Frequency (%)
131.0 | 1 | < 0.1%
44.0 | 1 | < 0.1%
40.0 | 1 | < 0.1%
35.0 | 2 | < 0.1%
33.0 | 1 | < 0.1%
32.0 | 1 | < 0.1%
30.5 | 1 | < 0.1%
28.0 | 1 | < 0.1%
27.0 | 1 | < 0.1%
26.0 | 2 | < 0.1%

4.2 직원_정규직_겸직_합계
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 41.4 KiB

Value | Count
0 | 5194
1 | 81
2 | 11
4 | 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 5194 | 98.2%
1 | 81 | 1.5%
2 | 11 | 0.2%
4 | 1 | < 0.1%

Length

Histogram of lengths of the category


직원-합계-계
Real number (ℝ)

ZEROS 

Distinct: 103
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.0628523
Minimum: 0
Maximum: 137
Zeros: 154
Zeros (%): 2.9%
Negative: 0
Negative (%): 0.0%
Memory size: 46.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
Median: 4
Q3: 10
95-th percentile: 29
Maximum: 137
Range: 137
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 11.82427
Coefficient of variation (CV): 1.4665121
Kurtosis: 27.476644
Mean: 8.0628523
Median Absolute Deviation (MAD): 3
Skewness: 4.2328161
Sum: 42628.3
Variance: 139.81336
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1.0 | 883 | 16.7%
2.0 | 850 | 16.1%
3.0 | 654 | 12.4%
4.0 | 435 | 8.2%
5.0 | 248 | 4.7%
6.0 | 197 | 3.7%
7.0 | 191 | 3.6%
0.0 | 154 | 2.9%
8.0 | 135 | 2.6%
13.0 | 133 | 2.5%
Other values (93) | 1407 | 26.6%

Minimum 10 values
Value | Count | Frequency (%)
0.0 | 154 | 2.9%
0.5 | 1 | < 0.1%
1.0 | 883 | 16.7%
1.3 | 1 | < 0.1%
1.7 | 1 | < 0.1%
2.0 | 850 | 16.1%
2.5 | 1 | < 0.1%
3.0 | 654 | 12.4%
3.5 | 1 | < 0.1%
4.0 | 435 | 8.2%

Maximum 10 values
Value | Count | Frequency (%)
137.0 | 2 | < 0.1%
136.0 | 1 | < 0.1%
125.0 | 1 | < 0.1%
124.0 | 1 | < 0.1%
120.0 | 1 | < 0.1%
118.0 | 1 | < 0.1%
116.0 | 1 | < 0.1%
115.0 | 1 | < 0.1%
106.0 | 1 | < 0.1%
104.0 | 1 | < 0.1%

4.2 직원_비정규직_겸직_합계
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 41.4 KiB

Value | Count
0 | 5270
1 | 13
2 | 3
4 | 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 5270 | 99.7%
1 | 13 | 0.2%
2 | 3 | 0.1%
4 | 1 | < 0.1%

Length

Histogram of lengths of the category


조사년도 키
Real number (ℝ)

Distinct: 12
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2013.6656
Minimum: 2008
Maximum: 2019
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.6 KiB

Quantile statistics

Minimum: 2008
5-th percentile: 2008
Q1: 2011
Median: 2014
Q3: 2017
95-th percentile: 2019
Maximum: 2019
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.4082419
Coefficient of variation (CV): 0.001692556
Kurtosis: -1.1880171
Mean: 2013.6656
Median Absolute Deviation (MAD): 3
Skewness: -0.053347407
Sum: 10646250
Variance: 11.616113
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=12)

Common values
Value | Count | Frequency (%)
2016 | 462 | 8.7%
2017 | 461 | 8.7%
2019 | 460 | 8.7%
2013 | 458 | 8.7%
2015 | 458 | 8.7%
2014 | 457 | 8.6%
2018 | 453 | 8.6%
2011 | 434 | 8.2%
2012 | 430 | 8.1%
2010 | 426 | 8.1%
Other values (2) | 788 | 14.9%

Minimum 10 values
Value | Count | Frequency (%)
2008 | 381 | 7.2%
2009 | 407 | 7.7%
2010 | 426 | 8.1%
2011 | 434 | 8.2%
2012 | 430 | 8.1%
2013 | 458 | 8.7%
2014 | 457 | 8.6%
2015 | 458 | 8.7%
2016 | 462 | 8.7%
2017 | 461 | 8.7%

Maximum 10 values
Value | Count | Frequency (%)
2019 | 460 | 8.7%
2018 | 453 | 8.6%
2017 | 461 | 8.7%
2016 | 462 | 8.7%
2015 | 458 | 8.7%
2014 | 457 | 8.6%
2013 | 458 | 8.7%
2012 | 430 | 8.1%
2011 | 434 | 8.2%
2010 | 426 | 8.1%
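
Because 조사년도 키 has only 12 distinct values, its frequency tables fully determine the observation count, sum, mean, and the "Other values" row. The sketch below rebuilds those figures from the per-year counts listed above; the dictionary is transcribed by hand from the tables, and the variable names are illustrative.

```python
# Counts per survey year, transcribed from the frequency tables for 조사년도 키.
year_counts = {
    2008: 381, 2009: 407, 2010: 426, 2011: 434, 2012: 430, 2013: 458,
    2014: 457, 2015: 458, 2016: 462, 2017: 461, 2018: 453, 2019: 460,
}

n = sum(year_counts.values())                                     # observations
weighted_sum = sum(year * count for year, count in year_counts.items())
mean_year = weighted_sum / n

# "Other values" = rows that fall outside the ten most frequent years.
top10 = sorted(year_counts.values(), reverse=True)[:10]
other = n - sum(top10)

print(n, weighted_sum, round(mean_year, 4), other)
```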

Interactions

[Nine interaction scatter plots; image content not recoverable.]

Correlations

Variable | 4.2 직원_정규직_전담_합계 | 4.2 직원_정규직_겸직_합계 | 직원-합계-계 | 4.2 직원_비정규직_겸직_합계 | 조사년도 키
4.2 직원_정규직_전담_합계 | 1.000 | 0.047 | 0.554 | 0.000 | 0.159
4.2 직원_정규직_겸직_합계 | 0.047 | 1.000 | 0.000 | 0.185 | 0.267
직원-합계-계 | 0.554 | 0.000 | 1.000 | 0.000 | 0.000
4.2 직원_비정규직_겸직_합계 | 0.000 | 0.185 | 0.000 | 1.000 | 0.095
조사년도 키 | 0.159 | 0.267 | 0.000 | 0.095 | 1.000

Variable | 4.2 직원_정규직_겸직_합계 | 4.2 직원_비정규직_겸직_합계
4.2 직원_정규직_겸직_합계 | 1.000 | 0.074
4.2 직원_비정규직_겸직_합계 | 0.074 | 1.000

Variable | 4.2 직원_정규직_전담_합계 | 직원-합계-계 | 조사년도 키 | 4.2 직원_정규직_겸직_합계 | 4.2 직원_비정규직_겸직_합계
4.2 직원_정규직_전담_합계 | 1.000 | 0.061 | 0.434 | 0.038 | 0.000
직원-합계-계 | 0.061 | 1.000 | -0.071 | 0.000 | 0.000
조사년도 키 | 0.434 | -0.071 | 1.000 | 0.164 | 0.059
4.2 직원_정규직_겸직_합계 | 0.038 | 0.000 | 0.164 | 1.000 | 0.074
4.2 직원_비정규직_겸직_합계 | 0.000 | 0.000 | 0.059 | 0.074 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

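First rows
Index | 4.2 직원_정규직_전담_합계 | 4.2 직원_정규직_겸직_합계 | 직원-합계-계 | 4.2 직원_비정규직_겸직_합계 | 조사년도 키
0 | 0.0 | 0 | 8.0 | 0 | 2013
1 | 0.0 | 0 | 2.0 | 0 | 2013
2 | 0.0 | 0 | 2.0 | 0 | 2013
3 | 0.0 | 0 | 4.0 | 0 | 2013
4 | 0.0 | 0 | 2.0 | 0 | 2013
5 | 0.0 | 0 | 1.0 | 0 | 2013
6 | 0.0 | 0 | 3.0 | 0 | 2013
7 | 0.0 | 0 | 62.0 | 0 | 2013
8 | 0.0 | 0 | 1.0 | 0 | 2013
9 | 0.0 | 0 | 1.0 | 0 | 2013

Last rows
Index | 4.2 직원_정규직_전담_합계 | 4.2 직원_정규직_겸직_합계 | 직원-합계-계 | 4.2 직원_비정규직_겸직_합계 | 조사년도 키
5277 | 0.0 | 0 | 7.0 | 0 | 2009
5278 | 0.0 | 0 | 5.0 | 0 | 2009
5279 | 0.0 | 0 | 2.0 | 0 | 2009
5280 | 0.0 | 0 | 5.0 | 0 | 2009
5281 | 0.0 | 0 | 15.0 | 0 | 2009
5282 | 0.0 | 0 | 14.0 | 0 | 2009
5283 | 0.0 | 0 | 3.0 | 0 | 2009
5284 | 0.0 | 0 | 6.0 | 0 | 2009
5285 | 0.0 | 0 | 2.0 | 0 | 2009
5286 | 0.0 | 0 | 12.0 | 0 | 2009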

Duplicate rows

Most frequently occurring

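Index | 4.2 직원_정규직_전담_합계 | 4.2 직원_정규직_겸직_합계 | 직원-합계-계 | 4.2 직원_비정규직_겸직_합계 | 조사년도 키 | # duplicates
19 | 0.0 | 0 | 1.0 | 0 | 2016 | 91
20 | 0.0 | 0 | 1.0 | 0 | 2017 | 91
21 | 0.0 | 0 | 1.0 | 0 | 2018 | 85
18 | 0.0 | 0 | 1.0 | 0 | 2015 | 83
26 | 0.0 | 0 | 2.0 | 0 | 2010 | 81
34 | 0.0 | 0 | 2.0 | 0 | 2018 | 79
17 | 0.0 | 0 | 1.0 | 0 | 2014 | 77
14 | 0.0 | 0 | 1.0 | 0 | 2011 | 76
32 | 0.0 | 0 | 2.0 | 0 | 2016 | 74
33 | 0.0 | 0 | 2.0 | 0 | 2017 | 74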
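
A table like this one can be built by grouping identical rows and counting each group. The sketch below does so for the ten rows shown first under "Sample", since the full dataset is not part of this report. The helper name `duplicate_groups` is illustrative, and this is not necessarily how the profiling tool computes its figures. It does not attempt to reproduce the dataset-wide total of 386.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return (row, count) pairs for every row pattern that occurs more than once,
    most frequent first."""
    counts = Counter(rows)
    return [(row, n) for row, n in counts.most_common() if n > 1]


# First ten sample rows as tuples in the column order
# (전담, 정규직 겸직, 직원-합계-계, 비정규직 겸직, 조사년도 키).
rows = [
    (0.0, 0, 8.0, 0, 2013),
    (0.0, 0, 2.0, 0, 2013),
    (0.0, 0, 2.0, 0, 2013),
    (0.0, 0, 4.0, 0, 2013),
    (0.0, 0, 2.0, 0, 2013),
    (0.0, 0, 1.0, 0, 2013),
    (0.0, 0, 3.0, 0, 2013),
    (0.0, 0, 62.0, 0, 2013),
    (0.0, 0, 1.0, 0, 2013),
    (0.0, 0, 1.0, 0, 2013),
]

groups = duplicate_groups(rows)
for row, n in groups:
    print(row, n)
```

With only five low-cardinality columns and most staff counts at 0, 1, or 2, many institutions share the same row within a survey year, which would account for the large group sizes in the table above.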