Overview

Dataset statistics

Number of variables: 9
Number of observations: 2577
Missing cells: 37
Missing cells (%): 0.2%
Duplicate rows: 8
Duplicate rows (%): 0.3%
Total size in memory: 186.4 KiB
Average record size in memory: 74.1 B
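
The percentages and the average record size above can be re-derived from the raw counts. The following is a minimal sketch, assuming that the missing-cell percentage is taken over all cells (variables × observations), that the duplicate percentage is taken over observations, and that 1 KiB = 1024 bytes. The variable names are illustrative and do not come from the report.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 9
n_observations = 2577
missing_cells = 37
duplicate_rows = 8
total_memory_kib = 186.4

# Assumed denominators: all cells for missing values, all rows for duplicates.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumed unit: 1 KiB = 1024 bytes.
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"Missing cells (%):   {missing_pct:.1f}%")
print(f"Duplicate rows (%):  {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the rounded results agree with the reported 0.2%, 0.3% and 74.1 B.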

Variable types

Numeric: 2
Categorical: 5
Boolean: 1
DateTime: 1

Dataset

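Description: A database of the management codes that the One-Stop loan system (One-Stop융자시스템) referenced when reviewing schools applying for loans. The system has since been decommissioned, so this is legacy system-code data and no new records are being generated.
Author: 한국사학진흥재단 (Korea Advancing Schools Foundation)
URL: https://www.data.go.kr/data/15042305/fileData.do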

Alerts

Dataset has 8 (0.3%) duplicate rows [Duplicates]
점수 is highly overall correlated with 대분류코드명 [High correlation]
대분류코드명 is highly overall correlated with 점수 [High correlation]
심사군코드명 is highly imbalanced (98.3%) [Imbalance]
사용여부 is highly imbalanced (81.6%) [Imbalance]
점수 has 37 (1.4%) missing values [Missing]
점수 has 584 (22.7%) zeros [Zeros]
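
The two imbalance percentages are consistent with a score of one minus the normalized Shannon entropy of the class counts. This is inferred from the numbers in this report, not taken from the tool's source, so treat the sketch below as an assumption. The function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class shares and k is the number of classes.
    0 means perfectly balanced, 1 means a single class holds every row."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts copied from the variable sections of this report.
review_group_counts = [2573, 4]   # 심사군코드명: 계량, 비계량
in_use_counts = [2505, 72]        # 사용여부: True, False

print(f"심사군코드명: {imbalance_score(review_group_counts):.1%}")
print(f"사용여부: {imbalance_score(in_use_counts):.1%}")
```

With these counts the sketch yields 98.3% and 81.6%, matching the two alerts.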

Reproduction

Analysis started: 2023-12-12 21:10:12.941015
Analysis finished: 2023-12-12 21:10:14.156674
Duration: 1.22 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

배정년도 (allocation year)
Real number (ℝ)

Distinct: 13
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2014.6542
Minimum: 2009
Maximum: 2021
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.8 KiB

Quantile statistics

Minimum: 2009
5-th percentile: 2010
Q1: 2012
Median: 2014
Q3: 2017
95-th percentile: 2020.2
Maximum: 2021
Range: 12
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.1317434
Coefficient of variation (CV): 0.0015544818
Kurtosis: -0.70542621
Mean: 2014.6542
Median Absolute Deviation (MAD): 2
Skewness: 0.45146448
Sum: 5191764
Variance: 9.8078168
Monotonicity: Not monotonic
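
Because all 13 distinct years and their counts appear in this variable's value tables, the summary statistics can be recomputed from the frequency table alone. The sketch below uses Python's `statistics` module. It assumes the reported variance uses the sample (n - 1) denominator, which is an inference from the figures, not something the report states.

```python
import statistics

# Year -> count, copied from the 배정년도 value tables in this report
# (all 13 distinct years are listed there).
year_counts = {
    2009: 33, 2010: 151, 2011: 224, 2012: 305, 2013: 326,
    2014: 365, 2015: 370, 2016: 117, 2017: 122, 2018: 134,
    2019: 152, 2020: 149, 2021: 129,
}

# Expand the frequency table back into one value per observation.
values = [year for year, count in year_counts.items() for _ in range(count)]

n = len(values)
total = sum(values)
mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std = statistics.stdev(values)
cv = std / mean                         # coefficient of variation

print(f"n={n} sum={total} mean={mean:.4f} median={median}")
print(f"variance={variance:.7f} std={std:.7f} cv={cv:.10f}")
```

The results line up with the reported sum (5191764), mean (2014.6542), median (2014) and variance (9.8078168).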
Histogram with fixed size bins (bins=13)

Common values

Value | Count | Frequency (%)
2015 | 370 | 14.4%
2014 | 365 | 14.2%
2013 | 326 | 12.7%
2012 | 305 | 11.8%
2011 | 224 | 8.7%
2019 | 152 | 5.9%
2010 | 151 | 5.9%
2020 | 149 | 5.8%
2018 | 134 | 5.2%
2021 | 129 | 5.0%
Other values (3) | 272 | 10.6%

Minimum 10 values

Value | Count | Frequency (%)
2009 | 33 | 1.3%
2010 | 151 | 5.9%
2011 | 224 | 8.7%
2012 | 305 | 11.8%
2013 | 326 | 12.7%
2014 | 365 | 14.2%
2015 | 370 | 14.4%
2016 | 117 | 4.5%
2017 | 122 | 4.7%
2018 | 134 | 5.2%

Maximum 10 values

Value | Count | Frequency (%)
2021 | 129 | 5.0%
2020 | 149 | 5.8%
2019 | 152 | 5.9%
2018 | 134 | 5.2%
2017 | 122 | 4.7%
2016 | 117 | 4.5%
2015 | 370 | 14.4%
2014 | 365 | 14.2%
2013 | 326 | 12.7%
2012 | 305 | 11.8%

접수구분코드명 (application type code name)
Categorical

Distinct: 4
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 20.3 KiB
정기신청: 1435
추가신청: 1112
변경(정기신청): 28
변경(추가신청): 2

Length

Max length: 8
Median length: 4
Mean length: 4.0465658
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 정기신청
2nd row: 정기신청
3rd row: 정기신청
4th row: 정기신청
5th row: 정기신청

Common Values

Value | Count | Frequency (%)
정기신청 | 1435 | 55.7%
추가신청 | 1112 | 43.2%
변경(정기신청) | 28 | 1.1%
변경(추가신청) | 2 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)


Words

Value | Count | Frequency (%)
정기신청 | 1435 | 55.7%
추가신청 | 1112 | 43.2%
변경(정기신청 | 28 | 1.1%
변경(추가신청 | 2 | 0.1%

학교급코드명 (school level code name)
Categorical

Distinct: 6
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 20.3 KiB
대학교: 953
전문대학: 669
원격대학: 453
중등이하: 373
유치원: 128

Length

Max length: 4
Median length: 4
Mean length: 3.5797439
Min length: 2

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 대학교
2nd row: 대학교
3rd row: 대학교
4th row: 대학교
5th row: 대학교

Common Values

Value | Count | Frequency (%)
대학교 | 953 | 37.0%
전문대학 | 669 | 26.0%
원격대학 | 453 | 17.6%
중등이하 | 373 | 14.5%
유치원 | 128 | 5.0%
기타 | 1 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)


Words

Value | Count | Frequency (%)
대학교 | 953 | 37.0%
전문대학 | 669 | 26.0%
원격대학 | 453 | 17.6%
중등이하 | 373 | 14.5%
유치원 | 128 | 5.0%
기타 | 1 | < 0.1%

사업구분코드명 (project type code name)
Categorical

Distinct: 18
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 20.3 KiB
사학시설: 442
교육용기자재: 357
대환대출: 334
수익용기본재산: 254
대학구조개선지원: 169
Other values (13): 1021

Length

Max length: 12
Median length: 10
Mean length: 5.5797439
Min length: 4

Unique

Unique: 2
Unique (%): 0.1%

Sample

1st row: 수익용기본재산
2nd row: 수익용기본재산
3rd row: 교육환경 안전강화 지원
4th row: 교육환경 안전강화 지원
5th row: 교육환경 안전강화 지원

Common Values

Value | Count | Frequency (%)
사학시설 | 442 | 17.2%
교육용기자재 | 357 | 13.9%
대환대출 | 334 | 13.0%
수익용기본재산 | 254 | 9.9%
대학구조개선지원 | 169 | 6.6%
평생교육시설 | 158 | 6.1%
부속병원 | 152 | 5.9%
산학협력 | 150 | 5.8%
학교기업 | 150 | 5.8%
국제화지원 | 139 | 5.4%
Other values (8) | 272 | 10.6%

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
사학시설 | 442 | 15.7%
교육용기자재 | 357 | 12.7%
대환대출 | 334 | 11.9%
수익용기본재산 | 254 | 9.0%
대학구조개선지원 | 169 | 6.0%
평생교육시설 | 158 | 5.6%
부속병원 | 152 | 5.4%
산학협력 | 150 | 5.3%
학교기업 | 150 | 5.3%
국제화지원 | 139 | 4.9%
Other values (11) | 511 | 18.1%

심사군코드명 (review group code name)
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 20.3 KiB
계량: 2573
비계량: 4

Length

Max length: 3
Median length: 2
Mean length: 2.0015522
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 계량
2nd row: 계량
3rd row: 계량
4th row: 계량
5th row: 계량

Common Values

Value | Count | Frequency (%)
계량 | 2573 | 99.8%
비계량 | 4 | 0.2%

Length

Histogram of lengths of the category

Common Values (Plot)


Words

Value | Count | Frequency (%)
계량 | 2573 | 99.8%
비계량 | 4 | 0.2%

대분류코드명 (major category code name)
Categorical

HIGH CORRELATION 

Distinct: 21
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 20.3 KiB
가산점: 564
감점: 556
공익성 평가: 371
상환능력 평가: 367
상환능력지표: 188
Other values (16): 531

Length

Max length: 8
Median length: 6
Mean length: 4.4536282
Min length: 2

Unique

Unique: 2
Unique (%): 0.1%

Sample

1st row: 감점
2nd row: 가산점
3rd row: 공익성지표
4th row: 공익성 지표
5th row: 상환능력지표

Common Values

Value | Count | Frequency (%)
가산점 | 564 | 21.9%
감점 | 556 | 21.6%
공익성 평가 | 371 | 14.4%
상환능력 평가 | 367 | 14.2%
상환능력지표 | 188 | 7.3%
신청데이터 | 163 | 6.3%
공익성 지표 | 113 | 4.4%
공익성지표 | 91 | 3.5%
신청자료 | 84 | 3.3%
상환능력 지표 | 17 | 0.7%
Other values (11) | 63 | 2.4%

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
평가 | 763 | 21.8%
가산점 | 564 | 16.1%
감점 | 556 | 15.9%
공익성 | 486 | 13.9%
상환능력 | 387 | 11.1%
상환능력지표 | 188 | 5.4%
신청데이터 | 163 | 4.7%
지표 | 130 | 3.7%
공익성지표 | 91 | 2.6%
신청자료 | 84 | 2.4%
Other values (12) | 83 | 2.4%

점수 (score)
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 53
Distinct (%): 2.1%
Missing: 37
Missing (%): 1.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 115.69409
Minimum: -80
Maximum: 600
Zeros: 584
Zeros (%): 22.7%
Negative: 11
Negative (%): 0.4%
Memory size: 22.8 KiB

Quantile statistics

Minimum: -80
5-th percentile: 0
Q1: 20
Median: 60
Q3: 200
95-th percentile: 320
Maximum: 600
Range: 680
Interquartile range (IQR): 180

Descriptive statistics

Standard deviation: 119.54574
Coefficient of variation (CV): 1.0332916
Kurtosis: -0.37984905
Mean: 115.69409
Median Absolute Deviation (MAD): 60
Skewness: 0.79369777
Sum: 293863
Variance: 14291.184
Monotonicity: Not monotonic
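
Several of the 점수 figures are linked by simple identities, which gives a quick consistency check on the report. The sketch below assumes the mean is taken over non-missing values only (2577 - 37 = 2540 rows). That assumption is not stated in the report but is what the sum and mean imply. All variable names are illustrative.

```python
# Figures copied from the 점수 section of this report.
n_observations = 2577
n_missing = 37
reported_mean = 115.69409
reported_std = 119.54574
reported_sum = 293863
minimum, maximum = -80, 600
q1, q3 = 20, 200

# Assumption: statistics are computed over non-missing values only.
n_valid = n_observations - n_missing

derived = {
    "mean": reported_sum / n_valid,      # sum / count
    "cv": reported_std / reported_mean,  # coefficient of variation
    "variance": reported_std ** 2,       # square of the standard deviation
    "range": maximum - minimum,
    "iqr": q3 - q1,
}

for name, value in derived.items():
    print(f"{name}: {value:.5f}")
```

The derived values agree with the reported mean (115.69409), CV (1.0332916), variance (14291.184), range (680) and IQR (180) to the precision shown.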
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 584 | 22.7%
200 | 380 | 14.7%
20 | 306 | 11.9%
300 | 290 | 11.3%
120 | 204 | 7.9%
40 | 139 | 5.4%
30 | 84 | 3.3%
50 | 75 | 2.9%
350 | 74 | 2.9%
80 | 56 | 2.2%
Other values (43) | 348 | 13.5%

Minimum 10 values

Value | Count | Frequency (%)
-80 | 1 | < 0.1%
-30 | 2 | 0.1%
-25 | 2 | 0.1%
-20 | 6 | 0.2%
0 | 584 | 22.7%
5 | 2 | 0.1%
10 | 9 | 0.3%
20 | 306 | 11.9%
25 | 48 | 1.9%
30 | 84 | 3.3%

Maximum 10 values

Value | Count | Frequency (%)
600 | 4 | 0.2%
580 | 2 | 0.1%
560 | 1 | < 0.1%
500 | 1 | < 0.1%
490 | 3 | 0.1%
400 | 4 | 0.2%
380 | 1 | < 0.1%
370 | 1 | < 0.1%
360 | 20 | 0.8%
350 | 74 | 2.9%

사용여부 (in use)
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 KiB
True: 2505
False: 72

Value | Count | Frequency (%)
True | 2505 | 97.2%
False | 72 | 2.8%

등록일자 (registration date)
DateTime

Distinct: 89
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Memory size: 20.3 KiB
Minimum: 2009-06-17 00:00:00
Maximum: 2020-05-14 00:00:00
Histogram with fixed size bins (bins=50)

Interactions


Correlations

Variable | 배정년도 | 접수구분코드명 | 학교급코드명 | 사업구분코드명 | 심사군코드명 | 대분류코드명 | 점수 | 사용여부 | 등록일자
배정년도 | 1.000 | 0.301 | 0.304 | 0.531 | 0.206 | 0.597 | 0.597 | 0.507 | 0.997
접수구분코드명 | 0.301 | 1.000 | 0.112 | 0.143 | 0.013 | 0.207 | 0.081 | 0.079 | 0.831
학교급코드명 | 0.304 | 0.112 | 1.000 | 0.481 | 0.037 | 0.279 | 0.190 | 0.169 | 0.656
사업구분코드명 | 0.531 | 0.143 | 0.481 | 1.000 | 0.300 | 0.614 | 0.302 | 0.343 | 0.895
심사군코드명 | 0.206 | 0.013 | 0.037 | 0.300 | 1.000 | 0.000 | 0.000 | 0.000 | 0.489
대분류코드명 | 0.597 | 0.207 | 0.279 | 0.614 | 0.000 | 1.000 | 0.867 | 0.269 | 0.923
점수 | 0.597 | 0.081 | 0.190 | 0.302 | 0.000 | 0.867 | 1.000 | 0.000 | 0.705
사용여부 | 0.507 | 0.079 | 0.169 | 0.343 | 0.000 | 0.269 | 0.000 | 1.000 | 0.800
등록일자 | 0.997 | 0.831 | 0.656 | 0.895 | 0.489 | 0.923 | 0.705 | 0.800 | 1.000

Variable | 심사군코드명 | 접수구분코드명 | 사용여부 | 사업구분코드명 | 학교급코드명 | 대분류코드명
심사군코드명 | 1.000 | 0.009 | 0.000 | 0.235 | 0.027 | 0.000
접수구분코드명 | 0.009 | 1.000 | 0.052 | 0.078 | 0.073 | 0.113
사용여부 | 0.000 | 0.052 | 1.000 | 0.270 | 0.121 | 0.236
사업구분코드명 | 0.235 | 0.078 | 0.270 | 1.000 | 0.212 | 0.226
학교급코드명 | 0.027 | 0.073 | 0.121 | 0.212 | 1.000 | 0.129
대분류코드명 | 0.000 | 0.113 | 0.236 | 0.226 | 0.129 | 1.000

Variable | 배정년도 | 점수 | 접수구분코드명 | 학교급코드명 | 사업구분코드명 | 심사군코드명 | 대분류코드명 | 사용여부
배정년도 | 1.000 | -0.078 | 0.184 | 0.167 | 0.236 | 0.158 | 0.292 | 0.359
점수 | -0.078 | 1.000 | 0.049 | 0.100 | 0.120 | 0.000 | 0.551 | 0.000
접수구분코드명 | 0.184 | 0.049 | 1.000 | 0.073 | 0.078 | 0.009 | 0.113 | 0.052
학교급코드명 | 0.167 | 0.100 | 0.073 | 1.000 | 0.212 | 0.027 | 0.129 | 0.121
사업구분코드명 | 0.236 | 0.120 | 0.078 | 0.212 | 1.000 | 0.235 | 0.226 | 0.270
심사군코드명 | 0.158 | 0.000 | 0.009 | 0.027 | 0.235 | 1.000 | 0.000 | 0.000
대분류코드명 | 0.292 | 0.551 | 0.113 | 0.129 | 0.226 | 0.000 | 1.000 | 0.236
사용여부 | 0.359 | 0.000 | 0.052 | 0.121 | 0.270 | 0.000 | 0.236 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

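First rows

Index | 배정년도 | 접수구분코드명 | 학교급코드명 | 사업구분코드명 | 심사군코드명 | 대분류코드명 | 점수 | 사용여부 | 등록일자
0 | 2021 | 정기신청 | 대학교 | 수익용기본재산 | 계량 | 감점 | 0 | Y | 2020-05-14
1 | 2021 | 정기신청 | 대학교 | 수익용기본재산 | 계량 | 가산점 | 125 | Y | 2020-05-14
2 | 2021 | 정기신청 | 대학교 | 교육환경 안전강화 지원 | 계량 | 공익성지표 | 200 | Y | 2020-05-14
3 | 2021 | 정기신청 | 대학교 | 교육환경 안전강화 지원 | 계량 | 공익성 지표 | 30 | Y | 2020-05-14
4 | 2021 | 정기신청 | 대학교 | 교육환경 안전강화 지원 | 계량 | 상환능력지표 | 350 | Y | 2020-05-14
5 | 2021 | 정기신청 | 대학교 | 교육환경 안전강화 지원 | 계량 | 감점 | 0 | Y | 2020-05-14
6 | 2021 | 정기신청 | 대학교 | 교육환경 안전강화 지원 | 계량 | 가산점 | 125 | Y | 2020-05-14
7 | 2021 | 정기신청 | 전문대학 | 사학시설 | 계량 | 공익성지표 | 150 | Y | 2020-05-14
8 | 2021 | 정기신청 | 전문대학 | 사학시설 | 계량 | 상환능력지표 | 350 | Y | 2020-05-14
9 | 2021 | 정기신청 | 전문대학 | 사학시설 | 계량 | 감점 | 0 | Y | 2020-05-14

Last rows

Index | 배정년도 | 접수구분코드명 | 학교급코드명 | 사업구분코드명 | 심사군코드명 | 대분류코드명 | 점수 | 사용여부 | 등록일자
2567 | 2013 | 정기신청 | 전문대학 | 평생교육시설 | 계량 | 감점 | 20 | Y | 2013-11-26
2568 | 2013 | 정기신청 | 전문대학 | 평생교육시설 | 계량 | 신청데이터 | 0 | Y | 2013-11-26
2569 | 2013 | 추가신청 | 대학교 | 평생교육시설 | 계량 | 공익성 평가 | 300 | Y | 2013-11-26
2570 | 2013 | 추가신청 | 대학교 | 평생교육시설 | 계량 | 상환능력 평가 | 200 | Y | 2013-11-26
2571 | 2013 | 추가신청 | 대학교 | 평생교육시설 | 계량 | 가산점 | 120 | Y | 2013-11-26
2572 | 2013 | 추가신청 | 대학교 | 평생교육시설 | 계량 | 감점 | 20 | Y | 2013-11-26
2573 | 2013 | 추가신청 | 원격대학 | 평생교육시설 | 계량 | 공익성 평가 | 300 | Y | 2013-11-26
2574 | 2013 | 추가신청 | 원격대학 | 평생교육시설 | 계량 | 상환능력 평가 | 200 | Y | 2013-11-26
2575 | 2013 | 추가신청 | 원격대학 | 평생교육시설 | 계량 | 가산점 | 120 | Y | 2013-11-26
2576 | 2012 | 추가신청 | 대학교 | 교육용기자재 | 계량 | 가산점 | 120 | Y | 2012-04-23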

Duplicate rows

Most frequently occurring

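Index | 배정년도 | 접수구분코드명 | 학교급코드명 | 사업구분코드명 | 심사군코드명 | 대분류코드명 | 점수 | 사용여부 | 등록일자 | # duplicates
0 | 2018 | 정기신청 | 대학교 | 법인수익용기본재산 | 계량 | 공익성지표 | 40 | Y | 2017-12-22 | 2
1 | 2018 | 추가신청 | 대학교 | 법인수익용기본재산 | 계량 | 공익성지표 | 40 | Y | 2018-06-29 | 2
2 | 2018 | 추가신청 | 전문대학 | 사학시설 | 계량 | 공익성 지표 | 25 | Y | 2018-06-29 | 2
3 | 2019 | 정기신청 | 전문대학 | 사학시설 | 계량 | 공익성 지표 | 25 | Y | 2018-06-29 | 2
4 | 2019 | 추가신청 | 대학교 | 대환대출 | 계량 | 상환능력지표 | 40 | Y | 2018-06-29 | 2
5 | 2019 | 추가신청 | 전문대학 | 사학시설 | 계량 | 공익성 지표 | 25 | Y | 2019-07-12 | 2
6 | 2020 | 정기신청 | 대학교 | 대환대출 | 계량 | 상환능력지표 | 40 | Y | 2019-10-11 | 2
7 | 2021 | 정기신청 | 대학교 | 대환대출 | 계량 | 상환능력지표 | 40 | Y | 2020-05-14 | 2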
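
A table like the one above can be produced by counting identical rows. The following is a minimal sketch using `collections.Counter` on a small hand-built list. The `rows` list is illustrative: it repeats two rows from the duplicates table and adds one unique row from the sample, and it is not the full dataset. The report lists eight distinct duplicated rows, each occurring twice, so "8 duplicate rows" fits either reading counted below (distinct duplicated rows, or surplus copies).

```python
from collections import Counter

# Illustrative rows only: two duplicated rows from the table above
# plus one unique row from the sample section.
row_a = (2018, "정기신청", "대학교", "법인수익용기본재산", "계량", "공익성지표", 40, "Y", "2017-12-22")
row_b = (2021, "정기신청", "대학교", "대환대출", "계량", "상환능력지표", 40, "Y", "2020-05-14")
row_c = (2021, "정기신청", "대학교", "수익용기본재산", "계량", "감점", 0, "Y", "2020-05-14")
rows = [row_a, row_b, row_c, row_a, row_b]

# Count how many times each complete row occurs.
occurrences = Counter(rows)

# Keep only rows that occur more than once, with their occurrence count.
duplicated = {row: count for row, count in occurrences.items() if count > 1}

distinct_duplicated_rows = len(duplicated)
surplus_copies = sum(count - 1 for count in duplicated.values())

for row, count in duplicated.items():
    print(count, row)
print("distinct duplicated rows:", distinct_duplicated_rows)
print("surplus copies:", surplus_copies)
```

Tuples are used because they are hashable, which lets whole rows act as `Counter` keys.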