Overview

Dataset statistics

Number of variables: 6
Number of observations: 67
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.3 KiB
Average record size in memory: 51.0 B

Variable types

Text: 1
Categorical: 3
Numeric: 1
Boolean: 1

Dataset

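Description: File data holding document information (document name 문서명, document sort number 문서정렬번호, in-use flag 사용유무, input date 입력일시, last-modified date 최종변경일시) from the University Finance Information System of the Korea Advancing Schools Foundation (한국사학진흥재단).
URL: https://www.data.go.kr/data/15120083/fileData.do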

Alerts

문서정렬번호 is highly overall correlated with 사용유무 (High correlation)
문서유형 is highly overall correlated with 입력일시 (High correlation)
사용유무 is highly overall correlated with 문서정렬번호 (High correlation)
입력일시 is highly overall correlated with 문서유형 (High correlation)
사용유무 is highly imbalanced (80.6%) (Imbalance)
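
The 80.6% imbalance figure for 사용유무 can be reproduced from the counts reported for that variable (65 True, 2 False). The sketch below assumes the score is one minus the Shannon entropy of the class distribution, normalised by the maximum entropy for that number of classes; the function name `imbalance_score` is illustrative and not taken from the report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of class counts.

    0.0 means perfectly balanced classes, 1.0 means a single class holds
    every observation. This formula is an assumption that happens to
    reproduce the figure shown in the alert.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    # Shannon entropy in bits, skipping empty classes.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Counts reported for 사용유무: 65 True, 2 False.
score = imbalance_score([65, 2])
print(f"{score:.1%}")
```

A perfectly balanced two-class column, for example `imbalance_score([10, 10])`, scores 0 under the same formula.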

Reproduction

Analysis started: 2023-12-12 05:18:00.294951
Analysis finished: 2023-12-12 05:18:00.958992
Duration: 0.66 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
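
As a quick check, the reported duration is the difference between the two timestamps above. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-12 05:18:00.294951")
finished = datetime.fromisoformat("2023-12-12 05:18:00.958992")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```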

Variables

문서명
Text

Distinct: 63
Distinct (%): 94.0%
Missing: 0
Missing (%): 0.0%
Memory size: 668.0 B

Length

Max length: 26
Median length: 18
Mean length: 13.149254
Min length: 5

Characters and Unicode

Total characters: 881
Distinct characters: 94
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
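
To illustrate how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` is illustrative, and the example string is one of the 문서명 values from this dataset; the two-letter codes map to the names used in this report (Lo = Other Letter, Zs = Space Separator, Po = Other Punctuation, Ps = Open Punctuation, Pe = Close Punctuation).

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count Unicode general categories for every character in text."""
    # Normalise to NFC so each Hangul syllable is one precomposed code point.
    normalized = unicodedata.normalize("NFC", text)
    return Counter(unicodedata.category(ch) for ch in normalized)


# One document name from the dataset: nine Hangul syllables in parentheses.
counts = category_counts("세부집행내역(사업비)")
for code, n in counts.most_common():
    print(code, n)
```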

Unique

Unique: 59
Unique (%): 88.1%
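
The report distinguishes distinct values (how many different values occur) from unique values (how many occur exactly once). With 63 distinct and 59 unique names across 67 rows, 4 names are repeated and together cover the remaining 8 rows. A minimal sketch of the two counts, using a hypothetical list rather than the real column:

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    freq = Counter(values)
    distinct = len(freq)
    unique = sum(1 for n in freq.values() if n == 1)
    return distinct, unique


# Hypothetical data: "b" and "c" repeat, "a" and "d" appear once.
values = ["a", "b", "b", "c", "c", "d"]
distinct, unique = distinct_and_unique(values)
print(distinct, unique)
```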

Sample

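1st row: 세입결산서 (revenue settlement statement)
2nd row: 세출결산서 (expenditure settlement statement)
3rd row: 법인회계 세입 결산명세서 (corporate account revenue settlement details)
4th row: 교비회계 세입 결산명세서 (school account revenue settlement details)
5th row: 학교발전기금회계 세입 결산명세서 (school development fund account revenue settlement details)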

Value | Count | Frequency (%)
명세서 (detail statement) | 14 | 7.0%
 | 12 | 6.0%
법인 (corporation) | 11 | 5.5%
교비 (school funds) | 11 | 5.5%
적립 (reserve) | 9 | 4.5%
세출 (expenditure) | 6 | 3.0%
결산명세서 (settlement details) | 6 | 3.0%
예산명세서 (budget details) | 6 | 3.0%
학교발전기금 (school development fund) | 6 | 3.0%
세입 (revenue) | 6 | 3.0%
Other values (53) | 114 | 56.7%

Most occurring characters

Value | Count | Frequency (%)
 | 135 | 15.3%
 | 68 | 7.7%
 | 52 | 5.9%
 | 37 | 4.2%
 | 31 | 3.5%
 | 29 | 3.3%
 | 29 | 3.3%
 | 24 | 2.7%
 | 24 | 2.7%
 | 23 | 2.6%
Other values (84) | 429 | 48.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 737 | 83.7%
Space Separator | 135 | 15.3%
Other Punctuation | 3 | 0.3%
Open Punctuation | 3 | 0.3%
Close Punctuation | 3 | 0.3%
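
The frequency column follows from dividing each category count by the 881 total characters. A short sketch of that arithmetic, with the counts copied from the table above and illustrative variable names:

```python
# Category counts as reported for the 문서명 variable.
counts = {
    "Other Letter": 737,
    "Space Separator": 135,
    "Other Punctuation": 3,
    "Open Punctuation": 3,
    "Close Punctuation": 3,
}

total = sum(counts.values())  # should equal the reported total characters
shares = {name: f"{n / total:.1%}" for name, n in counts.items()}

for name, share in shares.items():
    print(name, share)
```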

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 68 | 9.2%
 | 52 | 7.1%
 | 37 | 5.0%
 | 31 | 4.2%
 | 29 | 3.9%
 | 29 | 3.9%
 | 24 | 3.3%
 | 24 | 3.3%
 | 23 | 3.1%
 | 20 | 2.7%
Other values (80) | 400 | 54.3%

Space Separator
Value | Count | Frequency (%)
 | 135 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
· | 3 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 3 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 737 | 83.7%
Common | 144 | 16.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 68 | 9.2%
 | 52 | 7.1%
 | 37 | 5.0%
 | 31 | 4.2%
 | 29 | 3.9%
 | 29 | 3.9%
 | 24 | 3.3%
 | 24 | 3.3%
 | 23 | 3.1%
 | 20 | 2.7%
Other values (80) | 400 | 54.3%

Common
Value | Count | Frequency (%)
 | 135 | 93.8%
· | 3 | 2.1%
( | 3 | 2.1%
) | 3 | 2.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 737 | 83.7%
ASCII | 141 | 16.0%
None | 3 | 0.3%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
 | 135 | 95.7%
( | 3 | 2.1%
) | 3 | 2.1%

Hangul
Value | Count | Frequency (%)
 | 68 | 9.2%
 | 52 | 7.1%
 | 37 | 5.0%
 | 31 | 4.2%
 | 29 | 3.9%
 | 29 | 3.9%
 | 24 | 3.3%
 | 24 | 3.3%
 | 23 | 3.1%
 | 20 | 2.7%
Other values (80) | 400 | 54.3%

None
Value | Count | Frequency (%)
· | 3 | 100.0%

학교_교육원 구분
Categorical

Distinct: 2
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 668.0 B
학교 (school): 56
교육원 (education center): 11

Length

Max length: 3
Median length: 2
Mean length: 2.1641791
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

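1st row: 교육원 (education center)
2nd row: 교육원 (education center)
3rd row: 학교 (school)
4th row: 학교 (school)
5th row: 학교 (school)

Common Values

Value | Count | Frequency (%)
학교 (school) | 56 | 83.6%
교육원 (education center) | 11 | 16.4%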

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
학교 | 56 | 83.6%
교육원 | 11 | 16.4%

문서유형
Categorical

HIGH CORRELATION 

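Distinct: 4
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Memory size: 668.0 B
결산서 부속서류 (settlement statement attachments): 40
예산서 부속서류 (budget statement attachments): 10
결산서 (settlement statement): 9
예산서 (budget statement): 8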

Length

Max length: 8
Median length: 8
Mean length: 6.7313433
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 결산서
2nd row: 결산서
3rd row: 결산서
4th row: 결산서
5th row: 결산서

Common Values

Value | Count | Frequency (%)
결산서 부속서류 | 40 | 59.7%
예산서 부속서류 | 10 | 14.9%
결산서 | 9 | 13.4%
예산서 | 8 | 11.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
부속서류 | 50 | 42.7%
결산서 | 49 | 41.9%
예산서 | 18 | 15.4%

문서정렬번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 23
Distinct (%): 34.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.283582
Minimum: 1
Maximum: 99
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 735.0 B

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 6
Median: 11
Q3: 15.5
95-th percentile: 21
Maximum: 99
Range: 98
Interquartile range (IQR): 9.5

Descriptive statistics

Standard deviation: 12.345783
Coefficient of variation (CV): 1.0050638
Kurtosis: 37.384438
Mean: 12.283582
Median Absolute Deviation (MAD): 5
Skewness: 5.3355458
Sum: 823
Variance: 152.41836
Monotonicity: Not monotonic
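
Several of the figures reported for 문서정렬번호 are tied together by simple identities: mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1, and range = maximum - minimum. The sketch below re-derives them from the reported inputs as a consistency check; the variable names are illustrative.

```python
# Inputs copied from the report for 문서정렬번호.
n = 67              # number of observations
total = 823         # Sum
std = 12.345783     # Standard deviation
q1, q3 = 6, 15.5    # quartiles
minimum, maximum = 1, 99

mean = total / n                 # expected to be about 12.283582
cv = std / mean                  # expected to be about 1.0050638
variance = std ** 2              # expected to be about 152.41836
iqr = q3 - q1                    # 9.5
value_range = maximum - minimum  # 98

print(round(mean, 6), round(cv, 6), round(variance, 3), iqr, value_range)
```

The mean (about 12.3) sitting above the median (11), together with the skewness of about 5.3, is consistent with the single large value 99 pulling the distribution to the right.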
Histogram with fixed size bins (bins=23)

Common values
Value | Count | Frequency (%)
13 | 5 | 7.5%
14 | 5 | 7.5%
3 | 4 | 6.0%
15 | 4 | 6.0%
2 | 3 | 4.5%
21 | 3 | 4.5%
19 | 3 | 4.5%
10 | 3 | 4.5%
9 | 3 | 4.5%
11 | 3 | 4.5%
Other values (13) | 31 | 46.3%

Minimum 10 values
Value | Count | Frequency (%)
1 | 3 | 4.5%
2 | 3 | 4.5%
3 | 4 | 6.0%
4 | 3 | 4.5%
5 | 3 | 4.5%
6 | 3 | 4.5%
7 | 3 | 4.5%
8 | 3 | 4.5%
9 | 3 | 4.5%
10 | 3 | 4.5%

Maximum 10 values
Value | Count | Frequency (%)
99 | 1 | 1.5%
22 | 1 | 1.5%
21 | 3 | 4.5%
20 | 3 | 4.5%
19 | 3 | 4.5%
18 | 2 | 3.0%
17 | 2 | 3.0%
16 | 2 | 3.0%
15 | 4 | 6.0%
14 | 5 | 7.5%

사용유무
Boolean

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 199.0 B
True: 65
False: 2

Value | Count | Frequency (%)
True | 65 | 97.0%
False | 2 | 3.0%

입력일시
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 7.5%
Missing: 0
Missing (%): 0.0%
Memory size: 668.0 B
2021-10-18: 30
2021-10-12: 20
2021-09-28: 14
2021-10-06: 2
2021-09-30: 1

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 1
Unique (%): 1.5%

Sample

1st row: 2021-09-28
2nd row: 2021-09-28
3rd row: 2021-09-28
4th row: 2021-09-28
5th row: 2021-09-28

Common Values

Value | Count | Frequency (%)
2021-10-18 | 30 | 44.8%
2021-10-12 | 20 | 29.9%
2021-09-28 | 14 | 20.9%
2021-10-06 | 2 | 3.0%
2021-09-30 | 1 | 1.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2021-10-18 | 30 | 44.8%
2021-10-12 | 20 | 29.9%
2021-09-28 | 14 | 20.9%
2021-10-06 | 2 | 3.0%
2021-09-30 | 1 | 1.5%

Interactions


Correlations

 | 문서명 | 학교_교육원 구분 | 문서유형 | 문서정렬번호 | 사용유무 | 입력일시
문서명 | 1.000 | 1.000 | 0.675 | 0.748 | 0.000 | 0.990
학교_교육원 구분 | 1.000 | 1.000 | 0.313 | 0.507 | 0.000 | 0.000
문서유형 | 0.675 | 0.313 | 1.000 | 0.717 | 0.000 | 0.623
문서정렬번호 | 0.748 | 0.507 | 0.717 | 1.000 | 0.884 | 0.562
사용유무 | 0.000 | 0.000 | 0.000 | 0.884 | 1.000 | 0.372
입력일시 | 0.990 | 0.000 | 0.623 | 0.562 | 0.372 | 1.000

 | 사용유무 | 입력일시 | 문서유형 | 학교_교육원 구분
사용유무 | 1.000 | 0.442 | 0.000 | 0.000
입력일시 | 0.442 | 1.000 | 0.547 | 0.000
문서유형 | 0.000 | 0.547 | 1.000 | 0.204
학교_교육원 구분 | 0.000 | 0.000 | 0.204 | 1.000

 | 문서정렬번호 | 학교_교육원 구분 | 문서유형 | 사용유무 | 입력일시
문서정렬번호 | 1.000 | 0.338 | 0.356 | 0.680 | 0.484
학교_교육원 구분 | 0.338 | 1.000 | 0.204 | 0.000 | 0.000
문서유형 | 0.356 | 0.204 | 1.000 | 0.000 | 0.547
사용유무 | 0.680 | 0.000 | 0.000 | 1.000 | 0.442
입력일시 | 0.484 | 0.000 | 0.547 | 0.442 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

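First rows
 | 문서명 | 학교_교육원 구분 | 문서유형 | 문서정렬번호 | 사용유무 | 입력일시
0 | 세입결산서 | 교육원 | 결산서 | 1 | Y | 2021-09-28
1 | 세출결산서 | 교육원 | 결산서 | 2 | Y | 2021-09-28
2 | 법인회계 세입 결산명세서 | 학교 | 결산서 | 1 | Y | 2021-09-28
3 | 교비회계 세입 결산명세서 | 학교 | 결산서 | 3 | Y | 2021-09-28
4 | 학교발전기금회계 세입 결산명세서 | 학교 | 결산서 | 5 | Y | 2021-09-28
5 | 법인회계 세출 결산명세서 | 학교 | 결산서 | 2 | Y | 2021-09-28
6 | 교비회계 세출 결산명세서 | 학교 | 결산서 | 4 | Y | 2021-09-28
7 | 학교발전기금회계 세출 결산명세서 | 학교 | 결산서 | 6 | Y | 2021-09-28
8 | 현금 및 예금명세서 | 학교 | 결산서 부속서류 | 12 | Y | 2021-10-12
9 | 법인 차입금 명세서 | 학교 | 결산서 부속서류 | 18 | Y | 2021-10-12

Last rows
 | 문서명 | 학교_교육원 구분 | 문서유형 | 문서정렬번호 | 사용유무 | 입력일시
57 | 예금잔액증명서 | 교육원 | 결산서 부속서류 | 4 | Y | 2021-10-12
58 | 현금실사표 | 교육원 | 결산서 부속서류 | 5 | Y | 2021-10-12
59 | 은행잔고 불부합조서 | 교육원 | 결산서 부속서류 | 6 | Y | 2021-10-18
60 | 퇴직급여충당금 적립 및 사용명세서 | 교육원 | 결산서 부속서류 | 7 | Y | 2021-10-18
61 | 세부집행내역(경상운영비) | 교육원 | 결산서 부속서류 | 8 | Y | 2021-10-18
62 | 세부집행내역(사업비) | 교육원 | 결산서 부속서류 | 9 | Y | 2021-10-18
63 | 세부집행내역(대수선비) | 교육원 | 결산서 부속서류 | 10 | Y | 2021-10-18
64 | 해외 초·중등학교 한국어 채택 사업비 최종보고서 | 교육원 | 결산서 부속서류 | 11 | Y | 2021-10-18
65 | 한국교육원 세입세출 집행내역명세표 | 교육원 | 결산서 | 3 | Y | 2021-10-12
66 | 교비 퇴직적립금 적립 및 사용계획서 | 학교 | 예산서 부속서류 | 14 | Y | 2021-10-18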