Overview

Dataset statistics

Number of variables: 3
Number of observations: 2776
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 67.9 KiB
Average record size in memory: 25.0 B

Variable types

Numeric: 1
Categorical: 1
Text: 1

Dataset

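Description: Data on colleges (including universities, junior colleges, and cyber universities), providing fields such as serial number (일련번호), status (상태), and school name (학교명).
Author: 국가평생교육진흥원 (National Institute for Lifelong Education)
URL: https://www.data.go.kr/data/15070852/fileData.do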

Alerts

일련번호 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 11:19:24.884353
Analysis finished: 2023-12-12 11:19:25.852213
Duration: 0.97 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
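The reported duration can be checked from the two timestamps. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the report.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of the report.
started = datetime.fromisoformat("2023-12-12 11:19:24.884353")
finished = datetime.fromisoformat("2023-12-12 11:19:25.852213")

# Subtracting two datetimes yields a timedelta; convert it to seconds.
duration_seconds = (finished - started).total_seconds()

# Format to two decimals, the precision the report uses.
print(f"{duration_seconds:.2f} seconds")
```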

Variables

일련번호
Real number (ℝ)

UNIQUE 

Distinct: 2776
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1388.5
Minimum: 1
Maximum: 2776
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 24.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 139.75
Q1: 694.75
Median: 1388.5
Q3: 2082.25
95-th percentile: 2637.25
Maximum: 2776
Range: 2775
Interquartile range (IQR): 1387.5
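These quantiles are consistent with linear interpolation over the consecutive integers 1 to 2776. The sketch below reproduces them under that assumption; the exact column contents are inferred from the report (minimum 1, maximum 2776, 2776 distinct values), and the `percentile` helper is hypothetical, not a function of the profiling tool.

```python
import math


def percentile(sorted_values, q):
    """Return the q-th percentile (q in 0..100) using linear interpolation."""
    position = q * (len(sorted_values) - 1) / 100  # fractional 0-based index
    lower = math.floor(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


# Assumption: 일련번호 holds exactly the integers 1..2776.
serial_numbers = list(range(1, 2777))

quantiles = {q: percentile(serial_numbers, q) for q in (5, 25, 50, 75, 95)}
iqr = quantiles[75] - quantiles[25]
value_range = serial_numbers[-1] - serial_numbers[0]

print(quantiles, iqr, value_range)
```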

Descriptive statistics

Standard deviation: 801.5065
Coefficient of variation (CV): 0.57724631
Kurtosis: -1.2
Mean: 1388.5
Median Absolute Deviation (MAD): 694
Skewness: 0
Sum: 3854476
Variance: 642412.67
Monotonicity: Strictly increasing
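Most of these figures follow directly if 일련번호 is the sequence 1 to 2776. The sketch below recomputes them with Python's `statistics` module under that assumption; it uses the sample variance (n - 1 denominator), which is what matches the reported value.

```python
import statistics

# Assumption: 일련번호 holds exactly the integers 1..2776.
serial_numbers = list(range(1, 2777))

total = sum(serial_numbers)
mean = statistics.mean(serial_numbers)
variance = statistics.variance(serial_numbers)  # sample variance (n - 1)
std_dev = statistics.stdev(serial_numbers)      # square root of the sample variance
cv = std_dev / mean                             # coefficient of variation

# Median absolute deviation: median of the distances from the median.
median = statistics.median(serial_numbers)
mad = statistics.median(abs(x - median) for x in serial_numbers)

print(total, mean, round(variance, 2), round(std_dev, 4), round(cv, 8), mad)
```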
Histogram with fixed size bins (bins=50)
Common values
Value                Count  Frequency (%)
1                    1      < 0.1%
1856                 1      < 0.1%
1848                 1      < 0.1%
1849                 1      < 0.1%
1850                 1      < 0.1%
1851                 1      < 0.1%
1852                 1      < 0.1%
1853                 1      < 0.1%
1854                 1      < 0.1%
1855                 1      < 0.1%
Other values (2766)  2766   99.6%
Minimum 10 values
Value  Count  Frequency (%)
1      1      < 0.1%
2      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
5      1      < 0.1%
6      1      < 0.1%
7      1      < 0.1%
8      1      < 0.1%
9      1      < 0.1%
10     1      < 0.1%
Maximum 10 values
Value  Count  Frequency (%)
2776   1      < 0.1%
2775   1      < 0.1%
2774   1      < 0.1%
2773   1      < 0.1%
2772   1      < 0.1%
2771   1      < 0.1%
2770   1      < 0.1%
2769   1      < 0.1%
2768   1      < 0.1%
2767   1      < 0.1%

상태
Categorical

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.8 KiB
유효 (valid): 2017
변경 (changed): 624
삭제 (deleted): 81
통합됨 (merged): 54

Length

Max length: 3
Median length: 2
Mean length: 2.0194524
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 유효
2nd row: 변경
3rd row: 변경
4th row: 변경
5th row: 유효

Common Values

Value            Count  Frequency (%)
유효 (valid)     2017   72.7%
변경 (changed)   624    22.5%
삭제 (deleted)   81     2.9%
통합됨 (merged)  54     1.9%
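The percentages and the mean label length reported for 상태 can be recomputed from the four counts alone. A minimal sketch follows; the dictionary and variable names are illustrative.

```python
# Category counts copied from the Common Values table.
status_counts = {"유효": 2017, "변경": 624, "삭제": 81, "통합됨": 54}

total_rows = sum(status_counts.values())

# Share of each category, rounded to one decimal place as in the report.
percentages = {
    label: round(100 * count / total_rows, 1)
    for label, count in status_counts.items()
}

# Mean label length: each label's character count weighted by its frequency.
mean_length = sum(len(label) * count for label, count in status_counts.items()) / total_rows

print(total_rows, percentages, round(mean_length, 7))
```

Only 통합됨 has three characters, so the mean length sits just above 2.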

Length

Histogram of lengths of the category

학교명
Text

Distinct: 2766
Distinct (%): 99.6%
Missing: 0
Missing (%): 0.0%
Memory size: 21.8 KiB

Length

Max length: 61
Median length: 49
Mean length: 16.471902
Min length: 2

Characters and Unicode

Total characters: 45726
Distinct characters: 389
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
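As an illustration of such properties, Python's standard `unicodedata` module exposes the general category of each character. The sketch below counts categories for two school names taken from the report's sample rows; it is not the profiling tool's own implementation, and the `CATEGORY_NAMES` mapping covers only the four categories that occur in these two names.

```python
import unicodedata
from collections import Counter

# Two school names taken from the report's sample rows.
names = ["서울여자대학교", "University of Greenwich"]

# Long names for the general-category codes that occur in these names.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Lo": "Other Letter",
    "Zs": "Space Separator",
}

# Count the general category of every character across all names.
category_counts = Counter(
    unicodedata.category(ch) for name in names for ch in name
)
named_counts = {CATEGORY_NAMES[code]: n for code, n in category_counts.items()}

print(named_counts)
```

Hangul syllables fall under Other Letter, which is why that category ranks second in the report's category table.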

Unique

Unique: 2756
Unique (%): 99.3%

Sample

1st row: 서울여자대학교
2nd row: 서울예술전문대학
3rd row: 서울예술대학
4th row: 서울장로회신학교
5th row: 서울장신대학교
Value                Count  Frequency (%)
university           923    15.1%
of                   469    7.7%
college              215    3.5%
state                103    1.7%
the                  102    1.7%
institute            54     0.9%
and                  52     0.8%
technology           49     0.8%
international        43     0.7%
california           30     0.5%
Other values (2740)  4078   66.7%
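A word table like this can be approximated by lowercasing each name and splitting on whitespace. The sketch below applies that to the ten names shown in the last rows of the report's sample; the tokenization rule is an assumption, since the report does not state how words were extracted.

```python
from collections import Counter

# The ten school names from the last rows of the report's sample.
names = [
    "Bukhara Technological Institute of Food and L.I",
    "University of Greenwich",
    "Ramkhamhaeng University",
    "Montgomery County Community College",
    "American University of Central Asia",
    "Graffith University",
    "Deakin University",
    "Health Sciences University of Hokkaido",
    "San Jose Christian College",
    "INTI International College Subang",
]

# Assumed tokenization: lowercase, then split on whitespace.
word_counts = Counter(word for name in names for word in name.lower().split())

print(word_counts.most_common(3))
```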

Most occurring characters

Value               Count  Frequency (%)
i                   3499   7.7%
(space)             3363   7.4%
e                   2981   6.5%
n                   2774   6.1%
t                   2246   4.9%
o                   2046   4.5%
a                   1933   4.2%
r                   1782   3.9%
s                   1736   3.8%
(not recovered)     1658   3.6%
Other values (379)  21708  47.5%

Most occurring categories

Value              Count  Frequency (%)
Lowercase Letter   26986  59.0%
Other Letter       10429  22.8%
Uppercase Letter   4780   10.5%
Space Separator    3363   7.4%
Other Punctuation  94     0.2%
Decimal Number     22     < 0.1%
Close Punctuation  17     < 0.1%
Open Punctuation   17     < 0.1%
Dash Punctuation   14     < 0.1%
Letter Number      4      < 0.1%

Most frequent character per category

Other Letter
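Value               Count  Frequency (%)
(not recovered)     1658   15.9%
(not recovered)     1514   14.5%
(not recovered)     791    7.6%
(not recovered)     468    4.5%
(not recovered)     417    4.0%
(not recovered)     205    2.0%
(not recovered)     196    1.9%
(not recovered)     159    1.5%
(not recovered)     159    1.5%
(not recovered)     154    1.5%
Other values (306)  4708   45.1%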
Lowercase Letter
Value              Count  Frequency (%)
i                  3499   13.0%
e                  2981   11.0%
n                  2774   10.3%
t                  2246   8.3%
o                  2046   7.6%
a                  1933   7.2%
r                  1782   6.6%
s                  1736   6.4%
y                  1291   4.8%
l                  1181   4.4%
Other values (16)  5517   20.4%
Uppercase Letter
Value              Count  Frequency (%)
U                  1000   20.9%
C                  532    11.1%
S                  423    8.8%
T                  321    6.7%
I                  257    5.4%
N                  241    5.0%
M                  220    4.6%
A                  205    4.3%
E                  162    3.4%
B                  142    3.0%
Other values (16)  1277   26.7%
Decimal Number
Value  Count  Frequency (%)
1      8      36.4%
2      7      31.8%
4      2      9.1%
3      2      9.1%
5      1      4.5%
7      1      4.5%
6      1      4.5%
Other Punctuation
Value            Count  Frequency (%)
,                27     28.7%
.                25     26.6%
'                23     24.5%
&                12     12.8%
(not recovered)  5      5.3%
"                2      2.1%
Letter Number
Value            Count  Frequency (%)
(not recovered)  1      25.0%
(not recovered)  1      25.0%
(not recovered)  1      25.0%
(not recovered)  1      25.0%
Space Separator
Value    Count  Frequency (%)
(space)  3363   100.0%
Close Punctuation
Value  Count  Frequency (%)
)      17     100.0%
Open Punctuation
Value  Count  Frequency (%)
(      17     100.0%
Dash Punctuation
Value  Count  Frequency (%)
-      14     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   31770  69.5%
Hangul  10425  22.8%
Common  3527   7.7%
Han     4      < 0.1%

Most frequent character per script

Hangul
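Value               Count  Frequency (%)
(not recovered)     1658   15.9%
(not recovered)     1514   14.5%
(not recovered)     791    7.6%
(not recovered)     468    4.5%
(not recovered)     417    4.0%
(not recovered)     205    2.0%
(not recovered)     196    1.9%
(not recovered)     159    1.5%
(not recovered)     159    1.5%
(not recovered)     154    1.5%
Other values (302)  4704   45.1%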
Latin
Value              Count  Frequency (%)
i                  3499   11.0%
e                  2981   9.4%
n                  2774   8.7%
t                  2246   7.1%
o                  2046   6.4%
a                  1933   6.1%
r                  1782   5.6%
s                  1736   5.5%
y                  1291   4.1%
l                  1181   3.7%
Other values (46)  10301  32.4%
Common
Value             Count  Frequency (%)
(space)           3363   95.4%
,                 27     0.8%
.                 25     0.7%
'                 23     0.7%
)                 17     0.5%
(                 17     0.5%
-                 14     0.4%
&                 12     0.3%
1                 8      0.2%
2                 7      0.2%
Other values (7)  14     0.4%
Han
Value            Count  Frequency (%)
(not recovered)  1      25.0%
(not recovered)  1      25.0%
(not recovered)  1      25.0%
(not recovered)  1      25.0%

Most occurring blocks

Value         Count  Frequency (%)
ASCII         35288  77.2%
Hangul        10425  22.8%
None          5      < 0.1%
Number Forms  4      < 0.1%
CJK           4      < 0.1%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
i                  3499   9.9%
(space)            3363   9.5%
e                  2981   8.4%
n                  2774   7.9%
t                  2246   6.4%
o                  2046   5.8%
a                  1933   5.5%
r                  1782   5.0%
s                  1736   4.9%
y                  1291   3.7%
Other values (58)  11637  33.0%
Hangul
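Value               Count  Frequency (%)
(not recovered)     1658   15.9%
(not recovered)     1514   14.5%
(not recovered)     791    7.6%
(not recovered)     468    4.5%
(not recovered)     417    4.0%
(not recovered)     205    2.0%
(not recovered)     196    1.9%
(not recovered)     159    1.5%
(not recovered)     159    1.5%
(not recovered)     154    1.5%
Other values (302)  4704   45.1%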
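None
Value            Count  Frequency (%)
(not recovered)  5      100.0%
Number Forms
Value            Count  Frequency (%)
(not recovered)  1      25.0%
(not recovered)  1      25.0%
(not recovered)  1      25.0%
(not recovered)  1      25.0%
CJK
Value            Count  Frequency (%)
(not recovered)  1      25.0%
(not recovered)  1      25.0%
(not recovered)  1      25.0%
(not recovered)  1      25.0%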

Interactions


Correlations

          일련번호  상태
일련번호  1.000     0.443
상태      0.443     1.000

          일련번호  상태
일련번호  1.000     0.279
상태      0.279     1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index  일련번호  상태  학교명
0      1         유효  서울여자대학교
1      2         변경  서울예술전문대학
2      3         변경  서울예술대학
3      4         변경  서울장로회신학교
4      5         유효  서울장신대학교
5      6         유효  서원대학교
6      7         변경  서일전문대학
7      8         변경  서일대학
8      9         변경  서정대학
9      10        변경  군산전문대학

Index  일련번호  상태  학교명
2766   2767      유효  Bukhara Technological Institute of Food and L.I
2767   2768      유효  University of Greenwich
2768   2769      유효  Ramkhamhaeng University
2769   2770      유효  Montgomery County Community College
2770   2771      유효  American University of Central Asia
2771   2772      유효  Graffith University
2772   2773      유효  Deakin University
2773   2774      유효  Health Sciences University of Hokkaido
2774   2775      유효  San Jose Christian College
2775   2776      유효  INTI International College Subang