Overview

Dataset statistics

Number of variables: 4
Number of observations: 1347
Missing cells: 5
Missing cells (%): 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 43.5 KiB
Average record size in memory: 33.1 B
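
The two derived figures in this table follow from the raw counts. The sketch below recomputes them from the reported values only; it assumes that the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes. Because the reported total size is itself rounded, the results are approximate.

```python
# Values copied from the "Dataset statistics" table.
n_variables = 4
n_observations = 1347
missing_cells = 5
total_size_kib = 43.5

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")           # Missing cells (%): 0.1%
print(f"Average record size: {avg_record_bytes:.1f} B")   # Average record size: 33.1 B
```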

Variable types

Numeric: 1
Categorical: 1
Text: 1
Unsupported: 1

Dataset

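Description: Data on the academic calendar of Gyeonggi-do, providing the graduation ceremony schedule for the 2023 academic year for elementary schools (1,349 schools), middle schools (672 schools), and high schools (489 schools).
URL: https://www.data.go.kr/data/15013362/fileData.do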

Alerts

The unnamed column is highly overall correlated with 지역 (region) [High correlation]
지역 (region) is highly overall correlated with the unnamed column [High correlation]
졸업일 (graduation date) is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]

Reproduction

Analysis started: 2023-12-12 19:50:17.568602
Analysis finished: 2023-12-12 19:50:18.399291
Duration: 0.83 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables


(unnamed column)
Real number (ℝ)

HIGH CORRELATION

Distinct: 1343
Distinct (%): 100.0%
Missing: 4
Missing (%): 0.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 674.12733
Minimum: 1
Maximum: 1347
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 68.1
Q1: 337.5
Median: 674
Q3: 1010.5
95-th percentile: 1279.9
Maximum: 1347
Range: 1346
Interquartile range (IQR): 673

Descriptive statistics

Standard deviation: 388.80398
Coefficient of variation (CV): 0.57675155
Kurtosis: -1.1984607
Mean: 674.12733
Median Absolute Deviation (MAD): 337
Skewness: 0.00012741965
Sum: 905353
Variance: 151168.53
Monotonicity: Strictly increasing
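
Several of the figures in the quantile and descriptive tables are derived from others. As a quick consistency check, the sketch below recomputes them from the reported base values using the standard definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1, range = maximum - minimum). The inputs are the rounded numbers printed in this report, so small rounding differences are expected.

```python
# Base values copied from the report for the unnamed numeric column.
mean = 674.12733
std = 388.80398
q1, q3 = 337.5, 1010.5
minimum, maximum = 1, 1347

# Derived statistics, using their standard definitions.
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV={cv:.6f} variance={variance:.2f} IQR={iqr} range={value_range}")
```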
Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
897                  1      0.1%
905                  1      0.1%
904                  1      0.1%
903                  1      0.1%
902                  1      0.1%
901                  1      0.1%
900                  1      0.1%
899                  1      0.1%
898                  1      0.1%
895                  1      0.1%
Other values (1333)  1333   99.0%
(Missing)            4      0.3%

Minimum 10 values

Value  Count  Frequency (%)
1      1      0.1%
2      1      0.1%
3      1      0.1%
4      1      0.1%
5      1      0.1%
6      1      0.1%
7      1      0.1%
8      1      0.1%
9      1      0.1%
10     1      0.1%

Maximum 10 values

Value  Count  Frequency (%)
1347   1      0.1%
1346   1      0.1%
1345   1      0.1%
1344   1      0.1%
1343   1      0.1%
1342   1      0.1%
1341   1      0.1%
1340   1      0.1%
1339   1      0.1%
1338   1      0.1%

지역
Categorical

HIGH CORRELATION 

Distinct: 25
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 10.7 KiB

Value              Count
화성오산           129
용인               104
수원               100
고양               90
구리남양주         87
Other values (20)  837

Length

Max length: 5
Median length: 2
Mean length: 2.731997
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 가평
2nd row: 가평
3rd row: 가평
4th row: 가평
5th row: 가평

Common Values

Value              Count  Frequency (%)
화성오산           129    9.6%
용인               104    7.7%
수원               100    7.4%
고양               90     6.7%
구리남양주         87     6.5%
성남               72     5.3%
평택               67     5.0%
부천               64     4.8%
파주               60     4.5%
광주하남           56     4.2%
Other values (15)  518    38.5%

Length

Histogram of lengths of the category

학교명 (school name)
Text

Distinct: 1336
Distinct (%): 99.2%
Missing: 0
Missing (%): 0.0%
Memory size: 10.7 KiB

Length

Max length: 11
Median length: 3
Mean length: 3.450631
Min length: 3

Characters and Unicode

Total characters: 4648
Distinct characters: 291
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1325
Unique (%): 98.4%
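
The report lists both a Distinct count (1336) and a Unique count (1325) for this column. The sketch below illustrates the difference under the usual reading of these terms: distinct counts the different values present, while unique counts the values that occur exactly once. The function name and the sample list are hypothetical and are not taken from the dataset.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values occurring exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Hypothetical sample: "대성초" appears twice, so it is distinct but not unique.
names = ["가평초", "대성초", "목동초", "대성초", "삼성초"]
distinct, unique = distinct_and_unique(names)
print(distinct, unique)  # 4 3
```

Applied to the reported figures, 1336 - 1325 = 11 names are repeated and together they cover 1347 - 1325 = 22 rows, so each repeated name occurs exactly twice.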

Sample

1st row: 가평마장초
2nd row: 가평초
3rd row: 대성초
4th row: 목동초
5th row: 목동초명지분교장

Value                Count  Frequency (%)
교문초               2      0.1%
생금초               2      0.1%
서탄초               2      0.1%
원일초               2      0.1%
진위초               2      0.1%
송천분교             2      0.1%
서촌초               2      0.1%
삼성초               2      0.1%
오산초               2      0.1%
석천초               2      0.1%
Other values (1326)  1329   98.5%

Most occurring characters

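Value               Count  Frequency (%)
(not preserved)     1348   29.0%
(not preserved)     93     2.0%
(not preserved)     85     1.8%
(not preserved)     80     1.7%
(not preserved)     77     1.7%
(not preserved)     76     1.6%
(not preserved)     71     1.5%
(not preserved)     68     1.5%
(not preserved)     68     1.5%
(not preserved)     65     1.4%
Other values (281)  2617   56.3%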

Most occurring categories

Value              Count  Frequency (%)
Other Letter       4642   99.9%
Space Separator    2      < 0.1%
Close Punctuation  2      < 0.1%
Open Punctuation   2      < 0.1%

Most frequent character per category

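Other Letter
Value               Count  Frequency (%)
(not preserved)     1348   29.0%
(not preserved)     93     2.0%
(not preserved)     85     1.8%
(not preserved)     80     1.7%
(not preserved)     77     1.7%
(not preserved)     76     1.6%
(not preserved)     71     1.5%
(not preserved)     68     1.5%
(not preserved)     68     1.5%
(not preserved)     65     1.4%
Other values (278)  2611   56.2%

Space Separator
Value    Count  Frequency (%)
(space)  2      100.0%

Close Punctuation
Value  Count  Frequency (%)
)      2      100.0%

Open Punctuation
Value  Count  Frequency (%)
(      2      100.0%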

Most occurring scripts

Value   Count  Frequency (%)
Hangul  4642   99.9%
Common  6      0.1%

Most frequent character per script

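Hangul
Value               Count  Frequency (%)
(not preserved)     1348   29.0%
(not preserved)     93     2.0%
(not preserved)     85     1.8%
(not preserved)     80     1.7%
(not preserved)     77     1.7%
(not preserved)     76     1.6%
(not preserved)     71     1.5%
(not preserved)     68     1.5%
(not preserved)     68     1.5%
(not preserved)     65     1.4%
Other values (278)  2611   56.2%

Common
Value    Count  Frequency (%)
(space)  2      33.3%
)        2      33.3%
(        2      33.3%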

Most occurring blocks

Value   Count  Frequency (%)
Hangul  4642   99.9%
ASCII   6      0.1%

Most frequent character per block

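Hangul
Value               Count  Frequency (%)
(not preserved)     1348   29.0%
(not preserved)     93     2.0%
(not preserved)     85     1.8%
(not preserved)     80     1.7%
(not preserved)     77     1.7%
(not preserved)     76     1.6%
(not preserved)     71     1.5%
(not preserved)     68     1.5%
(not preserved)     68     1.5%
(not preserved)     65     1.4%
Other values (278)  2611   56.2%

ASCII
Value    Count  Frequency (%)
(space)  2      33.3%
)        2      33.3%
(        2      33.3%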

졸업일
Unsupported

REJECTED  UNSUPPORTED 

Missing: 1
Missing (%): 0.1%
Memory size: 10.7 KiB

Interactions


Correlations

                  (unnamed column)  지역
(unnamed column)  1.000             0.992
지역              0.992             1.000

                  (unnamed column)  지역
(unnamed column)  1.000             0.915
지역              0.915             1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
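
One common way to compute a nullity correlation is to take the Pearson correlation between per-column "is missing" indicators. The sketch below does this by hand for two columns; it is an illustration under that assumption, not necessarily the exact computation behind the heatmap in this report. The function name and the toy columns are hypothetical.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns."""
    # 1.0 where the value is missing (None), 0.0 where it is present.
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None  # undefined when a column is never (or always) missing
    return cov / sqrt(var_a * var_b)

# Hypothetical toy columns, not taken from the dataset.
index_col = [1, None, 3, None, 5, 6]
date_col = [45282, None, 45295, 45281, None, 45289]
r = nullity_correlation(index_col, date_col)
print(round(r, 2))  # 0.25
```

A value near 1 means the two columns tend to be missing in the same rows; a value near -1 means one tends to be present where the other is missing.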

Sample

First rows

(index)  (unnamed column)  지역  학교명            졸업일
0        1                 가평  가평마장초        45282
1        2                 가평  가평초            45294
2        3                 가평  대성초            45295
3        4                 가평  목동초            45281
4        5                 가평  목동초명지분교장  45281
5        6                 가평  미원초            45289
6        7                 가평  위곡분교장        45289
7        8                 가평  장락분교장        45289
8        9                 가평  방일초            45289
9        10                가평  상면초            45295

Last rows

(index)  (unnamed column)  지역  학교명            졸업일
1337     1338              평택  평택청아초        45289
1338     1339              평택  평택초            45296
1339     1340              평택  합정초            45296
1340     1341              평택  현덕초            45296
1341     1342              평택  현덕초광덕분교장  45296
1342     1343              평택  현일초            45289
1343     1344              평택  현촌초            45289
1344     1345              평택  현화초            45282
1345     1346              평택  홍원초            45289
1346     1347              평택  효덕초            45289
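
The 졸업일 values in the sample rows are integers such as 45282, which look like spreadsheet serial day numbers rather than parsed dates. The sketch below converts them to calendar dates, assuming they are Excel-style serials counted from the 1899-12-30 epoch (the convention that matches Excel for dates after February 1900). That interpretation is an assumption, not something the report states, and the helper name is hypothetical.

```python
from datetime import date, timedelta

# Assumption: values are Excel-style serial day numbers.
EXCEL_EPOCH = date(1899, 12, 30)

def excel_serial_to_date(serial):
    """Convert an Excel-style serial day number to a date; pass None through."""
    if serial is None:
        return None
    return EXCEL_EPOCH + timedelta(days=int(serial))

# Serial values taken from the sample rows of the report.
for serial in (45281, 45282, 45289, 45296):
    print(serial, excel_serial_to_date(serial).isoformat())
# 45281 2023-12-21
# 45282 2023-12-22
# 45289 2023-12-29
# 45296 2024-01-05
```

Under this assumption the values fall in late December 2023 and early January 2024. Converting the column this way before profiling may let it be treated as a date variable instead of an unsupported type.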