Overview

Dataset statistics

Number of variables: 6
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 413
Duplicate rows (%): 4.1%
Total size in memory: 576.2 KiB
Average record size in memory: 59.0 B
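
The two derived figures in this block follow from the raw ones by simple arithmetic. The short check below is a sketch, assuming 1 KiB = 1024 bytes and that the percentage is taken over all observations; the variable names are illustrative and do not come from the report.

```python
# Figures copied from the "Dataset statistics" block.
n_rows = 10_000
n_duplicate_rows = 413
total_size_kib = 576.2

# Share of duplicate rows among all observations.
duplicate_pct = 100 * n_duplicate_rows / n_rows

# Average bytes per record, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Rounded to one decimal place, both values agree with the 4.1% and 59.0 B shown in the report.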

Variable types

Categorical: 2
Numeric: 2
Text: 1
DateTime: 1

Dataset

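Description: Chungcheongbuk-do, Chungbuk Provincial University website system (충청북도_충북도립대학홈페이지시스템). A log of searches made on the website, providing the search year (검색연도), search month (검색월), search day (검색일), search time (검색시간), search keyword (검색키워드), and search registration date (검색 등록일), among other fields.
Author: Chungcheongbuk-do (충청북도)
URL: https://www.data.go.kr/data/15041875/fileData.do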

Alerts

Dataset has 413 (4.1%) duplicate rows [Duplicates]
검색월 is highly overall correlated with 검색연도 [High correlation]
검색연도 is highly overall correlated with 검색월 [High correlation]
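
One plausible reading of the high-correlation alert is that the year is fully determined by the month. The 검색 등록일 values run from 2019-10-01 to 2020-08-31, so months 10 to 12 can only belong to 2019 and months 1 to 8 only to 2020. The sketch below checks that reading against the frequency tables in this report; the helper `year_for_month` is hypothetical and encodes that assumption.

```python
# Month -> count, copied from the 검색월 frequency table in this report.
month_counts = {1: 705, 2: 835, 3: 831, 4: 908, 5: 1300, 6: 1155,
                7: 1768, 8: 604, 10: 509, 11: 567, 12: 818}

# Year -> count, copied from the 검색연도 frequency table.
year_counts = {2019: 1894, 2020: 8106}


def year_for_month(month: int) -> int:
    """Assumed mapping: October to December fall in 2019, January to August in 2020."""
    return 2019 if month >= 10 else 2020


# Rebuild the year counts from the month counts alone.
implied_year_counts = {2019: 0, 2020: 0}
for month, count in month_counts.items():
    implied_year_counts[year_for_month(month)] += count

print(implied_year_counts)
print(implied_year_counts == year_counts)
```

If the rebuilt counts equal the reported ones, the month column carries all the information in the year column, which is consistent with the 1.000 entries in the correlation matrices.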

Reproduction

Analysis started: 2023-12-12 08:47:21.628086
Analysis finished: 2023-12-12 08:47:22.893686
Duration: 1.27 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

검색연도 (search year)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value  Count
2020   8106
2019   1894

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020
2nd row: 2020
3rd row: 2020
4th row: 2020
5th row: 2020

Common Values

Value  Count  Frequency (%)
2020   8106   81.1%
2019   1894   18.9%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values 2020 (8106, 81.1%) and 2019 (1894, 18.9%)]

검색월 (search month)
Real number (ℝ)

HIGH CORRELATION

Distinct: 11
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.0281
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 6
Q3: 7
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 3.1519683
Coefficient of variation (CV): 0.52287922
Kurtosis: -0.68788637
Mean: 6.0281
Median Absolute Deviation (MAD): 2
Skewness: 0.3379592
Sum: 60281
Variance: 9.9349039
Monotonicity: Not monotonic
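
Several of these statistics can be recomputed from the 검색월 frequency table alone. The sketch below does so with the Python standard library. It assumes the sample variance (division by n - 1), which is the convention that comes out close to the reported 9.9349039; the report itself does not state the convention.

```python
import math
import statistics

# Month -> count, copied from the 검색월 frequency table in this report.
month_counts = {1: 705, 2: 835, 3: 831, 4: 908, 5: 1300, 6: 1155,
                7: 1768, 8: 604, 10: 509, 11: 567, 12: 818}

n = sum(month_counts.values())
total = sum(month * count for month, count in month_counts.items())
mean = total / n

# Sample variance (divide by n - 1); an assumed convention, see the note above.
squared_dev = sum(count * (month - mean) ** 2 for month, count in month_counts.items())
variance = squared_dev / (n - 1)
std = math.sqrt(variance)
cv = std / mean

# Expand the table into one value per observation for the order statistics.
values = sorted(month for month, count in month_counts.items() for _ in range(count))
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

print(f"n={n} sum={total} mean={mean:.4f}")
print(f"variance={variance:.7f} std={std:.7f} cv={cv:.8f}")
print(f"median={median} MAD={mad}")
```

The same approach works for 검색일, provided the full 31-value frequency table is available rather than only the top-10 excerpt.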
[Figure: Histogram with fixed size bins (bins=11)]

Most frequent values

Value  Count  Frequency (%)
7      1768   17.7%
5      1300   13.0%
6      1155   11.6%
4      908    9.1%
2      835    8.3%
3      831    8.3%
12     818    8.2%
1      705    7.0%
8      604    6.0%
11     567    5.7%

Lowest 10 values

Value  Count  Frequency (%)
1      705    7.0%
2      835    8.3%
3      831    8.3%
4      908    9.1%
5      1300   13.0%
6      1155   11.6%
7      1768   17.7%
8      604    6.0%
10     509    5.1%
11     567    5.7%

Highest 10 values

Value  Count  Frequency (%)
12     818    8.2%
11     567    5.7%
10     509    5.1%
8      604    6.0%
7      1768   17.7%
6      1155   11.6%
5      1300   13.0%
4      908    9.1%
3      831    8.3%
2      835    8.3%

검색일 (search day)
Real number (ℝ)

Distinct: 31
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.7332
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 10
Median: 17.5
Q3: 24
95-th percentile: 29
Maximum: 31
Range: 30
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 8.483974
Coefficient of variation (CV): 0.50701444
Kurtosis: -1.0646254
Mean: 16.7332
Median Absolute Deviation (MAD): 6.5
Skewness: -0.17870343
Sum: 167332
Variance: 71.977816
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=31)]

Most frequent values

Value              Count  Frequency (%)
22                 432    4.3%
19                 413    4.1%
21                 411    4.1%
18                 408    4.1%
23                 398    4.0%
17                 398    4.0%
20                 392    3.9%
16                 380    3.8%
15                 364    3.6%
27                 353    3.5%
Other values (21)  6051   60.5%

Lowest 10 values

Value  Count  Frequency (%)
1      251    2.5%
2      265    2.6%
3      273    2.7%
4      270    2.7%
5      253    2.5%
6      303    3.0%
7      307    3.1%
8      264    2.6%
9      263    2.6%
10     271    2.7%

Highest 10 values

Value  Count  Frequency (%)
31     214    2.1%
30     280    2.8%
29     320    3.2%
28     341    3.4%
27     353    3.5%
26     340    3.4%
25     349    3.5%
24     349    3.5%
23     398    4.0%
22     432    4.3%

검색시간 (search time)
Categorical

Distinct: 24
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value              Count
15:00              469
17:00              463
16:00              452
12:00              444
14:00              441
Other values (19)  7731

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 06:00
2nd row: 02:00
3rd row: 05:00
4th row: 16:00
5th row: 03:00

Common Values

Value              Count  Frequency (%)
15:00              469    4.7%
17:00              463    4.6%
16:00              452    4.5%
12:00              444    4.4%
14:00              441    4.4%
13:00              437    4.4%
00:00              432    4.3%
02:00              432    4.3%
19:00              427    4.3%
08:00              424    4.2%
Other values (14)  5579   55.8%

Length

[Figure: Histogram of lengths of the category]

검색키워드 (search keyword)
Text

Distinct: 454
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 19
Median length: 2
Mean length: 2.8004
Min length: 1

Characters and Unicode

Total characters: 28004
Distinct characters: 329
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
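
As an illustration of the idea, the sketch below uses Python's standard `unicodedata` module to count general categories per character. It is not the profiler's own implementation. The first five strings are the sample rows listed for this variable; the string "e-book 2020" is a hypothetical keyword added only to exercise the non-Hangul categories, and `CATEGORY_NAMES` and `count_categories` are illustrative names.

```python
import unicodedata
from collections import Counter

# Two-letter general-category codes mapped to the names used in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
    "Zs": "Space Separator",
}


def count_categories(strings):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for text in strings:
        for ch in text:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


sample_keywords = ["상담", "일정", "휴학", "주말", "휴학"]  # sample rows from the report
hypothetical_keywords = ["e-book 2020"]                    # hypothetical, for illustration
counts = count_categories(sample_keywords + hypothetical_keywords)

for name, count in counts.most_common():
    print(name, count)
```

Hangul syllables fall under "Other Letter", which is why that category dominates the keyword column. The standard library does not expose script or block properties, so those two breakdowns would need a third-party package.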

Unique

Unique: 325
Unique (%): 3.2%

Sample

1st row: 상담 (counseling)
2nd row: 일정 (schedule)
3rd row: 휴학 (leave of absence)
4th row: 주말 (weekend)
5th row: 휴학 (leave of absence)

Value                         Count  Frequency (%)
일정 (schedule)                1238   12.3%
상담 (counseling)              930    9.3%
학사일정 (academic calendar)    871    8.7%
채용 (recruitment)             761    7.6%
성적 (grades)                  562    5.6%
기숙사 (dormitory)             410    4.1%
수강신청 (course registration)  409    4.1%
등록금 (tuition)               370    3.7%
휴학 (leave of absence)        356    3.5%
장학금 (scholarship)           340    3.4%
Other values (444)            3784   37.7%

Most occurring characters

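Value               Count  Frequency (%)
(unrecovered)       2622   9.4%
(unrecovered)       2156   7.7%
(unrecovered)       2139   7.6%
(unrecovered)       1457   5.2%
(unrecovered)       934    3.3%
(unrecovered)       930    3.3%
(unrecovered)       763    2.7%
(unrecovered)       763    2.7%
(unrecovered)       761    2.7%
(unrecovered)       684    2.4%
Other values (319)  14795  52.8%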

Most occurring categories

Value              Count  Frequency (%)
Other Letter       27658  98.8%
Lowercase Letter   196    0.7%
Dash Punctuation   55     0.2%
Decimal Number     40     0.1%
Space Separator    31     0.1%
Uppercase Letter   14     < 0.1%
Other Punctuation  10     < 0.1%

Most frequent character per category

Other Letter

Value               Count  Frequency (%)
(unrecovered)       2622   9.5%
(unrecovered)       2156   7.8%
(unrecovered)       2139   7.7%
(unrecovered)       1457   5.3%
(unrecovered)       934    3.4%
(unrecovered)       930    3.4%
(unrecovered)       763    2.8%
(unrecovered)       763    2.8%
(unrecovered)       761    2.8%
(unrecovered)       684    2.5%
Other values (276)  14449  52.2%

Lowercase Letter

Value              Count  Frequency (%)
e                  57     29.1%
s                  23     11.7%
n                  22     11.2%
d                  21     10.7%
c                  12     6.1%
i                  8      4.1%
p                  7      3.6%
u                  6      3.1%
r                  5      2.6%
o                  5      2.6%
Other values (13)  30     15.3%

Decimal Number

Value  Count  Frequency (%)
2      12     30.0%
0      9      22.5%
1      7      17.5%
3      6      15.0%
8      2      5.0%
7      2      5.0%
9      1      2.5%
4      1      2.5%

Uppercase Letter

Value  Count  Frequency (%)
M      3      21.4%
S      3      21.4%
W      2      14.3%
V      2      14.3%
C      1      7.1%
L      1      7.1%
B      1      7.1%
P      1      7.1%

Other Punctuation

Value  Count  Frequency (%)
.      6      60.0%
@      4      40.0%

Dash Punctuation

Value  Count  Frequency (%)
-      55     100.0%

Space Separator

Value    Count  Frequency (%)
(space)  31     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  27658  98.8%
Latin   210    0.7%
Common  136    0.5%

Most frequent character per script

Hangul

Value               Count  Frequency (%)
(unrecovered)       2622   9.5%
(unrecovered)       2156   7.8%
(unrecovered)       2139   7.7%
(unrecovered)       1457   5.3%
(unrecovered)       934    3.4%
(unrecovered)       930    3.4%
(unrecovered)       763    2.8%
(unrecovered)       763    2.8%
(unrecovered)       761    2.8%
(unrecovered)       684    2.5%
Other values (276)  14449  52.2%

Latin

Value              Count  Frequency (%)
e                  57     27.1%
s                  23     11.0%
n                  22     10.5%
d                  21     10.0%
c                  12     5.7%
i                  8      3.8%
p                  7      3.3%
u                  6      2.9%
r                  5      2.4%
o                  5      2.4%
Other values (21)  44     21.0%

Common

Value             Count  Frequency (%)
-                 55     40.4%
(space)           31     22.8%
2                 12     8.8%
0                 9      6.6%
1                 7      5.1%
.                 6      4.4%
3                 6      4.4%
@                 4      2.9%
8                 2      1.5%
7                 2      1.5%
Other values (2)  2      1.5%

Most occurring blocks

Value        Count  Frequency (%)
Hangul       27657  98.8%
ASCII        346    1.2%
Compat Jamo  1      < 0.1%

Most frequent character per block

Hangul

Value               Count  Frequency (%)
(unrecovered)       2622   9.5%
(unrecovered)       2156   7.8%
(unrecovered)       2139   7.7%
(unrecovered)       1457   5.3%
(unrecovered)       934    3.4%
(unrecovered)       930    3.4%
(unrecovered)       763    2.8%
(unrecovered)       763    2.8%
(unrecovered)       761    2.8%
(unrecovered)       684    2.5%
Other values (275)  14448  52.2%

ASCII

Value              Count  Frequency (%)
e                  57     16.5%
-                  55     15.9%
(space)            31     9.0%
s                  23     6.6%
n                  22     6.4%
d                  21     6.1%
2                  12     3.5%
c                  12     3.5%
0                  9      2.6%
i                  8      2.3%
Other values (33)  96     27.7%

Compat Jamo

Value          Count  Frequency (%)
(unrecovered)  1      100.0%

검색 등록일 (search registration date)
DateTime

Distinct: 321
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2019-10-01 00:00:00
Maximum: 2020-08-31 00:00:00
[Figure: Histogram with fixed size bins (bins=50)]

Interactions

[Figures: four interaction plots]

Correlations

[Figure: correlation heatmap]

          검색연도  검색월  검색일  검색시간
검색연도  1.000     1.000   0.097   0.006
검색월    1.000     1.000   0.273   0.080
검색일    0.097     0.273   1.000   0.029
검색시간  0.006     0.080   0.029   1.000

[Figure: correlation heatmap]

          검색시간  검색연도
검색시간  1.000     0.005
검색연도  0.005     1.000

[Figure: correlation heatmap]

          검색월  검색일  검색연도  검색시간
검색월    1.000   0.028   1.000     0.030
검색일    0.028   1.000   0.081     0.000
검색연도  1.000   0.081   1.000     0.005
검색시간  0.030   0.000   0.005     1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly pick out patterns in data completion.]

Sample

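(index)  검색연도  검색월  검색일  검색시간  검색키워드                    검색 등록일
67183    2020      6       25      06:00     상담 (counseling)             2020-06-25
38055    2020      3       19      02:00     일정 (schedule)               2020-03-19
38005    2020      3       19      05:00     휴학 (leave of absence)       2020-03-19
86608    2020      7       16      16:00     주말 (weekend)                2020-07-16
18768    2020      1       31      03:00     휴학 (leave of absence)       2020-01-31
33874    2020      3       31      02:00     성적 (grades)                 2020-03-31
50814    2020      4       2       18:00     상담 (counseling)             2020-04-02
4655     2019      10      3       17:00     버스 (bus)                    2019-10-03
26445    2020      2       26      23:00     공무원 (civil servant)        2020-02-26
75651    2020      6       4       15:00     시간표 (timetable)            2020-06-04

(index)  검색연도  검색월  검색일  검색시간  검색키워드                    검색 등록일
20097    2020      1       23      08:00     수강신청 (course registration) 2020-01-23
77615    2020      7       30      08:00     이력서 (résumé)               2020-07-30
60845    2020      5       11      11:00     상담 (counseling)             2020-05-11
53016    2020      5       27      12:00     일정 (schedule)               2020-05-27
94172    2020      8       31      15:00     강사료 (lecturer fee)         2020-08-31
21000    2020      1       19      03:00     일정 (schedule)               2020-01-19
25207    2020      1       2       14:00     채용 (recruitment)            2020-01-02
23216    2020      1       10      13:00     휴학 (leave of absence)       2020-01-10
3700     2019      10      9       15:00     성적 (grades)                 2019-10-09
90902    2020      7       7       15:00     학사일정 (academic calendar)  2020-07-07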

Duplicate rows

Most frequently occurring

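(index)  검색연도  검색월  검색일  검색시간  검색키워드                    검색 등록일  # duplicates
196      2020      6       8       08:00     일정 (schedule)               2020-06-08   4
62       2020      2       22      03:00     일정 (schedule)               2020-02-22   3
80       2020      3       24      19:00     상담 (counseling)             2020-03-24   3
97       2020      4       5       09:00     일정 (schedule)               2020-04-05   3
112      2020      4       21      14:00     학사일정 (academic calendar)  2020-04-21   3
118      2020      4       28      02:00     일정 (schedule)               2020-04-28   3
120      2020      5       4       23:00     일정 (schedule)               2020-05-04   3
122      2020      5       5       19:00     일정 (schedule)               2020-05-05   3
187      2020      6       6       05:00     일정 (schedule)               2020-06-06   3
190      2020      6       7       13:00     일정 (schedule)               2020-06-07   3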
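
A duplicate total can be counted in more than one way, and the report does not state which convention lies behind the figure of 413. The sketch below shows three common conventions on a small toy list built from rows that appear in this report. The repetition counts (4, 3, and 1) and all variable names are illustrative assumptions, not the real data.

```python
from collections import Counter

# Toy rows: (검색연도, 검색월, 검색일, 검색시간, 검색키워드, 검색 등록일).
row_a = (2020, 6, 8, "08:00", "일정", "2020-06-08")
row_b = (2020, 2, 22, "03:00", "일정", "2020-02-22")
row_c = (2020, 6, 25, "06:00", "상담", "2020-06-25")
rows = [row_a] * 4 + [row_b] * 3 + [row_c]

# How many times each distinct row occurs.
occurrences = Counter(rows)
duplicated = {row: n for row, n in occurrences.items() if n > 1}

# Convention 1: number of distinct rows that occur more than once.
n_duplicated_groups = len(duplicated)

# Convention 2: number of rows that belong to any duplicated group.
n_rows_in_groups = sum(duplicated.values())

# Convention 3: number of extra copies beyond the first occurrence.
n_extra_copies = sum(n - 1 for n in duplicated.values())

print(n_duplicated_groups, n_rows_in_groups, n_extra_copies)
```

On this toy list the three conventions give different totals, so a duplicate count is only comparable across tools once the convention is known.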