Overview

Dataset statistics

Number of variables: 6
Number of observations: 199
Missing cells: 24
Missing cells (%): 2.0%
Duplicate rows: 1
Duplicate rows (%): 0.5%
Total size in memory: 9.7 KiB
Average record size in memory: 49.7 B
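
The two percentages above follow from the raw counts. The short check below is a sketch that assumes the missing-cell rate is taken over all cells (variables × observations) and the duplicate rate over all rows; the report itself does not state its denominators.

```python
# Counts copied from the dataset statistics in this report.
n_variables = 6
n_observations = 199
missing_cells = 24
duplicate_rows = 1

# Assumed denominators: every cell for missing values, every row for duplicates.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"duplicate rows: {duplicate_pct:.1f}%")
```

Under these assumptions the rounded values agree with the 2.0% and 0.5% shown above.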

Variable types

Text: 4
Numeric: 1
DateTime: 1

Dataset

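Description: Data on books reviewed in the Chungcheongnam-do provincial government newspaper (도정신문), including each book's title, author, publisher, and the newspaper issue in which the review appeared.
Author: Chungcheongnam-do (충청남도)
URL: https://www.data.go.kr/data/15095052/fileData.do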

Alerts

Dataset has 1 (0.5%) duplicate rows [Duplicates]
도서명 (book title) has 4 (2.0%) missing values [Missing]
저자명 (author name) has 4 (2.0%) missing values [Missing]
출판사 (publisher) has 4 (2.0%) missing values [Missing]
발행연도 (publication year) has 4 (2.0%) missing values [Missing]
도정신문 발행일 (newspaper publication date) has 4 (2.0%) missing values [Missing]
도정신문 호수 (newspaper issue number) has 4 (2.0%) missing values [Missing]

Reproduction

Analysis started: 2024-04-06 08:19:01.228334
Analysis finished: 2024-04-06 08:19:02.917294
Duration: 1.69 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

도서명
Text

MISSING 

Distinct: 194
Distinct (%): 99.5%
Missing: 4
Missing (%): 2.0%
Memory size: 1.7 KiB

Length

Max length: 36
Median length: 18
Mean length: 11.317949
Min length: 1

Characters and Unicode

Total characters: 2207
Distinct characters: 432
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
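
To make the category counts concrete, here is a minimal sketch that tallies Unicode general categories for one title from the dataset sample, using Python's standard `unicodedata` module. The helper name `category_counts` and the partial label mapping are illustrative assumptions; the report does not show how ydata-profiling computes these figures internally.

```python
import unicodedata
from collections import Counter

# Partial mapping from Unicode general-category codes to the labels used
# in this report (only the codes that occur in the example title).
CATEGORY_LABELS = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
}


def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category."""
    return Counter(unicodedata.category(ch) for ch in text)


# One title taken from the dataset sample rows.
title = "LOVE, 사랑에 대해 알아야 할 모든 것"
counts = category_counts(title)

for code, n in counts.most_common():
    print(CATEGORY_LABELS.get(code, code), n)
```

Hangul syllables fall under "Other Letter", which is why that category dominates the text columns in this dataset.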

Unique

Unique: 193
Unique (%): 99.0%

Sample

1st row: 8체질 이야기
2nd row: 말의 품격
3rd row: 운다고 달라지는 일은 아무것도 없겠지만
4th row: 우리는 차별에 찬성합니다
5th row: 당신은 개를 키우면 안 된다
Value | Count | Frequency (%)
(not shown) | 5 | 0.8%
세계 | 4 | 0.6%
위한 | 4 | 0.6%
없다 | 4 | 0.6%
다시 | 3 | 0.5%
사는 | 3 | 0.5%
않는다 | 3 | 0.5%
(not shown) | 3 | 0.5%
나는 | 3 | 0.5%
개의 | 3 | 0.5%
Other values (545) | 595 | 94.4%

Most occurring characters

ValueCountFrequency (%)
435
 
19.7%
56
 
2.5%
51
 
2.3%
51
 
2.3%
48
 
2.2%
39
 
1.8%
30
 
1.4%
27
 
1.2%
26
 
1.2%
24
 
1.1%
Other values (422) 1420
64.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1711 | 77.5%
Space Separator | 435 | 19.7%
Decimal Number | 30 | 1.4%
Other Punctuation | 14 | 0.6%
Uppercase Letter | 11 | 0.5%
Lowercase Letter | 2 | 0.1%
Connector Punctuation | 1 | < 0.1%
Open Punctuation | 1 | < 0.1%
Dash Punctuation | 1 | < 0.1%
Close Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
56
 
3.3%
51
 
3.0%
51
 
3.0%
48
 
2.8%
39
 
2.3%
30
 
1.8%
27
 
1.6%
26
 
1.5%
24
 
1.4%
22
 
1.3%
Other values (394) 1337
78.1%
Uppercase Letter
ValueCountFrequency (%)
I 2
18.2%
A 1
9.1%
Z 1
9.1%
L 1
9.1%
E 1
9.1%
V 1
9.1%
O 1
9.1%
Q 1
9.1%
F 1
9.1%
B 1
9.1%
Decimal Number
ValueCountFrequency (%)
0 8
26.7%
1 7
23.3%
2 6
20.0%
9 4
13.3%
8 2
 
6.7%
5 2
 
6.7%
3 1
 
3.3%
Other Punctuation
ValueCountFrequency (%)
, 10
71.4%
: 2
 
14.3%
. 1
 
7.1%
? 1
 
7.1%
Lowercase Letter
ValueCountFrequency (%)
v 1
50.0%
s 1
50.0%
Space Separator
ValueCountFrequency (%)
435
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1711 | 77.5%
Common | 483 | 21.9%
Latin | 13 | 0.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
56
 
3.3%
51
 
3.0%
51
 
3.0%
48
 
2.8%
39
 
2.3%
30
 
1.8%
27
 
1.6%
26
 
1.5%
24
 
1.4%
22
 
1.3%
Other values (394) 1337
78.1%
Common
ValueCountFrequency (%)
435
90.1%
, 10
 
2.1%
0 8
 
1.7%
1 7
 
1.4%
2 6
 
1.2%
9 4
 
0.8%
8 2
 
0.4%
: 2
 
0.4%
5 2
 
0.4%
. 1
 
0.2%
Other values (6) 6
 
1.2%
Latin
ValueCountFrequency (%)
I 2
15.4%
A 1
7.7%
Z 1
7.7%
L 1
7.7%
E 1
7.7%
V 1
7.7%
O 1
7.7%
Q 1
7.7%
v 1
7.7%
s 1
7.7%
Other values (2) 2
15.4%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1711 | 77.5%
ASCII | 496 | 22.5%

Most frequent character per block

ASCII
ValueCountFrequency (%)
435
87.7%
, 10
 
2.0%
0 8
 
1.6%
1 7
 
1.4%
2 6
 
1.2%
9 4
 
0.8%
8 2
 
0.4%
I 2
 
0.4%
: 2
 
0.4%
5 2
 
0.4%
Other values (18) 18
 
3.6%
Hangul
ValueCountFrequency (%)
56
 
3.3%
51
 
3.0%
51
 
3.0%
48
 
2.8%
39
 
2.3%
30
 
1.8%
27
 
1.6%
26
 
1.5%
24
 
1.4%
22
 
1.3%
Other values (394) 1337
78.1%

저자명
Text

MISSING 

Distinct: 190
Distinct (%): 97.4%
Missing: 4
Missing (%): 2.0%
Memory size: 1.7 KiB

Length

Max length: 14
Median length: 3
Mean length: 4.5897436
Min length: 2

Characters and Unicode

Total characters: 895
Distinct characters: 271
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 185
Unique (%): 94.9%

Sample

1st row: 주석원
2nd row: 이기주
3rd row: 박준
4th row: 오찬호
5th row: 강형욱
Value | Count | Frequency (%)
(not shown) | 8 | 2.8%
김초엽 | 3 | 1.1%
(not shown) | 3 | 1.1%
브라이언 | 3 | 1.1%
오찬호 | 2 | 0.7%
피터 | 2 | 0.7%
a | 2 | 0.7%
b | 2 | 0.7%
최원형 | 2 | 0.7%
정세랑 | 2 | 0.7%
Other values (254) | 255 | 89.8%

Most occurring characters

ValueCountFrequency (%)
89
 
9.9%
37
 
4.1%
33
 
3.7%
14
 
1.6%
12
 
1.3%
11
 
1.2%
11
 
1.2%
11
 
1.2%
10
 
1.1%
10
 
1.1%
Other values (261) 657
73.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 777 | 86.8%
Space Separator | 89 | 9.9%
Other Punctuation | 12 | 1.3%
Decimal Number | 6 | 0.7%
Uppercase Letter | 6 | 0.7%
Close Punctuation | 2 | 0.2%
Open Punctuation | 2 | 0.2%
Lowercase Letter | 1 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
37
 
4.8%
33
 
4.2%
14
 
1.8%
12
 
1.5%
11
 
1.4%
11
 
1.4%
11
 
1.4%
10
 
1.3%
10
 
1.3%
10
 
1.3%
Other values (247) 618
79.5%
Decimal Number
ValueCountFrequency (%)
1 3
50.0%
8 1
 
16.7%
6 1
 
16.7%
4 1
 
16.7%
Uppercase Letter
ValueCountFrequency (%)
A 2
33.3%
B 2
33.3%
J 1
16.7%
M 1
16.7%
Other Punctuation
ValueCountFrequency (%)
. 7
58.3%
, 5
41.7%
Space Separator
ValueCountFrequency (%)
89
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Lowercase Letter
ValueCountFrequency (%)
w 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 777 | 86.8%
Common | 111 | 12.4%
Latin | 7 | 0.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
37
 
4.8%
33
 
4.2%
14
 
1.8%
12
 
1.5%
11
 
1.4%
11
 
1.4%
11
 
1.4%
10
 
1.3%
10
 
1.3%
10
 
1.3%
Other values (247) 618
79.5%
Common
ValueCountFrequency (%)
89
80.2%
. 7
 
6.3%
, 5
 
4.5%
1 3
 
2.7%
) 2
 
1.8%
( 2
 
1.8%
8 1
 
0.9%
6 1
 
0.9%
4 1
 
0.9%
Latin
ValueCountFrequency (%)
A 2
28.6%
B 2
28.6%
J 1
14.3%
M 1
14.3%
w 1
14.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 777 | 86.8%
ASCII | 118 | 13.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
89
75.4%
. 7
 
5.9%
, 5
 
4.2%
1 3
 
2.5%
A 2
 
1.7%
B 2
 
1.7%
) 2
 
1.7%
( 2
 
1.7%
J 1
 
0.8%
8 1
 
0.8%
Other values (4) 4
 
3.4%
Hangul
ValueCountFrequency (%)
37
 
4.8%
33
 
4.2%
14
 
1.8%
12
 
1.5%
11
 
1.4%
11
 
1.4%
11
 
1.4%
10
 
1.3%
10
 
1.3%
10
 
1.3%
Other values (247) 618
79.5%

출판사
Text

MISSING 

Distinct: 144
Distinct (%): 73.8%
Missing: 4
Missing (%): 2.0%
Memory size: 1.7 KiB

Length

Max length: 8
Median length: 7
Mean length: 4.0205128
Min length: 1

Characters and Unicode

Total characters: 784
Distinct characters: 218
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 114
Unique (%): 58.5%

Sample

1st row: 씨앗을뿌리는사람
2nd row: 황소북스
3rd row: 난다
4th row: 개마고원
5th row: 동아일보사
Value | Count | Frequency (%)
위즈덤하우스 | 7 | 3.6%
창비 | 6 | 3.0%
문학동네 | 5 | 2.5%
부키 | 5 | 2.5%
한겨레출판사 | 3 | 1.5%
인플루엔셜 | 3 | 1.5%
어크로스 | 3 | 1.5%
사이언스북스 | 3 | 1.5%
쌤앤파커스 | 3 | 1.5%
한빛비즈 | 3 | 1.5%
Other values (136) | 156 | 79.2%

Most occurring characters

ValueCountFrequency (%)
58
 
7.4%
31
 
4.0%
30
 
3.8%
17
 
2.2%
15
 
1.9%
14
 
1.8%
13
 
1.7%
13
 
1.7%
12
 
1.5%
11
 
1.4%
Other values (208) 570
72.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 779 | 99.4%
Space Separator | 2 | 0.3%
Decimal Number | 2 | 0.3%
Other Punctuation | 1 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
58
 
7.4%
31
 
4.0%
30
 
3.9%
17
 
2.2%
15
 
1.9%
14
 
1.8%
13
 
1.7%
13
 
1.7%
12
 
1.5%
11
 
1.4%
Other values (204) 565
72.5%
Decimal Number
ValueCountFrequency (%)
2 1
50.0%
1 1
50.0%
Space Separator
ValueCountFrequency (%)
2
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 779 | 99.4%
Common | 5 | 0.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
58
 
7.4%
31
 
4.0%
30
 
3.9%
17
 
2.2%
15
 
1.9%
14
 
1.8%
13
 
1.7%
13
 
1.7%
12
 
1.5%
11
 
1.4%
Other values (204) 565
72.5%
Common
ValueCountFrequency (%)
2
40.0%
. 1
20.0%
2 1
20.0%
1 1
20.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 779 | 99.4%
ASCII | 5 | 0.6%

Most frequent character per block

Hangul
ValueCountFrequency (%)
58
 
7.4%
31
 
4.0%
30
 
3.9%
17
 
2.2%
15
 
1.9%
14
 
1.8%
13
 
1.7%
13
 
1.7%
12
 
1.5%
11
 
1.4%
Other values (204) 565
72.5%
ASCII
ValueCountFrequency (%)
2
40.0%
. 1
20.0%
2 1
20.0%
1 1
20.0%

발행연도
Real number (ℝ)

MISSING 

Distinct: 19
Distinct (%): 9.7%
Missing: 4
Missing (%): 2.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2018.6923
Minimum: 1996
Maximum: 2024
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 1996
5-th percentile: 2011
Q1: 2017
Median: 2019
Q3: 2021
95-th percentile: 2023
Maximum: 2024
Range: 28
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.7164572
Coefficient of variation (CV): 0.0018410221
Kurtosis: 7.9667631
Mean: 2018.6923
Median Absolute Deviation (MAD): 2
Skewness: -2.1439081
Sum: 393645
Variance: 13.812054
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=19)
Common values
Value | Count | Frequency (%)
2021 | 29 | 14.6%
2020 | 27 | 13.6%
2019 | 26 | 13.1%
2022 | 24 | 12.1%
2018 | 21 | 10.6%
2017 | 18 | 9.0%
2023 | 14 | 7.0%
2016 | 12 | 6.0%
2011 | 6 | 3.0%
2014 | 4 | 2.0%
Other values (9) | 14 | 7.0%
(Missing) | 4 | 2.0%
Minimum 10 values
Value | Count | Frequency (%)
1996 | 1 | 0.5%
2005 | 1 | 0.5%
2006 | 1 | 0.5%
2007 | 1 | 0.5%
2010 | 2 | 1.0%
2011 | 6 | 3.0%
2012 | 1 | 0.5%
2013 | 3 | 1.5%
2014 | 4 | 2.0%
2015 | 3 | 1.5%
Maximum 10 values
Value | Count | Frequency (%)
2024 | 1 | 0.5%
2023 | 14 | 7.0%
2022 | 24 | 12.1%
2021 | 29 | 14.6%
2020 | 27 | 13.6%
2019 | 26 | 13.1%
2018 | 21 | 10.6%
2017 | 18 | 9.0%
2016 | 12 | 6.0%
2015 | 3 | 1.5%
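
The value tables for 발행연도 cover all 19 distinct years, so the summary statistics can be cross-checked from them. The sketch below merges the counts listed above and recomputes the sum, mean, variance, standard deviation, coefficient of variation, and median. Using the sample variance (divisor n − 1) is an assumption; it is chosen here because it reproduces the reported figure.

```python
# Year -> count, merged from the value tables above (195 non-missing rows).
year_counts = {
    1996: 1, 2005: 1, 2006: 1, 2007: 1, 2010: 2, 2011: 6, 2012: 1,
    2013: 3, 2014: 4, 2015: 3, 2016: 12, 2017: 18, 2018: 21, 2019: 26,
    2020: 27, 2021: 29, 2022: 24, 2023: 14, 2024: 1,
}

n = sum(year_counts.values())
total = sum(year * count for year, count in year_counts.items())
mean = total / n

# Sample variance (divisor n - 1); an assumption that matches the report.
variance = sum(
    count * (year - mean) ** 2 for year, count in year_counts.items()
) / (n - 1)
std = variance ** 0.5
cv = std / mean

# Median: middle element of the sorted, expanded values (n is odd here).
values = sorted(
    year for year, count in year_counts.items() for _ in range(count)
)
median = values[n // 2]

print(n, total, round(mean, 4), round(variance, 6), round(std, 7), median)
```

Agreement between these recomputed values and the descriptive statistics is a quick sanity check that the frequency tables and the summary describe the same 195 rows.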

도정신문 발행일
DateTime

Distinct: 195
Distinct (%): 100.0%
Missing: 4
Missing (%): 2.0%
Memory size: 1.7 KiB
Minimum: 2018-01-15 00:00:00
Maximum: 2024-03-15 00:00:00
Histogram with fixed size bins (bins=50)

도정신문 호수
Text

MISSING 

Distinct: 195
Distinct (%): 100.0%
Missing: 4
Missing (%): 2.0%
Memory size: 1.7 KiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 780
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 195
Unique (%): 100.0%

Sample

1st row: 800호
2nd row: 801호
3rd row: 802호
4th row: 803호
5th row: 804호
Value | Count | Frequency (%)
826호 | 1 | 0.5%
950호 | 1 | 0.5%
936호 | 1 | 0.5%
927호 | 1 | 0.5%
928호 | 1 | 0.5%
929호 | 1 | 0.5%
930호 | 1 | 0.5%
931호 | 1 | 0.5%
932호 | 1 | 0.5%
933호 | 1 | 0.5%
Other values (185) | 185 | 94.9%

Most occurring characters

ValueCountFrequency (%)
195
25.0%
8 137
17.6%
9 133
17.1%
6 40
 
5.1%
3 40
 
5.1%
5 40
 
5.1%
7 40
 
5.1%
2 39
 
5.0%
1 39
 
5.0%
4 39
 
5.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 585 | 75.0%
Other Letter | 195 | 25.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
8 137
23.4%
9 133
22.7%
6 40
 
6.8%
3 40
 
6.8%
5 40
 
6.8%
7 40
 
6.8%
2 39
 
6.7%
1 39
 
6.7%
4 39
 
6.7%
0 38
 
6.5%
Other Letter
ValueCountFrequency (%)
195
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 585 | 75.0%
Hangul | 195 | 25.0%

Most frequent character per script

Common
ValueCountFrequency (%)
8 137
23.4%
9 133
22.7%
6 40
 
6.8%
3 40
 
6.8%
5 40
 
6.8%
7 40
 
6.8%
2 39
 
6.7%
1 39
 
6.7%
4 39
 
6.7%
0 38
 
6.5%
Hangul
ValueCountFrequency (%)
195
100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 585 | 75.0%
Hangul | 195 | 25.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
195
100.0%
ASCII
ValueCountFrequency (%)
8 137
23.4%
9 133
22.7%
6 40
 
6.8%
3 40
 
6.8%
5 40
 
6.8%
7 40
 
6.8%
2 39
 
6.7%
1 39
 
6.7%
4 39
 
6.7%
0 38
 
6.5%

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
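
As a rough illustration of nullity correlation, the sketch below builds a presence indicator (1 = value present, 0 = missing) for each column and computes the Pearson correlation between every pair. It assumes that the same four rows are empty in every column, which is what the last rows of the sample suggest; the `pearson` helper is written for this example and is not taken from the profiling library.

```python
from math import sqrt

COLUMNS = ["도서명", "저자명", "출판사", "발행연도", "도정신문 발행일", "도정신문 호수"]
N_ROWS, N_MISSING = 199, 4

# 1 = value present, 0 = missing. Assumption: the same four trailing rows
# are empty in every column, as the all-<NA> sample rows suggest.
present = {
    col: [1] * (N_ROWS - N_MISSING) + [0] * N_MISSING for col in COLUMNS
}


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Correlation for each unordered pair of columns.
nullity_corr = {
    (a, b): pearson(present[a], present[b])
    for i, a in enumerate(COLUMNS)
    for b in COLUMNS[i + 1:]
}

for pair, r in nullity_corr.items():
    print(pair, round(r, 3))
```

When columns are always missing together, as assumed here, every pairwise correlation equals 1: knowing that one field is empty tells you the others are empty too.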

Sample

First rows
Index | 도서명 | 저자명 | 출판사 | 발행연도 | 도정신문 발행일 | 도정신문 호수
0 | 8체질 이야기 | 주석원 | 씨앗을뿌리는사람 | 2007 | 2018-01-15 | 800호
1 | 말의 품격 | 이기주 | 황소북스 | 2017 | 2018-01-25 | 801호
2 | 운다고 달라지는 일은 아무것도 없겠지만 | 박준 | 난다 | 2017 | 2018-02-05 | 802호
3 | 우리는 차별에 찬성합니다 | 오찬호 | 개마고원 | 2013 | 2018-02-25 | 803호
4 | 당신은 개를 키우면 안 된다 | 강형욱 | 동아일보사 | 2014 | 2018-03-05 | 804호
5 | 아무것도 아닌 지금은 없다 | 김동혁 | 쌤앤파커스 | 2017 | 2018-03-15 | 805호
6 | 이 모든 극적인 순간들 | 윤대녕 | 푸르메 | 2010 | 2018-03-25 | 806호
7 | LOVE, 사랑에 대해 알아야 할 모든 것 | A. M. 파인스 | 다산초당 | 2005 | 2018-04-05 | 807호
8 | 손빈병법 | 손빈(이병호 옮김) | 홍익출한사 | 1996 | 2018-04-15 | 808호
9 | 신경 끄기의 기술 | 마크 맨슨 | 갤리온 | 2017 | 2018-05-05 | 810호
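
Last rows
Index | 도서명 | 저자명 | 출판사 | 발행연도 | 도정신문 발행일 | 도정신문 호수
189 | 오늘 뭐 먹지 | 권여선 | 한겨레출판사 | 2023 | 2024-01-05 | 992호
190 | 두 리더_영조와 정조 | 노혜경 | 뜨인돌 | 2020 | 2024-01-25 | 993호
191 | 좋아요는 어떻게 지구를 파괴하는가 | 기욤 피트롱 | 갈라파고스 | 2023 | 2024-02-05 | 994호
192 | 쇼펜하우어 아포리즘 | 아르투어 쇼펜하우어 | 포레스트북스 | 2023 | 2024-02-25 | 995호
193 | 모순 | 안귀자 | 쓰다 | 2024 | 2024-03-05 | 996호
194 | 느긋하게 웃으면서 짜증내지 않고 살아가는 법 | 브라이언 킹 | 프롬북스 | 2023 | 2024-03-15 | 997호
195 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
196 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
197 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
198 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>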

Duplicate rows

Most frequently occurring

Index | 도서명 | 저자명 | 출판사 | 발행연도 | 도정신문 발행일 | 도정신문 호수 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 4