Overview

Dataset statistics

Number of variables: 4
Number of observations: 71
Missing cells: 1
Missing cells (%): 0.4%
Duplicate rows: 2
Duplicate rows (%): 2.8%
Total size in memory: 2.4 KiB
Average record size in memory: 34.9 B
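
The two percentages follow directly from the counts. A minimal sketch of the arithmetic, assuming the missing-cell share is taken over all cells (variables × observations) and the duplicate share over all rows; the variable names are illustrative only:

```python
# Counts copied from the Dataset statistics table.
n_variables = 4
n_observations = 71
missing_cells = 1
duplicate_rows = 2

# Assumption: missing cells are measured against every cell in the table,
# duplicate rows against the number of observations.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```

Both values round to the figures reported: 0.4% and 2.8%.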

Variable types

Text: 3
Numeric: 1

Dataset

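Description: A catalog of audio, text, and tactile books for people with visual impairments, produced in audio, text, and tactile form by the Korea Employment Agency for Persons with Disabilities (한국장애인고용공단) as part of its social contribution activities.
URL: https://www.data.go.kr/data/15004147/fileData.do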

Alerts

Dataset has 2 (2.8%) duplicate rows [Duplicates]
출판사 has 1 (1.4%) missing values [Missing]

Reproduction

Analysis started: 2023-12-12 14:50:51.172479
Analysis finished: 2023-12-12 14:50:52.009785
Duration: 0.84 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
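
A report of this kind could be regenerated roughly as follows. This is a sketch under stated assumptions, not the exact invocation used here: the CSV path, the `cp949` encoding (common for files from data.go.kr, but not confirmed for this one), and the report title are all hypothetical, and the settings stored in config.json are not reproduced.

```python
def build_report(csv_path, description, url,
                 output_path="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so this module loads even where the third-party
    # packages pandas and ydata-profiling are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the source file is a CSV in the given encoding.
    df = pd.read_csv(csv_path, encoding=encoding)

    profile = ProfileReport(
        df,
        title="Profiling Report",  # hypothetical title
        dataset={"description": description, "url": url},
    )
    profile.to_file(output_path)
    return profile
```

Calling `build_report("books.csv", "...", "https://www.data.go.kr/data/15004147/fileData.do")` with a local copy of the file would write `report.html`; `books.csv` is a placeholder name.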

Variables

도서명 (book title)
Text

Distinct: 68
Distinct (%): 95.8%
Missing: 0
Missing (%): 0.0%
Memory size: 700.0 B

Length

Max length: 26
Median length: 17
Mean length: 9.9295775
Min length: 1

Characters and Unicode

Total characters: 705
Distinct characters: 238
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
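
As an illustration of those character properties, the sketch below counts the general category of each character in the first sample title using Python's standard `unicodedata` module. The two-letter codes map to the category names used in this report: `Lo` is Other Letter, `Zs` is Space Separator, `Nl` is Letter Number, `Ps` is Open Punctuation, and `Pe` is Close Punctuation. The standard library does not expose script or block, so only categories are shown.

```python
import unicodedata
from collections import Counter

# First sample row of the title column; "\u2160" is ROMAN NUMERAL ONE.
title = "시각장애인을 위한 내 힘으로 글쓰기\u2160(촉각도서)"

# General category of every code point in the string.
category_counts = Counter(unicodedata.category(ch) for ch in title)

print(len(title))
for code, count in category_counts.most_common():
    print(code, count)
```

This title has 26 characters, which matches the reported maximum length, and it is the only place a Letter Number character appears in the column.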

Unique

Unique: 65
Unique (%): 91.5%

Sample

1st row: 시각장애인을 위한 내 힘으로 글쓰기Ⅰ(촉각도서)
2nd row: 어느 날, 내 죽음에 네가 들어왔다
3rd row: 기분을 관리하면 인생이 관리된다
4th row: 저만치 혼자서
5th row: 수호지1

Value | Count | Frequency (%)
불편한 | 3 | 1.5%
펭귄 | 3 | 1.5%
기분을 | 2 | 1.0%
일이 | 2 | 1.0%
 | 2 | 1.0%
나는 | 2 | 1.0%
삼개주막 | 2 | 1.0%
관리하면 | 2 | 1.0%
 | 2 | 1.0%
레슨 | 2 | 1.0%
Other values (169) | 175 | 88.8%

Most occurring characters

Value | Count | Frequency (%)
(space) | 126 | 17.9%
 | 16 | 2.3%
 | 15 | 2.1%
 | 15 | 2.1%
 | 12 | 1.7%
 | 11 | 1.6%
 | 11 | 1.6%
 | 11 | 1.6%
 | 9 | 1.3%
 | 9 | 1.3%
Other values (228) | 470 | 66.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 546 | 77.4%
Space Separator | 126 | 17.9%
Decimal Number | 19 | 2.7%
Other Punctuation | 7 | 1.0%
Uppercase Letter | 4 | 0.6%
Open Punctuation | 1 | 0.1%
Close Punctuation | 1 | 0.1%
Letter Number | 1 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 16 | 2.9%
 | 15 | 2.7%
 | 15 | 2.7%
 | 12 | 2.2%
 | 11 | 2.0%
 | 11 | 2.0%
 | 11 | 2.0%
 | 9 | 1.6%
 | 9 | 1.6%
 | 9 | 1.6%
Other values (212) | 428 | 78.4%

Decimal Number
Value | Count | Frequency (%)
2 | 6 | 31.6%
1 | 5 | 26.3%
5 | 2 | 10.5%
3 | 2 | 10.5%
8 | 1 | 5.3%
7 | 1 | 5.3%
6 | 1 | 5.3%
4 | 1 | 5.3%

Uppercase Letter
Value | Count | Frequency (%)
D | 2 | 50.0%
A | 1 | 25.0%
H | 1 | 25.0%

Space Separator
Value | Count | Frequency (%)
(space) | 126 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 7 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1 | 100.0%

Letter Number
Value | Count | Frequency (%)
Ⅰ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 546 | 77.4%
Common | 154 | 21.8%
Latin | 5 | 0.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 16 | 2.9%
 | 15 | 2.7%
 | 15 | 2.7%
 | 12 | 2.2%
 | 11 | 2.0%
 | 11 | 2.0%
 | 11 | 2.0%
 | 9 | 1.6%
 | 9 | 1.6%
 | 9 | 1.6%
Other values (212) | 428 | 78.4%

Common
Value | Count | Frequency (%)
(space) | 126 | 81.8%
, | 7 | 4.5%
2 | 6 | 3.9%
1 | 5 | 3.2%
5 | 2 | 1.3%
3 | 2 | 1.3%
8 | 1 | 0.6%
7 | 1 | 0.6%
6 | 1 | 0.6%
4 | 1 | 0.6%
Other values (2) | 2 | 1.3%

Latin
Value | Count | Frequency (%)
D | 2 | 40.0%
A | 1 | 20.0%
H | 1 | 20.0%
Ⅰ | 1 | 20.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 546 | 77.4%
ASCII | 158 | 22.4%
Number Forms | 1 | 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 126 | 79.7%
, | 7 | 4.4%
2 | 6 | 3.8%
1 | 5 | 3.2%
D | 2 | 1.3%
5 | 2 | 1.3%
3 | 2 | 1.3%
A | 1 | 0.6%
H | 1 | 0.6%
8 | 1 | 0.6%
Other values (5) | 5 | 3.2%

Hangul
Value | Count | Frequency (%)
 | 16 | 2.9%
 | 15 | 2.7%
 | 15 | 2.7%
 | 12 | 2.2%
 | 11 | 2.0%
 | 11 | 2.0%
 | 11 | 2.0%
 | 9 | 1.6%
 | 9 | 1.6%
 | 9 | 1.6%
Other values (212) | 428 | 78.4%

Number Forms
Value | Count | Frequency (%)
Ⅰ | 1 | 100.0%

저자 (author)
Text

Distinct: 55
Distinct (%): 77.5%
Missing: 0
Missing (%): 0.0%
Memory size: 700.0 B

Length

Max length: 27
Median length: 21
Mean length: 8.1267606
Min length: 2

Characters and Unicode

Total characters: 577
Distinct characters: 156
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 48
Unique (%): 67.6%

Sample

1st row: 한국시각장애인엽합회, 한국장애인고용공단
2nd row: 세이카 료겐, 김윤경 번역
3rd row: 김다슬
4th row: 김훈
5th row: 시내암 저, 이문열 편역

Value | Count | Frequency (%)
 | 26 | 14.4%
 | 18 | 10.0%
시내암 | 8 | 4.4%
이문열 | 8 | 4.4%
편역 | 8 | 4.4%
김훈 | 4 | 2.2%
김호연 | 3 | 1.7%
심연희 | 2 | 1.1%
코엘료 | 2 | 1.1%
김다슬 | 2 | 1.1%
Other values (93) | 99 | 55.0%

Most occurring characters

Value | Count | Frequency (%)
(space) | 109 | 18.9%
, | 33 | 5.7%
 | 27 | 4.7%
 | 27 | 4.7%
 | 20 | 3.5%
 | 19 | 3.3%
 | 10 | 1.7%
 | 9 | 1.6%
 | 8 | 1.4%
 | 8 | 1.4%
Other values (146) | 307 | 53.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 431 | 74.7%
Space Separator | 109 | 18.9%
Other Punctuation | 35 | 6.1%
Uppercase Letter | 2 | 0.3%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 27 | 6.3%
 | 27 | 6.3%
 | 20 | 4.6%
 | 19 | 4.4%
 | 10 | 2.3%
 | 9 | 2.1%
 | 8 | 1.9%
 | 8 | 1.9%
 | 8 | 1.9%
 | 8 | 1.9%
Other values (141) | 287 | 66.6%

Other Punctuation
Value | Count | Frequency (%)
, | 33 | 94.3%
. | 2 | 5.7%

Uppercase Letter
Value | Count | Frequency (%)
D | 1 | 50.0%
W | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 109 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 431 | 74.7%
Common | 144 | 25.0%
Latin | 2 | 0.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 27 | 6.3%
 | 27 | 6.3%
 | 20 | 4.6%
 | 19 | 4.4%
 | 10 | 2.3%
 | 9 | 2.1%
 | 8 | 1.9%
 | 8 | 1.9%
 | 8 | 1.9%
 | 8 | 1.9%
Other values (141) | 287 | 66.6%

Common
Value | Count | Frequency (%)
(space) | 109 | 75.7%
, | 33 | 22.9%
. | 2 | 1.4%

Latin
Value | Count | Frequency (%)
D | 1 | 50.0%
W | 1 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 431 | 74.7%
ASCII | 146 | 25.3%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 109 | 74.7%
, | 33 | 22.6%
. | 2 | 1.4%
D | 1 | 0.7%
W | 1 | 0.7%

Hangul
Value | Count | Frequency (%)
 | 27 | 6.3%
 | 27 | 6.3%
 | 20 | 4.6%
 | 19 | 4.4%
 | 10 | 2.3%
 | 9 | 2.1%
 | 8 | 1.9%
 | 8 | 1.9%
 | 8 | 1.9%
 | 8 | 1.9%
Other values (141) | 287 | 66.6%

출판사 (publisher)
Text

MISSING 

Distinct: 46
Distinct (%): 65.7%
Missing: 1
Missing (%): 1.4%
Memory size: 700.0 B

Length

Max length: 11
Median length: 10
Mean length: 4.5714286
Min length: 2

Characters and Unicode

Total characters: 320
Distinct characters: 127
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 36
Unique (%): 51.4%

Sample

1st row: ㈜바이포엠 스튜디오
2nd row: ㈜필름
3rd row: ㈜문학동네
4th row: 알에이치코리아
5th row: 알에이치코리아

Value | Count | Frequency (%)
알에이치코리아 | 8 | 11.3%
문학동네 | 7 | 9.9%
민음사 | 3 | 4.2%
고즈넉이엔티 | 3 | 4.2%
나무옆의자 | 3 | 4.2%
푸른숲 | 2 | 2.8%
다산책방 | 2 | 2.8%
김영사 | 2 | 2.8%
㈜필름 | 2 | 2.8%
인플루엔셜 | 2 | 2.8%
Other values (37) | 37 | 52.1%

Most occurring characters

Value | Count | Frequency (%)
 | 12 | 3.8%
 | 9 | 2.8%
 | 9 | 2.8%
 | 8 | 2.5%
 | 8 | 2.5%
 | 8 | 2.5%
 | 8 | 2.5%
 | 8 | 2.5%
 | 8 | 2.5%
 | 8 | 2.5%
Other values (117) | 234 | 73.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 294 | 91.9%
Lowercase Letter | 20 | 6.2%
Other Symbol | 4 | 1.2%
Uppercase Letter | 1 | 0.3%
Space Separator | 1 | 0.3%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 12 | 4.1%
 | 9 | 3.1%
 | 9 | 3.1%
 | 8 | 2.7%
 | 8 | 2.7%
 | 8 | 2.7%
 | 8 | 2.7%
 | 8 | 2.7%
 | 8 | 2.7%
 | 8 | 2.7%
Other values (103) | 208 | 70.7%

Lowercase Letter
Value | Count | Frequency (%)
o | 6 | 30.0%
b | 2 | 10.0%
n | 2 | 10.0%
k | 2 | 10.0%
i | 2 | 10.0%
a | 1 | 5.0%
y | 1 | 5.0%
m | 1 | 5.0%
r | 1 | 5.0%
d | 1 | 5.0%

Other Symbol
Value | Count | Frequency (%)
㈜ | 4 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
H | 1 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 298 | 93.1%
Latin | 21 | 6.6%
Common | 1 | 0.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 12 | 4.0%
 | 9 | 3.0%
 | 9 | 3.0%
 | 8 | 2.7%
 | 8 | 2.7%
 | 8 | 2.7%
 | 8 | 2.7%
 | 8 | 2.7%
 | 8 | 2.7%
 | 8 | 2.7%
Other values (104) | 212 | 71.1%

Latin
Value | Count | Frequency (%)
o | 6 | 28.6%
b | 2 | 9.5%
n | 2 | 9.5%
k | 2 | 9.5%
i | 2 | 9.5%
a | 1 | 4.8%
y | 1 | 4.8%
m | 1 | 4.8%
r | 1 | 4.8%
H | 1 | 4.8%
Other values (2) | 2 | 9.5%

Common
Value | Count | Frequency (%)
(space) | 1 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 294 | 91.9%
ASCII | 22 | 6.9%
None | 4 | 1.2%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 12 | 4.1%
 | 9 | 3.1%
 | 9 | 3.1%
 | 8 | 2.7%
 | 8 | 2.7%
 | 8 | 2.7%
 | 8 | 2.7%
 | 8 | 2.7%
 | 8 | 2.7%
 | 8 | 2.7%
Other values (103) | 208 | 70.7%

ASCII
Value | Count | Frequency (%)
o | 6 | 27.3%
b | 2 | 9.1%
n | 2 | 9.1%
k | 2 | 9.1%
i | 2 | 9.1%
a | 1 | 4.5%
y | 1 | 4.5%
m | 1 | 4.5%
r | 1 | 4.5%
H | 1 | 4.5%
Other values (3) | 3 | 13.6%

None
Value | Count | Frequency (%)
㈜ | 4 | 100.0%

페이지 (pages)
Real number (ℝ)

Distinct: 49
Distinct (%): 69.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 307.97183
Minimum: 8
Maximum: 676
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 771.0 B

Quantile statistics

Minimum: 8
5th percentile: 212
Q1: 256
Median: 303
Q3: 348
95th percentile: 424
Maximum: 676
Range: 668
Interquartile range (IQR): 92

Descriptive statistics

Standard deviation: 85.658954
Coefficient of variation (CV): 0.27813892
Kurtosis: 5.5943403
Mean: 307.97183
Median Absolute Deviation (MAD): 47
Skewness: 0.67850131
Sum: 21866
Variance: 7337.4563
Monotonicity: Not monotonic
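
Several of these figures are simple functions of the others. The sketch below recomputes them from the reported values as a consistency check. It assumes the standard definitions (mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared, range = max − min, IQR = Q3 − Q1) and uses the rounded numbers as printed, so the results agree only to the displayed precision.

```python
# Values copied from the statistics reported for the page-count column.
n = 71
total = 21866
std = 85.658954
minimum, q1, q3, maximum = 8, 256, 348, 676

mean = total / n                   # reported as 307.97183
cv = std / mean                    # reported as 0.27813892
variance = std ** 2                # reported as 7337.4563
value_range = maximum - minimum    # reported as 668
iqr = q3 - q1                      # reported as 92

print(f"mean={mean:.5f} cv={cv:.8f} variance={variance:.4f}")
print(f"range={value_range} iqr={iqr}")
```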
Histogram with fixed size bins (bins=49)

Common values
Value | Count | Frequency (%)
256 | 6 | 8.5%
320 | 3 | 4.2%
264 | 3 | 4.2%
348 | 3 | 4.2%
304 | 2 | 2.8%
268 | 2 | 2.8%
296 | 2 | 2.8%
424 | 2 | 2.8%
288 | 2 | 2.8%
388 | 2 | 2.8%
Other values (39) | 44 | 62.0%

Minimum 10 values
Value | Count | Frequency (%)
8 | 1 | 1.4%
156 | 1 | 1.4%
184 | 1 | 1.4%
208 | 1 | 1.4%
216 | 2 | 2.8%
222 | 1 | 1.4%
236 | 1 | 1.4%
238 | 1 | 1.4%
239 | 1 | 1.4%
240 | 1 | 1.4%

Maximum 10 values
Value | Count | Frequency (%)
676 | 1 | 1.4%
500 | 1 | 1.4%
438 | 1 | 1.4%
424 | 2 | 2.8%
420 | 1 | 1.4%
408 | 1 | 1.4%
392 | 1 | 1.4%
388 | 2 | 2.8%
380 | 1 | 1.4%
376 | 1 | 1.4%

Interactions


Correlations

 | 도서명 | 저자 | 출판사 | 페이지
도서명 | 1.000 | 1.000 | 0.998 | 1.000
저자 | 1.000 | 1.000 | 0.999 | 0.985
출판사 | 0.998 | 0.999 | 1.000 | 0.932
페이지 | 1.000 | 0.985 | 0.932 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
 | 도서명 | 저자 | 출판사 | 페이지
0 | 시각장애인을 위한 내 힘으로 글쓰기Ⅰ(촉각도서) | 한국시각장애인엽합회, 한국장애인고용공단 | <NA> | 8
1 | 어느 날, 내 죽음에 네가 들어왔다 | 세이카 료겐, 김윤경 번역 | ㈜바이포엠 스튜디오 | 296
2 | 기분을 관리하면 인생이 관리된다 | 김다슬 | ㈜필름 | 280
3 | 저만치 혼자서 | 김훈 | ㈜문학동네 | 264
4 | 수호지1 | 시내암 저, 이문열 편역 | 알에이치코리아 | 360
5 | 수호지2 | 시내암 저, 이문열 편역 | 알에이치코리아 | 348
6 | 수호지3 | 시내암 저, 이문열 편역 | 알에이치코리아 | 348
7 | 수호지4 | 시내암 저, 이문열 편역 | 알에이치코리아 | 332
8 | 수호지5 | 시내암 저, 이문열 편역 | 알에이치코리아 | 344
9 | 수호지6 | 시내암 저, 이문열 편역 | 알에이치코리아 | 328

Last rows
 | 도서명 | 저자 | 출판사 | 페이지
61 | 여덟건의 완벽한 살인 | 피터 스완슨 저, 노진선 역 | 푸른숲 | 320
62 | 투명인간은 밀실에 숨는다 | 아쓰카와 다쓰미 저, 이재원 역 | 디앤씨미디어 | 348
63 | 방과 후 복수활동 | 박성신, 윤자영, 양수련, 장우석 공저 | 북오션 | 256
64 | 조선의 왈가닥 비바리 | 천영미 | 고즈넉이엔티 | 420
65 | 절벽의 밤 | 미치오 슈스케 저, 김은모 역 | 청미래 | 264
66 | 주아 | 최윤호 | Harmonybook | 256
67 | 파친코1 | 이민진 저, 신승미 역 | 인플루엔셜 | 388
68 | 파친코2 | 이민진 저, 신승미 역 | 인플루엔셜 | 380
69 | 하얼빈 | 김훈 | 문학동네 | 308
70 | 불편한 편의점2 | 김호연 | 나무옆의자 | 320

Duplicate rows

Most frequently occurring

 | 도서명 | 저자 | 출판사 | 페이지 | # duplicates
0 | 기분을 관리하면 인생이 관리된다 | 김다슬 | ㈜필름 | 280 | 2
1 | 불편한 편의점2 | 김호연 | 나무옆의자 | 320 | 2
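
Exact duplicate rows like these can be found by counting whole records. The sketch below uses only the standard library on a small illustrative subset: the rows are taken from this report, but their order and the selection of non-duplicated rows are assumptions made for the example, not the layout of the real file.

```python
from collections import Counter

# Illustrative subset; each record is (도서명, 저자, 출판사, 페이지).
rows = [
    ("기분을 관리하면 인생이 관리된다", "김다슬", "㈜필름", 280),
    ("저만치 혼자서", "김훈", "㈜문학동네", 264),
    ("기분을 관리하면 인생이 관리된다", "김다슬", "㈜필름", 280),
    ("불편한 편의점2", "김호연", "나무옆의자", 320),
    ("하얼빈", "김훈", "문학동네", 308),
    ("불편한 편의점2", "김호연", "나무옆의자", 320),
]

# A row is a duplicate when every field matches another row exactly.
counts = Counter(rows)
duplicates = {row: n for row, n in counts.items() if n > 1}

for row, n in duplicates.items():
    print(row[0], n)
```

With a pandas DataFrame, `df[df.duplicated(keep=False)]` would select the same records.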