Overview

Dataset statistics

Number of variables: 5
Number of observations: 30
Missing cells: 6
Missing cells (%): 4.0%
Duplicate rows: 1
Duplicate rows (%): 3.3%
Total size in memory: 1.3 KiB
Average record size in memory: 44.4 B
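
The percentages in this table follow from the raw counts. The sketch below redoes that arithmetic with only the figures reported above; the helper name `percent` is illustrative and is not part of ydata-profiling.

```python
# Figures copied from the "Dataset statistics" table.
N_VARIABLES = 5
N_OBSERVATIONS = 30
MISSING_CELLS = 6
DUPLICATE_ROWS = 1
AVG_RECORD_BYTES = 44.4


def percent(part: float, whole: float) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Every variable contributes one cell per observation.
total_cells = N_VARIABLES * N_OBSERVATIONS
# Share of all cells that are missing.
missing_pct = percent(MISSING_CELLS, total_cells)
# Share of rows reported as duplicates.
duplicate_pct = percent(DUPLICATE_ROWS, N_OBSERVATIONS)
# Average record size times row count, converted from bytes to KiB.
total_kib = round(AVG_RECORD_BYTES * N_OBSERVATIONS / 1024, 1)

print(missing_pct, duplicate_pct, total_kib)
```

The three results agree with the reported 4.0%, 3.3%, and 1.3 KiB.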

Variable types

Text: 4
Categorical: 1

Dataset

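Description: Titles, authors, file names, and related data for the 독립기념관 (Independence Hall of Korea) liberal arts book series (교양총서).
Author: 독립기념관 (Independence Hall of Korea)
URL: https://www.data.go.kr/data/15067846/fileData.do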

Alerts

Dataset has 1 (3.3%) duplicate rows [Duplicates]
제목 (Title) has 2 (6.7%) missing values [Missing]
작가 (Author) has 2 (6.7%) missing values [Missing]
파일명 (File name) has 2 (6.7%) missing values [Missing]

Reproduction

Analysis started: 2023-12-13 00:48:46.409573
Analysis finished: 2023-12-13 00:48:46.833672
Duration: 0.42 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

자료번호 (Item number)
Text

Distinct: 29
Distinct (%): 96.7%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 2
Median length: 2
Mean length: 1.6333333
Min length: 1
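
These length figures can be reproduced under one assumption: that the 30 values are the strings "1" to "28" plus two entries holding a single space. The report does not state this. It is inferred from the character counts reported for this variable (47 digits and 2 spaces), so treat the sketch as a consistency check rather than a description of the source file.

```python
from statistics import mean, median

# Assumption (inferred, not stated by the report): the column holds
# "1" .. "28" plus two values that consist of a single space.
values = [str(n) for n in range(1, 29)] + [" ", " "]
lengths = [len(v) for v in values]

length_summary = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": round(mean(lengths), 7),
    "min": min(lengths),
    "total_characters": sum(lengths),
}
print(length_summary)
```

Under that assumption the maximum, median, mean, and minimum match the table, and the total of 49 characters matches the figure reported under "Characters and Unicode".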

Characters and Unicode

Total characters: 49
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
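
As a rough illustration of how such per-character properties can be tallied, the sketch below uses Python's standard `unicodedata` module to count general categories. The function name `category_counts` and the short-code-to-label mapping are assumptions made for this example; this is not how ydata-profiling itself is implemented, and scripts and blocks are not covered because the standard library does not expose them.

```python
import unicodedata
from collections import Counter

# Readable labels for the two-letter general-category codes that occur
# in this report. Codes outside this mapping are kept as-is.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Two sample values taken from this report.
print(category_counts(["3.1독립운동"]))
print(category_counts(["e0001.pdf"]))
```

Hangul syllables fall under "Other Letter", which is why that category dominates the Korean text variables in the later sections.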

Unique

Unique: 28
Unique (%): 93.3%

Sample

1st row: 1
2nd row: 2
3rd row: 3
4th row: 4
5th row: 5

Value | Count | Frequency (%)
2 | 1 | 3.6%
3 | 1 | 3.6%
28 | 1 | 3.6%
27 | 1 | 3.6%
26 | 1 | 3.6%
25 | 1 | 3.6%
24 | 1 | 3.6%
23 | 1 | 3.6%
22 | 1 | 3.6%
21 | 1 | 3.6%
Other values (18) | 18 | 64.3%

Most occurring characters

Value | Count | Frequency (%)
1 | 13 | 26.5%
2 | 12 | 24.5%
3 | 3 | 6.1%
4 | 3 | 6.1%
5 | 3 | 6.1%
6 | 3 | 6.1%
7 | 3 | 6.1%
8 | 3 | 6.1%
(space) | 2 | 4.1%
9 | 2 | 4.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 47 | 95.9%
Space Separator | 2 | 4.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 13 | 27.7%
2 | 12 | 25.5%
3 | 3 | 6.4%
4 | 3 | 6.4%
5 | 3 | 6.4%
6 | 3 | 6.4%
7 | 3 | 6.4%
8 | 3 | 6.4%
9 | 2 | 4.3%
0 | 2 | 4.3%
Space Separator
Value | Count | Frequency (%)
(space) | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 49 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 13 | 26.5%
2 | 12 | 24.5%
3 | 3 | 6.1%
4 | 3 | 6.1%
5 | 3 | 6.1%
6 | 3 | 6.1%
7 | 3 | 6.1%
8 | 3 | 6.1%
(space) | 2 | 4.1%
9 | 2 | 4.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 49 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 13 | 26.5%
2 | 12 | 24.5%
3 | 3 | 6.1%
4 | 3 | 6.1%
5 | 3 | 6.1%
6 | 3 | 6.1%
7 | 3 | 6.1%
8 | 3 | 6.1%
(space) | 2 | 4.1%
9 | 2 | 4.1%

제목 (Title)
Text

MISSING

Distinct: 28
Distinct (%): 100.0%
Missing: 2
Missing (%): 6.7%
Memory size: 372.0 B

Length

Max length: 39
Median length: 32.5
Mean length: 16.821429
Min length: 5

Characters and Unicode

Total characters: 471
Distinct characters: 110
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 28
Unique (%): 100.0%

Sample

1st row: 한국 독립운동사의 재조명
2nd row: 만주·노령지역의 독립운동
3rd row: 3.1독립운동
4th row: 한국여성 독립운동
5th row: 한국근대문학과 문인들의 독립운동

Value | Count | Frequency (%)
역사왜곡 | 7 | 7.6%
일문판 | 4 | 4.3%
어떻게 | 4 | 4.3%
위안부 | 3 | 3.3%
알고 | 3 | 3.3%
있나요 | 3 | 3.3%
독립운동 | 3 | 3.3%
근대화론 | 2 | 2.2%
일본군 | 2 | 2.2%
한국 | 2 | 2.2%
Other values (47) | 59 | 64.1%

Most occurring characters

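Value | Count | Frequency (%)
(space) | 73 | 15.5%
(not preserved) | 15 | 3.2%
(not preserved) | 14 | 3.0%
(not preserved) | 13 | 2.8%
(not preserved) | 12 | 2.5%
(not preserved) | 12 | 2.5%
(not preserved) | 11 | 2.3%
(not preserved) | 10 | 2.1%
(not preserved) | 9 | 1.9%
( | 9 | 1.9%
Other values (100) | 293 | 62.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 358 | 76.0%
Space Separator | 73 | 15.5%
Other Punctuation | 16 | 3.4%
Open Punctuation | 9 | 1.9%
Close Punctuation | 9 | 1.9%
Decimal Number | 6 | 1.3%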

Most frequent character per category

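Other Letter
Value | Count | Frequency (%)
(not preserved) | 15 | 4.2%
(not preserved) | 14 | 3.9%
(not preserved) | 13 | 3.6%
(not preserved) | 12 | 3.4%
(not preserved) | 12 | 3.4%
(not preserved) | 11 | 3.1%
(not preserved) | 10 | 2.8%
(not preserved) | 9 | 2.5%
(not preserved) | 9 | 2.5%
(not preserved) | 9 | 2.5%
Other values (91) | 244 | 68.2%
Other Punctuation
Value | Count | Frequency (%)
? | 9 | 56.2%
, | 3 | 18.8%
· | 3 | 18.8%
. | 1 | 6.2%
Decimal Number
Value | Count | Frequency (%)
3 | 3 | 50.0%
1 | 3 | 50.0%
Space Separator
Value | Count | Frequency (%)
(space) | 73 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 9 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 9 | 100.0%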

Most occurring scripts

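Value | Count | Frequency (%)
Hangul | 358 | 76.0%
Common | 113 | 24.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 15 | 4.2%
(not preserved) | 14 | 3.9%
(not preserved) | 13 | 3.6%
(not preserved) | 12 | 3.4%
(not preserved) | 12 | 3.4%
(not preserved) | 11 | 3.1%
(not preserved) | 10 | 2.8%
(not preserved) | 9 | 2.5%
(not preserved) | 9 | 2.5%
(not preserved) | 9 | 2.5%
Other values (91) | 244 | 68.2%
Common
Value | Count | Frequency (%)
(space) | 73 | 64.6%
( | 9 | 8.0%
? | 9 | 8.0%
) | 9 | 8.0%
, | 3 | 2.7%
· | 3 | 2.7%
3 | 3 | 2.7%
1 | 3 | 2.7%
. | 1 | 0.9%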

Most occurring blocks

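Value | Count | Frequency (%)
Hangul | 358 | 76.0%
ASCII | 110 | 23.4%
None | 3 | 0.6%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 73 | 66.4%
( | 9 | 8.2%
? | 9 | 8.2%
) | 9 | 8.2%
, | 3 | 2.7%
3 | 3 | 2.7%
1 | 3 | 2.7%
. | 1 | 0.9%
Hangul
Value | Count | Frequency (%)
(not preserved) | 15 | 4.2%
(not preserved) | 14 | 3.9%
(not preserved) | 13 | 3.6%
(not preserved) | 12 | 3.4%
(not preserved) | 12 | 3.4%
(not preserved) | 11 | 3.1%
(not preserved) | 10 | 2.8%
(not preserved) | 9 | 2.5%
(not preserved) | 9 | 2.5%
(not preserved) | 9 | 2.5%
Other values (91) | 244 | 68.2%
None
Value | Count | Frequency (%)
· | 3 | 100.0%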

작가 (Author)
Text

MISSING

Distinct: 24
Distinct (%): 85.7%
Missing: 2
Missing (%): 6.7%
Memory size: 372.0 B

Length

Max length: 5
Median length: 3
Mean length: 3.25
Min length: 3

Characters and Unicode

Total characters: 91
Distinct characters: 45
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 21
Unique (%): 75.0%

Sample

1st row: 김준엽
2nd row: 박영석
3rd row: 신용하
4th row: 박용옥
5th row: 김윤식

Value | Count | Frequency (%)
강정숙 | 3 | 10.7%
허수열 | 2 | 7.1%
한철호 | 2 | 7.1%
이정은 | 2 | 7.1%
김준엽 | 1 | 3.6%
김호일 | 1 | 3.6%
이균영 | 1 | 3.6%
이현희 | 1 | 3.6%
권태억 | 1 | 3.6%
권영민 | 1 | 3.6%
Other values (13) | 13 | 46.4%

Most occurring characters

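Value | Count | Frequency (%)
(space) | 7 | 7.7%
(not preserved) | 5 | 5.5%
(not preserved) | 5 | 5.5%
(not preserved) | 4 | 4.4%
(not preserved) | 4 | 4.4%
(not preserved) | 4 | 4.4%
(not preserved) | 4 | 4.4%
(not preserved) | 3 | 3.3%
(not preserved) | 3 | 3.3%
(not preserved) | 3 | 3.3%
Other values (35) | 49 | 53.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 84 | 92.3%
Space Separator | 7 | 7.7%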

Most frequent character per category

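Other Letter
Value | Count | Frequency (%)
(not preserved) | 5 | 6.0%
(not preserved) | 5 | 6.0%
(not preserved) | 4 | 4.8%
(not preserved) | 4 | 4.8%
(not preserved) | 4 | 4.8%
(not preserved) | 4 | 4.8%
(not preserved) | 3 | 3.6%
(not preserved) | 3 | 3.6%
(not preserved) | 3 | 3.6%
(not preserved) | 3 | 3.6%
Other values (34) | 46 | 54.8%
Space Separator
Value | Count | Frequency (%)
(space) | 7 | 100.0%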

Most occurring scripts

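Value | Count | Frequency (%)
Hangul | 84 | 92.3%
Common | 7 | 7.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 5 | 6.0%
(not preserved) | 5 | 6.0%
(not preserved) | 4 | 4.8%
(not preserved) | 4 | 4.8%
(not preserved) | 4 | 4.8%
(not preserved) | 4 | 4.8%
(not preserved) | 3 | 3.6%
(not preserved) | 3 | 3.6%
(not preserved) | 3 | 3.6%
(not preserved) | 3 | 3.6%
Other values (34) | 46 | 54.8%
Common
Value | Count | Frequency (%)
(space) | 7 | 100.0%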

Most occurring blocks

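Value | Count | Frequency (%)
Hangul | 84 | 92.3%
ASCII | 7 | 7.7%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 7 | 100.0%
Hangul
Value | Count | Frequency (%)
(not preserved) | 5 | 6.0%
(not preserved) | 5 | 6.0%
(not preserved) | 4 | 4.8%
(not preserved) | 4 | 4.8%
(not preserved) | 4 | 4.8%
(not preserved) | 4 | 4.8%
(not preserved) | 3 | 3.6%
(not preserved) | 3 | 3.6%
(not preserved) | 3 | 3.6%
(not preserved) | 3 | 3.6%
Other values (34) | 46 | 54.8%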

파일명 (File name)
Text

MISSING

Distinct: 28
Distinct (%): 100.0%
Missing: 2
Missing (%): 6.7%
Memory size: 372.0 B

Length

Max length: 9
Median length: 9
Mean length: 8.6785714
Min length: 8

Characters and Unicode

Total characters: 243
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 28
Unique (%): 100.0%

Sample

1st row: e0001.pdf
2nd row: e0002.pdf
3rd row: e0003.pdf
4th row: e0004.pdf
5th row: e0005.pdf

Value | Count | Frequency (%)
e0002.pdf | 1 | 3.6%
e0003.pdf | 1 | 3.6%
e028.pdf | 1 | 3.6%
e027.pdf | 1 | 3.6%
e026.pdf | 1 | 3.6%
e025.pdf | 1 | 3.6%
e024.pdf | 1 | 3.6%
e023.pdf | 1 | 3.6%
e022.pdf | 1 | 3.6%
e021.pdf | 1 | 3.6%
Other values (18) | 18 | 64.3%

Most occurring characters

Value | Count | Frequency (%)
0 | 58 | 23.9%
e | 28 | 11.5%
. | 28 | 11.5%
p | 28 | 11.5%
d | 28 | 11.5%
f | 28 | 11.5%
1 | 13 | 5.3%
2 | 12 | 4.9%
3 | 3 | 1.2%
4 | 3 | 1.2%
Other values (5) | 14 | 5.8%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 112 | 46.1%
Decimal Number | 103 | 42.4%
Other Punctuation | 28 | 11.5%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 58 | 56.3%
1 | 13 | 12.6%
2 | 12 | 11.7%
3 | 3 | 2.9%
4 | 3 | 2.9%
5 | 3 | 2.9%
6 | 3 | 2.9%
7 | 3 | 2.9%
8 | 3 | 2.9%
9 | 2 | 1.9%
Lowercase Letter
Value | Count | Frequency (%)
e | 28 | 25.0%
p | 28 | 25.0%
d | 28 | 25.0%
f | 28 | 25.0%
Other Punctuation
Value | Count | Frequency (%)
. | 28 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 131 | 53.9%
Latin | 112 | 46.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 58 | 44.3%
. | 28 | 21.4%
1 | 13 | 9.9%
2 | 12 | 9.2%
3 | 3 | 2.3%
4 | 3 | 2.3%
5 | 3 | 2.3%
6 | 3 | 2.3%
7 | 3 | 2.3%
8 | 3 | 2.3%
Latin
Value | Count | Frequency (%)
e | 28 | 25.0%
p | 28 | 25.0%
d | 28 | 25.0%
f | 28 | 25.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 243 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 58 | 23.9%
e | 28 | 11.5%
. | 28 | 11.5%
p | 28 | 11.5%
d | 28 | 11.5%
f | 28 | 11.5%
1 | 13 | 5.3%
2 | 12 | 4.9%
3 | 3 | 1.2%
4 | 3 | 1.2%
Other values (5) | 14 | 5.8%

내용 (Content)
Categorical

Distinct: 3
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 98
Median length: 1
Mean length: 10.9
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: -
2nd row: -
3rd row: -
4th row: -
5th row: -

Common Values

Value | Count | Frequency (%)
- | 25 | 83.3%
이 책은 국내외 독자들이 위안부 문제를 차근차근 이해할 수 있도록 하고 위안부 문제의 문제의 본질이 무엇이며, 우리가 어떻게 이해하고 해결 할 지 그 해답을 찾고자 제작하였다. | 3 | 10.0%
<NA> | 2 | 6.7%
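
The Distinct, Unique, and Frequency figures for this variable can be derived from the three counts in the table. The sketch below rebuilds the column from those counts, assuming that `<NA>` is stored as an ordinary string value here (the report lists 0 missing values for this variable) and using a short stand-in for the long description sentence. "Distinct" counts different values, while "Unique" counts values that occur exactly once.

```python
from collections import Counter

# Stand-in for the long description sentence; the exact text does not
# matter for the counts.
DESCRIPTION = "<description sentence>"

# Column rebuilt from the counts in the Common Values table.
# Assumption: "<NA>" is a literal string value in this column.
values = ["-"] * 25 + [DESCRIPTION] * 3 + ["<NA>"] * 2

counts = Counter(values)
# Distinct: number of different values.
distinct = len(counts)
# Unique: number of values that appear exactly once.
unique = sum(1 for n in counts.values() if n == 1)
# Frequency (%): share of all rows taken by each value.
frequency_pct = {value: round(100 * n / len(values), 1) for value, n in counts.items()}

print(distinct, unique, frequency_pct)
```

No value appears exactly once, which is why the report shows 3 distinct values but 0 unique ones.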

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
(not preserved) | 25 | 23.8%
위안부 | 6 | 5.7%
문제의 | 6 | 5.7%
무엇이며 | 3 | 2.9%
제작하였다 | 3 | 2.9%
찾고자 | 3 | 2.9%
해답을 | 3 | 2.9%
(not preserved) | 3 | 2.9%
(not preserved) | 3 | 2.9%
(not preserved) | 3 | 2.9%
Other values (16) | 47 | 44.8%

Correlations

Variable | 자료번호 | 제목 | 작가 | 파일명 | 내용
자료번호 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
제목 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
작가 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
파일명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
내용 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
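
One way to read "nullity correlation" is as the Pearson correlation between per-row missing indicators (1 = missing, 0 = present). The sketch below takes that reading as an assumption and does not reproduce the plotting code behind the heatmap. The indicator vectors follow the sample rows of this report, where 제목 and 작가 are missing together in the last two rows and 자료번호 is never missing. The function name `nullity_correlation` is illustrative.

```python
from math import sqrt


def nullity_correlation(x, y):
    """Pearson correlation of two 0/1 missing-indicator vectors.

    Returns None when either column has no variation in missingness,
    because the correlation is undefined in that case.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


# 1 = missing, 0 = present; the last two of 30 rows are the missing ones.
title_missing = [0] * 28 + [1] * 2
author_missing = [0] * 28 + [1] * 2
item_number_missing = [0] * 30  # never missing

print(nullity_correlation(title_missing, author_missing))
print(nullity_correlation(title_missing, item_number_missing))
```

Columns that are always missing together correlate at 1, while a column with no missing values has no defined nullity correlation.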

Sample

Index | 자료번호 | 제목 | 작가 | 파일명 | 내용
0 | 1 | 한국 독립운동사의 재조명 | 김준엽 | e0001.pdf | -
1 | 2 | 만주·노령지역의 독립운동 | 박영석 | e0002.pdf | -
2 | 3 | 3.1독립운동 | 신용하 | e0003.pdf | -
3 | 4 | 한국여성 독립운동 | 박용옥 | e0004.pdf | -
4 | 5 | 한국근대문학과 문인들의 독립운동 | 김윤식 | e0005.pdf | -
5 | 6 | 한국의 경제발전 | 박우회 | e0006.pdf | -
6 | 7 | 대한민국의 수립 | 김학준 | e0007.pdf | -
7 | 8 | 천개소문전, 몽배금태조 | 박은식 | e0008.pdf | -
8 | 9 | 을지문덕, 이순신전, 최도통전 | 신채호 | e0009.pdf | -
9 | 10 | 한말의 의병전쟁 | 조동걸 | e0010.pdf | -

Index | 자료번호 | 제목 | 작가 | 파일명 | 내용
20 | 21 | 역사왜곡교양서 (일본군 위안부 알고 있나요? 영문판) | 강정숙 | e021.pdf | 이 책은 국내외 독자들이 위안부 문제를 차근차근 이해할 수 있도록 하고 위안부 문제의 문제의 본질이 무엇이며, 우리가 어떻게 이해하고 해결 할 지 그 해답을 찾고자 제작하였다.
21 | 22 | 역사왜곡교양서 (일본군 위안부 알고 있나요? 일문판) | 강정숙 | e022.pdf | 이 책은 국내외 독자들이 위안부 문제를 차근차근 이해할 수 있도록 하고 위안부 문제의 문제의 본질이 무엇이며, 우리가 어떻게 이해하고 해결 할 지 그 해답을 찾고자 제작하였다.
22 | 23 | 역사왜곡 교양서(근대 일본은 한국을 어떻게 병탄했나?) | 한철호 | e023.pdf | -
23 | 24 | 역사왜곡 교양서(근대 일본은 한국을 어떻게 병탄했나? 일문판) | 한철호 | e024.pdf | -
24 | 25 | 역사왜곡 교양서(식민지 근대화론 무엇이 문제인가?) | 허수열 | e025.pdf | -
25 | 26 | 역사왜곡 교양서(식민지 근대화론 무엇이 문제인가? 일문판) | 허수열 | e026.pdf | -
26 | 27 | 역사왜곡 교양서(일본제국주의는 3·1운동을 어떻게 탄압했나?) | 이정은 | e027.pdf | -
27 | 28 | 역사왜곡 교양서(일본제국주의는 3·1운동을 어떻게 탄압했나? 일문판) | 이정은 | e028.pdf | -
28 | (blank) | <NA> | <NA> | <NA> | <NA>
29 | (blank) | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

Index | 자료번호 | 제목 | 작가 | 파일명 | 내용 | # duplicates
0 | (blank) | <NA> | <NA> | <NA> | <NA> | 2
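
A duplicate-row check of this kind can be sketched by counting identical rows. The example below uses two regular rows from the sample plus the two empty trailing rows. Encoding the blank item number as a single space, the three missing cells as `None`, and the 내용 cell as the string `<NA>` is an assumption inferred from the counts elsewhere in this report. The function name `duplicate_rows` is illustrative.

```python
from collections import Counter


def duplicate_rows(rows):
    """Return each row that occurs more than once with its occurrence count."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Columns: 자료번호, 제목, 작가, 파일명, 내용.
# Assumption: blank item number = " ", missing cells = None,
# and 내용 holds the literal string "<NA>" in the empty rows.
rows = [
    ("1", "한국 독립운동사의 재조명", "김준엽", "e0001.pdf", "-"),
    ("2", "만주·노령지역의 독립운동", "박영석", "e0002.pdf", "-"),
    (" ", None, None, None, "<NA>"),
    (" ", None, None, None, "<NA>"),
]

print(duplicate_rows(rows))
```

Only the empty row repeats, with a count of 2, which corresponds to the single entry in the table above.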