Overview

Dataset statistics

Number of variables: 4
Number of observations: 95
Missing cells: 34
Missing cells (%): 8.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.1 KiB
Average record size in memory: 33.4 B
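
The derived figures in this table follow from the raw counts. The sketch below recomputes the missing-cell percentage and the total memory size; it assumes that the percentage is taken over all cells (variables times observations) and that the total size is the average record size times the number of observations. The variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 4
n_observations = 95
missing_cells = 34
avg_record_bytes = 33.4

total_cells = n_variables * n_observations            # every cell in the table
missing_pct = 100 * missing_cells / total_cells        # share of empty cells
total_kib = avg_record_bytes * n_observations / 1024   # bytes converted to KiB

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Total size in memory: {total_kib:.1f} KiB")
```

Both results agree with the reported 8.9% and 3.1 KiB after rounding to one decimal place.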

Variable types

Categorical: 1
Text: 3

Alerts

년도 (year) has 1 (1.1%) missing value [Missing]
월일 (month-day) has 33 (34.7%) missing values [Missing]
주요내용 (main content) has unique values [Unique]

Reproduction

Analysis started: 2024-04-17 19:24:22.998333
Analysis finished: 2024-04-17 19:24:23.425455
Duration: 0.43 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

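시대별 (era)
Categorical

Distinct: 6
Distinct (%): 6.3%
Missing: 0
Missing (%): 0.0%
Memory size: 892.0 B

현대 (Modern): 63
조선 (Joseon): 12
일제강점기 (Japanese colonial period): 8
삼국 및 통일신라 (Three Kingdoms and Unified Silla): 5
대한제국 (Korean Empire): 4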

Length

Max length: 9
Median length: 2
Mean length: 2.7052632
Min length: 2
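
The length statistics can be reproduced from the category counts alone. The sketch below is a minimal check, assuming that length means the number of characters in each value (spaces included) and using the counts from the Common Values table of this variable.

```python
from statistics import mean, median

# Category counts copied from the "Common Values" table of this variable.
counts = {
    "현대": 63,
    "조선": 12,
    "일제강점기": 8,
    "삼국 및 통일신라": 5,
    "대한제국": 4,
    "고려": 3,
}

# Expand to one string length per observation (95 in total).
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

length_stats = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": round(mean(lengths), 7),
    "min": min(lengths),
}
print(length_stats)
```

The mean works out to 257 / 95, which rounds to the reported 2.7052632.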

Unique

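Unique: 0
Unique (%): 0.0%

Sample

1st row: 삼국 및 통일신라 (Three Kingdoms and Unified Silla)
2nd row: 삼국 및 통일신라 (Three Kingdoms and Unified Silla)
3rd row: 삼국 및 통일신라 (Three Kingdoms and Unified Silla)
4th row: 삼국 및 통일신라 (Three Kingdoms and Unified Silla)
5th row: 삼국 및 통일신라 (Three Kingdoms and Unified Silla)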

Common Values

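Value | Count | Frequency (%)
현대 (Modern) | 63 | 66.3%
조선 (Joseon) | 12 | 12.6%
일제강점기 (Japanese colonial period) | 8 | 8.4%
삼국 및 통일신라 (Three Kingdoms and Unified Silla) | 5 | 5.3%
대한제국 (Korean Empire) | 4 | 4.2%
고려 (Goryeo) | 3 | 3.2%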

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of common values]
Value | Count | Frequency (%)
현대 | 63 | 60.0%
조선 | 12 | 11.4%
일제강점기 | 8 | 7.6%
삼국 | 5 | 4.8%
(value lost) | 5 | 4.8%
통일신라 | 5 | 4.8%
대한제국 | 4 | 3.8%
고려 | 3 | 2.9%
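
The percentages in this table use a base of 105 rather than 95, which is consistent with counting whitespace-separated words instead of whole values. The sketch below tests that reading; it is an assumption about how the table was produced, not something the report states. Under this reading the row with the empty value would be the word 및 ("and").

```python
from collections import Counter

# Category counts copied from the "Common Values" table of this variable.
counts = {
    "현대": 63,
    "조선": 12,
    "일제강점기": 8,
    "삼국 및 통일신라": 5,
    "대한제국": 4,
    "고려": 3,
}

# Split every observation on whitespace and count the resulting tokens.
words = Counter()
for value, n in counts.items():
    for token in value.split():
        words[token] += n

total = sum(words.values())
for token, n in words.most_common():
    print(f"{token}\t{n}\t{100 * n / total:.1f}%")
```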

년도 (year)
Text

MISSING

Distinct: 74
Distinct (%): 78.7%
Missing: 1
Missing (%): 1.1%
Memory size: 892.0 B

Length

Max length: 13
Median length: 4
Mean length: 5.7021277
Min length: 4

Characters and Unicode

Total characters: 536
Distinct characters: 40
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
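
As an illustration of those character properties, the sketch below tallies the Unicode general category of every character in the five sample values listed for this variable. It uses only Python's standard `unicodedata` module, which exposes categories as two-letter codes (`Nd` for Decimal Number, `Lo` for Other Letter, `Ps` and `Pe` for Open and Close Punctuation). Scripts and blocks are not available in the standard library and are left out here.

```python
import unicodedata
from collections import Counter

# First five sample values of the 년도 column, as listed in this report.
samples = [
    "261년(첨해이사금15)",
    "689년(신문왕9)",
    "757년(경덕왕16)",
    "839년(민애왕2)",
    "927년(태조10)",
]

# Two-letter general category for every character in every sample.
categories = Counter(unicodedata.category(ch) for s in samples for ch in s)
print(categories.most_common())
```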

Unique

Unique: 57
Unique (%): 60.6%

Sample

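1st row: 261년(첨해이사금15) [year 261, 15th year of Cheomhae Isageum]
2nd row: 689년(신문왕9) [year 689, 9th year of King Sinmun]
3rd row: 757년(경덕왕16) [year 757, 16th year of King Gyeongdeok]
4th row: 839년(민애왕2) [year 839, 2nd year of King Minae]
5th row: 927년(태조10) [year 927, 10th year of Taejo]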
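
The column mixes plain four-digit years with values that carry a regnal annotation, so it is profiled as text. A minimal parsing sketch follows. The pattern is inferred from the sample rows only, and `parse_year` is a hypothetical helper; values in other shapes (for example ranges written with `~`) deliberately yield `None`.

```python
import re

# Pattern assumed from the samples: digits, an optional "년" suffix, and an
# optional "(ruler + regnal year)" annotation.
YEAR_RE = re.compile(
    r"^(?P<year>\d+)년?(?:\((?P<reign>\D+)(?P<reign_year>\d+)\))?$"
)

def parse_year(text):
    """Return (year, ruler, regnal_year); ruler fields are None when absent."""
    m = YEAR_RE.match(text)
    if m is None:
        return None  # free-form value that does not fit the assumed pattern
    reign_year = m["reign_year"]
    return int(m["year"]), m["reign"], int(reign_year) if reign_year else None

print(parse_year("261년(첨해이사금15)"))
print(parse_year("2019"))
```
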
Value | Count | Frequency (%)
2019 | 3 | 3.2%
1969 | 3 | 3.2%
2011 | 3 | 3.2%
1999 | 2 | 2.1%
1998 | 2 | 2.1%
1980 | 2 | 2.1%
1981 | 2 | 2.1%
1984 | 2 | 2.1%
2003 | 2 | 2.1%
1945 | 2 | 2.1%
Other values (64) | 71 | 75.5%

Most occurring characters

Value | Count | Frequency (%)
1 | 107 | 20.0%
9 | 85 | 15.9%
0 | 52 | 9.7%
2 | 46 | 8.6%
8 | 25 | 4.7%
6 | 24 | 4.5%
(value lost) | 24 | 4.5%
( | 23 | 4.3%
) | 23 | 4.3%
3 | 22 | 4.1%
Other values (30) | 105 | 19.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 413 | 77.1%
Other Letter | 76 | 14.2%
Open Punctuation | 23 | 4.3%
Close Punctuation | 23 | 4.3%
Math Symbol | 1 | 0.2%

Most frequent character per category

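Other Letter
Value | Count | Frequency (%)
(value lost) | 24 | 31.6%
(value lost) | 8 | 10.5%
(value lost) | 7 | 9.2%
(value lost) | 5 | 6.6%
(value lost) | 3 | 3.9%
(value lost) | 3 | 3.9%
(value lost) | 3 | 3.9%
(value lost) | 2 | 2.6%
(value lost) | 2 | 2.6%
(value lost) | 2 | 2.6%
Other values (17) | 17 | 22.4%

Decimal Number
Value | Count | Frequency (%)
1 | 107 | 25.9%
9 | 85 | 20.6%
0 | 52 | 12.6%
2 | 46 | 11.1%
8 | 25 | 6.1%
6 | 24 | 5.8%
3 | 22 | 5.3%
7 | 19 | 4.6%
5 | 18 | 4.4%
4 | 15 | 3.6%

Open Punctuation
Value | Count | Frequency (%)
( | 23 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 23 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 1 | 100.0%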

Most occurring scripts

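Value | Count | Frequency (%)
Common | 460 | 85.8%
Hangul | 76 | 14.2%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(value lost) | 24 | 31.6%
(value lost) | 8 | 10.5%
(value lost) | 7 | 9.2%
(value lost) | 5 | 6.6%
(value lost) | 3 | 3.9%
(value lost) | 3 | 3.9%
(value lost) | 3 | 3.9%
(value lost) | 2 | 2.6%
(value lost) | 2 | 2.6%
(value lost) | 2 | 2.6%
Other values (17) | 17 | 22.4%

Common
Value | Count | Frequency (%)
1 | 107 | 23.3%
9 | 85 | 18.5%
0 | 52 | 11.3%
2 | 46 | 10.0%
8 | 25 | 5.4%
6 | 24 | 5.2%
( | 23 | 5.0%
) | 23 | 5.0%
3 | 22 | 4.8%
7 | 19 | 4.1%
Other values (3) | 34 | 7.4%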

Most occurring blocks

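Value | Count | Frequency (%)
ASCII | 460 | 85.8%
Hangul | 76 | 14.2%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 107 | 23.3%
9 | 85 | 18.5%
0 | 52 | 11.3%
2 | 46 | 10.0%
8 | 25 | 5.4%
6 | 24 | 5.2%
( | 23 | 5.0%
) | 23 | 5.0%
3 | 22 | 4.8%
7 | 19 | 4.1%
Other values (3) | 34 | 7.4%

Hangul
Value | Count | Frequency (%)
(value lost) | 24 | 31.6%
(value lost) | 8 | 10.5%
(value lost) | 7 | 9.2%
(value lost) | 5 | 6.6%
(value lost) | 3 | 3.9%
(value lost) | 3 | 3.9%
(value lost) | 3 | 3.9%
(value lost) | 2 | 2.6%
(value lost) | 2 | 2.6%
(value lost) | 2 | 2.6%
Other values (17) | 17 | 22.4%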

월일 (month-day)
Text

MISSING

Distinct: 48
Distinct (%): 77.4%
Missing: 33
Missing (%): 34.7%
Memory size: 892.0 B

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 310
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 41
Unique (%): 66.1%

Sample

1st row: 08-15
2nd row: 08-17
3rd row: 04-01
4th row: 08-15
5th row: 07-16
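
Every non-missing value has length 5 and uses only digits and a hyphen, which suggests a zero-padded MM-DD layout. The sketch below checks a value against that assumed layout; `is_month_day` is a hypothetical helper, and a leap year is used as the reference year so that 02-29 is accepted.

```python
from datetime import date

def is_month_day(text):
    """True when text is a valid zero-padded MM-DD string."""
    if len(text) != 5 or text[2] != "-":
        return False
    mm, dd = text[:2], text[3:]
    if not (mm.isdigit() and dd.isdigit()):
        return False
    try:
        date(2000, int(mm), int(dd))  # 2000 is a leap year, so 02-29 passes
    except ValueError:
        return False
    return True

# Sample values listed for this variable.
samples = ["08-15", "08-17", "04-01", "08-15", "07-16"]
print(all(is_month_day(s) for s in samples))
```
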
Value | Count | Frequency (%)
01-01 | 6 | 9.7%
12-01 | 4 | 6.5%
08-01 | 3 | 4.8%
04-01 | 2 | 3.2%
08-15 | 2 | 3.2%
07-01 | 2 | 3.2%
10-01 | 2 | 3.2%
08-27 | 1 | 1.6%
04-18 | 1 | 1.6%
02-28 | 1 | 1.6%
Other values (38) | 38 | 61.3%

Most occurring characters

Value | Count | Frequency (%)
0 | 89 | 28.7%
1 | 71 | 22.9%
- | 62 | 20.0%
2 | 21 | 6.8%
8 | 14 | 4.5%
7 | 12 | 3.9%
5 | 11 | 3.5%
3 | 9 | 2.9%
6 | 8 | 2.6%
4 | 7 | 2.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 248 | 80.0%
Dash Punctuation | 62 | 20.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 89 | 35.9%
1 | 71 | 28.6%
2 | 21 | 8.5%
8 | 14 | 5.6%
7 | 12 | 4.8%
5 | 11 | 4.4%
3 | 9 | 3.6%
6 | 8 | 3.2%
4 | 7 | 2.8%
9 | 6 | 2.4%

Dash Punctuation
Value | Count | Frequency (%)
- | 62 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 310 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 89 | 28.7%
1 | 71 | 22.9%
- | 62 | 20.0%
2 | 21 | 6.8%
8 | 14 | 4.5%
7 | 12 | 3.9%
5 | 11 | 3.5%
3 | 9 | 2.9%
6 | 8 | 2.6%
4 | 7 | 2.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 310 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 89 | 28.7%
1 | 71 | 22.9%
- | 62 | 20.0%
2 | 21 | 6.8%
8 | 14 | 4.5%
7 | 12 | 3.9%
5 | 11 | 3.5%
3 | 9 | 2.9%
6 | 8 | 2.6%
4 | 7 | 2.3%

주요내용 (main content)
Text

UNIQUE

Distinct: 95
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 892.0 B

Length

Max length: 87
Median length: 37
Mean length: 25.442105
Min length: 2

Characters and Unicode

Total characters: 2417
Distinct characters: 335
Distinct categories: 12
Distinct scripts: 4
Distinct blocks: 8
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
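
Unicode blocks are contiguous code point ranges, so a block lookup can be sketched with a small hand-written table. The ranges below are a hand-picked subset using the official Unicode block names, which do not necessarily match the short labels this report prints; `block_of` is a hypothetical helper, and the example text is the 4th sample row of this variable.

```python
from collections import Counter

# Hand-picked subset of Unicode block ranges: (start, end, official name).
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x2000, 0x206F, "General Punctuation"),
    (0x2190, 0x21FF, "Arrows"),
    (0x3000, 0x303F, "CJK Symbols and Punctuation"),
    (0x3130, 0x318F, "Hangul Compatibility Jamo"),
    (0x3300, 0x33FF, "CJK Compatibility"),
    (0x4E00, 0x9FFF, "CJK Unified Ideographs"),
    (0xAC00, 0xD7AF, "Hangul Syllables"),
]

def block_of(ch):
    """Return the block name for one character, or "Unknown"."""
    cp = ord(ch)
    for start, end, name in BLOCKS:
        if start <= cp <= end:
            return name
    return "Unknown"  # outside the subset listed above

# 4th sample row; the curly quotes are written as escapes for clarity.
text = "처음으로 \u2018大丘\u2019라는 지명 등장"
block_counts = Counter(block_of(ch) for ch in text)
print(block_counts)
```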

Unique

Unique: 95
Unique (%): 100.0%

Sample

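1st row: 달벌성 축조 기사(『三國史記』十五年春二月) [Record of the construction of Dalbeolseong fortress (Samguk Sagi, 15th year, spring, 2nd month)]
2nd row: 달구벌로의 천도 시도(『三國史記』神文王9年) [Attempt to move the capital to Dalgubeol (Samguk Sagi, 9th year of King Sinmun)]
3rd row: 지방행정구역 명칭 개정 (위화군→수창군, 달구화현→대구현, 팔거리현→팔리현, 다사지현→ 하빈현, 설화현→화원현) [Revision of local administrative district names (Wihwa-gun→Suchang-gun, Dalguhwa-hyeon→Daegu-hyeon, Palgeori-hyeon→Palli-hyeon, Dasaji-hyeon→Habin-hyeon, Seolhwa-hyeon→Hwawon-hyeon)]
4th row: 처음으로 ‘大丘’라는 지명 등장 [First appearance of the place name ‘大丘’ (Daegu)]
5th row: 달구벌전투(민애왕 즉위 후 국왕파와 반대파인 김우징군의 격전) [Battle of Dalgubeol (fierce fighting between the royalist faction and the opposing forces of Kim U-jing after King Minae's accession)]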
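Value | Count | Frequency (%)
대구 (Daegu) | 8 | 1.7%
개통 (opening of a line or road) | 8 | 1.7%
조성 (development) | 6 | 1.3%
대구도시철도 (Daegu Metro) | 5 | 1.1%
개최 (hosting of an event) | 5 | 1.1%
설치 (installation) | 5 | 1.1%
개관 (opening of a facility) | 4 | 0.9%
대구읍성 (Daegu Eupseong, the walled town) | 3 | 0.7%
처음으로 (for the first time) | 3 | 0.7%
편입 (incorporation) | 3 | 0.7%
Other values (368) | 411 | 89.2%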

Most occurring characters

Value | Count | Frequency (%)
(space) | 366 | 15.1%
(value lost) | 114 | 4.7%
(value lost) | 103 | 4.3%
(value lost) | 44 | 1.8%
( | 43 | 1.8%
) | 43 | 1.8%
, | 43 | 1.8%
(value lost) | 40 | 1.7%
(value lost) | 34 | 1.4%
(value lost) | 32 | 1.3%
Other values (325) | 1555 | 64.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1696 | 70.2%
Space Separator | 366 | 15.1%
Decimal Number | 134 | 5.5%
Other Punctuation | 55 | 2.3%
Open Punctuation | 49 | 2.0%
Close Punctuation | 49 | 2.0%
Math Symbol | 19 | 0.8%
Uppercase Letter | 18 | 0.7%
Initial Punctuation | 11 | 0.5%
Final Punctuation | 11 | 0.5%
Other values (2) | 9 | 0.4%

Most frequent character per category

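Other Letter
Value | Count | Frequency (%)
(value lost) | 114 | 6.7%
(value lost) | 103 | 6.1%
(value lost) | 44 | 2.6%
(value lost) | 40 | 2.4%
(value lost) | 34 | 2.0%
(value lost) | 32 | 1.9%
(value lost) | 27 | 1.6%
(value lost) | 24 | 1.4%
(value lost) | 23 | 1.4%
(value lost) | 22 | 1.3%
Other values (287) | 1233 | 72.7%

Uppercase Letter
Value | Count | Frequency (%)
C | 3 | 16.7%
I | 3 | 16.7%
F | 3 | 16.7%
A | 2 | 11.1%
E | 1 | 5.6%
H | 1 | 5.6%
R | 1 | 5.6%
M | 1 | 5.6%
B | 1 | 5.6%
D | 1 | 5.6%

Decimal Number
Value | Count | Frequency (%)
1 | 25 | 18.7%
2 | 19 | 14.2%
0 | 18 | 13.4%
3 | 14 | 10.4%
8 | 14 | 10.4%
9 | 10 | 7.5%
4 | 10 | 7.5%
6 | 10 | 7.5%
7 | 7 | 5.2%
5 | 7 | 5.2%

Lowercase Letter
Value | Count | Frequency (%)
a | 2 | 33.3%
t | 2 | 33.3%
n | 1 | 16.7%
l | 1 | 16.7%

Open Punctuation
Value | Count | Frequency (%)
( | 43 | 87.8%
(value lost) | 6 | 12.2%

Close Punctuation
Value | Count | Frequency (%)
) | 43 | 87.8%
(value lost) | 6 | 12.2%

Other Punctuation
Value | Count | Frequency (%)
, | 43 | 78.2%
. | 12 | 21.8%

Math Symbol
Value | Count | Frequency (%)
~ | 14 | 73.7%
(value lost) | 5 | 26.3%

Other Symbol
Value | Count | Frequency (%)
(value lost) | 2 | 66.7%
(value lost) | 1 | 33.3%

Space Separator
Value | Count | Frequency (%)
(space) | 366 | 100.0%

Initial Punctuation
Value | Count | Frequency (%)
(value lost) | 11 | 100.0%

Final Punctuation
Value | Count | Frequency (%)
(value lost) | 11 | 100.0%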

Most occurring scripts

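Value | Count | Frequency (%)
Hangul | 1627 | 67.3%
Common | 697 | 28.8%
Han | 69 | 2.9%
Latin | 24 | 1.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(value lost) | 114 | 7.0%
(value lost) | 103 | 6.3%
(value lost) | 44 | 2.7%
(value lost) | 40 | 2.5%
(value lost) | 34 | 2.1%
(value lost) | 32 | 2.0%
(value lost) | 27 | 1.7%
(value lost) | 24 | 1.5%
(value lost) | 23 | 1.4%
(value lost) | 22 | 1.4%
Other values (264) | 1164 | 71.5%

Common
Value | Count | Frequency (%)
(space) | 366 | 52.5%
( | 43 | 6.2%
) | 43 | 6.2%
, | 43 | 6.2%
1 | 25 | 3.6%
2 | 19 | 2.7%
0 | 18 | 2.6%
3 | 14 | 2.0%
8 | 14 | 2.0%
~ | 14 | 2.0%
Other values (13) | 98 | 14.1%

Han
Value | Count | Frequency (%)
(value lost) | 16 | 23.2%
(value lost) | 10 | 14.5%
(value lost) | 6 | 8.7%
(value lost) | 6 | 8.7%
(value lost) | 4 | 5.8%
(value lost) | 3 | 4.3%
(value lost) | 2 | 2.9%
(value lost) | 2 | 2.9%
(value lost) | 2 | 2.9%
(value lost) | 2 | 2.9%
Other values (13) | 16 | 23.2%

Latin
Value | Count | Frequency (%)
C | 3 | 12.5%
I | 3 | 12.5%
F | 3 | 12.5%
A | 2 | 8.3%
a | 2 | 8.3%
t | 2 | 8.3%
E | 1 | 4.2%
H | 1 | 4.2%
R | 1 | 4.2%
M | 1 | 4.2%
Other values (5) | 5 | 20.8%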

Most occurring blocks

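Value | Count | Frequency (%)
Hangul | 1612 | 66.7%
ASCII | 679 | 28.1%
CJK | 69 | 2.9%
Punctuation | 22 | 0.9%
Compat Jamo | 15 | 0.6%
None | 12 | 0.5%
Arrows | 5 | 0.2%
CJK Compat | 3 | 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 366 | 53.9%
( | 43 | 6.3%
) | 43 | 6.3%
, | 43 | 6.3%
1 | 25 | 3.7%
2 | 19 | 2.8%
0 | 18 | 2.7%
3 | 14 | 2.1%
8 | 14 | 2.1%
~ | 14 | 2.1%
Other values (21) | 80 | 11.8%

Hangul
Value | Count | Frequency (%)
(value lost) | 114 | 7.1%
(value lost) | 103 | 6.4%
(value lost) | 44 | 2.7%
(value lost) | 40 | 2.5%
(value lost) | 34 | 2.1%
(value lost) | 32 | 2.0%
(value lost) | 27 | 1.7%
(value lost) | 24 | 1.5%
(value lost) | 23 | 1.4%
(value lost) | 22 | 1.4%
Other values (263) | 1149 | 71.3%

CJK
Value | Count | Frequency (%)
(value lost) | 16 | 23.2%
(value lost) | 10 | 14.5%
(value lost) | 6 | 8.7%
(value lost) | 6 | 8.7%
(value lost) | 4 | 5.8%
(value lost) | 3 | 4.3%
(value lost) | 2 | 2.9%
(value lost) | 2 | 2.9%
(value lost) | 2 | 2.9%
(value lost) | 2 | 2.9%
Other values (13) | 16 | 23.2%

Compat Jamo
Value | Count | Frequency (%)
(value lost) | 15 | 100.0%

Punctuation
Value | Count | Frequency (%)
(value lost) | 11 | 50.0%
(value lost) | 11 | 50.0%

None
Value | Count | Frequency (%)
(value lost) | 6 | 50.0%
(value lost) | 6 | 50.0%

Arrows
Value | Count | Frequency (%)
(value lost) | 5 | 100.0%

CJK Compat
Value | Count | Frequency (%)
(value lost) | 2 | 66.7%
(value lost) | 1 | 33.3%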

Correlations

[Figure: correlation heatmap]
Variable | 시대별 | 년도 | 월일 | 주요내용
시대별 | 1.000 | 1.000 | NaN | 1.000
년도 | 1.000 | 1.000 | 0.000 | 1.000
월일 | NaN | 0.000 | 1.000 | 1.000
주요내용 | 1.000 | 1.000 | 1.000 | 1.000
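
The report does not say which association measure fills this matrix. Assuming it is an uncorrected Cramér's V (an assumption, not something confirmed by the report), the sketch below shows why an all-unique column such as 주요내용 would score 1.000 against a categorical column: every distinct text value falls into exactly one category, so the association is trivially perfect. `cramers_v` is an illustrative helper and `unique_text` is made-up stand-in data.

```python
from collections import Counter
from math import sqrt

def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equally long label sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    chi2 = 0.0
    for x, nx in px.items():
        for y, ny in py.items():
            expected = nx * ny / n          # count expected under independence
            chi2 += (joint[(x, y)] - expected) ** 2 / expected
    k = min(len(px), len(py)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else float("nan")

# Era labels of the first ten sample rows, paired with an all-unique column.
era = ["삼국 및 통일신라"] * 5 + ["고려"] * 3 + ["조선"] * 2
unique_text = [f"event {i}" for i in range(10)]  # hypothetical placeholder

v_unique = cramers_v(era, unique_text)
v_independent = cramers_v(["a", "a", "b", "b"], ["c", "d", "c", "d"])
print(round(v_unique, 3), round(v_independent, 3))
```

A value of 1.000 against a unique column therefore says little about the data; it mostly reflects that the column has one distinct value per row.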

Missing values

[Figure: a simple visualization of nullity by column.]
[Figure: the nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]
[Figure: the correlation heatmap, which measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.]
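
Nullity correlation can be sketched as the Pearson correlation between presence indicators (1 when a value is present, 0 when it is missing). The example below uses the 년도 and 월일 values of the first ten rows shown in the Sample section of this report; `pearson` is a small hand-written helper that returns NaN when either indicator is constant, which is the case for 월일 in these rows.

```python
from math import isnan, sqrt

# (년도, 월일) for the first ten sample rows; None stands for <NA>.
head = [
    ("261년(첨해이사금15)", None), ("689년(신문왕9)", None),
    ("757년(경덕왕16)", None), (None, None),
    ("839년(민애왕2)", None), ("927년(태조10)", None),
    ("1143년(인종21)", None), ("1232년(고종19)", None),
    ("1394년(태조3)", None), ("1419년(세종1)", None),
]

def pearson(a, b):
    """Pearson correlation; NaN when either sequence has zero variance."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")
    return cov / sqrt(var_a * var_b)

year_present = [int(year is not None) for year, _ in head]
day_present = [int(day is not None) for _, day in head]

print("missing 년도:", year_present.count(0))
print("missing 월일:", day_present.count(0))
print("nullity correlation:", pearson(year_present, day_present))
```

On the full 95 rows both indicators vary, so the heatmap can show a defined value; these ten rows alone cannot.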

Sample

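Index | 시대별 (era) | 년도 (year) | 월일 (month-day) | 주요내용 (main content) | English translation
0 | 삼국 및 통일신라 (Three Kingdoms and Unified Silla) | 261년(첨해이사금15) | <NA> | 달벌성 축조 기사(『三國史記』十五年春二月) | Record of the construction of Dalbeolseong fortress (Samguk Sagi, 15th year, spring, 2nd month)
1 | 삼국 및 통일신라 (Three Kingdoms and Unified Silla) | 689년(신문왕9) | <NA> | 달구벌로의 천도 시도(『三國史記』神文王9年) | Attempt to move the capital to Dalgubeol (Samguk Sagi, 9th year of King Sinmun)
2 | 삼국 및 통일신라 (Three Kingdoms and Unified Silla) | 757년(경덕왕16) | <NA> | 지방행정구역 명칭 개정 (위화군→수창군, 달구화현→대구현, 팔거리현→팔리현, 다사지현→ 하빈현, 설화현→화원현) | Revision of local administrative district names (Wihwa-gun→Suchang-gun, Dalguhwa-hyeon→Daegu-hyeon, Palgeori-hyeon→Palli-hyeon, Dasaji-hyeon→Habin-hyeon, Seolhwa-hyeon→Hwawon-hyeon)
3 | 삼국 및 통일신라 (Three Kingdoms and Unified Silla) | <NA> | <NA> | 처음으로 ‘大丘’라는 지명 등장 | First appearance of the place name ‘大丘’ (Daegu)
4 | 삼국 및 통일신라 (Three Kingdoms and Unified Silla) | 839년(민애왕2) | <NA> | 달구벌전투(민애왕 즉위 후 국왕파와 반대파인 김우징군의 격전) | Battle of Dalgubeol (fierce fighting between the royalist faction and the opposing forces of Kim U-jing after King Minae's accession)
5 | 고려 (Goryeo) | 927년(태조10) | <NA> | 공산전투(고려 왕건군과 후백제 견훤군이 팔공산에서 벌인 전투) | Battle of Gongsan (fought at Palgongsan between Wang Geon's Goryeo army and Gyeon Hwon's Later Baekje army)
6 | 고려 (Goryeo) | 1143년(인종21) | <NA> | 대구현에 현령(수령, 종5품)이 파견 | A hyeollyeong (magistrate, junior 5th rank) dispatched to Daegu-hyeon
7 | 고려 (Goryeo) | 1232년(고종19) | <NA> | 부인사에 보관된 초조대장경이 몽고의 침략으로 소실 | The first-edition Tripitaka Koreana kept at Buinsa temple destroyed by fire in the Mongol invasion
8 | 조선 (Joseon) | 1394년(태조3) | <NA> | 대구현에 수성, 해안, 하빈 영속 | Suseong, Haean, and Habin placed under the jurisdiction of Daegu-hyeon
9 | 조선 (Joseon) | 1419년(세종1) | <NA> | 대구현(大丘縣)이 대구군(大丘郡)으로 승격 | Daegu-hyeon (大丘縣) elevated to Daegu-gun (大丘郡)

Index | 시대별 (era) | 년도 (year) | 월일 (month-day) | 주요내용 (main content) | English translation
85 | 현대 (Modern) | 2013 | 12-20 | 2013년도 노사상생협력 최우수도시 대통령상 수상 | Received the 2013 Presidential Award as the best city for labor-management cooperation
86 | 현대 (Modern) | 2015 | 04-12 | 2015대구경북 세계물포럼 개최 (4.12~4.17) | 2015 Daegu-Gyeongbuk World Water Forum held (April 12 to 17)
87 | 현대 (Modern) | 2015 | 04-23 | 대구도시철도 3호선 개통 | Daegu Metro Line 3 opened
88 | 현대 (Modern) | 2016 | 03-19 | 대구 삼성라이온즈파크 개장 | Daegu Samsung Lions Park opened
89 | 현대 (Modern) | 2016 | 09-08 | 대구도시철도 1호선 연장구간 개통(대곡~설화명곡) | Daegu Metro Line 1 extension opened (Daegok to Seolhwa-Myeonggok)
90 | 현대 (Modern) | 2017 | 10-31 | 국채보상운동 기록물 유네스코 세계 기록유산 등재 | Archives of the National Debt Redemption Movement inscribed on the UNESCO Memory of the World Register
91 | 현대 (Modern) | 2017 | 11-01 | 유네스코 음악창의도시 선정 | Selected as a UNESCO Creative City of Music
92 | 현대 (Modern) | 2019 | 03-09 | 시민 프로축구단 대구FC의 새 전용구장인 DGB대구은행파크 개장 | DGB Daegu Bank Park, the new home stadium of the citizen-owned professional football club Daegu FC, opened
93 | 현대 (Modern) | 2019 | 04-18 | 서대구 고속철도역 기공식 개최 | Groundbreaking ceremony held for Seodaegu high-speed rail station
94 | 현대 (Modern) | 2019 | 05-10 | 한국물기술인증원 대구 유치 발표 | Announcement that the Korea Institute for Water Technology Certification will be located in Daegu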
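
In the most recent rows, 년도 holds a plain four-digit year and 월일 holds an MM-DD string, so the two can be combined into a calendar date. The sketch below shows one way to do that with the standard library; `to_date` is a hypothetical helper, and it assumes both fields are present and well formed, which does not hold for the earlier rows with regnal-year annotations or missing 월일.

```python
from datetime import date

# (년도, 월일) pairs copied from three of the last sample rows.
rows = [("2013", "12-20"), ("2015", "04-12"), ("2019", "05-10")]

def to_date(year, month_day):
    """Combine a four-digit year string and an MM-DD string into a date."""
    month, day = month_day.split("-")
    return date(int(year), int(month), int(day))

dates = [to_date(year, month_day) for year, month_day in rows]
print([d.isoformat() for d in dates])
```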