Overview

Dataset statistics

Number of variables: 4
Number of observations: 241
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1
Duplicate rows (%): 0.4%
Total size in memory: 7.7 KiB
Average record size in memory: 32.5 B

Variable types

Text: 1
Categorical: 3

Dataset

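Description: Provides data from the Women's History Exhibition Hall (여성사전시관) figure research information service. Columns: 인물연구명 (research subject name), 인물연구실명 (research room name), 등록일자 (registration date), 데이터기준일자 (data reference date).
Author: 여성가족부 (Ministry of Gender Equality and Family)
URL: https://www.data.go.kr/data/15085777/fileData.do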

Alerts

데이터기준일자 has constant value "2021-08-06" (Constant)
Dataset has 1 (0.4%) duplicate rows (Duplicates)
인물연구실명 is highly overall correlated with 등록일자 (High correlation)
등록일자 is highly overall correlated with 인물연구실명 (High correlation)

Reproduction

Analysis started: 2023-12-12 01:44:13.651127
Analysis finished: 2023-12-12 01:44:14.145921
Duration: 0.49 seconds
Software version: ydata-profiling v4.5.1
Configuration file: config.json
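
A report like this one can be regenerated along the following lines. This is a minimal sketch, not the configuration that produced this report: the function name `build_report`, the file paths, the report title, and the default encoding are all assumptions, and the settings stored in config.json are not reproduced here.

```python
from pathlib import Path


def build_report(csv_path: str, html_path: str = "report.html",
                 encoding: str = "utf-8") -> Path:
    """Profile a CSV file and write an HTML report; returns the output path.

    csv_path, html_path and encoding are hypothetical parameters chosen for
    this sketch; adjust the encoding to match the downloaded file.
    """
    # Imported lazily so the module loads even where the heavy
    # dependencies (pandas, ydata-profiling) are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Read every column as text so that dates stay categorical strings,
    # matching the Text/Categorical variable types listed in this report.
    df = pd.read_csv(csv_path, encoding=encoding, dtype=str)

    # Default profiling settings; the title is an arbitrary choice.
    report = ProfileReport(df, title="Overview")
    report.to_file(html_path)
    return Path(html_path)
```

Calling `build_report("data.csv")` with the file downloaded from the URL above would write `report.html` to the working directory. Timings and memory figures will differ from run to run.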

Variables

인물연구명
Text

Distinct: 240
Distinct (%): 99.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB

Length

Max length: 20
Median length: 16
Mean length: 10.481328
Min length: 2
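
The length statistics can be recomputed from the raw strings. The sketch below assumes that length is measured in Unicode code points (what Python's `len` returns for a `str`) and uses only the five sample rows shown in this report, so its numbers describe the sample, not the full column. The helper name `length_stats` is hypothetical.

```python
from statistics import mean, median


def length_stats(values):
    """Return max, median, mean and min string length in code points."""
    lengths = [len(v) for v in values]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }


# The five sample rows of the text variable, copied from this report.
sample = [
    "231. 평량(平亮)의 처",
    "가야 용녀(傭女)",
    "가야의 이뇌왕비(異腦王妃)",
    "강경애(1906~1944)",
    "강빈",
]
stats = length_stats(sample)
print(stats)
```

For the full column, the reported mean is consistent with the other figures in the report: 2526 total characters over 241 observations gives 2526 / 241 ≈ 10.481328.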

Characters and Unicode

Total characters: 2526
Distinct characters: 428
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
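
As an illustration of those character properties, the general category of each code point can be read with Python's standard `unicodedata` module. This is a sketch of the idea, not the profiler's own implementation; the helper name `category_counts` is hypothetical and the input is two of the sample rows from this report. The two-letter codes map to the category names used in the tables: `Lo` Other Letter, `Nd` Decimal Number, `Ps` Open Punctuation, `Pe` Close Punctuation, `Zs` Space Separator, `Sm` Math Symbol, `Po` Other Punctuation.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in values."""
    counts = Counter()
    for value in values:
        for ch in value:
            # unicodedata.category returns the two-letter general category.
            counts[unicodedata.category(ch)] += 1
    return counts


counts = category_counts(["강경애(1906~1944)", "강빈"])
for code, n in counts.most_common():
    print(code, n)
```

Note that `~` is classified as a Math Symbol (`Sm`), which is why the date ranges in this column contribute to that category.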

Unique

Unique: 239
Unique (%): 99.2%

Sample

1st row: 231. 평량(平亮)의 처
2nd row: 가야 용녀(傭女)
3rd row: 가야의 이뇌왕비(異腦王妃)
4th row: 강경애(1906~1944)
5th row: 강빈
Value    Count    Frequency (%)
(not rendered)    8    2.6%
이씨    3    1.0%
(not rendered)    3    1.0%
원덕태후(元德太后    2    0.6%
(not rendered)    2    0.6%
신씨    2    0.6%
金氏    2    0.6%
정순왕후    2    0.6%
아내    2    0.6%
이숙희(李淑禧    1    0.3%
Other values (286)    286    91.4%

Most occurring characters

Value    Count    Frequency (%)
)    201    8.0%
(    201    8.0%
1    156    6.2%
9    123    4.9%
(space)    114    4.5%
~    66    2.6%
8    47    1.9%
(not rendered)    44    1.7%
0    44    1.7%
(not rendered)    38    1.5%
Other values (418)    1492    59.1%

Most occurring categories

Value    Count    Frequency (%)
Other Letter    1439    57.0%
Decimal Number    503    19.9%
Close Punctuation    201    8.0%
Open Punctuation    201    8.0%
Space Separator    114    4.5%
Math Symbol    66    2.6%
Other Punctuation    2    0.1%

Most frequent character per category

Other Letter
Value    Count    Frequency (%)
(not rendered)    44    3.1%
(not rendered)    38    2.6%
(not rendered)    31    2.2%
(not rendered)    29    2.0%
(not rendered)    27    1.9%
(not rendered)    25    1.7%
(not rendered)    25    1.7%
(not rendered)    24    1.7%
(not rendered)    23    1.6%
(not rendered)    22    1.5%
Other values (402)    1151    80.0%

Decimal Number
Value    Count    Frequency (%)
1    156    31.0%
9    123    24.5%
8    47    9.3%
0    44    8.7%
2    33    6.6%
3    26    5.2%
7    21    4.2%
4    19    3.8%
5    19    3.8%
6    15    3.0%

Other Punctuation
Value    Count    Frequency (%)
·    1    50.0%
.    1    50.0%

Close Punctuation
Value    Count    Frequency (%)
)    201    100.0%

Open Punctuation
Value    Count    Frequency (%)
(    201    100.0%

Space Separator
Value    Count    Frequency (%)
(space)    114    100.0%

Math Symbol
Value    Count    Frequency (%)
~    66    100.0%

Most occurring scripts

Value    Count    Frequency (%)
Common    1087    43.0%
Hangul    964    38.2%
Han    475    18.8%

Most frequent character per script

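Han
Value    Count    Frequency (%)
(not rendered)    25    5.3%
(not rendered)    24    5.1%
(not rendered)    20    4.2%
(not rendered)    15    3.2%
(not rendered)    14    2.9%
(not rendered)    12    2.5%
(not rendered)    10    2.1%
(not rendered)    7    1.5%
(not rendered)    6    1.3%
(not rendered)    6    1.3%
Other values (210)    336    70.7%

Hangul
Value    Count    Frequency (%)
(not rendered)    44    4.6%
(not rendered)    38    3.9%
(not rendered)    31    3.2%
(not rendered)    29    3.0%
(not rendered)    27    2.8%
(not rendered)    25    2.6%
(not rendered)    23    2.4%
(not rendered)    22    2.3%
(not rendered)    21    2.2%
(not rendered)    19    2.0%
Other values (182)    685    71.1%

Common
Value    Count    Frequency (%)
)    201    18.5%
(    201    18.5%
1    156    14.4%
9    123    11.3%
(space)    114    10.5%
~    66    6.1%
8    47    4.3%
0    44    4.0%
2    33    3.0%
3    26    2.4%
Other values (6)    76    7.0%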

Most occurring blocks

Value    Count    Frequency (%)
ASCII    1086    43.0%
Hangul    964    38.2%
CJK    457    18.1%
CJK Compat Ideographs    18    0.7%
None    1    < 0.1%

Most frequent character per block

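ASCII
Value    Count    Frequency (%)
)    201    18.5%
(    201    18.5%
1    156    14.4%
9    123    11.3%
(space)    114    10.5%
~    66    6.1%
8    47    4.3%
0    44    4.1%
2    33    3.0%
3    26    2.4%
Other values (5)    75    6.9%

Hangul
Value    Count    Frequency (%)
(not rendered)    44    4.6%
(not rendered)    38    3.9%
(not rendered)    31    3.2%
(not rendered)    29    3.0%
(not rendered)    27    2.8%
(not rendered)    25    2.6%
(not rendered)    23    2.4%
(not rendered)    22    2.3%
(not rendered)    21    2.2%
(not rendered)    19    2.0%
Other values (182)    685    71.1%

CJK
Value    Count    Frequency (%)
(not rendered)    25    5.5%
(not rendered)    24    5.3%
(not rendered)    20    4.4%
(not rendered)    15    3.3%
(not rendered)    14    3.1%
(not rendered)    12    2.6%
(not rendered)    10    2.2%
(not rendered)    7    1.5%
(not rendered)    6    1.3%
(not rendered)    6    1.3%
Other values (198)    318    69.6%

CJK Compat Ideographs
Value    Count    Frequency (%)
(not rendered)    5    27.8%
(not rendered)    2    11.1%
(not rendered)    2    11.1%
(not rendered)    1    5.6%
(not rendered)    1    5.6%
(not rendered)    1    5.6%
(not rendered)    1    5.6%
(not rendered)    1    5.6%
(not rendered)    1    5.6%
(not rendered)    1    5.6%
Other values (2)    2    11.1%

None
Value    Count    Frequency (%)
·    1    100.0%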

인물연구실명
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB
고려: 59
고대: 58
조선: 57
일제강점기: 48
현대: 19

Length

Max length: 5
Median length: 2
Mean length: 2.5975104
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 고려
2nd row: 고대
3rd row: 고대
4th row: 일제강점기
5th row: 조선

Common Values

Value    Count    Frequency (%)
고려    59    24.5%
고대    58    24.1%
조선    57    23.7%
일제강점기    48    19.9%
현대    19    7.9%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the counts listed under Common Values]

등록일자
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB
2019-09-09: 183
2019-09-06: 58

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2019-09-09
2nd row: 2019-09-06
3rd row: 2019-09-06
4th row: 2019-09-09
5th row: 2019-09-09

Common Values

Value    Count    Frequency (%)
2019-09-09    183    75.9%
2019-09-06    58    24.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the counts listed under Common Values]

데이터기준일자
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB
2021-08-06: 241

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021-08-06
2nd row: 2021-08-06
3rd row: 2021-08-06
4th row: 2021-08-06
5th row: 2021-08-06

Common Values

Value    Count    Frequency (%)
2021-08-06    241    100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the counts listed under Common Values]

Correlations

[Figure: correlation heatmap]
              인물연구실명    등록일자
인물연구실명    1.000    1.000
등록일자    1.000    1.000

[Figure: correlation heatmap]
              인물연구실명    등록일자
인물연구실명    1.000    0.994
등록일자    0.994    1.000

[Figure: correlation heatmap]
              인물연구실명    등록일자
인물연구실명    1.000    0.994
등록일자    0.994    1.000
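
For two categorical variables, a common association measure is Cramér's V, optionally with the bias correction of Bergsma (2013). The sketch below rebuilds the two columns from the counts in this report under one loudly stated assumption: every 고대 row has 등록일자 2019-09-06 and every other row has 2019-09-09. The counts (58 and 183) and the sample rows are consistent with that, but the report does not state it. Whether the profiler computed its matrices with exactly this estimator is likewise an assumption; the function name `cramers_v` is hypothetical.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys, bias_correction=True):
    """Cramér's V between two equally long sequences of category labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for row, row_total in row_totals.items():
        for col, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((row, col), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    r, k = float(len(row_totals)), float(len(col_totals))
    if bias_correction:
        # Bergsma's correction shrinks phi^2 and the table dimensions.
        phi2 = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
        r, k = r - (r - 1) ** 2 / (n - 1), k - (k - 1) ** 2 / (n - 1)
    denom = min(r - 1, k - 1)
    return sqrt(phi2 / denom) if denom > 0 else 0.0


# Rebuild the columns from the reported counts (ASSUMED pairing, see above).
room_counts = {"고려": 59, "고대": 58, "조선": 57, "일제강점기": 48, "현대": 19}
rooms, dates = [], []
for room, count in room_counts.items():
    rooms += [room] * count
    dates += ["2019-09-06" if room == "고대" else "2019-09-09"] * count

v_raw = cramers_v(rooms, dates, bias_correction=False)
v_corrected = cramers_v(rooms, dates, bias_correction=True)
print(round(v_raw, 3), round(v_corrected, 3))
```

Under that assumption the uncorrected statistic rounds to 1.000 and the corrected one to 0.994, the two off-diagonal values that appear in the matrices above. The association is this strong because the registration date is fully determined by the research room.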

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index    인물연구명    인물연구실명    등록일자    데이터기준일자
0    231. 평량(平亮)의 처    고려    2019-09-09    2021-08-06
1    가야 용녀(傭女)    고대    2019-09-06    2021-08-06
2    가야의 이뇌왕비(異腦王妃)    고대    2019-09-06    2021-08-06
3    강경애(1906~1944)    일제강점기    2019-09-09    2021-08-06
4    강빈    조선    2019-09-09    2021-08-06
5    강수(强首)의 처    고대    2019-09-06    2021-08-06
6    강신재(1924~2001)    현대    2019-09-09    2021-08-06
7    강완숙(姜完淑)    조선    2019-09-09    2021-08-06
8    강은교(1945~ )    현대    2019-09-09    2021-08-06
9    강정일당(姜靜一堂)    조선    2019-09-09    2021-08-06

Index    인물연구명    인물연구실명    등록일자    데이터기준일자
231    헌정왕후(獻貞王后)    고려    2019-09-09    2021-08-06
232    현덕왕후    조선    2019-09-09    2021-08-06
233    현문혁(玄文弈)의 처    고려    2019-09-09    2021-08-06
234    혜명왕후    고대    2019-09-06    2021-08-06
235    화순옹주(和順翁主)    조선    2019-09-09    2021-08-06
236    화완옹주(和緩翁主)    조선    2019-09-09    2021-08-06
237    황애덕(1892~1971)    일제강점기    2019-09-09    2021-08-06
238    황진이    조선    2019-09-09    2021-08-06
239    효녀 지은    고대    2019-09-06    2021-08-06
240    희명(希明)    고대    2019-09-06    2021-08-06

Duplicate rows

Most frequently occurring

Index    인물연구명    인물연구실명    등록일자    데이터기준일자    # duplicates
0    원덕태후(元德太后)    고려    2019-09-09    2021-08-06    2
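
Duplicate rows of this kind can be found by counting whole rows. The sketch below uses only the standard library and a three-row excerpt built from rows shown in this report (the duplicated row appears twice on purpose), so it illustrates the technique and does not re-run the analysis on the full file. The helper name `duplicate_rows` is hypothetical.

```python
from collections import Counter


def duplicate_rows(rows):
    """Return {row: occurrences} for every row that appears more than once."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Excerpt only: columns are 인물연구명, 인물연구실명, 등록일자, 데이터기준일자.
rows = [
    ("원덕태후(元德太后)", "고려", "2019-09-09", "2021-08-06"),
    ("강빈", "조선", "2019-09-09", "2021-08-06"),
    ("원덕태후(元德太后)", "고려", "2019-09-09", "2021-08-06"),
]
dups = duplicate_rows(rows)
for row, n in dups.items():
    print(row[0], n)
```

One row occurring twice among 241 observations also explains the counts for the text variable: 240 distinct values, of which 239 occur exactly once, and a duplicate share of 1 / 241 ≈ 0.4%.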