Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 625.0 KiB
Average record size in memory: 64.0 B
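
The average record size follows from the total size and the number of observations. A quick check of that arithmetic, assuming the report uses binary kibibytes (1 KiB = 1024 B):

```python
# Figures copied from the dataset statistics above.
total_kib = 625.0
n_rows = 10_000

total_bytes = total_kib * 1024           # assumption: 1 KiB = 1024 bytes
avg_record_bytes = total_bytes / n_rows  # bytes per observation

print(f"{avg_record_bytes:.1f} B per record")
```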

Variable types

Text: 3
Categorical: 2
DateTime: 2

Dataset

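Description: Data on the Hwangseong Sinmun (황성신문) newspaper held by the Independence Hall of Korea (독립기념관). It provides the management number (관리번호), newspaper type (신문구분), publication date (발행일), registration date (등록일), content (내용), IDP code (IDP코드), UCI code (UCI코드), and related fields.
Author: Independence Hall of Korea (독립기념관)
URL: https://www.data.go.kr/data/15089867/fileData.do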

Alerts

신문구분 has constant value "황성신문" (Constant)
등록일 has constant value "2005-11-24" (Constant)
IDP코드 has constant value "IDP-NP-007" (Constant)
관리번호 has unique values (Unique)
UCI코드 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 22:58:08.736520
Analysis finished: 2023-12-12 22:58:12.079334
Duration: 3.34 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

관리번호
Text

UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 120000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
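
As an illustration of these character properties, the sketch below counts Unicode general categories for the first sample value using Python's standard `unicodedata` module. The standard library does not expose script or block names, so only the general category and a plain ASCII check are shown; this is not necessarily how the profiling tool computes its figures.

```python
import unicodedata
from collections import Counter

value = "HS1910011404"  # first sample row of 관리번호

# General category per character: "Lu" = Uppercase Letter, "Nd" = Decimal Number.
categories = Counter(unicodedata.category(ch) for ch in value)

# Every character below code point 128 belongs to the ASCII block.
all_ascii = all(ord(ch) < 128 for ch in value)

print(dict(categories), all_ascii)
```

Each value holds 2 uppercase letters and 10 digits, which over 10000 rows is consistent with the 20000 Uppercase Letter and 100000 Decimal Number characters reported for this variable.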

Unique

Unique: 10000
Unique (%): 100.0%

Sample

1st row: HS1910011404
2nd row: HS1900080601
3rd row: HS1906051502
4th row: HS1907102203
5th row: HS1910052701
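
The sample values appear to follow the layout "HS" + YYYYMMDD + a two-digit suffix; for example, HS1910011404 is listed with the publication date 1910-01-14 in the dataset sample. A minimal parsing sketch under that assumption (the meaning of the suffix is not stated in the report, and `parse_management_number` is a hypothetical helper):

```python
import re
from datetime import date

# Assumed layout: "HS" + YYYYMMDD + two-digit suffix (suffix meaning unknown).
PATTERN = re.compile(r"^HS(\d{4})(\d{2})(\d{2})(\d{2})$")

def parse_management_number(value: str) -> tuple[date, str]:
    """Split a 관리번호 into its embedded date and trailing suffix."""
    match = PATTERN.match(value)
    if match is None:
        raise ValueError(f"unexpected 관리번호 format: {value!r}")
    year, month, day, suffix = match.groups()
    return date(int(year), int(month), int(day)), suffix

print(parse_management_number("HS1910011404"))
```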
Value | Count | Frequency (%)
hs1910011404 | 1 | < 0.1%
hs1906052404 | 1 | < 0.1%
hs1900090102 | 1 | < 0.1%
hs1904121403 | 1 | < 0.1%
hs1900103004 | 1 | < 0.1%
hs1905011902 | 1 | < 0.1%
hs1901101201 | 1 | < 0.1%
hs1903040404 | 1 | < 0.1%
hs1901123003 | 1 | < 0.1%
hs1903100303 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 31972 | 26.6%
1 | 22328 | 18.6%
9 | 13478 | 11.2%
H | 10000 | 8.3%
S | 10000 | 8.3%
2 | 9013 | 7.5%
3 | 5644 | 4.7%
4 | 5239 | 4.4%
8 | 4191 | 3.5%
7 | 2789 | 2.3%
Other values (2) | 5346 | 4.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 100000 | 83.3%
Uppercase Letter | 20000 | 16.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 31972
32.0%
1 22328
22.3%
9 13478
13.5%
2 9013
 
9.0%
3 5644
 
5.6%
4 5239
 
5.2%
8 4191
 
4.2%
7 2789
 
2.8%
6 2704
 
2.7%
5 2642
 
2.6%
Uppercase Letter
ValueCountFrequency (%)
H 10000
50.0%
S 10000
50.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 100000 | 83.3%
Latin | 20000 | 16.7%

Most frequent character per script

Common
ValueCountFrequency (%)
0 31972
32.0%
1 22328
22.3%
9 13478
13.5%
2 9013
 
9.0%
3 5644
 
5.6%
4 5239
 
5.2%
8 4191
 
4.2%
7 2789
 
2.8%
6 2704
 
2.7%
5 2642
 
2.6%
Latin
ValueCountFrequency (%)
H 10000
50.0%
S 10000
50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 120000 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 31972
26.6%
1 22328
18.6%
9 13478
11.2%
H 10000
 
8.3%
S 10000
 
8.3%
2 9013
 
7.5%
3 5644
 
4.7%
4 5239
 
4.4%
8 4191
 
3.5%
7 2789
 
2.3%
Other values (2) 5346
 
4.5%

신문구분
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
황성신문
10000 

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 황성신문
2nd row: 황성신문
3rd row: 황성신문
4th row: 황성신문
5th row: 황성신문

Common Values

Value | Count | Frequency (%)
황성신문 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot: bar chart of common values. 황성신문: 10000 (100.0%)]

발행일
Date

Distinct: 3404
Distinct (%): 34.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 1898-09-05 00:00:00
Maximum: 1910-09-14 00:00:00
Histogram with fixed size bins (bins=50)
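
A sketch of equal-width binning over the Minimum and Maximum above. The report does not state how its bin edges are chosen, so the even split into 50 bins is an assumption, and `bin_index` is a hypothetical helper.

```python
from datetime import date

# Minimum and Maximum as reported above.
lo, hi = date(1898, 9, 5), date(1910, 9, 14)
BINS = 50

span_days = (hi - lo).days
bin_width = span_days / BINS  # days covered by each bin (assumed equal width)

def bin_index(d: date) -> int:
    """Index of the equal-width bin containing d; the maximum is clamped into the last bin."""
    idx = int((d - lo).days / bin_width)
    return min(idx, BINS - 1)

print(span_days, round(bin_width, 2), bin_index(date(1905, 1, 30)))
```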

등록일
Date

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2005-11-24 00:00:00
Maximum: 2005-11-24 00:00:00
Histogram with fixed size bins (bins=1)

내용
Text

Distinct: 8317
Distinct (%): 83.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1024
Median length: 794
Mean length: 271.2099
Min length: 1

Characters and Unicode

Total characters: 2712099
Distinct characters: 5255
Distinct categories: 12
Distinct scripts: 4
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8272
Unique (%): 82.7%
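
The "Distinct" and "Unique" figures differ because they count different things: distinct counts the different values present, while unique counts only values that occur exactly once. A small sketch of the distinction on a toy column (the values are illustrative, not the dataset's actual counts):

```python
from collections import Counter

# Toy column; entries are illustrative only.
values = ["廣告", "廣告", "告白", "商標", "商標", "社告"]

counts = Counter(values)
n_distinct = len(counts)                              # number of different values
n_unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

print(n_distinct, n_unique)
```

For this variable, 8317 distinct and 8272 unique values out of 10000 rows give the reported 83.2% and 82.7%.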

Sample

1st row廣告
2nd row官報/宮廷錄事/議政府議政署理/議政府贊政/警部大臣臨時署理/度支部大臣/趙秉式/平安北道觀察使/李道宰/秘書丞/吳衡根/法部/法部署理大臣/閔種默/全權公使/兪箕煥/趙秉稷/議政府參政/陸軍副將/李允用/官廳事項/加平郡守/趙準熙/龍仁郡守/兪鎭哲/陰竹郡守/鄭基恒/振威郡守/李完鎬/陽川郡守/權世圭/始興郡守/李丙儀/砥平郡守/李載夏/積城郡守/尹命五/果川郡守/姜相驥/漣川郡守/沈鍾舜/陽智郡守/申容均/陽城/喬桐郡守/徐相喬/雜報/李韓被囚/李顯廚/韓基鎬/警部/一浦兩稅/忠南觀察使/農部/牙山/屯浦/宣禧宮/私鑄曰銅/私鑄/銅貨/度支/請從褒題/南陽/大阜/島로/林平三/議政府/鹽井/郡守/徐相鶴/義州對岸의淸兵/漢城新報/義州/鴨綠江/大東滿/大孤山
3rd row寄書/關西生/平安南北道/高麗/雜報/獸醫雇聘契約/農商工部大臣/日本政府/入交淸江/大韓政府/合同契約書/政府會議/大韓國農商工部獸醫/農商工部大臣/雇聘年限/光武/政會退期/鐵道支線踏査/湖南鐵道會社總事務長/徐午淳/農商工部/燕岐烏致院/燕岐광山公州益山魯城全州連山恩津/自動車會社/北松峴居/權丙壽/官公私立小學校/大運動式盛況/各員初會/訓鍊院前坪/小學校/入場景況/訓鍊院大廳前/大韓國旗/運動開始/總務長/李完用/事務長/金奎熙/特別事務員/幣原垣/事務員/申泰游/鄭喬外他諸/奔競審判/審判委員/蔡範錫/柳基永/優等施賞/施賞委員長/張世基/委員/閔健植/李夏珽/休息及立食/參政大臣/朴齊純/內大/李址鎔/農大/權重顯/度大/閔泳綺諸/內外國紳士及各新聞記者等諸/陛下勞問/志士寄附/在我國京城小學校長/日本人/橫山彌長谷川好道金次郞/繪葉書/崇呼●歸/裴氏施賞/昨日小學校學徒大運動/大韓每日申報社長英人裴說/舒川義擾急電/舒川郡守/李種奭/靑曾演說/皇城基督敎靑年會/濟衆院長/魚不信/個人的淸潔問題/減還添額/北靑郡民/金正學/朴東協/訓痘實施/濟州郡守/訓令/本郡種痘委員
4th row文明錄/咸安郡守/黃秘/朴泰魯/國債報償義務金集送人員及額數/坡州坡平面訥老里/固城郡/月峙洞/李宗煥/李廷明/佳里洞/黃宗喆/白贊洙/金炳萬/金柱天/李鎭源/白樂俊/白必鎭/卜里面歌瑟洞/金基琦/趙永祿/徐守文/曹又俊/朴再明/徐相淳/古谷洞/姜昌佑/李道正/朴貞來/林成龍/金在彬/朴秉浩/朴秉石/蔡敬默/蔡致默/文達元/劉奉允/張斗憲/박丙祿/金丁玉/金成玉/李道石/李在先/文化年/蔡仁默/朴永來/徐行能/金龍伊/林大見/朴相來/박相九/金奇眞/박義來/박性●/金周西/裴奉勸/鳥山洞/陳●成/陳得成/●●●/李志宇/李起聿/陳杞烈/陳甲成/陳元烈/陳年烈/박相五/박茂相/金根祚/韓基源/崔必成/韓渭源/韓進圭/陳千烈/韓鎭云/文在淳/金三用/韓鎭英/文昌圭/韓碩元/崔碩眞/조洞/●善宇/朴鎭木/金海眞/金璜/崔涇模/徐國乾/姜斗千/白達俊/李鎭●/정馬岩/정末岩/정金岩/●龍洞/李得石/李學文/金見榮/박善伊/박用來/金相洙/李相式/成敬善/李卨在/東山洞/金亨眞/金炳式/金●贊/金英夏/金炳烈/金炳浩/柳鎭洪/黃用洙/金鎭洙/辛石振/崔權祚/孫萬順/姜●●/●炳用/崔鳳祚/金尙辰/金以辰/柯洞/蔡東權/박成坤/文武兼/蔡圭成/蔡圭萬/蔡圭銀/蔡順默/郭址旭/蔡東周/蔡圭文/蔡圭用/蔡圭角/安文植/蔡圭植/蔡圭元/蔡圭正/安樂式/蔡圭伯/최命千/蔡圭權/金明辰/李在化/金永辰/金丙仁/최鶴千/朴丁坤/嚴光必/仙洞/朴永默/白파仁/李英實/李鎭采/배性寬/南相萬/林龍秋/諸敬彦/鄭基石/崔相伊/沈億萬/배順先/李克用/崔鶴用/鄭義●/南相德/望林洞/朴聳洙/朴子善/朴贊洙/朴石洙/朴鎭文/夫浦洞/白樂成/白규洙/李明규/白昌洙/白樂贊/白樂云/曹石九/朴永叔/金冠●/崔必淳/廉化德/金在洙/白丁洙/金亨權/白永兮/白命洙/林明式/白重洙/朴海龍/金致洙/九尾洞/李重先/徐元培/朴宗西/李叔亨/林宗汝/金君七/申子辰/朴丁守/梁召史/白化辰/黃자敬/黃召史/具仁汝/蔡召史/姜周西/永縣面店村/宋涇文/金起權/李康伯/李正道/李洛元/金又규/禹文甲/林淳相/李明祚/●世錢/禹卜萬/方海用/朴昌植/林夫億/林又夫億/최學權/林淳旭/林淳權/方日五/金哲甲/崔其年/金浩權/鄭貴哲/李祥●/崔哲文/孫永王/●●權/禹文圭/林文吉/晨村/李亨先/朴且榮/千一興/李承文/李伊●/吳永守/朴●化/金元辰/李性先/李克文/朴演化
5th row官報/敍任及辭令/森本豊吉/郡主事/御田尹太郞/道立咸興農業學校敎授/鶴崎敏行/重松昌成/西原貞治/尾崎登代太/岡本爲四郞/平馬錦一郞/大河內巖/宮崎隆義/中谷隆彦/道主事/船越光雄/種苗場技手/崔中順/臨時棉花栽培所技手/春川種苗場/張麟煥/外報/日本華族의政況/廷尙書와新法律/淸國法部尙書/廷杰/軍機大臣/那桐/淸露領地談判/러시아/黑龍江省/淸國/徐尙書의靑島行/淸國郵傳部尙書/徐世昌/獨逸膠/州灣/議院硏究會設立/淸國/淸政府秘密偵探/錦愛와蒙古王/蒙古/錦愛鐵道/東三省諮議國聯合會議/俄淸商約하議/亞國破工鎭靜/亞典/同盟破工/露議會에二案/폴란드/萬國衛生會議/波斯의形勢/페르시아/獨逸/南米紛爭의仲裁/●퀴드兩共和國/米國/伯西爾/亞典國/雲南採鑛問題/隆興/公使/雲南省/北京/獨葡通商條約/獨帝還幸期/英國의特赦/露國學堂飛艇會/露佛軍事務協約問題/프랑스/佛國의飛行艦隊/프랑스陸軍大臣/크島問題續聞/터키/雜報/實業家往參/白完爀/趙鎭泰/實業家楓岳行/金宗國/金基永/金剛山/馬山停車場竣工/舊馬山/銀貨犯被捉/任尙俊/靑年同志會移接/靑年同志會/昌校有人/慶北咸昌郡/私立昌明學校/贊務會長/金圭煥姜氏有志/泗川郡/郡守/姜甲秀/崔氏盛擧/崔寅熏/俱樂園宴會/金容鎭/兪吉/閔泳瓚/京城日報社/太極講會/太極敎宗本部/英博館觀覽者/英國博物館
Value | Count | Frequency (%)
廣告 | 1407 | 11.2%
仁港輪船出帆廣告/仁川港出帆廣告/本社告白 | 25 | 0.2%
廣告/仁港輸船出帆廣告/仁川港出帆廣告/本社告白 | 25 | 0.2%
廣告/仁港輪船出帆廣告/仁川港出帆廣告/本社告白 | 22 | 0.2%
廣告/本社告白/仁川港商船出入豫期表 | 19 | 0.2%
廣告/本社告白 | 17 | 0.1%
廣告/特別廣告 | 16 | 0.1%
商標 | 15 | 0.1%
告白 | 15 | 0.1%
本社特別廣告 | 14 | 0.1%
Other values (10764) | 10965 | 87.4%
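
The 내용 values are sequences of headings and names separated by "/". The sketch below splits one shortened value from the dataset sample into entries and counts them. Treating "●" as a placeholder for characters that could not be transcribed is an assumption; the report does not say what it represents.

```python
from collections import Counter

# Shortened excerpt of one 내용 value from the dataset sample; "/" separates entries.
content = "官報/外報/雜報/官報/●任及辭令/外報/●國●亂"

entries = content.split("/")
entry_counts = Counter(entries)

# Entries containing "●" (assumed to mark untranscribed characters).
masked = [entry for entry in entries if "●" in entry]

print(len(entries), entry_counts.most_common(2), masked)
```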

Most occurring characters

ValueCountFrequency (%)
/ 559878
 
20.6%
35871
 
1.3%
26914
 
1.0%
22216
 
0.8%
21903
 
0.8%
21716
 
0.8%
20445
 
0.8%
20331
 
0.7%
19751
 
0.7%
18384
 
0.7%
Other values (5245) 1944690
71.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2113395 | 77.9%
Other Punctuation | 560107 | 20.7%
Other Symbol | 35881 | 1.3%
Space Separator | 2544 | 0.1%
Lowercase Letter | 63 | < 0.1%
Dash Punctuation | 51 | < 0.1%
Uppercase Letter | 33 | < 0.1%
Decimal Number | 12 | < 0.1%
Close Punctuation | 5 | < 0.1%
Open Punctuation | 5 | < 0.1%
Other values (2) | 3 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
26914
 
1.3%
22216
 
1.1%
21903
 
1.0%
21716
 
1.0%
20445
 
1.0%
20331
 
1.0%
19751
 
0.9%
18384
 
0.9%
17714
 
0.8%
15878
 
0.8%
Other values (5195) 1908143
90.3%
Lowercase Letter
ValueCountFrequency (%)
e 10
15.9%
l 6
9.5%
r 5
7.9%
p 5
7.9%
o 5
7.9%
n 5
7.9%
d 4
 
6.3%
t 4
 
6.3%
a 4
 
6.3%
h 3
 
4.8%
Other values (6) 12
19.0%
Uppercase Letter
ValueCountFrequency (%)
C 7
21.2%
B 6
18.2%
M 5
15.2%
E 3
9.1%
O 3
9.1%
A 2
 
6.1%
G 2
 
6.1%
T 1
 
3.0%
R 1
 
3.0%
H 1
 
3.0%
Other values (2) 2
 
6.1%
Decimal Number
ValueCountFrequency (%)
1 4
33.3%
0 3
25.0%
6 2
16.7%
3 1
 
8.3%
5 1
 
8.3%
4 1
 
8.3%
Other Punctuation
ValueCountFrequency (%)
/ 559878
> 99.9%
, 211
 
< 0.1%
. 8
 
< 0.1%
& 6
 
< 0.1%
4
 
< 0.1%
Other Symbol
ValueCountFrequency (%)
35871
> 99.9%
8
 
< 0.1%
1
 
< 0.1%
1
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
< 1
50.0%
> 1
50.0%
Space Separator
ValueCountFrequency (%)
2544
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 51
100.0%
Close Punctuation
ValueCountFrequency (%)
] 5
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 5
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Han | 2025432 | 74.7%
Common | 598608 | 22.1%
Hangul | 87963 | 3.2%
Latin | 96 | < 0.1%

Most frequent character per script

Han
ValueCountFrequency (%)
26914
 
1.3%
22216
 
1.1%
21903
 
1.1%
21716
 
1.1%
20445
 
1.0%
20331
 
1.0%
19751
 
1.0%
18384
 
0.9%
17714
 
0.9%
15878
 
0.8%
Other values (4212) 1820180
89.9%
Hangul
ValueCountFrequency (%)
11129
 
12.7%
4284
 
4.9%
4189
 
4.8%
3555
 
4.0%
3452
 
3.9%
2705
 
3.1%
1573
 
1.8%
1414
 
1.6%
1376
 
1.6%
1276
 
1.5%
Other values (973) 53010
60.3%
Latin
ValueCountFrequency (%)
e 10
 
10.4%
C 7
 
7.3%
B 6
 
6.2%
l 6
 
6.2%
r 5
 
5.2%
p 5
 
5.2%
o 5
 
5.2%
M 5
 
5.2%
n 5
 
5.2%
d 4
 
4.2%
Other values (18) 38
39.6%
Common
ValueCountFrequency (%)
/ 559878
93.5%
35871
 
6.0%
2544
 
0.4%
, 211
 
< 0.1%
- 51
 
< 0.1%
. 8
 
< 0.1%
8
 
< 0.1%
& 6
 
< 0.1%
] 5
 
< 0.1%
[ 5
 
< 0.1%
Other values (12) 21
 
< 0.1%

Most occurring blocks

Value | Count | Frequency (%)
CJK | 1986632 | 73.3%
ASCII | 562819 | 20.8%
Hangul | 87959 | 3.2%
CJK Compat Ideographs | 38800 | 1.4%
Geometric Shapes | 35881 | 1.3%
None | 4 | < 0.1%
Compat Jamo | 4 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 559878
99.5%
2544
 
0.5%
, 211
 
< 0.1%
- 51
 
< 0.1%
e 10
 
< 0.1%
. 8
 
< 0.1%
C 7
 
< 0.1%
B 6
 
< 0.1%
& 6
 
< 0.1%
l 6
 
< 0.1%
Other values (35) 92
 
< 0.1%
Geometric Shapes
ValueCountFrequency (%)
35871
> 99.9%
8
 
< 0.1%
1
 
< 0.1%
1
 
< 0.1%
CJK
ValueCountFrequency (%)
26914
 
1.4%
22216
 
1.1%
21903
 
1.1%
21716
 
1.1%
20445
 
1.0%
20331
 
1.0%
19751
 
1.0%
18384
 
0.9%
17714
 
0.9%
15878
 
0.8%
Other values (4036) 1781380
89.7%
CJK Compat Ideographs
ValueCountFrequency (%)
15163
39.1%
10040
25.9%
1082
 
2.8%
789
 
2.0%
649
 
1.7%
639
 
1.6%
464
 
1.2%
456
 
1.2%
407
 
1.0%
390
 
1.0%
Other values (166) 8721
22.5%
Hangul
ValueCountFrequency (%)
11129
 
12.7%
4284
 
4.9%
4189
 
4.8%
3555
 
4.0%
3452
 
3.9%
2705
 
3.1%
1573
 
1.8%
1414
 
1.6%
1376
 
1.6%
1276
 
1.5%
Other values (969) 53006
60.3%
None
ValueCountFrequency (%)
4
100.0%
Compat Jamo
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

IDP코드
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
IDP-NP-007
10000 

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: IDP-NP-007
2nd row: IDP-NP-007
3rd row: IDP-NP-007
4th row: IDP-NP-007
5th row: IDP-NP-007

Common Values

Value | Count | Frequency (%)
IDP-NP-007 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot: bar chart of common values. IDP-NP-007: 10000 (100.0%)]

UCI코드
Text

UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 41
Median length: 41
Mean length: 41
Min length: 41

Characters and Unicode

Total characters: 410000
Distinct characters: 21
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10000
Unique (%): 100.0%

Sample

1st row: G001+KR03-B370008.051124.D0.HS1910011404:
2nd row: G001+KR03-B370008.051124.D0.HS1900080601:
3rd row: G001+KR03-B370008.051124.D0.HS1906051502:
4th row: G001+KR03-B370008.051124.D0.HS1907102203:
5th row: G001+KR03-B370008.051124.D0.HS1910052701:
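
In the sample rows, each UCI코드 looks like a fixed prefix followed by the record's 관리번호 and a trailing colon. A sketch that extracts the embedded 관리번호 under that assumption (the pattern is inferred from the five sample rows only; `management_number_from_uci` is a hypothetical helper):

```python
PREFIX = "G001+KR03-B370008.051124.D0."  # shared by all five sample rows

def management_number_from_uci(uci: str) -> str:
    """Return the 관리번호 portion, assuming UCI = PREFIX + 관리번호 + ':'."""
    if not (uci.startswith(PREFIX) and uci.endswith(":")):
        raise ValueError(f"unexpected UCI layout: {uci!r}")
    return uci[len(PREFIX):-1]

uci = "G001+KR03-B370008.051124.D0.HS1910011404:"
print(len(uci), management_number_from_uci(uci))
```

The total of 41 characters (28 for the prefix, 12 for the 관리번호, 1 for the colon) agrees with the fixed length reported for this variable.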
Value | Count | Frequency (%)
g001+kr03-b370008.051124.d0.hs1910011404 | 1 | < 0.1%
g001+kr03-b370008.051124.d0.hs1906052404 | 1 | < 0.1%
g001+kr03-b370008.051124.d0.hs1900090102 | 1 | < 0.1%
g001+kr03-b370008.051124.d0.hs1904121403 | 1 | < 0.1%
g001+kr03-b370008.051124.d0.hs1900103004 | 1 | < 0.1%
g001+kr03-b370008.051124.d0.hs1905011902 | 1 | < 0.1%
g001+kr03-b370008.051124.d0.hs1901101201 | 1 | < 0.1%
g001+kr03-b370008.051124.d0.hs1903040404 | 1 | < 0.1%
g001+kr03-b370008.051124.d0.hs1901123003 | 1 | < 0.1%
g001+kr03-b370008.051124.d0.hs1903100303 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 111972 | 27.3%
1 | 52328 | 12.8%
. | 30000 | 7.3%
3 | 25644 | 6.3%
2 | 19013 | 4.6%
4 | 15239 | 3.7%
8 | 14191 | 3.5%
9 | 13478 | 3.3%
7 | 12789 | 3.1%
5 | 12642 | 3.1%
Other values (11) | 102704 | 25.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 280000 | 68.3%
Uppercase Letter | 70000 | 17.1%
Other Punctuation | 40000 | 9.8%
Dash Punctuation | 10000 | 2.4%
Math Symbol | 10000 | 2.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 111972
40.0%
1 52328
18.7%
3 25644
 
9.2%
2 19013
 
6.8%
4 15239
 
5.4%
8 14191
 
5.1%
9 13478
 
4.8%
7 12789
 
4.6%
5 12642
 
4.5%
6 2704
 
1.0%
Uppercase Letter
ValueCountFrequency (%)
S 10000
14.3%
H 10000
14.3%
D 10000
14.3%
G 10000
14.3%
B 10000
14.3%
R 10000
14.3%
K 10000
14.3%
Other Punctuation
ValueCountFrequency (%)
. 30000
75.0%
: 10000
 
25.0%
Dash Punctuation
ValueCountFrequency (%)
- 10000
100.0%
Math Symbol
ValueCountFrequency (%)
+ 10000
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 340000 | 82.9%
Latin | 70000 | 17.1%

Most frequent character per script

Common
ValueCountFrequency (%)
0 111972
32.9%
1 52328
15.4%
. 30000
 
8.8%
3 25644
 
7.5%
2 19013
 
5.6%
4 15239
 
4.5%
8 14191
 
4.2%
9 13478
 
4.0%
7 12789
 
3.8%
5 12642
 
3.7%
Other values (4) 32704
 
9.6%
Latin
ValueCountFrequency (%)
S 10000
14.3%
H 10000
14.3%
D 10000
14.3%
G 10000
14.3%
B 10000
14.3%
R 10000
14.3%
K 10000
14.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 410000 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 111972
27.3%
1 52328
12.8%
. 30000
 
7.3%
3 25644
 
6.3%
2 19013
 
4.6%
4 15239
 
3.7%
8 14191
 
3.5%
9 13478
 
3.3%
7 12789
 
3.1%
5 12642
 
3.1%
Other values (11) 102704
25.0%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
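
To make the two views concrete, the sketch below builds a nullity matrix (one boolean per cell) and per-column counts of present values from a toy table. The rows are illustrative and include a hypothetical gap; the profiled dataset itself reports 0 missing cells.

```python
# Toy rows keyed by two of the dataset's column names; None marks a missing cell.
rows = [
    {"관리번호": "HS1910011404", "내용": "廣告"},
    {"관리번호": "HS1902091005", "내용": None},  # hypothetical gap, not in the data
]
columns = ["관리번호", "내용"]

# Nullity matrix: one row per record, True where the cell is present.
matrix = [[row[col] is not None for col in columns] for row in rows]

# Per-column count of present values, the quantity a nullity-by-column chart shows.
present_counts = {col: sum(r[i] for r in matrix) for i, col in enumerate(columns)}

print(matrix, present_counts)
```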

Sample

관리번호 | 신문구분 | 발행일 | 등록일 | 내용 | IDP코드 | UCI코드
13074HS1910011404황성신문1910-01-142005-11-24廣告IDP-NP-007G001+KR03-B370008.051124.D0.HS1910011404:
2401HS1900080601황성신문1900-08-062005-11-24官報/宮廷錄事/議政府議政署理/議政府贊政/警部大臣臨時署理/度支部大臣/趙秉式/平安北道觀察使/李道宰/秘書丞/吳衡根/法部/法部署理大臣/閔種默/全權公使/兪箕煥/趙秉稷/議政府參政/陸軍副將/李允用/官廳事項/加平郡守/趙準熙/龍仁郡守/兪鎭哲/陰竹郡守/鄭基恒/振威郡守/李完鎬/陽川郡守/權世圭/始興郡守/李丙儀/砥平郡守/李載夏/積城郡守/尹命五/果川郡守/姜相驥/漣川郡守/沈鍾舜/陽智郡守/申容均/陽城/喬桐郡守/徐相喬/雜報/李韓被囚/李顯廚/韓基鎬/警部/一浦兩稅/忠南觀察使/農部/牙山/屯浦/宣禧宮/私鑄曰銅/私鑄/銅貨/度支/請從褒題/南陽/大阜/島로/林平三/議政府/鹽井/郡守/徐相鶴/義州對岸의淸兵/漢城新報/義州/鴨綠江/大東滿/大孤山IDP-NP-007G001+KR03-B370008.051124.D0.HS1900080601:
8340HS1906051502황성신문1906-05-152005-11-24寄書/關西生/平安南北道/高麗/雜報/獸醫雇聘契約/農商工部大臣/日本政府/入交淸江/大韓政府/合同契約書/政府會議/大韓國農商工部獸醫/農商工部大臣/雇聘年限/光武/政會退期/鐵道支線踏査/湖南鐵道會社總事務長/徐午淳/農商工部/燕岐烏致院/燕岐광山公州益山魯城全州連山恩津/自動車會社/北松峴居/權丙壽/官公私立小學校/大運動式盛況/各員初會/訓鍊院前坪/小學校/入場景況/訓鍊院大廳前/大韓國旗/運動開始/總務長/李完用/事務長/金奎熙/特別事務員/幣原垣/事務員/申泰游/鄭喬外他諸/奔競審判/審判委員/蔡範錫/柳基永/優等施賞/施賞委員長/張世基/委員/閔健植/李夏珽/休息及立食/參政大臣/朴齊純/內大/李址鎔/農大/權重顯/度大/閔泳綺諸/內外國紳士及各新聞記者等諸/陛下勞問/志士寄附/在我國京城小學校長/日本人/橫山彌長谷川好道金次郞/繪葉書/崇呼●歸/裴氏施賞/昨日小學校學徒大運動/大韓每日申報社長英人裴說/舒川義擾急電/舒川郡守/李種奭/靑曾演說/皇城基督敎靑年會/濟衆院長/魚不信/個人的淸潔問題/減還添額/北靑郡民/金正學/朴東協/訓痘實施/濟州郡守/訓令/本郡種痘委員IDP-NP-007G001+KR03-B370008.051124.D0.HS1906051502:
10210HS1907102203황성신문1907-10-222005-11-24文明錄/咸安郡守/黃秘/朴泰魯/國債報償義務金集送人員及額數/坡州坡平面訥老里/固城郡/月峙洞/李宗煥/李廷明/佳里洞/黃宗喆/白贊洙/金炳萬/金柱天/李鎭源/白樂俊/白必鎭/卜里面歌瑟洞/金基琦/趙永祿/徐守文/曹又俊/朴再明/徐相淳/古谷洞/姜昌佑/李道正/朴貞來/林成龍/金在彬/朴秉浩/朴秉石/蔡敬默/蔡致默/文達元/劉奉允/張斗憲/박丙祿/金丁玉/金成玉/李道石/李在先/文化年/蔡仁默/朴永來/徐行能/金龍伊/林大見/朴相來/박相九/金奇眞/박義來/박性●/金周西/裴奉勸/鳥山洞/陳●成/陳得成/●●●/李志宇/李起聿/陳杞烈/陳甲成/陳元烈/陳年烈/박相五/박茂相/金根祚/韓基源/崔必成/韓渭源/韓進圭/陳千烈/韓鎭云/文在淳/金三用/韓鎭英/文昌圭/韓碩元/崔碩眞/조洞/●善宇/朴鎭木/金海眞/金璜/崔涇模/徐國乾/姜斗千/白達俊/李鎭●/정馬岩/정末岩/정金岩/●龍洞/李得石/李學文/金見榮/박善伊/박用來/金相洙/李相式/成敬善/李卨在/東山洞/金亨眞/金炳式/金●贊/金英夏/金炳烈/金炳浩/柳鎭洪/黃用洙/金鎭洙/辛石振/崔權祚/孫萬順/姜●●/●炳用/崔鳳祚/金尙辰/金以辰/柯洞/蔡東權/박成坤/文武兼/蔡圭成/蔡圭萬/蔡圭銀/蔡順默/郭址旭/蔡東周/蔡圭文/蔡圭用/蔡圭角/安文植/蔡圭植/蔡圭元/蔡圭正/安樂式/蔡圭伯/최命千/蔡圭權/金明辰/李在化/金永辰/金丙仁/최鶴千/朴丁坤/嚴光必/仙洞/朴永默/白파仁/李英實/李鎭采/배性寬/南相萬/林龍秋/諸敬彦/鄭基石/崔相伊/沈億萬/배順先/李克用/崔鶴用/鄭義●/南相德/望林洞/朴聳洙/朴子善/朴贊洙/朴石洙/朴鎭文/夫浦洞/白樂成/白규洙/李明규/白昌洙/白樂贊/白樂云/曹石九/朴永叔/金冠●/崔必淳/廉化德/金在洙/白丁洙/金亨權/白永兮/白命洙/林明式/白重洙/朴海龍/金致洙/九尾洞/李重先/徐元培/朴宗西/李叔亨/林宗汝/金君七/申子辰/朴丁守/梁召史/白化辰/黃자敬/黃召史/具仁汝/蔡召史/姜周西/永縣面店村/宋涇文/金起權/李康伯/李正道/李洛元/金又규/禹文甲/林淳相/李明祚/●世錢/禹卜萬/方海用/朴昌植/林夫億/林又夫億/최學權/林淳旭/林淳權/方日五/金哲甲/崔其年/金浩權/鄭貴哲/李祥●/崔哲文/孫永王/●●權/禹文圭/林文吉/晨村/李亨先/朴且榮/千一興/李承文/李伊●/吳永守/朴●化/金元辰/李性先/李克文/朴演化IDP-NP-007G001+KR03-B370008.051124.D0.HS1907102203:
13186HS1910052701황성신문1910-05-272005-11-24官報/敍任及辭令/森本豊吉/郡主事/御田尹太郞/道立咸興農業學校敎授/鶴崎敏行/重松昌成/西原貞治/尾崎登代太/岡本爲四郞/平馬錦一郞/大河內巖/宮崎隆義/中谷隆彦/道主事/船越光雄/種苗場技手/崔中順/臨時棉花栽培所技手/春川種苗場/張麟煥/外報/日本華族의政況/廷尙書와新法律/淸國法部尙書/廷杰/軍機大臣/那桐/淸露領地談判/러시아/黑龍江省/淸國/徐尙書의靑島行/淸國郵傳部尙書/徐世昌/獨逸膠/州灣/議院硏究會設立/淸國/淸政府秘密偵探/錦愛와蒙古王/蒙古/錦愛鐵道/東三省諮議國聯合會議/俄淸商約하議/亞國破工鎭靜/亞典/同盟破工/露議會에二案/폴란드/萬國衛生會議/波斯의形勢/페르시아/獨逸/南米紛爭의仲裁/●퀴드兩共和國/米國/伯西爾/亞典國/雲南採鑛問題/隆興/公使/雲南省/北京/獨葡通商條約/獨帝還幸期/英國의特赦/露國學堂飛艇會/露佛軍事務協約問題/프랑스/佛國의飛行艦隊/프랑스陸軍大臣/크島問題續聞/터키/雜報/實業家往參/白完爀/趙鎭泰/實業家楓岳行/金宗國/金基永/金剛山/馬山停車場竣工/舊馬山/銀貨犯被捉/任尙俊/靑年同志會移接/靑年同志會/昌校有人/慶北咸昌郡/私立昌明學校/贊務會長/金圭煥姜氏有志/泗川郡/郡守/姜甲秀/崔氏盛擧/崔寅熏/俱樂園宴會/金容鎭/兪吉/閔泳瓚/京城日報社/太極講會/太極敎宗本部/英博館觀覽者/英國博物館IDP-NP-007G001+KR03-B370008.051124.D0.HS1910052701:
4802HS1902091005황성신문1902-09-102005-11-24社告IDP-NP-007G001+KR03-B370008.051124.D0.HS1902091005:
7224HS1905013001황성신문1905-01-302005-11-24官報/外報/雜報/官報/●任及辭令/外報/●國●亂/●●粮의乏絶/●●●●/國●●의●行/雜報/●●通文條例/取졸治績/一會質問/俄國莫斯科/大韓帝國/姜●●/李●●/高源●/姜仁基/鄭●●/金東●/朴●●/權●●/延浚/權興●/李●●/金永默/李冕容/柳文拜/李胤●/李●永/安●和/申●善/徐相大/禹●釋/金●圭/成岐連/慶南觀察使/●逸●/●南郡守/尹定植/尹甲炳/●丙漢/趙鼎允/●川郡守/尹●天/楚山郡守/尹錫天/李榕承/朴潤秀/●理總監事務/敎育副監●●/●道●主事/中樞院●●●/法官養成所博士/一進會/大韓日報東京●/●廳條例IDP-NP-007G001+KR03-B370008.051124.D0.HS1905013001:
2824HS1901011204황성신문1901-01-122005-11-24廣告IDP-NP-007G001+KR03-B370008.051124.D0.HS1901011204:
13242HS1910061801황성신문1910-06-182005-11-24墳墓觀念及迷信에就하야/官報/敍任及辭令/梅原三千/河野浦吉/原六郞/篠塚隆次/長谷川庄太郞/申敬鉉/梅原三千/原六郞/外報/南博獨逸部/淸國南京博覽會/獨逸/露工夫破工/러시아/露大臣向東/露相辭職說/亞典國의新統領/亞典國/英元帥辭職/英國/紐費間費行/米國/埃及人自治可否/英國/露國飛機熱/러시아/埃及案討議/英國/露國의國防計劃/러시아/智利獨立察/南米智利國/智利와日本/獨相과淸使/獨逸/淸國/淸獨의談判/獨逸陸軍大演習/獨逸/크릿트問題續聞/크릿트島/프랑스/런던/獨逸/英國/그리스/터키/英國/雜報/嶠會定期/孤院寄付/李容相/親睦總會/趙완九/談屑/莫大한音樂家給料/米國/皇城新聞社IDP-NP-007G001+KR03-B370008.051124.D0.HS1910061801:
991HS1899120202황성신문1899-12-022005-11-24論說/東亞/雜報/洪氏致祭/乙未事變/洪啓薰/紙幣加計와日商貿穀/黑死病과各港檢疫/日本/神戶/黑死病/仁港/釜港/元港/英語新師/英語總敎師/허치슨/副敎師/핼리팩스/學部/外部/英公館/英國人/一土兩稅/甕津郡/昌麟島/麒麟島/內藏院/觀察使/閔亨植/宮內府/俄艦去留/馬山浦/러시아/巨文嶋/長崎/蔘民과日人/開城/蔘圃/外部/漢城/白人種/黃人種/西歐人/西勢東漸/植民/列强/外國人IDP-NP-007G001+KR03-B370008.051124.D0.HS1899120202:
관리번호 | 신문구분 | 발행일 | 등록일 | 내용 | IDP코드 | UCI코드
4328HS1902062703황성신문1902-06-272005-11-24廣告/金益昇/高木/治病院/仁港輸船出帆廣告/仁川港出帆廣告/本社告白IDP-NP-007G001+KR03-B370008.051124.D0.HS1902062703:
10197HS1907101102황성신문1907-10-112005-11-24論說/新聞束縛의條例/雜報/御駕出迎/皇后陛下接見/奉迎節次/郡守任免/四郡守依免/各部陪從員/東南門修理/度次巡視/月內移御/請願日至/內下錢還納/官私立學徒祗迎/幸路更築/命令龜山/咸南宣諭停止/宣諭報告/仁川府尹報告/畿察報告/畿察又報/海察報告/楊根報告/地方消息一通/天氣豫報IDP-NP-007G001+KR03-B370008.051124.D0.HS1907101102:
3200HS1901012802황성신문1901-01-282005-11-24雜報/尹任畿察/京畿觀察使/李載克/特進官/秘書院卿/尹德榮/影파陪往/議政/尹容善/穆淸殿/太祖高皇帝影幀/開城府/衛將/府尹/鎭衛隊參領/雪告民弊/鎭南/內部/授與勳章/大阪每日新聞/日皇/韓國宮內大臣/外部大臣/高等勳章/林/公使/權仕行長/農商入臣/權在衡/大韓特立銀行長/設宴話別/日本代理公使/山座圓次郞/秘飭捕柳/忠北觀察/柳麟錫/李氏兼仕/度支協辦/李容翊/典환局/親衡三大隊聯隊長/誤打其子/紫門澗/私鑄買器/楊州/私鑄錢器械/警部/世界産金額/米國/製幣局長/外報/滿州保護에關한密約/俄國/滿州/달조將軍/增祺/俄國關東總督/일릭시압/고로스도벳지/奉天府/盛京省/俄國鐵道敷設工事/淸國/俄國辦理官/北京/李鴻章/우구돔스기/廣告/鄭敦永/池相哲/柳星均IDP-NP-007G001+KR03-B370008.051124.D0.HS1901012802:
7845HS1905081104황성신문1905-08-112005-11-24至急廣告IDP-NP-007G001+KR03-B370008.051124.D0.HS1905081104:
1527HS1900042404황성신문1900-04-242005-11-24廣告IDP-NP-007G001+KR03-B370008.051124.D0.HS1900042404:
3265HS1901061703황성신문1901-06-172005-11-24廣告IDP-NP-007G001+KR03-B370008.051124.D0.HS1901061703:
13282HS1910063002황성신문1910-06-302005-11-24論說/文勝의弊害를痛論함/續/新統監渡韓期/寺內/統監/統監渡韓一說/日本의革命黨/東京/陸軍豫算編成/東拓顧問渡韓/東洋拓殖會社農政經濟顧問/松浦博/日本海軍擴張/雜報/語學日進/總相陛見/昌德宮/各大密議/農大訪問/農商大臣/趙重應/朴齊純/若林着京/警視總監/若林/警察案頒佈延期/木內同渡/副統監/內容果然否/優哉遊哉/義親王/分遣所位置視察/警視廳/日憲兵司令部/財局調査終了/度支部臨時財産整理局/慰安其心/內部警務局/李察上京/平南觀察使/李軫鎬/內部大臣/留學生調査送交/憲兵司令部/水野視察/水野/水原/勸業模範場農林學校/視察後歸國期/馬山/釜山/築園蒔花/承寧府侍從/李恒九/得無托詞/李完用/副統監迎接/石井/山縣/趙高會議/政友會員/高羲駿/趙重應/申氏의住所探問/郡守/申耕熙/警視廳/法令要錄配付/內部/石塚有病/石塚/合同宿直/度支部/任婦觀藝/社稷修理費請撥/內部/因何被捉/違規徵金/日人承認乎/密賣淫大懲治/申訴退却/大東報押收/米國의獨立紀念/米國/山縣/副統監/印紙賣下의團束/度支部/宜其調査/帝社喜信/協判辦/沈相翊/鄭雲復/着手生春/李顯宰/被殺尸體發見/林大男/尹秉斗/詞藻/歸去好IDP-NP-007G001+KR03-B370008.051124.D0.HS1910063002:
4628HS1902102502황성신문1902-10-252005-11-24別報/第六條/雜報/日官來訪/日公館書記官萩原/學部大臣/漢府指定/漢城府/平壤隊/漢●少尹/河圭一/博物出●/農商工部/民家●收/●人/高率基/郵司告示/通信院/新聞/船稅訓飭/管船課/移●審査/陸軍法院/元帥府訓今/法部/平理院/李鍾健/在美韓人/美國/水●調査/水輪院/兩氏起復/陸軍參將/李址鎔/軍部協辦/●俊源/卒●歸國/日本/李●榮/鐵道學校/京釜●道會社/堤訓殺人/淮陽IDP-NP-007G001+KR03-B370008.051124.D0.HS1902102502:
164HS1899030702황성신문1899-03-072005-11-24官報/宮廷錄事/閔商鎬/外部協辦/外部大臣/朴齊純/申應朝/趙秉式/閔種默/宋秉璿/趙臣熙/宮內府特進官/議政府贊政/韓圭卨/忠淸北道觀察使/朴齊億/敍任及辭令/秘書院丞/鄭寅燮/江原道觀察府主事/朱錫興/姜龍洙/平安南道觀察府主事/金均錫/廣告/學部/雜報/民不聊生/學校興旺/巡洞私立小學校/金珏鉉/椎勝於劒/日使照會/駐京日公使/加藤增雄/李埈鎔/韓國/皇帝/光興學校IDP-NP-007G001+KR03-B370008.051124.D0.HS1899030702:
11846HS1909042204황성신문1909-04-222005-11-24廣告IDP-NP-007G001+KR03-B370008.051124.D0.HS1909042204: