Dataset statistics
Number of variables | 59 |
---|---|
Number of observations | 6296 |
Missing cells | 153783 |
Missing cells (%) | 41.4% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 2.9 MiB |
Average record size in memory | 475.0 B |
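The percentages in the table above follow from the raw counts. As a quick sanity check, the sketch below recomputes the missing-cell share and the approximate total size from the reported figures. The variable names are illustrative, and the reported 2.9 MiB is a rounded value, so the size comparison is only approximate.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 59
n_observations = 6296
missing_cells = 153783
avg_record_bytes = 475.0

# Every variable has one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Approximate total size implied by the average record size, in MiB.
total_mib = avg_record_bytes * n_observations / 2**20

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"approx. size: {total_mib:.1f} MiB")
```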
Variable types
Text | 44 |
---|---|
Categorical | 13 |
Unsupported | 1 |
Numeric | 1 |
Dataset
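Description | Data from the Japanese colonial-era surveillance subject cards (일제감시대상인물카드), cards on persons under surveillance compiled by the Japanese colonial police, provided by the Korean History Database (한국사데이터베이스, http://db.history.go.kr/), a website offering major sources on Korean history from ancient to modern times |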
---|---|
Author | National Institute of Korean History, Ministry of Education (교육부 국사편찬위원회) |
URL | https://www.data.go.kr/data/15053627/fileData.do |
ITEM_ID has constant value "ia" | Constant |
REGISTER_DATE has constant value "" | Constant |
INDICTMENT is highly imbalanced (94.5%) | Imbalance |
INDICTMENT_OFFICE is highly imbalanced (92.8%) | Imbalance |
CRIME_RECORD is highly imbalanced (96.8%) | Imbalance |
EXECUTIVE_PRISON is highly imbalanced (72.7%) | Imbalance |
PRISON is highly imbalanced (72.8%) | Imbalance |
IMAGE_QUANTITY is highly imbalanced (99.8%) | Imbalance |
REGISTRANT is highly imbalanced (99.7%) | Imbalance |
MODIFY_DATE is highly imbalanced (98.4%) | Imbalance |
MODIFIER is highly imbalanced (97.7%) | Imbalance |
STATUS is highly imbalanced (70.6%) | Imbalance |
REGIST_NO has 134 (2.1%) missing values | Missing |
SERIAL_NO has 2924 (46.4%) missing values | Missing |
ALIAS_CH has 3990 (63.4%) missing values | Missing |
ALIAS_KR has 4103 (65.2%) missing values | Missing |
FINGERPRINT_NO has 2538 (40.3%) missing values | Missing |
AGE has 525 (8.3%) missing values | Missing |
TYPE_NO has 6281 (99.8%) missing values | Missing |
CAREER has 1535 (24.4%) missing values | Missing |
FAMILY_NAME has 6247 (99.2%) missing values | Missing |
FAMILY_RELATION has 6252 (99.3%) missing values | Missing |
FATHER_NAME has 6255 (99.3%) missing values | Missing |
TAIL has 3501 (55.6%) missing values | Missing |
FEATURES has 5749 (91.3%) missing values | Missing |
FEATURES_NO has 6294 (> 99.9%) missing values | Missing |
ORIGIN_ADDRESS has 437 (6.9%) missing values | Missing |
BIRTH_PLACE has 1118 (17.8%) missing values | Missing |
ADDRESS has 538 (8.5%) missing values | Missing |
INDICTMENT_DATE has 6082 (96.6%) missing values | Missing |
RELEASE has 6211 (98.6%) missing values | Missing |
CRIME_NAME has 467 (7.4%) missing values | Missing |
PRISON_TERM has 2794 (44.4%) missing values | Missing |
PRISON_DATE has 5575 (88.5%) missing values | Missing |
SENTENCE_OFFICE has 2880 (45.7%) missing values | Missing |
SENTENCE_DATE has 3624 (57.6%) missing values | Missing |
ADMISSION_DATE has 3935 (62.5%) missing values | Missing |
RELEASE_DATE has 3091 (49.1%) missing values | Missing |
CRIMINAL_RECORD has 4962 (78.8%) missing values | Missing |
NOTE has 5489 (87.2%) missing values | Missing |
CRIMINAL_REASON has 6286 (99.8%) missing values | Missing |
ACCOMPLICE_NAME has 6285 (99.8%) missing values | Missing |
RELEASE_PLACE has 6285 (99.8%) missing values | Missing |
ARREST_OFFICE has 5355 (85.1%) missing values | Missing |
ARREST has 5868 (93.2%) missing values | Missing |
TYPES has 5839 (92.7%) missing values | Missing |
WANDERPLACE has 6296 (100.0%) missing values | Missing |
PHOTOGRAPHING has 1258 (20.0%) missing values | Missing |
PRESERVE_NEGATIVE has 1229 (19.5%) missing values | Missing |
CAPTION has 4293 (68.2%) missing values | Missing |
PHOTOGRAPHING_DATE has 1258 (20.0%) missing values | Missing |
LEVEL_ID has unique values | Unique |
WANDERPLACE is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
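The "Missing" alerts above share one sentence pattern, so they can be turned into a lookup of columns that are almost entirely empty. The following is a minimal sketch under stated assumptions: only a handful of alert lines are copied in, the helper names `parse_missing_alerts` and `mostly_empty` are hypothetical, and the 99% threshold is an arbitrary illustration rather than a recommendation from the report.

```python
import re

# A few alert lines copied verbatim from the report; extend as needed.
ALERTS = """\
REGIST_NO has 134 (2.1%) missing values | Missing |
AGE has 525 (8.3%) missing values | Missing |
TYPE_NO has 6281 (99.8%) missing values | Missing |
FEATURES_NO has 6294 (> 99.9%) missing values | Missing |
WANDERPLACE has 6296 (100.0%) missing values | Missing |
"""

# Column name, missing count, and percentage (optionally prefixed by "> " or "< ").
PATTERN = re.compile(r"(\w+) has (\d+) \((?:[<>] )?([\d.]+)%\) missing values")


def parse_missing_alerts(text):
    """Return {column: (missing_count, missing_pct)} for each matching line."""
    result = {}
    for line in text.splitlines():
        match = PATTERN.match(line.strip())
        if match:
            name, count, pct = match.groups()
            result[name] = (int(count), float(pct))
    return result


def mostly_empty(alerts, threshold_pct=99.0):
    """Columns whose reported missing share is at or above the threshold."""
    return sorted(name for name, (_, pct) in alerts.items() if pct >= threshold_pct)


alerts = parse_missing_alerts(ALERTS)
print(mostly_empty(alerts))
```

Columns flagged this way are candidates for closer inspection, not automatic removal; a field such as WANDERPLACE may be empty in this export yet meaningful in the original cards.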
Reproduction
Analysis started | 2023-12-12 16:33:30.736846 |
---|---|
Analysis finished | 2023-12-12 16:33:37.943854 |
Duration | 7.21 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
LEVEL_ID
Text
UNIQUE
 
Distinct | 6296 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 49.3 KiB |
Length
Max length | 12 |
---|---|
Median length | 12 |
Mean length | 12 |
Min length | 12 |
Characters and Unicode
Total characters | 75552 |
---|---|
Distinct characters | 13 |
Distinct categories | 3 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 6296 |
---|---|
Unique (%) | 100.0% |
Sample
1st row | ia_0001_0001 |
---|---|
2nd row | ia_0002_0002 |
3rd row | ia_0003_0003 |
4th row | ia_0004_0004 |
5th row | ia_0005_0006 |
Value | Count | Frequency (%) |
ia_0001_0001 | 1 | < 0.1% |
ia_4181_3252 | 1 | < 0.1% |
ia_4179_3250 | 1 | < 0.1% |
ia_4178_3249 | 1 | < 0.1% |
ia_4177_3248 | 1 | < 0.1% |
ia_4176_3247 | 1 | < 0.1% |
ia_4175_3246 | 1 | < 0.1% |
ia_4174_3245 | 1 | < 0.1% |
ia_4173_3244 | 1 | < 0.1% |
ia_4172_3244 | 1 | < 0.1% |
Other values (6286) | 6286 |
Most occurring characters
Value | Count | Frequency (%) |
_ | 12592 | |
i | 6296 | |
a | 6296 | |
0 | 6208 | |
1 | 6184 | |
2 | 6176 | |
3 | 6053 | |
4 | 5899 | |
5 | 4809 | 6.4% |
6 | 3996 | 5.3% |
Other values (3) | 11043 |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 50368 | |
Connector Punctuation | 12592 | 16.7% |
Lowercase Letter | 12592 | 16.7% |
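The category names in this table correspond to Unicode general categories. The sketch below shows one way such counts could be reproduced with Python's standard `unicodedata` module; it is not necessarily how the profiling tool computes them, and the `CATEGORY_NAMES` mapping and `category_counts` helper are illustrative names covering only a few categories.

```python
import unicodedata
from collections import Counter

# Map a few two-letter general-category codes to the long names used in the report.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Lo": "Other Letter",
    "Pc": "Connector Punctuation",
    "Po": "Other Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw code for categories not in the mapping.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# First three sample values of LEVEL_ID from the report.
sample = ["ia_0001_0001", "ia_0002_0002", "ia_0003_0003"]
print(category_counts(sample))
```

Each 12-character identifier contributes eight digits, two lowercase letters, and two underscores, which matches the 4:1:1 ratio of the counts in the table.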
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 6208 | |
1 | 6184 | |
2 | 6176 | |
3 | 6053 | |
4 | 5899 | |
5 | 4809 | |
6 | 3996 | |
7 | 3751 | |
8 | 3726 | |
9 | 3566 |
Lowercase Letter
Value | Count | Frequency (%) |
i | 6296 | |
a | 6296 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 12592 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 62960 | |
Latin | 12592 | 16.7% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
_ | 12592 | |
0 | 6208 | |
1 | 6184 | |
2 | 6176 | |
3 | 6053 | |
4 | 5899 | |
5 | 4809 | 7.6% |
6 | 3996 | 6.3% |
7 | 3751 | 6.0% |
8 | 3726 | 5.9% |
Latin
Value | Count | Frequency (%) |
i | 6296 | |
a | 6296 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 75552 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
_ | 12592 | |
i | 6296 | |
a | 6296 | |
0 | 6208 | |
1 | 6184 | |
2 | 6176 | |
3 | 6053 | |
4 | 5899 | |
5 | 4809 | 6.4% |
6 | 3996 | 5.3% |
Other values (3) | 11043 |
PERSON_ID
Text
Distinct | 4856 |
---|---|
Distinct (%) | 77.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 49.3 KiB |
Value | Count | Frequency (%) |
iap_1710 | 7 | 0.1% |
iap_2419 | 7 | 0.1% |
iap_0568 | 6 | 0.1% |
iap_3132 | 6 | 0.1% |
iap_1132 | 6 | 0.1% |
iap_0270 | 6 | 0.1% |
iap_0410 | 6 | 0.1% |
iap_0182 | 6 | 0.1% |
iap_3661 | 6 | 0.1% |
iap_2749 | 5 | 0.1% |
Other values (4846) | 6235 |
Most occurring characters
Value | Count | Frequency (%) |
i | 6296 | |
a | 6296 | |
p | 6296 | |
_ | 6296 | |
2 | 3234 | |
0 | 3225 | |
1 | 3216 | |
3 | 3183 | |
4 | 3023 | |
5 | 1944 | 3.9% |
Other values (4) | 7359 |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 25184 | |
Lowercase Letter | 18888 | |
Connector Punctuation | 6296 | 12.5% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
2 | 3234 | |
0 | 3225 | |
1 | 3216 | |
3 | 3183 | |
4 | 3023 | |
5 | 1944 | |
7 | 1899 | |
8 | 1875 | |
6 | 1873 | |
9 | 1712 |
Lowercase Letter
Value | Count | Frequency (%) |
i | 6296 | |
a | 6296 | |
p | 6296 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 6296 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 31480 | |
Latin | 18888 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
_ | 6296 | |
2 | 3234 | |
0 | 3225 | |
1 | 3216 | |
3 | 3183 | |
4 | 3023 | |
5 | 1944 | 6.2% |
7 | 1899 | 6.0% |
8 | 1875 | 6.0% |
6 | 1873 | 5.9% |
Latin
Value | Count | Frequency (%) |
i | 6296 | |
a | 6296 | |
p | 6296 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 50368 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
i | 6296 | |
a | 6296 | |
p | 6296 | |
_ | 6296 | |
2 | 3234 | |
0 | 3225 | |
1 | 3216 | |
3 | 3183 | |
4 | 3023 | |
5 | 1944 | 3.9% |
Other values (4) | 7359 |
ITEM_ID
Categorical
CONSTANT
 
Distinct | 1 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 49.3 KiB |
ia |
---|
Length
Max length | 2 |
---|---|
Median length | 2 |
Mean length | 2 |
Min length | 2 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | ia |
---|---|
2nd row | ia |
3rd row | ia |
4th row | ia |
5th row | ia |
Common Values
Value | Count | Frequency (%) |
ia | 6296 |
REGIST_NO
Text
MISSING
 
Distinct | 6113 |
---|---|
Distinct (%) | 99.2% |
Missing | 134 |
Missing (%) | 2.1% |
Memory size | 49.3 KiB |
Length
Max length | 12 |
---|---|
Median length | 12 |
Mean length | 12 |
Min length | 12 |
Characters and Unicode
Total characters | 73944 |
---|---|
Distinct characters | 12 |
Distinct categories | 2 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 6082 |
---|---|
Unique (%) | 98.7% |
Sample
1st row | SJ0000002290 |
---|---|
2nd row | SJ0000002291 |
3rd row | SJ0000002292 |
4th row | SJ0000002293 |
5th row | SJ0000002294 |
Value | Count | Frequency (%) |
SJ0000002395 | 8 | 0.1% |
SJ0000007230 | 7 | 0.1% |
SJ0000002831 | 5 | 0.1% |
SJ0000004889 | 4 | 0.1% |
SJ0000002668 | 4 | 0.1% |
SJ0000008483 | 2 | < 0.1% |
SJ0000004794 | 2 | < 0.1% |
SJ0000006709 | 2 | < 0.1% |
SJ0000002336 | 2 | < 0.1% |
SJ0000008094 | 2 | < 0.1% |
Other values (6103) | 6124 |
Most occurring characters
Value | Count | Frequency (%) |
0 | 38787 | |
S | 6162 | 8.3% |
J | 6162 | 8.3% |
3 | 2948 | 4.0% |
4 | 2940 | 4.0% |
5 | 2852 | 3.9% |
6 | 2771 | 3.7% |
7 | 2743 | 3.7% |
2 | 2557 | 3.5% |
8 | 2371 | 3.2% |
Other values (2) | 3651 | 4.9% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 61620 | |
Uppercase Letter | 12324 | 16.7% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 38787 | |
3 | 2948 | 4.8% |
4 | 2940 | 4.8% |
5 | 2852 | 4.6% |
6 | 2771 | 4.5% |
7 | 2743 | 4.5% |
2 | 2557 | 4.1% |
8 | 2371 | 3.8% |
9 | 1838 | 3.0% |
1 | 1813 | 2.9% |
Uppercase Letter
Value | Count | Frequency (%) |
S | 6162 | |
J | 6162 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 61620 | |
Latin | 12324 | 16.7% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 38787 | |
3 | 2948 | 4.8% |
4 | 2940 | 4.8% |
5 | 2852 | 4.6% |
6 | 2771 | 4.5% |
7 | 2743 | 4.5% |
2 | 2557 | 4.1% |
8 | 2371 | 3.8% |
9 | 1838 | 3.0% |
1 | 1813 | 2.9% |
Latin
Value | Count | Frequency (%) |
S | 6162 | |
J | 6162 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 73944 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 38787 | |
S | 6162 | 8.3% |
J | 6162 | 8.3% |
3 | 2948 | 4.0% |
4 | 2940 | 4.0% |
5 | 2852 | 3.9% |
6 | 2771 | 3.7% |
7 | 2743 | 3.7% |
2 | 2557 | 3.5% |
8 | 2371 | 3.2% |
Other values (2) | 3651 | 4.9% |
IMAGES
Text
Distinct | 6264 |
---|---|
Distinct (%) | 99.5% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 49.3 KiB |
Length
Max length | 69 |
---|---|
Median length | 27 |
Mean length | 27.006671 |
Min length | 27 |
Characters and Unicode
Total characters | 170034 |
---|---|
Distinct characters | 22 |
Distinct categories | 4 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 6250 |
---|---|
Unique (%) | 99.3% |
Sample
1st row | ia_0001_a.jpg;ia_0001_b.jpg |
---|---|
2nd row | ia_0002_a.jpg;ia_0002_b.jpg |
3rd row | ia_0003_a.jpg;ia_0003_b.jpg |
4th row | ia_0004_a.jpg;ia_0004_b.jpg |
5th row | ia_0005_a.jpg;ia_0005_b.jpg |
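The sample rows suggest that each IMAGES value is a semicolon-separated list of file names. The sketch below splits such a value and checks the length arithmetic: two 13-character names joined by one semicolon give the reported minimum and median length of 27. That the maximum length of 69 comes from a single value listing five names is an inference from the numbers, not something the report states, and the helper names are hypothetical.

```python
def split_images(field):
    """Split a semicolon-separated IMAGES value into individual file names."""
    return [name for name in field.split(";") if name]


def joined_length(n, name_len=13):
    """Length of a value holding n names of name_len characters joined by ';'."""
    return n * name_len + (n - 1)


value = "ia_0001_a.jpg;ia_0001_b.jpg"  # 1st sample row from the report
names = split_images(value)

print(names)
print(len(value), joined_length(2), joined_length(5))
```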
Value | Count | Frequency (%) |
ia_0103_a.jpg;ia_0103_b.jpg | 8 | 0.1% |
ia_4942_a.jpg;ia_4942_b.jpg | 7 | 0.1% |
ia_0542_a.jpg;ia_0542_b.jpg | 5 | 0.1% |
ia_0378_a.jpg;ia_0378_b.jpg | 4 | 0.1% |
ia_2602_a.jpg;ia_2602_b.jpg | 4 | 0.1% |
ia_0507_a.jpg;ia_0507_b.jpg | 2 | < 0.1% |
ia_2473_a.jpg;ia_2473_b.jpg | 2 | < 0.1% |
ia_2601_a.jpg;ia_2601_b.jpg | 2 | < 0.1% |
ia_2506_a.jpg;ia_2506_b.jpg | 2 | < 0.1% |
ia_2128_a.jpg;ia_2128_b.jpg | 2 | < 0.1% |
Other values (6254) | 6258 |
Most occurring characters
Value | Count | Frequency (%) |
_ | 25190 | |
a | 18891 | |
i | 12595 | 7.4% |
p | 12595 | 7.4% |
g | 12595 | 7.4% |
. | 12595 | 7.4% |
j | 12595 | 7.4% |
; | 6299 | 3.7% |
b | 6296 | 3.7% |
0 | 5966 | 3.5% |
Other values (12) | 44417 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 75570 | |
Decimal Number | 50380 | |
Connector Punctuation | 25190 | 14.8% |
Other Punctuation | 18894 | 11.1% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 5966 | |
1 | 5936 | |
2 | 5886 | |
4 | 5755 | |
3 | 5738 | |
5 | 5733 | |
6 | 4246 | |
9 | 3711 | |
7 | 3707 | |
8 | 3702 |
Lowercase Letter
Value | Count | Frequency (%) |
a | 18891 | |
i | 12595 | |
p | 12595 | |
g | 12595 | |
j | 12595 | |
b | 6296 | 8.3% |
c | 1 | < 0.1% |
d | 1 | < 0.1% |
e | 1 | < 0.1% |
Other Punctuation
Value | Count | Frequency (%) |
. | 12595 | |
; | 6299 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 25190 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 94464 | |
Latin | 75570 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
_ | 25190 | |
. | 12595 | |
; | 6299 | 6.7% |
0 | 5966 | 6.3% |
1 | 5936 | 6.3% |
2 | 5886 | 6.2% |
4 | 5755 | 6.1% |
3 | 5738 | 6.1% |
5 | 5733 | 6.1% |
6 | 4246 | 4.5% |
Other values (3) | 11120 |
Latin
Value | Count | Frequency (%) |
a | 18891 | |
i | 12595 | |
p | 12595 | |
g | 12595 | |
j | 12595 | |
b | 6296 | 8.3% |
c | 1 | < 0.1% |
d | 1 | < 0.1% |
e | 1 | < 0.1% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 170034 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
_ | 25190 | |
a | 18891 | |
i | 12595 | 7.4% |
p | 12595 | 7.4% |
g | 12595 | 7.4% |
. | 12595 | 7.4% |
j | 12595 | 7.4% |
; | 6299 | 3.7% |
b | 6296 | 3.7% |
0 | 5966 | 3.5% |
Other values (12) | 44417 |
THUMB_IMAGE
Text
Distinct | 6264 |
---|---|
Distinct (%) | 99.5% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 49.3 KiB |
Length
Max length | 18 |
---|---|
Median length | 18 |
Mean length | 18 |
Min length | 18 |
Characters and Unicode
Total characters | 113328 |
---|---|
Distinct characters | 23 |
Distinct categories | 4 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 6250 |
---|---|
Unique (%) | 99.3% |
Sample
1st row | thumbs_ia_0001.jpg |
---|---|
2nd row | thumbs_ia_0002.jpg |
3rd row | thumbs_ia_0003.jpg |
4th row | thumbs_ia_0004.jpg |
5th row | thumbs_ia_0005.jpg |
Value | Count | Frequency (%) |
thumbs_ia_0103.jpg | 8 | 0.1% |
thumbs_ia_4942.jpg | 7 | 0.1% |
thumbs_ia_0542.jpg | 5 | 0.1% |
thumbs_ia_0378.jpg | 4 | 0.1% |
thumbs_ia_2602.jpg | 4 | 0.1% |
thumbs_ia_0507.jpg | 2 | < 0.1% |
thumbs_ia_2473.jpg | 2 | < 0.1% |
thumbs_ia_2601.jpg | 2 | < 0.1% |
thumbs_ia_2506.jpg | 2 | < 0.1% |
thumbs_ia_2128.jpg | 2 | < 0.1% |
Other values (6254) | 6258 |
Most occurring characters
Value | Count | Frequency (%) |
_ | 12592 | 11.1% |
t | 6296 | 5.6% |
h | 6296 | 5.6% |
g | 6296 | 5.6% |
p | 6296 | 5.6% |
j | 6296 | 5.6% |
. | 6296 | 5.6% |
a | 6296 | 5.6% |
i | 6296 | 5.6% |
s | 6296 | 5.6% |
Other values (13) | 44072 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 69256 | |
Decimal Number | 25184 | 22.2% |
Connector Punctuation | 12592 | 11.1% |
Other Punctuation | 6296 | 5.6% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
t | 6296 | |
h | 6296 | |
g | 6296 | |
p | 6296 | |
j | 6296 | |
a | 6296 | |
i | 6296 | |
s | 6296 | |
b | 6296 | |
m | 6296 |
Decimal Number
Value | Count | Frequency (%) |
0 | 2983 | |
1 | 2968 | |
2 | 2943 | |
4 | 2876 | |
3 | 2869 | |
5 | 2865 | |
6 | 2123 | |
9 | 1854 | |
7 | 1852 | |
8 | 1851 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 12592 |
Other Punctuation
Value | Count | Frequency (%) |
. | 6296 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 69256 | |
Common | 44072 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
_ | 12592 | |
. | 6296 | |
0 | 2983 | 6.8% |
1 | 2968 | 6.7% |
2 | 2943 | 6.7% |
4 | 2876 | 6.5% |
3 | 2869 | 6.5% |
5 | 2865 | 6.5% |
6 | 2123 | 4.8% |
9 | 1854 | 4.2% |
Other values (2) | 3703 | 8.4% |
Latin
Value | Count | Frequency (%) |
t | 6296 | |
h | 6296 | |
g | 6296 | |
p | 6296 | |
j | 6296 | |
a | 6296 | |
i | 6296 | |
s | 6296 | |
b | 6296 | |
m | 6296 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 113328 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
_ | 12592 | 11.1% |
t | 6296 | 5.6% |
h | 6296 | 5.6% |
g | 6296 | 5.6% |
p | 6296 | 5.6% |
j | 6296 | 5.6% |
. | 6296 | 5.6% |
a | 6296 | 5.6% |
i | 6296 | 5.6% |
s | 6296 | 5.6% |
Other values (13) | 44072 |
SERIAL_NO
Text
MISSING
 
Distinct | 1041 |
---|---|
Distinct (%) | 30.9% |
Missing | 2924 |
Missing (%) | 46.4% |
Memory size | 49.3 KiB |
Value | Count | Frequency (%) |
31200 | 97 | 2.9% |
41200 | 58 | 1.7% |
31210 | 51 | 1.5% |
41210 | 39 | 1.2% |
21200 | 39 | 1.2% |
21310 | 38 | 1.1% |
20210 | 30 | 0.9% |
31310 | 29 | 0.9% |
61200 | 27 | 0.8% |
21210 | 26 | 0.8% |
Other values (1031) | 2938 |
Most occurring characters
Value | Count | Frequency (%) |
0 | 5050 | |
1 | 3480 | |
2 | 2788 | |
3 | 1800 | 10.7% |
4 | 1583 | 9.4% |
8 | 894 | 5.3% |
6 | 532 | 3.2% |
9 | 357 | 2.1% |
7 | 200 | 1.2% |
■ | 110 | 0.7% |
Other values (2) | 80 | 0.5% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 16761 | |
Other Symbol | 110 | 0.7% |
Math Symbol | 3 | < 0.1% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 5050 | |
1 | 3480 | |
2 | 2788 | |
3 | 1800 | 10.7% |
4 | 1583 | 9.4% |
8 | 894 | 5.3% |
6 | 532 | 3.2% |
9 | 357 | 2.1% |
7 | 200 | 1.2% |
5 | 77 | 0.5% |
Other Symbol
Value | Count | Frequency (%) |
■ | 110 |
Math Symbol
Value | Count | Frequency (%) |
\| | 3 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 16874 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 5050 | |
1 | 3480 | |
2 | 2788 | |
3 | 1800 | 10.7% |
4 | 1583 | 9.4% |
8 | 894 | 5.3% |
6 | 532 | 3.2% |
9 | 357 | 2.1% |
7 | 200 | 1.2% |
■ | 110 | 0.7% |
Other values (2) | 80 | 0.5% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 16764 | |
Geometric Shapes | 110 | 0.7% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 5050 | |
1 | 3480 | |
2 | 2788 | |
3 | 1800 | 10.7% |
4 | 1583 | 9.4% |
8 | 894 | 5.3% |
6 | 532 | 3.2% |
9 | 357 | 2.1% |
7 | 200 | 1.2% |
5 | 77 | 0.5% |
Geometric Shapes
Value | Count | Frequency (%) |
■ | 110 |
MAIN_TITLE
Text
Distinct | 5427 |
---|---|
Distinct (%) | 86.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 49.3 KiB |
Value | Count | Frequency (%) |
姜元昊 | 10 | 0.2% |
鄭利逍 | 7 | 0.1% |
金基弘 | 6 | 0.1% |
安昌浩 | 5 | 0.1% |
孔元檜 | 5 | 0.1% |
邊雨植 | 5 | 0.1% |
李順今 | 5 | 0.1% |
金昌洙 | 5 | 0.1% |
朴鎭洪 | 4 | 0.1% |
黃金鳳 | 4 | 0.1% |
Other values (5421) | 6244 |
Most occurring characters
Value | Count | Frequency (%) |
金 | 1293 | 6.5% |
李 | 868 | 4.3% |
朴 | 395 | 2.0% |
崔 | 335 | 1.7% |
鄭 | 197 | 1.0% |
昌 | 182 | 0.9% |
元 | 181 | 0.9% |
韓 | 179 | 0.9% |
龍 | 173 | 0.9% |
東 | 173 | 0.9% |
Other values (1121) | 15988 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 19635 | |
Lowercase Letter | 104 | 0.5% |
Other Punctuation | 97 | 0.5% |
Decimal Number | 51 | 0.3% |
Uppercase Letter | 33 | 0.2% |
Math Symbol | 14 | 0.1% |
Connector Punctuation | 8 | < 0.1% |
Open Punctuation | 7 | < 0.1% |
Close Punctuation | 7 | < 0.1% |
Space Separator | 4 | < 0.1% |
Other values (2) | 4 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
金 | 1293 | 6.6% |
李 | 868 | 4.4% |
朴 | 395 | 2.0% |
崔 | 335 | 1.7% |
鄭 | 197 | 1.0% |
昌 | 182 | 0.9% |
元 | 181 | 0.9% |
韓 | 179 | 0.9% |
龍 | 173 | 0.9% |
東 | 173 | 0.9% |
Other values (1076) | 15659 |
Lowercase Letter
Value | Count | Frequency (%) |
g | 12 | |
a | 12 | |
m | 12 | |
s | 12 | |
i | 12 | |
e | 12 | |
r | 8 | |
c | 8 | |
h | 4 | 3.8% |
x | 4 | 3.8% |
Other values (2) | 8 |
Decimal Number
Value | Count | Frequency (%) |
2 | 13 | |
0 | 13 | |
4 | 6 | |
7 | 5 | 9.8% |
3 | 4 | 7.8% |
5 | 4 | 7.8% |
1 | 3 | 5.9% |
6 | 2 | 3.9% |
8 | 1 | 2.0% |
Other Punctuation
Value | Count | Frequency (%) |
, | 52 | |
/ | 21 | |
" | 8 | 8.2% |
; | 4 | 4.1% |
# | 4 | 4.1% |
& | 4 | 4.1% |
. | 4 | 4.1% |
Uppercase Letter
Value | Count | Frequency (%) |
C | 9 | |
K | 8 | |
G | 4 | |
I | 4 | |
F | 4 | |
E | 2 | 6.1% |
A | 2 | 6.1% |
Math Symbol
Value | Count | Frequency (%) |
< | 4 | |
> | 4 | |
= | 4 | |
+ | 2 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 8 |
Open Punctuation
Value | Count | Frequency (%) |
( | 7 |
Close Punctuation
Value | Count | Frequency (%) |
) | 7 |
Space Separator
Value | Count | Frequency (%) |
(space) | 4 |
Other Symbol
Value | Count | Frequency (%) |
▼ | 3 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Han | 19626 | |
Common | 192 | 1.0% |
Latin | 137 | 0.7% |
Katakana | 8 | < 0.1% |
Hangul | 1 | < 0.1% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
金 | 1293 | 6.6% |
李 | 868 | 4.4% |
朴 | 395 | 2.0% |
崔 | 335 | 1.7% |
鄭 | 197 | 1.0% |
昌 | 182 | 0.9% |
元 | 181 | 0.9% |
韓 | 179 | 0.9% |
龍 | 173 | 0.9% |
東 | 173 | 0.9% |
Other values (1067) | 15650 |
Common
Value | Count | Frequency (%) |
, | 52 | |
/ | 21 | |
2 | 13 | 6.8% |
0 | 13 | 6.8% |
" | 8 | 4.2% |
_ | 8 | 4.2% |
( | 7 | 3.6% |
) | 7 | 3.6% |
4 | 6 | 3.1% |
7 | 5 | 2.6% |
Other values (16) | 52 |
Latin
Value | Count | Frequency (%) |
g | 12 | 8.8% |
a | 12 | 8.8% |
m | 12 | 8.8% |
s | 12 | 8.8% |
i | 12 | 8.8% |
e | 12 | 8.8% |
C | 9 | 6.6% |
r | 8 | 5.8% |
c | 8 | 5.8% |
K | 8 | 5.8% |
Other values (9) | 32 |
Katakana
Value | Count | Frequency (%) |
リ | 1 | |
タ | 1 | |
ツ | 1 | |
セ | 1 | |
ニ | 1 | |
イ | 1 | |
ラ | 1 | |
コ | 1 |
Hangul
Value | Count | Frequency (%) |
엽 | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
CJK | 18012 | |
CJK Compat Ideographs | 1610 | 8.1% |
ASCII | 325 | 1.6% |
Katakana | 8 | < 0.1% |
CJK Ext A | 4 | < 0.1% |
Geometric Shapes | 3 | < 0.1% |
None | 1 | < 0.1% |
Hangul | 1 | < 0.1% |
Most frequent character per block
CJK
Value | Count | Frequency (%) |
金 | 1293 | 7.2% |
朴 | 395 | 2.2% |
崔 | 335 | 1.9% |
鄭 | 197 | 1.1% |
昌 | 182 | 1.0% |
元 | 181 | 1.0% |
韓 | 179 | 1.0% |
東 | 173 | 1.0% |
姜 | 170 | 0.9% |
泰 | 167 | 0.9% |
Other values (1003) | 14740 |
CJK Compat Ideographs
Value | Count | Frequency (%) |
李 | 868 | |
龍 | 173 | 10.7% |
柳 | 74 | 4.6% |
林 | 71 | 4.4% |
烈 | 43 | 2.7% |
劉 | 38 | 2.4% |
金 | 36 | 2.2% |
沈 | 32 | 2.0% |
盧 | 29 | 1.8% |
梁 | 28 | 1.7% |
Other values (50) | 218 | 13.5% |
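The block tables show that names mix ordinary CJK ideographs with CJK compatibility ideographs; for example, 李 is counted under "CJK Compat Ideographs" while 金 appears in both blocks. Two visually identical names can therefore fail an exact string comparison. The sketch below shows one possible remedy, Unicode NFC normalization, which maps compatibility ideographs that have a canonical decomposition to their unified forms. Whether normalization suits a given analysis is a judgment call, the code point ranges are a rough approximation of the two blocks, and the helper names are hypothetical.

```python
import unicodedata


def normalize_name(name):
    """Fold CJK compatibility ideographs into their unified forms via NFC."""
    return unicodedata.normalize("NFC", name)


def block_of(ch):
    """Rough block label for the two ranges discussed here."""
    cp = ord(ch)
    if 0xF900 <= cp <= 0xFAFF:
        return "CJK Compat Ideographs"
    if 0x4E00 <= cp <= 0x9FFF:
        return "CJK"
    return "other"


compat_li = "\uf9e1"   # compatibility form of 李
unified_li = "\u674e"  # unified form of 李

print(block_of(compat_li), "->", block_of(normalize_name(compat_li)))
print(normalize_name(compat_li) == unified_li)
```

Normalizing both sides before joining or deduplicating on MAIN_TITLE would treat the two encodings of the same surname as equal; keeping the raw values alongside preserves the original transcription.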
ASCII
Value | Count | Frequency (%) |
, | 52 | 16.0% |
/ | 21 | 6.5% |
2 | 13 | 4.0% |
0 | 13 | 4.0% |
g | 12 | 3.7% |
a | 12 | 3.7% |
m | 12 | 3.7% |
s | 12 | 3.7% |
i | 12 | 3.7% |
e | 12 | 3.7% |
Other values (33) | 154 |
Geometric Shapes
Value | Count | Frequency (%) |
▼ | 3 |
CJK Ext A
Value | Count | Frequency (%) |
㻐 | 1 | |
㳟 | 1 | |
㘽 | 1 | |
㬚 | 1 |
None
Value | Count | Frequency (%) |
- | 1 |
Katakana
Value | Count | Frequency (%) |
リ | 1 | |
タ | 1 | |
ツ | 1 | |
セ | 1 | |
ニ | 1 | |
イ | 1 | |
ラ | 1 | |
コ | 1 |
Hangul
Value | Count | Frequency (%) |
엽 | 1 |
NAME_KR
Text
Distinct | 5224 |
---|---|
Distinct (%) | 83.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 49.3 KiB |
Value | Count | Frequency (%) |
강원호 | 10 | 0.2% |
정이소 | 7 | 0.1% |
김기홍 | 6 | 0.1% |
이경수 | 6 | 0.1% |
안창호 | 5 | 0.1% |
변우식 | 5 | 0.1% |
이수봉 | 5 | 0.1% |
김용찬 | 5 | 0.1% |
공원회 | 5 | 0.1% |
이순금 | 5 | 0.1% |
Other values (5214) | 6237 |
Most occurring characters
Value | Count | Frequency (%) |
김 | 1189 | 6.0% |
이 | 934 | 4.7% |
정 | 446 | 2.3% |
박 | 396 | 2.0% |
용 | 395 | 2.0% |
영 | 356 | 1.8% |
최 | 338 | 1.7% |
원 | 333 | 1.7% |
수 | 332 | 1.7% |
성 | 325 | 1.7% |
Other values (301) | 14628 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 19622 | |
Other Punctuation | 48 | 0.2% |
Dash Punctuation | 1 | < 0.1% |
Other Symbol | 1 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
김 | 1189 | 6.1% |
이 | 934 | 4.8% |
정 | 446 | 2.3% |
박 | 396 | 2.0% |
용 | 395 | 2.0% |
영 | 356 | 1.8% |
최 | 338 | 1.7% |
원 | 333 | 1.7% |
수 | 332 | 1.7% |
성 | 325 | 1.7% |
Other values (298) | 14578 |
Other Punctuation
Value | Count | Frequency (%) |
, | 48 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 1 |
Other Symbol
Value | Count | Frequency (%) |
■ | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 19620 | |
Common | 50 | 0.3% |
Katakana | 2 | < 0.1% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
김 | 1189 | 6.1% |
이 | 934 | 4.8% |
정 | 446 | 2.3% |
박 | 396 | 2.0% |
용 | 395 | 2.0% |
영 | 356 | 1.8% |
최 | 338 | 1.7% |
원 | 333 | 1.7% |
수 | 332 | 1.7% |
성 | 325 | 1.7% |
Other values (296) | 14576 |
Common
Value | Count | Frequency (%) |
, | 48 | |
- | 1 | 2.0% |
■ | 1 | 2.0% |
Katakana
Value | Count | Frequency (%) |
リ | 1 | |
タ | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 19620 | |
ASCII | 48 | 0.2% |
Katakana | 2 | < 0.1% |
None | 1 | < 0.1% |
Geometric Shapes | 1 | < 0.1% |
Most frequent character per block
Hangul
Value | Count | Frequency (%) |
김 | 1189 | 6.1% |
이 | 934 | 4.8% |
정 | 446 | 2.3% |
박 | 396 | 2.0% |
용 | 395 | 2.0% |
영 | 356 | 1.8% |
최 | 338 | 1.7% |
원 | 333 | 1.7% |
수 | 332 | 1.7% |
성 | 325 | 1.7% |
Other values (296) | 14576 |
ASCII
Value | Count | Frequency (%) |
, | 48 |
Katakana
Value | Count | Frequency (%) |
リ | 1 | |
タ | 1 |
None
Value | Count | Frequency (%) |
- | 1 |
Geometric Shapes
Value | Count | Frequency (%) |
■ | 1 |
ALIAS_CH
Text
MISSING
 
Distinct | 2088 |
---|---|
Distinct (%) | 90.5% |
Missing | 3990 |
Missing (%) | 63.4% |
Memory size | 49.3 KiB |
Value | Count | Frequency (%) |
ナシ | 104 | 4.5% |
權泰山 | 6 | 0.3% |
朴改メ | 3 | 0.1% |
鄭泰洙 | 3 | 0.1% |
岩村蓮竹 | 3 | 0.1% |
元奉 | 3 | 0.1% |
嘯山 | 3 | 0.1% |
白雲,朴洙 | 3 | 0.1% |
君燮 | 3 | 0.1% |
德俊 | 3 | 0.1% |
Other values (2080) | 2175 |
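The most frequent ALIAS_CH value, ナシ, is Japanese for "none", so those 104 rows carry no alias even though they are not counted as missing. Other values such as 白雲,朴洙 appear to hold several aliases separated by commas. The sketch below parses the field under those two assumptions; `parse_aliases` and its `none_markers` parameter are hypothetical names, and other placeholder values may exist in the data.

```python
def parse_aliases(value, none_markers=("ナシ",)):
    """Split a comma-separated alias field; None, '' and 'ナシ' mean no alias."""
    if value is None:
        return []
    value = value.strip()
    if not value or value in none_markers:
        return []
    return [alias.strip() for alias in value.split(",") if alias.strip()]


print(parse_aliases("白雲,朴洙"))
print(parse_aliases("ナシ"))

# Share of rows without a usable alias if the "ナシ" rows are counted as missing.
rows, missing, nashi = 6296, 3990, 104
effective_missing_pct = 100 * (missing + nashi) / rows
print(f"{effective_missing_pct:.1f}%")
```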
Most occurring characters
Value | Count | Frequency (%) |
, | 753 | 8.4% |
金 | 403 | 4.5% |
李 | 297 | 3.3% |
朴 | 155 | 1.7% |
シ | 105 | 1.2% |
ナ | 104 | 1.2% |
成 | 98 | 1.1% |
山 | 91 | 1.0% |
一 | 87 | 1.0% |
崔 | 80 | 0.9% |
Other values (887) | 6752 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 8058 | |
Other Punctuation | 769 | 8.6% |
Lowercase Letter | 45 | 0.5% |
Decimal Number | 22 | 0.2% |
Uppercase Letter | 7 | 0.1% |
Modifier Letter | 6 | 0.1% |
Other Symbol | 4 | < 0.1% |
Dash Punctuation | 3 | < 0.1% |
Space Separator | 3 | < 0.1% |
Math Symbol | 3 | < 0.1% |
Other values (3) | 5 | 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
金 | 403 | 5.0% |
李 | 297 | 3.7% |
朴 | 155 | 1.9% |
シ | 105 | 1.3% |
ナ | 104 | 1.3% |
成 | 98 | 1.2% |
山 | 91 | 1.1% |
一 | 87 | 1.1% |
崔 | 80 | 1.0% |
洙 | 78 | 1.0% |
Other values (837) | 6560 |
Lowercase Letter
Value | Count | Frequency (%) |
r | 6 | |
o | 4 | 8.9% |
t | 4 | 8.9% |
w | 4 | 8.9% |
n | 3 | 6.7% |
h | 3 | 6.7% |
i | 3 | 6.7% |
e | 2 | 4.4% |
k | 2 | 4.4% |
a | 2 | 4.4% |
Other values (8) | 12 |
Other Punctuation
Value | Count | Frequency (%) |
, | 753 | |
/ | 6 | 0.8% |
. | 4 | 0.5% |
" | 2 | 0.3% |
# | 1 | 0.1% |
& | 1 | 0.1% |
; | 1 | 0.1% |
: | 1 | 0.1% |
Decimal Number
Value | Count | Frequency (%) |
0 | 5 | |
3 | 4 | |
7 | 4 | |
2 | 4 | |
4 | 2 | 9.1% |
9 | 1 | 4.5% |
1 | 1 | 4.5% |
6 | 1 | 4.5% |
Uppercase Letter
Value | Count | Frequency (%) |
C | 2 | |
E | 1 | |
K | 1 | |
G | 1 | |
I | 1 | |
F | 1 |
Math Symbol
Value | Count | Frequency (%) |
= | 1 | |
< | 1 | |
> | 1 |
Modifier Letter
Value | Count | Frequency (%) |
ー | 6 |
Other Symbol
Value | Count | Frequency (%) |
■ | 4 |
Dash Punctuation
Value | Count | Frequency (%) |
― | 3 |
Space Separator
Value | Count | Frequency (%) |
(space) | 3 |
Open Punctuation
Value | Count | Frequency (%) |
( | 2 |
Close Punctuation
Value | Count | Frequency (%) |
) | 2 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Han | 7749 | |
Common | 815 | 9.1% |
Katakana | 304 | 3.4% |
Latin | 52 | 0.6% |
Hangul | 5 | 0.1% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
金 | 403 | 5.2% |
李 | 297 | 3.8% |
朴 | 155 | 2.0% |
成 | 98 | 1.3% |
山 | 91 | 1.2% |
一 | 87 | 1.1% |
崔 | 80 | 1.0% |
洙 | 78 | 1.0% |
泰 | 77 | 1.0% |
東 | 73 | 0.9% |
Other values (789) | 6310 |
Katakana
Value | Count | Frequency (%) |
シ | 105 | |
ナ | 104 | |
メ | 15 | 4.9% |
ス | 6 | 2.0% |
ン | 5 | 1.6% |
ク | 4 | 1.3% |
ラ | 4 | 1.3% |
イ | 4 | 1.3% |
エ | 4 | 1.3% |
オ | 4 | 1.3% |
Other values (33) | 49 |
Common
Value | Count | Frequency (%) |
, | 753 | |
/ | 6 | 0.7% |
ー | 6 | 0.7% |
0 | 5 | 0.6% |
3 | 4 | 0.5% |
■ | 4 | 0.5% |
7 | 4 | 0.5% |
2 | 4 | 0.5% |
. | 4 | 0.5% |
― | 3 | 0.4% |
Other values (16) | 22 | 2.7% |
Latin
Value | Count | Frequency (%) |
r | 6 | 11.5% |
o | 4 | 7.7% |
t | 4 | 7.7% |
w | 4 | 7.7% |
n | 3 | 5.8% |
h | 3 | 5.8% |
i | 3 | 5.8% |
e | 2 | 3.8% |
k | 2 | 3.8% |
a | 2 | 3.8% |
Other values (14) | 19 |
Hangul
Value | Count | Frequency (%) |
김 | 1 | |
효 | 1 | |
순 | 1 | |
천 | 1 | |
남 | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
CJK | 7204 | |
ASCII | 854 | 9.6% |
CJK Compat Ideographs | 543 | 6.1% |
Katakana | 310 | 3.5% |
Hangul | 5 | 0.1% |
Geometric Shapes | 4 | < 0.1% |
Punctuation | 3 | < 0.1% |
CJK Ext A | 2 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
, | 753 | |
/ | 6 | 0.7% |
r | 6 | 0.7% |
0 | 5 | 0.6% |
3 | 4 | 0.5% |
o | 4 | 0.5% |
7 | 4 | 0.5% |
2 | 4 | 0.5% |
. | 4 | 0.5% |
t | 4 | 0.5% |
Other values (37) | 60 | 7.0% |
CJK
Value | Count | Frequency (%) |
金 | 403 | 5.6% |
朴 | 155 | 2.2% |
成 | 98 | 1.4% |
山 | 91 | 1.3% |
一 | 87 | 1.2% |
崔 | 80 | 1.1% |
洙 | 78 | 1.1% |
泰 | 77 | 1.1% |
東 | 73 | 1.0% |
基 | 72 | 1.0% |
Other values (744) | 5990 |
CJK Compat Ideographs
Value | Count | Frequency (%) |
李 | 297 | |
龍 | 49 | 9.0% |
金 | 29 | 5.3% |
烈 | 23 | 4.2% |
林 | 23 | 4.2% |
魯 | 10 | 1.8% |
柳 | 10 | 1.8% |
盧 | 8 | 1.5% |
律 | 8 | 1.5% |
連 | 7 | 1.3% |
Other values (33) | 79 | 14.5% |
Katakana
Value | Count | Frequency (%) |
シ | 105 | |
ナ | 104 | |
メ | 15 | 4.8% |
ー | 6 | 1.9% |
ス | 6 | 1.9% |
ン | 5 | 1.6% |
ク | 4 | 1.3% |
ラ | 4 | 1.3% |
イ | 4 | 1.3% |
エ | 4 | 1.3% |
Other values (34) | 53 |
Geometric Shapes
Value | Count | Frequency (%) |
■ | 4 |
Punctuation
Value | Count | Frequency (%) |
― | 3 |
CJK Ext A
Value | Count | Frequency (%) |
㫌 | 1 | |
㻞 | 1 |
Hangul
Value | Count | Frequency (%) |
김 | 1 | |
효 | 1 | |
순 | 1 | |
천 | 1 | |
남 | 1 |
ALIAS_KR
Text
MISSING
 
Distinct | 2027 |
---|---|
Distinct (%) | 92.4% |
Missing | 4103 |
Missing (%) | 65.2% |
Memory size | 49.3 KiB |
Value | Count | Frequency (%) |
창씨 | 13 | 0.6% |
권태산 | 6 | 0.3% |
철수 | 4 | 0.2% |
덕준 | 4 | 0.2% |
원식 | 4 | 0.2% |
기석 | 4 | 0.2% |
박의 | 3 | 0.1% |
원봉 | 3 | 0.1% |
군섭 | 3 | 0.1% |
고경인 | 3 | 0.1% |
Other values (2018) | 2162 |
Most occurring characters
Value | Count | Frequency (%) |
, | 746 | 8.7% |
김 | 379 | 4.4% |
이 | 349 | 4.1% |
성 | 172 | 2.0% |
수 | 160 | 1.9% |
박 | 157 | 1.8% |
용 | 155 | 1.8% |
정 | 147 | 1.7% |
영 | 144 | 1.7% |
철 | 118 | 1.4% |
Other values (309) | 6053 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 7809 | |
Other Punctuation | 746 | 8.7% |
Space Separator | 16 | 0.2% |
Other Symbol | 5 | 0.1% |
Decimal Number | 4 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
김 | 379 | 4.9% |
이 | 349 | 4.5% |
성 | 172 | 2.2% |
수 | 160 | 2.0% |
박 | 157 | 2.0% |
용 | 155 | 2.0% |
정 | 147 | 1.9% |
영 | 144 | 1.8% |
철 | 118 | 1.5% |
일 | 118 | 1.5% |
Other values (303) | 5910 |
Decimal Number
Value | Count | Frequency (%) |
2 | 2 | |
7 | 1 | |
3 | 1 |
Other Punctuation
Value | Count | Frequency (%) |
, | 746 |
Space Separator
Value | Count | Frequency (%) |
(space) | 16 |
Other Symbol
Value | Count | Frequency (%) |
■ | 5 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 7784 | |
Common | 771 | 9.0% |
Katakana | 20 | 0.2% |
Han | 5 | 0.1% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
김 | 379 | 4.9% |
이 | 349 | 4.5% |
성 | 172 | 2.2% |
수 | 160 | 2.1% |
박 | 157 | 2.0% |
용 | 155 | 2.0% |
정 | 147 | 1.9% |
영 | 144 | 1.8% |
철 | 118 | 1.5% |
일 | 118 | 1.5% |
Other values (283) | 5885 |
Katakana
Value | Count | Frequency (%) |
ノ | 3 | |
ロ | 2 | 10.0% |
ク | 2 | 10.0% |
ブ | 2 | 10.0% |
ガ | 1 | 5.0% |
ネ | 1 | 5.0% |
モ | 1 | 5.0% |
メ | 1 | 5.0% |
フ | 1 | 5.0% |
セ | 1 | 5.0% |
Other values (5) | 5 |
Common
Value | Count | Frequency (%) |
, | 746 | |
(space) | 16 | 2.1% |
■ | 5 | 0.6% |
2 | 2 | 0.3% |
7 | 1 | 0.1% |
3 | 1 | 0.1% |
Han
Value | Count | Frequency (%) |
金 | 1 | |
孝 | 1 | |
順 | 1 | |
天 | 1 | |
南 | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 7784 | |
ASCII | 766 | 8.9% |
Katakana | 20 | 0.2% |
Geometric Shapes | 5 | 0.1% |
CJK | 5 | 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
, | 746 | |
(space) | 16 | 2.1% |
2 | 2 | 0.3% |
7 | 1 | 0.1% |
3 | 1 | 0.1% |
Hangul
Value | Count | Frequency (%) |
김 | 379 | 4.9% |
이 | 349 | 4.5% |
성 | 172 | 2.2% |
수 | 160 | 2.1% |
박 | 157 | 2.0% |
용 | 155 | 2.0% |
정 | 147 | 1.9% |
영 | 144 | 1.8% |
철 | 118 | 1.5% |
일 | 118 | 1.5% |
Other values (283) | 5885 |
Geometric Shapes
Value | Count | Frequency (%) |
■ | 5 |
Katakana
Value | Count | Frequency (%) |
ノ | 3 | |
ロ | 2 | 10.0% |
ク | 2 | 10.0% |
ブ | 2 | 10.0% |
ガ | 1 | 5.0% |
ネ | 1 | 5.0% |
モ | 1 | 5.0% |
メ | 1 | 5.0% |
フ | 1 | 5.0% |
セ | 1 | 5.0% |
Other values (5) | 5 |
CJK
Value | Count | Frequency (%) |
金 | 1 | |
孝 | 1 | |
順 | 1 | |
天 | 1 | |
南 | 1 |
FINGERPRINT_NO
Text
MISSING
Distinct | 3192 |
---|---|
Distinct (%) | 84.9% |
Missing | 2538 |
Missing (%) | 40.3% |
Memory size | 49.3 KiB |
Length
Max length | 13 |
---|---|
Median length | 11 |
Mean length | 10.999734 |
Min length | 5 |
Characters and Unicode
Total characters | 41337 |
---|---|
Distinct characters | 17 |
Distinct categories | 6 |
Distinct scripts | 2 |
Distinct blocks | 3 |
Unique
Unique | 2760 |
---|---|
Unique (%) | 73.4% |
Sample
1st row | 75759|73647 |
---|---|
2nd row | 24646|45950 |
3rd row | 37769|49869 |
4th row | 77878|87787 |
5th row | 98758|28757 |
Value | Count | Frequency (%) |
77767|78898 | 7 | 0.2% |
13333|33333 | 7 | 0.2% |
65666|76959 | 5 | 0.1% |
54537|74848 | 5 | 0.1% |
84765|76866 | 5 | 0.1% |
46667|45809 | 5 | 0.1% |
24645|28863 | 5 | 0.1% |
98879|88898 | 5 | 0.1% |
78877|89788 | 4 | 0.1% |
13447|13469 | 4 | 0.1% |
Other values (3182) | 3706 |
Most occurring characters
Value | Count | Frequency (%) |
7 | 7244 | |
8 | 6210 | |
6 | 5627 | |
5 | 4303 | |
4 | 4300 | |
9 | 4006 | |
| | 3756 | |
3 | 3330 | |
2 | 1261 | 3.1% |
1 | 929 | 2.2% |
Other values (7) | 371 | 0.9% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 37537 | |
Math Symbol | 3756 | 9.1% |
Other Letter | 23 | 0.1% |
Other Symbol | 19 | < 0.1% |
Open Punctuation | 1 | < 0.1% |
Close Punctuation | 1 | < 0.1% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
7 | 7244 | |
8 | 6210 | |
6 | 5627 | |
5 | 4303 | |
4 | 4300 | |
9 | 4006 | |
3 | 3330 | |
2 | 1261 | 3.4% |
1 | 929 | 2.5% |
0 | 327 | 0.9% |
Other Letter
Value | Count | Frequency (%) |
左 | 11 | |
右 | 11 | |
當 | 1 | 4.3% |
Math Symbol
Value | Count | Frequency (%) |
| | 3756 |
Other Symbol
Value | Count | Frequency (%) |
■ | 19 |
Open Punctuation
Value | Count | Frequency (%) |
[ | 1 |
Close Punctuation
Value | Count | Frequency (%) |
] | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 41314 | |
Han | 23 | 0.1% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
7 | 7244 | |
8 | 6210 | |
6 | 5627 | |
5 | 4303 | |
4 | 4300 | |
9 | 4006 | |
| | 3756 | |
3 | 3330 | |
2 | 1261 | 3.1% |
1 | 929 | 2.2% |
Other values (4) | 348 | 0.8% |
Han
Value | Count | Frequency (%) |
左 | 11 | |
右 | 11 | |
當 | 1 | 4.3% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 41295 | |
CJK | 23 | 0.1% |
Geometric Shapes | 19 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
7 | 7244 | |
8 | 6210 | |
6 | 5627 | |
5 | 4303 | |
4 | 4300 | |
9 | 4006 | |
| | 3756 | |
3 | 3330 | |
2 | 1261 | 3.1% |
1 | 929 | 2.2% |
Other values (3) | 329 | 0.8% |
Geometric Shapes
Value | Count | Frequency (%) |
■ | 19 |
CJK
Value | Count | Frequency (%) |
左 | 11 | |
右 | 11 | |
當 | 1 | 4.3% |
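The sample rows and the length statistics above (median length 11) suggest that most FINGERPRINT_NO values are two five-digit groups joined by `|`, while a few values contain 左/右, ■ or brackets. The following is a minimal sketch, assuming that shape, for separating regular values from irregular ones. The function name `split_fingerprint_no` is hypothetical, and what each group encodes is not stated in the report.

```python
import re
from typing import Optional, Tuple

# Assumed shape, inferred only from the sample rows: five digits, "|", five digits.
_FINGERPRINT_RE = re.compile(r"(\d{5})\|(\d{5})")


def split_fingerprint_no(value: Optional[str]) -> Optional[Tuple[str, str]]:
    """Return the two digit groups, or None when the value is missing or irregular."""
    if value is None:
        return None
    match = _FINGERPRINT_RE.fullmatch(value.strip())
    if match is None:
        # Irregular entries (for example ones containing 左/右, ■ or brackets)
        # are left for manual review rather than guessed at.
        return None
    return match.group(1), match.group(2)
```

Keeping the groups as strings preserves leading zeros and avoids treating the codes as quantities.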
AGE
Text
MISSING
Distinct | 4240 |
---|---|
Distinct (%) | 73.5% |
Missing | 525 |
Missing (%) | 8.3% |
Memory size | 49.3 KiB |
Length
Max length | 28 |
---|---|
Median length | 27 |
Mean length | 11.896032 |
Min length | 2 |
Characters and Unicode
Total characters | 68652 |
---|---|
Distinct characters | 73 |
Distinct categories | 7 |
Distinct scripts | 3 |
Distinct blocks | 5 |
Unique
Unique | 3344 |
---|---|
Unique (%) | 57.9% |
Sample
1st row | 1881年生 |
---|---|
2nd row | 大正4年 12月 11日生 |
3rd row | 明治38年 11月 5日生 |
4th row | 明治41年 10月 27日生 |
5th row | 建陽1年 3月 3日生,明治29年 3月 3日生 |
Value | Count | Frequency (%) |
2月 | 562 | 3.3% |
12月 | 540 | 3.2% |
1月 | 501 | 3.0% |
10月 | 491 | 2.9% |
3月 | 458 | 2.7% |
8月 | 452 | 2.7% |
6月 | 444 | 2.6% |
11月 | 444 | 2.6% |
5月 | 431 | 2.5% |
4月 | 425 | 2.5% |
Other values (814) | 12190 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 11167 | |
年 | 6222 | 9.1% |
1 | 5893 | 8.6% |
月 | 5576 | 8.1% |
日 | 5572 | 8.1% |
2 | 5097 | 7.4% |
3 | 3425 | 5.0% |
4 | 3416 | 5.0% |
明 | 3185 | 4.6% |
治 | 3180 | 4.6% |
Other values (63) | 15919 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 29434 | |
Decimal Number | 27434 | |
Space Separator | 11167 | 16.3% |
Other Punctuation | 488 | 0.7% |
Open Punctuation | 53 | 0.1% |
Close Punctuation | 53 | 0.1% |
Other Symbol | 23 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
年 | 6222 | |
月 | 5576 | |
日 | 5572 | |
明 | 3185 | |
治 | 3180 | |
大 | 1391 | 4.7% |
正 | 1390 | 4.7% |
光 | 485 | 1.6% |
武 | 485 | 1.6% |
生 | 387 | 1.3% |
Other values (48) | 1561 | 5.3% |
Decimal Number
Value | Count | Frequency (%) |
1 | 5893 | |
2 | 5097 | |
3 | 3425 | |
4 | 3416 | |
5 | 1764 | 6.4% |
0 | 1714 | 6.2% |
9 | 1625 | 5.9% |
8 | 1618 | 5.9% |
7 | 1494 | 5.4% |
6 | 1388 | 5.1% |
Space Separator
Value | Count | Frequency (%) |
(space) | 11167 |
Other Punctuation
Value | Count | Frequency (%) |
, | 488 |
Open Punctuation
Value | Count | Frequency (%) |
( | 53 |
Close Punctuation
Value | Count | Frequency (%) |
) | 53 |
Other Symbol
Value | Count | Frequency (%) |
■ | 23 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 39218 | |
Han | 29425 | |
Katakana | 9 | < 0.1% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
年 | 6222 | |
月 | 5576 | |
日 | 5572 | |
明 | 3185 | |
治 | 3180 | |
大 | 1391 | 4.7% |
正 | 1390 | 4.7% |
光 | 485 | 1.6% |
武 | 485 | 1.6% |
生 | 387 | 1.3% |
Other values (40) | 1552 | 5.3% |
Common
Value | Count | Frequency (%) |
(space) | 11167 | |
1 | 5893 | |
2 | 5097 | |
3 | 3425 | 8.7% |
4 | 3416 | 8.7% |
5 | 1764 | 4.5% |
0 | 1714 | 4.4% |
9 | 1625 | 4.1% |
8 | 1618 | 4.1% |
7 | 1494 | 3.8% |
Other values (5) | 2005 | 5.1% |
Katakana
Value | Count | Frequency (%) |
ノ | 2 | |
モ | 1 | |
ン | 1 | |
イ | 1 | |
シ | 1 | |
ル | 1 | |
ナ | 1 | |
ガ | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 39195 | |
CJK | 29292 | |
CJK Compat Ideographs | 133 | 0.2% |
Geometric Shapes | 23 | < 0.1% |
Katakana | 9 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 11167 | |
1 | 5893 | |
2 | 5097 | |
3 | 3425 | 8.7% |
4 | 3416 | 8.7% |
5 | 1764 | 4.5% |
0 | 1714 | 4.4% |
9 | 1625 | 4.1% |
8 | 1618 | 4.1% |
7 | 1494 | 3.8% |
Other values (4) | 1982 | 5.1% |
CJK
Value | Count | Frequency (%) |
年 | 6222 | |
月 | 5576 | |
日 | 5572 | |
明 | 3185 | |
治 | 3180 | |
大 | 1391 | 4.7% |
正 | 1390 | 4.7% |
光 | 485 | 1.7% |
武 | 485 | 1.7% |
生 | 387 | 1.3% |
Other values (38) | 1419 | 4.8% |
CJK Compat Ideographs
Value | Count | Frequency (%) |
隆 | 127 | |
不 | 6 | 4.5% |
Geometric Shapes
Value | Count | Frequency (%) |
■ | 23 |
Katakana
Value | Count | Frequency (%) |
ノ | 2 | |
モ | 1 | |
ン | 1 | |
イ | 1 | |
シ | 1 | |
ル | 1 | |
ナ | 1 | |
ガ | 1 |
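The AGE column stores birth dates as free text, mostly as era-name dates (明治, 大正, 光武 and others) and sometimes as a plain Gregorian year such as `1881年生`. The following is a minimal sketch, not the publisher's method, for extracting Gregorian birth years from such strings. The function name `birth_years` and the `ERA_BASE` table are hypothetical; the era offsets are standard calendar facts, and 昭和 is included as an assumption even though the samples above do not show it. Month and day are ignored.

```python
import re
import unicodedata
from typing import List

# Gregorian year immediately before year 1 of each era (era year N -> base + N).
ERA_BASE = {
    "明治": 1867,          # Meiji 1 = 1868
    "大正": 1911,          # Taisho 1 = 1912
    "昭和": 1925,          # Showa 1 = 1926 (assumed to be possible in this column)
    "建陽": 1895,          # Geonyang 1 = 1896
    "光武": 1896,          # Gwangmu 1 = 1897
    "\u9686\u7199": 1906,  # Yunghui 1 = 1907, written with unified code points
}

_ERA_YEAR_RE = re.compile("(" + "|".join(ERA_BASE) + r")\s*(\d+)年")
_PLAIN_YEAR_RE = re.compile(r"(?<!\d)(\d{4})年")


def birth_years(value: str) -> List[int]:
    """Return every Gregorian year that can be read from an AGE string."""
    # NFKC folds CJK compatibility ideographs (reported in the block table)
    # into their unified forms so that the era names match.
    text = unicodedata.normalize("NFKC", value)
    years = [ERA_BASE[era] + int(n) for era, n in _ERA_YEAR_RE.findall(text)]
    years += [int(y) for y in _PLAIN_YEAR_RE.findall(text)]
    return years
```

A value that carries the same date in two eras, as in the fifth sample row, yields the same year twice, which can serve as a consistency check.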
TYPE_NO
Text
MISSING
Distinct | 15 |
---|---|
Distinct (%) | 100.0% |
Missing | 6281 |
Missing (%) | 99.8% |
Memory size | 49.3 KiB |
Length
Max length | 66 |
---|---|
Median length | 31 |
Mean length | 19.733333 |
Min length | 2 |
Characters and Unicode
Total characters | 296 |
---|---|
Distinct characters | 145 |
Distinct categories | 6 |
Distinct scripts | 4 |
Distinct blocks | 5 |
Unique
Unique | 15 |
---|---|
Unique (%) | 100.0% |
Sample
1st row | 6177 |
---|---|
2nd row | 98757|72847 |
3rd row | 64857|64669 |
4th row | 55769|55669 |
5th row | C 323孫 |
Value | Count | Frequency (%) |
2月 | 2 | 8.0% |
6177 | 1 | 4.0% |
昭和10年2月20日退學生復校ヲ理由ニ暴力行爲ヲ敢行シタルモノ | 1 | 4.0% |
大正9年恩赦 | 1 | 4.0% |
勞農議會機關紙勞働者農民通信ノ配布ヲ受主義硏究ニ專念ノ揚句本籍地ノ夜間普通學校ヲ敎フル傍共産主義ヲ說キ其ノ實行等ニ付煽動シタル者ナリ | 1 | 4.0% |
京城工業學校內ニテ同窓生金林瀅金鍾千ト秘密結社ウリ學校ヲ組織シ讀書シ金鍾千柳宅夏沈壽石等同血會ヲ組織シ共産主義實踐運動ヲシタリ | 1 | 4.0% |
共靑ヲ組織シ又ハ■勞ヲ組織スル虞アリ | 1 | 4.0% |
14日 | 1 | 4.0% |
5月 | 1 | 4.0% |
令第19號ニヨリ懲役2年 | 1 | 4.0% |
Other values (14) | 14 |
Most occurring characters
Value | Count | Frequency (%) |
5 | 10 | 3.4% |
ヲ | 10 | 3.4% |
(space) | 10 | 3.4% |
6 | 9 | 3.0% |
シ | 9 | 3.0% |
7 | 8 | 2.7% |
2 | 7 | 2.4% |
9 | 7 | 2.4% |
1 | 6 | 2.0% |
ニ | 6 | 2.0% |
Other values (135) | 214 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 223 | |
Decimal Number | 58 | 19.6% |
Space Separator | 10 | 3.4% |
Math Symbol | 3 | 1.0% |
Uppercase Letter | 1 | 0.3% |
Other Symbol | 1 | 0.3% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
ヲ | 10 | 4.5% |
シ | 9 | 4.0% |
ニ | 6 | 2.7% |
ノ | 5 | 2.2% |
學 | 5 | 2.2% |
リ | 5 | 2.2% |
年 | 5 | 2.2% |
日 | 5 | 2.2% |
ル | 4 | 1.8% |
織 | 4 | 1.8% |
Other values (121) | 165 |
Decimal Number
Value | Count | Frequency (%) |
5 | 10 | |
6 | 9 | |
7 | 8 | |
2 | 7 | |
9 | 7 | |
1 | 6 | |
4 | 4 | 6.9% |
8 | 3 | 5.2% |
0 | 2 | 3.4% |
3 | 2 | 3.4% |
Space Separator
Value | Count | Frequency (%) |
(space) | 10 |
Math Symbol
Value | Count | Frequency (%) |
| | 3 |
Uppercase Letter
Value | Count | Frequency (%) |
C | 1 |
Other Symbol
Value | Count | Frequency (%) |
■ | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Han | 169 | |
Common | 72 | |
Katakana | 54 | 18.2% |
Latin | 1 | 0.3% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
學 | 5 | 3.0% |
年 | 5 | 3.0% |
日 | 5 | 3.0% |
織 | 4 | 2.4% |
校 | 4 | 2.4% |
月 | 4 | 2.4% |
組 | 4 | 2.4% |
動 | 3 | 1.8% |
義 | 3 | 1.8% |
共 | 3 | 1.8% |
Other values (103) | 129 |
Katakana
Value | Count | Frequency (%) |
ヲ | 10 | |
シ | 9 | |
ニ | 6 | |
ノ | 5 | |
リ | 5 | |
ル | 4 | 7.4% |
タ | 3 | 5.6% |
ナ | 2 | 3.7% |
キ | 1 | 1.9% |
フ | 1 | 1.9% |
Other values (8) | 8 |
Common
Value | Count | Frequency (%) |
5 | 10 | |
(space) | 10 | |
6 | 9 | |
7 | 8 | |
2 | 7 | |
9 | 7 | |
1 | 6 | |
4 | 4 | 5.6% |
8 | 3 | 4.2% |
| | 3 | 4.2% |
Other values (3) | 5 |
Latin
Value | Count | Frequency (%) |
C | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
CJK | 168 | |
ASCII | 72 | |
Katakana | 54 | 18.2% |
CJK Compat Ideographs | 1 | 0.3% |
Geometric Shapes | 1 | 0.3% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
5 | 10 | |
(space) | 10 | |
6 | 9 | |
7 | 8 | |
2 | 7 | |
9 | 7 | |
1 | 6 | |
4 | 4 | 5.6% |
8 | 3 | 4.2% |
| | 3 | 4.2% |
Other values (3) | 5 |
Katakana
Value | Count | Frequency (%) |
ヲ | 10 | |
シ | 9 | |
ニ | 6 | |
ノ | 5 | |
リ | 5 | |
ル | 4 | 7.4% |
タ | 3 | 5.6% |
ナ | 2 | 3.7% |
キ | 1 | 1.9% |
フ | 1 | 1.9% |
Other values (8) | 8 |
CJK
Value | Count | Frequency (%) |
學 | 5 | 3.0% |
年 | 5 | 3.0% |
日 | 5 | 3.0% |
織 | 4 | 2.4% |
校 | 4 | 2.4% |
月 | 4 | 2.4% |
組 | 4 | 2.4% |
動 | 3 | 1.8% |
義 | 3 | 1.8% |
共 | 3 | 1.8% |
Other values (102) | 128 |
CJK Compat Ideographs
Value | Count | Frequency (%) |
理 | 1 |
Geometric Shapes
Value | Count | Frequency (%) |
■ | 1 |
CAREER
Text
MISSING
Distinct | 744 |
---|---|
Distinct (%) | 15.6% |
Missing | 1535 |
Missing (%) | 24.4% |
Memory size | 49.3 KiB |
Value | Count | Frequency (%) |
農 | 1454 | |
學生 | 311 | 6.5% |
無 | 309 | 6.4% |
無職 | 302 | 6.3% |
ナシ | 287 | 6.0% |
農業 | 180 | 3.8% |
勞働 | 124 | 2.6% |
生徒 | 61 | 1.3% |
敎員 | 57 | 1.2% |
敎師 | 51 | 1.1% |
Other values (742) | 1664 |
Most occurring characters
Value | Count | Frequency (%) |
農 | 1646 | 15.6% |
無 | 622 | 5.9% |
生 | 505 | 4.8% |
學 | 463 | 4.4% |
職 | 421 | 4.0% |
業 | 352 | 3.3% |
シ | 302 | 2.9% |
ナ | 287 | 2.7% |
商 | 281 | 2.7% |
工 | 246 | 2.3% |
Other values (528) | 5422 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 10412 | |
Other Symbol | 45 | 0.4% |
Space Separator | 39 | 0.4% |
Decimal Number | 30 | 0.3% |
Close Punctuation | 9 | 0.1% |
Open Punctuation | 9 | 0.1% |
Modifier Letter | 1 | < 0.1% |
Other Punctuation | 1 | < 0.1% |
Dash Punctuation | 1 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
農 | 1646 | 15.8% |
無 | 622 | 6.0% |
生 | 505 | 4.9% |
學 | 463 | 4.4% |
職 | 421 | 4.0% |
業 | 352 | 3.4% |
シ | 302 | 2.9% |
ナ | 287 | 2.8% |
商 | 281 | 2.7% |
工 | 246 | 2.4% |
Other values (513) | 5287 |
Decimal Number
Value | Count | Frequency (%) |
2 | 8 | |
3 | 7 | |
4 | 6 | |
1 | 4 | |
5 | 2 | 6.7% |
6 | 1 | 3.3% |
8 | 1 | 3.3% |
7 | 1 | 3.3% |
Other Symbol
Value | Count | Frequency (%) |
■ | 45 |
Space Separator
Value | Count | Frequency (%) |
(space) | 39 |
Close Punctuation
Value | Count | Frequency (%) |
) | 9 |
Open Punctuation
Value | Count | Frequency (%) |
( | 9 |
Modifier Letter
Value | Count | Frequency (%) |
ー | 1 |
Other Punctuation
Value | Count | Frequency (%) |
… | 1 |
Dash Punctuation
Value | Count | Frequency (%) |
― | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Han | 9699 | |
Katakana | 711 | 6.7% |
Common | 135 | 1.3% |
Hiragana | 2 | < 0.1% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
農 | 1646 | 17.0% |
無 | 622 | 6.4% |
生 | 505 | 5.2% |
學 | 463 | 4.8% |
職 | 421 | 4.3% |
業 | 352 | 3.6% |
商 | 281 | 2.9% |
工 | 246 | 2.5% |
敎 | 222 | 2.3% |
員 | 206 | 2.1% |
Other values (478) | 4735 |
Katakana
Value | Count | Frequency (%) |
シ | 302 | |
ナ | 287 | |
ト | 12 | 1.7% |
ス | 10 | 1.4% |
リ | 10 | 1.4% |
ム | 10 | 1.4% |
ヤ | 8 | 1.1% |
キ | 8 | 1.1% |
ン | 7 | 1.0% |
ゴ | 5 | 0.7% |
Other values (23) | 52 | 7.3% |
Common
Value | Count | Frequency (%) |
■ | 45 | |
(space) | 39 | |
) | 9 | 6.7% |
( | 9 | 6.7% |
2 | 8 | 5.9% |
3 | 7 | 5.2% |
4 | 6 | 4.4% |
1 | 4 | 3.0% |
5 | 2 | 1.5% |
ー | 1 | 0.7% |
Other values (5) | 5 | 3.7% |
Hiragana
Value | Count | Frequency (%) |
じ | 1 | |
し | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
CJK | 9640 | |
Katakana | 712 | 6.8% |
ASCII | 87 | 0.8% |
CJK Compat Ideographs | 58 | 0.5% |
Geometric Shapes | 45 | 0.4% |
Hiragana | 2 | < 0.1% |
Punctuation | 2 | < 0.1% |
CJK Ext A | 1 | < 0.1% |
Most frequent character per block
CJK
Value | Count | Frequency (%) |
農 | 1646 | 17.1% |
無 | 622 | 6.5% |
生 | 505 | 5.2% |
學 | 463 | 4.8% |
職 | 421 | 4.4% |
業 | 352 | 3.7% |
商 | 281 | 2.9% |
工 | 246 | 2.6% |
敎 | 222 | 2.3% |
員 | 206 | 2.1% |
Other values (467) | 4676 |
Katakana
Value | Count | Frequency (%) |
シ | 302 | |
ナ | 287 | |
ト | 12 | 1.7% |
ス | 10 | 1.4% |
リ | 10 | 1.4% |
ム | 10 | 1.4% |
ヤ | 8 | 1.1% |
キ | 8 | 1.1% |
ン | 7 | 1.0% |
ゴ | 5 | 0.7% |
Other values (24) | 53 | 7.4% |
Geometric Shapes
Value | Count | Frequency (%) |
■ | 45 |
ASCII
Value | Count | Frequency (%) |
(space) | 39 | |
) | 9 | 10.3% |
( | 9 | 10.3% |
2 | 8 | 9.2% |
3 | 7 | 8.0% |
4 | 6 | 6.9% |
1 | 4 | 4.6% |
5 | 2 | 2.3% |
6 | 1 | 1.1% |
8 | 1 | 1.1% |
CJK Compat Ideographs
Value | Count | Frequency (%) |
勞 | 31 | |
女 | 12 | 20.7% |
理 | 5 | 8.6% |
金 | 3 | 5.2% |
車 | 2 | 3.4% |
易 | 1 | 1.7% |
料 | 1 | 1.7% |
旅 | 1 | 1.7% |
煉 | 1 | 1.7% |
梨 | 1 | 1.7% |
Hiragana
Value | Count | Frequency (%) |
じ | 1 | |
し | 1 |
Punctuation
Value | Count | Frequency (%) |
… | 1 | |
― | 1 |
CJK Ext A
Value | Count | Frequency (%) |
䴹 | 1 |
FAMILY_NAME
Text
MISSING
Distinct | 40 |
---|---|
Distinct (%) | 81.6% |
Missing | 6247 |
Missing (%) | 99.2% |
Memory size | 49.3 KiB |
Value | Count | Frequency (%) |
本人 | 7 | 14.3% |
戶主 | 4 | 8.2% |
食料品小賣 | 1 | 2.0% |
李鍾泰 | 1 | 2.0% |
梁源達 | 1 | 2.0% |
柳章鉉 | 1 | 2.0% |
李性主 | 1 | 2.0% |
李圭東 | 1 | 2.0% |
李昌榮 | 1 | 2.0% |
職工 | 1 | 2.0% |
Other values (30) | 30 |
Most occurring characters
Value | Count | Frequency (%) |
本 | 7 | 5.3% |
人 | 7 | 5.3% |
李 | 5 | 3.8% |
主 | 5 | 3.8% |
金 | 5 | 3.8% |
朴 | 4 | 3.1% |
戶 | 4 | 3.1% |
圭 | 3 | 2.3% |
丁 | 2 | 1.5% |
韓 | 2 | 1.5% |
Other values (77) | 87 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 131 |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
本 | 7 | 5.3% |
人 | 7 | 5.3% |
李 | 5 | 3.8% |
主 | 5 | 3.8% |
金 | 5 | 3.8% |
朴 | 4 | 3.1% |
戶 | 4 | 3.1% |
圭 | 3 | 2.3% |
丁 | 2 | 1.5% |
韓 | 2 | 1.5% |
Other values (77) | 87 |
Most occurring scripts
Value | Count | Frequency (%) |
Han | 131 |
Most frequent character per script
Han
Value | Count | Frequency (%) |
本 | 7 | 5.3% |
人 | 7 | 5.3% |
李 | 5 | 3.8% |
主 | 5 | 3.8% |
金 | 5 | 3.8% |
朴 | 4 | 3.1% |
戶 | 4 | 3.1% |
圭 | 3 | 2.3% |
丁 | 2 | 1.5% |
韓 | 2 | 1.5% |
Other values (77) | 87 |
Most occurring blocks
Value | Count | Frequency (%) |
CJK | 131 |
Most frequent character per block
CJK
Value | Count | Frequency (%) |
本 | 7 | 5.3% |
人 | 7 | 5.3% |
李 | 5 | 3.8% |
主 | 5 | 3.8% |
金 | 5 | 3.8% |
朴 | 4 | 3.1% |
戶 | 4 | 3.1% |
圭 | 3 | 2.3% |
丁 | 2 | 1.5% |
韓 | 2 | 1.5% |
Other values (77) | 87 |
FAMILY_RELATION
Text
MISSING
Distinct | 23 |
---|---|
Distinct (%) | 52.3% |
Missing | 6252 |
Missing (%) | 99.3% |
Memory size | 49.3 KiB |
Value | Count | Frequency (%) |
長男 | 9 | |
戶主 | 8 | |
本人 | 2 | 4.5% |
父 | 2 | 4.5% |
長孫 | 2 | 4.5% |
次男 | 2 | 4.5% |
3男 | 2 | 4.5% |
弟 | 2 | 4.5% |
李昌榮ノ二男 | 1 | 2.3% |
戶主ノ長孫 | 1 | 2.3% |
Other values (13) | 13 |
Most occurring characters
Value | Count | Frequency (%) |
男 | 21 | |
長 | 14 | |
戶 | 14 | |
主 | 14 | |
ノ | 6 | 5.2% |
孫 | 4 | 3.5% |
弟 | 3 | 2.6% |
次 | 3 | 2.6% |
人 | 3 | 2.6% |
本 | 3 | 2.6% |
Other values (26) | 30 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 111 | |
Decimal Number | 4 | 3.5% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
男 | 21 | |
長 | 14 | |
戶 | 14 | |
主 | 14 | |
ノ | 6 | 5.4% |
孫 | 4 | 3.6% |
弟 | 3 | 2.7% |
次 | 3 | 2.7% |
人 | 3 | 2.7% |
本 | 3 | 2.7% |
Other values (23) | 26 |
Decimal Number
Value | Count | Frequency (%) |
3 | 2 | |
2 | 1 | |
4 | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Han | 101 | |
Katakana | 10 | 8.7% |
Common | 4 | 3.5% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
男 | 21 | |
長 | 14 | |
戶 | 14 | |
主 | 14 | |
孫 | 4 | 4.0% |
弟 | 3 | 3.0% |
次 | 3 | 3.0% |
人 | 3 | 3.0% |
本 | 3 | 3.0% |
父 | 2 | 2.0% |
Other values (18) | 20 |
Katakana
Value | Count | Frequency (%) |
ノ | 6 | |
リ | 1 | 10.0% |
ナ | 1 | 10.0% |
ニ | 1 | 10.0% |
ハ | 1 | 10.0% |
Common
Value | Count | Frequency (%) |
3 | 2 | |
2 | 1 | |
4 | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
CJK | 101 | |
Katakana | 10 | 8.7% |
ASCII | 4 | 3.5% |
Most frequent character per block
CJK
Value | Count | Frequency (%) |
男 | 21 | |
長 | 14 | |
戶 | 14 | |
主 | 14 | |
孫 | 4 | 4.0% |
弟 | 3 | 3.0% |
次 | 3 | 3.0% |
人 | 3 | 3.0% |
本 | 3 | 3.0% |
父 | 2 | 2.0% |
Other values (18) | 20 |
Katakana
Value | Count | Frequency (%) |
ノ | 6 | |
リ | 1 | 10.0% |
ナ | 1 | 10.0% |
ニ | 1 | 10.0% |
ハ | 1 | 10.0% |
ASCII
Value | Count | Frequency (%) |
3 | 2 | |
2 | 1 | |
4 | 1 |
FATHER_NAME
Text
MISSING
Distinct | 41 |
---|---|
Distinct (%) | 100.0% |
Missing | 6255 |
Missing (%) | 99.3% |
Memory size | 49.3 KiB |
Value | Count | Frequency (%) |
父 | 27 | |
母 | 26 | |
金氏 | 5 | 4.1% |
李氏 | 4 | 3.3% |
亡 | 3 | 2.5% |
ナシ | 3 | 2.5% |
崔氏 | 2 | 1.6% |
朴氏 | 2 | 1.6% |
李氏 | 2 | 1.6% |
尹氏 | 2 | 1.6% |
Other values (46) | 46 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 81 | |
母 | 30 | 9.3% |
父 | 29 | 9.0% |
氏 | 23 | 7.1% |
金 | 12 | 3.7% |
李 | 10 | 3.1% |
| | 5 | 1.5% |
シ | 5 | 1.5% |
亡 | 5 | 1.5% |
ナ | 4 | 1.2% |
Other values (87) | 119 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 233 | |
Space Separator | 81 | 25.1% |
Math Symbol | 5 | 1.5% |
Other Punctuation | 3 | 0.9% |
Other Symbol | 1 | 0.3% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
母 | 30 | 12.9% |
父 | 29 | 12.4% |
氏 | 23 | 9.9% |
金 | 12 | 5.2% |
李 | 10 | 4.3% |
シ | 5 | 2.1% |
亡 | 5 | 2.1% |
ナ | 4 | 1.7% |
閔 | 4 | 1.7% |
昌 | 4 | 1.7% |
Other values (82) | 107 |
Other Punctuation
Value | Count | Frequency (%) |
: | 2 | |
/ | 1 |
Space Separator
Value | Count | Frequency (%) |
(space) | 81 |
Math Symbol
Value | Count | Frequency (%) |
| | 5 |
Other Symbol
Value | Count | Frequency (%) |
■ | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Han | 223 | |
Common | 90 | |
Katakana | 10 | 3.1% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
母 | 30 | 13.5% |
父 | 29 | 13.0% |
氏 | 23 | 10.3% |
金 | 12 | 5.4% |
李 | 10 | 4.5% |
亡 | 5 | 2.2% |
閔 | 4 | 1.8% |
昌 | 4 | 1.8% |
徐 | 3 | 1.3% |
崔 | 3 | 1.3% |
Other values (79) | 100 |
Common
Value | Count | Frequency (%) |
(space) | 81 | |
| | 5 | 5.6% |
: | 2 | 2.2% |
/ | 1 | 1.1% |
■ | 1 | 1.1% |
Katakana
Value | Count | Frequency (%) |
シ | 5 | |
ナ | 4 | |
ジ | 1 | 10.0% |
Most occurring blocks
Value | Count | Frequency (%) |
CJK | 220 | |
ASCII | 89 | |
Katakana | 10 | 3.1% |
CJK Compat Ideographs | 3 | 0.9% |
Geometric Shapes | 1 | 0.3% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 81 | |
| | 5 | 5.6% |
: | 2 | 2.2% |
/ | 1 | 1.1% |
CJK
Value | Count | Frequency (%) |
母 | 30 | 13.6% |
父 | 29 | 13.2% |
氏 | 23 | 10.5% |
金 | 12 | 5.5% |
李 | 10 | 4.5% |
亡 | 5 | 2.3% |
閔 | 4 | 1.8% |
昌 | 4 | 1.8% |
徐 | 3 | 1.4% |
崔 | 3 | 1.4% |
Other values (77) | 97 |
Katakana
Value | Count | Frequency (%) |
シ | 5 | |
ナ | 4 | |
ジ | 1 | 10.0% |
CJK Compat Ideographs
Value | Count | Frequency (%) |
李 | 2 | |
柳 | 1 |
Geometric Shapes
Value | Count | Frequency (%) |
■ | 1 |
TAIL
Text
MISSING
Distinct | 426 |
---|---|
Distinct (%) | 15.2% |
Missing | 3501 |
Missing (%) | 55.6% |
Memory size | 49.3 KiB |
Value | Count | Frequency (%) |
5尺4寸 | 49 | 1.8% |
5尺3寸 | 48 | 1.7% |
5尺5寸 | 41 | 1.5% |
5尺4寸0分 | 41 | 1.5% |
1米670 | 40 | 1.4% |
5尺4寸5分 | 39 | 1.4% |
1米650 | 38 | 1.4% |
5尺2寸5分 | 37 | 1.3% |
5尺5寸0分 | 36 | 1.3% |
5尺4寸1分 | 35 | 1.3% |
Other values (417) | 2392 |
Most occurring characters
Value | Count | Frequency (%) |
5 | 2618 | |
1 | 1603 | |
尺 | 1530 | |
寸 | 1511 | |
分 | 1267 | |
米 | 1192 | |
6 | 1147 | |
0 | 821 | 5.6% |
4 | 612 | 4.2% |
7 | 573 | 3.9% |
Other values (27) | 1860 |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 8964 | |
Other Letter | 5568 | |
Other Punctuation | 100 | 0.7% |
Lowercase Letter | 63 | 0.4% |
Other Symbol | 34 | 0.2% |
Math Symbol | 4 | < 0.1% |
Space Separator | 1 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
尺 | 1530 | |
寸 | 1511 | |
分 | 1267 | |
米 | 1192 | |
不 | 23 | 0.4% |
位 | 15 | 0.3% |
不 | 11 | 0.2% |
糎 | 7 | 0.1% |
明 | 2 | < 0.1% |
强 | 2 | < 0.1% |
Other values (7) | 8 | 0.1% |
Decimal Number
Value | Count | Frequency (%) |
5 | 2618 | |
1 | 1603 | |
6 | 1147 | |
0 | 821 | 9.2% |
4 | 612 | 6.8% |
7 | 573 | 6.4% |
3 | 557 | 6.2% |
2 | 510 | 5.7% |
8 | 308 | 3.4% |
9 | 215 | 2.4% |
Lowercase Letter
Value | Count | Frequency (%) |
m | 31 | |
c | 30 | |
k | 1 | 1.6% |
g | 1 | 1.6% |
Other Punctuation
Value | Count | Frequency (%) |
. | 99 | |
, | 1 | 1.0% |
Other Symbol
Value | Count | Frequency (%) |
㎝ | 31 | |
■ | 3 | 8.8% |
Math Symbol
Value | Count | Frequency (%) |
| | 4 |
Space Separator
Value | Count | Frequency (%) |
(space) | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 9103 | |
Han | 5568 | |
Latin | 63 | 0.4% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
尺 | 1530 | |
寸 | 1511 | |
分 | 1267 | |
米 | 1192 | |
不 | 23 | 0.4% |
位 | 15 | 0.3% |
不 | 11 | 0.2% |
糎 | 7 | 0.1% |
明 | 2 | < 0.1% |
强 | 2 | < 0.1% |
Other values (7) | 8 | 0.1% |
Common
Value | Count | Frequency (%) |
5 | 2618 | |
1 | 1603 | |
6 | 1147 | |
0 | 821 | 9.0% |
4 | 612 | 6.7% |
7 | 573 | 6.3% |
3 | 557 | 6.1% |
2 | 510 | 5.6% |
8 | 308 | 3.4% |
9 | 215 | 2.4% |
Other values (6) | 139 | 1.5% |
Latin
Value | Count | Frequency (%) |
m | 31 | |
c | 30 | |
k | 1 | 1.6% |
g | 1 | 1.6% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 9132 | |
CJK | 5556 | |
CJK Compat | 31 | 0.2% |
CJK Compat Ideographs | 12 | 0.1% |
Geometric Shapes | 3 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
5 | 2618 | |
1 | 1603 | |
6 | 1147 | |
0 | 821 | 9.0% |
4 | 612 | 6.7% |
7 | 573 | 6.3% |
3 | 557 | 6.1% |
2 | 510 | 5.6% |
8 | 308 | 3.4% |
9 | 215 | 2.4% |
Other values (8) | 168 | 1.8% |
CJK
Value | Count | Frequency (%) |
尺 | 1530 | |
寸 | 1511 | |
分 | 1267 | |
米 | 1192 | |
不 | 23 | 0.4% |
位 | 15 | 0.3% |
糎 | 7 | 0.1% |
明 | 2 | < 0.1% |
强 | 2 | < 0.1% |
耗 | 2 | < 0.1% |
Other values (5) | 5 | 0.1% |
CJK Compat
Value | Count | Frequency (%) |
㎝ | 31 |
CJK Compat Ideographs
Value | Count | Frequency (%) |
不 | 11 | |
六 | 1 | 8.3% |
Geometric Shapes
Value | Count | Frequency (%) |
■ | 3 |
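The TAIL column mixes traditional units (尺, 寸, 分) with metric notations such as `1米670`. The following is a minimal sketch, under stated assumptions, for converting the two most common shapes to centimetres. It assumes 1 尺 = 10/33 m with 寸 and 分 as successive tenths, and it reads `1米670` as 1 m 670 mm; neither reading is confirmed by the report. The function name `height_cm` is hypothetical, and other notations (㎝, 糎, 位, 不) are deliberately left unparsed.

```python
import re
from typing import Optional

SHAKU_CM = 100 / 3.3  # assumed: 1 尺 = 10/33 m, about 30.303 cm

_SHAKU_RE = re.compile(r"(\d+)尺(?:(\d+)寸)?(?:(\d+)分)?")
_METRE_RE = re.compile(r"(\d)米(\d{3})")


def height_cm(value: str) -> Optional[float]:
    """Convert a TAIL string to centimetres, or return None if unrecognised."""
    text = value.strip()
    m = _SHAKU_RE.fullmatch(text)
    if m:
        # Missing 寸 or 分 parts count as zero.
        shaku, sun, bu = (int(g) if g else 0 for g in m.groups())
        return round((shaku + sun / 10 + bu / 100) * SHAKU_CM, 1)
    m = _METRE_RE.fullmatch(text)
    if m:
        # Assumption: the three digits after 米 are millimetres.
        return round(int(m.group(1)) * 100 + int(m.group(2)) / 10, 1)
    return None
```

Returning None for unrecognised shapes keeps irregular entries visible instead of silently coercing them.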
FEATURES
Text
MISSING
Distinct | 234 |
---|---|
Distinct (%) | 42.8% |
Missing | 5749 |
Missing (%) | 91.3% |
Memory size | 49.3 KiB |
Value | Count | Frequency (%) |
ナシ | 284 | |
右前膊部ニ文身 | 5 | 0.9% |
面部ニ痘痕 | 4 | 0.7% |
腰部ニ瘍痕 | 4 | 0.7% |
背部ニ黑子1 | 3 | 0.5% |
背部ニ瘍痕 | 3 | 0.5% |
上腹部ニ瘍痕 | 3 | 0.5% |
左右前膊部ニ文身 | 2 | 0.4% |
中腹部ニ疚痕 | 2 | 0.4% |
胸部ニ黑子1 | 2 | 0.4% |
Other values (240) | 251 |
Most occurring characters
Value | Count | Frequency (%) |
シ | 291 | 12.0% |
ナ | 286 | 11.7% |
ニ | 223 | 9.2% |
部 | 187 | 7.7% |
痕 | 141 | 5.8% |
リ | 64 | 2.6% |
ア | 63 | 2.6% |
1 | 58 | 2.4% |
傷 | 47 | 1.9% |
瘍 | 45 | 1.8% |
Other values (192) | 1030 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 2309 | |
Decimal Number | 82 | 3.4% |
Space Separator | 16 | 0.7% |
Other Symbol | 14 | 0.6% |
Other Punctuation | 11 | 0.5% |
Open Punctuation | 1 | < 0.1% |
Math Symbol | 1 | < 0.1% |
Close Punctuation | 1 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
シ | 291 | 12.6% |
ナ | 286 | 12.4% |
ニ | 223 | 9.7% |
部 | 187 | 8.1% |
痕 | 141 | 6.1% |
リ | 64 | 2.8% |
ア | 63 | 2.7% |
傷 | 47 | 2.0% |
瘍 | 45 | 1.9% |
右 | 38 | 1.6% |
Other values (178) | 924 |
Decimal Number
Value | Count | Frequency (%) |
1 | 58 | |
2 | 10 | 12.2% |
3 | 8 | 9.8% |
7 | 2 | 2.4% |
6 | 2 | 2.4% |
0 | 1 | 1.2% |
5 | 1 | 1.2% |
Other Symbol
Value | Count | Frequency (%) |
■ | 13 | |
▼ | 1 | 7.1% |
Space Separator
Value | Count | Frequency (%) |
(space) | 16 |
Other Punctuation
Value | Count | Frequency (%) |
, | 11 |
Open Punctuation
Value | Count | Frequency (%) |
( | 1 |
Math Symbol
Value | Count | Frequency (%) |
+ | 1 |
Close Punctuation
Value | Count | Frequency (%) |
) | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Han | 1350 | |
Katakana | 959 | |
Common | 126 | 5.2% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
部 | 187 | 13.9% |
痕 | 141 | 10.4% |
傷 | 47 | 3.5% |
瘍 | 45 | 3.3% |
右 | 38 | 2.8% |
左 | 36 | 2.7% |
膊 | 35 | 2.6% |
上 | 34 | 2.5% |
腹 | 32 | 2.4% |
前 | 30 | 2.2% |
Other values (157) | 725 |
Katakana
Value | Count | Frequency (%) |
シ | 291 | |
ナ | 286 | |
ニ | 223 | |
リ | 64 | 6.7% |
ア | 63 | 6.6% |
ノ | 12 | 1.3% |
ハ | 2 | 0.2% |
テ | 2 | 0.2% |
キ | 2 | 0.2% |
ケ | 2 | 0.2% |
Other values (11) | 12 | 1.3% |
Common
Value | Count | Frequency (%) |
1 | 58 | |
(space) | 16 | 12.7% |
■ | 13 | 10.3% |
, | 11 | 8.7% |
2 | 10 | 7.9% |
3 | 8 | 6.3% |
7 | 2 | 1.6% |
6 | 2 | 1.6% |
0 | 1 | 0.8% |
▼ | 1 | 0.8% |
Other values (4) | 4 | 3.2% |
Most occurring blocks
Value | Count | Frequency (%) |
CJK | 1346 | |
Katakana | 959 | |
ASCII | 112 | 4.6% |
Geometric Shapes | 14 | 0.6% |
CJK Compat Ideographs | 3 | 0.1% |
CJK Ext A | 1 | < 0.1% |
Most frequent character per block
Katakana
Value | Count | Frequency (%) |
シ | 291 | |
ナ | 286 | |
ニ | 223 | |
リ | 64 | 6.7% |
ア | 63 | 6.6% |
ノ | 12 | 1.3% |
ハ | 2 | 0.2% |
テ | 2 | 0.2% |
キ | 2 | 0.2% |
ケ | 2 | 0.2% |
Other values (11) | 12 | 1.3% |
CJK
Value | Count | Frequency (%) |
部 | 187 | 13.9% |
痕 | 141 | 10.5% |
傷 | 47 | 3.5% |
瘍 | 45 | 3.3% |
右 | 38 | 2.8% |
左 | 36 | 2.7% |
膊 | 35 | 2.6% |
上 | 34 | 2.5% |
腹 | 32 | 2.4% |
前 | 30 | 2.2% |
Other values (153) | 721 |
ASCII
Value | Count | Frequency (%) |
1 | 58 | |
(space) | 16 | 14.3% |
, | 11 | 9.8% |
2 | 10 | 8.9% |
3 | 8 | 7.1% |
7 | 2 | 1.8% |
6 | 2 | 1.8% |
0 | 1 | 0.9% |
( | 1 | 0.9% |
+ | 1 | 0.9% |
Other values (2) | 2 | 1.8% |
Geometric Shapes
Value | Count | Frequency (%) |
■ | 13 | |
▼ | 1 | 7.1% |
CJK Compat Ideographs
Value | Count | Frequency (%) |
兩 | 1 | |
不 | 1 | |
金 | 1 |
CJK Ext A
Value | Count | Frequency (%) |
㡱 | 1 |
FEATURES_NO
Text
MISSING
Distinct | 2 |
---|---|
Distinct (%) | 100.0% |
Missing | 6294 |
Missing (%) | > 99.9% |
Memory size | 49.3 KiB |
Value | Count | Frequency (%) |
451 | 1 | |
上下腹部ニ灸痕アリ | 1 |
Most occurring characters
Value | Count | Frequency (%) |
4 | 1 | |
5 | 1 | |
1 | 1 | |
上 | 1 | |
下 | 1 | |
腹 | 1 | |
部 | 1 | |
ニ | 1 | |
灸 | 1 | |
痕 | 1 | |
Other values (2) | 2 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 9 | |
Decimal Number | 3 | 25.0% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
上 | 1 | |
下 | 1 | |
腹 | 1 | |
部 | 1 | |
ニ | 1 | |
灸 | 1 | |
痕 | 1 | |
ア | 1 | |
リ | 1 |
Decimal Number
Value | Count | Frequency (%) |
4 | 1 | |
5 | 1 | |
1 | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Han | 6 | |
Common | 3 | |
Katakana | 3 |
Most frequent character per script
Han
Value | Count | Frequency (%) |
上 | 1 | |
下 | 1 | |
腹 | 1 | |
部 | 1 | |
灸 | 1 | |
痕 | 1 |
Common
Value | Count | Frequency (%) |
4 | 1 | |
5 | 1 | |
1 | 1 |
Katakana
Value | Count | Frequency (%) |
ニ | 1 | |
ア | 1 | |
リ | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
CJK | 6 | |
ASCII | 3 | |
Katakana | 3 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
4 | 1 | |
5 | 1 | |
1 | 1 |
CJK
Value | Count | Frequency (%) |
上 | 1 | |
下 | 1 | |
腹 | 1 | |
部 | 1 | |
灸 | 1 | |
痕 | 1 |
Katakana
Value | Count | Frequency (%) |
ニ | 1 | |
ア | 1 | |
リ | 1 |
ORIGIN_ADDRESS
Text
MISSING
Distinct | 4731 |
---|---|
Distinct (%) | 80.7% |
Missing | 437 |
Missing (%) | 6.9% |
Memory size | 49.3 KiB |
Length
Max length | 36 |
---|---|
Median length | 25 |
Mean length | 16.283154 |
Min length | 2 |
Characters and Unicode
Total characters | 95403 |
---|---|
Distinct characters | 1208 |
Distinct categories | 10 |
Distinct scripts | 6 |
Distinct blocks | 8 |
Unique
Unique | 3918 |
---|---|
Unique (%) | 66.9% |
Sample
1st row | 太原綏靖公署秘書長 |
---|---|
2nd row | 奉川省 蓋平縣 陳家屯 |
3rd row | 靜岡 安倍 有度 上原 863 |
4th row | 黃海道 平郡 古北 書梧 |
5th row | 忠淸北道 淸州郡 南州內 新場垈 |
Value | Count | Frequency (%) |
京畿道 | 1426 | 5.3% |
咸鏡北道 | 1039 | 3.9% |
咸鏡南道 | 787 | 2.9% |
京城府 | 551 | 2.1% |
江原道 | 525 | 2.0% |
慶尙北道 | 362 | 1.4% |
明川郡 | 270 | 1.0% |
忠淸南道 | 251 | 0.9% |
鏡城郡 | 238 | 0.9% |
黃海道 | 232 | 0.9% |
Other values (5499) | 21091 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 20913 | |
道 | 5863 | 6.1% |
郡 | 4658 | 4.9% |
北 | 2284 | 2.4% |
南 | 2145 | 2.2% |
鏡 | 2070 | 2.2% |
1 | 2030 | 2.1% |
京 | 2001 | 2.1% |
咸 | 1991 | 2.1% |
城 | 1626 | 1.7% |
Other values (1198) | 49822 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 62350 | |
Space Separator | 20913 | 21.9% |
Decimal Number | 11953 | 12.5% |
Dash Punctuation | 103 | 0.1% |
Other Symbol | 52 | 0.1% |
Other Punctuation | 16 | < 0.1% |
Close Punctuation | 6 | < 0.1% |
Open Punctuation | 6 | < 0.1% |
Lowercase Letter | 2 | < 0.1% |
Uppercase Letter | 2 | < 0.1% |
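The category table above groups every character of the column by its Unicode general category. A minimal sketch of how such counts could be reproduced with Python's standard `unicodedata` module follows; the helper name `category_counts` and the label mapping are illustrative assumptions, not the profiler's own code, and the two sample strings are taken from the Sample rows of this variable.

```python
import unicodedata
from collections import Counter

# Long labels for the general-category codes that appear in the report.
# (Assumed mapping for illustration; unlisted codes fall back to the raw code.)
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Nd": "Decimal Number",
    "Pd": "Dash Punctuation",
    "So": "Other Symbol",
    "Po": "Other Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
}

def category_counts(values):
    """Count characters per Unicode general category, skipping missing values."""
    counts = Counter()
    for value in values:
        if value is None:
            continue
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

sample = ["奉川省 蓋平縣 陳家屯", "靜岡 安倍 有度 上原 863", None]
counts = category_counts(sample)
```

Run over the full column, the same loop should yield totals comparable to the table, although the profiler may differ in how it treats missing cells.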
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
道 | 5863 | 9.4% |
郡 | 4658 | 7.5% |
北 | 2284 | 3.7% |
南 | 2145 | 3.4% |
鏡 | 2070 | 3.3% |
京 | 2001 | 3.2% |
咸 | 1991 | 3.2% |
城 | 1626 | 2.6% |
畿 | 1426 | 2.3% |
川 | 1125 | 1.8% |
Other values (1172) | 37161 |
Decimal Number
Value | Count | Frequency (%) |
1 | 2030 | |
2 | 1623 | |
3 | 1302 | |
4 | 1201 | |
5 | 1133 | |
6 | 1058 | |
7 | 991 | |
8 | 908 | |
9 | 864 | |
0 | 843 |
Other Punctuation
Value | Count | Frequency (%) |
, | 5 | |
' | 4 | |
& | 2 | 12.5% |
# | 2 | 12.5% |
; | 2 | 12.5% |
. | 1 | 6.2% |
Close Punctuation
Value | Count | Frequency (%) |
) | 5 | |
] | 1 | 16.7% |
Open Punctuation
Value | Count | Frequency (%) |
( | 5 | |
[ | 1 | 16.7% |
Uppercase Letter
Value | Count | Frequency (%) |
E | 1 | |
F | 1 |
Space Separator
Value | Count | Frequency (%) |
(space) | 20913 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 103 |
Other Symbol
Value | Count | Frequency (%) |
■ | 52 |
Lowercase Letter
Value | Count | Frequency (%) |
x | 2 |
Most occurring scripts
Value | Count | Frequency (%) |
Han | 62222 | |
Common | 33049 | |
Katakana | 99 | 0.1% |
Hiragana | 22 | < 0.1% |
Hangul | 7 | < 0.1% |
Latin | 4 | < 0.1% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
道 | 5863 | 9.4% |
郡 | 4658 | 7.5% |
北 | 2284 | 3.7% |
南 | 2145 | 3.4% |
鏡 | 2070 | 3.3% |
京 | 2001 | 3.2% |
咸 | 1991 | 3.2% |
城 | 1626 | 2.6% |
畿 | 1426 | 2.3% |
川 | 1125 | 1.8% |
Other values (1149) | 37033 |
Common
Value | Count | Frequency (%) |
(space) | 20913 | |
1 | 2030 | 6.1% |
2 | 1623 | 4.9% |
3 | 1302 | 3.9% |
4 | 1201 | 3.6% |
5 | 1133 | 3.4% |
6 | 1058 | 3.2% |
7 | 991 | 3.0% |
8 | 908 | 2.7% |
9 | 864 | 2.6% |
Other values (13) | 1026 | 3.1% |
Katakana
Value | Count | Frequency (%) |
ノ | 72 | |
ヤ | 3 | 3.0% |
イ | 3 | 3.0% |
リ | 3 | 3.0% |
ム | 2 | 2.0% |
ス | 2 | 2.0% |
カ | 2 | 2.0% |
ト | 2 | 2.0% |
マ | 2 | 2.0% |
サ | 2 | 2.0% |
Other values (6) | 6 | 6.1% |
Hangul
Value | Count | Frequency (%) |
의 | 2 | |
본 | 1 | |
원 | 1 | |
은 | 1 | |
오 | 1 | |
기 | 1 |
Latin
Value | Count | Frequency (%) |
x | 2 | |
E | 1 | |
F | 1 |
Hiragana
Value | Count | Frequency (%) |
の | 22 |
Most occurring blocks
Value | Count | Frequency (%) |
CJK | 61903 | |
ASCII | 33001 | |
CJK Compat Ideographs | 316 | 0.3% |
Katakana | 99 | 0.1% |
Geometric Shapes | 52 | 0.1% |
Hiragana | 22 | < 0.1% |
Hangul | 7 | < 0.1% |
CJK Ext A | 3 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 20913 | |
1 | 2030 | 6.2% |
2 | 1623 | 4.9% |
3 | 1302 | 3.9% |
4 | 1201 | 3.6% |
5 | 1133 | 3.4% |
6 | 1058 | 3.2% |
7 | 991 | 3.0% |
8 | 908 | 2.8% |
9 | 864 | 2.6% |
Other values (15) | 978 | 3.0% |
CJK
Value | Count | Frequency (%) |
道 | 5863 | 9.5% |
郡 | 4658 | 7.5% |
北 | 2284 | 3.7% |
南 | 2145 | 3.5% |
鏡 | 2070 | 3.3% |
京 | 2001 | 3.2% |
咸 | 1991 | 3.2% |
城 | 1626 | 2.6% |
畿 | 1426 | 2.3% |
川 | 1125 | 1.8% |
Other values (1105) | 36714 |
CJK Compat Ideographs
Value | Count | Frequency (%) |
不 | 78 | |
龍 | 63 | |
金 | 42 | |
栗 | 13 | 4.1% |
漣 | 12 | 3.8% |
寧 | 9 | 2.8% |
禮 | 8 | 2.5% |
驪 | 8 | 2.5% |
連 | 7 | 2.2% |
良 | 5 | 1.6% |
Other values (32) | 71 |
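The CJK Compat Ideographs block lists characters that look identical to ordinary unified ideographs but sit at separate code points (U+F900 to U+FAFF), so the same place name can be stored in two ways. A minimal sketch of folding them before counting or matching is shown below. It assumes the entries in the table are such duplicate code points (for example U+F967 for 不), and the helper names are hypothetical.

```python
import unicodedata

def is_compat_ideograph(ch):
    """True if ch lies in the CJK Compatibility Ideographs block (U+F900-U+FAFF)."""
    return 0xF900 <= ord(ch) <= 0xFAFF

def fold_compat(text):
    """Map compatibility ideographs to their unified equivalents.

    These characters carry canonical singleton decompositions, so NFC
    normalization alone replaces them.
    """
    return unicodedata.normalize("NFC", text)

raw = "\uF967\u8A73"       # compatibility form of 不 followed by 詳
folded = fold_compat(raw)  # unified form: U+4E0D U+8A73
```

Normalizing the address columns this way before grouping would be expected to merge spellings that differ only by code point.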
Katakana
Value | Count | Frequency (%) |
ノ | 72 | |
ヤ | 3 | 3.0% |
イ | 3 | 3.0% |
リ | 3 | 3.0% |
ム | 2 | 2.0% |
ス | 2 | 2.0% |
カ | 2 | 2.0% |
ト | 2 | 2.0% |
マ | 2 | 2.0% |
サ | 2 | 2.0% |
Other values (6) | 6 | 6.1% |
Geometric Shapes
Value | Count | Frequency (%) |
■ | 52 |
Hiragana
Value | Count | Frequency (%) |
の | 22 |
Hangul
Value | Count | Frequency (%) |
의 | 2 | |
본 | 1 | |
원 | 1 | |
은 | 1 | |
오 | 1 | |
기 | 1 |
CJK Ext A
Value | Count | Frequency (%) |
㷨 | 2 | |
䦍 | 1 |
BIRTH_PLACE
Text
MISSING
 
Distinct | 4283 |
---|---|
Distinct (%) | 82.7% |
Missing | 1118 |
Missing (%) | 17.8% |
Memory size | 49.3 KiB |
Length
Max length | 87 |
---|---|
Median length | 24 |
Mean length | 15.923523 |
Min length | 1 |
Characters and Unicode
Total characters | 82452 |
---|---|
Distinct characters | 1260 |
Distinct categories | 13 |
Distinct scripts | 5 |
Distinct blocks | 8 |
Unique
Unique | 3628 |
---|---|
Unique (%) | 70.1% |
Sample
1st row | 奉川省 蓋平縣 陳家屯 |
---|---|
2nd row | 靜岡 安倍 有度 上原 863 |
3rd row | 忠淸南道 論山郡 夫赤 新橋 |
4th row | 咸鏡南道 洪原郡 鶴泉 豊洞 411 |
5th row | 咸鏡南道 洪原郡 鶴泉 豊洞 450 |
Value | Count | Frequency (%) |
京畿道 | 1140 | 4.9% |
咸鏡北道 | 871 | 3.8% |
咸鏡南道 | 688 | 3.0% |
江原道 | 465 | 2.0% |
京城府 | 399 | 1.7% |
慶尙北道 | 336 | 1.4% |
忠淸南道 | 222 | 1.0% |
全羅北道 | 218 | 0.9% |
明川郡 | 217 | 0.9% |
黃海道 | 209 | 0.9% |
Other values (5359) | 18444 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 18031 | |
道 | 5076 | 6.2% |
郡 | 4099 | 5.0% |
北 | 1996 | 2.4% |
南 | 1887 | 2.3% |
鏡 | 1762 | 2.1% |
咸 | 1687 | 2.0% |
1 | 1574 | 1.9% |
京 | 1558 | 1.9% |
城 | 1334 | 1.6% |
Other values (1250) | 43448 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 54798 | |
Space Separator | 18031 | 21.9% |
Decimal Number | 9418 | 11.4% |
Dash Punctuation | 72 | 0.1% |
Other Symbol | 45 | 0.1% |
Lowercase Letter | 45 | 0.1% |
Other Punctuation | 22 | < 0.1% |
Uppercase Letter | 7 | < 0.1% |
Math Symbol | 4 | < 0.1% |
Modifier Letter | 3 | < 0.1% |
Other values (3) | 7 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
道 | 5076 | 9.3% |
郡 | 4099 | 7.5% |
北 | 1996 | 3.6% |
南 | 1887 | 3.4% |
鏡 | 1762 | 3.2% |
咸 | 1687 | 3.1% |
京 | 1558 | 2.8% |
城 | 1334 | 2.4% |
畿 | 1141 | 2.1% |
川 | 976 | 1.8% |
Other values (1195) | 33282 |
Lowercase Letter
Value | Count | Frequency (%) |
r | 6 | |
t | 4 | 8.9% |
o | 4 | 8.9% |
w | 4 | 8.9% |
n | 3 | 6.7% |
h | 3 | 6.7% |
i | 3 | 6.7% |
a | 2 | 4.4% |
e | 2 | 4.4% |
k | 2 | 4.4% |
Other values (8) | 12 |
Decimal Number
Value | Count | Frequency (%) |
1 | 1574 | |
2 | 1273 | |
3 | 1042 | |
4 | 953 | |
5 | 907 | |
6 | 825 | |
7 | 753 | |
8 | 736 | |
9 | 687 | |
0 | 668 |
Other Punctuation
Value | Count | Frequency (%) |
/ | 6 | |
, | 5 | |
. | 5 | |
" | 2 | 9.1% |
& | 1 | 4.5% |
# | 1 | 4.5% |
; | 1 | 4.5% |
: | 1 | 4.5% |
Uppercase Letter
Value | Count | Frequency (%) |
F | 2 | |
E | 1 | |
K | 1 | |
C | 1 | |
I | 1 | |
G | 1 |
Math Symbol
Value | Count | Frequency (%) |
= | 1 | |
< | 1 | |
> | 1 | |
+ | 1 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 71 | |
― | 1 | 1.4% |
Other Symbol
Value | Count | Frequency (%) |
■ | 44 | |
▼ | 1 | 2.2% |
Space Separator
Value | Count | Frequency (%) |
(space) | 18031 |
Modifier Letter
Value | Count | Frequency (%) |
ー | 3 |
Open Punctuation
Value | Count | Frequency (%) |
( | 3 |
Close Punctuation
Value | Count | Frequency (%) |
) | 3 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Han | 54650 | |
Common | 27602 | |
Katakana | 126 | 0.2% |
Latin | 52 | 0.1% |
Hiragana | 22 | < 0.1% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
道 | 5076 | 9.3% |
郡 | 4099 | 7.5% |
北 | 1996 | 3.7% |
南 | 1887 | 3.5% |
鏡 | 1762 | 3.2% |
咸 | 1687 | 3.1% |
京 | 1558 | 2.9% |
城 | 1334 | 2.4% |
畿 | 1141 | 2.1% |
川 | 976 | 1.8% |
Other values (1160) | 33134 |
Katakana
Value | Count | Frequency (%) |
ノ | 50 | |
ロ | 6 | 4.8% |
ニ | 6 | 4.8% |
シ | 5 | 4.0% |
ス | 4 | 3.2% |
ト | 4 | 3.2% |
ヤ | 4 | 3.2% |
ア | 4 | 3.2% |
リ | 4 | 3.2% |
ク | 4 | 3.2% |
Other values (24) | 35 |
Common
Value | Count | Frequency (%) |
(space) | 18031 | |
1 | 1574 | 5.7% |
2 | 1273 | 4.6% |
3 | 1042 | 3.8% |
4 | 953 | 3.5% |
5 | 907 | 3.3% |
6 | 825 | 3.0% |
7 | 753 | 2.7% |
8 | 736 | 2.7% |
9 | 687 | 2.5% |
Other values (21) | 821 | 3.0% |
Latin
Value | Count | Frequency (%) |
r | 6 | 11.5% |
t | 4 | 7.7% |
o | 4 | 7.7% |
w | 4 | 7.7% |
n | 3 | 5.8% |
h | 3 | 5.8% |
i | 3 | 5.8% |
a | 2 | 3.8% |
e | 2 | 3.8% |
k | 2 | 3.8% |
Other values (14) | 19 |
Hiragana
Value | Count | Frequency (%) |
の | 22 |
Most occurring blocks
Value | Count | Frequency (%) |
CJK | 54319 | |
ASCII | 27605 | |
CJK Compat Ideographs | 328 | 0.4% |
Katakana | 129 | 0.2% |
Geometric Shapes | 45 | 0.1% |
Hiragana | 22 | < 0.1% |
CJK Ext A | 3 | < 0.1% |
Punctuation | 1 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 18031 | |
1 | 1574 | 5.7% |
2 | 1273 | 4.6% |
3 | 1042 | 3.8% |
4 | 953 | 3.5% |
5 | 907 | 3.3% |
6 | 825 | 3.0% |
7 | 753 | 2.7% |
8 | 736 | 2.7% |
9 | 687 | 2.5% |
Other values (41) | 824 | 3.0% |
CJK
Value | Count | Frequency (%) |
道 | 5076 | 9.3% |
郡 | 4099 | 7.5% |
北 | 1996 | 3.7% |
南 | 1887 | 3.5% |
鏡 | 1762 | 3.2% |
咸 | 1687 | 3.1% |
京 | 1558 | 2.9% |
城 | 1334 | 2.5% |
畿 | 1141 | 2.1% |
川 | 976 | 1.8% |
Other values (1116) | 32803 |
CJK Compat Ideographs
Value | Count | Frequency (%) |
不 | 101 | |
龍 | 67 | |
金 | 32 | 9.8% |
漣 | 11 | 3.4% |
栗 | 11 | 3.4% |
驪 | 10 | 3.0% |
寧 | 7 | 2.1% |
禮 | 7 | 2.1% |
連 | 7 | 2.1% |
良 | 6 | 1.8% |
Other values (32) | 69 |
Katakana
Value | Count | Frequency (%) |
ノ | 50 | |
ロ | 6 | 4.7% |
ニ | 6 | 4.7% |
シ | 5 | 3.9% |
ス | 4 | 3.1% |
ト | 4 | 3.1% |
ヤ | 4 | 3.1% |
ア | 4 | 3.1% |
リ | 4 | 3.1% |
ク | 4 | 3.1% |
Other values (25) | 38 |
Geometric Shapes
Value | Count | Frequency (%) |
■ | 44 | |
▼ | 1 | 2.2% |
Hiragana
Value | Count | Frequency (%) |
の | 22 |
CJK Ext A
Value | Count | Frequency (%) |
㷨 | 2 | |
䦍 | 1 |
Punctuation
Value | Count | Frequency (%) |
― | 1 |
ADDRESS
Text
MISSING
 
Distinct | 4513 |
---|---|
Distinct (%) | 78.4% |
Missing | 538 |
Missing (%) | 8.5% |
Memory size | 49.3 KiB |
Length
Max length | 86 |
---|---|
Median length | 42 |
Mean length | 15.210663 |
Min length | 1 |
Characters and Unicode
Total characters | 87583 |
---|---|
Distinct characters | 1336 |
Distinct categories | 13 |
Distinct scripts | 6 |
Distinct blocks | 10 |
Unique
Unique | 3808 |
---|---|
Unique (%) | 66.1% |
Sample
1st row | 子宅 |
---|---|
2nd row | 京畿道 京城府 漢江通 7 |
3rd row | 京畿道 京城府 昌信 以下 不詳 |
4th row | 咸鏡南道 北靑郡 新昌 新 85 |
5th row | 咸鏡南道 洪原郡 鶴泉 豊洞 411 |
Value | Count | Frequency (%) |
京畿道 | 2222 | 9.0% |
京城府 | 1499 | 6.1% |
咸鏡南道 | 554 | 2.3% |
間島 | 446 | 1.8% |
江原道 | 409 | 1.7% |
咸鏡北道 | 367 | 1.5% |
延吉 | 323 | 1.3% |
不定 | 229 | 0.9% |
慶尙北道 | 218 | 0.9% |
平安南道 | 173 | 0.7% |
Other values (5240) | 18156 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 18838 | |
道 | 4912 | 5.6% |
京 | 3784 | 4.3% |
郡 | 2600 | 3.0% |
畿 | 2223 | 2.5% |
1 | 2198 | 2.5% |
城 | 2098 | 2.4% |
府 | 2094 | 2.4% |
2 | 1614 | 1.8% |
南 | 1422 | 1.6% |
Other values (1326) | 45800 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 56346 | |
Space Separator | 18838 | 21.5% |
Decimal Number | 11552 | 13.2% |
Dash Punctuation | 336 | 0.4% |
Lowercase Letter | 266 | 0.3% |
Other Punctuation | 91 | 0.1% |
Other Symbol | 79 | 0.1% |
Uppercase Letter | 30 | < 0.1% |
Math Symbol | 20 | < 0.1% |
Open Punctuation | 7 | < 0.1% |
Other values (3) | 18 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
道 | 4912 | 8.7% |
京 | 3784 | 6.7% |
郡 | 2600 | 4.6% |
畿 | 2223 | 3.9% |
城 | 2098 | 3.7% |
府 | 2094 | 3.7% |
南 | 1422 | 2.5% |
北 | 1113 | 2.0% |
咸 | 1066 | 1.9% |
鏡 | 995 | 1.8% |
Other values (1274) | 34039 |
Lowercase Letter
Value | Count | Frequency (%) |
r | 37 | |
t | 24 | 9.0% |
o | 24 | 9.0% |
w | 24 | 9.0% |
h | 18 | 6.8% |
n | 18 | 6.8% |
i | 18 | 6.8% |
m | 12 | 4.5% |
k | 12 | 4.5% |
g | 12 | 4.5% |
Other values (8) | 67 |
Decimal Number
Value | Count | Frequency (%) |
1 | 2198 | |
2 | 1614 | |
3 | 1290 | |
4 | 1156 | |
5 | 1070 | |
6 | 944 | |
7 | 907 | |
8 | 834 | 7.2% |
9 | 785 | 6.8% |
0 | 754 | 6.5% |
Other Punctuation
Value | Count | Frequency (%) |
/ | 38 | |
. | 26 | |
" | 12 | 13.2% |
, | 8 | 8.8% |
: | 7 | 7.7% |
Uppercase Letter
Value | Count | Frequency (%) |
F | 6 | |
I | 6 | |
G | 6 | |
C | 6 | |
K | 6 |
Math Symbol
Value | Count | Frequency (%) |
< | 7 | |
> | 7 | |
= | 6 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 335 | |
― | 1 | 0.3% |
Other Symbol
Value | Count | Frequency (%) |
■ | 78 | |
▼ | 1 | 1.3% |
Open Punctuation
Value | Count | Frequency (%) |
( | 6 | |
「 | 1 | 14.3% |
Close Punctuation
Value | Count | Frequency (%) |
) | 6 | |
」 | 1 | 14.3% |
Space Separator
Value | Count | Frequency (%) |
(space) | 18838 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 6 |
Modifier Letter
Value | Count | Frequency (%) |
ー | 5 |
Most occurring scripts
Value | Count | Frequency (%) |
Han | 55853 | |
Common | 30941 | |
Katakana | 392 | 0.4% |
Latin | 296 | 0.3% |
Hiragana | 61 | 0.1% |
Hangul | 40 | < 0.1% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
道 | 4912 | 8.8% |
京 | 3784 | 6.8% |
郡 | 2600 | 4.7% |
畿 | 2223 | 4.0% |
城 | 2098 | 3.8% |
府 | 2094 | 3.7% |
南 | 1422 | 2.5% |
北 | 1113 | 2.0% |
咸 | 1066 | 1.9% |
鏡 | 995 | 1.8% |
Other values (1197) | 33546 |
Katakana
Value | Count | Frequency (%) |
ノ | 212 | |
ス | 23 | 5.9% |
ク | 16 | 4.1% |
ラ | 10 | 2.6% |
コ | 8 | 2.0% |
リ | 7 | 1.8% |
ル | 7 | 1.8% |
ト | 7 | 1.8% |
カ | 6 | 1.5% |
イ | 6 | 1.5% |
Other values (35) | 90 |
Hangul
Value | Count | Frequency (%) |
은 | 2 | 5.0% |
광 | 2 | 5.0% |
주 | 2 | 5.0% |
군 | 2 | 5.0% |
부 | 2 | 5.0% |
면 | 2 | 5.0% |
교 | 2 | 5.0% |
산 | 2 | 5.0% |
리 | 2 | 5.0% |
울 | 1 | 2.5% |
Other values (21) | 21 |
Common
Value | Count | Frequency (%) |
(space) | 18838 | |
1 | 2198 | 7.1% |
2 | 1614 | 5.2% |
3 | 1290 | 4.2% |
4 | 1156 | 3.7% |
5 | 1070 | 3.5% |
6 | 944 | 3.1% |
7 | 907 | 2.9% |
8 | 834 | 2.7% |
9 | 785 | 2.5% |
Other values (19) | 1305 | 4.2% |
Latin
Value | Count | Frequency (%) |
r | 37 | 12.5% |
t | 24 | 8.1% |
o | 24 | 8.1% |
w | 24 | 8.1% |
h | 18 | 6.1% |
n | 18 | 6.1% |
i | 18 | 6.1% |
m | 12 | 4.1% |
k | 12 | 4.1% |
g | 12 | 4.1% |
Other values (13) | 97 |
Hiragana
Value | Count | Frequency (%) |
の | 61 |
Most occurring blocks
Value | Count | Frequency (%) |
CJK | 55619 | |
ASCII | 31150 | |
Katakana | 397 | 0.5% |
CJK Compat Ideographs | 229 | 0.3% |
Geometric Shapes | 79 | 0.1% |
Hiragana | 61 | 0.1% |
Hangul | 40 | < 0.1% |
CJK Ext A | 5 | < 0.1% |
None | 2 | < 0.1% |
Punctuation | 1 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 18838 | |
1 | 2198 | 7.1% |
2 | 1614 | 5.2% |
3 | 1290 | 4.1% |
4 | 1156 | 3.7% |
5 | 1070 | 3.4% |
6 | 944 | 3.0% |
7 | 907 | 2.9% |
8 | 834 | 2.7% |
9 | 785 | 2.5% |
Other values (36) | 1514 | 4.9% |
CJK
Value | Count | Frequency (%) |
道 | 4912 | 8.8% |
京 | 3784 | 6.8% |
郡 | 2600 | 4.7% |
畿 | 2223 | 4.0% |
城 | 2098 | 3.8% |
府 | 2094 | 3.8% |
南 | 1422 | 2.6% |
北 | 1113 | 2.0% |
咸 | 1066 | 1.9% |
鏡 | 995 | 1.8% |
Other values (1149) | 33312 |
Katakana
Value | Count | Frequency (%) |
ノ | 212 | |
ス | 23 | 5.8% |
ク | 16 | 4.0% |
ラ | 10 | 2.5% |
コ | 8 | 2.0% |
リ | 7 | 1.8% |
ル | 7 | 1.8% |
ト | 7 | 1.8% |
カ | 6 | 1.5% |
イ | 6 | 1.5% |
Other values (36) | 95 |
Geometric Shapes
Value | Count | Frequency (%) |
■ | 78 | |
▼ | 1 | 1.3% |
Hiragana
Value | Count | Frequency (%) |
の | 61 |
CJK Compat Ideographs
Value | Count | Frequency (%) |
龍 | 57 | |
金 | 38 | |
不 | 23 | 10.0% |
樂 | 12 | 5.2% |
寧 | 8 | 3.5% |
漣 | 6 | 2.6% |
栗 | 6 | 2.6% |
醴 | 5 | 2.2% |
復 | 5 | 2.2% |
驪 | 5 | 2.2% |
Other values (34) | 64 |
Hangul
Value | Count | Frequency (%) |
은 | 2 | 5.0% |
광 | 2 | 5.0% |
주 | 2 | 5.0% |
군 | 2 | 5.0% |
부 | 2 | 5.0% |
면 | 2 | 5.0% |
교 | 2 | 5.0% |
산 | 2 | 5.0% |
리 | 2 | 5.0% |
울 | 1 | 2.5% |
Other values (21) | 21 |
CJK Ext A
Value | Count | Frequency (%) |
㷨 | 2 | |
䲖 | 1 | |
䁏 | 1 | |
䦍 | 1 |
Punctuation
Value | Count | Frequency (%) |
― | 1 |
None
Value | Count | Frequency (%) |
」 | 1 | |
「 | 1 |
INDICTMENT
Categorical
IMBALANCE
 
Distinct | 29 |
---|---|
Distinct (%) | 0.5% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 49.3 KiB |
<NA> | |
---|---|
治安維持法違反 | 86 |
治安維持法事件 | 44 |
保安法違反 | 16 |
出版法違反 | 5 |
Other values (24) | 31 |
Length
Max length | 29 |
---|---|
Median length | 4 |
Mean length | 4.10054 |
Min length | 4 |
Unique
Unique | 21 |
---|---|
Unique (%) | 0.3% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 6114 | |
治安維持法違反 | 86 | 1.4% |
治安維持法事件 | 44 | 0.7% |
保安法違反 | 16 | 0.3% |
出版法違反 | 5 | 0.1% |
朝鮮共産黨事件 | 5 | 0.1% |
治安維持法並ニ出版法違反 | 3 | < 0.1% |
治安法違反事件 | 2 | < 0.1% |
時局標榜强盜取調ノ爲メ | 1 | < 0.1% |
萬歲高唱, 不穩文件撒布(大正8年制令第7號 違反) | 1 | < 0.1% |
Other values (19) | 19 | 0.3% |
Length (histogram omitted)
Words
Value | Count | Frequency (%) |
na | 6114 | |
治安維持法違反 | 86 | 1.4% |
治安維持法事件 | 44 | 0.7% |
保安法違反 | 17 | 0.3% |
出版法違反 | 5 | 0.1% |
朝鮮共産黨事件 | 5 | 0.1% |
治安維持法並ニ出版法違反 | 3 | < 0.1% |
治安法違反事件 | 2 | < 0.1% |
瑞山 | 2 | < 0.1% |
治安維持法竝ニ出版法違反 | 1 | < 0.1% |
Other values (27) | 27 | 0.4% |
INDICTMENT_OFFICE
Categorical
IMBALANCE
 
Distinct | 16 |
---|---|
Distinct (%) | 0.3% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 49.3 KiB |
<NA> | |
---|---|
京畿道警察部 | 103 |
京城鍾路警察署 | 40 |
京城西大門警察署 | 28 |
京城鐘路警察署 | 9 |
Other values (11) | 29 |
Length
Max length | 9 |
---|---|
Median length | 4 |
Mean length | 4.0844981 |
Min length | 3 |
Unique
Unique | 7 |
---|---|
Unique (%) | 0.1% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 6087 | |
京畿道警察部 | 103 | 1.6% |
京城鍾路警察署 | 40 | 0.6% |
京城西大門警察署 | 28 | 0.4% |
京城鐘路警察署 | 9 | 0.1% |
鍾路警察署 | 9 | 0.1% |
京畿道 警察部 | 9 | 0.1% |
京城本町警察署 | 2 | < 0.1% |
京城東大門警察署 | 2 | < 0.1% |
鐘路警察署 | 1 | < 0.1% |
Other values (6) | 6 | 0.1% |
Length (histogram omitted)
Words
Value | Count | Frequency (%) |
na | 6087 | |
京畿道警察部 | 103 | 1.6% |
京城鍾路警察署 | 40 | 0.6% |
京城西大門警察署 | 28 | 0.4% |
京畿道 | 11 | 0.2% |
警察部 | 11 | 0.2% |
京城鐘路警察署 | 9 | 0.1% |
鍾路警察署 | 9 | 0.1% |
京城本町警察署 | 2 | < 0.1% |
京城東大門警察署 | 2 | < 0.1% |
Other values (6) | 6 | 0.1% |
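The tables above show the same office recorded under several spellings: with and without an internal space (京畿道警察部 and 京畿道 警察部) and with two variant characters (鍾 and 鐘). A minimal cleaning sketch follows. Treating 鍾 as the canonical form is an assumption made only for illustration, and `normalize_office` is a hypothetical helper.

```python
import re

# Assumed canonical choice: fold the variant character 鐘 into 鍾.
VARIANTS = str.maketrans({"鐘": "鍾"})

def normalize_office(name):
    """Remove all whitespace and fold known variant characters."""
    if name is None:
        return None
    return re.sub(r"\s+", "", name).translate(VARIANTS)

cleaned = [normalize_office(n) for n in ["京畿道 警察部", "京城鐘路警察署", None]]
```

Whether short forms such as 鍾路警察署 should also be merged with 京城鍾路警察署 is a separate judgment that this sketch does not attempt.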
INDICTMENT_DATE
Text
MISSING
 
Distinct | 104 |
---|---|
Distinct (%) | 48.6% |
Missing | 6082 |
Missing (%) | 96.6% |
Memory size | 49.3 KiB |
Length
Max length | 13 |
---|---|
Median length | 11 |
Mean length | 11.196262 |
Min length | 8 |
Characters and Unicode
Total characters | 2396 |
---|---|
Distinct characters | 19 |
Distinct categories | 4 |
Distinct scripts | 2 |
Distinct blocks | 3 |
Unique
Unique | 65 |
---|---|
Unique (%) | 30.4% |
Sample
1st row | 大正15年 7月 17日 |
---|---|
2nd row | 大正14年 8月 7日 |
3rd row | 大正12年 8月 30日 |
4th row | 昭和3年 9月 5日 |
5th row | 昭和3年 8月 26日 |
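The dates in this column use Japanese era years (大正, 昭和), which is why the profiler treats the column as text. A minimal sketch of converting them to Gregorian dates is given below. It assumes the `era N年 M月 D日` layout seen in the sample rows, uses the usual offsets (大正 year 1 = 1912, 昭和 year 1 = 1926, 明治 year 1 = 1868), and returns `None` for anything it cannot parse, such as cells containing the illegible-character mark ■. The name `parse_era_date` is hypothetical.

```python
import re
from datetime import date

# Gregorian year = offset + era year.
ERA_OFFSET = {"明治": 1867, "大正": 1911, "昭和": 1925}
ERA_DATE = re.compile(r"(明治|大正|昭和)\s*(\d+)年\s*(\d+)月\s*(\d+)日")

def parse_era_date(text):
    """Return a datetime.date for an era-style date string, or None."""
    if not text:
        return None
    match = ERA_DATE.search(text)
    if match is None:
        return None
    era = match.group(1)
    year, month, day = (int(g) for g in match.group(2, 3, 4))
    try:
        return date(ERA_OFFSET[era] + year, month, day)
    except ValueError:
        # Impossible calendar dates (e.g. month 13) are treated as unparseable.
        return None

first = parse_era_date("大正15年 7月 17日")
fourth = parse_era_date("昭和3年 9月 5日")
```

Because the pattern is searched rather than anchored, the same helper also picks the date out of strings with trailing text, such as the RELEASE values ending in 釋放.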
Value | Count | Frequency (%) |
昭和3年 | 108 | |
8月 | 77 | 12.0% |
大正15年 | 42 | 6.6% |
7月 | 32 | 5.0% |
26日 | 29 | 4.5% |
25日 | 26 | 4.1% |
4月 | 23 | 3.6% |
10月 | 21 | 3.3% |
昭和4年 | 20 | 3.1% |
昭和5年 | 17 | 2.7% |
Other values (44) | 246 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 427 | |
月 | 214 | |
年 | 214 | |
日 | 213 | |
1 | 177 | 7.4% |
和 | 149 | 6.2% |
昭 | 149 | 6.2% |
2 | 146 | 6.1% |
3 | 142 | 5.9% |
5 | 105 | 4.4% |
Other values (9) | 460 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 1069 | |
Decimal Number | 899 | |
Space Separator | 427 | 17.8% |
Other Symbol | 1 | < 0.1% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
1 | 177 | |
2 | 146 | |
3 | 142 | |
5 | 105 | |
8 | 89 | |
4 | 65 | 7.2% |
7 | 52 | 5.8% |
6 | 48 | 5.3% |
0 | 41 | 4.6% |
9 | 34 | 3.8% |
Other Letter
Value | Count | Frequency (%) |
月 | 214 | |
年 | 214 | |
日 | 213 | |
和 | 149 | |
昭 | 149 | |
大 | 65 | 6.1% |
正 | 65 | 6.1% |
Space Separator
Value | Count | Frequency (%) |
(space) | 427 |
Other Symbol
Value | Count | Frequency (%) |
■ | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 1327 | |
Han | 1069 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
(space) | 427 | |
1 | 177 | |
2 | 146 | 11.0% |
3 | 142 | 10.7% |
5 | 105 | 7.9% |
8 | 89 | 6.7% |
4 | 65 | 4.9% |
7 | 52 | 3.9% |
6 | 48 | 3.6% |
0 | 41 | 3.1% |
Other values (2) | 35 | 2.6% |
Han
Value | Count | Frequency (%) |
月 | 214 | |
年 | 214 | |
日 | 213 | |
和 | 149 | |
昭 | 149 | |
大 | 65 | 6.1% |
正 | 65 | 6.1% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 1326 | |
CJK | 1069 | |
Geometric Shapes | 1 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 427 | |
1 | 177 | |
2 | 146 | 11.0% |
3 | 142 | 10.7% |
5 | 105 | 7.9% |
8 | 89 | 6.7% |
4 | 65 | 4.9% |
7 | 52 | 3.9% |
6 | 48 | 3.6% |
0 | 41 | 3.1% |
CJK
Value | Count | Frequency (%) |
月 | 214 | |
年 | 214 | |
日 | 213 | |
和 | 149 | |
昭 | 149 | |
大 | 65 | 6.1% |
正 | 65 | 6.1% |
Geometric Shapes
Value | Count | Frequency (%) |
■ | 1 |
RELEASE
Text
MISSING
 
Distinct | 56 |
---|---|
Distinct (%) | 65.9% |
Missing | 6211 |
Missing (%) | 98.6% |
Memory size | 49.3 KiB |
Length
Max length | 18 |
---|---|
Median length | 14 |
Mean length | 13.894118 |
Min length | 10 |
Characters and Unicode
Total characters | 1181 |
---|---|
Distinct characters | 30 |
Distinct categories | 3 |
Distinct scripts | 2 |
Distinct blocks | 2 |
Unique
Unique | 38 |
---|---|
Unique (%) | 44.7% |
Sample
1st row | 昭和8年 9月 17日 釋放 |
---|---|
2nd row | 昭和4年 7月 2日 釋放 |
3rd row | 昭和3年 10月 4日 釋放 |
4th row | 昭和3年 9月 29日 釋放 |
5th row | 昭和4年 5月 5日 釋放 |
Value | Count | Frequency (%) |
釋放 | 83 | |
昭和3年 | 34 | 10.1% |
9月 | 32 | 9.5% |
昭和4年 | 17 | 5.0% |
昭和5年 | 15 | 4.5% |
昭和6年 | 9 | 2.7% |
17日 | 8 | 2.4% |
2月 | 8 | 2.4% |
7月 | 7 | 2.1% |
10月 | 7 | 2.1% |
Other values (37) | 117 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 252 | |
年 | 85 | 7.2% |
月 | 85 | 7.2% |
昭 | 84 | 7.1% |
日 | 84 | 7.1% |
和 | 84 | 7.1% |
釋 | 83 | 7.0% |
放 | 83 | 7.0% |
1 | 55 | 4.7% |
3 | 53 | 4.5% |
Other values (20) | 233 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 601 | |
Decimal Number | 328 | |
Space Separator | 252 |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
年 | 85 | |
月 | 85 | |
昭 | 84 | |
日 | 84 | |
和 | 84 | |
釋 | 83 | |
放 | 83 | |
大 | 2 | 0.3% |
檢 | 1 | 0.2% |
事 | 1 | 0.2% |
Other values (9) | 9 | 1.5% |
Decimal Number
Value | Count | Frequency (%) |
1 | 55 | |
3 | 53 | |
9 | 51 | |
2 | 36 | |
4 | 29 | |
6 | 27 | |
5 | 26 | |
7 | 22 | 6.7% |
8 | 15 | 4.6% |
0 | 14 | 4.3% |
Space Separator
Value | Count | Frequency (%) |
(space) | 252 |
Most occurring scripts
Value | Count | Frequency (%) |
Han | 601 | |
Common | 580 |
Most frequent character per script
Han
Value | Count | Frequency (%) |
年 | 85 | |
月 | 85 | |
昭 | 84 | |
日 | 84 | |
和 | 84 | |
釋 | 83 | |
放 | 83 | |
大 | 2 | 0.3% |
檢 | 1 | 0.2% |
事 | 1 | 0.2% |
Other values (9) | 9 | 1.5% |
Common
Value | Count | Frequency (%) |
(space) | 252 | |
1 | 55 | 9.5% |
3 | 53 | 9.1% |
9 | 51 | 8.8% |
2 | 36 | 6.2% |
4 | 29 | 5.0% |
6 | 27 | 4.7% |
5 | 26 | 4.5% |
7 | 22 | 3.8% |
8 | 15 | 2.6% |
Most occurring blocks
Value | Count | Frequency (%) |
CJK | 601 | |
ASCII | 580 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 252 | |
1 | 55 | 9.5% |
3 | 53 | 9.1% |
9 | 51 | 8.8% |
2 | 36 | 6.2% |
4 | 29 | 5.0% |
6 | 27 | 4.7% |
5 | 26 | 4.5% |
7 | 22 | 3.8% |
8 | 15 | 2.6% |
CJK
Value | Count | Frequency (%) |
年 | 85 | |
月 | 85 | |
昭 | 84 | |
日 | 84 | |
和 | 84 | |
釋 | 83 | |
放 | 83 | |
大 | 2 | 0.3% |
檢 | 1 | 0.2% |
事 | 1 | 0.2% |
Other values (9) | 9 | 1.5% |
CRIME_NAME
Text
MISSING
 
Distinct | 569 |
---|---|
Distinct (%) | 9.8% |
Missing | 467 |
Missing (%) | 7.4% |
Memory size | 49.3 KiB |
Value | Count | Frequency (%) |
治安維持法違反 | 3300 | |
保安法違反 | 1021 | 14.5% |
國家總動員法違反 | 337 | 4.8% |
治安維持法 | 336 | 4.8% |
保安法犯 | 259 | 3.7% |
出版法違反 | 179 | 2.5% |
國家總動員法 | 149 | 2.1% |
騷擾 | 114 | 1.6% |
强盜 | 90 | 1.3% |
殺人 | 70 | 1.0% |
Other values (410) | 1176 | 16.7% |
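The word table above shows one statute recorded under several wordings (for example 治安維持法違反 and 治安維持法, or 保安法違反 and 保安法犯). A minimal sketch of grouping entries by the statute they cite follows. The statute list is an assumption drawn only from the most frequent values shown here, `statutes_cited` is a hypothetical helper, and the combined example string is taken from the INDICTMENT values in this report.

```python
# Statutes to look for, taken from the most frequent values in the table above.
# (Illustrative list, not exhaustive.)
STATUTES = ["治安維持法", "國家總動員法", "保安法", "出版法"]

def statutes_cited(crime_name):
    """Return the known statutes mentioned in a crime-name string, in list order."""
    if not crime_name:
        return []
    return [statute for statute in STATUTES if statute in crime_name]

combined = statutes_cited("治安維持法並ニ出版法違反")
single = statutes_cited("保安法犯")
other = statutes_cited("强盜")
```

Entries that name no listed statute, such as 强盜 or 殺人, come back empty and would need their own categories.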
Most occurring characters
Value | Count | Frequency (%) |
法 | 6109 | |
違 | 5235 | |
反 | 5225 | |
安 | 5219 | |
治 | 3804 | |
維 | 3801 | |
持 | 3791 | |
保 | 1440 | 3.3% |
(space) | 1202 | 2.8% |
國 | 516 | 1.2% |
Other values (289) | 7272 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 42163 | |
Space Separator | 1202 | 2.8% |
Decimal Number | 75 | 0.2% |
Other Punctuation | 60 | 0.1% |
Other Symbol | 41 | 0.1% |
Close Punctuation | 32 | 0.1% |
Open Punctuation | 32 | 0.1% |
Other Number | 6 | < 0.1% |
Math Symbol | 3 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
法 | 6109 | |
違 | 5235 | |
反 | 5225 | |
安 | 5219 | |
治 | 3804 | |
維 | 3801 | |
持 | 3791 | |
保 | 1440 | 3.4% |
國 | 516 | 1.2% |
家 | 514 | 1.2% |
Other values (268) | 6509 |
Decimal Number
Value | Count | Frequency (%) |
1 | 15 | |
2 | 14 | |
8 | 14 | |
6 | 8 | |
7 | 8 | |
3 | 6 | 8.0% |
4 | 5 | 6.7% |
5 | 3 | 4.0% |
9 | 1 | 1.3% |
0 | 1 | 1.3% |
Other Punctuation
Value | Count | Frequency (%) |
, | 55 | |
. | 5 | 8.3% |
Close Punctuation
Value | Count | Frequency (%) |
) | 30 | |
」 | 2 | 6.2% |
Open Punctuation
Value | Count | Frequency (%) |
( | 30 | |
「 | 2 | 6.2% |
Other Number
Value | Count | Frequency (%) |
② | 3 | |
① | 3 |
Space Separator
Value | Count | Frequency (%) |
(space) | 1202 |
Other Symbol
Value | Count | Frequency (%) |
■ | 41 |
Math Symbol
Value | Count | Frequency (%) |
\| (vertical bar) | 3 |
Most occurring scripts
Value | Count | Frequency (%) |
Han | 41989 | |
Common | 1451 | 3.3% |
Katakana | 174 | 0.4% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
法 | 6109 | |
違 | 5235 | |
反 | 5225 | |
安 | 5219 | |
治 | 3804 | |
維 | 3801 | |
持 | 3791 | |
保 | 1440 | 3.4% |
國 | 516 | 1.2% |
家 | 514 | 1.2% |
Other values (251) | 6335 |
Common
Value | Count | Frequency (%) |
(space) | 1202 | |
, | 55 | 3.8% |
■ | 41 | 2.8% |
) | 30 | 2.1% |
( | 30 | 2.1% |
1 | 15 | 1.0% |
2 | 14 | 1.0% |
8 | 14 | 1.0% |
6 | 8 | 0.6% |
7 | 8 | 0.6% |
Other values (11) | 34 | 2.3% |
Katakana
Value | Count | Frequency (%) |
ニ | 47 | |
ル | 46 | |
ス | 46 | |
シ | 11 | 6.3% |
ジ | 7 | 4.0% |
ノ | 4 | 2.3% |
ン | 2 | 1.1% |
ト | 2 | 1.1% |
テ | 1 | 0.6% |
モ | 1 | 0.6% |
Other values (7) | 7 | 4.0% |
Most occurring blocks
Value | Count | Frequency (%) |
CJK | 41937 | |
ASCII | 1400 | 3.2% |
Katakana | 174 | 0.4% |
CJK Compat Ideographs | 52 | 0.1% |
Geometric Shapes | 41 | 0.1% |
Enclosed Alphanum | 6 | < 0.1% |
None | 4 | < 0.1% |
Most frequent character per block
CJK
Value | Count | Frequency (%) |
法 | 6109 | |
違 | 5235 | |
反 | 5225 | |
安 | 5219 | |
治 | 3804 | |
維 | 3801 | |
持 | 3791 | |
保 | 1440 | 3.4% |
國 | 516 | 1.2% |
家 | 514 | 1.2% |
Other values (246) | 6283 |
ASCII
Value | Count | Frequency (%) |
(space) | 1202 | |
, | 55 | 3.9% |
) | 30 | 2.1% |
( | 30 | 2.1% |
1 | 15 | 1.1% |
2 | 14 | 1.0% |
8 | 14 | 1.0% |
6 | 8 | 0.6% |
7 | 8 | 0.6% |
3 | 6 | 0.4% |
Other values (6) | 18 | 1.3% |
Katakana
Value | Count | Frequency (%) |
ニ | 47 | |
ル | 46 | |
ス | 46 | |
シ | 11 | 6.3% |
ジ | 7 | 4.0% |
ノ | 4 | 2.3% |
ン | 2 | 1.1% |
ト | 2 | 1.1% |
テ | 1 | 0.6% |
モ | 1 | 0.6% |
Other values (7) | 7 | 4.0% |
Geometric Shapes
Value | Count | Frequency (%) |
■ | 41 |
CJK Compat Ideographs
Value | Count | Frequency (%) |
暴 | 24 | |
陸 | 14 | |
不 | 12 | |
宅 | 1 | 1.9% |
臨 | 1 | 1.9% |
Enclosed Alphanum
Value | Count | Frequency (%) |
② | 3 | |
① | 3 |
None
Value | Count | Frequency (%) |
」 | 2 | |
「 | 2 |
CRIME_RECORD
Categorical
IMBALANCE
 
Distinct | 28 |
---|---|
Distinct (%) | 0.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 49.3 KiB |
<NA> | |
---|---|
無 | 17 |
高等課要視察人 | 12 |
高等手配 | 11 |
1 | 10 |
Other values (23) | 33 |
Length
Max length | 42 |
---|---|
Median length | 4 |
Mean length | 4.0193774 |
Min length | 1 |
Unique
Unique | 17 |
---|---|
Unique (%) | 0.3% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 6213 | |
無 | 17 | 0.3% |
高等課要視察人 | 12 | 0.2% |
高等手配 | 11 | 0.2% |
1 | 10 | 0.2% |
ナシ | 3 | < 0.1% |
排日政事要視察人 | 3 | < 0.1% |
高等課手配用 | 3 | < 0.1% |
刑事課手配用 | 3 | < 0.1% |
高等課 要視察人 | 2 | < 0.1% |
Other values (18) | 19 | 0.3% |
Length (histogram omitted)
Words
Value | Count | Frequency (%) |
na | 6213 | |
無 | 17 | 0.3% |
高等課要視察人 | 12 | 0.2% |
高等手配 | 12 | 0.2% |
1 | 10 | 0.2% |
ナシ | 3 | < 0.1% |
排日政事要視察人 | 3 | < 0.1% |
高等課手配用 | 3 | < 0.1% |
刑事課手配用 | 3 | < 0.1% |
排日 | 2 | < 0.1% |
Other values (27) | 30 | 0.5% |
PRISON_TERM
Text
MISSING
 
Distinct | 869 |
---|---|
Distinct (%) | 24.8% |
Missing | 2794 |
Missing (%) | 44.4% |
Memory size | 49.3 KiB |
Length
Max length | 86 |
---|---|
Median length | 47 |
Mean length | 6.2592804 |
Min length | 1 |
Characters and Unicode
Total characters | 21920 |
---|---|
Distinct characters | 139 |
Distinct categories | 9 |
Distinct scripts | 3 |
Distinct blocks | 7 |
Unique
Unique | 687 |
---|---|
Unique (%) | 19.6% |
Sample
1st row | 懲役15年 裁定210日 |
---|---|
2nd row | 懲役2年 |
3rd row | 懲役6月 |
4th row | 懲役6月 |
5th row | 懲役2年6月 未決35日 通算 |
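The sentence strings in this column follow a loose pattern: 懲役 (imprisonment with labor) followed by years (年) and/or months (月), sometimes with extra notes such as pre-trial detention days. A minimal sketch of extracting the sentence length in months is given below. It assumes that pattern, ignores everything after the first 懲役 term, and returns `None` when no such term is found. The name `sentence_months` is hypothetical.

```python
import re

# 懲役, then an optional "<n>年" and an optional "<n>月".
TERM = re.compile(r"懲役\s*(?:(\d+)年)?\s*(?:(\d+)月)?")

def sentence_months(text):
    """Return the 懲役 term in months, or None if no term can be read."""
    if not text:
        return None
    match = TERM.search(text)
    if match is None or (match.group(1) is None and match.group(2) is None):
        return None
    years = int(match.group(1) or 0)
    months = int(match.group(2) or 0)
    return years * 12 + months

terms = [sentence_months(s) for s in
         ["懲役15年 裁定210日", "懲役2年", "懲役6月", "懲役2年6月 未決35日 通算"]]
```

Values that omit 懲役 (the word table also lists bare tokens such as 6月 and 2年) are deliberately left unparsed here, since their meaning depends on the surrounding text.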
Value | Count | Frequency (%) |
懲役2年 | 342 | 7.3% |
懲役1年 | 336 | 7.2% |
6月 | 278 | 5.9% |
懲役6月 | 242 | 5.2% |
懲役8月 | 202 | 4.3% |
懲役1年6月 | 184 | 3.9% |
2年 | 176 | 3.8% |
1年 | 151 | 3.2% |
懲役3年 | 132 | 2.8% |
通算 | 124 | 2.6% |
Other values (713) | 2522 |
Most occurring characters
Value | Count | Frequency (%) |
年 | 2578 | |
役 | 2418 | 11.0% |
懲 | 2412 | 11.0% |
月 | 1670 | 7.6% |
1 | 1512 | 6.9% |
(space) | 1187 | 5.4% |
0 | 1100 | 5.0% |
6 | 1063 | 4.8% |
2 | 875 | 4.0% |
日 | 734 | 3.3% |
Other values (129) | 6371 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 13665 | |
Decimal Number | 6658 | |
Space Separator | 1187 | 5.4% |
Math Symbol | 157 | 0.7% |
Open Punctuation | 116 | 0.5% |
Close Punctuation | 115 | 0.5% |
Other Symbol | 16 | 0.1% |
Other Number | 4 | < 0.1% |
Other Punctuation | 2 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
年 | 2578 | |
役 | 2418 | |
懲 | 2412 | |
月 | 1670 | |
日 | 734 | 5.4% |
未 | 397 | 2.9% |
決 | 391 | 2.9% |
算 | 342 | 2.5% |
通 | 313 | 2.3% |
留 | 195 | 1.4% |
Other values (108) | 2215 |
Decimal Number
Value | Count | Frequency (%) |
1 | 1512 | |
0 | 1100 | |
6 | 1063 | |
2 | 875 | |
3 | 515 | 7.7% |
8 | 479 | 7.2% |
5 | 471 | 7.1% |
4 | 411 | 6.2% |
7 | 156 | 2.3% |
9 | 76 | 1.1% |
Open Punctuation
Value | Count | Frequency (%) |
( | 115 | |
[ | 1 | 0.9% |
Close Punctuation
Value | Count | Frequency (%) |
) | 114 | |
] | 1 | 0.9% |
Other Number
Value | Count | Frequency (%) |
② | 2 | |
① | 2 |
Other Punctuation
Value | Count | Frequency (%) |
, | 1 | |
… | 1 |
Space Separator
Value | Count | Frequency (%) |
(space) | 1187 |
Math Symbol
Value | Count | Frequency (%) |
\| (vertical bar) | 157 |
Other Symbol
Value | Count | Frequency (%) |
■ | 16 |
Most occurring scripts
Value | Count | Frequency (%) |
Han | 13555 | |
Common | 8255 | |
Katakana | 110 | 0.5% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
年 | 2578 | |
役 | 2418 | |
懲 | 2412 | |
月 | 1670 | |
日 | 734 | 5.4% |
未 | 397 | 2.9% |
決 | 391 | 2.9% |
算 | 342 | 2.5% |
通 | 313 | 2.3% |
留 | 195 | 1.4% |
Other values (94) | 2105 |
Common
Value | Count | Frequency (%) |
1 | 1512 | |
(space) | 1187 | |
0 | 1100 | |
6 | 1063 | |
2 | 875 | |
3 | 515 | 6.2% |
8 | 479 | 5.8% |
5 | 471 | 5.7% |
4 | 411 | 5.0% |
\| (vertical bar) | 157 | 1.9% |
Other values (11) | 485 | 5.9% |
Katakana
Value | Count | Frequency (%) |
ヲ | 47 | |
ニ | 36 | |
ノ | 7 | 6.4% |
ケ | 5 | 4.5% |
サ | 2 | 1.8% |
リ | 2 | 1.8% |
ナ | 2 | 1.8% |
キ | 2 | 1.8% |
ン | 2 | 1.8% |
ヒ | 1 | 0.9% |
Other values (4) | 4 | 3.6% |
Most occurring blocks
Value | Count | Frequency (%) |
CJK | 13549 | |
ASCII | 8234 | |
Katakana | 110 | 0.5% |
Geometric Shapes | 16 | 0.1% |
CJK Compat Ideographs | 6 | < 0.1% |
Enclosed Alphanum | 4 | < 0.1% |
Punctuation | 1 | < 0.1% |
Most frequent character per block
CJK
Value | Count | Frequency (%) |
年 | 2578 | |
役 | 2418 | |
懲 | 2412 | |
月 | 1670 | |
日 | 734 | 5.4% |
未 | 397 | 2.9% |
決 | 391 | 2.9% |
算 | 342 | 2.5% |
通 | 313 | 2.3% |
留 | 195 | 1.4% |
Other values (91) | 2099 |
ASCII
Value | Count | Frequency (%) |
1 | 1512 | |
(space) | 1187 | |
0 | 1100 | |
6 | 1063 | |
2 | 875 | |
3 | 515 | 6.3% |
8 | 479 | 5.8% |
5 | 471 | 5.7% |
4 | 411 | 5.0% |
\| (vertical bar) | 157 | 1.9% |
Other values (7) | 464 | 5.6% |
Katakana
Value | Count | Frequency (%) |
ヲ | 47 | |
ニ | 36 | |
ノ | 7 | 6.4% |
ケ | 5 | 4.5% |
サ | 2 | 1.8% |
リ | 2 | 1.8% |
ナ | 2 | 1.8% |
キ | 2 | 1.8% |
ン | 2 | 1.8% |
ヒ | 1 | 0.9% |
Other values (4) | 4 | 3.6% |
Geometric Shapes
Value | Count | Frequency (%) |
■ | 16 |
CJK Compat Ideographs
Value | Count | Frequency (%) |
不 | 3 | |
金 | 2 | |
勞 | 1 | 16.7% |
Enclosed Alphanum
Value | Count | Frequency (%) |
② | 2 | |
① | 2 |
Punctuation
Value | Count | Frequency (%) |
… | 1 |
PRISON_DATE
Text
MISSING
 
Distinct | 172 |
---|---|
Distinct (%) | 23.9% |
Missing | 5575 |
Missing (%) | 88.5% |
Memory size | 49.3 KiB |
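The percentages in this overview can be reproduced from the counts. The sketch below assumes that Distinct (%) is computed over non-missing rows rather than over all 6296 observations; that assumption is consistent with the figures shown here, but it is inferred from the numbers and not stated by the report.

```python
# Figures copied from the report: total observations, missing PRISON_DATE cells,
# and distinct PRISON_DATE values.
n_rows = 6296
n_missing = 5575
n_distinct = 172

n_present = n_rows - n_missing               # rows that actually carry a value
missing_pct = 100 * n_missing / n_rows       # share of all rows
distinct_pct = 100 * n_distinct / n_present  # share of non-missing rows (assumed denominator)

print(n_present, round(missing_pct, 1), round(distinct_pct, 1))
```

The 721 non-missing rows also match the count of each of 年, 月 and 日 in the character tables for this variable, which suggests one date per populated cell.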
Length
Max length | 12 |
---|---|
Median length | 11 |
Mean length | 10.832178 |
Min length | 10 |
Characters and Unicode
Total characters | 7810 |
---|---|
Distinct characters | 19 |
Distinct categories | 4 |
Distinct scripts | 2 |
Distinct blocks | 3 |
Unique
Unique | 83 |
---|---|
Unique (%) | 11.5% |
Sample
1st row | 大正9年 4月 12日 |
---|---|
2nd row | 大正8年 9月 27日 |
3rd row | 大正9年 11月 4日 |
4th row | 大正9年 11月 4日 |
5th row | 大正8年 5月 6日 |
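The sample values are Japanese era dates stored as text, so they need to be converted before any date arithmetic. A minimal sketch of such a conversion follows; `parse_era_date` and `ERA_OFFSET` are hypothetical names, only the two eras that appear in this column are handled, and values that do not match the pattern (for example the `■` placeholder) are returned as `None`.

```python
import re
from datetime import date

# Gregorian year immediately before year 1 of each era
# (Taisho 1 = 1912, Showa 1 = 1926).
ERA_OFFSET = {"大正": 1911, "昭和": 1925}

# Matches strings such as "大正9年 4月 12日"; anything after the day is ignored.
ERA_DATE = re.compile(r"(大正|昭和)\s*(\d+)年\s*(\d+)月\s*(\d+)日")

def parse_era_date(text):
    """Return a datetime.date for an era-style string, or None if it does not match."""
    if not isinstance(text, str):
        return None
    m = ERA_DATE.search(text)
    if m is None:
        return None
    era = m.group(1)
    year, month, day = int(m.group(2)), int(m.group(3)), int(m.group(4))
    try:
        return date(ERA_OFFSET[era] + year, month, day)
    except ValueError:  # impossible calendar date in the source text
        return None

print(parse_era_date("大正9年 4月 12日"))
```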
Value | Count | Frequency (%) |
大正8年 | 581 | |
5月 | 155 | 7.2% |
大正9年 | 139 | 6.4% |
9月 | 110 | 5.1% |
6月 | 94 | 4.3% |
10月 | 91 | 4.2% |
27日 | 83 | 3.8% |
2月 | 67 | 3.1% |
4月 | 61 | 2.8% |
8月 | 57 | 2.6% |
Other values (38) | 725 |
Most occurring characters
Value | Count | Frequency (%) |
1442 | ||
月 | 721 | |
年 | 721 | |
日 | 721 | |
大 | 720 | |
正 | 720 | |
8 | 720 | |
2 | 428 | 5.5% |
1 | 369 | 4.7% |
9 | 324 | 4.1% |
Other values (9) | 924 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 3605 | |
Decimal Number | 2762 | |
Space Separator | 1442 | 18.5% |
Other Symbol | 1 | < 0.1% |
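The category names in this table correspond to Unicode general categories. As an illustration only (the profiling tool's own implementation is not shown in this report), the same kind of tally can be sketched with Python's standard `unicodedata` module; `category_counts` is a hypothetical helper, and the codes it returns map to the labels above as `Lo` = Other Letter, `Nd` = Decimal Number, `Zs` = Space Separator, `So` = Other Symbol.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over all characters of the given strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

# Two sample rows from this column plus the placeholder symbol.
sample = ["大正9年 4月 12日", "大正8年 9月 27日", "■"]
counts = category_counts(sample)
print(counts)
```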
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
8 | 720 | |
2 | 428 | |
1 | 369 | |
9 | 324 | |
5 | 245 | 8.9% |
7 | 165 | 6.0% |
0 | 154 | 5.6% |
6 | 136 | 4.9% |
4 | 116 | 4.2% |
3 | 105 | 3.8% |
Other Letter
Value | Count | Frequency (%) |
月 | 721 | |
年 | 721 | |
日 | 721 | |
大 | 720 | |
正 | 720 | |
昭 | 1 | < 0.1% |
和 | 1 | < 0.1% |
Space Separator
Value | Count | Frequency (%) |
1442 |
Other Symbol
Value | Count | Frequency (%) |
■ | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 4205 | |
Han | 3605 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
1442 | ||
8 | 720 | |
2 | 428 | 10.2% |
1 | 369 | 8.8% |
9 | 324 | 7.7% |
5 | 245 | 5.8% |
7 | 165 | 3.9% |
0 | 154 | 3.7% |
6 | 136 | 3.2% |
4 | 116 | 2.8% |
Other values (2) | 106 | 2.5% |
Han
Value | Count | Frequency (%) |
月 | 721 | |
年 | 721 | |
日 | 721 | |
大 | 720 | |
正 | 720 | |
昭 | 1 | < 0.1% |
和 | 1 | < 0.1% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 4204 | |
CJK | 3605 | |
Geometric Shapes | 1 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
1442 | ||
8 | 720 | |
2 | 428 | 10.2% |
1 | 369 | 8.8% |
9 | 324 | 7.7% |
5 | 245 | 5.8% |
7 | 165 | 3.9% |
0 | 154 | 3.7% |
6 | 136 | 3.2% |
4 | 116 | 2.8% |
CJK
Value | Count | Frequency (%) |
月 | 721 | |
年 | 721 | |
日 | 721 | |
大 | 720 | |
正 | 720 | |
昭 | 1 | < 0.1% |
和 | 1 | < 0.1% |
Geometric Shapes
Value | Count | Frequency (%) |
■ | 1 |
SENTENCE_OFFICE
Text
MISSING
 
Distinct | 159 |
---|---|
Distinct (%) | 4.7% |
Missing | 2880 |
Missing (%) | 45.7% |
Memory size | 49.3 KiB |
Value | Count | Frequency (%) |
京城地方法院 | 912 | |
京城覆審法院 | 748 | |
京城地方 | 671 | |
京城覆審 | 149 | 4.3% |
淸津地方法院 | 106 | 3.1% |
淸津地方 | 103 | 3.0% |
咸興地方法院 | 92 | 2.6% |
平壤覆審法院 | 75 | 2.2% |
大邱覆審法院 | 64 | 1.8% |
鐵原支廳 | 45 | 1.3% |
Other values (149) | 508 |
Most occurring characters
Value | Count | Frequency (%) |
城 | 2620 | |
京 | 2615 | |
法 | 2197 | |
院 | 2188 | |
地 | 2179 | |
方 | 2162 | |
審 | 1081 | |
覆 | 1077 | |
淸 | 216 | 1.2% |
津 | 214 | 1.1% |
Other values (137) | 2198 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 18595 | |
Space Separator | 57 | 0.3% |
Decimal Number | 50 | 0.3% |
Open Punctuation | 14 | 0.1% |
Math Symbol | 14 | 0.1% |
Close Punctuation | 14 | 0.1% |
Other Punctuation | 2 | < 0.1% |
Other Symbol | 1 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
城 | 2620 | |
京 | 2615 | |
法 | 2197 | |
院 | 2188 | |
地 | 2179 | |
方 | 2162 | |
審 | 1081 | |
覆 | 1077 | |
淸 | 216 | 1.2% |
津 | 214 | 1.2% |
Other values (122) | 2046 |
Decimal Number
Value | Count | Frequency (%) |
1 | 14 | |
2 | 11 | |
5 | 6 | |
3 | 6 | |
0 | 5 | 10.0% |
4 | 4 | 8.0% |
9 | 2 | 4.0% |
6 | 1 | 2.0% |
7 | 1 | 2.0% |
Space Separator
Value | Count | Frequency (%) |
57 |
Open Punctuation
Value | Count | Frequency (%) |
( | 14 |
Math Symbol
Value | Count | Frequency (%) |
| | 14 |
Close Punctuation
Value | Count | Frequency (%) |
) | 14 |
Other Punctuation
Value | Count | Frequency (%) |
. | 2 |
Other Symbol
Value | Count | Frequency (%) |
■ | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Han | 18595 | |
Common | 152 | 0.8% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
城 | 2620 | |
京 | 2615 | |
法 | 2197 | |
院 | 2188 | |
地 | 2179 | |
方 | 2162 | |
審 | 1081 | |
覆 | 1077 | |
淸 | 216 | 1.2% |
津 | 214 | 1.2% |
Other values (122) | 2046 |
Common
Value | Count | Frequency (%) |
57 | ||
1 | 14 | 9.2% |
( | 14 | 9.2% |
| | 14 | 9.2% |
) | 14 | 9.2% |
2 | 11 | 7.2% |
5 | 6 | 3.9% |
3 | 6 | 3.9% |
0 | 5 | 3.3% |
4 | 4 | 2.6% |
Other values (5) | 7 | 4.6% |
Most occurring blocks
Value | Count | Frequency (%) |
CJK | 18591 | |
ASCII | 151 | 0.8% |
CJK Compat Ideographs | 4 | < 0.1% |
Geometric Shapes | 1 | < 0.1% |
Most frequent character per block
CJK
Value | Count | Frequency (%) |
城 | 2620 | |
京 | 2615 | |
法 | 2197 | |
院 | 2188 | |
地 | 2179 | |
方 | 2162 | |
審 | 1081 | |
覆 | 1077 | |
淸 | 216 | 1.2% |
津 | 214 | 1.2% |
Other values (120) | 2042 |
ASCII
Value | Count | Frequency (%) |
57 | ||
1 | 14 | 9.3% |
( | 14 | 9.3% |
| | 14 | 9.3% |
) | 14 | 9.3% |
2 | 11 | 7.3% |
5 | 6 | 4.0% |
3 | 6 | 4.0% |
0 | 5 | 3.3% |
4 | 4 | 2.6% |
Other values (4) | 6 | 4.0% |
CJK Compat Ideographs
Value | Count | Frequency (%) |
驪 | 3 | |
領 | 1 | 25.0% |
Geometric Shapes
Value | Count | Frequency (%) |
■ | 1 |
EXECUTIVE_PRISON
Categorical
IMBALANCE
 
Distinct | 7 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 49.3 KiB |
<NA> | 5322 |
---|---|
西大門監獄 | 698 |
京城監獄 | 268 |
京城西大門監獄 | 4 |
西大門 監獄 | 2 |
Other values (2) | 2 |
Length
Max length | 7 |
---|---|
Median length | 4 |
Mean length | 4.113723 |
Min length | 4 |
Unique
Unique | 2 |
---|---|
Unique (%) | < 0.1% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | <NA> |
5th row | 西大門監獄 |
Common Values
Value | Count | Frequency (%) |
<NA> | 5322 | |
西大門監獄 | 698 | 11.1% |
京城監獄 | 268 | 4.3% |
京城西大門監獄 | 4 | 0.1% |
西大門 監獄 | 2 | < 0.1% |
平壤監獄 | 1 | < 0.1% |
京城地方監獄 | 1 | < 0.1% |
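Two of the values above, 西大門監獄 and 西大門 監獄, differ only by an embedded space. A minimal sketch of collapsing such variants before counting is shown below; `normalize_name` is a hypothetical helper, and it deliberately does not merge longer forms such as 京城西大門監獄, since whether those refer to the same institution is a judgment this report does not settle.

```python
import re
from collections import Counter

def normalize_name(value):
    """Strip all whitespace inside a name; keep missing values as None."""
    if value is None:
        return None
    return re.sub(r"\s+", "", value)

# A few values as they appear in the table, including the spaced variant.
values = ["西大門監獄", "西大門 監獄", "京城監獄", None]
counts = Counter(normalize_name(v) for v in values)
print(counts)
```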
Common Values (Plot)
Value | Count | Frequency (%) |
na | 5322 | |
西大門監獄 | 698 | 11.1% |
京城監獄 | 268 | 4.3% |
京城西大門監獄 | 4 | 0.1% |
西大門 | 2 | < 0.1% |
監獄 | 2 | < 0.1% |
平壤監獄 | 1 | < 0.1% |
京城地方監獄 | 1 | < 0.1% |
PRISON
Categorical
IMBALANCE
 
Distinct | 38 |
---|---|
Distinct (%) | 0.6% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 49.3 KiB |
<NA> | 4015 |
---|---|
西大門刑務所 | 1815 |
京城刑務所 | 220 |
大田刑務所 | 60 |
咸興刑務所 | 51 |
Other values (33) | 135 |
Length
Max length | 17 |
---|---|
Median length | 4 |
Mean length | 4.6709022 |
Min length | 4 |
Unique
Unique | 17 |
---|---|
Unique (%) | 0.3% |
Sample
1st row | <NA> |
---|---|
2nd row | 西大門刑務所 |
3rd row | 西大門刑務所 |
4th row | 西大門刑務所 |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 4015 | |
西大門刑務所 | 1815 | |
京城刑務所 | 220 | 3.5% |
大田刑務所 | 60 | 1.0% |
咸興刑務所 | 51 | 0.8% |
大邱刑務所 | 24 | 0.4% |
仁川刑務所 | 22 | 0.3% |
金泉少年刑務所 | 17 | 0.3% |
金泉刑務所 | 10 | 0.2% |
京城西大門刑務所 | 8 | 0.1% |
Other values (28) | 54 | 0.9% |
Length
Value | Count | Frequency (%) |
na | 4015 | |
西大門刑務所 | 1816 | |
京城刑務所 | 220 | 3.5% |
大田刑務所 | 60 | 1.0% |
咸興刑務所 | 51 | 0.8% |
大邱刑務所 | 24 | 0.4% |
仁川刑務所 | 22 | 0.3% |
金泉少年刑務所 | 17 | 0.3% |
金泉刑務所 | 10 | 0.2% |
京城西大門刑務所 | 8 | 0.1% |
Other values (31) | 61 | 1.0% |
SENTENCE_DATE
Text
MISSING
 
Distinct | 895 |
---|---|
Distinct (%) | 33.5% |
Missing | 3624 |
Missing (%) | 57.6% |
Memory size | 49.3 KiB |
Length
Max length | 63 |
---|---|
Median length | 25 |
Mean length | 11.691617 |
Min length | 4 |
Characters and Unicode
Total characters | 31240 |
---|---|
Distinct characters | 40 |
Distinct categories | 7 |
Distinct scripts | 3 |
Distinct blocks | 5 |
Unique
Unique | 426 |
---|---|
Unique (%) | 15.9% |
Sample
1st row | 昭和16年 10月 31日 |
---|---|
2nd row | 昭和17年 7月 9日 |
3rd row | 大正9年 4月 12日 |
4th row | 大正8年 9月 27日 |
5th row | 昭和3年 2月 13日 |
Value | Count | Frequency (%) |
大正8年 | 601 | 7.5% |
昭和17年 | 513 | 6.4% |
7月 | 364 | 4.5% |
5月 | 312 | 3.9% |
6月 | 297 | 3.7% |
2月 | 274 | 3.4% |
10月 | 248 | 3.1% |
12月 | 240 | 3.0% |
昭和16年 | 238 | 3.0% |
8月 | 228 | 2.8% |
Other values (83) | 4715 |
Most occurring characters
Value | Count | Frequency (%) |
5358 | ||
1 | 3808 | |
年 | 2691 | |
月 | 2685 | |
日 | 2682 | |
昭 | 1913 | 6.1% |
和 | 1913 | 6.1% |
2 | 1803 | 5.8% |
8 | 1337 | 4.3% |
7 | 1268 | 4.1% |
Other values (30) | 5782 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 13466 | |
Decimal Number | 12364 | |
Space Separator | 5358 | 17.2% |
Math Symbol | 17 | 0.1% |
Close Punctuation | 16 | 0.1% |
Open Punctuation | 16 | 0.1% |
Other Symbol | 3 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
年 | 2691 | |
月 | 2685 | |
日 | 2682 | |
昭 | 1913 | |
和 | 1913 | |
大 | 775 | 5.8% |
正 | 775 | 5.8% |
算 | 4 | < 0.1% |
刑 | 3 | < 0.1% |
猶 | 3 | < 0.1% |
Other values (15) | 22 | 0.2% |
Decimal Number
Value | Count | Frequency (%) |
1 | 3808 | |
2 | 1803 | |
8 | 1337 | 10.8% |
7 | 1268 | 10.3% |
6 | 785 | 6.3% |
3 | 780 | 6.3% |
9 | 768 | 6.2% |
5 | 646 | 5.2% |
0 | 586 | 4.7% |
4 | 583 | 4.7% |
Space Separator
Value | Count | Frequency (%) |
5358 |
Math Symbol
Value | Count | Frequency (%) |
| | 17 |
Close Punctuation
Value | Count | Frequency (%) |
) | 16 |
Open Punctuation
Value | Count | Frequency (%) |
( | 16 |
Other Symbol
Value | Count | Frequency (%) |
■ | 3 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 17774 | |
Han | 13465 | |
Katakana | 1 | < 0.1% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
年 | 2691 | |
月 | 2685 | |
日 | 2682 | |
昭 | 1913 | |
和 | 1913 | |
大 | 775 | 5.8% |
正 | 775 | 5.8% |
算 | 4 | < 0.1% |
刑 | 3 | < 0.1% |
猶 | 3 | < 0.1% |
Other values (14) | 21 | 0.2% |
Common
Value | Count | Frequency (%) |
5358 | ||
1 | 3808 | |
2 | 1803 | 10.1% |
8 | 1337 | 7.5% |
7 | 1268 | 7.1% |
6 | 785 | 4.4% |
3 | 780 | 4.4% |
9 | 768 | 4.3% |
5 | 646 | 3.6% |
0 | 586 | 3.3% |
Other values (5) | 635 | 3.6% |
Katakana
Value | Count | Frequency (%) |
ノ | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 17771 | |
CJK | 13464 | |
Geometric Shapes | 3 | < 0.1% |
CJK Compat Ideographs | 1 | < 0.1% |
Katakana | 1 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
5358 | ||
1 | 3808 | |
2 | 1803 | 10.1% |
8 | 1337 | 7.5% |
7 | 1268 | 7.1% |
6 | 785 | 4.4% |
3 | 780 | 4.4% |
9 | 768 | 4.3% |
5 | 646 | 3.6% |
0 | 586 | 3.3% |
Other values (4) | 632 | 3.6% |
CJK
Value | Count | Frequency (%) |
年 | 2691 | |
月 | 2685 | |
日 | 2682 | |
昭 | 1913 | |
和 | 1913 | |
大 | 775 | 5.8% |
正 | 775 | 5.8% |
算 | 4 | < 0.1% |
刑 | 3 | < 0.1% |
猶 | 3 | < 0.1% |
Other values (13) | 20 | 0.1% |
Geometric Shapes
Value | Count | Frequency (%) |
■ | 3 |
CJK Compat Ideographs
Value | Count | Frequency (%) |
不 | 1 |
Katakana
Value | Count | Frequency (%) |
ノ | 1 |
ADMISSION_DATE
Text
MISSING
 
Distinct | 1070 |
---|---|
Distinct (%) | 45.3% |
Missing | 3935 |
Missing (%) | 62.5% |
Memory size | 49.3 KiB |
Length
Max length | 27 |
---|---|
Median length | 26 |
Mean length | 11.465481 |
Min length | 4 |
Characters and Unicode
Total characters | 27070 |
---|---|
Distinct characters | 44 |
Distinct categories | 7 |
Distinct scripts | 3 |
Distinct blocks | 4 |
Unique
Unique | 632 |
---|---|
Unique (%) | 26.8% |
Sample
1st row | 昭和16年 10月 31日 |
---|---|
2nd row | 昭和6年 11月 28日 |
3rd row | 昭和17年 7月 17日 |
4th row | 昭和8年 2月 23日 |
5th row | 昭和5年 8月 30日 |
Value | Count | Frequency (%) |
昭和5年 | 356 | 5.1% |
昭和17年 | 305 | 4.3% |
7月 | 303 | 4.3% |
12月 | 275 | 3.9% |
9月 | 259 | 3.7% |
5月 | 228 | 3.2% |
8月 | 210 | 3.0% |
昭和8年 | 198 | 2.8% |
11月 | 197 | 2.8% |
2月 | 178 | 2.5% |
Other values (109) | 4518 |
Most occurring characters
Value | Count | Frequency (%) |
4667 | ||
1 | 3119 | |
年 | 2374 | |
月 | 2347 | |
日 | 2332 | |
昭 | 2320 | |
和 | 2320 | |
2 | 1607 | 5.9% |
7 | 908 | 3.4% |
5 | 863 | 3.2% |
Other values (34) | 4213 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 11876 | |
Decimal Number | 10503 | |
Space Separator | 4667 | 17.2% |
Math Symbol | 9 | < 0.1% |
Other Symbol | 7 | < 0.1% |
Close Punctuation | 4 | < 0.1% |
Open Punctuation | 4 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
年 | 2374 | |
月 | 2347 | |
日 | 2332 | |
昭 | 2320 | |
和 | 2320 | |
大 | 46 | 0.4% |
正 | 46 | 0.4% |
刑 | 9 | 0.1% |
算 | 8 | 0.1% |
期 | 8 | 0.1% |
Other values (19) | 66 | 0.6% |
Decimal Number
Value | Count | Frequency (%) |
1 | 3119 | |
2 | 1607 | |
7 | 908 | 8.6% |
5 | 863 | 8.2% |
8 | 846 | 8.1% |
6 | 698 | 6.6% |
3 | 690 | 6.6% |
9 | 680 | 6.5% |
4 | 571 | 5.4% |
0 | 521 | 5.0% |
Space Separator
Value | Count | Frequency (%) |
4667 |
Math Symbol
Value | Count | Frequency (%) |
| | 9 |
Other Symbol
Value | Count | Frequency (%) |
■ | 7 |
Close Punctuation
Value | Count | Frequency (%) |
) | 4 |
Open Punctuation
Value | Count | Frequency (%) |
( | 4 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 15194 | |
Han | 11875 | |
Katakana | 1 | < 0.1% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
年 | 2374 | |
月 | 2347 | |
日 | 2332 | |
昭 | 2320 | |
和 | 2320 | |
大 | 46 | 0.4% |
正 | 46 | 0.4% |
刑 | 9 | 0.1% |
算 | 8 | 0.1% |
期 | 8 | 0.1% |
Other values (18) | 65 | 0.5% |
Common
Value | Count | Frequency (%) |
4667 | ||
1 | 3119 | |
2 | 1607 | 10.6% |
7 | 908 | 6.0% |
5 | 863 | 5.7% |
8 | 846 | 5.6% |
6 | 698 | 4.6% |
3 | 690 | 4.5% |
9 | 680 | 4.5% |
4 | 571 | 3.8% |
Other values (5) | 545 | 3.6% |
Katakana
Value | Count | Frequency (%) |
ノ | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 15187 | |
CJK | 11875 | |
Geometric Shapes | 7 | < 0.1% |
Katakana | 1 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
4667 | ||
1 | 3119 | |
2 | 1607 | 10.6% |
7 | 908 | 6.0% |
5 | 863 | 5.7% |
8 | 846 | 5.6% |
6 | 698 | 4.6% |
3 | 690 | 4.5% |
9 | 680 | 4.5% |
4 | 571 | 3.8% |
Other values (4) | 538 | 3.5% |
CJK
Value | Count | Frequency (%) |
年 | 2374 | |
月 | 2347 | |
日 | 2332 | |
昭 | 2320 | |
和 | 2320 | |
大 | 46 | 0.4% |
正 | 46 | 0.4% |
刑 | 9 | 0.1% |
算 | 8 | 0.1% |
期 | 8 | 0.1% |
Other values (18) | 65 | 0.5% |
Geometric Shapes
Value | Count | Frequency (%) |
■ | 7 |
Katakana
Value | Count | Frequency (%) |
ノ | 1 |
RELEASE_DATE
Text
MISSING
 
Distinct | 1662 |
---|---|
Distinct (%) | 51.9% |
Missing | 3091 |
Missing (%) | 49.1% |
Memory size | 49.3 KiB |
Length
Max length | 67 |
---|---|
Median length | 63 |
Mean length | 12.71014 |
Min length | 4 |
Characters and Unicode
Total characters | 40736 |
---|---|
Distinct characters | 61 |
Distinct categories | 8 |
Distinct scripts | 3 |
Distinct blocks | 5 |
Unique
Unique | 1062 |
---|---|
Unique (%) | 33.1% |
Sample
1st row | 昭和31年 4月 2日 |
---|---|
2nd row | 昭和8年 11月 27日 |
3rd row | 昭和18年 1月 17日 |
4th row | 大正9年 7月 11日 滿期 |
5th row | 昭和10年 7月 20日 |
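Some RELEASE_DATE values carry a trailing annotation after the date, as in the fourth sample row (滿期). A minimal sketch of separating the date text from that annotation follows; `split_release` is a hypothetical helper, and it assumes the date always comes first and uses one of the two eras seen in the samples.

```python
import re

# Leading era date, then whatever text follows it on the same line.
DATE_PART = re.compile(r"^\s*((?:大正|昭和)\s*\d+年\s*\d+月\s*\d+日)\s*(.*)$")

def split_release(text):
    """Return (date_text, note); date_text is None when no leading date is found."""
    m = DATE_PART.match(text)
    if m is None:
        return None, text.strip()
    return m.group(1), m.group(2).strip()

print(split_release("大正9年 7月 11日 滿期"))
```

Keeping the annotation in its own field makes it possible to tabulate release reasons separately from release dates.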
Value | Count | Frequency (%) |
4月 | 650 | 6.2% |
大正9年 | 649 | 6.2% |
28日 | 325 | 3.1% |
恩免 | 310 | 3.0% |
2月 | 300 | 2.9% |
昭和18年 | 277 | 2.6% |
昭和17年 | 274 | 2.6% |
6月 | 257 | 2.4% |
8月 | 246 | 2.3% |
7月 | 246 | 2.3% |
Other values (168) | 6962 |
Most occurring characters
Value | Count | Frequency (%) |
7291 | ||
1 | 4157 | |
年 | 3311 | 8.1% |
日 | 3287 | 8.1% |
月 | 3286 | 8.1% |
2 | 2794 | 6.9% |
昭 | 2506 | 6.2% |
和 | 2506 | 6.2% |
9 | 1501 | 3.7% |
8 | 1193 | 2.9% |
Other values (51) | 8904 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 18277 | |
Decimal Number | 15037 | |
Space Separator | 7291 | 17.9% |
Math Symbol | 80 | 0.2% |
Open Punctuation | 22 | 0.1% |
Close Punctuation | 22 | 0.1% |
Other Number | 4 | < 0.1% |
Other Symbol | 3 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
年 | 3311 | |
日 | 3287 | |
月 | 3286 | |
昭 | 2506 | |
和 | 2506 | |
大 | 783 | 4.3% |
正 | 782 | 4.3% |
恩 | 324 | 1.8% |
免 | 310 | 1.7% |
滿 | 232 | 1.3% |
Other values (34) | 950 | 5.2% |
Decimal Number
Value | Count | Frequency (%) |
1 | 4157 | |
2 | 2794 | |
9 | 1501 | 10.0% |
8 | 1193 | 7.9% |
4 | 1091 | 7.3% |
6 | 977 | 6.5% |
7 | 940 | 6.3% |
0 | 926 | 6.2% |
3 | 780 | 5.2% |
5 | 678 | 4.5% |
Other Number
Value | Count | Frequency (%) |
② | 2 | |
① | 2 |
Space Separator
Value | Count | Frequency (%) |
7291 |
Math Symbol
Value | Count | Frequency (%) |
| | 80 |
Open Punctuation
Value | Count | Frequency (%) |
( | 22 |
Close Punctuation
Value | Count | Frequency (%) |
) | 22 |
Other Symbol
Value | Count | Frequency (%) |
■ | 3 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 22459 | |
Han | 18274 | |
Katakana | 3 | < 0.1% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
年 | 3311 | |
日 | 3287 | |
月 | 3286 | |
昭 | 2506 | |
和 | 2506 | |
大 | 783 | 4.3% |
正 | 782 | 4.3% |
恩 | 324 | 1.8% |
免 | 310 | 1.7% |
滿 | 232 | 1.3% |
Other values (31) | 947 | 5.2% |
Common
Value | Count | Frequency (%) |
7291 | ||
1 | 4157 | |
2 | 2794 | 12.4% |
9 | 1501 | 6.7% |
8 | 1193 | 5.3% |
4 | 1091 | 4.9% |
6 | 977 | 4.4% |
7 | 940 | 4.2% |
0 | 926 | 4.1% |
3 | 780 | 3.5% |
Other values (7) | 809 | 3.6% |
Katakana
Value | Count | Frequency (%) |
ノ | 1 | |
ニ | 1 | |
テ | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 22452 | |
CJK | 18274 | |
Enclosed Alphanum | 4 | < 0.1% |
Geometric Shapes | 3 | < 0.1% |
Katakana | 3 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
7291 | ||
1 | 4157 | |
2 | 2794 | 12.4% |
9 | 1501 | 6.7% |
8 | 1193 | 5.3% |
4 | 1091 | 4.9% |
6 | 977 | 4.4% |
7 | 940 | 4.2% |
0 | 926 | 4.1% |
3 | 780 | 3.5% |
Other values (4) | 802 | 3.6% |
CJK
Value | Count | Frequency (%) |
年 | 3311 | |
日 | 3287 | |
月 | 3286 | |
昭 | 2506 | |
和 | 2506 | |
大 | 783 | 4.3% |
正 | 782 | 4.3% |
恩 | 324 | 1.8% |
免 | 310 | 1.7% |
滿 | 232 | 1.3% |
Other values (31) | 947 | 5.2% |
Geometric Shapes
Value | Count | Frequency (%) |
■ | 3 |
Enclosed Alphanum
Value | Count | Frequency (%) |
② | 2 | |
① | 2 |
Katakana
Value | Count | Frequency (%) |
ノ | 1 | |
ニ | 1 | |
テ | 1 |
CRIMINAL_RECORD
Text
MISSING
 
Distinct | 279 |
---|---|
Distinct (%) | 20.9% |
Missing | 4962 |
Missing (%) | 78.8% |
Memory size | 49.3 KiB |
Value | Count | Frequency (%) |
初犯 | 776 | |
ナシ | 99 | 6.2% |
1犯 | 58 | 3.6% |
2犯 | 24 | 1.5% |
保釋出監 | 23 | 1.4% |
1 | 23 | 1.4% |
保釋 | 18 | 1.1% |
刑執行猶豫 | 17 | 1.1% |
保安法違反 | 16 | 1.0% |
3年間 | 12 | 0.7% |
Other values (328) | 541 |
Most occurring characters
Value | Count | Frequency (%) |
犯 | 900 | 13.2% |
初 | 780 | 11.4% |
271 | 4.0% | |
1 | 269 | 3.9% |
年 | 209 | 3.1% |
2 | 151 | 2.2% |
日 | 147 | 2.1% |
3 | 131 | 1.9% |
シ | 125 | 1.8% |
ナ | 110 | 1.6% |
Other values (360) | 3746 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 5449 | |
Decimal Number | 1005 | 14.7% |
Space Separator | 271 | 4.0% |
Other Punctuation | 67 | 1.0% |
Open Punctuation | 21 | 0.3% |
Close Punctuation | 21 | 0.3% |
Other Symbol | 3 | < 0.1% |
Control | 2 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
犯 | 900 | 16.5% |
初 | 780 | 14.3% |
年 | 209 | 3.8% |
日 | 147 | 2.7% |
シ | 125 | 2.3% |
ナ | 110 | 2.0% |
昭 | 107 | 2.0% |
月 | 106 | 1.9% |
和 | 106 | 1.9% |
豫 | 97 | 1.8% |
Other values (343) | 2762 |
Decimal Number
Value | Count | Frequency (%) |
1 | 269 | |
2 | 151 | |
3 | 131 | |
0 | 106 | 10.5% |
5 | 80 | 8.0% |
7 | 60 | 6.0% |
9 | 56 | 5.6% |
8 | 54 | 5.4% |
4 | 53 | 5.3% |
6 | 45 | 4.5% |
Other Punctuation
Value | Count | Frequency (%) |
. | 62 | |
, | 5 | 7.5% |
Space Separator
Value | Count | Frequency (%) |
271 |
Open Punctuation
Value | Count | Frequency (%) |
( | 21 |
Close Punctuation
Value | Count | Frequency (%) |
) | 21 |
Other Symbol
Value | Count | Frequency (%) |
■ | 3 |
Control
Value | Count | Frequency (%) |
2 |
Most occurring scripts
Value | Count | Frequency (%) |
Han | 4933 | |
Common | 1390 | 20.3% |
Katakana | 516 | 7.5% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
犯 | 900 | 18.2% |
初 | 780 | 15.8% |
年 | 209 | 4.2% |
日 | 147 | 3.0% |
昭 | 107 | 2.2% |
月 | 106 | 2.1% |
和 | 106 | 2.1% |
豫 | 97 | 2.0% |
法 | 93 | 1.9% |
刑 | 92 | 1.9% |
Other values (316) | 2296 |
Katakana
Value | Count | Frequency (%) |
シ | 125 | |
ナ | 110 | |
ニ | 55 | |
テ | 32 | 6.2% |
ノ | 29 | 5.6% |
リ | 25 | 4.8% |
ル | 19 | 3.7% |
ヲ | 18 | 3.5% |
ケ | 15 | 2.9% |
ト | 14 | 2.7% |
Other values (17) | 74 |
Common
Value | Count | Frequency (%) |
271 | ||
1 | 269 | |
2 | 151 | |
3 | 131 | |
0 | 106 | 7.6% |
5 | 80 | 5.8% |
. | 62 | 4.5% |
7 | 60 | 4.3% |
9 | 56 | 4.0% |
8 | 54 | 3.9% |
Other values (7) | 150 |
Most occurring blocks
Value | Count | Frequency (%) |
CJK | 4919 | |
ASCII | 1387 | 20.3% |
Katakana | 516 | 7.5% |
CJK Compat Ideographs | 14 | 0.2% |
Geometric Shapes | 3 | < 0.1% |
Most frequent character per block
CJK
Value | Count | Frequency (%) |
犯 | 900 | 18.3% |
初 | 780 | 15.9% |
年 | 209 | 4.2% |
日 | 147 | 3.0% |
昭 | 107 | 2.2% |
月 | 106 | 2.2% |
和 | 106 | 2.2% |
豫 | 97 | 2.0% |
法 | 93 | 1.9% |
刑 | 92 | 1.9% |
Other values (311) | 2282 |
ASCII
Value | Count | Frequency (%) |
271 | ||
1 | 269 | |
2 | 151 | |
3 | 131 | |
0 | 106 | 7.6% |
5 | 80 | 5.8% |
. | 62 | 4.5% |
7 | 60 | 4.3% |
9 | 56 | 4.0% |
8 | 54 | 3.9% |
Other values (6) | 147 |
Katakana
Value | Count | Frequency (%) |
シ | 125 | |
ナ | 110 | |
ニ | 55 | |
テ | 32 | 6.2% |
ノ | 29 | 5.6% |
リ | 25 | 4.8% |
ル | 19 | 3.7% |
ヲ | 18 | 3.5% |
ケ | 15 | 2.9% |
ト | 14 | 2.7% |
Other values (17) | 74 |
CJK Compat Ideographs
Value | Count | Frequency (%) |
不 | 8 | |
金 | 2 | 14.3% |
留 | 2 | 14.3% |
連 | 1 | 7.1% |
例 | 1 | 7.1% |
Geometric Shapes
Value | Count | Frequency (%) |
■ | 3 |
NOTE
Text
MISSING
 
Distinct | 428 |
---|---|
Distinct (%) | 53.0% |
Missing | 5489 |
Missing (%) | 87.2% |
Memory size | 49.3 KiB |
Length
Max length | 157 |
---|---|
Median length | 41 |
Mean length | 9.7905824 |
Min length | 3 |
Characters and Unicode
Total characters | 7901 |
---|---|
Distinct characters | 664 |
Distinct categories | 8 |
Distinct scripts | 3 |
Distinct blocks | 5 |
Unique
Unique | 403 |
---|---|
Unique (%) | 49.9% |
Sample
1st row | 歸住地本籍地ナシトモ引受人ナシ |
---|---|
2nd row | 歸住地本籍地娘姜氏方 |
3rd row | 360日通算 |
4th row | 360日通算 |
5th row | 歸住地本籍 |
Value | Count | Frequency (%) |
歸住地本籍地 | 349 | |
歸住地 | 86 | 6.3% |
歸住地本籍 | 67 | 4.9% |
父 | 42 | 3.1% |
本籍地 | 36 | 2.6% |
方 | 22 | 1.6% |
妻 | 16 | 1.2% |
大正9年 | 16 | 1.2% |
歸住地住所地 | 13 | 0.9% |
高等課寫眞帳ヨリ復寫 | 11 | 0.8% |
Other values (566) | 717 |
Most occurring characters
Value | Count | Frequency (%) |
地 | 1249 | |
住 | 726 | 9.2% |
歸 | 696 | 8.8% |
本 | 589 | 7.5% |
籍 | 586 | 7.4% |
567 | 7.2% | |
方 | 175 | 2.2% |
父 | 125 | 1.6% |
金 | 62 | 0.8% |
1 | 60 | 0.8% |
Other values (654) | 3066 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 6993 | |
Space Separator | 567 | 7.2% |
Decimal Number | 315 | 4.0% |
Other Symbol | 18 | 0.2% |
Open Punctuation | 3 | < 0.1% |
Close Punctuation | 3 | < 0.1% |
Dash Punctuation | 1 | < 0.1% |
Control | 1 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
地 | 1249 | |
住 | 726 | 10.4% |
歸 | 696 | 10.0% |
本 | 589 | 8.4% |
籍 | 586 | 8.4% |
方 | 175 | 2.5% |
父 | 125 | 1.8% |
金 | 62 | 0.9% |
李 | 60 | 0.9% |
日 | 48 | 0.7% |
Other values (638) | 2677 |
Decimal Number
Value | Count | Frequency (%) |
1 | 60 | |
3 | 43 | |
2 | 42 | |
6 | 39 | |
9 | 37 | |
0 | 34 | |
4 | 18 | 5.7% |
7 | 16 | 5.1% |
5 | 15 | 4.8% |
8 | 11 | 3.5% |
Space Separator
Value | Count | Frequency (%) |
567 |
Other Symbol
Value | Count | Frequency (%) |
■ | 18 |
Open Punctuation
Value | Count | Frequency (%) |
( | 3 |
Close Punctuation
Value | Count | Frequency (%) |
) | 3 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 1 |
Control
Value | Count | Frequency (%) |
1 |
Most occurring scripts
Value | Count | Frequency (%) |
Han | 6772 | |
Common | 908 | 11.5% |
Katakana | 221 | 2.8% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
地 | 1249 | |
住 | 726 | 10.7% |
歸 | 696 | 10.3% |
本 | 589 | 8.7% |
籍 | 586 | 8.7% |
方 | 175 | 2.6% |
父 | 125 | 1.8% |
金 | 62 | 0.9% |
李 | 60 | 0.9% |
日 | 48 | 0.7% |
Other values (604) | 2456 |
Katakana
Value | Count | Frequency (%) |
リ | 31 | |
ニ | 28 | |
ヨ | 22 | 10.0% |
シ | 20 | 9.0% |
ノ | 19 | 8.6% |
ル | 12 | 5.4% |
ヲ | 10 | 4.5% |
ス | 7 | 3.2% |
ナ | 6 | 2.7% |
ハ | 5 | 2.3% |
Other values (24) | 61 |
Common
Value | Count | Frequency (%) |
567 | ||
1 | 60 | 6.6% |
3 | 43 | 4.7% |
2 | 42 | 4.6% |
6 | 39 | 4.3% |
9 | 37 | 4.1% |
0 | 34 | 3.7% |
4 | 18 | 2.0% |
■ | 18 | 2.0% |
7 | 16 | 1.8% |
Other values (6) | 34 | 3.7% |
Most occurring blocks
Value | Count | Frequency (%) |
CJK | 6747 | |
ASCII | 890 | 11.3% |
Katakana | 221 | 2.8% |
CJK Compat Ideographs | 25 | 0.3% |
Geometric Shapes | 18 | 0.2% |
Most frequent character per block
CJK
Value | Count | Frequency (%) |
地 | 1249 | |
住 | 726 | 10.8% |
歸 | 696 | 10.3% |
本 | 589 | 8.7% |
籍 | 586 | 8.7% |
方 | 175 | 2.6% |
父 | 125 | 1.9% |
金 | 62 | 0.9% |
李 | 60 | 0.9% |
日 | 48 | 0.7% |
Other values (594) | 2431 |
ASCII
Value | Count | Frequency (%) |
567 | ||
1 | 60 | 6.7% |
3 | 43 | 4.8% |
2 | 42 | 4.7% |
6 | 39 | 4.4% |
9 | 37 | 4.2% |
0 | 34 | 3.8% |
4 | 18 | 2.0% |
7 | 16 | 1.8% |
5 | 15 | 1.7% |
Other values (5) | 19 | 2.1% |
Katakana
Value | Count | Frequency (%) |
リ | 31 | |
ニ | 28 | |
ヨ | 22 | 10.0% |
シ | 20 | 9.0% |
ノ | 19 | 8.6% |
ル | 12 | 5.4% |
ヲ | 10 | 4.5% |
ス | 7 | 3.2% |
ナ | 6 | 2.7% |
ハ | 5 | 2.3% |
Other values (24) | 61 |
Geometric Shapes
Value | Count | Frequency (%) |
■ | 18 |
CJK Compat Ideographs
Value | Count | Frequency (%) |
不 | 11 | |
金 | 3 | 12.0% |
宅 | 2 | 8.0% |
女 | 2 | 8.0% |
李 | 2 | 8.0% |
柳 | 1 | 4.0% |
林 | 1 | 4.0% |
勞 | 1 | 4.0% |
漣 | 1 | 4.0% |
律 | 1 | 4.0% |
CRIMINAL_REASON
Text
MISSING
 
Distinct | 10 |
---|---|
Distinct (%) | 100.0% |
Missing | 6286 |
Missing (%) | 99.8% |
Memory size | 49.3 KiB |
Length
Max length | 31 |
---|---|
Median length | 9.5 |
Mean length | 14.4 |
Min length | 4 |
Characters and Unicode
Total characters | 144 |
---|---|
Distinct characters | 94 |
Distinct categories | 6 |
Distinct scripts | 3 |
Distinct blocks | 4 |
Unique
Unique | 10 |
---|---|
Unique (%) | 100.0% |
Sample
1st row | 韓建國ト稱シ軍資金募集 |
---|---|
2nd row | 大正15年 7月頃 秘密書類携帶上海出發平壤ニ向ヘリト |
3rd row | 爆發物取締規則 及 制令違反 1 |
4th row | 政治要視察人 |
5th row | 懲役2年 |
Value | Count | Frequency (%) |
韓建國ト稱シ軍資金募集 | 1 | 5.3% |
海州政治要視察人 | 1 | 5.3% |
蔡康鉉 | 1 | 5.3% |
崔鳳鎭 | 1 | 5.3% |
金顯貞 | 1 | 5.3% |
金明福 | 1 | 5.3% |
韓建團ト稱シ軍資募集 | 1 | 5.3% |
元北風會建設者 | 1 | 5.3% |
兇暴計劃者 | 1 | 5.3% |
懲役2年 | 1 | 5.3% |
Other values (9) | 9 |
Most occurring characters
Value | Count | Frequency (%) |
9 | 6.2% | |
ト | 4 | 2.8% |
5 | 3 | 2.1% |
, | 3 | 2.1% |
リ | 3 | 2.1% |
人 | 3 | 2.1% |
建 | 3 | 2.1% |
年 | 3 | 2.1% |
1 | 3 | 2.1% |
韓 | 2 | 1.4% |
Other values (84) | 108 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 121 | |
Space Separator | 9 | 6.2% |
Decimal Number | 9 | 6.2% |
Other Punctuation | 3 | 2.1% |
Open Punctuation | 1 | 0.7% |
Close Punctuation | 1 | 0.7% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
ト | 4 | 3.3% |
リ | 3 | 2.5% |
人 | 3 | 2.5% |
建 | 3 | 2.5% |
年 | 3 | 2.5% |
韓 | 2 | 1.7% |
海 | 2 | 1.7% |
發 | 2 | 1.7% |
ニ | 2 | 1.7% |
爆 | 2 | 1.7% |
Other values (75) | 95 |
Decimal Number
Value | Count | Frequency (%) |
5 | 3 | |
1 | 3 | |
9 | 1 | 11.1% |
7 | 1 | 11.1% |
2 | 1 | 11.1% |
Space Separator
Value | Count | Frequency (%) |
9 |
Other Punctuation
Value | Count | Frequency (%) |
, | 3 |
Open Punctuation
Value | Count | Frequency (%) |
( | 1 |
Close Punctuation
Value | Count | Frequency (%) |
) | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Han | 106 | |
Common | 23 | 16.0% |
Katakana | 15 | 10.4% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
人 | 3 | 2.8% |
建 | 3 | 2.8% |
年 | 3 | 2.8% |
韓 | 2 | 1.9% |
海 | 2 | 1.9% |
發 | 2 | 1.9% |
爆 | 2 | 1.9% |
治 | 2 | 1.9% |
携 | 2 | 1.9% |
要 | 2 | 1.9% |
Other values (67) | 83 |
Common
Value | Count | Frequency (%) |
9 | ||
5 | 3 | 13.0% |
, | 3 | 13.0% |
1 | 3 | 13.0% |
9 | 1 | 4.3% |
( | 1 | 4.3% |
7 | 1 | 4.3% |
2 | 1 | 4.3% |
) | 1 | 4.3% |
Katakana
Value | Count | Frequency (%) |
ト | 4 | |
リ | 3 | |
ニ | 2 | |
シ | 2 | |
セ | 1 | 6.7% |
ラ | 1 | 6.7% |
ヨ | 1 | 6.7% |
ヘ | 1 | 6.7% |
Most occurring blocks
Value | Count | Frequency (%) |
CJK | 105 | |
ASCII | 23 | 16.0% |
Katakana | 15 | 10.4% |
CJK Compat Ideographs | 1 | 0.7% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
9 | ||
5 | 3 | 13.0% |
, | 3 | 13.0% |
1 | 3 | 13.0% |
9 | 1 | 4.3% |
( | 1 | 4.3% |
7 | 1 | 4.3% |
2 | 1 | 4.3% |
) | 1 | 4.3% |
Katakana
Value | Count | Frequency (%) |
ト | 4 | |
リ | 3 | |
ニ | 2 | |
シ | 2 | |
セ | 1 | 6.7% |
ラ | 1 | 6.7% |
ヨ | 1 | 6.7% |
ヘ | 1 | 6.7% |
CJK
Value | Count | Frequency (%) |
人 | 3 | 2.9% |
建 | 3 | 2.9% |
年 | 3 | 2.9% |
韓 | 2 | 1.9% |
海 | 2 | 1.9% |
發 | 2 | 1.9% |
爆 | 2 | 1.9% |
治 | 2 | 1.9% |
携 | 2 | 1.9% |
要 | 2 | 1.9% |
Other values (66) | 82 |
CJK Compat Ideographs
Value | Count | Frequency (%) |
金 | 1 |
ACCOMPLICE_NAME
Text
MISSING
 
Distinct | 10 |
---|---|
Distinct (%) | 90.9% |
Missing | 6285 |
Missing (%) | 99.8% |
Memory size | 49.3 KiB |
Value | Count | Frequency (%) |
李英 | 3 | |
外20名 | 3 | |
崔養玉 | 2 | 8.3% |
李善九 | 2 | 8.3% |
金正連 | 2 | 8.3% |
外 | 2 | 8.3% |
20名 | 2 | 8.3% |
李永 | 1 | 4.2% |
金顯貞 | 1 | 4.2% |
李晶雨 | 1 | 4.2% |
Other values (5) | 5 |
Most occurring characters
Value | Count | Frequency (%) |
13 | ||
李 | 8 | 9.2% |
外 | 6 | 6.9% |
2 | 6 | 6.9% |
0 | 6 | 6.9% |
名 | 6 | 6.9% |
英 | 5 | 5.7% |
崔 | 3 | 3.4% |
金 | 3 | 3.4% |
, | 3 | 3.4% |
Other values (22) | 28 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 59 | |
Space Separator | 13 | 14.9% |
Decimal Number | 12 | 13.8% |
Other Punctuation | 3 | 3.4% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
李 | 8 | 13.6% |
外 | 6 | 10.2% |
名 | 6 | 10.2% |
英 | 5 | 8.5% |
崔 | 3 | 5.1% |
金 | 3 | 5.1% |
養 | 2 | 3.4% |
善 | 2 | 3.4% |
九 | 2 | 3.4% |
玉 | 2 | 3.4% |
Other values (18) | 20 |
Decimal Number
Value | Count | Frequency (%) |
2 | 6 | |
0 | 6 |
Space Separator
Value | Count | Frequency (%) |
13 |
Other Punctuation
Value | Count | Frequency (%) |
, | 3 |
Most occurring scripts
Value | Count | Frequency (%) |
Han | 59 | |
Common | 28 |
Most frequent character per script
Han
Value | Count | Frequency (%) |
李 | 8 | 13.6% |
外 | 6 | 10.2% |
名 | 6 | 10.2% |
英 | 5 | 8.5% |
崔 | 3 | 5.1% |
金 | 3 | 5.1% |
養 | 2 | 3.4% |
善 | 2 | 3.4% |
九 | 2 | 3.4% |
玉 | 2 | 3.4% |
Other values (18) | 20 |
Common
Value | Count | Frequency (%) |
13 | ||
2 | 6 | |
0 | 6 | |
, | 3 | 10.7% |
Most occurring blocks
Value | Count | Frequency (%) |
CJK | 58 | |
ASCII | 28 | |
CJK Compat Ideographs | 1 | 1.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
13 | ||
2 | 6 | |
0 | 6 | |
, | 3 | 10.7% |
CJK
Value | Count | Frequency (%) |
李 | 8 | |
外 | 6 | 10.3% |
名 | 6 | 10.3% |
英 | 5 | 8.6% |
崔 | 3 | 5.2% |
金 | 3 | 5.2% |
養 | 2 | 3.4% |
善 | 2 | 3.4% |
九 | 2 | 3.4% |
玉 | 2 | 3.4% |
Other values (17) | 19 |
CJK Compat Ideographs
Value | Count | Frequency (%) |
李 | 1 |
RELEASE_PLACE
Text
MISSING
 
Distinct | 9 |
---|---|
Distinct (%) | 81.8% |
Missing | 6285 |
Missing (%) | 99.8% |
Memory size | 49.3 KiB |
Value | Count | Frequency (%) |
本籍地 | 2 | 11.1% |
肩書地 | 2 | 11.1% |
忠淸南道 | 1 | 5.6% |
夫餘郡 | 1 | 5.6% |
場岩 | 1 | 5.6% |
長蝦 | 1 | 5.6% |
慶尙南道 | 1 | 5.6% |
金海郡 | 1 | 5.6% |
下界面 | 1 | 5.6% |
進水里 | 1 | 5.6% |
Other values (6) | 6 |
Most occurring characters
Value | Count | Frequency (%) |
7 | 9.1% | |
地 | 5 | 6.5% |
所 | 3 | 3.9% |
ニ | 3 | 3.9% |
住 | 3 | 3.9% |
籍 | 2 | 2.6% |
シ | 2 | 2.6% |
郡 | 2 | 2.6% |
本 | 2 | 2.6% |
道 | 2 | 2.6% |
Other values (42) | 46 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 67 | |
Space Separator | 7 | 9.1% |
Decimal Number | 3 | 3.9% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
地 | 5 | 7.5% |
所 | 3 | 4.5% |
ニ | 3 | 4.5% |
住 | 3 | 4.5% |
籍 | 2 | 3.0% |
シ | 2 | 3.0% |
郡 | 2 | 3.0% |
本 | 2 | 3.0% |
道 | 2 | 3.0% |
同 | 2 | 3.0% |
Other values (38) | 41 |
Decimal Number
Value | Count | Frequency (%) |
3 | 1 | |
0 | 1 | |
2 | 1 |
Space Separator
Value | Count | Frequency (%) |
7 |
Most occurring scripts
Value | Count | Frequency (%) |
Han | 56 | |
Katakana | 11 | 14.3% |
Common | 10 | 13.0% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
地 | 5 | 8.9% |
所 | 3 | 5.4% |
住 | 3 | 5.4% |
籍 | 2 | 3.6% |
郡 | 2 | 3.6% |
本 | 2 | 3.6% |
道 | 2 | 3.6% |
同 | 2 | 3.6% |
南 | 2 | 3.6% |
肩 | 2 | 3.6% |
Other values (30) | 31 |
Katakana
Value | Count | Frequency (%) |
ニ | 3 | |
シ | 2 | |
ノ | 1 | 9.1% |
タ | 1 | 9.1% |
ル | 1 | 9.1% |
ア | 1 | 9.1% |
リ | 1 | 9.1% |
ジ | 1 | 9.1% |
Common
Value | Count | Frequency (%) |
7 | ||
3 | 1 | 10.0% |
0 | 1 | 10.0% |
2 | 1 | 10.0% |
Most occurring blocks
Value | Count | Frequency (%) |
CJK | 55 | |
Katakana | 11 | 14.3% |
ASCII | 10 | 13.0% |
CJK Compat Ideographs | 1 | 1.3% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
7 | ||
3 | 1 | 10.0% |
0 | 1 | 10.0% |
2 | 1 | 10.0% |
CJK
Value | Count | Frequency (%) |
地 | 5 | 9.1% |
所 | 3 | 5.5% |
住 | 3 | 5.5% |
籍 | 2 | 3.6% |
郡 | 2 | 3.6% |
本 | 2 | 3.6% |
道 | 2 | 3.6% |
同 | 2 | 3.6% |
南 | 2 | 3.6% |
肩 | 2 | 3.6% |
Other values (29) | 30 |
Katakana
Value | Count | Frequency (%) |
ニ | 3 | |
シ | 2 | |
ノ | 1 | 9.1% |
タ | 1 | 9.1% |
ル | 1 | 9.1% |
ア | 1 | 9.1% |
リ | 1 | 9.1% |
ジ | 1 | 9.1% |
CJK Compat Ideographs
Value | Count | Frequency (%) |
暴 | 1 |
ARREST_OFFICE
Text
MISSING
 
Distinct | 160 |
---|---|
Distinct (%) | 17.0% |
Missing | 5355 |
Missing (%) | 85.1% |
Memory size | 49.3 KiB |
Value | Count | Frequency (%) |
京畿道 | 325 | |
京畿道警察部 | 178 | |
警察部 | 112 | 7.4% |
西大門警察署 | 104 | 6.9% |
京城 | 94 | 6.2% |
咸鏡南道 | 65 | 4.3% |
龍山警察署 | 62 | 4.1% |
本町警察署 | 59 | 3.9% |
鍾路警察署 | 49 | 3.2% |
京城鍾路警察署 | 25 | 1.7% |
Other values (132) | 442 |
Most occurring characters
Value | Count | Frequency (%) |
警 | 905 | |
察 | 905 | |
京 | 672 | 9.1% |
道 | 654 | 8.9% |
署 | 612 | 8.3% |
574 | 7.8% | |
畿 | 512 | 6.9% |
部 | 300 | 4.1% |
城 | 179 | 2.4% |
大 | 142 | 1.9% |
Other values (132) | 1924 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 6778 | |
Space Separator | 574 | 7.8% |
Decimal Number | 17 | 0.2% |
Other Punctuation | 6 | 0.1% |
Math Symbol | 2 | < 0.1% |
Other Symbol | 2 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
警 | 905 | |
察 | 905 | |
京 | 672 | 9.9% |
道 | 654 | 9.6% |
署 | 612 | 9.0% |
畿 | 512 | 7.6% |
部 | 300 | 4.4% |
城 | 179 | 2.6% |
大 | 142 | 2.1% |
門 | 136 | 2.0% |
Other values (122) | 1761 |
Decimal Number
Value | Count | Frequency (%) |
1 | 5 | |
6 | 4 | |
5 | 4 | |
2 | 2 | 11.8% |
0 | 1 | 5.9% |
9 | 1 | 5.9% |
Space Separator
Value | Count | Frequency (%) |
574 |
Other Punctuation
Value | Count | Frequency (%) |
. | 6 |
Math Symbol
Value | Count | Frequency (%) |
| | 2 |
Other Symbol
Value | Count | Frequency (%) |
■ | 2 |
Most occurring scripts
Value | Count | Frequency (%) |
Han | 6778 | |
Common | 601 | 8.1% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
警 | 905 | |
察 | 905 | |
京 | 672 | 9.9% |
道 | 654 | 9.6% |
署 | 612 | 9.0% |
畿 | 512 | 7.6% |
部 | 300 | 4.4% |
城 | 179 | 2.6% |
大 | 142 | 2.1% |
門 | 136 | 2.0% |
Other values (122) | 1761 |
Common
Value | Count | Frequency (%) |
574 | ||
. | 6 | 1.0% |
1 | 5 | 0.8% |
6 | 4 | 0.7% |
5 | 4 | 0.7% |
2 | 2 | 0.3% |
| | 2 | 0.3% |
■ | 2 | 0.3% |
0 | 1 | 0.2% |
9 | 1 | 0.2% |
Most occurring blocks
Value | Count | Frequency (%) |
CJK | 6730 | |
ASCII | 599 | 8.1% |
CJK Compat Ideographs | 48 | 0.7% |
Geometric Shapes | 2 | < 0.1% |
Most frequent character per block
CJK
Value | Count | Frequency (%) |
警 | 905 | |
察 | 905 | |
京 | 672 | 10.0% |
道 | 654 | 9.7% |
署 | 612 | 9.1% |
畿 | 512 | 7.6% |
部 | 300 | 4.5% |
城 | 179 | 2.7% |
大 | 142 | 2.1% |
門 | 136 | 2.0% |
Other values (118) | 1713 |
ASCII
Value | Count | Frequency (%) |
574 | ||
. | 6 | 1.0% |
1 | 5 | 0.8% |
6 | 4 | 0.7% |
5 | 4 | 0.7% |
2 | 2 | 0.3% |
| | 2 | 0.3% |
0 | 1 | 0.2% |
9 | 1 | 0.2% |
CJK Compat Ideographs
Value | Count | Frequency (%) |
領 | 25 | |
龍 | 21 | |
裡 | 1 | 2.1% |
禮 | 1 | 2.1% |
Geometric Shapes
Value | Count | Frequency (%) |
■ | 2 |
ARREST
Text
MISSING
 
Distinct | 215 |
---|---|
Distinct (%) | 50.2% |
Missing | 5868 |
Missing (%) | 93.2% |
Memory size | 49.3 KiB |
Length
Max length | 93 |
---|---|
Median length | 53 |
Mean length | 11.841121 |
Min length | 1 |
Characters and Unicode
Total characters | 5068 |
---|---|
Distinct characters | 244 |
Distinct categories | 7 |
Distinct scripts | 4 |
Distinct blocks | 4 |
Unique
Unique | 153 |
---|---|
Unique (%) | 35.7% |
Sample
1st row | 昭和9年 5月 18日 |
---|---|
2nd row | 昭和11年 7月 22日 |
3rd row | 昭和6年 5月 18日 |
4th row | 昭和6年 2月 |
5th row | 昭和9年 3月13日 |
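The sample rows above record dates in the Japanese Shōwa era, sometimes without a day and with inconsistent spacing. A minimal sketch of converting them to Gregorian components follows. It assumes Shōwa year N equals 1925 + N, and the names `parse_showa` and `SHOWA_OFFSET` are hypothetical helpers, not part of the dataset or the profiling tool. Other eras and the first-year notation 元年 are not handled.

```python
import re

# Assumption: Showa year N corresponds to Gregorian year 1925 + N (Showa 1 = 1926).
SHOWA_OFFSET = 1925

# Year is required; month and day are optional, with optional whitespace between parts.
_SHOWA = re.compile(r"昭和(\d+)年\s*(?:(\d+)月)?\s*(?:(\d+)日)?")


def parse_showa(text):
    """Return (year, month, day), using None for a missing month or day.

    Returns None when the text contains no Showa-era date at all.
    """
    m = _SHOWA.search(text)
    if m is None:
        return None
    year = SHOWA_OFFSET + int(m.group(1))
    month = int(m.group(2)) if m.group(2) else None
    day = int(m.group(3)) if m.group(3) else None
    return (year, month, day)


for row in ["昭和9年 5月 18日", "昭和6年 2月", "昭和9年 3月13日"]:
    print(row, "->", parse_showa(row))
```

Returning a tuple with `None` placeholders rather than a `datetime.date` keeps partial dates such as the 4th sample row representable.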
Value | Count | Frequency (%) |
昭和9年 | 128 | 11.3% |
5月 | 72 | 6.3% |
昭和11年 | 56 | 4.9% |
12月 | 48 | 4.2% |
2月 | 42 | 3.7% |
事件檢擧 | 40 | 3.5% |
1月 | 39 | 3.4% |
昭和6年 | 38 | 3.3% |
6月 | 38 | 3.3% |
21日 | 24 | 2.1% |
Other values (165) | 612 |
Most occurring characters
Value | Count | Frequency (%) |
709 | 14.0% | |
1 | 436 | 8.6% |
年 | 325 | 6.4% |
月 | 320 | 6.3% |
昭 | 318 | 6.3% |
和 | 318 | 6.3% |
日 | 280 | 5.5% |
2 | 237 | 4.7% |
9 | 167 | 3.3% |
5 | 135 | 2.7% |
Other values (234) | 1823 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 2981 | |
Decimal Number | 1366 | |
Space Separator | 709 | 14.0% |
Close Punctuation | 4 | 0.1% |
Open Punctuation | 4 | 0.1% |
Uppercase Letter | 2 | < 0.1% |
Other Punctuation | 2 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
年 | 325 | 10.9% |
月 | 320 | 10.7% |
昭 | 318 | 10.7% |
和 | 318 | 10.7% |
日 | 280 | 9.4% |
檢 | 83 | 2.8% |
擧 | 81 | 2.7% |
署 | 54 | 1.8% |
事 | 53 | 1.8% |
件 | 49 | 1.6% |
Other values (217) | 1100 |
Decimal Number
Value | Count | Frequency (%) |
1 | 436 | |
2 | 237 | |
9 | 167 | 12.2% |
5 | 135 | 9.9% |
6 | 116 | 8.5% |
0 | 80 | 5.9% |
3 | 66 | 4.8% |
8 | 52 | 3.8% |
7 | 43 | 3.1% |
4 | 34 | 2.5% |
Uppercase Letter
Value | Count | Frequency (%) |
C | 1 | |
M | 1 |
Other Punctuation
Value | Count | Frequency (%) |
. | 1 | |
, | 1 |
Space Separator
Value | Count | Frequency (%) |
709 |
Close Punctuation
Value | Count | Frequency (%) |
) | 4 |
Open Punctuation
Value | Count | Frequency (%) |
( | 4 |
Most occurring scripts
Value | Count | Frequency (%) |
Han | 2829 | |
Common | 2085 | |
Katakana | 152 | 3.0% |
Latin | 2 | < 0.1% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
年 | 325 | 11.5% |
月 | 320 | 11.3% |
昭 | 318 | 11.2% |
和 | 318 | 11.2% |
日 | 280 | 9.9% |
檢 | 83 | 2.9% |
擧 | 81 | 2.9% |
署 | 54 | 1.9% |
事 | 53 | 1.9% |
件 | 49 | 1.7% |
Other values (196) | 948 |
Katakana
Value | Count | Frequency (%) |
ニ | 37 | |
テ | 31 | |
ノ | 19 | |
シ | 10 | 6.6% |
ス | 7 | 4.6% |
ナ | 7 | 4.6% |
モ | 7 | 4.6% |
リ | 6 | 3.9% |
タ | 5 | 3.3% |
ル | 5 | 3.3% |
Other values (11) | 18 |
Common
Value | Count | Frequency (%) |
709 | ||
1 | 436 | |
2 | 237 | 11.4% |
9 | 167 | 8.0% |
5 | 135 | 6.5% |
6 | 116 | 5.6% |
0 | 80 | 3.8% |
3 | 66 | 3.2% |
8 | 52 | 2.5% |
7 | 43 | 2.1% |
Other values (5) | 44 | 2.1% |
Latin
Value | Count | Frequency (%) |
C | 1 | |
M | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
CJK | 2827 | |
ASCII | 2087 | |
Katakana | 152 | 3.0% |
CJK Compat Ideographs | 2 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
709 | ||
1 | 436 | |
2 | 237 | 11.4% |
9 | 167 | 8.0% |
5 | 135 | 6.5% |
6 | 116 | 5.6% |
0 | 80 | 3.8% |
3 | 66 | 3.2% |
8 | 52 | 2.5% |
7 | 43 | 2.1% |
Other values (7) | 46 | 2.2% |
CJK
Value | Count | Frequency (%) |
年 | 325 | 11.5% |
月 | 320 | 11.3% |
昭 | 318 | 11.2% |
和 | 318 | 11.2% |
日 | 280 | 9.9% |
檢 | 83 | 2.9% |
擧 | 81 | 2.9% |
署 | 54 | 1.9% |
事 | 53 | 1.9% |
件 | 49 | 1.7% |
Other values (194) | 946 |
Katakana
Value | Count | Frequency (%) |
ニ | 37 | |
テ | 31 | |
ノ | 19 | |
シ | 10 | 6.6% |
ス | 7 | 4.6% |
ナ | 7 | 4.6% |
モ | 7 | 4.6% |
リ | 6 | 3.9% |
タ | 5 | 3.3% |
ル | 5 | 3.3% |
Other values (11) | 18 |
CJK Compat Ideographs
Value | Count | Frequency (%) |
參 | 1 | |
勞 | 1 |
TYPES
Text
MISSING
 
Distinct | 362 |
---|---|
Distinct (%) | 79.2% |
Missing | 5839 |
Missing (%) | 92.7% |
Memory size | 49.3 KiB |
Length
Max length | 167 |
---|---|
Median length | 95 |
Mean length | 24.277899 |
Min length | 2 |
Characters and Unicode
Total characters | 11095 |
---|---|
Distinct characters | 861 |
Distinct categories | 10 |
Distinct scripts | 5 |
Distinct blocks | 9 |
Unique
Unique | 323 |
---|---|
Unique (%) | 70.7% |
Sample
1st row | 特高係 |
---|---|
2nd row | 安承樂, 安昌大, 姜壽求, 金東植, 鄭洛 等 共産主義者培養ノ溫床ナル役割ヲ爲ス |
3rd row | 共産主義者 李載裕及安炳春ノ指導ヲ受ケ李東千金七星等ヲ組織シタルモノ |
4th row | 手配用 原紙ナシ |
5th row | 京畿 安城郡 以下 不詳 尹瑨榮, 仁川府 外里 8 康復陽 當37年 |
Value | Count | Frequency (%) |
昭和9年 | 27 | 2.7% |
昭和6年 | 22 | 2.2% |
3月 | 20 | 2.0% |
5月 | 19 | 1.9% |
2月 | 17 | 1.7% |
豫審免訴 | 16 | 1.6% |
不起訴 | 15 | 1.5% |
治安維持法違反 | 14 | 1.4% |
4月 | 12 | 1.2% |
勅令第45號 | 11 | 1.1% |
Other values (507) | 823 |
Most occurring characters
Value | Count | Frequency (%) |
536 | 4.8% | |
シ | 311 | 2.8% |
ヲ | 287 | 2.6% |
1 | 284 | 2.6% |
ニ | 242 | 2.2% |
年 | 227 | 2.0% |
ノ | 212 | 1.9% |
ル | 196 | 1.8% |
月 | 195 | 1.8% |
日 | 184 | 1.7% |
Other values (851) | 8421 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 9350 | |
Decimal Number | 1099 | 9.9% |
Space Separator | 536 | 4.8% |
Other Punctuation | 34 | 0.3% |
Open Punctuation | 22 | 0.2% |
Close Punctuation | 21 | 0.2% |
Other Symbol | 13 | 0.1% |
Modifier Letter | 9 | 0.1% |
Control | 8 | 0.1% |
Dash Punctuation | 3 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
シ | 311 | 3.3% |
ヲ | 287 | 3.1% |
ニ | 242 | 2.6% |
年 | 227 | 2.4% |
ノ | 212 | 2.3% |
ル | 196 | 2.1% |
月 | 195 | 2.1% |
日 | 184 | 2.0% |
和 | 156 | 1.7% |
昭 | 156 | 1.7% |
Other values (827) | 7184 |
Decimal Number
Value | Count | Frequency (%) |
1 | 284 | |
2 | 181 | |
5 | 117 | |
9 | 105 | 9.6% |
0 | 101 | 9.2% |
6 | 83 | 7.6% |
3 | 78 | 7.1% |
4 | 58 | 5.3% |
7 | 48 | 4.4% |
8 | 44 | 4.0% |
Other Punctuation
Value | Count | Frequency (%) |
. | 20 | |
, | 12 | |
: | 2 | 5.9% |
Dash Punctuation
Value | Count | Frequency (%) |
゠ | 1 | |
- | 1 | |
― | 1 |
Open Punctuation
Value | Count | Frequency (%) |
( | 21 | |
「 | 1 | 4.5% |
Modifier Letter
Value | Count | Frequency (%) |
ー | 6 | |
々 | 3 |
Space Separator
Value | Count | Frequency (%) |
536 |
Close Punctuation
Value | Count | Frequency (%) |
) | 21 |
Other Symbol
Value | Count | Frequency (%) |
■ | 13 |
Control
Value | Count | Frequency (%) |
8 |
Most occurring scripts
Value | Count | Frequency (%) |
Han | 6928 | |
Katakana | 2404 | 21.7% |
Common | 1742 | 15.7% |
Hangul | 19 | 0.2% |
Hiragana | 2 | < 0.1% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
年 | 227 | 3.3% |
月 | 195 | 2.8% |
日 | 184 | 2.7% |
和 | 156 | 2.3% |
昭 | 156 | 2.3% |
組 | 145 | 2.1% |
共 | 139 | 2.0% |
産 | 102 | 1.5% |
義 | 96 | 1.4% |
主 | 94 | 1.4% |
Other values (754) | 5434 |
Katakana
Value | Count | Frequency (%) |
シ | 311 | |
ヲ | 287 | |
ニ | 242 | |
ノ | 212 | 8.8% |
ル | 196 | 8.2% |
リ | 133 | 5.5% |
ト | 126 | 5.2% |
テ | 111 | 4.6% |
ス | 103 | 4.3% |
モ | 101 | 4.2% |
Other values (45) | 582 |
Common
Value | Count | Frequency (%) |
536 | ||
1 | 284 | |
2 | 181 | 10.4% |
5 | 117 | 6.7% |
9 | 105 | 6.0% |
0 | 101 | 5.8% |
6 | 83 | 4.8% |
3 | 78 | 4.5% |
4 | 58 | 3.3% |
7 | 48 | 2.8% |
Other values (13) | 151 | 8.7% |
Hangul
Value | Count | Frequency (%) |
동 | 2 | 10.5% |
조 | 2 | 10.5% |
갑 | 1 | 5.3% |
리 | 1 | 5.3% |
우 | 1 | 5.3% |
계 | 1 | 5.3% |
관 | 1 | 5.3% |
에 | 1 | 5.3% |
직 | 1 | 5.3% |
합 | 1 | 5.3% |
Other values (7) | 7 |
Hiragana
Value | Count | Frequency (%) |
し | 1 | |
じ | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
CJK | 6862 | |
Katakana | 2411 | 21.7% |
ASCII | 1720 | 15.5% |
CJK Compat Ideographs | 63 | 0.6% |
Hangul | 19 | 0.2% |
Geometric Shapes | 13 | 0.1% |
None | 4 | < 0.1% |
Hiragana | 2 | < 0.1% |
Punctuation | 1 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
536 | ||
1 | 284 | |
2 | 181 | 10.5% |
5 | 117 | 6.8% |
9 | 105 | 6.1% |
0 | 101 | 5.9% |
6 | 83 | 4.8% |
3 | 78 | 4.5% |
4 | 58 | 3.4% |
7 | 48 | 2.8% |
Other values (8) | 129 | 7.5% |
Katakana
Value | Count | Frequency (%) |
シ | 311 | |
ヲ | 287 | |
ニ | 242 | |
ノ | 212 | 8.8% |
ル | 196 | 8.1% |
リ | 133 | 5.5% |
ト | 126 | 5.2% |
テ | 111 | 4.6% |
ス | 103 | 4.3% |
モ | 101 | 4.2% |
Other values (47) | 589 |
CJK
Value | Count | Frequency (%) |
年 | 227 | 3.3% |
月 | 195 | 2.8% |
日 | 184 | 2.7% |
和 | 156 | 2.3% |
昭 | 156 | 2.3% |
組 | 145 | 2.1% |
共 | 139 | 2.0% |
産 | 102 | 1.5% |
義 | 96 | 1.4% |
主 | 94 | 1.4% |
Other values (736) | 5368 |
CJK Compat Ideographs
Value | Count | Frequency (%) |
不 | 21 | |
連 | 9 | |
勞 | 9 | |
暴 | 5 | 7.9% |
理 | 3 | 4.8% |
金 | 3 | 4.8% |
李 | 2 | 3.2% |
寧 | 2 | 3.2% |
六 | 1 | 1.6% |
林 | 1 | 1.6% |
Other values (7) | 7 | 11.1% |
Geometric Shapes
Value | Count | Frequency (%) |
■ | 13 |
None
Value | Count | Frequency (%) |
々 | 3 | |
「 | 1 | 25.0% |
Hangul
Value | Count | Frequency (%) |
동 | 2 | 10.5% |
조 | 2 | 10.5% |
갑 | 1 | 5.3% |
리 | 1 | 5.3% |
우 | 1 | 5.3% |
계 | 1 | 5.3% |
관 | 1 | 5.3% |
에 | 1 | 5.3% |
직 | 1 | 5.3% |
합 | 1 | 5.3% |
Other values (7) | 7 |
Hiragana
Value | Count | Frequency (%) |
し | 1 | |
じ | 1 |
Punctuation
Value | Count | Frequency (%) |
― | 1 |
WANDERPLACE
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 6296 |
---|---|
Missing (%) | 100.0% |
Memory size | 55.5 KiB |
PHOTOGRAPHING
Text
MISSING
 
Distinct | 1523 |
---|---|
Distinct (%) | 30.2% |
Missing | 1258 |
Missing (%) | 20.0% |
Memory size | 49.3 KiB |
Length
Max length | 27 |
---|---|
Median length | 26 |
Mean length | 21.990472 |
Min length | 10 |
Characters and Unicode
Total characters | 110788 |
---|---|
Distinct characters | 93 |
Distinct categories | 6 |
Distinct scripts | 4 |
Distinct blocks | 7 |
Unique
Unique | 856 |
---|---|
Unique (%) | 17.0% |
Sample
1st row | 昭和11年 9月 10日 刑事課ニ於テ復寫 |
---|---|
2nd row | 昭和16年 6月 21日 西大門刑務所ニ於テ撮影 |
3rd row | 昭和6年 10月 16日 西大門刑務所ニ於テ撮影 |
4th row | 昭和17年 6月 5日 西大門刑務所ニ於テ撮影 |
5th row | 昭和8年 3月 3日 西大門刑務所ニ於テ撮影 |
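Each sample row above combines a date with a note of the form "place ニ於テ action", where ニ於テ reads roughly as "at". A minimal sketch of separating the place from the action is shown below. The function name `split_photo_note` is hypothetical, and the pattern assumes that neither the place nor the action contains whitespace, which holds for the five sample rows but is not verified for the whole column.

```python
import re

# A run of non-space characters, the marker ニ於テ, then another run of non-space characters.
_PLACE_ACTION = re.compile(r"(\S+)ニ於テ(\S+)")


def split_photo_note(text):
    """Return (place, action), or None when the marker ニ於テ is absent."""
    m = _PLACE_ACTION.search(text)
    return (m.group(1), m.group(2)) if m else None


print(split_photo_note("昭和11年 9月 10日 刑事課ニ於テ復寫"))
print(split_photo_note("昭和16年 6月 21日 西大門刑務所ニ於テ撮影"))
```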
Value | Count | Frequency (%) |
西大門刑務所ニ於テ撮影 | 2966 | 14.9% |
昭和6年 | 786 | 3.9% |
12月 | 739 | 3.7% |
昭和5年 | 583 | 2.9% |
6月 | 541 | 2.7% |
昭和17年 | 520 | 2.6% |
9月 | 493 | 2.5% |
5月 | 478 | 2.4% |
1月 | 458 | 2.3% |
刑事課ニ於テ撮影 | 410 | 2.1% |
Other values (164) | 11969 |
Most occurring characters
Value | Count | Frequency (%) |
14513 | 13.1% | |
1 | 6877 | 6.2% |
年 | 5059 | 4.6% |
テ | 5036 | 4.5% |
ニ | 5036 | 4.5% |
於 | 5035 | 4.5% |
月 | 5020 | 4.5% |
昭 | 4863 | 4.4% |
和 | 4863 | 4.4% |
日 | 4846 | 4.4% |
Other values (83) | 49640 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 73989 | |
Decimal Number | 21884 | 19.8% |
Space Separator | 14905 | 13.5% |
Other Symbol | 8 | < 0.1% |
Close Punctuation | 1 | < 0.1% |
Open Punctuation | 1 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
年 | 5059 | 6.8% |
テ | 5036 | 6.8% |
ニ | 5036 | 6.8% |
於 | 5035 | 6.8% |
月 | 5020 | 6.8% |
昭 | 4863 | 6.6% |
和 | 4863 | 6.6% |
日 | 4846 | 6.5% |
影 | 4824 | 6.5% |
撮 | 4824 | 6.5% |
Other values (68) | 24583 |
Decimal Number
Value | Count | Frequency (%) |
1 | 6877 | |
2 | 3079 | |
6 | 2107 | 9.6% |
5 | 2037 | 9.3% |
7 | 1585 | 7.2% |
9 | 1416 | 6.5% |
3 | 1389 | 6.3% |
8 | 1209 | 5.5% |
0 | 1103 | 5.0% |
4 | 1082 | 4.9% |
Space Separator
Value | Count | Frequency (%) |
14513 | ||
392 | 2.6% |
Other Symbol
Value | Count | Frequency (%) |
■ | 8 |
Close Punctuation
Value | Count | Frequency (%) |
] | 1 |
Open Punctuation
Value | Count | Frequency (%) |
[ | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Han | 63913 | |
Common | 36799 | |
Katakana | 10074 | 9.1% |
Hangul | 2 | < 0.1% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
年 | 5059 | 7.9% |
於 | 5035 | 7.9% |
月 | 5020 | 7.9% |
昭 | 4863 | 7.6% |
和 | 4863 | 7.6% |
日 | 4846 | 7.6% |
影 | 4824 | 7.5% |
撮 | 4824 | 7.5% |
刑 | 3842 | 6.0% |
大 | 3473 | 5.4% |
Other values (63) | 17264 |
Common
Value | Count | Frequency (%) |
14513 | ||
1 | 6877 | |
2 | 3079 | 8.4% |
6 | 2107 | 5.7% |
5 | 2037 | 5.5% |
7 | 1585 | 4.3% |
9 | 1416 | 3.8% |
3 | 1389 | 3.8% |
8 | 1209 | 3.3% |
0 | 1103 | 3.0% |
Other values (5) | 1484 | 4.0% |
Katakana
Value | Count | Frequency (%) |
テ | 5036 | |
ニ | 5036 | |
ム | 2 | < 0.1% |
Hangul
Value | Count | Frequency (%) |
울 | 1 | |
서 | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
CJK | 63812 | |
ASCII | 36399 | |
Katakana | 10074 | 9.1% |
None | 392 | 0.4% |
CJK Compat Ideographs | 101 | 0.1% |
Geometric Shapes | 8 | < 0.1% |
Hangul | 2 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
14513 | ||
1 | 6877 | |
2 | 3079 | 8.5% |
6 | 2107 | 5.8% |
5 | 2037 | 5.6% |
7 | 1585 | 4.4% |
9 | 1416 | 3.9% |
3 | 1389 | 3.8% |
8 | 1209 | 3.3% |
0 | 1103 | 3.0% |
Other values (3) | 1084 | 3.0% |
CJK
Value | Count | Frequency (%) |
年 | 5059 | 7.9% |
於 | 5035 | 7.9% |
月 | 5020 | 7.9% |
昭 | 4863 | 7.6% |
和 | 4863 | 7.6% |
日 | 4846 | 7.6% |
影 | 4824 | 7.6% |
撮 | 4824 | 7.6% |
刑 | 3842 | 6.0% |
大 | 3473 | 5.4% |
Other values (62) | 17163 |
Katakana
Value | Count | Frequency (%) |
テ | 5036 | |
ニ | 5036 | |
ム | 2 | < 0.1% |
None
Value | Count | Frequency (%) |
392 |
CJK Compat Ideographs
Value | Count | Frequency (%) |
龍 | 101 |
Geometric Shapes
Value | Count | Frequency (%) |
■ | 8 |
Hangul
Value | Count | Frequency (%) |
울 | 1 | |
서 | 1 |
MISSING
 
Distinct | 4488 |
---|---|
Distinct (%) | 88.6% |
Missing | 1229 |
Missing (%) | 19.5% |
Memory size | 49.3 KiB |
Length
Max length | 47 |
---|---|
Median length | 11 |
Mean length | 11.044997 |
Min length | 4 |
Characters and Unicode
Total characters | 55965 |
---|---|
Distinct characters | 22 |
Distinct categories | 6 |
Distinct scripts | 3 |
Distinct blocks | 3 |
Unique
Unique | 4033 |
---|---|
Unique (%) | 79.6% |
Sample
1st row | (小) 第32297番 |
---|---|
2nd row | (小) 第49576番 |
3rd row | (小) 第17012番 |
4th row | (小) 第53977番 |
5th row | (小) 第22001番 |
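The sample rows above follow a fixed pattern: a parenthesised marker, then 第, a number, and 番. A minimal sketch of splitting the two parts is given below. The name `parse_reference` is hypothetical, and the report does not say what the marker means, so the code treats it as an opaque label.

```python
import re

# ASCII parentheses around the marker, optional whitespace, then 第<digits>番.
_REF = re.compile(r"\((.+?)\)\s*第(\d+)番")


def parse_reference(text):
    """Return (marker, number), or None when the value does not match the pattern."""
    m = _REF.search(text)
    return (m.group(1), int(m.group(2))) if m else None


print(parse_reference("(小) 第32297番"))
```

Values that do not match, such as free-text notes, return `None` and can be reviewed separately.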
Value | Count | Frequency (%) |
小 | 4889 | |
中 | 116 | 1.1% |
第5920番 | 8 | 0.1% |
第1733番 | 7 | 0.1% |
番 | 7 | 0.1% |
第15969番 | 6 | 0.1% |
第16300番 | 6 | 0.1% |
第16308番 | 6 | 0.1% |
第16301番 | 5 | < 0.1% |
第16326番 | 5 | < 0.1% |
Other values (4519) | 5105 |
Most occurring characters
Value | Count | Frequency (%) |
番 | 5144 | |
第 | 5137 | |
5093 | ||
( | 5088 | |
) | 5088 | |
小 | 4970 | 8.9% |
1 | 3960 | 7.1% |
2 | 3112 | 5.6% |
4 | 3026 | 5.4% |
5 | 2861 | 5.1% |
Other values (12) | 12486 |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 25239 | |
Other Letter | 15377 | |
Space Separator | 5093 | 9.1% |
Open Punctuation | 5088 | 9.1% |
Close Punctuation | 5088 | 9.1% |
Math Symbol | 80 | 0.1% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
1 | 3960 | |
2 | 3112 | |
4 | 3026 | |
5 | 2861 | |
3 | 2485 | |
6 | 2260 | |
7 | 2025 | |
8 | 1873 | |
9 | 1834 | |
0 | 1803 |
Other Letter
Value | Count | Frequency (%) |
番 | 5144 | |
第 | 5137 | |
小 | 4970 | |
中 | 116 | 0.8% |
ナ | 3 | < 0.1% |
シ | 3 | < 0.1% |
原 | 2 | < 0.1% |
板 | 2 | < 0.1% |
Space Separator
Value | Count | Frequency (%) |
5093 |
Open Punctuation
Value | Count | Frequency (%) |
( | 5088 |
Close Punctuation
Value | Count | Frequency (%) |
) | 5088 |
Math Symbol
Value | Count | Frequency (%) |
| | 80 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 40588 | |
Han | 15371 | 27.5% |
Katakana | 6 | < 0.1% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
5093 | ||
( | 5088 | |
) | 5088 | |
1 | 3960 | |
2 | 3112 | |
4 | 3026 | |
5 | 2861 | |
3 | 2485 | |
6 | 2260 | |
7 | 2025 | 5.0% |
Other values (4) | 5590 |
Han
Value | Count | Frequency (%) |
番 | 5144 | |
第 | 5137 | |
小 | 4970 | |
中 | 116 | 0.8% |
原 | 2 | < 0.1% |
板 | 2 | < 0.1% |
Katakana
Value | Count | Frequency (%) |
ナ | 3 | |
シ | 3 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 40588 | |
CJK | 15371 | 27.5% |
Katakana | 6 | < 0.1% |
Most frequent character per block
CJK
Value | Count | Frequency (%) |
番 | 5144 | |
第 | 5137 | |
小 | 4970 | |
中 | 116 | 0.8% |
原 | 2 | < 0.1% |
板 | 2 | < 0.1% |
ASCII
Value | Count | Frequency (%) |
5093 | ||
( | 5088 | |
) | 5088 | |
1 | 3960 | |
2 | 3112 | |
4 | 3026 | |
5 | 2861 | |
3 | 2485 | |
6 | 2260 | |
7 | 2025 | 5.0% |
Other values (4) | 5590 |
Katakana
Value | Count | Frequency (%) |
ナ | 3 | |
シ | 3 |
CAPTION
Text
MISSING
 
Distinct | 1399 |
---|---|
Distinct (%) | 69.8% |
Missing | 4293 |
Missing (%) | 68.2% |
Memory size | 49.3 KiB |
Length
Max length | 322 |
---|---|
Median length | 77 |
Mean length | 11.050424 |
Min length | 1 |
Characters and Unicode
Total characters | 22134 |
---|---|
Distinct characters | 988 |
Distinct categories | 13 |
Distinct scripts | 6 |
Distinct blocks | 7 |
Unique
Unique | 1220 |
---|---|
Unique (%) | 60.9% |
Sample
1st row | 697 |
---|---|
2nd row | 貴柱ヲ見よ |
3rd row | 原板ハ京城監獄ニアリ |
4th row | 寫眞ハ基寶ニアリ |
5th row | 9年5月21日 不起訴 |
Value | Count | Frequency (%) |
原板ハ京城監獄ニアリ | 237 | 6.2% |
ミ | 97 | 2.5% |
保釋 | 67 | 1.7% |
刑執行猶豫 | 53 | 1.4% |
未 | 47 | 1.2% |
キ | 46 | 1.2% |
滿釋 | 39 | 1.0% |
寫眞 | 33 | 0.9% |
豫審免訴 | 27 | 0.7% |
16未 | 24 | 0.6% |
Other values (1888) | 3166 |
Most occurring characters
Value | Count | Frequency (%) |
1824 | 8.2% | |
1 | 1310 | 5.9% |
. | 814 | 3.7% |
2 | 650 | 2.9% |
ハ | 564 | 2.5% |
ヲ | 415 | 1.9% |
ヨ | 393 | 1.8% |
見 | 393 | 1.8% |
7 | 387 | 1.7% |
3 | 386 | 1.7% |
Other values (978) | 14998 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 14359 | |
Decimal Number | 4607 | 20.8% |
Space Separator | 1824 | 8.2% |
Other Punctuation | 848 | 3.8% |
Open Punctuation | 141 | 0.6% |
Close Punctuation | 141 | 0.6% |
Other Symbol | 126 | 0.6% |
Lowercase Letter | 27 | 0.1% |
Uppercase Letter | 20 | 0.1% |
Math Symbol | 18 | 0.1% |
Other values (3) | 23 | 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
ハ | 564 | 3.9% |
ヲ | 415 | 2.9% |
ヨ | 393 | 2.7% |
見 | 393 | 2.7% |
ニ | 383 | 2.7% |
年 | 368 | 2.6% |
リ | 357 | 2.5% |
京 | 340 | 2.4% |
原 | 333 | 2.3% |
ア | 322 | 2.2% |
Other values (924) | 10491 |
Lowercase Letter
Value | Count | Frequency (%) |
a | 4 | |
r | 4 | |
o | 3 | |
t | 2 | 7.4% |
s | 2 | 7.4% |
e | 2 | 7.4% |
b | 2 | 7.4% |
f | 1 | 3.7% |
x | 1 | 3.7% |
i | 1 | 3.7% |
Other values (5) | 5 |
Decimal Number
Value | Count | Frequency (%) |
1 | 1310 | |
2 | 650 | |
7 | 387 | 8.4% |
3 | 386 | 8.4% |
6 | 357 | 7.7% |
4 | 342 | 7.4% |
0 | 339 | 7.4% |
5 | 311 | 6.8% |
8 | 283 | 6.1% |
9 | 242 | 5.3% |
Other Punctuation
Value | Count | Frequency (%) |
. | 814 | |
, | 22 | 2.6% |
' | 4 | 0.5% |
/ | 2 | 0.2% |
: | 2 | 0.2% |
& | 1 | 0.1% |
# | 1 | 0.1% |
; | 1 | 0.1% |
? | 1 | 0.1% |
Uppercase Letter
Value | Count | Frequency (%) |
P | 12 | |
C | 3 | 15.0% |
D | 2 | 10.0% |
B | 2 | 10.0% |
E | 1 | 5.0% |
Math Symbol
Value | Count | Frequency (%) |
+ | 7 | |
| | 7 | |
< | 2 | 11.1% |
> | 2 | 11.1% |
Open Punctuation
Value | Count | Frequency (%) |
( | 140 | |
[ | 1 | 0.7% |
Close Punctuation
Value | Count | Frequency (%) |
) | 140 | |
] | 1 | 0.7% |
Other Symbol
Value | Count | Frequency (%) |
■ | 119 | |
▼ | 7 | 5.6% |
Dash Punctuation
Value | Count | Frequency (%) |
゠ | 8 | |
- | 1 | 11.1% |
Space Separator
Value | Count | Frequency (%) |
1824 |
Control
Value | Count | Frequency (%) |
9 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 5 |
Most occurring scripts
Value | Count | Frequency (%) |
Han | 10809 | |
Common | 7728 | |
Katakana | 3289 | 14.9% |
Hangul | 257 | 1.2% |
Latin | 47 | 0.2% |
Hiragana | 4 | < 0.1% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
見 | 393 | 3.6% |
年 | 368 | 3.4% |
京 | 340 | 3.1% |
原 | 333 | 3.1% |
城 | 320 | 3.0% |
寫 | 311 | 2.9% |
眞 | 310 | 2.9% |
板 | 290 | 2.7% |
月 | 274 | 2.5% |
監 | 273 | 2.5% |
Other values (770) | 7597 |
Hangul
Value | Count | Frequency (%) |
로 | 12 | 4.7% |
으 | 10 | 3.9% |
기 | 10 | 3.9% |
년 | 7 | 2.7% |
일 | 6 | 2.3% |
월 | 5 | 1.9% |
의 | 5 | 1.9% |
드 | 5 | 1.9% |
카 | 5 | 1.9% |
고 | 5 | 1.9% |
Other values (95) | 187 |
Katakana
Value | Count | Frequency (%) |
ハ | 564 | |
ヲ | 415 | |
ヨ | 393 | |
ニ | 383 | |
リ | 357 | |
ア | 322 | |
ヘ | 149 | 4.5% |
ミ | 112 | 3.4% |
キ | 100 | 3.0% |
シ | 54 | 1.6% |
Other values (36) | 440 |
Common
Value | Count | Frequency (%) |
1824 | ||
1 | 1310 | |
. | 814 | |
2 | 650 | 8.4% |
7 | 387 | 5.0% |
3 | 386 | 5.0% |
6 | 357 | 4.6% |
4 | 342 | 4.4% |
0 | 339 | 4.4% |
5 | 311 | 4.0% |
Other values (24) | 1008 |
Latin
Value | Count | Frequency (%) |
P | 12 | |
a | 4 | 8.5% |
r | 4 | 8.5% |
o | 3 | 6.4% |
C | 3 | 6.4% |
t | 2 | 4.3% |
s | 2 | 4.3% |
e | 2 | 4.3% |
b | 2 | 4.3% |
D | 2 | 4.3% |
Other values (10) | 11 |
Hiragana
Value | Count | Frequency (%) |
よ | 2 | |
を | 1 | |
へ | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
CJK | 10671 | |
ASCII | 7641 | |
Katakana | 3297 | 14.9% |
Hangul | 257 | 1.2% |
CJK Compat Ideographs | 138 | 0.6% |
Geometric Shapes | 126 | 0.6% |
Hiragana | 4 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
1824 | ||
1 | 1310 | |
. | 814 | |
2 | 650 | 8.5% |
7 | 387 | 5.1% |
3 | 386 | 5.1% |
6 | 357 | 4.7% |
4 | 342 | 4.5% |
0 | 339 | 4.4% |
5 | 311 | 4.1% |
Other values (41) | 921 |
Katakana
Value | Count | Frequency (%) |
ハ | 564 | |
ヲ | 415 | |
ヨ | 393 | |
ニ | 383 | |
リ | 357 | |
ア | 322 | |
ヘ | 149 | 4.5% |
ミ | 112 | 3.4% |
キ | 100 | 3.0% |
シ | 54 | 1.6% |
Other values (37) | 448 |
CJK
Value | Count | Frequency (%) |
見 | 393 | 3.7% |
年 | 368 | 3.4% |
京 | 340 | 3.2% |
原 | 333 | 3.1% |
城 | 320 | 3.0% |
寫 | 311 | 2.9% |
眞 | 310 | 2.9% |
板 | 290 | 2.7% |
月 | 274 | 2.6% |
監 | 273 | 2.6% |
Other values (741) | 7459 |
Geometric Shapes
Value | Count | Frequency (%) |
■ | 119 | |
▼ | 7 | 5.6% |
CJK Compat Ideographs
Value | Count | Frequency (%) |
李 | 41 | |
龍 | 17 | |
金 | 11 | 8.0% |
不 | 8 | 5.8% |
利 | 8 | 5.8% |
烈 | 7 | 5.1% |
栗 | 6 | 4.3% |
林 | 5 | 3.6% |
洛 | 3 | 2.2% |
連 | 3 | 2.2% |
Other values (19) | 29 |
Hangul
Value | Count | Frequency (%) |
로 | 12 | 4.7% |
으 | 10 | 3.9% |
기 | 10 | 3.9% |
년 | 7 | 2.7% |
일 | 6 | 2.3% |
월 | 5 | 1.9% |
의 | 5 | 1.9% |
드 | 5 | 1.9% |
카 | 5 | 1.9% |
고 | 5 | 1.9% |
Other values (95) | 187 |
Hiragana
Value | Count | Frequency (%) |
よ | 2 | |
を | 1 | |
へ | 1 |
IMAGE_QUANTITY
Categorical
IMBALANCE
 
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 49.3 KiB |
Value | Count |
---|---|
2 | 6295 |
5 | 1 |
Length
Max length | 1 |
---|---|
Median length | 1 |
Mean length | 1 |
Min length | 1 |
Unique
Unique | 1 |
---|---|
Unique (%) | < 0.1% |
Sample
1st row | 2 |
---|---|
2nd row | 2 |
3rd row | 2 |
4th row | 2 |
5th row | 2 |
Common Values
Value | Count | Frequency (%) |
2 | 6295 | > 99.9% |
5 | 1 | < 0.1% |
CARD_TYPE
Categorical
Distinct | 4 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 49.3 KiB |
Value | Count |
---|---|
D | 4131 |
A | 1057 |
C | 697 |
B | 411 |
Length
Max length | 1 |
---|---|
Median length | 1 |
Mean length | 1 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | D |
---|---|
2nd row | D |
3rd row | D |
4th row | D |
5th row | A |
Common Values
Value | Count | Frequency (%) |
D | 4131 | 65.6% |
A | 1057 | 16.8% |
C | 697 | 11.1% |
B | 411 | 6.5% |
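The Frequency (%) column can be recomputed from the counts and the 6296 observations. The sketch below reproduces the rounding visible in this report, including the "< 0.1%" and "> 99.9%" edge labels. It is an illustration under that assumption, not the profiling tool's own code, and the name `format_frequency` is hypothetical.

```python
def format_frequency(count, total):
    """Format count/total as a percentage with one decimal place.

    Non-zero shares that round to 0.000 are shown as "< 0.1%", and
    incomplete shares that round to 1.000 are shown as "> 99.9%".
    """
    fraction = count / total
    if fraction > 0 and round(fraction, 3) == 0:
        return "< 0.1%"
    if fraction < 1 and round(fraction, 3) == 1:
        return "> 99.9%"
    return f"{fraction * 100:.1f}%"


TOTAL = 6296  # number of observations in the dataset
for value, count in [("D", 4131), ("A", 1057), ("C", 697), ("B", 411)]:
    print(value, count, format_frequency(count, TOTAL))
```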
REGISTER_DATE
Categorical
CONSTANT
 
Distinct | 1 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 49.3 KiB |
Value | Count |
---|---|
2014-11-30 00:00:00 | 6296 |
Length
Max length | 19 |
---|---|
Median length | 19 |
Mean length | 19 |
Min length | 19 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 2014-11-30 00:00:00 |
---|---|
2nd row | 2014-11-30 00:00:00 |
3rd row | 2014-11-30 00:00:00 |
4th row | 2014-11-30 00:00:00 |
5th row | 2014-11-30 00:00:00 |
Common Values
Value | Count | Frequency (%) |
2014-11-30 00:00:00 | 6296 | 100.0% |
REGISTRANT
Categorical
IMBALANCE
 
Distinct | 3 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 49.3 KiB |
Value | Count |
---|---|
ssc2013 | 6294 |
handb2 | 1 |
diquest | 1 |
Length
Max length | 7 |
---|---|
Median length | 7 |
Mean length | 6.9998412 |
Min length | 6 |
Unique
Unique | 2 ? |
---|---|
Unique (%) | < 0.1% |
Sample
1st row | ssc2013 |
---|---|
2nd row | ssc2013 |
3rd row | ssc2013 |
4th row | ssc2013 |
5th row | ssc2013 |
Common Values
Value | Count | Frequency (%) |
ssc2013 | 6294 | > 99.9% |
handb2 | 1 | < 0.1% |
diquest | 1 | < 0.1% |
MODIFY_DATE
Categorical
IMBALANCE
 
Distinct | 39 |
---|---|
Distinct (%) | 0.6% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 49.3 KiB |
Value | Count |
---|---|
2015-02-26 00:00:00 | 6258 |
2019-03-15 10:57:53 | 1 |
2020-06-16 18:52:36 | 1 |
2018-12-17 11:20:53 | 1 |
2020-03-16 09:49:04 | 1 |
Other values (34) | 34 |
Length
Max length | 19 |
---|---|
Median length | 19 |
Mean length | 19 |
Min length | 19 |
Unique
Unique | 38 |
---|---|
Unique (%) | 0.6% |
Sample
1st row | 2015-02-26 00:00:00 |
---|---|
2nd row | 2020-06-30 14:29:56 |
3rd row | 2015-02-26 00:00:00 |
4th row | 2015-02-26 00:00:00 |
5th row | 2020-06-16 18:52:36 |
Common Values
Value | Count | Frequency (%) |
2015-02-26 00:00:00 | 6258 | 99.4% |
2019-03-15 10:57:53 | 1 | < 0.1% |
2020-06-16 18:52:36 | 1 | < 0.1% |
2018-12-17 11:20:53 | 1 | < 0.1% |
2020-03-16 09:49:04 | 1 | < 0.1% |
2018-08-01 11:12:10 | 1 | < 0.1% |
2020-03-23 14:42:09 | 1 | < 0.1% |
2020-03-23 14:59:02 | 1 | < 0.1% |
2019-03-04 15:32:55 | 1 | < 0.1% |
2019-03-15 10:47:13 | 1 | < 0.1% |
Other values (29) | 29 | 0.5% |
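MODIFY_DATE is profiled as categorical, but its values are timestamps in a single fixed format, so they can be parsed and grouped. The sketch below counts modifications per year and skips the timestamp shared by 6258 rows. Treating that shared value as a bulk load rather than individual edits is an assumption, and the names `edits_by_year`, `BULK`, and `listed` are hypothetical.

```python
from collections import Counter
from datetime import datetime

FORMAT = "%Y-%m-%d %H:%M:%S"
# Assumption: the value shared by 6258 rows marks a bulk load, not an individual edit.
BULK = datetime(2015, 2, 26)


def edits_by_year(timestamps):
    """Count timestamps per calendar year, ignoring the bulk value."""
    parsed = (datetime.strptime(t, FORMAT) for t in timestamps)
    return Counter(d.year for d in parsed if d != BULK)


# The nine individually listed timestamps from the Common Values table above.
listed = [
    "2019-03-15 10:57:53", "2020-06-16 18:52:36", "2018-12-17 11:20:53",
    "2020-03-16 09:49:04", "2018-08-01 11:12:10", "2020-03-23 14:42:09",
    "2020-03-23 14:59:02", "2019-03-04 15:32:55", "2019-03-15 10:47:13",
]
print(edits_by_year(listed))
```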
Length
Value | Count | Frequency (%) |
2015-02-26 | 6258 | |
00:00:00 | 6258 | |
2017-01-20 | 4 | < 0.1% |
2020-05-11 | 3 | < 0.1% |
2019-03-15 | 3 | < 0.1% |
2020-03-23 | 2 | < 0.1% |
2016-12-12 | 2 | < 0.1% |
17:28:29 | 1 | < 0.1% |
14:11:56 | 1 | < 0.1% |
2020-07-14 | 1 | < 0.1% |
Other values (59) | 59 | 0.5% |
MODIFIER
Categorical
IMBALANCE
 
Distinct | 7 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 49.3 KiB |
Value | Count |
---|---|
ssc2013 | 6258 |
handb | 22 |
diquest | 5 |
handbsj5 | 4 |
handbsj10 | 4 |
Other values (2) | 3 |
Length
Max length | 9 |
---|---|
Median length | 7 |
Mean length | 6.9950762 |
Min length | 5 |
Unique
Unique | 1 ? |
---|---|
Unique (%) | < 0.1% |
Sample
1st row | ssc2013 |
---|---|
2nd row | handb |
3rd row | ssc2013 |
4th row | ssc2013 |
5th row | handb |
Common Values
Value | Count | Frequency (%) |
ssc2013 | 6258 | 99.4% |
handb | 22 | 0.3% |
diquest | 5 | 0.1% |
handbsj5 | 4 | 0.1% |
handbsj10 | 4 | 0.1% |
handbsj7 | 2 | < 0.1% |
handb2 | 1 | < 0.1% |
SORT_NO
Real number (ℝ)
Distinct | 7 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1.3140089 |
Minimum | 1 |
---|---|
Maximum | 7 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 55.5 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 1 |
Q1 | 1 |
median | 1 |
Q3 | 1 |
95-th percentile | 3 |
Maximum | 7 |
Range | 6 |
Interquartile range (IQR) | 0 |
Descriptive statistics
Standard deviation | 0.6808634 |
---|---|
Coefficient of variation (CV) | 0.51815737 |
Kurtosis | 10.566052 |
Mean | 1.3140089 |
Median Absolute Deviation (MAD) | 0 |
Skewness | 2.8661993 |
Sum | 8273 |
Variance | 0.46357497 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
1 | 4857 | 77.1% |
2 | 1075 | 17.1% |
3 | 244 | 3.9% |
4 | 79 | 1.3% |
5 | 30 | 0.5% |
6 | 9 | 0.1% |
7 | 2 | < 0.1% |
Value | Count | Frequency (%) |
1 | 4857 | 77.1% |
2 | 1075 | 17.1% |
3 | 244 | 3.9% |
4 | 79 | 1.3% |
5 | 30 | 0.5% |
6 | 9 | 0.1% |
7 | 2 | < 0.1% |
Value | Count | Frequency (%) |
7 | 2 | < 0.1% |
6 | 9 | 0.1% |
5 | 30 | 0.5% |
4 | 79 | 1.3% |
3 | 244 | 3.9% |
2 | 1075 | 17.1% |
1 | 4857 | 77.1% |
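The SORT_NO summary statistics can be cross-checked against the frequency table alone. The sketch below is a minimal recomputation, assuming the report uses the sample variance (dividing by n - 1); the variable names are illustrative.

```python
import math

# Value -> count, copied from the SORT_NO frequency table.
sort_no_counts = {1: 4857, 2: 1075, 3: 244, 4: 79, 5: 30, 6: 9, 7: 2}

n = sum(sort_no_counts.values())                        # number of observations
total = sum(v * c for v, c in sort_no_counts.items())   # the "Sum" row
mean = total / n

# Sample variance (n - 1 in the denominator); assumed to match the report.
variance = sum(c * (v - mean) ** 2 for v, c in sort_no_counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

print(n, total)
print(round(mean, 7), round(variance, 8), round(std, 7), round(cv, 8))
```

Under that assumption the results agree with the Mean, Variance, Standard deviation and Coefficient of variation rows above, which suggests the frequency table and the summary statistics were computed from the same 6296 observations.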
STATUS
Categorical
IMBALANCE
 
Distinct | 41 |
---|---|
Distinct (%) | 0.7% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 49.3 KiB |
<NA> | 3692 |
---|---|
常民 | 1902 |
平民 | 325 |
兩班 | 229 |
常 | 56 |
Other values (36) | 92 |
Length
Max length | 43 |
---|---|
Median length | 4 |
Mean length | 3.1871029 |
Min length | 1 |
Unique
Unique | 26 |
---|---|
Unique (%) | 0.4% |
Sample
1st row | 山西省 沁水縣人 |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | <NA> |
5th row | 平民 |
Common Values
Value | Count | Frequency (%) |
<NA> | 3692 | 58.6% |
常民 | 1902 | 30.2% |
平民 | 325 | 5.2% |
兩班 | 229 | 3.6% |
常 | 56 | 0.9% |
常人 | 34 | 0.5% |
兩班 | 15 | 0.2% |
士族 | 3 | < 0.1% |
ナシ | 2 | < 0.1% |
儒生 | 2 | < 0.1% |
Other values (31) | 36 | 0.6% |
Words
Value | Count | Frequency (%) |
na | 3692 | 58.6% |
常民 | 1902 | 30.2% |
平民 | 325 | 5.2% |
兩班 | 229 | 3.6% |
常 | 56 | 0.9% |
常人 | 34 | 0.5% |
兩班 | 15 | 0.2% |
士族 | 3 | < 0.1% |
山西省 | 2 | < 0.1% |
新聞社員 | 2 | < 0.1% |
Other values (36) | 42 | 0.7% |
PHOTOGRAPHING_DATE
MISSING
 
Distinct | 1271 |
---|---|
Distinct (%) | 25.2% |
Missing | 1258 |
Missing (%) | 20.0% |
Memory size | 49.3 KiB |
Length
Max length | 10 |
---|---|
Median length | 10 |
Mean length | 10 |
Min length | 10 |
Characters and Unicode
Total characters | 50380 |
---|---|
Distinct characters | 11 |
Distinct categories | 2 |
Distinct scripts | 1 |
Distinct blocks | 1 |
Unique
Unique | 657 |
---|---|
Unique (%) | 13.0% |
Sample
1st row | 1936-09-10 |
---|---|
2nd row | 1941-06-21 |
3rd row | 1931-10-16 |
4th row | 1942-06-05 |
5th row | 1933-03-03 |
Value | Count | Frequency (%) |
1930-12-12 | 83 | 1.6% |
1931-08-10 | 81 | 1.6% |
1930-12-13 | 68 | 1.3% |
1934-04-05 | 59 | 1.2% |
1931-06-12 | 50 | 1.0% |
1930-12-05 | 42 | 0.8% |
1928-09-15 | 39 | 0.8% |
1933-01-20 | 39 | 0.8% |
1931-06-16 | 37 | 0.7% |
1932-01-11 | 37 | 0.7% |
Other values (1261) | 4503 | 89.4% |
Most occurring characters
Value | Count | Frequency (%) |
1 | 10615 | 21.1% |
- | 10076 | 20.0% |
9 | 6731 | 13.4% |
0 | 6651 | 13.2% |
3 | 4847 | 9.6% |
2 | 4211 | 8.4% |
4 | 2502 | 5.0% |
5 | 1456 | 2.9% |
6 | 1358 | 2.7% |
8 | 1013 | 2.0% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 40304 | 80.0% |
Dash Punctuation | 10076 | 20.0% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
1 | 10615 | 26.3% |
9 | 6731 | 16.7% |
0 | 6651 | 16.5% |
3 | 4847 | 12.0% |
2 | 4211 | 10.4% |
4 | 2502 | 6.2% |
5 | 1456 | 3.6% |
6 | 1358 | 3.4% |
8 | 1013 | 2.5% |
7 | 920 | 2.3% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 10076 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 50380 | 100.0% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
1 | 10615 | 21.1% |
- | 10076 | 20.0% |
9 | 6731 | 13.4% |
0 | 6651 | 13.2% |
3 | 4847 | 9.6% |
2 | 4211 | 8.4% |
4 | 2502 | 5.0% |
5 | 1456 | 2.9% |
6 | 1358 | 2.7% |
8 | 1013 | 2.0% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 50380 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
1 | 10615 | 21.1% |
- | 10076 | 20.0% |
9 | 6731 | 13.4% |
0 | 6651 | 13.2% |
3 | 4847 | 9.6% |
2 | 4211 | 8.4% |
4 | 2502 | 5.0% |
5 | 1456 | 2.9% |
6 | 1358 | 2.7% |
8 | 1013 | 2.0% |
LEVEL_ID | PERSON_ID | ITEM_ID | REGIST_NO | IMAGES | THUMB_IMAGE | SERIAL_NO | MAIN_TITLE | NAME_KR | ALIAS_CH | ALIAS_KR | FINGERPRINT_NO | AGE | TYPE_NO | CAREER | FAMILY_NAME | FAMILY_RELATION | FATHER_NAME | TAIL | FEATURES | FEATURES_NO | ORIGIN_ADDRESS | BIRTH_PLACE | ADDRESS | INDICTMENT | INDICTMENT_OFFICE | INDICTMENT_DATE | RELEASE | CRIME_NAME | CRIME_RECORD | PRISON_TERM | PRISON_DATE | SENTENCE_OFFICE | EXECUTIVE_PRISON | PRISON | SENTENCE_DATE | ADMISSION_DATE | RELEASE_DATE | CRIMINAL_RECORD | NOTE | CRIMINAL_REASON | ACCOMPLICE_NAME | RELEASE_PLACE | ARREST_OFFICE | ARREST | TYPES | WANDERPLACE | PHOTOGRAPHING | PRESERVE_NEGATIVE | CAPTION | IMAGE_QUANTITY | CARD_TYPE | REGISTER_DATE | REGISTRANT | MODIFY_DATE | MODIFIER | SORT_NO | STATUS | PHOTOGRAPHING_DATE | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | ia_0001_0001 | iap_0001 | ia | SJ0000002290 | ia_0001_a.jpg;ia_0001_b.jpg | thumbs_ia_0001.jpg | 21420 | 賈景德 | 가경덕 | 煜如 | 욱여 | <NA> | 1881年生 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 太原綏靖公署秘書長 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 特高係 | <NA> | 昭和11年 9月 10日 刑事課ニ於テ復寫 | (小) 第32297番 | <NA> | 2 | D | 2014-11-30 00:00:00 | ssc2013 | 2015-02-26 00:00:00 | ssc2013 | 1 | 山西省 沁水縣人 | 1936-09-10 |
1 | ia_0002_0002 | iap_0002 | ia | SJ0000002291 | ia_0002_a.jpg;ia_0002_b.jpg | thumbs_ia_0002.jpg | 42200 | 賈德義 | 가덕의 | <NA> | <NA> | 75759|73647 | 大正4年 12月 11日生 | <NA> | <NA> | <NA> | <NA> | <NA> | 1米65 | <NA> | <NA> | 奉川省 蓋平縣 陳家屯 | 奉川省 蓋平縣 陳家屯 | 子宅 | <NA> | <NA> | <NA> | <NA> | 治安維持法違反 强盜殺人等 | <NA> | 懲役15年 裁定210日 | <NA> | 京城覆審 | <NA> | 西大門刑務所 | 昭和16年 10月 31日 | 昭和16年 10月 31日 | 昭和31年 4月 2日 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 昭和16年 6月 21日 西大門刑務所ニ於テ撮影 | (小) 第49576番 | <NA> | 2 | D | 2014-11-30 00:00:00 | ssc2013 | 2020-06-30 14:29:56 | handb | 1 | <NA> | 1941-06-21 |
2 | ia_0003_0003 | iap_0003 | ia | SJ0000002292 | ia_0003_a.jpg;ia_0003_b.jpg | thumbs_ia_0003.jpg | 24318 | 加藤政雄 | 가등정웅 | <NA> | <NA> | 24646|45950 | 明治38年 11月 5日生 | <NA> | 鐵道局雇員 | <NA> | <NA> | <NA> | 1米610 | <NA> | <NA> | 靜岡 安倍 有度 上原 863 | 靜岡 安倍 有度 上原 863 | 京畿道 京城府 漢江通 7 | <NA> | <NA> | <NA> | <NA> | 治安維持法違反 | <NA> | 懲役2年 | <NA> | 京城地方法院 | <NA> | 西大門刑務所 | <NA> | 昭和6年 11月 28日 | 昭和8年 11月 27日 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 昭和6年 10月 16日 西大門刑務所ニ於テ撮影 | (小) 第17012番 | <NA> | 2 | D | 2014-11-30 00:00:00 | ssc2013 | 2015-02-26 00:00:00 | ssc2013 | 1 | <NA> | 1931-10-16 |
3 | ia_0004_0004 | iap_0004 | ia | SJ0000002293 | ia_0004_a.jpg;ia_0004_b.jpg | thumbs_ia_0004.jpg | 20900 | 姜干蘭 | 강간난 | 三山再來 | 삼산재래 | 37769|49869 | 明治41年 10月 27日生 | <NA> | 絲行商 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 黃海道 平郡 古北 書梧 | <NA> | 京畿道 京城府 昌信 以下 不詳 | <NA> | <NA> | <NA> | <NA> | 國家總動員法違反 | <NA> | 懲役6月 | <NA> | 京城地方法院 | <NA> | 西大門刑務所 | 昭和17年 7月 9日 | 昭和17年 7月 17日 | 昭和18年 1月 17日 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 昭和17年 6月 5日 西大門刑務所ニ於テ撮影 | (小) 第53977番 | <NA> | 2 | D | 2014-11-30 00:00:00 | ssc2013 | 2015-02-26 00:00:00 | ssc2013 | 1 | <NA> | 1942-06-05 |
4 | ia_0005_0006 | iap_0006 | ia | SJ0000002294 | ia_0005_a.jpg;ia_0005_b.jpg | thumbs_ia_0005.jpg | 21200 | 姜敬化 | 강경화 | <NA> | <NA> | 77878|87787 | 建陽1年 3月 3日生,明治29年 3月 3日生 | <NA> | 農 | <NA> | <NA> | <NA> | 5尺1寸2分 | 左足擧踵部切リ | <NA> | 忠淸北道 淸州郡 南州內 新場垈 | 忠淸南道 論山郡 夫赤 新橋 | 咸鏡南道 北靑郡 新昌 新 85 | <NA> | <NA> | <NA> | <NA> | 保安法違反 | <NA> | 懲役6月 | 大正9年 4月 12日 | 京城覆審法院 | 西大門監獄 | <NA> | 大正9年 4月 12日 | <NA> | 大正9年 7月 11日 滿期 | 4犯 | 歸住地本籍地ナシトモ引受人ナシ | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2 | A | 2014-11-30 00:00:00 | ssc2013 | 2020-06-16 18:52:36 | handb | 1 | 平民 | <NA> |
5 | ia_0006_0007 | iap_0007 | ia | SJ0000002295 | ia_0006_a.jpg;ia_0006_b.jpg | thumbs_ia_0006.jpg | 48421 | 姜逑洪 | 강구홍 | <NA> | <NA> | 98758|28757 | 明治45年 5月 10日生 | <NA> | 農 | <NA> | <NA> | <NA> | 1米600 | <NA> | <NA> | 咸鏡南道 洪原郡 鶴泉 豊洞 411 | 咸鏡南道 洪原郡 鶴泉 豊洞 411 | 咸鏡南道 洪原郡 鶴泉 豊洞 411 | <NA> | <NA> | <NA> | <NA> | 治安維持法違反 | <NA> | 懲役2年6月 未決35日 通算 | <NA> | 京城覆審法院 | <NA> | 西大門刑務所 | <NA> | 昭和8年 2月 23日 | 昭和10年 7月 20日 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 昭和8年 3月 3日 西大門刑務所ニ於テ撮影 | (小) 第22001番 | <NA> | 2 | D | 2014-11-30 00:00:00 | ssc2013 | 2015-02-26 00:00:00 | ssc2013 | 1 | 常民 | 1933-03-03 |
6 | ia_0007_0008 | iap_0008 | ia | SJ0000002296 | ia_0007_a.jpg;ia_0007_b.jpg | thumbs_ia_0007.jpg | 22600 | 康國甫 | 강국보 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 保安法犯 | <NA> | <NA> | <NA> | <NA> | 西大門監獄 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 697 | 2 | A | 2014-11-30 00:00:00 | ssc2013 | 2015-02-26 00:00:00 | ssc2013 | 1 | <NA> | <NA> |
7 | ia_0008_0009 | iap_0009 | ia | SJ0000002297 | ia_0008_a.jpg;ia_0008_b.jpg | thumbs_ia_0008.jpg | 24000 | 姜貴男 | 강귀남 | <NA> | <NA> | 98877|98899 | 明治44年 2月 5日生 | <NA> | ナシ | <NA> | <NA> | <NA> | 5尺0寸0分 | <NA> | <NA> | 咸鏡南道 洪原郡 鶴泉 豊洞 450 | 咸鏡南道 洪原郡 鶴泉 豊洞 450 | 京畿道 京城府 樂園 | <NA> | <NA> | <NA> | <NA> | 治安維持法違反 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 昭和11年 1月 6日 西大門刑務所ニ於テ撮影 | (小) 第29706番 | <NA> | 2 | D | 2014-11-30 00:00:00 | ssc2013 | 2015-02-26 00:00:00 | ssc2013 | 2 | <NA> | 1936-01-06 |
8 | ia_0009_0009 | iap_0009 | ia | SJ0000002298 | ia_0009_a.jpg;ia_0009_b.jpg | thumbs_ia_0009.jpg | 44000 | 姜貴男 | 강귀남 | 金福順,姜京子 | 강복순,강경자 | 98877|98899 | 明治44年 10月 25日生 | <NA> | 女給 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 咸鏡南道 洪原郡 鶴泉 豊洞 | 咸鏡南道 洪原郡 鶴泉 豊洞 | 京畿道 京城府 樂園 | <NA> | <NA> | <NA> | <NA> | 治安維持法違反 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 京畿道 京城鍾路警察署 | <NA> | 安承樂, 安昌大, 姜壽求, 金東植, 鄭洛 等 共産主義者培養ノ溫床ナル役割ヲ爲ス | <NA> | 昭和10年 12月 14日 鐘路署ニ於テ撮影 | (小) 第29525番 | <NA> | 2 | D | 2014-11-30 00:00:00 | ssc2013 | 2015-02-26 00:00:00 | ssc2013 | 1 | 常民 | 1935-12-14 |
9 | ia_0010_0010 | iap_0010 | ia | SJ0000002299 | ia_0010_a.jpg;ia_0010_b.jpg | thumbs_ia_0010.jpg | 24100 | 姜貴柱 | 강귀주 | 東柱 | 동주 | 87879|77897 | 明治38年 3月 26日生 | <NA> | 無 | <NA> | <NA> | <NA> | 1米720 | <NA> | <NA> | 咸鏡北道 會寧郡 昌斗 倉台 246 | 咸鏡北道 會寧郡 昌斗 倉台 246 | 京畿道 京城府 三角 12■■ | <NA> | <NA> | <NA> | <NA> | 治安維持法違反 | <NA> | 懲役6年 | <NA> | 京城地方法院 | <NA> | <NA> | <NA> | 昭和5年 8月 30日 | 昭和10年 1月 8日 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 昭和5年 9月 1日 西大門刑務所ニ於テ撮影 | (小) 第13746番 | 貴柱ヲ見よ | 2 | C | 2014-11-30 00:00:00 | ssc2013 | 2015-02-26 00:00:00 | ssc2013 | 1 | 常民 | 1930-09-01 |
LEVEL_ID | PERSON_ID | ITEM_ID | REGIST_NO | IMAGES | THUMB_IMAGE | SERIAL_NO | MAIN_TITLE | NAME_KR | ALIAS_CH | ALIAS_KR | FINGERPRINT_NO | AGE | TYPE_NO | CAREER | FAMILY_NAME | FAMILY_RELATION | FATHER_NAME | TAIL | FEATURES | FEATURES_NO | ORIGIN_ADDRESS | BIRTH_PLACE | ADDRESS | INDICTMENT | INDICTMENT_OFFICE | INDICTMENT_DATE | RELEASE | CRIME_NAME | CRIME_RECORD | PRISON_TERM | PRISON_DATE | SENTENCE_OFFICE | EXECUTIVE_PRISON | PRISON | SENTENCE_DATE | ADMISSION_DATE | RELEASE_DATE | CRIMINAL_RECORD | NOTE | CRIMINAL_REASON | ACCOMPLICE_NAME | RELEASE_PLACE | ARREST_OFFICE | ARREST | TYPES | WANDERPLACE | PHOTOGRAPHING | PRESERVE_NEGATIVE | CAPTION | IMAGE_QUANTITY | CARD_TYPE | REGISTER_DATE | REGISTRANT | MODIFY_DATE | MODIFIER | SORT_NO | STATUS | PHOTOGRAPHING_DATE | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
6286 | ia_6255_4850 | iap_4850 | ia | SJ0000008544 | ia_6255_a.jpg;ia_6255_b.jpg | thumbs_ia_6255.jpg | <NA> | 黃鶴烈 | 황학렬 | <NA> | <NA> | <NA> | 明治7年 8月 3日 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 全羅北道 全州府 完山 284 | 全羅北道 全州府 完山 284 | 全羅北道 南原郡 仝 | <NA> | <NA> | <NA> | <NA> | 治安維持法違反 | <NA> | 豫審免訴 | <NA> | 京城地方 | <NA> | 西大門刑務所 | 昭和16年 8月 30日 | <NA> | 昭和16年 9月 5日 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 昭和15年 7月 11日 西大門刑務所ニ於テ撮影 | (小) 第44864番 | 16.9.5 豫審免訴 16未 | 2 | D | 2014-11-30 00:00:00 | ssc2013 | 2015-02-26 00:00:00 | ssc2013 | 1 | <NA> | 1940-07-11 |
6287 | ia_6256_4851 | iap_4851 | ia | SJ0000008545 | ia_6256_a.jpg;ia_6256_b.jpg | thumbs_ia_6256.jpg | <NA> | 黃鶴老 | 황학로 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 治安維持法違反 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 昭和7 9 7 保釋 | <NA> | 昭和6年 10月 26日 西大門刑務所ニ於テ撮影 | (小) 第17054番 | 7.9.7 保釋 | 2 | D | 2014-11-30 00:00:00 | ssc2013 | 2015-02-26 00:00:00 | ssc2013 | 1 | <NA> | 1931-10-26 |
6288 | ia_6257_4852 | iap_4852 | ia | SJ0000008546 | ia_6257_a.jpg;ia_6257_b.jpg | thumbs_ia_6257.jpg | <NA> | 松村行玉 | 송촌행옥 | 黃行玉 | 황행옥 | 95654|24045 | 明治43年 4月 21日 | <NA> | 布木行商 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 黃海道 甕津郡 鳳鷗 王陰 | 黃海道 甕津郡 鳳鷗 王陰 | 京畿道 京城府 神堂 644-19 | <NA> | <NA> | <NA> | <NA> | 國家總動員法違反 | <NA> | 懲役4月 | <NA> | 京城地方 | <NA> | 西大門刑務所 | 昭和17年 4月 13日 | 昭和17年 4月 13日 | 昭和17年 8月 13日 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 昭和17年 1月 28日 西大門刑務所ニ於テ撮影 | (小) 第52238番 | <NA> | 2 | D | 2014-11-30 00:00:00 | ssc2013 | 2015-02-26 00:00:00 | ssc2013 | 1 | <NA> | 1942-01-28 |
6289 | ia_6258_4853 | iap_4853 | ia | SJ0000008547 | ia_6258_a.jpg;ia_6258_b.jpg | thumbs_ia_6258.jpg | <NA> | 黃亨魯 | 황형로 | 京京宗,珠進壹 | 경경종,주진일 | 48445|72849 | 明治41年 6月 8日 | <NA> | 農 | <NA> | <NA> | <NA> | 162.0cm | <NA> | <NA> | 咸鏡北道 富寧郡 連川 連津 8 | 咸鏡北道 富寧郡 連川 連津 8 | 間島 延吉 守信鄕 4道溝 興春洞 | <NA> | <NA> | <NA> | <NA> | 治安維持法違反 住居侵入 强盜放火 强盜未遂 放火豫備 電信法違反 | <NA> | 懲役7年 | <NA> | 京城地方法院 | <NA> | 西大門刑務所 | <NA> | 昭和8年 12月 20日 | 昭和14年 11月 15日 | 裁定 400日 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 昭和9 3 27 調 不明 | <NA> | 昭和8年 1月 15日 西大門刑務所ニ於テ撮影 | (小) 第21623番 | ミ 要調査 | 2 | D | 2014-11-30 00:00:00 | ssc2013 | 2015-02-26 00:00:00 | ssc2013 | 2 | <NA> | 1933-01-15 |
6290 | ia_6259_4854 | iap_4854 | ia | SJ0000008548 | ia_6259_a.jpg;ia_6259_b.jpg | thumbs_ia_6259.jpg | <NA> | 黃和烈 | 황화열 | 學鳳 | 학봉 | <NA> | 明治44年 5月 29日 | <NA> | 農 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 咸鏡北道 富寧郡 青岩 未陰 | 支那 間島 汪淸縣 春融 牡丹 龍湖 | 支那 間島 汪淸縣 春融 牡丹 龍湖 | <NA> | <NA> | <NA> | <NA> | 治安維持法違反 | <NA> | <NA> | <NA> | <NA> | <NA> | 西大門刑務所 | <NA> | 昭和5年 12月 24日 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 昭和5年 12月 24日 西大門刑務所ニ於テ撮影 | (小) 第15229番 | <NA> | 2 | D | 2014-11-30 00:00:00 | ssc2013 | 2015-02-26 00:00:00 | ssc2013 | 1 | 常民 | 1930-12-24 |
6291 | ia_6260_4855 | iap_4855 | ia | SJ0000008549 | ia_6260_a.jpg;ia_6260_b.jpg | thumbs_ia_6260.jpg | <NA> | 黃興任 | 황흥임 | 淑貞 | 숙정 | 83348|74548 | 明治33年 7月 2日 | <NA> | 女工 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 京畿道 開城府 北本町 360 | 京畿道 開城府 北本町 360 | 京畿道 開城府 南山 448 | <NA> | <NA> | <NA> | <NA> | 治安維持法違反 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 京畿道 開城警察署 | 京畿道 開城署 | <NA> | <NA> | 昭和7年 12月 14日 西大門刑務所ニ於テ撮影 | (小) 第21347番 | ミ決 黃興任 | 2 | D | 2014-11-30 00:00:00 | ssc2013 | 2015-02-26 00:00:00 | ssc2013 | 1 | <NA> | 1932-12-14 |
6292 | ia_6261_4856 | iap_4856 | ia | SJ0000008550 | ia_6261_a.jpg;ia_6261_b.jpg | thumbs_ia_6261.jpg | <NA> | 橫山禮太 | 횡산예태 | ナシ | <NA> | <NA> | 明治43年 3月 31日 | <NA> | 雇員 | <NA> | <NA> | <NA> | 5尺2寸3分 | <NA> | <NA> | 山口 阿部 荻平 安古 | 山口 阿部 荻平 安古 | 京畿道 京城府 竹溪 1丁目 132 | <NA> | <NA> | <NA> | <NA> | 治安維持法違反 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 昭和8年 2月 9日 東大門署ニ於テ撮影 | (小) 第22071番 | <NA> | 2 | D | 2014-11-30 00:00:00 | ssc2013 | 2015-02-26 00:00:00 | ssc2013 | 1 | 平民 | 1933-02-09 |
6293 | ia_6262_4856 | iap_4856 | ia | SJ0000008551 | ia_6262_a.jpg;ia_6262_b.jpg | thumbs_ia_6262.jpg | <NA> | 橫山禮太 | 횡산예태 | <NA> | <NA> | <NA> | 明治43年 3月 31日 | <NA> | 京城穀物組合 廣ム會社係 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 山口 阿部 荻平 安古 271,1 | 山口 阿部 荻平 安古 271,1 | 京畿道 京城府 竹溪 2丁目 132 | <NA> | <NA> | <NA> | <NA> | 治安維持法違反 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 原紙ナシ(9411) | <NA> | <NA> | (小) 第22203番 | <NA> | 2 | D | 2014-11-30 00:00:00 | ssc2013 | 2015-02-26 00:00:00 | ssc2013 | 2 | <NA> | <NA> |
6294 | ia_6263_4857 | iap_4857 | ia | SJ0000008552 | ia_6263_a.jpg;ia_6263_b.jpg | thumbs_ia_6263.jpg | <NA> | 後藤管男 | 후등관남 | <NA> | <NA> | 56767|16957 | 明治32年 2月 5日 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 分 大分 大字 大分 320 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 國家總動員法違反 | <NA> | 懲役4月 未決句 50 通算 | <NA> | 京城地方法院 | <NA> | 京城西大門刑務所 | 昭和16年 3月 4日 | 昭和16年 3月 8日 | 昭和16年 5月 19日 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 昭和15年 12月 18日 西大門刑務所ニ於テ撮影 | (小) 第47621番 | <NA> | 2 | D | 2014-11-30 00:00:00 | ssc2013 | 2015-02-26 00:00:00 | ssc2013 | 1 | <NA> | 1940-12-18 |
6295 | ia_6264_4858 | iap_4858 | ia | SJ0000008553 | ia_6264_a.jpg;ia_6264_b.jpg | thumbs_ia_6264.jpg | <NA> | 薰玉明 | 훈옥명 | <NA> | <NA> | 78874|89799 | 明治36年 5月 15日 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 山東省 長山縣 城內 | 山東 長山縣 城內 | 京畿道 京城府 長谷川 98 | <NA> | <NA> | <NA> | <NA> | 朝鮮臨時保安令及陸海軍刑法違反 | <NA> | 懲役1年 | <NA> | 京城地方 | <NA> | <NA> | 昭和17年 12月 3日 | <NA> | 昭和18年 12月 3日 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 昭和17年 9月 3日 西大門刑務所ニ於テ撮影 | (小) 第54822番 | <NA> | 2 | D | 2014-11-30 00:00:00 | ssc2013 | 2015-02-26 00:00:00 | ssc2013 | 1 | <NA> | 1942-09-03 |
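In the sample rows, the free-text PHOTOGRAPHING column carries a Japanese era date (for example 昭和11年 9月 10日) while PHOTOGRAPHING_DATE holds the Gregorian equivalent (1936-09-10). The sketch below shows one way such a conversion could be done. It is not the method used to build the dataset. The helper `era_to_date` and the `ERA_OFFSETS` table are hypothetical, and only the three Japanese eras seen in the sample are handled (the Korean era 建陽 in AGE is not).

```python
import re
from datetime import date

# Gregorian year immediately before year 1 of each era
# (Meiji 1 = 1868, Taisho 1 = 1912, Showa 1 = 1926).
ERA_OFFSETS = {"明治": 1867, "大正": 1911, "昭和": 1925}

# Era name, then year/month/day, with optional spaces as in the sample rows.
ERA_DATE = re.compile(r"(明治|大正|昭和)\s*(\d+)年\s*(\d+)月\s*(\d+)日")


def era_to_date(text):
    """Return the first era-style date found in text, or None if there is none."""
    match = ERA_DATE.search(text)
    if match is None:
        return None
    era, year, month, day = match.groups()
    return date(ERA_OFFSETS[era] + int(year), int(month), int(day))


print(era_to_date("昭和11年 9月 10日 刑事課ニ於テ復寫"))
print(era_to_date("原紙ナシ(9411)"))
```

Abbreviated notations such as `昭和7 9 7 保釋` are deliberately not matched, and an impossible calendar date would raise `ValueError` from `date`, so real use would need extra handling for both cases.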