Overview

Dataset statistics

Number of variables: 12
Number of observations: 10000
Missing cells: 10743
Missing cells (%): 9.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.0 MiB
Average record size in memory: 106.0 B
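
The missing-cell figures can be cross-checked with a little arithmetic. The sketch below is only an illustration. It assumes the percentage is the share of missing cells among all rows × columns cells, and that the per-variable missing counts reported further down in this report add up to the total. The dictionary keys are the report's column names; the one text variable whose heading does not appear in this text is given the placeholder key `unnamed_text_variable`, which is not a real column name.

```python
# Cross-check of the "Dataset statistics" figures (values copied from this report).
N_VARIABLES = 12
N_OBSERVATIONS = 10_000
MISSING_CELLS = 10_743

# Per-variable missing counts as listed in the Variables section.
# "unnamed_text_variable" is a placeholder key, not a real column name.
missing_by_variable = {
    "unnamed_text_variable": 5,
    "한자명칭": 1034,
    "재질": 16,
    "용도": 185,
    "크기": 160,
    "출토지": 584,
    "설명": 8759,
}


def missing_share(missing: int, rows: int, cols: int) -> float:
    """Return the fraction of all cells (rows * cols) that are missing."""
    return missing / (rows * cols)


total_missing = sum(missing_by_variable.values())
share = missing_share(MISSING_CELLS, N_OBSERVATIONS, N_VARIABLES)

print(total_missing)   # sum of the per-variable missing counts
print(f"{share:.1%}")  # share of missing cells, one decimal place
```

Under these assumptions the per-variable counts account for every missing cell, and most of them come from the 설명 column alone.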

Variable types

Categorical: 3
Numeric: 2
Text: 7

Dataset

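Description: Provides, for artifacts held by national museums, the holding classification (소장구분), artifact name, Hanja name (한자명칭), nationality/period (국적/시대), material (재질), use (용도), size (크기), excavation site (출토지), and a description of the artifact (설명).
Author: Ministry of Culture, Sports and Tourism, National Museum of Korea (문화체육관광부 국립중앙박물관)
URL: https://www.data.go.kr/data/3070595/fileData.do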

Alerts

세부번호 is highly overall correlated with 지정구분 (High correlation)
소장구분 is highly overall correlated with 지정구분 (High correlation)
국적-시대 is highly overall correlated with 지정구분 (High correlation)
지정구분 is highly overall correlated with 세부번호 and 2 other fields (High correlation)
지정구분 is highly imbalanced (92.8%) (Imbalance)
한자명칭 has 1034 (10.3%) missing values (Missing)
용도 has 185 (1.8%) missing values (Missing)
크기 has 160 (1.6%) missing values (Missing)
출토지 has 584 (5.8%) missing values (Missing)
설명 has 8759 (87.6%) missing values (Missing)
세부번호 is highly skewed (γ1 = 99.41603231) (Skewed)
세부번호 has 9885 (98.9%) zeros (Zeros)
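
The 92.8% imbalance figure for 지정구분 can be reproduced from the category counts listed later in this report. The sketch below assumes the score is one minus the Shannon entropy of the category frequencies divided by the maximum entropy for that number of categories. That definition is an assumption which happens to match the reported value, and `imbalance_score` is an illustrative helper name, not the profiler's API.

```python
import math


def imbalance_score(counts: list[int]) -> float:
    """1 - normalized Shannon entropy: 0 for a uniform split, near 1 when one class dominates."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Category counts for 지정구분 copied from this report, in the order
# <NA>, 중요민속, 국보, 보물, 도지정.
counts = [9783, 187, 18, 8, 4]
score = imbalance_score(counts)
print(f"{score:.1%}")
```

Note that the `<NA>` entry is counted as an ordinary category here, in line with the report listing 5 distinct values and 0 missing for this column.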

Reproduction

Analysis started: 2023-12-12 07:00:24.712633
Analysis finished: 2023-12-12 07:00:29.449356
Duration: 4.74 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

소장구분
Categorical

HIGH CORRELATION 

Distinct: 29
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value                Count
국립1-광주-광주      4057
국립1-청주-청주      1901
국립1-춘천-춘천      1540
국립1-춘천-수증      912
국립1-제주-제주      434
Other values (24)    1156

Length

Max length: 12
Median length: 9
Mean length: 9.0014
Min length: 8

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: 국립1-춘천-춘천
2nd row: 국립1-대구-증
3rd row: 국립1-공주-공주
4th row: 국립1-청주-청주
5th row: 국립1-대구-임당

Common Values

Value                Count    Frequency (%)
국립1-광주-광주      4057     40.6%
국립1-청주-청주      1901     19.0%
국립1-춘천-춘천      1540     15.4%
국립1-춘천-수증      912      9.1%
국립1-제주-제주      434      4.3%
국립1-대구-증        186      1.9%
국립1-부여-부여      163      1.6%
국립1-공주-공주      150      1.5%
국립1-진주-진주      119      1.2%
국립1-김해-김해      112      1.1%
Other values (19)    426      4.3%

Length

Histogram of lengths of the category

소장품번호
Real number (ℝ)

Distinct: 7801
Distinct (%): 78.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6130.4877
Minimum: 1
Maximum: 21269
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 84
Q1: 869
Median: 6727.5
Q3: 10831.25
95-th percentile: 13059.05
Maximum: 21269
Range: 21268
Interquartile range (IQR): 9962.25

Descriptive statistics

Standard deviation: 5153.2705
Coefficient of variation (CV): 0.84059714
Kurtosis: -1.0567453
Mean: 6130.4877
Median Absolute Deviation (MAD): 5215
Skewness: 0.29227542
Sum: 61304877
Variance: 26556196
Monotonicity: Not monotonic
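
Several of the 소장품번호 figures above are derived from the others, which makes for an easy consistency check. The sketch below is an illustration only: the inputs are copied from this report, the variable names are ad hoc, and the reported values are rounded, so the comparison is approximate.

```python
# Figures copied from the 소장품번호 statistics in this report.
n = 10_000
minimum, maximum = 1, 21_269
q1, q3 = 869, 10_831.25
mean, std = 6130.4877, 5153.2705
total = 61_304_877

value_range = maximum - minimum  # reported as Range
iqr = q3 - q1                    # reported as Interquartile range (IQR)
cv = std / mean                  # reported as Coefficient of variation (CV)
variance = std ** 2              # reported as Variance (rounded in the report)
mean_from_sum = total / n        # reported as Mean

print(value_range, iqr, round(cv, 6), round(variance), mean_from_sum)
```

The same relationships (range, IQR, CV, variance, sum over count) can be applied to the other numeric variable in the report.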
Histogram with fixed size bins (bins=50)

Most common values

Value                  Count    Frequency (%)
28                     17       0.2%
1                      15       0.1%
25                     13       0.1%
4                      10       0.1%
2                      10       0.1%
30                     10       0.1%
256                    10       0.1%
249                    10       0.1%
77                     9        0.1%
7                      9        0.1%
Other values (7791)    9887     98.9%

Smallest values

Value    Count    Frequency (%)
1        15       0.1%
2        10       0.1%
3        8        0.1%
4        10       0.1%
5        8        0.1%
6        8        0.1%
7        9        0.1%
8        5        0.1%
9        9        0.1%
10       8        0.1%

Largest values

Value    Count    Frequency (%)
21269    1        < 0.1%
21267    1        < 0.1%
21266    1        < 0.1%
21265    1        < 0.1%
21264    1        < 0.1%
21263    1        < 0.1%
21262    1        < 0.1%
21261    1        < 0.1%
21260    1        < 0.1%
21259    1        < 0.1%

세부번호
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 14
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1089
Minimum: 0
Maximum: 710
Zeros: 9885
Zeros (%): 98.9%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 710
Range: 710
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation7.1135857
Coefficient of variation (CV)65.322182
Kurtosis9921.6879
Mean0.1089
Median Absolute Deviation (MAD)0
Skewness99.416032
Sum1089
Variance50.603101
MonotonicityNot monotonic
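
Because 세부번호 has only 14 distinct values and the value tables in this report list all of them, its full distribution can be rebuilt and the skewness recomputed. The sketch below assumes bias-adjusted sample estimators (variance with ddof = 1 and the adjusted Fisher-Pearson skewness); that choice is an assumption which reproduces the reported figures, and `weighted_moments` is an illustrative helper.

```python
import math

# Value -> count for 세부번호, combined from the value tables in this report.
value_counts = {0: 9885, 1: 31, 2: 26, 3: 18, 4: 9, 5: 10, 6: 8, 7: 5,
                8: 1, 9: 1, 10: 1, 11: 2, 12: 2, 710: 1}


def weighted_moments(counts: dict[int, int]) -> tuple[int, float, float, float]:
    """Return n, the mean, and the 2nd and 3rd central moments (population form)."""
    n = sum(counts.values())
    mean = sum(v * c for v, c in counts.items()) / n
    m2 = sum(c * (v - mean) ** 2 for v, c in counts.items()) / n
    m3 = sum(c * (v - mean) ** 3 for v, c in counts.items()) / n
    return n, mean, m2, m3


n, mean, m2, m3 = weighted_moments(value_counts)
sample_variance = m2 * n / (n - 1)                # ddof = 1
g1 = m3 / m2 ** 1.5                               # population skewness
skewness = g1 * math.sqrt(n * (n - 1)) / (n - 2)  # bias-adjusted skewness

print(n, round(mean, 4), round(sample_variance, 4), round(skewness, 3))
```

Nearly all of the skewness comes from the single record with the value 710; the remaining non-zero values are all 12 or below.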
Histogram with fixed size bins (bins=14)

Most common values

Value               Count    Frequency (%)
0                   9885     98.9%
1                   31       0.3%
2                   26       0.3%
3                   18       0.2%
5                   10       0.1%
4                   9        0.1%
6                   8        0.1%
7                   5        0.1%
11                  2        < 0.1%
12                  2        < 0.1%
Other values (4)    4        < 0.1%

Smallest values

Value    Count    Frequency (%)
0        9885     98.9%
1        31       0.3%
2        26       0.3%
3        18       0.2%
4        9        0.1%
5        10       0.1%
6        8        0.1%
7        5        0.1%
8        1        < 0.1%
9        1        < 0.1%

Largest values

Value    Count    Frequency (%)
710      1        < 0.1%
12       2        < 0.1%
11       2        < 0.1%
10       1        < 0.1%
9        1        < 0.1%
8        1        < 0.1%
7        5        0.1%
6        8        0.1%
5        10       0.1%
4        9        0.1%

Distinct: 3356
Distinct (%): 33.6%
Missing: 5
Missing (%): < 0.1%
Memory size: 156.2 KiB

Length

Max length: 45
Median length: 33
Mean length: 5.1443722
Min length: 1

Characters and Unicode

Total characters: 51418
Distinct characters: 1577
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2325
Unique (%): 23.3%

Sample

1st row: 토기바닥조각
2nd row: 무명 홑바지
3rd row: 金銀製腰佩
4th row: 토기편
5th row: 鐵鏃

Value                  Count    Frequency (%)
암키와                 251      2.3%
수키와                 226      2.0%
土器底部片             218      2.0%
土器口緣部片           202      1.8%
현풍곽씨               146      1.3%
土器胴體部片           115      1.0%
긁개                   92       0.8%
완형토기편             89       0.8%
백자접시               85       0.8%
把手                   84       0.8%
Other values (3859)    9518     86.3%

Most occurring characters

ValueCountFrequency (%)
1620
 
3.2%
1402
 
2.7%
1310
 
2.5%
1299
 
2.5%
1164
 
2.3%
1026
 
2.0%
982
 
1.9%
939
 
1.8%
810
 
1.6%
770
 
1.5%
Other values (1567) 40096
78.0%

Most occurring categories

ValueCountFrequency (%)
Other Letter 49464
96.2%
Space Separator 1164
 
2.3%
Close Punctuation 314
 
0.6%
Open Punctuation 314
 
0.6%
Other Punctuation 136
 
0.3%
Decimal Number 10
 
< 0.1%
Lowercase Letter 6
 
< 0.1%
Math Symbol 4
 
< 0.1%
Final Punctuation 3
 
< 0.1%
Initial Punctuation 3
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1620
 
3.3%
1402
 
2.8%
1310
 
2.6%
1299
 
2.6%
1026
 
2.1%
982
 
2.0%
939
 
1.9%
810
 
1.6%
770
 
1.6%
723
 
1.5%
Other values (1544) 38583
78.0%
Other Punctuation
ValueCountFrequency (%)
' 48
35.3%
. 38
27.9%
, 33
24.3%
· 8
 
5.9%
" 8
 
5.9%
1
 
0.7%
Decimal Number
ValueCountFrequency (%)
2 3
30.0%
1 3
30.0%
4 2
20.0%
5 1
 
10.0%
3 1
 
10.0%
Close Punctuation
ValueCountFrequency (%)
) 312
99.4%
1
 
0.3%
1
 
0.3%
Open Punctuation
ValueCountFrequency (%)
( 312
99.4%
1
 
0.3%
1
 
0.3%
Math Symbol
ValueCountFrequency (%)
> 2
50.0%
< 2
50.0%
Space Separator
ValueCountFrequency (%)
1164
100.0%
Lowercase Letter
ValueCountFrequency (%)
s 6
100.0%
Final Punctuation
ValueCountFrequency (%)
3
100.0%
Initial Punctuation
ValueCountFrequency (%)
3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 30117
58.6%
Han 19347
37.6%
Common 1948
 
3.8%
Latin 6
 
< 0.1%

Most frequent character per script

Han
ValueCountFrequency (%)
1620
 
8.4%
1402
 
7.2%
1299
 
6.7%
939
 
4.9%
571
 
3.0%
548
 
2.8%
547
 
2.8%
501
 
2.6%
476
 
2.5%
441
 
2.3%
Other values (900) 11003
56.9%
Hangul
ValueCountFrequency (%)
1310
 
4.3%
1026
 
3.4%
982
 
3.3%
810
 
2.7%
770
 
2.6%
723
 
2.4%
663
 
2.2%
558
 
1.9%
537
 
1.8%
511
 
1.7%
Other values (634) 22227
73.8%
Common
ValueCountFrequency (%)
1164
59.8%
) 312
 
16.0%
( 312
 
16.0%
' 48
 
2.5%
. 38
 
2.0%
, 33
 
1.7%
· 8
 
0.4%
" 8
 
0.4%
2 3
 
0.2%
3
 
0.2%
Other values (12) 19
 
1.0%
Latin
ValueCountFrequency (%)
s 6
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 30114
58.6%
CJK 19064
37.1%
ASCII 1935
 
3.8%
CJK Compat Ideographs 283
 
0.6%
None 13
 
< 0.1%
Punctuation 6
 
< 0.1%
Compat Jamo 3
 
< 0.1%

Most frequent character per block

CJK
ValueCountFrequency (%)
1620
 
8.5%
1402
 
7.4%
1299
 
6.8%
939
 
4.9%
571
 
3.0%
548
 
2.9%
547
 
2.9%
501
 
2.6%
476
 
2.5%
441
 
2.3%
Other values (852) 10720
56.2%
Hangul
ValueCountFrequency (%)
1310
 
4.4%
1026
 
3.4%
982
 
3.3%
810
 
2.7%
770
 
2.6%
723
 
2.4%
663
 
2.2%
558
 
1.9%
537
 
1.8%
511
 
1.7%
Other values (632) 22224
73.8%
ASCII
ValueCountFrequency (%)
1164
60.2%
) 312
 
16.1%
( 312
 
16.1%
' 48
 
2.5%
. 38
 
2.0%
, 33
 
1.7%
" 8
 
0.4%
s 6
 
0.3%
2 3
 
0.2%
1 3
 
0.2%
Other values (5) 8
 
0.4%
CJK Compat Ideographs
ValueCountFrequency (%)
81
28.6%
34
12.0%
24
 
8.5%
19
 
6.7%
17
 
6.0%
12
 
4.2%
12
 
4.2%
8
 
2.8%
6
 
2.1%
6
 
2.1%
Other values (38) 64
22.6%
None
ValueCountFrequency (%)
· 8
61.5%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
Punctuation
ValueCountFrequency (%)
3
50.0%
3
50.0%
Compat Jamo
ValueCountFrequency (%)
2
66.7%
1
33.3%

한자명칭
Text

MISSING 

Distinct: 2486
Distinct (%): 27.7%
Missing: 1034
Missing (%): 10.3%
Memory size: 156.2 KiB

Length

Max length: 33
Median length: 26
Mean length: 4.0392594
Min length: 1

Characters and Unicode

Total characters: 36216
Distinct characters: 1405
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 8
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1629
Unique (%): 18.2%

Sample

1st row: 土器底部片
2nd row:
3rd row: 土器片
4th row: 쇠살촉
5th row: 소옹

Value                  Count    Frequency (%)
암키와                 233      2.5%
토기저부편             221      2.4%
토기구연부편           204      2.2%
수키와                 195      2.1%
諺簡                   146      1.6%
토기동체부편           116      1.3%
剝片                   104      1.1%
棺釘                   97       1.1%
短頸壺                 95       1.0%
완형토기편             91       1.0%
Other values (2494)    7698     83.7%

Most occurring characters

ValueCountFrequency (%)
1660
 
4.6%
1424
 
3.9%
1129
 
3.1%
1019
 
2.8%
964
 
2.7%
962
 
2.7%
667
 
1.8%
632
 
1.7%
541
 
1.5%
518
 
1.4%
Other values (1395) 26700
73.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 35691
98.6%
Space Separator 348
 
1.0%
Other Punctuation 122
 
0.3%
Open Punctuation 21
 
0.1%
Close Punctuation 21
 
0.1%
Decimal Number 5
 
< 0.1%
Final Punctuation 2
 
< 0.1%
Initial Punctuation 2
 
< 0.1%
Uppercase Letter 2
 
< 0.1%
Math Symbol 2
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1660
 
4.7%
1424
 
4.0%
1129
 
3.2%
1019
 
2.9%
964
 
2.7%
962
 
2.7%
667
 
1.9%
632
 
1.8%
541
 
1.5%
518
 
1.5%
Other values (1372) 26175
73.3%
Other Punctuation
ValueCountFrequency (%)
, 68
55.7%
' 42
34.4%
· 5
 
4.1%
. 4
 
3.3%
2
 
1.6%
1
 
0.8%
Decimal Number
ValueCountFrequency (%)
2 1
20.0%
1 1
20.0%
5 1
20.0%
3 1
20.0%
4 1
20.0%
Open Punctuation
ValueCountFrequency (%)
( 19
90.5%
1
 
4.8%
1
 
4.8%
Close Punctuation
ValueCountFrequency (%)
) 19
90.5%
1
 
4.8%
1
 
4.8%
Math Symbol
ValueCountFrequency (%)
< 1
50.0%
> 1
50.0%
Space Separator
ValueCountFrequency (%)
348
100.0%
Final Punctuation
ValueCountFrequency (%)
2
100.0%
Initial Punctuation
ValueCountFrequency (%)
2
100.0%
Uppercase Letter
ValueCountFrequency (%)
Ё 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 19445
53.7%
Han 16246
44.9%
Common 523
 
1.4%
Cyrillic 2
 
< 0.1%

Most frequent character per script

Han
ValueCountFrequency (%)
964
 
5.9%
962
 
5.9%
667
 
4.1%
541
 
3.3%
481
 
3.0%
436
 
2.7%
379
 
2.3%
365
 
2.2%
333
 
2.0%
311
 
1.9%
Other values (989) 10807
66.5%
Hangul
ValueCountFrequency (%)
1660
 
8.5%
1424
 
7.3%
1129
 
5.8%
1019
 
5.2%
632
 
3.3%
518
 
2.7%
475
 
2.4%
465
 
2.4%
444
 
2.3%
430
 
2.2%
Other values (373) 11249
57.9%
Common
ValueCountFrequency (%)
348
66.5%
, 68
 
13.0%
' 42
 
8.0%
( 19
 
3.6%
) 19
 
3.6%
· 5
 
1.0%
. 4
 
0.8%
2
 
0.4%
2
 
0.4%
2
 
0.4%
Other values (12) 12
 
2.3%
Cyrillic
ValueCountFrequency (%)
Ё 2
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 19443
53.7%
CJK 16075
44.4%
ASCII 507
 
1.4%
CJK Compat Ideographs 171
 
0.5%
None 12
 
< 0.1%
Punctuation 4
 
< 0.1%
Cyrillic 2
 
< 0.1%
Compat Jamo 2
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1660
 
8.5%
1424
 
7.3%
1129
 
5.8%
1019
 
5.2%
632
 
3.3%
518
 
2.7%
475
 
2.4%
465
 
2.4%
444
 
2.3%
430
 
2.2%
Other values (371) 11247
57.8%
CJK
ValueCountFrequency (%)
964
 
6.0%
962
 
6.0%
667
 
4.1%
541
 
3.4%
481
 
3.0%
436
 
2.7%
379
 
2.4%
365
 
2.3%
333
 
2.1%
311
 
1.9%
Other values (939) 10636
66.2%
ASCII
ValueCountFrequency (%)
348
68.6%
, 68
 
13.4%
' 42
 
8.3%
( 19
 
3.7%
) 19
 
3.7%
. 4
 
0.8%
< 1
 
0.2%
> 1
 
0.2%
2 1
 
0.2%
1 1
 
0.2%
Other values (3) 3
 
0.6%
CJK Compat Ideographs
ValueCountFrequency (%)
26
15.2%
17
 
9.9%
13
 
7.6%
11
 
6.4%
11
 
6.4%
9
 
5.3%
7
 
4.1%
6
 
3.5%
5
 
2.9%
5
 
2.9%
Other values (40) 61
35.7%
None
ValueCountFrequency (%)
· 5
41.7%
2
 
16.7%
1
 
8.3%
1
 
8.3%
1
 
8.3%
1
 
8.3%
1
 
8.3%
Punctuation
ValueCountFrequency (%)
2
50.0%
2
50.0%
Cyrillic
ValueCountFrequency (%)
Ё 2
100.0%
Compat Jamo
ValueCountFrequency (%)
1
50.0%
1
50.0%

국적-시대
Categorical

HIGH CORRELATION 

Distinct: 31
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value                Count
한국-조선            2438
한국-삼국            1352
한국-통일신라        1165
한국-고려            875
한국-백제            836
Other values (26)    3334

Length

Max length: 7
Median length: 5
Mean length: 5.5176
Min length: 3

Unique

Unique: 5
Unique (%): < 0.1%

Sample

1st row: 한국-원삼국
2nd row: 한국-조선
3rd row: 한국-백제
4th row: 한국-삼국
5th row: 한국-원삼국

Common Values

Value                Count    Frequency (%)
한국-조선            2438     24.4%
한국-삼국            1352     13.5%
한국-통일신라        1165     11.7%
한국-고려            875      8.8%
한국-백제            836      8.4%
한국-청동기          647      6.5%
한국-구석기          477      4.8%
한국-원삼국          468      4.7%
한국-신석기          445      4.5%
한국-                295      2.9%
Other values (21)    1002     10.0%

Length

Histogram of lengths of the category

재질
Text

Distinct: 87
Distinct (%): 0.9%
Missing: 16
Missing (%): 0.2%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 8
Mean length: 4.6488381
Min length: 2

Characters and Unicode

Total characters: 46414
Distinct characters: 96
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
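
For the text variables, the reported mean length and distinct percentage appear to be computed over the non-missing rows only. The sketch below checks this for 재질 and 한자명칭 using figures copied from this report. The relationship itself (mean length = total characters / non-missing rows, distinct % = distinct / non-missing rows) is an assumption inferred from the numbers, and `text_summary` is an illustrative helper.

```python
N_OBSERVATIONS = 10_000


def text_summary(total_chars: int, distinct: int, missing: int) -> tuple[float, float]:
    """Return (mean length, distinct share), both relative to the non-missing rows."""
    non_missing = N_OBSERVATIONS - missing
    return total_chars / non_missing, distinct / non_missing


# (total characters, distinct values, missing values) as reported for each column.
reported = {
    "재질": (46_414, 87, 16),
    "한자명칭": (36_216, 2_486, 1_034),
}

summaries = {name: text_summary(*figures) for name, figures in reported.items()}
for name, (mean_len, distinct_share) in summaries.items():
    print(name, round(mean_len, 7), f"{distinct_share:.1%}")
```

Dividing by all 10000 rows instead would give a noticeably lower mean length for 한자명칭, where about a tenth of the rows are missing.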

Unique

Unique: 15
Unique (%): 0.2%

Sample

1st row: 토제-연질
2nd row: 사직-면
3rd row: 금속-금
4th row: 토제-연질
5th row: 금속-철
ValueCountFrequency (%)
토제-연질 1749
17.5%
토제-경질 1646
16.5%
도자기-백자 833
 
8.3%
토제-와질 690
 
6.9%
624
 
6.2%
567
 
5.7%
금속-철 484
 
4.8%
금속-동합금 420
 
4.2%
석-기타 410
 
4.1%
도자기-청자 400
 
4.0%
Other values (77) 2161
21.6%

Most occurring characters

ValueCountFrequency (%)
- 9984
21.5%
4503
 
9.7%
4481
 
9.7%
4085
 
8.8%
2822
 
6.1%
2518
 
5.4%
1753
 
3.8%
1646
 
3.5%
1637
 
3.5%
1586
 
3.4%
Other values (86) 11399
24.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 36302
78.2%
Dash Punctuation 9984
 
21.5%
Other Punctuation 128
 
0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4503
12.4%
4481
12.3%
4085
11.3%
2822
 
7.8%
2518
 
6.9%
1753
 
4.8%
1646
 
4.5%
1637
 
4.5%
1586
 
4.4%
1422
 
3.9%
Other values (84) 9849
27.1%
Dash Punctuation
ValueCountFrequency (%)
- 9984
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 128
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 36302
78.2%
Common 10112
 
21.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4503
12.4%
4481
12.3%
4085
11.3%
2822
 
7.8%
2518
 
6.9%
1753
 
4.8%
1646
 
4.5%
1637
 
4.5%
1586
 
4.4%
1422
 
3.9%
Other values (84) 9849
27.1%
Common
ValueCountFrequency (%)
- 9984
98.7%
/ 128
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 36302
78.2%
ASCII 10112
 
21.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 9984
98.7%
/ 128
 
1.3%
Hangul
ValueCountFrequency (%)
4503
12.4%
4481
12.3%
4085
11.3%
2822
 
7.8%
2518
 
6.9%
1753
 
4.8%
1646
 
4.5%
1637
 
4.5%
1586
 
4.4%
1422
 
3.9%
Other values (84) 9849
27.1%

용도
Text

MISSING 

Distinct: 158
Distinct (%): 1.6%
Missing: 185
Missing (%): 1.8%
Memory size: 156.2 KiB

Length

Max length: 17
Median length: 16
Mean length: 10.058991
Min length: 3

Characters and Unicode

Total characters: 98729
Distinct characters: 166
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 28
Unique (%): 0.3%

Sample

1st row: 식-음식기-기타
2nd row: 의-의류-평상복
3rd row: 사회생활-의례생활-상장
4th row: 식-음식기-기타
5th row: 산업/생업-수렵-사냥구

Value                             Count    Frequency (%)
식-음식기-음식                    2851     29.0%
식-음식기-저장운반                1639     16.7%
주-건축부재-지붕재                889      9.1%
산업/생업-선사생활-생활구일체     643      6.6%
사회생활-의례생활-상장            472      4.8%
기타-용도불명-용도불명품          217      2.2%
산업/생업-공업-염직               184      1.9%
사회생활-의례생활-제례            165      1.7%
사회생활-사회제도-교육            163      1.7%
기타-자료                         146      1.5%
Other values (148)                2446     24.9%

Most occurring characters

ValueCountFrequency (%)
- 19630
19.9%
12519
 
12.7%
7511
 
7.6%
5648
 
5.7%
4781
 
4.8%
3783
 
3.8%
3156
 
3.2%
2582
 
2.6%
2488
 
2.5%
1918
 
1.9%
Other values (156) 34713
35.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 77375
78.4%
Dash Punctuation 19630
 
19.9%
Other Punctuation 1724
 
1.7%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
12519
16.2%
7511
 
9.7%
5648
 
7.3%
4781
 
6.2%
3783
 
4.9%
3156
 
4.1%
2582
 
3.3%
2488
 
3.2%
1918
 
2.5%
1729
 
2.2%
Other values (154) 31260
40.4%
Dash Punctuation
ValueCountFrequency (%)
- 19630
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 1724
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 77375
78.4%
Common 21354
 
21.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
12519
16.2%
7511
 
9.7%
5648
 
7.3%
4781
 
6.2%
3783
 
4.9%
3156
 
4.1%
2582
 
3.3%
2488
 
3.2%
1918
 
2.5%
1729
 
2.2%
Other values (154) 31260
40.4%
Common
ValueCountFrequency (%)
- 19630
91.9%
/ 1724
 
8.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 77375
78.4%
ASCII 21354
 
21.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 19630
91.9%
/ 1724
 
8.1%
Hangul
ValueCountFrequency (%)
12519
16.2%
7511
 
9.7%
5648
 
7.3%
4781
 
6.2%
3783
 
4.9%
3156
 
4.1%
2582
 
3.3%
2488
 
3.2%
1918
 
2.5%
1729
 
2.2%
Other values (154) 31260
40.4%

크기
Text

MISSING 

Distinct: 9395
Distinct (%): 95.5%
Missing: 160
Missing (%): 1.6%
Memory size: 156.2 KiB

Length

Max length: 191
Median length: 118
Mean length: 20.836484
Min length: 4

Characters and Unicode

Total characters: 205031
Distinct characters: 142
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9090
Unique (%): 92.4%

Sample

1st row: 현재높이 6.7cm,바닥지름 11.5cm
2nd row: 전체길이 105,허리둘레 94
3rd row: 전체길이 58
4th row: 현재길이 4.9
5th row: 현재길이 3.7,너비 .9

Value                  Count    Frequency (%)
높이                   2147     6.3%
현재높이               1817     5.4%
길이                   1621     4.8%
현재길이               1305     3.8%
가로                   538      1.6%
입지름                 481      1.4%
세로                   434      1.3%
두께                   357      1.1%
지름                   342      1.0%
바닥지름               287      0.8%
Other values (9803)    24591    72.5%

Most occurring characters

ValueCountFrequency (%)
. 26215
 
12.8%
24084
 
11.7%
, 14077
 
6.9%
1 12129
 
5.9%
9660
 
4.7%
2 8170
 
4.0%
5 8031
 
3.9%
0 7807
 
3.8%
3 6440
 
3.1%
4 6187
 
3.0%
Other values (132) 82231
40.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 67236
32.8%
Other Letter 64806
31.6%
Other Punctuation 40297
19.7%
Space Separator 24084
 
11.7%
Math Symbol 4764
 
2.3%
Lowercase Letter 2852
 
1.4%
Other Symbol 682
 
0.3%
Open Punctuation 147
 
0.1%
Close Punctuation 147
 
0.1%
Dash Punctuation 16
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
9660
14.9%
5696
 
8.8%
5693
 
8.8%
5629
 
8.7%
4191
 
6.5%
4188
 
6.5%
4026
 
6.2%
3437
 
5.3%
3436
 
5.3%
2509
 
3.9%
Other values (106) 16341
25.2%
Decimal Number
ValueCountFrequency (%)
1 12129
18.0%
2 8170
12.2%
5 8031
11.9%
0 7807
11.6%
3 6440
9.6%
4 6187
9.2%
6 5286
7.9%
8 5005
7.4%
7 4711
 
7.0%
9 3470
 
5.2%
Other Punctuation
ValueCountFrequency (%)
. 26215
65.1%
, 14077
34.9%
* 3
 
< 0.1%
/ 1
 
< 0.1%
· 1
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
m 1768
62.0%
c 1082
37.9%
x 2
 
0.1%
Math Symbol
ValueCountFrequency (%)
~ 4762
> 99.9%
× 2
 
< 0.1%
Other Symbol
ValueCountFrequency (%)
624
91.5%
58
 
8.5%
Space Separator
ValueCountFrequency (%)
24084
100.0%
Open Punctuation
ValueCountFrequency (%)
( 147
100.0%
Close Punctuation
ValueCountFrequency (%)
) 147
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 16
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 137373
67.0%
Hangul 64806
31.6%
Latin 2852
 
1.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
9660
14.9%
5696
 
8.8%
5693
 
8.8%
5629
 
8.7%
4191
 
6.5%
4188
 
6.5%
4026
 
6.2%
3437
 
5.3%
3436
 
5.3%
2509
 
3.9%
Other values (106) 16341
25.2%
Common
ValueCountFrequency (%)
. 26215
19.1%
24084
17.5%
, 14077
10.2%
1 12129
8.8%
2 8170
 
5.9%
5 8031
 
5.8%
0 7807
 
5.7%
3 6440
 
4.7%
4 6187
 
4.5%
6 5286
 
3.8%
Other values (13) 18947
13.8%
Latin
ValueCountFrequency (%)
m 1768
62.0%
c 1082
37.9%
x 2
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 139540
68.1%
Hangul 64806
31.6%
CJK Compat 682
 
0.3%
None 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 26215
18.8%
24084
17.3%
, 14077
10.1%
1 12129
8.7%
2 8170
 
5.9%
5 8031
 
5.8%
0 7807
 
5.6%
3 6440
 
4.6%
4 6187
 
4.4%
6 5286
 
3.8%
Other values (12) 21114
15.1%
Hangul
ValueCountFrequency (%)
9660
14.9%
5696
 
8.8%
5693
 
8.8%
5629
 
8.7%
4191
 
6.5%
4188
 
6.5%
4026
 
6.2%
3437
 
5.3%
3436
 
5.3%
2509
 
3.9%
Other values (106) 16341
25.2%
CJK Compat
ValueCountFrequency (%)
624
91.5%
58
 
8.5%
None
ValueCountFrequency (%)
× 2
66.7%
· 1
33.3%

출토지
Text

MISSING 

Distinct: 1332
Distinct (%): 14.1%
Missing: 584
Missing (%): 5.8%
Memory size: 156.2 KiB

Length

Max length: 50
Median length: 44
Mean length: 20.857795
Min length: 2

Characters and Unicode

Total characters: 196397
Distinct characters: 516
Distinct categories: 13
Distinct scripts: 4
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 750
Unique (%): 8.0%

Sample

1st row: 강원도-강릉시-초당동 57-2
2nd row: 대구광역시-달성군-현풍면
3rd row: 충청남도-공주시-송산리 무령왕릉공주군 공주읍
4th row: 충청북도-청주시-산남동 42-6번지 일원지표
5th row: 경상북도-경산시-임당동 625번지 일대 C-I지구 고분군 25호분
ValueCountFrequency (%)
일대 2757
 
9.4%
광주광역시-북구-동림동 1068
 
3.6%
전라남도-완도군-완도읍 965
 
3.3%
장좌리 964
 
3.3%
장도 861
 
2.9%
토광묘 527
 
1.8%
464
 
1.6%
453
 
1.5%
충청북도-청주시-상당구 409
 
1.4%
용정동 394
 
1.3%
Other values (1937) 20442
69.8%

Most occurring characters

ValueCountFrequency (%)
- 21289
 
10.8%
20176
 
10.3%
10892
 
5.5%
5458
 
2.8%
5085
 
2.6%
5061
 
2.6%
5020
 
2.6%
4136
 
2.1%
3974
 
2.0%
3952
 
2.0%
Other values (506) 111354
56.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 136954
69.7%
Dash Punctuation 21289
 
10.8%
Space Separator 20176
 
10.3%
Decimal Number 15042
 
7.7%
Close Punctuation 683
 
0.3%
Open Punctuation 683
 
0.3%
Uppercase Letter 607
 
0.3%
Other Punctuation 460
 
0.2%
Letter Number 404
 
0.2%
Lowercase Letter 95
 
< 0.1%
Other values (3) 4
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
10892
 
8.0%
5458
 
4.0%
5085
 
3.7%
5061
 
3.7%
5020
 
3.7%
4136
 
3.0%
3974
 
2.9%
3952
 
2.9%
3646
 
2.7%
3584
 
2.6%
Other values (460) 86146
62.9%
Uppercase Letter
ValueCountFrequency (%)
A 175
28.8%
C 144
23.7%
T 111
18.3%
B 76
12.5%
D 46
 
7.6%
E 21
 
3.5%
M 14
 
2.3%
I 4
 
0.7%
H 4
 
0.7%
F 4
 
0.7%
Other values (4) 8
 
1.3%
Decimal Number
ValueCountFrequency (%)
1 3146
20.9%
4 2021
13.4%
3 1882
12.5%
2 1874
12.5%
5 1343
8.9%
0 1168
 
7.8%
6 1000
 
6.6%
7 931
 
6.2%
9 912
 
6.1%
8 765
 
5.1%
Lowercase Letter
ValueCountFrequency (%)
r 84
88.4%
m 3
 
3.2%
i 2
 
2.1%
k 2
 
2.1%
t 2
 
2.1%
p 2
 
2.1%
Other Punctuation
ValueCountFrequency (%)
, 392
85.2%
. 58
 
12.6%
' 6
 
1.3%
· 3
 
0.7%
1
 
0.2%
Letter Number
ValueCountFrequency (%)
362
89.6%
36
 
8.9%
5
 
1.2%
1
 
0.2%
Dash Punctuation
ValueCountFrequency (%)
- 21289
100.0%
Space Separator
ValueCountFrequency (%)
20176
100.0%
Close Punctuation
ValueCountFrequency (%)
) 683
100.0%
Open Punctuation
ValueCountFrequency (%)
( 683
100.0%
Math Symbol
ValueCountFrequency (%)
~ 2
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%
Initial Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 136209
69.4%
Common 58337
29.7%
Latin 1106
 
0.6%
Han 745
 
0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
10892
 
8.0%
5458
 
4.0%
5085
 
3.7%
5061
 
3.7%
5020
 
3.7%
4136
 
3.0%
3974
 
2.9%
3952
 
2.9%
3646
 
2.7%
3584
 
2.6%
Other values (314) 85401
62.7%
Han
ValueCountFrequency (%)
59
 
7.9%
55
 
7.4%
48
 
6.4%
37
 
5.0%
32
 
4.3%
27
 
3.6%
19
 
2.6%
19
 
2.6%
15
 
2.0%
14
 
1.9%
Other values (136) 420
56.4%
Latin
ValueCountFrequency (%)
362
32.7%
A 175
15.8%
C 144
 
13.0%
T 111
 
10.0%
r 84
 
7.6%
B 76
 
6.9%
D 46
 
4.2%
36
 
3.3%
E 21
 
1.9%
M 14
 
1.3%
Other values (14) 37
 
3.3%
Common
ValueCountFrequency (%)
- 21289
36.5%
20176
34.6%
1 3146
 
5.4%
4 2021
 
3.5%
3 1882
 
3.2%
2 1874
 
3.2%
5 1343
 
2.3%
0 1168
 
2.0%
6 1000
 
1.7%
7 931
 
1.6%
Other values (12) 3507
 
6.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 136209
69.4%
ASCII 59033
30.1%
CJK 739
 
0.4%
Number Forms 404
 
0.2%
CJK Compat Ideographs 6
 
< 0.1%
None 4
 
< 0.1%
Punctuation 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 21289
36.1%
20176
34.2%
1 3146
 
5.3%
4 2021
 
3.4%
3 1882
 
3.2%
2 1874
 
3.2%
5 1343
 
2.3%
0 1168
 
2.0%
6 1000
 
1.7%
7 931
 
1.6%
Other values (28) 4203
 
7.1%
Hangul
ValueCountFrequency (%)
10892
 
8.0%
5458
 
4.0%
5085
 
3.7%
5061
 
3.7%
5020
 
3.7%
4136
 
3.0%
3974
 
2.9%
3952
 
2.9%
3646
 
2.7%
3584
 
2.6%
Other values (314) 85401
62.7%
Number Forms
ValueCountFrequency (%)
362
89.6%
36
 
8.9%
5
 
1.2%
1
 
0.2%
CJK
ValueCountFrequency (%)
59
 
8.0%
55
 
7.4%
48
 
6.5%
37
 
5.0%
32
 
4.3%
27
 
3.7%
19
 
2.6%
19
 
2.6%
15
 
2.0%
14
 
1.9%
Other values (132) 414
56.0%
None
ValueCountFrequency (%)
· 3
75.0%
1
 
25.0%
CJK Compat Ideographs
ValueCountFrequency (%)
2
33.3%
2
33.3%
1
16.7%
1
16.7%
Punctuation
ValueCountFrequency (%)
1
50.0%
1
50.0%

지정구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value       Count
<NA>        9783
중요민속    187
국보        18
보물        8
도지정      4

Length

Max length: 4
Median length: 4
Mean length: 3.9944
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: 중요민속
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value       Count    Frequency (%)
<NA>        9783     97.8%
중요민속    187      1.9%
국보        18       0.2%
보물        8        0.1%
도지정      4        < 0.1%

Length

Histogram of lengths of the category


설명
Text

MISSING 

Distinct: 1227
Distinct (%): 98.9%
Missing: 8759
Missing (%): 87.6%
Memory size: 156.2 KiB

Length

Max length: 1024
Median length: 424
Mean length: 279.91459
Min length: 1

Characters and Unicode

Total characters: 347374
Distinct characters: 2824
Distinct categories: 16
Distinct scripts: 7
Distinct blocks: 17
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1217
Unique (%): 98.1%

Sample

1st row각 판은 표면이 오목하게 되어있고 여기에 2개의 心葉形 瓔珞이 달려있는데 대형 판은 8개가 연결되고 전체는 도금 또는 금은합금으로 제작된 듯하며 이 중에서 소형판과 영락에 특히 금색이 잘 남아있다. 상단은 두꺼비무늬이나 끝의 하부판은 도깨비문늬가 장식되고 가장자리에 각 선, 점으로 波狀文이 돌려져 있고 장방형 장식표면에는 녹이 심해 확실치는 않으나 파상선 윤곽선 내부에 두마리의 용 비슷한 각선이 되어있다. 한쪽끝의 두꺼비 배에 심엽형을 도치한 두각문을 새기고 네 다리는 금방이라도 뛰어갈 듯한 자세로 구부리고 있는 형상이 위가 삐죽삐죽한 네모꼴의 틀 안에 맞새김 되었다.
2nd row귀걸이[耳飾]는 대체로 귀에 다는 고리[環]와 중간장식[中間裝飾], 드리개[垂飾]등 3부분으로 구성되어 있는데, 고리의 굵기에 따라 굵은 고리인 태환이식(太環耳飾) 과 가는 고리인 세환이식(細環耳飾)으로 나누어진다.굵은 고리식은 대체로 샛장식이 작은 고리을 위 아래로 돌려 붙여 만든 둥근 달개장식[瓔珞]을 단 형태이며, 드리개는 대부분 하트 모양인 심엽형(心葉形)이다. 또 고리의 표면이나 드리개의 가장자리에 금알갱이를 붙이는 누금세공을 한다.가는 고리식은 굵은 고리에 비해 모양이 다양하며, 심엽형이나 펜촉형, 또는 금판을 꼰 형태도 있다.
3rd row 金銅製耳飾으로, 금도금이 많이 벗겨지고 일부 부식이 일어난 모습이 보인다. 평·단면 원형으로, 하나의 봉으로 구부려 제작한 것으로 環端부분이 약간 떨어져 있다. 垂下飾과 中間飾이 원래 있었는데 없어졌는지 아니면 단독 素環飾이였는지는 정확히 알수가 없다.
4th row내저 원각에 4군데의 태토 받침 흔적이 있는 백자접시이다. 구연부와 측사면 일부는 유약이 산화되어 黃味를 띠고 있다.기면의 바깥면은 물레 자국이 선명하며 유약이 뭉쳐 흘러내린 곳이 있다.
5th row圓形의 銅製鏡. 거울 직경이 약 26cm로 비교적 크고 무거운 편이다. 周緣은 단면 사다리꼴로 비교적 높게 이루어져 있으며 內部는 2개의 圓圈이 돌려져 3구획되어 있다. 內區는 連花文이 장식되어 있는데 中瓣양식으로 外瓣은 14엽을 가진 복판단엽. 內瓣은 희미하나 8엽을 가진 복판단엽형식을 지니고 있으며 자방부에는 半圓形의 單ⓣ가 부착되어 있다. 中區는 2마리의 龍이 대칭되어 있는 여의주를 향해 연이어 있는 모습이 양각화되어 있다. 外區는 草花文이 장식되어 있는데 8개의 꽃과 잎이 등간격으로 이어져 있다. 부식이 진행된 상태이고, ⓣ의 일부가 결손되어 있다. (조인진)
ValueCountFrequency (%)
있다 1249
 
1.7%
468
 
0.6%
있는 375
 
0.5%
것으로 340
 
0.5%
331
 
0.4%
있으며 311
 
0.4%
281
 
0.4%
있고 255
 
0.3%
약간 222
 
0.3%
206
 
0.3%
Other values (26744) 71286
94.6%

Most occurring characters

ValueCountFrequency (%)
76025
 
21.9%
8997
 
2.6%
7270
 
2.1%
. 6756
 
1.9%
6465
 
1.9%
6163
 
1.8%
5131
 
1.5%
4817
 
1.4%
3887
 
1.1%
3832
 
1.1%
Other values (2814) 218031
62.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 246746
71.0%
Space Separator 76042
 
21.9%
Other Punctuation 11186
 
3.2%
Decimal Number 5229
 
1.5%
Close Punctuation 2955
 
0.9%
Open Punctuation 2919
 
0.8%
Lowercase Letter 1011
 
0.3%
Math Symbol 900
 
0.3%
Uppercase Letter 144
 
< 0.1%
Dash Punctuation 137
 
< 0.1%
Other values (6) 105
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
8997
 
3.6%
7270
 
2.9%
6465
 
2.6%
6163
 
2.5%
5131
 
2.1%
4817
 
2.0%
3887
 
1.6%
3832
 
1.6%
3712
 
1.5%
3472
 
1.4%
Other values (2674) 193000
78.2%
Lowercase Letter
ValueCountFrequency (%)
m 425
42.0%
c 404
40.0%
e 23
 
2.3%
n 19
 
1.9%
i 18
 
1.8%
a 14
 
1.4%
r 14
 
1.4%
u 14
 
1.4%
s 13
 
1.3%
p 10
 
1.0%
Other values (15) 57
 
5.6%
Uppercase Letter
ValueCountFrequency (%)
Ш 35
24.3%
X 18
12.5%
V 14
 
9.7%
U 14
 
9.7%
S 9
 
6.2%
C 8
 
5.6%
D 5
 
3.5%
A 5
 
3.5%
L 5
 
3.5%
M 4
 
2.8%
Other values (15) 27
18.8%
Other Symbol
ValueCountFrequency (%)
7
14.9%
6
12.8%
° 5
10.6%
5
10.6%
4
8.5%
4
8.5%
3
6.4%
3
6.4%
2
 
4.3%
2
 
4.3%
Other values (6) 6
12.8%
Other Punctuation
ValueCountFrequency (%)
. 6756
60.4%
, 3348
29.9%
· 382
 
3.4%
' 321
 
2.9%
/ 195
 
1.7%
" 80
 
0.7%
: 40
 
0.4%
16
 
0.1%
* 15
 
0.1%
% 13
 
0.1%
Other values (4) 20
 
0.2%
Decimal Number
ValueCountFrequency (%)
1 1200
22.9%
2 738
14.1%
3 561
10.7%
4 533
10.2%
5 471
 
9.0%
0 450
 
8.6%
8 378
 
7.2%
6 336
 
6.4%
7 294
 
5.6%
9 268
 
5.1%
Math Symbol
ValueCountFrequency (%)
> 390
43.3%
< 390
43.3%
~ 56
 
6.2%
30
 
3.3%
+ 14
 
1.6%
× 11
 
1.2%
6
 
0.7%
1
 
0.1%
= 1
 
0.1%
1
 
0.1%
Other Number
ValueCountFrequency (%)
3
23.1%
3
23.1%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
Close Punctuation
ValueCountFrequency (%)
) 2712
91.8%
] 187
 
6.3%
17
 
0.6%
13
 
0.4%
12
 
0.4%
9
 
0.3%
4
 
0.1%
} 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 2671
91.5%
[ 190
 
6.5%
17
 
0.6%
13
 
0.4%
13
 
0.4%
9
 
0.3%
4
 
0.1%
{ 2
 
0.1%
Letter Number
ValueCountFrequency (%)
2
25.0%
2
25.0%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Space Separator
ValueCountFrequency (%)
76025
> 99.9%
  17
 
< 0.1%
Initial Punctuation
ValueCountFrequency (%)
15
88.2%
2
 
11.8%
Final Punctuation
ValueCountFrequency (%)
15
93.8%
1
 
6.2%
Modifier Symbol
ValueCountFrequency (%)
˚ 3
75.0%
^ 1
 
25.0%
Dash Punctuation
ValueCountFrequency (%)
- 137
100.0%

Most occurring scripts

Value     Count   Frequency (%)
Hangul    228879  65.9%
Common    99448   28.6%
Han       17881   5.1%
Latin     1118    0.3%
Cyrillic  43      < 0.1%
Hiragana  3       < 0.1%
Greek     2       < 0.1%

Most frequent character per script

Han
ValueCountFrequency (%)
506
 
2.8%
468
 
2.6%
368
 
2.1%
268
 
1.5%
231
 
1.3%
172
 
1.0%
164
 
0.9%
153
 
0.9%
148
 
0.8%
144
 
0.8%
Other values (1674) 15259
85.3%
Hangul
ValueCountFrequency (%)
8997
 
3.9%
7270
 
3.2%
6465
 
2.8%
6163
 
2.7%
5131
 
2.2%
4817
 
2.1%
3887
 
1.7%
3832
 
1.7%
3712
 
1.6%
3472
 
1.5%
Other values (991) 175133
76.5%
Common
ValueCountFrequency (%)
76025
76.4%
. 6756
 
6.8%
, 3348
 
3.4%
) 2712
 
2.7%
( 2671
 
2.7%
1 1200
 
1.2%
2 738
 
0.7%
3 561
 
0.6%
4 533
 
0.5%
5 471
 
0.5%
Other values (70) 4433
 
4.5%
Latin
Value              Count  Frequency (%)
m                  425    38.0%
c                  404    36.1%
e                  23     2.1%
n                  19     1.7%
i                  18     1.6%
X                  18     1.6%
V                  14     1.3%
a                  14     1.3%
r                  14     1.3%
u                  14     1.3%
Other values (40)  155    13.9%
Cyrillic
Value  Count  Frequency (%)
Ш      35     81.4%
а      4      9.3%
Г      3      7.0%
Ь      1      2.3%
Hiragana
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%
Greek
Value  Count  Frequency (%)
Β      1      50.0%
Ζ      1      50.0%

Most occurring blocks

Value                  Count   Frequency (%)
Hangul                 228841  65.9%
ASCII                  99894   28.8%
CJK                    17357   5.0%
None                   569     0.2%
CJK Compat Ideographs  524     0.2%
Cyrillic               43      < 0.1%
Punctuation            35      < 0.1%
Math Operators         31      < 0.1%
Enclosed Alphanum      24      < 0.1%
Compat Jamo            21      < 0.1%
Other values (7)       35      < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
76025
76.1%
. 6756
 
6.8%
, 3348
 
3.4%
) 2712
 
2.7%
( 2671
 
2.7%
1 1200
 
1.2%
2 738
 
0.7%
3 561
 
0.6%
4 533
 
0.5%
5 471
 
0.5%
Other values (67) 4879
 
4.9%
Hangul
ValueCountFrequency (%)
8997
 
3.9%
7270
 
3.2%
6465
 
2.8%
6163
 
2.7%
5131
 
2.2%
4817
 
2.1%
3887
 
1.7%
3832
 
1.7%
3712
 
1.6%
3472
 
1.5%
Other values (980) 175095
76.5%
CJK
ValueCountFrequency (%)
506
 
2.9%
468
 
2.7%
368
 
2.1%
268
 
1.5%
231
 
1.3%
172
 
1.0%
164
 
0.9%
153
 
0.9%
148
 
0.9%
144
 
0.8%
Other values (1591) 14735
84.9%
None
ValueCountFrequency (%)
· 382
67.1%
  17
 
3.0%
17
 
3.0%
17
 
3.0%
16
 
2.8%
13
 
2.3%
13
 
2.3%
13
 
2.3%
12
 
2.1%
× 11
 
1.9%
Other values (13) 58
 
10.2%
CJK Compat Ideographs
ValueCountFrequency (%)
110
21.0%
62
 
11.8%
25
 
4.8%
22
 
4.2%
21
 
4.0%
17
 
3.2%
15
 
2.9%
15
 
2.9%
15
 
2.9%
13
 
2.5%
Other values (73) 209
39.9%
Cyrillic
Value  Count  Frequency (%)
Ш      35     81.4%
а      4      9.3%
Г      3      7.0%
Ь      1      2.3%
Math Operators
ValueCountFrequency (%)
30
96.8%
1
 
3.2%
Punctuation
ValueCountFrequency (%)
15
42.9%
15
42.9%
2
 
5.7%
2
 
5.7%
1
 
2.9%
Compat Jamo
ValueCountFrequency (%)
7
33.3%
5
23.8%
4
19.0%
2
 
9.5%
1
 
4.8%
1
 
4.8%
1
 
4.8%
Enclosed Alphanum
ValueCountFrequency (%)
6
25.0%
3
12.5%
3
12.5%
3
12.5%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
Other values (3) 3
12.5%
Arrows
ValueCountFrequency (%)
6
85.7%
1
 
14.3%
CJK Compat
ValueCountFrequency (%)
4
66.7%
2
33.3%
Modifier Letters
ValueCountFrequency (%)
˚ 3
100.0%
Geometric Shapes
ValueCountFrequency (%)
3
50.0%
2
33.3%
1
 
16.7%
Number Forms
ValueCountFrequency (%)
2
22.2%
2
22.2%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
Letterlike Symbols
ValueCountFrequency (%)
1
100.0%
Hiragana
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

Interactions


Correlations

            소장구분  소장품번호  세부번호  국적-시대  재질   지정구분
소장구분    1.000     0.833       0.000     0.801      0.865  0.928
소장품번호  0.833     1.000       0.000     0.789      0.705  0.446
세부번호    0.000     0.000       1.000     0.000      0.000  NaN
국적-시대   0.801     0.789       0.000     1.000      0.878  0.767
재질        0.865     0.705       0.000     0.878      1.000  0.907
지정구분    0.928     0.446       NaN       0.767      0.907  1.000

           지정구분  국적-시대  소장구분
지정구분   1.000     0.716      0.817
국적-시대  0.716     1.000      0.304
소장구분   0.817     0.304      1.000

            소장품번호  세부번호  소장구분  국적-시대  지정구분
소장품번호  1.000       -0.130    0.479     0.379      0.299
세부번호    -0.130      1.000     0.000     0.000      1.000
소장구분    0.479       0.000     1.000     0.304      0.817
국적-시대   0.379       0.000     0.304     1.000      0.716
지정구분    0.299       1.000     0.817     0.716      1.000
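
The matrices in this section are not labelled with the coefficient that produced them. For pairs of categorical columns such as 소장구분 and 지정구분, one widely used association measure is Cramér's V. The sketch below illustrates that measure under this assumption and is not confirmed to be the report's method. The function name `cramers_v` and the toy inputs are hypothetical, and no bias correction is applied.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Cramér's V (uncorrected) for two equal-length categorical sequences.

    Pairs where either value is None are dropped. Returns NaN when one of
    the variables has fewer than two distinct values, since the statistic
    is undefined in that case.
    """
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    row_totals = Counter(x for x, _ in pairs)
    col_totals = Counter(y for _, y in pairs)
    k = min(len(row_totals), len(col_totals))
    if n == 0 or k < 2:
        return float("nan")

    observed = Counter(pairs)
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            chi2 += (observed[(x, y)] - expected) ** 2 / expected
    return sqrt(chi2 / (n * (k - 1)))


if __name__ == "__main__":
    # Toy data: the second list is fully determined by the first.
    print(cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"]))
    # Toy data: the two lists are independent.
    print(cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"]))
```

Values close to 1 indicate that one column nearly determines the other, and values close to 0 indicate little association.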

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
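
Nullity correlation as described here can be read as the Pearson correlation between two columns' missing-value indicators (1 where a cell is missing, 0 otherwise). The following is a minimal standard-library sketch under that reading. The function name `nullity_correlation` and the toy rows are hypothetical, with column names borrowed from this dataset purely for illustration.

```python
from math import sqrt


def nullity_correlation(rows, col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns.

    `rows` is a list of dicts; a cell counts as missing when its value is None
    or the key is absent. Returns None when either column is never missing or
    always missing, because the correlation is undefined for a constant
    indicator.
    """
    n = len(rows)
    if n == 0:
        return None
    a = [1.0 if r.get(col_a) is None else 0.0 for r in rows]
    b = [1.0 if r.get(col_b) is None else 0.0 for r in rows]
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)


if __name__ == "__main__":
    # Toy rows: 한자명칭 and 설명 are missing in exactly the same records.
    rows = [
        {"한자명칭": None, "설명": None},
        {"한자명칭": "土器片", "설명": "text"},
        {"한자명칭": None, "설명": None},
        {"한자명칭": "鐵鏃", "설명": "text"},
    ]
    print(nullity_correlation(rows, "한자명칭", "설명"))
```

A value near 1 means the two columns tend to be missing together, a value near -1 means one tends to be present when the other is missing, and a value near 0 means their missingness is unrelated.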

Sample

소장구분  소장품번호  세부번호  유물명  한자명칭  국적-시대  재질  용도  크기  출토지  지정구분  설명
9478국립1-춘천-춘천12820토기바닥조각土器底部片한국-원삼국토제-연질식-음식기-기타현재높이 6.7cm,바닥지름 11.5cm강원도-강릉시-초당동 57-2<NA><NA>
7728국립1-대구-증2000무명 홑바지한국-조선사직-면의-의류-평상복전체길이 105,허리둘레 94대구광역시-달성군-현풍면중요민속<NA>
268국립1-공주-공주6850金銀製腰佩<NA>한국-백제금속-금사회생활-의례생활-상장전체길이 58충청남도-공주시-송산리 무령왕릉공주군 공주읍<NA>각 판은 표면이 오목하게 되어있고 여기에 2개의 心葉形 瓔珞이 달려있는데 대형 판은 8개가 연결되고 전체는 도금 또는 금은합금으로 제작된 듯하며 이 중에서 소형판과 영락에 특히 금색이 잘 남아있다. 상단은 두꺼비무늬이나 끝의 하부판은 도깨비문늬가 장식되고 가장자리에 각 선, 점으로 波狀文이 돌려져 있고 장방형 장식표면에는 녹이 심해 확실치는 않으나 파상선 윤곽선 내부에 두마리의 용 비슷한 각선이 되어있다. 한쪽끝의 두꺼비 배에 심엽형을 도치한 두각문을 새기고 네 다리는 금방이라도 뛰어갈 듯한 자세로 구부리고 있는 형상이 위가 삐죽삐죽한 네모꼴의 틀 안에 맞새김 되었다.
7226국립1-청주-청주212620토기편土器片한국-삼국토제-연질식-음식기-기타현재길이 4.9충청북도-청주시-산남동 42-6번지 일원지표<NA><NA>
7520국립1-대구-임당53720鐵鏃쇠살촉한국-원삼국금속-철산업/생업-수렵-사냥구현재길이 3.7,너비 .9경상북도-경산시-임당동 625번지 일대 C-I지구 고분군 25호분<NA><NA>
2802국립1-광주-광주112730소옹소옹한국-삼국토제-연질사회생활-의례생활-상장현재높이 89.7전라남도-나주시-공산면 금곡리 산 24-1번지 외<NA><NA>
8862국립1-춘천-춘천6070그물추漁網錘한국-신석기석-산업/생업-선사생활-생활구일체너비 3.5,길이 5.6,두께 1.3강원도-양양군-손양면 오산리 81-3 5차<NA><NA>
5206국립1-청주-청주10420金銅細環耳飾金銅細環耳飾한국-신라금속-금동의-장신구-신체장식지름 2.5충청북도-충주시-가금면 루암리 산 36번지<NA>귀걸이[耳飾]는 대체로 귀에 다는 고리[環]와 중간장식[中間裝飾], 드리개[垂飾]등 3부분으로 구성되어 있는데, 고리의 굵기에 따라 굵은 고리인 태환이식(太環耳飾) 과 가는 고리인 세환이식(細環耳飾)으로 나누어진다.굵은 고리식은 대체로 샛장식이 작은 고리을 위 아래로 돌려 붙여 만든 둥근 달개장식[瓔珞]을 단 형태이며, 드리개는 대부분 하트 모양인 심엽형(心葉形)이다. 또 고리의 표면이나 드리개의 가장자리에 금알갱이를 붙이는 누금세공을 한다.가는 고리식은 굵은 고리에 비해 모양이 다양하며, 심엽형이나 펜촉형, 또는 금판을 꼰 형태도 있다.
1648국립1-광주-광주101170수키와수키와한국-통일신라토제-와질주-건축부재-지붕재현재길이 6.0~18.7,두께 1.4~2.1전라남도-완도군-완도읍 장좌리 장도 일대<NA><NA>
6520국립1-청주-청주73590확돌石臼石臼한국-석-식-가공-가공지름 67,현재높이 15,구멍지름 23엄순녀 기증품<NA><NA>
소장구분  소장품번호  세부번호  유물명  한자명칭  국적-시대  재질  용도  크기  출토지  지정구분  설명
10279국립1-춘천-수증1870물레<NA>한국-광복이후나무-산업/생업-공업-염직높이 41,길이 52강원도-평창군-<NA><NA>
6797국립1-청주-청주76360민무늬토기無文土器無文土器한국-청동기토제-연질산업/생업-선사생활-생활구일체현재높이 2.8,바닥지름 6.0,현재높이 3.2,바닥지름 6.0충청북도-음성군-음성읍 하당신천리 일원 6호 주거지<NA><NA>
9150국립1-춘천-춘천9520재갈<NA>한국-백제금속-철교통/통신-마구-제어길이 38.9강원도-원주시-부론면 법천리 2차 지표<NA><NA>
2464국립1-광주-광주109350시루시루한국-초기철기토제-연질식-취사-취사높이 25.1,바닥지름 10.2전라남도-순천시-가곡동 137-4 일원<NA><NA>
311국립1-공주-공주13450細形銅劍<NA>한국-청동기금속-동합금군사-근력무기-도검길이 28,너비 3충청남도-서산시-음암면 탑곡리 549-1서산군<NA>細形銅劍은 칼날이 부식되어 심하게 결손된 상태이다. 형식은 전체적으로 세장하며 끝이 예리한 편이고 등대의 간면은 슴베에 미치지 못하며 缺入部에서 끝난다. 등대 간면의 결입부 상단부에 節이 있다. 이러한 형식으로 세형동검은 1식에 해당하는 것으로 보여지며 시기는 3C말에서 4C초의 것으로 추정된다.
4331국립1-광주-광주128020把手파수한국-삼국토제-연질식-음식기-저장운반현재길이 5.2~8.0광주광역시-북구-동림동 일대<NA><NA>
5449국립1-청주-청주41790토기바리土器鉢한국-원삼국토제-연질식-음식기-음식높이 9.7,입지름 11.5,바닥지름 7.2충청북도-청원군-오창면 송대리(마-58호)<NA><NA>
3540국립1-광주-광주120110土器口緣部片토기구연부편한국-청동기토제-연질식-음식기-음식현재높이 3.9~4.6,두께 0.4~0.5광주광역시-북구-동림동 일대<NA><NA>
9768국립1-춘천-춘천15860접합유물接合遺物한국-구석기석-산업/생업-선사생활-생활구일체두께 15.6㎜,가로 47.7㎜,세로 69.2㎜,두께 14.8㎜,가로 41.0㎜,세로 63.8㎜강원도-동해시-망상동 408-7<NA><NA>
701국립1-광주-광주29320磨製石劍간돌검한국-청동기석-기타군사-근력무기-도검길이 25.0,길이 23.0,길이 20.5,길이 18.4,길이 16.9전라남도-순천시-송광면 우산리 내우<NA><NA>