Overview

Dataset statistics

Number of variables: 13
Number of observations: 10000
Missing cells: 8930
Missing cells (%): 6.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.1 MiB
Average record size in memory: 115.0 B
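
The missing-cell percentage can be cross-checked from the other figures in this report. The sketch below assumes the percentage is missing cells divided by variables times observations, and that the two columns flagged as Missing in the Alerts (입수처 and 특징) are the only columns with gaps; both assumptions are consistent with the reported numbers but are not stated by the report itself.

```python
# Figures copied from the report; the formula is an assumption.
n_variables = 13
n_observations = 10_000
missing_per_column = {"입수처": 5232, "특징": 3698}

total_cells = n_variables * n_observations
missing_cells = sum(missing_per_column.values())
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)          # total missing cells across both columns
print(f"{missing_pct:.1f}%")  # share of all cells that are missing
```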

Variable types

Categorical: 6
Numeric: 3
Text: 4

Dataset

Description: Collection information for the Sosu Museum (소수박물관), Yeongju-si, Gyeongsangbuk-do
Author: Yeongju-si, Gyeongsangbuk-do (경상북도 영주시)
URL: https://www.data.go.kr/data/3052016/fileData.do

Alerts

데이터기준일자 has constant value "" | Constant
소장구분 is highly overall correlated with 국적/시대 and 2 other fields | High correlation
지정구분 is highly overall correlated with 소장품번호 and 5 other fields | High correlation
입수연유 is highly overall correlated with 소장구분 and 2 other fields | High correlation
국적/시대 is highly overall correlated with 소장구분 and 2 other fields | High correlation
소장품번호 is highly overall correlated with 지정구분 | High correlation
주수량 is highly overall correlated with 지정구분 | High correlation
지정번호 is highly overall correlated with 지정구분 | High correlation
국적/시대 is highly imbalanced (89.1%) | Imbalance
재질 is highly imbalanced (85.8%) | Imbalance
지정구분 is highly imbalanced (96.6%) | Imbalance
입수처 has 5232 (52.3%) missing values | Missing
특징 has 3698 (37.0%) missing values | Missing
주수량 is highly skewed (γ1 = 69.8726254) | Skewed
지정번호 has 9907 (99.1%) zeros | Zeros

Reproduction

Analysis started: 2023-12-12 09:16:16.088428
Analysis finished: 2023-12-12 09:16:20.333627
Duration: 4.25 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
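
The reported duration follows directly from the two timestamps. A minimal sketch, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2023-12-12 09:16:16.088428", FMT)
finished = datetime.strptime("2023-12-12 09:16:20.333627", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```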

Variables

소장구분 (holding category)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 11
Median length: 11
Mean length: 11
Min length: 11

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 공립-소수박물관-기증
2nd row: 공립-소수박물관-구입
3rd row: 공립-소수박물관-구입
4th row: 공립-소수박물관-구입
5th row: 공립-소수박물관-기증

Common Values

Value | Count | Frequency (%)
공립-소수박물관-기증 | 4968 | 49.7%
공립-소수박물관-구입 | 4784 | 47.8%
공립-소수박물관-소수 | 248 | 2.5%

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

소장품번호 (collection item number)
Real number (ℝ)

HIGH CORRELATION

Distinct: 6331
Distinct (%): 63.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3310.9942
Minimum: 1
Maximum: 6888
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 232.95
Q1: 1575
median: 3302.5
Q3: 5026.25
95-th percentile: 6411.05
Maximum: 6888
Range: 6887
Interquartile range (IQR): 3451.25

Descriptive statistics

Standard deviation: 1990.5386
Coefficient of variation (CV): 0.60119061
Kurtosis: -1.2056852
Mean: 3310.9942
Median Absolute Deviation (MAD): 1726
Skewness: 0.012737548
Sum: 33109942
Variance: 3962244
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
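
Several of the 소장품번호 statistics are derived from others, so they can be sanity-checked against each other. The sketch below uses the standard definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared); that the report uses exactly these definitions is an assumption, although the numbers agree.

```python
# Inputs copied from the quantile and descriptive statistics of 소장품번호.
minimum, maximum = 1, 6888
q1, q3 = 1575, 5026.25
mean, std_dev = 3310.9942, 1990.5386

value_range = maximum - minimum   # spread between the extremes
iqr = q3 - q1                     # spread of the middle half of the data
cv = std_dev / mean               # relative dispersion (unitless)
variance = std_dev ** 2           # square of the standard deviation

print(value_range, iqr, round(cv, 5), round(variance))
```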
Most frequent values

Value | Count | Frequency (%)
192 | 3 | < 0.1%
107 | 3 | < 0.1%
149 | 3 | < 0.1%
191 | 3 | < 0.1%
198 | 3 | < 0.1%
104 | 3 | < 0.1%
106 | 3 | < 0.1%
171 | 3 | < 0.1%
176 | 3 | < 0.1%
114 | 3 | < 0.1%
Other values (6321) | 9970 | 99.7%

Smallest values

Value | Count | Frequency (%)
1 | 3 | < 0.1%
2 | 2 | < 0.1%
3 | 2 | < 0.1%
4 | 3 | < 0.1%
5 | 1 | < 0.1%
6 | 3 | < 0.1%
7 | 3 | < 0.1%
8 | 2 | < 0.1%
9 | 1 | < 0.1%
10 | 3 | < 0.1%

Largest values

Value | Count | Frequency (%)
6888 | 1 | < 0.1%
6887 | 1 | < 0.1%
6886 | 1 | < 0.1%
6885 | 1 | < 0.1%
6884 | 1 | < 0.1%
6883 | 1 | < 0.1%
6882 | 1 | < 0.1%
6880 | 1 | < 0.1%
6879 | 1 | < 0.1%
6878 | 1 | < 0.1%

명칭 (name)
Text

Distinct: 4836
Distinct (%): 48.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 32
Median length: 30
Mean length: 4.6265
Min length: 1

Characters and Unicode

Total characters: 46265
Distinct characters: 728
Distinct categories: 12
Distinct scripts: 4
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4125
Unique (%): 41.2%

Sample

1st row: 낙암집
2nd row: 간찰(정흠)
3rd row: 준호구
4th row: 영주군수위세보
5th row: 소남실기
Value | Count | Frequency (%)
간찰 | 1164 | 10.3%
명문 | 503 | 4.4%
문서 | 488 | 4.3%
호구단자 | 318 | 2.8%
교지 | 304 | 2.7%
소지 | 249 | 2.2%
준호구 | 225 | 2.0%
시문 | 185 | 1.6%
제문 | 163 | 1.4%
호적단자 | 157 | 1.4%
Other values (5121) | 7556 | 66.8%

Most occurring characters

ValueCountFrequency (%)
2613
 
5.6%
2028
 
4.4%
1968
 
4.3%
( 1914
 
4.1%
) 1914
 
4.1%
1365
 
3.0%
1308
 
2.8%
1067
 
2.3%
949
 
2.1%
838
 
1.8%
Other values (718) 30301
65.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 40632
87.8%
Open Punctuation 1932
 
4.2%
Close Punctuation 1932
 
4.2%
Space Separator 1312
 
2.8%
Decimal Number 186
 
0.4%
Other Punctuation 98
 
0.2%
Other Symbol 83
 
0.2%
Uppercase Letter 49
 
0.1%
Dash Punctuation 34
 
0.1%
Math Symbol 3
 
< 0.1%
Other values (2) 4
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2613
 
6.4%
2028
 
5.0%
1968
 
4.8%
1365
 
3.4%
1067
 
2.6%
949
 
2.3%
838
 
2.1%
826
 
2.0%
800
 
2.0%
701
 
1.7%
Other values (683) 27477
67.6%
Decimal Number
ValueCountFrequency (%)
1 44
23.7%
2 42
22.6%
3 28
15.1%
4 16
 
8.6%
0 16
 
8.6%
8 8
 
4.3%
5 7
 
3.8%
9 6
 
3.2%
7 5
 
2.7%
6 5
 
2.7%
Other values (5) 9
 
4.8%
Other Punctuation
ValueCountFrequency (%)
, 72
73.5%
? 18
 
18.4%
: 4
 
4.1%
/ 3
 
3.1%
1
 
1.0%
Open Punctuation
ValueCountFrequency (%)
( 1914
99.1%
18
 
0.9%
Close Punctuation
ValueCountFrequency (%)
) 1914
99.1%
18
 
0.9%
Space Separator
ValueCountFrequency (%)
1308
99.7%
  4
 
0.3%
Math Symbol
ValueCountFrequency (%)
+ 2
66.7%
~ 1
33.3%
Initial Punctuation
ValueCountFrequency (%)
1
50.0%
1
50.0%
Final Punctuation
ValueCountFrequency (%)
1
50.0%
1
50.0%
Other Symbol
ValueCountFrequency (%)
83
100.0%
Uppercase Letter
ValueCountFrequency (%)
O 49
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 34
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 40094
86.7%
Common 5584
 
12.1%
Han 538
 
1.2%
Latin 49
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2613
 
6.5%
2028
 
5.1%
1968
 
4.9%
1365
 
3.4%
1067
 
2.7%
949
 
2.4%
838
 
2.1%
826
 
2.1%
800
 
2.0%
701
 
1.7%
Other values (476) 26939
67.2%
Han
ValueCountFrequency (%)
25
 
4.6%
23
 
4.3%
19
 
3.5%
15
 
2.8%
14
 
2.6%
14
 
2.6%
13
 
2.4%
13
 
2.4%
13
 
2.4%
12
 
2.2%
Other values (197) 377
70.1%
Common
ValueCountFrequency (%)
( 1914
34.3%
) 1914
34.3%
1308
23.4%
83
 
1.5%
, 72
 
1.3%
1 44
 
0.8%
2 42
 
0.8%
- 34
 
0.6%
3 28
 
0.5%
18
 
0.3%
Other values (24) 127
 
2.3%
Latin
ValueCountFrequency (%)
O 49
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 40094
86.7%
ASCII 5496
 
11.9%
CJK 528
 
1.1%
Geometric Shapes 83
 
0.2%
None 49
 
0.1%
CJK Compat Ideographs 10
 
< 0.1%
Punctuation 5
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2613
 
6.5%
2028
 
5.1%
1968
 
4.9%
1365
 
3.4%
1067
 
2.7%
949
 
2.4%
838
 
2.1%
826
 
2.1%
800
 
2.0%
701
 
1.7%
Other values (476) 26939
67.2%
ASCII
ValueCountFrequency (%)
( 1914
34.8%
) 1914
34.8%
1308
23.8%
, 72
 
1.3%
O 49
 
0.9%
1 44
 
0.8%
2 42
 
0.8%
- 34
 
0.6%
3 28
 
0.5%
? 18
 
0.3%
Other values (11) 73
 
1.3%
Geometric Shapes
ValueCountFrequency (%)
83
100.0%
CJK
ValueCountFrequency (%)
25
 
4.7%
23
 
4.4%
19
 
3.6%
15
 
2.8%
14
 
2.7%
14
 
2.7%
13
 
2.5%
13
 
2.5%
13
 
2.5%
12
 
2.3%
Other values (189) 367
69.5%
None
ValueCountFrequency (%)
18
36.7%
18
36.7%
  4
 
8.2%
3
 
6.1%
2
 
4.1%
2
 
4.1%
1
 
2.0%
1
 
2.0%
CJK Compat Ideographs
ValueCountFrequency (%)
3
30.0%
1
 
10.0%
1
 
10.0%
1
 
10.0%
1
 
10.0%
1
 
10.0%
1
 
10.0%
1
 
10.0%
Punctuation
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%

주수량 (main quantity)
Real number (ℝ)

HIGH CORRELATION, SKEWED

Distinct: 41
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.7332
Minimum: 1
Maximum: 902
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 1
95-th percentile: 4
Maximum: 902
Range: 901
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 10.522219
Coefficient of variation (CV): 6.0709779
Kurtosis: 5638.1547
Mean: 1.7332
Median Absolute Deviation (MAD): 0
Skewness: 69.872625
Sum: 17332
Variance: 110.71709
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=41)]
Most frequent values

Value | Count | Frequency (%)
1 | 8935 | 89.3%
2 | 395 | 4.0%
3 | 160 | 1.6%
20 | 107 | 1.1%
4 | 100 | 1.0%
5 | 81 | 0.8%
10 | 42 | 0.4%
7 | 42 | 0.4%
6 | 38 | 0.4%
8 | 15 | 0.1%
Other values (31) | 85 | 0.9%

Smallest values

Value | Count | Frequency (%)
1 | 8935 | 89.3%
2 | 395 | 4.0%
3 | 160 | 1.6%
4 | 100 | 1.0%
5 | 81 | 0.8%
6 | 38 | 0.4%
7 | 42 | 0.4%
8 | 15 | 0.1%
9 | 10 | 0.1%
10 | 42 | 0.4%

Largest values

Value | Count | Frequency (%)
902 | 1 | < 0.1%
430 | 1 | < 0.1%
138 | 1 | < 0.1%
85 | 1 | < 0.1%
75 | 1 | < 0.1%
72 | 1 | < 0.1%
70 | 1 | < 0.1%
60 | 1 | < 0.1%
45 | 1 | < 0.1%
43 | 1 | < 0.1%

국적/시대 (nationality/period)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 7
Median length: 5
Mean length: 4.9733
Min length: 2

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 한국-조선
2nd row: 한국-조선
3rd row: 한국-조선
4th row: 한국-조선
5th row: 한국-조선

Common Values

Value | Count | Frequency (%)
한국-조선 | 9505 | 95.0%
한국 | 250 | 2.5%
한국-광복이후 | 227 | 2.3%
한국-일제강점 | 8 | 0.1%
한국-시대미상 | 4 | < 0.1%
한국-대한제국 | 2 | < 0.1%
한국-고려 | 2 | < 0.1%
<NA> | 1 | < 0.1%
중국-시대미상 | 1 | < 0.1%

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

재질 (material)
Categorical

IMBALANCE

Distinct: 13
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 6
Median length: 1
Mean length: 1.064
Min length: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

Value | Count | Frequency (%)
(blank) | 9355 | 93.5%
나무 | 188 | 1.9%
기타 | 119 | 1.2%
(blank) | 99 | 1.0%
사직 | 80 | 0.8%
금속 | 67 | 0.7%
도자기 | 46 | 0.5%
피모 | 23 | 0.2%
초제 | 8 | 0.1%
유리/보석 | 7 | 0.1%
Other values (3) | 8 | 0.1%

[Figure: Histogram of lengths of the category]

실측치 (measurements)
Text

Distinct: 9048
Distinct (%): 90.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 124
Median length: 62
Mean length: 19.7709
Min length: 7

Characters and Unicode

Total characters: 197709
Distinct characters: 43
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8397
Unique (%): 84.0%

Sample

1st row: 1:가로:27.9, 1:세로:18.9
2nd row: 1:가로:31.5, 1:세로:24.2
3rd row: :가로:64.1, :세로:57.9
4th row: 1:가로:30, 1:세로:48
5th row: 1:가로:28.8, 1:세로:20.8
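
The sample rows above suggest that each measurement string is a comma-separated list of `count:label:value` chunks, where the count may be empty and the label is a Korean dimension name (가로 = width, 세로 = length, 높이 = height). The following is a minimal sketch of a parser under that assumption; the layout is inferred from the samples only, the function name `parse_measurements` is hypothetical, and the unit of the numbers is not stated in the report.

```python
import re

# Assumed chunk layout: optional count, label, number, e.g. "1:가로:27.9" or ":세로:57.9".
CHUNK = re.compile(r"\s*(\d*):([^:,\s]+):(\d+(?:\.\d+)?)\s*")

def parse_measurements(raw):
    """Return {label: value} for every chunk that fits the assumed layout."""
    parsed = {}
    for chunk in raw.split(","):
        match = CHUNK.fullmatch(chunk)
        if match is None:
            continue  # skip chunks that do not fit the assumed layout
        _count, label, value = match.groups()
        parsed[label] = float(value)
    return parsed

print(parse_measurements("1:가로:27.9, 1:세로:18.9"))
print(parse_measurements(":가로:64.1, :세로:57.9"))
```

Chunks that do not match are skipped rather than raising, because the full column (9048 distinct values) may contain layouts that the five sample rows do not show.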
Value | Count | Frequency (%)
1:가로:21 | 109 | 0.5%
1:세로:32 | 86 | 0.4%
1:가로:20 | 79 | 0.4%
1:가로:19 | 77 | 0.4%
1:세로:29 | 75 | 0.4%
1:세로:31 | 73 | 0.4%
1:세로:30 | 71 | 0.4%
1:세로:28 | 71 | 0.4%
1:가로:19.5 | 70 | 0.3%
1:세로:28.5 | 67 | 0.3%
Other values (3526) | 19371 | 96.1%

Most occurring characters

ValueCountFrequency (%)
: 40298
20.4%
1 22335
11.3%
20149
10.2%
19605
9.9%
. 15313
 
7.7%
2 10355
 
5.2%
, 10150
 
5.1%
9814
 
5.0%
9791
 
5.0%
3 7932
 
4.0%
Other values (33) 31967
16.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 71338
36.1%
Other Punctuation 65761
33.3%
Other Letter 40461
20.5%
Space Separator 20149
 
10.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
19605
48.5%
9814
24.3%
9791
24.2%
430
 
1.1%
296
 
0.7%
134
 
0.3%
68
 
0.2%
68
 
0.2%
42
 
0.1%
32
 
0.1%
Other values (19) 181
 
0.4%
Decimal Number
ValueCountFrequency (%)
1 22335
31.3%
2 10355
14.5%
3 7932
 
11.1%
5 6478
 
9.1%
4 6008
 
8.4%
8 4342
 
6.1%
6 3975
 
5.6%
7 3750
 
5.3%
9 3415
 
4.8%
0 2748
 
3.9%
Other Punctuation
ValueCountFrequency (%)
: 40298
61.3%
. 15313
 
23.3%
, 10150
 
15.4%
Space Separator
ValueCountFrequency (%)
20149
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 157248
79.5%
Hangul 40461
 
20.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
19605
48.5%
9814
24.3%
9791
24.2%
430
 
1.1%
296
 
0.7%
134
 
0.3%
68
 
0.2%
68
 
0.2%
42
 
0.1%
32
 
0.1%
Other values (19) 181
 
0.4%
Common
ValueCountFrequency (%)
: 40298
25.6%
1 22335
14.2%
20149
12.8%
. 15313
 
9.7%
2 10355
 
6.6%
, 10150
 
6.5%
3 7932
 
5.0%
5 6478
 
4.1%
4 6008
 
3.8%
8 4342
 
2.8%
Other values (4) 13888
 
8.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 157248
79.5%
Hangul 40461
 
20.5%

Most frequent character per block

ASCII
ValueCountFrequency (%)
: 40298
25.6%
1 22335
14.2%
20149
12.8%
. 15313
 
9.7%
2 10355
 
6.6%
, 10150
 
6.5%
3 7932
 
5.0%
5 6478
 
4.1%
4 6008
 
3.8%
8 4342
 
2.8%
Other values (4) 13888
 
8.8%
Hangul
ValueCountFrequency (%)
19605
48.5%
9814
24.3%
9791
24.2%
430
 
1.1%
296
 
0.7%
134
 
0.3%
68
 
0.2%
68
 
0.2%
42
 
0.1%
32
 
0.1%
Other values (19) 181
 
0.4%

입수연유 (acquisition reason)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 기증품
2nd row: 구입품
3rd row: 구입품
4th row: 구입품
5th row: 기증품

Common Values

Value | Count | Frequency (%)
기증품 | 4968 | 49.7%
구입품 | 4784 | 47.8%
보관품 | 248 | 2.5%

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

입수처 (acquisition source)
Text

MISSING

Distinct: 148
Distinct (%): 3.1%
Missing: 5232
Missing (%): 52.3%
Memory size: 156.2 KiB

Length

Max length: 20
Median length: 16
Mean length: 9.1436661
Min length: 1

Characters and Unicode

Total characters: 43597
Distinct characters: 185
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 49
Unique (%): 1.0%

Sample

1st row: 연안김씨 괴헌고택(김종국)
2nd row: 달성서씨(서석호)
3rd row: 김선균
4th row: 옥천전씨(전호인)
5th row: 연안김씨 괴헌고택(김종국)
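
For a column with missing values, the percentages and the mean length for 입수처 appear to be computed over the non-missing rows only, not over all 10000 observations. The sketch below checks that reading against the reported figures; the choice of denominator is an inference from the numbers, not something the report states.

```python
# Figures copied from the 입수처 section of the report.
n_rows = 10_000
missing = 5232
total_characters = 43597
distinct = 148
unique = 49

present = n_rows - missing                 # rows that actually hold a value
mean_length = total_characters / present   # average characters per present value
distinct_pct = 100 * distinct / present
unique_pct = 100 * unique / present

print(present, round(mean_length, 7), round(distinct_pct, 1), round(unique_pct, 1))
```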
Value | Count | Frequency (%)
연안김씨 | 1552 | 23.7%
괴헌고택(김종국 | 1550 | 23.6%
달성서씨(서진원 | 695 | 10.6%
달성서씨(서석호 | 386 | 5.9%
김동양 | 284 | 4.3%
소수서원 | 249 | 3.8%
김제만 | 169 | 2.6%
유림원 | 144 | 2.2%
공주이씨 | 108 | 1.6%
옥천전씨(전호인 | 100 | 1.5%
Other values (151) | 1322 | 20.2%

Most occurring characters

ValueCountFrequency (%)
3755
 
8.6%
( 3350
 
7.7%
) 3350
 
7.7%
3119
 
7.2%
2900
 
6.7%
1793
 
4.1%
1672
 
3.8%
1653
 
3.8%
1611
 
3.7%
1583
 
3.6%
Other values (175) 18811
43.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 35090
80.5%
Open Punctuation 3350
 
7.7%
Close Punctuation 3350
 
7.7%
Space Separator 1793
 
4.1%
Decimal Number 11
 
< 0.1%
Other Punctuation 2
 
< 0.1%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3755
 
10.7%
3119
 
8.9%
2900
 
8.3%
1672
 
4.8%
1653
 
4.7%
1611
 
4.6%
1583
 
4.5%
1572
 
4.5%
1561
 
4.4%
1555
 
4.4%
Other values (164) 14109
40.2%
Decimal Number
ValueCountFrequency (%)
0 3
27.3%
2 2
18.2%
7 2
18.2%
8 2
18.2%
3 1
 
9.1%
5 1
 
9.1%
Open Punctuation
ValueCountFrequency (%)
( 3350
100.0%
Close Punctuation
ValueCountFrequency (%)
) 3350
100.0%
Space Separator
ValueCountFrequency (%)
1793
100.0%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 35090
80.5%
Common 8507
 
19.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3755
 
10.7%
3119
 
8.9%
2900
 
8.3%
1672
 
4.8%
1653
 
4.7%
1611
 
4.6%
1583
 
4.5%
1572
 
4.5%
1561
 
4.4%
1555
 
4.4%
Other values (164) 14109
40.2%
Common
ValueCountFrequency (%)
( 3350
39.4%
) 3350
39.4%
1793
21.1%
0 3
 
< 0.1%
2 2
 
< 0.1%
7 2
 
< 0.1%
8 2
 
< 0.1%
. 2
 
< 0.1%
- 1
 
< 0.1%
3 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 35090
80.5%
ASCII 8507
 
19.5%

Most frequent character per block

Hangul
ValueCountFrequency (%)
3755
 
10.7%
3119
 
8.9%
2900
 
8.3%
1672
 
4.8%
1653
 
4.7%
1611
 
4.6%
1583
 
4.5%
1572
 
4.5%
1561
 
4.4%
1555
 
4.4%
Other values (164) 14109
40.2%
ASCII
ValueCountFrequency (%)
( 3350
39.4%
) 3350
39.4%
1793
21.1%
0 3
 
< 0.1%
2 2
 
< 0.1%
7 2
 
< 0.1%
8 2
 
< 0.1%
. 2
 
< 0.1%
- 1
 
< 0.1%
3 1
 
< 0.1%

지정구분 (designation category)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 4
Mean length: 3.9993
Min length: 2

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 9907 | 99.1%
중요민속 | 88 | 0.9%
도지정 | 3 | < 0.1%
국보 | 1 | < 0.1%
보물 | 1 | < 0.1%

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]
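
The imbalance percentages in the Alerts (96.6% for 지정구분, 89.1% for 국적/시대) are reproduced by one minus the normalized Shannon entropy of the category counts. The sketch below is an illustration under that assumption; the report does not state its formula, and the function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """1 - H/H_max: 0 for a uniform distribution, near 1 when one class dominates."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    max_entropy = math.log(len(counts))  # entropy of a uniform distribution
    return 1 - entropy / max_entropy

# Category counts copied from the Common Values tables in this report.
designation_counts = [9907, 88, 3, 1, 1]               # 지정구분
period_counts = [9505, 250, 227, 8, 4, 2, 2, 1, 1]     # 국적/시대

print(f"{100 * imbalance_score(designation_counts):.1f}%")
print(f"{100 * imbalance_score(period_counts):.1f}%")
```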

지정번호 (designation number)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.3111
Minimum: 0
Maximum: 717
Zeros: 9907
Zeros (%): 99.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 717
Range: 717
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 24.42981
Coefficient of variation (CV): 10.570642
Kurtosis: 164.47267
Mean: 2.3111
Median Absolute Deviation (MAD): 0
Skewness: 11.652467
Sum: 23111
Variance: 596.8156
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=7)]
Most frequent values

Value | Count | Frequency (%)
0 | 9907 | 99.1%
242 | 88 | 0.9%
331 | 1 | < 0.1%
238 | 1 | < 0.1%
111 | 1 | < 0.1%
418 | 1 | < 0.1%
717 | 1 | < 0.1%

Smallest values

Value | Count | Frequency (%)
0 | 9907 | 99.1%
111 | 1 | < 0.1%
238 | 1 | < 0.1%
242 | 88 | 0.9%
331 | 1 | < 0.1%
418 | 1 | < 0.1%
717 | 1 | < 0.1%

Largest values

Value | Count | Frequency (%)
717 | 1 | < 0.1%
418 | 1 | < 0.1%
331 | 1 | < 0.1%
242 | 88 | 0.9%
238 | 1 | < 0.1%
111 | 1 | < 0.1%
0 | 9907 | 99.1%
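
Because 지정번호 has only seven distinct values, its summary statistics can be rebuilt from the value counts alone. The sketch below does this with the standard library; it assumes the reported variance is the sample variance (n - 1 denominator), which is what matches the reported 596.8156.

```python
import statistics

# Value counts copied from the 지정번호 tables in this report.
value_counts = {0: 9907, 242: 88, 331: 1, 238: 1, 111: 1, 418: 1, 717: 1}

# Expand the counts back into one value per observation.
values = [value for value, count in value_counts.items() for _ in range(count)]

n = len(values)
total = sum(values)
mean = total / n
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)      # sample standard deviation
zeros_pct = 100 * value_counts[0] / n

print(n, total, mean, round(variance, 4), round(std_dev, 5), round(zeros_pct, 1))
```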

특징 (features)
Text

MISSING

Distinct: 5995
Distinct (%): 95.1%
Missing: 3698
Missing (%): 37.0%
Memory size: 156.2 KiB

Length

Max length: 246
Median length: 189
Mean length: 9.8054586
Min length: 1

Characters and Unicode

Total characters: 61794
Distinct characters: 1143
Distinct categories: 13
Distinct scripts: 4
Distinct blocks: 8
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5945
Unique (%): 94.3%

Sample

1st row: 괴-0036
2nd row: 기2-0236
3rd row: 기문-0181
4th row: 13석호증-70, 掌令
5th row: 박-0820
Value | Count | Frequency (%)
(blank) | 250 | 2.4%
필사본 | 115 | 1.1%
봉투있음 | 89 | 0.9%
구성 | 53 | 0.5%
소책자 | 50 | 0.5%
(blank) | 49 | 0.5%
(blank) | 48 | 0.5%
현판글씨 | 38 | 0.4%
2장으로 | 34 | 0.3%
共二 | 29 | 0.3%
Other values (8000) | 9471 | 92.6%

Most occurring characters

ValueCountFrequency (%)
0 6157
 
10.0%
- 5475
 
8.9%
1 4882
 
7.9%
3951
 
6.4%
3 3034
 
4.9%
2 2581
 
4.2%
, 2460
 
4.0%
5 2138
 
3.5%
4 1873
 
3.0%
6 1690
 
2.7%
Other values (1133) 27553
44.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 27013
43.7%
Other Letter 22494
36.4%
Dash Punctuation 5475
 
8.9%
Space Separator 3951
 
6.4%
Other Punctuation 2613
 
4.2%
Open Punctuation 73
 
0.1%
Close Punctuation 73
 
0.1%
Math Symbol 57
 
0.1%
Other Symbol 21
 
< 0.1%
Uppercase Letter 12
 
< 0.1%
Other values (3) 12
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1671
 
7.4%
1552
 
6.9%
1176
 
5.2%
1151
 
5.1%
829
 
3.7%
803
 
3.6%
511
 
2.3%
446
 
2.0%
439
 
2.0%
299
 
1.3%
Other values (1092) 13617
60.5%
Decimal Number
ValueCountFrequency (%)
0 6157
22.8%
1 4882
18.1%
3 3034
11.2%
2 2581
9.6%
5 2138
 
7.9%
4 1873
 
6.9%
6 1690
 
6.3%
8 1634
 
6.0%
7 1606
 
5.9%
9 1418
 
5.2%
Other Punctuation
ValueCountFrequency (%)
, 2460
94.1%
. 72
 
2.8%
? 38
 
1.5%
' 19
 
0.7%
/ 9
 
0.3%
: 5
 
0.2%
· 4
 
0.2%
" 4
 
0.2%
2
 
0.1%
Math Symbol
ValueCountFrequency (%)
~ 46
80.7%
< 2
 
3.5%
> 2
 
3.5%
= 2
 
3.5%
× 2
 
3.5%
1
 
1.8%
1
 
1.8%
+ 1
 
1.8%
Uppercase Letter
ValueCountFrequency (%)
C 6
50.0%
O 5
41.7%
N 1
 
8.3%
Open Punctuation
ValueCountFrequency (%)
( 72
98.6%
[ 1
 
1.4%
Close Punctuation
ValueCountFrequency (%)
) 72
98.6%
] 1
 
1.4%
Lowercase Letter
ValueCountFrequency (%)
x 4
66.7%
o 2
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 5475
100.0%
Space Separator
ValueCountFrequency (%)
3951
100.0%
Other Symbol
ValueCountFrequency (%)
21
100.0%
Final Punctuation
ValueCountFrequency (%)
3
100.0%
Initial Punctuation
ValueCountFrequency (%)
3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 39282
63.6%
Hangul 17431
28.2%
Han 5063
 
8.2%
Latin 18
 
< 0.1%

Most frequent character per script

Han
ValueCountFrequency (%)
439
 
8.7%
282
 
5.6%
262
 
5.2%
214
 
4.2%
137
 
2.7%
129
 
2.5%
103
 
2.0%
95
 
1.9%
90
 
1.8%
82
 
1.6%
Other values (625) 3230
63.8%
Hangul
ValueCountFrequency (%)
1671
 
9.6%
1552
 
8.9%
1176
 
6.7%
1151
 
6.6%
829
 
4.8%
803
 
4.6%
511
 
2.9%
446
 
2.6%
299
 
1.7%
207
 
1.2%
Other values (457) 8786
50.4%
Common
ValueCountFrequency (%)
0 6157
15.7%
- 5475
13.9%
1 4882
12.4%
3951
10.1%
3 3034
7.7%
2 2581
6.6%
, 2460
 
6.3%
5 2138
 
5.4%
4 1873
 
4.8%
6 1690
 
4.3%
Other values (26) 5041
12.8%
Latin
ValueCountFrequency (%)
C 6
33.3%
O 5
27.8%
x 4
22.2%
o 2
 
11.1%
N 1
 
5.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 39263
63.5%
Hangul 17431
28.2%
CJK 4941
 
8.0%
CJK Compat Ideographs 122
 
0.2%
Geometric Shapes 21
 
< 0.1%
Punctuation 8
 
< 0.1%
None 6
 
< 0.1%
Math Operators 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 6157
15.7%
- 5475
13.9%
1 4882
12.4%
3951
10.1%
3 3034
7.7%
2 2581
6.6%
, 2460
 
6.3%
5 2138
 
5.4%
4 1873
 
4.8%
6 1690
 
4.3%
Other values (23) 5022
12.8%
Hangul
ValueCountFrequency (%)
1671
 
9.6%
1552
 
8.9%
1176
 
6.7%
1151
 
6.6%
829
 
4.8%
803
 
4.6%
511
 
2.9%
446
 
2.6%
299
 
1.7%
207
 
1.2%
Other values (457) 8786
50.4%
CJK
ValueCountFrequency (%)
439
 
8.9%
282
 
5.7%
262
 
5.3%
214
 
4.3%
137
 
2.8%
129
 
2.6%
103
 
2.1%
95
 
1.9%
90
 
1.8%
82
 
1.7%
Other values (602) 3108
62.9%
CJK Compat Ideographs
ValueCountFrequency (%)
59
48.4%
11
 
9.0%
9
 
7.4%
5
 
4.1%
4
 
3.3%
3
 
2.5%
3
 
2.5%
3
 
2.5%
3
 
2.5%
2
 
1.6%
Other values (13) 20
 
16.4%
Geometric Shapes
ValueCountFrequency (%)
21
100.0%
None
ValueCountFrequency (%)
· 4
66.7%
× 2
33.3%
Punctuation
ValueCountFrequency (%)
3
37.5%
3
37.5%
2
25.0%
Math Operators
ValueCountFrequency (%)
1
50.0%
1
50.0%

데이터기준일자 (data reference date)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2016-06-01
2nd row: 2016-06-01
3rd row: 2016-06-01
4th row: 2016-06-01
5th row: 2016-06-01

Common Values

Value | Count | Frequency (%)
2016-06-01 | 10000 | 100.0%

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

Interactions

[Figure: interaction plots, 9 panels]

Correlations

[Figure: correlation heatmap]
Variable | 소장구분 | 소장품번호 | 주수량 | 국적/시대 | 재질 | 입수연유 | 지정구분 | 지정번호
소장구분 | 1.000 | 0.450 | 0.045 | 0.788 | 0.233 | 1.000 | 0.989 | 0.234
소장품번호 | 0.450 | 1.000 | 0.000 | 0.388 | 0.308 | 0.450 | 0.759 | 0.210
주수량 | 0.045 | 0.000 | 1.000 | 0.055 | 0.208 | 0.045 | 0.759 | 0.741
국적/시대 | 0.788 | 0.388 | 0.055 | 1.000 | 0.512 | 0.788 | 0.838 | 0.514
재질 | 0.233 | 0.308 | 0.208 | 0.512 | 1.000 | 0.233 | 0.154 | 0.492
입수연유 | 1.000 | 0.450 | 0.045 | 0.788 | 0.233 | 1.000 | 0.989 | 0.234
지정구분 | 0.989 | 0.759 | 0.759 | 0.838 | 0.154 | 0.989 | 1.000 | 0.925
지정번호 | 0.234 | 0.210 | 0.741 | 0.514 | 0.492 | 0.234 | 0.925 | 1.000

[Figure: correlation heatmap]
Variable | 소장구분 | 지정구분 | 재질 | 입수연유 | 국적/시대
소장구분 | 1.000 | 0.896 | 0.134 | 1.000 | 0.707
지정구분 | 0.896 | 1.000 | 0.101 | 0.896 | 0.903
재질 | 0.134 | 0.101 | 1.000 | 0.134 | 0.245
입수연유 | 1.000 | 0.896 | 0.134 | 1.000 | 0.707
국적/시대 | 0.707 | 0.903 | 0.245 | 0.707 | 1.000

[Figure: correlation heatmap]
Variable | 소장품번호 | 주수량 | 지정번호 | 소장구분 | 국적/시대 | 재질 | 입수연유 | 지정구분
소장품번호 | 1.000 | -0.210 | -0.161 | 0.303 | 0.196 | 0.133 | 0.303 | 0.545
주수량 | -0.210 | 1.000 | -0.017 | 0.042 | 0.025 | 0.123 | 0.042 | 0.545
지정번호 | -0.161 | -0.017 | 1.000 | 0.099 | 0.317 | 0.269 | 0.099 | 0.933
소장구분 | 0.303 | 0.042 | 0.099 | 1.000 | 0.707 | 0.134 | 1.000 | 0.896
국적/시대 | 0.196 | 0.025 | 0.317 | 0.707 | 1.000 | 0.245 | 0.707 | 0.903
재질 | 0.133 | 0.123 | 0.269 | 0.134 | 0.245 | 1.000 | 0.134 | 0.101
입수연유 | 0.303 | 0.042 | 0.099 | 1.000 | 0.707 | 0.134 | 1.000 | 0.896
지정구분 | 0.545 | 0.545 | 0.933 | 0.896 | 0.903 | 0.101 | 0.896 | 1.000
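
The coefficient of 1.000 between 소장구분 and 입수연유 is what a one-to-one relationship between two categorical columns would produce. The sketch below illustrates this with Cramér's V (no bias correction), one common association measure for categorical pairs. Two things here are assumptions: the report does not say which measure each matrix uses, and the diagonal cross-tabulation is inferred only from the matching category counts (4968, 4784, 248) in both columns.

```python
import math

def cramers_v(table):
    """Cramér's V (uncorrected) for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals))
    return math.sqrt(chi2 / (n * (k - 1)))

# Hypothetical cross-tab: each 소장구분 category maps to exactly one 입수연유 category.
crosstab = [
    [4968, 0, 0],
    [0, 4784, 0],
    [0, 0, 248],
]
v = cramers_v(crosstab)
print(round(v, 3))
```

If the two columns really are one-to-one, one of them carries no additional information and could be dropped before modelling.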

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

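(index) | 소장구분 | 소장품번호 | 명칭 | 주수량 | 국적/시대 | 재질 | 실측치 | 입수연유 | 입수처 | 지정구분 | 지정번호 | 특징 | 데이터기준일자
7129 | 공립-소수박물관-기증 | 182 | 낙암집 | 2 | 한국-조선 | (blank) | 1:가로:27.9, 1:세로:18.9 | 기증품 | 연안김씨 괴헌고택(김종국) | <NA> | 0 | 괴-0036 | 2016-06-01
5762 | 공립-소수박물관-구입 | 5425 | 간찰(정흠) | 1 | 한국-조선 | (blank) | 1:가로:31.5, 1:세로:24.2 | 구입품 | <NA> | <NA> | 0 | <NA> | 2016-06-01
3224 | 공립-소수박물관-구입 | 2887 | 준호구 | 1 | 한국-조선 | (blank) | :가로:64.1, :세로:57.9 | 구입품 | <NA> | <NA> | 0 | <NA> | 2016-06-01
2428 | 공립-소수박물관-구입 | 2091 | 영주군수위세보 | 1 | 한국-조선 | (blank) | 1:가로:30, 1:세로:48 | 구입품 | <NA> | <NA> | 0 | <NA> | 2016-06-01
9476 | 공립-소수박물관-기증 | 2534 | 소남실기 | 1 | 한국-조선 | (blank) | 1:가로:28.8, 1:세로:20.8 | 기증품 | <NA> | <NA> | 0 | 기2-0236 | 2016-06-01
10746 | 공립-소수박물관-기증 | 3804 | 검도투구 | 1 | 한국-조선 | 기타 | 1:높이:21.4 | 기증품 | <NA> | <NA> | 0 | 기문-0181 | 2016-06-01
11441 | 공립-소수박물관-기증 | 4499 | 만사(김형의) | 1 | 한국-조선 | (blank) | 1:가로:26.4, 1:세로:136.6 | 기증품 | 달성서씨(서석호) | <NA> | 0 | 13석호증-70, 掌令 | 2016-06-01
715 | 공립-소수박물관-구입 | 378 | 백우서실기 | 1 | 한국-조선 | (blank) | 1:가로:19.3, 1:세로:29.0 | 구입품 | <NA> | <NA> | 0 | 박-0820 | 2016-06-01
10320 | 공립-소수박물관-기증 | 3378 | 시문 | 1 | 한국-조선 | (blank) | 1:가로:18.6, 1:세로:35.5 | 기증품 | <NA> | <NA> | 0 | 기문-0349 | 2016-06-01
1693 | 공립-소수박물관-구입 | 1356 | 권상철 간찰 | 1 | 한국-조선 | (blank) | 1:가로:42.2, 1:세로:24 | 구입품 | 김선균 | <NA> | 0 | 09구-60 | 2016-06-01

(index) | 소장구분 | 소장품번호 | 명칭 | 주수량 | 국적/시대 | 재질 | 실측치 | 입수연유 | 입수처 | 지정구분 | 지정번호 | 특징 | 데이터기준일자
920 | 공립-소수박물관-구입 | 583 | 장릉사보 | 3 | 한국-조선 | (blank) | 1:가로:20.7, 1:세로:31.1 | 구입품 | <NA> | <NA> | 0 | 박-1818 | 2016-06-01
6437 | 공립-소수박물관-구입 | 6100 | 간찰(양시목) | 1 | 한국-조선 | (blank) | 1:가로:32.9, 1:세로:31.7 | 구입품 | <NA> | <NA> | 0 | <NA> | 2016-06-01
10800 | 공립-소수박물관-기증 | 3858 | 중용언해 | 1 | 한국-조선 | (blank) | 1:가로:21.3, 1:세로:33.8 | 기증품 | <NA> | <NA> | 0 | 5공주증-41, 141 | 2016-06-01
12581 | 공립-소수박물관-기증 | 5639 | 문서(원월지초) | 1 | 한국-조선 | (blank) | 1:가로:168.5, 1:세로:25.7 | 기증품 | 달성서씨(서진원) | <NA> | 0 | 575 | 2016-06-01
3138 | 공립-소수박물관-구입 | 2801 | 소지 | 1 | 한국-조선 | (blank) | :가로:67.4, :세로:43 | 구입품 | <NA> | <NA> | 0 | <NA> | 2016-06-01
4397 | 공립-소수박물관-구입 | 4060 | 간찰(제) | 1 | 한국-조선 | (blank) | :가로:28.3, :세로:45.1 | 구입품 | <NA> | <NA> | 0 | <NA> | 2016-06-01
11502 | 공립-소수박물관-기증 | 4560 | 시문(서준?) | 1 | 한국-조선 | (blank) | 1:가로:30.8, 1:세로:50.6 | 기증품 | 달성서씨(서석호) | <NA> | 0 | 13석호증-131, 甲申四月三日 | 2016-06-01
727 | 공립-소수박물관-구입 | 390 | 태사권공실기 | 3 | 한국-조선 | (blank) | 1:가로:19.8, 1:세로:30.6 | 구입품 | <NA> | <NA> | 0 | 박-0833 | 2016-06-01
5296 | 공립-소수박물관-구입 | 4959 | 김해허씨세보 | 1 | 한국-조선 | (blank) | :가로:27, :세로:101 | 구입품 | <NA> | <NA> | 0 | <NA> | 2016-06-01
10030 | 공립-소수박물관-기증 | 3088 | 제문 | 1 | 한국-조선 | (blank) | 1:가로:40.3, 1:세로:31 | 기증품 | <NA> | <NA> | 0 | 기문-0059 | 2016-06-01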