Overview

Dataset statistics

Number of variables: 7
Number of observations: 5861
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 332.1 KiB
Average record size in memory: 58.0 B

Variable types

Categorical: 4
Numeric: 2
Text: 1

Dataset

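Description: Data on the name, material, period, acquisition circumstances, and management number of items held by the Haman Museum (함안박물관). Photographs of the items are all published on the e-Museum (이뮤지엄) site, where they can be viewed.
Author: 경상남도 함안군 (Haman County, Gyeongsangnam-do)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=3068630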

Alerts

Dataset has 1 (< 0.1%) duplicate rows [Duplicates]
소장구분 is highly overall correlated with 국적_시대 and 2 other fields [High correlation]
입수연유 is highly overall correlated with 소장구분 and 1 other fields [High correlation]
국적_시대 is highly overall correlated with 소장구분 [High correlation]
재질 is highly overall correlated with 소장구분 and 1 other fields [High correlation]
소장구분 is highly imbalanced (67.5%) [Imbalance]
재질 is highly imbalanced (57.4%) [Imbalance]
입수연유 is highly imbalanced (78.9%) [Imbalance]
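
The imbalance percentages above are consistent with an entropy-based score, 1 - H / log2(k), where H is the Shannon entropy of the category distribution in bits and k is the number of distinct categories. The sketch below assumes that formula (it is inferred from the reported figures, not taken from the report itself) and uses the category counts listed for 소장구분 and 입수연유; the function name is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/log2(k): 0 for a uniform distribution, approaching 1 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    # Shannon entropy in bits; empty classes contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Category counts copied from the report's Common Values tables.
sojang_counts = [5308, 452, 101]            # 소장구분
ipsu_counts = [5308, 426, 101, 19, 5, 2]    # 입수연유

sojang_pct = round(100 * imbalance_score(sojang_counts), 1)
ipsu_pct = round(100 * imbalance_score(ipsu_counts), 1)
print(sojang_pct, ipsu_pct)
```

Under this assumption both values round to the percentages shown in the alerts. The score for 재질 cannot be checked the same way because the report lumps 21 of its categories into "Other values".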

Reproduction

Analysis started: 2023-12-10 23:45:25.564875
Analysis finished: 2023-12-10 23:45:27.241036
Duration: 1.68 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
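
As a quick arithmetic check, the reported duration can be recomputed from the two timestamps above. This is only a sketch using the standard library; the timestamp format string is an assumption based on how the values are printed.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed layout of the printed timestamps
started = datetime.strptime("2023-12-10 23:45:25.564875", FMT)
finished = datetime.strptime("2023-12-10 23:45:27.241036", FMT)

# Elapsed wall-clock time between the two reported instants.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```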

Variables

소장구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 45.9 KiB

Length

Max length: 13
Median length: 13
Mean length: 12.811295
Min length: 11

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 공립-함안박물관-기증
2nd row: 공립-함안박물관-기증
3rd row: 공립-함안박물관-기증
4th row: 공립-함안박물관-기증
5th row: 공립-함안박물관-기증

Common Values

Value | Count | Frequency (%)
공립-함안박물관-국가귀속 | 5308 | 90.6%
공립-함안박물관-기증 | 452 | 7.7%
공립-함안박물관-복제 | 101 | 1.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

소장품번호
Real number (ℝ)

Distinct: 5308
Distinct (%): 90.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2422.3416
Minimum: 1
Maximum: 5308
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 51.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 98
Q1: 913
Median: 2378
Q3: 3843
95-th percentile: 5015
Maximum: 5308
Range: 5307
Interquartile range (IQR): 2930

Descriptive statistics

Standard deviation: 1626.6266
Coefficient of variation (CV): 0.67150999
Kurtosis: -1.2775574
Mean: 2422.3416
Median Absolute Deviation (MAD): 1465
Skewness: 0.10149838
Sum: 14197344
Variance: 2645914
Monotonicity: Not monotonic
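
Several of the figures above are derived from one another, which gives a simple way to sanity-check the section. The sketch below recomputes the range, IQR, mean, and coefficient of variation from the reported values for 소장품번호; the definitions used (range = max - min, IQR = Q3 - Q1, mean = sum / n, CV = standard deviation / mean) are the conventional ones and are assumed here rather than quoted from the report.

```python
# Figures copied from the 소장품번호 section of the report.
n = 5861                      # number of observations
minimum, maximum = 1, 5308
q1, q3 = 913, 3843
total = 14197344              # reported Sum
std = 1626.6266               # reported standard deviation
variance = 2645914            # reported variance (rounded)

value_range = maximum - minimum   # spread between the extremes
iqr = q3 - q1                     # spread of the middle half of the data
mean = total / n
cv = std / mean                   # dispersion relative to the mean

print(value_range, iqr, round(mean, 4), round(cv, 6))
```

Small differences in the last digits are expected because the report prints rounded inputs.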
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
216 | 4 | 0.1%
1 | 3 | 0.1%
66 | 3 | 0.1%
76 | 3 | 0.1%
75 | 3 | 0.1%
74 | 3 | 0.1%
73 | 3 | 0.1%
72 | 3 | 0.1%
71 | 3 | 0.1%
70 | 3 | 0.1%
Other values (5298) | 5830 | 99.5%

Minimum 10 values

Value | Count | Frequency (%)
1 | 3 | 0.1%
2 | 3 | 0.1%
3 | 3 | 0.1%
4 | 3 | 0.1%
5 | 3 | 0.1%
6 | 3 | 0.1%
7 | 3 | 0.1%
8 | 3 | 0.1%
9 | 3 | 0.1%
10 | 3 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
5308 | 1 | < 0.1%
5307 | 1 | < 0.1%
5306 | 1 | < 0.1%
5305 | 1 | < 0.1%
5304 | 1 | < 0.1%
5303 | 1 | < 0.1%
5302 | 1 | < 0.1%
5301 | 1 | < 0.1%
5300 | 1 | < 0.1%
5299 | 1 | < 0.1%

명칭
Text

Distinct: 785
Distinct (%): 13.4%
Missing: 0
Missing (%): 0.0%
Memory size: 45.9 KiB

Length

Max length: 13
Median length: 12
Mean length: 3.3555707
Min length: 1

Characters and Unicode

Total characters: 19667
Distinct characters: 354
Distinct categories: 8
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
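
As an illustration of the kind of analysis described here, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories over a few sample values from the 명칭 column; the helper name is hypothetical, and it is not claimed to be how the report itself computes its tables. The two-letter codes map to the names used in this report, for example `Lo` is Other Letter and `Zs` is Space Separator.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over every character of every string."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)

# Sample values taken from the 명칭 column of the report.
sample_names = ["고배", "단경호", "이단투창 고배"]
counts = category_counts(sample_names)
print(counts)
```

Hangul syllables fall under `Lo`, which is why Other Letter dominates the category table for this column.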

Unique

Unique: 402
Unique (%): 6.9%
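
The Distinct and Unique figures measure different things: Distinct counts how many different values occur, while Unique is read here as the number of values that occur exactly once (an assumption about the report's definition). A minimal sketch with a hypothetical helper, using the four sample values of 명칭 that survive in the extracted text:

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values occurring exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Recoverable sample values of 명칭 (the 3rd sample row is blank in the extracted text).
names = ["고배", "단경호", "단경호", "이단투창 고배"]
result = distinct_and_unique(names)
print(result)
```

In this toy list 단경호 appears twice, so it counts toward Distinct but not toward Unique.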

Sample

1st row: 고배
2nd row: 단경호
3rd row: [missing]
4th row: 단경호
5th row: 이단투창 고배

Value | Count | Frequency (%)
관정 | 994 | 14.6%
고배 | 353 | 5.2%
[missing] | 298 | 4.4%
백자 | 214 | 3.1%
저부편 | 180 | 2.6%
단경호 | 145 | 2.1%
철촉 | 143 | 2.1%
구연부편 | 134 | 2.0%
청동숟가락 | 130 | 1.9%
접시 | 99 | 1.5%
Other values (648) | 4133 | 60.6%

Most occurring characters

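Value | Count | Frequency (%)
[missing] | 1210 | 6.2%
[missing] | 1073 | 5.5%
[missing] | 1003 | 5.1%
(space) | 988 | 5.0%
[missing] | 953 | 4.8%
[missing] | 717 | 3.6%
[missing] | 704 | 3.6%
[missing] | 594 | 3.0%
[missing] | 517 | 2.6%
[missing] | 512 | 2.6%
Other values (344) | 11396 | 57.9%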

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 18570 | 94.4%
Space Separator | 988 | 5.0%
Close Punctuation | 39 | 0.2%
Open Punctuation | 39 | 0.2%
Dash Punctuation | 18 | 0.1%
Other Punctuation | 10 | 0.1%
Math Symbol | 2 | < 0.1%
Uppercase Letter | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[missing] | 1210 | 6.5%
[missing] | 1073 | 5.8%
[missing] | 1003 | 5.4%
[missing] | 953 | 5.1%
[missing] | 717 | 3.9%
[missing] | 704 | 3.8%
[missing] | 594 | 3.2%
[missing] | 517 | 2.8%
[missing] | 512 | 2.8%
[missing] | 512 | 2.8%
Other values (334) | 10775 | 58.0%

Other Punctuation
Value | Count | Frequency (%)
· | 6 | 60.0%
, | 2 | 20.0%
/ | 1 | 10.0%
? | 1 | 10.0%

Space Separator
Value | Count | Frequency (%)
(space) | 988 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 39 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 39 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 18 | 100.0%

Math Symbol
Value | Count | Frequency (%)
+ | 2 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
U | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 18376 | 93.4%
Common | 1096 | 5.6%
Han | 194 | 1.0%
Latin | 1 | < 0.1%

Most frequent character per script

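Hangul
Value | Count | Frequency (%)
[missing] | 1210 | 6.6%
[missing] | 1073 | 5.8%
[missing] | 1003 | 5.5%
[missing] | 953 | 5.2%
[missing] | 717 | 3.9%
[missing] | 704 | 3.8%
[missing] | 594 | 3.2%
[missing] | 517 | 2.8%
[missing] | 512 | 2.8%
[missing] | 512 | 2.8%
Other values (287) | 10581 | 57.6%

Han
Value | Count | Frequency (%)
[missing] | 24 | 12.4%
[missing] | 20 | 10.3%
[missing] | 16 | 8.2%
[missing] | 13 | 6.7%
[missing] | 13 | 6.7%
[missing] | 11 | 5.7%
[missing] | 9 | 4.6%
[missing] | 8 | 4.1%
[missing] | 5 | 2.6%
[missing] | 5 | 2.6%
Other values (37) | 70 | 36.1%

Common
Value | Count | Frequency (%)
(space) | 988 | 90.1%
) | 39 | 3.6%
( | 39 | 3.6%
- | 18 | 1.6%
· | 6 | 0.5%
+ | 2 | 0.2%
, | 2 | 0.2%
/ | 1 | 0.1%
? | 1 | 0.1%

Latin
Value | Count | Frequency (%)
U | 1 | 100.0%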

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 18375 | 93.4%
ASCII | 1091 | 5.5%
CJK | 194 | 1.0%
None | 6 | < 0.1%
Compat Jamo | 1 | < 0.1%

Most frequent character per block

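Hangul
Value | Count | Frequency (%)
[missing] | 1210 | 6.6%
[missing] | 1073 | 5.8%
[missing] | 1003 | 5.5%
[missing] | 953 | 5.2%
[missing] | 717 | 3.9%
[missing] | 704 | 3.8%
[missing] | 594 | 3.2%
[missing] | 517 | 2.8%
[missing] | 512 | 2.8%
[missing] | 512 | 2.8%
Other values (286) | 10580 | 57.6%

ASCII
Value | Count | Frequency (%)
(space) | 988 | 90.6%
) | 39 | 3.6%
( | 39 | 3.6%
- | 18 | 1.6%
+ | 2 | 0.2%
, | 2 | 0.2%
/ | 1 | 0.1%
? | 1 | 0.1%
U | 1 | 0.1%

CJK
Value | Count | Frequency (%)
[missing] | 24 | 12.4%
[missing] | 20 | 10.3%
[missing] | 16 | 8.2%
[missing] | 13 | 6.7%
[missing] | 13 | 6.7%
[missing] | 11 | 5.7%
[missing] | 9 | 4.6%
[missing] | 8 | 4.1%
[missing] | 5 | 2.6%
[missing] | 5 | 2.6%
Other values (37) | 70 | 36.1%

None
Value | Count | Frequency (%)
· | 6 | 100.0%

Compat Jamo
Value | Count | Frequency (%)
[missing] | 1 | 100.0%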

주수량
Real number (ℝ)

Distinct: 28
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.185122
Minimum: 1
Maximum: 52
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 51.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 1
95-th percentile: 2
Maximum: 52
Range: 51
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.7010408
Coefficient of variation (CV): 1.4353297
Kurtosis: 334.77712
Mean: 1.185122
Median Absolute Deviation (MAD): 0
Skewness: 16.618556
Sum: 6946
Variance: 2.8935397
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=28)]

Common values

Value | Count | Frequency (%)
1 | 5529 | 94.3%
2 | 252 | 4.3%
3 | 15 | 0.3%
4 | 9 | 0.2%
7 | 8 | 0.1%
6 | 7 | 0.1%
10 | 4 | 0.1%
11 | 4 | 0.1%
5 | 3 | 0.1%
18 | 3 | 0.1%
Other values (18) | 27 | 0.5%

Minimum 10 values

Value | Count | Frequency (%)
1 | 5529 | 94.3%
2 | 252 | 4.3%
3 | 15 | 0.3%
4 | 9 | 0.2%
5 | 3 | 0.1%
6 | 7 | 0.1%
7 | 8 | 0.1%
8 | 2 | < 0.1%
9 | 3 | 0.1%
10 | 4 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
52 | 1 | < 0.1%
38 | 1 | < 0.1%
35 | 1 | < 0.1%
34 | 2 | < 0.1%
33 | 1 | < 0.1%
30 | 1 | < 0.1%
28 | 1 | < 0.1%
27 | 1 | < 0.1%
26 | 2 | < 0.1%
21 | 1 | < 0.1%

국적_시대
Categorical

HIGH CORRELATION 

Distinct: 13
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 45.9 KiB

Length

Max length: 7
Median length: 4
Mean length: 4.0783143
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 가야시대
2nd row: 가야시대
3rd row: 가야시대
4th row: 가야시대
5th row: 가야시대

Common Values

Value | Count | Frequency (%)
조선시대 | 2114 | 36.1%
삼국시대 | 1681 | 28.7%
가야시대 | 1108 | 18.9%
청동기시대 | 365 | 6.2%
고려시대 | 255 | 4.4%
통일신라시대 | 155 | 2.6%
한국 | 112 | 1.9%
광복이후 | 28 | 0.5%
구석기시대 | 16 | 0.3%
기타 | 13 | 0.2%
Other values (3) | 14 | 0.2%

Length

[Figure: histogram of lengths of the category]

재질
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 31
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 45.9 KiB

Length

Max length: 9
Median length: 3
Mean length: 2.9805494
Min length: 1

Unique

Unique: 7
Unique (%): 0.1%

Sample

1st row: 토제-경질
2nd row: 토제-경질
3rd row: 토제-경질
4th row: 토제-경질
5th row: 토제-경질

Common Values

Value | Count | Frequency (%)
토도류 | 2628 | 44.8%
금속류 | 2066 | 35.2%
금속 | 393 | 6.7%
토제 | 309 | 5.3%
옥석유리류 | 121 | 2.1%
금속-철 | 61 | 1.0%
나무 | 44 | 0.8%
합성재질-합성수지 | 37 | 0.6%
도자기-백자 | 37 | 0.6%
[missing] | 31 | 0.5%
Other values (21) | 134 | 2.3%

Length

[Figure: histogram of lengths of the category]

입수연유
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 45.9 KiB

Length

Max length: 6
Median length: 4
Mean length: 3.9082068
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 기증품
2nd row: 기증품
3rd row: 기증품
4th row: 기증품
5th row: 기증품

Common Values

Value | Count | Frequency (%)
국가귀속 | 5308 | 90.6%
기증품 | 426 | 7.3%
구입품 | 101 | 1.7%
발견품 | 19 | 0.3%
기타(미상) | 5 | 0.1%
이관품 | 2 | < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]


Correlations

 | 소장구분 | 소장품번호 | 주수량 | 국적_시대 | 재질 | 입수연유
소장구분 | 1.000 | 0.632 | 0.000 | 0.839 | 0.930 | 1.000
소장품번호 | 0.632 | 1.000 | 0.078 | 0.643 | 0.684 | 0.513
주수량 | 0.000 | 0.078 | 1.000 | 0.000 | 0.568 | 0.000
국적_시대 | 0.839 | 0.643 | 0.000 | 1.000 | 0.840 | 0.714
재질 | 0.930 | 0.684 | 0.568 | 0.840 | 1.000 | 0.909
입수연유 | 1.000 | 0.513 | 0.000 | 0.714 | 0.909 | 1.000

 | 소장구분 | 재질 | 국적_시대 | 입수연유
소장구분 | 1.000 | 0.799 | 0.714 | 1.000
재질 | 0.799 | 1.000 | 0.449 | 0.686
국적_시대 | 0.714 | 0.449 | 1.000 | 0.455
입수연유 | 1.000 | 0.686 | 0.455 | 1.000

 | 소장품번호 | 주수량 | 소장구분 | 국적_시대 | 재질 | 입수연유
소장품번호 | 1.000 | -0.000 | 0.478 | 0.332 | 0.316 | 0.301
주수량 | -0.000 | 1.000 | 0.000 | 0.000 | 0.250 | 0.000
소장구분 | 0.478 | 0.000 | 1.000 | 0.714 | 0.799 | 1.000
국적_시대 | 0.332 | 0.000 | 0.714 | 1.000 | 0.449 | 0.455
재질 | 0.316 | 0.250 | 0.799 | 0.449 | 1.000 | 0.686
입수연유 | 0.301 | 0.000 | 1.000 | 0.455 | 0.686 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

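First rows

 | 소장구분 | 소장품번호 | 명칭 | 주수량 | 국적_시대 | 재질 | 입수연유
0 | 공립-함안박물관-기증 | 1 | 고배 | 1 | 가야시대 | 토제-경질 | 기증품
1 | 공립-함안박물관-기증 | 2 | 단경호 | 1 | 가야시대 | 토제-경질 | 기증품
2 | 공립-함안박물관-기증 | 3 | [missing] | 1 | 가야시대 | 토제-경질 | 기증품
3 | 공립-함안박물관-기증 | 4 | 단경호 | 1 | 가야시대 | 토제-경질 | 기증품
4 | 공립-함안박물관-기증 | 5 | 이단투창 고배 | 1 | 가야시대 | 토제-경질 | 기증품
5 | 공립-함안박물관-기증 | 6 | 고배형 기대 | 1 | 가야시대 | 토제 | 기증품
6 | 공립-함안박물관-기증 | 7 | 삼각투창 고배 | 1 | 가야시대 | 토제 | 기증품
7 | 공립-함안박물관-기증 | 8 | 단경호 | 1 | 가야시대 | 토제 | 기증품
8 | 공립-함안박물관-기증 | 9 | 삼각투창 고배 | 1 | 가야시대 | 토제 | 기증품
9 | 공립-함안박물관-기증 | 10 | 이단투창 고배 | 1 | 가야시대 | 토제 | 기증품

Last rows

 | 소장구분 | 소장품번호 | 명칭 | 주수량 | 국적_시대 | 재질 | 입수연유
5851 | 공립-함안박물관-국가귀속 | 5299 | 철정 | 1 | 가야시대 | 금속류 | 국가귀속
5852 | 공립-함안박물관-국가귀속 | 5300 | 철정 | 1 | 가야시대 | 금속류 | 국가귀속
5853 | 공립-함안박물관-국가귀속 | 5301 | 철정 | 1 | 가야시대 | 금속류 | 국가귀속
5854 | 공립-함안박물관-국가귀속 | 5302 | 철정 | 1 | 가야시대 | 금속류 | 국가귀속
5855 | 공립-함안박물관-국가귀속 | 5303 | 철정 | 1 | 가야시대 | 금속류 | 국가귀속
5856 | 공립-함안박물관-국가귀속 | 5304 | 철정 | 1 | 가야시대 | 금속류 | 국가귀속
5857 | 공립-함안박물관-국가귀속 | 5305 | 철정 | 1 | 가야시대 | 금속류 | 국가귀속
5858 | 공립-함안박물관-국가귀속 | 5306 | 철정 | 1 | 가야시대 | 금속류 | 국가귀속
5859 | 공립-함안박물관-국가귀속 | 5307 | 철정 | 1 | 가야시대 | 금속류 | 국가귀속
5860 | 공립-함안박물관-국가귀속 | 5308 | 철정 | 1 | 가야시대 | 금속류 | 국가귀속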

Duplicate rows

Most frequently occurring

 | 소장구분 | 소장품번호 | 명칭 | 주수량 | 국적_시대 | 재질 | 입수연유 | # duplicates
0 | 공립-함안박물관-기증 | 216 | 파수부 고배 | 1 | 가야시대 | 토제 | 기증품 | 3
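
The overview reports 1 duplicate row while this table shows "# duplicates" of 3. One reading consistent with both figures is that the overview counts distinct rows that are repeated, and "# duplicates" counts how many times that row occurs. The sketch below illustrates that reading on a toy list; the helper name and the list itself are hypothetical, and the interpretation is an assumption rather than something the report states.

```python
from collections import Counter

def duplicate_groups(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

# The duplicated row from the table above, plus one non-duplicated row from the sample.
dup = ("공립-함안박물관-기증", 216, "파수부 고배", 1, "가야시대", "토제", "기증품")
other = ("공립-함안박물관-기증", 1, "고배", 1, "가야시대", "토제-경질", "기증품")
rows = [dup, other, dup, dup]  # toy data, not the real dataset

groups = duplicate_groups(rows)
print(len(groups), groups[dup])
```

Rows are stored as tuples so they are hashable and can be counted directly.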