Overview

Dataset statistics

Number of variables: 9
Number of observations: 2593
Missing cells: 2593
Missing cells (%): 11.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 190.0 KiB
Average record size in memory: 75.0 B
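
The missing-cell percentage follows from the counts above: the table holds 9 × 2593 cells, of which 2593 are empty. A minimal arithmetic check, with variable names chosen here purely for illustration:

```python
# Figures copied from the dataset statistics; the variable names are illustrative.
n_variables = 9
n_observations = 2593
missing_cells = 2593

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

Since exactly one column (비고) is entirely empty, the share of missing cells is simply 1/9.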

Variable types

Numeric: 2
Categorical: 4
Text: 1
DateTime: 1
Unsupported: 1

Dataset

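Description: A list of the exhibits held by the Forest Museum of 전북특별자치도 (Jeonbuk State) as of 2020. It includes fields such as the category (구분), item name (목록명), and material (재질).
Author: 전북특별자치도
URL: https://www.data.go.kr/data/15055676/fileData.do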

Alerts

구분 has constant value "전시물" (Constant)
대분류 is highly overall correlated with 재질 (High correlation)
재질 is highly overall correlated with 대분류 (High correlation)
보관장소 is highly imbalanced (55.6%) (Imbalance)
비고 has 2593 (100.0%) missing values (Missing)
구연번 has unique values (Unique)
비고 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)
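
The Constant, Unique, and Missing alerts can be read directly off per-column counts. The sketch below applies simplified rules to counts taken from this report; the rules and the function name `simple_alerts` are illustrative assumptions, not ydata-profiling's actual implementation, and the distinct count of 0 for 비고 is assumed because the report does not state it.

```python
# Per-column counts copied from this report (비고's distinct count is assumed).
# The rules below are a simplified illustration, not ydata-profiling's own logic.
N_ROWS = 2593

columns = {
    "구연번": {"distinct": 2593, "missing": 0},
    "구분": {"distinct": 1, "missing": 0},
    "비고": {"distinct": 0, "missing": 2593},
}


def simple_alerts(stats, n_rows):
    """Return the alert names that the simplified rules raise for one column."""
    alerts = []
    present = n_rows - stats["missing"]  # number of non-missing values
    if stats["missing"] == n_rows:
        alerts.append("Missing")
    if present > 0 and stats["distinct"] == 1:
        alerts.append("Constant")
    if present > 0 and stats["distinct"] == present:
        alerts.append("Unique")
    return alerts


alerts_by_column = {name: simple_alerts(s, N_ROWS) for name, s in columns.items()}
for name, alerts in alerts_by_column.items():
    print(name, alerts)
```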

Reproduction

Analysis started: 2024-03-14 19:47:54.729235
Analysis finished: 2024-03-14 19:47:57.542796
Duration: 2.81 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
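
A report like this one could be regenerated roughly as follows. This is a sketch under assumptions: the helper name `build_report`, the file names, the report title, and the CSV encoding are all hypothetical, and the report's actual settings are those in config.json, which is not reproduced here.

```python
def build_report(csv_path, html_path, encoding="utf-8"):
    """Profile a CSV file and write the HTML report to html_path."""
    # Imported lazily so this module can be loaded without the optional
    # third-party dependencies (pandas, ydata-profiling) installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Forest Museum exhibits")
    report.to_file(html_path)
    return report
```

A call such as `build_report("exhibits.csv", "report.html")` would write the HTML file; both paths are placeholders.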

Variables

구연번
Real number (ℝ)

UNIQUE 

Distinct: 2593
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1300.6375
Minimum: 1
Maximum: 2599
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 132.6
Q1: 652
Median: 1301
Q3: 1949
95-th percentile: 2468.4
Maximum: 2599
Range: 2598
Interquartile range (IQR): 1297
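
The range and IQR above are plain differences of the listed quantiles. The short check below recomputes them and, as an addition not present in the report, derives the conventional 1.5 × IQR fences to show that no 구연번 value falls outside them.

```python
# Quantiles copied from the table above.
minimum, q1, q3, maximum = 1, 652, 1949, 2599

value_range = maximum - minimum
iqr = q3 - q1

# Tukey-style fences: a common convention, not something the report computes.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
has_outliers = minimum < lower_fence or maximum > upper_fence

print(value_range, iqr, lower_fence, upper_fence, has_outliers)
```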

Descriptive statistics

Standard deviation: 749.58304
Coefficient of variation (CV): 0.57631973
Kurtosis: -1.1989574
Mean: 1300.6375
Median Absolute Deviation (MAD): 649
Skewness: -0.00053567516
Sum: 3372553
Variance: 561874.73
Monotonicity: Strictly increasing
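
Several of these figures are tied together by simple identities: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. A small consistency check using the reported values:

```python
import math

# Values copied from the report.
count = 2593
total = 3372553
std = 749.58304

mean = total / count   # reported as 1300.6375
cv = std / mean        # reported as 0.57631973
variance = std ** 2    # reported as 561874.73

print(f"mean={mean:.4f} cv={cv:.6f} variance={variance:.2f}")
```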
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | < 0.1%
1747 | 1 | < 0.1%
1729 | 1 | < 0.1%
1730 | 1 | < 0.1%
1731 | 1 | < 0.1%
1732 | 1 | < 0.1%
1733 | 1 | < 0.1%
1734 | 1 | < 0.1%
1735 | 1 | < 0.1%
1736 | 1 | < 0.1%
Other values (2583) | 2583 | 99.6%
Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%

Value | Count | Frequency (%)
2599 | 1 | < 0.1%
2598 | 1 | < 0.1%
2597 | 1 | < 0.1%
2596 | 1 | < 0.1%
2595 | 1 | < 0.1%
2594 | 1 | < 0.1%
2593 | 1 | < 0.1%
2592 | 1 | < 0.1%
2591 | 1 | < 0.1%
2590 | 1 | < 0.1%

구분
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 20.4 KiB
전시물: 2593

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전시물
2nd row: 전시물
3rd row: 전시물
4th row: 전시물
5th row: 전시물

Common Values

Value | Count | Frequency (%)
전시물 | 2593 | 100.0%

Length

Histogram of lengths of the category


자료명
Text

Distinct: 1800
Distinct (%): 69.4%
Missing: 0
Missing (%): 0.0%
Memory size: 20.4 KiB

Length

Max length: 40
Median length: 15
Mean length: 4.2526032
Min length: 1

Characters and Unicode

Total characters: 11027
Distinct characters: 656
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1412
Unique (%): 54.5%

Sample

1st row: 원앙(암,수)
2nd row: (empty)
3rd row: 청머리오리
4th row: 후투티
5th row: 잿빛개구리매
Value | Count | Frequency (%)
나비표본 | 35 | 1.3%
소나무 | 15 | 0.6%
큰소쩍새 | 13 | 0.5%
목침 | 11 | 0.4%
곤충화석 | 11 | 0.4%
잎,씨앗 | 11 | 0.4%
곤충표본 | 10 | 0.4%
너구리 | 9 | 0.3%
괭이 | 9 | 0.3%
잣나무 | 9 | 0.3%
Other values (1814) | 2561 | 95.1%

Most occurring characters

ValueCountFrequency (%)
501
 
4.5%
422
 
3.8%
280
 
2.5%
251
 
2.3%
241
 
2.2%
215
 
1.9%
153
 
1.4%
122
 
1.1%
117
 
1.1%
) 116
 
1.1%
Other values (646) 8609
78.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 10501
95.2%
Close Punctuation 116
 
1.1%
Open Punctuation 116
 
1.1%
Space Separator 105
 
1.0%
Decimal Number 84
 
0.8%
Other Punctuation 45
 
0.4%
Lowercase Letter 33
 
0.3%
Uppercase Letter 16
 
0.1%
Dash Punctuation 8
 
0.1%
Math Symbol 3
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
501
 
4.8%
422
 
4.0%
280
 
2.7%
251
 
2.4%
241
 
2.3%
215
 
2.0%
153
 
1.5%
122
 
1.2%
117
 
1.1%
109
 
1.0%
Other values (616) 8090
77.0%
Decimal Number
ValueCountFrequency (%)
1 19
22.6%
5 14
16.7%
3 13
15.5%
2 11
13.1%
0 11
13.1%
4 6
 
7.1%
8 3
 
3.6%
7 3
 
3.6%
9 2
 
2.4%
6 2
 
2.4%
Uppercase Letter
ValueCountFrequency (%)
D 3
18.8%
M 3
18.8%
I 2
12.5%
P 2
12.5%
R 1
 
6.2%
T 1
 
6.2%
W 1
 
6.2%
L 1
 
6.2%
U 1
 
6.2%
F 1
 
6.2%
Lowercase Letter
ValueCountFrequency (%)
s 11
33.3%
e 11
33.3%
t 11
33.3%
Other Punctuation
ValueCountFrequency (%)
, 43
95.6%
" 2
 
4.4%
Close Punctuation
ValueCountFrequency (%)
) 116
100.0%
Open Punctuation
ValueCountFrequency (%)
( 116
100.0%
Space Separator
ValueCountFrequency (%)
105
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 8
100.0%
Math Symbol
ValueCountFrequency (%)
+ 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 10461
94.9%
Common 477
 
4.3%
Latin 49
 
0.4%
Han 40
 
0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
501
 
4.8%
422
 
4.0%
280
 
2.7%
251
 
2.4%
241
 
2.3%
215
 
2.1%
153
 
1.5%
122
 
1.2%
117
 
1.1%
109
 
1.0%
Other values (612) 8050
77.0%
Common
ValueCountFrequency (%)
) 116
24.3%
( 116
24.3%
105
22.0%
, 43
 
9.0%
1 19
 
4.0%
5 14
 
2.9%
3 13
 
2.7%
2 11
 
2.3%
0 11
 
2.3%
- 8
 
1.7%
Other values (7) 21
 
4.4%
Latin
ValueCountFrequency (%)
s 11
22.4%
e 11
22.4%
t 11
22.4%
D 3
 
6.1%
M 3
 
6.1%
I 2
 
4.1%
P 2
 
4.1%
R 1
 
2.0%
T 1
 
2.0%
W 1
 
2.0%
Other values (3) 3
 
6.1%
Han
ValueCountFrequency (%)
16
40.0%
12
30.0%
9
22.5%
3
 
7.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 10460
94.9%
ASCII 526
 
4.8%
CJK 40
 
0.4%
Compat Jamo 1
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
501
 
4.8%
422
 
4.0%
280
 
2.7%
251
 
2.4%
241
 
2.3%
215
 
2.1%
153
 
1.5%
122
 
1.2%
117
 
1.1%
109
 
1.0%
Other values (611) 8049
77.0%
ASCII
ValueCountFrequency (%)
) 116
22.1%
( 116
22.1%
105
20.0%
, 43
 
8.2%
1 19
 
3.6%
5 14
 
2.7%
3 13
 
2.5%
2 11
 
2.1%
0 11
 
2.1%
s 11
 
2.1%
Other values (20) 67
12.7%
CJK
ValueCountFrequency (%)
16
40.0%
12
30.0%
9
22.5%
3
 
7.5%
Compat Jamo
ValueCountFrequency (%)
1
100.0%

대분류
Categorical

HIGH CORRELATION 

Distinct: 9
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 20.4 KiB
표본류: 831
민속류: 418
공예품: 361
박제류: 327
임산물: 240
Other values (4): 416

Length

Max length: 3
Median length: 3
Mean length: 2.9444659
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 박제류
2nd row: 박제류
3rd row: 박제류
4th row: 박제류
5th row: 박제류

Common Values

Value | Count | Frequency (%)
표본류 | 831 | 32.0%
민속류 | 418 | 16.1%
공예품 | 361 | 13.9%
박제류 | 327 | 12.6%
임산물 | 240 | 9.3%
소품류 | 182 | 7.0%
사진 | 144 | 5.6%
가구류 | 60 | 2.3%
서적류 | 30 | 1.2%
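
The mean length of 2.9444659 reported for 대분류 can be recovered from the counts in Common Values: eight labels are three characters long, and 사진 (144 rows) is two. A minimal check, assuming the labels are stored as precomposed Hangul syllables so that `len` counts one per syllable:

```python
# Category counts copied from the Common Values table for 대분류.
counts = {
    "표본류": 831, "민속류": 418, "공예품": 361, "박제류": 327, "임산물": 240,
    "소품류": 182, "사진": 144, "가구류": 60, "서적류": 30,
}

n_rows = sum(counts.values())
# Weight each label's character length by how often it occurs.
total_chars = sum(len(label) * count for label, count in counts.items())
mean_length = total_chars / n_rows

print(n_rows, total_chars, round(mean_length, 7))
```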

Length

Histogram of lengths of the category


재질
Categorical

HIGH CORRELATION 

Distinct: 20
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 20.4 KiB
목재: 835
깃털: 262
식물: 258
버섯: 256
한지: 210
Other values (15): 772

Length

Max length: 3
Median length: 2
Mean length: 1.9625916
Min length: 1

Unique

Unique: 2
Unique (%): 0.1%

Sample

1st row: 깃털
2nd row: 깃털
3rd row: 깃털
4th row: 깃털
5th row: 깃털

Common Values

ValueCountFrequency (%)
목재 835
32.2%
깃털 262
 
10.1%
식물 258
 
9.9%
버섯 256
 
9.9%
한지 210
 
8.1%
종이 170
 
6.6%
금속 103
 
4.0%
석재 83
 
3.2%
화석 79
 
3.0%
79
 
3.0%
Other values (10) 258
 
9.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
목재 835
32.2%
깃털 263
 
10.1%
식물 258
 
9.9%
버섯 256
 
9.9%
한지 210
 
8.1%
종이 170
 
6.6%
금속 103
 
4.0%
석재 83
 
3.2%
79
 
3.0%
화석 79
 
3.0%
Other values (9) 257
 
9.9%

수량
Real number (ℝ)

Distinct: 36
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.7986888
Minimum: 1
Maximum: 150
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 1
95-th percentile: 4
Maximum: 150
Range: 149
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 5.4384503
Coefficient of variation (CV): 3.0235638
Kurtosis: 445.8517
Mean: 1.7986888
Median Absolute Deviation (MAD): 0
Skewness: 18.399348
Sum: 4664
Variance: 29.576742
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=36)
Value | Count | Frequency (%)
1 | 2263 | 87.3%
2 | 148 | 5.7%
3 | 36 | 1.4%
4 | 25 | 1.0%
10 | 22 | 0.8%
5 | 21 | 0.8%
12 | 15 | 0.6%
6 | 12 | 0.5%
9 | 6 | 0.2%
11 | 6 | 0.2%
Other values (26) | 39 | 1.5%
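
Each frequency in this table is the count divided by the 2593 observations, and the "Other values" row is whatever remains after the ten most common quantities. A short sketch that recomputes both from the listed counts (the variable names are illustrative):

```python
# Counts for the ten most common 수량 values, copied from the table above.
n_rows = 2593
counts = {1: 2263, 2: 148, 3: 36, 4: 25, 10: 22, 5: 21, 12: 15, 6: 12, 9: 6, 11: 6}

# Rows not covered by the ten listed values end up in "Other values".
other_count = n_rows - sum(counts.values())

percentages = {value: round(100 * count / n_rows, 1) for value, count in counts.items()}
other_pct = round(100 * other_count / n_rows, 1)

print(percentages[1], percentages[2], other_count, other_pct)
```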
Value | Count | Frequency (%)
1 | 2263 | 87.3%
2 | 148 | 5.7%
3 | 36 | 1.4%
4 | 25 | 1.0%
5 | 21 | 0.8%
6 | 12 | 0.5%
7 | 3 | 0.1%
8 | 1 | < 0.1%
9 | 6 | 0.2%
10 | 22 | 0.8%

Value | Count | Frequency (%)
150 | 1 | < 0.1%
149 | 1 | < 0.1%
74 | 1 | < 0.1%
65 | 1 | < 0.1%
55 | 1 | < 0.1%
43 | 1 | < 0.1%
36 | 1 | < 0.1%
35 | 1 | < 0.1%
34 | 1 | < 0.1%
33 | 1 | < 0.1%

취득일자
DateTime

Distinct: 75
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Memory size: 20.4 KiB
Minimum: 2000-08-08 00:00:00
Maximum: 2017-12-05 00:00:00
Histogram with fixed size bins (bins=50)
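
To put the acquisition-date range in perspective, the span between the reported minimum and maximum can be computed with the standard library. This is only an illustrative calculation on the two reported endpoints:

```python
from datetime import date

# Minimum and maximum acquisition dates copied from the report.
earliest = date(2000, 8, 8)
latest = date(2017, 12, 5)

span = latest - earliest
span_years = span.days / 365.25  # approximate length in years

print(span.days, round(span_years, 1))
```

With only 75 distinct dates across that span, many exhibits share an acquisition date.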

보관장소
Categorical

IMBALANCE 

Distinct: 21
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 20.4 KiB
수장고 2: 1654
전시실 1: 234
전시실 5: 205
수장고 1: 178
전시실 4: 146
Other values (16): 176

Length

Max length: 15
Median length: 5
Mean length: 4.9255688
Min length: 2

Unique

Unique: 7
Unique (%): 0.3%

Sample

1st row: 수장고 2
2nd row: 수장고 2
3rd row: 수장고 2
4th row: 수장고 2
5th row: 수장고 2

Common Values

Value | Count | Frequency (%)
수장고 2 | 1654 | 63.8%
전시실 1 | 234 | 9.0%
전시실 5 | 205 | 7.9%
수장고 1 | 178 | 6.9%
전시실 4 | 146 | 5.6%
기획실 | 66 | 2.5%
쉼터 | 43 | 1.7%
전시실 2 | 34 | 1.3%
산촌주택 | 12 | 0.5%
전시실 2, 전시실 4 | 3 | 0.1%
Other values (11) | 18 | 0.7%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
수장고 | 1837 | 36.2%
2 | 1700 | 33.5%
전시실 | 635 | 12.5%
1 | 415 | 8.2%
5 | 205 | 4.0%
4 | 150 | 3.0%
기획실 | 70 | 1.4%
쉼터 | 43 | 0.8%
산촌주택 | 12 | 0.2%
숲체험장 | 3 | 0.1%
Other values (6) | 7 | 0.1%

비고
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 2593
Missing (%): 100.0%
Memory size: 22.9 KiB

Interactions


Correlations

Variable | 구연번 | 대분류 | 재질 | 수량 | 취득일자 | 보관장소
구연번 | 1.000 | 0.691 | 0.824 | 0.081 | 0.828 | 0.763
대분류 | 0.691 | 1.000 | 0.926 | 0.000 | 0.906 | 0.747
재질 | 0.824 | 0.926 | 1.000 | 0.294 | 0.865 | 0.623
수량 | 0.081 | 0.000 | 0.294 | 1.000 | 0.197 | 0.340
취득일자 | 0.828 | 0.906 | 0.865 | 0.197 | 1.000 | 0.854
보관장소 | 0.763 | 0.747 | 0.623 | 0.340 | 0.854 | 1.000

Variable | 대분류 | 재질 | 보관장소
대분류 | 1.000 | 0.711 | 0.403
재질 | 0.711 | 1.000 | 0.223
보관장소 | 0.403 | 0.223 | 1.000

Variable | 구연번 | 수량 | 대분류 | 재질 | 보관장소
구연번 | 1.000 | -0.011 | 0.404 | 0.414 | 0.405
수량 | -0.011 | 1.000 | 0.000 | 0.141 | 0.160
대분류 | 0.404 | 0.000 | 1.000 | 0.711 | 0.403
재질 | 0.414 | 0.141 | 0.711 | 1.000 | 0.223
보관장소 | 0.405 | 0.160 | 0.403 | 0.223 | 1.000
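
The report text does not name the measure behind each matrix. The matrix restricted to the categorical variables is consistent with an association measure such as Cramér's V, so, under that assumption, the sketch below shows how an uncorrected Cramér's V is computed from two categorical columns. The function name `cramers_v` and the toy labels are hypothetical, and profiling tools may apply a bias correction that this version omits.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # observed cell counts
    row = Counter(xs)             # marginal counts of the first variable
    col = Counter(ys)             # marginal counts of the second variable

    # Pearson chi-squared statistic over every cell of the contingency table.
    chi2 = 0.0
    for x in row:
        for y in col:
            expected = row[x] * col[y] / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row), len(col)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association
    return math.sqrt(chi2 / (n * k))


# Toy data, not taken from the dataset.
perfect = cramers_v(["a", "a", "b", "b"], ["p", "p", "q", "q"])
independent = cramers_v(["a", "a", "b", "b"], ["p", "q", "p", "q"])
print(perfect, independent)
```

A value near 1 means one variable almost determines the other, which is the kind of relationship the High correlation alert flags for 대분류 and 재질.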

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

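Index | 구연번 | 구분 | 자료명 | 대분류 | 재질 | 수량 | 취득일자 | 보관장소 | 비고
0 | 1 | 전시물 | 원앙(암,수) | 박제류 | 깃털 | 2 | 2009-05-28 | 수장고 2 | <NA>
1 | 2 | 전시물 |  | 박제류 | 깃털 | 1 | 2009-05-28 | 수장고 2 | <NA>
2 | 3 | 전시물 | 청머리오리 | 박제류 | 깃털 | 1 | 2009-05-28 | 수장고 2 | <NA>
3 | 4 | 전시물 | 후투티 | 박제류 | 깃털 | 2 | 2009-05-28 | 수장고 2 | <NA>
4 | 5 | 전시물 | 잿빛개구리매 | 박제류 | 깃털 | 2 | 2009-05-28 | 수장고 2 | <NA>
5 | 6 | 전시물 | 황여새 | 박제류 | 깃털 | 2 | 2009-05-28 | 수장고 2 | <NA>
6 | 7 | 전시물 | 해오라기 | 박제류 | 깃털 | 1 | 2009-05-28 | 수장고 2 | <NA>
7 | 8 | 전시물 |  | 박제류 | 깃털 | 1 | 2009-05-28 | 수장고 2 | <NA>
8 | 9 | 전시물 | 되새 | 박제류 | 깃털 | 5 | 2009-05-28 | 수장고 2 | <NA>
9 | 10 | 전시물 | 흑부리오리 | 박제류 | 깃털 | 1 | 2009-05-28 | 수장고 2 | <NA>

Index | 구연번 | 구분 | 자료명 | 대분류 | 재질 | 수량 | 취득일자 | 보관장소 | 비고
2583 | 2590 | 전시물 | 버섯(이름알수없음) | 표본류 | 버섯 | 65 | 2009-05-28 | 수장고 2 | <NA>
2584 | 2591 | 전시물 | 한지보석함 | 소품류 | 한지 | 1 | 2004-12-06 | 수장고 2 | <NA>
2585 | 2592 | 전시물 | 맥반석 | 임산물 | 석재 | 1 | 2009-05-28 | 수장고 2 | <NA>
2586 | 2593 | 전시물 | 합죽선(백선15절) | 공예품 | 한지 | 1 | 2007-12-10 | 수장고 2 | <NA>
2587 | 2594 | 전시물 | 받침 | 공예품 | 목재 | 7 | 2004-12-06 | 수장고 2 | <NA>
2588 | 2595 | 전시물 | 허수아비 | 공예품 | 목재 | 1 | 2005-01-07 | 전시실 4 | <NA>
2589 | 2596 | 전시물 | 참내는 여인 | 공예품 | 목재 | 1 | 2005-01-07 | 전시실 4 | <NA>
2590 | 2597 | 전시물 | 탈춤 | 공예품 | 목재 | 1 | 2005-01-07 | 전시실 4 | <NA>
2591 | 2598 | 전시물 | 어린아이 조각상 | 공예품 | 목재 | 1 | 2005-01-07 | 전시실 4 | <NA>
2592 | 2599 | 전시물 | 나무절구 | 민속류 | 목재 | 2 | 2005-11-19 | 수장고 1 | <NA>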