Overview

Dataset statistics

Number of variables: 6
Number of observations: 3220
Missing cells: 1
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 157.4 KiB
Average record size in memory: 50.0 B
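
The percentage and per-record figures above follow from the raw counts. The short check below is a sketch rather than the profiler's own calculation: it assumes "missing cells (%)" is the missing count divided by variables times observations, and that the average record size is the total size divided by the number of observations. Because 157.4 KiB is itself rounded, the result only approximately matches the reported 50.0 B.

```python
# Figures copied from the dataset statistics above.
n_variables = 6
n_observations = 3220
missing_cells = 1
total_size_kib = 157.4  # already rounded in the report

# Assumption: percentages are taken over all cells (variables x observations).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: average record size = total bytes / number of observations.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.4f}%")
print(f"average record size: {avg_record_bytes:.2f} B")
```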

Variable types

Numeric: 2
Categorical: 3
Text: 1

Dataset

Description: Detailed information on the exhibits held by the Mokpo Natural History Museum (major category, middle category, minor category, Korean name, quantity).
Author: Mokpo City, Jeollanam-do
URL: https://www.data.go.kr/data/15064173/fileData.do

Alerts

카테고리 대분류 is highly overall correlated with 순차번호 and 2 other fields (High correlation)
카테고리 중분류 is highly overall correlated with 순차번호 and 2 other fields (High correlation)
카테고리 소분류 is highly overall correlated with 순차번호 and 3 other fields (High correlation)
순차번호 is highly overall correlated with 카테고리 대분류 and 2 other fields (High correlation)
수량 is highly overall correlated with 카테고리 소분류 (High correlation)
순차번호 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 10:41:46.998670
Analysis finished: 2023-12-12 10:41:48.294581
Duration: 1.3 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

순차번호
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 3220
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1610.5
Minimum: 1
Maximum: 3220
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 28.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 161.95
Q1: 805.75
Median: 1610.5
Q3: 2415.25
95-th percentile: 3059.05
Maximum: 3220
Range: 3219
Interquartile range (IQR): 1609.5

Descriptive statistics

Standard deviation: 929.67826
Coefficient of variation (CV): 0.57726064
Kurtosis: -1.2
Mean: 1610.5
Median Absolute Deviation (MAD): 805
Skewness: 0
Sum: 5185810
Variance: 864301.67
Monotonicity: Strictly increasing
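
Because 순차번호 is unique, strictly increasing, and spans 1 to 3220 with 3220 distinct values, these figures can be reproduced from the integer sequence alone. The sketch below uses only the Python standard library. It assumes the column is exactly 1, 2, ..., 3220, that percentiles use linear interpolation, and that the variance uses the n - 1 denominator; the `percentile` helper is hypothetical and written only for this illustration.

```python
import statistics

# Assumption: the column is exactly the integers 1..3220.
ids = list(range(1, 3221))

def percentile(sorted_values, q):
    """Linear-interpolation percentile for q in [0, 1] (hypothetical helper)."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

total = sum(ids)
mean = statistics.fmean(ids)
variance = statistics.variance(ids)   # sample variance (n - 1 denominator)
std_dev = statistics.stdev(ids)
cv = std_dev / mean                   # coefficient of variation
median = statistics.median(ids)
mad = statistics.median(abs(x - median) for x in ids)
p05 = percentile(ids, 0.05)
q1, q3 = percentile(ids, 0.25), percentile(ids, 0.75)

print("sum:", total, "mean:", mean, "variance:", round(variance, 2))
print("std:", round(std_dev, 5), "cv:", round(cv, 8), "mad:", mad)
print("p05:", round(p05, 2), "q1:", q1, "q3:", q3, "iqr:", q3 - q1)
```

The agreement with the reported values suggests the column is a plain row counter, which would also explain its strong correlation with the category columns if the file is sorted by category.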
Histogram with fixed size bins (bins=50)
Value    Count    Frequency (%)
1    1    < 0.1%
2153    1    < 0.1%
2143    1    < 0.1%
2144    1    < 0.1%
2145    1    < 0.1%
2146    1    < 0.1%
2147    1    < 0.1%
2148    1    < 0.1%
2149    1    < 0.1%
2150    1    < 0.1%
Other values (3210)    3210    99.7%

Value    Count    Frequency (%)
1    1    < 0.1%
2    1    < 0.1%
3    1    < 0.1%
4    1    < 0.1%
5    1    < 0.1%
6    1    < 0.1%
7    1    < 0.1%
8    1    < 0.1%
9    1    < 0.1%
10    1    < 0.1%

Value    Count    Frequency (%)
3220    1    < 0.1%
3219    1    < 0.1%
3218    1    < 0.1%
3217    1    < 0.1%
3216    1    < 0.1%
3215    1    < 0.1%
3214    1    < 0.1%
3213    1    < 0.1%
3212    1    < 0.1%
3211    1    < 0.1%

카테고리 대분류
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 25.3 KiB

자연사박물관    2698
문예역사관    522

Length

Max length: 6
Median length: 6
Mean length: 5.8378882
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 자연사박물관
2nd row: 자연사박물관
3rd row: 자연사박물관
4th row: 자연사박물관
5th row: 자연사박물관

Common Values

Value    Count    Frequency (%)
자연사박물관    2698    83.8%
문예역사관    522    16.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value    Count    Frequency (%)
자연사박물관    2698    83.8%
문예역사관    522    16.2%

카테고리 중분류
Categorical

HIGH CORRELATION 

Distinct: 10
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 25.3 KiB

지질    1142
해양생물    1041
조 · 포유류    515
화폐    262
오승우    100
Other values (5)    160

Length

Max length: 7
Median length: 6
Mean length: 3.5701863
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 지질
2nd row: 지질
3rd row: 지질
4th row: 지질
5th row: 지질

Common Values

Value    Count    Frequency (%)
지질    1142    35.5%
해양생물    1041    32.3%
조 · 포유류    515    16.0%
화폐    262    8.1%
오승우    100    3.1%
운림산방    67    2.1%
도자기    43    1.3%
향토작가    31    1.0%
목물고가구    16    0.5%
기타(조각)    3    0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value    Count    Frequency (%)
지질    1142    26.9%
해양생물    1041    24.5%
조    515    12.1%
·    515    12.1%
포유류    515    12.1%
화폐    262    6.2%
오승우    100    2.4%
운림산방    67    1.6%
도자기    43    1.0%
향토작가    31    0.7%
Other values (2)    19    0.4%
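
The percentages in this table differ from those under Common Values (26.9% against 35.5% for 지질) and the count 515 appears three times. Both are consistent with the plot counting whitespace-separated tokens rather than whole category values, so that "조 · 포유류" contributes three tokens per row. The sketch below tests that reading; the whitespace tokenization is an assumption inferred from the numbers, not something the report states.

```python
from collections import Counter

# Category counts copied from the Common Values table for 카테고리 중분류.
category_counts = {
    "지질": 1142,
    "해양생물": 1041,
    "조 · 포유류": 515,
    "화폐": 262,
    "오승우": 100,
    "운림산방": 67,
    "도자기": 43,
    "향토작가": 31,
    "목물고가구": 16,
    "기타(조각)": 3,
}

# Assumption: the plot counts whitespace-separated tokens, so a value with
# spaces is split and each piece inherits the row count of the full value.
token_counts = Counter()
for value, count in category_counts.items():
    for token in value.split():
        token_counts[token] += count

total_tokens = sum(token_counts.values())
for token, count in token_counts.most_common():
    print(f"{token}\t{count}\t{100 * count / total_tokens:.1f}%")
```

Under this assumption the denominator becomes the total token count rather than the 3220 rows, which reproduces the percentages shown in the plot table.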

카테고리 소분류
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 25.3 KiB

<NA>    1463
광물    1142
조류    348
포유류    132
서양화    100

Length

Max length: 5
Median length: 4
Mean length: 3.013354
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 광물
2nd row: 광물
3rd row: 광물
4th row: 광물
5th row: 광물

Common Values

Value    Count    Frequency (%)
<NA>    1463    45.4%
광물    1142    35.5%
조류    348    10.8%
포유류    132    4.1%
서양화    100    3.1%
양서파충류    35    1.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value    Count    Frequency (%)
na    1463    45.4%
광물    1142    35.5%
조류    348    10.8%
포유류    132    4.1%
서양화    100    3.1%
양서파충류    35    1.1%

한글이름
Text

Distinct: 2285
Distinct (%): 71.0%
Missing: 0
Missing (%): 0.0%
Memory size: 25.3 KiB

Length

Max length: 49
Median length: 43
Mean length: 6.3015528
Min length: 1

Characters and Unicode

Total characters: 20291
Distinct characters: 748
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1910
Unique (%): 59.3%

Sample

1st row: 규석
2nd row: 보크사이트
3rd row: 반토혈암
4th row: 황철석
5th row: 금은혼합석
ValueCountFrequency (%)
화폐 187
 
4.5%
삼엽충 55
 
1.3%
원시소철류 54
 
1.3%
원시 36
 
0.9%
고사리류 29
 
0.7%
27
 
0.7%
인목류 25
 
0.6%
신생대식물 22
 
0.5%
구과류 17
 
0.4%
화석 15
 
0.4%
Other values (2565) 3644
88.6%

Most occurring characters

ValueCountFrequency (%)
1158
 
5.7%
414
 
2.0%
406
 
2.0%
365
 
1.8%
i 362
 
1.8%
a 359
 
1.8%
344
 
1.7%
331
 
1.6%
( 319
 
1.6%
) 319
 
1.6%
Other values (738) 15914
78.4%

Most occurring categories

Value    Count    Frequency (%)
Other Letter    14681    72.4%
Lowercase Letter    3386    16.7%
Space Separator    1158    5.7%
Open Punctuation    319    1.6%
Close Punctuation    319    1.6%
Uppercase Letter    294    1.4%
Decimal Number    62    0.3%
Connector Punctuation    49    0.2%
Dash Punctuation    18    0.1%
Other Punctuation    5    < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
414
 
2.8%
406
 
2.8%
365
 
2.5%
344
 
2.3%
331
 
2.3%
283
 
1.9%
267
 
1.8%
256
 
1.7%
214
 
1.5%
210
 
1.4%
Other values (671) 11591
79.0%
Lowercase Letter
ValueCountFrequency (%)
i 362
10.7%
a 359
10.6%
e 301
 
8.9%
o 260
 
7.7%
s 253
 
7.5%
r 243
 
7.2%
l 229
 
6.8%
t 224
 
6.6%
n 189
 
5.6%
u 167
 
4.9%
Other values (16) 799
23.6%
Uppercase Letter
ValueCountFrequency (%)
C 40
13.6%
A 39
13.3%
P 31
10.5%
T 30
10.2%
S 25
 
8.5%
D 16
 
5.4%
F 16
 
5.4%
B 10
 
3.4%
M 10
 
3.4%
H 9
 
3.1%
Other values (14) 68
23.1%
Decimal Number
ValueCountFrequency (%)
8 14
22.6%
1 11
17.7%
0 8
12.9%
2 8
12.9%
3 6
9.7%
9 4
 
6.5%
7 3
 
4.8%
4 3
 
4.8%
5 3
 
4.8%
6 2
 
3.2%
Other Punctuation
ValueCountFrequency (%)
. 3
60.0%
: 2
40.0%
Space Separator
ValueCountFrequency (%)
1158
100.0%
Open Punctuation
ValueCountFrequency (%)
( 319
100.0%
Close Punctuation
ValueCountFrequency (%)
) 319
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 49
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 18
100.0%

Most occurring scripts

Value    Count    Frequency (%)
Hangul    14681    72.4%
Latin    3680    18.1%
Common    1930    9.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
414
 
2.8%
406
 
2.8%
365
 
2.5%
344
 
2.3%
331
 
2.3%
283
 
1.9%
267
 
1.8%
256
 
1.7%
214
 
1.5%
210
 
1.4%
Other values (671) 11591
79.0%
Latin
ValueCountFrequency (%)
i 362
 
9.8%
a 359
 
9.8%
e 301
 
8.2%
o 260
 
7.1%
s 253
 
6.9%
r 243
 
6.6%
l 229
 
6.2%
t 224
 
6.1%
n 189
 
5.1%
u 167
 
4.5%
Other values (40) 1093
29.7%
Common
ValueCountFrequency (%)
1158
60.0%
( 319
 
16.5%
) 319
 
16.5%
_ 49
 
2.5%
- 18
 
0.9%
8 14
 
0.7%
1 11
 
0.6%
0 8
 
0.4%
2 8
 
0.4%
3 6
 
0.3%
Other values (7) 20
 
1.0%

Most occurring blocks

Value    Count    Frequency (%)
Hangul    14670    72.3%
ASCII    5610    27.6%
Compat Jamo    11    0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1158
20.6%
i 362
 
6.5%
a 359
 
6.4%
( 319
 
5.7%
) 319
 
5.7%
e 301
 
5.4%
o 260
 
4.6%
s 253
 
4.5%
r 243
 
4.3%
l 229
 
4.1%
Other values (57) 1807
32.2%
Hangul
ValueCountFrequency (%)
414
 
2.8%
406
 
2.8%
365
 
2.5%
344
 
2.3%
331
 
2.3%
283
 
1.9%
267
 
1.8%
256
 
1.7%
214
 
1.5%
210
 
1.4%
Other values (670) 11580
78.9%
Compat Jamo
ValueCountFrequency (%)
11
100.0%

수량
Real number (ℝ)

HIGH CORRELATION 

Distinct: 86
Distinct (%): 2.7%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.8229264
Minimum: 1
Maximum: 839
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 28.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 2
95-th percentile: 21
Maximum: 839
Range: 838
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 27.219403
Coefficient of variation (CV): 4.674523
Kurtosis: 425.55087
Mean: 5.8229264
Median Absolute Deviation (MAD): 0
Skewness: 17.957582
Sum: 18744
Variance: 740.89592
Monotonicity: Not monotonic
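
Several of these statistics are tied together by simple identities, which makes for a quick consistency check. The sketch below assumes the mean is taken over the non-missing observations only (3220 rows minus 1 missing), that the variance is the square of the standard deviation, and that the coefficient of variation is the standard deviation divided by the mean. All inputs are copied from the report.

```python
# Figures copied from the 수량 statistics in the report.
n_observations = 3220
n_missing = 1
reported_sum = 18744
reported_mean = 5.8229264
reported_std = 27.219403
reported_variance = 740.89592
reported_cv = 4.674523

# Assumption: the missing value is excluded before averaging.
n_valid = n_observations - n_missing
mean = reported_sum / n_valid

# Variance is the squared standard deviation; CV is std / mean.
variance = reported_std ** 2
cv = reported_std / reported_mean

print(f"mean from sum: {mean:.7f} (reported {reported_mean})")
print(f"variance from std: {variance:.5f} (reported {reported_variance})")
print(f"cv from std and mean: {cv:.6f} (reported {reported_cv})")
```

The mean of about 5.8 against a median of 1 and a maximum of 839 matches the large skewness and kurtosis: a handful of very large quantities dominates the sum.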
Histogram with fixed size bins (bins=50)
Value    Count    Frequency (%)
1    2052    63.7%
2    369    11.5%
5    120    3.7%
3    106    3.3%
4    82    2.5%
10    65    2.0%
6    51    1.6%
7    32    1.0%
8    32    1.0%
12    23    0.7%
Other values (76)    287    8.9%

Value    Count    Frequency (%)
1    2052    63.7%
2    369    11.5%
3    106    3.3%
4    82    2.5%
5    120    3.7%
6    51    1.6%
7    32    1.0%
8    32    1.0%
9    17    0.5%
10    65    2.0%

Value    Count    Frequency (%)
839    1    < 0.1%
576    1    < 0.1%
522    1    < 0.1%
518    1    < 0.1%
295    2    0.1%
256    1    < 0.1%
250    1    < 0.1%
200    1    < 0.1%
190    1    < 0.1%
180    1    < 0.1%

Interactions

[Figures: four interaction plots]

Correlations

    순차번호    카테고리 대분류    카테고리 중분류    카테고리 소분류    수량
순차번호    1.000    0.989    0.931    0.836    0.145
카테고리 대분류    0.989    1.000    1.000    1.000    0.135
카테고리 중분류    0.931    1.000    1.000    1.000    0.117
카테고리 소분류    0.836    1.000    1.000    1.000    NaN
수량    0.145    0.135    0.117    NaN    1.000

    카테고리 대분류    카테고리 중분류    카테고리 소분류
카테고리 대분류    1.000    0.999    0.999
카테고리 중분류    0.999    1.000    0.999
카테고리 소분류    0.999    0.999    1.000
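
The report text does not name the coefficient behind each matrix. One common measure of association between two categorical variables is Cramér's V, and the sketch below computes its uncorrected form for 카테고리 대분류 against 카테고리 중분류. The contingency table is an assumption: it supposes each 중분류 value belongs to exactly one 대분류, which the Common Values counts support (1142 + 1041 + 515 = 2698 and 262 + 100 + 67 + 43 + 31 + 16 + 3 = 522). The `cramers_v` function is a hypothetical helper written for this illustration, not the profiler's implementation.

```python
import math

# Assumed contingency counts: 카테고리 대분류 (rows) by 카테고리 중분류 (columns).
# Each 중분류 is assumed to occur under a single 대분류 (see the sums above).
table = {
    "자연사박물관": {"지질": 1142, "해양생물": 1041, "조 · 포유류": 515},
    "문예역사관": {"화폐": 262, "오승우": 100, "운림산방": 67, "도자기": 43,
                  "향토작가": 31, "목물고가구": 16, "기타(조각)": 3},
}

def cramers_v(table):
    """Uncorrected Cramér's V for a nested dict of observed counts."""
    columns = sorted({col for cells in table.values() for col in cells})
    row_totals = {row: sum(cells.values()) for row, cells in table.items()}
    col_totals = {col: sum(cells.get(col, 0) for cells in table.values())
                  for col in columns}
    n = sum(row_totals.values())

    # Pearson chi-squared statistic over every cell, including empty ones.
    chi2 = 0.0
    for row, cells in table.items():
        for col in columns:
            expected = row_totals[row] * col_totals[col] / n
            observed = cells.get(col, 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(table), len(columns))
    return math.sqrt(chi2 / (n * (k - 1)))

v = cramers_v(table)
print(f"Cramér's V = {v:.3f}")
```

A perfectly nested pair of variables reaches the maximum of the uncorrected statistic. A bias-corrected variant would land slightly below that, which may be why the matrix above shows 0.999, though that is a guess rather than something the report confirms.
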

    순차번호    수량    카테고리 대분류    카테고리 중분류    카테고리 소분류
순차번호    1.000    0.415    0.908    0.567    0.736
수량    0.415    1.000    0.097    0.061    1.000
카테고리 대분류    0.908    0.097    1.000    0.999    0.999
카테고리 중분류    0.567    0.061    0.999    1.000    0.999
카테고리 소분류    0.736    1.000    0.999    0.999    1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

    순차번호    카테고리 대분류    카테고리 중분류    카테고리 소분류    한글이름    수량
0    1    자연사박물관    지질    광물    규석    55
1    2    자연사박물관    지질    광물    보크사이트    46
2    3    자연사박물관    지질    광물    반토혈암    3
3    4    자연사박물관    지질    광물    황철석    4
4    5    자연사박물관    지질    광물    금은혼합석    32
5    6    자연사박물관    지질    광물    농홍은석    1
6    7    자연사박물관    지질    광물    담홍은석    2
7    8    자연사박물관    지질    광물    취은석_자연은    2
8    9    자연사박물관    지질    광물    금은광    4
9    10    자연사박물관    지질    광물    입상회석    3

    순차번호    카테고리 대분류    카테고리 중분류    카테고리 소분류    한글이름    수량
3210    3211    문예역사관    도자기    <NA>    백자장군    1
3211    3212    문예역사관    도자기    <NA>    물고기모양연적    1
3212    3213    문예역사관    도자기    <NA>    백자떡살    1
3213    3214    문예역사관    도자기    <NA>    백자떡살    1
3214    3215    문예역사관    도자기    <NA>    청화백자운학문병    1
3215    3216    문예역사관    도자기    <NA>    청화백자팔괴문다각호    1
3216    3217    문예역사관    도자기    <NA>    청화백자모란문항아리    1
3217    3218    문예역사관    기타(조각)    <NA>    예술비    1
3218    3219    문예역사관    기타(조각)    <NA>    파도타는 여인    1
3219    3220    문예역사관    기타(조각)    <NA>    고 남농허건 애석비    1