Overview

Dataset statistics

Number of variables: 9
Number of observations: 200
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 14.8 KiB
Average record size in memory: 75.7 B

Variable types

Numeric: 3
Text: 2
Categorical: 4

Dataset

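Description: A list of the artifacts held by the National Saemangeum Reclamation Museum (국립새만금간척박물관), which was established to raise awareness of the Saemangeum reclamation project and of the history and cultural heritage of land reclamation in Korea. The data covers artifact names, materials, uses, and related fields.
URL: https://www.data.go.kr/data/15119730/fileData.do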

Alerts

재질 has constant value "지류" (Constant)
입수연유 has constant value "구입" (Constant)
용도_기능 is highly overall correlated with 국적 (High correlation)
국적 is highly overall correlated with 연번 and 3 other fields (High correlation)
연번 is highly overall correlated with 유물번호 and 1 other fields (High correlation)
유물번호 is highly overall correlated with 연번 and 1 other fields (High correlation)
주수량(점) is highly overall correlated with 국적 (High correlation)
국적 is highly imbalanced (95.5%) (Imbalance)
연번 has unique values (Unique)
유물번호 has unique values (Unique)
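
The 95.5% imbalance figure for 국적 can be reproduced from the class counts alone. The sketch below assumes the score is one minus the normalized Shannon entropy of the value counts; that definition is an assumption about how the profiler scores imbalance, and the function name `imbalance_score` is hypothetical. The counts (199 and 1) are taken from the 국적 section of this report.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/H_max, where H is the Shannon entropy of the class counts.

    Assumed definition: 0 means perfectly balanced, 1 means a single class.
    """
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(probs) < 2:
        # Assumption: a single class is treated as a constant column, scored 0.
        return 0.0
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(probs))

# 국적: 199 rows with "한국" and 1 row with "<NA>".
score = imbalance_score([199, 1])
print(f"{score:.1%}")
```

With a 199-to-1 split the entropy is close to zero, so the score lands near one, which is why the column is flagged.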

Reproduction

Analysis started: 2023-12-12 17:23:13.220071
Analysis finished: 2023-12-12 17:23:15.323727
Duration: 2.1 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 200
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 100.5
Minimum: 1
Maximum: 200
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 10.95
Q1: 50.75
Median: 100.5
Q3: 150.25
95-th percentile: 190.05
Maximum: 200
Range: 199
Interquartile range (IQR): 99.5
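
The fractional percentiles above (10.95, 50.75 and so on) follow from linear interpolation between neighbouring order statistics. As a minimal sketch, assuming 연번 is exactly the integers 1 to 200 (consistent with the reported minimum, maximum, distinct count and sum), the standard library reproduces them with its "inclusive" method:

```python
import statistics

# Assumption: 연번 holds the integers 1..200, as the summary statistics suggest.
values = list(range(1, 201))

# method="inclusive" interpolates linearly at position (n - 1) * p.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
twentieths = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]
iqr = q3 - q1

print(p5, q1, median, q3, p95, iqr)
```

For the 5th percentile the position is 0.05 × 199 = 9.95, so the result lies 95% of the way from the 10th value (10) to the 11th value (11).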

Descriptive statistics

Standard deviation: 57.879185
Coefficient of variation (CV): 0.57591228
Kurtosis: -1.2
Mean: 100.5
Median Absolute Deviation (MAD): 50
Skewness: 0
Sum: 20100
Variance: 3350
Monotonicity: Strictly increasing
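
These descriptive statistics can be checked by hand. The sketch below again assumes 연번 is the integers 1 to 200 and that the variance is the sample variance (divisor n - 1), which is the only choice that yields exactly 3350 here.

```python
import statistics

# Assumption: 연번 holds the integers 1..200.
values = list(range(1, 201))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, divisor n - 1
std = statistics.stdev(values)
cv = std / mean                          # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, variance, std, cv, mad)
```

For the integers 1 to n the sample variance is n(n + 1)/12, which gives 200 × 201 / 12 = 3350; the population variance (divisor n) would be 3333.25 instead.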
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
1                   1      0.5%
139                 1      0.5%
129                 1      0.5%
130                 1      0.5%
131                 1      0.5%
132                 1      0.5%
133                 1      0.5%
134                 1      0.5%
135                 1      0.5%
136                 1      0.5%
Other values (190)  190    95.0%

Minimum 10 values
Value  Count  Frequency (%)
1      1      0.5%
2      1      0.5%
3      1      0.5%
4      1      0.5%
5      1      0.5%
6      1      0.5%
7      1      0.5%
8      1      0.5%
9      1      0.5%
10     1      0.5%

Maximum 10 values
Value  Count  Frequency (%)
200    1      0.5%
199    1      0.5%
198    1      0.5%
197    1      0.5%
196    1      0.5%
195    1      0.5%
194    1      0.5%
193    1      0.5%
192    1      0.5%
191    1      0.5%

유물번호
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 200
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 100.5
Minimum: 1
Maximum: 200
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 10.95
Q1: 50.75
Median: 100.5
Q3: 150.25
95-th percentile: 190.05
Maximum: 200
Range: 199
Interquartile range (IQR): 99.5

Descriptive statistics

Standard deviation: 57.879185
Coefficient of variation (CV): 0.57591228
Kurtosis: -1.2
Mean: 100.5
Median Absolute Deviation (MAD): 50
Skewness: 0
Sum: 20100
Variance: 3350
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
1                   1      0.5%
139                 1      0.5%
129                 1      0.5%
130                 1      0.5%
131                 1      0.5%
132                 1      0.5%
133                 1      0.5%
134                 1      0.5%
135                 1      0.5%
136                 1      0.5%
Other values (190)  190    95.0%

Minimum 10 values
Value  Count  Frequency (%)
1      1      0.5%
2      1      0.5%
3      1      0.5%
4      1      0.5%
5      1      0.5%
6      1      0.5%
7      1      0.5%
8      1      0.5%
9      1      0.5%
10     1      0.5%

Maximum 10 values
Value  Count  Frequency (%)
200    1      0.5%
199    1      0.5%
198    1      0.5%
197    1      0.5%
196    1      0.5%
195    1      0.5%
194    1      0.5%
193    1      0.5%
192    1      0.5%
191    1      0.5%

유물명
Text

Distinct: 196
Distinct (%): 98.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 27
Median length: 20
Mean length: 10.025
Min length: 2

Characters and Unicode

Total characters: 2005
Distinct characters: 256
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
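
As a small illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for one artifact name taken from the sample rows of this report; the helper name `category_counts` is hypothetical. The two-letter codes map to the labels used in the tables: `Lo` is Other Letter, `Nd` is Decimal Number, `Zs` is Space Separator, `Ps` and `Pe` are Open and Close Punctuation.

```python
import unicodedata
from collections import Counter

def category_counts(text: str) -> Counter:
    """Count the Unicode general category (e.g. 'Lo', 'Nd') of each character."""
    return Counter(unicodedata.category(ch) for ch in text)

# One artifact name from the report's sample rows.
sample = "조석표 상권(1934년)"
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```

Script and block lookups are not part of `unicodedata`, so the script and block tables would need a different data source.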

Unique

Unique: 192
Unique (%): 96.0%

Sample

1st row: 조선박람회 포스터 3종과 연하장
2nd row: 조선박람회장 컬러 엽서
3rd row: 익옥수리조합엽서
4th row: 동진수리조합 사진엽서
5th row: 한강채집 사진엽서
Value               Count  Frequency (%)
엽서                19     4.4%
봉투                17     3.9%
엽서와              9      2.1%
사진엽서            9      2.1%
조선의              8      1.8%
지형도              7      1.6%
조선총독부          6      1.4%
경성                5      1.2%
기념엽서와          4      0.9%
조선                4      0.9%
Other values (302)  345    79.7%

Most occurring characters

ValueCountFrequency (%)
236
 
11.8%
78
 
3.9%
75
 
3.7%
72
 
3.6%
69
 
3.4%
61
 
3.0%
58
 
2.9%
46
 
2.3%
40
 
2.0%
32
 
1.6%
Other values (246) 1238
61.7%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       1723   85.9%
Space Separator    236    11.8%
Decimal Number     33     1.6%
Open Punctuation   6      0.3%
Close Punctuation  6      0.3%
Dash Punctuation   1      < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
78
 
4.5%
75
 
4.4%
72
 
4.2%
69
 
4.0%
61
 
3.5%
58
 
3.4%
46
 
2.7%
40
 
2.3%
32
 
1.9%
25
 
1.5%
Other values (234) 1167
67.7%
Decimal Number
ValueCountFrequency (%)
1 9
27.3%
2 5
15.2%
5 4
12.1%
6 4
12.1%
3 4
12.1%
9 3
 
9.1%
0 2
 
6.1%
4 2
 
6.1%
Space Separator
ValueCountFrequency (%)
236
100.0%
Open Punctuation
ValueCountFrequency (%)
( 6
100.0%
Close Punctuation
ValueCountFrequency (%)
) 6
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  1715   85.5%
Common  282    14.1%
Han     8      0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
78
 
4.5%
75
 
4.4%
72
 
4.2%
69
 
4.0%
61
 
3.6%
58
 
3.4%
46
 
2.7%
40
 
2.3%
32
 
1.9%
25
 
1.5%
Other values (226) 1159
67.6%
Common
ValueCountFrequency (%)
236
83.7%
1 9
 
3.2%
( 6
 
2.1%
) 6
 
2.1%
2 5
 
1.8%
5 4
 
1.4%
6 4
 
1.4%
3 4
 
1.4%
9 3
 
1.1%
0 2
 
0.7%
Other values (2) 3
 
1.1%
Han
ValueCountFrequency (%)
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  1715   85.5%
ASCII   282    14.1%
CJK     8      0.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
236
83.7%
1 9
 
3.2%
( 6
 
2.1%
) 6
 
2.1%
2 5
 
1.8%
5 4
 
1.4%
6 4
 
1.4%
3 4
 
1.4%
9 3
 
1.1%
0 2
 
0.7%
Other values (2) 3
 
1.1%
Hangul
ValueCountFrequency (%)
78
 
4.5%
75
 
4.4%
72
 
4.2%
69
 
4.0%
61
 
3.6%
58
 
3.4%
46
 
2.7%
40
 
2.3%
32
 
1.9%
25
 
1.5%
Other values (226) 1159
67.6%
CJK
ValueCountFrequency (%)
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%

주수량(점)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 15
Distinct (%): 7.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.455
Minimum: 1
Maximum: 55
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 2
95-th percentile: 8.05
Maximum: 55
Range: 54
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 4.8895309
Coefficient of variation (CV): 1.9916623
Kurtosis: 74.337082
Mean: 2.455
Median Absolute Deviation (MAD): 0
Skewness: 7.7598663
Sum: 491
Variance: 23.907513
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)
Common values
Value             Count  Frequency (%)
1                 142    71.0%
2                 18     9.0%
4                 8      4.0%
3                 8      4.0%
6                 7      3.5%
5                 4      2.0%
9                 3      1.5%
12                2      1.0%
7                 2      1.0%
10                1      0.5%
Other values (5)  5      2.5%

Minimum 10 values
Value  Count  Frequency (%)
1      142    71.0%
2      18     9.0%
3      8      4.0%
4      8      4.0%
5      4      2.0%
6      7      3.5%
7      2      1.0%
8      1      0.5%
9      3      1.5%
10     1      0.5%

Maximum 10 values
Value  Count  Frequency (%)
55     1      0.5%
33     1      0.5%
13     1      0.5%
12     2      1.0%
11     1      0.5%
10     1      0.5%
9      3      1.5%
8      1      0.5%
7      2      1.0%
6      7      3.5%
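
Because 주수량(점) has only 15 distinct values, the lowest-value and highest-value tables above together list every value with its count (the counts add up to 200). That is enough to rebuild the column and cross-check the summary statistics. A minimal sketch, assuming sample variance and linearly interpolated percentiles:

```python
import statistics

# Value -> count, merged from the lowest-value and highest-value tables above.
freq = {1: 142, 2: 18, 3: 8, 4: 8, 5: 4, 6: 7, 7: 2, 8: 1,
        9: 3, 10: 1, 11: 1, 12: 2, 13: 1, 33: 1, 55: 1}

# Expand the frequency table back into a sorted list of 200 observations.
values = [v for v, n in sorted(freq.items()) for _ in range(n)]

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, divisor n - 1
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)
p95 = statistics.quantiles(values, n=20, method="inclusive")[-1]

print(len(values), total, mean, variance, median, mad, p95)
```

The MAD of 0 is not an error: 142 of the 200 rows equal the median of 1, so more than half of the absolute deviations are zero.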

국적
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB
한국: 199
<NA>: 1

Length

Max length: 4
Median length: 2
Mean length: 2.01
Min length: 2

Unique

Unique: 1
Unique (%): 0.5%

Sample

1st row: 한국
2nd row: 한국
3rd row: 한국
4th row: 한국
5th row: 한국

Common Values

Value  Count  Frequency (%)
한국   199    99.5%
<NA>   1      0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
한국   199    99.5%
na     1      0.5%

시기
Text

Distinct: 80
Distinct (%): 40.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 11
Median length: 5
Mean length: 5.505
Min length: 2

Characters and Unicode

Total characters: 1101
Distinct characters: 27
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 46
Unique (%): 23.0%

Sample

1st row: 1911년
2nd row: 1929년
3rd row: 1930년 경
4th row: 1929년
5th row: 1930년 대
ValueCountFrequency (%)
일제강점기 26
 
11.0%
19
 
8.0%
1930년 10
 
4.2%
1936년 10
 
4.2%
이후 8
 
3.4%
1937년 8
 
3.4%
1932년 8
 
3.4%
1929년 8
 
3.4%
1920년 8
 
3.4%
1935년 7
 
3.0%
Other values (54) 125
52.7%

Most occurring characters

ValueCountFrequency (%)
1 219
19.9%
9 193
17.5%
172
15.6%
3 84
 
7.6%
2 60
 
5.4%
38
 
3.5%
0 32
 
2.9%
6 29
 
2.6%
26
 
2.4%
26
 
2.4%
Other values (17) 222
20.2%

Most occurring categories

Value            Count  Frequency (%)
Decimal Number   700    63.6%
Other Letter     360    32.7%
Space Separator  38     3.5%
Math Symbol      3      0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
172
47.8%
26
 
7.2%
26
 
7.2%
26
 
7.2%
26
 
7.2%
26
 
7.2%
22
 
6.1%
9
 
2.5%
9
 
2.5%
6
 
1.7%
Other values (5) 12
 
3.3%
Decimal Number
ValueCountFrequency (%)
1 219
31.3%
9 193
27.6%
3 84
 
12.0%
2 60
 
8.6%
0 32
 
4.6%
6 29
 
4.1%
5 24
 
3.4%
4 23
 
3.3%
8 21
 
3.0%
7 15
 
2.1%
Space Separator
ValueCountFrequency (%)
38
100.0%
Math Symbol
ValueCountFrequency (%)
~ 3
100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  741    67.3%
Hangul  360    32.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
172
47.8%
26
 
7.2%
26
 
7.2%
26
 
7.2%
26
 
7.2%
26
 
7.2%
22
 
6.1%
9
 
2.5%
9
 
2.5%
6
 
1.7%
Other values (5) 12
 
3.3%
Common
ValueCountFrequency (%)
1 219
29.6%
9 193
26.0%
3 84
 
11.3%
2 60
 
8.1%
38
 
5.1%
0 32
 
4.3%
6 29
 
3.9%
5 24
 
3.2%
4 23
 
3.1%
8 21
 
2.8%
Other values (2) 18
 
2.4%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   741    67.3%
Hangul  360    32.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 219
29.6%
9 193
26.0%
3 84
 
11.3%
2 60
 
8.1%
38
 
5.1%
0 32
 
4.3%
6 29
 
3.9%
5 24
 
3.2%
4 23
 
3.1%
8 21
 
2.8%
Other values (2) 18
 
2.4%
Hangul
ValueCountFrequency (%)
172
47.8%
26
 
7.2%
26
 
7.2%
26
 
7.2%
26
 
7.2%
26
 
7.2%
22
 
6.1%
9
 
2.5%
9
 
2.5%
6
 
1.7%
Other values (5) 12
 
3.3%

재질
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB
지류: 200

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 지류
2nd row: 지류
3rd row: 지류
4th row: 지류
5th row: 지류

Common Values

Value  Count  Frequency (%)
지류   200    100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
지류   200    100.0%

용도_기능
Categorical

HIGH CORRELATION 

Distinct: 16
Distinct (%): 8.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB
엽서: 63
서적: 62
안내서: 24
지도: 22
보고서: 6
Other values (11): 23

Length

Max length: 4
Median length: 2
Mean length: 2.18
Min length: 2

Unique

Unique: 7
Unique (%): 3.5%

Sample

1st row: 엽서
2nd row: 엽서
3rd row: 엽서
4th row: 엽서
5th row: 엽서

Common Values

Value             Count  Frequency (%)
엽서              63     31.5%
서적              62     31.0%
안내서            24     12.0%
지도              22     11.0%
보고서            6      3.0%
문서              5      2.5%
도면              4      2.0%
사진첩            4      2.0%
사진              3      1.5%
회화              1      0.5%
Other values (6)  6      3.0%

Length

Histogram of lengths of the category

입수연유
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB
구입: 200

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 구입
2nd row: 구입
3rd row: 구입
4th row: 구입
5th row: 구입

Common Values

Value  Count  Frequency (%)
구입   200    100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
구입   200    100.0%

Correlations

             연번     유물번호  주수량(점)  시기    용도_기능
연번         1.000    1.000     0.134       0.742   0.713
유물번호     1.000    1.000     0.134       0.742   0.713
주수량(점)   0.134    0.134     1.000       0.841   0.276
시기         0.742    0.742     0.841       1.000   0.848
용도_기능    0.713    0.713     0.276       0.848   1.000

             용도_기능  국적
용도_기능    1.000      1.000
국적         1.000      1.000

             연번     유물번호  주수량(점)  국적    용도_기능
연번         1.000    1.000     -0.291      1.000   0.352
유물번호     1.000    1.000     -0.291      1.000   0.352
주수량(점)   -0.291   -0.291    1.000       1.000   0.116
국적         1.000    1.000     1.000       1.000   1.000
용도_기능    0.352    0.352     0.116       1.000   1.000
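
The 1.000 entries between 연번 and 유물번호 are what any rank-based coefficient gives for two columns that increase together. The report text does not name the method behind each matrix, so the sketch below is only an illustration, assuming a Spearman rank correlation for numeric pairs; the helper names `ranks`, `pearson` and `spearman` are hypothetical.

```python
import math

def ranks(xs):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    result = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with xs[order[i]].
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def spearman(a, b):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))

# Assumption: both ID columns hold the integers 1..200 in the same order.
serial = list(range(1, 201))
artifact_no = list(range(1, 201))
rho = spearman(serial, artifact_no)
print(rho)
```

Since the two ID columns carry the same information, dropping one of them before modelling would lose nothing.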

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

     연번  유물번호  유물명                            주수량(점)  국적  시기        재질  용도_기능  입수연유
0    1     1         조선박람회 포스터 3종과 연하장    4           한국  1911년     지류  엽서       구입
1    2     2         조선박람회장 컬러 엽서            10          한국  1929년     지류  엽서       구입
2    3     3         익옥수리조합엽서                  3           한국  1930년 경  지류  엽서       구입
3    4     4         동진수리조합 사진엽서             1           한국  1929년     지류  엽서       구입
4    5     5         한강채집 사진엽서                 1           한국  1930년 대  지류  엽서       구입
5    6     6         대천해수욕장 사진엽서             1           한국  1920년 대  지류  엽서       구입
6    7     7         평양보통문 사진엽서               1           한국  1929년     지류  엽서       구입
7    8     8         대동강 부녀자 세탁 사진엽서       1           한국  1920년 대  지류  엽서       구입
8    9     9         부산항 잔교                       1           한국  1920년 대  지류  회화       구입
9    10    10        조선 노동자의 식사                1           한국  1920년 대  지류  청자       구입

Last rows

     연번  유물번호  유물명                            주수량(점)  국적  시기        재질  용도_기능  입수연유
190  191   191       조석표 상권(1934년)               1           한국  1934년     지류  서적       구입
191  192   192       척무요람                          1           한국  1932년     지류  서적       구입
192  193   193       해양조사요보 5호 6호              1           한국  1933년     지류  서적       구입
193  194   194       해양조사보고 1호 2호              2           한국  1926년     지류  서적       구입
194  195   195       부세일반 청진부                   1           한국  1937년     지류  안내서     구입
195  196   196       사가현 간척사부도                 1           한국  1938년     지류  문서       구입
196  197   197       간석지기본조사보고                1           한국  1961년     지류  보고서     구입
197  198   198       금난간척지구기술조사보고          1           한국  1963년     지류  보고서     구입
198  199   199       서남해안간척자원조사종합보고서    1           한국  1995년     지류  보고서     구입
199  200   200       수문보고서(평택금강지구)          1           한국  1971년     지류  보고서     구입