Overview

Dataset statistics

Number of variables: 4
Number of observations: 1166
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 3
Duplicate rows (%): 0.3%
Total size in memory: 37.7 KiB
Average record size in memory: 33.1 B
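The two derived figures in this table follow from the raw counts. The sketch below is a cross-check only, assuming that 1 KiB is 1024 bytes and that both values are rounded to one decimal place; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_rows = 1166
duplicate_rows = 3
total_size_kib = 37.7

# Share of duplicate rows, in percent.
duplicate_pct = 100 * duplicate_rows / n_rows

# Average record size in bytes (assumes 1 KiB = 1024 B).
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```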

Variable types

Numeric: 1
Text: 3

Dataset

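Description: The Korea Mine Rehabilitation and Mineral Resources Corporation (한국광해광업공단) provides a list of the overseas resource development survey projects it has carried out since 1978, organized by year, country, mineral type, and mine name.
URL: https://www.data.go.kr/data/15025211/fileData.do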

Alerts

Dataset has 3 (0.3%) duplicate rows [Duplicates]

Reproduction

Analysis started: 2023-12-12 04:52:33.432182
Analysis finished: 2023-12-12 04:52:34.386715
Duration: 0.95 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
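The reported duration can be recovered from the two timestamps. This is a small sketch using only the standard library; the format string is an assumption about how the timestamps are laid out, based on how they appear here.

```python
from datetime import datetime

# Timestamp layout assumed from the values shown in the Reproduction section.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-12 04:52:33.432182", FMT)
finished = datetime.strptime("2023-12-12 04:52:34.386715", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```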

Variables

년도 (year)
Real number (ℝ)

Distinct: 45
Distinct (%): 3.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2001.7856
Minimum: 1978
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.4 KiB

Quantile statistics

Minimum: 1978
5-th percentile: 1978
Q1: 1994
median: 2005
Q3: 2011
95-th percentile: 2019
Maximum: 2022
Range: 44
Interquartile range (IQR): 17
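Range and IQR are simple differences of the quantiles listed here. The sketch below recomputes them and, as an extra illustration that the report itself does not perform, derives the conventional 1.5 x IQR fences to show that no year falls outside them.

```python
# Quantile values copied from the "Quantile statistics" table.
minimum, q1, median, q3, maximum = 1978, 1994, 2005, 2011, 2022

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range

# Conventional 1.5 * IQR fences (an illustration, not part of the report).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
has_outliers = minimum < lower_fence or maximum > upper_fence

print(value_range, iqr, lower_fence, upper_fence, has_outliers)
```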

Descriptive statistics

Standard deviation: 12.335872
Coefficient of variation (CV): 0.0061624341
Kurtosis: -0.74158811
Mean: 2001.7856
Median Absolute Deviation (MAD): 8
Skewness: -0.57391927
Sum: 2334082
Variance: 152.17373
Monotonicity: Increasing
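Several of these statistics are tied together by simple identities: the mean is the sum divided by the number of observations, the variance is the squared standard deviation, and the coefficient of variation is the standard deviation divided by the mean. The sketch below checks them against the reported values; the observation count of 1166 comes from the dataset statistics, and the tolerances only allow for the rounding in the printed figures.

```python
# Values copied from the report.
n = 1166                 # number of observations
total = 2334082          # Sum
std = 12.335872          # Standard deviation

mean = total / n         # reported as 2001.7856
variance = std ** 2      # reported as 152.17373
cv = std / mean          # reported as 0.0061624341

print(f"mean={mean:.4f} variance={variance:.4f} cv={cv:.8f}")
```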
Histogram with fixed size bins (bins=45)
Value               Count   Frequency (%)
2011                70      6.0%
1978                65      5.6%
2012                61      5.2%
2010                59      5.1%
2009                57      4.9%
2013                46      3.9%
2007                46      3.9%
2008                43      3.7%
1979                36      3.1%
2006                32      2.7%
Other values (35)   651     55.8%

Value               Count   Frequency (%)
1978                65      5.6%
1979                36      3.1%
1980                25      2.1%
1981                22      1.9%
1982                13      1.1%
1983                6       0.5%
1984                5       0.4%
1985                9       0.8%
1986                11      0.9%
1987                9       0.8%

Value               Count   Frequency (%)
2022                11      0.9%
2021                23      2.0%
2020                14      1.2%
2019                14      1.2%
2018                15      1.3%
2017                14      1.2%
2016                15      1.3%
2015                23      2.0%
2014                30      2.6%
2013                46      3.9%
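A value/count/frequency table of this shape, with the ten most common values followed by an "Other values" row, can be built with a plain counter. This is a minimal sketch and not the profiling library's own code; `frequency_table` and the toy input are hypothetical.

```python
from collections import Counter

def frequency_table(values, top=10):
    """Return (value, count, percent) rows for the `top` most common values,
    plus one aggregated "Other values (k)" row when anything is left over."""
    counts = Counter(values)
    n = len(values)
    common = counts.most_common(top)
    rows = [(value, count, 100 * count / n) for value, count in common]
    remaining_count = n - sum(count for _, count in common)
    remaining_distinct = len(counts) - len(common)
    if remaining_distinct > 0:
        label = f"Other values ({remaining_distinct})"
        rows.append((label, remaining_count, 100 * remaining_count / n))
    return rows

# Toy input, not the real column.
for value, count, pct in frequency_table([2011, 2011, 2011, 1978, 1978, 2012], top=2):
    print(f"{value!s:<20}{count:<8}{pct:.1f}%")
```

With the real column, the same arithmetic gives 70 / 1166 = 6.0% for 2011 and (1166 - 515) / 1166 = 55.8% for the remaining 35 values.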

국가명 (country name)
Text

Distinct: 79
Distinct (%): 6.8%
Missing: 0
Missing (%): 0.0%
Memory size: 9.2 KiB

Length

Max length: 8
Median length: 7
Mean length: 3.3567753
Min length: 2

Characters and Unicode

Total characters: 3914
Distinct characters: 114
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
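The mean length reported for each text variable is its total character count divided by the 1166 observations. The sketch below checks this for all three text variables; assigning each total to a column name is an assumption based on matching each section's sample rows against the columns of the Sample table.

```python
n_rows = 1166

# "Total characters" per text variable, copied from the report.
# The column-name assignment is inferred from the sample rows (assumption).
total_characters = {"국가명": 3914, "광산이름": 4466, "광종": 2506}

# Mean length = total characters / number of observations.
mean_length = {name: chars / n_rows for name, chars in total_characters.items()}

for name, value in mean_length.items():
    print(f"{name}: {value:.7f}")
```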

Unique

Unique: 20
Unique (%): 1.7%

Sample

1st row: 페루 (Peru)
2nd row: 볼리비아 (Bolivia)
3rd row: 볼리비아 (Bolivia)
4th row: 볼리비아 (Bolivia)
5th row: 볼리비아 (Bolivia)
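Value                       Count   Frequency (%)
인도네시아 (Indonesia)      196     16.8%
호주 (Australia)            124     10.6%
몽골 (Mongolia)             93      8.0%
중국 (China)                84      7.2%
필리핀 (Philippines)        71      6.1%
캐나다 (Canada)             54      4.6%
페루 (Peru)                 48      4.1%
미국 (United States)        41      3.5%
볼리비아 (Bolivia)          39      3.3%
카자흐스탄 (Kazakhstan)     34      2.9%
Other values (70)           383     32.8%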

Most occurring characters

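Value               Count   Frequency (%)
[not preserved]     372     9.5%
[not preserved]     244     6.2%
[not preserved]     205     5.2%
[not preserved]     202     5.2%
[not preserved]     201     5.1%
[not preserved]     147     3.8%
[not preserved]     137     3.5%
[not preserved]     131     3.3%
[not preserved]     124     3.2%
[not preserved]     109     2.8%
Other values (104)  2042    52.2%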

Most occurring categories

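Value               Count   Frequency (%)
Other Letter        3913    > 99.9%
Space Separator     1       < 0.1%

Most frequent character per category

Other Letter
Value               Count   Frequency (%)
[not preserved]     372     9.5%
[not preserved]     244     6.2%
[not preserved]     205     5.2%
[not preserved]     202     5.2%
[not preserved]     201     5.1%
[not preserved]     147     3.8%
[not preserved]     137     3.5%
[not preserved]     131     3.3%
[not preserved]     124     3.2%
[not preserved]     109     2.8%
Other values (103)  2041    52.2%

Space Separator
Value               Count   Frequency (%)
(space)             1       100.0%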

Most occurring scripts

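Value               Count   Frequency (%)
Hangul              3913    > 99.9%
Common              1       < 0.1%

Most frequent character per script

Hangul
Value               Count   Frequency (%)
[not preserved]     372     9.5%
[not preserved]     244     6.2%
[not preserved]     205     5.2%
[not preserved]     202     5.2%
[not preserved]     201     5.1%
[not preserved]     147     3.8%
[not preserved]     137     3.5%
[not preserved]     131     3.3%
[not preserved]     124     3.2%
[not preserved]     109     2.8%
Other values (103)  2041    52.2%

Common
Value               Count   Frequency (%)
(space)             1       100.0%

Most occurring blocks

Value               Count   Frequency (%)
Hangul              3913    > 99.9%
ASCII               1       < 0.1%

Most frequent character per block

Hangul
Value               Count   Frequency (%)
[not preserved]     372     9.5%
[not preserved]     244     6.2%
[not preserved]     205     5.2%
[not preserved]     202     5.2%
[not preserved]     201     5.1%
[not preserved]     147     3.8%
[not preserved]     137     3.5%
[not preserved]     131     3.3%
[not preserved]     124     3.2%
[not preserved]     109     2.8%
Other values (103)  2041    52.2%

ASCII
Value               Count   Frequency (%)
(space)             1       100.0%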

광산이름 (mine name)
Text

Distinct: 1124
Distinct (%): 96.4%
Missing: 0
Missing (%): 0.0%
Memory size: 9.2 KiB

Length

Max length: 17
Median length: 15
Mean length: 3.8301887
Min length: 1

Characters and Unicode

Total characters: 4466
Distinct characters: 600
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 4

Unique

Unique: 1085
Unique (%): 93.1%

Sample

1st row: 토로모쵸
2nd row: 야루이꼬야
3rd row: 출츄카니
4th row: 빠드꼬요
5th row: 에스페란쟈
Value                 Count   Frequency (%)
탐보                  3       0.3%
잠발레스              3       0.3%
tbs                   3       0.3%
kwb                   3       0.3%
카라토르              2       0.2%
심바이                2       0.2%
사라                  2       0.2%
지캄보                2       0.2%
존시트톨고이          2       0.2%
오프레이              2       0.2%
Other values (1129)   1160    98.0%

Most occurring characters

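Value               Count   Frequency (%)
[not preserved]     174     3.9%
[not preserved]     141     3.2%
[not preserved]     128     2.9%
[not preserved]     100     2.2%
[not preserved]     91      2.0%
[not preserved]     78      1.7%
[not preserved]     77      1.7%
[not preserved]     75      1.7%
[not preserved]     70      1.6%
[not preserved]     67      1.5%
Other values (590)  3465    77.6%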

Most occurring categories

Value               Count   Frequency (%)
Other Letter        4040    90.5%
Uppercase Letter    259     5.8%
Lowercase Letter    88      2.0%
Space Separator     19      0.4%
Open Punctuation    16      0.4%
Close Punctuation   16      0.4%
Decimal Number      13      0.3%
Dash Punctuation    12      0.3%
Other Punctuation   2       < 0.1%
Letter Number       1       < 0.1%
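The category names in this table are Unicode general categories. As a rough illustration of how such a tally can be produced, the sketch below uses Python's standard `unicodedata` module; it is not the profiling library's own implementation, the name mapping covers only the ten categories listed here, and the last sample string is made up for the example.

```python
import unicodedata
from collections import Counter

# Long names as printed in the report, keyed by general-category code.
CATEGORY_NAMES = {
    "Lo": "Other Letter", "Lu": "Uppercase Letter", "Ll": "Lowercase Letter",
    "Zs": "Space Separator", "Ps": "Open Punctuation", "Pe": "Close Punctuation",
    "Nd": "Decimal Number", "Pd": "Dash Punctuation", "Po": "Other Punctuation",
    "Nl": "Letter Number",
}

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# Three values from the report plus one invented string ("A-1 (b)").
sample = ["토로모쵸", "TBS", "kwb", "A-1 (b)"]
counts = category_counts(sample)
for name, count in counts.most_common():
    print(f"{name:<20}{count}")
```

Hangul syllables fall under Other Letter, which is why that category dominates every text variable in this dataset.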

Most frequent character per category

Other Letter
Value               Count   Frequency (%)
[not preserved]     174     4.3%
[not preserved]     141     3.5%
[not preserved]     128     3.2%
[not preserved]     100     2.5%
[not preserved]     91      2.3%
[not preserved]     78      1.9%
[not preserved]     77      1.9%
[not preserved]     75      1.9%
[not preserved]     70      1.7%
[not preserved]     67      1.7%
Other values (535)  3039    75.2%

Uppercase Letter
Value               Count   Frequency (%)
B                   32      12.4%
S                   29      11.2%
M                   26      10.0%
P                   19      7.3%
K                   19      7.3%
C                   17      6.6%
T                   15      5.8%
R                   14      5.4%
A                   11      4.2%
I                   10      3.9%
Other values (15)   67      25.9%

Lowercase Letter
Value               Count   Frequency (%)
a                   16      18.2%
e                   11      12.5%
r                   7       8.0%
n                   5       5.7%
u                   5       5.7%
o                   5       5.7%
m                   5       5.7%
i                   4       4.5%
t                   4       4.5%
s                   4       4.5%
Other values (11)   22      25.0%

Decimal Number
Value               Count   Frequency (%)
1                   7       53.8%
2                   5       38.5%
3                   1       7.7%

Space Separator
Value               Count   Frequency (%)
(space)             19      100.0%

Open Punctuation
Value               Count   Frequency (%)
(                   16      100.0%

Close Punctuation
Value               Count   Frequency (%)
)                   16      100.0%

Dash Punctuation
Value               Count   Frequency (%)
-                   12      100.0%

Other Punctuation
Value               Count   Frequency (%)
/                   2       100.0%

Letter Number
Value               Count   Frequency (%)
[not preserved]     1       100.0%

Most occurring scripts

Value               Count   Frequency (%)
Hangul              4038    90.4%
Latin               348     7.8%
Common              78      1.7%
Han                 2       < 0.1%

Most frequent character per script

Hangul
Value               Count   Frequency (%)
[not preserved]     174     4.3%
[not preserved]     141     3.5%
[not preserved]     128     3.2%
[not preserved]     100     2.5%
[not preserved]     91      2.3%
[not preserved]     78      1.9%
[not preserved]     77      1.9%
[not preserved]     75      1.9%
[not preserved]     70      1.7%
[not preserved]     67      1.7%
Other values (534)  3037    75.2%

Latin
Value               Count   Frequency (%)
B                   32      9.2%
S                   29      8.3%
M                   26      7.5%
P                   19      5.5%
K                   19      5.5%
C                   17      4.9%
a                   16      4.6%
T                   15      4.3%
R                   14      4.0%
A                   11      3.2%
Other values (37)   150     43.1%

Common
Value               Count   Frequency (%)
(space)             19      24.4%
(                   16      20.5%
)                   16      20.5%
-                   12      15.4%
1                   7       9.0%
2                   5       6.4%
/                   2       2.6%
3                   1       1.3%

Han
Value               Count   Frequency (%)
[not preserved]     2       100.0%

Most occurring blocks

Value               Count   Frequency (%)
Hangul              4038    90.4%
ASCII               425     9.5%
CJK                 2       < 0.1%
Number Forms        1       < 0.1%

Most frequent character per block

Hangul
Value               Count   Frequency (%)
[not preserved]     174     4.3%
[not preserved]     141     3.5%
[not preserved]     128     3.2%
[not preserved]     100     2.5%
[not preserved]     91      2.3%
[not preserved]     78      1.9%
[not preserved]     77      1.9%
[not preserved]     75      1.9%
[not preserved]     70      1.7%
[not preserved]     67      1.7%
Other values (534)  3037    75.2%

ASCII
Value               Count   Frequency (%)
B                   32      7.5%
S                   29      6.8%
M                   26      6.1%
P                   19      4.5%
K                   19      4.5%
(space)             19      4.5%
C                   17      4.0%
a                   16      3.8%
(                   16      3.8%
)                   16      3.8%
Other values (44)   216     50.8%

CJK
Value               Count   Frequency (%)
[not preserved]     2       100.0%

Number Forms
Value               Count   Frequency (%)
[not preserved]     1       100.0%

광종 (mineral type)
Text

Distinct: 65
Distinct (%): 5.6%
Missing: 0
Missing (%): 0.0%
Memory size: 9.2 KiB

Length

Max length: 9
Median length: 8
Mean length: 2.1492281
Min length: 1

Characters and Unicode

Total characters: 2506
Distinct characters: 84
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 26
Unique (%): 2.2%

Sample

1st row: [not preserved]
2nd row: 우라늄 (uranium)
3rd row: 우라늄 (uranium)
4th row: 우라늄 (uranium)
5th row: 우라늄 (uranium)
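Value                       Count   Frequency (%)
유연탄 (bituminous coal)    311     26.5%
[not preserved]             274     23.3%
[not preserved]             107     9.1%
우라늄 (uranium)            63      5.4%
아연 (zinc)                 50      4.3%
[not preserved]             45      3.8%
무연탄 (anthracite)         28      2.4%
니켈 (nickel)               28      2.4%
[not preserved]             22      1.9%
주석 (tin)                  21      1.8%
Other values (51)           225     19.2%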

Most occurring characters

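Value               Count   Frequency (%)
[not preserved]     421     16.8%
[not preserved]     356     14.2%
[not preserved]     315     12.6%
[not preserved]     279     11.1%
[not preserved]     126     5.0%
[not preserved]     73      2.9%
[not preserved]     63      2.5%
[not preserved]     63      2.5%
[not preserved]     62      2.5%
[not preserved]     53      2.1%
Other values (74)   695     27.7%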

Most occurring categories

Value               Count   Frequency (%)
Other Letter        2477    98.8%
Other Punctuation   15      0.6%
Space Separator     8       0.3%
Open Punctuation    3       0.1%
Close Punctuation   3       0.1%

Most frequent character per category

Other Letter
Value               Count   Frequency (%)
[not preserved]     421     17.0%
[not preserved]     356     14.4%
[not preserved]     315     12.7%
[not preserved]     279     11.3%
[not preserved]     126     5.1%
[not preserved]     73      2.9%
[not preserved]     63      2.5%
[not preserved]     63      2.5%
[not preserved]     62      2.5%
[not preserved]     53      2.1%
Other values (70)   666     26.9%

Other Punctuation
Value               Count   Frequency (%)
,                   15      100.0%

Space Separator
Value               Count   Frequency (%)
(space)             8       100.0%

Open Punctuation
Value               Count   Frequency (%)
(                   3       100.0%

Close Punctuation
Value               Count   Frequency (%)
)                   3       100.0%

Most occurring scripts

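Value               Count   Frequency (%)
Hangul              2477    98.8%
Common              29      1.2%

Most frequent character per script

Hangul
Value               Count   Frequency (%)
[not preserved]     421     17.0%
[not preserved]     356     14.4%
[not preserved]     315     12.7%
[not preserved]     279     11.3%
[not preserved]     126     5.1%
[not preserved]     73      2.9%
[not preserved]     63      2.5%
[not preserved]     63      2.5%
[not preserved]     62      2.5%
[not preserved]     53      2.1%
Other values (70)   666     26.9%

Common
Value               Count   Frequency (%)
,                   15      51.7%
(space)             8       27.6%
(                   3       10.3%
)                   3       10.3%

Most occurring blocks

Value               Count   Frequency (%)
Hangul              2477    98.8%
ASCII               29      1.2%

Most frequent character per block

Hangul
Value               Count   Frequency (%)
[not preserved]     421     17.0%
[not preserved]     356     14.4%
[not preserved]     315     12.7%
[not preserved]     279     11.3%
[not preserved]     126     5.1%
[not preserved]     73      2.9%
[not preserved]     63      2.5%
[not preserved]     63      2.5%
[not preserved]     62      2.5%
[not preserved]     53      2.1%
Other values (70)   666     26.9%

ASCII
Value               Count   Frequency (%)
,                   15      51.7%
(space)             8       27.6%
(                   3       10.3%
)                   3       10.3%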

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

            년도      국가명    광종
년도        1.000     0.692     0.614
국가명      0.692     1.000     0.829
광종        0.614     0.829     1.000
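The report does not state which coefficient fills this matrix. One association measure commonly used when text or categorical columns are involved is Cramér's V, and the sketch below shows the plain, uncorrected form of it using only the standard library. Treat it as an illustration of that kind of measure under this assumption, not as a reproduction of the values above; `cramers_v` is a hypothetical helper and the inputs are toy data.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long sequences of labels."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed cell counts
    row = Counter(xs)              # marginal counts of xs
    col = Counter(ys)              # marginal counts of ys
    chi2 = 0.0
    for x, row_total in row.items():
        for y, col_total in col.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Toy data: a perfectly associated pair and an independent pair.
perfect = cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"])
independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(perfect, independent)
```

The measure ranges from 0 (no association) to 1 (one column fully determines the other), which matches the scale of the matrix.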

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

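First rows

        년도    국가명        광산이름      광종
0       1978    페루          토로모쵸      [not preserved]
1       1978    볼리비아      야루이꼬야    우라늄
2       1978    볼리비아      출츄카니      우라늄
3       1978    볼리비아      빠드꼬요      우라늄
4       1978    볼리비아      에스페란쟈    우라늄
5       1978    볼리비아      룬라야        우라늄
6       1978    미국          쓰리스타      무연탄
7       1978    페루          띤따야        [not preserved]
8       1978    태국          푸비엥        우라늄
9       1978    콜롬비아      마카레나      우라늄

Last rows

        년도    국가명        광산이름      광종
1156    2022    인도네시아    PUP           니켈
1157    2022    인도네시아    모라모        석회석
1158    2022    몽골          하단하르      몰리브덴
1159    2022    몽골          소곳          [not preserved]
1160    2022    인도네시아    MDK           니켈
1161    2022    인도네시아    TBS           유연탄
1162    2022    몽골          하르골        유연탄
1163    2022    카자흐스탄    쿠르다이      [not preserved]
1164    2022    몽골          운드르차간    텅스텐
1165    2022    몽골          노썬보르츠    [not preserved]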

Duplicate rows

Most frequently occurring

        년도    국가명    광산이름        광종               # duplicates
0       1986    호주      글레니스크릭    유연탄             2
1       2018    몽골      타카라진        [not preserved]    2
2       2022    몽골      노썬보르츠      [not preserved]    2
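A table like this one can be produced by counting identical rows and keeping those that occur more than once. The sketch below is a minimal standard-library version; `duplicate_groups` is a hypothetical helper, and the input mixes one duplicated row from this table with one unduplicated row from the sample above.

```python
from collections import Counter

def duplicate_groups(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(rows)
    return {row: count for row, count in counts.items() if count > 1}

# Rows are (년도, 국가명, 광산이름, 광종) tuples.
rows = [
    (1986, "호주", "글레니스크릭", "유연탄"),
    (1978, "미국", "쓰리스타", "무연탄"),
    (1986, "호주", "글레니스크릭", "유연탄"),
]

groups = duplicate_groups(rows)
for row, count in groups.items():
    print(row, count)
```

Each of the three groups in the table occurs twice, so the dataset carries three redundant rows, which is the 3 (0.3%) reported in the overview.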