Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 2
Missing cells (%): < 0.1%
Duplicate rows: 6
Duplicate rows (%): 0.1%
Total size in memory: 634.8 KiB
Average record size in memory: 65.0 B
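
The rounded percentages and the average record size can be re-derived from the raw counts. The sketch below is only an arithmetic check on the figures listed here; the variable names are illustrative, and it assumes the report uses 1 KiB = 1024 bytes.

```python
# Figures copied from the dataset statistics.
n_variables = 7
n_observations = 10_000
missing_cells = 2
duplicate_rows = 6
total_size_kib = 634.8

# Share of missing cells across the whole table (rows x columns).
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Share of duplicate rows among all observations.
duplicate_pct = 100 * duplicate_rows / n_observations

# Average record size in bytes, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"missing cells: {missing_pct:.4f}%")
print(f"duplicate rows: {duplicate_pct:.2f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

The unrounded shares (about 0.003% and 0.06%) explain why the report shows "< 0.1%" and "0.1%".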

Variable types

Text: 5
Numeric: 1
Categorical: 1

Dataset

Description: Provides mine maps, mining-claim maps, and related information on mineral resources (KMRGIS). The site offers services such as address search for a mining claim, report search, mining cadastre search, drilling information search, and North Korean mine search.
Link: https://www.kmrgis.net/KMRGIS/MRInfomationQuery/MRIQ0401.aspx?menuID=MRIQ04#
URL: https://www.data.go.kr/data/3074447/fileData.do

Alerts

Dataset has 6 (0.1%) duplicate rows (alert type: Duplicates)

Reproduction

Analysis started: 2023-12-12 14:35:09.511505
Analysis finished: 2023-12-12 14:35:11.696240
Duration: 2.18 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

광산명
Text

Distinct: 4381
Distinct (%): 43.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 26
Mean length: 9.2661
Min length: 2

Characters and Unicode

Total characters: 92661
Distinct characters: 394
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
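
As a rough illustration of how such category counts arise, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module over the five sample rows listed for this variable. It is not the profiler's own implementation, and `category_counts` is a hypothetical helper name. The two-letter codes map to the report's labels as follows: `Lo` is Other Letter, `Ps` Open Punctuation, `Pe` Close Punctuation, `Po` Other Punctuation, `Zs` Space Separator.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample rows listed for this variable.
samples = [
    "송경(구 송학)(석회석)",
    "북삼(고령토)광산",
    "광진(석회석)광산",
    "신덕(금,은)광산",
    "광도(도석)광산",
]
counts = category_counts(samples)
for code, n in counts.most_common():
    print(code, n)
```

The standard library exposes general categories but not scripts or blocks, so the script and block tables would need a different data source.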

Unique

Unique: 2442
Unique (%): 24.4%

Sample

1st row: 송경(구 송학)(석회석)
2nd row: 북삼(고령토)광산
3rd row: 광진(석회석)광산
4th row: 신덕(금,은)광산
5th row: 광도(도석)광산

Value | Count | Frequency (%)
신예미(철)광산 | 38 | 0.4%
양양(철)광산 | 37 | 0.4%
연화(연,아연)광산 | 34 | 0.3%
대성단양(석회석)광산 | 32 | 0.3%
무극(금,은)광산 | 32 | 0.3%
대성동해(석회석)광산 | 31 | 0.3%
노화(납석)광산 | 30 | 0.3%
대성제천(석회석)광산 | 25 | 0.2%
마로(석탄)광산 | 23 | 0.2%
태영삼도(석회석)광산 | 23 | 0.2%
Other values (4373) | 9697 | 97.0%

Most occurring characters

ValueCountFrequency (%)
( 10090
 
10.9%
) 10085
 
10.9%
9791
 
10.6%
9295
 
10.0%
7750
 
8.4%
, 3212
 
3.5%
2922
 
3.2%
2250
 
2.4%
1795
 
1.9%
1589
 
1.7%
Other values (384) 33882
36.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 68105
73.5%
Open Punctuation 10090
 
10.9%
Close Punctuation 10085
 
10.9%
Other Punctuation 3404
 
3.7%
Decimal Number 859
 
0.9%
Dash Punctuation 74
 
0.1%
Uppercase Letter 40
 
< 0.1%
Letter Number 2
 
< 0.1%
Space Separator 2
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
9791
 
14.4%
9295
 
13.6%
7750
 
11.4%
2922
 
4.3%
2250
 
3.3%
1795
 
2.6%
1589
 
2.3%
1299
 
1.9%
1161
 
1.7%
921
 
1.4%
Other values (352) 29332
43.1%
Uppercase Letter
ValueCountFrequency (%)
C 13
32.5%
I 7
17.5%
M 3
 
7.5%
W 3
 
7.5%
Y 3
 
7.5%
K 3
 
7.5%
N 2
 
5.0%
E 1
 
2.5%
T 1
 
2.5%
R 1
 
2.5%
Other values (3) 3
 
7.5%
Decimal Number
ValueCountFrequency (%)
2 187
21.8%
1 165
19.2%
4 86
10.0%
3 78
9.1%
5 72
 
8.4%
9 65
 
7.6%
7 56
 
6.5%
0 53
 
6.2%
6 51
 
5.9%
8 46
 
5.4%
Other Punctuation
ValueCountFrequency (%)
, 3212
94.4%
. 185
 
5.4%
· 6
 
0.2%
; 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 10090
100.0%
Close Punctuation
ValueCountFrequency (%)
) 10085
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 74
100.0%
Letter Number
ValueCountFrequency (%)
2
100.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 68105
73.5%
Common 24514
 
26.5%
Latin 42
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
9791
 
14.4%
9295
 
13.6%
7750
 
11.4%
2922
 
4.3%
2250
 
3.3%
1795
 
2.6%
1589
 
2.3%
1299
 
1.9%
1161
 
1.7%
921
 
1.4%
Other values (352) 29332
43.1%
Common
ValueCountFrequency (%)
( 10090
41.2%
) 10085
41.1%
, 3212
 
13.1%
2 187
 
0.8%
. 185
 
0.8%
1 165
 
0.7%
4 86
 
0.4%
3 78
 
0.3%
- 74
 
0.3%
5 72
 
0.3%
Other values (8) 280
 
1.1%
Latin
ValueCountFrequency (%)
C 13
31.0%
I 7
16.7%
M 3
 
7.1%
W 3
 
7.1%
Y 3
 
7.1%
K 3
 
7.1%
N 2
 
4.8%
2
 
4.8%
E 1
 
2.4%
T 1
 
2.4%
Other values (4) 4
 
9.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 68105
73.5%
ASCII 24548
 
26.5%
None 6
 
< 0.1%
Number Forms 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
( 10090
41.1%
) 10085
41.1%
, 3212
 
13.1%
2 187
 
0.8%
. 185
 
0.8%
1 165
 
0.7%
4 86
 
0.4%
3 78
 
0.3%
- 74
 
0.3%
5 72
 
0.3%
Other values (20) 314
 
1.3%
Hangul
ValueCountFrequency (%)
9791
 
14.4%
9295
 
13.6%
7750
 
11.4%
2922
 
4.3%
2250
 
3.3%
1795
 
2.6%
1589
 
2.3%
1299
 
1.9%
1161
 
1.7%
921
 
1.4%
Other values (352) 29332
43.1%
None
ValueCountFrequency (%)
· 6
100.0%
Number Forms
ValueCountFrequency (%)
2
100.0%

보고서번호
Text

Distinct: 6101
Distinct (%): 61.0%
Missing: 2
Missing (%): < 0.1%
Memory size: 156.2 KiB

Length

Max length: 7
Median length: 7
Mean length: 6.9817964
Min length: 4

Characters and Unicode

Total characters: 69804
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 3570
Unique (%): 35.7%

Sample

1st row: 19시-15
2nd row: 94의-053
3rd row: 03매-001
4th row: 86기-047
5th row: 82의-009

Value | Count | Frequency (%)
18시-00 | 40 | 0.4%
88시-014 | 8 | 0.1%
88시-010 | 8 | 0.1%
87시-007 | 6 | 0.1%
84의-024 | 6 | 0.1%
93의-005 | 6 | 0.1%
88매-023 | 6 | 0.1%
88시-038 | 6 | 0.1%
90시-017 | 6 | 0.1%
87시-045 | 6 | 0.1%
Other values (6091) | 9900 | 99.0%

Most occurring characters

ValueCountFrequency (%)
0 15553
22.3%
- 9998
14.3%
8 6034
 
8.6%
1 5173
 
7.4%
9 4940
 
7.1%
2 3740
 
5.4%
7 3608
 
5.2%
3036
 
4.3%
3 2956
 
4.2%
4 2757
 
3.9%
Other values (8) 12009
17.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 49808
71.4%
Dash Punctuation 9998
 
14.3%
Other Letter 9998
 
14.3%
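
With 9998 non-missing values, the counts of 9998 dashes and 9998 Other Letter characters are consistent with each value containing exactly one Hangul syllable and one hyphen, as in the sample rows (for example `94의-053`). The sketch below parses that shape with a regular expression. The pattern is inferred from the samples only and is an assumption, not a documented format; `parse_report_number` is a hypothetical helper.

```python
import re

# Assumed pattern, inferred from the sample rows: two digits,
# one Hangul syllable, a hyphen, then a numeric serial.
REPORT_NO = re.compile(r"^(\d{2})([가-힣])-(\d+)$")


def parse_report_number(text):
    """Split a report number into (two-digit prefix, type code, serial).

    Returns None when the text does not fit the assumed pattern.
    """
    match = REPORT_NO.match(text)
    if match is None:
        return None
    prefix, code, serial = match.groups()
    return prefix, code, int(serial)


print(parse_report_number("94의-053"))
print(parse_report_number("19시-15"))
print(parse_report_number("not a report number"))
```

In the sample rows the two-digit prefix matches the last two digits of 보고서작성연도, but the profile alone does not confirm that this holds for every row.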

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 15553
31.2%
8 6034
 
12.1%
1 5173
 
10.4%
9 4940
 
9.9%
2 3740
 
7.5%
7 3608
 
7.2%
3 2956
 
5.9%
4 2757
 
5.5%
5 2623
 
5.3%
6 2424
 
4.9%
Other Letter
ValueCountFrequency (%)
3036
30.4%
2450
24.5%
2402
24.0%
1087
 
10.9%
526
 
5.3%
324
 
3.2%
173
 
1.7%
Dash Punctuation
ValueCountFrequency (%)
- 9998
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 59806
85.7%
Hangul 9998
 
14.3%

Most frequent character per script

Common
ValueCountFrequency (%)
0 15553
26.0%
- 9998
16.7%
8 6034
 
10.1%
1 5173
 
8.6%
9 4940
 
8.3%
2 3740
 
6.3%
7 3608
 
6.0%
3 2956
 
4.9%
4 2757
 
4.6%
5 2623
 
4.4%
Hangul
ValueCountFrequency (%)
3036
30.4%
2450
24.5%
2402
24.0%
1087
 
10.9%
526
 
5.3%
324
 
3.2%
173
 
1.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 59806
85.7%
Hangul 9998
 
14.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 15553
26.0%
- 9998
16.7%
8 6034
 
10.1%
1 5173
 
8.6%
9 4940
 
8.3%
2 3740
 
6.3%
7 3608
 
6.0%
3 2956
 
4.9%
4 2757
 
4.6%
5 2623
 
4.4%
Hangul
ValueCountFrequency (%)
3036
30.4%
2450
24.5%
2402
24.0%
1087
 
10.9%
526
 
5.3%
324
 
3.2%
173
 
1.7%

보고서명
Text

Distinct: 5764
Distinct (%): 57.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 34
Median length: 30
Mean length: 17.9234
Min length: 2

Characters and Unicode

Total characters: 179234
Distinct characters: 396
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 4

Unique

Unique: 3634
Unique (%): 36.3%

Sample

1st row: 송경(구 송학)(석회석)광산 시추결과보고서
2nd row: 북삼(고령토)광산 기술조사보고서
3rd row: 광진(석회석)광산 매장량조사보고서
4th row: 신덕(금,은)광산 기본조사보고서
5th row: 광도(도석)광산 기술조사보고서

Value | Count | Frequency (%)
시추결과보고서 | 2914 | 14.5%
기술조사보고서 | 2447 | 12.2%
매장량조사보고서 | 2237 | 11.2%
굴진효과조사보고서 | 1036 | 5.2%
기본조사보고서 | 518 | 2.6%
정밀조사보고서 | 321 | 1.6%
매장량조사보고서(석재 | 168 | 0.8%
물리탐사보고서 | 164 | 0.8%
시추결과보고서(석재 | 104 | 0.5%
신예미(철)광산 | 44 | 0.2%
Other values (4376) | 10090 | 50.3%

Most occurring characters

ValueCountFrequency (%)
( 10445
 
5.8%
) 10443
 
5.8%
10402
 
5.8%
10374
 
5.8%
10159
 
5.7%
10043
 
5.6%
9955
 
5.6%
9467
 
5.3%
8022
 
4.5%
7309
 
4.1%
Other values (386) 82615
46.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 143850
80.3%
Open Punctuation 10445
 
5.8%
Close Punctuation 10443
 
5.8%
Space Separator 10043
 
5.6%
Other Punctuation 3409
 
1.9%
Decimal Number 926
 
0.5%
Dash Punctuation 76
 
< 0.1%
Uppercase Letter 40
 
< 0.1%
Letter Number 2
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
10402
 
7.2%
10374
 
7.2%
10159
 
7.1%
9955
 
6.9%
9467
 
6.6%
8022
 
5.6%
7309
 
5.1%
6879
 
4.8%
4124
 
2.9%
3101
 
2.2%
Other values (353) 64058
44.5%
Uppercase Letter
ValueCountFrequency (%)
C 13
32.5%
I 7
17.5%
M 3
 
7.5%
W 3
 
7.5%
Y 3
 
7.5%
K 3
 
7.5%
N 2
 
5.0%
E 1
 
2.5%
R 1
 
2.5%
T 1
 
2.5%
Other values (3) 3
 
7.5%
Decimal Number
ValueCountFrequency (%)
2 219
23.7%
1 197
21.3%
4 86
 
9.3%
3 78
 
8.4%
5 72
 
7.8%
9 65
 
7.0%
7 56
 
6.0%
0 53
 
5.7%
6 51
 
5.5%
8 49
 
5.3%
Other Punctuation
ValueCountFrequency (%)
, 3206
94.0%
. 185
 
5.4%
· 15
 
0.4%
@ 2
 
0.1%
; 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 10445
100.0%
Close Punctuation
ValueCountFrequency (%)
) 10443
100.0%
Space Separator
ValueCountFrequency (%)
10043
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 76
100.0%
Letter Number
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 143850
80.3%
Common 35342
 
19.7%
Latin 42
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
10402
 
7.2%
10374
 
7.2%
10159
 
7.1%
9955
 
6.9%
9467
 
6.6%
8022
 
5.6%
7309
 
5.1%
6879
 
4.8%
4124
 
2.9%
3101
 
2.2%
Other values (353) 64058
44.5%
Common
ValueCountFrequency (%)
( 10445
29.6%
) 10443
29.5%
10043
28.4%
, 3206
 
9.1%
2 219
 
0.6%
1 197
 
0.6%
. 185
 
0.5%
4 86
 
0.2%
3 78
 
0.2%
- 76
 
0.2%
Other values (9) 364
 
1.0%
Latin
ValueCountFrequency (%)
C 13
31.0%
I 7
16.7%
M 3
 
7.1%
W 3
 
7.1%
Y 3
 
7.1%
K 3
 
7.1%
N 2
 
4.8%
2
 
4.8%
E 1
 
2.4%
R 1
 
2.4%
Other values (4) 4
 
9.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 143850
80.3%
ASCII 35367
 
19.7%
None 15
 
< 0.1%
Number Forms 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
( 10445
29.5%
) 10443
29.5%
10043
28.4%
, 3206
 
9.1%
2 219
 
0.6%
1 197
 
0.6%
. 185
 
0.5%
4 86
 
0.2%
3 78
 
0.2%
- 76
 
0.2%
Other values (21) 389
 
1.1%
Hangul
ValueCountFrequency (%)
10402
 
7.2%
10374
 
7.2%
10159
 
7.1%
9955
 
6.9%
9467
 
6.6%
8022
 
5.6%
7309
 
5.1%
6879
 
4.8%
4124
 
2.9%
3101
 
2.2%
Other values (353) 64058
44.5%
None
ValueCountFrequency (%)
· 15
100.0%
Number Forms
ValueCountFrequency (%)
2
100.0%

보고서작성연도
Real number (ℝ)

Distinct: 53
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1991.1491
Minimum: 1970
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1970
5-th percentile: 1975
Q1: 1985
Median: 1989
Q3: 1996
95-th percentile: 2013
Maximum: 2022
Range: 52
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 10.606956
Coefficient of variation (CV): 0.0053270527
Kurtosis: 0.24172409
Mean: 1991.1491
Median Absolute Deviation (MAD): 5
Skewness: 0.63992608
Sum: 19911491
Variance: 112.50752
Monotonicity: Not monotonic
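
Several of these figures are derived from the others, so they can be cross-checked. The sketch below is only a consistency check on the reported values, using the usual definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared, sum = mean x count); the variable names are illustrative.

```python
# Figures copied from the statistics for this variable.
minimum, maximum = 1970, 2022
q1, q3 = 1985, 1996
mean = 1991.1491
std = 10.606956
n = 10_000

value_range = maximum - minimum   # reported: 52
iqr = q3 - q1                     # reported: 11
cv = std / mean                   # reported: 0.0053270527
variance = std ** 2               # reported: 112.50752
total = mean * n                  # reported: 19911491

print(value_range, iqr)
print(f"{cv:.10f}", f"{variance:.5f}", f"{total:.0f}")
```

The recomputed values agree with the report up to rounding of the printed standard deviation.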
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
1987 | 716 | 7.2%
1988 | 714 | 7.1%
1989 | 597 | 6.0%
1990 | 545 | 5.5%
1986 | 493 | 4.9%
1985 | 484 | 4.8%
1991 | 416 | 4.2%
1992 | 398 | 4.0%
1994 | 309 | 3.1%
1993 | 298 | 3.0%
Other values (43) | 5030 | 50.3%

Minimum 10 values

Value | Count | Frequency (%)
1970 | 62 | 0.6%
1971 | 77 | 0.8%
1972 | 62 | 0.6%
1973 | 78 | 0.8%
1974 | 136 | 1.4%
1975 | 170 | 1.7%
1976 | 228 | 2.3%
1977 | 162 | 1.6%
1978 | 143 | 1.4%
1979 | 158 | 1.6%

Maximum 10 values

Value | Count | Frequency (%)
2022 | 35 | 0.4%
2021 | 26 | 0.3%
2020 | 22 | 0.2%
2019 | 24 | 0.2%
2018 | 56 | 0.6%
2017 | 21 | 0.2%
2016 | 80 | 0.8%
2015 | 47 | 0.5%
2014 | 107 | 1.1%
2013 | 97 | 1.0%

파일명
Text

Distinct: 9980
Distinct (%): 99.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 54
Median length: 52
Mean length: 30.4384
Min length: 8

Characters and Unicode

Total characters: 304384
Distinct characters: 420
Distinct categories: 12
Distinct scripts: 3
Distinct blocks: 4

Unique

Unique: 9960
Unique (%): 99.6%

Sample

1st row: 2019년 송경(구 송학)(석회석)광산 시추결과보고서.pdf
2nd row: 94의-53@북삼(고령토)광산 기술조사보고서.pdf
3rd row: 03매-01@광진(석회석)광산 매장량조사보고서.PDF
4th row: 86기-47@신덕(금,은)광산 기본조사보고서.hwp
5th row: 82의-09@광도(도석)광산 기술 조사보고서.pdf

Value | Count | Frequency (%)
시추결과보고서.pdf | 1353 | 6.0%
조사보고서.pdf | 1183 | 5.3%
시추결과보고서.hwp | 951 | 4.2%
기술조사보고서.hwp | 896 | 4.0%
기술조사보고서.pdf | 819 | 3.6%
조사보고서.hwp | 809 | 3.6%
매장량조사보고서.hwp | 745 | 3.3%
매장량조사보고서.pdf | 701 | 3.1%
기술 | 671 | 3.0%
매장량 | 671 | 3.0%
Other values (8780) | 13665 | 60.8%

Most occurring characters

ValueCountFrequency (%)
12465
 
4.1%
- 10760
 
3.5%
@ 10673
 
3.5%
( 10536
 
3.5%
) 10535
 
3.5%
10303
 
3.4%
10278
 
3.4%
. 10204
 
3.4%
10066
 
3.3%
9977
 
3.3%
Other values (410) 198587
65.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 161640
53.1%
Decimal Number 43256
 
14.2%
Lowercase Letter 29243
 
9.6%
Other Punctuation 25043
 
8.2%
Space Separator 12465
 
4.1%
Dash Punctuation 10760
 
3.5%
Open Punctuation 10536
 
3.5%
Close Punctuation 10535
 
3.5%
Uppercase Letter 861
 
0.3%
Connector Punctuation 31
 
< 0.1%
Other values (2) 14
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
10303
 
6.4%
10278
 
6.4%
10066
 
6.2%
9977
 
6.2%
9925
 
6.1%
8034
 
5.0%
7743
 
4.8%
7294
 
4.5%
6743
 
4.2%
4820
 
3.0%
Other values (363) 76457
47.3%
Uppercase Letter
ValueCountFrequency (%)
P 272
31.6%
W 142
16.5%
H 139
16.1%
D 133
15.4%
F 133
15.4%
C 13
 
1.5%
I 9
 
1.0%
M 4
 
0.5%
Y 3
 
0.3%
K 3
 
0.3%
Other values (8) 10
 
1.2%
Decimal Number
ValueCountFrequency (%)
0 6963
16.1%
1 6274
14.5%
8 6101
14.1%
9 5027
11.6%
2 4124
9.5%
7 3684
8.5%
3 3031
7.0%
4 2847
6.6%
5 2720
 
6.3%
6 2485
 
5.7%
Lowercase Letter
ValueCountFrequency (%)
p 9746
33.3%
d 5331
18.2%
f 5330
18.2%
w 4417
15.1%
h 4416
15.1%
x 2
 
< 0.1%
g 1
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
@ 10673
42.6%
. 10204
40.7%
, 4157
 
16.6%
· 8
 
< 0.1%
; 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
12465
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 10760
100.0%
Open Punctuation
ValueCountFrequency (%)
( 10536
100.0%
Close Punctuation
ValueCountFrequency (%)
) 10535
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 31
100.0%
Math Symbol
ValueCountFrequency (%)
~ 12
100.0%
Letter Number
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 161640
53.1%
Common 112638
37.0%
Latin 30106
 
9.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
10303
 
6.4%
10278
 
6.4%
10066
 
6.2%
9977
 
6.2%
9925
 
6.1%
8034
 
5.0%
7743
 
4.8%
7294
 
4.5%
6743
 
4.2%
4820
 
3.0%
Other values (363) 76457
47.3%
Latin
ValueCountFrequency (%)
p 9746
32.4%
d 5331
17.7%
f 5330
17.7%
w 4417
14.7%
h 4416
14.7%
P 272
 
0.9%
W 142
 
0.5%
H 139
 
0.5%
D 133
 
0.4%
F 133
 
0.4%
Other values (16) 47
 
0.2%
Common
ValueCountFrequency (%)
12465
11.1%
- 10760
9.6%
@ 10673
9.5%
( 10536
9.4%
) 10535
9.4%
. 10204
9.1%
0 6963
 
6.2%
1 6274
 
5.6%
8 6101
 
5.4%
9 5027
 
4.5%
Other values (11) 23100
20.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 161640
53.1%
ASCII 142734
46.9%
None 8
 
< 0.1%
Number Forms 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
12465
 
8.7%
- 10760
 
7.5%
@ 10673
 
7.5%
( 10536
 
7.4%
) 10535
 
7.4%
. 10204
 
7.1%
p 9746
 
6.8%
0 6963
 
4.9%
1 6274
 
4.4%
8 6101
 
4.3%
Other values (35) 48477
34.0%
Hangul
ValueCountFrequency (%)
10303
 
6.4%
10278
 
6.4%
10066
 
6.2%
9977
 
6.2%
9925
 
6.1%
8034
 
5.0%
7743
 
4.8%
7294
 
4.5%
6743
 
4.2%
4820
 
3.0%
Other values (363) 76457
47.3%
None
ValueCountFrequency (%)
· 8
100.0%
Number Forms
ValueCountFrequency (%)
2
100.0%

파일타입
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

pdf: 5472
hwp: 4526
PDF: 2

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: pdf
2nd row: pdf
3rd row: pdf
4th row: hwp
5th row: pdf

Common Values

Value | Count | Frequency (%)
pdf | 5472 | 54.7%
hwp | 4526 | 45.3%
PDF | 2 | < 0.1%
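
The three distinct values differ only by letter case: `PDF` and `pdf` name the same file type. The sketch below folds the labels to lowercase before counting, which leaves two categories. It is a minimal illustration using the counts from this table, not a description of how the profiler groups values.

```python
from collections import Counter

# Counts copied from the Common Values table.
raw_counts = Counter({"pdf": 5472, "hwp": 4526, "PDF": 2})

# Fold labels to lowercase so "PDF" and "pdf" count as one category.
normalized = Counter()
for label, count in raw_counts.items():
    normalized[label.lower()] += count

print(normalized)
```

The merged total for `pdf` (5474) matches the count listed under Common Values (Plot), which suggests that table groups the two spellings together.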

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
pdf | 5474 | 54.7%
hwp | 4526 | 45.3%

요약정보
Text

Distinct: 6123
Distinct (%): 61.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 43
Median length: 41
Mean length: 17.4973
Min length: 4

Characters and Unicode

Total characters: 174973
Distinct characters: 407
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 4

Unique

Unique: 4277
Unique (%): 42.8%

Sample

1st row: 송경(구 송학)(석회석)광산 시추결과보고서
2nd row: 북삼(고령토)광산 기술조사보고서
3rd row: 광진(석회석)광산 매장량조사보고서
4th row: 신덕(금,은)광산 기본조사보고서
5th row: 광도(도석)광산 기술 조사보고서

Value | Count | Frequency (%)
조사보고서 | 1940 | 9.1%
기술조사보고서 | 1656 | 7.8%
시추결과보고서 | 1610 | 7.6%
매장량조사보고서 | 1424 | 6.7%
시추주상도-01 | 742 | 3.5%
기술 | 660 | 3.1%
매장량 | 647 | 3.0%
굴진효과조사보고서 | 560 | 2.6%
시추 | 313 | 1.5%
탐광갱도굴진효과조사보고서 | 296 | 1.4%
Other values (4701) | 11380 | 53.6%

Most occurring characters

ValueCountFrequency (%)
11228
 
6.4%
( 9654
 
5.5%
) 9471
 
5.4%
9274
 
5.3%
9259
 
5.3%
9055
 
5.2%
8925
 
5.1%
8908
 
5.1%
7415
 
4.2%
7189
 
4.1%
Other values (397) 84595
48.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 137031
78.3%
Space Separator 11228
 
6.4%
Open Punctuation 9654
 
5.5%
Close Punctuation 9471
 
5.4%
Other Punctuation 3686
 
2.1%
Decimal Number 2917
 
1.7%
Dash Punctuation 923
 
0.5%
Uppercase Letter 36
 
< 0.1%
Connector Punctuation 17
 
< 0.1%
Math Symbol 8
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
9274
 
6.8%
9259
 
6.8%
9055
 
6.6%
8925
 
6.5%
8908
 
6.5%
7415
 
5.4%
7189
 
5.2%
7038
 
5.1%
3124
 
2.3%
3069
 
2.2%
Other values (363) 63775
46.5%
Uppercase Letter
ValueCountFrequency (%)
I 9
25.0%
C 8
22.2%
M 4
11.1%
Y 3
 
8.3%
K 2
 
5.6%
A 2
 
5.6%
N 2
 
5.6%
R 1
 
2.8%
E 1
 
2.8%
L 1
 
2.8%
Other values (3) 3
 
8.3%
Decimal Number
ValueCountFrequency (%)
1 1109
38.0%
0 916
31.4%
2 275
 
9.4%
4 105
 
3.6%
5 98
 
3.4%
3 98
 
3.4%
9 87
 
3.0%
8 79
 
2.7%
7 76
 
2.6%
6 74
 
2.5%
Other Punctuation
ValueCountFrequency (%)
, 3673
99.6%
· 10
 
0.3%
@ 2
 
0.1%
; 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
11228
100.0%
Open Punctuation
ValueCountFrequency (%)
( 9654
100.0%
Close Punctuation
ValueCountFrequency (%)
) 9471
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 923
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 17
100.0%
Math Symbol
ValueCountFrequency (%)
~ 8
100.0%
Letter Number
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 137031
78.3%
Common 37904
 
21.7%
Latin 38
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
9274
 
6.8%
9259
 
6.8%
9055
 
6.6%
8925
 
6.5%
8908
 
6.5%
7415
 
5.4%
7189
 
5.2%
7038
 
5.1%
3124
 
2.3%
3069
 
2.2%
Other values (363) 63775
46.5%
Common
ValueCountFrequency (%)
11228
29.6%
( 9654
25.5%
) 9471
25.0%
, 3673
 
9.7%
1 1109
 
2.9%
- 923
 
2.4%
0 916
 
2.4%
2 275
 
0.7%
4 105
 
0.3%
5 98
 
0.3%
Other values (10) 452
 
1.2%
Latin
ValueCountFrequency (%)
I 9
23.7%
C 8
21.1%
M 4
10.5%
Y 3
 
7.9%
2
 
5.3%
K 2
 
5.3%
A 2
 
5.3%
N 2
 
5.3%
R 1
 
2.6%
E 1
 
2.6%
Other values (4) 4
10.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 137031
78.3%
ASCII 37930
 
21.7%
None 10
 
< 0.1%
Number Forms 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
11228
29.6%
( 9654
25.5%
) 9471
25.0%
, 3673
 
9.7%
1 1109
 
2.9%
- 923
 
2.4%
0 916
 
2.4%
2 275
 
0.7%
4 105
 
0.3%
5 98
 
0.3%
Other values (22) 478
 
1.3%
Hangul
ValueCountFrequency (%)
9274
 
6.8%
9259
 
6.8%
9055
 
6.6%
8925
 
6.5%
8908
 
6.5%
7415
 
5.4%
7189
 
5.2%
7038
 
5.1%
3124
 
2.3%
3069
 
2.2%
Other values (363) 63775
46.5%
None
ValueCountFrequency (%)
· 10
100.0%
Number Forms
ValueCountFrequency (%)
2
100.0%

Interactions


Correlations

(variable) | 보고서작성연도 | 파일타입
보고서작성연도 | 1.000 | 0.247
파일타입 | 0.247 | 1.000

(variable) | 보고서작성연도 | 파일타입
보고서작성연도 | 1.000 | 0.152
파일타입 | 0.152 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

Columns: 광산명 (mine name), 보고서번호 (report number), 보고서명 (report title), 보고서작성연도 (report year), 파일명 (file name), 파일타입 (file type), 요약정보 (summary information)

(index) | 광산명 | 보고서번호 | 보고서명 | 보고서작성연도 | 파일명 | 파일타입 | 요약정보
207 | 송경(구 송학)(석회석) | 19시-15 | 송경(구 송학)(석회석)광산 시추결과보고서 | 2019 | 2019년 송경(구 송학)(석회석)광산 시추결과보고서.pdf | pdf | 송경(구 송학)(석회석)광산 시추결과보고서
6290 | 북삼(고령토)광산 | 94의-053 | 북삼(고령토)광산 기술조사보고서 | 1994 | 94의-53@북삼(고령토)광산 기술조사보고서.pdf | pdf | 북삼(고령토)광산 기술조사보고서
3097 | 광진(석회석)광산 | 03매-001 | 광진(석회석)광산 매장량조사보고서 | 2003 | 03매-01@광진(석회석)광산 매장량조사보고서.PDF | pdf | 광진(석회석)광산 매장량조사보고서
14617 | 신덕(금,은)광산 | 86기-047 | 신덕(금,은)광산 기본조사보고서 | 1986 | 86기-47@신덕(금,은)광산 기본조사보고서.hwp | hwp | 신덕(금,은)광산 기본조사보고서
17387 | 광도(도석)광산 | 82의-009 | 광도(도석)광산 기술조사보고서 | 1982 | 82의-09@광도(도석)광산 기술 조사보고서.pdf | pdf | 광도(도석)광산 기술 조사보고서
7522 | 풍천(사문석)광산 | 92매-022 | 풍천(사문석)광산 매장량조사보고서 | 1992 | 92매-22@풍천(사문석)광산 매장량조사보고서.pdf | pdf | 풍천(사문석)광산 매장량조사보고서
18223 | 삼봉(금,은,동)광산 | 79매-014 | 삼봉(금,은,동)광산 매장량조사보고서 | 1979 | 79매-14@삼봉(금,은,동,연,아연)광산 매장량 조사보고서.pdf | pdf | 삼봉(금,은,동,연,아연)광산 매장량 조사보고서
15958 | 울산(철)광산 | 85물-030 | 울산(철)광산 물리탐사보고서 | 1985 | 85물-30@울산(철)광산 물리탐사보고서(기본).hwp | hwp | 울산(철)광산 물리탐사보고서(기본)
12732 | 임천(금,은)광산 | 88굴-073 | 임천(금,은)광산 굴진효과조사보고서 | 1988 | 88굴-73@임천(금,은)광산 탐광갱도굴진효과조사보고서.pdf | pdf | 임천(금,은)광산 탐광갱도굴진효과조사보고서
5551 | 가평(도석)지구 | 95정-010 | 가평(도석)지구 정밀조사보고서 | 1995 | 95정-10@가평(도석)지구 정밀조사보고서.hwp | hwp | 가평(도석)지구 정밀조사보고서

(index) | 광산명 | 보고서번호 | 보고서명 | 보고서작성연도 | 파일명 | 파일타입 | 요약정보
16238 | 삼화(고령토)광산 | 84매-026 | 삼화(고령토)광산 매장량조사보고서 | 1984 | 84매-26@삼화(고령토)광산 매장량 조사보고서.pdf | pdf | 삼화(고령토)광산 매장량 조사보고서
2690 | 반천(석회석)광산 | 05시-032 | 반천(석회석)광산 시추결과보고서 | 2005 | 05시-32@반천(석회석)광산 시추결과보고서.hwp | hwp | 반천(석회석)광산 시추결과보고서
5888 | 대광(화강암) | 95매-127 | 대광(화강암) 매장량조사보고서(석재) | 1995 | 95매-127@대광석재 석재매장량 조사보고서.pdf | pdf | 대광석재 석재매장량 조사보고서
5777 | 유일(금,은)광산 | 95시-010 | 유일(금,은)광산 시추결과보고서 | 1995 | 95시-10@유일(금,은)광산 시추결과보고서.pdf | pdf | 유일(금,은)광산 시추결과보고서
2848 | 충원(석회석)광산 | 04시-064 | 충원(석회석)광산 시추결과보고서 | 2004 | 04시-64@시추주상도-01@충원(석회석)광산 시추결과보고서.pdf | pdf | 시추주상도-01
7109 | 매현(금,은,동,연,규석)광산 | 93의-006 | 매현(금,은,동,연,규석)광산 기술조사보고서 | 1993 | 93의-06@매현(금,은,동,연,규석)광산 기술조사보고서.pdf | pdf | 매현(금,은,동,연,규석)광산 기술조사보고서
19851 | 마성(연,아연)광산 | 75시-006 | 마성(연,아연)광산 시추결과보고서 | 1975 | 75시-06@마성(연,아연)광산 시추 조사보고서.pdf | pdf | 마성(연,아연)광산 시추 조사보고서
307 | 완도(납석) | 18시-00 | 18시-@완도(납석)광산 시추결과보고서 | 2018 | 18시-@완도(납석)광산 시추주상도.pdf | pdf | 18시-@완도(납석)광산 시추결과보고서
16116 | 김제(사금)광산 | 85시-071 | 김제(사금)광산 시추결과보고서 | 1985 | 85시-71@김제(사금)광산 시추결과보고서.pdf | pdf | 김제(사금)광산 시추결과보고서
12743 | 연천(티탄철)광산 | 88굴-062 | 연천(티탄철)광산 굴진효과조사보고서 | 1988 | 88굴-62@연천(티탄철)광산 탐광갱도굴진효과조사보고서.pdf | pdf | 연천(티탄철)광산 탐광갱도굴진효과조사보고서

Duplicate rows

Most frequently occurring

(index) | 광산명 | 보고서번호 | 보고서명 | 보고서작성연도 | 파일명 | 파일타입 | 요약정보 | # duplicates
0 | 금호(연,아연)광산 | 14굴-002 | 금호(연,아연)광산 굴진효과조사보고서 | 2014 | 14굴-02@금호(연,아연)광산 굴진효과조사보고서.hwp | hwp | 금호(연,아연)광산 굴진효과조사보고서 | 2
1 | 대금(금,은) | 18시-02 | 대금(금·은)광산 시추결과보고서 | 2018 | 4_2018년도 대금(금은)광산 시추암추 분석결과.pdf | pdf | 대금(금·은)광산 시추결과보고서 | 2
2 | 삼성영월(석회석) | 18시-00 | 삼성영월(석회석)광산 시추결과보고서 | 2018 | 18 삼성영월(석회석)광산 시추결과보고서.pdf | pdf | 삼성영월(석회석)광산 시추결과보고서 | 2
3 | 신예미(철) | 18시-06 | 신예미(철)광산 시추결과보고서 | 2018 | 2018년도 신예미(철)광산 시추결과보고서.hwp | hwp | 신예미(철)광산 시추결과보고서 | 2
4 | 청림청삼(석회석) | 18시-00 | 청림청삼(석회석)광산 시추결과보고서 | 2018 | 18시_청림청삼(석회석)_시추주상도.pdf | pdf | 청림청삼(석회석)광산 시추결과보고서 | 2
5 | 청림청삼(석회석) | 18시-00 | 청림청삼(석회석)광산 시추결과보고서 | 2018 | 2018년도 청림청삼(석회석)광산 시추결과보고서.hwp | hwp | 청림청삼(석회석)광산 시추결과보고서 | 2
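
A table like this one, listing each repeated row once with its number of occurrences, can be produced by counting identical rows. The sketch below shows the idea with the standard library. `duplicate_groups` is a hypothetical helper and the toy rows are shortened to three fields, so it illustrates the technique rather than the profiler's own code.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return {row: count} for every row that occurs more than once."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Toy rows (hypothetical, shortened to three fields for readability).
rows = [
    ("신예미(철)", "18시-06", "hwp"),
    ("신예미(철)", "18시-06", "hwp"),
    ("완도(납석)", "18시-00", "pdf"),
]
groups = duplicate_groups(rows)
for row, n in groups.items():
    print(row, n)
```

Counting groups rather than surplus copies is consistent with the overview figure of 6 duplicate rows, since the table holds six rows that each occur twice.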