Overview

Dataset statistics

Number of variables: 6
Number of observations: 70
Missing cells: 46
Missing cells (%): 11.0%
Duplicate rows: 2
Duplicate rows (%): 2.9%
Total size in memory: 3.5 KiB
Average record size in memory: 51.9 B
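
The two percentage rows follow from the counts. A minimal sketch of that arithmetic, with illustrative variable names that are not part of the report:

```python
# Counts copied from the dataset statistics above.
n_variables = 6
n_observations = 70
missing_cells = 46
duplicate_rows = 2

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```

The missing share is taken over all cells (variables times observations), while the duplicate share is taken over rows.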

Variable types

Text: 3
Numeric: 2
Categorical: 1

Dataset

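Description: Current status of tourist attractions in Yeoncheon-gun, Gyeonggi-do. Attraction names, locations, and other tourist attraction information are published for the convenience of users.
URL: https://www.data.go.kr/data/15069085/fileData.do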

Alerts

[Duplicates] Dataset has 2 (2.9%) duplicate rows
[High correlation] 위도 is highly overall correlated with 경도
[High correlation] 경도 is highly overall correlated with 위도
[Missing] 도로명주소 has 46 (65.7%) missing values

Reproduction

Analysis started: 2023-12-12 19:39:42.227069
Analysis finished: 2023-12-12 19:39:43.273253
Duration: 1.05 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
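
A report like this one can be regenerated with ydata-profiling. The following is a sketch under stated assumptions rather than the exact command used here: the CSV path, the CP949 encoding, and the report title are hypothetical, and the original run's `config.json` is not reproduced.

```python
def build_report(csv_path: str, out_path: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported inside the function so it can be defined
    # even when pandas / ydata-profiling are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the file is CP949-encoded; adjust if the download differs.
    df = pd.read_csv(csv_path, encoding="cp949")

    profile = ProfileReport(
        df,
        title="Yeoncheon-gun tourist attractions",  # hypothetical title
        # Fills the "Dataset" section of the overview.
        dataset={"url": "https://www.data.go.kr/data/15069085/fileData.do"},
    )
    profile.to_file(out_path)
```

Calling `build_report("attractions.csv")` with a local copy of the file (the file name is a placeholder) would write `report.html` to the working directory.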

Variables

관광지명
Text

Distinct: 68
Distinct (%): 97.1%
Missing: 0
Missing (%): 0.0%
Memory size: 692.0 B

Length

Max length: 23
Median length: 16
Mean length: 8.1857143
Min length: 3
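
The length statistics are computed over the character length of each value. A minimal sketch, using only five values of this variable that appear in the report's sample rows, so the printed figures describe that excerpt and not the full column:

```python
from statistics import mean, median

# Five values of 관광지명 taken from the report's sample rows.
names = [
    "한탄강 관광지",
    "고대산캠핑리조트",
    "한반도통일미래센터",
    "한탄강지질공원 재인폭포",
    "임진강 주상절리",
]

# len() counts Unicode code points, so each Hangul syllable counts as one.
lengths = [len(name) for name in names]

print("Max length:", max(lengths))
print("Median length:", median(lengths))
print("Mean length:", mean(lengths))
print("Min length:", min(lengths))

# For the full column, the mean is total characters / observations.
full_column_mean = 573 / 70
print("Full-column mean length:", round(full_column_mean, 7))
```

The last line reproduces the reported mean length from the total character count (573) and the number of observations (70).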

Characters and Unicode

Total characters: 573
Distinct characters: 179
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
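
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories for one value from this dataset; script and block lookups are not covered by the standard library and are left out.

```python
import unicodedata
from collections import Counter

# One 관광지명 value, chosen because it mixes Hangul, Latin letters and spaces.
text = "연천 UN군 화장장시설"

# unicodedata.category returns a two-letter general category code, e.g.
# "Lo" (Other Letter), "Lu" (Uppercase Letter), "Zs" (Space Separator).
categories = Counter(unicodedata.category(ch) for ch in text)

for code, count in categories.most_common():
    print(code, count)
```

Hangul syllables fall under Other Letter, which is why that category dominates the tables in this report.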

Unique

Unique: 66
Unique (%): 94.3%

Sample

1st row: 한탄강 관광지
2nd row: 고대산캠핑리조트
3rd row: 한반도통일미래센터
4th row: 한탄강지질공원 재인폭포
5th row: 임진강 주상절리

Value | Count | Frequency (%)
연천 | 16 | 12.0%
(not preserved) | 9 | 6.8%
한탄강지질공원 | 4 | 3.0%
고인돌 | 3 | 2.3%
한반도통일미래센터 | 2 | 1.5%
학곡리 | 2 | 1.5%
묘역 | 2 | 1.5%
좌상바위 | 2 | 1.5%
은대리 | 2 | 1.5%
주상절리 | 2 | 1.5%
Other values (87) | 89 | 66.9%

Most occurring characters

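Value | Count | Frequency (%)
(space) | 63 | 11.0%
(not preserved) | 26 | 4.5%
(not preserved) | 25 | 4.4%
(not preserved) | 21 | 3.7%
(not preserved) | 16 | 2.8%
(not preserved) | 16 | 2.8%
(not preserved) | 12 | 2.1%
(not preserved) | 11 | 1.9%
(not preserved) | 9 | 1.6%
(not preserved) | 9 | 1.6%
Other values (169) | 365 | 63.7%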

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 491 | 85.7%
Space Separator | 63 | 11.0%
Decimal Number | 9 | 1.6%
Close Punctuation | 4 | 0.7%
Open Punctuation | 3 | 0.5%
Uppercase Letter | 2 | 0.3%
Other Punctuation | 1 | 0.2%

Most frequent character per category

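Other Letter
Value | Count | Frequency (%)
(not preserved) | 26 | 5.3%
(not preserved) | 25 | 5.1%
(not preserved) | 21 | 4.3%
(not preserved) | 16 | 3.3%
(not preserved) | 16 | 3.3%
(not preserved) | 12 | 2.4%
(not preserved) | 11 | 2.2%
(not preserved) | 9 | 1.8%
(not preserved) | 9 | 1.8%
(not preserved) | 9 | 1.8%
Other values (159) | 337 | 68.6%

Decimal Number
Value | Count | Frequency (%)
1 | 3 | 33.3%
2 | 3 | 33.3%
4 | 2 | 22.2%
5 | 1 | 11.1%

Uppercase Letter
Value | Count | Frequency (%)
N | 1 | 50.0%
U | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 63 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 4 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 3 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 1 | 100.0%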

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 491 | 85.7%
Common | 80 | 14.0%
Latin | 2 | 0.3%

Most frequent character per script

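Hangul
Value | Count | Frequency (%)
(not preserved) | 26 | 5.3%
(not preserved) | 25 | 5.1%
(not preserved) | 21 | 4.3%
(not preserved) | 16 | 3.3%
(not preserved) | 16 | 3.3%
(not preserved) | 12 | 2.4%
(not preserved) | 11 | 2.2%
(not preserved) | 9 | 1.8%
(not preserved) | 9 | 1.8%
(not preserved) | 9 | 1.8%
Other values (159) | 337 | 68.6%

Common
Value | Count | Frequency (%)
(space) | 63 | 78.8%
) | 4 | 5.0%
( | 3 | 3.8%
1 | 3 | 3.8%
2 | 3 | 3.8%
4 | 2 | 2.5%
5 | 1 | 1.2%
. | 1 | 1.2%

Latin
Value | Count | Frequency (%)
N | 1 | 50.0%
U | 1 | 50.0%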

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 491 | 85.7%
ASCII | 82 | 14.3%

Most frequent character per block

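ASCII
Value | Count | Frequency (%)
(space) | 63 | 76.8%
) | 4 | 4.9%
( | 3 | 3.7%
1 | 3 | 3.7%
2 | 3 | 3.7%
4 | 2 | 2.4%
N | 1 | 1.2%
U | 1 | 1.2%
5 | 1 | 1.2%
. | 1 | 1.2%

Hangul
Value | Count | Frequency (%)
(not preserved) | 26 | 5.3%
(not preserved) | 25 | 5.1%
(not preserved) | 21 | 4.3%
(not preserved) | 16 | 3.3%
(not preserved) | 16 | 3.3%
(not preserved) | 12 | 2.4%
(not preserved) | 11 | 2.2%
(not preserved) | 9 | 1.8%
(not preserved) | 9 | 1.8%
(not preserved) | 9 | 1.8%
Other values (159) | 337 | 68.6%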

도로명주소
Text

MISSING 

Distinct: 22
Distinct (%): 91.7%
Missing: 46
Missing (%): 65.7%
Memory size: 692.0 B

Length

Max length: 28
Median length: 23
Mean length: 20.791667
Min length: 18

Characters and Unicode

Total characters: 499
Distinct characters: 59
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 20
Unique (%): 83.3%

Sample

1st row: 경기도 연천군 전곡읍 선사로 76
2nd row: 경기도 연천군 신서면 고대산길 84-12
3rd row: 경기도 연천군 전곡읍 남계로 408
4th row: 경기도 연천군 연천읍 동내로 431
5th row: 경기도 연천군 연천읍 현문로 526-35

Value | Count | Frequency (%)
경기도 | 24 | 19.5%
연천군 | 24 | 19.5%
연천읍 | 5 | 4.1%
전곡읍 | 5 | 4.1%
신서면 | 4 | 3.3%
중면 | 3 | 2.4%
고대산길 | 3 | 2.4%
408 | 2 | 1.6%
현문로 | 2 | 1.6%
왕징면 | 2 | 1.6%
Other values (43) | 49 | 39.8%

Most occurring characters

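Value | Count | Frequency (%)
(space) | 99 | 19.8%
(not preserved) | 30 | 6.0%
(not preserved) | 29 | 5.8%
(not preserved) | 27 | 5.4%
(not preserved) | 24 | 4.8%
(not preserved) | 24 | 4.8%
(not preserved) | 24 | 4.8%
(not preserved) | 20 | 4.0%
(not preserved) | 14 | 2.8%
2 | 13 | 2.6%
Other values (49) | 195 | 39.1%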

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 299 | 59.9%
Space Separator | 99 | 19.8%
Decimal Number | 93 | 18.6%
Dash Punctuation | 8 | 1.6%

Most frequent character per category

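Other Letter
Value | Count | Frequency (%)
(not preserved) | 30 | 10.0%
(not preserved) | 29 | 9.7%
(not preserved) | 27 | 9.0%
(not preserved) | 24 | 8.0%
(not preserved) | 24 | 8.0%
(not preserved) | 24 | 8.0%
(not preserved) | 20 | 6.7%
(not preserved) | 14 | 4.7%
(not preserved) | 10 | 3.3%
(not preserved) | 9 | 3.0%
Other values (37) | 88 | 29.4%

Decimal Number
Value | Count | Frequency (%)
2 | 13 | 14.0%
4 | 12 | 12.9%
8 | 12 | 12.9%
0 | 11 | 11.8%
3 | 10 | 10.8%
7 | 9 | 9.7%
1 | 8 | 8.6%
6 | 7 | 7.5%
5 | 6 | 6.5%
9 | 5 | 5.4%

Space Separator
Value | Count | Frequency (%)
(space) | 99 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 8 | 100.0%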

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 299 | 59.9%
Common | 200 | 40.1%

Most frequent character per script

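Hangul
Value | Count | Frequency (%)
(not preserved) | 30 | 10.0%
(not preserved) | 29 | 9.7%
(not preserved) | 27 | 9.0%
(not preserved) | 24 | 8.0%
(not preserved) | 24 | 8.0%
(not preserved) | 24 | 8.0%
(not preserved) | 20 | 6.7%
(not preserved) | 14 | 4.7%
(not preserved) | 10 | 3.3%
(not preserved) | 9 | 3.0%
Other values (37) | 88 | 29.4%

Common
Value | Count | Frequency (%)
(space) | 99 | 49.5%
2 | 13 | 6.5%
4 | 12 | 6.0%
8 | 12 | 6.0%
0 | 11 | 5.5%
3 | 10 | 5.0%
7 | 9 | 4.5%
- | 8 | 4.0%
1 | 8 | 4.0%
6 | 7 | 3.5%
Other values (2) | 11 | 5.5%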

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 299 | 59.9%
ASCII | 200 | 40.1%

Most frequent character per block

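ASCII
Value | Count | Frequency (%)
(space) | 99 | 49.5%
2 | 13 | 6.5%
4 | 12 | 6.0%
8 | 12 | 6.0%
0 | 11 | 5.5%
3 | 10 | 5.0%
7 | 9 | 4.5%
- | 8 | 4.0%
1 | 8 | 4.0%
6 | 7 | 3.5%
Other values (2) | 11 | 5.5%

Hangul
Value | Count | Frequency (%)
(not preserved) | 30 | 10.0%
(not preserved) | 29 | 9.7%
(not preserved) | 27 | 9.0%
(not preserved) | 24 | 8.0%
(not preserved) | 24 | 8.0%
(not preserved) | 24 | 8.0%
(not preserved) | 20 | 6.7%
(not preserved) | 14 | 4.7%
(not preserved) | 10 | 3.3%
(not preserved) | 9 | 3.0%
Other values (37) | 88 | 29.4%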

지번주소
Text

Distinct: 66
Distinct (%): 94.3%
Missing: 0
Missing (%): 0.0%
Memory size: 692.0 B

Length

Max length: 22
Median length: 21
Mean length: 20.157143
Min length: 17

Characters and Unicode

Total characters: 1411
Distinct characters: 81
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 62
Unique (%): 88.6%

Sample

1st row: 경기도 연천군 전곡읍 전곡리 698-5
2nd row: 경기도 연천군 신서면 대광리 130-3
3rd row: 경기도 연천군 전곡읍 마포리 167
4th row: 경기도 연천군 연천읍 부곡리 192
5th row: 경기도 연천군 미산면 동이리 67-1

Value | Count | Frequency (%)
경기도 | 70 | 19.9%
연천군 | 70 | 19.9%
연천읍 | 15 | 4.3%
전곡읍 | 15 | 4.3%
미산면 | 8 | 2.3%
신서면 | 7 | 2.0%
장남면 | 7 | 2.0%
백학면 | 5 | 1.4%
중면 | 5 | 1.4%
신답리 | 5 | 1.4%
Other values (104) | 145 | 41.2%

Most occurring characters

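Value | Count | Frequency (%)
(space) | 282 | 20.0%
(not preserved) | 85 | 6.0%
(not preserved) | 85 | 6.0%
(not preserved) | 72 | 5.1%
(not preserved) | 70 | 5.0%
(not preserved) | 70 | 5.0%
(not preserved) | 70 | 5.0%
(not preserved) | 70 | 5.0%
1 | 53 | 3.8%
(not preserved) | 40 | 2.8%
Other values (71) | 514 | 36.4%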

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 858 | 60.8%
Space Separator | 282 | 20.0%
Decimal Number | 235 | 16.7%
Dash Punctuation | 36 | 2.6%

Most frequent character per category

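Other Letter
Value | Count | Frequency (%)
(not preserved) | 85 | 9.9%
(not preserved) | 85 | 9.9%
(not preserved) | 72 | 8.4%
(not preserved) | 70 | 8.2%
(not preserved) | 70 | 8.2%
(not preserved) | 70 | 8.2%
(not preserved) | 70 | 8.2%
(not preserved) | 40 | 4.7%
(not preserved) | 38 | 4.4%
(not preserved) | 30 | 3.5%
Other values (59) | 228 | 26.6%

Decimal Number
Value | Count | Frequency (%)
1 | 53 | 22.6%
2 | 30 | 12.8%
3 | 30 | 12.8%
4 | 24 | 10.2%
7 | 23 | 9.8%
6 | 18 | 7.7%
8 | 15 | 6.4%
5 | 14 | 6.0%
9 | 14 | 6.0%
0 | 14 | 6.0%

Space Separator
Value | Count | Frequency (%)
(space) | 282 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 36 | 100.0%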

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 858 | 60.8%
Common | 553 | 39.2%

Most frequent character per script

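Hangul
Value | Count | Frequency (%)
(not preserved) | 85 | 9.9%
(not preserved) | 85 | 9.9%
(not preserved) | 72 | 8.4%
(not preserved) | 70 | 8.2%
(not preserved) | 70 | 8.2%
(not preserved) | 70 | 8.2%
(not preserved) | 70 | 8.2%
(not preserved) | 40 | 4.7%
(not preserved) | 38 | 4.4%
(not preserved) | 30 | 3.5%
Other values (59) | 228 | 26.6%

Common
Value | Count | Frequency (%)
(space) | 282 | 51.0%
1 | 53 | 9.6%
- | 36 | 6.5%
2 | 30 | 5.4%
3 | 30 | 5.4%
4 | 24 | 4.3%
7 | 23 | 4.2%
6 | 18 | 3.3%
8 | 15 | 2.7%
5 | 14 | 2.5%
Other values (2) | 28 | 5.1%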

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 858 | 60.8%
ASCII | 553 | 39.2%

Most frequent character per block

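ASCII
Value | Count | Frequency (%)
(space) | 282 | 51.0%
1 | 53 | 9.6%
- | 36 | 6.5%
2 | 30 | 5.4%
3 | 30 | 5.4%
4 | 24 | 4.3%
7 | 23 | 4.2%
6 | 18 | 3.3%
8 | 15 | 2.7%
5 | 14 | 2.5%
Other values (2) | 28 | 5.1%

Hangul
Value | Count | Frequency (%)
(not preserved) | 85 | 9.9%
(not preserved) | 85 | 9.9%
(not preserved) | 72 | 8.4%
(not preserved) | 70 | 8.2%
(not preserved) | 70 | 8.2%
(not preserved) | 70 | 8.2%
(not preserved) | 70 | 8.2%
(not preserved) | 40 | 4.7%
(not preserved) | 38 | 4.4%
(not preserved) | 30 | 3.5%
Other values (59) | 228 | 26.6%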

위도
Real number (ℝ)

HIGH CORRELATION 

Distinct: 66
Distinct (%): 94.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 38.064872
Minimum: 37.946267
Maximum: 38.245944
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 762.0 B

Quantile statistics

Minimum: 37.946267
5-th percentile: 37.986166
Q1: 38.021652
Median: 38.052538
Q3: 38.097026
95-th percentile: 38.211949
Maximum: 38.245944
Range: 0.29967664
Interquartile range (IQR): 0.075374028

Descriptive statistics

Standard deviation: 0.065851679
Coefficient of variation (CV): 0.0017299856
Kurtosis: 0.48181263
Mean: 38.064872
Median Absolute Deviation (MAD): 0.04273083
Skewness: 0.80685828
Sum: 2664.541
Variance: 0.0043364437
Monotonicity: Not monotonic
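
Several of these figures are derived from one another. A small sketch, assuming only the rounded values printed in this section, checks the relationships for 위도:

```python
# Rounded values copied from the 위도 statistics in this report.
minimum, maximum = 37.946267, 38.245944
q1, q3 = 38.021652, 38.097026
mean, std = 38.064872, 0.065851679
n_observations = 70

value_range = maximum - minimum   # reported: 0.29967664
iqr = q3 - q1                     # reported: 0.075374028
cv = std / mean                   # reported: 0.0017299856
variance = std ** 2               # reported: 0.0043364437
total = mean * n_observations     # reported: 2664.541

for label, value in [
    ("Range", value_range),
    ("IQR", iqr),
    ("CV", cv),
    ("Variance", variance),
    ("Sum", total),
]:
    print(f"{label}: {value:.8g}")
```

Small differences in the last digits are expected because the inputs are already rounded in the report.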
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
38.02777481 | 2 | 2.9%
38.03762459 | 2 | 2.9%
37.9880304 | 2 | 2.9%
38.00165085 | 2 | 2.9%
38.09262253 | 1 | 1.4%
38.07691097 | 1 | 1.4%
38.09221082 | 1 | 1.4%
38.05703605 | 1 | 1.4%
38.07034833 | 1 | 1.4%
38.01159172 | 1 | 1.4%
Other values (56) | 56 | 80.0%

Minimum 10 values
Value | Count | Frequency (%)
37.94626689 | 1 | 1.4%
37.9692894 | 1 | 1.4%
37.97018501 | 1 | 1.4%
37.98539135 | 1 | 1.4%
37.98711207 | 1 | 1.4%
37.9873417 | 1 | 1.4%
37.9880304 | 2 | 2.9%
37.99016535 | 1 | 1.4%
37.99117103 | 1 | 1.4%
37.99626154 | 1 | 1.4%

Maximum 10 values
Value | Count | Frequency (%)
38.24594353 | 1 | 1.4%
38.23643828 | 1 | 1.4%
38.21280532 | 1 | 1.4%
38.21251732 | 1 | 1.4%
38.21125419 | 1 | 1.4%
38.15229135 | 1 | 1.4%
38.14433404 | 1 | 1.4%
38.13191395 | 1 | 1.4%
38.13069535 | 1 | 1.4%
38.12957345 | 1 | 1.4%

경도
Real number (ℝ)

HIGH CORRELATION 

Distinct: 66
Distinct (%): 94.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 127.02354
Minimum: 126.81594
Maximum: 127.16105
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 762.0 B

Quantile statistics

Minimum: 126.81594
5-th percentile: 126.84481
Q1: 126.97331
Median: 127.05338
Q3: 127.09869
95-th percentile: 127.14674
Maximum: 127.16105
Range: 0.3451133
Interquartile range (IQR): 0.1253821

Descriptive statistics

Standard deviation: 0.091128903
Coefficient of variation (CV): 0.00071741746
Kurtosis: -0.39629655
Mean: 127.02354
Median Absolute Deviation (MAD): 0.06049255
Skewness: -0.63328255
Sum: 8891.6475
Variance: 0.0083044769
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
126.9533557 | 2 | 2.9%
127.1032885 | 2 | 2.9%
126.8477191 | 2 | 2.9%
127.008114 | 2 | 2.9%
127.0811984 | 1 | 1.4%
127.1022592 | 1 | 1.4%
126.9542974 | 1 | 1.4%
126.9839913 | 1 | 1.4%
126.9806441 | 1 | 1.4%
127.0637623 | 1 | 1.4%
Other values (56) | 56 | 80.0%

Minimum 10 values
Value | Count | Frequency (%)
126.8159372 | 1 | 1.4%
126.828079 | 1 | 1.4%
126.8357211 | 1 | 1.4%
126.842421 | 1 | 1.4%
126.8477191 | 2 | 2.9%
126.8598377 | 1 | 1.4%
126.8725487 | 1 | 1.4%
126.8947454 | 1 | 1.4%
126.9038535 | 1 | 1.4%
126.9441177 | 1 | 1.4%

Maximum 10 values
Value | Count | Frequency (%)
127.1610505 | 1 | 1.4%
127.152515 | 1 | 1.4%
127.1507036 | 1 | 1.4%
127.1476436 | 1 | 1.4%
127.1456447 | 1 | 1.4%
127.1426934 | 1 | 1.4%
127.1389751 | 1 | 1.4%
127.1286384 | 1 | 1.4%
127.1145627 | 1 | 1.4%
127.1139768 | 1 | 1.4%

문의전화
Categorical

Distinct: 13
Distinct (%): 18.6%
Missing: 0
Missing (%): 0.0%
Memory size: 692.0 B

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Unique

Unique: 6
Unique (%): 8.6%

Sample

1st row: 031-833-0030
2nd row: 031-834-6300
3rd row: 031-839-7951
4th row: 031-839-2289
5th row: 031-839-2289

Common Values

Value | Count | Frequency (%)
031-839-2565 | 39 | 55.7%
031-839-2289 | 8 | 11.4%
031-839-2063 | 6 | 8.6%
031-839-2061 | 4 | 5.7%
031-839-2144 | 3 | 4.3%
031-839-7951 | 2 | 2.9%
031-835-2002 | 2 | 2.9%
031-833-0030 | 1 | 1.4%
031-834-6300 | 1 | 1.4%
031-834-8887 | 1 | 1.4%
Other values (3) | 3 | 4.3%

Length

Histogram of lengths of the category

Interactions

[Figures: four interaction plots; images not preserved]

Correlations

Variable | 관광지명 | 도로명주소 | 지번주소 | 위도 | 경도 | 문의전화
관광지명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
도로명주소 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
지번주소 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
위도 | 1.000 | 1.000 | 1.000 | 1.000 | 0.475 | 0.685
경도 | 1.000 | 1.000 | 1.000 | 0.475 | 1.000 | 0.000
문의전화 | 1.000 | 1.000 | 1.000 | 0.685 | 0.000 | 1.000

Variable | 위도 | 경도 | 문의전화
위도 | 1.000 | 0.555 | 0.363
경도 | 0.555 | 1.000 | 0.000
문의전화 | 0.363 | 0.000 | 1.000
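
The report does not state which coefficient each matrix uses. As an illustration only, the sketch below computes a Pearson coefficient between 위도 and 경도 on the first ten rows listed in the report's Sample section. Because it uses ten of the 70 rows and an assumed method, the result is not expected to match the values in the matrices.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# 위도 / 경도 of the first ten sample rows (index 0 to 9).
latitudes = [38.008877, 38.212517, 38.001651, 38.077129, 38.018267,
             38.040611, 38.037625, 38.236438, 38.097267, 38.152291]
longitudes = [127.05886, 127.145645, 127.008114, 127.142693, 127.014145,
              127.11377, 127.103289, 127.161051, 127.111633, 127.090347]

r = pearson(latitudes, longitudes)
print(f"Pearson r on the 10-row excerpt: {r:.3f}")
```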

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
Row | 관광지명 | 도로명주소 | 지번주소 | 위도 | 경도 | 문의전화
0 | 한탄강 관광지 | 경기도 연천군 전곡읍 선사로 76 | 경기도 연천군 전곡읍 전곡리 698-5 | 38.008877 | 127.05886 | 031-833-0030
1 | 고대산캠핑리조트 | 경기도 연천군 신서면 고대산길 84-12 | 경기도 연천군 신서면 대광리 130-3 | 38.212517 | 127.145645 | 031-834-6300
2 | 한반도통일미래센터 | 경기도 연천군 전곡읍 남계로 408 | 경기도 연천군 전곡읍 마포리 167 | 38.001651 | 127.008114 | 031-839-7951
3 | 한탄강지질공원 재인폭포 | <NA> | 경기도 연천군 연천읍 부곡리 192 | 38.077129 | 127.142693 | 031-839-2289
4 | 임진강 주상절리 | <NA> | 경기도 연천군 미산면 동이리 67-1 | 38.018267 | 127.014145 | 031-839-2289
5 | 아우라지 베개용암 | <NA> | 경기도 연천군 전곡읍 신답리 17-44 | 38.040611 | 127.11377 | 031-839-2289
6 | 좌상바위 | <NA> | 경기도 연천군 전곡읍 신답리 307 | 38.037625 | 127.103289 | 031-839-2289
7 | 역고드름 | <NA> | 경기도 연천군 신서면 대광리 산173 | 38.236438 | 127.161051 | 031-839-2061
8 | 동막골유원지 | 경기도 연천군 연천읍 동내로 431 | 경기도 연천군 연천읍 동막리 198 | 38.097267 | 127.111633 | 031-839-2061
9 | 한탄강지질공원 백의리층 | 경기도 연천군 연천읍 현문로 526-35 | 경기도 연천군 연천읍 고문리 212 | 38.152291 | 127.090347 | 031-839-2289

Last rows
Row | 관광지명 | 도로명주소 | 지번주소 | 위도 | 경도 | 문의전화
60 | 연천 호로고루 | <NA> | 경기도 연천군 장남면 원당리 1257-1 | 37.985391 | 126.859838 | 031-839-2565
61 | 연천 당포성 | <NA> | 경기도 연천군 미산면 동이리 778 | 38.023503 | 126.985418 | 031-839-2565
62 | 연천 은대리성 | <NA> | 경기도 연천군 전곡읍 은대리 577 | 38.024272 | 127.059643 | 031-839-2565
63 | 연천역 급수탑 | <NA> | 경기도 연천군 연천읍 차탄리 34-373 | 38.101846 | 127.074096 | 031-839-2565
64 | 연천 UN군 화장장시설 | <NA> | 경기도 연천군 미산면 동이리 610 | 38.02252 | 126.993639 | 031-839-2565
65 | 연천 양원리 고인돌 | <NA> | 경기도 연천군 전곡읍 양원리 408-4 | 37.996262 | 127.034272 | 031-839-2565
66 | 연천 신답리고분 | <NA> | 경기도 연천군 전곡읍 신답리 17-42 | 38.0439 | 127.114563 | 031-839-2565
67 | 평정공 윤호신도비 | <NA> | 경기도 연천군 미산면 아미리 산131-1 | 38.027775 | 126.953356 | 031-839-2565
68 | 연천 학곡리 적석총 | <NA> | 경기도 연천군 백학면 학곡리 20-1 | 37.990165 | 126.951463 | 031-839-2565
69 | 연천 심원사지 | 경기도 연천군 신서면 동내로 970번길 32-268 | 경기도 연천군 신서면 내산리 354 | 38.128578 | 127.152515 | 031-839-2565

Duplicate rows

Most frequently occurring

Row | 관광지명 | 도로명주소 | 지번주소 | 위도 | 경도 | 문의전화 | # duplicates
0 | 연천고랑포구역사공원 | 경기도 연천군 장남면 장남로 270 | 경기도 연천군 장남면 고랑포리 206 | 37.98803 | 126.847719 | 031-835-2002 | 2
1 | 한반도통일미래센터 | 경기도 연천군 전곡읍 남계로 408 | 경기도 연천군 전곡읍 마포리 167 | 38.001651 | 127.008114 | 031-839-7951 | 2
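
The duplicate counts can be reproduced by tallying identical rows. A minimal sketch on a hand-built excerpt, where each row is reduced to (관광지명, 문의전화) for brevity; a full check would compare all six columns:

```python
from collections import Counter

# Excerpt: the two records reported as duplicated (each occurring twice)
# plus one record that occurs once.
rows = [
    ("연천고랑포구역사공원", "031-835-2002"),
    ("한반도통일미래센터", "031-839-7951"),
    ("한탄강 관광지", "031-833-0030"),
    ("연천고랑포구역사공원", "031-835-2002"),
    ("한반도통일미래센터", "031-839-7951"),
]

counts = Counter(rows)

# Rows that occur more than once, with their occurrence count.
duplicates = {row: n for row, n in counts.items() if n > 1}

# Duplicate rows are the extra copies beyond the first occurrence.
n_duplicate_rows = sum(n - 1 for n in duplicates.values())

for row, n in duplicates.items():
    print(row[0], "occurs", n, "times")
print("Duplicate rows:", n_duplicate_rows)
```

With two records each appearing twice, the excerpt yields 2 duplicate rows, in line with the overview figure.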