Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 10253
Missing cells (%): 25.6%
Duplicate rows: 12
Duplicate rows (%): 0.1%
Total size in memory: 410.2 KiB
Average record size in memory: 42.0 B
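
The percentages above follow from the raw counts, and the missing-cell total agrees with the per-variable missing counts listed in the Variables section (89 each for 관광지명, 위도, and 경도; 9986 for 주소). The following is a small arithmetic check using only figures copied from this report; the variable names are illustrative.

```python
# Figures copied from the report; nothing here is read from the dataset itself.
n_rows, n_cols = 10_000, 4
missing_per_column = {"관광지명": 89, "주소": 9986, "위도": 89, "경도": 89}

# Missing cells are summed over all columns and divided by the total cell count.
missing_cells = sum(missing_per_column.values())
missing_pct = 100 * missing_cells / (n_rows * n_cols)

# Duplicate rows as a share of all observations.
duplicate_pct = 100 * 12 / n_rows

# Average record size: total memory (KiB converted to bytes) per row.
avg_record_bytes = 410.2 * 1024 / n_rows

print(missing_cells, round(missing_pct, 1), round(duplicate_pct, 1), round(avg_record_bytes, 1))
```

Rounded to one decimal place, these reproduce the 25.6%, 0.1%, and 42.0 B shown above.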

Variable types

Text: 2
Numeric: 2

Dataset

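Description: Provides the names, addresses, latitudes, and longitudes of tourist sites in the 14 cities and counties of Jeollabuk-do. Collecting and analyzing this location data makes it possible to identify tourism trends and forecast future tourism demand, which can support regional marketing and resource allocation.
Author: 전라북도 (Jeollabuk-do)
URL: https://www.bigdatahub.go.kr/index.jeonbuk?startPage=1&menuCd=DOM_000000103007001000&pListTypeStr=&pId=15124617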

Alerts

Dataset has 12 (0.1%) duplicate rows [Duplicates]
위도 (latitude) is highly overall correlated with 경도 (longitude) [High correlation]
경도 (longitude) is highly overall correlated with 위도 (latitude) [High correlation]
주소 (address) has 9986 (99.9%) missing values [Missing]

Reproduction

Analysis started: 2024-03-14 00:29:30.032909
Analysis finished: 2024-03-14 00:29:31.094701
Duration: 1.06 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
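
A report of this kind is usually produced by loading the data into pandas and handing the DataFrame to ydata-profiling. The following is a minimal sketch, not the configuration actually used for this report (that lives in config.json, which is not reproduced in this text). The function name `build_report` and the file names `tourist_sites.csv` and `report.html` are hypothetical.

```python
def build_report(csv_path: str, html_path: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report (illustrative helper)."""
    # Imported inside the function so the helper can be defined
    # even where pandas and ydata-profiling are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Profiling report")
    profile.to_file(html_path)


# Example call (hypothetical input file):
# build_report("tourist_sites.csv")
```

With default settings the resulting report has the same sections as this one (Overview, Variables, Interactions, Correlations, Missing values, Sample, Duplicate rows), though the exact numbers depend on the input file and configuration.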

Variables

관광지명
Text

Distinct: 9113
Distinct (%): 91.9%
Missing: 89
Missing (%): 0.9%
Memory size: 156.2 KiB

Length

Max length: 29
Median length: 25
Mean length: 6.2929069
Min length: 2

Characters and Unicode

Total characters: 62369
Distinct characters: 735
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
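
To illustrate what a "category" means here, the sketch below counts Unicode general categories for one of the sample values from this variable, using Python's standard `unicodedata` module. It only illustrates the idea and does not claim to be how the report computes its figures. The two-letter codes map to the names used in the tables: `Lo` is Other Letter, `Nd` is Decimal Number, `Ps` is Open Punctuation, and `Pe` is Close Punctuation.

```python
import unicodedata
from collections import Counter

# One sample value from the report, normalized so each Hangul syllable is a single code point.
sample = unicodedata.normalize("NFC", "환승센터출입2(경유)")

# Count the Unicode general category of every character in the string.
category_counts = Counter(unicodedata.category(ch) for ch in sample)

for code, count in category_counts.most_common():
    print(code, count)
```

For this value the eight Hangul syllables fall under `Lo`, the digit under `Nd`, and the two parentheses under `Ps` and `Pe`.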

Unique

Unique: 8436
Unique (%): 85.1%

Sample

1st row: 가산산성주차장
2nd row: 황방리
3rd row: 칠산
4th row: 두모실고개
5th row: 환승센터출입2(경유)
Value | Count | Frequency (%)
방면 | 52 | 0.5%
건너 | 25 | 0.2%
입구 | 12 | 0.1%
이마트 | 9 | 0.1%
홈플러스 | 7 | 0.1%
한신아파트 | 6 | 0.1%
종점 | 6 | 0.1%
마을회관 | 6 | 0.1%
(not shown) | 6 | 0.1%
현대아파트 | 6 | 0.1%
Other values (9176) | 10006 | 98.7%

Most occurring characters

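Value | Count | Frequency (%)
(not shown) | 2387 | 3.8%
(not shown) | 1277 | 2.0%
(not shown) | 1208 | 1.9%
(not shown) | 1113 | 1.8%
(not shown) | 1105 | 1.8%
(not shown) | 1082 | 1.7%
(not shown) | 1057 | 1.7%
(not shown) | 993 | 1.6%
(not shown) | 953 | 1.5%
. | 947 | 1.5%
Other values (725) | 50247 | 80.6%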

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 56408 | 90.4%
Decimal Number | 2144 | 3.4%
Other Punctuation | 979 | 1.6%
Close Punctuation | 933 | 1.5%
Open Punctuation | 931 | 1.5%
Uppercase Letter | 692 | 1.1%
Space Separator | 230 | 0.4%
Dash Punctuation | 27 | < 0.1%
Lowercase Letter | 22 | < 0.1%
Other Symbol | 2 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not shown) | 2387 | 4.2%
(not shown) | 1277 | 2.3%
(not shown) | 1208 | 2.1%
(not shown) | 1113 | 2.0%
(not shown) | 1105 | 2.0%
(not shown) | 1082 | 1.9%
(not shown) | 1057 | 1.9%
(not shown) | 993 | 1.8%
(not shown) | 953 | 1.7%
(not shown) | 859 | 1.5%
Other values (677) | 44374 | 78.7%

Uppercase Letter
Value | Count | Frequency (%)
C | 133 | 19.2%
I | 101 | 14.6%
T | 79 | 11.4%
G | 69 | 10.0%
S | 58 | 8.4%
K | 55 | 7.9%
A | 46 | 6.6%
L | 38 | 5.5%
H | 24 | 3.5%
P | 14 | 2.0%
Other values (12) | 75 | 10.8%

Decimal Number
Value | Count | Frequency (%)
2 | 716 | 33.4%
1 | 714 | 33.3%
3 | 284 | 13.2%
4 | 132 | 6.2%
5 | 81 | 3.8%
6 | 54 | 2.5%
0 | 49 | 2.3%
7 | 46 | 2.1%
9 | 35 | 1.6%
8 | 33 | 1.5%

Other Punctuation
Value | Count | Frequency (%)
. | 947 | 96.7%
/ | 18 | 1.8%
, | 8 | 0.8%
· | 4 | 0.4%
& | 2 | 0.2%

Lowercase Letter
Value | Count | Frequency (%)
e | 16 | 72.7%
s | 3 | 13.6%
a | 1 | 4.5%
p | 1 | 4.5%
t | 1 | 4.5%

Close Punctuation
Value | Count | Frequency (%)
) | 933 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 931 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 230 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 27 | 100.0%

Other Symbol
Value | Count | Frequency (%)
(not shown) | 2 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 56410 | 90.4%
Common | 5245 | 8.4%
Latin | 714 | 1.1%

Most frequent character per script

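Hangul
Value | Count | Frequency (%)
(not shown) | 2387 | 4.2%
(not shown) | 1277 | 2.3%
(not shown) | 1208 | 2.1%
(not shown) | 1113 | 2.0%
(not shown) | 1105 | 2.0%
(not shown) | 1082 | 1.9%
(not shown) | 1057 | 1.9%
(not shown) | 993 | 1.8%
(not shown) | 953 | 1.7%
(not shown) | 859 | 1.5%
Other values (678) | 44376 | 78.7%

Latin
Value | Count | Frequency (%)
C | 133 | 18.6%
I | 101 | 14.1%
T | 79 | 11.1%
G | 69 | 9.7%
S | 58 | 8.1%
K | 55 | 7.7%
A | 46 | 6.4%
L | 38 | 5.3%
H | 24 | 3.4%
e | 16 | 2.2%
Other values (17) | 95 | 13.3%

Common
Value | Count | Frequency (%)
. | 947 | 18.1%
) | 933 | 17.8%
( | 931 | 17.8%
2 | 716 | 13.7%
1 | 714 | 13.6%
3 | 284 | 5.4%
(space) | 230 | 4.4%
4 | 132 | 2.5%
5 | 81 | 1.5%
6 | 54 | 1.0%
Other values (10) | 223 | 4.3%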

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 56408 | 90.4%
ASCII | 5955 | 9.5%
None | 6 | < 0.1%

Most frequent character per block

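Hangul
Value | Count | Frequency (%)
(not shown) | 2387 | 4.2%
(not shown) | 1277 | 2.3%
(not shown) | 1208 | 2.1%
(not shown) | 1113 | 2.0%
(not shown) | 1105 | 2.0%
(not shown) | 1082 | 1.9%
(not shown) | 1057 | 1.9%
(not shown) | 993 | 1.8%
(not shown) | 953 | 1.7%
(not shown) | 859 | 1.5%
Other values (677) | 44374 | 78.7%

ASCII
Value | Count | Frequency (%)
. | 947 | 15.9%
) | 933 | 15.7%
( | 931 | 15.6%
2 | 716 | 12.0%
1 | 714 | 12.0%
3 | 284 | 4.8%
(space) | 230 | 3.9%
C | 133 | 2.2%
4 | 132 | 2.2%
I | 101 | 1.7%
Other values (36) | 834 | 14.0%

None
Value | Count | Frequency (%)
· | 4 | 66.7%
(not shown) | 2 | 33.3%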

주소
Text

MISSING 

Distinct: 14
Distinct (%): 100.0%
Missing: 9986
Missing (%): 99.9%
Memory size: 156.2 KiB

Length

Max length: 16
Median length: 15
Mean length: 14.142857
Min length: 13

Characters and Unicode

Total characters: 198
Distinct characters: 69
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 14
Unique (%): 100.0%

Sample

1st row: 전북완주군소양면복은길18
2nd row: 전북완주군상관면정수사길18
3rd row: 전북완주군구이면모악산길91
4th row: 전북완주군소양면해월리1번지
5th row: 전북부안군변산면새만금로6
Value | Count | Frequency (%)
전북완주군소양면복은길18 | 1 | 7.1%
전북완주군상관면정수사길18 | 1 | 7.1%
전북완주군구이면모악산길91 | 1 | 7.1%
전북완주군소양면해월리1번지 | 1 | 7.1%
전북부안군변산면새만금로6 | 1 | 7.1%
전북전주시덕진구어은로89-11 | 1 | 7.1%
전북군산시구영2길12-1 | 1 | 7.1%
전북무주군설천면만선로185 | 1 | 7.1%
전북진안군진안읍마이산로258 | 1 | 7.1%
전북진안군마령면마이산남로217 | 1 | 7.1%
Other values (4) | 4 | 28.6%

Most occurring characters

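Value | Count | Frequency (%)
(not shown) | 15 | 7.6%
(not shown) | 14 | 7.1%
1 | 12 | 6.1%
(not shown) | 11 | 5.6%
(not shown) | 10 | 5.1%
(not shown) | 7 | 3.5%
(not shown) | 7 | 3.5%
(not shown) | 7 | 3.5%
(not shown) | 6 | 3.0%
8 | 5 | 2.5%
Other values (59) | 104 | 52.5%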

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 158 | 79.8%
Decimal Number | 37 | 18.7%
Dash Punctuation | 3 | 1.5%

Most frequent character per category

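Other Letter
Value | Count | Frequency (%)
(not shown) | 15 | 9.5%
(not shown) | 14 | 8.9%
(not shown) | 11 | 7.0%
(not shown) | 10 | 6.3%
(not shown) | 7 | 4.4%
(not shown) | 7 | 4.4%
(not shown) | 7 | 4.4%
(not shown) | 6 | 3.8%
(not shown) | 5 | 3.2%
(not shown) | 4 | 2.5%
Other values (48) | 72 | 45.6%

Decimal Number
Value | Count | Frequency (%)
1 | 12 | 32.4%
8 | 5 | 13.5%
2 | 4 | 10.8%
5 | 4 | 10.8%
4 | 3 | 8.1%
7 | 2 | 5.4%
6 | 2 | 5.4%
9 | 2 | 5.4%
3 | 2 | 5.4%
0 | 1 | 2.7%

Dash Punctuation
Value | Count | Frequency (%)
- | 3 | 100.0%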

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 158 | 79.8%
Common | 40 | 20.2%

Most frequent character per script

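Hangul
Value | Count | Frequency (%)
(not shown) | 15 | 9.5%
(not shown) | 14 | 8.9%
(not shown) | 11 | 7.0%
(not shown) | 10 | 6.3%
(not shown) | 7 | 4.4%
(not shown) | 7 | 4.4%
(not shown) | 7 | 4.4%
(not shown) | 6 | 3.8%
(not shown) | 5 | 3.2%
(not shown) | 4 | 2.5%
Other values (48) | 72 | 45.6%

Common
Value | Count | Frequency (%)
1 | 12 | 30.0%
8 | 5 | 12.5%
2 | 4 | 10.0%
5 | 4 | 10.0%
- | 3 | 7.5%
4 | 3 | 7.5%
7 | 2 | 5.0%
6 | 2 | 5.0%
9 | 2 | 5.0%
3 | 2 | 5.0%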

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 158 | 79.8%
ASCII | 40 | 20.2%

Most frequent character per block

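Hangul
Value | Count | Frequency (%)
(not shown) | 15 | 9.5%
(not shown) | 14 | 8.9%
(not shown) | 11 | 7.0%
(not shown) | 10 | 6.3%
(not shown) | 7 | 4.4%
(not shown) | 7 | 4.4%
(not shown) | 7 | 4.4%
(not shown) | 6 | 3.8%
(not shown) | 5 | 3.2%
(not shown) | 4 | 2.5%
Other values (48) | 72 | 45.6%

ASCII
Value | Count | Frequency (%)
1 | 12 | 30.0%
8 | 5 | 12.5%
2 | 4 | 10.0%
5 | 4 | 10.0%
- | 3 | 7.5%
4 | 3 | 7.5%
7 | 2 | 5.0%
6 | 2 | 5.0%
9 | 2 | 5.0%
3 | 2 | 5.0%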

위도
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9562
Distinct (%): 96.5%
Missing: 89
Missing (%): 0.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 36.514349
Minimum: 34.71309
Maximum: 38.26478
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 34.71309
5-th percentile: 35.08034
Q1: 35.78684
Median: 36.66172
Q3: 37.37151
95-th percentile: 37.74574
Maximum: 38.26478
Range: 3.55169
Interquartile range (IQR): 1.58467

Descriptive statistics

Standard deviation: 0.9490745
Coefficient of variation (CV): 0.025991823
Kurtosis: -1.3246101
Mean: 36.514349
Median Absolute Deviation (MAD): 0.78332
Skewness: -0.28616512
Sum: 361893.71
Variance: 0.90074241
Monotonicity: Not monotonic
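
Several of the 위도 figures above are derived from others: the range and IQR are differences of quantiles, the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum divided by the mean gives the number of non-missing values. The sketch below checks those relationships using only numbers copied from this report; the variable names are illustrative.

```python
import math

# Figures copied from the 위도 statistics in this report.
minimum, maximum = 34.71309, 38.26478
q1, q3 = 35.78684, 37.37151
mean, std = 36.514349, 0.9490745
total = 361893.71

value_range = maximum - minimum      # Range
iqr = q3 - q1                        # Interquartile range
cv = std / mean                      # Coefficient of variation
variance = std ** 2                  # Variance
non_missing = round(total / mean)    # Count of non-missing values

print(round(value_range, 5), round(iqr, 5), round(cv, 9), round(variance, 8), non_missing)
```

The non-missing count comes out at 9911, which is the 10000 observations minus the 89 missing values reported for this variable.
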
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
37.5292 | 3 | < 0.1%
35.12066 | 3 | < 0.1%
35.13478 | 3 | < 0.1%
37.68045 | 3 | < 0.1%
35.2445 | 3 | < 0.1%
35.88212 | 3 | < 0.1%
37.37273 | 3 | < 0.1%
35.91007 | 3 | < 0.1%
35.19938 | 3 | < 0.1%
37.64597 | 3 | < 0.1%
Other values (9552) | 9881 | 98.8%
(Missing) | 89 | 0.9%

Minimum 10 values
Value | Count | Frequency (%)
34.71309 | 1 | < 0.1%
34.71634 | 1 | < 0.1%
34.72649 | 1 | < 0.1%
34.72667 | 1 | < 0.1%
34.728 | 1 | < 0.1%
34.73088 | 1 | < 0.1%
34.73328 | 1 | < 0.1%
34.74055 | 1 | < 0.1%
34.74116 | 1 | < 0.1%
34.74136 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
38.26478 | 1 | < 0.1%
38.211 | 1 | < 0.1%
38.2076 | 1 | < 0.1%
38.11672 | 1 | < 0.1%
38.05262 | 1 | < 0.1%
38.05087 | 1 | < 0.1%
38.04968 | 1 | < 0.1%
38.04686 | 1 | < 0.1%
38.04263 | 1 | < 0.1%
38.03588 | 1 | < 0.1%

경도
Real number (ℝ)

HIGH CORRELATION 

Distinct: 7676
Distinct (%): 77.4%
Missing: 89
Missing (%): 0.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 127.64798
Minimum: 126.3916
Maximum: 129.4935
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 126.3916
5-th percentile: 126.72385
Q1: 126.97275
Median: 127.2999
Q3: 128.50045
95-th percentile: 129.07185
Maximum: 129.4935
Range: 3.1019
Interquartile range (IQR): 1.5277

Descriptive statistics

Standard deviation: 0.82086473
Coefficient of variation (CV): 0.006430691
Kurtosis: -1.1865887
Mean: 127.64798
Median Absolute Deviation (MAD): 0.4535
Skewness: 0.58151306
Sum: 1265119.1
Variance: 0.6738189
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
126.8386 | 6 | 0.1%
127.1289 | 6 | 0.1%
127.0996 | 6 | 0.1%
127.2989 | 6 | 0.1%
127.058 | 5 | 0.1%
127.135 | 5 | 0.1%
127.0583 | 5 | 0.1%
127.4281 | 5 | 0.1%
126.7893 | 5 | 0.1%
127.0942 | 5 | 0.1%
Other values (7666) | 9857 | 98.6%
(Missing) | 89 | 0.9%

Minimum 10 values
Value | Count | Frequency (%)
126.3916 | 1 | < 0.1%
126.4131 | 1 | < 0.1%
126.4188 | 1 | < 0.1%
126.4232 | 1 | < 0.1%
126.4291 | 1 | < 0.1%
126.4326 | 1 | < 0.1%
126.4337 | 1 | < 0.1%
126.4502 | 1 | < 0.1%
126.4513 | 1 | < 0.1%
126.4523 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
129.4935 | 1 | < 0.1%
129.4915 | 1 | < 0.1%
129.4822 | 1 | < 0.1%
129.4795 | 1 | < 0.1%
129.4626 | 1 | < 0.1%
129.4504 | 1 | < 0.1%
129.444 | 1 | < 0.1%
129.4432 | 1 | < 0.1%
129.4406 | 1 | < 0.1%
129.4333 | 1 | < 0.1%

Interactions

[Figures: four interaction plots covering the pairings of 위도 and 경도]

Correlations

 | 주소 | 위도 | 경도
주소 | 1.000 | 1.000 | 1.000
위도 | 1.000 | 1.000 | 0.805
경도 | 1.000 | 0.805 | 1.000
 | 위도 | 경도
위도 | 1.000 | -0.630
경도 | -0.630 | 1.000
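
The two matrices give different values for the same 위도/경도 pair (0.805 and -0.630), which suggests they were computed with different correlation measures; this text does not say which. As a general illustration only, the sketch below computes two common signed coefficients, Pearson and Spearman, from scratch on a handful of made-up coordinate pairs. The data and the function names are hypothetical and are not taken from the dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical (latitude, longitude) pairs; one pair is missing, as in the real data.
pairs = [(35.1, 129.1), (35.9, 128.5), (None, None), (36.7, 127.3), (37.4, 127.0), (38.0, 126.7)]
complete = [(lat, lon) for lat, lon in pairs if lat is not None and lon is not None]
lats, lons = zip(*complete)

print(pearson(lats, lons), spearman(lats, lons))
```

In this toy sample longitude falls steadily as latitude rises, so both coefficients are negative. Measures that do not carry a sign would report the same relationship as a positive number, which is one possible reason two matrices can disagree in sign.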

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows
Index | 관광지명 | 주소 | 위도 | 경도
30301 | 가산산성주차장 | <NA> | 36.01913 | 128.5893
86920 | 황방리 | <NA> | 37.9286 | 126.9993
29555 | 칠산 | <NA> | 36.15832 | 128.2412
35829 | 두모실고개 | <NA> | 34.97973 | 128.6844
96927 | 환승센터출입2(경유) | <NA> | 37.27118 | 126.9967
72149 | 뫼루니육교 | <NA> | 37.39128 | 127.083
92221 | 중흥s클래스후문 | <NA> | 37.6377 | 126.6677
2322 | 수심삼거리 | <NA> | 36.66018 | 128.3625
92945 | 대죽리-1 | <NA> | 37.13558 | 127.4801
8913 | 목화그린빌라 | <NA> | 35.12885 | 129.0748

Last rows
Index | 관광지명 | 주소 | 위도 | 경도
13746 | 보성리 | <NA> | 36.72714 | 127.1188
50751 | 삼도뷰앤빌건너삼거리 | <NA> | 36.08702 | 128.364
78839 | 파주삼릉 | <NA> | 37.73707 | 126.8234
89949 | 금산검문소 | <NA> | 37.81082 | 126.7142
18680 | 문화그린맨션 | <NA> | 35.2564 | 129.2159
42483 | 장기1리 | <NA> | 36.143 | 128.5897
85877 | 독지1리.돌내입구 | <NA> | 37.24615 | 126.6993
77817 | 용인터미널 | <NA> | 37.23407 | 127.2088
60160 | 성덕 | <NA> | 35.24349 | 126.9721
85546 | 하길중학교 | <NA> | 37.11317 | 126.9082

Duplicate rows

Most frequently occurring

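Index | 관광지명 | 주소 | 위도 | 경도 | # duplicates
11 | <NA> | <NA> | <NA> | <NA> | 89
0 | 금남1리남산입구 | <NA> | 35.93255 | 128.4049 | 2
1 | 다부전적기념관건너 | <NA> | 36.04635 | 128.5203 | 2
2 | 사림동 | <NA> | 35.24424 | 128.6758 | 2
3 | 석적행정복지센터 | <NA> | 36.06485 | 128.4007 | 2
4 | 성주터미널 | <NA> | 35.91711 | 128.2852 | 2
5 | 세종충남대학교병원 | <NA> | 36.52052 | 127.2595 | 2
6 | 수한면행정복지센터 | <NA> | 36.47968 | 127.6983 | 2
7 | 신흥마을 | <NA> | 36.77883 | 127.1178 | 2
8 | 오로리출발 | <NA> | 36.18589 | 128.5384 | 2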
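
A table like the one above groups identical rows and counts how often each occurs. Its indices run from 0 to 11, which is consistent with the 12 duplicate rows reported in the overview, and the entirely missing row accounts for 89 occurrences. The sketch below shows the grouping idea with the standard library on a few illustrative rows; it is not the report's own implementation, and the row list is a made-up excerpt rather than the real data.

```python
from collections import Counter

# Illustrative rows as (관광지명, 주소, 위도, 경도); None stands for a missing value.
rows = [
    ("성주터미널", None, 35.91711, 128.2852),
    ("신흥마을", None, 36.77883, 127.1178),
    (None, None, None, None),
    ("성주터미널", None, 35.91711, 128.2852),
    (None, None, None, None),
    (None, None, None, None),
]

# Count identical rows, then keep only those that occur more than once.
counts = Counter(rows)
duplicate_groups = {row: n for row, n in counts.items() if n > 1}

# List the groups from most to least frequent, as the report does.
for row, n in sorted(duplicate_groups.items(), key=lambda item: -item[1]):
    print(row, n)
```

Here two distinct rows are repeated: the all-missing row three times and 성주터미널 twice. Rows that occur once, such as 신흥마을 in this excerpt, do not appear in the result.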