Overview

Dataset statistics

Number of variables: 6
Number of observations: 6323
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 302.7 KiB
Average record size in memory: 49.0 B
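
As a quick sanity check, the average record size should follow from the total memory footprint divided by the number of observations. The sketch below assumes 1 KiB = 1024 bytes; the function name `average_record_size` is illustrative and not part of the profiling tool.

```python
def average_record_size(total_kib: float, n_rows: int) -> float:
    """Return the mean number of bytes per row, assuming 1 KiB = 1024 bytes."""
    return total_kib * 1024 / n_rows


# Figures taken from the dataset statistics above.
avg_bytes = average_record_size(302.7, 6323)
print(f"{avg_bytes:.1f} B")
```

Rounded to one decimal place, the result agrees with the reported average record size.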

Variable types

Numeric: 1
Text: 2
Categorical: 2
DateTime: 1

Dataset

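Description: Restaurant, accommodation, tourist attraction, and event data collected from social media about tourism products in Jinju, Gyeongsangnam-do, together with the URLs of tourism product reviews.
Author: Jinju City, Gyeongsangnam-do (경상남도 진주시)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15097737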

Alerts

번호 is highly overall correlated with 관광상품분류 (High correlation)
관광상품분류 is highly overall correlated with 번호 (High correlation)
번호 has unique values (Unique)
홈페이지주소(URL) has unique values (Unique)

Reproduction

Analysis started: 2023-12-11 00:29:52.093336
Analysis finished: 2023-12-11 00:29:53.223131
Duration: 1.13 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

번호
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 6323
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5003162
Minimum: 5000001
Maximum: 5006323
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 55.7 KiB

Quantile statistics

Minimum: 5000001
5-th percentile: 5000317.1
Q1: 5001581.5
Median: 5003162
Q3: 5004742.5
95-th percentile: 5006006.9
Maximum: 5006323
Range: 6322
Interquartile range (IQR): 3161
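
The quantile figures can be reproduced with the standard library, under one assumption: that 번호 holds the consecutive integers from the minimum to the maximum. This is consistent with 6323 distinct values over a range of 6322, but the report does not state it outright. The sketch uses linear interpolation between order statistics (`method="inclusive"`), which may or may not be the exact method the profiling tool applies.

```python
import statistics

# Assumption: the IDs are the consecutive integers 5000001..5006323.
values = list(range(5_000_001, 5_006_324))

# Quartiles with linear interpolation between neighbouring order statistics.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")

# Cutting into 20 groups yields the 5th, 10th, ..., 95th percentiles.
ventiles = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = ventiles[0], ventiles[-1]

iqr = q3 - q1
value_range = max(values) - min(values)

print(p5, q1, median, q3, p95, value_range, iqr)
```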

Descriptive statistics

Standard deviation: 1825.4372
Coefficient of variation (CV): 0.00036485671
Kurtosis: -1.2
Mean: 5003162
Median Absolute Deviation (MAD): 1581
Skewness: 0
Sum: 3.1634993 × 10^10
Variance: 3332221
Monotonicity: Not monotonic
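
The descriptive statistics can be cross-checked the same way. The sketch below again assumes that 번호 is the run of consecutive integers from 5000001 to 5006323 (an inference from the distinct count and range, not something the report states), and it uses the sample (n - 1) forms of variance and standard deviation.

```python
import statistics

# Assumption: the IDs are the consecutive integers 5000001..5006323.
values = list(range(5_000_001, 5_006_324))

mean = statistics.fmean(values)
variance = statistics.variance(values)  # sample variance (divides by n - 1)
stdev = statistics.stdev(values)        # square root of the sample variance
cv = stdev / mean                       # coefficient of variation

med = statistics.median(values)
# Median absolute deviation: median of the distances from the median.
mad = statistics.median(abs(v - med) for v in values)

total = sum(values)

print(mean, variance, round(stdev, 4), cv, mad, total)
```

Because the row order in the file is not sorted by 번호, the values are still reported as not monotonic even though they form a gap-free sequence under this assumption.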
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
5005529 | 1 | < 0.1%
5006094 | 1 | < 0.1%
5005169 | 1 | < 0.1%
5005168 | 1 | < 0.1%
5005167 | 1 | < 0.1%
5004552 | 1 | < 0.1%
5002611 | 1 | < 0.1%
5002153 | 1 | < 0.1%
5000755 | 1 | < 0.1%
5005161 | 1 | < 0.1%
Other values (6313) | 6313 | 99.8%

Minimum 10 values

Value | Count | Frequency (%)
5000001 | 1 | < 0.1%
5000002 | 1 | < 0.1%
5000003 | 1 | < 0.1%
5000004 | 1 | < 0.1%
5000005 | 1 | < 0.1%
5000006 | 1 | < 0.1%
5000007 | 1 | < 0.1%
5000008 | 1 | < 0.1%
5000009 | 1 | < 0.1%
5000010 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
5006323 | 1 | < 0.1%
5006322 | 1 | < 0.1%
5006321 | 1 | < 0.1%
5006320 | 1 | < 0.1%
5006319 | 1 | < 0.1%
5006318 | 1 | < 0.1%
5006317 | 1 | < 0.1%
5006316 | 1 | < 0.1%
5006315 | 1 | < 0.1%
5006314 | 1 | < 0.1%

관광상품명
Text

Distinct: 1608
Distinct (%): 25.4%
Missing: 0
Missing (%): 0.0%
Memory size: 49.5 KiB

Length

Max length: 192
Median length: 97
Mean length: 7.7031472
Min length: 1

Characters and Unicode

Total characters: 48707
Distinct characters: 754
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
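
To illustrate how such character properties can be read, the sketch below counts the Unicode general category of each character with Python's `unicodedata` module. The sample string is hypothetical and not a row from the dataset. The two-letter codes map onto the category names used in this report (`Lo` = Other Letter, `Zs` = Space Separator, `Lu` = Uppercase Letter, `Ll` = Lowercase Letter, `Ps` = Open Punctuation, `Pe` = Close Punctuation).

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category (e.g. 'Lo', 'Zs')."""
    return Counter(unicodedata.category(ch) for ch in text)


# Hypothetical example string mixing Hangul, Latin and punctuation.
sample = "진주성 (Jinju)"
counts = category_counts(sample)

for category, n in counts.most_common():
    print(category, n)
```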

Unique

Unique: 980
Unique (%): 15.5%

Sample

1st row: 진주국제재즈페스티벌
2nd row: 진주국제재즈페스티벌
3rd row: 경해여자고등학교
4th row: 에나뮤직오픈마이크
5th row: 진주국제재즈페스티벌

Value | Count | Frequency (%)
진주남강유등축제 | 504 | 6.4%
진주레일바이크놀이공원 | 269 | 3.4%
진주성 | 256 | 3.2%
경상남도수목원 | 136 | 1.7%
진양호 | 134 | 1.7%
하연옥 | 116 | 1.5%
진주 | 82 | 1.0%
본점 | 81 | 1.0%
월아산 | 79 | 1.0%
진주익룡발자국전시관 | 74 | 0.9%
Other values (1821) | 6158 | 78.1%

Most occurring characters

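Value | Count | Frequency (%)
(not preserved) | 2503 | 5.1%
(not preserved) | 2292 | 4.7%
(space) | 1569 | 3.2%
(not preserved) | 1195 | 2.5%
(not preserved) | 888 | 1.8%
(not preserved) | 828 | 1.7%
(not preserved) | 803 | 1.6%
(not preserved) | 765 | 1.6%
(not preserved) | 761 | 1.6%
(not preserved) | 673 | 1.4%
Other values (744) | 36430 | 74.8%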

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 45321 | 93.0%
Space Separator | 1569 | 3.2%
Other Punctuation | 751 | 1.5%
Decimal Number | 235 | 0.5%
Uppercase Letter | 216 | 0.4%
Lowercase Letter | 194 | 0.4%
Open Punctuation | 189 | 0.4%
Close Punctuation | 189 | 0.4%
Math Symbol | 24 | < 0.1%
Dash Punctuation | 12 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 2503 | 5.5%
(not preserved) | 2292 | 5.1%
(not preserved) | 1195 | 2.6%
(not preserved) | 888 | 2.0%
(not preserved) | 828 | 1.8%
(not preserved) | 803 | 1.8%
(not preserved) | 765 | 1.7%
(not preserved) | 761 | 1.7%
(not preserved) | 673 | 1.5%
(not preserved) | 643 | 1.4%
Other values (677) | 33970 | 75.0%

Uppercase Letter
Value | Count | Frequency (%)
A | 46 | 21.3%
M | 29 | 13.4%
B | 22 | 10.2%
S | 16 | 7.4%
I | 16 | 7.4%
O | 14 | 6.5%
K | 10 | 4.6%
E | 9 | 4.2%
C | 8 | 3.7%
N | 8 | 3.7%
Other values (13) | 38 | 17.6%

Lowercase Letter
Value | Count | Frequency (%)
o | 26 | 13.4%
e | 22 | 11.3%
t | 19 | 9.8%
i | 17 | 8.8%
r | 14 | 7.2%
a | 14 | 7.2%
f | 13 | 6.7%
h | 12 | 6.2%
w | 11 | 5.7%
c | 10 | 5.2%
Other values (10) | 36 | 18.6%

Decimal Number
Value | Count | Frequency (%)
5 | 46 | 19.6%
2 | 41 | 17.4%
0 | 35 | 14.9%
4 | 27 | 11.5%
1 | 26 | 11.1%
6 | 18 | 7.7%
9 | 13 | 5.5%
7 | 11 | 4.7%
3 | 11 | 4.7%
8 | 7 | 3.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 641 | 85.4%
: | 49 | 6.5%
& | 27 | 3.6%
' | 24 | 3.2%
. | 5 | 0.7%
! | 4 | 0.5%
, | 1 | 0.1%

Math Symbol
Value | Count | Frequency (%)
> | 12 | 50.0%
< | 12 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1569 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 189 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 189 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 12 | 100.0%

Letter Number
Value | Count | Frequency (%)
(not preserved) | 7 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 45321 | 93.0%
Common | 2969 | 6.1%
Latin | 417 | 0.9%

Most frequent character per script

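Hangul
Value | Count | Frequency (%)
(not preserved) | 2503 | 5.5%
(not preserved) | 2292 | 5.1%
(not preserved) | 1195 | 2.6%
(not preserved) | 888 | 2.0%
(not preserved) | 828 | 1.8%
(not preserved) | 803 | 1.8%
(not preserved) | 765 | 1.7%
(not preserved) | 761 | 1.7%
(not preserved) | 673 | 1.5%
(not preserved) | 643 | 1.4%
Other values (677) | 33970 | 75.0%

Latin
Value | Count | Frequency (%)
A | 46 | 11.0%
M | 29 | 7.0%
o | 26 | 6.2%
B | 22 | 5.3%
e | 22 | 5.3%
t | 19 | 4.6%
i | 17 | 4.1%
S | 16 | 3.8%
I | 16 | 3.8%
O | 14 | 3.4%
Other values (34) | 190 | 45.6%

Common
Value | Count | Frequency (%)
(space) | 1569 | 52.8%
/ | 641 | 21.6%
( | 189 | 6.4%
) | 189 | 6.4%
: | 49 | 1.7%
5 | 46 | 1.5%
2 | 41 | 1.4%
0 | 35 | 1.2%
& | 27 | 0.9%
4 | 27 | 0.9%
Other values (13) | 156 | 5.3%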

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 45321 | 93.0%
ASCII | 3379 | 6.9%
Number Forms | 7 | < 0.1%

Most frequent character per block

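Hangul
Value | Count | Frequency (%)
(not preserved) | 2503 | 5.5%
(not preserved) | 2292 | 5.1%
(not preserved) | 1195 | 2.6%
(not preserved) | 888 | 2.0%
(not preserved) | 828 | 1.8%
(not preserved) | 803 | 1.8%
(not preserved) | 765 | 1.7%
(not preserved) | 761 | 1.7%
(not preserved) | 673 | 1.5%
(not preserved) | 643 | 1.4%
Other values (677) | 33970 | 75.0%

ASCII
Value | Count | Frequency (%)
(space) | 1569 | 46.4%
/ | 641 | 19.0%
( | 189 | 5.6%
) | 189 | 5.6%
: | 49 | 1.5%
5 | 46 | 1.4%
A | 46 | 1.4%
2 | 41 | 1.2%
0 | 35 | 1.0%
M | 29 | 0.9%
Other values (56) | 545 | 16.1%

Number Forms
Value | Count | Frequency (%)
(not preserved) | 7 | 100.0%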

관광상품분류
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 49.5 KiB
음식점: 2683
관광지: 1861
행사: 1310
숙소: 469

Length

Max length: 3
Median length: 3
Mean length: 2.7186462
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 행사
2nd row: 행사
3rd row: 관광지
4th row: 행사
5th row: 행사

Common Values

Value | Count | Frequency (%)
음식점 (restaurant) | 2683 | 42.4%
관광지 (tourist attraction) | 1861 | 29.4%
행사 (event) | 1310 | 20.7%
숙소 (accommodation) | 469 | 7.4%
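
The frequency column is each count divided by the number of observations. A minimal sketch that recomputes the percentages from the counts reported for 관광상품분류, rounding to one decimal place (the rounding rule is an assumption about how the report formats its figures):

```python
# Counts as reported for 관광상품분류.
counts = {"음식점": 2683, "관광지": 1861, "행사": 1310, "숙소": 469}

total = sum(counts.values())

# Share of each category as a percentage of all rows.
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

for label, share in shares.items():
    print(f"{label}: {share}%")
```

The counts add up to the 6323 observations listed in the overview, which is consistent with the variable having no missing values.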

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of the counts listed under Common Values]

수집원분류
Categorical

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 49.5 KiB
네이버블로그: 2986
인스타그램: 1850
티스토리블로그: 777
유튜브: 463
다음블로그: 247

Length

Max length: 7
Median length: 6
Mean length: 5.5715641
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 인스타그램
2nd row: 인스타그램
3rd row: 인스타그램
4th row: 인스타그램
5th row: 인스타그램

Common Values

Value | Count | Frequency (%)
네이버블로그 (Naver Blog) | 2986 | 47.2%
인스타그램 (Instagram) | 1850 | 29.3%
티스토리블로그 (Tistory blog) | 777 | 12.3%
유튜브 (YouTube) | 463 | 7.3%
다음블로그 (Daum blog) | 247 | 3.9%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of the counts listed under Common Values]

홈페이지주소(URL)
Text

Distinct: 6323
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 49.5 KiB

Length

Max length: 635
Median length: 350
Mean length: 52.13095
Min length: 26

Characters and Unicode

Total characters: 329624
Distinct characters: 71
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6323
Unique (%): 100.0%

Sample

1st row: https://www.instagram.com/p/B5CUQdEAiL6/
2nd row: https://www.instagram.com/p/B5CYU27lOMu/
3rd row: https://www.instagram.com/p/B5o5288FxpJ/
4th row: https://www.instagram.com/p/B5rfX9aAE7C/
5th row: https://www.instagram.com/p/B5t7IiylHri/

Value | Count | Frequency (%)
https://www.instagram.com/p/b5cuqdeail6 | 1 | < 0.1%
https://www.instagram.com/p/b59ag_ylpbe | 1 | < 0.1%
https://www.youtube.com/watch?v=dkl5lw8_zum | 1 | < 0.1%
https://www.youtube.com/watch?v=qqzhtcs8slu | 1 | < 0.1%
https://blog.naver.com/mammy200104?redirect=log&logno=221735027378 | 1 | < 0.1%
https://blog.naver.com/su_heeeeee?redirect=log&logno=221734416873 | 1 | < 0.1%
https://www.instagram.com/p/b5_gv9ygzbn | 1 | < 0.1%
https://blog.naver.com/strychinin?redirect=log&logno=221736348549 | 1 | < 0.1%
https://blog.naver.com/ouou1111?redirect=log&logno=221736578025 | 1 | < 0.1%
https://blog.naver.com/hotelraonstay?redirect=log&logno=221735372811 | 1 | < 0.1%
Other values (6313) | 6313 | 99.8%

Most occurring characters

Value | Count | Frequency (%)
/ | 23085 | 7.0%
o | 21670 | 6.6%
t | 21340 | 6.5%
. | 12649 | 3.8%
2 | 12426 | 3.8%
g | 12336 | 3.7%
e | 12103 | 3.7%
s | 10658 | 3.2%
c | 10467 | 3.2%
r | 10174 | 3.1%
Other values (61) | 182716 | 55.4%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 190204 | 57.7%
Decimal Number | 53702 | 16.3%
Other Punctuation | 52353 | 15.9%
Uppercase Letter | 25668 | 7.8%
Math Symbol | 6306 | 1.9%
Dash Punctuation | 813 | 0.2%
Connector Punctuation | 578 | 0.2%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
o | 21670 | 11.4%
t | 21340 | 11.2%
g | 12336 | 6.5%
e | 12103 | 6.4%
s | 10658 | 5.6%
c | 10467 | 5.5%
r | 10174 | 5.3%
a | 9725 | 5.1%
m | 9588 | 5.0%
p | 9414 | 4.9%
Other values (16) | 62729 | 33.0%

Uppercase Letter
Value | Count | Frequency (%)
R | 3454 | 13.5%
N | 3328 | 13.0%
L | 3293 | 12.8%
C | 2752 | 10.7%
B | 1960 | 7.6%
E | 1695 | 6.6%
A | 1409 | 5.5%
D | 671 | 2.6%
F | 622 | 2.4%
M | 580 | 2.3%
Other values (16) | 5904 | 23.0%

Decimal Number
Value | Count | Frequency (%)
2 | 12426 | 23.1%
1 | 6220 | 11.6%
0 | 4809 | 9.0%
8 | 4649 | 8.7%
3 | 4619 | 8.6%
9 | 4603 | 8.6%
4 | 4509 | 8.4%
7 | 4052 | 7.5%
6 | 3957 | 7.4%
5 | 3858 | 7.2%

Other Punctuation
Value | Count | Frequency (%)
/ | 23085 | 44.1%
. | 12649 | 24.2%
: | 6323 | 12.1%
% | 3990 | 7.6%
? | 3397 | 6.5%
& | 2909 | 5.6%

Math Symbol
Value | Count | Frequency (%)
= | 6306 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 813 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 578 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 215872 | 65.5%
Common | 113752 | 34.5%

Most frequent character per script

Latin
Value | Count | Frequency (%)
o | 21670 | 10.0%
t | 21340 | 9.9%
g | 12336 | 5.7%
e | 12103 | 5.6%
s | 10658 | 4.9%
c | 10467 | 4.8%
r | 10174 | 4.7%
a | 9725 | 4.5%
m | 9588 | 4.4%
p | 9414 | 4.4%
Other values (42) | 88397 | 40.9%

Common
Value | Count | Frequency (%)
/ | 23085 | 20.3%
. | 12649 | 11.1%
2 | 12426 | 10.9%
: | 6323 | 5.6%
= | 6306 | 5.5%
1 | 6220 | 5.5%
0 | 4809 | 4.2%
8 | 4649 | 4.1%
3 | 4619 | 4.1%
9 | 4603 | 4.0%
Other values (9) | 28063 | 24.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 329624 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
/ | 23085 | 7.0%
o | 21670 | 6.6%
t | 21340 | 6.5%
. | 12649 | 3.8%
2 | 12426 | 3.8%
g | 12336 | 3.7%
e | 12103 | 3.7%
s | 10658 | 3.2%
c | 10467 | 3.2%
r | 10174 | 3.1%
Other values (61) | 182716 | 55.4%

게시글작성일
DateTime

Distinct: 1542
Distinct (%): 24.4%
Missing: 0
Missing (%): 0.0%
Memory size: 49.5 KiB
Minimum: 2005-10-28 00:00:00
Maximum: 2021-12-17 00:00:00
Histogram with fixed size bins (bins=50)

Interactions


Correlations

 | 번호 | 관광상품분류 | 수집원분류
번호 | 1.000 | 0.974 | 0.573
관광상품분류 | 0.974 | 1.000 | 0.309
수집원분류 | 0.573 | 0.309 | 1.000

 | 수집원분류 | 관광상품분류
수집원분류 | 1.000 | 0.256
관광상품분류 | 0.256 | 1.000

 | 번호 | 관광상품분류 | 수집원분류
번호 | 1.000 | 0.919 | 0.273
관광상품분류 | 0.919 | 1.000 | 0.256
수집원분류 | 0.273 | 0.256 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 번호 | 관광상품명 | 관광상품분류 | 수집원분류 | 홈페이지주소(URL) | 게시글작성일
0 | 5005529 | 진주국제재즈페스티벌 | 행사 | 인스타그램 | https://www.instagram.com/p/B5CUQdEAiL6/ | 2019-11-19
1 | 5005530 | 진주국제재즈페스티벌 | 행사 | 인스타그램 | https://www.instagram.com/p/B5CYU27lOMu/ | 2019-11-19
2 | 5000270 | 경해여자고등학교 | 관광지 | 인스타그램 | https://www.instagram.com/p/B5o5288FxpJ/ | 2021-12-17
3 | 5005323 | 에나뮤직오픈마이크 | 행사 | 인스타그램 | https://www.instagram.com/p/B5rfX9aAE7C/ | 2021-12-15
4 | 5005540 | 진주국제재즈페스티벌 | 행사 | 인스타그램 | https://www.instagram.com/p/B5t7IiylHri/ | 2021-12-13
5 | 5004900 | 헤이데이 | 음식점 | 인스타그램 | https://www.instagram.com/p/B6UfGTZhlxi/ | 2021-12-03
6 | 5004901 | 헤이데이 | 음식점 | 인스타그램 | https://www.instagram.com/p/B6afex0hBIj/ | 2021-11-30
7 | 5001518 | 진주성 | 관광지 | 인스타그램 | https://www.instagram.com/p/B62h0jrFISu/ | 2021-11-25
8 | 5000954 | 진양호 | 관광지 | 인스타그램 | https://www.instagram.com/p/B645cSsl3PE/ | 2021-11-24
9 | 5004899 | 헤이데이 | 음식점 | 인스타그램 | https://www.instagram.com/p/B65cvWOBi7c/ | 2021-11-23

Last rows

Index | 번호 | 관광상품명 | 관광상품분류 | 수집원분류 | 홈페이지주소(URL) | 게시글작성일
6313 | 5000542 | 남강습지원 | 관광지 | 티스토리블로그 | https://neowind.tistory.com/224 | 2009-04-06
6314 | 5005957 | 진주남강유등축제 | 행사 | 티스토리블로그 | https://lalawin.tistory.com/399 | 2009-01-11
6315 | 5005067 | 국화작품전시회 | 행사 | 티스토리블로그 | https://heysukim114.tistory.com/470 | 2008-11-03
6316 | 5001630 | 진주남강유등축제 | 행사 | 티스토리블로그 | https://kimchi39.tistory.com/entry/jinju-korail-train | 2008-10-12
6317 | 5003702 | 안의갈비탕 | 음식점 | 티스토리블로그 | https://kimchi39.tistory.com/entry/jinju-anui | 2008-10-12
6318 | 5005952 | 진주남강유등축제 | 행사 | 티스토리블로그 | https://kimchi39.tistory.com/entry/jinju-namkang | 2008-10-12
6319 | 5006021 | 진주남강유등축제 | 행사 | 티스토리블로그 | https://gyeongnamtravel.tistory.com/entry/%EC%A7%84%EC%A3%BC-%EC%9C%A0%EB%93%B1%EC%B6%95%EC%A0%9C | 2008-09-29
6320 | 5006063 | 진주남강유등축제 | 행사 | 티스토리블로그 | https://5gangsan.tistory.com/entry/%EC%A7%84%EC%A3%BC-%EC%9C%A0%EB%93%B1%EC%B6%95%EC%A0%9C200610 | 2008-09-27
6321 | 5001835 | 촉석루 | 관광지 | 티스토리블로그 | https://talktravel.tistory.com/28283 | 2008-01-07
6322 | 5006031 | 진주남강유등축제 | 행사 | 티스토리블로그 | https://kym5219.tistory.com/7105 | 2005-10-28