Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 644.5 KiB
Average record size in memory: 66.0 B
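
The average record size follows from the total memory figure and the row count. A minimal arithmetic check, assuming 1 KiB = 1024 bytes (the helper name is illustrative and not part of the report):

```python
KIB = 1024  # bytes per KiB (assumption: binary kibibytes, as the unit suggests)


def average_record_size(total_kib: float, n_rows: int) -> float:
    """Return the mean number of bytes used per row."""
    return total_kib * KIB / n_rows


# Figures copied from the dataset statistics above.
avg_bytes = average_record_size(644.5, 10_000)
total_cells = 10_000 * 7  # observations x variables

print(f"{avg_bytes:.1f} B per record, {total_cells} cells in total")
```

With no missing cells, all 70,000 cells are populated.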

Variable types

Numeric: 2
Categorical: 2
Text: 2
DateTime: 1

Dataset

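Description: Data on tourism review posts for Miryang City, Gyeongsangnam-do. It provides the tourism product number (관광상품번호), collection source category (수집원분류), tourism product category (관광상품분류), tourism product name (관광상품명), and review post URL (후기글주소).
Author: Miryang City, Gyeongsangnam-do (경상남도 밀양시)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15111090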

Alerts

관광상품번호(NameID) is highly overall correlated with 관광상품분류 [High correlation]
관광상품분류 is highly overall correlated with 관광상품번호(NameID) [High correlation]
수집원분류 is highly imbalanced (69.2%) [Imbalance]
후기글번호(ReviewID) has unique values [Unique]
후기글주소 has unique values [Unique]
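
The report does not state how the 69.2% imbalance figure is computed. One common definition, 1 minus the Shannon entropy normalized by the log of the number of classes, comes out at about 69.2% for the 수집원분류 counts listed in this report. The sketch below assumes that definition; the function name is illustrative.

```python
import math


def imbalance_score(counts):
    """1 - normalized Shannon entropy: 0 means balanced, 1 means one class only.

    Assumed definition; the report itself does not document the formula.
    """
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(n_classes)


# Counts copied from the 수집원분류 section of this report.
source_counts = {"naver": 8611, "youtube": 994, "tistory": 371, "post": 16, "brunch": 8}
score = imbalance_score(list(source_counts.values()))
print(f"imbalance = {score:.1%}")
```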

Reproduction

Analysis started: 2023-12-11 00:23:09.087392
Analysis finished: 2023-12-11 00:23:10.573058
Duration: 1.49 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
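
A report of this kind can be regenerated along the following lines. This is a sketch under assumptions: the CSV path, output path, and report title are hypothetical, the 게시일 column name is taken from the Sample table, and the settings stored in config.json are not reproduced here.

```python
def build_report(csv_path: str, html_path: str) -> None:
    """Profile a CSV file and write an HTML report (sketch, default settings)."""
    # Imported lazily so this module still loads where the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Parse the posting date so it is profiled as a DateTime variable.
    df = pd.read_csv(csv_path, parse_dates=["게시일"])
    report = ProfileReport(df, title="Miryang tourism review posts")
    report.to_file(html_path)


# Example call with hypothetical file names:
# build_report("miryang_reviews.csv", "miryang_reviews_profile.html")
```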

Variables

후기글번호(ReviewID)
Real number (ℝ)

UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6005495.6
Minimum: 6000001
Maximum: 6010951
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 6000001
5-th percentile: 6000560
Q1: 6002767.5
Median: 6005515.5
Q3: 6008232.2
95-th percentile: 6010392
Maximum: 6010951
Range: 10950
Interquartile range (IQR): 5464.75

Descriptive statistics

Standard deviation: 3152.8114
Coefficient of variation (CV): 0.00052498771
Kurtosis: -1.1961418
Mean: 6005495.6
Median Absolute Deviation (MAD): 2731
Skewness: -0.0095368527
Sum: 6.0054956 × 10^10
Variance: 9940219.5
Monotonicity: Not monotonic
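
Several of these figures are tied together by simple identities (variance = standard deviation squared, CV = standard deviation / mean, range = maximum − minimum, sum = mean × count). A small consistency check over the reported values, with a tolerance to absorb display rounding; the dictionary keys are illustrative:

```python
import math

# Values copied from the 후기글번호(ReviewID) statistics above.
stats = {
    "n": 10_000,
    "mean": 6005495.6,
    "std": 3152.8114,
    "variance": 9940219.5,
    "cv": 0.00052498771,
    "min": 6000001,
    "max": 6010951,
    "range": 10950,
    "sum": 6.0054956e10,
}

checks = {
    "variance = std^2": math.isclose(stats["std"] ** 2, stats["variance"], rel_tol=1e-6),
    "cv = std / mean": math.isclose(stats["std"] / stats["mean"], stats["cv"], rel_tol=1e-6),
    "range = max - min": stats["max"] - stats["min"] == stats["range"],
    "sum = mean * n": math.isclose(stats["mean"] * stats["n"], stats["sum"], rel_tol=1e-6),
}

for name, ok in checks.items():
    print(f"{name}: {'ok' if ok else 'MISMATCH'}")
```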
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
6004121 | 1 | < 0.1%
6004317 | 1 | < 0.1%
6007374 | 1 | < 0.1%
6004173 | 1 | < 0.1%
6005794 | 1 | < 0.1%
6005414 | 1 | < 0.1%
6008644 | 1 | < 0.1%
6008016 | 1 | < 0.1%
6004056 | 1 | < 0.1%
6000442 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Value | Count | Frequency (%)
6000001 | 1 | < 0.1%
6000002 | 1 | < 0.1%
6000003 | 1 | < 0.1%
6000004 | 1 | < 0.1%
6000005 | 1 | < 0.1%
6000006 | 1 | < 0.1%
6000007 | 1 | < 0.1%
6000008 | 1 | < 0.1%
6000009 | 1 | < 0.1%
6000010 | 1 | < 0.1%

Value | Count | Frequency (%)
6010951 | 1 | < 0.1%
6010950 | 1 | < 0.1%
6010949 | 1 | < 0.1%
6010947 | 1 | < 0.1%
6010946 | 1 | < 0.1%
6010945 | 1 | < 0.1%
6010944 | 1 | < 0.1%
6010943 | 1 | < 0.1%
6010942 | 1 | < 0.1%
6010941 | 1 | < 0.1%
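
The range (10950) is larger than the number of distinct values minus one, so some IDs between the minimum and maximum do not occur in this data. A short sketch, assuming the IDs are consecutive integers by design, counts the gaps from the summary figures and locates the one gap visible among the ten largest values:

```python
def missing_ids(observed, lo, hi):
    """Return the integers in [lo, hi] that are absent from `observed`."""
    seen = set(observed)
    return [i for i in range(lo, hi + 1) if i not in seen]


# Whole column: only summary figures are available in the report.
id_min, id_max, n_distinct = 6000001, 6010951, 10_000
n_missing = (id_max - id_min + 1) - n_distinct

# Ten largest values as listed above.
largest = [6010951, 6010950, 6010949, 6010947, 6010946,
           6010945, 6010944, 6010943, 6010942, 6010941]
gaps_in_largest = missing_ids(largest, min(largest), max(largest))

print(n_missing, gaps_in_largest)
```

Whether the absent IDs were never issued or were dropped when the 10,000 rows were selected cannot be told from the report alone.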

관광상품번호(NameID)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 843
Distinct (%): 8.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2814107.1
Minimum: 1000001
Maximum: 4001470
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1000001
5-th percentile: 1000013
Q1: 1000097
Median: 3000135
Q3: 4000439
95-th percentile: 4001212
Maximum: 4001470
Range: 3001469
Interquartile range (IQR): 3000342

Descriptive statistics

Standard deviation: 1219936.5
Coefficient of variation (CV): 0.4335075
Kurtosis: -1.2838492
Mean: 2814107.1
Median Absolute Deviation (MAD): 1000413
Skewness: -0.56361014
Sum: 2.8141071 × 10^10
Variance: 1.4882452 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1000071 | 360 | 3.6%
1000009 | 237 | 2.4%
4000728 | 182 | 1.8%
1000063 | 130 | 1.3%
1000003 | 124 | 1.2%
4000353 | 102 | 1.0%
4000605 | 97 | 1.0%
3000144 | 95 | 0.9%
3000083 | 93 | 0.9%
3000093 | 90 | 0.9%
Other values (833) | 8490 | 84.9%

Value | Count | Frequency (%)
1000001 | 16 | 0.2%
1000002 | 10 | 0.1%
1000003 | 124 | 1.2%
1000004 | 8 | 0.1%
1000005 | 64 | 0.6%
1000006 | 8 | 0.1%
1000007 | 4 | < 0.1%
1000008 | 2 | < 0.1%
1000009 | 237 | 2.4%
1000010 | 2 | < 0.1%

Value | Count | Frequency (%)
4001470 | 5 | 0.1%
4001469 | 12 | 0.1%
4001468 | 13 | 0.1%
4001466 | 8 | 0.1%
4001463 | 6 | 0.1%
4001462 | 56 | 0.6%
4001461 | 2 | < 0.1%
4001460 | 2 | < 0.1%
4001451 | 2 | < 0.1%
4001446 | 1 | < 0.1%

수집원분류
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
naver | 8611
youtube | 994
tistory | 371
post | 16
brunch | 8

Length

Max length: 7
Median length: 5
Mean length: 5.2722
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: naver
2nd row: naver
3rd row: naver
4th row: naver
5th row: naver

Common Values

Value | Count | Frequency (%)
naver | 8611 | 86.1%
youtube | 994 | 9.9%
tistory | 371 | 3.7%
post | 16 | 0.2%
brunch | 8 | 0.1%

Length

Histogram of lengths of the category


관광상품분류
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
일반음식점 | 3713
숙박업소 | 3148
관광지 | 2790
모범음식점 | 177
문화축제 | 172

Length

Max length: 5
Median length: 4
Mean length: 4.11
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 숙박업소
2nd row: 모범음식점
3rd row: 일반음식점
4th row: 모범음식점
5th row: 숙박업소

Common Values

Value | Count | Frequency (%)
일반음식점 | 3713 | 37.1%
숙박업소 | 3148 | 31.5%
관광지 | 2790 | 27.9%
모범음식점 | 177 | 1.8%
문화축제 | 172 | 1.7%

Length

Histogram of lengths of the category


관광상품명
Text

Distinct: 843
Distinct (%): 8.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 30
Median length: 20
Mean length: 6.0411
Min length: 1
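
The reported median length sits well above the mean length, which is worth a sanity check. For any sample, at least half of the values are at or above the median and all are at or above the minimum, so the mean cannot be smaller than (median + minimum) / 2. The sketch below applies that bound to the Length figures of the two text variables in this report; the function and dictionary names are illustrative.

```python
def min_possible_mean(median: float, minimum: float) -> float:
    """Lower bound on the mean of a sample, given its median and minimum.

    At least half of the values are >= median and the rest are >= minimum.
    """
    return (median + minimum) / 2


# Length figures copied from this report.
reported = {
    "관광상품명": {"min": 1, "median": 20, "mean": 6.0411},
    "후기글주소": {"min": 19, "median": 78, "mean": 44.0185},
}

inconsistent = [
    name
    for name, s in reported.items()
    if s["mean"] < min_possible_mean(s["median"], s["min"])
]
print(inconsistent)
```

Both variables fail the bound, which suggests that "Median length" is not the median of the same length distribution as "Mean length". The report does not say how it is computed, so the mean (total characters divided by the row count) is the safer figure to rely on.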

Characters and Unicode

Total characters: 60411
Distinct characters: 565
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 235
Unique (%): 2.4%

Sample

1st row: 밀양아리랑오토캠핑장
2nd row: 삼문동면돈
3rd row: 카페평리
4th row: 삼문동면돈
5th row: 밀양 참좋은펜션

Value | Count | Frequency (%)
위양지 | 362 | 3.1%
트윈터널 | 237 | 2.0%
단골집 | 182 | 1.6%
키즈풀빌라 | 155 | 1.3%
풀빌라 | 148 | 1.3%
카페 | 143 | 1.2%
밀양점 | 140 | 1.2%
영남루 | 130 | 1.1%
표충사 | 129 | 1.1%
캠핑장 | 127 | 1.1%
Other values (900) | 9846 | 84.9%

Most occurring characters

ValueCountFrequency (%)
2162
 
3.6%
1863
 
3.1%
1666
 
2.8%
1599
 
2.6%
1386
 
2.3%
1335
 
2.2%
1183
 
2.0%
884
 
1.5%
822
 
1.4%
745
 
1.2%
Other values (555) 46766
77.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 55593 | 92.0%
Space Separator | 1599 | 2.6%
Lowercase Letter | 1239 | 2.1%
Decimal Number | 1168 | 1.9%
Uppercase Letter | 326 | 0.5%
Open Punctuation | 177 | 0.3%
Close Punctuation | 177 | 0.3%
Other Punctuation | 131 | 0.2%
Dash Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2162
 
3.9%
1863
 
3.4%
1666
 
3.0%
1386
 
2.5%
1335
 
2.4%
1183
 
2.1%
884
 
1.6%
822
 
1.5%
745
 
1.3%
738
 
1.3%
Other values (506) 42809
77.0%
Uppercase Letter
Value | Count | Frequency (%)
C | 93 | 28.5%
D | 82 | 25.2%
R | 63 | 19.3%
M | 18 | 5.5%
B | 16 | 4.9%
P | 11 | 3.4%
V | 11 | 3.4%
I | 11 | 3.4%
A | 9 | 2.8%
G | 3 | 0.9%
Other values (6) | 9 | 2.8%

Lowercase Letter
Value | Count | Frequency (%)
o | 230 | 18.6%
e | 218 | 17.6%
f | 202 | 16.3%
s | 138 | 11.1%
t | 120 | 9.7%
a | 110 | 8.9%
r | 93 | 7.5%
n | 34 | 2.7%
c | 33 | 2.7%
u | 21 | 1.7%
Other values (5) | 40 | 3.2%

Decimal Number
Value | Count | Frequency (%)
1 | 340 | 29.1%
9 | 289 | 24.7%
8 | 148 | 12.7%
4 | 132 | 11.3%
3 | 91 | 7.8%
0 | 78 | 6.7%
5 | 57 | 4.9%
2 | 20 | 1.7%
7 | 12 | 1.0%
6 | 1 | 0.1%

Other Punctuation
Value | Count | Frequency (%)
& | 93 | 71.0%
/ | 34 | 26.0%
: | 3 | 2.3%
, | 1 | 0.8%

Space Separator
Value | Count | Frequency (%)
(space) | 1599 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 177 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 177 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 55593 | 92.0%
Common | 3253 | 5.4%
Latin | 1565 | 2.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2162
 
3.9%
1863
 
3.4%
1666
 
3.0%
1386
 
2.5%
1335
 
2.4%
1183
 
2.1%
884
 
1.6%
822
 
1.5%
745
 
1.3%
738
 
1.3%
Other values (506) 42809
77.0%
Latin
Value | Count | Frequency (%)
o | 230 | 14.7%
e | 218 | 13.9%
f | 202 | 12.9%
s | 138 | 8.8%
t | 120 | 7.7%
a | 110 | 7.0%
C | 93 | 5.9%
r | 93 | 5.9%
D | 82 | 5.2%
R | 63 | 4.0%
Other values (21) | 216 | 13.8%

Common
Value | Count | Frequency (%)
(space) | 1599 | 49.2%
1 | 340 | 10.5%
9 | 289 | 8.9%
( | 177 | 5.4%
) | 177 | 5.4%
8 | 148 | 4.5%
4 | 132 | 4.1%
& | 93 | 2.9%
3 | 91 | 2.8%
0 | 78 | 2.4%
Other values (8) | 129 | 4.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 55593 | 92.0%
ASCII | 4818 | 8.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2162
 
3.9%
1863
 
3.4%
1666
 
3.0%
1386
 
2.5%
1335
 
2.4%
1183
 
2.1%
884
 
1.6%
822
 
1.5%
745
 
1.3%
738
 
1.3%
Other values (506) 42809
77.0%
ASCII
Value | Count | Frequency (%)
(space) | 1599 | 33.2%
1 | 340 | 7.1%
9 | 289 | 6.0%
o | 230 | 4.8%
e | 218 | 4.5%
f | 202 | 4.2%
( | 177 | 3.7%
) | 177 | 3.7%
8 | 148 | 3.1%
s | 138 | 2.9%
Other values (39) | 1300 | 27.0%

게시일
DateTime

Distinct: 1725
Distinct (%): 17.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2008-12-19 00:00:00
Maximum: 2022-09-22 00:00:00
Histogram with fixed size bins (bins=50)

후기글주소
Text

UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 80
Median length: 78
Mean length: 44.0185
Min length: 19

Characters and Unicode

Total characters: 440185
Distinct characters: 71
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10000
Unique (%): 100.0%

Sample

1st row: https:/blog.naver.com/keo32/222632119241
2nd row: https://blog.naver.com/eunju861110/221996725072
3rd row: https://blog.naver.com/anna901020/222799457241
4th row: https://blog.naver.com/wofldjaak/222066034208
5th row: https://blog.naver.com/jinhee_0809/221384225547

Value | Count | Frequency (%)
https:/blog.naver.com/keo32/222632119241 | 1 | < 0.1%
https://www.youtube.com/watch?v=vujbcnpgf48 | 1 | < 0.1%
https:/blog.naver.com/tcacyc/222723182025 | 1 | < 0.1%
https://blog.naver.com/dlaodwo/222527829170 | 1 | < 0.1%
https://www.youtube.com/watch?v=gxvvu2oxeza | 1 | < 0.1%
https:/blog.naver.com/mylovekoong/222639471053 | 1 | < 0.1%
https://blog.naver.com/phsun2025/222838150538 | 1 | < 0.1%
https://blog.naver.com/tnnli2/222827786573 | 1 | < 0.1%
https://blog.naver.com/withcool/222437701559 | 1 | < 0.1%
https:/blog.naver.com/c_r_e_e_p/222707049292 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%
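
Several of the listed URLs begin with `https:/` followed by a single slash, so they have no host component when parsed. A minimal sketch for flagging and repairing such values, assuming the single slash is a defect in the data rather than intentional; both function names are illustrative:

```python
import re
from urllib.parse import urlsplit


def has_host(url: str) -> bool:
    """True when the URL has an http(s) scheme and a non-empty host part."""
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def repair_scheme(url: str) -> str:
    """Turn 'http:/x' or 'https:/x' into 'http://x' or 'https://x'."""
    return re.sub(r"^(https?):/(?!/)", r"\1://", url)


# URLs copied from the table above.
urls = [
    "https:/blog.naver.com/keo32/222632119241",
    "https://blog.naver.com/eunju861110/221996725072",
    "https://www.youtube.com/watch?v=vujbcnpgf48",
]

malformed = [u for u in urls if not has_host(u)]
repaired = [repair_scheme(u) for u in urls]
print(malformed)
```

Well-formed URLs pass through `repair_scheme` unchanged, so it can be applied to the whole column.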

Most occurring characters

Value | Count | Frequency (%)
/ | 37982 | 8.6%
2 | 35120 | 8.0%
t | 24421 | 5.5%
o | 24285 | 5.5%
. | 20006 | 4.5%
s | 13886 | 3.2%
a | 13854 | 3.1%
e | 13720 | 3.1%
h | 13710 | 3.1%
n | 12811 | 2.9%
Other values (61) | 230390 | 52.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 236096 | 53.6%
Decimal Number | 128259 | 29.1%
Other Punctuation | 69022 | 15.7%
Uppercase Letter | 4560 | 1.0%
Math Symbol | 1026 | 0.2%
Connector Punctuation | 903 | 0.2%
Dash Punctuation | 319 | 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
t | 24421 | 10.3%
o | 24285 | 10.3%
s | 13886 | 5.9%
a | 13854 | 5.9%
e | 13720 | 5.8%
h | 13710 | 5.8%
n | 12811 | 5.4%
m | 12554 | 5.3%
c | 12338 | 5.2%
l | 12064 | 5.1%
Other values (16) | 82453 | 34.9%

Uppercase Letter
Value | Count | Frequency (%)
M | 240 | 5.3%
Q | 235 | 5.2%
A | 227 | 5.0%
Y | 226 | 5.0%
E | 217 | 4.8%
I | 217 | 4.8%
U | 205 | 4.5%
N | 190 | 4.2%
P | 174 | 3.8%
R | 169 | 3.7%
Other values (16) | 2460 | 53.9%

Decimal Number
Value | Count | Frequency (%)
2 | 35120 | 27.4%
1 | 12219 | 9.5%
0 | 11248 | 8.8%
7 | 10909 | 8.5%
8 | 10857 | 8.5%
6 | 10147 | 7.9%
4 | 9629 | 7.5%
3 | 9609 | 7.5%
5 | 9561 | 7.5%
9 | 8960 | 7.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 37982 | 55.0%
. | 20006 | 29.0%
: | 10000 | 14.5%
? | 1010 | 1.5%
& | 16 | < 0.1%
@ | 8 | < 0.1%

Math Symbol
Value | Count | Frequency (%)
= | 1026 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 903 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 319 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 240656 | 54.7%
Common | 199529 | 45.3%

Most frequent character per script

Latin
Value | Count | Frequency (%)
t | 24421 | 10.1%
o | 24285 | 10.1%
s | 13886 | 5.8%
a | 13854 | 5.8%
e | 13720 | 5.7%
h | 13710 | 5.7%
n | 12811 | 5.3%
m | 12554 | 5.2%
c | 12338 | 5.1%
l | 12064 | 5.0%
Other values (42) | 87013 | 36.2%

Common
Value | Count | Frequency (%)
/ | 37982 | 19.0%
2 | 35120 | 17.6%
. | 20006 | 10.0%
1 | 12219 | 6.1%
0 | 11248 | 5.6%
7 | 10909 | 5.5%
8 | 10857 | 5.4%
6 | 10147 | 5.1%
: | 10000 | 5.0%
4 | 9629 | 4.8%
Other values (9) | 31412 | 15.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 440185 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
/ | 37982 | 8.6%
2 | 35120 | 8.0%
t | 24421 | 5.5%
o | 24285 | 5.5%
. | 20006 | 4.5%
s | 13886 | 3.2%
a | 13854 | 3.1%
e | 13720 | 3.1%
h | 13710 | 3.1%
n | 12811 | 2.9%
Other values (61) | 230390 | 52.3%

Interactions


Correlations

Variable | 후기글번호(ReviewID) | 관광상품번호(NameID) | 수집원분류 | 관광상품분류
후기글번호(ReviewID) | 1.000 | 0.536 | 0.662 | 0.628
관광상품번호(NameID) | 0.536 | 1.000 | 0.199 | 1.000
수집원분류 | 0.662 | 0.199 | 1.000 | 0.367
관광상품분류 | 0.628 | 1.000 | 0.367 | 1.000

Variable | 관광상품분류 | 수집원분류
관광상품분류 | 1.000 | 0.144
수집원분류 | 0.144 | 1.000

Variable | 후기글번호(ReviewID) | 관광상품번호(NameID) | 수집원분류 | 관광상품분류
후기글번호(ReviewID) | 1.000 | 0.187 | 0.334 | 0.310
관광상품번호(NameID) | 0.187 | 1.000 | 0.166 | 1.000
수집원분류 | 0.334 | 0.166 | 1.000 | 0.144
관광상품분류 | 0.310 | 1.000 | 0.144 | 1.000
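
One possible reading of the 1.000 association between 관광상품번호(NameID) and 관광상품분류 is that the leading digit of the ID encodes the category. The sketch below checks this only against ten (NameID, category) pairs copied from the Sample rows of this report, so it is an illustration rather than a finding about the full dataset; the function name is illustrative.

```python
from collections import defaultdict

# (NameID, category) pairs copied from the Sample section of this report.
sample_pairs = [
    (3000083, "숙박업소"), (4001179, "모범음식점"), (4000252, "일반음식점"),
    (3000208, "숙박업소"), (1000071, "관광지"), (3000199, "숙박업소"),
    (4000263, "일반음식점"), (1000014, "관광지"), (4001446, "일반음식점"),
    (1000096, "관광지"),
]


def prefixes_by_category(pairs):
    """Map each category to the set of leading NameID digits seen with it."""
    seen = defaultdict(set)
    for name_id, category in pairs:
        seen[category].add(str(name_id)[0])
    return dict(seen)


prefixes = prefixes_by_category(sample_pairs)
for category, digits in prefixes.items():
    print(category, sorted(digits))
```

In these rows every category maps to a single leading digit, while the digit 4 is shared by 일반음식점 and 모범음식점, so the category determines the prefix but not the other way round.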

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index) | 후기글번호(ReviewID) | 관광상품번호(NameID) | 수집원분류 | 관광상품분류 | 관광상품명 | 게시일 | 후기글주소
4120 | 6004121 | 3000083 | naver | 숙박업소 | 밀양아리랑오토캠핑장 | 2022-01-26 | https:/blog.naver.com/keo32/222632119241
9870 | 6009871 | 4001179 | naver | 모범음식점 | 삼문동면돈 | 2020-06-10 | https://blog.naver.com/eunju861110/221996725072
3494 | 6003495 | 4000252 | naver | 일반음식점 | 카페평리 | 2022-07-05 | https://blog.naver.com/anna901020/222799457241
9857 | 6009858 | 4001179 | naver | 모범음식점 | 삼문동면돈 | 2020-08-20 | https://blog.naver.com/wofldjaak/222066034208
4925 | 6004926 | 3000208 | naver | 숙박업소 | 밀양 참좋은펜션 | 2018-10-24 | https://blog.naver.com/jinhee_0809/221384225547
1086 | 6001087 | 1000071 | naver | 관광지 | 위양지 | 2022-06-17 | https://blog.naver.com/rin9041/222777797221
2890 | 6002891 | 3000199 | naver | 숙박업소 | 시운재 | 2022-06-27 | https://blog.naver.com/mate13717/222790982498
6304 | 6006305 | 4000263 | naver | 일반음식점 | 밀양189 | 2022-08-17 | https://blog.naver.com/wooahsalang/222850671791
10458 | 6010459 | 3000115 | naver | 숙박업소 | 애플오토 캠핑장 | 2019-09-15 | https://blog.naver.com/ji9803/221648735165
2848 | 6002849 | 3000181 | naver | 숙박업소 | 다온펜션 | 2021-11-02 | https://blog.naver.com/hamoni0598/222555973466

(index) | 후기글번호(ReviewID) | 관광상품번호(NameID) | 수집원분류 | 관광상품분류 | 관광상품명 | 게시일 | 후기글주소
8299 | 6008300 | 4000728 | naver | 일반음식점 | 단골집 | 2016-04-23 | https://blog.naver.com/gasinae00/220690929824
6962 | 6006963 | 3000144 | naver | 숙박업소 | 밀양로그펜션 | 2022-09-05 | https://blog.naver.com/kimay3402/222867299713
49 | 6000050 | 4000548 | naver | 일반음식점 | 가마솥에누룽지 | 2022-05-14 | https://blog.naver.com/gabuki6611/222732527896
10550 | 6010551 | 3000117 | naver | 숙박업소 | 물안개 오토캠핑장 | 2021-08-02 | https://blog.naver.com/hayarn61/222454030029
3460 | 6003461 | 4000232 | naver | 일반음식점 | 카페에요 | 2022-07-13 | https://blog.naver.com/tlsgid115/222810231964
8980 | 6008981 | 1000014 | tistory | 관광지 | 구만산 | 2022-05-21 | http://blog.daum.net/canmore/911
5220 | 6005221 | 3000093 | naver | 숙박업소 | 미르캠핑장 | 2022-07-19 | https://blog.naver.com/dsjean2/222816818901
2923 | 6002924 | 4000605 | naver | 일반음식점 | 1919봄 | 2021-08-06 | https://blog.naver.com/asdcindy/222458617960
6020 | 6006021 | 1000096 | naver | 관광지 | 단장유원지 | 2022-08-06 | https://blog.naver.com/tmfrl6514/222841023628
8100 | 6008101 | 4001446 | youtube | 일반음식점 | 안태칼국수 | 2021-05-04 | https://www.youtube.com/watch?v=bh6hS3wq-mY