Overview

Dataset statistics

Number of variables: 5
Number of observations: 60
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.6 KiB
Average record size in memory: 44.2 B

Variable types

Numeric: 2
Categorical: 2
Text: 1

Dataset

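Description: Popular-tag data from the Jeju tourism information system (VISITJEJU), providing the serial number, language, tag name, sort order, creation date, and related fields.
Author: Jeju Tourism Organization (제주관광공사)
URL: https://www.data.go.kr/data/15049992/fileData.do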

Alerts

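등록일시 (registration date) has constant value "2024-03-08" [Constant]
일련번호 (serial number) is highly overall correlated with 언어 (language) [High correlation]
언어 (language) is highly overall correlated with 일련번호 (serial number) [High correlation]
일련번호 (serial number) has unique values [Unique]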
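
The Constant and Unique alerts can be illustrated with a minimal sketch in plain Python. The column contents below are rebuilt from the summary statistics in this report, on the assumption that the 60 serial numbers are the consecutive integers 111880 to 111939 and that each sort-order value 1 to 10 appears six times. The helper names `constant_columns` and `unique_columns` are hypothetical and are not part of ydata-profiling.

```python
# Columns rebuilt from the report's summary statistics (assumption: the 60
# distinct serial numbers are the consecutive integers 111880..111939).
columns = {
    "일련번호": list(range(111880, 111940)),
    "정렬순서": [rank for _ in range(6) for rank in range(1, 11)],
    "등록일시": ["2024-03-08"] * 60,
}


def constant_columns(cols):
    """Return the names of columns whose values are all identical."""
    return [name for name, values in cols.items() if len(set(values)) == 1]


def unique_columns(cols):
    """Return the names of columns in which no value repeats."""
    return [
        name for name, values in cols.items() if len(set(values)) == len(values)
    ]


print(constant_columns(columns))  # ['등록일시']
print(unique_columns(columns))    # ['일련번호']
```

정렬순서 appears in neither list because it has 10 distinct values across 60 rows.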

Reproduction

Analysis started: 2024-03-23 05:38:04.377155
Analysis finished: 2024-03-23 05:38:07.077032
Duration: 2.7 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
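
A report like this one could be regenerated along the following lines. This is a sketch under stated assumptions, not the configuration actually used: the CSV path, output path, and report title are hypothetical, the file encoding is not specified in this report, and the `dataset` keys are assumed to match the metadata fields accepted by ydata-profiling 4.x.

```python
# Metadata shown in the "Dataset" section of this report.
DATASET_META = {
    "description": "Popular-tag data from the Jeju tourism information system (VISITJEJU).",
    "author": "제주관광공사",
    "url": "https://www.data.go.kr/data/15049992/fileData.do",
}


def build_report(csv_path, output_path="report.html"):
    """Profile the CSV at csv_path and write an HTML report.

    Both paths are hypothetical; the original file name is not given here.
    """
    # Imported lazily so the metadata above can be inspected even when the
    # third-party packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)  # an explicit encoding may be needed
    report = ProfileReport(
        df,
        title="VISITJEJU popular tags",  # hypothetical title
        dataset=DATASET_META,
    )
    report.to_file(output_path)
    return output_path
```

Calling `build_report("popular_tags.csv")` with a local copy of the data would write `report.html` next to the script.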

Variables

일련번호 (serial number)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 60
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 111909.5
Minimum: 111880
Maximum: 111939
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 672.0 B

Quantile statistics

Minimum: 111880
5-th percentile: 111882.95
Q1: 111894.75
Median: 111909.5
Q3: 111924.25
95-th percentile: 111936.05
Maximum: 111939
Range: 59
Interquartile range (IQR): 29.5
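
The fractional percentiles above are consistent with linear interpolation between neighbouring ranks, which is the default in NumPy and pandas. The sketch below reproduces them in plain Python, assuming the column holds the consecutive integers 111880 to 111939. The `quantile` helper is illustrative and is not the profiler's own code.

```python
import math


def quantile(sorted_values, q):
    """Quantile of pre-sorted data by linear interpolation between ranks."""
    pos = q * (len(sorted_values) - 1)
    lower = math.floor(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


# Assumption: the 60 serial numbers are consecutive integers.
serials = list(range(111880, 111940))

p05 = quantile(serials, 0.05)
q1 = quantile(serials, 0.25)
median = quantile(serials, 0.50)
q3 = quantile(serials, 0.75)
p95 = quantile(serials, 0.95)
iqr = q3 - q1

print(round(p05, 2), q1, median, q3, round(p95, 2), iqr)
```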

Descriptive statistics

Standard deviation: 17.464249
Coefficient of variation (CV): 0.0001560569
Kurtosis: -1.2
Mean: 111909.5
Median Absolute Deviation (MAD): 15
Skewness: 0
Sum: 6714570
Variance: 305
Monotonicity: Strictly increasing
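
Most of these figures can be checked with the standard library. The sketch assumes, as the range and distinct count imply, that the column holds the consecutive integers 111880 to 111939. Kurtosis and skewness are left out because the estimator used by the profiler is not stated in this report.

```python
import statistics

# Assumption: the 60 serial numbers are consecutive integers.
serials = list(range(111880, 111940))

mean = statistics.mean(serials)
variance = statistics.variance(serials)  # sample variance, divides by n - 1
std = statistics.stdev(serials)
cv = std / mean                          # coefficient of variation
median = statistics.median(serials)
mad = statistics.median(abs(x - median) for x in serials)
total = sum(serials)
strictly_increasing = all(a < b for a, b in zip(serials, serials[1:]))

print(mean, variance, round(std, 6), total, mad, strictly_increasing)
```

The variance of 305 only matches when dividing by n - 1, which suggests the report uses the sample variance.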
Histogram with fixed size bins (bins=50)
Common values

Value              Count  Frequency (%)
111880             1      1.7%
111911             1      1.7%
111913             1      1.7%
111914             1      1.7%
111915             1      1.7%
111916             1      1.7%
111917             1      1.7%
111918             1      1.7%
111919             1      1.7%
111920             1      1.7%
Other values (50)  50     83.3%

Minimum 10 values

Value   Count  Frequency (%)
111880  1      1.7%
111881  1      1.7%
111882  1      1.7%
111883  1      1.7%
111884  1      1.7%
111885  1      1.7%
111886  1      1.7%
111887  1      1.7%
111888  1      1.7%
111889  1      1.7%

Maximum 10 values

Value   Count  Frequency (%)
111939  1      1.7%
111938  1      1.7%
111937  1      1.7%
111936  1      1.7%
111935  1      1.7%
111934  1      1.7%
111933  1      1.7%
111932  1      1.7%
111931  1      1.7%
111930  1      1.7%

언어 (language)
Categorical

HIGH CORRELATION

Distinct: 6
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Memory size: 612.0 B

국문 (Korean): 10
영문 (English): 10
중문간체 (Simplified Chinese): 10
일문 (Japanese): 10
중문번체 (Traditional Chinese): 10

Length

Max length: 4
Median length: 3
Mean length: 3
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 국문
2nd row: 국문
3rd row: 국문
4th row: 국문
5th row: 국문

Common Values

Value                          Count  Frequency (%)
국문 (Korean)                   10     16.7%
영문 (English)                  10     16.7%
중문간체 (Simplified Chinese)    10     16.7%
일문 (Japanese)                 10     16.7%
중문번체 (Traditional Chinese)   10     16.7%
말레이문 (Malay)                 10     16.7%
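
The distinct percentage and the per-category frequencies follow directly from the counts. The sketch below is a minimal illustration with `collections.Counter`. The row order of the rebuilt column is an assumption and does not affect the result.

```python
from collections import Counter

languages = ["국문", "영문", "중문간체", "일문", "중문번체", "말레이문"]
# Rebuilt column: ten rows per language (row order is assumed).
column = [lang for lang in languages for _ in range(10)]

counts = Counter(column)
distinct_pct = 100 * len(counts) / len(column)
freq_pct = {lang: round(100 * n / len(column), 1) for lang, n in counts.items()}

print(distinct_pct)      # 10.0
print(freq_pct["국문"])  # 16.7
```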

Length

Histogram of lengths of the category


태그명 (tag name)
Text

Distinct: 45
Distinct (%): 75.0%
Missing: 0
Missing (%): 0.0%
Memory size: 612.0 B

Length

Max length: 16
Median length: 9.5
Mean length: 4.6166667
Min length: 2

Characters and Unicode

Total characters: 277
Distinct characters: 121
Distinct categories: 8
Distinct scripts: 5
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
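
As an illustration of such character properties, the sketch below counts Unicode general categories for three tag names taken from the sample rows of this report, using the standard `unicodedata` module. The standard library has no script property, so the first word of each letter's Unicode name is used here as a rough stand-in. That heuristic is an assumption of this sketch and not how the profiler derives scripts.

```python
import unicodedata
from collections import Counter

# Tag names copied from the sample rows of this report.
tags = ["걷기/등산", "반려동물동반_관광지", "On Foot"]

# General category of every character, e.g. "Lo" (Other Letter),
# "Ll" (Lowercase Letter), "Zs" (Space Separator).
category_counts = Counter(
    unicodedata.category(ch) for tag in tags for ch in tag
)

# Rough script proxy for letters only: first word of the Unicode name.
script_counts = Counter(
    unicodedata.name(ch).split()[0]
    for tag in tags
    for ch in tag
    if ch.isalpha()
)

print(dict(category_counts))
print(dict(script_counts))
```

Hangul syllables fall under "Lo" (Other Letter), which is why that category dominates the tables that follow.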

Unique

Unique: 38
Unique (%): 63.3%

Sample

1st row: 자연경관
2nd row: 무장애관광
3rd row: 커플
4th row: 혼자
5th row: 걷기/등산

Words

Value              Count  Frequency (%)
모먼츠인제주        6      9.4%
여름               5      7.8%
아이와바다          3      4.7%
방학               3      4.7%
체험               2      3.1%
친구               2      3.1%
가족여행            2      3.1%
韓流推薦地          1      1.6%
겨울               1      1.6%
제주도해변          1      1.6%
Other values (38)  38     59.4%

Most occurring characters

Value               Count  Frequency (%)
(glyph lost)        11     4.0%
(glyph lost)        11     4.0%
(space)             8      2.9%
e                   7      2.5%
(glyph lost)        7      2.5%
(glyph lost)        7      2.5%
o                   7      2.5%
a                   7      2.5%
(glyph lost)        6      2.2%
s                   6      2.2%
Other values (111)  200    72.2%

Most occurring categories

Value                  Count  Frequency (%)
Other Letter           186    67.1%
Lowercase Letter       63     22.7%
Uppercase Letter       9      3.2%
Space Separator        8      2.9%
Other Punctuation      7      2.5%
Decimal Number         2      0.7%
Connector Punctuation  1      0.4%
Dash Punctuation       1      0.4%

Most frequent character per category

Other Letter

Value              Count  Frequency (%)
(glyph lost)       11     5.9%
(glyph lost)       11     5.9%
(glyph lost)       7      3.8%
(glyph lost)       7      3.8%
(glyph lost)       6      3.2%
(glyph lost)       6      3.2%
(glyph lost)       6      3.2%
(glyph lost)       6      3.2%
(glyph lost)       5      2.7%
(glyph lost)       5      2.7%
Other values (81)  116    62.4%

Lowercase Letter

Value             Count  Frequency (%)
e                 7      11.1%
o                 7      11.1%
a                 7      11.1%
s                 6      9.5%
n                 5      7.9%
d                 4      6.3%
t                 4      6.3%
r                 4      6.3%
l                 4      6.3%
c                 3      4.8%
Other values (7)  12     19.0%

Uppercase Letter

Value  Count  Frequency (%)
C      3      33.3%
O      2      22.2%
F      2      22.2%
L      1      11.1%
K      1      11.1%

Other Punctuation

Value  Count  Frequency (%)
/      3      42.9%
?      3      42.9%
.      1      14.3%

Decimal Number

Value  Count  Frequency (%)
1      1      50.0%
2      1      50.0%

Space Separator

Value    Count  Frequency (%)
(space)  8      100.0%

Connector Punctuation

Value  Count  Frequency (%)
_      1      100.0%

Dash Punctuation

Value  Count  Frequency (%)
-      1      100.0%

Most occurring scripts

Value     Count  Frequency (%)
Hangul    159    57.4%
Latin     72     26.0%
Common    19     6.9%
Han       17     6.1%
Katakana  10     3.6%

Most frequent character per script

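Hangul

Value              Count  Frequency (%)
(glyph lost)       11     6.9%
(glyph lost)       11     6.9%
(glyph lost)       7      4.4%
(glyph lost)       7      4.4%
(glyph lost)       6      3.8%
(glyph lost)       6      3.8%
(glyph lost)       6      3.8%
(glyph lost)       6      3.8%
(glyph lost)       5      3.1%
(glyph lost)       5      3.1%
Other values (56)  89     56.0%

Latin

Value              Count  Frequency (%)
e                  7      9.7%
o                  7      9.7%
a                  7      9.7%
s                  6      8.3%
n                  5      6.9%
d                  4      5.6%
t                  4      5.6%
r                  4      5.6%
l                  4      5.6%
c                  3      4.2%
Other values (12)  21     29.2%

Han

Value             Count  Frequency (%)
(glyph lost)      2      11.8%
(glyph lost)      1      5.9%
(glyph lost)      1      5.9%
(glyph lost)      1      5.9%
(glyph lost)      1      5.9%
(glyph lost)      1      5.9%
(glyph lost)      1      5.9%
(glyph lost)      1      5.9%
(glyph lost)      1      5.9%
(glyph lost)      1      5.9%
Other values (6)  6      35.3%

Katakana

Value         Count  Frequency (%)
(glyph lost)  2      20.0%
(glyph lost)  1      10.0%
(glyph lost)  1      10.0%
(glyph lost)  1      10.0%
(glyph lost)  1      10.0%
(glyph lost)  1      10.0%
(glyph lost)  1      10.0%
(glyph lost)  1      10.0%
(glyph lost)  1      10.0%

Common

Value    Count  Frequency (%)
(space)  8      42.1%
/        3      15.8%
?        3      15.8%
1        1      5.3%
2        1      5.3%
.        1      5.3%
_        1      5.3%
-        1      5.3%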

Most occurring blocks

Value     Count  Frequency (%)
Hangul    159    57.4%
ASCII     91     32.9%
CJK       17     6.1%
Katakana  10     3.6%

Most frequent character per block

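Hangul

Value              Count  Frequency (%)
(glyph lost)       11     6.9%
(glyph lost)       11     6.9%
(glyph lost)       7      4.4%
(glyph lost)       7      4.4%
(glyph lost)       6      3.8%
(glyph lost)       6      3.8%
(glyph lost)       6      3.8%
(glyph lost)       6      3.8%
(glyph lost)       5      3.1%
(glyph lost)       5      3.1%
Other values (56)  89     56.0%

ASCII

Value              Count  Frequency (%)
(space)            8      8.8%
e                  7      7.7%
o                  7      7.7%
a                  7      7.7%
s                  6      6.6%
n                  5      5.5%
d                  4      4.4%
t                  4      4.4%
r                  4      4.4%
l                  4      4.4%
Other values (20)  35     38.5%

CJK

Value             Count  Frequency (%)
(glyph lost)      2      11.8%
(glyph lost)      1      5.9%
(glyph lost)      1      5.9%
(glyph lost)      1      5.9%
(glyph lost)      1      5.9%
(glyph lost)      1      5.9%
(glyph lost)      1      5.9%
(glyph lost)      1      5.9%
(glyph lost)      1      5.9%
(glyph lost)      1      5.9%
Other values (6)  6      35.3%

Katakana

Value         Count  Frequency (%)
(glyph lost)  2      20.0%
(glyph lost)  1      10.0%
(glyph lost)  1      10.0%
(glyph lost)  1      10.0%
(glyph lost)  1      10.0%
(glyph lost)  1      10.0%
(glyph lost)  1      10.0%
(glyph lost)  1      10.0%
(glyph lost)  1      10.0%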

정렬순서 (sort order)
Real number (ℝ)

Distinct: 10
Distinct (%): 16.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.5
Minimum: 1
Maximum: 10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 672.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 5.5
Q3: 8
95-th percentile: 10
Maximum: 10
Range: 9
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.8965204
Coefficient of variation (CV): 0.52664008
Kurtosis: -1.225665
Mean: 5.5
Median Absolute Deviation (MAD): 2.5
Skewness: 0
Sum: 330
Variance: 8.3898305
Monotonicity: Not monotonic
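
The variance reported here is slightly larger than the 8.25 one might expect for the values 1 to 10. A short check suggests the difference comes from dividing by n - 1 rather than n. The sketch assumes each value 1 to 10 occurs six times, as the frequency table for this variable shows.

```python
import statistics

# Each sort-order value 1..10 appears six times (60 rows in total).
sort_order = [rank for _ in range(6) for rank in range(1, 11)]

pop_var = statistics.pvariance(sort_order)    # divides by n
sample_var = statistics.variance(sort_order)  # divides by n - 1
sample_std = statistics.stdev(sort_order)
cv = sample_std / statistics.mean(sort_order)

print(pop_var, round(sample_var, 7), round(sample_std, 7), round(cv, 8))
```

Only the sample versions match the reported variance, standard deviation, and coefficient of variation.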
Histogram with fixed size bins (bins=10)
Common values

Value  Count  Frequency (%)
1      6      10.0%
2      6      10.0%
3      6      10.0%
4      6      10.0%
5      6      10.0%
6      6      10.0%
7      6      10.0%
8      6      10.0%
9      6      10.0%
10     6      10.0%

Minimum 10 values

Value  Count  Frequency (%)
1      6      10.0%
2      6      10.0%
3      6      10.0%
4      6      10.0%
5      6      10.0%
6      6      10.0%
7      6      10.0%
8      6      10.0%
9      6      10.0%
10     6      10.0%

Maximum 10 values

Value  Count  Frequency (%)
10     6      10.0%
9      6      10.0%
8      6      10.0%
7      6      10.0%
6      6      10.0%
5      6      10.0%
4      6      10.0%
3      6      10.0%
2      6      10.0%
1      6      10.0%

등록일시 (registration date)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 612.0 B

2024-03-08: 60

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2024-03-08
2nd row: 2024-03-08
3rd row: 2024-03-08
4th row: 2024-03-08
5th row: 2024-03-08

Common Values

Value       Count  Frequency (%)
2024-03-08  60     100.0%

Length

Histogram of lengths of the category


Interactions

[Four interaction plots; images not preserved.]

Correlations

[Correlation heatmap; image not preserved.]

           일련번호  언어    태그명  정렬순서
일련번호    1.000    0.947  0.669  0.000
언어        0.947    1.000  0.000  0.000
태그명      0.669    0.000  1.000  0.580
정렬순서    0.000    0.000  0.580  1.000

[Correlation heatmap; image not preserved.]

           일련번호  정렬순서  언어
일련번호    1.000    0.166    0.832
정렬순서    0.166    1.000    0.000
언어        0.832    0.000    1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Row  일련번호  언어  태그명              정렬순서  등록일시
0    111880   국문  자연경관            1        2024-03-08
1    111881   국문  무장애관광          2        2024-03-08
2    111882   국문  커플                3        2024-03-08
3    111883   국문  혼자                4        2024-03-08
4    111884   국문  걷기/등산           5        2024-03-08
5    111885   국문  경관/포토           6        2024-03-08
6    111886   국문  아이                7        2024-03-08
7    111887   국문  반려동물동반_관광지  8        2024-03-08
8    111888   국문  친구                9        2024-03-08
9    111889   국문  테마공원            10       2024-03-08

Last rows

Row  일련번호  언어      태그명        정렬순서  등록일시
50   111930   말레이문  love          1        2024-03-08
51   111931   말레이문  Friends       2        2024-03-08
52   111932   말레이문  On Foot       3        2024-03-08
53   111933   말레이문  Couples       4        2024-03-08
54   111934   말레이문  Cafe          5        2024-03-08
55   111935   말레이문  모먼츠인제주   6        2024-03-08
56   111936   말레이문  여름          7        2024-03-08
57   111937   말레이문  애인          8        2024-03-08
58   111938   말레이문  Overcast      9        2024-03-08
59   111939   말레이문  친구          10       2024-03-08