Overview

Dataset statistics

Number of variables: 6
Number of observations: 71
Missing cells: 8
Missing cells (%): 1.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.7 KiB
Average record size in memory: 52.9 B
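
The missing-cell percentage can be checked by hand from the counts above. The snippet below is a minimal sketch of that arithmetic, assuming the percentage is taken over all cells (variables × observations); the variable names are illustrative and not part of the report.

```python
# Counts copied from the "Dataset statistics" table.
n_variables = 6
n_observations = 71
missing_cells = 8

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(total_cells)
print(round(missing_pct, 1))
```

Under that assumption the rounded result agrees with the 1.9% reported above.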

Variable types

Numeric: 3
Categorical: 1
Text: 1
DateTime: 1

Dataset

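Description: Popular search term data from the Jeju tourism information system (VISITJEJU), providing the serial number (일련번호), language (언어), sort order (정렬순서), word (단어), creation date (생성일), previous rank (기존순위), and related fields.
Author: Jeju Tourism Organization (제주관광공사)
URL: https://www.data.go.kr/data/15049993/fileData.do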

Alerts

일련번호 is highly overall correlated with 언어 (High correlation)
정렬순서 is highly overall correlated with 기존순위 (High correlation)
기존순위 is highly overall correlated with 정렬순서 (High correlation)
언어 is highly overall correlated with 일련번호 (High correlation)
기존순위 has 8 (11.3%) missing values (Missing)
일련번호 has unique values (Unique)

Reproduction

Analysis started: 2024-03-23 06:54:30.650397
Analysis finished: 2024-03-23 06:54:35.217026
Duration: 4.57 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
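
A report of this kind can be regenerated with ydata-profiling. The following is a minimal sketch, not the exact invocation used for this run: the function name `build_report`, the CSV path, the output file name, the report title, and the file encoding are all assumptions, and the configuration in config.json is not reproduced here.

```python
from pathlib import Path


def build_report(csv_path, html_path="report.html", encoding="utf-8"):
    """Profile a CSV export of the dataset and write an HTML report.

    csv_path, html_path and encoding are hypothetical defaults; adjust them
    to match the file actually downloaded from the portal.
    """
    # Third-party dependencies are imported lazily so that this module can
    # be loaded even when they are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the date column is named 등록일시, as in the sample rows.
    df = pd.read_csv(csv_path, encoding=encoding, parse_dates=["등록일시"])
    report = ProfileReport(df, title="VISITJEJU popular search terms")
    report.to_file(html_path)
    return Path(html_path)
```

Calling `build_report("popular_keywords.csv")` with a real file path would write the HTML report next to the script; the file name here is only a placeholder.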

Variables

일련번호
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 71
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 68476.07
Minimum: 101
Maximum: 112832
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 771.0 B

Quantile statistics

Minimum: 101
5th percentile: 104.5
Q1: 558.5
Median: 112780
Q3: 112813.5
95th percentile: 112828.5
Maximum: 112832
Range: 112731
Interquartile range (IQR): 112255

Descriptive statistics

Standard deviation: 55327.066
Coefficient of variation (CV): 0.80797665
Kurtosis: -1.8580404
Mean: 68476.07
Median Absolute Deviation (MAD): 44
Skewness: -0.44169858
Sum: 4861801
Variance: 3.0610842 × 10^9
Monotonicity: Strictly increasing
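
Several of the figures for 일련번호 are derived from one another, which makes them easy to cross-check. The sketch below recomputes the coefficient of variation, variance, IQR, range, and mean from the other reported values, using the standard definitions (CV = standard deviation / mean, variance = standard deviation squared); the variable names are illustrative.

```python
import math

# Values copied from the statistics reported for 일련번호.
mean = 68476.07
std = 55327.066
q1, q3 = 558.5, 112813.5
minimum, maximum = 101, 112832
total, n = 4861801, 71

cv = std / mean              # coefficient of variation
variance = std ** 2          # variance as the square of the standard deviation
iqr = q3 - q1                # interquartile range
value_range = maximum - minimum
mean_from_sum = total / n    # mean recomputed from the reported sum

# Small differences are expected because the inputs are already rounded.
print(cv, variance, iqr, value_range, mean_from_sum)
print(math.isclose(cv, 0.80797665, rel_tol=1e-5))
```

The large gap between the mean and the median (112780) reflects the two clusters of serial numbers visible in the value tables: a small group starting at 101 and a larger group around 112800.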
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
101 | 1 | 1.4%
102 | 1 | 1.4%
112813 | 1 | 1.4%
112812 | 1 | 1.4%
112811 | 1 | 1.4%
112810 | 1 | 1.4%
112809 | 1 | 1.4%
112808 | 1 | 1.4%
112807 | 1 | 1.4%
112806 | 1 | 1.4%
Other values (61) | 61 | 85.9%

Minimum 10 values
Value | Count | Frequency (%)
101 | 1 | 1.4%
102 | 1 | 1.4%
103 | 1 | 1.4%
104 | 1 | 1.4%
105 | 1 | 1.4%
106 | 1 | 1.4%
107 | 1 | 1.4%
108 | 1 | 1.4%
109 | 1 | 1.4%
110 | 1 | 1.4%

Maximum 10 values
Value | Count | Frequency (%)
112832 | 1 | 1.4%
112831 | 1 | 1.4%
112830 | 1 | 1.4%
112829 | 1 | 1.4%
112828 | 1 | 1.4%
112826 | 1 | 1.4%
112825 | 1 | 1.4%
112824 | 1 | 1.4%
112823 | 1 | 1.4%
112822 | 1 | 1.4%

언어
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 8.5%
Missing: 0
Missing (%): 0.0%
Memory size: 700.0 B

Length

Max length: 4
Median length: 2
Mean length: 2.6478873
Min length: 2

Unique

Unique: 1
Unique (%): 1.4%

Sample

1st row: 국문
2nd row: 국문
3rd row: 국문
4th row: 국문
5th row: 국문

Common Values

Value | Count | Frequency (%)
국문 | 35 | 49.3%
일문 | 12 | 16.9%
중문번체 | 10 | 14.1%
말레이문 | 9 | 12.7%
중문간체 | 4 | 5.6%
영문 | 1 | 1.4%

Length

Histogram of lengths of the category


정렬순서
Real number (ℝ)

HIGH CORRELATION 

Distinct: 10
Distinct (%): 14.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.3239437
Minimum: 1
Maximum: 10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 771.0 B

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 3
Median: 5
Q3: 8
95th percentile: 10
Maximum: 10
Range: 9
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.9941593
Coefficient of variation (CV): 0.562395
Kurtosis: -1.3147749
Mean: 5.3239437
Median Absolute Deviation (MAD): 3
Skewness: 0.058439987
Sum: 378
Variance: 8.9649899
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
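Common values
Value | Count | Frequency (%)
1 | 9 | 12.7%
2 | 8 | 11.3%
3 | 7 | 9.9%
4 | 7 | 9.9%
7 | 7 | 9.9%
8 | 7 | 9.9%
9 | 7 | 9.9%
10 | 7 | 9.9%
5 | 6 | 8.5%
6 | 6 | 8.5%

Minimum 10 values
Value | Count | Frequency (%)
1 | 9 | 12.7%
2 | 8 | 11.3%
3 | 7 | 9.9%
4 | 7 | 9.9%
5 | 6 | 8.5%
6 | 6 | 8.5%
7 | 7 | 9.9%
8 | 7 | 9.9%
9 | 7 | 9.9%
10 | 7 | 9.9%

Maximum 10 values
Value | Count | Frequency (%)
10 | 7 | 9.9%
9 | 7 | 9.9%
8 | 7 | 9.9%
7 | 7 | 9.9%
6 | 6 | 8.5%
5 | 6 | 8.5%
4 | 7 | 9.9%
3 | 7 | 9.9%
2 | 8 | 11.3%
1 | 9 | 12.7%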

단어
Text

Distinct: 62
Distinct (%): 87.3%
Missing: 0
Missing (%): 0.0%
Memory size: 700.0 B

Length

Max length: 28
Median length: 13
Mean length: 4.1267606
Min length: 1

Characters and Unicode

Total characters: 293
Distinct characters: 149
Distinct categories: 6
Distinct scripts: 5
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
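
As an illustration of those character properties, the sketch below tallies the Unicode general category of every character in the five sample words listed for this variable, using Python's standard `unicodedata` module. The helper name `category_counts` is an assumption made for this example. The two-letter codes map onto the category names used in this report: `Lo` is "Other Letter" and `Zs` is "Space Separator".

```python
import unicodedata
from collections import Counter


def category_counts(words):
    """Count characters per Unicode general category across all words."""
    counts = Counter()
    for word in words:
        for ch in word:
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample rows shown for the 단어 column.
sample = ["제주", "우도", "한라산", "성산 일출봉", "비자리"]
counts = category_counts(sample)
print(dict(counts))
```

Hangul syllables fall under `Lo`, which is why "Other Letter" dominates the category table for this column.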

Unique

Unique: 56
Unique (%): 78.9%

Sample

1st row: 제주
2nd row: 우도
3rd row: 한라산
4th row: 성산 일출봉
5th row: 비자리

Value | Count | Frequency (%)
제주 | 5 | 6.6%
성산 | 4 | 5.3%
한라산 | 3 | 3.9%
아이와바다 | 2 | 2.6%
사려니 | 2 | 2.6%
우도 | 2 | 2.6%
성산일출봉 | 2 | 2.6%
funinthewater | 1 | 1.3%
친구 | 1 | 1.3%
미식 | 1 | 1.3%
Other values (53) | 53 | 69.7%

Most occurring characters

ValueCountFrequency (%)
10
 
3.4%
10
 
3.4%
9
 
3.1%
9
 
3.1%
i 7
 
2.4%
l 7
 
2.4%
e 7
 
2.4%
6
 
2.0%
5
 
1.7%
5
 
1.7%
Other values (139) 218
74.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 218 | 74.4%
Lowercase Letter | 55 | 18.8%
Space Separator | 10 | 3.4%
Uppercase Letter | 5 | 1.7%
Other Punctuation | 3 | 1.0%
Decimal Number | 2 | 0.7%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
10
 
4.6%
9
 
4.1%
9
 
4.1%
6
 
2.8%
5
 
2.3%
5
 
2.3%
5
 
2.3%
5
 
2.3%
5
 
2.3%
4
 
1.8%
Other values (112) 155
71.1%
Lowercase Letter
Value | Count | Frequency (%)
i | 7 | 12.7%
l | 7 | 12.7%
e | 7 | 12.7%
t | 4 | 7.3%
o | 4 | 7.3%
a | 4 | 7.3%
s | 3 | 5.5%
w | 3 | 5.5%
n | 3 | 5.5%
v | 2 | 3.6%
Other values (8) | 11 | 20.0%

Uppercase Letter
Value | Count | Frequency (%)
M | 1 | 20.0%
F | 1 | 20.0%
C | 1 | 20.0%
S | 1 | 20.0%
Y | 1 | 20.0%

Decimal Number
Value | Count | Frequency (%)
1 | 1 | 50.0%
0 | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 10 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
? | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 195 | 66.6%
Latin | 60 | 20.5%
Han | 20 | 6.8%
Common | 15 | 5.1%
Katakana | 3 | 1.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
10
 
5.1%
9
 
4.6%
9
 
4.6%
6
 
3.1%
5
 
2.6%
5
 
2.6%
5
 
2.6%
5
 
2.6%
5
 
2.6%
4
 
2.1%
Other values (89) 132
67.7%
Latin
Value | Count | Frequency (%)
i | 7 | 11.7%
l | 7 | 11.7%
e | 7 | 11.7%
t | 4 | 6.7%
o | 4 | 6.7%
a | 4 | 6.7%
s | 3 | 5.0%
w | 3 | 5.0%
n | 3 | 5.0%
v | 2 | 3.3%
Other values (13) | 16 | 26.7%
Han
ValueCountFrequency (%)
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
Other values (10) 10
50.0%
Common
Value | Count | Frequency (%)
(space) | 10 | 66.7%
? | 3 | 20.0%
1 | 1 | 6.7%
0 | 1 | 6.7%
Katakana
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 195 | 66.6%
ASCII | 75 | 25.6%
CJK | 20 | 6.8%
Katakana | 3 | 1.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
10
 
5.1%
9
 
4.6%
9
 
4.6%
6
 
3.1%
5
 
2.6%
5
 
2.6%
5
 
2.6%
5
 
2.6%
5
 
2.6%
4
 
2.1%
Other values (89) 132
67.7%
ASCII
Value | Count | Frequency (%)
(space) | 10 | 13.3%
i | 7 | 9.3%
l | 7 | 9.3%
e | 7 | 9.3%
t | 4 | 5.3%
o | 4 | 5.3%
a | 4 | 5.3%
? | 3 | 4.0%
s | 3 | 4.0%
w | 3 | 4.0%
Other values (17) | 23 | 30.7%
CJK
ValueCountFrequency (%)
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
Other values (10) 10
50.0%
Katakana
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

등록일시
DateTime

Distinct: 3
Distinct (%): 4.2%
Missing: 0
Missing (%): 0.0%
Memory size: 700.0 B
Minimum: 2018-02-09 00:00:00
Maximum: 2024-03-08 00:00:00
Histogram with fixed size bins (bins=3)

기존순위
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 10
Distinct (%): 15.9%
Missing: 8
Missing (%): 11.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.015873
Minimum: 1
Maximum: 10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 771.0 B
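
The percentages for 기존순위 use two different denominators, which is easy to overlook. The sketch below shows that the reported figures are consistent with the missing percentage being taken over all 71 rows, and the distinct percentage and the mean being taken over the 63 non-missing rows. This is inferred from the numbers in this report rather than from the tool's documentation, and the sum of 316 is the value listed under this variable's descriptive statistics.

```python
# Counts reported for 기존순위.
n_rows = 71
n_missing = 8
n_distinct = 10
value_sum = 316

n_present = n_rows - n_missing              # rows with a value

missing_pct = 100 * n_missing / n_rows      # share of all rows
distinct_pct = 100 * n_distinct / n_present # share of non-missing rows
mean = value_sum / n_present                # mean ignores missing rows

print(n_present)
print(round(missing_pct, 1), round(distinct_pct, 1), round(mean, 6))
```

For comparison, 정렬순서 also has 10 distinct values but no missing rows, and its distinct percentage is reported as 14.1%, which matches 10 out of 71.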

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 2
Median: 5
Q3: 8
95th percentile: 9.9
Maximum: 10
Range: 9
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 2.9264757
Coefficient of variation (CV): 0.58344294
Kurtosis: -1.2799562
Mean: 5.015873
Median Absolute Deviation (MAD): 3
Skewness: 0.20318924
Sum: 316
Variance: 8.5642601
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
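Common values
Value | Count | Frequency (%)
2 | 9 | 12.7%
1 | 8 | 11.3%
3 | 7 | 9.9%
9 | 7 | 9.9%
5 | 6 | 8.5%
4 | 6 | 8.5%
6 | 6 | 8.5%
8 | 6 | 8.5%
7 | 4 | 5.6%
10 | 4 | 5.6%
(Missing) | 8 | 11.3%

Minimum 10 values
Value | Count | Frequency (%)
1 | 8 | 11.3%
2 | 9 | 12.7%
3 | 7 | 9.9%
4 | 6 | 8.5%
5 | 6 | 8.5%
6 | 6 | 8.5%
7 | 4 | 5.6%
8 | 6 | 8.5%
9 | 7 | 9.9%
10 | 4 | 5.6%

Maximum 10 values
Value | Count | Frequency (%)
10 | 4 | 5.6%
9 | 7 | 9.9%
8 | 6 | 8.5%
7 | 4 | 5.6%
6 | 6 | 8.5%
5 | 6 | 8.5%
4 | 6 | 8.5%
3 | 7 | 9.9%
2 | 9 | 12.7%
1 | 8 | 11.3%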

Interactions


Correlations

 | 일련번호 | 언어 | 정렬순서 | 단어 | 등록일시 | 기존순위
일련번호 | 1.000 | 0.856 | 0.000 | 0.891 | 1.000 | 0.138
언어 | 0.856 | 1.000 | 0.000 | 0.000 | 0.757 | 0.000
정렬순서 | 0.000 | 0.000 | 1.000 | 0.801 | 0.000 | 0.952
단어 | 0.891 | 0.000 | 0.801 | 1.000 | 0.000 | 0.860
등록일시 | 1.000 | 0.757 | 0.000 | 0.000 | 1.000 | 0.000
기존순위 | 0.138 | 0.000 | 0.952 | 0.860 | 0.000 | 1.000

 | 일련번호 | 정렬순서 | 기존순위 | 언어
일련번호 | 1.000 | 0.222 | 0.308 | 0.644
정렬순서 | 0.222 | 1.000 | 0.941 | 0.000
기존순위 | 0.308 | 0.941 | 1.000 | 0.000
언어 | 0.644 | 0.000 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

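First rows
 | 일련번호 | 언어 | 정렬순서 | 단어 | 등록일시 | 기존순위
0 | 101 | 국문 | 1 | 제주 | 2018-04-05 | 1
1 | 102 | 국문 | 2 | 우도 | 2018-04-05 | 2
2 | 103 | 국문 | 3 | 한라산 | 2018-04-05 | <NA>
3 | 104 | 국문 | 4 | 성산 일출봉 | 2018-04-05 | 3
4 | 105 | 국문 | 5 | 비자리 | 2018-04-05 | <NA>
5 | 106 | 국문 | 6 | 이중섭거리 | 2018-04-05 | <NA>
6 | 107 | 국문 | 7 | 천지연폭포 | 2018-04-05 | 5
7 | 108 | 국문 | 8 | 휴애리 | 2018-04-05 | <NA>
8 | 109 | 국문 | 9 | 마라도 | 2018-04-05 | <NA>
9 | 110 | 국문 | 10 | 천제연폭포 | 2018-04-05 | 9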
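
Last rows
 | 일련번호 | 언어 | 정렬순서 | 단어 | 등록일시 | 기존순위
61 | 112822 | 중문번체 | 10 | 오감만족제주여행 | 2024-03-08 | 9
62 | 112823 | 말레이문 | 1 | Seogwipo Chilsimni Festival | 2024-03-08 | 1
63 | 112824 | 말레이문 | 2 | Maple | 2024-03-08 | 2
64 | 112825 | 말레이문 | 3 |  | 2024-03-08 | 3
65 | 112826 | 말레이문 | 4 | Yellowtail | 2024-03-08 | 4
66 | 112828 | 말레이문 | 6 | 가장행복한여행 | 2024-03-08 | 6
67 | 112829 | 말레이문 | 7 | love | 2024-03-08 | 9
68 | 112830 | 말레이문 | 8 | 신혼여행 | 2024-03-08 | 7
69 | 112831 | 말레이문 | 9 | 친구 | 2024-03-08 | 8
70 | 112832 | 말레이문 | 10 | 우도해변 | 2024-03-08 | <NA>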