Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 1668
Missing cells (%): 3.3%
Duplicate rows: 72
Duplicate rows (%): 0.7%
Total size in memory: 478.5 KiB
Average record size in memory: 49.0 B
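
The percentages above can be checked against the raw counts. This is a minimal sketch of the arithmetic, assuming missing cells are measured against all cells (variables × observations) and duplicate rows against all observations; the variable names are illustrative and do not come from the report.

```python
# Counts copied from the dataset statistics above.
n_variables = 5
n_observations = 10_000
missing_cells = 1_668
duplicate_rows = 72
total_bytes = 478.5 * 1024  # 478.5 KiB expressed in bytes

# Assumption: missing cells are relative to every cell in the table.
missing_pct = 100 * missing_cells / (n_variables * n_observations)
# Assumption: duplicate rows are relative to the number of observations.
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_bytes / n_observations

print(f"{missing_pct:.1f}%")        # 3.3%
print(f"{duplicate_pct:.1f}%")      # 0.7%
print(f"{avg_record_bytes:.1f} B")  # 49.0 B
```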

Variable types

DateTime: 1
Categorical: 1
Text: 2
Numeric: 1

Dataset

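Description: Publishes the countries and regions from which JDC websites were accessed, so that the data can be used to characterize where prospective Jeju tourists connect from. Covered sites: the JDC online duty-free shop website, the JDC-JAM website, the Jeju Advanced Science and Technology Complex (제주첨단과기단지) website, and the JDC corporate website.
Author: Jeju Free International City Development Center (제주국제자유도시개발센터)
URL: https://www.data.go.kr/data/15106175/fileData.do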

Alerts

Dataset has 72 (0.7%) duplicate rows [Duplicates]
국가명 (country name) has 135 (1.4%) missing values [Missing]
지역명 (region name) has 1533 (15.3%) missing values [Missing]
접속횟수 (access count) is highly skewed (γ1 = 20.90162258) [Skewed]
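
The γ1 in the skewness alert is a sample skewness coefficient. The report does not say which estimator it uses, so the sketch below assumes the bias-adjusted Fisher-Pearson form; the function name and the toy data are illustrative only.

```python
import math

def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness (G1) of a list of numbers."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant data: treat as unskewed
    g1 = m3 / m2 ** 1.5
    return g1 * math.sqrt(n * (n - 1)) / (n - 2)

# Toy data (not from the dataset): a symmetric list and a long right tail.
symmetric = [1, 2, 3, 4, 5]
long_tail = [1, 1, 1, 2, 3, 9, 1533]

print(sample_skewness(symmetric))      # 0.0
print(sample_skewness(long_tail) > 1)  # True
```

A large positive value, as reported for 접속횟수, indicates that a few very large counts pull the distribution's right tail far from the bulk of the data.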

Reproduction

Analysis started: 2023-12-12 14:59:00.434822
Analysis finished: 2023-12-12 14:59:01.165902
Duration: 0.73 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
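
The reported duration follows from the two timestamps. A small sketch using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction block above.
started = datetime.fromisoformat("2023-12-12 14:59:00.434822")
finished = datetime.fromisoformat("2023-12-12 14:59:01.165902")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")  # 0.73 seconds
```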

Variables

접속일자
DateTime

Distinct: 19
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2022-08-11 00:00:00
Maximum: 2022-08-29 00:00:00
Histogram with fixed size bins (bins=19)
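
The bin count of 19 equals the number of distinct dates, which suggests (but the report does not state) one bin per calendar day. A quick check that the minimum and maximum dates span exactly 19 days, inclusive:

```python
from datetime import date

# Minimum and maximum dates reported for the date variable.
first_day = date(2022, 8, 11)
last_day = date(2022, 8, 29)

# Inclusive count of calendar days between the two dates.
days_covered = (last_day - first_day).days + 1
print(days_covered)  # 19
```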

홈페이지주소(URL)
Categorical

Distinct: 36
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
www.jdcdutyfree.com | 3912
jdcdutyfree.com | 3082
www.jdc-jam.com | 1207
www.jdcenter.com | 393
jdcenter.com | 224
Other values (31) | 1182

Length

Max length: 22
Median length: 21
Mean length: 16.6597
Min length: 11

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: jdcdutyfree.com
2nd row: www.jdcenter.com
3rd row: www.jdcdutyfree.com
4th row: jdcdutyfree.com
5th row: jdcdutyfree.com

Common Values

Value | Count | Frequency (%)
www.jdcdutyfree.com | 3912 | 39.1%
jdcdutyfree.com | 3082 | 30.8%
www.jdc-jam.com | 1207 | 12.1%
www.jdcenter.com | 393 | 3.9%
jdcenter.com | 224 | 2.2%
www.jeju-sp.com | 199 | 2.0%
jeju-sp.com | 108 | 1.1%
jdc-jam.com | 97 | 1.0%
english.jdcenter.com | 65 | 0.7%
m.jdcdutyfree.com | 59 | 0.6%
Other values (26) | 654 | 6.5%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
www.jdcdutyfree.com | 3912 | 39.1%
jdcdutyfree.com | 3082 | 30.8%
www.jdc-jam.com | 1207 | 12.1%
www.jdcenter.com | 393 | 3.9%
jdcenter.com | 224 | 2.2%
www.jeju-sp.com | 199 | 2.0%
jeju-sp.com | 108 | 1.1%
jdc-jam.com | 98 | 1.0%
english.jdcenter.com | 65 | 0.7%
m.jdcdutyfree.com | 59 | 0.6%
Other values (25) | 653 | 6.5%

국가명
Text

MISSING 

Distinct: 88
Distinct (%): 0.9%
Missing: 135
Missing (%): 1.4%
Memory size: 156.2 KiB

Length

Max length: 24
Median length: 11
Mean length: 10.662342
Min length: 4

Characters and Unicode

Total characters: 105184
Distinct characters: 52
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
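
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for one sample value; the two-letter codes correspond to the category names used in this report (`Ll` = Lowercase Letter, `Lu` = Uppercase Letter, `Zs` = Space Separator, `Ps` = Open Punctuation, `Pe` = Close Punctuation, `Lo` = Other Letter). The helper name is illustrative, and this is not necessarily how the report itself computes its tables.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count the characters of text by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

counts = category_counts("South Korea")
print(counts["Ll"], counts["Lu"], counts["Zs"])  # 8 2 1

# Hangul syllables fall into the "Other Letter" category.
print(unicodedata.category("대"))  # Lo
```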

Unique

Unique: 18
Unique (%): 0.2%

Sample

1st row: Guam
2nd row: South Korea
3rd row: South Korea
4th row: South Korea
5th row: South Korea

Value | Count | Frequency (%)
south | 8494 | 45.2%
korea | 8488 | 45.2%
united | 369 | 2.0%
states | 319 | 1.7%
china | 186 | 1.0%
canada | 70 | 0.4%
russia | 68 | 0.4%
germany | 50 | 0.3%
kingdom | 48 | 0.3%
france | 44 | 0.2%
Other values (91) | 638 | 3.4%
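
The table above counts lowercase words rather than whole values ("South Korea" contributes to both "south" and "korea"). A minimal sketch of such a count, assuming values are lowercased and split on whitespace; the exact tokenization used by the report is not stated, and the helper name is illustrative.

```python
from collections import Counter

def word_counts(values):
    """Count lowercase, whitespace-separated tokens, skipping missing values."""
    counts = Counter()
    for value in values:
        if value is None:  # missing value: nothing to tokenize
            continue
        counts.update(value.lower().split())
    return counts

# The five sample rows shown for 국가명.
sample = ["Guam", "South Korea", "South Korea", "South Korea", "South Korea"]
counts = word_counts(sample)
print(counts["south"], counts["korea"], counts["guam"])  # 4 4 1
```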

Most occurring characters

Value | Count | Frequency (%)
o | 17266 | 16.4%
a | 10043 | 9.5%
e | 9647 | 9.2%
t | 9630 | 9.2%
(space) | 8909 | 8.5%
S | 8894 | 8.5%
r | 8823 | 8.4%
h | 8777 | 8.3%
u | 8624 | 8.2%
K | 8558 | 8.1%
Other values (42) | 6013 | 5.7%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 77496 | 73.7%
Uppercase Letter | 18777 | 17.9%
Space Separator | 8909 | 8.5%
Open Punctuation | 1 | < 0.1%
Close Punctuation | 1 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
o | 17266 | 22.3%
a | 10043 | 13.0%
e | 9647 | 12.4%
t | 9630 | 12.4%
r | 8823 | 11.4%
h | 8777 | 11.3%
u | 8624 | 11.1%
n | 1212 | 1.6%
i | 1037 | 1.3%
d | 714 | 0.9%
Other values (16) | 1723 | 2.2%

Uppercase Letter
Value | Count | Frequency (%)
S | 8894 | 47.4%
K | 8558 | 45.6%
U | 391 | 2.1%
C | 265 | 1.4%
I | 90 | 0.5%
R | 76 | 0.4%
M | 70 | 0.4%
G | 59 | 0.3%
N | 55 | 0.3%
A | 52 | 0.3%
Other values (13) | 267 | 1.4%

Space Separator
Value | Count | Frequency (%)
(space) | 8909 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
[ | 1 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
] | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 96273 | 91.5%
Common | 8911 | 8.5%

Most frequent character per script

Latin
Value | Count | Frequency (%)
o | 17266 | 17.9%
a | 10043 | 10.4%
e | 9647 | 10.0%
t | 9630 | 10.0%
S | 8894 | 9.2%
r | 8823 | 9.2%
h | 8777 | 9.1%
u | 8624 | 9.0%
K | 8558 | 8.9%
n | 1212 | 1.3%
Other values (39) | 4799 | 5.0%

Common
Value | Count | Frequency (%)
(space) | 8909 | > 99.9%
[ | 1 | < 0.1%
] | 1 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 105184 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
o | 17266 | 16.4%
a | 10043 | 9.5%
e | 9647 | 9.2%
t | 9630 | 9.2%
(space) | 8909 | 8.5%
S | 8894 | 8.5%
r | 8823 | 8.4%
h | 8777 | 8.3%
u | 8624 | 8.2%
K | 8558 | 8.1%
Other values (42) | 6013 | 5.7%

지역명
Text

MISSING 

Distinct: 265
Distinct (%): 3.1%
Missing: 1533
Missing (%): 15.3%
Memory size: 156.2 KiB

Length

Max length: 11
Median length: 10
Mean length: 6.8855557
Min length: 3

Characters and Unicode

Total characters: 58300
Distinct characters: 147
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 대전광역시유성구
2nd row: 경기도군포시
3rd row: 대구광역시수성구
4th row: 경상북도울진군
5th row: 대구광역시달성군

Value | Count | Frequency (%)
경기도분당구 | 379 | 4.5%
서울특별시 | 156 | 1.8%
제주특별자치도제주시 | 97 | 1.1%
제주특별자치도서귀포시 | 68 | 0.8%
제주특별자치도 | 67 | 0.8%
서울특별시서초구 | 66 | 0.8%
서울특별시마포구 | 63 | 0.7%
서울특별시노원구 | 60 | 0.7%
대전광역시유성구 | 58 | 0.7%
경기도화성시 | 58 | 0.7%
Other values (255) | 7395 | 87.3%

Most occurring characters

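Value | Count | Frequency (%)
(not shown) | 5507 | 9.4%
(not shown) | 5296 | 9.1%
(not shown) | 4692 | 8.0%
(not shown) | 3109 | 5.3%
(not shown) | 2332 | 4.0%
(not shown) | 2002 | 3.4%
(not shown) | 1937 | 3.3%
(not shown) | 1922 | 3.3%
(not shown) | 1866 | 3.2%
(not shown) | 1556 | 2.7%
Other values (137) | 28081 | 48.2%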

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 58300 | 100.0%

Most frequent character per category

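Other Letter
Value | Count | Frequency (%)
(not shown) | 5507 | 9.4%
(not shown) | 5296 | 9.1%
(not shown) | 4692 | 8.0%
(not shown) | 3109 | 5.3%
(not shown) | 2332 | 4.0%
(not shown) | 2002 | 3.4%
(not shown) | 1937 | 3.3%
(not shown) | 1922 | 3.3%
(not shown) | 1866 | 3.2%
(not shown) | 1556 | 2.7%
Other values (137) | 28081 | 48.2%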

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 58300 | 100.0%

Most frequent character per script

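Hangul
Value | Count | Frequency (%)
(not shown) | 5507 | 9.4%
(not shown) | 5296 | 9.1%
(not shown) | 4692 | 8.0%
(not shown) | 3109 | 5.3%
(not shown) | 2332 | 4.0%
(not shown) | 2002 | 3.4%
(not shown) | 1937 | 3.3%
(not shown) | 1922 | 3.3%
(not shown) | 1866 | 3.2%
(not shown) | 1556 | 2.7%
Other values (137) | 28081 | 48.2%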

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 58300 | 100.0%

Most frequent character per block

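Hangul
Value | Count | Frequency (%)
(not shown) | 5507 | 9.4%
(not shown) | 5296 | 9.1%
(not shown) | 4692 | 8.0%
(not shown) | 3109 | 5.3%
(not shown) | 2332 | 4.0%
(not shown) | 2002 | 3.4%
(not shown) | 1937 | 3.3%
(not shown) | 1922 | 3.3%
(not shown) | 1866 | 3.2%
(not shown) | 1556 | 2.7%
Other values (137) | 28081 | 48.2%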

접속횟수
Real number (ℝ)

SKEWED 

Distinct: 152
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.5039
Minimum: 1
Maximum: 1533
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 3
Q3: 9
95-th percentile: 32
Maximum: 1533
Range: 1532
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 54.833757
Coefficient of variation (CV): 5.2203236
Kurtosis: 488.38524
Mean: 10.5039
Median Absolute Deviation (MAD): 2
Skewness: 20.901623
Sum: 105039
Variance: 3006.7409
Monotonicity: Not monotonic
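
Several of these statistics are derived from one another, which makes them easy to cross-check. The sketch below recomputes the mean, coefficient of variation, variance, IQR, and range from the reported values, using the standard definitions (CV = standard deviation / mean, variance = standard deviation squared); the variable names are illustrative.

```python
# Values copied from the statistics reported for 접속횟수.
mean = 10.5039
std_dev = 54.833757
total = 105039
n_observations = 10_000
q1, q3 = 1, 9
minimum, maximum = 1, 1533

print(total / n_observations)      # 10.5039  (mean = sum / n)
print(round(std_dev / mean, 4))    # 5.2203   (coefficient of variation)
print(round(std_dev ** 2, 1))      # 3006.7   (variance)
print(q3 - q1, maximum - minimum)  # 8 1532   (IQR and range)
```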
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
1 | 3609 | 36.1%
2 | 1362 | 13.6%
3 | 776 | 7.8%
4 | 477 | 4.8%
5 | 331 | 3.3%
6 | 289 | 2.9%
7 | 285 | 2.9%
8 | 240 | 2.4%
9 | 198 | 2.0%
10 | 194 | 1.9%
Other values (142) | 2239 | 22.4%

Value | Count | Frequency (%)
1 | 3609 | 36.1%
2 | 1362 | 13.6%
3 | 776 | 7.8%
4 | 477 | 4.8%
5 | 331 | 3.3%
6 | 289 | 2.9%
7 | 285 | 2.9%
8 | 240 | 2.4%
9 | 198 | 2.0%
10 | 194 | 1.9%

Value | Count | Frequency (%)
1533 | 1 | < 0.1%
1506 | 1 | < 0.1%
1464 | 1 | < 0.1%
1458 | 1 | < 0.1%
1457 | 1 | < 0.1%
1430 | 1 | < 0.1%
1408 | 1 | < 0.1%
1356 | 1 | < 0.1%
1130 | 1 | < 0.1%
1088 | 1 | < 0.1%

Interactions


Correlations

 | 접속일자 | 홈페이지주소(URL) | 국가명 | 접속횟수
접속일자 | 1.000 | 0.000 | 0.000 | 0.000
홈페이지주소(URL) | 0.000 | 1.000 | 0.569 | 0.000
국가명 | 0.000 | 0.569 | 1.000 | 0.000
접속횟수 | 0.000 | 0.000 | 0.000 | 1.000

 | 접속횟수 | 홈페이지주소(URL)
접속횟수 | 1.000 | 0.000
홈페이지주소(URL) | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
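
Nullity correlation can be illustrated as the Pearson correlation between the "is present" indicators of two columns. The sketch below is a plain-Python version under that assumption; the function name and the toy columns are illustrative and are not taken from the dataset.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Undefined when a column is entirely present or entirely missing.
        return float("nan")
    return cov / math.sqrt(var_a * var_b)

# Toy columns: the region is missing whenever the country is missing,
# and sometimes also when the country is present.
country = ["Guam", "South Korea", None, "Slovakia", None, "South Korea"]
region = [None, "대전광역시유성구", None, None, None, "경기도군포시"]

print(round(nullity_correlation(country, region), 3))  # 0.5
```

A value near 1 means the two columns tend to be missing together, a value near -1 means one tends to be present when the other is missing, and a value near 0 means their missingness is unrelated.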

Sample

 | 접속일자 | 홈페이지주소(URL) | 국가명 | 지역명 | 접속횟수
6144 | 2022-08-17 | jdcdutyfree.com | Guam | <NA> | 1
992 | 2022-08-11 | www.jdcenter.com | South Korea | 대전광역시유성구 | 2
7806 | 2022-08-18 | www.jdcdutyfree.com | South Korea | 경기도군포시 | 4
16722 | 2022-08-28 | jdcdutyfree.com | South Korea | 대구광역시수성구 | 20
4273 | 2022-08-15 | jdcdutyfree.com | South Korea | 경상북도울진군 | 2
7264 | 2022-08-18 | jdcdutyfree.com | South Korea | 대구광역시달성군 | 10
17555 | 2022-08-29 | jdcdutyfree.com | South Korea | 경기도동두천시 | 1
645 | 2022-08-11 | www.jdcdutyfree.com | South Korea | 충청남도동남구 | 2
322 | 2022-08-11 | jdcdutyfree.com | South Korea | 경기도장안구 | 16
15076 | 2022-08-26 | jeju-sp.com | South Korea | 경기도분당구 | 1

 | 접속일자 | 홈페이지주소(URL) | 국가명 | 지역명 | 접속횟수
5634 | 2022-08-16 | www.jdcdutyfree.com | South Korea | 강원도삼척시 | 1
17107 | 2022-08-28 | www.jdcdutyfree.com | South Korea | 인천광역시강화군 | 1
16002 | 2022-08-27 | www.jdc-jam.com | Slovakia | <NA> | 2
3224 | 2022-08-14 | jdcdutyfree.com | South Korea | 전라남도담양군 | 1
2451 | 2022-08-13 | jdcenter.com | United States | <NA> | 16
12252 | 2022-08-23 | www.jdc-jam.com | South Korea | 경상북도구미시 | 1
7178 | 2022-08-18 | jdcdutyfree.com | Ecuador | <NA> | 1
15288 | 2022-08-26 | www.jdcdutyfree.com | South Korea | 대전광역시유성구 | 2
3311 | 2022-08-14 | jdcdutyfree.com | South Korea | 서울특별시영등포구 | 25
3458 | 2022-08-14 | www.jdc-jam.com | South Korea | 부산광역시강서구 | 2

Duplicate rows

Most frequently occurring

 | 접속일자 | 홈페이지주소(URL) | 국가명 | 지역명 | 접속횟수 | # duplicates
0 | 2022-08-11 | english.jdcenter.com | South Korea | 경기도분당구 | 1 | 2
1 | 2022-08-11 | jdcdutyfree.com | China | <NA> | 6 | 2
2 | 2022-08-11 | jeju-sp.com | <NA> | <NA> | 1 | 2
3 | 2022-08-11 | m.jeju-sp.com | South Korea | 경기도분당구 | 1 | 2
4 | 2022-08-11 | www.jdcdutyfree.com | Australia | <NA> | 1 | 2
5 | 2022-08-11 | www.jdcdutyfree.com | Malaysia | <NA> | 1 | 2
6 | 2022-08-11 | www.jdcdutyfree.com | South Korea | 강원도삼척시 | 1 | 2
7 | 2022-08-11 | www.jdcdutyfree.com | South Korea | 강원도원주시 | 4 | 2
8 | 2022-08-11 | www.jdcdutyfree.com | South Korea | 경기도포천시 | 1 | 2
9 | 2022-08-11 | www.jdcdutyfree.com | South Korea | 충청남도홍성군 | 3 | 2
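
A duplicate-row count of this kind can be sketched by counting identical tuples. The example below assumes rows are compared on all five columns and that missing values (`None` here) compare equal to each other; the helper name and the three-row input are illustrative.

```python
from collections import Counter

def count_duplicates(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}

# Illustrative input: one row repeated twice plus one row that occurs once.
rows = [
    ("2022-08-11", "jdcdutyfree.com", "China", None, 6),
    ("2022-08-11", "jdcdutyfree.com", "China", None, 6),
    ("2022-08-11", "www.jdcdutyfree.com", "Malaysia", None, 1),
]
duplicates = count_duplicates(rows)
print(duplicates)
```

Only the repeated row appears in the result, with a count of 2, matching the "# duplicates" column above.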