Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 1
Missing cells (%): < 0.1%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 419.9 KiB
Average record size in memory: 43.0 B
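
The overview figures above can be recomputed from raw rows. The following is a minimal sketch, assuming rows are held as Python tuples with `None` marking a missing cell. The helper name `overview` and the four sample rows are hypothetical, and counting every repeat of an already-seen row as one duplicate is an assumption about how the report defines duplicates.

```python
from collections import Counter

def overview(rows):
    """Return basic dataset statistics for a list of equal-length row tuples."""
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    missing = sum(cell is None for row in rows for cell in row)
    # Assumption: each repeat of an already-seen row counts as one duplicate.
    duplicates = sum(count - 1 for count in Counter(rows).values())
    total_cells = n_rows * n_cols
    return {
        "variables": n_cols,
        "observations": n_rows,
        "missing_cells": missing,
        "missing_pct": 100.0 * missing / total_cells if total_cells else 0.0,
        "duplicate_rows": duplicates,
    }

# Hypothetical rows shaped like the profiled table (not taken from the dataset).
rows = [
    ("202108", "한일식당", 14, "202108"),
    ("202108", None, 1, "202108"),
    ("202108", "잠실 치킨", 4, "202108"),
    ("202108", None, 1, "202108"),
]
stats = overview(rows)
```

For the report's own numbers, 1 missing cell out of 4 × 10000 = 40000 cells is 0.0025%, which is why the percentage is shown as "< 0.1%".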

Variable types

Categorical: 2
Text: 1
Numeric: 1

Dataset

Description: Sample
Author: 주식회사 여기어때컴퍼니
URL: https://www.bigdata-finance.kr/dataset/datasetView.do?datastId=SET0400002

Alerts

기준년월 (reference year-month) has constant value "202108" [Constant]
대상기준년월 (target reference year-month) has constant value "202108" [Constant]
Dataset has 1 (< 0.1%) duplicate row [Duplicates]
서비스이용횟수 (service usage count) is highly skewed (γ1 = 27.77856616) [Skewed]
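
The constant and skewness alerts above can be approximated with two small checks. This is a sketch under stated assumptions, not the report's actual implementation: the function names are hypothetical, and the skewness cut-off of 20 is an assumed default chosen for illustration.

```python
def constant_columns(columns):
    """Return names of columns whose non-missing values are all identical."""
    constant = []
    for name, values in columns.items():
        present = {v for v in values if v is not None}
        if len(present) == 1:
            constant.append(name)
    return constant

def is_highly_skewed(skewness, threshold=20.0):
    """Flag a skewness whose magnitude exceeds the threshold.

    The default of 20 is an assumed cut-off for illustration only.
    """
    return abs(skewness) > threshold

# Hypothetical three-row excerpt using the report's column names.
columns = {
    "기준년월": ["202108", "202108", "202108"],
    "서비스이용횟수": [2, 14, 4],
    "대상기준년월": ["202108", "202108", "202108"],
}
flagged = constant_columns(columns)
```

Constant columns carry no information for modelling, so they are usually dropped before further analysis.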

Reproduction

Analysis started: 2023-12-10 13:13:00.255076
Analysis finished: 2023-12-10 13:13:02.245773
Duration: 1.99 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

기준년월 (reference year-month)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value                 Count
202108                10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202108
2nd row: 202108
3rd row: 202108
4th row: 202108
5th row: 202108

Common Values

Value                 Count   Frequency (%)
202108                10000   100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]

검색키워드명 (search keyword name)
Text

Distinct: 9994
Distinct (%): 99.9%
Missing: 1
Missing (%): < 0.1%
Memory size: 156.2 KiB

Length

Max length: 70
Median length: 49
Mean length: 5.1377138
Min length: 1

Characters and Unicode

Total characters: 51372
Distinct characters: 1209
Distinct categories: 9
Distinct scripts: 5
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
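
As an illustration of the character properties mentioned above, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories for a list of strings. The mapping `CATEGORY_NAMES` and the function `category_counts` are hypothetical helpers written for this example, and the two input strings are keywords taken from the sample rows of this report.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pf": "Final Punctuation",
}

def category_counts(texts):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for text in texts:
        for ch in text:
            code = unicodedata.category(ch)
            # Fall back to the two-letter code for categories not listed above.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

counts = category_counts(["잠실 치킨", "ofr seoul"])
```

Hangul syllables fall under "Other Letter", which is why that category dominates a column of Korean search keywords.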

Unique

Unique: 9989
Unique (%): 99.9%

Sample

1st row: 공단해장국
2nd row: 한일식당
3rd row: 잠실 치킨
4th row: 산드레
5th row: 리스팬케이크

Value                 Count   Frequency (%)
맛집                  289     2.1%
카페                  136     1.0%
[missing]             99      0.7%
베스트                96      0.7%
제주                  63      0.5%
파스타                47      0.3%
부산                  43      0.3%
스시                  41      0.3%
홍대                  38      0.3%
강남                  37      0.3%
Other values (9411)   12755   93.5%
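
The frequency table above looks like a count of individual words across all keywords. A minimal sketch of that idea follows, assuming words are separated by whitespace and missing keywords are skipped. The function name `word_frequencies` and the sample keywords are hypothetical, and the report may normalise text differently (for example by lowercasing or stripping punctuation).

```python
from collections import Counter

def word_frequencies(keywords):
    """Return (word, count, percent) tuples sorted by descending count."""
    counts = Counter()
    for keyword in keywords:
        if keyword is None:  # skip missing keywords
            continue
        counts.update(keyword.split())  # assumption: whitespace tokenisation
    total = sum(counts.values())
    return [(word, n, 100.0 * n / total) for word, n in counts.most_common()]

# Hypothetical keywords, not taken from the dataset.
table = word_frequencies(["강남 맛집", "홍대 맛집", "제주 카페", None])
top_word, top_count, top_pct = table[0]
```

Percentages here are relative to the total number of words, not the number of rows, which matches how the counts in the table sum to more than 10000.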

Most occurring characters

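Value                 Count   Frequency (%)
(space)               3856    7.5%
[missing]             984     1.9%
[missing]             974     1.9%
[missing]             748     1.5%
[missing]             710     1.4%
[missing]             676     1.3%
[missing]             576     1.1%
[missing]             544     1.1%
[missing]             467     0.9%
[missing]             456     0.9%
Other values (1199)   41381   80.6%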

Most occurring categories

Value                 Count   Frequency (%)
Other Letter          43119   83.9%
Space Separator       3856    7.5%
Lowercase Letter      3742    7.3%
Uppercase Letter      545     1.1%
Other Punctuation     79      0.2%
Dash Punctuation      14      < 0.1%
Open Punctuation      8       < 0.1%
Close Punctuation     8       < 0.1%
Final Punctuation     1       < 0.1%

Most frequent character per category

Other Letter

Value                 Count   Frequency (%)
[missing]             984     2.3%
[missing]             974     2.3%
[missing]             748     1.7%
[missing]             710     1.6%
[missing]             676     1.6%
[missing]             576     1.3%
[missing]             544     1.3%
[missing]             467     1.1%
[missing]             456     1.1%
[missing]             445     1.0%
Other values (1133)   36539   84.7%

Lowercase Letter

Value                 Count   Frequency (%)
a                     438     11.7%
e                     373     10.0%
o                     339     9.1%
n                     305     8.2%
i                     243     6.5%
s                     227     6.1%
t                     209     5.6%
r                     194     5.2%
l                     178     4.8%
u                     176     4.7%
Other values (16)     1060    28.3%

Uppercase Letter

Value                 Count   Frequency (%)
S                     59      10.8%
B                     47      8.6%
C                     37      6.8%
G                     36      6.6%
M                     35      6.4%
P                     33      6.1%
T                     30      5.5%
O                     28      5.1%
A                     26      4.8%
N                     23      4.2%
Other values (15)     191     35.0%

Other Punctuation

Value                 Count   Frequency (%)
.                     24      30.4%
/                     18      22.8%
?                     15      19.0%
&                     9       11.4%
'                     6       7.6%
,                     3       3.8%
\                     2       2.5%
·                     2       2.5%

Open Punctuation

Value                 Count   Frequency (%)
(                     5       62.5%
[                     3       37.5%

Close Punctuation

Value                 Count   Frequency (%)
)                     5       62.5%
]                     3       37.5%

Space Separator

Value                 Count   Frequency (%)
(space)               3856    100.0%

Dash Punctuation

Value                 Count   Frequency (%)
-                     14      100.0%

Final Punctuation

Value                 Count   Frequency (%)
[missing]             1       100.0%

Most occurring scripts

Value                 Count   Frequency (%)
Hangul                43097   83.9%
Latin                 4287    8.3%
Common                3966    7.7%
Han                   14      < 0.1%
Hiragana              8       < 0.1%

Most frequent character per script

Hangul

Value                 Count   Frequency (%)
[missing]             984     2.3%
[missing]             974     2.3%
[missing]             748     1.7%
[missing]             710     1.6%
[missing]             676     1.6%
[missing]             576     1.3%
[missing]             544     1.3%
[missing]             467     1.1%
[missing]             456     1.1%
[missing]             445     1.0%
Other values (1111)   36517   84.7%

Latin

Value                 Count   Frequency (%)
a                     438     10.2%
e                     373     8.7%
o                     339     7.9%
n                     305     7.1%
i                     243     5.7%
s                     227     5.3%
t                     209     4.9%
r                     194     4.5%
l                     178     4.2%
u                     176     4.1%
Other values (41)     1605    37.4%

Common

Value                 Count   Frequency (%)
(space)               3856    97.2%
.                     24      0.6%
/                     18      0.5%
?                     15      0.4%
-                     14      0.4%
&                     9       0.2%
'                     6       0.2%
(                     5       0.1%
)                     5       0.1%
,                     3       0.1%
Other values (5)      11      0.3%

Han

Value                 Count   Frequency (%)
[missing]             1       7.1%
[missing]             1       7.1%
[missing]             1       7.1%
[missing]             1       7.1%
[missing]             1       7.1%
[missing]             1       7.1%
[missing]             1       7.1%
[missing]             1       7.1%
[missing]             1       7.1%
[missing]             1       7.1%
Other values (4)      4       28.6%

Hiragana

Value                 Count   Frequency (%)
[missing]             1       12.5%
[missing]             1       12.5%
[missing]             1       12.5%
[missing]             1       12.5%
[missing]             1       12.5%
[missing]             1       12.5%
[missing]             1       12.5%
[missing]             1       12.5%

Most occurring blocks

Value                 Count   Frequency (%)
Hangul                42849   83.4%
ASCII                 8250    16.1%
Compat Jamo           248     0.5%
CJK                   14      < 0.1%
Hiragana              8       < 0.1%
None                  2       < 0.1%
Punctuation           1       < 0.1%

Most frequent character per block

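ASCII

Value                 Count   Frequency (%)
(space)               3856    46.7%
a                     438     5.3%
e                     373     4.5%
o                     339     4.1%
n                     305     3.7%
i                     243     2.9%
s                     227     2.8%
t                     209     2.5%
r                     194     2.4%
l                     178     2.2%
Other values (54)     1888    22.9%

Hangul

Value                 Count   Frequency (%)
[missing]             984     2.3%
[missing]             974     2.3%
[missing]             748     1.7%
[missing]             710     1.7%
[missing]             676     1.6%
[missing]             576     1.3%
[missing]             544     1.3%
[missing]             467     1.1%
[missing]             456     1.1%
[missing]             445     1.0%
Other values (1081)   36269   84.6%

Compat Jamo

Value                 Count   Frequency (%)
[missing]             26      10.5%
[missing]             22      8.9%
[missing]             19      7.7%
[missing]             19      7.7%
[missing]             16      6.5%
[missing]             15      6.0%
[missing]             15      6.0%
[missing]             12      4.8%
[missing]             11      4.4%
[missing]             11      4.4%
Other values (20)     82      33.1%

None

Value                 Count   Frequency (%)
·                     2       100.0%

CJK

Value                 Count   Frequency (%)
[missing]             1       7.1%
[missing]             1       7.1%
[missing]             1       7.1%
[missing]             1       7.1%
[missing]             1       7.1%
[missing]             1       7.1%
[missing]             1       7.1%
[missing]             1       7.1%
[missing]             1       7.1%
[missing]             1       7.1%
Other values (4)      4       28.6%

Hiragana

Value                 Count   Frequency (%)
[missing]             1       12.5%
[missing]             1       12.5%
[missing]             1       12.5%
[missing]             1       12.5%
[missing]             1       12.5%
[missing]             1       12.5%
[missing]             1       12.5%
[missing]             1       12.5%

Punctuation

Value                 Count   Frequency (%)
[missing]             1       100.0%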

서비스이용횟수 (service usage count)
Real number (ℝ)

SKEWED

Distinct: 184
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.3946
Minimum: 1
Maximum: 3544
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 2
Q3: 4
95th percentile: 23
Maximum: 3544
Range: 3543
Interquartile range (IQR): 3
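
The range and IQR above follow directly from the other quantiles: 3544 - 1 = 3543 and Q3 - Q1 = 4 - 1 = 3. The sketch below shows one common way to compute such quantiles, assuming linear interpolation between the closest ranks. The function name `quantile` and the sample values are hypothetical, and the report may use a different interpolation rule.

```python
import math

def quantile(values, q):
    """Quantile of a non-empty list, interpolating linearly between closest ranks."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)   # fractional index into the sorted data
    lower = math.floor(pos)
    upper = math.ceil(pos)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac

# Hypothetical usage counts, not taken from the dataset.
usage = [1, 1, 1, 2, 2, 3, 4, 4, 23]
q1 = quantile(usage, 0.25)
median = quantile(usage, 0.5)
q3 = quantile(usage, 0.75)
iqr = q3 - q1
value_range = max(usage) - min(usage)
```

Because the IQR ignores the tails, it stays small here even though a single large value stretches the range.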

Descriptive statistics

Standard deviation: 75.148305
Coefficient of variation (CV): 7.9990958
Kurtosis: 995.28409
Mean: 9.3946
Median Absolute Deviation (MAD): 1
Skewness: 27.778566
Sum: 93946
Variance: 5647.2678
Monotonicity: Not monotonic
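
Several of the figures above are tied together by simple identities: the mean is the sum divided by the number of observations, the variance is the standard deviation squared, and the CV is the standard deviation divided by the mean. The sketch below checks those identities against the reported values and adds a small `describe` helper. The helper is hypothetical, it uses the n - 1 denominator for the variance as an assumption, and its skewness is the uncorrected moment ratio m3 / m2^1.5, whereas statistical libraries often apply a bias correction, so results on small samples can differ.

```python
import math

def describe(values):
    """Mean, sample variance, standard deviation, CV and moment skewness."""
    n = len(values)
    mean = sum(values) / n
    squared = sum((x - mean) ** 2 for x in values)
    cubed = sum((x - mean) ** 3 for x in values)
    variance = squared / (n - 1)          # assumption: sample variance
    std = math.sqrt(variance)
    m2, m3 = squared / n, cubed / n       # central moments
    return {
        "mean": mean,
        "variance": variance,
        "std": std,
        "cv": std / mean,
        "skewness": m3 / m2 ** 1.5,       # uncorrected g1
    }

# Consistency check on the values reported above.
reported_mean = 93946 / 10000             # Sum / number of observations
reported_std = 75.148305
cv = reported_std / reported_mean
variance = reported_std ** 2

# Hypothetical right-skewed sample: one large value pulls the tail out.
summary = describe([1, 1, 1, 10])
```

A positive skewness, as in both the small sample and the reported 27.78, means a long right tail: most keywords are used once or twice while a few are used thousands of times.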
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                 Count   Frequency (%)
1                     4483    44.8%
2                     2011    20.1%
3                     911     9.1%
4                     501     5.0%
5                     330     3.3%
6                     234     2.3%
7                     171     1.7%
8                     146     1.5%
9                     107     1.1%
10                    89      0.9%
Other values (174)    1017    10.2%

Minimum 10 values

Value                 Count   Frequency (%)
1                     4483    44.8%
2                     2011    20.1%
3                     911     9.1%
4                     501     5.0%
5                     330     3.3%
6                     234     2.3%
7                     171     1.7%
8                     146     1.5%
9                     107     1.1%
10                    89      0.9%

Maximum 10 values

Value                 Count   Frequency (%)
3544                  1       < 0.1%
2913                  1       < 0.1%
2611                  1       < 0.1%
2099                  1       < 0.1%
1753                  1       < 0.1%
1319                  1       < 0.1%
1282                  1       < 0.1%
1140                  1       < 0.1%
1095                  1       < 0.1%
1061                  1       < 0.1%

대상기준년월 (target reference year-month)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value                 Count
202108                10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202108
2nd row: 202108
3rd row: 202108
4th row: 202108
5th row: 202108

Common Values

Value                 Count   Frequency (%)
202108                10000   100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]

Interactions

[Figure: Interactions plot]

Missing values

[Figure: Count] A simple visualization of nullity by column.
[Figure: Matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
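
Both views above are built from the same information: which cells are missing. The following is a minimal text-only sketch of that idea, assuming rows are tuples with `None` for a missing cell. The function names and sample rows are hypothetical and stand in for the plots, which the report renders graphically.

```python
def nullity_by_column(rows, names):
    """Count missing cells per column."""
    return {
        name: sum(row[i] is None for row in rows)
        for i, name in enumerate(names)
    }

def nullity_matrix(rows):
    """One string per row: '#' for a present cell, '.' for a missing one."""
    return ["".join("." if cell is None else "#" for cell in row) for row in rows]

names = ["기준년월", "검색키워드명", "서비스이용횟수", "대상기준년월"]
# Hypothetical rows, not taken from the dataset.
rows = [
    ("202108", "한일식당", 14, "202108"),
    ("202108", None, 1, "202108"),
    ("202108", "잠실 치킨", 4, "202108"),
]
per_column = nullity_by_column(rows, names)
matrix = nullity_matrix(rows)
```

Reading the matrix column by column shows whether gaps cluster in particular rows or are isolated, as with the single missing keyword in this dataset.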

Sample

First rows

Index   기준년월   검색키워드명          서비스이용횟수   대상기준년월
47459   202108     공단해장국            2                202108
7542    202108     한일식당              14               202108
25300   202108     잠실 치킨             4                202108
54682   202108     산드레                2                202108
56714   202108     리스팬케이크          1                202108
18646   202108     광화문 짬뽕           5                202108
26663   202108     사당역 횟집           3                202108
34336   202108     빨간어묵              3                202108
41860   202108     애월해녀의집          2                202108
38389   202108     씰국수                2                202108

Last rows

Index   기준년월   검색키워드명          서비스이용횟수   대상기준년월
98603   202108     엄마의 일품 김치찜    1                202108
53690   202108     신용                  2                202108
90770   202108     ofr seoul             1                202108
42179   202108     호호식당 익선         2                202108
70788   202108     사상역 막창           1                202108
6194    202108     을지로 술집           18               202108
91142   202108     한강진                1                202108
23074   202108     장흥읍                4                202108
49992   202108     네기 다이닝           2                202108
87188   202108     대구콘서트하우스      1                202108

Duplicate rows

Most frequently occurring

Index   기준년월   검색키워드명   서비스이용횟수   대상기준년월   # duplicates
0       202108     [missing]      1                202108         2