Overview

Dataset statistics

Number of variables: 3
Number of observations: 250
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.2 KiB
Average record size in memory: 25.5 B

Variable types

Numeric: 1
Categorical: 1
Text: 1

Dataset

Description: JDC designated duty-free shop, list of in-store brands (as of November 2015)
Author: Jeju Free International City Development Center (JDC)
URL: https://www.data.go.kr/data/15044052/fileData.do

Alerts

연번 (serial number) is highly overall correlated with 품종 (product category) [High correlation]
품종 (product category) is highly overall correlated with 연번 (serial number) [High correlation]
연번 (serial number) has unique values [Unique]

Reproduction

Analysis started: 2023-12-12 13:11:51.963979
Analysis finished: 2023-12-12 13:11:52.388693
Duration: 0.42 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번 (serial number)
Real number (ℝ)

Alerts: HIGH CORRELATION, UNIQUE

Distinct: 250
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 125.5
Minimum: 1
Maximum: 250
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.3 KiB

Quantile statistics

Minimum: 1
5th percentile: 13.45
Q1: 63.25
Median: 125.5
Q3: 187.75
95th percentile: 237.55
Maximum: 250
Range: 249
Interquartile range (IQR): 124.5
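
The percentiles above are consistent with linear interpolation between order statistics. The following is a minimal sketch of that calculation, not the profiler's own code. It assumes 연번 holds the integers 1 through 250, which is an inference from the minimum, maximum and distinct count reported here rather than something the report states. The helper name `linear_quantile` is hypothetical.

```python
def linear_quantile(sorted_values, q):
    """Return the q-quantile (0 <= q <= 1) by linear interpolation."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)                            # index at or below the position
    upper = min(lower + 1, len(sorted_values) - 1)   # next index, clamped to the end
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


# Assumption: the column holds the integers 1..250.
serial_numbers = list(range(1, 251))

quantiles = {q: linear_quantile(serial_numbers, q) for q in (0.05, 0.25, 0.5, 0.75, 0.95)}
iqr = quantiles[0.75] - quantiles[0.25]

for q, value in quantiles.items():
    print(f"{q:.2f} quantile: {value:.2f}")
print(f"IQR: {iqr:.2f}")
```

Under that assumption the sketch gives the same percentiles and interquartile range as the values listed above.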

Descriptive statistics

Standard deviation: 72.312977
Coefficient of variation (CV): 0.57619902
Kurtosis: -1.2
Mean: 125.5
Median Absolute Deviation (MAD): 62.5
Skewness: 0
Sum: 31375
Variance: 5229.1667
Monotonicity: Strictly increasing
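
Most of these figures can be cross-checked with the Python standard library. This is a sketch under the same assumption that 연번 holds the integers 1 through 250, which is inferred from the report and not stated in it. The variance uses the n - 1 denominator. Skewness and kurtosis are left out because the `statistics` module does not provide them.

```python
import statistics

# Assumption: the column holds the integers 1..250.
serial_numbers = list(range(1, 251))

total = sum(serial_numbers)
mean = statistics.mean(serial_numbers)
variance = statistics.variance(serial_numbers)   # sample variance (n - 1 denominator)
std_dev = statistics.stdev(serial_numbers)       # square root of the sample variance
cv = std_dev / mean                              # coefficient of variation
median = statistics.median(serial_numbers)
# Median absolute deviation: median of the absolute distances from the median.
mad = statistics.median([abs(x - median) for x in serial_numbers])
# Strictly increasing: every value is smaller than its successor.
strictly_increasing = all(a < b for a, b in zip(serial_numbers, serial_numbers[1:]))

print(total, mean, round(variance, 4), round(std_dev, 6), round(cv, 8), mad, strictly_increasing)
```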
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
1                   1      0.4%
173                 1      0.4%
160                 1      0.4%
161                 1      0.4%
162                 1      0.4%
163                 1      0.4%
164                 1      0.4%
165                 1      0.4%
166                 1      0.4%
167                 1      0.4%
Other values (240)  240    96.0%

Minimum 10 values

Value  Count  Frequency (%)
1      1      0.4%
2      1      0.4%
3      1      0.4%
4      1      0.4%
5      1      0.4%
6      1      0.4%
7      1      0.4%
8      1      0.4%
9      1      0.4%
10     1      0.4%

Maximum 10 values

Value  Count  Frequency (%)
250    1      0.4%
249    1      0.4%
248    1      0.4%
247    1      0.4%
246    1      0.4%
245    1      0.4%
244    1      0.4%
243    1      0.4%
242    1      0.4%
241    1      0.4%

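품종 (product category)
Categorical

Alerts: HIGH CORRELATION

Distinct: 12
Distinct (%): 4.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB

Value                 Count
선글라스 (sunglasses)  42
주류 (liquor)          36
화장품 (cosmetics)     35
패션 (fashion)         33
담배 (tobacco)         27
Other values (7)      77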

Length

Max length: 4
Median length: 2
Mean length: 2.628
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 담배 (tobacco)
2nd row: 담배 (tobacco)
3rd row: 담배 (tobacco)
4th row: 담배 (tobacco)
5th row: 담배 (tobacco)

Common Values

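Value                  Count  Frequency (%)
선글라스 (sunglasses)   42     16.8%
주류 (liquor)           36     14.4%
화장품 (cosmetics)      35     14.0%
패션 (fashion)          33     13.2%
담배 (tobacco)          27     10.8%
시계 (watches)          25     10.0%
향수 (perfume)          19     7.6%
액세서리 (accessories)  13     5.2%
초콜렛 (chocolate)      12     4.8%
문구 (stationery)       3      1.2%
Other values (2)       5      2.0%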
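
Each frequency in this table is the category count divided by the 250 observations. The following sketch recomputes the percentages from the counts listed above. The dictionary is transcribed from the table, and the "Other values (2)" bucket is kept as a single entry because the report does not break it down.

```python
# Counts transcribed from the Common Values table above.
counts_by_category = {
    "선글라스": 42,
    "주류": 36,
    "화장품": 35,
    "패션": 33,
    "담배": 27,
    "시계": 25,
    "향수": 19,
    "액세서리": 13,
    "초콜렛": 12,
    "문구": 3,
    "Other values (2)": 5,
}

total_rows = sum(counts_by_category.values())

# Frequency (%) = 100 * count / number of observations, rounded to one decimal.
frequency_percent = {
    name: round(100 * count / total_rows, 1)
    for name, count in counts_by_category.items()
}

# Share of distinct values: 12 distinct categories over all observations.
distinct_percent = round(100 * 12 / total_rows, 1)

for name, percent in frequency_percent.items():
    print(f"{name}: {percent}%")
```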

Length

Histogram of lengths of the category

브랜드 (brand)
Text

Distinct: 220
Distinct (%): 88.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB

Length

Max length: 17
Median length: 14
Mean length: 7.732
Min length: 3

Characters and Unicode

Total characters: 1933
Distinct characters: 96
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
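
As an illustration of those character properties, the standard library's `unicodedata` module reports the general category of each code point. The sketch below counts categories over the five sample brand names shown in this report. It covers only that sample, not the full column, and it is not the profiler's own implementation. The long category names in the mapping are the standard Unicode names for the three codes that occur in the sample.

```python
import unicodedata
from collections import Counter

# The five sample rows of the brand column shown in this report.
sample_brands = ["PIANISSIMO", "Caster", "Mevius", "LARK", "VIRGINIA SLIMS"]

# unicodedata.category returns two-letter codes such as "Lu" or "Zs".
general_category_counts = Counter(
    unicodedata.category(ch) for name in sample_brands for ch in name
)

# Long names for the codes that occur in this sample.
long_names = {"Lu": "Uppercase Letter", "Ll": "Lowercase Letter", "Zs": "Space Separator"}

for code, count in general_category_counts.most_common():
    print(f"{long_names.get(code, code)}: {count}")
```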

Unique

Unique: 195
Unique (%): 78.0%

Sample

1st row: PIANISSIMO
2nd row: Caster
3rd row: Mevius
4th row: LARK
5th row: VIRGINIA SLIMS

Value               Count  Frequency (%)
kenzo               3      0.9%
kors                3      0.9%
the                 3      0.9%
(unreadable)        3      0.9%
lanvin              3      0.9%
davidoff            3      0.9%
c.k                 3      0.9%
gucci               3      0.9%
shiseido            2      0.6%
burberry            2      0.6%
Other values (271)  298    91.4%

Most occurring characters

Value              Count  Frequency (%)
a                  105    5.4%
e                  103    5.3%
i                  91     4.7%
(space)            78     4.0%
A                  75     3.9%
S                  71     3.7%
E                  67     3.5%
r                  66     3.4%
o                  66     3.4%
n                  65     3.4%
Other values (86)  1146   59.3%

Most occurring categories

Value              Count  Frequency (%)
Lowercase Letter   892    46.1%
Uppercase Letter   890    46.0%
Space Separator    78     4.0%
Other Letter       35     1.8%
Other Punctuation  29     1.5%
Decimal Number     7      0.4%
Dash Punctuation   1      0.1%
Math Symbol        1      0.1%

Most frequent character per category

Other Letter

Value              Count  Frequency (%)
(unreadable)       2      5.7%
(unreadable)       2      5.7%
(unreadable)       2      5.7%
(unreadable)       1      2.9%
(unreadable)       1      2.9%
(unreadable)       1      2.9%
(unreadable)       1      2.9%
(unreadable)       1      2.9%
(unreadable)       1      2.9%
(unreadable)       1      2.9%
Other values (22)  22     62.9%

Lowercase Letter

Value              Count  Frequency (%)
a                  105    11.8%
e                  103    11.5%
i                  91     10.2%
r                  66     7.4%
o                  66     7.4%
n                  65     7.3%
l                  59     6.6%
s                  51     5.7%
u                  33     3.7%
c                  33     3.7%
Other values (16)  220    24.7%

Uppercase Letter

Value              Count  Frequency (%)
A                  75     8.4%
S                  71     8.0%
E                  67     7.5%
O                  59     6.6%
I                  58     6.5%
L                  58     6.5%
N                  53     6.0%
R                  50     5.6%
T                  43     4.8%
C                  43     4.8%
Other values (16)  313    35.2%

Decimal Number

Value  Count  Frequency (%)
5      3      42.9%
3      1      14.3%
7      1      14.3%
9      1      14.3%
2      1      14.3%

Other Punctuation

Value  Count  Frequency (%)
.      19     65.5%
'      5      17.2%
&      4      13.8%
/      1      3.4%

Space Separator

Value    Count  Frequency (%)
(space)  78     100.0%

Dash Punctuation

Value  Count  Frequency (%)
-      1      100.0%

Math Symbol

Value  Count  Frequency (%)
+      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   1782   92.2%
Common  116    6.0%
Hangul  35     1.8%

Most frequent character per script

Latin

Value              Count  Frequency (%)
a                  105    5.9%
e                  103    5.8%
i                  91     5.1%
A                  75     4.2%
S                  71     4.0%
E                  67     3.8%
r                  66     3.7%
o                  66     3.7%
n                  65     3.6%
l                  59     3.3%
Other values (42)  1014   56.9%

Hangul

Value              Count  Frequency (%)
(unreadable)       2      5.7%
(unreadable)       2      5.7%
(unreadable)       2      5.7%
(unreadable)       1      2.9%
(unreadable)       1      2.9%
(unreadable)       1      2.9%
(unreadable)       1      2.9%
(unreadable)       1      2.9%
(unreadable)       1      2.9%
(unreadable)       1      2.9%
Other values (22)  22     62.9%

Common

Value             Count  Frequency (%)
(space)           78     67.2%
.                 19     16.4%
'                 5      4.3%
&                 4      3.4%
5                 3      2.6%
3                 1      0.9%
7                 1      0.9%
/                 1      0.9%
-                 1      0.9%
+                 1      0.9%
Other values (2)  2      1.7%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   1898   98.2%
Hangul  35     1.8%
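
A Unicode block is a contiguous range of code points, so a character's block can be found by comparing its code point against range boundaries. The sketch below is a hypothetical helper, not the profiler's implementation. It assumes the report's "ASCII" label means U+0000 to U+007F and that its "Hangul" label refers to the Hangul Syllables block (U+AC00 to U+D7AF). The two input strings are values that appear elsewhere in this report and are used only for illustration.

```python
from collections import Counter


def block_of(ch):
    """Classify a character into one of the two blocks named in the report."""
    code_point = ord(ch)
    if code_point <= 0x7F:
        return "ASCII"    # Basic Latin block, U+0000..U+007F
    if 0xAC00 <= code_point <= 0xD7AF:
        return "Hangul"   # assumption: the Hangul Syllables block
    return "Other"


# Illustrative strings taken from this report: one brand name, one category value.
texts = ["L'OREAL PARIS", "담배"]

block_counts = Counter(block_of(ch) for text in texts for ch in text)

for block, count in block_counts.items():
    print(f"{block}: {count}")
```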

Most frequent character per block

ASCII

Value              Count  Frequency (%)
a                  105    5.5%
e                  103    5.4%
i                  91     4.8%
(space)            78     4.1%
A                  75     4.0%
S                  71     3.7%
E                  67     3.5%
r                  66     3.5%
o                  66     3.5%
n                  65     3.4%
Other values (54)  1111   58.5%

Hangul

Value              Count  Frequency (%)
(unreadable)       2      5.7%
(unreadable)       2      5.7%
(unreadable)       2      5.7%
(unreadable)       1      2.9%
(unreadable)       1      2.9%
(unreadable)       1      2.9%
(unreadable)       1      2.9%
(unreadable)       1      2.9%
(unreadable)       1      2.9%
(unreadable)       1      2.9%
Other values (22)  22     62.9%

Interactions


Correlations

       연번     품종
연번   1.000  0.930
품종   0.930  1.000

       연번     품종
연번   1.000  0.747
품종   0.747  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

     연번   품종    브랜드
0    1     담배    PIANISSIMO
1    2     담배    Caster
2    3     담배    Mevius
3    4     담배    LARK
4    5     담배    VIRGINIA SLIMS
5    6     담배    Marlboro
6    7     담배    Parliament
7    8     담배    Davidoff
8    9     담배    LUCKY STRIKE
9    10    담배    SE 555

Last rows

     연번   품종     브랜드
240  241   화장품   Clinique
241  242   화장품   Estee Lauder
242  243   화장품   Lab
243  244   화장품   Mac
244  245   화장품   Origins
245  246   화장품   YSL
246  247   화장품   Elizabeth Arden
247  248   화장품   SK-ll
248  249   화장품   Make Up For Ever
249  250   화장품   L'OREAL PARIS