Overview

Dataset statistics

Number of variables: 4
Number of observations: 365
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 11.9 KiB
Average record size in memory: 33.4 B
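
The average record size can be checked from the total size and the number of observations. The sketch below is only an arithmetic check; it assumes 1 KiB = 1024 bytes, and the variable names are illustrative. Because the reported total (11.9 KiB) is itself rounded, the result is approximate.

```python
# Figures copied from the dataset statistics above.
total_size_kib = 11.9   # "Total size in memory" (already rounded in the report)
n_observations = 365    # "Number of observations"

# Assumption: 1 KiB = 1024 bytes.
total_bytes = total_size_kib * 1024
avg_record_bytes = total_bytes / n_observations

print(f"{avg_record_bytes:.1f} B per record")  # 33.4 B per record
```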

Variable types

Numeric: 1
Categorical: 2
Text: 1

Dataset

Description: JDC Designated Duty-Free Shop, brands in store (as of May 2017)
Author: Jeju Free International City Development Center (제주국제자유도시개발센터)
URL: https://www.data.go.kr/data/15044054/fileData.do

Alerts

번호 (number) is highly overall correlated with 상품군 (product group) and 1 other field (High correlation)
상품군 (product group) is highly overall correlated with 번호 (High correlation)
품종 (product type) is highly overall correlated with 번호 (High correlation)
번호 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 20:03:54.104008
Analysis finished: 2023-12-12 20:03:54.579404
Duration: 0.48 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

번호 (number)
Real number (ℝ)

Alerts: HIGH CORRELATION, UNIQUE

Distinct: 365
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 183
Minimum: 1
Maximum: 365
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 19.2
Q1: 92
Median: 183
Q3: 274
95-th percentile: 346.8
Maximum: 365
Range: 364
Interquartile range (IQR): 182
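
The quantiles above are consistent with linear interpolation between neighbouring ranks. The sketch below reproduces them under the assumption that 번호 holds exactly the integers 1 to 365 (suggested by the minimum, maximum, distinct count and sum, but not stated outright); the `quantile` helper is written for this illustration and is not taken from the report's software.

```python
import math

def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two nearest ranks."""
    pos = q * (len(sorted_values) - 1)   # fractional index into the sorted data
    lower = math.floor(pos)
    upper = math.ceil(pos)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])

values = list(range(1, 366))  # assumption: 번호 is exactly 1..365

p5 = quantile(values, 0.05)
q1 = quantile(values, 0.25)
median = quantile(values, 0.50)
q3 = quantile(values, 0.75)
p95 = quantile(values, 0.95)
iqr = q3 - q1

print(round(p5, 1), q1, median, q3, round(p95, 1), iqr)
```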

Descriptive statistics

Standard deviation: 105.51066
Coefficient of variation (CV): 0.576561
Kurtosis: -1.2
Mean: 183
Median Absolute Deviation (MAD): 91
Skewness: 0
Sum: 66795
Variance: 11132.5
Monotonicity: Strictly increasing
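
Several of these figures can be reproduced with Python's `statistics` module. This is a minimal check, again assuming 번호 is exactly the integers 1 to 365; it uses the sample variance (divisor n - 1), which is what matches the reported 11132.5.

```python
import statistics

values = list(range(1, 366))  # assumption: 번호 is exactly 1..365

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (n - 1 in the divisor)
std = statistics.stdev(values)           # square root of the sample variance
cv = std / mean                          # coefficient of variation
median = statistics.median(values)
# Median absolute deviation: median distance from the median.
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, variance, round(std, 5), round(cv, 6), mad)
```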
Histogram with fixed size bins (bins=50)
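
For a sense of what 50 fixed-size bins mean for this column, the sketch below counts values per bin. It assumes 번호 is exactly 1 to 365 and that the bins are equal-width over [minimum, maximum] with the last bin closed on the right (the convention `numpy.histogram` uses); the report does not state how its bins were computed.

```python
values = range(1, 366)  # assumption: 번호 is exactly 1..365
bins = 50
lo, hi = min(values), max(values)

counts = [0] * bins
for v in values:
    # Equal-width bins over [lo, hi]; integer arithmetic avoids rounding issues.
    idx = (v - lo) * bins // (hi - lo)
    # The maximum value would land in a 51st bin, so fold it into the last one.
    counts[min(idx, bins - 1)] += 1

print(counts)
```

Each bin spans 364 / 50 = 7.28 units, so every bin holds either 7 or 8 values and the histogram is essentially flat.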

Common values

Value | Count | Frequency (%)
1 | 1 | 0.3%
252 | 1 | 0.3%
250 | 1 | 0.3%
249 | 1 | 0.3%
248 | 1 | 0.3%
247 | 1 | 0.3%
246 | 1 | 0.3%
245 | 1 | 0.3%
244 | 1 | 0.3%
243 | 1 | 0.3%
Other values (355) | 355 | 97.3%
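
Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 0.3%
2 | 1 | 0.3%
3 | 1 | 0.3%
4 | 1 | 0.3%
5 | 1 | 0.3%
6 | 1 | 0.3%
7 | 1 | 0.3%
8 | 1 | 0.3%
9 | 1 | 0.3%
10 | 1 | 0.3%

Maximum 10 values

Value | Count | Frequency (%)
365 | 1 | 0.3%
364 | 1 | 0.3%
363 | 1 | 0.3%
362 | 1 | 0.3%
361 | 1 | 0.3%
360 | 1 | 0.3%
359 | 1 | 0.3%
358 | 1 | 0.3%
357 | 1 | 0.3%
356 | 1 | 0.3%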

상품군 (product group)
Categorical

Alerts: HIGH CORRELATION

Distinct: 3
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

수입 (imported): 200
환급대상상품 (refund-eligible goods): 90
국산 (domestic): 75

Length

Max length: 6
Median length: 2
Mean length: 2.9863014
Min length: 2
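
The length statistics follow directly from the three category counts reported for this variable. The sketch below rebuilds the list of string lengths from those counts (a reconstruction for illustration, not the original data file) and recomputes the figures.

```python
import statistics

# Category counts as reported for 상품군.
category_counts = {"수입": 200, "환급대상상품": 90, "국산": 75}

# One string length per row: each category's length repeated by its count.
lengths = [len(value) for value, n in category_counts.items() for _ in range(n)]

max_length = max(lengths)
min_length = min(lengths)
median_length = statistics.median(lengths)
mean_length = statistics.mean(lengths)   # (200*2 + 90*6 + 75*2) / 365

print(max_length, median_length, round(mean_length, 7), min_length)
```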

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 수입
2nd row: 수입
3rd row: 수입
4th row: 수입
5th row: 수입

Common Values

Value | Count | Frequency (%)
수입 | 200 | 54.8%
환급대상상품 | 90 | 24.7%
국산 | 75 | 20.5%
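
The frequency column is each count divided by the 365 observations. A minimal sketch of that calculation, using the counts from the table (the dictionary names are illustrative):

```python
# Counts as reported for 상품군.
category_counts = {"수입": 200, "환급대상상품": 90, "국산": 75}

total = sum(category_counts.values())
shares = {value: round(100 * n / total, 1) for value, n in category_counts.items()}

for value, pct in shares.items():
    print(f"{value}: {pct}%")
# 수입: 54.8%
# 환급대상상품: 24.7%
# 국산: 20.5%
```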

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of the 상품군 value counts (수입 200, 환급대상상품 90, 국산 75)]

품종 (product type)
Categorical

Alerts: HIGH CORRELATION

Distinct: 19
Distinct (%): 5.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

Sunglass: 45
Cosmetic: 39
Perfume: 36
Bag: 35
S.L.G.: 33
Other values (14): 177

Length

Max length: 14
Median length: 9
Mean length: 6.6520548
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Whisky
2nd row: Whisky
3rd row: Whisky
4th row: Whisky
5th row: Whisky

Common Values

Value | Count | Frequency (%)
Sunglass | 45 | 12.3%
Cosmetic | 39 | 10.7%
Perfume | 36 | 9.9%
Bag | 35 | 9.6%
S.L.G. | 33 | 9.0%
Watch | 29 | 7.9%
Cigarette | 23 | 6.3%
Whisky | 23 | 6.3%
Belt | 18 | 4.9%
Spirit | 17 | 4.7%
Other values (9) | 67 | 18.4%

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
sunglass | 45 | 11.8%
cosmetic | 39 | 10.2%
perfume | 36 | 9.4%
bag | 35 | 9.2%
s.l.g | 33 | 8.6%
watch | 29 | 7.6%
cigarette | 23 | 6.0%
whisky | 23 | 6.0%
belt | 18 | 4.7%
spirit | 17 | 4.5%
Other values (11) | 84 | 22.0%

브랜드 (brand)
Text

Distinct: 247
Distinct (%): 67.7%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

Length

Max length: 23
Median length: 14
Mean length: 8.0054795
Min length: 3

Characters and Unicode

Total characters: 2922
Distinct characters: 109
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
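
As an illustration of how such character properties can be read, the sketch below uses Python's standard `unicodedata` module to count general categories in a few brand names taken from the sample rows of this report. The two-letter codes (`Lu`, `Ll`, `Lo`, `Zs`) are the Unicode general categories; the mapping to the long names used in the report is written out by hand here, and whether the report's software works this way internally is an assumption.

```python
import unicodedata
from collections import Counter

# A few brand names taken from the sample rows of this report.
brands = ["Ardbeg", "Chivas Regal", "아리랑", "THIS PLUS"]

# Long names for the general-category codes that occur in these strings.
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Lo": "Other Letter",      # includes Hangul syllables
    "Zs": "Space Separator",
}

categories = Counter(
    unicodedata.category(ch) for brand in brands for ch in brand
)

for code, count in categories.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```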

Unique

Unique: 187
Unique (%): 51.2%

Sample

1st row: Ardbeg
2nd row: Bruichladdich
3rd row: Bowmore
4th row: Ballantines
5th row: Balvenie

Words

Value | Count | Frequency (%)
s.t.dupont | 10 | 2.1%
daks | 8 | 1.7%
gucci | 6 | 1.3%
ferragamo | 5 | 1.1%
fendi | 5 | 1.1%
burberry | 5 | 1.1%
prada | 5 | 1.1%
quatorze | 4 | 0.9%
hazzys | 4 | 0.9%
follie | 4 | 0.9%
Other values (304) | 411 | 88.0%
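
A word-frequency table of this kind can be approximated by lowercasing each value and splitting on whitespace. That tokenization is an assumption made for this sketch (the report does not state how words were extracted); the brand names are taken from the sample rows.

```python
from collections import Counter

# Brand names taken from the sample rows of this report.
brands = ["Chivas Regal", "Royal Salute", "Cutty Sark", "THIS", "THIS PLUS", "THE ONE"]

# Assumption: words are lowercased, whitespace-separated tokens.
words = Counter(word for brand in brands for word in brand.lower().split())
total_words = sum(words.values())

for word, count in words.most_common(3):
    print(f"{word} | {count} | {100 * count / total_words:.1f}%")
```

Because multi-word brands contribute several tokens, the word total exceeds the number of rows, which is consistent with the table's counts summing to more than 365.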

Most occurring characters

Value | Count | Frequency (%)
a | 158 | 5.4%
e | 139 | 4.8%
i | 114 | 3.9%
E | 111 | 3.8%
A | 110 | 3.8%
r | 110 | 3.8%
(space) | 109 | 3.7%
o | 109 | 3.7%
S | 106 | 3.6%
L | 92 | 3.1%
Other values (99) | 1764 | 60.4%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 1409 | 48.2%
Lowercase Letter | 1269 | 43.4%
Space Separator | 109 | 3.7%
Other Letter | 60 | 2.1%
Other Punctuation | 51 | 1.7%
Open Punctuation | 9 | 0.3%
Close Punctuation | 9 | 0.3%
Decimal Number | 4 | 0.1%
Dash Punctuation | 2 | 0.1%

Most frequent character per category

Other Letter

Value | Count | Frequency (%)
 | 6 | 10.0%
 | 3 | 5.0%
 | 3 | 5.0%
 | 3 | 5.0%
 | 2 | 3.3%
 | 2 | 3.3%
 | 2 | 3.3%
 | 2 | 3.3%
 | 1 | 1.7%
 | 1 | 1.7%
Other values (35) | 35 | 58.3%

Uppercase Letter

Value | Count | Frequency (%)
E | 111 | 7.9%
A | 110 | 7.8%
S | 106 | 7.5%
L | 92 | 6.5%
T | 90 | 6.4%
I | 89 | 6.3%
O | 89 | 6.3%
N | 84 | 6.0%
C | 75 | 5.3%
R | 68 | 4.8%
Other values (16) | 495 | 35.1%

Lowercase Letter

Value | Count | Frequency (%)
a | 158 | 12.5%
e | 139 | 11.0%
i | 114 | 9.0%
r | 110 | 8.7%
o | 109 | 8.6%
n | 84 | 6.6%
l | 84 | 6.6%
s | 69 | 5.4%
u | 58 | 4.6%
c | 47 | 3.7%
Other values (15) | 297 | 23.4%

Other Punctuation

Value | Count | Frequency (%)
. | 36 | 70.6%
 | 6 | 11.8%
' | 4 | 7.8%
; | 3 | 5.9%
/ | 1 | 2.0%
& | 1 | 2.0%

Decimal Number

Value | Count | Frequency (%)
7 | 2 | 50.0%
9 | 1 | 25.0%
3 | 1 | 25.0%

Space Separator

Value | Count | Frequency (%)
(space) | 109 | 100.0%

Open Punctuation

Value | Count | Frequency (%)
( | 9 | 100.0%

Close Punctuation

Value | Count | Frequency (%)
) | 9 | 100.0%

Dash Punctuation

Value | Count | Frequency (%)
- | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 2678 | 91.6%
Common | 184 | 6.3%
Hangul | 60 | 2.1%

Most frequent character per script

Latin

Value | Count | Frequency (%)
a | 158 | 5.9%
e | 139 | 5.2%
i | 114 | 4.3%
E | 111 | 4.1%
A | 110 | 4.1%
r | 110 | 4.1%
o | 109 | 4.1%
S | 106 | 4.0%
L | 92 | 3.4%
T | 90 | 3.4%
Other values (41) | 1539 | 57.5%

Hangul

Value | Count | Frequency (%)
 | 6 | 10.0%
 | 3 | 5.0%
 | 3 | 5.0%
 | 3 | 5.0%
 | 2 | 3.3%
 | 2 | 3.3%
 | 2 | 3.3%
 | 2 | 3.3%
 | 1 | 1.7%
 | 1 | 1.7%
Other values (35) | 35 | 58.3%

Common

Value | Count | Frequency (%)
(space) | 109 | 59.2%
. | 36 | 19.6%
( | 9 | 4.9%
) | 9 | 4.9%
 | 6 | 3.3%
' | 4 | 2.2%
; | 3 | 1.6%
7 | 2 | 1.1%
- | 2 | 1.1%
/ | 1 | 0.5%
Other values (3) | 3 | 1.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2856 | 97.7%
Hangul | 60 | 2.1%
None | 6 | 0.2%
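
A block breakdown like this one can be sketched by testing code point ranges. The helper below is hypothetical and covers only two ranges: ASCII (U+0000 to U+007F) and the Hangul Syllables block (U+AC00 to U+D7AF). Everything else is labelled "Other", which need not match how the report assigns its "None" label.

```python
from collections import Counter

def block_of(ch):
    """Return a coarse Unicode block label for one character (illustrative only)."""
    cp = ord(ch)
    if cp < 0x80:
        return "ASCII"
    if 0xAC00 <= cp <= 0xD7AF:   # Hangul Syllables block
        return "Hangul"
    return "Other"               # assumption: anything not covered above

# Brand names taken from the sample rows of this report.
brands = ["아리랑", "ESSE", "THIS PLUS"]

blocks = Counter(block_of(ch) for brand in brands for ch in brand)
print(blocks)
```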

Most frequent character per block

ASCII

Value | Count | Frequency (%)
a | 158 | 5.5%
e | 139 | 4.9%
i | 114 | 4.0%
E | 111 | 3.9%
A | 110 | 3.9%
r | 110 | 3.9%
(space) | 109 | 3.8%
o | 109 | 3.8%
S | 106 | 3.7%
L | 92 | 3.2%
Other values (53) | 1698 | 59.5%

None

Value | Count | Frequency (%)
 | 6 | 100.0%

Hangul

Value | Count | Frequency (%)
 | 6 | 10.0%
 | 3 | 5.0%
 | 3 | 5.0%
 | 3 | 5.0%
 | 2 | 3.3%
 | 2 | 3.3%
 | 2 | 3.3%
 | 2 | 3.3%
 | 1 | 1.7%
 | 1 | 1.7%
Other values (35) | 35 | 58.3%

Interactions


Correlations

 | 번호 | 상품군 | 품종
번호 | 1.000 | 0.954 | 0.899
상품군 | 0.954 | 1.000 | 0.649
품종 | 0.899 | 0.649 | 1.000

 | 품종 | 상품군
품종 | 1.000 | 0.434
상품군 | 0.434 | 1.000

 | 번호 | 상품군 | 품종
번호 | 1.000 | 0.941 | 0.615
상품군 | 0.941 | 1.000 | 0.434
품종 | 0.615 | 0.434 | 1.000
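
The report does not say which association measure produced each matrix. Purely as an illustration of how a correlation-like score between two categorical columns can be computed, the sketch below implements Cramér's V (without bias correction) from a chi-squared statistic. The function name and the toy rows are assumptions made for this example, and the outputs are not meant to reproduce the figures above.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Cramér's V between two equal-length categorical sequences (no bias correction)."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed counts per (x, y) cell
    row = Counter(xs)              # marginal counts of x
    col = Counter(ys)              # marginal counts of y
    chi2 = 0.0
    for x in row:
        for y in col:
            expected = row[x] * col[y] / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Toy rows: each product group maps to exactly one product type.
groups = ["수입", "수입", "국산", "국산"]
types_linked = ["Whisky", "Whisky", "Cigarette", "Cigarette"]
# Toy rows: product type is independent of product group.
types_mixed = ["Whisky", "Cigarette", "Whisky", "Cigarette"]

v_linked = cramers_v(groups, types_linked)
v_mixed = cramers_v(groups, types_mixed)
print(v_linked, v_mixed)
```

A value near 1 means one column almost determines the other, and a value near 0 means the two are close to independent.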

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

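First rows

 | 번호 | 상품군 | 품종 | 브랜드
0 | 1 | 수입 | Whisky | Ardbeg
1 | 2 | 수입 | Whisky | Bruichladdich
2 | 3 | 수입 | Whisky | Bowmore
3 | 4 | 수입 | Whisky | Ballantines
4 | 5 | 수입 | Whisky | Balvenie
5 | 6 | 수입 | Whisky | Chivas Regal
6 | 7 | 수입 | Whisky | Royal Salute
7 | 8 | 수입 | Whisky | Cutty Sark
8 | 9 | 수입 | Whisky | GLENGOYNE
9 | 10 | 수입 | Whisky | GlenfiddIch

Last rows

 | 번호 | 상품군 | 품종 | 브랜드
355 | 356 | 국산 | Cigarette | ESSE
356 | 357 | 국산 | Cigarette | EDGE
357 | 358 | 국산 | Cigarette | 아리랑
358 | 359 | 국산 | Cigarette | RAISON
359 | 360 | 국산 | Cigarette | SEASONS
360 | 361 | 국산 | Cigarette | simple
361 | 362 | 국산 | Cigarette | time
362 | 363 | 국산 | Cigarette | THIS
363 | 364 | 국산 | Cigarette | THIS PLUS
364 | 365 | 국산 | Cigarette | THE ONE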