Overview

Dataset statistics

Number of variables: 5
Number of observations: 119
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1
Duplicate rows (%): 0.8%
Total size in memory: 4.8 KiB
Average record size in memory: 41.1 B
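
The derived figures in this panel follow from the raw counts. A minimal sketch of the arithmetic, assuming that the average record size is the total memory divided by the number of rows and that 1 KiB is 1024 bytes (the variable names are illustrative, not taken from the profiler):

```python
# Values copied from the "Dataset statistics" panel.
n_rows = 119
n_columns = 5
n_missing_cells = 0
n_duplicate_rows = 1
avg_record_bytes = 41.1

n_cells = n_rows * n_columns                      # total number of cells
missing_pct = 100 * n_missing_cells / n_cells     # share of missing cells
duplicate_pct = 100 * n_duplicate_rows / n_rows   # share of duplicate rows
total_kib = avg_record_bytes * n_rows / 1024      # assumed: total = average * rows

print(f"cells: {n_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"duplicate rows: {duplicate_pct:.1f}%")
print(f"total size: {total_kib:.1f} KiB")
```

Rounded to one decimal place, the duplicate share and the total size agree with the 0.8% and 4.8 KiB reported.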

Variable types

Categorical: 3
Text: 2

Dataset

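Description: JDC Designated Duty-Free Shop, after-sales service (AS) contacts and addresses by brand, as of November 2015 (JDC지정면세점_브랜드별 AS 연락처 및 주소(2015년 11월 기준))
Author: Jeju Free International City Development Center (제주국제자유도시개발센터)
URL: https://www.data.go.kr/data/15044051/fileData.do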

Alerts

Dataset has 1 (0.8%) duplicate row [Duplicates]
품종 is highly overall correlated with 주소-시도구분 and 1 other field [High correlation]
주소-시도구분 is highly overall correlated with 품종 and 1 other field [High correlation]
주소-세부주소 is highly overall correlated with 품종 and 1 other field [High correlation]
주소-시도구분 is highly imbalanced (68.9%) [Imbalance]
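
The 68.9% imbalance figure can be reproduced from the category counts of 주소-시도구분 (106, 9, 3, 1). A minimal sketch, assuming the score is one minus the Shannon entropy of the class shares normalised by the maximum entropy for that number of classes; the function name `imbalance_score` is hypothetical:

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits."""
    k = len(counts)
    if k <= 1:
        return 0.0  # a single class has no meaningful imbalance
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Counts for 서울시, 경기도, 서울, 제주도 as listed in the report.
sido_counts = [106, 9, 3, 1]
score = imbalance_score(sido_counts)
print(f"{score:.1%}")
```

A score of 0 means perfectly balanced classes, and a score close to 1 means one class dominates.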

Reproduction

Analysis started: 2023-12-12 08:06:22.243830
Analysis finished: 2023-12-12 08:06:23.066912
Duration: 0.82 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
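
A report of this kind can be regenerated along the following lines. This is a sketch under stated assumptions rather than the exact procedure used here: the file names, the `cp949` encoding and the report title are hypothetical, and the original `config.json` settings are not reproduced.

```python
def build_profile(csv_path, html_path, encoding="cp949"):
    """Profile a CSV file and write the HTML report to html_path."""
    # Imported lazily so this module loads even where the heavy
    # third-party dependencies are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # dtype=str keeps phone numbers such as "02-513-2278" as text.
    df = pd.read_csv(csv_path, encoding=encoding, dtype=str)
    report = ProfileReport(df, title="JDC duty-free AS contacts")
    report.to_file(html_path)
    return report
```

For example, `build_profile("jdc_as_contacts.csv", "report.html")` would write the report next to a locally downloaded copy of the dataset.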

Variables

품종 (product category)
Categorical

HIGH CORRELATION

Distinct: 6
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count
선글라스 | 40
패션 | 34
시계 | 25
액세서리 | 15
문구 | 3

Length

Max length: 4
Median length: 2
Mean length: 2.9243697
Min length: 2
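
These length statistics follow directly from the category counts listed under Common Values. A small sketch, assuming length is measured in Unicode code points (what Python's `len` returns for precomposed Hangul):

```python
from statistics import mean, median

# Category counts as listed in the report's Common Values table.
category_counts = {
    "선글라스": 40, "패션": 34, "시계": 25,
    "액세서리": 15, "문구": 3, "완구": 2,
}

# One length entry per observation, so the statistics are row-weighted.
lengths = [len(value) for value, count in category_counts.items()
           for _ in range(count)]

print("max:", max(lengths))
print("median:", median(lengths))
print("mean:", round(mean(lengths), 7))
print("min:", min(lengths))
```

The 119 values total 348 characters, which gives the mean of 2.9243697.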

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 패션
2nd row: 패션
3rd row: 패션
4th row: 패션
5th row: 패션

Common Values

Value | Count | Frequency (%)
선글라스 | 40 | 33.6%
패션 | 34 | 28.6%
시계 | 25 | 21.0%
액세서리 | 15 | 12.6%
문구 | 3 | 2.5%
완구 | 2 | 1.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; same counts as the Common Values table]

브랜드명 (brand name)
Text

Distinct: 112
Distinct (%): 94.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 23
Median length: 16
Mean length: 8.5294118
Min length: 2

Characters and Unicode

Total characters: 1015
Distinct characters: 76
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
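
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories for the five sample brand names from the report; it is not the profiler's own code, and it covers categories only (the standard library has no script or block lookup).

```python
import unicodedata
from collections import Counter

# The five sample values shown for this variable.
samples = ["Longchamp", "Etro", "Kipling", "Louis Quatorze", "PRIMA CLASSE"]

# Two-letter general categories: Lu = Uppercase Letter,
# Ll = Lowercase Letter, Zs = Space Separator.
category_counts = Counter(
    unicodedata.category(ch) for text in samples for ch in text
)
print(category_counts.most_common())

# Hangul syllables fall under Lo, reported as "Other Letter".
print(unicodedata.category("라"))
```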

Unique

Unique: 105
Unique (%): 88.2%

Sample

1st row: Longchamp
2nd row: Etro
3rd row: Kipling
4th row: Louis Quatorze
5th row: PRIMA CLASSE

Value | Count | Frequency (%)
(value missing in source) | 13 | 7.6%
butti | 6 | 3.5%
라베트리나 | 5 | 2.9%
fendi | 3 | 1.8%
aigner | 2 | 1.2%
lanvin | 2 | 1.2%
kenzo | 2 | 1.2%
ferragamo | 2 | 1.2%
lagerfeld | 2 | 1.2%
gucci | 2 | 1.2%
Other values (122) | 132 | 77.2%

Most occurring characters

Value | Count | Frequency (%)
A | 54 | 5.3%
(space) | 52 | 5.1%
E | 44 | 4.3%
I | 43 | 4.2%
T | 41 | 4.0%
O | 40 | 3.9%
N | 38 | 3.7%
a | 38 | 3.7%
L | 36 | 3.5%
i | 32 | 3.2%
Other values (66) | 597 | 58.8%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 581 | 57.2%
Lowercase Letter | 301 | 29.7%
Space Separator | 52 | 5.1%
Other Letter | 46 | 4.5%
Other Punctuation | 19 | 1.9%
Dash Punctuation | 12 | 1.2%
Close Punctuation | 1 | 0.1%
Math Symbol | 1 | 0.1%
Decimal Number | 1 | 0.1%
Open Punctuation | 1 | 0.1%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
A | 54 | 9.3%
E | 44 | 7.6%
I | 43 | 7.4%
T | 41 | 7.1%
O | 40 | 6.9%
N | 38 | 6.5%
L | 36 | 6.2%
S | 31 | 5.3%
R | 31 | 5.3%
C | 27 | 4.6%
Other values (16) | 196 | 33.7%

Lowercase Letter
Value | Count | Frequency (%)
a | 38 | 12.6%
i | 32 | 10.6%
e | 31 | 10.3%
r | 28 | 9.3%
o | 26 | 8.6%
s | 22 | 7.3%
l | 18 | 6.0%
c | 17 | 5.6%
n | 15 | 5.0%
m | 10 | 3.3%
Other values (12) | 64 | 21.3%

Other Letter
Value | Count | Frequency (%)
(not preserved) | 7 | 15.2%
(not preserved) | 6 | 13.0%
(not preserved) | 6 | 13.0%
(not preserved) | 6 | 13.0%
(not preserved) | 6 | 13.0%
(not preserved) | 2 | 4.3%
(not preserved) | 2 | 4.3%
(not preserved) | 1 | 2.2%
(not preserved) | 1 | 2.2%
(not preserved) | 1 | 2.2%
Other values (8) | 8 | 17.4%

Other Punctuation
Value | Count | Frequency (%)
. | 14 | 73.7%
& | 3 | 15.8%
' | 1 | 5.3%
, | 1 | 5.3%

Space Separator
Value | Count | Frequency (%)
(space) | 52 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 12 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1 | 100.0%

Math Symbol
Value | Count | Frequency (%)
+ | 1 | 100.0%

Decimal Number
Value | Count | Frequency (%)
2 | 1 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 882 | 86.9%
Common | 87 | 8.6%
Hangul | 46 | 4.5%

Most frequent character per script

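Latin
Value | Count | Frequency (%)
A | 54 | 6.1%
E | 44 | 5.0%
I | 43 | 4.9%
T | 41 | 4.6%
O | 40 | 4.5%
N | 38 | 4.3%
a | 38 | 4.3%
L | 36 | 4.1%
i | 32 | 3.6%
S | 31 | 3.5%
Other values (38) | 485 | 55.0%

Hangul
Value | Count | Frequency (%)
(not preserved) | 7 | 15.2%
(not preserved) | 6 | 13.0%
(not preserved) | 6 | 13.0%
(not preserved) | 6 | 13.0%
(not preserved) | 6 | 13.0%
(not preserved) | 2 | 4.3%
(not preserved) | 2 | 4.3%
(not preserved) | 1 | 2.2%
(not preserved) | 1 | 2.2%
(not preserved) | 1 | 2.2%
Other values (8) | 8 | 17.4%

Common
Value | Count | Frequency (%)
(space) | 52 | 59.8%
. | 14 | 16.1%
- | 12 | 13.8%
& | 3 | 3.4%
) | 1 | 1.1%
+ | 1 | 1.1%
2 | 1 | 1.1%
' | 1 | 1.1%
( | 1 | 1.1%
, | 1 | 1.1%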

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 969 | 95.5%
Hangul | 46 | 4.5%

Most frequent character per block

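ASCII
Value | Count | Frequency (%)
A | 54 | 5.6%
(space) | 52 | 5.4%
E | 44 | 4.5%
I | 43 | 4.4%
T | 41 | 4.2%
O | 40 | 4.1%
N | 38 | 3.9%
a | 38 | 3.9%
L | 36 | 3.7%
i | 32 | 3.3%
Other values (48) | 551 | 56.9%

Hangul
Value | Count | Frequency (%)
(not preserved) | 7 | 15.2%
(not preserved) | 6 | 13.0%
(not preserved) | 6 | 13.0%
(not preserved) | 6 | 13.0%
(not preserved) | 6 | 13.0%
(not preserved) | 2 | 4.3%
(not preserved) | 2 | 4.3%
(not preserved) | 1 | 2.2%
(not preserved) | 1 | 2.2%
(not preserved) | 1 | 2.2%
Other values (8) | 8 | 17.4%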

연락처 (contact number)
Text

Distinct: 52
Distinct (%): 43.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 34
Median length: 21
Mean length: 12.747899
Min length: 9

Characters and Unicode

Total characters: 1517
Distinct characters: 19
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 29
Unique (%): 24.4%
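
The report separates Distinct from Unique. A minimal sketch of the difference, assuming Distinct counts the different values present and Unique counts the values that occur exactly once; the input is the ten phone numbers from the last rows of the Sample section, not the full column:

```python
from collections import Counter

# Phone numbers of rows 109 to 118 from the Sample section.
phones = [
    "02-717-1439", "02-717-1439",
    "02-563-8264", "02-563-8264", "02-563-8264",
    "02-326-3470(Ext. 516)",
    "064-733-4627",
    "02-2017-9654~5", "02-2017-9654~5", "02-2017-9654~5",
]

counts = Counter(phones)
n_distinct = len(counts)                               # different values
n_unique = sum(1 for c in counts.values() if c == 1)   # values seen once

print(f"distinct: {n_distinct} ({n_distinct / len(phones):.1%})")
print(f"unique: {n_unique} ({n_unique / len(phones):.1%})")
```

On the full column the same definitions would give the 52 distinct and 29 unique values reported here.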

Sample

1st row: 02-513-2278
2nd row: 02-3018-2308
3rd row: 02-3489-6415
4th row: 02-582-5271
5th row: 02-761-0891

Value | Count | Frequency (%)
02-466-4557 | 9 | 7.3%
02-3481-3636~7 | 7 | 5.7%
02-6712-0812 | 7 | 5.7%
02-2156-0732 | 6 | 4.9%
02-2658-0013 | 6 | 4.9%
02-717-1439 | 6 | 4.9%
02-3284-1300(ext.800 | 5 | 4.1%
02-513-2331 | 4 | 3.3%
1599-3016 | 4 | 3.3%
02-790-6738 | 4 | 3.3%
Other values (45) | 65 | 52.8%

Most occurring characters

Value | Count | Frequency (%)
- | 233 | 15.4%
0 | 222 | 14.6%
2 | 183 | 12.1%
1 | 136 | 9.0%
3 | 129 | 8.5%
7 | 107 | 7.1%
6 | 105 | 6.9%
5 | 95 | 6.3%
4 | 95 | 6.3%
8 | 78 | 5.1%
Other values (9) | 134 | 8.8%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1207 | 79.6%
Dash Punctuation | 233 | 15.4%
Lowercase Letter | 18 | 1.2%
Open Punctuation | 13 | 0.9%
Close Punctuation | 13 | 0.9%
Math Symbol | 11 | 0.7%
Uppercase Letter | 9 | 0.6%
Other Punctuation | 9 | 0.6%
Space Separator | 4 | 0.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 222 | 18.4%
2 | 183 | 15.2%
1 | 136 | 11.3%
3 | 129 | 10.7%
7 | 107 | 8.9%
6 | 105 | 8.7%
5 | 95 | 7.9%
4 | 95 | 7.9%
8 | 78 | 6.5%
9 | 57 | 4.7%

Lowercase Letter
Value | Count | Frequency (%)
x | 9 | 50.0%
t | 9 | 50.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 233 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 13 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 13 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 11 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
E | 9 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 9 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 4 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1490 | 98.2%
Latin | 27 | 1.8%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 233 | 15.6%
0 | 222 | 14.9%
2 | 183 | 12.3%
1 | 136 | 9.1%
3 | 129 | 8.7%
7 | 107 | 7.2%
6 | 105 | 7.0%
5 | 95 | 6.4%
4 | 95 | 6.4%
8 | 78 | 5.2%
Other values (6) | 107 | 7.2%

Latin
Value | Count | Frequency (%)
E | 9 | 33.3%
x | 9 | 33.3%
t | 9 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1517 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 233 | 15.4%
0 | 222 | 14.6%
2 | 183 | 12.1%
1 | 136 | 9.0%
3 | 129 | 8.5%
7 | 107 | 7.1%
6 | 105 | 6.9%
5 | 95 | 6.3%
4 | 95 | 6.3%
8 | 78 | 5.1%
Other values (9) | 134 | 8.8%

주소-시도구분 (address: province or city)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 4
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count
서울시 | 106
경기도 | 9
서울 | 3
제주도 | 1

Length

Max length: 3
Median length: 3
Mean length: 2.9747899
Min length: 2

Unique

Unique: 1
Unique (%): 0.8%

Sample

1st row: 서울시
2nd row: 서울시
3rd row: 서울시
4th row: 서울시
5th row: 서울시

Common Values

Value | Count | Frequency (%)
서울시 | 106 | 89.1%
경기도 | 9 | 7.6%
서울 | 3 | 2.5%
제주도 | 1 | 0.8%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; same counts as the Common Values table]

주소-세부주소 (address: detailed address)
Categorical

HIGH CORRELATION

Distinct: 48
Distinct (%): 40.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count
강남구 신사동 665-1 한양타운 2층 | 9
서초구 서초동 1696-13번지 애니빌딩 5층 | 7
동작구 신대방동 395-70 전문건설회관 23층 | 7
강남구 신사동 588-8번지 카라코람 빌딩 3층 | 7
중구 회현동 1가 100-83번지 ㈜디케이 | 6
Other values (43) | 83

Length

Max length: 38
Median length: 32
Mean length: 24.336134
Min length: 14

Unique

Unique: 25
Unique (%): 21.0%

Sample

1st row: 강남구 논현동 231-13번지 팍스타워 8층
2nd row: 강남구 삼성로 133길 12 청담동 백운빌딩 2층 면세사업부
3rd row: 서초구 효령로 317 대한건축사협회 빌딩 5층 ㈜리노스
4th row: 서초구 서초동 1446-11번지 현대슈퍼빌 상가동 301호
5th row: 영등포구 국제금융로 70 미원빌딩 1504-1호

Common Values

Value | Count | Frequency (%)
강남구 신사동 665-1 한양타운 2층 | 9 | 7.6%
서초구 서초동 1696-13번지 애니빌딩 5층 | 7 | 5.9%
동작구 신대방동 395-70 전문건설회관 23층 | 7 | 5.9%
강남구 신사동 588-8번지 카라코람 빌딩 3층 | 7 | 5.9%
중구 회현동 1가 100-83번지 ㈜디케이 | 6 | 5.0%
강남구 논현동 50번지 삼익전자빌딩 6층 | 6 | 5.0%
강서구 가양 1동 192-12 | 6 | 5.0%
강남구 삼성동 145-18 구구빌딩 8층 135-090 | 4 | 3.4%
강남구 언주로 609 팍스타워 B동 지하1층 | 4 | 3.4%
강남구 봉은사로 44길 62 역삼동 룩옵틱스 빌딩 본관2층 AS팀 | 4 | 3.4%
Other values (38) | 59 | 49.6%

Length

[Figure: histogram of lengths of the category]

Value | Count | Frequency (%)
강남구 | 60 | 9.2%
신사동 | 17 | 2.6%
2층 | 16 | 2.5%
빌딩 | 14 | 2.2%
5층 | 11 | 1.7%
논현동 | 10 | 1.5%
역삼동 | 10 | 1.5%
한양타운 | 9 | 1.4%
서초구 | 9 | 1.4%
6층 | 9 | 1.4%
Other values (185) | 485 | 74.6%

Correlations

[Figure: correlation heatmap]
Variable | 품종 | 연락처 | 주소-시도구분 | 주소-세부주소
품종 | 1.000 | 1.000 | 0.681 | 0.999
연락처 | 1.000 | 1.000 | 1.000 | 1.000
주소-시도구분 | 0.681 | 1.000 | 1.000 | 1.000
주소-세부주소 | 0.999 | 1.000 | 1.000 | 1.000

[Figure: correlation heatmap]
Variable | 품종 | 주소-시도구분 | 주소-세부주소
품종 | 1.000 | 0.506 | 0.755
주소-시도구분 | 0.506 | 1.000 | 0.786
주소-세부주소 | 0.755 | 0.786 | 1.000

[Figure: correlation heatmap]
Variable | 품종 | 주소-시도구분 | 주소-세부주소
품종 | 1.000 | 0.506 | 0.755
주소-시도구분 | 0.506 | 1.000 | 0.786
주소-세부주소 | 0.755 | 0.786 | 1.000
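
This copy of the report does not name the association measures behind the three matrices. For pairs of categorical variables, Cramér's V is one common choice, so the sketch below shows its plain (uncorrected) form as an illustration. That the report uses this measure is an assumption, the profiler may apply a bias correction, and the toy inputs are hypothetical, so the output is not expected to match the values above.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Plain Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    pair_counts = Counter(zip(xs, ys))
    x_counts = Counter(xs)
    y_counts = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, nx in x_counts.items():
        for y, ny in y_counts.items():
            expected = nx * ny / n
            observed = pair_counts.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(x_counts), len(y_counts)) - 1
    if k == 0:
        return 0.0  # a constant variable carries no association
    return math.sqrt(chi2 / (n * k))

# Hypothetical toy data: one fully dependent pair, one independent pair.
print(cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"]))
print(cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"]))
```

Values near 1 indicate that one variable almost determines the other, which is typical when a column has nearly as many distinct values as rows.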

Missing values

[Figure: nullity bar chart] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly spot patterns in data completeness.

Sample

First rows

Index | 품종 | 브랜드명 | 연락처 | 주소-시도구분 | 주소-세부주소
0 | 패션 | Longchamp | 02-513-2278 | 서울시 | 강남구 논현동 231-13번지 팍스타워 8층
1 | 패션 | Etro | 02-3018-2308 | 서울시 | 강남구 삼성로 133길 12 청담동 백운빌딩 2층 면세사업부
2 | 패션 | Kipling | 02-3489-6415 | 서울시 | 서초구 효령로 317 대한건축사협회 빌딩 5층 ㈜리노스
3 | 패션 | Louis Quatorze | 02-582-5271 | 서울시 | 서초구 서초동 1446-11번지 현대슈퍼빌 상가동 301호
4 | 패션 | PRIMA CLASSE | 02-761-0891 | 서울시 | 영등포구 국제금융로 70 미원빌딩 1504-1호
5 | 패션 | BEANPOLE | 02-3702-7915 | 경기도 | 김포시 고천읍 전호리 743번지 삼성물류센터
6 | 패션 | Daks | 1544-5114 | 경기도 | 군포시 산본동 1026-19번지 LF CS 센터
7 | 패션 | J.ESTINA | 031-8028-5577 | 경기도 | 광주시 장지9길 48 제이에스티나 물류센터
8 | 패션 | COURONNE | 031-218-9612 | 경기도 | 수원시 영통구 원천동 380-1번지
9 | 패션 | S.T.DUPONT | 02-2106-3418 | 서울시 | 강남구 논현로 149길 11 세중빌딩 1층

Last rows

Index | 품종 | 브랜드명 | 연락처 | 주소-시도구분 | 주소-세부주소
109 | 선글라스 | Shiseido | 02-717-1439 | 서울시 | 중구 회현동 1가 100-83번지 ㈜디케이
110 | 선글라스 | S.T.DUPONT | 02-717-1439 | 서울시 | 중구 회현동 1가 100-83번지 ㈜디케이
111 | 선글라스 | RUDY PROJECT | 02-563-8264 | 서울시 | 강남구 봉은사로 20길 14 지하 2층(역삼동, 주미빌딩)
112 | 선글라스 | TIFOSI | 02-563-8264 | 서울시 | 강남구 봉은사로 20길 14 지하 2층(역삼동, 주미빌딩)
113 | 선글라스 | TAGHeuer | 02-563-8264 | 서울시 | 강남구 봉은사로 20길 14 지하 2층(역삼동, 주미빌딩)
114 | 완구 | 신우토이 | 02-326-3470(Ext. 516) | 서울시 | 마포구 서교통 477-28 서진빌딩 5층
115 | 완구 | 테디베어 | 064-733-4627 | 제주도 | 서귀포시 도홍동 130-1번지 인화빌딩 1층
116 | 문구 | Parker | 02-2017-9654~5 | 서울시 | 강남구 청담동 120-3 대신BD
117 | 문구 | Waterman | 02-2017-9654~5 | 서울시 | 강남구 청담동 120-3 대신BD
118 | 문구 | Prisma Color | 02-2017-9654~5 | 서울시 | 강남구 청담동 120-3 대신BD

Duplicate rows

Most frequently occurring

Index | 품종 | 브랜드명 | 연락처 | 주소-시도구분 | 주소-세부주소 | # duplicates
0 | 액세서리 | Facco | 02-3414-0607 | 서울 | 강남구 역삼동 840 한은빌딩 2층 | 2
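
Fully duplicated rows can be found by counting whole rows as tuples. A minimal sketch, assuming that "# duplicates" is the number of times a row occurs and that the dataset-level "Duplicate rows" figure counts only the extra copies; the three-row input is an illustrative subset, not the full dataset:

```python
from collections import Counter

# The Facco row occurs twice; the Longchamp row is a non-duplicate control.
rows = [
    ("액세서리", "Facco", "02-3414-0607", "서울", "강남구 역삼동 840 한은빌딩 2층"),
    ("액세서리", "Facco", "02-3414-0607", "서울", "강남구 역삼동 840 한은빌딩 2층"),
    ("패션", "Longchamp", "02-513-2278", "서울시", "강남구 논현동 231-13번지 팍스타워 8층"),
]

row_counts = Counter(rows)

# Rows that occur more than once, with their occurrence counts.
duplicates = {row: n for row, n in row_counts.items() if n > 1}

# Extra copies beyond the first occurrence of each row.
n_duplicate_rows = sum(n - 1 for n in duplicates.values())

for row, n in duplicates.items():
    print(row[1], n)
print("duplicate rows:", n_duplicate_rows)
```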