Overview

Dataset statistics

Number of variables: 2
Number of observations: 10000
Missing cells: 8
Missing cells (%): < 0.1%
Duplicate rows: 61
Duplicate rows (%): 0.6%
Total size in memory: 234.4 KiB
Average record size in memory: 24.0 B
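
The percentages and the average record size follow from the raw counts. A minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (observations times variables) and that 1 KiB is 1024 bytes; the variable names are illustrative only:

```python
# Raw figures copied from the dataset statistics.
n_variables = 2
n_observations = 10_000
missing_cells = 8
duplicate_rows = 61
total_memory_kib = 234.4

# Assumption: missing cells are divided by observations x variables.
missing_pct = missing_cells / (n_variables * n_observations) * 100
duplicate_pct = duplicate_rows / n_observations * 100
# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.2f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the missing-cell share works out to 0.04%, which the report rounds to "< 0.1%".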

Variable types

Text: 2

Dataset

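Description: Status data on mail-order businesses (통신판매업) registered in Haeundae-gu, Busan Metropolitan City, through November 30, 2023. It provides two fields: business name and products handled.
Author: Haeundae-gu, Busan Metropolitan City (부산광역시 해운대구)
URL: https://www.data.go.kr/data/15063782/fileData.do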

Alerts

Dataset has 61 (0.6%) duplicate rows (Duplicates)

Reproduction

Analysis started: 2023-12-16 15:22:45.403777
Analysis finished: 2023-12-16 15:22:48.757547
Duration: 3.35 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
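
A report of this kind can be regenerated with ydata-profiling. The sketch below is an assumption-laden outline, not the exact invocation used here: the CSV file name is hypothetical, the file's encoding is unknown, and the report title is a placeholder. The third-party imports sit inside the function so the snippet loads even where pandas and ydata-profiling are not installed.

```python
from pathlib import Path


def build_report(csv_path: str, out_path: str = "report.html") -> Path:
    """Profile a CSV file and write an HTML report; returns the output path."""
    # Imported lazily: both packages are third-party dependencies.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: pandas can decode the file with its defaults;
    # pass encoding=... to read_csv if that is not the case.
    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Profiling report")
    profile.to_file(out_path)
    return Path(out_path)


# Example call (hypothetical file name):
# build_report("haeundae_mail_order_businesses.csv")
```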

Variables

법인또는상호 (corporation or business name)

Distinct: 9840
Distinct (%): 98.4%
Missing: 1
Missing (%): < 0.1%
Memory size: 156.2 KiB

Length

Max length: 52
Median length: 43
Mean length: 7.2063206
Min length: 1
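
Length statistics of this kind can be computed directly from the strings. A minimal sketch, assuming length means the number of Unicode code points and that missing values are skipped; `length_stats` is an illustrative helper, not part of the profiling tool. The last lines cross-check the reported mean against the total character count (72056) and the single missing value reported for this variable.

```python
import statistics


def length_stats(values):
    """Return max, median, mean and min string length, skipping missing values."""
    lengths = [len(v) for v in values if v is not None]
    return {
        "max": max(lengths),
        "median": statistics.median(lengths),
        "mean": statistics.mean(lengths),
        "min": min(lengths),
    }


# The five sample values listed for this variable, plus one missing entry.
sample = ["재수니즈", "해림통상", "데이터케이스", "당당한우", "제이비컴퍼니", None]
stats = length_stats(sample)
print(stats)

# Cross-check: total characters / non-missing observations.
reported_mean = 72056 / (10000 - 1)
print(round(reported_mean, 7))
```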

Characters and Unicode

Total characters: 72056
Distinct characters: 1086
Distinct categories: 12
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
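
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The standard library exposes the general category but not the script or block, so only categories are shown; `CATEGORY_NAMES` and `category_counts` are illustrative names, and the two input strings are business names taken from the sample rows of this report.

```python
import unicodedata
from collections import Counter

# Two-letter general category codes mapped to the names used in the report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Pe": "Close Punctuation",
    "Ps": "Open Punctuation",
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Pc": "Connector Punctuation",
    "Sm": "Math Symbol",
    "So": "Other Symbol",
}


def category_counts(values):
    """Count characters per Unicode general category, skipping missing values."""
    counts = Counter()
    for value in values:
        if value is None:
            continue
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw code for categories not listed above.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


counts = category_counts(["삼정(다된다AD)", "송대리 스토어", None])
print(counts.most_common())
```

Hangul syllables fall under Other Letter, which is why that category dominates a column of Korean business names.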

Unique

Unique: 9698
Unique (%): 97.0%
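
The Distinct and Unique figures measure different things. A minimal sketch, assuming Distinct counts the different non-missing values and Unique counts the values that occur exactly once (the usual reading in profiling reports; the helper name is illustrative):

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(v for v in values if v is not None)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


names = ["다온", "다온", "베르", "베르", "베르", "온온온", None]
distinct, unique = distinct_and_unique(names)
print(distinct, unique)
```

Under this reading Unique can never exceed Distinct, which is consistent with the 9698 and 9840 reported for this variable.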

Sample

1st row: 재수니즈
2nd row: 해림통상
3rd row: 데이터케이스
4th row: 당당한우
5th row: 제이비컴퍼니
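Value | Count | Frequency (%)
주식회사 (corporation) | 842 | 6.2%
[not preserved] | 83 | 0.6%
[not preserved] | 39 | 0.3%
co | 35 | 0.3%
company | 31 | 0.2%
컴퍼니 (company) | 31 | 0.2%
해운대 (Haeundae) | 31 | 0.2%
ltd | 30 | 0.2%
해운대점 (Haeundae branch) | 29 | 0.2%
인셀덤 | 25 | 0.2%
Other values (11145) | 12332 | 91.3%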
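
Token counts like these can be approximated by lowercasing each value, splitting on whitespace, and trimming punctuation from both ends of every token. This is a sketch under that assumption; the exact tokenization used by the profiling tool is not stated in the report, and `word_counts` is an illustrative helper.

```python
import string
from collections import Counter


def word_counts(values):
    """Count lowercased, whitespace-separated tokens, skipping missing values."""
    counts = Counter()
    for value in values:
        if value is None:
            continue
        for token in value.lower().split():
            # Trim ASCII punctuation from both ends, e.g. "co.," -> "co".
            token = token.strip(string.punctuation)
            if token:
                counts[token] += 1
    return counts


counts = word_counts(["주식회사 ABC", "ABC Co., Ltd.", "abc company", None])
print(counts.most_common(3))
```

Lowercasing explains why tokens such as "co", "company" and "ltd" appear in lower case in the table even though many business names are capitalised.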

Most occurring characters

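Value | Count | Frequency (%)
(space) | 3546 | 4.9%
[not preserved] | 2533 | 3.5%
) | 2023 | 2.8%
( | 2021 | 2.8%
[not preserved] | 2007 | 2.8%
[not preserved] | 1463 | 2.0%
[not preserved] | 1162 | 1.6%
[not preserved] | 1120 | 1.6%
[not preserved] | 945 | 1.3%
[not preserved] | 921 | 1.3%
Other values (1076) | 54315 | 75.4%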

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 50905 | 70.6%
Lowercase Letter | 7051 | 9.8%
Uppercase Letter | 5500 | 7.6%
Space Separator | 3546 | 4.9%
Close Punctuation | 2025 | 2.8%
Open Punctuation | 2023 | 2.8%
Decimal Number | 564 | 0.8%
Other Punctuation | 331 | 0.5%
Dash Punctuation | 58 | 0.1%
Connector Punctuation | 26 | < 0.1%
Other values (2) | 27 | < 0.1%

Most frequent character per category

Other Letter
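Value | Count | Frequency (%)
[not preserved] | 2533 | 5.0%
[not preserved] | 2007 | 3.9%
[not preserved] | 1463 | 2.9%
[not preserved] | 1162 | 2.3%
[not preserved] | 1120 | 2.2%
[not preserved] | 945 | 1.9%
[not preserved] | 921 | 1.8%
[not preserved] | 829 | 1.6%
[not preserved] | 782 | 1.5%
[not preserved] | 714 | 1.4%
Other values (993) | 38429 | 75.5%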
Lowercase Letter
Value | Count | Frequency (%)
e | 828 | 11.7%
o | 757 | 10.7%
a | 616 | 8.7%
n | 551 | 7.8%
i | 488 | 6.9%
t | 449 | 6.4%
r | 438 | 6.2%
l | 389 | 5.5%
s | 329 | 4.7%
m | 267 | 3.8%
Other values (16) | 1939 | 27.5%
Uppercase Letter
Value | Count | Frequency (%)
A | 447 | 8.1%
O | 421 | 7.7%
E | 396 | 7.2%
S | 375 | 6.8%
N | 343 | 6.2%
I | 323 | 5.9%
L | 306 | 5.6%
M | 295 | 5.4%
C | 293 | 5.3%
T | 289 | 5.3%
Other values (16) | 2012 | 36.6%
Decimal Number
Value | Count | Frequency (%)
1 | 106 | 18.8%
2 | 96 | 17.0%
3 | 61 | 10.8%
5 | 56 | 9.9%
0 | 54 | 9.6%
4 | 50 | 8.9%
7 | 46 | 8.2%
8 | 32 | 5.7%
6 | 32 | 5.7%
9 | 31 | 5.5%
Other Punctuation
Value | Count | Frequency (%)
. | 206 | 62.2%
& | 72 | 21.8%
' | 30 | 9.1%
! | 6 | 1.8%
: | 6 | 1.8%
# | 5 | 1.5%
/ | 5 | 1.5%
[not preserved] | 1 | 0.3%
Math Symbol
Value | Count | Frequency (%)
+ | 2 | 33.3%
[not preserved] | 1 | 16.7%
< | 1 | 16.7%
> | 1 | 16.7%
× | 1 | 16.7%
Close Punctuation
Value | Count | Frequency (%)
) | 2023 | 99.9%
] | 2 | 0.1%
Open Punctuation
Value | Count | Frequency (%)
( | 2021 | 99.9%
[ | 2 | 0.1%
Space Separator
Value | Count | Frequency (%)
(space) | 3546 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 58 | 100.0%
Connector Punctuation
Value | Count | Frequency (%)
_ | 26 | 100.0%
Other Symbol
Value | Count | Frequency (%)
[not preserved] | 21 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 50911 | 70.7%
Latin | 12551 | 17.4%
Common | 8579 | 11.9%
Han | 15 | < 0.1%

Most frequent character per script

Hangul
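Value | Count | Frequency (%)
[not preserved] | 2533 | 5.0%
[not preserved] | 2007 | 3.9%
[not preserved] | 1463 | 2.9%
[not preserved] | 1162 | 2.3%
[not preserved] | 1120 | 2.2%
[not preserved] | 945 | 1.9%
[not preserved] | 921 | 1.8%
[not preserved] | 829 | 1.6%
[not preserved] | 782 | 1.5%
[not preserved] | 714 | 1.4%
Other values (979) | 38435 | 75.5%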
Latin
Value | Count | Frequency (%)
e | 828 | 6.6%
o | 757 | 6.0%
a | 616 | 4.9%
n | 551 | 4.4%
i | 488 | 3.9%
t | 449 | 3.6%
A | 447 | 3.6%
r | 438 | 3.5%
O | 421 | 3.4%
E | 396 | 3.2%
Other values (42) | 7160 | 57.0%
Common
Value | Count | Frequency (%)
(space) | 3546 | 41.3%
) | 2023 | 23.6%
( | 2021 | 23.6%
. | 206 | 2.4%
1 | 106 | 1.2%
2 | 96 | 1.1%
& | 72 | 0.8%
3 | 61 | 0.7%
- | 58 | 0.7%
5 | 56 | 0.7%
Other values (20) | 334 | 3.9%
Han
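Value | Count | Frequency (%)
[not preserved] | 1 | 6.7%
[not preserved] | 1 | 6.7%
[not preserved] | 1 | 6.7%
[not preserved] | 1 | 6.7%
[not preserved] | 1 | 6.7%
[not preserved] | 1 | 6.7%
[not preserved] | 1 | 6.7%
[not preserved] | 1 | 6.7%
[not preserved] | 1 | 6.7%
[not preserved] | 1 | 6.7%
Other values (5) | 5 | 33.3%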

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 50890 | 70.6%
ASCII | 21127 | 29.3%
None | 23 | < 0.1%
CJK | 15 | < 0.1%
Math Operators | 1 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 3546 | 16.8%
) | 2023 | 9.6%
( | 2021 | 9.6%
e | 828 | 3.9%
o | 757 | 3.6%
a | 616 | 2.9%
n | 551 | 2.6%
i | 488 | 2.3%
t | 449 | 2.1%
A | 447 | 2.1%
Other values (69) | 9401 | 44.5%
Hangul
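Value | Count | Frequency (%)
[not preserved] | 2533 | 5.0%
[not preserved] | 2007 | 3.9%
[not preserved] | 1463 | 2.9%
[not preserved] | 1162 | 2.3%
[not preserved] | 1120 | 2.2%
[not preserved] | 945 | 1.9%
[not preserved] | 921 | 1.8%
[not preserved] | 829 | 1.6%
[not preserved] | 782 | 1.5%
[not preserved] | 714 | 1.4%
Other values (978) | 38414 | 75.5%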
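None
Value | Count | Frequency (%)
[not preserved] | 21 | 91.3%
× | 1 | 4.3%
[not preserved] | 1 | 4.3%
CJK
Value | Count | Frequency (%)
[not preserved] | 1 | 6.7%
[not preserved] | 1 | 6.7%
[not preserved] | 1 | 6.7%
[not preserved] | 1 | 6.7%
[not preserved] | 1 | 6.7%
[not preserved] | 1 | 6.7%
[not preserved] | 1 | 6.7%
[not preserved] | 1 | 6.7%
[not preserved] | 1 | 6.7%
[not preserved] | 1 | 6.7%
Other values (5) | 5 | 33.3%
Math Operators
Value | Count | Frequency (%)
[not preserved] | 1 | 100.0%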

취급품목 (products handled)

Distinct: 446
Distinct (%): 4.5%
Missing: 7
Missing (%): 0.1%
Memory size: 156.2 KiB

Length

Max length: 87
Median length: 84
Mean length: 9.0476333
Min length: 1

Characters and Unicode

Total characters: 90413
Distinct characters: 51
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 249
Unique (%): 2.5%

Sample

1st row: 종합몰
2nd row: 종합몰 교육/도서/완구/오락 의류/패션/잡화/뷰티
3rd row: 기타
4th row: 종합몰
5th row: 기타
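Value | Count | Frequency (%)
종합몰 (general mall) | 3831 | 27.3%
의류/패션/잡화/뷰티 (clothing/fashion/accessories/beauty) | 3575 | 25.5%
기타 (other) | 2022 | 14.4%
건강/식품 (health/food) | 1260 | 9.0%
교육/도서/완구/오락 (education/books/toys/entertainment) | 690 | 4.9%
레져/여행/공연 (leisure/travel/performances) | 643 | 4.6%
가구/수납용품 (furniture/storage goods) | 544 | 3.9%
컴퓨터/사무용품 (computers/office supplies) | 469 | 3.3%
가전 (home appliances) | 418 | 3.0%
자동차/자동차용품 (automobiles/automotive goods) | 347 | 2.5%
Other values (3) | 242 | 1.7%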

Most occurring characters

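Value | Count | Frequency (%)
/ | 16778 | 18.6%
(space) | 4048 | 4.5%
[not preserved] | 3831 | 4.2%
[not preserved] | 3831 | 4.2%
[not preserved] | 3831 | 4.2%
[not preserved] | 3575 | 4.0%
[not preserved] | 3575 | 4.0%
[not preserved] | 3575 | 4.0%
[not preserved] | 3575 | 4.0%
[not preserved] | 3575 | 4.0%
Other values (41) | 40219 | 44.5%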

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 69545 | 76.9%
Other Punctuation | 16778 | 18.6%
Space Separator | 4048 | 4.5%
Dash Punctuation | 42 | < 0.1%

Most frequent character per category

Other Letter
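Value | Count | Frequency (%)
[not preserved] | 3831 | 5.5%
[not preserved] | 3831 | 5.5%
[not preserved] | 3831 | 5.5%
[not preserved] | 3575 | 5.1%
[not preserved] | 3575 | 5.1%
[not preserved] | 3575 | 5.1%
[not preserved] | 3575 | 5.1%
[not preserved] | 3575 | 5.1%
[not preserved] | 3575 | 5.1%
[not preserved] | 3575 | 5.1%
Other values (38) | 33027 | 47.5%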
Other Punctuation
Value | Count | Frequency (%)
/ | 16778 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 4048 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 42 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 69545 | 76.9%
Common | 20868 | 23.1%

Most frequent character per script

Hangul
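Value | Count | Frequency (%)
[not preserved] | 3831 | 5.5%
[not preserved] | 3831 | 5.5%
[not preserved] | 3831 | 5.5%
[not preserved] | 3575 | 5.1%
[not preserved] | 3575 | 5.1%
[not preserved] | 3575 | 5.1%
[not preserved] | 3575 | 5.1%
[not preserved] | 3575 | 5.1%
[not preserved] | 3575 | 5.1%
[not preserved] | 3575 | 5.1%
Other values (38) | 33027 | 47.5%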
Common
Value | Count | Frequency (%)
/ | 16778 | 80.4%
(space) | 4048 | 19.4%
- | 42 | 0.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 69545 | 76.9%
ASCII | 20868 | 23.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
/ | 16778 | 80.4%
(space) | 4048 | 19.4%
- | 42 | 0.2%
Hangul
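Value | Count | Frequency (%)
[not preserved] | 3831 | 5.5%
[not preserved] | 3831 | 5.5%
[not preserved] | 3831 | 5.5%
[not preserved] | 3575 | 5.1%
[not preserved] | 3575 | 5.1%
[not preserved] | 3575 | 5.1%
[not preserved] | 3575 | 5.1%
[not preserved] | 3575 | 5.1%
[not preserved] | 3575 | 5.1%
[not preserved] | 3575 | 5.1%
Other values (38) | 33027 | 47.5%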

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
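
Nullity correlation can be read as the Pearson correlation between per-column presence indicators (1 where a value is present, 0 where it is missing). A minimal sketch under that assumption; `nullity_correlation` is an illustrative helper rather than the function the report uses:

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns.

    Returns None when either column has no variation in presence
    (fully populated or fully missing), where the value is undefined.
    """
    if len(col_a) != len(col_b) or not col_a:
        raise ValueError("columns must be non-empty and of equal length")
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / math.sqrt(var_a * var_b)


# Values missing in the same rows: correlation of 1.
together = nullity_correlation(["a", None, "b", "c"], ["x", None, "y", "z"])
# Values missing in opposite rows: correlation of -1.
opposite = nullity_correlation(["a", None], [None, "x"])
print(together, opposite)
```

A value near 1 means the two columns tend to be missing in the same rows, a value near -1 means one is present where the other is missing, and a value near 0 means missingness in one says little about the other.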

Sample

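(index) | 법인또는상호 (corporation or business name) | 취급품목 (products handled)
40 | 재수니즈 | 종합몰
8825 | 해림통상 | 종합몰 교육/도서/완구/오락 의류/패션/잡화/뷰티
6983 | 데이터케이스 | 기타
8405 | 당당한우 | 종합몰
7313 | 제이비컴퍼니 | 기타
4075 | 삼정(다된다AD) | 종합몰
7833 | 다온인터내셔널 | 종합몰 의류/패션/잡화/뷰티
9535 | 말창이네 | 종합몰
8008 | 로즈라벨 | 종합몰
8917 | 한울아이씨티 | 기타

(index) | 법인또는상호 (corporation or business name) | 취급품목 (products handled)
8188 | 와이드테크 | 종합몰
3640 | 불란서제비 | 종합몰
4656 | 민요한 헤어 | 기타
4491 | 모두삼삼 | 종합몰
1528 | 송대리 스토어 | 종합몰
9927 | (주)비에스펀투어 | 레져/여행/공연
8430 | 이지웰플러스식품 | 건강/식품
5927 | 온온온 | 의류/패션/잡화/뷰티
4317 | 닐리리빠빠 | 건강/식품
8009 | 클로즈홀릭 | 의류/패션/잡화/뷰티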

Duplicate rows

Most frequently occurring

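(index) | 법인또는상호 (corporation or business name) | 취급품목 (products handled) | # duplicates
11 | 두언니 | 의류/패션/잡화/뷰티 | 3
16 | 러버밍 | 의류/패션/잡화/뷰티 | 3
29 | 베르 | 의류/패션/잡화/뷰티 | 3
52 | 제이에이치 | 종합몰 | 3
0 | 골든베이 | 기타 | 2
1 | 광안리요트투어 | 레져/여행/공연 | 2
2 | 금정푸드 | 건강/식품 | 2
3 | 꼭플라워 | 기타 | 2
4 | 꿀통스토어 | 종합몰 | 2
5 | 다온 | 종합몰 | 2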
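
A duplicate table like this can be built by counting identical rows. The report does not state whether its duplicate total counts distinct repeated rows or surplus copies, so the sketch below computes both; `duplicate_groups` is an illustrative helper and the rows are a small hand-made sample using names from the table.

```python
from collections import Counter


def duplicate_groups(rows):
    """Map each row that appears more than once to its number of occurrences."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


rows = [
    ("다온", "종합몰"),
    ("베르", "의류/패션/잡화/뷰티"),
    ("다온", "종합몰"),
    ("베르", "의류/패션/잡화/뷰티"),
    ("베르", "의류/패션/잡화/뷰티"),
    ("온온온", "의류/패션/잡화/뷰티"),
]
groups = duplicate_groups(rows)

# Number of distinct rows that repeat.
n_groups = len(groups)
# Number of surplus copies beyond the first occurrence of each row.
n_surplus = sum(n - 1 for n in groups.values())
print(n_groups, n_surplus)
```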