Overview

Dataset statistics

Number of variables: 3
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 292
Duplicate rows (%): 2.9%
Total size in memory: 332.0 KiB
Average record size in memory: 34.0 B

Variable types

Categorical: 2
Text: 1

Dataset

Description: Provides keyword information for 아임셀러 products, including the base year (기준연도), base month (기준월), and product keyword (상품키워드).
Author: (주)중소기업유통센터
URL: https://www.data.go.kr/data/15067193/fileData.do

Alerts

기준연도 has constant value "2020" (Constant)
기준월 has constant value "9" (Constant)
Dataset has 292 (2.9%) duplicate rows (Duplicates)

Reproduction

Analysis started: 2023-12-12 07:54:56.553358
Analysis finished: 2023-12-12 07:54:57.135972
Duration: 0.58 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

기준연도
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value  Count
2020  10000

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020
2nd row: 2020
3rd row: 2020
4th row: 2020
5th row: 2020

Common Values

Value  Count  Frequency (%)
2020  10000  100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; 2020 occurs 10000 times (100.0%)]

기준월
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value  Count
9  10000

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 9
2nd row: 9
3rd row: 9
4th row: 9
5th row: 9

Common Values

Value  Count  Frequency (%)
9  10000  100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; 9 occurs 10000 times (100.0%)]

상품키워드
Text

Distinct: 9693
Distinct (%): 96.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 21
Median length: 9
Mean length: 4.9298
Min length: 1
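
The length statistics above summarise how long each keyword is. As a minimal sketch of how such figures can be derived, the snippet below computes them for the five sample rows listed for this variable, assuming that length is counted in Unicode code points (Python's `len`). That is an assumption about the report's method, and `length_stats` is a hypothetical helper name, not part of ydata-profiling.

```python
from statistics import mean, median

# The five sample values shown for 상품키워드 in this report.
sample = ["노트3케이스", "습도막측정기", "천연화장품스킨로션", "한글", "열냉각시트"]


def length_stats(values):
    """Return max, median, mean, and min string length in code points."""
    lengths = [len(v) for v in values]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }


stats = length_stats(sample)
print(stats)
```

With only five rows the numbers will not match the report's figures, which cover all 10000 observations.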

Characters and Unicode

Total characters: 49298
Distinct characters: 1052
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 4

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
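
To illustrate the kind of character property meant here, the sketch below counts Unicode general categories for a few keywords from this report using Python's standard `unicodedata` module. The `CATEGORY_NAMES` mapping and the `category_counts` helper are illustrative names introduced here, and this is not necessarily how the report computes its figures. The standard library exposes general categories but not scripts or blocks, so only categories are shown.

```python
import unicodedata
from collections import Counter

# Readable names for a few general-category codes (illustrative subset);
# codes not listed here are reported under their two-letter code.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
    "Sm": "Math Symbol",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Three keywords that appear in this report.
counts = category_counts(["노트3케이스", "라쎌LACELL", "모듈25W"])
print(counts.most_common())
```

Hangul syllables fall under "Other Letter", which is why that category dominates a mostly Korean keyword column.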

Unique

Unique: 9401
Unique (%): 94.0%
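
Distinct and Unique differ in this report: the constant columns show Distinct 1 and Unique 0. That is consistent with "distinct" counting different values and "unique" counting values that occur exactly once. A minimal sketch under that assumption, using hypothetical toy data and a hypothetical helper name:

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


# Toy keyword column: one value repeats, two occur once.
toy = ["한글", "호출벨", "호출벨", "신상품"]
toy_result = distinct_and_unique(toy)

# A constant column has one distinct value and no unique values.
constant_result = distinct_and_unique(["2020"] * 5)

print(toy_result, constant_result)
```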

Sample

1st row: 노트3케이스
2nd row: 습도막측정기
3rd row: 천연화장품스킨로션
4th row: 한글
5th row: 열냉각시트

Value  Count  Frequency (%)
스마트도어락  3  < 0.1%
부모님선물  3  < 0.1%
tresette  3  < 0.1%
면역력강화  3  < 0.1%
휴대폰거치대  3  < 0.1%
핸드폰거치대  3  < 0.1%
알톤자전거  3  < 0.1%
라쎌lacell  3  < 0.1%
디지털보청기  3  < 0.1%
차량용거치대  3  < 0.1%
Other values (9673)  9970  99.7%

Most occurring characters

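Value  Count  Frequency (%)
[not preserved]  1236  2.5%
[not preserved]  1215  2.5%
[not preserved]  1065  2.2%
[not preserved]  800  1.6%
[not preserved]  663  1.3%
[not preserved]  567  1.2%
[not preserved]  515  1.0%
[not preserved]  490  1.0%
[not preserved]  473  1.0%
[not preserved]  459  0.9%
Other values (1042)  41815  84.8%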

Most occurring categories

Value  Count  Frequency (%)
Other Letter  47215  95.8%
Uppercase Letter  874  1.8%
Decimal Number  548  1.1%
Lowercase Letter  524  1.1%
Other Punctuation  121  0.2%
Math Symbol  10  < 0.1%
Open Punctuation  2  < 0.1%
Close Punctuation  2  < 0.1%
Modifier Symbol  1  < 0.1%
Connector Punctuation  1  < 0.1%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
[not preserved]  1236  2.6%
[not preserved]  1215  2.6%
[not preserved]  1065  2.3%
[not preserved]  800  1.7%
[not preserved]  663  1.4%
[not preserved]  567  1.2%
[not preserved]  515  1.1%
[not preserved]  490  1.0%
[not preserved]  473  1.0%
[not preserved]  459  1.0%
Other values (971)  39732  84.2%

Uppercase Letter
Value  Count  Frequency (%)
E  94  10.8%
D  92  10.5%
L  90  10.3%
C  66  7.6%
T  65  7.4%
S  53  6.1%
A  51  5.8%
P  39  4.5%
B  36  4.1%
R  34  3.9%
Other values (16)  254  29.1%

Lowercase Letter
Value  Count  Frequency (%)
e  62  11.8%
a  52  9.9%
t  42  8.0%
o  40  7.6%
l  37  7.1%
r  26  5.0%
s  24  4.6%
c  22  4.2%
u  21  4.0%
i  21  4.0%
Other values (13)  177  33.8%

Decimal Number
Value  Count  Frequency (%)
0  133  24.3%
1  103  18.8%
2  80  14.6%
3  69  12.6%
4  38  6.9%
5  36  6.6%
6  35  6.4%
8  24  4.4%
7  17  3.1%
9  13  2.4%

Other Punctuation
Value  Count  Frequency (%)
/  85  70.2%
.  15  12.4%
%  13  10.7%
#  6  5.0%
&  2  1.7%

Math Symbol
Value  Count  Frequency (%)
+  9  90.0%
>  1  10.0%

Close Punctuation
Value  Count  Frequency (%)
]  1  50.0%
)  1  50.0%

Open Punctuation
Value  Count  Frequency (%)
[  2  100.0%

Modifier Symbol
Value  Count  Frequency (%)
`  1  100.0%

Connector Punctuation
Value  Count  Frequency (%)
_  1  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  47214  95.8%
Latin  1398  2.8%
Common  685  1.4%
Han  1  < 0.1%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
[not preserved]  1236  2.6%
[not preserved]  1215  2.6%
[not preserved]  1065  2.3%
[not preserved]  800  1.7%
[not preserved]  663  1.4%
[not preserved]  567  1.2%
[not preserved]  515  1.1%
[not preserved]  490  1.0%
[not preserved]  473  1.0%
[not preserved]  459  1.0%
Other values (970)  39731  84.2%

Latin
Value  Count  Frequency (%)
E  94  6.7%
D  92  6.6%
L  90  6.4%
C  66  4.7%
T  65  4.6%
e  62  4.4%
S  53  3.8%
a  52  3.7%
A  51  3.6%
t  42  3.0%
Other values (39)  731  52.3%

Common
Value  Count  Frequency (%)
0  133  19.4%
1  103  15.0%
/  85  12.4%
2  80  11.7%
3  69  10.1%
4  38  5.5%
5  36  5.3%
6  35  5.1%
8  24  3.5%
7  17  2.5%
Other values (12)  65  9.5%

Han
Value  Count  Frequency (%)
[not preserved]  1  100.0%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  47213  95.8%
ASCII  2083  4.2%
Compat Jamo  1  < 0.1%
CJK  1  < 0.1%

Most frequent character per block

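Hangul
Value  Count  Frequency (%)
[not preserved]  1236  2.6%
[not preserved]  1215  2.6%
[not preserved]  1065  2.3%
[not preserved]  800  1.7%
[not preserved]  663  1.4%
[not preserved]  567  1.2%
[not preserved]  515  1.1%
[not preserved]  490  1.0%
[not preserved]  473  1.0%
[not preserved]  459  1.0%
Other values (969)  39730  84.2%

ASCII
Value  Count  Frequency (%)
0  133  6.4%
1  103  4.9%
E  94  4.5%
D  92  4.4%
L  90  4.3%
/  85  4.1%
2  80  3.8%
3  69  3.3%
C  66  3.2%
T  65  3.1%
Other values (61)  1206  57.9%

Compat Jamo
Value  Count  Frequency (%)
[not preserved]  1  100.0%

CJK
Value  Count  Frequency (%)
[not preserved]  1  100.0%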

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

Index  기준연도  기준월  상품키워드
5207  2020  9  노트3케이스
10302  2020  9  습도막측정기
11522  2020  9  천연화장품스킨로션
4002  2020  9  한글
8711  2020  9  열냉각시트
753  2020  9  호출벨
3147  2020  9  껍질벗기기
12342  2020  9  대한민국지도
5168  2020  9  수소수기초화장품
12322  2020  9  전립선비대증수술

Index  기준연도  기준월  상품키워드
13184  2020  9  바닐라프라페
11727  2020  9  체형교정
12076  2020  9  경추베개
5797  2020  9  예초기날
1656  2020  9  애견간식
6135  2020  9  emdr안구운동
9417  2020  9  간호사스타킹
9759  2020  9  더블하트
5990  2020  9  사각옹기
30  2020  9  신상품

Duplicate rows

Most frequently occurring

Index  기준연도  기준월  상품키워드  # duplicates
17  2020  9  건버섯  3
56  2020  9  디지털보청기  3
60  2020  9  라쎌LACELL  3
68  2020  9  면역력강화  3
71  2020  9  모듈25W  3
100  2020  9  부모님선물  3
122  2020  9  스마트도어락  3
146  2020  9  아이폰6케이스  3
150  2020  9  알톤자전거  3
197  2020  9  자동차용품  3
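
One way to build a table like the one above is to count identical rows and keep those that occur more than once. The sketch below does this with the standard library on hypothetical toy rows. The `duplicate_rows` helper is an illustrative name, and whether the report counts duplicates in exactly this way is an assumption.

```python
from collections import Counter


def duplicate_rows(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Toy rows of (기준연도, 기준월, 상품키워드); the values are illustrative only.
rows = [
    ("2020", "9", "건버섯"),
    ("2020", "9", "건버섯"),
    ("2020", "9", "건버섯"),
    ("2020", "9", "한글"),
    ("2020", "9", "자동차용품"),
    ("2020", "9", "자동차용품"),
]

dups = duplicate_rows(rows)
# Rows beyond the first occurrence of each repeated row.
extra_rows = sum(n - 1 for n in dups.values())
print(dups, extra_rows)
```

The comparison is exact, so keywords that differ only in letter case (for example "라쎌LACELL" and "라쎌lacell") count as different rows in this sketch.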