Overview

Dataset statistics

Number of variables: 4
Number of observations: 6916
Missing cells: 2
Missing cells (%): < 0.1%
Duplicate rows: 6
Duplicate rows (%): 0.1%
Total size in memory: 229.8 KiB
Average record size in memory: 34.0 B
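
The percentages and the average record size above follow from the raw counts. The sketch below redoes that arithmetic in Python. It assumes the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB = 1024 B; the variable names are illustrative and not part of the report.

```python
# Raw figures copied from the dataset statistics above.
n_variables = 4
n_observations = 6916
missing_cells = 2
duplicate_rows = 6
total_size_kib = 229.8

# Assumption: missing cells (%) is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Duplicate rows (%) relative to the number of observations.
duplicate_pct = 100 * duplicate_rows / n_observations

# Average record size in bytes, assuming 1 KiB = 1024 B.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"missing cells: {missing_pct:.4f}%")
print(f"duplicate rows: {duplicate_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

The missing-cell share is well under 0.1%, which is why the report shows it only as "< 0.1%".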

Variable types

Categorical: 2
Text: 1
Boolean: 1

Dataset

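Description: Provides detailed item notice information for 아임셀러 products: reference year (기준연도), reference month (기준월), detailed item value (상세품목값), and whether the detail is output separately (상세별도출력여부).
Author: (주)중소기업유통센터
URL: https://www.data.go.kr/data/15067200/fileData.do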

Alerts

기준연도 has constant value "2020" (Constant)
기준월 has constant value "9" (Constant)
Dataset has 6 (0.1%) duplicate rows (Duplicates)
상세별도출력여부 is highly imbalanced (90.1%) (Imbalance)
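
The 90.1% imbalance figure can be reproduced from the False/True counts this report gives for 상세별도출력여부 (6827 and 89). One definition that yields the same number is 1 minus the Shannon entropy of the class distribution, normalized by the maximum possible entropy. The sketch below assumes that definition; the function name `imbalance_score` is hypothetical, and this is not a statement of how the profiling tool is implemented.

```python
import math


def imbalance_score(counts):
    """Return 1 - normalized Shannon entropy for a list of class counts.

    0.0 means perfectly balanced classes, values near 1.0 mean one class dominates.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2 or total == 0:
        return 0.0
    # Shannon entropy in bits; empty classes contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    # log2(n_classes) is the entropy of a perfectly balanced distribution.
    return 1 - entropy / math.log2(n_classes)


# Counts of False and True for 상세별도출력여부, taken from this report.
score = imbalance_score([6827, 89])
print(f"{score:.1%}")
```

A perfectly balanced column would score 0%, so a value above 90% reflects that roughly 99 in 100 rows share the same value.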

Reproduction

Analysis started: 2023-12-12 15:25:20.530587
Analysis finished: 2023-12-12 15:25:21.870749
Duration: 1.34 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

기준연도
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 54.2 KiB
2020: 6916

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020
2nd row: 2020
3rd row: 2020
4th row: 2020
5th row: 2020

Common Values

Value | Count | Frequency (%)
2020 | 6916 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot; 2020 accounts for 6916 rows (100.0%)]

기준월
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 54.2 KiB
9: 6916

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 9
2nd row: 9
3rd row: 9
4th row: 9
5th row: 9

Common Values

Value | Count | Frequency (%)
9 | 6916 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot; 9 accounts for 6916 rows (100.0%)]

상세품목값
Text

Distinct: 6849
Distinct (%): 99.1%
Missing: 2
Missing (%): < 0.1%
Memory size: 54.2 KiB

Length

Max length: 405
Median length: 350
Mean length: 20.734018
Min length: 1

Characters and Unicode

Total characters: 143355
Distinct characters: 1126
Distinct categories: 17
Distinct scripts: 5
Distinct blocks: 10
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
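
As a rough illustration of how such per-category counts can be derived, the sketch below uses Python's standard `unicodedata` module to tally the general category of each character in one sample value from this dataset ("1정600mg"). The helper name `category_counts` is hypothetical, and this is not the profiling tool's own code.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters in `text` by Unicode general category (e.g. 'Lo', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)


# A sample value taken from the dataset preview in this report.
sample = "1정600mg"
counts = category_counts(sample)
print(counts)
```

The two-letter codes correspond to the category names used in this report: `Lo` is Other Letter (Hangul syllables fall here), `Nd` is Decimal Number, and `Ll` is Lowercase Letter.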

Unique

Unique: 6785
Unique (%): 98.1%
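
Here "distinct" counts the different values that occur, while "unique" counts the values that occur exactly once. The reported percentages for this variable are consistent with dividing by the number of non-missing rows (6916 − 2 = 6914) rather than by all rows. The sketch below checks that reading; the denominator is an assumption inferred from the numbers, not something the report states.

```python
# Figures copied from this variable's summary.
n_observations = 6916
n_missing = 2
n_distinct = 6849
n_unique = 6785

# Assumption: percentages are taken over rows that actually have a value.
n_present = n_observations - n_missing

distinct_pct = 100 * n_distinct / n_present
unique_pct = 100 * n_unique / n_present

print(f"distinct: {distinct_pct:.1f}%")
print(f"unique: {unique_pct:.1f}%")
```

Dividing the distinct count by all 6916 rows instead would round to 99.0%, not the reported 99.1%, which is what points to the non-missing denominator.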

Sample

1st row: 상품소개별도표시
2nd row: 질경이골드20정과질경이실버10정
3rd row: 제조일로부터3년
4th row: 목욕또는세정시온수적당량사용하고종이컵4분의1에2개를녹여사용
5th row: 주식회사하우동천

Value | Count | Frequency (%)
지방fat9g19%/포화지방saturatedfat9g60%/트랜스지방transfat0g/콜레스테롤cholesterol0mg0%/나트륨sodium17mg1 | 4 | 0.1%
05-jul | 3 | < 0.1%
2~3일 | 3 | < 0.1%
해당사항없음 | 3 | < 0.1%
한국 | 3 | < 0.1%
1일1회/1회1포를직접또는물과함께섭취하십시오 | 2 | < 0.1%
지방fat11g22%/포화지방saturatedfat11g71%/트랜스지방transfat0g/콜레스테롤cholesterol0mg0%/나트륨sodium20mg1 | 2 | < 0.1%
yj시스템 | 2 | < 0.1%
1회제공량당함량thecontentofperone열량caloris136kcal/탄수화물carbohydrate12g4%/단백질protein1g1 | 2 | < 0.1%
본제품은공정거래위원회고시소비자분쟁해결기준에의거교환또는보상받을수있습니다 | 2 | < 0.1%
Other values (6800) | 6894 | 99.6%

Most occurring characters

ValueCountFrequency (%)
/ 7089
 
4.9%
0 5341
 
3.7%
1 4080
 
2.8%
2 2763
 
1.9%
2515
 
1.8%
5 2201
 
1.5%
. 2155
 
1.5%
3 2044
 
1.4%
m 1714
 
1.2%
g 1681
 
1.2%
Other values (1116) 111772
78.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 94034 | 65.6%
Decimal Number | 21779 | 15.2%
Other Punctuation | 11443 | 8.0%
Lowercase Letter | 8757 | 6.1%
Uppercase Letter | 4804 | 3.4%
Dash Punctuation | 1413 | 1.0%
Math Symbol | 497 | 0.3%
Control | 211 | 0.1%
Open Punctuation | 109 | 0.1%
Other Symbol | 108 | 0.1%
Other values (7) | 200 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2515
 
2.7%
1449
 
1.5%
1443
 
1.5%
1419
 
1.5%
1281
 
1.4%
1221
 
1.3%
1091
 
1.2%
1082
 
1.2%
1060
 
1.1%
1053
 
1.1%
Other values (991) 80420
85.5%
Uppercase Letter
Value | Count | Frequency (%)
S | 393 | 8.2%
C | 387 | 8.1%
A | 381 | 7.9%
B | 342 | 7.1%
E | 339 | 7.1%
L | 294 | 6.1%
P | 290 | 6.0%
D | 280 | 5.8%
M | 206 | 4.3%
T | 195 | 4.1%
Other values (21) | 1697 | 35.3%
Lowercase Letter
Value | Count | Frequency (%)
m | 1714 | 19.6%
g | 1681 | 19.2%
c | 785 | 9.0%
l | 537 | 6.1%
x | 497 | 5.7%
a | 454 | 5.2%
e | 443 | 5.1%
o | 363 | 4.1%
t | 286 | 3.3%
r | 269 | 3.1%
Other values (20) | 1728 | 19.7%
Other Punctuation
ValueCountFrequency (%)
/ 7089
62.0%
. 2155
 
18.8%
% 1337
 
11.7%
* 443
 
3.9%
: 258
 
2.3%
; 90
 
0.8%
& 20
 
0.2%
20
 
0.2%
· 17
 
0.1%
' 8
 
0.1%
Other values (2) 6
 
0.1%
Other Symbol
ValueCountFrequency (%)
47
43.5%
18
 
16.7%
12
 
11.1%
8
 
7.4%
5
 
4.6%
4
 
3.7%
4
 
3.7%
4
 
3.7%
3
 
2.8%
2
 
1.9%
Decimal Number
Value | Count | Frequency (%)
0 | 5341 | 24.5%
1 | 4080 | 18.7%
2 | 2763 | 12.7%
5 | 2201 | 10.1%
3 | 2044 | 9.4%
4 | 1462 | 6.7%
6 | 1223 | 5.6%
7 | 1007 | 4.6%
8 | 970 | 4.5%
9 | 688 | 3.2%
Math Symbol
Value | Count | Frequency (%)
~ | 316 | 63.6%
+ | 111 | 22.3%
× | 40 | 8.0%
± | 16 | 3.2%
= | 10 | 2.0%
> | 3 | 0.6%
< | 1 | 0.2%
Letter Number
ValueCountFrequency (%)
19
55.9%
6
 
17.6%
6
 
17.6%
2
 
5.9%
1
 
2.9%
Other Number
ValueCountFrequency (%)
½ 8
33.3%
6
25.0%
6
25.0%
3
 
12.5%
1
 
4.2%
Open Punctuation
ValueCountFrequency (%)
[ 102
93.6%
{ 4
 
3.7%
3
 
2.8%
Close Punctuation
ValueCountFrequency (%)
] 88
92.6%
} 4
 
4.2%
3
 
3.2%
Final Punctuation
ValueCountFrequency (%)
37
97.4%
1
 
2.6%
Initial Punctuation
ValueCountFrequency (%)
2
66.7%
1
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 1413
100.0%
Control
ValueCountFrequency (%)
211
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 4
100.0%
Modifier Symbol
ValueCountFrequency (%)
^ 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 94003 | 65.6%
Common | 35726 | 24.9%
Latin | 13574 | 9.5%
Han | 31 | < 0.1%
Greek | 21 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2515
 
2.7%
1449
 
1.5%
1443
 
1.5%
1419
 
1.5%
1281
 
1.4%
1221
 
1.3%
1091
 
1.2%
1082
 
1.2%
1060
 
1.1%
1053
 
1.1%
Other values (980) 80389
85.5%
Latin
ValueCountFrequency (%)
m 1714
 
12.6%
g 1681
 
12.4%
c 785
 
5.8%
l 537
 
4.0%
x 497
 
3.7%
a 454
 
3.3%
e 443
 
3.3%
S 393
 
2.9%
C 387
 
2.9%
A 381
 
2.8%
Other values (50) 6302
46.4%
Common
ValueCountFrequency (%)
/ 7089
19.8%
0 5341
14.9%
1 4080
11.4%
2 2763
 
7.7%
5 2201
 
6.2%
. 2155
 
6.0%
3 2044
 
5.7%
4 1462
 
4.1%
- 1413
 
4.0%
% 1337
 
3.7%
Other values (49) 5841
16.3%
Han
ValueCountFrequency (%)
10
32.3%
4
 
12.9%
2
 
6.5%
2
 
6.5%
2
 
6.5%
2
 
6.5%
2
 
6.5%
2
 
6.5%
2
 
6.5%
2
 
6.5%
Greek
ValueCountFrequency (%)
Ω 8
38.1%
α 6
28.6%
μ 3
 
14.3%
π 2
 
9.5%
1
 
4.8%
Φ 1
 
4.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 93993 | 65.6%
ASCII | 48969 | 34.2%
None | 132 | 0.1%
Punctuation | 61 | < 0.1%
CJK Compat | 61 | < 0.1%
Letterlike Symbols | 48 | < 0.1%
Number Forms | 34 | < 0.1%
CJK | 31 | < 0.1%
Enclosed Alphanum | 16 | < 0.1%
Compat Jamo | 10 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 7089
 
14.5%
0 5341
 
10.9%
1 4080
 
8.3%
2 2763
 
5.6%
5 2201
 
4.5%
. 2155
 
4.4%
3 2044
 
4.2%
m 1714
 
3.5%
g 1681
 
3.4%
4 1462
 
3.0%
Other values (75) 18439
37.7%
Hangul
ValueCountFrequency (%)
2515
 
2.7%
1449
 
1.5%
1443
 
1.5%
1419
 
1.5%
1281
 
1.4%
1221
 
1.3%
1091
 
1.2%
1082
 
1.2%
1060
 
1.1%
1053
 
1.1%
Other values (972) 80379
85.5%
Letterlike Symbols
ValueCountFrequency (%)
47
97.9%
1
 
2.1%
None
ValueCountFrequency (%)
× 40
30.3%
· 17
12.9%
± 16
 
12.1%
Ø 10
 
7.6%
9
 
6.8%
½ 8
 
6.1%
Ω 8
 
6.1%
α 6
 
4.5%
6
 
4.5%
μ 3
 
2.3%
Other values (4) 9
 
6.8%
Punctuation
ValueCountFrequency (%)
37
60.7%
20
32.8%
2
 
3.3%
1
 
1.6%
1
 
1.6%
Number Forms
ValueCountFrequency (%)
19
55.9%
6
 
17.6%
6
 
17.6%
2
 
5.9%
1
 
2.9%
CJK Compat
ValueCountFrequency (%)
18
29.5%
12
19.7%
8
13.1%
5
 
8.2%
4
 
6.6%
4
 
6.6%
4
 
6.6%
3
 
4.9%
2
 
3.3%
1
 
1.6%
CJK
ValueCountFrequency (%)
10
32.3%
4
 
12.9%
2
 
6.5%
2
 
6.5%
2
 
6.5%
2
 
6.5%
2
 
6.5%
2
 
6.5%
2
 
6.5%
2
 
6.5%
Enclosed Alphanum
ValueCountFrequency (%)
6
37.5%
6
37.5%
3
18.8%
1
 
6.2%
Compat Jamo
ValueCountFrequency (%)
2
20.0%
2
20.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%

상세별도출력여부
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.9 KiB
False: 6827
True: 89

Value | Count | Frequency (%)
False | 6827 | 98.7%
True | 89 | 1.3%

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly pick out patterns in data completion.]

Sample

index | 기준연도 | 기준월 | 상세품목값 | 상세별도출력여부
0 | 2020 | 9 | 상품소개별도표시 | N
1 | 2020 | 9 | 질경이골드20정과질경이실버10정 | N
2 | 2020 | 9 | 제조일로부터3년 | N
3 | 2020 | 9 | 목욕또는세정시온수적당량사용하고종이컵4분의1에2개를녹여사용 | N
4 | 2020 | 9 | 주식회사하우동천 | N
5 | 2020 | 9 | 한국 | N
6 | 2020 | 9 | 해당사항없음 | N
7 | 2020 | 9 | 1정600mg | N
8 | 2020 | 9 | 태화고무장갑 | N
9 | 2020 | 9 | 고무 | N

index | 기준연도 | 기준월 | 상세품목값 | 상세별도출력여부
6906 | 2020 | 9 | 1670-6277 | N
6907 | 2020 | 9 | B361A297-5001공산품자율안전확인 | N
6908 | 2020 | 9 | 품질보증기간:구입일로부터6개월 | N
6909 | 2020 | 9 | 교환제품을사용치않은상태에서구입일로부터30일이내해당업체와취급품목이있는경우가능 | N
6910 | 2020 | 9 | 180g | N
6911 | 2020 | 9 | 어성초비누/홍삼비누 | N
6912 | 2020 | 9 | 어성초비누-올리브유/코코넛유/팜유/카놀라유/감초분말/강릉어성초분말/비타민E/정제수/레몬그라스E.O/홍삼비누-올리브유/코코넛유/팜유/카놀라유/숯분말/홍삼씨앗분말/비타민E/정제수/인삼F.O | N
6913 | 2020 | 9 | TRESETTE트레세테핸드메이드카드케이스목걸이 | N
6914 | 2020 | 9 | 남성백팩 | N
6915 | 2020 | 9 | 아라요요거트파우더20gX10개스틱형 | N

Duplicate rows

Most frequently occurring

index | 기준연도 | 기준월 | 상세품목값 | 상세별도출력여부 | # duplicates
0 | 2020 | 9 | 05-Jul | N | 3
1 | 2020 | 9 | 1.물에젖지않게하세요.젖었을시통풍이잘되는그늘에서말려주세요.커피나주스등이묻었을경우에는즉시마른수건으로수분을제거한뒤그늘에서말려주세요.2.모피는습기가적고통풍이잘되는곳에보관합니다.3.모니터해상도/밝기/컴퓨터사양/이미지에따라약간의색상차이가있을수있습니다.[NOTICE]모피제품은모피전문세탁소에맡기는것이바람직합니다.1.외출시향수는모피제품착용전에뿌려주어모피제품에묻지않도록합니다.2.모피제품을착용하고헤어스프레이사용은불가합니다.3.모피제품은습기가적고통풍이잘되는곳에보관합니다.4.외출에서돌아오면우선먼지를털어줍니다.5.눈/비에젖었을경우마른수건으로살짝닦아준다음그늘에서자연건조해줍니다.6.고객임의로수선하신제품은반품불가능합니다.7.상품상세페이지의내용과다른페이지의내용이다를경우상세페이지의내용이우선적용됩니다. | N | 2
2 | 2020 | 9 | 무관 | N | 2
3 | 2020 | 9 | 실크 | N | 2
4 | 2020 | 9 | 씨엘 | N | 2
5 | 2020 | 9 | <NA> | N | 2
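
The table lists six duplicated row patterns, which matches the "Duplicate rows: 6" figure in the overview, although more than six rows are involved. The sketch below works through that arithmetic from the "# duplicates" column. It assumes the headline count refers to distinct duplicated patterns rather than to the rows taking part in them, a reading inferred from the numbers in this report.

```python
# "# duplicates" column from the table above, one entry per duplicated pattern.
dup_counts = [3, 2, 2, 2, 2, 2]
n_observations = 6916

# Number of distinct row patterns that occur more than once.
n_groups = len(dup_counts)

# Total rows that take part in any duplicate group.
rows_involved = sum(dup_counts)

# Rows that would be dropped if one copy of each pattern were kept.
redundant_rows = sum(c - 1 for c in dup_counts)

# Share of duplicated patterns relative to all observations.
group_pct = 100 * n_groups / n_observations

print(n_groups, rows_involved, redundant_rows, f"{group_pct:.1f}%")
```

Under this reading, deduplicating the dataset would remove 7 rows, not 6.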