Overview

Dataset statistics

Number of variables: 6
Number of observations: 63
Missing cells: 1
Missing cells (%): 0.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.2 KiB
Average record size in memory: 52.1 B
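
The missing-cell percentage follows from the counts in this section. A minimal sketch, assuming the percentage is taken over all cells (variables × observations); the variable names are illustrative only:

```python
# Counts copied from the dataset statistics in this report.
n_variables = 6
n_observations = 63
missing_cells = 1

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(total_cells)            # 378
print(f"{missing_pct:.1f}%")  # 0.3%
```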

Variable types

Categorical: 1
Text: 2
Numeric: 2
Boolean: 1

Dataset

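Description: Provides information on products made by cosmetics manufacturers in Incheon Metropolitan City: category (카테고리), product name (제품명), volume (용량), regular price in KRW (정상가격(원)), discounted sale price in KRW (할인판매가(원)), sale status (판매여부), and so on.
Author: Incheon Metropolitan City (인천광역시)
URL: https://www.data.go.kr/data/3044187/fileData.do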

Alerts

정상가격(원) is highly overall correlated with 할인판매가(원) and 1 other field [High correlation]
할인판매가(원) is highly overall correlated with 정상가격(원) [High correlation]
판매여부 is highly overall correlated with 정상가격(원) [High correlation]
판매여부 is highly imbalanced (72.4%) [Imbalance]
용량 has 1 (1.6%) missing value [Missing]
제품명 has unique values [Unique]

Reproduction

Analysis started: 2023-12-12 20:16:48.552797
Analysis finished: 2023-12-12 20:16:49.257513
Duration: 0.7 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
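
A report of this kind can be regenerated with ydata-profiling. The sketch below is a rough outline, not the configuration actually used for this report: the function name `build_report`, the file names, the report title, and the `cp949` encoding are all assumptions.

```python
def build_report(csv_path, html_path="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so the function can be defined without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the public file is a CSV in a Korean legacy encoding.
    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Incheon cosmetics products")
    report.to_file(html_path)
    return html_path

# Hypothetical usage; "cosmetics.csv" is a placeholder file name:
# build_report("cosmetics.csv")
```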

Variables

카테고리 (Category)
Categorical

Distinct: 13
Distinct (%): 20.6%
Missing: 0
Missing (%): 0.0%
Memory size: 636.0 B

[Bar chart: 팩&마스크, 크림, 기능성, 클렌징, 헤어; Other values (8): 26]

Length

Max length: 6
Median length: 5
Mean length: 3.4761905
Min length: 2

Unique

Unique: 1
Unique (%): 1.6%

Sample

1st row: 기초SET
2nd row: 기초SET
3rd row: 기초SET
4th row: 기초SET
5th row: 팩&마스크

Common Values

| Value | Count | Frequency (%) |
| 팩&마스크 | 9 | 14.3% |
| 크림 | 8 | 12.7% |
| 기능성 | 8 | 12.7% |
| 클렌징 | 6 | 9.5% |
| 헤어 | 6 | 9.5% |
| 핸드,바디 | 5 | 7.9% |
| 기초SET | 4 | 6.3% |
| 세럼,수딩젤 | 4 | 6.3% |
| 선케어 | 4 | 6.3% |
| 베이비 | 3 | 4.8% |
| Other values (3) | 6 | 9.5% |

Length

Histogram of lengths of the category

제품명 (Product name)
Text

UNIQUE 

Distinct: 63
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 636.0 B

Length

Max length: 25
Median length: 16
Mean length: 10.952381
Min length: 4

Characters and Unicode

Total characters: 690
Distinct characters: 189
Distinct categories: 7
Distinct scripts: 4
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
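
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for one sample product name from this report; it only illustrates the idea and is not necessarily how the profiler computes its tables.

```python
import unicodedata
from collections import Counter

# One of the sample product names from this report.
name = "홍삼 기초(토너, 에멀젼)"

# unicodedata.category returns two-letter codes such as "Lo" (Other Letter),
# "Zs" (Space Separator), "Ps"/"Pe" (Open/Close Punctuation), "Po" (Other Punctuation).
category_counts = Counter(unicodedata.category(ch) for ch in name)

print(category_counts["Lo"])  # 9 Hangul syllables
print(category_counts["Zs"])  # 2 spaces
print(category_counts["Ps"], category_counts["Pe"], category_counts["Po"])  # 1 1 1
```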

Unique

Unique: 63
Unique (%): 100.0%

Sample

1st row: 카놀라 로얄허니 기초세트
2nd row: 진주5종(토너,에멀젼,에센스,크림,아이크림)
3rd row: 홍삼 기초(토너, 에멀젼)
4th row: 설안 5종 기초 세트
5th row: 꿀광 프리미엄 마스크팩

| Value | Count | Frequency (%) |
| 크림 | 4 | 2.3% |
| 홍삼 | 4 | 2.3% |
| 린(潾 | 3 | 1.7% |
| 피부일기 | 3 | 1.7% |
| 마스크팩 | 3 | 1.7% |
| 프리미엄 | 3 | 1.7% |
| 마스크 | 3 | 1.7% |
| 오일 | 3 | 1.7% |
| 세럼 | 3 | 1.7% |
| 포리퓨어 | 3 | 1.7% |
| Other values (127) | 144 | 81.8% |

Most occurring characters

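| Value | Count | Frequency (%) |
| (space) | 113 | 16.4% |
|  | 28 | 4.1% |
|  | 23 | 3.3% |
|  | 16 | 2.3% |
|  | 16 | 2.3% |
|  | 15 | 2.2% |
|  | 13 | 1.9% |
|  | 12 | 1.7% |
| , | 11 | 1.6% |
|  | 11 | 1.6% |
| Other values (179) | 432 | 62.6% |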

Most occurring categories

| Value | Count | Frequency (%) |
| Other Letter | 534 | 77.4% |
| Space Separator | 113 | 16.4% |
| Other Punctuation | 12 | 1.7% |
| Close Punctuation | 8 | 1.2% |
| Open Punctuation | 8 | 1.2% |
| Decimal Number | 8 | 1.2% |
| Uppercase Letter | 7 | 1.0% |

Most frequent character per category

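Other Letter
| Value | Count | Frequency (%) |
|  | 28 | 5.2% |
|  | 23 | 4.3% |
|  | 16 | 3.0% |
|  | 16 | 3.0% |
|  | 15 | 2.8% |
|  | 13 | 2.4% |
|  | 12 | 2.2% |
|  | 11 | 2.1% |
|  | 11 | 2.1% |
|  | 10 | 1.9% |
| Other values (167) | 379 | 71.0% |

Uppercase Letter
| Value | Count | Frequency (%) |
| C | 4 | 57.1% |
| T | 1 | 14.3% |
| E | 1 | 14.3% |
| S | 1 | 14.3% |

Decimal Number
| Value | Count | Frequency (%) |
| 3 | 4 | 50.0% |
| 5 | 3 | 37.5% |
| 4 | 1 | 12.5% |

Other Punctuation
| Value | Count | Frequency (%) |
| , | 11 | 91.7% |
| & | 1 | 8.3% |

Space Separator
| Value | Count | Frequency (%) |
| (space) | 113 | 100.0% |

Close Punctuation
| Value | Count | Frequency (%) |
| ) | 8 | 100.0% |

Open Punctuation
| Value | Count | Frequency (%) |
| ( | 8 | 100.0% |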

Most occurring scripts

| Value | Count | Frequency (%) |
| Hangul | 531 | 77.0% |
| Common | 149 | 21.6% |
| Latin | 7 | 1.0% |
| Han | 3 | 0.4% |

Most frequent character per script

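Hangul
| Value | Count | Frequency (%) |
|  | 28 | 5.3% |
|  | 23 | 4.3% |
|  | 16 | 3.0% |
|  | 16 | 3.0% |
|  | 15 | 2.8% |
|  | 13 | 2.4% |
|  | 12 | 2.3% |
|  | 11 | 2.1% |
|  | 11 | 2.1% |
|  | 10 | 1.9% |
| Other values (166) | 376 | 70.8% |

Common
| Value | Count | Frequency (%) |
| (space) | 113 | 75.8% |
| , | 11 | 7.4% |
| ) | 8 | 5.4% |
| ( | 8 | 5.4% |
| 3 | 4 | 2.7% |
| 5 | 3 | 2.0% |
| 4 | 1 | 0.7% |
| & | 1 | 0.7% |

Latin
| Value | Count | Frequency (%) |
| C | 4 | 57.1% |
| T | 1 | 14.3% |
| E | 1 | 14.3% |
| S | 1 | 14.3% |

Han
| Value | Count | Frequency (%) |
| 潾 | 3 | 100.0% |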

Most occurring blocks

| Value | Count | Frequency (%) |
| Hangul | 531 | 77.0% |
| ASCII | 156 | 22.6% |
| CJK | 3 | 0.4% |

Most frequent character per block

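ASCII
| Value | Count | Frequency (%) |
| (space) | 113 | 72.4% |
| , | 11 | 7.1% |
| ) | 8 | 5.1% |
| ( | 8 | 5.1% |
| 3 | 4 | 2.6% |
| C | 4 | 2.6% |
| 5 | 3 | 1.9% |
| T | 1 | 0.6% |
| E | 1 | 0.6% |
| S | 1 | 0.6% |
| Other values (2) | 2 | 1.3% |

Hangul
| Value | Count | Frequency (%) |
|  | 28 | 5.3% |
|  | 23 | 4.3% |
|  | 16 | 3.0% |
|  | 16 | 3.0% |
|  | 15 | 2.8% |
|  | 13 | 2.4% |
|  | 12 | 2.3% |
|  | 11 | 2.1% |
|  | 11 | 2.1% |
|  | 10 | 1.9% |
| Other values (166) | 376 | 70.8% |

CJK
| Value | Count | Frequency (%) |
| 潾 | 3 | 100.0% |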

용량 (Volume)
Text

MISSING 

Distinct: 36
Distinct (%): 58.1%
Missing: 1
Missing (%): 1.6%
Memory size: 636.0 B

Length

Max length: 17
Median length: 16
Mean length: 6.1612903
Min length: 3

Characters and Unicode

Total characters: 382
Distinct characters: 21
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 25
Unique (%): 40.3%

Sample

1st row: 120ml,120ml,50ml
2nd row: 120ml, 50ml, 30ml
3rd row: 120ml, 120ml
4th row: 165ml,50ml,30ml
5th row: 27ml*10

| Value | Count | Frequency (%) |
| 50ml | 8 | 12.1% |
| 300ml | 5 | 7.6% |
| 100ml | 5 | 7.6% |
| 30ml | 5 | 7.6% |
| 150ml | 4 | 6.1% |
| 30mlx10개 | 3 | 4.5% |
| 120ml | 3 | 4.5% |
| 100g | 2 | 3.0% |
| 350ml | 2 | 3.0% |
| 200ml | 2 | 3.0% |
| Other values (25) | 27 | 40.9% |

Most occurring characters

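| Value | Count | Frequency (%) |
| 0 | 85 | 22.3% |
| m | 69 | 18.1% |
| l | 69 | 18.1% |
| 1 | 32 | 8.4% |
| 5 | 31 | 8.1% |
| 3 | 19 | 5.0% |
| 2 | 18 | 4.7% |
| , | 12 | 3.1% |
| 개 | 6 | 1.6% |
| 6 | 6 | 1.6% |
| Other values (11) | 35 | 9.2% |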

Most occurring categories

| Value | Count | Frequency (%) |
| Decimal Number | 203 | 53.1% |
| Lowercase Letter | 151 | 39.5% |
| Other Punctuation | 18 | 4.7% |
| Other Letter | 6 | 1.6% |
| Space Separator | 4 | 1.0% |

Most frequent character per category

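Decimal Number
| Value | Count | Frequency (%) |
| 0 | 85 | 41.9% |
| 1 | 32 | 15.8% |
| 5 | 31 | 15.3% |
| 3 | 19 | 9.4% |
| 2 | 18 | 8.9% |
| 6 | 6 | 3.0% |
| 7 | 6 | 3.0% |
| 4 | 6 | 3.0% |

Lowercase Letter
| Value | Count | Frequency (%) |
| m | 69 | 45.7% |
| l | 69 | 45.7% |
| g | 5 | 3.3% |
| x | 4 | 2.6% |
| p | 1 | 0.7% |
| a | 1 | 0.7% |
| d | 1 | 0.7% |
| s | 1 | 0.7% |

Other Punctuation
| Value | Count | Frequency (%) |
| , | 12 | 66.7% |
| * | 5 | 27.8% |
| / | 1 | 5.6% |

Other Letter
| Value | Count | Frequency (%) |
| 개 | 6 | 100.0% |

Space Separator
| Value | Count | Frequency (%) |
| (space) | 4 | 100.0% |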

Most occurring scripts

| Value | Count | Frequency (%) |
| Common | 225 | 58.9% |
| Latin | 151 | 39.5% |
| Hangul | 6 | 1.6% |

Most frequent character per script

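Common
| Value | Count | Frequency (%) |
| 0 | 85 | 37.8% |
| 1 | 32 | 14.2% |
| 5 | 31 | 13.8% |
| 3 | 19 | 8.4% |
| 2 | 18 | 8.0% |
| , | 12 | 5.3% |
| 6 | 6 | 2.7% |
| 7 | 6 | 2.7% |
| 4 | 6 | 2.7% |
| * | 5 | 2.2% |
| Other values (2) | 5 | 2.2% |

Latin
| Value | Count | Frequency (%) |
| m | 69 | 45.7% |
| l | 69 | 45.7% |
| g | 5 | 3.3% |
| x | 4 | 2.6% |
| p | 1 | 0.7% |
| a | 1 | 0.7% |
| d | 1 | 0.7% |
| s | 1 | 0.7% |

Hangul
| Value | Count | Frequency (%) |
| 개 | 6 | 100.0% |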

Most occurring blocks

| Value | Count | Frequency (%) |
| ASCII | 376 | 98.4% |
| Hangul | 6 | 1.6% |

Most frequent character per block

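ASCII
| Value | Count | Frequency (%) |
| 0 | 85 | 22.6% |
| m | 69 | 18.4% |
| l | 69 | 18.4% |
| 1 | 32 | 8.5% |
| 5 | 31 | 8.2% |
| 3 | 19 | 5.1% |
| 2 | 18 | 4.8% |
| , | 12 | 3.2% |
| 6 | 6 | 1.6% |
| 7 | 6 | 1.6% |
| Other values (10) | 29 | 7.7% |

Hangul
| Value | Count | Frequency (%) |
| 개 | 6 | 100.0% |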

정상가격(원) (Regular price, KRW)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 35
Distinct (%): 55.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 41795.238
Minimum: 5000
Maximum: 344000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 699.0 B

Quantile statistics

Minimum: 5000
5-th percentile: 8700
Q1: 18000
Median: 27600
Q3: 40000
95-th percentile: 97700
Maximum: 344000
Range: 339000
Interquartile range (IQR): 22000

Descriptive statistics

Standard deviation: 63268.433
Coefficient of variation (CV): 1.5137713
Kurtosis: 15.533118
Mean: 41795.238
Median Absolute Deviation (MAD): 12400
Skewness: 3.9445534
Sum: 2633100
Variance: 4.0028947 × 10^9
Monotonicity: Not monotonic
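
Several of these figures can be cross-checked against one another. A minimal sketch, assuming the usual definitions (mean = sum / n, range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared); the inputs are the rounded values reported for 정상가격(원), so the results agree only up to rounding:

```python
import math

# Figures copied from the statistics reported for 정상가격(원).
n = 63
total = 2633100
minimum, maximum = 5000, 344000
q1, q3 = 18000, 40000
std = 63268.433

mean = total / n                 # arithmetic mean
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range
cv = std / mean                  # coefficient of variation
variance = std ** 2

print(round(mean, 3))    # 41795.238
print(value_range, iqr)  # 339000 22000
print(round(cv, 4))      # 1.5138
print(math.isclose(variance, 4.0028947e9, rel_tol=1e-6))  # True
```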
Histogram with fixed size bins (bins=35)
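
| Value | Count | Frequency (%) |
| 29000 | 6 | 9.5% |
| 14000 | 5 | 7.9% |
| 40000 | 4 | 6.3% |
| 18000 | 4 | 6.3% |
| 12000 | 3 | 4.8% |
| 45000 | 3 | 4.8% |
| 25000 | 3 | 4.8% |
| 30000 | 3 | 4.8% |
| 28000 | 3 | 4.8% |
| 23000 | 2 | 3.2% |
| Other values (25) | 27 | 42.9% |

| Value | Count | Frequency (%) |
| 5000 | 1 | 1.6% |
| 6000 | 1 | 1.6% |
| 7000 | 1 | 1.6% |
| 8500 | 1 | 1.6% |
| 10500 | 1 | 1.6% |
| 11500 | 2 | 3.2% |
| 12000 | 3 | 4.8% |
| 14000 | 5 | 7.9% |
| 18000 | 4 | 6.3% |
| 18500 | 1 | 1.6% |

| Value | Count | Frequency (%) |
| 344000 | 1 | 1.6% |
| 320000 | 1 | 1.6% |
| 265000 | 1 | 1.6% |
| 98000 | 1 | 1.6% |
| 95000 | 1 | 1.6% |
| 56000 | 1 | 1.6% |
| 52000 | 1 | 1.6% |
| 50000 | 1 | 1.6% |
| 49000 | 1 | 1.6% |
| 48000 | 1 | 1.6% |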

할인판매가(원) (Discounted sale price, KRW)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 33
Distinct (%): 52.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 19635.556
Minimum: 2500
Maximum: 210000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 699.0 B

Quantile statistics

Minimum: 2500
5-th percentile: 5000
Q1: 10000
Median: 12860
Q3: 18000
95-th percentile: 47400
Maximum: 210000
Range: 207500
Interquartile range (IQR): 8000

Descriptive statistics

Standard deviation: 28975.406
Coefficient of variation (CV): 1.4756601
Kurtosis: 31.300012
Mean: 19635.556
Median Absolute Deviation (MAD): 3860
Skewness: 5.2128212
Sum: 1237040
Variance: 8.3957417 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=33)
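
| Value | Count | Frequency (%) |
| 12000 | 6 | 9.5% |
| 18000 | 6 | 9.5% |
| 10000 | 5 | 7.9% |
| 14000 | 4 | 6.3% |
| 9000 | 4 | 6.3% |
| 11160 | 3 | 4.8% |
| 20000 | 3 | 4.8% |
| 25000 | 2 | 3.2% |
| 5000 | 2 | 3.2% |
| 22000 | 2 | 3.2% |
| Other values (23) | 26 | 41.3% |

| Value | Count | Frequency (%) |
| 2500 | 1 | 1.6% |
| 3000 | 1 | 1.6% |
| 4000 | 1 | 1.6% |
| 5000 | 2 | 3.2% |
| 6000 | 2 | 3.2% |
| 6500 | 1 | 1.6% |
| 8000 | 1 | 1.6% |
| 8500 | 1 | 1.6% |
| 9000 | 4 | 6.3% |
| 9500 | 1 | 1.6% |

| Value | Count | Frequency (%) |
| 210000 | 1 | 1.6% |
| 92000 | 1 | 1.6% |
| 90000 | 1 | 1.6% |
| 48000 | 1 | 1.6% |
| 42000 | 1 | 1.6% |
| 28800 | 1 | 1.6% |
| 25000 | 2 | 3.2% |
| 22000 | 2 | 3.2% |
| 20000 | 3 | 4.8% |
| 18000 | 6 | 9.5% |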

판매여부 (Sale status)
Boolean

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Memory size: 195.0 B

| Value | Count | Frequency (%) |
| False | 60 | 95.2% |
| True | 3 | 4.8% |
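
The 72.4% imbalance reported in the Alerts section is consistent with a score of one minus the normalized Shannon entropy of the class counts. The report does not state the formula, so the definition below is an assumption, and `imbalance_score` is an illustrative name rather than a library function:

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits
    and k is the number of classes. 0 means balanced, 1 means one class only."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

score = imbalance_score([60, 3])  # False: 60, True: 3
print(f"{score:.1%}")             # 72.4%
```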

Interactions


Correlations

|  | 카테고리 | 제품명 | 용량 | 정상가격(원) | 할인판매가(원) | 판매여부 |
| 카테고리 | 1.000 | 1.000 | 0.944 | 0.501 | 0.108 | 0.000 |
| 제품명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| 용량 | 0.944 | 1.000 | 1.000 | 0.967 | 0.988 | 1.000 |
| 정상가격(원) | 0.501 | 1.000 | 0.967 | 1.000 | 0.951 | 0.443 |
| 할인판매가(원) | 0.108 | 1.000 | 0.988 | 0.951 | 1.000 | 0.248 |
| 판매여부 | 0.000 | 1.000 | 1.000 | 0.443 | 0.248 | 1.000 |

|  | 카테고리 | 판매여부 |
| 카테고리 | 1.000 | 0.000 |
| 판매여부 | 0.000 | 1.000 |

|  | 정상가격(원) | 할인판매가(원) | 카테고리 | 판매여부 |
| 정상가격(원) | 1.000 | 0.911 | 0.280 | 0.526 |
| 할인판매가(원) | 0.911 | 1.000 | 0.057 | 0.296 |
| 카테고리 | 0.280 | 0.057 | 1.000 | 0.000 |
| 판매여부 | 0.526 | 0.296 | 0.000 | 1.000 |
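
The last matrix lines up with the high-correlation entries in the Alerts section: only the pairs involving 정상가격(원) with 할인판매가(원) and with 판매여부 exceed 0.5. The sketch below flags pairs above a cut-off; the 0.5 threshold is an assumption inferred from those alerts, not a value stated in this report.

```python
# Values copied from the last correlation matrix (upper triangle only).
corr = {
    ("정상가격(원)", "할인판매가(원)"): 0.911,
    ("정상가격(원)", "카테고리"): 0.280,
    ("정상가격(원)", "판매여부"): 0.526,
    ("할인판매가(원)", "카테고리"): 0.057,
    ("할인판매가(원)", "판매여부"): 0.296,
    ("카테고리", "판매여부"): 0.000,
}

THRESHOLD = 0.5  # assumed cut-off, not stated in the report

# Keep every pair whose absolute correlation reaches the threshold.
high_pairs = sorted(pair for pair, value in corr.items() if abs(value) >= THRESHOLD)
for a, b in high_pairs:
    print(a, "<->", b, corr[(a, b)])
```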

Missing values

[Figure: count of non-missing values per column] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
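
The per-column nullity count behind such plots can be sketched in a few lines. The example below uses three rows and three columns taken from the Sample section, with `None` standing in for the `<NA>` volume of the last row; the list-of-dicts layout is an assumption made for illustration.

```python
# Three rows from the Sample section; None marks the missing 용량 value.
rows = [
    {"제품명": "아르간 오일 샴푸", "용량": "460ml", "정상가격(원)": 18000},
    {"제품명": "헤어왁싱 컬러 3종(레드, 오렌지, 자연갈색)", "용량": "200g", "정상가격(원)": 11500},
    {"제품명": "스킨 쿨러", "용량": None, "정상가격(원)": 28000},
]

# Count missing values per column.
columns = list(rows[0])
missing_per_column = {col: sum(row[col] is None for row in rows) for col in columns}
print(missing_per_column)
```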

Sample

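|  | 카테고리 | 제품명 | 용량 | 정상가격(원) | 할인판매가(원) | 판매여부 |
| 0 | 기초SET | 카놀라 로얄허니 기초세트 | 120ml,120ml,50ml | 98000 | 42000 | N |
| 1 | 기초SET | 진주5종(토너,에멀젼,에센스,크림,아이크림) | 120ml, 50ml, 30ml | 344000 | 210000 | N |
| 2 | 기초SET | 홍삼 기초(토너, 에멀젼) | 120ml, 120ml | 50000 | 22000 | N |
| 3 | 기초SET | 설안 5종 기초 세트 | 165ml,50ml,30ml | 320000 | 90000 | N |
| 4 | 팩&마스크 | 꿀광 프리미엄 마스크팩 | 27ml*10 | 27600 | 12000 | N |
| 5 | 팩&마스크 | 마스크팩 5종 | 20ml | 7000 | 3000 | N |
| 6 | 팩&마스크 | 앰플마스크팩 3종 | 25ml | 18000 | 11500 | N |
| 7 | 팩&마스크 | 링클리페어 아이마스크 | 100g | 27000 | 13000 | N |
| 8 | 팩&마스크 | 굿나잇 필링 슬리핑팩 | 135g | 26000 | 14000 | N |
| 9 | 팩&마스크 | 피부일기 비피다바이옴 마스크 | 30mlx10개 | 12000 | 11160 | N |

|  | 카테고리 | 제품명 | 용량 | 정상가격(원) | 할인판매가(원) | 판매여부 |
| 53 | 핸드,바디 | 대나무 바디 3종세트 | 150ml,500ml,250ml | 30000 | 20000 | N |
| 54 | 핸드,바디 | 리얼 솔트 바디스크럽 | 270ml | 14000 | 9500 | N |
| 55 | 핸드,바디 | 올인원 바디워시 | 300ml | 18000 | 12860 | N |
| 56 | 헤어 | 프리미엄 샴푸SET | 550ml, 150ml | 52000 | 20000 | N |
| 57 | 헤어 | 스캘프 스파 샴푸 | 300ml | 18500 | 8000 | N |
| 58 | 헤어 | 헤어트리트먼트 오일 | 100ml | 21000 | 9000 | N |
| 59 | 헤어 | 아르간 헤어팩 | 300ml | 14000 | 9000 | N |
| 60 | 헤어 | 아르간 오일 샴푸 | 460ml | 18000 | 10000 | N |
| 61 | 헤어 | 헤어왁싱 컬러 3종(레드, 오렌지, 자연갈색) | 200g | 11500 | 4000 | N |
| 62 | 미용기기 | 스킨 쿨러 | <NA> | 28000 | 11000 | N |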