Overview

Dataset statistics

Number of variables: 8
Number of observations: 150
Missing cells: 28
Missing cells (%): 2.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 9.9 KiB
Average record size in memory: 67.9 B
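
The missing-cell percentage follows from the counts reported here: 28 missing cells out of 8 × 150 cells. A minimal sketch of that arithmetic, with illustrative variable names that do not come from the report:

```python
# Figures as reported in the dataset statistics.
n_variables = 8
n_observations = 150
missing_cells = 28

total_cells = n_variables * n_observations        # 1200 cells in the table
missing_pct = 100 * missing_cells / total_cells    # share of empty cells

print(f"{missing_pct:.1f}%")  # prints 2.3%
```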

Variable types

Text: 1
Numeric: 2
Categorical: 1
DateTime: 4

Dataset

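Description: Coupon information (discount value applied, coupon name, issue date, etc.) for coupons issued by 포스몰 (https://pos-mall.co.kr/index.do).
Author: 한국농수산식품유통공사 (Korea Agro-Fisheries & Food Trade Corporation)
URL: https://www.data.go.kr/data/15072473/fileData.do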

Alerts

할인적용값 (discount value applied) is highly overall correlated with 사용가능주문최소금액 (minimum order amount for use) [High correlation]
사용가능주문최소금액 is highly overall correlated with 할인적용값 [High correlation]
다운허용횟수 (allowed number of downloads) is highly imbalanced (87.4%) [Imbalance]
사용가능종료일자 (usable end date) has 28 (18.7%) missing values [Missing]
사용가능주문최소금액 has 11 (7.3%) zeros [Zeros]
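
The report does not say how the 87.4% imbalance figure is derived. One normalized-entropy formulation that reproduces it from the value counts reported for 다운허용횟수 (1 → 146, 100 → 3, 10 → 1) is sketched here. The function name is hypothetical, and the formula is an assumption about the method rather than a statement of the tool's implementation.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform split, close to 1 for a dominant class."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    # Shannon entropy (in bits) of the class distribution.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Value counts reported for 다운허용횟수: 1 -> 146, 100 -> 3, 10 -> 1.
score = imbalance_score([146, 3, 1])
print(f"{score:.1%}")  # prints 87.4%
```

A perfectly balanced column scores 0 under this definition, so a score of 87.4% indicates that a single class (here the value 1) accounts for nearly all rows.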

Reproduction

Analysis started: 2023-12-12 08:48:13.542771
Analysis finished: 2023-12-12 08:48:15.261724
Duration: 1.72 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
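
The reported duration is the difference between the two timestamps. A short check using only the values shown in this section:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-12 08:48:13.542771")
finished = datetime.fromisoformat("2023-12-12 08:48:15.261724")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # prints 1.72 seconds
```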

Variables

쿠폰명 (coupon name)
Text

Distinct: 148
Distinct (%): 98.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 30
Median length: 21.5
Mean length: 15.22
Min length: 4

Characters and Unicode

Total characters: 2283
Distinct characters: 235
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
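
As a rough illustration of how such character properties can be read, Python's standard `unicodedata` module exposes the general category of each code point. The helper name is hypothetical, and the two input strings are a sample value and a frequent token taken from this report. This sketch covers categories only; scripts and blocks are not available in the standard library.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count characters by Unicode general category (e.g. 'Lo', 'Zs', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)

# 'Lo' = Other Letter, 'Zs' = Space Separator,
# 'Nd' = Decimal Number, 'Po' = Other Punctuation.
print(category_counts("특정상품 할인"))
print(category_counts("5,000원"))
```

Hangul syllables fall under Other Letter, which is why that category dominates the character statistics for this variable.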

Unique

Unique: 146
Unique (%): 97.3%
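
Distinct and Unique measure different things: Distinct counts the different values present, while Unique counts the values that occur exactly once. With 150 rows, 148 distinct values and 146 unique values, two coupon names must each appear twice (146 + 2 × 2 = 150). A small sketch on an illustrative list, which is not the dataset's 150 names:

```python
from collections import Counter

# Illustrative values only; not the full column.
names = ["포도할인", "복숭아할인", "거봉할인", "포도할인"]

counts = Counter(names)
distinct = len(counts)                               # different values present
unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

print(distinct, unique)  # prints 3 2
```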

Sample

1st row: 특정상품 할인
2nd row: 에스베라 포도
3rd row: 포도할인
4th row: 복숭아할인
5th row: 거봉할인

Value | Count | Frequency (%)
쿠폰 | 46 | 8.9%
스마트로 | 27 | 5.2%
할인쿠폰 | 20 | 3.9%
가입 | 11 | 2.1%
30만 | 9 | 1.7%
포스몰 | 9 | 1.7%
5,000원 | 9 | 1.7%
번째 | 9 | 1.7%
할인 | 8 | 1.5%
이벤트 | 7 | 1.3%
Other values (222) | 364 | 70.1%

Most occurring characters

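Value | Count | Frequency (%)
(space) | 370 | 16.2%
0 | 137 | 6.0%
[Hangul character not preserved] | 100 | 4.4%
[Hangul character not preserved] | 99 | 4.3%
[Hangul character not preserved] | 87 | 3.8%
[Hangul character not preserved] | 71 | 3.1%
[Hangul character not preserved] | 60 | 2.6%
5 | 45 | 2.0%
[Hangul character not preserved] | 42 | 1.8%
3 | 41 | 1.8%
Other values (225) | 1231 | 53.9%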

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1409 | 61.7%
Space Separator | 370 | 16.2%
Decimal Number | 305 | 13.4%
Open Punctuation | 49 | 2.1%
Close Punctuation | 49 | 2.1%
Other Punctuation | 34 | 1.5%
Lowercase Letter | 29 | 1.3%
Uppercase Letter | 26 | 1.1%
Math Symbol | 7 | 0.3%
Dash Punctuation | 3 | 0.1%

Most frequent character per category

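Other Letter
Value | Count | Frequency (%)
[Hangul character not preserved] | 100 | 7.1%
[Hangul character not preserved] | 99 | 7.0%
[Hangul character not preserved] | 87 | 6.2%
[Hangul character not preserved] | 71 | 5.0%
[Hangul character not preserved] | 60 | 4.3%
[Hangul character not preserved] | 42 | 3.0%
[Hangul character not preserved] | 38 | 2.7%
[Hangul character not preserved] | 35 | 2.5%
[Hangul character not preserved] | 34 | 2.4%
[Hangul character not preserved] | 34 | 2.4%
Other values (178) | 809 | 57.4%

Lowercase Letter
Value | Count | Frequency (%)
g | 6 | 20.7%
k | 5 | 17.2%
o | 3 | 10.3%
b | 2 | 6.9%
x | 2 | 6.9%
e | 2 | 6.9%
l | 2 | 6.9%
n | 2 | 6.9%
p | 1 | 3.4%
r | 1 | 3.4%
Other values (3) | 3 | 10.3%

Decimal Number
Value | Count | Frequency (%)
0 | 137 | 44.9%
5 | 45 | 14.8%
3 | 41 | 13.4%
1 | 35 | 11.5%
2 | 23 | 7.5%
7 | 7 | 2.3%
4 | 6 | 2.0%
8 | 5 | 1.6%
9 | 5 | 1.6%
6 | 1 | 0.3%

Uppercase Letter
Value | Count | Frequency (%)
S | 9 | 34.6%
C | 7 | 26.9%
B | 2 | 7.7%
A | 2 | 7.7%
T | 1 | 3.8%
K | 1 | 3.8%
M | 1 | 3.8%
D | 1 | 3.8%
H | 1 | 3.8%
E | 1 | 3.8%

Other Punctuation
Value | Count | Frequency (%)
, | 31 | 91.2%
/ | 1 | 2.9%
. | 1 | 2.9%
! | 1 | 2.9%

Math Symbol
Value | Count | Frequency (%)
+ | 3 | 42.9%
< | 2 | 28.6%
> | 2 | 28.6%

Open Punctuation
Value | Count | Frequency (%)
[ | 27 | 55.1%
( | 22 | 44.9%

Close Punctuation
Value | Count | Frequency (%)
] | 27 | 55.1%
) | 22 | 44.9%

Space Separator
Value | Count | Frequency (%)
(space) | 370 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 3 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 2 | 100.0%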

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1409 | 61.7%
Common | 819 | 35.9%
Latin | 55 | 2.4%

Most frequent character per script

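Hangul
Value | Count | Frequency (%)
[Hangul character not preserved] | 100 | 7.1%
[Hangul character not preserved] | 99 | 7.0%
[Hangul character not preserved] | 87 | 6.2%
[Hangul character not preserved] | 71 | 5.0%
[Hangul character not preserved] | 60 | 4.3%
[Hangul character not preserved] | 42 | 3.0%
[Hangul character not preserved] | 38 | 2.7%
[Hangul character not preserved] | 35 | 2.5%
[Hangul character not preserved] | 34 | 2.4%
[Hangul character not preserved] | 34 | 2.4%
Other values (178) | 809 | 57.4%

Common
Value | Count | Frequency (%)
(space) | 370 | 45.2%
0 | 137 | 16.7%
5 | 45 | 5.5%
3 | 41 | 5.0%
1 | 35 | 4.3%
, | 31 | 3.8%
[ | 27 | 3.3%
] | 27 | 3.3%
2 | 23 | 2.8%
( | 22 | 2.7%
Other values (14) | 61 | 7.4%

Latin
Value | Count | Frequency (%)
S | 9 | 16.4%
C | 7 | 12.7%
g | 6 | 10.9%
k | 5 | 9.1%
o | 3 | 5.5%
b | 2 | 3.6%
x | 2 | 3.6%
B | 2 | 3.6%
A | 2 | 3.6%
e | 2 | 3.6%
Other values (13) | 15 | 27.3%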

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1409 | 61.7%
ASCII | 874 | 38.3%

Most frequent character per block

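ASCII
Value | Count | Frequency (%)
(space) | 370 | 42.3%
0 | 137 | 15.7%
5 | 45 | 5.1%
3 | 41 | 4.7%
1 | 35 | 4.0%
, | 31 | 3.5%
[ | 27 | 3.1%
] | 27 | 3.1%
2 | 23 | 2.6%
( | 22 | 2.5%
Other values (37) | 116 | 13.3%

Hangul
Value | Count | Frequency (%)
[Hangul character not preserved] | 100 | 7.1%
[Hangul character not preserved] | 99 | 7.0%
[Hangul character not preserved] | 87 | 6.2%
[Hangul character not preserved] | 71 | 5.0%
[Hangul character not preserved] | 60 | 4.3%
[Hangul character not preserved] | 42 | 3.0%
[Hangul character not preserved] | 38 | 2.7%
[Hangul character not preserved] | 35 | 2.5%
[Hangul character not preserved] | 34 | 2.4%
[Hangul character not preserved] | 34 | 2.4%
Other values (178) | 809 | 57.4%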

할인적용값 (discount value applied)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 20
Distinct (%): 13.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11560.667
Minimum: 100
Maximum: 324000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 KiB

Quantile statistics

Minimum: 100
5-th percentile: 1000
Q1: 3000
Median: 5000
Q3: 10000
95-th percentile: 30000
Maximum: 324000
Range: 323900
Interquartile range (IQR): 7000

Descriptive statistics

Standard deviation: 32046.141
Coefficient of variation (CV): 2.7719977
Kurtosis: 68.570395
Mean: 11560.667
Median Absolute Deviation (MAD): 3000
Skewness: 7.8259679
Sum: 1734100
Variance: 1.0269552 × 10^9
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=20)]

Common values
Value | Count | Frequency (%)
5000 | 42 | 28.0%
10000 | 39 | 26.0%
3000 | 25 | 16.7%
1000 | 12 | 8.0%
2000 | 6 | 4.0%
20000 | 4 | 2.7%
30000 | 4 | 2.7%
100 | 3 | 2.0%
50000 | 3 | 2.0%
2500 | 2 | 1.3%
Other values (10) | 10 | 6.7%

Minimum 10 values
Value | Count | Frequency (%)
100 | 3 | 2.0%
1000 | 12 | 8.0%
2000 | 6 | 4.0%
2500 | 2 | 1.3%
3000 | 25 | 16.7%
3300 | 1 | 0.7%
4000 | 1 | 0.7%
5000 | 42 | 28.0%
5500 | 1 | 0.7%
7000 | 1 | 0.7%

Maximum 10 values
Value | Count | Frequency (%)
324000 | 1 | 0.7%
198000 | 1 | 0.7%
102000 | 1 | 0.7%
50000 | 3 | 2.0%
30000 | 4 | 2.7%
20000 | 4 | 2.7%
15000 | 1 | 0.7%
12000 | 1 | 0.7%
10000 | 39 | 26.0%
9000 | 1 | 0.7%
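
Because the lowest ten and highest ten values together cover all 20 distinct values of 할인적용값, the full distribution can be reassembled from the frequency tables and used to cross-check the summary statistics. The sketch below assumes the sample standard deviation (divisor n − 1), which is what reproduces the reported figure. The dictionary is transcribed from the tables, not read from the original file.

```python
import statistics

# Value -> count, assembled from the frequency tables for 할인적용값.
value_counts = {
    100: 3, 1000: 12, 2000: 6, 2500: 2, 3000: 25, 3300: 1, 4000: 1,
    5000: 42, 5500: 1, 7000: 1, 9000: 1, 10000: 39, 12000: 1, 15000: 1,
    20000: 4, 30000: 4, 50000: 3, 102000: 1, 198000: 1, 324000: 1,
}
# Expand the counts back into one entry per observation.
values = [v for v, c in value_counts.items() for _ in range(c)]

n = len(values)                    # 150 observations
total = sum(values)                # 1734100, the reported Sum
mean = total / n
std = statistics.stdev(values)     # sample standard deviation (n - 1)
cv = std / mean                    # coefficient of variation

print(n, total, round(mean, 3), round(std, 3), round(cv, 4))
```

The three largest values (102000, 198000, 324000) sit far above the median of 5000, which is consistent with the high skewness and kurtosis reported for this variable.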

다운허용횟수 (allowed number of downloads)
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Value | Count
1 | 146
100 | 3
10 | 1

Length

Max length: 3
Median length: 1
Mean length: 1.0466667
Min length: 1

Unique

Unique: 1
Unique (%): 0.7%

Sample

1st row: 100
2nd row: 100
3rd row: 100
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 146 | 97.3%
100 | 3 | 2.0%
10 | 1 | 0.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; image not preserved]

사용가능주문최소금액 (minimum order amount for use)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 17
Distinct (%): 11.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10675.333
Minimum: 0
Maximum: 50000
Zeros: 11
Zeros (%): 7.3%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 5000
Median: 10000
Q3: 10000
95-th percentile: 32750
Maximum: 50000
Range: 50000
Interquartile range (IQR): 5000
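
The 95-th percentile of 32750 is not itself a value in the data, which suggests interpolation between neighbouring observations. The report does not state the rule used. The sketch below assumes linear interpolation at position q × (n − 1), the default in NumPy and pandas, and applies it to the distribution reassembled from the frequency tables for 사용가능주문최소금액. The `quantile` helper is hypothetical.

```python
def quantile(sorted_values, q):
    """Linear-interpolation quantile at position q * (n - 1)."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])

# Value -> count, assembled from the frequency tables for 사용가능주문최소금액.
value_counts = {
    0: 11, 100: 1, 1000: 1, 2000: 4, 3000: 14, 4000: 2, 5000: 29, 5100: 3,
    9900: 1, 10000: 55, 12000: 1, 15000: 3, 20000: 11, 25000: 1, 30000: 5,
    35000: 2, 50000: 6,
}
values = sorted(v for v, c in value_counts.items() for _ in range(c))

q1, q3 = quantile(values, 0.25), quantile(values, 0.75)
p95 = quantile(values, 0.95)
print(q1, q3, q3 - q1, round(p95))  # prints 5000.0 10000.0 5000.0 32750
```

Here the 95-th percentile falls at position 141.55, between an observation of 30000 and one of 35000, giving 30000 + 0.55 × 5000 = 32750.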

Descriptive statistics

Standard deviation: 10767.724
Coefficient of variation (CV): 1.0086546
Kurtosis: 5.3301524
Mean: 10675.333
Median Absolute Deviation (MAD): 5000
Skewness: 2.2393169
Sum: 1601300
Variance: 1.1594388 × 10^8
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=17)]

Common values
Value | Count | Frequency (%)
10000 | 55 | 36.7%
5000 | 29 | 19.3%
3000 | 14 | 9.3%
0 | 11 | 7.3%
20000 | 11 | 7.3%
50000 | 6 | 4.0%
30000 | 5 | 3.3%
2000 | 4 | 2.7%
5100 | 3 | 2.0%
15000 | 3 | 2.0%
Other values (7) | 9 | 6.0%

Minimum 10 values
Value | Count | Frequency (%)
0 | 11 | 7.3%
100 | 1 | 0.7%
1000 | 1 | 0.7%
2000 | 4 | 2.7%
3000 | 14 | 9.3%
4000 | 2 | 1.3%
5000 | 29 | 19.3%
5100 | 3 | 2.0%
9900 | 1 | 0.7%
10000 | 55 | 36.7%

Maximum 10 values
Value | Count | Frequency (%)
50000 | 6 | 4.0%
35000 | 2 | 1.3%
30000 | 5 | 3.3%
25000 | 1 | 0.7%
20000 | 11 | 7.3%
15000 | 3 | 2.0%
12000 | 1 | 0.7%
10000 | 55 | 36.7%
9900 | 1 | 0.7%
5100 | 3 | 2.0%

사용가능시작일자 (usable start date)
DateTime

Distinct: 93
Distinct (%): 62.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB
Minimum: 2014-09-01 00:00:00
Maximum: 2021-07-26 00:00:00
[Figure: histogram with fixed size bins (bins=50)]

사용가능종료일자 (usable end date)
DateTime

Distinct: 60
Distinct (%): 49.2%
Missing: 28
Missing (%): 18.7%
Memory size: 1.3 KiB
Minimum: 2014-09-30 00:00:00
Maximum: 2021-07-31 00:00:00
[Figure: histogram with fixed size bins (bins=50)]

발급시작일자 (issue start date)
DateTime

Distinct: 93
Distinct (%): 62.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB
Minimum: 2014-07-01 00:00:00
Maximum: 2021-07-26 00:00:00
[Figure: histogram with fixed size bins (bins=50)]

발급종료일자 (issue end date)
DateTime

Distinct: 89
Distinct (%): 59.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB
Minimum: 2014-09-30 00:00:00
Maximum: 2021-07-31 00:00:00
[Figure: histogram with fixed size bins (bins=50)]

Interactions

[Figures: four interaction plots; images not preserved]

Correlations

[Figure: correlation plot; image not preserved]

 | 할인적용값 | 다운허용횟수 | 사용가능주문최소금액 | 사용가능시작일자 | 사용가능종료일자 | 발급시작일자 | 발급종료일자
할인적용값 | 1.000 | 0.000 | 0.448 | 0.966 | 0.518 | 0.951 | 0.991
다운허용횟수 | 0.000 | 1.000 | 0.000 | 1.000 | 1.000 | 1.000 | 0.959
사용가능주문최소금액 | 0.448 | 0.000 | 1.000 | 0.866 | 0.919 | 0.810 | 0.846
사용가능시작일자 | 0.966 | 1.000 | 0.866 | 1.000 | 0.998 | 1.000 | 0.999
사용가능종료일자 | 0.518 | 1.000 | 0.919 | 0.998 | 1.000 | 0.997 | 0.999
발급시작일자 | 0.951 | 1.000 | 0.810 | 1.000 | 0.997 | 1.000 | 0.999
발급종료일자 | 0.991 | 0.959 | 0.846 | 0.999 | 0.999 | 0.999 | 1.000

[Figure: correlation plot; image not preserved]

 | 할인적용값 | 사용가능주문최소금액 | 다운허용횟수
할인적용값 | 1.000 | 0.547 | 0.000
사용가능주문최소금액 | 0.547 | 1.000 | 0.000
다운허용횟수 | 0.000 | 0.000 | 1.000

Missing values

[Figure not preserved] A simple visualization of nullity by column.
[Figure not preserved] The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

First rows

Index | 쿠폰명 | 할인적용값 | 다운허용횟수 | 사용가능주문최소금액 | 사용가능시작일자 | 사용가능종료일자 | 발급시작일자 | 발급종료일자
0 | 특정상품 할인 | 1000 | 100 | 0 | 2014-09-01 | 2014-09-30 | 2014-09-01 | 2014-09-30
1 | 에스베라 포도 | 1000 | 100 | 0 | 2014-09-01 | 2014-09-30 | 2014-09-01 | 2014-09-30
2 | 포도할인 | 1000 | 100 | 0 | 2014-09-01 | 2014-09-30 | 2014-09-01 | 2014-09-30
3 | 복숭아할인 | 100 | 1 | 0 | 2014-09-04 | <NA> | 2014-09-04 | 2014-09-30
4 | 거봉할인 | 1000 | 1 | 0 | 2014-09-05 | <NA> | 2014-09-05 | 2014-12-31
5 | 구매하면 쿠폰이 덤 4개 카테고리 | 10000 | 1 | 50000 | 2015-11-26 | 2015-12-11 | 2015-11-26 | 2015-12-06
6 | 친구 추천인 | 5000 | 1 | 20000 | 2015-11-26 | 2015-12-11 | 2015-11-26 | 2015-12-06
7 | 구매하면 쿠폰이 덤 2개 카테고리 | 3000 | 1 | 10000 | 2015-11-26 | 2015-12-11 | 2015-11-26 | 2015-12-06
8 | 구매하면 쿠폰이 덤 3개 카테고리 | 5000 | 1 | 30000 | 2015-11-26 | 2015-12-11 | 2015-11-26 | 2015-12-06
9 | 구매후기 | 10000 | 1 | 50000 | 2015-11-26 | 2015-12-11 | 2015-11-26 | 2015-12-06

Last rows

Index | 쿠폰명 | 할인적용값 | 다운허용횟수 | 사용가능주문최소금액 | 사용가능시작일자 | 사용가능종료일자 | 발급시작일자 | 발급종료일자
140 | C/S 고객 10,000원 쿠폰 | 10000 | 1 | 10000 | 2019-09-10 | 2019-10-31 | 2019-09-10 | 2019-10-31
141 | 절임배추 구매 전용 3,000원 쿠폰 | 3000 | 1 | 25000 | 2019-10-25 | 2019-12-15 | 2019-10-14 | 2019-12-15
142 | 2019년[절임배추] SK임직원 전용 3,000원 쿠폰 | 3000 | 1 | 30000 | 2019-11-28 | 2019-12-10 | 2019-11-28 | 2019-12-10
143 | 진바이오텍 전용 쿠폰(설날 나주배 구매용) | 198000 | 1 | 0 | 2020-01-17 | <NA> | 2020-01-17 | 2020-01-24
144 | CS 10000원 쿠폰 | 10000 | 1 | 10000 | 2020-08-11 | <NA> | 2020-08-11 | 2020-09-11
145 | CS 20000원 쿠폰 | 20000 | 1 | 20000 | 2020-09-29 | <NA> | 2020-09-28 | 2020-10-31
146 | [농축산물 소비 활성화] 5,000원 할인 쿠폰 A | 5000 | 1 | 20000 | 2021-07-19 | 2021-07-25 | 2021-07-19 | 2021-07-25
147 | [농축산물 소비 활성화] 5,000원 할인 쿠폰 B | 5000 | 1 | 20000 | 2021-07-19 | 2021-07-25 | 2021-07-19 | 2021-07-25
148 | [농축산물 소비 활성화 2차] 5,000원 할인쿠폰 B | 5000 | 1 | 20000 | 2021-07-26 | 2021-07-31 | 2021-07-26 | 2021-07-31
149 | [농축산물 소비 활성화 2차] 5,000원 할인쿠폰 A | 5000 | 1 | 20000 | 2021-07-26 | 2021-07-31 | 2021-07-26 | 2021-07-31