Overview

Dataset statistics

Number of variables: 7
Number of observations: 199
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 11.4 KiB
Average record size in memory: 58.7 B
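
The average record size follows from the two figures above it: total memory divided by the number of observations. A minimal check, assuming the report uses binary units (1 KiB = 1024 B) and that the displayed 11.4 KiB is itself a rounded value:

```python
# Figures copied from the dataset statistics above.
total_size_kib = 11.4   # total size in memory (rounded in the report)
n_observations = 199

# Assumption: 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B per record")
```

Because the input is rounded, the result only needs to agree with the reported 58.7 B to one decimal place.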

Variable types

DateTime: 1
Text: 1
Categorical: 3
Numeric: 2

Alerts

2020-01-01 has constant value "" (Constant)
1 is highly overall correlated with 5.797101 (High correlation)
5.797101 is highly overall correlated with 1 and 3 other fields (High correlation)
4 is highly overall correlated with 5.797101 and 2 other fields (High correlation)
[40-49] is highly overall correlated with 5.797101 and 2 other fields (High correlation)
[18-20] is highly overall correlated with 5.797101 and 2 other fields (High correlation)
4 is highly imbalanced (64.8%) (Imbalance)
[40-49] is highly imbalanced (72.5%) (Imbalance)
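
The report does not state how the imbalance percentages are derived. One definition that reproduces both figures is one minus the normalized Shannon entropy of the class frequencies. The sketch below is an illustration under that assumption, not the profiler's confirmed implementation; `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits) of the
    class frequencies and k is the number of classes.
    0 means perfectly balanced; values near 1 mean one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single-class column scores 0 in this sketch
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Value counts taken from the "Common Values" tables of this report.
counts_var_4 = [153, 28, 7, 2, 2, 1, 1, 1, 1, 1, 1, 1]  # variable "4", 12 distinct values
counts_var_40_49 = [176, 18, 3, 1, 1]                   # variable "[40-49]", 5 distinct values

score_4 = imbalance_score(counts_var_4)
score_40_49 = imbalance_score(counts_var_40_49)
print(f"{score_4:.1%}")
print(f"{score_40_49:.1%}")
```

Under this definition the two scores round to the 64.8% and 72.5% shown in the alerts.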

Reproduction

Analysis started: 2023-12-10 06:25:18.973308
Analysis finished: 2023-12-10 06:25:20.741424
Duration: 1.77 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

2020-01-01
Date

CONSTANT 

Distinct: 1
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB
Minimum: 2020-01-01 00:00:00
Maximum: 2020-01-01 00:00:00
Histogram with fixed size bins (bins=1)

배추김치
Text

Distinct: 160
Distinct (%): 80.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 29
Median length: 20
Mean length: 6.1055276
Min length: 2

Characters and Unicode

Total characters: 1215
Distinct characters: 280
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
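
As a rough illustration of the idea, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories over a few sample values from this variable. The `CATEGORY_NAMES` mapping and the `category_counts` helper are illustrative names, and this is not necessarily how the report computes its tables.

```python
import unicodedata
from collections import Counter

# Two-letter Unicode general-category codes and the long names used in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter", "Zs": "Space Separator",
    "Ps": "Open Punctuation", "Pe": "Close Punctuation",
    "Nd": "Decimal Number", "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter", "Po": "Other Punctuation",
}

def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    return Counter(unicodedata.category(ch) for text in values for ch in text)

sample = ["닭튀김(양념소스)", "잡곡밥", "다래"]
counts = category_counts(sample)
for code, n in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), n)
```

Hangul syllables fall under "Other Letter" (`Lo`), which is why that category dominates the tables below.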

Unique

Unique: 135
Unique (%): 67.8%
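
The "Distinct" and "Unique" figures measure different things. The sketch below assumes the usual convention, in which distinct counts the different values present and unique counts the values that occur exactly once. The toy list is made up for illustration, and `distinct_and_unique` is a hypothetical helper.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Toy data (not taken from the dataset): two values repeat, two occur once.
toy = ["쌀밥", "쌀밥", "배추김치", "다래", "잡곡밥", "배추김치"]
toy_distinct, toy_unique = distinct_and_unique(toy)
print(toy_distinct, toy_unique)

# The percentages in the report are the counts divided by the 199 observations.
n_observations = 199
distinct_pct = 160 / n_observations
unique_pct = 135 / n_observations
print(f"{distinct_pct:.1%} {unique_pct:.1%}")
```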

Sample

1st row 닭고기(통닭
2nd row 닭튀김(양념소스)
3rd row
4th row 잡곡밥
5th row 다래
Value | Count | Frequency (%)
아메리카노 | 6 | 2.5%
고구마(찐것 | 4 | 1.7%
배추김치 | 4 | 1.7%
커피믹스 | 3 | 1.3%
멸치볶음 | 3 | 1.3%
떡만둣국 | 3 | 1.3%
 | 3 | 1.3%
인스턴트 | 3 | 1.3%
채소샐러드 | 3 | 1.3%
쌀밥 | 3 | 1.3%
Other values (183) | 203 | 85.3%

Most occurring characters

Value | Count | Frequency (%)
(space) | 238 | 19.6%
 | 28 | 2.3%
( | 23 | 1.9%
) | 22 | 1.8%
 | 20 | 1.6%
 | 20 | 1.6%
 | 17 | 1.4%
 | 17 | 1.4%
 | 15 | 1.2%
 | 15 | 1.2%
Other values (270) | 800 | 65.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 905 | 74.5%
Space Separator | 238 | 19.6%
Open Punctuation | 23 | 1.9%
Close Punctuation | 22 | 1.8%
Decimal Number | 10 | 0.8%
Uppercase Letter | 9 | 0.7%
Lowercase Letter | 6 | 0.5%
Other Punctuation | 2 | 0.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 28 | 3.1%
 | 20 | 2.2%
 | 20 | 2.2%
 | 17 | 1.9%
 | 17 | 1.9%
 | 15 | 1.7%
 | 15 | 1.7%
 | 15 | 1.7%
 | 14 | 1.5%
 | 13 | 1.4%
Other values (247) | 731 | 80.8%

Uppercase Letter
Value | Count | Frequency (%)
S | 2 | 22.2%
T | 1 | 11.1%
A | 1 | 11.1%
P | 1 | 11.1%
I | 1 | 11.1%
G | 1 | 11.1%
J | 1 | 11.1%
C | 1 | 11.1%

Decimal Number
Value | Count | Frequency (%)
0 | 3 | 30.0%
5 | 3 | 30.0%
1 | 1 | 10.0%
2 | 1 | 10.0%
7 | 1 | 10.0%
3 | 1 | 10.0%

Lowercase Letter
Value | Count | Frequency (%)
o | 2 | 33.3%
l | 1 | 16.7%
m | 1 | 16.7%
e | 1 | 16.7%
s | 1 | 16.7%

Space Separator
Value | Count | Frequency (%)
(space) | 238 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 23 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 22 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
% | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 905 | 74.5%
Common | 295 | 24.3%
Latin | 15 | 1.2%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 28 | 3.1%
 | 20 | 2.2%
 | 20 | 2.2%
 | 17 | 1.9%
 | 17 | 1.9%
 | 15 | 1.7%
 | 15 | 1.7%
 | 15 | 1.7%
 | 14 | 1.5%
 | 13 | 1.4%
Other values (247) | 731 | 80.8%

Latin
Value | Count | Frequency (%)
o | 2 | 13.3%
S | 2 | 13.3%
l | 1 | 6.7%
m | 1 | 6.7%
T | 1 | 6.7%
A | 1 | 6.7%
P | 1 | 6.7%
I | 1 | 6.7%
e | 1 | 6.7%
s | 1 | 6.7%
Other values (3) | 3 | 20.0%

Common
Value | Count | Frequency (%)
(space) | 238 | 80.7%
( | 23 | 7.8%
) | 22 | 7.5%
0 | 3 | 1.0%
5 | 3 | 1.0%
% | 2 | 0.7%
1 | 1 | 0.3%
2 | 1 | 0.3%
7 | 1 | 0.3%
3 | 1 | 0.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 905 | 74.5%
ASCII | 310 | 25.5%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 238 | 76.8%
( | 23 | 7.4%
) | 22 | 7.1%
0 | 3 | 1.0%
5 | 3 | 1.0%
o | 2 | 0.6%
S | 2 | 0.6%
% | 2 | 0.6%
l | 1 | 0.3%
1 | 1 | 0.3%
Other values (13) | 13 | 4.2%

Hangul
Value | Count | Frequency (%)
 | 28 | 3.1%
 | 20 | 2.2%
 | 20 | 2.2%
 | 17 | 1.9%
 | 17 | 1.9%
 | 15 | 1.7%
 | 15 | 1.7%
 | 15 | 1.7%
 | 14 | 1.5%
 | 13 | 1.4%
Other values (247) | 731 | 80.8%

4
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 12
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value | Count
1 | 153
2 | 28
3 | 7
5 | 2
4 | 2
Other values (7) | 7

Length

Max length: 5
Median length: 2
Mean length: 2.0452261
Min length: 2

Unique

Unique: 7
Unique (%): 3.5%

Sample

1st row 튀김)
2nd row 3
3rd row 3
4th row 2
5th row 2

Common Values

Value | Count | Frequency (%)
1 | 153 | 76.9%
2 | 28 | 14.1%
3 | 7 | 3.5%
5 | 2 | 1.0%
4 | 2 | 1.0%
튀김) | 1 | 0.5%
냉동 | 1 | 0.5%
6 | 1 | 0.5%
9 | 1 | 0.5%
바닐라맛 | 1 | 0.5%
Other values (2) | 2 | 1.0%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
1 | 153 | 76.9%
2 | 28 | 14.1%
3 | 7 | 3.5%
5 | 2 | 1.0%
4 | 2 | 1.0%
튀김 | 1 | 0.5%
냉동 | 1 | 0.5%
6 | 1 | 0.5%
9 | 1 | 0.5%
바닐라맛 | 1 | 0.5%
Other values (2) | 2 | 1.0%

1
Real number (ℝ)

HIGH CORRELATION 

Distinct: 67
Distinct (%): 33.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.422111
Minimum: 1
Maximum: 67
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 10.5
Median: 28
Q3: 44.5
95-th percentile: 59
Maximum: 67
Range: 66
Interquartile range (IQR): 34

Descriptive statistics

Standard deviation: 18.934835
Coefficient of variation (CV): 0.66620089
Kurtosis: -1.2083884
Mean: 28.422111
Median Absolute Deviation (MAD): 17
Skewness: 0.17603012
Sum: 5656
Variance: 358.52799
Monotonicity: Not monotonic
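
Several of the figures for variable 1 can be cross-checked against each other using only the standard definitions (mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared, range = max - min, IQR = Q3 - Q1). The sketch below is such a consistency check on the reported values; it does not recompute anything from the raw data.

```python
# Reported values for variable "1", copied from the statistics above.
n_observations = 199
total = 5656
std = 18.934835
q1, q3 = 10.5, 44.5
minimum, maximum = 1, 67

mean = total / n_observations      # reported: 28.422111
cv = std / mean                    # reported: 0.66620089
variance = std ** 2                # reported: 358.52799
value_range = maximum - minimum    # reported: 66
iqr = q3 - q1                      # reported: 34

print(f"mean={mean:.6f} cv={cv:.8f} variance={variance:.5f}")
print(f"range={value_range} iqr={iqr}")
```

Small differences in the last digit are expected, since the inputs are themselves rounded in the report.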
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 8 | 4.0%
3 | 6 | 3.0%
5 | 5 | 2.5%
6 | 5 | 2.5%
4 | 5 | 2.5%
7 | 5 | 2.5%
8 | 4 | 2.0%
9 | 4 | 2.0%
10 | 4 | 2.0%
11 | 4 | 2.0%
Other values (57) | 149 | 74.9%

Value | Count | Frequency (%)
1 | 8 | 4.0%
2 | 4 | 2.0%
3 | 6 | 3.0%
4 | 5 | 2.5%
5 | 5 | 2.5%
6 | 5 | 2.5%
7 | 5 | 2.5%
8 | 4 | 2.0%
9 | 4 | 2.0%
10 | 4 | 2.0%

Value | Count | Frequency (%)
67 | 1 | 0.5%
66 | 1 | 0.5%
65 | 1 | 0.5%
64 | 1 | 0.5%
63 | 1 | 0.5%
62 | 1 | 0.5%
61 | 1 | 0.5%
60 | 2 | 1.0%
59 | 2 | 1.0%
58 | 2 | 1.0%

5.797101
Real number (ℝ)

HIGH CORRELATION 

Distinct: 23
Distinct (%): 11.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.5752467
Minimum: 1.075269
Maximum: 33
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 1.075269
5-th percentile: 1.075269
Q1: 1.075269
Median: 1.449275
Q3: 2.150538
95-th percentile: 9.090909
Maximum: 33
Range: 31.924731
Interquartile range (IQR): 1.075269

Descriptive statistics

Standard deviation: 3.6663051
Coefficient of variation (CV): 1.4236714
Kurtosis: 30.604725
Mean: 2.5752467
Median Absolute Deviation (MAD): 0.374006
Skewness: 4.8526068
Sum: 512.4741
Variance: 13.441793
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=23)
Value | Count | Frequency (%)
1.075269 | 51 | 25.6%
1.204819 | 46 | 23.1%
1.449275 | 45 | 22.6%
9.090909 | 11 | 5.5%
2.409639 | 10 | 5.0%
2.150538 | 9 | 4.5%
2.898551 | 5 | 2.5%
3.921569 | 4 | 2.0%
5.882353 | 2 | 1.0%
3.614458 | 2 | 1.0%
Other values (13) | 14 | 7.0%

Value | Count | Frequency (%)
1.075269 | 51 | 25.6%
1.204819 | 46 | 23.1%
1.449275 | 45 | 22.6%
2.0 | 1 | 0.5%
2.150538 | 9 | 4.5%
2.409639 | 10 | 5.0%
2.898551 | 5 | 2.5%
3.225806 | 1 | 0.5%
3.614458 | 2 | 1.0%
3.921569 | 4 | 2.0%

Value | Count | Frequency (%)
33.0 | 1 | 0.5%
24.0 | 1 | 0.5%
17.0 | 1 | 0.5%
16.0 | 1 | 0.5%
9.677419 | 1 | 0.5%
9.090909 | 11 | 5.5%
7.843137 | 1 | 0.5%
7.228916 | 1 | 0.5%
6.024096 | 1 | 0.5%
5.882353 | 2 | 1.0%

[40-49]
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value | Count
[40-49] | 176
[50-59] | 18
1.075269 | 3
4.347826 | 1
1.449275 | 1

Length

Max length: 9
Median length: 8
Mean length: 8.0251256
Min length: 8

Unique

Unique: 2
Unique (%): 1.0%

Sample

1st row 4.347826
2nd row [40-49]
3rd row [40-49]
4th row [40-49]
5th row [40-49]

Common Values

Value | Count | Frequency (%)
[40-49] | 176 | 88.4%
[50-59] | 18 | 9.0%
1.075269 | 3 | 1.5%
4.347826 | 1 | 0.5%
1.449275 | 1 | 0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
40-49 | 176 | 88.4%
50-59 | 18 | 9.0%
1.075269 | 3 | 1.5%
4.347826 | 1 | 0.5%
1.449275 | 1 | 0.5%

[18-20]
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value | Count
[22-24] | 67
[20-22] | 64
[18-20] | 63
[40-49] | 5

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row [40-49]
2nd row [18-20]
3rd row [18-20]
4th row [18-20]
5th row [18-20]

Common Values

Value | Count | Frequency (%)
[22-24] | 67 | 33.7%
[20-22] | 64 | 32.2%
[18-20] | 63 | 31.7%
[40-49] | 5 | 2.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
22-24 | 67 | 33.7%
20-22 | 64 | 32.2%
18-20 | 63 | 31.7%
40-49 | 5 | 2.5%

Interactions


Correlations

Variable | 4 | 1 | 5.797101 | [40-49] | [18-20]
4 | 1.000 | 0.458 | 0.992 | 0.939 | 0.861
1 | 0.458 | 1.000 | 0.393 | 0.497 | 0.279
5.797101 | 0.992 | 0.393 | 1.000 | 0.877 | 0.696
[40-49] | 0.939 | 0.497 | 0.877 | 1.000 | 0.653
[18-20] | 0.861 | 0.279 | 0.696 | 0.653 | 1.000

Variable | [18-20] | [40-49] | 4
[18-20] | 1.000 | 0.582 | 0.552
[40-49] | 0.582 | 1.000 | 0.849
4 | 0.552 | 0.849 | 1.000

Variable | 1 | 5.797101 | 4 | [40-49] | [18-20]
1 | 1.000 | -0.722 | 0.210 | 0.225 | 0.167
5.797101 | -0.722 | 1.000 | 0.854 | 0.801 | 0.525
4 | 0.210 | 0.854 | 1.000 | 0.849 | 0.552
[40-49] | 0.225 | 0.801 | 0.849 | 1.000 | 0.582
[18-20] | 0.167 | 0.525 | 0.552 | 0.582 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

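Index | 2020-01-01 | 배추김치 | 4 | 1 | 5.797101 | [40-49] | [18-20]
0 | 2020-01-01 | 닭고기(통닭 | 튀김) | 3 | 2.0 | 4.347826 | [40-49]
1 | 2020-01-01 | 닭튀김(양념소스) | 3 | 3 | 4.347826 | [40-49] | [18-20]
2 | 2020-01-01 |  | 3 | 4 | 4.347826 | [40-49] | [18-20]
3 | 2020-01-01 | 잡곡밥 | 2 | 5 | 2.898551 | [40-49] | [18-20]
4 | 2020-01-01 | 다래 | 2 | 6 | 2.898551 | [40-49] | [18-20]
5 | 2020-01-01 | 떡국 | 2 | 7 | 2.898551 | [40-49] | [18-20]
6 | 2020-01-01 | 계란과자 | 2 | 8 | 2.898551 | [40-49] | [18-20]
7 | 2020-01-01 | 치킨무 | 2 | 9 | 2.898551 | [40-49] | [18-20]
8 | 2020-01-01 | 맛동산 | 1 | 10 | 1.449275 | [40-49] | [18-20]
9 | 2020-01-01 | 두부 | 1 | 11 | 1.449275 | [40-49] | [18-20]

Index | 2020-01-01 | 배추김치 | 4 | 1 | 5.797101 | [40-49] | [18-20]
189 | 2020-01-01 | 라떼 | 1 | 9 | 9.090909 | [50-59] | [18-20]
190 | 2020-01-01 | 동지팥죽 | 1 | 10 | 9.090909 | [50-59] | [18-20]
191 | 2020-01-01 | 배추김치 | 1 | 11 | 9.090909 | [50-59] | [18-20]
192 | 2020-01-01 | 배추김치 | 4 | 1 | 7.843137 | [50-59] | [22-24]
193 | 2020-01-01 | 아메리카노 | 3 | 2 | 5.882353 | [50-59] | [22-24]
194 | 2020-01-01 | 고구마(찐것) | 3 | 3 | 5.882353 | [50-59] | [22-24]
195 | 2020-01-01 | 채소샐러드 | 2 | 4 | 3.921569 | [50-59] | [22-24]
196 | 2020-01-01 | 계란찜 | 2 | 5 | 3.921569 | [50-59] | [22-24]
197 | 2020-01-01 | 골드키위주스 | 2 | 6 | 3.921569 | [50-59] | [22-24]
198 | 2020-01-01 | 멸치볶음 | 2 | 7 | 3.921569 | [50-59] | [22-24]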