Overview

Dataset statistics

Number of variables: 7
Number of observations: 199
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 11.4 KiB
Average record size in memory: 58.7 B

Variable types

Categorical: 4
Text: 1
Numeric: 2

Alerts

2020-01-01 has constant value "2020-01-01" (Constant)
1 is highly overall correlated with 22.222222 (High correlation)
22.222222 is highly overall correlated with 1 and 1 other field (High correlation)
2 is highly overall correlated with [40-49] (High correlation)
[40-49] is highly overall correlated with 2 and 1 other field (High correlation)
09 is highly overall correlated with 22.222222 and 1 other field (High correlation)
2 is highly imbalanced (73.7%) (Imbalance)

Reproduction

Analysis started: 2023-12-10 06:31:55.307850
Analysis finished: 2023-12-10 06:31:57.136548
Duration: 1.83 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

2020-01-01
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value       Count
2020-01-01  199

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020-01-01
2nd row: 2020-01-01
3rd row: 2020-01-01
4th row: 2020-01-01
5th row: 2020-01-01

Common Values

Value       Count  Frequency (%)
2020-01-01  199    100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; same counts as the Common Values table]

떡국
Text

Distinct: 139
Distinct (%): 69.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 24
Median length: 12
Mean length: 5.3668342
Min length: 2

Characters and Unicode

Total characters: 1068
Distinct characters: 238
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
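
As a rough illustration of how such per-category counts can be obtained, the sketch below uses Python's standard `unicodedata` module on two values from the report's sample rows. The helper name `category_counts` and the `CATEGORY_NAMES` mapping are illustrative assumptions, not part of the profiling tool.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Nd": "Decimal Number",
    "Ll": "Lowercase Letter",
    "Po": "Other Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the two-letter code for categories not listed above.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Two values taken from the report's sample rows.
sample = ["홈런볼 초코", "돼지삼겹살(구운것)"]
counts = category_counts(sample)
for name, n in counts.most_common():
    print(name, n)
```

Hangul syllables fall under "Other Letter", which is why that category dominates the tables that follow.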

Unique

Unique: 108
Unique (%): 54.3%

Sample

1st row: 배추김치
2nd row: 아메리카노
3rd row: 마늘쫑무침
4th row: 깍두기
5th row: 홈런볼 초코

Value               Count  Frequency (%)
배추김치            10     4.6%
쌀밥                9      4.1%
아메리카노          5      2.3%
서울우유            4      1.8%
떡만둣국            4      1.8%
떡국                4      1.8%
(empty)             3      1.4%
뚝배기불고기        3      1.4%
바나나              3      1.4%
우유                3      1.4%
Other values (144)  171    78.1%

Most occurring characters

Value               Count  Frequency (%)
(space)             219    20.5%
(not preserved)     30     2.8%
(not preserved)     21     2.0%
(not preserved)     20     1.9%
(not preserved)     20     1.9%
(not preserved)     16     1.5%
(not preserved)     15     1.4%
(not preserved)     15     1.4%
(not preserved)     15     1.4%
(not preserved)     14     1.3%
Other values (228)  683    64.0%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       817    76.5%
Space Separator    219    20.5%
Open Punctuation   13     1.2%
Close Punctuation  10     0.9%
Decimal Number     6      0.6%
Lowercase Letter   2      0.2%
Other Punctuation  1      0.1%

Most frequent character per category

Other Letter

Value               Count  Frequency (%)
(not preserved)     30     3.7%
(not preserved)     21     2.6%
(not preserved)     20     2.4%
(not preserved)     20     2.4%
(not preserved)     16     2.0%
(not preserved)     15     1.8%
(not preserved)     15     1.8%
(not preserved)     15     1.8%
(not preserved)     14     1.7%
(not preserved)     14     1.7%
Other values (217)  637    78.0%

Decimal Number

Value  Count  Frequency (%)
0      2      33.3%
1      1      16.7%
2      1      16.7%
7      1      16.7%
5      1      16.7%

Lowercase Letter

Value  Count  Frequency (%)
m      1      50.0%
l      1      50.0%

Space Separator

Value    Count  Frequency (%)
(space)  219    100.0%

Open Punctuation

Value  Count  Frequency (%)
(      13     100.0%

Close Punctuation

Value  Count  Frequency (%)
)      10     100.0%

Other Punctuation

Value  Count  Frequency (%)
%      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  817    76.5%
Common  249    23.3%
Latin   2      0.2%

Most frequent character per script

Hangul

Value               Count  Frequency (%)
(not preserved)     30     3.7%
(not preserved)     21     2.6%
(not preserved)     20     2.4%
(not preserved)     20     2.4%
(not preserved)     16     2.0%
(not preserved)     15     1.8%
(not preserved)     15     1.8%
(not preserved)     15     1.8%
(not preserved)     14     1.7%
(not preserved)     14     1.7%
Other values (217)  637    78.0%

Common

Value    Count  Frequency (%)
(space)  219    88.0%
(        13     5.2%
)        10     4.0%
0        2      0.8%
%        1      0.4%
1        1      0.4%
2        1      0.4%
7        1      0.4%
5        1      0.4%

Latin

Value  Count  Frequency (%)
m      1      50.0%
l      1      50.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  817    76.5%
ASCII   251    23.5%

Most frequent character per block

ASCII

Value    Count  Frequency (%)
(space)  219    87.3%
(        13     5.2%
)        10     4.0%
0        2      0.8%
%        1      0.4%
1        1      0.4%
2        1      0.4%
7        1      0.4%
5        1      0.4%
m        1      0.4%

Hangul

Value               Count  Frequency (%)
(not preserved)     30     3.7%
(not preserved)     21     2.6%
(not preserved)     20     2.4%
(not preserved)     20     2.4%
(not preserved)     16     2.0%
(not preserved)     15     1.8%
(not preserved)     15     1.8%
(not preserved)     15     1.8%
(not preserved)     14     1.7%
(not preserved)     14     1.7%
Other values (217)  637    78.0%

2
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 6
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value    Count
1        178
2        11
3        5
4        2
삶은것)  2

Length

Max length: 5
Median length: 2
Mean length: 2.040201
Min length: 2

Unique

Unique: 1
Unique (%): 0.5%

Sample

1st row: 2
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value    Count  Frequency (%)
1        178    89.4%
2        11     5.5%
3        5      2.5%
4        2      1.0%
삶은것)  2      1.0%
튀김)    1      0.5%
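
The report does not state how the 73.7% imbalance figure for this variable is derived. One normalized-entropy formula, 1 - H / log(k) with H the Shannon entropy of the class frequencies and k the number of classes, reproduces it from the counts listed under Common Values. The sketch below is an illustration under that assumption; the function name `imbalance_score` is hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k) for a list of class counts.

    H is the Shannon entropy of the class frequencies and k the number of
    classes. 0 means perfectly balanced; values near 1 mean one class
    dominates. A single-class input is scored 0 by convention in this sketch.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)


# Counts copied from the Common Values table of variable "2".
value_counts = {"1": 178, "2": 11, "3": 5, "4": 2, "삶은것)": 2, "튀김)": 1}
score = imbalance_score(list(value_counts.values()))
print(f"{score:.1%}")
```

The ratio is independent of the logarithm base, so natural logarithms are used throughout.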

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; same counts as the Common Values table]

1
Real number (ℝ)

HIGH CORRELATION 

Distinct: 24
Distinct (%): 12.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.4371859
Minimum: 1
Maximum: 24
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 2
Median: 4
Q3: 7.5
95th percentile: 14.1
Maximum: 24
Range: 23
Interquartile range (IQR): 5.5

Descriptive statistics

Standard deviation: 4.7220549
Coefficient of variation (CV): 0.86847405
Kurtosis: 2.7414119
Mean: 5.4371859
Median Absolute Deviation (MAD): 3
Skewness: 1.5849211
Sum: 1082
Variance: 22.297802
Monotonicity: Not monotonic
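
Several of the figures above can be cross-checked against each other. The sketch below assumes the usual definitions (mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared, range = maximum - minimum, IQR = Q3 - Q1) and uses only numbers copied from this report; the variable names are illustrative.

```python
# Figures copied from the report for variable "1".
n = 199            # number of observations
total = 1082       # Sum
mean = 5.4371859
std = 4.7220549
q1, q3 = 2, 7.5
minimum, maximum = 1, 24

# Derived quantities under the usual textbook definitions.
derived = {
    "mean": total / n,
    "cv": std / mean,
    "variance": std ** 2,
    "range": maximum - minimum,
    "iqr": q3 - q1,
}
for name, value in derived.items():
    print(f"{name}: {value:.6f}")
```

Each derived value agrees with the corresponding reported statistic to the precision shown in the report.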
[Figure: Histogram with fixed size bins (bins=24)]

Common Values

Value              Count  Frequency (%)
1                  40     20.1%
2                  28     14.1%
3                  21     10.6%
4                  18     9.0%
5                  15     7.5%
6                  15     7.5%
7                  12     6.0%
8                  10     5.0%
9                  8      4.0%
10                 6      3.0%
Other values (14)  26     13.1%

Minimum 10 values

Value  Count  Frequency (%)
1      40     20.1%
2      28     14.1%
3      21     10.6%
4      18     9.0%
5      15     7.5%
6      15     7.5%
7      12     6.0%
8      10     5.0%
9      8      4.0%
10     6      3.0%

Maximum 10 values

Value  Count  Frequency (%)
24     1      0.5%
23     1      0.5%
22     1      0.5%
21     1      0.5%
20     1      0.5%
19     1      0.5%
18     1      0.5%
17     1      0.5%
16     1      0.5%
15     1      0.5%

22.222222
Real number (ℝ)

HIGH CORRELATION 

Distinct: 26
Distinct (%): 13.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.495254
Minimum: 3
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 3
5th percentile: 4
Q1: 6.25
Median: 11.111111
Q3: 20
95th percentile: 70
Maximum: 100
Range: 97
Interquartile range (IQR): 13.75

Descriptive statistics

Standard deviation: 21.927809
Coefficient of variation (CV): 1.1855911
Kurtosis: 7.2555256
Mean: 18.495254
Median Absolute Deviation (MAD): 5.555555
Skewness: 2.7192199
Sum: 3680.5556
Variance: 480.8288
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=26)]

Common Values

Value              Count  Frequency (%)
4.0                24     12.1%
11.111111          21     10.6%
25.0               15     7.5%
6.25               15     7.5%
16.666667          14     7.0%
7.692308           13     6.5%
14.285714          12     6.0%
12.5               12     6.0%
5.555556           11     5.5%
5.882353           11     5.5%
Other values (16)  51     25.6%

Minimum 10 values

Value      Count  Frequency (%)
3.0        1      0.5%
4.0        24     12.1%
5.555556   11     5.5%
5.882353   11     5.5%
6.25       15     7.5%
6.666667   7      3.5%
7.692308   13     6.5%
8.0        1      0.5%
11.111111  21     10.6%
11.764706  1      0.5%

Maximum 10 values

Value      Count  Frequency (%)
100.0      10     5.0%
66.666667  1      0.5%
50.0       8      4.0%
33.333333  9      4.5%
28.571429  1      0.5%
26.666667  1      0.5%
25.0       15     7.5%
23.529412  1      0.5%
22.222222  2      1.0%
20.0       3      1.5%

[40-49]
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value             Count
[40-49]           106
[0-19]            74
[50-59]           13
[60-99]           3
25.000000         1
Other values (2)  2

Length

Max length: 10
Median length: 8
Mean length: 7.6482412
Min length: 7

Unique

Unique: 3
Unique (%): 1.5%

Sample

1st row: [40-49]
2nd row: [40-49]
3rd row: [40-49]
4th row: [40-49]
5th row: [40-49]

Common Values

Value      Count  Frequency (%)
[40-49]    106    53.3%
[0-19]     74     37.2%
[50-59]    13     6.5%
[60-99]    3      1.5%
25.000000  1      0.5%
6.666667   1      0.5%
5.555556   1      0.5%
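
The Distinct and Unique figures for this variable are consistent with reading "Distinct" as the number of different values and "Unique" as the number of values that occur exactly once. The sketch below recomputes both from the Common Values counts under that reading; the variable names are illustrative.

```python
from collections import Counter

# Counts copied from the Common Values table of variable "[40-49]".
value_counts = Counter({
    "[40-49]": 106,
    "[0-19]": 74,
    "[50-59]": 13,
    "[60-99]": 3,
    "25.000000": 1,
    "6.666667": 1,
    "5.555556": 1,
})

n_rows = sum(value_counts.values())
distinct = len(value_counts)                               # different values
unique = sum(1 for c in value_counts.values() if c == 1)   # seen exactly once

print(f"rows={n_rows}")
print(f"distinct={distinct} ({distinct / n_rows:.1%})")
print(f"unique={unique} ({unique / n_rows:.1%})")
```

The three values that occur once are all numeric strings rather than bracketed ranges, which suggests they do not belong in this column.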

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; same counts as the Common Values table]

09
Categorical

HIGH CORRELATION 

Distinct: 20
Distinct (%): 10.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value              Count
12                 36
10                 26
19                 23
20                 22
15                 15
Other values (15)  77

Length

Max length: 8
Median length: 3
Mean length: 3.0653266
Min length: 3

Unique

Unique: 3
Unique (%): 1.5%

Sample

1st row: 09
2nd row: 09
3rd row: 09
4th row: 09
5th row: 09

Common Values

Value              Count  Frequency (%)
12                 36     18.1%
10                 26     13.1%
19                 23     11.6%
20                 22     11.1%
15                 15     7.5%
18                 14     7.0%
16                 10     5.0%
09                 9      4.5%
13                 9      4.5%
11                 9      4.5%
Other values (10)  26     13.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; same counts as the Common Values table]

Interactions

[Figures: four interaction plots; captions not preserved]

Correlations

[Figure: correlation plot]

           2      1      22.222222  [40-49]  09
2          1.000  0.000  0.313      0.793    0.752
1          0.000  1.000  0.342      0.000    0.000
22.222222  0.313  0.342  1.000      0.555    0.847
[40-49]    0.793  0.000  0.555      1.000    0.858
09         0.752  0.000  0.847      0.858    1.000

[Figure: correlation plot]

         [40-49]  09     2
[40-49]  1.000    0.574  0.623
09       0.574    1.000  0.454
2        0.623    0.454  1.000

[Figure: correlation plot]

           1       22.222222  2      [40-49]  09
1          1.000   -0.734     0.000  0.000    0.000
22.222222  -0.734  1.000      0.191  0.221    0.555
2          0.000   0.191      1.000  0.623    0.454
[40-49]    0.000   0.221      0.623  1.000    0.574
09         0.000   0.555      0.454  0.574    1.000

Missing values

[Figure: a simple visualization of nullity by column.]
[Figure: nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

First rows

     2020-01-01  떡국                    2        1  22.222222  [40-49]   09
0    2020-01-01  배추김치                2        2  22.222222  [40-49]   09
1    2020-01-01  아메리카노              1        3  11.111111  [40-49]   09
2    2020-01-01  마늘쫑무침              1        4  11.111111  [40-49]   09
3    2020-01-01  깍두기                  1        5  11.111111  [40-49]   09
4    2020-01-01  홈런볼 초코             1        6  11.111111  [40-49]   09
5    2020-01-01  락토핏 생유산균 골드    1        7  11.111111  [40-49]   09
6    2020-01-01  쌀밥                    1        1  14.285714  [40-49]   15
7    2020-01-01  기름장                  1        2  14.285714  [40-49]   15
8    2020-01-01  양배추샐러드            1        3  14.285714  [40-49]   15
9    2020-01-01  돼지삼겹살(구운것)      1        4  14.285714  [40-49]   15

Last rows

     2020-01-01  떡국                    2        1  22.222222  [40-49]   09
189  2020-01-01  오이소박이              1        12  5.555556  [0-19]    10
190  2020-01-01  닭고기(가슴             삶은것)  1   13.0      5.555556  [0-19]
191  2020-01-01  상추                    1        14  5.555556  [0-19]    10
192  2020-01-01  김치찌개                1        1   33.333333 [0-19]    17
193  2020-01-01  계란말이                1        2   33.333333 [0-19]    17
194  2020-01-01  (empty)                 1        1   100.0     [0-19]    08
195  2020-01-01  사과                    1        1   100.0     [0-19]    16
196  2020-01-01  춘장                    1        1   20.0      [60-99]   12
197  2020-01-01  치킨버거                1        2   20.0      [60-99]   12
198  2020-01-01  감자튀김(후렌치후라이)  1        3   20.0      [60-99]   12
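
Three things in this report point to a loading problem rather than a property of the data. The variable names are themselves data values (2020-01-01, 떡국, 2, ...), which suggests the first record was read as the header. Single-digit values have length 2, which suggests a space after each separator. Row 190 is shifted one column to the right, which is what an unquoted comma inside a name would cause. The sketch below shows one way to load such a file with the standard `csv` module. The raw layout, the column names in `COLUMNS`, the helper `load_rows`, and the trailing value of the third raw line are all assumptions made for illustration; the report shows none of them.

```python
import csv
import io

# Hypothetical column names; the report does not reveal the real ones.
COLUMNS = ["date", "item", "c3", "c4", "c5", "c6", "c7"]

# Assumed raw layout: no header line, ", " as separator, and an unquoted
# comma inside one item name. The first two lines use values from the
# report's sample; the third is a guess at the unshifted form of row 190
# (its final field, "10", is assumed).
RAW = (
    "2020-01-01, 떡국, 2, 1, 22.222222, [40-49], 09\n"
    "2020-01-01, 배추김치, 2, 2, 22.222222, [40-49], 09\n"
    "2020-01-01, 닭고기(가슴, 삶은것), 1, 13, 5.555556, [0-19], 10\n"
)


def load_rows(text, columns=COLUMNS):
    """Parse header-less CSV text, folding surplus fields into the item name."""
    rows = []
    # skipinitialspace drops the blank that follows each comma.
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        extra = len(fields) - len(columns)
        if extra < 0:
            raise ValueError(f"too few fields: {fields!r}")
        if extra > 0:
            # Re-join the pieces of an item name that was split on its comma.
            item = ",".join(fields[1:2 + extra])
            fields = [fields[0], item] + fields[2 + extra:]
        rows.append(dict(zip(columns, fields)))
    return rows


rows = load_rows(RAW)
for row in rows:
    print(row)
```

With pandas, passing `header=None`, `names=...`, and `skipinitialspace=True` to `read_csv` addresses the first two symptoms; the embedded comma still needs quoting at the source or a repair step like the one above.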