Overview

Dataset statistics

Number of variables: 7
Number of observations: 199
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 11.4 KiB
Average record size in memory: 58.7 B
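
The average record size follows from the total memory size and the number of observations. A minimal sketch of that arithmetic, assuming the tool divides total bytes by row count (the variable names are illustrative only):

```python
KIB = 1024  # bytes per kibibyte

total_size_kib = 11.4   # "Total size in memory" as printed in the report
n_observations = 199    # "Number of observations" as printed in the report

# Average bytes per row, derived from the two rounded report values.
avg_record_bytes = total_size_kib * KIB / n_observations
print(f"{avg_record_bytes:.1f} B")
```

Because the inputs are already rounded, the result agrees with the reported 58.7 B only to the precision shown.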

Variable types

Categorical: 4
Text: 1
Numeric: 2

Alerts

2020-01-01 has constant value "2020-01-01" [Constant]
1 is highly overall correlated with 9.090909 [High correlation]
9.090909 is highly overall correlated with 1 and 2 other fields [High correlation]
3 is highly overall correlated with 9.090909 and 2 other fields [High correlation]
[20-29] is highly overall correlated with 9.090909 and 2 other fields [High correlation]
[22-24] is highly overall correlated with 3 and 1 other fields [High correlation]
3 is highly imbalanced (67.9%) [Imbalance]
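
The 67.9% imbalance figure can be reproduced from the Common Values counts of column 3, assuming the score is one minus the Shannon entropy of the class frequencies normalized by log2 of the number of classes. That definition is an assumption about how the alert is computed, not something the report states; `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (bits) of the class frequencies."""
    total = sum(counts)
    k = len(counts)
    if k <= 1:
        return 0.0  # degenerate case: a single class has no defined imbalance here
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Counts of the 9 distinct values of column "3", copied from its Common Values table.
counts_col_3 = [163, 23, 5, 2, 2, 1, 1, 1, 1]
score = imbalance_score(counts_col_3)
print(f"{score:.1%}")  # 67.9%
```

A score of 0 would mean perfectly balanced classes; values near 1 mean one class dominates, as the value 1 does here with 163 of 199 rows.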

Reproduction

Analysis started: 2023-12-10 06:32:07.372680
Analysis finished: 2023-12-10 06:32:08.952698
Duration: 1.58 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
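
The duration is the difference between the two timestamps. A minimal sketch that checks it with the standard library:

```python
from datetime import datetime

# Timestamps copied from the report.
started = datetime.fromisoformat("2023-12-10 06:32:07.372680")
finished = datetime.fromisoformat("2023-12-10 06:32:08.952698")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 1.58 seconds
```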

Variables

2020-01-01
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value       Count
2020-01-01  199

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020-01-01
2nd row: 2020-01-01
3rd row: 2020-01-01
4th row: 2020-01-01
5th row: 2020-01-01

Common Values

Value       Count  Frequency (%)
2020-01-01  199    100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; the single value 2020-01-01 covers all 199 rows (100.0%)]

떡국
Text

Distinct: 148
Distinct (%): 74.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 15
Median length: 13
Mean length: 5.321608
Min length: 2

Characters and Unicode

Total characters: 1059
Distinct characters: 226
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
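
As an illustration of such character properties, the general category of each code point can be read with Python's `unicodedata` module. This is a minimal sketch over the five sample values listed in this report, not the tool's own implementation; `category_counts` is a hypothetical helper.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories (e.g. 'Lo', 'Ps') over all characters."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)

# The five sample values of the text variable, copied from the report.
sample = ["배추김치", "된장찌개", "간장", "옥수수샐러드", "후르츠칵테일(통조림)"]
counts = category_counts(sample)
print(counts)
```

Hangul syllables fall under `Lo` (Other Letter) and the parentheses under `Ps` and `Pe` (Open and Close Punctuation), which matches the category names used in the tables of this section.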

Unique

Unique: 121
Unique (%): 60.8%

Sample

1st row: 배추김치
2nd row: 된장찌개
3rd row: 간장
4th row: 옥수수샐러드
5th row: 후르츠칵테일(통조림)

Value               Count  Frequency (%)
배추김치            8      3.8%
쌀밥                6      2.8%
떡국                5      2.3%
흑미밥              4      1.9%
소등심(구운것       4      1.9%
라면                4      1.9%
떡만둣국            4      1.9%
                    4      1.9%
계란후라이          3      1.4%
갈비탕              2      0.9%
Other values (149)  169    79.3%

Most occurring characters

ValueCountFrequency (%)
213
 
20.1%
29
 
2.7%
23
 
2.2%
20
 
1.9%
( 20
 
1.9%
19
 
1.8%
) 18
 
1.7%
15
 
1.4%
15
 
1.4%
14
 
1.3%
Other values (216) 673
63.6%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       801    75.6%
Space Separator    213    20.1%
Open Punctuation   20     1.9%
Close Punctuation  18     1.7%
Decimal Number     5      0.5%
Lowercase Letter   2      0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
29
 
3.6%
23
 
2.9%
20
 
2.5%
19
 
2.4%
15
 
1.9%
15
 
1.9%
14
 
1.7%
13
 
1.6%
12
 
1.5%
12
 
1.5%
Other values (208) 629
78.5%
Decimal Number
Value  Count  Frequency (%)
0      3      60.0%
5      1      20.0%
1      1      20.0%
Lowercase Letter
Value  Count  Frequency (%)
m      1      50.0%
l      1      50.0%
Space Separator
Value    Count  Frequency (%)
(space)  213    100.0%
Open Punctuation
Value  Count  Frequency (%)
(      20     100.0%
Close Punctuation
Value  Count  Frequency (%)
)      18     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  801    75.6%
Common  256    24.2%
Latin   2      0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
29
 
3.6%
23
 
2.9%
20
 
2.5%
19
 
2.4%
15
 
1.9%
15
 
1.9%
14
 
1.7%
13
 
1.6%
12
 
1.5%
12
 
1.5%
Other values (208) 629
78.5%
Common
Value    Count  Frequency (%)
(space)  213    83.2%
(        20     7.8%
)        18     7.0%
0        3      1.2%
5        1      0.4%
1        1      0.4%
Latin
Value  Count  Frequency (%)
m      1      50.0%
l      1      50.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  801    75.6%
ASCII   258    24.4%

Most frequent character per block

ASCII
Value    Count  Frequency (%)
(space)  213    82.6%
(        20     7.8%
)        18     7.0%
0        3      1.2%
5        1      0.4%
m        1      0.4%
l        1      0.4%
1        1      0.4%
Hangul
ValueCountFrequency (%)
29
 
3.6%
23
 
2.9%
20
 
2.5%
19
 
2.4%
15
 
1.9%
15
 
1.9%
14
 
1.7%
13
 
1.6%
12
 
1.5%
12
 
1.5%
Other values (208) 629
78.5%

3
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 9
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value             Count
1                 163
2                 23
3                 5
튀김)             2
4                 2
Other values (4)  4

Length

Max length: 5
Median length: 2
Mean length: 2.0603015
Min length: 2

Unique

Unique: 4
Unique (%): 2.0%

Sample

1st row: 3
2nd row: 2
3rd row: 2
4th row: 1
5th row: 1

Common Values

Value     Count  Frequency (%)
1         163    81.9%
2         23     11.6%
3         5      2.5%
튀김)     2      1.0%
4         2      1.0%
양념장    1      0.5%
삶은것    1      0.5%
도토리묵  1      0.5%
액상      1      0.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; counts match the Common Values table above]

1
Real number (ℝ)

HIGH CORRELATION 

Distinct: 44
Distinct (%): 22.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.889447
Minimum: 1
Maximum: 44
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 5
Median: 11
Q3: 18
95th percentile: 34.1
Maximum: 44
Range: 43
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 10.018561
Coefficient of variation (CV): 0.77726844
Kurtosis: 0.71498838
Mean: 12.889447
Median Absolute Deviation (MAD): 6
Skewness: 1.0531295
Sum: 2565
Variance: 100.37155
Monotonicity: Not monotonic
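
Several of these statistics are related by simple identities, which gives a quick consistency check on the reported numbers. A minimal sketch using the values printed for the variable named 1; it only checks the textbook relationships and is not the tool's own computation:

```python
# Values copied from the report for the numeric variable named "1".
mean = 12.889447
std = 10.018561
q1, q3 = 5, 18
minimum, maximum = 1, 44
total, n = 2565, 199

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
mean_from_sum = total / n        # mean recomputed from Sum and row count

print(cv, variance, iqr, value_range, mean_from_sum)
```

Small differences in the last printed digits are expected, since the report shows rounded inputs.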
[Figure: histogram with fixed size bins (bins=44)]

Common values

Value              Count  Frequency (%)
1                  14     7.0%
2                  12     6.0%
3                  11     5.5%
6                  10     5.0%
4                  9      4.5%
5                  9      4.5%
7                  8      4.0%
8                  8      4.0%
9                  8      4.0%
10                 8      4.0%
Other values (34)  102    51.3%

Minimum 10 values

Value  Count  Frequency (%)
1      14     7.0%
2      12     6.0%
3      11     5.5%
4      9      4.5%
5      9      4.5%
6      10     5.0%
7      8      4.0%
8      8      4.0%
9      8      4.0%
10     8      4.0%

Maximum 10 values

Value  Count  Frequency (%)
44     1      0.5%
43     1      0.5%
42     1      0.5%
41     1      0.5%
40     1      0.5%
39     1      0.5%
38     1      0.5%
37     1      0.5%
36     1      0.5%
35     1      0.5%

9.090909
Real number (ℝ)

HIGH CORRELATION 

Distinct: 27
Distinct (%): 13.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.4480969
Minimum: 1
Maximum: 33.333333
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 1.724138
Q1: 3.030303
Median: 4.166667
Q3: 4.545455
95th percentile: 16.666667
Maximum: 33.333333
Range: 32.333333
Interquartile range (IQR): 1.515152

Descriptive statistics

Standard deviation: 5.4311226
Coefficient of variation (CV): 0.99688437
Kurtosis: 10.707151
Mean: 5.4480969
Median Absolute Deviation (MAD): 1.005747
Skewness: 3.0492192
Sum: 1084.1713
Variance: 29.497093
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=27)]

Common values

Value              Count  Frequency (%)
1.724138           35     17.6%
4.545455           32     16.1%
3.225806           24     12.1%
3.030303           22     11.1%
4.166667           15     7.5%
4.347826           14     7.0%
16.666667          12     6.0%
9.090909           6      3.0%
3.333333           6      3.0%
3.448276           6      3.0%
Other values (17)  27     13.6%

Minimum 10 values

Value     Count  Frequency (%)
1.0       1      0.5%
1.724138  35     17.6%
3.030303  22     11.1%
3.225806  24     12.1%
3.333333  6      3.0%
3.448276  6      3.0%
4.0       1      0.5%
4.166667  15     7.5%
4.347826  14     7.0%
4.545455  32     16.1%

Maximum 10 values

Value      Count  Frequency (%)
33.333333  3      1.5%
27.0       1      0.5%
18.0       1      0.5%
17.391304  1      0.5%
17.0       1      0.5%
16.666667  12     6.0%
13.043478  1      0.5%
12.5       1      0.5%
9.090909   6      3.0%
8.695652   1      0.5%

[20-29]
Categorical

HIGH CORRELATION 

Distinct: 11
Distinct (%): 5.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value             Count
[0-19]            81
[20-29]           52
[40-49]           35
[30-39]           13
[50-59]           6
Other values (6)  12

Length

Max length: 9
Median length: 8
Mean length: 7.6231156
Min length: 7

Unique

Unique: 4
Unique (%): 2.0%

Sample

1st row: [20-29]
2nd row: [20-29]
3rd row: [20-29]
4th row: [20-29]
5th row: [20-29]

Common Values

Value     Count  Frequency (%)
[0-19]    81     40.7%
[20-29]   52     26.1%
[40-49]   35     17.6%
[30-39]   13     6.5%
[50-59]   6      3.0%
[60-99]   6      3.0%
4.545455  2      1.0%
3.030303  1      0.5%
3.225806  1      0.5%
4.166667  1      0.5%

Length

[Figure: histogram of lengths of the category]

[Figure: plot of common values; counts match the Common Values table above]

[22-24]
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value             Count
[22-24]           109
[20-22]           46
[18-20]           38
[20-29]           2
[0-19]            2
Other values (2)  2

Length

Max length: 8
Median length: 8
Mean length: 7.9899497
Min length: 7

Unique

Unique: 2
Unique (%): 1.0%

Sample

1st row: [22-24]
2nd row: [22-24]
3rd row: [22-24]
4th row: [22-24]
5th row: [22-24]

Common Values

Value    Count  Frequency (%)
[22-24]  109    54.8%
[20-22]  46     23.1%
[18-20]  38     19.1%
[20-29]  2      1.0%
[0-19]   2      1.0%
[40-49]  1      0.5%
[30-39]  1      0.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; counts match the Common Values table above]

Interactions

[Figures: four interaction plots for pairs of the numeric variables]

Correlations

[Figure: correlation heatmap]

          3      1      9.090909  [20-29]  [22-24]
3         1.000  0.217  0.817     0.893    0.893
1         0.217  1.000  0.380     0.000    0.000
9.090909  0.817  0.380  1.000     0.843    0.602
[20-29]   0.893  0.000  0.843     1.000    0.909
[22-24]   0.893  0.000  0.602     0.909    1.000

[Figure: correlation heatmap]

         [22-24]  3      [20-29]
[22-24]  1.000    0.751  0.751
3        0.751    1.000  0.689
[20-29]  0.751    0.689  1.000

[Figure: correlation heatmap]

          1       9.090909  3      [20-29]  [22-24]
1         1.000   -0.745    0.099  0.000    0.000
9.090909  -0.745  1.000     0.581  0.609    0.376
3         0.099   0.581     1.000  0.689    0.751
[20-29]   0.000   0.609     0.689  1.000    0.751
[22-24]   0.000   0.376     0.751  0.751    1.000
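
The high-correlation alerts in the Overview are consistent with the last matrix above if a pair counts as highly correlated when its absolute coefficient exceeds 0.5. Both the choice of matrix and the 0.5 threshold are assumptions inferred from the numbers, not stated in the report (the threshold should be checked against config.json). A minimal sketch with a hypothetical helper, `highly_correlated`:

```python
COLUMNS = ["1", "9.090909", "3", "[20-29]", "[22-24]"]

# Coefficients copied from the last correlation matrix in the report.
MATRIX = [
    [1.000, -0.745, 0.099, 0.000, 0.000],
    [-0.745, 1.000, 0.581, 0.609, 0.376],
    [0.099, 0.581, 1.000, 0.689, 0.751],
    [0.000, 0.609, 0.689, 1.000, 0.751],
    [0.000, 0.376, 0.751, 0.751, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, not taken from the report

def highly_correlated(columns, matrix, threshold):
    """Map each column to the other columns whose |coefficient| exceeds the threshold."""
    result = {}
    for i, name in enumerate(columns):
        partners = [
            columns[j]
            for j, value in enumerate(matrix[i])
            if j != i and abs(value) > threshold
        ]
        if partners:
            result[name] = partners
    return result

high = highly_correlated(COLUMNS, MATRIX, THRESHOLD)
for name, partners in high.items():
    print(name, "->", partners)
```

Under these assumptions, column 1 pairs only with 9.090909, while 9.090909, 3 and [20-29] each have three partners and [22-24] has two, which matches the counts in the alert text.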

Missing values

[Figure: count plot] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

(index)  2020-01-01  떡국  3  1  9.090909  [20-29]  [22-24]
0  2020-01-01  배추김치  3  2  9.090909  [20-29]  [22-24]
1  2020-01-01  된장찌개  2  3  6.060606  [20-29]  [22-24]
2  2020-01-01  간장  2  4  6.060606  [20-29]  [22-24]
3  2020-01-01  옥수수샐러드  1  5  3.030303  [20-29]  [22-24]
4  2020-01-01  후르츠칵테일(통조림)  1  6  3.030303  [20-29]  [22-24]
5  2020-01-01  잡곡밥  1  7  3.030303  [20-29]  [22-24]
6  2020-01-01  크림빵  1  8  3.030303  [20-29]  [22-24]
7  2020-01-01  검정콩밥  1  9  3.030303  [20-29]  [22-24]
8  2020-01-01  라면  1  10  3.030303  [20-29]  [22-24]
9  2020-01-01  계란국  1  11  3.030303  [20-29]  [22-24]

Last rows

(index)  2020-01-01  떡국  3  1  9.090909  [20-29]  [22-24]
189  2020-01-01  떡만둣국  2  2  6.666667  [30-39]  [22-24]
190  2020-01-01  배추김치  2  3  6.666667  [30-39]  [22-24]
191  2020-01-01  쌀밥  2  4  6.666667  [30-39]  [22-24]
192  2020-01-01  깍두기  2  5  6.666667  [30-39]  [22-24]
193  2020-01-01  쌈장  1  6  3.333333  [30-39]  [22-24]
194  2020-01-01  채소샐러드  1  7  3.333333  [30-39]  [22-24]
195  2020-01-01  계란찜  1  8  3.333333  [30-39]  [22-24]
196  2020-01-01  고구마(구운것)  1  9  3.333333  [30-39]  [22-24]
197  2020-01-01  고구마형 과자  1  10  3.333333  [30-39]  [22-24]
198  2020-01-01  짜장밥  1  11  3.333333  [30-39]  [22-24]
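
The column names in these tables (2020-01-01, 떡국, 3, and so on) look like data values, which suggests the first record of a headerless file was taken as the header when the data was loaded. A minimal sketch of reading such data with explicit field names, using only the standard library. The field names, the inline sample text and the comma-plus-space delimiter are all assumptions made for illustration; the report does not say what the columns mean or how the file was read.

```python
import csv
import io

# Hypothetical field names; the report gives no real names for these columns.
FIELDNAMES = ["date", "item", "n", "rank", "pct", "group_a", "group_b"]

# Illustrative records assembled from values shown in the sample rows.
RAW = (
    "2020-01-01, 떡국, 3, 1, 9.090909, [20-29], [22-24]\n"
    "2020-01-01, 배추김치, 3, 2, 9.090909, [20-29], [22-24]\n"
    "2020-01-01, 된장찌개, 2, 3, 6.060606, [20-29], [22-24]\n"
)

def read_headerless(text, fieldnames):
    """Parse CSV text that has no header row, keeping the first record as data."""
    reader = csv.DictReader(
        io.StringIO(text),
        fieldnames=fieldnames,   # explicit names, so row 1 is not used as a header
        skipinitialspace=True,   # drop the space that follows each comma
    )
    return list(reader)

rows = read_headerless(RAW, FIELDNAMES)
print(len(rows), rows[0]["item"])
```

With explicit names the first record stays in the data, so all rows are profiled. Values such as 튀김) in column 3 may additionally indicate unquoted commas inside some item names; that is a guess, and such rows would need quoting at the source.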