Overview

Dataset statistics

Number of variables: 4
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.5 KiB
Average record size in memory: 36.3 B

Variable types

Numeric: 3
Text: 1

Alerts

positive_score is highly overall correlated with nagative_score (High correlation)
nagative_score is highly overall correlated with positive_score (High correlation)
id has unique values (Unique)
term has unique values (Unique)
nagative_score has 16 (16.0%) zeros (Zeros)

Reproduction

Analysis started: 2023-12-10 10:12:44.803286
Analysis finished: 2023-12-10 10:12:46.958760
Duration: 2.16 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
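
The reported duration can be checked against the two timestamps above. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the report.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of this report.
started = datetime.fromisoformat("2023-12-10 10:12:44.803286")
finished = datetime.fromisoformat("2023-12-10 10:12:46.958760")

# Elapsed wall-clock time between start and finish, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")  # prints "2.16 seconds"
```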

Variables

id
Real number (ℝ)

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.5
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5.95
Q1: 25.75
Median: 50.5
Q3: 75.25
95-th percentile: 95.05
Maximum: 100
Range: 99
Interquartile range (IQR): 49.5

Descriptive statistics

Standard deviation: 29.011492
Coefficient of variation (CV): 0.57448499
Kurtosis: -1.2
Mean: 50.5
Median Absolute Deviation (MAD): 25
Skewness: 0
Sum: 5050
Variance: 841.66667
Monotonicity: Strictly increasing
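
The figures above are consistent with id being the integers 1 through 100. The sketch below recomputes several of them with the standard library under that assumption (the raw column is not included in this report, so the list `ids` is a reconstruction, not the original data). It also assumes the variance uses the sample (n - 1) denominator and that quantiles are linearly interpolated, which is what the reported values suggest.

```python
import statistics

# Assumption: id holds the integers 1..100, consistent with the reported
# minimum, maximum, distinct count, sum, and strict monotonicity.
ids = list(range(1, 101))

mean = statistics.mean(ids)            # arithmetic mean
variance = statistics.variance(ids)    # sample variance (n - 1 denominator)
std_dev = statistics.stdev(ids)        # sample standard deviation
cv = std_dev / mean                    # coefficient of variation

# method="inclusive" interpolates linearly between order statistics.
q1, median, q3 = statistics.quantiles(ids, n=4, method="inclusive")
iqr = q3 - q1

# Median absolute deviation: the median of |x - median|.
mad = statistics.median(abs(x - median) for x in ids)

print(mean, variance, std_dev, cv)
print(q1, median, q3, iqr, mad)
```

Under these assumptions the results agree with the quantile and descriptive statistics listed for id, up to the rounding shown in the report.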
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
1                  1      1.0%
65                 1      1.0%
75                 1      1.0%
74                 1      1.0%
73                 1      1.0%
72                 1      1.0%
71                 1      1.0%
70                 1      1.0%
69                 1      1.0%
68                 1      1.0%
Other values (90)  90     90.0%

Value  Count  Frequency (%)
1      1      1.0%
2      1      1.0%
3      1      1.0%
4      1      1.0%
5      1      1.0%
6      1      1.0%
7      1      1.0%
8      1      1.0%
9      1      1.0%
10     1      1.0%

Value  Count  Frequency (%)
100    1      1.0%
99     1      1.0%
98     1      1.0%
97     1      1.0%
96     1      1.0%
95     1      1.0%
94     1      1.0%
93     1      1.0%
92     1      1.0%
91     1      1.0%

term
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 5
Median length: 2
Mean length: 2.7
Min length: 2

Characters and Unicode

Total characters: 270
Distinct characters: 100
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
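
As an illustration of how such properties separate the characters in term, the sketch below classifies the five sample terms listed in this report. It uses `unicodedata.category` from the Python standard library; the standard library has no block lookup, so `block_of` is a hypothetical helper that hard-codes only the two block ranges relevant here (Hangul Compatibility Jamo, U+3130 to U+318F, and Hangul Syllables, U+AC00 to U+D7AF).

```python
import unicodedata
from collections import Counter

# First five sample rows of the term column, written as escapes:
# U+314B and U+314E are compatibility jamo, U+AC00 is a Hangul syllable.
terms = [
    "\u314b\u314b",
    "\u314b\u314b\u314b",
    "\u314e\u314e",
    "\u314e\u314e\u314e",
    "\uac00\uac00",
]

def block_of(ch: str) -> str:
    """Hypothetical helper: coarse block label for the two ranges used here."""
    cp = ord(ch)
    if 0x3130 <= cp <= 0x318F:
        return "Hangul Compatibility Jamo"
    if 0xAC00 <= cp <= 0xD7AF:
        return "Hangul Syllables"
    return "Other"

chars = [ch for term in terms for ch in term]

# General category per character ("Lo" is Other Letter).
categories = Counter(unicodedata.category(ch) for ch in chars)
# Block per character, using the hard-coded ranges above.
blocks = Counter(block_of(ch) for ch in chars)

print(categories)
print(blocks)
```

On these five terms every character falls in the single category Other Letter, while the characters split across two blocks. That matches the one distinct category and two distinct blocks reported for the full column, and the ten compatibility jamo found here equal the Compat Jamo count in the block table.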

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: ㅋㅋ
2nd row: ㅋㅋㅋ
3rd row: ㅎㅎ
4th row: ㅎㅎㅎ
5th row: 가가
Value              Count  Frequency (%)
ㅋㅋ               1      1.0%
가단               1      1.0%
가드너             1      1.0%
가드               1      1.0%
가두다             1      1.0%
가두               1      1.0%
가동               1      1.0%
가독               1      1.0%
가도쿠라           1      1.0%
가도               1      1.0%
Other values (90)  90     90.0%

Most occurring characters

Value              Count  Frequency (%)
                   100    37.0%
                   20     7.4%
                   9      3.3%
                   7      2.6%
                   5      1.9%
                   5      1.9%
                   5      1.9%
                   3      1.1%
                   3      1.1%
                   3      1.1%
Other values (90)  110    40.7%

Most occurring categories

Value         Count  Frequency (%)
Other Letter  270    100.0%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
                   100    37.0%
                   20     7.4%
                   9      3.3%
                   7      2.6%
                   5      1.9%
                   5      1.9%
                   5      1.9%
                   3      1.1%
                   3      1.1%
                   3      1.1%
Other values (90)  110    40.7%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  270    100.0%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
                   100    37.0%
                   20     7.4%
                   9      3.3%
                   7      2.6%
                   5      1.9%
                   5      1.9%
                   5      1.9%
                   3      1.1%
                   3      1.1%
                   3      1.1%
Other values (90)  110    40.7%

Most occurring blocks

Value        Count  Frequency (%)
Hangul       260    96.3%
Compat Jamo  10     3.7%

Most frequent character per block

Hangul
Value              Count  Frequency (%)
                   100    38.5%
                   20     7.7%
                   9      3.5%
                   7      2.7%
                   5      1.9%
                   3      1.2%
                   3      1.2%
                   3      1.2%
                   3      1.2%
                   3      1.2%
Other values (88)  104    40.0%
Compat Jamo
Value  Count  Frequency (%)
       5      50.0%
       5      50.0%

positive_score
Real number (ℝ)

HIGH CORRELATION 

Distinct: 62
Distinct (%): 62.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.0064647
Minimum: 1 × 10^-5
Maximum: 0.16593
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1 × 10^-5
5-th percentile: 2 × 10^-5
Q1: 6 × 10^-5
Median: 0.000215
Q3: 0.0012375
95-th percentile: 0.0355215
Maximum: 0.16593
Range: 0.16592
Interquartile range (IQR): 0.0011775

Descriptive statistics

Standard deviation: 0.020772054
Coefficient of variation (CV): 3.2131505
Kurtosis: 36.978664
Mean: 0.0064647
Median Absolute Deviation (MAD): 0.000195
Skewness: 5.5348947
Sum: 0.64647
Variance: 0.00043147822
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
2e-05              12     12.0%
9e-05              5      5.0%
7e-05              5      5.0%
5e-05              4      4.0%
6e-05              4      4.0%
3e-05              4      4.0%
4e-05              3      3.0%
0.00013            3      3.0%
0.00019            3      3.0%
0.00023            2      2.0%
Other values (52)  55     55.0%

Value    Count  Frequency (%)
1e-05    1      1.0%
2e-05    12     12.0%
3e-05    4      4.0%
4e-05    3      3.0%
5e-05    4      4.0%
6e-05    4      4.0%
7e-05    5      5.0%
9e-05    5      5.0%
0.0001   2      2.0%
0.00011  1      1.0%

Value    Count  Frequency (%)
0.16593  1      1.0%
0.08231  1      1.0%
0.05838  1      1.0%
0.05575  1      1.0%
0.03574  1      1.0%
0.03551  1      1.0%
0.03335  1      1.0%
0.02129  1      1.0%
0.02086  1      1.0%
0.01821  1      1.0%

nagative_score
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 46
Distinct (%): 46.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.0061876
Minimum: 0
Maximum: 0.16246
Zeros: 16
Zeros (%): 16.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 7 × 10^-5
Median: 0.00029
Q3: 0.00155
95-th percentile: 0.034846
Maximum: 0.16246
Range: 0.16246
Interquartile range (IQR): 0.00148

Descriptive statistics

Standard deviation: 0.020083905
Coefficient of variation (CV): 3.2458312
Kurtosis: 39.542189
Mean: 0.0061876
Median Absolute Deviation (MAD): 0.00029
Skewness: 5.7547386
Sum: 0.61876
Variance: 0.00040336324
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=46)
Value              Count  Frequency (%)
0.0                16     16.0%
7e-05              10     10.0%
4e-05              8      8.0%
0.00018            5      5.0%
0.00072            4      4.0%
0.00011            4      4.0%
0.00036            3      3.0%
0.00043            3      3.0%
0.00029            3      3.0%
0.00014            3      3.0%
Other values (36)  41     41.0%

Value    Count  Frequency (%)
0.0      16     16.0%
4e-05    8      8.0%
7e-05    10     10.0%
0.00011  4      4.0%
0.00014  3      3.0%
0.00018  5      5.0%
0.00021  2      2.0%
0.00025  1      1.0%
0.00029  3      3.0%
0.00036  3      3.0%

Value    Count  Frequency (%)
0.16246  1      1.0%
0.08779  1      1.0%
0.04327  1      1.0%
0.03873  1      1.0%
0.03515  1      1.0%
0.03483  1      1.0%
0.03379  1      1.0%
0.03168  1      1.0%
0.025    1      1.0%
0.01595  1      1.0%

Interactions

[Figures: nine interaction plots; images not preserved]

Correlations

                id     term   positive_score  nagative_score
id              1.000  1.000  0.237           0.252
term            1.000  1.000  1.000           1.000
positive_score  0.237  1.000  1.000           0.947
nagative_score  0.252  1.000  0.947           1.000

                id      positive_score  nagative_score
id              1.000   -0.067          -0.052
positive_score  -0.067  1.000           0.916
nagative_score  -0.052  0.916           1.000
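
To illustrate what a coefficient like those above measures, the sketch below computes a plain Pearson correlation between positive_score and nagative_score on the 20 head and tail rows listed in the Sample section. This is an assumption-laden illustration: the report does not state which coefficient each matrix uses, and only 20 of the 100 rows are visible, so the result is not expected to reproduce the 0.947 or 0.916 entries. The helper `pearson` is written for this example and is not part of ydata-profiling.

```python
import math

# (positive_score, nagative_score) pairs copied from the 20 head and tail rows
# in the Sample section; the other 80 rows are not shown in the report.
rows = [
    (0.03574, 0.03483), (0.02129, 0.025), (0.05575, 0.03379),
    (0.01818, 0.00908), (0.00023, 0.00018), (0.00009, 0.00007),
    (0.00003, 0.0), (0.00007, 0.0), (0.00004, 0.00004),
    (0.00129, 0.00147), (0.00002, 0.00007), (0.00005, 0.0),
    (0.01607, 0.0128), (0.00006, 0.0), (0.00003, 0.00046),
    (0.00044, 0.0005), (0.00091, 0.00036), (0.00009, 0.00004),
    (0.00036, 0.00021), (0.00161, 0.00179),
]

def pearson(pairs):
    """Pearson correlation: covariance divided by the product of spreads."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)

r = pearson(rows)
print(f"Pearson r on the 20 visible rows: {r:.3f}")
```

On these rows the coefficient is strongly positive, which is in line with the High correlation alert raised for the two score columns.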

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

    id   term      positive_score  nagative_score
0   1    ㅋㅋ      0.03574         0.03483
1   2    ㅋㅋㅋ    0.02129         0.025
2   3    ㅎㅎ      0.05575         0.03379
3   4    ㅎㅎㅎ    0.01818         0.00908
4   5    가가      0.00023         0.00018
5   6    가가린    0.00009         0.00007
6   7    가가멜    0.00003         0.0
7   8    가가미    0.00007         0.0
8   9    가각      0.00004         0.00004
9   10   가감      0.00129         0.00147

    id   term      positive_score  nagative_score
90  91   가라타니  0.00002         0.00007
91  92   가라테    0.00005         0.0
92  93   가락      0.01607         0.0128
93  94   가락지    0.00006         0.0
94  95   가란      0.00003         0.00046
95  96   가람      0.00044         0.0005
96  97   가랑      0.00091         0.00036
97  98   가랑이    0.00009         0.00004
98  99   가래      0.00036         0.00021
99  100  가량      0.00161         0.00179