Dataset statistics
Number of variables | 4 |
---|---|
Number of observations | 100 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 3.5 KiB |
Average record size in memory | 36.3 B |
Variable types
Numeric | 3 |
---|---|
Text | 1 |
Dataset
Description | Sample |
---|---|
Author | National Library of Korea (국립중앙도서관) |
URL | https://www.bigdata-culture.kr/bigdata/user/data_market/detail.do?id=32bc97f3-79ca-4cc0-bab3-fa6603bd8222 |
positive_score is highly correlated overall with nagative_score | High correlation |
nagative_score is highly correlated overall with positive_score | High correlation |
id has unique values | Unique |
term has unique values | Unique |
nagative_score has 16 (16.0%) zeros | Zeros |
Reproduction
Analysis started | 2023-12-10 10:12:44.803286 |
---|---|
Analysis finished | 2023-12-10 10:12:46.958760 |
Duration | 2.16 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
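A report of this kind can in principle be regenerated from the raw data. The following is a minimal sketch, assuming the dataset has been exported to a local CSV file and that `pandas` and `ydata-profiling` are installed; the function name `build_report`, the report title, and the file names are hypothetical and do not come from this report.

```python
def build_report(csv_path, html_path="report.html"):
    """Profile a CSV file and write the HTML report to html_path."""
    # Imported lazily so the function can be defined even when the
    # optional third-party packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Dataset profile")  # title is an assumption
    profile.to_file(html_path)
    return html_path


# Hypothetical usage (the file name is an assumption, not from the report):
# build_report("sample.csv")
```

Timings and memory figures will differ from the ones listed above, since they depend on the machine and library versions used.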
id
Real number (ℝ)
UNIQUE
 
Distinct | 100 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 50.5 |
Minimum | 1 |
---|---|
Maximum | 100 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.0 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 5.95 |
Q1 | 25.75 |
median | 50.5 |
Q3 | 75.25 |
95-th percentile | 95.05 |
Maximum | 100 |
Range | 99 |
Interquartile range (IQR) | 49.5 |
Descriptive statistics
Standard deviation | 29.011492 |
---|---|
Coefficient of variation (CV) | 0.57448499 |
Kurtosis | -1.2 |
Mean | 50.5 |
Median Absolute Deviation (MAD) | 25 |
Skewness | 0 |
Sum | 5050 |
Variance | 841.66667 |
Monotonicity | Strictly increasing |
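The figures for `id` are consistent with the column simply holding the integers 1 to 100. The sketch below recomputes several of them with the Python standard library under that assumption; the helper `quantile` is illustrative and uses linear interpolation between neighbouring ranks, which matches the percentiles listed above.

```python
import statistics


def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two nearest ranks."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    step = sorted_values[upper] - sorted_values[lower]
    return sorted_values[lower] + (pos - lower) * step


ids = list(range(1, 101))  # assumption: id is exactly 1..100

mean = statistics.mean(ids)
variance = statistics.variance(ids)  # sample variance (divides by n - 1)
std = statistics.stdev(ids)          # sample standard deviation
cv = std / mean                      # coefficient of variation
median = statistics.median(ids)
mad = statistics.median([abs(x - median) for x in ids])
q1, q3 = quantile(ids, 0.25), quantile(ids, 0.75)

print("mean", mean, "sum", sum(ids))
print("variance", round(variance, 5), "std", round(std, 6), "cv", round(cv, 8))
print("median", median, "MAD", mad)
print("p5", round(quantile(ids, 0.05), 2), "Q1", q1, "Q3", q3, "IQR", q3 - q1)
```

Note that the variance and standard deviation only agree with the report when the sample (n - 1) form is used rather than the population form.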
Value | Count | Frequency (%) |
1 | 1 | 1.0% |
65 | 1 | 1.0% |
75 | 1 | 1.0% |
74 | 1 | 1.0% |
73 | 1 | 1.0% |
72 | 1 | 1.0% |
71 | 1 | 1.0% |
70 | 1 | 1.0% |
69 | 1 | 1.0% |
68 | 1 | 1.0% |
Other values (90) | 90 | 90.0% |
Value | Count | Frequency (%) |
1 | 1 | 1.0% |
2 | 1 | 1.0% |
3 | 1 | 1.0% |
4 | 1 | 1.0% |
5 | 1 | 1.0% |
6 | 1 | 1.0% |
7 | 1 | 1.0% |
8 | 1 | 1.0% |
9 | 1 | 1.0% |
10 | 1 | 1.0% |
Value | Count | Frequency (%) |
100 | 1 | 1.0% |
99 | 1 | 1.0% |
98 | 1 | 1.0% |
97 | 1 | 1.0% |
96 | 1 | 1.0% |
95 | 1 | 1.0% |
94 | 1 | 1.0% |
93 | 1 | 1.0% |
92 | 1 | 1.0% |
91 | 1 | 1.0% |
term
Text
UNIQUE
 
Distinct | 100 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Value | Count | Frequency (%) |
ㅋㅋ | 1 | 1.0% |
가단 | 1 | 1.0% |
가드너 | 1 | 1.0% |
가드 | 1 | 1.0% |
가두다 | 1 | 1.0% |
가두 | 1 | 1.0% |
가동 | 1 | 1.0% |
가독 | 1 | 1.0% |
가도쿠라 | 1 | 1.0% |
가도 | 1 | 1.0% |
Other values (90) | 90 | 90.0% |
Most occurring characters
Value | Count | Frequency (%) |
가 | 100 | 37.0% |
다 | 20 | 7.4% |
라 | 9 | 3.3% |
나 | 7 | 2.6% |
ㅋ | 5 | 1.9% |
네 | 5 | 1.9% |
ㅎ | 5 | 1.9% |
까 | 3 | 1.1% |
득 | 3 | 1.1% |
디 | 3 | 1.1% |
Other values (90) | 110 | 40.7% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 270 | 100.0% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
가 | 100 | 37.0% |
다 | 20 | 7.4% |
라 | 9 | 3.3% |
나 | 7 | 2.6% |
ㅋ | 5 | 1.9% |
네 | 5 | 1.9% |
ㅎ | 5 | 1.9% |
까 | 3 | 1.1% |
득 | 3 | 1.1% |
디 | 3 | 1.1% |
Other values (90) | 110 | 40.7% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 270 | 100.0% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
가 | 100 | 37.0% |
다 | 20 | 7.4% |
라 | 9 | 3.3% |
나 | 7 | 2.6% |
ㅋ | 5 | 1.9% |
네 | 5 | 1.9% |
ㅎ | 5 | 1.9% |
까 | 3 | 1.1% |
득 | 3 | 1.1% |
디 | 3 | 1.1% |
Other values (90) | 110 | 40.7% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 260 | 96.3% |
Compat Jamo | 10 | 3.7% |
Most frequent character per block
Hangul
Value | Count | Frequency (%) |
가 | 100 | 38.5% |
다 | 20 | 7.7% |
라 | 9 | 3.5% |
나 | 7 | 2.7% |
네 | 5 | 1.9% |
까 | 3 | 1.2% |
득 | 3 | 1.2% |
디 | 3 | 1.2% |
랑 | 3 | 1.2% |
이 | 3 | 1.2% |
Other values (88) | 104 | 40.0% |
Compat Jamo
Value | Count | Frequency (%) |
ㅋ | 5 | 50.0% |
ㅎ | 5 | 50.0% |
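The category, script, and block breakdowns above come from classifying every character of `term` by its Unicode properties. As a rough illustration, the sketch below classifies the characters of a few terms taken from the sample rows of this report. The helper `block_of` is hypothetical and only knows the two code point ranges relevant here (Hangul Syllables and Hangul Compatibility Jamo); it is not how the profiling tool itself is implemented.

```python
import unicodedata
from collections import Counter


def block_of(ch):
    """Coarse Unicode block lookup covering only the two blocks in the report."""
    cp = ord(ch)
    if 0xAC00 <= cp <= 0xD7AF:
        return "Hangul"       # Hangul Syllables block
    if 0x3130 <= cp <= 0x318F:
        return "Compat Jamo"  # Hangul Compatibility Jamo block
    return "Other"


# A few terms copied from the report's sample rows (not the full dataset).
terms = ["ㅋㅋ", "ㅋㅋㅋ", "ㅎㅎ", "ㅎㅎㅎ", "가가", "가가린"]
chars = [ch for term in terms for ch in term]

block_counts = Counter(block_of(ch) for ch in chars)
# General category "Lo" is what the report labels "Other Letter".
category_counts = Counter(unicodedata.category(ch) for ch in chars)

print(block_counts)
print(category_counts)
```

The four laughter terms alone account for all ten Compat Jamo characters counted in the report (five ㅋ and five ㅎ).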
positive_score
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 62 |
---|---|
Distinct (%) | 62.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 0.0064647 |
Minimum | 1 × 10⁻⁵ |
---|---|
Maximum | 0.16593 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.0 KiB |
Quantile statistics
Minimum | 1 × 10⁻⁵ |
---|---|
5-th percentile | 2 × 10⁻⁵ |
Q1 | 6 × 10⁻⁵ |
median | 0.000215 |
Q3 | 0.0012375 |
95-th percentile | 0.0355215 |
Maximum | 0.16593 |
Range | 0.16592 |
Interquartile range (IQR) | 0.0011775 |
Descriptive statistics
Standard deviation | 0.020772054 |
---|---|
Coefficient of variation (CV) | 3.2131505 |
Kurtosis | 36.978664 |
Mean | 0.0064647 |
Median Absolute Deviation (MAD) | 0.000195 |
Skewness | 5.5348947 |
Sum | 0.64647 |
Variance | 0.00043147822 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
2e-05 | 12 | 12.0% |
9e-05 | 5 | 5.0% |
7e-05 | 5 | 5.0% |
5e-05 | 4 | 4.0% |
6e-05 | 4 | 4.0% |
3e-05 | 4 | 4.0% |
4e-05 | 3 | 3.0% |
0.00013 | 3 | 3.0% |
0.00019 | 3 | 3.0% |
0.00023 | 2 | 2.0% |
Other values (52) | 55 | 55.0% |
Value | Count | Frequency (%) |
1e-05 | 1 | 1.0% |
2e-05 | 12 | 12.0% |
3e-05 | 4 | 4.0% |
4e-05 | 3 | 3.0% |
5e-05 | 4 | 4.0% |
6e-05 | 4 | 4.0% |
7e-05 | 5 | 5.0% |
9e-05 | 5 | 5.0% |
0.0001 | 2 | 2.0% |
0.00011 | 1 | 1.0% |
Value | Count | Frequency (%) |
0.16593 | 1 | 1.0% |
0.08231 | 1 | 1.0% |
0.05838 | 1 | 1.0% |
0.05575 | 1 | 1.0% |
0.03574 | 1 | 1.0% |
0.03551 | 1 | 1.0% |
0.03335 | 1 | 1.0% |
0.02129 | 1 | 1.0% |
0.02086 | 1 | 1.0% |
0.01821 | 1 | 1.0% |
nagative_score
Real number (ℝ)
HIGH CORRELATION
ZEROS
 
Distinct | 46 |
---|---|
Distinct (%) | 46.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 0.0061876 |
Minimum | 0 |
---|---|
Maximum | 0.16246 |
Zeros | 16 |
Zeros (%) | 16.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.0 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 7 × 10⁻⁵ |
median | 0.00029 |
Q3 | 0.00155 |
95-th percentile | 0.034846 |
Maximum | 0.16246 |
Range | 0.16246 |
Interquartile range (IQR) | 0.00148 |
Descriptive statistics
Standard deviation | 0.020083905 |
---|---|
Coefficient of variation (CV) | 3.2458312 |
Kurtosis | 39.542189 |
Mean | 0.0061876 |
Median Absolute Deviation (MAD) | 0.00029 |
Skewness | 5.7547386 |
Sum | 0.61876 |
Variance | 0.00040336324 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
0.0 | 16 | 16.0% |
7e-05 | 10 | 10.0% |
4e-05 | 8 | 8.0% |
0.00018 | 5 | 5.0% |
0.00072 | 4 | 4.0% |
0.00011 | 4 | 4.0% |
0.00036 | 3 | 3.0% |
0.00043 | 3 | 3.0% |
0.00029 | 3 | 3.0% |
0.00014 | 3 | 3.0% |
Other values (36) | 41 | 41.0% |
Value | Count | Frequency (%) |
0.0 | 16 | 16.0% |
4e-05 | 8 | 8.0% |
7e-05 | 10 | 10.0% |
0.00011 | 4 | 4.0% |
0.00014 | 3 | 3.0% |
0.00018 | 5 | 5.0% |
0.00021 | 2 | 2.0% |
0.00025 | 1 | 1.0% |
0.00029 | 3 | 3.0% |
0.00036 | 3 | 3.0% |
Value | Count | Frequency (%) |
0.16246 | 1 | 1.0% |
0.08779 | 1 | 1.0% |
0.04327 | 1 | 1.0% |
0.03873 | 1 | 1.0% |
0.03515 | 1 | 1.0% |
0.03483 | 1 | 1.0% |
0.03379 | 1 | 1.0% |
0.03168 | 1 | 1.0% |
0.025 | 1 | 1.0% |
0.01595 | 1 | 1.0% |
variable | id | term | positive_score | nagative_score |
---|---|---|---|---|
id | 1.000 | 1.000 | 0.237 | 0.252 |
term | 1.000 | 1.000 | 1.000 | 1.000 |
positive_score | 0.237 | 1.000 | 1.000 | 0.947 |
nagative_score | 0.252 | 1.000 | 0.947 | 1.000 |
variable | id | positive_score | nagative_score |
---|---|---|---|
id | 1.000 | -0.067 | -0.052 |
positive_score | -0.067 | 1.000 | 0.916 |
nagative_score | -0.052 | 0.916 | 1.000 |
index | id | term | positive_score | nagative_score |
---|---|---|---|---|
0 | 1 | ㅋㅋ | 0.03574 | 0.03483 |
1 | 2 | ㅋㅋㅋ | 0.02129 | 0.025 |
2 | 3 | ㅎㅎ | 0.05575 | 0.03379 |
3 | 4 | ㅎㅎㅎ | 0.01818 | 0.00908 |
4 | 5 | 가가 | 0.00023 | 0.00018 |
5 | 6 | 가가린 | 0.00009 | 0.00007 |
6 | 7 | 가가멜 | 0.00003 | 0.0 |
7 | 8 | 가가미 | 0.00007 | 0.0 |
8 | 9 | 가각 | 0.00004 | 0.00004 |
9 | 10 | 가감 | 0.00129 | 0.00147 |
index | id | term | positive_score | nagative_score |
---|---|---|---|---|
90 | 91 | 가라타니 | 0.00002 | 0.00007 |
91 | 92 | 가라테 | 0.00005 | 0.0 |
92 | 93 | 가락 | 0.01607 | 0.0128 |
93 | 94 | 가락지 | 0.00006 | 0.0 |
94 | 95 | 가란 | 0.00003 | 0.00046 |
95 | 96 | 가람 | 0.00044 | 0.0005 |
96 | 97 | 가랑 | 0.00091 | 0.00036 |
97 | 98 | 가랑이 | 0.00009 | 0.00004 |
98 | 99 | 가래 | 0.00036 | 0.00021 |
99 | 100 | 가량 | 0.00161 | 0.00179 |
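The high-correlation alert between `positive_score` and `nagative_score` can be explored by hand on the 20 rows shown in the two sample tables above. The sketch below computes a Pearson coefficient and a rank-based (Spearman) coefficient using only the standard library. It is illustrative only: the report does not state which coefficient each of its correlation matrices uses, and 20 rows out of 100 will not reproduce the reported values. The function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Scores copied from the first and last ten rows shown in the report.
positive = [0.03574, 0.02129, 0.05575, 0.01818, 0.00023, 0.00009, 0.00003,
            0.00007, 0.00004, 0.00129, 0.00002, 0.00005, 0.01607, 0.00006,
            0.00003, 0.00044, 0.00091, 0.00009, 0.00036, 0.00161]
negative = [0.03483, 0.025, 0.03379, 0.00908, 0.00018, 0.00007, 0.0,
            0.0, 0.00004, 0.00147, 0.00007, 0.0, 0.0128, 0.0,
            0.00046, 0.0005, 0.00036, 0.00004, 0.00021, 0.00179]

pearson_r = pearson(positive, negative)
spearman_r = spearman(positive, negative)
print(f"Pearson r on 20 sample rows:  {pearson_r:.3f}")
print(f"Spearman r on 20 sample rows: {spearman_r:.3f}")
```

Because both score columns are heavily right-skewed, a rank-based coefficient is less dominated by the few large values than Pearson is, which is one reason the two can differ noticeably on data like this.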