Dataset statistics
Number of variables | 8 |
---|---|
Number of observations | 100 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 7.0 KiB |
Average record size in memory | 71.3 B |
Variable types
Categorical | 4 |
---|---|
Text | 1 |
Numeric | 3 |
Dataset
Description | Sample |
---|---|
Author | 국립중앙도서관 |
URL | https://www.bigdata-culture.kr/bigdata/user/data_market/detail.do?id=53be0ca0-1525-11ec-bbc0-d7035fffebeb |
anals_trget_year has constant value "2021" | Constant |
anals_trget_mt has constant value "11" | Constant |
age_ise_rank_co is highly overall correlated with age_flag_nm | High correlation |
all_rank_co is highly overall correlated with vlm_nm | High correlation |
age_flag_nm is highly overall correlated with age_ise_rank_co | High correlation |
vlm_nm is highly overall correlated with all_rank_co | High correlation |
age_flag_nm is highly imbalanced (80.6%) | Imbalance |
vlm_nm is highly imbalanced (79.9%) | Imbalance |
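The two imbalance percentages above can be reproduced from the value counts reported for each variable. The sketch below assumes the score is one minus the Shannon entropy of the counts, normalized by the logarithm of the number of classes. That formula is an assumption that happens to match both figures here; the report itself does not state it. The function name `imbalance_score` is hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class frequencies and k is the number of classes.

    Assumed formula: it matches the percentages in this report, but the
    report does not document how the score is computed.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class is not scored as imbalanced
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Value counts taken from the Common Values tables of this report.
age_flag_nm_score = imbalance_score([97, 3])       # 20대, 초등(8~13)
vlm_nm_score = imbalance_score([94, 4, 1, 1])      # <NA>, 2, 3, 1

print(f"age_flag_nm: {age_flag_nm_score:.1%}")  # age_flag_nm: 80.6%
print(f"vlm_nm: {vlm_nm_score:.1%}")            # vlm_nm: 79.9%
```

A score near 100% means almost all rows fall in one class; a score of 0% means the classes are evenly populated.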
Reproduction
Analysis started | 2023-12-10 10:13:44.757186 |
---|---|
Analysis finished | 2023-12-10 10:13:47.742151 |
Duration | 2.98 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
anals_trget_year
Categorical
CONSTANT
 
Distinct | 1 |
---|---|
Distinct (%) | 1.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
2021 | 100 |
---|---|
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 4 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 2021 |
---|---|
2nd row | 2021 |
3rd row | 2021 |
4th row | 2021 |
5th row | 2021 |
Common Values
Value | Count | Frequency (%) |
2021 | 100 | 100.0% |
anals_trget_mt
Categorical
CONSTANT
 
Distinct | 1 |
---|---|
Distinct (%) | 1.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
11 | 100 |
---|---|
Length
Max length | 2 |
---|---|
Median length | 2 |
Mean length | 2 |
Min length | 2 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 11 |
---|---|
2nd row | 11 |
3rd row | 11 |
4th row | 11 |
5th row | 11 |
Common Values
Value | Count | Frequency (%) |
11 | 100 | 100.0% |
age_flag_nm
Categorical
HIGH CORRELATION
  IMBALANCE
 
Distinct | 2 |
---|---|
Distinct (%) | 2.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
20대 | 97 |
---|---|
초등(8~13) | 3 |
Length
Max length | 8 |
---|---|
Median length | 3 |
Mean length | 3.15 |
Min length | 3 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 20대 |
---|---|
2nd row | 초등(8~13) |
3rd row | 20대 |
4th row | 20대 |
5th row | 20대 |
Common Values
Value | Count | Frequency (%) |
20대 | 97 | 97.0% |
초등(8~13) | 3 | 3.0% |
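The Length figures for age_flag_nm follow directly from the two values and their counts. The sketch below recomputes them, assuming length is measured in Unicode code points (Python's `len` on a `str`); the variable names are illustrative only.

```python
import statistics

# Value counts taken from the Common Values table above.
value_counts = {"20대": 97, "초등(8~13)": 3}

# Expand into one length per observation (100 rows in total).
lengths = [len(value) for value, count in value_counts.items() for _ in range(count)]

print("Max length   :", max(lengths))                # 8
print("Median length:", statistics.median(lengths))  # 3.0
print("Mean length  :", statistics.mean(lengths))    # 3.15
print("Min length   :", min(lengths))                # 3
```

The mean of 3.15 is (97 × 3 + 3 × 8) / 100, so it is driven almost entirely by the dominant category.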
title_nm
Text
Distinct | 95 |
---|---|
Distinct (%) | 95.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Length
Max length | 57 |
---|---|
Median length | 30 |
Mean length | 21.63 |
Min length | 3 |
Characters and Unicode
Total characters | 2163 |
---|---|
Distinct characters | 409 |
Distinct categories | 10 |
Distinct scripts | 3 |
Distinct blocks | 2 |
Unique
Unique | 91 |
---|---|
Unique (%) | 91.0% |
Sample
1st row | (간단함, 병맛, 솔직함으로 기업의 흥망성쇠를 좌우하는) 90년생이 온다 |
---|---|
2nd row | 혼령 장수 |
3rd row | 12가지 인생의 법칙 :혼돈의 해독제 |
4th row | 1984 |
5th row | IT 좀 아는 사람 :비전공자도 IT 전문가처럼 생각하는 법 |
Value | Count | Frequency (%) |
장편소설 | 27 | 4.8% |
구병모 | 6 | 1.1% |
소설 | 6 | 1.1% |
나는 | 5 | 0.9% |
위한 | 5 | 0.9% |
혼령 | 3 | 0.5% |
연작소설 | 3 | 0.5% |
내가 | 3 | 0.5% |
정세랑 | 3 | 0.5% |
법 | 3 | 0.5% |
Other values (447) | 497 | 88.6% |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 461 | 21.3% |
: | 68 | 3.1% |
는 | 51 | 2.4% |
소 | 42 | 1.9% |
의 | 41 | 1.9% |
설 | 40 | 1.8% |
장 | 32 | 1.5% |
이 | 31 | 1.4% |
가 | 30 | 1.4% |
편 | 28 | 1.3% |
Other values (399) | 1339 | 61.9% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 1520 | 70.3% |
Space Separator | 461 | 21.3% |
Other Punctuation | 87 | 4.0% |
Lowercase Letter | 48 | 2.2% |
Decimal Number | 29 | 1.3% |
Uppercase Letter | 11 | 0.5% |
Math Symbol | 4 | 0.2% |
Open Punctuation | 1 | < 0.1% |
Dash Punctuation | 1 | < 0.1% |
Close Punctuation | 1 | < 0.1% |
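The category names in this table correspond to Unicode general categories. As a small illustration, the sketch below classifies the characters of one title from the sample rows using Python's standard `unicodedata` module. The mapping from two-letter codes to the long names shown in the report is written out by hand and covers only the ten categories that appear here; whether the profiler uses the same module internally is not stated in the report.

```python
import unicodedata
from collections import Counter

# Two-letter general category codes mapped to the names used in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Lu": "Uppercase Letter",
    "Sm": "Math Symbol",
    "Ps": "Open Punctuation",
    "Pd": "Dash Punctuation",
    "Pe": "Close Punctuation",
}

title = "곰탕 :김영탁 장편소설"  # title_nm of sample row 6

category_counts = Counter(
    CATEGORY_NAMES.get(unicodedata.category(ch), unicodedata.category(ch))
    for ch in title
)

for name, count in category_counts.most_common():
    print(f"{name} | {count} | {count / len(title):.1%} |")
```

For this single title the nine Hangul syllables fall under Other Letter, the two spaces under Space Separator, and the colon under Other Punctuation, which mirrors the ordering of the top three categories in the full column.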
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
는 | 51 | 3.4% |
소 | 42 | 2.8% |
의 | 41 | 2.7% |
설 | 40 | 2.6% |
장 | 32 | 2.1% |
이 | 31 | 2.0% |
가 | 30 | 2.0% |
편 | 28 | 1.8% |
한 | 25 | 1.6% |
지 | 25 | 1.6% |
Other values (355) | 1175 | 77.3% |
Lowercase Letter
Value | Count | Frequency (%) |
e | 10 | 20.8% |
r | 7 | 14.6% |
s | 5 | 10.4% |
t | 5 | 10.4% |
l | 3 | 6.2% |
i | 3 | 6.2% |
o | 3 | 6.2% |
y | 2 | 4.2% |
k | 2 | 4.2% |
h | 2 | 4.2% |
Other values (6) | 6 | 12.5% |
Decimal Number
Value | Count | Frequency (%) |
1 | 6 | 20.7% |
2 | 5 | 17.2% |
0 | 4 | 13.8% |
9 | 3 | 10.3% |
7 | 2 | 6.9% |
3 | 2 | 6.9% |
5 | 2 | 6.9% |
4 | 2 | 6.9% |
8 | 2 | 6.9% |
6 | 1 | 3.4% |
Uppercase Letter
Value | Count | Frequency (%) |
I | 3 | 27.3% |
T | 2 | 18.2% |
A | 1 | 9.1% |
X | 1 | 9.1% |
Q | 1 | 9.1% |
F | 1 | 9.1% |
W | 1 | 9.1% |
E | 1 | 9.1% |
Other Punctuation
Value | Count | Frequency (%) |
: | 68 | 78.2% |
, | 12 | 13.8% |
/ | 3 | 3.4% |
? | 2 | 2.3% |
' | 2 | 2.3% |
Space Separator
Value | Count | Frequency (%) |
(space) | 461 | 100.0% |
Math Symbol
Value | Count | Frequency (%) |
= | 4 | 100.0% |
Open Punctuation
Value | Count | Frequency (%) |
( | 1 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 1 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 1 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 1520 | 70.3% |
Common | 584 | 27.0% |
Latin | 59 | 2.7% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
는 | 51 | 3.4% |
소 | 42 | 2.8% |
의 | 41 | 2.7% |
설 | 40 | 2.6% |
장 | 32 | 2.1% |
이 | 31 | 2.0% |
가 | 30 | 2.0% |
편 | 28 | 1.8% |
한 | 25 | 1.6% |
지 | 25 | 1.6% |
Other values (355) | 1175 | 77.3% |
Latin
Value | Count | Frequency (%) |
e | 10 | 16.9% |
r | 7 | 11.9% |
s | 5 | 8.5% |
t | 5 | 8.5% |
I | 3 | 5.1% |
l | 3 | 5.1% |
i | 3 | 5.1% |
o | 3 | 5.1% |
y | 2 | 3.4% |
k | 2 | 3.4% |
Other values (14) | 16 | 27.1% |
Common
Value | Count | Frequency (%) |
(space) | 461 | 78.9% |
: | 68 | 11.6% |
, | 12 | 2.1% |
1 | 6 | 1.0% |
2 | 5 | 0.9% |
= | 4 | 0.7% |
0 | 4 | 0.7% |
/ | 3 | 0.5% |
9 | 3 | 0.5% |
7 | 2 | 0.3% |
Other values (10) | 16 | 2.7% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 1520 | 70.3% |
ASCII | 643 | 29.7% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 461 | 71.7% |
: | 68 | 10.6% |
, | 12 | 1.9% |
e | 10 | 1.6% |
r | 7 | 1.1% |
1 | 6 | 0.9% |
2 | 5 | 0.8% |
s | 5 | 0.8% |
t | 5 | 0.8% |
= | 4 | 0.6% |
Other values (34) | 60 | 9.3% |
Hangul
Value | Count | Frequency (%) |
는 | 51 | 3.4% |
소 | 42 | 2.8% |
의 | 41 | 2.7% |
설 | 40 | 2.6% |
장 | 32 | 2.1% |
이 | 31 | 2.0% |
가 | 30 | 2.0% |
편 | 28 | 1.8% |
한 | 25 | 1.6% |
지 | 25 | 1.6% |
Other values (355) | 1175 | 77.3% |
isbn_no
Real number (ℝ)
Distinct | 99 |
---|---|
Distinct (%) | 99.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 9.7898419 × 10^12 |
Minimum | 9.7889012 × 10^12 |
---|---|
Maximum | 9.7911968 × 10^12 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.0 KiB |
Quantile statistics
Minimum | 9.7889012 × 10^12 |
---|---|
5-th percentile | 9.7889317 × 10^12 |
Q1 | 9.788951 × 10^12 |
median | 9.7889722 × 10^12 |
Q3 | 9.7911644 × 10^12 |
95-th percentile | 9.7911912 × 10^12 |
Maximum | 9.7911968 × 10^12 |
Range | 2.2955778 × 10^9 |
Interquartile range (IQR) | 2.2134368 × 10^9 |
Descriptive statistics
Standard deviation | 1.0945017 × 10^9 |
---|---|
Coefficient of variation (CV) | 0.00011179973 |
Kurtosis | -1.8645409 |
Mean | 9.7898419 × 10^12 |
Median Absolute Deviation (MAD) | 35797897 |
Skewness | 0.41413215 |
Sum | 9.7898419 × 10^14 |
Variance | 1.197934 × 10^18 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
9788994120966 | 2 | 2.0% |
9791188248674 | 1 | 1.0% |
9791160560640 | 1 | 1.0% |
9791190090377 | 1 | 1.0% |
9788990982704 | 1 | 1.0% |
9788986836240 | 1 | 1.0% |
9791130605210 | 1 | 1.0% |
9788936434441 | 1 | 1.0% |
9788982814471 | 1 | 1.0% |
9788954673105 | 1 | 1.0% |
Other values (89) | 89 | 89.0% |
Value | Count | Frequency (%) |
9788901219943 | 1 | 1.0% |
9788901230658 | 1 | 1.0% |
9788901244600 | 1 | 1.0% |
9788925556253 | 1 | 1.0% |
9788925588667 | 1 | 1.0% |
9788932023151 | 1 | 1.0% |
9788932472959 | 1 | 1.0% |
9788936434106 | 1 | 1.0% |
9788936434441 | 1 | 1.0% |
9788936437541 | 1 | 1.0% |
Value | Count | Frequency (%) |
9791196797706 | 1 | 1.0% |
9791196394578 | 1 | 1.0% |
9791196067694 | 1 | 1.0% |
9791191393170 | 1 | 1.0% |
9791191311020 | 1 | 1.0% |
9791191211009 | 1 | 1.0% |
9791191193138 | 1 | 1.0% |
9791190885621 | 1 | 1.0% |
9791190582308 | 1 | 1.0% |
9791190538329 | 1 | 1.0% |
vlm_nm
Categorical
HIGH CORRELATION
  IMBALANCE
 
Distinct | 4 |
---|---|
Distinct (%) | 4.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
<NA> | 94 |
---|---|
2 | 4 |
3 | 1 |
1 | 1 |
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 3.82 |
Min length | 1 |
Unique
Unique | 2 |
---|---|
Unique (%) | 2.0% |
Sample
1st row | <NA> |
---|---|
2nd row | 3 |
3rd row | <NA> |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 94 | 94.0% |
2 | 4 | 4.0% |
3 | 1 | 1.0% |
1 | 1 | 1.0% |
age_ise_rank_co
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 47 |
---|---|
Distinct (%) | 47.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 160.65 |
Minimum | 27 |
---|---|
Maximum | 591 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.0 KiB |
Quantile statistics
Minimum | 27 |
---|---|
5-th percentile | 64.8 |
Q1 | 110.25 |
median | 157.5 |
Q3 | 196 |
95-th percentile | 228.3 |
Maximum | 591 |
Range | 564 |
Interquartile range (IQR) | 85.75 |
Descriptive statistics
Standard deviation | 88.556479 |
---|---|
Coefficient of variation (CV) | 0.55123859 |
Kurtosis | 11.590874 |
Mean | 160.65 |
Median Absolute Deviation (MAD) | 40 |
Skewness | 2.7335851 |
Sum | 16065 |
Variance | 7842.25 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
179 | 8 | 8.0% |
221 | 5 | 5.0% |
210 | 5 | 5.0% |
196 | 5 | 5.0% |
130 | 4 | 4.0% |
141 | 4 | 4.0% |
228 | 4 | 4.0% |
159 | 4 | 4.0% |
190 | 3 | 3.0% |
87 | 3 | 3.0% |
Other values (37) | 55 | 55.0% |
Value | Count | Frequency (%) |
27 | 1 | 1.0% |
30 | 1 | 1.0% |
45 | 1 | 1.0% |
47 | 1 | 1.0% |
61 | 1 | 1.0% |
65 | 1 | 1.0% |
67 | 2 | 2.0% |
71 | 1 | 1.0% |
73 | 1 | 1.0% |
76 | 1 | 1.0% |
Value | Count | Frequency (%) |
591 | 2 | 2.0% |
501 | 1 | 1.0% |
234 | 2 | 2.0% |
228 | 4 | 4.0% |
221 | 5 | 5.0% |
210 | 5 | 5.0% |
206 | 3 | 3.0% |
196 | 5 | 5.0% |
190 | 3 | 3.0% |
179 | 8 | 8.0% |
all_rank_co
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 10 |
---|---|
Distinct (%) | 10.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 994.82 |
Minimum | 855 |
---|---|
Maximum | 1000 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.0 KiB |
Quantile statistics
Minimum | 855 |
---|---|
5-th percentile | 961.7 |
Q1 | 1000 |
median | 1000 |
Q3 | 1000 |
95-th percentile | 1000 |
Maximum | 1000 |
Range | 145 |
Interquartile range (IQR) | 0 |
Descriptive statistics
Standard deviation | 20.320354 |
---|---|
Coefficient of variation (CV) | 0.020426161 |
Kurtosis | 26.911449 |
Mean | 994.82 |
Median Absolute Deviation (MAD) | 0 |
Skewness | -4.9123694 |
Sum | 99482 |
Variance | 412.91677 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
1000 | 90 | 90.0% |
972 | 2 | 2.0% |
916 | 1 | 1.0% |
956 | 1 | 1.0% |
948 | 1 | 1.0% |
855 | 1 | 1.0% |
998 | 1 | 1.0% |
962 | 1 | 1.0% |
912 | 1 | 1.0% |
991 | 1 | 1.0% |
Value | Count | Frequency (%) |
855 | 1 | 1.0% |
912 | 1 | 1.0% |
916 | 1 | 1.0% |
948 | 1 | 1.0% |
956 | 1 | 1.0% |
962 | 1 | 1.0% |
972 | 2 | 2.0% |
991 | 1 | 1.0% |
998 | 1 | 1.0% |
1000 | 90 | 90.0% |
Value | Count | Frequency (%) |
1000 | 90 | 90.0% |
998 | 1 | 1.0% |
991 | 1 | 1.0% |
972 | 2 | 2.0% |
962 | 1 | 1.0% |
956 | 1 | 1.0% |
948 | 1 | 1.0% |
916 | 1 | 1.0% |
912 | 1 | 1.0% |
855 | 1 | 1.0% |
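Because all_rank_co has only 10 distinct values, the frequency tables above list every observation, so the summary statistics can be recomputed from them as a consistency check. The sketch below uses only the Python standard library. It assumes the profiler reports the sample variance (divisor n − 1) and linearly interpolated percentiles; both assumptions reproduce the reported figures but are not stated in the report.

```python
import statistics

# Every distinct value of all_rank_co with its count, from the tables above.
value_counts = {
    1000: 90, 972: 2, 916: 1, 956: 1, 948: 1,
    855: 1, 998: 1, 962: 1, 912: 1, 991: 1,
}

# Expand to the 100 individual observations.
values = sorted(v for v, count in value_counts.items() for _ in range(count))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, divisor n - 1 (assumed)
std_dev = statistics.stdev(values)
cv = std_dev / mean                     # coefficient of variation

# 5th percentile with linear interpolation over positions 0..n-1 (assumed method).
p5 = statistics.quantiles(values, n=20, method="inclusive")[0]

print(f"Sum      : {total}")           # 99482
print(f"Mean     : {mean:.2f}")        # 994.82
print(f"Variance : {variance:.5f}")    # 412.91677
print(f"Std dev  : {std_dev:.6f}")     # 20.320354
print(f"CV       : {cv:.6f}")          # 0.020426
print(f"5th pct  : {p5:.1f}")          # 961.7
```

With 90 of 100 rows equal to 1000, Q1, the median, and Q3 all coincide at 1000, which is why the IQR and MAD are both 0 while the variance is driven by the ten lower values.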
Variable | age_flag_nm | title_nm | isbn_no | vlm_nm | age_ise_rank_co | all_rank_co |
---|---|---|---|---|---|---|
age_flag_nm | 1.000 | 1.000 | 0.185 | 0.310 | 1.000 | 0.000 |
title_nm | 1.000 | 1.000 | 0.682 | 0.000 | 0.000 | 1.000 |
isbn_no | 0.185 | 0.682 | 1.000 | 0.000 | 0.338 | 0.213 |
vlm_nm | 0.310 | 0.000 | 0.000 | 1.000 | 0.598 | NaN |
age_ise_rank_co | 1.000 | 0.000 | 0.338 | 0.598 | 1.000 | 0.000 |
all_rank_co | 0.000 | 1.000 | 0.213 | NaN | 0.000 | 1.000 |
Variable | age_flag_nm | vlm_nm |
---|---|---|
age_flag_nm | 1.000 | 0.354 |
vlm_nm | 0.354 | 1.000 |
Variable | isbn_no | age_ise_rank_co | all_rank_co | age_flag_nm | vlm_nm |
---|---|---|---|---|---|
isbn_no | 1.000 | -0.146 | 0.013 | 0.119 | 0.000 |
age_ise_rank_co | -0.146 | 1.000 | 0.248 | 0.979 | 0.382 |
all_rank_co | 0.013 | 0.248 | 1.000 | 0.000 | 1.000 |
age_flag_nm | 0.119 | 0.979 | 0.000 | 1.000 | 0.354 |
vlm_nm | 0.000 | 0.382 | 1.000 | 0.354 | 1.000 |
Index | anals_trget_year | anals_trget_mt | age_flag_nm | title_nm | isbn_no | vlm_nm | age_ise_rank_co | all_rank_co |
---|---|---|---|---|---|---|---|---|
0 | 2021 | 11 | 20대 | (간단함, 병맛, 솔직함으로 기업의 흥망성쇠를 좌우하는) 90년생이 온다 | 9791188248674 | <NA> | 130 | 1000 |
1 | 2021 | 11 | 초등(8~13) | 혼령 장수 | 9791189239459 | 3 | 501 | 1000 |
2 | 2021 | 11 | 20대 | 12가지 인생의 법칙 :혼돈의 해독제 | 9791196067694 | <NA> | 77 | 916 |
3 | 2021 | 11 | 20대 | 1984 | 9788937460777 | <NA> | 210 | 1000 |
4 | 2021 | 11 | 20대 | IT 좀 아는 사람 :비전공자도 IT 전문가처럼 생각하는 법 | 9791155813355 | <NA> | 196 | 1000 |
5 | 2021 | 11 | 20대 | 개인주의자 선언 :판사 문유석의 일상유감 | 9788954637756 | <NA> | 179 | 1000 |
6 | 2021 | 11 | 20대 | 곰탕 :김영탁 장편소설 | 9788950973766 | 2 | 196 | 1000 |
7 | 2021 | 11 | 초등(8~13) | 혼령 장수 | 9791189239350 | 2 | 591 | 1000 |
8 | 2021 | 11 | 20대 | 그릿 :IQ, 재능, 환경을 뛰어넘는 열정적 끈기의 힘 | 9791186805398 | <NA> | 210 | 1000 |
9 | 2021 | 11 | 20대 | 꽃을 보듯 너를 본다 :나태주 인터넷 시집 | 9791157280292 | <NA> | 234 | 1000 |
Index | anals_trget_year | anals_trget_mt | age_flag_nm | title_nm | isbn_no | vlm_nm | age_ise_rank_co | all_rank_co |
---|---|---|---|---|---|---|---|---|
90 | 2021 | 11 | 20대 | 침묵의 봄 | 9788962630619 | <NA> | 179 | 1000 |
91 | 2021 | 11 | 20대 | 칵테일, 러브, 좀비 | 9791190174756 | <NA> | 27 | 1000 |
92 | 2021 | 11 | 20대 | 키르케 :매들린 밀러 장편소설 | 9791190582308 | <NA> | 141 | 1000 |
93 | 2021 | 11 | 20대 | 트렌드 코리아 2021 :팬데믹 위기에 대응하는 전략은 무엇인가? | 9788959896837 | <NA> | 196 | 1000 |
94 | 2021 | 11 | 20대 | 파과 :구병모 장편소설 | 9791162203620 | <NA> | 73 | 1000 |
95 | 2021 | 11 | 20대 | 파과 :구병모 장편소설 | 9788957077740 | <NA> | 159 | 1000 |
96 | 2021 | 11 | 20대 | 프리워커스 =일하는 방식에 질문을 던지는 사람들 /Free workers | 9788925588667 | <NA> | 221 | 1000 |
97 | 2021 | 11 | 20대 | 한 스푼의 시간 :구병모 장편소설 | 9788959130580 | <NA> | 166 | 1000 |
98 | 2021 | 11 | 20대 | 해가 지는 곳으로 :최진영 장편소설 | 9788937473166 | <NA> | 96 | 1000 |
99 | 2021 | 11 | 20대 | 화이트 호스 =강화길 소설 /White horse | 9788954672221 | <NA> | 210 | 1000 |