Overview

Dataset statistics

Number of variables: 8
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.0 KiB
Average record size in memory: 71.3 B

Variable types

Categorical: 4
Text: 1
Numeric: 3

Alerts

anals_trget_year has constant value "" (Constant)
anals_trget_mt has constant value "" (Constant)
age_ise_rank_co is highly overall correlated with age_flag_nm (High correlation)
all_rank_co is highly overall correlated with vlm_nm (High correlation)
age_flag_nm is highly overall correlated with age_ise_rank_co (High correlation)
vlm_nm is highly overall correlated with all_rank_co (High correlation)
age_flag_nm is highly imbalanced (80.6%) (Imbalance)
vlm_nm is highly imbalanced (79.9%) (Imbalance)
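
The two imbalance percentages above can be reproduced from the category counts reported later in this document (97/3 for age_flag_nm and 94/4/1/1 for vlm_nm). The sketch below assumes the score is one minus the Shannon entropy of the class distribution, normalized by the maximum entropy for that number of classes. That definition is an assumption that happens to match both reported figures, not a statement about the profiler's internals, and the function name `imbalance_score` is hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of class counts.

    H is the Shannon entropy (in bits) of the class distribution and
    k is the number of classes. 0 means perfectly balanced, values
    close to 1 mean a single class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # Assumption: with a single class there is nothing to compare,
        # so this sketch simply returns 0.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Counts taken from the Common Values tables of this report.
age_flag_nm_counts = [97, 3]
vlm_nm_counts = [94, 4, 1, 1]

print(f"age_flag_nm: {imbalance_score(age_flag_nm_counts):.1%}")
print(f"vlm_nm: {imbalance_score(vlm_nm_counts):.1%}")
```

Under this assumed definition the two calls give 80.6% and 79.9%, the same values as the alerts.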

Reproduction

Analysis started: 2023-12-10 10:13:44.757186
Analysis finished: 2023-12-10 10:13:47.742151
Duration: 2.98 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
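
As a quick consistency check, the reported duration can be recomputed from the two timestamps above. This is a minimal sketch using only the Python standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-10 10:13:44.757186")
finished = datetime.fromisoformat("2023-12-10 10:13:47.742151")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```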

Variables

anals_trget_year
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

2021: 100

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021
2nd row: 2021
3rd row: 2021
4th row: 2021
5th row: 2021

Common Values

Value | Count | Frequency (%)
2021 | 100 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

anals_trget_mt
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

11: 100

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 11
2nd row: 11
3rd row: 11
4th row: 11
5th row: 11

Common Values

Value | Count | Frequency (%)
11 | 100 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

age_flag_nm
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

20대: 97
초등(8~13): 3

Length

Max length: 8
Median length: 3
Mean length: 3.15
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20대
2nd row: 초등(8~13)
3rd row: 20대
4th row: 20대
5th row: 20대

Common Values

Value | Count | Frequency (%)
20대 | 97 | 97.0%
초등(8~13) | 3 | 3.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

title_nm
Text

Distinct: 95
Distinct (%): 95.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 57
Median length: 30
Mean length: 21.63
Min length: 3

Characters and Unicode

Total characters: 2163
Distinct characters: 409
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
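
To make the idea of per-code-point properties concrete, the sketch below counts Unicode general categories for one of the sample titles using Python's standard `unicodedata` module. The helper name `category_counts` and the `CATEGORY_NAMES` lookup are illustrative only; the lookup covers just the ten categories that appear in this report.

```python
import unicodedata
from collections import Counter

# Two-letter general category codes mapped to the long names used in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Lu": "Uppercase Letter",
    "Sm": "Math Symbol",
    "Ps": "Open Punctuation",
    "Pd": "Dash Punctuation",
    "Pe": "Close Punctuation",
}


def category_counts(text):
    """Count how many characters of `text` fall in each general category."""
    return Counter(unicodedata.category(ch) for ch in text)


# One of the sample rows of title_nm.
title = "IT 좀 아는 사람 :비전공자도 IT 전문가처럼 생각하는 법"
for code, count in category_counts(title).most_common():
    # Fall back to the raw code for categories outside the lookup.
    print(CATEGORY_NAMES.get(code, code), count)
```

Hangul syllables fall under "Other Letter" (Lo), which is why that category dominates the tables for this column.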

Unique

Unique: 91
Unique (%): 91.0%

Sample

1st row: (간단함, 병맛, 솔직함으로 기업의 흥망성쇠를 좌우하는) 90년생이 온다
2nd row: 혼령 장수
3rd row: 12가지 인생의 법칙 :혼돈의 해독제
4th row: 1984
5th row: IT 좀 아는 사람 :비전공자도 IT 전문가처럼 생각하는 법

Value | Count | Frequency (%)
장편소설 | 27 | 4.8%
구병모 | 6 | 1.1%
소설 | 6 | 1.1%
나는 | 5 | 0.9%
위한 | 5 | 0.9%
혼령 | 3 | 0.5%
연작소설 | 3 | 0.5%
내가 | 3 | 0.5%
정세랑 | 3 | 0.5%
[unreadable] | 3 | 0.5%
Other values (447) | 497 | 88.6%

Most occurring characters

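Value | Count | Frequency (%)
" " | 461 | 21.3%
: | 68 | 3.1%
[unreadable] | 51 | 2.4%
[unreadable] | 42 | 1.9%
[unreadable] | 41 | 1.9%
[unreadable] | 40 | 1.8%
[unreadable] | 32 | 1.5%
[unreadable] | 31 | 1.4%
[unreadable] | 30 | 1.4%
[unreadable] | 28 | 1.3%
Other values (399) | 1339 | 61.9%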

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1520 | 70.3%
Space Separator | 461 | 21.3%
Other Punctuation | 87 | 4.0%
Lowercase Letter | 48 | 2.2%
Decimal Number | 29 | 1.3%
Uppercase Letter | 11 | 0.5%
Math Symbol | 4 | 0.2%
Open Punctuation | 1 | < 0.1%
Dash Punctuation | 1 | < 0.1%
Close Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[unreadable] | 51 | 3.4%
[unreadable] | 42 | 2.8%
[unreadable] | 41 | 2.7%
[unreadable] | 40 | 2.6%
[unreadable] | 32 | 2.1%
[unreadable] | 31 | 2.0%
[unreadable] | 30 | 2.0%
[unreadable] | 28 | 1.8%
[unreadable] | 25 | 1.6%
[unreadable] | 25 | 1.6%
Other values (355) | 1175 | 77.3%

Lowercase Letter
Value | Count | Frequency (%)
e | 10 | 20.8%
r | 7 | 14.6%
s | 5 | 10.4%
t | 5 | 10.4%
l | 3 | 6.2%
i | 3 | 6.2%
o | 3 | 6.2%
y | 2 | 4.2%
k | 2 | 4.2%
h | 2 | 4.2%
Other values (6) | 6 | 12.5%

Decimal Number
Value | Count | Frequency (%)
1 | 6 | 20.7%
2 | 5 | 17.2%
0 | 4 | 13.8%
9 | 3 | 10.3%
7 | 2 | 6.9%
3 | 2 | 6.9%
5 | 2 | 6.9%
4 | 2 | 6.9%
8 | 2 | 6.9%
6 | 1 | 3.4%

Uppercase Letter
Value | Count | Frequency (%)
I | 3 | 27.3%
T | 2 | 18.2%
A | 1 | 9.1%
X | 1 | 9.1%
Q | 1 | 9.1%
F | 1 | 9.1%
W | 1 | 9.1%
E | 1 | 9.1%

Other Punctuation
Value | Count | Frequency (%)
: | 68 | 78.2%
, | 12 | 13.8%
/ | 3 | 3.4%
? | 2 | 2.3%
' | 2 | 2.3%

Space Separator
Value | Count | Frequency (%)
" " | 461 | 100.0%

Math Symbol
Value | Count | Frequency (%)
= | 4 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1520 | 70.3%
Common | 584 | 27.0%
Latin | 59 | 2.7%

Most frequent character per script

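Hangul
Value | Count | Frequency (%)
[unreadable] | 51 | 3.4%
[unreadable] | 42 | 2.8%
[unreadable] | 41 | 2.7%
[unreadable] | 40 | 2.6%
[unreadable] | 32 | 2.1%
[unreadable] | 31 | 2.0%
[unreadable] | 30 | 2.0%
[unreadable] | 28 | 1.8%
[unreadable] | 25 | 1.6%
[unreadable] | 25 | 1.6%
Other values (355) | 1175 | 77.3%

Latin
Value | Count | Frequency (%)
e | 10 | 16.9%
r | 7 | 11.9%
s | 5 | 8.5%
t | 5 | 8.5%
I | 3 | 5.1%
l | 3 | 5.1%
i | 3 | 5.1%
o | 3 | 5.1%
y | 2 | 3.4%
k | 2 | 3.4%
Other values (14) | 16 | 27.1%

Common
Value | Count | Frequency (%)
" " | 461 | 78.9%
: | 68 | 11.6%
, | 12 | 2.1%
1 | 6 | 1.0%
2 | 5 | 0.9%
= | 4 | 0.7%
0 | 4 | 0.7%
/ | 3 | 0.5%
9 | 3 | 0.5%
7 | 2 | 0.3%
Other values (10) | 16 | 2.7%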

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1520 | 70.3%
ASCII | 643 | 29.7%

Most frequent character per block

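ASCII
Value | Count | Frequency (%)
" " | 461 | 71.7%
: | 68 | 10.6%
, | 12 | 1.9%
e | 10 | 1.6%
r | 7 | 1.1%
1 | 6 | 0.9%
2 | 5 | 0.8%
s | 5 | 0.8%
t | 5 | 0.8%
= | 4 | 0.6%
Other values (34) | 60 | 9.3%

Hangul
Value | Count | Frequency (%)
[unreadable] | 51 | 3.4%
[unreadable] | 42 | 2.8%
[unreadable] | 41 | 2.7%
[unreadable] | 40 | 2.6%
[unreadable] | 32 | 2.1%
[unreadable] | 31 | 2.0%
[unreadable] | 30 | 2.0%
[unreadable] | 28 | 1.8%
[unreadable] | 25 | 1.6%
[unreadable] | 25 | 1.6%
Other values (355) | 1175 | 77.3%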

isbn_no
Real number (ℝ)

Distinct: 99
Distinct (%): 99.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.7898419 × 10^12
Minimum: 9.7889012 × 10^12
Maximum: 9.7911968 × 10^12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 9.7889012 × 10^12
5-th percentile: 9.7889317 × 10^12
Q1: 9.788951 × 10^12
median: 9.7889722 × 10^12
Q3: 9.7911644 × 10^12
95-th percentile: 9.7911912 × 10^12
Maximum: 9.7911968 × 10^12
Range: 2.2955778 × 10^9
Interquartile range (IQR): 2.2134368 × 10^9

Descriptive statistics

Standard deviation: 1.0945017 × 10^9
Coefficient of variation (CV): 0.00011179973
Kurtosis: -1.8645409
Mean: 9.7898419 × 10^12
Median Absolute Deviation (MAD): 35797897
Skewness: 0.41413215
Sum: 9.7898419 × 10^14
Variance: 1.197934 × 10^18
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
9788994120966 | 2 | 2.0%
9791188248674 | 1 | 1.0%
9791160560640 | 1 | 1.0%
9791190090377 | 1 | 1.0%
9788990982704 | 1 | 1.0%
9788986836240 | 1 | 1.0%
9791130605210 | 1 | 1.0%
9788936434441 | 1 | 1.0%
9788982814471 | 1 | 1.0%
9788954673105 | 1 | 1.0%
Other values (89) | 89 | 89.0%

Value | Count | Frequency (%)
9788901219943 | 1 | 1.0%
9788901230658 | 1 | 1.0%
9788901244600 | 1 | 1.0%
9788925556253 | 1 | 1.0%
9788925588667 | 1 | 1.0%
9788932023151 | 1 | 1.0%
9788932472959 | 1 | 1.0%
9788936434106 | 1 | 1.0%
9788936434441 | 1 | 1.0%
9788936437541 | 1 | 1.0%

Value | Count | Frequency (%)
9791196797706 | 1 | 1.0%
9791196394578 | 1 | 1.0%
9791196067694 | 1 | 1.0%
9791191393170 | 1 | 1.0%
9791191311020 | 1 | 1.0%
9791191211009 | 1 | 1.0%
9791191193138 | 1 | 1.0%
9791190885621 | 1 | 1.0%
9791190582308 | 1 | 1.0%
9791190538329 | 1 | 1.0%

vlm_nm
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

<NA>: 94
2: 4
3: 1
1: 1

Length

Max length: 4
Median length: 4
Mean length: 3.82
Min length: 1

Unique

Unique: 2
Unique (%): 2.0%

Sample

1st row: <NA>
2nd row: 3
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 94 | 94.0%
2 | 4 | 4.0%
3 | 1 | 1.0%
1 | 1 | 1.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

age_ise_rank_co
Real number (ℝ)

HIGH CORRELATION 

Distinct: 47
Distinct (%): 47.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 160.65
Minimum: 27
Maximum: 591
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 27
5-th percentile: 64.8
Q1: 110.25
median: 157.5
Q3: 196
95-th percentile: 228.3
Maximum: 591
Range: 564
Interquartile range (IQR): 85.75

Descriptive statistics

Standard deviation: 88.556479
Coefficient of variation (CV): 0.55123859
Kurtosis: 11.590874
Mean: 160.65
Median Absolute Deviation (MAD): 40
Skewness: 2.7335851
Sum: 16065
Variance: 7842.25
Monotonicity: Not monotonic
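
Several of the figures above are derived from others in the same block, so they can be cross-checked without access to the raw data. The sketch below recomputes the range, IQR, coefficient of variation, variance, and sum for age_ise_rank_co from the reported mean, standard deviation, extremes, and quartiles. The variable names are illustrative, and the observation count of 100 is taken from the Overview.

```python
# Values copied from the age_ise_rank_co statistics in this report.
n_observations = 100
mean = 160.65
std = 88.556479
minimum, maximum = 27, 591
q1, q3 = 110.25, 196

value_range = maximum - minimum   # Range = max - min
iqr = q3 - q1                     # Interquartile range = Q3 - Q1
cv = std / mean                   # Coefficient of variation = std / mean
variance = std ** 2               # Variance = std squared
total = mean * n_observations     # Sum = mean * number of observations

print(f"Range: {value_range}")
print(f"IQR: {iqr}")
print(f"CV: {cv:.8f}")
print(f"Variance: {variance:.2f}")
print(f"Sum: {total:.0f}")
```

Each recomputed value agrees with the reported one to the precision shown in the report.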

[Figure: Histogram with fixed size bins (bins=47)]

Value | Count | Frequency (%)
179 | 8 | 8.0%
221 | 5 | 5.0%
210 | 5 | 5.0%
196 | 5 | 5.0%
130 | 4 | 4.0%
141 | 4 | 4.0%
228 | 4 | 4.0%
159 | 4 | 4.0%
190 | 3 | 3.0%
87 | 3 | 3.0%
Other values (37) | 55 | 55.0%

Value | Count | Frequency (%)
27 | 1 | 1.0%
30 | 1 | 1.0%
45 | 1 | 1.0%
47 | 1 | 1.0%
61 | 1 | 1.0%
65 | 1 | 1.0%
67 | 2 | 2.0%
71 | 1 | 1.0%
73 | 1 | 1.0%
76 | 1 | 1.0%

Value | Count | Frequency (%)
591 | 2 | 2.0%
501 | 1 | 1.0%
234 | 2 | 2.0%
228 | 4 | 4.0%
221 | 5 | 5.0%
210 | 5 | 5.0%
206 | 3 | 3.0%
196 | 5 | 5.0%
190 | 3 | 3.0%
179 | 8 | 8.0%

all_rank_co
Real number (ℝ)

HIGH CORRELATION 

Distinct: 10
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 994.82
Minimum: 855
Maximum: 1000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 855
5-th percentile: 961.7
Q1: 1000
median: 1000
Q3: 1000
95-th percentile: 1000
Maximum: 1000
Range: 145
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 20.320354
Coefficient of variation (CV): 0.020426161
Kurtosis: 26.911449
Mean: 994.82
Median Absolute Deviation (MAD): 0
Skewness: -4.9123694
Sum: 99482
Variance: 412.91677
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=10)]

Value | Count | Frequency (%)
1000 | 90 | 90.0%
972 | 2 | 2.0%
916 | 1 | 1.0%
956 | 1 | 1.0%
948 | 1 | 1.0%
855 | 1 | 1.0%
998 | 1 | 1.0%
962 | 1 | 1.0%
912 | 1 | 1.0%
991 | 1 | 1.0%

Value | Count | Frequency (%)
855 | 1 | 1.0%
912 | 1 | 1.0%
916 | 1 | 1.0%
948 | 1 | 1.0%
956 | 1 | 1.0%
962 | 1 | 1.0%
972 | 2 | 2.0%
991 | 1 | 1.0%
998 | 1 | 1.0%
1000 | 90 | 90.0%

Value | Count | Frequency (%)
1000 | 90 | 90.0%
998 | 1 | 1.0%
991 | 1 | 1.0%
972 | 2 | 2.0%
962 | 1 | 1.0%
956 | 1 | 1.0%
948 | 1 | 1.0%
916 | 1 | 1.0%
912 | 1 | 1.0%
855 | 1 | 1.0%

Interactions

[Figures: 9 interaction plots]

Correlations

[Figure: correlation heatmap]

Variable | age_flag_nm | title_nm | isbn_no | vlm_nm | age_ise_rank_co | all_rank_co
age_flag_nm | 1.000 | 1.000 | 0.185 | 0.310 | 1.000 | 0.000
title_nm | 1.000 | 1.000 | 0.682 | 0.000 | 0.000 | 1.000
isbn_no | 0.185 | 0.682 | 1.000 | 0.000 | 0.338 | 0.213
vlm_nm | 0.310 | 0.000 | 0.000 | 1.000 | 0.598 | NaN
age_ise_rank_co | 1.000 | 0.000 | 0.338 | 0.598 | 1.000 | 0.000
all_rank_co | 0.000 | 1.000 | 0.213 | NaN | 0.000 | 1.000

[Figure: correlation heatmap]

Variable | age_flag_nm | vlm_nm
age_flag_nm | 1.000 | 0.354
vlm_nm | 0.354 | 1.000

[Figure: correlation heatmap]

Variable | isbn_no | age_ise_rank_co | all_rank_co | age_flag_nm | vlm_nm
isbn_no | 1.000 | -0.146 | 0.013 | 0.119 | 0.000
age_ise_rank_co | -0.146 | 1.000 | 0.248 | 0.979 | 0.382
all_rank_co | 0.013 | 0.248 | 1.000 | 0.000 | 1.000
age_flag_nm | 0.119 | 0.979 | 0.000 | 1.000 | 0.354
vlm_nm | 0.000 | 0.382 | 1.000 | 0.354 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

index | anals_trget_year | anals_trget_mt | age_flag_nm | title_nm | isbn_no | vlm_nm | age_ise_rank_co | all_rank_co
0 | 2021 | 11 | 20대 | (간단함, 병맛, 솔직함으로 기업의 흥망성쇠를 좌우하는) 90년생이 온다 | 9791188248674 | <NA> | 130 | 1000
1 | 2021 | 11 | 초등(8~13) | 혼령 장수 | 9791189239459 | 3 | 501 | 1000
2 | 2021 | 11 | 20대 | 12가지 인생의 법칙 :혼돈의 해독제 | 9791196067694 | <NA> | 77 | 916
3 | 2021 | 11 | 20대 | 1984 | 9788937460777 | <NA> | 210 | 1000
4 | 2021 | 11 | 20대 | IT 좀 아는 사람 :비전공자도 IT 전문가처럼 생각하는 법 | 9791155813355 | <NA> | 196 | 1000
5 | 2021 | 11 | 20대 | 개인주의자 선언 :판사 문유석의 일상유감 | 9788954637756 | <NA> | 179 | 1000
6 | 2021 | 11 | 20대 | 곰탕 :김영탁 장편소설 | 9788950973766 | 2 | 196 | 1000
7 | 2021 | 11 | 초등(8~13) | 혼령 장수 | 9791189239350 | 2 | 591 | 1000
8 | 2021 | 11 | 20대 | 그릿 :IQ, 재능, 환경을 뛰어넘는 열정적 끈기의 힘 | 9791186805398 | <NA> | 210 | 1000
9 | 2021 | 11 | 20대 | 꽃을 보듯 너를 본다 :나태주 인터넷 시집 | 9791157280292 | <NA> | 234 | 1000

index | anals_trget_year | anals_trget_mt | age_flag_nm | title_nm | isbn_no | vlm_nm | age_ise_rank_co | all_rank_co
90 | 2021 | 11 | 20대 | 침묵의 봄 | 9788962630619 | <NA> | 179 | 1000
91 | 2021 | 11 | 20대 | 칵테일, 러브, 좀비 | 9791190174756 | <NA> | 27 | 1000
92 | 2021 | 11 | 20대 | 키르케 :매들린 밀러 장편소설 | 9791190582308 | <NA> | 141 | 1000
93 | 2021 | 11 | 20대 | 트렌드 코리아 2021 :팬데믹 위기에 대응하는 전략은 무엇인가? | 9788959896837 | <NA> | 196 | 1000
94 | 2021 | 11 | 20대 | 파과 :구병모 장편소설 | 9791162203620 | <NA> | 73 | 1000
95 | 2021 | 11 | 20대 | 파과 :구병모 장편소설 | 9788957077740 | <NA> | 159 | 1000
96 | 2021 | 11 | 20대 | 프리워커스 =일하는 방식에 질문을 던지는 사람들 /Free workers | 9788925588667 | <NA> | 221 | 1000
97 | 2021 | 11 | 20대 | 한 스푼의 시간 :구병모 장편소설 | 9788959130580 | <NA> | 166 | 1000
98 | 2021 | 11 | 20대 | 해가 지는 곳으로 :최진영 장편소설 | 9788937473166 | <NA> | 96 | 1000
99 | 2021 | 11 | 20대 | 화이트 호스 =강화길 소설 /White horse | 9788954672221 | <NA> | 210 | 1000