Overview

Dataset statistics

Number of variables: 9
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.6 KiB
Average record size in memory: 78.3 B
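
The two memory figures are related by a simple division: total size over number of observations. The following is a minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes. Because the total is displayed rounded to one decimal, the result only approximates the reported 78.3 B.

```python
# Figures copied from the dataset statistics above.
total_kib = 7.6          # "Total size in memory", rounded to one decimal
n_observations = 100     # "Number of observations"

# Assumption: 1 KiB = 1024 bytes.
total_bytes = total_kib * 1024
avg_record_bytes = total_bytes / n_observations

print(f"{avg_record_bytes:.1f} B per record")
```

The sketch yields roughly 77.8 B. The gap to 78.3 B is consistent with the total having been rounded down to 7.6 KiB for display.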

Variable types

Categorical: 5
Text: 1
Numeric: 3

Alerts

anals_trget_year has constant value "2021" (Constant)
anals_trget_mt has constant value "11" (Constant)
pblcate_year is highly overall correlated with authr_nm and 1 other field (High correlation)
isbn_no is highly overall correlated with authr_nm and 1 other field (High correlation)
authr_nm is highly overall correlated with pblcate_year and 2 other fields (High correlation)
publisher_nm is highly overall correlated with pblcate_year and 2 other fields (High correlation)
isbn_no has unique values (Unique)
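
The Constant and Unique alerts can be understood in terms of distinct-value counts: a column with a single distinct value is constant, and a column whose distinct count equals its length is unique. The sketch below illustrates that rule. The helper name `classify_column` is hypothetical, and this is not claimed to be how ydata-profiling implements its alerts.

```python
def classify_column(values):
    """Return alert labels for one column (hypothetical helper)."""
    n_distinct = len(set(values))
    alerts = []
    if n_distinct == 1:
        alerts.append("Constant")
    if len(values) > 1 and n_distinct == len(values):
        alerts.append("Unique")
    return alerts


# Toy columns shaped like the ones flagged above.
anals_trget_year = ["2021"] * 100
isbn_no = [9788959139163, 9788937814228, 9788901215136]  # three values from the report

print(classify_column(anals_trget_year))
print(classify_column(isbn_no))
```

A column that is neither constant nor fully distinct, such as authr_nm, gets no label from this rule.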

Reproduction

Analysis started: 2023-12-10 09:54:21.640127
Analysis finished: 2023-12-10 09:54:24.786666
Duration: 3.15 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

anals_trget_year
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
[Figure: bar chart of the most frequent values]

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021
2nd row: 2021
3rd row: 2021
4th row: 2021
5th row: 2021

Common Values

Value | Count | Frequency (%)
2021 | 100 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

anals_trget_mt
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
[Figure: bar chart of the most frequent values]

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 11
2nd row: 11
3rd row: 11
4th row: 11
5th row: 11

Common Values

Value | Count | Frequency (%)
11 | 100 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

title_nm
Text

Distinct: 51
Distinct (%): 51.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 34
Median length: 24.5
Mean length: 13.15
Min length: 1

Characters and Unicode

Total characters: 1315
Distinct characters: 219
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
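
As an illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using one of the title values from this variable's sample rows. It covers categories only, since the standard library does not expose script or block names. The helper name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count Unicode general categories (e.g. 'Lo', 'Zs') in a string."""
    return Counter(unicodedata.category(ch) for ch in text)


# One title value from the sample rows of this variable.
title = "미생 :아직 살아 있지 못한 자"
counts = category_counts(title)
print(counts)
```

The two-letter codes map onto the category names used in this report: `Lo` is Other Letter, `Zs` is Space Separator, and `Po` is Other Punctuation.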

Unique

Unique: 33
Unique (%): 33.0%

Sample

1st row: (허영만의) 커피 한잔 할까요?
2nd row: 남김 - 제8회 대한민국창작만화공모전 수상작품집
3rd row: 미생 :아직 살아 있지 못한 자
4th row: 유미의 세포들 =Yumi's cells
5th row: 미생 :아직 살아 있지 못한 자

Value | Count | Frequency (%)
(value missing) | 22 | 5.9%
신과 | 14 | 3.7%
함께 | 14 | 3.7%
미생 | 13 | 3.5%
살아 | 13 | 3.5%
있지 | 13 | 3.5%
못한 | 13 | 3.5%
(value missing) | 13 | 3.5%
아직 | 13 | 3.5%
강풀액션만화 | 9 | 2.4%
Other values (112) | 237 | 63.4%

Most occurring characters

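Value | Count | Frequency (%)
(space) | 274 | 20.8%
: | 50 | 3.8%
(value missing) | 37 | 2.8%
(value missing) | 30 | 2.3%
(value missing) | 26 | 2.0%
(value missing) | 24 | 1.8%
(value missing) | 24 | 1.8%
(value missing) | 22 | 1.7%
(value missing) | 16 | 1.2%
(value missing) | 16 | 1.2%
Other values (209) | 796 | 60.5%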

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 899 | 68.4%
Space Separator | 274 | 20.8%
Other Punctuation | 63 | 4.8%
Decimal Number | 23 | 1.7%
Lowercase Letter | 20 | 1.5%
Dash Punctuation | 14 | 1.1%
Uppercase Letter | 14 | 1.1%
Close Punctuation | 3 | 0.2%
Open Punctuation | 3 | 0.2%
Math Symbol | 2 | 0.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(value missing) | 37 | 4.1%
(value missing) | 30 | 3.3%
(value missing) | 26 | 2.9%
(value missing) | 24 | 2.7%
(value missing) | 24 | 2.7%
(value missing) | 22 | 2.4%
(value missing) | 16 | 1.8%
(value missing) | 16 | 1.8%
(value missing) | 16 | 1.8%
(value missing) | 16 | 1.8%
Other values (175) | 672 | 74.7%

Lowercase Letter
Value | Count | Frequency (%)
l | 4 | 20.0%
e | 3 | 15.0%
s | 2 | 10.0%
h | 2 | 10.0%
u | 2 | 10.0%
c | 1 | 5.0%
i | 1 | 5.0%
m | 1 | 5.0%
d | 1 | 5.0%
n | 1 | 5.0%
Other values (2) | 2 | 10.0%

Decimal Number
Value | Count | Frequency (%)
2 | 8 | 34.8%
3 | 5 | 21.7%
1 | 5 | 21.7%
4 | 2 | 8.7%
6 | 1 | 4.3%
5 | 1 | 4.3%
8 | 1 | 4.3%

Other Punctuation
Value | Count | Frequency (%)
: | 50 | 79.4%
! | 5 | 7.9%
? | 5 | 7.9%
' | 1 | 1.6%
/ | 1 | 1.6%
, | 1 | 1.6%

Uppercase Letter
Value | Count | Frequency (%)
I | 10 | 71.4%
A | 2 | 14.3%
Y | 1 | 7.1%
T | 1 | 7.1%

Space Separator
Value | Count | Frequency (%)
(space) | 274 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 14 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 3 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 3 | 100.0%

Math Symbol
Value | Count | Frequency (%)
= | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 899 | 68.4%
Common | 382 | 29.0%
Latin | 34 | 2.6%

Most frequent character per script

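Hangul
Value | Count | Frequency (%)
(value missing) | 37 | 4.1%
(value missing) | 30 | 3.3%
(value missing) | 26 | 2.9%
(value missing) | 24 | 2.7%
(value missing) | 24 | 2.7%
(value missing) | 22 | 2.4%
(value missing) | 16 | 1.8%
(value missing) | 16 | 1.8%
(value missing) | 16 | 1.8%
(value missing) | 16 | 1.8%
Other values (175) | 672 | 74.7%

Common
Value | Count | Frequency (%)
(space) | 274 | 71.7%
: | 50 | 13.1%
- | 14 | 3.7%
2 | 8 | 2.1%
3 | 5 | 1.3%
! | 5 | 1.3%
1 | 5 | 1.3%
? | 5 | 1.3%
) | 3 | 0.8%
( | 3 | 0.8%
Other values (8) | 10 | 2.6%

Latin
Value | Count | Frequency (%)
I | 10 | 29.4%
l | 4 | 11.8%
e | 3 | 8.8%
s | 2 | 5.9%
A | 2 | 5.9%
h | 2 | 5.9%
u | 2 | 5.9%
c | 1 | 2.9%
i | 1 | 2.9%
m | 1 | 2.9%
Other values (6) | 6 | 17.6%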

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 899 | 68.4%
ASCII | 416 | 31.6%

Most frequent character per block

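ASCII
Value | Count | Frequency (%)
(space) | 274 | 65.9%
: | 50 | 12.0%
- | 14 | 3.4%
I | 10 | 2.4%
2 | 8 | 1.9%
3 | 5 | 1.2%
! | 5 | 1.2%
1 | 5 | 1.2%
? | 5 | 1.2%
l | 4 | 1.0%
Other values (24) | 36 | 8.7%

Hangul
Value | Count | Frequency (%)
(value missing) | 37 | 4.1%
(value missing) | 30 | 3.3%
(value missing) | 26 | 2.9%
(value missing) | 24 | 2.7%
(value missing) | 24 | 2.7%
(value missing) | 22 | 2.4%
(value missing) | 16 | 1.8%
(value missing) | 16 | 1.8%
(value missing) | 16 | 1.8%
(value missing) | 16 | 1.8%
Other values (175) | 672 | 74.7%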

authr_nm
Categorical

HIGH CORRELATION 

Distinct: 42
Distinct (%): 42.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
[Figure: bar chart of the most frequent values]

Length

Max length: 18
Median length: 16
Mean length: 8.74
Min length: 3

Unique

Unique: 25
Unique (%): 25.0%

Sample

1st row: 허영만 글·그림 ;이호준 글
2nd row: 윤현석
3rd row: 지은이: 윤태호
4th row: 글·그림: 이동건
5th row: 지은이: 윤태호

Common Values

Value | Count | Frequency (%)
지은이: 윤태호 | 12 | 12.0%
지은이: 강풀 | 9 | 9.0%
지은이: 주호민 | 8 | 8.0%
주호민 (지은이) | 6 | 6.0%
글·그림: 광진 | 6 | 6.0%
최규석 만화 | 5 | 5.0%
글·그림: 채유리 | 4 | 4.0%
조경규 글·그림 | 4 | 4.0%
글·그림: 강풀 | 3 | 3.0%
글·그림: 오묘 | 3 | 3.0%
Other values (32) | 40 | 40.0%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
지은이 | 47 | 21.3%
글·그림 | 29 | 13.1%
주호민 | 14 | 6.3%
강풀 | 13 | 5.9%
윤태호 | 13 | 5.9%
만화 | 8 | 3.6%
(value missing) | 8 | 3.6%
그림 | 7 | 3.2%
광진 | 6 | 2.7%
최규석 | 6 | 2.7%
Other values (39) | 70 | 31.7%

publisher_nm
Categorical

HIGH CORRELATION 

Distinct: 22
Distinct (%): 22.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
[Figure: bar chart of the most frequent values]

Length

Max length: 18
Median length: 13
Mean length: 6.41
Min length: 2

Unique

Unique: 8
Unique (%): 8.0%

Sample

1st row: 위즈덤하우스
2nd row: 만화규장각
3rd row: 위즈덤하우스 미디어그룹
4th row: 위즈덤하우스 미디어그룹
5th row: 위즈덤하우스 미디어그룹

Common Values

Value | Count | Frequency (%)
애니북스 | 17 | 17.0%
위즈덤하우스 | 15 | 15.0%
위즈덤하우스 미디어그룹 | 14 | 14.0%
웅진씽크빅 | 7 | 7.0%
Young Com(영컴) | 6 | 6.0%
창비 | 6 | 6.0%
미래엔 | 6 | 6.0%
문학동네 | 5 | 5.0%
송송책방 | 4 | 4.0%
북폴리오 | 3 | 3.0%
Other values (12) | 17 | 17.0%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
위즈덤하우스 | 29 | 23.6%
애니북스 | 17 | 13.8%
미디어그룹 | 14 | 11.4%
웅진씽크빅 | 7 | 5.7%
young | 6 | 4.9%
com(영컴 | 6 | 4.9%
창비 | 6 | 4.9%
미래엔 | 6 | 4.9%
문학동네 | 5 | 4.1%
송송책방 | 4 | 3.3%
Other values (15) | 23 | 18.7%

pblcate_year
Real number (ℝ)

HIGH CORRELATION 

Distinct: 13
Distinct (%): 13.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2016.13
Minimum: 2008
Maximum: 2021
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB
[Figure: histogram]

Quantile statistics

Minimum: 2008
5-th percentile: 2011
Q1: 2014
median: 2017
Q3: 2018
95-th percentile: 2021
Maximum: 2021
Range: 13
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.0539096
Coefficient of variation (CV): 0.0015147384
Kurtosis: -0.31703315
Mean: 2016.13
Median Absolute Deviation (MAD): 1
Skewness: -0.5816611
Sum: 201613
Variance: 9.3263636
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=13)]

Value | Count | Frequency (%)
2017 | 28 | 28.0%
2018 | 19 | 19.0%
2011 | 12 | 12.0%
2015 | 8 | 8.0%
2021 | 7 | 7.0%
2020 | 6 | 6.0%
2014 | 5 | 5.0%
2012 | 4 | 4.0%
2016 | 4 | 4.0%
2013 | 3 | 3.0%
Other values (3) | 4 | 4.0%

Value | Count | Frequency (%)
2008 | 1 | 1.0%
2009 | 1 | 1.0%
2011 | 12 | 12.0%
2012 | 4 | 4.0%
2013 | 3 | 3.0%
2014 | 5 | 5.0%
2015 | 8 | 8.0%
2016 | 4 | 4.0%
2017 | 28 | 28.0%
2018 | 19 | 19.0%

Value | Count | Frequency (%)
2021 | 7 | 7.0%
2020 | 6 | 6.0%
2019 | 2 | 2.0%
2018 | 19 | 19.0%
2017 | 28 | 28.0%
2016 | 4 | 4.0%
2015 | 8 | 8.0%
2014 | 5 | 5.0%
2013 | 3 | 3.0%
2012 | 4 | 4.0%
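
The descriptive statistics for pblcate_year can be cross-checked against the value tables above, which between them list all 13 distinct years and their counts. The sketch below rebuilds the 100 observations from those counts and recomputes several of the figures with Python's `statistics` module. It assumes the reported variance is the sample variance (n - 1 denominator), which is what `statistics.variance` computes, and that the MAD is the median of absolute deviations from the median.

```python
import statistics

# Frequency table for pblcate_year, merged from the value tables above.
year_counts = {
    2008: 1, 2009: 1, 2011: 12, 2012: 4, 2013: 3, 2014: 5, 2015: 8,
    2016: 4, 2017: 28, 2018: 19, 2019: 2, 2020: 6, 2021: 7,
}

# Expand the table into the individual observations.
years = [year for year, count in year_counts.items() for _ in range(count)]

mean = statistics.mean(years)
median = statistics.median(years)
variance = statistics.variance(years)   # sample variance (n - 1 denominator)
std = statistics.stdev(years)
cv = std / mean                         # coefficient of variation
mad = statistics.median(abs(y - median) for y in years)

print(len(years), sum(years), mean, median)
print(round(variance, 4), round(std, 4), round(cv, 7), mad)
```

Under those assumptions the recomputed count, sum, mean, median, variance, standard deviation, CV and MAD agree with the reported values to the displayed precision.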

isbn_no
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.7895495 × 10^12
Minimum: 9.7889011 × 10^12
Maximum: 9.7911965 × 10^12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB
[Figure: histogram]

Quantile statistics

Minimum: 9.7889011 × 10^12
5-th percentile: 9.7889012 × 10^12
Q1: 9.788951 × 10^12
median: 9.7889592 × 10^12
Q3: 9.7911622 × 10^12
95-th percentile: 9.7911912 × 10^12
Maximum: 9.7911965 × 10^12
Range: 2.2953372 × 10^9
Interquartile range (IQR): 2.2112432 × 10^9

Descriptive statistics

Standard deviation: 9.9389757 × 10^8
Coefficient of variation (CV): 0.00010152639
Kurtosis: -0.9118623
Mean: 9.7895495 × 10^12
Median Absolute Deviation (MAD): 21374501
Skewness: 1.051128
Sum: 9.7895495 × 10^14
Variance: 9.8783238 × 10^17
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
9788959139163 | 1 | 1.0%
9788937814228 | 1 | 1.0%
9788901215136 | 1 | 1.0%
9788954676397 | 1 | 1.0%
9788959194995 | 1 | 1.0%
9788901215129 | 1 | 1.0%
9788959195008 | 1 | 1.0%
9788950973216 | 1 | 1.0%
9791162334386 | 1 | 1.0%
9791162332344 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Value | Count | Frequency (%)
9788901121499 | 1 | 1.0%
9788901160948 | 1 | 1.0%
9788901160955 | 1 | 1.0%
9788901160979 | 1 | 1.0%
9788901215129 | 1 | 1.0%
9788901215136 | 1 | 1.0%
9788901224619 | 1 | 1.0%
9788925589091 | 1 | 1.0%
9788925589503 | 1 | 1.0%
9788925590059 | 1 | 1.0%

Value | Count | Frequency (%)
9791196458676 | 1 | 1.0%
9791196287801 | 1 | 1.0%
9791196202378 | 1 | 1.0%
9791196155711 | 1 | 1.0%
9791191583137 | 1 | 1.0%
9791191194074 | 1 | 1.0%
9791190569217 | 1 | 1.0%
9791190569040 | 1 | 1.0%
9791186712474 | 1 | 1.0%
9791186712351 | 1 | 1.0%

vlm_nm
Categorical

Distinct: 20
Distinct (%): 20.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
[Figure: bar chart of the most frequent values]

Length

Max length: 4
Median length: 1
Mean length: 1.77
Min length: 1

Unique

Unique: 5
Unique (%): 5.0%

Sample

1st row: 1
2nd row: <NA>
3rd row: 2
4th row: 1
5th row: 3

Common Values

Value | Count | Frequency (%)
<NA> | 22 | 22.0%
1 | 19 | 19.0%
2 | 9 | 9.0%
5 | 8 | 8.0%
3 | 6 | 6.0%
4 | 5 | 5.0%
10 | 4 | 4.0%
6 | 4 | 4.0%
8 | 3 | 3.0%
7 | 3 | 3.0%
Other values (10) | 17 | 17.0%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
na | 22 | 22.0%
1 | 19 | 19.0%
2 | 9 | 9.0%
5 | 8 | 8.0%
3 | 6 | 6.0%
4 | 5 | 5.0%
10 | 4 | 4.0%
6 | 4 | 4.0%
(value missing) | 3 | 3.0%
(value missing) | 3 | 3.0%
Other values (10) | 17 | 17.0%

lon_co
Real number (ℝ)

Distinct: 66
Distinct (%): 66.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 127.36
Minimum: 1
Maximum: 321
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB
[Figure: histogram]

Quantile statistics

Minimum: 1
5-th percentile: 82
Q1: 96.75
median: 113
Q3: 155.5
95-th percentile: 218.2
Maximum: 321
Range: 320
Interquartile range (IQR): 58.75

Descriptive statistics

Standard deviation: 51.085692
Coefficient of variation (CV): 0.40111253
Kurtosis: 1.9815771
Mean: 127.36
Median Absolute Deviation (MAD): 24
Skewness: 0.75202701
Sum: 12736
Variance: 2609.7479
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
82 | 4 | 4.0%
97 | 4 | 4.0%
114 | 3 | 3.0%
96 | 3 | 3.0%
1 | 3 | 3.0%
85 | 3 | 3.0%
86 | 3 | 3.0%
98 | 3 | 3.0%
109 | 2 | 2.0%
110 | 2 | 2.0%
Other values (56) | 70 | 70.0%

Value | Count | Frequency (%)
1 | 3 | 3.0%
81 | 1 | 1.0%
82 | 4 | 4.0%
83 | 2 | 2.0%
85 | 3 | 3.0%
86 | 3 | 3.0%
87 | 1 | 1.0%
88 | 1 | 1.0%
92 | 2 | 2.0%
93 | 1 | 1.0%

Value | Count | Frequency (%)
321 | 1 | 1.0%
256 | 1 | 1.0%
245 | 1 | 1.0%
228 | 1 | 1.0%
222 | 1 | 1.0%
218 | 1 | 1.0%
207 | 1 | 1.0%
203 | 1 | 1.0%
202 | 1 | 1.0%
200 | 1 | 1.0%

Interactions

[Figures: 9 interaction plots]

Correlations

[Figure: correlation heatmap]
 | title_nm | authr_nm | publisher_nm | pblcate_year | isbn_no | vlm_nm | lon_co
title_nm | 1.000 | 1.000 | 0.998 | 0.992 | 0.984 | 0.000 | 0.832
authr_nm | 1.000 | 1.000 | 0.998 | 0.980 | 0.903 | 0.000 | 0.847
publisher_nm | 0.998 | 0.998 | 1.000 | 0.935 | 0.870 | 0.392 | 0.687
pblcate_year | 0.992 | 0.980 | 0.935 | 1.000 | 0.304 | 0.687 | 0.438
isbn_no | 0.984 | 0.903 | 0.870 | 0.304 | 1.000 | 0.000 | 0.272
vlm_nm | 0.000 | 0.000 | 0.392 | 0.687 | 0.000 | 1.000 | 0.000
lon_co | 0.832 | 0.847 | 0.687 | 0.438 | 0.272 | 0.000 | 1.000

[Figure: correlation heatmap]
 | vlm_nm | authr_nm | publisher_nm
vlm_nm | 1.000 | 0.000 | 0.116
authr_nm | 0.000 | 1.000 | 0.827
publisher_nm | 0.116 | 0.827 | 1.000

[Figure: correlation heatmap]
 | pblcate_year | isbn_no | lon_co | authr_nm | publisher_nm | vlm_nm
pblcate_year | 1.000 | 0.335 | 0.224 | 0.689 | 0.625 | 0.275
isbn_no | 0.335 | 1.000 | 0.008 | 0.627 | 0.654 | 0.000
lon_co | 0.224 | 0.008 | 1.000 | 0.406 | 0.338 | 0.000
authr_nm | 0.689 | 0.627 | 0.406 | 1.000 | 0.827 | 0.000
publisher_nm | 0.625 | 0.654 | 0.338 | 0.827 | 1.000 | 0.116
vlm_nm | 0.275 | 0.000 | 0.000 | 0.000 | 0.116 | 1.000
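
The High correlation alerts in the overview can be related to the last matrix above by applying a cut-off to each row. The sketch below does this with a threshold of 0.5. That threshold is an assumption, not a value stated in this report, and the helper name `highly_correlated` is hypothetical.

```python
# Last correlation matrix above, transcribed row by row.
fields = ["pblcate_year", "isbn_no", "lon_co", "authr_nm", "publisher_nm", "vlm_nm"]
matrix = [
    [1.000, 0.335, 0.224, 0.689, 0.625, 0.275],
    [0.335, 1.000, 0.008, 0.627, 0.654, 0.000],
    [0.224, 0.008, 1.000, 0.406, 0.338, 0.000],
    [0.689, 0.627, 0.406, 1.000, 0.827, 0.000],
    [0.625, 0.654, 0.338, 0.827, 1.000, 0.116],
    [0.275, 0.000, 0.000, 0.000, 0.116, 1.000],
]
THRESHOLD = 0.5  # assumed cut-off, not taken from the report


def highly_correlated(fields, matrix, threshold):
    """Map each field to the other fields whose |correlation| exceeds threshold."""
    result = {}
    for i, name in enumerate(fields):
        partners = [
            fields[j]
            for j, value in enumerate(matrix[i])
            if j != i and abs(value) > threshold
        ]
        if partners:
            result[name] = partners
    return result


flagged = highly_correlated(fields, matrix, THRESHOLD)
for name, partners in flagged.items():
    print(name, "->", partners)
```

With this cut-off, the flagged fields and partner counts line up with the four High correlation alerts: pblcate_year and isbn_no each have two partners, authr_nm and publisher_nm each have three, and lon_co and vlm_nm have none.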

Missing values

[Figure: nullity bar chart]
A simple visualization of nullity by column.
[Figure: nullity matrix]
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
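
The report counts 0 missing cells, yet vlm_nm contains the value `<NA>` 22 times and has a maximum length of 4, which suggests the marker is stored as a literal string rather than as a true null. The sketch below shows one way to count such placeholders as missing. The placeholder set and the helper name `count_missing` are assumptions for illustration.

```python
PLACEHOLDERS = {"<NA>"}  # assumed placeholder spelling, taken from the vlm_nm values


def count_missing(values, placeholders=PLACEHOLDERS):
    """Count None values plus strings that merely spell a missing-value marker."""
    return sum(1 for v in values if v is None or v in placeholders)


# First five vlm_nm values from that variable's sample rows.
vlm_nm_sample = ["1", "<NA>", "2", "1", "3"]
print(count_missing(vlm_nm_sample))
```

Applied to the full vlm_nm column under the same assumption, this would report 22 missing values instead of 0.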

Sample

 | anals_trget_year | anals_trget_mt | title_nm | authr_nm | publisher_nm | pblcate_year | isbn_no | vlm_nm | lon_co
0 | 2021 | 11 | (허영만의) 커피 한잔 할까요? | 허영만 글·그림 ;이호준 글 | 위즈덤하우스 | 2015 | 9788959139163 | 1 | 321
1 | 2021 | 11 | 남김 - 제8회 대한민국창작만화공모전 수상작품집 | 윤현석 | 만화규장각 | 2011 | 9788992596749 | <NA> | 1
2 | 2021 | 11 | 미생 :아직 살아 있지 못한 자 | 지은이: 윤태호 | 위즈덤하우스 미디어그룹 | 2018 | 9788960865594 | 2 | 256
3 | 2021 | 11 | 유미의 세포들 =Yumi's cells | 글·그림: 이동건 | 위즈덤하우스 미디어그룹 | 2017 | 9788959135882 | 1 | 245
4 | 2021 | 11 | 미생 :아직 살아 있지 못한 자 | 지은이: 윤태호 | 위즈덤하우스 미디어그룹 | 2018 | 9788960865723 | 3 | 228
5 | 2021 | 11 | 오무라이스 잼잼 :경이로운 일상음식 이야기 | 조경규 글·그림 | 송송책방 | 2017 | 9791190569217 | 11 | 222
6 | 2021 | 11 | 송곳 | 최규석 만화 | 창비 | 2015 | 9788936472665 | 3 | 218
7 | 2021 | 11 | 잉잉잉 :잉여가 잉여잉여해 | 황준호 글 ;수연 그림 | 애니북스 | 2012 | 9788959194803 | 2 | 1
8 | 2021 | 11 | 송곳 | 최규석 만화 | 창비 | 2015 | 9788936472641 | 1 | 207
9 | 2021 | 11 | 고래별 3 - 경성의 인어공주 | 나윤희 (지은이) | 알에이치코리아(RHK) | 2021 | 9788925589091 | <NA> | 203

 | anals_trget_year | anals_trget_mt | title_nm | authr_nm | publisher_nm | pblcate_year | isbn_no | vlm_nm | lon_co
90 | 2021 | 11 | 오무라이스 잼잼 :경이로운 일상음식 이야기 | 조경규 글·그림 | 송송책방 | 2020 | 9791190569040 | 10 | 85
91 | 2021 | 11 | 마녀 | 글·그림: 강풀 | 웅진씽크빅 | 2013 | 9788901160948 | 01 | 85
92 | 2021 | 11 | 미생 :아직 살아 있지 못한 자 | 지은이: 윤태호 | 위즈덤하우스 미디어그룹 | 2018 | 9791162203095 | 13 | 85
93 | 2021 | 11 | 극락왕생 3 | 고사리박사 (지은이) | 문학동네 | 2021 | 9788954679305 | <NA> | 83
94 | 2021 | 11 | 뽀짜툰 | 글·그림: 채유리 | 미래엔 | 2014 | 9791164131877 | 7 | 83
95 | 2021 | 11 | 당신의 모든 순간 | 강풀 글·그림 | 웅진씽크빅 | 2011 | 9788901121499 | 1 | 82
96 | 2021 | 11 | 우리집에서 밥 먹고 갈래요? | 글·그림: 오묘 | 웅진씽크빅 | 2017 | 9788901224619 | 3 | 82
97 | 2021 | 11 | 놓지마 정신줄 완전판 22 - 시즌2, 완결 | 신태훈, 나승훈 (지은이) | 웹툰북스 | 2021 | 9791191194074 | <NA> | 82
98 | 2021 | 11 | 여중생 A | 허5파6 지음 | 비아북 | 2017 | 9791186712474 | 5 | 82
99 | 2021 | 11 | 마녀 | 글·그림: 강풀 | 웅진씽크빅 | 2013 | 9788901160955 | 02 | 81