Overview

Dataset statistics

Number of variables: 5
Number of observations: 100
Missing cells: 100
Missing cells (%): 20.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.3 KiB
Average record size in memory: 44.3 B
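
As a quick consistency check, the missing-cell percentage follows from the counts above: 100 missing cells out of 5 × 100 cells in total. The following is a minimal sketch in plain Python; the variable names are illustrative and are not part of the report.

```python
# Figures copied from the dataset statistics above.
n_variables = 5
n_observations = 100
missing_cells = 100

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{missing_cells} of {total_cells} cells missing = {missing_pct:.1f}%")
```

The result agrees with the alert that book_isbn_cn is missing in all 100 observations, which by itself accounts for every missing cell.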

Variable types

Categorical: 2
Text: 1
Unsupported: 1
Numeric: 1

Alerts

lon_cd is highly overall correlated with lon_cd_nm (High correlation)
lon_cd_nm is highly overall correlated with lon_cd (High correlation)
book_isbn_cn has 100 (100.0%) missing values (Missing)
book_isbn_cn is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)

Reproduction

Analysis started: 2023-12-10 10:02:18.618812
Analysis finished: 2023-12-10 10:02:19.557138
Duration: 0.94 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

lon_cd
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
1001 | 80
1 | 20

Length

Max length: 4
Median length: 4
Mean length: 3.4
Min length: 1
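
The length statistics can be reproduced from the value counts reported for lon_cd (80 occurrences of "1001" and 20 of "1"). This is a small sketch using only the standard library; it treats the values as strings, which is an assumption based on the variable being profiled as categorical.

```python
from statistics import mean, median

# Value counts for lon_cd as reported in this section.
value_counts = {"1001": 80, "1": 20}

# Expand the counts into one string length per observation.
lengths = [len(value) for value, count in value_counts.items() for _ in range(count)]

mean_length = mean(lengths)
median_length = median(lengths)

print(max(lengths), median_length, mean_length, min(lengths))
```

The mean of 3.4 is simply (80 × 4 + 20 × 1) / 100.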

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1001
2nd row: 1
3rd row: 1001
4th row: 1001
5th row: 1001

Common Values

Value | Count | Frequency (%)
1001 | 80 | 80.0%
1 | 20 | 20.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: chart of the common values of lon_cd listed above (1001: 80, 80.0%; 1: 20, 20.0%)]

lon_cd_nm
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
관내대출 (in-library loan) | 80
일반대출 (general loan) | 20

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 관내대출
2nd row: 일반대출
3rd row: 관내대출
4th row: 관내대출
5th row: 관내대출

Common Values

Value | Count | Frequency (%)
관내대출 | 80 | 80.0%
일반대출 | 20 | 20.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: chart of the common values of lon_cd_nm listed above (관내대출: 80, 80.0%; 일반대출: 20, 20.0%)]

sj
Text

Distinct: 94
Distinct (%): 94.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 71
Median length: 32.5
Mean length: 20.98
Min length: 3

Characters and Unicode

Total characters: 2098
Distinct characters: 297
Distinct categories: 9
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
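
To illustrate what such character properties look like, the sketch below counts the Unicode general category of every character in one of the sample titles from this column, using Python's standard `unicodedata` module. It is not the profiling tool's own implementation; the two-letter codes it prints correspond to the category names used in this report (for example `Lo` is Other Letter and `Zs` is Space Separator).

```python
import unicodedata
from collections import Counter

# One of the sample titles from the sj column, normalised to composed form
# so that each Hangul syllable is a single code point.
title = unicodedata.normalize("NFC", "갈매기 [DVD]")

# General category of each code point, e.g. 'Lo', 'Lu', 'Zs', 'Ps', 'Pe'.
categories = Counter(unicodedata.category(ch) for ch in title)

for code, count in categories.most_common():
    print(code, count)
```

Here the three Hangul syllables fall under Other Letter, "DVD" under Uppercase Letter, and the brackets under Open and Close Punctuation.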

Unique

Unique: 88
Unique (%): 88.0%

Sample

1st row: 갈매기
2nd row: (제25회)앙가쥬망전 . 25
3rd row: 문예진흥원 2000년도 기획공연 ; 세 자매
4th row: 갈매기 [DVD]
5th row: 벚나무 동산

Value | Count | Frequency (%)
(not shown) | 75 | 16.6%
dvd | 39 | 8.6%
2004 | 13 | 2.9%
연극열전 | 13 | 2.9%
verdi | 5 | 1.1%
서울국제공연예술제 | 5 | 1.1%
the | 4 | 0.9%
갈매기 | 4 | 0.9%
기획공연 | 4 | 0.9%
극단 | 4 | 0.9%
Other values (219) | 285 | 63.2%

Most occurring characters

Value | Count | Frequency (%)
(space) | 353 | 16.8%
D | 86 | 4.1%
0 | 67 | 3.2%
; | 50 | 2.4%
(not shown) | 50 | 2.4%
e | 50 | 2.4%
i | 48 | 2.3%
V | 45 | 2.1%
a | 44 | 2.1%
( | 41 | 2.0%
Other values (287) | 1264 | 60.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 739 | 35.2%
Lowercase Letter | 389 | 18.5%
Space Separator | 353 | 16.8%
Uppercase Letter | 206 | 9.8%
Decimal Number | 159 | 7.6%
Other Punctuation | 87 | 4.1%
Open Punctuation | 82 | 3.9%
Close Punctuation | 82 | 3.9%
Math Symbol | 1 | < 0.1%

Most frequent character per category

Other Letter

Value | Count | Frequency (%)
(not shown) | 50 | 6.8%
(not shown) | 36 | 4.9%
(not shown) | 28 | 3.8%
(not shown) | 21 | 2.8%
(not shown) | 18 | 2.4%
(not shown) | 18 | 2.4%
(not shown) | 16 | 2.2%
(not shown) | 15 | 2.0%
(not shown) | 15 | 2.0%
(not shown) | 14 | 1.9%
Other values (216) | 508 | 68.7%

Lowercase Letter

Value | Count | Frequency (%)
e | 50 | 12.9%
i | 48 | 12.3%
a | 44 | 11.3%
o | 31 | 8.0%
r | 29 | 7.5%
n | 28 | 7.2%
t | 23 | 5.9%
d | 17 | 4.4%
l | 17 | 4.4%
s | 14 | 3.6%
Other values (13) | 88 | 22.6%

Uppercase Letter

Value | Count | Frequency (%)
D | 86 | 41.7%
V | 45 | 21.8%
T | 13 | 6.3%
R | 8 | 3.9%
M | 7 | 3.4%
L | 5 | 2.4%
C | 5 | 2.4%
N | 4 | 1.9%
A | 4 | 1.9%
S | 4 | 1.9%
Other values (12) | 25 | 12.1%

Decimal Number

Value | Count | Frequency (%)
0 | 67 | 42.1%
2 | 40 | 25.2%
4 | 15 | 9.4%
1 | 14 | 8.8%
9 | 8 | 5.0%
5 | 4 | 2.5%
6 | 3 | 1.9%
3 | 3 | 1.9%
8 | 3 | 1.9%
7 | 2 | 1.3%

Other Punctuation

Value | Count | Frequency (%)
; | 50 | 57.5%
: | 22 | 25.3%
" | 4 | 4.6%
, | 3 | 3.4%
& | 3 | 3.4%
. | 1 | 1.1%
' | 1 | 1.1%
? | 1 | 1.1%
! | 1 | 1.1%
/ | 1 | 1.1%

Open Punctuation

Value | Count | Frequency (%)
( | 41 | 50.0%
[ | 41 | 50.0%

Close Punctuation

Value | Count | Frequency (%)
] | 41 | 50.0%
) | 41 | 50.0%

Space Separator

Value | Count | Frequency (%)
(space) | 353 | 100.0%

Math Symbol

Value | Count | Frequency (%)
= | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 764 | 36.4%
Hangul | 738 | 35.2%
Latin | 595 | 28.4%
Han | 1 | < 0.1%

Most frequent character per script

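Hangul

Value | Count | Frequency (%)
(not shown) | 50 | 6.8%
(not shown) | 36 | 4.9%
(not shown) | 28 | 3.8%
(not shown) | 21 | 2.8%
(not shown) | 18 | 2.4%
(not shown) | 18 | 2.4%
(not shown) | 16 | 2.2%
(not shown) | 15 | 2.0%
(not shown) | 15 | 2.0%
(not shown) | 14 | 1.9%
Other values (215) | 507 | 68.7%

Latin

Value | Count | Frequency (%)
D | 86 | 14.5%
e | 50 | 8.4%
i | 48 | 8.1%
V | 45 | 7.6%
a | 44 | 7.4%
o | 31 | 5.2%
r | 29 | 4.9%
n | 28 | 4.7%
t | 23 | 3.9%
d | 17 | 2.9%
Other values (35) | 194 | 32.6%

Common

Value | Count | Frequency (%)
(space) | 353 | 46.2%
0 | 67 | 8.8%
; | 50 | 6.5%
( | 41 | 5.4%
[ | 41 | 5.4%
] | 41 | 5.4%
) | 41 | 5.4%
2 | 40 | 5.2%
: | 22 | 2.9%
4 | 15 | 2.0%
Other values (16) | 53 | 6.9%

Han

Value | Count | Frequency (%)
(not shown) | 1 | 100.0%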

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1358 | 64.7%
Hangul | 738 | 35.2%
CJK | 1 | < 0.1%
None | 1 | < 0.1%

Most frequent character per block

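ASCII

Value | Count | Frequency (%)
(space) | 353 | 26.0%
D | 86 | 6.3%
0 | 67 | 4.9%
; | 50 | 3.7%
e | 50 | 3.7%
i | 48 | 3.5%
V | 45 | 3.3%
a | 44 | 3.2%
( | 41 | 3.0%
[ | 41 | 3.0%
Other values (60) | 533 | 39.2%

Hangul

Value | Count | Frequency (%)
(not shown) | 50 | 6.8%
(not shown) | 36 | 4.9%
(not shown) | 28 | 3.8%
(not shown) | 21 | 2.8%
(not shown) | 18 | 2.4%
(not shown) | 18 | 2.4%
(not shown) | 16 | 2.2%
(not shown) | 15 | 2.0%
(not shown) | 15 | 2.0%
(not shown) | 14 | 1.9%
Other values (215) | 507 | 68.7%

CJK

Value | Count | Frequency (%)
(not shown) | 1 | 100.0%

None

Value | Count | Frequency (%)
ä | 1 | 100.0%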

book_isbn_cn
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 100
Missing (%): 100.0%
Memory size: 1.0 KiB

co
Real number (ℝ)

Distinct: 92
Distinct (%): 92.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 368.9
Minimum: 1
Maximum: 1866
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 210.95
Q1: 256.25
Median: 314
Q3: 415.5
95-th percentile: 730.3
Maximum: 1866
Range: 1865
Interquartile range (IQR): 159.25

Descriptive statistics

Standard deviation: 231.15824
Coefficient of variation (CV): 0.62661491
Kurtosis: 18.724933
Mean: 368.9
Median Absolute Deviation (MAD): 70.5
Skewness: 3.5202874
Sum: 36890
Variance: 53434.131
Monotonicity: Not monotonic
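
Several of the figures reported for co are derived from one another, and the relationships can be checked with a few lines of arithmetic. The sketch below only uses numbers copied from this section; the variable names are illustrative.

```python
# Values copied from the quantile and descriptive statistics of co.
mean = 368.9
std = 231.15824
variance = 53434.131
n_observations = 100
q1, q3 = 256.25, 415.5
minimum, maximum = 1, 1866

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
total = mean * n_observations    # sum of all observations

print(f"CV={cv:.8f}  IQR={iqr}  range={value_range}  sum={total:.0f}")
print(f"std^2={std ** 2:.3f} vs reported variance {variance}")
```

The large gap between the maximum (1866) and the 95-th percentile (730.3) is consistent with the strong positive skewness reported above.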
[Figure: histogram with fixed size bins (bins=50)]

Common Values

Value | Count | Frequency (%)
1 | 3 | 3.0%
258 | 2 | 2.0%
340 | 2 | 2.0%
289 | 2 | 2.0%
277 | 2 | 2.0%
320 | 2 | 2.0%
210 | 2 | 2.0%
273 | 1 | 1.0%
276 | 1 | 1.0%
212 | 1 | 1.0%
Other values (82) | 82 | 82.0%

Smallest values

Value | Count | Frequency (%)
1 | 3 | 3.0%
210 | 2 | 2.0%
211 | 1 | 1.0%
212 | 1 | 1.0%
221 | 1 | 1.0%
222 | 1 | 1.0%
225 | 1 | 1.0%
226 | 1 | 1.0%
229 | 1 | 1.0%
230 | 1 | 1.0%

Largest values

Value | Count | Frequency (%)
1866 | 1 | 1.0%
1215 | 1 | 1.0%
871 | 1 | 1.0%
870 | 1 | 1.0%
793 | 1 | 1.0%
727 | 1 | 1.0%
685 | 1 | 1.0%
677 | 1 | 1.0%
624 | 1 | 1.0%
573 | 1 | 1.0%

Interactions

[Figure: interactions plot]

Correlations

Variable | lon_cd | lon_cd_nm | sj | co
lon_cd | 1.000 | 0.999 | 0.000 | 0.153
lon_cd_nm | 0.999 | 1.000 | 0.000 | 0.153
sj | 0.000 | 0.000 | 1.000 | 0.997
co | 0.153 | 0.153 | 0.997 | 1.000

Variable | lon_cd | lon_cd_nm
lon_cd | 1.000 | 0.968
lon_cd_nm | 0.968 | 1.000

Variable | co | lon_cd | lon_cd_nm
co | 1.000 | 0.158 | 0.158
lon_cd | 0.158 | 1.000 | 0.968
lon_cd_nm | 0.158 | 0.968 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
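
The idea behind both displays can be sketched in a few lines of plain Python. The example below uses two rows copied from the sample at the end of this report, with `None` standing in for `<NA>`; it is an illustration of the concept rather than the plotting code the profiling tool uses.

```python
# Column names and two rows taken from the sample table of this report.
columns = ["lon_cd", "lon_cd_nm", "sj", "book_isbn_cn", "co"]
rows = [
    ["1001", "관내대출", "갈매기", None, 1866],
    ["1", "일반대출", "Rigoletto [DVD]", None, 229],
]

# Nullity by column: how many values are missing in each column.
missing_per_column = {
    name: sum(row[i] is None for row in rows) for i, name in enumerate(columns)
}

# Nullity matrix: one line per row, '#' for a present value and '.' for a missing one.
nullity_matrix = ["".join("." if cell is None else "#" for cell in row) for row in rows]

print(missing_per_column)
print("\n".join(nullity_matrix))
```

In the matrix form, a column that is entirely missing, such as book_isbn_cn, shows up as an unbroken vertical run of gaps.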

Sample

First rows

Index | lon_cd | lon_cd_nm | sj | book_isbn_cn | co
0 | 1001 | 관내대출 | 갈매기 | <NA> | 1866
1 | 1 | 일반대출 | (제25회)앙가쥬망전 . 25 | <NA> | 1
2 | 1001 | 관내대출 | 문예진흥원 2000년도 기획공연 ; 세 자매 | <NA> | 1215
3 | 1001 | 관내대출 | 갈매기 [DVD] | <NA> | 871
4 | 1001 | 관내대출 | 벚나무 동산 | <NA> | 870
5 | 1001 | 관내대출 | 리어왕 | <NA> | 793
6 | 1001 | 관내대출 | 시련 [VHS] | <NA> | 727
7 | 1001 | 관내대출 | 무엇이 될꼬하니 | <NA> | 1
8 | 1001 | 관내대출 | (Unplugged Musical) 밑바닥에서 | <NA> | 685
9 | 1001 | 관내대출 | 김종욱 찾기 ; 로맨틱 코메디 뮤지컬 | <NA> | 677

Last rows

Index | lon_cd | lon_cd_nm | sj | book_isbn_cn | co
90 | 1001 | 관내대출 | The Heat Is On : The making of Miss Saigon [DVD] = 미스사이공 | <NA> | 230
91 | 1 | 일반대출 | Rigoletto [DVD] | <NA> | 229
92 | 1001 | 관내대출 | Les Miserables [DVD] | <NA> | 226
93 | 1 | 일반대출 | Die Zauberflote [DVD] | <NA> | 225
94 | 1 | 일반대출 | Tchaikovsky : Eugene Onegin [DVD] | <NA> | 222
95 | 1001 | 관내대출 | 보이첵 ; 몸짓 콘서트 | <NA> | 221
96 | 1001 | 관내대출 | 굿모닝? 체홉 ; 혜화동1번지 '98 동인작업시리즈 | <NA> | 212
97 | 1001 | 관내대출 | 해무 : (30주년) 극단 연우무대 기념공연 [DVD] | <NA> | 211
98 | 1001 | 관내대출 | 십이야 ; 서울남산국악당 기획공연 | <NA> | 210
99 | 1001 | 관내대출 | (2004) 연극열전 ; 남자충동 ; (2004) 연극열전 | <NA> | 210