Overview

Dataset statistics

Number of variables: 6
Number of observations: 515
Missing cells: 1362
Missing cells (%): 44.1%
Duplicate rows: 1
Duplicate rows (%): 0.2%
Total size in memory: 25.3 KiB
Average record size in memory: 50.3 B
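
The percentages and the average record size follow from the raw counts. A minimal sketch of the arithmetic, assuming the missing-cell percentage is taken over all cells (observations × variables) and that 1 KiB is 1024 bytes; the variable names are illustrative only:

```python
# Raw counts as reported in the dataset statistics.
n_variables = 6
n_observations = 515
missing_cells = 1362
duplicate_rows = 1
total_size_kib = 25.3

# Assumption: the missing-cell percentage is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```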

Variable types

Numeric: 1
Text: 2
Categorical: 3

Dataset

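Description: 부산광역시기장군_정관어린이도서관신착자료현황_20200731 (Gijang-gun, Busan Metropolitan City: Jeonggwan Children's Library new arrivals, 2020-07-31)
Author: 부산광역시 기장군 (Gijang-gun, Busan Metropolitan City)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15060476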

Alerts

Dataset has 1 (0.2%) duplicate rows [Duplicates]
자료실명 is highly overall correlated with 순번 and 2 other fields [High correlation]
발행년 is highly overall correlated with 발행자 and 1 other fields [High correlation]
발행자 is highly overall correlated with 발행년 and 1 other fields [High correlation]
순번 is highly overall correlated with 자료실명 [High correlation]
발행자 is highly imbalanced (78.2%) [Imbalance]
발행년 is highly imbalanced (74.9%) [Imbalance]
순번 has 454 (88.2%) missing values [Missing]
서명 has 454 (88.2%) missing values [Missing]
저작자 has 454 (88.2%) missing values [Missing]

Reproduction

Analysis started: 2023-12-10 16:24:56.586411
Analysis finished: 2023-12-10 16:24:57.851643
Duration: 1.27 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

순번
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 61
Distinct (%): 100.0%
Missing: 454
Missing (%): 88.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 31
Minimum: 1
Maximum: 61
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 16
Median: 31
Q3: 46
95-th percentile: 58
Maximum: 61
Range: 60
Interquartile range (IQR): 30
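
These quantiles are consistent with the 61 non-missing 순번 values being the integers 1 to 61. A minimal sketch under that assumption (inferred from the minimum, maximum, distinct count and sum in this report, not stated by it), using linear interpolation between order statistics:

```python
import statistics

# Assumption: the non-missing 순번 values are exactly the integers 1..61.
values = list(range(1, 62))

# method="inclusive" treats the data as the full population and
# interpolates linearly between neighbouring order statistics.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
ventiles = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = ventiles[0], ventiles[-1]

iqr = q3 - q1
value_range = max(values) - min(values)

print(p5, q1, median, q3, p95, value_range, iqr)
```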

Descriptive statistics

Standard deviation: 17.752934
Coefficient of variation (CV): 0.57267529
Kurtosis: -1.2
Mean: 31
Median Absolute Deviation (MAD): 15
Skewness: 0
Sum: 1891
Variance: 315.16667
Monotonicity: Strictly increasing
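
The dispersion figures can be cross-checked the same way. A sketch, again assuming the non-missing values are the integers 1 to 61 and that the variance is the sample variance (divisor n - 1):

```python
import statistics

# Assumption: the non-missing 순번 values are exactly the integers 1..61.
values = list(range(1, 62))

mean = statistics.fmean(values)
variance = statistics.variance(values)  # sample variance, divisor n - 1
std_dev = statistics.stdev(values)      # square root of the sample variance
cv = std_dev / mean                     # coefficient of variation

# Median absolute deviation: median of |x - median(x)|.
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

total = sum(values)
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(mean, variance, std_dev, cv, mad, total, strictly_increasing)
```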
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
47 | 1 | 0.2%
34 | 1 | 0.2%
35 | 1 | 0.2%
36 | 1 | 0.2%
37 | 1 | 0.2%
38 | 1 | 0.2%
39 | 1 | 0.2%
40 | 1 | 0.2%
41 | 1 | 0.2%
42 | 1 | 0.2%
Other values (51) | 51 | 9.9%
(Missing) | 454 | 88.2%

Value | Count | Frequency (%)
1 | 1 | 0.2%
2 | 1 | 0.2%
3 | 1 | 0.2%
4 | 1 | 0.2%
5 | 1 | 0.2%
6 | 1 | 0.2%
7 | 1 | 0.2%
8 | 1 | 0.2%
9 | 1 | 0.2%
10 | 1 | 0.2%

Value | Count | Frequency (%)
61 | 1 | 0.2%
60 | 1 | 0.2%
59 | 1 | 0.2%
58 | 1 | 0.2%
57 | 1 | 0.2%
56 | 1 | 0.2%
55 | 1 | 0.2%
54 | 1 | 0.2%
53 | 1 | 0.2%
52 | 1 | 0.2%

서명
Text

MISSING 

Distinct: 61
Distinct (%): 100.0%
Missing: 454
Missing (%): 88.2%
Memory size: 4.2 KiB

Length

Max length: 83
Median length: 30
Mean length: 25.868852
Min length: 13

Characters and Unicode

Total characters: 1578
Distinct characters: 244
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
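
As an illustration of such a property, the sketch below counts Unicode general categories for one title from this report using Python's standard `unicodedata` module. This is not how the report itself was computed; script and block properties are not exposed by that module and are left out here.

```python
import unicodedata
from collections import Counter

# One of the sample titles shown in this report.
title = "아기는 어떻게 생기나요? [DVD 녹화자료]"

# Normalise to NFC so that each Hangul syllable is a single code point.
text = unicodedata.normalize("NFC", title)

# General category per code point, e.g. "Lo" (Other Letter),
# "Zs" (Space Separator), "Lu" (Uppercase Letter).
category_counts = Counter(unicodedata.category(ch) for ch in text)

for category, count in category_counts.most_common():
    print(category, count)
```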

Unique

Unique: 61
Unique (%): 100.0%

Sample

1st row: 아기는 어떻게 생기나요? [DVD 녹화자료]
2nd row: 아기 탄생의 비밀 [DVD 녹화자료]
3rd row: 신종 바이러스 예방 대작전 [DVD 녹화자료]
4th row: 코로나19 소독 대작전! [DVD 녹화자료]
5th row: 코로나19 바이러스의 비밀 침투 [DVD 녹화자료]
Value | Count | Frequency (%)
dvd | 61 | 18.1%
녹화자료 | 55 | 16.3%
(blank) | 13 | 3.9%
tayo | 8 | 2.4%
english | 8 | 2.4%
2 | 5 | 1.5%
비밀 | 5 | 1.5%
season | 4 | 1.2%
1 | 4 | 1.2%
2-2-disk | 2 | 0.6%
Other values (158) | 172 | 51.0%

Most occurring characters

ValueCountFrequency (%)
276
 
17.5%
D 122
 
7.7%
V 63
 
4.0%
61
 
3.9%
61
 
3.9%
] 61
 
3.9%
61
 
3.9%
[ 61
 
3.9%
61
 
3.9%
i 26
 
1.6%
Other values (234) 725
45.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 673 | 42.6%
Space Separator | 276 | 17.5%
Uppercase Letter | 214 | 13.6%
Lowercase Letter | 206 | 13.1%
Close Punctuation | 64 | 4.1%
Open Punctuation | 64 | 4.1%
Decimal Number | 34 | 2.2%
Other Punctuation | 29 | 1.8%
Dash Punctuation | 12 | 0.8%
Math Symbol | 6 | 0.4%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
61
 
9.1%
61
 
9.1%
61
 
9.1%
61
 
9.1%
25
 
3.7%
15
 
2.2%
14
 
2.1%
10
 
1.5%
10
 
1.5%
8
 
1.2%
Other values (185) 347
51.6%
Lowercase Letter
ValueCountFrequency (%)
i 26
12.6%
s 23
11.2%
o 21
10.2%
a 20
9.7%
n 20
9.7%
l 13
 
6.3%
e 11
 
5.3%
d 11
 
5.3%
h 10
 
4.9%
t 10
 
4.9%
Other values (12) 41
19.9%
Uppercase Letter
ValueCountFrequency (%)
D 122
57.0%
V 63
29.4%
E 9
 
4.2%
T 8
 
3.7%
S 5
 
2.3%
R 2
 
0.9%
K 1
 
0.5%
J 1
 
0.5%
B 1
 
0.5%
A 1
 
0.5%
Decimal Number
ValueCountFrequency (%)
2 15
44.1%
1 12
35.3%
3 3
 
8.8%
9 3
 
8.8%
0 1
 
2.9%
Other Punctuation
ValueCountFrequency (%)
: 14
48.3%
. 13
44.8%
! 1
 
3.4%
? 1
 
3.4%
Close Punctuation
ValueCountFrequency (%)
] 61
95.3%
) 3
 
4.7%
Open Punctuation
ValueCountFrequency (%)
[ 61
95.3%
( 3
 
4.7%
Space Separator
ValueCountFrequency (%)
276
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 12
100.0%
Math Symbol
ValueCountFrequency (%)
= 6
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 673 | 42.6%
Common | 485 | 30.7%
Latin | 420 | 26.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
61
 
9.1%
61
 
9.1%
61
 
9.1%
61
 
9.1%
25
 
3.7%
15
 
2.2%
14
 
2.1%
10
 
1.5%
10
 
1.5%
8
 
1.2%
Other values (185) 347
51.6%
Latin
ValueCountFrequency (%)
D 122
29.0%
V 63
15.0%
i 26
 
6.2%
s 23
 
5.5%
o 21
 
5.0%
a 20
 
4.8%
n 20
 
4.8%
l 13
 
3.1%
e 11
 
2.6%
d 11
 
2.6%
Other values (23) 90
21.4%
Common
ValueCountFrequency (%)
276
56.9%
] 61
 
12.6%
[ 61
 
12.6%
2 15
 
3.1%
: 14
 
2.9%
. 13
 
2.7%
- 12
 
2.5%
1 12
 
2.5%
= 6
 
1.2%
) 3
 
0.6%
Other values (6) 12
 
2.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 905 | 57.4%
Hangul | 673 | 42.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
276
30.5%
D 122
13.5%
V 63
 
7.0%
] 61
 
6.7%
[ 61
 
6.7%
i 26
 
2.9%
s 23
 
2.5%
o 21
 
2.3%
a 20
 
2.2%
n 20
 
2.2%
Other values (39) 212
23.4%
Hangul
ValueCountFrequency (%)
61
 
9.1%
61
 
9.1%
61
 
9.1%
61
 
9.1%
25
 
3.7%
15
 
2.2%
14
 
2.1%
10
 
1.5%
10
 
1.5%
8
 
1.2%
Other values (185) 347
51.6%

저작자
Text

MISSING 

Distinct: 53
Distinct (%): 86.9%
Missing: 454
Missing (%): 88.2%
Memory size: 4.2 KiB

Length

Max length: 27
Median length: 23
Mean length: 13.655738
Min length: 6

Characters and Unicode

Total characters: 833
Distinct characters: 180
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 49
Unique (%): 80.3%

Sample

1st row: 하성현,우지연 [공]연출 ; EBS [기획·제작]
2nd row: 하성현 연출 ; EBS [기획·제작]
3rd row: 하성현 연출 ; EBS 기획·제작
4th row: 하성현,우지연 [공]연출 ; EBS 기획·제작
5th row: 하성현,우지연 [공]연출 ; EBS 기획·제작
Value | Count | Frequency (%)
감독 | 41 | 19.1%
(blank) | 16 | 7.4%
기획·제작 | 16 | 7.4%
ebs | 15 | 7.0%
공]감독 | 10 | 4.7%
교육방송 | 8 | 3.7%
연출 | 5 | 2.3%
신창환 | 4 | 1.9%
김민성 | 4 | 1.9%
하성현,우지연 | 3 | 1.4%
Other values (86) | 93 | 43.3%

Most occurring characters

ValueCountFrequency (%)
154
 
18.5%
53
 
6.4%
52
 
6.2%
20
 
2.4%
17
 
2.0%
17
 
2.0%
16
 
1.9%
] 16
 
1.9%
· 16
 
1.9%
[ 16
 
1.9%
Other values (170) 456
54.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 547 | 65.7%
Space Separator | 154 | 18.5%
Uppercase Letter | 51 | 6.1%
Other Punctuation | 49 | 5.9%
Close Punctuation | 16 | 1.9%
Open Punctuation | 16 | 1.9%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
53
 
9.7%
52
 
9.5%
20
 
3.7%
17
 
3.1%
17
 
3.1%
16
 
2.9%
16
 
2.9%
14
 
2.6%
13
 
2.4%
12
 
2.2%
Other values (157) 317
58.0%
Uppercase Letter
ValueCountFrequency (%)
B 16
31.4%
S 15
29.4%
E 15
29.4%
M 2
 
3.9%
J 2
 
3.9%
C 1
 
2.0%
Other Punctuation
ValueCountFrequency (%)
· 16
32.7%
; 16
32.7%
, 14
28.6%
. 3
 
6.1%
Space Separator
ValueCountFrequency (%)
154
100.0%
Close Punctuation
ValueCountFrequency (%)
] 16
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 16
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 547 | 65.7%
Common | 235 | 28.2%
Latin | 51 | 6.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
53
 
9.7%
52
 
9.5%
20
 
3.7%
17
 
3.1%
17
 
3.1%
16
 
2.9%
16
 
2.9%
14
 
2.6%
13
 
2.4%
12
 
2.2%
Other values (157) 317
58.0%
Common
ValueCountFrequency (%)
154
65.5%
] 16
 
6.8%
· 16
 
6.8%
[ 16
 
6.8%
; 16
 
6.8%
, 14
 
6.0%
. 3
 
1.3%
Latin
ValueCountFrequency (%)
B 16
31.4%
S 15
29.4%
E 15
29.4%
M 2
 
3.9%
J 2
 
3.9%
C 1
 
2.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 547 | 65.7%
ASCII | 270 | 32.4%
None | 16 | 1.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
154
57.0%
] 16
 
5.9%
[ 16
 
5.9%
B 16
 
5.9%
; 16
 
5.9%
S 15
 
5.6%
E 15
 
5.6%
, 14
 
5.2%
. 3
 
1.1%
M 2
 
0.7%
Other values (2) 3
 
1.1%
Hangul
ValueCountFrequency (%)
53
 
9.7%
52
 
9.5%
20
 
3.7%
17
 
3.1%
17
 
3.1%
16
 
2.9%
16
 
2.9%
14
 
2.6%
13
 
2.4%
12
 
2.2%
Other values (157) 317
58.0%
None
ValueCountFrequency (%)
· 16
100.0%

발행자
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 26
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.2 KiB

Value | Count
<NA> | 454
아이코닉스 [제조] | 8
인조인간 [제작·판매] | 7
EBS 미디어 | 7
해리슨 앤 컴퍼니 [제조·판매] | 5
Other values (21) | 34

Length

Max length: 21
Median length: 4
Mean length: 5.0737864
Min length: 4

Unique

Unique: 12
Unique (%): 2.3%

Sample

1st row: EBS 미디어
2nd row: EBS 미디어
3rd row: EBS 미디어
4th row: EBS 미디어
5th row: EBS 미디어

Common Values

Value | Count | Frequency (%)
<NA> | 454 | 88.2%
아이코닉스 [제조] | 8 | 1.6%
인조인간 [제작·판매] | 7 | 1.4%
EBS 미디어 | 7 | 1.4%
해리슨 앤 컴퍼니 [제조·판매] | 5 | 1.0%
알스컴퍼니 [제작·판매] | 3 | 0.6%
미디어포유 [제작·판매] | 3 | 0.6%
이십세기 폭스 홈 엔터테인먼트 [공급] | 3 | 0.6%
아이브엔터테인먼트 [제작] | 3 | 0.6%
다온미디어 [제작·판매] | 2 | 0.4%
Other values (16) | 20 | 3.9%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
na | 454 | 75.0%
제작·판매 | 25 | 4.1%
제조 | 8 | 1.3%
미디어 | 8 | 1.3%
아이코닉스 | 8 | 1.3%
인조인간 | 7 | 1.2%
ebs | 7 | 1.2%
제작 | 7 | 1.2%
공급 | 6 | 1.0%
해리슨 | 5 | 0.8%
Other values (31) | 70 | 11.6%

발행년
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 6
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.2 KiB

Value | Count
<NA> | 454
2020 | 48
2017 | 9
2018 | 2
2016 | 1

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 2
Unique (%): 0.4%

Sample

1st row: 2020
2nd row: 2020
3rd row: 2020
4th row: 2020
5th row: 2020

Common Values

Value | Count | Frequency (%)
<NA> | 454 | 88.2%
2020 | 48 | 9.3%
2017 | 9 | 1.7%
2018 | 2 | 0.4%
2016 | 1 | 0.2%
2011 | 1 | 0.2%
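
The 74.9% imbalance reported for 발행년 is consistent with one minus the normalised Shannon entropy of these value counts. The sketch below works under that assumption: the formula is inferred from the numbers in this report rather than stated by it, and `imbalance_score` is a hypothetical helper name.

```python
import math

# Value counts for 발행년 as listed under Common Values
# (<NA> is counted as a category of its own).
counts = {"<NA>": 454, "2020": 48, "2017": 9, "2018": 2, "2016": 1, "2011": 1}


def imbalance_score(value_counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits.

    The score is 0 for a perfectly uniform column and approaches 1
    as a single value dominates.
    """
    total = sum(value_counts.values())
    k = len(value_counts)
    if k < 2:
        return 0.0
    entropy = -sum(
        (c / total) * math.log2(c / total)
        for c in value_counts.values()
        if c > 0
    )
    return 1 - entropy / math.log2(k)


score = imbalance_score(counts)
print(f"{score:.1%}")
```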

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 454 | 88.2%
2020 | 48 | 9.3%
2017 | 9 | 1.7%
2018 | 2 | 0.4%
2016 | 1 | 0.2%
2011 | 1 | 0.2%

자료실명
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.2 KiB

Value | Count
<NA> | 454
[정관]정관어린이도서관 아동자료실 | 61

Length

Max length: 18
Median length: 4
Mean length: 5.6582524
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: [정관]정관어린이도서관 아동자료실
2nd row: [정관]정관어린이도서관 아동자료실
3rd row: [정관]정관어린이도서관 아동자료실
4th row: [정관]정관어린이도서관 아동자료실
5th row: [정관]정관어린이도서관 아동자료실

Common Values

Value | Count | Frequency (%)
<NA> | 454 | 88.2%
[정관]정관어린이도서관 아동자료실 | 61 | 11.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 454 | 78.8%
정관]정관어린이도서관 | 61 | 10.6%
아동자료실 | 61 | 10.6%

Interactions


Correlations

Variable | 순번 | 서명 | 저작자 | 발행자 | 발행년
순번 | 1.000 | 1.000 | 0.977 | 0.830 | 0.716
서명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
저작자 | 0.977 | 1.000 | 1.000 | 1.000 | 1.000
발행자 | 0.830 | 1.000 | 1.000 | 1.000 | 0.951
발행년 | 0.716 | 1.000 | 1.000 | 0.951 | 1.000

Variable | 자료실명 | 발행년 | 발행자
자료실명 | 1.000 | 1.000 | 1.000
발행년 | 1.000 | 1.000 | 0.586
발행자 | 1.000 | 0.586 | 1.000

Variable | 순번 | 발행자 | 발행년 | 자료실명
순번 | 1.000 | 0.410 | 0.340 | 1.000
발행자 | 0.410 | 1.000 | 0.586 | 1.000
발행년 | 0.340 | 0.586 | 1.000 | 1.000
자료실명 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
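
One common way to quantify nullity correlation is the Pearson correlation between per-column missing-value indicators. The sketch below assumes that definition and uses hypothetical toy rows rather than this dataset; `pearson` and `nullity` are illustrative helper names.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy rows (not taken from the dataset); None marks a missing cell.
rows = [
    {"a": 1, "b": "x", "c": 1.0},
    {"a": None, "b": None, "c": 2.0},
    {"a": 3, "b": "z", "c": None},
    {"a": None, "b": None, "c": 4.0},
]


def nullity(column):
    """Indicator vector: 1 where the cell is missing, 0 where it is present."""
    return [1 if row[column] is None else 0 for row in rows]


# Columns a and b are missing in exactly the same rows.
corr_ab = pearson(nullity("a"), nullity("b"))
# Column c is missing only where a is present.
corr_ac = pearson(nullity("a"), nullity("c"))

print(corr_ab, corr_ac)
```

In this report, 순번, 서명 and 저작자 each have 454 missing values; if they are missing in the same rows, their pairwise nullity correlation under this definition would be 1.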

Sample

Index | 순번 | 서명 | 저작자 | 발행자 | 발행년 | 자료실명
0 | 1 | 아기는 어떻게 생기나요? [DVD 녹화자료] | 하성현,우지연 [공]연출 ; EBS [기획·제작] | EBS 미디어 | 2020 | [정관]정관어린이도서관 아동자료실
1 | 2 | 아기 탄생의 비밀 [DVD 녹화자료] | 하성현 연출 ; EBS [기획·제작] | EBS 미디어 | 2020 | [정관]정관어린이도서관 아동자료실
2 | 3 | 신종 바이러스 예방 대작전 [DVD 녹화자료] | 하성현 연출 ; EBS 기획·제작 | EBS 미디어 | 2020 | [정관]정관어린이도서관 아동자료실
3 | 4 | 코로나19 소독 대작전! [DVD 녹화자료] | 하성현,우지연 [공]연출 ; EBS 기획·제작 | EBS 미디어 | 2020 | [정관]정관어린이도서관 아동자료실
4 | 5 | 코로나19 바이러스의 비밀 침투 [DVD 녹화자료] | 하성현,우지연 [공]연출 ; EBS 기획·제작 | EBS 미디어 | 2020 | [정관]정관어린이도서관 아동자료실
5 | 6 | 가면 바이러스의 공격 [DVD 녹화자료] | 박유림 연출 ; EBS 기획·제작 | EBS 미디어 | 2020 | [정관]정관어린이도서관 아동자료실
6 | 7 | 공포의 빨간 도장 바이러스 [DVD 녹화자료] | 박유림 연출 ; EBS 기획·제작 | EBS 미디어 | 2020 | [정관]정관어린이도서관 아동자료실
7 | 8 | 너를 만났다 [DVD 녹화자료] : VR 휴먼다큐멘터리 | 김종우 연출 ; MBC 기획·제작 | 미디어포유 [판매] | 2020 | [정관]정관어린이도서관 아동자료실
8 | 9 | 고스트 버스터즈 [DVD 녹화자료] | 폴 페이그 감독 | 소니 픽쳐스 홈엔터테인먼트 [공급] | 2016 | [정관]정관어린이도서관 아동자료실
9 | 10 | Tayo English [DVD 녹화자료]. 1-disc 1 | 김민성 감독 ; EBS 교육방송 기획·제작 | 아이코닉스 [제조] | 2017 | [정관]정관어린이도서관 아동자료실

Index | 순번 | 서명 | 저작자 | 발행자 | 발행년 | 자료실명
505 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
506 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
507 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
508 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
509 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
510 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
511 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
512 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
513 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
514 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

Index | 순번 | 서명 | 저작자 | 발행자 | 발행년 | 자료실명 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 454
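
The overview reports 1 duplicate row (0.2%) while this table reports 454 duplicates. One reading that reconciles the two figures is that the overview counts distinct repeated row patterns and the table counts how often that pattern occurs. A sketch under that assumption, with placeholder values for the 61 populated rows:

```python
from collections import Counter

# Assumption: 61 populated rows with distinct 순번 values 1..61
# (the other five fields are placeholders) and 454 rows whose
# six cells are all missing.
populated = [(i, "title", "author", "publisher", "year", "room") for i in range(1, 62)]
empty_row = (None,) * 6
rows = populated + [empty_row] * 454

# Group identical rows and keep only the patterns that repeat.
row_counts = Counter(rows)
duplicated = {row: n for row, n in row_counts.items() if n > 1}

n_duplicate_patterns = len(duplicated)      # distinct rows that repeat
n_occurrences = sum(duplicated.values())    # total occurrences of those rows
share = 100 * n_duplicate_patterns / len(rows)

print(n_duplicate_patterns, n_occurrences, f"{share:.1f}%")
```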