Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 7562
Missing cells (%): 18.9%
Duplicate rows: 605
Duplicate rows (%): 6.0%
Total size in memory: 390.6 KiB
Average record size in memory: 40.0 B
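
The two derived figures above can be checked from the raw counts. The following is a minimal Python sketch, not the profiler's own code. It assumes that the missing-cell percentage is taken over all cells (observations times variables) and that 1 KiB is 1024 bytes; neither is stated in the report, and the variable names are illustrative.

```python
# Figures copied from the dataset statistics above.
n_variables = 4
n_observations = 10_000
missing_cells = 7_562
total_size_kib = 390.6

# Assumption: the denominator is every cell in the table (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Under these assumptions both values round to the figures shown in the report.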

Variable types

Text: 2
Categorical: 1
DateTime: 1

Dataset

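Description: View-record information for Hwaseong City e-books. Each record lists the book code (북코드) of the viewed book, the access device (접속기기), the access location (접속 위치), and the viewing date and time (열람일시). Book code details can be looked up in the ebook holdings information dataset.
Author: Hwaseong-si, Gyeonggi-do
URL: https://www.data.go.kr/data/15093334/fileData.do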

Alerts

Dataset has 605 (6.0%) duplicate rows [Duplicates]
접속 위치 has 7562 (75.6%) missing values [Missing]

Reproduction

Analysis started: 2023-12-12 21:12:17.024888
Analysis finished: 2023-12-12 21:12:17.404819
Duration: 0.38 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

북코드
Text

Distinct: 515
Distinct (%): 5.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 120000
Distinct characters: 62
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
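
As an illustration of the character properties mentioned above, here is a minimal sketch that counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's implementation, and the helper name `category_counts` is hypothetical. The codes `Lu` and `Nd` correspond to the "Uppercase Letter" and "Decimal Number" categories in the tables of this report.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category (e.g. 'Lu', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)


# Sample value taken from the report (first sample row of 북코드).
counts = category_counts("5O5NR0QL0979")
print(counts)
```

Summing such counters over every value of a column gives a per-category total like the ones listed under "Most occurring categories".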

Unique

Unique: 158
Unique (%): 1.6%

Sample

1st row: 5O5NR0QL0979
2nd row: U8CRQ3LGTM8O
3rd row: T6TDHYZPVI2G
4th row: K4W98W272L4M
5th row: AUFSHEK8W4RZ

Value                 Count   Frequency (%)
zgg7ltlw150c            896    9.0%
3rmbv99x4t7u            763    7.6%
52yw5v9makur            491    4.9%
wyb9qtyble6w            445    4.5%
t6tdhyzpvi2g            426    4.3%
l87on2tew8ta            407    4.1%
1hg6559yudfp            379    3.8%
2dgg8jh4m1g7            305    3.0%
lkqj4uvid146            250    2.5%
hdyl3eej48td            225    2.2%
Other values (487)     5413   54.1%

Most occurring characters

Value                 Count   Frequency (%)
T                      5755    4.8%
9                      5715    4.8%
5                      5501    4.6%
L                      5045    4.2%
G                      4893    4.1%
Y                      4865    4.1%
W                      4651    3.9%
7                      4352    3.6%
1                      3963    3.3%
V                      3930    3.3%
Other values (52)     71330   59.4%

Most occurring categories

Value                 Count   Frequency (%)
Uppercase Letter      81943   68.3%
Decimal Number        37870   31.6%
Lowercase Letter        187    0.2%

Most frequent character per category

Uppercase Letter
Value                 Count   Frequency (%)
T                      5755    7.0%
L                      5045    6.2%
G                      4893    6.0%
Y                      4865    5.9%
W                      4651    5.7%
V                      3930    4.8%
B                      3666    4.5%
D                      3641    4.4%
M                      3427    4.2%
U                      3285    4.0%
Other values (16)     38785   47.3%

Lowercase Letter
Value                 Count   Frequency (%)
m                        15    8.0%
k                        13    7.0%
d                        12    6.4%
e                        12    6.4%
g                        12    6.4%
l                        11    5.9%
s                        11    5.9%
r                        10    5.3%
u                         9    4.8%
w                         8    4.3%
Other values (16)        74   39.6%

Decimal Number
Value                 Count   Frequency (%)
9                      5715   15.1%
5                      5501   14.5%
7                      4352   11.5%
1                      3963   10.5%
2                      3874   10.2%
4                      3568    9.4%
8                      3071    8.1%
6                      2921    7.7%
0                      2808    7.4%
3                      2097    5.5%

Most occurring scripts

Value                 Count   Frequency (%)
Latin                 82130   68.4%
Common                37870   31.6%

Most frequent character per script

Latin
Value                 Count   Frequency (%)
T                      5755    7.0%
L                      5045    6.1%
G                      4893    6.0%
Y                      4865    5.9%
W                      4651    5.7%
V                      3930    4.8%
B                      3666    4.5%
D                      3641    4.4%
M                      3427    4.2%
U                      3285    4.0%
Other values (42)     38972   47.5%

Common
Value                 Count   Frequency (%)
9                      5715   15.1%
5                      5501   14.5%
7                      4352   11.5%
1                      3963   10.5%
2                      3874   10.2%
4                      3568    9.4%
8                      3071    8.1%
6                      2921    7.7%
0                      2808    7.4%
3                      2097    5.5%

Most occurring blocks

Value                 Count   Frequency (%)
ASCII                120000  100.0%

Most frequent character per block

ASCII
Value                 Count   Frequency (%)
T                      5755    4.8%
9                      5715    4.8%
5                      5501    4.6%
L                      5045    4.2%
G                      4893    4.1%
Y                      4865    4.1%
W                      4651    3.9%
7                      4352    3.6%
1                      3963    3.3%
V                      3930    3.3%
Other values (52)     71330   59.4%

접속기기
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

pc: 8801
mobile: 1199

Length

Max length: 6
Median length: 2
Mean length: 2.4796
Min length: 2
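
These length statistics are consistent with the two category counts reported for this variable (pc 8801, mobile 1199). The sketch below recomputes the mean length and the frequency shares from those counts; it is a worked check with illustrative names, not the profiler's code.

```python
# Value counts copied from the report for 접속기기.
value_counts = {"pc": 8801, "mobile": 1199}

total = sum(value_counts.values())

# Mean string length, weighted by how often each value occurs.
mean_length = sum(len(value) * n for value, n in value_counts.items()) / total

# Share of each value in percent, rounded to one decimal place.
shares = {value: round(100 * n / total, 1) for value, n in value_counts.items()}

print(mean_length)
print(shares)
```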

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: pc
2nd row: pc
3rd row: pc
4th row: pc
5th row: pc

Common Values

Value                 Count   Frequency (%)
pc                     8801   88.0%
mobile                 1199   12.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of common values; pc 8801 (88.0%), mobile 1199 (12.0%)]

접속 위치
Text

MISSING 

Distinct: 460
Distinct (%): 18.9%
Missing: 7562
Missing (%): 75.6%
Memory size: 156.2 KiB

Length

Max length: 200
Median length: 197
Mean length: 57.529532
Min length: 16

Characters and Unicode

Total characters: 140257
Distinct characters: 72
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 319
Unique (%): 13.1%

Sample

1st row: http://ebook.hscity.go.kr/html/book/index/gallery/latest/10/1
2nd row: http://ebook.hscity.go.kr/html/book/index/gallery/latest/11/1
3rd row: http://ebook.hscity.go.kr/html/
4th row: http://reurl.kr/DAF9DB4AI
5th row: http://ebook.hscity.go.kr/html/book/index/gallery/latest/24/1

Value                                                            Count   Frequency (%)
http://ebook.hscity.go.kr/html                                     200    8.2%
http://ebook.hscity.go.kr/html/book/index/gallery/latest/24/1      155    6.4%
http://ebook.hscity.go.kr/html/book/index/gallery/latest/20/1      143    5.9%
http://reurl.kr                                                    134    5.5%
http://ebook.hscity.go.kr/html/m                                   116    4.8%
http://ebook.hscity.go.kr/html/book/index/gallery/latest/25/1      110    4.5%
http://ebook.hscity.go.kr/html/book/index/gallery/latest/7/1        68    2.8%
http://ebook.hscity.go.kr/html/book/index/gallery/latest/20/2       68    2.8%
http://ebook.hscity.go.kr/html/book/index/gallery/latest/23/1       67    2.7%
http://ebook.hscity.go.kr/html/book/index/gallery/latest/16/1       51    2.1%
Other values (441)                                                1326   54.4%

Most occurring characters

Value                 Count   Frequency (%)
/                     16094   11.5%
t                     11857    8.5%
o                      8834    6.3%
e                      7674    5.5%
h                      7072    5.0%
.                      6829    4.9%
l                      5912    4.2%
r                      5829    4.2%
k                      5720    4.1%
s                      4202    3.0%
Other values (62)     60234   42.9%

Most occurring categories

Value                   Count   Frequency (%)
Lowercase Letter        90146   64.3%
Other Punctuation       28088   20.0%
Uppercase Letter        12093    8.6%
Decimal Number           9010    6.4%
Math Symbol               693    0.5%
Connector Punctuation     210    0.1%
Dash Punctuation           17   < 0.1%

Most frequent character per category

Lowercase Letter
Value                 Count   Frequency (%)
t                     11857   13.2%
o                      8834    9.8%
e                      7674    8.5%
h                      7072    7.8%
l                      5912    6.6%
r                      5829    6.5%
k                      5720    6.3%
s                      4202    4.7%
i                      3710    4.1%
y                      3474    3.9%
Other values (16)     25862   28.7%

Uppercase Letter
Value                 Count   Frequency (%)
J                      2413   20.0%
U                      1572   13.0%
E                      1226   10.1%
D                       995    8.2%
T                       910    7.5%
V                       900    7.4%
C                       808    6.7%
B                       772    6.4%
F                       596    4.9%
I                       451    3.7%
Other values (16)      1450   12.0%

Decimal Number
Value                 Count   Frequency (%)
2                      1875   20.8%
1                      1818   20.2%
0                      1216   13.5%
4                       889    9.9%
9                       699    7.8%
3                       639    7.1%
5                       587    6.5%
8                       580    6.4%
7                       400    4.4%
6                       307    3.4%

Other Punctuation
Value                 Count   Frequency (%)
/                     16094   57.3%
.                      6829   24.3%
:                      2441    8.7%
%                      2122    7.6%
&                       469    1.7%
?                       133    0.5%

Math Symbol
Value                 Count   Frequency (%)
=                       623   89.9%
+                        70   10.1%

Connector Punctuation
Value                 Count   Frequency (%)
_                       210  100.0%

Dash Punctuation
Value                 Count   Frequency (%)
-                        17  100.0%

Most occurring scripts

Value                 Count   Frequency (%)
Latin                102239   72.9%
Common                38018   27.1%

Most frequent character per script

Latin
Value                 Count   Frequency (%)
t                     11857   11.6%
o                      8834    8.6%
e                      7674    7.5%
h                      7072    6.9%
l                      5912    5.8%
r                      5829    5.7%
k                      5720    5.6%
s                      4202    4.1%
i                      3710    3.6%
y                      3474    3.4%
Other values (42)     37955   37.1%

Common
Value                 Count   Frequency (%)
/                     16094   42.3%
.                      6829   18.0%
:                      2441    6.4%
%                      2122    5.6%
2                      1875    4.9%
1                      1818    4.8%
0                      1216    3.2%
4                       889    2.3%
9                       699    1.8%
3                       639    1.7%
Other values (10)      3396    8.9%

Most occurring blocks

Value                 Count   Frequency (%)
ASCII                140257  100.0%

Most frequent character per block

ASCII
Value                 Count   Frequency (%)
/                     16094   11.5%
t                     11857    8.5%
o                      8834    6.3%
e                      7674    5.5%
h                      7072    5.0%
.                      6829    4.9%
l                      5912    4.2%
r                      5829    4.2%
k                      5720    4.1%
s                      4202    3.0%
Other values (62)     60234   42.9%

열람일시
DateTime

Distinct: 6373
Distinct (%): 63.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2020-09-11 11:42:00
Maximum: 2021-09-29 11:43:00
Histogram with fixed size bins (bins=50)
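
A fixed-size-bin histogram of this kind can be sketched in plain Python. The example below splits the reported range (Minimum to Maximum) into 50 equal-width bins and assigns timestamps to them. It is an assumption-laden illustration, not the plotting code behind the report: the helper names `bin_index` and `histogram` are hypothetical, and the choice to place the maximum in the last bin is a convention picked for this sketch.

```python
from datetime import datetime

# Range reported for the variable (Minimum / Maximum in the report).
t_min = datetime(2020, 9, 11, 11, 42)
t_max = datetime(2021, 9, 29, 11, 43)
N_BINS = 50

# Every bin covers the same time span (a timedelta).
bin_width = (t_max - t_min) / N_BINS


def bin_index(ts: datetime) -> int:
    """Return the 0-based bin of `ts`; the maximum is clamped into the last bin."""
    idx = int((ts - t_min) / bin_width)
    return min(idx, N_BINS - 1)


def histogram(timestamps):
    """Count how many timestamps fall into each of the N_BINS bins."""
    counts = [0] * N_BINS
    for ts in timestamps:
        counts[bin_index(ts)] += 1
    return counts


# Two timestamps taken from the sample rows of the report.
sample = [datetime(2021, 9, 17, 14, 15), datetime(2020, 9, 28, 14, 47)]
counts = histogram(sample)
print(bin_width, counts)
```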

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
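
The data behind both nullity views can be sketched with a few rows. In the example below the rows are taken from the sample values in this report, `None` stands for `<NA>`, and the variable names are illustrative; it is a minimal sketch, not the code that produced the plots.

```python
# Column names as they appear in the dataset.
COLUMNS = ["북코드", "접속기기", "접속 위치", "열람일시"]

# Three rows taken from the report's sample; None stands for <NA>.
rows = [
    ("5O5NR0QL0979", "pc", None, "2021-09-17 14:15"),
    ("U8CRQ3LGTM8O", "pc", None, "2021-02-19 9:05"),
    ("AUFSHEK8W4RZ", "pc",
     "http://ebook.hscity.go.kr/html/book/index/gallery/latest/10/1",
     "2020-09-28 14:47"),
]

# Nullity matrix: one row per record, True where the cell holds a value.
matrix = [[cell is not None for cell in row] for row in rows]

# Per-column count of present values, the quantity a nullity bar chart shows.
present = {col: sum(r[i] for r in matrix) for i, col in enumerate(COLUMNS)}
print(present)
```

Only 접속 위치 has gaps in these rows, which matches the Missing alert for that column.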

Sample

       북코드        접속기기  접속 위치                                                      열람일시
87280  5O5NR0QL0979  pc        <NA>                                                           2021-09-17 14:15
38028  U8CRQ3LGTM8O  pc        <NA>                                                           2021-02-19 9:05
47573  T6TDHYZPVI2G  pc        <NA>                                                           2021-03-19 14:07
82211  K4W98W272L4M  pc        <NA>                                                           2021-08-21 8:56
1327   AUFSHEK8W4RZ  pc        http://ebook.hscity.go.kr/html/book/index/gallery/latest/10/1  2020-09-28 14:47
27509  ZGG7LTLW150C  pc        <NA>                                                           2020-12-24 15:43
20816  ZGG7LTLW150C  pc        <NA>                                                           2020-12-11 18:01
85163  M7VG5FPZJLMZ  pc        <NA>                                                           2021-09-08 18:08
54975  OGU0EP9VGTDM  pc        http://ebook.hscity.go.kr/html/book/index/gallery/latest/11/1  2021-04-25 16:42
24904  GT805I32VBZ9  pc        <NA>                                                           2020-12-23 14:41

       북코드        접속기기  접속 위치  열람일시
6018   3RMBV99X4T7U  pc        <NA>       2020-11-03 14:25
12869  1HG6559YUDFP  pc        <NA>       2020-11-18 15:44
34345  52YW5V9MAKUR  pc        <NA>       2021-01-29 13:57
85320  D9K2YNBY81CB  pc        <NA>       2021-09-09 18:10
68314  1Y9Y4HIFZ59B  pc        <NA>       2021-06-18 18:03
5206   3RMBV99X4T7U  pc        <NA>       2020-11-03 14:17
10115  1HG6559YUDFP  pc        <NA>       2020-11-18 15:29
86146  3W6YMIZPLIGJ  pc        <NA>       2021-09-15 21:09
50850  2DGG8JH4M1G7  pc        <NA>       2021-04-03 15:46
8709   C4DMLJVABZXZ  pc        <NA>       2020-11-18 12:58

Duplicate rows

Most frequently occurring

     북코드        접속기기  접속 위치  열람일시          # duplicates
442  T6TDHYZPVI2G  pc        <NA>       2021-03-19 14:06  20
12   1HG6559YUDFP  pc        <NA>       2020-11-18 15:31  18
19   1HG6559YUDFP  pc        <NA>       2020-11-18 15:38  18
196  52YW5V9MAKUR  pc        <NA>       2021-01-29 13:53  18
110  3RMBV99X4T7U  pc        <NA>       2020-11-03 14:15  17
183  52YW5V9MAKUR  pc        <NA>       2021-01-29 13:40  17
199  52YW5V9MAKUR  pc        <NA>       2021-01-29 13:56  17
439  T6TDHYZPVI2G  pc        <NA>       2021-03-19 14:03  17
586  ZGG7LTLW150C  pc        <NA>       2020-12-24 15:27  17
120  3RMBV99X4T7U  pc        <NA>       2020-11-03 14:25  16
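
A "most frequently occurring" listing like the one above can be produced by grouping identical rows and counting them. The sketch below does this with `collections.Counter` on a handful of illustrative rows shaped like the table (`None` stands for `<NA>`). The counts are those of the toy input, not of the report, and this is one possible way to count duplicates, not necessarily how the report's figure of 605 was derived.

```python
from collections import Counter

# Illustrative rows shaped like the table above; None stands for <NA>.
rows = [
    ("T6TDHYZPVI2G", "pc", None, "2021-03-19 14:06"),
    ("T6TDHYZPVI2G", "pc", None, "2021-03-19 14:06"),
    ("1HG6559YUDFP", "pc", None, "2020-11-18 15:31"),
    ("T6TDHYZPVI2G", "pc", None, "2021-03-19 14:06"),
    ("ZGG7LTLW150C", "pc", None, "2020-12-24 15:27"),
]

# Count how often each complete row occurs.
occurrences = Counter(rows)

# Keep only rows that occur more than once, most frequent first.
duplicated = {row: n for row, n in occurrences.items() if n > 1}
most_frequent = sorted(duplicated.items(), key=lambda kv: kv[1], reverse=True)

for row, n in most_frequent:
    print(row, n)
```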