gimi9 Pandas Profiling

Dataset statistics

Number of variables	3
Number of observations	10000
Missing cells	0
Missing cells (%)	0.0%
Duplicate rows	0
Duplicate rows (%)	0.0%
Total size in memory	312.5 KiB
Average record size in memory	32.0 B
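The average record size appears to be the total memory footprint divided by the number of observations. A minimal sketch of that arithmetic, using only the figures reported above (the variable names are illustrative, not part of the report):

```python
# Figures copied from the "Dataset statistics" table above.
total_kib = 312.5          # Total size in memory, in KiB
n_observations = 10_000    # Number of observations

total_bytes = total_kib * 1024                   # 1 KiB = 1024 bytes
avg_record_bytes = total_bytes / n_observations  # bytes per row

print(f"{avg_record_bytes:.1f} B")  # 32.0 B
```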

Variable types

Text	1
DateTime	1
Categorical	1

Dataset

Description	File download
Author	Seoul Metropolitan Government (서울특별시)
URL	https://data.seoul.go.kr/dataList/OA-15644/F/1/datasetView.do

Reproduction

Analysis started	2024-03-13 19:21:08.692964
Analysis finished	2024-03-13 19:21:08.926277
Duration	0.23 seconds
Software version	ydata-profiling v4.5.1
Download configuration	config.json
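The reported duration can be checked from the two timestamps above. This is a small sketch using only the Python standard library; the format string is an assumption about how the timestamps are laid out.

```python
from datetime import datetime

# Assumed layout of the timestamps shown in the Reproduction block.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-03-13 19:21:08.692964", FMT)
finished = datetime.strptime("2024-03-13 19:21:08.926277", FMT)

# Subtracting two datetimes gives a timedelta; convert it to seconds.
duration = (finished - started).total_seconds()

print(f"{duration:.2f} seconds")  # 0.23 seconds
```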

자전거번호 (bicycle number)
Text

Distinct	8257
Distinct (%)	82.6%
Missing	0
Missing (%)	0.0%
Memory size	156.2 KiB

Length

Max length	9
Median length	9
Mean length	9
Min length	9

Characters and Unicode

Total characters	90000
Distinct characters	14
Distinct categories	3
Distinct scripts	2
Distinct blocks	1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
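As a rough illustration of the category breakdown, the sketch below counts Unicode general categories with Python's standard `unicodedata` module over the five sample values listed in this report. The helper name `category_counts` is hypothetical. The standard library exposes general categories (`Nd`, `Lu`, `Pd`, which correspond to Decimal Number, Uppercase Letter and Dash Punctuation) but not scripts or blocks, so those two breakdowns are not covered here.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

# The five sample values shown in this report.
sample = ["SPB-43981", "SPB-66056", "SPB-30135", "SPB-66140", "SPB-52504"]
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```

Each identifier has five digits, three uppercase letters and one dash, which is consistent with the 50000 / 30000 / 10000 split reported for the full column.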

Unique

Unique	6799
Unique (%)	68.0%
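"Distinct" and "Unique" measure different things: distinct counts how many different values occur, while unique counts the values that occur exactly once. A minimal sketch of the distinction, using made-up identifiers (the data and the function name are hypothetical):

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique

# Hypothetical identifiers, not taken from the dataset.
ids = ["SPB-1", "SPB-2", "SPB-2", "SPB-3", "SPB-3", "SPB-3"]
distinct, unique = distinct_and_unique(ids)

print(distinct, unique)  # 3 1
```

In this column, 8257 distinct values with 6799 unique ones means that 1458 bicycle numbers appear more than once.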

Sample

1st row	SPB-43981
2nd row	SPB-66056
3rd row	SPB-30135
4th row	SPB-66140
5th row	SPB-52504

Value	Count	Frequency (%)
spb-32433	6	0.1%
spb-33499	5	< 0.1%
spb-64432	5	< 0.1%
spb-66726	5	< 0.1%
spb-32139	5	< 0.1%
spb-47011	5	< 0.1%
spb-32952	5	< 0.1%
spb-31386	4	< 0.1%
spb-54658	4	< 0.1%
spb-31346	4	< 0.1%
Other values (8247)	9952	99.5%

Most occurring characters

Value	Count	Frequency (%)
S	10000	11.1%
P	10000	11.1%
B	10000	11.1%
-	10000	11.1%
5	6904	7.7%
4	6562	7.3%
3	6523	7.2%
6	6282	7.0%
8	4124	4.6%
0	4098	4.6%
Other values (4)	15507	17.2%

Most occurring categories

Value	Count	Frequency (%)
Decimal Number	50000	55.6%
Uppercase Letter	30000	33.3%
Dash Punctuation	10000	11.1%

Most frequent character per category

Decimal Number

Value	Count	Frequency (%)
5	6904	13.8%
4	6562	13.1%
3	6523	13.0%
6	6282	12.6%
8	4124	8.2%
0	4098	8.2%
2	4035	8.1%
1	4018	8.0%
7	3844	7.7%
9	3610	7.2%

Uppercase Letter

Value	Count	Frequency (%)
S	10000	33.3%
P	10000	33.3%
B	10000	33.3%

Dash Punctuation

Value	Count	Frequency (%)
-	10000	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Common	60000	66.7%
Latin	30000	33.3%

Most frequent character per script

Common

Value	Count	Frequency (%)
-	10000	16.7%
5	6904	11.5%
4	6562	10.9%
3	6523	10.9%
6	6282	10.5%
8	4124	6.9%
0	4098	6.8%
2	4035	6.7%
1	4018	6.7%
7	3844	6.4%

Latin

Value	Count	Frequency (%)
S	10000	33.3%
P	10000	33.3%
B	10000	33.3%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	90000	100.0%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
S	10000	11.1%
P	10000	11.1%
B	10000	11.1%
-	10000	11.1%
5	6904	7.7%
4	6562	7.3%
3	6523	7.2%
6	6282	7.0%
8	4124	4.6%
0	4098	4.6%
Other values (4)	15507	17.2%

등록일시 (registration date and time)
Date

Distinct	9423
Distinct (%)	94.2%
Missing	0
Missing (%)	0.0%
Memory size	156.2 KiB

Minimum	2023-01-01 01:49:00
Maximum	2023-06-30 22:31:00

Histogram

Histogram with fixed size bins (bins=50)
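The histogram image itself is not preserved in this text, but fixed-size binning can be sketched as follows. This is one plausible way to assign a timestamp to one of 50 equal-width bins between the reported minimum and maximum, not necessarily how the profiling tool computes it; `bin_index` is a hypothetical helper.

```python
from datetime import datetime

def bin_index(ts, lo, hi, bins=50):
    """Return the 0-based index of the equal-width bin that contains ts."""
    width = (hi - lo) / bins        # timedelta covered by one bin
    idx = int((ts - lo) / width)    # timedelta / timedelta gives a float
    return min(idx, bins - 1)       # the maximum itself goes in the last bin

# Minimum and maximum as reported above.
lo = datetime(2023, 1, 1, 1, 49)
hi = datetime(2023, 6, 30, 22, 31)

# One of the sample timestamps from this report.
example = datetime(2023, 4, 17, 16, 34)
print(bin_index(example, lo, hi))
```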

구분 (category)
Categorical

Distinct	6
Distinct (%)	0.1%
Missing	0
Missing (%)	0.0%
Memory size	156.2 KiB

기타 (other)	3106
체인 (chain)	1912
안장 (saddle)	1897
타이어 (tire)	1695
페달 (pedal)	897

Length

Max length	4
Median length	3
Mean length	2.6989
Min length	2

Unique

Unique	0
Unique (%)	0.0%

Sample

1st row	기타 (other)
2nd row	안장 (saddle)
3rd row	기타 (other)
4th row	안장 (saddle)
5th row	기타 (other)

Common Values

Value	Count	Frequency (%)
기타 (other)	3106	31.1%
체인 (chain)	1912	19.1%
안장 (saddle)	1897	19.0%
타이어 (tire)	1695	17.0%
페달 (pedal)	897	9.0%
단말기 (terminal)	493	4.9%
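The Frequency (%) column is each count divided by the total number of observations. A short sketch that recomputes the shares from the counts in the table, keeping the original Korean category labels as dictionary keys:

```python
# Counts copied from the Common Values table above.
counts = {
    "기타": 3106,    # other
    "체인": 1912,    # chain
    "안장": 1897,    # saddle
    "타이어": 1695,  # tire
    "페달": 897,     # pedal
    "단말기": 493,   # terminal
}

total = sum(counts.values())  # should equal the number of observations
shares = {value: 100 * n / total for value, n in counts.items()}

for value, pct in shares.items():
    print(f"{value}\t{counts[value]}\t{pct:.2f}%")
```

The six counts sum to 10000, which matches the number of observations, so no category is hidden behind an "Other values" row.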

Length

Histogram of lengths of the category

Common Values (Plot)


Missing values

Count: A simple visualization of nullity by column.

Matrix: The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
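The per-column nullity count behind such a view can be sketched in a few lines. The rows below are the first three sample rows of this dataset plus one clearly hypothetical row with a missing value, added only to show a non-zero count; `missing_per_column` is an illustrative helper, not part of the profiling tool.

```python
def missing_per_column(rows, columns):
    """Count rows where each column is absent or None."""
    return {col: sum(1 for row in rows if row.get(col) is None) for col in columns}

columns = ["자전거번호", "등록일시", "구분"]

rows = [
    {"자전거번호": "SPB-43981", "등록일시": "2023-04-17 16:34", "구분": "기타"},
    {"자전거번호": "SPB-66056", "등록일시": "2023-04-28 8:21", "구분": "안장"},
    {"자전거번호": "SPB-30135", "등록일시": "2023-01-21 15:33", "구분": "기타"},
]
print(missing_per_column(rows, columns))  # every column has 0 missing

# Hypothetical row with a missing category, for illustration only.
with_gap = rows + [{"자전거번호": "SPB-00000", "등록일시": "2023-01-01 0:00", "구분": None}]
print(missing_per_column(with_gap, columns))
```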

Sample

First rows

	자전거번호 (bicycle number)	등록일시 (registration date and time)	구분 (category)
39042	SPB-43981	2023-04-17 16:34	기타 (other)
45230	SPB-66056	2023-04-28 8:21	안장 (saddle)
6566	SPB-30135	2023-01-21 15:33	기타 (other)
47550	SPB-66140	2023-05-02 20:11	안장 (saddle)
4717	SPB-52504	2023-01-17 8:24	기타 (other)
72969	SPB-41210	2023-06-12 19:02	기타 (other)
70287	SPB-80387	2023-06-08 8:51	기타 (other)
55564	SPB-63893	2023-05-16 8:58	안장 (saddle)
60884	SPB-45496	2023-05-23 20:55	체인 (chain)
46092	SPB-33511	2023-04-30 15:21	페달 (pedal)

Last rows

	자전거번호 (bicycle number)	등록일시 (registration date and time)	구분 (category)
27900	SPB-56315	2023-03-27 0:06	타이어 (tire)
28269	SPB-42219	2023-03-27 18:17	안장 (saddle)
35893	SPB-55661	2023-04-11 6:50	타이어 (tire)
82615	SPB-62832	2023-06-28 9:17	체인 (chain)
37705	SPB-62718	2023-04-14 14:52	기타 (other)
76847	SPB-66287	2023-06-18 1:18	단말기 (terminal)
41211	SPB-49427	2023-04-21 8:32	타이어 (tire)
70991	SPB-66402	2023-06-09 12:42	기타 (other)
15456	SPB-44706	2023-02-24 5:13	체인 (chain)
21445	SPB-46625	2023-03-13 7:55	기타 (other)
