Overview

Dataset statistics

Number of variables: 3
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 10
Duplicate rows (%): 0.1%
Total size in memory: 312.5 KiB
Average record size in memory: 32.0 B

Variable types

Text: 1
DateTime: 1
Categorical: 1

Dataset

Description: File download (파일 다운로드)
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15644/F/1/datasetView.do

Alerts

Dataset has 10 (0.1%) duplicate rows (alert type: Duplicates)

Reproduction

Analysis started: 2024-03-13 19:20:55.642144
Analysis finished: 2024-03-13 19:20:55.838346
Duration: 0.2 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

자전거번호 (bicycle number)
Text

Distinct: 7956
Distinct (%): 79.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
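
As a rough illustration of the character properties mentioned above, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` and the `CATEGORY_NAMES` mapping are made up for this example, and the input is only the five sample values shown in this report, not the full column. The standard library does not expose script or block properties, so only categories are covered.

```python
import unicodedata
from collections import Counter

# Readable names for the general-category codes that occur in this column.
# Any other code is reported as its raw two-letter form.
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Pd": "Dash Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# The five sample values listed in the report (illustrative subset only).
sample = ["SPB-50859", "SPB-50211", "SPB-32017", "SPB-46110", "SPB-35521"]
counts = category_counts(sample)
for name, n in counts.most_common():
    print(name, n)
```

Each identifier contributes five digits, three uppercase letters, and one dash, which matches the 55.6% / 33.3% / 11.1% split reported for the whole column.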

Unique

Unique: 6293
Unique (%): 62.9%

Sample

1st row: SPB-50859
2nd row: SPB-50211
3rd row: SPB-32017
4th row: SPB-46110
5th row: SPB-35521

Value                 Count   Frequency (%)
spb-53008             8       0.1%
spb-34154             7       0.1%
spb-46499             5       < 0.1%
spb-48861             5       < 0.1%
spb-34434             5       < 0.1%
spb-32419             5       < 0.1%
spb-43158             5       < 0.1%
spb-42771             5       < 0.1%
spb-48648             5       < 0.1%
spb-33543             4       < 0.1%
Other values (7946)   9946    99.5%

Most occurring characters

Value              Count   Frequency (%)
S                  10000   11.1%
P                  10000   11.1%
B                  10000   11.1%
-                  10000   11.1%
3                  8157    9.1%
4                  7413    8.2%
5                  6703    7.4%
1                  4416    4.9%
0                  4354    4.8%
2                  4227    4.7%
Other values (4)   14730   16.4%

Most occurring categories

Value              Count   Frequency (%)
Decimal Number     50000   55.6%
Uppercase Letter   30000   33.3%
Dash Punctuation   10000   11.1%

Most frequent character per category

Decimal Number
Value   Count   Frequency (%)
3       8157    16.3%
4       7413    14.8%
5       6703    13.4%
1       4416    8.8%
0       4354    8.7%
2       4227    8.5%
8       3902    7.8%
7       3806    7.6%
6       3659    7.3%
9       3363    6.7%

Uppercase Letter
Value   Count   Frequency (%)
S       10000   33.3%
P       10000   33.3%
B       10000   33.3%

Dash Punctuation
Value   Count   Frequency (%)
-       10000   100.0%

Most occurring scripts

Value    Count   Frequency (%)
Common   60000   66.7%
Latin    30000   33.3%

Most frequent character per script

Common
Value   Count   Frequency (%)
-       10000   16.7%
3       8157    13.6%
4       7413    12.4%
5       6703    11.2%
1       4416    7.4%
0       4354    7.3%
2       4227    7.0%
8       3902    6.5%
7       3806    6.3%
6       3659    6.1%

Latin
Value   Count   Frequency (%)
S       10000   33.3%
P       10000   33.3%
B       10000   33.3%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   90000   100.0%

Most frequent character per block

ASCII
Value              Count   Frequency (%)
S                  10000   11.1%
P                  10000   11.1%
B                  10000   11.1%
-                  10000   11.1%
3                  8157    9.1%
4                  7413    8.2%
5                  6703    7.4%
1                  4416    4.9%
0                  4354    4.8%
2                  4227    4.7%
Other values (4)   14730   16.4%

등록일시 (registration date and time)
DateTime

Distinct: 183
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2021-07-01 00:00:00
Maximum: 2021-12-30 00:00:00
Histogram with fixed size bins (bins=50)
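
For readers unfamiliar with fixed-size binning, the following is a minimal sketch of how timestamps between the minimum and maximum can be assigned to equally wide bins. It is not the code ydata-profiling uses; the function name `fixed_width_histogram` and the four example dates are assumptions chosen to fall inside the 2021-07-01 to 2021-12-30 range reported above.

```python
from datetime import datetime


def fixed_width_histogram(timestamps, bins=50):
    """Count timestamps in `bins` equally wide intervals spanning min..max."""
    lo, hi = min(timestamps), max(timestamps)
    span = (hi - lo).total_seconds()
    counts = [0] * bins
    for ts in timestamps:
        if span == 0:
            idx = 0  # all values identical: everything falls in the first bin
        else:
            idx = int((ts - lo).total_seconds() / span * bins)
        # The maximum value would land at index `bins`; clamp it into the last bin.
        counts[min(idx, bins - 1)] += 1
    return counts


# Hypothetical dates within the reported range, for illustration only.
dates = [
    datetime(2021, 7, 1),
    datetime(2021, 7, 1),
    datetime(2021, 10, 1),
    datetime(2021, 12, 30),
]
two_bins = fixed_width_histogram(dates, bins=2)
fifty_bins = fixed_width_histogram(dates, bins=50)
print(two_bins)
```

With 183 distinct days and 50 bins, each bin covers a little under four days, so neighbouring days are pooled together in the plot.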

고장구분 (fault type)
Categorical

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

기타 (other): 2818
타이어 (tire): 2057
체인 (chain): 1951
안장 (saddle): 1847
페달 (pedal): 879

Length

Max length: 4
Median length: 3
Mean length: 2.738
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 체인 (chain)
2nd row: 타이어 (tire)
3rd row: 기타 (other)
4th row: 타이어 (tire)
5th row: 체인 (chain)

Common Values

Value               Count   Frequency (%)
기타 (other)        2818    28.2%
타이어 (tire)       2057    20.6%
체인 (chain)        1951    19.5%
안장 (saddle)       1847    18.5%
페달 (pedal)        879     8.8%
단말기 (terminal)   448     4.5%

Length

Histogram of lengths of the category


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
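
To make the two nullity views concrete, here is a small standard-library sketch: one function counts non-missing values per column, the other renders one text line per row with a filled mark for present cells. The function names are invented for this example, the rows are hypothetical (the profiled dataset itself has no missing cells), and treating `None` and the empty string as missing is an assumption.

```python
def is_missing(value):
    """Assumption for this sketch: None and empty strings count as missing."""
    return value is None or value == ""


def nullity_by_column(rows, columns):
    """Return the number of non-missing values for each column name."""
    present = {name: 0 for name in columns}
    for row in rows:
        for name, value in zip(columns, row):
            if not is_missing(value):
                present[name] += 1
    return present


def nullity_matrix(rows):
    """Render one string per row: '#' for a present cell, '.' for a missing one."""
    return ["".join("." if is_missing(v) else "#" for v in row) for row in rows]


columns = ["자전거번호", "등록일시", "고장구분"]
# Hypothetical rows with gaps, purely to show the output shape.
rows = [
    ("SPB-50859", "2021-11-23", "체인"),
    ("SPB-50211", None, "타이어"),
    ("SPB-32017", "2021-08-19", ""),
]
summary = nullity_by_column(rows, columns)
matrix = nullity_matrix(rows)
print(summary)
print("\n".join(matrix))
```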

Sample

Index   자전거번호   등록일시     고장구분
85289   SPB-50859    2021-11-23   체인
18983   SPB-50211    2021-08-02   타이어
30583   SPB-32017    2021-08-19   기타
19583   SPB-46110    2021-08-03   타이어
76393   SPB-35521    2021-11-01   체인
25062   SPB-43149    2021-08-11   타이어
36163   SPB-47135    2021-08-29   기타
80375   SPB-48179    2021-11-11   안장
73957   SPB-40683    2021-10-28   체인
91455   SPB-48731    2021-12-13   안장

Index   자전거번호   등록일시     고장구분
4934    SPB-31325    2021-07-09   체인
74759   SPB-40236    2021-10-29   기타
60910   SPB-43231    2021-10-05   단말기
36778   SPB-45876    2021-08-30   타이어
55804   SPB-42269    2021-09-26   기타
19541   SPB-38013    2021-08-03   안장
30164   SPB-31092    2021-08-19   체인
67834   SPB-47111    2021-10-17   안장
78012   SPB-43273    2021-11-04   체인
61066   SPB-47488    2021-10-05   단말기

Duplicate rows

Most frequently occurring

Index   자전거번호   등록일시     고장구분   # duplicates
0       SPB-31187    2021-09-10   페달       2
1       SPB-31279    2021-09-02   체인       2
2       SPB-32593    2021-08-28   타이어     2
3       SPB-33401    2021-07-04   안장       2
4       SPB-33517    2021-08-10   체인       2
5       SPB-37222    2021-09-16   체인       2
6       SPB-43277    2021-07-06   타이어     2
7       SPB-47682    2021-09-21   타이어     2
8       SPB-51582    2021-08-05   타이어     2
9       SPB-53899    2021-08-11   안장       2
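
A listing like the one above can be approximated by grouping identical rows and keeping the groups that occur more than once. The sketch below does this with the standard library; `duplicate_groups` is a hypothetical helper, the rows are a hand-built subset mixing two duplicated records from this report with one unique record, and counting distinct duplicated groups (rather than surplus rows) is an assumption about how the "10 duplicate rows" figure is defined. In this dataset the two definitions agree, since every duplicated row occurs exactly twice.

```python
from collections import Counter


def duplicate_groups(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Illustrative subset: (자전거번호, 등록일시, 고장구분) tuples.
rows = [
    ("SPB-31187", "2021-09-10", "페달"),
    ("SPB-50859", "2021-11-23", "체인"),
    ("SPB-31187", "2021-09-10", "페달"),
    ("SPB-31279", "2021-09-02", "체인"),
    ("SPB-31279", "2021-09-02", "체인"),
]
groups = duplicate_groups(rows)
for row, n in groups.items():
    print(row, n)
```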