Overview

Dataset statistics

Number of variables: 3
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 18
Duplicate rows (%): 0.2%
Total size in memory: 312.5 KiB
Average record size in memory: 32.0 B
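
The two derived figures in this table follow from the raw counts. The sketch below recomputes them from the numbers in the table; the variable names are illustrative only, and it assumes 1 KiB = 1024 bytes.

```python
# Figures copied from the Dataset statistics table.
n_rows = 10_000
duplicate_rows = 18
total_size_kib = 312.5

# 18 / 10000 = 0.18%, which the report shows rounded to one decimal place.
duplicate_pct = 100 * duplicate_rows / n_rows

# Assumes 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```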

Variable types

Text: 1
DateTime: 1
Categorical: 1

Dataset

Description: File download (파일 다운로드)
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15644/A/1/datasetView.do

Alerts

Dataset has 18 (0.2%) duplicate rows (alert type: Duplicates)

Reproduction

Analysis started: 2024-05-11 00:11:44.560938
Analysis finished: 2024-05-11 00:11:45.170125
Duration: 0.61 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
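
As a quick consistency check, the reported duration can be recomputed from the two timestamps. This is a minimal sketch using only the Python standard library; it is not how the profiling tool itself measures the run.

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
started = datetime.fromisoformat("2024-05-11 00:11:44.560938")
finished = datetime.fromisoformat("2024-05-11 00:11:45.170125")

duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```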

Variables

자전거번호 (bicycle number)
Text

Distinct: 7967
Distinct (%): 79.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
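
To illustrate how such character properties can be read, here is a minimal sketch that counts Unicode general categories with Python's standard `unicodedata` module, applied to the five sample values of this variable. The codes `Nd`, `Lu`, and `Pd` are the standard abbreviations for Decimal Number, Uppercase Letter, and Dash Punctuation. The standard library does not expose script or block properties, so those are left out, and this is not necessarily how the report computes its figures.

```python
import unicodedata
from collections import Counter

# The five sample values listed for this variable.
sample = ["SPB-51759", "SPB-39634", "SPB-30136", "SPB-31922", "SPB-51618"]

# Count the general category of every character across all values.
category_counts = Counter(
    unicodedata.category(ch) for value in sample for ch in value
)

for category, count in category_counts.most_common():
    print(category, count)
```

Each value contributes five digits, three uppercase letters, and one dash, which matches the 50000 / 30000 / 10000 split reported for the full 10000 rows.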

Unique

Unique: 6343
Unique (%): 63.4%
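
The report lists both "Distinct" (7967) and "Unique" (6343) for this variable. The sketch below assumes the usual reading, where distinct counts the different values and unique counts the values that occur exactly once. The report itself does not define the terms here, and the input list is hypothetical.

```python
from collections import Counter

# Hypothetical values, not taken from the dataset.
values = ["SPB-1", "SPB-2", "SPB-2", "SPB-3", "SPB-3", "SPB-3", "SPB-4"]

counts = Counter(values)

# Distinct: how many different values appear at all.
n_distinct = len(counts)

# Unique (assumed meaning): values that appear exactly once.
n_unique = sum(1 for c in counts.values() if c == 1)

print(n_distinct, n_unique)
```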

Sample

1st row: SPB-51759
2nd row: SPB-39634
3rd row: SPB-30136
4th row: SPB-31922
5th row: SPB-51618

Value                Count  Frequency (%)
spb-53491                7  0.1%
spb-33819                7  0.1%
spb-51821                6  0.1%
spb-35080                5  < 0.1%
spb-46356                5  < 0.1%
spb-32122                5  < 0.1%
spb-37659                5  < 0.1%
spb-47973                5  < 0.1%
spb-52258                5  < 0.1%
spb-41321                5  < 0.1%
Other values (7957)   9945  99.5%

Most occurring characters

Value             Count  Frequency (%)
S                 10000  11.1%
P                 10000  11.1%
B                 10000  11.1%
-                 10000  11.1%
3                  8062  9.0%
4                  7508  8.3%
5                  6643  7.4%
1                  4385  4.9%
2                  4309  4.8%
0                  4196  4.7%
Other values (4)  14897  16.6%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    50000  55.6%
Uppercase Letter  30000  33.3%
Dash Punctuation  10000  11.1%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
3       8062  16.1%
4       7508  15.0%
5       6643  13.3%
1       4385  8.8%
2       4309  8.6%
0       4196  8.4%
8       3865  7.7%
7       3826  7.7%
6       3754  7.5%
9       3452  6.9%

Uppercase Letter
Value  Count  Frequency (%)
S      10000  33.3%
P      10000  33.3%
B      10000  33.3%

Dash Punctuation
Value  Count  Frequency (%)
-      10000  100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  60000  66.7%
Latin   30000  33.3%

Most frequent character per script

Common
Value  Count  Frequency (%)
-      10000  16.7%
3       8062  13.4%
4       7508  12.5%
5       6643  11.1%
1       4385  7.3%
2       4309  7.2%
0       4196  7.0%
8       3865  6.4%
7       3826  6.4%
6       3754  6.3%

Latin
Value  Count  Frequency (%)
S      10000  33.3%
P      10000  33.3%
B      10000  33.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  90000  100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
S                 10000  11.1%
P                 10000  11.1%
B                 10000  11.1%
-                 10000  11.1%
3                  8062  9.0%
4                  7508  8.3%
5                  6643  7.4%
1                  4385  4.9%
2                  4309  4.8%
0                  4196  4.7%
Other values (4)  14897  16.6%

등록일시 (registration date/time)
DateTime

Distinct: 183
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2021-07-01 00:00:00
Maximum: 2021-12-30 00:00:00
Histogram with fixed size bins (bins=50)
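
A fixed-size-bin histogram splits the range from the minimum to the maximum into equal-width intervals and counts the values in each. The sketch below shows the idea for dates, using the reported minimum and maximum and 50 bins. The helper `fixed_bin_counts` is hypothetical, and it assumes the last bin includes the maximum, the convention `numpy.histogram` uses. The report does not state its own implementation.

```python
from datetime import datetime

def fixed_bin_counts(values, lo, hi, bins):
    """Count values in `bins` equal-width intervals spanning [lo, hi]."""
    width = (hi - lo) / bins  # a timedelta
    counts = [0] * bins
    for v in values:
        # Clamp so that v == hi lands in the last bin instead of overflowing.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts

lo = datetime(2021, 7, 1)    # Minimum reported for this variable
hi = datetime(2021, 12, 30)  # Maximum reported for this variable

# Three illustrative dates: both range ends and one date from the sample rows.
dates = [datetime(2021, 7, 1), datetime(2021, 9, 8), datetime(2021, 12, 30)]

counts = fixed_bin_counts(dates, lo, hi, bins=50)
print([i for i, c in enumerate(counts) if c])
```

With 182 days between the minimum and the maximum, each of the 50 bins covers about 3.64 days.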

고장구분 (failure type)
Categorical

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

기타 (other): 2791
타이어 (tire): 2116
체인 (chain): 1975
안장 (saddle): 1817
페달 (pedal): 902

Length

Max length: 4
Median length: 3
Mean length: 2.7422
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 단말기 (terminal)
2nd row: 안장 (saddle)
3rd row: 체인 (chain)
4th row: 기타 (other)
5th row: 타이어 (tire)

Common Values

Value              Count  Frequency (%)
기타 (other)        2791  27.9%
타이어 (tire)       2116  21.2%
체인 (chain)        1975  19.8%
안장 (saddle)       1817  18.2%
페달 (pedal)         902  9.0%
단말기 (terminal)    399  4.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Plot of the value counts listed under Common Values.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

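Index  자전거번호 (bicycle number)  등록일시 (registration date/time)  고장구분 (failure type)
83293  SPB-51759  2021-11-17  단말기 (terminal)
79739  SPB-39634  2021-11-07  안장 (saddle)
29544  SPB-30136  2021-08-18  체인 (chain)
91373  SPB-31922  2021-12-13  기타 (other)
42370  SPB-51618  2021-09-08  타이어 (tire)
80046  SPB-36171  2021-11-10  기타 (other)
72942  SPB-47538  2021-10-26  타이어 (tire)
81226  SPB-41996  2021-11-12  타이어 (tire)
38690  SPB-43434  2021-09-02  기타 (other)
71782  SPB-38219  2021-10-24  안장 (saddle)

Index  자전거번호 (bicycle number)  등록일시 (registration date/time)  고장구분 (failure type)
60972  SPB-36072  2021-10-05  기타 (other)
3165   SPB-49032  2021-07-06  기타 (other)
48513  SPB-33480  2021-09-16  체인 (chain)
32115  SPB-47981  2021-08-22  체인 (chain)
83532  SPB-57848  2021-11-18  기타 (other)
67187  SPB-56075  2021-10-15  타이어 (tire)
69253  SPB-32670  2021-10-20  기타 (other)
91172  SPB-54861  2021-12-12  안장 (saddle)
12236  SPB-54021  2021-07-21  안장 (saddle)
89418  SPB-58508  2021-12-07  안장 (saddle)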

Duplicate rows

Most frequently occurring

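Index  자전거번호 (bicycle number)  등록일시 (registration date/time)  고장구분 (failure type)  # duplicates
0  SPB-30128  2021-07-12  타이어 (tire)  2
1  SPB-30647  2021-08-28  페달 (pedal)  2
2  SPB-31122  2021-09-02  안장 (saddle)  2
3  SPB-32312  2021-07-29  체인 (chain)  2
4  SPB-32404  2021-10-09  페달 (pedal)  2
5  SPB-33179  2021-08-10  체인 (chain)  2
6  SPB-33570  2021-10-20  페달 (pedal)  2
7  SPB-34163  2021-07-05  페달 (pedal)  2
8  SPB-34167  2021-09-02  안장 (saddle)  2
9  SPB-35643  2021-08-20  타이어 (tire)  2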
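
A table like this can be built by counting identical rows and keeping those that occur more than once. The sketch below does that with the standard library on a small illustrative subset: the first two duplicated rows from the table, plus one row from the sample that is assumed here to occur once. How the report derives its headline figure of 18 duplicate rows is not stated, so only the per-row "# duplicates" column is reproduced.

```python
from collections import Counter

# Illustrative subset, not the full dataset.
rows = [
    ("SPB-30128", "2021-07-12", "타이어"),
    ("SPB-30128", "2021-07-12", "타이어"),
    ("SPB-30647", "2021-08-28", "페달"),
    ("SPB-30647", "2021-08-28", "페달"),
    ("SPB-51759", "2021-11-17", "단말기"),  # assumed to occur only once
]

# Rows are tuples, so identical rows hash to the same Counter key.
row_counts = Counter(rows)

# Keep only rows that occur more than once, most frequent first.
duplicate_table = [(row, n) for row, n in row_counts.most_common() if n > 1]

for row, n in duplicate_table:
    print(*row, n)
```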