Overview

Dataset statistics

Number of variables: 3
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 20
Duplicate rows (%): 0.2%
Total size in memory: 312.5 KiB
Average record size in memory: 32.0 B
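
The derived figures above follow from the raw ones by simple arithmetic. The following is a minimal sketch that recomputes them from the values printed in this report; the variable names are illustrative and not part of the profiling tool.

```python
# Figures copied from the dataset statistics in this report.
n_rows = 10_000
duplicate_rows = 20
total_kib = 312.5

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = duplicate_rows / n_rows * 100

# Average record size: convert KiB to bytes, then divide by the row count.
avg_record_bytes = total_kib * 1024 / n_rows

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```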

Variable types

Text: 1
DateTime: 1
Categorical: 1

Dataset

Description: File download (파일 다운로드)
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15644/A/1/datasetView.do

Alerts

Dataset has 20 (0.2%) duplicate rows [Duplicates]

Reproduction

Analysis started: 2024-05-11 00:11:49.477560
Analysis finished: 2024-05-11 00:11:50.137805
Duration: 0.66 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

자전거번호 (bicycle number)
Text

Distinct: 7501
Distinct (%): 75.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
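
As a rough illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiling tool's own implementation. The helper name `category_counts` is hypothetical, and the three input values are taken from the sample rows in this report.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            # unicodedata.category returns a two-letter code such as
            # "Lu" (Uppercase Letter), "Nd" (Decimal Number) or
            # "Pd" (Dash Punctuation).
            counts[unicodedata.category(ch)] += 1
    return counts

sample = ["SPB-33048", "SPB-31122", "SPB-41290"]
counts = category_counts(sample)
print(counts)
```

Each identifier contributes three uppercase letters, one dash, and five digits, which matches the 30000 / 10000 / 50000 split reported for the 10000 rows.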

Unique

Unique: 5654
Unique (%): 56.5%
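
The report lists both a Distinct and a Unique figure for this variable. The sketch below assumes the usual reading of these terms: "distinct" is the number of different values, while "unique" is the number of values that occur exactly once. The helper name and the short input list are illustrative only.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique

# "SPB-33048" appears twice, so it is distinct but not unique.
ids = ["SPB-33048", "SPB-31122", "SPB-33048", "SPB-41290"]
distinct, unique = distinct_and_unique(ids)
print(distinct, unique)
```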

Sample

1st row: SPB-33048
2nd row: SPB-31122
3rd row: SPB-41290
4th row: SPB-32481
5th row: SPB-34211

Value                  Count   Frequency (%)
spb-46909              8       0.1%
spb-46529              8       0.1%
spb-30524              7       0.1%
spb-41634              7       0.1%
spb-32055              7       0.1%
spb-32397              6       0.1%
spb-41083              6       0.1%
spb-32401              6       0.1%
spb-32617              6       0.1%
spb-46553              6       0.1%
Other values (7491)    9933    99.3%

Most occurring characters

Value               Count    Frequency (%)
S                   10000    11.1%
P                   10000    11.1%
B                   10000    11.1%
-                   10000    11.1%
3                   9169     10.2%
4                   7298     8.1%
5                   6027     6.7%
0                   4487     5.0%
1                   4463     5.0%
2                   4462     5.0%
Other values (4)    14094    15.7%

Most occurring categories

Value               Count    Frequency (%)
Decimal Number      50000    55.6%
Uppercase Letter    30000    33.3%
Dash Punctuation    10000    11.1%

Most frequent character per category

Decimal Number

Value    Count    Frequency (%)
3        9169     18.3%
4        7298     14.6%
5        6027     12.1%
0        4487     9.0%
1        4463     8.9%
2        4462     8.9%
6        3698     7.4%
7        3582     7.2%
8        3462     6.9%
9        3352     6.7%

Uppercase Letter

Value    Count    Frequency (%)
S        10000    33.3%
P        10000    33.3%
B        10000    33.3%

Dash Punctuation

Value    Count    Frequency (%)
-        10000    100.0%

Most occurring scripts

Value     Count    Frequency (%)
Common    60000    66.7%
Latin     30000    33.3%

Most frequent character per script

Common

Value    Count    Frequency (%)
-        10000    16.7%
3        9169     15.3%
4        7298     12.2%
5        6027     10.0%
0        4487     7.5%
1        4463     7.4%
2        4462     7.4%
6        3698     6.2%
7        3582     6.0%
8        3462     5.8%

Latin

Value    Count    Frequency (%)
S        10000    33.3%
P        10000    33.3%
B        10000    33.3%

Most occurring blocks

Value    Count    Frequency (%)
ASCII    90000    100.0%

Most frequent character per block

ASCII

Value               Count    Frequency (%)
S                   10000    11.1%
P                   10000    11.1%
B                   10000    11.1%
-                   10000    11.1%
3                   9169     10.2%
4                   7298     8.1%
5                   6027     6.7%
0                   4487     5.0%
1                   4463     5.0%
2                   4462     5.0%
Other values (4)    14094    15.7%

등록일시 (registration date)
DateTime

Distinct: 149
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2021-02-01 00:00:00
Maximum: 2021-06-29 00:00:00

Histogram with fixed size bins (bins=50)
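
To make "fixed size bins" concrete, the sketch below splits the range between the earliest and latest date into equal-width intervals and counts the dates in each. It is an illustration of the general technique, not the plotting code used by the report. The function name `fixed_bins` is hypothetical, and the four input dates are the reported minimum and maximum plus two dates from the sample rows, written without zero padding as they appear there.

```python
from datetime import datetime

def fixed_bins(dates, bins=50):
    """Count dates in `bins` equal-width intervals between min and max."""
    lo, hi = min(dates), max(dates)
    span = (hi - lo).total_seconds()
    counts = [0] * bins
    for d in dates:
        if span == 0:
            idx = 0  # all dates identical: everything lands in the first bin
        else:
            idx = int((d - lo).total_seconds() / span * bins)
        # The maximum would compute to index `bins`; clamp it into the last bin.
        counts[min(idx, bins - 1)] += 1
    return counts

# strptime accepts months and days without leading zeros.
raw = ["2021-2-1", "2021-5-25", "2021-6-24", "2021-6-29"]
dates = [datetime.strptime(s, "%Y-%m-%d") for s in raw]
counts = fixed_bins(dates, bins=50)
print(counts)
```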

고장구분 (fault category)
Categorical

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

기타 (other): 2823
체인 (chain): 2331
안장 (saddle): 1984
타이어 (tire): 1462
페달 (pedal): 860

Length

Max length: 4
Median length: 2
Mean length: 2.6287
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 체인 (chain)
2nd row: 체인 (chain)
3rd row: 안장 (saddle)
4th row: 타이어 (tire)
5th row: 체인 (chain)

Common Values

Value                Count    Frequency (%)
기타 (other)         2823     28.2%
체인 (chain)         2331     23.3%
안장 (saddle)        1984     19.8%
타이어 (tire)        1462     14.6%
페달 (pedal)         860      8.6%
단말기 (terminal)    540      5.4%
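
The frequency column is each count divided by the total number of observations. A minimal sketch of that calculation follows, using the six counts from the table; the dictionary and variable names are illustrative only.

```python
# Counts per fault category, copied from the Common Values table.
category_counts = {
    "기타": 2823,    # other
    "체인": 2331,    # chain
    "안장": 1984,    # saddle
    "타이어": 1462,  # tire
    "페달": 860,     # pedal
    "단말기": 540,   # terminal
}

total = sum(category_counts.values())

# Frequency as a percentage string with one decimal place.
frequencies = {
    name: f"{count / total * 100:.1f}%"
    for name, count in category_counts.items()
}

for name, pct in frequencies.items():
    print(name, category_counts[name], pct)
```

The six counts add up to the 10000 observations reported in the overview, so no category is hidden behind an "Other values" row here.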

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of the common values; the counts match the Common Values table above]

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

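Index    자전거번호 (bicycle number)    등록일시 (registration date)    고장구분 (fault category)
33396    SPB-33048                      2021-5-25                       체인 (chain)
50541    SPB-31122                      2021-6-24                       체인 (chain)
5062     SPB-41290                      2021-3-25                       안장 (saddle)
9705     SPB-32481                      2021-4-9                        타이어 (tire)
44355    SPB-34211                      2021-6-14                       체인 (chain)
45252    SPB-51574                      2021-6-15                       기타 (other)
50833    SPB-51912                      2021-6-24                       타이어 (tire)
16466    SPB-37051                      2021-4-23                       체인 (chain)
27473    SPB-41215                      2021-5-13                       기타 (other)
51142    SPB-43305                      2021-6-25                       타이어 (tire)

Index    자전거번호 (bicycle number)    등록일시 (registration date)    고장구분 (fault category)
5361     SPB-41620                      2021-3-26                       체인 (chain)
49254    SPB-42028                      2021-6-22                       단말기 (terminal)
39618    SPB-54643                      2021-6-6                        기타 (other)
9852     SPB-31766                      2021-4-9                        체인 (chain)
5591     SPB-42575                      2021-3-26                       타이어 (tire)
6139     SPB-38283                      2021-3-30                       기타 (other)
35834    SPB-31900                      2021-5-30                       체인 (chain)
40968    SPB-36313                      2021-6-8                        타이어 (tire)
5114     SPB-41481                      2021-3-25                       체인 (chain)
50133    SPB-36764                      2021-6-23                       체인 (chain)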

Duplicate rows

Most frequently occurring

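Index    자전거번호 (bicycle number)    등록일시 (registration date)    고장구분 (fault category)    # duplicates
0        SPB-33486                      2021-4-19                       기타 (other)                 2
1        SPB-33834                      2021-5-13                       체인 (chain)                 2
2        SPB-34029                      2021-4-29                       체인 (chain)                 2
3        SPB-36594                      2021-3-31                       안장 (saddle)                2
4        SPB-36693                      2021-4-7                        타이어 (tire)                2
5        SPB-38274                      2021-5-17                       타이어 (tire)                2
6        SPB-38347                      2021-6-4                        체인 (chain)                 2
7        SPB-39490                      2021-5-22                       타이어 (tire)                2
8        SPB-39586                      2021-4-6                        체인 (chain)                 2
9        SPB-40365                      2021-5-19                       타이어 (tire)                2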
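
A table like this can be derived by counting how often each complete row occurs and keeping the rows seen more than once. The sketch below shows that idea on a handful of rows taken from this report, with one of them repeated on purpose. It is an assumption about the general approach, not the profiling tool's actual code, and the variable names are illustrative.

```python
from collections import Counter

# Each row is a (bicycle number, registration date, fault category) tuple.
rows = [
    ("SPB-33486", "2021-4-19", "기타"),
    ("SPB-33834", "2021-5-13", "체인"),
    ("SPB-33486", "2021-4-19", "기타"),  # repeated row
    ("SPB-36594", "2021-3-31", "안장"),
]

# Tuples are hashable, so whole rows can be counted directly.
row_counts = Counter(rows)

# Keep only rows that occur more than once, with their occurrence count.
duplicates = {row: n for row, n in row_counts.items() if n > 1}

for row, n in duplicates.items():
    print(*row, n)
```

Rows count as duplicates only when every field matches, so the same bicycle reported on two different dates is not flagged.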