Overview

Dataset statistics

Number of variables: 3
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 18
Duplicate rows (%): 0.2%
Total size in memory: 312.5 KiB
Average record size in memory: 32.0 B
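
The percentage and per-record figures above follow from simple arithmetic on the reported totals. The sketch below recomputes them; the variable names are illustrative assumptions, and only the three input values come from the report.

```python
# Input values copied from the dataset statistics above.
n_rows = 10_000
n_duplicate_rows = 18
total_size_kib = 312.5

# 18 / 10000 = 0.18%, which the report rounds to one decimal place.
duplicate_pct = n_duplicate_rows / n_rows * 100

# 1 KiB = 1024 bytes, so 312.5 KiB spread over 10000 rows.
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```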

Variable types

Text: 1
DateTime: 1
Categorical: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15644/F/1/datasetView.do

Alerts

Duplicates: Dataset has 18 (0.2%) duplicate rows

Reproduction

Analysis started: 2024-03-13 19:20:57.366397
Analysis finished: 2024-03-13 19:20:57.589063
Duration: 0.22 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

자전거번호 (bicycle number)
Text

Distinct: 7499
Distinct (%): 75.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
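
As an illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not necessarily how the profiling tool computes its figures; the helper name `category_counts` is an assumption, and the two input strings are the first two sample rows of this variable.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over every character in the values."""
    counts = Counter()
    for value in values:
        for ch in value:
            # e.g. "Lu" = Uppercase Letter, "Nd" = Decimal Number,
            # "Pd" = Dash Punctuation
            counts[unicodedata.category(ch)] += 1
    return counts

sample = ["SPB-41758", "SPB-31142"]
counts = category_counts(sample)
print(counts)
```

Each value contributes three uppercase letters, one dash, and five digits, which matches the 30000 / 10000 / 50000 split the report gives for the 10000 observations.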

Unique

Unique: 5646
Unique (%): 56.5%
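
The report distinguishes distinct values (how many different values occur) from unique values (how many values occur exactly once). A minimal sketch of that distinction, using a hypothetical helper `distinct_and_unique` and made-up toy data rather than the real column:

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique

# Hypothetical toy data: three different identifiers, one of which appears once.
toy = ["SPB-1", "SPB-2", "SPB-2", "SPB-3", "SPB-3", "SPB-3"]
distinct, unique = distinct_and_unique(toy)
print(distinct, unique)
```

Read this way, the report's figures say that 7499 different bicycle numbers occur and that 5646 of them occur only once.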

Sample

1st row: SPB-41758
2nd row: SPB-31142
3rd row: SPB-45328
4th row: SPB-32630
5th row: SPB-35372

Value                  Count   Frequency (%)
spb-46515              11      0.1%
spb-46909              8       0.1%
spb-32409              6       0.1%
spb-37064              6       0.1%
spb-41620              6       0.1%
spb-37011              6       0.1%
spb-33034              6       0.1%
spb-52324              6       0.1%
spb-45373              6       0.1%
spb-34165              6       0.1%
Other values (7489)    9933    99.3%

Most occurring characters

Value               Count    Frequency (%)
S                   10000    11.1%
P                   10000    11.1%
B                   10000    11.1%
-                   10000    11.1%
3                   9076     10.1%
4                   7433     8.3%
5                   5878     6.5%
0                   4489     5.0%
1                   4467     5.0%
2                   4383     4.9%
Other values (4)    14274    15.9%

Most occurring categories

Value               Count    Frequency (%)
Decimal Number      50000    55.6%
Uppercase Letter    30000    33.3%
Dash Punctuation    10000    11.1%

Most frequent character per category

Decimal Number

Value    Count    Frequency (%)
3        9076     18.2%
4        7433     14.9%
5        5878     11.8%
0        4489     9.0%
1        4467     8.9%
2        4383     8.8%
6        3772     7.5%
7        3599     7.2%
8        3496     7.0%
9        3407     6.8%

Uppercase Letter

Value    Count    Frequency (%)
S        10000    33.3%
P        10000    33.3%
B        10000    33.3%

Dash Punctuation

Value    Count    Frequency (%)
-        10000    100.0%

Most occurring scripts

Value     Count    Frequency (%)
Common    60000    66.7%
Latin     30000    33.3%

Most frequent character per script

Common

Value    Count    Frequency (%)
-        10000    16.7%
3        9076     15.1%
4        7433     12.4%
5        5878     9.8%
0        4489     7.5%
1        4467     7.4%
2        4383     7.3%
6        3772     6.3%
7        3599     6.0%
8        3496     5.8%

Latin

Value    Count    Frequency (%)
S        10000    33.3%
P        10000    33.3%
B        10000    33.3%

Most occurring blocks

Value    Count    Frequency (%)
ASCII    90000    100.0%

Most frequent character per block

ASCII

Value               Count    Frequency (%)
S                   10000    11.1%
P                   10000    11.1%
B                   10000    11.1%
-                   10000    11.1%
3                   9076     10.1%
4                   7433     8.3%
5                   5878     6.5%
0                   4489     5.0%
1                   4467     5.0%
2                   4383     4.9%
Other values (4)    14274    15.9%

등록일시 (registration date/time)
DateTime

Distinct: 149
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2021-02-01 00:00:00
Maximum: 2021-06-29 00:00:00

Histogram with fixed size bins (bins=50)
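
A histogram with fixed-size bins splits the range between the minimum and maximum into equal-width intervals and counts the values in each. The sketch below illustrates the idea for dates; the function name `fixed_width_histogram` is an assumption, the four input dates are taken from the minimum, the maximum, and two sample rows in this report, and the profiling tool's own binning may differ in detail.

```python
from datetime import datetime

def fixed_width_histogram(dates, bins=50):
    """Count dates in `bins` equal-width intervals between the min and max date."""
    lo, hi = min(dates), max(dates)
    span = (hi - lo).total_seconds()
    counts = [0] * bins
    for d in dates:
        # Position of the date within the range, scaled to a bin index.
        idx = 0 if span == 0 else int((d - lo).total_seconds() / span * bins)
        # The maximum would land on index `bins`, so clamp it into the last bin.
        counts[min(idx, bins - 1)] += 1
    return counts

raw = ["2021-2-1", "2021-4-11", "2021-5-28", "2021-6-29"]
dates = [datetime.strptime(s, "%Y-%m-%d") for s in raw]
counts = fixed_width_histogram(dates, bins=50)
print(counts)
```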

고장구분 (fault type)
Categorical

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value              Count
기타 (other)       2807
체인 (chain)       2356
안장 (saddle)      2004
타이어 (tire)      1452
페달 (pedal)       878

Length

Max length: 4
Median length: 2
Mean length: 2.6214
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 안장 (saddle)
2nd row: 체인 (chain)
3rd row: 기타 (other)
4th row: 안장 (saddle)
5th row: 기타 (other)

Common Values

Value                 Count    Frequency (%)
기타 (other)          2807     28.1%
체인 (chain)          2356     23.6%
안장 (saddle)         2004     20.0%
타이어 (tire)         1452     14.5%
페달 (pedal)          878      8.8%
단말기 (terminal)     503      5.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of the value counts listed under Common Values]

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
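
Nullity by column is simply the number of missing cells in each column. A minimal sketch of that count, assuming rows are held as dictionaries and that `None` or an empty string marks a missing cell; the helper name `nullity_by_column` is hypothetical, and the missing cell in the third row is invented for illustration, since the profiled dataset itself has no missing cells.

```python
def nullity_by_column(rows, columns):
    """Return the number of missing (None or empty string) cells per column."""
    return {
        col: sum(1 for row in rows if row.get(col) in (None, ""))
        for col in columns
    }

columns = ["자전거번호", "등록일시", "고장구분"]
rows = [
    {"자전거번호": "SPB-41758", "등록일시": "2021-5-28", "고장구분": "안장"},
    {"자전거번호": "SPB-31142", "등록일시": "2021-4-11", "고장구분": "체인"},
    # Hypothetical missing date, added only to show a non-zero count.
    {"자전거번호": "SPB-45328", "등록일시": None, "고장구분": "기타"},
]
missing = nullity_by_column(rows, columns)
print(missing)
```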

Sample

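Index    자전거번호 (bicycle number)    등록일시 (registration date/time)    고장구분 (fault type)
35206    SPB-41758                      2021-5-28                            안장 (saddle)
11030    SPB-31142                      2021-4-11                            체인 (chain)
42442    SPB-45328                      2021-6-11                            기타 (other)
48856    SPB-32630                      2021-6-21                            안장 (saddle)
44751    SPB-35372                      2021-6-14                            기타 (other)
28552    SPB-51681                      2021-5-14                            페달 (pedal)
24443    SPB-32949                      2021-5-8                             기타 (other)
48387    SPB-33575                      2021-6-21                            기타 (other)
49658    SPB-30850                      2021-6-23                            페달 (pedal)
52594    SPB-44877                      2021-6-27                            페달 (pedal)

Index    자전거번호 (bicycle number)    등록일시 (registration date/time)    고장구분 (fault type)
39497    SPB-50417                      2021-6-6                             타이어 (tire)
44111    SPB-49068                      2021-6-14                            체인 (chain)
9037     SPB-32583                      2021-4-8                             페달 (pedal)
8601     SPB-32360                      2021-4-7                             단말기 (terminal)
30507    SPB-43408                      2021-5-19                            안장 (saddle)
41491    SPB-38649                      2021-6-9                             기타 (other)
39812    SPB-38168                      2021-6-6                             기타 (other)
3382     SPB-46026                      2021-3-14                            페달 (pedal)
31659    SPB-33348                      2021-5-22                            기타 (other)
47189    SPB-37009                      2021-6-19                            타이어 (tire)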

Duplicate rows

Most frequently occurring

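Index    자전거번호 (bicycle number)    등록일시 (registration date/time)    고장구분 (fault type)    # duplicates
0        SPB-30143                      2021-4-26                            체인 (chain)             2
1        SPB-30445                      2021-2-26                            안장 (saddle)            2
2        SPB-31450                      2021-4-17                            기타 (other)             2
3        SPB-32176                      2021-4-9                             페달 (pedal)             2
4        SPB-32484                      2021-5-13                            체인 (chain)             2
5        SPB-33034                      2021-4-23                            체인 (chain)             2
6        SPB-33420                      2021-4-10                            체인 (chain)             2
7        SPB-34233                      2021-6-2                             체인 (chain)             2
8        SPB-35595                      2021-2-2                             단말기 (terminal)        2
9        SPB-37779                      2021-5-6                             기타 (other)             2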
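
A duplicate row is one whose values match another row in every column. The sketch below shows one way to find such rows with the standard library; the helper name `duplicate_rows` is an assumption, and the input is a small illustrative list built from rows shown in this report, not the full dataset.

```python
from collections import Counter

def duplicate_rows(rows):
    """Return each row that occurs more than once, mapped to its occurrence count."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

# Rows as (자전거번호, 등록일시, 고장구분) tuples; the first row is repeated.
rows = [
    ("SPB-30143", "2021-4-26", "체인"),
    ("SPB-30143", "2021-4-26", "체인"),
    ("SPB-41758", "2021-5-28", "안장"),
]
dups = duplicate_rows(rows)
print(dups)
```

Tuples are used because they are hashable, which lets `Counter` compare whole rows at once.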