Overview

Dataset statistics

Number of variables: 8
Number of observations: 2511
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 159.5 KiB
Average record size in memory: 65.1 B

Variable types

Categorical: 5
Text: 2
Numeric: 1

Dataset

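Description: Registered bus counts for Busan Metropolitan City's city buses, by operator and by year, with fields such as company name, bus route number, registration year, fuel, and type.
Author: Busan Metropolitan City (부산광역시)
URL: https://www.data.go.kr/data/15043689/fileData.do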

Alerts

운수사 is highly overall correlated with 연료 (High correlation)
연료 is highly overall correlated with 운수사 (High correlation)
상용구분 is highly imbalanced (70.6%) (Imbalance)
차량구분 is highly imbalanced (95.6%) (Imbalance)
연료 is highly imbalanced (61.9%) (Imbalance)
차량번호 has unique values (Unique)
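
The imbalance percentages in these alerts are consistent with a score of one minus the normalized Shannon entropy of the class shares. The sketch below is an assumption about how such a score can be derived, not a copy of the ydata-profiling source. The function name `imbalance_score` is hypothetical, and the counts are the ones listed in the Common Values tables of this report.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (bits) of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class is treated as "no imbalance"
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Class counts as listed in this report.
columns = {
    "상용구분": [2381, 130],          # 상용, 예비
    "차량구분": [2499, 12],           # 대형, 중형
    "연료": [2099, 354, 36, 22],      # CNG, 전기, 수소, 경유
}

for name, counts in columns.items():
    print(f"{name}: {imbalance_score(counts):.1%}")
```

A score of 0 means the classes are evenly split; a score near 1 means almost every row falls in one class, as with 차량구분.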

Reproduction

Analysis started: 2023-12-12 18:03:21.630143
Analysis finished: 2023-12-12 18:03:22.567444
Duration: 0.94 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
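
A report of this kind can be regenerated roughly as follows. This is a minimal sketch, not the configuration actually used here: the function name `build_profile`, the file paths, the report title, and the `cp949` encoding are all assumptions. It relies only on `pandas.read_csv`, `ProfileReport`, and `ProfileReport.to_file`.

```python
def build_profile(csv_path, html_path="report.html"):
    """Profile a CSV file and write an HTML report (paths are hypothetical)."""
    # Imported lazily so that defining this helper does not require the libraries.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # encoding="cp949" is an assumption; adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding="cp949")

    # The title is illustrative only.
    profile = ProfileReport(df, title="Busan city bus registrations")
    profile.to_file(html_path)
    return profile
```

Calling `build_profile("buses.csv")` with a local copy of the dataset would write `report.html` next to the script.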

Variables

운수사
Categorical

HIGH CORRELATION

Distinct: 33
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 19.7 KiB

Value  Count
삼신교통㈜  132
삼성여객㈜  130
신한여객㈜  126
삼진여객㈜  107
삼화여객㈜  107
Other values (28)  1909

Length

Max length: 7
Median length: 5
Mean length: 5.0318598
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 국제여객㈜
2nd row: 국제여객㈜
3rd row: 국제여객㈜
4th row: 국제여객㈜
5th row: 국제여객㈜

Common Values

Value  Count  Frequency (%)
삼신교통㈜  132  5.3%
삼성여객㈜  130  5.2%
신한여객㈜  126  5.0%
삼진여객㈜  107  4.3%
삼화여객㈜  107  4.3%
동원여객㈜  98  3.9%
세진여객㈜  94  3.7%
동남여객㈜  89  3.5%
대진버스㈜  86  3.4%
태진여객㈜  83  3.3%
Other values (23)  1459  58.1%

Length

Histogram of lengths of the category

인가노선
Text

Distinct: 143
Distinct (%): 5.7%
Missing: 0
Missing (%): 0.0%
Memory size: 19.7 KiB

Length

Max length: 5
Median length: 4
Mean length: 2.7546794
Min length: 1

Characters and Unicode

Total characters: 6917
Distinct characters: 14
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 10
2nd row: 210
3rd row: 10
4th row: 10
5th row: 10

Value  Count  Frequency (%)
15  38  1.5%
31  37  1.5%
51  36  1.4%
155  35  1.4%
68  35  1.4%
126  33  1.3%
67  32  1.3%
115-1  32  1.3%
87  32  1.3%
1001  31  1.2%
Other values (133)  2170  86.4%

Most occurring characters

Value  Count  Frequency (%)
1  2042  29.5%
0  850  12.3%
3  639  9.2%
8  621  9.0%
2  513  7.4%
6  490  7.1%
5  412  6.0%
4  354  5.1%
9  351  5.1%
7  308  4.5%
Other values (4)  337  4.9%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  6580  95.1%
Dash Punctuation  299  4.3%
Uppercase Letter  20  0.3%
Open Punctuation  9  0.1%
Close Punctuation  9  0.1%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1  2042  31.0%
0  850  12.9%
3  639  9.7%
8  621  9.4%
2  513  7.8%
6  490  7.4%
5  412  6.3%
4  354  5.4%
9  351  5.3%
7  308  4.7%

Dash Punctuation
Value  Count  Frequency (%)
-  299  100.0%

Uppercase Letter
Value  Count  Frequency (%)
A  20  100.0%

Open Punctuation
Value  Count  Frequency (%)
(  9  100.0%

Close Punctuation
Value  Count  Frequency (%)
)  9  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Common  6897  99.7%
Latin  20  0.3%

Most frequent character per script

Common
Value  Count  Frequency (%)
1  2042  29.6%
0  850  12.3%
3  639  9.3%
8  621  9.0%
2  513  7.4%
6  490  7.1%
5  412  6.0%
4  354  5.1%
9  351  5.1%
7  308  4.5%
Other values (3)  317  4.6%

Latin
Value  Count  Frequency (%)
A  20  100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  6917  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1  2042  29.5%
0  850  12.3%
3  639  9.2%
8  621  9.0%
2  513  7.4%
6  490  7.1%
5  412  6.0%
4  354  5.1%
9  351  5.1%
7  308  4.5%
Other values (4)  337  4.9%

차량번호
Text

UNIQUE

Distinct: 2511
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 19.7 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 22599
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
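
As an illustration of the character properties mentioned above, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below is not the report's own implementation; the helper name `category_counts` is hypothetical, and the inputs are sample values shown in this report.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over every character of every value."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)

# Sample values from this report.
plates = ["부산70자1001", "부산70자1002"]   # 차량번호
routes = ["10", "210", "115-1"]            # 인가노선

# 'Nd' = Decimal Number, 'Lo' = Other Letter, 'Pd' = Dash Punctuation
print(category_counts(plates))
print(category_counts(routes))
```

Each plate has nine characters: six decimal digits and three Hangul letters. Over 2511 plates that gives 15066 digits and 7533 letters, matching the category totals reported for 차량번호.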

Unique

Unique: 2511
Unique (%): 100.0%

Sample

1st row: 부산70자1001
2nd row: 부산70자1002
3rd row: 부산70자1003
4th row: 부산70자1004
5th row: 부산70자1005

Value  Count  Frequency (%)
부산70자1001  1  < 0.1%
부산70자3908  1  < 0.1%
부산70자3934  1  < 0.1%
부산70자3900  1  < 0.1%
부산70자3901  1  < 0.1%
부산70자3903  1  < 0.1%
부산70자3904  1  < 0.1%
부산70자3906  1  < 0.1%
부산70자3907  1  < 0.1%
부산70자3917  1  < 0.1%
Other values (2501)  2501  99.6%

Most occurring characters

Value  Count  Frequency (%)
0  3209  14.2%
7  3200  14.2%
부  2511  11.1%
산  2511  11.1%
자  2511  11.1%
1  1591  7.0%
3  1486  6.6%
2  1415  6.3%
5  1181  5.2%
4  1104  4.9%
Other values (3)  1880  8.3%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  15066  66.7%
Other Letter  7533  33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0  3209  21.3%
7  3200  21.2%
1  1591  10.6%
3  1486  9.9%
2  1415  9.4%
5  1181  7.8%
4  1104  7.3%
8  671  4.5%
6  661  4.4%
9  548  3.6%

Other Letter
Value  Count  Frequency (%)
부  2511  33.3%
산  2511  33.3%
자  2511  33.3%

Most occurring scripts

Value  Count  Frequency (%)
Common  15066  66.7%
Hangul  7533  33.3%

Most frequent character per script

Common
Value  Count  Frequency (%)
0  3209  21.3%
7  3200  21.2%
1  1591  10.6%
3  1486  9.9%
2  1415  9.4%
5  1181  7.8%
4  1104  7.3%
8  671  4.5%
6  661  4.4%
9  548  3.6%

Hangul
Value  Count  Frequency (%)
부  2511  33.3%
산  2511  33.3%
자  2511  33.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  15066  66.7%
Hangul  7533  33.3%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0  3209  21.3%
7  3200  21.2%
1  1591  10.6%
3  1486  9.9%
2  1415  9.4%
5  1181  7.8%
4  1104  7.3%
8  671  4.5%
6  661  4.4%
9  548  3.6%

Hangul
Value  Count  Frequency (%)
부  2511  33.3%
산  2511  33.3%
자  2511  33.3%

운행구분
Categorical

Distinct: 4
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 19.7 KiB

Value  Count
일반  1536
저상  792
고급  172
좌석  11

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일반
2nd row: 일반
3rd row: 일반
4th row: 일반
5th row: 저상

Common Values

Value  Count  Frequency (%)
일반  1536  61.2%
저상  792  31.5%
고급  172  6.8%
좌석  11  0.4%
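
The frequency column is each count divided by the number of observations (2511). As a small sketch, the percentages for 운행구분 can be recomputed from the counts listed in this report; the helper name `frequency_table` is hypothetical.

```python
def frequency_table(counts):
    """Return (value, count, percent) rows, with percent rounded to one decimal."""
    total = sum(counts.values())
    return [(value, n, round(100 * n / total, 1)) for value, n in counts.items()]

# Counts for 운행구분 as listed in this report.
service_type = {"일반": 1536, "저상": 792, "고급": 172, "좌석": 11}

for value, n, pct in frequency_table(service_type):
    print(f"{value}  {n}  {pct}%")
```

The same division gives the "Distinct (%)" figure: 4 distinct values out of 2511 rows is about 0.2%.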

Length

Histogram of lengths of the category

상용구분
Categorical

IMBALANCE

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 19.7 KiB

Value  Count
상용  2381
예비  130

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 상용
2nd row: 상용
3rd row: 상용
4th row: 상용
5th row: 상용

Common Values

Value  Count  Frequency (%)
상용  2381  94.8%
예비  130  5.2%

Length

Histogram of lengths of the category

차량구분
Categorical

IMBALANCE

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 19.7 KiB

Value  Count
대형  2499
중형  12

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 대형
2nd row: 대형
3rd row: 대형
4th row: 대형
5th row: 대형

Common Values

Value  Count  Frequency (%)
대형  2499  99.5%
중형  12  0.5%

Length

Histogram of lengths of the category

연료
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 4
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 19.7 KiB

Value  Count
CNG  2099
전기  354
수소  36
경유  22

Length

Max length: 3
Median length: 3
Mean length: 2.8359219
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CNG
2nd row: CNG
3rd row: CNG
4th row: CNG
5th row: CNG

Common Values

Value  Count  Frequency (%)
CNG  2099  83.6%
전기  354  14.1%
수소  36  1.4%
경유  22  0.9%

Length

Histogram of lengths of the category

연식
Real number (ℝ)

Distinct: 12
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2017.6977
Minimum: 2011
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.2 KiB

Quantile statistics

Minimum: 2011
5-th percentile: 2013
Q1: 2015
Median: 2018
Q3: 2020
95-th percentile: 2022
Maximum: 2022
Range: 11
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.8508458
Coefficient of variation (CV): 0.0014129201
Kurtosis: -0.8726555
Mean: 2017.6977
Median Absolute Deviation (MAD): 2
Skewness: -0.29281004
Sum: 5066439
Variance: 8.1273215
Monotonicity: Not monotonic
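
Several of these figures can be checked from the per-year counts listed for 연식 in this report. The sketch below uses only Python's standard `statistics` module. It assumes the sample (n − 1) form of the variance, which is the form that reproduces the reported value.

```python
import statistics

# Vehicles per model year, as listed in this report.
year_counts = {
    2011: 14, 2012: 85, 2013: 140, 2014: 175, 2015: 246, 2016: 150,
    2017: 297, 2018: 307, 2019: 329, 2020: 287, 2021: 243, 2022: 238,
}

# Expand the frequency table back into one value per vehicle.
years = [year for year, n in year_counts.items() for _ in range(n)]

mean = statistics.mean(years)
variance = statistics.variance(years)   # sample variance (n - 1 denominator)
std = statistics.stdev(years)           # sample standard deviation
cv = std / mean                         # coefficient of variation

print(f"n={len(years)} sum={sum(years)} mean={mean:.4f}")
print(f"variance={variance:.7f} std={std:.7f} cv={cv:.10f}")
print(f"median={statistics.median(years)}")
```

The coefficient of variation is tiny because the values are calendar years near 2018; the standard deviation of about 2.85 years is the more informative measure of spread.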
Histogram with fixed size bins (bins=12)

Common values

Value  Count  Frequency (%)
2019  329  13.1%
2018  307  12.2%
2017  297  11.8%
2020  287  11.4%
2015  246  9.8%
2021  243  9.7%
2022  238  9.5%
2014  175  7.0%
2016  150  6.0%
2013  140  5.6%
Other values (2)  99  3.9%

Minimum 10 values

Value  Count  Frequency (%)
2011  14  0.6%
2012  85  3.4%
2013  140  5.6%
2014  175  7.0%
2015  246  9.8%
2016  150  6.0%
2017  297  11.8%
2018  307  12.2%
2019  329  13.1%
2020  287  11.4%

Maximum 10 values

Value  Count  Frequency (%)
2022  238  9.5%
2021  243  9.7%
2020  287  11.4%
2019  329  13.1%
2018  307  12.2%
2017  297  11.8%
2016  150  6.0%
2015  246  9.8%
2014  175  7.0%
2013  140  5.6%

Interactions


Correlations

          운수사    운행구분  상용구분  차량구분  연료     연식
운수사    1.000  0.653  0.000  0.234  0.798  0.503
운행구분  0.653  1.000  0.211  0.065  0.787  0.300
상용구분  0.000  0.211  1.000  0.018  0.134  0.419
차량구분  0.234  0.065  0.018  1.000  0.359  0.175
연료      0.798  0.787  0.134  0.359  1.000  0.402
연식      0.503  0.300  0.419  0.175  0.402  1.000

          운수사    운행구분  상용구분  연료     차량구분
운수사    1.000  0.396  0.000  0.548  0.197
운행구분  0.396  1.000  0.140  0.425  0.043
상용구분  0.000  0.140  1.000  0.089  0.011
연료      0.548  0.425  0.089  1.000  0.240
차량구분  0.197  0.043  0.011  0.240  1.000

          연식     운수사    운행구분  상용구분  차량구분  연료
연식      1.000  0.205  0.181  0.326  0.134  0.252
운수사    0.205  1.000  0.396  0.000  0.197  0.548
운행구분  0.181  0.396  1.000  0.140  0.043  0.425
상용구분  0.326  0.000  0.140  1.000  0.011  0.089
차량구분  0.134  0.197  0.043  0.011  1.000  0.240
연료      0.252  0.548  0.425  0.089  0.240  1.000
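
The extracted text does not say which association measure each matrix uses. For pairs of categorical columns, one common choice is Cramér's V, which rescales the chi-squared statistic of a contingency table to the range 0 to 1. The sketch below shows the plain (uncorrected) form under that assumption. The function name `cramers_v` is hypothetical, and the two tables are toy examples, not data from this dataset.

```python
import math

def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Toy contingency tables (illustrative only).
perfectly_associated = [[10, 0], [0, 10]]
independent = [[5, 5], [5, 5]]

print(cramers_v(perfectly_associated))
print(cramers_v(independent))
```

Values near 0 indicate little association and values near 1 a strong one, which is how entries such as the 0.548 between 운수사 and 연료 can be read if the matrix is of this kind.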

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

#  운수사  인가노선  차량번호  운행구분  상용구분  차량구분  연료  연식
0  국제여객㈜  10  부산70자1001  일반  상용  대형  CNG  2022
1  국제여객㈜  210  부산70자1002  일반  상용  대형  CNG  2015
2  국제여객㈜  10  부산70자1003  일반  상용  대형  CNG  2022
3  국제여객㈜  10  부산70자1004  일반  상용  대형  CNG  2014
4  국제여객㈜  10  부산70자1005  저상  상용  대형  CNG  2014
5  국제여객㈜  111-1  부산70자1007  일반  상용  대형  CNG  2022
6  국제여객㈜  111-1  부산70자1008  저상  상용  대형  CNG  2019
7  국제여객㈜  10  부산70자1009  일반  예비  대형  CNG  2013
8  국제여객㈜  10  부산70자1010  일반  상용  대형  CNG  2015
9  국제여객㈜  10  부산70자1011  일반  상용  대형  CNG  2018

#  운수사  인가노선  차량번호  운행구분  상용구분  차량구분  연료  연식
2501  화신여객㈜  131  부산70자5730  저상  상용  대형  CNG  2019
2502  화신여객㈜  131  부산70자5737  일반  상용  대형  CNG  2018
2503  화신여객㈜  51  부산70자5751  일반  상용  대형  CNG  2018
2504  화신여객㈜  51  부산70자5758  저상  상용  대형  CNG  2014
2505  화신여객㈜  51  부산70자5763  저상  상용  대형  CNG  2014
2506  화신여객㈜  131  부산70자5779  저상  상용  대형  CNG  2022
2507  화신여객㈜  131  부산70자5786  저상  상용  대형  CNG  2019
2508  화신여객㈜  131  부산70자5788  일반  상용  대형  CNG  2018
2509  화신여객㈜  131  부산70자5795  일반  상용  대형  CNG  2016
2510  화신여객㈜  131  부산70자5797  일반  상용  대형  CNG  2016