Overview

Dataset statistics

Number of variables: 5
Number of observations: 297
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 12.3 KiB
Average record size in memory: 42.4 B

Variable types

Categorical: 2
Text: 1
Numeric: 1
DateTime: 1

Dataset

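Description: Provides the bus information table from the Bus Information System operated by Jinju City, Gyeongsangnam-do. Fields provided: bus company (버스회사), passenger capacity (정원), vehicle number (차량번호), and bus ID (버스ID). Note: this dataset is a 2023 release of a table designated for opening under the 2022 mid- to long-term public data opening plan.
URL: https://www.data.go.kr/data/15117457/fileData.do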

Alerts

데이터기준일자 has constant value "2023-07-28" (Constant)
버스ID is highly overall correlated with 버스회사 and 1 other field (High correlation)
버스회사 is highly overall correlated with 버스ID and 1 other field (High correlation)
정원 is highly overall correlated with 버스ID and 1 other field (High correlation)
정원 is highly imbalanced (74.4%) (Imbalance)
차량번호 has unique values (Unique)
버스ID has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 08:44:16.072258
Analysis finished: 2023-12-12 08:44:16.501783
Duration: 0.43 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
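
A report of this kind can be regenerated with the `ProfileReport` class of ydata-profiling. The sketch below is a minimal example, not the exact invocation behind this report: the function name `build_report`, the `cp949` default encoding, and the report title are assumptions, and the settings in the original config.json are not reproduced.

```python
def build_report(csv_path: str, html_path: str, encoding: str = "cp949") -> None:
    """Profile a CSV file and write the resulting HTML report to html_path."""
    # Imported lazily so that defining the function does not require the libraries.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding=encoding)

    # The title is hypothetical; default profiling settings are used.
    profile = ProfileReport(df, title="Jinju bus information")
    profile.to_file(html_path)
```

Calling `build_report("jinju_bus_info.csv", "report.html")` with the file downloaded from the URL above (the file names here are placeholders) should produce a comparable report, though timings and plot details will differ.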

Variables

버스회사
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Value     Count
삼성교통  100
시민버스  97
부일교통  49
부산교통  39
경전여객  8

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 삼성교통
2nd row: 시민버스
3rd row: 삼성교통
4th row: 부산교통
5th row: 삼성교통

Common Values

Value     Count  Frequency (%)
삼성교통  100    33.7%
시민버스  97     32.7%
부일교통  49     16.5%
부산교통  39     13.1%
경전여객  8      2.7%
경원여객  4      1.3%
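
The Frequency (%) column is each count divided by the 297 observations. As a small check, the sketch below recomputes the percentages from the counts in the table; the variable names are illustrative.

```python
# Counts per bus company, copied from the Common Values table.
counts = {
    "삼성교통": 100,
    "시민버스": 97,
    "부일교통": 49,
    "부산교통": 39,
    "경전여객": 8,
    "경원여객": 4,
}

total = sum(counts.values())  # should equal the number of observations

# Percentage share of each company, rounded to one decimal place.
frequency = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, pct in frequency.items():
    print(f"{name}: {pct}%")
```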

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: plot of the common values (same counts as the Common Values table above)

정원
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Value  Count
48     271
41     11
53     6
45     5
46     4

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 48
2nd row: 48
3rd row: 48
4th row: 48
5th row: 48

Common Values

Value  Count  Frequency (%)
48     271    91.2%
41     11     3.7%
53     6      2.0%
45     5      1.7%
46     4      1.3%
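
The 74.4% imbalance figure reported for 정원 is consistent with a normalized-entropy score: one minus the Shannon entropy of the class distribution divided by the maximum possible entropy for that number of classes. That this is the formula ydata-profiling uses internally is an assumption here; the sketch below only shows that it reproduces the reported value from the counts above. The function name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for class counts, where H is the Shannon entropy in bits."""
    k = len(counts)
    if k < 2:
        # A single class has no defined imbalance; 0.0 is a convention chosen here.
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(k)


# Counts for the capacities 48, 41, 53, 45, 46 from the table above.
score = imbalance_score([271, 11, 6, 5, 4])
print(f"{score:.1%}")
```

A score of 0 corresponds to perfectly balanced classes, and values near 1 mean a single class dominates, as the capacity of 48 does here.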

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: plot of the common values (same counts as the Common Values table above)

차량번호
Text

UNIQUE 

Distinct: 297
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 2673
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
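
As an illustration of those character properties, the sketch below classifies the characters of the five sample vehicle numbers with Python's standard `unicodedata` module. The general category comes straight from the Unicode database; the "script" is approximated by the first word of the character name, which is a shortcut adopted for this example and not how the report computes scripts.

```python
import unicodedata
from collections import Counter

# The five sample values of 차량번호 shown in this report.
plates = [
    "경남71자5793",
    "경남99자9991",
    "경남71자5795",
    "경남71자5667",
    "경남71자5797",
]

# General category per character: "Nd" is Decimal Number, "Lo" is Other Letter.
category_counts = Counter(
    unicodedata.category(ch) for plate in plates for ch in plate
)


def rough_script(ch):
    """First word of the Unicode name, e.g. 'HANGUL' or 'DIGIT' (a rough proxy only)."""
    return unicodedata.name(ch).split()[0]


script_counts = Counter(rough_script(ch) for plate in plates for ch in plate)

print(category_counts)
print(script_counts)
```

Each nine-character value contributes six decimal digits and three Hangul syllables, which matches the 66.7% / 33.3% split reported for the full column.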

Unique

Unique: 297
Unique (%): 100.0%

Sample

1st row: 경남71자5793
2nd row: 경남99자9991
3rd row: 경남71자5795
4th row: 경남71자5667
5th row: 경남71자5797

Value               Count  Frequency (%)
경남71자5793        1      0.3%
경남71자5407        1      0.3%
경남71자5405        1      0.3%
경남71자5404        1      0.3%
경남71자5403        1      0.3%
경남71자5402        1      0.3%
경남71자5401        1      0.3%
경남71자5592        1      0.3%
경남71자5591        1      0.3%
경남71자5590        1      0.3%
Other values (287)  287    96.6%

Most occurring characters

Value             Count  Frequency (%)
5                 386    14.4%
7                 351    13.1%
1                 332    12.4%
경                297    11.1%
남                297    11.1%
자                278    10.4%
8                 150    5.6%
4                 149    5.6%
6                 124    4.6%
9                 100    3.7%
Other values (4)  209    7.8%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  1782   66.7%
Other Letter    891    33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
5      386    21.7%
7      351    19.7%
1      332    18.6%
8      150    8.4%
4      149    8.4%
6      124    7.0%
9      100    5.6%
0      76     4.3%
3      58     3.3%
2      56     3.1%

Other Letter
Value  Count  Frequency (%)
경     297    33.3%
남     297    33.3%
자     278    31.2%
아     19     2.1%

Most occurring scripts

Value   Count  Frequency (%)
Common  1782   66.7%
Hangul  891    33.3%

Most frequent character per script

Common
Value  Count  Frequency (%)
5      386    21.7%
7      351    19.7%
1      332    18.6%
8      150    8.4%
4      149    8.4%
6      124    7.0%
9      100    5.6%
0      76     4.3%
3      58     3.3%
2      56     3.1%

Hangul
Value  Count  Frequency (%)
경     297    33.3%
남     297    33.3%
자     278    31.2%
아     19     2.1%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   1782   66.7%
Hangul  891    33.3%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
5      386    21.7%
7      351    19.7%
1      332    18.6%
8      150    8.4%
4      149    8.4%
6      124    7.0%
9      100    5.6%
0      76     4.3%
3      58     3.3%
2      56     3.1%

Hangul
Value  Count  Frequency (%)
경     297    33.3%
남     297    33.3%
자     278    31.2%
아     19     2.1%

버스ID
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 297
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 63505790
Minimum: 38105401
Maximum: 3.810067 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 38105401
5-th percentile: 38105416
Q1: 38105479
Median: 38105635
Q3: 38105847
95-th percentile: 3.8100526 × 10^8
Maximum: 3.810067 × 10^8
Range: 3.4290129 × 10^8
Interquartile range (IQR): 368

Descriptive statistics

Standard deviation: 89954165
Coefficient of variation (CV): 1.4164719
Kurtosis: 8.7466925
Mean: 63505790
Median Absolute Deviation (MAD): 185
Skewness: 3.2692257
Sum: 1.886122 × 10^10
Variance: 8.0917519 × 10^15
Monotonicity: Not monotonic
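
Several of these statistics follow directly from others in the report. The sketch below recomputes the range, IQR, coefficient of variation, variance, and sum from the reported values. The exact maximum (381006695) is taken from the list of largest 버스ID values later in this section. Because the reported mean and standard deviation are rounded, the derived figures agree only to the displayed precision.

```python
# Values copied from the report for 버스ID.
n = 297
minimum = 38105401
maximum = 381006695  # exact value from the list of largest values
q1, q3 = 38105479, 38105847
mean = 63505790      # rounded in the report
std = 89954165       # rounded in the report

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance
total = mean * n                 # Sum

print(f"Range:    {value_range:.7e}")
print(f"IQR:      {iqr}")
print(f"CV:       {cv:.7f}")
print(f"Variance: {variance:.7e}")
print(f"Sum:      {total:.6e}")
```

The IQR of 368 against a range of about 3.4 × 10^8 shows that most IDs sit in a narrow band near 38.1 million while a small group near 381 million stretches the mean, standard deviation, and skewness.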
Figure: Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
38105793            1      0.3%
38105408            1      0.3%
38105406            1      0.3%
38105405            1      0.3%
38105404            1      0.3%
38105403            1      0.3%
38105402            1      0.3%
38105401            1      0.3%
38105592            1      0.3%
38105591            1      0.3%
Other values (287)  287    96.6%

Minimum 10 values

Value     Count  Frequency (%)
38105401  1      0.3%
38105402  1      0.3%
38105403  1      0.3%
38105404  1      0.3%
38105405  1      0.3%
38105406  1      0.3%
38105407  1      0.3%
38105408  1      0.3%
38105409  1      0.3%
38105410  1      0.3%

Maximum 10 values

Value      Count  Frequency (%)
381006695  1      0.3%
381006694  1      0.3%
381006693  1      0.3%
381006692  1      0.3%
381006590  1      0.3%
381006554  1      0.3%
381006526  1      0.3%
381006500  1      0.3%
381006181  1      0.3%
381006145  1      0.3%

데이터기준일자
Date

CONSTANT 

Distinct: 1
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB
Minimum: 2023-07-28 00:00:00
Maximum: 2023-07-28 00:00:00
Figure: Histogram with fixed size bins (bins=1)

Interactions

Figure: interaction plot between variables

Correlations

Figure: correlation heatmap

          버스회사  정원   버스ID
버스회사  1.000     0.795  0.938
정원      0.795     1.000  0.825
버스ID    0.938     0.825  1.000

Figure: correlation heatmap

          버스회사  정원
버스회사  1.000     0.680
정원      0.680     1.000

Figure: correlation heatmap

          버스ID  버스회사  정원
버스ID    1.000   0.773     0.944
버스회사  0.773   1.000     0.680
정원      0.944   0.680     1.000

Missing values

Figure: A simple visualization of nullity by column.
Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

     버스회사  정원  차량번호      버스ID     데이터기준일자
0    삼성교통  48    경남71자5793  38105793   2023-07-28
1    시민버스  48    경남99자9991  38109991   2023-07-28
2    삼성교통  48    경남71자5795  38105795   2023-07-28
3    부산교통  48    경남71자5667  38105667   2023-07-28
4    삼성교통  48    경남71자5797  38105797   2023-07-28
5    삼성교통  48    경남71자5798  38105798   2023-07-28
6    삼성교통  48    경남71자5799  38105799   2023-07-28
7    삼성교통  48    경남71자5800  38105800   2023-07-28
8    삼성교통  48    경남71자5801  38105801   2023-07-28
9    삼성교통  48    경남71자5802  38105802   2023-07-28

Last rows

     버스회사  정원  차량번호      버스ID     데이터기준일자
287  경전여객  41    경남70아6693  381006693  2023-07-28
288  삼성교통  48    경남71자5496  38105496   2023-07-28
289  삼성교통  48    경남71자5497  38105497   2023-07-28
290  시민버스  48    경남71자5491  38105491   2023-07-28
291  경전여객  41    경남70아6694  381006694  2023-07-28
292  부산교통  41    경남70아5188  381005188  2023-07-28
293  경전여객  41    경남70아6695  381006695  2023-07-28
294  부일교통  48    경남71자5550  381005550  2023-07-28
295  경전여객  41    경남70아6500  381006500  2023-07-28
296  부산교통  48    경남71자5656  381005656  2023-07-28