Overview

Dataset statistics

Number of variables: 5
Number of observations: 297
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 12.3 KiB
Average record size in memory: 42.4 B
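
As a quick consistency check, the average record size should equal the total size in memory divided by the number of observations. The sketch below redoes that arithmetic from the figures in the table; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
total_size_kib = 12.3   # total size in memory, in KiB
n_observations = 297    # number of rows

# 1 KiB = 1024 bytes
total_bytes = total_size_kib * 1024
avg_record_bytes = total_bytes / n_observations

print(f"{avg_record_bytes:.1f} B per record")
```

The result agrees with the reported 42.4 B once rounded to one decimal place.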

Variable types

Categorical: 3
Text: 1
Numeric: 1

Dataset

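Description: Provides the bus information table (fields provided: bus company, passenger capacity, vehicle number, bus ID) of the Bus Information System operated by Jinju City, Gyeongsangnam-do.
Author: Jinju City, Gyeongsangnam-do (경상남도 진주시)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15117457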

Alerts

데이터기준일자 (data reference date) has constant value "2023-07-28" [Constant]
버스ID (bus ID) is highly overall correlated with 버스회사 (bus company) and 1 other field [High correlation]
버스회사 is highly overall correlated with 버스ID and 1 other field [High correlation]
정원 (capacity) is highly overall correlated with 버스ID and 1 other field [High correlation]
정원 is highly imbalanced (74.4%) [Imbalance]
차량번호 (vehicle number) has unique values [Unique]
버스ID has unique values [Unique]
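
The 74.4% imbalance figure for 정원 can be reproduced from the category counts reported later in this document (271, 11, 6, 5, 4). The sketch below assumes the score is one minus the Shannon entropy of the class distribution divided by its maximum possible value; that formula is an assumption about how the profiler derives the alert, and `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the counts.

    Assumed definition: 0 means perfectly balanced classes, and values
    near 1 mean a single class dominates.
    """
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(probs) < 2:
        return 0.0  # a single class is treated as "no imbalance" in this sketch
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(len(probs))


# Counts of the five 정원 (capacity) categories: 48, 41, 53, 45, 46 seats.
capacity_counts = [271, 11, 6, 5, 4]
score = imbalance_score(capacity_counts)
print(f"{score:.1%}")
```

Under this assumption the score comes out at roughly 0.744, in line with the alert.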

Reproduction

Analysis started: 2023-12-10 22:42:52.230190
Analysis finished: 2023-12-10 22:42:52.564316
Duration: 0.33 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
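
The reported duration follows directly from the two timestamps. A minimal sketch using only the standard library (variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" section.
started = datetime.fromisoformat("2023-12-10 22:42:52.230190")
finished = datetime.fromisoformat("2023-12-10 22:42:52.564316")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```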

Variables

버스회사
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 삼성교통
2nd row: 시민버스
3rd row: 삼성교통
4th row: 부산교통
5th row: 삼성교통

Common Values

Value      Count   Frequency (%)
삼성교통   100     33.7%
시민버스   97      32.7%
부일교통   49      16.5%
부산교통   39      13.1%
경전여객   8       2.7%
경원여객   4       1.3%
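
Each Frequency (%) value is the count divided by the 297 observations, rounded to one decimal place. The sketch below recomputes the column from the counts above; the dictionary and variable names are illustrative only.

```python
# Counts copied from the "Common Values" table for 버스회사.
company_counts = {
    "삼성교통": 100,
    "시민버스": 97,
    "부일교통": 49,
    "부산교통": 39,
    "경전여객": 8,
    "경원여객": 4,
}

total = sum(company_counts.values())  # should match the 297 observations

# Share of each company as a percentage, rounded like the report.
frequency_pct = {
    name: round(100 * count / total, 1)
    for name, count in company_counts.items()
}

for name, pct in frequency_pct.items():
    print(f"{name}\t{company_counts[name]}\t{pct}%")
```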

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values of 버스회사]

정원
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 48
2nd row: 48
3rd row: 48
4th row: 48
5th row: 48

Common Values

Value   Count   Frequency (%)
48      271     91.2%
41      11      3.7%
53      6       2.0%
45      5       1.7%
46      4       1.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values of 정원]

차량번호
Text

UNIQUE 

Distinct: 297
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 2673
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
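
To illustrate how such character properties are obtained, the sketch below classifies the characters of the first sample value of 차량번호 with Python's standard `unicodedata` module. Extrapolating from one plate to all 297 rows assumes every plate has the same layout of three Hangul syllables and six digits; that is an assumption, although it is consistent with the totals reported in this section.

```python
import unicodedata
from collections import Counter

plate = "경남71자5793"  # first sample row of 차량번호

# General category per character: 'Lo' = Other Letter, 'Nd' = Decimal Number.
category_counts = Counter(unicodedata.category(ch) for ch in plate)

n_rows = 297
# Assumption: every plate shares this 3-letter / 6-digit layout.
expected_letters = n_rows * category_counts["Lo"]
expected_digits = n_rows * category_counts["Nd"]

print(category_counts)
print(expected_letters, expected_digits)
```

Under that assumption the extrapolated totals match the 891 Other Letter and 1782 Decimal Number characters listed below, which together give the 2673 total characters.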

Unique

Unique: 297
Unique (%): 100.0%

Sample

1st row: 경남71자5793
2nd row: 경남99자9991
3rd row: 경남71자5795
4th row: 경남71자5667
5th row: 경남71자5797

Value                Count   Frequency (%)
경남71자5793         1       0.3%
경남71자5407         1       0.3%
경남71자5405         1       0.3%
경남71자5404         1       0.3%
경남71자5403         1       0.3%
경남71자5402         1       0.3%
경남71자5401         1       0.3%
경남71자5592         1       0.3%
경남71자5591         1       0.3%
경남71자5590         1       0.3%
Other values (287)   287     96.6%

Most occurring characters

Value              Count   Frequency (%)
5                  386     14.4%
7                  351     13.1%
1                  332     12.4%
경                 297     11.1%
남                 297     11.1%
자                 278     10.4%
8                  150     5.6%
4                  149     5.6%
6                  124     4.6%
9                  100     3.7%
Other values (4)   209     7.8%

Most occurring categories

Value            Count   Frequency (%)
Decimal Number   1782    66.7%
Other Letter     891     33.3%

Most frequent character per category

Decimal Number
Value   Count   Frequency (%)
5       386     21.7%
7       351     19.7%
1       332     18.6%
8       150     8.4%
4       149     8.4%
6       124     7.0%
9       100     5.6%
0       76      4.3%
3       58      3.3%
2       56      3.1%

Other Letter
Value   Count   Frequency (%)
경      297     33.3%
남      297     33.3%
자      278     31.2%
아      19      2.1%

Most occurring scripts

Value    Count   Frequency (%)
Common   1782    66.7%
Hangul   891     33.3%

Most frequent character per script

Common
Value   Count   Frequency (%)
5       386     21.7%
7       351     19.7%
1       332     18.6%
8       150     8.4%
4       149     8.4%
6       124     7.0%
9       100     5.6%
0       76      4.3%
3       58      3.3%
2       56      3.1%

Hangul
Value   Count   Frequency (%)
경      297     33.3%
남      297     33.3%
자      278     31.2%
아      19      2.1%

Most occurring blocks

Value    Count   Frequency (%)
ASCII    1782    66.7%
Hangul   891     33.3%

Most frequent character per block

ASCII
Value   Count   Frequency (%)
5       386     21.7%
7       351     19.7%
1       332     18.6%
8       150     8.4%
4       149     8.4%
6       124     7.0%
9       100     5.6%
0       76      4.3%
3       58      3.3%
2       56      3.1%

Hangul
Value   Count   Frequency (%)
경      297     33.3%
남      297     33.3%
자      278     31.2%
아      19      2.1%

버스ID
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 297
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 63505790
Minimum: 38105401
Maximum: 3.810067 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 38105401
5-th percentile: 38105416
Q1: 38105479
Median: 38105635
Q3: 38105847
95-th percentile: 3.8100526 × 10^8
Maximum: 3.810067 × 10^8
Range: 3.4290129 × 10^8
Interquartile range (IQR): 368

Descriptive statistics

Standard deviation: 89954165
Coefficient of variation (CV): 1.4164719
Kurtosis: 8.7466925
Mean: 63505790
Median Absolute Deviation (MAD): 185
Skewness: 3.2692257
Sum: 1.886122 × 10^10
Variance: 8.0917519 × 10^15
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
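
Several of the derived statistics for 버스ID can be cross-checked from the other reported figures: the range is maximum minus minimum, the IQR is Q3 minus Q1, the CV is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. The sketch below redoes that arithmetic; the inputs are the rounded values shown in the report (the exact maximum, 381006695, is taken from the extreme values listed for this variable), so small discrepancies in the last digits are expected.

```python
# Reported (rounded) statistics for 버스ID.
n = 297
mean = 63505790
std = 89954165
minimum, maximum = 38105401, 381006695
q1, q3 = 38105479, 38105847

value_range = maximum - minimum   # reported as 3.4290129 × 10^8
iqr = q3 - q1                     # reported as 368
cv = std / mean                   # reported as 1.4164719
variance = std ** 2               # reported as 8.0917519 × 10^15
total = mean * n                  # reported as 1.886122 × 10^10

print(value_range, iqr)
print(f"{cv:.7f} {variance:.7e} {total:.6e}")
```

The large gap between the range and the IQR reflects the minority of nine-digit IDs; the median and MAD describe the bulk of the values better than the mean and standard deviation do.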

Value                Count   Frequency (%)
38105793             1       0.3%
38105408             1       0.3%
38105406             1       0.3%
38105405             1       0.3%
38105404             1       0.3%
38105403             1       0.3%
38105402             1       0.3%
38105401             1       0.3%
38105592             1       0.3%
38105591             1       0.3%
Other values (287)   287     96.6%

Minimum 10 values

Value      Count   Frequency (%)
38105401   1       0.3%
38105402   1       0.3%
38105403   1       0.3%
38105404   1       0.3%
38105405   1       0.3%
38105406   1       0.3%
38105407   1       0.3%
38105408   1       0.3%
38105409   1       0.3%
38105410   1       0.3%

Maximum 10 values

Value       Count   Frequency (%)
381006695   1       0.3%
381006694   1       0.3%
381006693   1       0.3%
381006692   1       0.3%
381006590   1       0.3%
381006554   1       0.3%
381006526   1       0.3%
381006500   1       0.3%
381006181   1       0.3%
381006145   1       0.3%

데이터기준일자
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-07-28
2nd row: 2023-07-28
3rd row: 2023-07-28
4th row: 2023-07-28
5th row: 2023-07-28

Common Values

Value        Count   Frequency (%)
2023-07-28   297     100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values of 데이터기준일자]

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

           버스회사   정원    버스ID
버스회사   1.000      0.795   0.938
정원       0.795      1.000   0.825
버스ID     0.938      0.825   1.000

           정원    버스회사
정원       1.000   0.680
버스회사   0.680   1.000

           버스ID   버스회사   정원
버스ID     1.000    0.773      0.944
버스회사   0.773    1.000      0.680
정원       0.944    0.680      1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows

      버스회사   정원   차량번호       버스ID      데이터기준일자
0     삼성교통   48     경남71자5793   38105793    2023-07-28
1     시민버스   48     경남99자9991   38109991    2023-07-28
2     삼성교통   48     경남71자5795   38105795    2023-07-28
3     부산교통   48     경남71자5667   38105667    2023-07-28
4     삼성교통   48     경남71자5797   38105797    2023-07-28
5     삼성교통   48     경남71자5798   38105798    2023-07-28
6     삼성교통   48     경남71자5799   38105799    2023-07-28
7     삼성교통   48     경남71자5800   38105800    2023-07-28
8     삼성교통   48     경남71자5801   38105801    2023-07-28
9     삼성교통   48     경남71자5802   38105802    2023-07-28

Last rows

      버스회사   정원   차량번호       버스ID      데이터기준일자
287   경전여객   41     경남70아6693   381006693   2023-07-28
288   삼성교통   48     경남71자5496   38105496    2023-07-28
289   삼성교통   48     경남71자5497   38105497    2023-07-28
290   시민버스   48     경남71자5491   38105491    2023-07-28
291   경전여객   41     경남70아6694   381006694   2023-07-28
292   부산교통   41     경남70아5188   381005188   2023-07-28
293   경전여객   41     경남70아6695   381006695   2023-07-28
294   부일교통   48     경남71자5550   381005550   2023-07-28
295   경전여객   41     경남70아6500   381006500   2023-07-28
296   부산교통   48     경남71자5656   381005656   2023-07-28