Overview

Dataset statistics

Number of variables: 6
Number of observations: 628
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 30.8 KiB
Average record size in memory: 50.2 B
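
The average record size appears to be the total in-memory size divided by the number of observations. That reading is an assumption rather than something the report states, but the arithmetic is consistent with the two figures above. A minimal check (variable names are illustrative):

```python
# Figures copied from the dataset statistics above.
total_size_kib = 30.8   # "Total size in memory", in KiB
n_observations = 628    # "Number of observations"

# 1 KiB = 1024 bytes
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the result agrees with the reported 50.2 B.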

Variable types

Categorical: 4
Text: 1
Numeric: 1

Dataset

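Description: 광역공공버스 노선별 인원수송 운행실적 (passenger transport operating results by route for metropolitan public buses)
Author: 경기교통공사 (Gyeonggi Transportation Corporation)
URL: https://data.gg.go.kr/portal/data/service/selectServicePage.do?&infId=T74WHEPX2Z6TMUEMFFWD32988993&infSeq=1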

Alerts

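관할시군 (jurisdiction city/county) is highly overall correlated with 운송사업자명 (transport operator name) [High correlation]
운송사업자명 (transport operator name) is highly overall correlated with 관할시군 (jurisdiction city/county) [High correlation]
수송인원(명) (passengers carried, persons) has 13 (2.1%) zeros [Zeros]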

Reproduction

Analysis started: 2023-12-10 21:30:16.290883
Analysis finished: 2023-12-10 21:30:16.899877
Duration: 0.61 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

기준년도 (base year)
Categorical

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.0 KiB

Value | Count
2022 | 314
2021 | 314

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022
2nd row: 2022
3rd row: 2022
4th row: 2022
5th row: 2022

Common Values

Value | Count | Frequency (%)
2022 | 314 | 50.0%
2021 | 314 | 50.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

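노선구분 (route classification)
Categorical

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.0 KiB

Value | Count
경기도 위탁노선 (route commissioned by Gyeonggi Province) | 444
대도시권광역교통위원회 위탁노선 (route commissioned by the Metropolitan Transport Commission) | 184

Length

Max length: 16
Median length: 8
Mean length: 10.343949
Min length: 8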
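
The length statistics follow directly from the two category labels and their counts. The sketch below recomputes them, assuming that length means the number of Unicode code points in the label (which is what Python's `len` returns for these precomposed Hangul strings); the variable names are illustrative.

```python
from statistics import median

# Category labels and counts as reported for 노선구분.
counts = {
    "경기도 위탁노선": 444,
    "대도시권광역교통위원회 위탁노선": 184,
}

total_rows = sum(counts.values())

# Length of each label in code points (spaces included).
lengths = {label: len(label) for label in counts}

# Count-weighted mean of the label lengths.
mean_length = sum(len(label) * n for label, n in counts.items()) / total_rows

# Median over one length entry per row.
expanded = [len(label) for label, n in counts.items() for _ in range(n)]
median_length = median(expanded)

print(lengths)
print(total_rows, round(mean_length, 6), median_length)
```

Because 444 of the 628 rows carry the shorter label, the median stays at 8 while the mean is pulled up toward 16.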

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 경기도 위탁노선
2nd row: 경기도 위탁노선
3rd row: 경기도 위탁노선
4th row: 경기도 위탁노선
5th row: 경기도 위탁노선

Common Values

Value | Count | Frequency (%)
경기도 위탁노선 | 444 | 70.7%
대도시권광역교통위원회 위탁노선 | 184 | 29.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Counts by whitespace-separated word (each of the 628 values contributes two words):

Word | Count | Frequency (%)
위탁노선 | 628 | 50.0%
경기도 | 444 | 35.4%
대도시권광역교통위원회 | 184 | 14.6%

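관할시군 (jurisdiction city/county)
Categorical

HIGH CORRELATION

Distinct: 26
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.0 KiB

Value | Count
용인시 (Yongin-si) | 90
화성시 (Hwaseong-si) | 84
남양주시 (Namyangju-si) | 66
성남시 (Seongnam-si) | 58
김포시 (Gimpo-si) | 44
Other values (21) | 286

Length

Max length: 4
Median length: 3
Mean length: 3.1273885
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 성남시
2nd row: 광주시
3rd row: 용인시
4th row: 성남시
5th row: 성남시

Common Values

Value | Count | Frequency (%)
용인시 (Yongin-si) | 90 | 14.3%
화성시 (Hwaseong-si) | 84 | 13.4%
남양주시 (Namyangju-si) | 66 | 10.5%
성남시 (Seongnam-si) | 58 | 9.2%
김포시 (Gimpo-si) | 44 | 7.0%
수원시 (Suwon-si) | 42 | 6.7%
파주시 (Paju-si) | 40 | 6.4%
광주시 (Gwangju-si) | 28 | 4.5%
하남시 (Hanam-si) | 20 | 3.2%
포천시 (Pocheon-si) | 20 | 3.2%
Other values (16) | 136 | 21.7%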
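
The frequency column is each count divided by the 628 observations, rounded to one decimal place. The following sketch recomputes it from the counts listed above; the dictionary name and layout are illustrative only.

```python
# Counts copied from the Common Values table for 관할시군.
counts = {
    "용인시": 90, "화성시": 84, "남양주시": 66, "성남시": 58, "김포시": 44,
    "수원시": 42, "파주시": 40, "광주시": 28, "하남시": 20, "포천시": 20,
    "Other values (16)": 136,
}

total = sum(counts.values())  # should equal the number of observations

# Share of each value, as a percentage with one decimal place.
percentages = {value: round(100 * n / total, 1) for value, n in counts.items()}

for value, pct in percentages.items():
    print(f"{value}: {pct}%")
```

The counts sum to 628, which confirms that the ten listed cities plus the "Other values" bucket cover every row.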

Length

[Figure: Histogram of lengths of the category]

노선번호 (route number)
Text

Distinct: 228
Distinct (%): 36.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.0 KiB

Length

Max length: 7
Median length: 4
Mean length: 4.4235669
Min length: 2

Characters and Unicode

Total characters: 2778
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
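
As an illustration of those character properties, the sketch below classifies the characters of a few route numbers with Python's standard `unicodedata` module. The six route numbers come from the sample rows of this report; they are not the full column, so the counts are not the report's totals. The general-category codes `Nd`, `Lu` and `Pd` correspond to Decimal Number, Uppercase Letter and Dash Punctuation.

```python
import unicodedata
from collections import Counter

# A handful of route numbers taken from the sample rows of this report.
route_numbers = ["3500", "500-2", "8100", "8106", "9000", "5003A"]

# Tally the Unicode general category of every character.
category_counts = Counter(
    unicodedata.category(ch) for value in route_numbers for ch in value
)

total_chars = sum(len(value) for value in route_numbers)
distinct_chars = {ch for value in route_numbers for ch in value}

print(dict(category_counts))
print(total_chars, len(distinct_chars))
```

Run over the whole column, the same tally would be expected to give the three categories listed in the report.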

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3500
2nd row: 500-2
3rd row: 8100
4th row: 8106
5th row: 9000

Value | Count | Frequency (%)
3100 | 8 | 1.3%
g6000 | 8 | 1.3%
3002 | 6 | 1.0%
3500 | 6 | 1.0%
3400 | 6 | 1.0%
3000 | 6 | 1.0%
8000 | 6 | 1.0%
3201 | 6 | 1.0%
1100 | 6 | 1.0%
1101 | 6 | 1.0%
Other values (218) | 564 | 89.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 932 | 33.5%
1 | 480 | 17.3%
3 | 242 | 8.7%
5 | 170 | 6.1%
6 | 148 | 5.3%
2 | 148 | 5.3%
7 | 142 | 5.1%
9 | 114 | 4.1%
8 | 96 | 3.5%
G | 92 | 3.3%
Other values (5) | 214 | 7.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 2560 | 92.2%
Uppercase Letter | 148 | 5.3%
Dash Punctuation | 70 | 2.5%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 932 | 36.4%
1 | 480 | 18.8%
3 | 242 | 9.5%
5 | 170 | 6.6%
6 | 148 | 5.8%
2 | 148 | 5.8%
7 | 142 | 5.5%
9 | 114 | 4.5%
8 | 96 | 3.8%
4 | 88 | 3.4%

Uppercase Letter
Value | Count | Frequency (%)
G | 92 | 62.2%
M | 30 | 20.3%
B | 14 | 9.5%
A | 12 | 8.1%

Dash Punctuation
Value | Count | Frequency (%)
- | 70 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2630 | 94.7%
Latin | 148 | 5.3%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 932 | 35.4%
1 | 480 | 18.3%
3 | 242 | 9.2%
5 | 170 | 6.5%
6 | 148 | 5.6%
2 | 148 | 5.6%
7 | 142 | 5.4%
9 | 114 | 4.3%
8 | 96 | 3.7%
4 | 88 | 3.3%

Latin
Value | Count | Frequency (%)
G | 92 | 62.2%
M | 30 | 20.3%
B | 14 | 9.5%
A | 12 | 8.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2778 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 932 | 33.5%
1 | 480 | 17.3%
3 | 242 | 8.7%
5 | 170 | 6.1%
6 | 148 | 5.3%
2 | 148 | 5.3%
7 | 142 | 5.1%
9 | 114 | 4.1%
8 | 96 | 3.5%
G | 92 | 3.3%
Other values (5) | 214 | 7.7%

운송사업자명 (transport operator name)
Categorical

HIGH CORRELATION

Distinct: 43
Distinct (%): 6.8%
Missing: 0
Missing (%): 0.0%
Memory size: 5.0 KiB

Value | Count
대원고속 | 82
경기고속 | 56
대원버스 | 46
대원운수 | 46
김포운수 | 36
Other values (38) | 362

Length

Max length: 8
Median length: 4
Mean length: 4.2101911
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 경기고속
2nd row: 경기고속
3rd row: 경기고속
4th row: 경기고속
5th row: 경기고속

Common Values

Value | Count | Frequency (%)
대원고속 | 82 | 13.1%
경기고속 | 56 | 8.9%
대원버스 | 46 | 7.3%
대원운수 | 46 | 7.3%
김포운수 | 36 | 5.7%
화성여객 | 34 | 5.4%
경남여객 | 30 | 4.8%
경진여객운수 | 26 | 4.1%
신성교통 | 20 | 3.2%
용남고속 | 18 | 2.9%
Other values (33) | 234 | 37.3%

Length

[Figure: Histogram of lengths of the category]

수송인원(명) (passengers carried, persons)
Real number (ℝ)

ZEROS

Distinct: 488
Distinct (%): 77.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 516939.11
Minimum: 0
Maximum: 3148490
Zeros: 13
Zeros (%): 2.1%
Negative: 0
Negative (%): 0.0%
Memory size: 5.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 19527.4
Q1: 194626.75
Median: 394796
Q3: 705076.5
95-th percentile: 1478430
Maximum: 3148490
Range: 3148490
Interquartile range (IQR): 510449.75

Descriptive statistics

Standard deviation: 468611.78
Coefficient of variation (CV): 0.90651253
Kurtosis: 3.991691
Mean: 516939.11
Median Absolute Deviation (MAD): 242918
Skewness: 1.7313914
Sum: 3.2463776 × 10^8
Variance: 2.19597 × 10^11
Monotonicity: Not monotonic
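
Several of these figures are derived from one another, which gives a quick consistency check: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, the sum is the mean times the number of observations, and the IQR is Q3 minus Q1. The sketch below recomputes them from the rounded values printed in the report, so small rounding differences are expected; the variable names are illustrative.

```python
# Values copied from the report (already rounded there).
n = 628
mean = 516939.11
std = 468611.78
q1, q3 = 194626.75, 705076.5
minimum, maximum = 0, 3148490

cv = std / mean                   # coefficient of variation
variance = std ** 2               # variance from the standard deviation
total = mean * n                  # sum of all values
iqr = q3 - q1                     # interquartile range
value_range = maximum - minimum   # range

print(f"CV       = {cv:.8f}")
print(f"Variance = {variance:.5e}")
print(f"Sum      = {total:.7e}")
print(f"IQR      = {iqr}")
print(f"Range    = {value_range}")
```

A CV close to 0.9 together with a skewness of about 1.73 indicates a right-skewed distribution: most routes carry far fewer passengers than the busiest ones.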
[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value | Count | Frequency (%)
0 | 13 | 2.1%
101513 | 2 | 0.3%
94553 | 2 | 0.3%
378720 | 2 | 0.3%
938911 | 2 | 0.3%
268070 | 2 | 0.3%
798409 | 2 | 0.3%
460927 | 2 | 0.3%
346296 | 2 | 0.3%
437349 | 2 | 0.3%
Other values (478) | 597 | 95.1%

Minimum 10 values

Value | Count | Frequency (%)
0 | 13 | 2.1%
1722 | 1 | 0.2%
2159 | 1 | 0.2%
4438 | 1 | 0.2%
6841 | 1 | 0.2%
6875 | 1 | 0.2%
7027 | 1 | 0.2%
8166 | 1 | 0.2%
8295 | 1 | 0.2%
9320 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
3148490 | 1 | 0.2%
2770260 | 1 | 0.2%
2351440 | 1 | 0.2%
2317820 | 2 | 0.3%
2246884 | 2 | 0.3%
2135094 | 1 | 0.2%
1989947 | 1 | 0.2%
1949913 | 1 | 0.2%
1897536 | 2 | 0.3%
1892439 | 1 | 0.2%

Interactions

[Figure: interaction plot]

Correlations

[Figure: correlation heatmap]

Variable | 기준년도 | 노선구분 | 관할시군 | 운송사업자명 | 수송인원(명)
기준년도 | 1.000 | 0.000 | 0.000 | 0.000 | 0.146
노선구분 | 0.000 | 1.000 | 0.243 | 0.369 | 0.091
관할시군 | 0.000 | 0.243 | 1.000 | 0.993 | 0.275
운송사업자명 | 0.000 | 0.369 | 0.993 | 1.000 | 0.257
수송인원(명) | 0.146 | 0.091 | 0.275 | 0.257 | 1.000

[Figure: correlation heatmap]

Variable | 노선구분 | 운송사업자명 | 기준년도 | 관할시군
노선구분 | 1.000 | 0.298 | 0.000 | 0.189
운송사업자명 | 0.298 | 1.000 | 0.000 | 0.837
기준년도 | 0.000 | 0.000 | 1.000 | 0.000
관할시군 | 0.189 | 0.837 | 0.000 | 1.000

[Figure: correlation heatmap]

Variable | 수송인원(명) | 기준년도 | 노선구분 | 관할시군 | 운송사업자명
수송인원(명) | 1.000 | 0.111 | 0.069 | 0.101 | 0.088
기준년도 | 0.111 | 1.000 | 0.000 | 0.000 | 0.000
노선구분 | 0.069 | 0.000 | 1.000 | 0.189 | 0.298
관할시군 | 0.101 | 0.000 | 0.189 | 1.000 | 0.837
운송사업자명 | 0.088 | 0.000 | 0.298 | 0.837 | 1.000
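
The extracted text does not say which association measure produced each matrix. For a pair of categorical columns such as 관할시군 and 운송사업자명, one widely used measure is Cramér's V, which is derived from the chi-squared statistic of the contingency table and ranges from 0 (no association) to 1 (one column determines the other). The sketch below is a plain-Python, uncorrected version run on hypothetical toy data, not on this dataset; the function name `cramers_v` and the toy labels are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed cell counts
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Chi-squared statistic over every cell of the contingency table.
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Hypothetical toy data: each operator serves exactly one city,
# and every city appears once per year.
cities = ["A", "A", "B", "B", "C", "C"]
operators = ["x", "x", "y", "y", "z", "z"]
years = ["2021", "2022", "2021", "2022", "2021", "2022"]

print(cramers_v(cities, operators))  # operator fully determined by city
print(cramers_v(cities, years))      # year independent of city
```

In the toy data the city/operator pair is perfectly associated and the city/year pair is independent, which mirrors the pattern in the matrices above: a value near 1 for 관할시군 against 운송사업자명 and 0.000 for 기준년도 against the other categorical columns.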

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

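Columns: 기준년도 (base year), 노선구분 (route classification), 관할시군 (jurisdiction city/county), 노선번호 (route number), 운송사업자명 (transport operator name), 수송인원(명) (passengers carried, persons).

First rows

Index | 기준년도 | 노선구분 | 관할시군 | 노선번호 | 운송사업자명 | 수송인원(명)
0 | 2022 | 경기도 위탁노선 | 성남시 | 3500 | 경기고속 | 1180814
1 | 2022 | 경기도 위탁노선 | 광주시 | 500-2 | 경기고속 | 725534
2 | 2022 | 경기도 위탁노선 | 용인시 | 8100 | 경기고속 | 2317820
3 | 2022 | 경기도 위탁노선 | 성남시 | 8106 | 경기고속 | 1363215
4 | 2022 | 경기도 위탁노선 | 성남시 | 9000 | 경기고속 | 777205
5 | 2022 | 경기도 위탁노선 | 성남시 | 9000-1 | 경기고속 | 37178
6 | 2022 | 경기도 위탁노선 | 수원시 | 1007-1 | 대원고속 | 251458
7 | 2022 | 경기도 위탁노선 | 용인시 | 1101 | 대원고속 | 348368
8 | 2022 | 경기도 위탁노선 | 수원시 | 1112 | 대원고속 | 1050548
9 | 2022 | 경기도 위탁노선 | 용인시 | 1117 | 대원고속 | 829063

Last rows

Index | 기준년도 | 노선구분 | 관할시군 | 노선번호 | 운송사업자명 | 수송인원(명)
618 | 2021 | 경기도 위탁노선 | 용인시 | 5003A | 경남여객 | 480561
619 | 2021 | 경기도 위탁노선 | 용인시 | 5003B | 경남여객 | 846340
620 | 2021 | 경기도 위탁노선 | 용인시 | 1113 | 대원고속 | 299486
621 | 2021 | 경기도 위탁노선 | 광주시 | 1113-1 | 대원고속 | 1534940
622 | 2021 | 경기도 위탁노선 | 광주시 | 1113-10 | 대원고속 | 19985
623 | 2021 | 경기도 위탁노선 | 광주시 | 1113-11 | 대원고속 | 13300
624 | 2021 | 경기도 위탁노선 | 광주시 | 1113-2 | 대원고속 | 476778
625 | 2021 | 경기도 위탁노선 | 용인시 | 1550 | 대원고속 | 446077
626 | 2021 | 경기도 위탁노선 | 용인시 | 1560 | 대원고속 | 687132
627 | 2021 | 경기도 위탁노선 | 용인시 | 1570 | 대원고속 | 417729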