Overview

Dataset statistics

Number of variables: 7
Number of observations: 314
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 18.2 KiB
Average record size in memory: 59.4 B
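
The average record size can be cross-checked against the total size and the number of observations. The sketch below assumes 1 KiB = 1024 bytes and that the report rounds to one decimal place; the function name is illustrative only.

```python
def average_record_size(total_kib: float, n_rows: int) -> float:
    """Return the average number of bytes per row."""
    return total_kib * 1024 / n_rows


avg = average_record_size(18.2, 314)
print(f"{avg:.1f} B")  # 59.4 B
```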

Variable types

Categorical: 3
Text: 1
Numeric: 3

Dataset

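Description: Data showing the annual number of passengers (boardings) carried on each metropolitan bus route under Gyeonggi Province's quasi-public bus operation system. ※ Data reference date: 2022.12.
URL: https://www.data.go.kr/data/15111293/fileData.do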

Alerts

운행개시일 is highly overall correlated with 노선구분 (High correlation)
2021년 수송인원(명) is highly overall correlated with 2022년 수송인원(명) (High correlation)
2022년 수송인원(명) is highly overall correlated with 2021년 수송인원(명) (High correlation)
노선구분 is highly overall correlated with 운행개시일 (High correlation)
관할시군 is highly overall correlated with 운송사업자명 (High correlation)
운송사업자명 is highly overall correlated with 관할시군 (High correlation)
2021년 수송인원(명) has 10 (3.2%) zeros (Zeros)

Reproduction

Analysis started: 2023-12-12 09:50:13.696132
Analysis finished: 2023-12-12 09:50:15.307877
Duration: 1.61 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
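
A report of this kind can be regenerated along the following lines. This is a minimal sketch, not the exact command used for this run: the CSV file name, the `cp949` encoding and the report title are assumptions, and the configuration stored in config.json is not reproduced here.

```python
from pathlib import Path


def build_report(csv_path: str, out_path: str = "report.html",
                 encoding: str = "cp949") -> Path:
    """Profile a CSV file and write the HTML report to out_path."""
    # Imported lazily so this module still loads where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Overview")
    report.to_file(out_path)
    return Path(out_path)


# Example call (hypothetical file name):
# build_report("gyeonggi_bus_ridership.csv")
```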

Variables

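노선구분 (route classification)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 KiB

경기도 위탁노선 (route commissioned by Gyeonggi Province): 222
대도시권광역교통위원회 위탁노선 (route commissioned by the Metropolitan Transport Commission): 92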

Length

Max length: 16
Median length: 8
Mean length: 10.343949
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 대도시권광역교통위원회 위탁노선
2nd row: 대도시권광역교통위원회 위탁노선
3rd row: 대도시권광역교통위원회 위탁노선
4th row: 대도시권광역교통위원회 위탁노선
5th row: 대도시권광역교통위원회 위탁노선

Common Values

Value | Count | Frequency (%)
경기도 위탁노선 | 222 | 70.7%
대도시권광역교통위원회 위탁노선 | 92 | 29.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
위탁노선 | 314 | 50.0%
경기도 | 222 | 35.4%
대도시권광역교통위원회 | 92 | 14.6%

관할시군 (governing city/county)
Categorical

HIGH CORRELATION

Distinct: 26
Distinct (%): 8.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 KiB

용인시: 45
화성시: 42
남양주시: 33
성남시: 29
김포시: 22
Other values (21): 143

Length

Max length: 4
Median length: 3
Mean length: 3.1273885
Min length: 3

Unique

Unique: 1
Unique (%): 0.3%

Sample

1st row: 남양주시
2nd row: 안양시
3rd row: 김포시
4th row: 김포시
5th row: 양주시

Common Values

Value | Count | Frequency (%)
용인시 | 45 | 14.3%
화성시 | 42 | 13.4%
남양주시 | 33 | 10.5%
성남시 | 29 | 9.2%
김포시 | 22 | 7.0%
수원시 | 21 | 6.7%
파주시 | 20 | 6.4%
광주시 | 14 | 4.5%
포천시 | 10 | 3.2%
하남시 | 10 | 3.2%
Other values (16) | 68 | 21.7%

Length

Histogram of lengths of the category

노선번호 (route number)
Text

Distinct: 225
Distinct (%): 71.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 KiB

Length

Max length: 7
Median length: 4
Mean length: 4.4235669
Min length: 2

Characters and Unicode

Total characters: 1389
Distinct characters: 21
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
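
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories over the five sample route numbers shown for this variable; the function name is illustrative, and the code is not necessarily how the profiler computes its own tables.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters by Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


sample = ["M2341", "M5333", "M6427", "3000", "1101"]
print(category_counts(sample))
```

The two-letter codes map onto the category names used in the report: `Nd` is Decimal Number, `Lu` is Uppercase Letter, `Ll` is Lowercase Letter and `Pd` is Dash Punctuation.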

Unique

Unique: 151
Unique (%): 48.1%

Sample

1st row: M2341
2nd row: M5333
3rd row: M6427
4th row: 3000
5th row: 1101

Words

Value | Count | Frequency (%)
3100 | 4 | 1.3%
g6000 | 4 | 1.3%
jan-07 | 3 | 1.0%
1100 | 3 | 1.0%
8000 | 3 | 1.0%
3201 | 3 | 1.0%
3500 | 3 | 1.0%
3002 | 3 | 1.0%
1101 | 3 | 1.0%
3000 | 3 | 1.0%
Other values (215) | 282 | 89.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 455 | 32.8%
1 | 227 | 16.3%
3 | 120 | 8.6%
5 | 81 | 5.8%
6 | 73 | 5.3%
2 | 72 | 5.2%
7 | 68 | 4.9%
9 | 51 | 3.7%
8 | 47 | 3.4%
G | 46 | 3.3%
Other values (11) | 149 | 10.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1238 | 89.1%
Uppercase Letter | 88 | 6.3%
Dash Punctuation | 35 | 2.5%
Lowercase Letter | 28 | 2.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 455 | 36.8%
1 | 227 | 18.3%
3 | 120 | 9.7%
5 | 81 | 6.5%
6 | 73 | 5.9%
2 | 72 | 5.8%
7 | 68 | 5.5%
9 | 51 | 4.1%
8 | 47 | 3.8%
4 | 44 | 3.6%

Uppercase Letter
Value | Count | Frequency (%)
G | 46 | 52.3%
M | 15 | 17.0%
J | 13 | 14.8%
B | 7 | 8.0%
A | 6 | 6.8%
F | 1 | 1.1%

Lowercase Letter
Value | Count | Frequency (%)
n | 13 | 46.4%
a | 13 | 46.4%
e | 1 | 3.6%
b | 1 | 3.6%

Dash Punctuation
Value | Count | Frequency (%)
- | 35 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1273 | 91.6%
Latin | 116 | 8.4%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 455 | 35.7%
1 | 227 | 17.8%
3 | 120 | 9.4%
5 | 81 | 6.4%
6 | 73 | 5.7%
2 | 72 | 5.7%
7 | 68 | 5.3%
9 | 51 | 4.0%
8 | 47 | 3.7%
4 | 44 | 3.5%

Latin
Value | Count | Frequency (%)
G | 46 | 39.7%
M | 15 | 12.9%
n | 13 | 11.2%
a | 13 | 11.2%
J | 13 | 11.2%
B | 7 | 6.0%
A | 6 | 5.2%
F | 1 | 0.9%
e | 1 | 0.9%
b | 1 | 0.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1389 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 455 | 32.8%
1 | 227 | 16.3%
3 | 120 | 8.6%
5 | 81 | 5.8%
6 | 73 | 5.3%
2 | 72 | 5.2%
7 | 68 | 4.9%
9 | 51 | 3.7%
8 | 47 | 3.4%
G | 46 | 3.3%
Other values (11) | 149 | 10.7%

운송사업자명 (operator name)
Categorical

HIGH CORRELATION

Distinct: 43
Distinct (%): 13.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 KiB

대원고속: 41
경기고속: 28
대원운수: 23
대원버스: 23
김포운수: 18
Other values (38): 181

Length

Max length: 8
Median length: 4
Mean length: 4.2101911
Min length: 2

Unique

Unique: 7
Unique (%): 2.2%

Sample

1st row: 대원운수
2nd row: 삼영운수
3rd row: 김포운수
4th row: 선진상운
5th row: 진명여객

Common Values

Value | Count | Frequency (%)
대원고속 | 41 | 13.1%
경기고속 | 28 | 8.9%
대원운수 | 23 | 7.3%
대원버스 | 23 | 7.3%
김포운수 | 18 | 5.7%
화성여객 | 17 | 5.4%
경남여객 | 15 | 4.8%
경진여객운수 | 13 | 4.1%
신성교통 | 10 | 3.2%
용남고속 | 9 | 2.9%
Other values (33) | 117 | 37.3%

Length

Histogram of lengths of the category

운행개시일 (service start date)
Real number (ℝ)

HIGH CORRELATION

Distinct: 59
Distinct (%): 18.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20208542
Minimum: 20200301
Maximum: 20230401
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.9 KiB

Quantile statistics

Minimum: 20200301
5-th percentile: 20200330
Q1: 20201102
median: 20210315
Q3: 20210801
95-th percentile: 20230101
Maximum: 20230401
Range: 30100
Interquartile range (IQR): 9699

Descriptive statistics

Standard deviation: 8751.1286
Coefficient of variation (CV): 0.00043304108
Kurtosis: 0.21094643
Mean: 20208542
Median Absolute Deviation (MAD): 9213
Skewness: 1.019183
Sum: 6.3454821 × 10^9
Variance: 76582252
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
20210801 | 76 | 24.2%
20201102 | 53 | 16.9%
20201101 | 27 | 8.6%
20230101 | 20 | 6.4%
20201109 | 12 | 3.8%
20221201 | 12 | 3.8%
20210901 | 12 | 3.8%
20201112 | 8 | 2.5%
20220725 | 6 | 1.9%
20210315 | 6 | 1.9%
Other values (49) | 82 | 26.1%

Minimum 10 values

Value | Count | Frequency (%)
20200301 | 5 | 1.6%
20200309 | 1 | 0.3%
20200313 | 4 | 1.3%
20200317 | 3 | 1.0%
20200319 | 1 | 0.3%
20200324 | 1 | 0.3%
20200327 | 1 | 0.3%
20200331 | 2 | 0.6%
20200401 | 2 | 0.6%
20200413 | 1 | 0.3%

Maximum 10 values

Value | Count | Frequency (%)
20230401 | 1 | 0.3%
20230101 | 20 | 6.4%
20221201 | 12 | 3.8%
20221115 | 1 | 0.3%
20221102 | 1 | 0.3%
20221101 | 3 | 1.0%
20221022 | 1 | 0.3%
20221001 | 1 | 0.3%
20220725 | 6 | 1.9%
20220620 | 1 | 0.3%
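
The values of 운행개시일 look like dates encoded as YYYYMMDD integers, which is an inference from their shape rather than something the report states. Under that assumption, statistics such as the range are computed on the integers, not on calendar time. A minimal sketch of parsing them with the standard library (the function name is illustrative):

```python
from datetime import date, datetime


def parse_yyyymmdd(value: int) -> date:
    """Convert an integer such as 20200301 to a date."""
    return datetime.strptime(str(value), "%Y%m%d").date()


first = parse_yyyymmdd(20200301)  # reported Minimum
last = parse_yyyymmdd(20230401)   # reported Maximum
print((last - first).days)  # 1126 days, versus the integer Range of 30100
```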

2021년 수송인원(명) (passengers carried in 2021, persons)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 241
Distinct (%): 76.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 461659.72
Minimum: 0
Maximum: 2770260
Zeros: 10
Zeros (%): 3.2%
Negative: 0
Negative (%): 0.0%
Memory size: 2.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 8961.25
Q1: 154677
median: 360598
Q3: 643818.25
95-th percentile: 1365153.9
Maximum: 2770260
Range: 2770260
Interquartile range (IQR): 489141.25

Descriptive statistics

Standard deviation: 428366.75
Coefficient of variation (CV): 0.92788417
Kurtosis: 3.9590739
Mean: 461659.72
Median Absolute Deviation (MAD): 239238
Skewness: 1.7134273
Sum: 1.4496115 × 10^8
Variance: 1.8349807 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 10 | 3.2%
299783 | 2 | 0.6%
889654 | 2 | 0.6%
191712 | 2 | 0.6%
178531 | 2 | 0.6%
121054 | 2 | 0.6%
186651 | 2 | 0.6%
281891 | 2 | 0.6%
154677 | 2 | 0.6%
541586 | 2 | 0.6%
Other values (231) | 286 | 91.1%

Minimum 10 values

Value | Count | Frequency (%)
0 | 10 | 3.2%
1722 | 1 | 0.3%
4438 | 1 | 0.3%
6841 | 1 | 0.3%
7027 | 1 | 0.3%
8166 | 1 | 0.3%
8295 | 1 | 0.3%
9320 | 1 | 0.3%
13300 | 1 | 0.3%
16793 | 2 | 0.6%

Maximum 10 values

Value | Count | Frequency (%)
2770260 | 1 | 0.3%
1949913 | 1 | 0.3%
1897536 | 2 | 0.6%
1843699 | 1 | 0.3%
1784809 | 2 | 0.6%
1775983 | 1 | 0.3%
1698012 | 1 | 0.3%
1598689 | 1 | 0.3%
1540574 | 2 | 0.6%
1534940 | 1 | 0.3%

2022년 수송인원(명) (passengers carried in 2022, persons)
Real number (ℝ)

HIGH CORRELATION

Distinct: 248
Distinct (%): 79.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 572218.5
Minimum: 0
Maximum: 3148490
Zeros: 3
Zeros (%): 1.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 36824.15
Q1: 228746.25
median: 440012
Q3: 787585
95-th percentile: 1685508.7
Maximum: 3148490
Range: 3148490
Interquartile range (IQR): 558838.75

Descriptive statistics

Standard deviation: 500266.3
Coefficient of variation (CV): 0.87425747
Kurtosis: 3.7170453
Mean: 572218.5
Median Absolute Deviation (MAD): 245401
Skewness: 1.6951033
Sum: 1.7967661 × 10^8
Variance: 2.5026637 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 3 | 1.0%
576063 | 2 | 0.6%
276059 | 2 | 0.6%
188391 | 2 | 0.6%
224491 | 2 | 0.6%
407715 | 2 | 0.6%
194632 | 2 | 0.6%
705282 | 2 | 0.6%
493810 | 2 | 0.6%
437349 | 2 | 0.6%
Other values (238) | 293 | 93.3%

Minimum 10 values

Value | Count | Frequency (%)
0 | 3 | 1.0%
2159 | 1 | 0.3%
6875 | 1 | 0.3%
9841 | 1 | 0.3%
10257 | 1 | 0.3%
10720 | 1 | 0.3%
14594 | 1 | 0.3%
19281 | 1 | 0.3%
21112 | 2 | 0.6%
23261 | 1 | 0.3%

Maximum 10 values

Value | Count | Frequency (%)
3148490 | 1 | 0.3%
2351440 | 1 | 0.3%
2317820 | 2 | 0.6%
2246884 | 2 | 0.6%
2135094 | 1 | 0.3%
1989947 | 1 | 0.3%
1892439 | 1 | 0.3%
1877103 | 1 | 0.3%
1826767 | 2 | 0.6%
1784428 | 1 | 0.3%
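
Several of the derived statistics reported for the two passenger-count variables can be recomputed from the other reported values: CV = standard deviation / mean, IQR = Q3 - Q1, and variance = standard deviation squared. The sketch below performs that cross-check on the numbers copied from the report, reading the variances as multiples of 10^11; the dictionary layout and function name are illustrative only.

```python
import math

# Values copied from the report for the two passenger-count variables.
reported = {
    "2021": {"mean": 461659.72, "std": 428366.75, "q1": 154677.0, "q3": 643818.25,
             "cv": 0.92788417, "iqr": 489141.25, "variance": 1.8349807e11},
    "2022": {"mean": 572218.5, "std": 500266.3, "q1": 228746.25, "q3": 787585.0,
             "cv": 0.87425747, "iqr": 558838.75, "variance": 2.5026637e11},
}


def derived(stats):
    """Recompute CV, IQR and variance from mean, std and quartiles."""
    return {
        "cv": stats["std"] / stats["mean"],
        "iqr": stats["q3"] - stats["q1"],
        "variance": stats["std"] ** 2,
    }


for year, stats in reported.items():
    for name, value in derived(stats).items():
        ok = math.isclose(value, stats[name], rel_tol=1e-5)
        print(year, name, value, "matches" if ok else "differs")
```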

Interactions

[Figure: 9 interaction plots (images not preserved)]

Correlations

Variable | 노선구분 | 관할시군 | 운송사업자명 | 운행개시일 | 2021년 수송인원(명) | 2022년 수송인원(명)
노선구분 | 1.000 | 0.000 | 0.191 | 0.896 | 0.000 | 0.000
관할시군 | 0.000 | 1.000 | 0.992 | 0.626 | 0.000 | 0.221
운송사업자명 | 0.191 | 0.992 | 1.000 | 0.706 | 0.000 | 0.000
운행개시일 | 0.896 | 0.626 | 0.706 | 1.000 | 0.000 | 0.138
2021년 수송인원(명) | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.986
2022년 수송인원(명) | 0.000 | 0.221 | 0.000 | 0.138 | 0.986 | 1.000

Variable | 운송사업자명 | 관할시군 | 노선구분
운송사업자명 | 1.000 | 0.811 | 0.147
관할시군 | 0.811 | 1.000 | 0.000
노선구분 | 0.147 | 0.000 | 1.000

Variable | 운행개시일 | 2021년 수송인원(명) | 2022년 수송인원(명) | 노선구분 | 관할시군 | 운송사업자명
운행개시일 | 1.000 | -0.001 | 0.012 | 0.697 | 0.361 | 0.403
2021년 수송인원(명) | -0.001 | 1.000 | 0.963 | 0.000 | 0.000 | 0.000
2022년 수송인원(명) | 0.012 | 0.963 | 1.000 | 0.000 | 0.081 | 0.000
노선구분 | 0.697 | 0.000 | 0.000 | 1.000 | 0.000 | 0.147
관할시군 | 0.361 | 0.000 | 0.081 | 0.000 | 1.000 | 0.811
운송사업자명 | 0.403 | 0.000 | 0.000 | 0.147 | 0.811 | 1.000
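
To relate the first correlation matrix to the High correlation alerts, one can list the variable pairs whose coefficient exceeds a cutoff. The sketch below is only an illustration: the cutoff of 0.8 is an assumption chosen for this example, not a documented ydata-profiling setting, and the function name is hypothetical.

```python
from itertools import combinations

columns = ["노선구분", "관할시군", "운송사업자명", "운행개시일",
           "2021년 수송인원(명)", "2022년 수송인원(명)"]
# First correlation matrix from the report, row by row.
matrix = [
    [1.000, 0.000, 0.191, 0.896, 0.000, 0.000],
    [0.000, 1.000, 0.992, 0.626, 0.000, 0.221],
    [0.191, 0.992, 1.000, 0.706, 0.000, 0.000],
    [0.896, 0.626, 0.706, 1.000, 0.000, 0.138],
    [0.000, 0.000, 0.000, 0.000, 1.000, 0.986],
    [0.000, 0.221, 0.000, 0.138, 0.986, 1.000],
]


def high_pairs(names, values, threshold):
    """Return (name_a, name_b, value) for each pair with |value| >= threshold."""
    return [
        (names[i], names[j], values[i][j])
        for i, j in combinations(range(len(names)), 2)
        if abs(values[i][j]) >= threshold
    ]


THRESHOLD = 0.8  # illustrative assumption, not a documented tool default
for a, b, v in high_pairs(columns, matrix, THRESHOLD):
    print(f"{a} ~ {b}: {v:.3f}")
```

With this cutoff the three pairs returned are the same pairs named in the Alerts section.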

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

노선구분관할시군노선번호운송사업자명운행개시일2021년 수송인원(명)2022년 수송인원(명)
0대도시권광역교통위원회 위탁노선남양주시M2341대원운수20201124326668364825
1대도시권광역교통위원회 위탁노선안양시M5333삼영운수20201126390167506500
2대도시권광역교통위원회 위탁노선김포시M6427김포운수20201201424743407541
3대도시권광역교통위원회 위탁노선김포시3000선진상운2022102213195861784129
4대도시권광역교통위원회 위탁노선양주시1101진명여객2021111820355277612
5대도시권광역교통위원회 위탁노선광명시3002화영운수20211129453613578835
6대도시권광역교통위원회 위탁노선시흥시6501경원여객202112156841284148
7대도시권광역교통위원회 위탁노선용인시4101경남여객202112244438421427
8대도시권광역교통위원회 위탁노선이천시3401동부고속202203310242196
9대도시권광역교통위원회 위탁노선광주시3302이천시내버스202205160137559
노선구분관할시군노선번호운송사업자명운행개시일2021년 수송인원(명)2022년 수송인원(명)
304경기도 위탁노선용인시5005경남여객2021090111618091427603
305경기도 위탁노선용인시5600경남여객202109019865991299141
306경기도 위탁노선용인시5700A경남여객20210901119802150500
307경기도 위탁노선용인시5700B경남여객20210901199884288464
308경기도 위탁노선화성시1006경진여객운수2021090183216123139
309경기도 위탁노선화성시7200경진여객운수20210901248743323081
310경기도 위탁노선평택시6600대원고속20210901284854390947
311경기도 위탁노선파주시9709A신일여객202109013141320
312경기도 위탁노선광명시3001화영운수20210901136352501443
313경기도 위탁노선성남시8109경기고속20211101225797292890