Overview

Dataset statistics

Number of variables: 5
Number of observations: 5104
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 37
Duplicate rows (%): 0.7%
Total size in memory: 214.5 KiB
Average record size in memory: 43.0 B
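
The percentage and per-record figures in these statistics follow directly from the raw counts. The short Python check below recomputes them from the reported values; the variable names are illustrative and not part of the report.

```python
# Recompute the derived dataset statistics from the raw counts reported above.
n_rows = 5104        # Number of observations
n_duplicates = 37    # Duplicate rows
total_kib = 214.5    # Total size in memory, in KiB

duplicate_pct = 100 * n_duplicates / n_rows
avg_record_bytes = total_kib * 1024 / n_rows  # 1 KiB = 1024 bytes

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```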

Variable types

Numeric: 3
Text: 2

Dataset

Description: Gyeonggi-do BMS route/stop field measurement status (경기도_BMS 노선/정류소 실측 현황)
Author: Gyeonggi-do (경기도)
URL: https://data.gg.go.kr/portal/data/service/selectServicePage.do?&infId=6KBYRR0MU2RPORDO33W834395723&infSeq=1

Alerts

Dataset has 37 (0.7%) duplicate rows (Duplicates)

Reproduction

Analysis started: 2024-04-19 05:20:22.941682
Analysis finished: 2024-04-19 05:20:24.478090
Duration: 1.54 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
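
The reported duration is the difference between the two timestamps. As a small sanity check, the sketch below parses both with the standard library and recomputes it; the format string is an assumption based on how the timestamps are printed here.

```python
from datetime import datetime

# Timestamp layout as printed in the report (assumed format string).
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-04-19 05:20:22.941682", FMT)
finished = datetime.strptime("2024-04-19 05:20:24.478090", FMT)

# Elapsed wall-clock time of the analysis, in seconds.
duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```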

Variables

이력ID (history ID)
Real number (ℝ)

Distinct: 57
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.0000019 × 10^9
Minimum: 1 × 10^9
Maximum: 1.0000033 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 45.0 KiB

Quantile statistics

Minimum: 1 × 10^9
5-th percentile: 1 × 10^9
Q1: 1.0000015 × 10^9
Median: 1.000002 × 10^9
Q3: 1.0000025 × 10^9
95-th percentile: 1.0000031 × 10^9
Maximum: 1.0000033 × 10^9
Range: 3321
Interquartile range (IQR): 1063
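
Range is the maximum minus the minimum, and the IQR is Q3 minus Q1. Because the quartiles above are printed to eight significant figures, the reported IQR of 1063 cannot be recovered from the displayed Q1 and Q3. The sketch below therefore uses a hypothetical five-value sample (IDs that appear in this report's frequency tables, not the full column) and assumes linear interpolation via `statistics.quantiles(..., method="inclusive")`; the report does not state which interpolation it uses.

```python
import statistics

# Hypothetical toy sample: five 이력ID values taken from the tables in this report.
sample = [1000000000, 1000001463, 1000002017, 1000002941, 1000003321]

# Range = maximum - minimum.
value_range = max(sample) - min(sample)

# Quartiles with linear interpolation over the observed values (assumed method).
q1, median, q3 = statistics.quantiles(sample, n=4, method="inclusive")
iqr = q3 - q1

print("Range:", value_range)
print("IQR:", iqr)
```

For the full column, the same range formula gives 1000003321 - 1000000000 = 3321, which matches the reported value.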

Descriptive statistics

Standard deviation: 858.80181
Coefficient of variation (CV): 8.5880015 × 10^-7
Kurtosis: 0.11530357
Mean: 1.0000019 × 10^9
Median Absolute Deviation (MAD): 554
Skewness: -0.69418759
Sum: 5.1040098 × 10^12
Variance: 737540.54
Monotonicity: Not monotonic
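
Several of these descriptive statistics are tied together by simple identities: variance is the square of the standard deviation, CV is the standard deviation divided by the mean, and the sum is the mean times the number of observations. The check below recomputes them from the rounded values printed in this report, so agreement is only expected up to rounding.

```python
import math

n = 5104               # number of observations
mean = 1.0000019e9     # reported mean (rounded)
std = 858.80181        # reported standard deviation

variance = std ** 2    # variance = std^2
cv = std / mean        # coefficient of variation = std / mean
total = mean * n       # sum = mean * n

print(f"Variance: {variance:.2f}")
print(f"CV: {cv:.7e}")
print(f"Sum: {total:.7e}")

# Compare against the reported figures, allowing for display rounding.
print(math.isclose(variance, 737540.54, rel_tol=1e-6))
print(math.isclose(cv, 8.5880015e-7, rel_tol=1e-6))
print(math.isclose(total, 5.1040098e12, rel_tol=1e-6))
```
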
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
1000000000 | 407 | 8.0%
1000001463 | 170 | 3.3%
1000002017 | 167 | 3.3%
1000002941 | 156 | 3.1%
1000002942 | 154 | 3.0%
1000001453 | 149 | 2.9%
1000002018 | 142 | 2.8%
1000002931 | 135 | 2.6%
1000001471 | 133 | 2.6%
1000002301 | 129 | 2.5%
Other values (47) | 3362 | 65.9%
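
The "Other values" row is the remainder after the ten most frequent values, and each percentage is the count divided by the 5104 observations. The sketch below rechecks this for the 이력ID frequencies listed above; the variable names are illustrative.

```python
n_rows = 5104
# Counts of the ten most frequent 이력ID values, as listed in the report.
top10_counts = [407, 170, 167, 156, 154, 149, 142, 135, 133, 129]

# Everything not covered by the top ten falls into "Other values".
other_count = n_rows - sum(top10_counts)
other_pct = 100 * other_count / n_rows
top_pct = 100 * top10_counts[0] / n_rows

print(f"Other values: {other_count} ({other_pct:.1f}%)")
print(f"Most frequent value: {top10_counts[0]} ({top_pct:.1f}%)")
```
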
Minimum 10 values
Value | Count | Frequency (%)
1000000000 | 407 | 8.0%
1000000048 | 96 | 1.9%
1000001063 | 13 | 0.3%
1000001124 | 60 | 1.2%
1000001161 | 101 | 2.0%
1000001289 | 59 | 1.2%
1000001336 | 97 | 1.9%
1000001384 | 64 | 1.3%
1000001401 | 61 | 1.2%
1000001416 | 77 | 1.5%

Maximum 10 values
Value | Count | Frequency (%)
1000003321 | 115 | 2.3%
1000003155 | 110 | 2.2%
1000003081 | 73 | 1.4%
1000002942 | 154 | 3.0%
1000002941 | 156 | 3.1%
1000002938 | 57 | 1.1%
1000002931 | 135 | 2.6%
1000002930 | 81 | 1.6%
1000002900 | 126 | 2.5%
1000002854 | 122 | 2.4%

노선ID (route ID)
Real number (ℝ)

Distinct: 60
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.2144187 × 10^8
Minimum: 2.0000014 × 10^8
Maximum: 2.3400034 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 45.0 KiB

Quantile statistics

Minimum: 2.0000014 × 10^8
5-th percentile: 2.0700003 × 10^8
Q1: 2.1500001 × 10^8
Median: 2.2200002 × 10^8
Q3: 2.2900003 × 10^8
95-th percentile: 2.3400003 × 10^8
Maximum: 2.3400034 × 10^8
Range: 34000201
Interquartile range (IQR): 14000022

Descriptive statistics

Standard deviation: 9366625.8
Coefficient of variation (CV): 0.042298351
Kurtosis: -0.90348203
Mean: 2.2144187 × 10^8
Median Absolute Deviation (MAD): 7000003
Skewness: -0.39604431
Sum: 1.1302393 × 10^12
Variance: 8.7733679 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
234000033 | 197 | 3.9%
234000026 | 170 | 3.3%
217000006 | 167 | 3.3%
207000027 | 156 | 3.1%
207000028 | 154 | 3.0%
234000315 | 149 | 2.9%
222000004 | 147 | 2.9%
217000001 | 142 | 2.8%
207000026 | 135 | 2.6%
222000014 | 133 | 2.6%
Other values (50) | 3554 | 69.6%
Minimum 10 values
Value | Count | Frequency (%)
200000143 | 72 | 1.4%
204000026 | 61 | 1.2%
204000030 | 25 | 0.5%
207000026 | 135 | 2.6%
207000027 | 156 | 3.1%
207000028 | 154 | 3.0%
207000033 | 81 | 1.6%
207000048 | 57 | 1.1%
207000054 | 97 | 1.9%
207000055 | 101 | 2.0%

Maximum 10 values
Value | Count | Frequency (%)
234000344 | 59 | 1.2%
234000315 | 149 | 2.9%
234000033 | 197 | 3.9%
234000026 | 170 | 3.3%
234000015 | 81 | 1.6%
234000007 | 13 | 0.3%
232000004 | 110 | 2.2%
232000001 | 84 | 1.6%
231000101 | 71 | 1.4%
231000099 | 60 | 1.2%

정류소ID (stop ID)
Real number (ℝ)

Distinct: 3413
Distinct (%): 66.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.1085642 × 10^8
Minimum: 1 × 10^8
Maximum: 2.3800028 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 45.0 KiB

Quantile statistics

Minimum: 1 × 10^8
5-th percentile: 1.11 × 10^8
Q1: 2.1000028 × 10^8
Median: 2.1900041 × 10^8
Q3: 2.2800085 × 10^8
95-th percentile: 2.3500031 × 10^8
Maximum: 2.3800028 × 10^8
Range: 1.3800027 × 10^8
Interquartile range (IQR): 18000562

Descriptive statistics

Standard deviation: 32827267
Coefficient of variation (CV): 0.15568541
Kurtosis: 4.9298369
Mean: 2.1085642 × 10^8
Median Absolute Deviation (MAD): 9000338.5
Skewness: -2.478621
Sum: 1.0762111 × 10^12
Variance: 1.0776295 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
207000027 | 5 | 0.1%
228001084 | 5 | 0.1%
215000024 | 5 | 0.1%
219000193 | 5 | 0.1%
228001552 | 5 | 0.1%
228001143 | 5 | 0.1%
207000152 | 5 | 0.1%
219000190 | 5 | 0.1%
219000192 | 5 | 0.1%
235000242 | 5 | 0.1%
Other values (3403) | 5054 | 99.0%
Minimum 10 values
Value | Count | Frequency (%)
100000005 | 1 | < 0.1%
100000034 | 1 | < 0.1%
100000088 | 2 | < 0.1%
100000097 | 2 | < 0.1%
100000169 | 2 | < 0.1%
100000174 | 1 | < 0.1%
101000001 | 2 | < 0.1%
101000002 | 2 | < 0.1%
101000005 | 2 | < 0.1%
101000007 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
238000277 | 2 | < 0.1%
238000271 | 2 | < 0.1%
238000270 | 2 | < 0.1%
238000269 | 2 | < 0.1%
238000268 | 2 | < 0.1%
238000267 | 2 | < 0.1%
238000266 | 2 | < 0.1%
238000265 | 2 | < 0.1%
238000264 | 2 | < 0.1%
238000263 | 2 | < 0.1%

노선명 (route name)
Text

Distinct: 59
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 40.0 KiB

Length

Max length: 9
Median length: 8
Mean length: 3.2893809
Min length: 1

Characters and Unicode

Total characters: 16789
Distinct characters: 20
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
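
As an illustration of these character properties, the sketch below uses Python's standard `unicodedata` module to look up the general category of each character in the sample route name `30-1(지게동)`. The two-letter codes map to the category names used in this report (`Nd` Decimal Number, `Pd` Dash Punctuation, `Ps` Open Punctuation, `Pe` Close Punctuation, `Lo` Other Letter). The standard library does not expose the script or block properties, so only categories are shown; this is not necessarily how the profiling tool computes them.

```python
import unicodedata
from collections import Counter

# Sample value of the route-name column, taken from the report's sample rows.
sample = "30-1(지게동)"

# General category of every character, e.g. "Nd" for digits, "Lo" for Hangul syllables.
category_counts = Counter(unicodedata.category(ch) for ch in sample)

for ch in sample:
    print(ch, unicodedata.category(ch), unicodedata.name(ch, "?"))

print(dict(category_counts))
```

This single value already contains all five categories reported for the column.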

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 30-1(지게동)
2nd row: 30-1(지게동)
3rd row: 30-1(지게동)
4th row: 30-1(지게동)
5th row: 30-1(지게동)

Value | Count | Frequency (%)
11-1 | 197 | 3.9%
720-2 | 170 | 3.3%
61 | 167 | 3.3%
36 | 156 | 3.1%
37 | 154 | 3.0%
9001 | 149 | 2.9%
202 | 147 | 2.9%
1 | 142 | 2.8%
73 | 137 | 2.7%
133 | 135 | 2.6%
Other values (49) | 3550 | 69.6%

Most occurring characters

Value | Count | Frequency (%)
1 | 2886 | 17.2%
0 | 2622 | 15.6%
3 | 1935 | 11.5%
- | 1836 | 10.9%
2 | 1606 | 9.6%
7 | 1450 | 8.6%
8 | 960 | 5.7%
5 | 847 | 5.0%
6 | 835 | 5.0%
9 | 545 | 3.2%
Other values (10) | 1267 | 7.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 13724 | 81.7%
Dash Punctuation | 1836 | 10.9%
Other Letter | 665 | 4.0%
Close Punctuation | 282 | 1.7%
Open Punctuation | 282 | 1.7%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 2886 | 21.0%
0 | 2622 | 19.1%
3 | 1935 | 14.1%
2 | 1606 | 11.7%
7 | 1450 | 10.6%
8 | 960 | 7.0%
5 | 847 | 6.2%
6 | 835 | 6.1%
9 | 545 | 4.0%
4 | 38 | 0.3%

Other Letter
Value | Count | Frequency (%)
(missing) | 101 | 15.2%
(missing) | 101 | 15.2%
(missing) | 101 | 15.2%
(missing) | 97 | 14.6%
(missing) | 97 | 14.6%
(missing) | 84 | 12.6%
(missing) | 84 | 12.6%

Dash Punctuation
Value | Count | Frequency (%)
- | 1836 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 282 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 282 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 16124 | 96.0%
Hangul | 665 | 4.0%

Most frequent character per script

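Common
Value | Count | Frequency (%)
1 | 2886 | 17.9%
0 | 2622 | 16.3%
3 | 1935 | 12.0%
- | 1836 | 11.4%
2 | 1606 | 10.0%
7 | 1450 | 9.0%
8 | 960 | 6.0%
5 | 847 | 5.3%
6 | 835 | 5.2%
9 | 545 | 3.4%
Other values (3) | 602 | 3.7%

Hangul
Value | Count | Frequency (%)
(missing) | 101 | 15.2%
(missing) | 101 | 15.2%
(missing) | 101 | 15.2%
(missing) | 97 | 14.6%
(missing) | 97 | 14.6%
(missing) | 84 | 12.6%
(missing) | 84 | 12.6%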

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 16124 | 96.0%
Hangul | 665 | 4.0%

Most frequent character per block

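ASCII
Value | Count | Frequency (%)
1 | 2886 | 17.9%
0 | 2622 | 16.3%
3 | 1935 | 12.0%
- | 1836 | 11.4%
2 | 1606 | 10.0%
7 | 1450 | 9.0%
8 | 960 | 6.0%
5 | 847 | 5.3%
6 | 835 | 5.2%
9 | 545 | 3.4%
Other values (3) | 602 | 3.7%

Hangul
Value | Count | Frequency (%)
(missing) | 101 | 15.2%
(missing) | 101 | 15.2%
(missing) | 101 | 15.2%
(missing) | 97 | 14.6%
(missing) | 97 | 14.6%
(missing) | 84 | 12.6%
(missing) | 84 | 12.6%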

정류소명 (stop name)
Text

Distinct: 2678
Distinct (%): 52.5%
Missing: 0
Missing (%): 0.0%
Memory size: 40.0 KiB

Length

Max length: 25
Median length: 22
Mean length: 6.1081505
Min length: 2

Characters and Unicode

Total characters: 31176
Distinct characters: 516
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1404
Unique (%): 27.5%

Sample

1st row: 주공아파트
2nd row: 덕정고등학교
3rd row: 조은마을(주공6단지)
4th row: 서재말
5th row: 회암2동.회암편의점

Value | Count | Frequency (%)
부대앞 | 20 | 0.4%
주공5단지 | 15 | 0.3%
녹양역 | 11 | 0.2%
현대아파트 | 11 | 0.2%
원각사 | 11 | 0.2%
양주시청 | 10 | 0.2%
일산동구청 | 10 | 0.2%
연신내역 | 9 | 0.2%
대화역 | 9 | 0.2%
주택앞 | 9 | 0.2%
Other values (2665) | 4990 | 97.7%

Most occurring characters

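Value | Count | Frequency (%)
. | 945 | 3.0%
(missing) | 835 | 2.7%
(missing) | 762 | 2.4%
(missing) | 726 | 2.3%
(missing) | 720 | 2.3%
(missing) | 717 | 2.3%
(missing) | 671 | 2.2%
(missing) | 636 | 2.0%
(missing) | 578 | 1.9%
(missing) | 556 | 1.8%
Other values (506) | 24030 | 77.1%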

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 29028 | 93.1%
Other Punctuation | 945 | 3.0%
Decimal Number | 662 | 2.1%
Uppercase Letter | 247 | 0.8%
Close Punctuation | 144 | 0.5%
Open Punctuation | 136 | 0.4%
Dash Punctuation | 9 | < 0.1%
Lowercase Letter | 4 | < 0.1%
Space Separator | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(missing) | 835 | 2.9%
(missing) | 762 | 2.6%
(missing) | 726 | 2.5%
(missing) | 720 | 2.5%
(missing) | 717 | 2.5%
(missing) | 671 | 2.3%
(missing) | 636 | 2.2%
(missing) | 578 | 2.0%
(missing) | 556 | 1.9%
(missing) | 551 | 1.9%
Other values (474) | 22276 | 76.7%

Uppercase Letter
Value | Count | Frequency (%)
A | 106 | 42.9%
S | 39 | 15.8%
G | 36 | 14.6%
L | 17 | 6.9%
K | 14 | 5.7%
C | 11 | 4.5%
B | 8 | 3.2%
T | 6 | 2.4%
E | 4 | 1.6%
D | 2 | 0.8%
Other values (3) | 4 | 1.6%

Decimal Number
Value | Count | Frequency (%)
2 | 167 | 25.2%
1 | 162 | 24.5%
3 | 97 | 14.7%
5 | 56 | 8.5%
4 | 51 | 7.7%
7 | 42 | 6.3%
6 | 30 | 4.5%
8 | 24 | 3.6%
0 | 23 | 3.5%
9 | 10 | 1.5%

Lowercase Letter
Value | Count | Frequency (%)
s | 1 | 25.0%
k | 1 | 25.0%
l | 1 | 25.0%
g | 1 | 25.0%

Other Punctuation
Value | Count | Frequency (%)
. | 945 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 144 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 136 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 9 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 29028 | 93.1%
Common | 1897 | 6.1%
Latin | 251 | 0.8%

Most frequent character per script

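Hangul
Value | Count | Frequency (%)
(missing) | 835 | 2.9%
(missing) | 762 | 2.6%
(missing) | 726 | 2.5%
(missing) | 720 | 2.5%
(missing) | 717 | 2.5%
(missing) | 671 | 2.3%
(missing) | 636 | 2.2%
(missing) | 578 | 2.0%
(missing) | 556 | 1.9%
(missing) | 551 | 1.9%
Other values (474) | 22276 | 76.7%

Latin
Value | Count | Frequency (%)
A | 106 | 42.2%
S | 39 | 15.5%
G | 36 | 14.3%
L | 17 | 6.8%
K | 14 | 5.6%
C | 11 | 4.4%
B | 8 | 3.2%
T | 6 | 2.4%
E | 4 | 1.6%
D | 2 | 0.8%
Other values (7) | 8 | 3.2%

Common
Value | Count | Frequency (%)
. | 945 | 49.8%
2 | 167 | 8.8%
1 | 162 | 8.5%
) | 144 | 7.6%
( | 136 | 7.2%
3 | 97 | 5.1%
5 | 56 | 3.0%
4 | 51 | 2.7%
7 | 42 | 2.2%
6 | 30 | 1.6%
Other values (5) | 67 | 3.5%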

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 29028 | 93.1%
ASCII | 2148 | 6.9%

Most frequent character per block

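ASCII
Value | Count | Frequency (%)
. | 945 | 44.0%
2 | 167 | 7.8%
1 | 162 | 7.5%
) | 144 | 6.7%
( | 136 | 6.3%
A | 106 | 4.9%
3 | 97 | 4.5%
5 | 56 | 2.6%
4 | 51 | 2.4%
7 | 42 | 2.0%
Other values (22) | 242 | 11.3%

Hangul
Value | Count | Frequency (%)
(missing) | 835 | 2.9%
(missing) | 762 | 2.6%
(missing) | 726 | 2.5%
(missing) | 720 | 2.5%
(missing) | 717 | 2.5%
(missing) | 671 | 2.3%
(missing) | 636 | 2.2%
(missing) | 578 | 2.0%
(missing) | 556 | 1.9%
(missing) | 551 | 1.9%
Other values (474) | 22276 | 76.7%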

Interactions

[Figures: 9 interaction plots]

Correlations

Variable | 이력ID | 노선ID | 정류소ID | 노선명
이력ID | 1.000 | 0.875 | 0.412 | 0.999
노선ID | 0.875 | 1.000 | 0.549 | 1.000
정류소ID | 0.412 | 0.549 | 1.000 | 0.857
노선명 | 0.999 | 1.000 | 0.857 | 1.000

Variable | 이력ID | 노선ID | 정류소ID
이력ID | 1.000 | -0.313 | -0.091
노선ID | -0.313 | 1.000 | 0.145
정류소ID | -0.091 | 0.145 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
Row | 이력ID | 노선ID | 정류소ID | 노선명 | 정류소명
0 | 1000001161 | 207000055 | 235000099 | 30-1(지게동) | 주공아파트
1 | 1000001161 | 207000055 | 235000089 | 30-1(지게동) | 덕정고등학교
2 | 1000001161 | 207000055 | 235000098 | 30-1(지게동) | 조은마을(주공6단지)
3 | 1000001161 | 207000055 | 235000097 | 30-1(지게동) | 서재말
4 | 1000001161 | 207000055 | 235000601 | 30-1(지게동) | 회암2동.회암편의점
5 | 1000001161 | 207000055 | 235000521 | 30-1(지게동) | 율정삼거리
6 | 1000001161 | 207000055 | 235000087 | 30-1(지게동) | 천보초등학교앞
7 | 1000001161 | 207000055 | 235000086 | 30-1(지게동) | 모정동
8 | 1000001161 | 207000055 | 235000085 | 30-1(지게동) | 귀율동
9 | 1000001161 | 207000055 | 235000084 | 30-1(지게동) | 기우리다리앞

Last rows
Row | 이력ID | 노선ID | 정류소ID | 노선명 | 정류소명
5094 | 1000002941 | 207000027 | 215000049 | 36 | 지방산업단지앞
5095 | 1000002941 | 207000027 | 215000048 | 36 | 성보주택
5096 | 1000002941 | 207000027 | 215000047 | 36 | 소요산역앞
5097 | 1000002941 | 207000027 | 215000217 | 36 | 소요산차고지
5098 | 1000002942 | 207000028 | 207000061 | 37 | 다락원앞
5099 | 1000001495 | 234000033 | 124000021 | 11-1 | 둔촌2동사무소.보훈병원
5100 | 1000001495 | 234000033 | 124000023 | 11-1 | 강동성심병원.길동사거리
5101 | 1000001495 | 234000033 | 124000025 | 11-1 | 강동전철역.강동성심병원
5102 | 1000001495 | 234000033 | 124000027 | 11-1 | 천호동.현대백화점.이마트
5103 | 1000001495 | 234000033 | 104000080 | 11-1 | 워커힐아파트.워커힐호텔.한강웨딩홀

Duplicate rows

Most frequently occurring

Row | 이력ID | 노선ID | 정류소ID | 노선명 | 정류소명 | # duplicates
0 | 1000000000 | 231000005 | 231000521 | 370 | 죽산시외버스터미널 | 2
1 | 1000000000 | 231000101 | 214001295 | 7-6 | 은산1리 | 2
2 | 1000000000 | 231000101 | 214001296 | 7-6 | 산하리 | 2
3 | 1000000000 | 231000101 | 231000076 | 7-6 | 양성터미널 | 2
4 | 1000000000 | 231000101 | 231001134 | 7-6 | 산하리(평동) | 2
5 | 1000000000 | 231000101 | 231001224 | 7-6 | 산하리삼거리 | 2
6 | 1000001124 | 229000034 | 229000849 | 700 | 트리플메디컬타운 | 2
7 | 1000001453 | 234000315 | 100000169 | 9001 | 조계사.인사동 | 2
8 | 1000001453 | 234000315 | 206000174 | 9001 | 오리초등학교 | 2
9 | 1000001453 | 234000315 | 206000175 | 9001 | 미금초등학교 | 2
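
A duplicate-row listing like this one can be approximated by counting identical rows. The sketch below does so with `collections.Counter` on a small hypothetical sample built from rows shown in this report; the duplicated entries are added by hand for illustration. How the profiling tool itself arrives at its figure of 37 is not stated here, so this only demonstrates the general idea of counting distinct rows that occur more than once.

```python
from collections import Counter

# Hypothetical sample: (이력ID, 노선ID, 정류소ID, 노선명, 정류소명) tuples.
rows = [
    (1000000000, 231000005, 231000521, "370", "죽산시외버스터미널"),
    (1000000000, 231000005, 231000521, "370", "죽산시외버스터미널"),
    (1000001453, 234000315, 100000169, "9001", "조계사.인사동"),
    (1000001453, 234000315, 100000169, "9001", "조계사.인사동"),
    (1000001161, 207000055, 235000099, "30-1(지게동)", "주공아파트"),
]

# Count how often each full row occurs, then keep rows seen more than once.
row_counts = Counter(rows)
duplicates = {row: count for row, count in row_counts.items() if count > 1}

for row, count in duplicates.items():
    print(row, "# duplicates:", count)
```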