Dataset statistics
Number of variables | 3 |
---|---|
Number of observations | 4492 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 114.2 KiB |
Average record size in memory | 26.0 B |
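The last two rows of this table are related by simple arithmetic. A minimal sketch, assuming the average record size is the total memory divided by the number of observations (the constant names are illustrative, not part of the report):

```python
# Figures copied from the "Dataset statistics" table.
TOTAL_SIZE_KIB = 114.2
N_OBSERVATIONS = 4492
N_VARIABLES = 3

total_bytes = TOTAL_SIZE_KIB * 1024              # KiB -> bytes
avg_record_bytes = total_bytes / N_OBSERVATIONS  # bytes per row
total_cells = N_OBSERVATIONS * N_VARIABLES       # denominator for "Missing cells (%)"

print(f"average record size: {avg_record_bytes:.1f} B")
print(f"total cells: {total_cells}")
```

Under that assumption the result rounds to the 26.0 B shown in the table.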
Variable types
Numeric | 2 |
---|---|
Text | 1 |
Dataset
Description | Gyeonggi-do_BMS route information verification |
---|---|
Author | Gyeonggi-do |
URL | https://data.gg.go.kr/portal/data/service/selectServicePage.do?&infId=1MQHOF2F4XO6DQMRHXOA34337309&infSeq=1 |
Reproduction
Analysis started | 2023-12-10 21:22:27.481224 |
---|---|
Analysis finished | 2023-12-10 21:22:28.237366 |
Duration | 0.76 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
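A report like this one could be regenerated along the following lines. This is a sketch, not the exact procedure used here: the CSV path, output path, report title, and file encoding are all assumptions, and the `config.json` referenced above is not used.

```python
def build_report(csv_path: str, html_path: str, encoding: str = "utf-8") -> None:
    """Profile a CSV file and write the report as HTML."""
    # Imported lazily so this module loads even where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)  # encoding is an assumption
    report = ProfileReport(df, title="BMS route information profile")  # hypothetical title
    report.to_file(html_path)
```

A call might look like `build_report("bms_routes.csv", "report.html")`, where both file names are hypothetical.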
업체ID (company ID)
Real number (ℝ)
Distinct | 122 |
---|---|
Distinct (%) | 2.7% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 4073219.2 |
Minimum | 1000002 |
---|---|
Maximum | 4154500 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 39.6 KiB |
Quantile statistics
Minimum | 1000002 |
---|---|
5-th percentile | 4100200 |
Q1 | 4100300 |
median | 4101100 |
Q3 | 4103800 |
95-th percentile | 4109100 |
Maximum | 4154500 |
Range | 3154498 |
Interquartile range (IQR) | 3500 |
Descriptive statistics
Standard deviation | 302206.95 |
---|---|
Coefficient of variation (CV) | 0.074193639 |
Kurtosis | 99.527667 |
Mean | 4073219.2 |
Median Absolute Deviation (MAD) | 900 |
Skewness | -10.072349 |
Sum | 1.82969 × 10^10 |
Variance | 9.1329043 × 10^10 |
Monotonicity | Not monotonic |
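Several of the 업체ID figures above follow from others in the same tables. A minimal sketch that recomputes them from the reported values (the variable names are illustrative):

```python
# Values copied from the 업체ID tables above.
minimum, maximum = 1000002, 4154500
q1, q3 = 4100300, 4103800
mean, std = 4073219.2, 302206.95

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance

print(value_range, iqr)
print(f"CV = {cv:.6f}, variance = {variance:.4e}")
```

The recomputed range, IQR, CV, and variance agree with the table to within the rounding of the inputs.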
Value | Count | Frequency (%) |
4100300 | 766 | 17.1% |
4100200 | 706 | 15.7% |
4100900 | 198 | 4.4% |
4101400 | 190 | 4.2% |
4100600 | 167 | 3.7% |
4103800 | 139 | 3.1% |
4100500 | 116 | 2.6% |
4101800 | 112 | 2.5% |
4104200 | 109 | 2.4% |
4103100 | 91 | 2.0% |
Other values (112) | 1898 | 42.3% |
Value | Count | Frequency (%) |
1000002 | 1 | < 0.1% |
1000003 | 1 | < 0.1% |
1000005 | 1 | < 0.1% |
1000008 | 1 | < 0.1% |
1000009 | 2 | < 0.1% |
1000010 | 1 | < 0.1% |
1000011 | 1 | < 0.1% |
1000012 | 1 | < 0.1% |
1000013 | 1 | < 0.1% |
1000014 | 1 | < 0.1% |
Value | Count | Frequency (%) |
4154500 | 13 | 0.3% |
4154400 | 1 | < 0.1% |
4150200 | 1 | < 0.1% |
4150100 | 1 | < 0.1% |
4149400 | 8 | 0.2% |
4143600 | 13 | 0.3% |
4113100 | 1 | < 0.1% |
4112800 | 1 | < 0.1% |
4112400 | 2 | < 0.1% |
4112200 | 2 | < 0.1% |
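The Frequency (%) column in these tables is each count divided by the 4492 observations. A minimal sketch of that calculation; the `< 0.1%` rule is inferred from the rows of this report (a count of 2 is shown as `< 0.1%`, a count of 4 as `0.1%`) and is an assumption, not a documented format:

```python
def frequency_label(count: int, total: int) -> str:
    """Format count/total as a percentage with one decimal place."""
    fraction = count / total
    # Assumed rule: tiny non-zero shares are shown as "< 0.1%".
    if fraction > 0 and round(fraction, 3) == 0:
        return "< 0.1%"
    return f"{fraction * 100:.1f}%"

N_OBSERVATIONS = 4492
for count in (706, 198, 4, 2):
    print(count, frequency_label(count, N_OBSERVATIONS))
```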
노선ID (route ID)
Real number (ℝ)
Distinct | 3839 |
---|---|
Distinct (%) | 85.5% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 2.3560724 × 10^8 |
Minimum | 0 |
---|---|
Maximum | 9.9900011 × 10^8 |
Zeros | 2 |
Zeros (%) | < 0.1% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 39.6 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 2.000003 × 10^8 |
Q1 | 2.2100002 × 10^8 |
median | 2.3300008 × 10^8 |
Q3 | 2.3400141 × 10^8 |
95-th percentile | 2.4000003 × 10^8 |
Maximum | 9.9900011 × 10^8 |
Range | 9.9900011 × 10^8 |
Interquartile range (IQR) | 13001391 |
Descriptive statistics
Standard deviation | 82700609 |
---|---|
Coefficient of variation (CV) | 0.35101047 |
Kurtosis | 79.589823 |
Mean | 2.3560724 × 10^8 |
Median Absolute Deviation (MAD) | 3999974 |
Skewness | 8.9162434 |
Sum | 1.0583477 × 10^12 |
Variance | 6.8393907 × 10^15 |
Monotonicity | Not monotonic |
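For 노선ID the MAD (3999974) is far smaller than the standard deviation (82700609), which is typical when a few extreme values, here 0 and the 999000xxx IDs, dominate the spread. A minimal sketch of the usual MAD definition on a toy sample (the sample values are hypothetical, not taken from the dataset):

```python
from statistics import median, pstdev

def median_absolute_deviation(values):
    """Median of the absolute deviations from the median."""
    center = median(values)
    return median(abs(v - center) for v in values)

toy = [1, 2, 3, 4, 100]  # hypothetical data with one outlier
print("MAD:", median_absolute_deviation(toy))
print("standard deviation:", round(pstdev(toy), 2))
```

The single outlier inflates the standard deviation while leaving the MAD almost untouched.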
Value | Count | Frequency (%) |
999000107 | 31 | 0.7% |
999000108 | 14 | 0.3% |
219000017 | 4 | 0.1% |
218000001 | 4 | 0.1% |
218000002 | 4 | 0.1% |
218000004 | 4 | 0.1% |
219000008 | 4 | 0.1% |
229000066 | 4 | 0.1% |
229000060 | 4 | 0.1% |
229000063 | 4 | 0.1% |
Other values (3829) | 4415 | 98.3% |
Value | Count | Frequency (%) |
0 | 2 | < 0.1% |
200000006 | 1 | < 0.1% |
200000008 | 1 | < 0.1% |
200000009 | 1 | < 0.1% |
200000010 | 1 | < 0.1% |
200000012 | 1 | < 0.1% |
200000013 | 1 | < 0.1% |
200000014 | 1 | < 0.1% |
200000015 | 1 | < 0.1% |
200000016 | 1 | < 0.1% |
Value | Count | Frequency (%) |
999000108 | 14 | 0.3% |
999000107 | 31 | 0.7% |
999000106 | 1 | < 0.1% |
999000105 | 1 | < 0.1% |
999000102 | 1 | < 0.1% |
999000101 | 1 | < 0.1% |
999000100 | 1 | < 0.1% |
999000099 | 1 | < 0.1% |
249000002 | 1 | < 0.1% |
241000350 | 1 | < 0.1% |
노선명 (route name)
Text
Distinct | 2199 |
---|---|
Distinct (%) | 49.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 35.2 KiB |
Value | Count | Frequency (%) |
2222 | 31 | 0.7% |
3 | 24 | 0.5% |
2 | 19 | 0.4% |
33 | 19 | 0.4% |
7 | 18 | 0.4% |
1 | 18 | 0.4% |
8 | 17 | 0.4% |
100 | 17 | 0.4% |
11 | 17 | 0.4% |
5 | 15 | 0.3% |
Other values (2189) | 4297 | 95.7% |
Most occurring characters
Value | Count | Frequency (%) |
1 | 2509 | 14.7% |
- | 2497 | 14.6% |
2 | 2122 | 12.4% |
0 | 1850 | 10.8% |
3 | 1753 | 10.3% |
5 | 1283 | 7.5% |
9 | 1012 | 5.9% |
8 | 892 | 5.2% |
7 | 822 | 4.8% |
4 | 722 | 4.2% |
Other values (97) | 1617 | 9.5% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 13661 | 80.0% |
Dash Punctuation | 2497 | 14.6% |
Other Letter | 413 | 2.4% |
Uppercase Letter | 356 | 2.1% |
Close Punctuation | 75 | 0.4% |
Open Punctuation | 75 | 0.4% |
Lowercase Letter | 2 | < 0.1% |
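The category names in this table correspond to Unicode general categories. A minimal sketch of such a count using Python's standard `unicodedata` module; the three sample names are taken from the sample rows of this report, while the mapping dictionary and function name are illustrative:

```python
import unicodedata
from collections import Counter

# Unicode general-category codes mapped to the labels used in the table.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Pd": "Dash Punctuation",
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}

def category_counts(names):
    """Count characters per Unicode general category across all names."""
    counts = Counter()
    for name in names:
        for ch in name:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

sample = ["9-3", "52-1", "M7106"]
print(category_counts(sample))
```

Hangul syllables such as 근 fall under "Other Letter" (`Lo`), which is why that category and the Hangul script both total 413.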
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
근 | 39 | 9.4% |
출 | 22 | 5.3% |
예 | 22 | 5.3% |
약 | 21 | 5.1% |
퇴 | 20 | 4.8% |
전 | 18 | 4.4% |
스 | 18 | 4.4% |
버 | 17 | 4.1% |
똑 | 15 | 3.6% |
학 | 14 | 3.4% |
Other values (70) | 207 | 50.1% |
Uppercase Letter
Value | Count | Frequency (%) |
A | 82 | 23.0% |
B | 79 | 22.2% |
M | 58 | 16.3% |
P | 38 | 10.7% |
G | 35 | 9.8% |
H | 25 | 7.0% |
N | 15 | 4.2% |
C | 11 | 3.1% |
D | 4 | 1.1% |
Y | 4 | 1.1% |
Other values (3) | 5 | 1.4% |
Decimal Number
Value | Count | Frequency (%) |
1 | 2509 | 18.4% |
2 | 2122 | 15.5% |
0 | 1850 | 13.5% |
3 | 1753 | 12.8% |
5 | 1283 | 9.4% |
9 | 1012 | 7.4% |
8 | 892 | 6.5% |
7 | 822 | 6.0% |
4 | 722 | 5.3% |
6 | 696 | 5.1% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 2497 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 75 | 100.0% |
Open Punctuation
Value | Count | Frequency (%) |
( | 75 | 100.0% |
Lowercase Letter
Value | Count | Frequency (%) |
a | 2 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 16308 | 95.5% |
Hangul | 413 | 2.4% |
Latin | 358 | 2.1% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
근 | 39 | 9.4% |
출 | 22 | 5.3% |
예 | 22 | 5.3% |
약 | 21 | 5.1% |
퇴 | 20 | 4.8% |
전 | 18 | 4.4% |
스 | 18 | 4.4% |
버 | 17 | 4.1% |
똑 | 15 | 3.6% |
학 | 14 | 3.4% |
Other values (70) | 207 | 50.1% |
Latin
Value | Count | Frequency (%) |
A | 82 | 22.9% |
B | 79 | 22.1% |
M | 58 | 16.2% |
P | 38 | 10.6% |
G | 35 | 9.8% |
H | 25 | 7.0% |
N | 15 | 4.2% |
C | 11 | 3.1% |
D | 4 | 1.1% |
Y | 4 | 1.1% |
Other values (4) | 7 | 2.0% |
Common
Value | Count | Frequency (%) |
1 | 2509 | 15.4% |
- | 2497 | 15.3% |
2 | 2122 | 13.0% |
0 | 1850 | 11.3% |
3 | 1753 | 10.7% |
5 | 1283 | 7.9% |
9 | 1012 | 6.2% |
8 | 892 | 5.5% |
7 | 822 | 5.0% |
4 | 722 | 4.4% |
Other values (3) | 846 | 5.2% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 16666 | 97.6% |
Hangul | 413 | 2.4% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
1 | 2509 | 15.1% |
- | 2497 | 15.0% |
2 | 2122 | 12.7% |
0 | 1850 | 11.1% |
3 | 1753 | 10.5% |
5 | 1283 | 7.7% |
9 | 1012 | 6.1% |
8 | 892 | 5.4% |
7 | 822 | 4.9% |
4 | 722 | 4.3% |
Other values (17) | 1204 | 7.2% |
Hangul
Value | Count | Frequency (%) |
근 | 39 | 9.4% |
출 | 22 | 5.3% |
예 | 22 | 5.3% |
약 | 21 | 5.1% |
퇴 | 20 | 4.8% |
전 | 18 | 4.4% |
스 | 18 | 4.4% |
버 | 17 | 4.1% |
똑 | 15 | 3.6% |
학 | 14 | 3.4% |
Other values (70) | 207 | 50.1% |
 | 업체ID | 노선ID |
---|---|---|
업체ID | 1.000 | 0.654 |
노선ID | 0.654 | 1.000 |
 | 업체ID | 노선ID |
---|---|---|
업체ID | 1.000 | -0.363 |
노선ID | -0.363 | 1.000 |
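The two matrices above are not labelled, so which coefficient each one reports is not stated here. As general background only, the sketch below shows how a value-based coefficient (Pearson) and a rank-based one (Spearman) can disagree in sign on data with extreme values. The toy data and function names are hypothetical, and this is not a claim about how the matrices above were computed.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5, 6, 100]  # hypothetical data, one extreme value
y = [1, 2, 3, 4, 5, 6, 0]
print("Pearson:", round(pearson(x, y), 3))
print("Spearman:", round(spearman(x, y), 3))
```

On this toy data the single extreme pair makes the Pearson value negative while the rank-based value stays positive.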
 | 업체ID | 노선ID | 노선명 |
---|---|---|---|
0 | 4102100 | 208000020 | 9-3 |
1 | 4102100 | 208000021 | 1-2 |
2 | 4102100 | 208000024 | 1-1 |
3 | 4102100 | 208000028 | 8 |
4 | 4102100 | 208000030 | 52 |
5 | 4102100 | 208000034 | 3-1 |
6 | 4102100 | 208000035 | 52-1 |
7 | 4102100 | 208000036 | 1-5 |
8 | 4102100 | 208000037 | 5-1 |
9 | 4102100 | 208000038 | 83 |
 | 업체ID | 노선ID | 노선명 |
---|---|---|---|
4482 | 4154500 | 218000012 | M7106 |
4483 | 4154500 | 218000016 | 37 |
4484 | 4154500 | 219000008 | 55 |
4485 | 4154500 | 219000017 | 76 |
4486 | 4154500 | 229000032 | 34 |
4487 | 4154500 | 229000036 | 360 |
4488 | 4154500 | 229000040 | 73 |
4489 | 4154500 | 229000060 | 790 |
4490 | 4154500 | 229000063 | 850 |
4491 | 4154500 | 229000066 | 550 |