Dataset statistics
Statistic | Value |
---|---|
Number of variables | 2 |
Number of observations | 2720 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 682 |
Duplicate rows (%) | 25.1% |
Total size in memory | 45.3 KiB |
Average record size in memory | 17.0 B |
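The duplicate figures above are consistent with counting each distinct row that occurs more than once (682 / 2720 ≈ 25.1%). A minimal sketch of that count using only the Python standard library follows. It assumes rows are `(노선명, 노선)` tuples and that the report counts duplicated row groups rather than surplus copies; the helper name `duplicate_summary` and the sample rows are illustrative and not part of the report.

```python
from collections import Counter

def duplicate_summary(rows):
    """Return (number of distinct rows seen more than once, share of all rows in %)."""
    counts = Counter(rows)
    n_duplicated = sum(1 for c in counts.values() if c > 1)
    pct = 100.0 * n_duplicated / len(rows) if rows else 0.0
    return n_duplicated, pct

# Illustrative rows only; the values are borrowed from the sample tables in this report.
rows = (
    [("0017", 100100124)] * 4
    + [("01A", 100100001)] * 4
    + [("종로03", 100900010)]
)
n_dup, pct = duplicate_summary(rows)
print(n_dup, f"{pct:.1f}%")
```

With a pandas DataFrame the same idea is usually expressed through `DataFrame.duplicated`, but the counting convention (groups versus extra copies) should be checked against the profiler version in use.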
Variable types
Type | Count |
---|---|
Text | 1 |
Numeric | 1 |
Dataset
Field | Value |
---|---|
Description | Route name (노선명), route (노선) |
Author | Seoul Metropolitan Government (서울특별시) |
URL | https://data.seoul.go.kr/dataList/OA-15262/S/1/datasetView.do |
Alerts
Alert | Type |
---|---|
Dataset has 682 (25.1%) duplicate rows | Duplicates |
Reproduction
Field | Value |
---|---|
Analysis started | 2024-05-11 06:16:06.410877 |
Analysis finished | 2024-05-11 06:16:06.957274 |
Duration | 0.55 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
노선명 (route name)
Text
Statistic | Value |
---|---|
Distinct | 690 |
Distinct (%) | 25.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 21.4 KiB |
Value | Count | Frequency (%) |
---|---|---|
0017 | 4 | 0.1% |
강서05-1 | 4 | 0.1% |
관악08 | 4 | 0.1% |
강북11 | 4 | 0.1% |
강북12 | 4 | 0.1% |
강서01 | 4 | 0.1% |
강서02 | 4 | 0.1% |
강서03 | 4 | 0.1% |
강서04 | 4 | 0.1% |
강서05 | 4 | 0.1% |
Other values (680) | 2680 | 98.5% |
Most occurring characters
Value | Count | Frequency (%) |
---|---|---|
1 | 1796 | 16.7% |
0 | 1515 | 14.1% |
2 | 986 | 9.2% |
6 | 868 | 8.1% |
3 | 735 | 6.8% |
7 | 674 | 6.3% |
5 | 649 | 6.0% |
4 | 612 | 5.7% |
8 | 241 | 2.2% |
서 | 208 | 1.9% |
Other values (60) | 2451 | 22.8% |
Most occurring categories
Value | Count | Frequency (%) |
---|---|---|
Decimal Number | 8234 | 76.7% |
Other Letter | 2240 | 20.9% |
Uppercase Letter | 207 | 1.9% |
Dash Punctuation | 54 | 0.5% |
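The category labels in this table match the long names of Unicode general categories (`Nd`, `Lo`, `Lu`, `Pd`). As a rough sketch of how such a breakdown can be produced, the snippet below classifies characters with the standard `unicodedata` module. The function name `category_counts` and the three sample route names are illustrative assumptions; the profiler's own implementation may differ.

```python
import unicodedata
from collections import Counter

# Long names for the four general categories that appear in this report.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Pd": "Dash Punctuation",
}

def category_counts(values):
    """Count characters across all strings, grouped by Unicode general category."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the two-letter code for categories not listed above.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# Illustrative sample: three route names taken from this report.
names = ["0017", "강서05-1", "01A"]
counts = category_counts(names)
total = sum(counts.values())
for name, n in counts.most_common():
    print(f"{name} | {n} | {100 * n / total:.1f}%")
```

Each frequency is the category count divided by the total number of characters, which is how the percentages in the table relate to the counts (for example, 2240 / 10735 ≈ 20.9%).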
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
---|---|---|
서 | 208 | 9.3% |
동 | 160 | 7.1% |
강 | 128 | 5.7% |
성 | 120 | 5.4% |
포 | 120 | 5.4% |
북 | 116 | 5.2% |
로 | 100 | 4.5% |
대 | 100 | 4.5% |
문 | 84 | 3.8% |
초 | 84 | 3.8% |
Other values (42) | 1020 | 45.5% |
Decimal Number
Value | Count | Frequency (%) |
---|---|---|
1 | 1796 | 21.8% |
0 | 1515 | 18.4% |
2 | 986 | 12.0% |
6 | 868 | 10.5% |
3 | 735 | 8.9% |
7 | 674 | 8.2% |
5 | 649 | 7.9% |
4 | 612 | 7.4% |
8 | 241 | 2.9% |
9 | 158 | 1.9% |
Uppercase Letter
Value | Count | Frequency (%) |
---|---|---|
N | 76 | 36.7% |
A | 36 | 17.4% |
B | 31 | 15.0% |
U | 16 | 7.7% |
O | 16 | 7.7% |
R | 16 | 7.7% |
T | 16 | 7.7% |
Dash Punctuation
Value | Count | Frequency (%) |
---|---|---|
- | 54 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
---|---|---|
Common | 8288 | 77.2% |
Hangul | 2240 | 20.9% |
Latin | 207 | 1.9% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
---|---|---|
서 | 208 | 9.3% |
동 | 160 | 7.1% |
강 | 128 | 5.7% |
성 | 120 | 5.4% |
포 | 120 | 5.4% |
북 | 116 | 5.2% |
로 | 100 | 4.5% |
대 | 100 | 4.5% |
문 | 84 | 3.8% |
초 | 84 | 3.8% |
Other values (42) | 1020 | 45.5% |
Common
Value | Count | Frequency (%) |
---|---|---|
1 | 1796 | 21.7% |
0 | 1515 | 18.3% |
2 | 986 | 11.9% |
6 | 868 | 10.5% |
3 | 735 | 8.9% |
7 | 674 | 8.1% |
5 | 649 | 7.8% |
4 | 612 | 7.4% |
8 | 241 | 2.9% |
9 | 158 | 1.9% |
Latin
Value | Count | Frequency (%) |
---|---|---|
N | 76 | 36.7% |
A | 36 | 17.4% |
B | 31 | 15.0% |
U | 16 | 7.7% |
O | 16 | 7.7% |
R | 16 | 7.7% |
T | 16 | 7.7% |
Most occurring blocks
Value | Count | Frequency (%) |
---|---|---|
ASCII | 8495 | 79.1% |
Hangul | 2240 | 20.9% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
---|---|---|
1 | 1796 | 21.1% |
0 | 1515 | 17.8% |
2 | 986 | 11.6% |
6 | 868 | 10.2% |
3 | 735 | 8.7% |
7 | 674 | 7.9% |
5 | 649 | 7.6% |
4 | 612 | 7.2% |
8 | 241 | 2.8% |
9 | 158 | 1.9% |
Other values (8) | 261 | 3.1% |
Hangul
Value | Count | Frequency (%) |
---|---|---|
서 | 208 | 9.3% |
동 | 160 | 7.1% |
강 | 128 | 5.7% |
성 | 120 | 5.4% |
포 | 120 | 5.4% |
북 | 116 | 5.2% |
로 | 100 | 4.5% |
대 | 100 | 4.5% |
문 | 84 | 3.8% |
초 | 84 | 3.8% |
Other values (42) | 1020 | 45.5% |
노선 (route)
Real number (ℝ)
Statistic | Value |
---|---|
Distinct | 690 |
Distinct (%) | 25.4% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1.0656566 × 10⁸ |
Minimum | 1.0000002 × 10⁸ |
Maximum | 1.249 × 10⁸ |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 24.0 KiB |
Quantile statistics
Statistic | Value |
---|---|
Minimum | 1.0000002 × 10⁸ |
5-th percentile | 1.0010004 × 10⁸ |
Q1 | 1.0010025 × 10⁸ |
Median | 1.0010059 × 10⁸ |
Q3 | 1.13 × 10⁸ |
95-th percentile | 1.22 × 10⁸ |
Maximum | 1.249 × 10⁸ |
Range | 24899986 |
Interquartile range (IQR) | 12899753 |
Descriptive statistics
Statistic | Value |
---|---|
Standard deviation | 8154293.5 |
Coefficient of variation (CV) | 0.07651896 |
Kurtosis | -0.85545463 |
Mean | 1.0656566 × 10⁸ |
Median Absolute Deviation (MAD) | 573 |
Skewness | 0.81842412 |
Sum | 2.898586 × 10¹¹ |
Variance | 6.6492503 × 10¹³ |
Monotonicity | Not monotonic |
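Several of the statistics above are derived from others: range is maximum minus minimum, IQR is Q3 minus Q1, CV is standard deviation over mean, variance is the squared standard deviation, and sum is mean times the number of observations. The sketch below recomputes them from the figures printed in this report as a consistency check. The inputs are the rounded values as displayed, the exact minimum and maximum are taken from the smallest and largest values listed for this variable, and the variable names are illustrative.

```python
import math

# Figures transcribed from this report, rounded as displayed there.
n = 2720
mean = 1.0656566e8
std = 8154293.5
minimum, maximum = 100000017, 124900003  # smallest and largest listed values
q1, q3 = 1.0010025e8, 1.13e8

# Quantities recomputed from the inputs above.
derived = {
    "Range": maximum - minimum,
    "Interquartile range (IQR)": q3 - q1,
    "Coefficient of variation (CV)": std / mean,
    "Variance": std ** 2,
    "Sum": mean * n,
}

# The same quantities as printed in the report.
reported = {
    "Range": 24899986,
    "Interquartile range (IQR)": 12899753,
    "Coefficient of variation (CV)": 0.07651896,
    "Variance": 6.6492503e13,
    "Sum": 2.898586e11,
}

# Inputs are rounded, so compare with a relative tolerance rather than exactly.
checks = {
    name: math.isclose(derived[name], reported[name], rel_tol=1e-6)
    for name in derived
}
for name, ok in checks.items():
    print(f"{name}: derived={derived[name]:.8g} reported={reported[name]:.8g} match={ok}")
```

A relative tolerance is used because the displayed quartiles and mean carry only about eight significant digits.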
Common values
Value | Count | Frequency (%) |
---|---|---|
100100124 | 4 | 0.1% |
115900002 | 4 | 0.1% |
108900001 | 4 | 0.1% |
108900012 | 4 | 0.1% |
115900006 | 4 | 0.1% |
115900003 | 4 | 0.1% |
115900004 | 4 | 0.1% |
115900001 | 4 | 0.1% |
115900005 | 4 | 0.1% |
115900008 | 4 | 0.1% |
Other values (680) | 2680 | 98.5% |
Minimum 10 values
Value | Count | Frequency (%) |
---|---|---|
100000017 | 4 | 0.1% |
100000018 | 4 | 0.1% |
100100001 | 4 | 0.1% |
100100006 | 4 | 0.1% |
100100007 | 4 | 0.1% |
100100008 | 4 | 0.1% |
100100009 | 4 | 0.1% |
100100010 | 4 | 0.1% |
100100011 | 4 | 0.1% |
100100012 | 4 | 0.1% |
Maximum 10 values
Value | Count | Frequency (%) |
---|---|---|
124900003 | 4 | 0.1% |
124900002 | 4 | 0.1% |
124900001 | 4 | 0.1% |
124000040 | 1 | < 0.1% |
124000039 | 4 | 0.1% |
124000038 | 4 | 0.1% |
124000036 | 4 | 0.1% |
124000016 | 4 | 0.1% |
124000015 | 4 | 0.1% |
124000014 | 4 | 0.1% |
First rows
Index | 노선명 (route name) | 노선 (route) |
---|---|---|
0 | 0017 | 100100124 |
1 | 01A | 100100001 |
2 | 01B | 106000004 |
3 | 0411 | 104000012 |
4 | 100 | 100100549 |
5 | 101 | 100100006 |
6 | 1014 | 100100129 |
7 | 1017 | 100100130 |
8 | 102 | 100100007 |
9 | 1020 | 100100131 |
Last rows
Index | 노선명 (route name) | 노선 (route) |
---|---|---|
2710 | 종로03 | 100900010 |
2711 | 종로05 | 100900011 |
2712 | 종로07 | 100900004 |
2713 | 종로08 | 100900005 |
2714 | 종로09 | 100900003 |
2715 | 종로11 | 100900007 |
2716 | 종로12 | 100900009 |
2717 | 종로13 | 100900002 |
2718 | 중랑01 | 106900001 |
2719 | 중랑02 | 106900002 |
Most frequently occurring
Index | 노선명 (route name) | 노선 (route) | # duplicates |
---|---|---|---|
0 | 0017 | 100100124 | 4 |
1 | 01A | 100100001 | 4 |
2 | 01B | 106000004 | 4 |
3 | 0411 | 104000012 | 4 |
4 | 100 | 100100549 | 4 |
5 | 101 | 100100006 | 4 |
6 | 1014 | 100100129 | 4 |
7 | 1017 | 100100130 | 4 |
8 | 102 | 100100007 | 4 |
9 | 1020 | 100100131 | 4 |