Overview

Dataset statistics

Number of variables: 5
Number of observations: 782
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 33.0 KiB
Average record size in memory: 43.2 B
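
The two memory figures above can be cross-checked against each other: the average record size should be the total size divided by the number of observations. A minimal sketch, assuming 1 KiB = 1024 bytes; the function name is illustrative and not part of the profiling tool.

```python
# Figures copied from the dataset statistics in this report.
total_size_kib = 33.0
n_observations = 782


def average_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Average bytes per row, assuming 1 KiB = 1024 bytes."""
    return total_kib * 1024 / n_rows


avg_bytes = average_record_size_bytes(total_size_kib, n_observations)
print(f"{avg_bytes:.1f} B")
```

Because the reported total is itself rounded to one decimal place, agreement to one decimal place is all this check can establish.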

Variable types

Numeric: 3
Text: 1
Categorical: 1

Dataset

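Description: 역사_ID (station ID), 역사명 (station name), 호선 (line), 위도 (latitude), 경도 (longitude)
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-21232/S/1/datasetView.do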

Alerts

역사_ID is highly overall correlated with 호선 (High correlation)
위도 is highly overall correlated with 호선 (High correlation)
경도 is highly overall correlated with 호선 (High correlation)
호선 is highly overall correlated with 역사_ID and 2 other fields (High correlation)
역사_ID has unique values (Unique)

Reproduction

Analysis started: 2024-05-04 02:42:12.349845
Analysis finished: 2024-05-04 02:42:16.766158
Duration: 4.42 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

역사_ID
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 782
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2381.2545
Minimum: 150
Maximum: 9996
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.0 KiB

Quantile statistics

Minimum: 150
5-th percentile: 230.05
Q1: 1280.5
Median: 1969
Q3: 3137.75
95-th percentile: 4707.95
Maximum: 9996
Range: 9846
Interquartile range (IQR): 1857.25

Descriptive statistics

Standard deviation: 1595.4795
Coefficient of variation (CV): 0.6700164
Kurtosis: 3.2514071
Mean: 2381.2545
Median Absolute Deviation (MAD): 846.5
Skewness: 1.2022187
Sum: 1862141
Variance: 2545555
Monotonicity: Strictly decreasing
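
Several of the 역사_ID figures above follow from the others, which gives a cheap consistency check on the report. A minimal sketch using only values printed in this section; the variable names are illustrative, and the inputs are the report's rounded values, so agreement is only expected up to rounding.

```python
# Values copied from the quantile and descriptive statistics of 역사_ID.
minimum, maximum = 150, 9996
q1, q3 = 1280.5, 3137.75
total, n_observations = 1862141, 782
std = 1595.4795

value_range = maximum - minimum   # reported as Range
iqr = q3 - q1                     # reported as Interquartile range (IQR)
mean = total / n_observations     # reported as Mean
cv = std / mean                   # reported as Coefficient of variation (CV)
variance = std ** 2               # reported as Variance

print(f"range={value_range} iqr={iqr} mean={mean:.4f} "
      f"cv={cv:.7f} variance={variance:.0f}")
```

Kurtosis, skewness, and MAD cannot be rechecked this way because they need the underlying rows, not just the summary figures.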
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
9996                1      0.1%
1504                1      0.1%
1714                1      0.1%
1713                1      0.1%
1712                1      0.1%
1711                1      0.1%
1710                1      0.1%
1709                1      0.1%
1708                1      0.1%
1707                1      0.1%
Other values (772)  772    98.7%

Minimum 10 values

Value  Count  Frequency (%)
150    1      0.1%
151    1      0.1%
152    1      0.1%
153    1      0.1%
154    1      0.1%
155    1      0.1%
156    1      0.1%
157    1      0.1%
158    1      0.1%
159    1      0.1%

Maximum 10 values

Value  Count  Frequency (%)
9996   1      0.1%
9995   1      0.1%
9010   1      0.1%
9009   1      0.1%
9008   1      0.1%
9007   1      0.1%
9006   1      0.1%
9005   1      0.1%
9004   1      0.1%
9002   1      0.1%

역사명
Text

Distinct: 648
Distinct (%): 82.9%
Missing: 0
Missing (%): 0.0%
Memory size: 6.2 KiB

Length

Max length: 16
Median length: 2
Mean length: 3.5703325
Min length: 2

Characters and Unicode

Total characters: 2792
Distinct characters: 320
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
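
A minimal sketch of how category counts like those in this section could be obtained with Python's standard `unicodedata` module. The three station names are taken from the sample rows of this report; the helper name and the code-to-label mapping are illustrative, and this is not necessarily how the profiling tool computes its figures.

```python
import unicodedata
from collections import Counter

# Two-letter Unicode general-category codes mapped to the labels
# that appear in this report.
CATEGORY_LABELS = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_LABELS.get(code, code)] += 1
    return counts


names = ["청량리(서울시립대입구)", "종로3가", "서울역"]
counts = category_counts(names)
for label, count in counts.most_common():
    print(label, count)
```

Hangul syllables fall under "Other Letter", which is why that category dominates the counts for this column.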

Unique

Unique: 537
Unique (%): 68.7%
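
The report lists Distinct (648) and Unique (537) separately for this column. The figures are consistent with Distinct counting every different value once and Unique counting only values that occur exactly once. A small sketch of that distinction on a toy list; the function name and toy data are illustrative, not the profiling tool's own code.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Toy data: two names repeat, two appear once.
toy = ["서울역", "서울역", "공덕", "공덕", "미사", "강일"]
distinct, unique = distinct_and_unique(toy)
print(distinct, unique)

# Percentages as reported, relative to the 782 observations.
distinct_pct = round(100 * 648 / 782, 1)
unique_pct = round(100 * 537 / 782, 1)
print(distinct_pct, unique_pct)
```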

Sample

1st row: 미사
2nd row: 강일
3rd row: 동탄
4th row: 구성
5th row: 성남

Value               Count  Frequency (%)
서울역              6      0.8%
공덕                5      0.6%
김포공항            5      0.6%
홍대입구            4      0.5%
디지털미디어시티    4      0.5%
수서                3      0.4%
종로3가             3      0.4%
신설동              3      0.4%
고속터미널          3      0.4%
부평구청            3      0.4%
Other values (638)  743    95.0%

Most occurring characters

Value               Count  Frequency (%)
(missing)           91     3.3%
)                   83     3.0%
(                   83     3.0%
(missing)           82     2.9%
(missing)           64     2.3%
(missing)           59     2.1%
(missing)           57     2.0%
(missing)           55     2.0%
(missing)           52     1.9%
(missing)           51     1.8%
Other values (310)  2115   75.8%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       2605   93.3%
Close Punctuation  83     3.0%
Open Punctuation   83     3.0%
Decimal Number     13     0.5%
Other Punctuation  7      0.3%
Dash Punctuation   1      < 0.1%

Most frequent character per category

Other Letter

Value               Count  Frequency (%)
(missing)           91     3.5%
(missing)           82     3.1%
(missing)           64     2.5%
(missing)           59     2.3%
(missing)           57     2.2%
(missing)           55     2.1%
(missing)           52     2.0%
(missing)           51     2.0%
(missing)           46     1.8%
(missing)           40     1.5%
Other values (300)  2008   77.1%

Decimal Number

Value  Count  Frequency (%)
3      5      38.5%
4      3      23.1%
1      2      15.4%
5      1      7.7%
9      1      7.7%
2      1      7.7%

Close Punctuation

Value  Count  Frequency (%)
)      83     100.0%

Open Punctuation

Value  Count  Frequency (%)
(      83     100.0%

Other Punctuation

Value  Count  Frequency (%)
.      7      100.0%

Dash Punctuation

Value  Count  Frequency (%)
-      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  2605   93.3%
Common  187    6.7%

Most frequent character per script

Hangul

Value               Count  Frequency (%)
(missing)           91     3.5%
(missing)           82     3.1%
(missing)           64     2.5%
(missing)           59     2.3%
(missing)           57     2.2%
(missing)           55     2.1%
(missing)           52     2.0%
(missing)           51     2.0%
(missing)           46     1.8%
(missing)           40     1.5%
Other values (300)  2008   77.1%

Common

Value  Count  Frequency (%)
)      83     44.4%
(      83     44.4%
.      7      3.7%
3      5      2.7%
4      3      1.6%
1      2      1.1%
5      1      0.5%
9      1      0.5%
2      1      0.5%
-      1      0.5%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  2605   93.3%
ASCII   187    6.7%

Most frequent character per block

Hangul

Value               Count  Frequency (%)
(missing)           91     3.5%
(missing)           82     3.1%
(missing)           64     2.5%
(missing)           59     2.3%
(missing)           57     2.2%
(missing)           55     2.1%
(missing)           52     2.0%
(missing)           51     2.0%
(missing)           46     1.8%
(missing)           40     1.5%
Other values (300)  2008   77.1%

ASCII

Value  Count  Frequency (%)
)      83     44.4%
(      83     44.4%
.      7      3.7%
3      5      2.7%
4      3      1.6%
1      2      1.1%
5      1      0.5%
9      1      0.5%
2      1      0.5%
-      1      0.5%

호선
Categorical

HIGH CORRELATION 

Distinct: 38
Distinct (%): 4.9%
Missing: 0
Missing (%): 0.0%
Memory size: 6.2 KiB

Value              Count
5호선              58
7호선              53
2호선              50
경부선             39
6호선              39
Other values (33)  543

Length

Max length: 10
Median length: 3
Mean length: 3.7033248
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 5호선
2nd row: 5호선
3rd row: 수도권 광역급행철도
4th row: 수도권 광역급행철도
5th row: 수도권 광역급행철도

Common Values

Value              Count  Frequency (%)
5호선              58     7.4%
7호선              53     6.8%
2호선              50     6.4%
경부선             39     5.0%
6호선              39     5.0%
분당선             35     4.5%
3호선              34     4.3%
경원선             33     4.2%
경의중앙선         31     4.0%
인천1호선          30     3.8%
Other values (28)  380    48.6%
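
Each Frequency (%) entry in the Common Values table is the count divided by the 782 observations, and the "Other values" row is whatever remains after the ten listed lines. A minimal sketch that recomputes both from the printed counts; the variable names are illustrative.

```python
# Counts copied from the Common Values table for 호선.
top_counts = {
    "5호선": 58, "7호선": 53, "2호선": 50, "경부선": 39, "6호선": 39,
    "분당선": 35, "3호선": 34, "경원선": 33, "경의중앙선": 31, "인천1호선": 30,
}
n_observations = 782

# Share of each listed line, in percent, rounded as in the report.
shares = {line: round(100 * count / n_observations, 1)
          for line, count in top_counts.items()}

# Rows not covered by the ten listed lines.
other_count = n_observations - sum(top_counts.values())
other_share = round(100 * other_count / n_observations, 1)

for line, share in shares.items():
    print(f"{line}: {share}%")
print(f"Other values: {other_count} ({other_share}%)")
```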

Length

[Figure: Histogram of lengths of the category]

Value              Count  Frequency (%)
5호선              58     7.3%
7호선              53     6.7%
2호선              50     6.3%
경부선             39     4.9%
6호선              39     4.9%
분당선             35     4.4%
3호선              34     4.3%
경원선             33     4.2%
경의중앙선         31     3.9%
인천1호선          30     3.8%
Other values (29)  390    49.2%

위도
Real number (ℝ)

HIGH CORRELATION 

Distinct: 771
Distinct (%): 98.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.5115
Minimum: 36.769502
Maximum: 38.10073
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.0 KiB

Quantile statistics

Minimum: 36.769502
5-th percentile: 37.261815
Q1: 37.470522
Median: 37.521754
Q3: 37.576385
95-th percentile: 37.737116
Maximum: 38.10073
Range: 1.331228
Interquartile range (IQR): 0.105863

Descriptive statistics

Standard deviation: 0.15451493
Coefficient of variation (CV): 0.0041191348
Kurtosis: 5.7309394
Mean: 37.5115
Median Absolute Deviation (MAD): 0.053184
Skewness: -1.228242
Sum: 29333.993
Variance: 0.023874862
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
37.560927           2      0.3%
37.394761           2      0.3%
37.55749            2      0.3%
37.577475           2      0.3%
37.557641           2      0.3%
37.542596           2      0.3%
37.557231           2      0.3%
37.516334           2      0.3%
37.511093           2      0.3%
37.504598           2      0.3%
Other values (761)  762    97.4%

Minimum 10 values

Value      Count  Frequency (%)
36.769502  1      0.1%
36.777629  1      0.1%
36.780483  1      0.1%
36.78866   1      0.1%
36.792053  1      0.1%
36.793759  1      0.1%
36.801215  1      0.1%
36.810005  1      0.1%
36.833705  1      0.1%
36.870593  1      0.1%

Maximum 10 values

Value      Count  Frequency (%)
38.10073   1      0.1%
38.02458   1      0.1%
37.98172   1      0.1%
37.9481    1      0.1%
37.927878  1      0.1%
37.913702  1      0.1%
37.901885  1      0.1%
37.892334  1      0.1%
37.888421  1      0.1%
37.885054  1      0.1%

경도
Real number (ℝ)

HIGH CORRELATION 

Distinct: 773
Distinct (%): 98.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 126.97047
Minimum: 126.44144
Maximum: 127.72379
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.0 KiB

Quantile statistics

Minimum: 126.44144
5-th percentile: 126.67621
Q1: 126.85304
Median: 126.99419
Q3: 127.07147
95-th percentile: 127.20878
Maximum: 127.72379
Range: 1.28235
Interquartile range (IQR): 0.21842575

Descriptive statistics

Standard deviation: 0.18263675
Coefficient of variation (CV): 0.0014384191
Kurtosis: 1.7570719
Mean: 126.97047
Median Absolute Deviation (MAD): 0.096349
Skewness: 0.4447035
Sum: 99290.907
Variance: 0.033356182
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
127.193877          2      0.3%
127.020114          2      0.3%
126.926683          2      0.3%
126.952099          2      0.3%
126.97103           2      0.3%
126.900453          2      0.3%
127.021415          2      0.3%
127.02506           2      0.3%
127.17593           2      0.3%
126.935433          1      0.1%
Other values (763)  763    97.6%

Minimum 10 values

Value       Count  Frequency (%)
126.441442  1      0.1%
126.452508  1      0.1%
126.477516  1      0.1%
126.49379   1      0.1%
126.524254  1      0.1%
126.614309  1      0.1%
126.616801  1      0.1%
126.617326  1      0.1%
126.623853  1      0.1%
126.624648  1      0.1%

Maximum 10 values

Value       Count  Frequency (%)
127.723792  1      0.1%
127.717023  1      0.1%
127.71434   1      0.1%
127.634146  1      0.1%
127.629874  1      0.1%
127.628816  1      0.1%
127.594647  1      0.1%
127.58933   1      0.1%
127.570938  1      0.1%
127.557695  1      0.1%

Interactions

[Figures: interaction plots; images not included]

Correlations

          역사_ID  호선   위도   경도
역사_ID   1.000    0.980  0.417  0.544
호선      0.980    1.000  0.880  0.857
위도      0.417    0.880  1.000  0.450
경도      0.544    0.857  0.450  1.000

          역사_ID  위도    경도    호선
역사_ID   1.000    -0.085  -0.182  0.864
위도      -0.085   1.000   0.037   0.540
경도      -0.182   0.037   1.000   0.501
호선      0.864    0.540   0.501   1.000
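
The High correlation alerts near the top of this report name exactly the pairs whose coefficient in the second matrix of this section exceeds 0.5. The sketch below flags pairs above a threshold; the 0.5 cutoff and the choice of this matrix as the source of the alerts are assumptions inferred from the numbers, not something the report states, and the function name is illustrative.

```python
from itertools import combinations

# Second correlation matrix of this section, in the order it is printed.
columns = ["역사_ID", "위도", "경도", "호선"]
matrix = [
    [1.000, -0.085, -0.182, 0.864],
    [-0.085, 1.000, 0.037, 0.540],
    [-0.182, 0.037, 1.000, 0.501],
    [0.864, 0.540, 0.501, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, not stated in the report


def high_correlation_pairs(names, values, threshold):
    """Return (name_a, name_b, coefficient) for pairs with |coefficient| > threshold."""
    pairs = []
    for i, j in combinations(range(len(names)), 2):
        if abs(values[i][j]) > threshold:
            pairs.append((names[i], names[j], values[i][j]))
    return pairs


flagged = high_correlation_pairs(columns, matrix, THRESHOLD)
for a, b, coefficient in flagged:
    print(f"{a} ~ {b}: {coefficient}")
```

Under these assumptions every flagged pair involves 호선, which matches the alert that 호선 is correlated with 역사_ID and two other fields.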

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows

Index  역사_ID  역사명  호선  위도  경도
0  9996  미사  5호선  37.560927  127.193877
1  9995  강일  5호선  37.55749  127.17593
2  9010  동탄  수도권 광역급행철도  37.20034  127.09569
3  9009  구성  수도권 광역급행철도  37.29913  127.10389
4  9008  성남  수도권 광역급행철도  37.39467  127.12058
5  9007  수서  수도권 광역급행철도  37.48637  127.10161
6  9006  삼성  수도권 광역급행철도  37.50887  127.06324
7  9005  서울  수도권 광역급행철도  37.55569  126.97296
8  9004  연신내  수도권 광역급행철도  37.61878  126.9213
9  9002  대곡  수도권 광역급행철도  37.63191  126.81113

Last rows

Index  역사_ID  역사명  호선  위도  경도
772  159  동묘앞  1호선  37.573197  127.01648
773  158  청량리(서울시립대입구)  1호선  37.579956  127.044585
774  157  제기동  1호선  37.578103  127.034893
775  156  신설동  1호선  37.576048  127.024634
776  155  동대문  1호선  37.571687  127.01093
777  154  종로5가  1호선  37.570926  127.001849
778  153  종로3가  1호선  37.570406  126.991847
779  152  종각  1호선  37.570161  126.982923
780  151  시청  1호선  37.565715  126.977088
781  150  서울역  1호선  37.556228  126.972135