Overview

Dataset statistics

Number of variables: 6
Number of observations: 262
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 13.7 KiB
Average record size in memory: 53.5 B

Variable types

Numeric: 5
Text: 1

Dataset

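Description: Daily average boarding and alighting passenger counts by station for Seoul Metro (서울교통공사) Lines 1-8. The data consists of a serial number (연번), line (호선), station number (역번호), station name (역명), annual average boarding figures, and annual average alighting figures. (No data exists for years before 2006, so the data is published starting from 2006.)
Author: Seoul Metro (서울교통공사)
URL: https://www.data.go.kr/data/15099899/fileData.do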

Alerts

연번 is highly overall correlated with 호선 and 1 other field (High correlation)
호선 is highly overall correlated with 연번 and 1 other field (High correlation)
역번호 is highly overall correlated with 연번 and 1 other field (High correlation)
2007년 승차인원(일평균) is highly overall correlated with 2007년 하차인원(일평균) (High correlation)
2007년 하차인원(일평균) is highly overall correlated with 2007년 승차인원(일평균) (High correlation)
연번 has unique values (Unique)
역번호 has unique values (Unique)
역명 has unique values (Unique)
2007년 하차인원(일평균) has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 14:30:55.590445
Analysis finished: 2023-12-12 14:30:58.228099
Duration: 2.64 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 262
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 131.5
Minimum: 1
Maximum: 262
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 14.05
Q1: 66.25
median: 131.5
Q3: 196.75
95-th percentile: 248.95
Maximum: 262
Range: 261
Interquartile range (IQR): 130.5
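
The quantiles above are consistent with linear interpolation between order statistics. The following is a minimal sketch, assuming 연번 is simply the integers 1 through 262 (which the reported minimum, maximum, and distinct count suggest) and using the standard library's `statistics.quantiles` with `method="inclusive"` as a stand-in for whatever routine the profiler uses internally.

```python
import statistics

# Assumption: 연번 is the serial number 1..262 (262 distinct values, min 1, max 262).
serial = list(range(1, 263))

# Quartiles with linear interpolation over the closed range [min, max].
q1, median, q3 = statistics.quantiles(serial, n=4, method="inclusive")

# The 5th and 95th percentiles are the first and last of the 19 cut points for n=20.
twentiles = statistics.quantiles(serial, n=20, method="inclusive")
p5, p95 = twentiles[0], twentiles[-1]

iqr = q3 - q1
value_range = max(serial) - min(serial)

print(p5, q1, median, q3, p95)  # 14.05 66.25 131.5 196.75 248.95
print(iqr, value_range)         # 130.5 261
```

Under this assumption the interpolated cut points land on the same values as the report, including the fractional quartiles 66.25 and 196.75.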

Descriptive statistics

Standard deviation: 75.777085
Coefficient of variation (CV): 0.5762516
Kurtosis: -1.2
Mean: 131.5
Median Absolute Deviation (MAD): 65.5
Skewness: 0
Sum: 34453
Variance: 5742.1667
Monotonicity: Strictly increasing
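
Several of these descriptive statistics can be recomputed by hand. The sketch below again assumes 연번 is the sequence 1 to 262 and uses the sample (n - 1) variance; both are assumptions about how the figures were produced, not something the report states.

```python
import statistics

# Assumption: 연번 is the serial number 1..262.
serial = list(range(1, 263))
n = len(serial)

total = sum(serial)
mean = statistics.mean(serial)
variance = statistics.variance(serial)  # sample variance, n - 1 in the denominator
std = statistics.stdev(serial)
cv = std / mean                         # coefficient of variation

# Median absolute deviation: median of |x - median(x)|.
median = statistics.median(serial)
mad = statistics.median(abs(x - median) for x in serial)

# For the integers 1..n the sample variance has the closed form n(n + 1) / 12.
closed_form_variance = n * (n + 1) / 12

print(total, mean, mad)       # 34453 131.5 65.5
print(round(variance, 4))     # compare with the reported 5742.1667
print(round(std, 6), round(cv, 7))
```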
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
1                   1      0.4%
166                 1      0.4%
168                 1      0.4%
169                 1      0.4%
170                 1      0.4%
171                 1      0.4%
172                 1      0.4%
173                 1      0.4%
174                 1      0.4%
175                 1      0.4%
Other values (252)  252    96.2%

Value               Count  Frequency (%)
1                   1      0.4%
2                   1      0.4%
3                   1      0.4%
4                   1      0.4%
5                   1      0.4%
6                   1      0.4%
7                   1      0.4%
8                   1      0.4%
9                   1      0.4%
10                  1      0.4%

Value               Count  Frequency (%)
262                 1      0.4%
261                 1      0.4%
260                 1      0.4%
259                 1      0.4%
258                 1      0.4%
257                 1      0.4%
256                 1      0.4%
255                 1      0.4%
254                 1      0.4%
253                 1      0.4%

호선
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.6030534
Minimum: 1
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 3
median: 5
Q3: 6
95-th percentile: 8
Maximum: 8
Range: 7
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.0273206
Coefficient of variation (CV): 0.44042952
Kurtosis: -1.1839681
Mean: 4.6030534
Median Absolute Deviation (MAD): 2
Skewness: -0.060319695
Sum: 1206
Variance: 4.110029
Monotonicity: Increasing
Histogram with fixed size bins (bins=8)
Value               Count  Frequency (%)
2                   50     19.1%
5                   50     19.1%
7                   42     16.0%
6                   37     14.1%
3                   30     11.5%
4                   26     9.9%
8                   17     6.5%
1                   10     3.8%

Value               Count  Frequency (%)
1                   10     3.8%
2                   50     19.1%
3                   30     11.5%
4                   26     9.9%
5                   50     19.1%
6                   37     14.1%
7                   42     16.0%
8                   17     6.5%

Value               Count  Frequency (%)
8                   17     6.5%
7                   42     16.0%
6                   37     14.1%
5                   50     19.1%
4                   26     9.9%
3                   30     11.5%
2                   50     19.1%
1                   10     3.8%
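
Because 호선 has only eight distinct values, its summary statistics can be cross-checked directly from the value counts. The sketch below copies the counts from this report and recomputes the frequency column, sum, mean, variance, and quartiles. The sample (n - 1) variance and the "inclusive" interpolation method are assumptions chosen because they agree with the reported figures.

```python
import statistics

# Value counts for 호선 as listed in this report: {line number: station count}.
counts = {1: 10, 2: 50, 3: 30, 4: 26, 5: 50, 6: 37, 7: 42, 8: 17}

n = sum(counts.values())

# Frequency (%) column: each count divided by the number of observations.
freq_pct = {line: round(100 * c / n, 1) for line, c in counts.items()}

# Expand the counts back into one value per station to recompute the statistics.
values = [line for line, c in sorted(counts.items()) for _ in range(c)]

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (assumed)
median = statistics.median(values)
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")

print(n, total)           # 262 1206
print(freq_pct[2])        # 19.1
print(median, q1, q3)
print(round(mean, 7), round(variance, 6))
```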

역번호
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 262
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1604.1031
Minimum: 150
Maximum: 2827
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.4 KiB

Quantile statistics

Minimum: 150
5-th percentile: 204.05
Q1: 314.25
median: 2526.5
Q3: 2641.75
95-th percentile: 2813.95
Maximum: 2827
Range: 2677
Interquartile range (IQR): 2327.5

Descriptive statistics

Standard deviation: 1178.4456
Coefficient of variation (CV): 0.73464459
Kurtosis: -1.9369219
Mean: 1604.1031
Median Absolute Deviation (MAD): 286
Skewness: -0.2270754
Sum: 420275
Variance: 1388734.1
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
150                 1      0.4%
2561                1      0.4%
2612                1      0.4%
2613                1      0.4%
2614                1      0.4%
2616                1      0.4%
2617                1      0.4%
2618                1      0.4%
2619                1      0.4%
2620                1      0.4%
Other values (252)  252    96.2%

Value               Count  Frequency (%)
150                 1      0.4%
151                 1      0.4%
152                 1      0.4%
153                 1      0.4%
154                 1      0.4%
155                 1      0.4%
156                 1      0.4%
157                 1      0.4%
158                 1      0.4%
159                 1      0.4%

Value               Count  Frequency (%)
2827                1      0.4%
2826                1      0.4%
2825                1      0.4%
2824                1      0.4%
2823                1      0.4%
2822                1      0.4%
2821                1      0.4%
2820                1      0.4%
2819                1      0.4%
2818                1      0.4%

역명
Text

UNIQUE 

Distinct: 262
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 KiB

Length

Max length: 12
Median length: 11
Mean length: 3.8206107
Min length: 2

Characters and Unicode

Total characters: 1001
Distinct characters: 210
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
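
As an illustration of what these character properties look like in practice, the sketch below runs the standard library's `unicodedata` module over the five sample station names shown in this report. It is not the profiler's own implementation, only a small demonstration of length and general-category counting.

```python
import unicodedata
from collections import Counter

# The five sample station names shown in this report, normalised to NFC so that
# each Hangul syllable is a single code point.
names = [unicodedata.normalize("NFC", s)
         for s in ["서울역(1)", "시청(1)", "종각", "종로3가(1)", "종로5가"]]

# String length in code points, the basis of the Length statistics.
lengths = [len(name) for name in names]

# General category of every character: "Lo" (Other Letter), "Nd" (Decimal Number),
# "Ps" (Open Punctuation), "Pe" (Close Punctuation).
categories = Counter(unicodedata.category(ch) for name in names for ch in name)

print(lengths)           # [6, 5, 2, 7, 4]
print(dict(categories))  # {'Lo': 13, 'Ps': 3, 'Nd': 5, 'Pe': 3}
```

The four categories that appear in this small sample are the same four the report lists for the full column.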

Unique

Unique: 262
Unique (%): 100.0%

Sample

1st row: 서울역(1)
2nd row: 시청(1)
3rd row: 종각
4th row: 종로3가(1)
5th row: 종로5가
Value               Count  Frequency (%)
서울역(1            1      0.4%
개롱                1      0.4%
마천                1      0.4%
응암                1      0.4%
역촌                1      0.4%
불광(6              1      0.4%
독바위              1      0.4%
구산                1      0.4%
새절                1      0.4%
증산                1      0.4%
Other values (252)  252    96.2%

Most occurring characters

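Value               Count  Frequency (%)
(                   79     7.9%
)                   79     7.9%
[character lost]    32     3.2%
[character lost]    28     2.8%
[character lost]    22     2.2%
[character lost]    22     2.2%
5                   17     1.7%
[character lost]    17     1.7%
2                   15     1.5%
[character lost]    15     1.5%
Other values (200)  675    67.4%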

Most occurring categories

Value               Count  Frequency (%)
Other Letter        757    75.6%
Decimal Number      86     8.6%
Open Punctuation    79     7.9%
Close Punctuation   79     7.9%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
[character lost]    32     4.2%
[character lost]    28     3.7%
[character lost]    22     2.9%
[character lost]    22     2.9%
[character lost]    17     2.2%
[character lost]    15     2.0%
[character lost]    15     2.0%
[character lost]    14     1.8%
[character lost]    14     1.8%
[character lost]    13     1.7%
Other values (190)  565    74.6%

Decimal Number
Value               Count  Frequency (%)
5                   17     19.8%
2                   15     17.4%
3                   12     14.0%
7                   11     12.8%
6                   11     12.8%
4                   9      10.5%
1                   6      7.0%
8                   5      5.8%

Open Punctuation
Value               Count  Frequency (%)
(                   79     100.0%

Close Punctuation
Value               Count  Frequency (%)
)                   79     100.0%

Most occurring scripts

Value               Count  Frequency (%)
Hangul              757    75.6%
Common              244    24.4%

Most frequent character per script

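Hangul
Value               Count  Frequency (%)
[character lost]    32     4.2%
[character lost]    28     3.7%
[character lost]    22     2.9%
[character lost]    22     2.9%
[character lost]    17     2.2%
[character lost]    15     2.0%
[character lost]    15     2.0%
[character lost]    14     1.8%
[character lost]    14     1.8%
[character lost]    13     1.7%
Other values (190)  565    74.6%

Common
Value               Count  Frequency (%)
(                   79     32.4%
)                   79     32.4%
5                   17     7.0%
2                   15     6.1%
3                   12     4.9%
7                   11     4.5%
6                   11     4.5%
4                   9      3.7%
1                   6      2.5%
8                   5      2.0%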

Most occurring blocks

Value               Count  Frequency (%)
Hangul              757    75.6%
ASCII               244    24.4%

Most frequent character per block

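ASCII
Value               Count  Frequency (%)
(                   79     32.4%
)                   79     32.4%
5                   17     7.0%
2                   15     6.1%
3                   12     4.9%
7                   11     4.5%
6                   11     4.5%
4                   9      3.7%
1                   6      2.5%
8                   5      2.0%

Hangul
Value               Count  Frequency (%)
[character lost]    32     4.2%
[character lost]    28     3.7%
[character lost]    22     2.9%
[character lost]    22     2.9%
[character lost]    17     2.2%
[character lost]    15     2.0%
[character lost]    15     2.0%
[character lost]    14     1.8%
[character lost]    14     1.8%
[character lost]    13     1.7%
Other values (190)  565    74.6%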

2007년 승차인원(일평균)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 260
Distinct (%): 99.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17298.454
Minimum: 626
Maximum: 93652
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.4 KiB

Quantile statistics

Minimum: 626
5-th percentile: 3094.45
Q1: 7663.25
median: 13032
Q3: 21367.5
95-th percentile: 47853.5
Maximum: 93652
Range: 93026
Interquartile range (IQR): 13704.25

Descriptive statistics

Standard deviation: 14612.251
Coefficient of variation (CV): 0.84471429
Kurtosis: 4.8795806
Mean: 17298.454
Median Absolute Deviation (MAD): 6138.5
Skewness: 1.9864686
Sum: 4532195
Variance: 2.1351789 × 10^8
Monotonicity: Not monotonic
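
Several of these figures are derived from one another, which makes a quick consistency check possible without the raw data. The sketch below uses only numbers printed in this report for 2007년 승차인원(일평균). Because the reported mean and standard deviation are rounded, the derived values match only to within rounding.

```python
# Figures copied from this report for 2007년 승차인원(일평균).
n = 262
minimum, maximum = 626, 93652
q1, q3 = 7663.25, 21367.5
mean, std = 17298.454, 14612.251

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance is the squared standard deviation
approx_sum = mean * n            # Sum, up to rounding of the reported mean

print(value_range, iqr)          # 93026 13704.25
print(round(cv, 6))              # compare with the reported 0.84471429
print(f"{variance:.4e}")         # compare with the reported 2.1351789 × 10^8
print(round(approx_sum))         # compare with the reported 4532195
```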
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
12523               2      0.8%
9563                2      0.8%
57360               1      0.4%
8067                1      0.4%
3469                1      0.4%
3984                1      0.4%
2143                1      0.4%
6958                1      0.4%
12339               1      0.4%
9791                1      0.4%
Other values (250)  250    95.4%

Value               Count  Frequency (%)
626                 1      0.4%
1144                1      0.4%
1541                1      0.4%
1607                1      0.4%
1949                1      0.4%
2032                1      0.4%
2143                1      0.4%
2148                1      0.4%
2476                1      0.4%
2772                1      0.4%

Value               Count  Frequency (%)
93652               1      0.4%
75134               1      0.4%
72765               1      0.4%
72269               1      0.4%
60186               1      0.4%
60095               1      0.4%
58992               1      0.4%
57360               1      0.4%
54292               1      0.4%
52871               1      0.4%

2007년 하차인원(일평균)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 262
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17164.027
Minimum: 711
Maximum: 100635
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.4 KiB

Quantile statistics

Minimum: 711
5-th percentile: 2873.4
Q1: 7353.25
median: 12651
Q3: 22213.75
95-th percentile: 48477.25
Maximum: 100635
Range: 99924
Interquartile range (IQR): 14860.5

Descriptive statistics

Standard deviation: 14892.496
Coefficient of variation (CV): 0.86765749
Kurtosis: 5.2741764
Mean: 17164.027
Median Absolute Deviation (MAD): 6064.5
Skewness: 2.0222128
Sum: 4496975
Variance: 2.2178645 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
48510               1      0.4%
5636                1      0.4%
3598                1      0.4%
3607                1      0.4%
2518                1      0.4%
5418                1      0.4%
11567               1      0.4%
8673                1      0.4%
12637               1      0.4%
10038               1      0.4%
Other values (252)  252    96.2%

Value               Count  Frequency (%)
711                 1      0.4%
766                 1      0.4%
1105                1      0.4%
2054                1      0.4%
2221                1      0.4%
2339                1      0.4%
2473                1      0.4%
2477                1      0.4%
2518                1      0.4%
2667                1      0.4%

Value               Count  Frequency (%)
100635              1      0.4%
75374               1      0.4%
68923               1      0.4%
68648               1      0.4%
62012               1      0.4%
58665               1      0.4%
57692               1      0.4%
56878               1      0.4%
53808               1      0.4%
53310               1      0.4%

Interactions

[Figures: 25 pairwise interaction plots of the numeric variables]

Correlations

[Figure: correlation matrix plot]
Variable                 연번     호선     역번호   2007년 승차인원(일평균)  2007년 하차인원(일평균)
연번                     1.000    0.914    0.917    0.602                    0.419
호선                     0.914    1.000    0.996    0.428                    0.390
역번호                   0.917    0.996    1.000    0.415                    0.384
2007년 승차인원(일평균)  0.602    0.428    0.415    1.000                    0.941
2007년 하차인원(일평균)  0.419    0.390    0.384    0.941                    1.000
[Figure: correlation matrix plot]
Variable                 연번     호선     역번호   2007년 승차인원(일평균)  2007년 하차인원(일평균)
연번                     1.000    0.988    1.000    -0.432                   -0.446
호선                     0.988    1.000    0.988    -0.409                   -0.424
역번호                   1.000    0.988    1.000    -0.432                   -0.446
2007년 승차인원(일평균)  -0.432   -0.409   -0.432   1.000                    0.987
2007년 하차인원(일평균)  -0.446   -0.424   -0.446   0.987                    1.000
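
The "High correlation" alerts near the top of the report can be related to these tables by thresholding the absolute coefficients. The sketch below uses the values from the first correlation table and a hypothetical cut-off of 0.9. That cut-off is an assumption chosen for illustration, not a value taken from the report or from ydata-profiling, and `highly_correlated` is a hypothetical helper. With it, the flagged pairs happen to match the alert groupings listed under Alerts.

```python
COLUMNS = ["연번", "호선", "역번호", "2007년 승차인원(일평균)", "2007년 하차인원(일평균)"]

# Pairwise values copied from the first correlation table in this report.
CORR = {
    ("연번", "호선"): 0.914,
    ("연번", "역번호"): 0.917,
    ("연번", "2007년 승차인원(일평균)"): 0.602,
    ("연번", "2007년 하차인원(일평균)"): 0.419,
    ("호선", "역번호"): 0.996,
    ("호선", "2007년 승차인원(일평균)"): 0.428,
    ("호선", "2007년 하차인원(일평균)"): 0.390,
    ("역번호", "2007년 승차인원(일평균)"): 0.415,
    ("역번호", "2007년 하차인원(일평균)"): 0.384,
    ("2007년 승차인원(일평균)", "2007년 하차인원(일평균)"): 0.941,
}

THRESHOLD = 0.9  # hypothetical cut-off, not taken from the report or the library


def highly_correlated(column, threshold=THRESHOLD):
    """Return the other columns whose |correlation| with `column` meets the threshold."""
    partners = []
    for other in COLUMNS:
        if other == column:
            continue
        # The table is symmetric, so look the pair up in either order.
        value = CORR.get((column, other), CORR.get((other, column)))
        if abs(value) >= threshold:
            partners.append(other)
    return partners


for col in COLUMNS:
    print(col, "->", highly_correlated(col))
```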

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index)  연번  호선  역번호  역명         2007년 승차인원(일평균)  2007년 하차인원(일평균)
0        1     1     150     서울역(1)    57360                    48510
1        2     1     151     시청(1)      23706                    24594
2        3     1     152     종각         52333                    51151
3        4     1     153     종로3가(1)   43357                    42162
4        5     1     154     종로5가      27031                    26981
5        6     1     155     동대문(1)    17827                    20576
6        7     1     156     신설동(1)    16547                    15987
7        8     1     157     제기동       19845                    20481
8        9     1     158     청량리       40929                    40385
9        10    1     159     동묘앞(1)    6565                     7211

(index)  연번  호선  역번호  역명           2007년 승차인원(일평균)  2007년 하차인원(일평균)
252      253   8     2818    가락시장(8)    11036                    12003
253      254   8     2819    문정           6237                     6172
254      255   8     2820    장지           3255                     2800
255      256   8     2821    복정(8)        5634                     4166
256      257   8     2822    산성           7179                     6964
257      258   8     2823    남한산성입구   14170                    12738
258      259   8     2824    단대오거리     13876                    12078
259      260   8     2825    신흥           6016                     6369
260      261   8     2826    수진           5657                     5139
261      262   8     2827    모란(8)        4517                     3223