Overview

Dataset statistics

Number of variables            7
Number of observations         816
Missing cells                  0
Missing cells (%)              0.0%
Duplicate rows                 0
Duplicate rows (%)             0.0%
Total size in memory           49.5 KiB
Average record size in memory  62.2 B

Variable types

Categorical  2
Text         1
Numeric      4

Dataset

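Description  Monthly ridership by station for Incheon Line 1, Incheon Line 2, and Line 7 (Incheon and Bucheon sections), operated by Incheon Transit Corporation (인천교통공사), from July 2022 through June 2023. Columns: 호선 (line), 역명 (station name), 연도 (year), 월 (month), 수송인원 (passengers carried), 승차인원 (boarding passengers), 유입인원 (inflow passengers).
URL          https://www.data.go.kr/data/15060441/fileData.do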

Alerts

월 is highly overall correlated with 연도 (High correlation)
수송인원 is highly overall correlated with 승차인원 and 1 other field (High correlation)
승차인원 is highly overall correlated with 수송인원 and 1 other field (High correlation)
유입인원 is highly overall correlated with 수송인원 and 1 other field (High correlation)
연도 is highly overall correlated with 월 (High correlation)
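
The alert pairing 연도 with the month column (월) is probably a by-product of the reporting window rather than a trend. Assuming the data covers July 2022 through June 2023, as the description states, each month number occurs in exactly one year, so the month fully determines the year. A minimal Python sketch of that reasoning follows; the figure of 68 rows per month is taken from the month frequency table in this report, and all variable names are illustrative.

```python
from collections import Counter, defaultdict

ROWS_PER_MONTH = 68  # from the month (월) frequency table in this report

# Reporting window stated in the dataset description: 2022-07 to 2023-06.
window = [(2022, m) for m in range(7, 13)] + [(2023, m) for m in range(1, 7)]

# Map each month number to the set of years in which it appears.
years_by_month = defaultdict(set)
for year, month in window:
    years_by_month[month].add(year)

# Every month maps to exactly one year, so 월 determines 연도.
month_determines_year = all(len(ys) == 1 for ys in years_by_month.values())

# Row counts per year implied by 68 rows per month.
rows_per_year = Counter()
for year, _month in window:
    rows_per_year[year] += ROWS_PER_MONTH

print(month_determines_year)  # True
print(dict(rows_per_year))    # {2022: 408, 2023: 408}
```

The 408/408 split matches the 연도 value counts reported for that variable.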

Reproduction

Analysis started        2023-12-12 15:28:44.754471
Analysis finished       2023-12-12 15:28:47.136709
Duration                2.38 seconds
Software version        ydata-profiling v4.5.1
Download configuration  config.json
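
The reported duration can be recomputed from the two timestamps. A small sketch using only the Python standard library (the format string is an assumption about how the timestamps are laid out):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed layout of the report timestamps
started = datetime.strptime("2023-12-12 15:28:44.754471", FMT)
finished = datetime.strptime("2023-12-12 15:28:47.136709", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 2.38 seconds
```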

Variables

호선
Categorical

Distinct      3
Distinct (%)  0.4%
Missing       0
Missing (%)   0.0%
Memory size   6.5 KiB

Value  Count
1      360
2      324
7      132

Length

Max length     1
Median length  1
Mean length    1
Min length     1

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  1
2nd row  1
3rd row  1
4th row  1
5th row  1

Common Values

Value  Count  Frequency (%)
1      360    44.1%
2      324    39.7%
7      132    16.2%
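
The Frequency (%) column is each count divided by the 816 rows. As a hedged illustration, the sketch below recomputes the percentages and, under the assumption that every station entry on a line contributes 12 monthly rows, derives the number of station entries per line. That assumption is an inference from the report, not something the source states.

```python
counts = {"1": 360, "2": 324, "7": 132}  # 호선 value counts from the report
total = sum(counts.values())

# Frequency (%): share of all rows, rounded to one decimal place.
freq = {line: round(100 * n / total, 1) for line, n in counts.items()}

# Assumption: 12 monthly rows per station entry on each line.
MONTHS = 12
stations = {line: n // MONTHS for line, n in counts.items()}

print(total)     # 816
print(freq)      # {'1': 44.1, '2': 39.7, '7': 16.2}
print(stations)  # {'1': 30, '2': 27, '7': 11}
```

The 68 station entries implied here (30 + 27 + 11) agree with the 68 rows recorded for each month.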

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the value counts listed under Common Values]

역명
Text

Distinct      65
Distinct (%)  8.0%
Missing       0
Missing (%)   0.0%
Memory size   6.5 KiB

Length

Max length     8
Median length  7
Mean length    3.6470588
Min length     2

Characters and Unicode

Total characters     2976
Distinct characters  115
Distinct categories  1
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  계양
2nd row  귤현
3rd row  박촌
4th row  임학
5th row  계산

Value              Count  Frequency (%)
부평구청           24     2.9%
석남               24     2.9%
인천시청           24     2.9%
삼산체육관         12     1.5%
상동               12     1.5%
시민공원           12     1.5%
주안               12     1.5%
주안국가산단       12     1.5%
가재울             12     1.5%
인천가좌           12     1.5%
Other values (55)  660    80.9%
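
The station-name counts tie back to the overview figures. The sketch below, written in Python with illustrative variable names, rebuilds the row count and the number of distinct names from the table above, then recomputes the reported mean length as total characters divided by rows.

```python
# Counts of the ten most common station names, as listed in the table.
top_counts = [24, 24, 24] + [12] * 7
other_rows = 660        # rows under "Other values (55)"
other_names = 55        # distinct names under "Other values (55)"

rows = sum(top_counts) + other_rows
distinct_names = len(top_counts) + other_names

# Mean length = total characters / number of rows.
total_characters = 2976
mean_length = total_characters / rows

print(rows, distinct_names)   # 816 65
print(round(mean_length, 7))  # 3.6470588
```

Three names appear 24 times rather than 12, which is consistent with those stations being recorded under two lines each, although the report does not say so explicitly.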

Most occurring characters

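Value          Count  Frequency (%)
(unrecovered)  120    4.0%
(unrecovered)  108    3.6%
(unrecovered)  96     3.2%
(unrecovered)  96     3.2%
(unrecovered)  84     2.8%
(unrecovered)  84     2.8%
(unrecovered)  84     2.8%
(unrecovered)  84     2.8%
(unrecovered)  60     2.0%
(unrecovered)  60     2.0%
Other values (105)  2100  70.6%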

Most occurring categories

Value         Count  Frequency (%)
Other Letter  2976   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  2976   100.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  2976   100.0%

연도
Categorical

HIGH CORRELATION 

Distinct      2
Distinct (%)  0.2%
Missing       0
Missing (%)   0.0%
Memory size   6.5 KiB

Value  Count
2022   408
2023   408

Length

Max length     4
Median length  4
Mean length    4
Min length     4

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  2022
2nd row  2022
3rd row  2022
4th row  2022
5th row  2022

Common Values

Value  Count  Frequency (%)
2022   408    50.0%
2023   408    50.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the value counts listed under Common Values]


월
Real number (ℝ)

HIGH CORRELATION 

Distinct      12
Distinct (%)  1.5%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          6.5
Minimum       1
Maximum       12
Zeros         0
Zeros (%)     0.0%
Negative      0
Negative (%)  0.0%
Memory size   7.3 KiB

Quantile statistics

Minimum                    1
5-th percentile            1
Q1                         3.75
Median                     6.5
Q3                         9.25
95-th percentile           12
Maximum                    12
Range                      11
Interquartile range (IQR)  5.5

Descriptive statistics

Standard deviation               3.4541697
Coefficient of variation (CV)    0.53141072
Kurtosis                         -1.2168829
Mean                             6.5
Median Absolute Deviation (MAD)  3
Skewness                         0
Sum                              5304
Variance                         11.931288
Monotonicity                     Not monotonic
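
Because the month column (월) holds each value from 1 to 12 exactly 68 times, its statistics can be reproduced from the frequency table alone. The following Python sketch uses the standard `statistics` module. It assumes the report uses linearly interpolated quantiles and the sample (n - 1) variance; both assumptions are consistent with the figures above but are not stated in the source.

```python
import statistics

# Rebuild the column from its frequency table: months 1..12, 68 rows each.
months = sorted(m for m in range(1, 13) for _ in range(68))

mean = statistics.mean(months)
# "inclusive" interpolates linearly between the closest ranks.
q1, median, q3 = statistics.quantiles(months, n=4, method="inclusive")
variance = statistics.variance(months)  # sample variance (n - 1 denominator)
std = statistics.stdev(months)
# Median absolute deviation around the median.
mad = statistics.median(abs(m - median) for m in months)

print(mean, q1, median, q3)   # 6.5 3.75 6.5 9.25
print(round(variance, 6))     # 11.931288
print(round(std, 7))          # 3.4541697
print(round(std / mean, 8))   # 0.53141072
print(mad, sum(months))       # 3.0 5304
```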
[Figure: Histogram with fixed size bins (bins=12)]

Common values

Value             Count  Frequency (%)
7                 68     8.3%
8                 68     8.3%
9                 68     8.3%
10                68     8.3%
11                68     8.3%
12                68     8.3%
1                 68     8.3%
2                 68     8.3%
3                 68     8.3%
4                 68     8.3%
Other values (2)  136    16.7%

Minimum 10 values

Value  Count  Frequency (%)
1      68     8.3%
2      68     8.3%
3      68     8.3%
4      68     8.3%
5      68     8.3%
6      68     8.3%
7      68     8.3%
8      68     8.3%
9      68     8.3%
10     68     8.3%

Maximum 10 values

Value  Count  Frequency (%)
12     68     8.3%
11     68     8.3%
10     68     8.3%
9      68     8.3%
8      68     8.3%
7      68     8.3%
6      68     8.3%
5      68     8.3%
4      68     8.3%
3      68     8.3%

수송인원
Real number (ℝ)

HIGH CORRELATION 

Distinct      803
Distinct (%)  98.4%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          261385.89
Minimum       42667
Maximum       767627
Zeros         0
Zeros (%)     0.0%
Negative      0
Negative (%)  0.0%
Memory size   7.3 KiB

Quantile statistics

Minimum                    42667
5-th percentile            76012.5
Q1                         143762
Median                     213768.5
Q3                         347553.75
95-th percentile           603784
Maximum                    767627
Range                      724960
Interquartile range (IQR)  203791.75

Descriptive statistics

Standard deviation               159579.27
Coefficient of variation (CV)    0.61051218
Kurtosis                         0.45794727
Mean                             261385.89
Median Absolute Deviation (MAD)  93736
Skewness                         1.0081476
Sum                              2.1329089 × 10^8
Variance                         2.5465544 × 10^10
Monotonicity                     Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
426279              2      0.2%
623107              2      0.2%
273743              2      0.2%
319231              2      0.2%
289144              2      0.2%
428036              2      0.2%
629208              2      0.2%
251089              2      0.2%
660683              2      0.2%
199894              2      0.2%
Other values (793)  796    97.5%

Minimum 10 values

Value  Count  Frequency (%)
42667  1      0.1%
43740  1      0.1%
44909  1      0.1%
46318  1      0.1%
46812  1      0.1%
46855  1      0.1%
46911  1      0.1%
47681  1      0.1%
47876  1      0.1%
47899  1      0.1%

Maximum 10 values

Value   Count  Frequency (%)
767627  1      0.1%
760502  1      0.1%
749327  1      0.1%
736092  1      0.1%
729485  1      0.1%
728097  1      0.1%
727041  1      0.1%
710549  1      0.1%
709697  1      0.1%
708112  1      0.1%

승차인원
Real number (ℝ)

HIGH CORRELATION 

Distinct      802
Distinct (%)  98.3%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          176829.16
Minimum       29071
Maximum       448602
Zeros         0
Zeros (%)     0.0%
Negative      0
Negative (%)  0.0%
Memory size   7.3 KiB

Quantile statistics

Minimum                    29071
5-th percentile            56609.75
Q1                         101116.25
Median                     152456
Q3                         231798.5
95-th percentile           387591
Maximum                    448602
Range                      419531
Interquartile range (IQR)  130682.25

Descriptive statistics

Standard deviation               98782.63
Coefficient of variation (CV)    0.55863315
Kurtosis                         -0.089398781
Mean                             176829.16
Median Absolute Deviation (MAD)  62397
Skewness                         0.81913328
Sum                              1.442926 × 10^8
Variance                         9.7580081 × 10^9
Monotonicity                     Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
258035              2      0.2%
102199              2      0.2%
194801              2      0.2%
125468              2      0.2%
387591              2      0.2%
372340              2      0.2%
386251              2      0.2%
157373              2      0.2%
257129              2      0.2%
190354              2      0.2%
Other values (792)  796    97.5%

Minimum 10 values

Value  Count  Frequency (%)
29071  1      0.1%
29496  1      0.1%
32264  1      0.1%
32377  1      0.1%
32607  1      0.1%
32867  1      0.1%
33452  1      0.1%
33540  1      0.1%
33560  1      0.1%
33564  1      0.1%

Maximum 10 values

Value   Count  Frequency (%)
448602  1      0.1%
442947  1      0.1%
434614  1      0.1%
432564  1      0.1%
432424  1      0.1%
428828  1      0.1%
428705  1      0.1%
426295  1      0.1%
426204  1      0.1%
423442  1      0.1%

유입인원
Real number (ℝ)

HIGH CORRELATION 

Distinct      799
Distinct (%)  97.9%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          84556.73
Minimum       11320
Maximum       325888
Zeros         0
Zeros (%)     0.0%
Negative      0
Negative (%)  0.0%
Memory size   7.3 KiB

Quantile statistics

Minimum                    11320
5-th percentile            17792.25
Q1                         39355.75
Median                     61777
Q3                         112208.75
95-th percentile           209840.75
Maximum                    325888
Range                      314568
Interquartile range (IQR)  72853

Descriptive statistics

Standard deviation               63463.471
Coefficient of variation (CV)    0.7505431
Kurtosis                         1.9029406
Mean                             84556.73
Median Absolute Deviation (MAD)  31429.5
Skewness                         1.436243
Sum                              68998292
Variance                         4.0276121 × 10^9
Monotonicity                     Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
133722              3      0.4%
273092              2      0.2%
93716               2      0.2%
90121               2      0.2%
60123               2      0.2%
114412              2      0.2%
98790               2      0.2%
102384              2      0.2%
31631               2      0.2%
170907              2      0.2%
Other values (789)  795    97.4%

Minimum 10 values

Value  Count  Frequency (%)
11320  1      0.1%
11517  1      0.1%
11869  1      0.1%
12614  1      0.1%
12645  1      0.1%
12727  1      0.1%
12830  1      0.1%
13001  1      0.1%
13252  1      0.1%
13340  1      0.1%

Maximum 10 values

Value   Count  Frequency (%)
325888  1      0.1%
320499  1      0.1%
319025  1      0.1%
307387  1      0.1%
300746  1      0.1%
297988  1      0.1%
297061  1      0.1%
295913  1      0.1%
291638  1      0.1%
290952  1      0.1%

Interactions

[Figure: 16 interaction plots between the numeric variables]

Correlations

[Figure: correlation heatmap]

          호선   역명   연도   월     수송인원  승차인원  유입인원
호선      1.000  0.993  0.000  0.000  0.651     0.569     0.651
역명      0.993  1.000  0.000  0.000  0.965     0.961     0.956
연도      0.000  0.000  1.000  1.000  0.000     0.000     0.000
월        0.000  0.000  1.000  1.000  0.000     0.000     0.000
수송인원  0.651  0.965  0.000  0.000  1.000     0.967     0.951
승차인원  0.569  0.961  0.000  0.000  0.967     1.000     0.912
유입인원  0.651  0.956  0.000  0.000  0.951     0.912     1.000

[Figure: correlation heatmap]

      호선   연도
호선  1.000  0.000
연도  0.000  1.000

[Figure: correlation heatmap]

          월     수송인원  승차인원  유입인원  호선   연도
월        1.000  0.013     0.017     0.009     0.000  0.995
수송인원  0.013  1.000     0.995     0.984     0.496  0.000
승차인원  0.017  0.995     1.000     0.961     0.411  0.000
유입인원  0.009  0.984     0.961     1.000     0.498  0.000
호선      0.000  0.496     0.411     0.498     1.000  0.000
연도      0.995  0.000     0.000     0.000     0.000  1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly pick out patterns in data completion.]

Sample

First rows

Index  호선  역명            연도  월  수송인원  승차인원  유입인원
0      1     계양            2022  7   139603    92140     47463
1      1     귤현            2022  7   47943     33540     14403
2      1     박촌            2022  7   147928    104651    43277
3      1     임학            2022  7   341804    216928    124876
4      1     계산            2022  7   424811    278485    146326
5      1     경인교대입구    2022  7   228196    150985    77211
6      1     작전            2022  7   575024    377453    197571
7      1     갈산            2022  7   348927    233826    115101
8      1     부평구청        2022  7   270916    185960    84956
9      1     부평시장        2022  7   523187    350689    172498

Last rows

Index  호선  역명            연도  월  수송인원  승차인원  유입인원
806    7     부천종합운동장  2023  6   227628    141763    85865
807    7     춘의            2023  6   482399    284843    197556
808    7     신중동          2023  6   749327    428828    320499
809    7     부천시청        2023  6   695045    404093    290952
810    7     상동            2023  6   709697    421624    288073
811    7     삼산체육관      2023  6   287991    177160    110831
812    7     굴포천          2023  6   473651    274646    199005
813    7     부평구청        2023  6   325076    205664    119412
814    7     산곡            2023  6   388541    236808    151733
815    7     석남            2023  6   342815    210248    132567
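
In the sample rows, 수송인원 equals 승차인원 plus 유입인원, which would go a long way toward explaining the high-correlation alerts among the three columns. The Python sketch below checks that relation on a handful of the sample rows and against the reported column sums. Whether it holds for every one of the 816 rows is an assumption that the report alone cannot confirm.

```python
import math

# (호선, 역명, 연도, 월, 수송인원, 승차인원, 유입인원) taken from the sample rows.
sample = [
    (1, "계양", 2022, 7, 139603, 92140, 47463),
    (1, "귤현", 2022, 7, 47943, 33540, 14403),
    (1, "박촌", 2022, 7, 147928, 104651, 43277),
    (7, "부평구청", 2023, 6, 325076, 205664, 119412),
    (7, "산곡", 2023, 6, 388541, 236808, 151733),
    (7, "석남", 2023, 6, 342815, 210248, 132567),
]

# Rows where carried != boarded + inflow.
mismatches = [row for row in sample if row[4] != row[5] + row[6]]
print(len(mismatches))  # 0

# Column sums as printed in the report (rounded to the displayed precision).
carried_sum = 2.1329089e8
boarded_sum = 1.442926e8
inflow_sum = 68998292
sums_agree = math.isclose(boarded_sum + inflow_sum, carried_sum, rel_tol=1e-6)
print(sums_agree)  # True
```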