Overview

Dataset statistics

Number of variables: 5
Number of observations: 99
Missing cells: 6
Missing cells (%): 1.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.2 KiB
Average record size in memory: 43.3 B
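As a quick consistency check, two of the figures above can be re-derived from the others. This is a minimal sketch, assuming the missing-cell percentage is missing cells divided by total cells (observations × variables) and that 1 KiB is 1024 B. The variable names are illustrative only.

```python
# Re-derive two dataset statistics from the figures reported above.
n_variables = 5
n_observations = 99
missing_cells = 6
total_size_kib = 4.2  # reported value, already rounded to one decimal

total_cells = n_variables * n_observations        # 495 cells in the table
missing_pct = 100 * missing_cells / total_cells   # share of empty cells

# The reported total is rounded, so this only approximates the 43.3 B shown.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

The small gap between the derived record size and the reported one is expected, because the total size is only given to one decimal place.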

Variable types

Categorical: 2
Text: 1
Numeric: 2

Dataset

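Description: Contains the railway operator name (철도운영기관명), line name (선명), station name (역명), longitude (경도), and latitude (위도) of the urban and metropolitan railway stations managed under Seoul Metropolitan Area Line 1 (수도권 1호선).
Author: 국가철도공단 (Korea National Railway)
URL: https://www.data.go.kr/data/15041300/fileData.do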

Alerts

선명 (line name) has constant value "1호선" [Constant]
철도운영기관명 (railway operator name) is highly imbalanced (52.8%) [Imbalance]
경도 (longitude) has 3 (3.0%) missing values [Missing]
위도 (latitude) has 3 (3.0%) missing values [Missing]
역명 (station name) has unique values [Unique]
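The 52.8% imbalance figure is consistent with an entropy-based score: one minus the Shannon entropy of the class distribution divided by the maximum possible entropy, log2(k) for k classes. The sketch below assumes that definition (the report itself does not state the formula) and uses the counts 89 and 10 listed for 철도운영기관명. The function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max for class counts (0 = balanced, 1 = a single class)."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # Assumption: a one-class column is flagged as constant, not as imbalanced.
        return 0.0
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(k)

score = imbalance_score([89, 10])
print(f"{score:.1%}")
```

With the two operator counts this evaluates to roughly 0.528, matching the alert. A perfectly balanced split such as `[50, 50]` gives 0.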

Reproduction

Analysis started: 2023-12-12 15:36:07.343889
Analysis finished: 2023-12-12 15:36:08.393492
Duration: 1.05 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
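A report of this kind could be regenerated along the following lines. This is a sketch rather than the exact invocation used here: it assumes ydata-profiling is installed and that the station table has already been loaded into a pandas DataFrame. The helper name `build_profile` and the title are hypothetical; the description, author, and URL are the ones shown in the Dataset section.

```python
def build_profile(df):
    """Build a profile report for a DataFrame holding the Line 1 station table."""
    # Imported lazily so the optional dependency is only needed when this runs.
    from ydata_profiling import ProfileReport

    return ProfileReport(
        df,
        title="Line 1 stations",  # hypothetical title, not taken from the report
        dataset={
            "description": "Operator, line, station name, longitude and latitude "
                           "of Line 1 stations",
            "author": "국가철도공단",
            "url": "https://www.data.go.kr/data/15041300/fileData.do",
        },
    )
```

Given such a DataFrame `df`, `build_profile(df).to_file("report.html")` would write the HTML report; the output file name is a placeholder.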

Variables

철도운영기관명 (railway operator name)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B
코레일 (Korail): 89
서울교통공사 (Seoul Metro): 10

Length

Max length: 6
Median length: 3
Mean length: 3.3030303
Min length: 3
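The length statistics follow directly from the two category values and their counts: 코레일 has 3 characters and occurs 89 times, 서울교통공사 has 6 characters and occurs 10 times. A minimal sketch using only the standard library, assuming the strings are in composed (NFC) form so that each Hangul syllable counts as one character:

```python
from statistics import mean, median

# Category values and counts as listed for this variable.
value_counts = {"코레일": 89, "서울교통공사": 10}

# Expand into one string length per observation (99 entries in total).
lengths = [len(value) for value, count in value_counts.items() for _ in range(count)]

print("max:", max(lengths))
print("median:", median(lengths))
print("mean:", round(mean(lengths), 7))
print("min:", min(lengths))
```

The mean works out to (89 × 3 + 10 × 6) / 99 = 327 / 99, which agrees with the reported 3.3030303.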

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 코레일
2nd row: 코레일
3rd row: 코레일
4th row: 코레일
5th row: 코레일

Common Values

Value          Count  Frequency (%)
코레일          89     89.9%
서울교통공사     10     10.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

선명 (line name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B
1호선 (Line 1): 99

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1호선
2nd row: 1호선
3rd row: 1호선
4th row: 1호선
5th row: 1호선

Common Values

Value   Count  Frequency (%)
1호선    99     100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

역명 (station name)
Text

UNIQUE

Distinct: 99
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B

Length

Max length: 12
Median length: 2
Mean length: 2.6464646
Min length: 2

Characters and Unicode

Total characters: 262
Distinct characters: 118
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 99
Unique (%): 100.0%

Sample

1st row: 지행
2nd row: 도봉
3rd row: 창동
4th row: 월계
5th row: 외대앞

Value               Count  Frequency (%)
지행                 1      1.0%
간석                 1      1.0%
오산                 1      1.0%
오산대               1      1.0%
세마                 1      1.0%
서동탄               1      1.0%
병점                 1      1.0%
수원                 1      1.0%
화서                 1      1.0%
의왕                 1      1.0%
Other values (89)   89     89.9%

Most occurring characters

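Value                Count  Frequency (%)
(not preserved)      12     4.6%
(not preserved)      10     3.8%
(not preserved)      10     3.8%
(not preserved)      9      3.4%
(not preserved)      7      2.7%
(not preserved)      5      1.9%
(not preserved)      5      1.9%
(not preserved)      5      1.9%
(not preserved)      4      1.5%
(not preserved)      4      1.5%
Other values (108)   191    72.9%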

Most occurring categories

Value               Count  Frequency (%)
Other Letter        254    96.9%
Open Punctuation    3      1.1%
Close Punctuation   3      1.1%
Decimal Number      2      0.8%

Most frequent character per category

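Other Letter
Value                Count  Frequency (%)
(not preserved)      12     4.7%
(not preserved)      10     3.9%
(not preserved)      10     3.9%
(not preserved)      9      3.5%
(not preserved)      7      2.8%
(not preserved)      5      2.0%
(not preserved)      5      2.0%
(not preserved)      5      2.0%
(not preserved)      4      1.6%
(not preserved)      4      1.6%
Other values (104)   183    72.0%

Decimal Number
Value   Count  Frequency (%)
3       1      50.0%
5       1      50.0%

Open Punctuation
Value   Count  Frequency (%)
(       3      100.0%

Close Punctuation
Value   Count  Frequency (%)
)       3      100.0%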

Most occurring scripts

Value    Count  Frequency (%)
Hangul   254    96.9%
Common   8      3.1%

Most frequent character per script

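Hangul
Value                Count  Frequency (%)
(not preserved)      12     4.7%
(not preserved)      10     3.9%
(not preserved)      10     3.9%
(not preserved)      9      3.5%
(not preserved)      7      2.8%
(not preserved)      5      2.0%
(not preserved)      5      2.0%
(not preserved)      5      2.0%
(not preserved)      4      1.6%
(not preserved)      4      1.6%
Other values (104)   183    72.0%

Common
Value   Count  Frequency (%)
(       3      37.5%
)       3      37.5%
3       1      12.5%
5       1      12.5%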

Most occurring blocks

Value    Count  Frequency (%)
Hangul   254    96.9%
ASCII    8      3.1%

Most frequent character per block

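Hangul
Value                Count  Frequency (%)
(not preserved)      12     4.7%
(not preserved)      10     3.9%
(not preserved)      10     3.9%
(not preserved)      9      3.5%
(not preserved)      7      2.8%
(not preserved)      5      2.0%
(not preserved)      5      2.0%
(not preserved)      5      2.0%
(not preserved)      4      1.6%
(not preserved)      4      1.6%
Other values (104)   183    72.0%

ASCII
Value   Count  Frequency (%)
(       3      37.5%
)       3      37.5%
3       1      12.5%
5       1      12.5%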

경도 (longitude)
Real number (ℝ)

MISSING

Distinct: 96
Distinct (%): 100.0%
Missing: 3
Missing (%): 3.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 126.95733
Minimum: 126.61694
Maximum: 127.1489
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1023.0 B

Quantile statistics

Minimum: 126.61694
5-th percentile: 126.67764
Q1: 126.89101
Median: 127.00086
Q3: 127.05323
95-th percentile: 127.12277
Maximum: 127.1489
Range: 0.531969
Interquartile range (IQR): 0.1622205

Descriptive statistics

Standard deviation: 0.13330389
Coefficient of variation (CV): 0.0010499897
Kurtosis: 0.086741559
Mean: 126.95733
Median Absolute Deviation (MAD): 0.061851
Skewness: -0.971432
Sum: 12187.904
Variance: 0.017769928
Monotonicity: Not monotonic
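Several of these statistics are simple functions of the others, which makes the report easy to sanity-check. The sketch below re-derives the range, IQR, coefficient of variation, variance, and mean from the figures reported for 경도. All inputs are the rounded values shown in the report, so agreement is only expected to within rounding; the variable names are illustrative.

```python
# Reported figures for 경도 (longitude), copied from the report; all are rounded.
minimum, maximum = 126.616936, 127.148905
q1, q3 = 126.89101, 127.05323
mean, std = 126.95733, 0.13330389
total, n_valid = 12187.904, 99 - 3  # 3 of the 99 values are missing

value_range = maximum - minimum     # reported: 0.531969
iqr = q3 - q1                       # reported: 0.1622205
cv = std / mean                     # reported: 0.0010499897
variance = std ** 2                 # reported: 0.017769928
mean_from_sum = total / n_valid     # reported mean: 126.95733

for name, derived in [("range", value_range), ("IQR", iqr), ("CV", cv),
                      ("variance", variance), ("mean", mean_from_sum)]:
    print(f"{name}: {derived:.7f}")
```

Note that the mean is the sum divided by the 96 non-missing values, not by all 99 observations.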
[Figure: histogram with fixed size bins (bins=50)]

Common Values

Value               Count  Frequency (%)
126.764474          1      1.0%
127.054397          1      1.0%
127.062342          1      1.0%
127.066708          1      1.0%
127.063081          1      1.0%
127.043279          1      1.0%
127.051893          1      1.0%
127.033268          1      1.0%
126.999821          1      1.0%
126.989581          1      1.0%
Other values (86)   86     86.9%
(Missing)           3      3.0%

Minimum 10 values

Value        Count  Frequency (%)
126.616936   1      1.0%
126.632598   1      1.0%
126.642824   1      1.0%
126.657165   1      1.0%
126.66846    1      1.0%
126.680702   1      1.0%
126.693408   1      1.0%
126.703016   1      1.0%
126.707179   1      1.0%
126.724584   1      1.0%

Maximum 10 values

Value        Count  Frequency (%)
127.148905   1      1.0%
127.146417   1      1.0%
127.143774   1      1.0%
127.136319   1      1.0%
127.127007   1      1.0%
127.121363   1      1.0%
127.105049   1      1.0%
127.085496   1      1.0%
127.080446   1      1.0%
127.070269   1      1.0%

위도 (latitude)
Real number (ℝ)

MISSING

Distinct: 96
Distinct (%): 100.0%
Missing: 3
Missing (%): 3.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.412426
Minimum: 36.769213
Maximum: 37.927569
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1023.0 B

Quantile statistics

Minimum: 36.769213
5-th percentile: 36.793357
Q1: 37.296256
Median: 37.484603
Q3: 37.571545
95-th percentile: 37.785401
Maximum: 37.927569
Range: 1.158356
Interquartile range (IQR): 0.275289

Descriptive statistics

Standard deviation: 0.28419997
Coefficient of variation (CV): 0.0075964057
Kurtosis: 0.21486624
Mean: 37.412426
Median Absolute Deviation (MAD): 0.1083765
Skewness: -0.8643159
Sum: 3591.5929
Variance: 0.080769621
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common Values

Value               Count  Frequency (%)
37.486664           1      1.0%
37.075376           1      1.0%
37.109771           1      1.0%
37.145472           1      1.0%
37.169401           1      1.0%
37.187288           1      1.0%
37.195675           1      1.0%
37.206821           1      1.0%
37.266162           1      1.0%
37.28399            1      1.0%
Other values (86)   86     86.9%
(Missing)           3      3.0%

Minimum 10 values

Value       Count  Frequency (%)
36.769213   1      1.0%
36.777541   1      1.0%
36.780419   1      1.0%
36.788272   1      1.0%
36.792195   1      1.0%
36.793745   1      1.0%
36.801578   1      1.0%
36.810238   1      1.0%
36.833791   1      1.0%
36.870921   1      1.0%

Maximum 10 values

Value       Count  Frequency (%)
37.927569   1      1.0%
37.901818   1      1.0%
37.892361   1      1.0%
37.843216   1      1.0%
37.818761   1      1.0%
37.774281   1      1.0%
37.759416   1      1.0%
37.748362   1      1.0%
37.73873    1      1.0%
37.724037   1      1.0%

Interactions

[Figures: four pairwise interaction plots]

Correlations

[Figure: correlation heatmap]

                 철도운영기관명   역명     경도     위도
철도운영기관명     1.000          1.000   0.587   0.226
역명              1.000          1.000   1.000   1.000
경도              0.587          1.000   1.000   0.808
위도              0.226          1.000   0.808   1.000

[Figure: correlation heatmap]

                 경도      위도      철도운영기관명
경도              1.000    -0.030   0.433
위도              -0.030   1.000    0.163
철도운영기관명     0.433    0.163    1.000

Missing values

[Figure: nullity by column]
A simple visualization of nullity by column.
[Figure: nullity matrix]
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure: nullity correlation heatmap]
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows

     철도운영기관명   선명    역명      경도          위도
0    코레일          1호선   지행      127.055734   37.892361
1    코레일          1호선   도봉      <NA>         <NA>
2    코레일          1호선   창동      127.047663   37.653181
3    코레일          1호선   월계      127.058835   37.633114
4    코레일          1호선   외대앞    127.063572   37.596157
5    코레일          1호선   대방      126.926407   37.513392
6    코레일          1호선   신길      126.917743   37.517018
7    코레일          1호선   덕정      127.061511   37.843216
8    코레일          1호선   구일      126.869599   37.496273
9    코레일          1호선   온수      126.823399   37.492267

Last rows

     철도운영기관명   선명    역명                      경도          위도
89   서울교통공사     1호선   서울역                    126.972533   37.55315
90   서울교통공사     1호선   시청                      126.975407   37.56359
91   서울교통공사     1호선   종각                      126.983116   37.570203
92   서울교통공사     1호선   종로3가                   126.992095   37.570429
93   서울교통공사     1호선   종로5가                   127.0019     37.570971
94   서울교통공사     1호선   동대문                    127.00789    37.565145
95   서울교통공사     1호선   신설동                    127.02471    37.576117
96   서울교통공사     1호선   제기동                    127.034902   37.578116
97   서울교통공사     1호선   청량리(서울시립대입구)      127.045063   37.580148
98   서울교통공사     1호선   동묘앞                    127.016459   37.573265
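The `<NA>` entries in the sample are the cells counted by the Missing alerts. As an illustration, the sketch below counts missing cells per column for the first ten sample rows only, using plain Python with `None` standing in for `<NA>`. It covers just these ten rows, not the full 99-row dataset.

```python
# First ten sample rows: (철도운영기관명, 선명, 역명, 경도, 위도); None stands for <NA>.
columns = ["철도운영기관명", "선명", "역명", "경도", "위도"]
rows = [
    ("코레일", "1호선", "지행", 127.055734, 37.892361),
    ("코레일", "1호선", "도봉", None, None),
    ("코레일", "1호선", "창동", 127.047663, 37.653181),
    ("코레일", "1호선", "월계", 127.058835, 37.633114),
    ("코레일", "1호선", "외대앞", 127.063572, 37.596157),
    ("코레일", "1호선", "대방", 126.926407, 37.513392),
    ("코레일", "1호선", "신길", 126.917743, 37.517018),
    ("코레일", "1호선", "덕정", 127.061511, 37.843216),
    ("코레일", "1호선", "구일", 126.869599, 37.496273),
    ("코레일", "1호선", "온수", 126.823399, 37.492267),
]

# Count missing cells per column.
missing = {col: sum(row[i] is None for row in rows) for i, col in enumerate(columns)}

# Stations whose coordinates are incomplete.
incomplete = [row[2] for row in rows if row[3] is None or row[4] is None]

print(missing)
print(incomplete)
```

In these ten rows only 도봉 lacks coordinates, which accounts for one of the three missing values reported for each of 경도 and 위도.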