Overview

Dataset statistics

Number of variables: 5
Number of observations: 202
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 8.6 KiB
Average record size in memory: 43.6 B

Variable types

Categorical: 1
Text: 1
Numeric: 3

Dataset

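Description: Location (latitude and longitude) and number of exits for each mainline station operated by Korea Railroad (한국철도). The data shows station locations and exit counts by regional headquarters.
Author: 한국철도공사 (Korea Railroad Corporation)
URL: https://www.data.go.kr/data/15127532/fileData.do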

Alerts

위도 is highly overall correlated with 지역본부 (High correlation)
지역본부 is highly overall correlated with 위도 (High correlation)
역명 has unique values (Unique)

Reproduction

Analysis started: 2024-04-13 12:35:34.963993
Analysis finished: 2024-04-13 12:35:39.758887
Duration: 4.79 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
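
A report like this one can be regenerated from the source CSV. The following is a minimal sketch, not the exact configuration used here (that is stored in config.json). The function name `build_report`, the output file name, the report title, and the CP949 encoding are assumptions for illustration; `pandas.read_csv`, `ProfileReport(df, title=...)`, and `ProfileReport.to_file` are existing library calls.

```python
from pathlib import Path


def build_report(csv_path: str, out_html: str = "report.html",
                 encoding: str = "cp949") -> Path:
    """Profile a CSV file and write an HTML report; returns the output path."""
    # Imported lazily so this module loads even where the third-party
    # packages (pandas, ydata-profiling) are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the downloaded file is CP949-encoded.
    # Pass encoding="utf-8" if it is not.
    df = pd.read_csv(csv_path, encoding=encoding)

    # The title is a hypothetical label, not taken from the original report.
    report = ProfileReport(df, title="Station locations and exit counts")
    report.to_file(out_html)
    return Path(out_html)
```

Calling `build_report("stations.csv")` with a hypothetical local file name would write `report.html` to the working directory.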

Variables

지역본부
Categorical

HIGH CORRELATION

Distinct: 8
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

대전충청: 54
대구경북: 43
광주전남: 31
부산경남: 28
서울본부: 18
Other values (3): 28

Length

Max length: 5
Median length: 4
Mean length: 3.9009901
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울본부
2nd row: 서울본부
3rd row: 서울본부
4th row: 서울본부
5th row: 서울본부

Common Values

Value       Count  Frequency (%)
대전충청    54     26.7%
대구경북    43     21.3%
광주전남    31     15.3%
부산경남    28     13.9%
서울본부    18     8.9%
강원본부    12     5.9%
전북        12     5.9%
수도권광역  4      2.0%
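
The frequency percentages and the Length statistics for 지역본부 follow directly from the counts in the Common Values table. The sketch below recomputes them with the Python standard library; the variable names are illustrative, and the counts are copied from the table.

```python
from collections import Counter
from statistics import mean, median

# Counts copied from the Common Values table for 지역본부.
counts = Counter({
    "대전충청": 54, "대구경북": 43, "광주전남": 31, "부산경남": 28,
    "서울본부": 18, "강원본부": 12, "전북": 12, "수도권광역": 4,
})
total = sum(counts.values())  # number of observations

# Frequency (%) of each category, rounded to one decimal as in the report.
freq_pct = {name: round(100 * n / total, 1) for name, n in counts.items()}

# One label length per observation, for the Length statistics.
lengths = [len(name) for name, n in counts.items() for _ in range(n)]

print(total, freq_pct)
print(min(lengths), median(lengths), max(lengths), round(mean(lengths), 7))
```

The counts sum to the 202 observations, and the length statistics come out as min 2, median 4, max 5, and mean 3.9009901, matching the values reported for this variable.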

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: plot of the category counts for 지역본부 (대전충청 54, 대구경북 43, 광주전남 31, 부산경남 28, 서울본부 18, 강원본부 12, 전북 12, 수도권광역 4)

역명
Text

UNIQUE

Distinct: 202
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 7
Median length: 2
Mean length: 2.3316832
Min length: 2

Characters and Unicode

Total characters: 471
Distinct characters: 168
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 202
Unique (%): 100.0%

Sample

1st row: 서울
2nd row: 용산
3rd row: 수색
4th row: 행신
5th row: 도라산

Value               Count  Frequency (%)
서울                1      0.5%
동대구              1      0.5%
여수엑스포          1      0.5%
월포                1      0.5%
극락강              1      0.5%
광주                1      0.5%
광양                1      0.5%
상주                1      0.5%
예천                1      0.5%
탑리                1      0.5%
Other values (192)  192    95.0%

Most occurring characters

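Value               Count  Frequency (%)
(missing)           23     4.9%
(missing)           19     4.0%
(missing)           17     3.6%
(missing)           11     2.3%
(missing)           10     2.1%
(missing)           10     2.1%
(missing)           10     2.1%
(missing)           9      1.9%
(missing)           9      1.9%
(missing)           9      1.9%
Other values (158)  344    73.0%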

Most occurring categories

Value              Count  Frequency (%)
Other Letter       467    99.2%
Open Punctuation   2      0.4%
Close Punctuation  2      0.4%

Most frequent character per category

Other Letter

Value               Count  Frequency (%)
(missing)           23     4.9%
(missing)           19     4.1%
(missing)           17     3.6%
(missing)           11     2.4%
(missing)           10     2.1%
(missing)           10     2.1%
(missing)           10     2.1%
(missing)           9      1.9%
(missing)           9      1.9%
(missing)           9      1.9%
Other values (156)  340    72.8%

Open Punctuation

Value  Count  Frequency (%)
(      2      100.0%

Close Punctuation

Value  Count  Frequency (%)
)      2      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  467    99.2%
Common  4      0.8%

Most frequent character per script

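Hangul

Value               Count  Frequency (%)
(missing)           23     4.9%
(missing)           19     4.1%
(missing)           17     3.6%
(missing)           11     2.4%
(missing)           10     2.1%
(missing)           10     2.1%
(missing)           10     2.1%
(missing)           9      1.9%
(missing)           9      1.9%
(missing)           9      1.9%
Other values (156)  340    72.8%

Common

Value  Count  Frequency (%)
(      2      50.0%
)      2      50.0%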

Most occurring blocks

Value   Count  Frequency (%)
Hangul  467    99.2%
ASCII   4      0.8%

Most frequent character per block

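Hangul

Value               Count  Frequency (%)
(missing)           23     4.9%
(missing)           19     4.1%
(missing)           17     3.6%
(missing)           11     2.4%
(missing)           10     2.1%
(missing)           10     2.1%
(missing)           10     2.1%
(missing)           9      1.9%
(missing)           9      1.9%
(missing)           9      1.9%
Other values (156)  340    72.8%

ASCII

Value  Count  Frequency (%)
(      2      50.0%
)      2      50.0%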

위도
Real number (ℝ)

HIGH CORRELATION

Distinct: 188
Distinct (%): 93.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 36.228403
Minimum: 34.57
Maximum: 38.25763
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 34.57
5-th percentile: 34.8215
Q1: 35.331902
Median: 36.218406
Q3: 36.975389
95-th percentile: 37.561659
Maximum: 38.25763
Range: 3.68763
Interquartile range (IQR): 1.6434873

Descriptive statistics

Standard deviation: 0.91964394
Coefficient of variation (CV): 0.025384612
Kurtosis: -1.0528093
Mean: 36.228403
Median Absolute Deviation (MAD): 0.81034055
Skewness: 0.088370026
Sum: 7318.1373
Variance: 0.84574498
Monotonicity: Not monotonic
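
Several of the 위도 statistics are derived from others: the range from the minimum and maximum, the IQR from Q1 and Q3, the coefficient of variation from the standard deviation and mean, and the variance from the standard deviation. The following sketch rechecks them from the values printed in this report; the variable names are illustrative, and small differences in the last digits are expected because the inputs are already rounded for display.

```python
# Values copied from the 위도 statistics in this report.
minimum, maximum = 34.57, 38.25763
q1, q3 = 35.331902, 36.975389
mean, std = 36.228403, 0.91964394

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance

print(f"range={value_range:.5f} iqr={iqr:.6f} cv={cv:.7f} var={variance:.6f}")
```

The results agree with the reported Range, IQR, CV, and Variance to within the display rounding.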
Figure: Histogram with fixed size bins (bins=50)

Common Values

Value               Count  Frequency (%)
35.14               5      2.5%
35.02               3      1.5%
35.15               3      1.5%
34.82               3      1.5%
35.1                2      1.0%
35.05               2      1.0%
35.13               2      1.0%
34.76               2      1.0%
35.881304           1      0.5%
35.18               1      0.5%
Other values (178)  178    88.1%

Minimum values

Value  Count  Frequency (%)
34.57  1      0.5%
34.75  1      0.5%
34.76  2      1.0%
34.77  1      0.5%
34.78  1      0.5%
34.79  1      0.5%
34.8   1      0.5%
34.82  3      1.5%
34.85  1      0.5%
34.89  1      0.5%

Maximum values

Value       Count  Frequency (%)
38.25763    1      0.5%
38.212687   1      0.5%
38.184937   1      0.5%
38.132142   1      0.5%
37.89877    1      0.5%
37.7645622  1      0.5%
37.691459   1      0.5%
37.642891   1      0.5%
37.5806     1      0.5%
37.580543   1      0.5%

경도
Real number (ℝ)

Distinct: 200
Distinct (%): 99.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 127.86873
Minimum: 126.18
Maximum: 129.37716
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 126.18
5-th percentile: 126.60747
Q1: 127.08063
Median: 127.78665
Q3: 128.69418
95-th percentile: 129.13874
Maximum: 129.37716
Range: 3.197157
Interquartile range (IQR): 1.6135488

Descriptive statistics

Standard deviation: 0.86709922
Coefficient of variation (CV): 0.0067811672
Kurtosis: -1.2904358
Mean: 127.86873
Median Absolute Deviation (MAD): 0.7920345
Skewness: 0.054262825
Sum: 25829.483
Variance: 0.75186105
Monotonicity: Not monotonic
Figure: Histogram with fixed size bins (bins=50)

Common Values

Value               Count  Frequency (%)
126.96              2      1.0%
128.11              2      1.0%
128.3307228         1      0.5%
126.91              1      0.5%
127.59              1      0.5%
128.1637385         1      0.5%
128.4443759         1      0.5%
128.6               1      0.5%
128.79114           1      0.5%
128.180991          1      0.5%
Other values (190)  190    94.1%

Minimum values

Value       Count  Frequency (%)
126.18      1      0.5%
126.31      1      0.5%
126.39      1      0.5%
126.43      1      0.5%
126.48      1      0.5%
126.4952    1      0.5%
126.5       1      0.5%
126.54      1      0.5%
126.586741  1      0.5%
126.590837  1      0.5%

Maximum values

Value        Count  Frequency (%)
129.377157   1      0.5%
129.372288   1      0.5%
129.3719574  1      0.5%
129.369508   1      0.5%
129.366728   1      0.5%
129.3532654  1      0.5%
129.341983   1      0.5%
129.2330567  1      0.5%
129.1803244  1      0.5%
129.1662641  1      0.5%

출입구 개수
Real number (ℝ)

Distinct: 9
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.4752475
Minimum: 0
Maximum: 9
Zeros: 2
Zeros (%): 1.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
Median: 1
Q3: 2
95-th percentile: 3
Maximum: 9
Range: 9
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.1768417
Coefficient of variation (CV): 0.79772492
Kurtosis: 18.341888
Mean: 1.4752475
Median Absolute Deviation (MAD): 0
Skewness: 3.9074487
Sum: 298
Variance: 1.3849564
Monotonicity: Not monotonic
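
Because 출입구 개수 takes only nine distinct values, most of its descriptive statistics can be recomputed from the value counts alone. The sketch below does this with Python's `statistics` module; the dictionary is copied from this variable's frequency table, and the variable names are illustrative. Skewness and kurtosis are left out because their exact estimator is not stated in the report.

```python
import statistics

# Value -> count, copied from the 출입구 개수 frequency table.
exit_counts = {0: 2, 1: 145, 2: 39, 3: 7, 4: 3, 5: 1, 6: 2, 8: 2, 9: 1}

# Expand the frequency table back into one value per station.
values = [v for v, n in exit_counts.items() for _ in range(n)]

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (divides by n - 1)
std = statistics.stdev(values)          # sample standard deviation
cv = std / mean                         # coefficient of variation
median = statistics.median(values)
# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)

print(len(values), sum(values), mean, variance, std, cv, median, mad)
```

The sample variance (n - 1 in the denominator) reproduces the reported 1.3849564, which suggests the report uses the sample rather than the population estimator. The MAD of 0 follows from 145 of the 202 stations having exactly one exit, the same as the median.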
Figure: Histogram with fixed size bins (bins=9)

Common Values

Value  Count  Frequency (%)
1      145    71.8%
2      39     19.3%
3      7      3.5%
4      3      1.5%
8      2      1.0%
6      2      1.0%
0      2      1.0%
5      1      0.5%
9      1      0.5%

Minimum values

Value  Count  Frequency (%)
0      2      1.0%
1      145    71.8%
2      39     19.3%
3      7      3.5%
4      3      1.5%
5      1      0.5%
6      2      1.0%
8      2      1.0%
9      1      0.5%

Maximum values

Value  Count  Frequency (%)
9      1      0.5%
8      2      1.0%
6      2      1.0%
5      1      0.5%
4      3      1.5%
3      7      3.5%
2      39     19.3%
1      145    71.8%
0      2      1.0%

Interactions

Figure: 9 pairwise interaction plots

Correlations

             지역본부  위도    경도    출입구 개수
지역본부     1.000     0.817   0.624   0.518
위도         0.817     1.000   0.569   0.309
경도         0.624     0.569   1.000   0.000
출입구 개수  0.518     0.309   0.000   1.000

             위도    경도     출입구 개수  지역본부
위도         1.000   0.141    0.164        0.571
경도         0.141   1.000    -0.048       0.357
출입구 개수  0.164   -0.048   1.000        0.284
지역본부     0.571   0.357    0.284        1.000

Missing values

Figure: A simple visualization of nullity by column.
Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

     지역본부  역명      위도       경도        출입구 개수
0    서울본부  서울      37.55473   126.9708    3
1    서울본부  용산      37.52991   126.9648    3
2    서울본부  수색      37.5806    126.8959    1
3    서울본부  행신      37.3645    126.4952    2
4    서울본부  도라산    37.89877   126.7098    1
5    서울본부  신망리    38.132142  127.078131  2
6    서울본부  대광리    38.184937  127.108498  1
7    서울본부  신탄리    38.212687  127.138965  1
8    서울본부  백마고지  38.25763   127.166169  2
9    서울본부  청량리    37.580543  127.047259  3

Last rows

     지역본부  역명    위도       경도        출입구 개수
192  부산경남  군북    35.25      128.35      3
193  부산경남  반성    35.1       128.15      1
194  부산경남  진주    35.15      128.11      1
195  부산경남  완사    35.13      127.97      1
196  부산경남  북천    35.11      127.88      1
197  부산경남  횡천    35.05      127.48      1
198  부산경남  하동    35.06      127.76      1
199  부산경남  부전    35.16475   129.060072  2
200  부산경남  태화강  35.538514  129.353265  1
201  부산경남  북울산  35.614817  129.371957  1