Overview

Dataset statistics

Number of variables: 6
Number of observations: 286
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 14.7 KiB
Average record size in memory: 52.5 B

Variable types

Categorical: 1
Numeric: 4
Text: 1

Dataset

Description: Sample data
Author: Subway: City of Seoul; Bus stops: City of Seoul (smart card company)
URL: https://bigdata.seoul.go.kr/data/selectSampleData.do?sample_data_seq=15

Alerts

지하철역코드(SUB_STA_SN) (subway station code) has unique values [Unique]

Reproduction

Analysis started: 2023-12-10 14:53:05.298003
Analysis finished: 2023-12-10 14:53:07.904035
Duration: 2.61 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

구명(GU_NM) (district name)
Categorical

Distinct: 25
Distinct (%): 8.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Value              Count
강남구              23
송파구              19
영등포구            18
중구                18
마포구              15
Other values (20)  193

Length

Max length: 4
Median length: 3
Mean length: 3.0734266
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 성북구
2nd row: 영등포구
3rd row: 성북구
4th row: 종로구
5th row: 종로구

Common Values

Value              Count  Frequency (%)
강남구              23     8.0%
송파구              19     6.6%
영등포구            18     6.3%
중구                18     6.3%
마포구              15     5.2%
서초구              14     4.9%
용산구              14     4.9%
성동구              14     4.9%
노원구              12     4.2%
동작구              12     4.2%
Other values (15)  127    44.4%
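
The Frequency (%) column is each count divided by the 286 observations, and the "Other values" row is whatever remains once the ten listed districts are subtracted. A minimal sketch of that arithmetic, using only the counts from the table above; the variable names are illustrative and not part of the report:

```python
# Top-10 counts for 구명(GU_NM), copied from the Common Values table.
top_counts = {
    "강남구": 23, "송파구": 19, "영등포구": 18, "중구": 18, "마포구": 15,
    "서초구": 14, "용산구": 14, "성동구": 14, "노원구": 12, "동작구": 12,
}
n_observations = 286  # from the dataset statistics

# Frequency (%) = count / number of observations, rounded to one decimal.
frequencies = {
    name: round(100 * count / n_observations, 1)
    for name, count in top_counts.items()
}

# Everything outside the ten listed values is pooled into "Other values".
other_count = n_observations - sum(top_counts.values())
other_pct = round(100 * other_count / n_observations, 1)

print(frequencies["강남구"], other_count, other_pct)
```

The pooled remainder reproduces the 127 rows (44.4%) reported for the 15 districts that are not listed individually.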

Length

[Figure: histogram of lengths of the category]

구코드(GU_CD) (district code)
Real number (ℝ)

Distinct: 25
Distinct (%): 8.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11431.084
Minimum: 11110
Maximum: 11740
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 11110
5-th percentile: 11140
Q1: 11230
Median: 11440
Q3: 11620
95-th percentile: 11710
Maximum: 11740
Range: 630
Interquartile range (IQR): 390

Descriptive statistics

Standard deviation: 200.28063
Coefficient of variation (CV): 0.017520703
Kurtosis: -1.3627262
Mean: 11431.084
Median Absolute Deviation (MAD): 180
Skewness: -0.056964358
Sum: 3269290
Variance: 40112.33
Monotonicity: Not monotonic
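
Several of the figures above are derived from others: the mean is the sum over the number of observations, the range and IQR are differences of quantiles, the CV is the standard deviation over the mean, and the variance is the squared standard deviation. The sketch below recomputes them from the reported values as a consistency check. It is only a check on the printed numbers, not a description of how ydata-profiling computes them internally.

```python
import math

# Values copied from the 구코드(GU_CD) statistics in this report.
n_observations = 286
total = 3269290
minimum, maximum = 11110, 11740
q1, q3 = 11230, 11620
std = 200.28063

mean = total / n_observations   # Mean = Sum / n
value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # IQR = Q3 - Q1
cv = std / mean                  # CV = standard deviation / mean
variance = std ** 2              # Variance = standard deviation squared

print(round(mean, 3), value_range, iqr, round(cv, 9), round(variance, 2))
assert math.isclose(mean, 11431.084, abs_tol=1e-3)
```

Because 구코드(GU_CD) is an administrative code rather than a measurement, these quantities are arithmetically consistent but carry little meaning on their own.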
[Figure: histogram with fixed-size bins (bins=25)]

Common values

Value              Count  Frequency (%)
11680              23     8.0%
11710              19     6.6%
11560              18     6.3%
11140              18     6.3%
11440              15     5.2%
11170              14     4.9%
11650              14     4.9%
11200              14     4.9%
11350              12     4.2%
11380              12     4.2%
Other values (15)  127    44.4%

Minimum 10 values

Value  Count  Frequency (%)
11110  11     3.8%
11140  18     6.3%
11170  14     4.9%
11200  14     4.9%
11215  8      2.8%
11230  11     3.8%
11260  8      2.8%
11290  10     3.5%
11305  3      1.0%
11320  6      2.1%

Maximum 10 values

Value  Count  Frequency (%)
11740  10     3.5%
11710  19     6.6%
11680  23     8.0%
11650  14     4.9%
11620  7      2.4%
11590  12     4.2%
11560  18     6.3%
11545  3      1.0%
11530  11     3.8%
11500  9      3.1%

지하철역코드(SUB_STA_SN) (subway station code)
Real number (ℝ)

UNIQUE

Distinct: 286
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 183.08042
Minimum: 40
Maximum: 326
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 40
5-th percentile: 54.25
Q1: 111.25
Median: 183.5
Q3: 254.75
95-th percentile: 311.75
Maximum: 326
Range: 286
Interquartile range (IQR): 143.5

Descriptive statistics

Standard deviation: 83.128249
Coefficient of variation (CV): 0.45405319
Kurtosis: -1.2054139
Mean: 183.08042
Median Absolute Deviation (MAD): 72
Skewness: -0.0028478455
Sum: 52361
Variance: 6910.3058
Monotonicity: Not monotonic
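
The fractional quantiles (54.25, 111.25, 183.5, ...) come from linear interpolation between neighbouring sorted values. The sketch below illustrates this on a reconstructed column. The reconstruction is an assumption, not something the report states: the 287 integers from 40 to 326 sum to 52521, the report gives 286 distinct values with a sum of 52361, so the column is taken to be every integer in that range except 160. Python's `statistics.quantiles` with `method="inclusive"` stands in for the interpolation the profiler applies.

```python
import statistics

# ASSUMPTION (inferred from Range, Distinct and Sum, not stated in the report):
# the column holds every integer from 40 to 326 except 160.
values = [v for v in range(40, 327) if v != 160]

# "inclusive" interpolates linearly between neighbouring sorted values,
# treating the data minimum and maximum as the 0th and 100th percentiles.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")

# Splitting into 20 groups gives cut points at 5%, 10%, ..., 95%.
ventiles = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = ventiles[0], ventiles[-1]

# Sample variance (n - 1 in the denominator).
variance = statistics.variance(values)

print(p5, q1, median, q3, p95, q3 - q1, round(variance, 4))
```

Under this assumption the quartiles, the 5th and 95th percentiles, the IQR, and the variance all match the figures listed above, which supports the reconstruction without proving it.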
[Figure: histogram with fixed-size bins (bins=50)]

Common values

Value               Count  Frequency (%)
64                  1      0.3%
256                 1      0.3%
158                 1      0.3%
87                  1      0.3%
219                 1      0.3%
128                 1      0.3%
62                  1      0.3%
258                 1      0.3%
159                 1      0.3%
55                  1      0.3%
Other values (276)  276    96.5%

Minimum 10 values

Value  Count  Frequency (%)
40     1      0.3%
41     1      0.3%
42     1      0.3%
43     1      0.3%
44     1      0.3%
45     1      0.3%
46     1      0.3%
47     1      0.3%
48     1      0.3%
49     1      0.3%

Maximum 10 values

Value  Count  Frequency (%)
326    1      0.3%
325    1      0.3%
324    1      0.3%
323    1      0.3%
322    1      0.3%
321    1      0.3%
320    1      0.3%
319    1      0.3%
318    1      0.3%
317    1      0.3%

지하철역명(KOR_SUB_NM) (subway station name)
Text

Distinct: 261
Distinct (%): 91.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 12
Median length: 3
Mean length: 4.0034965
Min length: 3

Characters and Unicode

Total characters: 1145
Distinct characters: 215
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
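
As a small illustration of those character properties, the sketch below classifies the characters of one station name from this report (을지로3가역) with Python's standard `unicodedata` module. The general-category codes map onto the labels used in this section: `Lo` is Other Letter and `Nd` is Decimal Number. `unicodedata` exposes no script property, so the check on the character name is only a rough stand-in for the script breakdown, and nothing here is claimed to be how the profiler itself derives its counts.

```python
import unicodedata
from collections import Counter

# Station name taken from the sample rows of this report.
name = "을지로3가역"

# General category of each code point ("Lo" = Other Letter, "Nd" = Decimal Number).
categories = Counter(unicodedata.category(ch) for ch in name)

# Rough script check: Hangul syllables have names starting with "HANGUL".
hangul_count = sum(unicodedata.name(ch).startswith("HANGUL") for ch in name)

print(dict(categories), hangul_count)
```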

Unique

Unique: 238
Unique (%): 83.2%

Sample

1st row: 사당역
2nd row: 성신여대입구역
3rd row: 학동역
4th row: 중곡역
5th row: 을지로3가역

Words

Value               Count  Frequency (%)
신촌역               3      1.0%
동대문운동장역        3      1.0%
양재역               2      0.7%
사당역               2      0.7%
버티고개역            2      0.7%
독립문역             2      0.7%
남태령역             2      0.7%
시청역               2      0.7%
이대역               2      0.7%
대방역               2      0.7%
Other values (252)  265    92.3%

Most occurring characters

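Value               Count  Frequency (%)
(not recovered)     287    25.1%
(not recovered)     35     3.1%
(not recovered)     29     2.5%
(not recovered)     28     2.4%
(not recovered)     26     2.3%
(not recovered)     18     1.6%
(not recovered)     15     1.3%
(not recovered)     14     1.2%
(not recovered)     13     1.1%
(not recovered)     13     1.1%
Other values (205)  667    58.3%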

Most occurring categories

Value              Count  Frequency (%)
Other Letter       1105   96.5%
Decimal Number     15     1.3%
Open Punctuation   12     1.0%
Close Punctuation  12     1.0%
Space Separator    1      0.1%

Most frequent character per category

Other Letter

Value               Count  Frequency (%)
(not recovered)     287    26.0%
(not recovered)     35     3.2%
(not recovered)     29     2.6%
(not recovered)     28     2.5%
(not recovered)     26     2.4%
(not recovered)     18     1.6%
(not recovered)     15     1.4%
(not recovered)     14     1.3%
(not recovered)     13     1.2%
(not recovered)     13     1.2%
Other values (197)  627    56.7%

Decimal Number

Value  Count  Frequency (%)
7      6      40.0%
3      3      20.0%
5      2      13.3%
4      2      13.3%
2      2      13.3%

Open Punctuation

Value  Count  Frequency (%)
(      12     100.0%

Close Punctuation

Value  Count  Frequency (%)
)      12     100.0%

Space Separator

Value    Count  Frequency (%)
(space)  1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  1105   96.5%
Common  40     3.5%

Most frequent character per script

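Hangul

Value               Count  Frequency (%)
(not recovered)     287    26.0%
(not recovered)     35     3.2%
(not recovered)     29     2.6%
(not recovered)     28     2.5%
(not recovered)     26     2.4%
(not recovered)     18     1.6%
(not recovered)     15     1.4%
(not recovered)     14     1.3%
(not recovered)     13     1.2%
(not recovered)     13     1.2%
Other values (197)  627    56.7%

Common

Value    Count  Frequency (%)
(        12     30.0%
)        12     30.0%
7        6      15.0%
3        3      7.5%
5        2      5.0%
4        2      5.0%
2        2      5.0%
(space)  1      2.5%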

Most occurring blocks

Value   Count  Frequency (%)
Hangul  1105   96.5%
ASCII   40     3.5%

Most frequent character per block

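Hangul

Value               Count  Frequency (%)
(not recovered)     287    26.0%
(not recovered)     35     3.2%
(not recovered)     29     2.6%
(not recovered)     28     2.5%
(not recovered)     26     2.4%
(not recovered)     18     1.6%
(not recovered)     15     1.4%
(not recovered)     14     1.3%
(not recovered)     13     1.2%
(not recovered)     13     1.2%
Other values (197)  627    56.7%

ASCII

Value    Count  Frequency (%)
(        12     30.0%
)        12     30.0%
7        6      15.0%
3        3      7.5%
5        2      5.0%
4        2      5.0%
2        2      5.0%
(space)  1      2.5%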

X좌표(POINT_X) (X coordinate)
Real number (ℝ)

Distinct: 273
Distinct (%): 95.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 199670.74
Minimum: 182443
Maximum: 214745
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 182443
5-th percentile: 187672.75
Q1: 194044.75
Median: 200509
Q3: 204939.75
95-th percentile: 211184.25
Maximum: 214745
Range: 32302
Interquartile range (IQR): 10895

Descriptive statistics

Standard deviation: 7103.518
Coefficient of variation (CV): 0.035576159
Kurtosis: -0.62728954
Mean: 199670.74
Median Absolute Deviation (MAD): 5302
Skewness: -0.19735579
Sum: 57105832
Variance: 50459968
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=50)]

Common values

Value               Count  Frequency (%)
198356              3      1.0%
198423              2      0.7%
196270              2      0.7%
196116              2      0.7%
195207              2      0.7%
194358              2      0.7%
190700              2      0.7%
191242              2      0.7%
210933              2      0.7%
198841              2      0.7%
Other values (263)  265    92.7%

Minimum 10 values

Value   Count  Frequency (%)
182443  1      0.3%
182905  1      0.3%
183450  1      0.3%
183488  1      0.3%
184273  1      0.3%
184483  1      0.3%
185542  1      0.3%
185613  1      0.3%
185638  1      0.3%
185906  1      0.3%

Maximum 10 values

Value   Count  Frequency (%)
214745  1      0.3%
213678  1      0.3%
213526  1      0.3%
212731  1      0.3%
212613  1      0.3%
212611  1      0.3%
212382  1      0.3%
212036  1      0.3%
211955  1      0.3%
211729  1      0.3%

Y좌표(POINT_Y) (Y coordinate)
Real number (ℝ)

Distinct: 274
Distinct (%): 95.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 449635.68
Minimum: 439550
Maximum: 465554
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 439550
5-th percentile: 442606.5
Q1: 445464.25
Median: 449761.5
Q3: 452541.5
95-th percentile: 459570
Maximum: 465554
Range: 26004
Interquartile range (IQR): 7077.25

Descriptive statistics

Standard deviation: 5213.7306
Coefficient of variation (CV): 0.011595456
Kurtosis: 0.022565237
Mean: 449635.68
Median Absolute Deviation (MAD): 3613
Skewness: 0.55512736
Sum: 1.285958 × 10^8
Variance: 27182987
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=50)]

Common values

Value               Count  Frequency (%)
441875              3      1.0%
451859              2      0.7%
452972              2      0.7%
450665              2      0.7%
448746              2      0.7%
450810              2      0.7%
440897              2      0.7%
443123              2      0.7%
452760              2      0.7%
450871              2      0.7%
Other values (264)  265    92.7%

Minimum 10 values

Value   Count  Frequency (%)
439550  1      0.3%
440761  1      0.3%
440897  2      0.7%
441271  1      0.3%
441875  3      1.0%
441899  1      0.3%
441956  1      0.3%
442044  1      0.3%
442389  1      0.3%
442410  1      0.3%

Maximum 10 values

Value   Count  Frequency (%)
465554  1      0.3%
464409  1      0.3%
464220  1      0.3%
463433  1      0.3%
463083  1      0.3%
462826  1      0.3%
462370  1      0.3%
461734  1      0.3%
461487  1      0.3%
460974  1      0.3%

Interactions

[Figures: 16 pairwise interaction plots]

Correlations

[Figure: correlation heatmap]

                          구명(GU_NM)  구코드(GU_CD)  지하철역코드(SUB_STA_SN)  X좌표(POINT_X)  Y좌표(POINT_Y)
구명(GU_NM)                1.000       0.223         0.000                    0.144          0.000
구코드(GU_CD)              0.223       1.000         0.000                    0.000          0.049
지하철역코드(SUB_STA_SN)    0.000       0.000         1.000                    0.089          0.000
X좌표(POINT_X)             0.144       0.000         0.089                    1.000          0.000
Y좌표(POINT_Y)             0.000       0.049         0.000                    0.000          1.000

[Figure: correlation heatmap]

                          구코드(GU_CD)  지하철역코드(SUB_STA_SN)  X좌표(POINT_X)  Y좌표(POINT_Y)  구명(GU_NM)
구코드(GU_CD)              1.000         -0.049                   -0.026         -0.011         0.093
지하철역코드(SUB_STA_SN)    -0.049        1.000                    0.015          0.058          0.000
X좌표(POINT_X)             -0.026        0.015                    1.000          0.014          0.035
Y좌표(POINT_Y)             -0.011        0.058                    0.014          1.000          0.000
구명(GU_NM)                0.093         0.000                    0.035          0.000          1.000

Missing values

[Figure: nullity by column] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Index  구명(GU_NM)  구코드(GU_CD)  지하철역코드(SUB_STA_SN)  지하철역명(KOR_SUB_NM)  X좌표(POINT_X)  Y좌표(POINT_Y)
0      성북구        11440         64                       사당역                  198423         443117
1      영등포구      11200         214                      성신여대입구역          196835         444084
2      성북구        11110         63                       학동역                  205670         453021
3      종로구        11530         47                       중곡역                  195074         448023
4      종로구        11590         192                      을지로3가역             205575         441899
5      강남구        11440         233                      양천구청역              202563         459649
6      은평구        11170         73                       강변역                  203357         450491
7      서초구        11620         154                      이태원역                201102         443850
8      용산구        11380         150                      종각역                  197029         452157
9      성동구        11290         218                      수락산역                201632         451222

Last rows

Index  구명(GU_NM)  구코드(GU_CD)  지하철역코드(SUB_STA_SN)  지하철역명(KOR_SUB_NM)  X좌표(POINT_X)  Y좌표(POINT_Y)
276    강남구        11650         285                      신설동역                204537         443098
277    양천구        11500         200                      당고개역                202798         452972
278    양천구        11380         49                       상월곡역                203851         455885
279    중구          11560         157                      독바위역                191598         455546
280    영등포구      11260         48                       교대역                  188942         450639
281    강남구        11470         284                      무악재역                200159         460550
282    서초구        11710         134                      아현역                  204096         450665
283    광진구        11740         141                      신림역                  200525         454628
284    중구          11200         223                      신용산역                207738         459652
285    노원구        11110         272                      삼각지역                205556         451880