Overview

Dataset statistics

Number of variables: 6
Number of observations: 21
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.2 KiB
Average record size in memory: 56.3 B

Variable types

Text: 3
Categorical: 2
Numeric: 1

Dataset

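Description: This dataset covers the village (maeul) buses operating within Dongjak-gu, Seoul. It includes the route number (노선번호), operator name (업체명), route (운행노선), number of vehicles in operation (운행대수), and other fields.
URL: https://www.data.go.kr/data/15037278/fileData.do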

Alerts

운행대수 is highly overall correlated with 예비대수 (High correlation)
예비대수 is highly overall correlated with 운행대수 (High correlation)
노선번호 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 09:50:05.435877
Analysis finished: 2023-12-12 09:50:06.089223
Duration: 0.65 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
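
A report of this kind can be regenerated with ydata-profiling roughly as sketched below. The function name, the CSV path, the `cp949` encoding, and the report title are all assumptions made for illustration; the report states none of them, and the actual settings are in config.json, which is not reproduced here. The sketch therefore uses the library defaults.

```python
def build_report(csv_path, output_path="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report (sketch, default settings).

    csv_path, output_path and encoding are hypothetical; adjust them to the
    file actually downloaded from the dataset URL.
    """
    # Imported inside the function so that it can be defined even when
    # pandas or ydata-profiling is not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)

    # The title is a placeholder, not the title of the original report.
    report = ProfileReport(df, title="Dongjak-gu village bus profile")
    report.to_file(output_path)
    return report
```

Calling `build_report("village_bus.csv")` with a real file path would write `report.html` next to the script.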

Variables

노선번호
Text

UNIQUE 

Distinct: 21
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Length

Max length: 6
Median length: 4
Mean length: 4.1428571
Min length: 4

Characters and Unicode

Total characters: 87
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
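
As a rough illustration of those character properties, the sketch below uses Python's standard `unicodedata` module to tally Unicode general categories for the five 노선번호 values listed under Sample in this section. `Lo` is the code for Other Letter and `Nd` for Decimal Number. Script and block names are not available from the standard library, so they are left out, and this is not necessarily how the profiling tool builds its own tables.

```python
import unicodedata
from collections import Counter

# Five sample values of 노선번호, copied from the Sample listing of this report.
samples = ["동작01", "동작10", "동작21", "동작02", "동작11"]


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


counts = category_counts(samples)
total = sum(counts.values())
for category, count in counts.most_common():
    print(f"{category}: {count} ({count / total:.1%})")
```

For these five values the letters and digits split evenly, which is close to the 48.3% / 49.4% split the report shows for the full column.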

Unique

Unique: 21
Unique (%): 100.0%

Sample

1st row: 동작01
2nd row: 동작10
3rd row: 동작21
4th row: 동작02
5th row: 동작11

Value | Count | Frequency (%)
동작01 | 1 | 4.8%
동작06 | 1 | 4.8%
동작08 | 1 | 4.8%
동작18 | 1 | 4.8%
동작17 | 1 | 4.8%
동작16 | 1 | 4.8%
동작07 | 1 | 4.8%
동작20 | 1 | 4.8%
동작15 | 1 | 4.8%
동작14 | 1 | 4.8%
Other values (11) | 11 | 52.4%

Most occurring characters

Value | Count | Frequency (%)
동 | 21 | 24.1%
작 | 21 | 24.1%
1 | 14 | 16.1%
0 | 11 | 12.6%
2 | 4 | 4.6%
5 | 3 | 3.4%
9 | 2 | 2.3%
3 | 2 | 2.3%
6 | 2 | 2.3%
7 | 2 | 2.3%
Other values (4) | 5 | 5.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 43 | 49.4%
Other Letter | 42 | 48.3%
Space Separator | 1 | 1.1%
Dash Punctuation | 1 | 1.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 14 | 32.6%
0 | 11 | 25.6%
2 | 4 | 9.3%
5 | 3 | 7.0%
9 | 2 | 4.7%
3 | 2 | 4.7%
6 | 2 | 4.7%
7 | 2 | 4.7%
8 | 2 | 4.7%
4 | 1 | 2.3%

Other Letter
Value | Count | Frequency (%)
동 | 21 | 50.0%
작 | 21 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 45 | 51.7%
Hangul | 42 | 48.3%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 14 | 31.1%
0 | 11 | 24.4%
2 | 4 | 8.9%
5 | 3 | 6.7%
9 | 2 | 4.4%
3 | 2 | 4.4%
6 | 2 | 4.4%
7 | 2 | 4.4%
8 | 2 | 4.4%
(space) | 1 | 2.2%
Other values (2) | 2 | 4.4%

Hangul
Value | Count | Frequency (%)
동 | 21 | 50.0%
작 | 21 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 45 | 51.7%
Hangul | 42 | 48.3%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
동 | 21 | 50.0%
작 | 21 | 50.0%

ASCII
Value | Count | Frequency (%)
1 | 14 | 31.1%
0 | 11 | 24.4%
2 | 4 | 8.9%
5 | 3 | 6.7%
9 | 2 | 4.4%
3 | 2 | 4.4%
6 | 2 | 4.4%
7 | 2 | 4.4%
8 | 2 | 4.4%
(space) | 1 | 2.2%
Other values (2) | 2 | 4.4%

업체명
Categorical

Distinct: 10
Distinct (%): 47.6%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Length

Max length: 12
Median length: 7
Mean length: 8.1428571
Min length: 7

Unique

Unique: 5
Unique (%): 23.8%

Sample

1st row: 흑석운수(주)
2nd row: 흑석운수(주)
3rd row: 흑석운수(주)
4th row: 상도3동마을버스
5th row: 상도3동마을버스

Common Values

Value | Count | Frequency (%)
㈜합성마을버스 | 4 | 19.0%
사당3동마을버스주식회사 | 4 | 19.0%
흑석운수(주) | 3 | 14.3%
상도3동마을버스 | 3 | 14.3%
노들운수(주) | 2 | 9.5%
연성운수(주) | 1 | 4.8%
보라매운수(주) | 1 | 4.8%
보라매마을버스 | 1 | 4.8%
명승운수(주) | 1 | 4.8%
화인운수(주) | 1 | 4.8%
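
The report distinguishes Distinct (how many different values occur) from Unique (how many values occur exactly once). The short sketch below re-derives both figures for 업체명 from the Common Values counts above. The variable names are illustrative only.

```python
# Value counts for 업체명, copied from the Common Values table of this report.
value_counts = {
    "㈜합성마을버스": 4,
    "사당3동마을버스주식회사": 4,
    "흑석운수(주)": 3,
    "상도3동마을버스": 3,
    "노들운수(주)": 2,
    "연성운수(주)": 1,
    "보라매운수(주)": 1,
    "보라매마을버스": 1,
    "명승운수(주)": 1,
    "화인운수(주)": 1,
}

n_rows = sum(value_counts.values())      # number of observations
n_distinct = len(value_counts)           # different values present
n_unique = sum(1 for count in value_counts.values() if count == 1)  # seen exactly once

print(f"rows={n_rows}")
print(f"distinct={n_distinct} ({n_distinct / n_rows:.1%})")
print(f"unique={n_unique} ({n_unique / n_rows:.1%})")
```

The results agree with the figures reported for this variable: 10 distinct values (47.6%) and 5 unique values (23.8%) over 21 rows.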

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
㈜합성마을버스 | 4 | 19.0%
사당3동마을버스주식회사 | 4 | 19.0%
흑석운수(주 | 3 | 14.3%
상도3동마을버스 | 3 | 14.3%
노들운수(주 | 2 | 9.5%
연성운수(주 | 1 | 4.8%
보라매운수(주 | 1 | 4.8%
보라매마을버스 | 1 | 4.8%
명승운수(주 | 1 | 4.8%
화인운수(주 | 1 | 4.8%

운행노선(시점)
Text

Distinct: 15
Distinct (%): 71.4%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Length

Max length: 11
Median length: 10
Mean length: 5.8095238
Min length: 3

Characters and Unicode

Total characters: 122
Distinct characters: 57
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10
Unique (%): 47.6%

Sample

1st row: 달마사
2nd row: 노량진교회
3rd row: 달마사
4th row: 사자암
5th row: 사자암

Value | Count | Frequency (%)
대방역 | 3 | 13.0%
달마사 | 3 | 13.0%
사자암 | 2 | 8.7%
사당자이(아 | 2 | 8.7%
극동(아 | 2 | 8.7%
노량진교회 | 1 | 4.3%
국사봉터널 | 1 | 4.3%
신대방삼거리 | 1 | 4.3%
상도4동 | 1 | 4.3%
성문교회 | 1 | 4.3%
Other values (6) | 6 | 26.1%

Most occurring characters

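Value | Count | Frequency (%)
(character lost) | 11 | 9.0%
(character lost) | 11 | 9.0%
( | 7 | 5.7%
) | 7 | 5.7%
(character lost) | 6 | 4.9%
(character lost) | 5 | 4.1%
(character lost) | 4 | 3.3%
(character lost) | 4 | 3.3%
(character lost) | 4 | 3.3%
(character lost) | 3 | 2.5%
Other values (47) | 60 | 49.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 95 | 77.9%
Space Separator | 11 | 9.0%
Open Punctuation | 7 | 5.7%
Close Punctuation | 7 | 5.7%
Lowercase Letter | 1 | 0.8%
Decimal Number | 1 | 0.8%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(character lost) | 11 | 11.6%
(character lost) | 6 | 6.3%
(character lost) | 5 | 5.3%
(character lost) | 4 | 4.2%
(character lost) | 4 | 4.2%
(character lost) | 4 | 4.2%
(character lost) | 3 | 3.2%
(character lost) | 3 | 3.2%
(character lost) | 3 | 3.2%
(character lost) | 3 | 3.2%
Other values (42) | 49 | 51.6%

Space Separator
Value | Count | Frequency (%)
(space) | 11 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 7 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 7 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
e | 1 | 100.0%

Decimal Number
Value | Count | Frequency (%)
4 | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 95 | 77.9%
Common | 26 | 21.3%
Latin | 1 | 0.8%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(character lost) | 11 | 11.6%
(character lost) | 6 | 6.3%
(character lost) | 5 | 5.3%
(character lost) | 4 | 4.2%
(character lost) | 4 | 4.2%
(character lost) | 4 | 4.2%
(character lost) | 3 | 3.2%
(character lost) | 3 | 3.2%
(character lost) | 3 | 3.2%
(character lost) | 3 | 3.2%
Other values (42) | 49 | 51.6%

Common
Value | Count | Frequency (%)
(space) | 11 | 42.3%
( | 7 | 26.9%
) | 7 | 26.9%
4 | 1 | 3.8%

Latin
Value | Count | Frequency (%)
e | 1 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 95 | 77.9%
ASCII | 27 | 22.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(character lost) | 11 | 11.6%
(character lost) | 6 | 6.3%
(character lost) | 5 | 5.3%
(character lost) | 4 | 4.2%
(character lost) | 4 | 4.2%
(character lost) | 4 | 4.2%
(character lost) | 3 | 3.2%
(character lost) | 3 | 3.2%
(character lost) | 3 | 3.2%
(character lost) | 3 | 3.2%
Other values (42) | 49 | 51.6%

ASCII
Value | Count | Frequency (%)
(space) | 11 | 40.7%
( | 7 | 25.9%
) | 7 | 25.9%
e | 1 | 3.7%
4 | 1 | 3.7%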

운행노선(종점)
Text

Distinct: 15
Distinct (%): 71.4%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Length

Max length: 17
Median length: 3
Mean length: 5.4761905
Min length: 3

Characters and Unicode

Total characters: 115
Distinct characters: 53
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 11
Unique (%): 52.4%

Sample

1st row: 대방역
2nd row: 상도역
3rd row: 상도역
4th row: 노량진역
5th row: 노량진역(상도3동 주민센터경유)

Value | Count | Frequency (%)
대방역 | 3 | 11.5%
사당역 | 3 | 11.5%
상도역 | 2 | 7.7%
사랑의 | 2 | 7.7%
병원 | 2 | 7.7%
방면 | 2 | 7.7%
신대방역(신대방삼거리역 | 1 | 3.8%
갯마을 | 1 | 3.8%
낙성대역 | 1 | 3.8%
이수역 | 1 | 3.8%
Other values (8) | 8 | 30.8%

Most occurring characters

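Value | Count | Frequency (%)
(character lost) | 17 | 14.8%
(character lost) | 8 | 7.0%
(character lost) | 7 | 6.1%
(character lost) | 5 | 4.3%
(character lost) | 5 | 4.3%
(character lost) | 3 | 2.6%
) | 3 | 2.6%
( | 3 | 2.6%
(character lost) | 3 | 2.6%
(character lost) | 3 | 2.6%
Other values (43) | 58 | 50.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 103 | 89.6%
Space Separator | 5 | 4.3%
Close Punctuation | 3 | 2.6%
Open Punctuation | 3 | 2.6%
Decimal Number | 1 | 0.9%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(character lost) | 17 | 16.5%
(character lost) | 8 | 7.8%
(character lost) | 7 | 6.8%
(character lost) | 5 | 4.9%
(character lost) | 3 | 2.9%
(character lost) | 3 | 2.9%
(character lost) | 3 | 2.9%
(character lost) | 3 | 2.9%
(character lost) | 3 | 2.9%
(character lost) | 3 | 2.9%
Other values (39) | 48 | 46.6%

Space Separator
Value | Count | Frequency (%)
(space) | 5 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 3 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 3 | 100.0%

Decimal Number
Value | Count | Frequency (%)
3 | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 103 | 89.6%
Common | 12 | 10.4%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(character lost) | 17 | 16.5%
(character lost) | 8 | 7.8%
(character lost) | 7 | 6.8%
(character lost) | 5 | 4.9%
(character lost) | 3 | 2.9%
(character lost) | 3 | 2.9%
(character lost) | 3 | 2.9%
(character lost) | 3 | 2.9%
(character lost) | 3 | 2.9%
(character lost) | 3 | 2.9%
Other values (39) | 48 | 46.6%

Common
Value | Count | Frequency (%)
(space) | 5 | 41.7%
) | 3 | 25.0%
( | 3 | 25.0%
3 | 1 | 8.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 103 | 89.6%
ASCII | 12 | 10.4%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(character lost) | 17 | 16.5%
(character lost) | 8 | 7.8%
(character lost) | 7 | 6.8%
(character lost) | 5 | 4.9%
(character lost) | 3 | 2.9%
(character lost) | 3 | 2.9%
(character lost) | 3 | 2.9%
(character lost) | 3 | 2.9%
(character lost) | 3 | 2.9%
(character lost) | 3 | 2.9%
Other values (39) | 48 | 46.6%

ASCII
Value | Count | Frequency (%)
(space) | 5 | 41.7%
) | 3 | 25.0%
( | 3 | 25.0%
3 | 1 | 8.3%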

운행대수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 10
Distinct (%): 47.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.4761905
Minimum: 1
Maximum: 16
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 321.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 4
Q3: 7
95-th percentile: 11
Maximum: 16
Range: 15
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.7097041
Coefficient of variation (CV): 0.67742423
Kurtosis: 1.9605747
Mean: 5.4761905
Median Absolute Deviation (MAD): 2
Skewness: 1.3233637
Sum: 115
Variance: 13.761905
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
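
Common values

Value | Count | Frequency (%)
3 | 5 | 23.8%
7 | 3 | 14.3%
4 | 3 | 14.3%
6 | 2 | 9.5%
11 | 2 | 9.5%
1 | 2 | 9.5%
16 | 1 | 4.8%
2 | 1 | 4.8%
5 | 1 | 4.8%
8 | 1 | 4.8%

Minimum 10 values

Value | Count | Frequency (%)
1 | 2 | 9.5%
2 | 1 | 4.8%
3 | 5 | 23.8%
4 | 3 | 14.3%
5 | 1 | 4.8%
6 | 2 | 9.5%
7 | 3 | 14.3%
8 | 1 | 4.8%
11 | 2 | 9.5%
16 | 1 | 4.8%

Maximum 10 values

Value | Count | Frequency (%)
16 | 1 | 4.8%
11 | 2 | 9.5%
8 | 1 | 4.8%
7 | 3 | 14.3%
6 | 2 | 9.5%
5 | 1 | 4.8%
4 | 3 | 14.3%
3 | 5 | 23.8%
2 | 1 | 4.8%
1 | 2 | 9.5%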
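
The quantile and descriptive statistics for 운행대수 can be cross-checked from its value counts using only Python's standard `statistics` module, as in the sketch below. Whether the profiling tool uses exactly these estimators is an assumption. The sketch uses the sample (n - 1) variance and linearly interpolated quartiles, and these choices agree with the figures in this report.

```python
import statistics

# Value counts for 운행대수, copied from the frequency table of this report
# (value -> number of routes with that many vehicles in operation).
value_counts = {3: 5, 7: 3, 4: 3, 6: 2, 11: 2, 1: 2, 16: 1, 2: 1, 5: 1, 8: 1}

# Expand the counts back into one sorted observation per route.
data = sorted(value for value, count in value_counts.items() for _ in range(count))

mean = statistics.mean(data)
variance = statistics.variance(data)   # sample variance (n - 1 denominator)
std = statistics.stdev(data)           # sample standard deviation
cv = std / mean                        # coefficient of variation
median = statistics.median(data)

# "inclusive" interpolates linearly between order statistics.
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")

# Median absolute deviation: median distance from the median.
mad = statistics.median(abs(x - median) for x in data)

print(f"n={len(data)} sum={sum(data)} mean={mean:.7f}")
print(f"variance={variance:.6f} std={std:.7f} cv={cv:.8f}")
print(f"Q1={q1} median={median} Q3={q3} IQR={q3 - q1} MAD={mad}")
```

Skewness and kurtosis are left out because the standard library has no functions for them.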

예비대수
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 14.3%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 4.8%

Sample

1st row: 2
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 16 | 76.2%
1 | 4 | 19.0%
2 | 1 | 4.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 16 | 76.2%
1 | 4 | 19.0%
2 | 1 | 4.8%

Interactions


Correlations

 | 노선번호 | 업체명 | 운행노선(시점) | 운행노선(종점) | 운행대수 | 예비대수
노선번호 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
업체명 | 1.000 | 1.000 | 0.882 | 0.874 | 0.340 | 0.000
운행노선(시점) | 1.000 | 0.882 | 1.000 | 0.000 | 0.000 | 0.000
운행노선(종점) | 1.000 | 0.874 | 0.000 | 1.000 | 0.000 | 0.000
운행대수 | 1.000 | 0.340 | 0.000 | 0.000 | 1.000 | 0.773
예비대수 | 1.000 | 0.000 | 0.000 | 0.000 | 0.773 | 1.000

 | 예비대수 | 업체명
예비대수 | 1.000 | 0.000
업체명 | 0.000 | 1.000

 | 운행대수 | 업체명 | 예비대수
운행대수 | 1.000 | 0.193 | 0.540
업체명 | 0.193 | 1.000 | 0.000
예비대수 | 0.540 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

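First rows

# | 노선번호 | 업체명 | 운행노선(시점) | 운행노선(종점) | 운행대수 | 예비대수
0 | 동작01 | 흑석운수(주) | 달마사 | 대방역 | 16 | 2
1 | 동작10 | 흑석운수(주) | 노량진교회 | 상도역 | 2 | 0
2 | 동작21 | 흑석운수(주) | 달마사 | 상도역 | 3 | 0
3 | 동작02 | 상도3동마을버스 | 사자암 | 노량진역 | 5 | 0
4 | 동작11 | 상도3동마을버스 | 사자암 | 노량진역(상도3동 주민센터경유) | 6 | 0
5 | 동작19 | 상도3동마을버스 | 국사봉터널 | 동작우체국 | 3 | 0
6 | 동작03 | 노들운수(주) | 신대방삼거리 | 노들역 | 8 | 0
7 | 동작12 | 노들운수(주) | 상도4동 성문교회 | 대방역 | 6 | 0
8 | 동작13 | 연성운수(주) | 대방역 | 봉천고개 | 11 | 0
9 | 동작05 | 보라매운수(주) | 대방역 | 신대방역(신대방삼거리역 방면) | 7 | 1

Last rows

# | 노선번호 | 업체명 | 운행노선(시점) | 운행노선(종점) | 운행대수 | 예비대수
11 | 동작06 | ㈜합성마을버스 | 사당시장입구 | 사랑의 병원 | 4 | 0
12 | 동작14 | ㈜합성마을버스 | 달마사 | 사랑의 병원 | 3 | 0
13 | 동작15 | ㈜합성마을버스 | 사당자이(아) | 이수역 | 4 | 1
14 | 동작20 | ㈜합성마을버스 | 사당자이(아) | 낙성대역 | 1 | 0
15 | 동작07 | 사당3동마을버스주식회사 | 삼성래미안(아) | 갯마을 | 4 | 1
16 | 동작16 | 사당3동마을버스주식회사 | 사당종합체육관 | 사당역 | 3 | 0
17 | 동작17 | 사당3동마을버스주식회사 | 극동(아) | 이수힐스테이트 | 3 | 0
18 | 동작18 | 사당3동마을버스주식회사 | 극동(아) | 사당역 | 1 | 0
19 | 동작08 | 명승운수(주) | 지덕사(행복유치원) | 대방역 | 11 | 1
20 | 동작09 | 화인운수(주) | 대림 e편한(아) | 사당역 | 7 | 0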