Overview

Dataset statistics

Number of variables: 8
Number of observations: 1204
Missing cells: 15
Missing cells (%): 0.2%
Duplicate rows: 210
Duplicate rows (%): 17.4%
Total size in memory: 76.6 KiB
Average record size in memory: 65.1 B
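
The derived figures in this block follow from the raw counts. The sketch below recomputes them, assuming that the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; the variable names are illustrative only.

```python
# Figures copied from the Dataset statistics block.
n_variables = 8
n_observations = 1204
missing_cells = 15
duplicate_rows = 210
total_size_kib = 76.6

# Assumption: missing cells are measured against every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Duplicate rows are measured against the number of observations.
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```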

Variable types

Categorical: 5
Text: 2
Numeric: 1

Dataset

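Description: This dataset contains information on stops in Asan-si, Chungcheongnam-do. It covers stops for taxis, general buses, city buses, and intercity buses, along with each stop's name, stop type, and whether a route map and timetable are posted.
Author: Asan-si, Chungcheongnam-do (충청남도 아산시)
URL: https://www.data.go.kr/data/15090294/fileData.do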

Alerts

Dataset has 210 (17.4%) duplicate rows [Duplicates]
관리기관 (managing department) is highly imbalanced (88.2%) [Imbalance]
정류장종류 (stop category) is highly imbalanced (85.3%) [Imbalance]
노선표_시간표 유무 (route map/timetable posted) is highly imbalanced (61.0%) [Imbalance]
정류장명 (stop name) has 15 (1.2%) missing values [Missing]
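
The report does not say how the imbalance percentage is computed. One formula that reproduces all three figures from the value counts given in the per-variable sections is one minus the normalized Shannon entropy of the class distribution. The sketch below uses that formula as an assumption about the tool, not as documented behaviour; `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits
    of the class distribution and k is the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Value counts copied from the per-variable sections of this report.
scores = {
    "관리기관": imbalance_score([1173, 27, 4]),
    "정류장종류": imbalance_score([1145, 43, 9, 4, 3]),
    "노선표_시간표 유무": imbalance_score([1028, 173, 3]),
}

for name, score in scores.items():
    print(f"{name}: {100 * score:.1f}%")
```

A score of 0 means all classes are equally frequent; a score near 1 means a single class dominates.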

Reproduction

Analysis started: 2023-12-12 01:19:00.092607
Analysis finished: 2023-12-12 01:19:00.935352
Duration: 0.84 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
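
The reported duration is the difference between the two timestamps. A minimal check, using only the values printed above:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2023-12-12 01:19:00.092607")
finished = datetime.fromisoformat("2023-12-12 01:19:00.935352")

duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```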

Variables

행정읍_면_동
Categorical

Distinct: 18
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 9.5 KiB

Value | Count
배방읍 | 172
신창면 | 114
음봉면 | 95
도고면 | 87
온양4동 | 77
Other values (13) | 659

Length

Max length: 10
Median length: 3
Mean length: 3.2400332
Min length: 3

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 도고면
2nd row: 선장면
3rd row: 신창면
4th row: 온양2동
5th row: 온양3동

Common Values

Value | Count | Frequency (%)
배방읍 | 172 | 14.3%
신창면 | 114 | 9.5%
음봉면 | 95 | 7.9%
도고면 | 87 | 7.2%
온양4동 | 77 | 6.4%
탕정면 | 75 | 6.2%
송악면 | 74 | 6.1%
영인면 | 67 | 5.6%
인주면 | 65 | 5.4%
염치읍 | 61 | 5.1%
Other values (8) | 317 | 26.3%

Length

Histogram of lengths of the category

법정읍_면_동
Text

Distinct: 153
Distinct (%): 12.7%
Missing: 0
Missing (%): 0.0%
Memory size: 9.5 KiB

Length

Max length: 8
Median length: 7
Mean length: 5.9950166
Min length: 2

Characters and Unicode

Total characters: 7218
Distinct characters: 132
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
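
As an illustration of the category counts reported for text variables, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module, using the five sample values from this report. The mapping from two-letter codes to the long names used in the report covers only the codes needed here, and `category_counts` is a hypothetical helper. The module exposes the general category but not the script or block, so those properties are left out.

```python
import unicodedata
from collections import Counter

# Partial mapping from general-category codes to the long names in the report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Nd": "Decimal Number",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
}

def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# The five sample values shown for this variable.
sample = ["도고면 와산리", "선장면 신성리", "신창면 읍내리", "온천동", "모종동"]
counts = category_counts(sample)
print(dict(counts))
```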

Unique

Unique: 11
Unique (%): 0.9%

Sample

1st row: 도고면 와산리
2nd row: 선장면 신성리
3rd row: 신창면 읍내리
4th row: 온천동
5th row: 모종동

Value | Count | Frequency (%)
배방읍 | 177 | 8.4%
신창면 | 114 | 5.4%
음봉면 | 95 | 4.5%
도고면 | 87 | 4.1%
탕정면 | 75 | 3.6%
송악면 | 74 | 3.5%
인주면 | 65 | 3.1%
영인면 | 63 | 3.0%
염치읍 | 61 | 2.9%
온천동 | 60 | 2.8%
Other values (150) | 1236 | 58.7%

Most occurring characters

Value | Count | Frequency (%)
[?] | 903 | 12.5%
[?] | 903 | 12.5%
[?] | 688 | 9.5%
[?] | 329 | 4.6%
[?] | 259 | 3.6%
[?] | 239 | 3.3%
[?] | 213 | 3.0%
[?] | 183 | 2.5%
[?] | 140 | 1.9%
[?] | 133 | 1.8%
Other values (122) | 3228 | 44.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 6315 | 87.5%
Space Separator | 903 | 12.5%

Most frequent character per category

Other Letter

Value | Count | Frequency (%)
[?] | 903 | 14.3%
[?] | 688 | 10.9%
[?] | 329 | 5.2%
[?] | 259 | 4.1%
[?] | 239 | 3.8%
[?] | 213 | 3.4%
[?] | 183 | 2.9%
[?] | 140 | 2.2%
[?] | 133 | 2.1%
[?] | 115 | 1.8%
Other values (121) | 3113 | 49.3%

Space Separator

Value | Count | Frequency (%)
(space) | 903 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 6315 | 87.5%
Common | 903 | 12.5%

Most frequent character per script

Hangul

Value | Count | Frequency (%)
[?] | 903 | 14.3%
[?] | 688 | 10.9%
[?] | 329 | 5.2%
[?] | 259 | 4.1%
[?] | 239 | 3.8%
[?] | 213 | 3.4%
[?] | 183 | 2.9%
[?] | 140 | 2.2%
[?] | 133 | 2.1%
[?] | 115 | 1.8%
Other values (121) | 3113 | 49.3%

Common

Value | Count | Frequency (%)
(space) | 903 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 6315 | 87.5%
ASCII | 903 | 12.5%

Most frequent character per block

Hangul

Value | Count | Frequency (%)
[?] | 903 | 14.3%
[?] | 688 | 10.9%
[?] | 329 | 5.2%
[?] | 259 | 4.1%
[?] | 239 | 3.8%
[?] | 213 | 3.4%
[?] | 183 | 2.9%
[?] | 140 | 2.2%
[?] | 133 | 2.1%
[?] | 115 | 1.8%
Other values (121) | 3113 | 49.3%

ASCII

Value | Count | Frequency (%)
(space) | 903 | 100.0%

관리기관
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 9.5 KiB

Value | Count
교통행정과 | 1173
도로과 | 27
도시개발과 | 4

Length

Max length: 5
Median length: 5
Mean length: 4.9551495
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 교통행정과
2nd row: 교통행정과
3rd row: 교통행정과
4th row: 교통행정과
5th row: 교통행정과

Common Values

Value | Count | Frequency (%)
교통행정과 | 1173 | 97.4%
도로과 | 27 | 2.2%
도시개발과 | 4 | 0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: bar chart of the value counts for 관리기관 (교통행정과 1173, 도로과 27, 도시개발과 4)]

설치일자
Real number (ℝ)

Distinct: 16
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 19151837
Minimum: 19000101
Maximum: 20191120
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.7 KiB

Quantile statistics

Minimum: 19000101
5-th percentile: 19000101
Q1: 19000101
median: 19000101
Q3: 19000101
95-th percentile: 20151231
Maximum: 20191120
Range: 1191019
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 387904.15
Coefficient of variation (CV): 0.020254148
Kurtosis: 2.7156507
Mean: 19151837
Median Absolute Deviation (MAD): 0
Skewness: 2.1689946
Sum: 2.3058812 × 10^10
Variance: 1.5046963 × 10^11
Monotonicity: Not monotonic
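
Several of these statistics are simple functions of the others. The sketch below recomputes the range, coefficient of variation, variance, and sum from the rounded figures printed in this section, so small rounding differences are expected. Note that the values appear to be dates encoded as YYYYMMDD integers, so these statistics describe the encoded integers rather than calendar dates.

```python
# Figures copied from this section (already rounded by the report).
n_observations = 1204
mean = 19151837
std = 387904.15
minimum, maximum = 19000101, 20191120

value_range = maximum - minimum   # Range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance
total = mean * n_observations     # Sum, approximate because the mean is rounded

print(f"Range: {value_range}")
print(f"CV: {cv:.9f}")
print(f"Variance: {variance:.7e}")
print(f"Sum: {total:.7e}")
```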
Histogram with fixed size bins (bins=16)

Common Values

Value | Count | Frequency (%)
19000101 | 1044 | 86.7%
20110101 | 32 | 2.7%
20140101 | 18 | 1.5%
20151231 | 16 | 1.3%
20090101 | 13 | 1.1%
20190101 | 12 | 1.0%
20160128 | 11 | 0.9%
20141201 | 11 | 0.9%
20160101 | 10 | 0.8%
20171001 | 10 | 0.8%
Other values (6) | 27 | 2.2%

Minimum 10 values

Value | Count | Frequency (%)
19000101 | 1044 | 86.7%
20090101 | 13 | 1.1%
20110101 | 32 | 2.7%
20120101 | 7 | 0.6%
20140101 | 18 | 1.5%
20141101 | 3 | 0.2%
20141201 | 11 | 0.9%
20141231 | 6 | 0.5%
20151231 | 16 | 1.3%
20160101 | 10 | 0.8%

Maximum 10 values

Value | Count | Frequency (%)
20191120 | 8 | 0.7%
20190101 | 12 | 1.0%
20180601 | 1 | 0.1%
20171130 | 2 | 0.2%
20171001 | 10 | 0.8%
20160128 | 11 | 0.9%
20160101 | 10 | 0.8%
20151231 | 16 | 1.3%
20141231 | 6 | 0.5%
20141201 | 11 | 0.9%
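
The 설치일자 values look like dates packed into integers as YYYYMMDD, and 19000101 accounts for 86.7% of rows. The sketch below decodes such integers into dates and treats 19000101 as a placeholder for an unknown installation date. That interpretation is an assumption, since the report does not document the value, and `parse_install_date` is a hypothetical helper.

```python
from datetime import date, datetime
from typing import Optional

# Assumption: 19000101 marks an unknown installation date, not a real one.
PLACEHOLDER = 19000101

def parse_install_date(value: int) -> Optional[date]:
    """Decode a YYYYMMDD integer; return None for the assumed placeholder."""
    if value == PLACEHOLDER:
        return None
    return datetime.strptime(str(value), "%Y%m%d").date()

# Counts copied from the value tables for this variable.
n_observations = 1204
placeholder_count = 1044
known_share = 100 * (n_observations - placeholder_count) / n_observations

print(parse_install_date(20191120))
print(parse_install_date(19000101))
print(f"{known_share:.1f}% of rows carry a non-placeholder value")
```

Decoding before analysis avoids treating the gap between 19000101 and 20090101 as a numeric distance.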

정류장종류
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 9.5 KiB

Value | Count
일반버스 | 1145
택시 | 43
시외버스 | 9
시내, 시외겸용 | 4
마을버스 | 3

Length

Max length: 8
Median length: 4
Mean length: 3.9418605
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일반버스
2nd row: 일반버스
3rd row: 일반버스
4th row: 택시
5th row: 택시

Common Values

Value | Count | Frequency (%)
일반버스 | 1145 | 95.1%
택시 | 43 | 3.6%
시외버스 | 9 | 0.7%
시내, 시외겸용 | 4 | 0.3%
마을버스 | 3 | 0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
일반버스 | 1145 | 94.8%
택시 | 43 | 3.6%
시외버스 | 9 | 0.7%
시내 | 4 | 0.3%
시외겸용 | 4 | 0.3%
마을버스 | 3 | 0.2%

정류장명
Text

MISSING 

Distinct: 668
Distinct (%): 56.2%
Missing: 15
Missing (%): 1.2%
Memory size: 9.5 KiB

Length

Max length: 18
Median length: 12
Mean length: 5.0798991
Min length: 2

Characters and Unicode

Total characters: 6040
Distinct characters: 332
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 354
Unique (%): 29.8%
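
For this variable the percentages are consistent with two different denominators: the missing percentage matches a division by all 1204 rows, while the distinct and unique percentages match a division by the 1189 non-missing rows. The report does not state this, so the sketch below is an inference from the printed numbers.

```python
# Figures copied from the 정류장명 section and the dataset overview.
n_observations = 1204
missing = 15
distinct = 668
unique = 354

non_missing = n_observations - missing

missing_pct = 100 * missing / n_observations  # over all rows
distinct_pct = 100 * distinct / non_missing   # assumed: over non-missing rows
unique_pct = 100 * unique / non_missing       # assumed: over non-missing rows

print(f"Missing (%): {missing_pct:.1f}%")
print(f"Distinct (%): {distinct_pct:.1f}%")
print(f"Unique (%): {unique_pct:.1f}%")
```

Dividing the distinct and unique counts by all 1204 rows instead would give roughly 55.5% and 29.4%, which do not match the report.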

Sample

1st row: 진테크
2nd row: 신성1리
3rd row: 신창초등학교
4th row: 온양온천역
5th row: 고속버스터미널

Value | Count | Frequency (%)
쌍용보건지소(양면 | 11 | 0.9%
금산리 | 10 | 0.8%
휴대리입구 | 9 | 0.8%
천안아산역 | 9 | 0.8%
성내2리안골입구 | 7 | 0.6%
인주산업단지 | 7 | 0.6%
도고온천역 | 6 | 0.5%
e마트 | 6 | 0.5%
배방롯데캐슬아파트 | 6 | 0.5%
온양온천 | 6 | 0.5%
Other values (662) | 1119 | 93.6%

Most occurring characters

Value | Count | Frequency (%)
[?] | 602 | 10.0%
2 | 195 | 3.2%
1 | 174 | 2.9%
[?] | 161 | 2.7%
[?] | 142 | 2.4%
[?] | 128 | 2.1%
[?] | 111 | 1.8%
[?] | 110 | 1.8%
[?] | 99 | 1.6%
[?] | 81 | 1.3%
Other values (322) | 4237 | 70.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 5378 | 89.0%
Decimal Number | 441 | 7.3%
Uppercase Letter | 111 | 1.8%
Close Punctuation | 45 | 0.7%
Open Punctuation | 45 | 0.7%
Space Separator | 7 | 0.1%
Math Symbol | 7 | 0.1%
Dash Punctuation | 2 | < 0.1%
Modifier Symbol | 2 | < 0.1%
Lowercase Letter | 1 | < 0.1%

Most frequent character per category

Other Letter

Value | Count | Frequency (%)
[?] | 602 | 11.2%
[?] | 161 | 3.0%
[?] | 142 | 2.6%
[?] | 128 | 2.4%
[?] | 111 | 2.1%
[?] | 110 | 2.0%
[?] | 99 | 1.8%
[?] | 81 | 1.5%
[?] | 81 | 1.5%
[?] | 80 | 1.5%
Other values (292) | 3783 | 70.3%

Uppercase Letter

Value | Count | Frequency (%)
P | 25 | 22.5%
A | 25 | 22.5%
T | 23 | 20.7%
L | 8 | 7.2%
E | 6 | 5.4%
C | 6 | 5.4%
D | 6 | 5.4%
I | 2 | 1.8%
R | 2 | 1.8%
K | 2 | 1.8%
Other values (3) | 6 | 5.4%

Decimal Number

Value | Count | Frequency (%)
2 | 195 | 44.2%
1 | 174 | 39.5%
3 | 51 | 11.6%
4 | 14 | 3.2%
5 | 3 | 0.7%
6 | 2 | 0.5%
7 | 1 | 0.2%
8 | 1 | 0.2%

Math Symbol

Value | Count | Frequency (%)
+ | 6 | 85.7%
< | 1 | 14.3%

Close Punctuation

Value | Count | Frequency (%)
) | 45 | 100.0%

Open Punctuation

Value | Count | Frequency (%)
( | 45 | 100.0%

Space Separator

Value | Count | Frequency (%)
(space) | 7 | 100.0%

Dash Punctuation

Value | Count | Frequency (%)
- | 2 | 100.0%

Modifier Symbol

Value | Count | Frequency (%)
` | 2 | 100.0%

Lowercase Letter

Value | Count | Frequency (%)
e | 1 | 100.0%

Other Punctuation

Value | Count | Frequency (%)
, | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 5378 | 89.0%
Common | 550 | 9.1%
Latin | 112 | 1.9%

Most frequent character per script

Hangul

Value | Count | Frequency (%)
[?] | 602 | 11.2%
[?] | 161 | 3.0%
[?] | 142 | 2.6%
[?] | 128 | 2.4%
[?] | 111 | 2.1%
[?] | 110 | 2.0%
[?] | 99 | 1.8%
[?] | 81 | 1.5%
[?] | 81 | 1.5%
[?] | 80 | 1.5%
Other values (292) | 3783 | 70.3%

Common

Value | Count | Frequency (%)
2 | 195 | 35.5%
1 | 174 | 31.6%
3 | 51 | 9.3%
) | 45 | 8.2%
( | 45 | 8.2%
4 | 14 | 2.5%
(space) | 7 | 1.3%
+ | 6 | 1.1%
5 | 3 | 0.5%
- | 2 | 0.4%
Other values (6) | 8 | 1.5%

Latin

Value | Count | Frequency (%)
P | 25 | 22.3%
A | 25 | 22.3%
T | 23 | 20.5%
L | 8 | 7.1%
E | 6 | 5.4%
C | 6 | 5.4%
D | 6 | 5.4%
I | 2 | 1.8%
R | 2 | 1.8%
K | 2 | 1.8%
Other values (4) | 7 | 6.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 5378 | 89.0%
ASCII | 662 | 11.0%

Most frequent character per block

Hangul

Value | Count | Frequency (%)
[?] | 602 | 11.2%
[?] | 161 | 3.0%
[?] | 142 | 2.6%
[?] | 128 | 2.4%
[?] | 111 | 2.1%
[?] | 110 | 2.0%
[?] | 99 | 1.8%
[?] | 81 | 1.5%
[?] | 81 | 1.5%
[?] | 80 | 1.5%
Other values (292) | 3783 | 70.3%

ASCII

Value | Count | Frequency (%)
2 | 195 | 29.5%
1 | 174 | 26.3%
3 | 51 | 7.7%
) | 45 | 6.8%
( | 45 | 6.8%
P | 25 | 3.8%
A | 25 | 3.8%
T | 23 | 3.5%
4 | 14 | 2.1%
L | 8 | 1.2%
Other values (20) | 57 | 8.6%

정류장유형
Categorical

Distinct: 4
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 9.5 KiB

Value | Count
지붕설치 | 694
표지설치 | 498
기타 | 11
미분류 | 1

Length

Max length: 4
Median length: 4
Mean length: 3.980897
Min length: 2

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 표지설치
2nd row: 지붕설치
3rd row: 표지설치
4th row: 표지설치
5th row: 지붕설치

Common Values

Value | Count | Frequency (%)
지붕설치 | 694 | 57.6%
표지설치 | 498 | 41.4%
기타 | 11 | 0.9%
미분류 | 1 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: bar chart of the value counts for 정류장유형 (지붕설치 694, 표지설치 498, 기타 11, 미분류 1)]

노선표_시간표 유무
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 9.5 KiB

Value | Count
없음 | 1028
있음 | 173
미분류 | 3

Length

Max length: 3
Median length: 2
Mean length: 2.0024917
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 없음
2nd row: 없음
3rd row: 없음
4th row: 없음
5th row: 없음

Common Values

Value | Count | Frequency (%)
없음 | 1028 | 85.4%
있음 | 173 | 14.4%
미분류 | 3 | 0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: bar chart of the value counts for 노선표_시간표 유무 (없음 1028, 있음 173, 미분류 3)]

Interactions


Correlations

Variable | 행정읍_면_동 | 관리기관 | 설치일자 | 정류장종류 | 정류장유형 | 노선표_시간표 유무
행정읍_면_동 | 1.000 | 0.718 | NaN | 0.286 | 0.257 | 0.376
관리기관 | 0.718 | 1.000 | NaN | 0.044 | 0.361 | 0.000
설치일자 | NaN | NaN | 1.000 | NaN | NaN | NaN
정류장종류 | 0.286 | 0.044 | NaN | 1.000 | 0.000 | 0.483
정류장유형 | 0.257 | 0.361 | NaN | 0.000 | 1.000 | 0.088
노선표_시간표 유무 | 0.376 | 0.000 | NaN | 0.483 | 0.088 | 1.000

Variable | 정류장종류 | 정류장유형 | 행정읍_면_동 | 관리기관 | 노선표_시간표 유무
정류장종류 | 1.000 | 0.000 | 0.149 | 0.033 | 0.416
정류장유형 | 0.000 | 1.000 | 0.142 | 0.350 | 0.082
행정읍_면_동 | 0.149 | 0.142 | 1.000 | 0.445 | 0.186
관리기관 | 0.033 | 0.350 | 0.445 | 1.000 | 0.000
노선표_시간표 유무 | 0.416 | 0.082 | 0.186 | 0.000 | 1.000

Variable | 설치일자 | 행정읍_면_동 | 관리기관 | 정류장종류 | 정류장유형 | 노선표_시간표 유무
설치일자 | 1.000 | 0.438 | 0.413 | 0.063 | 0.161 | 0.075
행정읍_면_동 | 0.438 | 1.000 | 0.445 | 0.149 | 0.142 | 0.186
관리기관 | 0.413 | 0.445 | 1.000 | 0.033 | 0.350 | 0.000
정류장종류 | 0.063 | 0.149 | 0.033 | 1.000 | 0.000 | 0.416
정류장유형 | 0.161 | 0.142 | 0.350 | 0.000 | 1.000 | 0.082
노선표_시간표 유무 | 0.075 | 0.186 | 0.000 | 0.416 | 0.082 | 1.000
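
The report does not name the coefficients behind these matrices. One common measure of association between two categorical columns is Cramér's V, and the sketch below computes the plain (uncorrected) version as an illustration of how such a 0-to-1 score can arise. It is not claimed to be the method used here; profiling tools often apply a bias correction, so the values need not match. The function name and the toy data are hypothetical.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length sequences of categories."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association
    return math.sqrt(chi2 / (n * k))

# Toy data (not taken from the dataset): stop category vs. timetable flag.
stop_category = ["일반버스", "일반버스", "택시", "택시"]
perfectly_linked = ["있음", "있음", "없음", "없음"]
unrelated = ["있음", "없음", "있음", "없음"]

print(cramers_v(stop_category, perfectly_linked))
print(cramers_v(stop_category, unrelated))
```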

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

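Index | 행정읍_면_동 | 법정읍_면_동 | 관리기관 | 설치일자 | 정류장종류 | 정류장명 | 정류장유형 | 노선표_시간표 유무
0 | 도고면 | 도고면 와산리 | 교통행정과 | 19000101 | 일반버스 | 진테크 | 표지설치 | 없음
1 | 선장면 | 선장면 신성리 | 교통행정과 | 19000101 | 일반버스 | 신성1리 | 지붕설치 | 없음
2 | 신창면 | 신창면 읍내리 | 교통행정과 | 19000101 | 일반버스 | 신창초등학교 | 표지설치 | 없음
3 | 온양2동 | 온천동 | 교통행정과 | 19000101 | 택시 | 온양온천역 | 표지설치 | 없음
4 | 온양3동 | 모종동 | 교통행정과 | 19000101 | 택시 | 고속버스터미널 | 지붕설치 | 없음
5 | 신창면 | 신창면 읍내리 | 교통행정과 | 19000101 | 일반버스 | 순천향대학교 | 표지설치 | 없음
6 | 신창면 | 신창면 읍내리 | 교통행정과 | 19000101 | 일반버스 | 순천향대학교 | 지붕설치 | 없음
7 | 신창면 | 신창면 읍내리 | 교통행정과 | 19000101 | 일반버스 | 순천향대학교 | 표지설치 | 없음
8 | 온양5동 | 용화동 | 교통행정과 | 19000101 | 일반버스 | 송악나드리 | 지붕설치 | 없음
9 | 온양3동 | 모종동 | 교통행정과 | 19000101 | 일반버스 | 온양여고 | 표지설치 | 없음

Index | 행정읍_면_동 | 법정읍_면_동 | 관리기관 | 설치일자 | 정류장종류 | 정류장명 | 정류장유형 | 노선표_시간표 유무
1194 | 탕정면 | 탕정면 명암리 | 도로과 | 20140101 | 택시 | <NA> | 지붕설치 | 없음
1195 | 탕정면 | 탕정면 명암리 | 도로과 | 20140101 | 일반버스 | 명암2리상목이 | 지붕설치 | 있음
1196 | 탕정면 | 탕정면 명암리 | 도로과 | 20140101 | 일반버스 | 명암2리상목이 | 지붕설치 | 있음
1197 | 탕정면 | 탕정면 명암리 | 도로과 | 20140101 | 일반버스 | 명암2리장목이 | 표지설치 | 있음
1198 | 탕정면 | 탕정면 명암리 | 도로과 | 20140101 | 일반버스 | 삼성LCD | 표지설치 | 없음
1199 | 탕정면 | 탕정면 명암리 | 도로과 | 20140101 | 일반버스 | 삼성LCD | 표지설치 | 없음
1200 | 탕정면 | 탕정면 명암리 | 도로과 | 20140101 | 일반버스 | 삼성LCD입구 | 표지설치 | 없음
1201 | 탕정면 | 탕정면 명암리 | 도로과 | 20140101 | 일반버스 | 삼성코닝1단지 | 표지설치 | 없음
1202 | 탕정면 | 탕정면 명암리 | 도로과 | 20140101 | 일반버스 | 명암2리장목이 | 표지설치 | 있음
1203 | 탕정면 | 탕정면 명암리 | 도로과 | 20140101 | 일반버스 | 삼성전자기숙사후문 | 표지설치 | 없음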

Duplicate rows

Most frequently occurring

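Index | 행정읍_면_동 | 법정읍_면_동 | 관리기관 | 설치일자 | 정류장종류 | 정류장명 | 정류장유형 | 노선표_시간표 유무 | # duplicates
0 | 도고면 | 도고면 금산리 | 교통행정과 | 19000101 | 일반버스 | 금산리 | 지붕설치 | 없음 | 7
43 | 배방읍 | 배방읍 북수리 | 교통행정과 | 20190101 | 일반버스 | <NA> | 지붕설치 | 없음 | 7
207 | 탕정면 | 탕정면 용두리 | 교통행정과 | 20160128 | 일반버스 | 휴대리입구 | 지붕설치 | 없음 | 7
19 | 둔포면 | 둔포면 석곡리 | 교통행정과 | 20120101 | 일반버스 | 쌍용보건지소(양면) | 지붕설치 | 없음 | 5
181 | 인주면 | 인주면 걸매리 | 교통행정과 | 19000101 | 일반버스 | 인주산업단지 | 표지설치 | 없음 | 5
34 | 배방읍 | 배방읍 구령리 | 교통행정과 | 20151231 | 일반버스 | 성내2리안골입구 | 지붕설치 | 없음 | 4
5 | 도고면 | 도고면 시전리 | 교통행정과 | 19000101 | 일반버스 | 도고온천역 | 표지설치 | 없음 | 3
21 | 둔포면 | 둔포면 운용리 | 교통행정과 | 19000101 | 일반버스 | 운용3리 | 지붕설치 | 없음 | 3
37 | 배방읍 | 배방읍 북수리 | 교통행정과 | 19000101 | 일반버스 | 북수1리 | 표지설치 | 없음 | 3
46 | 배방읍 | 배방읍 세출리 | 교통행정과 | 19000101 | 일반버스 | 호서대학교 | 표지설치 | 없음 | 3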
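
A "# duplicates" column of this kind can be produced by counting how often each full row occurs. The sketch below does so with the standard library on a small hand-built sample, in which rows resembling those in this report are repeated on purpose; the repeat counts are illustrative and do not come from the dataset. Whether the headline figure of 210 counts distinct repeated rows or every repeated occurrence is not stated in the report.

```python
from collections import Counter

# Hand-built sample; each tuple is one row in column order.
row_a = ("도고면", "도고면 금산리", "교통행정과", 19000101, "일반버스", "금산리", "지붕설치", "없음")
row_b = ("탕정면", "탕정면 명암리", "도로과", 20140101, "일반버스", "삼성LCD", "표지설치", "없음")
row_c = ("온양2동", "온천동", "교통행정과", 19000101, "택시", "온양온천역", "표지설치", "없음")
rows = [row_a, row_b, row_a, row_c, row_b, row_a]

# Count identical rows and keep only those that occur more than once.
counts = Counter(rows)
duplicated = {row: n for row, n in counts.items() if n > 1}

# List the most frequently occurring rows first.
for row, n in sorted(duplicated.items(), key=lambda item: -item[1]):
    print(n, row)
```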