Overview

Dataset statistics

Number of variables: 6
Number of observations: 706
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 35.3 KiB
Average record size in memory: 51.2 B

Variable types

Numeric: 3
Text: 1
Categorical: 2

Dataset

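Description: Installation status of bus arrival information displays in Siheung-si, Gyeonggi-do. The dataset records the stop name (정류소명), stop number (정류소번호), installation year (설치연도), equipment type (장비유형), and equipment form (장비형태).
URL: https://www.data.go.kr/data/15105829/fileData.do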

Alerts

장비유형 is highly overall correlated with 장비형태 (High correlation)
장비형태 is highly overall correlated with 장비유형 (High correlation)
연번 is highly overall correlated with 정류소번호 and 1 other field (High correlation)
정류소번호 is highly overall correlated with 연번 (High correlation)
설치연도 is highly overall correlated with 연번 (High correlation)
연번 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 01:08:22.103991
Analysis finished: 2023-12-12 01:08:24.240973
Duration: 2.14 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
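
The reported duration can be cross-checked from the two timestamps. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the report.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of the report.
started = datetime.fromisoformat("2023-12-12 01:08:22.103991")
finished = datetime.fromisoformat("2023-12-12 01:08:24.240973")

# Elapsed wall-clock time in seconds between start and finish.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```

Rounded to two decimals, the difference agrees with the duration listed above.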

Variables

연번
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 706
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 353.5
Minimum: 1
Maximum: 706
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.3 KiB

Quantile statistics

Minimum: 1
5th percentile: 36.25
Q1: 177.25
Median: 353.5
Q3: 529.75
95th percentile: 670.75
Maximum: 706
Range: 705
Interquartile range (IQR): 352.5
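
These quantiles are consistent with linear interpolation between order statistics. The sketch below assumes that 연번 is exactly the sequence 1..706 (an assumption supported by the minimum, maximum, distinct count, and strict monotonicity reported here, but not stated outright) and uses `statistics.quantiles` from the standard library rather than whatever routine the profiler itself calls.

```python
import statistics

# Assumption: 연번 is the integer sequence 1..706.
serial_numbers = list(range(1, 707))

# method="inclusive" places the p-th quantile at position p * (n - 1)
# and interpolates linearly between the neighbouring order statistics.
quartiles = statistics.quantiles(serial_numbers, n=4, method="inclusive")
twentieths = statistics.quantiles(serial_numbers, n=20, method="inclusive")

q1, median, q3 = quartiles
p5, p95 = twentieths[0], twentieths[-1]  # 5th and 95th percentiles
iqr = q3 - q1

print(p5, q1, median, q3, p95, iqr)
```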

Descriptive statistics

Standard deviation: 203.94893
Coefficient of variation (CV): 0.57694181
Kurtosis: -1.2
Mean: 353.5
Median Absolute Deviation (MAD): 176.5
Skewness: 0
Sum: 249571
Variance: 41595.167
Monotonicity: Strictly increasing
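
Several of these figures can be recomputed directly. This is a sketch under the assumption that 연번 is the sequence 1..706; it suggests that the reported variance is the sample variance (divisor n - 1), since the population variance of the same sequence would be smaller.

```python
import math
import statistics

# Assumption: 연번 is the sequence 1..706, stored as floats.
serial_numbers = [float(i) for i in range(1, 707)]

total = sum(serial_numbers)
mean = statistics.mean(serial_numbers)
variance = statistics.variance(serial_numbers)  # sample variance, divisor n - 1
std_dev = math.sqrt(variance)
cv = std_dev / mean                             # coefficient of variation

# Median absolute deviation: median of |x - median(x)|.
center = statistics.median(serial_numbers)
mad = statistics.median(abs(x - center) for x in serial_numbers)

print(total, mean, round(variance, 3), round(std_dev, 5), round(cv, 8), mad)
```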
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1 | 1 | 0.1%
531 | 1 | 0.1%
467 | 1 | 0.1%
468 | 1 | 0.1%
469 | 1 | 0.1%
470 | 1 | 0.1%
471 | 1 | 0.1%
472 | 1 | 0.1%
473 | 1 | 0.1%
474 | 1 | 0.1%
Other values (696) | 696 | 98.6%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 1 | 0.1%
7 | 1 | 0.1%
8 | 1 | 0.1%
9 | 1 | 0.1%
10 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
706 | 1 | 0.1%
705 | 1 | 0.1%
704 | 1 | 0.1%
703 | 1 | 0.1%
702 | 1 | 0.1%
701 | 1 | 0.1%
700 | 1 | 0.1%
699 | 1 | 0.1%
698 | 1 | 0.1%
697 | 1 | 0.1%

정류소명
Text

Distinct: 442
Distinct (%): 62.6%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

Length

Max length: 17
Median length: 14
Mean length: 6.7549575
Min length: 2

Characters and Unicode

Total characters: 4769
Distinct characters: 352
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
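
As an illustration of these character properties, the sketch below classifies the characters of one stop name from this dataset by Unicode general category, using Python's standard `unicodedata` module. It is an assumption that the profiler derives its category counts in a comparable way; the two-letter codes map to the category names used in this report (`Lo` is Other Letter, `Nd` is Decimal Number, `Po` is Other Punctuation).

```python
import unicodedata
from collections import Counter

# One stop name taken from the sample rows of this report.
stop_name = "신동아1단지.한일아파트"

# unicodedata.category returns the two-letter general category code
# that the Unicode Standard assigns to each code point.
category_counts = Counter(unicodedata.category(ch) for ch in stop_name)

for code, count in category_counts.most_common():
    print(code, count)
```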

Unique

Unique: 206
Unique (%): 29.2%

Sample

1st row: 동보아파트
2nd row: 동남아파트
3rd row: 계룡2단지
4th row: 대림아파트
5th row: 신동아1단지.한일아파트

Value | Count | Frequency (%)
이마트 | 6 | 0.8%
금강아파트 | 6 | 0.8%
삼미시장 | 4 | 0.6%
경신아파트 | 4 | 0.6%
에이스아파트 | 4 | 0.6%
검바위 | 4 | 0.6%
동원아파트 | 4 | 0.6%
군서미래국제학교 | 4 | 0.6%
월미동 | 3 | 0.4%
나분들 | 3 | 0.4%
Other values (433) | 666 | 94.1%

Most occurring characters

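Value | Count | Frequency (%)
(not recovered) | 178 | 3.7%
(not recovered) | 167 | 3.5%
. | 147 | 3.1%
(not recovered) | 143 | 3.0%
(not recovered) | 112 | 2.3%
(not recovered) | 95 | 2.0%
(not recovered) | 82 | 1.7%
(not recovered) | 78 | 1.6%
(not recovered) | 68 | 1.4%
(not recovered) | 67 | 1.4%
Other values (342) | 3632 | 76.2%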

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4451 | 93.3%
Other Punctuation | 147 | 3.1%
Decimal Number | 96 | 2.0%
Uppercase Letter | 62 | 1.3%
Open Punctuation | 4 | 0.1%
Close Punctuation | 4 | 0.1%
Space Separator | 2 | < 0.1%
Lowercase Letter | 2 | < 0.1%
Other Symbol | 1 | < 0.1%

Most frequent character per category

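Other Letter
Value | Count | Frequency (%)
(not recovered) | 178 | 4.0%
(not recovered) | 167 | 3.8%
(not recovered) | 143 | 3.2%
(not recovered) | 112 | 2.5%
(not recovered) | 95 | 2.1%
(not recovered) | 82 | 1.8%
(not recovered) | 78 | 1.8%
(not recovered) | 68 | 1.5%
(not recovered) | 67 | 1.5%
(not recovered) | 65 | 1.5%
Other values (312) | 3396 | 76.3%

Uppercase Letter
Value | Count | Frequency (%)
M | 11 | 17.7%
V | 10 | 16.1%
T | 10 | 16.1%
L | 5 | 8.1%
H | 5 | 8.1%
A | 4 | 6.5%
B | 4 | 6.5%
C | 4 | 6.5%
S | 3 | 4.8%
G | 2 | 3.2%
Other values (4) | 4 | 6.5%

Decimal Number
Value | Count | Frequency (%)
2 | 27 | 28.1%
1 | 26 | 27.1%
3 | 13 | 13.5%
4 | 9 | 9.4%
6 | 8 | 8.3%
5 | 5 | 5.2%
7 | 5 | 5.2%
8 | 2 | 2.1%
9 | 1 | 1.0%

Lowercase Letter
Value | Count | Frequency (%)
s | 1 | 50.0%
k | 1 | 50.0%

Other Punctuation
Value | Count | Frequency (%)
. | 147 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 4 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 4 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 2 | 100.0%

Other Symbol
Value | Count | Frequency (%)
(not recovered) | 1 | 100.0%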

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4452 | 93.4%
Common | 253 | 5.3%
Latin | 64 | 1.3%

Most frequent character per script

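Hangul
Value | Count | Frequency (%)
(not recovered) | 178 | 4.0%
(not recovered) | 167 | 3.8%
(not recovered) | 143 | 3.2%
(not recovered) | 112 | 2.5%
(not recovered) | 95 | 2.1%
(not recovered) | 82 | 1.8%
(not recovered) | 78 | 1.8%
(not recovered) | 68 | 1.5%
(not recovered) | 67 | 1.5%
(not recovered) | 65 | 1.5%
Other values (313) | 3397 | 76.3%

Latin
Value | Count | Frequency (%)
M | 11 | 17.2%
V | 10 | 15.6%
T | 10 | 15.6%
L | 5 | 7.8%
H | 5 | 7.8%
A | 4 | 6.2%
B | 4 | 6.2%
C | 4 | 6.2%
S | 3 | 4.7%
G | 2 | 3.1%
Other values (6) | 6 | 9.4%

Common
Value | Count | Frequency (%)
. | 147 | 58.1%
2 | 27 | 10.7%
1 | 26 | 10.3%
3 | 13 | 5.1%
4 | 9 | 3.6%
6 | 8 | 3.2%
5 | 5 | 2.0%
7 | 5 | 2.0%
( | 4 | 1.6%
) | 4 | 1.6%
Other values (3) | 5 | 2.0%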

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4451 | 93.3%
ASCII | 317 | 6.6%
None | 1 | < 0.1%

Most frequent character per block

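Hangul
Value | Count | Frequency (%)
(not recovered) | 178 | 4.0%
(not recovered) | 167 | 3.8%
(not recovered) | 143 | 3.2%
(not recovered) | 112 | 2.5%
(not recovered) | 95 | 2.1%
(not recovered) | 82 | 1.8%
(not recovered) | 78 | 1.8%
(not recovered) | 68 | 1.5%
(not recovered) | 67 | 1.5%
(not recovered) | 65 | 1.5%
Other values (312) | 3396 | 76.3%

ASCII
Value | Count | Frequency (%)
. | 147 | 46.4%
2 | 27 | 8.5%
1 | 26 | 8.2%
3 | 13 | 4.1%
M | 11 | 3.5%
V | 10 | 3.2%
T | 10 | 3.2%
4 | 9 | 2.8%
6 | 8 | 2.5%
5 | 5 | 1.6%
Other values (19) | 51 | 16.1%

None
Value | Count | Frequency (%)
(not recovered) | 1 | 100.0%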

정류소번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 700
Distinct (%): 99.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 28414.966
Minimum: 25002
Maximum: 59129
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.3 KiB

Quantile statistics

Minimum: 25002
5th percentile: 25064.25
Q1: 25245.5
Median: 25473.5
Q3: 25769.75
95th percentile: 59037.75
Maximum: 59129
Range: 34127
Interquartile range (IQR): 524.25

Descriptive statistics

Standard deviation: 9515.9586
Coefficient of variation (CV): 0.33489249
Kurtosis: 6.5237576
Mean: 28414.966
Median Absolute Deviation (MAD): 257
Skewness: 2.9147219
Sum: 20060966
Variance: 90553468
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
25337 | 2 | 0.3%
25338 | 2 | 0.3%
25171 | 2 | 0.3%
25943 | 2 | 0.3%
25424 | 2 | 0.3%
25835 | 2 | 0.3%
25549 | 1 | 0.1%
25555 | 1 | 0.1%
25861 | 1 | 0.1%
25933 | 1 | 0.1%
Other values (690) | 690 | 97.7%

Minimum 10 values

Value | Count | Frequency (%)
25002 | 1 | 0.1%
25004 | 1 | 0.1%
25005 | 1 | 0.1%
25006 | 1 | 0.1%
25007 | 1 | 0.1%
25008 | 1 | 0.1%
25009 | 1 | 0.1%
25010 | 1 | 0.1%
25011 | 1 | 0.1%
25012 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
59129 | 1 | 0.1%
59116 | 1 | 0.1%
59115 | 1 | 0.1%
59097 | 1 | 0.1%
59096 | 1 | 0.1%
59095 | 1 | 0.1%
59094 | 1 | 0.1%
59093 | 1 | 0.1%
59092 | 1 | 0.1%
59091 | 1 | 0.1%

설치연도
Real number (ℝ)

HIGH CORRELATION 

Distinct: 12
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2015.9051
Minimum: 2011
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.3 KiB

Quantile statistics

Minimum: 2011
5th percentile: 2011
Q1: 2012
Median: 2016
Q3: 2019
95th percentile: 2021
Maximum: 2022
Range: 11
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 3.5323513
Coefficient of variation (CV): 0.0017522409
Kurtosis: -1.4609794
Mean: 2015.9051
Median Absolute Deviation (MAD): 3
Skewness: 0.095725303
Sum: 1423229
Variance: 12.477506
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)

Common values

Value | Count | Frequency (%)
2012 | 111 | 15.7%
2019 | 92 | 13.0%
2017 | 89 | 12.6%
2013 | 87 | 12.3%
2011 | 69 | 9.8%
2021 | 68 | 9.6%
2020 | 59 | 8.4%
2014 | 47 | 6.7%
2018 | 25 | 3.5%
2015 | 22 | 3.1%
Other values (2) | 37 | 5.2%

Minimum 10 values

Value | Count | Frequency (%)
2011 | 69 | 9.8%
2012 | 111 | 15.7%
2013 | 87 | 12.3%
2014 | 47 | 6.7%
2015 | 22 | 3.1%
2016 | 19 | 2.7%
2017 | 89 | 12.6%
2018 | 25 | 3.5%
2019 | 92 | 13.0%
2020 | 59 | 8.4%

Maximum 10 values

Value | Count | Frequency (%)
2022 | 18 | 2.5%
2021 | 68 | 9.6%
2020 | 59 | 8.4%
2019 | 92 | 13.0%
2018 | 25 | 3.5%
2017 | 89 | 12.6%
2016 | 19 | 2.7%
2015 | 22 | 3.1%
2014 | 47 | 6.7%
2013 | 87 | 12.3%
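
Because 설치연도 has only 12 distinct values, its summary statistics can be rebuilt from the per-year counts listed above. The following is a minimal sketch with illustrative variable names; it expands the frequency table into one value per observation and recomputes the count, sum, mean, median, and sample variance.

```python
import statistics

# Counts per installation year, copied from the value counts listed above
# (2016 and 2022 appear only in the minimum/maximum lists).
year_counts = {
    2011: 69, 2012: 111, 2013: 87, 2014: 47, 2015: 22, 2016: 19,
    2017: 89, 2018: 25, 2019: 92, 2020: 59, 2021: 68, 2022: 18,
}

# Expand the frequency table back into one value per observation.
years = [year for year, count in year_counts.items() for _ in range(count)]

n = len(years)
total = sum(years)
mean = total / n
median = statistics.median(years)
variance = statistics.variance(years)  # sample variance, divisor n - 1

print(n, total, round(mean, 4), median, round(variance, 6))
```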

장비유형
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB
LCD: 595
LED: 111

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: LCD
2nd row: LCD
3rd row: LCD
4th row: LCD
5th row: LCD

Common Values

Value | Count | Frequency (%)
LCD | 595 | 84.3%
LED | 111 | 15.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
lcd | 595 | 84.3%
led | 111 | 15.7%

장비형태
Categorical

HIGH CORRELATION 

Distinct: 15
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB
독립형: 227
일체형: 220
거치형: 96
부착형: 50
3단12열 단면 거치형: 37
Other values (10): 76

Length

Max length: 12
Median length: 3
Mean length: 4.398017
Min length: 3

Unique

Unique: 2
Unique (%): 0.3%

Sample

1st row: 거치형
2nd row: 거치형
3rd row: 거치형
4th row: 거치형
5th row: 거치형

Common Values

Value | Count | Frequency (%)
독립형 | 227 | 32.2%
일체형 | 220 | 31.2%
거치형 | 96 | 13.6%
부착형 | 50 | 7.1%
3단12열 단면 거치형 | 37 | 5.2%
4단12열 단면 거치형 | 21 | 3.0%
3단6열 단면 알뜰형 | 21 | 3.0%
4단12열 양면 거치형 | 13 | 1.8%
3단10열 단면 독립형 | 11 | 1.6%
4단12열 양면 독립형 | 2 | 0.3%
Other values (5) | 8 | 1.1%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
독립형 | 240 | 25.9%
일체형 | 220 | 23.8%
거치형 | 170 | 18.4%
단면 | 94 | 10.2%
부착형 | 53 | 5.7%
3단12열 | 37 | 4.0%
4단12열 | 36 | 3.9%
알뜰형 | 23 | 2.5%
3단6열 | 21 | 2.3%
양면 | 17 | 1.8%
Other values (3) | 15 | 1.6%
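
The last table above appears to count whitespace-separated tokens rather than whole category values: 독립형 is listed 240 times there but only 227 times under Common Values. The sketch below, which is an illustration and not the profiler's own code, splits the ten listed categories into tokens and sums their counts. The five unlisted categories (8 rows in total) are not available in this report, so some totals come out slightly lower than in the table.

```python
from collections import Counter

# The ten most common 장비형태 values and their counts, from Common Values.
# The remaining 5 categories (8 rows) are not listed in the report and are omitted.
common_values = {
    "독립형": 227,
    "일체형": 220,
    "거치형": 96,
    "부착형": 50,
    "3단12열 단면 거치형": 37,
    "4단12열 단면 거치형": 21,
    "3단6열 단면 알뜰형": 21,
    "4단12열 양면 거치형": 13,
    "3단10열 단면 독립형": 11,
    "4단12열 양면 독립형": 2,
}

# Split each category on whitespace and add its row count to every token.
token_counts = Counter()
for value, count in common_values.items():
    for token in value.split():
        token_counts[token] += count

for token, count in token_counts.most_common():
    print(token, count)
```

Under this reading, 독립형 reaches 240 (227 + 11 + 2) and 4단12열 reaches 36 (21 + 13 + 2), matching the table, while 거치형 reaches 167 rather than 170 because of the unlisted rows.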

Interactions

[Figures: nine pairwise interaction plots]

Correlations

Variable | 연번 | 정류소번호 | 설치연도 | 장비유형 | 장비형태
연번 | 1.000 | 0.785 | 0.953 | 0.450 | 0.781
정류소번호 | 0.785 | 1.000 | 0.485 | 0.244 | 0.544
설치연도 | 0.953 | 0.485 | 1.000 | 0.527 | 0.764
장비유형 | 0.450 | 0.244 | 0.527 | 1.000 | 1.000
장비형태 | 0.781 | 0.544 | 0.764 | 1.000 | 1.000

Variable | 장비유형 | 장비형태
장비유형 | 1.000 | 0.991
장비형태 | 0.991 | 1.000

Variable | 연번 | 정류소번호 | 설치연도 | 장비유형 | 장비형태
연번 | 1.000 | 0.505 | 0.586 | 0.344 | 0.425
정류소번호 | 0.505 | 1.000 | 0.381 | 0.157 | 0.494
설치연도 | 0.586 | 0.381 | 1.000 | 0.426 | 0.408
장비유형 | 0.344 | 0.157 | 0.426 | 1.000 | 0.991
장비형태 | 0.425 | 0.494 | 0.408 | 0.991 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

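First rows

Index | 연번 | 정류소명 | 정류소번호 | 설치연도 | 장비유형 | 장비형태
0 | 1 | 동보아파트 | 25064 | 2015 | LCD | 거치형
1 | 2 | 동남아파트 | 25067 | 2015 | LCD | 거치형
2 | 3 | 계룡2단지 | 25068 | 2019 | LCD | 거치형
3 | 4 | 대림아파트 | 25069 | 2019 | LCD | 거치형
4 | 5 | 신동아1단지.한일아파트 | 25080 | 2019 | LCD | 거치형
5 | 6 | 신동아1단지 | 25081 | 2019 | LCD | 거치형
6 | 7 | 고합아파트.서해고교 | 25090 | 2019 | LCD | 거치형
7 | 8 | 한일아파트 | 25091 | 2019 | LCD | 거치형
8 | 9 | 영남1단지.고합아파트 | 25098 | 2019 | LCD | 거치형
9 | 10 | 고합아파트 | 25099 | 2019 | LCD | 거치형

Last rows

Index | 연번 | 정류소명 | 정류소번호 | 설치연도 | 장비유형 | 장비형태
696 | 697 | 피정의집 | 25651 | 2022 | LCD | 일체형
697 | 698 | 물왕교 | 25550 | 2022 | LCD | 부착형
698 | 699 | 나분들 | 25728 | 2022 | LCD | 일체형
699 | 700 | 경기도노인전문시흥병원 | 25844 | 2022 | LED | 4단12열 단면 거치형
700 | 701 | 새터말 | 25445 | 2022 | LED | 4단12열 단면 거치형
701 | 702 | 새터말 | 25449 | 2022 | LED | 4단12열 단면 거치형
702 | 703 | 장곡문화체육센터 | 25762 | 2022 | LCD | 일체형
703 | 704 | 시흥시청역동원로얄듀크 | 25488 | 2022 | LCD | 일체형
704 | 705 | 시흥시청역동원로얄듀크 | 25492 | 2022 | LCD | 일체형
705 | 706 | LH2단지후문 | 59129 | 2022 | LCD | 일체형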