Overview

Dataset statistics

Number of variables: 5
Number of observations: 1103
Missing cells: 1
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 46.4 KiB
Average record size in memory: 43.1 B
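
The derived figures in this table follow from the raw counts by simple arithmetic. The short sketch below reproduces them; the variable names are illustrative, and reading KiB as 1024 bytes is an assumption rather than something the report states.

```python
# Raw figures copied from the Dataset statistics table.
n_variables = 5
n_observations = 1103
missing_cells = 1
total_size_kib = 46.4  # assumption: 1 KiB = 1024 bytes

# Share of missing cells among all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average memory footprint of one record, in bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.3f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

A single missing cell out of all cells is far below the 0.1% threshold, which is why the report shows "< 0.1%" rather than a rounded value.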

Variable types

Numeric: 2
Categorical: 2
Text: 1

Dataset

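Description: Data on the bus information terminals installed at bus stops. It provides the fields 번호 (number), 동명 (neighborhood name), 정류소ID (stop ID), 정류소 명칭 (stop name), and 설치대수 (number of units installed).
URL: https://www.data.go.kr/data/3075343/fileData.do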

Alerts

번호 is highly overall correlated with 정류소ID and 1 other field (High correlation)
정류소ID is highly overall correlated with 번호 and 1 other field (High correlation)
동명 is highly overall correlated with 번호 and 1 other field (High correlation)
설치대수(대) is highly imbalanced (57.0%) (Imbalance)
번호 has unique values (Unique)
정류소ID has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 23:07:56.788726
Analysis finished: 2023-12-12 23:07:57.684183
Duration: 0.9 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

번호
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 1103
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 552
Minimum: 1
Maximum: 1103
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 9.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 56.1
Q1: 276.5
Median: 552
Q3: 827.5
95-th percentile: 1047.9
Maximum: 1103
Range: 1102
Interquartile range (IQR): 551

Descriptive statistics

Standard deviation: 318.55298
Coefficient of variation (CV): 0.57708873
Kurtosis: -1.2
Mean: 552
Median Absolute Deviation (MAD): 276
Skewness: 0
Sum: 608856
Variance: 101476
Monotonicity: Strictly increasing
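
Because 번호 is reported as unique, strictly increasing, and spanning 1 to 1103 over 1103 rows, its values appear to be the integers 1 through 1103. Under that assumption (the raw column is not reproduced here), the statistics above can be re-derived with the Python standard library. The `percentile` helper is a hypothetical name; it interpolates linearly between neighbouring order statistics, which agrees with the quantiles reported for this column.

```python
import math
import statistics

# Assumption: 번호 is simply the row number 1..1103.
values = list(range(1, 1104))

def percentile(sorted_values, p):
    """Linearly interpolated percentile (p in [0, 100]) of sorted data."""
    position = (len(sorted_values) - 1) * p / 100
    lower = math.floor(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std_dev = math.sqrt(variance)
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

print("sum:", sum(values))
print("mean:", mean)
print("variance:", variance)
print("standard deviation:", round(std_dev, 5))
print("coefficient of variation:", round(std_dev / mean, 8))
print("5-th percentile:", round(percentile(values, 5), 1))
print("Q1, Q3:", percentile(values, 25), percentile(values, 75))
print("MAD:", mad)
```

The reported variance matches the sample (n - 1) form, not the population form, which is worth keeping in mind when comparing against other tools.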
Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
1                     1       0.1%
735                   1       0.1%
741                   1       0.1%
740                   1       0.1%
739                   1       0.1%
738                   1       0.1%
737                   1       0.1%
736                   1       0.1%
734                   1       0.1%
726                   1       0.1%
Other values (1093)   1093    99.1%

Minimum 10 values

Value   Count   Frequency (%)
1       1       0.1%
2       1       0.1%
3       1       0.1%
4       1       0.1%
5       1       0.1%
6       1       0.1%
7       1       0.1%
8       1       0.1%
9       1       0.1%
10      1       0.1%

Maximum 10 values

Value   Count   Frequency (%)
1103    1       0.1%
1102    1       0.1%
1101    1       0.1%
1100    1       0.1%
1099    1       0.1%
1098    1       0.1%
1097    1       0.1%
1096    1       0.1%
1095    1       0.1%
1094    1       0.1%

동명
Categorical

HIGH CORRELATION 

Distinct: 11
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 8.7 KiB
오정동: 179
부천동: 154
성곡동: 150
신중동: 138
상동: 130
Other values (6): 352

Length

Max length: 4
Median length: 3
Mean length: 2.9365367
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 부천동
2nd row: 상동
3rd row: 상동
4th row: 부천동
5th row: 상동

Common Values

Value      Count   Frequency (%)
오정동     179     16.2%
부천동     154     14.0%
성곡동     150     13.6%
신중동     138     12.5%
상동       130     11.8%
범안동     104     9.4%
소사본동   80      7.3%
대산동     80      7.3%
심곡동     60      5.4%
중동       24      2.2%

Length

Histogram of lengths of the category

정류소ID
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 1103
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12068.942
Minimum: 11001
Maximum: 13379
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 9.8 KiB

Quantile statistics

Minimum: 11001
5-th percentile: 11057.1
Q1: 11285.5
Median: 12047
Q3: 13058.5
95-th percentile: 13316.9
Maximum: 13379
Range: 2378
Interquartile range (IQR): 1773

Descriptive statistics

Standard deviation: 833.87374
Coefficient of variation (CV): 0.069092531
Kurtosis: -1.4869511
Mean: 12068.942
Median Absolute Deviation (MAD): 817
Skewness: 0.29777684
Sum: 13312043
Variance: 695345.42
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
11001                 1       0.1%
12467                 1       0.1%
12477                 1       0.1%
12475                 1       0.1%
12474                 1       0.1%
12473                 1       0.1%
12472                 1       0.1%
12468                 1       0.1%
12466                 1       0.1%
12458                 1       0.1%
Other values (1093)   1093    99.1%

Minimum 10 values

Value   Count   Frequency (%)
11001   1       0.1%
11002   1       0.1%
11003   1       0.1%
11004   1       0.1%
11005   1       0.1%
11006   1       0.1%
11007   1       0.1%
11008   1       0.1%
11009   1       0.1%
11010   1       0.1%

Maximum 10 values

Value   Count   Frequency (%)
13379   1       0.1%
13378   1       0.1%
13377   1       0.1%
13376   1       0.1%
13375   1       0.1%
13374   1       0.1%
13373   1       0.1%
13372   1       0.1%
13371   1       0.1%
13370   1       0.1%

정류소 명칭
Text

Distinct: 678
Distinct (%): 61.5%
Missing: 1
Missing (%): 0.1%
Memory size: 8.7 KiB

Length

Max length: 28
Median length: 18
Mean length: 8.5036298
Min length: 3

Characters and Unicode

Total characters: 9371
Distinct characters: 352
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
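
As a rough illustration of how such per-character properties can be tallied, the sketch below counts Unicode general categories for one stop name from this variable's sample rows, using Python's standard `unicodedata` module. The `category_counts` helper and the `CATEGORY_NAMES` mapping are illustrative names, and the mapping covers only the eight categories that appear in this report; this is not necessarily how the profiling tool computes its figures.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Zs": "Space Separator",
}

def category_counts(text):
    """Count the characters of `text` by Unicode general category."""
    codes = (unicodedata.category(ch) for ch in text)
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)

# Stop name taken from the sample rows of this variable.
sample = "e편한세상2차.예원아파트"
for name, count in category_counts(sample).most_common():
    print(f"{name}: {count}")
```

Hangul syllables fall under "Other Letter" because the script has no upper or lower case, which is why that category dominates the tables below.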

Unique

Unique: 287
Unique (%): 26.0%

Sample

1st row: 장미공원앞
2nd row: 삼산체육관역.영상문화단지정문
3rd row: 상동가구
4th row: e편한세상2차.예원아파트
5th row: 벚꽃마을

Words

Value                        Count   Frequency (%)
행정복지센터                 10      0.9%
부천터미널소풍               4       0.4%
행정복지센터.원미어울마당    4       0.4%
신흥시장                     4       0.4%
부천범박힐스테이트5.6단지    4       0.4%
로얄아파트                   4       0.4%
계남고교                     4       0.4%
역곡역북부                   4       0.4%
부천동                       4       0.4%
원미고등학교                 4       0.4%
Other values (672)           1074    95.9%

Most occurring characters

Value                Count   Frequency (%)
.                    419     4.5%
(missing)            292     3.1%
(missing)            241     2.6%
(missing)            235     2.5%
(missing)            224     2.4%
(missing)            213     2.3%
(missing)            210     2.2%
(missing)            210     2.2%
(missing)            206     2.2%
(missing)            194     2.1%
Other values (342)   6927    73.9%

Most occurring categories

Value               Count   Frequency (%)
Other Letter        8595    91.7%
Other Punctuation   419     4.5%
Decimal Number      202     2.2%
Uppercase Letter    86      0.9%
Open Punctuation    22      0.2%
Close Punctuation   22      0.2%
Space Separator     18      0.2%
Lowercase Letter    7       0.1%

Most frequent character per category

Other Letter
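Value                Count   Frequency (%)
(missing)            292     3.4%
(missing)            241     2.8%
(missing)            235     2.7%
(missing)            224     2.6%
(missing)            213     2.5%
(missing)            210     2.4%
(missing)            210     2.4%
(missing)            206     2.4%
(missing)            194     2.3%
(missing)            174     2.0%
Other values (311)   6396    74.4%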
Uppercase Letter
Value              Count   Frequency (%)
C                  15      17.4%
K                  14      16.3%
S                  9       10.5%
T                  8       9.3%
I                  8       9.3%
B                  8       9.3%
H                  4       4.7%
L                  4       4.7%
A                  3       3.5%
W                  2       2.3%
Other values (6)   11      12.8%
Decimal Number
Value   Count   Frequency (%)
1       48      23.8%
3       46      22.8%
2       35      17.3%
5       19      9.4%
4       18      8.9%
6       11      5.4%
0       10      5.0%
7       5       2.5%
8       5       2.5%
9       5       2.5%
Other Punctuation
Value   Count   Frequency (%)
.       419     100.0%
Open Punctuation
Value   Count   Frequency (%)
(       22      100.0%
Close Punctuation
Value   Count   Frequency (%)
)       22      100.0%
Space Separator
Value     Count   Frequency (%)
(space)   18      100.0%
Lowercase Letter
Value   Count   Frequency (%)
e       7       100.0%

Most occurring scripts

Value    Count   Frequency (%)
Hangul   8595    91.7%
Common   683     7.3%
Latin    93      1.0%

Most frequent character per script

Hangul
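Value                Count   Frequency (%)
(missing)            292     3.4%
(missing)            241     2.8%
(missing)            235     2.7%
(missing)            224     2.6%
(missing)            213     2.5%
(missing)            210     2.4%
(missing)            210     2.4%
(missing)            206     2.4%
(missing)            194     2.3%
(missing)            174     2.0%
Other values (311)   6396    74.4%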
Latin
Value              Count   Frequency (%)
C                  15      16.1%
K                  14      15.1%
S                  9       9.7%
T                  8       8.6%
I                  8       8.6%
B                  8       8.6%
e                  7       7.5%
H                  4       4.3%
L                  4       4.3%
A                  3       3.2%
Other values (7)   13      14.0%
Common
Value              Count   Frequency (%)
.                  419     61.3%
1                  48      7.0%
3                  46      6.7%
2                  35      5.1%
(                  22      3.2%
)                  22      3.2%
5                  19      2.8%
(space)            18      2.6%
4                  18      2.6%
6                  11      1.6%
Other values (4)   25      3.7%

Most occurring blocks

Value    Count   Frequency (%)
Hangul   8595    91.7%
ASCII    776     8.3%

Most frequent character per block

ASCII
Value               Count   Frequency (%)
.                   419     54.0%
1                   48      6.2%
3                   46      5.9%
2                   35      4.5%
(                   22      2.8%
)                   22      2.8%
5                   19      2.4%
(space)             18      2.3%
4                   18      2.3%
C                   15      1.9%
Other values (21)   114     14.7%
Hangul
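Value                Count   Frequency (%)
(missing)            292     3.4%
(missing)            241     2.8%
(missing)            235     2.7%
(missing)            224     2.6%
(missing)            213     2.5%
(missing)            210     2.4%
(missing)            210     2.4%
(missing)            206     2.4%
(missing)            194     2.3%
(missing)            174     2.0%
Other values (311)   6396    74.4%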

설치대수(대)
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 8.7 KiB
1: 862
0: 222
2: 13
<NA>: 6

Length

Max length: 4
Median length: 1
Mean length: 1.0163191
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 0
5th row: 1

Common Values

Value   Count   Frequency (%)
1       862     78.2%
0       222     20.1%
2       13      1.2%
<NA>    6       0.5%
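
The 57.0% imbalance reported in the alert for this variable can be reproduced from the four counts in the Common Values table. The sketch below assumes the score is one minus the Shannon entropy of the class shares divided by the maximum entropy for that number of classes. That definition is an assumption here, although it agrees with the reported figure, and `imbalance_score` is an illustrative name rather than a library function.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class shares."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# Counts from the Common Values table: values 1, 0, 2 and <NA>.
counts = [862, 222, 13, 6]
print(f"imbalance: {imbalance_score(counts):.1%}")
```

A perfectly balanced variable scores 0 under this definition and a variable dominated by a single class approaches 1, so 57.0% reflects the 78.2% share held by the value 1.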

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of the common values; image not preserved]

Interactions

[Four interaction plots; images not preserved]

Correlations

               번호     동명     정류소ID   설치대수(대)
번호           1.000    0.939    0.940      0.418
동명           0.939    1.000    0.840      0.290
정류소ID       0.940    0.840    1.000      0.341
설치대수(대)   0.418    0.290    0.341      1.000

               동명     설치대수(대)
동명           1.000    0.180
설치대수(대)   0.180    1.000

               번호     정류소ID   동명     설치대수(대)
번호           1.000    1.000      0.588    0.276
정류소ID       1.000    1.000      0.610    0.230
동명           0.588    0.610      1.000    0.180
설치대수(대)   0.276    0.230      0.180    1.000
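
Both 번호 and 정류소ID are reported as strictly increasing, so sorting the rows by one column also sorts them by the other, and a rank-based coefficient such as Spearman's rho between the two would be exactly 1. The report text does not say which coefficient each matrix holds, so the sketch below only illustrates the monotonic relationship behind the High correlation alerts. It uses a few (번호, 정류소ID) pairs from the Sample section, and `ranks` and `spearman` are illustrative helpers that assume no ties.

```python
def ranks(values):
    """Rank of each value (1 = smallest); assumes all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman(xs, ys):
    """Spearman's rho via the no-ties formula 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))

# (번호, 정류소ID) pairs from the first and last sample rows.
number = [1, 2, 3, 10, 1094, 1103]
stop_id = [11001, 11002, 11003, 11010, 13370, 13379]

print(spearman(number, stop_id))        # both columns increase together
print(spearman(number, stop_id[::-1]))  # same values in reversed order
```

Since 번호 looks like a plain row counter, its high correlation with 정류소ID mostly reflects the file being sorted by stop ID rather than any relationship between stops.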

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

       번호   동명     정류소ID   정류소 명칭                      설치대수(대)
0      1      부천동   11001      장미공원앞                       1
1      2      상동     11002      삼산체육관역.영상문화단지정문    1
2      3      상동     11003      상동가구                         1
3      4      부천동   11004      e편한세상2차.예원아파트          0
4      5      상동     11005      벚꽃마을                         1
5      6      상동     11006      벚꽃마을                         1
6      7      상동     11007      삼산체육관역.상동호수공원        0
7      8      상동     11008      상동고                           1
8      9      상동     11009      병원앞                           1
9      10     상동     11010      상동고                           1

Last rows

       번호   동명     정류소ID   정류소 명칭                      설치대수(대)
1093   1094   오정동   13370      동양철강                         0
1094   1095   오정동   13371      동양철강                         0
1095   1096   오정동   13372      부천로지스틱스파크               1
1096   1097   오정동   13373      부천로지스틱스파크               0
1097   1098   오정동   13374      덕산중학교.오정대공원            0
1098   1099   오정동   13375      덕산중학교.오정대공원            0
1099   1100   성곡동   13376      월산장미아파트                   0
1100   1101   오정동   13377      월산장미아파트                   0
1101   1102   오정동   13378      원종e편한세상아파트              0
1102   1103   성곡동   13379      원종e편한세상아파트              0