Overview

Dataset statistics

Number of variables: 4
Number of observations: 298
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 10.0 KiB
Average record size in memory: 34.4 B

Variable types

Categorical: 1
Text: 1
Numeric: 2

Dataset

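Description: 부산교통공사_소음측정정보_20201231 (Busan Transportation Corporation, noise measurement information, 2020-12-31)
Author: 부산교통공사 (Busan Transportation Corporation)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15083212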

Alerts

High correlation: 평균(Leq) is highly correlated overall with 최대(Lmax)
High correlation: 최대(Lmax) is highly correlated overall with 평균(Leq)

Reproduction

Analysis started: 2023-12-10 17:30:37.219433
Analysis finished: 2023-12-10 17:30:39.209427
Duration: 1.99 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

호선 (subway line)
Categorical

Distinct: 5
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Category counts:
2호선: 84
1호선(신차): 78
1호선(구차): 78
3호선: 32
4호선: 26

Length

Max length: 7
Median length: 7
Mean length: 5.0939597
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1호선(신차)
2nd row: 1호선(신차)
3rd row: 1호선(신차)
4th row: 1호선(신차)
5th row: 1호선(신차)

Common Values

Value         Count   Frequency (%)
2호선          84      28.2%
1호선(신차)     78      26.2%
1호선(구차)     78      26.2%
3호선          32      10.7%
4호선          26      8.7%
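
The percentages in the common-values table follow directly from the counts and the 298 observations. A minimal sketch of that arithmetic, assuming the counts are typed in by hand (the names `line_counts` and `frequency_percentages` are illustrative and not part of the report):

```python
# Category counts copied from the common-values table for 호선.
line_counts = {
    "2호선": 84,
    "1호선(신차)": 78,
    "1호선(구차)": 78,
    "3호선": 32,
    "4호선": 26,
}


def frequency_percentages(counts):
    """Return each category's share of all observations, in percent (one decimal)."""
    total = sum(counts.values())
    return {value: round(100 * count / total, 1) for value, count in counts.items()}


shares = frequency_percentages(line_counts)
for value, share in shares.items():
    print(f"{value}: {share}%")
```

The counts sum to 298, which matches the number of observations in the overview, so no category is hidden behind an "Other values" row for this variable.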

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values of 호선 with their counts and frequencies: 2호선 84 (28.2%), 1호선(신차) 78 (26.2%), 1호선(구차) 78 (26.2%), 3호선 32 (10.7%), 4호선 26 (8.7%)]

구간 (section between stations)
Text

Distinct: 222
Distinct (%): 74.5%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Length

Max length: 14
Median length: 13
Mean length: 6.1308725
Min length: 5

Characters and Unicode

Total characters: 1827
Distinct characters: 135
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
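
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The report's labels "Other Letter", "Math Symbol", and "Space Separator" correspond to the category codes `Lo`, `Sm`, and `Zs`. The sketch below is only an approximation of what the profiler does, run on the five sample values of 구간; the helper name `general_category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter


def general_category_counts(texts):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for text in texts:
        for char in text:
            counts[unicodedata.category(char)] += 1
    return counts


# The five sample values of 구간 shown in the report.
sample_sections = ["노포→범어사", "범어사→남산", "남산→두실", "두실→구서", "구서→장전"]

counts = general_category_counts(sample_sections)
for category, count in counts.most_common():
    print(category, count)
```

Script and block lookups (Hangul, Common, Arrows, ASCII) are not available in `unicodedata`, so this sketch covers general categories only.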

Unique

Unique: 146
Unique (%): 49.0%

Sample

1st row: 노포→범어사
2nd row: 범어사→남산
3rd row: 남산→두실
4th row: 두실→구서
5th row: 구서→장전

Value                 Count   Frequency (%)
노포→범어사             2       0.7%
동대신→토성             2       0.7%
부산역→초량             2       0.7%
서대신→동대신           2       0.7%
괴정→대티               2       0.7%
사하→괴정               2       0.7%
당리→사하               2       0.7%
하단→당리               2       0.7%
신평→하단               2       0.7%
동매→신평               2       0.7%
Other values (212)    278     93.3%

Most occurring characters

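Value                     Count   Frequency (%)
→                         298     16.3%
(Hangul, not preserved)   92      5.0%
(Hangul, not preserved)   90      4.9%
(Hangul, not preserved)   58      3.2%
(Hangul, not preserved)   52      2.8%
(Hangul, not preserved)   48      2.6%
(space)                   41      2.2%
(Hangul, not preserved)   40      2.2%
(Hangul, not preserved)   38      2.1%
(Hangul, not preserved)   36      2.0%
Other values (125)        1034    56.6%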

Most occurring categories

Value             Count   Frequency (%)
Other Letter      1488    81.4%
Math Symbol       298     16.3%
Space Separator   41      2.2%

Most frequent character per category

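Other Letter
Value                     Count   Frequency (%)
(Hangul, not preserved)   92      6.2%
(Hangul, not preserved)   90      6.0%
(Hangul, not preserved)   58      3.9%
(Hangul, not preserved)   52      3.5%
(Hangul, not preserved)   48      3.2%
(Hangul, not preserved)   40      2.7%
(Hangul, not preserved)   38      2.6%
(Hangul, not preserved)   36      2.4%
(Hangul, not preserved)   32      2.2%
(Hangul, not preserved)   32      2.2%
Other values (123)        970     65.2%

Math Symbol
Value   Count   Frequency (%)
→       298     100.0%

Space Separator
Value     Count   Frequency (%)
(space)   41      100.0%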

Most occurring scripts

Value    Count   Frequency (%)
Hangul   1488    81.4%
Common   339     18.6%

Most frequent character per script

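Hangul
Value                     Count   Frequency (%)
(Hangul, not preserved)   92      6.2%
(Hangul, not preserved)   90      6.0%
(Hangul, not preserved)   58      3.9%
(Hangul, not preserved)   52      3.5%
(Hangul, not preserved)   48      3.2%
(Hangul, not preserved)   40      2.7%
(Hangul, not preserved)   38      2.6%
(Hangul, not preserved)   36      2.4%
(Hangul, not preserved)   32      2.2%
(Hangul, not preserved)   32      2.2%
Other values (123)        970     65.2%

Common
Value     Count   Frequency (%)
→         298     87.9%
(space)   41      12.1%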

Most occurring blocks

Value    Count   Frequency (%)
Hangul   1488    81.4%
Arrows   298     16.3%
ASCII    41      2.2%

Most frequent character per block

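Arrows
Value   Count   Frequency (%)
→       298     100.0%

Hangul
Value                     Count   Frequency (%)
(Hangul, not preserved)   92      6.2%
(Hangul, not preserved)   90      6.0%
(Hangul, not preserved)   58      3.9%
(Hangul, not preserved)   52      3.5%
(Hangul, not preserved)   48      3.2%
(Hangul, not preserved)   40      2.7%
(Hangul, not preserved)   38      2.6%
(Hangul, not preserved)   36      2.4%
(Hangul, not preserved)   32      2.2%
(Hangul, not preserved)   32      2.2%
Other values (123)        970     65.2%

ASCII
Value     Count   Frequency (%)
(space)   41      100.0%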

평균(Leq) (average noise level, Leq)
Real number (ℝ)

Alert: high correlation

Distinct: 135
Distinct (%): 45.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 69.493624
Minimum: 60.1
Maximum: 78.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 60.1
5th percentile: 62.67
Q1: 67.025
Median: 69.4
Q3: 72.175
95th percentile: 75.93
Maximum: 78.9
Range: 18.8
Interquartile range (IQR): 5.15

Descriptive statistics

Standard deviation: 3.8446333
Coefficient of variation (CV): 0.05532354
Kurtosis: -0.26633098
Mean: 69.493624
Median Absolute Deviation (MAD): 2.6
Skewness: -0.08313492
Sum: 20709.1
Variance: 14.781205
Monotonicity: Not monotonic
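
Several of the figures reported for 평균(Leq) are derived from others, which makes for a quick consistency check. The sketch below recomputes them from the reported values only, using the usual definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1, range = maximum - minimum, mean = sum / count). The variable names are illustrative, and the raw data are not used.

```python
# Values copied from the report for 평균(Leq).
n_observations = 298
total = 20709.1
std_dev = 3.8446333
minimum, maximum = 60.1, 78.9
q1, q3 = 67.025, 72.175

# Derived statistics, following the standard definitions.
mean = total / n_observations
coefficient_of_variation = std_dev / mean
variance = std_dev ** 2
interquartile_range = q3 - q1
value_range = maximum - minimum

print(f"mean     = {mean:.6f}")
print(f"CV       = {coefficient_of_variation:.8f}")
print(f"variance = {variance:.6f}")
print(f"IQR      = {interquartile_range:.2f}")
print(f"range    = {value_range:.1f}")
```

Each recomputed value agrees with the report up to rounding, so the block is internally consistent.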
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value                Count   Frequency (%)
67.5                 7       2.3%
69.8                 6       2.0%
68.6                 5       1.7%
66.4                 5       1.7%
69.1                 5       1.7%
71.8                 5       1.7%
67.6                 5       1.7%
67.8                 5       1.7%
70.9                 5       1.7%
69.3                 5       1.7%
Other values (125)   245     82.2%

Minimum 10 values

Value   Count   Frequency (%)
60.1    1       0.3%
60.5    2       0.7%
60.6    1       0.3%
60.9    1       0.3%
61.0    1       0.3%
61.1    1       0.3%
61.3    1       0.3%
61.4    1       0.3%
61.9    2       0.7%
62.2    1       0.3%

Maximum 10 values

Value   Count   Frequency (%)
78.9    1       0.3%
78.8    1       0.3%
78.3    2       0.7%
77.5    1       0.3%
77.0    1       0.3%
76.9    1       0.3%
76.7    1       0.3%
76.4    1       0.3%
76.3    3       1.0%
76.2    1       0.3%

최대(Lmax) (maximum noise level, Lmax)
Real number (ℝ)

Alert: high correlation

Distinct: 145
Distinct (%): 48.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 76.186913
Minimum: 64.5
Maximum: 89.3
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 64.5
5th percentile: 69
Q1: 72.825
Median: 75.8
Q3: 79.075
95th percentile: 84.9
Maximum: 89.3
Range: 24.8
Interquartile range (IQR): 6.25

Descriptive statistics

Standard deviation: 4.8022999
Coefficient of variation (CV): 0.063033134
Kurtosis: -0.14882328
Mean: 76.186913
Median Absolute Deviation (MAD): 3.05
Skewness: 0.30632769
Sum: 22703.7
Variance: 23.062084
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value                Count   Frequency (%)
73.0                 7       2.3%
76.1                 6       2.0%
73.1                 6       2.0%
71.7                 5       1.7%
72.2                 5       1.7%
77.8                 5       1.7%
74.3                 5       1.7%
76.6                 5       1.7%
77.7                 4       1.3%
74.5                 4       1.3%
Other values (135)   246     82.6%

Minimum 10 values

Value   Count   Frequency (%)
64.5    1       0.3%
64.6    1       0.3%
65.3    1       0.3%
65.6    1       0.3%
65.7    1       0.3%
66.4    1       0.3%
67.2    2       0.7%
67.3    1       0.3%
68.4    2       0.7%
68.6    2       0.7%

Maximum 10 values

Value   Count   Frequency (%)
89.3    1       0.3%
88.2    1       0.3%
88.1    2       0.7%
88.0    1       0.3%
86.6    1       0.3%
86.5    1       0.3%
86.1    1       0.3%
85.8    1       0.3%
85.6    3       1.0%
85.2    1       0.3%

Interactions

[Figures: four interaction plots for the numeric variables 평균(Leq) and 최대(Lmax)]

Correlations

[Figure: correlation heatmap]

             호선     평균(Leq)   최대(Lmax)
호선          1.000   0.765      0.718
평균(Leq)     0.765   1.000      0.915
최대(Lmax)    0.718   0.915      1.000

[Figure: correlation heatmap]

             평균(Leq)   최대(Lmax)   호선
평균(Leq)     1.000      0.908       0.417
최대(Lmax)    0.908      1.000       0.375
호선          0.417      0.375       1.000
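
The report does not say which coefficient each matrix uses. Purely to illustrate how a coefficient between the two numeric columns is obtained, the sketch below computes a plain Pearson correlation on the first ten rows listed in the report's Sample section. Because it uses ten rows rather than all 298, it is not expected to reproduce 0.915 or 0.908; the function name `pearson` is illustrative.

```python
import math

# First ten rows of the report's Sample section: (평균(Leq), 최대(Lmax)).
sample_rows = [
    (63.6, 71.4), (65.6, 73.3), (65.7, 74.4), (60.6, 65.3), (64.4, 74.8),
    (64.5, 73.1), (60.5, 64.5), (61.0, 68.6), (61.3, 71.9), (62.5, 69.6),
]


def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    spread_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    spread_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return covariance / math.sqrt(spread_x * spread_y)


r = pearson(sample_rows)
print(f"Pearson r on the ten sample rows: {r:.3f}")
```

Even on this small subset the coefficient is strongly positive, which is consistent with the high-correlation alert raised for the two columns.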

Missing values

[Figure: count of present values per column]
A simple visualization of nullity by column.
[Figure: nullity matrix]
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

First rows

     호선          구간               평균(Leq)   최대(Lmax)
0    1호선(신차)    노포→범어사        63.6       71.4
1    1호선(신차)    범어사→남산        65.6       73.3
2    1호선(신차)    남산→두실          65.7       74.4
3    1호선(신차)    두실→구서          60.6       65.3
4    1호선(신차)    구서→장전          64.4       74.8
5    1호선(신차)    장전→부산대        64.5       73.1
6    1호선(신차)    부산대→온천장      60.5       64.5
7    1호선(신차)    온천장→명륜        61.0       68.6
8    1호선(신차)    명륜동→동래        61.3       71.9
9    1호선(신차)    동래→교대          62.5       69.6

Last rows

     호선     구간                    평균(Leq)   최대(Lmax)
288  4호선    낙민→충렬사             66.4       69.8
289  4호선    충렬사→명장             65.4       70.3
290  4호선    명장→서동               67.4       71.7
291  4호선    서동→금사               67.4       72.6
292  4호선    금사→반여농산물시장      69.5       73.7
293  4호선    반여농산물시장→석대      67.7       72.8
294  4호선    석대→영산대             67.6       71.5
295  4호선    영산대→동부산대학        67.5       70.6
296  4호선    동부산대학→고촌          65.5       71.2
297  4호선    고촌→안평               66.3       71.0