Overview

Dataset statistics

Number of variables: 4
Number of observations: 298
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 10.0 KiB
Average record size in memory: 34.4 B
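
As a quick consistency check on the figures above, the average record size can be recovered by dividing the total memory by the number of observations. This is a minimal sketch that assumes the reported 10.0 KiB is a rounded value and that 1 KiB = 1024 bytes; the variable names are illustrative only.

```python
KIB = 1024  # bytes per KiB

total_memory_kib = 10.0   # "Total size in memory", as reported (rounded)
n_observations = 298      # "Number of observations"
n_variables = 4           # "Number of variables"

# Average record size in bytes = total bytes / number of rows.
avg_record_bytes = total_memory_kib * KIB / n_observations

# Total number of cells, the denominator of "Missing cells (%)".
n_cells = n_observations * n_variables

print(f"{avg_record_bytes:.1f} B per record, {n_cells} cells")
```

Because the input is rounded, the result should only be expected to agree with the report to about one decimal place.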

Variable types

Categorical: 1
Text: 1
Numeric: 2

Dataset

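Description: 부산교통공사_소음측정정보_20220630 (Busan Transportation Corporation noise measurement information, 2022-06-30)
Author: 부산교통공사 (Busan Transportation Corporation)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15083212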

Alerts

High correlation: 평균(Leq) is highly correlated overall with 최대(Lmax)
High correlation: 최대(Lmax) is highly correlated overall with 평균(Leq)

Reproduction

Analysis started: 2023-12-10 17:30:29.127880
Analysis finished: 2023-12-10 17:30:31.074001
Duration: 1.95 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

호선 (subway line)
Categorical

Distinct: 5
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Category counts (신차 = new cars, 구차 = old cars):
2호선: 84
1호선(신차): 78
1호선(구차): 78
3호선: 32
4호선: 26

Length

Max length: 7
Median length: 7
Mean length: 5.0939597
Min length: 3
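
The length statistics above follow directly from the category counts reported for 호선. The sketch below recomputes them with the standard library; it assumes that "length" means the number of Unicode code points in each label, which is what Python's `len` returns.

```python
from statistics import mean, median

# Category counts as listed in the report for the 호선 variable.
counts = {"2호선": 84, "1호선(신차)": 78, "1호선(구차)": 78, "3호선": 32, "4호선": 26}

# Expand to one label length per observation (298 in total).
lengths = [len(label) for label, n in counts.items() for _ in range(n)]

length_stats = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": mean(lengths),
    "min": min(lengths),
}
print(length_stats)
```

The median equals the maximum because more than half of the rows (156 of 298) carry one of the two seven-character 1호선 labels.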

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1호선(신차)
2nd row: 1호선(신차)
3rd row: 1호선(신차)
4th row: 1호선(신차)
5th row: 1호선(신차)

Common Values

Value        Count  Frequency (%)
2호선        84     28.2%
1호선(신차)  78     26.2%
1호선(구차)  78     26.2%
3호선        32     10.7%
4호선        26     8.7%
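
The frequency column is each count divided by the 298 observations, rounded to one decimal place. A minimal sketch of that arithmetic, using only the counts reported above (the variable names are illustrative):

```python
# Counts for the 호선 variable, as reported in the Common Values table.
counts = {"2호선": 84, "1호선(신차)": 78, "1호선(구차)": 78, "3호선": 32, "4호선": 26}

total = sum(counts.values())  # should equal the number of observations

# Share of each category as a percentage, rounded like the report.
frequency_pct = {label: round(100 * n / total, 1) for label, n in counts.items()}

# "Distinct (%)" is the number of distinct categories over the row count.
distinct_pct = round(100 * len(counts) / total, 1)

print(total, frequency_pct, distinct_pct)
```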

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed in the Common Values table above]

구간 (section between stations)
Text

Distinct: 222
Distinct (%): 74.5%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Length

Max length: 13
Median length: 12.5
Mean length: 5.9932886
Min length: 5

Characters and Unicode

Total characters: 1786
Distinct characters: 134
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
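
To illustrate what these character properties look like in practice, the sketch below classifies the characters of the first five 구간 sample values by Unicode general category using Python's standard `unicodedata` module. It is not the profiler's own implementation, only an example of the same idea; script and block lookups are omitted because the standard library does not expose them.

```python
import unicodedata
from collections import Counter

# First five 구간 values from the report's sample rows.
samples = ["노포→범어사", "범어사→남산", "남산→두실", "두실→구서", "구서→장전"]


def category_counts(values):
    """Count characters by Unicode general category (e.g. 'Lo', 'Sm')."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


counts = category_counts(samples)
arrow_name = unicodedata.name("→")

# 'Lo' is "Other Letter" (Hangul syllables); 'Sm' is "Math Symbol" (the arrow).
print(counts, arrow_name)
```

Each value contains exactly one arrow, which is consistent with the report counting 298 Math Symbol characters across 298 rows.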

Unique

Unique: 146
Unique (%): 49.0%

Sample

1st row: 노포→범어사
2nd row: 범어사→남산
3rd row: 남산→두실
4th row: 두실→구서
5th row: 구서→장전

Value               Count  Frequency (%)
노포→범어사         2      0.7%
동대신→토성         2      0.7%
부산역→초량         2      0.7%
서대신→동대신       2      0.7%
괴정→대티           2      0.7%
사하→괴정           2      0.7%
당리→사하           2      0.7%
하단→당리           2      0.7%
신평→하단           2      0.7%
동매→신평           2      0.7%
Other values (212)  278    93.3%

Most occurring characters

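Value               Count  Frequency (%)
→                   298    16.7%
(not preserved)     92     5.2%
(not preserved)     90     5.0%
(not preserved)     58     3.2%
(not preserved)     52     2.9%
(not preserved)     48     2.7%
(not preserved)     40     2.2%
(not preserved)     38     2.1%
(not preserved)     36     2.0%
(not preserved)     32     1.8%
Other values (124)  1002   56.1%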

Most occurring categories

Value         Count  Frequency (%)
Other Letter  1488   83.3%
Math Symbol   298    16.7%

Most frequent character per category

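Other Letter
Value               Count  Frequency (%)
(not preserved)     92     6.2%
(not preserved)     90     6.0%
(not preserved)     58     3.9%
(not preserved)     52     3.5%
(not preserved)     48     3.2%
(not preserved)     40     2.7%
(not preserved)     38     2.6%
(not preserved)     36     2.4%
(not preserved)     32     2.2%
(not preserved)     32     2.2%
Other values (123)  970    65.2%

Math Symbol
Value  Count  Frequency (%)
→      298    100.0%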

Most occurring scripts

Value   Count  Frequency (%)
Hangul  1488   83.3%
Common  298    16.7%

Most frequent character per script

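Hangul
Value               Count  Frequency (%)
(not preserved)     92     6.2%
(not preserved)     90     6.0%
(not preserved)     58     3.9%
(not preserved)     52     3.5%
(not preserved)     48     3.2%
(not preserved)     40     2.7%
(not preserved)     38     2.6%
(not preserved)     36     2.4%
(not preserved)     32     2.2%
(not preserved)     32     2.2%
Other values (123)  970    65.2%

Common
Value  Count  Frequency (%)
→      298    100.0%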

Most occurring blocks

Value   Count  Frequency (%)
Hangul  1488   83.3%
Arrows  298    16.7%

Most frequent character per block

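Arrows
Value  Count  Frequency (%)
→      298    100.0%

Hangul
Value               Count  Frequency (%)
(not preserved)     92     6.2%
(not preserved)     90     6.0%
(not preserved)     58     3.9%
(not preserved)     52     3.5%
(not preserved)     48     3.2%
(not preserved)     40     2.7%
(not preserved)     38     2.6%
(not preserved)     36     2.4%
(not preserved)     32     2.2%
(not preserved)     32     2.2%
Other values (123)  970    65.2%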

평균(Leq) (average)
Real number (ℝ)

HIGH CORRELATION

Distinct: 23
Distinct (%): 7.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 69.479866
Minimum: 55
Maximum: 81
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 55
5th percentile: 61
Q1: 66
Median: 70
Q3: 72
95th percentile: 76
Maximum: 81
Range: 26
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 4.4549292
Coefficient of variation (CV): 0.064118277
Kurtosis: -0.099028566
Mean: 69.479866
Median Absolute Deviation (MAD): 3
Skewness: -0.39127158
Sum: 20705
Variance: 19.846395
Monotonicity: Not monotonic
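
The mean, variance, standard deviation and coefficient of variation above can be recomputed from the value counts that the report lists for this variable. The sketch below is a hedged reconstruction rather than the profiler's own code: the count for the value 67 does not appear in any of the report's frequency tables, so 14 is inferred here as the single remaining value and count that make the totals match 298 observations and a sum of 20705. The variance divides by n - 1 (sample variance), which is the convention that reproduces the reported figure.

```python
from math import sqrt

# Value -> count for 평균(Leq), taken from the report's frequency tables.
# ASSUMPTION: the entry 67: 14 is not listed in the report; it is inferred
# so that the counts total 298 and the values sum to 20705.
leq_counts = {
    55: 1, 56: 1, 59: 3, 60: 2, 61: 9, 62: 6, 63: 9, 64: 14, 65: 13,
    66: 20, 67: 14, 68: 15, 69: 21, 70: 38, 71: 38, 72: 22, 73: 14,
    74: 13, 75: 20, 76: 17, 77: 4, 78: 3, 81: 1,
}

n = sum(leq_counts.values())                        # number of observations
total = sum(v * c for v, c in leq_counts.items())   # "Sum"
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1.
variance = sum(c * (v - mean) ** 2 for v, c in leq_counts.items()) / (n - 1)
std = sqrt(variance)
cv = std / mean  # coefficient of variation

print(n, total, mean, variance, std, cv)
```

The range and IQR need no frequency table: they are simply 81 - 55 = 26 and 72 - 66 = 6.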
[Figure: histogram with fixed-size bins (bins=23)]

Common values

Value              Count  Frequency (%)
71                 38     12.8%
70                 38     12.8%
72                 22     7.4%
69                 21     7.0%
75                 20     6.7%
66                 20     6.7%
76                 17     5.7%
68                 15     5.0%
73                 14     4.7%
64                 14     4.7%
Other values (13)  79     26.5%

Minimum 10 values

Value  Count  Frequency (%)
55     1      0.3%
56     1      0.3%
59     3      1.0%
60     2      0.7%
61     9      3.0%
62     6      2.0%
63     9      3.0%
64     14     4.7%
65     13     4.4%
66     20     6.7%

Maximum 10 values

Value  Count  Frequency (%)
81     1      0.3%
78     3      1.0%
77     4      1.3%
76     17     5.7%
75     20     6.7%
74     13     4.4%
73     14     4.7%
72     22     7.4%
71     38     12.8%
70     38     12.8%

최대(Lmax) (maximum)
Real number (ℝ)

HIGH CORRELATION

Distinct: 27
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 76.315436
Minimum: 59
Maximum: 89
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 59
5th percentile: 68
Q1: 73
Median: 76
Q3: 79
95th percentile: 85
Maximum: 89
Range: 30
Interquartile range (IQR): 6
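
The quantile statistics above can be reproduced from the value counts the report lists for 최대(Lmax). This sketch assumes quantiles are computed by linear interpolation between neighbouring ranks (the default in pandas and NumPy); the helper `quantile` is written here for illustration. The count for the value 79 is not listed in the report's frequency tables, so 9 is inferred as the one remaining value and count that make the totals match 298 observations and a sum of 22742.

```python
def quantile(sorted_values, q):
    """Quantile of pre-sorted data using linear interpolation between ranks."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * frac


# Value -> count for 최대(Lmax), taken from the report's frequency tables.
# ASSUMPTION: the entry 79: 9 is inferred, not listed in the report.
lmax_counts = {
    59: 1, 63: 1, 65: 1, 66: 3, 67: 2, 68: 9, 69: 7, 70: 8, 71: 17,
    72: 8, 73: 29, 74: 24, 75: 30, 76: 33, 77: 24, 78: 19, 79: 9,
    80: 8, 81: 7, 82: 10, 83: 12, 84: 12, 85: 11, 86: 8, 87: 2,
    88: 2, 89: 1,
}

# Expand the frequency table into one sorted value per observation.
values = sorted(v for v, c in lmax_counts.items() for _ in range(c))

quantiles = {q: quantile(values, q) for q in (0.0, 0.05, 0.25, 0.5, 0.75, 0.95, 1.0)}
iqr = quantiles[0.75] - quantiles[0.25]
value_range = quantiles[1.0] - quantiles[0.0]

print(quantiles, iqr, value_range)
```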

Descriptive statistics

Standard deviation: 5.1077115
Coefficient of variation (CV): 0.066928944
Kurtosis: -0.066983406
Mean: 76.315436
Median Absolute Deviation (MAD): 3
Skewness: 0.15111188
Sum: 22742
Variance: 26.088717
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=27)]

Common values

Value              Count  Frequency (%)
76                 33     11.1%
75                 30     10.1%
73                 29     9.7%
74                 24     8.1%
77                 24     8.1%
78                 19     6.4%
71                 17     5.7%
84                 12     4.0%
83                 12     4.0%
85                 11     3.7%
Other values (17)  87     29.2%

Minimum 10 values

Value  Count  Frequency (%)
59     1      0.3%
63     1      0.3%
65     1      0.3%
66     3      1.0%
67     2      0.7%
68     9      3.0%
69     7      2.3%
70     8      2.7%
71     17     5.7%
72     8      2.7%

Maximum 10 values

Value  Count  Frequency (%)
89     1      0.3%
88     2      0.7%
87     2      0.7%
86     8      2.7%
85     11     3.7%
84     12     4.0%
83     12     4.0%
82     10     3.4%
81     7      2.3%
80     8      2.7%

Interactions

[Figures: four interaction plots for the numeric variables 평균(Leq) and 최대(Lmax)]

Correlations

[Figure: correlation heatmap]

            호선   평균(Leq)  최대(Lmax)
호선        1.000  0.786      0.694
평균(Leq)   0.786  1.000      0.942
최대(Lmax)  0.694  0.942      1.000

[Figure: correlation heatmap]

            평균(Leq)  최대(Lmax)  호선
평균(Leq)   1.000      0.923       0.438
최대(Lmax)  0.923      1.000       0.356
호선        0.438      0.356       1.000
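
For readers who want to see how a coefficient like those above is obtained, the sketch below computes a Pearson correlation between 평균(Leq) and 최대(Lmax) on the first ten sample rows only. The report does not state which correlation measure each matrix uses, so Pearson is an assumption chosen for illustration, and a ten-row subset will not reproduce the full-dataset values shown in the matrices.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# 평균(Leq) and 최대(Lmax) for rows 0-9 of the report's sample table.
leq = [63, 63, 62, 61, 62, 64, 61, 60, 60, 61]
lmax = [74, 70, 68, 67, 69, 76, 71, 68, 66, 69]

r = pearson(leq, lmax)
print(f"Pearson r on the 10-row sample: {r:.3f}")
```

Even on this small subset the two measurements move together strongly, which is in line with the High correlation alert raised for the pair.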

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

     호선         구간                 평균(Leq)  최대(Lmax)
0    1호선(신차)  노포→범어사          63         74
1    1호선(신차)  범어사→남산          63         70
2    1호선(신차)  남산→두실            62         68
3    1호선(신차)  두실→구서            61         67
4    1호선(신차)  구서→장전            62         69
5    1호선(신차)  장전→부산대          64         76
6    1호선(신차)  부산대→온천장        61         71
7    1호선(신차)  온천장→명륜          60         68
8    1호선(신차)  명륜동→동래          60         66
9    1호선(신차)  동래→교대            61         69

Last rows

     호선         구간                 평균(Leq)  최대(Lmax)
288  4호선        낙민→충렬사          69         76
289  4호선        충렬사→명장          70         75
290  4호선        명장→서동            70         76
291  4호선        서동→금사            70         76
292  4호선        금사→반여농산물시장  73         78
293  4호선        반여농산물시장→석대  70         74
294  4호선        석대→영산대          70         76
295  4호선        영산대→동부산대학    69         75
296  4호선        동부산대학→고촌      67         74
297  4호선        고촌→안평            67         73