Overview

Dataset statistics

Number of variables: 6
Number of observations: 42
Missing cells: 21
Missing cells (%): 8.3%
Duplicate rows: 1
Duplicate rows (%): 2.4%
Total size in memory: 2.3 KiB
Average record size in memory: 55.1 B
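
The two percentages above follow from the raw counts. A minimal sketch of the arithmetic, assuming the missing-cell rate is taken over all cells (observations × variables) and the duplicate rate over observations; the variable names are illustrative only.

```python
# Counts copied from the dataset statistics above.
n_variables = 6
n_observations = 42
missing_cells = 21
duplicate_rows = 1

total_cells = n_variables * n_observations            # 252 cells in total
missing_pct = 100 * missing_cells / total_cells       # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_observations  # share of duplicate rows

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```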

Variable types

Text: 1
Categorical: 1
Numeric: 4

Dataset

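Description: 부산광역시해운대구_이륜차등록현황_20230222 (Busan Metropolitan City, Haeundae-gu: two-wheeled vehicle registration status, 2023-02-22)
Author: 부산광역시 해운대구 (Haeundae-gu, Busan Metropolitan City)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=3075710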

Alerts

Dataset has 1 (2.4%) duplicate row [Duplicates]
경형 is highly overall correlated with 소형 and 3 other fields [High correlation]
소형 is highly overall correlated with 경형 and 3 other fields [High correlation]
중형 is highly overall correlated with 경형 and 3 other fields [High correlation]
대형 is highly overall correlated with 경형 and 3 other fields [High correlation]
용도 및 규모 is highly overall correlated with 경형 and 3 other fields [High correlation]
행정동명 has 21 (50.0%) missing values [Missing]
경형 has 19 (45.2%) zeros [Zeros]
소형 has 19 (45.2%) zeros [Zeros]
중형 has 17 (40.5%) zeros [Zeros]
대형 has 22 (52.4%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-10 17:02:21.299539
Analysis finished: 2023-12-10 17:02:24.740137
Duration: 3.44 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
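
A report of this kind can presumably be regenerated along the following lines. This is a minimal sketch, not the configuration actually used: the CSV path, its encoding, the output path, and the report title are assumptions, and the `dataset` keys simply mirror the Description, Author, and URL fields shown under Dataset. The imports sit inside the function so that the definition loads even where pandas and ydata-profiling are not installed.

```python
def build_report(csv_path, out_path="report.html", encoding="cp949"):
    """Profile a CSV file with ydata-profiling and write an HTML report.

    csv_path, out_path and encoding are hypothetical. Korean public-data
    CSV files are often cp949-encoded, but that is an assumption here.
    """
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    profile = ProfileReport(
        df,
        title="Profiling report",  # hypothetical title
        dataset={
            "description": "부산광역시해운대구_이륜차등록현황_20230222",
            "author": "부산광역시 해운대구",
            "url": "http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=3075710",
        },
    )
    profile.to_file(out_path)  # writes the HTML report to out_path
    return profile
```

Calling `build_report("two_wheelers.csv")` (a hypothetical file name) would write `report.html` next to the script.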

Variables

행정동명 (administrative dong name)
Text

MISSING 

Distinct: 21
Distinct (%): 100.0%
Missing: 21
Missing (%): 50.0%
Memory size: 468.0 B

Length

Max length: 4
Median length: 4
Mean length: 3.4761905
Min length: 2

Characters and Unicode

Total characters: 73
Distinct characters: 18
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
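
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories for the five sample values of 행정동명 listed in this report, not for the full column, so its counts are smaller than the totals reported for the variable.

```python
import unicodedata
from collections import Counter

# The five sample values of 행정동명 shown in this report (not the full column).
samples = ["반송1동", "반송2동", "반송3동", "반여1동", "반여2동"]

# General category per character: "Lo" = Other Letter, "Nd" = Decimal Number.
category_counts = Counter(
    unicodedata.category(ch) for name in samples for ch in name
)
distinct_chars = {ch for name in samples for ch in name}

print(category_counts)
print(len(distinct_chars))
```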

Unique

Unique: 21
Unique (%): 100.0%

Sample

1st row: 반송1동
2nd row: 반송2동
3rd row: 반송3동
4th row: 반여1동
5th row: 반여2동

Value              Count  Frequency (%)
반송1동            1      4.8%
재송1동            1      4.8%
중2동              1      4.8%
중1동              1      4.8%
좌동               1      4.8%
좌4동              1      4.8%
좌3동              1      4.8%
좌2동              1      4.8%
좌1동              1      4.8%
재송2동            1      4.8%
Other values (11)  11     52.4%

Most occurring characters

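Value             Count  Frequency (%)
[character lost]  20     27.4%
[character lost]  7      9.6%
[character lost]  6      8.2%
1                 6      8.2%
2                 6      8.2%
[character lost]  5      6.8%
3                 4      5.5%
[character lost]  4      5.5%
[character lost]  3      4.1%
[character lost]  2      2.7%
Other values (8)  10     13.7%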

Most occurring categories

Value           Count  Frequency (%)
Other Letter    55     75.3%
Decimal Number  18     24.7%

Most frequent character per category

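Other Letter

Value             Count  Frequency (%)
[character lost]  20     36.4%
[character lost]  7      12.7%
[character lost]  6      10.9%
[character lost]  5      9.1%
[character lost]  4      7.3%
[character lost]  3      5.5%
[character lost]  2      3.6%
[character lost]  2      3.6%
[character lost]  1      1.8%
[character lost]  1      1.8%
Other values (4)  4      7.3%

Decimal Number

Value  Count  Frequency (%)
1      6      33.3%
2      6      33.3%
3      4      22.2%
4      2      11.1%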

Most occurring scripts

Value   Count  Frequency (%)
Hangul  55     75.3%
Common  18     24.7%

Most frequent character per script

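Hangul

Value             Count  Frequency (%)
[character lost]  20     36.4%
[character lost]  7      12.7%
[character lost]  6      10.9%
[character lost]  5      9.1%
[character lost]  4      7.3%
[character lost]  3      5.5%
[character lost]  2      3.6%
[character lost]  2      3.6%
[character lost]  1      1.8%
[character lost]  1      1.8%
Other values (4)  4      7.3%

Common

Value  Count  Frequency (%)
1      6      33.3%
2      6      33.3%
3      4      22.2%
4      2      11.1%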

Most occurring blocks

Value   Count  Frequency (%)
Hangul  55     75.3%
ASCII   18     24.7%

Most frequent character per block

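Hangul

Value             Count  Frequency (%)
[character lost]  20     36.4%
[character lost]  7      12.7%
[character lost]  6      10.9%
[character lost]  5      9.1%
[character lost]  4      7.3%
[character lost]  3      5.5%
[character lost]  2      3.6%
[character lost]  2      3.6%
[character lost]  1      1.8%
[character lost]  1      1.8%
Other values (4)  4      7.3%

ASCII

Value  Count  Frequency (%)
1      6      33.3%
2      6      33.3%
3      4      22.2%
4      2      11.1%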

용도 및 규모 (use and size)
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 4.8%
Missing: 0
Missing (%): 0.0%
Memory size: 468.0 B
관용: 21
자가용: 21

Length

Max length: 3
Median length: 2.5
Mean length: 2.5
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 관용
2nd row: 자가용
3rd row: 관용
4th row: 자가용
5th row: 관용

Common Values

Value   Count  Frequency (%)
관용    21     50.0%
자가용  21     50.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
관용    21     50.0%
자가용  21     50.0%

경형 (light class)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 20
Distinct (%): 47.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.952381
Minimum: 0
Maximum: 64
Zeros: 19
Zeros (%): 45.2%
Negative: 0
Negative (%): 0.0%
Memory size: 510.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1.5
Q3: 23
95-th percentile: 43.9
Maximum: 64
Range: 64
Interquartile range (IQR): 23

Descriptive statistics

Standard deviation: 16.920117
Coefficient of variation (CV): 1.3063326
Kurtosis: 0.5726887
Mean: 12.952381
Median Absolute Deviation (MAD): 1.5
Skewness: 1.1535379
Sum: 544
Variance: 286.29036
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=20)
Common values

Value              Count  Frequency (%)
0                  19     45.2%
2                  2      4.8%
18                 2      4.8%
1                  2      4.8%
23                 2      4.8%
38                 1      2.4%
44                 1      2.4%
42                 1      2.4%
3                  1      2.4%
14                 1      2.4%
Other values (10)  10     23.8%

Minimum 10 values

Value  Count  Frequency (%)
0      19     45.2%
1      2      4.8%
2      2      4.8%
3      1      2.4%
14     1      2.4%
16     1      2.4%
18     2      4.8%
20     1      2.4%
22     1      2.4%
23     2      4.8%

Maximum 10 values

Value  Count  Frequency (%)
64     1      2.4%
45     1      2.4%
44     1      2.4%
42     1      2.4%
38     1      2.4%
36     1      2.4%
31     1      2.4%
29     1      2.4%
28     1      2.4%
24     1      2.4%
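
Because the lowest-ten and highest-ten value listings together cover all 20 distinct values of 경형, the full column can be rebuilt from the counts and the summary statistics rechecked. A minimal sketch using only the standard library, assuming the report uses the sample (n − 1) standard deviation and linearly interpolated quantiles, which is what the reported figures are consistent with:

```python
import statistics

# value -> count, taken from the lowest-ten and highest-ten listings for 경형.
counts = {0: 19, 1: 2, 2: 2, 3: 1, 14: 1, 16: 1, 18: 2, 20: 1, 22: 1, 23: 2,
          24: 1, 28: 1, 29: 1, 31: 1, 36: 1, 38: 1, 42: 1, 44: 1, 45: 1, 64: 1}
values = sorted(v for v, c in counts.items() for _ in range(c))

n = len(values)                      # number of observations
total = sum(values)                  # reported as Sum
mean = total / n
median = statistics.median(values)
variance = statistics.variance(values)   # sample variance (n - 1)
std = statistics.stdev(values)           # sample standard deviation
cv = std / mean                          # coefficient of variation
# "inclusive" interpolates linearly between the minimum and maximum.
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
p95 = statistics.quantiles(values, n=20, method="inclusive")[-1]
mad = statistics.median(abs(v - median) for v in values)

print(n, total, round(mean, 6), median, q3, round(p95, 1))
print(round(std, 6), round(cv, 7), round(variance, 5), mad)
```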

소형 (small class)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 21
Distinct (%): 50.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 69.666667
Minimum: 0
Maximum: 323
Zeros: 19
Zeros (%): 45.2%
Negative: 0
Negative (%): 0.0%
Memory size: 510.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 2
Q3: 130.75
95-th percentile: 268.7
Maximum: 323
Range: 323
Interquartile range (IQR): 130.75

Descriptive statistics

Standard deviation: 94.208556
Coefficient of variation (CV): 1.3522759
Kurtosis: 0.46292291
Mean: 69.666667
Median Absolute Deviation (MAD): 2
Skewness: 1.2100526
Sum: 2926
Variance: 8875.252
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=21)
Common values

Value              Count  Frequency (%)
0                  19     45.2%
121                2      4.8%
3                  2      4.8%
1                  2      4.8%
283                1      2.4%
155                1      2.4%
134                1      2.4%
263                1      2.4%
269                1      2.4%
87                 1      2.4%
Other values (11)  11     26.2%

Minimum 10 values

Value  Count  Frequency (%)
0      19     45.2%
1      2      4.8%
3      2      4.8%
37     1      2.4%
61     1      2.4%
64     1      2.4%
72     1      2.4%
87     1      2.4%
106    1      2.4%
121    2      4.8%

Maximum 10 values

Value  Count  Frequency (%)
323    1      2.4%
283    1      2.4%
269    1      2.4%
263    1      2.4%
190    1      2.4%
188    1      2.4%
158    1      2.4%
155    1      2.4%
146    1      2.4%
140    1      2.4%

중형 (medium class)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 23
Distinct (%): 54.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 190.19048
Minimum: 0
Maximum: 771
Zeros: 17
Zeros (%): 40.5%
Negative: 0
Negative (%): 0.0%
Memory size: 510.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1.5
Q3: 326.5
95-th percentile: 659.35
Maximum: 771
Range: 771
Interquartile range (IQR): 326.5

Descriptive statistics

Standard deviation: 245.55922
Coefficient of variation (CV): 1.2911226
Kurtosis: -0.43856823
Mean: 190.19048
Median Absolute Deviation (MAD): 1.5
Skewness: 0.96837774
Sum: 7988
Variance: 60299.329
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=23)
Common values

Value              Count  Frequency (%)
0                  17     40.5%
1                  4      9.5%
82                 1      2.4%
308                1      2.4%
415                1      2.4%
573                1      2.4%
200                1      2.4%
205                1      2.4%
407                1      2.4%
258                1      2.4%
Other values (13)  13     31.0%

Minimum 10 values

Value  Count  Frequency (%)
0      17     40.5%
1      4      9.5%
2      1      2.4%
75     1      2.4%
82     1      2.4%
194    1      2.4%
200    1      2.4%
205    1      2.4%
258    1      2.4%
283    1      2.4%

Maximum 10 values

Value  Count  Frequency (%)
771    1      2.4%
700    1      2.4%
662    1      2.4%
609    1      2.4%
573    1      2.4%
554    1      2.4%
542    1      2.4%
500    1      2.4%
415    1      2.4%
407    1      2.4%

대형 (large class)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 19
Distinct (%): 45.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31.738095
Minimum: 0
Maximum: 153
Zeros: 22
Zeros (%): 52.4%
Negative: 0
Negative (%): 0.0%
Memory size: 510.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 48.75
95-th percentile: 116.6
Maximum: 153
Range: 153
Interquartile range (IQR): 48.75

Descriptive statistics

Standard deviation: 43.986727
Coefficient of variation (CV): 1.3859284
Kurtosis: 0.50308436
Mean: 31.738095
Median Absolute Deviation (MAD): 0
Skewness: 1.2359829
Sum: 1333
Variance: 1934.8322
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=19)
Common values

Value             Count  Frequency (%)
0                 22     52.4%
48                2      4.8%
43                2      4.8%
109               1      2.4%
84                1      2.4%
136               1      2.4%
34                1      2.4%
83                1      2.4%
49                1      2.4%
78                1      2.4%
Other values (9)  9      21.4%

Minimum 10 values

Value  Count  Frequency (%)
0      22     52.4%
1      1      2.4%
3      1      2.4%
26     1      2.4%
34     1      2.4%
41     1      2.4%
43     2      4.8%
48     2      4.8%
49     1      2.4%
52     1      2.4%

Maximum 10 values

Value  Count  Frequency (%)
153    1      2.4%
136    1      2.4%
117    1      2.4%
109    1      2.4%
99     1      2.4%
86     1      2.4%
84     1      2.4%
83     1      2.4%
78     1      2.4%
52     1      2.4%

Interactions

[Scatter plots for each pair of the numeric variables 경형, 소형, 중형, and 대형; images not preserved]

Correlations

              행정동명  용도 및 규모  경형   소형   중형   대형
행정동명      1.000     NaN           NaN    NaN    1.000  NaN
용도 및 규모  NaN       1.000         0.965  0.984  0.959  0.807
경형          NaN       0.965         1.000  0.898  0.866  0.899
소형          NaN       0.984         0.898  1.000  0.830  0.861
중형          1.000     0.959         0.866  0.830  1.000  0.822
대형          NaN       0.807         0.899  0.861  0.822  1.000
              경형   소형   중형   대형   용도 및 규모
경형          1.000  0.895  0.873  0.899  0.771
소형          0.895  1.000  0.940  0.912  0.820
중형          0.873  0.940  1.000  0.869  0.738
대형          0.899  0.912  0.869  1.000  0.754
용도 및 규모  0.771  0.820  0.738  0.754  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

    행정동명  용도 및 규모  경형  소형  중형  대형
0   반송1동   관용          0     0     82    0
1   <NA>      자가용        29    146   542   43
2   반송2동   관용          0     0     0     0
3   <NA>      자가용        45    269   700   48
4   반송3동   관용          0     0     0     0
5   <NA>      자가용        2     61    75    3
6   반여1동   관용          0     0     2     0
7   <NA>      자가용        64    283   771   153
8   반여2동   관용          0     0     1     0
9   <NA>      자가용        16    155   500   43

Last rows

    행정동명  용도 및 규모  경형  소형  중형  대형
32  좌4동     관용          0     0     0     0
33  <NA>      자가용        23    87    200   48
34  좌동      관용          0     0     0     0
35  <NA>      자가용        0     3     1     0
36  중1동     관용          3     0     0     0
37  <NA>      자가용        42    263   573   136
38  중2동     관용          0     0     0     0
39  <NA>      자가용        44    134   415   84
40  해운대구  관용          0     0     0     0
41  <NA>      자가용        0     3     1     0
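
The sample suggests why 행정동명 is 50% missing: each name appears only on the 관용 row, and the 자가용 row that follows is left empty. A minimal sketch of a forward fill, assuming that a blank always inherits the name from the row directly above it (an inference from the sample, not something the report states). The rows are copied from indices 32 to 35, and `None` stands in for `<NA>`.

```python
# (행정동명, 용도 및 규모, 경형, 소형, 중형, 대형) for sample indices 32-35.
rows = [
    ("좌4동", "관용", 0, 0, 0, 0),
    (None, "자가용", 23, 87, 200, 48),
    ("좌동", "관용", 0, 0, 0, 0),
    (None, "자가용", 0, 3, 1, 0),
]


def forward_fill_first(records):
    """Return copies of records in which a missing first field is replaced
    by the most recent non-missing value seen above it."""
    filled, last_seen = [], None
    for first, *rest in records:
        if first is not None:
            last_seen = first
        filled.append((last_seen, *rest))
    return filled


filled_rows = forward_fill_first(rows)
for row in filled_rows:
    print(row)
```

With pandas, the same idea is usually written as `df["행정동명"].ffill()`, where `df` is a hypothetical DataFrame holding this table.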

Duplicate rows

Most frequently occurring

   행정동명  용도 및 규모  경형  소형  중형  대형  # duplicates
0  <NA>      자가용        0     3     1     0     2
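
The duplicate arises because two 자가용 rows with identical counts both have a missing 행정동명 (indices 35 and 41 in the sample). A minimal sketch of how such repeats can be counted, using whole-row tuples as keys; only sample indices 35 and 38 to 41 are included, and `None` stands in for `<NA>`.

```python
from collections import Counter

# (행정동명, 용도 및 규모, 경형, 소형, 중형, 대형) for sample indices 35, 38-41.
rows = [
    (None, "자가용", 0, 3, 1, 0),
    ("중2동", "관용", 0, 0, 0, 0),
    (None, "자가용", 44, 134, 415, 84),
    ("해운대구", "관용", 0, 0, 0, 0),
    (None, "자가용", 0, 3, 1, 0),
]

# Count identical rows and keep those that occur more than once.
row_counts = Counter(rows)
duplicates = {row: count for row, count in row_counts.items() if count > 1}

print(duplicates)
```

If the missing names were filled in first, the two rows would presumably belong to different areas and would no longer count as duplicates.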