Overview

Dataset statistics

Number of variables: 6
Number of observations: 481
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 23.6 KiB
Average record size in memory: 50.3 B

Variable types

Categorical: 1
Text: 3
Numeric: 2

Dataset

Description: 부산광역시남구_도로현황_20190729 (Nam-gu, Busan Metropolitan City, road status, 2019-07-29)
Author: 부산광역시 남구 (Nam-gu, Busan Metropolitan City)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15022536

Alerts

폭(m) is highly overall correlated with 길이(m) and 1 other field (High correlation)
길이(m) is highly overall correlated with 폭(m) and 1 other field (High correlation)
구분 is highly overall correlated with 폭(m) and 1 other field (High correlation)
구분 is highly imbalanced (69.6%) (Imbalance)
노선명 has unique values (Unique)
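
The 69.6% imbalance figure for 구분 can be reproduced from the class counts reported later in this profile (436, 42 and 3). The sketch below assumes the score is one minus the Shannon entropy of the class frequencies divided by the maximum possible entropy, log(k) for k classes. The function name `imbalance_score` is hypothetical and the formula is an assumption that happens to match the reported value, not a quotation of the library's code.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k) for class counts (assumed formula).

    H is the Shannon entropy of the class frequencies and k the number
    of classes. 0 means perfectly balanced, 1 means a single class.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)

# Class counts reported for 구분 in this profile.
score = imbalance_score([436, 42, 3])
print(f"{score:.1%}")  # prints 69.6%
```

A score near 1 means almost all rows fall into one class, which is the case here: one class covers 90.6% of the 481 rows.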

Reproduction

Analysis started: 2023-12-10 17:10:03.971510
Analysis finished: 2023-12-10 17:10:05.129980
Duration: 1.16 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

구분 (road class)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 3
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 KiB

Length

Max length: 2
Median length: 1
Mean length: 1.006237
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 대로
2nd row: 대로
3rd row: 대로
4th row: (single-character value lost in extraction)
5th row: (single-character value lost in extraction)

Common Values

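Value | Count | Frequency (%)
(single-character value lost in extraction) | 436 | 90.6%
(single-character value lost in extraction) | 42 | 8.7%
대로 | 3 | 0.6%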

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the value counts for 구분]

노선명 (route name)
Text

UNIQUE

Distinct: 481
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 KiB

Length

Max length: 11
Median length: 10
Mean length: 7.6673597
Min length: 3

Characters and Unicode

Total characters: 3688
Distinct characters: 90
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 481
Unique (%): 100.0%

Sample

1st row: 전포대로
2nd row: 충장대로
3rd row: 황령대로
4th row: 8부두로
5th row: 고동골로

Value | Count | Frequency (%)
전포대로 | 1 | 0.2%
용주로30번길 | 1 | 0.2%
유엔로25번가길 | 1 | 0.2%
유엔로215번길 | 1 | 0.2%
유엔로201번길 | 1 | 0.2%
유엔로183번길 | 1 | 0.2%
유엔로169번길 | 1 | 0.2%
유엔로157번나길 | 1 | 0.2%
유엔로157번길 | 1 | 0.2%
유엔로157번가길 | 1 | 0.2%
Other values (471) | 471 | 97.9%

Most occurring characters

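Value | Count | Frequency (%)
(glyph lost) | 475 | 12.9%
(glyph lost) | 470 | 12.7%
(glyph lost) | 436 | 11.8%
1 | 197 | 5.3%
2 | 135 | 3.7%
6 | 110 | 3.0%
3 | 107 | 2.9%
7 | 89 | 2.4%
9 | 86 | 2.3%
4 | 80 | 2.2%
Other values (80) | 1503 | 40.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2668 | 72.3%
Decimal Number | 1020 | 27.7%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(glyph lost) | 475 | 17.8%
(glyph lost) | 470 | 17.6%
(glyph lost) | 436 | 16.3%
(glyph lost) | 78 | 2.9%
(glyph lost) | 63 | 2.4%
(glyph lost) | 53 | 2.0%
(glyph lost) | 53 | 2.0%
(glyph lost) | 51 | 1.9%
(glyph lost) | 50 | 1.9%
(glyph lost) | 49 | 1.8%
Other values (70) | 890 | 33.4%

Decimal Number
Value | Count | Frequency (%)
1 | 197 | 19.3%
2 | 135 | 13.2%
6 | 110 | 10.8%
3 | 107 | 10.5%
7 | 89 | 8.7%
9 | 86 | 8.4%
4 | 80 | 7.8%
0 | 78 | 7.6%
5 | 74 | 7.3%
8 | 64 | 6.3%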

Most occurring scripts

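Value | Count | Frequency (%)
Hangul | 2668 | 72.3%
Common | 1020 | 27.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(glyph lost) | 475 | 17.8%
(glyph lost) | 470 | 17.6%
(glyph lost) | 436 | 16.3%
(glyph lost) | 78 | 2.9%
(glyph lost) | 63 | 2.4%
(glyph lost) | 53 | 2.0%
(glyph lost) | 53 | 2.0%
(glyph lost) | 51 | 1.9%
(glyph lost) | 50 | 1.9%
(glyph lost) | 49 | 1.8%
Other values (70) | 890 | 33.4%

Common
Value | Count | Frequency (%)
1 | 197 | 19.3%
2 | 135 | 13.2%
6 | 110 | 10.8%
3 | 107 | 10.5%
7 | 89 | 8.7%
9 | 86 | 8.4%
4 | 80 | 7.8%
0 | 78 | 7.6%
5 | 74 | 7.3%
8 | 64 | 6.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2668 | 72.3%
ASCII | 1020 | 27.7%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(glyph lost) | 475 | 17.8%
(glyph lost) | 470 | 17.6%
(glyph lost) | 436 | 16.3%
(glyph lost) | 78 | 2.9%
(glyph lost) | 63 | 2.4%
(glyph lost) | 53 | 2.0%
(glyph lost) | 53 | 2.0%
(glyph lost) | 51 | 1.9%
(glyph lost) | 50 | 1.9%
(glyph lost) | 49 | 1.8%
Other values (70) | 890 | 33.4%

ASCII
Value | Count | Frequency (%)
1 | 197 | 19.3%
2 | 135 | 13.2%
6 | 110 | 10.8%
3 | 107 | 10.5%
7 | 89 | 8.7%
9 | 86 | 8.4%
4 | 80 | 7.8%
0 | 78 | 7.6%
5 | 74 | 7.3%
8 | 64 | 6.3%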

시점 (start point)
Text

Distinct: 469
Distinct (%): 97.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 KiB

Length

Max length: 24
Median length: 23
Mean length: 21.569647
Min length: 14

Characters and Unicode

Total characters: 10375
Distinct characters: 45
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
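
As an illustration of how such category counts arise, the sketch below classifies each character of one 시점 value from this report with Python's standard `unicodedata` module. It is a minimal example, not the profiler's own implementation. The helper name `category_counts` is hypothetical, and the two-letter codes map to the names used in this report (`Lo` = Other Letter, `Nd` = Decimal Number, `Zs` = Space Separator, `Pd` = Dash Punctuation).

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count characters by Unicode general category code (e.g. 'Lo', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)

# First sample value of the 시점 column in this report.
sample = "부산광역시 남구 문현동 807-5 일원"
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```

For this string the 12 Hangul syllables fall under `Lo`, the four digits under `Nd`, the four spaces under `Zs` and the hyphen under `Pd`, which are the four categories listed for the column.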

Unique

Unique: 457
Unique (%): 95.0%
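
The gap between Distinct (469) and Unique (457) is consistent with the usual reading of these two counters: "distinct" counts different values, while "unique" counts values that occur exactly once. Under that assumption, 12 start points are shared by more than one road, and since they must account for the remaining 481 - 457 = 24 rows, each appears exactly twice. The sketch below shows the distinction on a toy list; the function name `distinct_and_unique` and the toy data are hypothetical.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    freq = Counter(values)
    distinct = len(freq)
    unique = sum(1 for n in freq.values() if n == 1)
    return distinct, unique

# Toy data: "b" and "c" repeat, "a" and "d" occur once.
toy = ["a", "b", "b", "c", "c", "c", "d"]
result = distinct_and_unique(toy)
print(result)  # prints (4, 2)
```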

Sample

1st row: 부산광역시 남구 문현동 807-5 일원
2nd row: 부산광역시 중구 중앙동4가 89-14 일원
3rd row: 부산광역시 진구 범천동 856-7 일원
4th row: 부산광역시 남구 감만동 85-17 일원
5th row: 부산광역시 남구 문현동361-32 일원

Value | Count | Frequency (%)
부산광역시 | 481 | 20.2%
남구 | 473 | 19.8%
일원 | 472 | 19.8%
대연동 | 161 | 6.7%
문현동 | 104 | 4.4%
용호동 | 69 | 2.9%
우암동 | 58 | 2.4%
감만동 | 57 | 2.4%
용당동 | 14 | 0.6%
진구 | 4 | 0.2%
Other values (476) | 493 | 20.7%

Most occurring characters

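Value | Count | Frequency (%)
(space) | 1905 | 18.4%
1 | 503 | 4.8%
(glyph lost) | 488 | 4.7%
(glyph lost) | 483 | 4.7%
(glyph lost) | 481 | 4.6%
(glyph lost) | 481 | 4.6%
(glyph lost) | 481 | 4.6%
(glyph lost) | 481 | 4.6%
(glyph lost) | 481 | 4.6%
(glyph lost) | 474 | 4.6%
Other values (35) | 4117 | 39.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 5764 | 55.6%
Decimal Number | 2240 | 21.6%
Space Separator | 1905 | 18.4%
Dash Punctuation | 466 | 4.5%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(glyph lost) | 488 | 8.5%
(glyph lost) | 483 | 8.4%
(glyph lost) | 481 | 8.3%
(glyph lost) | 481 | 8.3%
(glyph lost) | 481 | 8.3%
(glyph lost) | 481 | 8.3%
(glyph lost) | 481 | 8.3%
(glyph lost) | 474 | 8.2%
(glyph lost) | 473 | 8.2%
(glyph lost) | 472 | 8.2%
Other values (23) | 969 | 16.8%

Decimal Number
Value | Count | Frequency (%)
1 | 503 | 22.5%
2 | 266 | 11.9%
3 | 219 | 9.8%
4 | 214 | 9.6%
5 | 199 | 8.9%
8 | 185 | 8.3%
9 | 182 | 8.1%
7 | 170 | 7.6%
6 | 169 | 7.5%
0 | 133 | 5.9%

Space Separator
Value | Count | Frequency (%)
(space) | 1905 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 466 | 100.0%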

Most occurring scripts

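Value | Count | Frequency (%)
Hangul | 5764 | 55.6%
Common | 4611 | 44.4%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(glyph lost) | 488 | 8.5%
(glyph lost) | 483 | 8.4%
(glyph lost) | 481 | 8.3%
(glyph lost) | 481 | 8.3%
(glyph lost) | 481 | 8.3%
(glyph lost) | 481 | 8.3%
(glyph lost) | 481 | 8.3%
(glyph lost) | 474 | 8.2%
(glyph lost) | 473 | 8.2%
(glyph lost) | 472 | 8.2%
Other values (23) | 969 | 16.8%

Common
Value | Count | Frequency (%)
(space) | 1905 | 41.3%
1 | 503 | 10.9%
- | 466 | 10.1%
2 | 266 | 5.8%
3 | 219 | 4.7%
4 | 214 | 4.6%
5 | 199 | 4.3%
8 | 185 | 4.0%
9 | 182 | 3.9%
7 | 170 | 3.7%
Other values (2) | 302 | 6.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 5764 | 55.6%
ASCII | 4611 | 44.4%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 1905 | 41.3%
1 | 503 | 10.9%
- | 466 | 10.1%
2 | 266 | 5.8%
3 | 219 | 4.7%
4 | 214 | 4.6%
5 | 199 | 4.3%
8 | 185 | 4.0%
9 | 182 | 3.9%
7 | 170 | 3.7%
Other values (2) | 302 | 6.5%

Hangul
Value | Count | Frequency (%)
(glyph lost) | 488 | 8.5%
(glyph lost) | 483 | 8.4%
(glyph lost) | 481 | 8.3%
(glyph lost) | 481 | 8.3%
(glyph lost) | 481 | 8.3%
(glyph lost) | 481 | 8.3%
(glyph lost) | 481 | 8.3%
(glyph lost) | 474 | 8.2%
(glyph lost) | 473 | 8.2%
(glyph lost) | 472 | 8.2%
Other values (23) | 969 | 16.8%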

종점 (end point)
Text

Distinct: 472
Distinct (%): 98.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 KiB

Length

Max length: 24
Median length: 23
Mean length: 21.60499
Min length: 14

Characters and Unicode

Total characters: 10392
Distinct characters: 45
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 463
Unique (%): 96.3%

Sample

1st row: 부산광역시 진구 전포동 874-1 일원
2nd row: 부산광역시 남구 문현동 산149일원
3rd row: 부산광역시 남구 대연동 1808 일원
4th row: 부산광역시 남구 감만동 75-82 일원
5th row: 부산광역시 남구 문현동 54-7 일원

Value | Count | Frequency (%)
부산광역시 | 481 | 20.2%
남구 | 473 | 19.8%
일원 | 471 | 19.7%
대연동 | 163 | 6.8%
문현동 | 103 | 4.3%
용호동 | 73 | 3.1%
우암동 | 62 | 2.6%
감만동 | 54 | 2.3%
용당동 | 10 | 0.4%
전포동 | 4 | 0.2%
Other values (475) | 491 | 20.6%

Most occurring characters

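Value | Count | Frequency (%)
(space) | 1904 | 18.3%
(glyph lost) | 509 | 4.9%
1 | 504 | 4.8%
(glyph lost) | 481 | 4.6%
(glyph lost) | 481 | 4.6%
(glyph lost) | 481 | 4.6%
(glyph lost) | 481 | 4.6%
(glyph lost) | 481 | 4.6%
(glyph lost) | 480 | 4.6%
(glyph lost) | 473 | 4.6%
Other values (35) | 4117 | 39.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 5786 | 55.7%
Decimal Number | 2252 | 21.7%
Space Separator | 1904 | 18.3%
Dash Punctuation | 450 | 4.3%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(glyph lost) | 509 | 8.8%
(glyph lost) | 481 | 8.3%
(glyph lost) | 481 | 8.3%
(glyph lost) | 481 | 8.3%
(glyph lost) | 481 | 8.3%
(glyph lost) | 481 | 8.3%
(glyph lost) | 480 | 8.3%
(glyph lost) | 473 | 8.2%
(glyph lost) | 472 | 8.2%
(glyph lost) | 472 | 8.2%
Other values (23) | 975 | 16.9%

Decimal Number
Value | Count | Frequency (%)
1 | 504 | 22.4%
2 | 287 | 12.7%
3 | 235 | 10.4%
5 | 203 | 9.0%
8 | 196 | 8.7%
7 | 186 | 8.3%
9 | 179 | 7.9%
4 | 178 | 7.9%
6 | 172 | 7.6%
0 | 112 | 5.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1904 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 450 | 100.0%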

Most occurring scripts

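Value | Count | Frequency (%)
Hangul | 5786 | 55.7%
Common | 4606 | 44.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(glyph lost) | 509 | 8.8%
(glyph lost) | 481 | 8.3%
(glyph lost) | 481 | 8.3%
(glyph lost) | 481 | 8.3%
(glyph lost) | 481 | 8.3%
(glyph lost) | 481 | 8.3%
(glyph lost) | 480 | 8.3%
(glyph lost) | 473 | 8.2%
(glyph lost) | 472 | 8.2%
(glyph lost) | 472 | 8.2%
Other values (23) | 975 | 16.9%

Common
Value | Count | Frequency (%)
(space) | 1904 | 41.3%
1 | 504 | 10.9%
- | 450 | 9.8%
2 | 287 | 6.2%
3 | 235 | 5.1%
5 | 203 | 4.4%
8 | 196 | 4.3%
7 | 186 | 4.0%
9 | 179 | 3.9%
4 | 178 | 3.9%
Other values (2) | 284 | 6.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 5786 | 55.7%
ASCII | 4606 | 44.3%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 1904 | 41.3%
1 | 504 | 10.9%
- | 450 | 9.8%
2 | 287 | 6.2%
3 | 235 | 5.1%
5 | 203 | 4.4%
8 | 196 | 4.3%
7 | 186 | 4.0%
9 | 179 | 3.9%
4 | 178 | 3.9%
Other values (2) | 284 | 6.2%

Hangul
Value | Count | Frequency (%)
(glyph lost) | 509 | 8.8%
(glyph lost) | 481 | 8.3%
(glyph lost) | 481 | 8.3%
(glyph lost) | 481 | 8.3%
(glyph lost) | 481 | 8.3%
(glyph lost) | 481 | 8.3%
(glyph lost) | 480 | 8.3%
(glyph lost) | 473 | 8.2%
(glyph lost) | 472 | 8.2%
(glyph lost) | 472 | 8.2%
Other values (23) | 975 | 16.9%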

폭(m) (width in metres)
Real number (ℝ)

HIGH CORRELATION

Distinct: 31
Distinct (%): 6.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.7505198
Minimum: 1
Maximum: 50
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.4 KiB

Quantile statistics

Minimum: 1
5th percentile: 2
Q1: 3
Median: 6
Q3: 8
95th percentile: 15
Maximum: 50
Range: 49
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 6.0380844
Coefficient of variation (CV): 0.89446215
Kurtosis: 18.376913
Mean: 6.7505198
Median Absolute Deviation (MAD): 2
Skewness: 3.7214917
Sum: 3247
Variance: 36.458463
Monotonicity: Not monotonic
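
Several of the descriptive statistics above are tied together by simple identities, which makes them easy to sanity-check. The sketch below recomputes the mean, coefficient of variation and variance of 폭(m) from the reported sum, standard deviation and the 481 observations. The variable names are illustrative only, and the check assumes the usual definitions (mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared).

```python
# Values copied from the 폭(m) statistics in this report.
n = 481            # number of observations
total = 3247       # Sum
std = 6.0380844    # Standard deviation

mean = total / n       # expected to match the reported Mean
cv = std / mean        # expected to match the reported CV
variance = std ** 2    # expected to match the reported Variance

print(f"mean={mean:.7f} cv={cv:.8f} variance={variance:.6f}")
```

The recomputed values agree with the reported 6.7505198, 0.89446215 and 36.458463 to within rounding. A CV close to 0.9 together with a skewness of 3.72 indicates a long right tail, driven by the few roads that are 26 m wide or more.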

[Figure: Histogram with fixed size bins (bins=31)]

Common values

Value | Count | Frequency (%)
6 | 92 | 19.1%
8 | 70 | 14.6%
2 | 65 | 13.5%
4 | 58 | 12.1%
3 | 53 | 11.0%
5 | 28 | 5.8%
7 | 26 | 5.4%
9 | 21 | 4.4%
10 | 14 | 2.9%
12 | 9 | 1.9%
Other values (21) | 45 | 9.4%

Minimum 10 values

Value | Count | Frequency (%)
1 | 8 | 1.7%
2 | 65 | 13.5%
3 | 53 | 11.0%
4 | 58 | 12.1%
5 | 28 | 5.8%
6 | 92 | 19.1%
7 | 26 | 5.4%
8 | 70 | 14.6%
9 | 21 | 4.4%
10 | 14 | 2.9%

Maximum 10 values

Value | Count | Frequency (%)
50 | 2 | 0.4%
43 | 1 | 0.2%
40 | 1 | 0.2%
36 | 1 | 0.2%
34 | 1 | 0.2%
31 | 2 | 0.4%
30 | 1 | 0.2%
28 | 1 | 0.2%
27 | 1 | 0.2%
26 | 2 | 0.4%

길이(m) (length in metres)
Real number (ℝ)

HIGH CORRELATION

Distinct: 338
Distinct (%): 70.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 457.30146
Minimum: 70
Maximum: 8100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.4 KiB

Quantile statistics

Minimum: 70
5th percentile: 121
Q1: 194
Median: 285
Q3: 450
95th percentile: 1271
Maximum: 8100
Range: 8030
Interquartile range (IQR): 256

Descriptive statistics

Standard deviation: 719.87406
Coefficient of variation (CV): 1.5741784
Kurtosis: 48.045054
Mean: 457.30146
Median Absolute Deviation (MAD): 113
Skewness: 6.2130279
Sum: 219962
Variance: 518218.67
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
222 | 6 | 1.2%
230 | 6 | 1.2%
427 | 5 | 1.0%
149 | 4 | 0.8%
132 | 4 | 0.8%
223 | 4 | 0.8%
128 | 4 | 0.8%
158 | 4 | 0.8%
314 | 4 | 0.8%
99 | 3 | 0.6%
Other values (328) | 437 | 90.9%

Minimum 10 values

Value | Count | Frequency (%)
70 | 1 | 0.2%
71 | 1 | 0.2%
73 | 1 | 0.2%
76 | 1 | 0.2%
80 | 1 | 0.2%
81 | 1 | 0.2%
88 | 1 | 0.2%
98 | 1 | 0.2%
99 | 3 | 0.6%
102 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
8100 | 1 | 0.2%
6090 | 1 | 0.2%
6061 | 1 | 0.2%
5000 | 1 | 0.2%
4587 | 1 | 0.2%
3933 | 1 | 0.2%
3400 | 1 | 0.2%
3090 | 1 | 0.2%
2797 | 1 | 0.2%
2552 | 1 | 0.2%

Interactions

[Figures: four interaction plots for the numeric variables 폭(m) and 길이(m)]

Correlations

[Figure: correlation heatmap]

 | 구분 | 폭(m) | 길이(m)
구분 | 1.000 | 0.878 | 0.944
폭(m) | 0.878 | 1.000 | 0.765
길이(m) | 0.944 | 0.765 | 1.000

[Figure: correlation heatmap]

 | 폭(m) | 길이(m) | 구분
폭(m) | 1.000 | 0.573 | 0.808
길이(m) | 0.573 | 1.000 | 0.717
구분 | 0.808 | 0.717 | 1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

First rows

Index | 구분 | 노선명 | 시점 | 종점 | 폭(m) | 길이(m)
0 | 대로 | 전포대로 | 부산광역시 남구 문현동 807-5 일원 | 부산광역시 진구 전포동 874-1 일원 | 50 | 3090
1 | 대로 | 충장대로 | 부산광역시 중구 중앙동4가 89-14 일원 | 부산광역시 남구 문현동 산149일원 | 50 | 3400
2 | 대로 | 황령대로 | 부산광역시 진구 범천동 856-7 일원 | 부산광역시 남구 대연동 1808 일원 | 40 | 5000
3 | (lost) | 8부두로 | 부산광역시 남구 감만동 85-17 일원 | 부산광역시 남구 감만동 75-82 일원 | 22 | 1121
4 | (lost) | 고동골로 | 부산광역시 남구 문현동361-32 일원 | 부산광역시 남구 문현동 54-7 일원 | 15 | 1380
5 | (lost) | 남동천로 | 부산광역시 남구 문현동 810-17 일원 | 부산광역시 남구 문현동 688-1 일원 | 15 | 1360
6 | (lost) | 당산로 | 부산광역시 남구 우암동 67 일원 | 부산광역시 남구 우암동 118-111 일원 | 8 | 579
7 | (lost) | 동명로 | 부산광역시 남구 용당동 483-13 일원 | 부산광역시 남구 용호동 519-15 일원 | 12 | 2095
8 | (lost) | 동제당로 | 부산광역시 남구 문현동 271-4 일원 | 부산광역시 남구 우암동 150-2 일원 | 7 | 1901
9 | (lost) | 동항로 | 부산광역시 남구 우암동 184-65 일원 | 부산광역시 남구 우암동 127-87 일원 | 5 | 661

Last rows

Index | 구분 | 노선명 | 시점 | 종점 | 폭(m) | 길이(m)
471 | (lost) | 황령대로352번가길 | 부산광역시 남구 대연동 219-12 일원 | 부산광역시 남구 대연동 219-29 일원 | 4 | 158
472 | (lost) | 황령대로352번길 | 부산광역시 남구 대연동 211-16 일원 | 부산광역시 남구 대연동 219-26 일원 | 3 | 161
473 | (lost) | 황령대로353번길 | 부산광역시 남구 대연동 210-4 일원 | 부산광역시 남구 대연동 231-51 일원 | 5 | 361
474 | (lost) | 황령대로492번길 | 부산광역시 남구 대연동 16-2 일원 | 부산광역시 남구 대연동 455-29 일원 | 6 | 362
475 | (lost) | 황령대로74번길 | 부산광역시 진구 전포동 372-2 일원 | 부산광역시 남구 문현동 511-6 일원 | 9 | 520
476 | (lost) | 황령대로90번가길 | 부산광역시 남구 문현동 605 일원 | 부산광역시 남구 문현동 617-113 일원 | 2 | 167
477 | (lost) | 황령대로90번길 | 부산광역시 진구 전포동 367-99 일원 | 부산광역시 남구 문현동 483-11 일원 | 8 | 950
478 | (lost) | 황령대로90번나길 | 부산광역시 남구 문현동 545-360 일원 | 부산광역시 남구 문현동 558-11 일원 | 6 | 287
479 | (lost) | 황령대로90번다길 | 부산광역시 남구 문현동 509-21 일원 | 부산광역시 남구 문현동 545-224 일원 | 2 | 158
480 | (lost) | 황령대로98번길 | 부산광역시 진구 전포동 364-71 | 부산광역시 진구 전포동 379-18 | 6 | 360