Overview

Dataset statistics

Number of variables: 6
Number of observations: 30
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.6 KiB
Average record size in memory: 54.4 B
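
The figures above are related by simple arithmetic. The following is a minimal sketch that re-derives the cell count, the missing-cell percentage, and the total memory size from the reported values; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" block of the report.
n_variables = 6
n_observations = 30
missing_cells = 0
avg_record_bytes = 54.4  # reported average record size, in bytes

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Total size is the average record size times the row count (1 KiB = 1024 B).
total_kib = avg_record_bytes * n_observations / 1024

print(total_cells)              # 180
print(f"{missing_pct:.1f}%")    # 0.0%
print(f"{total_kib:.1f} KiB")   # 1.6 KiB
```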

Variable types

Text: 2
Categorical: 2
Numeric: 2

Dataset

Description: Current information on road types, road sections, lengths, and related attributes in Ansan City.
Author: 안산도시공사
URL: https://www.data.go.kr/data/15018682/fileData.do

Alerts

자전거 도로 너비 is highly overall correlated with 노선번호 (High correlation)
노선번호 is highly overall correlated with 자전거 도로 너비 (High correlation)
노선명 has unique values (Unique)
도로구간 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 01:34:27.399452
Analysis finished: 2023-12-12 01:34:28.376751
Duration: 0.98 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

노선명
Text

UNIQUE 

Distinct: 30
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 7
Median length: 5
Mean length: 5.2
Min length: 4

Characters and Unicode

Total characters: 156
Distinct characters: 27
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
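
As an illustration of these character properties, the sketch below counts Unicode general categories for the first five sample values of 노선명 using Python's standard `unicodedata` module. This is not how the report itself was produced; it only shows the kind of lookup involved. `Lo`, `Nd`, and `Lu` are the standard abbreviations for Other Letter, Decimal Number, and Uppercase Letter.

```python
import unicodedata
from collections import Counter

# First five sample values of 노선명, copied from the report.
samples = ["중앙대로12L", "중앙대로13L", "중앙대로14L", "중앙대로15L", "중앙대로16L"]

# Tally the general category of every character in every value.
categories = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)

# Each value has 4 Hangul syllables (Lo), 2 digits (Nd) and 1 capital letter (Lu).
print(dict(categories))  # {'Lo': 20, 'Nd': 10, 'Lu': 5}
```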

Unique

Unique: 30
Unique (%): 100.0%

Sample

1st row: 중앙대로12L
2nd row: 중앙대로13L
3rd row: 중앙대로14L
4th row: 중앙대로15L
5th row: 중앙대로16L

Value | Count | Frequency (%)
중앙대로12l | 1 | 3.3%
중앙대로13l | 1 | 3.3%
신원로1r | 1 | 3.3%
진흥로1 | 1 | 3.3%
시화로1l | 1 | 3.3%
시화로2r | 1 | 3.3%
시화로1 | 1 | 3.3%
정왕천동로1 | 1 | 3.3%
삼일로9l | 1 | 3.3%
삼일로8l | 1 | 3.3%
Other values (20) | 20 | 66.7%

Most occurring characters

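Value | Count | Frequency (%)
(not preserved) | 30 | 19.2%
(not preserved) | 18 | 11.5%
(not preserved) | 18 | 11.5%
L | 15 | 9.6%
1 | 13 | 8.3%
R | 9 | 5.8%
(not preserved) | 5 | 3.2%
(not preserved) | 5 | 3.2%
(not preserved) | 5 | 3.2%
5 | 4 | 2.6%
Other values (17) | 34 | 21.8%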

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 97 | 62.2%
Decimal Number | 35 | 22.4%
Uppercase Letter | 24 | 15.4%

Most frequent character per category

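Other Letter
Value | Count | Frequency (%)
(not preserved) | 30 | 30.9%
(not preserved) | 18 | 18.6%
(not preserved) | 18 | 18.6%
(not preserved) | 5 | 5.2%
(not preserved) | 5 | 5.2%
(not preserved) | 5 | 5.2%
(not preserved) | 3 | 3.1%
(not preserved) | 3 | 3.1%
(not preserved) | 2 | 2.1%
(not preserved) | 2 | 2.1%
Other values (6) | 6 | 6.2%

Decimal Number
Value | Count | Frequency (%)
1 | 13 | 37.1%
5 | 4 | 11.4%
6 | 4 | 11.4%
2 | 4 | 11.4%
4 | 3 | 8.6%
3 | 3 | 8.6%
7 | 2 | 5.7%
9 | 1 | 2.9%
8 | 1 | 2.9%

Uppercase Letter
Value | Count | Frequency (%)
L | 15 | 62.5%
R | 9 | 37.5%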

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 97 | 62.2%
Common | 35 | 22.4%
Latin | 24 | 15.4%

Most frequent character per script

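Hangul
Value | Count | Frequency (%)
(not preserved) | 30 | 30.9%
(not preserved) | 18 | 18.6%
(not preserved) | 18 | 18.6%
(not preserved) | 5 | 5.2%
(not preserved) | 5 | 5.2%
(not preserved) | 5 | 5.2%
(not preserved) | 3 | 3.1%
(not preserved) | 3 | 3.1%
(not preserved) | 2 | 2.1%
(not preserved) | 2 | 2.1%
Other values (6) | 6 | 6.2%

Common
Value | Count | Frequency (%)
1 | 13 | 37.1%
5 | 4 | 11.4%
6 | 4 | 11.4%
2 | 4 | 11.4%
4 | 3 | 8.6%
3 | 3 | 8.6%
7 | 2 | 5.7%
9 | 1 | 2.9%
8 | 1 | 2.9%

Latin
Value | Count | Frequency (%)
L | 15 | 62.5%
R | 9 | 37.5%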

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 97 | 62.2%
ASCII | 59 | 37.8%

Most frequent character per block

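Hangul
Value | Count | Frequency (%)
(not preserved) | 30 | 30.9%
(not preserved) | 18 | 18.6%
(not preserved) | 18 | 18.6%
(not preserved) | 5 | 5.2%
(not preserved) | 5 | 5.2%
(not preserved) | 5 | 5.2%
(not preserved) | 3 | 3.1%
(not preserved) | 3 | 3.1%
(not preserved) | 2 | 2.1%
(not preserved) | 2 | 2.1%
Other values (6) | 6 | 6.2%

ASCII
Value | Count | Frequency (%)
L | 15 | 25.4%
1 | 13 | 22.0%
R | 9 | 15.3%
5 | 4 | 6.8%
6 | 4 | 6.8%
2 | 4 | 6.8%
4 | 3 | 5.1%
3 | 3 | 5.1%
7 | 2 | 3.4%
9 | 1 | 1.7%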

노선번호
Categorical

HIGH CORRELATION 

Distinct: 12
Distinct (%): 40.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 8
Median length: 5
Mean length: 5.6
Min length: 5

Unique

Unique: 6
Unique (%): 20.0%

Sample

1st row: 광로1-1
2nd row: 광로1-1
3rd row: 광로3-10
4th row: 중로3-2
5th row: 중로3-2

Common Values

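Value | Count | Frequency (%)
대로1-1 | 12 | 40.0%
일반철도13-2 | 3 | 10.0%
대로2-4 | 3 | 10.0%
광로1-1 | 2 | 6.7%
중로3-2 | 2 | 6.7%
중로1-61 | 2 | 6.7%
광로3-10 | 1 | 3.3%
대로2-24 (with trailing whitespace) | 1 | 3.3%
대로1-1 (with trailing whitespace) | 1 | 3.3%
대로2-24 | 1 | 3.3%
Other values (2) | 2 | 6.7%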
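
The table above lists 대로1-1 and 대로2-24 twice each, while the word-level table later in this section reports merged counts of 13 and 2. One plausible reading, and it is only an assumption, is that two cells carry a single trailing whitespace character. The sketch below encodes that assumption and checks it against the reported Distinct (12) and Mean length (5.6) figures. The two "Other values" entries are taken to be 중로2-78 and 중로2-79, as listed in the word-level table.

```python
from collections import Counter

# Raw value counts as listed under "Common Values". The trailing space on
# two of the keys is an assumption made to explain the duplicated labels.
raw_counts = {
    "대로1-1": 12, "일반철도13-2": 3, "대로2-4": 3, "광로1-1": 2,
    "중로3-2": 2, "중로1-61": 2, "광로3-10": 1, "대로2-24 ": 1,
    "대로1-1 ": 1, "대로2-24": 1, "중로2-78": 1, "중로2-79": 1,
}

# Merge values that differ only by surrounding whitespace.
cleaned = Counter()
for value, count in raw_counts.items():
    cleaned[value.strip()] += count

# Count-weighted mean string length of the raw (unstripped) values.
n_rows = sum(raw_counts.values())
mean_len = sum(len(v) * c for v, c in raw_counts.items()) / n_rows

print(len(raw_counts), len(cleaned))            # 12 10
print(cleaned["대로1-1"], cleaned["대로2-24"])  # 13 2
print(round(mean_len, 1))                       # 5.6
```

If the assumption holds, stripping whitespace before profiling would reduce the number of distinct categories from 12 to 10.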

Length

[Figure: histogram of lengths of the category]

Value | Count | Frequency (%)
대로1-1 | 13 | 43.3%
일반철도13-2 | 3 | 10.0%
대로2-4 | 3 | 10.0%
광로1-1 | 2 | 6.7%
중로3-2 | 2 | 6.7%
중로1-61 | 2 | 6.7%
대로2-24 | 2 | 6.7%
광로3-10 | 1 | 3.3%
중로2-78 | 1 | 3.3%
중로2-79 | 1 | 3.3%

종류
Categorical

Distinct: 2
Distinct (%): 6.7%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 10
Median length: 10
Mean length: 8
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전용도로
2nd row: 자전거 보행겸용도로
3rd row: 자전거 보행겸용도로
4th row: 자전거 보행겸용도로
5th row: 자전거 보행겸용도로

Common Values

Value | Count | Frequency (%)
자전거 보행겸용도로 | 20 | 66.7%
전용도로 | 10 | 33.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
자전거 | 20 | 40.0%
보행겸용도로 | 20 | 40.0%
전용도로 | 10 | 20.0%

도로구간
Text

UNIQUE 

Distinct: 30
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 11
Median length: 10
Mean length: 8.6
Min length: 7

Characters and Unicode

Total characters: 258
Distinct characters: 28
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 30
Unique (%): 100.0%

Sample

1st row: 초지동 592-4
2nd row: 원곡동 809-5
3rd row: 원곡동 787-15
4th row: 신길동 1375-11
5th row: 신길동 225-3

Value | Count | Frequency (%)
원곡동 | 5 | 8.3%
성곡동 | 5 | 8.3%
고잔동 | 5 | 8.3%
선부동 | 4 | 6.7%
신길동 | 3 | 5.0%
부곡동 | 3 | 5.0%
초지동 | 2 | 3.3%
성포동 | 2 | 3.3%
1074 | 1 | 1.7%
1070-10 | 1 | 1.7%
Other values (29) | 29 | 48.3%

Most occurring characters

Value | Count | Frequency (%)
(not preserved) | 30 | 11.6%
(not preserved) | 30 | 11.6%
1 | 23 | 8.9%
- | 17 | 6.6%
6 | 17 | 6.6%
5 | 16 | 6.2%
(not preserved) | 13 | 5.0%
7 | 11 | 4.3%
3 | 11 | 4.3%
2 | 10 | 3.9%
Other values (18) | 80 | 31.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 120 | 46.5%
Other Letter | 91 | 35.3%
Space Separator | 30 | 11.6%
Dash Punctuation | 17 | 6.6%

Most frequent character per category

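Other Letter
Value | Count | Frequency (%)
(not preserved) | 30 | 33.0%
(not preserved) | 13 | 14.3%
(not preserved) | 7 | 7.7%
(not preserved) | 7 | 7.7%
(not preserved) | 5 | 5.5%
(not preserved) | 5 | 5.5%
(not preserved) | 5 | 5.5%
(not preserved) | 4 | 4.4%
(not preserved) | 3 | 3.3%
(not preserved) | 3 | 3.3%
Other values (6) | 9 | 9.9%

Decimal Number
Value | Count | Frequency (%)
1 | 23 | 19.2%
6 | 17 | 14.2%
5 | 16 | 13.3%
7 | 11 | 9.2%
3 | 11 | 9.2%
2 | 10 | 8.3%
4 | 9 | 7.5%
0 | 9 | 7.5%
8 | 7 | 5.8%
9 | 7 | 5.8%

Space Separator
Value | Count | Frequency (%)
(space) | 30 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 17 | 100.0%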

Most occurring scripts

Value | Count | Frequency (%)
Common | 167 | 64.7%
Hangul | 91 | 35.3%

Most frequent character per script

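Hangul
Value | Count | Frequency (%)
(not preserved) | 30 | 33.0%
(not preserved) | 13 | 14.3%
(not preserved) | 7 | 7.7%
(not preserved) | 7 | 7.7%
(not preserved) | 5 | 5.5%
(not preserved) | 5 | 5.5%
(not preserved) | 5 | 5.5%
(not preserved) | 4 | 4.4%
(not preserved) | 3 | 3.3%
(not preserved) | 3 | 3.3%
Other values (6) | 9 | 9.9%

Common
Value | Count | Frequency (%)
(space) | 30 | 18.0%
1 | 23 | 13.8%
- | 17 | 10.2%
6 | 17 | 10.2%
5 | 16 | 9.6%
7 | 11 | 6.6%
3 | 11 | 6.6%
2 | 10 | 6.0%
4 | 9 | 5.4%
0 | 9 | 5.4%
Other values (2) | 14 | 8.4%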

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 167 | 64.7%
Hangul | 91 | 35.3%

Most frequent character per block

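Hangul
Value | Count | Frequency (%)
(not preserved) | 30 | 33.0%
(not preserved) | 13 | 14.3%
(not preserved) | 7 | 7.7%
(not preserved) | 7 | 7.7%
(not preserved) | 5 | 5.5%
(not preserved) | 5 | 5.5%
(not preserved) | 5 | 5.5%
(not preserved) | 4 | 4.4%
(not preserved) | 3 | 3.3%
(not preserved) | 3 | 3.3%
Other values (6) | 9 | 9.9%

ASCII
Value | Count | Frequency (%)
(space) | 30 | 18.0%
1 | 23 | 13.8%
- | 17 | 10.2%
6 | 17 | 10.2%
5 | 16 | 9.6%
7 | 11 | 6.6%
3 | 11 | 6.6%
2 | 10 | 6.0%
4 | 9 | 5.4%
0 | 9 | 5.4%
Other values (2) | 14 | 8.4%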

총길이
Real number (ℝ)

Distinct: 27
Distinct (%): 90.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.726
Minimum: 0.11
Maximum: 2.07
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 0.11
5th percentile: 0.1245
Q1: 0.25
Median: 0.565
Q3: 1.1
95th percentile: 1.6425
Maximum: 2.07
Range: 1.96
Interquartile range (IQR): 0.85

Descriptive statistics

Standard deviation: 0.57058711
Coefficient of variation (CV): 0.78593266
Kurtosis: -0.59776884
Mean: 0.726
Median Absolute Deviation (MAD): 0.375
Skewness: 0.77668412
Sum: 21.78
Variance: 0.32556966
Monotonicity: Not monotonic
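
Several of the statistics reported for 총길이 follow directly from others. The sketch below is a consistency check under the usual definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared, sum = mean × n); the variable names are illustrative.

```python
# Values copied from the quantile and descriptive statistics for 총길이.
minimum, maximum = 0.11, 2.07
q1, q3 = 0.25, 1.1
mean, std = 0.726, 0.57058711
n = 30  # number of observations

value_range = maximum - minimum   # spread between the extremes
iqr = q3 - q1                     # spread of the middle half
cv = std / mean                   # relative dispersion
variance = std ** 2
total = mean * n

print(round(value_range, 2))  # 1.96
print(round(iqr, 2))          # 0.85
print(round(cv, 4))           # 0.7859
print(round(variance, 4))     # 0.3256
print(round(total, 2))        # 21.78
```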
[Figure: histogram with fixed size bins (bins=27)]

Common values

Value | Count | Frequency (%)
0.31 | 2 | 6.7%
0.32 | 2 | 6.7%
0.19 | 2 | 6.7%
1.35 | 1 | 3.3%
1.71 | 1 | 3.3%
0.39 | 1 | 3.3%
1.5 | 1 | 3.3%
0.57 | 1 | 3.3%
0.58 | 1 | 3.3%
1.51 | 1 | 3.3%
Other values (17) | 17 | 56.7%

Smallest values (ascending)

Value | Count | Frequency (%)
0.11 | 1 | 3.3%
0.12 | 1 | 3.3%
0.13 | 1 | 3.3%
0.17 | 1 | 3.3%
0.19 | 2 | 6.7%
0.2 | 1 | 3.3%
0.23 | 1 | 3.3%
0.31 | 2 | 6.7%
0.32 | 2 | 6.7%
0.39 | 1 | 3.3%

Largest values (descending)

Value | Count | Frequency (%)
2.07 | 1 | 3.3%
1.71 | 1 | 3.3%
1.56 | 1 | 3.3%
1.51 | 1 | 3.3%
1.5 | 1 | 3.3%
1.46 | 1 | 3.3%
1.35 | 1 | 3.3%
1.11 | 1 | 3.3%
1.07 | 1 | 3.3%
1.03 | 1 | 3.3%

자전거 도로 너비
Real number (ℝ)

HIGH CORRELATION 

Distinct: 7
Distinct (%): 23.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.95
Minimum: 1.3
Maximum: 4
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 1.3
5th percentile: 1.4
Q1: 1.4
Median: 1.75
Q3: 2
95th percentile: 3.5
Maximum: 4
Range: 2.7
Interquartile range (IQR): 0.6

Descriptive statistics

Standard deviation: 0.70209047
Coefficient of variation (CV): 0.3600464
Kurtosis: 2.0270929
Mean: 1.95
Median Absolute Deviation (MAD): 0.35
Skewness: 1.5561596
Sum: 58.5
Variance: 0.49293103
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=7)]
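
Common values

Value | Count | Frequency (%)
2.0 | 8 | 26.7%
1.4 | 8 | 26.7%
1.5 | 6 | 20.0%
2.5 | 4 | 13.3%
3.5 | 2 | 6.7%
4.0 | 1 | 3.3%
1.3 | 1 | 3.3%

Smallest values (ascending)

Value | Count | Frequency (%)
1.3 | 1 | 3.3%
1.4 | 8 | 26.7%
1.5 | 6 | 20.0%
2.0 | 8 | 26.7%
2.5 | 4 | 13.3%
3.5 | 2 | 6.7%
4.0 | 1 | 3.3%

Largest values (descending)

Value | Count | Frequency (%)
4.0 | 1 | 3.3%
3.5 | 2 | 6.7%
2.5 | 4 | 13.3%
2.0 | 8 | 26.7%
1.5 | 6 | 20.0%
1.4 | 8 | 26.7%
1.3 | 1 | 3.3%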
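
Because 자전거 도로 너비 has only seven distinct values, the full column can be rebuilt from its frequency table and the summary statistics recomputed. The sketch below does this with Python's standard `statistics` module. It assumes the reported variance is the sample variance (divisor n - 1), which is the convention under which the recomputed value matches the reported 0.49293103.

```python
import statistics

# Value -> count, copied from the frequency table for 자전거 도로 너비.
freq = {1.3: 1, 1.4: 8, 1.5: 6, 2.0: 8, 2.5: 4, 3.5: 2, 4.0: 1}

# Expand the table back into a sorted list of 30 observations.
values = [v for v, count in sorted(freq.items()) for _ in range(count)]

total = sum(values)
mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance (n - 1)
std = statistics.stdev(values)          # sample standard deviation

print(len(values))         # 30
print(round(total, 2))     # 58.5
print(round(mean, 2))      # 1.95
print(median)              # 1.75
print(round(variance, 6))  # 0.492931
print(round(std, 6))       # 0.70209
```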

Interactions

[Figures: four interaction plots]

Correlations

Variable | 노선명 | 노선번호 | 종류 | 도로구간 | 총길이 | 자전거 도로 너비
노선명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
노선번호 | 1.000 | 1.000 | 0.000 | 1.000 | 0.811 | 0.821
종류 | 1.000 | 0.000 | 1.000 | 1.000 | 0.000 | 0.253
도로구간 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
총길이 | 1.000 | 0.811 | 0.000 | 1.000 | 1.000 | 0.000
자전거 도로 너비 | 1.000 | 0.821 | 0.253 | 1.000 | 0.000 | 1.000

Variable | 종류 | 노선번호
종류 | 1.000 | 0.000
노선번호 | 0.000 | 1.000

Variable | 총길이 | 자전거 도로 너비 | 노선번호 | 종류
총길이 | 1.000 | 0.192 | 0.474 | 0.000
자전거 도로 너비 | 0.192 | 1.000 | 0.533 | 0.285
노선번호 | 0.474 | 0.533 | 1.000 | 0.000
종류 | 0.000 | 0.285 | 0.000 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

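First rows

Row | 노선명 | 노선번호 | 종류 | 도로구간 | 총길이 | 자전거 도로 너비
0 | 중앙대로12L | 광로1-1 | 전용도로 | 초지동 592-4 | 1.35 | 2.0
1 | 중앙대로13L | 광로1-1 | 자전거 보행겸용도로 | 원곡동 809-5 | 0.19 | 2.0
2 | 중앙대로14L | 광로3-10 | 자전거 보행겸용도로 | 원곡동 787-15 | 1.56 | 2.0
3 | 중앙대로15L | 중로3-2 | 자전거 보행겸용도로 | 신길동 1375-11 | 1.03 | 2.0
4 | 중앙대로16L | 중로3-2 | 자전거 보행겸용도로 | 신길동 225-3 | 0.17 | 2.0
5 | 삼일로1R | 대로2-24 | 자전거 보행겸용도로 | 신길동 1743 | 1.07 | 1.5
6 | 삼일로2R | 일반철도13-2 | 자전거 보행겸용도로 | 원곡동 963 | 0.56 | 1.5
7 | 삼일로3R | 일반철도13-2 | 자전거 보행겸용도로 | 선부동 1071 | 0.12 | 2.5
8 | 삼일로4R | 대로1-1 | 자전거 보행겸용도로 | 고잔동 612 | 1.46 | 3.5
9 | 삼일로5R | 대로1-1 | 자전거 보행겸용도로 | 성포동 593-39 | 0.31 | 4.0

Last rows

Row | 노선명 | 노선번호 | 종류 | 도로구간 | 총길이 | 자전거 도로 너비
20 | 삼일로7L | 대로1-1 | 자전거 보행겸용도로 | 선부동 산153-22 | 0.32 | 1.3
21 | 삼일로8L | 대로1-1 | 전용도로 | 원곡동 951-8 | 0.44 | 1.5
22 | 삼일로9L | 대로2-24 | 자전거 보행겸용도로 | 원곡동 1560 | 1.11 | 1.5
23 | 정왕천동로1 | 중로2-78 | 전용도로 | 정왕동 695 | 2.07 | 2.5
24 | 시화로1 | 중로2-79 | 전용도로 | 성곡동 765-11 | 1.51 | 2.5
25 | 시화로2R | 대로2-4 | 자전거 보행겸용도로 | 성곡동 664-8 | 0.58 | 2.0
26 | 시화로1L | 대로2-4 | 자전거 보행겸용도로 | 성곡동 657 | 0.57 | 2.0
27 | 진흥로1 | 대로2-4 | 전용도로 | 성곡동 667 | 1.5 | 2.5
28 | 신원로1R | 중로1-61 | 자전거 보행겸용도로 | 성곡동 627 | 0.32 | 2.0
29 | 신원로1 | 중로1-61 | 자전거 보행겸용도로 | 초지동 450-6 | 0.39 | 1.5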