Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
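
The average record size follows from the total size in memory divided by the number of observations. A minimal check of that arithmetic, assuming 1 KiB = 1024 bytes (the report does not state the unit convention explicitly):

```python
# Values copied from the "Dataset statistics" table.
total_kib = 488.3
n_rows = 10_000

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_kib * 1024 / n_rows
print(f"{avg_record_bytes:.1f} B")
```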

Variable types

Numeric: 2
Text: 1
Categorical: 2

Dataset

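Description: 부산광역시_도시공간정보시스템도로(그리드면)_20230522 (Busan Metropolitan City, Urban Spatial Information System roads, grid polygons, 2023-05-22)
Author: 부산광역시 (Busan Metropolitan City)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15081939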

Alerts

Imbalance: 보도차도구분 is highly imbalanced (50.1%)
Unique: 공간정보일련번호 has unique values
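
The 50.1% in the imbalance alert can be reproduced from the category counts listed for 보도차도구분 in this report (5444, 4554, 1, 1). The sketch below assumes the score is one minus the Shannon entropy normalized by the maximum entropy for the number of classes. That assumption is consistent with the reported figure, but the function name and code are illustrative rather than the library's own.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the base-2 Shannon entropy of the
    class frequencies and k is the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Counts for 보도차도구분 as listed in this report: 차도, 보도, AE002, AD910.
score = imbalance_score([5444, 4554, 1, 1])
print(f"{score:.1%}")
```

The two single-occurrence values are what push the score up: with four classes the maximum entropy is 2 bits, while the observed distribution carries barely 1 bit.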

Reproduction

Analysis started: 2023-12-10 16:04:54.731580
Analysis finished: 2023-12-10 16:04:55.976449
Duration: 1.24 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
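
A report of this kind can be regenerated with ydata-profiling. The following is a minimal sketch, not the exact configuration used here: the function name `build_report`, the file paths, the report title, and the `cp949` default encoding (common for Korean public CSV files) are all assumptions.

```python
def build_report(csv_path, html_path, encoding="cp949"):
    """Profile a CSV file and write the report as HTML."""
    # Imported lazily so this module loads even without the optional packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the source data is a CSV file in the given encoding.
    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Profiling report")
    report.to_file(html_path)
    return report
```

A call might look like `build_report("roads.csv", "report.html")`, where both file names are placeholders.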

Variables

공간정보일련번호
Real number (ℝ)

UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 52948.104
Minimum: 8
Maximum: 140178
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 8
5-th percentile: 3884.75
Q1: 24873.75
Median: 49047
Q3: 81281.75
95-th percentile: 117498.05
Maximum: 140178
Range: 140170
Interquartile range (IQR): 56408

Descriptive statistics

Standard deviation: 34454.611
Coefficient of variation (CV): 0.65072418
Kurtosis: -0.60868822
Mean: 52948.104
Median Absolute Deviation (MAD): 28653
Skewness: 0.43761778
Sum: 5.2948104 × 10^8
Variance: 1.1871202 × 10^9
Monotonicity: Not monotonic
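
Several of these figures are derived from the others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the CV is the standard deviation divided by the mean, the sum is the mean times the number of observations, and the variance is the square of the standard deviation. A short consistency check using the rounded values printed in this report (so small tolerances are needed):

```python
import math

# Figures copied from the 공간정보일련번호 tables in this report.
n = 10_000
mean, std = 52948.104, 34454.611
minimum, maximum = 8, 140178
q1, q3 = 24873.75, 81281.75

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
total = mean * n                  # Sum
variance = std ** 2               # Variance

assert value_range == 140170
assert iqr == 56408
assert math.isclose(cv, 0.65072418, rel_tol=1e-6)
assert math.isclose(total, 5.2948104e8, rel_tol=1e-7)
assert math.isclose(variance, 1.1871202e9, rel_tol=1e-6)
```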
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
2424 | 1 | < 0.1%
23039 | 1 | < 0.1%
88983 | 1 | < 0.1%
81261 | 1 | < 0.1%
121979 | 1 | < 0.1%
89449 | 1 | < 0.1%
6302 | 1 | < 0.1%
8970 | 1 | < 0.1%
30650 | 1 | < 0.1%
84564 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Minimum 10 values

Value | Count | Frequency (%)
8 | 1 | < 0.1%
9 | 1 | < 0.1%
26 | 1 | < 0.1%
34 | 1 | < 0.1%
65 | 1 | < 0.1%
70 | 1 | < 0.1%
75 | 1 | < 0.1%
81 | 1 | < 0.1%
86 | 1 | < 0.1%
91 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
140178 | 1 | < 0.1%
139876 | 1 | < 0.1%
139874 | 1 | < 0.1%
139870 | 1 | < 0.1%
139863 | 1 | < 0.1%
139858 | 1 | < 0.1%
139542 | 1 | < 0.1%
139540 | 1 | < 0.1%
139219 | 1 | < 0.1%
138954 | 1 | < 0.1%

도형아이디
Text

Distinct: 9919
Distinct (%): 99.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 13
Mean length: 12.9994
Min length: 11

Characters and Unicode

Total characters: 129994
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9899
Unique (%): 99.0%
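
"Distinct" and "Unique" measure different things: distinct counts how many different values occur, while unique counts the values that occur exactly once. With 9919 distinct and 9899 unique values, 20 identifiers appear more than once, and together they account for the remaining 10000 - 9899 = 101 rows. A small sketch of the two counts on toy data (the identifiers below are hypothetical, not taken from the dataset):

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique): the number of different values, and the
    number of values that occur exactly once."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Toy identifiers (hypothetical, not taken from the dataset).
ids = ["AE001", "AE001", "AD002", "AE003", "AD004", "AD004", "AD004"]
print(distinct_and_unique(ids))  # (4, 2)
```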

Sample

1st row: AE00112002321
2nd row: AE00116000908
3rd row: AD00105010363
4th row: AE00112031356
5th row: AE00108001337

Value | Count | Frequency (%)
ae00116000000 | 60 | 0.6%
ad00113004601 | 5 | < 0.1%
ad00110000014 | 2 | < 0.1%
ae00108002101 | 2 | < 0.1%
ae00114004005 | 2 | < 0.1%
ad00105007098 | 2 | < 0.1%
ae00110001569 | 2 | < 0.1%
ae00112014010 | 2 | < 0.1%
ad00106100112 | 2 | < 0.1%
ae00112001683 | 2 | < 0.1%
Other values (9909) | 9919 | 99.2%

Most occurring characters

Value | Count | Frequency (%)
0 | 48401 | 37.2%
1 | 23661 | 18.2%
A | 10000 | 7.7%
2 | 6314 | 4.9%
D | 5445 | 4.2%
5 | 5256 | 4.0%
6 | 5180 | 4.0%
3 | 4999 | 3.8%
4 | 4658 | 3.6%
E | 4555 | 3.5%
Other values (3) | 11525 | 8.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 109994 | 84.6%
Uppercase Letter | 20000 | 15.4%

Most frequent character per category

Decimal Number

Value | Count | Frequency (%)
0 | 48401 | 44.0%
1 | 23661 | 21.5%
2 | 6314 | 5.7%
5 | 5256 | 4.8%
6 | 5180 | 4.7%
3 | 4999 | 4.5%
4 | 4658 | 4.2%
9 | 4018 | 3.7%
7 | 3962 | 3.6%
8 | 3545 | 3.2%

Uppercase Letter

Value | Count | Frequency (%)
A | 10000 | 50.0%
D | 5445 | 27.2%
E | 4555 | 22.8%

Most occurring scripts

Value | Count | Frequency (%)
Common | 109994 | 84.6%
Latin | 20000 | 15.4%

Most frequent character per script

Common

Value | Count | Frequency (%)
0 | 48401 | 44.0%
1 | 23661 | 21.5%
2 | 6314 | 5.7%
5 | 5256 | 4.8%
6 | 5180 | 4.7%
3 | 4999 | 4.5%
4 | 4658 | 4.2%
9 | 4018 | 3.7%
7 | 3962 | 3.6%
8 | 3545 | 3.2%

Latin

Value | Count | Frequency (%)
A | 10000 | 50.0%
D | 5445 | 27.2%
E | 4555 | 22.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 129994 | 100.0%

Most frequent character per block

ASCII

Value | Count | Frequency (%)
0 | 48401 | 37.2%
1 | 23661 | 18.2%
A | 10000 | 7.7%
2 | 6314 | 4.9%
D | 5445 | 4.2%
5 | 5256 | 4.0%
6 | 5180 | 4.0%
3 | 4999 | 3.8%
4 | 4658 | 3.6%
E | 4555 | 3.5%
Other values (3) | 11525 | 8.9%

보도차도구분
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

차도: 5444
보도: 4554
AE002: 1
AD910: 1

Length

Max length: 5
Median length: 2
Mean length: 2.0006
Min length: 2

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 보도
2nd row: 보도
3rd row: 차도
4th row: 보도
5th row: 보도

Common Values

Value | Count | Frequency (%)
차도 | 5444 | 54.4%
보도 | 4554 | 45.5%
AE002 | 1 | < 0.1%
AD910 | 1 | < 0.1%
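
Two of the four categories, AE002 and AD910, occur once each and look like codes rather than labels. One way to surface such rows is to compare the column against the set of labels that are expected to be valid. The sketch below assumes that only 차도 and 보도 are valid, which the report itself does not confirm. The helper name `unexpected_values` is hypothetical.

```python
from collections import Counter

# Assumption: only these two labels are valid for 보도차도구분.
EXPECTED = {"차도", "보도"}

def unexpected_values(values, expected=EXPECTED):
    """Count the values that fall outside the expected label set."""
    return Counter(v for v in values if v not in expected)

# Small sample mixing valid labels with the two codes seen in the report.
sample = ["보도", "차도", "AE002", "보도", "AD910"]
print(unexpected_values(sample))
```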

Length

Histogram of lengths of the category

Common Values (Plot)


Value | Count | Frequency (%)
차도 | 5444 | 54.4%
보도 | 4554 | 45.5%
ae002 | 1 | < 0.1%
ad910 | 1 | < 0.1%

등록구
Categorical

Distinct: 16
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

강서구: 1460
기장군: 1099
부산진구: 1027
해운대구: 851
사상구: 784
Other values (11): 4779

Length

Max length: 4
Median length: 3
Mean length: 3.0089
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강서구
2nd row: 기장군
3rd row: 부산진구
4th row: 강서구
5th row: 북구

Common Values

Value | Count | Frequency (%)
강서구 | 1460 | 14.6%
기장군 | 1099 | 11.0%
부산진구 | 1027 | 10.3%
해운대구 | 851 | 8.5%
사상구 | 784 | 7.8%
사하구 | 739 | 7.4%
남구 | 567 | 5.7%
금정구 | 548 | 5.5%
동래구 | 500 | 5.0%
북구 | 431 | 4.3%
Other values (6) | 1994 | 19.9%

Length

Histogram of lengths of the category

도형면적(제곱미터)
Real number (ℝ)

Distinct: 6090
Distinct (%): 60.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 746.6734
Minimum: 0
Maximum: 83240.9
Zeros: 3
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 16.6
Q1: 84.3
Median: 227.75
Q3: 606.925
95-th percentile: 2706.605
Maximum: 83240.9
Range: 83240.9
Interquartile range (IQR): 522.625

Descriptive statistics

Standard deviation: 2449.0125
Coefficient of variation (CV): 3.2798979
Kurtosis: 338.94261
Mean: 746.6734
Median Absolute Deviation (MAD): 177.85
Skewness: 14.661944
Sum: 7466734
Variance: 5997662.3
Monotonicity: Not monotonic
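
The skewness of about 14.7 and the gap between the median (227.75) and the maximum (83240.9) point to a long right tail. One common way to quantify this is Tukey's fences, sketched below from the quartiles printed in this report. The 1.5 × IQR multiplier is a conventional choice assumed here, not something the report uses.

```python
# Quartiles copied from the quantile statistics in this report.
q1, q3 = 84.3, 606.925
iqr = q3 - q1

# Assumption: the conventional 1.5 x IQR multiplier.
k = 1.5
lower_fence = q1 - k * iqr
upper_fence = q3 + k * iqr

print(f"IQR = {iqr:.3f}")
print(f"upper fence = {upper_fence:.4f}")
```

The upper fence lies below the 95-th percentile (2706.605), so under this rule at least 5% of the rows would be flagged as high outliers. The lower fence is negative and therefore flags nothing, since the minimum is 0.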
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
16.0 | 9 | 0.1%
20.0 | 9 | 0.1%
28.0 | 9 | 0.1%
8.8 | 9 | 0.1%
15.4 | 8 | 0.1%
12.0 | 8 | 0.1%
29.7 | 8 | 0.1%
57.6 | 8 | 0.1%
64.8 | 8 | 0.1%
49.4 | 7 | 0.1%
Other values (6080) | 9917 | 99.2%

Minimum 10 values

Value | Count | Frequency (%)
0.0 | 3 | < 0.1%
0.1 | 1 | < 0.1%
0.4 | 1 | < 0.1%
0.5 | 1 | < 0.1%
0.6 | 1 | < 0.1%
0.7 | 1 | < 0.1%
0.9 | 1 | < 0.1%
1.0 | 1 | < 0.1%
1.1 | 1 | < 0.1%
1.2 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
83240.9 | 1 | < 0.1%
73709.3 | 1 | < 0.1%
61049.4 | 1 | < 0.1%
61004.2 | 1 | < 0.1%
51330.6 | 1 | < 0.1%
42311.5 | 1 | < 0.1%
37135.4 | 1 | < 0.1%
33877.8 | 1 | < 0.1%
32294.0 | 1 | < 0.1%
29708.9 | 1 | < 0.1%

Interactions

[Figure: interaction plots, 4 panels]

Correlations

 | 공간정보일련번호 | 보도차도구분 | 등록구 | 도형면적(제곱미터)
공간정보일련번호 | 1.000 | 0.177 | 0.678 | 0.047
보도차도구분 | 0.177 | 1.000 | 0.301 | 0.083
등록구 | 0.678 | 0.301 | 1.000 | 0.000
도형면적(제곱미터) | 0.047 | 0.083 | 0.000 | 1.000

 | 등록구 | 보도차도구분
등록구 | 1.000 | 0.145
보도차도구분 | 0.145 | 1.000

 | 공간정보일련번호 | 도형면적(제곱미터) | 보도차도구분 | 등록구
공간정보일련번호 | 1.000 | 0.049 | 0.107 | 0.344
도형면적(제곱미터) | 0.049 | 1.000 | 0.050 | 0.000
보도차도구분 | 0.107 | 0.050 | 1.000 | 0.145
등록구 | 0.344 | 0.000 | 0.145 | 1.000
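
The report does not name the coefficient behind each matrix. The matrix restricted to the two categorical variables (등록구 and 보도차도구분, 0.145) is consistent with a categorical association measure such as Cramér's V, but that is an assumption. The sketch below implements the plain, uncorrected form of Cramér's V to show what such a measure computes. A profiling library may apply a bias correction, so it is not expected to reproduce the 0.145 exactly.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Plain (uncorrected) Cramér's V for two equal-length label sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row = Counter(xs)
    col = Counter(ys)
    dof = min(len(row), len(col)) - 1
    if n == 0 or dof == 0:
        return 0.0
    chi2 = 0.0
    for r, r_count in row.items():
        for c, c_count in col.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * dof))

# Toy labels (hypothetical): a perfectly associated pair of columns.
print(cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"]))
```

The value ranges from 0 (no association) to 1 (each label of one variable determines the label of the other).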

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

row index and 공간정보일련번호 (run together) | 도형아이디 | 보도차도구분 | 등록구 | 도형면적(제곱미터)
22972424 | AE00112002321 | 보도 | 강서구 | 119.6
5395563674 | AE00116000908 | 보도 | 기장군 | 25.2
3534540045 | AD00105010363 | 차도 | 부산진구 | 52.0
4343449629 | AE00112031356 | 보도 | 강서구 | 31.6
4921458626 | AE00108001337 | 보도 | 북구 | 16.0
4404186629 | AE00105782046 | 보도 | 부산진구 | 33.5
5950464202 | AE00111006051 | 보도 | 금정구 | 89.1
4744355511 | AE00112000408 | 보도 | 강서구 | 876.9
6238289540 | AD00106000913 | 차도 | 동래구 | 176.7
3116238384 | AD00105005426 | 차도 | 부산진구 | 282.8

row index and 공간정보일련번호 (run together) | 도형아이디 | 보도차도구분 | 등록구 | 도형면적(제곱미터)
47724468 | AD00112001613 | 차도 | 강서구 | 149.6
1845720409 | AD00108001303 | 차도 | 북구 | 360.2
4268186365 | AE00106002031 | 보도 | 동래구 | 347.5
5691 | AE00112000038 | 보도 | 강서구 | 64.9
4391248689 | AD00112031302 | 차도 | 강서구 | 1153.1
2387326706 | AD00107001742 | 차도 | 남구 | 749.0
1093510302 | AD00115000220 | 차도 | 사상구 | 202.4
4124586032 | AD00106000695 | 차도 | 동래구 | 207.6
6333576060 | AE00110002025 | 보도 | 사하구 | 78.0
5536462485 | AD00110003501 | 차도 | 사하구 | 84.6