Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 2
Duplicate rows (%): < 0.1%
Total size in memory: 410.2 KiB
Average record size in memory: 42.0 B

Variable types

Numeric: 2
Categorical: 2

Dataset

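Description: Status data on street trees recorded in the Anyang City (안양시) spatial information system, including the management number (관리번호), street tree diameter (가로수직경), jurisdiction (관할지역), and planting date (식재일자).
URL: https://www.data.go.kr/data/15042420/fileData.do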

Alerts

Dataset has 2 (< 0.1%) duplicate rows [Duplicates]
관리번호 is highly overall correlated with 관할지역 and 1 other field [High correlation]
가로수직경 is highly overall correlated with 식재일자 [High correlation]
관할지역 is highly overall correlated with 관리번호 and 1 other field [High correlation]
식재일자 is highly overall correlated with 관리번호 and 2 other fields [High correlation]
식재일자 is highly imbalanced (68.3%) [Imbalance]
관리번호 has 1257 (12.6%) zeros [Zeros]
가로수직경 has 9865 (98.7%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-12 10:22:52.472191
Analysis finished: 2023-12-12 10:22:53.401631
Duration: 0.93 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
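
The reported duration can be checked against the two timestamps. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the report.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of the report.
started = datetime.fromisoformat("2023-12-12 10:22:52.472191")
finished = datetime.fromisoformat("2023-12-12 10:22:53.401631")

# Elapsed wall-clock time between start and finish, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```

Rounded to two decimals, the difference agrees with the Duration value listed above.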

Variables

관리번호
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 8708
Distinct (%): 87.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.6458729 × 10^9
Minimum: 0
Maximum: 2.1474836 × 10^9
Zeros: 1257
Zeros (%): 12.6%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2.0221215 × 10^9
Median: 2.0221273 × 10^9
Q3: 2.0221331 × 10^9
95-th percentile: 2.0221377 × 10^9
Maximum: 2.1474836 × 10^9
Range: 2.1474836 × 10^9
Interquartile range (IQR): 11608

Descriptive statistics

Standard deviation: 7.8760245 × 10^8
Coefficient of variation (CV): 0.47853176
Kurtosis: 0.59702654
Mean: 1.6458729 × 10^9
Median Absolute Deviation (MAD): 5803.5
Skewness: -1.611277
Sum: 1.6458729 × 10^13
Variance: 6.2031762 × 10^17
Monotonicity: Not monotonic
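
Several of these figures are tied together by standard definitions: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. The sketch below re-derives them from the rounded values printed in this section. It is only a consistency check, and small differences (for example in the IQR) are expected because the quartiles are shown to eight significant digits.

```python
# Values copied from the 관리번호 statistics in this section.
n = 10_000
mean = 1.6458729e9
std = 7.8760245e8
q1, q3 = 2.0221215e9, 2.0221331e9

cv = std / mean                      # coefficient of variation
variance = std ** 2                  # variance as squared standard deviation
total = mean * n                     # sum of all values
iqr_from_rounded_quartiles = q3 - q1 # approximate, quartiles are rounded

print(f"CV ≈ {cv:.8f}")
print(f"Variance ≈ {variance:.7e}")
print(f"Sum ≈ {total:.7e}")
print(f"IQR from rounded quartiles ≈ {iqr_from_rounded_quartiles:.0f}")
```

The IQR obtained from the rounded quartiles lands close to the reported 11608 but not exactly on it, which is consistent with rounding in the displayed quartiles rather than an error in the report.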
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
0                     1257    12.6%
2147483647            37      0.4%
2022126580            1       < 0.1%
2022128849            1       < 0.1%
2022120265            1       < 0.1%
2022132144            1       < 0.1%
2022138059            1       < 0.1%
2022136839            1       < 0.1%
2022124357            1       < 0.1%
2022120946            1       < 0.1%
Other values (8698)   8698    87.0%
Value   Count   Frequency (%)
0       1257    12.6%
1029    1       < 0.1%
1031    1       < 0.1%
1032    1       < 0.1%
1034    1       < 0.1%
1039    1       < 0.1%
1040    1       < 0.1%
1044    1       < 0.1%
1046    1       < 0.1%
1047    1       < 0.1%
Value        Count   Frequency (%)
2147483647   37      0.4%
2023040019   1       < 0.1%
2023040018   1       < 0.1%
2023040014   1       < 0.1%
2023040013   1       < 0.1%
2023040012   1       < 0.1%
2023040009   1       < 0.1%
2023040007   1       < 0.1%
2023040006   1       < 0.1%
2023040005   1       < 0.1%
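
Two values stand out in these tables: 0 (1257 rows) and 2147483647 (37 rows), the latter being the largest 32-bit signed integer, 2^31 - 1. Whether they are placeholders or the result of an overflow in the source system is not stated in the report, so the sketch below only counts them. The function name and the sample list are hypothetical.

```python
INT32_MAX = 2**31 - 1  # 2147483647, the largest 32-bit signed integer


def count_suspect_ids(ids):
    """Count IDs equal to 0 or to the 32-bit signed integer maximum."""
    return {
        "zero": sum(1 for v in ids if v == 0),
        "int32_max": sum(1 for v in ids if v == INT32_MAX),
    }


# Hypothetical sample mixing values that appear in the tables above.
sample_ids = [0, 0, 1029, 2022126580, 2147483647, 2023040019]
print(count_suspect_ids(sample_ids))
```

Applied to the full 관리번호 column, a check like this would be expected to return the 1257 and 37 occurrences listed in the tables.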

가로수직경
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 10
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.002456
Minimum: 0
Maximum: 0.5
Zeros: 9865
Zeros (%): 98.7%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 0.5
Range: 0.5
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.024773967
Coefficient of variation (CV): 10.08712
Kurtosis: 179.01574
Mean: 0.002456
Median Absolute Deviation (MAD): 0
Skewness: 12.651597
Sum: 24.56
Variance: 0.00061374944
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Value   Count   Frequency (%)
0.0     9865    98.7%
0.1     68      0.7%
0.4     18      0.2%
0.12    13      0.1%
0.2     12      0.1%
0.3     11      0.1%
0.15    5       0.1%
0.25    4       < 0.1%
0.35    3       < 0.1%
0.5     1       < 0.1%
Value   Count   Frequency (%)
0.0     9865    98.7%
0.1     68      0.7%
0.12    13      0.1%
0.15    5       0.1%
0.2     12      0.1%
0.25    4       < 0.1%
0.3     11      0.1%
0.35    3       < 0.1%
0.4     18      0.2%
0.5     1       < 0.1%
Value   Count   Frequency (%)
0.5     1       < 0.1%
0.4     18      0.2%
0.35    3       < 0.1%
0.3     11      0.1%
0.25    4       < 0.1%
0.2     12      0.1%
0.15    5       0.1%
0.12    13      0.1%
0.1     68      0.7%
0.0     9865    98.7%
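
Because 가로수직경 takes only ten distinct values, its summary statistics can be rebuilt directly from the frequency table. The sketch below does this with the standard library. Using n - 1 in the variance denominator is an assumption, chosen because it matches the variance printed in the descriptive statistics.

```python
import math

# (value, count) pairs copied from the 가로수직경 frequency table above.
freq = [(0.0, 9865), (0.1, 68), (0.12, 13), (0.15, 5), (0.2, 12),
        (0.25, 4), (0.3, 11), (0.35, 3), (0.4, 18), (0.5, 1)]

n = sum(c for _, c in freq)              # number of observations
total = sum(v * c for v, c in freq)      # sum of all values
mean = total / n
# Sample variance (n - 1 in the denominator), an assumption noted above.
variance = sum(c * (v - mean) ** 2 for v, c in freq) / (n - 1)
std = math.sqrt(variance)

print(n, round(total, 2), round(mean, 6))
print(f"variance ≈ {variance:.11f}, std ≈ {std:.9f}")
```

The recomputed sum, mean, variance, and standard deviation agree with the reported values to the displayed precision.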

관할지역
Categorical

HIGH CORRELATION 

Distinct: 32
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
안양시              1783
석수2동             937
부림동              572
관양2동             528
안양2동             463
Other values (27)   5717

Length

Max length: 7
Median length: 4
Mean length: 3.6188
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 평안동
2nd row: 안양7동
3rd row: 신촌동
4th row: 범계동
5th row: 석수1동

Common Values

Value               Count   Frequency (%)
안양시              1783    17.8%
석수2동             937     9.4%
부림동              572     5.7%
관양2동             528     5.3%
안양2동             463     4.6%
관양1동             459     4.6%
호계1동             425     4.2%
안양7동             391     3.9%
석수1동             372     3.7%
석수3동             350     3.5%
Other values (22)   3720    37.2%

Length

Histogram of lengths of the category
Value               Count   Frequency (%)
안양시              1783    17.8%
석수2동             937     9.3%
부림동              572     5.7%
관양2동             528     5.3%
안양2동             463     4.6%
관양1동             459     4.6%
호계1동             425     4.2%
안양7동             391     3.9%
석수1동             372     3.7%
석수3동             350     3.5%
Other values (23)   3744    37.4%

식재일자
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
2022-12-07   7950
2012-09-01   1932
2022-01-01   47
2016-12-06   40
2019-01-01   18

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022-12-07
2nd row: 2022-12-07
3rd row: 2022-12-07
4th row: 2022-12-07
5th row: 2022-12-07

Common Values

Value        Count   Frequency (%)
2022-12-07   7950    79.5%
2012-09-01   1932    19.3%
2022-01-01   47      0.5%
2016-12-06   40      0.4%
2019-01-01   18      0.2%
2017-12-30   13      0.1%
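
The alert for this variable reports an imbalance of 68.3%. One common way to express class imbalance is 1 minus the Shannon entropy of the class shares divided by its maximum, log(k) for k classes. Treating that as the definition behind the alert is an assumption here, not something the report states, but applied to the counts in the table it gives the same figure. The function name is illustrative.

```python
import math

# Counts copied from the 식재일자 "Common Values" table above.
counts = [7950, 1932, 47, 40, 18, 13]


def imbalance_score(counts):
    """Return 1 - H / log(k), with H the Shannon entropy of the class shares."""
    n = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no entropy to normalise
    entropy = -sum((c / n) * math.log(c / n) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)


score = imbalance_score(counts)
print(f"{score:.1%}")
```

A score of 0 would mean perfectly even classes and a score near 1 a single dominant class. Here two dates account for almost 99% of the rows, which is why the score is high.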

Length

Histogram of lengths of the category

Common Values (Plot)

Value        Count   Frequency (%)
2022-12-07   7950    79.5%
2012-09-01   1932    19.3%
2022-01-01   47      0.5%
2016-12-06   40      0.4%
2019-01-01   18      0.2%
2017-12-30   13      0.1%

Interactions


Correlations

             관리번호   가로수직경   관할지역   식재일자
관리번호     1.000      0.145        1.000      0.998
가로수직경   0.145      1.000        0.560      0.849
관할지역     1.000      0.560        1.000      0.865
식재일자     0.998      0.849        0.865      1.000
           관할지역   식재일자
관할지역   1.000      0.606
식재일자   0.606      1.000
             관리번호   가로수직경   관할지역   식재일자
관리번호     1.000      -0.014       0.981      0.956
가로수직경   -0.014     1.000        0.225      0.685
관할지역     0.981      0.225        1.000      0.606
식재일자     0.956      0.685        0.606      1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index   관리번호     가로수직경   관할지역   식재일자
1009    2022120946   0.0          평안동     2022-12-07
11702   2022132234   0.0          안양7동    2022-12-07
12833   2022133365   0.0          신촌동     2022-12-07
13437   2022134031   0.0          범계동     2022-12-07
6444    2022126408   0.0          석수1동    2022-12-07
17003   2022136589   0.0          호계1동    2022-12-07
5133    2022125071   0.0          안양4동    2022-12-07
22514   4372         0.0          안양시     2012-09-01
12428   2022133022   0.0          신촌동     2022-12-07
21868   0            0.0          안양시     2012-09-01
Index   관리번호     가로수직경   관할지역   식재일자
15255   2022135219   0.0          부림동     2022-12-07
10071   2022129657   0.0          석수3동    2022-12-07
8167    2022127752   0.0          석수2동    2022-12-07
5018    2022124956   0.0          안양4동    2022-12-07
19091   4701         0.0          안양시     2012-09-01
21679   0            0.0          안양시     2012-09-01
21807   0            0.0          안양시     2012-09-01
3223    2022123161   0.0          관양2동    2022-12-07
4102    2022124040   0.0          관양1동    2022-12-07
4621    2022125533   0.0          안양2동    2022-12-07

Duplicate rows

Most frequently occurring

Index   관리번호     가로수직경   관할지역   식재일자     # duplicates
0       0            0.0          안양시     2012-09-01   1257
1       2147483647   0.1          안양2동    2022-01-01   37
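
The overview lists 2 duplicate rows, while this table shows two distinct rows occurring 1257 and 37 times. That suggests the overview counts distinct duplicated rows rather than every repeated record, though the report does not say so explicitly. The sketch below shows one way to group identical rows with the standard library. The function name and the miniature dataset are hypothetical.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return {row: count} for every row that occurs more than once."""
    counts = Counter(rows)
    return {row: c for row, c in counts.items() if c > 1}


# Hypothetical miniature of the dataset.
# Each row is (관리번호, 가로수직경, 관할지역, 식재일자).
rows = [
    (0, 0.0, "안양시", "2012-09-01"),
    (0, 0.0, "안양시", "2012-09-01"),
    (2147483647, 0.1, "안양2동", "2022-01-01"),
    (2147483647, 0.1, "안양2동", "2022-01-01"),
    (2022120946, 0.0, "평안동", "2022-12-07"),
]

groups = duplicate_groups(rows)
print(len(groups))  # number of distinct duplicated rows
for row, count in groups.items():
    print(row, count)
```

Both duplicated rows carry the 관리번호 values 0 and 2147483647, so resolving those identifiers would likely remove the duplicates as well.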