Overview

Dataset statistics

Number of variables: 7
Number of observations: 796
Missing cells: 232
Missing cells (%): 4.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 46.0 KiB
Average record size in memory: 59.2 B
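
The percentage and per-record figures above follow from the raw counts. The snippet below is a minimal sketch of that arithmetic; it assumes the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes. The variable names are illustrative only.

```python
# Figures copied from the dataset statistics above.
n_variables = 7
n_observations = 796
missing_cells = 232
total_size_kib = 46.0

# Assumption: the percentage is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")                  # 5572
print(f"missing cells: {missing_pct:.1f}%")           # 4.2%
print(f"avg record size: {avg_record_bytes:.1f} B")   # 59.2 B
```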

Variable types

Numeric: 3
Categorical: 3
Text: 1

Dataset

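Description: Paved road status data from the spatial information system of Anyang City, Gyeonggi-do (paved road sequence number, usage type, length, main waypoints, designated area, and data reference date).
URL: https://www.data.go.kr/data/15042413/fileData.do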

Alerts

Constant: 데이터기준 has constant value "2023-07-18"
High correlation: 연 장 is highly overall correlated with 결정면적
High correlation: 결정면적 is highly overall correlated with 연 장
High correlation: 사용형태 is highly overall correlated with 기 능
High correlation: 기 능 is highly overall correlated with 사용형태
Imbalance: 사용형태 is highly imbalanced (82.0%)
Imbalance: 기 능 is highly imbalanced (73.7%)
Missing: 주요경과지 has 232 (29.1%) missing values
Unique: 순번 has unique values
Zeros: 연 장 has 46 (5.8%) zeros
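
The two imbalance percentages are consistent with one minus the normalized Shannon entropy of the category counts. The sketch below assumes that definition (the report itself does not state the formula) and uses the category counts listed for 사용형태 and 기 능 in this report; `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no entropy to normalize
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)

usage_type_counts = [758, 37, 1]          # 사용형태
function_counts = [722, 38, 14, 12, 10]   # 기 능

print(f"{imbalance_score(usage_type_counts):.1%}")  # 82.0%
print(f"{imbalance_score(function_counts):.1%}")    # 73.7%
```

A score of 0% would mean perfectly balanced classes; values near 100% mean a single class dominates.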

Reproduction

Analysis started: 2023-12-12 21:06:36.848735
Analysis finished: 2023-12-12 21:06:38.434839
Duration: 1.59 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
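
A report of this kind can be regenerated along the following lines. This is a sketch, not the configuration actually used: the CSV path, the `cp949` encoding, the report title, and the `build_report` helper are all assumptions, and the original `config.json` is not reproduced here.

```python
def build_report(csv_path, output_path="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report (paths are hypothetical)."""
    # Imported lazily so this module still loads where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the source file is a CSV in a Korean legacy encoding.
    df = pd.read_csv(csv_path, encoding=encoding)

    profile = ProfileReport(
        df,
        title="Anyang paved road status",  # hypothetical title
        dataset={
            "description": "Paved road status data, Anyang City, Gyeonggi-do",
            "url": "https://www.data.go.kr/data/15042413/fileData.do",
        },
    )
    profile.to_file(output_path)
    return output_path
```

Calling `build_report("paved_roads.csv")` (a hypothetical file name) would write `report.html` next to the script.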

Variables

순번
Real number (ℝ)

UNIQUE 

Distinct: 796
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 398.5
Minimum: 1
Maximum: 796
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 1
5th percentile: 40.75
Q1: 199.75
Median: 398.5
Q3: 597.25
95th percentile: 756.25
Maximum: 796
Range: 795
Interquartile range (IQR): 397.5
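
These quantiles match what linear interpolation between order statistics gives for the integers 1 through 796. The sketch below assumes 순번 takes exactly those values (consistent with 796 distinct values, minimum 1, maximum 796) and uses the standard library's inclusive quantile method as a stand-in for whatever the profiler uses internally.

```python
import statistics

values = list(range(1, 797))  # assumed contents of 순번: 1, 2, ..., 796

# method="inclusive" interpolates linearly between neighbouring data points.
vigintiles = statistics.quantiles(values, n=20, method="inclusive")
quartiles = statistics.quantiles(values, n=4, method="inclusive")

p5, p95 = vigintiles[0], vigintiles[-1]
q1, median, q3 = quartiles

print(p5, q1, median, q3, p95)  # 40.75 199.75 398.5 597.25 756.25
print(q3 - q1)                  # 397.5
```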

Descriptive statistics

Standard deviation: 229.9297
Coefficient of variation (CV): 0.57698795
Kurtosis: -1.2
Mean: 398.5
Median Absolute Deviation (MAD): 199
Skewness: 0
Sum: 317206
Variance: 52867.667
Monotonicity: Strictly increasing
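
Most of these descriptive statistics can be checked by hand, again assuming 순번 is the sequence 1 through 796. The sketch uses the sample variance (n − 1 denominator), which is an assumption that happens to reproduce the reported value.

```python
import statistics

values = list(range(1, 797))  # assumed contents of 순번

total = sum(values)
mean = statistics.fmean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std = statistics.stdev(values)
cv = std / mean                         # coefficient of variation

# Median absolute deviation: median distance from the median.
center = statistics.median(values)
mad = statistics.median(abs(v - center) for v in values)

print(total, mean, mad)      # 317206 398.5 199.0
print(round(variance, 3))    # 52867.667
print(round(std, 4))         # 229.9297
```

For a run of consecutive integers the sample variance reduces to n(n + 1)/12, which gives 796 × 797 / 12 ≈ 52867.667.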
[Figure: histogram with fixed-size bins (bins=50)]

Common values

Value | Count | Frequency (%)
1 | 1 | 0.1%
525 | 1 | 0.1%
527 | 1 | 0.1%
528 | 1 | 0.1%
529 | 1 | 0.1%
530 | 1 | 0.1%
531 | 1 | 0.1%
532 | 1 | 0.1%
533 | 1 | 0.1%
534 | 1 | 0.1%
Other values (786) | 786 | 98.7%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 1 | 0.1%
7 | 1 | 0.1%
8 | 1 | 0.1%
9 | 1 | 0.1%
10 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
796 | 1 | 0.1%
795 | 1 | 0.1%
794 | 1 | 0.1%
793 | 1 | 0.1%
792 | 1 | 0.1%
791 | 1 | 0.1%
790 | 1 | 0.1%
789 | 1 | 0.1%
788 | 1 | 0.1%
787 | 1 | 0.1%

사용형태
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 6.3 KiB
Category counts: 시도중 일반도로 (758), 미분류 (37), 지방도 (1)

Length

Max length: 8
Median length: 8
Mean length: 7.7613065
Min length: 3
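
The length statistics are a count-weighted summary of the three category labels. The sketch below assumes length is measured in Unicode code points (so the space in "시도중 일반도로" counts as one character) and takes the counts from the category frequencies reported for this variable.

```python
# Category counts for 사용형태, as reported for this variable.
category_counts = {"시도중 일반도로": 758, "미분류": 37, "지방도": 1}

n_rows = sum(category_counts.values())
total_length = sum(len(value) * count for value, count in category_counts.items())
mean_length = total_length / n_rows

lengths = [len(value) for value in category_counts]
print(n_rows, max(lengths), min(lengths))  # 796 8 3
print(f"{mean_length:.7f}")                # 7.7613065
```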

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 시도중 일반도로
2nd row: 시도중 일반도로
3rd row: 시도중 일반도로
4th row: 시도중 일반도로
5th row: 시도중 일반도로

Common Values

Value | Count | Frequency (%)
시도중 일반도로 | 758 | 95.2%
미분류 | 37 | 4.6%
지방도 | 1 | 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Words

Value | Count | Frequency (%)
시도중 | 758 | 48.8%
일반도로 | 758 | 48.8%
미분류 | 37 | 2.4%
지방도 | 1 | 0.1%

기 능
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 6.3 KiB
Category counts: 국지도로 (722), 미분류 (38), 보조간선도로 (14), 집산도로 (12), 주간선도로 (10)

Length

Max length: 6
Median length: 4
Mean length: 4
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 국지도로
2nd row: 국지도로
3rd row: 국지도로
4th row: 국지도로
5th row: 국지도로

Common Values

Value | Count | Frequency (%)
국지도로 | 722 | 90.7%
미분류 | 38 | 4.8%
보조간선도로 | 14 | 1.8%
집산도로 | 12 | 1.5%
주간선도로 | 10 | 1.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Words

Value | Count | Frequency (%)
국지도로 | 722 | 90.7%
미분류 | 38 | 4.8%
보조간선도로 | 14 | 1.8%
집산도로 | 12 | 1.5%
주간선도로 | 10 | 1.3%

연 장
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 662
Distinct (%): 83.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 336.85641
Minimum: 0
Maximum: 10530
Zeros: 46
Zeros (%): 5.8%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 131.975
Median: 226.05
Q3: 366
95th percentile: 846.125
Maximum: 10530
Range: 10530
Interquartile range (IQR): 234.025

Descriptive statistics

Standard deviation: 575.24818
Coefficient of variation (CV): 1.7076955
Kurtosis: 163.54086
Mean: 336.85641
Median Absolute Deviation (MAD): 107.1
Skewness: 10.919266
Sum: 268137.7
Variance: 330910.47
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0.0 | 46 | 5.8%
320.0 | 5 | 0.6%
210.0 | 4 | 0.5%
178.699996948242 | 4 | 0.5%
300.0 | 3 | 0.4%
244.0 | 3 | 0.4%
192.0 | 3 | 0.4%
120.0 | 3 | 0.4%
201.0 | 3 | 0.4%
131.0 | 3 | 0.4%
Other values (652) | 719 | 90.3%

Minimum 10 values

Value | Count | Frequency (%)
0.0 | 46 | 5.8%
1.0 | 1 | 0.1%
9.5 | 1 | 0.1%
16.7000007629395 | 1 | 0.1%
40.0 | 1 | 0.1%
42.0 | 1 | 0.1%
46.4000015258789 | 1 | 0.1%
46.7999992370605 | 1 | 0.1%
48.7999992370605 | 1 | 0.1%
49.0 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
10530.0 | 1 | 0.1%
7786.2998046875 | 1 | 0.1%
3895.39990234375 | 1 | 0.1%
3720.80004882813 | 1 | 0.1%
3704.0 | 1 | 0.1%
2697.60009765625 | 1 | 0.1%
2353.30004882813 | 1 | 0.1%
2142.5 | 1 | 0.1%
2109.0 | 1 | 0.1%
1939.40002441406 | 1 | 0.1%
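
Values such as 178.699996948242 and 16.7000007629395 look like one-decimal measurements that passed through single-precision (32-bit) floating point before being widened to double precision. That is an inference from the digits, not something the report states. The sketch below round-trips a few one-decimal values through float32 with the standard library; `as_float32` is a hypothetical helper.

```python
import struct

def as_float32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# One-decimal values assumed to be the originally recorded lengths.
for decimal_value in (178.7, 16.7, 46.4):
    print(decimal_value, "->", repr(as_float32(decimal_value)))
```

If the inference holds, rounding 연 장 to one decimal place before analysis would recover the recorded values.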

주요경과지
Text

MISSING 

Distinct: 520
Distinct (%): 92.2%
Missing: 232
Missing (%): 29.1%
Memory size: 6.3 KiB

Length

Max length: 18
Median length: 15
Mean length: 8.2251773
Min length: 3

Characters and Unicode

Total characters: 4639
Distinct characters: 327
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 479
Unique (%): 84.9%
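
The percentages for this variable use two different denominators: the missing share is taken over all 796 rows, while the distinct and unique shares, and the mean length, are consistent with using only the 564 non-missing rows. The sketch below reproduces the reported figures under that assumption.

```python
# Figures copied from the 주요경과지 summary above.
n_observations = 796
missing = 232
distinct = 520
unique = 479
total_characters = 4639

non_missing = n_observations - missing

print(non_missing)                              # 564
print(f"{missing / n_observations:.1%}")        # 29.1%
print(f"{distinct / non_missing:.1%}")          # 92.2%
print(f"{unique / non_missing:.1%}")            # 84.9%
print(f"{total_characters / non_missing:.7f}")  # 8.2251773
```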

Sample

1st row: 연현초교,LG아파트,석수전철아파트
2nd row: 국민연금관리공단,관양아파트
3rd row: 안양공고
4th row: 관일청도체육관
5th row: 성원아파트

Words

Value | Count | Frequency (%)
안양청과시장 | 3 | 0.5%
성원아파트 | 3 | 0.5%
삼성아파트 | 3 | 0.5%
호계공원 | 2 | 0.4%
샛별한양아파트 | 2 | 0.4%
양지초교 | 2 | 0.4%
벽산아파트,신한아파트 | 2 | 0.4%
삼익아파트 | 2 | 0.4%
대림주택 | 2 | 0.4%
귀인중학교 | 2 | 0.4%
Other values (510) | 541 | 95.9%

Most occurring characters

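Value | Count | Frequency (%)
[unrecovered] | 263 | 5.7%
[unrecovered] | 258 | 5.6%
, | 250 | 5.4%
[unrecovered] | 249 | 5.4%
[unrecovered] | 112 | 2.4%
[unrecovered] | 108 | 2.3%
[unrecovered] | 98 | 2.1%
[unrecovered] | 96 | 2.1%
[unrecovered] | 87 | 1.9%
[unrecovered] | 86 | 1.9%
Other values (317) | 3032 | 65.4%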

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4302 | 92.7%
Other Punctuation | 251 | 5.4%
Decimal Number | 58 | 1.3%
Other Symbol | 13 | 0.3%
Uppercase Letter | 11 | 0.2%
Open Punctuation | 2 | < 0.1%
Close Punctuation | 2 | < 0.1%

Most frequent character per category

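Other Letter
Value | Count | Frequency (%)
[unrecovered] | 263 | 6.1%
[unrecovered] | 258 | 6.0%
[unrecovered] | 249 | 5.8%
[unrecovered] | 112 | 2.6%
[unrecovered] | 108 | 2.5%
[unrecovered] | 98 | 2.3%
[unrecovered] | 96 | 2.2%
[unrecovered] | 87 | 2.0%
[unrecovered] | 86 | 2.0%
[unrecovered] | 80 | 1.9%
Other values (298) | 2865 | 66.6%

Decimal Number
Value | Count | Frequency (%)
2 | 18 | 31.0%
1 | 14 | 24.1%
3 | 8 | 13.8%
5 | 5 | 8.6%
7 | 4 | 6.9%
8 | 3 | 5.2%
9 | 3 | 5.2%
4 | 2 | 3.4%
6 | 1 | 1.7%

Uppercase Letter
Value | Count | Frequency (%)
L | 4 | 36.4%
G | 4 | 36.4%
K | 1 | 9.1%
S | 1 | 9.1%
P | 1 | 9.1%

Other Punctuation
Value | Count | Frequency (%)
, | 250 | 99.6%
. | 1 | 0.4%

Other Symbol
Value | Count | Frequency (%)
[unrecovered] | 13 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 2 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 2 | 100.0%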

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4315 | 93.0%
Common | 313 | 6.7%
Latin | 11 | 0.2%

Most frequent character per script

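Hangul
Value | Count | Frequency (%)
[unrecovered] | 263 | 6.1%
[unrecovered] | 258 | 6.0%
[unrecovered] | 249 | 5.8%
[unrecovered] | 112 | 2.6%
[unrecovered] | 108 | 2.5%
[unrecovered] | 98 | 2.3%
[unrecovered] | 96 | 2.2%
[unrecovered] | 87 | 2.0%
[unrecovered] | 86 | 2.0%
[unrecovered] | 80 | 1.9%
Other values (299) | 2878 | 66.7%

Common
Value | Count | Frequency (%)
, | 250 | 79.9%
2 | 18 | 5.8%
1 | 14 | 4.5%
3 | 8 | 2.6%
5 | 5 | 1.6%
7 | 4 | 1.3%
8 | 3 | 1.0%
9 | 3 | 1.0%
( | 2 | 0.6%
4 | 2 | 0.6%
Other values (3) | 4 | 1.3%

Latin
Value | Count | Frequency (%)
L | 4 | 36.4%
G | 4 | 36.4%
K | 1 | 9.1%
S | 1 | 9.1%
P | 1 | 9.1%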

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4302 | 92.7%
ASCII | 324 | 7.0%
None | 13 | 0.3%

Most frequent character per block

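Hangul
Value | Count | Frequency (%)
[unrecovered] | 263 | 6.1%
[unrecovered] | 258 | 6.0%
[unrecovered] | 249 | 5.8%
[unrecovered] | 112 | 2.6%
[unrecovered] | 108 | 2.5%
[unrecovered] | 98 | 2.3%
[unrecovered] | 96 | 2.2%
[unrecovered] | 87 | 2.0%
[unrecovered] | 86 | 2.0%
[unrecovered] | 80 | 1.9%
Other values (298) | 2865 | 66.6%

ASCII
Value | Count | Frequency (%)
, | 250 | 77.2%
2 | 18 | 5.6%
1 | 14 | 4.3%
3 | 8 | 2.5%
5 | 5 | 1.5%
L | 4 | 1.2%
G | 4 | 1.2%
7 | 4 | 1.2%
8 | 3 | 0.9%
9 | 3 | 0.9%
Other values (8) | 11 | 3.4%

None
Value | Count | Frequency (%)
[unrecovered] | 13 | 100.0%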

결정면적
Real number (ℝ)

HIGH CORRELATION 

Distinct: 707
Distinct (%): 88.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9939.0386
Minimum: 0
Maximum: 514419.4
Zeros: 2
Zeros (%): 0.3%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 386.775
Q1: 1020.3
Median: 2212.6
Q3: 6095.8
95th percentile: 32717.3
Maximum: 514419.4
Range: 514419.4
Interquartile range (IQR): 5075.5

Descriptive statistics

Standard deviation: 37649.925
Coefficient of variation (CV): 3.7880852
Kurtosis: 100.73189
Mean: 9939.0386
Median Absolute Deviation (MAD): 1521.75
Skewness: 9.2113965
Sum: 7911474.7
Variance: 1.4175169 × 10^9
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=50)]

Common values

Value | Count | Frequency (%)
9018.5 | 7 | 0.9%
12076.4 | 7 | 0.9%
15525.0 | 5 | 0.6%
38255.4 | 4 | 0.5%
7429.7 | 4 | 0.5%
9774.5 | 3 | 0.4%
65831.8 | 3 | 0.4%
5211.9 | 3 | 0.4%
20303.0 | 3 | 0.4%
6121.3 | 3 | 0.4%
Other values (697) | 754 | 94.7%

Minimum 10 values

Value | Count | Frequency (%)
0.0 | 2 | 0.3%
87.5 | 1 | 0.1%
112.4 | 1 | 0.1%
130.0 | 1 | 0.1%
144.3 | 1 | 0.1%
174.6 | 2 | 0.3%
184.3 | 1 | 0.1%
208.1 | 1 | 0.1%
209.8 | 1 | 0.1%
211.6 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
514419.4 | 1 | 0.1%
482293.6 | 1 | 0.1%
438024.4 | 1 | 0.1%
272466.2 | 1 | 0.1%
235242.0 | 1 | 0.1%
229945.8 | 2 | 0.3%
224489.0 | 1 | 0.1%
186635.2 | 1 | 0.1%
148215.6 | 1 | 0.1%
135555.5 | 1 | 0.1%

데이터기준
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.3 KiB
Category counts: 2023-07-18 (796)

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-07-18
2nd row: 2023-07-18
3rd row: 2023-07-18
4th row: 2023-07-18
5th row: 2023-07-18

Common Values

Value | Count | Frequency (%)
2023-07-18 | 796 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Words

Value | Count | Frequency (%)
2023-07-18 | 796 | 100.0%

Interactions

[Figures: 9 interaction plots]

Correlations

Variable | 순번 | 사용형태 | 기 능 | 연 장 | 결정면적
순번 | 1.000 | 0.044 | 0.492 | 0.157 | 0.156
사용형태 | 0.044 | 1.000 | 0.695 | 0.000 | 0.000
기 능 | 0.492 | 0.695 | 1.000 | 0.454 | 0.520
연 장 | 0.157 | 0.000 | 0.454 | 1.000 | 0.820
결정면적 | 0.156 | 0.000 | 0.520 | 0.820 | 1.000

Variable | 기 능 | 사용형태
기 능 | 1.000 | 0.675
사용형태 | 0.675 | 1.000

Variable | 순번 | 연 장 | 결정면적 | 사용형태 | 기 능
순번 | 1.000 | -0.051 | -0.063 | 0.025 | 0.225
연 장 | -0.051 | 1.000 | 0.551 | 0.000 | 0.328
결정면적 | -0.063 | 0.551 | 1.000 | 0.000 | 0.352
사용형태 | 0.025 | 0.000 | 0.000 | 1.000 | 0.675
기 능 | 0.225 | 0.328 | 0.352 | 0.675 | 1.000

Missing values

[Figure: a simple visualization of nullity by column.]
[Figure: nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

First rows

Index | 순번 | 사용형태 | 기 능 | 연 장 | 주요경과지 | 결정면적 | 데이터기준
0 | 1 | 시도중 일반도로 | 국지도로 | 1742.400024 | 연현초교,LG아파트,석수전철아파트 | 16245.9 | 2023-07-18
1 | 2 | 시도중 일반도로 | 국지도로 | 481.799988 | <NA> | 2281.6 | 2023-07-18
2 | 3 | 시도중 일반도로 | 국지도로 | 138.800003 | <NA> | 1099.6 | 2023-07-18
3 | 4 | 시도중 일반도로 | 국지도로 | 188.300003 | 국민연금관리공단,관양아파트 | 1505.5 | 2023-07-18
4 | 5 | 시도중 일반도로 | 국지도로 | 258.0 | 안양공고 | 1403.9 | 2023-07-18
5 | 6 | 미분류 | 미분류 | 0.0 | <NA> | 903.1 | 2023-07-18
6 | 7 | 시도중 일반도로 | 국지도로 | 186.399994 | <NA> | 688.8 | 2023-07-18
7 | 8 | 시도중 일반도로 | 국지도로 | 109.900002 | 관일청도체육관 | 7429.7 | 2023-07-18
8 | 9 | 시도중 일반도로 | 국지도로 | 112.0 | <NA> | 1293.8 | 2023-07-18
9 | 10 | 시도중 일반도로 | 국지도로 | 213.600006 | 성원아파트 | 6121.3 | 2023-07-18

Last rows

Index | 순번 | 사용형태 | 기 능 | 연 장 | 주요경과지 | 결정면적 | 데이터기준
786 | 787 | 시도중 일반도로 | 국지도로 | 609.099976 | <NA> | 5590.3 | 2023-07-18
787 | 788 | 시도중 일반도로 | 국지도로 | 532.5 | 태영아파트,신기중학교 | 6704.1 | 2023-07-18
788 | 789 | 시도중 일반도로 | 국지도로 | 209.0 | 무궁화코롱아파트,신기초교 | 2009.7 | 2023-07-18
789 | 790 | 시도중 일반도로 | 국지도로 | 154.199997 | 평촌반석교회 | 1175.0 | 2023-07-18
790 | 791 | 시도중 일반도로 | 국지도로 | 179.699997 | 협성골드프라자 | 2116.2 | 2023-07-18
791 | 792 | 시도중 일반도로 | 국지도로 | 236.0 | <NA> | 1366.7 | 2023-07-18
792 | 793 | 시도중 일반도로 | 국지도로 | 316.200012 | 평촌롯데아파트,평촌초교 | 2043.8 | 2023-07-18
793 | 794 | 미분류 | 미분류 | 0.0 | 양지초교 | 7130.6 | 2023-07-18
794 | 795 | 시도중 일반도로 | 국지도로 | 321.399994 | 효성아파트 | 2026.5 | 2023-07-18
795 | 796 | 시도중 일반도로 | 국지도로 | 147.300003 | 남서울안양의원 | 667.4 | 2023-07-18