Overview

Dataset statistics

Number of variables: 4
Number of observations: 863
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 28.8 KiB
Average record size in memory: 34.2 B
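The two memory figures above are consistent with each other: dividing the total size by the number of observations gives the average record size. A minimal arithmetic check, assuming the report uses binary units (1 KiB = 1024 bytes) and working only from the rounded values shown:

```python
# Values copied from the dataset statistics above (rounded as reported).
n_observations = 863
total_size_kib = 28.8

# Assumption: 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B per record")
```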

Variable types

Text: 1
Categorical: 1
Numeric: 2

Dataset

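Description: Current status of urban areas from the Urban Planning Information System (UPIS) of Pyeongtaek-si, Gyeonggi-do. It provides fields such as the current-status geometry management number, label name, area (geometry), and area (length). ※ Inquiries: Pyeongtaek-si Urban Planning Division (031-8024-3923)
URL: https://www.data.go.kr/data/15116818/fileData.do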

Alerts

면적_도형 is highly overall correlated with 길이_도형 (High correlation)
길이_도형 is highly overall correlated with 면적_도형 (High correlation)
현황도형 관리번호 has unique values (Unique)
면적_도형 has unique values (Unique)
길이_도형 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 08:41:34.107100
Analysis finished: 2023-12-12 08:41:34.913047
Duration: 0.81 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
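A report like this one can be regenerated from the source file with ydata-profiling. The sketch below is illustrative only: `build_profile`, the CSV path, the report title, and the `encoding` default are assumptions that do not come from this report, while `ProfileReport` and `to_file` are the library's documented entry points.

```python
def build_profile(csv_path, output_html="report.html", encoding="utf-8"):
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imports live inside the function so the sketch can be loaded even
    # where pandas or ydata-profiling are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the dataset was downloaded as a CSV file; the correct
    # encoding depends on the file and may differ from the default here.
    df = pd.read_csv(csv_path, encoding=encoding)

    # The title is a hypothetical label, not the one used for this report.
    profile = ProfileReport(df, title="Pyeongtaek UPIS urban areas")
    profile.to_file(output_html)
    return profile
```

Calling `build_profile("urban_areas.csv")` with a real path would write `report.html` next to the script; the file name here is a placeholder.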

Variables

현황도형 관리번호
Text

Distinct: 863
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 6.9 KiB

Length

Max length: 24
Median length: 24
Mean length: 24
Min length: 24

Characters and Unicode

Total characters: 20712
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
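As a small illustration of how such character properties can be read, the sketch below counts Unicode general categories for the five identifiers listed in this variable's sample rows, using Python's standard `unicodedata` module. It covers categories only; the standard library does not expose script or block names, so those parts of the report are not reproduced here.

```python
import unicodedata
from collections import Counter

# The five sample identifiers shown for this variable.
sample_ids = [
    "41220UQ111PS202012240001",
    "41220UQ111PS202012240002",
    "41220UQ111PS202012240003",
    "41220UQ111PS202012240004",
    "41220UQ111PS202012240005",
]

# unicodedata.category returns a two-letter code per character:
# "Nd" is Decimal Number and "Lu" is Uppercase Letter.
category_counts = Counter(
    unicodedata.category(ch) for value in sample_ids for ch in value
)

for code, count in category_counts.most_common():
    print(code, count)
```

Each 24-character identifier contributes 20 digits and 4 uppercase letters, which matches the 83.3% / 16.7% split reported for the full column.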

Unique

Unique: 863
Unique (%): 100.0%

Sample

1st row: 41220UQ111PS202012240001
2nd row: 41220UQ111PS202012240002
3rd row: 41220UQ111PS202012240003
4th row: 41220UQ111PS202012240004
5th row: 41220UQ111PS202012240005
Value                     Count  Frequency (%)
41220uq111ps202012240001      1  0.1%
41220uq111ps202012240595      1  0.1%
41220uq111ps202012240570      1  0.1%
41220uq111ps202012240571      1  0.1%
41220uq111ps202012240572      1  0.1%
41220uq111ps202012240573      1  0.1%
41220uq111ps202012240574      1  0.1%
41220uq111ps202012240575      1  0.1%
41220uq111ps202012240576      1  0.1%
41220uq111ps202012240577      1  0.1%
Other values (853)          853  98.8%

Most occurring characters

Value             Count  Frequency (%)
2                  5448  26.3%
1                  4585  22.1%
0                  3737  18.0%
4                  2003  9.7%
U                   863  4.2%
Q                   863  4.2%
P                   863  4.2%
S                   863  4.2%
3                   283  1.4%
5                   281  1.4%
Other values (4)    923  4.5%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    17260  83.3%
Uppercase Letter   3452  16.7%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
2       5448  31.6%
1       4585  26.6%
0       3737  21.7%
4       2003  11.6%
3        283  1.6%
5        281  1.6%
6        267  1.5%
7        267  1.5%
8        224  1.3%
9        165  1.0%

Uppercase Letter
Value  Count  Frequency (%)
U        863  25.0%
Q        863  25.0%
P        863  25.0%
S        863  25.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  17260  83.3%
Latin    3452  16.7%

Most frequent character per script

Common
Value  Count  Frequency (%)
2       5448  31.6%
1       4585  26.6%
0       3737  21.7%
4       2003  11.6%
3        283  1.6%
5        281  1.6%
6        267  1.5%
7        267  1.5%
8        224  1.3%
9        165  1.0%

Latin
Value  Count  Frequency (%)
U        863  25.0%
Q        863  25.0%
P        863  25.0%
S        863  25.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  20712  100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
2                  5448  26.3%
1                  4585  22.1%
0                  3737  18.0%
4                  2003  9.7%
U                   863  4.2%
Q                   863  4.2%
P                   863  4.2%
S                   863  4.2%
3                   283  1.4%
5                   281  1.4%
Other values (4)    923  4.5%

라벨명
Categorical

Distinct: 16
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 6.9 KiB

자연녹지지역: 209
제1종일반주거지역: 171
제2종일반주거지역: 145
준주거지역: 103
생산녹지지역: 63
Other values (11): 172

Length

Max length: 9
Median length: 6
Mean length: 7.1170336
Min length: 5

Unique

Unique: 3
Unique (%): 0.3%

Sample

1st row: 제2종일반주거지역
2nd row: 제2종일반주거지역
3rd row: 제2종일반주거지역
4th row: 제2종일반주거지역
5th row: 제2종일반주거지역

Common Values

Value              Count  Frequency (%)
자연녹지지역         209  24.2%
제1종일반주거지역    171  19.8%
제2종일반주거지역    145  16.8%
준주거지역           103  11.9%
생산녹지지역          63  7.3%
제3종일반주거지역     42  4.9%
일반공업지역          36  4.2%
일반상업지역          32  3.7%
준공업지역            28  3.2%
보전녹지지역          13  1.5%
Other values (6)      21  2.4%
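The frequency column above is each count divided by the 863 observations, rounded to one decimal place. A minimal sketch of that calculation, with the counts copied from the table (the dictionary and variable names are illustrative):

```python
# Counts copied from the Common Values table above.
counts = {
    "자연녹지지역": 209,
    "제1종일반주거지역": 171,
    "제2종일반주거지역": 145,
    "준주거지역": 103,
    "생산녹지지역": 63,
    "제3종일반주거지역": 42,
    "일반공업지역": 36,
    "일반상업지역": 32,
    "준공업지역": 28,
    "보전녹지지역": 13,
    "Other values (6)": 21,
}

n_observations = sum(counts.values())

# Share of each label as a percentage, rounded like the report.
shares = {
    label: round(100 * count / n_observations, 1)
    for label, count in counts.items()
}

for label, share in shares.items():
    print(f"{label}: {share}%")
```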

Length

Histogram of lengths of the category

면적_도형
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 863
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 175251.59
Minimum: 0.00062379
Maximum: 17135672
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.7 KiB

Quantile statistics

Minimum: 0.00062379
5-th percentile: 80.787907
Q1: 6790.1948
Median: 28578.329
Q3: 94173.422
95-th percentile: 646887.43
Maximum: 17135672
Range: 17135672
Interquartile range (IQR): 87383.227

Descriptive statistics

Standard deviation: 797945.96
Coefficient of variation (CV): 4.5531454
Kurtosis: 260.05541
Mean: 175251.59
Median Absolute Deviation (MAD): 26532.215
Skewness: 14.168848
Sum: 1.5124212 × 10^8
Variance: 6.3671775 × 10^11
Monotonicity: Not monotonic
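Several of the figures above follow directly from others: the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. A short check using the rounded values reported for 면적_도형 (small differences in the last digits are expected because the inputs are already rounded):

```python
# Rounded values reported for 면적_도형.
n_observations = 863
q1, q3 = 6790.1948, 94173.422
mean_value = 175251.59
std_dev = 797945.96

iqr = q3 - q1                      # interquartile range
cv = std_dev / mean_value          # coefficient of variation
variance = std_dev ** 2            # variance from the standard deviation
total = mean_value * n_observations  # sum from the mean

print(f"IQR      = {iqr:.3f}")
print(f"CV       = {cv:.5f}")
print(f"Variance = {variance:.4e}")
print(f"Sum      = {total:.4e}")
```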
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
22796.93316             1  0.1%
118.7645049             1  0.1%
1610.547832             1  0.1%
1036.14041              1  0.1%
488.0377852             1  0.1%
344.7213347             1  0.1%
198.0320707             1  0.1%
188.1549319             1  0.1%
181.5707586             1  0.1%
149.3372378             1  0.1%
Other values (853)    853  98.8%

Minimum 10 values
Value       Count  Frequency (%)
0.00062379      1  0.1%
0.00409985      1  0.1%
0.08954026      1  0.1%
0.44233379      1  0.1%
0.47469364      1  0.1%
0.49905944      1  0.1%
1.48226147      1  0.1%
2.35869559      1  0.1%
2.40780724      1  0.1%
2.42514635      1  0.1%

Maximum 10 values
Value        Count  Frequency (%)
17135672.3       1  0.1%
8064824.209      1  0.1%
7495123.339      1  0.1%
5637704.269      1  0.1%
4831192.365      1  0.1%
3551334.691      1  0.1%
3434773.066      1  0.1%
3248637.376      1  0.1%
2771449.962      1  0.1%
1929693.593      1  0.1%

길이_도형
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 863
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2072.1566
Minimum: 1.140294
Maximum: 100150.84
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.7 KiB

Quantile statistics

Minimum: 1.140294
5-th percentile: 66.387734
Q1: 446.44705
Median: 859.08856
Q3: 1720.9076
95-th percentile: 6239.1779
Maximum: 100150.84
Range: 100149.7
Interquartile range (IQR): 1274.4605

Descriptive statistics

Standard deviation: 5935.5058
Coefficient of variation (CV): 2.8644099
Kurtosis: 156.08061
Mean: 2072.1566
Median Absolute Deviation (MAD): 541.2972
Skewness: 11.064574
Sum: 1788271.1
Variance: 35230229
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
754.1616902             1  0.1%
79.75554419             1  0.1%
228.4629438             1  0.1%
163.1247802             1  0.1%
148.8922778             1  0.1%
100.9288425             1  0.1%
170.9656323             1  0.1%
68.60455531             1  0.1%
115.3721287             1  0.1%
90.31358124             1  0.1%
Other values (853)    853  98.8%

Minimum 10 values
Value        Count  Frequency (%)
1.14029403       1  0.1%
1.4138465        1  0.1%
3.41128048       1  0.1%
4.80161014       1  0.1%
9.09383607       1  0.1%
12.17417362      1  0.1%
12.59539671      1  0.1%
12.92418819      1  0.1%
12.92419974      1  0.1%
16.17218686      1  0.1%

Maximum 10 values
Value        Count  Frequency (%)
100150.8359      1  0.1%
91126.11245      1  0.1%
54613.87252      1  0.1%
46217.38655      1  0.1%
27974.18614      1  0.1%
27581.25479      1  0.1%
26380.24888      1  0.1%
21896.35398      1  0.1%
20780.00336      1  0.1%
20025.31693      1  0.1%

Interactions


Correlations

              라벨명  면적_도형  길이_도형
라벨명         1.000      0.119      0.000
면적_도형      0.119      1.000      0.933
길이_도형      0.000      0.933      1.000

           면적_도형  길이_도형     라벨명
면적_도형      1.000      0.954      0.057
길이_도형      0.954      1.000      0.000
라벨명         0.057      0.000      1.000
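The text of this report does not say which coefficient each matrix uses. As one possible illustration of how the relationship between 면적_도형 and 길이_도형 could be measured, the sketch below computes a Spearman rank correlation (an assumption, not necessarily the report's method) on the ten last rows listed in the Sample section. The split between the two numeric columns in those rows is an interpretation of the fused values, and a ten-row sample will not reproduce the full-data coefficients above.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, assuming all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = mean(x), mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson applied to the ranks."""
    return pearson(ranks(x), ranks(y))


# Last ten sample rows (면적_도형, 길이_도형), as read from the Sample section.
area = [42729.62713, 47812.58326, 525267.3772, 1105725.005, 1929693.593,
        17135672.3, 389526.3725, 448605.2313, 298813.9084, 73940.2403]
length = [830.396631, 971.750965, 4289.052493, 9949.437894, 21896.35398,
          100150.8359, 3806.917079, 9906.8153, 2351.573939, 1656.035747]

rho = spearman(area, length)
print(f"Spearman rho on the ten-row sample: {rho:.3f}")
```

Ranks are used here because both columns are heavily right-skewed (skewness above 11), so a rank-based measure is less dominated by the few very large polygons than a raw Pearson coefficient would be.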

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
     현황도형 관리번호         라벨명             면적_도형    길이_도형
0    41220UQ111PS202012240001  제2종일반주거지역  22796.93316  754.16169
1    41220UQ111PS202012240002  제2종일반주거지역  20217.12518  566.813604
2    41220UQ111PS202012240003  제2종일반주거지역  18890.02184  560.491656
3    41220UQ111PS202012240004  제2종일반주거지역  18074.32075  569.136365
4    41220UQ111PS202012240005  제2종일반주거지역  15650.10897  600.490083
5    41220UQ111PS202012240006  제2종일반주거지역  14323.69915  626.011689
6    41220UQ111PS202012240007  제2종일반주거지역  14041.42856  490.360542
7    41220UQ111PS202012240008  제2종일반주거지역  13986.57319  547.873526
8    41220UQ111PS202012240009  제2종일반주거지역  9882.51958   423.444494
9    41220UQ111PS202012240010  제2종일반주거지역  9097.660712  385.065524

Last rows
     현황도형 관리번호         라벨명             면적_도형    길이_도형
853  41220UQ111PS202012240854  자연녹지지역       42729.62713  830.396631
854  41220UQ111PS202012240855  자연녹지지역       47812.58326  971.750965
855  41220UQ111PS202012240856  제2종일반주거지역  525267.3772  4289.052493
856  41220UQ111PS202012240857  제3종일반주거지역  1105725.005  9949.437894
857  41220UQ111PS202304250003  일반공업지역       1929693.593  21896.35398
858  41220UQ111PS202304250004  자연녹지지역       17135672.3   100150.8359
859  41220UQ111PS202304250005  준공업지역         389526.3725  3806.917079
860  41220UQ111PS202304250006  자연녹지지역       448605.2313  9906.8153
861  41220UQ111PS202304250007  제2종일반주거지역  298813.9084  2351.573939
862  41220UQ111PS202304250008  준주거지역         73940.2403   1656.035747