Overview

Dataset statistics

Number of variables: 4
Number of observations: 155
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.3 KiB
Average record size in memory: 34.9 B

Variable types

Text: 1
Categorical: 1
Numeric: 2

Dataset

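Description: Status of transportation facilities from the Urban Planning Information System (UPIS) of Cheonan-si, Chungcheongnam-do. It provides fields such as the status-shape management number (현황도형 관리번호) and the label name (라벨명).
Author: Cheonan-si, Chungcheongnam-do
URL: https://www.data.go.kr/data/15123192/fileData.do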

Alerts

면적_도형 is highly overall correlated with 길이_도형 and 1 other field [High correlation]
길이_도형 is highly overall correlated with 면적_도형 and 1 other field [High correlation]
라벨명 is highly overall correlated with 면적_도형 and 1 other field [High correlation]
라벨명 is highly imbalanced (60.4%) [Imbalance]
현황도형 관리번호 has unique values [Unique]
면적_도형 has unique values [Unique]
길이_도형 has unique values [Unique]
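
The 60.4% imbalance figure for 라벨명 can be reproduced from the category counts in the Common Values table of this report. The sketch below assumes the score is one minus the Shannon entropy of the class distribution divided by its maximum, log2(k). That definition is an assumption about the profiler rather than something this report states, and the function name `imbalance_score` is hypothetical.

```python
import math

# Category counts for 라벨명, copied from the Common Values table of this report.
label_counts = {
    "노외주차장": 121,
    "기타주차장시설": 21,
    "일반철도": 8,
    "여객자동차터미널": 2,
    "고속철도": 1,
    "화물터미널": 1,
    "공영차고지": 1,
}


def imbalance_score(counts):
    """Assumed definition: 1 - H / H_max, where H is the Shannon entropy.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts)
    max_entropy = math.log2(len(counts))
    # With a single class the ratio is undefined; return 0.0 by convention here.
    return 1.0 - entropy / max_entropy if max_entropy > 0 else 0.0


score = imbalance_score(label_counts.values())
print(f"{score:.1%}")
```

Under this assumed definition the result rounds to 60.4%, which matches the alert.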

Reproduction

Analysis started: 2023-12-12 14:19:01.055217
Analysis finished: 2023-12-12 14:19:01.596714
Duration: 0.54 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

현황도형 관리번호 (status-shape management number)
Text

UNIQUE

Distinct: 155
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 24
Median length: 24
Mean length: 24
Min length: 24

Characters and Unicode

Total characters: 3720
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
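
As an illustration of this kind of analysis, the sketch below counts Unicode general categories for the five sample identifiers listed in this report, using Python's standard `unicodedata` module. It covers general categories only, since the standard library does not expose script or block properties, and it is not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# The five sample identifiers shown in the Sample list of this variable.
sample_ids = [
    "44130UQ152PS199711200007",
    "44130UQ152PS200002120002",
    "44130UQ152PS200002120003",
    "44130UQ152PS200002120004",
    "44130UQ152PS200002120001",
]

# General category of every character: "Nd" is Decimal Number,
# "Lu" is Uppercase Letter.
category_counts = Counter(
    unicodedata.category(ch) for value in sample_ids for ch in value
)

for category, count in category_counts.most_common():
    print(category, count)
```

Each identifier holds 20 digits and 4 uppercase letters, so the full column of 155 identifiers gives the 3100 Decimal Number and 620 Uppercase Letter characters reported in this section.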

Unique

Unique: 155
Unique (%): 100.0%

Sample

1st row: 44130UQ152PS199711200007
2nd row: 44130UQ152PS200002120002
3rd row: 44130UQ152PS200002120003
4th row: 44130UQ152PS200002120004
5th row: 44130UQ152PS200002120001

Value                     Count  Frequency (%)
44130uq152ps199711200007  1      0.6%
44130uq152ps201512010419  1      0.6%
44130uq152ps200511300007  1      0.6%
44130uq152ps202111260030  1      0.6%
44130uq152ps202111260031  1      0.6%
44130uq152ps202202030009  1      0.6%
44130uq152ps201512010424  1      0.6%
44130uq152ps201512010421  1      0.6%
44130uq152ps201907010093  1      0.6%
44130uq152ps200812010963  1      0.6%
Other values (145)        145    93.5%

Most occurring characters

Value             Count  Frequency (%)
0                 935    25.1%
1                 606    16.3%
2                 473    12.7%
4                 365    9.8%
3                 235    6.3%
5                 212    5.7%
U                 155    4.2%
Q                 155    4.2%
P                 155    4.2%
S                 155    4.2%
Other values (4)  274    7.4%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    3100   83.3%
Uppercase Letter  620    16.7%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
0      935    30.2%
1      606    19.5%
2      473    15.3%
4      365    11.8%
3      235    7.6%
5      212    6.8%
9      98     3.2%
6      71     2.3%
8      53     1.7%
7      52     1.7%

Uppercase Letter

Value  Count  Frequency (%)
U      155    25.0%
Q      155    25.0%
P      155    25.0%
S      155    25.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  3100   83.3%
Latin   620    16.7%

Most frequent character per script

Common

Value  Count  Frequency (%)
0      935    30.2%
1      606    19.5%
2      473    15.3%
4      365    11.8%
3      235    7.6%
5      212    6.8%
9      98     3.2%
6      71     2.3%
8      53     1.7%
7      52     1.7%

Latin

Value  Count  Frequency (%)
U      155    25.0%
Q      155    25.0%
P      155    25.0%
S      155    25.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  3720   100.0%

Most frequent character per block

ASCII

Value             Count  Frequency (%)
0                 935    25.1%
1                 606    16.3%
2                 473    12.7%
4                 365    9.8%
3                 235    6.3%
5                 212    5.7%
U                 155    4.2%
Q                 155    4.2%
P                 155    4.2%
S                 155    4.2%
Other values (4)  274    7.4%

라벨명 (label name)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 7
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB
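
Value                                          Count
노외주차장 (off-street parking lot)             121
기타주차장시설 (other parking facilities)       21
일반철도 (conventional railway)                 8
여객자동차터미널 (passenger bus terminal)       2
고속철도 (high-speed railway)                   1
Other values (2)                               2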

Length

Max length: 8
Median length: 5
Mean length: 5.2516129
Min length: 4

Unique

Unique: 3
Unique (%): 1.9%

Sample

1st row: 노외주차장
2nd row: 노외주차장
3rd row: 노외주차장
4th row: 노외주차장
5th row: 노외주차장

Common Values

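Value                                          Count  Frequency (%)
노외주차장 (off-street parking lot)             121    78.1%
기타주차장시설 (other parking facilities)       21     13.5%
일반철도 (conventional railway)                 8      5.2%
여객자동차터미널 (passenger bus terminal)       2      1.3%
고속철도 (high-speed railway)                   1      0.6%
화물터미널 (freight terminal)                   1      0.6%
공영차고지 (public vehicle depot)               1      0.6%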
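
The Distinct, Unique and Mean length figures reported for 라벨명 follow directly from these counts. The sketch below recomputes them, taking "distinct" to mean the number of different values and "unique" to mean the number of values that occur exactly once. That reading is consistent with the figures in this report; the variable names are illustrative.

```python
# Category counts for 라벨명, copied from the Common Values table.
label_counts = {
    "노외주차장": 121,
    "기타주차장시설": 21,
    "일반철도": 8,
    "여객자동차터미널": 2,
    "고속철도": 1,
    "화물터미널": 1,
    "공영차고지": 1,
}

n_rows = sum(label_counts.values())

# Distinct: how many different values appear at all.
distinct = len(label_counts)
# Unique: how many values appear exactly once.
unique = sum(1 for count in label_counts.values() if count == 1)
# Mean length: string length in code points, weighted by frequency.
mean_length = sum(len(v) * c for v, c in label_counts.items()) / n_rows

print(f"rows={n_rows}")
print(f"distinct={distinct} ({distinct / n_rows:.1%})")
print(f"unique={unique} ({unique / n_rows:.1%})")
print(f"mean length={mean_length:.7f}")
```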

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values of 라벨명; the counts are those of the Common Values table]

면적_도형 (shape area)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 155
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11592.613
Minimum: 205.06788
Maximum: 918031.86
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 205.06788
5-th percentile: 531.30886
Q1: 924.28729
Median: 1551.5542
Q3: 3093.4962
95-th percentile: 39962.438
Maximum: 918031.86
Range: 917826.8
Interquartile range (IQR): 2169.2089

Descriptive statistics

Standard deviation: 75028.752
Coefficient of variation (CV): 6.4721175
Kurtosis: 140.82331
Mean: 11592.613
Median Absolute Deviation (MAD): 817.90015
Skewness: 11.639163
Sum: 1796855
Variance: 5.6293136 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
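
Several of the figures above are derived from others, which gives a quick consistency check. The sketch below recomputes the range, IQR, coefficient of variation, variance and sum of 면적_도형 from the reported minimum, maximum, quartiles, mean and standard deviation. The inputs are the rounded values printed in this report, so small rounding differences are expected.

```python
# Values reported for 면적_도형 in this report (already rounded).
n = 155
mean = 11592.613
std = 75028.752
minimum = 205.06788
maximum = 918031.86
q1 = 924.28729
q3 = 3093.4962

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance is the squared standard deviation
total = mean * n                  # Sum is mean times number of observations

print(f"range    = {value_range:.1f}")
print(f"IQR      = {iqr:.4f}")
print(f"CV       = {cv:.4f}")
print(f"variance = {variance:.4e}")
print(f"sum      = {total:.0f}")
```

The variance of roughly 5.63 × 10^9 confirms the order of magnitude of the reported figure, and the mean lying far above the median (1551.5542) is in line with the strong positive skewness.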

Common values

Value               Count  Frequency (%)
1500.04218          1      0.6%
546.006217          1      0.6%
599.2636465         1      0.6%
6623.191904         1      0.6%
729.032416          1      0.6%
2100.016927         1      0.6%
1116.989031         1      0.6%
1257.021221         1      0.6%
1394.006347         1      0.6%
54443.29616         1      0.6%
Other values (145)  145    93.5%

Minimum 10 values

Value        Count  Frequency (%)
205.0678761  1      0.6%
259.321001   1      0.6%
278.170616   1      0.6%
301.8323432  1      0.6%
400.6189921  1      0.6%
473.1253556  1      0.6%
480.0330795  1      0.6%
517.002567   1      0.6%
537.4401224  1      0.6%
546.006217   1      0.6%

Maximum 10 values

Value        Count  Frequency (%)
918031.8648  1      0.6%
127126.0531  1      0.6%
96350.71522  1      0.6%
64886.24689  1      0.6%
60995.77242  1      0.6%
54443.29616  1      0.6%
53979.13605  1      0.6%
52648.58773  1      0.6%
34525.51701  1      0.6%
19454.21426  1      0.6%

길이_도형 (shape length)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 155
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 717.63239
Minimum: 68.143778
Maximum: 54168.668
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 68.143778
5-th percentile: 96.809458
Q1: 128.08603
Median: 165.67032
Q3: 241.06912
95-th percentile: 976.12032
Maximum: 54168.668
Range: 54100.524
Interquartile range (IQR): 112.98309

Descriptive statistics

Standard deviation: 4510.3566
Coefficient of variation (CV): 6.2850516
Kurtosis: 130.77704
Mean: 717.63239
Median Absolute Deviation (MAD): 52.32198
Skewness: 11.163389
Sum: 111233.02
Variance: 20343317
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
155.9555156         1      0.6%
94.00119774         1      0.6%
103.0237305         1      0.6%
358.2945907         1      0.6%
113.3483395         1      0.6%
186.0702976         1      0.6%
149.3626121         1      0.6%
139.6382394         1      0.6%
147.8438546         1      0.6%
1311.315632         1      0.6%
Other values (145)  145    93.5%

Minimum 10 values

Value        Count  Frequency (%)
68.14377838  1      0.6%
70.08310991  1      0.6%
71.03378736  1      0.6%
78.85053257  1      0.6%
87.05374469  1      0.6%
88.5559152   1      0.6%
94.00119774  1      0.6%
96.38988239  1      0.6%
96.98927656  1      0.6%
97.2776722   1      0.6%

Maximum 10 values

Value        Count  Frequency (%)
54168.6677   1      0.6%
14889.87647  1      0.6%
6282.547843  1      0.6%
1687.782823  1      0.6%
1398.45518   1      0.6%
1311.315632  1      0.6%
1154.555497  1      0.6%
1036.676002  1      0.6%
950.167892   1      0.6%
891.5958062  1      0.6%
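
The gap between the quartiles and the largest values of 길이_도형 explains the high skewness and kurtosis. As a rough illustration, the sketch below applies Tukey's 1.5 × IQR rule to the ten largest values listed for this variable. The rule is a common convention that this report itself does not use, so it is only one possible way to flag the long tail.

```python
# Quartiles reported for 길이_도형.
q1 = 128.08603
q3 = 241.06912
iqr = q3 - q1

# Tukey's fences: values beyond 1.5 * IQR from the quartiles are flagged.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# The ten largest values listed for 길이_도형 in this report.
largest_values = [
    54168.6677, 14889.87647, 6282.547843, 1687.782823, 1398.45518,
    1311.315632, 1154.555497, 1036.676002, 950.167892, 891.5958062,
]

flagged = [v for v in largest_values if v > upper_fence]
print(f"upper fence = {upper_fence:.2f}")
print(f"{len(flagged)} of {len(largest_values)} largest values exceed it")
```

The lower fence is negative, so none of the smallest values are flagged by this rule; the spread is entirely on the upper side.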

Interactions

[Figures: interaction plots between the numeric variables 면적_도형 and 길이_도형]

Correlations

           라벨명   면적_도형  길이_도형
라벨명     1.000   0.655     0.746
면적_도형  0.655   1.000     0.800
길이_도형  0.746   0.800     1.000

           면적_도형  길이_도형  라벨명
면적_도형  1.000     0.968     0.551
길이_도형  0.968     1.000     0.618
라벨명     0.551     0.618     1.000
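
To make the notion of a correlation coefficient concrete, the sketch below computes the Pearson correlation between 면적_도형 and 길이_도형 for the first ten rows shown in the Sample section of this report. It uses only those ten rows, and the report does not say which coefficient each matrix above contains, so the result is not expected to match either matrix. The helper name `pearson` is illustrative.

```python
import math

# First ten rows from the Sample section: (면적_도형, 길이_도형).
rows = [
    (1500.04218, 155.955516),
    (1490.893627, 169.01455),
    (1754.043494, 185.787167),
    (824.525696, 112.160168),
    (1467.094056, 164.608955),
    (473.125356, 88.555915),
    (2432.983195, 249.680747),
    (4784.318099, 423.417648),
    (688.969226, 138.884961),
    (753.136203, 112.589507),
]


def pearson(pairs):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)


r = pearson(rows)
print(f"Pearson r on the 10 sample rows: {r:.3f}")
```

On these rows area and length move together closely, which is consistent with the High correlation alerts raised for both variables.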

Missing values

[Figure: count plot] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

     현황도형 관리번호           라벨명      면적_도형     길이_도형
0    44130UQ152PS199711200007  노외주차장  1500.04218   155.955516
1    44130UQ152PS200002120002  노외주차장  1490.893627  169.01455
2    44130UQ152PS200002120003  노외주차장  1754.043494  185.787167
3    44130UQ152PS200002120004  노외주차장  824.525696   112.160168
4    44130UQ152PS200002120001  노외주차장  1467.094056  164.608955
5    44130UQ152PS200704200003  노외주차장  473.125356   88.555915
6    44130UQ152PS200104020005  노외주차장  2432.983195  249.680747
7    44130UQ152PS201008020006  노외주차장  4784.318099  423.417648
8    44130UQ152PS200704200004  노외주차장  688.969226   138.884961
9    44130UQ152PS200508220004  노외주차장  753.136203   112.589507

Last rows

     현황도형 관리번호           라벨명      면적_도형     길이_도형
145  44130UQ152PS199303060036  노외주차장  301.832343   70.08311
146  44130UQ152PS199412070023  노외주차장  1999.959258  176.892847
147  44130UQ152PS199412070024  노외주차장  1999.980141  174.837115
148  44130UQ152PS199412070025  노외주차장  1999.977344  177.628351
149  44130UQ152PS199412070007  노외주차장  2999.967712  221.147496
150  44130UQ152PS199412070008  노외주차장  1500.036313  156.05128
151  44130UQ152PS199412070009  노외주차장  1499.514976  156.190611
152  44130UQ152PS199606150029  노외주차장  2000.032739  181.171982
153  44130UQ152PS199506190012  노외주차장  2086.674673  181.05577
154  44130UQ152PS199711200006  노외주차장  1499.962866  156.902524