Overview

Dataset statistics

Number of variables: 4
Number of observations: 3360
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 111.7 KiB
Average record size in memory: 34.0 B
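
The average record size follows from the total memory size and the number of observations. A minimal sketch of that arithmetic, using only the figures reported above (the variable names are illustrative and not part of the report):

```python
# Figures copied from the "Dataset statistics" block of this report.
total_kib = 111.7   # total size in memory, in KiB
n_rows = 3360       # number of observations

total_bytes = total_kib * 1024            # 1 KiB = 1024 bytes
bytes_per_record = total_bytes / n_rows   # average record size in bytes

print(f"{bytes_per_record:.1f} B per record")
```

The result agrees with the reported 34.0 B once rounded to one decimal place.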

Variable types

Text: 1
Categorical: 1
Numeric: 2

Dataset

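Description: Road status data from the Urban Planning Information System (UPIS) of Cheonan-si, Chungcheongnam-do. It provides fields such as the status-geometry management number (현황도형 관리번호) and the label name (라벨명).
Author: Chungcheongnam-do (충청남도)
URL: https://alldam.chungnam.go.kr/index.chungnam?menuCd=DOM_000000201001001001&st=&cds=&orgCd=&apiType=&isOpen=Y&pageIndex=14&beforeMenuCd=DOM_000000201001001000&publicdatapk=15123194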

Alerts

면적_도형 (shape area) is highly overall correlated with 길이_도형 (shape length): High correlation
길이_도형 is highly overall correlated with 면적_도형: High correlation
면적_도형 is highly skewed (γ1 = 30.67468636): Skewed
현황도형 관리번호 has unique values: Unique
면적_도형 has unique values: Unique

Reproduction

Analysis started: 2024-01-09 22:49:00.050252
Analysis finished: 2024-01-09 22:49:00.653035
Duration: 0.6 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

현황도형 관리번호
Text

Distinct: 3360
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 26.4 KiB

Length

Max length: 24
Median length: 24
Mean length: 24
Min length: 24

Characters and Unicode

Total characters: 80640
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
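
As a rough illustration of how such character properties can be read, the sketch below uses Python's standard `unicodedata` module to count general categories for the five sample identifiers listed in this report. It is not the profiler's own implementation. The standard library exposes the general category (`Nd`, `Lu`, and so on) but not the script or block, so only the category figures are reproduced. Scaling the per-identifier counts to all 3360 rows assumes every identifier has the same layout of 20 digits and 4 letters, which matches the category totals reported below.

```python
import unicodedata
from collections import Counter

# The five sample identifiers shown in this report.
sample_ids = [
    "44130UQ151PS198706120134",
    "44130UQ151PS199005090004",
    "44130UQ151PS199303060878",
    "44130UQ151PS200101180426",
    "44130UQ151PS200812011162",
]


def category_counts(text):
    """Count Unicode general categories (e.g. 'Nd', 'Lu') in a string."""
    return Counter(unicodedata.category(ch) for ch in text)


per_id = [dict(category_counts(s)) for s in sample_ids]

# Assumption: all 3360 identifiers share the layout of the samples.
n_rows = 3360
digits_total = per_id[0]["Nd"] * n_rows    # Decimal Number
letters_total = per_id[0]["Lu"] * n_rows   # Uppercase Letter

print(per_id[0])
print(digits_total, letters_total)
```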

Unique

Unique: 3360
Unique (%): 100.0%

Sample

1st row: 44130UQ151PS198706120134
2nd row: 44130UQ151PS199005090004
3rd row: 44130UQ151PS199303060878
4th row: 44130UQ151PS200101180426
5th row: 44130UQ151PS200812011162
Value                     Count  Frequency (%)
44130uq151ps198706120134      1  < 0.1%
44130uq151ps200002120059      1  < 0.1%
44130uq151ps200101181509      1  < 0.1%
44130uq151ps202106210007      1  < 0.1%
44130uq151ps200101181510      1  < 0.1%
44130uq151ps200101181513      1  < 0.1%
44130uq151ps200101181521      1  < 0.1%
44130uq151ps200101181669      1  < 0.1%
44130uq151ps200101181469      1  < 0.1%
44130uq151ps202103220019      1  < 0.1%
Other values (3350)        3350  99.7%

Most occurring characters

Value             Count  Frequency (%)
1                 17561  21.8%
0                 16655  20.7%
4                  8057  10.0%
2                  6344  7.9%
3                  5212  6.5%
5                  5066  6.3%
U                  3360  4.2%
Q                  3360  4.2%
P                  3360  4.2%
S                  3360  4.2%
Other values (4)   8305  10.3%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    67200  83.3%
Uppercase Letter  13440  16.7%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
1      17561  26.1%
0      16655  24.8%
4       8057  12.0%
2       6344  9.4%
3       5212  7.8%
5       5066  7.5%
9       2630  3.9%
6       2184  3.2%
8       1901  2.8%
7       1590  2.4%

Uppercase Letter

Value  Count  Frequency (%)
U       3360  25.0%
Q       3360  25.0%
P       3360  25.0%
S       3360  25.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  67200  83.3%
Latin   13440  16.7%

Most frequent character per script

Common

Value  Count  Frequency (%)
1      17561  26.1%
0      16655  24.8%
4       8057  12.0%
2       6344  9.4%
3       5212  7.8%
5       5066  7.5%
9       2630  3.9%
6       2184  3.2%
8       1901  2.8%
7       1590  2.4%

Latin

Value  Count  Frequency (%)
U       3360  25.0%
Q       3360  25.0%
P       3360  25.0%
S       3360  25.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  80640  100.0%

Most frequent character per block

ASCII

Value             Count  Frequency (%)
1                 17561  21.8%
0                 16655  20.7%
4                  8057  10.0%
2                  6344  7.9%
3                  5212  6.5%
5                  5066  6.3%
U                  3360  4.2%
Q                  3360  4.2%
P                  3360  4.2%
S                  3360  4.2%
Other values (4)   8305  10.3%

라벨명
Categorical

Distinct: 11
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 26.4 KiB

Value             Count
소로2류            1157
소로3류             833
소로1류             497
중로2류             311
중로3류             271
Other values (6)    291

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 중로2류
2nd row: 중로2류
3rd row: 중로2류
4th row: 중로2류
5th row: 중로2류

Common Values

Value    Count  Frequency (%)
소로2류   1157  34.4%
소로3류    833  24.8%
소로1류    497  14.8%
중로2류    311  9.3%
중로3류    271  8.1%
중로1류    195  5.8%
대로3류     58  1.7%
대로2류     16  0.5%
대로1류     15  0.4%
광로3류      5  0.1%
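
The frequency column is each count divided by the 3360 observations. The sketch below recomputes it from the counts listed above and shows how many rows belong to the eleventh category, which the report does not list. The dictionary is transcribed from this report and the variable names are illustrative.

```python
# Counts transcribed from the "Common Values" table of 라벨명.
counts = {
    "소로2류": 1157,
    "소로3류": 833,
    "소로1류": 497,
    "중로2류": 311,
    "중로3류": 271,
    "중로1류": 195,
    "대로3류": 58,
    "대로2류": 16,
    "대로1류": 15,
    "광로3류": 5,
}
n_rows = 3360  # number of observations in the dataset

# Frequency (%) = count / total rows, formatted to one decimal place.
frequencies = {label: f"{n / n_rows:.1%}" for label, n in counts.items()}
for label, pct in frequencies.items():
    print(label, counts[label], pct)

# Rows not covered by the ten listed values (the 11th distinct category).
unlisted_rows = n_rows - sum(counts.values())
print("rows in unlisted categories:", unlisted_rows)
```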

Length

[Figure: Histogram of lengths of the category]

면적_도형
Real number (ℝ)

Alerts: High correlation, Skewed, Unique

Distinct: 3360
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6767.2744
Minimum: 11.826438
Maximum: 2197188.4
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 29.7 KiB

Quantile statistics

Minimum: 11.826438
5-th percentile: 121.22035
Q1: 594.17645
Median: 1342.8056
Q3: 3129.1642
95-th percentile: 16908.713
Maximum: 2197188.4
Range: 2197176.6
Interquartile range (IQR): 2534.9878

Descriptive statistics

Standard deviation: 54852.978
Coefficient of variation (CV): 8.1056235
Kurtosis: 1107.4558
Mean: 6767.2744
Median Absolute Deviation (MAD): 943.64624
Skewness: 30.674686
Sum: 22738042
Variance: 3.0088492 × 10^9
Monotonicity: Not monotonic
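
Several of these figures are derived from others in the same block, so they can be cross-checked. A minimal sketch, assuming the usual definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 − Q1, range = maximum − minimum) and using only the rounded values reported for 면적_도형:

```python
# Rounded values copied from the 면적_도형 statistics in this report.
mean = 6767.2744
std = 54852.978
q1 = 594.17645
q3 = 3129.1642
minimum = 11.826438
maximum = 2197188.4

cv = std / mean                   # coefficient of variation
variance = std ** 2               # variance is the squared standard deviation
iqr = q3 - q1                     # interquartile range
value_range = maximum - minimum   # range

print(f"CV       = {cv:.4f}")
print(f"Variance = {variance:.4e}")
print(f"IQR      = {iqr:.4f}")
print(f"Range    = {value_range:.1f}")
```

Because the inputs are already rounded, the recomputed values match the reported ones only to within rounding error.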
[Figure: Histogram with fixed size bins (bins=50)]
Common values

Value                Count  Frequency (%)
16010.82317              1  < 0.1%
2659.866559              1  < 0.1%
325.2366427              1  < 0.1%
1690.331536              1  < 0.1%
797.3407952              1  < 0.1%
836.9956373              1  < 0.1%
5531.615189              1  < 0.1%
11299.67751              1  < 0.1%
707.568493               1  < 0.1%
2344.963904              1  < 0.1%
Other values (3350)   3350  99.7%

Minimum 10 values

Value        Count  Frequency (%)
11.82643762      1  < 0.1%
18.5592802       1  < 0.1%
22.08448627      1  < 0.1%
22.556613        1  < 0.1%
22.625419        1  < 0.1%
22.83667676      1  < 0.1%
23.6266          1  < 0.1%
23.85256591      1  < 0.1%
24.64045         1  < 0.1%
26.1082          1  < 0.1%

Maximum 10 values

Value        Count  Frequency (%)
2197188.378      1  < 0.1%
1799989.233      1  < 0.1%
691728.6085      1  < 0.1%
416283.0701      1  < 0.1%
408823.4676      1  < 0.1%
386679.394       1  < 0.1%
383201.814       1  < 0.1%
282960.1799      1  < 0.1%
267516.2489      1  < 0.1%
265044.7613      1  < 0.1%

길이_도형
Real number (ℝ)

Alerts: High correlation

Distinct: 3359
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 754.96522
Minimum: 14.15374
Maximum: 72453.891
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 29.7 KiB

Quantile statistics

Minimum: 14.15374
5-th percentile: 55.001406
Q1: 186.85233
Median: 347.78079
Q3: 648.59673
95-th percentile: 1999.9725
Maximum: 72453.891
Range: 72439.738
Interquartile range (IQR): 461.7444

Descriptive statistics

Standard deviation: 2443.5785
Coefficient of variation (CV): 3.2366769
Kurtosis: 363.71499
Mean: 754.96522
Median Absolute Deviation (MAD): 199.92165
Skewness: 16.265603
Sum: 2536683.1
Variance: 5971075.7
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Common values

Value                Count  Frequency (%)
735.2276109              2  0.1%
2150.773347              1  < 0.1%
1162.457601              1  < 0.1%
464.287111               1  < 0.1%
211.18784                1  < 0.1%
226.839468               1  < 0.1%
776.0122752              1  < 0.1%
1557.164875              1  < 0.1%
293.24343                1  < 0.1%
471.3407775              1  < 0.1%
Other values (3349)   3349  99.7%

Minimum 10 values

Value        Count  Frequency (%)
14.15373957      1  < 0.1%
19.5192517       1  < 0.1%
20.16086374      1  < 0.1%
20.39668435      1  < 0.1%
20.82376004      1  < 0.1%
22.04026653      1  < 0.1%
22.31427492      1  < 0.1%
22.43196751      1  < 0.1%
23.23496412      1  < 0.1%
23.24818861      1  < 0.1%

Maximum 10 values

Value        Count  Frequency (%)
72453.89126      1  < 0.1%
56771.78041      1  < 0.1%
39492.80818      1  < 0.1%
39362.48758      1  < 0.1%
32400.29929      1  < 0.1%
30409.52653      1  < 0.1%
25369.41526      1  < 0.1%
23261.97564      1  < 0.1%
21944.56653      1  < 0.1%
17760.09937      1  < 0.1%

Interactions


Correlations

            라벨명  면적_도형  길이_도형
라벨명      1.000  0.647     0.614
면적_도형   0.647  1.000     0.917
길이_도형   0.614  0.917     1.000

            면적_도형  길이_도형  라벨명
면적_도형   1.000     0.963     0.426
길이_도형   0.963     1.000     0.347
라벨명      0.426     0.347     1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows

      현황도형 관리번호          라벨명   면적_도형     길이_도형
0     44130UQ151PS198706120134  중로2류  16010.82317   2150.773347
1     44130UQ151PS199005090004  중로2류  1817.963367   322.501439
2     44130UQ151PS199303060878  중로2류  2504.330546   381.979392
3     44130UQ151PS200101180426  중로2류  6633.100789   915.902343
4     44130UQ151PS200812011162  중로2류  1824.568938   257.065713
5     44130UQ151PS199608310006  중로2류  2886.917398   378.15518
6     44130UQ151PS199901090050  중로2류  401.376273    102.001068
7     44130UQ151PS199901090051  중로2류  408.351907    99.932764
8     44130UQ151PS199303060106  중로2류  8011.026848   1116.817162
9     44130UQ151PS199303060105  중로2류  8057.310207   1133.003875

Last rows

      현황도형 관리번호          라벨명   면적_도형     길이_도형
3350  44130UQ151PS202107210001  대로3류  132671.0829   6603.22385
3351  44130UQ151PS202107210002  중로3류  3432.280532   408.529206
3352  44130UQ151PS202107210003  중로3류  2217.998643   269.935636
3353  44130UQ151PS200101020147  대로2류  152671.693    7796.623574
3354  44130UQ151PS200812012554  소로2류  1099.652371   291.061601
3355  44130UQ151PS200101182209  소로3류  548.222025    201.554199
3356  44130UQ151PS200101182211  소로3류  381.551668    175.381937
3357  44130UQ151PS200101020133  대로1류  691728.6085   39362.48758
3358  44130UQ151PS200501060045  대로3류  5485.9916     518.912543
3359  44130UQ151PS200508220134  소로3류  150.32795     62.101057