Overview

Dataset statistics

Number of variables: 3
Number of observations: 10000
Missing cells: 50
Missing cells (%): 0.2%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 332.0 KiB
Average record size in memory: 34.0 B
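The derived figures above follow from the raw counts. A minimal sketch of the arithmetic, assuming the missing-cell percentage is taken over all cells (rows × columns) and that 1 KiB is 1024 bytes; the function names are illustrative and not part of ydata-profiling:

```python
def missing_cell_pct(missing_cells: int, n_rows: int, n_cols: int) -> float:
    """Share of missing cells over all cells, in percent."""
    return 100.0 * missing_cells / (n_rows * n_cols)


def avg_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Average bytes per row, assuming 1 KiB = 1024 bytes."""
    return total_kib * 1024 / n_rows


# Values taken from the dataset statistics above.
print(round(missing_cell_pct(50, 10000, 3), 1))       # 0.2
print(round(100.0 * 1 / 10000, 2))                    # 0.01, shown as "< 0.1%"
print(round(avg_record_size_bytes(332.0, 10000), 1))  # 34.0
```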

Variable types

Numeric: 2
Text: 1

Dataset

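Description: This dataset contains road surface information for Asan City (아산시), Chungcheongnam-do (충청남도). It includes the road surface management number (도로면관리번호), the road section number (도로구간번호), and the legal eup/myeon/dong (법정읍/면/동).
Author: 충청남도 아산시 (Asan City, Chungcheongnam-do)
URL: https://www.data.go.kr/data/15089990/fileData.do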

Alerts

Dataset has 1 (< 0.1%) duplicate rows (Duplicates)
도로면관리번호 is highly overall correlated with 도로구간번호 (High correlation)
도로구간번호 is highly overall correlated with 도로면관리번호 (High correlation)

Reproduction

Analysis started: 2023-12-13 00:47:33.791597
Analysis finished: 2023-12-13 00:47:34.424357
Duration: 0.63 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

도로면관리번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9965
Distinct (%): 99.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 51717.369
Minimum: 10
Maximum: 999999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10
5-th percentile: 1137.95
Q1: 5385.75
Median: 10311
Q3: 15333.75
95-th percentile: 370387.05
Maximum: 999999
Range: 999989
Interquartile range (IQR): 9948
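The range and IQR above are simple differences of the listed quantiles. The report does not state how the quantiles themselves were computed; fractional values such as 1137.95 suggest interpolation between ranks. The sketch below assumes linear interpolation (the default method of NumPy's percentile function) and uses a hypothetical toy list, not the dataset:

```python
def percentile(sorted_values, q):
    """q-th quantile (0 <= q <= 1) using linear interpolation between ranks."""
    if not sorted_values:
        raise ValueError("need at least one value")
    pos = (len(sorted_values) - 1) * q      # fractional rank
    lo = int(pos)                           # rank just below
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


toy = [1, 2, 3, 4]            # hypothetical data, not from the dataset
q1 = percentile(toy, 0.25)    # 1.75
q3 = percentile(toy, 0.75)    # 3.25
print(q3 - q1)                # 1.5

# Differences of the values reported for 도로면관리번호:
print(15333.75 - 5385.75)     # 9948.0  (IQR = Q3 - Q1)
print(999999 - 10)            # 999989  (range = maximum - minimum)
```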

Descriptive statistics

Standard deviation: 151634.24
Coefficient of variation (CV): 2.9319791
Kurtosis: 22.139456
Mean: 51717.369
Median Absolute Deviation (MAD): 4967
Skewness: 4.4955128
Sum: 5.1717369 × 10^8
Variance: 2.2992944 × 10^10
Monotonicity: Not monotonic
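Several of the descriptive statistics above are related by standard identities: CV is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. A minimal consistency check using the rounded values printed for 도로면관리번호 (the tolerance is an assumption chosen to absorb that rounding):

```python
import math

n = 10000            # number of observations (no missing values in this column)
mean = 51717.369
std = 151634.24

cv = std / mean              # coefficient of variation
variance = std ** 2          # variance is the squared standard deviation
total = mean * n             # sum of all values

# Each check prints True: the reported figures agree up to rounding.
print(math.isclose(cv, 2.9319791, rel_tol=1e-6))
print(math.isclose(variance, 2.2992944e10, rel_tol=1e-6))
print(math.isclose(total, 5.1717369e8, rel_tol=1e-6))
```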
Histogram with fixed size bins (bins=50)
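A fixed-size-bin histogram splits the span from minimum to maximum into equally wide intervals and counts the values in each. The sketch below is one plain way to do that, not the plotting code used for the report; the function name and toy data are hypothetical:

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equally wide intervals between min and max."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if lo == hi:                      # degenerate case: everything in one bin
        counts[0] = len(values)
        return counts
    width = (hi - lo) / bins
    for v in values:
        # The maximum falls on the right edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


print(fixed_width_histogram([0, 1, 2, 9, 10], bins=5))  # [2, 1, 0, 0, 2]

# Bin width implied by the reported minimum (10) and maximum (999999):
print((999999 - 10) / 50)  # 19999.78
```

With a bin width of about 20000, both the median (10311) and Q3 (15333.75) of 도로면관리번호 fall in the first bin, which matches the strong right skew reported for this variable.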
Common values
Value    Count    Frequency (%)
18152    2    < 0.1%
18112    2    < 0.1%
17056    2    < 0.1%
17089    2    < 0.1%
17170    2    < 0.1%
18139    2    < 0.1%
18167    2    < 0.1%
17148    2    < 0.1%
18136    2    < 0.1%
17160    2    < 0.1%
Other values (9955)    9980    99.8%
Minimum 10 values
Value    Count    Frequency (%)
10    1    < 0.1%
11    1    < 0.1%
12    1    < 0.1%
13    1    < 0.1%
14    1    < 0.1%
15    1    < 0.1%
16    1    < 0.1%
18    1    < 0.1%
19    1    < 0.1%
22    1    < 0.1%
Maximum 10 values
Value    Count    Frequency (%)
999999    1    < 0.1%
995310    1    < 0.1%
995309    1    < 0.1%
995306    1    < 0.1%
995304    1    < 0.1%
995303    1    < 0.1%
995300    1    < 0.1%
995299    1    < 0.1%
995298    1    < 0.1%
995297    1    < 0.1%

도로구간번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 7250
Distinct (%): 72.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 51449.929
Minimum: 17
Maximum: 996167
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 17
5-th percentile: 633.95
Q1: 3711.25
Median: 10134.5
Q3: 15594.25
95-th percentile: 370303.15
Maximum: 996167
Range: 996150
Interquartile range (IQR): 11883

Descriptive statistics

Standard deviation: 152057.58
Coefficient of variation (CV): 2.9554479
Kurtosis: 22.02943
Mean: 51449.929
Median Absolute Deviation (MAD): 5731.5
Skewness: 4.4862203
Sum: 5.1449929 × 10^8
Variance: 2.3121508 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value    Count    Frequency (%)
19519    9    0.1%
19521    8    0.1%
14172    8    0.1%
1516    7    0.1%
13793    7    0.1%
7703    6    0.1%
1915    6    0.1%
30967    6    0.1%
905    6    0.1%
30955    6    0.1%
Other values (7240)    9931    99.3%
Minimum 10 values
Value    Count    Frequency (%)
17    1    < 0.1%
18    1    < 0.1%
20    1    < 0.1%
22    1    < 0.1%
23    1    < 0.1%
25    1    < 0.1%
26    1    < 0.1%
29    1    < 0.1%
34    1    < 0.1%
36    2    < 0.1%
Maximum 10 values
Value    Count    Frequency (%)
996167    1    < 0.1%
995170    1    < 0.1%
995163    1    < 0.1%
995161    5    0.1%
995160    3    < 0.1%
995158    2    < 0.1%
995155    2    < 0.1%
995154    3    < 0.1%
995153    2    < 0.1%
995151    1    < 0.1%

법정읍/면/동
Text

Distinct: 161
Distinct (%): 1.6%
Missing: 50
Missing (%): 0.5%
Memory size: 156.2 KiB

Length

Max length: 10
Median length: 7
Mean length: 5.6769849
Min length: 2

Characters and Unicode

Total characters: 56486
Distinct characters: 140
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
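As an illustration of such character properties, Python's standard `unicodedata` module exposes the general category of each code point (`Lo` is Other Letter, `Zs` is Space Separator, `Nd` is Decimal Number). The sketch below counts categories for one of the sample values; `category_counts` is a hypothetical helper, and the standard library does not expose script or block names, so those are not shown:

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count Unicode general categories of the characters in `text`."""
    return Counter(unicodedata.category(ch) for ch in text)


sample = "배방읍 장재리"             # first sample row of this variable
print(category_counts(sample))      # Counter({'Lo': 6, 'Zs': 1})

# Mean length as total characters over non-missing values (10000 - 50):
print(round(56486 / (10000 - 50), 7))  # 5.6769849
```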

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: 배방읍 장재리
2nd row: 배방읍 구령리
3rd row: 권곡동
4th row: 배방읍
5th row: 둔포면 석곡리
Value    Count    Frequency (%)
배방읍    1829    11.0%
둔포면    1117    6.7%
음봉면    925    5.6%
온천동    865    5.2%
용화동    630    3.8%
북수리    616    3.7%
탕정면    586    3.5%
장재리    526    3.2%
석곡리    489    2.9%
영인면    453    2.7%
Other values (156)    8570    51.6%
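The counts in the table above sum to 16606, which equals the 9950 non-missing values plus the 6656 space characters reported for this variable. That is consistent with the values being split on whitespace and counted per word, although the report itself does not say so. A minimal sketch of that counting, where `word_counts` is a hypothetical helper and `None` stands in for a missing value:

```python
from collections import Counter


def word_counts(values):
    """Count whitespace-separated tokens, skipping missing values (None)."""
    counts = Counter()
    for v in values:
        if v is None:
            continue
        counts.update(v.split())
    return counts


# The five sample rows listed above, plus one hypothetical missing value.
rows = ["배방읍 장재리", "배방읍 구령리", "권곡동", "배방읍", "둔포면 석곡리", None]
counts = word_counts(rows)
print(counts.most_common(1))                  # [('배방읍', 3)]

total_tokens = 9950 + 6656                    # non-missing values + spaces
print(round(100 * 1829 / total_tokens, 1))    # 11.0
```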

Most occurring characters

ValueCountFrequency (%)
6656
 
11.8%
6656
 
11.8%
4761
 
8.4%
3347
 
5.9%
2324
 
4.1%
2059
 
3.6%
1862
 
3.3%
1362
 
2.4%
1304
 
2.3%
1221
 
2.2%
Other values (130) 24934
44.1%

Most occurring categories

Value    Count    Frequency (%)
Other Letter    49800    88.2%
Space Separator    6656    11.8%
Decimal Number    30    0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
6656
 
13.4%
4761
 
9.6%
3347
 
6.7%
2324
 
4.7%
2059
 
4.1%
1862
 
3.7%
1362
 
2.7%
1304
 
2.6%
1221
 
2.5%
1153
 
2.3%
Other values (124) 23751
47.7%
Decimal Number
Value    Count    Frequency (%)
1    12    40.0%
4    6    20.0%
0    6    20.0%
3    3    10.0%
6    3    10.0%
Space Separator
Value    Count    Frequency (%)
(space)    6656    100.0%

Most occurring scripts

Value    Count    Frequency (%)
Hangul    49800    88.2%
Common    6686    11.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
6656
 
13.4%
4761
 
9.6%
3347
 
6.7%
2324
 
4.7%
2059
 
4.1%
1862
 
3.7%
1362
 
2.7%
1304
 
2.6%
1221
 
2.5%
1153
 
2.3%
Other values (124) 23751
47.7%
Common
Value    Count    Frequency (%)
(space)    6656    99.6%
1    12    0.2%
4    6    0.1%
0    6    0.1%
3    3    < 0.1%
6    3    < 0.1%

Most occurring blocks

Value    Count    Frequency (%)
Hangul    49800    88.2%
ASCII    6686    11.8%

Most frequent character per block

ASCII
Value    Count    Frequency (%)
(space)    6656    99.6%
1    12    0.2%
4    6    0.1%
0    6    0.1%
3    3    < 0.1%
6    3    < 0.1%
Hangul
ValueCountFrequency (%)
6656
 
13.4%
4761
 
9.6%
3347
 
6.7%
2324
 
4.7%
2059
 
4.1%
1862
 
3.7%
1362
 
2.7%
1304
 
2.6%
1221
 
2.5%
1153
 
2.3%
Other values (124) 23751
47.7%

Interactions

[Figures: pairwise interaction plots of 도로면관리번호 and 도로구간번호]

Correlations

[Figure: correlation heatmap]
                  도로면관리번호    도로구간번호
도로면관리번호    1.000    0.986
도로구간번호    0.986    1.000
[Figure: correlation heatmap]
                  도로면관리번호    도로구간번호
도로면관리번호    1.000    0.994
도로구간번호    0.994    1.000
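The report text does not say which coefficient each of the two matrices uses. As background only, the sketch below shows two common choices, Pearson (linear association) and Spearman (Pearson applied to ranks), on hypothetical toy data with one large outlier; the function names and numbers are illustrative and unrelated to the dataset:

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


xs = [10, 20, 30, 40, 1000]   # hypothetical values
ys = [12, 19, 33, 41, 900]    # hypothetical values
print(pearson(xs, ys))             # close to, but below, 1
print(round(spearman(xs, ys), 6))  # 1.0, since ys increases whenever xs does
```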

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

도로면관리번호도로구간번호법정읍/면/동
167671574615731배방읍 장재리
9662370015370015배방읍 구령리
13717198146권곡동
102123030130183배방읍
111921596916055둔포면 석곡리
37021031710161염치읍 곡교리
36161239012761탕정면 용두리
585128211574초사동
179651619730217음봉면 동암리
5111464814900둔포면 운용리
도로면관리번호도로구간번호법정읍/면/동
527656584187실옥동
1523443572477온천동
448269116163권곡동
1534059604681온천동
540593868823득산동
1442099799646도고면 봉농리
20127274인주면 신성리
32351083710862영인면 신현리
9779380118380305탕정면 호산리
885984117676송악면 강당리

Duplicate rows

Most frequently occurring

(index)    도로면관리번호    도로구간번호    법정읍/면/동    # duplicates
0    20002    20002    염치읍 송곡리    2
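One simple way to find rows like this is to count identical rows and keep those seen more than once. The sketch below reads the row above as 20002 / 20002 / 염치읍 송곡리 with a count of 2, which is an interpretation of the flattened table; `duplicate_rows` and the second row value are hypothetical:

```python
from collections import Counter


def duplicate_rows(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(rows)
    return {row: c for row, c in counts.items() if c > 1}


rows = [
    (20002, 20002, "염치읍 송곡리"),
    (20002, 20002, "염치읍 송곡리"),
    (1, 2, "example"),            # hypothetical non-duplicate row
]
dups = duplicate_rows(rows)
print(dups)  # {(20002, 20002, '염치읍 송곡리'): 2}
```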