Overview

Dataset statistics

Number of variables: 7
Number of observations: 216
Missing cells: 21
Missing cells (%): 1.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 12.8 KiB
Average record size in memory: 60.6 B
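
The missing-cell percentage can be checked by hand. The sketch below assumes it is the number of missing cells divided by the total number of cells (variables times observations); the report does not state the formula, but this reading reproduces the rounded figure.

```python
# Figures copied from the dataset statistics in this report.
n_variables = 7
n_observations = 216
missing_cells = 21

# Assumption: the percentage is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```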

Variable types

Numeric: 4
Categorical: 2
Text: 1

Dataset

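Description: Road infrastructure facility data for Jeongeup-si, Jeollabuk-do, provided with the following fields: management number (관리번호), administrative district (행정읍면동), map sheet number (도엽번호), road section number (도로구간번호), installation date (설치일자), and bus stop name (정류장명).
Author: Jeongeup-si, Jeollabuk-do (전라북도 정읍시)
URL: https://www.data.go.kr/data/15085009/fileData.do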

Alerts

데이터기준일자 has constant value "2022-09-23" (Constant)
관리번호 is highly overall correlated with 행정읍면동 and 2 other fields (High correlation)
행정읍면동 is highly overall correlated with 관리번호 and 1 other fields (High correlation)
도엽번호 is highly overall correlated with 관리번호 and 2 other fields (High correlation)
설치일자 is highly overall correlated with 관리번호 and 1 other fields (High correlation)
설치일자 is highly imbalanced (61.2%) (Imbalance)
정류장명 has 21 (9.7%) missing values (Missing)
관리번호 has unique values (Unique)
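
The imbalance figure for 설치일자 can be approximated from the value counts listed later in this report. The sketch below assumes the score is one minus the Shannon entropy normalized by the logarithm of the number of classes; the report itself does not state the formula, and the helper name `imbalance_score` is illustrative. Under that assumption the result comes out at about 61.2%.

```python
import math

# Value counts for 설치일자, read off the common-values table in this report:
# the ten listed values plus the four "Other values", each occurring once.
counts = [165, 17, 11, 5, 5, 4, 2, 1, 1, 1, 1, 1, 1, 1]


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits."""
    total = sum(counts)
    k = len(counts)
    if k <= 1:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


score = imbalance_score(counts)
print(f"{score:.1%}")
```

A score of 0 would mean all 14 dates are equally frequent; the high score here reflects that a single value, 1900-01-01, covers 165 of the 216 rows.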

Reproduction

Analysis started: 2023-12-12 11:55:27.407734
Analysis finished: 2023-12-12 11:55:30.562286
Duration: 3.15 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
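
A report of this kind can be regenerated roughly as follows. This is a minimal sketch, not the configuration actually used: the file paths, the report title, and the CSV encoding are all assumptions, and the settings in config.json are not reproduced here.

```python
def build_report(csv_path, html_path="report.html"):
    """Profile a CSV file and write an HTML report (paths are hypothetical)."""
    # Imports are local so the sketch can be read without either package installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; Korean public datasets are often CP949.
    df = pd.read_csv(csv_path, encoding="cp949")

    # The title is illustrative, not the one used for this report.
    profile = ProfileReport(df, title="Jeongeup-si road infrastructure")
    profile.to_file(html_path)
    return profile
```

Calling `build_report("stops.csv")` with a local copy of the dataset (the file name is hypothetical) would write `report.html` next to the script.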

Variables

관리번호
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 216
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 133741.59
Minimum: 1
Maximum: 989301
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 11.75
Q1: 55.75
Median: 115.5
Q3: 171.25
95-th percentile: 900001.25
Maximum: 989301
Range: 989300
Interquartile range (IQR): 115.5

Descriptive statistics

Standard deviation: 284618.54
Coefficient of variation (CV): 2.1281229
Kurtosis: 2.1797465
Mean: 133741.59
Median Absolute Deviation (MAD): 58
Skewness: 1.9308629
Sum: 28888184
Variance: 8.1007715 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
159                 1      0.5%
102                 1      0.5%
129                 1      0.5%
9                   1      0.5%
127                 1      0.5%
122                 1      0.5%
8                   1      0.5%
143                 1      0.5%
39                  1      0.5%
141                 1      0.5%
Other values (206)  206    95.4%

Value  Count  Frequency (%)
1      1      0.5%
2      1      0.5%
3      1      0.5%
4      1      0.5%
5      1      0.5%
6      1      0.5%
7      1      0.5%
8      1      0.5%
9      1      0.5%
10     1      0.5%

Value   Count  Frequency (%)
989301  1      0.5%
984301  1      0.5%
936001  1      0.5%
926001  1      0.5%
900008  1      0.5%
900007  1      0.5%
900006  1      0.5%
900005  1      0.5%
900004  1      0.5%
900003  1      0.5%

행정읍면동
Real number (ℝ)

HIGH CORRELATION 

Distinct: 14
Distinct (%): 6.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.5180457 × 10^9
Minimum: 4.518025 × 10^9
Maximum: 4.5180595 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 KiB

Quantile statistics

Minimum: 4.518025 × 10^9
5-th percentile: 4.518025 × 10^9
Q1: 4.518031 × 10^9
Median: 4.5180535 × 10^9
Q3: 4.518056 × 10^9
95-th percentile: 4.5180595 × 10^9
Maximum: 4.5180595 × 10^9
Range: 34500
Interquartile range (IQR): 25000

Descriptive statistics

Standard deviation: 13214.091
Coefficient of variation (CV): 2.9247361 × 10^-6
Kurtosis: -1.3010968
Mean: 4.5180457 × 10^9
Median Absolute Deviation (MAD): 4500
Skewness: -0.65576747
Sum: 9.7589787 × 10^11
Variance: 1.7461221 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)
Value             Count  Frequency (%)
4518025000        47     21.8%
4518053500        36     16.7%
4518059500        19     8.8%
4518058000        19     8.8%
4518051000        17     7.9%
4518057000        14     6.5%
4518056000        14     6.5%
4518032000        10     4.6%
4518042000        10     4.6%
4518054500        9      4.2%
Other values (4)  21     9.7%

Value       Count  Frequency (%)
4518025000  47     21.8%
4518031000  8      3.7%
4518032000  10     4.6%
4518039000  4      1.9%
4518040000  1      0.5%
4518042000  10     4.6%
4518051000  17     7.9%
4518052000  8      3.7%
4518053500  36     16.7%
4518054500  9      4.2%

Value       Count  Frequency (%)
4518059500  19     8.8%
4518058000  19     8.8%
4518057000  14     6.5%
4518056000  14     6.5%
4518054500  9      4.2%
4518053500  36     16.7%
4518052000  8      3.7%
4518051000  17     7.9%
4518042000  10     4.6%
4518040000  1      0.5%

도엽번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 148
Distinct (%): 68.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.5608176 × 10^9
Minimum: 3.5608025 × 10^9
Maximum: 3.5612021 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 KiB

Quantile statistics

Minimum: 3.5608025 × 10^9
5-th percentile: 3.5608028 × 10^9
Q1: 3.5608139 × 10^9
Median: 3.5608179 × 10^9
Q3: 3.5608188 × 10^9
95-th percentile: 3.5608233 × 10^9
Maximum: 3.5612021 × 10^9
Range: 399609
Interquartile range (IQR): 4951.25

Descriptive statistics

Standard deviation: 26969.176
Coefficient of variation (CV): 7.5738718 × 10^-6
Kurtosis: 194.499
Mean: 3.5608176 × 10^9
Median Absolute Deviation (MAD): 1907.5
Skewness: 13.579214
Sum: 7.6913661 × 10^11
Variance: 7.2733647 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
3560802684          5      2.3%
3560818312          5      2.3%
3560818633          4      1.9%
3560814103          3      1.4%
3560823501          3      1.4%
3560808283          3      1.4%
3560823391          3      1.4%
3560818822          3      1.4%
3560818834          2      0.9%
3560818843          2      0.9%
Other values (138)  183    84.7%

Value       Count  Frequency (%)
3560802504  2      0.9%
3560802582  2      0.9%
3560802683  1      0.5%
3560802684  5      2.3%
3560802693  1      0.5%
3560802801  1      0.5%
3560803433  1      0.5%
3560803621  1      0.5%
3560803713  1      0.5%
3560803812  1      0.5%

Value       Count  Frequency (%)
3561202113  1      0.5%
3560823604  2      0.9%
3560823501  3      1.4%
3560823391  3      1.4%
3560823312  2      0.9%
3560823284  1      0.5%
3560823174  1      0.5%
3560823172  1      0.5%
3560823074  2      0.9%
3560823062  1      0.5%

도로구간번호
Real number (ℝ)

Distinct: 176
Distinct (%): 81.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 367872.22
Minimum: 15
Maximum: 989306
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 KiB

Quantile statistics

Minimum: 15
5-th percentile: 426.5
Q1: 3184.75
Median: 310421.5
Q3: 709789.5
95-th percentile: 900071
Maximum: 989306
Range: 989291
Interquartile range (IQR): 706604.75

Descriptive statistics

Standard deviation: 320228.41
Coefficient of variation (CV): 0.87048814
Kurtosis: -1.2539801
Mean: 367872.22
Median Absolute Deviation (MAD): 308398.5
Skewness: 0.37662795
Sum: 79460400
Variance: 1.0254623 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
328003              6      2.8%
310728              3      1.4%
307001              3      1.4%
310706              3      1.4%
360002              2      0.9%
710021              2      0.9%
720187              2      0.9%
434                 2      0.9%
850                 2      0.9%
404                 2      0.9%
Other values (166)  189    87.5%

Value  Count  Frequency (%)
15     2      0.9%
83     1      0.5%
96     1      0.5%
193    1      0.5%
235    1      0.5%
237    1      0.5%
289    1      0.5%
327    1      0.5%
404    2      0.9%
434    2      0.9%

Value   Count  Frequency (%)
989306  1      0.5%
984302  1      0.5%
936001  1      0.5%
926007  1      0.5%
901023  1      0.5%
901020  1      0.5%
900342  2      0.9%
900341  1      0.5%
900310  1      0.5%
900071  2      0.9%

설치일자
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 14
Distinct (%): 6.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

1900-01-01: 165
2012-01-01: 17
2013-01-01: 11
1998-05-01: 5
2010-01-01: 5
Other values (9): 13

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 7
Unique (%): 3.2%

Sample

1st row: 1900-01-01
2nd row: 1900-01-01
3rd row: 1900-01-01
4th row: 1900-01-01
5th row: 1900-01-01

Common Values

Value             Count  Frequency (%)
1900-01-01        165    76.4%
2012-01-01        17     7.9%
2013-01-01        11     5.1%
1998-05-01        5      2.3%
2010-01-01        5      2.3%
2009-01-01        4      1.9%
2003-08-01        2      0.9%
2020-10-01        1      0.5%
2003-06-01        1      0.5%
2015-08-31        1      0.5%
Other values (4)  4      1.9%

Length

Histogram of lengths of the category

정류장명
Text

MISSING 

Distinct: 157
Distinct (%): 80.5%
Missing: 21
Missing (%): 9.7%
Memory size: 1.8 KiB

Length

Max length: 10
Median length: 9
Mean length: 4.025641
Min length: 2

Characters and Unicode

Total characters: 785
Distinct characters: 199
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
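
As a small illustration of such character properties, the sketch below counts Unicode general categories for a few stop names that appear in this report, using Python's standard `unicodedata` module. It covers only the general category; the standard library has no direct lookup for script or block, and this is not necessarily how the profiler derives its figures.

```python
import unicodedata
from collections import Counter

# A few stop names taken from the tables in this report.
names = ["송산.남전", "칠보초교", "정읍여고"]

# unicodedata.category returns a two-letter general category:
# "Lo" = Other Letter, "Lu" = Uppercase Letter, "Nd" = Decimal Number,
# "Po" = Other Punctuation, "So" = Other Symbol.
category_counts = Counter(
    unicodedata.category(ch)
    for name in names
    for ch in unicodedata.normalize("NFC", name)  # compose Hangul syllables
)

print(category_counts)
```

Hangul syllables fall under "Other Letter" because the script has no upper or lower case, which is why that category dominates this variable.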

Unique

Unique: 124
Unique (%): 63.6%
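
The percentages for this variable do not all share a denominator. The sketch below assumes the missing percentage is taken over all 216 rows, while the distinct and unique percentages and the mean length are taken over the 195 non-missing values; the report does not say so explicitly, but this reading reproduces the reported figures.

```python
# Figures copied from the 정류장명 section of this report.
n_rows = 216
n_missing = 21
n_distinct = 157
n_unique = 124            # values that occur exactly once
total_characters = 785

n_present = n_rows - n_missing   # non-missing values

missing_pct = 100 * n_missing / n_rows
distinct_pct = 100 * n_distinct / n_present
unique_pct = 100 * n_unique / n_present
mean_length = total_characters / n_present

print(f"{missing_pct:.1f} {distinct_pct:.1f} {unique_pct:.1f} {mean_length:.6f}")
```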

Sample

1st row: 차단
2nd row: 차단
3rd row: 엄동
4th row: 엄동
5th row: 옹암

Value               Count  Frequency (%)
수성주공아파트      4      2.1%
두지                4      2.1%
부전마을            3      1.5%
대림apt             2      1.0%
효축마을            2      1.0%
차단                2      1.0%
신월                2      1.0%
시기주공아파트      2      1.0%
정읍여고            2      1.0%
엄동                2      1.0%
Other values (147)  170    87.2%

Most occurring characters

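Value               Count  Frequency (%)
(character lost)    25     3.2%
(character lost)    22     2.8%
(character lost)    22     2.8%
(character lost)    19     2.4%
(character lost)    19     2.4%
(character lost)    18     2.3%
(character lost)    17     2.2%
(character lost)    17     2.2%
(character lost)    16     2.0%
(character lost)    16     2.0%
Other values (189)  594    75.7%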

Most occurring categories

Value              Count  Frequency (%)
Other Letter       756    96.3%
Uppercase Letter   20     2.5%
Decimal Number     6      0.8%
Other Punctuation  2      0.3%
Other Symbol       1      0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
(character lost)    25     3.3%
(character lost)    22     2.9%
(character lost)    22     2.9%
(character lost)    19     2.5%
(character lost)    19     2.5%
(character lost)    18     2.4%
(character lost)    17     2.2%
(character lost)    17     2.2%
(character lost)    16     2.1%
(character lost)    16     2.1%
Other values (178)  565    74.7%

Uppercase Letter
Value  Count  Frequency (%)
T      6      30.0%
A      6      30.0%
P      6      30.0%
C      1      5.0%
I      1      5.0%

Decimal Number
Value  Count  Frequency (%)
1      3      50.0%
2      2      33.3%
3      1      16.7%

Other Punctuation
Value  Count  Frequency (%)
,      1      50.0%
.      1      50.0%

Other Symbol
Value             Count  Frequency (%)
(character lost)  1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  757    96.4%
Latin   20     2.5%
Common  8      1.0%

Most frequent character per script

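Hangul
Value               Count  Frequency (%)
(character lost)    25     3.3%
(character lost)    22     2.9%
(character lost)    22     2.9%
(character lost)    19     2.5%
(character lost)    19     2.5%
(character lost)    18     2.4%
(character lost)    17     2.2%
(character lost)    17     2.2%
(character lost)    16     2.1%
(character lost)    16     2.1%
Other values (179)  566    74.8%

Latin
Value  Count  Frequency (%)
T      6      30.0%
A      6      30.0%
P      6      30.0%
C      1      5.0%
I      1      5.0%

Common
Value  Count  Frequency (%)
1      3      37.5%
2      2      25.0%
,      1      12.5%
.      1      12.5%
3      1      12.5%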

Most occurring blocks

Value   Count  Frequency (%)
Hangul  756    96.3%
ASCII   28     3.6%
None    1      0.1%

Most frequent character per block

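Hangul
Value               Count  Frequency (%)
(character lost)    25     3.3%
(character lost)    22     2.9%
(character lost)    22     2.9%
(character lost)    19     2.5%
(character lost)    19     2.5%
(character lost)    18     2.4%
(character lost)    17     2.2%
(character lost)    17     2.2%
(character lost)    16     2.1%
(character lost)    16     2.1%
Other values (178)  565    74.7%

ASCII
Value  Count  Frequency (%)
T      6      21.4%
A      6      21.4%
P      6      21.4%
1      3      10.7%
2      2      7.1%
,      1      3.6%
.      1      3.6%
C      1      3.6%
I      1      3.6%
3      1      3.6%

None
Value             Count  Frequency (%)
(character lost)  1      100.0%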

데이터기준일자
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

2022-09-23: 216

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022-09-23
2nd row: 2022-09-23
3rd row: 2022-09-23
4th row: 2022-09-23
5th row: 2022-09-23

Common Values

Value       Count  Frequency (%)
2022-09-23  216    100.0%

Length

Histogram of lengths of the category


Interactions

[16 interaction plots; images not preserved]

Correlations

              관리번호  행정읍면동  도엽번호  도로구간번호  설치일자
관리번호      1.000     0.605       0.214     0.779         0.948
행정읍면동    0.605     1.000       0.360     0.870         0.447
도엽번호      0.214     0.360       1.000     0.079         1.000
도로구간번호  0.779     0.870       0.079     1.000         0.555
설치일자      0.948     0.447       1.000     0.555         1.000

              관리번호  행정읍면동  도엽번호  도로구간번호  설치일자
관리번호      1.000     0.507       0.523     0.260         0.661
행정읍면동    0.507     1.000       0.581     -0.418        0.247
도엽번호      0.523     0.581       1.000     0.003         0.972
도로구간번호  0.260     -0.418      0.003     1.000         0.258
설치일자      0.661     0.247       0.972     0.258         1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

     관리번호  행정읍면동  도엽번호    도로구간번호  설치일자    정류장명    데이터기준일자
0    159       4518032000  3560821871  800193        1900-01-01  차단        2022-09-23
1    157       4518032000  3560821871  800191        1900-01-01  차단        2022-09-23
2    158       4518032000  3560821883  800199        1900-01-01  엄동        2022-09-23
3    156       4518032000  3560821883  800164        1900-01-01  엄동        2022-09-23
4    153       4518032000  3560821993  800141        1900-01-01  옹암        2022-09-23
5    160       4518032000  3560821993  800141        1900-01-01  옹암        2022-09-23
6    154       4518032000  3560821894  800132        1900-01-01  천원        2022-09-23
7    155       4518032000  3560821803  800122        1900-01-01  원천        2022-09-23
8    161       4518032000  3560821803  800122        1900-01-01  원천        2022-09-23
9    989301    4518032000  3561202113  989306        2020-10-01  <NA>        2022-09-23

     관리번호  행정읍면동  도엽번호    도로구간번호  설치일자    정류장명    데이터기준일자
206  163       4518042000  3560815883  900030        1900-01-01  원촌정류장  2022-09-23
207  168       4518042000  3560815883  900030        1900-01-01  원촌승강장  2022-09-23
208  167       4518042000  3560815981  900033        1900-01-01  원촌승강장  2022-09-23
209  169       4518042000  3560815884  900071        1900-01-01  칠보초교    2022-09-23
210  170       4518042000  3560815884  900071        1900-01-01  칠보초교    2022-09-23
211  162       4518042000  3560815984  900042        1900-01-01  <NA>        2022-09-23
212  166       4518042000  3560820091  900042        1900-01-01  송산.남전   2022-09-23
213  171       4518042000  3560815991  900068        1900-01-01  <NA>        2022-09-23
214  165       4518042000  3560815994  900053        1900-01-01  시기정류장  2022-09-23
215  164       4518042000  3560815994  900050        1900-01-01  시기정류장  2022-09-23