Overview

Dataset statistics

Number of variables: 7
Number of observations: 498
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 28.8 KiB
Average record size in memory: 59.3 B

Variable types

Categorical: 3
Numeric: 3
Text: 1

Dataset

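Description: Data on private signboards (사설표지판) in Hwaseong-si, Gyeonggi-do. It provides fields including 지형지물부호 (feature code), 관리번호 (management number), 비고 (remarks), 납품업체 (supplier), 로딩일자 (loading date), X좌표 (X coordinate), and Y좌표 (Y coordinate).
Author: Hwaseong-si, Gyeonggi-do (경기도 화성시)
URL: https://www.data.go.kr/data/15093523/fileData.do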

Alerts

지형지물부호 has constant value "사설표지판" (Constant)
납품업체 is highly overall correlated with 관리번호 and 1 other field (High correlation)
로딩일자 is highly overall correlated with 관리번호 and 2 other fields (High correlation)
관리번호 is highly overall correlated with 납품업체 and 1 other field (High correlation)
Y좌표 is highly overall correlated with 로딩일자 (High correlation)
납품업체 is highly imbalanced (65.7%) (Imbalance)
로딩일자 is highly imbalanced (66.4%) (Imbalance)
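
The imbalance percentages in these alerts can be sanity-checked by hand. The sketch below assumes the score is one minus the Shannon entropy of the class frequencies, normalised by the log of the number of classes; the report does not state this definition, but it comes out close to the 65.7% shown for 납품업체. The counts are those listed for 납품업체 under Common Values, with the two entries grouped as "Other values (2)" assumed to occur once each. The name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (bits) of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# 납품업체 value counts from this report (12 distinct values, 498 rows).
# Assumption: the two values hidden behind "Other values (2)" occur once each.
supplier_counts = [400, 37, 18, 15, 11, 6, 3, 3, 2, 1, 1, 1]

score = imbalance_score(supplier_counts)
print(f"{score:.1%}")  # close to the 65.7% shown in the alert
```

A score of 0 would mean perfectly balanced classes; values near 1 mean a single class dominates, as the blank value does here.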

Reproduction

Analysis started: 2023-12-12 08:21:25.065354
Analysis finished: 2023-12-12 08:21:26.824650
Duration: 1.76 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
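
A report like this one can be regenerated with ydata-profiling. The following is a minimal sketch, not the exact configuration used here: the function name `build_report`, the file names, the report title, and the `cp949` encoding are all assumptions (the source CSV and the referenced config.json are not included in this text).

```python
def build_report(csv_path, html_path, encoding="cp949"):
    """Profile a CSV file and write an HTML report; returns the output path.

    The encoding default is a guess; adjust it to match the downloaded file.
    """
    # Imported lazily so this module can be loaded without the heavy dependencies.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    profile = ProfileReport(df, title="Hwaseong-si private signboards")
    profile.to_file(html_path)
    return html_path
```

Calling `build_report("signboards.csv", "signboards_report.html")` with a local copy of the dataset would write the HTML file; both file names are placeholders.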

Variables

지형지물부호
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB
사설표지판: 498

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 사설표지판
2nd row: 사설표지판
3rd row: 사설표지판
4th row: 사설표지판
5th row: 사설표지판

Common Values

Value | Count | Frequency (%)
사설표지판 | 498 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot; 사설표지판 accounts for all 498 rows (100.0%)]

관리번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 383
Distinct (%): 76.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.3922472 × 10^12
Minimum: 100001
Maximum: 8.99919 × 10^13
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 100001
5-th percentile: 100031.85
Q1: 200017.25
Median: 500030
Q3: 2.15308 × 10^11
95-th percentile: 8.29917 × 10^13
Maximum: 8.99919 × 10^13
Range: 8.99919 × 10^13
Interquartile range (IQR): 2.153078 × 10^11

Descriptive statistics

Standard deviation: 2.2477107 × 10^13
Coefficient of variation (CV): 2.678318
Kurtosis: 7.8308181
Mean: 8.3922472 × 10^12
Median Absolute Deviation (MAD): 399926.5
Skewness: 3.0217663
Sum: 4.1793391 × 10^15
Variance: 5.0522034 × 10^26
Monotonicity: Not monotonic
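
The descriptive statistics for 관리번호 are internally consistent: the coefficient of variation is the standard deviation divided by the mean, and the variance is the standard deviation squared. The sketch below checks this using only the rounded figures printed in this report; the variable names are illustrative.

```python
import math

# Rounded summary statistics for 관리번호, copied from this report.
mean = 8.3922472e12
std = 2.2477107e13
variance = 5.0522034e26
reported_cv = 2.678318

cv = std / mean  # coefficient of variation = standard deviation / mean

# Tolerances are loose because the report prints rounded values.
assert math.isclose(cv, reported_cv, rel_tol=1e-5)
assert math.isclose(std ** 2, variance, rel_tol=1e-5)
print(f"CV = {cv:.6f}")
```

A CV well above 1, together with a median (500030) many orders of magnitude below the mean, suggests the column mixes short six-digit identifiers with much longer ones, so its mean and standard deviation say little about a typical value.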
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
89991600000000 | 18 | 3.6%
12161800000000 | 15 | 3.0%
211606000000 | 12 | 2.4%
12161500000000 | 11 | 2.2%
900912000000 | 9 | 1.8%
12161900000000 | 9 | 1.8%
22161800000000 | 8 | 1.6%
215308000000 | 6 | 1.2%
22161900000000 | 6 | 1.2%
12161700000000 | 6 | 1.2%
Other values (373) | 398 | 79.9%

Minimum 10 values

Value | Count | Frequency (%)
100001 | 1 | 0.2%
100002 | 1 | 0.2%
100003 | 1 | 0.2%
100004 | 1 | 0.2%
100005 | 1 | 0.2%
100006 | 1 | 0.2%
100009 | 1 | 0.2%
100013 | 1 | 0.2%
100014 | 1 | 0.2%
100015 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
89991900000000 | 5 | 1.0%
89991600000000 | 18 | 3.6%
82992000000000 | 1 | 0.2%
82991700000000 | 3 | 0.6%
82161800000000 | 1 | 0.2%
82112000000000 | 2 | 0.4%
82111800000000 | 4 | 0.8%
42161800000000 | 4 | 0.8%
42161600000000 | 1 | 0.2%
32161800000000 | 1 | 0.2%

비고
Text

Distinct: 136
Distinct (%): 27.3%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 50
Median length: 1
Mean length: 7.8915663
Min length: 1

Characters and Unicode

Total characters: 3930
Distinct characters: 252
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 99
Unique (%): 19.9%

Sample

1st row: 서신중학교
2nd row: 화성시노인전문요양원
3rd row: 서신성당
4th row: (blank)
5th row: (blank)

Value | Count | Frequency (%)
gis | 79 | 12.0%
db | 76 | 11.5%
구축용역 | 74 | 11.2%
보도설치공사 | 31 | 4.7%
서신 | 11 | 1.7%
도시계획도로(소로1-7호선)개설공사 | 11 | 1.7%
(value not preserved) | 10 | 1.5%
개설공사 | 9 | 1.4%
교통혼잡지구 | 8 | 1.2%
개선사업 | 7 | 1.1%
Other values (190) | 344 | 52.1%

Most occurring characters

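Value | Count | Frequency (%)
(space) | 730 | 18.6%
(character not preserved) | 127 | 3.2%
(character not preserved) | 109 | 2.8%
(character not preserved) | 103 | 2.6%
I | 93 | 2.4%
(character not preserved) | 91 | 2.3%
(character not preserved) | 89 | 2.3%
B | 86 | 2.2%
(character not preserved) | 86 | 2.2%
(character not preserved) | 85 | 2.2%
Other values (242) | 2331 | 59.3%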

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2413 | 61.4%
Space Separator | 730 | 18.6%
Uppercase Letter | 458 | 11.7%
Decimal Number | 173 | 4.4%
Other Punctuation | 50 | 1.3%
Close Punctuation | 34 | 0.9%
Open Punctuation | 34 | 0.9%
Dash Punctuation | 30 | 0.8%
Lowercase Letter | 5 | 0.1%
Math Symbol | 2 | 0.1%

Most frequent character per category

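Other Letter
Value | Count | Frequency (%)
(character not preserved) | 127 | 5.3%
(character not preserved) | 109 | 4.5%
(character not preserved) | 103 | 4.3%
(character not preserved) | 91 | 3.8%
(character not preserved) | 89 | 3.7%
(character not preserved) | 86 | 3.6%
(character not preserved) | 85 | 3.5%
(character not preserved) | 79 | 3.3%
(character not preserved) | 72 | 3.0%
(character not preserved) | 56 | 2.3%
Other values (207) | 1516 | 62.8%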

Uppercase Letter
Value | Count | Frequency (%)
I | 93 | 20.3%
B | 86 | 18.8%
D | 85 | 18.6%
S | 85 | 18.6%
G | 85 | 18.6%
C | 11 | 2.4%
K | 7 | 1.5%
M | 2 | 0.4%
P | 1 | 0.2%
A | 1 | 0.2%
Other values (2) | 2 | 0.4%

Decimal Number
Value | Count | Frequency (%)
1 | 51 | 29.5%
2 | 38 | 22.0%
7 | 25 | 14.5%
3 | 22 | 12.7%
0 | 10 | 5.8%
8 | 9 | 5.2%
5 | 7 | 4.0%
4 | 5 | 2.9%
6 | 4 | 2.3%
9 | 2 | 1.2%

Other Punctuation
Value | Count | Frequency (%)
/ | 33 | 66.0%
. | 9 | 18.0%
, | 7 | 14.0%
' | 1 | 2.0%

Lowercase Letter
Value | Count | Frequency (%)
m | 3 | 60.0%
k | 1 | 20.0%
t | 1 | 20.0%

Space Separator
Value | Count | Frequency (%)
(space) | 730 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 34 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 34 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 30 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 2 | 100.0%

Other Symbol
Value | Count | Frequency (%)
(character not preserved) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2414 | 61.4%
Common | 1053 | 26.8%
Latin | 463 | 11.8%

Most frequent character per script

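Hangul
Value | Count | Frequency (%)
(character not preserved) | 127 | 5.3%
(character not preserved) | 109 | 4.5%
(character not preserved) | 103 | 4.3%
(character not preserved) | 91 | 3.8%
(character not preserved) | 89 | 3.7%
(character not preserved) | 86 | 3.6%
(character not preserved) | 85 | 3.5%
(character not preserved) | 79 | 3.3%
(character not preserved) | 72 | 3.0%
(character not preserved) | 56 | 2.3%
Other values (208) | 1517 | 62.8%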

Common
Value | Count | Frequency (%)
(space) | 730 | 69.3%
1 | 51 | 4.8%
2 | 38 | 3.6%
) | 34 | 3.2%
( | 34 | 3.2%
/ | 33 | 3.1%
- | 30 | 2.8%
7 | 25 | 2.4%
3 | 22 | 2.1%
0 | 10 | 0.9%
Other values (9) | 46 | 4.4%

Latin
Value | Count | Frequency (%)
I | 93 | 20.1%
B | 86 | 18.6%
D | 85 | 18.4%
S | 85 | 18.4%
G | 85 | 18.4%
C | 11 | 2.4%
K | 7 | 1.5%
m | 3 | 0.6%
M | 2 | 0.4%
P | 1 | 0.2%
Other values (5) | 5 | 1.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2413 | 61.4%
ASCII | 1516 | 38.6%
None | 1 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 730 | 48.2%
I | 93 | 6.1%
B | 86 | 5.7%
D | 85 | 5.6%
S | 85 | 5.6%
G | 85 | 5.6%
1 | 51 | 3.4%
2 | 38 | 2.5%
) | 34 | 2.2%
( | 34 | 2.2%
Other values (24) | 195 | 12.9%
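
Hangul
Value | Count | Frequency (%)
(character not preserved) | 127 | 5.3%
(character not preserved) | 109 | 4.5%
(character not preserved) | 103 | 4.3%
(character not preserved) | 91 | 3.8%
(character not preserved) | 89 | 3.7%
(character not preserved) | 86 | 3.6%
(character not preserved) | 85 | 3.5%
(character not preserved) | 79 | 3.3%
(character not preserved) | 72 | 3.0%
(character not preserved) | 56 | 2.3%
Other values (207) | 1516 | 62.8%

None
Value | Count | Frequency (%)
(character not preserved) | 1 | 100.0%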

납품업체
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 12
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB
(blank): 400
㈜진성이엔씨: 37
새한항업: 18
(주)성원공간정보: 15
(주)진성이엔씨: 11
Other values (7): 17

Length

Max length: 10
Median length: 1
Mean length: 2.1164659
Min length: 1

Unique

Unique: 3
Unique (%): 0.6%

Sample

1st row: (blank)
2nd row: (blank)
3rd row: (blank)
4th row: (blank)
5th row: (blank)

Common Values

Value | Count | Frequency (%)
(blank) | 400 | 80.3%
㈜진성이엔씨 | 37 | 7.4%
새한항업 | 18 | 3.6%
(주)성원공간정보 | 15 | 3.0%
(주)진성이엔씨 | 11 | 2.2%
(주)지아이에스21 | 6 | 1.2%
(주)대광지오텍 | 3 | 0.6%
㈜선진이엔씨 | 3 | 0.6%
(주)선진이엔씨 | 2 | 0.4%
(주)우주공간정보 | 1 | 0.2%
Other values (2) | 2 | 0.4%
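
Two supplier names appear twice in this table, once with the single character ㈜ and once spelled out as "(주)". They presumably refer to the same companies, which would inflate the distinct count. The sketch below shows one way to merge such variants with Unicode NFKC normalisation. It assumes the ㈜ in the source data is U+321C, and it uses only the four affected counts copied from the table above.

```python
import unicodedata
from collections import Counter

# Counts copied from the Common Values table for 납품업체.
# Assumption: the ㈜ in the data is the single code point U+321C.
counts = {
    "\u321c진성이엔씨": 37,
    "(주)진성이엔씨": 11,
    "\u321c선진이엔씨": 3,
    "(주)선진이엔씨": 2,
}

merged = Counter()
for name, n in counts.items():
    # NFKC expands compatibility characters such as ㈜ into "(주)".
    merged[unicodedata.normalize("NFKC", name)] += n

for name, n in merged.most_common():
    print(name, n)
```

Under that assumption, the four spellings collapse into two suppliers with 48 and 5 rows respectively.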

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
㈜진성이엔씨 | 37 | 37.8%
새한항업 | 18 | 18.4%
주)성원공간정보 | 15 | 15.3%
주)진성이엔씨 | 11 | 11.2%
주)지아이에스21 | 6 | 6.1%
주)대광지오텍 | 3 | 3.1%
㈜선진이엔씨 | 3 | 3.1%
주)선진이엔씨 | 2 | 2.0%
주)우주공간정보 | 1 | 1.0%
지오스 | 1 | 1.0%

로딩일자
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 30
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB
(blank): 392
2019-01-05: 18
2018-04-12: 11
2018-12-05: 9
2018-10-12: 9
Other values (25): 59

Length

Max length: 10
Median length: 1
Mean length: 2.9156627
Min length: 1
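
These length figures hint at why the report counts 0 missing cells even though 392 of the 498 로딩일자 values show no visible content. A mean length of 2.9156627 is what results if each of those 392 values is a one-character placeholder and the remaining 106 are ten-character dates. The sketch below reproduces that arithmetic and shows one way to treat such placeholders as missing. It is an interpretation, not something the report states: that the placeholder is a single space is an assumption, and `blank_to_none` is a hypothetical helper.

```python
# Counts for 로딩일자 from this report.
n_rows = 498
n_blank = 392                  # values with no visible content
n_dates = n_rows - n_blank     # 106 values such as "2019-01-05"

blank_len = 1                  # assumption: the placeholder is one character, e.g. a space
date_len = len("2019-01-05")   # 10, matching the reported max length

mean_length = (n_blank * blank_len + n_dates * date_len) / n_rows
print(round(mean_length, 7))   # compare with the reported mean length


def blank_to_none(value):
    """Map empty or whitespace-only strings to None so they count as missing."""
    if value is None or value.strip() == "":
        return None
    return value
```

Applying a helper like this before profiling would make the 392 placeholder rows appear under Missing instead of as the most frequent category.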

Unique

Unique: 12
Unique (%): 2.4%

Sample

1st row: (blank)
2nd row: (blank)
3rd row: (blank)
4th row: (blank)
5th row: (blank)

Common Values

Value | Count | Frequency (%)
(blank) | 392 | 78.7%
2019-01-05 | 18 | 3.6%
2018-04-12 | 11 | 2.2%
2018-12-05 | 9 | 1.8%
2018-10-12 | 9 | 1.8%
2018-08-01 | 6 | 1.2%
2021-03-23 | 6 | 1.2%
2018-12-10 | 6 | 1.2%
2020-03-10 | 5 | 1.0%
2019-12-04 | 5 | 1.0%
Other values (20) | 31 | 6.2%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
2019-01-05 | 18 | 17.0%
2018-04-12 | 11 | 10.4%
2018-12-05 | 9 | 8.5%
2018-10-12 | 9 | 8.5%
2018-08-01 | 6 | 5.7%
2021-03-23 | 6 | 5.7%
2018-12-10 | 6 | 5.7%
2020-03-10 | 5 | 4.7%
2019-12-04 | 5 | 4.7%
2020-03-13 | 4 | 3.8%
Other values (19) | 27 | 25.5%

X좌표
Real number (ℝ)

Distinct: 497
Distinct (%): 99.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 192214.43
Minimum: 166206.13
Maximum: 210703.75
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 166206.13
5-th percentile: 173648.19
Q1: 183625.87
Median: 194460.95
Q3: 200477.51
95-th percentile: 208646.02
Maximum: 210703.75
Range: 44497.622
Interquartile range (IQR): 16851.647
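
The range and interquartile range follow directly from the quantiles: range = maximum − minimum and IQR = Q3 − Q1. The sketch below recomputes both from the rounded X좌표 figures above, then applies Tukey's 1.5 × IQR fences. The fences are a common outlier rule of thumb added here for illustration; the report itself does not use them.

```python
# Rounded quantile statistics for X좌표, copied from this report.
minimum, q1, q3, maximum = 166206.13, 183625.87, 200477.51, 210703.75

value_range = maximum - minimum  # reported as 44497.622
iqr = q3 - q1                    # reported as 16851.647

# Tukey's fences: values beyond 1.5 × IQR from the quartiles are flagged.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
all_within_fences = lower_fence <= minimum and maximum <= upper_fence

print(round(value_range, 2), round(iqr, 2), all_within_fences)
```

The small differences from the reported values come from the report rounding the quantiles to two decimals before display.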

Descriptive statistics

Standard deviation: 10859.965
Coefficient of variation (CV): 0.05649922
Kurtosis: -0.6555119
Mean: 192214.43
Median Absolute Deviation (MAD): 8308.2445
Skewness: -0.38033533
Sum: 95722784
Variance: 1.1793884 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
198321.149 | 2 | 0.4%
173382.645 | 1 | 0.2%
182650.619 | 1 | 0.2%
206593.783 | 1 | 0.2%
206773.796 | 1 | 0.2%
206559.314 | 1 | 0.2%
183663.841 | 1 | 0.2%
183644.797 | 1 | 0.2%
182812.261 | 1 | 0.2%
182829.894 | 1 | 0.2%
Other values (487) | 487 | 97.8%

Minimum 10 values

Value | Count | Frequency (%)
166206.129 | 1 | 0.2%
166208.627 | 1 | 0.2%
166267.622 | 1 | 0.2%
166382.571 | 1 | 0.2%
166452.349 | 1 | 0.2%
166563.774 | 1 | 0.2%
166683.491 | 1 | 0.2%
166729.253 | 1 | 0.2%
167232.793 | 1 | 0.2%
167262.326 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
210703.751 | 1 | 0.2%
210669.837 | 1 | 0.2%
210647.266 | 1 | 0.2%
210540.911 | 1 | 0.2%
210403.81 | 1 | 0.2%
210401.411 | 1 | 0.2%
210375.689 | 1 | 0.2%
210325.461 | 1 | 0.2%
210259.703 | 1 | 0.2%
209975.542 | 1 | 0.2%

Y좌표
Real number (ℝ)

HIGH CORRELATION 

Distinct: 497
Distinct (%): 99.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 509531.33
Minimum: 493262.53
Maximum: 521121.78
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 493262.53
5-th percentile: 498097.64
Q1: 507115.9
Median: 511168.93
Q3: 512843.61
95-th percentile: 516047.32
Maximum: 521121.78
Range: 27859.25
Interquartile range (IQR): 5727.7182

Descriptive statistics

Standard deviation: 5361.3045
Coefficient of variation (CV): 0.010522031
Kurtosis: 0.30235274
Mean: 509531.33
Median Absolute Deviation (MAD): 3226.147
Skewness: -0.87115456
Sum: 2.537466 × 10^8
Variance: 28743586
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
502903.722 | 2 | 0.4%
507697.245 | 1 | 0.2%
511853.758 | 1 | 0.2%
511205.566 | 1 | 0.2%
510536.756 | 1 | 0.2%
511171.382 | 1 | 0.2%
512040.969 | 1 | 0.2%
512109.85 | 1 | 0.2%
511858.81 | 1 | 0.2%
511863.228 | 1 | 0.2%
Other values (487) | 487 | 97.8%

Minimum 10 values

Value | Count | Frequency (%)
493262.531 | 1 | 0.2%
493321.525 | 1 | 0.2%
493372.768 | 1 | 0.2%
494789.097 | 1 | 0.2%
496391.668 | 1 | 0.2%
496420.81 | 1 | 0.2%
496609.152 | 1 | 0.2%
497016.184 | 1 | 0.2%
497024.979 | 1 | 0.2%
497244.13 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
521121.781 | 1 | 0.2%
520991.053 | 1 | 0.2%
520959.221 | 1 | 0.2%
520949.163 | 1 | 0.2%
518579.251 | 1 | 0.2%
518525.949 | 1 | 0.2%
518191.664 | 1 | 0.2%
518147.718 | 1 | 0.2%
518004.839 | 1 | 0.2%
517937.516 | 1 | 0.2%

Interactions

[Figures: nine interaction plots; images not preserved]

Correlations

[Figure: correlation heatmap]

Variable | 관리번호 | 납품업체 | 로딩일자 | X좌표 | Y좌표
관리번호 | 1.000 | 1.000 | 0.999 | 0.565 | 0.428
납품업체 | 1.000 | 1.000 | 0.999 | 0.608 | 0.697
로딩일자 | 0.999 | 0.999 | 1.000 | 0.877 | 0.884
X좌표 | 0.565 | 0.608 | 0.877 | 1.000 | 0.747
Y좌표 | 0.428 | 0.697 | 0.884 | 0.747 | 1.000

[Figure: correlation heatmap]

Variable | 납품업체 | 로딩일자
납품업체 | 1.000 | 0.947
로딩일자 | 0.947 | 1.000

[Figure: correlation heatmap]

Variable | 관리번호 | X좌표 | Y좌표 | 납품업체 | 로딩일자
관리번호 | 1.000 | -0.102 | -0.113 | 0.961 | 0.924
X좌표 | -0.102 | 1.000 | 0.092 | 0.311 | 0.488
Y좌표 | -0.113 | 0.092 | 1.000 | 0.394 | 0.507
납품업체 | 0.961 | 0.311 | 0.394 | 1.000 | 0.947
로딩일자 | 0.924 | 0.488 | 0.507 | 0.947 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

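First rows

Index | 지형지물부호 | 관리번호 | 비고 | 납품업체 | 로딩일자 | X좌표 | Y좌표
0 | 사설표지판 | 300024 | 서신중학교 | (blank) | (blank) | 173382.645 | 507697.245
1 | 사설표지판 | 300032 | 화성시노인전문요양원 | (blank) | (blank) | 173780.003 | 507670.602
2 | 사설표지판 | 300033 | 서신성당 | (blank) | (blank) | 173779.712 | 507670.629
3 | 사설표지판 | 100052 | (blank) | (blank) | (blank) | 193887.29 | 507106.642
4 | 사설표지판 | 100049 | (blank) | (blank) | (blank) | 193255.566 | 507353.408
5 | 사설표지판 | 211606000000 | 청원슈퍼 | (blank) | (blank) | 178798.564 | 509964.646
6 | 사설표지판 | 300002 | 쥬라기마을 | (blank) | (blank) | 176012.583 | 512966.79
7 | 사설표지판 | 500059 | (blank) | (blank) | (blank) | 203547.301 | 511277.364
8 | 사설표지판 | 215304000000 | 동탄2신도시 | (blank) | (blank) | 209975.542 | 512241.444
9 | 사설표지판 | 100098 | (blank) | (blank) | (blank) | 198231.101 | 512840.341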

Last rows

Index | 지형지물부호 | 관리번호 | 비고 | 납품업체 | 로딩일자 | X좌표 | Y좌표
488 | 사설표지판 | 12161500000000 | 서신 도시계획도로(소로1-7호선)개설공사 GIS DB 구축용역 | ㈜진성이엔씨 | 2018-04-12 | 166563.774 | 507328.081
489 | 사설표지판 | 12161500000000 | 서신 도시계획도로(소로1-7호선)개설공사 GIS DB 구축용역 | ㈜진성이엔씨 | 2018-04-12 | 166382.571 | 507187.254
490 | 사설표지판 | 12161500000000 | 서신 도시계획도로(소로1-7호선)개설공사 GIS DB 구축용역 | ㈜진성이엔씨 | 2018-04-12 | 166452.349 | 507225.24
491 | 사설표지판 | 12161500000000 | 서신 도시계획도로(소로1-7호선)개설공사 GIS DB 구축용역 | ㈜진성이엔씨 | 2018-04-12 | 166267.622 | 507051.622
492 | 사설표지판 | 12161500000000 | 서신 도시계획도로(소로1-7호선)개설공사 GIS DB 구축용역 | ㈜진성이엔씨 | 2018-04-12 | 166729.253 | 507490.421
493 | 사설표지판 | 12161500000000 | 서신 도시계획도로(소로1-7호선)개설공사 GIS DB 구축용역 | ㈜진성이엔씨 | 2018-04-12 | 167232.793 | 508428.58
494 | 사설표지판 | 12161500000000 | 서신 도시계획도로(소로1-7호선)개설공사 GIS DB 구축용역 | ㈜진성이엔씨 | 2018-04-12 | 167262.326 | 508435.179
495 | 사설표지판 | 12161500000000 | 서신 도시계획도로(소로1-7호선)개설공사 GIS DB 구축용역 | ㈜진성이엔씨 | 2018-04-12 | 167264.79 | 508437.025
496 | 사설표지판 | 42161600000000 | 비봉 도시계획도로(중로2-3호선)개설공사 GIS DB구축 사업 | (주)선진이엔씨 | 2018-04-18 | 188744.353 | 515185.815
497 | 사설표지판 | 12111700000000 | 반월,기산지구 시가화예정구역 도로 및 하수관거 정비사업 GIS DB 구축용역 | ㈜진성이엔씨 | 2018-05-25 | 204709.308 | 513270.134