Overview

Dataset statistics

Number of variables: 7
Number of observations: 75
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.5 KiB
Average record size in memory: 61.8 B

Variable types

Categorical: 2
Text: 1
Numeric: 4

Dataset

Description: Completion status of ecological river restoration projects (생태하천복원사업준공 현황)
Author: Gyeonggi-do (경기도)
URL: https://data.gg.go.kr/portal/data/service/selectServicePage.do?&infId=CFZUOI3PS341T4BW76BX15324192&infSeq=1

Alerts

준공연도 is highly overall correlated with 총사업비(백만원) and 3 other fields (High correlation)
총사업비(백만원) is highly overall correlated with 준공연도 and 3 other fields (High correlation)
국비(백만원) is highly overall correlated with 준공연도 and 3 other fields (High correlation)
지방비(백만원) is highly overall correlated with 준공연도 and 3 other fields (High correlation)
데이터기준일자 is highly overall correlated with 준공연도 and 3 other fields (High correlation)
총사업비(백만원) has unique values (Unique)
지방비(백만원) has unique values (Unique)

Reproduction

Analysis started: 2023-12-10 22:28:13.817212
Analysis finished: 2023-12-10 22:28:15.349202
Duration: 1.53 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시군명 (city/county name)
Categorical

Distinct: 29
Distinct (%): 38.7%
Missing: 0
Missing (%): 0.0%
Memory size: 732.0 B

Value | Count
용인시 | 6
남양주시 | 5
양주시 | 5
부천시 | 4
이천시 | 4
Other values (24) | 51

Length

Max length: 4
Median length: 3
Mean length: 3.1466667
Min length: 3

Unique

Unique: 6
Unique (%): 8.0%

Sample

1st row: 가평군
2nd row: 가평군
3rd row: 경기도
4th row: 고양시
5th row: 고양시

Common Values

Value | Count | Frequency (%)
용인시 | 6 | 8.0%
남양주시 | 5 | 6.7%
양주시 | 5 | 6.7%
부천시 | 4 | 5.3%
이천시 | 4 | 5.3%
구리시 | 4 | 5.3%
동두천시 | 3 | 4.0%
의정부시 | 3 | 4.0%
오산시 | 3 | 4.0%
성남시 | 3 | 4.0%
Other values (19) | 35 | 46.7%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
용인시 | 6 | 8.0%
양주시 | 5 | 6.7%
남양주시 | 5 | 6.7%
부천시 | 4 | 5.3%
이천시 | 4 | 5.3%
구리시 | 4 | 5.3%
시흥시 | 3 | 4.0%
안양시 | 3 | 4.0%
안성시 | 3 | 4.0%
성남시 | 3 | 4.0%
Other values (19) | 35 | 46.7%

하천명 (river name)
Text

Distinct: 74
Distinct (%): 98.7%
Missing: 0
Missing (%): 0.0%
Memory size: 732.0 B

Length

Max length: 10
Median length: 8
Mean length: 7.9866667
Min length: 7

Characters and Unicode

Total characters: 599
Distinct characters: 95
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
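
As an illustration of how such character properties can be read, the following sketch uses Python's standard `unicodedata` module to count Unicode general categories over the five sample river names listed in this report. It is not the profiler's own implementation, and the helper name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter

# Five sample values of the text variable, copied from this report's Sample rows.
SAMPLE_VALUES = ["조종천('05)", "달전천('21)", "경안천('10)", "대장천('19)", "벽제천('20)"]


def category_counts(values):
    """Count Unicode general categories (two-letter codes) over all characters."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


counts = category_counts(SAMPLE_VALUES)
for code, n in counts.most_common():
    print(code, n)
```

The two-letter codes correspond to the category names used in this report: `Lo` is Other Letter, `Nd` is Decimal Number, `Ps` is Open Punctuation, `Po` is Other Punctuation, `Pe` is Close Punctuation, `Zs` is Space Separator and `No` is Other Number.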

Unique

Unique: 73
Unique (%): 97.3%

Sample

1st row: 조종천('05)
2nd row: 달전천('21)
3rd row: 경안천('10)
4th row: 대장천('19)
5th row: 벽제천('20)

Value | Count | Frequency (%)
신천('15 | 2 | 2.7%
탄천('18 | 1 | 1.3%
신갈천('20 | 1 | 1.3%
청미천('15 | 1 | 1.3%
경안천('13 | 1 | 1.3%
궐동천('19 | 1 | 1.3%
오산천('17 | 1 | 1.3%
오산천('98 | 1 | 1.3%
덕계천('16 | 1 | 1.3%
상하천('18 | 1 | 1.3%
Other values (64) | 64 | 85.3%

Most occurring characters

Value | Count | Frequency (%)
(not rendered) | 78 | 13.0%
( | 75 | 12.5%
' | 75 | 12.5%
) | 75 | 12.5%
1 | 47 | 7.8%
9 | 24 | 4.0%
0 | 21 | 3.5%
8 | 11 | 1.8%
3 | 11 | 1.8%
5 | 10 | 1.7%
Other values (85) | 172 | 28.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 219 | 36.6%
Decimal Number | 150 | 25.0%
Open Punctuation | 75 | 12.5%
Other Punctuation | 75 | 12.5%
Close Punctuation | 75 | 12.5%
Space Separator | 3 | 0.5%
Other Number | 2 | 0.3%

Most frequent character per category

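Other Letter
Value | Count | Frequency (%)
(not rendered) | 78 | 35.6%
(not rendered) | 9 | 4.1%
(not rendered) | 9 | 4.1%
(not rendered) | 8 | 3.7%
(not rendered) | 4 | 1.8%
(not rendered) | 3 | 1.4%
(not rendered) | 3 | 1.4%
(not rendered) | 3 | 1.4%
(not rendered) | 3 | 1.4%
(not rendered) | 3 | 1.4%
Other values (70) | 96 | 43.8%

Decimal Number
Value | Count | Frequency (%)
1 | 47 | 31.3%
9 | 24 | 16.0%
0 | 21 | 14.0%
8 | 11 | 7.3%
3 | 11 | 7.3%
5 | 10 | 6.7%
2 | 8 | 5.3%
4 | 7 | 4.7%
6 | 6 | 4.0%
7 | 5 | 3.3%

Open Punctuation
Value | Count | Frequency (%)
( | 75 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
' | 75 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 75 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 3 | 100.0%

Other Number
Value | Count | Frequency (%)
(not rendered) | 2 | 100.0%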

Most occurring scripts

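Value | Count | Frequency (%)
Common | 380 | 63.4%
Hangul | 219 | 36.6%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not rendered) | 78 | 35.6%
(not rendered) | 9 | 4.1%
(not rendered) | 9 | 4.1%
(not rendered) | 8 | 3.7%
(not rendered) | 4 | 1.8%
(not rendered) | 3 | 1.4%
(not rendered) | 3 | 1.4%
(not rendered) | 3 | 1.4%
(not rendered) | 3 | 1.4%
(not rendered) | 3 | 1.4%
Other values (70) | 96 | 43.8%

Common
Value | Count | Frequency (%)
( | 75 | 19.7%
' | 75 | 19.7%
) | 75 | 19.7%
1 | 47 | 12.4%
9 | 24 | 6.3%
0 | 21 | 5.5%
8 | 11 | 2.9%
3 | 11 | 2.9%
5 | 10 | 2.6%
2 | 8 | 2.1%
Other values (5) | 23 | 6.1%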

Most occurring blocks

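Value | Count | Frequency (%)
ASCII | 378 | 63.1%
Hangul | 219 | 36.6%
Enclosed Alphanum | 2 | 0.3%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not rendered) | 78 | 35.6%
(not rendered) | 9 | 4.1%
(not rendered) | 9 | 4.1%
(not rendered) | 8 | 3.7%
(not rendered) | 4 | 1.8%
(not rendered) | 3 | 1.4%
(not rendered) | 3 | 1.4%
(not rendered) | 3 | 1.4%
(not rendered) | 3 | 1.4%
(not rendered) | 3 | 1.4%
Other values (70) | 96 | 43.8%

ASCII
Value | Count | Frequency (%)
( | 75 | 19.8%
' | 75 | 19.8%
) | 75 | 19.8%
1 | 47 | 12.4%
9 | 24 | 6.3%
0 | 21 | 5.6%
8 | 11 | 2.9%
3 | 11 | 2.9%
5 | 10 | 2.6%
2 | 8 | 2.1%
Other values (4) | 21 | 5.6%

Enclosed Alphanum
Value | Count | Frequency (%)
(not rendered) | 2 | 100.0%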

준공연도 (completion year)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 26
Distinct (%): 34.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2009.5467
Minimum: 1990
Maximum: 2021
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 807.0 B

Quantile statistics

Minimum: 1990
5-th percentile: 1993
Q1: 2003
median: 2013
Q3: 2017
95-th percentile: 2019.3
Maximum: 2021
Range: 31
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 9.0647719
Coefficient of variation (CV): 0.0045108542
Kurtosis: -0.70126169
Mean: 2009.5467
Median Absolute Deviation (MAD): 5
Skewness: -0.75873169
Sum: 150716
Variance: 82.17009
Monotonicity: Not monotonic
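
The figures reported for 준공연도 are related by simple arithmetic, which the following sketch re-derives. It uses only numbers printed in this report (75 observations, the sum, the standard deviation and the quartiles); the variable names are illustrative and the code does not reproduce the profiler's internals.

```python
# Values copied from this report for 준공연도.
n = 75                      # number of observations
total = 150716              # Sum
mean = 2009.5467            # Mean
std = 9.0647719             # Standard deviation
q1, q3 = 2003, 2017         # Q1 and Q3
minimum, maximum = 1990, 2021

derived = {
    "mean": total / n,            # sum divided by count
    "variance": std ** 2,         # square of the standard deviation
    "cv": std / mean,             # coefficient of variation
    "iqr": q3 - q1,               # interquartile range
    "range": maximum - minimum,   # max minus min
}

for name, value in derived.items():
    print(f"{name}: {value:.7g}")
```

The very small coefficient of variation reflects that calendar years are large numbers with a comparatively narrow spread, so it says little about how dispersed the completion years are.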
Histogram with fixed size bins (bins=26)

Value | Count | Frequency (%)
2018 | 8 | 10.7%
2015 | 8 | 10.7%
2016 | 5 | 6.7%
2013 | 4 | 5.3%
2019 | 4 | 5.3%
2017 | 4 | 5.3%
1993 | 4 | 5.3%
1998 | 3 | 4.0%
2010 | 3 | 4.0%
2009 | 3 | 4.0%
Other values (16) | 29 | 38.7%

Value | Count | Frequency (%)
1990 | 2 | 2.7%
1991 | 1 | 1.3%
1993 | 4 | 5.3%
1994 | 2 | 2.7%
1996 | 1 | 1.3%
1998 | 3 | 4.0%
1999 | 2 | 2.7%
2002 | 3 | 4.0%
2003 | 3 | 4.0%
2004 | 1 | 1.3%

Value | Count | Frequency (%)
2021 | 2 | 2.7%
2020 | 2 | 2.7%
2019 | 4 | 5.3%
2018 | 8 | 10.7%
2017 | 4 | 5.3%
2016 | 5 | 6.7%
2015 | 8 | 10.7%
2014 | 3 | 4.0%
2013 | 4 | 5.3%
2012 | 1 | 1.3%

총사업비(백만원) (total project cost, million KRW)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 75
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12253.627
Minimum: 224
Maximum: 48000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 807.0 B

Quantile statistics

Minimum: 224
5-th percentile: 688
Q1: 2870.5
median: 8302
Q3: 18639.5
95-th percentile: 34672.4
Maximum: 48000
Range: 47776
Interquartile range (IQR): 15769

Descriptive statistics

Standard deviation: 11444.012
Coefficient of variation (CV): 0.93392857
Kurtosis: 0.64107319
Mean: 12253.627
Median Absolute Deviation (MAD): 6613
Skewness: 1.1116068
Sum: 919022
Variance: 1.3096541 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
700 | 1 | 1.3%
15730 | 1 | 1.3%
30090 | 1 | 1.3%
23600 | 1 | 1.3%
1000 | 1 | 1.3%
42886 | 1 | 1.3%
30900 | 1 | 1.3%
20146 | 1 | 1.3%
3297 | 1 | 1.3%
224 | 1 | 1.3%
Other values (65) | 65 | 86.7%

Value | Count | Frequency (%)
224 | 1 | 1.3%
232 | 1 | 1.3%
500 | 1 | 1.3%
660 | 1 | 1.3%
700 | 1 | 1.3%
900 | 1 | 1.3%
998 | 1 | 1.3%
1000 | 1 | 1.3%
1500 | 1 | 1.3%
1689 | 1 | 1.3%

Value | Count | Frequency (%)
48000 | 1 | 1.3%
42886 | 1 | 1.3%
39140 | 1 | 1.3%
35000 | 1 | 1.3%
34532 | 1 | 1.3%
31000 | 1 | 1.3%
30900 | 1 | 1.3%
30090 | 1 | 1.3%
26577 | 1 | 1.3%
26000 | 1 | 1.3%

국비(백만원) (national funding, million KRW)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 74
Distinct (%): 98.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8154.7867
Minimum: 157
Maximum: 33600
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 807.0 B

Quantile statistics

Minimum: 157
5-th percentile: 481.6
Q1: 1750.5
median: 5484
Q3: 12693
95-th percentile: 21651
Maximum: 33600
Range: 33443
Interquartile range (IQR): 10942.5

Descriptive statistics

Standard deviation: 7782.5124
Coefficient of variation (CV): 0.95434899
Kurtosis: 0.96914191
Mean: 8154.7867
Median Absolute Deviation (MAD): 4609
Skewness: 1.1650746
Sum: 611609
Variance: 60567500
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
6000 | 2 | 2.7%
11011 | 1 | 1.3%
21063 | 1 | 1.3%
16520 | 1 | 1.3%
600 | 1 | 1.3%
30020 | 1 | 1.3%
21630 | 1 | 1.3%
14102 | 1 | 1.3%
1961 | 1 | 1.3%
490 | 1 | 1.3%
Other values (64) | 64 | 85.3%

Value | Count | Frequency (%)
157 | 1 | 1.3%
164 | 1 | 1.3%
275 | 1 | 1.3%
462 | 1 | 1.3%
490 | 1 | 1.3%
495 | 1 | 1.3%
600 | 1 | 1.3%
701 | 1 | 1.3%
875 | 1 | 1.3%
928 | 1 | 1.3%

Value | Count | Frequency (%)
33600 | 1 | 1.3%
30020 | 1 | 1.3%
27398 | 1 | 1.3%
21700 | 1 | 1.3%
21630 | 1 | 1.3%
21063 | 1 | 1.3%
21000 | 1 | 1.3%
18810 | 1 | 1.3%
17661 | 1 | 1.3%
17266 | 1 | 1.3%

지방비(백만원) (local funding, million KRW)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 75
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4098.84
Minimum: 67
Maximum: 17266
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 807.0 B

Quantile statistics

Minimum: 67
5-th percentile: 220.5
Q1: 1077
median: 2818
Q3: 5832
95-th percentile: 12079.2
Maximum: 17266
Range: 17199
Interquartile range (IQR): 4755

Descriptive statistics

Standard deviation: 3868.3687
Coefficient of variation (CV): 0.94377158
Kurtosis: 1.4550657
Mean: 4098.84
Median Absolute Deviation (MAD): 2057
Skewness: 1.3247953
Sum: 307413
Variance: 14964276
Monotonicity: Not monotonic
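
The column sums reported for the three cost variables can be cross-checked against each other. The sketch below assumes, based only on the column names and the sums printed in this report, that 총사업비 is the sum of 국비 and 지방비; the constant and function names are hypothetical.

```python
# Column sums (million KRW) copied from the descriptive statistics in this report.
SUM_TOTAL = 919022      # 총사업비(백만원)
SUM_NATIONAL = 611609   # 국비(백만원)
SUM_LOCAL = 307413      # 지방비(백만원)


def funding_shares(national, local):
    """Return the national and local fractions of the combined funding."""
    combined = national + local
    return national / combined, local / combined


sums_agree = SUM_NATIONAL + SUM_LOCAL == SUM_TOTAL
national_share, local_share = funding_shares(SUM_NATIONAL, SUM_LOCAL)

print("sums agree:", sums_agree)
print(f"national share: {national_share:.1%}")
print(f"local share: {local_share:.1%}")
```

If the assumption holds, the near-perfect correlations between the three cost columns follow directly from this additive relationship.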
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
210 | 1 | 1.3%
4719 | 1 | 1.3%
9027 | 1 | 1.3%
7080 | 1 | 1.3%
400 | 1 | 1.3%
12866 | 1 | 1.3%
9270 | 1 | 1.3%
6044 | 1 | 1.3%
1336 | 1 | 1.3%
67 | 1 | 1.3%
Other values (65) | 65 | 86.7%

Value | Count | Frequency (%)
67 | 1 | 1.3%
68 | 1 | 1.3%
198 | 1 | 1.3%
210 | 1 | 1.3%
225 | 1 | 1.3%
297 | 1 | 1.3%
400 | 1 | 1.3%
405 | 1 | 1.3%
450 | 1 | 1.3%
506 | 1 | 1.3%

Value | Count | Frequency (%)
17266 | 1 | 1.3%
14400 | 1 | 1.3%
14000 | 1 | 1.3%
12866 | 1 | 1.3%
11742 | 1 | 1.3%
10400 | 1 | 1.3%
9607 | 1 | 1.3%
9485 | 1 | 1.3%
9300 | 1 | 1.3%
9270 | 1 | 1.3%

데이터기준일자 (data reference date)
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Memory size: 732.0 B

Value | Count
2015-12-31 | 50
2022-12-31 | 25

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2015-12-31
2nd row: 2022-12-31
3rd row: 2015-12-31
4th row: 2022-12-31
5th row: 2022-12-31

Common Values

Value | Count | Frequency (%)
2015-12-31 | 50 | 66.7%
2022-12-31 | 25 | 33.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2015-12-31 | 50 | 66.7%
2022-12-31 | 25 | 33.3%


Correlations

Variable | 시군명 | 하천명 | 준공연도 | 총사업비(백만원) | 국비(백만원) | 지방비(백만원) | 데이터기준일자
시군명 | 1.000 | 0.929 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
하천명 | 0.929 | 1.000 | 1.000 | 0.748 | 0.485 | 0.841 | 1.000
준공연도 | 0.000 | 1.000 | 1.000 | 0.282 | 0.141 | 0.000 | 0.966
총사업비(백만원) | 0.000 | 0.748 | 0.282 | 1.000 | 0.970 | 0.968 | 0.699
국비(백만원) | 0.000 | 0.485 | 0.141 | 0.970 | 1.000 | 0.877 | 0.527
지방비(백만원) | 0.000 | 0.841 | 0.000 | 0.968 | 0.877 | 1.000 | 0.733
데이터기준일자 | 0.000 | 1.000 | 0.966 | 0.699 | 0.527 | 0.733 | 1.000

Variable | 데이터기준일자 | 시군명
데이터기준일자 | 1.000 | 0.000
시군명 | 0.000 | 1.000

Variable | 준공연도 | 총사업비(백만원) | 국비(백만원) | 지방비(백만원) | 시군명 | 데이터기준일자
준공연도 | 1.000 | 0.589 | 0.555 | 0.624 | 0.000 | 0.795
총사업비(백만원) | 0.589 | 1.000 | 0.992 | 0.986 | 0.000 | 0.519
국비(백만원) | 0.555 | 0.992 | 1.000 | 0.962 | 0.000 | 0.507
지방비(백만원) | 0.624 | 0.986 | 0.962 | 1.000 | 0.000 | 0.541
시군명 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000
데이터기준일자 | 0.795 | 0.519 | 0.507 | 0.541 | 0.000 | 1.000
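
The "High correlation" alerts in this report appear consistent with the last correlation matrix above when off-diagonal values are compared against a cutoff. The sketch below assumes a threshold of 0.5, which the report itself does not state, and assumes the alerts are derived from this matrix; `correlated_fields` is a hypothetical helper, not part of ydata-profiling.

```python
# Last correlation matrix of this report (first column 준공연도), copied verbatim.
COLUMNS = ["준공연도", "총사업비(백만원)", "국비(백만원)", "지방비(백만원)", "시군명", "데이터기준일자"]
MATRIX = [
    [1.000, 0.589, 0.555, 0.624, 0.000, 0.795],
    [0.589, 1.000, 0.992, 0.986, 0.000, 0.519],
    [0.555, 0.992, 1.000, 0.962, 0.000, 0.507],
    [0.624, 0.986, 0.962, 1.000, 0.000, 0.541],
    [0.000, 0.000, 0.000, 0.000, 1.000, 0.000],
    [0.795, 0.519, 0.507, 0.541, 0.000, 1.000],
]
THRESHOLD = 0.5  # assumption: the cutoff is not given in the report


def correlated_fields(columns, matrix, threshold):
    """Map each column to the other columns whose correlation exceeds the threshold."""
    result = {}
    for i, name in enumerate(columns):
        partners = [
            columns[j]
            for j, value in enumerate(matrix[i])
            if j != i and abs(value) > threshold
        ]
        if partners:
            result[name] = partners
    return result


alerts = correlated_fields(COLUMNS, MATRIX, THRESHOLD)
for name, partners in alerts.items():
    print(f"{name}: {len(partners)} correlated fields")
```

Under this assumed cutoff, five variables each have four correlated partners and 시군명 has none, which matches the wording "and 3 other fields" in the alerts.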

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

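Index | 시군명 | 하천명 | 준공연도 | 총사업비(백만원) | 국비(백만원) | 지방비(백만원) | 데이터기준일자
0 | 가평군 | 조종천('05) | 2005 | 700 | 490 | 210 | 2015-12-31
1 | 가평군 | 달전천('21) | 2021 | 34532 | 17266 | 17266 | 2022-12-31
2 | 경기도 | 경안천('10) | 2010 | 15806 | 10565 | 5241 | 2015-12-31
3 | 고양시 | 대장천('19) | 2019 | 21047 | 14733 | 6314 | 2022-12-31
4 | 고양시 | 벽제천('20) | 2020 | 15481 | 10837 | 4644 | 2022-12-31
5 | 과천시 | 양재천('06) | 2006 | 3070 | 1797 | 1273 | 2015-12-31
6 | 광명시 | 목감천('11) | 2011 | 8906 | 6234 | 2672 | 2015-12-31
7 | 광주시 | 경안천('90) | 1990 | 2434 | 1704 | 730 | 2015-12-31
8 | 광주시 | 목현천('17) | 2017 | 18400 | 12880 | 5520 | 2022-12-31
9 | 구리시 | 왕숙천('94) | 1994 | 1953 | 1192 | 761 | 2015-12-31

Index | 시군명 | 하천명 | 준공연도 | 총사업비(백만원) | 국비(백만원) | 지방비(백만원) | 데이터기준일자
65 | 이천시 | 학암천('15) | 2015 | 5800 | 3480 | 2320 | 2015-12-31
66 | 이천시 | 중리천('19) | 2019 | 12000 | 6000 | 6000 | 2022-12-31
67 | 파주시 | 헤이리천('13) | 2013 | 3000 | 2100 | 900 | 2015-12-31
68 | 파주시 | 금촌천('18) | 2018 | 26000 | 15600 | 10400 | 2022-12-31
69 | 포천시 | 포천천('11) | 2011 | 16359 | 12112 | 4247 | 2015-12-31
70 | 포천시 | 포천천②('18) | 2018 | 21500 | 15050 | 6450 | 2022-12-31
71 | 하남시 | 산곡천('15) | 2015 | 17774 | 12442 | 5332 | 2015-12-31
72 | 하남시 | 덕풍천('09) | 2009 | 7953 | 5567 | 2386 | 2015-12-31
73 | 화성시 | 발안천('16) | 2016 | 18879 | 13215 | 5664 | 2022-12-31
74 | 화성시 | 남양천('10) | 2010 | 7839 | 5366 | 2473 | 2015-12-31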
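
Two regularities are visible in the sample rows: the total cost equals national plus local funding, and the two-digit suffix in 하천명 matches 준공연도. The sketch below checks both on the first ten sample rows only (index 0 to 9), so it says nothing certain about the remaining rows; `suffix_year` is a hypothetical helper.

```python
import re

# (시군명, 하천명, 준공연도, 총사업비, 국비, 지방비) for sample rows 0 to 9 of this report.
ROWS = [
    ("가평군", "조종천('05)", 2005, 700, 490, 210),
    ("가평군", "달전천('21)", 2021, 34532, 17266, 17266),
    ("경기도", "경안천('10)", 2010, 15806, 10565, 5241),
    ("고양시", "대장천('19)", 2019, 21047, 14733, 6314),
    ("고양시", "벽제천('20)", 2020, 15481, 10837, 4644),
    ("과천시", "양재천('06)", 2006, 3070, 1797, 1273),
    ("광명시", "목감천('11)", 2011, 8906, 6234, 2672),
    ("광주시", "경안천('90)", 1990, 2434, 1704, 730),
    ("광주시", "목현천('17)", 2017, 18400, 12880, 5520),
    ("구리시", "왕숙천('94)", 1994, 1953, 1192, 761),
]


def suffix_year(river_name):
    """Return the two-digit year inside ('YY), or None if the pattern is absent."""
    match = re.search(r"\('(\d{2})\)", river_name)
    return int(match.group(1)) if match else None


checks = []
for city, river, year, total, national, local in ROWS:
    cost_ok = total == national + local          # total = national + local
    year_ok = suffix_year(river) == year % 100   # name suffix matches completion year
    checks.append((river, cost_ok, year_ok))
    print(river, cost_ok, year_ok)
```

Because the completion year is embedded in the river name, the value 1.000 between 하천명 and 준공연도 in the first correlation matrix is unsurprising.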