Overview

Dataset statistics

Number of variables: 8
Number of observations: 2684
Missing cells: 182
Missing cells (%): 0.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 175.7 KiB
Average record size in memory: 67.0 B
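
The percentage and per-record figures in this block can be recomputed from the raw counts. The sketch below assumes that the missing-cell percentage is taken over variables × observations and that 1 KiB is 1024 bytes; the variable names are illustrative only.

```python
# Values copied from the "Dataset statistics" block.
n_variables = 8
n_observations = 2684
missing_cells = 182
total_size_kib = 175.7

# Assumption: missing % = missing cells / (variables * observations).
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Assumption: 1 KiB = 1024 bytes, averaged over all observations.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```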

Variable types

Categorical: 2
Numeric: 3
Text: 2
DateTime: 1

Dataset

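Description: Results of precision safety inspections carried out by internal and external experts on the agricultural production infrastructure facilities managed by the Korea Rural Community Corporation (한국농어촌공사)
Author: 한국농어촌공사 (Korea Rural Community Corporation)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20191014000000001282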

Alerts

시설구분 is highly imbalanced (71.7%) [Imbalance]
중부원점 x좌표 has 91 (3.4%) missing values [Missing]
중부원점 y좌표 has 91 (3.4%) missing values [Missing]
시설코드 has unique values [Unique]
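
The 71.7% imbalance figure for 시설구분 is consistent with an entropy-based score: one minus the Shannon entropy of the class frequencies divided by the maximum possible entropy. The sketch below assumes that definition (it is not stated in the report) and uses the four class counts listed for 시설구분; `imbalance_score` is an illustrative name.

```python
import math

def imbalance_score(counts):
    """1 - (Shannon entropy / maximum entropy): 0 = balanced, 1 = a single class."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    # Entropy in bits over the observed class frequencies.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# Class counts of 시설구분 as listed in the report.
counts = [2437, 144, 74, 29]
score = imbalance_score(counts)
print(f"{score:.1%}")
```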

Reproduction

Analysis started: 2023-12-11 03:37:55.516312
Analysis finished: 2023-12-11 03:37:58.100460
Duration: 2.58 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
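
A report of this kind could be regenerated roughly as follows. This is a minimal sketch, not the configuration actually used: it assumes the dataset is available as a CSV file, and `build_report`, the file paths, the encoding, and the title are all hypothetical. Only `pandas.read_csv`, `ProfileReport`, and `ProfileReport.to_file` are real library calls.

```python
def build_report(csv_path, output_path="report.html", encoding="utf-8"):
    """Profile a CSV export of the dataset and write an HTML report.

    Assumptions (not stated in the report): the data is a CSV file and
    `encoding` matches that file.
    """
    # Imported lazily so this module loads even without the heavy dependencies.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Precision safety inspection results")
    report.to_file(output_path)
    return report
```

Calling `build_report("inspections.csv")` with a hypothetical file name would write `report.html` next to the script.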

Variables

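시설구분 (facility type)
Categorical

IMBALANCE

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.1 KiB

Value  Count
저수지 (reservoir)  2437
배수장 (drainage pumping station)  144
양수장 (irrigation pumping station)  74
양배수장 (irrigation and drainage pumping station)  29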

Length

Max length: 4
Median length: 3
Mean length: 3.0108048
Min length: 3
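
The length statistics follow directly from the category counts: three categories have three characters and one has four. A small check, assuming length is counted in Unicode code points of the precomposed Hangul strings:

```python
# Category counts of 시설구분 as listed in the report.
counts = {"저수지": 2437, "배수장": 144, "양수장": 74, "양배수장": 29}

total = sum(counts.values())
# Count-weighted mean of the string lengths.
mean_length = sum(len(value) * n for value, n in counts.items()) / total
min_length = min(len(value) for value in counts)
max_length = max(len(value) for value in counts)

print(round(mean_length, 7), min_length, max_length)
```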

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 저수지
2nd row: 저수지
3rd row: 배수장
4th row: 저수지
5th row: 저수지

Common Values

Value  Count  Frequency (%)
저수지  2437  90.8%
배수장  144  5.4%
양수장  74  2.8%
양배수장  29  1.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Bar chart of the value counts listed under Common Values.

시설코드 (facility code)
Real number (ℝ)

UNIQUE

Distinct: 2684
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.5512266 × 10^9
Minimum: 2.64403 × 10^9
Maximum: 4.97101 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.7 KiB

Quantile statistics

Minimum: 2.64403 × 10^9
5-th percentile: 4.13731 × 10^9
Q1: 4.48001 × 10^9
median: 4.6780101 × 10^9
Q3: 4.723011 × 10^9
95-th percentile: 4.88371 × 10^9
Maximum: 4.97101 × 10^9
Range: 2.32698 × 10^9
Interquartile range (IQR): 2.4300097 × 10^8

Descriptive statistics

Standard deviation: 3.8380543 × 10^8
Coefficient of variation (CV): 0.084330107
Kurtosis: 10.476955
Mean: 4.5512266 × 10^9
Median Absolute Deviation (MAD): 1.07 × 10^8
Skewness: -3.0720672
Sum: 1.2215492 × 10^13
Variance: 1.473066 × 10^17
Monotonicity: Not monotonic
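
Several of these figures are derived from others in the same card. As a consistency check, the sketch below recomputes the coefficient of variation, variance, interquartile range, and range from the rounded values printed for 시설코드; small differences from the report come from that rounding.

```python
# Rounded values as printed in the report for 시설코드.
mean = 4.5512266e9
std = 3.8380543e8
q1, q3 = 4.48001e9, 4.723011e9
minimum, maximum = 2.64403e9, 4.97101e9

cv = std / mean                    # coefficient of variation
variance = std ** 2                # variance is the squared standard deviation
iqr = q3 - q1                      # interquartile range
value_range = maximum - minimum    # range

print(f"CV={cv:.6f} variance={variance:.4e} IQR={iqr:.4e} range={value_range:.5e}")
```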
Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
4683010009  1  < 0.1%
4513010033  1  < 0.1%
4684010017  1  < 0.1%
4272010025  1  < 0.1%
4577010139  1  < 0.1%
4572010059  1  < 0.1%
4376010039  1  < 0.1%
4729010059  1  < 0.1%
4687010083  1  < 0.1%
4719010019  1  < 0.1%
Other values (2674)  2674  99.6%

Minimum 10 values

Value  Count  Frequency (%)
2644030002  1  < 0.1%
2644030003  1  < 0.1%
2671010056  1  < 0.1%
2671010067  1  < 0.1%
2671010085  1  < 0.1%
2671010097  1  < 0.1%
2714010005  1  < 0.1%
2714010023  1  < 0.1%
2723010006  1  < 0.1%
2723010024  1  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
4971010003  1  < 0.1%
4971010002  1  < 0.1%
4889040012  1  < 0.1%
4889040006  1  < 0.1%
4889020045  1  < 0.1%
4889010336  1  < 0.1%
4889010332  1  < 0.1%
4889010331  1  < 0.1%
4889010285  1  < 0.1%
4889010240  1  < 0.1%

시설명 (facility name)
Text

Distinct: 2032
Distinct (%): 75.7%
Missing: 0
Missing (%): 0.0%
Memory size: 21.1 KiB

Length

Max length: 8
Median length: 2
Mean length: 2.1590909
Min length: 1

Characters and Unicode

Total characters: 5795
Distinct characters: 336
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1671
Unique (%): 62.3%

Sample

1st row: 장암
2nd row: 서포
3rd row: 가덕1
4th row: 모고
5th row: 구룡

Value  Count  Frequency (%)
대곡  12  0.4%
연화  11  0.4%
학동  10  0.4%
백운  9  0.3%
가곡  8  0.3%
성산  7  0.3%
신리  7  0.3%
화산  7  0.3%
용암  7  0.3%
용산  7  0.3%
Other values (2022)  2599  96.8%

Most occurring characters

Value  Count  Frequency (%)
[?]  213  3.7%
[?]  196  3.4%
[?]  191  3.3%
[?]  128  2.2%
[?]  127  2.2%
[?]  115  2.0%
[?]  107  1.8%
[?]  98  1.7%
[?]  89  1.5%
[?]  83  1.4%
Other values (326)  4448  76.8%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  5583  96.3%
Decimal Number  132  2.3%
Open Punctuation  40  0.7%
Close Punctuation  40  0.7%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
[?]  213  3.8%
[?]  196  3.5%
[?]  191  3.4%
[?]  128  2.3%
[?]  127  2.3%
[?]  115  2.1%
[?]  107  1.9%
[?]  98  1.8%
[?]  89  1.6%
[?]  83  1.5%
Other values (318)  4236  75.9%

Decimal Number
Value  Count  Frequency (%)
2  60  45.5%
1  56  42.4%
3  13  9.8%
5  1  0.8%
6  1  0.8%
4  1  0.8%

Open Punctuation
Value  Count  Frequency (%)
(  40  100.0%

Close Punctuation
Value  Count  Frequency (%)
)  40  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  5583  96.3%
Common  212  3.7%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
[?]  213  3.8%
[?]  196  3.5%
[?]  191  3.4%
[?]  128  2.3%
[?]  127  2.3%
[?]  115  2.1%
[?]  107  1.9%
[?]  98  1.8%
[?]  89  1.6%
[?]  83  1.5%
Other values (318)  4236  75.9%

Common
Value  Count  Frequency (%)
2  60  28.3%
1  56  26.4%
(  40  18.9%
)  40  18.9%
3  13  6.1%
5  1  0.5%
6  1  0.5%
4  1  0.5%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  5583  96.3%
ASCII  212  3.7%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
[?]  213  3.8%
[?]  196  3.5%
[?]  191  3.4%
[?]  128  2.3%
[?]  127  2.3%
[?]  115  2.1%
[?]  107  1.9%
[?]  98  1.8%
[?]  89  1.6%
[?]  83  1.5%
Other values (318)  4236  75.9%

ASCII
Value  Count  Frequency (%)
2  60  28.3%
1  56  26.4%
(  40  18.9%
)  40  18.9%
3  13  6.1%
5  1  0.5%
6  1  0.5%
4  1  0.5%

점검일자 (inspection date)
DateTime

Distinct: 115
Distinct (%): 4.3%
Missing: 0
Missing (%): 0.0%
Memory size: 21.1 KiB
Minimum: 1995-12-31 00:00:00
Maximum: 2018-10-12 00:00:00
Histogram with fixed size bins (bins=50)

종합평가 (overall rating)
Categorical

Distinct: 6
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 21.1 KiB

Value  Count
C  1597
D  601
B  233
-  217
A  34

Length

Max length: 4
Median length: 1
Mean length: 1.0022355
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: D
2nd row: C
3rd row: C
4th row: C
5th row: C

Common Values

Value  Count  Frequency (%)
C  1597  59.5%
D  601  22.4%
B  233  8.7%
-  217  8.1%
A  34  1.3%
<NA>  2  0.1%
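
The value list mixes letter grades with two non-grade entries, `-` and the literal string `<NA>`. If an analysis only concerns graded facilities, the shares can be recomputed over the grades alone. The sketch below assumes that both non-grade entries are placeholders for "no rating", which the report does not state; `PLACEHOLDERS` is an illustrative name.

```python
# Value counts of 종합평가 as listed in the report.
counts = {"C": 1597, "D": 601, "B": 233, "-": 217, "A": 34, "<NA>": 2}

# Assumption: "-" and the string "<NA>" mean "no rating" rather than a grade.
PLACEHOLDERS = {"-", "<NA>"}

graded = {grade: n for grade, n in counts.items() if grade not in PLACEHOLDERS}
n_graded = sum(graded.values())
n_ungraded = sum(counts.values()) - n_graded

# Share of each grade among graded facilities only, in percent.
shares = {grade: round(100 * n / n_graded, 1) for grade, n in sorted(graded.items())}
print(n_graded, n_ungraded, shares)
```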

Length

Histogram of lengths of the category

Common Values (Plot)

Bar chart of the value counts listed under Common Values.

주소 (address)
Text

Distinct: 2271
Distinct (%): 84.6%
Missing: 0
Missing (%): 0.0%
Memory size: 21.1 KiB

Length

Max length: 22
Median length: 16
Mean length: 15.872578
Min length: 11

Characters and Unicode

Total characters: 42602
Distinct characters: 321
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1937
Unique (%): 72.2%

Sample

1st row: 전라남도 영암군 영암읍 장암리
2nd row: 경상남도 하동군 진교면 고이리
3rd row: 경상남도 하동군 금성면 가덕리
4th row: 경상남도 산청군 산청읍 모고리
5th row: 경상남도 사천시 사남면 우천리

Value  Count  Frequency (%)
전라남도  781  7.3%
경상북도  433  4.0%
경상남도  420  3.9%
전라북도  313  2.9%
충청남도  266  2.5%
충청북도  156  1.5%
경기도  121  1.1%
영암군  104  1.0%
나주시  93  0.9%
강원도  73  0.7%
Other values (2753)  7938  74.2%

Most occurring characters

Value  Count  Frequency (%)
(space)  8014  18.8%
[?]  2738  6.4%
[?]  2546  6.0%
[?]  2148  5.0%
[?]  1795  4.2%
[?]  1713  4.0%
[?]  1187  2.8%
[?]  1150  2.7%
[?]  1111  2.6%
[?]  1075  2.5%
Other values (311)  19125  44.9%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  34585  81.2%
Space Separator  8014  18.8%
Decimal Number  3  < 0.1%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
[?]  2738  7.9%
[?]  2546  7.4%
[?]  2148  6.2%
[?]  1795  5.2%
[?]  1713  5.0%
[?]  1187  3.4%
[?]  1150  3.3%
[?]  1111  3.2%
[?]  1075  3.1%
[?]  1073  3.1%
Other values (309)  18049  52.2%

Space Separator
Value  Count  Frequency (%)
(space)  8014  100.0%

Decimal Number
Value  Count  Frequency (%)
1  3  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  34585  81.2%
Common  8017  18.8%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
[?]  2738  7.9%
[?]  2546  7.4%
[?]  2148  6.2%
[?]  1795  5.2%
[?]  1713  5.0%
[?]  1187  3.4%
[?]  1150  3.3%
[?]  1111  3.2%
[?]  1075  3.1%
[?]  1073  3.1%
Other values (309)  18049  52.2%

Common
Value  Count  Frequency (%)
(space)  8014  > 99.9%
1  3  < 0.1%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  34585  81.2%
ASCII  8017  18.8%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
(space)  8014  > 99.9%
1  3  < 0.1%

Hangul
Value  Count  Frequency (%)
[?]  2738  7.9%
[?]  2546  7.4%
[?]  2148  6.2%
[?]  1795  5.2%
[?]  1713  5.0%
[?]  1187  3.4%
[?]  1150  3.3%
[?]  1111  3.2%
[?]  1075  3.1%
[?]  1073  3.1%
Other values (309)  18049  52.2%

중부원점 x좌표 (central-origin x coordinate)
Real number (ℝ)

MISSING

Distinct: 2584
Distinct (%): 99.7%
Missing: 91
Missing (%): 3.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 247516.87
Minimum: 0
Maximum: 431943
Zeros: 2
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 23.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 145934
Q1: 178894
median: 231048
Q3: 315558
95-th percentile: 394413.2
Maximum: 431943
Range: 431943
Interquartile range (IQR): 136664

Descriptive statistics

Standard deviation: 79487.674
Coefficient of variation (CV): 0.32114043
Kurtosis: -0.90781935
Mean: 247516.87
Median Absolute Deviation (MAD): 61594
Skewness: 0.42081881
Sum: 6.4181124 × 10^8
Variance: 6.3182903 × 10^9
Monotonicity: Not monotonic
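
The sum and mean together show how many values the statistics are based on. The sketch below, using the figures printed for 중부원점 x좌표, suggests that the 91 missing values are excluded from the descriptive statistics and that the distinct percentage is taken over non-missing values; both are inferences from the numbers, not statements in the report.

```python
# Figures printed in the report for 중부원점 x좌표.
n_observations = 2684
n_missing = 91
n_distinct = 2584
mean = 247516.87
total = 6.4181124e8  # the "Sum" row

n_non_missing = n_observations - n_missing
implied_count = total / mean  # number of values the mean was averaged over
distinct_pct = 100 * n_distinct / n_non_missing

print(n_non_missing, round(implied_count), f"{distinct_pct:.1f}%")
```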
Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
172749  2  0.1%
379933  2  0.1%
211551  2  0.1%
229054  2  0.1%
265570  2  0.1%
0  2  0.1%
199742  2  0.1%
155350  2  0.1%
234270  2  0.1%
310474  1  < 0.1%
Other values (2574)  2574  95.9%
(Missing)  91  3.4%

Minimum 10 values

Value  Count  Frequency (%)
0  2  0.1%
106814  1  < 0.1%
110882  1  < 0.1%
111865  1  < 0.1%
112106  1  < 0.1%
112224  1  < 0.1%
112836  1  < 0.1%
112850  1  < 0.1%
113581  1  < 0.1%
114399  1  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
431943  1  < 0.1%
430473  1  < 0.1%
430131  1  < 0.1%
420269  1  < 0.1%
419904  1  < 0.1%
418755  1  < 0.1%
418498  1  < 0.1%
418497  1  < 0.1%
417310  1  < 0.1%
416548  1  < 0.1%

중부원점 y좌표 (central-origin y coordinate)
Real number (ℝ)

MISSING

Distinct: 2579
Distinct (%): 99.5%
Missing: 91
Missing (%): 3.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 249313.82
Minimum: -19575
Maximum: 547493
Zeros: 2
Zeros (%): 0.1%
Negative: 2
Negative (%): 0.1%
Memory size: 23.7 KiB

Quantile statistics

Minimum: -19575
5-th percentile: 120040.8
Q1: 173213
median: 231793
Q3: 311429
95-th percentile: 423877.8
Maximum: 547493
Range: 567068
Interquartile range (IQR): 138216

Descriptive statistics

Standard deviation: 94409.242
Coefficient of variation (CV): 0.37867633
Kurtosis: -0.16445786
Mean: 249313.82
Median Absolute Deviation (MAD): 67372
Skewness: 0.60872197
Sum: 6.4647073 × 10^8
Variance: 8.9131051 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
143556  2  0.1%
274581  2  0.1%
210706  2  0.1%
210725  2  0.1%
252175  2  0.1%
220911  2  0.1%
290458  2  0.1%
129069  2  0.1%
0  2  0.1%
162450  2  0.1%
Other values (2569)  2573  95.9%
(Missing)  91  3.4%

Minimum 10 values

Value  Count  Frequency (%)
-19575  1  < 0.1%
-2504  1  < 0.1%
0  2  0.1%
89630  1  < 0.1%
89836  1  < 0.1%
96200  1  < 0.1%
97743  1  < 0.1%
97931  1  < 0.1%
98341  1  < 0.1%
98630  1  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
547493  1  < 0.1%
533250  1  < 0.1%
532281  1  < 0.1%
530847  1  < 0.1%
530710  1  < 0.1%
530562  1  < 0.1%
525494  1  < 0.1%
524732  1  < 0.1%
523301  1  < 0.1%
518927  1  < 0.1%

Interactions

Nine interaction plots, one for each pairing of the three numeric variables.

Correlations

Variable  시설구분  시설코드  종합평가  중부원점 x좌표  중부원점 y좌표
시설구분  1.000  0.209  0.094  0.222  0.185
시설코드  0.209  1.000  0.116  0.620  0.704
종합평가  0.094  0.116  1.000  0.108  0.231
중부원점 x좌표  0.222  0.620  0.108  1.000  0.637
중부원점 y좌표  0.185  0.704  0.231  0.637  1.000

Variable  종합평가  시설구분
종합평가  1.000  0.077
시설구분  0.077  1.000

Variable  시설코드  중부원점 x좌표  중부원점 y좌표  시설구분  종합평가
시설코드  1.000  0.345  -0.418  0.151  0.074
중부원점 x좌표  0.345  1.000  0.333  0.143  0.062
중부원점 y좌표  -0.418  0.333  1.000  0.111  0.097
시설구분  0.151  0.143  0.111  1.000  0.077
종합평가  0.074  0.062  0.097  0.077  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
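
Nullity correlation can be illustrated by correlating the present/missing indicators of two columns. The sketch below is a plain-Python illustration on hypothetical toy columns, not the report's own implementation; `nullity_correlation` is an illustrative name and `None` stands for a missing value.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Undefined when a column is fully present or fully missing.
        return float("nan")
    return cov / math.sqrt(var_a * var_b)

# Hypothetical toy columns, not taken from the dataset.
x = [176130, None, 274708, None, 300630]
y = [145927, None, 161516, None, 172404]
z = [1, 2, None, 4, 5]

same_rows = nullity_correlation(x, y)       # values missing in the same rows
different_rows = nullity_correlation(x, z)  # values missing in different rows
print(same_rows, different_rows)
```

A value near 1 means the two columns tend to be missing in the same rows; a negative value means one tends to be present where the other is missing.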

Sample

First rows

Index  시설구분  시설코드  시설명  점검일자  종합평가  주소  중부원점 x좌표  중부원점 y좌표
0  저수지  4683010009  장암  1998-12-31  D  전라남도 영암군 영암읍 장암리  176130  145927
1  저수지  4885010103  서포  2015-12-15  C  경상남도 하동군 진교면 고이리  281962  173818
2  배수장  4885040006  가덕1  2014-11-14  C  경상남도 하동군 금성면 가덕리  274708  161516
3  저수지  4886010005  모고  2000-12-31  C  경상남도 산청군 산청읍 모고리  281127  216159
4  저수지  4824010051  구룡  2017-11-10  C  경상남도 사천시 사남면 우천리  300630  172404
5  양수장  4683020055  산호  2013-10-02  C  전라남도 영암군 삼호읍 산호리  152291  141182
6  저수지  4715010110  광덕  2014-11-14  C  경상북도 김천시 감문면 광덕리  308227  303703
7  저수지  4415010067  요룡  2017-12-31  C  충청남도 공주시 의당면 요룡리  212259  336199
8  저수지  4579010327  신림  2017-12-31  C  전라북도 고창군 신림면 자포리  173855  219118
9  저수지  4682010313  동외1  2011-11-17  D  전라남도 해남군 문내면 동외리  138070  122967

Last rows

Index  시설구분  시설코드  시설명  점검일자  종합평가  주소  중부원점 x좌표  중부원점 y좌표
2674  저수지  3171010122  반계1  2010-11-30  C  울산광역시 울주군 웅촌면 고연리  395957  218788
2675  저수지  4372010036  둔덕  2014-11-14  D  충청북도 보은군 삼승면 둔덕리  264775  328152
2676  저수지  4690010107  고야  2010-11-30  D  전라남도 진도군 지산면 고야리  123721  105904
2677  저수지  4617010232  만봉  2017-11-10  B  전라남도 나주시 봉황면 만봉리  179177  157204
2678  저수지  4376010076  용정  2002-12-31  C  충청북도 괴산군 사리면 방축리  259037  368157
2679  저수지  4579010337  수동  2013-11-30  D  전라북도 고창군 부안면 수동리  169692  225911
2680  저수지  4773010112  전대  2017-11-10  C  경상북도 의성군 옥산면 구성리  362337  320856
2681  저수지  4617010139  대도  2016-07-15  C  전라남도 나주시 문평면 대도리  165553  174264
2682  저수지  4623010038  진월  2013-11-20  C  전라남도 광양시 진월면 신아리  270111  166067
2683  저수지  4677010255  운대  2009-10-07  D  전라남도 고흥군 두원면 운대리  228449  125578