Overview

Dataset statistics

Number of variables: 6
Number of observations: 26
Missing cells: 30
Missing cells (%): 19.2%
Duplicate rows: 1
Duplicate rows (%): 3.8%
Total size in memory: 1.4 KiB
Average record size in memory: 56.1 B
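
The two percentages above follow directly from the counts. A quick check, using only figures printed in this report (the variable names are illustrative):

```python
# Figures copied from the dataset statistics in this report.
n_variables = 6
n_observations = 26
missing_cells = 30
duplicate_rows = 1

total_cells = n_variables * n_observations  # 6 columns x 26 rows

# Missing cells are reported relative to all cells,
# duplicate rows relative to all rows.
missing_pct = round(100 * missing_cells / total_cells, 1)
duplicate_pct = round(100 * duplicate_rows / n_observations, 1)

print(f"Missing cells (%): {missing_pct}")
print(f"Duplicate rows (%): {duplicate_pct}")
```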

Variable types

Categorical: 1
Text: 2
Numeric: 3

Dataset

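Description: Provides information on artificial reefs located in each city and county (시군) of Gyeongnam, including facility location, region name, year, type, area, and project cost.
Author: Gyeongsangnam-do (경상남도)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=3034247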

Alerts

Dataset has 1 (3.8%) duplicate rows [Duplicates]
면적 is highly overall correlated with 시군 [High correlation]
수량(개) is highly overall correlated with 시군 [High correlation]
사업비(백만원) is highly overall correlated with 시군 [High correlation]
시군 is highly overall correlated with 면적 and 2 other fields [High correlation]
시설위치 has 6 (23.1%) missing values [Missing]
면적 has 6 (23.1%) missing values [Missing]
종류 has 6 (23.1%) missing values [Missing]
수량(개) has 6 (23.1%) missing values [Missing]
사업비(백만원) has 6 (23.1%) missing values [Missing]

Reproduction

Analysis started: 2023-12-10 23:53:23.744316
Analysis finished: 2023-12-10 23:53:25.001115
Duration: 1.26 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시군
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 26.9%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B
남해
<NA>
통영
거제
사천
Other values (2)

Length

Max length: 4
Median length: 2
Mean length: 2.4615385
Min length: 1

Unique

Unique: 2
Unique (%): 7.7%

Sample

1st row:
2nd row: 통영
3rd row: 통영
4th row: 통영
5th row: 통영

Common Values

| Value | Count | Frequency (%) |
| 남해 | 7 | 26.9% |
| <NA> | 6 | 23.1% |
| 통영 | 5 | 19.2% |
| 거제 | 4 | 15.4% |
| 사천 | 2 | 7.7% |
|  | 1 | 3.8% |
| 고성 | 1 | 3.8% |

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values of 시군]

시설위치
Text

MISSING 

Distinct: 20
Distinct (%): 100.0%
Missing: 6
Missing (%): 23.1%
Memory size: 340.0 B

Length

Max length: 13
Median length: 12
Mean length: 10.4
Min length: 6

Characters and Unicode

Total characters: 208
Distinct characters: 75
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
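
As an illustration of what such a character property looks like, the sketch below counts characters by Unicode general category with Python's standard `unicodedata` module. It is not the profiler's own implementation; the helper name `category_counts` is made up for this example, and the input is one of the sample values of 시설위치 shown in this report.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# "Lo" = Other Letter, "Nd" = Decimal Number, "Zs" = Space Separator.
counts = category_counts("19개 해역")
print(dict(counts))  # {'Nd': 2, 'Lo': 3, 'Zs': 1}
```

The two-letter codes map onto the category names used in the tables of this section: Hangul syllables fall under Other Letter, digits under Decimal Number, and the blank under Space Separator.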

Unique

Unique: 20
Unique (%): 100.0%

Sample

1st row: 19개 해역
2nd row: 산양면 오곡도해역
3rd row: 한산면 용초도해역
4th row: 욕지면 두미 남구해역
5th row: 산양읍 연화리 중화해역

| Value | Count | Frequency (%) |
| 남면 | 3 | 5.6% |
| 한산면 | 2 | 3.7% |
| 해역 | 2 | 3.7% |
| 마도동 | 2 | 3.7% |
| 산양면 | 1 | 1.9% |
| 홍현지선 | 1 | 1.9% |
| 춘암리 | 1 | 1.9% |
| 지선 | 1 | 1.9% |
| 유구해역 | 1 | 1.9% |
| 서면 | 1 | 1.9% |
| Other values (39) | 39 | 72.2% |

Most occurring characters

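| Value | Count | Frequency (%) |
| (space) | 34 | 16.3% |
|  | 15 | 7.2% |
|  | 14 | 6.7% |
|  | 12 | 5.8% |
|  | 10 | 4.8% |
|  | 10 | 4.8% |
|  | 10 | 4.8% |
|  | 6 | 2.9% |
|  | 5 | 2.4% |
|  | 5 | 2.4% |
| Other values (65) | 87 | 41.8% |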

Most occurring categories

| Value | Count | Frequency (%) |
| Other Letter | 172 | 82.7% |
| Space Separator | 34 | 16.3% |
| Decimal Number | 2 | 1.0% |

Most frequent character per category

Other Letter
| Value | Count | Frequency (%) |
|  | 15 | 8.7% |
|  | 14 | 8.1% |
|  | 12 | 7.0% |
|  | 10 | 5.8% |
|  | 10 | 5.8% |
|  | 10 | 5.8% |
|  | 6 | 3.5% |
|  | 5 | 2.9% |
|  | 5 | 2.9% |
|  | 4 | 2.3% |
| Other values (62) | 81 | 47.1% |

Decimal Number
| Value | Count | Frequency (%) |
| 9 | 1 | 50.0% |
| 1 | 1 | 50.0% |

Space Separator
| Value | Count | Frequency (%) |
| (space) | 34 | 100.0% |

Most occurring scripts

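| Value | Count | Frequency (%) |
| Hangul | 172 | 82.7% |
| Common | 36 | 17.3% |

Most frequent character per script

Hangul
| Value | Count | Frequency (%) |
|  | 15 | 8.7% |
|  | 14 | 8.1% |
|  | 12 | 7.0% |
|  | 10 | 5.8% |
|  | 10 | 5.8% |
|  | 10 | 5.8% |
|  | 6 | 3.5% |
|  | 5 | 2.9% |
|  | 5 | 2.9% |
|  | 4 | 2.3% |
| Other values (62) | 81 | 47.1% |

Common
| Value | Count | Frequency (%) |
| (space) | 34 | 94.4% |
| 9 | 1 | 2.8% |
| 1 | 1 | 2.8% |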

Most occurring blocks

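| Value | Count | Frequency (%) |
| Hangul | 172 | 82.7% |
| ASCII | 36 | 17.3% |

Most frequent character per block

ASCII
| Value | Count | Frequency (%) |
| (space) | 34 | 94.4% |
| 9 | 1 | 2.8% |
| 1 | 1 | 2.8% |

Hangul
| Value | Count | Frequency (%) |
|  | 15 | 8.7% |
|  | 14 | 8.1% |
|  | 12 | 7.0% |
|  | 10 | 5.8% |
|  | 10 | 5.8% |
|  | 10 | 5.8% |
|  | 6 | 3.5% |
|  | 5 | 2.9% |
|  | 5 | 2.9% |
|  | 4 | 2.3% |
| Other values (62) | 81 | 47.1% |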

면적
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 7
Distinct (%): 35.0%
Missing: 6
Missing (%): 23.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 21
Minimum: 4
Maximum: 210
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 366.0 B

Quantile statistics

Minimum: 4
5-th percentile: 4
Q1: 7
Median: 11
Q3: 16
95-th percentile: 31.4
Maximum: 210
Range: 206
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 44.822457
Coefficient of variation (CV): 2.1344027
Kurtosis: 19.307501
Mean: 21
Median Absolute Deviation (MAD): 5
Skewness: 4.3613182
Sum: 420
Variance: 2009.0526
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
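| Value | Count | Frequency (%) |
| 16 | 7 | 26.9% |
| 4 | 5 | 19.2% |
| 8 | 3 | 11.5% |
| 10 | 2 | 7.7% |
| 210 | 1 | 3.8% |
| 12 | 1 | 3.8% |
| 22 | 1 | 3.8% |
| (Missing) | 6 | 23.1% |

| Value | Count | Frequency (%) |
| 4 | 5 | 19.2% |
| 8 | 3 | 11.5% |
| 10 | 2 | 7.7% |
| 12 | 1 | 3.8% |
| 16 | 7 | 26.9% |
| 22 | 1 | 3.8% |
| 210 | 1 | 3.8% |

| Value | Count | Frequency (%) |
| 210 | 1 | 3.8% |
| 22 | 1 | 3.8% |
| 16 | 7 | 26.9% |
| 12 | 1 | 3.8% |
| 10 | 2 | 7.7% |
| 8 | 3 | 11.5% |
| 4 | 5 | 19.2% |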
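
Because 면적 has only seven distinct values, the frequency table above is enough to recompute the summary statistics. The sketch below does so with Python's standard `statistics` module. It assumes the reported variance uses the sample (n − 1) denominator and that quartiles are linearly interpolated, which is what `method="inclusive"` does; both assumptions agree with the figures printed in this section.

```python
import statistics

# Value -> count for 면적, copied from the frequency table (missing rows excluded).
area_counts = {4: 5, 8: 3, 10: 2, 12: 1, 16: 7, 22: 1, 210: 1}

# Expand the frequency table back into a sorted list of the 20 observations.
areas = sorted(value for value, n in area_counts.items() for _ in range(n))

mean = statistics.mean(areas)
variance = statistics.variance(areas)  # sample variance, n - 1 denominator
stdev = statistics.stdev(areas)

# "inclusive" interpolates linearly between neighbouring observations.
q1, median, q3 = statistics.quantiles(areas, n=4, method="inclusive")

print("n, sum, mean:", len(areas), sum(areas), mean)
print("variance, stdev:", round(variance, 4), round(stdev, 6))
print("Q1, median, Q3, IQR:", q1, median, q3, q3 - q1)
```

The single value 210 dominates the spread: it lies far outside the range of the other 19 observations (4 to 22), which is why the standard deviation is more than twice the mean.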

종류
Text

MISSING 

Distinct: 13
Distinct (%): 65.0%
Missing: 6
Missing (%): 23.1%
Memory size: 340.0 B

Length

Max length: 11
Median length: 10
Mean length: 7.05
Min length: 3

Characters and Unicode

Total characters: 141
Distinct characters: 43
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7
Unique (%): 35.0%

Sample

1st row: 12종
2nd row: 팔각상자형강제어초
3rd row: 대형강제어초
4th row: 팔각상자형강제어초
5th row: 사각복합형인공어초

| Value | Count | Frequency (%) |
| 강제증식어초 | 3 | 15.0% |
| 팔각상자형강제어초 | 2 | 10.0% |
| 대형강제어초 | 2 | 10.0% |
| 신요철형어초 | 2 | 10.0% |
| 중형연약지반용강제어초 | 2 | 10.0% |
| 반톱니형어초 | 2 | 10.0% |
| 12종 | 1 | 5.0% |
| 사각복합형인공어초 | 1 | 5.0% |
| 아치형어초 | 1 | 5.0% |
| 팔각별강제어초 | 1 | 5.0% |
| Other values (3) | 3 | 15.0% |

Most occurring characters

| Value | Count | Frequency (%) |
|  | 19 | 13.5% |
|  | 19 | 13.5% |
|  | 13 | 9.2% |
|  | 11 | 7.8% |
|  | 11 | 7.8% |
|  | 5 | 3.5% |
|  | 4 | 2.8% |
|  | 3 | 2.1% |
|  | 3 | 2.1% |
|  | 3 | 2.1% |
| Other values (33) | 50 | 35.5% |

Most occurring categories

| Value | Count | Frequency (%) |
| Other Letter | 138 | 97.9% |
| Decimal Number | 2 | 1.4% |
| Uppercase Letter | 1 | 0.7% |

Most frequent character per category

Other Letter
| Value | Count | Frequency (%) |
|  | 19 | 13.8% |
|  | 19 | 13.8% |
|  | 13 | 9.4% |
|  | 11 | 8.0% |
|  | 11 | 8.0% |
|  | 5 | 3.6% |
|  | 4 | 2.9% |
|  | 3 | 2.2% |
|  | 3 | 2.2% |
|  | 3 | 2.2% |
| Other values (30) | 47 | 34.1% |

Decimal Number
| Value | Count | Frequency (%) |
| 2 | 1 | 50.0% |
| 1 | 1 | 50.0% |

Uppercase Letter
| Value | Count | Frequency (%) |
| H | 1 | 100.0% |

Most occurring scripts

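| Value | Count | Frequency (%) |
| Hangul | 138 | 97.9% |
| Common | 2 | 1.4% |
| Latin | 1 | 0.7% |

Most frequent character per script

Hangul
| Value | Count | Frequency (%) |
|  | 19 | 13.8% |
|  | 19 | 13.8% |
|  | 13 | 9.4% |
|  | 11 | 8.0% |
|  | 11 | 8.0% |
|  | 5 | 3.6% |
|  | 4 | 2.9% |
|  | 3 | 2.2% |
|  | 3 | 2.2% |
|  | 3 | 2.2% |
| Other values (30) | 47 | 34.1% |

Common
| Value | Count | Frequency (%) |
| 2 | 1 | 50.0% |
| 1 | 1 | 50.0% |

Latin
| Value | Count | Frequency (%) |
| H | 1 | 100.0% |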

Most occurring blocks

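| Value | Count | Frequency (%) |
| Hangul | 138 | 97.9% |
| ASCII | 3 | 2.1% |

Most frequent character per block

Hangul
| Value | Count | Frequency (%) |
|  | 19 | 13.8% |
|  | 19 | 13.8% |
|  | 13 | 9.4% |
|  | 11 | 8.0% |
|  | 11 | 8.0% |
|  | 5 | 3.6% |
|  | 4 | 2.9% |
|  | 3 | 2.2% |
|  | 3 | 2.2% |
|  | 3 | 2.2% |
| Other values (30) | 47 | 34.1% |

ASCII
| Value | Count | Frequency (%) |
| H | 1 | 33.3% |
| 2 | 1 | 33.3% |
| 1 | 1 | 33.3% |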

수량(개)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 15
Distinct (%): 75.0%
Missing: 6
Missing (%): 23.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 117.2
Minimum: 2
Maximum: 1172
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 366.0 B

Quantile statistics

Minimum: 2
5-th percentile: 2
Q1: 5.25
Median: 28.5
Q3: 124.25
95-th percentile: 261.9
Maximum: 1172
Range: 1170
Interquartile range (IQR): 119

Descriptive statistics

Standard deviation: 259.38807
Coefficient of variation (CV): 2.2132087
Kurtosis: 16.24295
Mean: 117.2
Median Absolute Deviation (MAD): 25.5
Skewness: 3.8906612
Sum: 2344
Variance: 67282.168
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)
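| Value | Count | Frequency (%) |
| 3 | 3 | 11.5% |
| 2 | 2 | 7.7% |
| 185 | 2 | 7.7% |
| 6 | 2 | 7.7% |
| 1172 | 1 | 3.8% |
| 50 | 1 | 3.8% |
| 214 | 1 | 3.8% |
| 7 | 1 | 3.8% |
| 104 | 1 | 3.8% |
| 205 | 1 | 3.8% |
| Other values (5) | 5 | 19.2% |
| (Missing) | 6 | 23.1% |

| Value | Count | Frequency (%) |
| 2 | 2 | 7.7% |
| 3 | 3 | 11.5% |
| 6 | 2 | 7.7% |
| 7 | 1 | 3.8% |
| 27 | 1 | 3.8% |
| 28 | 1 | 3.8% |
| 29 | 1 | 3.8% |
| 34 | 1 | 3.8% |
| 50 | 1 | 3.8% |
| 79 | 1 | 3.8% |

| Value | Count | Frequency (%) |
| 1172 | 1 | 3.8% |
| 214 | 1 | 3.8% |
| 205 | 1 | 3.8% |
| 185 | 2 | 7.7% |
| 104 | 1 | 3.8% |
| 79 | 1 | 3.8% |
| 50 | 1 | 3.8% |
| 34 | 1 | 3.8% |
| 29 | 1 | 3.8% |
| 28 | 1 | 3.8% |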

사업비(백만원)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 15
Distinct (%): 75.0%
Missing: 6
Missing (%): 23.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 602.5
Minimum: 276
Maximum: 6025
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 366.0 B

Quantile statistics

Minimum: 276
5-th percentile: 278.85
Q1: 303.5
Median: 308
Q3: 335.5
95-th percentile: 675.55
Maximum: 6025
Range: 5749
Interquartile range (IQR): 32

Descriptive statistics

Standard deviation: 1276.6524
Coefficient of variation (CV): 2.1189252
Kurtosis: 19.975989
Mean: 602.5
Median Absolute Deviation (MAD): 9
Skewness: 4.4683171
Sum: 12050
Variance: 1629841.4
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)
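| Value | Count | Frequency (%) |
| 305 | 3 | 11.5% |
| 304 | 2 | 7.7% |
| 299 | 2 | 7.7% |
| 311 | 2 | 7.7% |
| 6025 | 1 | 3.8% |
| 356 | 1 | 3.8% |
| 279 | 1 | 3.8% |
| 355 | 1 | 3.8% |
| 276 | 1 | 3.8% |
| 329 | 1 | 3.8% |
| Other values (5) | 5 | 19.2% |
| (Missing) | 6 | 23.1% |

| Value | Count | Frequency (%) |
| 276 | 1 | 3.8% |
| 279 | 1 | 3.8% |
| 299 | 2 | 7.7% |
| 302 | 1 | 3.8% |
| 304 | 2 | 7.7% |
| 305 | 3 | 11.5% |
| 311 | 2 | 7.7% |
| 312 | 1 | 3.8% |
| 317 | 1 | 3.8% |
| 329 | 1 | 3.8% |

| Value | Count | Frequency (%) |
| 6025 | 1 | 3.8% |
| 394 | 1 | 3.8% |
| 362 | 1 | 3.8% |
| 356 | 1 | 3.8% |
| 355 | 1 | 3.8% |
| 329 | 1 | 3.8% |
| 317 | 1 | 3.8% |
| 312 | 1 | 3.8% |
| 311 | 2 | 7.7% |
| 305 | 3 | 11.5% |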

Interactions

[Figures: 9 interaction plots]

Correlations

|  | 시군 | 시설위치 | 면적 | 종류 | 수량(개) | 사업비(백만원) |
| 시군 | 1.000 | 1.000 | 1.000 | 0.741 | 0.956 | 1.000 |
| 시설위치 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| 면적 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.618 |
| 종류 | 0.741 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| 수량(개) | 0.956 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| 사업비(백만원) | 1.000 | 1.000 | 0.618 | 1.000 | 1.000 | 1.000 |

|  | 면적 | 수량(개) | 사업비(백만원) | 시군 |
| 면적 | 1.000 | -0.480 | 0.268 | 0.882 |
| 수량(개) | -0.480 | 1.000 | 0.201 | 0.671 |
| 사업비(백만원) | 0.268 | 0.201 | 1.000 | 0.882 |
| 시군 | 0.882 | 0.671 | 0.882 | 1.000 |

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
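
To make nullity correlation concrete, the sketch below assumes it is the Pearson correlation between per-column missing-value indicators (1 = missing, 0 = present). The report does not spell out the formula, so treat that as an assumption; the helper `pearson` is written out here for illustration. The indicator vectors reflect this dataset, where 시설위치 and 면적 are each missing in 6 of 26 rows and the sample shows those to be rows 20 to 25.

```python
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# 1 = value missing, 0 = value present; rows 0-19 are filled, rows 20-25 are empty.
null_location = [0] * 20 + [1] * 6  # 시설위치
null_area = [0] * 20 + [1] * 6      # 면적

r = pearson(null_location, null_area)
print(round(r, 6))
```

Two columns that are always missing together give a coefficient of 1, which is the pattern expected here for the five columns that each have 6 missing values.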

Sample

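|  | 시군 | 시설위치 | 면적 | 종류 | 수량(개) | 사업비(백만원) |
| 0 |  | 19개 해역 | 210 | 12종 | 1172 | 6025 |
| 1 | 통영 | 산양면 오곡도해역 | 16 | 팔각상자형강제어초 | 3 | 356 |
| 2 | 통영 | 한산면 용초도해역 | 16 | 대형강제어초 | 2 | 279 |
| 3 | 통영 | 욕지면 두미 남구해역 | 16 | 팔각상자형강제어초 | 3 | 355 |
| 4 | 통영 | 산양읍 연화리 중화해역 | 8 | 사각복합형인공어초 | 50 | 305 |
| 5 | 통영 | 한산면 소매물도 해역 | 16 | 대형강제어초 | 2 | 276 |
| 6 | 사천 | 마도동 둥근섬 북측해역 | 4 | 신요철형어초 | 214 | 329 |
| 7 | 사천 | 마도동 저도 서북쪽해역 | 16 | 중형연약지반용강제어초 | 7 | 394 |
| 8 | 거제 | 일운면 망치리 양화지선 | 10 | 아치형어초 | 104 | 312 |
| 9 | 거제 | 남부면 갈곶리 해금강지선 | 16 | 팔각별강제어초 | 3 | 317 |

|  | 시군 | 시설위치 | 면적 | 종류 | 수량(개) | 사업비(백만원) |
| 16 | 남해 | 남면 홍현리 홍현지선 | 8 | 육각패널H빔어초 | 34 | 305 |
| 17 | 남해 | 미조면 미조리 팔랑지선 | 4 | 터널형어초 | 79 | 302 |
| 18 | 남해 | 창선면 진동리 적량지선 | 8 | 반톱니형어초 | 185 | 304 |
| 19 | 남해 | 남면 선구리 항촌지선 | 22 | 중형연약지반용강제어초 | 6 | 299 |
| 20 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
| 21 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
| 22 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
| 23 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
| 24 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
| 25 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |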
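
Row 0 of the sample (시설위치 "19개 해역", 종류 "12종") carries the maximum of every numeric column. Using only the Sum and Maximum figures reported for each variable, the check below shows that each maximum equals the sum of all remaining rows. That would be consistent with row 0 being a totals row rather than an individual reef site, but this reading is an inference from the numbers and is not stated anywhere in the report.

```python
# (sum, maximum) per numeric column, copied from the statistics in this report.
reported = {
    "면적": (420, 210),
    "수량(개)": (2344, 1172),
    "사업비(백만원)": (12050, 6025),
}

for column, (total, maximum) in reported.items():
    rest = total - maximum  # sum of all other non-missing rows
    print(f"{column}: max={maximum}, sum of the rest={rest}, equal={rest == maximum}")

# True only if the relationship holds for every numeric column.
max_equals_rest = all(total - mx == mx for total, mx in reported.values())
print(max_equals_rest)
```

If row 0 is indeed an aggregate, leaving it in doubles every column sum and is likely the main driver of the large skewness and kurtosis values reported for the numeric variables.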

Duplicate rows

Most frequently occurring

|  | 시군 | 시설위치 | 면적 | 종류 | 수량(개) | 사업비(백만원) | # duplicates |
| 0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 6 |