Overview

Dataset statistics

Number of variables: 8
Number of observations: 2143
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 540
Duplicate rows (%): 25.2%
Total size in memory: 138.3 KiB
Average record size in memory: 66.1 B
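
The percentage and per-record figures follow directly from the raw counts. The sketch below is a minimal check, assuming that KiB in the report means 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Values copied from the Dataset statistics block.
n_observations = 2143
duplicate_rows = 540
total_size_kib = 138.3  # assumption: 1 KiB = 1024 bytes

# Share of rows flagged as duplicates, in percent.
duplicate_pct = round(100 * duplicate_rows / n_observations, 1)

# Average memory footprint of one record, in bytes.
avg_record_bytes = round(total_size_kib * 1024 / n_observations, 1)

print(f"Duplicate rows (%): {duplicate_pct}%")
print(f"Average record size in memory: {avg_record_bytes} B")
```

Both results agree with the reported 25.2% and 66.1 B after rounding to one decimal place.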

Variable types

Categorical: 4
Numeric: 2
Text: 2

Alerts

Dataset has 540 (25.2%) duplicate rows [Duplicates]
어초종류 is highly overall correlated with 시설연도 and 3 other fields [High correlation]
용도구분 is highly overall correlated with 시설면적 and 2 other fields [High correlation]
시군구분 is highly overall correlated with 시설해역 [High correlation]
시설해역 is highly overall correlated with 시설연도 and 3 other fields [High correlation]
시설연도 is highly overall correlated with 시설해역 and 1 other field [High correlation]
시설면적 is highly overall correlated with 용도구분 and 1 other field [High correlation]

Reproduction

Analysis started: 2023-12-10 22:03:15.935716
Analysis finished: 2023-12-10 22:03:16.862561
Duration: 0.93 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

용도구분 (purpose category)
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.9 KiB

Value | Count
어류용 | 1371
패조류용 | 772

Length

Max length: 4
Median length: 3
Mean length: 3.3602427
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 어류용
2nd row: 어류용
3rd row: 어류용
4th row: 어류용
5th row: 어류용

Common Values

Value | Count | Frequency (%)
어류용 | 1371 | 64.0%
패조류용 | 772 | 36.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the value counts listed under Common Values]

시설해역 (facility sea area)
Categorical

HIGH CORRELATION 

Distinct: 43
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 16.9 KiB

Value | Count
풍도해역 | 272
국화도해역 | 198
대부해역 | 152
입파도(서)해역 | 144
중육도해역 | 136
Other values (38) | 1241

Length

Max length: 9
Median length: 8
Mean length: 5.6420905
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 국화도(동)해역
2nd row: 국화도(동)해역
3rd row: 국화도(동)해역
4th row: 국화도(동)해역
5th row: 국화도(동)해역

Common Values

Value | Count | Frequency (%)
풍도해역 | 272 | 12.7%
국화도해역 | 198 | 9.2%
대부해역 | 152 | 7.1%
입파도(서)해역 | 144 | 6.7%
중육도해역 | 136 | 6.3%
말육도해역 | 120 | 5.6%
대부도해역 | 120 | 5.6%
육도해역 | 106 | 4.9%
입파도(남)해역 | 100 | 4.7%
학산서해역 | 72 | 3.4%
Other values (33) | 723 | 33.7%

Length

[Figure: Histogram of lengths of the category]

시군구분 (city/county)
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.9 KiB

Value | Count
안산시 | 1094
화성시 | 1033
<NA> | 16

Length

Max length: 4
Median length: 3
Mean length: 3.0074662
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 화성시
2nd row: 화성시
3rd row: 화성시
4th row: 화성시
5th row: 화성시

Common Values

Value | Count | Frequency (%)
안산시 | 1094 | 51.0%
화성시 | 1033 | 48.2%
<NA> | 16 | 0.7%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the value counts listed under Common Values]

시설연도 (installation year)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 28
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2005.4204
Minimum: 1988
Maximum: 2021
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.0 KiB

Quantile statistics

Minimum: 1988
5-th percentile: 1994
Q1: 2000
median: 2006
Q3: 2010
95-th percentile: 2016
Maximum: 2021
Range: 33
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 6.7697281
Coefficient of variation (CV): 0.0033757151
Kurtosis: -0.63981415
Mean: 2005.4204
Median Absolute Deviation (MAD): 5
Skewness: -0.24228889
Sum: 4297616
Variance: 45.829218
Monotonicity: Not monotonic
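
Several of the statistics for 시설연도 are derived from others in the same block. As a minimal sketch (the variable names are illustrative, and the tolerances allow for the rounding of the printed inputs), the relationships can be checked like this:

```python
import math

# Values copied from the 시설연도 statistics.
n = 2143
total = 4297616            # Sum
std = 6.7697281            # Standard deviation
minimum, q1, q3, maximum = 1988, 2000, 2010, 2021

mean = total / n               # Mean = Sum / number of observations
value_range = maximum - minimum
iqr = q3 - q1                  # Interquartile range = Q3 - Q1
variance = std ** 2            # Variance = standard deviation squared
cv = std / mean                # Coefficient of variation = std / mean

print(f"mean={mean:.4f} range={value_range} iqr={iqr}")
print(f"variance={variance:.6f} cv={cv:.10f}")
```

The very small CV reflects that calendar years sit far from zero relative to their spread, so it says little about variability here.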
[Figure: Histogram with fixed size bins (bins=28)]

Most frequent values

Value | Count | Frequency (%)
2008 | 180 | 8.4%
2009 | 160 | 7.5%
2005 | 140 | 6.5%
2004 | 124 | 5.8%
2007 | 120 | 5.6%
1995 | 120 | 5.6%
2002 | 108 | 5.0%
2006 | 108 | 5.0%
2011 | 102 | 4.8%
2015 | 92 | 4.3%
Other values (18) | 889 | 41.5%

Smallest values (ascending)

Value | Count | Frequency (%)
1988 | 24 | 1.1%
1994 | 88 | 4.1%
1995 | 120 | 5.6%
1996 | 80 | 3.7%
1997 | 80 | 3.7%
1998 | 68 | 3.2%
1999 | 44 | 2.1%
2000 | 48 | 2.2%
2001 | 28 | 1.3%
2002 | 108 | 5.0%

Largest values (descending)

Value | Count | Frequency (%)
2021 | 8 | 0.4%
2020 | 10 | 0.5%
2018 | 15 | 0.7%
2017 | 40 | 1.9%
2016 | 36 | 1.7%
2015 | 92 | 4.3%
2014 | 64 | 3.0%
2013 | 56 | 2.6%
2012 | 84 | 3.9%
2011 | 102 | 4.8%

어초종류 (reef type)
Categorical

HIGH CORRELATION 

Distinct: 40
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 16.9 KiB

Value | Count
사각형 | 656
신요철형 | 206
아치형 | 132
대형전주 | 116
사각전주 | 112
Other values (35) | 921

Length

Max length: 11
Median length: 9
Mean length: 4.1507233
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 사각전주
2nd row: 사각전주
3rd row: 사각전주
4th row: 사각전주
5th row: 사각전주

Common Values

Value | Count | Frequency (%)
사각형 | 656 | 30.6%
신요철형 | 206 | 9.6%
아치형 | 132 | 6.2%
대형전주 | 116 | 5.4%
사각전주 | 112 | 5.2%
정삼각뿔형 | 94 | 4.4%
강제고기굴 | 70 | 3.3%
피라미드형강제 | 60 | 2.8%
반원가지 | 56 | 2.6%
이중돔형 | 56 | 2.6%
Other values (30) | 585 | 27.3%

Length

[Figure: Histogram of lengths of the category]

시설면적 (facility area)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.602427
Minimum: 2
Maximum: 20
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.0 KiB

Quantile statistics

Minimum: 2
5-th percentile: 4
Q1: 4
median: 16
Q3: 16
95-th percentile: 16
Maximum: 20
Range: 18
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 5.7880906
Coefficient of variation (CV): 0.49886897
Kurtosis: -1.6178428
Mean: 11.602427
Median Absolute Deviation (MAD): 0
Skewness: -0.57612942
Sum: 24864
Variance: 33.501992
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=9)]

Most frequent values

Value | Count | Frequency (%)
16 | 1340 | 62.5%
4 | 678 | 31.6%
2 | 62 | 2.9%
8 | 39 | 1.8%
11 | 8 | 0.4%
20 | 4 | 0.2%
6 | 4 | 0.2%
12 | 4 | 0.2%
9 | 4 | 0.2%

Smallest values (ascending)

Value | Count | Frequency (%)
2 | 62 | 2.9%
4 | 678 | 31.6%
6 | 4 | 0.2%
8 | 39 | 1.8%
9 | 4 | 0.2%
11 | 8 | 0.4%
12 | 4 | 0.2%
16 | 1340 | 62.5%
20 | 4 | 0.2%

Largest values (descending)

Value | Count | Frequency (%)
20 | 4 | 0.2%
16 | 1340 | 62.5%
12 | 4 | 0.2%
11 | 8 | 0.4%
9 | 4 | 0.2%
8 | 39 | 1.8%
6 | 4 | 0.2%
4 | 678 | 31.6%
2 | 62 | 2.9%
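
Because 시설면적 takes only nine distinct values, its frequency table is enough to recompute several of the summary statistics. The sketch below is an illustrative check using only Python's standard library; it expands the counts back into one value per observation and is not how the report itself computes them.

```python
import statistics

# Value -> count pairs copied from the 시설면적 frequency table.
counts = {16: 1340, 4: 678, 2: 62, 8: 39, 11: 8, 20: 4, 6: 4, 12: 4, 9: 4}

# Expand the table back into one value per observation.
values = [value for value, count in counts.items() for _ in range(count)]

n = len(values)
total = sum(values)
mean = total / n
median = statistics.median(values)
# Median absolute deviation: median of the distances to the median.
mad = statistics.median(abs(v - median) for v in values)

print(f"n={n} sum={total} mean={mean:.6f} median={median} MAD={mad}")
```

The MAD of 0 follows from more than half of the observations (1340 of 2143) sharing the median value 16.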

위도 (latitude)
Text

Distinct: 369
Distinct (%): 17.2%
Missing: 0
Missing (%): 0.0%
Memory size: 16.9 KiB

Length

Max length: 14
Median length: 13
Mean length: 11.949137
Min length: 8

Characters and Unicode

Total characters: 25607
Distinct characters: 18
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
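
The category names used in the character statistics for this variable correspond to Unicode general categories. The sketch below uses Python's standard `unicodedata` module to show why the ring-like mark ˚ (U+02DA) and the degree sign ° (U+00B0) are counted under different categories. The `CATEGORY_LABELS` mapping and the `normalize_degree` helper are illustrative assumptions, not part of the report or of ydata-profiling.

```python
import unicodedata
from collections import Counter

# Readable names for the general categories that occur in this column.
CATEGORY_LABELS = {
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Sk": "Modifier Symbol",
    "So": "Other Symbol",
}

def category_counts(text):
    """Count the characters of `text` per Unicode general category."""
    codes = (unicodedata.category(ch) for ch in text)
    return Counter(CATEGORY_LABELS.get(code, code) for code in codes)

def normalize_degree(text):
    """Hypothetical cleanup step: map RING ABOVE (U+02DA) to DEGREE SIGN (U+00B0)."""
    return text.replace("\u02da", "\u00b0")

sample = "N 37\u02da 04' 100"  # a 위도 value from the Duplicate rows table
counts = category_counts(sample)
print(counts)
print(unicodedata.name("\u02da"), "->", unicodedata.name("\u00b0"))
```

A cleanup step of this kind would be one way to make the coordinate strings consistent before parsing them, though the intended meaning of each numeric field is not stated in the report.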

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: N 37 04. 988
2nd row: N 37 04. 598
3rd row: N 37 04. 598
4th row: N 37 04. 598
5th row: N 37 04. 598

Value | Count | Frequency (%)
n | 2141 | 25.1%
37 | 1185 | 13.9%
37˚ | 942 | 11.0%
07 | 415 | 4.9%
05 | 358 | 4.2%
04 | 267 | 3.1%
09 | 180 | 2.1%
08 | 140 | 1.6%
5 | 124 | 1.5%
10 | 108 | 1.3%
Other values (303) | 2686 | 31.4%

Most occurring characters

Value | Count | Frequency (%)
(space) | 6403 | 25.0%
7 | 3173 | 12.4%
3 | 2696 | 10.5%
0 | 2424 | 9.5%
N | 2143 | 8.4%
5 | 1245 | 4.9%
' | 1114 | 4.4%
˚ | 950 | 3.7%
4 | 929 | 3.6%
9 | 849 | 3.3%
Other values (8) | 3681 | 14.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 13845 | 54.1%
Space Separator | 6403 | 25.0%
Other Punctuation | 2258 | 8.8%
Uppercase Letter | 2143 | 8.4%
Modifier Symbol | 950 | 3.7%
Other Symbol | 8 | < 0.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
7 | 3173 | 22.9%
3 | 2696 | 19.5%
0 | 2424 | 17.5%
5 | 1245 | 9.0%
4 | 929 | 6.7%
9 | 849 | 6.1%
8 | 704 | 5.1%
6 | 697 | 5.0%
1 | 596 | 4.3%
2 | 532 | 3.8%

Other Punctuation
Value | Count | Frequency (%)
' | 1114 | 49.3%
" | 688 | 30.5%
. | 448 | 19.8%
(character not preserved) | 8 | 0.4%

Space Separator
Value | Count | Frequency (%)
(space) | 6403 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
N | 2143 | 100.0%

Modifier Symbol
Value | Count | Frequency (%)
˚ | 950 | 100.0%

Other Symbol
Value | Count | Frequency (%)
° | 8 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 23464 | 91.6%
Latin | 2143 | 8.4%

Most frequent character per script

Common
Value | Count | Frequency (%)
(space) | 6403 | 27.3%
7 | 3173 | 13.5%
3 | 2696 | 11.5%
0 | 2424 | 10.3%
5 | 1245 | 5.3%
' | 1114 | 4.7%
˚ | 950 | 4.0%
4 | 929 | 4.0%
9 | 849 | 3.6%
8 | 704 | 3.0%
Other values (7) | 2977 | 12.7%

Latin
Value | Count | Frequency (%)
N | 2143 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 24641 | 96.2%
Modifier Letters | 950 | 3.7%
None | 8 | < 0.1%
Punctuation | 8 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 6403 | 26.0%
7 | 3173 | 12.9%
3 | 2696 | 10.9%
0 | 2424 | 9.8%
N | 2143 | 8.7%
5 | 1245 | 5.1%
' | 1114 | 4.5%
4 | 929 | 3.8%
9 | 849 | 3.4%
8 | 704 | 2.9%
Other values (5) | 2961 | 12.0%

Modifier Letters
Value | Count | Frequency (%)
˚ | 950 | 100.0%

None
Value | Count | Frequency (%)
° | 8 | 100.0%

Punctuation
Value | Count | Frequency (%)
(character not preserved) | 8 | 100.0%

경도 (longitude)
Text

Distinct: 402
Distinct (%): 18.8%
Missing: 0
Missing (%): 0.0%
Memory size: 16.9 KiB

Length

Max length: 15
Median length: 14
Mean length: 13.028465
Min length: 9

Characters and Unicode

Total characters: 27920
Distinct characters: 18
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: E 126 34 850
2nd row: E 126 36 350
3rd row: E 126 36 050
4th row: E 126 35 750
5th row: E 126 34 850

Value | Count | Frequency (%)
e | 2143 | 25.1%
126 | 1185 | 13.9%
126˚ | 940 | 11.0%
32 | 258 | 3.0%
33 | 236 | 2.8%
26 | 190 | 2.2%
31 | 188 | 2.2%
30 | 184 | 2.2%
27 | 162 | 1.9%
34 | 160 | 1.9%
Other values (298) | 2890 | 33.9%

Most occurring characters

Value | Count | Frequency (%)
(space) | 6393 | 22.9%
2 | 3898 | 14.0%
6 | 2835 | 10.2%
1 | 2785 | 10.0%
E | 2143 | 7.7%
3 | 2011 | 7.2%
0 | 1368 | 4.9%
5 | 1084 | 3.9%
' | 1054 | 3.8%
˚ | 948 | 3.4%
Other values (8) | 3401 | 12.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 16532 | 59.2%
Space Separator | 6393 | 22.9%
Uppercase Letter | 2143 | 7.7%
Other Punctuation | 1894 | 6.8%
Modifier Symbol | 948 | 3.4%
Other Symbol | 10 | < 0.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 3898 | 23.6%
6 | 2835 | 17.1%
1 | 2785 | 16.8%
3 | 2011 | 12.2%
0 | 1368 | 8.3%
5 | 1084 | 6.6%
7 | 745 | 4.5%
4 | 701 | 4.2%
8 | 587 | 3.6%
9 | 518 | 3.1%

Other Punctuation
Value | Count | Frequency (%)
' | 1054 | 55.6%
" | 688 | 36.3%
. | 144 | 7.6%
(character not preserved) | 8 | 0.4%

Space Separator
Value | Count | Frequency (%)
(space) | 6393 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
E | 2143 | 100.0%

Modifier Symbol
Value | Count | Frequency (%)
˚ | 948 | 100.0%

Other Symbol
Value | Count | Frequency (%)
° | 10 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 25777 | 92.3%
Latin | 2143 | 7.7%

Most frequent character per script

Common
Value | Count | Frequency (%)
(space) | 6393 | 24.8%
2 | 3898 | 15.1%
6 | 2835 | 11.0%
1 | 2785 | 10.8%
3 | 2011 | 7.8%
0 | 1368 | 5.3%
5 | 1084 | 4.2%
' | 1054 | 4.1%
˚ | 948 | 3.7%
7 | 745 | 2.9%
Other values (7) | 2656 | 10.3%

Latin
Value | Count | Frequency (%)
E | 2143 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 26954 | 96.5%
Modifier Letters | 948 | 3.4%
None | 10 | < 0.1%
Punctuation | 8 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 6393 | 23.7%
2 | 3898 | 14.5%
6 | 2835 | 10.5%
1 | 2785 | 10.3%
E | 2143 | 8.0%
3 | 2011 | 7.5%
0 | 1368 | 5.1%
5 | 1084 | 4.0%
' | 1054 | 3.9%
7 | 745 | 2.8%
Other values (5) | 2638 | 9.8%

Modifier Letters
Value | Count | Frequency (%)
˚ | 948 | 100.0%

None
Value | Count | Frequency (%)
° | 10 | 100.0%

Punctuation
Value | Count | Frequency (%)
(character not preserved) | 8 | 100.0%

Interactions

[Figures: four interaction plots; images not preserved]

Correlations

[Figure: heatmap of the matrix below]

(variable) | 용도구분 | 시설해역 | 시군구분 | 시설연도 | 어초종류 | 시설면적
용도구분 | 1.000 | 0.895 | 0.188 | 0.441 | 1.000 | 0.902
시설해역 | 0.895 | 1.000 | 1.000 | 0.941 | 0.972 | 0.818
시군구분 | 0.188 | 1.000 | 1.000 | 0.251 | 0.585 | 0.117
시설연도 | 0.441 | 0.941 | 0.251 | 1.000 | 0.916 | 0.667
어초종류 | 1.000 | 0.972 | 0.585 | 0.916 | 1.000 | 0.860
시설면적 | 0.902 | 0.818 | 0.117 | 0.667 | 0.860 | 1.000

[Figure: heatmap of the matrix below]

(variable) | 어초종류 | 용도구분 | 시군구분 | 시설해역
어초종류 | 1.000 | 0.979 | 0.466 | 0.582
용도구분 | 0.979 | 1.000 | 0.120 | 0.797
시군구분 | 0.466 | 0.120 | 1.000 | 0.985
시설해역 | 0.582 | 0.797 | 0.985 | 1.000

[Figure: heatmap of the matrix below]

(variable) | 시설연도 | 시설면적 | 용도구분 | 시설해역 | 시군구분 | 어초종류
시설연도 | 1.000 | -0.453 | 0.446 | 0.680 | 0.269 | 0.604
시설면적 | -0.453 | 1.000 | 0.966 | 0.461 | 0.128 | 0.525
용도구분 | 0.446 | 0.966 | 1.000 | 0.797 | 0.120 | 0.979
시설해역 | 0.680 | 0.461 | 0.797 | 1.000 | 0.985 | 0.582
시군구분 | 0.269 | 0.128 | 0.120 | 0.985 | 1.000 | 0.466
어초종류 | 0.604 | 0.525 | 0.979 | 0.582 | 0.466 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

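First rows

(index) | 용도구분 | 시설해역 | 시군구분 | 시설연도 | 어초종류 | 시설면적 | 위도 | 경도
0 | 어류용 | 국화도(동)해역 | 화성시 | 2005 | 사각전주 | 16 | N 37 04. 988 | E 126 34 850
1 | 어류용 | 국화도(동)해역 | 화성시 | 2005 | 사각전주 | 16 | N 37 04. 598 | E 126 36 350
2 | 어류용 | 국화도(동)해역 | 화성시 | 2005 | 사각전주 | 16 | N 37 04. 598 | E 126 36 050
3 | 어류용 | 국화도(동)해역 | 화성시 | 2005 | 사각전주 | 16 | N 37 04. 598 | E 126 35 750
4 | 어류용 | 국화도(동)해역 | 화성시 | 2005 | 사각전주 | 16 | N 37 04. 598 | E 126 34 850
5 | 어류용 | 국화도(동)해역 | 화성시 | 2005 | 사각전주 | 16 | N 37 04. 598 | E 126 34 550
6 | 어류용 | 국화도(동)해역 | 화성시 | 2005 | 사각전주 | 16 | N 37 04. 598 | E 126 34 250
7 | 어류용 | 국화도(동)해역 | 화성시 | 2005 | 사각전주 | 16 | N 37 04. 413 | E 126 34 240
8 | 어류용 | 국화도(동)해역 | 화성시 | 2005 | 사각전주 | 16 | N 37 04. 413 | E 126 34 550
9 | 어류용 | 국화도(동)해역 | 화성시 | 2005 | 사각전주 | 16 | N 37 04. 413 | E 126 34 850

Last rows

(index) | 용도구분 | 시설해역 | 시군구분 | 시설연도 | 어초종류 | 시설면적 | 위도 | 경도
2133 | 패조류용 | 학산서해역 | 화성시 | 2011 | 정삼각뿔형 | 4 | N 37 7 268 | E 126 31 651
2134 | 패조류용 | 학산서해역 | 화성시 | 2011 | 정삼각뿔형 | 4 | N 37 7 311 | E 126 31 784
2135 | 패조류용 | 학산서해역 | 화성시 | 2011 | 정삼각뿔형 | 4 | N 37 7 414 | E 126 31 741
2136 | 패조류용 | 학산서해역 | 화성시 | 2011 | 정삼각뿔형 | 4 | N 37 7 363 | E 126 31 908
2137 | 패조류용 | 학산서해역 | 화성시 | 2011 | 정삼각뿔형 | 4 | N 37 7 18 | E 126 23 85
2138 | 패조류용 | 학산서해역 | 화성시 | 2011 | 정삼각뿔형 | 4 | N 37 7 203 | E 126 31 781
2139 | 패조류용 | 학산서해역 | 화성시 | 2011 | 정삼각뿔형 | 4 | N 37 7 268 | E 126 31 651
2140 | 패조류용 | 학산서해역 | 화성시 | 2011 | 정삼각뿔형 | 4 | N 37 7 311 | E 126 31 784
2141 | 패조류용 | 학산서해역 | 화성시 | 2011 | 정삼각뿔형 | 4 | N 37 7 414 | E 126 31 741
2142 | 패조류용 | 학산서해역 | 화성시 | 2011 | 정삼각뿔형 | 4 | N 37 7 203 | E 126 31 781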

Duplicate rows

Most frequently occurring

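(index) | 용도구분 | 시설해역 | 시군구분 | 시설연도 | 어초종류 | 시설면적 | 위도 | 경도 | # duplicates
354 | 패조류용 | 국화도해역 | 화성시 | 2004 | 신요철형 | 4 | N 37˚ 04' 100 | E 126˚ 33' 800 | 8
377 | 패조류용 | 국화도해역 | 화성시 | 2007 | 아치형 | 4 | N 37˚ 03. 790 | E 126˚ 32. 803 | 8
0 | 어류용 | 국화도(동)해역 | 화성시 | 2005 | 사각전주 | 16 | N 37 04. 413 | E 126 34 240 | 4
1 | 어류용 | 국화도(동)해역 | 화성시 | 2005 | 사각전주 | 16 | N 37 04. 413 | E 126 34 550 | 4
2 | 어류용 | 국화도(동)해역 | 화성시 | 2005 | 사각전주 | 16 | N 37 04. 413 | E 126 34 850 | 4
3 | 어류용 | 국화도(동)해역 | 화성시 | 2005 | 사각전주 | 16 | N 37 04. 598 | E 126 34 250 | 4
4 | 어류용 | 국화도(동)해역 | 화성시 | 2005 | 사각전주 | 16 | N 37 04. 598 | E 126 34 550 | 4
5 | 어류용 | 국화도(동)해역 | 화성시 | 2005 | 사각전주 | 16 | N 37 04. 598 | E 126 34 850 | 4
6 | 어류용 | 국화도(동)해역 | 화성시 | 2005 | 사각전주 | 16 | N 37 04. 598 | E 126 35 750 | 4
7 | 어류용 | 국화도(동)해역 | 화성시 | 2005 | 사각전주 | 16 | N 37 04. 598 | E 126 36 050 | 4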
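
Duplicate rows can be counted in more than one way, and the report does not state which count its figure of 540 uses. As a minimal sketch with Python's standard library, the example below groups the last ten rows shown in the Sample section and reports three candidate counts. Only the 위도 and 경도 values are used, since the other six columns are identical across those rows; the variable names are illustrative.

```python
from collections import Counter

# (위도, 경도) pairs of rows 2133-2142 from the Sample section.
rows = [
    ("N 37 7 268", "E 126 31 651"),
    ("N 37 7 311", "E 126 31 784"),
    ("N 37 7 414", "E 126 31 741"),
    ("N 37 7 363", "E 126 31 908"),
    ("N 37 7 18", "E 126 23 85"),
    ("N 37 7 203", "E 126 31 781"),
    ("N 37 7 268", "E 126 31 651"),
    ("N 37 7 311", "E 126 31 784"),
    ("N 37 7 414", "E 126 31 741"),
    ("N 37 7 203", "E 126 31 781"),
]

occurrences = Counter(rows)
duplicated = {row: n for row, n in occurrences.items() if n > 1}

distinct_duplicated = len(duplicated)                    # distinct rows seen more than once
extra_copies = sum(n - 1 for n in duplicated.values())   # repeats beyond the first copy
rows_involved = sum(duplicated.values())                 # all rows that have a twin

print(distinct_duplicated, extra_copies, rows_involved)
```

Which of these counts is appropriate depends on whether the goal is to list the repeated records or to estimate how many rows deduplication would remove.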