Overview

Dataset statistics

Number of variables: 4
Number of observations: 87
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1
Duplicate rows (%): 1.1%
Total size in memory: 3.1 KiB
Average record size in memory: 36.5 B
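The two derived figures above (duplicate-row percentage and average record size) appear to follow directly from the raw counts. The sketch below recomputes them from the reported values; the variable names are illustrative and are not part of the report.

```python
# Values copied from the dataset statistics above.
n_rows = 87
n_duplicate_rows = 1
total_size_kib = 3.1

# Share of rows flagged as duplicates, in percent.
duplicate_pct = round(100 * n_duplicate_rows / n_rows, 1)

# Average bytes per record: total size (KiB -> bytes) divided by row count.
avg_record_bytes = round(total_size_kib * 1024 / n_rows, 1)

print(f"Duplicate rows (%): {duplicate_pct}")
print(f"Average record size in memory: {avg_record_bytes} B")
```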

Variable types

Text: 1
Numeric: 3

Dataset

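Description: The coastline is the boundary that separates sea from land and is the basic spatial information defining the shape of Korea's national territory. The Korea Hydrographic and Oceanographic Agency (국립해양조사원) provides up-to-date coastline spatial information and length information through its coastline surveys.
URL: https://www.data.go.kr/data/15120666/fileData.do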

Alerts

Dataset has 1 (1.1%) duplicate row (Duplicates)
총계 (total) is highly overall correlated with 자연해안선 (natural coastline) and 1 other field (High correlation)
자연해안선 (natural coastline) is highly overall correlated with 총계 (total) and 1 other field (High correlation)
인공해안선 (artificial coastline) is highly overall correlated with 총계 (total) and 1 other field (High correlation)
자연해안선 (natural coastline) has 7 (8.0%) zeros (Zeros)

Reproduction

Analysis started: 2023-12-12 08:54:13.638193
Analysis finished: 2023-12-12 08:54:14.697277
Duration: 1.06 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
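The reported duration is consistent with the difference between the two timestamps. A minimal check using only the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section above.
started = datetime.fromisoformat("2023-12-12 08:54:13.638193")
finished = datetime.fromisoformat("2023-12-12 08:54:14.697277")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```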

Variables

시군구
Text

Distinct: 80
Distinct (%): 92.0%
Missing: 0
Missing (%): 0.0%
Memory size: 828.0 B

Length

Max length: 5
Median length: 3
Mean length: 3.3218391
Min length: 2
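As a rough illustration of how these length statistics are formed, the sketch below measures the character length of the first five values of this variable listed in this report (전국, 부산광역시, 중 구, 서 구, 동 구). It covers only those five values, not all 87 rows, so its mean differs from the reported one. The last line checks that the reported mean length matches the report's total character count (289) divided by the number of observations (87).

```python
from statistics import mean, median

# First five values of the text variable, as listed in this report.
sample_values = ["전국", "부산광역시", "중 구", "서 구", "동 구"]

# Length in Unicode code points; the embedded space counts as a character.
lengths = [len(value) for value in sample_values]

summary = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": mean(lengths),
    "min": min(lengths),
}
print(summary)

# Mean length over the full column, from the reported totals.
reported_mean_length = 289 / 87
print(round(reported_mean_length, 7))
```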

Characters and Unicode

Total characters: 289
Distinct characters: 76
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
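To make the character properties concrete, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It uses five values of this variable listed in this report, so the counts cover only that subset. The standard library does not expose script or block properties directly, so the script is approximated here from the character name, which is an assumption of this sketch and not how the report computes it.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories (e.g. 'Lo', 'Zs') over all characters."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

def looks_like_hangul(ch):
    """Approximate script detection from the Unicode character name."""
    return unicodedata.name(ch, "").startswith("HANGUL")

sample_values = ["전국", "부산광역시", "중 구", "서 구", "동 구"]
counts = category_counts(sample_values)

# 'Lo' is Other Letter and 'Zs' is Space Separator, the two categories in the report.
print(counts)
print(looks_like_hangul("전"), looks_like_hangul(" "))
```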

Unique

Unique: 74
Unique (%): 85.1%
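The report distinguishes distinct values (how many different values occur) from unique values (how many values occur exactly once). A minimal sketch of that distinction, using ten 시군구 values taken from the last rows of this dataset as listed in this report; the counts apply to that subset only.

```python
from collections import Counter

# 시군구 values from the last ten rows of the dataset.
names = ["사천시", "거제시", "고성군", "남해군", "하동군",
         "제주도", "제주시", "서귀포시", "기 타", "기 타"]

occurrences = Counter(names)

# Distinct: number of different values.
distinct = len(occurrences)

# Unique: number of values that appear exactly once.
unique = sum(1 for count in occurrences.values() if count == 1)

print(distinct, unique)
```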

Sample

1st row: 전국
2nd row: 부산광역시
3rd row: 중 구
4th row: 서 구
5th row: 동 구
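Value | Count | Frequency (%)
(not captured) | 10 | 10.1%
(not captured) | 3 | 3.0%
(not captured) | 2 | 2.0%
(not captured) | 2 | 2.0%
(not captured) | 2 | 2.0%
고성군 | 2 | 2.0%
(not captured) | 2 | 2.0%
(not captured) | 2 | 2.0%
광양시 | 1 | 1.0%
장흥군 | 1 | 1.0%
Other values (72) | 72 | 72.7%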

Most occurring characters

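Value | Count | Frequency (%)
(not captured) | 31 | 10.7%
(not captured) | 30 | 10.4%
(space) | 28 | 9.7%
(not captured) | 17 | 5.9%
(not captured) | 11 | 3.8%
(not captured) | 8 | 2.8%
(not captured) | 6 | 2.1%
(not captured) | 6 | 2.1%
(not captured) | 6 | 2.1%
(not captured) | 6 | 2.1%
Other values (66) | 140 | 48.4%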

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 261 | 90.3%
Space Separator | 28 | 9.7%

Most frequent character per category

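Other Letter
Value | Count | Frequency (%)
(not captured) | 31 | 11.9%
(not captured) | 30 | 11.5%
(not captured) | 17 | 6.5%
(not captured) | 11 | 4.2%
(not captured) | 8 | 3.1%
(not captured) | 6 | 2.3%
(not captured) | 6 | 2.3%
(not captured) | 6 | 2.3%
(not captured) | 6 | 2.3%
(not captured) | 5 | 1.9%
Other values (65) | 135 | 51.7%

Space Separator
Value | Count | Frequency (%)
(space) | 28 | 100.0%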

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 261 | 90.3%
Common | 28 | 9.7%

Most frequent character per script

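Hangul
Value | Count | Frequency (%)
(not captured) | 31 | 11.9%
(not captured) | 30 | 11.5%
(not captured) | 17 | 6.5%
(not captured) | 11 | 4.2%
(not captured) | 8 | 3.1%
(not captured) | 6 | 2.3%
(not captured) | 6 | 2.3%
(not captured) | 6 | 2.3%
(not captured) | 6 | 2.3%
(not captured) | 5 | 1.9%
Other values (65) | 135 | 51.7%

Common
Value | Count | Frequency (%)
(space) | 28 | 100.0%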

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 261 | 90.3%
ASCII | 28 | 9.7%

Most frequent character per block

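Hangul
Value | Count | Frequency (%)
(not captured) | 31 | 11.9%
(not captured) | 30 | 11.5%
(not captured) | 17 | 6.5%
(not captured) | 11 | 4.2%
(not captured) | 8 | 3.1%
(not captured) | 6 | 2.3%
(not captured) | 6 | 2.3%
(not captured) | 6 | 2.3%
(not captured) | 6 | 2.3%
(not captured) | 5 | 1.9%
Other values (65) | 135 | 51.7%

ASCII
Value | Count | Frequency (%)
(space) | 28 | 100.0%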

총계
Real number (ℝ)

HIGH CORRELATION 

Distinct: 86
Distinct (%): 98.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 527.08379
Minimum: 5.71
Maximum: 15285.43
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 915.0 B

Quantile statistics

Minimum: 5.71
5th percentile: 8.302
Q1: 43.515
Median: 110.28
Q3: 354.99
95th percentile: 1169.323
Maximum: 15285.43
Range: 15279.72
Interquartile range (IQR): 311.475

Descriptive statistics

Standard deviation: 1797.21
Coefficient of variation (CV): 3.4097236
Kurtosis: 55.563167
Mean: 527.08379
Median Absolute Deviation (MAD): 93.06
Skewness: 7.1278384
Sum: 45856.29
Variance: 3229963.9
Monotonicity: Not monotonic
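Several of the statistics reported for 총계 can be cross-checked against one another using their standard definitions. The sketch below recomputes the mean, range, IQR, coefficient of variation, and variance from the other reported values. The observation count of 87 comes from the dataset overview. Because the inputs are already rounded, the results agree with the report only approximately.

```python
import math

# Reported values for 총계, copied from this report.
n = 87
total_sum = 45856.29
minimum, maximum = 5.71, 15285.43
q1, q3 = 43.515, 354.99
std = 1797.21

mean = total_sum / n            # arithmetic mean
value_range = maximum - minimum # range
iqr = q3 - q1                   # interquartile range
cv = std / mean                 # coefficient of variation
variance = std ** 2             # variance as the square of the standard deviation

print(round(mean, 5), round(value_range, 2), round(iqr, 3))
print(round(cv, 5), round(variance, 1))
```

A coefficient of variation well above 1, as here, is consistent with the strong right skew that the report shows for this variable.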
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
664.83 | 2 | 2.3%
15285.43 | 1 | 1.1%
74.54 | 1 | 1.1%
24.48 | 1 | 1.1%
232.93 | 1 | 1.1%
23.78 | 1 | 1.1%
325.29 | 1 | 1.1%
77.86 | 1 | 1.1%
112.92 | 1 | 1.1%
110.24 | 1 | 1.1%
Other values (76) | 76 | 87.4%

Minimum 10 values
Value | Count | Frequency (%)
5.71 | 1 | 1.1%
5.84 | 1 | 1.1%
7.22 | 1 | 1.1%
7.52 | 1 | 1.1%
7.63 | 1 | 1.1%
9.87 | 1 | 1.1%
10.88 | 1 | 1.1%
16.56 | 1 | 1.1%
17.22 | 1 | 1.1%
18.1 | 1 | 1.1%

Maximum 10 values
Value | Count | Frequency (%)
15285.43 | 1 | 1.1%
6880.75 | 1 | 1.1%
2478.77 | 1 | 1.1%
1958.38 | 1 | 1.1%
1209.22 | 1 | 1.1%
1076.23 | 1 | 1.1%
1069.81 | 1 | 1.1%
1022.45 | 1 | 1.1%
776.0 | 1 | 1.1%
750.96 | 1 | 1.1%

자연해안선
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 79
Distinct (%): 90.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 335.52103
Minimum: 0
Maximum: 9730.11
Zeros: 7
Zeros (%): 8.0%
Negative: 0
Negative (%): 0.0%
Memory size: 915.0 B

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 12.72
Median: 55.39
Q3: 197.2
95th percentile: 735.82
Maximum: 9730.11
Range: 9730.11
Interquartile range (IQR): 184.48

Descriptive statistics

Standard deviation: 1159.4634
Coefficient of variation (CV): 3.4557102
Kurtosis: 53.016835
Mean: 335.52103
Median Absolute Deviation (MAD): 52.05
Skewness: 6.9494472
Sum: 29190.33
Variance: 1344355.5
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
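As a rough sketch of what fixed-size binning means, the function below splits the value range into equal-width bins and counts how many values fall into each. It is a generic illustration and not necessarily how the profiling software builds its histogram. The input is limited to the 27 smallest and largest 자연해안선 values listed in this report, so the counts do not describe the full column.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum would index one past the end, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts, width

# Subset of 자연해안선 values: the seven zeros, the next smallest values,
# and the largest values listed in this report.
values = [0.0] * 7 + [0.4, 0.87, 1.33, 3.34, 5.34, 5.47, 6.07, 6.4, 6.61,
                      578.38, 615.39, 615.39, 647.9, 700.6, 706.77, 748.27,
                      1430.45, 1493.83, 4635.76, 9730.11]

counts, width = fixed_width_histogram(values, bins=50)
print(round(width, 4))
print(counts[0], counts[-1])
```

With a single very large value stretching the range, most observations land in the first few bins, which is why a fixed-width histogram of a heavily skewed variable shows one tall bar near zero.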
Common values
Value | Count | Frequency (%)
0.0 | 7 | 8.0%
615.39 | 2 | 2.3%
12.72 | 2 | 2.3%
9730.11 | 1 | 1.1%
36.89 | 1 | 1.1%
706.77 | 1 | 1.1%
152.69 | 1 | 1.1%
6.4 | 1 | 1.1%
124.19 | 1 | 1.1%
3.34 | 1 | 1.1%
Other values (69) | 69 | 79.3%

Minimum 10 values
Value | Count | Frequency (%)
0.0 | 7 | 8.0%
0.4 | 1 | 1.1%
0.87 | 1 | 1.1%
1.33 | 1 | 1.1%
3.34 | 1 | 1.1%
5.34 | 1 | 1.1%
5.47 | 1 | 1.1%
6.07 | 1 | 1.1%
6.4 | 1 | 1.1%
6.61 | 1 | 1.1%

Maximum 10 values
Value | Count | Frequency (%)
9730.11 | 1 | 1.1%
4635.76 | 1 | 1.1%
1493.83 | 1 | 1.1%
1430.45 | 1 | 1.1%
748.27 | 1 | 1.1%
706.77 | 1 | 1.1%
700.6 | 1 | 1.1%
647.9 | 1 | 1.1%
615.39 | 2 | 2.3%
578.38 | 1 | 1.1%

인공해안선
Real number (ℝ)

HIGH CORRELATION 

Distinct: 85
Distinct (%): 97.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 191.56276
Minimum: 5.31
Maximum: 5555.32
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 915.0 B

Quantile statistics

Minimum: 5.31
5th percentile: 7.553
Q1: 23.74
Median: 55.12
Q3: 142.435
95th percentile: 451.164
Maximum: 5555.32
Range: 5550.01
Interquartile range (IQR): 118.695

Descriptive statistics

Standard deviation: 641.76587
Coefficient of variation (CV): 3.3501599
Kurtosis: 59.080734
Mean: 191.56276
Median Absolute Deviation (MAD): 39.69
Skewness: 7.3589818
Sum: 16665.96
Variance: 411863.44
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
49.44 | 2 | 2.3%
46.0 | 2 | 2.3%
5555.32 | 1 | 1.1%
240.75 | 1 | 1.1%
18.08 | 1 | 1.1%
108.74 | 1 | 1.1%
20.44 | 1 | 1.1%
130.5 | 1 | 1.1%
59.42 | 1 | 1.1%
76.03 | 1 | 1.1%
Other values (75) | 75 | 86.2%

Minimum 10 values
Value | Count | Frequency (%)
5.31 | 1 | 1.1%
5.84 | 1 | 1.1%
5.89 | 1 | 1.1%
6.37 | 1 | 1.1%
7.52 | 1 | 1.1%
7.63 | 1 | 1.1%
9.02 | 1 | 1.1%
9.87 | 1 | 1.1%
10.49 | 1 | 1.1%
10.88 | 1 | 1.1%

Maximum 10 values
Value | Count | Frequency (%)
5555.32 | 1 | 1.1%
2244.99 | 1 | 1.1%
1048.32 | 1 | 1.1%
464.55 | 1 | 1.1%
460.95 | 1 | 1.1%
428.33 | 1 | 1.1%
363.04 | 1 | 1.1%
321.85 | 1 | 1.1%
246.16 | 1 | 1.1%
240.75 | 1 | 1.1%

Interactions

[Figures: nine interaction plots, one for each pairing of the numeric variables 총계, 자연해안선, and 인공해안선]

Correlations

Variable | 시군구 | 총계 | 자연해안선 | 인공해안선
시군구 | 1.000 | 1.000 | 1.000 | 1.000
총계 | 1.000 | 1.000 | 1.000 | 0.996
자연해안선 | 1.000 | 1.000 | 1.000 | 0.996
인공해안선 | 1.000 | 0.996 | 0.996 | 1.000

Variable | 총계 | 자연해안선 | 인공해안선
총계 | 1.000 | 0.975 | 0.931
자연해안선 | 0.975 | 1.000 | 0.853
인공해안선 | 0.931 | 0.853 | 1.000
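The report does not say which coefficient each matrix uses. As one common choice, the sketch below computes a Pearson correlation between 총계 and 자연해안선 with the standard library only. Treating the matrix as Pearson is an assumption of this sketch. The input is the last ten rows of the dataset as listed in this report, so the result is not expected to match the matrix values, which are computed over all 87 rows.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

# 총계 and 자연해안선 from the last ten rows of the dataset.
total = [204.51, 514.22, 216.39, 375.42, 79.17,
         571.92, 334.56, 237.36, 664.83, 664.83]
natural = [107.88, 313.03, 97.89, 215.0, 37.52,
           348.65, 191.99, 156.66, 615.39, 615.39]

r = pearson(total, natural)
print(round(r, 3))
```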

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

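First rows
Index | 시군구 | 총계 | 자연해안선 | 인공해안선
0 | 전국 | 15285.43 | 9730.11 | 5555.32
1 | 부산광역시 | 403.94 | 198.24 | 205.7
2 | 중 구 | 5.84 | 0.0 | 5.84
3 | 서 구 | 16.56 | 6.07 | 10.49
4 | 동 구 | 7.52 | 0.0 | 7.52
5 | 영도구 | 41.84 | 16.81 | 25.03
6 | 남 구 | 34.49 | 12.72 | 21.77
7 | 해운대구 | 17.22 | 8.2 | 9.02
8 | 사하구 | 81.69 | 47.45 | 34.24
9 | 강서구 | 125.95 | 70.08 | 55.87
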
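Last rows
Index | 시군구 | 총계 | 자연해안선 | 인공해안선
77 | 사천시 | 204.51 | 107.88 | 96.63
78 | 거제시 | 514.22 | 313.03 | 201.19
79 | 고성군 | 216.39 | 97.89 | 118.5
80 | 남해군 | 375.42 | 215.0 | 160.42
81 | 하동군 | 79.17 | 37.52 | 41.65
82 | 제주도 | 571.92 | 348.65 | 223.27
83 | 제주시 | 334.56 | 191.99 | 142.57
84 | 서귀포시 | 237.36 | 156.66 | 80.7
85 | 기 타 | 664.83 | 615.39 | 49.44
86 | 기 타 | 664.83 | 615.39 | 49.44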
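In the sample rows, 총계 appears to equal 자연해안선 plus 인공해안선, which would help explain the high-correlation alerts. The report itself does not state this relationship, so the sketch below only checks it on the first ten rows of the dataset; the tolerance of 0.005 is an assumption chosen to absorb two-decimal rounding.

```python
import math

# (시군구, 총계, 자연해안선, 인공해안선) for the first ten rows of the dataset.
rows = [
    ("전국", 15285.43, 9730.11, 5555.32),
    ("부산광역시", 403.94, 198.24, 205.7),
    ("중 구", 5.84, 0.0, 5.84),
    ("서 구", 16.56, 6.07, 10.49),
    ("동 구", 7.52, 0.0, 7.52),
    ("영도구", 41.84, 16.81, 25.03),
    ("남 구", 34.49, 12.72, 21.77),
    ("해운대구", 17.22, 8.2, 9.02),
    ("사하구", 81.69, 47.45, 34.24),
    ("강서구", 125.95, 70.08, 55.87),
]

def is_consistent(total, natural, artificial, tol=0.005):
    """True when total matches natural + artificial within the tolerance."""
    return math.isclose(total, natural + artificial, abs_tol=tol)

# Names of rows where the total does not match the sum of its parts.
inconsistent = [name for name, total, natural, artificial in rows
                if not is_consistent(total, natural, artificial)]
print(inconsistent)
```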

Duplicate rows

Most frequently occurring

Index | 시군구 | 총계 | 자연해안선 | 인공해안선 | # duplicates
0 | 기 타 | 664.83 | 615.39 | 49.44 | 2
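A duplicate row is one whose values match another row in every column. The sketch below finds such rows by counting whole-row tuples, using rows 84 to 86 of the dataset as listed in this report. It is a plain-Python illustration and not the profiling software's own implementation.

```python
from collections import Counter

# (시군구, 총계, 자연해안선, 인공해안선) for rows 84 to 86 of the dataset.
rows = [
    ("서귀포시", 237.36, 156.66, 80.7),
    ("기 타", 664.83, 615.39, 49.44),
    ("기 타", 664.83, 615.39, 49.44),
]

# Count identical rows; tuples are hashable, so whole rows can be keys.
row_counts = Counter(rows)

# Keep only rows that occur more than once, with their occurrence count.
duplicates = {row: count for row, count in row_counts.items() if count > 1}
print(duplicates)
```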