Overview

Dataset statistics

Number of variables: 5
Number of observations: 39
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.7 KiB
Average record size in memory: 45.4 B

Variable types

Categorical: 1
Text: 2
Numeric: 2

Dataset

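Description: Potential pollution source data for Seogwipo-si, Jeju Special Self-Governing Province. It provides the potential pollution source category (구분), source name (오염원명), location (소재지), latitude (위도), and longitude (경도).
URL: https://www.data.go.kr/data/15114118/fileData.do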

Alerts

구분 has constant value "저류지" (Constant)
위도 is highly overall correlated with 경도 (High correlation)
경도 is highly overall correlated with 위도 (High correlation)
오염원명 has unique values (Unique)
소재지 has unique values (Unique)
위도 has unique values (Unique)
경도 has unique values (Unique)
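
The Constant and Unique alerts above can be read as simple checks on the number of distinct values in a column. The sketch below illustrates that idea on three rows copied from this report's sample. The helper name `classify_column` and the rule it applies are assumptions for illustration, not ydata-profiling's actual implementation.

```python
def classify_column(values):
    """Return 'Constant', 'Unique', or None from the distinct-value count.

    Hypothetical helper: an illustrative rule, not ydata-profiling's own code.
    """
    distinct = len(set(values))
    if distinct == 1:
        return "Constant"  # every row holds the same value
    if distinct == len(values):
        return "Unique"    # no value repeats
    return None


# Three rows copied from the report's sample table (소재지 omitted for brevity).
rows = [
    ("저류지", "영어도시", 33.309187, 126.325292),
    ("저류지", "하천1호", 33.344718, 126.82868),
    ("저류지", "난산1호", 33.396489, 126.85989),
]
columns = ["구분", "오염원명", "위도", "경도"]

alerts = {
    name: classify_column([row[i] for row in rows])
    for i, name in enumerate(columns)
}
print(alerts)
```

On the full 39-row dataset the same rule would flag 구분 as constant and the other four columns as unique, which agrees with the alerts listed in this report.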

Reproduction

Analysis started: 2023-12-12 13:55:21.681619
Analysis finished: 2023-12-12 13:55:22.644974
Duration: 0.96 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

구분
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 444.0 B
저류지
39 

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 저류지
2nd row: 저류지
3rd row: 저류지
4th row: 저류지
5th row: 저류지

Common Values

Value | Count | Frequency (%)
저류지 | 39 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
저류지 | 39 | 100.0%

오염원명
Text

UNIQUE 

Distinct: 39
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 444.0 B

Length

Max length: 17
Median length: 4
Mean length: 4.3333333
Min length: 4

Characters and Unicode

Total characters: 169
Distinct characters: 44
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
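
As a rough illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies the categories for one value from this report (the 2nd sample row of 오염원명). It only approximates how such counts could be produced and is not the profiler's own code.

```python
import unicodedata
from collections import Counter

# Second sample row of 오염원명, copied from the report.
name = "지방도1132호선(일주도로)-2"

# General category per character: Lo = Other Letter, Nd = Decimal Number,
# Ps / Pe = Open / Close Punctuation, Pd = Dash Punctuation.
categories = Counter(unicodedata.category(ch) for ch in name)

print(len(name))         # number of characters in the value
print(dict(categories))  # category code -> character count
```

This one value contains all three punctuation categories, which is consistent with the report counting each of them exactly once across the whole column.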

Unique

Unique: 39
Unique (%): 100.0%

Sample

1st row: 영어도시
2nd row: 지방도1132호선(일주도로)-2
3rd row: 하천1호
4th row: 난산1호
5th row: 난산2호
Value | Count | Frequency (%)
영어도시 | 1 | 2.6%
상모4호 | 1 | 2.6%
영락2호 | 1 | 2.6%
보성1호 | 1 | 2.6%
동일1호 | 1 | 2.6%
동일2호 | 1 | 2.6%
동일3호 | 1 | 2.6%
일과1호 | 1 | 2.6%
신평1호 | 1 | 2.6%
영락1호 | 1 | 2.6%
Other values (29) | 29 | 74.4%

Most occurring characters

ValueCountFrequency (%)
38
22.5%
1 20
 
11.8%
2 11
 
6.5%
3 7
 
4.1%
6
 
3.6%
6
 
3.6%
5
 
3.0%
5
 
3.0%
5
 
3.0%
5
 
3.0%
Other values (34) 61
36.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 124 | 73.4%
Decimal Number | 42 | 24.9%
Open Punctuation | 1 | 0.6%
Close Punctuation | 1 | 0.6%
Dash Punctuation | 1 | 0.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
38
30.6%
6
 
4.8%
6
 
4.8%
5
 
4.0%
5
 
4.0%
5
 
4.0%
5
 
4.0%
5
 
4.0%
4
 
3.2%
4
 
3.2%
Other values (26) 41
33.1%
Decimal Number
ValueCountFrequency (%)
1 20
47.6%
2 11
26.2%
3 7
 
16.7%
4 3
 
7.1%
5 1
 
2.4%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 124 | 73.4%
Common | 45 | 26.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
38
30.6%
6
 
4.8%
6
 
4.8%
5
 
4.0%
5
 
4.0%
5
 
4.0%
5
 
4.0%
5
 
4.0%
4
 
3.2%
4
 
3.2%
Other values (26) 41
33.1%
Common
ValueCountFrequency (%)
1 20
44.4%
2 11
24.4%
3 7
 
15.6%
4 3
 
6.7%
( 1
 
2.2%
) 1
 
2.2%
5 1
 
2.2%
- 1
 
2.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 124 | 73.4%
ASCII | 45 | 26.6%

Most frequent character per block

Hangul
ValueCountFrequency (%)
38
30.6%
6
 
4.8%
6
 
4.8%
5
 
4.0%
5
 
4.0%
5
 
4.0%
5
 
4.0%
5
 
4.0%
4
 
3.2%
4
 
3.2%
Other values (26) 41
33.1%
ASCII
ValueCountFrequency (%)
1 20
44.4%
2 11
24.4%
3 7
 
15.6%
4 3
 
6.7%
( 1
 
2.2%
) 1
 
2.2%
5 1
 
2.2%
- 1
 
2.2%

소재지
Text

UNIQUE 

Distinct: 39
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 444.0 B

Length

Max length: 27
Median length: 26
Mean length: 25.282051
Min length: 23

Characters and Unicode

Total characters: 986
Distinct characters: 60
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 39
Unique (%): 100.0%

Sample

1st row: 제주특별자치도 서귀포시 안덕면 동광리 산92-4
2nd row: 제주특별자치도 서귀포시 성산읍 온평리 265-1
3rd row: 제주특별자치도 서귀포시 표선면 하천리 1126
4th row: 제주특별자치도 서귀포시 성산읍 난산리 1930-1
5th row: 제주특별자치도 서귀포시 성산읍 신산리 723-2
Value | Count | Frequency (%)
제주특별자치도 | 39 | 20.1%
서귀포시 | 39 | 20.1%
대정읍 | 16 | 8.2%
성산읍 | 13 | 6.7%
삼달리 | 5 | 2.6%
상모리 | 4 | 2.1%
표선면 | 4 | 2.1%
신평리 | 3 | 1.5%
난산리 | 3 | 1.5%
남원읍 | 3 | 1.5%
Other values (57) | 65 | 33.5%

Most occurring characters

ValueCountFrequency (%)
156
 
15.8%
39
 
4.0%
39
 
4.0%
39
 
4.0%
39
 
4.0%
39
 
4.0%
39
 
4.0%
39
 
4.0%
39
 
4.0%
39
 
4.0%
Other values (50) 479
48.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 662 | 67.1%
Space Separator | 156 | 15.8%
Decimal Number | 150 | 15.2%
Dash Punctuation | 18 | 1.8%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
39
 
5.9%
39
 
5.9%
39
 
5.9%
39
 
5.9%
39
 
5.9%
39
 
5.9%
39
 
5.9%
39
 
5.9%
39
 
5.9%
39
 
5.9%
Other values (38) 272
41.1%
Decimal Number
ValueCountFrequency (%)
1 33
22.0%
2 26
17.3%
8 16
10.7%
6 15
10.0%
5 14
9.3%
0 11
 
7.3%
7 9
 
6.0%
4 9
 
6.0%
3 9
 
6.0%
9 8
 
5.3%
Space Separator
ValueCountFrequency (%)
156
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 18
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 662 | 67.1%
Common | 324 | 32.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
39
 
5.9%
39
 
5.9%
39
 
5.9%
39
 
5.9%
39
 
5.9%
39
 
5.9%
39
 
5.9%
39
 
5.9%
39
 
5.9%
39
 
5.9%
Other values (38) 272
41.1%
Common
ValueCountFrequency (%)
156
48.1%
1 33
 
10.2%
2 26
 
8.0%
- 18
 
5.6%
8 16
 
4.9%
6 15
 
4.6%
5 14
 
4.3%
0 11
 
3.4%
7 9
 
2.8%
4 9
 
2.8%
Other values (2) 17
 
5.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 662 | 67.1%
ASCII | 324 | 32.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
156
48.1%
1 33
 
10.2%
2 26
 
8.0%
- 18
 
5.6%
8 16
 
4.9%
6 15
 
4.6%
5 14
 
4.3%
0 11
 
3.4%
7 9
 
2.8%
4 9
 
2.8%
Other values (2) 17
 
5.2%
Hangul
ValueCountFrequency (%)
39
 
5.9%
39
 
5.9%
39
 
5.9%
39
 
5.9%
39
 
5.9%
39
 
5.9%
39
 
5.9%
39
 
5.9%
39
 
5.9%
39
 
5.9%
Other values (38) 272
41.1%

위도
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 39
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 33.312899
Minimum: 33.216568
Maximum: 33.4535
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 483.0 B

Quantile statistics

Minimum: 33.216568
5-th percentile: 33.225917
Q1: 33.249169
Median: 33.29468
Q3: 33.3804
95-th percentile: 33.41244
Maximum: 33.4535
Range: 0.23693189
Interquartile range (IQR): 0.13123116

Descriptive statistics

Standard deviation: 0.070853741
Coefficient of variation (CV): 0.0021269161
Kurtosis: -1.4349608
Mean: 33.312899
Median Absolute Deviation (MAD): 0.06064146
Skewness: 0.28546524
Sum: 1299.203
Variance: 0.0050202526
Monotonicity: Not monotonic
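
Several of the 위도 figures above follow directly from one another. As a sanity check, the sketch below recomputes the range, IQR, coefficient of variation, and variance from the rounded values printed in this report. Small differences in the last digits are expected because the inputs are already rounded.

```python
# Rounded values copied from the 위도 (latitude) statistics in this report.
minimum, maximum = 33.216568, 33.4535
q1, q3 = 33.249169, 33.3804
mean, std = 33.312899, 0.070853741

value_range = maximum - minimum  # reported: 0.23693189
iqr = q3 - q1                    # reported: 0.13123116
cv = std / mean                  # reported: 0.0021269161
variance = std ** 2              # reported: 0.0050202526

for label, value in [("range", value_range), ("IQR", iqr),
                     ("CV", cv), ("variance", variance)]:
    print(f"{label}: {value:.7f}")
```

The very small CV reflects that all 39 sites lie within roughly a quarter of a degree of latitude.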
Histogram with fixed size bins (bins=39)
Value | Count | Frequency (%)
33.30918668 | 1 | 2.6%
33.42434899 | 1 | 2.6%
33.25824122 | 1 | 2.6%
33.24910543 | 1 | 2.6%
33.24923254 | 1 | 2.6%
33.24743082 | 1 | 2.6%
33.23934638 | 1 | 2.6%
33.24114687 | 1 | 2.6%
33.2624068 | 1 | 2.6%
33.267018 | 1 | 2.6%
Other values (29) | 29 | 74.4%

Value | Count | Frequency (%)
33.21656768 | 1 | 2.6%
33.22215372 | 1 | 2.6%
33.22633476 | 1 | 2.6%
33.23015351 | 1 | 2.6%
33.23403814 | 1 | 2.6%
33.23934638 | 1 | 2.6%
33.23982182 | 1 | 2.6%
33.24114687 | 1 | 2.6%
33.24743082 | 1 | 2.6%
33.24910543 | 1 | 2.6%

Value | Count | Frequency (%)
33.45349957 | 1 | 2.6%
33.42434899 | 1 | 2.6%
33.41111657 | 1 | 2.6%
33.41010242 | 1 | 2.6%
33.40022589 | 1 | 2.6%
33.39648909 | 1 | 2.6%
33.38939786 | 1 | 2.6%
33.38439403 | 1 | 2.6%
33.38404733 | 1 | 2.6%
33.38371014 | 1 | 2.6%

경도
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 39
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 126.56328
Minimum: 126.21896
Maximum: 126.9056
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 483.0 B

Quantile statistics

Minimum: 126.21896
5-th percentile: 126.23869
Q1: 126.2704
Median: 126.69826
Q3: 126.84762
95-th percentile: 126.87487
Maximum: 126.9056
Range: 0.6866362
Interquartile range (IQR): 0.5772221

Descriptive statistics

Standard deviation: 0.28400007
Coefficient of variation (CV): 0.0022439374
Kurtosis: -1.9908896
Mean: 126.56328
Median Absolute Deviation (MAD): 0.1806425
Skewness: -0.079133318
Sum: 4935.9678
Variance: 0.080656039
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=39)
Value | Count | Frequency (%)
126.3252922 | 1 | 2.6%
126.9056012 | 1 | 2.6%
126.218965 | 1 | 2.6%
126.265342 | 1 | 2.6%
126.2473885 | 1 | 2.6%
126.2391841 | 1 | 2.6%
126.2342714 | 1 | 2.6%
126.2398142 | 1 | 2.6%
126.2526952 | 1 | 2.6%
126.2574973 | 1 | 2.6%
Other values (29) | 29 | 74.4%

Value | Count | Frequency (%)
126.218965 | 1 | 2.6%
126.2342714 | 1 | 2.6%
126.2391841 | 1 | 2.6%
126.2398142 | 1 | 2.6%
126.2473885 | 1 | 2.6%
126.2526952 | 1 | 2.6%
126.2557613 | 1 | 2.6%
126.2574973 | 1 | 2.6%
126.265342 | 1 | 2.6%
126.2661913 | 1 | 2.6%

Value | Count | Frequency (%)
126.9056012 | 1 | 2.6%
126.8788994 | 1 | 2.6%
126.8744193 | 1 | 2.6%
126.8702698 | 1 | 2.6%
126.8687623 | 1 | 2.6%
126.8598902 | 1 | 2.6%
126.858167 | 1 | 2.6%
126.8521905 | 1 | 2.6%
126.8508417 | 1 | 2.6%
126.8507627 | 1 | 2.6%

Interactions


Correlations

 | 오염원명 | 소재지 | 위도 | 경도
오염원명 | 1.000 | 1.000 | 1.000 | 1.000
소재지 | 1.000 | 1.000 | 1.000 | 1.000
위도 | 1.000 | 1.000 | 1.000 | 0.723
경도 | 1.000 | 1.000 | 0.723 | 1.000

 | 위도 | 경도
위도 | 1.000 | 0.841
경도 | 0.841 | 1.000
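
To show how a coefficient like those above could be computed, here is a minimal sketch of the Pearson correlation between 위도 and 경도, using only the 20 rows visible in this report's sample section. The full dataset has 39 rows, and this text does not state which correlation measure each matrix uses, so the result is not expected to match 0.723 or 0.841. The choice of Pearson and the helper name `pearson` are assumptions for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Hypothetical helper written for this example.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# (위도, 경도) pairs copied from the first and last sample rows of this report.
points = [
    (33.309187, 126.325292), (33.424349, 126.905601), (33.344718, 126.82868),
    (33.396489, 126.85989), (33.389398, 126.878899), (33.411117, 126.868762),
    (33.410102, 126.87027), (33.370392, 126.858167), (33.384394, 126.844475),
    (33.377011, 126.836856), (33.267018, 126.257497), (33.266885, 126.255761),
    (33.38371, 126.781375), (33.333819, 126.808847), (33.272742, 126.602539),
    (33.286398, 126.702148), (33.29468, 126.698257), (33.260441, 126.266191),
    (33.400226, 126.803428), (33.254934, 126.276687),
]
latitudes = [lat for lat, _ in points]
longitudes = [lon for _, lon in points]

r = pearson(latitudes, longitudes)
print(f"Pearson r on {len(points)} visible rows: {r:.3f}")
```

The positive sign matches the High correlation alert: sites further east in this sample also tend to lie further north.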

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

 | 구분 | 오염원명 | 소재지 | 위도 | 경도
0 | 저류지 | 영어도시 | 제주특별자치도 서귀포시 안덕면 동광리 산92-4 | 33.309187 | 126.325292
1 | 저류지 | 지방도1132호선(일주도로)-2 | 제주특별자치도 서귀포시 성산읍 온평리 265-1 | 33.424349 | 126.905601
2 | 저류지 | 하천1호 | 제주특별자치도 서귀포시 표선면 하천리 1126 | 33.344718 | 126.82868
3 | 저류지 | 난산1호 | 제주특별자치도 서귀포시 성산읍 난산리 1930-1 | 33.396489 | 126.85989
4 | 저류지 | 난산2호 | 제주특별자치도 서귀포시 성산읍 신산리 723-2 | 33.389398 | 126.878899
5 | 저류지 | 난산3호 | 제주특별자치도 서귀포시 성산읍 난산리 828-12 | 33.411117 | 126.868762
6 | 저류지 | 난산4호 | 제주특별자치도 서귀포시 성산읍 난산리 831-2 | 33.410102 | 126.87027
7 | 저류지 | 삼달1호 | 제주특별자치도 서귀포시 성산읍 삼달리 226 | 33.370392 | 126.858167
8 | 저류지 | 삼달2호 | 제주특별자치도 서귀포시 성산읍 삼달리 990 | 33.384394 | 126.844475
9 | 저류지 | 삼달3호 | 제주특별자치도 서귀포시 성산읍 삼달리 1458 | 33.377011 | 126.836856

 | 구분 | 오염원명 | 소재지 | 위도 | 경도
29 | 저류지 | 신평2호 | 제주특별자치도 서귀포시 대정읍 신평리 470 | 33.267018 | 126.257497
30 | 저류지 | 신평3호 | 제주특별자치도 서귀포시 대정읍 신평리 456 | 33.266885 | 126.255761
31 | 저류지 | 성읍1호 | 제주특별자치도 서귀포시 표선면 성읍리 2285-2 | 33.38371 | 126.781375
32 | 저류지 | 표선1호 | 제주특별자치도 서귀포시 표선면 표선리 2758-2 | 33.333819 | 126.808847
33 | 저류지 | 상효1호 | 제주특별자치도 서귀포시 상효동 608-2 | 33.272742 | 126.602539
34 | 저류지 | 남원1호 | 제주특별자치도 서귀포시 남원읍 남원리 1621-2 | 33.286398 | 126.702148
35 | 저류지 | 남원2호 | 제주특별자치도 서귀포시 남원읍 남원리 1829 | 33.29468 | 126.698257
36 | 저류지 | 보성2호 | 제주특별자치도 서귀포시 대정읍 보성리 888 | 33.260441 | 126.266191
37 | 저류지 | 성읍2호 | 제주특별자치도 서귀포시 표선면 성읍리 산20-1 | 33.400226 | 126.803428
38 | 저류지 | 보성3호 | 제주특별자치도 서귀포시 대정읍 보성리 1153 | 33.254934 | 126.276687