Overview

Dataset statistics

Number of variables: 4
Number of observations: 500
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 16.2 KiB
Average record size in memory: 33.3 B

Variable types

Text: 1
Numeric: 1
Categorical: 2

Alerts

GAP_RISK_CN is highly overall correlated with GAP_XTN and 1 other field (High correlation)
GAP_RISK_GRD_CD is highly overall correlated with GAP_XTN and 1 other field (High correlation)
GAP_XTN is highly overall correlated with GAP_RISK_GRD_CD and 1 other field (High correlation)
GEOM has unique values (Unique)
GAP_XTN has 17 (3.4%) zeros (Zeros)

Reproduction

Analysis started: 2024-03-13 12:47:31.235024
Analysis finished: 2024-03-13 12:47:31.834569
Duration: 0.6 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

GEOM
Text

UNIQUE 

Distinct: 500
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 256
Median length: 255
Mean length: 190.618
Min length: 154

Characters and Unicode

Total characters: 95309
Distinct characters: 25
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
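
As an illustration of the character properties mentioned above, the following minimal sketch counts Unicode general categories in a single GEOM value using Python's standard `unicodedata` module. The string is copied from the 4th sample row of this report; the variable names are illustrative, and this is not necessarily how the profiling tool computes its own counts.

```python
import unicodedata
from collections import Counter

# WKT string copied from the 4th sample row of GEOM in this report.
wkt = (
    "MULTIPOLYGON (((129.449862916601 36.5112590494972,"
    "129.449862968751 36.5112680607846,"
    "129.449874131846 36.5112680186865,"
    "129.449874079695 36.5112590073992,"
    "129.449862916601 36.5112590494972)))"
)

# unicodedata.category returns a two-letter general-category code, for example
# "Nd" (Decimal Number), "Lu" (Uppercase Letter), "Po" (Other Punctuation),
# "Zs" (Space Separator), "Ps" (Open Punctuation), "Pe" (Close Punctuation).
category_counts = Counter(unicodedata.category(ch) for ch in wkt)

for code, count in category_counts.most_common():
    print(code, count)
```

For this one row the sketch finds six categories, with 12 uppercase letters and 3 opening and 3 closing parentheses, which matches the per-row share of the column totals reported in this section (6000, 1500 and 1500 over 500 rows).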

Unique

Unique: 500
Unique (%): 100.0%

Sample

1st row: MULTIPOLYGON (((129.449851799286 36.5112670023122,129.449851805655 36.5112681028816,129.449862968751 36.5112680607846,129.449862916601 36.5112590494972,129.449858875098 36.5112590647382,129.449851799286 36.5112670023122)))
2nd row: MULTIPOLYGON (((129.449863016607 36.5112763301548,129.449862968751 36.5112680607846,129.449851805655 36.5112681028816,129.449851839214 36.5112739019109,129.449863016607 36.5112763301548)))
3rd row: MULTIPOLYGON (((129.44986289062 36.5112545601653,129.449862916601 36.5112590494972,129.449874079695 36.5112590073992,129.449874027544 36.5112499961119,129.449866935322 36.5112500228581,129.44986289062 36.5112545601653)))
4th row: MULTIPOLYGON (((129.449862916601 36.5112590494972,129.449862968751 36.5112680607846,129.449874131846 36.5112680186865,129.449874079695 36.5112590073992,129.449862916601 36.5112590494972)))
5th row: MULTIPOLYGON (((129.449866373511 36.5112770594287,129.449874183997 36.5112770299738,129.449874131846 36.5112680186865,129.449862968751 36.5112680607846,129.449863016607 36.5112763301548,129.449866373511 36.5112770594287)))
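
The GEOM samples are WKT (well-known text) polygons. As a rough sketch of how such a value could be turned into coordinate pairs with only the standard library, the snippet below extracts the vertices of the 4th sample row and checks that the ring is closed. The helper name `parse_vertices` is hypothetical, and it assumes a geometry with a single ring, as in the samples shown here; a dedicated geometry library would be the more robust choice for general WKT.

```python
import re

# WKT string copied from the 4th sample row of GEOM in this report.
wkt = (
    "MULTIPOLYGON (((129.449862916601 36.5112590494972,"
    "129.449862968751 36.5112680607846,"
    "129.449874131846 36.5112680186865,"
    "129.449874079695 36.5112590073992,"
    "129.449862916601 36.5112590494972)))"
)

# One "x y" pair: two decimal numbers separated by a single space.
PAIR = re.compile(r"(-?\d+(?:\.\d+)?) (-?\d+(?:\.\d+)?)")


def parse_vertices(wkt_text):
    """Return all (x, y) vertices in order of appearance.

    Hypothetical helper: it ignores ring and polygon boundaries, so it is
    only meaningful for geometries made of a single ring.
    """
    return [(float(x), float(y)) for x, y in PAIR.findall(wkt_text)]


ring = parse_vertices(wkt)
print(len(ring))             # number of vertices, including the closing one
print(ring[0] == ring[-1])   # a valid polygon ring repeats its first vertex
```

The x values near 129.45 and y values near 36.51 appear to be longitude and latitude in degrees, although the report itself does not state the coordinate reference system.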
Value                              Count  Frequency (%)
multipolygon                       500    14.1%
36.5111139846251,129.450107669817  1      < 0.1%
36.5110869507633                   1      < 0.1%
129.450096402391                   1      < 0.1%
36.5110959620506,129.450096454568  1      < 0.1%
36.5111049733379,129.450107617639  1      < 0.1%
36.5111049312182,129.450107565461  1      < 0.1%
36.5110959199309,129.450096402391  1      < 0.1%
36.5110959620506                   1      < 0.1%
129.450096454568                   1      < 0.1%
Other values (3045)                3045   85.7%

Most occurring characters

Value              Count  Frequency (%)
1                  13549  14.2%
5                  8821   9.3%
9                  8602   9.0%
4                  7881   8.3%
2                  7454   7.8%
0                  7235   7.6%
6                  7022   7.4%
3                  6464   6.8%
.                  5108   5.4%
8                  4581   4.8%
Other values (15)  18592  19.5%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     76093  79.8%
Other Punctuation  7162   7.5%
Uppercase Letter   6000   6.3%
Space Separator    3054   3.2%
Close Punctuation  1500   1.6%
Open Punctuation   1500   1.6%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      13549  17.8%
5      8821   11.6%
9      8602   11.3%
4      7881   10.4%
2      7454   9.8%
0      7235   9.5%
6      7022   9.2%
3      6464   8.5%
8      4581   6.0%
7      4484   5.9%

Uppercase Letter
Value  Count  Frequency (%)
O      1000   16.7%
L      1000   16.7%
U      500    8.3%
N      500    8.3%
G      500    8.3%
Y      500    8.3%
P      500    8.3%
I      500    8.3%
T      500    8.3%
M      500    8.3%

Other Punctuation
Value  Count  Frequency (%)
.      5108   71.3%
,      2054   28.7%

Space Separator
Value    Count  Frequency (%)
(space)  3054   100.0%

Close Punctuation
Value  Count  Frequency (%)
)      1500   100.0%

Open Punctuation
Value  Count  Frequency (%)
(      1500   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  89309  93.7%
Latin   6000   6.3%

Most frequent character per script

Common
Value             Count  Frequency (%)
1                 13549  15.2%
5                 8821   9.9%
9                 8602   9.6%
4                 7881   8.8%
2                 7454   8.3%
0                 7235   8.1%
6                 7022   7.9%
3                 6464   7.2%
.                 5108   5.7%
8                 4581   5.1%
Other values (5)  12592  14.1%

Latin
Value  Count  Frequency (%)
O      1000   16.7%
L      1000   16.7%
U      500    8.3%
N      500    8.3%
G      500    8.3%
Y      500    8.3%
P      500    8.3%
I      500    8.3%
T      500    8.3%
M      500    8.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  95309  100.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
1                  13549  14.2%
5                  8821   9.3%
9                  8602   9.0%
4                  7881   8.3%
2                  7454   7.8%
0                  7235   7.6%
6                  7022   7.4%
3                  6464   6.8%
.                  5108   5.4%
8                  4581   4.8%
Other values (15)  18592  19.5%

GAP_XTN
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 277
Distinct (%): 55.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.197072
Minimum: 0
Maximum: 0.851
Zeros: 17
Zeros (%): 3.4%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 0
5th percentile: 0.00495
Q1: 0.086
Median: 0.208
Q3: 0.294
95th percentile: 0.40625
Maximum: 0.851
Range: 0.851
Interquartile range (IQR): 0.208

Descriptive statistics

Standard deviation: 0.12875496
Coefficient of variation (CV): 0.65333967
Kurtosis: 0.73233321
Mean: 0.197072
Median Absolute Deviation (MAD): 0.095
Skewness: 0.47309684
Sum: 98.536
Variance: 0.016577838
Monotonicity: Not monotonic
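
Several of the GAP_XTN statistics are derived from others, so they can be cross-checked with simple arithmetic. The sketch below recomputes the coefficient of variation, variance, interquartile range, range and sum from the reported mean, standard deviation, quartiles, extremes and the 500 observations. Variable names are illustrative, and small differences in the last digits are expected because the inputs are rounded.

```python
# Values copied from the GAP_XTN statistics in this report.
n = 500
mean = 0.197072
std = 0.12875496
q1, q3 = 0.086, 0.294
minimum, maximum = 0.0, 0.851

cv = std / mean                   # coefficient of variation = std / mean
variance = std ** 2               # variance = std squared
iqr = q3 - q1                     # interquartile range = Q3 - Q1
value_range = maximum - minimum   # range = max - min
total = mean * n                  # sum = mean * number of observations

print(f"CV       = {cv:.8f}")
print(f"Variance = {variance:.9f}")
print(f"IQR      = {iqr:.3f}")
print(f"Range    = {value_range:.3f}")
print(f"Sum      = {total:.3f}")
```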
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
0.0                 17     3.4%
0.3                 16     3.2%
0.301               7      1.4%
0.283               7      1.4%
0.162               5      1.0%
0.308               5      1.0%
0.032               5      1.0%
0.312               4      0.8%
0.221               4      0.8%
0.086               4      0.8%
Other values (267)  426    85.2%

Minimum 10 values

Value  Count  Frequency (%)
0.0    17     3.4%
0.001  2      0.4%
0.003  3      0.6%
0.004  3      0.6%
0.005  3      0.6%
0.006  2      0.4%
0.007  2      0.4%
0.008  1      0.2%
0.009  1      0.2%
0.012  1      0.2%

Maximum 10 values

Value  Count  Frequency (%)
0.851  1      0.2%
0.649  1      0.2%
0.646  1      0.2%
0.531  1      0.2%
0.527  1      0.2%
0.499  1      0.2%
0.476  1      0.2%
0.471  1      0.2%
0.469  1      0.2%
0.466  1      0.2%

GAP_RISK_GRD_CD
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: D
2nd row: C
3rd row: B
4th row: C
5th row: C

Common Values

Value  Count  Frequency (%)
C      258    51.6%
B      128    25.6%
D      92     18.4%
A      17     3.4%
E      5      1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
c      258    51.6%
b      128    25.6%
d      92     18.4%
a      17     3.4%
e      5      1.0%

GAP_RISK_CN
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 4
Median length: 2
Mean length: 2.02
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 위험 (Danger)
2nd row: 주의 (Caution)
3rd row: 보통 (Normal)
4th row: 주의 (Caution)
5th row: 주의 (Caution)

Common Values

Value                     Count  Frequency (%)
주의 (Caution)             258    51.6%
보통 (Normal)              128    25.6%
위험 (Danger)              92     18.4%
안전 (Safe)                17     3.4%
매우위험 (Very dangerous)  5      1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value                     Count  Frequency (%)
주의 (Caution)             258    51.6%
보통 (Normal)              128    25.6%
위험 (Danger)              92     18.4%
안전 (Safe)                17     3.4%
매우위험 (Very dangerous)  5      1.0%

Interactions

[Figure: interaction plot]

Correlations

                 GAP_XTN  GAP_RISK_GRD_CD  GAP_RISK_CN
GAP_XTN          1.000    0.886            0.886
GAP_RISK_GRD_CD  0.886    1.000            1.000
GAP_RISK_CN      0.886    1.000            1.000

                 GAP_RISK_CN  GAP_RISK_GRD_CD
GAP_RISK_CN      1.000        1.000
GAP_RISK_GRD_CD  1.000        1.000

                 GAP_XTN  GAP_RISK_GRD_CD  GAP_RISK_CN
GAP_XTN          1.000    0.763            0.763
GAP_RISK_GRD_CD  0.763    1.000            1.000
GAP_RISK_CN      0.763    1.000            1.000
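
The coefficient of 1.000 between GAP_RISK_GRD_CD and GAP_RISK_CN is what a one-to-one pairing of codes and labels would produce. The sketch below illustrates this with an uncorrected Cramér's V computed from a contingency table. Two things here are assumptions rather than facts from the report: the report does not name the coefficient it uses for categorical pairs, and the diagonal table assumes each grade code always occurs with the same label, which the identical counts (258, 128, 92, 17, 5) and the sample rows suggest but do not prove. The function name `cramers_v` is illustrative.

```python
import math

# Category counts from the Common Values tables of both variables.
# ASSUMPTION: each grade code is always paired with the same label, so the
# contingency table of (code, label) is diagonal.
counts = [258, 128, 92, 17, 5]
k = len(counts)
table = [[counts[i] if i == j else 0 for j in range(k)] for i in range(k)]


def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table with non-empty rows and columns."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    smaller_dim = min(len(row_totals), len(col_totals))
    return math.sqrt(chi2 / (n * (smaller_dim - 1)))


v = cramers_v(table)
print(round(v, 3))
```

Under this assumption the two columns carry the same information, so one of them could be dropped before modelling without losing anything.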

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index  GEOM  GAP_XTN  GAP_RISK_GRD_CD  GAP_RISK_CN
0  MULTIPOLYGON (((129.449851799286 36.5112670023122,129.449851805655 36.5112681028816,129.449862968751 36.5112680607846,129.449862916601 36.5112590494972,129.449858875098 36.5112590647382,129.449851799286 36.5112670023122)))  0.336  D  위험 (Danger)
1  MULTIPOLYGON (((129.449863016607 36.5112763301548,129.449862968751 36.5112680607846,129.449851805655 36.5112681028816,129.449851839214 36.5112739019109,129.449863016607 36.5112763301548)))  0.271  C  주의 (Caution)
2  MULTIPOLYGON (((129.44986289062 36.5112545601653,129.449862916601 36.5112590494972,129.449874079695 36.5112590073992,129.449874027544 36.5112499961119,129.449866935322 36.5112500228581,129.44986289062 36.5112545601653)))  0.015  B  보통 (Normal)
3  MULTIPOLYGON (((129.449862916601 36.5112590494972,129.449862968751 36.5112680607846,129.449874131846 36.5112680186865,129.449874079695 36.5112590073992,129.449862916601 36.5112590494972)))  0.15  C  주의 (Caution)
4  MULTIPOLYGON (((129.449866373511 36.5112770594287,129.449874183997 36.5112770299738,129.449874131846 36.5112680186865,129.449862968751 36.5112680607846,129.449863016607 36.5112763301548,129.449866373511 36.5112770594287)))  0.175  C  주의 (Caution)
5  MULTIPOLYGON (((129.449874995544 36.5112409809774,129.449885138484 36.5112409427255,129.449885086332 36.5112319314382,129.449883055765 36.5112319390961,129.449874995544 36.5112409809774)))  0.285  C  주의 (Caution)
6  MULTIPOLYGON (((129.449873981951 36.5112421180173,129.449874027544 36.5112499961119,129.449885190637 36.5112499540129,129.449885138484 36.5112409427255,129.449874995544 36.5112409809774,129.449873981951 36.5112421180173)))  0.294  C  주의 (Caution)
7  MULTIPOLYGON (((129.449874027544 36.5112499961119,129.449874079695 36.5112590073992,129.449885242789 36.5112589653002,129.449885190637 36.5112499540129,129.449874027544 36.5112499961119)))  0.245  C  주의 (Caution)
8  MULTIPOLYGON (((129.449874079695 36.5112590073992,129.449874131846 36.5112680186865,129.449885294942 36.5112679765875,129.449885242789 36.5112589653002,129.449874079695 36.5112590073992)))  0.296  C  주의 (Caution)
9  MULTIPOLYGON (((129.449874131846 36.5112680186865,129.449874183997 36.5112770299738,129.449885347094 36.5112769878747,129.449885294942 36.5112679765875,129.449874131846 36.5112680186865)))  0.434  D  위험 (Danger)

Last rows

Index  GEOM  GAP_XTN  GAP_RISK_GRD_CD  GAP_RISK_CN
490  MULTIPOLYGON (((129.450185028524 36.5109784783293,129.450185080711 36.5109874896167,129.450196243766 36.5109874474888,129.450196191578 36.5109784362015,129.450185028524 36.5109784783293)))  0.281  C  주의 (Caution)
491  MULTIPOLYGON (((129.450185080711 36.5109874896167,129.450185132898 36.510996500904,129.450196295954 36.5109964587761,129.450196243766 36.5109874474888,129.450185080711 36.5109874896167)))  0.298  C  주의 (Caution)
492  MULTIPOLYGON (((129.450185132898 36.510996500904,129.450185185085 36.5110055121913,129.450196348142 36.5110054700634,129.450196295954 36.5109964587761,129.450185132898 36.510996500904)))  0.229  C  주의 (Caution)
493  MULTIPOLYGON (((129.450185185085 36.5110055121913,129.450185237271 36.5110145234786,129.45019640033 36.5110144813507,129.450196348142 36.5110054700634,129.450185185085 36.5110055121913)))  0.0  A  안전 (Safe)
494  MULTIPOLYGON (((129.450185237271 36.5110145234786,129.450185289458 36.5110235347659,129.450196452518 36.5110234926379,129.45019640033 36.5110144813507,129.450185237271 36.5110145234786)))  0.068  B  보통 (Normal)
495  MULTIPOLYGON (((129.450185289458 36.5110235347659,129.450185341645 36.5110325460532,129.450196504706 36.5110325039252,129.450196452518 36.5110234926379,129.450185289458 36.5110235347659)))  0.129  C  주의 (Caution)
496  MULTIPOLYGON (((129.450185341645 36.5110325460532,129.450185393832 36.5110415573404,129.450196556895 36.5110415152125,129.450196504706 36.5110325039252,129.450185341645 36.5110325460532)))  0.032  B  보통 (Normal)
497  MULTIPOLYGON (((129.450185393832 36.5110415573404,129.450185446019 36.5110505686277,129.450196609083 36.5110505264997,129.450196556895 36.5110415152125,129.450185393832 36.5110415573404)))  0.0  A  안전 (Safe)
498  MULTIPOLYGON (((129.450185446019 36.5110505686277,129.450185498206 36.5110595799149,129.450196661271 36.5110595377869,129.450196609083 36.5110505264997,129.450185446019 36.5110505686277)))  0.074  B  보통 (Normal)
499  MULTIPOLYGON (((129.450185498206 36.5110595799149,129.450185550393 36.5110685912021,129.450196713459 36.5110685490741,129.450196661271 36.5110595377869,129.450185498206 36.5110595799149)))  0.033  B  보통 (Normal)