Overview

Dataset statistics

Number of variables: 5
Number of observations: 55
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.5 KiB
Average record size in memory: 46.4 B

Variable types

Text: 1
Categorical: 1
Numeric: 3

Dataset

Description: Sample
Author: 제타럭스시스템
URL: https://bigdata-geo.kr/user/dataset/view.do?data_sn=499

Alerts

LIFE_INFRA is highly overall correlated with TOTL_GRAD (High correlation)
CMPTT_GRAD is highly overall correlated with TOTL_GRAD and 1 other field (High correlation)
TOTL_GRAD is highly overall correlated with LIFE_INFRA and 2 other fields (High correlation)
PUL_GRAD is highly overall correlated with CMPTT_GRAD and 1 other field (High correlation)
PUL_GRAD is highly imbalanced (86.9%) (Imbalance)
GRID_NO has unique values (Unique)
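The 86.9% imbalance figure for PUL_GRAD is consistent with an entropy-based score: one minus the Shannon entropy of the class frequencies, divided by the maximum entropy for that number of classes. The sketch below is an assumption about how such a score can be computed, not a confirmed copy of the profiler's code. The function name `imbalance_score` is hypothetical, and the counts (54 and 1) come from the PUL_GRAD value table in this report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # a single class has no balance to measure
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(n_classes)


# PUL_GRAD: value 1 occurs 54 times, value 2 occurs once.
score = imbalance_score([54, 1])
print(f"{score:.1%}")
```

With two classes the maximum entropy is 1 bit, so the score is simply one minus the entropy of the 54-to-1 split.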

Reproduction

Analysis started: 2023-12-10 13:21:32.855896
Analysis finished: 2023-12-10 13:21:35.235605
Duration: 2.38 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
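A report of this kind is typically produced by passing a pandas DataFrame to ydata-profiling. The sketch below is a minimal illustration under assumptions: the helper name `build_profile_report` and the file names in the usage comment are hypothetical, and the settings stored in config.json are not reproduced here.

```python
def build_profile_report(csv_path, html_path, title="Profiling Report"):
    """Profile a CSV file and write the result as an HTML report."""
    # Imported lazily so that this module still loads where the optional
    # dependencies (pandas, ydata-profiling) are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title=title)
    profile.to_file(html_path)
    return profile


# Usage (hypothetical file names, shown as a comment so nothing runs on import):
# build_profile_report("grid_grades.csv", "grid_grades_report.html")
```

Run times and memory figures will differ from those listed above depending on the machine and library versions.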

Variables

GRID_NO
Text

UNIQUE 

Distinct: 55
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 572.0 B

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 550
Distinct characters: 9
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
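As a small illustration of those character properties, the sketch below counts the Unicode general categories in one GRID_NO value using Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical; the two-letter codes `Lo`, `Ll` and `Nd` correspond to the "Other Letter", "Lowercase Letter" and "Decimal Number" categories listed in this section.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters in text by Unicode general category (e.g. 'Ll', 'Nd', 'Lo')."""
    return Counter(unicodedata.category(ch) for ch in text)


sample = "다바96bb47bb"  # first GRID_NO value in the sample
counts = category_counts(sample)
print(counts)
```

Each identifier in the sample contributes two Hangul letters, four digits and four lowercase letters, which matches the 20% / 40% / 40% split reported for the whole column.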

Unique

Unique: 55
Unique (%): 100.0%

Sample

1st row: 다바96bb47bb
2nd row: 다바96bb48aa
3rd row: 다바97aa47bb
4th row: 다바97aa48aa
5th row: 다바97aa48ab

Value              Count  Frequency (%)
다바96bb47bb       1      1.8%
다바97bb49ab       1      1.8%
다바98aa47bb       1      1.8%
다바98aa48aa       1      1.8%
다바98aa48ab       1      1.8%
다바98aa48ba       1      1.8%
다바98aa48bb       1      1.8%
다바98aa49aa       1      1.8%
다바98aa49ab       1      1.8%
다바98aa49ba       1      1.8%
Other values (45)  45     81.8%

Most occurring characters

Value  Count  Frequency (%)
a      115    20.9%
b      105    19.1%
9      75     13.6%
다     55     10.0%
바     55     10.0%
4      55     10.0%
8      53     9.6%
7      35     6.4%
6      2      0.4%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  220    40.0%
Decimal Number    220    40.0%
Other Letter      110    20.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
9      75     34.1%
4      55     25.0%
8      53     24.1%
7      35     15.9%
6      2      0.9%

Lowercase Letter
Value  Count  Frequency (%)
a      115    52.3%
b      105    47.7%

Other Letter
Value  Count  Frequency (%)
다     55     50.0%
바     55     50.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   220    40.0%
Common  220    40.0%
Hangul  110    20.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
9      75     34.1%
4      55     25.0%
8      53     24.1%
7      35     15.9%
6      2      0.9%

Latin
Value  Count  Frequency (%)
a      115    52.3%
b      105    47.7%

Hangul
Value  Count  Frequency (%)
다     55     50.0%
바     55     50.0%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   440    80.0%
Hangul  110    20.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
a      115    26.1%
b      105    23.9%
9      75     17.0%
4      55     12.5%
8      53     12.0%
7      35     8.0%
6      2      0.5%

Hangul
Value  Count  Frequency (%)
다     55     50.0%
바     55     50.0%

PUL_GRAD
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 572.0 B
Value counts: 1 (54), 2 (1)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 1.8%
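The difference between "Distinct" (2) and "Unique" (1) for PUL_GRAD can be reproduced from the value counts. The sketch below assumes that "distinct" counts the different values present and "unique" counts the values that occur exactly once, which is consistent with the figures reported here. The list `values` is rebuilt from the summary counts, not read from the original data file.

```python
from collections import Counter

# PUL_GRAD as summarized in the report: 54 rows with "1" and one row with "2".
values = ["1"] * 54 + ["2"]

freq = Counter(values)
n = len(values)
distinct = len(freq)                               # different values present
unique = sum(1 for c in freq.values() if c == 1)   # values seen exactly once

print(f"Distinct: {distinct} ({distinct / n:.1%})")
print(f"Unique: {unique} ({unique / n:.1%})")
```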

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      54     98.2%
2      1      1.8%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: plot of the common values (1: 54 rows, 98.2%; 2: 1 row, 1.8%)

LIFE_INFRA
Real number (ℝ)

HIGH CORRELATION 

Distinct: 14
Distinct (%): 25.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.909091
Minimum: 7
Maximum: 37
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 627.0 B

Quantile statistics

Minimum: 7
5-th percentile: 9.7
Q1: 17
Median: 17
Q3: 17
95-th percentile: 20.6
Maximum: 37
Range: 30
Interquartile range (IQR): 0
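The fractional percentiles (9.7 and 20.6) are consistent with linear interpolation between neighbouring sorted values. The sketch below rebuilds LIFE_INFRA from the value counts listed for this variable and applies that interpolation. The helper name `percentile` is hypothetical, and the interpolation rule is an assumption that happens to reproduce the reported figures.

```python
def percentile(sorted_values, q):
    """Percentile by linear interpolation between closest ranks (q in [0, 100])."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


# LIFE_INFRA rebuilt from its value counts (value: count).
counts = {7: 1, 8: 1, 9: 1, 10: 1, 12: 1, 13: 2, 14: 1, 15: 1,
          17: 41, 18: 1, 20: 1, 22: 1, 35: 1, 37: 1}
data = sorted(v for v, c in counts.items() for _ in range(c))

p5, q1, q3, p95 = (percentile(data, q) for q in (5, 25, 75, 95))
print(round(p5, 1), q1, q3, round(p95, 1), q3 - q1)
```

Because 41 of the 55 observations equal 17, both quartiles fall on 17 and the interquartile range collapses to 0.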

Descriptive statistics

Standard deviation: 4.5593726
Coefficient of variation (CV): 0.26964032
Kurtosis: 10.909339
Mean: 16.909091
Median Absolute Deviation (MAD): 0
Skewness: 2.3153311
Sum: 930
Variance: 20.787879
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)
Common values
Value             Count  Frequency (%)
17                41     74.5%
13                2      3.6%
35                1      1.8%
20                1      1.8%
37                1      1.8%
12                1      1.8%
22                1      1.8%
8                 1      1.8%
9                 1      1.8%
7                 1      1.8%
Other values (4)  4      7.3%

Lowest values
Value  Count  Frequency (%)
7      1      1.8%
8      1      1.8%
9      1      1.8%
10     1      1.8%
12     1      1.8%
13     2      3.6%
14     1      1.8%
15     1      1.8%
17     41     74.5%
18     1      1.8%

Highest values
Value  Count  Frequency (%)
37     1      1.8%
35     1      1.8%
22     1      1.8%
20     1      1.8%
18     1      1.8%
17     41     74.5%
15     1      1.8%
14     1      1.8%
13     2      3.6%
12     1      1.8%

CMPTT_GRAD
Real number (ℝ)

HIGH CORRELATION 

Distinct: 7
Distinct (%): 12.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 71.363636
Minimum: 66
Maximum: 72
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 627.0 B

Quantile statistics

Minimum: 66
5-th percentile: 67.7
Q1: 71
Median: 72
Q3: 72
95-th percentile: 72
Maximum: 72
Range: 6
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.3792595
Coefficient of variation (CV): 0.019327204
Kurtosis: 6.6734925
Mean: 71.363636
Median Absolute Deviation (MAD): 0
Skewness: -2.6711829
Sum: 3925
Variance: 1.9023569
Monotonicity: Not monotonic
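Several of these descriptive statistics can be checked from the value counts alone. The sketch below rebuilds CMPTT_GRAD from its listed counts and recomputes the sum, mean, sample variance, standard deviation, coefficient of variation and MAD with Python's standard `statistics` module. It assumes the variance uses the n - 1 denominator, which is what reproduces the reported 1.9023569. Skewness and kurtosis are left out because their exact estimator is not stated in the report.

```python
import statistics

# CMPTT_GRAD rebuilt from its value counts (value: count).
counts = {66: 1, 67: 2, 68: 1, 69: 1, 70: 1, 71: 10, 72: 39}
data = [v for v, c in counts.items() for _ in range(c)]

mean = statistics.mean(data)
variance = statistics.variance(data)   # sample variance, divides by n - 1
std = statistics.stdev(data)
cv = std / mean                        # coefficient of variation
median = statistics.median(data)
mad = statistics.median(abs(x - median) for x in data)  # median absolute deviation

print(sum(data), round(mean, 6), round(variance, 7), round(std, 7), round(cv, 9), mad)
```

The MAD is 0 because more than half of the observations (39 of 55) sit exactly on the median value of 72.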
Histogram with fixed size bins (bins=7)
Common values
Value  Count  Frequency (%)
72     39     70.9%
71     10     18.2%
67     2      3.6%
69     1      1.8%
70     1      1.8%
68     1      1.8%
66     1      1.8%

Lowest values
Value  Count  Frequency (%)
66     1      1.8%
67     2      3.6%
68     1      1.8%
69     1      1.8%
70     1      1.8%
71     10     18.2%
72     39     70.9%

Highest values
Value  Count  Frequency (%)
72     39     70.9%
71     10     18.2%
70     1      1.8%
69     1      1.8%
68     1      1.8%
67     2      3.6%
66     1      1.8%

TOTL_GRAD
Real number (ℝ)

HIGH CORRELATION 

Distinct: 11
Distinct (%): 20.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 33.436364
Minimum: 27
Maximum: 44
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 627.0 B

Quantile statistics

Minimum: 27
5-th percentile: 28.7
Q1: 34
Median: 34
Q3: 34
95-th percentile: 34.6
Maximum: 44
Range: 17
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 2.8203207
Coefficient of variation (CV): 0.084348906
Kurtosis: 5.5081023
Mean: 33.436364
Median Absolute Deviation (MAD): 0
Skewness: 0.88160079
Sum: 1839
Variance: 7.9542088
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
Common values
Value  Count  Frequency (%)
34     40     72.7%
30     4      7.3%
27     2      3.6%
29     2      3.6%
44     1      1.8%
43     1      1.8%
36     1      1.8%
33     1      1.8%
28     1      1.8%
31     1      1.8%

Lowest values
Value  Count  Frequency (%)
27     2      3.6%
28     1      1.8%
29     2      3.6%
30     4      7.3%
31     1      1.8%
32     1      1.8%
33     1      1.8%
34     40     72.7%
36     1      1.8%
43     1      1.8%

Highest values
Value  Count  Frequency (%)
44     1      1.8%
43     1      1.8%
36     1      1.8%
34     40     72.7%
33     1      1.8%
32     1      1.8%
31     1      1.8%
30     4      7.3%
29     2      3.6%
28     1      1.8%

Interactions


Correlations

            GRID_NO  PUL_GRAD  LIFE_INFRA  CMPTT_GRAD  TOTL_GRAD
GRID_NO     1.000    1.000     1.000       1.000       1.000
PUL_GRAD    1.000    1.000     0.000       0.608       0.608
LIFE_INFRA  1.000    0.000     1.000       0.924       0.978
CMPTT_GRAD  1.000    0.608     0.924       1.000       0.863
TOTL_GRAD   1.000    0.608     0.978       0.863       1.000

            LIFE_INFRA  CMPTT_GRAD  TOTL_GRAD  PUL_GRAD
LIFE_INFRA  1.000       0.182       0.804      0.000
CMPTT_GRAD  0.182       1.000       0.546      0.622
TOTL_GRAD   0.804       0.546       1.000      0.622
PUL_GRAD    0.000       0.622       0.622      1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

    GRID_NO       PUL_GRAD  LIFE_INFRA  CMPTT_GRAD  TOTL_GRAD
0   다바96bb47bb  1         35          71          44
1   다바96bb48aa  1         20          69          34
2   다바97aa47bb  1         37          67          43
3   다바97aa48aa  1         13          71          30
4   다바97aa48ab  1         12          71          30
5   다바97aa48ba  1         22          71          36
6   다바97aa48bb  1         17          72          34
7   다바97aa49aa  1         17          72          34
8   다바97aa49ab  1         17          72          34
9   다바97ab47bb  1         17          72          34

    GRID_NO       PUL_GRAD  LIFE_INFRA  CMPTT_GRAD  TOTL_GRAD
45  다바98ab49ab  1         17          72          34
46  다바98ab49ba  1         17          72          34
47  다바98ab49bb  1         17          72          34
48  다바98ba47bb  1         17          72          34
49  다바98ba48ab  1         18          66          30
50  다바98ba48ba  1         17          72          34
51  다바98ba48bb  1         17          72          34
52  다바98ba49aa  1         17          72          34
53  다바98ba49ab  1         17          72          34
54  다바98ba49ba  1         17          72          34