Overview

Dataset statistics

Number of variables: 5
Number of observations: 139
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.1 KiB
Average record size in memory: 44.9 B
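The derived figures above can be cross-checked from the raw counts. This is a minimal sketch, assuming the average record size is the total memory divided by the number of observations; the report only shows rounded values, so the result is approximate.

```python
# Figures copied from the Dataset statistics above.
n_variables = 5
n_observations = 139
missing_cells = 0
total_size_kib = 6.1  # already rounded in the report

# Share of missing cells over all cells (rows x columns).
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Assumed definition: total memory divided by the number of rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```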

Variable types

Text: 1
Categorical: 3
Numeric: 1

Dataset

Description: Sample
Author: 제타럭스시스템
URL: https://bigdata-geo.kr/user/dataset/view.do?data_sn=496

Alerts

PUL_GRAD has constant value "1" (Constant)
LIFE_INFRA is highly overall correlated with CMPTT_GRAD and 1 other field (High correlation)
CMPTT_GRAD is highly overall correlated with LIFE_INFRA and 1 other field (High correlation)
TOTL_GRAD is highly overall correlated with LIFE_INFRA and 1 other field (High correlation)
CMPTT_GRAD is highly imbalanced (82.0%) (Imbalance)
TOTL_GRAD is highly imbalanced (82.0%) (Imbalance)
GIRD_NO has unique values (Unique)
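The 82.0% imbalance figure can be reproduced from the class counts of CMPTT_GRAD (131, 4, 2, 1, 1), which TOTL_GRAD shares. The sketch below assumes the score is one minus the Shannon entropy normalised by the logarithm of the number of classes. That definition is an assumption, not something this report states, but it does yield the reported value.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k) for class counts (0 = balanced, 1 = fully skewed)."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # Not defined for a single class; 0.0 is used here as a convention.
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)

# Class counts of CMPTT_GRAD as listed under Common Values.
cmptt_counts = [131, 4, 2, 1, 1]
score = imbalance_score(cmptt_counts)
print(f"{score:.1%}")
```

A perfectly balanced variable scores 0 under this definition, so a value near 1 means a single class dominates.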

Reproduction

Analysis started: 2023-12-10 13:21:10.600496
Analysis finished: 2023-12-10 13:21:11.433612
Duration: 0.83 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
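A report like this one can be regenerated with ydata-profiling. The following is a sketch only: the file paths are hypothetical, the `dataset` metadata keys are assumed to correspond to the Description, Author and URL fields shown above, and reading the grade columns as strings is an assumption made so that they are profiled as categorical.

```python
def build_report(csv_path, html_path):
    """Profile a CSV file and write an HTML report (paths are hypothetical)."""
    # Imported lazily so the function can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: grade columns are read as strings so they are treated
    # as categorical, matching the variable types listed in the overview.
    df = pd.read_csv(
        csv_path,
        dtype={"PUL_GRAD": str, "CMPTT_GRAD": str, "TOTL_GRAD": str},
    )
    report = ProfileReport(
        df,
        title="Profiling report",
        dataset={
            "description": "Sample",
            "author": "제타럭스시스템",
            "url": "https://bigdata-geo.kr/user/dataset/view.do?data_sn=496",
        },
    )
    report.to_file(html_path)
    return report
```

Calling `build_report("grid_sample.csv", "report.html")` would write the HTML file; both names are placeholders.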

Variables

GIRD_NO
Text

UNIQUE 

Distinct: 139
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 1390
Distinct characters: 11
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
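As an illustration, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories over the five GIRD_NO sample values shown in this report; the helper name `category_counts` is introduced here for illustration only.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over all characters in the values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

# First five GIRD_NO values from the Sample list of this variable.
sample = ["마마72ba30ab", "마마73aa30ba", "마마73aa30bb", "마마73ab30bb", "마마73aa31aa"]
counts = category_counts(sample)

# Lo = Other Letter, Ll = Lowercase Letter, Nd = Decimal Number
for category, count in counts.most_common():
    print(category, count)
```

Each identifier contributes two Other Letter, four Lowercase Letter and four Decimal Number characters, which is consistent with the 20% / 40% / 40% split reported for the full column.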

Unique

Unique: 139
Unique (%): 100.0%

Sample

1st row: 마마72ba30ab
2nd row: 마마73aa30ba
3rd row: 마마73aa30bb
4th row: 마마73ab30bb
5th row: 마마73aa31aa

Value                Count   Frequency (%)
마마72ba30ab         1       0.7%
마마71ba31ba         1       0.7%
마마71ba31aa         1       0.7%
마마71ba30bb         1       0.7%
마마71ba30ba         1       0.7%
마마71ba30ab         1       0.7%
마마71ba30aa         1       0.7%
마마71ba29bb         1       0.7%
마마71ba29ba         1       0.7%
마마71aa29ab         1       0.7%
Other values (129)   129     92.8%

Most occurring characters

Value   Count   Frequency (%)
b       283     20.4%
마      278     20.0%
a       273     19.6%
7       129     9.3%
0       103     7.4%
3       89      6.4%
1       82      5.9%
2       77      5.5%
9       45      3.2%
8       20      1.4%

Most occurring categories

Value              Count   Frequency (%)
Lowercase Letter   556     40.0%
Decimal Number     556     40.0%
Other Letter       278     20.0%

Most frequent character per category

Decimal Number
Value   Count   Frequency (%)
7       129     23.2%
0       103     18.5%
3       89      16.0%
1       82      14.7%
2       77      13.8%
9       45      8.1%
8       20      3.6%
6       11      2.0%

Lowercase Letter
Value   Count   Frequency (%)
b       283     50.9%
a       273     49.1%

Other Letter
Value   Count   Frequency (%)
마      278     100.0%

Most occurring scripts

Value    Count   Frequency (%)
Latin    556     40.0%
Common   556     40.0%
Hangul   278     20.0%

Most frequent character per script

Common
Value   Count   Frequency (%)
7       129     23.2%
0       103     18.5%
3       89      16.0%
1       82      14.7%
2       77      13.8%
9       45      8.1%
8       20      3.6%
6       11      2.0%

Latin
Value   Count   Frequency (%)
b       283     50.9%
a       273     49.1%

Hangul
Value   Count   Frequency (%)
마      278     100.0%

Most occurring blocks

Value    Count   Frequency (%)
ASCII    1112    80.0%
Hangul   278     20.0%

Most frequent character per block

ASCII
Value   Count   Frequency (%)
b       283     25.4%
a       273     24.6%
7       129     11.6%
0       103     9.3%
3       89      8.0%
1       82      7.4%
2       77      6.9%
9       45      4.0%
8       20      1.8%
6       11      1.0%

Hangul
Value   Count   Frequency (%)
마      278     100.0%

PUL_GRAD
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value   Count   Frequency (%)
1       139     100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the value counts]

LIFE_INFRA
Real number (ℝ)

HIGH CORRELATION 

Distinct: 7
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.784173
Minimum: 10
Maximum: 24
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 KiB

Quantile statistics

Minimum: 10
5-th percentile: 16.7
Q1: 17
median: 17
Q3: 17
95-th percentile: 17
Maximum: 24
Range: 14
Interquartile range (IQR): 0
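The 5-th percentile of 16.7 is not a value that occurs in the data, which suggests interpolation between neighbouring order statistics. The sketch below assumes linear interpolation at position q·(n − 1) and rebuilds the column from the value counts listed for LIFE_INFRA in this report; the `quantile` helper is illustrative and is not taken from the profiling tool.

```python
def quantile(sorted_values, q):
    """Linear-interpolation quantile at position q * (n - 1)."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])

# Value counts of LIFE_INFRA as listed in this report (value: count).
value_counts = {10: 1, 11: 3, 12: 1, 13: 1, 14: 1, 17: 131, 24: 1}
values = sorted(v for v, c in value_counts.items() for _ in range(c))

p5 = quantile(values, 0.05)
q1 = quantile(values, 0.25)
q3 = quantile(values, 0.75)
iqr = q3 - q1

print(round(p5, 1), q1, q3, iqr)
```

With 139 observations the 5% position is 6.9, which falls between the sorted values 14 and 17, giving 14 + 0.9 × 3 = 16.7.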

Descriptive statistics

Standard deviation: 1.344676
Coefficient of variation (CV): 0.080115714
Kurtosis: 17.713999
Mean: 16.784173
Median Absolute Deviation (MAD): 0
Skewness: -2.1198479
Sum: 2333
Variance: 1.8081535
Monotonicity: Not monotonic
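Several of these statistics can be recomputed from the LIFE_INFRA value counts listed in this report using only the standard library. The sketch assumes the sample (n − 1) definitions of variance and standard deviation, which match the reported figures. Skewness and kurtosis are left out because their exact definitions are not stated here.

```python
import statistics

# Value counts of LIFE_INFRA as listed in this report (value: count).
value_counts = {10: 1, 11: 3, 12: 1, 13: 1, 14: 1, 17: 131, 24: 1}
values = [v for v, c in value_counts.items() for _ in range(c)]

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
std = statistics.stdev(values)           # square root of the sample variance
cv = std / mean                          # coefficient of variation

median = statistics.median(values)
# Median absolute deviation: median of |x - median|.
mad = statistics.median([abs(v - median) for v in values])

print(total, round(mean, 6), round(variance, 7), round(std, 6), round(cv, 9), mad)
```

Because 131 of the 139 observations equal 17, the median is 17 and more than half of the absolute deviations are zero, which is why the MAD is 0 even though the standard deviation is not.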
Histogram with fixed size bins (bins=7)

Common values
Value   Count   Frequency (%)
17      131     94.2%
11      3       2.2%
14      1       0.7%
10      1       0.7%
12      1       0.7%
24      1       0.7%
13      1       0.7%

Minimum 10 values
Value   Count   Frequency (%)
10      1       0.7%
11      3       2.2%
12      1       0.7%
13      1       0.7%
14      1       0.7%
17      131     94.2%
24      1       0.7%

Maximum 10 values
Value   Count   Frequency (%)
24      1       0.7%
17      131     94.2%
14      1       0.7%
13      1       0.7%
12      1       0.7%
11      3       2.2%
10      1       0.7%

CMPTT_GRAD
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 2
Unique (%): 1.4%

Sample

1st row: 72
2nd row: 72
3rd row: 72
4th row: 72
5th row: 72

Common Values

Value   Count   Frequency (%)
72      131     94.2%
71      4       2.9%
67      2       1.4%
68      1       0.7%
69      1       0.7%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the value counts]

TOTL_GRAD
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 2
Unique (%): 1.4%

Sample

1st row: 34
2nd row: 34
3rd row: 34
4th row: 34
5th row: 34

Common Values

Value   Count   Frequency (%)
34      131     94.2%
29      4       2.9%
27      2       1.4%
31      1       0.7%
35      1       0.7%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the value counts]

Interactions

[Figure: interaction plot]

Correlations

[Figure: correlation heatmap]

             LIFE_INFRA   CMPTT_GRAD   TOTL_GRAD
LIFE_INFRA   1.000        0.975        0.987
CMPTT_GRAD   0.975        1.000        0.970
TOTL_GRAD    0.987        0.970        1.000

[Figure: correlation heatmap]

             TOTL_GRAD   CMPTT_GRAD
TOTL_GRAD    1.000       0.752
CMPTT_GRAD   0.752       1.000

[Figure: correlation heatmap]

             LIFE_INFRA   CMPTT_GRAD   TOTL_GRAD
LIFE_INFRA   1.000        0.773        0.833
CMPTT_GRAD   0.773        1.000        0.752
TOTL_GRAD    0.833        0.752        1.000
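For categorical pairs such as CMPTT_GRAD and TOTL_GRAD, Cramér's V is one common association measure, and ydata-profiling lists it among its supported coefficients. This report does not say which coefficient each matrix shows, and the joint distribution of the two variables is not included. The following is therefore only an illustrative sketch of the plain (not bias-corrected) statistic on hypothetical toy data, not a reproduction of the values above.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Plain Cramér's V (no bias correction) for two equal-length label lists."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    rows = Counter(xs)
    cols = Counter(ys)
    k = min(len(rows), len(cols)) - 1
    if k < 1:
        return float("nan")  # undefined when either variable is constant
    chi2 = 0.0
    for r, r_count in rows.items():
        for c, c_count in cols.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))

# Hypothetical toy labels, not taken from the dataset.
perfect = cramers_v(["72", "72", "71", "71"], ["34", "34", "29", "29"])
independent = cramers_v(["72", "72", "71", "71"], ["34", "29", "34", "29"])
print(perfect, independent)
```

The statistic is undefined for a constant column, which would apply to PUL_GRAD since it takes a single value.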

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows

      GIRD_NO        PUL_GRAD   LIFE_INFRA   CMPTT_GRAD   TOTL_GRAD
0     마마72ba30ab   1          17           72           34
1     마마73aa30ba   1          17           72           34
2     마마73aa30bb   1          17           72           34
3     마마73ab30bb   1          17           72           34
4     마마73aa31aa   1          17           72           34
5     마마73ab31aa   1          17           72           34
6     마마69ba29bb   1          17           72           34
7     마마69ba30aa   1          17           72           34
8     마마69ba30ab   1          17           72           34
9     마마69ba30ba   1          17           72           34

Last rows

      GIRD_NO        PUL_GRAD   LIFE_INFRA   CMPTT_GRAD   TOTL_GRAD
129   마마72ab31aa   1          17           72           34
130   마마72ab31ab   1          17           72           34
131   마마72ab31ba   1          17           72           34
132   마마72ab31bb   1          17           72           34
133   마마72ba30ba   1          17           72           34
134   마마72ba30bb   1          17           72           34
135   마마72ba31aa   1          17           72           34
136   마마72bb30ba   1          17           72           34
137   마마72bb30bb   1          17           72           34
138   마마72bb31aa   1          17           72           34