Overview

Dataset statistics

Number of variables: 4
Number of observations: 118
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 9
Duplicate rows (%): 7.6%
Total size in memory: 3.9 KiB
Average record size in memory: 34.1 B

Variable types

Text: 2
Numeric: 1
Categorical: 1

Dataset

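Description: Solar power generation business permit records, including the location of each generating facility, the permit holder, the facility's installed capacity, and whether the business actually began operating after the permit was granted.
Author: 경상남도 함안군 (Haman-gun, Gyeongsangnam-do)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15113391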

Alerts

Dataset has 9 (7.6%) duplicate rows (Duplicates)
설비용량(kw) is highly overall correlated with 비고 (High correlation)
비고 is highly overall correlated with 설비용량(kw) (High correlation)

Reproduction

Analysis started: 2024-04-20 18:39:32.985058
Analysis finished: 2024-04-20 18:39:34.559601
Duration: 1.57 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
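
The reported duration follows directly from the two timestamps above. A minimal check using only the Python standard library (the variable names are illustrative, not part of the report):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section above.
started = datetime.fromisoformat("2024-04-20 18:39:32.985058")
finished = datetime.fromisoformat("2024-04-20 18:39:34.559601")

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```

Rounded to two decimals, the result agrees with the Duration line.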

Variables

위치 (location)
Text

Distinct: 56
Distinct (%): 47.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 19
Median length: 17
Mean length: 17.084746
Min length: 16

Characters and Unicode

Total characters: 2016
Distinct characters: 77
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
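
As a rough illustration of how such per-character properties can be read, the sketch below classifies the characters of the first sample value with Python's standard `unicodedata` module. It is not the profiler's own code; the helper name `category_counts` and the `CATEGORY_NAMES` mapping are assumptions made for this example.

```python
import unicodedata
from collections import Counter

# Readable names for the general categories that occur in this column.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Nd": "Decimal Number",
}

def category_counts(text):
    """Count characters per Unicode general category (e.g. 'Lo', 'Zs')."""
    text = unicodedata.normalize("NFC", text)  # keep Hangul as whole syllables
    return Counter(unicodedata.category(ch) for ch in text)

sample = unicodedata.normalize("NFC", "경상남도 함안군 칠서면 대치리")  # 1st sample row
counts = category_counts(sample)
for code, n in counts.items():
    print(CATEGORY_NAMES.get(code, code), n)

# The mean length equals total characters divided by the number of rows.
mean_length = 2016 / 118
```

The sample value is 16 characters long, which matches the reported minimum length, and 2016 / 118 reproduces the reported mean length.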

Unique

Unique: 38
Unique (%): 32.2%

Sample

1st row: 경상남도 함안군 칠서면 대치리
2nd row: 경상남도 함안군 법수면 강주리
3rd row: 경상남도 함안군 법수면 우거리
4th row: 경상남도 함안군 법수면 백산리
5th row: 경상남도 함안군 법수면 백산리
Value | Count | Frequency (%)
경상남도 | 118 | 25.0%
함안군 | 118 | 25.0%
법수면 | 48 | 10.2%
백산리 | 25 | 5.3%
가야읍 | 15 | 3.2%
군북면 | 13 | 2.8%
북실길 | 12 | 2.5%
칠북면 | 10 | 2.1%
대산면 | 9 | 1.9%
칠원읍 | 8 | 1.7%
Other values (52) | 96 | 20.3%
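
The word counts above are consistent with splitting each address on whitespace: the listed counts sum to 472, which is 118 rows times 4 tokens. The sketch below applies that assumed tokenization to the five sample rows only, so its counts are far smaller than the table's; the variable names are illustrative.

```python
from collections import Counter

# The five sample rows shown for this column.
sample_rows = [
    "경상남도 함안군 칠서면 대치리",
    "경상남도 함안군 법수면 강주리",
    "경상남도 함안군 법수면 우거리",
    "경상남도 함안군 법수면 백산리",
    "경상남도 함안군 법수면 백산리",
]

# Assumption: words are whitespace-separated tokens.
tokens = [tok for row in sample_rows for tok in row.split()]
word_counts = Counter(tokens)

for word, n in word_counts.most_common(4):
    print(word, n, f"{100 * n / len(tokens):.1f}%")

# Share of one word that appears in every row of the full table.
full_table_share = 100 * 118 / 472
```

Because every row begins with the same province and county, those two tokens each account for a quarter of all tokens, as in the table.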

Most occurring characters

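Value | Count | Frequency (%)
(space) | 462 | 22.9%
(not preserved) | 131 | 6.5%
(not preserved) | 122 | 6.1%
(not preserved) | 121 | 6.0%
(not preserved) | 118 | 5.9%
(not preserved) | 118 | 5.9%
(not preserved) | 118 | 5.9%
(not preserved) | 118 | 5.9%
(not preserved) | 95 | 4.7%
(not preserved) | 78 | 3.9%
Other values (67) | 535 | 26.5%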

Most occurring categories

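Value | Count | Frequency (%)
Other Letter | 1549 | 76.8%
Space Separator | 462 | 22.9%
Decimal Number | 5 | 0.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 131 | 8.5%
(not preserved) | 122 | 7.9%
(not preserved) | 121 | 7.8%
(not preserved) | 118 | 7.6%
(not preserved) | 118 | 7.6%
(not preserved) | 118 | 7.6%
(not preserved) | 118 | 7.6%
(not preserved) | 95 | 6.1%
(not preserved) | 78 | 5.0%
(not preserved) | 50 | 3.2%
Other values (64) | 480 | 31.0%

Decimal Number
Value | Count | Frequency (%)
1 | 4 | 80.0%
2 | 1 | 20.0%

Space Separator
Value | Count | Frequency (%)
(space) | 462 | 100.0%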

Most occurring scripts

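Value | Count | Frequency (%)
Hangul | 1549 | 76.8%
Common | 467 | 23.2%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 131 | 8.5%
(not preserved) | 122 | 7.9%
(not preserved) | 121 | 7.8%
(not preserved) | 118 | 7.6%
(not preserved) | 118 | 7.6%
(not preserved) | 118 | 7.6%
(not preserved) | 118 | 7.6%
(not preserved) | 95 | 6.1%
(not preserved) | 78 | 5.0%
(not preserved) | 50 | 3.2%
Other values (64) | 480 | 31.0%

Common
Value | Count | Frequency (%)
(space) | 462 | 98.9%
1 | 4 | 0.9%
2 | 1 | 0.2%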

Most occurring blocks

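Value | Count | Frequency (%)
Hangul | 1549 | 76.8%
ASCII | 467 | 23.2%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 462 | 98.9%
1 | 4 | 0.9%
2 | 1 | 0.2%

Hangul
Value | Count | Frequency (%)
(not preserved) | 131 | 8.5%
(not preserved) | 122 | 7.9%
(not preserved) | 121 | 7.8%
(not preserved) | 118 | 7.6%
(not preserved) | 118 | 7.6%
(not preserved) | 118 | 7.6%
(not preserved) | 118 | 7.6%
(not preserved) | 95 | 6.1%
(not preserved) | 78 | 5.0%
(not preserved) | 50 | 3.2%
Other values (64) | 480 | 31.0%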

성명 (name)
Text

Distinct: 83
Distinct (%): 70.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 3
Median length: 3
Mean length: 2.9915254
Min length: 2

Characters and Unicode

Total characters: 353
Distinct characters: 79
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
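
For this masked-name column, the block split can be illustrated with a simple code-point range test. The sketch below is an assumption-laden approximation rather than the profiler's implementation: `block_of` is a hypothetical helper that knows only the two blocks seen here, and it takes the report's "Hangul" label to mean the Hangul Syllables block (U+AC00 to U+D7AF).

```python
import unicodedata
from collections import Counter

def block_of(ch):
    """Coarse block label for one character (only the blocks seen in this column)."""
    cp = ord(ch)
    if cp < 0x80:
        return "ASCII"
    if 0xAC00 <= cp <= 0xD7AF:  # Hangul Syllables block
        return "Hangul"
    return "Other"

name = unicodedata.normalize("NFC", "김*국")  # 1st sample row of this column
block_counts = Counter(block_of(ch) for ch in name)
print(dict(block_counts))

# Consistency checks against the totals reported for the column.
mean_length = 353 / 118           # total characters / rows
asterisk_share = 100 * 118 / 353  # one masking asterisk per row
```

One asterisk per row explains why the ASCII, Common, and Other Punctuation counts are all 118.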

Unique

Unique: 62
Unique (%): 52.5%

Sample

1st row: 김*국
2nd row: 정*란
3rd row: 김*숙
4th row: 박*규
5th row: 박*규
Value | Count | Frequency (%)
정*찬 | 7 | 5.9%
정*란 | 5 | 4.2%
최*훈 | 4 | 3.4%
박*규 | 3 | 2.5%
조*래 | 3 | 2.5%
윤*국 | 3 | 2.5%
이*환 | 3 | 2.5%
허*회 | 2 | 1.7%
김*헌 | 2 | 1.7%
조*애 | 2 | 1.7%
Other values (73) | 84 | 71.2%

Most occurring characters

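Value | Count | Frequency (%)
* | 118 | 33.4%
(not preserved) | 20 | 5.7%
(not preserved) | 17 | 4.8%
(not preserved) | 14 | 4.0%
(not preserved) | 10 | 2.8%
(not preserved) | 8 | 2.3%
(not preserved) | 7 | 2.0%
(not preserved) | 7 | 2.0%
(not preserved) | 7 | 2.0%
(not preserved) | 7 | 2.0%
Other values (69) | 138 | 39.1%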

Most occurring categories

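Value | Count | Frequency (%)
Other Letter | 235 | 66.6%
Other Punctuation | 118 | 33.4%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 20 | 8.5%
(not preserved) | 17 | 7.2%
(not preserved) | 14 | 6.0%
(not preserved) | 10 | 4.3%
(not preserved) | 8 | 3.4%
(not preserved) | 7 | 3.0%
(not preserved) | 7 | 3.0%
(not preserved) | 7 | 3.0%
(not preserved) | 7 | 3.0%
(not preserved) | 7 | 3.0%
Other values (68) | 131 | 55.7%

Other Punctuation
Value | Count | Frequency (%)
* | 118 | 100.0%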

Most occurring scripts

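Value | Count | Frequency (%)
Hangul | 235 | 66.6%
Common | 118 | 33.4%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 20 | 8.5%
(not preserved) | 17 | 7.2%
(not preserved) | 14 | 6.0%
(not preserved) | 10 | 4.3%
(not preserved) | 8 | 3.4%
(not preserved) | 7 | 3.0%
(not preserved) | 7 | 3.0%
(not preserved) | 7 | 3.0%
(not preserved) | 7 | 3.0%
(not preserved) | 7 | 3.0%
Other values (68) | 131 | 55.7%

Common
Value | Count | Frequency (%)
* | 118 | 100.0%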

Most occurring blocks

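Value | Count | Frequency (%)
Hangul | 235 | 66.6%
ASCII | 118 | 33.4%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
* | 118 | 100.0%

Hangul
Value | Count | Frequency (%)
(not preserved) | 20 | 8.5%
(not preserved) | 17 | 7.2%
(not preserved) | 14 | 6.0%
(not preserved) | 10 | 4.3%
(not preserved) | 8 | 3.4%
(not preserved) | 7 | 3.0%
(not preserved) | 7 | 3.0%
(not preserved) | 7 | 3.0%
(not preserved) | 7 | 3.0%
(not preserved) | 7 | 3.0%
Other values (68) | 131 | 55.7%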

설비용량(kw) (installed capacity, kW)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 67
Distinct (%): 56.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 158.75292
Minimum: 9.6
Maximum: 999.92
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.2 KiB

Quantile statistics

Minimum: 9.6
5th percentile: 29.1105
Q1: 90
Median: 99.71
Q3: 99.91
95th percentile: 615.636
Maximum: 999.92
Range: 990.32
Interquartile range (IQR): 9.91

Descriptive statistics

Standard deviation: 183.33536
Coefficient of variation (CV): 1.1548472
Kurtosis: 8.5330253
Mean: 158.75292
Median Absolute Deviation (MAD): 9.71
Skewness: 2.901036
Sum: 18732.845
Variance: 33611.855
Monotonicity: Not monotonic
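
Several of the figures above are derived from one another, which gives a quick consistency check. The sketch below recomputes the derived values from the reported mean, standard deviation, quartiles, and extremes. The inputs are the report's rounded numbers, so agreement is approximate, and the variable names are illustrative.

```python
import math

# Values copied from the statistics above (already rounded by the report).
n = 118
mean = 158.75292
std = 183.33536
minimum, maximum = 9.6, 999.92
q1, q3 = 90.0, 99.91

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
total = mean * n                 # sum of all values
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range

print(f"CV={cv:.7f} variance={variance:.3f} sum={total:.3f}")
print(f"range={value_range:.2f} IQR={iqr:.2f}")
```

The narrow IQR (about 9.91) next to a range of roughly 990 is in line with the strong positive skew and the concentration of values just under 100 kW.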
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
99.84 | 19 | 16.1%
99.76 | 8 | 6.8%
98.88 | 6 | 5.1%
90.0 | 6 | 5.1%
85.5 | 6 | 5.1%
99.71 | 5 | 4.2%
97.0 | 4 | 3.4%
99.91 | 3 | 2.5%
99.2 | 2 | 1.7%
27.285 | 2 | 1.7%
Other values (57) | 57 | 48.3%

Minimum 10 values

Value | Count | Frequency (%)
9.6 | 1 | 0.8%
19.14 | 1 | 0.8%
19.2 | 1 | 0.8%
27.285 | 2 | 1.7%
28.32 | 1 | 0.8%
29.25 | 1 | 0.8%
29.96 | 1 | 0.8%
37.12 | 1 | 0.8%
40.32 | 1 | 0.8%
49.98 | 1 | 0.8%

Maximum 10 values

Value | Count | Frequency (%)
999.92 | 1 | 0.8%
960.29 | 1 | 0.8%
806.3 | 1 | 0.8%
769.08 | 1 | 0.8%
710.99 | 1 | 0.8%
656.64 | 1 | 0.8%
608.4 | 1 | 0.8%
499.2 | 1 | 0.8%
425.1 | 1 | 0.8%
381.24 | 1 | 0.8%

비고 (remarks)
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB
<NA>: 86
사업개시: 32

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: 사업개시
5th row: 사업개시

Common Values

Value | Count | Frequency (%)
<NA> | 86 | 72.9%
사업개시 (business commenced) | 32 | 27.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 86 | 72.9%
사업개시 | 32 | 27.1%

Interactions


Correlations

Variable | 위치 | 성명 | 설비용량(kw)
위치 | 1.000 | 0.991 | 0.988
성명 | 0.991 | 1.000 | 0.630
설비용량(kw) | 0.988 | 0.630 | 1.000

Variable | 설비용량(kw) | 비고
설비용량(kw) | 1.000 | 1.000
비고 | 1.000 | 1.000
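
The report does not say which coefficient underlies these matrices. ydata-profiling documents its default `auto` setting as using Cramér's V, with the numeric column discretized, for numeric-categorical pairs. Assuming that applies to the 설비용량(kw) and 비고 pair, the sketch below computes the plain, uncorrected Cramér's V on made-up toy data. The function name `cramers_v` and the toy lists are illustrative, and the library may apply a bias correction, so this is not expected to reproduce the report's values.

```python
from collections import Counter
from math import sqrt

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, row_n in row_totals.items():
        for y, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else float("nan")

# Toy data (hypothetical): capacity bins versus a two-level status flag.
bins = ["low", "low", "high", "high"]
perfect = ["a", "a", "b", "b"]      # status fully determined by the bin
unrelated = ["a", "b", "a", "b"]    # status independent of the bin

v_perfect = cramers_v(bins, perfect)
v_unrelated = cramers_v(bins, unrelated)
print(v_perfect, v_unrelated)
```

A value near 1 only says that one variable's bins largely determine the other's category. With a two-level category and a coarse discretization this can happen on small data, so the 1.000 above is worth checking against the raw rows before reading it as a substantive relationship.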

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Index | 위치 | 성명 | 설비용량(kw) | 비고
0 | 경상남도 함안군 칠서면 대치리 | 김*국 | 97.01 | <NA>
1 | 경상남도 함안군 법수면 강주리 | 정*란 | 192.64 | <NA>
2 | 경상남도 함안군 법수면 우거리 | 김*숙 | 499.2 | <NA>
3 | 경상남도 함안군 법수면 백산리 | 박*규 | 99.76 | 사업개시
4 | 경상남도 함안군 법수면 백산리 | 박*규 | 99.76 | 사업개시
5 | 경상남도 함안군 법수면 백산리 | 박*규 | 99.76 | 사업개시
6 | 경상남도 함안군 칠북면 북원로 | 정*란 | 234.78 | 사업개시
7 | 경상남도 함안군 칠원읍 오곡로 | 동* | 74.24 | 사업개시
8 | 경상남도 함안군 법수면 장백로 | 정*란 | 189.66 | <NA>
9 | 경상남도 함안군 군북면 석교천길 | 이*경 | 299.92 | 사업개시

Last rows

Index | 위치 | 성명 | 설비용량(kw) | 비고
108 | 경상남도 함안군 법수면 강주리 | 박*근 | 347.2 | <NA>
109 | 경상남도 함안군 칠북면 가연리 | 주*훈 | 91.04 | <NA>
110 | 경상남도 함안군 칠북면 가연리 | 주*훈 | 49.98 | <NA>
111 | 경상남도 함안군 법수면 황사리 | 이*형 | 70.8 | <NA>
112 | 경상남도 함안군 법수면 황사리 | 정*찬 | 322.92 | <NA>
113 | 경상남도 함안군 법수면 백산리 | 최*은 | 99.76 | <NA>
114 | 경상남도 함안군 법수면 백산리 | 조*래 | 99.76 | <NA>
115 | 경상남도 함안군 법수면 백산리 | 조*래 | 99.76 | <NA>
116 | 경상남도 함안군 산인면 신산리 | 나*상 | 63.8 | <NA>
117 | 경상남도 함안군 칠원읍 무기리 | 이*진 | 144.97 | <NA>

Duplicate rows

Most frequently occurring

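Index | 위치 | 성명 | 설비용량(kw) | 비고 | # duplicates
4 | 경상남도 함안군 군북면 하림리 | 윤*국 | 99.71 | <NA> | 3
6 | 경상남도 함안군 법수면 백산리 | 박*규 | 99.76 | 사업개시 | 3
0 | 경상남도 함안군 가야읍 북실길 | 강*식 | 85.5 | <NA> | 2
1 | 경상남도 함안군 가야읍 북실길 | 오*재 | 90.0 | <NA> | 2
2 | 경상남도 함안군 가야읍 북실길 | 이*호 | 90.0 | <NA> | 2
3 | 경상남도 함안군 가야읍 북실길 | 조*애 | 85.5 | <NA> | 2
5 | 경상남도 함안군 대산면 부목리 | 윤*수 | 99.91 | 사업개시 | 2
7 | 경상남도 함안군 법수면 백산리 | 조*래 | 99.76 | <NA> | 2
8 | 경상남도 함안군 법수면 윤외리 | 이*환 | 99.71 | <NA> | 2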
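
The table lists 9 distinct rows that each occur more than once, which matches the "9 (7.6%)" duplicate figure in the overview. This suggests that the figure counts duplicated groups rather than surplus copies. The sketch below, with illustrative variable names, counts repeated rows among the first six sample rows and then redoes the arithmetic from the group sizes in the table.

```python
from collections import Counter

# First six rows of the Sample table: (위치, 성명, 설비용량(kw), 비고).
rows = [
    ("경상남도 함안군 칠서면 대치리", "김*국", 97.01, "<NA>"),
    ("경상남도 함안군 법수면 강주리", "정*란", 192.64, "<NA>"),
    ("경상남도 함안군 법수면 우거리", "김*숙", 499.2, "<NA>"),
    ("경상남도 함안군 법수면 백산리", "박*규", 99.76, "사업개시"),
    ("경상남도 함안군 법수면 백산리", "박*규", 99.76, "사업개시"),
    ("경상남도 함안군 법수면 백산리", "박*규", 99.76, "사업개시"),
]

# Keep only the rows that appear more than once, with their counts.
duplicate_groups = {row: c for row, c in Counter(rows).items() if c > 1}

# Group sizes copied from the "# duplicates" column above.
group_sizes = [3, 3, 2, 2, 2, 2, 2, 2, 2]
n_groups = len(group_sizes)            # distinct duplicated rows
rows_involved = sum(group_sizes)       # rows belonging to any group
surplus_copies = rows_involved - n_groups
duplicate_pct = round(100 * n_groups / 118, 1)

print(n_groups, rows_involved, surplus_copies, duplicate_pct)
```

Under this reading, removing duplicates while keeping one copy of each group would drop 11 rows, not 9.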