Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 29988
Missing cells (%): 42.8%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 634.8 KiB
Average record size in memory: 65.0 B
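
The missing-cell percentage can be cross-checked against the other figures in this report. The sketch below is only a sanity check. It assumes, based on the Alerts section, that exactly three columns have 9996 missing values each and that the remaining columns have none. All variable names are illustrative.

```python
# Figures copied from this report; variable names are illustrative only.
n_rows = 10_000
n_columns = 7
missing_per_column = 9_996   # from the "Missing" alerts
columns_with_missing = 3     # three columns carry a "Missing" alert

missing_cells = missing_per_column * columns_with_missing
missing_pct = 100 * missing_cells / (n_rows * n_columns)

print(missing_cells)          # 29988
print(f"{missing_pct:.1f}%")  # 42.8%
```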

Variable types

Text: 2
Categorical: 4
DateTime: 1

Dataset

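Description: This dataset provides information on snow-removal supply boxes (제설함) in Uiryeong-gun, Gyeongsangnam-do. It covers the installation site (installation zone), location address, number of boxes, amount of calcium chloride stocked, and the managing agency's name and telephone number.
Author: Uiryeong-gun, Gyeongsangnam-do (경상남도 의령군)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15099440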

Alerts

데이터기준일 has constant value "" [Constant]
Dataset has 1 (< 0.1%) duplicate rows [Duplicates]
제설함 수 is highly overall correlated with 염화칼슘 비치량 and 2 other fields [High correlation]
관리기관 전화번호 is highly overall correlated with 제설함 수 and 2 other fields [High correlation]
관리기관명 is highly overall correlated with 제설함 수 and 2 other fields [High correlation]
염화칼슘 비치량 is highly overall correlated with 제설함 수 and 2 other fields [High correlation]
제설함 수 is highly imbalanced (99.5%) [Imbalance]
염화칼슘 비치량 is highly imbalanced (99.5%) [Imbalance]
관리기관명 is highly imbalanced (99.5%) [Imbalance]
관리기관 전화번호 is highly imbalanced (99.5%) [Imbalance]
설치장소명(설치구역) has 9996 (> 99.9%) missing values [Missing]
소재지 주소 has 9996 (> 99.9%) missing values [Missing]
데이터기준일 has 9996 (> 99.9%) missing values [Missing]
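
The 99.5% imbalance figure is consistent with a score defined as one minus the normalised Shannon entropy of the class counts. That this is the exact formula behind the alert is an assumption. The sketch below applies that definition to the two classes reported for each categorical column (9996 and 4 rows). The function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits.

    Assumed definition: 0 means perfectly balanced classes, 1 means a
    single class holds every row.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts reported for each categorical column: "<NA>" 9996, other value 4.
score = imbalance_score([9996, 4])
print(f"{score:.1%}")  # 99.5%
```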

Reproduction

Analysis started: 2023-12-10 23:01:41.307463
Analysis finished: 2023-12-10 23:01:42.135156
Duration: 0.83 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

설치장소명(설치구역) (installation site name (installation zone))
Text

MISSING

Distinct: 4
Distinct (%): 100.0%
Missing: 9996
Missing (%): > 99.9%
Memory size: 156.2 KiB

Length

Max length: 22
Median length: 21.5
Mean length: 21
Min length: 19

Characters and Unicode

Total characters: 84
Distinct characters: 33
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
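
As an illustration of what these character properties look like in practice, the sketch below classifies every character of the four non-missing values listed in the Sample block for this variable. It uses Python's standard `unicodedata` module. This is not necessarily how the profiler computes its figures, but the totals agree with the ones reported here.

```python
import unicodedata
from collections import Counter

# The four non-missing values listed in the Sample block of this variable.
values = [
    "경상남도 의령군 봉수면 삼가리산 76-1",
    "경상남도 의령군 유곡면 마두리137",
    "경상남도 의령군 의령읍 동동리산 48-4",
    "경상남도 의령군 화정면 덕교리산38-9",
]

text = "".join(values)

# General category per character: Lo = Other Letter, Zs = Space Separator,
# Nd = Decimal Number, Pd = Dash Punctuation.
categories = Counter(unicodedata.category(ch) for ch in text)

print(len(text))         # 84 total characters
print(len(set(text)))    # 33 distinct characters
print(categories["Lo"], categories["Zs"], categories["Nd"], categories["Pd"])  # 55 14 12 3
```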

Unique

Unique: 4
Unique (%): 100.0%

Sample

1st row: 경상남도 의령군 봉수면 삼가리산 76-1
2nd row: 경상남도 의령군 유곡면 마두리137
3rd row: 경상남도 의령군 의령읍 동동리산 48-4
4th row: 경상남도 의령군 화정면 덕교리산38-9
Value | Count | Frequency (%)
경상남도 | 4 | 22.2%
의령군 | 4 | 22.2%
봉수면 | 1 | 5.6%
삼가리산 | 1 | 5.6%
76-1 | 1 | 5.6%
유곡면 | 1 | 5.6%
마두리137 | 1 | 5.6%
의령읍 | 1 | 5.6%
동동리산 | 1 | 5.6%
48-4 | 1 | 5.6%
Other values (2) | 2 | 11.1%

Most occurring characters

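Value | Count | Frequency (%)
(space) | 14 | 16.7%
[character missing] | 5 | 6.0%
[character missing] | 5 | 6.0%
[character missing] | 4 | 4.8%
[character missing] | 4 | 4.8%
[character missing] | 4 | 4.8%
[character missing] | 4 | 4.8%
[character missing] | 4 | 4.8%
[character missing] | 4 | 4.8%
[character missing] | 3 | 3.6%
Other values (23) | 33 | 39.3%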

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 55 | 65.5%
Space Separator | 14 | 16.7%
Decimal Number | 12 | 14.3%
Dash Punctuation | 3 | 3.6%

Most frequent character per category

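Other Letter
Value | Count | Frequency (%)
[character missing] | 5 | 9.1%
[character missing] | 5 | 9.1%
[character missing] | 4 | 7.3%
[character missing] | 4 | 7.3%
[character missing] | 4 | 7.3%
[character missing] | 4 | 7.3%
[character missing] | 4 | 7.3%
[character missing] | 4 | 7.3%
[character missing] | 3 | 5.5%
[character missing] | 3 | 5.5%
Other values (14) | 15 | 27.3%

Decimal Number
Value | Count | Frequency (%)
3 | 2 | 16.7%
8 | 2 | 16.7%
1 | 2 | 16.7%
7 | 2 | 16.7%
4 | 2 | 16.7%
6 | 1 | 8.3%
9 | 1 | 8.3%

Space Separator
Value | Count | Frequency (%)
(space) | 14 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 3 | 100.0%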

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 55 | 65.5%
Common | 29 | 34.5%

Most frequent character per script

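Hangul
Value | Count | Frequency (%)
[character missing] | 5 | 9.1%
[character missing] | 5 | 9.1%
[character missing] | 4 | 7.3%
[character missing] | 4 | 7.3%
[character missing] | 4 | 7.3%
[character missing] | 4 | 7.3%
[character missing] | 4 | 7.3%
[character missing] | 4 | 7.3%
[character missing] | 3 | 5.5%
[character missing] | 3 | 5.5%
Other values (14) | 15 | 27.3%

Common
Value | Count | Frequency (%)
(space) | 14 | 48.3%
- | 3 | 10.3%
3 | 2 | 6.9%
8 | 2 | 6.9%
1 | 2 | 6.9%
7 | 2 | 6.9%
4 | 2 | 6.9%
6 | 1 | 3.4%
9 | 1 | 3.4%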
3.4%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 55 | 65.5%
ASCII | 29 | 34.5%

Most frequent character per block

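ASCII
Value | Count | Frequency (%)
(space) | 14 | 48.3%
- | 3 | 10.3%
3 | 2 | 6.9%
8 | 2 | 6.9%
1 | 2 | 6.9%
7 | 2 | 6.9%
4 | 2 | 6.9%
6 | 1 | 3.4%
9 | 1 | 3.4%

Hangul
Value | Count | Frequency (%)
[character missing] | 5 | 9.1%
[character missing] | 5 | 9.1%
[character missing] | 4 | 7.3%
[character missing] | 4 | 7.3%
[character missing] | 4 | 7.3%
[character missing] | 4 | 7.3%
[character missing] | 4 | 7.3%
[character missing] | 4 | 7.3%
[character missing] | 3 | 5.5%
[character missing] | 3 | 5.5%
Other values (14) | 15 | 27.3%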

소재지 주소 (location address)
Text

MISSING

Distinct: 4
Distinct (%): 100.0%
Missing: 9996
Missing (%): > 99.9%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 12.5
Mean length: 12
Min length: 10
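
These length statistics can be reproduced from the four non-missing values shown in the Sample block for this variable. The sketch below counts length in Unicode code points using the standard `statistics` module, on the assumption that this is how the profiler measures length.

```python
import statistics

# The four non-missing values listed in the Sample block of this variable.
values = [
    "봉수면 삼가리산 76-1",
    "유곡면 마두리137",
    "의령읍 동동리산 48-4",
    "화정면 덕교리산38-9",
]

# len() counts code points, so each Hangul syllable counts as one character.
lengths = [len(v) for v in values]

print(lengths)                     # [13, 10, 13, 12]
print(max(lengths))                # 13
print(statistics.median(lengths))  # 12.5
print(statistics.mean(lengths))    # 12
print(min(lengths))                # 10
```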

Characters and Unicode

Total characters: 48
Distinct characters: 28
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): 100.0%

Sample

1st row: 봉수면 삼가리산 76-1
2nd row: 유곡면 마두리137
3rd row: 의령읍 동동리산 48-4
4th row: 화정면 덕교리산38-9
Value | Count | Frequency (%)
봉수면 | 1 | 10.0%
삼가리산 | 1 | 10.0%
76-1 | 1 | 10.0%
유곡면 | 1 | 10.0%
마두리137 | 1 | 10.0%
의령읍 | 1 | 10.0%
동동리산 | 1 | 10.0%
48-4 | 1 | 10.0%
화정면 | 1 | 10.0%
덕교리산38-9 | 1 | 10.0%

Most occurring characters

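Value | Count | Frequency (%)
(space) | 6 | 12.5%
[character missing] | 4 | 8.3%
[character missing] | 3 | 6.2%
[character missing] | 3 | 6.2%
- | 3 | 6.2%
8 | 2 | 4.2%
3 | 2 | 4.2%
[character missing] | 2 | 4.2%
7 | 2 | 4.2%
4 | 2 | 4.2%
Other values (18) | 19 | 39.6%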

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 27 | 56.2%
Decimal Number | 12 | 25.0%
Space Separator | 6 | 12.5%
Dash Punctuation | 3 | 6.2%

Most frequent character per category

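Other Letter
Value | Count | Frequency (%)
[character missing] | 4 | 14.8%
[character missing] | 3 | 11.1%
[character missing] | 3 | 11.1%
[character missing] | 2 | 7.4%
[character missing] | 1 | 3.7%
[character missing] | 1 | 3.7%
[character missing] | 1 | 3.7%
[character missing] | 1 | 3.7%
[character missing] | 1 | 3.7%
[character missing] | 1 | 3.7%
Other values (9) | 9 | 33.3%

Decimal Number
Value | Count | Frequency (%)
8 | 2 | 16.7%
3 | 2 | 16.7%
7 | 2 | 16.7%
4 | 2 | 16.7%
1 | 2 | 16.7%
6 | 1 | 8.3%
9 | 1 | 8.3%

Space Separator
Value | Count | Frequency (%)
(space) | 6 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 3 | 100.0%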

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 27 | 56.2%
Common | 21 | 43.8%

Most frequent character per script

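Hangul
Value | Count | Frequency (%)
[character missing] | 4 | 14.8%
[character missing] | 3 | 11.1%
[character missing] | 3 | 11.1%
[character missing] | 2 | 7.4%
[character missing] | 1 | 3.7%
[character missing] | 1 | 3.7%
[character missing] | 1 | 3.7%
[character missing] | 1 | 3.7%
[character missing] | 1 | 3.7%
[character missing] | 1 | 3.7%
Other values (9) | 9 | 33.3%

Common
Value | Count | Frequency (%)
(space) | 6 | 28.6%
- | 3 | 14.3%
8 | 2 | 9.5%
3 | 2 | 9.5%
7 | 2 | 9.5%
4 | 2 | 9.5%
1 | 2 | 9.5%
6 | 1 | 4.8%
9 | 1 | 4.8%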

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 27 | 56.2%
ASCII | 21 | 43.8%

Most frequent character per block

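ASCII
Value | Count | Frequency (%)
(space) | 6 | 28.6%
- | 3 | 14.3%
8 | 2 | 9.5%
3 | 2 | 9.5%
7 | 2 | 9.5%
4 | 2 | 9.5%
1 | 2 | 9.5%
6 | 1 | 4.8%
9 | 1 | 4.8%

Hangul
Value | Count | Frequency (%)
[character missing] | 4 | 14.8%
[character missing] | 3 | 11.1%
[character missing] | 3 | 11.1%
[character missing] | 2 | 7.4%
[character missing] | 1 | 3.7%
[character missing] | 1 | 3.7%
[character missing] | 1 | 3.7%
[character missing] | 1 | 3.7%
[character missing] | 1 | 3.7%
[character missing] | 1 | 3.7%
Other values (9) | 9 | 33.3%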

제설함 수 (number of snow-removal boxes)
Categorical

HIGH CORRELATION  IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
<NA> | 9996
1 | 4

Length

Max length: 4
Median length: 4
Mean length: 3.9988
Min length: 1
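
The length figures above, together with a Missing count of 0, suggest that the literal text `<NA>` was profiled as an ordinary 4-character category rather than as a missing value. This is an inference from the numbers, not something the report states. The sketch below shows that the reported mean length follows from that reading, using the counts in the Common Values table for this variable.

```python
# Counts from the Common Values table of this variable.
value_counts = {"<NA>": 9996, "1": 4}

total = sum(value_counts.values())

# Weighted mean of string lengths, treating "<NA>" as a 4-character string.
mean_length = sum(len(value) * count for value, count in value_counts.items()) / total
print(mean_length)  # 3.9988

# If "<NA>" were treated as missing instead, only 4 rows would remain.
non_missing = total - value_counts["<NA>"]
print(non_missing)  # 4
```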

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 9996 | > 99.9%
1 | 4 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 9996 | > 99.9%
1 | 4 | < 0.1%

염화칼슘 비치량 (calcium chloride stock)
Categorical

HIGH CORRELATION  IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
<NA> | 9996
25kg 5포 | 4

Length

Max length: 7
Median length: 4
Mean length: 4.0012
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 9996 | > 99.9%
25kg 5포 | 4 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 9996 | 99.9%
25kg | 4 | < 0.1%
5포 | 4 | < 0.1%

관리기관명 (managing agency name)
Categorical

HIGH CORRELATION  IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
<NA> | 9996
의령군 건설과 | 4

Length

Max length: 7
Median length: 4
Mean length: 4.0012
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 9996 | > 99.9%
의령군 건설과 | 4 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 9996 | 99.9%
의령군 | 4 | < 0.1%
건설과 | 4 | < 0.1%

관리기관 전화번호 (managing agency phone number)
Categorical

HIGH CORRELATION  IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
<NA> | 9996
055-570-3614 | 4

Length

Max length: 12
Median length: 4
Mean length: 4.0032
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 9996 | > 99.9%
055-570-3614 | 4 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 9996 | > 99.9%
055-570-3614 | 4 | < 0.1%

데이터기준일 (data reference date)
Date

CONSTANT  MISSING

Distinct: 1
Distinct (%): 25.0%
Missing: 9996
Missing (%): > 99.9%
Memory size: 156.2 KiB
Minimum: 2023-03-12 00:00:00
Maximum: 2023-03-12 00:00:00
Histogram with fixed size bins (bins=1)

Correlations

Variable | 설치장소명(설치구역) | 소재지 주소
설치장소명(설치구역) | 1.000 | 1.000
소재지 주소 | 1.000 | 1.000

Variable | 제설함 수 | 관리기관 전화번호 | 관리기관명 | 염화칼슘 비치량
제설함 수 | 1.000 | 1.000 | 1.000 | 1.000
관리기관 전화번호 | 1.000 | 1.000 | 1.000 | 1.000
관리기관명 | 1.000 | 1.000 | 1.000 | 1.000
염화칼슘 비치량 | 1.000 | 1.000 | 1.000 | 1.000

Variable | 제설함 수 | 염화칼슘 비치량 | 관리기관명 | 관리기관 전화번호
제설함 수 | 1.000 | 1.000 | 1.000 | 1.000
염화칼슘 비치량 | 1.000 | 1.000 | 1.000 | 1.000
관리기관명 | 1.000 | 1.000 | 1.000 | 1.000
관리기관 전화번호 | 1.000 | 1.000 | 1.000 | 1.000
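
A coefficient of 1.000 between every pair of categorical columns is what a chi-square based association measure gives when two columns partition the rows identically. The sketch below computes plain (uncorrected) Cramér's V for one such pair. It is an illustration only: the report does not say which measure it uses, and the 2x2 table assumes the four populated rows are the same rows in both columns, which the Duplicate rows table (9996 all-`<NA>` rows) supports. The function name `cramers_v` is hypothetical.

```python
import math

def cramers_v(table):
    """Plain (uncorrected) Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals))
    return math.sqrt(chi2 / (n * (k - 1)))

# Rows: 제설함 수 in {"<NA>", "1"}; columns: 관리기관명 in {"<NA>", "의령군 건설과"}.
# Assumed: the 4 populated rows coincide, so the off-diagonal cells are 0.
table = [[9996, 0], [0, 4]]
print(round(cramers_v(table), 3))  # 1.0
```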

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

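Index | 설치장소명(설치구역) | 소재지 주소 | 제설함 수 | 염화칼슘 비치량 | 관리기관명 | 관리기관 전화번호 | 데이터기준일
19591 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
41026 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
87015 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
57013 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
68681 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
20135 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
64568 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
41030 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
8616 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
57148 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

Index | 설치장소명(설치구역) | 소재지 주소 | 제설함 수 | 염화칼슘 비치량 | 관리기관명 | 관리기관 전화번호 | 데이터기준일
88684 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
24570 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
15611 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
12558 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
1827 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
13365 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
35484 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
12221 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
6668 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
34217 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>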

Duplicate rows

Most frequently occurring

Index | 설치장소명(설치구역) | 소재지 주소 | 제설함 수 | 염화칼슘 비치량 | 관리기관명 | 관리기관 전화번호 | 데이터기준일 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 9996