Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 29979
Missing cells (%): 42.8%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 634.8 KiB
Average record size in memory: 65.0 B
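
The missing-cell figures can be checked against the per-variable counts reported later in this profile: three variables each have 9993 missing values, and the table holds 7 × 10000 cells. A minimal sketch of that arithmetic follows; the Python variable names are illustrative only.

```python
# Per-variable missing counts as reported in this profile.
missing_per_variable = {
    "설치장소명(설치구역)": 9993,
    "소재지 주소": 9993,
    "데이터기준일": 9993,
}
n_variables = 7
n_observations = 10_000

missing_cells = sum(missing_per_variable.values())
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)          # 29979
print(f"{missing_pct:.1f}%")  # 42.8%
```

The four categorical variables add nothing to this total because their empty entries appear to be stored as the literal string `<NA>`, which the profile counts as a regular value.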

Variable types

Text: 2
Categorical: 4
DateTime: 1

Dataset

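Description: This dataset provides information on snow-removal supply boxes (제설함) in Uiryeong-gun, Gyeongsangnam-do. It covers the installation site (installation zone), street address, number of boxes, quantity of calcium chloride stocked, and the name and telephone number of the managing agency.
Author: Uiryeong-gun, Gyeongsangnam-do (경상남도 의령군)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15099440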

Alerts

데이터기준일 has constant value "" (Constant)
Dataset has 1 (< 0.1%) duplicate rows (Duplicates)
제설함 수 is highly overall correlated with 염화칼슘 비치량 and 2 other fields (High correlation)
관리기관 전화번호 is highly overall correlated with 제설함 수 and 2 other fields (High correlation)
관리기관명 is highly overall correlated with 제설함 수 and 2 other fields (High correlation)
염화칼슘 비치량 is highly overall correlated with 제설함 수 and 2 other fields (High correlation)
제설함 수 is highly imbalanced (99.2%) (Imbalance)
염화칼슘 비치량 is highly imbalanced (99.2%) (Imbalance)
관리기관명 is highly imbalanced (99.2%) (Imbalance)
관리기관 전화번호 is highly imbalanced (99.2%) (Imbalance)
설치장소명(설치구역) has 9993 (99.9%) missing values (Missing)
소재지 주소 has 9993 (99.9%) missing values (Missing)
데이터기준일 has 9993 (99.9%) missing values (Missing)

Reproduction

Analysis started: 2023-12-10 23:01:46.636118
Analysis finished: 2023-12-10 23:01:47.484263
Duration: 0.85 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

설치장소명(설치구역)
Text

MISSING 

Distinct: 7
Distinct (%): 100.0%
Missing: 9993
Missing (%): 99.9%
Memory size: 156.2 KiB

Length

Max length: 14
Median length: 13
Mean length: 12.714286
Min length: 12

Characters and Unicode

Total characters: 89
Distinct characters: 34
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
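
As a minimal sketch of how such character properties can be read, Python's standard `unicodedata` module exposes the general category of each code point. The short codes it returns correspond to the category names used in this report (`Lo` = Other Letter, `Nd` = Decimal Number, `Zs` = Space Separator, `Pd` = Dash Punctuation). The helper name `category_counts` is made up for this illustration, and this is not necessarily how the profiling tool computes its figures.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category (e.g. 'Lo', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)


# First sample row shown for this variable.
sample = "부림면 경산리산64-1"
counts = category_counts(sample)

print(len(sample))   # 12
print(dict(counts))  # {'Lo': 7, 'Zs': 1, 'Nd': 3, 'Pd': 1}
```

The standard library does not report script or block membership, so the script and block breakdowns in this report would need a separate data source.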

Unique

Unique: 7
Unique (%): 100.0%

Sample

1st row: 부림면 경산리산64-1
2nd row: 화정면 상정리1360-1
3rd row: 용덕면 용소리산 76-1
4th row: 칠곡면 도산리산 179-2
5th row: 부림면 권혜리295-6

Value | Count | Frequency (%)
부림면 | 2 | 11.1%
179-2 | 1 | 5.6%
정곡면 | 1 | 5.6%
47-9 | 1 | 5.6%
 | 1 | 5.6%
중리 | 1 | 5.6%
의령읍 | 1 | 5.6%
권혜리295-6 | 1 | 5.6%
도산리산 | 1 | 5.6%
경산리산64-1 | 1 | 5.6%
Other values (7) | 7 | 38.9%

Most occurring characters

Value | Count | Frequency (%)
(space) | 11 | 12.4%
 | 7 | 7.9%
 | 7 | 7.9%
- | 7 | 7.9%
 | 6 | 6.7%
1 | 6 | 6.7%
9 | 4 | 4.5%
6 | 4 | 4.5%
 | 3 | 3.4%
7 | 3 | 3.4%
Other values (24) | 31 | 34.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 46 | 51.7%
Decimal Number | 25 | 28.1%
Space Separator | 11 | 12.4%
Dash Punctuation | 7 | 7.9%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 7 | 15.2%
 | 7 | 15.2%
 | 6 | 13.0%
 | 3 | 6.5%
 | 2 | 4.3%
 | 2 | 4.3%
 | 2 | 4.3%
 | 2 | 4.3%
 | 1 | 2.2%
 | 1 | 2.2%
Other values (13) | 13 | 28.3%

Decimal Number
Value | Count | Frequency (%)
1 | 6 | 24.0%
9 | 4 | 16.0%
6 | 4 | 16.0%
7 | 3 | 12.0%
0 | 2 | 8.0%
4 | 2 | 8.0%
2 | 2 | 8.0%
5 | 1 | 4.0%
3 | 1 | 4.0%

Space Separator
Value | Count | Frequency (%)
(space) | 11 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 7 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 46 | 51.7%
Common | 43 | 48.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 7 | 15.2%
 | 7 | 15.2%
 | 6 | 13.0%
 | 3 | 6.5%
 | 2 | 4.3%
 | 2 | 4.3%
 | 2 | 4.3%
 | 2 | 4.3%
 | 1 | 2.2%
 | 1 | 2.2%
Other values (13) | 13 | 28.3%

Common
Value | Count | Frequency (%)
(space) | 11 | 25.6%
- | 7 | 16.3%
1 | 6 | 14.0%
9 | 4 | 9.3%
6 | 4 | 9.3%
7 | 3 | 7.0%
0 | 2 | 4.7%
4 | 2 | 4.7%
2 | 2 | 4.7%
5 | 1 | 2.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 46 | 51.7%
ASCII | 43 | 48.3%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 11 | 25.6%
- | 7 | 16.3%
1 | 6 | 14.0%
9 | 4 | 9.3%
6 | 4 | 9.3%
7 | 3 | 7.0%
0 | 2 | 4.7%
4 | 2 | 4.7%
2 | 2 | 4.7%
5 | 1 | 2.3%

Hangul
Value | Count | Frequency (%)
 | 7 | 15.2%
 | 7 | 15.2%
 | 6 | 13.0%
 | 3 | 6.5%
 | 2 | 4.3%
 | 2 | 4.3%
 | 2 | 4.3%
 | 2 | 4.3%
 | 1 | 2.2%
 | 1 | 2.2%
Other values (13) | 13 | 28.3%

소재지 주소
Text

MISSING 

Distinct: 7
Distinct (%): 100.0%
Missing: 9993
Missing (%): 99.9%
Memory size: 156.2 KiB

Length

Max length: 14
Median length: 13
Mean length: 12.714286
Min length: 12

Characters and Unicode

Total characters: 89
Distinct characters: 34
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7
Unique (%): 100.0%

Sample

1st row: 부림면 경산리산64-1
2nd row: 화정면 상정리1360-1
3rd row: 용덕면 용소리산 76-1
4th row: 칠곡면 도산리산 179-2
5th row: 부림면 권혜리295-6

Value | Count | Frequency (%)
부림면 | 2 | 11.1%
179-2 | 1 | 5.6%
정곡면 | 1 | 5.6%
47-9 | 1 | 5.6%
 | 1 | 5.6%
중리 | 1 | 5.6%
의령읍 | 1 | 5.6%
권혜리295-6 | 1 | 5.6%
도산리산 | 1 | 5.6%
경산리산64-1 | 1 | 5.6%
Other values (7) | 7 | 38.9%

Most occurring characters

Value | Count | Frequency (%)
(space) | 11 | 12.4%
 | 7 | 7.9%
 | 7 | 7.9%
- | 7 | 7.9%
 | 6 | 6.7%
1 | 6 | 6.7%
9 | 4 | 4.5%
6 | 4 | 4.5%
 | 3 | 3.4%
7 | 3 | 3.4%
Other values (24) | 31 | 34.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 46 | 51.7%
Decimal Number | 25 | 28.1%
Space Separator | 11 | 12.4%
Dash Punctuation | 7 | 7.9%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 7 | 15.2%
 | 7 | 15.2%
 | 6 | 13.0%
 | 3 | 6.5%
 | 2 | 4.3%
 | 2 | 4.3%
 | 2 | 4.3%
 | 2 | 4.3%
 | 1 | 2.2%
 | 1 | 2.2%
Other values (13) | 13 | 28.3%

Decimal Number
Value | Count | Frequency (%)
1 | 6 | 24.0%
9 | 4 | 16.0%
6 | 4 | 16.0%
7 | 3 | 12.0%
0 | 2 | 8.0%
4 | 2 | 8.0%
2 | 2 | 8.0%
5 | 1 | 4.0%
3 | 1 | 4.0%

Space Separator
Value | Count | Frequency (%)
(space) | 11 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 7 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 46 | 51.7%
Common | 43 | 48.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 7 | 15.2%
 | 7 | 15.2%
 | 6 | 13.0%
 | 3 | 6.5%
 | 2 | 4.3%
 | 2 | 4.3%
 | 2 | 4.3%
 | 2 | 4.3%
 | 1 | 2.2%
 | 1 | 2.2%
Other values (13) | 13 | 28.3%

Common
Value | Count | Frequency (%)
(space) | 11 | 25.6%
- | 7 | 16.3%
1 | 6 | 14.0%
9 | 4 | 9.3%
6 | 4 | 9.3%
7 | 3 | 7.0%
0 | 2 | 4.7%
4 | 2 | 4.7%
2 | 2 | 4.7%
5 | 1 | 2.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 46 | 51.7%
ASCII | 43 | 48.3%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 11 | 25.6%
- | 7 | 16.3%
1 | 6 | 14.0%
9 | 4 | 9.3%
6 | 4 | 9.3%
7 | 3 | 7.0%
0 | 2 | 4.7%
4 | 2 | 4.7%
2 | 2 | 4.7%
5 | 1 | 2.3%

Hangul
Value | Count | Frequency (%)
 | 7 | 15.2%
 | 7 | 15.2%
 | 6 | 13.0%
 | 3 | 6.5%
 | 2 | 4.3%
 | 2 | 4.3%
 | 2 | 4.3%
 | 2 | 4.3%
 | 1 | 2.2%
 | 1 | 2.2%
Other values (13) | 13 | 28.3%

제설함 수
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
<NA> | 9993
1 | 7

Length

Max length: 4
Median length: 4
Mean length: 3.9979
Min length: 1
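
These length figures are consistent with the empty entries being stored as the four-character string `<NA>` rather than as true missing values, which would also explain why this variable reports 0 missing. The sketch below rebuilds the column from the counts in this report, checks the mean length, and shows one possible cleanup step. The `PLACEHOLDERS` set is an assumption for illustration, not part of the dataset or of the profiling tool.

```python
from statistics import fmean

# Rebuild the column from the counts in this report:
# 9993 placeholder strings and 7 real values of "1".
values = ["<NA>"] * 9993 + ["1"] * 7

mean_length = fmean(len(v) for v in values)
print(round(mean_length, 4))  # 3.9979

# Hypothetical cleanup: map placeholder strings to None before profiling,
# so that they are counted as missing instead of as a category.
PLACEHOLDERS = {"<NA>", ""}
cleaned = [None if v in PLACEHOLDERS else v for v in values]
n_missing = sum(v is None for v in cleaned)
print(n_missing)  # 9993
```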

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 9993 | 99.9%
1 | 7 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 9993 | 99.9%
1 | 7 | 0.1%
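
The report does not state how the 99.2% imbalance figure for this variable is derived. One common definition is one minus the normalized Shannon entropy of the class frequencies, and under that assumption the counts above (9993 and 7) reproduce the reported value. The function name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k): H is the Shannon entropy in bits, k the class count."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no imbalance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


score = imbalance_score([9993, 7])
print(f"{100 * score:.1f}%")  # 99.2%
```

A score of 0 means the classes are evenly represented, and a score close to 1 means a single class dominates, as it does here.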

염화칼슘 비치량
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
<NA> | 9993
25kg 5포 | 7

Length

Max length: 7
Median length: 4
Mean length: 4.0021
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 9993 | 99.9%
25kg 5포 | 7 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 9993 | 99.9%
25kg | 7 | 0.1%
5포 | 7 | 0.1%

관리기관명
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
<NA> | 9993
의령군 건설과 | 7

Length

Max length: 7
Median length: 4
Mean length: 4.0021
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 9993 | 99.9%
의령군 건설과 | 7 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 9993 | 99.9%
의령군 | 7 | 0.1%
건설과 | 7 | 0.1%

관리기관 전화번호
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
<NA> | 9993
055-570-3614 | 7

Length

Max length: 12
Median length: 4
Mean length: 4.0056
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 9993 | 99.9%
055-570-3614 | 7 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 9993 | 99.9%
055-570-3614 | 7 | 0.1%

데이터기준일
Date

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 14.3%
Missing: 9993
Missing (%): 99.9%
Memory size: 156.2 KiB
Minimum: 2022-03-18 00:00:00
Maximum: 2022-03-18 00:00:00
Histogram with fixed size bins (bins=1)

Correlations

 | 설치장소명(설치구역) | 소재지 주소
설치장소명(설치구역) | 1.000 | 1.000
소재지 주소 | 1.000 | 1.000

 | 제설함 수 | 관리기관 전화번호 | 관리기관명 | 염화칼슘 비치량
제설함 수 | 1.000 | 1.000 | 1.000 | 1.000
관리기관 전화번호 | 1.000 | 1.000 | 1.000 | 1.000
관리기관명 | 1.000 | 1.000 | 1.000 | 1.000
염화칼슘 비치량 | 1.000 | 1.000 | 1.000 | 1.000

 | 제설함 수 | 염화칼슘 비치량 | 관리기관명 | 관리기관 전화번호
제설함 수 | 1.000 | 1.000 | 1.000 | 1.000
염화칼슘 비치량 | 1.000 | 1.000 | 1.000 | 1.000
관리기관명 | 1.000 | 1.000 | 1.000 | 1.000
관리기관 전화번호 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
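
A minimal sketch of the idea, assuming nullity correlation is the Pearson correlation between per-column "value is present" indicators. The function name and the toy rows are hypothetical and are not taken from the dataset.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 'is present' indicators of two columns."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # a column that is always (or never) present
    return cov / sqrt(var_a * var_b)


# Toy columns (hypothetical): the first two are missing together,
# the third is missing independently of the first.
address = ["부림면 경산리산64-1", None, None, "화정면 상정리1360-1"]
base_date = ["2022-03-18", None, None, "2022-03-18"]
phone = ["055-570-3614", "055-570-3614", None, None]

print(nullity_correlation(address, base_date))  # 1.0
print(nullity_correlation(address, phone))      # 0.0
```

A value of 1 means two columns are always present or absent together, which is the pattern the three variables flagged as Missing in this profile would show.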

Sample

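 | 설치장소명(설치구역) | 소재지 주소 | 제설함 수 | 염화칼슘 비치량 | 관리기관명 | 관리기관 전화번호 | 데이터기준일
26249 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
72935 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
43642 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
82751 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
34018 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
89631 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
14572 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
48665 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
78091 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
89128 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

 | 설치장소명(설치구역) | 소재지 주소 | 제설함 수 | 염화칼슘 비치량 | 관리기관명 | 관리기관 전화번호 | 데이터기준일
60940 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
4737 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
3941 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
50837 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
98368 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
58593 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
57485 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
54499 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
93090 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
13209 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>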

Duplicate rows

Most frequently occurring

 | 설치장소명(설치구역) | 소재지 주소 | 제설함 수 | 염화칼슘 비치량 | 관리기관명 | 관리기관 전화번호 | 데이터기준일 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 9993
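
The table shows a single row pattern, empty in every field, that occurs 9993 times. A minimal sketch of how such patterns can be counted, and how fully empty rows might be dropped before analysis, is given below. The toy rows use only two columns and are hypothetical.

```python
from collections import Counter

# Toy rows (hypothetical): two populated records and three fully empty ones.
rows = [
    ("부림면 경산리산64-1", "1"),
    (None, None),
    (None, None),
    (None, None),
    ("화정면 상정리1360-1", "1"),
]

# Count identical rows and keep only patterns that occur more than once.
pattern_counts = Counter(rows)
duplicates = {row: n for row, n in pattern_counts.items() if n > 1}
print(duplicates)      # {(None, None): 3}

# Drop rows in which every field is empty.
non_empty = [row for row in rows if any(v is not None for v in row)]
print(len(non_empty))  # 2
```

Applied to this dataset, a filter of this kind would leave the 7 populated records, provided the `<NA>` placeholder strings are first converted to real missing values.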