Overview

Dataset statistics

Number of variables: 6
Number of observations: 40
Missing cells: 40
Missing cells (%): 16.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.1 KiB
Average record size in memory: 53.3 B
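
The missing-cell percentage follows directly from the counts in this table. A minimal sketch of the arithmetic, assuming the percentage is simply missing cells divided by variables times observations (the variable names are illustrative, not taken from the tool):

```python
# Figures copied from the Dataset statistics table.
n_variables = 6
n_observations = 40
missing_cells = 40

total_cells = n_variables * n_observations       # every cell in the table
missing_pct = 100 * missing_cells / total_cells  # share of empty cells

print(f"{missing_cells} of {total_cells} cells missing ({missing_pct:.1f}%)")
```

All 40 missing cells come from a single fully empty column, which is why the share equals one sixth.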

Variable types

Text: 2
Categorical: 3
Unsupported: 1

Dataset

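Description: Data on the status of drinking fountains in Yuseong-gu, Daejeon Metropolitan City. It provides the fields 공원명 (park name), 소재지지번주소 (lot-number address), 수량 (quantity), and 용수별 (water source).
Author: Yuseong-gu, Daejeon Metropolitan City (대전광역시 유성구)
URL: https://www.data.go.kr/data/15089341/fileData.do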

Alerts

종류 is highly overall correlated with 용수별 (High correlation)
용수별 is highly overall correlated with 종류 (High correlation)
수량 is highly imbalanced (60.0%) (Imbalance)
용수별 is highly imbalanced (71.4%) (Imbalance)
Unnamed: 5 has 40 (100.0%) missing values (Missing)
공원명 has unique values (Unique)
Unnamed: 5 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
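
The report does not spell out how the imbalance percentages are computed. One formula that happens to reproduce both figures from the value counts listed later in the report (35/4/1 for 수량, 38/2 for 용수별) is one minus the normalized Shannon entropy. The sketch below assumes that formula; the function name `imbalance_score` is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits of the
    class frequencies and k is the number of classes.
    0 means perfectly balanced; values near 1 mean one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

quantity_counts = [35, 4, 1]   # value counts reported for 수량
water_source_counts = [38, 2]  # value counts reported for 용수별

print(f"수량: {imbalance_score(quantity_counts):.1%}")
print(f"용수별: {imbalance_score(water_source_counts):.1%}")
```

Under this assumption the two scores round to 60.0% and 71.4%, matching the alerts.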

Reproduction

Analysis started: 2023-12-12 18:06:52.892171
Analysis finished: 2023-12-12 18:06:53.369480
Duration: 0.48 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
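
The reported duration can be checked against the two timestamps. A small sketch using only the standard library; the variable names are illustrative:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-12 18:06:52.892171")
finished = datetime.fromisoformat("2023-12-12 18:06:53.369480")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```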

Variables

공원명
Text

UNIQUE 

Distinct: 40
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 452.0 B

Length

Max length: 7
Median length: 3
Mean length: 3.275
Min length: 2

Characters and Unicode

Total characters: 131
Distinct characters: 82
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
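
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories over the five sample values shown for this variable, not the full column, so its totals are smaller than the report's. Script and block properties are not available in `unicodedata` and are left out. The `CATEGORY_NAMES` mapping is an illustrative helper covering only the categories this report mentions.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}

def category_counts(values):
    """Count Unicode general categories over all characters in the values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

sample_names = ["성두산", "엑스포", "송강", "진잠", "덕명1"]  # the five sample rows
counts = category_counts(sample_names)

for code, count in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```

Hangul syllables fall under Other Letter and ASCII digits under Decimal Number, which matches the categories the report lists for this column.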

Unique

Unique: 40
Unique (%): 100.0%

Sample

1st row: 성두산
2nd row: 엑스포
3rd row: 송강
4th row: 진잠
5th row: 덕명1

Value | Count | Frequency (%)
성두산 | 1 | 2.5%
엑스포 | 1 | 2.5%
문지 | 1 | 2.5%
배울골 | 1 | 2.5%
아랫관들 | 1 | 2.5%
윗관들 | 1 | 2.5%
방아다리 | 1 | 2.5%
용산골 | 1 | 2.5%
오랭이 | 1 | 2.5%
강변 | 1 | 2.5%
Other values (30) | 30 | 75.0%

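Most occurring characters

Value | Count | Frequency (%)
[not preserved] | 4 | 3.1%
) | 4 | 3.1%
( | 4 | 3.1%
[not preserved] | 4 | 3.1%
[not preserved] | 4 | 3.1%
[not preserved] | 4 | 3.1%
[not preserved] | 3 | 2.3%
[not preserved] | 3 | 2.3%
[not preserved] | 3 | 2.3%
[not preserved] | 3 | 2.3%
Other values (72) | 95 | 72.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 118 | 90.1%
Decimal Number | 5 | 3.8%
Close Punctuation | 4 | 3.1%
Open Punctuation | 4 | 3.1%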

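Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[not preserved] | 4 | 3.4%
[not preserved] | 4 | 3.4%
[not preserved] | 4 | 3.4%
[not preserved] | 4 | 3.4%
[not preserved] | 3 | 2.5%
[not preserved] | 3 | 2.5%
[not preserved] | 3 | 2.5%
[not preserved] | 3 | 2.5%
[not preserved] | 3 | 2.5%
[not preserved] | 3 | 2.5%
Other values (66) | 84 | 71.2%

Decimal Number
Value | Count | Frequency (%)
1 | 2 | 40.0%
5 | 1 | 20.0%
3 | 1 | 20.0%
6 | 1 | 20.0%

Close Punctuation
Value | Count | Frequency (%)
) | 4 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 4 | 100.0%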

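Most occurring scripts

Value | Count | Frequency (%)
Hangul | 118 | 90.1%
Common | 13 | 9.9%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[not preserved] | 4 | 3.4%
[not preserved] | 4 | 3.4%
[not preserved] | 4 | 3.4%
[not preserved] | 4 | 3.4%
[not preserved] | 3 | 2.5%
[not preserved] | 3 | 2.5%
[not preserved] | 3 | 2.5%
[not preserved] | 3 | 2.5%
[not preserved] | 3 | 2.5%
[not preserved] | 3 | 2.5%
Other values (66) | 84 | 71.2%

Common
Value | Count | Frequency (%)
) | 4 | 30.8%
( | 4 | 30.8%
1 | 2 | 15.4%
5 | 1 | 7.7%
3 | 1 | 7.7%
6 | 1 | 7.7%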

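Most occurring blocks

Value | Count | Frequency (%)
Hangul | 118 | 90.1%
ASCII | 13 | 9.9%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[not preserved] | 4 | 3.4%
[not preserved] | 4 | 3.4%
[not preserved] | 4 | 3.4%
[not preserved] | 4 | 3.4%
[not preserved] | 3 | 2.5%
[not preserved] | 3 | 2.5%
[not preserved] | 3 | 2.5%
[not preserved] | 3 | 2.5%
[not preserved] | 3 | 2.5%
[not preserved] | 3 | 2.5%
Other values (66) | 84 | 71.2%

ASCII
Value | Count | Frequency (%)
) | 4 | 30.8%
( | 4 | 30.8%
1 | 2 | 15.4%
5 | 1 | 7.7%
3 | 1 | 7.7%
6 | 1 | 7.7%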

종류
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 17.5%
Missing: 0
Missing (%): 0.0%
Memory size: 452.0 B

Length

Max length: 5
Median length: 4
Mean length: 4.4
Min length: 4

Unique

Unique: 3
Unique (%): 7.5%

Sample

1st row: 근린공원
2nd row: 근린공원
3rd row: 근린공원
4th row: 근린공원
5th row: 근린공원

Common Values

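Value | Count | Frequency (%)
어린이공원 (children's park) | 16 | 40.0%
근린공원 (neighborhood park) | 15 | 37.5%
여가녹지 (leisure green space) | 4 | 10.0%
문화공원 (cultural park) | 2 | 5.0%
수변공원 (waterfront park) | 1 | 2.5%
산림욕장 (forest bathing area) | 1 | 2.5%
희망쉼터 (hope rest area) | 1 | 2.5%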
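
The Distinct, Unique, and Length figures reported for 종류 can be re-derived from these value counts alone. The sketch below assumes that "Unique" means values occurring exactly once and that length is counted in characters; both assumptions are consistent with the numbers in this report.

```python
from statistics import median

# Value counts copied from the Common Values table for 종류.
value_counts = {
    "어린이공원": 16,
    "근린공원": 15,
    "여가녹지": 4,
    "문화공원": 2,
    "수변공원": 1,
    "산림욕장": 1,
    "희망쉼터": 1,
}

n = sum(value_counts.values())                            # observations
distinct = len(value_counts)                              # different values
unique = sum(1 for c in value_counts.values() if c == 1)  # values seen once

# Expand the counts into one length per observation.
lengths = [len(value) for value, count in value_counts.items() for _ in range(count)]
mean_length = sum(lengths) / n

print(f"Distinct {distinct} ({100 * distinct / n:.1f}%)")
print(f"Unique {unique} ({100 * unique / n:.1f}%)")
print(f"Length max {max(lengths)}, median {median(lengths)}, "
      f"mean {mean_length:.1f}, min {min(lengths)}")
```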

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

소재지지번주소
Text

Distinct: 39
Distinct (%): 97.5%
Missing: 0
Missing (%): 0.0%
Memory size: 452.0 B

Length

Max length: 24
Median length: 22
Mean length: 18.425
Min length: 15

Characters and Unicode

Total characters: 737
Distinct characters: 69
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 38
Unique (%): 95.0%

Sample

1st row: 대전광역시 유성구 구성동 20
2nd row: 대전광역시 유성구 전민동 467-3
3rd row: 대전광역시 유성구 송강동 9
4th row: 대전광역시 유성구 원내동 358
5th row: 대전광역시 유성구 덕명동 509

Value | Count | Frequency (%)
대전광역시 | 40 | 24.1%
유성구 | 40 | 24.1%
관평동 | 6 | 3.6%
덕명동 | 5 | 3.0%
봉명동 | 3 | 1.8%
용산동 | 3 | 1.8%
문지동 | 3 | 1.8%
인근 | 2 | 1.2%
667 | 2 | 1.2%
1필 | 2 | 1.2%
Other values (56) | 60 | 36.1%
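
This table looks like a count of whitespace-separated words rather than whole addresses: 40 occurrences at 24.1% implies about 166 words in total, and 166 words across 40 rows would leave 126 separating spaces, which is the space count reported for this column. A minimal sketch of such a word count, assuming plain whitespace splitting and using only the five sample addresses (so the counts are far smaller than the report's):

```python
from collections import Counter

# The five sample rows shown for 소재지지번주소.
sample_addresses = [
    "대전광역시 유성구 구성동 20",
    "대전광역시 유성구 전민동 467-3",
    "대전광역시 유성구 송강동 9",
    "대전광역시 유성구 원내동 358",
    "대전광역시 유성구 덕명동 509",
]

# Split each address on whitespace and count every resulting word.
word_counts = Counter(
    word for address in sample_addresses for word in address.split()
)
total_words = sum(word_counts.values())

for word, count in word_counts.most_common(3):
    print(word, count, f"{100 * count / total_words:.1f}%")
```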

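Most occurring characters

Value | Count | Frequency (%)
(space) | 126 | 17.1%
[not preserved] | 44 | 6.0%
[not preserved] | 43 | 5.8%
[not preserved] | 42 | 5.7%
[not preserved] | 41 | 5.6%
[not preserved] | 41 | 5.6%
[not preserved] | 40 | 5.4%
[not preserved] | 40 | 5.4%
[not preserved] | 40 | 5.4%
[not preserved] | 40 | 5.4%
Other values (59) | 240 | 32.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 480 | 65.1%
Space Separator | 126 | 17.1%
Decimal Number | 120 | 16.3%
Dash Punctuation | 8 | 1.1%
Other Punctuation | 3 | 0.4%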

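Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[not preserved] | 44 | 9.2%
[not preserved] | 43 | 9.0%
[not preserved] | 42 | 8.8%
[not preserved] | 41 | 8.5%
[not preserved] | 41 | 8.5%
[not preserved] | 40 | 8.3%
[not preserved] | 40 | 8.3%
[not preserved] | 40 | 8.3%
[not preserved] | 40 | 8.3%
[not preserved] | 8 | 1.7%
Other values (46) | 101 | 21.0%

Decimal Number
Value | Count | Frequency (%)
1 | 21 | 17.5%
7 | 17 | 14.2%
4 | 15 | 12.5%
5 | 14 | 11.7%
0 | 12 | 10.0%
8 | 10 | 8.3%
2 | 10 | 8.3%
6 | 9 | 7.5%
9 | 6 | 5.0%
3 | 6 | 5.0%

Space Separator
Value | Count | Frequency (%)
(space) | 126 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 8 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
@ | 3 | 100.0%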

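Most occurring scripts

Value | Count | Frequency (%)
Hangul | 480 | 65.1%
Common | 257 | 34.9%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[not preserved] | 44 | 9.2%
[not preserved] | 43 | 9.0%
[not preserved] | 42 | 8.8%
[not preserved] | 41 | 8.5%
[not preserved] | 41 | 8.5%
[not preserved] | 40 | 8.3%
[not preserved] | 40 | 8.3%
[not preserved] | 40 | 8.3%
[not preserved] | 40 | 8.3%
[not preserved] | 8 | 1.7%
Other values (46) | 101 | 21.0%

Common
Value | Count | Frequency (%)
(space) | 126 | 49.0%
1 | 21 | 8.2%
7 | 17 | 6.6%
4 | 15 | 5.8%
5 | 14 | 5.4%
0 | 12 | 4.7%
8 | 10 | 3.9%
2 | 10 | 3.9%
6 | 9 | 3.5%
- | 8 | 3.1%
Other values (3) | 15 | 5.8%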

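Most occurring blocks

Value | Count | Frequency (%)
Hangul | 480 | 65.1%
ASCII | 257 | 34.9%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 126 | 49.0%
1 | 21 | 8.2%
7 | 17 | 6.6%
4 | 15 | 5.8%
5 | 14 | 5.4%
0 | 12 | 4.7%
8 | 10 | 3.9%
2 | 10 | 3.9%
6 | 9 | 3.5%
- | 8 | 3.1%
Other values (3) | 15 | 5.8%

Hangul
Value | Count | Frequency (%)
[not preserved] | 44 | 9.2%
[not preserved] | 43 | 9.0%
[not preserved] | 42 | 8.8%
[not preserved] | 41 | 8.5%
[not preserved] | 41 | 8.5%
[not preserved] | 40 | 8.3%
[not preserved] | 40 | 8.3%
[not preserved] | 40 | 8.3%
[not preserved] | 40 | 8.3%
[not preserved] | 8 | 1.7%
Other values (46) | 101 | 21.0%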

수량
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): 7.5%
Missing: 0
Missing (%): 0.0%
Memory size: 452.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 2.5%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 35 | 87.5%
2 | 4 | 10.0%
3 | 1 | 2.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

용수별
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 452.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 상수도
2nd row: 상수도
3rd row: 상수도
4th row: 상수도
5th row: 상수도

Common Values

Value | Count | Frequency (%)
상수도 (municipal tap water) | 38 | 95.0%
지하수 (groundwater) | 2 | 5.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Unnamed: 5
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 40
Missing (%): 100.0%
Memory size: 492.0 B
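
A column named `Unnamed: 5` is what pandas assigns to a column with no header text. One plausible cause, which is an assumption here since the raw file is not shown, is a trailing delimiter on every line of the source CSV. The sketch below uses a hypothetical two-row excerpt with that trailing comma and drops any column whose header and values are all empty, using only the standard library.

```python
import csv
import io

# Hypothetical excerpt: the trailing comma on each line is an assumption
# about the raw file, not something the report confirms.
raw = (
    "공원명,종류,소재지지번주소,수량,용수별,\n"
    "성두산,근린공원,대전광역시 유성구 구성동 20,1,상수도,\n"
    "엑스포,근린공원,대전광역시 유성구 전민동 467-3,1,상수도,\n"
)

rows = list(csv.reader(io.StringIO(raw)))
header, records = rows[0], rows[1:]

# Keep a column if its header or any of its values contains text.
keep = [
    i for i, name in enumerate(header)
    if name.strip() or any(record[i].strip() for record in records)
]

cleaned_header = [header[i] for i in keep]
cleaned_records = [[record[i] for i in keep] for record in records]

print(cleaned_header)
```

Dropping the column before profiling would also remove the Missing and Unsupported alerts it triggers.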

Correlations

Variable | 공원명 | 종류 | 소재지지번주소 | 수량 | 용수별
공원명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
종류 | 1.000 | 1.000 | 1.000 | 0.069 | 0.602
소재지지번주소 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
수량 | 1.000 | 0.069 | 1.000 | 1.000 | 0.000
용수별 | 1.000 | 0.602 | 1.000 | 0.000 | 1.000

Variable | 용수별 | 수량 | 종류
용수별 | 1.000 | 0.000 | 0.602
수량 | 0.000 | 1.000 | 0.000
종류 | 0.602 | 0.000 | 1.000

Variable | 종류 | 수량 | 용수별
종류 | 1.000 | 0.000 | 0.602
수량 | 0.000 | 1.000 | 0.000
용수별 | 0.602 | 0.000 | 1.000
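
The report does not say which association measure fills these matrices. Cramér's V is one common choice for pairs of categorical columns, and the sketch below assumes that measure in its plain form, without bias correction. It runs on hypothetical toy data, because the full 40-row table is not reproduced in this report, so it is not expected to reproduce the 0.602 shown for 종류 and 용수별.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Plain Cramér's V between two equally long sequences of categories."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_totals = Counter(xs)
    y_totals = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, nx in x_totals.items():
        for y, ny in y_totals.items():
            expected = nx * ny / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(x_totals), len(y_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Hypothetical toy columns, not taken from the dataset.
park_type = ["a", "a", "b", "b"]
perfectly_linked = ["p", "p", "q", "q"]
unrelated = ["p", "q", "p", "q"]

print(cramers_v(park_type, perfectly_linked))
print(cramers_v(park_type, unrelated))
```

A value of 1 means one column fully determines the other, and 0 means the observed table matches what independence would predict.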

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

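Index | 공원명 | 종류 | 소재지지번주소 | 수량 | 용수별 | Unnamed: 5
0 | 성두산 | 근린공원 | 대전광역시 유성구 구성동 20 | 1 | 상수도 | <NA>
1 | 엑스포 | 근린공원 | 대전광역시 유성구 전민동 467-3 | 1 | 상수도 | <NA>
2 | 송강 | 근린공원 | 대전광역시 유성구 송강동 9 | 1 | 상수도 | <NA>
3 | 진잠 | 근린공원 | 대전광역시 유성구 원내동 358 | 1 | 상수도 | <NA>
4 | 덕명1 | 근린공원 | 대전광역시 유성구 덕명동 509 | 1 | 상수도 | <NA>
5 | 장배기 | 근린공원 | 대전광역시 유성구 관평동 886 외 1필 | 2 | 상수도 | <NA>
6 | 동화울수변 | 근린공원 | 대전광역시 유성구 관평동 771 | 2 | 상수도 | <NA>
7 | 청벽산 | 근린공원 | 대전광역시 유성구 탑립동 712 | 1 | 상수도 | <NA>
8 | 덜레기 | 근린공원 | 대전광역시 유성구 원신흥동 523외 1필 | 1 | 상수도 | <NA>
9 | 용반들 | 근린공원 | 대전광역시 유성구 봉명동 1025-1 | 1 | 상수도 | <NA>

Index | 공원명 | 종류 | 소재지지번주소 | 수량 | 용수별 | Unnamed: 5
30 | 주막 | 어린이공원 | 대전광역시 유성구 신성동 141-4 | 1 | 상수도 | <NA>
31 | 유성온천 | 문화공원 | 대전광역시 유성구 봉명동 574 일원 | 1 | 상수도 | <NA>
32 | 도안 | 문화공원 | 대전광역시 유성구 상대동 488 | 1 | 상수도 | <NA>
33 | 작은내 | 수변공원 | 대전광역시 유성구 원신흥동 492-1 | 2 | 상수도 | <NA>
34 | 성북동 | 산림욕장 | 대전광역시 유성구 성북동 산84번지1 | 1 | 지하수 | <NA>
35 | 세미래 | 여가녹지 | 대전광역시 유성구 반석동 111외 10 | 1 | 상수도 | <NA>
36 | 반석동 | 여가녹지 | 대전광역시 유성구 반석동 17-2 | 1 | 상수도 | <NA>
37 | 외삼동 | 여가녹지 | 대전광역시 유성구 외삼동 344-2 | 1 | 상수도 | <NA>
38 | 갑천변 | 여가녹지 | 대전광역시 유성구 문지동 엑스포@3단지 인근 | 1 | 상수도 | <NA>
39 | 자운대 | 희망쉼터 | 대전광역시 유성구 자운동 41-1일원 | 1 | 상수도 | <NA>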