Overview

Dataset statistics

Number of variables: 7
Number of observations: 1204
Missing cells: 5547
Missing cells (%): 65.8%
Duplicate rows: 1
Duplicate rows (%): 0.1%
Total size in memory: 68.3 KiB
Average record size in memory: 58.1 B
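
The percentage and per-record figures above follow from the raw counts. A quick arithmetic check is sketched below; the variable names are illustrative, and the conversion assumes 1 KiB = 1024 B.

```python
# Figures copied from the dataset statistics in this report.
n_variables = 7
n_observations = 1204
missing_cells = 5547
total_size_kib = 68.3

total_cells = n_variables * n_observations        # every cell in the table
missing_pct = 100 * missing_cells / total_cells   # share of empty cells
avg_record_bytes = total_size_kib * 1024 / n_observations  # assumes 1 KiB = 1024 B

print(f"Total cells: {total_cells}")
print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both results round to the values reported above (65.8% and 58.1 B).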

Variable types

Text: 2
Numeric: 2
Categorical: 2
DateTime: 1

Dataset

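Description: Public and open restrooms in Yeonsu-gu, Incheon Metropolitan City. A list of restroom names, locations, and related details, with fields for restroom name (화장실명), location (위치), category (구분), open-designation date (개방 지정일), and type (유형).
Author: Yeonsu-gu, Incheon Metropolitan City (인천광역시 연수구)
URL: https://data.incheon.go.kr/findData/publicDataDetail?dataId=15087382&srcSe=7661IVAWM27C61E190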

Alerts

개방 지정일 has constant value "" [Constant]
Dataset has 1 (0.1%) duplicate rows [Duplicates]
구분 is highly overall correlated with 유형 [High correlation]
유형 is highly overall correlated with 구분 [High correlation]
구분 is highly imbalanced (71.8%) [Imbalance]
유형 is highly imbalanced (80.2%) [Imbalance]
화장실명 has 1086 (90.2%) missing values [Missing]
위치 has 1086 (90.2%) missing values [Missing]
위도 has 1086 (90.2%) missing values [Missing]
경도 has 1086 (90.2%) missing values [Missing]
개방 지정일 has 1203 (99.9%) missing values [Missing]
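
The two imbalance percentages are consistent with one minus the normalized Shannon entropy of the value counts. The sketch below applies that formula to the counts listed under Common Values for 구분 and 유형. Treating it as the tool's exact definition is an assumption, and `imbalance_score` is an illustrative name rather than a library function.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy of the counts
    and k is the number of distinct values."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(k)

# Counts taken from the Common Values tables; "<NA>" is counted as a value.
gubun_counts = [1086, 82, 32, 4]
yuhyeong_counts = [1086, 46, 22, 17, 12, 12, 2, 2, 2, 1, 1, 1]

print(f"구분: {100 * imbalance_score(gubun_counts):.1f}%")
print(f"유형: {100 * imbalance_score(yuhyeong_counts):.1f}%")
```

A score near 100% means almost all rows share one value; here that dominant value is `<NA>`.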

Reproduction

Analysis started: 2024-01-28 06:15:48.441312
Analysis finished: 2024-01-28 06:15:49.271747
Duration: 0.83 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

화장실명 (restroom name)
Text

MISSING 

Distinct: 118
Distinct (%): 100.0%
Missing: 1086
Missing (%): 90.2%
Memory size: 9.5 KiB

Length

Max length: 24
Median length: 14
Mean length: 5.5338983
Min length: 1

Characters and Unicode

Total characters: 653
Distinct characters: 183
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
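
As a minimal illustration of this kind of analysis, Python's standard `unicodedata` module exposes the general category of each code point. The helper name `category_counts` is illustrative, and the sketch covers categories only, since scripts and blocks are not available in the standard library.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count characters by Unicode general category code.

    'Lo' = Other Letter, 'Nd' = Decimal Number, 'Zs' = Space Separator,
    'Pd' = Dash Punctuation, 'Ps'/'Pe' = Open/Close Punctuation.
    """
    return Counter(unicodedata.category(ch) for ch in text)

# One address value taken from the 위치 sample in this report.
print(category_counts("연수동 594-1"))
```

Summing such counters over every value in a column yields a category breakdown like the tables in this section.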

Unique

Unique: 118
Unique (%): 100.0%

Sample

1st row: 양지
2nd row: 청학
3rd row: 용담
4th row: 문화
5th row: 솔밭
Value | Count | Frequency (%)
주유소 | 16 | 10.7%
송도점 | 3 | 2.0%
연수점 | 3 | 2.0%
미추홀 | 2 | 1.3%
롯데마트 | 2 | 1.3%
홈플러스 | 2 | 1.3%
㈜태화석유 | 1 | 0.7%
지식정보단지역 | 1 | 0.7%
국제업무지구역 | 1 | 0.7%
센트럴파크역 | 1 | 0.7%
Other values (118) | 118 | 78.7%

Most occurring characters

Value | Count | Frequency (%)
 | 36 | 5.5%
(space) | 32 | 4.9%
 | 26 | 4.0%
 | 26 | 4.0%
 | 20 | 3.1%
 | 18 | 2.8%
 | 18 | 2.8%
 | 17 | 2.6%
 | 16 | 2.5%
 | 16 | 2.5%
Other values (173) | 428 | 65.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 580 | 88.8%
Space Separator | 32 | 4.9%
Decimal Number | 13 | 2.0%
Other Symbol | 8 | 1.2%
Lowercase Letter | 7 | 1.1%
Uppercase Letter | 6 | 0.9%
Close Punctuation | 3 | 0.5%
Open Punctuation | 3 | 0.5%
Letter Number | 1 | 0.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 36 | 6.2%
 | 26 | 4.5%
 | 26 | 4.5%
 | 20 | 3.4%
 | 18 | 3.1%
 | 18 | 3.1%
 | 17 | 2.9%
 | 16 | 2.8%
 | 16 | 2.8%
 | 14 | 2.4%
Other values (155) | 373 | 64.3%

Uppercase Letter
Value | Count | Frequency (%)
C | 1 | 16.7%
N | 1 | 16.7%
F | 1 | 16.7%
L | 1 | 16.7%
K | 1 | 16.7%
S | 1 | 16.7%

Lowercase Letter
Value | Count | Frequency (%)
s | 3 | 42.9%
g | 2 | 28.6%
l | 1 | 14.3%
o | 1 | 14.3%

Decimal Number
Value | Count | Frequency (%)
3 | 5 | 38.5%
2 | 4 | 30.8%
1 | 4 | 30.8%

Space Separator
Value | Count | Frequency (%)
(space) | 32 | 100.0%

Other Symbol
Value | Count | Frequency (%)
 | 8 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 3 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 3 | 100.0%

Letter Number
Value | Count | Frequency (%)
 | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 588 | 90.0%
Common | 51 | 7.8%
Latin | 14 | 2.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 36 | 6.1%
 | 26 | 4.4%
 | 26 | 4.4%
 | 20 | 3.4%
 | 18 | 3.1%
 | 18 | 3.1%
 | 17 | 2.9%
 | 16 | 2.7%
 | 16 | 2.7%
 | 14 | 2.4%
Other values (156) | 381 | 64.8%

Latin
Value | Count | Frequency (%)
s | 3 | 21.4%
g | 2 | 14.3%
C | 1 | 7.1%
N | 1 | 7.1%
F | 1 | 7.1%
L | 1 | 7.1%
l | 1 | 7.1%
o | 1 | 7.1%
K | 1 | 7.1%
S | 1 | 7.1%

Common
Value | Count | Frequency (%)
(space) | 32 | 62.7%
3 | 5 | 9.8%
2 | 4 | 7.8%
1 | 4 | 7.8%
) | 3 | 5.9%
( | 3 | 5.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 580 | 88.8%
ASCII | 64 | 9.8%
None | 8 | 1.2%
Number Forms | 1 | 0.2%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 36 | 6.2%
 | 26 | 4.5%
 | 26 | 4.5%
 | 20 | 3.4%
 | 18 | 3.1%
 | 18 | 3.1%
 | 17 | 2.9%
 | 16 | 2.8%
 | 16 | 2.8%
 | 14 | 2.4%
Other values (155) | 373 | 64.3%

ASCII
Value | Count | Frequency (%)
(space) | 32 | 50.0%
3 | 5 | 7.8%
2 | 4 | 6.2%
1 | 4 | 6.2%
s | 3 | 4.7%
) | 3 | 4.7%
( | 3 | 4.7%
g | 2 | 3.1%
C | 1 | 1.6%
N | 1 | 1.6%
Other values (6) | 6 | 9.4%

None
Value | Count | Frequency (%)
 | 8 | 100.0%

Number Forms
Value | Count | Frequency (%)
 | 1 | 100.0%

위치 (location)
Text

MISSING 

Distinct: 114
Distinct (%): 96.6%
Missing: 1086
Missing (%): 90.2%
Memory size: 9.5 KiB

Length

Max length: 20
Median length: 17
Mean length: 9.7372881
Min length: 5

Characters and Unicode

Total characters: 1149
Distinct characters: 90
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 110
Unique (%): 93.2%

Sample

1st row: 연수동 594-1
2nd row: 청학동 501
3rd row: 용담로 61
4th row: 연수동 578
5th row: 연수동 582-1
Value | Count | Frequency (%)
경원대로 | 10 | 4.3%
연수동 | 9 | 3.8%
동춘동 | 9 | 3.8%
원인재로 | 6 | 2.6%
비류대로 | 6 | 2.6%
송도국제대로 | 5 | 2.1%
청능대로 | 5 | 2.1%
앵고개로 | 5 | 2.1%
청학동 | 4 | 1.7%
인천타워대로 | 4 | 1.7%
Other values (151) | 172 | 73.2%

Most occurring characters

Value | Count | Frequency (%)
(space) | 117 | 10.2%
 | 88 | 7.7%
1 | 81 | 7.0%
 | 69 | 6.0%
2 | 54 | 4.7%
5 | 50 | 4.4%
 | 44 | 3.8%
3 | 43 | 3.7%
4 | 34 | 3.0%
9 | 29 | 2.5%
Other values (80) | 540 | 47.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 577 | 50.2%
Decimal Number | 380 | 33.1%
Space Separator | 117 | 10.2%
Dash Punctuation | 25 | 2.2%
Close Punctuation | 25 | 2.2%
Open Punctuation | 25 | 2.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 88 | 15.3%
 | 69 | 12.0%
 | 44 | 7.6%
 | 22 | 3.8%
 | 19 | 3.3%
 | 19 | 3.3%
 | 16 | 2.8%
 | 16 | 2.8%
 | 15 | 2.6%
 | 14 | 2.4%
Other values (66) | 255 | 44.2%

Decimal Number
Value | Count | Frequency (%)
1 | 81 | 21.3%
2 | 54 | 14.2%
5 | 50 | 13.2%
3 | 43 | 11.3%
4 | 34 | 8.9%
9 | 29 | 7.6%
8 | 26 | 6.8%
0 | 24 | 6.3%
6 | 20 | 5.3%
7 | 19 | 5.0%

Space Separator
Value | Count | Frequency (%)
(space) | 117 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 25 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 25 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 25 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 577 | 50.2%
Common | 572 | 49.8%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 88 | 15.3%
 | 69 | 12.0%
 | 44 | 7.6%
 | 22 | 3.8%
 | 19 | 3.3%
 | 19 | 3.3%
 | 16 | 2.8%
 | 16 | 2.8%
 | 15 | 2.6%
 | 14 | 2.4%
Other values (66) | 255 | 44.2%

Common
Value | Count | Frequency (%)
(space) | 117 | 20.5%
1 | 81 | 14.2%
2 | 54 | 9.4%
5 | 50 | 8.7%
3 | 43 | 7.5%
4 | 34 | 5.9%
9 | 29 | 5.1%
8 | 26 | 4.5%
- | 25 | 4.4%
) | 25 | 4.4%
Other values (4) | 88 | 15.4%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 577 | 50.2%
ASCII | 572 | 49.8%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 117 | 20.5%
1 | 81 | 14.2%
2 | 54 | 9.4%
5 | 50 | 8.7%
3 | 43 | 7.5%
4 | 34 | 5.9%
9 | 29 | 5.1%
8 | 26 | 4.5%
- | 25 | 4.4%
) | 25 | 4.4%
Other values (4) | 88 | 15.4%

Hangul
Value | Count | Frequency (%)
 | 88 | 15.3%
 | 69 | 12.0%
 | 44 | 7.6%
 | 22 | 3.8%
 | 19 | 3.3%
 | 19 | 3.3%
 | 16 | 2.8%
 | 16 | 2.8%
 | 15 | 2.6%
 | 14 | 2.4%
Other values (66) | 255 | 44.2%

위도 (latitude)
Real number (ℝ)

MISSING 

Distinct: 111
Distinct (%): 94.1%
Missing: 1086
Missing (%): 90.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.399181
Minimum: 35.907757
Maximum: 37.489775
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.7 KiB

Quantile statistics

Minimum: 35.907757
5-th percentile: 37.382413
Q1: 37.40108
Median: 37.414318
Q3: 37.422641
95-th percentile: 37.432071
Maximum: 37.489775
Range: 1.582018
Interquartile range (IQR): 0.021561

Descriptive statistics

Standard deviation: 0.13948127
Coefficient of variation (CV): 0.0037295273
Kurtosis: 114.52962
Mean: 37.399181
Median Absolute Deviation (MAD): 0.009557
Skewness: -10.623755
Sum: 4413.1034
Variance: 0.019455025
Monotonicity: Not monotonic
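
Several of the 위도 figures are derived from others in the same section: the range from the extremes, the IQR from the quartiles, and the CV and variance from the mean and standard deviation. The sketch below recomputes them from the reported values as a consistency check; the variable names are illustrative.

```python
# Values copied from the 위도 statistics in this report.
minimum, maximum = 35.907757, 37.489775
q1, q3 = 37.40108, 37.422641
mean, std = 37.399181, 0.13948127

value_range = maximum - minimum   # distance between the extremes
iqr = q3 - q1                     # spread of the middle 50% of values
cv = std / mean                   # coefficient of variation
variance = std ** 2               # square of the standard deviation

print(f"range={value_range:.6f}")
print(f"iqr={iqr:.6f}")
print(f"cv={cv:.10f}")
print(f"variance={variance:.9f}")
```

The range is roughly 70 times the IQR, which suggests that a small number of extreme values, most visibly the minimum of 35.907757, account for most of the spread.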
Histogram with fixed size bins (bins=50)
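
With fixed-size bins, the span between the minimum and maximum is divided into 50 intervals of equal width. The sketch below shows that mapping for the 위도 values. It assumes the bins span exactly the reported minimum to maximum and that the last bin includes its upper edge; `bin_index` is a hypothetical helper, not part of the profiling tool.

```python
def bin_index(value, lo, hi, bins=50):
    """Return the index of the equal-width bin that contains value."""
    if not lo <= value <= hi:
        raise ValueError("value outside histogram range")
    width = (hi - lo) / bins
    # The maximum sits on the final edge, so clamp it into the last bin.
    return min(int((value - lo) / width), bins - 1)

lo, hi = 35.907757, 37.489775        # 위도 minimum and maximum from this report
print((hi - lo) / 50)                # width of one bin, in degrees
print(bin_index(37.414318, lo, hi))  # bin that holds the median
```

Because one bin is wider than the whole interquartile range, most observations land in just a few bins at the upper end of the axis.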
Value | Count | Frequency (%)
37.383912 | 2 | 0.2%
37.398096 | 2 | 0.2%
37.406106 | 2 | 0.2%
37.384878 | 2 | 0.2%
37.409789 | 2 | 0.2%
37.430122 | 2 | 0.2%
37.382413 | 2 | 0.2%
37.404158 | 1 | 0.1%
37.405395 | 1 | 0.1%
37.379755 | 1 | 0.1%
Other values (101) | 101 | 8.4%
(Missing) | 1086 | 90.2%

Value | Count | Frequency (%)
35.907757 | 1 | 0.1%
37.366701 | 1 | 0.1%
37.378451 | 1 | 0.1%
37.379755 | 1 | 0.1%
37.381413 | 1 | 0.1%
37.382413 | 2 | 0.2%
37.3825 | 1 | 0.1%
37.383912 | 2 | 0.2%
37.384878 | 2 | 0.2%
37.387343 | 1 | 0.1%

Value | Count | Frequency (%)
37.489775 | 1 | 0.1%
37.438592 | 1 | 0.1%
37.43494 | 1 | 0.1%
37.433917 | 1 | 0.1%
37.432719 | 1 | 0.1%
37.432187 | 1 | 0.1%
37.43205 | 1 | 0.1%
37.431419 | 1 | 0.1%
37.430405 | 1 | 0.1%
37.430122 | 2 | 0.2%

경도 (longitude)
Real number (ℝ)

MISSING 

Distinct: 111
Distinct (%): 94.1%
Missing: 1086
Missing (%): 90.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 126.67793
Minimum: 126.63169
Maximum: 127.76692
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.7 KiB

Quantile statistics

Minimum: 126.63169
5-th percentile: 126.6391
Q1: 126.65468
Median: 126.67013
Q3: 126.68173
95-th percentile: 126.69858
Maximum: 127.76692
Range: 1.135234
Interquartile range (IQR): 0.02704475

Descriptive statistics

Standard deviation: 0.10291308
Coefficient of variation (CV): 0.00081239947
Kurtosis: 109.72631
Mean: 126.67793
Median Absolute Deviation (MAD): 0.0137715
Skewness: 10.292266
Sum: 14947.995
Variance: 0.010591102
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
126.643855 | 2 | 0.2%
126.631688 | 2 | 0.2%
126.683724 | 2 | 0.2%
126.658734 | 2 | 0.2%
126.67832 | 2 | 0.2%
126.698485 | 2 | 0.2%
126.656355 | 2 | 0.2%
126.681106 | 1 | 0.1%
126.672745 | 1 | 0.1%
126.644072 | 1 | 0.1%
Other values (101) | 101 | 8.4%
(Missing) | 1086 | 90.2%

Value | Count | Frequency (%)
126.631688 | 2 | 0.2%
126.633304 | 1 | 0.1%
126.636717 | 1 | 0.1%
126.638248 | 1 | 0.1%
126.638565 | 1 | 0.1%
126.639197 | 1 | 0.1%
126.640168 | 1 | 0.1%
126.640779 | 1 | 0.1%
126.642309 | 1 | 0.1%
126.643219 | 1 | 0.1%

Value | Count | Frequency (%)
127.766922 | 1 | 0.1%
126.723308 | 1 | 0.1%
126.701351 | 1 | 0.1%
126.701314 | 1 | 0.1%
126.699185 | 1 | 0.1%
126.699115 | 1 | 0.1%
126.698485 | 2 | 0.2%
126.698372 | 1 | 0.1%
126.698351 | 1 | 0.1%
126.698279 | 1 | 0.1%

구분 (category)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 9.5 KiB

Value | Count
<NA> | 1086
공중 | 82
개방(의무) | 32
개방(지정) | 4

Length

Max length: 6
Median length: 4
Mean length: 3.923588
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 공중
2nd row: 공중
3rd row: 공중
4th row: 공중
5th row: 공중

Common Values

Value | Count | Frequency (%)
<NA> | 1086 | 90.2%
공중 | 82 | 6.8%
개방(의무) | 32 | 2.7%
개방(지정) | 4 | 0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 1086 | 90.2%
공중 | 82 | 6.8%
개방(의무 | 32 | 2.7%
개방(지정 | 4 | 0.3%

개방 지정일 (open-designation date)
Date

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 100.0%
Missing: 1203
Missing (%): 99.9%
Memory size: 9.5 KiB
Minimum: 2011-05-23 00:00:00
Maximum: 2011-05-23 00:00:00
Histogram with fixed size bins (bins=1)

유형 (type)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 12
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 9.5 KiB

Value | Count
<NA> | 1086
공원 | 46
주유소 | 22
기타(관공서) | 17
지하철 | 12
Other values (7) | 21

Length

Max length: 8
Median length: 4
Mean length: 3.923588
Min length: 2

Unique

Unique: 3
Unique (%): 0.2%

Sample

1st row: 공원
2nd row: 공원
3rd row: 공원
4th row: 공원
5th row: 공원

Common Values

Value | Count | Frequency (%)
<NA> | 1086 | 90.2%
공원 | 46 | 3.8%
주유소 | 22 | 1.8%
기타(관공서) | 17 | 1.4%
지하철 | 12 | 1.0%
시장 | 12 | 1.0%
주유소(경기장) | 2 | 0.2%
기타(빌딩) | 2 | 0.2%
전철 | 2 | 0.2%
유원지 | 1 | 0.1%
Other values (2) | 2 | 0.2%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
na | 1086 | 90.2%
공원 | 46 | 3.8%
주유소 | 22 | 1.8%
기타(관공서 | 17 | 1.4%
지하철 | 12 | 1.0%
시장 | 12 | 1.0%
주유소(경기장 | 2 | 0.2%
기타(빌딩 | 2 | 0.2%
전철 | 2 | 0.2%
유원지 | 1 | 0.1%
Other values (2) | 2 | 0.2%

Interactions


Correlations

Variable | 위도 | 경도 | 구분 | 유형
위도 | 1.000 | 0.695 | 0.047 | 0.000
경도 | 0.695 | 1.000 | 0.047 | 0.000
구분 | 0.047 | 0.047 | 1.000 | 0.943
유형 | 0.000 | 0.000 | 0.943 | 1.000

Variable | 유형 | 구분
유형 | 1.000 | 0.887
구분 | 0.887 | 1.000

Variable | 위도 | 경도 | 구분 | 유형
위도 | 1.000 | 0.394 | 0.077 | 0.000
경도 | 0.394 | 1.000 | 0.077 | 0.000
구분 | 0.077 | 0.077 | 1.000 | 0.887
유형 | 0.000 | 0.000 | 0.887 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
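
Nullity correlation can be read as the Pearson correlation between per-column "value is present" indicators. The sketch below computes it for hypothetical toy columns (not rows from this dataset), with `None` standing for a missing value; `nullity_correlation` is an illustrative helper name.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 'is present' indicators of two columns.

    Missing values are represented by None. Returns None when either column
    is entirely present or entirely missing, since the indicator then has
    zero variance and the correlation is undefined.
    """
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / math.sqrt(var_a * var_b)

# Hypothetical toy columns: name and location are missing in the same rows.
name = ["양지", "청학", None, None]
location = ["연수동 594-1", "청학동 501", None, None]
designated = [None, "2011-05-23", None, None]

print(nullity_correlation(name, location))    # missing together
print(nullity_correlation(name, designated))  # only partly aligned
```

Columns that are always missing in the same rows, as 화장실명, 위치, 위도, and 경도 appear to be here (1086 missing each), would score 1 against each other.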

Sample

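Index | 화장실명 | 위치 | 위도 | 경도 | 구분 | 개방 지정일 | 유형
0 | 양지 | 연수동 594-1 | 37.415099 | 126.678723 | 공중 | <NA> | 공원
1 | 청학 | 청학동 501 | 37.424563 | 126.665133 | 공중 | <NA> | 공원
2 | 용담 | 용담로 61 | 37.418569 | 126.674201 | 공중 | <NA> | 공원
3 | 문화 | 연수동 578 | 37.417376 | 126.68144 | 공중 | <NA> | 공원
4 | 솔밭 | 연수동 582-1 | 37.416298 | 126.6903 | 공중 | <NA> | 공원
5 | 대학 | 함박뫼로 194 | 37.421423 | 126.686352 | 공중 | <NA> | 공원
6 | 송도공중화장실 | 능허대로 192 | 37.416787 | 126.647824 | 공중 | <NA> | 유원지
7 | 아암도공중화장실 | 아암대로 555 | 37.413854 | 126.642309 | 공중 | <NA> | 기타
8 | 두리 | 비류대로541번길 14-7 | 37.423794 | 126.695201 | 공중 | <NA> | 공원
9 | 승기 | 선학동 348 | 37.419438 | 126.698372 | 공중 | <NA> | 공원

Index | 화장실명 | 위치 | 위도 | 경도 | 구분 | 개방 지정일 | 유형
1194 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
1195 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
1196 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
1197 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
1198 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
1199 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
1200 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
1201 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
1202 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
1203 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>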

Duplicate rows

Most frequently occurring

Index | 화장실명 | 위치 | 위도 | 경도 | 구분 | 개방 지정일 | 유형 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 1086
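
The only duplicated row is the fully empty one, which occurs 1086 times. That leaves 1204 - 1086 = 118 populated rows, matching the 118 distinct 화장실명 values. The sketch below counts repeated rows and drops the all-empty ones; it uses a hypothetical miniature of the data, and both helper names are illustrative.

```python
from collections import Counter

def duplicate_row_counts(rows):
    """Return {row: occurrences} for every row that appears more than once."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}

def drop_empty_rows(rows):
    """Remove rows in which every field is missing (None)."""
    return [row for row in rows if any(v is not None for v in row)]

# Hypothetical miniature of the dataset: two populated rows, three empty ones.
rows = [
    ("양지", "연수동 594-1", 37.415099, 126.678723, "공중", None, "공원"),
    ("청학", "청학동 501", 37.424563, 126.665133, "공중", None, "공원"),
    (None,) * 7,
    (None,) * 7,
    (None,) * 7,
]

print(duplicate_row_counts(rows))
print(len(drop_empty_rows(rows)))
```

Dropping the empty rows before profiling would likely remove most of the Missing and Imbalance alerts, since those are driven by the same 1086 rows.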