Overview

Dataset statistics

Number of variables: 4
Number of observations: 101
Missing cells: 206
Missing cells (%): 51.0%
Duplicate rows: 4
Duplicate rows (%): 4.0%
Total size in memory: 3.5 KiB
Average record size in memory: 35.3 B
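
The percentages above follow directly from the raw counts. As a minimal sketch (variable names are illustrative; the per-column missing counts are copied from the variable sections of this report), the arithmetic can be checked like this:

```python
# Counts copied from this report; the variable names are illustrative only.
n_variables = 4
n_observations = 101
missing_per_column = {
    "Unnamed: 0": 101,
    "2017년 도로명주소에 대한 시민인식 조사": 87,
    "Unnamed: 2": 0,
    "Unnamed: 3": 18,
}
n_duplicate_rows = 4

total_cells = n_variables * n_observations          # 4 columns x 101 rows
missing_cells = sum(missing_per_column.values())    # missing cells over all columns
missing_pct = round(100 * missing_cells / total_cells, 1)
duplicate_pct = round(100 * n_duplicate_rows / n_observations, 1)

print(missing_cells, missing_pct, duplicate_pct)
```

The per-column counts sum to the reported 206 missing cells, and both percentages round to the reported figures.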

Variable types

Unsupported: 1
Numeric: 1
Categorical: 1
Text: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-2743/F/1/datasetView.do

Alerts

[Duplicates] Dataset has 4 (4.0%) duplicate rows
[High correlation] 2017년 도로명주소에 대한 시민인식 조사 ("2017 Citizen Awareness Survey on Road Name Addresses") is highly overall correlated with Unnamed: 2
[High correlation] Unnamed: 2 is highly overall correlated with 2017년 도로명주소에 대한 시민인식 조사
[Missing] Unnamed: 0 has 101 (100.0%) missing values
[Missing] 2017년 도로명주소에 대한 시민인식 조사 has 87 (86.1%) missing values
[Missing] Unnamed: 3 has 18 (17.8%) missing values
[Unsupported] Unnamed: 0 is an unsupported type; check whether it needs cleaning or further analysis

Reproduction

Analysis started: 2023-12-11 05:25:40.873970
Analysis finished: 2023-12-11 05:25:41.610306
Duration: 0.74 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
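
The reported duration is the difference between the two timestamps. A small sketch using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of this report.
started = datetime.fromisoformat("2023-12-11 05:25:40.873970")
finished = datetime.fromisoformat("2023-12-11 05:25:41.610306")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```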

Variables

Unnamed: 0
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 101
Missing (%): 100.0%
Memory size: 1.0 KiB

2017년 도로명주소에 대한 시민인식 조사
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 14
Distinct (%): 100.0%
Missing: 87
Missing (%): 86.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.5
Minimum: 1
Maximum: 14
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1.65
Q1: 4.25
Median: 7.5
Q3: 10.75
95th percentile: 13.35
Maximum: 14
Range: 13
Interquartile range (IQR): 6.5
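
These quantiles are consistent with linear interpolation between order statistics. The sketch below assumes the 14 non-missing values are the integers 1 through 14, which the value counts in this section suggest but which is not stated outright; `method="inclusive"` is the standard-library option that interpolates between the minimum and maximum.

```python
from statistics import median, quantiles

# Assumption: the 14 non-missing values are the integers 1..14.
values = list(range(1, 15))

# Quartiles, then 5% steps to read off the 5th and 95th percentiles.
q1, _, q3 = quantiles(values, n=4, method="inclusive")
twentieths = quantiles(values, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]

summary = {
    "min": min(values),
    "p5": p5,
    "q1": q1,
    "median": median(values),
    "q3": q3,
    "p95": p95,
    "max": max(values),
    "range": max(values) - min(values),
    "iqr": q3 - q1,
}
print(summary)
```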

Descriptive statistics

Standard deviation: 4.1833001
Coefficient of variation (CV): 0.55777335
Kurtosis: -1.2
Mean: 7.5
Median Absolute Deviation (MAD): 3.5
Skewness: 0
Sum: 105
Variance: 17.5
Monotonicity: Strictly increasing
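
The descriptive statistics can be recomputed by hand. This is a sketch under two assumptions: that the non-missing values are the integers 1 through 14, and that kurtosis is the sample-size-adjusted excess kurtosis. The second is a guess about the estimator the tool uses, chosen because it reproduces the reported -1.2.

```python
from math import sqrt
from statistics import mean, median, variance

# Assumption: the 14 non-missing values are the integers 1..14.
values = list(range(1, 15))
n = len(values)

avg = mean(values)
var = variance(values)          # sample variance, n - 1 in the denominator
std = sqrt(var)
cv = std / avg                  # coefficient of variation
med = median(values)
mad = median(abs(v - med) for v in values)   # median absolute deviation

# Central moments (population form) used for skewness and kurtosis.
m2 = sum((v - avg) ** 2 for v in values) / n
m3 = sum((v - avg) ** 3 for v in values) / n
m4 = sum((v - avg) ** 4 for v in values) / n
skewness = m3 / m2 ** 1.5
g2 = m4 / m2 ** 2 - 3           # unadjusted excess kurtosis
kurtosis = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)

strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(sum(values), avg, var, round(std, 7), round(cv, 8), mad, skewness, round(kurtosis, 4))
```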
Histogram with fixed size bins (bins=14)

Common values

Value             Count  Frequency (%)
1                 1      1.0%
2                 1      1.0%
3                 1      1.0%
4                 1      1.0%
5                 1      1.0%
6                 1      1.0%
7                 1      1.0%
8                 1      1.0%
9                 1      1.0%
10                1      1.0%
Other values (4)  4      4.0%
(Missing)         87     86.1%

Minimum 10 values

Value  Count  Frequency (%)
1      1      1.0%
2      1      1.0%
3      1      1.0%
4      1      1.0%
5      1      1.0%
6      1      1.0%
7      1      1.0%
8      1      1.0%
9      1      1.0%
10     1      1.0%

Maximum 10 values

Value  Count  Frequency (%)
14     1      1.0%
13     1      1.0%
12     1      1.0%
11     1      1.0%
10     1      1.0%
9      1      1.0%
8      1      1.0%
7      1      1.0%
6      1      1.0%
5      1      1.0%

Unnamed: 2
Categorical

HIGH CORRELATION 

Distinct: 42
Distinct (%): 41.6%
Missing: 0
Missing (%): 0.0%
Memory size: 940.0 B

Value              Count
1                  14
2                  14
3                  11
4                  9
5                  5
Other values (37)  48

Length

Max length: 117
Median length: 1
Mean length: 10.009901
Min length: 1

Unique

Unique: 33
Unique (%): 32.7%
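
"Distinct" and "Unique" measure different things here. The sketch below assumes that Distinct counts the different values present and Unique counts the values that occur exactly once; that reading is consistent with the figures in this section but is not spelled out in the report. The repeated-value counts are copied from the Common Values table for this variable, and the 33 values that occur once are represented by placeholder strings.

```python
from collections import Counter

# Counts of the repeated values, copied from the Common Values table.
repeated = {"1": 14, "2": 14, "3": 11, "4": 9, "5": 5,
            "99": 5, "<NA>": 4, "6": 4, "7": 2}
# Placeholders (hypothetical) for the 33 values that occur exactly once.
singletons = [f"text_{i}" for i in range(33)]

column = [value for value, count in repeated.items() for _ in range(count)]
column += singletons

counts = Counter(column)
n_rows = len(column)
n_distinct = len(counts)                                  # different values
n_unique = sum(1 for c in counts.values() if c == 1)      # values seen once

print(n_rows, n_distinct, round(100 * n_distinct / n_rows, 1),
      n_unique, round(100 * n_unique / n_rows, 1))
```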

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: "Do you know your home address in road name address form, that is, a road name and building number such as “00-ro 00” or “00-ro 00-gil 00”?"

Common Values

Value               Count  Frequency (%)
1                   14     13.9%
2                   14     13.9%
3                   11     10.9%
4                   9      8.9%
5                   5      5.0%
99                  5      5.0%
<NA>                4      4.0%
6                   4      4.0%
7                   2      2.0%
"A detailed address can be requested by visiting the district office or community service center, or online. Would you be willing to apply for a detailed address?"  1  1.0%
Other values (32)   32     31.7%

Length

Histogram of lengths of the category

Value                                           Count  Frequency (%)
1                                               14     5.0%
2                                               14     5.0%
3                                               11     3.9%
4                                               9      3.2%
선생님께서는 ("you", honorific subject)          8      2.9%
5                                               5      1.8%
99                                              5      1.8%
na                                              4      1.4%
6                                               4      1.4%
도로명주소를 ("road name address", object form)  4      1.4%
Other values (174)                              202    72.1%

Unnamed: 3
Text

MISSING 

Distinct: 77
Distinct (%): 92.8%
Missing: 18
Missing (%): 17.8%
Memory size: 940.0 B

Length

Max length: 37
Median length: 29
Mean length: 9.2168675
Min length: 2

Characters and Unicode

Total characters: 765
Distinct characters: 187
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
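
As an illustration of these character properties, the sketch below classifies the characters of one sample value from this column with Python's standard `unicodedata` module. The two-letter codes are Unicode general categories (`Lo` is Other Letter, `Zs` is Space Separator, `Po` is Other Punctuation). The Hangul Syllables range check is a simplification added here, not something the report itself describes.

```python
import unicodedata
from collections import Counter

# One sample value from this column; NFC keeps each Hangul syllable
# as a single precomposed code point.
text = unicodedata.normalize("NFC", "정확히 알고 있다.")

# Count characters per Unicode general category.
category_counts = Counter(unicodedata.category(ch) for ch in text)

# Characters inside the Hangul Syllables block (U+AC00 to U+D7A3).
hangul_syllables = sum(1 for ch in text if "\uac00" <= ch <= "\ud7a3")

print(dict(category_counts), hangul_syllables)
```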

Unique

Unique: 74
Unique (%): 89.2%

Sample

1st row: I know it exactly.
2nd row: I know it vaguely.
3rd row: I do not know it.
4th row: Yes
5th row: No
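
Value                                        Count  Frequency (%)
사용한다 ("use")                              6      2.9%
있다 ("yes" / "have")                         6      2.9%
(not shown)                                  6      2.9%
기타 ("other")                                5      2.4%
지번주소를 ("lot-number address", object form)  5      2.4%
(not shown)                                  4      2.0%
어렵다 ("difficult")                          4      2.0%
알고 ("know")                                 4      2.0%
모른다 ("do not know")                        3      1.5%
(not shown)                                  3      1.5%
Other values (135)                           159    77.6%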

Most occurring characters

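Value               Count  Frequency (%)
(space)             122    15.9%
(not shown)         32     4.2%
(not shown)         29     3.8%
(not shown)         24     3.1%
.                   21     2.7%
(not shown)         18     2.4%
(not shown)         13     1.7%
(not shown)         13     1.7%
(not shown)         13     1.7%
(not shown)         13     1.7%
Other values (177)  467    61.0%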

Most occurring categories

Value              Count  Frequency (%)
Other Letter       597    78.0%
Space Separator    122    15.9%
Other Punctuation  33     4.3%
Open Punctuation   4      0.5%
Close Punctuation  4      0.5%
Uppercase Letter   3      0.4%
Decimal Number     2      0.3%

Most frequent character per category

Other Letter

Value               Count  Frequency (%)
(not shown)         32     5.4%
(not shown)         29     4.9%
(not shown)         24     4.0%
(not shown)         18     3.0%
(not shown)         13     2.2%
(not shown)         13     2.2%
(not shown)         13     2.2%
(not shown)         13     2.2%
(not shown)         12     2.0%
(not shown)         12     2.0%
Other values (167)  418    70.0%

Other Punctuation

Value  Count  Frequency (%)
.      21     63.6%
,      10     30.3%
/      1      3.0%
:      1      3.0%

Uppercase Letter

Value  Count  Frequency (%)
S      2      66.7%
N      1      33.3%

Space Separator

Value    Count  Frequency (%)
(space)  122    100.0%

Open Punctuation

Value  Count  Frequency (%)
(      4      100.0%

Close Punctuation

Value  Count  Frequency (%)
)      4      100.0%

Decimal Number

Value  Count  Frequency (%)
1      2      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  597    78.0%
Common  165    21.6%
Latin   3      0.4%

Most frequent character per script

Hangul

Value               Count  Frequency (%)
(not shown)         32     5.4%
(not shown)         29     4.9%
(not shown)         24     4.0%
(not shown)         18     3.0%
(not shown)         13     2.2%
(not shown)         13     2.2%
(not shown)         13     2.2%
(not shown)         13     2.2%
(not shown)         12     2.0%
(not shown)         12     2.0%
Other values (167)  418    70.0%

Common

Value    Count  Frequency (%)
(space)  122    73.9%
.        21     12.7%
,        10     6.1%
(        4      2.4%
)        4      2.4%
1        2      1.2%
/        1      0.6%
:        1      0.6%

Latin

Value  Count  Frequency (%)
S      2      66.7%
N      1      33.3%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  597    78.0%
ASCII   168    22.0%

Most frequent character per block

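ASCII

Value    Count  Frequency (%)
(space)  122    72.6%
.        21     12.5%
,        10     6.0%
(        4      2.4%
)        4      2.4%
S        2      1.2%
1        2      1.2%
/        1      0.6%
N        1      0.6%
:        1      0.6%

Hangul

Value               Count  Frequency (%)
(not shown)         32     5.4%
(not shown)         29     4.9%
(not shown)         24     4.0%
(not shown)         18     3.0%
(not shown)         13     2.2%
(not shown)         13     2.2%
(not shown)         13     2.2%
(not shown)         13     2.2%
(not shown)         12     2.0%
(not shown)         12     2.0%
Other values (167)  418    70.0%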

Interactions


Correlations

                                         2017년 도로명주소에 대한 시민인식 조사  Unnamed: 2  Unnamed: 3
2017년 도로명주소에 대한 시민인식 조사   1.000                                  1.000       NaN
Unnamed: 2                               1.000                                  1.000       1.000
Unnamed: 3                               NaN                                    1.000       1.000

                                         2017년 도로명주소에 대한 시민인식 조사  Unnamed: 2
2017년 도로명주소에 대한 시민인식 조사   1.000                                  1.000
Unnamed: 2                               1.000                                  1.000

Missing values

Count: a simple visualization of nullity by column.
Matrix: the nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
Heatmap: the correlation heatmap measures nullity correlation, that is, how strongly the presence or absence of one variable affects the presence of another.

Sample

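First rows

Index  Unnamed: 0  2017년 도로명주소에 대한 시민인식 조사  Unnamed: 2  Unnamed: 3
0      <NA>        <NA>  <NA>  <NA>
1      <NA>        <NA>  <NA>  <NA>
2      <NA>        <NA>  <NA>  <NA>
3      <NA>        <NA>  <NA>  <NA>
4      <NA>        1     "Do you know your home address in road name address form, that is, a road name and building number such as “00-ro 00” or “00-ro 00-gil 00”?"  <NA>
5      <NA>        <NA>  1     I know it exactly.
6      <NA>        <NA>  2     I know it vaguely.
7      <NA>        <NA>  3     I do not know it.
8      <NA>        2     "Have you ever used a road name address in everyday life, for example when finding or giving directions, sending mail or parcels, or obtaining civil documents such as resident registration papers?"  <NA>
9      <NA>        <NA>  1     Yes

Last rows

Index  Unnamed: 0  2017년 도로명주소에 대한 시민인식 조사  Unnamed: 2  Unnamed: 3
91     <NA>        <NA>  17    성북구 (Seongbuk-gu)
92     <NA>        <NA>  18    송파구 (Songpa-gu)
93     <NA>        <NA>  19    양천구 (Yangcheon-gu)
94     <NA>        <NA>  20    영등포구 (Yeongdeungpo-gu)
95     <NA>        <NA>  21    용산구 (Yongsan-gu)
96     <NA>        <NA>  22    은평구 (Eunpyeong-gu)
97     <NA>        <NA>  23    종로구 (Jongno-gu)
98     <NA>        <NA>  24    중구 (Jung-gu)
99     <NA>        <NA>  25    중랑구 (Jungnang-gu)
100    <NA>        <NA>  26    기타(경기, 인천) (Other: Gyeonggi, Incheon)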

Duplicate rows

Most frequently occurring

Index  2017년 도로명주소에 대한 시민인식 조사  Unnamed: 2  Unnamed: 3              # duplicates
2      <NA>                                   99          기타 ("Other")           5
3      <NA>                                   <NA>        <NA>                    4
0      <NA>                                   1           알고 있다 ("I know")     2
1      <NA>                                   2           모른다 ("I do not know")  2
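
The table above groups identical rows and counts their occurrences. The sketch below reproduces that grouping with the standard library. The four repeated rows and their counts are copied from the table; the three non-repeated rows are borrowed from the Sample section purely for contrast, and `None` stands in for `<NA>`. On this reading, the overview figure of 4 duplicate rows matches the number of distinct repeated rows rather than the 13 row instances they cover; that interpretation is inferred from the numbers, not stated by the report.

```python
from collections import Counter

# Rows as (survey column, Unnamed: 2, Unnamed: 3); None stands for <NA>.
duplicated_groups = {
    (None, "99", "기타"): 5,
    (None, None, None): 4,
    (None, "1", "알고 있다"): 2,
    (None, "2", "모른다"): 2,
}
# A few rows that occur once, taken from the Sample section for contrast.
single_rows = [
    (None, "1", "정확히 알고 있다."),
    (None, "2", "어렴풋이 알고 있다."),
    (None, "3", "모른다."),
]

rows = [row for row, n in duplicated_groups.items() for _ in range(n)]
rows += single_rows

# Keep only the rows that occur more than once.
counts = Counter(rows)
duplicates = {row: n for row, n in counts.items() if n > 1}

n_groups = len(duplicates)                 # distinct repeated rows
n_instances = sum(duplicates.values())     # row instances they cover

for row, n in sorted(duplicates.items(), key=lambda item: -item[1]):
    print(row, n)
print(n_groups, n_instances)
```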