Overview

Dataset statistics

Number of variables: 7
Number of observations: 343
Missing cells: 241
Missing cells (%): 10.0%
Duplicate rows: 1
Duplicate rows (%): 0.3%
Total size in memory: 19.6 KiB
Average record size in memory: 58.4 B
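
The percentages above follow from the raw counts. A minimal sketch of the arithmetic, using only figures copied from this report (the per-column missing counts are those listed in the Alerts section):

```python
# Figures copied from this report; nothing here reads the original data.
n_rows, n_cols = 343, 7
missing_by_column = {"연번": 17, "종교시설명": 17, "전화번호": 173, "주소": 17, "우편번호": 17}

missing_cells = sum(missing_by_column.values())
missing_pct = 100 * missing_cells / (n_rows * n_cols)
duplicate_pct = 100 * 1 / n_rows  # one duplicated row pattern out of 343 rows

print(missing_cells)             # 241
print(f"{missing_pct:.1f}%")     # 10.0%
print(f"{duplicate_pct:.1f}%")   # 0.3%
```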

Variable types

Numeric: 2
Categorical: 2
Text: 3

Dataset

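Description: This dataset describes the churches in Yeonsu-gu, Incheon Metropolitan City. It lists each religious facility's name, type, address, and telephone number.
Author: Yeonsu-gu, Incheon Metropolitan City (인천광역시 연수구)
URL: https://data.incheon.go.kr/findData/publicDataDetail?dataId=15112515&srcSe=7661IVAWM27C61E190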

Alerts

Dataset has 1 (0.3%) duplicate row [Duplicates]
행정동 is highly overall correlated with 연번 and 2 other fields [High correlation]
구분 is highly overall correlated with 연번 and 2 other fields [High correlation]
연번 is highly overall correlated with 행정동 and 1 other field [High correlation]
우편번호 is highly overall correlated with 행정동 and 1 other field [High correlation]
구분 is highly imbalanced (71.5%) [Imbalance]
연번 has 17 (5.0%) missing values [Missing]
종교시설명 has 17 (5.0%) missing values [Missing]
전화번호 has 173 (50.4%) missing values [Missing]
주소 has 17 (5.0%) missing values [Missing]
우편번호 has 17 (5.0%) missing values [Missing]

Reproduction

Analysis started: 2024-01-28 12:58:34.119464
Analysis finished: 2024-01-28 12:58:35.402231
Duration: 1.28 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
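
As a quick check, the duration is the difference between the two timestamps above. The sketch below also outlines how a comparable report might be regenerated. The `build_report` helper, the CSV path, and the report title are hypothetical, and the call assumes that pandas and ydata-profiling are installed and that the file encoding matches the pandas default.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2024-01-28 12:58:34.119464")
finished = datetime.fromisoformat("2024-01-28 12:58:35.402231")
duration = (finished - started).total_seconds()
print(f"{duration:.2f} s")  # 1.28 s


def build_report(csv_path: str, out_path: str = "report.html") -> None:
    """Hypothetical helper: profile a local CSV copy of the dataset."""
    # Imported lazily so the duration check above runs without these packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)  # the encoding may need adjusting for this file
    ProfileReport(df, title="Profile report").to_file(out_path)
```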

Variables

연번
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 326
Distinct (%): 100.0%
Missing: 17
Missing (%): 5.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 163.5
Minimum: 1
Maximum: 326
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.1 KiB

Quantile statistics

Minimum: 1
5th percentile: 17.25
Q1: 82.25
Median: 163.5
Q3: 244.75
95th percentile: 309.75
Maximum: 326
Range: 325
Interquartile range (IQR): 162.5

Descriptive statistics

Standard deviation: 94.252321
Coefficient of variation (CV): 0.57646679
Kurtosis: -1.2
Mean: 163.5
Median Absolute Deviation (MAD): 81.5
Skewness: 0
Sum: 53301
Variance: 8883.5
Monotonicity: Strictly increasing
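
Because 연번 is strictly increasing with 326 distinct values between 1 and 326, the column is presumably the integer sequence 1..326. Under that assumption (the raw data is not reproduced here), the statistics above can be re-derived with the Python standard library:

```python
import statistics

# Assumption: 연번 is exactly the integers 1..326.
values = list(range(1, 327))

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)
# Linear interpolation between order statistics, as in the quantile table.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
mad = statistics.median(abs(v - median) for v in values)

print(sum(values), mean, variance)   # 53301 163.5 8883.5
print(round(std, 6), round(std / mean, 8))
print(q1, median, q3, q3 - q1, mad)  # 82.25 163.5 244.75 162.5 81.5
```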
Histogram with fixed size bins (bins=50)
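
A minimal sketch of fixed-width binning, assuming that 연번 is the sequence 1..326 and that the maximum value is placed in the last bin. The `fixed_width_histogram` helper is illustrative and is not the plotting code used by the report.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins  # assumes hi > lo
    counts = [0] * bins
    for v in values:
        # The maximum lies on the final edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


counts = fixed_width_histogram(list(range(1, 327)))  # assumption: 연번 is 1..326
print(len(counts), sum(counts))  # 50 326
```

With a range of 325 and 50 bins, each bin is 6.5 units wide, so every bin holds either 6 or 7 consecutive serial numbers.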
Common values
Value | Count | Frequency (%)
206 | 1 | 0.3%
224 | 1 | 0.3%
223 | 1 | 0.3%
222 | 1 | 0.3%
221 | 1 | 0.3%
220 | 1 | 0.3%
219 | 1 | 0.3%
218 | 1 | 0.3%
217 | 1 | 0.3%
216 | 1 | 0.3%
Other values (316) | 316 | 92.1%
(Missing) | 17 | 5.0%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 0.3%
2 | 1 | 0.3%
3 | 1 | 0.3%
4 | 1 | 0.3%
5 | 1 | 0.3%
6 | 1 | 0.3%
7 | 1 | 0.3%
8 | 1 | 0.3%
9 | 1 | 0.3%
10 | 1 | 0.3%

Maximum 10 values
Value | Count | Frequency (%)
326 | 1 | 0.3%
325 | 1 | 0.3%
324 | 1 | 0.3%
323 | 1 | 0.3%
322 | 1 | 0.3%
321 | 1 | 0.3%
320 | 1 | 0.3%
319 | 1 | 0.3%
318 | 1 | 0.3%
317 | 1 | 0.3%

행정동
Categorical

HIGH CORRELATION 

Distinct: 16
Distinct (%): 4.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 KiB

청학동: 63
연수1동: 53
옥련1동: 38
옥련2동: 30
선학동: 29
Other values (11): 130

Length

Max length: 4
Median length: 4
Mean length: 3.7230321
Min length: 1

Unique

Unique: 1
Unique (%): 0.3%

Sample

1st row: 옥련1동
2nd row: 옥련1동
3rd row: 옥련1동
4th row: 옥련1동
5th row: 옥련1동

Common Values

Value | Count | Frequency (%)
청학동 | 63 | 18.4%
연수1동 | 53 | 15.5%
옥련1동 | 38 | 11.1%
옥련2동 | 30 | 8.7%
선학동 | 29 | 8.5%
연수2동 | 28 | 8.2%
송도1동 | 26 | 7.6%
동춘1동 | 21 | 6.1%
<NA> | 16 | 4.7%
동춘2동 | 9 | 2.6%
Other values (6) | 30 | 8.7%

Length

Histogram of lengths of the category

구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 KiB

기독교: 326
<NA>: 17

Length

Max length: 4
Median length: 3
Mean length: 3.0495627
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 기독교
2nd row: 기독교
3rd row: 기독교
4th row: 기독교
5th row: 기독교

Common Values

Value | Count | Frequency (%)
기독교 | 326 | 95.0%
<NA> | 17 | 5.0%
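
The 71.5% imbalance flagged for 구분 is consistent with a normalized-entropy score, 1 - H / log2(k), where H is the Shannon entropy of the class frequencies and k is the number of classes. The sketch below shows that formulation. That ydata-profiling computes the alert exactly this way is an assumption, and `imbalance_score` is an illustrative name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for balanced classes, near 1 for one dominant class."""
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(counts))


score = imbalance_score([326, 17])  # 기독교 vs. the "<NA>" category
print(f"{100 * score:.1f}%")  # 71.5%
```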

Length

Histogram of lengths of the category


종교시설명
Text

MISSING 

Distinct: 298
Distinct (%): 91.4%
Missing: 17
Missing (%): 5.0%
Memory size: 2.8 KiB

Length

Max length: 25
Median length: 19
Mean length: 6.4570552
Min length: 3

Characters and Unicode

Total characters: 2105
Distinct characters: 249
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
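
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using sample values that appear in this report. The two-letter codes correspond to the category names used in the tables: `Lo` is Other Letter, `Nd` is Decimal Number, `Zs` is Space Separator, `Pd` is Dash Punctuation, and `Po` is Other Punctuation. The standard library does not expose script or block names, so only categories are shown.

```python
import unicodedata
from collections import Counter

# Sample values copied from this report (종교시설명, 전화번호, 주소).
samples = [
    "영광의교회",
    "그루터기 교회",
    "032-831-9101",
    "인천광역시 연수구 청량로184번길 48, 4층",
]


def category_counts(text):
    """Count the characters of `text` by Unicode general category."""
    return Counter(unicodedata.category(ch) for ch in text)


for s in samples:
    print(s, dict(category_counts(s)))
```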

Unique

Unique: 273
Unique (%): 83.7%

Sample

1st row: 영광의교회
2nd row: 생수감리교회
3rd row: 옥련중앙장로교회
4th row: 형통한교회기도원
5th row: 예일교회
Value | Count | Frequency (%)
교회 | 10 | 2.8%
순복음 | 5 | 1.4%
소망장로교회 | 3 | 0.8%
하늘교회 | 3 | 0.8%
영광교회 | 3 | 0.8%
은혜교회 | 3 | 0.8%
반석교회 | 3 | 0.8%
대한예수교장로회 | 2 | 0.6%
새언약교회 | 2 | 0.6%
예수사랑감리교회 | 2 | 0.6%
Other values (302) | 321 | 89.9%

Most occurring characters

ValueCountFrequency (%)
342
 
16.2%
340
 
16.2%
79
 
3.8%
69
 
3.3%
44
 
2.1%
44
 
2.1%
38
 
1.8%
36
 
1.7%
32
 
1.5%
32
 
1.5%
Other values (239) 1049
49.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 2031
96.5%
Space Separator 32
 
1.5%
Lowercase Letter 20
 
1.0%
Open Punctuation 8
 
0.4%
Close Punctuation 8
 
0.4%
Uppercase Letter 6
 
0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
342
 
16.8%
340
 
16.7%
79
 
3.9%
69
 
3.4%
44
 
2.2%
44
 
2.2%
38
 
1.9%
36
 
1.8%
32
 
1.6%
28
 
1.4%
Other values (220) 979
48.2%
Lowercase Letter
ValueCountFrequency (%)
e 4
20.0%
m 3
15.0%
u 2
10.0%
r 2
10.0%
h 2
10.0%
o 1
 
5.0%
c 1
 
5.0%
y 1
 
5.0%
t 1
 
5.0%
i 1
 
5.0%
Other values (2) 2
10.0%
Uppercase Letter
ValueCountFrequency (%)
C 3
50.0%
R 1
 
16.7%
S 1
 
16.7%
I 1
 
16.7%
Space Separator
ValueCountFrequency (%)
32
100.0%
Open Punctuation
ValueCountFrequency (%)
( 8
100.0%
Close Punctuation
ValueCountFrequency (%)
) 8
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 2031
96.5%
Common 48
 
2.3%
Latin 26
 
1.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
342
 
16.8%
340
 
16.7%
79
 
3.9%
69
 
3.4%
44
 
2.2%
44
 
2.2%
38
 
1.9%
36
 
1.8%
32
 
1.6%
28
 
1.4%
Other values (220) 979
48.2%
Latin
ValueCountFrequency (%)
e 4
15.4%
C 3
11.5%
m 3
11.5%
u 2
 
7.7%
r 2
 
7.7%
h 2
 
7.7%
o 1
 
3.8%
c 1
 
3.8%
y 1
 
3.8%
t 1
 
3.8%
Other values (6) 6
23.1%
Common
ValueCountFrequency (%)
32
66.7%
( 8
 
16.7%
) 8
 
16.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 2031
96.5%
ASCII 74
 
3.5%

Most frequent character per block

Hangul
ValueCountFrequency (%)
342
 
16.8%
340
 
16.7%
79
 
3.9%
69
 
3.4%
44
 
2.2%
44
 
2.2%
38
 
1.9%
36
 
1.8%
32
 
1.6%
28
 
1.4%
Other values (220) 979
48.2%
ASCII
ValueCountFrequency (%)
32
43.2%
( 8
 
10.8%
) 8
 
10.8%
e 4
 
5.4%
C 3
 
4.1%
m 3
 
4.1%
u 2
 
2.7%
r 2
 
2.7%
h 2
 
2.7%
o 1
 
1.4%
Other values (9) 9
 
12.2%

전화번호
Text

MISSING 

Distinct: 162
Distinct (%): 95.3%
Missing: 173
Missing (%): 50.4%
Memory size: 2.8 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 2040
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 154
Unique (%): 90.6%

Sample

1st row: 032-831-9101
2nd row: 032-833-6263
3rd row: 032-421-8291
4th row: 032-833-1451
5th row: 032-831-8290
Value | Count | Frequency (%)
032-858-3600 | 2 | 1.2%
032-818-0691 | 2 | 1.2%
032-811-6000 | 2 | 1.2%
032-831-2245 | 2 | 1.2%
032-215-7312 | 2 | 1.2%
032-833-0391 | 2 | 1.2%
032-858-0091 | 2 | 1.2%
032-859-5000 | 2 | 1.2%
032-831-8545 | 1 | 0.6%
032-812-7089 | 1 | 0.6%
Other values (152) | 152 | 89.4%

Most occurring characters

ValueCountFrequency (%)
- 340
16.7%
0 297
14.6%
3 296
14.5%
2 281
13.8%
8 228
11.2%
1 215
10.5%
9 96
 
4.7%
5 81
 
4.0%
4 71
 
3.5%
6 69
 
3.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1700
83.3%
Dash Punctuation 340
 
16.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 297
17.5%
3 296
17.4%
2 281
16.5%
8 228
13.4%
1 215
12.6%
9 96
 
5.6%
5 81
 
4.8%
4 71
 
4.2%
6 69
 
4.1%
7 66
 
3.9%
Dash Punctuation
ValueCountFrequency (%)
- 340
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2040
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 340
16.7%
0 297
14.6%
3 296
14.5%
2 281
13.8%
8 228
11.2%
1 215
10.5%
9 96
 
4.7%
5 81
 
4.0%
4 71
 
3.5%
6 69
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2040
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 340
16.7%
0 297
14.6%
3 296
14.5%
2 281
13.8%
8 228
11.2%
1 215
10.5%
9 96
 
4.7%
5 81
 
4.0%
4 71
 
3.5%
6 69
 
3.4%

주소
Text

MISSING 

Distinct: 309
Distinct (%): 94.8%
Missing: 17
Missing (%): 5.0%
Memory size: 2.8 KiB

Length

Max length: 50
Median length: 40
Mean length: 22.484663
Min length: 8

Characters and Unicode

Total characters: 7330
Distinct characters: 180
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 295
Unique (%): 90.5%

Sample

1st row: 인천광역시 연수구 독배로 25
2nd row: 인천광역시 청량로155번길 54
3rd row: 인천광역시 인권로9번길 67
4th row: 인천광역시 연수구 한나루로86번길 26
5th row: 인천광역시 연수구 옥련로 142
Value | Count | Frequency (%)
인천광역시 | 325 | 22.2%
연수구 | 321 | 22.0%
상가 | 17 | 1.2%
원인재로 | 11 | 0.8%
먼우금로 | 10 | 0.7%
20 | 8 | 0.5%
9 | 8 | 0.5%
새말로 | 7 | 0.5%
청능대로 | 7 | 0.5%
21 | 7 | 0.5%
Other values (428) | 740 | 50.7%

Most occurring characters

ValueCountFrequency (%)
1139
 
15.5%
347
 
4.7%
335
 
4.6%
334
 
4.6%
329
 
4.5%
329
 
4.5%
327
 
4.5%
327
 
4.5%
325
 
4.4%
322
 
4.4%
Other values (170) 3216
43.9%

Most occurring categories

ValueCountFrequency (%)
Other Letter 4747
64.8%
Decimal Number 1311
 
17.9%
Space Separator 1139
 
15.5%
Dash Punctuation 43
 
0.6%
Other Punctuation 40
 
0.5%
Open Punctuation 18
 
0.2%
Close Punctuation 18
 
0.2%
Uppercase Letter 11
 
0.2%
Math Symbol 3
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
347
 
7.3%
335
 
7.1%
334
 
7.0%
329
 
6.9%
329
 
6.9%
327
 
6.9%
327
 
6.9%
325
 
6.8%
322
 
6.8%
185
 
3.9%
Other values (146) 1587
33.4%
Decimal Number
ValueCountFrequency (%)
1 255
19.5%
2 211
16.1%
4 147
11.2%
3 132
10.1%
5 110
8.4%
0 105
8.0%
6 104
7.9%
7 85
 
6.5%
8 84
 
6.4%
9 78
 
5.9%
Uppercase Letter
ValueCountFrequency (%)
D 3
27.3%
A 2
18.2%
T 1
 
9.1%
I 1
 
9.1%
M 1
 
9.1%
B 1
 
9.1%
E 1
 
9.1%
C 1
 
9.1%
Space Separator
ValueCountFrequency (%)
1139
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 43
100.0%
Other Punctuation
ValueCountFrequency (%)
, 40
100.0%
Open Punctuation
ValueCountFrequency (%)
( 18
100.0%
Close Punctuation
ValueCountFrequency (%)
) 18
100.0%
Math Symbol
ValueCountFrequency (%)
~ 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 4747
64.8%
Common 2572
35.1%
Latin 11
 
0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
347
 
7.3%
335
 
7.1%
334
 
7.0%
329
 
6.9%
329
 
6.9%
327
 
6.9%
327
 
6.9%
325
 
6.8%
322
 
6.8%
185
 
3.9%
Other values (146) 1587
33.4%
Common
ValueCountFrequency (%)
1139
44.3%
1 255
 
9.9%
2 211
 
8.2%
4 147
 
5.7%
3 132
 
5.1%
5 110
 
4.3%
0 105
 
4.1%
6 104
 
4.0%
7 85
 
3.3%
8 84
 
3.3%
Other values (6) 200
 
7.8%
Latin
ValueCountFrequency (%)
D 3
27.3%
A 2
18.2%
T 1
 
9.1%
I 1
 
9.1%
M 1
 
9.1%
B 1
 
9.1%
E 1
 
9.1%
C 1
 
9.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 4747
64.8%
ASCII 2583
35.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1139
44.1%
1 255
 
9.9%
2 211
 
8.2%
4 147
 
5.7%
3 132
 
5.1%
5 110
 
4.3%
0 105
 
4.1%
6 104
 
4.0%
7 85
 
3.3%
8 84
 
3.3%
Other values (14) 211
 
8.2%
Hangul
ValueCountFrequency (%)
347
 
7.3%
335
 
7.1%
334
 
7.0%
329
 
6.9%
329
 
6.9%
327
 
6.9%
327
 
6.9%
325
 
6.8%
322
 
6.8%
185
 
3.9%
Other values (146) 1587
33.4%

우편번호
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 87
Distinct (%): 26.7%
Missing: 17
Missing (%): 5.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 21942.049
Minimum: 21900
Maximum: 22011
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.1 KiB

Quantile statistics

Minimum: 21900
5th percentile: 21904.25
Q1: 21917.25
Median: 21939
Q3: 21958
95th percentile: 22000
Maximum: 22011
Range: 111
Interquartile range (IQR): 40.75

Descriptive statistics

Standard deviation: 29.34491
Coefficient of variation (CV): 0.0013373824
Kurtosis: -0.5250567
Mean: 21942.049
Median Absolute Deviation (MAD): 21.5
Skewness: 0.67408652
Sum: 7153108
Variance: 861.12374
Monotonicity: Not monotonic
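
Several of the 우편번호 figures are derived from one another. The sketch below re-derives them from the values printed in this report, as a consistency check only. The raw data is not used, and small differences come from rounding in the report.

```python
# Values copied from the 우편번호 statistics in this report.
count = 343 - 17  # non-missing observations
total = 7153108
std = 29.34491
minimum, maximum = 21900, 22011
q1, q3 = 21917.25, 21958

mean = total / count
cv = std / mean  # coefficient of variation

print(round(mean, 3))      # 21942.049
print(maximum - minimum)   # 111
print(q3 - q1)             # 40.75
print(round(std ** 2, 2))  # close to the reported variance of 861.12374
print(cv)                  # close to the reported CV of 0.0013373824
```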
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
21929 | 16 | 4.7%
21917 | 15 | 4.4%
21921 | 11 | 3.2%
21909 | 10 | 2.9%
21948 | 10 | 2.9%
21919 | 10 | 2.9%
21939 | 9 | 2.6%
21940 | 9 | 2.6%
21925 | 8 | 2.3%
21911 | 8 | 2.3%
Other values (77) | 220 | 64.1%
(Missing) | 17 | 5.0%

Minimum 10 values
Value | Count | Frequency (%)
21900 | 2 | 0.6%
21901 | 4 | 1.2%
21902 | 1 | 0.3%
21903 | 4 | 1.2%
21904 | 6 | 1.7%
21905 | 1 | 0.3%
21906 | 5 | 1.5%
21907 | 1 | 0.3%
21908 | 1 | 0.3%
21909 | 10 | 2.9%

Maximum 10 values
Value | Count | Frequency (%)
22011 | 2 | 0.6%
22009 | 1 | 0.3%
22008 | 1 | 0.3%
22007 | 2 | 0.6%
22003 | 3 | 0.9%
22002 | 1 | 0.3%
22001 | 4 | 1.2%
22000 | 6 | 1.7%
21998 | 2 | 0.6%
21996 | 4 | 1.2%

Interactions

[Figures: four interaction plots; the images are not preserved in this text version]

Correlations

Variable | 연번 | 행정동 | 우편번호
연번 | 1.000 | 0.941 | 0.910
행정동 | 0.941 | 1.000 | 0.926
우편번호 | 0.910 | 0.926 | 1.000

Variable | 행정동 | 구분
행정동 | 1.000 | 1.000
구분 | 1.000 | 1.000

Variable | 연번 | 우편번호 | 행정동 | 구분
연번 | 1.000 | 0.349 | 0.768 | 1.000
우편번호 | 0.349 | 1.000 | 0.716 | 1.000
행정동 | 0.768 | 0.716 | 1.000 | 1.000
구분 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
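
Nullity correlation can be read as the Pearson correlation between per-column missingness indicators (1 = missing, 0 = present). The sketch below builds toy indicators from the missing counts in this report. It assumes, without access to the raw rows, that the 17 rows missing 연번 are the same rows missing 주소 and are among the 173 rows missing 전화번호.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


# Toy indicators for 343 rows; the row alignment is an assumption (see above).
serial_missing = [0] * 326 + [1] * 17
address_missing = [0] * 326 + [1] * 17
phone_missing = [0] * 170 + [1] * 156 + [1] * 17

r_same = pearson(serial_missing, address_missing)
r_phone = pearson(serial_missing, phone_missing)
print(round(r_same, 3), round(r_phone, 3))
```

Columns that are always missing together score 1, while a column that is also missing in many otherwise complete rows, as 전화번호 is here, scores much lower.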

Sample

(index) | 연번 | 행정동 | 구분 | 종교시설명 | 전화번호 | 주소 | 우편번호
0 | 1 | 옥련1동 | 기독교 | 영광의교회 | 032-831-9101 | 인천광역시 연수구 독배로 25 | 21958
1 | 2 | 옥련1동 | 기독교 | 생수감리교회 | <NA> | 인천광역시 청량로155번길 54 | 21946
2 | 3 | 옥련1동 | 기독교 | 옥련중앙장로교회 | 032-833-6263 | 인천광역시 인권로9번길 67 | 21946
3 | 4 | 옥련1동 | 기독교 | 형통한교회기도원 | 032-421-8291 | 인천광역시 연수구 한나루로86번길 26 | 21946
4 | 5 | 옥련1동 | 기독교 | 예일교회 | 032-833-1451 | 인천광역시 연수구 옥련로 142 | 21953
5 | 6 | 옥련1동 | 기독교 | 영광축복교회 | 032-831-8290 | 인천광역시 연수구 한나루로 80 | 21946
6 | 7 | 옥련1동 | 기독교 | 그루터기 교회 | <NA> | 인천광역시 연수구 청량로184번길 48, 4층 | 21953
7 | 8 | 옥련1동 | 기독교 | 하모니교회 | 032-833-3440 | 인천광역시 연수구 한나루로 183 | 21954
8 | 9 | 옥련1동 | 기독교 | 송도교회 | 032-832-5763 | 인천광역시 연수구 청룡로 20 | 21941
9 | 10 | 옥련1동 | 기독교 | 선목교회 | <NA> | 인천광역시 연수구 한나루로163번길 6 | 21953
연번행정동구분종교시설명전화번호주소우편번호
333<NA><NA><NA><NA><NA><NA><NA>
334<NA><NA><NA><NA><NA><NA><NA>
335<NA><NA><NA><NA><NA><NA><NA>
336<NA><NA><NA><NA><NA><NA><NA>
337<NA><NA><NA><NA><NA><NA><NA>
338<NA><NA><NA><NA><NA><NA><NA>
339<NA><NA><NA><NA><NA><NA><NA>
340<NA><NA><NA><NA><NA><NA><NA>
341<NA><NA><NA><NA><NA><NA><NA>
342<NA><NA><NA><NA><NA><NA>

Duplicate rows

Most frequently occurring

(index) | 연번 | 행정동 | 구분 | 종교시설명 | 전화번호 | 주소 | 우편번호 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 16
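
One way to read this table is that a single row pattern (every field missing) occurs 16 times. A minimal sketch of that kind of grouping on toy rows follows. The rows are illustrative and shortened to three fields, and `duplicate_patterns` is a hypothetical helper, not the report's own implementation.

```python
from collections import Counter

NA = None  # stand-in for a missing value

# Toy rows: two real-looking records plus three fully empty ones.
rows = [
    (1, "옥련1동", "기독교"),
    (2, "옥련1동", "기독교"),
    (NA, NA, NA),
    (NA, NA, NA),
    (NA, NA, NA),
]


def duplicate_patterns(rows):
    """Map each row pattern that occurs more than once to its occurrence count."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


dups = duplicate_patterns(rows)
print(len(dups))                   # 1 duplicated pattern
print(dups[(NA, NA, NA)])          # 3 occurrences in this toy data
```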