Overview

Dataset statistics

Number of variables: 4
Number of observations: 169
Missing cells: 468
Missing cells (%): 69.2%
Duplicate rows: 1
Duplicate rows (%): 0.6%
Total size in memory: 5.4 KiB
Average record size in memory: 32.8 B
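
The two percentages above can be reproduced from the raw counts. The following is a minimal sketch, assuming missing cells are divided by the total number of cells (variables times observations) and duplicate rows by the number of observations; the variable names are illustrative and not taken from the report.

```python
# Figures copied from the Dataset statistics above.
n_variables = 4
n_observations = 169
missing_cells = 468
duplicate_rows = 1

# Assumption: the denominator for missing cells is every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```

The 468 missing cells also match three text variables with 156 missing values each.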

Variable types

Text: 3
Categorical: 1

Dataset

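Description: Data on the status of obstetrics and gynecology clinics in Seo-gu, Incheon Metropolitan City. It provides information such as the clinic name (의원명), location (소재지), and telephone number (전화번호).
URL: https://www.data.go.kr/data/15086608/fileData.do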

Alerts

Dataset has 1 (0.6%) duplicate rows [Duplicates]
데이터기준일자 is highly imbalanced (60.9%) [Imbalance]
의원명 has 156 (92.3%) missing values [Missing]
소재지 has 156 (92.3%) missing values [Missing]
전화번호 has 156 (92.3%) missing values [Missing]

Reproduction

Analysis started: 2023-12-12 14:20:57.870800
Analysis finished: 2023-12-12 14:20:58.332028
Duration: 0.46 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

의원명
Text

MISSING 

Distinct: 13
Distinct (%): 100.0%
Missing: 156
Missing (%): 92.3%
Memory size: 1.4 KiB

Length

Max length: 18
Median length: 9
Mean length: 9.3846154
Min length: 7

Characters and Unicode

Total characters: 122
Distinct characters: 38
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 13
Unique (%): 100.0%

Sample

1st row: 검단아라산부인과의원
2nd row: 더나아산부인과의원
3rd row: 이해림산부인과의원
4th row: 강남산부인과의원
5th row: 금산부인과의원

Value | Count | Frequency (%)
검단아라산부인과의원 | 1 | 7.7%
더나아산부인과의원 | 1 | 7.7%
이해림산부인과의원 | 1 | 7.7%
강남산부인과의원 | 1 | 7.7%
금산부인과의원 | 1 | 7.7%
그린산부인과의원 | 1 | 7.7%
한사랑산부인과의원 | 1 | 7.7%
엄마와아이들소아청소년과산부인과의원 | 1 | 7.7%
연세산부인과의원 | 1 | 7.7%
김승연산부인과의원 | 1 | 7.7%
Other values (3) | 3 | 23.1%

Most occurring characters

ValueCountFrequency (%)
14
11.5%
14
11.5%
13
10.7%
13
10.7%
13
10.7%
13
10.7%
5
 
4.1%
2
 
1.6%
2
 
1.6%
2
 
1.6%
Other values (28) 31
25.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 122 | 100.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
14
11.5%
14
11.5%
13
10.7%
13
10.7%
13
10.7%
13
10.7%
5
 
4.1%
2
 
1.6%
2
 
1.6%
2
 
1.6%
Other values (28) 31
25.4%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 122 | 100.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
14
11.5%
14
11.5%
13
10.7%
13
10.7%
13
10.7%
13
10.7%
5
 
4.1%
2
 
1.6%
2
 
1.6%
2
 
1.6%
Other values (28) 31
25.4%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 122 | 100.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
14
11.5%
14
11.5%
13
10.7%
13
10.7%
13
10.7%
13
10.7%
5
 
4.1%
2
 
1.6%
2
 
1.6%
2
 
1.6%
Other values (28) 31
25.4%

소재지
Text

MISSING 

Distinct: 13
Distinct (%): 100.0%
Missing: 156
Missing (%): 92.3%
Memory size: 1.4 KiB

Length

Max length: 42
Median length: 41
Mean length: 31.153846
Min length: 22

Characters and Unicode

Total characters: 405
Distinct characters: 80
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 13
Unique (%): 100.0%

Sample

1st row: 인천광역시 서구 이음대로 392, 메트로시티 5층 512~514호 (원당동)
2nd row: 인천광역시 서구 검단로 480, 검단리치웰프라자 5층 (왕길동)
3rd row: 인천광역시 서구 서곶로 355-1, 4층 (연희동, 성우빌딩)
4th row: 인천광역시 서구 검단로 474, 405호, 406호 (왕길동, 뷰비스타운)
5th row: 인천광역시 서구 승학로 528 (검암동)

Value | Count | Frequency (%)
인천광역시 | 13 | 15.7%
서구 | 13 | 15.7%
왕길동 | 3 | 3.6%
가좌동 | 3 | 3.6%
연희동 | 2 | 2.4%
검단로 | 2 | 2.4%
3층 | 2 | 2.4%
서곶로 | 2 | 2.4%
가정로 | 2 | 2.4%
5층 | 2 | 2.4%
Other values (38) | 39 | 47.0%
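
The table above counts address tokens with the surrounding punctuation removed (for example, 왕길동 rather than "(왕길동)"). The sketch below approximates that behaviour on the five sample rows, assuming whitespace splitting followed by stripping ASCII punctuation from both ends of each token. The helper `tokenize` is hypothetical, and the exact tokenization used by the profiler is not stated in the report.

```python
import string
from collections import Counter

# The five sample rows of the 소재지 column, copied from the report.
addresses = [
    "인천광역시 서구 이음대로 392, 메트로시티 5층 512~514호 (원당동)",
    "인천광역시 서구 검단로 480, 검단리치웰프라자 5층 (왕길동)",
    "인천광역시 서구 서곶로 355-1, 4층 (연희동, 성우빌딩)",
    "인천광역시 서구 검단로 474, 405호, 406호 (왕길동, 뷰비스타운)",
    "인천광역시 서구 승학로 528 (검암동)",
]


def tokenize(text):
    """Split on whitespace and strip leading/trailing ASCII punctuation (assumed rule)."""
    stripped = (token.strip(string.punctuation) for token in text.split())
    return [token for token in stripped if token]


token_counts = Counter(token for address in addresses for token in tokenize(address))
print(token_counts.most_common(5))
```

Because only 5 of the 13 non-missing rows are used here, the counts are smaller than those in the table.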

Most occurring characters

ValueCountFrequency (%)
70
 
17.3%
, 15
 
3.7%
15
 
3.7%
15
 
3.7%
14
 
3.5%
) 14
 
3.5%
( 14
 
3.5%
13
 
3.2%
13
 
3.2%
13
 
3.2%
Other values (70) 209
51.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 227 | 56.0%
Space Separator | 70 | 17.3%
Decimal Number | 63 | 15.6%
Other Punctuation | 15 | 3.7%
Close Punctuation | 14 | 3.5%
Open Punctuation | 14 | 3.5%
Math Symbol | 1 | 0.2%
Dash Punctuation | 1 | 0.2%
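
The category names in this table correspond to Unicode general categories. As a minimal sketch, Python's standard `unicodedata` module can be used to classify the characters of the first sample address; it returns two-letter codes (for example `Lo` for Other Letter) rather than the long names shown in the report.

```python
import unicodedata
from collections import Counter

# First sample row of the 소재지 column, copied from the report.
address = "인천광역시 서구 이음대로 392, 메트로시티 5층 512~514호 (원당동)"

# unicodedata.category returns the general category code of one character:
# Lo = Other Letter, Zs = Space Separator, Nd = Decimal Number,
# Po = Other Punctuation, Ps/Pe = Open/Close Punctuation, Sm = Math Symbol.
category_counts = Counter(unicodedata.category(ch) for ch in address)

for code, count in category_counts.most_common():
    print(code, count)
```

In this address the tilde in "512~514호" is classified as a Math Symbol (`Sm`), which is consistent with the single Math Symbol reported for the column.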

Most frequent character per category

Other Letter
ValueCountFrequency (%)
15
 
6.6%
15
 
6.6%
14
 
6.2%
13
 
5.7%
13
 
5.7%
13
 
5.7%
13
 
5.7%
13
 
5.7%
13
 
5.7%
7
 
3.1%
Other values (54) 98
43.2%
Decimal Number
Value | Count | Frequency (%)
5 | 11 | 17.5%
0 | 9 | 14.3%
4 | 8 | 12.7%
3 | 8 | 12.7%
1 | 7 | 11.1%
8 | 7 | 11.1%
6 | 5 | 7.9%
7 | 4 | 6.3%
2 | 3 | 4.8%
9 | 1 | 1.6%
Space Separator
Value | Count | Frequency (%)
(space) | 70 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
, | 15 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 14 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 14 | 100.0%
Math Symbol
Value | Count | Frequency (%)
~ | 1 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 227 | 56.0%
Common | 178 | 44.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
15
 
6.6%
15
 
6.6%
14
 
6.2%
13
 
5.7%
13
 
5.7%
13
 
5.7%
13
 
5.7%
13
 
5.7%
13
 
5.7%
7
 
3.1%
Other values (54) 98
43.2%
Common
Value | Count | Frequency (%)
(space) | 70 | 39.3%
, | 15 | 8.4%
) | 14 | 7.9%
( | 14 | 7.9%
5 | 11 | 6.2%
0 | 9 | 5.1%
4 | 8 | 4.5%
3 | 8 | 4.5%
1 | 7 | 3.9%
8 | 7 | 3.9%
Other values (6) | 15 | 8.4%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 227 | 56.0%
ASCII | 178 | 44.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 70 | 39.3%
, | 15 | 8.4%
) | 14 | 7.9%
( | 14 | 7.9%
5 | 11 | 6.2%
0 | 9 | 5.1%
4 | 8 | 4.5%
3 | 8 | 4.5%
1 | 7 | 3.9%
8 | 7 | 3.9%
Other values (6) | 15 | 8.4%
Hangul
ValueCountFrequency (%)
15
 
6.6%
15
 
6.6%
14
 
6.2%
13
 
5.7%
13
 
5.7%
13
 
5.7%
13
 
5.7%
13
 
5.7%
13
 
5.7%
7
 
3.1%
Other values (54) 98
43.2%

전화번호
Text

MISSING 

Distinct: 13
Distinct (%): 100.0%
Missing: 156
Missing (%): 92.3%
Memory size: 1.4 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 156
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 13
Unique (%): 100.0%

Sample

1st row: 032-569-7575
2nd row: 032-567-5732
3rd row: 032-564-1800
4th row: 032-569-3582
5th row: 032-569-0880

Value | Count | Frequency (%)
032-569-7575 | 1 | 7.7%
032-567-5732 | 1 | 7.7%
032-564-1800 | 1 | 7.7%
032-569-3582 | 1 | 7.7%
032-569-0880 | 1 | 7.7%
032-574-3535 | 1 | 7.7%
032-561-4313 | 1 | 7.7%
032-572-1771 | 1 | 7.7%
032-566-0001 | 1 | 7.7%
032-563-1251 | 1 | 7.7%
Other values (3) | 3 | 23.1%
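
Every listed value is 12 characters long and follows the shape 032-NNN-NNNN. A minimal validation sketch is shown below; the pattern is inferred from the listed values only, and `is_valid_phone` is a hypothetical helper, so other valid formats in the source data would need a looser pattern.

```python
import re

# Pattern inferred from the listed values: area code 032, then 3 and 4 digits.
PHONE_PATTERN = re.compile(r"032-[0-9]{3}-[0-9]{4}")

# The five sample rows of the 전화번호 column, copied from the report.
samples = [
    "032-569-7575",
    "032-567-5732",
    "032-564-1800",
    "032-569-3582",
    "032-569-0880",
]


def is_valid_phone(value):
    """Return True when the whole string matches the inferred pattern."""
    return PHONE_PATTERN.fullmatch(value) is not None


invalid = [value for value in samples if not is_valid_phone(value)]
print(invalid)
```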

Most occurring characters

Value | Count | Frequency (%)
- | 26 | 16.7%
0 | 22 | 14.1%
5 | 21 | 13.5%
3 | 20 | 12.8%
2 | 20 | 12.8%
6 | 12 | 7.7%
7 | 12 | 7.7%
1 | 9 | 5.8%
8 | 8 | 5.1%
9 | 3 | 1.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 130 | 83.3%
Dash Punctuation | 26 | 16.7%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 22 | 16.9%
5 | 21 | 16.2%
3 | 20 | 15.4%
2 | 20 | 15.4%
6 | 12 | 9.2%
7 | 12 | 9.2%
1 | 9 | 6.9%
8 | 8 | 6.2%
9 | 3 | 2.3%
4 | 3 | 2.3%
Dash Punctuation
Value | Count | Frequency (%)
- | 26 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 156 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 26 | 16.7%
0 | 22 | 14.1%
5 | 21 | 13.5%
3 | 20 | 12.8%
2 | 20 | 12.8%
6 | 12 | 7.7%
7 | 12 | 7.7%
1 | 9 | 5.8%
8 | 8 | 5.1%
9 | 3 | 1.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 156 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 26 | 16.7%
0 | 22 | 14.1%
5 | 21 | 13.5%
3 | 20 | 12.8%
2 | 20 | 12.8%
6 | 12 | 7.7%
7 | 12 | 7.7%
1 | 9 | 5.8%
8 | 8 | 5.1%
9 | 3 | 1.9%

데이터기준일자
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 KiB

<NA>: 156
2023-08-01: 13
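
The 60.9% imbalance figure reported for this variable can be reproduced from the two counts above. The sketch below assumes the score is one minus the Shannon entropy of the value distribution, normalised by the maximum entropy for that number of classes. This formula is an assumption that happens to match the reported value, and `imbalance_score` is an illustrative name.

```python
import math

# Value counts copied from the report for 데이터기준일자.
counts = {"<NA>": 156, "2023-08-01": 13}


def imbalance_score(value_counts):
    """Return 1 - H / log2(k): 0 for a uniform distribution, 1 for a single class."""
    total = sum(value_counts.values())
    n_classes = len(value_counts)
    if n_classes < 2:
        return 0.0  # assumed convention when there is only one class
    entropy = -sum(
        (count / total) * math.log2(count / total)
        for count in value_counts.values()
        if count > 0
    )
    return 1 - entropy / math.log2(n_classes)


score = imbalance_score(counts)
print(f"{score:.1%}")
```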

Length

Max length: 10
Median length: 4
Mean length: 4.4615385
Min length: 4
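
These length statistics are consistent with the placeholder "<NA>" being measured as a 4-character string alongside the 10-character date. A short check, assuming the value counts reported for this variable (156 and 13):

```python
# Value counts copied from the report; "<NA>" is treated as a literal string here.
counts = {"<NA>": 156, "2023-08-01": 13}

total_rows = sum(counts.values())
total_chars = sum(len(value) * n for value, n in counts.items())
mean_length = total_chars / total_rows

# Expand to one length per row and take the middle element
# (the row count is odd, so a single middle element exists).
lengths = sorted(len(value) for value, n in counts.items() for _ in range(n))
median_length = lengths[len(lengths) // 2]

print(round(mean_length, 7), median_length)
```

This suggests the 156 empty rows are present in this column as text rather than as true missing values, which would explain why the variable reports 0 missing.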

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-08-01
2nd row: 2023-08-01
3rd row: 2023-08-01
4th row: 2023-08-01
5th row: 2023-08-01

Common Values

Value | Count | Frequency (%)
<NA> | 156 | 92.3%
2023-08-01 | 13 | 7.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 156 | 92.3%
2023-08-01 | 13 | 7.7%

Correlations

Variable | 의원명 | 소재지 | 전화번호
의원명 | 1.000 | 1.000 | 1.000
소재지 | 1.000 | 1.000 | 1.000
전화번호 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

Index | 의원명 | 소재지 | 전화번호 | 데이터기준일자
0 | 검단아라산부인과의원 | 인천광역시 서구 이음대로 392, 메트로시티 5층 512~514호 (원당동) | 032-569-7575 | 2023-08-01
1 | 더나아산부인과의원 | 인천광역시 서구 검단로 480, 검단리치웰프라자 5층 (왕길동) | 032-567-5732 | 2023-08-01
2 | 이해림산부인과의원 | 인천광역시 서구 서곶로 355-1, 4층 (연희동, 성우빌딩) | 032-564-1800 | 2023-08-01
3 | 강남산부인과의원 | 인천광역시 서구 검단로 474, 405호, 406호 (왕길동, 뷰비스타운) | 032-569-3582 | 2023-08-01
4 | 금산부인과의원 | 인천광역시 서구 승학로 528 (검암동) | 032-569-0880 | 2023-08-01
5 | 그린산부인과의원 | 인천광역시 서구 건지로 371 (가좌동) | 032-574-3535 | 2023-08-01
6 | 한사랑산부인과의원 | 인천광역시 서구 원당대로 660, 영프라자 6,7(일부)층 (당하동) | 032-561-4313 | 2023-08-01
7 | 엄마와아이들소아청소년과산부인과의원 | 인천광역시 서구 가정로 388 (가정동) | 032-572-1771 | 2023-08-01
8 | 연세산부인과의원 | 인천광역시 서구 원적로 100, 604호 (가좌동, 로얄프라자) | 032-566-0001 | 2023-08-01
9 | 김승연산부인과의원 | 인천광역시 서구 서곶로 355, 3층 (연희동, 한터빌딩) | 032-563-1251 | 2023-08-01

Index | 의원명 | 소재지 | 전화번호 | 데이터기준일자
159 | <NA> | <NA> | <NA> | <NA>
160 | <NA> | <NA> | <NA> | <NA>
161 | <NA> | <NA> | <NA> | <NA>
162 | <NA> | <NA> | <NA> | <NA>
163 | <NA> | <NA> | <NA> | <NA>
164 | <NA> | <NA> | <NA> | <NA>
165 | <NA> | <NA> | <NA> | <NA>
166 | <NA> | <NA> | <NA> | <NA>
167 | <NA> | <NA> | <NA> | <NA>
168 | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

Index | 의원명 | 소재지 | 전화번호 | 데이터기준일자 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | 156
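
The single duplicated pattern is a row that is empty in every column, repeated 156 times. Before further analysis such rows would typically be dropped. The following is a minimal sketch on hypothetical in-memory rows, where `None` stands in for the report's `<NA>` and `is_empty` is an illustrative helper; it is not part of the profiling report.

```python
# Hypothetical rows in column order: 의원명, 소재지, 전화번호, 데이터기준일자.
rows = [
    ("금산부인과의원", "인천광역시 서구 승학로 528 (검암동)", "032-569-0880", "2023-08-01"),
    (None, None, None, None),
    (None, None, None, None),
]


def is_empty(row):
    """A row counts as empty when every cell is None or an empty string."""
    return all(cell is None or cell == "" for cell in row)


cleaned = [row for row in rows if not is_empty(row)]
n_dropped = len(rows) - len(cleaned)
print(len(cleaned), n_dropped)
```

Applied to the full dataset, a filter of this kind would be expected to leave the 13 populated rows, since 169 observations minus 156 empty rows is 13.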