Overview

Dataset statistics

Number of variables: 11
Number of observations: 77
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.8 KiB
Average record size in memory: 90.7 B

Variable types

Numeric: 1
Categorical: 8
Text: 2

Dataset

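Description: This dataset on the status of national cancer screening institutions in Bupyeong-gu, Incheon Metropolitan City provides each institution's name, telephone number, address, and the cancers it can screen for.
URL: https://www.data.go.kr/data/15103261/fileData.do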

Alerts

High correlation: 주소지 is highly overall correlated with 연번 and 7 other fields
High correlation: 일반검진 is highly overall correlated with 연번 and 7 other fields
High correlation: 자궁경부암 is highly overall correlated with 연번 and 7 other fields
High correlation: 유방암 is highly overall correlated with 연번 and 7 other fields
High correlation: 대장암 is highly overall correlated with 연번 and 7 other fields
High correlation: 폐암 is highly overall correlated with 연번 and 7 other fields
High correlation: 위암 is highly overall correlated with 연번 and 7 other fields
High correlation: 간암 is highly overall correlated with 연번 and 7 other fields
High correlation: 연번 is highly overall correlated with 주소지 and 7 other fields
Imbalance: 폐암 is highly imbalanced (76.2%)
Unique: 연번 has unique values
Unique: 검진기관명 has unique values
Unique: 전화전호 has unique values
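
The 76.2% in the 폐암 imbalance alert can be checked against the counts the report gives for that variable (74 `<NA>` rows and 3 marked rows). The sketch below assumes the score is one minus the Shannon entropy of the class shares, normalised by log2 of the number of classes. That definition is an assumption about how the profiler derives the figure, and `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the base-2 Shannon entropy of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Counts for 폐암 as reported: 74 "<NA>" rows and 3 marked rows.
lung_counts = [74, 3]
score = imbalance_score(lung_counts)
print(f"imbalance score: {score:.3f}")
```

Under this assumed definition a perfectly balanced two-class column scores 0 and a column dominated by one class approaches 1, so the value for 폐암 should land close to the reported 76.2%.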

Reproduction

Analysis started: 2023-12-12 06:02:39.147043
Analysis finished: 2023-12-12 06:02:40.130164
Duration: 0.98 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
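
The report does not show the command that produced it. The following is a minimal sketch of how a comparable report could be generated with ydata-profiling. The file paths, the `cp949` encoding, the report title, and the helper name `build_report` are all assumptions, not details taken from the source.

```python
def build_report(csv_path, html_path="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report (a sketch, not the original pipeline)."""
    # Imported lazily so the function can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption: CSV files from Korean public-data portals
    # are often CP949-encoded, but the actual file may differ.
    df = pd.read_csv(csv_path, encoding=encoding)

    # The title is illustrative only.
    profile = ProfileReport(df, title="Bupyeong-gu cancer screening institutions")
    profile.to_file(html_path)
    return profile
```

Calling `build_report("institutions.csv")` with a hypothetical local copy of the dataset would write `report.html` next to the script. Timings and memory figures will differ from those listed above.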

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 77
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 39
Minimum: 1
Maximum: 77
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 825.0 B

Quantile statistics

Minimum: 1
5-th percentile: 4.8
Q1: 20
Median: 39
Q3: 58
95-th percentile: 73.2
Maximum: 77
Range: 76
Interquartile range (IQR): 38

Descriptive statistics

Standard deviation: 22.371857
Coefficient of variation (CV): 0.57363737
Kurtosis: -1.2
Mean: 39
Median Absolute Deviation (MAD): 19
Skewness: 0
Sum: 3003
Variance: 500.5
Monotonicity: Strictly increasing
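
Because 연번 has 77 distinct values between 1 and 77, is strictly increasing, and sums to 3003, it is consistent with the plain sequence 1, 2, ..., 77. Assuming that, the quantile and descriptive statistics above can be recomputed with the standard library. The helper `linear_quantile` is a hypothetical name, and linear interpolation between ranks is an assumption about the quantile method.

```python
import math
import statistics

# Assumption: 연번 is the sequence 1..77 with no gaps.
values = list(range(1, 78))

def linear_quantile(sorted_values, p):
    """Quantile with linear interpolation between the two nearest ranks."""
    pos = (len(sorted_values) - 1) * p
    lower = math.floor(pos)
    upper = math.ceil(pos)
    weight = pos - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * weight

mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)          # sample standard deviation
cv = std / mean                         # coefficient of variation
mad = statistics.median(abs(v - median) for v in values)  # median absolute deviation

print("sum:", sum(values), "mean:", mean, "variance:", variance)
print("std:", round(std, 6), "cv:", round(cv, 8), "mad:", mad)
for p in (0.05, 0.25, 0.75, 0.95):
    print(f"quantile {p}:", round(linear_quantile(values, p), 2))
```

The sample (n - 1) variance is used because it matches the reported 500.5. The population variance of the same sequence would be 494.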
Histogram with fixed size bins (bins=50)
Common values

Value              Count  Frequency (%)
1                  1      1.3%
50                 1      1.3%
57                 1      1.3%
56                 1      1.3%
55                 1      1.3%
54                 1      1.3%
53                 1      1.3%
52                 1      1.3%
51                 1      1.3%
49                 1      1.3%
Other values (67)  67     87.0%

Minimum 10 values

Value  Count  Frequency (%)
1      1      1.3%
2      1      1.3%
3      1      1.3%
4      1      1.3%
5      1      1.3%
6      1      1.3%
7      1      1.3%
8      1      1.3%
9      1      1.3%
10     1      1.3%

Maximum 10 values

Value  Count  Frequency (%)
77     1      1.3%
76     1      1.3%
75     1      1.3%
74     1      1.3%
73     1      1.3%
72     1      1.3%
71     1      1.3%
70     1      1.3%
69     1      1.3%
68     1      1.3%

주소지 (address: administrative dong)
Categorical

HIGH CORRELATION

Distinct: 21
Distinct (%): 27.3%
Missing: 0
Missing (%): 0.0%
Memory size: 748.0 B

Length

Max length: 4
Median length: 4
Mean length: 3.987013
Min length: 3

Unique

Unique: 5
Unique (%): 6.5%

Sample

1st row: 갈산1동
2nd row: 갈산1동
3rd row: 갈산1동
4th row: 갈산2동
5th row: 구산동

Common Values

Value              Count  Frequency (%)
부평1동            14     18.2%
부평5동            10     13.0%
삼산2동            6      7.8%
십정2동            5      6.5%
부평6동            4      5.2%
부평4동            4      5.2%
청천2동            4      5.2%
부개3동            4      5.2%
부개1동            4      5.2%
갈산1동            3      3.9%
Other values (11)  19     24.7%

Length

Histogram of lengths of the category

Value              Count  Frequency (%)
부평1동            14     18.2%
부평5동            10     13.0%
삼산2동            6      7.8%
십정2동            5      6.5%
부평6동            4      5.2%
부평4동            4      5.2%
청천2동            4      5.2%
부개3동            4      5.2%
부개1동            4      5.2%
산곡1동            3      3.9%
Other values (11)  19     24.7%

검진기관명 (screening institution name)
Text

UNIQUE

Distinct: 77
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 748.0 B

Length

Max length: 10
Median length: 8
Mean length: 7.3376623
Min length: 4

Characters and Unicode

Total characters: 565
Distinct characters: 135
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 77
Unique (%): 100.0%

Sample

1st row: 전성희내과의원
2nd row: 파티마산부인과의원
3rd row: 행복플러스의원
4th row: 갈산중앙의원
5th row: 근로복지공단인천병원

Value               Count  Frequency (%)
전성희내과의원      1      1.3%
안정경내과의원      1      1.3%
다인이비인후과병원  1      1.3%
정속편안내과의원    1      1.3%
새봄여성병원        1      1.3%
부평사랑내과의원    1      1.3%
이수금내과의원      1      1.3%
부평정진의원        1      1.3%
부평바른내과의원    1      1.3%
부평봄내과의원      1      1.3%
Other values (67)   67     87.0%

Most occurring characters

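Value               Count  Frequency (%)
(not shown)         77     13.6%
(not shown)         72     12.7%
(not shown)         56     9.9%
(not shown)         36     6.4%
(not shown)         23     4.1%
(not shown)         16     2.8%
(not shown)         14     2.5%
(not shown)         13     2.3%
(not shown)         11     1.9%
(not shown)         9      1.6%
Other values (125)  238    42.1%

Most occurring categories

Value         Count  Frequency (%)
Other Letter  565    100.0%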

Most occurring scripts

Value   Count  Frequency (%)
Hangul  565    100.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  565    100.0%

전화전호 (telephone number)
Text

UNIQUE

Distinct: 77
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 748.0 B

Length

Max length: 12
Median length: 12
Mean length: 11.961039
Min length: 9

Characters and Unicode

Total characters: 921
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 77
Unique (%): 100.0%

Sample

1st row: 032-515-7533
2nd row: 032-511-7600
3rd row: 032-511-5475
4th row: 032-513-9303
5th row: 032-500-0114

Value              Count  Frequency (%)
032-515-7533       1      1.3%
032-514-9086       1      1.3%
032-515-2325       1      1.3%
032-519-7887       1      1.3%
032-521-7200       1      1.3%
032-526-0075       1      1.3%
032-268-7575       1      1.3%
032-506-5010       1      1.3%
032-299-7575       1      1.3%
032-514-8275       1      1.3%
Other values (67)  67     87.0%

Most occurring characters

Value  Count  Frequency (%)
-      153    16.6%
2      140    15.2%
0      138    15.0%
3      128    13.9%
5      106    11.5%
1      74     8.0%
7      58     6.3%
8      42     4.6%
4      29     3.1%
6      28     3.0%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    768    83.4%
Dash Punctuation  153    16.6%
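
The category names in this table are Unicode general categories. As a small illustration, the sketch below classifies the characters of the five sample phone numbers shown in the report with Python's `unicodedata` module. It covers only those five values, not the full column, so its counts are not the column totals.

```python
import unicodedata
from collections import Counter

# The five sample phone numbers shown in the report.
phones = [
    "032-515-7533",
    "032-511-7600",
    "032-511-5475",
    "032-513-9303",
    "032-500-0114",
]

# "Nd" is Decimal Number and "Pd" is Dash Punctuation in the Unicode standard.
category_counts = Counter(
    unicodedata.category(ch) for number in phones for ch in number
)
print(category_counts)
```

Each of these samples has two dashes and ten digits. The column totals (153 dashes over 77 values, with a minimum length of 9) suggest that one entry uses a shorter format with a single dash, although the report does not show which.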

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
2      140    18.2%
0      138    18.0%
3      128    16.7%
5      106    13.8%
1      74     9.6%
7      58     7.6%
8      42     5.5%
4      29     3.8%
6      28     3.6%
9      25     3.3%

Dash Punctuation
Value  Count  Frequency (%)
-      153    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  921    100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  921    100.0%

일반검진 (general health checkup)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 748.0 B

Length

Max length: 4
Median length: 1
Mean length: 1.5454545
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: (not shown)
2nd row: <NA>
3rd row: (not shown)
4th row: (not shown)
5th row: (not shown)

Common Values

Value        Count  Frequency (%)
(not shown)  63     81.8%
<NA>         14     18.2%
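
The report counts 0 missing cells for this column even though `<NA>` appears as one of its two categories. The reported mean lengths are consistent with `<NA>` being stored as a literal four-character string and the other value being a single character. The sketch below checks that reading against the counts reported for each screening column. `mean_length` is a hypothetical helper, and the one-character mark length is inferred from the reported minimum length of 1.

```python
NA_TOKEN = "<NA>"  # assumed to be stored as this literal 4-character string

def mean_length(marked_count, na_count, mark_length=1):
    """Mean string length when marks have mark_length chars and missing flags are NA_TOKEN."""
    total_chars = marked_count * mark_length + na_count * len(NA_TOKEN)
    return total_chars / (marked_count + na_count)

# (marked rows, "<NA>" rows) as reported for each screening column.
reported_counts = {
    "일반검진": (63, 14),
    "위암": (55, 22),
    "간암": (52, 25),
    "대장암": (42, 35),
    "유방암": (34, 43),
    "자궁경부암": (44, 33),
    "폐암": (3, 74),
}

for name, (marked, na) in reported_counts.items():
    print(name, round(mean_length(marked, na), 7))
```

If that reading is right, the `<NA>` strings would need to be converted to real missing values before profiling for the Missing counts to reflect them.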

Length

Histogram of lengths of the category

Common Values (Plot)

Value        Count  Frequency (%)
(not shown)  63     81.8%
na           14     18.2%

위암 (stomach cancer)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 748.0 B

Length

Max length: 4
Median length: 1
Mean length: 1.8571429
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: (not shown)
2nd row: <NA>
3rd row: (not shown)
4th row: (not shown)
5th row: (not shown)

Common Values

Value        Count  Frequency (%)
(not shown)  55     71.4%
<NA>         22     28.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value        Count  Frequency (%)
(not shown)  55     71.4%
na           22     28.6%

간암 (liver cancer)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 748.0 B

Length

Max length: 4
Median length: 1
Mean length: 1.974026
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: (not shown)
2nd row: <NA>
3rd row: (not shown)
4th row: <NA>
5th row: (not shown)

Common Values

Value        Count  Frequency (%)
(not shown)  52     67.5%
<NA>         25     32.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value        Count  Frequency (%)
(not shown)  52     67.5%
na           25     32.5%

대장암 (colorectal cancer)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 748.0 B

Length

Max length: 4
Median length: 1
Mean length: 2.3636364
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: (not shown)

Common Values

Value        Count  Frequency (%)
(not shown)  42     54.5%
<NA>         35     45.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value        Count  Frequency (%)
(not shown)  42     54.5%
na           35     45.5%

유방암 (breast cancer)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 748.0 B

Length

Max length: 4
Median length: 4
Mean length: 2.6753247
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: (not shown)

Common Values

Value        Count  Frequency (%)
<NA>         43     55.8%
(not shown)  34     44.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value        Count  Frequency (%)
na           43     55.8%
(not shown)  34     44.2%

자궁경부암 (cervical cancer)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 748.0 B

Length

Max length: 4
Median length: 1
Mean length: 2.2857143
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: (not shown)
3rd row: <NA>
4th row: <NA>
5th row: (not shown)

Common Values

Value        Count  Frequency (%)
(not shown)  44     57.1%
<NA>         33     42.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value        Count  Frequency (%)
(not shown)  44     57.1%
na           33     42.9%

폐암 (lung cancer)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 748.0 B

Length

Max length: 4
Median length: 4
Mean length: 3.8831169
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: (not shown)

Common Values

Value        Count  Frequency (%)
<NA>         74     96.1%
(not shown)  3      3.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value        Count  Frequency (%)
na           74     96.1%
(not shown)  3      3.9%

Interactions

(Figure: interactions plot)

Correlations

            연번   주소지  검진기관명  전화전호
연번        1.000  0.967  1.000  1.000
주소지      0.967  1.000  1.000  1.000
검진기관명  1.000  1.000  1.000  1.000
전화전호    1.000  1.000  1.000  1.000

            주소지  일반검진  자궁경부암  유방암  대장암  폐암   위암   간암
주소지      1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
일반검진    1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
자궁경부암  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
유방암      1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
대장암      1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
폐암        1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
위암        1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
간암        1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000

            연번   주소지  일반검진  위암   간암   대장암  유방암  자궁경부암  폐암
연번        1.000  0.750  1.000  1.000  1.000  1.000  1.000  1.000  1.000
주소지      0.750  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
일반검진    1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
위암        1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
간암        1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
대장암      1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
유방암      1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
자궁경부암  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
폐암        1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
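
The report does not state which association measure fills these matrices. As an illustration only, the sketch below computes an uncorrected Cramér's V on hypothetical toy data. Both the choice of statistic and the helper name `cramers_v` are assumptions, and the profiler's exact calculation may differ. It shows one way a coefficient of 1.000 can arise: a column whose values are unique per row, as 연번, 검진기관명 and 전화전호 are here, determines every other column perfectly.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long sequences of categories."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    dof = min(len(row_totals), len(col_totals)) - 1
    if dof == 0:
        return float("nan")  # undefined when either column is constant
    return math.sqrt(chi2 / (n * dof))

# Hypothetical toy data: a unique identifier against a binary flag.
ids = list(range(1, 9))
flags = ["Y", "N", "Y", "Y", "N", "Y", "N", "Y"]
print(cramers_v(ids, flags))

# Hypothetical toy data: two independent binary columns.
print(cramers_v(["a", "a", "b", "b"], ["Y", "N", "Y", "N"]))
```

With only 77 rows and several unique-valued columns, coefficients of 1.000 against those columns say little about any real relationship between location and the screenings offered.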

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

(index)  연번  주소지  검진기관명  전화전호  일반검진/위암/간암/대장암/유방암/자궁경부암/폐암 (column positions not preserved)
0  1   갈산1동  전성희내과의원        032-515-7533  <NA><NA><NA><NA>
1  2   갈산1동  파티마산부인과의원    032-511-7600  <NA><NA><NA><NA><NA><NA>
2  3   갈산1동  행복플러스의원        032-511-5475  <NA><NA><NA><NA>
3  4   갈산2동  갈산중앙의원          032-513-9303  <NA><NA><NA><NA><NA>
4  5   구산동   근로복지공단인천병원  032-500-0114
5  6   부개1동  박가정의학과의원      032-514-4867  <NA>
6  7   부개1동  에덴여성의원          032-519-8582  <NA><NA><NA><NA><NA><NA>
7  8   부개1동  평화의원              032-524-6911  <NA>
8  9   부개1동  박현수정형외과의원    032-507-6222  <NA><NA><NA><NA><NA><NA>
9  10  부개2동  푸른가정의학과의원    032-361-7600  <NA><NA><NA><NA><NA><NA>

Last rows

(index)  연번  주소지  검진기관명  전화전호  일반검진/위암/간암/대장암/유방암/자궁경부암/폐암 (column positions not preserved)
67  68  십정1동  이기섭내과의원      032-435-7070  <NA><NA><NA><NA>
68  69  십정2동  동암우리내과의원    032-421-5775  <NA><NA><NA><NA><NA>
69  70  십정2동  마루내과의원        032-435-5600  <NA>
70  71  십정2동  삼성장편한내과의원  032-433-7588  <NA>
71  72  십정2동  연세미여성의원      032-431-7582  <NA><NA><NA><NA><NA><NA>
72  73  십정2동  이화산부인과의원    032-428-3338  <NA><NA><NA><NA><NA><NA>
73  74  청천2동  부평세림병원        032-524-0591
74  75  청천2동  우리내과의원        032-505-1608  <NA><NA><NA>
75  76  청천2동  장내과의원          032-519-4373  <NA><NA><NA>
76  77  청천2동  하이큐영상의원      032-363-3292  <NA>