Overview

Dataset statistics

Number of variables: 8
Number of observations: 57
Missing cells: 10
Missing cells (%): 2.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.7 KiB
Average record size in memory: 67.3 B
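
The missing-cell percentage can be checked by hand from the figures in this table. The short sketch below assumes the percentage is simply missing cells divided by all cells (variables times observations); the variable names are illustrative only.

```python
# Values copied from the "Dataset statistics" table.
n_variables = 8
n_observations = 57
missing_cells = 10

# Assumption: the percentage is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# All 10 missing cells sit in the 전화번호 column, so the per-column
# rate is taken over the 57 observations only.
phone_missing_pct = 100 * missing_cells / n_observations

print(f"{total_cells} cells, {missing_pct:.1f}% missing overall")
print(f"전화번호: {phone_missing_pct:.1f}% missing")
```

Both results agree with the report: 2.2% of all cells and 17.5% of the 전화번호 column.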

Variable types

Numeric: 1
Text: 3
Categorical: 2
DateTime: 2

Dataset

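Description: A data file recording the status of stores located in Seo-gu, Incheon Metropolitan City (business name, type, opening date, telephone number, location, and scale).
Author: Seo-gu, Incheon Metropolitan City (인천광역시 서구)
URL: https://www.data.go.kr/data/15042500/fileData.do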

Alerts

Constant: 데이터기준일자 has constant value ""
High correlation: 종류 is highly overall correlated with 규모
High correlation: 규모 is highly overall correlated with 연번 and 1 other field
High correlation: 연번 is highly overall correlated with 규모
Imbalance: 종류 is highly imbalanced (54.6%)
Missing: 전화번호 has 10 (17.5%) missing values
Unique: 연번 has unique values
Unique: 소재지 has unique values
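
The 54.6% imbalance figure for 종류 can be reproduced from the value counts listed later in the report (46, 5, 3, 2, 1). The sketch below assumes the score is one minus the Shannon entropy of the class distribution, normalized by the maximum entropy for that number of classes. That formula is an assumption about how the profiler works, and `imbalance_score` is a hypothetical helper name, but it does match the reported number.

```python
import math


def imbalance_score(counts):
    """Return 1 - normalized Shannon entropy (0 = balanced, 1 = single class)."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes <= 1:
        # Assumption: a single class is treated as a score of 0 here.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)


# Counts from the 종류 "Common Values" table: <NA>, 대형마트, 전문점, 쇼핑센터, 백화점.
counts = [46, 5, 3, 2, 1]
score = imbalance_score(counts)
print(f"{score:.1%}")
```

Note that the 46 `<NA>` entries are counted as a category of their own, which is why 종류 is reported with 0 missing values and a strong imbalance instead.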

Reproduction

Analysis started: 2023-12-12 09:14:05.394465
Analysis finished: 2023-12-12 09:14:06.373665
Duration: 0.98 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
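
A report like this one could be regenerated roughly as follows. This is a minimal sketch, not the author's actual script: it assumes the source file is a CSV, that `pandas` and `ydata-profiling` are installed, and that the date columns carry the names shown in this report. The helper name `build_report`, the title, and the output path are hypothetical, and the file may need an explicit `encoding` argument when read.

```python
def build_report(csv_path, out_path="report.html"):
    """Profile the store list and write an HTML report (hypothetical helper)."""
    # Third-party imports are kept local so this module imports without them.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: a CSV with the two date columns named as in this report.
    df = pd.read_csv(csv_path, parse_dates=["개점일자", "데이터기준일자"])

    report = ProfileReport(
        df,
        title="Store status profile",  # hypothetical title
        dataset={
            "description": "Status of stores located in Seo-gu, Incheon",
            "author": "인천광역시 서구",
            "url": "https://www.data.go.kr/data/15042500/fileData.do",
        },
    )
    report.to_file(out_path)
    return out_path
```

The `dataset` dictionary is what fills the Description, Author, and URL fields of the Dataset section.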

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 57
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29
Minimum: 1
Maximum: 57
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 645.0 B

Quantile statistics

Minimum: 1
5th percentile: 3.8
Q1: 15
Median: 29
Q3: 43
95th percentile: 54.2
Maximum: 57
Range: 56
Interquartile range (IQR): 28

Descriptive statistics

Standard deviation: 16.598193
Coefficient of variation (CV): 0.57235147
Kurtosis: -1.2
Mean: 29
Median Absolute Deviation (MAD): 14
Skewness: 0
Sum: 1653
Variance: 275.5
Monotonicity: Strictly increasing
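
These figures are what one would expect if 연번 is simply the sequence 1 to 57, which is consistent with the reported minimum, maximum, sum, and strict monotonicity. The sketch below recomputes them with the standard library under that assumption. The `percentile` helper is illustrative and uses linear interpolation, which is assumed here, not confirmed, to be the method behind the reported percentiles.

```python
import statistics

# Assumption: 연번 is the running number 1..57 with no gaps.
values = list(range(1, 58))


def percentile(sorted_values, q):
    """Linear-interpolation percentile for q in [0, 1]."""
    pos = (len(sorted_values) - 1) * q
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * frac


mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)
cv = std / mean
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)
p05 = percentile(values, 0.05)
q1 = percentile(values, 0.25)
q3 = percentile(values, 0.75)
p95 = percentile(values, 0.95)

print(mean, variance, round(std, 6), round(cv, 8), mad)
print(round(p05, 1), q1, q3, round(p95, 1), q3 - q1)
```

The sample variance of 275.5 (and not the population variance) is the one that matches the report.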
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 1
 
1.8%
44 1
 
1.8%
32 1
 
1.8%
33 1
 
1.8%
34 1
 
1.8%
35 1
 
1.8%
36 1
 
1.8%
37 1
 
1.8%
38 1
 
1.8%
39 1
 
1.8%
Other values (47) 47
82.5%
ValueCountFrequency (%)
1 1
1.8%
2 1
1.8%
3 1
1.8%
4 1
1.8%
5 1
1.8%
6 1
1.8%
7 1
1.8%
8 1
1.8%
9 1
1.8%
10 1
1.8%
ValueCountFrequency (%)
57 1
1.8%
56 1
1.8%
55 1
1.8%
54 1
1.8%
53 1
1.8%
52 1
1.8%
51 1
1.8%
50 1
1.8%
49 1
1.8%
48 1
1.8%

상호
Text

Distinct: 56
Distinct (%): 98.2%
Missing: 0
Missing (%): 0.0%
Memory size: 588.0 B

Length

Max length: 18
Median length: 14
Mean length: 11.052632
Min length: 5

Characters and Unicode

Total characters: 630
Distinct characters: 114
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
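
As a small illustration of these character properties, the sketch below counts Unicode general categories for two store names taken from the sample rows. It uses only Python's `unicodedata` module, which exposes the general category (for example `Lo` for "Other Letter") but not the script or block. The `CATEGORY_NAMES` mapping and the helper name are illustrative and cover only the categories that appear in this report.

```python
import unicodedata
from collections import Counter

# Two-letter general categories mapped to the long names used in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
    "Sm": "Math Symbol",
}


def category_counts(text):
    """Count characters in text by Unicode general category (long name)."""
    codes = (unicodedata.category(ch) for ch in text)
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)


for name in ["홈플러스 가좌점", "HomeCC인천점"]:
    print(name, dict(category_counts(name)))
```

Hangul syllables fall under "Other Letter", which is why that category dominates the 상호 and 소재지 columns.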

Unique

Unique: 55
Unique (%): 96.5%

Sample

1st row: 서경백화점
2nd row: 홈플러스 가좌점
3rd row: 탑스빌웰빙타운
4th row: 이마트 검단점
5th row: 롯데마트 검단점
ValueCountFrequency (%)
노브랜드 9
 
7.6%
홈플러스익스프레스 8
 
6.7%
gs슈퍼마켓 7
 
5.9%
이마트 7
 
5.9%
에브리데이 6
 
5.0%
청라점 5
 
4.2%
gs 5
 
4.2%
더프레시 4
 
3.4%
인천마전점 3
 
2.5%
경서점 3
 
2.5%
Other values (52) 62
52.1%

Most occurring characters

ValueCountFrequency (%)
62
 
9.8%
53
 
8.4%
31
 
4.9%
26
 
4.1%
16
 
2.5%
16
 
2.5%
15
 
2.4%
14
 
2.2%
S 14
 
2.2%
14
 
2.2%
Other values (104) 369
58.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 515
81.7%
Space Separator 62
 
9.8%
Uppercase Letter 37
 
5.9%
Decimal Number 9
 
1.4%
Lowercase Letter 5
 
0.8%
Open Punctuation 1
 
0.2%
Close Punctuation 1
 
0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
53
 
10.3%
31
 
6.0%
26
 
5.0%
16
 
3.1%
16
 
3.1%
15
 
2.9%
14
 
2.7%
14
 
2.7%
14
 
2.7%
14
 
2.7%
Other values (85) 302
58.6%
Uppercase Letter
ValueCountFrequency (%)
S 14
37.8%
G 13
35.1%
H 3
 
8.1%
C 2
 
5.4%
E 2
 
5.4%
T 1
 
2.7%
F 1
 
2.7%
R 1
 
2.7%
Lowercase Letter
ValueCountFrequency (%)
s 1
20.0%
k 1
20.0%
m 1
20.0%
o 1
20.0%
e 1
20.0%
Decimal Number
ValueCountFrequency (%)
9 6
66.7%
2 2
 
22.2%
1 1
 
11.1%
Space Separator
ValueCountFrequency (%)
62
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 515
81.7%
Common 73
 
11.6%
Latin 42
 
6.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
53
 
10.3%
31
 
6.0%
26
 
5.0%
16
 
3.1%
16
 
3.1%
15
 
2.9%
14
 
2.7%
14
 
2.7%
14
 
2.7%
14
 
2.7%
Other values (85) 302
58.6%
Latin
ValueCountFrequency (%)
S 14
33.3%
G 13
31.0%
H 3
 
7.1%
C 2
 
4.8%
E 2
 
4.8%
s 1
 
2.4%
k 1
 
2.4%
T 1
 
2.4%
F 1
 
2.4%
R 1
 
2.4%
Other values (3) 3
 
7.1%
Common
ValueCountFrequency (%)
62
84.9%
9 6
 
8.2%
2 2
 
2.7%
( 1
 
1.4%
) 1
 
1.4%
1 1
 
1.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 515
81.7%
ASCII 115
 
18.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
62
53.9%
S 14
 
12.2%
G 13
 
11.3%
9 6
 
5.2%
H 3
 
2.6%
2 2
 
1.7%
C 2
 
1.7%
E 2
 
1.7%
( 1
 
0.9%
) 1
 
0.9%
Other values (9) 9
 
7.8%
Hangul
ValueCountFrequency (%)
53
 
10.3%
31
 
6.0%
26
 
5.0%
16
 
3.1%
16
 
3.1%
15
 
2.9%
14
 
2.7%
14
 
2.7%
14
 
2.7%
14
 
2.7%
Other values (85) 302
58.6%

종류
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 8.8%
Missing: 0
Missing (%): 0.0%
Memory size: 588.0 B

Length

Max length: 4
Median length: 4
Mean length: 3.9298246
Min length: 3

Unique

Unique: 1
Unique (%): 1.8%

Sample

1st row: 백화점
2nd row: 대형마트
3rd row: 쇼핑센터
4th row: 대형마트
5th row: 대형마트

Common Values

ValueCountFrequency (%)
<NA> 46
80.7%
대형마트 5
 
8.8%
전문점 3
 
5.3%
쇼핑센터 2
 
3.5%
백화점 1
 
1.8%

Length

Histogram of lengths of the category


개점일자
Date

Distinct: 52
Distinct (%): 91.2%
Missing: 0
Missing (%): 0.0%
Memory size: 588.0 B
Minimum: 1994-03-16 00:00:00
Maximum: 2023-08-21 00:00:00
Histogram with fixed size bins (bins=50)

전화번호
Text

MISSING 

Distinct: 46
Distinct (%): 97.9%
Missing: 10
Missing (%): 17.5%
Memory size: 588.0 B

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 564
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 45
Unique (%): 95.7%

Sample

1st row: 032-581-3001
2nd row: 032-453-8121
3rd row: 032-567-0797
4th row: 032-440-1050
5th row: 032-560-2520
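
The fixed length of 12 and the 11 distinct characters (ten digits plus the dash) suggest a uniform phone-number layout. The sketch below checks the five sample rows against a `ddd-ddd-dddd` pattern. The pattern is an assumption inferred from those samples, not a rule stated by the data provider, and the remaining values are not verified here.

```python
import re

# Assumed layout: three digits, three digits, four digits, separated by dashes.
PHONE_PATTERN = re.compile(r"\d{3}-\d{3}-\d{4}")

samples = [
    "032-581-3001",
    "032-453-8121",
    "032-567-0797",
    "032-440-1050",
    "032-560-2520",
]

all_match = all(PHONE_PATTERN.fullmatch(s) is not None for s in samples)
lengths = {len(s) for s in samples}

# 47 non-missing values of 12 characters each give the reported total.
non_missing = 57 - 10
total_characters = non_missing * 12

print(all_match, lengths, total_characters)
```

The computed total of 564 characters, with two dashes per value (94 in all), matches the character counts reported for this column.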
ValueCountFrequency (%)
032-380-9461 2
 
4.3%
032-561-7301 1
 
2.1%
032-569-5380 1
 
2.1%
032-566-2888 1
 
2.1%
032-565-5603 1
 
2.1%
032-566-5603 1
 
2.1%
032-575-7400 1
 
2.1%
032-573-8288 1
 
2.1%
032-568-3541 1
 
2.1%
032-569-2131 1
 
2.1%
Other values (36) 36
76.6%

Most occurring characters

ValueCountFrequency (%)
- 94
16.7%
0 87
15.4%
5 74
13.1%
3 71
12.6%
2 70
12.4%
6 50
8.9%
8 33
 
5.9%
1 23
 
4.1%
7 22
 
3.9%
4 21
 
3.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 470
83.3%
Dash Punctuation 94
 
16.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 87
18.5%
5 74
15.7%
3 71
15.1%
2 70
14.9%
6 50
10.6%
8 33
 
7.0%
1 23
 
4.9%
7 22
 
4.7%
4 21
 
4.5%
9 19
 
4.0%
Dash Punctuation
ValueCountFrequency (%)
- 94
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 564
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 94
16.7%
0 87
15.4%
5 74
13.1%
3 71
12.6%
2 70
12.4%
6 50
8.9%
8 33
 
5.9%
1 23
 
4.1%
7 22
 
3.9%
4 21
 
3.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 564
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 94
16.7%
0 87
15.4%
5 74
13.1%
3 71
12.6%
2 70
12.4%
6 50
8.9%
8 33
 
5.9%
1 23
 
4.1%
7 22
 
3.9%
4 21
 
3.7%

소재지
Text

UNIQUE 

Distinct: 57
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 588.0 B

Length

Max length: 63
Median length: 46
Mean length: 33.947368
Min length: 17

Characters and Unicode

Total characters: 1935
Distinct characters: 161
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 57
Unique (%): 100.0%

Sample

1st row: 인천광역시 서구 가정로 369(신현동 272-1번지)
2nd row: 인천광역시 서구 가정로 151(가좌동 118번지)
3rd row: 인천광역시 서구 완정로 70(마전동 999-4번지)
4th row: 인천광역시 서구 서곶로 754당하동 1065-1번지)
5th row: 인천광역시 서구 원당대로 581(마전동 1031-1번지)
ValueCountFrequency (%)
인천광역시 57
 
16.4%
서구 57
 
16.4%
청라동 10
 
2.9%
서곶로 4
 
1.2%
검단로 3
 
0.9%
마전동 3
 
0.9%
심곡동 3
 
0.9%
원당동 3
 
0.9%
근린생활시설 3
 
0.9%
1층 2
 
0.6%
Other values (184) 202
58.2%

Most occurring characters

ValueCountFrequency (%)
290
 
15.0%
1 119
 
6.1%
66
 
3.4%
65
 
3.4%
60
 
3.1%
58
 
3.0%
57
 
2.9%
57
 
2.9%
57
 
2.9%
57
 
2.9%
Other values (151) 1049
54.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 1015
52.5%
Decimal Number 429
22.2%
Space Separator 290
 
15.0%
Close Punctuation 54
 
2.8%
Open Punctuation 53
 
2.7%
Dash Punctuation 39
 
2.0%
Uppercase Letter 22
 
1.1%
Other Punctuation 21
 
1.1%
Math Symbol 10
 
0.5%
Lowercase Letter 2
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
66
 
6.5%
65
 
6.4%
60
 
5.9%
58
 
5.7%
57
 
5.6%
57
 
5.6%
57
 
5.6%
57
 
5.6%
53
 
5.2%
33
 
3.3%
Other values (127) 452
44.5%
Decimal Number
ValueCountFrequency (%)
1 119
27.7%
0 47
 
11.0%
2 41
 
9.6%
4 40
 
9.3%
3 35
 
8.2%
5 33
 
7.7%
6 33
 
7.7%
7 30
 
7.0%
9 27
 
6.3%
8 24
 
5.6%
Uppercase Letter
ValueCountFrequency (%)
B 13
59.1%
K 3
 
13.6%
S 3
 
13.6%
L 2
 
9.1%
T 1
 
4.5%
Other Punctuation
ValueCountFrequency (%)
, 20
95.2%
1
 
4.8%
Lowercase Letter
ValueCountFrequency (%)
s 1
50.0%
k 1
50.0%
Space Separator
ValueCountFrequency (%)
290
100.0%
Close Punctuation
ValueCountFrequency (%)
) 54
100.0%
Open Punctuation
ValueCountFrequency (%)
( 53
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 39
100.0%
Math Symbol
ValueCountFrequency (%)
~ 10
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1015
52.5%
Common 896
46.3%
Latin 24
 
1.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
66
 
6.5%
65
 
6.4%
60
 
5.9%
58
 
5.7%
57
 
5.6%
57
 
5.6%
57
 
5.6%
57
 
5.6%
53
 
5.2%
33
 
3.3%
Other values (127) 452
44.5%
Common
ValueCountFrequency (%)
290
32.4%
1 119
13.3%
) 54
 
6.0%
( 53
 
5.9%
0 47
 
5.2%
2 41
 
4.6%
4 40
 
4.5%
- 39
 
4.4%
3 35
 
3.9%
5 33
 
3.7%
Other values (7) 145
16.2%
Latin
ValueCountFrequency (%)
B 13
54.2%
K 3
 
12.5%
S 3
 
12.5%
L 2
 
8.3%
T 1
 
4.2%
s 1
 
4.2%
k 1
 
4.2%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1015
52.5%
ASCII 919
47.5%
None 1
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
290
31.6%
1 119
12.9%
) 54
 
5.9%
( 53
 
5.8%
0 47
 
5.1%
2 41
 
4.5%
4 40
 
4.4%
- 39
 
4.2%
3 35
 
3.8%
5 33
 
3.6%
Other values (13) 168
18.3%
Hangul
ValueCountFrequency (%)
66
 
6.5%
65
 
6.4%
60
 
5.9%
58
 
5.7%
57
 
5.6%
57
 
5.6%
57
 
5.6%
57
 
5.6%
53
 
5.2%
33
 
3.3%
Other values (127) 452
44.5%
None
ValueCountFrequency (%)
1
100.0%

규모
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Memory size: 588.0 B

Length

Max length: 4
Median length: 4
Mean length: 3.8070175
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 대규모
2nd row: 대규모
3rd row: 대규모
4th row: 대규모
5th row: 대규모

Common Values

ValueCountFrequency (%)
준대규모 40
70.2%
대규모 11
 
19.3%
<NA> 6
 
10.5%

Length

Histogram of lengths of the category


데이터기준일자
Date

CONSTANT 

Distinct: 1
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 588.0 B
Minimum: 2023-10-19 00:00:00
Maximum: 2023-10-19 00:00:00
Histogram with fixed size bins (bins=1)

Interactions


Correlations

Variable | 연번 | 상호 | 종류 | 개점일자 | 전화번호 | 소재지 | 규모
연번 | 1.000 | 0.941 | 0.000 | 0.970 | 1.000 | 1.000 | 0.995
상호 | 0.941 | 1.000 | 1.000 | 0.995 | 0.998 | 1.000 | 1.000
종류 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | NaN
개점일자 | 0.970 | 0.995 | 1.000 | 1.000 | 0.996 | 1.000 | 1.000
전화번호 | 1.000 | 0.998 | 1.000 | 0.996 | 1.000 | 1.000 | 1.000
소재지 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
규모 | 0.995 | 1.000 | NaN | 1.000 | 1.000 | 1.000 | 1.000

Variable | 종류 | 규모
종류 | 1.000 | 1.000
규모 | 1.000 | 1.000

Variable | 연번 | 종류 | 규모
연번 | 1.000 | 0.000 | 0.859
종류 | 0.000 | 1.000 | 1.000
규모 | 0.859 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

(index) | 연번 | 상호 | 종류 | 개점일자 | 전화번호 | 소재지 | 규모 | 데이터기준일자
0 | 1 | 서경백화점 | 백화점 | 1994-03-16 | 032-581-3001 | 인천광역시 서구 가정로 369(신현동 272-1번지) | 대규모 | 2023-10-19
1 | 2 | 홈플러스 가좌점 | 대형마트 | 2002-08-01 | 032-453-8121 | 인천광역시 서구 가정로 151(가좌동 118번지) | 대규모 | 2023-10-19
2 | 3 | 탑스빌웰빙타운 | 쇼핑센터 | 2005-08-05 | 032-567-0797 | 인천광역시 서구 완정로 70(마전동 999-4번지) | 대규모 | 2023-10-19
3 | 4 | 이마트 검단점 | 대형마트 | 2006-09-07 | 032-440-1050 | 인천광역시 서구 서곶로 754당하동 1065-1번지) | 대규모 | 2023-10-19
4 | 5 | 롯데마트 검단점 | 대형마트 | 2009-12-24 | 032-560-2520 | 인천광역시 서구 원당대로 581(마전동 1031-1번지) | 대규모 | 2023-10-19
5 | 6 | HomeCC인천점 | 전문점 | 2010-05-25 | 032-570-7000 | 인천광역시 서구 중봉대로 393(원창동 379-1번지) | 대규모 | 2023-10-19
6 | 7 | 롯데마트 청라점 | 대형마트 | 2012-12-13 | 032-590-2520 | 인천광역시 서구 청라커낼로 252청라롯데캐슬 (청라동 159-1번지) | 대규모 | 2023-10-19
7 | 8 | 홈플러스 인천청라라점 | 대형마트 | 2013-08-27 | 032-723-8920 | 인천광역시 서구 중봉대로 587(청라동 157-22번지) | 대규모 | 2023-10-19
8 | 9 | 검단공구상가 | 전문점 | 2013-01-10 | 032-561-3588 | 인천광역시 서구 보듬5로 13(오류동 1640-2번지) | 대규모 | 2023-10-19
9 | 10 | 모다아울렛 | 전문점 | 2016-02-19 | 032-288-5000 | 인천광역시 서구 북항로32번안길 50(원창동 381-69번지 외 1필지) | 대규모 | 2023-10-19

Last rows

(index) | 연번 | 상호 | 종류 | 개점일자 | 전화번호 | 소재지 | 규모 | 데이터기준일자
47 | 48 | 노브랜드 인천불로점 | <NA> | 2022-04-18 | 032-380-9461 | 인천광역시 서구 검단로 747 (불로동) | 준대규모 | 2023-10-19
48 | 49 | 노브랜드 인천마전점 | <NA> | 2021-10-07 | 032-380-9461 | 인천광역시 서구 검단로 492 (마전동, 세훈빌딩) | 준대규모 | 2023-10-19
49 | 50 | 이마트에브리데이 인천sk리더스뷰점 | <NA> | 2022-05-31 | <NA> | 인천광역시 서구 가정로437, B242~B250호(가정동, 루원시티 SK리더스뷰) | 준대규모 | 2023-10-19
50 | 51 | GS 더프레시 검단푸르지오점 | <NA> | 2021-10-07 | <NA> | 인천광역시 서구 원당동 산52-1 | <NA> | 2023-10-19
51 | 52 | GS THE FRESH 검단대방점 | <NA> | 2022-04-18 | <NA> | 인천광역시 서구 이음2로 30(당하동, 대방디에트르 더 펠리체) B101~B105호 | 준대규모 | 2023-10-19
52 | 53 | 롯데프레시 인천마전점 | <NA> | 2023-02-09 | <NA> | 인천광역시 서구 마전동 904-3, 102호 | <NA> | 2023-10-19
53 | 54 | 롯데프레시 검단예미지점 | <NA> | 2023-07-20 | <NA> | 인천광역시 서구 원당동 1014, 검단신도시예미지트리플에듀 상가동 113~117호 | <NA> | 2023-10-19
54 | 55 | GS 더프레시 루원리더스뷰점 | <NA> | 2023-07-26 | <NA> | 인천광역시 서구 봉오대로 270, 루원시티 SK리더스뷰2차 근린생활시설 302동 B239호~B245호 | <NA> | 2023-10-19
55 | 56 | GS 더프레시 검암푸르지오점 | <NA> | 2023-08-21 | <NA> | 인천광역시 서구 한들로 33, 검암역로열파크시티푸르지오 1단지 근린생활시설 5동 지하1층 B101호 (1개 호수) | <NA> | 2023-10-19
56 | 57 | GS 더프레시 검단법원점 | <NA> | 2023-08-21 | <NA> | 인천광역시 서구 서로3로 104, 신화프라자 근린생활시설 1층 107호~110호(4개호수) | <NA> | 2023-10-19