Overview

Dataset statistics

Number of variables: 6
Number of observations: 70
Missing cells: 98
Missing cells (%): 23.3%
Duplicate rows: 1
Duplicate rows (%): 1.4%
Total size in memory: 3.5 KiB
Average record size in memory: 50.9 B
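
The percentages above follow from the raw counts. The sketch below redoes the arithmetic, assuming the missing-cell share is taken over variables × observations and the duplicate share over observations. The helper name `pct` is made up for illustration; the per-column missing counts are the ones listed in the Alerts section.

```python
# Counts copied from the dataset statistics above.
n_variables = 6
n_observations = 70
missing_cells = 98
duplicate_rows = 1

# Per-column missing counts as listed in the Alerts section
# (연번, 업소명, 소재지(도로명), 전화번호).
missing_per_column = [21, 21, 21, 35]


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


total_cells = n_variables * n_observations           # every cell in the table
missing_pct = pct(missing_cells, total_cells)        # share of empty cells
duplicate_pct = pct(duplicate_rows, n_observations)  # share of duplicate rows

print(total_cells, sum(missing_per_column), missing_pct, duplicate_pct)
```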

Variable types

Numeric: 1
Categorical: 2
Text: 3

Dataset

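Description: Current data on entertainment bar businesses (유흥주점업) located in Seo-gu, Incheon Metropolitan City, covering business type (업종명), business name (업소명), road-name address (소재지(도로명)), and phone number (전화번호).
Author: Seo-gu, Incheon Metropolitan City (인천광역시 서구)
URL: https://data.incheon.go.kr/findData/publicDataDetail?dataId=15089618&srcSe=7661IVAWM27C61E190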

Alerts

Dataset has 1 (1.4%) duplicate row | Duplicates
데이터기준일자 is highly overall correlated with 연번 and 1 other field | High correlation
업종명 is highly overall correlated with 연번 and 1 other field | High correlation
연번 is highly overall correlated with 업종명 and 1 other field | High correlation
연번 has 21 (30.0%) missing values | Missing
업소명 has 21 (30.0%) missing values | Missing
소재지(도로명) has 21 (30.0%) missing values | Missing
전화번호 has 35 (50.0%) missing values | Missing

Reproduction

Analysis started: 2024-01-28 06:51:24.716256
Analysis finished: 2024-01-28 06:51:25.312832
Duration: 0.6 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
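
A report of this kind is typically produced by handing a pandas DataFrame to ydata-profiling. The sketch below shows that general pattern only; the helper name `build_report`, the CSV path, the report title, and the output file name are all assumptions, and the configuration actually used for this run is not reproduced here.

```python
def build_report(csv_path: str, html_path: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so the sketch can be loaded without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The source file and its encoding are assumptions; adjust to the real export.
    df = pd.read_csv(csv_path)

    # Build the profile and save it as a standalone HTML page.
    profile = ProfileReport(df, title="Dataset profile")
    profile.to_file(html_path)
```

Calling `build_report("some_export.csv")` with a real file would write `report.html` next to the script.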

Variables

연번
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 49
Distinct (%): 100.0%
Missing: 21
Missing (%): 30.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25
Minimum: 1
Maximum: 49
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 762.0 B

Quantile statistics

Minimum: 1
5-th percentile: 3.4
Q1: 13
median: 25
Q3: 37
95-th percentile: 46.6
Maximum: 49
Range: 48
Interquartile range (IQR): 24
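
The quantiles above can be reproduced with a linear-interpolation quantile. The sketch below assumes that 연번 takes each integer from 1 to 49 exactly once (consistent with the distinct count, minimum, maximum, and sum reported for this variable) and that the quantile is interpolated at position p·(n−1) in the sorted data. The function name `quantile` is illustrative.

```python
import math


def quantile(sorted_values, p):
    """Linear-interpolation quantile at position p * (n - 1) of sorted data."""
    pos = p * (len(sorted_values) - 1)
    lower = math.floor(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


# Assumption: the 49 non-missing 연번 values are the integers 1..49.
values = list(range(1, 50))

summary = {p: quantile(values, p) for p in (0.05, 0.25, 0.5, 0.75, 0.95)}
iqr = summary[0.75] - summary[0.25]
value_range = values[-1] - values[0]

print(summary, iqr, value_range)
```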

Descriptive statistics

Standard deviation: 14.28869
Coefficient of variation (CV): 0.57154761
Kurtosis: -1.2
Mean: 25
Median Absolute Deviation (MAD): 12
Skewness: 0
Sum: 1225
Variance: 204.16667
Monotonicity: Strictly increasing
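
Most of the descriptive statistics can be checked with the Python standard library. The sketch assumes, as the figures suggest, that the 49 non-missing 연번 values are the integers 1 to 49, and that variance and standard deviation use the sample (n−1) denominator. Kurtosis and skewness are left out because the exact estimator is not stated in the report.

```python
import statistics

# Assumption: the 49 non-missing 연번 values are the integers 1..49.
values = list(range(1, 50))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, n - 1 denominator
std = statistics.stdev(values)           # sample standard deviation
cv = std / mean                          # coefficient of variation
median = statistics.median(values)
# Median absolute deviation: median of distances from the median.
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, variance, std, cv, mad)
```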
Histogram with fixed size bins (bins=49)
Value | Count | Frequency (%)
38 | 1 | 1.4%
28 | 1 | 1.4%
29 | 1 | 1.4%
30 | 1 | 1.4%
31 | 1 | 1.4%
32 | 1 | 1.4%
33 | 1 | 1.4%
34 | 1 | 1.4%
35 | 1 | 1.4%
36 | 1 | 1.4%
Other values (39) | 39 | 55.7%
(Missing) | 21 | 30.0%

Value | Count | Frequency (%)
1 | 1 | 1.4%
2 | 1 | 1.4%
3 | 1 | 1.4%
4 | 1 | 1.4%
5 | 1 | 1.4%
6 | 1 | 1.4%
7 | 1 | 1.4%
8 | 1 | 1.4%
9 | 1 | 1.4%
10 | 1 | 1.4%

Value | Count | Frequency (%)
49 | 1 | 1.4%
48 | 1 | 1.4%
47 | 1 | 1.4%
46 | 1 | 1.4%
45 | 1 | 1.4%
44 | 1 | 1.4%
43 | 1 | 1.4%
42 | 1 | 1.4%
41 | 1 | 1.4%
40 | 1 | 1.4%

업종명
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Memory size: 692.0 B
유흥주점영업: 49
<NA>: 21

Length

Max length: 6
Median length: 6
Mean length: 5.4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 유흥주점영업
2nd row: 유흥주점영업
3rd row: 유흥주점영업
4th row: 유흥주점영업
5th row: 유흥주점영업

Common Values

Value | Count | Frequency (%)
유흥주점영업 | 49 | 70.0%
<NA> | 21 | 30.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
유흥주점영업 | 49 | 70.0%
na | 21 | 30.0%

업소명
Text

MISSING 

Distinct: 49
Distinct (%): 100.0%
Missing: 21
Missing (%): 30.0%
Memory size: 692.0 B

Length

Max length: 14
Median length: 9
Mean length: 6.1836735
Min length: 1

Characters and Unicode

Total characters: 303
Distinct characters: 138
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
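
As a small illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for a single 업소명 value taken from the sample rows of this report; it shows the idea only and is not the profiler's own implementation. The two-letter codes map to the category names used in the tables of this section: `Lu` is Uppercase Letter, `Lo` is Other Letter, `Ps` is Open Punctuation, and `Pe` is Close Punctuation.

```python
import unicodedata
from collections import Counter

# One 업소명 value from the sample rows of this report.
name = "SINGHA(싱하)"

# Count the Unicode general category of every character in the name.
categories = Counter(unicodedata.category(ch) for ch in name)

for code, count in categories.most_common():
    print(code, count)
```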

Unique

Unique: 49
Unique (%): 100.0%

Sample

1st row: 왕코노래클럽
2nd row: 놀러와
3rd row: 레이디
4th row: 러블리짱
5th row: 북항프라자2 203호

Value | Count | Frequency (%)
라이브 | 2 | 3.5%
준노래클럽 | 1 | 1.8%
오림노래바 | 1 | 1.8%
에스지(sg)라이브 | 1 | 1.8%
장녹수미인파티 | 1 | 1.8%
엘르(elle | 1 | 1.8%
극장식 | 1 | 1.8%
마마앤 | 1 | 1.8%
파파 | 1 | 1.8%
홈런볼 | 1 | 1.8%
Other values (46) | 46 | 80.7%

Most occurring characters

ValueCountFrequency (%)
23
 
7.6%
21
 
6.9%
15
 
5.0%
14
 
4.6%
8
 
2.6%
8
 
2.6%
6
 
2.0%
5
 
1.7%
5
 
1.7%
( 5
 
1.7%
Other values (128) 193
63.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 255 | 84.2%
Uppercase Letter | 13 | 4.3%
Space Separator | 8 | 2.6%
Decimal Number | 8 | 2.6%
Lowercase Letter | 8 | 2.6%
Open Punctuation | 5 | 1.7%
Close Punctuation | 5 | 1.7%
Other Punctuation | 1 | 0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
23
 
9.0%
21
 
8.2%
15
 
5.9%
14
 
5.5%
8
 
3.1%
6
 
2.4%
5
 
2.0%
5
 
2.0%
4
 
1.6%
4
 
1.6%
Other values (102) 150
58.8%
Uppercase Letter
ValueCountFrequency (%)
L 2
15.4%
E 2
15.4%
S 2
15.4%
G 2
15.4%
K 1
7.7%
A 1
7.7%
H 1
7.7%
N 1
7.7%
I 1
7.7%
Lowercase Letter
ValueCountFrequency (%)
n 1
12.5%
u 1
12.5%
r 1
12.5%
g 1
12.5%
t 1
12.5%
h 1
12.5%
e 1
12.5%
p 1
12.5%
Decimal Number
ValueCountFrequency (%)
2 3
37.5%
1 2
25.0%
9 1
 
12.5%
3 1
 
12.5%
0 1
 
12.5%
Space Separator
ValueCountFrequency (%)
8
100.0%
Open Punctuation
ValueCountFrequency (%)
( 5
100.0%
Close Punctuation
ValueCountFrequency (%)
) 5
100.0%
Other Punctuation
ValueCountFrequency (%)
& 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 255 | 84.2%
Common | 27 | 8.9%
Latin | 21 | 6.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
23
 
9.0%
21
 
8.2%
15
 
5.9%
14
 
5.5%
8
 
3.1%
6
 
2.4%
5
 
2.0%
5
 
2.0%
4
 
1.6%
4
 
1.6%
Other values (102) 150
58.8%
Latin
ValueCountFrequency (%)
L 2
 
9.5%
E 2
 
9.5%
S 2
 
9.5%
G 2
 
9.5%
n 1
 
4.8%
u 1
 
4.8%
r 1
 
4.8%
K 1
 
4.8%
A 1
 
4.8%
H 1
 
4.8%
Other values (7) 7
33.3%
Common
ValueCountFrequency (%)
8
29.6%
( 5
18.5%
) 5
18.5%
2 3
 
11.1%
1 2
 
7.4%
9 1
 
3.7%
& 1
 
3.7%
3 1
 
3.7%
0 1
 
3.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 255 | 84.2%
ASCII | 48 | 15.8%

Most frequent character per block

Hangul
ValueCountFrequency (%)
23
 
9.0%
21
 
8.2%
15
 
5.9%
14
 
5.5%
8
 
3.1%
6
 
2.4%
5
 
2.0%
5
 
2.0%
4
 
1.6%
4
 
1.6%
Other values (102) 150
58.8%
ASCII
ValueCountFrequency (%)
8
16.7%
( 5
 
10.4%
) 5
 
10.4%
2 3
 
6.2%
L 2
 
4.2%
E 2
 
4.2%
S 2
 
4.2%
1 2
 
4.2%
G 2
 
4.2%
9 1
 
2.1%
Other values (16) 16
33.3%

소재지(도로명)
Text

MISSING 

Distinct: 48
Distinct (%): 98.0%
Missing: 21
Missing (%): 30.0%
Memory size: 692.0 B

Length

Max length: 48
Median length: 37
Mean length: 32.55102
Min length: 21

Characters and Unicode

Total characters: 1595
Distinct characters: 77
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 47
Unique (%): 95.9%

Sample

1st row: 인천광역시 서구 서곶로315번길 17 (심곡동)
2nd row: 인천광역시 서구 탁옥로51번길 11 (심곡동, 심곡네오프라자 403호,404호)
3rd row: 인천광역시 서구 완정로 160 (마전동, 802호)
4th row: 인천광역시 서구 길주로 101, 4층 일부호 (석남동)
5th row: 인천광역시 서구 북항로32번안길 9-20, 북항프라자2 203호 (원창동)

Value | Count | Frequency (%)
인천광역시 | 49 | 15.9%
서구 | 49 | 15.9%
심곡동 | 25 | 8.1%
지하1층 | 13 | 4.2%
탁옥로51번길 | 13 | 4.2%
석남동 | 12 | 3.9%
길주로 | 7 | 2.3%
마전동 | 6 | 1.9%
완정로 | 6 | 1.9%
11 | 5 | 1.6%
Other values (86) | 124 | 40.1%

Most occurring characters

ValueCountFrequency (%)
260
 
16.3%
1 89
 
5.6%
56
 
3.5%
( 52
 
3.3%
) 52
 
3.3%
50
 
3.1%
50
 
3.1%
49
 
3.1%
49
 
3.1%
49
 
3.1%
Other values (67) 839
52.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 860 | 53.9%
Decimal Number | 301 | 18.9%
Space Separator | 260 | 16.3%
Open Punctuation | 52 | 3.3%
Close Punctuation | 52 | 3.3%
Other Punctuation | 48 | 3.0%
Dash Punctuation | 19 | 1.2%
Uppercase Letter | 3 | 0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
56
 
6.5%
50
 
5.8%
50
 
5.8%
49
 
5.7%
49
 
5.7%
49
 
5.7%
49
 
5.7%
49
 
5.7%
49
 
5.7%
38
 
4.4%
Other values (49) 372
43.3%
Decimal Number
ValueCountFrequency (%)
1 89
29.6%
5 43
14.3%
2 37
12.3%
3 34
 
11.3%
0 29
 
9.6%
4 28
 
9.3%
6 18
 
6.0%
8 9
 
3.0%
7 9
 
3.0%
9 5
 
1.7%
Uppercase Letter
ValueCountFrequency (%)
G 1
33.3%
S 1
33.3%
B 1
33.3%
Space Separator
ValueCountFrequency (%)
260
100.0%
Open Punctuation
ValueCountFrequency (%)
( 52
100.0%
Close Punctuation
ValueCountFrequency (%)
) 52
100.0%
Other Punctuation
ValueCountFrequency (%)
, 48
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 19
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 860 | 53.9%
Common | 732 | 45.9%
Latin | 3 | 0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
56
 
6.5%
50
 
5.8%
50
 
5.8%
49
 
5.7%
49
 
5.7%
49
 
5.7%
49
 
5.7%
49
 
5.7%
49
 
5.7%
38
 
4.4%
Other values (49) 372
43.3%
Common
ValueCountFrequency (%)
260
35.5%
1 89
 
12.2%
( 52
 
7.1%
) 52
 
7.1%
, 48
 
6.6%
5 43
 
5.9%
2 37
 
5.1%
3 34
 
4.6%
0 29
 
4.0%
4 28
 
3.8%
Other values (5) 60
 
8.2%
Latin
ValueCountFrequency (%)
G 1
33.3%
S 1
33.3%
B 1
33.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 860 | 53.9%
ASCII | 735 | 46.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
260
35.4%
1 89
 
12.1%
( 52
 
7.1%
) 52
 
7.1%
, 48
 
6.5%
5 43
 
5.9%
2 37
 
5.0%
3 34
 
4.6%
0 29
 
3.9%
4 28
 
3.8%
Other values (8) 63
 
8.6%
Hangul
ValueCountFrequency (%)
56
 
6.5%
50
 
5.8%
50
 
5.8%
49
 
5.7%
49
 
5.7%
49
 
5.7%
49
 
5.7%
49
 
5.7%
49
 
5.7%
38
 
4.4%
Other values (49) 372
43.3%

전화번호
Text

MISSING 

Distinct: 34
Distinct (%): 97.1%
Missing: 35
Missing (%): 50.0%
Memory size: 692.0 B

Length

Max length: 13
Median length: 12
Mean length: 12.028571
Min length: 12

Characters and Unicode

Total characters: 421
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 33
Unique (%): 94.3%

Sample

1st row: 032-561-1331
2nd row: 032-562-2236
3rd row: 032-562-5823
4th row: 032-562-8753
5th row: 032-564-8604

Value | Count | Frequency (%)
032-582-0042 | 2 | 5.7%
032-564-9949 | 1 | 2.9%
032-582-3353 | 1 | 2.9%
032-583-4114 | 1 | 2.9%
032-221-6769 | 1 | 2.9%
032-287-1000 | 1 | 2.9%
032-325-8688 | 1 | 2.9%
032-561-0530 | 1 | 2.9%
032-581-5678 | 1 | 2.9%
032-577-0972 | 1 | 2.9%
Other values (24) | 24 | 68.6%

Most occurring characters

Value | Count | Frequency (%)
- | 70 | 16.6%
2 | 55 | 13.1%
0 | 54 | 12.8%
3 | 54 | 12.8%
5 | 42 | 10.0%
6 | 37 | 8.8%
8 | 29 | 6.9%
7 | 29 | 6.9%
4 | 23 | 5.5%
1 | 17 | 4.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 351 | 83.4%
Dash Punctuation | 70 | 16.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 55
15.7%
0 54
15.4%
3 54
15.4%
5 42
12.0%
6 37
10.5%
8 29
8.3%
7 29
8.3%
4 23
6.6%
1 17
 
4.8%
9 11
 
3.1%
Dash Punctuation
ValueCountFrequency (%)
- 70
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 421 | 100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 70
16.6%
2 55
13.1%
0 54
12.8%
3 54
12.8%
5 42
10.0%
6 37
8.8%
8 29
6.9%
7 29
6.9%
4 23
 
5.5%
1 17
 
4.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 421 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 70
16.6%
2 55
13.1%
0 54
12.8%
3 54
12.8%
5 42
10.0%
6 37
8.8%
8 29
6.9%
7 29
6.9%
4 23
 
5.5%
1 17
 
4.0%

데이터기준일자
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Memory size: 692.0 B
2022-09-01: 49
<NA>: 21

Length

Max length: 10
Median length: 10
Mean length: 8.2
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022-09-01
2nd row: 2022-09-01
3rd row: 2022-09-01
4th row: 2022-09-01
5th row: 2022-09-01

Common Values

Value | Count | Frequency (%)
2022-09-01 | 49 | 70.0%
<NA> | 21 | 30.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2022-09-01 | 49 | 70.0%
na | 21 | 30.0%

Interactions


Correlations

 | 연번 | 업소명 | 소재지(도로명) | 전화번호
연번 | 1.000 | 1.000 | 0.936 | 0.873
업소명 | 1.000 | 1.000 | 1.000 | 1.000
소재지(도로명) | 0.936 | 1.000 | 1.000 | 0.993
전화번호 | 0.873 | 1.000 | 0.993 | 1.000

 | 데이터기준일자 | 업종명
데이터기준일자 | 1.000 | 1.000
업종명 | 1.000 | 1.000

 | 연번 | 업종명 | 데이터기준일자
연번 | 1.000 | 1.000 | 1.000
업종명 | 1.000 | 1.000 | 1.000
데이터기준일자 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
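
One common way to compute a nullity correlation is the Pearson correlation between per-column "is missing" indicators. The sketch below applies that to 업소명 (21 missing) and 전화번호 (35 missing). It assumes that the 21 fully empty rows account for 21 of the 35 missing phone numbers, and the row order is arbitrary because it does not affect the result. The helper `pearson` is written out by hand for illustration and is not taken from the report's tooling.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# 1 = value missing, 0 = value present, one entry per row (70 rows).
# Assumed layout: 14 rows lacking only a phone number, 35 complete rows,
# then the 21 rows in which every field is empty.
name_missing = [0] * 14 + [0] * 35 + [1] * 21
phone_missing = [1] * 14 + [0] * 35 + [1] * 21

nullity_corr = pearson(name_missing, phone_missing)
print(round(nullity_corr, 3))
```

Columns that are missing in exactly the same rows, such as 업소명 and 소재지(도로명) under this assumption, would give a nullity correlation of 1.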

Sample

 | 연번 | 업종명 | 업소명 | 소재지(도로명) | 전화번호 | 데이터기준일자
0 | 1 | 유흥주점영업 | 왕코노래클럽 | 인천광역시 서구 서곶로315번길 17 (심곡동) | <NA> | 2022-09-01
1 | 2 | 유흥주점영업 | 놀러와 | 인천광역시 서구 탁옥로51번길 11 (심곡동, 심곡네오프라자 403호,404호) | <NA> | 2022-09-01
2 | 3 | 유흥주점영업 | 레이디 | 인천광역시 서구 완정로 160 (마전동, 802호) | <NA> | 2022-09-01
3 | 4 | 유흥주점영업 | 러블리짱 | 인천광역시 서구 길주로 101, 4층 일부호 (석남동) | <NA> | 2022-09-01
4 | 5 | 유흥주점영업 | 북항프라자2 203호 | 인천광역시 서구 북항로32번안길 9-20, 북항프라자2 203호 (원창동) | <NA> | 2022-09-01
5 | 6 | 유흥주점영업 | SINGHA(싱하) | 인천광역시 서구 완정로 160, 검단프라자 701호 (마전동) | <NA> | 2022-09-01
6 | 7 | 유흥주점영업 | 크룽텝(Krungthep) | 인천광역시 서구 길주로 79, B301호 (석남동) | <NA> | 2022-09-01
7 | 8 | 유흥주점영업 | 줄래줄래 노래클럽 | 인천광역시 서구 탁옥로51번길 15, 지층 (심곡동) | 032-561-1331 | 2022-09-01
8 | 9 | 유흥주점영업 | 황궁 비지니스 | 인천광역시 서구 완정로 160 (마전동, 지하) | 032-562-2236 | 2022-09-01
9 | 10 | 유흥주점영업 | 도도노래클럽 | 인천광역시 서구 탁옥로51번길 13-6, 지하1층 (심곡동) | 032-562-5823 | 2022-09-01

 | 연번 | 업종명 | 업소명 | 소재지(도로명) | 전화번호 | 데이터기준일자
60 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
61 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
62 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
63 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
64 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
65 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
66 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
67 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
68 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
69 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

 | 연번 | 업종명 | 업소명 | 소재지(도로명) | 전화번호 | 데이터기준일자 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 21
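
The overview reports 1 duplicate row (1.4%), while this table shows a single all-empty row pattern occurring 21 times. One reading consistent with both figures is that the overview counts distinct duplicated row patterns. The sketch below illustrates that reading on a toy stand-in for the table; the two-column rows and the use of `None` for `<NA>` are simplifications, and it is an assumption that the profiler counts duplicates this way.

```python
from collections import Counter

# Toy stand-in: 49 filled rows (only 연번 and 업종명 shown) plus 21 empty rows.
# None plays the role of <NA>; the real rows have six columns each.
filled_rows = [(i, "유흥주점영업") for i in range(1, 50)]
empty_rows = [(None, None)] * 21
rows = filled_rows + empty_rows

# Count how often each distinct row pattern occurs.
counts = Counter(rows)

# Keep only the patterns that occur more than once.
duplicated = {row: n for row, n in counts.items() if n > 1}

distinct_duplicated = len(duplicated)
share = round(100 * distinct_duplicated / len(rows), 1)

print(distinct_duplicated, duplicated[(None, None)], share)
```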