Overview

Dataset statistics

Number of variables: 5
Number of observations: 173
Missing cells: 101
Missing cells (%): 11.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.1 KiB
Average record size in memory: 41.8 B
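
The missing-cell percentage follows from the counts in this table. As a small worked check (a sketch that only uses figures printed in this report; the per-column missing counts are taken from the Alerts section):

```python
# Figures copied from the "Dataset statistics" table and the "Alerts" section.
n_variables = 5
n_observations = 173
missing_cells = 101
missing_by_column = {"소재지(도로명)": 3, "전화번호": 98}

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(total_cells)                      # 865
print(f"{missing_pct:.1f}%")            # 11.7%
print(sum(missing_by_column.values()))  # 101
```

The two columns flagged as missing account for all 101 missing cells.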

Variable types

Numeric: 1
Categorical: 1
Text: 3

Dataset

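Description: Information on tourism businesses in Bucheon that are registered or designated under the Tourism Promotion Act (관광진흥법), including business type, business name, location (road-name address), and telephone number.
Author: 경기도 부천시 (Bucheon City, Gyeonggi Province)
URL: https://www.data.go.kr/data/3039453/fileData.do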

Alerts

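연번 (serial number) is highly overall correlated with 업종 유형 (business type) [High correlation]
업종 유형 (business type) is highly overall correlated with 연번 (serial number) [High correlation]
소재지(도로명) (location, road-name address) has 3 (1.7%) missing values [Missing]
전화번호 (telephone number) has 98 (56.6%) missing values [Missing]
연번 (serial number) has unique values [Unique]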

Reproduction

Analysis started: 2023-12-12 01:14:41.036773
Analysis finished: 2023-12-12 01:14:41.908554
Duration: 0.87 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
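
A report of this kind can be regenerated with ydata-profiling. The following is a minimal sketch, not the configuration actually used for this report: the function name `build_report`, the default output path, and the `cp949` encoding are assumptions, and the CSV file is assumed to have been downloaded from the URL in the Dataset section.

```python
from pathlib import Path


def build_report(csv_path, out_path="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so the function can be defined without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)  # the encoding is an assumption
    report = ProfileReport(df, title=Path(csv_path).stem)
    report.to_file(out_path)
    return out_path
```

A call such as `build_report("bucheon_tourism.csv")` (a hypothetical file name) would write `report.html` next to the script.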

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 173
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 87
Minimum: 1
Maximum: 173
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 9.6
Q1: 44
Median: 87
Q3: 130
95-th percentile: 164.4
Maximum: 173
Range: 172
Interquartile range (IQR): 86
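
The fractional percentiles (9.6 and 164.4) are what linear interpolation between neighbouring values produces. The sketch below assumes that 연번 is exactly the integers 1 to 173, which is consistent with the minimum, maximum, and distinct count reported above; the helper `percentile` is written for illustration and is not the profiler's own code.

```python
def percentile(sorted_values, q):
    """Percentile by linear interpolation between closest ranks (q in [0, 1])."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


values = list(range(1, 174))  # assumption: 연번 is exactly 1, 2, ..., 173

q05 = percentile(values, 0.05)
q1 = percentile(values, 0.25)
median = percentile(values, 0.50)
q3 = percentile(values, 0.75)
q95 = percentile(values, 0.95)
value_range = values[-1] - values[0]
iqr = q3 - q1

print(round(q05, 1), q1, median, q3, round(q95, 1))
print("range:", value_range, "IQR:", iqr)
```

Under that assumption the results agree with the table: 9.6, 44, 87, 130, 164.4, a range of 172, and an IQR of 86.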

Descriptive statistics

Standard deviation: 50.084928
Coefficient of variation (CV): 0.57568883
Kurtosis: -1.2
Mean: 87
Median Absolute Deviation (MAD): 43
Skewness: 0
Sum: 15051
Variance: 2508.5
Monotonicity: Strictly increasing
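
Most of these figures can be checked with the Python standard library. The sketch assumes, as the minimum, maximum, sum, and monotonicity suggest, that 연번 is the sequence 1 to 173, and that the reported variance uses the sample (n - 1) denominator.

```python
import statistics

values = list(range(1, 174))  # assumption: 연번 is exactly 1, 2, ..., 173

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std = statistics.stdev(values)
cv = std / mean                         # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, variance, mad)
print(round(std, 6), round(cv, 8))
```

With the sample denominator the variance comes out at 2508.5, matching the report; the population formula would give a slightly smaller value.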
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | 0.6%
120 | 1 | 0.6%
112 | 1 | 0.6%
113 | 1 | 0.6%
114 | 1 | 0.6%
115 | 1 | 0.6%
116 | 1 | 0.6%
117 | 1 | 0.6%
118 | 1 | 0.6%
119 | 1 | 0.6%
Other values (163) | 163 | 94.2%
Value | Count | Frequency (%)
1 | 1 | 0.6%
2 | 1 | 0.6%
3 | 1 | 0.6%
4 | 1 | 0.6%
5 | 1 | 0.6%
6 | 1 | 0.6%
7 | 1 | 0.6%
8 | 1 | 0.6%
9 | 1 | 0.6%
10 | 1 | 0.6%
Value | Count | Frequency (%)
173 | 1 | 0.6%
172 | 1 | 0.6%
171 | 1 | 0.6%
170 | 1 | 0.6%
169 | 1 | 0.6%
168 | 1 | 0.6%
167 | 1 | 0.6%
166 | 1 | 0.6%
165 | 1 | 0.6%
164 | 1 | 0.6%

업종 유형 (business type)
Categorical

HIGH CORRELATION 

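Distinct: 3
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB
국내외여행업 (domestic and overseas travel business): 78
종합여행업 (general travel business): 63
국내여행업 (domestic travel business): 32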

Length

Max length: 6
Median length: 5
Mean length: 5.4508671
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 국내여행업
2nd row: 국내여행업
3rd row: 국내여행업
4th row: 국내여행업
5th row: 국내여행업

Common Values

Value | Count | Frequency (%)
국내외여행업 | 78 | 45.1%
종합여행업 | 63 | 36.4%
국내여행업 | 32 | 18.5%
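
The frequency percentages and the mean category length reported for this variable can be re-derived from the three counts alone. A short sketch using only the values in the table above:

```python
# Category counts copied from the "Common Values" table.
counts = {"국내외여행업": 78, "종합여행업": 63, "국내여행업": 32}

total = sum(counts.values())

# Share of each category among all rows.
shares = {name: f"{100 * count / total:.1f}%" for name, count in counts.items()}

# Mean label length, weighted by how often each label occurs.
mean_length = sum(len(name) * count for name, count in counts.items()) / total

print(total)                  # 173
print(shares)
print(round(mean_length, 7))  # 5.4508671
```

The weighted mean of the label lengths (6, 5, and 5 characters) matches the "Mean length" figure for this variable.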

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of the common values of 업종 유형]

상호명 (business name)
Text

Distinct: 160
Distinct (%): 92.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Length

Max length: 17
Median length: 15
Mean length: 7.8554913
Min length: 3

Characters and Unicode

Total characters: 1359
Distinct characters: 255
Distinct categories: 8
Distinct scripts: 4
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
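
As an illustration of how such per-character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own implementation; the helper name `category_counts` and the small name mapping are assumptions, and the input string is one of the business names from this variable's sample.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
}


def category_counts(text):
    """Count characters in text by Unicode general category."""
    codes = (unicodedata.category(ch) for ch in text)
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)


counts = category_counts("프린스관광(주)")
print(dict(counts))
```

Hangul syllables fall under "Other Letter", which is why that category dominates the tables for this variable. Script and block information is not exposed by `unicodedata` and would need another source.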

Unique

Unique: 147
Unique (%): 85.0%

Sample

1st row: 주)내쇼날항공여행사
2nd row: 프린스관광(주)
3rd row: 신세대관광(주)
4th row: (주)현대항공여행사
5th row: 블루스카이(주)
Value | Count | Frequency (%)
주식회사 | 25 | 11.5%
여행사 | 5 | 2.3%
주)내쇼날항공여행사 | 2 | 0.9%
초이스투어 | 2 | 0.9%
투어 | 2 | 0.9%
주)비젼항공투어 | 2 | 0.9%
평안여행사 | 2 | 0.9%
에스비에스여행사 | 2 | 0.9%
신신투어 | 2 | 0.9%
주)세진투어 | 2 | 0.9%
Other values (166) | 172 | 78.9%

Most occurring characters

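Value | Count | Frequency (%)
(unrecovered) | 101 | 7.4%
(unrecovered) | 89 | 6.5%
(unrecovered) | 79 | 5.8%
(unrecovered) | 76 | 5.6%
) | 65 | 4.8%
( | 57 | 4.2%
(space) | 46 | 3.4%
(unrecovered) | 44 | 3.2%
(unrecovered) | 41 | 3.0%
(unrecovered) | 32 | 2.4%
Other values (245) | 729 | 53.6%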

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1136 | 83.6%
Close Punctuation | 65 | 4.8%
Open Punctuation | 57 | 4.2%
Space Separator | 46 | 3.4%
Uppercase Letter | 36 | 2.6%
Lowercase Letter | 16 | 1.2%
Decimal Number | 2 | 0.1%
Other Punctuation | 1 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(unrecovered) | 101 | 8.9%
(unrecovered) | 89 | 7.8%
(unrecovered) | 79 | 7.0%
(unrecovered) | 76 | 6.7%
(unrecovered) | 44 | 3.9%
(unrecovered) | 41 | 3.6%
(unrecovered) | 32 | 2.8%
(unrecovered) | 26 | 2.3%
(unrecovered) | 26 | 2.3%
(unrecovered) | 20 | 1.8%
Other values (209) | 602 | 53.0%
Uppercase Letter
Value | Count | Frequency (%)
O | 4 | 11.1%
C | 4 | 11.1%
S | 3 | 8.3%
K | 3 | 8.3%
A | 3 | 8.3%
T | 3 | 8.3%
R | 2 | 5.6%
P | 2 | 5.6%
M | 2 | 5.6%
E | 1 | 2.8%
Other values (9) | 9 | 25.0%
Lowercase Letter
Value | Count | Frequency (%)
e | 3 | 18.8%
l | 2 | 12.5%
i | 2 | 12.5%
k | 1 | 6.2%
a | 1 | 6.2%
r | 1 | 6.2%
o | 1 | 6.2%
y | 1 | 6.2%
f | 1 | 6.2%
n | 1 | 6.2%
Other values (2) | 2 | 12.5%
Close Punctuation
Value | Count | Frequency (%)
) | 65 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 57 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 46 | 100.0%
Decimal Number
Value | Count | Frequency (%)
8 | 2 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
& | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1131 | 83.2%
Common | 171 | 12.6%
Latin | 52 | 3.8%
Han | 5 | 0.4%

Most frequent character per script

Hangul
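Value | Count | Frequency (%)
(unrecovered) | 101 | 8.9%
(unrecovered) | 89 | 7.9%
(unrecovered) | 79 | 7.0%
(unrecovered) | 76 | 6.7%
(unrecovered) | 44 | 3.9%
(unrecovered) | 41 | 3.6%
(unrecovered) | 32 | 2.8%
(unrecovered) | 26 | 2.3%
(unrecovered) | 26 | 2.3%
(unrecovered) | 20 | 1.8%
Other values (204) | 597 | 52.8%
Latin
Value | Count | Frequency (%)
O | 4 | 7.7%
C | 4 | 7.7%
S | 3 | 5.8%
K | 3 | 5.8%
A | 3 | 5.8%
T | 3 | 5.8%
e | 3 | 5.8%
l | 2 | 3.8%
R | 2 | 3.8%
P | 2 | 3.8%
Other values (21) | 23 | 44.2%
Common
Value | Count | Frequency (%)
) | 65 | 38.0%
( | 57 | 33.3%
(space) | 46 | 26.9%
8 | 2 | 1.2%
& | 1 | 0.6%
Han
Value | Count | Frequency (%)
(unrecovered) | 1 | 20.0%
(unrecovered) | 1 | 20.0%
(unrecovered) | 1 | 20.0%
(unrecovered) | 1 | 20.0%
(unrecovered) | 1 | 20.0%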

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1131 | 83.2%
ASCII | 223 | 16.4%
CJK | 5 | 0.4%

Most frequent character per block

Hangul
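Value | Count | Frequency (%)
(unrecovered) | 101 | 8.9%
(unrecovered) | 89 | 7.9%
(unrecovered) | 79 | 7.0%
(unrecovered) | 76 | 6.7%
(unrecovered) | 44 | 3.9%
(unrecovered) | 41 | 3.6%
(unrecovered) | 32 | 2.8%
(unrecovered) | 26 | 2.3%
(unrecovered) | 26 | 2.3%
(unrecovered) | 20 | 1.8%
Other values (204) | 597 | 52.8%
ASCII
Value | Count | Frequency (%)
) | 65 | 29.1%
( | 57 | 25.6%
(space) | 46 | 20.6%
O | 4 | 1.8%
C | 4 | 1.8%
S | 3 | 1.3%
K | 3 | 1.3%
A | 3 | 1.3%
T | 3 | 1.3%
e | 3 | 1.3%
Other values (26) | 32 | 14.3%
CJK
Value | Count | Frequency (%)
(unrecovered) | 1 | 20.0%
(unrecovered) | 1 | 20.0%
(unrecovered) | 1 | 20.0%
(unrecovered) | 1 | 20.0%
(unrecovered) | 1 | 20.0%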

소재지(도로명) (location, road-name address)
Text

MISSING 

Distinct: 157
Distinct (%): 92.4%
Missing: 3
Missing (%): 1.7%
Memory size: 1.5 KiB

Length

Max length: 53
Median length: 42
Mean length: 33.958824
Min length: 19

Characters and Unicode

Total characters: 5773
Distinct characters: 202
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 145
Unique (%): 85.3%

Sample

1st row: 경기도 부천시 부흥로 339 301동 3-207호 (중동 스타팰리움)
2nd row: 경기도 부천시 부일로 326 1층 29호 (중동)
3rd row: 경기도 부천시 중동로 367 (약대동)
4th row: 경기도 부천시 부일로 390 (심곡동)
5th row: 경기도 부천시 경인로 20 송내 자이 104호 (송내동)
Value | Count | Frequency (%)
부천시 | 174 | 14.5%
경기도 | 170 | 14.2%
상동 | 55 | 4.6%
중동 | 43 | 3.6%
1층 | 20 | 1.7%
길주로 | 19 | 1.6%
심곡본동 | 16 | 1.3%
부일로 | 16 | 1.3%
심곡동 | 14 | 1.2%
중동로254번길 | 10 | 0.8%
Other values (410) | 663 | 55.2%

Most occurring characters

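Value | Count | Frequency (%)
(space) | 1211 | 21.0%
(unrecovered) | 229 | 4.0%
(unrecovered) | 223 | 3.9%
1 | 211 | 3.7%
(unrecovered) | 199 | 3.4%
(unrecovered) | 187 | 3.2%
(unrecovered) | 181 | 3.1%
( | 180 | 3.1%
2 | 180 | 3.1%
) | 180 | 3.1%
Other values (192) | 2792 | 48.4%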

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3024 | 52.4%
Space Separator | 1211 | 21.0%
Decimal Number | 1118 | 19.4%
Open Punctuation | 180 | 3.1%
Close Punctuation | 180 | 3.1%
Dash Punctuation | 34 | 0.6%
Uppercase Letter | 22 | 0.4%
Other Punctuation | 2 | < 0.1%
Lowercase Letter | 2 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(unrecovered) | 229 | 7.6%
(unrecovered) | 223 | 7.4%
(unrecovered) | 199 | 6.6%
(unrecovered) | 187 | 6.2%
(unrecovered) | 181 | 6.0%
(unrecovered) | 175 | 5.8%
(unrecovered) | 171 | 5.7%
(unrecovered) | 170 | 5.6%
(unrecovered) | 138 | 4.6%
(unrecovered) | 89 | 2.9%
Other values (169) | 1262 | 41.7%
Decimal Number
Value | Count | Frequency (%)
1 | 211 | 18.9%
2 | 180 | 16.1%
3 | 143 | 12.8%
0 | 141 | 12.6%
4 | 108 | 9.7%
7 | 77 | 6.9%
5 | 76 | 6.8%
8 | 63 | 5.6%
6 | 62 | 5.5%
9 | 57 | 5.1%
Uppercase Letter
Value | Count | Frequency (%)
B | 10 | 45.5%
A | 8 | 36.4%
C | 1 | 4.5%
F | 1 | 4.5%
N | 1 | 4.5%
G | 1 | 4.5%
Lowercase Letter
Value | Count | Frequency (%)
y | 1 | 50.0%
b | 1 | 50.0%
Space Separator
Value | Count | Frequency (%)
(space) | 1211 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 180 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 180 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 34 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
. | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3024 | 52.4%
Common | 2725 | 47.2%
Latin | 24 | 0.4%

Most frequent character per script

Hangul
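Value | Count | Frequency (%)
(unrecovered) | 229 | 7.6%
(unrecovered) | 223 | 7.4%
(unrecovered) | 199 | 6.6%
(unrecovered) | 187 | 6.2%
(unrecovered) | 181 | 6.0%
(unrecovered) | 175 | 5.8%
(unrecovered) | 171 | 5.7%
(unrecovered) | 170 | 5.6%
(unrecovered) | 138 | 4.6%
(unrecovered) | 89 | 2.9%
Other values (169) | 1262 | 41.7%
Common
Value | Count | Frequency (%)
(space) | 1211 | 44.4%
1 | 211 | 7.7%
( | 180 | 6.6%
2 | 180 | 6.6%
) | 180 | 6.6%
3 | 143 | 5.2%
0 | 141 | 5.2%
4 | 108 | 4.0%
7 | 77 | 2.8%
5 | 76 | 2.8%
Other values (5) | 218 | 8.0%
Latin
Value | Count | Frequency (%)
B | 10 | 41.7%
A | 8 | 33.3%
C | 1 | 4.2%
F | 1 | 4.2%
y | 1 | 4.2%
N | 1 | 4.2%
G | 1 | 4.2%
b | 1 | 4.2%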

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3024 | 52.4%
ASCII | 2749 | 47.6%

Most frequent character per block

ASCII
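Value | Count | Frequency (%)
(space) | 1211 | 44.1%
1 | 211 | 7.7%
( | 180 | 6.5%
2 | 180 | 6.5%
) | 180 | 6.5%
3 | 143 | 5.2%
0 | 141 | 5.1%
4 | 108 | 3.9%
7 | 77 | 2.8%
5 | 76 | 2.8%
Other values (13) | 242 | 8.8%
Hangul
Value | Count | Frequency (%)
(unrecovered) | 229 | 7.6%
(unrecovered) | 223 | 7.4%
(unrecovered) | 199 | 6.6%
(unrecovered) | 187 | 6.2%
(unrecovered) | 181 | 6.0%
(unrecovered) | 175 | 5.8%
(unrecovered) | 171 | 5.7%
(unrecovered) | 170 | 5.6%
(unrecovered) | 138 | 4.6%
(unrecovered) | 89 | 2.9%
Other values (169) | 1262 | 41.7%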

전화번호 (telephone number)
Text

MISSING 

Distinct: 71
Distinct (%): 94.7%
Missing: 98
Missing (%): 56.6%
Memory size: 1.5 KiB

Length

Max length: 14
Median length: 12
Mean length: 11.893333
Min length: 9

Characters and Unicode

Total characters: 892
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 67
Unique (%): 89.3%
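
The distinct and unique percentages for this variable only add up if they are taken over the non-missing rows rather than all 173 observations. A brief check using the figures reported above (a sketch of the arithmetic, not the profiler's code):

```python
# Figures copied from this variable's summary.
n_rows = 173
missing = 98
distinct = 71
unique = 67
total_characters = 892

present = n_rows - missing  # rows that actually contain a phone number

distinct_pct = 100 * distinct / present
unique_pct = 100 * unique / present
mean_length = total_characters / present

print(present)                 # 75
print(f"{distinct_pct:.1f}%")  # 94.7%
print(f"{unique_pct:.1f}%")    # 89.3%
print(round(mean_length, 6))   # 11.893333
```

Dividing by all 173 rows instead would give about 41% distinct, which does not match the report.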

Sample

1st row: 032-666-2300
2nd row: 032-349-2300
3rd row: 032-681-1100
4th row: 032-661-6626
5th row: 032-667-3434
Value | Count | Frequency (%)
032-661-6626 | 2 | 2.7%
032-666-8420 | 2 | 2.7%
032-666-2300 | 2 | 2.7%
032-624-7570 | 2 | 2.7%
032-215-1655 | 1 | 1.3%
02-542-5979 | 1 | 1.3%
032-507-8686 | 1 | 1.3%
032-325-2525 | 1 | 1.3%
02-736-2281 | 1 | 1.3%
032-321-5286 | 1 | 1.3%
Other values (61) | 61 | 81.3%

Most occurring characters

Value | Count | Frequency (%)
- | 147 | 16.5%
2 | 133 | 14.9%
0 | 128 | 14.3%
3 | 122 | 13.7%
6 | 95 | 10.7%
5 | 64 | 7.2%
7 | 52 | 5.8%
1 | 45 | 5.0%
8 | 44 | 4.9%
4 | 39 | 4.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 745 | 83.5%
Dash Punctuation | 147 | 16.5%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 133 | 17.9%
0 | 128 | 17.2%
3 | 122 | 16.4%
6 | 95 | 12.8%
5 | 64 | 8.6%
7 | 52 | 7.0%
1 | 45 | 6.0%
8 | 44 | 5.9%
4 | 39 | 5.2%
9 | 23 | 3.1%
Dash Punctuation
Value | Count | Frequency (%)
- | 147 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 892 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 147 | 16.5%
2 | 133 | 14.9%
0 | 128 | 14.3%
3 | 122 | 13.7%
6 | 95 | 10.7%
5 | 64 | 7.2%
7 | 52 | 5.8%
1 | 45 | 5.0%
8 | 44 | 4.9%
4 | 39 | 4.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 892 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 147 | 16.5%
2 | 133 | 14.9%
0 | 128 | 14.3%
3 | 122 | 13.7%
6 | 95 | 10.7%
5 | 64 | 7.2%
7 | 52 | 5.8%
1 | 45 | 5.0%
8 | 44 | 4.9%
4 | 39 | 4.4%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]
 | 연번 | 업종 유형 | 전화번호
연번 | 1.000 | 0.949 | 0.873
업종 유형 | 0.949 | 1.000 | 0.000
전화번호 | 0.873 | 0.000 | 1.000
[Figure: correlation heatmap]
 | 연번 | 업종 유형
연번 | 1.000 | 0.921
업종 유형 | 0.921 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

(index) | 연번 | 업종 유형 | 상호명 | 소재지(도로명) | 전화번호
0 | 1 | 국내여행업 | 주)내쇼날항공여행사 | 경기도 부천시 부흥로 339 301동 3-207호 (중동 스타팰리움) | 032-666-2300
1 | 2 | 국내여행업 | 프린스관광(주) | 경기도 부천시 부일로 326 1층 29호 (중동) | <NA>
2 | 3 | 국내여행업 | 신세대관광(주) | <NA> | <NA>
3 | 4 | 국내여행업 | (주)현대항공여행사 | <NA> | 032-349-2300
4 | 5 | 국내여행업 | 블루스카이(주) | 경기도 부천시 중동로 367 (약대동) | 032-681-1100
5 | 6 | 국내여행업 | 주)쿨허니문투어 | 경기도 부천시 부일로 390 (심곡동) | 032-661-6626
6 | 7 | 국내여행업 | 수도고속관광여행사(주) | 경기도 부천시 경인로 20 송내 자이 104호 (송내동) | 032-667-3434
7 | 8 | 국내여행업 | (주)신세계길벗고속관광 | 경기도 부천시 길주로 311 뉴월드타운 2동 201호 (중동) | 032-328-8671
8 | 9 | 국내여행업 | (주)하나월드여행사 | 경기도 부천시 송내대로73번길 46 대양훼미리코아 7층 A14호 (상동) | <NA>
9 | 10 | 국내여행업 | ABC투어 | <NA> | <NA>

(index) | 연번 | 업종 유형 | 상호명 | 소재지(도로명) | 전화번호
163 | 164 | 종합여행업 | 은하행정 여행사 | 경기도 부천시 상일로 82 중동다모아쇼핑타운 503호 (상동) | <NA>
164 | 165 | 종합여행업 | 동북행정사여행사사무소 | 경기도 부천시 경인로 232 이테크 에비뉴스타 1층 113호 (심곡본동) | <NA>
165 | 166 | 종합여행업 | 태화국제여행사 | 경기도 부천시 경인옛로 25 103동 B109 B110호 (소사본동 부천 한신더휴 메트로) | <NA>
166 | 167 | 종합여행업 | 유니버스투어 | 경기도 부천시 중동로 19 B층 102호 (송내동 래미안부천어반비스타) | <NA>
167 | 168 | 종합여행업 | 남산번역여행사 | 경기도 부천시 소삼로 17 2층 (소사본동) | 032-716-5891
168 | 169 | 종합여행업 | 시대여행사 | 경기도 부천시 경인옛로 25 103동 101호 (소사본동 부천 한신더휴 메트로) | <NA>
169 | 170 | 종합여행업 | 마이러브투어 | 경기도 부천시 소향로 225 디아뜨갤러리3 B동 비동 306호 (중동) | <NA>
170 | 171 | 종합여행업 | 리빌국제여행사 | 경기도 부천시 부일로 482 1층 (심곡동) | <NA>
171 | 172 | 종합여행업 | 주식회사 한림여행사 | 경기도 부천시 부일로233번길 45 402호 (상동) | <NA>
172 | 173 | 종합여행업 | 주식회사 한여국제 | 경기도 부천시 중동로254번길 78 필타운 6층 (중동) | <NA>