Overview

Dataset statistics

Number of variables: 5
Number of observations: 254
Missing cells: 196
Missing cells (%): 15.4%
Duplicate rows: 8
Duplicate rows (%): 3.1%
Total size in memory: 10.1 KiB
Average record size in memory: 40.5 B
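
The two percentages can be re-derived from the counts. The sketch below assumes that the missing-cell share is taken over all cells (observations × variables) and the duplicate share over observations; the variable names are illustrative only.

```python
# Counts copied from the dataset statistics in this report.
n_variables = 5
n_observations = 254
missing_cells = 196
duplicate_rows = 8

# Assumption: missing cells are measured against every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: duplicate rows are measured against the number of observations.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")     # Missing cells (%): 15.4%
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")  # Duplicate rows (%): 3.1%
```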

Variable types

Text: 3
Categorical: 2

Dataset

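Description: Data on travel agencies in Gimpo-si, Gyeonggi-do. It provides fields such as business name (업체명), address (소재지주소), telephone number (전화번호), business category (업종구분), and data reference date (데이터기준일자).
URL: https://www.data.go.kr/data/15036605/fileData.do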

Alerts

데이터기준일자 has constant value "2023-07-17" (Constant)
Dataset has 8 (3.1%) duplicate rows (Duplicates)
전화번호 has 196 (77.2%) missing values (Missing)

Reproduction

Analysis started: 2023-12-11 22:47:20.214212
Analysis finished: 2023-12-11 22:47:20.663417
Duration: 0.45 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업체명 (business name)
Text

Distinct: 152
Distinct (%): 59.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB

Length

Max length: 25
Median length: 17
Mean length: 8.3031496
Min length: 2

Characters and Unicode

Total characters: 2109
Distinct characters: 278
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
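
As a rough illustration of how such character properties can be read, the sketch below uses Python's standard `unicodedata` module to count general categories over two business names taken from the sample rows of this report. It is not the profiler's own implementation; the helper name `category_counts` and the label mapping are assumptions made for the example.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in this small example.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}

# Two values of 업체명 copied from the sample rows of this report.
names = ["(주)세주여행사", "수트래블(주)"]

def category_counts(values):
    """Count the Unicode general category of every character in the values."""
    return Counter(unicodedata.category(ch) for text in values for ch in text)

counts = category_counts(names)
for code, n in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), n)
```

Hangul syllables fall into the category "Other Letter" (`Lo`), which is why that category dominates the text variables in this report.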

Unique

Unique: 61
Unique (%): 24.0%

Sample

1st row: (주)세주여행사
2nd row: 수트래블(주)
3rd row: 경진관광(주)
4th row: (주)국민상조
5th row: (주)어울림관광여행사

Value | Count | Frequency (%)
주식회사 | 30 | 9.4%
travel | 5 | 1.6%
여행사 | 5 | 1.6%
주)투어퀸스 | 3 | 0.9%
주)하모니여행사 | 3 | 0.9%
폴트래블(foll | 3 | 0.9%
제로트레킹 | 3 | 0.9%
주)리앤뉴 | 3 | 0.9%
한강모두트래블 | 3 | 0.9%
주)인투온 | 3 | 0.9%
Other values (163) | 257 | 80.8%

Most occurring characters

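Value | Count | Frequency (%)
[Hangul character, not recovered] | 159 | 7.5%
( | 136 | 6.4%
) | 136 | 6.4%
[Hangul character, not recovered] | 89 | 4.2%
[Hangul character, not recovered] | 67 | 3.2%
(space) | 64 | 3.0%
[Hangul character, not recovered] | 63 | 3.0%
[Hangul character, not recovered] | 58 | 2.8%
[Hangul character, not recovered] | 56 | 2.7%
[Hangul character, not recovered] | 36 | 1.7%
Other values (268) | 1245 | 59.0%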

Most occurring categories

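Value | Count | Frequency (%)
Other Letter | 1628 | 77.2%
Open Punctuation | 136 | 6.4%
Close Punctuation | 136 | 6.4%
Uppercase Letter | 75 | 3.6%
Lowercase Letter | 69 | 3.3%
Space Separator | 64 | 3.0%
Decimal Number | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[Hangul character, not recovered] | 159 | 9.8%
[Hangul character, not recovered] | 89 | 5.5%
[Hangul character, not recovered] | 67 | 4.1%
[Hangul character, not recovered] | 63 | 3.9%
[Hangul character, not recovered] | 58 | 3.6%
[Hangul character, not recovered] | 56 | 3.4%
[Hangul character, not recovered] | 36 | 2.2%
[Hangul character, not recovered] | 33 | 2.0%
[Hangul character, not recovered] | 33 | 2.0%
[Hangul character, not recovered] | 32 | 2.0%
Other values (229) | 1002 | 61.5%

Uppercase Letter
Value | Count | Frequency (%)
L | 10 | 13.3%
T | 8 | 10.7%
A | 7 | 9.3%
K | 5 | 6.7%
E | 4 | 5.3%
C | 4 | 5.3%
R | 4 | 5.3%
O | 4 | 5.3%
S | 4 | 5.3%
W | 3 | 4.0%
Other values (11) | 22 | 29.3%

Lowercase Letter
Value | Count | Frequency (%)
e | 13 | 18.8%
r | 11 | 15.9%
o | 9 | 13.0%
a | 6 | 8.7%
n | 5 | 7.2%
h | 4 | 5.8%
v | 4 | 5.8%
l | 4 | 5.8%
u | 4 | 5.8%
p | 3 | 4.3%
Other values (4) | 6 | 8.7%

Open Punctuation
Value | Count | Frequency (%)
( | 136 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 136 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 64 | 100.0%

Decimal Number
Value | Count | Frequency (%)
9 | 1 | 100.0%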

Most occurring scripts

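Value | Count | Frequency (%)
Hangul | 1628 | 77.2%
Common | 337 | 16.0%
Latin | 144 | 6.8%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[Hangul character, not recovered] | 159 | 9.8%
[Hangul character, not recovered] | 89 | 5.5%
[Hangul character, not recovered] | 67 | 4.1%
[Hangul character, not recovered] | 63 | 3.9%
[Hangul character, not recovered] | 58 | 3.6%
[Hangul character, not recovered] | 56 | 3.4%
[Hangul character, not recovered] | 36 | 2.2%
[Hangul character, not recovered] | 33 | 2.0%
[Hangul character, not recovered] | 33 | 2.0%
[Hangul character, not recovered] | 32 | 2.0%
Other values (229) | 1002 | 61.5%

Latin
Value | Count | Frequency (%)
e | 13 | 9.0%
r | 11 | 7.6%
L | 10 | 6.9%
o | 9 | 6.2%
T | 8 | 5.6%
A | 7 | 4.9%
a | 6 | 4.2%
n | 5 | 3.5%
K | 5 | 3.5%
E | 4 | 2.8%
Other values (25) | 66 | 45.8%

Common
Value | Count | Frequency (%)
( | 136 | 40.4%
) | 136 | 40.4%
(space) | 64 | 19.0%
9 | 1 | 0.3%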

Most occurring blocks

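Value | Count | Frequency (%)
Hangul | 1628 | 77.2%
ASCII | 481 | 22.8%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[Hangul character, not recovered] | 159 | 9.8%
[Hangul character, not recovered] | 89 | 5.5%
[Hangul character, not recovered] | 67 | 4.1%
[Hangul character, not recovered] | 63 | 3.9%
[Hangul character, not recovered] | 58 | 3.6%
[Hangul character, not recovered] | 56 | 3.4%
[Hangul character, not recovered] | 36 | 2.2%
[Hangul character, not recovered] | 33 | 2.0%
[Hangul character, not recovered] | 33 | 2.0%
[Hangul character, not recovered] | 32 | 2.0%
Other values (229) | 1002 | 61.5%

ASCII
Value | Count | Frequency (%)
( | 136 | 28.3%
) | 136 | 28.3%
(space) | 64 | 13.3%
e | 13 | 2.7%
r | 11 | 2.3%
L | 10 | 2.1%
o | 9 | 1.9%
T | 8 | 1.7%
A | 7 | 1.5%
a | 6 | 1.2%
Other values (29) | 81 | 16.8%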

소재지주소 (address)
Text

Distinct: 147
Distinct (%): 57.9%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB

Length

Max length: 48
Median length: 40
Mean length: 31.161417
Min length: 18

Characters and Unicode

Total characters: 7915
Distinct characters: 187
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 61
Unique (%): 24.0%

Sample

1st row: 경기도 김포시 사우중로 27 (사우동)
2nd row: 경기도 김포시 사우중로74번길 29, 701동 7층 33호 (사우동, 시그마프라자)
3rd row: 경기도 김포시 통진읍 김포대로2053번길 42-2
4th row: 경기도 김포시 고촌읍 전호로26번길 29
5th row: 경기도 김포시 장릉로 3 (풍무동)

Value | Count | Frequency (%)
경기도 | 254 | 15.7%
김포시 | 254 | 15.7%
고촌읍 | 53 | 3.3%
장기동 | 35 | 2.2%
사우동 | 28 | 1.7%
구래동 | 28 | 1.7%
양촌읍 | 22 | 1.4%
운양동 | 19 | 1.2%
태장로 | 18 | 1.1%
통진읍 | 17 | 1.0%
Other values (328) | 894 | 55.1%

Most occurring characters

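Value | Count | Frequency (%)
(space) | 1368 | 17.3%
[Hangul character, not recovered] | 339 | 4.3%
[Hangul character, not recovered] | 332 | 4.2%
1 | 329 | 4.2%
[Hangul character, not recovered] | 300 | 3.8%
[Hangul character, not recovered] | 261 | 3.3%
[Hangul character, not recovered] | 256 | 3.2%
[Hangul character, not recovered] | 254 | 3.2%
[Hangul character, not recovered] | 254 | 3.2%
2 | 238 | 3.0%
Other values (177) | 3984 | 50.3%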

Most occurring categories

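Value | Count | Frequency (%)
Other Letter | 4294 | 54.3%
Decimal Number | 1648 | 20.8%
Space Separator | 1368 | 17.3%
Other Punctuation | 205 | 2.6%
Close Punctuation | 150 | 1.9%
Open Punctuation | 150 | 1.9%
Dash Punctuation | 90 | 1.1%
Uppercase Letter | 7 | 0.1%
Lowercase Letter | 3 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[Hangul character, not recovered] | 339 | 7.9%
[Hangul character, not recovered] | 332 | 7.7%
[Hangul character, not recovered] | 300 | 7.0%
[Hangul character, not recovered] | 261 | 6.1%
[Hangul character, not recovered] | 256 | 6.0%
[Hangul character, not recovered] | 254 | 5.9%
[Hangul character, not recovered] | 254 | 5.9%
[Hangul character, not recovered] | 183 | 4.3%
[Hangul character, not recovered] | 166 | 3.9%
[Hangul character, not recovered] | 101 | 2.4%
Other values (158) | 1848 | 43.0%

Decimal Number
Value | Count | Frequency (%)
1 | 329 | 20.0%
2 | 238 | 14.4%
5 | 200 | 12.1%
3 | 183 | 11.1%
0 | 173 | 10.5%
7 | 151 | 9.2%
4 | 130 | 7.9%
9 | 89 | 5.4%
6 | 80 | 4.9%
8 | 75 | 4.6%

Uppercase Letter
Value | Count | Frequency (%)
C | 4 | 57.1%
B | 3 | 42.9%

Lowercase Letter
Value | Count | Frequency (%)
v | 2 | 66.7%
e | 1 | 33.3%

Space Separator
Value | Count | Frequency (%)
(space) | 1368 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 205 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 150 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 150 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 90 | 100.0%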

Most occurring scripts

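Value | Count | Frequency (%)
Hangul | 4294 | 54.3%
Common | 3611 | 45.6%
Latin | 10 | 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[Hangul character, not recovered] | 339 | 7.9%
[Hangul character, not recovered] | 332 | 7.7%
[Hangul character, not recovered] | 300 | 7.0%
[Hangul character, not recovered] | 261 | 6.1%
[Hangul character, not recovered] | 256 | 6.0%
[Hangul character, not recovered] | 254 | 5.9%
[Hangul character, not recovered] | 254 | 5.9%
[Hangul character, not recovered] | 183 | 4.3%
[Hangul character, not recovered] | 166 | 3.9%
[Hangul character, not recovered] | 101 | 2.4%
Other values (158) | 1848 | 43.0%

Common
Value | Count | Frequency (%)
(space) | 1368 | 37.9%
1 | 329 | 9.1%
2 | 238 | 6.6%
, | 205 | 5.7%
5 | 200 | 5.5%
3 | 183 | 5.1%
0 | 173 | 4.8%
7 | 151 | 4.2%
) | 150 | 4.2%
( | 150 | 4.2%
Other values (5) | 464 | 12.8%

Latin
Value | Count | Frequency (%)
C | 4 | 40.0%
B | 3 | 30.0%
v | 2 | 20.0%
e | 1 | 10.0%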

Most occurring blocks

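Value | Count | Frequency (%)
Hangul | 4294 | 54.3%
ASCII | 3621 | 45.7%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 1368 | 37.8%
1 | 329 | 9.1%
2 | 238 | 6.6%
, | 205 | 5.7%
5 | 200 | 5.5%
3 | 183 | 5.1%
0 | 173 | 4.8%
7 | 151 | 4.2%
) | 150 | 4.1%
( | 150 | 4.1%
Other values (9) | 474 | 13.1%

Hangul
Value | Count | Frequency (%)
[Hangul character, not recovered] | 339 | 7.9%
[Hangul character, not recovered] | 332 | 7.7%
[Hangul character, not recovered] | 300 | 7.0%
[Hangul character, not recovered] | 261 | 6.1%
[Hangul character, not recovered] | 256 | 6.0%
[Hangul character, not recovered] | 254 | 5.9%
[Hangul character, not recovered] | 254 | 5.9%
[Hangul character, not recovered] | 183 | 4.3%
[Hangul character, not recovered] | 166 | 3.9%
[Hangul character, not recovered] | 101 | 2.4%
Other values (158) | 1848 | 43.0%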

전화번호 (telephone number)
Text

MISSING 

Distinct: 53
Distinct (%): 91.4%
Missing: 196
Missing (%): 77.2%
Memory size: 2.1 KiB

Length

Max length: 13
Median length: 12
Mean length: 11.982759
Min length: 9

Characters and Unicode

Total characters: 695
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 48
Unique (%): 82.8%
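
For this mostly empty column, the reported percentages and the mean length are consistent with being computed over the non-missing values only. That reading is an inference from the figures in this report, not a statement about the profiler's documented behaviour. A minimal check, with illustrative variable names:

```python
# Figures copied from the 전화번호 section of this report.
n_observations = 254
n_missing = 196
n_distinct = 53
n_unique = 48
total_characters = 695

# Assumption: the denominators are the non-missing values only.
n_present = n_observations - n_missing
distinct_pct = 100 * n_distinct / n_present
unique_pct = 100 * n_unique / n_present
mean_length = total_characters / n_present

print(n_present)                # 58
print(f"{distinct_pct:.1f}%")   # 91.4%
print(f"{unique_pct:.1f}%")     # 82.8%
print(round(mean_length, 6))    # 11.982759
```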

Sample

1st row: 031-984-9900
2nd row: 031-997-1450
3rd row: 070-4331-9076
4th row: 070-7768-6145
5th row: 031-999-7832

Value | Count | Frequency (%)
031-411-8004 | 2 | 3.4%
02-715-3622 | 2 | 3.4%
031-982-2039 | 2 | 3.4%
031-989-7715 | 2 | 3.4%
070-7768-6145 | 2 | 3.4%
02-2285-2506 | 1 | 1.7%
031-495-4905 | 1 | 1.7%
02-755-5545 | 1 | 1.7%
031-998-0272 | 1 | 1.7%
02-588-1186 | 1 | 1.7%
Other values (43) | 43 | 74.1%

Most occurring characters

Value | Count | Frequency (%)
- | 115 | 16.5%
0 | 110 | 15.8%
1 | 83 | 11.9%
9 | 75 | 10.8%
3 | 67 | 9.6%
8 | 61 | 8.8%
2 | 44 | 6.3%
7 | 41 | 5.9%
4 | 36 | 5.2%
5 | 34 | 4.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 580 | 83.5%
Dash Punctuation | 115 | 16.5%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 110 | 19.0%
1 | 83 | 14.3%
9 | 75 | 12.9%
3 | 67 | 11.6%
8 | 61 | 10.5%
2 | 44 | 7.6%
7 | 41 | 7.1%
4 | 36 | 6.2%
5 | 34 | 5.9%
6 | 29 | 5.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 115 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 695 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 115 | 16.5%
0 | 110 | 15.8%
1 | 83 | 11.9%
9 | 75 | 10.8%
3 | 67 | 9.6%
8 | 61 | 8.8%
2 | 44 | 6.3%
7 | 41 | 5.9%
4 | 36 | 5.2%
5 | 34 | 4.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 695 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 115 | 16.5%
0 | 110 | 15.8%
1 | 83 | 11.9%
9 | 75 | 10.8%
3 | 67 | 9.6%
8 | 61 | 8.8%
2 | 44 | 6.3%
7 | 41 | 5.9%
4 | 36 | 5.2%
5 | 34 | 4.9%

업종구분 (business category)
Categorical

Distinct: 3
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB
국내외: 163
종합: 66
국내: 25

Length

Max length: 3
Median length: 3
Mean length: 2.6417323
Min length: 2
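
The mean length follows directly from the category counts reported for this variable. The short sketch below reproduces it as a count-weighted average of the label lengths; it only illustrates the arithmetic and is not the profiler's code.

```python
# Category counts as reported for 업종구분 in this report.
counts = {"국내외": 163, "종합": 66, "국내": 25}

total = sum(counts.values())  # number of observations

# Weighted average of the label lengths (3, 2 and 2 characters).
mean_length = sum(len(label) * n for label, n in counts.items()) / total

print(total)                   # 254
print(round(mean_length, 7))   # 2.6417323
```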

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 종합
2nd row: 종합
3rd row: 종합
4th row: 종합
5th row: 종합

Common Values

Value | Count | Frequency (%)
국내외 | 163 | 64.2%
종합 | 66 | 26.0%
국내 | 25 | 9.8%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values of 업종구분: 국내외 163 (64.2%), 종합 66 (26.0%), 국내 25 (9.8%)]

데이터기준일자 (data reference date)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB
2023-07-17: 254

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-07-17
2nd row: 2023-07-17
3rd row: 2023-07-17
4th row: 2023-07-17
5th row: 2023-07-17

Common Values

Value | Count | Frequency (%)
2023-07-17 | 254 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values of 데이터기준일자: 2023-07-17, 254 (100.0%)]

Correlations

[Figure: correlation heatmap]
Variable | 전화번호 | 업종구분
전화번호 | 1.000 | 0.112
업종구분 | 0.112 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
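
To give a feel for what a nullity matrix encodes, here is a minimal text-only sketch over the business name and phone number of the first ten sample rows of this report. The function name `nullity_matrix` and the `#`/`.` symbols are choices made for this example, not part of the profiler.

```python
# (업체명, 전화번호) for the first ten sample rows; None stands for <NA>.
rows = [
    ("(주)세주여행사", "031-984-9900"),
    ("수트래블(주)", "031-997-1450"),
    ("경진관광(주)", None),
    ("(주)국민상조", "070-4331-9076"),
    ("(주)어울림관광여행사", None),
    ("(주)안녕하세요여행사", "070-7768-6145"),
    ("수자원환경산업진흥(주)", "031-999-7832"),
    ("(주)길벗여행사", "031-991-3294"),
    ("(주)인투온", None),
    ("(주)힐링투어라인", "031-988-1660"),
]

def nullity_matrix(records):
    """Return one string per row: '#' for a present cell, '.' for a missing one."""
    return ["".join("#" if cell is not None else "." for cell in record)
            for record in records]

matrix = nullity_matrix(rows)

# Column-wise missing counts, the quantity shown by the bar-style nullity view.
missing_per_column = [sum(cell is None for cell in column) for column in zip(*rows)]

print("\n".join(matrix))
print(missing_per_column)  # [0, 3]
```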

Sample

First rows
Index | 업체명 | 소재지주소 | 전화번호 | 업종구분 | 데이터기준일자
0 | (주)세주여행사 | 경기도 김포시 사우중로 27 (사우동) | 031-984-9900 | 종합 | 2023-07-17
1 | 수트래블(주) | 경기도 김포시 사우중로74번길 29, 701동 7층 33호 (사우동, 시그마프라자) | 031-997-1450 | 종합 | 2023-07-17
2 | 경진관광(주) | 경기도 김포시 통진읍 김포대로2053번길 42-2 | <NA> | 종합 | 2023-07-17
3 | (주)국민상조 | 경기도 김포시 고촌읍 전호로26번길 29 | 070-4331-9076 | 종합 | 2023-07-17
4 | (주)어울림관광여행사 | 경기도 김포시 장릉로 3 (풍무동) | <NA> | 종합 | 2023-07-17
5 | (주)안녕하세요여행사 | 경기도 김포시 사우중로74번길 29, 701호 (사우동, 시그마프라자) | 070-7768-6145 | 종합 | 2023-07-17
6 | 수자원환경산업진흥(주) | 경기도 김포시 고촌읍 아라육로270번길 74 | 031-999-7832 | 종합 | 2023-07-17
7 | (주)길벗여행사 | 경기도 김포시 고촌읍 고송로 12, 2층 | 031-991-3294 | 종합 | 2023-07-17
8 | (주)인투온 | 경기도 김포시 월곶면 애기봉로 392-11 | <NA> | 종합 | 2023-07-17
9 | (주)힐링투어라인 | 경기도 김포시 고촌읍 김포대로 344, 6층 606호 (정다운 가) | 031-988-1660 | 종합 | 2023-07-17

Last rows
Index | 업체명 | 소재지주소 | 전화번호 | 업종구분 | 데이터기준일자
244 | 한강모두트래블 | 경기도 김포시 김포한강9로75번길 142, 태림더끌리움 1711호 (구래동) | <NA> | 국내 | 2023-07-17
245 | 여함여행사 | 경기도 김포시 중봉1로 53, 가동 (감정동) | <NA> | 국내 | 2023-07-17
246 | (주)동백미디어 | 경기도 김포시 김포한강11로 322, 더파크뷰테라스오피스텔 229호 (운양동) | <NA> | 국내 | 2023-07-17
247 | (주)리앤뉴 | 경기도 김포시 태장로795번길 23, 536호 (장기동) | 070-4949-4026 | 국내 | 2023-07-17
248 | 주식회사 스카이투어 | 경기도 김포시 양촌읍 양곡4로 139, 나동 104호 | <NA> | 국내 | 2023-07-17
249 | 주식회사 하나트레블 | 경기도 김포시 양촌읍 양곡4로 139, 나동 104호 | <NA> | 국내 | 2023-07-17
250 | (주)투어퀸스 | 경기도 김포시 김포한강11로 322, 138-5호 (운양동) | <NA> | 국내 | 2023-07-17
251 | 산타고버스 | 경기도 김포시 풍무2로 25, 645호 (풍무동) | <NA> | 국내 | 2023-07-17
252 | (주)골드웨이여행사 | 경기도 김포시 풍무로 2, 유현마을신동아아파트 상가402동 304호 (풍무동) | <NA> | 국내 | 2023-07-17
253 | 흰돌관광(주) | 경기도 김포시 고촌읍 고송로 12-1, 201호 | <NA> | 국내 | 2023-07-17

Duplicate rows

Most frequently occurring

Index | 업체명 | 소재지주소 | 전화번호 | 업종구분 | 데이터기준일자 | # duplicates
0 | (주)동백미디어 | 경기도 김포시 김포한강11로 322, 더파크뷰테라스오피스텔 229호 (운양동) | 031-411-8004 | 국내외 | 2023-07-17 | 2
1 | (주)리앤뉴 | 경기도 김포시 태장로795번길 23, 536호 (장기동) | <NA> | 국내외 | 2023-07-17 | 2
2 | (주)투어퀸스 | 경기도 김포시 김포한강11로 322, 138-5호 (운양동) | <NA> | 국내외 | 2023-07-17 | 2
3 | (주)하나관광 | 경기도 김포시 양촌읍 양곡4로 139, 나동 104호 | <NA> | 국내외 | 2023-07-17 | 2
4 | (주)하모니여행사 | 경기도 김포시 중구로 27 (북변동, 건재빌딩 2층) | <NA> | 국내외 | 2023-07-17 | 2
5 | 여함여행사 | 경기도 김포시 중봉1로 53, 가동 (감정동) | <NA> | 국내외 | 2023-07-17 | 2
6 | 제로트레킹 | 경기도 김포시 북변중로68번길 7 (북변동) | <NA> | 국내외 | 2023-07-17 | 2
7 | 한강모두트래블 | 경기도 김포시 김포한강9로75번길 142, 태림더끌리움 1711호 (구래동) | <NA> | 국내외 | 2023-07-17 | 2
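
A duplicate row here is one whose five fields all match another row, missing phone numbers included. The sketch below shows one simple way to find such rows with `collections.Counter`. The three-row input is a hand-picked excerpt (one duplicated row from the table above plus one unrelated sample row), and `duplicate_rows` is a hypothetical helper, not the profiler's function.

```python
from collections import Counter

# Each row: (업체명, 소재지주소, 전화번호, 업종구분, 데이터기준일자); None stands for <NA>.
rows = [
    ("제로트레킹", "경기도 김포시 북변중로68번길 7 (북변동)", None, "국내외", "2023-07-17"),
    ("(주)세주여행사", "경기도 김포시 사우중로 27 (사우동)", "031-984-9900", "종합", "2023-07-17"),
    ("제로트레킹", "경기도 김포시 북변중로68번길 7 (북변동)", None, "국내외", "2023-07-17"),
]

def duplicate_rows(records):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(records)
    return {row: n for row, n in counts.items() if n > 1}

dupes = duplicate_rows(rows)
for row, n in dupes.items():
    print(row[0], n)  # 제로트레킹 2
```

Treating two missing phone numbers as equal is a design choice of this sketch; it matches how the rows above are grouped, where most duplicated rows have no phone number.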