Overview

Dataset statistics

Number of variables: 4
Number of observations: 390
Missing cells: 47
Missing cells (%): 3.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 12.7 KiB
Average record size in memory: 33.3 B
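
The percentage and per-record figures above follow from the raw counts. The check below is a minimal sketch that assumes the reported 12.7 KiB is the total size rounded to one decimal, so the last digit of a recomputed value could differ from the report for other datasets.

```python
# Figures copied from the dataset statistics above.
n_variables = 4
n_observations = 390
missing_cells = 47
total_size_kib = 12.7  # rounded value as shown in the report

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```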

Variable types

Numeric: 1
Categorical: 1
Text: 2

Dataset

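Description: Data on facilities in Dong-gu, Busan that are subject to mandatory disinfection. It provides fields such as facility type (lodging, food, culture, education, office, apartment housing, and others), facility name, and location.
Author: 부산광역시 동구 (Dong-gu, Busan Metropolitan City)
URL: https://www.data.go.kr/data/15127396/fileData.do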

Alerts

연번 is highly correlated overall with 시설종류 (High correlation)
시설종류 is highly correlated overall with 연번 (High correlation)
시설명 has 47 (12.1%) missing values (Missing)
연번 has unique values (Unique)

Reproduction

Analysis started: 2024-04-21 02:31:49.816214
Analysis finished: 2024-04-21 02:31:51.335464
Duration: 1.52 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 390
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 195.5
Minimum: 1
Maximum: 390
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 20.45
Q1: 98.25
Median: 195.5
Q3: 292.75
95-th percentile: 370.55
Maximum: 390
Range: 389
Interquartile range (IQR): 194.5
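
The fractional quantiles above are what linear interpolation between neighbouring ranks gives for the integers 1 to 390. The report does not say which interpolation rule was used, so the helper below (a hypothetical `percentile` function, not part of any library) is only a sketch under that assumption.

```python
def percentile(sorted_values, q):
    """Percentile by linear interpolation between closest ranks (q in [0, 100])."""
    position = (len(sorted_values) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

# 연번 runs from 1 to 390 with no gaps (Minimum 1, Maximum 390, 390 distinct values).
values = list(range(1, 391))

p5 = percentile(values, 5)
q1 = percentile(values, 25)
median = percentile(values, 50)
q3 = percentile(values, 75)
p95 = percentile(values, 95)
iqr = q3 - q1

print(p5, q1, median, q3, p95)
print("Range:", values[-1] - values[0], "IQR:", iqr)
```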

Descriptive statistics

Standard deviation: 112.72755
Coefficient of variation (CV): 0.5766115
Kurtosis: -1.2
Mean: 195.5
Median Absolute Deviation (MAD): 97.5
Skewness: 0
Sum: 76245
Variance: 12707.5
Monotonicity: Strictly increasing
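
Because 연번 is simply the sequence 1 to 390, most of these statistics can be recomputed with the Python standard library. This is a sketch that assumes the report uses the sample variance (n - 1 denominator) and defines MAD as the median of absolute deviations from the median; both assumptions are consistent with the reported values.

```python
import statistics

values = list(range(1, 391))  # the 연번 column: 1, 2, ..., 390

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)
cv = std_dev / mean                     # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)
total = sum(values)
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(mean, variance, round(std_dev, 5), round(cv, 7))
print(mad, total, strictly_increasing)
```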
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1 | 1 | 0.3%
246 | 1 | 0.3%
268 | 1 | 0.3%
267 | 1 | 0.3%
266 | 1 | 0.3%
265 | 1 | 0.3%
264 | 1 | 0.3%
263 | 1 | 0.3%
262 | 1 | 0.3%
261 | 1 | 0.3%
Other values (380) | 380 | 97.4%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 0.3%
2 | 1 | 0.3%
3 | 1 | 0.3%
4 | 1 | 0.3%
5 | 1 | 0.3%
6 | 1 | 0.3%
7 | 1 | 0.3%
8 | 1 | 0.3%
9 | 1 | 0.3%
10 | 1 | 0.3%

Maximum 10 values
Value | Count | Frequency (%)
390 | 1 | 0.3%
389 | 1 | 0.3%
388 | 1 | 0.3%
387 | 1 | 0.3%
386 | 1 | 0.3%
385 | 1 | 0.3%
384 | 1 | 0.3%
383 | 1 | 0.3%
382 | 1 | 0.3%
381 | 1 | 0.3%

시설종류 (facility type)
Categorical

HIGH CORRELATION 

Distinct: 24
Distinct (%): 6.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

업무시설: 93
숙박업소: 81
식품접객업소: 66
집단급식소: 45
학교: 17
Other values (19): 88

Length

Max length: 7
Median length: 4
Mean length: 4.3589744
Min length: 2

Unique

Unique: 5
Unique (%): 1.3%

Sample

1st row: 숙박업소
2nd row: 숙박업소
3rd row: 숙박업소
4th row: 숙박업소
5th row: 숙박업소

Common Values

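Value | Count | Frequency (%)
업무시설 (office facilities) | 93 | 23.8%
숙박업소 (lodging establishments) | 81 | 20.8%
식품접객업소 (food service establishments) | 66 | 16.9%
집단급식소 (group meal service facilities) | 45 | 11.5%
학교 (schools) | 17 | 4.4%
공동주택 (apartment housing) | 12 | 3.1%
어린이집 (daycare centers) | 10 | 2.6%
전통시장 (traditional markets) | 9 | 2.3%
종합병원 (general hospitals) | 7 | 1.8%
관광숙박업소 (tourist lodging establishments) | 7 | 1.8%
Other values (14) | 43 | 11.0%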

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
업무시설 | 93 | 23.7%
숙박업소 | 81 | 20.7%
식품접객업소 | 66 | 16.8%
집단급식소 | 45 | 11.5%
학교 | 17 | 4.3%
공동주택 | 12 | 3.1%
어린이집 | 10 | 2.6%
전통시장 | 9 | 2.3%
종합병원 | 7 | 1.8%
관광숙박업소 | 7 | 1.8%
Other values (15) | 45 | 11.5%

소재지 (location)
Text

Distinct: 346
Distinct (%): 88.7%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Length

Max length: 60
Median length: 44
Mean length: 24.930769
Min length: 7

Characters and Unicode

Total characters: 9723
Distinct characters: 151
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
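
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point (it does not expose script or block, which the report also tabulates). The sketch below counts categories in the first sample address; it is not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# First sample row of the 소재지 column, copied from the report.
address = "부산광역시 동구 중앙대로180번길 12-9 (초량동)"

# Two-letter general category codes: Lo = Other Letter, Zs = Space Separator,
# Nd = Decimal Number, Ps/Pe = Open/Close Punctuation, Pd = Dash Punctuation.
category_counts = Counter(unicodedata.category(ch) for ch in address)
print(category_counts.most_common())

# Two less obvious assignments that appear in the report's category tables.
tilde_category = unicodedata.category("~")   # Sm: Math Symbol
roman_category = unicodedata.category("Ⅱ")   # Nl: Letter Number
print(tilde_category, roman_category)
```

Hangul syllables fall under Other Letter, which is why that category dominates both text columns.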

Unique

Unique: 307
Unique (%): 78.7%

Sample

1st row: 부산광역시 동구 중앙대로180번길 12-9 (초량동)
2nd row: 부산광역시 동구 중앙대로260번길 3-9 (초량동)
3rd row: 부산광역시 동구 중앙대로248번길 3-9 (초량동)
4th row: 부산광역시 동구 중앙대로236번길 7-9 (초량동)
5th row: 부산광역시 동구 중앙대로286번길 3-2 (초량동,1162-10)

Value | Count | Frequency (%)
동구 | 392 | 20.0%
부산광역시 | 359 | 18.3%
초량동 | 166 | 8.5%
범일동 | 96 | 4.9%
중앙대로 | 79 | 4.0%
수정동 | 35 | 1.8%
조방로 | 21 | 1.1%
좌천동 | 18 | 0.9%
범일로 | 16 | 0.8%
중앙대로196번길 | 16 | 0.8%
Other values (390) | 759 | 38.8%
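
The table above counts word tokens rather than whole addresses, which is why 동구 can appear 392 times in 390 rows. The report does not state its tokenization rule; the sketch below assumes a simple one (split on whitespace, then strip leading and trailing ASCII punctuation) and applies it to the five sample rows only, so its counts are not comparable with the full table.

```python
import string
from collections import Counter

# The five sample rows of 소재지 shown above.
addresses = [
    "부산광역시 동구 중앙대로180번길 12-9 (초량동)",
    "부산광역시 동구 중앙대로260번길 3-9 (초량동)",
    "부산광역시 동구 중앙대로248번길 3-9 (초량동)",
    "부산광역시 동구 중앙대로236번길 7-9 (초량동)",
    "부산광역시 동구 중앙대로286번길 3-2 (초량동,1162-10)",
]

def tokenize(text):
    """Split on whitespace and strip surrounding ASCII punctuation (assumed rule)."""
    tokens = (raw.strip(string.punctuation) for raw in text.split())
    return [token for token in tokens if token]

word_counts = Counter(token for address in addresses for token in tokenize(address))
print(word_counts.most_common(3))
```

Under this rule the fifth row yields the token 초량동,1162-10 rather than 초량동, because only punctuation at the ends of a token is stripped.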

Most occurring characters

Value | Count | Frequency (%)
(space) | 1567 | 16.1%
[?] | 727 | 7.5%
[?] | 399 | 4.1%
[?] | 390 | 4.0%
[?] | 385 | 4.0%
[?] | 384 | 3.9%
[?] | 365 | 3.8%
[?] | 365 | 3.8%
[?] | 359 | 3.7%
1 | 342 | 3.5%
Other values (141) | 4440 | 45.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 5718 | 58.8%
Space Separator | 1567 | 16.1%
Decimal Number | 1565 | 16.1%
Open Punctuation | 334 | 3.4%
Close Punctuation | 334 | 3.4%
Other Punctuation | 105 | 1.1%
Dash Punctuation | 76 | 0.8%
Math Symbol | 12 | 0.1%
Uppercase Letter | 11 | 0.1%
Letter Number | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[?] | 727 | 12.7%
[?] | 399 | 7.0%
[?] | 390 | 6.8%
[?] | 385 | 6.7%
[?] | 384 | 6.7%
[?] | 365 | 6.4%
[?] | 365 | 6.4%
[?] | 359 | 6.3%
[?] | 200 | 3.5%
[?] | 200 | 3.5%
Other values (118) | 1944 | 34.0%

Decimal Number
Value | Count | Frequency (%)
1 | 342 | 21.9%
2 | 270 | 17.3%
3 | 191 | 12.2%
4 | 135 | 8.6%
0 | 132 | 8.4%
6 | 125 | 8.0%
9 | 114 | 7.3%
7 | 91 | 5.8%
5 | 86 | 5.5%
8 | 79 | 5.0%

Uppercase Letter
Value | Count | Frequency (%)
A | 4 | 36.4%
B | 3 | 27.3%
G | 3 | 27.3%
T | 1 | 9.1%

Other Punctuation
Value | Count | Frequency (%)
, | 102 | 97.1%
/ | 2 | 1.9%
. | 1 | 1.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1567 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 334 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 334 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 76 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 12 | 100.0%

Letter Number
Value | Count | Frequency (%)
[?] | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 5718 | 58.8%
Common | 3993 | 41.1%
Latin | 12 | 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[?] | 727 | 12.7%
[?] | 399 | 7.0%
[?] | 390 | 6.8%
[?] | 385 | 6.7%
[?] | 384 | 6.7%
[?] | 365 | 6.4%
[?] | 365 | 6.4%
[?] | 359 | 6.3%
[?] | 200 | 3.5%
[?] | 200 | 3.5%
Other values (118) | 1944 | 34.0%

Common
Value | Count | Frequency (%)
(space) | 1567 | 39.2%
1 | 342 | 8.6%
( | 334 | 8.4%
) | 334 | 8.4%
2 | 270 | 6.8%
3 | 191 | 4.8%
4 | 135 | 3.4%
0 | 132 | 3.3%
6 | 125 | 3.1%
9 | 114 | 2.9%
Other values (8) | 449 | 11.2%

Latin
Value | Count | Frequency (%)
A | 4 | 33.3%
B | 3 | 25.0%
G | 3 | 25.0%
T | 1 | 8.3%
[?] | 1 | 8.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 5718 | 58.8%
ASCII | 4004 | 41.2%
Number Forms | 1 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 1567 | 39.1%
1 | 342 | 8.5%
( | 334 | 8.3%
) | 334 | 8.3%
2 | 270 | 6.7%
3 | 191 | 4.8%
4 | 135 | 3.4%
0 | 132 | 3.3%
6 | 125 | 3.1%
9 | 114 | 2.8%
Other values (12) | 460 | 11.5%

Hangul
Value | Count | Frequency (%)
[?] | 727 | 12.7%
[?] | 399 | 7.0%
[?] | 390 | 6.8%
[?] | 385 | 6.7%
[?] | 384 | 6.7%
[?] | 365 | 6.4%
[?] | 365 | 6.4%
[?] | 359 | 6.3%
[?] | 200 | 3.5%
[?] | 200 | 3.5%
Other values (118) | 1944 | 34.0%

Number Forms
Value | Count | Frequency (%)
[?] | 1 | 100.0%

시설명 (facility name)
Text

MISSING 

Distinct: 309
Distinct (%): 90.1%
Missing: 47
Missing (%): 12.1%
Memory size: 3.2 KiB

Length

Max length: 26
Median length: 20
Mean length: 8.1253644
Min length: 2

Characters and Unicode

Total characters: 2787
Distinct characters: 369
Distinct categories: 13
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 277
Unique (%): 80.8%
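
Note that the denominators differ within this section. The missing percentage is taken over all 390 rows, while the distinct and unique percentages and the mean length only reproduce if they are taken over the 343 non-missing names. The arithmetic below is a sketch that checks this reading against the reported figures.

```python
# Figures copied from the 시설명 section of the report.
n_rows = 390
n_missing = 47
distinct = 309
unique = 277
total_chars = 2787

n_present = n_rows - n_missing  # rows that actually have a facility name

missing_pct = 100 * n_missing / n_rows
distinct_pct = 100 * distinct / n_present
unique_pct = 100 * unique / n_present
mean_length = total_chars / n_present

print(f"Missing: {missing_pct:.1f}%")
print(f"Distinct: {distinct_pct:.1f}%  Unique: {unique_pct:.1f}%")
print(f"Mean length: {mean_length:.7f}")
```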

Sample

1st row: 부산역비즈니스호텔
2nd row: 장안여관
3rd row: 세종장여관
4th row: 부산역센텀호텔
5th row: 호텔그레이

Value | Count | Frequency (%)
1호선 | 5 | 1.1%
주식회사 | 5 | 1.1%
부산역 | 4 | 0.9%
일신기독병원 | 3 | 0.7%
김원묵기념봉생병원 | 3 | 0.7%
부산역점 | 3 | 0.7%
호텔 | 3 | 0.7%
의료법인정화의료재단 | 3 | 0.7%
좋은문화병원 | 3 | 0.7%
[?] | 3 | 0.7%
Other values (367) | 414 | 92.2%

Most occurring characters

Value | Count | Frequency (%)
(space) | 106 | 3.8%
[?] | 79 | 2.8%
[?] | 76 | 2.7%
[?] | 71 | 2.5%
[?] | 59 | 2.1%
[?] | 55 | 2.0%
[?] | 53 | 1.9%
[?] | 50 | 1.8%
) | 48 | 1.7%
[?] | 47 | 1.7%
Other values (359) | 2143 | 76.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2453 | 88.0%
Space Separator | 106 | 3.8%
Uppercase Letter | 59 | 2.1%
Close Punctuation | 48 | 1.7%
Open Punctuation | 47 | 1.7%
Decimal Number | 37 | 1.3%
Lowercase Letter | 22 | 0.8%
Other Punctuation | 5 | 0.2%
Math Symbol | 3 | 0.1%
Other Symbol | 2 | 0.1%
Other values (3) | 5 | 0.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[?] | 79 | 3.2%
[?] | 76 | 3.1%
[?] | 71 | 2.9%
[?] | 59 | 2.4%
[?] | 55 | 2.2%
[?] | 53 | 2.2%
[?] | 50 | 2.0%
[?] | 47 | 1.9%
[?] | 39 | 1.6%
[?] | 36 | 1.5%
Other values (310) | 1888 | 77.0%

Uppercase Letter
Value | Count | Frequency (%)
T | 9 | 15.3%
C | 7 | 11.9%
E | 5 | 8.5%
S | 5 | 8.5%
M | 4 | 6.8%
O | 4 | 6.8%
H | 3 | 5.1%
G | 3 | 5.1%
K | 3 | 5.1%
J | 3 | 5.1%
Other values (7) | 13 | 22.0%

Lowercase Letter
Value | Count | Frequency (%)
e | 5 | 22.7%
o | 4 | 18.2%
v | 2 | 9.1%
i | 2 | 9.1%
a | 2 | 9.1%
f | 2 | 9.1%
l | 2 | 9.1%
c | 1 | 4.5%
b | 1 | 4.5%
w | 1 | 4.5%

Decimal Number
Value | Count | Frequency (%)
1 | 13 | 35.1%
0 | 6 | 16.2%
8 | 5 | 13.5%
7 | 4 | 10.8%
3 | 2 | 5.4%
4 | 2 | 5.4%
6 | 2 | 5.4%
2 | 2 | 5.4%
9 | 1 | 2.7%

Other Punctuation
Value | Count | Frequency (%)
, | 3 | 60.0%
. | 1 | 20.0%
& | 1 | 20.0%

Math Symbol
Value | Count | Frequency (%)
~ | 2 | 66.7%
> | 1 | 33.3%

Letter Number
Value | Count | Frequency (%)
[?] | 1 | 50.0%
[?] | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 106 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 48 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 47 | 100.0%

Other Symbol
Value | Count | Frequency (%)
[?] | 2 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2455 | 88.1%
Common | 249 | 8.9%
Latin | 83 | 3.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[?] | 79 | 3.2%
[?] | 76 | 3.1%
[?] | 71 | 2.9%
[?] | 59 | 2.4%
[?] | 55 | 2.2%
[?] | 53 | 2.2%
[?] | 50 | 2.0%
[?] | 47 | 1.9%
[?] | 39 | 1.6%
[?] | 36 | 1.5%
Other values (311) | 1890 | 77.0%

Latin
Value | Count | Frequency (%)
T | 9 | 10.8%
C | 7 | 8.4%
E | 5 | 6.0%
e | 5 | 6.0%
S | 5 | 6.0%
o | 4 | 4.8%
M | 4 | 4.8%
O | 4 | 4.8%
H | 3 | 3.6%
G | 3 | 3.6%
Other values (19) | 34 | 41.0%

Common
Value | Count | Frequency (%)
(space) | 106 | 42.6%
) | 48 | 19.3%
( | 47 | 18.9%
1 | 13 | 5.2%
0 | 6 | 2.4%
8 | 5 | 2.0%
7 | 4 | 1.6%
, | 3 | 1.2%
3 | 2 | 0.8%
4 | 2 | 0.8%
Other values (9) | 13 | 5.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2453 | 88.0%
ASCII | 330 | 11.8%
None | 2 | 0.1%
Number Forms | 2 | 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 106 | 32.1%
) | 48 | 14.5%
( | 47 | 14.2%
1 | 13 | 3.9%
T | 9 | 2.7%
C | 7 | 2.1%
0 | 6 | 1.8%
8 | 5 | 1.5%
E | 5 | 1.5%
e | 5 | 1.5%
Other values (36) | 79 | 23.9%

Hangul
Value | Count | Frequency (%)
[?] | 79 | 3.2%
[?] | 76 | 3.1%
[?] | 71 | 2.9%
[?] | 59 | 2.4%
[?] | 55 | 2.2%
[?] | 53 | 2.2%
[?] | 50 | 2.0%
[?] | 47 | 1.9%
[?] | 39 | 1.6%
[?] | 36 | 1.5%
Other values (310) | 1888 | 77.0%

None
Value | Count | Frequency (%)
[?] | 2 | 100.0%

Number Forms
Value | Count | Frequency (%)
[?] | 1 | 50.0%
[?] | 1 | 50.0%

Interactions


Correlations

Variable | 연번 | 시설종류
연번 | 1.000 | 0.947
시설종류 | 0.947 | 1.000

Variable | 연번 | 시설종류
연번 | 1.000 | 0.730
시설종류 | 0.730 | 1.000
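
A likely reason for the strong association is visible in the sample rows: the first ten records are all 숙박업소 and the last ten are all 공동주택, which suggests the file is ordered by facility type, so the running number 연번 nearly determines the category. The sketch below illustrates that effect on synthetic data with the correlation ratio (eta). That measure is chosen here for simplicity and is not necessarily the one the report computed; the labels "A", "B", "C" and the 30-row size are hypothetical.

```python
from collections import defaultdict

def correlation_ratio(categories, values):
    """Correlation ratio (eta): square root of the share of the variance in
    `values` that is explained by the group means of `categories`."""
    groups = defaultdict(list)
    for category, value in zip(categories, values):
        groups[category].append(value)
    overall_mean = sum(values) / len(values)
    between = sum(
        len(group) * (sum(group) / len(group) - overall_mean) ** 2
        for group in groups.values()
    )
    total = sum((value - overall_mean) ** 2 for value in values)
    return (between / total) ** 0.5

# Synthetic stand-in for a running index over 30 rows.
index = list(range(1, 31))
grouped = ["A"] * 10 + ["B"] * 10 + ["C"] * 10   # rows sorted by category
interleaved = ["A", "B", "C"] * 10               # categories mixed through the file

eta_grouped = correlation_ratio(grouped, index)
eta_interleaved = correlation_ratio(interleaved, index)
print(round(eta_grouped, 3), round(eta_interleaved, 3))
```

When the rows are grouped, the index is strongly associated with the category; when the same categories are interleaved, the association almost vanishes. The alert therefore more plausibly reflects file ordering than a meaningful relationship between the serial number and the facility type.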

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
Index | 연번 | 시설종류 | 소재지 | 시설명
0 | 1 | 숙박업소 | 부산광역시 동구 중앙대로180번길 12-9 (초량동) | 부산역비즈니스호텔
1 | 2 | 숙박업소 | 부산광역시 동구 중앙대로260번길 3-9 (초량동) | 장안여관
2 | 3 | 숙박업소 | 부산광역시 동구 중앙대로248번길 3-9 (초량동) | 세종장여관
3 | 4 | 숙박업소 | 부산광역시 동구 중앙대로236번길 7-9 (초량동) | 부산역센텀호텔
4 | 5 | 숙박업소 | 부산광역시 동구 중앙대로286번길 3-2 (초량동,1162-10) | 호텔그레이
5 | 6 | 숙박업소 | 부산광역시 동구 초량중로 21 (초량동) | 태림하우스
6 | 7 | 숙박업소 | 부산광역시 동구 중앙대로195번길 30 (초량동) | 동일장모텔
7 | 8 | 숙박업소 | 부산광역시 동구 조방로 35 (범일동) | 미진장여관
8 | 9 | 숙박업소 | 부산광역시 동구 범일로102번길 16-6 (범일동) | 챌린지 호텔
9 | 10 | 숙박업소 | 부산광역시 동구 중앙대로195번가길 12 (초량동) | 테레목 호스텔(TEREMOK)

Last rows
Index | 연번 | 시설종류 | 소재지 | 시설명
380 | 381 | 공동주택 | 동구 중앙대로 514 | 한성기린아파트
381 | 382 | 공동주택 | 동구 성남일로 5 | 두산위브범일뉴타운
382 | 383 | 공동주택 | 동구 자성로116번길 2 | 두산위브포세이돈Ⅱ
383 | 384 | 공동주택 | 동구 자성로133번길 6 | 진흥마제스타워범일
384 | 385 | 공동주택 | 동구 범일로 41 | 오션브릿지
385 | 386 | 공동주택 | 동구 홍곡로 50 | e편한세상 부산항
386 | 387 | 공동주택 | 동구 중앙대로 357 | 수정협성휴포레 부산진역더뷰아파트
387 | 388 | 공동주택 | 동구 초량중로 114 | 범양레우스
388 | 389 | 공동주택 | 동구 홍곡로 37 | 초량 베스티움 센트럴베이
389 | 390 | 공동주택 | 동구 자성로6 | 부산항 일동미라주 더 오션 아파트