Overview

Dataset statistics

Number of variables: 4
Number of observations: 669
Missing cells: 40
Missing cells (%): 1.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 21.7 KiB
Average record size in memory: 33.2 B
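
The percentage and per-record figures above follow from the raw counts. A minimal sketch of that arithmetic, assuming the missing-cell share is taken over all cells (rows × columns) and that 1 KiB is 1024 bytes; the variable names are illustrative and not part of the report:

```python
# Counts copied from the dataset statistics above.
n_variables = 4
n_observations = 669
missing_cells = 40
total_size_kib = 21.7

# Assumption: the missing-cell share is taken over all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes; average record size = total size / rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions both values round to the figures the report shows (1.5% and 33.2 B).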

Variable types

Numeric: 1
Categorical: 1
Text: 2

Dataset

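Description: A list of the facilities in Saha-gu, Busan Metropolitan City that are subject to mandatory disinfection and must be disinfected periodically under the Infectious Disease Control and Prevention Act. It includes facility names and road-name addresses.
Author: Saha-gu, Busan Metropolitan City (부산광역시 사하구)
URL: https://www.data.go.kr/data/15127235/fileData.do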

Alerts

연번 (serial number) is highly overall correlated with 시설의 종류 (facility type): High correlation
시설의 종류 is highly overall correlated with 연번: High correlation
시설명(업체명) (facility name or business name) has 40 (6.0%) missing values: Missing
연번 has unique values: Unique

Reproduction

Analysis started: 2024-03-30 08:24:35.667314
Analysis finished: 2024-03-30 08:24:37.437884
Duration: 1.77 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
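
A report like this one can be regenerated from the source CSV with ydata-profiling. The sketch below is not the author's script: `build_report` is a hypothetical helper, the CP949 encoding is an assumption about the downloaded file, and the file name in the usage note is a placeholder.

```python
from pathlib import Path


def build_report(csv_path: str, out_path: str = "report.html",
                 encoding: str = "cp949") -> Path:
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so this module loads even where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the file is CP949-encoded; pass another encoding if it is not.
    df = pd.read_csv(csv_path, encoding=encoding)
    profile = ProfileReport(df, title="Profiling report")
    profile.to_file(out_path)
    return Path(out_path)
```

Calling `build_report("facilities.csv")` with the downloaded file (the name here is a placeholder) would write `report.html` next to the script. Timings and memory figures will differ from run to run.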

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 669
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 335
Minimum: 1
Maximum: 669
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 34.4
Q1: 168
Median: 335
Q3: 502
95th percentile: 635.6
Maximum: 669
Range: 668
Interquartile range (IQR): 334

Descriptive statistics

Standard deviation: 193.26795
Coefficient of variation (CV): 0.57691925
Kurtosis: -1.2
Mean: 335
Median Absolute Deviation (MAD): 167
Skewness: 0
Sum: 224115
Variance: 37352.5
Monotonicity: Strictly increasing
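
These statistics are what one would expect from a plain row counter. A minimal sketch, assuming 연번 is the integer sequence 1 to 669 (consistent with the reported minimum, maximum, distinct count and sum) and that percentiles use linear interpolation; `percentile` is an illustrative helper, not the profiler's own code:

```python
import math
import statistics

# Assumption: 연번 is simply 1, 2, ..., 669 with no gaps.
values = [float(i) for i in range(1, 670)]


def percentile(sorted_values, p):
    """Linearly interpolated percentile for p in [0, 1]."""
    pos = p * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = math.sqrt(variance)
cv = std / mean
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

print(sum(values), mean, variance, round(std, 5), round(cv, 8), mad)
print(percentile(values, 0.05), percentile(values, 0.25), percentile(values, 0.95))
```

Under these assumptions the sum, mean, variance, standard deviation, CV, MAD and percentiles agree with the figures listed above.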
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | 0.1%
441 | 1 | 0.1%
443 | 1 | 0.1%
444 | 1 | 0.1%
445 | 1 | 0.1%
446 | 1 | 0.1%
447 | 1 | 0.1%
448 | 1 | 0.1%
449 | 1 | 0.1%
450 | 1 | 0.1%
Other values (659) | 659 | 98.5%

Value | Count | Frequency (%)
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 1 | 0.1%
7 | 1 | 0.1%
8 | 1 | 0.1%
9 | 1 | 0.1%
10 | 1 | 0.1%

Value | Count | Frequency (%)
669 | 1 | 0.1%
668 | 1 | 0.1%
667 | 1 | 0.1%
666 | 1 | 0.1%
665 | 1 | 0.1%
664 | 1 | 0.1%
663 | 1 | 0.1%
662 | 1 | 0.1%
661 | 1 | 0.1%
660 | 1 | 0.1%

시설의 종류
Categorical

HIGH CORRELATION 

Distinct: 31
Distinct (%): 4.6%
Missing: 0
Missing (%): 0.0%
Memory size: 5.4 KiB

집단급식소: 141
복합용도건축물: 78
공동주택: 66
일반음식점: 59
어린이집: 54
Other values (26): 271

Length

Max length: 10
Median length: 7
Mean length: 4.9282511
Min length: 2

Unique

Unique: 7
Unique (%): 1.0%

Sample

1st row: 숙박업(일반)
2nd row: 숙박업(일반)
3rd row: 숙박업(일반)
4th row: 숙박업(일반)
5th row: 숙박업(일반)

Common Values

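Value | Count | Frequency (%)
집단급식소 (group meal service facility) | 141 | 21.1%
복합용도건축물 (mixed-use building) | 78 | 11.7%
공동주택 (multi-unit housing) | 66 | 9.9%
일반음식점 (general restaurant) | 59 | 8.8%
어린이집 (daycare center) | 54 | 8.1%
숙박업(일반) (lodging, general) | 36 | 5.4%
휴게음식점 (café or snack bar) | 27 | 4.0%
초등학교 (elementary school) | 26 | 3.9%
사무실용건축물 (office building) | 24 | 3.6%
유치원 (kindergarten) | 22 | 3.3%
Other values (21) | 136 | 20.3%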

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
집단급식소 | 141 | 20.7%
복합용도건축물 | 78 | 11.5%
공동주택 | 66 | 9.7%
일반음식점 | 59 | 8.7%
어린이집 | 54 | 7.9%
숙박업(일반 | 36 | 5.3%
휴게음식점 | 27 | 4.0%
초등학교 | 26 | 3.8%
사무실용건축물 | 24 | 3.5%
유치원 | 22 | 3.2%
Other values (23) | 148 | 21.7%

시설명(업체명)
Text

MISSING 

Distinct: 536
Distinct (%): 85.2%
Missing: 40
Missing (%): 6.0%
Memory size: 5.4 KiB

Length

Max length: 28
Median length: 22
Mean length: 7.7583466
Min length: 2

Characters and Unicode

Total characters: 4880
Distinct characters: 438
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
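
As an illustration of how such character properties can be read, the sketch below counts Unicode general categories for one of the sample facility names using Python's standard `unicodedata` module. The `CATEGORY_NAMES` mapping and the `category_counts` helper are illustrative, not the profiler's implementation; scripts and blocks are not exposed by `unicodedata` and are left out.

```python
import unicodedata
from collections import Counter

# Readable names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "So": "Other Symbol",
    "Sm": "Math Symbol",
}


def category_counts(text: str) -> Counter:
    """Count characters in `text` by Unicode general category."""
    codes = (unicodedata.category(ch) for ch in text)
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)


# One of the sample values from this variable.
counts = category_counts("넘버25 호텔 하단점")
print(counts)
```

Hangul syllables fall under "Other Letter", which is why that category dominates the tables for both text variables.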

Unique

Unique: 445
Unique (%): 70.7%
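
The distinct and unique percentages for this variable appear to use the non-missing rows as the denominator rather than all 669 rows. A short check of that reading, using only counts from this section (the variable names are illustrative):

```python
n_rows = 669
n_missing = 40
n_distinct = 536
n_unique = 445

# Assumption: distinct/unique shares are taken over non-missing values only.
n_present = n_rows - n_missing
distinct_pct = 100 * n_distinct / n_present
unique_pct = 100 * n_unique / n_present

# The missing share, by contrast, is taken over all rows.
missing_pct = 100 * n_missing / n_rows

print(f"{distinct_pct:.1f}% distinct, {unique_pct:.1f}% unique, {missing_pct:.1f}% missing")
```

With 629 non-missing values, the results round to the 85.2%, 70.7% and 6.0% shown in the report.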

Sample

1st row: 로망스장 여관
2nd row: 넘버25 호텔 하단점
3rd row: 파라다이스모텔
4th row: 해수장여관
5th row: 굿타임별관

Value | Count | Frequency (%)
의료법인 | 8 | 1.0%
스타벅스 | 6 | 0.8%
투썸플레이스 | 5 | 0.6%
동아대학교 | 4 | 0.5%
호텔 | 4 | 0.5%
모텔 | 4 | 0.5%
주)풀무원푸드앤컬처 | 3 | 0.4%
강동병원 | 3 | 0.4%
사하점 | 3 | 0.4%
부산 | 3 | 0.4%
Other values (634) | 749 | 94.6%

Most occurring characters

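Value | Count | Frequency (%)
(space) | 165 | 3.4%
(unrecovered) | 139 | 2.8%
(unrecovered) | 131 | 2.7%
(unrecovered) | 128 | 2.6%
(unrecovered) | 106 | 2.2%
(unrecovered) | 83 | 1.7%
(unrecovered) | 82 | 1.7%
(unrecovered) | 80 | 1.6%
(unrecovered) | 76 | 1.6%
(unrecovered) | 76 | 1.6%
Other values (428) | 3814 | 78.2%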

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4420 | 90.6%
Space Separator | 165 | 3.4%
Uppercase Letter | 80 | 1.6%
Open Punctuation | 67 | 1.4%
Close Punctuation | 67 | 1.4%
Decimal Number | 54 | 1.1%
Other Symbol | 11 | 0.2%
Lowercase Letter | 9 | 0.2%
Other Punctuation | 6 | 0.1%
Dash Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(unrecovered) | 139 | 3.1%
(unrecovered) | 131 | 3.0%
(unrecovered) | 128 | 2.9%
(unrecovered) | 106 | 2.4%
(unrecovered) | 83 | 1.9%
(unrecovered) | 82 | 1.9%
(unrecovered) | 80 | 1.8%
(unrecovered) | 76 | 1.7%
(unrecovered) | 76 | 1.7%
(unrecovered) | 74 | 1.7%
Other values (382) | 3445 | 77.9%

Uppercase Letter
Value | Count | Frequency (%)
T | 10 | 12.5%
D | 8 | 10.0%
E | 7 | 8.8%
W | 6 | 7.5%
O | 5 | 6.2%
A | 5 | 6.2%
C | 5 | 6.2%
I | 4 | 5.0%
N | 4 | 5.0%
L | 4 | 5.0%
Other values (11) | 22 | 27.5%

Decimal Number
Value | Count | Frequency (%)
2 | 14 | 25.9%
1 | 10 | 18.5%
3 | 7 | 13.0%
6 | 6 | 11.1%
5 | 6 | 11.1%
0 | 4 | 7.4%
4 | 3 | 5.6%
7 | 2 | 3.7%
8 | 1 | 1.9%
9 | 1 | 1.9%

Lowercase Letter
Value | Count | Frequency (%)
e | 3 | 33.3%
m | 2 | 22.2%
n | 1 | 11.1%
t | 1 | 11.1%
o | 1 | 11.1%
h | 1 | 11.1%

Other Punctuation
Value | Count | Frequency (%)
. | 3 | 50.0%
, | 1 | 16.7%
: | 1 | 16.7%
& | 1 | 16.7%

Space Separator
Value | Count | Frequency (%)
(space) | 165 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 67 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 67 | 100.0%

Other Symbol
Value | Count | Frequency (%)
(unrecovered) | 11 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4431 | 90.8%
Common | 360 | 7.4%
Latin | 89 | 1.8%

Most frequent character per script

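Hangul
Value | Count | Frequency (%)
(unrecovered) | 139 | 3.1%
(unrecovered) | 131 | 3.0%
(unrecovered) | 128 | 2.9%
(unrecovered) | 106 | 2.4%
(unrecovered) | 83 | 1.9%
(unrecovered) | 82 | 1.9%
(unrecovered) | 80 | 1.8%
(unrecovered) | 76 | 1.7%
(unrecovered) | 76 | 1.7%
(unrecovered) | 74 | 1.7%
Other values (383) | 3456 | 78.0%

Latin
Value | Count | Frequency (%)
T | 10 | 11.2%
D | 8 | 9.0%
E | 7 | 7.9%
W | 6 | 6.7%
O | 5 | 5.6%
A | 5 | 5.6%
C | 5 | 5.6%
I | 4 | 4.5%
N | 4 | 4.5%
L | 4 | 4.5%
Other values (17) | 31 | 34.8%

Common
Value | Count | Frequency (%)
(space) | 165 | 45.8%
( | 67 | 18.6%
) | 67 | 18.6%
2 | 14 | 3.9%
1 | 10 | 2.8%
3 | 7 | 1.9%
6 | 6 | 1.7%
5 | 6 | 1.7%
0 | 4 | 1.1%
4 | 3 | 0.8%
Other values (8) | 11 | 3.1%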

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4420 | 90.6%
ASCII | 449 | 9.2%
None | 11 | 0.2%

Most frequent character per block

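ASCII
Value | Count | Frequency (%)
(space) | 165 | 36.7%
( | 67 | 14.9%
) | 67 | 14.9%
2 | 14 | 3.1%
1 | 10 | 2.2%
T | 10 | 2.2%
D | 8 | 1.8%
E | 7 | 1.6%
3 | 7 | 1.6%
W | 6 | 1.3%
Other values (35) | 88 | 19.6%

Hangul
Value | Count | Frequency (%)
(unrecovered) | 139 | 3.1%
(unrecovered) | 131 | 3.0%
(unrecovered) | 128 | 2.9%
(unrecovered) | 106 | 2.4%
(unrecovered) | 83 | 1.9%
(unrecovered) | 82 | 1.9%
(unrecovered) | 80 | 1.8%
(unrecovered) | 76 | 1.7%
(unrecovered) | 76 | 1.7%
(unrecovered) | 74 | 1.7%
Other values (382) | 3445 | 77.9%

None
Value | Count | Frequency (%)
(unrecovered) | 11 | 100.0%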

소재지(도로명 주소)
Text

Distinct: 636
Distinct (%): 95.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.4 KiB

Length

Max length: 57
Median length: 40
Mean length: 27.266069
Min length: 15

Characters and Unicode

Total characters: 18241
Distinct characters: 250
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 605
Unique (%): 90.4%

Sample

1st row: 부산광역시 사하구 사하로 201-5 (괴정동)
2nd row: 부산광역시 사하구 하신번영로300번길 113 (하단동)
3rd row: 부산광역시 사하구 원양로 395 (감천동)
4th row: 부산광역시 사하구 원양로 384 (감천동)
5th row: 부산광역시 사하구 하신번영로300번길 100-6, B동 (하단동)

Value | Count | Frequency (%)
부산광역시 | 669 | 19.7%
사하구 | 667 | 19.6%
다대동 | 104 | 3.1%
하단동 | 96 | 2.8%
괴정동 | 69 | 2.0%
장림동 | 68 | 2.0%
다대로 | 66 | 1.9%
신평동 | 60 | 1.8%
감천동 | 40 | 1.2%
당리동 | 34 | 1.0%
Other values (774) | 1525 | 44.9%

Most occurring characters

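Value | Count | Frequency (%)
(space) | 2781 | 15.2%
(unrecovered) | 897 | 4.9%
(unrecovered) | 735 | 4.0%
(unrecovered) | 714 | 3.9%
(unrecovered) | 706 | 3.9%
(unrecovered) | 700 | 3.8%
(unrecovered) | 689 | 3.8%
(unrecovered) | 673 | 3.7%
(unrecovered) | 673 | 3.7%
(unrecovered) | 671 | 3.7%
Other values (240) | 9002 | 49.4%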

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 11386 | 62.4%
Space Separator | 2781 | 15.2%
Decimal Number | 2582 | 14.2%
Open Punctuation | 549 | 3.0%
Close Punctuation | 549 | 3.0%
Other Punctuation | 310 | 1.7%
Dash Punctuation | 56 | 0.3%
Uppercase Letter | 17 | 0.1%
Lowercase Letter | 8 | < 0.1%
Math Symbol | 3 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(unrecovered) | 897 | 7.9%
(unrecovered) | 735 | 6.5%
(unrecovered) | 714 | 6.3%
(unrecovered) | 706 | 6.2%
(unrecovered) | 700 | 6.1%
(unrecovered) | 689 | 6.1%
(unrecovered) | 673 | 5.9%
(unrecovered) | 673 | 5.9%
(unrecovered) | 671 | 5.9%
(unrecovered) | 644 | 5.7%
Other values (207) | 4284 | 37.6%

Decimal Number
Value | Count | Frequency (%)
1 | 513 | 19.9%
2 | 382 | 14.8%
3 | 300 | 11.6%
0 | 240 | 9.3%
4 | 238 | 9.2%
5 | 238 | 9.2%
6 | 201 | 7.8%
7 | 195 | 7.6%
9 | 148 | 5.7%
8 | 127 | 4.9%

Uppercase Letter
Value | Count | Frequency (%)
B | 3 | 17.6%
S | 3 | 17.6%
K | 2 | 11.8%
Y | 2 | 11.8%
W | 2 | 11.8%
A | 2 | 11.8%
H | 1 | 5.9%
L | 1 | 5.9%
C | 1 | 5.9%

Other Punctuation
Value | Count | Frequency (%)
, | 303 | 97.7%
. | 5 | 1.6%
* | 1 | 0.3%
/ | 1 | 0.3%

Lowercase Letter
Value | Count | Frequency (%)
e | 6 | 75.0%
l | 1 | 12.5%
t | 1 | 12.5%

Open Punctuation
Value | Count | Frequency (%)
( | 548 | 99.8%
[ | 1 | 0.2%

Close Punctuation
Value | Count | Frequency (%)
) | 548 | 99.8%
] | 1 | 0.2%

Space Separator
Value | Count | Frequency (%)
(space) | 2781 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 56 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 11386 | 62.4%
Common | 6830 | 37.4%
Latin | 25 | 0.1%

Most frequent character per script

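Hangul
Value | Count | Frequency (%)
(unrecovered) | 897 | 7.9%
(unrecovered) | 735 | 6.5%
(unrecovered) | 714 | 6.3%
(unrecovered) | 706 | 6.2%
(unrecovered) | 700 | 6.1%
(unrecovered) | 689 | 6.1%
(unrecovered) | 673 | 5.9%
(unrecovered) | 673 | 5.9%
(unrecovered) | 671 | 5.9%
(unrecovered) | 644 | 5.7%
Other values (207) | 4284 | 37.6%

Common
Value | Count | Frequency (%)
(space) | 2781 | 40.7%
( | 548 | 8.0%
) | 548 | 8.0%
1 | 513 | 7.5%
2 | 382 | 5.6%
, | 303 | 4.4%
3 | 300 | 4.4%
0 | 240 | 3.5%
4 | 238 | 3.5%
5 | 238 | 3.5%
Other values (11) | 739 | 10.8%

Latin
Value | Count | Frequency (%)
e | 6 | 24.0%
B | 3 | 12.0%
S | 3 | 12.0%
K | 2 | 8.0%
Y | 2 | 8.0%
W | 2 | 8.0%
A | 2 | 8.0%
H | 1 | 4.0%
L | 1 | 4.0%
C | 1 | 4.0%
Other values (2) | 2 | 8.0%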

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 11386 | 62.4%
ASCII | 6855 | 37.6%

Most frequent character per block

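ASCII
Value | Count | Frequency (%)
(space) | 2781 | 40.6%
( | 548 | 8.0%
) | 548 | 8.0%
1 | 513 | 7.5%
2 | 382 | 5.6%
, | 303 | 4.4%
3 | 300 | 4.4%
0 | 240 | 3.5%
4 | 238 | 3.5%
5 | 238 | 3.5%
Other values (23) | 764 | 11.1%

Hangul
Value | Count | Frequency (%)
(unrecovered) | 897 | 7.9%
(unrecovered) | 735 | 6.5%
(unrecovered) | 714 | 6.3%
(unrecovered) | 706 | 6.2%
(unrecovered) | 700 | 6.1%
(unrecovered) | 689 | 6.1%
(unrecovered) | 673 | 5.9%
(unrecovered) | 673 | 5.9%
(unrecovered) | 671 | 5.9%
(unrecovered) | 644 | 5.7%
Other values (207) | 4284 | 37.6%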

Interactions


Correlations

Variable | 연번 | 시설의 종류
연번 | 1.000 | 0.981
시설의 종류 | 0.981 | 1.000

Variable | 연번 | 시설의 종류
연번 | 1.000 | 0.850
시설의 종류 | 0.850 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

(index) | 연번 | 시설의 종류 | 시설명(업체명) | 소재지(도로명 주소)
0 | 1 | 숙박업(일반) | 로망스장 여관 | 부산광역시 사하구 사하로 201-5 (괴정동)
1 | 2 | 숙박업(일반) | 넘버25 호텔 하단점 | 부산광역시 사하구 하신번영로300번길 113 (하단동)
2 | 3 | 숙박업(일반) | 파라다이스모텔 | 부산광역시 사하구 원양로 395 (감천동)
3 | 4 | 숙박업(일반) | 해수장여관 | 부산광역시 사하구 원양로 384 (감천동)
4 | 5 | 숙박업(일반) | 굿타임별관 | 부산광역시 사하구 하신번영로300번길 100-6, B동 (하단동)
5 | 6 | 숙박업(일반) | 코자837모텔 | 부산광역시 사하구 다대로 567-5 (다대동)
6 | 7 | 숙박업(일반) | 프린스여관 | 부산광역시 사하구 낙동대로233번길 12 (괴정동)
7 | 8 | 숙박업(일반) | 세방장여관 | 부산광역시 사하구 낙동대로451번안길 2 (하단동)
8 | 9 | 숙박업(일반) | 퀸모텔 | 부산광역시 사하구 하신번영로300번길 100-12 (하단동)
9 | 10 | 숙박업(일반) | 늘봄모텔 | 부산광역시 사하구 낙동대로224번길 7 (괴정동)

(index) | 연번 | 시설의 종류 | 시설명(업체명) | 소재지(도로명 주소)
659 | 660 | 공동주택 | 구평 대림e편한 1차 | 부산광역시 사하구 서포로30번길 26(구평동, 대림e편한 1차)
660 | 661 | 공동주택 | 사하뷰웰 | 부산광역시 사하구 신산북로43번길 59(신평동, LH천년나무)
661 | 662 | 공동주택 | 구평 e편한세상 사하2차 | 부산광역시 사하구 서포로30번길 12(구평동, e편한세상사하2차)
662 | 663 | 공동주택 | 장림역 스마트더블유아파트 | 부산광역시 사하구 장림번영로 104(장림동, 장림역스마트더블유)
663 | 664 | 공동주택 | 사하역 비스타동원 | 부산광역시 사하구 낙동대로 280(괴정동, 사하역비스타동원)
664 | 665 | 공동주택 | 부산사하중흥S클래스 | 부산광역시 사하구 서포로30번길 42(구평동, 부산사하중흥S-클래스)
665 | 666 | 공동주택 | 코오롱하늘채 | 부산광역시 사하구 신산북로41(신평동, 코오롱하늘채)
666 | 667 | 공동주택 | 한신더휴포레스티지 | 부산광역시 사하구 사리로 106 (괴정동, 한신더휴포레스티지)
667 | 668 | 공동주택 | 하단 롯데캐슬 | 부산광역시 사하구 낙동대로 413 (하단동, 하단 롯데캐슬)
668 | 669 | 공동주택 | 힐스테이스 사하역 | 부산광역시 사하구 괴정로 166 (괴정동, 힐스테이트 사하역)