Overview

Dataset statistics

Number of variables: 8
Number of observations: 110
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.1 KiB
Average record size in memory: 66.2 B

Variable types

Numeric: 1
Categorical: 6
Text: 1

Dataset

Description: Smoking-facility data for Dongdaemun-gu, Seoul, covering facility type, installation location, size, installation year, installing entity, and operating/managing entity.
URL: https://www.data.go.kr/data/15070168/fileData.do

Alerts

데이터기준일자 has constant value "2021-10-27" (Constant)
운영관리 is highly overall correlated with 설치주체 (High correlation)
설치주체 is highly overall correlated with 운영관리 (High correlation)
설치주체 is highly imbalanced (58.0%) (Imbalance)
운영관리 is highly imbalanced (58.0%) (Imbalance)
연번 has unique values (Unique)
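
The 58.0% imbalance figure can be approximated by hand. The sketch below assumes the score is one minus the Shannon entropy of the value counts divided by the logarithm of the number of distinct values; that definition is an assumption, not something stated in this report, and the function name is hypothetical. The counts are the ones listed for 설치주체 further down (82, 5, 3, 2, plus 18 values that occur once).

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the counts
    and k is the number of distinct values (assumed definition)."""
    total = sum(counts)
    k = len(counts)
    if k <= 1:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)


# 설치주체: four repeated values plus 18 values that each occur once (22 distinct, 110 rows).
counts = [82, 5, 3, 2] + [1] * 18
score = imbalance_score(counts)
print(f"imbalance score: {score:.3f}")
```

Under this assumption the score comes out close to 0.58, in line with the alert. A perfectly uniform column would score 0.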

Reproduction

Analysis started: 2023-12-12 17:06:12.238511
Analysis finished: 2023-12-12 17:06:13.407747
Duration: 1.17 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

UNIQUE 

Distinct: 110
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 55.5
Minimum: 1
Maximum: 110
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 6.45
Q1: 28.25
Median: 55.5
Q3: 82.75
95-th percentile: 104.55
Maximum: 110
Range: 109
Interquartile range (IQR): 54.5

Descriptive statistics

Standard deviation: 31.898276
Coefficient of variation (CV): 0.57474371
Kurtosis: -1.2
Mean: 55.5
Median Absolute Deviation (MAD): 27.5
Skewness: 0
Sum: 6105
Variance: 1017.5
Monotonicity: Strictly increasing
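
Because 연번 is simply the sequence 1 to 110, the statistics above can be checked with the standard library alone. This is a minimal sketch: the `quantile` helper is hypothetical and assumes linear interpolation between order statistics, which happens to reproduce the percentiles shown here.

```python
import math
import statistics

values = list(range(1, 111))  # 연번: 1..110, strictly increasing, no gaps


def quantile(sorted_values, q):
    """Quantile by linear interpolation between neighbouring order statistics."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


mean = statistics.fmean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = math.sqrt(variance)
cv = std / mean                         # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)
q1, q3 = quantile(values, 0.25), quantile(values, 0.75)

print(mean, variance, round(std, 6), round(cv, 8), mad, q1, q3, q3 - q1)
```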
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | 0.9%
71 | 1 | 0.9%
82 | 1 | 0.9%
81 | 1 | 0.9%
80 | 1 | 0.9%
79 | 1 | 0.9%
78 | 1 | 0.9%
77 | 1 | 0.9%
76 | 1 | 0.9%
75 | 1 | 0.9%
Other values (100) | 100 | 90.9%
Value | Count | Frequency (%)
1 | 1 | 0.9%
2 | 1 | 0.9%
3 | 1 | 0.9%
4 | 1 | 0.9%
5 | 1 | 0.9%
6 | 1 | 0.9%
7 | 1 | 0.9%
8 | 1 | 0.9%
9 | 1 | 0.9%
10 | 1 | 0.9%
Value | Count | Frequency (%)
110 | 1 | 0.9%
109 | 1 | 0.9%
108 | 1 | 0.9%
107 | 1 | 0.9%
106 | 1 | 0.9%
105 | 1 | 0.9%
104 | 1 | 0.9%
103 | 1 | 0.9%
102 | 1 | 0.9%
101 | 1 | 0.9%

시설형태
Categorical

Distinct: 4
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1012.0 B
개방형 | 69
완전개방형 | 18
완전폐쇄형 | 16
폐쇄형 | 7

Length

Max length: 5
Median length: 3
Mean length: 3.6181818
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 완전폐쇄형
2nd row: 완전폐쇄형
3rd row: 개방형
4th row: 완전개방형
5th row: 개방형

Common Values

Value | Count | Frequency (%)
개방형 | 69 | 62.7%
완전개방형 | 18 | 16.4%
완전폐쇄형 | 16 | 14.5%
폐쇄형 | 7 | 6.4%
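
The frequency column and the mean length reported for 시설형태 follow directly from the four counts. A minimal sketch, using only the counts shown in the table above:

```python
from collections import Counter

# Value counts for 시설형태, copied from the Common Values table.
counts = Counter({"개방형": 69, "완전개방형": 18, "완전폐쇄형": 16, "폐쇄형": 7})
total = sum(counts.values())

# Frequency (%) per value, rounded to one decimal as in the report.
frequencies = {value: round(100 * n / total, 1) for value, n in counts.most_common()}

# Mean string length, weighted by how often each value occurs.
mean_length = sum(len(value) * n for value, n in counts.items()) / total

print(total, frequencies, round(mean_length, 7))
```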

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
개방형 | 69 | 62.7%
완전개방형 | 18 | 16.4%
완전폐쇄형 | 16 | 14.5%
폐쇄형 | 7 | 6.4%

설치위치
Text

Distinct: 106
Distinct (%): 96.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1012.0 B

Length

Max length: 27
Median length: 22
Mean length: 15.772727
Min length: 7

Characters and Unicode

Total characters: 1735
Distinct characters: 135
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
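
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories for the first sample value of this variable; the two-letter codes correspond to the category names used in this report (`Lo` = Other Letter, `Nd` = Decimal Number, `Zs` = Space Separator, `Ps` = Open Punctuation, `Pe` = Close Punctuation). The helper name is hypothetical.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters in `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


sample = "레프트커피(경희대로 20) 옥상"  # 1st sample row of 설치위치
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```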

Unique

Unique: 102
Unique (%): 92.7%

Sample

1st row: 레프트커피(경희대로 20) 옥상
2nd row: 빅히트스크린야구(이문로 82) 테라스
3rd row: 천호대로 307 (답십리동) 10층
4th row: 고미술로 100(답십리동) 옥상
5th row: 천호대로 329(답십리동) 1층
Value | Count | Frequency (%)
1층 | 48 | 14.4%
옥상 | 38 | 11.4%
왕산로 | 18 | 5.4%
천호대로 | 13 | 3.9%
장한로 | 6 | 1.8%
테라스 | 6 | 1.8%
장한로2길 | 6 | 1.8%
회기로 | 5 | 1.5%
답십리로 | 5 | 1.5%
33 | 5 | 1.5%
Other values (137) | 184 | 55.1%

Most occurring characters

ValueCountFrequency (%)
225
 
13.0%
110
 
6.3%
( 100
 
5.8%
) 100
 
5.8%
1 92
 
5.3%
87
 
5.0%
68
 
3.9%
2 57
 
3.3%
3 47
 
2.7%
43
 
2.5%
Other values (125) 806
46.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 934 | 53.8%
Decimal Number | 359 | 20.7%
Space Separator | 225 | 13.0%
Open Punctuation | 100 | 5.8%
Close Punctuation | 100 | 5.8%
Uppercase Letter | 11 | 0.6%
Other Punctuation | 6 | 0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
110
 
11.8%
87
 
9.3%
68
 
7.3%
43
 
4.6%
40
 
4.3%
40
 
4.3%
32
 
3.4%
30
 
3.2%
28
 
3.0%
26
 
2.8%
Other values (101) 430
46.0%
Decimal Number
ValueCountFrequency (%)
1 92
25.6%
2 57
15.9%
3 47
13.1%
4 36
 
10.0%
7 33
 
9.2%
6 22
 
6.1%
9 20
 
5.6%
5 20
 
5.6%
0 17
 
4.7%
8 15
 
4.2%
Uppercase Letter
ValueCountFrequency (%)
N 2
18.2%
S 2
18.2%
C 1
9.1%
M 1
9.1%
I 1
9.1%
E 1
9.1%
O 1
9.1%
U 1
9.1%
L 1
9.1%
Other Punctuation
ValueCountFrequency (%)
, 5
83.3%
. 1
 
16.7%
Space Separator
ValueCountFrequency (%)
225
100.0%
Open Punctuation
ValueCountFrequency (%)
( 100
100.0%
Close Punctuation
ValueCountFrequency (%)
) 100
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 934 | 53.8%
Common | 790 | 45.5%
Latin | 11 | 0.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
110
 
11.8%
87
 
9.3%
68
 
7.3%
43
 
4.6%
40
 
4.3%
40
 
4.3%
32
 
3.4%
30
 
3.2%
28
 
3.0%
26
 
2.8%
Other values (101) 430
46.0%
Common
ValueCountFrequency (%)
225
28.5%
( 100
12.7%
) 100
12.7%
1 92
11.6%
2 57
 
7.2%
3 47
 
5.9%
4 36
 
4.6%
7 33
 
4.2%
6 22
 
2.8%
9 20
 
2.5%
Other values (5) 58
 
7.3%
Latin
ValueCountFrequency (%)
N 2
18.2%
S 2
18.2%
C 1
9.1%
M 1
9.1%
I 1
9.1%
E 1
9.1%
O 1
9.1%
U 1
9.1%
L 1
9.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 934 | 53.8%
ASCII | 801 | 46.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
225
28.1%
( 100
12.5%
) 100
12.5%
1 92
11.5%
2 57
 
7.1%
3 47
 
5.9%
4 36
 
4.5%
7 33
 
4.1%
6 22
 
2.7%
9 20
 
2.5%
Other values (14) 69
 
8.6%
Hangul
ValueCountFrequency (%)
110
 
11.8%
87
 
9.3%
68
 
7.3%
43
 
4.6%
40
 
4.3%
40
 
4.3%
32
 
3.4%
30
 
3.2%
28
 
3.0%
26
 
2.8%
Other values (101) 430
46.0%

규모
Categorical

Distinct: 19
Distinct (%): 17.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1012.0 B
5㎡ | 14
10㎡ | 13
2㎡ | 13
3㎡ | 12
6㎡ | 11
Other values (14) | 47

Length

Max length: 5
Median length: 2
Mean length: 2.4363636
Min length: 2

Unique

Unique: 2
Unique (%): 1.8%

Sample

1st row: 5㎡
2nd row: 2㎡
3rd row: 15㎡
4th row: 13㎡
5th row: 13㎡

Common Values

Value | Count | Frequency (%)
5㎡ | 14 | 12.7%
10㎡ | 13 | 11.8%
2㎡ | 13 | 11.8%
3㎡ | 12 | 10.9%
6㎡ | 11 | 10.0%
7㎡ | 7 | 6.4%
17㎡ | 5 | 4.5%
13㎡ | 5 | 4.5%
20㎡ | 4 | 3.6%
15㎡ | 4 | 3.6%
Other values (9) | 22 | 20.0%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
5㎡ | 14 | 12.7%
2㎡ | 13 | 11.8%
10㎡ | 13 | 11.8%
3㎡ | 12 | 10.9%
6㎡ | 11 | 10.0%
7㎡ | 7 | 6.4%
17㎡ | 5 | 4.5%
13㎡ | 5 | 4.5%
4㎡ | 4 | 3.6%
8㎡ | 4 | 3.6%
Other values (9) | 22 | 20.0%

설치년도
Categorical

Distinct: 12
Distinct (%): 10.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1012.0 B
2017 | 29
2018 | 22
2016 | 16
2015 | 10
2012 | 8
Other values (7) | 25

Length

Max length: 4
Median length: 4
Mean length: 3.9727273
Min length: 1

Unique

Unique: 2
Unique (%): 1.8%

Sample

1st row: 2016
2nd row:
3rd row: 2014
4th row: 2018
5th row: 2008

Common Values

Value | Count | Frequency (%)
2017 | 29 | 26.4%
2018 | 22 | 20.0%
2016 | 16 | 14.5%
2015 | 10 | 9.1%
2012 | 8 | 7.3%
2014 | 6 | 5.5%
2013 | 6 | 5.5%
<NA> | 5 | 4.5%
2010 | 4 | 3.6%
2008 | 2 | 1.8%
Other values (2) | 2 | 1.8%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
2017 | 29 | 26.6%
2018 | 22 | 20.2%
2016 | 16 | 14.7%
2015 | 10 | 9.2%
2012 | 8 | 7.3%
2014 | 6 | 5.5%
2013 | 6 | 5.5%
na | 5 | 4.6%
2010 | 4 | 3.7%
2008 | 2 | 1.8%

설치주체
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 22
Distinct (%): 20.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1012.0 B
건물관리자 | 82
장안스피존 | 5
롯데백화점 | 3
동대문구청 | 2
빅히트 스크린야구 | 1
Other values (17) | 17

Length

Max length: 13
Median length: 5
Mean length: 5.3
Min length: 4

Unique

Unique: 18
Unique (%): 16.4%

Sample

1st row: 레프트커피
2nd row: 빅히트 스크린야구
3rd row: 건물관리자
4th row: 건물관리자
5th row: 건물관리자

Common Values

Value | Count | Frequency (%)
건물관리자 | 82 | 74.5%
장안스피존 | 5 | 4.5%
롯데백화점 | 3 | 2.7%
동대문구청 | 2 | 1.8%
빅히트 스크린야구 | 1 | 0.9%
서울명병원 | 1 | 0.9%
멘토스병원 | 1 | 0.9%
대한민국정형외과 | 1 | 0.9%
경희의료원 | 1 | 0.9%
브레인요양병원 | 1 | 0.9%
Other values (12) | 12 | 10.9%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
건물관리자 | 82 | 72.6%
장안스피존 | 5 | 4.4%
롯데백화점 | 3 | 2.7%
동대문구청 | 2 | 1.8%
inn.seoul | 1 | 0.9%
레프트커피 | 1 | 0.9%
호텔케이피 | 1 | 0.9%
주식회사 | 1 | 0.9%
지혜병원 | 1 | 0.9%
래미안엘파인아파트 | 1 | 0.9%
Other values (15) | 15 | 13.3%

운영관리
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 22
Distinct (%): 20.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1012.0 B
건물관리자 | 82
장안스피존 | 5
 | 3
동대문구청 | 2
빅히트 스크린야구 | 1
Other values (17) | 17

Length

Max length: 13
Median length: 5
Mean length: 5.1909091
Min length: 1

Unique

Unique: 18
Unique (%): 16.4%

Sample

1st row: 레프트커피
2nd row: 빅히트 스크린야구
3rd row: 건물관리자
4th row: 건물관리자
5th row: 건물관리자

Common Values

Value | Count | Frequency (%)
건물관리자 | 82 | 74.5%
장안스피존 | 5 | 4.5%
 | 3 | 2.7%
동대문구청 | 2 | 1.8%
빅히트 스크린야구 | 1 | 0.9%
서울명병원 | 1 | 0.9%
멘토스병원 | 1 | 0.9%
대한민국정형외과 | 1 | 0.9%
경희의료원 | 1 | 0.9%
브레인요양병원 | 1 | 0.9%
Other values (12) | 12 | 10.9%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
건물관리자 | 82 | 72.6%
장안스피존 | 5 | 4.4%
 | 3 | 2.7%
동대문구청 | 2 | 1.8%
inn.seoul | 1 | 0.9%
레프트커피 | 1 | 0.9%
호텔케이피 | 1 | 0.9%
주식회사 | 1 | 0.9%
지혜병원 | 1 | 0.9%
래미안엘파인아파트 | 1 | 0.9%
Other values (15) | 15 | 13.3%

데이터기준일자
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1012.0 B
2021-10-27 | 110

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021-10-27
2nd row: 2021-10-27
3rd row: 2021-10-27
4th row: 2021-10-27
5th row: 2021-10-27

Common Values

Value | Count | Frequency (%)
2021-10-27 | 110 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2021-10-27 | 110 | 100.0%

Interactions


Correlations

 | 연번 | 시설형태 | 규모 | 설치년도 | 설치주체 | 운영관리
연번 | 1.000 | 0.486 | 0.712 | 0.457 | 0.674 | 0.674
시설형태 | 0.486 | 1.000 | 0.573 | 0.125 | 0.623 | 0.623
규모 | 0.712 | 0.573 | 1.000 | 0.655 | 0.717 | 0.717
설치년도 | 0.457 | 0.125 | 0.655 | 1.000 | 0.843 | 0.843
설치주체 | 0.674 | 0.623 | 0.717 | 0.843 | 1.000 | 1.000
운영관리 | 0.674 | 0.623 | 0.717 | 0.843 | 1.000 | 1.000

 | 시설형태 | 규모 | 운영관리 | 설치년도 | 설치주체
시설형태 | 1.000 | 0.324 | 0.347 | 0.067 | 0.347
규모 | 0.324 | 1.000 | 0.286 | 0.298 | 0.286
운영관리 | 0.347 | 0.286 | 1.000 | 0.467 | 1.000
설치년도 | 0.067 | 0.298 | 0.467 | 1.000 | 0.467
설치주체 | 0.347 | 0.286 | 1.000 | 0.467 | 1.000

 | 연번 | 시설형태 | 규모 | 설치년도 | 설치주체 | 운영관리
연번 | 1.000 | 0.302 | 0.350 | 0.210 | 0.301 | 0.301
시설형태 | 0.302 | 1.000 | 0.324 | 0.067 | 0.347 | 0.347
규모 | 0.350 | 0.324 | 1.000 | 0.298 | 0.286 | 0.286
설치년도 | 0.210 | 0.067 | 0.298 | 1.000 | 0.467 | 0.467
설치주체 | 0.301 | 0.347 | 0.286 | 0.467 | 1.000 | 1.000
운영관리 | 0.301 | 0.347 | 0.286 | 0.467 | 1.000 | 1.000
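
The 1.000 between 설치주체 and 운영관리 is what an association measure for categorical pairs returns when each value of one column maps to exactly one value of the other, even if the labels themselves differ. The sketch below uses plain (uncorrected) Cramér's V as a stand-in; which statistic the report actually applies is not stated here, so treat that choice as an assumption. The two short lists are toy data for illustration, not rows from the dataset.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    for x, rx in row_totals.items():
        for y, cy in col_totals.items():
            expected = rx * cy / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association information
    return math.sqrt(chi2 / (n * k))


# Toy columns: identical except that one label is consistently replaced by another.
owner = ["A", "A", "A", "A", "B", "B", "C"]
manager = ["A", "A", "A", "A", "x", "x", "C"]
print(round(cramers_v(owner, manager), 3))
```

A one-to-one relabelling keeps the statistic at its maximum, while two unrelated columns score near 0.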

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

 | 연번 | 시설형태 | 설치위치 | 규모 | 설치년도 | 설치주체 | 운영관리 | 데이터기준일자
0 | 1 | 완전폐쇄형 | 레프트커피(경희대로 20) 옥상 | 5㎡ | 2016 | 레프트커피 | 레프트커피 | 2021-10-27
1 | 2 | 완전폐쇄형 | 빅히트스크린야구(이문로 82) 테라스 | 2㎡ |  | 빅히트 스크린야구 | 빅히트 스크린야구 | 2021-10-27
2 | 3 | 개방형 | 천호대로 307 (답십리동) 10층 | 15㎡ | 2014 | 건물관리자 | 건물관리자 | 2021-10-27
3 | 4 | 완전개방형 | 고미술로 100(답십리동) 옥상 | 13㎡ | 2018 | 건물관리자 | 건물관리자 | 2021-10-27
4 | 5 | 개방형 | 천호대로 329(답십리동) 1층 | 13㎡ | 2008 | 건물관리자 | 건물관리자 | 2021-10-27
5 | 6 | 완전개방형 | 한천로11길 6(답십리동) 옥상 | 10㎡ | 2017 | 건물관리자 | 건물관리자 | 2021-10-27
6 | 7 | 완전개방형 | 홍릉로 58(청량리동) 1층 | 10㎡ | 2017 | 건물관리자 | 건물관리자 | 2021-10-27
7 | 8 | 완전폐쇄형 | 회기로 47(청량리동) 1층 | 5㎡ | 2010 | 건물관리자 | 건물관리자 | 2021-10-27
8 | 9 | 완전폐쇄형 | 회기로 57(청량리동) 1층 | 13㎡ | 2016 | 건물관리자 | 건물관리자 | 2021-10-27
9 | 10 | 완전개방형 | 왕산로 61(용두동) 옥상 | 10㎡ | 2013 | 건물관리자 | 건물관리자 | 2021-10-27
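
 | 연번 | 시설형태 | 설치위치 | 규모 | 설치년도 | 설치주체 | 운영관리 | 데이터기준일자
100 | 101 | 완전폐쇄형 | 장한로2길 33, 2층 | 17㎡ | <NA> | 장안스피존 | 장안스피존 | 2021-10-27
101 | 102 | 완전폐쇄형 | 장한로2길 33, 3층 | 17㎡ | <NA> | 장안스피존 | 장안스피존 | 2021-10-27
102 | 103 | 완전폐쇄형 | 장한로2길 33, 4층 | 17㎡ | <NA> | 장안스피존 | 장안스피존 | 2021-10-27
103 | 104 | 완전폐쇄형 | 장한로2길 33, 5층 | 17㎡ | <NA> | 장안스피존 | 장안스피존 | 2021-10-27
104 | 105 | 완전폐쇄형 | 장한로2길 33, 5층 | 17㎡ | <NA> | 장안스피존 | 장안스피존 | 2021-10-27
105 | 106 | 개방형 | 왕산로 214 1층 | <NA> | 2010 | 롯데백화점 |  | 2021-10-27
106 | 107 | 개방형 | 왕산로 214 3층 | <NA> | 2010 | 롯데백화점 |  | 2021-10-27
107 | 108 | 개방형 | 왕산로 214 7층 옥상 | <NA> | 2010 | 롯데백화점 |  | 2021-10-27
108 | 109 | 개방형 | 천호대로 421 옥상 | 10㎡ | 2018 | 지혜병원 | 지혜병원 | 2021-10-27
109 | 110 | 완전개방형 | 장한로6 옥상 | 20㎡ | 2018 | 장안빌딩 | 장안빌딩 | 2021-10-27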