Overview

Dataset statistics

Number of variables: 4
Number of observations: 189
Missing cells: 3
Missing cells (%): 0.4%
Duplicate rows: 1
Duplicate rows (%): 0.5%
Total size in memory: 6.2 KiB
Average record size in memory: 33.7 B
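
The two percentages can be rederived from the raw counts. The following is a minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (variables × observations) and the duplicate percentage over all rows. The variable names are illustrative and are not part of the report.

```python
# Counts copied from the dataset statistics in this report.
n_variables = 4
n_observations = 189
missing_cells = 3
duplicate_rows = 1

# Assumption: missing cells are measured against every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: duplicate rows are measured against the number of observations.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```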

Variable types

Categorical: 2
Text: 1
Numeric: 1

Dataset

Description: 자치구명 (district name), 법정동명 (legal dong name), 업태명 (business type), 업소수 (number of establishments)
Author: 관악구 (Gwanak-gu)
URL: https://data.seoul.go.kr/dataList/OA-11530/S/1/datasetView.do

Alerts

자치구명 has constant value "관악구" (Constant)
Dataset has 1 (0.5%) duplicate row (Duplicates)
업태명 has 3 (1.6%) missing values (Missing)

Reproduction

Analysis started: 2024-04-29 15:48:51.394918
Analysis finished: 2024-04-29 15:48:52.967466
Duration: 1.57 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
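
The reported duration is the difference between the two timestamps. A short sketch of that check follows. The format string is an assumption about how the timestamps are written, and the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the report. The format string is an assumption
# (date, time, microsecond precision, no timezone).
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2024-04-29 15:48:51.394918", FMT)
finished = datetime.strptime("2024-04-29 15:48:52.967466", FMT)

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```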

Variables

자치구명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 KiB
관악구: 189

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 관악구
2nd row: 관악구
3rd row: 관악구
4th row: 관악구
5th row: 관악구

Common Values

Value | Count | Frequency (%)
관악구 | 189 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values (관악구: 189, 100.0%)]

법정동명
Categorical

Distinct: 3
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 KiB
봉천동: 70
신림동: 70
남현동: 49

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 봉천동
2nd row: 봉천동
3rd row: 봉천동
4th row: 봉천동
5th row: 봉천동

Common Values

Value | Count | Frequency (%)
봉천동 | 70 | 37.0%
신림동 | 70 | 37.0%
남현동 | 49 | 25.9%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values (봉천동: 70, 37.0%; 신림동: 70, 37.0%; 남현동: 49, 25.9%)]

업태명
Text

MISSING 

Distinct: 74
Distinct (%): 39.8%
Missing: 3
Missing (%): 1.6%
Memory size: 1.6 KiB

Length

Max length: 15
Median length: 12
Mean length: 5.4301075
Min length: 2

Characters and Unicode

Total characters: 1010
Distinct characters: 153
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
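
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The two-letter codes `Lo`, `Ps`, `Pe`, `Zs` and `Po` stand for Other Letter, Open Punctuation, Close Punctuation, Space Separator and Other Punctuation. This is a sketch using the standard library, not necessarily how the profiling tool computes its counts, and the three input values are just a few 업태명 entries taken from the sample rows of this report.

```python
import unicodedata
from collections import Counter

# Illustrative input only: three 업태명 values from the report's sample rows.
values = ["한식", "정종/대포집/소주방", "통닭(치킨)"]

# Tally the Unicode general category of every character in every value.
category_counts = Counter(
    unicodedata.category(ch) for value in values for ch in value
)

for category, count in category_counts.most_common():
    print(category, count)
```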

Unique

Unique: 12
Unique (%): 6.5%

Sample

1st row: 한식
2nd row: 중국식
3rd row: 경양식
4th row: 일식
5th row: 분식

Value | Count | Frequency (%)
기타 | 12 | 6.0%
패스트푸드 | 6 | 3.0%
식품제조가공업 | 4 | 2.0%
집단급식소 | 4 | 2.0%
식품소분업 | 3 | 1.5%
유통전문판매업 | 3 | 1.5%
아이스크림 | 3 | 1.5%
중국식 | 3 | 1.5%
학교 | 3 | 1.5%
사회복지시설 | 3 | 1.5%
Other values (64) | 155 | 77.9%

Most occurring characters

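Value | Count | Frequency (%)
(unrecovered) | 62 | 6.1%
(unrecovered) | 54 | 5.3%
(unrecovered) | 41 | 4.1%
(unrecovered) | 38 | 3.8%
(unrecovered) | 30 | 3.0%
(unrecovered) | 28 | 2.8%
( | 23 | 2.3%
) | 23 | 2.3%
(unrecovered) | 21 | 2.1%
(unrecovered) | 20 | 2.0%
Other values (143) | 670 | 66.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 939 | 93.0%
Open Punctuation | 23 | 2.3%
Close Punctuation | 23 | 2.3%
Space Separator | 13 | 1.3%
Other Punctuation | 12 | 1.2%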

Most frequent character per category

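Other Letter
Value | Count | Frequency (%)
(unrecovered) | 62 | 6.6%
(unrecovered) | 54 | 5.8%
(unrecovered) | 41 | 4.4%
(unrecovered) | 38 | 4.0%
(unrecovered) | 30 | 3.2%
(unrecovered) | 28 | 3.0%
(unrecovered) | 21 | 2.2%
(unrecovered) | 20 | 2.1%
(unrecovered) | 17 | 1.8%
(unrecovered) | 15 | 1.6%
Other values (137) | 613 | 65.3%

Other Punctuation
Value | Count | Frequency (%)
/ | 9 | 75.0%
, | 2 | 16.7%
. | 1 | 8.3%

Open Punctuation
Value | Count | Frequency (%)
( | 23 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 23 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 13 | 100.0%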

Most occurring scripts

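Value | Count | Frequency (%)
Hangul | 939 | 93.0%
Common | 71 | 7.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(unrecovered) | 62 | 6.6%
(unrecovered) | 54 | 5.8%
(unrecovered) | 41 | 4.4%
(unrecovered) | 38 | 4.0%
(unrecovered) | 30 | 3.2%
(unrecovered) | 28 | 3.0%
(unrecovered) | 21 | 2.2%
(unrecovered) | 20 | 2.1%
(unrecovered) | 17 | 1.8%
(unrecovered) | 15 | 1.6%
Other values (137) | 613 | 65.3%

Common
Value | Count | Frequency (%)
( | 23 | 32.4%
) | 23 | 32.4%
(space) | 13 | 18.3%
/ | 9 | 12.7%
, | 2 | 2.8%
. | 1 | 1.4%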

Most occurring blocks

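Value | Count | Frequency (%)
Hangul | 939 | 93.0%
ASCII | 71 | 7.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(unrecovered) | 62 | 6.6%
(unrecovered) | 54 | 5.8%
(unrecovered) | 41 | 4.4%
(unrecovered) | 38 | 4.0%
(unrecovered) | 30 | 3.2%
(unrecovered) | 28 | 3.0%
(unrecovered) | 21 | 2.2%
(unrecovered) | 20 | 2.1%
(unrecovered) | 17 | 1.8%
(unrecovered) | 15 | 1.6%
Other values (137) | 613 | 65.3%

ASCII
Value | Count | Frequency (%)
( | 23 | 32.4%
) | 23 | 32.4%
(space) | 13 | 18.3%
/ | 9 | 12.7%
, | 2 | 2.8%
. | 1 | 1.4%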

업소수
Real number (ℝ)

Distinct: 69
Distinct (%): 36.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 47.941799
Minimum: 1
Maximum: 1056
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 8
Q3: 34
95-th percentile: 246.2
Maximum: 1056
Range: 1055
Interquartile range (IQR): 32
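
The range and the interquartile range are simple differences of the quantiles listed above. A minimal sketch of the arithmetic, with illustrative variable names:

```python
# Quantiles copied from the report.
minimum, q1, q3, maximum = 1, 2, 34, 1056

# Range spans the full data; IQR spans the middle 50% of observations.
value_range = maximum - minimum
iqr = q3 - q1

print(f"Range: {value_range}")
print(f"Interquartile range (IQR): {iqr}")
```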

Descriptive statistics

Standard deviation: 120.47725
Coefficient of variation (CV): 2.5129898
Kurtosis: 42.882007
Mean: 47.941799
Median Absolute Deviation (MAD): 7
Skewness: 5.850821
Sum: 9061
Variance: 14514.768
Monotonicity: Not monotonic
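
Several of these figures are linked by standard identities: the mean is the sum divided by the number of observations, the variance is the square of the standard deviation, and the coefficient of variation is the standard deviation divided by the mean. The sketch below checks those relations using the rounded values printed in the report, so small rounding differences are expected. The variable names are illustrative.

```python
# Values copied from the report (already rounded there).
total = 9061          # Sum
n = 189               # Number of observations
std = 120.47725       # Standard deviation

mean = total / n          # arithmetic mean
variance = std ** 2       # variance as squared standard deviation
cv = std / mean           # coefficient of variation

print(f"Mean: {mean:.6f}")
print(f"Variance: {variance:.3f}")
print(f"CV: {cv:.5f}")
```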
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
1 | 39 | 20.6%
2 | 13 | 6.9%
4 | 12 | 6.3%
5 | 9 | 4.8%
3 | 9 | 4.8%
7 | 8 | 4.2%
15 | 6 | 3.2%
8 | 6 | 3.2%
10 | 6 | 3.2%
9 | 5 | 2.6%
Other values (59) | 76 | 40.2%

Minimum 10 values

Value | Count | Frequency (%)
1 | 39 | 20.6%
2 | 13 | 6.9%
3 | 9 | 4.8%
4 | 12 | 6.3%
5 | 9 | 4.8%
6 | 4 | 2.1%
7 | 8 | 4.2%
8 | 6 | 3.2%
9 | 5 | 2.6%
10 | 6 | 3.2%

Maximum 10 values

Value | Count | Frequency (%)
1056 | 1 | 0.5%
964 | 1 | 0.5%
301 | 1 | 0.5%
292 | 1 | 0.5%
271 | 1 | 0.5%
266 | 1 | 0.5%
263 | 1 | 0.5%
257 | 1 | 0.5%
250 | 1 | 0.5%
247 | 1 | 0.5%

Interactions


Correlations

Variable | 법정동명 | 업태명 | 업소수
법정동명 | 1.000 | 0.000 | 0.119
업태명 | 0.000 | 1.000 | 0.822
업소수 | 0.119 | 0.822 | 1.000

Variable | 업소수 | 법정동명
업소수 | 1.000 | 0.112
법정동명 | 0.112 | 1.000

Missing values

[Figure: nullity bar chart] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

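First rows

Index | 자치구명 | 법정동명 | 업태명 | 업소수
0 | 관악구 | 봉천동 | 한식 | 964
1 | 관악구 | 봉천동 | 중국식 | 82
2 | 관악구 | 봉천동 | 경양식 | 179
3 | 관악구 | 봉천동 | 일식 | 133
4 | 관악구 | 봉천동 | 분식 | 182
5 | 관악구 | 봉천동 | 뷔페식 | 5
6 | 관악구 | 봉천동 | 정종/대포집/소주방 | 55
7 | 관악구 | 봉천동 | 패스트푸드 | 9
8 | 관악구 | 봉천동 | 호프/통닭 | 234
9 | 관악구 | 봉천동 | 통닭(치킨) | 41

Last rows

Index | 자치구명 | 법정동명 | 업태명 | 업소수
179 | 관악구 | 남현동 | 제과점영업 | 7
180 | 관악구 | 남현동 | 집단급식소 식품판매업 | 1
181 | 관악구 | 남현동 | 건강기능식품수입업 | 1
182 | 관악구 | 남현동 | 영업장판매 | 25
183 | 관악구 | 남현동 | 방문판매 | 2
184 | 관악구 | 남현동 | 전자상거래(통신판매업) | 27
185 | 관악구 | 남현동 | <NA> | 1
186 | 관악구 | 남현동 | 다단계판매 | 1
187 | 관악구 | 남현동 | 도매업(유통) | 1
188 | 관악구 | 남현동 | 건강기능식품유통전문판매업 | 3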

Duplicate rows

Most frequently occurring

Index | 자치구명 | 법정동명 | 업태명 | 업소수 | # duplicates
0 | 관악구 | 봉천동 | 키즈카페 | 1 | 2
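
A duplicate-row table of this kind can be sketched by counting identical rows and keeping those that occur more than once. The example below uses a small hypothetical list of rows. Only the 키즈카페 row mirrors the report, read here as 업소수 = 1 occurring 2 times, which is an assumption about the flattened entry. This is not necessarily how the profiling tool detects duplicates.

```python
from collections import Counter

# Hypothetical rows for illustration: (자치구명, 법정동명, 업태명, 업소수).
rows = [
    ("관악구", "봉천동", "키즈카페", 1),
    ("관악구", "봉천동", "한식", 964),
    ("관악구", "봉천동", "키즈카페", 1),
]

# Count how often each full row occurs, then keep rows seen more than once.
counts = Counter(rows)
duplicates = {row: n for row, n in counts.items() if n > 1}

for row, n in duplicates.items():
    print(row, "# duplicates:", n)
```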