Overview

Dataset statistics

Number of variables: 4
Number of observations: 929
Missing cells: 11
Missing cells (%): 0.3%
Duplicate rows: 1
Duplicate rows (%): 0.1%
Total size in memory: 30.1 KiB
Average record size in memory: 33.1 B
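
The two percentage fields follow from the counts above. A minimal sketch of the arithmetic, assuming the missing-cell percentage is taken over all cells (variables × observations) and the duplicate percentage over all rows; the variable names are illustrative only:

```python
# Counts copied from the dataset statistics above.
n_variables = 4
n_observations = 929
missing_cells = 11
duplicate_rows = 1

# Assumption: missing cells are measured against every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: duplicate rows are measured against the number of rows.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```

Both values round to the figures reported above (0.3% and 0.1%).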

Variable types

Categorical: 2
Text: 1
Numeric: 1

Dataset

Description: 자치구명, 법정동명, 업태명, 업소수 (district name, legal-dong name, business type name, number of establishments)
Author: 마포구 (Mapo-gu)
URL: https://data.seoul.go.kr/dataList/OA-11376/S/1/datasetView.do

Alerts

자치구명 has constant value "마포구" (Constant)
Dataset has 1 (0.1%) duplicate rows (Duplicates)
업태명 has 11 (1.2%) missing values (Missing)

Reproduction

Analysis started: 2024-05-11 06:40:27.427659
Analysis finished: 2024-05-11 06:40:28.840161
Duration: 1.41 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
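
A report of this kind can be regenerated along the following lines. This is only a sketch: the CSV path, the report title, and the output file name are assumptions (none of them appear in this report), and the settings stored in config.json are not reproduced here.

```python
def build_profile(csv_path: str, output_path: str = "report.html") -> None:
    """Profile a CSV file with ydata-profiling and write an HTML report."""
    # Imported inside the function so the module still loads where
    # pandas or ydata-profiling is not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # csv_path is a hypothetical local export of the dataset; the file
    # encoding is not stated in the report and may need to be passed.
    df = pd.read_csv(csv_path)

    # The title is illustrative, not the one used for this report.
    report = ProfileReport(df, title="Profiling Report")
    report.to_file(output_path)
```

Calling `build_profile("dataset.csv")`, with a placeholder file name, would write `report.html` to the working directory.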

Variables

자치구명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.4 KiB
마포구: 929

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 마포구
2nd row: 마포구
3rd row: 마포구
4th row: 마포구
5th row: 마포구

Common Values

Value    Count    Frequency (%)
마포구    929    100.0%

Length

[Figure] Histogram of lengths of the category

Common Values (Plot)

[Figure] Plot of common values

법정동명
Categorical

Distinct: 27
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Memory size: 7.4 KiB
서교동: 60
상암동: 55
공덕동: 54
도화동: 54
성산동: 54
Other values (22): 652

Length

Max length: 4
Median length: 3
Mean length: 3.0570506
Min length: 2

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 효창동
2nd row: 아현동
3rd row: 아현동
4th row: 아현동
5th row: 아현동

Common Values

Value    Count    Frequency (%)
서교동    60    6.5%
상암동    55    5.9%
공덕동    54    5.8%
도화동    54    5.8%
성산동    54    5.8%
합정동    51    5.5%
망원동    49    5.3%
동교동    49    5.3%
신수동    42    4.5%
대흥동    41    4.4%
Other values (17)    420    45.2%

Length

[Figure] Histogram of lengths of the category

업태명
Text

MISSING 

Distinct: 75
Distinct (%): 8.2%
Missing: 11
Missing (%): 1.2%
Memory size: 7.4 KiB

Length

Max length: 15
Median length: 12
Mean length: 5.795207
Min length: 2

Characters and Unicode

Total characters: 5320
Distinct characters: 150
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
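
As an illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own implementation; the helper name `category_counts` is hypothetical, and the three input strings are values taken from the sample rows of this dataset.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Three 업태명 values that appear in the sample rows of this report.
sample = ["호프/통닭", "통닭(치킨)", "기타(복합 등)"]
counts = category_counts(sample)
print(counts)
```

The two-letter codes correspond to the category names used in this report: `Lo` is Other Letter, `Ps` is Open Punctuation, `Pe` is Close Punctuation, `Po` is Other Punctuation, and `Zs` is Space Separator.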

Unique

Unique: 7
Unique (%): 0.8%

Sample

1st row: 커피숍
2nd row: 한식
3rd row: 중국식
4th row: 경양식
5th row: 일식

Words

Value    Count    Frequency (%)
기타    84    8.2%
패스트푸드    33    3.2%
식품제조가공업    27    2.6%
식품자동판매기영업    25    2.4%
한식    25    2.4%
커피숍    23    2.3%
편의점    23    2.3%
유통전문판매업    23    2.3%
즉석판매제조가공업    23    2.3%
전자상거래(통신판매업    22    2.2%
Other values (66)    714    69.9%

Most occurring characters

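Value    Count    Frequency (%)
(unreadable)    376    7.1%
(unreadable)    324    6.1%
(unreadable)    249    4.7%
(unreadable)    236    4.4%
(unreadable)    195    3.7%
(unreadable)    171    3.2%
(unreadable)    124    2.3%
(unreadable)    118    2.2%
(unreadable)    113    2.1%
(    107    2.0%
Other values (140)    3307    62.2%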

Most occurring categories

Value    Count    Frequency (%)
Other Letter    4942    92.9%
Open Punctuation    107    2.0%
Close Punctuation    107    2.0%
Space Separator    104    2.0%
Other Punctuation    60    1.1%

Most frequent character per category

Other Letter
Value    Count    Frequency (%)
(unreadable)    376    7.6%
(unreadable)    324    6.6%
(unreadable)    249    5.0%
(unreadable)    236    4.8%
(unreadable)    195    3.9%
(unreadable)    171    3.5%
(unreadable)    124    2.5%
(unreadable)    118    2.4%
(unreadable)    113    2.3%
(unreadable)    90    1.8%
Other values (135)    2946    59.6%

Other Punctuation
Value    Count    Frequency (%)
/    42    70.0%
,    18    30.0%

Open Punctuation
Value    Count    Frequency (%)
(    107    100.0%

Close Punctuation
Value    Count    Frequency (%)
)    107    100.0%

Space Separator
Value    Count    Frequency (%)
(space)    104    100.0%

Most occurring scripts

Value    Count    Frequency (%)
Hangul    4942    92.9%
Common    378    7.1%

Most frequent character per script

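Hangul
Value    Count    Frequency (%)
(unreadable)    376    7.6%
(unreadable)    324    6.6%
(unreadable)    249    5.0%
(unreadable)    236    4.8%
(unreadable)    195    3.9%
(unreadable)    171    3.5%
(unreadable)    124    2.5%
(unreadable)    118    2.4%
(unreadable)    113    2.3%
(unreadable)    90    1.8%
Other values (135)    2946    59.6%

Common
Value    Count    Frequency (%)
(    107    28.3%
)    107    28.3%
(space)    104    27.5%
/    42    11.1%
,    18    4.8%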

Most occurring blocks

Value    Count    Frequency (%)
Hangul    4942    92.9%
ASCII    378    7.1%

Most frequent character per block

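Hangul
Value    Count    Frequency (%)
(unreadable)    376    7.6%
(unreadable)    324    6.6%
(unreadable)    249    5.0%
(unreadable)    236    4.8%
(unreadable)    195    3.9%
(unreadable)    171    3.5%
(unreadable)    124    2.5%
(unreadable)    118    2.4%
(unreadable)    113    2.3%
(unreadable)    90    1.8%
Other values (135)    2946    59.6%

ASCII
Value    Count    Frequency (%)
(    107    28.3%
)    107    28.3%
(space)    104    27.5%
/    42    11.1%
,    18    4.8%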

업소수
Real number (ℝ)

Distinct: 100
Distinct (%): 10.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.133477
Minimum: 1
Maximum: 581
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 3
Q3: 13
95-th percentile: 72
Maximum: 581
Range: 580
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 41.111254
Coefficient of variation (CV): 2.5481955
Kurtosis: 70.971393
Mean: 16.133477
Median Absolute Deviation (MAD): 2
Skewness: 7.0811037
Sum: 14988
Variance: 1690.1352
Monotonicity: Not monotonic
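
Several of these figures are tied together by standard definitions, which gives a quick consistency check. The sketch below recomputes the mean, coefficient of variation, variance, range, and IQR from the other reported values; it uses only numbers printed in this report, and the variable names are illustrative.

```python
import math

# Values copied from the 업소수 statistics in this report.
n = 929                 # number of observations
total = 14988           # Sum
std = 41.111254         # Standard deviation
q1, q3 = 1, 13          # Q1 and Q3
minimum, maximum = 1, 581

mean = total / n                 # mean = sum / n
cv = std / mean                  # coefficient of variation = std / mean
variance = std ** 2              # variance = std squared
value_range = maximum - minimum  # range = max - min
iqr = q3 - q1                    # interquartile range = Q3 - Q1

print(f"mean={mean:.6f} cv={cv:.7f} variance={variance:.4f}")
print(f"range={value_range} iqr={iqr}")
```

The recomputed values agree with the reported mean, CV, variance, range, and IQR to the precision shown.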
[Figure] Histogram with fixed size bins (bins=50)

Common values

Value    Count    Frequency (%)
1    255    27.4%
2    134    14.4%
3    82    8.8%
4    44    4.7%
5    38    4.1%
6    33    3.6%
7    25    2.7%
8    21    2.3%
9    17    1.8%
11    16    1.7%
Other values (90)    264    28.4%

Minimum 10 values

Value    Count    Frequency (%)
1    255    27.4%
2    134    14.4%
3    82    8.8%
4    44    4.7%
5    38    4.1%
6    33    3.6%
7    25    2.7%
8    21    2.3%
9    17    1.8%
10    15    1.6%

Maximum 10 values

Value    Count    Frequency (%)
581    1    0.1%
500    1    0.1%
372    1    0.1%
298    1    0.1%
255    1    0.1%
221    1    0.1%
205    1    0.1%
199    1    0.1%
197    1    0.1%
196    1    0.1%

Interactions


Correlations

            법정동명    업태명    업소수
법정동명    1.000    0.000    0.000
업태명    0.000    1.000    0.000
업소수    0.000    0.000    1.000

            업소수    법정동명
업소수    1.000    0.000
법정동명    0.000    1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Index    자치구명    법정동명    업태명    업소수
0    마포구    효창동    커피숍    1
1    마포구    아현동    한식    62
2    마포구    아현동    중국식    5
3    마포구    아현동    경양식    7
4    마포구    아현동    일식    9
5    마포구    아현동    분식    13
6    마포구    아현동    호프/통닭    14
7    마포구    아현동    통닭(치킨)    2
8    마포구    아현동    회집    1
9    마포구    아현동    까페    4

Last rows

Index    자치구명    법정동명    업태명    업소수
919    마포구    상암동    영업장판매    39
920    마포구    상암동    방문판매    3
921    마포구    상암동    전자상거래(통신판매업)    55
922    마포구    상암동    <NA>    2
923    마포구    상암동    다단계판매    1
924    마포구    상암동    도매업(유통)    1
925    마포구    상암동    자동판매기판매    1
926    마포구    상암동    기타(복합 등)    8
927    마포구    상암동    기타 건강기능식품일반판매업    2
928    마포구    상암동    건강기능식품유통전문판매업    10

Duplicate rows

Most frequently occurring

Index    자치구명    법정동명    업태명    업소수    # duplicates
0    마포구    구수동    패스트푸드    1    2
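
Duplicate rows like the one listed above can be found by counting identical records. A minimal sketch with the standard library follows; the three-row list is an illustrative subset built from values shown in this report, not the full dataset, and it assumes a row counts as a duplicate when all four fields match.

```python
from collections import Counter

# Illustrative subset: the duplicated record plus one unrelated sample row.
rows = [
    ("마포구", "구수동", "패스트푸드", 1),
    ("마포구", "구수동", "패스트푸드", 1),  # exact repeat of the row above
    ("마포구", "효창동", "커피숍", 1),
]

# Count how often each complete record occurs.
counts = Counter(rows)

# Keep only records that occur more than once.
duplicates = {row: n for row, n in counts.items() if n > 1}

# Rows beyond the first occurrence of each record.
extra_rows = sum(n - 1 for n in duplicates.values())

print(duplicates)
print(extra_rows)
```

Here the record occurs twice, which matches the "# duplicates" column, while only one row is redundant, which matches the duplicate-row count in the overview.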