Overview

Dataset statistics

Number of variables: 4
Number of observations: 1243
Missing cells: 18
Missing cells (%): 0.4%
Duplicate rows: 5
Duplicate rows (%): 0.4%
Total size in memory: 40.2 KiB
Average record size in memory: 33.1 B
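
The two percentages above follow from the counts: missing cells are measured against all cells (observations times variables), and duplicate rows against all observations. A minimal sketch of that arithmetic, using only figures from this report (the variable names are illustrative):

```python
# Figures copied from the dataset statistics in this report.
n_variables = 4
n_observations = 1243
missing_cells = 18
duplicate_rows = 5

total_cells = n_variables * n_observations             # every cell in the table
missing_pct = 100 * missing_cells / total_cells        # share of all cells
duplicate_pct = 100 * duplicate_rows / n_observations  # share of all rows

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```

Both values round to 0.4%, which is why the two figures look identical even though their denominators differ.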

Variable types

Categorical: 2
Text: 1
Numeric: 1

Dataset

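Description: 자치구명 (district name), 법정동명 (legal dong name), 업태명 (business type), 업소수 (number of establishments)
Author: 영등포구 (Yeongdeungpo-gu)
URL: https://data.seoul.go.kr/dataList/OA-10452/S/1/datasetView.do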

Alerts

자치구명 has constant value "영등포구" (Constant)
Dataset has 5 (0.4%) duplicate rows (Duplicates)
업태명 has 18 (1.4%) missing values (Missing)
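
As a rough illustration of what the constant and missing-value alerts check, the sketch below scans a small hand-made table column by column. The rows are illustrative (loosely modeled on this dataset), and the helper `column_alerts` is hypothetical, not part of ydata-profiling.

```python
# Illustrative rows: (자치구명, 법정동명, 업태명, 업소수); None marks a missing value.
columns = ["자치구명", "법정동명", "업태명", "업소수"]
rows = [
    ("영등포구", "영등포동", "한식", 81),
    ("영등포구", "영등포동", "일식", 3),
    ("영등포구", "대림동", None, 3),
    ("영등포구", "당산동4가", "패스트푸드", 1),
]

def column_alerts(rows, columns):
    """Return (constant column names, missing share per column in percent)."""
    constant, missing_pct = [], {}
    for i, name in enumerate(columns):
        values = [row[i] for row in rows]
        n_missing = sum(v is None for v in values)
        missing_pct[name] = 100 * n_missing / len(values)
        # A column is constant when it has no gaps and a single distinct value.
        if n_missing == 0 and len(set(values)) == 1:
            constant.append(name)
    return constant, missing_pct

constant, missing_pct = column_alerts(rows, columns)
print("Constant:", constant)
print("Missing (%):", missing_pct)
```

Note that the per-column missing share (18 of 1243 values, about 1.4%) uses a different denominator than the dataset-wide missing-cell share (18 of 4972 cells, about 0.4%).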

Reproduction

Analysis started: 2024-05-11 06:31:09.669112
Analysis finished: 2024-05-11 06:31:10.821688
Duration: 1.15 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

자치구명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 9.8 KiB
영등포구: 1243

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 영등포구
2nd row: 영등포구
3rd row: 영등포구
4th row: 영등포구
5th row: 영등포구

Common Values

Value | Count | Frequency (%)
영등포구 | 1243 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

법정동명
Categorical

Distinct: 33
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Memory size: 9.8 KiB
여의도동: 72
신길동: 59
대림동: 54
문래동3가: 51
양평동4가: 48
Other values (28): 959

Length

Max length: 6
Median length: 5
Mean length: 4.8334674
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 영등포동
2nd row: 영등포동
3rd row: 영등포동
4th row: 영등포동
5th row: 영등포동

Common Values

Value | Count | Frequency (%)
여의도동 | 72 | 5.8%
신길동 | 59 | 4.7%
대림동 | 54 | 4.3%
문래동3가 | 51 | 4.1%
양평동4가 | 48 | 3.9%
당산동3가 | 47 | 3.8%
영등포동4가 | 45 | 3.6%
양평동3가 | 43 | 3.5%
당산동1가 | 42 | 3.4%
영등포동 | 42 | 3.4%
Other values (23) | 740 | 59.5%

Length

[Figure: histogram of lengths of the category]

업태명
Text

MISSING 

Distinct: 78
Distinct (%): 6.4%
Missing: 18
Missing (%): 1.4%
Memory size: 9.8 KiB

Length

Max length: 15
Median length: 12
Mean length: 5.7110204
Min length: 2

Characters and Unicode

Total characters: 6996
Distinct characters: 160
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
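
To make the idea of character properties concrete, the sketch below tallies the Unicode general category of every character in three 업태명 values that appear in the sample rows of this report. It uses Python's standard `unicodedata` module; the mapping from two-letter codes to the category names used in this report is written out by hand and covers only the five categories listed here.

```python
import unicodedata
from collections import Counter

# Three 업태명 values taken from the sample rows of this report.
values = ["정종/대포집/소주방", "통닭(치킨)", "집단급식소 식품판매업"]

# Two-letter general category codes and the names used in the report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
}

category_counts = Counter()
for value in values:
    for char in value:
        code = unicodedata.category(char)  # e.g. "Lo" for a Hangul syllable
        category_counts[CATEGORY_NAMES.get(code, code)] += 1

for name, count in category_counts.most_common():
    print(name, count)
```

Hangul syllables fall under Other Letter, which is consistent with that category dominating the counts in this variable.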

Unique

Unique: 4
Unique (%): 0.3%

Sample

1st row: 한식
2nd row: 중국식
3rd row: 경양식
4th row: 일식
5th row: 분식

Most frequent words

Value | Count | Frequency (%)
기타 | 113 | 8.3%
패스트푸드 | 35 | 2.6%
한식 | 33 | 2.4%
휴게음식점 | 33 | 2.4%
커피숍 | 32 | 2.4%
전자상거래(통신판매업 | 32 | 2.4%
편의점 | 32 | 2.4%
식품제조가공업 | 32 | 2.4%
즉석판매제조가공업 | 32 | 2.4%
수입판매업 | 32 | 2.4%
Other values (69) | 952 | 70.1%

Most occurring characters

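Value | Count | Frequency (%)
(not captured) | 505 | 7.2%
(not captured) | 438 | 6.3%
(not captured) | 342 | 4.9%
(not captured) | 332 | 4.7%
(not captured) | 249 | 3.6%
(not captured) | 232 | 3.3%
(not captured) | 164 | 2.3%
(not captured) | 158 | 2.3%
(not captured) | 134 | 1.9%
(space) | 133 | 1.9%
Other values (150) | 4309 | 61.6%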

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 6532 | 93.4%
Space Separator | 133 | 1.9%
Close Punctuation | 119 | 1.7%
Open Punctuation | 119 | 1.7%
Other Punctuation | 93 | 1.3%

Most frequent character per category

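Other Letter
Value | Count | Frequency (%)
(not captured) | 505 | 7.7%
(not captured) | 438 | 6.7%
(not captured) | 342 | 5.2%
(not captured) | 332 | 5.1%
(not captured) | 249 | 3.8%
(not captured) | 232 | 3.6%
(not captured) | 164 | 2.5%
(not captured) | 158 | 2.4%
(not captured) | 134 | 2.1%
(not captured) | 114 | 1.7%
Other values (144) | 3864 | 59.2%

Other Punctuation
Value | Count | Frequency (%)
/ | 71 | 76.3%
, | 18 | 19.4%
. | 4 | 4.3%

Space Separator
Value | Count | Frequency (%)
(space) | 133 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 119 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 119 | 100.0%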

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 6532 | 93.4%
Common | 464 | 6.6%

Most frequent character per script

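Hangul
Value | Count | Frequency (%)
(not captured) | 505 | 7.7%
(not captured) | 438 | 6.7%
(not captured) | 342 | 5.2%
(not captured) | 332 | 5.1%
(not captured) | 249 | 3.8%
(not captured) | 232 | 3.6%
(not captured) | 164 | 2.5%
(not captured) | 158 | 2.4%
(not captured) | 134 | 2.1%
(not captured) | 114 | 1.7%
Other values (144) | 3864 | 59.2%

Common
Value | Count | Frequency (%)
(space) | 133 | 28.7%
) | 119 | 25.6%
( | 119 | 25.6%
/ | 71 | 15.3%
, | 18 | 3.9%
. | 4 | 0.9%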

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 6532 | 93.4%
ASCII | 464 | 6.6%

Most frequent character per block

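Hangul
Value | Count | Frequency (%)
(not captured) | 505 | 7.7%
(not captured) | 438 | 6.7%
(not captured) | 342 | 5.2%
(not captured) | 332 | 5.1%
(not captured) | 249 | 3.8%
(not captured) | 232 | 3.6%
(not captured) | 164 | 2.5%
(not captured) | 158 | 2.4%
(not captured) | 134 | 2.1%
(not captured) | 114 | 1.7%
Other values (144) | 3864 | 59.2%

ASCII
Value | Count | Frequency (%)
(space) | 133 | 28.7%
) | 119 | 25.6%
( | 119 | 25.6%
/ | 71 | 15.3%
, | 18 | 3.9%
. | 4 | 0.9%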

업소수
Real number (ℝ)

Distinct: 90
Distinct (%): 7.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.452936
Minimum: 1
Maximum: 760
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.1 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 3
Q3: 9
95th percentile: 44.8
Maximum: 760
Range: 759
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 34.644479
Coefficient of variation (CV): 3.0249429
Kurtosis: 199.28608
Mean: 11.452936
Median Absolute Deviation (MAD): 2
Skewness: 11.5236
Sum: 14236
Variance: 1200.2399
Monotonicity: Not monotonic
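
Several of the statistics for 업소수 are simple functions of the others. The sketch below recomputes them from the reported figures as a consistency check; it does not reproduce how ydata-profiling computes them internally, and the variable names are illustrative.

```python
# Figures copied from this report's statistics for 업소수.
n = 1243
mean = 11.452936
std = 34.644479
q1, q3 = 1, 9
minimum, maximum = 1, 760

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance is the squared standard deviation
total = mean * n                 # Sum recovered from mean and count

print(value_range, iqr, round(cv, 4), round(variance, 2), round(total))
```

A coefficient of variation around 3, together with a median of 3 against a maximum of 760, points to a strongly right-skewed distribution, in line with the reported skewness and kurtosis.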
[Figure: histogram with fixed-size bins (bins=50)]

Common values

Value | Count | Frequency (%)
1 | 362 | 29.1%
2 | 193 | 15.5%
3 | 115 | 9.3%
4 | 82 | 6.6%
5 | 65 | 5.2%
7 | 49 | 3.9%
6 | 37 | 3.0%
9 | 28 | 2.3%
8 | 25 | 2.0%
10 | 23 | 1.9%
Other values (80) | 264 | 21.2%

Extreme values (10 smallest)

Value | Count | Frequency (%)
1 | 362 | 29.1%
2 | 193 | 15.5%
3 | 115 | 9.3%
4 | 82 | 6.6%
5 | 65 | 5.2%
6 | 37 | 3.0%
7 | 49 | 3.9%
8 | 25 | 2.0%
9 | 28 | 2.3%
10 | 23 | 1.9%

Extreme values (10 largest)

Value | Count | Frequency (%)
760 | 1 | 0.1%
393 | 1 | 0.1%
281 | 1 | 0.1%
273 | 1 | 0.1%
261 | 1 | 0.1%
250 | 1 | 0.1%
231 | 1 | 0.1%
219 | 1 | 0.1%
165 | 1 | 0.1%
155 | 1 | 0.1%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]
Variable | 법정동명 | 업태명 | 업소수
법정동명 | 1.000 | 0.000 | 0.143
업태명 | 0.000 | 1.000 | 0.000
업소수 | 0.143 | 0.000 | 1.000

[Figure: correlation heatmap]
Variable | 업소수 | 법정동명
업소수 | 1.000 | 0.061
법정동명 | 0.061 | 1.000

Missing values

[Figure: nullity count by column] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

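First rows

Index | 자치구명 | 법정동명 | 업태명 | 업소수
0 | 영등포구 | 영등포동 | 한식 | 81
1 | 영등포구 | 영등포동 | 중국식 | 12
2 | 영등포구 | 영등포동 | 경양식 | 7
3 | 영등포구 | 영등포동 | 일식 | 3
4 | 영등포구 | 영등포동 | 분식 | 15
5 | 영등포구 | 영등포동 | 정종/대포집/소주방 | 4
6 | 영등포구 | 영등포동 | 호프/통닭 | 7
7 | 영등포구 | 영등포동 | 통닭(치킨) | 4
8 | 영등포구 | 영등포동 | 까페 | 1
9 | 영등포구 | 영등포동 | 외국음식전문점(인도,태국등) | 1

Last rows

Index | 자치구명 | 법정동명 | 업태명 | 업소수
1233 | 영등포구 | 대림동 | 집단급식소 식품판매업 | 3
1234 | 영등포구 | 대림동 | 건강기능식품수입업 | 20
1235 | 영등포구 | 대림동 | 영업장판매 | 56
1236 | 영등포구 | 대림동 | 방문판매 | 28
1237 | 영등포구 | 대림동 | 전자상거래(통신판매업) | 98
1238 | 영등포구 | 대림동 | <NA> | 3
1239 | 영등포구 | 대림동 | 다단계판매 | 5
1240 | 영등포구 | 대림동 | 도매업(유통) | 4
1241 | 영등포구 | 대림동 | 기타 건강기능식품일반판매업 | 6
1242 | 영등포구 | 대림동 | 건강기능식품유통전문판매업 | 8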

Duplicate rows

Most frequently occurring

Index | 자치구명 | 법정동명 | 업태명 | 업소수 | # duplicates
0 | 영등포구 | 당산동4가 | 패스트푸드 | 1 | 2
1 | 영등포구 | 문래동3가 | 키즈카페 | 1 | 2
2 | 영등포구 | 문래동6가 | 패스트푸드 | 1 | 2
3 | 영등포구 | 여의도동 | 전통찻집 | 1 | 2
4 | 영등포구 | 영등포동7가 | 패스트푸드 | 1 | 2
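
A table like the one above can be approximated by counting identical records. The sketch below uses `collections.Counter` on a few illustrative rows; the rows are hand-made for the example, and the approach is an assumption about how such a listing could be built, not the tool's own implementation.

```python
from collections import Counter

# Illustrative rows: (자치구명, 법정동명, 업태명, 업소수).
rows = [
    ("영등포구", "당산동4가", "패스트푸드", 1),
    ("영등포구", "여의도동", "전통찻집", 1),
    ("영등포구", "당산동4가", "패스트푸드", 1),
    ("영등포구", "영등포동", "한식", 81),
    ("영등포구", "여의도동", "전통찻집", 1),
]

counts = Counter(rows)

# Keep only records that occur more than once, most frequent first.
duplicated = [(row, n) for row, n in counts.most_common() if n > 1]

# Rows beyond the first copy of each duplicated record.
extra_rows = sum(n - 1 for _, n in duplicated)

for row, n in duplicated:
    print(row, n)
print("Extra duplicate rows:", extra_rows)
```

Dropping the extra copies before aggregating 업소수 avoids double-counting establishments for those district and business-type combinations.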