Overview

Dataset statistics

Number of variables: 4
Number of observations: 1020
Missing cells: 18
Missing cells (%): 0.4%
Duplicate rows: 6
Duplicate rows (%): 0.6%
Total size in memory: 33.0 KiB
Average record size in memory: 33.1 B
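
The percentages above follow from the raw counts. The sketch below recomputes them, assuming (the report does not state this) that the missing-cell percentage is taken over all cells (rows times columns), that the duplicate percentage is taken over all rows, and that the average record size is the total size divided by the number of rows.

```python
# Counts copied from the dataset statistics above.
n_variables = 4
n_observations = 1020
missing_cells = 18
duplicate_rows = 6
total_size_kib = 33.0

# Assumption: missing cells are measured against every cell in the table.
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Assumption: duplicate rows are measured against the number of rows.
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: average record size = total size (KiB converted to bytes) / rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the three values round to the 0.4%, 0.6% and 33.1 B reported above.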

Variable types

Categorical: 2
Text: 1
Numeric: 1

Dataset

Description: 자치구명,법정동명,업태명,업소수 (district name, legal-dong name, business type, number of establishments)
Author: 용산구 (Yongsan-gu)
URL: https://data.seoul.go.kr/dataList/OA-11222/S/1/datasetView.do

Alerts

자치구명 has constant value "용산구" (Constant)
Dataset has 6 (0.6%) duplicate rows (Duplicates)
업태명 has 18 (1.8%) missing values (Missing)
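
The three alert types above can be reproduced with simple checks. The following is a minimal sketch, not the profiler's own implementation: the rows are hypothetical sample data, `find_alerts` is a name made up for illustration, and the duplicate count is assumed to be the number of distinct rows that occur more than once.

```python
from collections import Counter

# Hypothetical rows in (자치구명, 법정동명, 업태명, 업소수) order; None marks a missing value.
columns = ["자치구명", "법정동명", "업태명", "업소수"]
rows = [
    ("용산구", "후암동", "한식", 63),
    ("용산구", "후암동", "중국식", 5),
    ("용산구", "한남동", None, 1),
    ("용산구", "한남동", None, 1),
    ("용산구", "보광동", "패스트푸드", 1),
]


def find_alerts(rows, columns):
    """Return alert strings for constant columns, missing values and duplicate rows."""
    alerts = []
    n = len(rows)
    for i, name in enumerate(columns):
        values = [row[i] for row in rows]
        present = [v for v in values if v is not None]
        n_missing = n - len(present)
        # A column is flagged as constant when it holds a single distinct value.
        if len(set(present)) == 1:
            alerts.append(f'{name} has constant value "{present[0]}"')
        if n_missing:
            pct = 100 * n_missing / n
            alerts.append(f"{name} has {n_missing} ({pct:.1f}%) missing values")
    # Assumption: count each distinct row that appears more than once.
    n_dup = sum(1 for count in Counter(rows).values() if count > 1)
    if n_dup:
        alerts.append(f"Dataset has {n_dup} ({100 * n_dup / n:.1f}%) duplicate rows")
    return alerts


for alert in find_alerts(rows, columns):
    print(alert)
```

On the hypothetical rows this flags 자치구명 as constant, 업태명 as having missing values, and one duplicated row.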

Reproduction

Analysis started: 2024-05-18 06:57:14.671139
Analysis finished: 2024-05-18 06:57:16.498570
Duration: 1.83 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

자치구명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.1 KiB

Value | Count
용산구 | 1020

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%
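
Distinct and Unique measure different things, which is why this column can have one distinct value and zero unique values. The sketch below assumes that Distinct counts the different values present and that Unique counts the values occurring exactly once; `distinct_and_unique` is a hypothetical helper name.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    n_distinct = len(counts)
    n_unique = sum(1 for count in counts.values() if count == 1)
    return n_distinct, n_unique


# 자치구명 holds the same value in all 1020 rows.
print(distinct_and_unique(["용산구"] * 1020))  # (1, 0)
```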

Sample

1st row: 용산구
2nd row: 용산구
3rd row: 용산구
4th row: 용산구
5th row: 용산구

Common Values

Value | Count | Frequency (%)
용산구 | 1020 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]
Value | Count | Frequency (%)
용산구 | 1020 | 100.0%

법정동명
Categorical

Distinct: 36
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Memory size: 8.1 KiB

Value | Count
한남동 | 57
한강로2가 | 51
이태원동 | 50
한강로3가 | 47
이촌동 | 43
Other values (31) | 772

Length

Max length: 5
Median length: 4
Mean length: 3.9480392
Min length: 3

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 후암동
2nd row: 후암동
3rd row: 후암동
4th row: 후암동
5th row: 후암동

Common Values

Value | Count | Frequency (%)
한남동 | 57 | 5.6%
한강로2가 | 51 | 5.0%
이태원동 | 50 | 4.9%
한강로3가 | 47 | 4.6%
이촌동 | 43 | 4.2%
동자동 | 42 | 4.1%
후암동 | 41 | 4.0%
서계동 | 39 | 3.8%
갈월동 | 38 | 3.7%
보광동 | 38 | 3.7%
Other values (26) | 574 | 56.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
한남동 | 57 | 5.6%
한강로2가 | 51 | 5.0%
이태원동 | 50 | 4.9%
한강로3가 | 47 | 4.6%
이촌동 | 43 | 4.2%
동자동 | 42 | 4.1%
후암동 | 41 | 4.0%
서계동 | 39 | 3.8%
원효로2가 | 38 | 3.7%
보광동 | 38 | 3.7%
Other values (26) | 574 | 56.3%

업태명
Text

MISSING 

Distinct: 76
Distinct (%): 7.6%
Missing: 18
Missing (%): 1.8%
Memory size: 8.1 KiB

Length

Max length: 15
Median length: 12
Mean length: 5.8802395
Min length: 2

Characters and Unicode

Total characters: 5892
Distinct characters: 157
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
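
As an illustration of this kind of analysis, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The sample strings are values of 업태명 that appear elsewhere in this report; `CATEGORY_NAMES` and `category_counts` are hypothetical names, and the mapping covers only the five categories listed in this report.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Zs": "Space Separator",
}


def category_counts(texts):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for text in texts:
        for ch in text:
            code = unicodedata.category(ch)
            # Fall back to the two-letter code for categories not named above.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


samples = ["통닭(치킨)", "정종/대포집/소주방"]
for name, count in category_counts(samples).most_common():
    print(name, count)
```

Hangul syllables fall under Other Letter, which is consistent with that category dominating the tables below.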

Unique

Unique: 10
Unique (%): 1.0%

Sample

1st row: 한식
2nd row: 중국식
3rd row: 경양식
4th row: 일식
5th row: 분식

Value | Count | Frequency (%)
기타 | 96 | 8.7%
식품등 | 33 | 3.0%
수입판매업 | 33 | 3.0%
커피숍 | 33 | 3.0%
한식 | 33 | 3.0%
전자상거래(통신판매업 | 32 | 2.9%
편의점 | 32 | 2.9%
식품자동판매기영업 | 31 | 2.8%
경양식 | 31 | 2.8%
휴게음식점 | 31 | 2.8%
Other values (67) | 722 | 65.2%

Most occurring characters

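Value | Count | Frequency (%)
(not shown) | 425 | 7.2%
(not shown) | 355 | 6.0%
(not shown) | 282 | 4.8%
(not shown) | 275 | 4.7%
(not shown) | 201 | 3.4%
(not shown) | 171 | 2.9%
(not shown) | 144 | 2.4%
(not shown) | 137 | 2.3%
) | 130 | 2.2%
( | 130 | 2.2%
Other values (147) | 3642 | 61.8%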

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 5442 | 92.4%
Close Punctuation | 130 | 2.2%
Open Punctuation | 130 | 2.2%
Space Separator | 105 | 1.8%
Other Punctuation | 85 | 1.4%

Most frequent character per category

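Other Letter
Value | Count | Frequency (%)
(not shown) | 425 | 7.8%
(not shown) | 355 | 6.5%
(not shown) | 282 | 5.2%
(not shown) | 275 | 5.1%
(not shown) | 201 | 3.7%
(not shown) | 171 | 3.1%
(not shown) | 144 | 2.6%
(not shown) | 137 | 2.5%
(not shown) | 114 | 2.1%
(not shown) | 111 | 2.0%
Other values (141) | 3227 | 59.3%

Other Punctuation
Value | Count | Frequency (%)
/ | 57 | 67.1%
, | 27 | 31.8%
. | 1 | 1.2%

Close Punctuation
Value | Count | Frequency (%)
) | 130 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 130 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 105 | 100.0%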

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 5442 | 92.4%
Common | 450 | 7.6%

Most frequent character per script

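Hangul
Value | Count | Frequency (%)
(not shown) | 425 | 7.8%
(not shown) | 355 | 6.5%
(not shown) | 282 | 5.2%
(not shown) | 275 | 5.1%
(not shown) | 201 | 3.7%
(not shown) | 171 | 3.1%
(not shown) | 144 | 2.6%
(not shown) | 137 | 2.5%
(not shown) | 114 | 2.1%
(not shown) | 111 | 2.0%
Other values (141) | 3227 | 59.3%

Common
Value | Count | Frequency (%)
) | 130 | 28.9%
( | 130 | 28.9%
(space) | 105 | 23.3%
/ | 57 | 12.7%
, | 27 | 6.0%
. | 1 | 0.2%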

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 5442 | 92.4%
ASCII | 450 | 7.6%

Most frequent character per block

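Hangul
Value | Count | Frequency (%)
(not shown) | 425 | 7.8%
(not shown) | 355 | 6.5%
(not shown) | 282 | 5.2%
(not shown) | 275 | 5.1%
(not shown) | 201 | 3.7%
(not shown) | 171 | 3.1%
(not shown) | 144 | 2.6%
(not shown) | 137 | 2.5%
(not shown) | 114 | 2.1%
(not shown) | 111 | 2.0%
Other values (141) | 3227 | 59.3%

ASCII
Value | Count | Frequency (%)
) | 130 | 28.9%
( | 130 | 28.9%
(space) | 105 | 23.3%
/ | 57 | 12.7%
, | 27 | 6.0%
. | 1 | 0.2%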

업소수
Real number (ℝ)

Distinct: 70
Distinct (%): 6.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.7901961
Minimum: 1
Maximum: 306
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 9.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 3
Q3: 7
95-th percentile: 36.05
Maximum: 306
Range: 305
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 21.908389
Coefficient of variation (CV): 2.4923664
Kurtosis: 73.089499
Mean: 8.7901961
Median Absolute Deviation (MAD): 2
Skewness: 7.4410145
Sum: 8966
Variance: 479.97753
Monotonicity: Not monotonic
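
Several of the statistics above are related by standard definitions, so they can be cross-checked against each other. The sketch below uses only the figures printed in this report (the full data is not needed) and the usual textbook relations; the tolerance is an arbitrary choice to absorb rounding in the printed values.

```python
import math

# Figures copied from the quantile and descriptive statistics above.
n = 1020
total = 8966
mean = 8.7901961
std = 21.908389
variance = 479.97753
cv = 2.4923664
minimum, q1, q3, maximum = 1, 1, 7, 306
reported_range = 305
reported_iqr = 6

# Each entry maps a relation to whether the reported figures satisfy it.
checks = {
    "mean = sum / n": math.isclose(total / n, mean, rel_tol=1e-5),
    "variance = std ** 2": math.isclose(std ** 2, variance, rel_tol=1e-5),
    "CV = std / mean": math.isclose(std / mean, cv, rel_tol=1e-5),
    "range = max - min": maximum - minimum == reported_range,
    "IQR = Q3 - Q1": q3 - q1 == reported_iqr,
}

for relation, ok in checks.items():
    print(f"{relation}: {'consistent' if ok else 'inconsistent'}")
```

A standard deviation about 2.5 times the mean, together with the high skewness and kurtosis, points to a long right tail: most rows have small counts and a few have very large ones.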
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
1 | 350 | 34.3%
2 | 149 | 14.6%
3 | 103 | 10.1%
4 | 58 | 5.7%
5 | 53 | 5.2%
6 | 34 | 3.3%
7 | 32 | 3.1%
8 | 23 | 2.3%
11 | 18 | 1.8%
9 | 16 | 1.6%
Other values (60) | 184 | 18.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 350 | 34.3%
2 | 149 | 14.6%
3 | 103 | 10.1%
4 | 58 | 5.7%
5 | 53 | 5.2%
6 | 34 | 3.3%
7 | 32 | 3.1%
8 | 23 | 2.3%
9 | 16 | 1.6%
10 | 15 | 1.5%

Maximum 10 values

Value | Count | Frequency (%)
306 | 1 | 0.1%
264 | 1 | 0.1%
198 | 1 | 0.1%
191 | 1 | 0.1%
190 | 1 | 0.1%
177 | 1 | 0.1%
165 | 1 | 0.1%
155 | 1 | 0.1%
105 | 1 | 0.1%
94 | 1 | 0.1%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

         | 법정동명 | 업태명 | 업소수
법정동명 | 1.000 | 0.000 | 0.000
업태명 | 0.000 | 1.000 | 0.000
업소수 | 0.000 | 0.000 | 1.000

[Figure: correlation heatmap]

         | 업소수 | 법정동명
업소수 | 1.000 | 0.000
법정동명 | 0.000 | 1.000

Missing values

[Figure: nullity count by column]
A simple visualization of nullity by column.

[Figure: nullity matrix]
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

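First rows

     | 자치구명 | 법정동명 | 업태명 | 업소수
0 | 용산구 | 후암동 | 한식 | 63
1 | 용산구 | 후암동 | 중국식 | 5
2 | 용산구 | 후암동 | 경양식 | 26
3 | 용산구 | 후암동 | 일식 | 16
4 | 용산구 | 후암동 | 분식 | 18
5 | 용산구 | 후암동 | 정종/대포집/소주방 | 3
6 | 용산구 | 후암동 | 호프/통닭 | 7
7 | 용산구 | 후암동 | 통닭(치킨) | 2
8 | 용산구 | 후암동 | 김밥(도시락) | 1
9 | 용산구 | 후암동 | 회집 | 1

Last rows

     | 자치구명 | 법정동명 | 업태명 | 업소수
1010 | 용산구 | 보광동 | 유통전문판매업 | 3
1011 | 용산구 | 보광동 | 기타식품판매업 | 1
1012 | 용산구 | 보광동 | 위탁급식영업 | 1
1013 | 용산구 | 보광동 | 제과점영업 | 2
1014 | 용산구 | 보광동 | 건강기능식품수입업 | 1
1015 | 용산구 | 보광동 | 영업장판매 | 3
1016 | 용산구 | 보광동 | 방문판매 | 3
1017 | 용산구 | 보광동 | 전자상거래(통신판매업) | 12
1018 | 용산구 | 보광동 | 다단계판매 | 1
1019 | 용산구 | 보광동 | 건강기능식품유통전문판매업 | 1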

Duplicate rows

Most frequently occurring

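     | 자치구명 | 법정동명 | 업태명 | 업소수 | # duplicates
0 | 용산구 | 보광동 | 패스트푸드 | 1 | 2
1 | 용산구 | 용산동2가 | 패스트푸드 | 2 | 2
2 | 용산구 | 원효로2가 | 패스트푸드 | 1 | 2
3 | 용산구 | 청파동1가 | <NA> | 1 | 2
4 | 용산구 | 한강로2가 | 패스트푸드 | 2 | 2
5 | 용산구 | 한남동 | <NA> | 1 | 2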