Overview

Dataset statistics

Number of variables: 4
Number of observations: 769
Missing cells: 14
Missing cells (%): 0.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 24.9 KiB
Average record size in memory: 33.2 B
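The two derived figures in this list can be checked from the others. The sketch below assumes the missing-cell percentage is taken over all cells (rows times columns) and that 1 KiB is 1024 bytes; the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" list.
n_variables = 4
n_observations = 769
missing_cells = 14
total_size_kib = 24.9

# Share of missing cells over all cells (rows x columns).
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Average bytes per row, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Both values round to the figures reported above (0.5% and 33.2 B).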

Variable types

Categorical: 2
Text: 1
Numeric: 1

Dataset

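Description: 자치구명, 법정동명, 업태명, 업소수 (district name, legal dong name, business type name, number of establishments)
Author: 강남구 (Gangnam-gu)
URL: https://data.seoul.go.kr/dataList/OA-11299/S/1/datasetView.do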

Alerts

자치구명 has constant value "강남구" (Constant)
업태명 has 14 (1.8%) missing values (Missing)

Reproduction

Analysis started: 2024-05-03 20:17:32.225017
Analysis finished: 2024-05-03 20:17:33.263375
Duration: 1.04 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
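The reported duration follows from the two timestamps. A minimal check using only the standard library, with the timestamps copied from this section:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" section.
started = datetime.fromisoformat("2024-05-03 20:17:32.225017")
finished = datetime.fromisoformat("2024-05-03 20:17:33.263375")

# Elapsed wall-clock time in seconds.
elapsed_seconds = (finished - started).total_seconds()
print(f"Duration: {elapsed_seconds:.2f} seconds")
```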

Variables

자치구명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.1 KiB
강남구: 769

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강남구
2nd row: 강남구
3rd row: 강남구
4th row: 강남구
5th row: 강남구

Common Values

Value   Count  Frequency (%)
강남구  769    100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; 강남구 769 (100.0%)]

법정동명
Categorical

Distinct: 19
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Memory size: 6.1 KiB
역삼동: 74
논현동: 71
신사동: 66
대치동: 66
삼성동: 65
Other values (14): 427

Length

Max length: 4
Median length: 3
Mean length: 3.0572172
Min length: 3

Unique

Unique: 3
Unique (%): 0.4%

Sample

1st row: 무악동
2nd row: 가산동
3rd row: <NA>
4th row: <NA>
5th row: 역삼동

Common Values

Value             Count  Frequency (%)
역삼동            74     9.6%
논현동            71     9.2%
신사동            66     8.6%
대치동            66     8.6%
삼성동            65     8.5%
청담동            63     8.2%
개포동            59     7.7%
도곡동            56     7.3%
일원동            50     6.5%
수서동            49     6.4%
Other values (9)  150    19.5%

Length

[Figure: Histogram of lengths of the category]

업태명
Text

MISSING 

Distinct: 82
Distinct (%): 10.9%
Missing: 14
Missing (%): 1.8%
Memory size: 6.1 KiB
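The percentages here appear to use different denominators: the missing share matches a division by all 769 rows, while the distinct share matches a division by the 755 non-missing values. That reading is an inference from the reported numbers, not something the report states. A short check with illustrative variable names:

```python
# Counts copied from the 업태명 summary and the dataset overview.
n_rows = 769
n_missing = 14
n_distinct = 82

n_present = n_rows - n_missing  # 755 non-missing values

missing_pct = 100 * n_missing / n_rows
distinct_pct_present = 100 * n_distinct / n_present  # over non-missing values
distinct_pct_all = 100 * n_distinct / n_rows          # over all rows

print(f"Missing (%): {missing_pct:.1f}%")
print(f"Distinct (%) over non-missing values: {distinct_pct_present:.1f}%")
print(f"Distinct (%) over all rows: {distinct_pct_all:.1f}%")
```

Only the non-missing denominator reproduces the reported 10.9%.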

Length

Max length: 15
Median length: 12
Mean length: 5.7364238
Min length: 2

Characters and Unicode

Total characters: 4331
Distinct characters: 162
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
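As a rough illustration of how general categories can be read per character, the sketch below uses Python's standard `unicodedata` module on two 업태명 values that appear in the sample rows of this report. It is not the profiler's own implementation, and the mapping from category codes to display names covers only the six categories listed in this report.

```python
import unicodedata
from collections import Counter

# Two 업태명 values taken from the sample rows of this report.
values = ["전자상거래(통신판매업)", "기타(복합 등)"]

# Display names for the six general categories that occur in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Zs": "Space Separator",
}

# Count the two-letter general category code of every character.
category_counts = Counter(
    unicodedata.category(ch) for value in values for ch in value
)

# Translate the codes into the display names used above.
named_counts = {CATEGORY_NAMES[code]: n for code, n in category_counts.items()}
print(named_counts)
```

Hangul syllables fall under "Other Letter" (`Lo`), which is why that category dominates the counts for this column.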

Unique

Unique: 10
Unique (%): 1.3%

Sample

1st row: 식품등 수입판매업
2nd row: 식품등 수입판매업
3rd row: 건강기능식품수입업
4th row: 한식
5th row: 중국식

Value               Count  Frequency (%)
기타                59     7.0%
패스트푸드          25     3.0%
식품제조가공업      22     2.6%
집단급식소          20     2.4%
식품등              16     1.9%
수입판매업          16     1.9%
영업장판매          15     1.8%
건강기능식품수입업  14     1.7%
유통전문판매업      14     1.7%
방문판매            14     1.7%
Other values (74)   622    74.3%
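The counts in this table appear to be per whitespace-separated word rather than per cell: 식품등 and 수입판매업 are listed separately, and the counts sum to 837 rather than 769. That is an inference from the numbers shown. A minimal sketch of such a tally, using a handful of sample values rather than the full column:

```python
from collections import Counter

# A few 업태명 values from the report's sample rows (not the full column).
values = ["식품등 수입판매업", "식품등 수입판매업", "건강기능식품수입업", "한식", "중국식"]

# Split each value on whitespace and tally the resulting words.
word_counts = Counter(word for value in values for word in value.split())
total_words = sum(word_counts.values())
for word, count in word_counts.most_common():
    print(word, count, f"{100 * count / total_words:.1f}%")

# Consistency check against the table above: frequencies are count / total.
table_counts = [59, 25, 22, 20, 16, 16, 15, 14, 14, 14, 622]
table_total = sum(table_counts)
top_pct = 100 * table_counts[0] / table_total
print(table_total, f"{top_pct:.1f}%")
```

Dividing 59 by the table total of 837 gives the reported 7.0% for 기타.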

Most occurring characters

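Value               Count  Frequency (%)
(not preserved)     285    6.6%
(not preserved)     250    5.8%
(not preserved)     195    4.5%
(not preserved)     184    4.2%
(not preserved)     137    3.2%
(not preserved)     136    3.1%
)                   94     2.2%
(                   94     2.2%
(not preserved)     87     2.0%
(not preserved)     87     2.0%
Other values (152)  2782   64.2%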

Most occurring categories

Value              Count  Frequency (%)
Other Letter       4014   92.7%
Close Punctuation  94     2.2%
Open Punctuation   94     2.2%
Space Separator    82     1.9%
Other Punctuation  45     1.0%
Uppercase Letter   2      < 0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
(not preserved)     285    7.1%
(not preserved)     250    6.2%
(not preserved)     195    4.9%
(not preserved)     184    4.6%
(not preserved)     137    3.4%
(not preserved)     136    3.4%
(not preserved)     87     2.2%
(not preserved)     87     2.2%
(not preserved)     82     2.0%
(not preserved)     67     1.7%
Other values (144)  2504   62.4%

Other Punctuation
Value  Count  Frequency (%)
/      32     71.1%
,      12     26.7%
.      1      2.2%

Uppercase Letter
Value  Count  Frequency (%)
P      1      50.0%
B      1      50.0%

Close Punctuation
Value  Count  Frequency (%)
)      94     100.0%

Open Punctuation
Value  Count  Frequency (%)
(      94     100.0%

Space Separator
Value    Count  Frequency (%)
(space)  82     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  4014   92.7%
Common  315    7.3%
Latin   2      < 0.1%

Most frequent character per script

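Hangul
Value               Count  Frequency (%)
(not preserved)     285    7.1%
(not preserved)     250    6.2%
(not preserved)     195    4.9%
(not preserved)     184    4.6%
(not preserved)     137    3.4%
(not preserved)     136    3.4%
(not preserved)     87     2.2%
(not preserved)     87     2.2%
(not preserved)     82     2.0%
(not preserved)     67     1.7%
Other values (144)  2504   62.4%

Common
Value    Count  Frequency (%)
)        94     29.8%
(        94     29.8%
(space)  82     26.0%
/        32     10.2%
,        12     3.8%
.        1      0.3%

Latin
Value  Count  Frequency (%)
P      1      50.0%
B      1      50.0%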

Most occurring blocks

Value   Count  Frequency (%)
Hangul  4014   92.7%
ASCII   317    7.3%

Most frequent character per block

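Hangul
Value               Count  Frequency (%)
(not preserved)     285    7.1%
(not preserved)     250    6.2%
(not preserved)     195    4.9%
(not preserved)     184    4.6%
(not preserved)     137    3.4%
(not preserved)     136    3.4%
(not preserved)     87     2.2%
(not preserved)     87     2.2%
(not preserved)     82     2.0%
(not preserved)     67     1.7%
Other values (144)  2504   62.4%

ASCII
Value    Count  Frequency (%)
)        94     29.7%
(        94     29.7%
(space)  82     25.9%
/        32     10.1%
,        12     3.8%
P        1      0.3%
B        1      0.3%
.        1      0.3%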

업소수
Real number (ℝ)

Distinct: 145
Distinct (%): 18.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.49935
Minimum: 1
Maximum: 1442
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 7
Q3: 23
95-th percentile: 181.8
Maximum: 1442
Range: 1441
Interquartile range (IQR): 21
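The last two entries are derived from the quantiles listed before them. A small check, with the quantile values copied from this list and variable names chosen for illustration:

```python
# Quantiles copied from the "Quantile statistics" list.
minimum, q1, median, q3, maximum = 1, 2, 7, 23, 1442

# Range is the spread between the extremes; IQR is the spread of the middle half.
value_range = maximum - minimum
iqr = q3 - q1

print(f"Range: {value_range}")
print(f"Interquartile range (IQR): {iqr}")
```

The gap between the IQR (21) and the range (1441) reflects how far the largest values sit from the bulk of the data.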

Descriptive statistics

Standard deviation: 99.593067
Coefficient of variation (CV): 2.6558612
Kurtosis: 66.333451
Mean: 37.49935
Median Absolute Deviation (MAD): 6
Skewness: 6.7008879
Sum: 28837
Variance: 9918.779
Monotonicity: Not monotonic
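Several of these statistics are tied together by simple identities: the mean is the sum divided by the number of observations, the variance is the square of the standard deviation, and the coefficient of variation is the standard deviation divided by the mean. A short sketch that checks the reported figures against each other, taking the observation count of 769 from the dataset overview:

```python
# Figures copied from the report.
total = 28837          # Sum
n = 769                # Number of observations
std = 99.593067        # Standard deviation

mean = total / n       # arithmetic mean
variance = std ** 2    # variance is the squared standard deviation
cv = std / mean        # coefficient of variation

print(f"Mean: {mean:.5f}")
print(f"Variance: {variance:.3f}")
print(f"CV: {cv:.5f}")
```

The results agree with the reported mean, variance, and CV up to rounding.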
[Figure: Histogram with fixed size bins (bins=50)]

Value               Count  Frequency (%)
1                   136    17.7%
2                   67     8.7%
3                   60     7.8%
5                   46     6.0%
4                   40     5.2%
6                   28     3.6%
7                   27     3.5%
9                   24     3.1%
10                  15     2.0%
12                  14     1.8%
Other values (135)  312    40.6%

Minimum 10 values
Value  Count  Frequency (%)
1      136    17.7%
2      67     8.7%
3      60     7.8%
4      40     5.2%
5      46     6.0%
6      28     3.6%
7      27     3.5%
8      14     1.8%
9      24     3.1%
10     15     2.0%

Maximum 10 values
Value  Count  Frequency (%)
1442   1      0.1%
836    1      0.1%
766    1      0.1%
612    1      0.1%
597    1      0.1%
579    1      0.1%
546    1      0.1%
493    1      0.1%
468    1      0.1%
456    1      0.1%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation plot]

          법정동명  업태명  업소수
법정동명  1.000     0.000   0.000
업태명    0.000     1.000   0.000
업소수    0.000     0.000   1.000

[Figure: correlation plot]

          업소수  법정동명
업소수    1.000   0.000
법정동명  0.000   1.000

Missing values

[Figure: nullity count] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows
     자치구명  법정동명  업태명                        업소수
0    강남구    무악동    식품등 수입판매업              1
1    강남구    가산동    <NA>                          1
2    강남구    <NA>      식품등 수입판매업              10
3    강남구    <NA>      건강기능식품수입업            1
4    강남구    역삼동    한식                          1442
5    강남구    역삼동    중국식                        125
6    강남구    역삼동    경양식                        468
7    강남구    역삼동    일식                          214
8    강남구    역삼동    분식                          209
9    강남구    역삼동    뷔페식                        14

Last rows
     자치구명  법정동명  업태명                        업소수
759  강남구    도곡동    방문판매                      9
760  강남구    도곡동    전화권유판매                  1
761  강남구    도곡동    전자상거래(통신판매업)        94
762  강남구    도곡동    <NA>                          5
763  강남구    도곡동    다단계판매                    2
764  강남구    도곡동    도매업(유통)                  9
765  강남구    도곡동    기타(복합 등)                 1
766  강남구    도곡동    기타 건강기능식품일반판매업   2
767  강남구    도곡동    건강기능식품유통전문판매업    25
768  강남구    풍덕천동  영업장판매                    2