Overview

Dataset statistics

Number of variables: 5
Number of observations: 884
Missing cells: 84
Missing cells (%): 1.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 34.7 KiB
Average record size in memory: 40.1 B
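
The derived figures in these statistics can be checked from the raw counts. The following is a minimal sketch in Python; the variable names are illustrative, and the per-variable missing counts are the ones listed in the Variables section of this report. Because it starts from the rounded 34.7 KiB, the average record size comes out near 40.2 B rather than the reported 40.1 B, which presumably derives from an unrounded byte count.

```python
# Figures copied from the dataset statistics in this report.
n_variables = 5
n_observations = 884
missing_cells = 84
total_size_kib = 34.7

# Missing counts per variable, as listed in the Variables section.
per_variable_missing = [0, 2, 3, 79, 0]

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 1 KiB = 1024 bytes; the input is rounded, so this is approximate.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells (%): {missing_pct:.1f}")
print(f"average record size (B): {avg_record_bytes:.1f}")
print(f"sum of per-variable missing: {sum(per_variable_missing)}")
```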

Variable types

Categorical: 2
Unsupported: 1
Text: 2

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-1100/F/1/datasetView.do

Alerts

서울풍물시장 점포현황(취급품목) is highly overall correlated with Unnamed: 4 [High correlation]
Unnamed: 4 is highly overall correlated with 서울풍물시장 점포현황(취급품목) [High correlation]
Unnamed: 3 has 79 (8.9%) missing values [Missing]
Unnamed: 1 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]

Reproduction

Analysis started: 2023-12-11 08:20:34.213677
Analysis finished: 2023-12-11 08:20:34.980180
Duration: 0.77 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

서울풍물시장 점포현황(취급품목)
Categorical

HIGH CORRELATION 

Distinct: 50
Distinct (%): 5.7%
Missing: 0
Missing (%): 0.0%
Memory size: 7.0 KiB

구제의류: 150
골동품: 119
<NA>: 87
의류: 75
잡화: 73
Other values (45): 380

Length

Max length: 5
Median length: 4
Mean length: 3.188914
Min length: 2

Unique

Unique: 12
Unique (%): 1.4%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: 코드구분
4th row: 식당
5th row: 식당

Common Values

Value | Count | Frequency (%)
구제의류 | 150 | 17.0%
골동품 | 119 | 13.5%
<NA> | 87 | 9.8%
의류 | 75 | 8.5%
잡화 | 73 | 8.3%
식당 | 54 | 6.1%
패션잡화 | 40 | 4.5%
레저용품 | 39 | 4.4%
음반 | 32 | 3.6%
구제잡화 | 30 | 3.4%
Other values (40) | 185 | 20.9%

Length

Histogram of lengths of the category

Unnamed: 1
Unsupported

REJECTED  UNSUPPORTED 

Missing: 2
Missing (%): 0.2%
Memory size: 7.0 KiB

Unnamed: 2
Text

Distinct: 881
Distinct (%): 100.0%
Missing: 3
Missing (%): 0.3%
Memory size: 7.0 KiB

Length

Max length: 10
Median length: 10
Mean length: 9.9909194
Min length: 2

Characters and Unicode

Total characters: 8802
Distinct characters: 29
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
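
As an illustration of how such character properties can be tallied, here is a minimal sketch using Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical and this is not necessarily how the profiling tool computes its figures. The input is one location value taken from the Sample section; the two-letter codes `Lo`, `Nd`, and `Zs` correspond to the Other Letter, Decimal Number, and Space Separator categories named in this report.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in the values."""
    counts = Counter()
    for value in values:
        # Normalize to NFC so each Hangul syllable is a single code point.
        for ch in unicodedata.normalize("NFC", value):
            counts[unicodedata.category(ch)] += 1
    return counts


counts = category_counts(["1층 빨강동001호"])
for code, n in counts.most_common():
    print(code, n)
```

For this single value, half of the characters are letters, 40% are digits, and 10% are spaces, which matches the proportions the report gives for the whole column.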

Unique

Unique: 881
Unique (%): 100.0%

Sample

1st row: 위치
2nd row: 1층 빨강동001호
3rd row: 1층 빨강동002호
4th row: 1층 빨강동003호
5th row: 1층 빨강동004호

Value | Count | Frequency (%)
2층 | 466 | 26.5%
1층 | 414 | 23.5%
남색동050호 | 1 | 0.1%
남색동049호 | 1 | 0.1%
남색동026호 | 1 | 0.1%
남색동028호 | 1 | 0.1%
남색동052호 | 1 | 0.1%
남색동029호 | 1 | 0.1%
남색동030호 | 1 | 0.1%
남색동031호 | 1 | 0.1%
Other values (873) | 873 | 49.6%

Most occurring characters

Value | Count | Frequency (%)
 | 880 | 10.0%
 | 880 | 10.0%
 | 880 | 10.0%
 | 880 | 10.0%
0 | 858 | 9.7%
1 | 848 | 9.6%
2 | 659 | 7.5%
 | 266 | 3.0%
 | 194 | 2.2%
 | 194 | 2.2%
Other values (19) | 2263 | 25.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4402 | 50.0%
Decimal Number | 3520 | 40.0%
Space Separator | 880 | 10.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 880 | 20.0%
 | 880 | 20.0%
 | 880 | 20.0%
 | 266 | 6.0%
 | 194 | 4.4%
 | 194 | 4.4%
 | 151 | 3.4%
 | 135 | 3.1%
 | 135 | 3.1%
 | 121 | 2.7%
Other values (8) | 566 | 12.9%

Decimal Number
Value | Count | Frequency (%)
0 | 858 | 24.4%
1 | 848 | 24.1%
2 | 659 | 18.7%
3 | 186 | 5.3%
4 | 179 | 5.1%
5 | 171 | 4.9%
6 | 157 | 4.5%
7 | 156 | 4.4%
8 | 155 | 4.4%
9 | 151 | 4.3%

Space Separator
Value | Count | Frequency (%)
(space) | 880 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4402 | 50.0%
Common | 4400 | 50.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 880 | 20.0%
 | 880 | 20.0%
 | 880 | 20.0%
 | 266 | 6.0%
 | 194 | 4.4%
 | 194 | 4.4%
 | 151 | 3.4%
 | 135 | 3.1%
 | 135 | 3.1%
 | 121 | 2.7%
Other values (8) | 566 | 12.9%

Common
Value | Count | Frequency (%)
(space) | 880 | 20.0%
0 | 858 | 19.5%
1 | 848 | 19.3%
2 | 659 | 15.0%
3 | 186 | 4.2%
4 | 179 | 4.1%
5 | 171 | 3.9%
6 | 157 | 3.6%
7 | 156 | 3.5%
8 | 155 | 3.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4402 | 50.0%
ASCII | 4400 | 50.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 880 | 20.0%
0 | 858 | 19.5%
1 | 848 | 19.3%
2 | 659 | 15.0%
3 | 186 | 4.2%
4 | 179 | 4.1%
5 | 171 | 3.9%
6 | 157 | 3.6%
7 | 156 | 3.5%
8 | 155 | 3.5%

Hangul
Value | Count | Frequency (%)
 | 880 | 20.0%
 | 880 | 20.0%
 | 880 | 20.0%
 | 266 | 6.0%
 | 194 | 4.4%
 | 194 | 4.4%
 | 151 | 3.4%
 | 135 | 3.1%
 | 135 | 3.1%
 | 121 | 2.7%
Other values (8) | 566 | 12.9%

Unnamed: 3
Text

MISSING 

Distinct: 387
Distinct (%): 48.1%
Missing: 79
Missing (%): 8.9%
Memory size: 7.0 KiB

Length

Max length: 28
Median length: 21
Mean length: 7.0447205
Min length: 1

Characters and Unicode

Total characters: 5671
Distinct characters: 313
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 217
Unique (%): 27.0%

Sample

1st row: 취급품목
2nd row: 소머리국밥,순대국,돼지머리수육
3rd row: 소머리국밥,순대국,돼지머리수육
4th row: 냉면
5th row: 냉면

Value | Count | Frequency (%)
구제의류 | 51 | 5.8%
잡화 | 45 | 5.1%
의류 | 41 | 4.7%
성인용품 | 14 | 1.6%
신발 | 14 | 1.6%
공구,잡화 | 9 | 1.0%
시계 | 9 | 1.0%
음반 | 8 | 0.9%
골동품 | 8 | 0.9%
가방 | 8 | 0.9%
Other values (388) | 670 | 76.4%

Most occurring characters

Value | Count | Frequency (%)
, | 771 | 13.6%
 | 251 | 4.4%
 | 241 | 4.2%
 | 227 | 4.0%
 | 202 | 3.6%
 | 201 | 3.5%
 | 179 | 3.2%
 | 172 | 3.0%
 | 126 | 2.2%
 | 106 | 1.9%
Other values (303) | 3195 | 56.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4617 | 81.4%
Other Punctuation | 803 | 14.2%
Uppercase Letter | 142 | 2.5%
Space Separator | 72 | 1.3%
Close Punctuation | 18 | 0.3%
Open Punctuation | 18 | 0.3%
Decimal Number | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 251 | 5.4%
 | 241 | 5.2%
 | 227 | 4.9%
 | 202 | 4.4%
 | 201 | 4.4%
 | 179 | 3.9%
 | 172 | 3.7%
 | 126 | 2.7%
 | 106 | 2.3%
 | 104 | 2.3%
Other values (283) | 2808 | 60.8%

Uppercase Letter
Value | Count | Frequency (%)
D | 49 | 34.5%
C | 18 | 12.7%
P | 17 | 12.0%
V | 16 | 11.3%
T | 11 | 7.7%
A | 9 | 6.3%
E | 9 | 6.3%
L | 6 | 4.2%
M | 2 | 1.4%
Y | 2 | 1.4%
Other values (2) | 3 | 2.1%

Other Punctuation
Value | Count | Frequency (%)
, | 771 | 96.0%
. | 32 | 4.0%

Close Punctuation
Value | Count | Frequency (%)
) | 15 | 83.3%
] | 3 | 16.7%

Open Punctuation
Value | Count | Frequency (%)
( | 15 | 83.3%
[ | 3 | 16.7%

Space Separator
Value | Count | Frequency (%)
(space) | 72 | 100.0%

Decimal Number
Value | Count | Frequency (%)
3 | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4617 | 81.4%
Common | 912 | 16.1%
Latin | 142 | 2.5%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 251 | 5.4%
 | 241 | 5.2%
 | 227 | 4.9%
 | 202 | 4.4%
 | 201 | 4.4%
 | 179 | 3.9%
 | 172 | 3.7%
 | 126 | 2.7%
 | 106 | 2.3%
 | 104 | 2.3%
Other values (283) | 2808 | 60.8%

Latin
Value | Count | Frequency (%)
D | 49 | 34.5%
C | 18 | 12.7%
P | 17 | 12.0%
V | 16 | 11.3%
T | 11 | 7.7%
A | 9 | 6.3%
E | 9 | 6.3%
L | 6 | 4.2%
M | 2 | 1.4%
Y | 2 | 1.4%
Other values (2) | 3 | 2.1%

Common
Value | Count | Frequency (%)
, | 771 | 84.5%
(space) | 72 | 7.9%
. | 32 | 3.5%
) | 15 | 1.6%
( | 15 | 1.6%
[ | 3 | 0.3%
] | 3 | 0.3%
3 | 1 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4617 | 81.4%
ASCII | 1054 | 18.6%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
, | 771 | 73.1%
(space) | 72 | 6.8%
D | 49 | 4.6%
. | 32 | 3.0%
C | 18 | 1.7%
P | 17 | 1.6%
V | 16 | 1.5%
) | 15 | 1.4%
( | 15 | 1.4%
T | 11 | 1.0%
Other values (10) | 38 | 3.6%

Hangul
Value | Count | Frequency (%)
 | 251 | 5.4%
 | 241 | 5.2%
 | 227 | 4.9%
 | 202 | 4.4%
 | 201 | 4.4%
 | 179 | 3.9%
 | 172 | 3.7%
 | 126 | 2.7%
 | 106 | 2.3%
 | 104 | 2.3%
Other values (283) | 2808 | 60.8%

Unnamed: 4
Categorical

HIGH CORRELATION 

Distinct: 10
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.0 KiB

남색동: 194
노랑동: 152
보라동: 135
초록동: 121
파랑동: 115
Other values (5): 167

Length

Max length: 8
Median length: 3
Mean length: 3.0067873
Min length: 3

Unique

Unique: 3
Unique (%): 0.3%

Sample

1st row: <NA>
2nd row: 2022.03.
3rd row: 동구분
4th row: 빨강동
5th row: 빨강동

Common Values

Value | Count | Frequency (%)
남색동 | 194 | 21.9%
노랑동 | 152 | 17.2%
보라동 | 135 | 15.3%
초록동 | 121 | 13.7%
파랑동 | 115 | 13.0%
주황동 | 104 | 11.8%
빨강동 | 60 | 6.8%
<NA> | 1 | 0.1%
2022.03. | 1 | 0.1%
동구분 | 1 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Correlations

Variable | 서울풍물시장 점포현황(취급품목) | Unnamed: 4
서울풍물시장 점포현황(취급품목) | 1.000 | 0.942
Unnamed: 4 | 0.942 | 1.000

Variable | Unnamed: 4 | 서울풍물시장 점포현황(취급품목)
Unnamed: 4 | 1.000 | 0.700
서울풍물시장 점포현황(취급품목) | 0.700 | 1.000

Variable | 서울풍물시장 점포현황(취급품목) | Unnamed: 4
서울풍물시장 점포현황(취급품목) | 1.000 | 0.700
Unnamed: 4 | 0.700 | 1.000
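
The report does not state which association measure produced these values. For two categorical variables, one common choice is Cramér's V, which rescales the chi-squared statistic of the contingency table to the range 0 to 1. The sketch below is a plain, uncorrected version for illustration only; the function name and the toy inputs are assumptions, and it is not claimed to reproduce 0.942 or 0.700.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("inputs must be non-empty and of equal length")
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        # A constant variable carries no association information.
        return 0.0
    chi2 = 0.0
    for r, r_count in row_totals.items():
        for c, c_count in col_totals.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return sqrt(chi2 / (n * k))


# Toy data: the second variable is fully determined by the first.
perfect = cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"])
# Toy data: the two variables are independent.
independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(perfect, independent)
```

A high value between the category column and the building-section column would be expected if each section of the market specializes in a few kinds of goods.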

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

Index | 서울풍물시장 점포현황(취급품목) | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4
0 | <NA> | NaN | <NA> | <NA> | <NA>
1 | <NA> | NaN | <NA> | <NA> | 2022.03.
2 | 코드구분 | 상호명 | 위치 | 취급품목 | 동구분
3 | 식당 | 마장동한우소머리국밥 | 1층 빨강동001호 | 소머리국밥,순대국,돼지머리수육 | 빨강동
4 | 식당 | 마장동한우소머리국밥 | 1층 빨강동002호 | 소머리국밥,순대국,돼지머리수육 | 빨강동
5 | 식당 | 옛맛한우소머리국밥 | 1층 빨강동003호 | 냉면 | 빨강동
6 | 식당 | 옛맛한우소머리국밥 | 1층 빨강동004호 | 냉면 | 빨강동
7 | 식당 | 옛맛한우소머리국밥 | 1층 빨강동005호 | 냉면 | 빨강동
8 | 식당 | 한우리 | 1층 빨강동006호 | 소머리국밥,순대국밥,국수 | 빨강동
9 | 식당 | 한우리 | 1층 빨강동007호 | 소머리국밥,순대국밥,국수 | 빨강동

Index | 서울풍물시장 점포현황(취급품목) | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4
874 | 레저용품 | 보물창고 | 2층 보라동126호 | 오토바이 용품,택스일오토바이복,가죽 오토바이복 | 보라동
875 | 구제의류 | 보물창고 | 2층 보라동127호 | 밍크,가죽의류,오토바이 가죽옷 | 보라동
876 | 의류 | 명품구제 | 2층 보라동128호 | 의류 | 보라동
877 | 의류 | 명품구제 | 2층 보라동129호 | 의류 | 보라동
878 | 의류 | 신설유통 | 2층 보라동130호 | 의류,잡화 | 보라동
879 | <NA> | 철거 | 2층 보라동131호 | <NA> | 보라동
880 | <NA> | 철거 | 2층 보라동132호 | <NA> | 보라동
881 | <NA> | 철거 | 2층 보라동133호 | <NA> | 보라동
882 | <NA> | 철거 | 2층 보라동134호 | <NA> | 보라동
883 | <NA> | 철거 | 2층 보라동135호 | <NA> | 보라동
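
The first rows of the sample suggest that the real column labels (코드구분, 상호명, 위치, 취급품목, 동구분) sit in row 2, below two mostly empty rows, which would explain the `Unnamed: N` column names and the stray header strings counted as values. A minimal sketch of promoting that row to a header, using plain Python lists, follows. The helper name `promote_header` is hypothetical, and the row contents are transcribed from the sample on the assumption that the cell boundaries fall as the per-variable samples indicate.

```python
def promote_header(rows, header_index):
    """Use rows[header_index] as column names and keep only the rows after it."""
    header = rows[header_index]
    return [dict(zip(header, row)) for row in rows[header_index + 1:]]


# Rows transcribed from the sample; None stands for a missing cell.
raw_rows = [
    [None, None, None, None, None],
    [None, None, None, None, "2022.03."],
    ["코드구분", "상호명", "위치", "취급품목", "동구분"],
    ["식당", "마장동한우소머리국밥", "1층 빨강동001호", "소머리국밥,순대국,돼지머리수육", "빨강동"],
    ["식당", "옛맛한우소머리국밥", "1층 빨강동003호", "냉면", "빨강동"],
]

records = promote_header(raw_rows, header_index=2)
for record in records:
    print(record["위치"], record["동구분"])
```

When loading with pandas, the `header` parameter of `read_csv` or `read_excel` can select the header row at read time, which would avoid the cleanup step altogether.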