Overview

Dataset statistics

Number of variables: 4
Number of observations: 2806
Missing cells: 267
Missing cells (%): 2.4%
Duplicate rows: 8
Duplicate rows (%): 0.3%
Total size in memory: 87.8 KiB
Average record size in memory: 32.0 B
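
The percentages and the average record size follow from the counts in this table. As a quick arithmetic check, here is a sketch that uses only the figures reported above (the variable names are illustrative):

```python
# Figures copied from the Dataset statistics table above.
n_variables = 4
n_observations = 2806
missing_cells = 267
duplicate_rows = 8
total_size_kib = 87.8

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations

missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")            # 2.4%
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")         # 0.3%
print(f"Average record size: {avg_record_bytes:.1f} B")    # 32.0 B
```

The 267 missing cells also equal 3 × 89, which matches the three variables reported with 89 missing values each.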

Variable types

Text: 2
Categorical: 1
DateTime: 1

Dataset

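Description: This dataset covers Kkumnamu Card (꿈나무카드) merchants located in Dongjak-gu, Seoul. It includes the merchant name (가맹점명), business category (업종명), merchant address (가맹점주소), and data reference date (데이터기준일자).
Author: Dongjak-gu, Seoul (서울특별시 동작구)
URL: https://www.data.go.kr/data/15088598/fileData.do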

Alerts

데이터기준일자 has constant value "2024-04-22" (Constant)
Dataset has 8 (0.3%) duplicate rows (Duplicates)
가맹점명 has 89 (3.2%) missing values (Missing)
가맹점주소 has 89 (3.2%) missing values (Missing)
데이터기준일자 has 89 (3.2%) missing values (Missing)
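
The Constant and Missing alerts can be approximated with two simple per-column checks. The sketch below is not the profiler's implementation; the miniature table, the store names and the `column_alerts` helper are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical miniature of the table; None marks a missing cell.
columns = ["가맹점명", "업종명", "데이터기준일자"]
rows = [
    ("가게A", "한식", "2024-04-22"),
    ("가게B", "중식", "2024-04-22"),
    ("가게C", "일식", "2024-04-22"),
    (None, "한식", None),
]

def column_alerts(rows, index):
    """Return the alert labels that apply to the column at `index`."""
    values = [row[index] for row in rows]
    present = [v for v in values if v is not None]
    alerts = []
    if len(present) < len(values):
        alerts.append("Missing")   # at least one cell is empty
    if len(set(present)) == 1:
        alerts.append("Constant")  # all non-missing cells share one value
    return alerts

for i, name in enumerate(columns):
    print(name, column_alerts(rows, i))
```

Under these rules a column such as 데이터기준일자 is flagged both ways: it has missing cells, and every cell that is present holds the same date.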

Reproduction

Analysis started: 2024-04-29 22:53:08.230529
Analysis finished: 2024-04-29 22:53:09.344236
Duration: 1.11 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

가맹점명
Text

MISSING 

Distinct: 2613
Distinct (%): 96.2%
Missing: 89
Missing (%): 3.2%
Memory size: 22.1 KiB

Length

Max length: 26
Median length: 21
Mean length: 7.396025
Min length: 1

Characters and Unicode

Total characters: 20095
Distinct characters: 768
Distinct categories: 11
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
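
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using the five sample values of this variable. It shows the idea only and is not the profiler's own code; `CATEGORY_NAMES` and `category_counts` are names chosen for this example.

```python
import unicodedata
from collections import Counter

# Long names for the two-letter general category codes used in this example.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
}

# The five sample rows of 가맹점명 shown in this report.
sample_names = [
    "(주) 씨엘그룹",
    "(주) 아워홈 보라매병원동작점",
    "(주) 안동장",
    "(주) 장블랑제리 이수역 직영점",
    "(주)가든즈",
]

def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

counts = category_counts(sample_names)
for code, n in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), n)
```

Hangul syllables fall under "Other Letter" (`Lo`), which is why that category dominates the tables for this variable.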

Unique

Unique: 2528
Unique (%): 93.0%

Sample

1st row: (주) 씨엘그룹
2nd row: (주) 아워홈 보라매병원동작점
3rd row: (주) 안동장
4th row: (주) 장블랑제리 이수역 직영점
5th row: (주)가든즈

Value | Count | Frequency (%)
gs25 | 59 | 1.5%
노량진점 | 53 | 1.4%
상도점 | 45 | 1.2%
사당점 | 45 | 1.2%
세븐일레븐 | 44 | 1.1%
중앙대점 | 37 | 1.0%
씨유(cu | 31 | 0.8%
이마트24 | 29 | 0.7%
이수역점 | 28 | 0.7%
주식회사 | 26 | 0.7%
Other values (2796) | 3487 | 89.8%
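
Entries such as "gs25" and "씨유(cu" suggest that values are lowercased and split on whitespace before words are counted. That rule is an inference from the table, not something the report states. A minimal sketch under that assumption, using the five sample rows and a hypothetical `word_counts` helper:

```python
from collections import Counter

# The five sample rows of 가맹점명 shown in this report.
sample_names = [
    "(주) 씨엘그룹",
    "(주) 아워홈 보라매병원동작점",
    "(주) 안동장",
    "(주) 장블랑제리 이수역 직영점",
    "(주)가든즈",
]

def word_counts(values):
    """Lowercase each value, split it on whitespace, and count the tokens."""
    counts = Counter()
    for value in values:
        counts.update(value.lower().split())
    return counts

counts = word_counts(sample_names)
print(counts.most_common(3))
```

Because splitting happens only on whitespace, "(주)" and "(주)가든즈" count as different words, just as "씨유(cu" keeps its parenthesis in the table.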

Most occurring characters

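Value | Count | Frequency (%)
(space) | 1167 | 5.8%
[not preserved] | 958 | 4.8%
[not preserved] | 474 | 2.4%
[not preserved] | 412 | 2.1%
[not preserved] | 302 | 1.5%
[not preserved] | 271 | 1.3%
[not preserved] | 267 | 1.3%
( | 253 | 1.3%
) | 252 | 1.3%
[not preserved] | 236 | 1.2%
Other values (758) | 15503 | 77.1%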

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 16992 | 84.6%
Space Separator | 1167 | 5.8%
Uppercase Letter | 617 | 3.1%
Decimal Number | 532 | 2.6%
Open Punctuation | 254 | 1.3%
Close Punctuation | 253 | 1.3%
Lowercase Letter | 226 | 1.1%
Other Punctuation | 47 | 0.2%
Modifier Symbol | 4 | < 0.1%
Dash Punctuation | 2 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[not preserved] | 958 | 5.6%
[not preserved] | 474 | 2.8%
[not preserved] | 412 | 2.4%
[not preserved] | 302 | 1.8%
[not preserved] | 271 | 1.6%
[not preserved] | 267 | 1.6%
[not preserved] | 236 | 1.4%
[not preserved] | 235 | 1.4%
[not preserved] | 233 | 1.4%
[not preserved] | 188 | 1.1%
Other values (685) | 13416 | 79.0%

Uppercase Letter
Value | Count | Frequency (%)
S | 134 | 21.7%
G | 119 | 19.3%
C | 74 | 12.0%
U | 71 | 11.5%
B | 30 | 4.9%
A | 24 | 3.9%
R | 21 | 3.4%
T | 17 | 2.8%
E | 16 | 2.6%
O | 15 | 2.4%
Other values (15) | 96 | 15.6%

Lowercase Letter
Value | Count | Frequency (%)
e | 34 | 15.0%
a | 19 | 8.4%
i | 18 | 8.0%
o | 15 | 6.6%
r | 14 | 6.2%
l | 14 | 6.2%
n | 12 | 5.3%
t | 12 | 5.3%
h | 12 | 5.3%
c | 12 | 5.3%
Other values (14) | 64 | 28.3%

Decimal Number
Value | Count | Frequency (%)
2 | 186 | 35.0%
5 | 137 | 25.8%
4 | 53 | 10.0%
1 | 46 | 8.6%
0 | 41 | 7.7%
9 | 22 | 4.1%
3 | 17 | 3.2%
8 | 13 | 2.4%
6 | 9 | 1.7%
7 | 8 | 1.5%

Other Punctuation
Value | Count | Frequency (%)
& | 29 | 61.7%
. | 9 | 19.1%
, | 4 | 8.5%
! | 3 | 6.4%
? | 1 | 2.1%
/ | 1 | 2.1%

Open Punctuation
Value | Count | Frequency (%)
( | 253 | 99.6%
[not preserved] | 1 | 0.4%

Close Punctuation
Value | Count | Frequency (%)
) | 252 | 99.6%
[not preserved] | 1 | 0.4%

Space Separator
Value | Count | Frequency (%)
(space) | 1167 | 100.0%

Modifier Symbol
Value | Count | Frequency (%)
´ | 4 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%

Letter Number
Value | Count | Frequency (%)
[not preserved] | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 16988 | 84.5%
Common | 2259 | 11.2%
Latin | 844 | 4.2%
Han | 4 | < 0.1%

Most frequent character per script

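Hangul
Value | Count | Frequency (%)
[not preserved] | 958 | 5.6%
[not preserved] | 474 | 2.8%
[not preserved] | 412 | 2.4%
[not preserved] | 302 | 1.8%
[not preserved] | 271 | 1.6%
[not preserved] | 267 | 1.6%
[not preserved] | 236 | 1.4%
[not preserved] | 235 | 1.4%
[not preserved] | 233 | 1.4%
[not preserved] | 188 | 1.1%
Other values (681) | 13412 | 78.9%

Latin
Value | Count | Frequency (%)
S | 134 | 15.9%
G | 119 | 14.1%
C | 74 | 8.8%
U | 71 | 8.4%
e | 34 | 4.0%
B | 30 | 3.6%
A | 24 | 2.8%
R | 21 | 2.5%
a | 19 | 2.3%
i | 18 | 2.1%
Other values (40) | 300 | 35.5%

Common
Value | Count | Frequency (%)
(space) | 1167 | 51.7%
( | 253 | 11.2%
) | 252 | 11.2%
2 | 186 | 8.2%
5 | 137 | 6.1%
4 | 53 | 2.3%
1 | 46 | 2.0%
0 | 41 | 1.8%
& | 29 | 1.3%
9 | 22 | 1.0%
Other values (13) | 73 | 3.2%

Han
Value | Count | Frequency (%)
[not preserved] | 1 | 25.0%
[not preserved] | 1 | 25.0%
[not preserved] | 1 | 25.0%
[not preserved] | 1 | 25.0%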

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 16988 | 84.5%
ASCII | 3096 | 15.4%
None | 6 | < 0.1%
CJK | 4 | < 0.1%
Number Forms | 1 | < 0.1%

Most frequent character per block

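ASCII
Value | Count | Frequency (%)
(space) | 1167 | 37.7%
( | 253 | 8.2%
) | 252 | 8.1%
2 | 186 | 6.0%
5 | 137 | 4.4%
S | 134 | 4.3%
G | 119 | 3.8%
C | 74 | 2.4%
U | 71 | 2.3%
4 | 53 | 1.7%
Other values (59) | 650 | 21.0%

Hangul
Value | Count | Frequency (%)
[not preserved] | 958 | 5.6%
[not preserved] | 474 | 2.8%
[not preserved] | 412 | 2.4%
[not preserved] | 302 | 1.8%
[not preserved] | 271 | 1.6%
[not preserved] | 267 | 1.6%
[not preserved] | 236 | 1.4%
[not preserved] | 235 | 1.4%
[not preserved] | 233 | 1.4%
[not preserved] | 188 | 1.1%
Other values (681) | 13412 | 78.9%

None
Value | Count | Frequency (%)
´ | 4 | 66.7%
[not preserved] | 1 | 16.7%
[not preserved] | 1 | 16.7%

CJK
Value | Count | Frequency (%)
[not preserved] | 1 | 25.0%
[not preserved] | 1 | 25.0%
[not preserved] | 1 | 25.0%
[not preserved] | 1 | 25.0%

Number Forms
Value | Count | Frequency (%)
[not preserved] | 1 | 100.0%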

업종명
Categorical

Distinct: 11
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 22.1 KiB

Value | Count
한식 | 1271
일반대중음식 | 607
편의점 | 293
일식 | 135
중식 | 124
Other values (6) | 376

Length

Max length: 8
Median length: 2
Mean length: 3.2031361
Min length: 2

Unique

Unique: 2
Unique (%): 0.1%

Sample

1st row: 패스트푸드
2nd row: 일반대중음식
3rd row: 중식
4th row: 제과점
5th row: 일반대중음식

Common Values

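Value | Count | Frequency (%)
한식 (Korean food) | 1271 | 45.3%
일반대중음식 (general restaurant) | 607 | 21.6%
편의점 (convenience store) | 293 | 10.4%
일식 (Japanese food) | 135 | 4.8%
중식 (Chinese food) | 124 | 4.4%
패스트푸드 (fast food) | 118 | 4.2%
제과점 (bakery) | 116 | 4.1%
<NA> | 89 | 3.2%
양식 (Western food) | 51 | 1.8%
할인점/슈퍼마켓 (discount store/supermarket) | 1 | < 0.1%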

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
한식 | 1271 | 45.3%
일반대중음식 | 607 | 21.6%
편의점 | 293 | 10.4%
일식 | 135 | 4.8%
중식 | 124 | 4.4%
패스트푸드 | 118 | 4.2%
제과점 | 116 | 4.1%
na | 89 | 3.2%
양식 | 51 | 1.8%
할인점/슈퍼마켓 | 1 | < 0.1%

가맹점주소
Text

MISSING 

Distinct: 2638
Distinct (%): 97.1%
Missing: 89
Missing (%): 3.2%
Memory size: 22.1 KiB

Length

Max length: 58
Median length: 53
Mean length: 26.345234
Min length: 14

Characters and Unicode

Total characters: 71580
Distinct characters: 329
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
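
The reported mean lengths of the two text variables are consistent with dividing each variable's total character count by its number of non-missing values (2806 − 89 = 2717). The check below uses only figures from this report; the variable names are illustrative.

```python
# Figures copied from this report.
n_observations = 2806
n_missing = 89  # missing values in each of the two text variables
total_characters = {"가맹점명": 20095, "가맹점주소": 71580}

# Only non-missing values contribute characters.
n_present = n_observations - n_missing

mean_length = {
    name: total / n_present for name, total in total_characters.items()
}
for name, mean in mean_length.items():
    print(f"{name}: {mean:.6f}")
# 가맹점명: 7.396025
# 가맹점주소: 26.345234
```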

Unique

Unique: 2570
Unique (%): 94.6%

Sample

1st row: 서울 동작구 사당로 300, 1층 (사당동, 이수자이아파트)
2nd row: 서울특별시 동작구 보라매로5길 20 지하1층
3rd row: 서울 동작구 흑석로 105-1 (흑석동)
4th row: 서울특별시 동작구 동작대로23길 8 사당동
5th row: 서울특별시 동작구 사당로28길 5 1층 101호

Value | Count | Frequency (%)
동작구 | 2717 | 17.8%
서울특별시 | 1700 | 11.1%
1층 | 1223 | 8.0%
서울 | 1017 | 6.6%
사당동 | 313 | 2.0%
상도동 | 219 | 1.4%
노량진동 | 196 | 1.3%
2층 | 185 | 1.2%
상도로 | 173 | 1.1%
사당로 | 139 | 0.9%
Other values (1667) | 7413 | 48.5%

Most occurring characters

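Value | Count | Frequency (%)
(space) | 12603 | 17.6%
[not preserved] | 4829 | 6.7%
1 | 4634 | 6.5%
[not preserved] | 3103 | 4.3%
[not preserved] | 2804 | 3.9%
[not preserved] | 2723 | 3.8%
[not preserved] | 2723 | 3.8%
[not preserved] | 2478 | 3.5%
2 | 2160 | 3.0%
[not preserved] | 1974 | 2.8%
Other values (319) | 31549 | 44.1%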

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 41043 | 57.3%
Decimal Number | 13426 | 18.8%
Space Separator | 12603 | 17.6%
Open Punctuation | 1454 | 2.0%
Close Punctuation | 1454 | 2.0%
Other Punctuation | 1078 | 1.5%
Dash Punctuation | 415 | 0.6%
Uppercase Letter | 95 | 0.1%
Math Symbol | 6 | < 0.1%
Lowercase Letter | 6 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[not preserved] | 4829 | 11.8%
[not preserved] | 3103 | 7.6%
[not preserved] | 2804 | 6.8%
[not preserved] | 2723 | 6.6%
[not preserved] | 2723 | 6.6%
[not preserved] | 2478 | 6.0%
[not preserved] | 1974 | 4.8%
[not preserved] | 1741 | 4.2%
[not preserved] | 1706 | 4.2%
[not preserved] | 1706 | 4.2%
Other values (284) | 15256 | 37.2%

Decimal Number
Value | Count | Frequency (%)
1 | 4634 | 34.5%
2 | 2160 | 16.1%
0 | 1144 | 8.5%
3 | 1131 | 8.4%
4 | 936 | 7.0%
6 | 817 | 6.1%
5 | 771 | 5.7%
7 | 746 | 5.6%
8 | 573 | 4.3%
9 | 514 | 3.8%

Uppercase Letter
Value | Count | Frequency (%)
B | 52 | 54.7%
A | 16 | 16.8%
C | 6 | 6.3%
D | 6 | 6.3%
R | 5 | 5.3%
T | 4 | 4.2%
P | 3 | 3.2%
Y | 1 | 1.1%
X | 1 | 1.1%
U | 1 | 1.1%

Other Punctuation
Value | Count | Frequency (%)
, | 1010 | 93.7%
. | 62 | 5.8%
& | 5 | 0.5%
@ | 1 | 0.1%

Lowercase Letter
Value | Count | Frequency (%)
b | 3 | 50.0%
a | 1 | 16.7%
e | 1 | 16.7%
s | 1 | 16.7%

Open Punctuation
Value | Count | Frequency (%)
( | 1451 | 99.8%
[not preserved] | 3 | 0.2%

Close Punctuation
Value | Count | Frequency (%)
) | 1451 | 99.8%
[not preserved] | 3 | 0.2%

Space Separator
Value | Count | Frequency (%)
(space) | 12603 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 415 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 6 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 41042 | 57.3%
Common | 30436 | 42.5%
Latin | 101 | 0.1%
Han | 1 | < 0.1%

Most frequent character per script

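Hangul
Value | Count | Frequency (%)
[not preserved] | 4829 | 11.8%
[not preserved] | 3103 | 7.6%
[not preserved] | 2804 | 6.8%
[not preserved] | 2723 | 6.6%
[not preserved] | 2723 | 6.6%
[not preserved] | 2478 | 6.0%
[not preserved] | 1974 | 4.8%
[not preserved] | 1741 | 4.2%
[not preserved] | 1706 | 4.2%
[not preserved] | 1706 | 4.2%
Other values (283) | 15255 | 37.2%

Common
Value | Count | Frequency (%)
(space) | 12603 | 41.4%
1 | 4634 | 15.2%
2 | 2160 | 7.1%
( | 1451 | 4.8%
) | 1451 | 4.8%
0 | 1144 | 3.8%
3 | 1131 | 3.7%
, | 1010 | 3.3%
4 | 936 | 3.1%
6 | 817 | 2.7%
Other values (11) | 3099 | 10.2%

Latin
Value | Count | Frequency (%)
B | 52 | 51.5%
A | 16 | 15.8%
C | 6 | 5.9%
D | 6 | 5.9%
R | 5 | 5.0%
T | 4 | 4.0%
b | 3 | 3.0%
P | 3 | 3.0%
Y | 1 | 1.0%
X | 1 | 1.0%
Other values (4) | 4 | 4.0%

Han
Value | Count | Frequency (%)
[not preserved] | 1 | 100.0%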

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 41042 | 57.3%
ASCII | 30531 | 42.7%
None | 6 | < 0.1%
CJK | 1 | < 0.1%

Most frequent character per block

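ASCII
Value | Count | Frequency (%)
(space) | 12603 | 41.3%
1 | 4634 | 15.2%
2 | 2160 | 7.1%
( | 1451 | 4.8%
) | 1451 | 4.8%
0 | 1144 | 3.7%
3 | 1131 | 3.7%
, | 1010 | 3.3%
4 | 936 | 3.1%
6 | 817 | 2.7%
Other values (23) | 3194 | 10.5%

Hangul
Value | Count | Frequency (%)
[not preserved] | 4829 | 11.8%
[not preserved] | 3103 | 7.6%
[not preserved] | 2804 | 6.8%
[not preserved] | 2723 | 6.6%
[not preserved] | 2723 | 6.6%
[not preserved] | 2478 | 6.0%
[not preserved] | 1974 | 4.8%
[not preserved] | 1741 | 4.2%
[not preserved] | 1706 | 4.2%
[not preserved] | 1706 | 4.2%
Other values (283) | 15255 | 37.2%

None
Value | Count | Frequency (%)
[not preserved] | 3 | 50.0%
[not preserved] | 3 | 50.0%

CJK
Value | Count | Frequency (%)
[not preserved] | 1 | 100.0%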

데이터기준일자
Date

CONSTANT  MISSING 

Distinct: 1
Distinct (%): < 0.1%
Missing: 89
Missing (%): 3.2%
Memory size: 22.1 KiB
Minimum: 2024-04-22 00:00:00
Maximum: 2024-04-22 00:00:00
Histogram with fixed size bins (bins=1)

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
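
One way to compute such a nullity correlation is the Pearson correlation between the missing-value indicators of two columns. The sketch below takes that approach on a hypothetical five-row table; it is not the profiler's code, and `nullity_correlation` is a name chosen for this example.

```python
from math import sqrt

# Hypothetical rows (name, category, address); None marks a missing cell.
rows = [
    ("가게A", "한식", "주소1"),
    ("가게B", "중식", "주소2"),
    ("가게C", "일식", "주소3"),
    (None, "한식", None),
    (None, "중식", None),
]

def nullity_correlation(rows, i, j):
    """Pearson correlation between the missing indicators of columns i and j.

    Returns None when either column has no variation in missingness
    (for example, a column with no missing values at all).
    """
    xs = [1.0 if row[i] is None else 0.0 for row in rows]
    ys = [1.0 if row[j] is None else 0.0 for row in rows]
    n = len(rows)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)

print(nullity_correlation(rows, 0, 2))  # columns missing together: close to 1.0
print(nullity_correlation(rows, 0, 1))  # column 1 is never missing: None
```

In this dataset the duplicate-row table shows rows that are <NA> in every column, so the variables with missing values are missing together.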

Sample

Index | 가맹점명 | 업종명 | 가맹점주소 | 데이터기준일자
0 | (주) 씨엘그룹 | 패스트푸드 | 서울 동작구 사당로 300, 1층 (사당동, 이수자이아파트) | 2024-04-22
1 | (주) 아워홈 보라매병원동작점 | 일반대중음식 | 서울특별시 동작구 보라매로5길 20 지하1층 | 2024-04-22
2 | (주) 안동장 | 중식 | 서울 동작구 흑석로 105-1 (흑석동) | 2024-04-22
3 | (주) 장블랑제리 이수역 직영점 | 제과점 | 서울특별시 동작구 동작대로23길 8 사당동 | 2024-04-22
4 | (주)가든즈 | 일반대중음식 | 서울특별시 동작구 사당로28길 5 1층 101호 | 2024-04-22
5 | (주)계림원푸드(노량진점) | 한식 | 서울특별시 동작구 노량진로8길 8 (노량진동)1층 | 2024-04-22
6 | (주)낙지세상 신대방본점 | 한식 | 서울특별시 동작구 대림로 57 1층 | 2024-04-22
7 | (주)닥터로빈 흑석점 | 일반대중음식 | 서울 동작구 현충로 124, 뉴지엄 지하1층 2층 (흑석동) | 2024-04-22
8 | (주)동서유통 | 패스트푸드 | 서울 동작구 동작대로 129, 1층 (사당동) | 2024-04-22
9 | (주)롯데리아 보네스빼브레드 이수역 | 패스트푸드 | 서울 동작구 동작대로 119 (사당동) | 2024-04-22

Index | 가맹점명 | 업종명 | 가맹점주소 | 데이터기준일자
2796 | <NA> | <NA> | <NA> | <NA>
2797 | <NA> | <NA> | <NA> | <NA>
2798 | <NA> | <NA> | <NA> | <NA>
2799 | <NA> | <NA> | <NA> | <NA>
2800 | <NA> | <NA> | <NA> | <NA>
2801 | <NA> | <NA> | <NA> | <NA>
2802 | <NA> | <NA> | <NA> | <NA>
2803 | <NA> | <NA> | <NA> | <NA>
2804 | <NA> | <NA> | <NA> | <NA>
2805 | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

Index | 가맹점명 | 업종명 | 가맹점주소 | 데이터기준일자 | # duplicates
7 | <NA> | <NA> | <NA> | <NA> | 89
0 | 롯데리아 보라매타운점 | 패스트푸드 | 서울특별시 동작구 보라매로5길 35 103호 | 2024-04-22 | 2
1 | 세븐일레븐 숭실대중앙점 | 편의점 | 서울특별시 동작구 사당로 18 1층 | 2024-04-22 | 2
2 | 시골집 | 한식 | 서울 동작구 양녕로26길 56, 2층 (상도동) | 2024-04-22 | 2
3 | 이마트24 R보라매서울대병원점 | 편의점 | 서울특별시 동작구 보라매로5길 20 1층 | 2024-04-22 | 2
4 | 이마트24 동작삼익점 | 편의점 | 서울특별시 동작구 만양로 84 101동 105호 | 2024-04-22 | 2
5 | 이마트24 사당역점 | 편의점 | 서울특별시 동작구 남부순환로 2045 1층 | 2024-04-22 | 2
6 | 이마트24 신대방역점 | 편의점 | 서울특별시 동작구 신대방길 5 1층 | 2024-04-22 | 2
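
The table lists eight distinct row patterns (indexed 0 to 7), which matches the "Duplicate rows" figure of 8 in the overview. A table like this can be approximated by counting identical rows and keeping those that occur more than once. The sketch below does that on a small excerpt built from rows shown in this report; `duplicate_groups` is a hypothetical helper, not the profiler's code.

```python
from collections import Counter

# Small excerpt: two duplicated rows and one row that appears once.
rows = [
    ("롯데리아 보라매타운점", "패스트푸드", "서울특별시 동작구 보라매로5길 35 103호", "2024-04-22"),
    ("롯데리아 보라매타운점", "패스트푸드", "서울특별시 동작구 보라매로5길 35 103호", "2024-04-22"),
    ("시골집", "한식", "서울 동작구 양녕로26길 56, 2층 (상도동)", "2024-04-22"),
    ("시골집", "한식", "서울 동작구 양녕로26길 56, 2층 (상도동)", "2024-04-22"),
    ("(주) 안동장", "중식", "서울 동작구 흑석로 105-1 (흑석동)", "2024-04-22"),
]

def duplicate_groups(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

groups = duplicate_groups(rows)
for row, n in groups.items():
    print(row[0], n)
```

Rows are compared as whole tuples, so two entries count as duplicates only when every column matches.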