Overview

Dataset statistics

Number of variables: 5
Number of observations: 2000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 80.2 KiB
Average record size in memory: 41.1 B
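
As a quick consistency check, the average record size should equal the total memory divided by the number of observations. The sketch below assumes 1 KiB = 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Values copied from the dataset statistics above.
total_size_kib = 80.2
n_observations = 2000

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```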

Variable types

Categorical: 2
Numeric: 1
Text: 2

Alerts

CTY_NM has constant value "seoul" (Constant)
RSTRNT_ID is highly overall correlated with RSTRNT_TRDAR_NM (High correlation)
RSTRNT_TRDAR_NM is highly overall correlated with RSTRNT_ID (High correlation)
RSTRNT_ID has unique values (Unique)

Reproduction

Analysis started: 2023-12-10 09:41:47.233711
Analysis finished: 2023-12-10 09:41:48.850502
Duration: 1.62 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

CTY_NM
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB
Value counts: seoul (2000)

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: seoul
2nd row: seoul
3rd row: seoul
4th row: seoul
5th row: seoul

Common Values

Value | Count | Frequency (%)
seoul | 2000 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values (seoul: 2000, 100.0%)]

RSTRNT_ID
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 2000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1138.1495
Minimum: 1
Maximum: 2265
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 17.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 118.95
Q1: 574.75
Median: 1149.5
Q3: 1706.25
95-th percentile: 2155.05
Maximum: 2265
Range: 2264
Interquartile range (IQR): 1131.5
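
The range and IQR follow directly from the other quantile figures. The fractional quantiles (for example Q1 = 574.75) are consistent with linear interpolation at rank position (n - 1) * q, which is the default method in NumPy and pandas; the report itself does not state which method was used, so treat that as an assumption. The `quantile` helper and the toy list below are illustrative only.

```python
def quantile(sorted_values, q):
    """Linear-interpolation quantile over an already sorted list."""
    pos = (len(sorted_values) - 1) * q   # fractional rank position
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * frac

# Values reported above for RSTRNT_ID.
minimum, q1, q3, maximum = 1, 574.75, 1706.25, 2265
value_range = maximum - minimum   # reported as Range
iqr = q3 - q1                     # reported as Interquartile range (IQR)

# Toy data (not the real column) to show interpolation between ranks.
toy = [1, 2, 3, 4]
toy_q1 = quantile(toy, 0.25)
print(value_range, iqr, toy_q1)
```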

Descriptive statistics

Standard deviation: 653.5449
Coefficient of variation (CV): 0.57421709
Kurtosis: -1.1970096
Mean: 1138.1495
Median Absolute Deviation (MAD): 568
Skewness: -0.029319763
Sum: 2276299
Variance: 427120.94
Monotonicity: Strictly increasing
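
Several of these figures are derived from one another: the mean is the sum divided by the number of observations, the variance is the squared standard deviation, and the CV is the standard deviation divided by the mean. A minimal check using only the reported (rounded) values:

```python
import math

# Values copied from the report.
n = 2000
total = 2276299
std = 653.5449

mean = total / n          # compare with the reported Mean
variance = std ** 2       # compare with the reported Variance
cv = std / mean           # compare with the reported CV

print(mean, round(variance, 2), round(cv, 8))
```

Small differences in the last digits are expected because the inputs are already rounded.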
[Figure: Histogram with fixed size bins (bins=50)]
Common values
Value | Count | Frequency (%)
1 | 1 | 0.1%
1518 | 1 | 0.1%
1531 | 1 | 0.1%
1530 | 1 | 0.1%
1529 | 1 | 0.1%
1528 | 1 | 0.1%
1527 | 1 | 0.1%
1526 | 1 | 0.1%
1525 | 1 | 0.1%
1524 | 1 | 0.1%
Other values (1990) | 1990 | 99.5%
Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 1 | 0.1%
7 | 1 | 0.1%
9 | 1 | 0.1%
10 | 1 | 0.1%
11 | 1 | 0.1%
Maximum 10 values
Value | Count | Frequency (%)
2265 | 1 | 0.1%
2264 | 1 | 0.1%
2263 | 1 | 0.1%
2262 | 1 | 0.1%
2261 | 1 | 0.1%
2259 | 1 | 0.1%
2258 | 1 | 0.1%
2257 | 1 | 0.1%
2256 | 1 | 0.1%
2253 | 1 | 0.1%

RSTRNT_NM
Text

Distinct: 1994
Distinct (%): 99.7%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB

Length

Max length: 28
Median length: 23
Mean length: 7.375
Min length: 1

Characters and Unicode

Total characters: 14750
Distinct characters: 832
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
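
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for the five sample values listed for this variable; the helper name `category_counts` is illustrative, and this is not necessarily how the profiling tool computes its tables.

```python
import unicodedata
from collections import Counter

# The five sample values reported for this variable.
names = [
    "아리마(서초점)",
    "강남교자(본점)",
    "무화잠(강남점)",
    "아이 해브 어 드림",
    "아카사카(강남점)",
]

def category_counts(text):
    """Count Unicode general categories (Lo, Zs, Ps, Pe, ...) in a string."""
    return Counter(unicodedata.category(ch) for ch in text)

totals = Counter()
for name in names:
    totals += category_counts(name)

# Lo = Other Letter, Zs = Space Separator,
# Ps = Open Punctuation, Pe = Close Punctuation.
print(totals)
```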

Unique

Unique: 1988
Unique (%): 99.4%

Sample

1st row: 아리마(서초점)
2nd row: 강남교자(본점)
3rd row: 무화잠(강남점)
4th row: 아이 해브 어 드림
5th row: 아카사카(강남점)
Value | Count | Frequency (%)
홍대점 | 17 | 0.7%
본점 | 10 | 0.4%
신촌점 | 7 | 0.3%
명동점 | 7 | 0.3%
암사점 | 4 | 0.2%
카페 | 4 | 0.2%
레스토랑 | 3 | 0.1%
종각점 | 3 | 0.1%
table | 3 | 0.1%
 | 3 | 0.1%
Other values (2172) | 2210 | 97.3%

Most occurring characters

Value | Count | Frequency (%)
(space) | 1820 | 12.3%
 | 426 | 2.9%
) | 410 | 2.8%
( | 410 | 2.8%
 | 298 | 2.0%
 | 280 | 1.9%
 | 193 | 1.3%
 | 173 | 1.2%
 | 137 | 0.9%
 | 114 | 0.8%
Other values (822) | 10489 | 71.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 10994 | 74.5%
Space Separator | 1820 | 12.3%
Lowercase Letter | 481 | 3.3%
Uppercase Letter | 453 | 3.1%
Close Punctuation | 416 | 2.8%
Open Punctuation | 416 | 2.8%
Decimal Number | 131 | 0.9%
Other Punctuation | 29 | 0.2%
Dash Punctuation | 9 | 0.1%
Connector Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 426 | 3.9%
 | 298 | 2.7%
 | 280 | 2.5%
 | 193 | 1.8%
 | 173 | 1.6%
 | 137 | 1.2%
 | 114 | 1.0%
 | 113 | 1.0%
 | 112 | 1.0%
 | 107 | 1.0%
Other values (749) | 9041 | 82.2%

Uppercase Letter
Value | Count | Frequency (%)
A | 42 | 9.3%
O | 39 | 8.6%
C | 37 | 8.2%
E | 32 | 7.1%
S | 26 | 5.7%
I | 26 | 5.7%
B | 24 | 5.3%
T | 24 | 5.3%
N | 22 | 4.9%
R | 19 | 4.2%
Other values (16) | 162 | 35.8%

Lowercase Letter
Value | Count | Frequency (%)
a | 71 | 14.8%
e | 61 | 12.7%
n | 36 | 7.5%
r | 33 | 6.9%
o | 32 | 6.7%
i | 32 | 6.7%
l | 30 | 6.2%
t | 28 | 5.8%
c | 24 | 5.0%
s | 21 | 4.4%
Other values (13) | 113 | 23.5%

Decimal Number
Value | Count | Frequency (%)
1 | 32 | 24.4%
2 | 24 | 18.3%
9 | 16 | 12.2%
0 | 12 | 9.2%
7 | 10 | 7.6%
6 | 9 | 6.9%
8 | 8 | 6.1%
3 | 7 | 5.3%
5 | 7 | 5.3%
4 | 6 | 4.6%

Other Punctuation
Value | Count | Frequency (%)
& | 13 | 44.8%
. | 8 | 27.6%
' | 6 | 20.7%
% | 1 | 3.4%
! | 1 | 3.4%

Close Punctuation
Value | Count | Frequency (%)
) | 410 | 98.6%
 | 5 | 1.2%
] | 1 | 0.2%

Open Punctuation
Value | Count | Frequency (%)
( | 410 | 98.6%
 | 5 | 1.2%
[ | 1 | 0.2%

Space Separator
Value | Count | Frequency (%)
(space) | 1820 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 9 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 10983 | 74.5%
Common | 2822 | 19.1%
Latin | 934 | 6.3%
Han | 11 | 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 426 | 3.9%
 | 298 | 2.7%
 | 280 | 2.5%
 | 193 | 1.8%
 | 173 | 1.6%
 | 137 | 1.2%
 | 114 | 1.0%
 | 113 | 1.0%
 | 112 | 1.0%
 | 107 | 1.0%
Other values (738) | 9030 | 82.2%

Latin
Value | Count | Frequency (%)
a | 71 | 7.6%
e | 61 | 6.5%
A | 42 | 4.5%
O | 39 | 4.2%
C | 37 | 4.0%
n | 36 | 3.9%
r | 33 | 3.5%
o | 32 | 3.4%
E | 32 | 3.4%
i | 32 | 3.4%
Other values (39) | 519 | 55.6%

Common
Value | Count | Frequency (%)
(space) | 1820 | 64.5%
) | 410 | 14.5%
( | 410 | 14.5%
1 | 32 | 1.1%
2 | 24 | 0.9%
9 | 16 | 0.6%
& | 13 | 0.5%
0 | 12 | 0.4%
7 | 10 | 0.4%
6 | 9 | 0.3%
Other values (14) | 66 | 2.3%

Han
Value | Count | Frequency (%)
 | 1 | 9.1%
 | 1 | 9.1%
 | 1 | 9.1%
 | 1 | 9.1%
 | 1 | 9.1%
 | 1 | 9.1%
 | 1 | 9.1%
 | 1 | 9.1%
 | 1 | 9.1%
 | 1 | 9.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 10983 | 74.5%
ASCII | 3746 | 25.4%
CJK | 11 | 0.1%
None | 10 | 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 1820 | 48.6%
) | 410 | 10.9%
( | 410 | 10.9%
a | 71 | 1.9%
e | 61 | 1.6%
A | 42 | 1.1%
O | 39 | 1.0%
C | 37 | 1.0%
n | 36 | 1.0%
r | 33 | 0.9%
Other values (61) | 787 | 21.0%

Hangul
Value | Count | Frequency (%)
 | 426 | 3.9%
 | 298 | 2.7%
 | 280 | 2.5%
 | 193 | 1.8%
 | 173 | 1.6%
 | 137 | 1.2%
 | 114 | 1.0%
 | 113 | 1.0%
 | 112 | 1.0%
 | 107 | 1.0%
Other values (738) | 9030 | 82.2%

None
Value | Count | Frequency (%)
 | 5 | 50.0%
 | 5 | 50.0%

CJK
Value | Count | Frequency (%)
 | 1 | 9.1%
 | 1 | 9.1%
 | 1 | 9.1%
 | 1 | 9.1%
 | 1 | 9.1%
 | 1 | 9.1%
 | 1 | 9.1%
 | 1 | 9.1%
 | 1 | 9.1%
 | 1 | 9.1%

RSTRNT_ADDR
Text

Distinct: 1944
Distinct (%): 97.2%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB

Length

Max length: 45
Median length: 40
Mean length: 20.6245
Min length: 14

Characters and Unicode

Total characters: 41249
Distinct characters: 357
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1901
Unique (%): 95.0%

Sample

1st row: 서울특별시 서초구 서초동 1675-13
2nd row: 서울특별시 서초구 서초동 1308-1
3rd row: 서울특별시 강남구 강남대로 338
4th row: 서울특별시 강남구 역삼동 821-1
5th row: 서울특별시 서초구 서초동 1327-2
Value | Count | Frequency (%)
서울특별시 | 2002 | 23.1%
마포구 | 316 | 3.6%
종로구 | 270 | 3.1%
중구 | 237 | 2.7%
용산구 | 173 | 2.0%
광진구 | 150 | 1.7%
서대문구 | 143 | 1.6%
1층 | 132 | 1.5%
서교동 | 113 | 1.3%
강남구 | 111 | 1.3%
Other values (2424) | 5025 | 57.9%
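
The word counts above can be approximated by splitting each address on whitespace and tallying the tokens. The report does not state which tokenizer it uses, so whitespace splitting is an assumption here; the sketch runs on the five sample addresses only, not the full column.

```python
from collections import Counter

# The five sample addresses reported for this variable.
addresses = [
    "서울특별시 서초구 서초동 1675-13",
    "서울특별시 서초구 서초동 1308-1",
    "서울특별시 강남구 강남대로 338",
    "서울특별시 강남구 역삼동 821-1",
    "서울특별시 서초구 서초동 1327-2",
]

# Assumption: tokens are whitespace-separated.
token_counts = Counter(
    token for address in addresses for token in address.split()
)

for token, count in token_counts.most_common(4):
    print(token, count)
```

Because every address starts with the city name, 서울특별시 dominates the word table in the same way it dominates this small sample.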

Most occurring characters

Value | Count | Frequency (%)
(space) | 6773 | 16.4%
 | 2361 | 5.7%
1 | 2047 | 5.0%
 | 2039 | 4.9%
 | 2011 | 4.9%
 | 2009 | 4.9%
 | 2006 | 4.9%
 | 2006 | 4.9%
 | 1757 | 4.3%
- | 1403 | 3.4%
Other values (347) | 16837 | 40.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 24207 | 58.7%
Decimal Number | 8735 | 21.2%
Space Separator | 6773 | 16.4%
Dash Punctuation | 1403 | 3.4%
Uppercase Letter | 50 | 0.1%
Other Punctuation | 39 | 0.1%
Lowercase Letter | 29 | 0.1%
Close Punctuation | 5 | < 0.1%
Open Punctuation | 5 | < 0.1%
Math Symbol | 3 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 2361 | 9.8%
 | 2039 | 8.4%
 | 2011 | 8.3%
 | 2009 | 8.3%
 | 2006 | 8.3%
 | 2006 | 8.3%
 | 1757 | 7.3%
 | 807 | 3.3%
 | 351 | 1.4%
 | 342 | 1.4%
Other values (306) | 8518 | 35.2%

Lowercase Letter
Value | Count | Frequency (%)
e | 6 | 20.7%
t | 6 | 20.7%
r | 4 | 13.8%
o | 3 | 10.3%
a | 2 | 6.9%
w | 2 | 6.9%
d | 1 | 3.4%
k | 1 | 3.4%
h | 1 | 3.4%
f | 1 | 3.4%
Other values (2) | 2 | 6.9%

Uppercase Letter
Value | Count | Frequency (%)
B | 18 | 36.0%
F | 11 | 22.0%
G | 5 | 10.0%
S | 4 | 8.0%
T | 3 | 6.0%
D | 2 | 4.0%
K | 2 | 4.0%
I | 2 | 4.0%
Y | 1 | 2.0%
P | 1 | 2.0%

Decimal Number
Value | Count | Frequency (%)
1 | 2047 | 23.4%
2 | 1355 | 15.5%
3 | 1098 | 12.6%
4 | 816 | 9.3%
5 | 735 | 8.4%
6 | 624 | 7.1%
0 | 570 | 6.5%
7 | 523 | 6.0%
8 | 523 | 6.0%
9 | 444 | 5.1%

Other Punctuation
Value | Count | Frequency (%)
, | 34 | 87.2%
/ | 3 | 7.7%
. | 2 | 5.1%

Space Separator
Value | Count | Frequency (%)
(space) | 6773 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1403 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 5 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 5 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 24205 | 58.7%
Common | 16963 | 41.1%
Latin | 79 | 0.2%
Han | 2 | < 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 2361 | 9.8%
 | 2039 | 8.4%
 | 2011 | 8.3%
 | 2009 | 8.3%
 | 2006 | 8.3%
 | 2006 | 8.3%
 | 1757 | 7.3%
 | 807 | 3.3%
 | 351 | 1.5%
 | 342 | 1.4%
Other values (304) | 8516 | 35.2%

Latin
Value | Count | Frequency (%)
B | 18 | 22.8%
F | 11 | 13.9%
e | 6 | 7.6%
t | 6 | 7.6%
G | 5 | 6.3%
S | 4 | 5.1%
r | 4 | 5.1%
o | 3 | 3.8%
T | 3 | 3.8%
D | 2 | 2.5%
Other values (13) | 17 | 21.5%

Common
Value | Count | Frequency (%)
(space) | 6773 | 39.9%
1 | 2047 | 12.1%
- | 1403 | 8.3%
2 | 1355 | 8.0%
3 | 1098 | 6.5%
4 | 816 | 4.8%
5 | 735 | 4.3%
6 | 624 | 3.7%
0 | 570 | 3.4%
7 | 523 | 3.1%
Other values (8) | 1019 | 6.0%

Han
Value | Count | Frequency (%)
 | 1 | 50.0%
 | 1 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 24205 | 58.7%
ASCII | 17042 | 41.3%
CJK | 2 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 6773 | 39.7%
1 | 2047 | 12.0%
- | 1403 | 8.2%
2 | 1355 | 8.0%
3 | 1098 | 6.4%
4 | 816 | 4.8%
5 | 735 | 4.3%
6 | 624 | 3.7%
0 | 570 | 3.3%
7 | 523 | 3.1%
Other values (31) | 1098 | 6.4%

Hangul
Value | Count | Frequency (%)
 | 2361 | 9.8%
 | 2039 | 8.4%
 | 2011 | 8.3%
 | 2009 | 8.3%
 | 2006 | 8.3%
 | 2006 | 8.3%
 | 1757 | 7.3%
 | 807 | 3.3%
 | 351 | 1.5%
 | 342 | 1.4%
Other values (304) | 8516 | 35.2%

CJK
Value | Count | Frequency (%)
 | 1 | 50.0%
 | 1 | 50.0%

RSTRNT_TRDAR_NM
Categorical

HIGH CORRELATION 

Distinct: 28
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB
Value counts: 마포구 (313), 종로구 (265), 중구 (236), 용산구 (172), 광진구 (148), Other values (23): 866

Length

Max length: 6
Median length: 3
Mean length: 3.007
Min length: 1

Unique

Unique: 6
Unique (%): 0.3%

Sample

1st row: 서초구
2nd row: 서초구
3rd row: 강남구
4th row: 강남구
5th row: 서초구

Common Values

Value | Count | Frequency (%)
마포구 | 313 | 15.7%
종로구 | 265 | 13.2%
중구 | 236 | 11.8%
용산구 | 172 | 8.6%
광진구 | 148 | 7.4%
서대문구 | 143 | 7.1%
강남구 | 110 | 5.5%
동대문구 | 89 | 4.5%
성북구 | 84 | 4.2%
노원구 | 72 | 3.6%
Other values (18) | 368 | 18.4%

Length

[Figure: Histogram of lengths of the category]

Interactions


Correlations

 | RSTRNT_ID | RSTRNT_TRDAR_NM
RSTRNT_ID | 1.000 | 0.912
RSTRNT_TRDAR_NM | 0.912 | 1.000

 | RSTRNT_ID | RSTRNT_TRDAR_NM
RSTRNT_ID | 1.000 | 0.637
RSTRNT_TRDAR_NM | 0.637 | 1.000
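
The report does not say which association measure produced the values above. One common way to relate a numeric column to a categorical one is to bin the numeric column and compute Cramér's V on the resulting contingency table. The sketch below shows that approach on toy data; `cramers_v` and `bin_equal_width` are hypothetical helpers written for illustration, not the profiling tool's implementation, and no bias correction is applied.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Cramér's V (uncorrected) between two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_totals = Counter(xs)
    y_totals = Counter(ys)
    chi2 = 0.0
    for x, x_count in x_totals.items():
        for y, y_count in y_totals.items():
            expected = x_count * y_count / n
            observed = joint[(x, y)]
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(x_totals), len(y_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

def bin_equal_width(values, bins):
    """Assign each numeric value to one of `bins` equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]

# Toy data: IDs assigned district by district, so ID range predicts district.
toy_ids = list(range(1, 13))
toy_districts = ["A"] * 4 + ["B"] * 4 + ["C"] * 4
v_grouped = cramers_v(bin_equal_width(toy_ids, 3), toy_districts)

# Toy data with no association at all.
v_independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(v_grouped, v_independent)
```

A pattern like the grouped toy case would be one plausible reason for the alert: in the sample rows, low IDs fall in 서초구 and 강남구 while the highest IDs fall mostly in 관악구, which suggests IDs were assigned roughly district by district.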

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows
(index) | CTY_NM | RSTRNT_ID | RSTRNT_NM | RSTRNT_ADDR | RSTRNT_TRDAR_NM
0 | seoul | 1 | 아리마(서초점) | 서울특별시 서초구 서초동 1675-13 | 서초구
1 | seoul | 2 | 강남교자(본점) | 서울특별시 서초구 서초동 1308-1 | 서초구
2 | seoul | 3 | 무화잠(강남점) | 서울특별시 강남구 강남대로 338 | 강남구
3 | seoul | 4 | 아이 해브 어 드림 | 서울특별시 강남구 역삼동 821-1 | 강남구
4 | seoul | 5 | 아카사카(강남점) | 서울특별시 서초구 서초동 1327-2 | 서초구
5 | seoul | 6 | 마포참숯불갈비 | 서울특별시 강남구 역삼동 817-4 | 강남구
6 | seoul | 7 | 삼성각 | 서울특별시 서초구 서초동 1330-18 | 서초구
7 | seoul | 9 | 카페로데 | 서울특별시 서초구 서초대로74길 29 | 서초구
8 | seoul | 10 | 레스토랑나무와 | 서울특별시 강남구 역삼1동 617-6 | 강남구
9 | seoul | 11 | 아이해브어드림 VIP키친 | 서울특별시 강남구 강남대로94길 55-3 윤희빌딩 1층 | 강남구

Last rows
(index) | CTY_NM | RSTRNT_ID | RSTRNT_NM | RSTRNT_ADDR | RSTRNT_TRDAR_NM
1990 | seoul | 2253 | 단두리 | 서울특별시 관악구 봉천11동 1644-1 | 관악구
1991 | seoul | 2256 | 숲속의쉼터 | 서울특별시 관악구 신림8동 1655-2 | 관악구
1992 | seoul | 2257 | 빵굼터(봉천역점) | 서울특별시 관악구 봉천동 930-39 | 관악구
1993 | seoul | 2258 | 정가네 | 서울특별시 영등포구 신길6동 4320 1층 | 영등포구
1994 | seoul | 2259 | 알미골칼국수 | 서울특별시 관악구 봉천2동 7-312 | 관악구
1995 | seoul | 2261 | 돈짜루 | 서울특별시 관악구 봉천11동 1639-2 ,3 1층 | 관악구
1996 | seoul | 2262 | 들꽃 | 서울특별시 관악구 봉천본동 958-12 | 관악구
1997 | seoul | 2263 | 임실치즈피자(신림점) | 서울특별시 관악구 신림11동 1567-6 | 관악구
1998 | seoul | 2264 | 바다회집 | 서울특별시 관악구 봉천4동 866-1 | 관악구
1999 | seoul | 2265 | 상도연탄갈비와고기뷔페 | 서울특별시 동작구 상도1동 369 | 동작구
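
The alerts raised in this report (constant CTY_NM, unique and strictly increasing RSTRNT_ID) can be checked by hand on the first ten sample rows. The sketch below uses only those rows, so it illustrates the checks rather than reproducing the full analysis; the tuple layout is an illustrative choice.

```python
# First ten sample rows as (CTY_NM, RSTRNT_ID, RSTRNT_TRDAR_NM).
rows = [
    ("seoul", 1, "서초구"),
    ("seoul", 2, "서초구"),
    ("seoul", 3, "강남구"),
    ("seoul", 4, "강남구"),
    ("seoul", 5, "서초구"),
    ("seoul", 6, "강남구"),
    ("seoul", 7, "서초구"),
    ("seoul", 9, "서초구"),
    ("seoul", 10, "강남구"),
    ("seoul", 11, "강남구"),
]

ids = [row[1] for row in rows]

# Constant: the column holds a single distinct value.
is_constant = len({row[0] for row in rows}) == 1
# Unique: no ID appears twice.
is_unique = len(set(ids)) == len(ids)
# Strictly increasing: each ID is larger than the one before it.
is_strictly_increasing = all(a < b for a, b in zip(ids, ids[1:]))
# IDs skipped within the sampled range.
skipped_ids = sorted(set(range(ids[0], ids[-1] + 1)) - set(ids))

print(is_constant, is_unique, is_strictly_increasing, skipped_ids)
```

Skipped IDs such as 8 are consistent with the column holding 2000 distinct values while its maximum is 2265.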