Overview

Dataset statistics

Number of variables: 4
Number of observations: 500
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 16.7 KiB
Average record size in memory: 34.3 B
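
The two memory figures above are linked by simple arithmetic: the average record size is the total size divided by the number of observations. The sketch below assumes the reported 16.7 KiB is a rounded total, so the result only approximately matches the reported 34.3 B. The function name is hypothetical and not part of ydata-profiling.

```python
def average_record_size(total_bytes: float, n_rows: int) -> float:
    """Return the average number of bytes per row."""
    if n_rows <= 0:
        raise ValueError("n_rows must be positive")
    return total_bytes / n_rows


# 16.7 KiB is the rounded total from the report; 1 KiB = 1024 bytes.
approx_bytes = average_record_size(16.7 * 1024, 500)
print(f"{approx_bytes:.1f} B per record")
```

The small gap to the reported value comes from rounding the total to one decimal place.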

Variable types

Numeric: 2
Text: 2

Alerts

ldgs_cd has unique values (Unique)
ldgs_fggg_nm has unique values (Unique)
ldgs_fggg_addr has unique values (Unique)

Reproduction

Analysis started: 2023-12-10 09:46:35.341467
Analysis finished: 2023-12-10 09:46:36.920699
Duration: 1.58 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

ldgs_cd
Real number (ℝ)

UNIQUE 

Distinct: 500
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25000.224
Minimum: 23184
Maximum: 26395
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 23184
5th percentile: 23527.85
Q1: 24377
Median: 25039
Q3: 25725.25
95th percentile: 26263.2
Maximum: 26395
Range: 3211
Interquartile range (IQR): 1348.25
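
The fractional percentiles above (for example 23527.85 on integer data) are consistent with linear interpolation between neighbouring order statistics. The sketch below shows that method together with the range and IQR. Treating linear interpolation as the method used here is an assumption, and the function names are hypothetical.

```python
def percentile(values, q):
    """q-th percentile (0 <= q <= 100) by linear interpolation between order statistics."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 <= q <= 100:
        raise ValueError("q must be between 0 and 100")
    data = sorted(values)
    pos = (len(data) - 1) * q / 100      # fractional index into the sorted data
    lower = int(pos)
    upper = min(lower + 1, len(data) - 1)
    return data[lower] + (data[upper] - data[lower]) * (pos - lower)


def quantile_summary(values):
    """Minimum, quartiles, maximum, range and interquartile range."""
    q1, q3 = percentile(values, 25), percentile(values, 75)
    return {
        "min": min(values),
        "q1": q1,
        "median": percentile(values, 50),
        "q3": q3,
        "max": max(values),
        "range": max(values) - min(values),
        "iqr": q3 - q1,
    }


summary = quantile_summary([1, 2, 3, 4, 5])
```

With this definition the range is always maximum minus minimum and the IQR is Q3 minus Q1, which matches the figures listed above (26395 - 23184 = 3211 and 25725.25 - 24377 = 1348.25).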

Descriptive statistics

Standard deviation: 859.41493
Coefficient of variation (CV): 0.034376289
Kurtosis: -0.93982509
Mean: 25000.224
Median Absolute Deviation (MAD): 676
Skewness: -0.24170262
Sum: 12500112
Variance: 738594.03
Monotonicity: Strictly increasing
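
Several of the descriptive statistics above are derived from one another: the coefficient of variation is the standard deviation divided by the mean, the variance is the standard deviation squared, and the sum is the mean times the number of observations. The sketch below checks those relations against the reported figures, within rounding. The helper name is hypothetical.

```python
import math


def coefficient_of_variation(std: float, mean: float) -> float:
    """Standard deviation expressed as a fraction of the mean."""
    if mean == 0:
        raise ValueError("CV is undefined for a zero mean")
    return std / mean


# Figures copied from the ldgs_cd descriptive statistics.
mean, std, variance, total, n = 25000.224, 859.41493, 738594.03, 12500112, 500

cv = coefficient_of_variation(std, mean)
cv_matches = math.isclose(cv, 0.034376289, rel_tol=1e-6)
variance_matches = math.isclose(std ** 2, variance, rel_tol=1e-6)
sum_matches = math.isclose(mean * n, total, rel_tol=1e-9)
print(cv_matches, variance_matches, sum_matches)
```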
Histogram with fixed size bins (bins=50)
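
A histogram with fixed-size bins splits the span from the minimum to the maximum into equal-width intervals and counts the values in each. The sketch below illustrates the idea only. It is not the plotting code used by the report, and the function name is hypothetical.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    if bins <= 0:
        raise ValueError("bins must be positive")
    if not values:
        raise ValueError("values must be non-empty")
    lo, hi = min(values), max(values)
    # If all values are equal the width would be 0; fall back to 1.0 so
    # every value lands in the first bin.
    width = (hi - lo) / bins or 1.0
    counts = [0] * bins
    for v in values:
        # The maximum sits on the last edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return counts, edges


counts, edges = fixed_width_histogram(list(range(11)), bins=5)
```

For ldgs_cd, 50 bins over a range of 3211 would give a bin width of about 64.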
Value | Count | Frequency (%)
23184 | 1 | 0.2%
25481 | 1 | 0.2%
25582 | 1 | 0.2%
25580 | 1 | 0.2%
25579 | 1 | 0.2%
25570 | 1 | 0.2%
25566 | 1 | 0.2%
25540 | 1 | 0.2%
25536 | 1 | 0.2%
25532 | 1 | 0.2%
Other values (490) | 490 | 98.0%

Value | Count | Frequency (%)
23184 | 1 | 0.2%
23192 | 1 | 0.2%
23216 | 1 | 0.2%
23223 | 1 | 0.2%
23224 | 1 | 0.2%
23227 | 1 | 0.2%
23231 | 1 | 0.2%
23234 | 1 | 0.2%
23237 | 1 | 0.2%
23238 | 1 | 0.2%

Value | Count | Frequency (%)
26395 | 1 | 0.2%
26394 | 1 | 0.2%
26384 | 1 | 0.2%
26381 | 1 | 0.2%
26380 | 1 | 0.2%
26374 | 1 | 0.2%
26366 | 1 | 0.2%
26359 | 1 | 0.2%
26357 | 1 | 0.2%
26356 | 1 | 0.2%

ldgs_fggg_nm
Text

UNIQUE 

Distinct: 500
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 42
Median length: 32
Mean length: 18.46
Min length: 6

Characters and Unicode

Total characters: 9230
Distinct characters: 71
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
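
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point as a two-letter code (`Ll` for Lowercase Letter, `Lu` for Uppercase Letter, `Zs` for Space Separator, `Sm` for Math Symbol). The sketch below counts categories over two names taken from the sample rows of this report. It is not the report's own implementation, and `unicodedata` does not expose script or block names.

```python
import unicodedata
from collections import Counter


def category_counts(texts):
    """Count Unicode general categories across all characters in `texts`."""
    counts = Counter()
    for text in texts:
        counts.update(unicodedata.category(ch) for ch in text)
    return counts


names = ["Jeongju Ochang M+", "Sinchon Reem"]
counts = category_counts(names)
for category, count in counts.most_common():
    print(category, count)
```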

Unique

Unique: 500
Unique (%): 100.0%

Sample

1st row: Yeongdeungpo Lifestyle F HOTEL
2nd row: Jeongju Ochang M+
3rd row: Seoul National University Station Fox
4th row: Yangjae Shuimpyo
5th row: Sinchon Reem
Value | Count | Frequency (%)
hotel | 62 | 4.3%
busan | 49 | 3.4%
incheon | 25 | 1.8%
daejeon | 24 | 1.7%
motel | 23 | 1.6%
daegu | 21 | 1.5%
check-in | 17 | 1.2%
self | 17 | 1.2%
station | 16 | 1.1%
suwon | 13 | 0.9%
Other values (675) | 1159 | 81.3%
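
The table above counts words rather than whole values. The entries (lowercase, with `check-in` kept as one token) look like the result of lowercasing each name and splitting on whitespace, though the report does not state its tokenization, so that is an assumption. A minimal sketch under that assumption, with a hypothetical function name and two names from the sample rows:

```python
from collections import Counter


def word_frequencies(texts):
    """Return (word, count, percent of all words) tuples, most common first."""
    counts = Counter(word for text in texts for word in text.lower().split())
    total = sum(counts.values())
    return [(word, n, 100 * n / total) for word, n in counts.most_common()]


rows = word_frequencies(["Sinchon Reem", "Sinchon Baron de Paris"])
for word, n, pct in rows:
    print(f"{word} | {n} | {pct:.1f}%")
```

Percentages here are relative to the total number of words, not the number of rows, which is why the shares in the table above are small even for frequent words.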

Most occurring characters

Value | Count | Frequency (%)
n | 986 | 10.7%
(space) | 926 | 10.0%
o | 806 | 8.7%
e | 799 | 8.7%
a | 641 | 6.9%
g | 415 | 4.5%
u | 381 | 4.1%
i | 315 | 3.4%
l | 264 | 2.9%
h | 254 | 2.8%
Other values (61) | 3443 | 37.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 6542 | 70.9%
Uppercase Letter | 1537 | 16.7%
Space Separator | 926 | 10.0%
Dash Punctuation | 72 | 0.8%
Open Punctuation | 46 | 0.5%
Close Punctuation | 46 | 0.5%
Decimal Number | 41 | 0.4%
Other Punctuation | 16 | 0.2%
Math Symbol | 3 | < 0.1%
Other Letter | 1 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
n | 986 | 15.1%
o | 806 | 12.3%
e | 799 | 12.2%
a | 641 | 9.8%
g | 415 | 6.3%
u | 381 | 5.8%
i | 315 | 4.8%
l | 264 | 4.0%
h | 254 | 3.9%
s | 236 | 3.6%
Other values (16) | 1445 | 22.1%

Uppercase Letter
Value | Count | Frequency (%)
S | 214 | 13.9%
H | 124 | 8.1%
B | 112 | 7.3%
D | 103 | 6.7%
G | 99 | 6.4%
C | 96 | 6.2%
M | 93 | 6.1%
J | 73 | 4.7%
A | 71 | 4.6%
I | 64 | 4.2%
Other values (16) | 488 | 31.8%

Decimal Number
Value | Count | Frequency (%)
1 | 12 | 29.3%
2 | 12 | 29.3%
9 | 4 | 9.8%
5 | 3 | 7.3%
3 | 3 | 7.3%
6 | 3 | 7.3%
0 | 2 | 4.9%
4 | 1 | 2.4%
7 | 1 | 2.4%

Other Punctuation
Value | Count | Frequency (%)
, | 7 | 43.8%
& | 7 | 43.8%
/ | 1 | 6.2%
# | 1 | 6.2%

Space Separator
Value | Count | Frequency (%)
(space) | 926 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 72 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 46 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 46 | 100.0%

Math Symbol
Value | Count | Frequency (%)
+ | 3 | 100.0%

Other Letter
Value | Count | Frequency (%)
챰 | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 8079 | 87.5%
Common | 1150 | 12.5%
Hangul | 1 | < 0.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
n | 986 | 12.2%
o | 806 | 10.0%
e | 799 | 9.9%
a | 641 | 7.9%
g | 415 | 5.1%
u | 381 | 4.7%
i | 315 | 3.9%
l | 264 | 3.3%
h | 254 | 3.1%
s | 236 | 2.9%
Other values (42) | 2982 | 36.9%

Common
Value | Count | Frequency (%)
(space) | 926 | 80.5%
- | 72 | 6.3%
( | 46 | 4.0%
) | 46 | 4.0%
1 | 12 | 1.0%
2 | 12 | 1.0%
, | 7 | 0.6%
& | 7 | 0.6%
9 | 4 | 0.3%
5 | 3 | 0.3%
Other values (8) | 15 | 1.3%

Hangul
Value | Count | Frequency (%)
챰 | 1 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 9229 | > 99.9%
Hangul | 1 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
n | 986 | 10.7%
(space) | 926 | 10.0%
o | 806 | 8.7%
e | 799 | 8.7%
a | 641 | 6.9%
g | 415 | 4.5%
u | 381 | 4.1%
i | 315 | 3.4%
l | 264 | 2.9%
h | 254 | 2.8%
Other values (60) | 3442 | 37.3%

Hangul
Value | Count | Frequency (%)
챰 | 1 | 100.0%

ldgs_zip_no
Real number (ℝ)

Distinct: 378
Distinct (%): 75.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29657.572
Minimum: 1062
Maximum: 63129
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 1062
5th percentile: 3139
Q1: 14483
Median: 31123.5
Q3: 46544
95th percentile: 57796.4
Maximum: 63129
Range: 62067
Interquartile range (IQR): 32061

Descriptive statistics

Standard deviation: 17695.081
Coefficient of variation (CV): 0.59664631
Kurtosis: -1.223155
Mean: 29657.572
Median Absolute Deviation (MAD): 16157.5
Skewness: 0.045576311
Sum: 14828786
Variance: 3.1311589 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
3779 | 6 | 1.2%
48094 | 4 | 0.8%
1062 | 4 | 0.8%
3139 | 4 | 0.8%
8754 | 4 | 0.8%
42714 | 4 | 0.8%
34186 | 4 | 0.8%
53014 | 4 | 0.8%
25770 | 4 | 0.8%
61963 | 4 | 0.8%
Other values (368) | 458 | 91.6%

Value | Count | Frequency (%)
1062 | 4 | 0.8%
1073 | 2 | 0.4%
1081 | 1 | 0.2%
1164 | 1 | 0.2%
1221 | 1 | 0.2%
1914 | 1 | 0.2%
2149 | 1 | 0.2%
2163 | 1 | 0.2%
2444 | 1 | 0.2%
2468 | 1 | 0.2%

Value | Count | Frequency (%)
63129 | 1 | 0.2%
62363 | 1 | 0.2%
62359 | 2 | 0.4%
62278 | 1 | 0.2%
61963 | 4 | 0.8%
61938 | 1 | 0.2%
61250 | 1 | 0.2%
61229 | 1 | 0.2%
59769 | 2 | 0.4%
59733 | 1 | 0.2%

ldgs_fggg_addr
Text

UNIQUE 

Distinct: 500
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 85
Median length: 68
Mean length: 52.222
Min length: 3

Characters and Unicode

Total characters: 26111
Distinct characters: 61
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 500
Unique (%): 100.0%

Sample

1st row: 8, Gyeongin-ro 108-gil, Yeongdeungpo-gu, Seoul (Yeongdeungpo-dong 1-ga)
2nd row: 9-12, Yangcheongsongdae-gil Ochang-eup, Cheongwon-gu, Cheongju-si, Chungcheongbuk-do
3rd row: 208-4, Gwanak-ro, Gwanak-gu, Seoul (Bongcheon-dong)
4th row: 6, Seoun-ro 1-gil, Seocho-gu, Seoul (Seocho-dong)
5th row: 16, Yeonsei-ro 4-gil, Seodaemun-gu, Seoul,
Value | Count | Frequency (%)
korea | 96 | 3.5%
gyeonggi-do | 89 | 3.3%
seoul | 85 | 3.1%
busan | 52 | 1.9%
gyeongsangbuk-do | 27 | 1.0%
incheon | 26 | 1.0%
daejeon | 25 | 0.9%
chungcheongnam-do | 23 | 0.8%
daegu | 22 | 0.8%
gangwon-do | 22 | 0.8%
Other values (1139) | 2263 | 82.9%

Most occurring characters

Value | Count | Frequency (%)
o | 2340 | 9.0%
n | 2296 | 8.8%
(space) | 2237 | 8.6%
g | 2033 | 7.8%
- | 1868 | 7.2%
, | 1702 | 6.5%
e | 1573 | 6.0%
a | 1344 | 5.1%
u | 1188 | 4.5%
i | 899 | 3.4%
Other values (51) | 8631 | 33.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 16338 | 62.6%
Space Separator | 2237 | 8.6%
Decimal Number | 1910 | 7.3%
Uppercase Letter | 1876 | 7.2%
Dash Punctuation | 1868 | 7.2%
Other Punctuation | 1704 | 6.5%
Open Punctuation | 89 | 0.3%
Close Punctuation | 89 | 0.3%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
o | 2340 | 14.3%
n | 2296 | 14.1%
g | 2033 | 12.4%
e | 1573 | 9.6%
a | 1344 | 8.2%
u | 1188 | 7.3%
i | 899 | 5.5%
l | 649 | 4.0%
r | 567 | 3.5%
s | 547 | 3.3%
Other values (14) | 2902 | 17.8%

Uppercase Letter
Value | Count | Frequency (%)
G | 377 | 20.1%
S | 285 | 15.2%
D | 187 | 10.0%
B | 159 | 8.5%
J | 141 | 7.5%
C | 139 | 7.4%
K | 97 | 5.2%
Y | 83 | 4.4%
H | 61 | 3.3%
N | 60 | 3.2%
Other values (11) | 287 | 15.3%

Decimal Number
Value | Count | Frequency (%)
1 | 422 | 22.1%
2 | 300 | 15.7%
3 | 205 | 10.7%
4 | 170 | 8.9%
6 | 155 | 8.1%
5 | 154 | 8.1%
7 | 142 | 7.4%
8 | 127 | 6.6%
0 | 119 | 6.2%
9 | 116 | 6.1%

Other Punctuation
Value | Count | Frequency (%)
, | 1702 | 99.9%
' | 2 | 0.1%

Space Separator
Value | Count | Frequency (%)
(space) | 2237 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1868 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 89 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 89 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 18214 | 69.8%
Common | 7897 | 30.2%

Most frequent character per script

Latin
Value | Count | Frequency (%)
o | 2340 | 12.8%
n | 2296 | 12.6%
g | 2033 | 11.2%
e | 1573 | 8.6%
a | 1344 | 7.4%
u | 1188 | 6.5%
i | 899 | 4.9%
l | 649 | 3.6%
r | 567 | 3.1%
s | 547 | 3.0%
Other values (35) | 4778 | 26.2%

Common
Value | Count | Frequency (%)
(space) | 2237 | 28.3%
- | 1868 | 23.7%
, | 1702 | 21.6%
1 | 422 | 5.3%
2 | 300 | 3.8%
3 | 205 | 2.6%
4 | 170 | 2.2%
6 | 155 | 2.0%
5 | 154 | 2.0%
7 | 142 | 1.8%
Other values (6) | 542 | 6.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 26111 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
o | 2340 | 9.0%
n | 2296 | 8.8%
(space) | 2237 | 8.6%
g | 2033 | 7.8%
- | 1868 | 7.2%
, | 1702 | 6.5%
e | 1573 | 6.0%
a | 1344 | 5.1%
u | 1188 | 4.5%
i | 899 | 3.4%
Other values (51) | 8631 | 33.1%

Interactions

[Four interaction plots; images not reproduced here]

Correlations

Variable | ldgs_cd | ldgs_zip_no
ldgs_cd | 1.000 | 0.247
ldgs_zip_no | 0.247 | 1.000

Variable | ldgs_cd | ldgs_zip_no
ldgs_cd | 1.000 | 0.115
ldgs_zip_no | 0.115 | 1.000
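
The report does not say which coefficient each matrix above holds. As a general illustration of why two correlation matrices over the same pair of columns can differ, the sketch below computes Pearson (linear) and Spearman (rank-based) correlation in plain Python. Both the function names and the choice of these two coefficients are assumptions for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson linear correlation coefficient."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# First five rows of the sample: ldgs_cd and ldgs_zip_no.
cd = [23184, 23192, 23216, 23223, 23224]
zip_no = [7306, 28118, 8737, 6733, 3779]
print(round(pearson(cd, zip_no), 3), round(spearman(cd, zip_no), 3))
```

Five rows are far too few to reproduce the coefficients reported for all 500 observations. The example only shows the mechanics.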

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
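
Nullity by column amounts to counting, for each column, the cells that hold no value. The sketch below shows that count for rows stored as dictionaries, treating `None` and NaN as missing (an assumption). The function name is hypothetical, the first two rows come from the sample in this report, and the third row is invented to show a non-zero count.

```python
def missing_by_column(rows, columns):
    """Count missing cells (None or NaN) per column."""
    counts = {col: 0 for col in columns}
    for row in rows:
        for col in columns:
            value = row.get(col)
            # NaN is the only float that is not equal to itself.
            if value is None or (isinstance(value, float) and value != value):
                counts[col] += 1
    return counts


columns = ["ldgs_cd", "ldgs_fggg_nm", "ldgs_zip_no"]
rows = [
    {"ldgs_cd": 23184, "ldgs_fggg_nm": "Yeongdeungpo Lifestyle F HOTEL", "ldgs_zip_no": 7306},
    {"ldgs_cd": 23192, "ldgs_fggg_nm": "Jeongju Ochang M+", "ldgs_zip_no": 28118},
    # Hypothetical row, not from the dataset, with a missing name and a NaN zip code.
    {"ldgs_cd": 99999, "ldgs_fggg_nm": None, "ldgs_zip_no": float("nan")},
]
nullity = missing_by_column(rows, columns)
print(nullity)
```

For the profiled dataset every count is zero, which is what "Missing cells: 0" in the overview reports.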

Sample

(index) | ldgs_cd | ldgs_fggg_nm | ldgs_zip_no | ldgs_fggg_addr
0 | 23184 | Yeongdeungpo Lifestyle F HOTEL | 7306 | 8, Gyeongin-ro 108-gil, Yeongdeungpo-gu, Seoul (Yeongdeungpo-dong 1-ga)
1 | 23192 | Jeongju Ochang M+ | 28118 | 9-12, Yangcheongsongdae-gil Ochang-eup, Cheongwon-gu, Cheongju-si, Chungcheongbuk-do
2 | 23216 | Seoul National University Station Fox | 8737 | 208-4, Gwanak-ro, Gwanak-gu, Seoul (Bongcheon-dong)
3 | 23223 | Yangjae Shuimpyo | 6733 | 6, Seoun-ro 1-gil, Seocho-gu, Seoul (Seocho-dong)
4 | 23224 | Sinchon Reem | 3779 | 16, Yeonsei-ro 4-gil, Seodaemun-gu, Seoul,
5 | 23227 | A + Bucheon | 14623 | 43, Buil-ro 233beon-gil, Bucheon-si, Gyeonggi-do, Korea
6 | 23231 | Pohang Evian Car Self check-in Motel | 37760 | 19-8, Saecheon-si-ro 450beon-gil, Nam-gu, Pohang, Gyeongsangbuk-do
7 | 23234 | Anmyeondo (Taean) Sun Beach | 32166 | 16 Bangpo 1-gil, Anmyeon-eup, Taean-gun, Chungcheongnam-do
8 | 23237 | Ulsan Mugeo-dong Couple | 44602 | 26, Daehak-ro 147beon-gil, Nam-gu, Ulsan
9 | 23238 | Sinchon Baron de Paris | 3779 | 8, Yeonsei-ro 2-na-gil, Seodaemun-gu, Seoul,

(index) | ldgs_cd | ldgs_fggg_nm | ldgs_zip_no | ldgs_fggg_addr
490 | 26356 | Yangsan Jungbu-dong Winners | 50629 | 22, Yangsan Station 1-gil, Yangsan-si, Gyeongnam
491 | 26357 | Incheon (Juan) Fox | 22132 | 91, Juan-ro, Nam-gu, Incheon
492 | 26359 | Suwon (Ingye-dong) Byeolheneunbam | 16489 | 27-23, Ingye-ro 124beon-gil, Paldal-gu, Suwon-si, Gyeonggi-do, Korea
493 | 26366 | Jinju K | 52683 | 4, Bibong-ro 58beon-gil, Jinju-si, Gyeongnam
494 | 26374 | Busan Spa CG | 47713 | 5, Chabatgolgol-ro 21beon-gil, Dongnae-gu, Busan
495 | 26380 | Daejeon Euneungjeongee Thema | 34927 | 12, Daeheung-ro 169beon-gil, Jung-gu, Daejeon, Korea
496 | 26381 | Bulgwang LAVILLA | 3378 | 138, Jinheung-ro, Eunpyeong-gu, Seoul
497 | 26384 | Yeongdeungpo Commodore | 7304 | Commodore Dragon, 10 Yeongjung-ro 6-gil, Yeongdeungpo-gu, Seoul
498 | 26394 | B-Olleh Ilsan | 10381 | 1575, Jungang-ro, Ilsanseo-gu, Goyang-si, Gyeonggi-do
499 | 26395 | Jeonju Sanjeong-dong Kan | 55015 | 24-3, Sanjeong 2-gil, Deokjin-gu, Jeonju-si, Jeollabuk-do