Overview

Dataset statistics

Number of variables: 4
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.4 KiB
Average record size in memory: 35.3 B
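
The average record size is the total memory footprint divided by the number of observations. A minimal Python sketch of that arithmetic, using only the rounded figures from the dataset statistics; because the total is rounded to 3.4 KiB, the result (about 34.8 B) is only close to the reported 35.3 B:

```python
# Figures copied from the dataset statistics; both are rounded in the report.
total_size_kib = 3.4
n_observations = 100

# 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B per record")
```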

Variable types

Numeric: 2
Text: 2

Alerts

property_cd has unique values (Unique)
property_nm has unique values (Unique)
property_load_addr has unique values (Unique)

Reproduction

Analysis started: 2023-12-10 09:53:46.315009
Analysis finished: 2023-12-10 09:53:47.934385
Duration: 1.62 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

property_cd
Real number (ℝ)

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30026267
Minimum: 23184
Maximum: 1.0001173 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 23184
5-th percentile: 23206.95
Q1: 23278.5
Median: 23493
Q3: 23619.75
95-th percentile: 23713.2
Maximum: 1.0001173 × 10^9
Range: 1.0000941 × 10^9
Interquartile range (IQR): 341.25
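
The range and the interquartile range are simple differences of the quantiles listed here. A small Python sketch of that arithmetic, assuming the maximum is the largest value shown in this variable's value listing (1000117284):

```python
# Values copied from the quantile statistics for property_cd.
minimum = 23184
maximum = 1000117284  # largest value in the value listing for this variable
q1 = 23278.5
q3 = 23619.75

# Range spans the full data; IQR spans only the middle 50%.
value_range = maximum - minimum
iqr = q3 - q1

print(value_range)  # 1000094100, i.e. about 1.0000941 × 10^9
print(iqr)          # 341.25
```

The gap between the two (a range near 10^9 against an IQR of a few hundred) shows that a handful of very large codes sit far from the bulk of the data.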

Descriptive statistics

Standard deviation: 1.7146269 × 10^8
Coefficient of variation (CV): 5.7104233
Kurtosis: 29.897775
Mean: 30026267
Median Absolute Deviation (MAD): 172
Skewness: 5.5946495
Sum: 3.0026267 × 10^9
Variance: 2.9399456 × 10^16
Monotonicity: Not monotonic
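
Several of these descriptive statistics are tied together by simple identities: CV is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. A hedged Python check using the rounded values from the report (so agreement is only to about seven significant digits):

```python
import math

# Rounded values copied from the descriptive statistics for property_cd.
mean = 30026267
std = 1.7146269e8
n = 100

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance as the squared standard deviation
total = mean * n       # sum of all values

# Compare against the reported figures with a loose tolerance,
# since the inputs themselves are rounded.
print(math.isclose(cv, 5.7104233, rel_tol=1e-6))
print(math.isclose(variance, 2.9399456e16, rel_tol=1e-6))
print(math.isclose(total, 3.0026267e9, rel_tol=1e-6))
```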
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
23184 | 1 | 1.0%
23559 | 1 | 1.0%
23606 | 1 | 1.0%
23602 | 1 | 1.0%
23600 | 1 | 1.0%
23594 | 1 | 1.0%
23589 | 1 | 1.0%
23587 | 1 | 1.0%
23585 | 1 | 1.0%
23583 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Minimum 10 values

Value | Count | Frequency (%)
23184 | 1 | 1.0%
23196 | 1 | 1.0%
23197 | 1 | 1.0%
23204 | 1 | 1.0%
23206 | 1 | 1.0%
23207 | 1 | 1.0%
23216 | 1 | 1.0%
23223 | 1 | 1.0%
23224 | 1 | 1.0%
23227 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
1000117284 | 1 | 1.0%
1000117281 | 1 | 1.0%
1000117277 | 1 | 1.0%
23720 | 1 | 1.0%
23717 | 1 | 1.0%
23713 | 1 | 1.0%
23708 | 1 | 1.0%
23707 | 1 | 1.0%
23704 | 1 | 1.0%
23701 | 1 | 1.0%
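
The three largest codes (around 10^9) explain why the mean of 30026267 sits so far above the median of 23493. A rough Python sketch, assuming the reported sum (rounded to eight significant digits) and the three largest values listed for this variable, backs out the mean of the remaining 97 codes:

```python
# Reported sum for property_cd (rounded in the report) and row count.
total = 3.0026267e9
n = 100

# The three largest values listed for property_cd.
largest = [1000117284, 1000117281, 1000117277]

# Mean of the other 97 codes once the three large values are removed.
rest_mean = (total - sum(largest)) / (n - len(largest))

print(round(rest_mean))
```

The result lands in the 23,000 range, in line with the median, which suggests the three large codes come from a different numbering scheme rather than being ordinary values.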

property_nm
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 19
Median length: 16
Mean length: 9.01
Min length: 4

Characters and Unicode

Total characters: 901
Distinct characters: 242
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
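
As an illustration of that idea, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using the five sample values of property_nm as input. The helper name `category_counts` and the label mapping are assumptions made for this example, not part of the report; the standard library exposes general categories but not scripts or blocks, so only the category breakdown is shown.

```python
import unicodedata
from collections import Counter

# The five sample values listed for property_nm.
sample_names = [
    "영등포 라이프스타일 F HOTEL",
    "양평 돌체파르니엔펜션",
    "대전 신탄진 거기",
    "강남 렉시",
    "대전 유성 시나브로",
]

# Short category codes mapped to the long names used in the report.
LABELS = {"Lo": "Other Letter", "Zs": "Space Separator", "Lu": "Uppercase Letter"}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


counts = category_counts(sample_names)
for code, count in counts.most_common():
    print(LABELS.get(code, code), count)
```

Hangul syllables fall under "Other Letter" (Lo), which is why that category dominates the counts for this column.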

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: 영등포 라이프스타일 F HOTEL
2nd row: 양평 돌체파르니엔펜션
3rd row: 대전 신탄진 거기
4th row: 강남 렉시
5th row: 대전 유성 시나브로

Value | Count | Frequency (%)
호텔 | 18 | 6.6%
hotel | 7 | 2.6%
대전 | 6 | 2.2%
안산 | 5 | 1.8%
신촌 | 4 | 1.5%
신천 | 4 | 1.5%
포항 | 4 | 1.5%
무인텔 | 3 | 1.1%
대잠동 | 3 | 1.1%
(value not recoverable) | 3 | 1.1%
Other values (192) | 214 | 79.0%

Most occurring characters

ValueCountFrequency (%)
171
 
19.0%
28
 
3.1%
24
 
2.7%
22
 
2.4%
16
 
1.8%
16
 
1.8%
14
 
1.6%
14
 
1.6%
) 14
 
1.6%
( 14
 
1.6%
Other values (232) 568
63.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 613 | 68.0%
Space Separator | 171 | 19.0%
Uppercase Letter | 71 | 7.9%
Close Punctuation | 14 | 1.6%
Open Punctuation | 14 | 1.6%
Decimal Number | 10 | 1.1%
Lowercase Letter | 4 | 0.4%
Other Punctuation | 2 | 0.2%
Dash Punctuation | 1 | 0.1%
Math Symbol | 1 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
28
 
4.6%
24
 
3.9%
22
 
3.6%
16
 
2.6%
16
 
2.6%
14
 
2.3%
14
 
2.3%
12
 
2.0%
12
 
2.0%
12
 
2.0%
Other values (195) 443
72.3%
Uppercase Letter
ValueCountFrequency (%)
H 9
12.7%
T 8
11.3%
O 8
11.3%
M 7
9.9%
E 7
9.9%
L 7
9.9%
S 5
7.0%
A 5
7.0%
I 2
 
2.8%
U 2
 
2.8%
Other values (9) 11
15.5%
Decimal Number
ValueCountFrequency (%)
2 3
30.0%
7 2
20.0%
4 1
 
10.0%
5 1
 
10.0%
6 1
 
10.0%
3 1
 
10.0%
9 1
 
10.0%
Lowercase Letter
ValueCountFrequency (%)
l 1
25.0%
e 1
25.0%
t 1
25.0%
o 1
25.0%
Other Punctuation
ValueCountFrequency (%)
. 1
50.0%
& 1
50.0%
Space Separator
ValueCountFrequency (%)
171
100.0%
Close Punctuation
ValueCountFrequency (%)
) 14
100.0%
Open Punctuation
ValueCountFrequency (%)
( 14
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 613 | 68.0%
Common | 213 | 23.6%
Latin | 75 | 8.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
28
 
4.6%
24
 
3.9%
22
 
3.6%
16
 
2.6%
16
 
2.6%
14
 
2.3%
14
 
2.3%
12
 
2.0%
12
 
2.0%
12
 
2.0%
Other values (195) 443
72.3%
Latin
ValueCountFrequency (%)
H 9
12.0%
T 8
10.7%
O 8
10.7%
M 7
9.3%
E 7
9.3%
L 7
9.3%
S 5
 
6.7%
A 5
 
6.7%
I 2
 
2.7%
U 2
 
2.7%
Other values (13) 15
20.0%
Common
ValueCountFrequency (%)
171
80.3%
) 14
 
6.6%
( 14
 
6.6%
2 3
 
1.4%
7 2
 
0.9%
4 1
 
0.5%
- 1
 
0.5%
5 1
 
0.5%
6 1
 
0.5%
. 1
 
0.5%
Other values (4) 4
 
1.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 613 | 68.0%
ASCII | 288 | 32.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
171
59.4%
) 14
 
4.9%
( 14
 
4.9%
H 9
 
3.1%
T 8
 
2.8%
O 8
 
2.8%
M 7
 
2.4%
E 7
 
2.4%
L 7
 
2.4%
S 5
 
1.7%
Other values (27) 38
 
13.2%
Hangul
ValueCountFrequency (%)
28
 
4.6%
24
 
3.9%
22
 
3.6%
16
 
2.6%
16
 
2.6%
14
 
2.3%
14
 
2.3%
12
 
2.0%
12
 
2.0%
12
 
2.0%
Other values (195) 443
72.3%

property_zip_no
Real number (ℝ)

Distinct: 92
Distinct (%): 92.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22649.46
Minimum: 1114
Maximum: 61252
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1114
5-th percentile: 3779
Q1: 7582
Median: 18113
Q3: 34464.5
95-th percentile: 52719.4
Maximum: 61252
Range: 60138
Interquartile range (IQR): 26882.5

Descriptive statistics

Standard deviation: 16087.318
Coefficient of variation (CV): 0.7102738
Kurtosis: -0.66010979
Mean: 22649.46
Median Absolute Deviation (MAD): 12556.5
Skewness: 0.60345967
Sum: 2264946
Variance: 2.588018 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
3779 | 4 | 4.0%
5557 | 3 | 3.0%
22135 | 2 | 2.0%
21446 | 2 | 2.0%
37760 | 2 | 2.0%
38157 | 1 | 1.0%
24271 | 1 | 1.0%
39185 | 1 | 1.0%
6221 | 1 | 1.0%
1114 | 1 | 1.0%
Other values (82) | 82 | 82.0%

Minimum 10 values

Value | Count | Frequency (%)
1114 | 1 | 1.0%
2149 | 1 | 1.0%
2163 | 1 | 1.0%
3191 | 1 | 1.0%
3779 | 4 | 4.0%
4027 | 1 | 1.0%
5329 | 1 | 1.0%
5354 | 1 | 1.0%
5404 | 1 | 1.0%
5543 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
61252 | 1 | 1.0%
59623 | 1 | 1.0%
58148 | 1 | 1.0%
55023 | 1 | 1.0%
53278 | 1 | 1.0%
52690 | 1 | 1.0%
51751 | 1 | 1.0%
51004 | 1 | 1.0%
50844 | 1 | 1.0%
48496 | 1 | 1.0%
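
Because property_zip_no is typed as a number, four-digit values such as 1114 or 3779 appear where a five-digit Korean postal code would carry a leading zero. A minimal Python sketch of restoring the padded form, assuming the codes are meant to be five digits wide (the helper `format_zip` is hypothetical, not part of the report):

```python
def format_zip(value: int, width: int = 5) -> str:
    """Render a numeric postal code as a zero-padded string."""
    return str(int(value)).zfill(width)


# A few values taken from the property_zip_no listings.
zips = [7306, 12506, 34309, 6235, 1114]
formatted = [format_zip(z) for z in zips]

print(formatted)
```

Treating the column as text rather than a real number would also make statistics such as mean and variance, which have no meaning for postal codes, drop out of the profile.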

property_load_addr
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 33
Median length: 28
Mean length: 22.76
Min length: 14

Characters and Unicode

Total characters: 2276
Distinct characters: 185
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: 서울 영등포구 경인로108길 8 (영등포동1가)
2nd row: 경기도 양평군 옥천면 용천로 64
3rd row: 대전광역시 대덕구 신탄진동로7번안길 38(신탄진동)
4th row: 서울특별시 강남구 테헤란로16길 11
5th row: 대전광역시 유성구 온천북로59번길 9

Value | Count | Frequency (%)
경기도 | 18 | 3.7%
서울 | 15 | 3.1%
서울특별시 | 14 | 2.9%
경상북도 | 9 | 1.9%
남구 | 6 | 1.2%
8 | 6 | 1.2%
송파구 | 6 | 1.2%
인천광역시 | 6 | 1.2%
12 | 5 | 1.0%
안산시 | 5 | 1.0%
Other values (311) | 396 | 81.5%

Most occurring characters

ValueCountFrequency (%)
386
 
17.0%
1 91
 
4.0%
89
 
3.9%
76
 
3.3%
74
 
3.3%
72
 
3.2%
69
 
3.0%
2 62
 
2.7%
) 60
 
2.6%
( 60
 
2.6%
Other values (175) 1237
54.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1361 | 59.8%
Space Separator | 386 | 17.0%
Decimal Number | 375 | 16.5%
Close Punctuation | 60 | 2.6%
Open Punctuation | 60 | 2.6%
Dash Punctuation | 34 | 1.5%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
89
 
6.5%
76
 
5.6%
74
 
5.4%
72
 
5.3%
69
 
5.1%
47
 
3.5%
47
 
3.5%
42
 
3.1%
34
 
2.5%
32
 
2.4%
Other values (161) 779
57.2%
Decimal Number
ValueCountFrequency (%)
1 91
24.3%
2 62
16.5%
4 42
11.2%
8 39
10.4%
3 37
9.9%
5 24
 
6.4%
9 22
 
5.9%
6 21
 
5.6%
7 19
 
5.1%
0 18
 
4.8%
Space Separator
ValueCountFrequency (%)
386
100.0%
Close Punctuation
ValueCountFrequency (%)
) 60
100.0%
Open Punctuation
ValueCountFrequency (%)
( 60
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 34
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1361 | 59.8%
Common | 915 | 40.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
89
 
6.5%
76
 
5.6%
74
 
5.4%
72
 
5.3%
69
 
5.1%
47
 
3.5%
47
 
3.5%
42
 
3.1%
34
 
2.5%
32
 
2.4%
Other values (161) 779
57.2%
Common
ValueCountFrequency (%)
386
42.2%
1 91
 
9.9%
2 62
 
6.8%
) 60
 
6.6%
( 60
 
6.6%
4 42
 
4.6%
8 39
 
4.3%
3 37
 
4.0%
- 34
 
3.7%
5 24
 
2.6%
Other values (4) 80
 
8.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1361 | 59.8%
ASCII | 915 | 40.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
386
42.2%
1 91
 
9.9%
2 62
 
6.8%
) 60
 
6.6%
( 60
 
6.6%
4 42
 
4.6%
8 39
 
4.3%
3 37
 
4.0%
- 34
 
3.7%
5 24
 
2.6%
Other values (4) 80
 
8.7%
Hangul
ValueCountFrequency (%)
89
 
6.5%
76
 
5.6%
74
 
5.4%
72
 
5.3%
69
 
5.1%
47
 
3.5%
47
 
3.5%
42
 
3.1%
34
 
2.5%
32
 
2.4%
Other values (161) 779
57.2%

Interactions

[Four interaction plots; images not preserved]

Correlations

 | property_cd | property_nm | property_zip_no | property_load_addr
property_cd | 1.000 | 1.000 | 0.071 | 1.000
property_nm | 1.000 | 1.000 | 1.000 | 1.000
property_zip_no | 0.071 | 1.000 | 1.000 | 1.000
property_load_addr | 1.000 | 1.000 | 1.000 | 1.000

 | property_cd | property_zip_no
property_cd | 1.000 | 0.196
property_zip_no | 0.196 | 1.000
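
The extracted text does not say which coefficient each matrix reports, so the following is only an illustrative sketch of two common choices, Pearson and Spearman rank correlation, written with the Python standard library. It uses the first ten sample rows of property_cd and property_zip_no as input; with so few rows the results will not match the matrices above. The function names are assumptions for this example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def ranks(values):
    """Rank values from 1..n; assumes there are no ties."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# First ten sample rows of the two numeric columns.
property_cd = [23184, 1000117277, 23196, 23197, 23204,
               23206, 23207, 1000117281, 23216, 23223]
property_zip_no = [7306, 12506, 34309, 6235, 34185,
                   14542, 6220, 23059, 8737, 6733]

print(round(pearson(property_cd, property_zip_no), 3))
print(round(spearman(property_cd, property_zip_no), 3))
```

Rank-based measures are less sensitive to the few very large property_cd values, which is one reason two coefficients computed on the same pair of columns can differ.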

Missing values

Figure: A simple visualization of nullity by column.
Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

index | property_cd | property_nm | property_zip_no | property_load_addr
0 | 23184 | 영등포 라이프스타일 F HOTEL | 7306 | 서울 영등포구 경인로108길 8 (영등포동1가)
1 | 1000117277 | 양평 돌체파르니엔펜션 | 12506 | 경기도 양평군 옥천면 용천로 64
2 | 23196 | 대전 신탄진 거기 | 34309 | 대전광역시 대덕구 신탄진동로7번안길 38(신탄진동)
3 | 23197 | 강남 렉시 | 6235 | 서울특별시 강남구 테헤란로16길 11
4 | 23204 | 대전 유성 시나브로 | 34185 | 대전광역시 유성구 온천북로59번길 9
5 | 23206 | 부천(상동) MY HOTEL | 14542 | 경기도 부천시 길주로121번길 18-11(상동)
6 | 23207 | 역삼 벤 | 6220 | 서울 강남구 언주로87길 41 (역삼동)
7 | 1000117281 | 강화도 야생화카페펜션 | 23059 | 인천광역시 강화군 화도면 해안남로 1133-42
8 | 23216 | 서울대입구 폭스 | 8737 | 서울 관악구 관악로 208-4 (봉천동)
9 | 23223 | 양재 호텔 신트라 | 6733 | 서울 서초구 서운로1길 6 (서초동)

index | property_cd | property_nm | property_zip_no | property_load_addr
90 | 23683 | 안산 IMT | 15326 | 경기도 안산시 상록구 성호로1길 12-4 (일동)
91 | 23686 | 광주 곤지암 윌 무인텔 | 12807 | 경기도 광주시 초월읍 산이길 8
92 | 23693 | 광주 신안동 여기서자자 | 61252 | 광주광역시 북구 경양로119번길 29 (신안동)
93 | 23701 | 부천 노블레스 | 14643 | 경기도 부천시 부천로10번길 49
94 | 23704 | 인천(주안) 더자자 | 22135 | 인천광역시 미추홀구 석바위로101번길 2
95 | 23707 | 경산 러브웨이 | 38636 | 경상북도 경산시 향교길 66-1
96 | 23708 | 포항 영일대 포스 | 37708 | 경상북도 포항시 북구 해안로 55
97 | 23713 | 화순 퀸 무인텔 | 58148 | 전라남도 화순군 도곡면 온천1길 28-5
98 | 23717 | 동탄 첼시 | 18453 | 경기도 화성시 동탄지성로 12
99 | 23720 | 김해 어방동 69 | 50844 | 경상남도 김해시 분성로511번길 8 (어방동)