Overview

Dataset statistics

Number of variables: 18
Number of observations: 5319
Missing cells: 20337
Missing cells (%): 21.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 758.5 KiB
Average record size in memory: 146.0 B
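
The derived figures in this panel can be re-derived from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Raw counts copied from the dataset statistics panel.
n_variables = 18
n_observations = 5319
missing_cells = 20337
total_size_kib = 758.5

# Assumption: percentages are computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the results agree with the reported 21.2% and 146.0 B.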

Variable types

Numeric: 1
Text: 9
Categorical: 4
DateTime: 1
Boolean: 2
Unsupported: 1

Alerts

cp_email is highly overall correlated with cp_class and 2 other fields [High correlation]
cp_emailflag is highly overall correlated with cp_email [High correlation]
skey is highly overall correlated with last_load_dttm [High correlation]
cp_class is highly overall correlated with cp_email [High correlation]
cp_webflag is highly overall correlated with cp_email [High correlation]
last_load_dttm is highly overall correlated with skey [High correlation]
cp_email is highly imbalanced (92.3%) [Imbalance]
cp_emailflag is highly imbalanced (63.5%) [Imbalance]
cp_home has 5146 (96.7%) missing values [Missing]
cp_sanum has 2844 (53.5%) missing values [Missing]
cp_info has 1693 (31.8%) missing values [Missing]
cp_state has 5319 (100.0%) missing values [Missing]
cp_img has 5307 (99.8%) missing values [Missing]
skey has unique values [Unique]
cp_state is an unsupported type, check if it needs cleaning or further analysis [Unsupported]

Reproduction

Analysis started: 2024-04-16 13:24:57.430258
Analysis finished: 2024-04-16 13:25:00.083241
Duration: 2.65 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
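
The reported duration is the difference between the two timestamps. A minimal check using only the Python standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction panel.
started = datetime.fromisoformat("2024-04-16 13:24:57.430258")
finished = datetime.fromisoformat("2024-04-16 13:25:00.083241")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```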

Variables

skey
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 5319
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2942966
Minimum: 2940307
Maximum: 2945625
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.9 KiB

Quantile statistics

Minimum: 2940307
5-th percentile: 2940572.9
Q1: 2941636.5
Median: 2942966
Q3: 2944295.5
95-th percentile: 2945359.1
Maximum: 2945625
Range: 5318
Interquartile range (IQR): 2659
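
These quantiles are what linear interpolation gives for a run of consecutive integers. The sketch below assumes that skey takes every integer from the minimum to the maximum exactly once (suggested by 5319 distinct values over a range of 5318, but not stated in the report) and that quantiles are interpolated linearly between neighbouring order statistics. The `quantile` helper is hypothetical, not the profiler's own code.

```python
def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted sequence."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])

# Assumption: skey is the consecutive integers from minimum to maximum.
values = list(range(2940307, 2945625 + 1))

q05 = quantile(values, 0.05)
q25 = quantile(values, 0.25)
q50 = quantile(values, 0.50)
q75 = quantile(values, 0.75)
q95 = quantile(values, 0.95)
iqr = q75 - q25

print(q05, q25, q50, q75, q95, iqr)
```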

Descriptive statistics

Standard deviation: 1535.6074
Coefficient of variation (CV): 0.00052178903
Kurtosis: -1.2
Mean: 2942966
Median Absolute Deviation (MAD): 1330
Skewness: 0
Sum: 1.5653636 × 10^10
Variance: 2358090
Monotonicity: Not monotonic
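
The descriptive statistics can be cross-checked the same way. This sketch again assumes that skey consists of the consecutive integers between the reported minimum and maximum, and that the variance uses the sample (n - 1) denominator; both are assumptions inferred from the numbers rather than stated in the report.

```python
import math
from statistics import median

# Assumption: skey is the consecutive integers from minimum to maximum.
values = list(range(2940307, 2945625 + 1))
n = len(values)

total = sum(values)
mean = total / n
# Sample variance (n - 1 denominator), assumed to match the report.
sample_variance = sum((v - mean) ** 2 for v in values) / (n - 1)
std = math.sqrt(sample_variance)
cv = std / mean

# Median absolute deviation around the median.
med = median(values)
mad = median(abs(v - med) for v in values)

print(n, total, mean, sample_variance, std, cv, mad)
```

Under these assumptions the sum is 15653636154, which is the 1.5653636 × 10^10 shown above, and the variance, standard deviation, CV and MAD agree with the reported values.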
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
2945556              1      < 0.1%
2942061              1      < 0.1%
2942059              1      < 0.1%
2942058              1      < 0.1%
2942057              1      < 0.1%
2942056              1      < 0.1%
2942055              1      < 0.1%
2942054              1      < 0.1%
2942053              1      < 0.1%
2942052              1      < 0.1%
Other values (5309)  5309   99.8%

Value    Count  Frequency (%)
2940307  1      < 0.1%
2940308  1      < 0.1%
2940309  1      < 0.1%
2940310  1      < 0.1%
2940311  1      < 0.1%
2940312  1      < 0.1%
2940313  1      < 0.1%
2940314  1      < 0.1%
2940315  1      < 0.1%
2940316  1      < 0.1%

Value    Count  Frequency (%)
2945625  1      < 0.1%
2945624  1      < 0.1%
2945623  1      < 0.1%
2945622  1      < 0.1%
2945621  1      < 0.1%
2945620  1      < 0.1%
2945619  1      < 0.1%
2945618  1      < 0.1%
2945617  1      < 0.1%
2945616  1      < 0.1%

Distinct: 5052
Distinct (%): 95.0%
Missing: 0
Missing (%): 0.0%
Memory size: 41.7 KiB

Length

Max length: 21
Median length: 19
Mean length: 6.6576424
Min length: 1

Characters and Unicode

Total characters: 35412
Distinct characters: 807
Distinct categories: 12
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
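
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for two values that appear in this variable; the `category_counts` helper and the code-to-label mapping are illustrative and are not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# Illustrative mapping from two-letter category codes to long names.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Pd": "Dash Punctuation",
    "Zs": "Space Separator",
}

def category_counts(texts):
    """Count characters per Unicode general category across all texts."""
    counts = Counter()
    for text in texts:
        for ch in text:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# Two values taken from this variable's sample and word table.
counts = category_counts(["송정광일체육관", "abc-mart"])
print(counts)
```

Hangul syllables fall under "Other Letter", which is why that category dominates the tables for the Korean text variables.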

Unique

Unique: 4848
Unique (%): 91.1%

Sample

1st row: 송정광일체육관
2nd row: 사자후체육관
3rd row: 송천체육관
4th row: 용오름태권도
5th row: 승리마루다대도장
ValueCountFrequency (%)
부산은행 284
 
4.3%
어린이집 184
 
2.8%
미용실 53
 
0.8%
부산 30
 
0.4%
주)파크랜드 24
 
0.4%
웰메이드 23
 
0.3%
아가방 17
 
0.3%
음악학원 15
 
0.2%
abc-mart 14
 
0.2%
학원 14
 
0.2%
Other values (5367) 6021
90.1%

Most occurring characters

ValueCountFrequency (%)
1373
 
3.9%
1350
 
3.8%
1255
 
3.5%
1121
 
3.2%
1094
 
3.1%
946
 
2.7%
619
 
1.7%
579
 
1.6%
556
 
1.6%
533
 
1.5%
Other values (797) 25986
73.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 33171
93.7%
Space Separator 1373
 
3.9%
Uppercase Letter 401
 
1.1%
Decimal Number 144
 
0.4%
Close Punctuation 101
 
0.3%
Open Punctuation 100
 
0.3%
Lowercase Letter 64
 
0.2%
Other Punctuation 28
 
0.1%
Dash Punctuation 26
 
0.1%
Math Symbol 2
 
< 0.1%
Other values (2) 2
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1350
 
4.1%
1255
 
3.8%
1121
 
3.4%
1094
 
3.3%
946
 
2.9%
619
 
1.9%
579
 
1.7%
556
 
1.7%
533
 
1.6%
466
 
1.4%
Other values (732) 24652
74.3%
Uppercase Letter
ValueCountFrequency (%)
A 52
13.0%
B 41
10.2%
T 38
9.5%
M 36
9.0%
C 36
9.0%
S 29
 
7.2%
K 27
 
6.7%
G 22
 
5.5%
L 20
 
5.0%
R 19
 
4.7%
Other values (13) 81
20.2%
Lowercase Letter
ValueCountFrequency (%)
o 10
15.6%
e 8
12.5%
i 6
9.4%
n 6
9.4%
t 5
 
7.8%
s 4
 
6.2%
c 3
 
4.7%
m 3
 
4.7%
y 3
 
4.7%
d 3
 
4.7%
Other values (8) 13
20.3%
Decimal Number
ValueCountFrequency (%)
2 44
30.6%
1 40
27.8%
0 17
 
11.8%
3 14
 
9.7%
5 8
 
5.6%
7 8
 
5.6%
8 6
 
4.2%
4 5
 
3.5%
6 1
 
0.7%
9 1
 
0.7%
Other Punctuation
ValueCountFrequency (%)
& 10
35.7%
. 8
28.6%
, 6
21.4%
· 2
 
7.1%
! 1
 
3.6%
' 1
 
3.6%
Math Symbol
ValueCountFrequency (%)
+ 1
50.0%
~ 1
50.0%
Space Separator
ValueCountFrequency (%)
1373
100.0%
Close Punctuation
ValueCountFrequency (%)
) 101
100.0%
Open Punctuation
ValueCountFrequency (%)
( 100
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 26
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 33166
93.7%
Common 1775
 
5.0%
Latin 465
 
1.3%
Han 6
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1350
 
4.1%
1255
 
3.8%
1121
 
3.4%
1094
 
3.3%
946
 
2.9%
619
 
1.9%
579
 
1.7%
556
 
1.7%
533
 
1.6%
466
 
1.4%
Other values (727) 24647
74.3%
Latin
ValueCountFrequency (%)
A 52
 
11.2%
B 41
 
8.8%
T 38
 
8.2%
M 36
 
7.7%
C 36
 
7.7%
S 29
 
6.2%
K 27
 
5.8%
G 22
 
4.7%
L 20
 
4.3%
R 19
 
4.1%
Other values (31) 145
31.2%
Common
ValueCountFrequency (%)
1373
77.4%
) 101
 
5.7%
( 100
 
5.6%
2 44
 
2.5%
1 40
 
2.3%
- 26
 
1.5%
0 17
 
1.0%
3 14
 
0.8%
& 10
 
0.6%
5 8
 
0.5%
Other values (13) 42
 
2.4%
Han
ValueCountFrequency (%)
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 33165
93.7%
ASCII 2238
 
6.3%
CJK 6
 
< 0.1%
None 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1373
61.3%
) 101
 
4.5%
( 100
 
4.5%
A 52
 
2.3%
2 44
 
2.0%
B 41
 
1.8%
1 40
 
1.8%
T 38
 
1.7%
M 36
 
1.6%
C 36
 
1.6%
Other values (53) 377
 
16.8%
Hangul
ValueCountFrequency (%)
1350
 
4.1%
1255
 
3.8%
1121
 
3.4%
1094
 
3.3%
946
 
2.9%
619
 
1.9%
579
 
1.7%
556
 
1.7%
533
 
1.6%
466
 
1.4%
Other values (726) 24646
74.3%
None
ValueCountFrequency (%)
· 2
66.7%
1
33.3%
CJK
ValueCountFrequency (%)
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%

cp_home
Text

MISSING 

Distinct: 92
Distinct (%): 53.2%
Missing: 5146
Missing (%): 96.7%
Memory size: 41.7 KiB

Length

Max length: 43
Median length: 33
Mean length: 17.462428
Min length: 1

Characters and Unicode

Total characters: 3021
Distinct characters: 53
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 82
Unique (%): 47.4%
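
For cp_home the distinct and unique percentages only add up if they are taken relative to the non-missing rows, while the missing percentage is taken relative to all rows. That reading is an assumption inferred from the numbers, and the sketch below simply re-derives them (variable names are illustrative):

```python
# Counts copied from the cp_home panel and the dataset overview.
n_observations = 5319
missing = 5146
distinct = 92
unique = 82

# Assumption: distinct/unique shares are relative to non-missing rows.
non_missing = n_observations - missing
distinct_pct = 100 * distinct / non_missing
unique_pct = 100 * unique / non_missing
missing_pct = 100 * missing / n_observations

print(non_missing, f"{distinct_pct:.1f}%", f"{unique_pct:.1f}%", f"{missing_pct:.1f}%")
```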

Sample

1st row: www.kjc21.com
2nd row: www.gjsports.or.kr
3rd row: www.namgusports.or.kr
4th row: shopping.namyangi.com
5th row: -
ValueCountFrequency (%)
www.parkland.co.kr 44
25.4%
www.vilac.co.kr 20
 
11.6%
14
 
8.1%
www.inoti.co.kr 4
 
2.3%
www.wilshirekorea.co.kr 3
 
1.7%
http://www.pulipchae.com 2
 
1.2%
www.balancebrain.co.kr 2
 
1.2%
www.kjc21.com 2
 
1.2%
http://cafe.daum.net/geojedreamkid 1
 
0.6%
korea.mgchina.co.kr 1
 
0.6%
Other values (80) 80
46.2%

Most occurring characters

ValueCountFrequency (%)
. 420
13.9%
w 382
12.6%
a 214
 
7.1%
r 209
 
6.9%
o 199
 
6.6%
c 183
 
6.1%
k 179
 
5.9%
n 134
 
4.4%
l 101
 
3.3%
i 84
 
2.8%
Other values (43) 916
30.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2352
77.9%
Other Punctuation 478
 
15.8%
Decimal Number 126
 
4.2%
Dash Punctuation 34
 
1.1%
Space Separator 17
 
0.6%
Other Letter 10
 
0.3%
Connector Punctuation 3
 
0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
w 382
16.2%
a 214
 
9.1%
r 209
 
8.9%
o 199
 
8.5%
c 183
 
7.8%
k 179
 
7.6%
n 134
 
5.7%
l 101
 
4.3%
i 84
 
3.6%
p 81
 
3.4%
Other values (15) 586
24.9%
Decimal Number
ValueCountFrequency (%)
0 26
20.6%
2 21
16.7%
1 19
15.1%
3 11
8.7%
5 10
 
7.9%
9 10
 
7.9%
8 9
 
7.1%
6 8
 
6.3%
7 8
 
6.3%
4 4
 
3.2%
Other Letter
ValueCountFrequency (%)
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
Other Punctuation
ValueCountFrequency (%)
. 420
87.9%
/ 51
 
10.7%
: 6
 
1.3%
? 1
 
0.2%
Dash Punctuation
ValueCountFrequency (%)
- 34
100.0%
Space Separator
ValueCountFrequency (%)
17
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 3
100.0%
Math Symbol
ValueCountFrequency (%)
= 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2352
77.9%
Common 659
 
21.8%
Hangul 10
 
0.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
w 382
16.2%
a 214
 
9.1%
r 209
 
8.9%
o 199
 
8.5%
c 183
 
7.8%
k 179
 
7.6%
n 134
 
5.7%
l 101
 
4.3%
i 84
 
3.6%
p 81
 
3.4%
Other values (15) 586
24.9%
Common
ValueCountFrequency (%)
. 420
63.7%
/ 51
 
7.7%
- 34
 
5.2%
0 26
 
3.9%
2 21
 
3.2%
1 19
 
2.9%
17
 
2.6%
3 11
 
1.7%
5 10
 
1.5%
9 10
 
1.5%
Other values (8) 40
 
6.1%
Hangul
ValueCountFrequency (%)
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3011
99.7%
Hangul 10
 
0.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 420
13.9%
w 382
12.7%
a 214
 
7.1%
r 209
 
6.9%
o 199
 
6.6%
c 183
 
6.1%
k 179
 
5.9%
n 134
 
4.5%
l 101
 
3.4%
i 84
 
2.8%
Other values (33) 906
30.1%
Hangul
ValueCountFrequency (%)
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%

cp_class
Categorical

HIGH CORRELATION 

Distinct: 17
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 41.7 KiB
요식업등
1167 
어린이집
915 
학원
656 
이미용업
543 
병의원
401 
Other values (12)
1637 

Length

Max length: 7
Median length: 4
Mean length: 3.4805415
Min length: 2

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 체육시설
2nd row: 체육시설
3rd row: 체육시설
4th row: 체육시설
5th row: 체육시설

Common Values

ValueCountFrequency (%)
요식업등 1167
21.9%
어린이집 915
17.2%
학원 656
12.3%
이미용업 543
10.2%
병의원 401
 
7.5%
체육시설 330
 
6.2%
금융기관 288
 
5.4%
기타 279
 
5.2%
한의원 219
 
4.1%
안경 116
 
2.2%
Other values (7) 405
 
7.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
요식업등 1167
21.7%
어린이집 915
17.0%
학원 656
12.2%
이미용업 543
10.1%
병의원 401
 
7.5%
체육시설 330
 
6.1%
금융기관 288
 
5.4%
기타 279
 
5.2%
한의원 219
 
4.1%
안경 116
 
2.2%
Other values (8) 461
 
8.6%

cp_hgu
Categorical

Distinct: 18
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 41.7 KiB
연제구
624 
동래구
527 
남구
513 
부산진구
511 
사하구
428 
Other values (13)
2716 

Length

Max length: 4
Median length: 3
Mean length: 2.9435984
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 해운대구
2nd row: 해운대구
3rd row: 영도구
4th row: 사하구
5th row: 연제구

Common Values

ValueCountFrequency (%)
연제구 624
11.7%
동래구 527
9.9%
남구 513
9.6%
부산진구 511
9.6%
사하구 428
8.0%
수영구 421
7.9%
북구 405
7.6%
해운대구 378
7.1%
사상구 324
 
6.1%
금정구 285
 
5.4%
Other values (8) 903
17.0%

Length

Histogram of lengths of the category
Distinct: 4276
Distinct (%): 80.4%
Missing: 0
Missing (%): 0.0%
Memory size: 41.7 KiB

Length

Max length: 10
Median length: 3
Mean length: 3.0094003
Min length: 1

Characters and Unicode

Total characters: 16007
Distinct characters: 378
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3795
Unique (%): 71.3%

Sample

1st row: 박광현
2nd row: 하태환
3rd row: 하태환
4th row: 김진영
5th row: 김철재
ValueCountFrequency (%)
이장호 236
 
4.4%
곽국민 44
 
0.8%
빈대인 31
 
0.6%
박순호 26
 
0.5%
박경수 22
 
0.4%
아가방 15
 
0.3%
김지완 10
 
0.2%
홈플러스 9
 
0.2%
김영숙 8
 
0.1%
김미숙 8
 
0.1%
Other values (4279) 4928
92.3%

Most occurring characters

ValueCountFrequency (%)
1041
 
6.5%
938
 
5.9%
672
 
4.2%
468
 
2.9%
466
 
2.9%
423
 
2.6%
347
 
2.2%
347
 
2.2%
334
 
2.1%
331
 
2.1%
Other values (368) 10640
66.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 15972
99.8%
Space Separator 18
 
0.1%
Decimal Number 10
 
0.1%
Close Punctuation 2
 
< 0.1%
Open Punctuation 2
 
< 0.1%
Other Punctuation 2
 
< 0.1%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1041
 
6.5%
938
 
5.9%
672
 
4.2%
468
 
2.9%
466
 
2.9%
423
 
2.6%
347
 
2.2%
347
 
2.2%
334
 
2.1%
331
 
2.1%
Other values (359) 10605
66.4%
Decimal Number
ValueCountFrequency (%)
1 5
50.0%
2 2
 
20.0%
3 2
 
20.0%
0 1
 
10.0%
Space Separator
ValueCountFrequency (%)
18
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Other Punctuation
ValueCountFrequency (%)
· 2
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 15972
99.8%
Common 35
 
0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1041
 
6.5%
938
 
5.9%
672
 
4.2%
468
 
2.9%
466
 
2.9%
423
 
2.6%
347
 
2.2%
347
 
2.2%
334
 
2.1%
331
 
2.1%
Other values (359) 10605
66.4%
Common
ValueCountFrequency (%)
18
51.4%
1 5
 
14.3%
) 2
 
5.7%
2 2
 
5.7%
3 2
 
5.7%
( 2
 
5.7%
· 2
 
5.7%
- 1
 
2.9%
0 1
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
Hangul 15972
99.8%
ASCII 33
 
0.2%
None 2
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1041
 
6.5%
938
 
5.9%
672
 
4.2%
468
 
2.9%
466
 
2.9%
423
 
2.6%
347
 
2.2%
347
 
2.2%
334
 
2.1%
331
 
2.1%
Other values (359) 10605
66.4%
ASCII
ValueCountFrequency (%)
18
54.5%
1 5
 
15.2%
) 2
 
6.1%
2 2
 
6.1%
3 2
 
6.1%
( 2
 
6.1%
- 1
 
3.0%
0 1
 
3.0%
None
ValueCountFrequency (%)
· 2
100.0%

cp_sanum
Text

MISSING 

Distinct: 52
Distinct (%): 2.1%
Missing: 2844
Missing (%): 53.5%
Memory size: 41.7 KiB

Length

Max length: 14
Median length: 1
Mean length: 1.2076768
Min length: 1

Characters and Unicode

Total characters: 2989
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 50
Unique (%): 2.0%

Sample

1st row: 6079541472
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0
ValueCountFrequency (%)
0 2423
97.9%
6068269370 2
 
0.1%
6189611465 1
 
< 0.1%
6079088870 1
 
< 0.1%
543-70-00209 1
 
< 0.1%
6079193026 1
 
< 0.1%
123-81-64820 1
 
< 0.1%
폐업 1
 
< 0.1%
481-81-00168 1
 
< 0.1%
211-90-16777 1
 
< 0.1%
Other values (42) 42
 
1.7%
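
The table above mixes three kinds of values: a "0" placeholder (97.9% of non-missing rows), ten-digit numbers written with or without hyphens, and free text such as 폐업. A minimal normalisation sketch follows, assuming that cp_sanum is meant to hold ten-digit business registration numbers; the report does not say this, and `normalize_registration_number` is a hypothetical helper rather than part of any library.

```python
import re

def normalize_registration_number(raw):
    """Return the value as a 10-digit string, or None for anything else.

    Assumption: valid values have exactly ten digits once hyphens and
    whitespace are removed; "0" and free text count as missing.
    """
    if raw is None:
        return None
    digits = re.sub(r"[-\s]", "", str(raw))
    if re.fullmatch(r"\d{10}", digits):
        return digits
    return None

# Values taken from the table above.
examples = ["6079541472", "543-70-00209", "0", "폐업"]
normalized = [normalize_registration_number(v) for v in examples]
print(normalized)
```

Applied before profiling, a step like this would fold the placeholder rows into the missing count instead of letting "0" dominate the distribution.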

Most occurring characters

ValueCountFrequency (%)
0 2509
83.9%
6 69
 
2.3%
1 57
 
1.9%
9 53
 
1.8%
2 52
 
1.7%
- 52
 
1.7%
8 44
 
1.5%
5 40
 
1.3%
7 38
 
1.3%
3 37
 
1.2%
Other values (3) 38
 
1.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2935
98.2%
Dash Punctuation 52
 
1.7%
Other Letter 2
 
0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 2509
85.5%
6 69
 
2.4%
1 57
 
1.9%
9 53
 
1.8%
2 52
 
1.8%
8 44
 
1.5%
5 40
 
1.4%
7 38
 
1.3%
3 37
 
1.3%
4 36
 
1.2%
Other Letter
ValueCountFrequency (%)
1
50.0%
1
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 52
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2987
99.9%
Hangul 2
 
0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
0 2509
84.0%
6 69
 
2.3%
1 57
 
1.9%
9 53
 
1.8%
2 52
 
1.7%
- 52
 
1.7%
8 44
 
1.5%
5 40
 
1.3%
7 38
 
1.3%
3 37
 
1.2%
Hangul
ValueCountFrequency (%)
1
50.0%
1
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2987
99.9%
Hangul 2
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 2509
84.0%
6 69
 
2.3%
1 57
 
1.9%
9 53
 
1.8%
2 52
 
1.7%
- 52
 
1.7%
8 44
 
1.5%
5 40
 
1.3%
7 38
 
1.3%
3 37
 
1.2%
Hangul
ValueCountFrequency (%)
1
50.0%
1
50.0%
Distinct: 310
Distinct (%): 5.8%
Missing: 6
Missing (%): 0.1%
Memory size: 41.7 KiB
Minimum: 2000-01-01 00:00:00
Maximum: 2020-11-25 00:00:00
Histogram with fixed size bins (bins=50)
Distinct: 5158
Distinct (%): 97.4%
Missing: 22
Missing (%): 0.4%
Memory size: 41.7 KiB

Length

Max length: 75
Median length: 53
Mean length: 24.589201
Min length: 3

Characters and Unicode

Total characters: 130249
Distinct characters: 512
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5061
Unique (%): 95.5%

Sample

1st row: 부산광역시 해운대구 송정동 송정중앙로 21번길 66
2nd row: 부산광역시 해운대구 좌동 세실로 87 영진파스타
3rd row: 부산광역시 영도구 동삼동로 5 (동삼동)
4th row: 부산광역시 사하구 다대동 다대낙조2길 215 도시몰운대아파트
5th row: 부산광역시 연제구 연산동 쌍미천로 11
ValueCountFrequency (%)
부산광역시 5306
 
20.0%
연제구 642
 
2.4%
동래구 553
 
2.1%
남구 523
 
2.0%
부산진구 522
 
2.0%
사하구 437
 
1.6%
수영구 432
 
1.6%
북구 416
 
1.6%
해운대구 391
 
1.5%
사상구 332
 
1.2%
Other values (5866) 17036
64.1%

Most occurring characters

ValueCountFrequency (%)
21763
 
16.7%
6537
 
5.0%
6160
 
4.7%
6149
 
4.7%
1 6065
 
4.7%
5620
 
4.3%
5527
 
4.2%
5485
 
4.2%
5329
 
4.1%
2 3684
 
2.8%
Other values (502) 57930
44.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 79238
60.8%
Decimal Number 25207
 
19.4%
Space Separator 21763
 
16.7%
Dash Punctuation 2611
 
2.0%
Open Punctuation 467
 
0.4%
Close Punctuation 466
 
0.4%
Other Punctuation 263
 
0.2%
Uppercase Letter 211
 
0.2%
Lowercase Letter 15
 
< 0.1%
Math Symbol 8
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
6537
 
8.2%
6160
 
7.8%
6149
 
7.8%
5620
 
7.1%
5527
 
7.0%
5485
 
6.9%
5329
 
6.7%
3333
 
4.2%
1639
 
2.1%
1506
 
1.9%
Other values (456) 31953
40.3%
Uppercase Letter
ValueCountFrequency (%)
A 35
16.6%
T 26
12.3%
G 21
10.0%
P 20
9.5%
L 18
8.5%
B 16
7.6%
S 16
7.6%
K 14
 
6.6%
F 10
 
4.7%
C 8
 
3.8%
Other values (7) 27
12.8%
Decimal Number
ValueCountFrequency (%)
1 6065
24.1%
2 3684
14.6%
3 2912
11.6%
4 2226
 
8.8%
0 2143
 
8.5%
5 1935
 
7.7%
6 1655
 
6.6%
8 1608
 
6.4%
7 1595
 
6.3%
9 1384
 
5.5%
Lowercase Letter
ValueCountFrequency (%)
e 3
20.0%
a 3
20.0%
l 2
13.3%
g 2
13.3%
s 2
13.3%
k 2
13.3%
i 1
 
6.7%
Other Punctuation
ValueCountFrequency (%)
, 126
47.9%
@ 89
33.8%
. 27
 
10.3%
/ 18
 
6.8%
? 2
 
0.8%
& 1
 
0.4%
Math Symbol
ValueCountFrequency (%)
~ 7
87.5%
+ 1
 
12.5%
Space Separator
ValueCountFrequency (%)
21763
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2611
100.0%
Open Punctuation
ValueCountFrequency (%)
( 467
100.0%
Close Punctuation
ValueCountFrequency (%)
) 466
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 79236
60.8%
Common 50785
39.0%
Latin 226
 
0.2%
Han 2
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
6537
 
8.3%
6160
 
7.8%
6149
 
7.8%
5620
 
7.1%
5527
 
7.0%
5485
 
6.9%
5329
 
6.7%
3333
 
4.2%
1639
 
2.1%
1506
 
1.9%
Other values (454) 31951
40.3%
Latin
ValueCountFrequency (%)
A 35
15.5%
T 26
11.5%
G 21
9.3%
P 20
8.8%
L 18
8.0%
B 16
 
7.1%
S 16
 
7.1%
K 14
 
6.2%
F 10
 
4.4%
C 8
 
3.5%
Other values (14) 42
18.6%
Common
ValueCountFrequency (%)
21763
42.9%
1 6065
 
11.9%
2 3684
 
7.3%
3 2912
 
5.7%
- 2611
 
5.1%
4 2226
 
4.4%
0 2143
 
4.2%
5 1935
 
3.8%
6 1655
 
3.3%
8 1608
 
3.2%
Other values (12) 4183
 
8.2%
Han
ValueCountFrequency (%)
1
50.0%
1
50.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 79232
60.8%
ASCII 51011
39.2%
Compat Jamo 4
 
< 0.1%
CJK 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
21763
42.7%
1 6065
 
11.9%
2 3684
 
7.2%
3 2912
 
5.7%
- 2611
 
5.1%
4 2226
 
4.4%
0 2143
 
4.2%
5 1935
 
3.8%
6 1655
 
3.2%
8 1608
 
3.2%
Other values (36) 4409
 
8.6%
Hangul
ValueCountFrequency (%)
6537
 
8.3%
6160
 
7.8%
6149
 
7.8%
5620
 
7.1%
5527
 
7.0%
5485
 
6.9%
5329
 
6.7%
3333
 
4.2%
1639
 
2.1%
1506
 
1.9%
Other values (452) 31947
40.3%
Compat Jamo
ValueCountFrequency (%)
3
75.0%
1
 
25.0%
CJK
ValueCountFrequency (%)
1
50.0%
1
50.0%

cp_tel
Text

Distinct: 5114
Distinct (%): 96.1%
Missing: 0
Missing (%): 0.0%
Memory size: 41.7 KiB

Length

Max length: 14
Median length: 12
Mean length: 12.031397
Min length: 11

Characters and Unicode

Total characters: 63995
Distinct characters: 18
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4934
Unique (%): 92.8%

Sample

1st row: 051-704-6888
2nd row: 051-704-4707
3rd row: 051-404-5562
4th row: 051-264-3754
5th row: 051-266-5452
ValueCountFrequency (%)
051-000-0000 15
 
0.3%
051-810-3941 6
 
0.1%
051-506-2771 4
 
0.1%
051 4
 
0.1%
051-0000-0000 3
 
0.1%
051-851-8845 3
 
0.1%
051-000-000 3
 
0.1%
051-754-9797 3
 
0.1%
051-523-7730 3
 
0.1%
051-781-1123 3
 
0.1%
Other values (5105) 5275
99.1%

Most occurring characters

ValueCountFrequency (%)
- 10642
16.6%
5 9936
15.5%
0 9413
14.7%
1 8748
13.7%
2 4518
7.1%
3 4106
 
6.4%
6 3793
 
5.9%
7 3689
 
5.8%
8 3547
 
5.5%
4 3150
 
4.9%
Other values (8) 2453
 
3.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 53323
83.3%
Dash Punctuation 10642
 
16.6%
Other Punctuation 22
 
< 0.1%
Other Letter 4
 
< 0.1%
Space Separator 3
 
< 0.1%
Modifier Symbol 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 9936
18.6%
0 9413
17.7%
1 8748
16.4%
2 4518
8.5%
3 4106
7.7%
6 3793
 
7.1%
7 3689
 
6.9%
8 3547
 
6.7%
4 3150
 
5.9%
9 2423
 
4.5%
Other Letter
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Dash Punctuation
ValueCountFrequency (%)
- 10642
100.0%
Other Punctuation
ValueCountFrequency (%)
* 22
100.0%
Space Separator
ValueCountFrequency (%)
3
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 63991
> 99.9%
Hangul 4
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
- 10642
16.6%
5 9936
15.5%
0 9413
14.7%
1 8748
13.7%
2 4518
7.1%
3 4106
 
6.4%
6 3793
 
5.9%
7 3689
 
5.8%
8 3547
 
5.5%
4 3150
 
4.9%
Other values (4) 2449
 
3.8%
Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 63991
> 99.9%
Hangul 4
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 10642
16.6%
5 9936
15.5%
0 9413
14.7%
1 8748
13.7%
2 4518
7.1%
3 4106
 
6.4%
6 3793
 
5.9%
7 3689
 
5.8%
8 3547
 
5.5%
4 3150
 
4.9%
Other values (4) 2449
 
3.8%
Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

cp_email
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 24
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 41.7 KiB
<NA>
5021 
 
274
hush0892@gmail.com
 
2
bgjahwal@hanmail.net
 
2
korea@mgchina.co.kr
 
1
Other values (19)
 
19

Length

Max length: 24
Median length: 4
Mean length: 3.9095695
Min length: 1

Unique

Unique: 20
Unique (%): 0.4%

Sample

1st row<NA>
2nd row<NA>
3rd row<NA>
4th row<NA>
5th row<NA>

Common Values

ValueCountFrequency (%)
<NA> 5021
94.4%
274
 
5.2%
hush0892@gmail.com 2
 
< 0.1%
bgjahwal@hanmail.net 2
 
< 0.1%
korea@mgchina.co.kr 1
 
< 0.1%
megong@daum.net 1
 
< 0.1%
baromain1@naver.com 1
 
< 0.1%
wjdrlwh77@oanmail.net 1
 
< 0.1%
kmc623@gmail.com 1
 
< 0.1%
babycoolcool@hanmail.net 1
 
< 0.1%
Other values (14) 14
 
0.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
na 5021
99.5%
bgjahwal@hanmail.net 2
 
< 0.1%
hush0892@gmail.com 2
 
< 0.1%
kjhy96@hanmail.net 1
 
< 0.1%
10000soo@gmail.com 1
 
< 0.1%
sudenn1@hanmail.net 1
 
< 0.1%
allright2875@naver.com 1
 
< 0.1%
waterbag@naver.com 1
 
< 0.1%
trunk0@daum.net 1
 
< 0.1%
lovekang99@nate.com 1
 
< 0.1%
Other values (13) 13
 
0.3%

cp_emailflag
Boolean

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.3 KiB
True
4948 
False
 
371
Value  Count  Frequency (%)
True   4948   93.0%
False  371    7.0%
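
The 63.5% in the Imbalance alert for cp_emailflag is not the majority share (93.0%). The figure is consistent with an entropy-based score, one minus the Shannon entropy of the class frequencies divided by the maximum possible entropy. That the profiler computes it exactly this way is an assumption inferred from the numbers matching; `imbalance_score` below is an illustrative helper, not the library's function.

```python
import math

def imbalance_score(counts):
    """1 - normalized Shannon entropy: 0 means balanced, 1 means one class only."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# cp_emailflag: True 4948, False 371.
flag_score = imbalance_score([4948, 371])

# cp_email: 24 distinct values with counts 5021, 274, 2, 2 and twenty singletons.
email_score = imbalance_score([5021, 274, 2, 2] + [1] * 20)

print(f"cp_emailflag: {flag_score:.1%}")
print(f"cp_email: {email_score:.1%}")
```

Under this assumption both alert values in the overview (63.5% and 92.3%) are reproduced from the counts in the common-values tables.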

cp_info
Text

MISSING 

Distinct: 153
Distinct (%): 4.2%
Missing: 1693
Missing (%): 31.8%
Memory size: 41.7 KiB

Length

Max length: 766
Median length: 1
Mean length: 17.806122
Min length: 1

Characters and Unicode

Total characters: 64565
Distinct characters: 592
Distinct categories: 13
Distinct scripts: 3
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 127
Unique (%): 3.5%

Sample

1st row<P STYLE="MARGIN-LEFT: 40PX">부산광역시태권도협회 소속 체육관 중 희망 체육관 다자녀가정 우대 참여P STYLE="MARGIN-LEFT: 40PX">
2nd row <P STYLE="MARGIN-LEFT: 40PX">부산광역시태권도협회 소속 체육관 중 희망 체육관 다자녀가정 우대 참여
3rd rowP STYLE="MARGIN-LEFT: 40PX" GT;부산광역시태권도협회 소속 체육관 중 희망 체육관 다자녀가정 우대 참여P STYLE="MARGIN-LEFT: 40PX" GT;
4th row<P STYLE="MARGIN-LEFT: 40PX;">부산광역시태권도협회 소속 체육관 중 희망 체육관 다자녀가정 우대 참여P STYLE="MARGIN-LEFT: 40PX" GT;
5th row부산광역시태권도협회 소속 체육관 중 희망 체육관 다자녀가정 우대 참여
ValueCountFrequency (%)
체육관 1090
11.9%
style="margin-left 1003
 
11.0%
40px 658
 
7.2%
p 548
 
6.0%
다자녀가정 546
 
6.0%
546
 
6.0%
희망 545
 
6.0%
우대 545
 
6.0%
소속 545
 
6.0%
참여p 479
 
5.2%
Other values (1354) 2629
28.8%

Most occurring characters

ValueCountFrequency (%)
12833
 
19.9%
T 2565
 
4.0%
P 2095
 
3.2%
" 2094
 
3.2%
E 2077
 
3.2%
L 2063
 
3.2%
G 1493
 
2.3%
0 1130
 
1.8%
1116
 
1.7%
1111
 
1.7%
Other values (582) 35988
55.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 22127
34.3%
Uppercase Letter 19930
30.9%
Space Separator 12833
19.9%
Other Punctuation 4114
 
6.4%
Decimal Number 2409
 
3.7%
Math Symbol 1972
 
3.1%
Dash Punctuation 1061
 
1.6%
Open Punctuation 48
 
0.1%
Close Punctuation 47
 
0.1%
Modifier Symbol 18
 
< 0.1%
Other values (3) 6
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1116
 
5.0%
1111
 
5.0%
1108
 
5.0%
674
 
3.0%
621
 
2.8%
603
 
2.7%
598
 
2.7%
595
 
2.7%
593
 
2.7%
590
 
2.7%
Other values (523) 14518
65.6%
Uppercase Letter
ValueCountFrequency (%)
T 2565
12.9%
P 2095
10.5%
E 2077
10.4%
L 2063
10.4%
G 1493
 
7.5%
S 1089
 
5.5%
N 1068
 
5.4%
A 1066
 
5.3%
I 1051
 
5.3%
R 1047
 
5.3%
Other values (14) 4316
21.7%
Other Punctuation
ValueCountFrequency (%)
" 2094
50.9%
: 1045
25.4%
; 600
 
14.6%
. 170
 
4.1%
, 119
 
2.9%
/ 42
 
1.0%
% 22
 
0.5%
! 11
 
0.3%
# 6
 
0.1%
* 3
 
0.1%
Decimal Number
ValueCountFrequency (%)
0 1130
46.9%
4 1028
42.7%
1 66
 
2.7%
2 48
 
2.0%
5 37
 
1.5%
3 35
 
1.5%
9 21
 
0.9%
8 20
 
0.8%
6 14
 
0.6%
7 10
 
0.4%
Math Symbol
ValueCountFrequency (%)
= 1047
53.1%
> 655
33.2%
< 261
 
13.2%
~ 7
 
0.4%
1
 
0.1%
× 1
 
0.1%
Space Separator
ValueCountFrequency (%)
12833
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1061
100.0%
Open Punctuation
ValueCountFrequency (%)
( 48
100.0%
Close Punctuation
ValueCountFrequency (%)
) 47
100.0%
Modifier Symbol
ValueCountFrequency (%)
^ 18
100.0%
Format
ValueCountFrequency (%)
 4
100.0%
Other Number
ValueCountFrequency (%)
² 1
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 22508
34.9%
Hangul 22127
34.3%
Latin 19930
30.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1116
 
5.0%
1111
 
5.0%
1108
 
5.0%
674
 
3.0%
621
 
2.8%
603
 
2.7%
598
 
2.7%
595
 
2.7%
593
 
2.7%
590
 
2.7%
Other values (523) 14518
65.6%
Common
ValueCountFrequency (%)
12833
57.0%
" 2094
 
9.3%
0 1130
 
5.0%
- 1061
 
4.7%
= 1047
 
4.7%
: 1045
 
4.6%
4 1028
 
4.6%
> 655
 
2.9%
; 600
 
2.7%
< 261
 
1.2%
Other values (25) 754
 
3.3%
Latin
ValueCountFrequency (%)
T 2565
12.9%
P 2095
10.5%
E 2077
10.4%
L 2063
10.4%
G 1493
 
7.5%
S 1089
 
5.5%
N 1068
 
5.4%
A 1066
 
5.3%
I 1051
 
5.3%
R 1047
 
5.3%
Other values (14) 4316
21.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 42428
65.7%
Hangul 22113
34.2%
Compat Jamo 14
 
< 0.1%
None 6
 
< 0.1%
Punctuation 2
 
< 0.1%
Arrows 1
 
< 0.1%
CJK Compat 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
12833
30.2%
T 2565
 
6.0%
P 2095
 
4.9%
" 2094
 
4.9%
E 2077
 
4.9%
L 2063
 
4.9%
G 1493
 
3.5%
0 1130
 
2.7%
S 1089
 
2.6%
N 1068
 
2.5%
Other values (43) 13921
32.8%
Hangul
ValueCountFrequency (%)
1116
 
5.0%
1111
 
5.0%
1108
 
5.0%
674
 
3.0%
621
 
2.8%
603
 
2.7%
598
 
2.7%
595
 
2.7%
593
 
2.7%
590
 
2.7%
Other values (522) 14504
65.6%
Compat Jamo
ValueCountFrequency (%)
14
100.0%
None
ValueCountFrequency (%)
 4
66.7%
² 1
 
16.7%
× 1
 
16.7%
Punctuation
ValueCountFrequency (%)
2
100.0%
Arrows
ValueCountFrequency (%)
1
100.0%
CJK Compat
ValueCountFrequency (%)
1
100.0%

cp_woo
Text

Distinct: 2367
Distinct (%): 44.5%
Missing: 0
Missing (%): 0.0%
Memory size: 41.7 KiB

Length

Max length: 1024
Median length: 779
Mean length: 31.123331
Min length: 1

Characters and Unicode

Total characters: 165545
Distinct characters: 595
Distinct categories: 14
Distinct scripts: 4
Distinct blocks: 10
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2019
Unique (%): 38.0%

Sample

1st row<P STYLE="MARGIN-LEFT: 40PX">다자녀가정 1인당(부,모,자녀) 수련비 10%내외할인(※중복할인불가)
2nd row <P STYLE="MARGIN-LEFT: 40PX">다자녀가정 1인당(부,모,자녀) 수련비 10%내외할인(※중복할인불가) <P STYLE="MARGIN-LEFT: 40PX"> NBSP; <P STYLE="MARGIN-LEFT: 40PX">* 탈퇴함
3rd rowP STYLE="MARGIN-LEFT: 40PX" GT;다자녀가정 1인당(부,모,자녀) 수련비 10%내외할인(※중복할인불가)
4th rowP STYLE="MARGIN-LEFT: 40PX" GT;다자녀가정 1인당(부,모,자녀) 수련비 10%내외할인(※중복할인불가)
5th row탈퇴
ValueCountFrequency (%)
할인 1946
 
6.8%
10 1391
 
4.9%
입학금 855
 
3.0%
nbsp 828
 
2.9%
652
 
2.3%
p 642
 
2.3%
면제 612
 
2.1%
20 569
 
2.0%
5 540
 
1.9%
수련비 531
 
1.9%
Other values (3087) 19963
70.0%

Most occurring characters

ValueCountFrequency (%)
24107
 
14.6%
0 6713
 
4.1%
5689
 
3.4%
4691
 
2.8%
% 4657
 
2.8%
1 4146
 
2.5%
P 3822
 
2.3%
> 3666
 
2.2%
B 3471
 
2.1%
< 3403
 
2.1%
Other values (585) 101180
61.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 66911
40.4%
Uppercase Letter 29395
17.8%
Space Separator 24107
 
14.6%
Decimal Number 15614
 
9.4%
Other Punctuation 15032
 
9.1%
Math Symbol 8564
 
5.2%
Close Punctuation 2202
 
1.3%
Open Punctuation 2198
 
1.3%
Dash Punctuation 1499
 
0.9%
Format 13
 
< 0.1%
Other values (4) 10
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5689
 
8.5%
4691
 
7.0%
2414
 
3.6%
2142
 
3.2%
2039
 
3.0%
1950
 
2.9%
1758
 
2.6%
1618
 
2.4%
1539
 
2.3%
1503
 
2.2%
Other values (515) 41568
62.1%
Uppercase Letter
ValueCountFrequency (%)
P 3822
13.0%
B 3471
11.8%
R 3158
10.7%
T 2459
 
8.4%
S 2317
 
7.9%
N 2303
 
7.8%
E 1752
 
6.0%
L 1560
 
5.3%
G 1125
 
3.8%
O 1046
 
3.6%
Other values (15) 6382
21.7%
Other Punctuation
ValueCountFrequency (%)
% 4657
31.0%
, 2390
15.9%
" 2102
14.0%
; 1890
12.6%
. 1322
 
8.8%
/ 856
 
5.7%
: 852
 
5.7%
542
 
3.6%
· 176
 
1.2%
# 127
 
0.8%
Other values (4) 118
 
0.8%
Decimal Number
ValueCountFrequency (%)
0 6713
43.0%
1 4146
26.6%
2 1669
 
10.7%
5 1372
 
8.8%
4 711
 
4.6%
3 541
 
3.5%
8 263
 
1.7%
6 88
 
0.6%
7 61
 
0.4%
9 50
 
0.3%
Math Symbol
ValueCountFrequency (%)
> 3666
42.8%
< 3403
39.7%
= 1053
 
12.3%
~ 429
 
5.0%
+ 7
 
0.1%
3
 
< 0.1%
3
 
< 0.1%
Other Number
ValueCountFrequency (%)
2
40.0%
2
40.0%
1
20.0%
Close Punctuation
ValueCountFrequency (%)
) 2201
> 99.9%
] 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 2197
> 99.9%
[ 1
 
< 0.1%
Modifier Symbol
ValueCountFrequency (%)
^ 2
66.7%
` 1
33.3%
Space Separator
ValueCountFrequency (%)
24107
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1499
100.0%
Format
ValueCountFrequency (%)
 13
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  69239  41.8%
Hangul  66907  40.4%
Latin   29395  17.8%
Han     4      < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5689
 
8.5%
4691
 
7.0%
2414
 
3.6%
2142
 
3.2%
2039
 
3.0%
1950
 
2.9%
1758
 
2.6%
1618
 
2.4%
1539
 
2.3%
1503
 
2.2%
Other values (511) 41564
62.1%
Common
ValueCountFrequency (%)
24107
34.8%
0 6713
 
9.7%
% 4657
 
6.7%
1 4146
 
6.0%
> 3666
 
5.3%
< 3403
 
4.9%
, 2390
 
3.5%
) 2201
 
3.2%
( 2197
 
3.2%
" 2102
 
3.0%
Other values (35) 13657
19.7%
Latin
ValueCountFrequency (%)
P 3822
13.0%
B 3471
11.8%
R 3158
10.7%
T 2459
 
8.4%
S 2317
 
7.9%
N 2303
 
7.8%
E 1752
 
6.0%
L 1560
 
5.3%
G 1125
 
3.8%
O 1046
 
3.6%
Other values (15) 6382
21.7%
Han
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Most occurring blocks

Value              Count  Frequency (%)
ASCII              97891  59.1%
Hangul             66896  40.4%
Punctuation        542    0.3%
None               189    0.1%
Compat Jamo        11     < 0.1%
Enclosed Alphanum  5      < 0.1%
CJK                4      < 0.1%
Geometric Shapes   3      < 0.1%
Math Operators     3      < 0.1%
Misc Symbols       1      < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
24107
24.6%
0 6713
 
6.9%
% 4657
 
4.8%
1 4146
 
4.2%
P 3822
 
3.9%
> 3666
 
3.7%
B 3471
 
3.5%
< 3403
 
3.5%
R 3158
 
3.2%
T 2459
 
2.5%
Other values (51) 38289
39.1%
Hangul
ValueCountFrequency (%)
5689
 
8.5%
4691
 
7.0%
2414
 
3.6%
2142
 
3.2%
2039
 
3.0%
1950
 
2.9%
1758
 
2.6%
1618
 
2.4%
1539
 
2.3%
1503
 
2.2%
Other values (510) 41553
62.1%
Punctuation
ValueCountFrequency (%)
542
100.0%
None
ValueCountFrequency (%)
· 176
93.1%
 13
 
6.9%
Compat Jamo
ValueCountFrequency (%)
11
100.0%
Geometric Shapes
ValueCountFrequency (%)
3
100.0%
Math Operators
ValueCountFrequency (%)
3
100.0%
Enclosed Alphanum
ValueCountFrequency (%)
2
40.0%
2
40.0%
1
20.0%
CJK
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Misc Symbols
ValueCountFrequency (%)
1
100.0%

cp_state
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing 5319
Missing (%) 100.0%
Memory size 46.9 KiB

cp_img
Text

MISSING 

Distinct 12
Distinct (%) 100.0%
Missing 5307
Missing (%) 99.8%
Memory size 41.7 KiB

Length

Max length 23
Median length 23
Mean length 21.916667
Min length 12

Characters and Unicode

Total characters 263
Distinct characters 22
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 12
Unique (%) 100.0%

Sample

1st row: 4150420130418155640.jpg
2nd row: 4545720180927181755.jpg
3rd row: 532820180927182153.jpg
4th row: 4970820180927175336.jpg
5th row: 274620130424181121.JPG
Value                    Count  Frequency (%)
4150420130418155640.jpg  1      8.3%
4545720180927181755.jpg  1      8.3%
532820180927182153.jpg   1      8.3%
4970820180927175336.jpg  1      8.3%
274620130424181121.jpg   1      8.3%
5171220130424170527.gif  1      8.3%
7563820130424174316.jpg  1      8.3%
009_6769.jpg             1      8.3%
2643120130522163717.gif  1      8.3%
4733520130424173054.gif  1      8.3%
Other values (2)         2      16.7%

Most occurring characters

ValueCountFrequency (%)
1 36
13.7%
2 32
12.2%
0 29
11.0%
4 23
8.7%
7 23
8.7%
3 21
8.0%
5 18
 
6.8%
6 12
 
4.6%
. 12
 
4.6%
8 11
 
4.2%
Other values (12) 46
17.5%

Most occurring categories

Value                  Count  Frequency (%)
Decimal Number         214    81.4%
Lowercase Letter       24     9.1%
Other Punctuation      12     4.6%
Uppercase Letter       12     4.6%
Connector Punctuation  1      0.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 36
16.8%
2 32
15.0%
0 29
13.6%
4 23
10.7%
7 23
10.7%
3 21
9.8%
5 18
8.4%
6 12
 
5.6%
8 11
 
5.1%
9 9
 
4.2%
Lowercase Letter
ValueCountFrequency (%)
g 8
33.3%
p 5
20.8%
j 5
20.8%
i 3
 
12.5%
f 3
 
12.5%
Uppercase Letter
ValueCountFrequency (%)
G 4
33.3%
J 3
25.0%
P 3
25.0%
I 1
 
8.3%
F 1
 
8.3%
Other Punctuation
ValueCountFrequency (%)
. 12
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  227    86.3%
Latin   36     13.7%

Most frequent character per script

Common
ValueCountFrequency (%)
1 36
15.9%
2 32
14.1%
0 29
12.8%
4 23
10.1%
7 23
10.1%
3 21
9.3%
5 18
7.9%
6 12
 
5.3%
. 12
 
5.3%
8 11
 
4.8%
Other values (2) 10
 
4.4%
Latin
ValueCountFrequency (%)
g 8
22.2%
p 5
13.9%
j 5
13.9%
G 4
11.1%
J 3
 
8.3%
P 3
 
8.3%
i 3
 
8.3%
f 3
 
8.3%
I 1
 
2.8%
F 1
 
2.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 263
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 36
13.7%
2 32
12.2%
0 29
11.0%
4 23
8.7%
7 23
8.7%
3 21
8.0%
5 18
 
6.8%
6 12
 
4.6%
. 12
 
4.6%
8 11
 
4.2%
Other values (12) 46
17.5%

cp_webflag
Boolean

HIGH CORRELATION 

Distinct 2
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 5.3 KiB
Value  Count  Frequency (%)
True   3464   65.1%
False  1855   34.9%

last_load_dttm
Categorical

HIGH CORRELATION 

Distinct 2
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 41.7 KiB
2020-12-21 20:49:53  4091
2020-12-21 20:49:52  1228

Length

Max length 19
Median length 19
Mean length 19
Min length 19

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row: 2020-12-21 20:49:52
2nd row: 2020-12-21 20:49:52
3rd row: 2020-12-21 20:49:52
4th row: 2020-12-21 20:49:52
5th row: 2020-12-21 20:49:52

Common Values

Value                Count  Frequency (%)
2020-12-21 20:49:53  4091   76.9%
2020-12-21 20:49:52  1228   23.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value       Count  Frequency (%)
2020-12-21  5319   50.0%
20:49:53    4091   38.5%
20:49:52    1228   11.5%
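
last_load_dttm is typed as Categorical, yet both of its distinct values look like timestamps one second apart. The sketch below parses them with the standard library and recomputes the shares reported above. The format string is an assumption inferred from the 19-character samples, and the variable names are illustrative.

```python
from datetime import datetime

# The two distinct values and their counts, as reported for last_load_dttm.
value_counts = {
    "2020-12-21 20:49:53": 4091,
    "2020-12-21 20:49:52": 1228,
}

FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed layout, inferred from the sample rows

# Parse each text value into a datetime, keeping its count.
parsed = {datetime.strptime(text, FORMAT): n for text, n in value_counts.items()}

earliest, latest = min(parsed), max(parsed)
span_seconds = (latest - earliest).total_seconds()

total = sum(parsed.values())
shares = {ts.isoformat(sep=" "): round(100.0 * n / total, 1) for ts, n in parsed.items()}

print(f"load window: {span_seconds:.0f} s over {total} rows")
print(shares)
```

A column like this most likely records a batch load time rather than a property of each record, which would also explain its strong association with the sequential key skey.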

Interactions


Correlations

variable        skey   cp_home  cp_class  cp_hgu  cp_sanum  cp_email  cp_emailflag  cp_img  cp_webflag  last_load_dttm
skey            1.000  0.246    0.445     0.371   0.407     0.223     0.068         1.000   0.224       0.991
cp_home         0.246  1.000    0.968     0.822   0.000     0.934     0.627         1.000   0.321       0.000
cp_class        0.445  0.968    1.000     0.520   0.345     0.925     0.074         1.000   0.126       0.212
cp_hgu          0.371  0.822    0.520     1.000   0.000     0.589     0.221         1.000   0.258       0.166
cp_sanum        0.407  0.000    0.345     0.000   1.000     1.000     0.329         NaN     0.000       0.227
cp_email        0.223  0.934    0.925     0.589   1.000     1.000     1.000         NaN     0.882       0.156
cp_emailflag    0.068  0.627    0.074     0.221   0.329     1.000     1.000         NaN     0.368       0.000
cp_img          1.000  1.000    1.000     1.000   NaN       NaN       NaN           1.000   NaN         1.000
cp_webflag      0.224  0.321    0.126     0.258   0.000     0.882     0.368         NaN     1.000       0.180
last_load_dttm  0.991  0.000    0.212     0.166   0.227     0.156     0.000         1.000   0.180       1.000
variable        cp_email  cp_class  cp_webflag  cp_hgu  last_load_dttm  cp_emailflag
cp_email        1.000     0.633     0.790       0.219   0.130           0.964
cp_class        0.633     1.000     0.113       0.146   0.190           0.067
cp_webflag      0.790     0.113     1.000       0.231   0.115           0.240
cp_hgu          0.219     0.146     0.231       1.000   0.149           0.198
last_load_dttm  0.130     0.190     0.115       0.149   1.000           0.000
cp_emailflag    0.964     0.067     0.240       0.198   0.000           1.000
variable        skey   cp_class  cp_hgu  cp_email  cp_emailflag  cp_webflag  last_load_dttm
skey            1.000  0.190     0.153   0.081     0.052         0.171       0.914
cp_class        0.190  1.000     0.146   0.633     0.067         0.113       0.190
cp_hgu          0.153  0.146     1.000   0.219     0.198         0.231       0.149
cp_email        0.081  0.633     0.219   1.000     0.964         0.790       0.130
cp_emailflag    0.052  0.067     0.198   0.964     1.000         0.240       0.000
cp_webflag      0.171  0.113     0.231   0.790     0.240         1.000       0.115
last_load_dttm  0.914  0.190     0.149   0.130     0.000         0.115       1.000
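
To read the last matrix above (seven variables) against the high-correlation alerts at the top of the report, the sketch below lists every off-diagonal pair at or above a cut-off. The value 0.5 is an assumed threshold, not one stated in this report, and high_pairs is a hypothetical helper.

```python
from itertools import combinations

columns = ["skey", "cp_class", "cp_hgu", "cp_email",
           "cp_emailflag", "cp_webflag", "last_load_dttm"]

# Values copied from the seven-variable correlation matrix above.
matrix = [
    [1.000, 0.190, 0.153, 0.081, 0.052, 0.171, 0.914],
    [0.190, 1.000, 0.146, 0.633, 0.067, 0.113, 0.190],
    [0.153, 0.146, 1.000, 0.219, 0.198, 0.231, 0.149],
    [0.081, 0.633, 0.219, 1.000, 0.964, 0.790, 0.130],
    [0.052, 0.067, 0.198, 0.964, 1.000, 0.240, 0.000],
    [0.171, 0.113, 0.231, 0.790, 0.240, 1.000, 0.115],
    [0.914, 0.190, 0.149, 0.130, 0.000, 0.115, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off for "high" correlation


def high_pairs(columns, matrix, threshold):
    """Return (column_a, column_b, value) for off-diagonal pairs at or above threshold."""
    pairs = []
    for i, j in combinations(range(len(columns)), 2):
        if abs(matrix[i][j]) >= threshold:
            pairs.append((columns[i], columns[j], matrix[i][j]))
    return sorted(pairs, key=lambda p: -p[2])


pairs = high_pairs(columns, matrix, THRESHOLD)
for a, b, value in pairs:
    print(f"{a} ~ {b}: {value:.3f}")
```

Under that assumed threshold the four pairs found (cp_email with cp_emailflag, cp_webflag and cp_class, plus skey with last_load_dttm) coincide with the pairs named in the alerts.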

Missing values

A simple visualization of nullity by column.
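
The per-column nullity figures can be checked by hand from the counts given in the alerts. The sketch below recomputes each missing share against the 5319 observations and prints a crude text bar per column. The helper missing_share and the bar scale (one mark per 5%) are illustrative choices.

```python
TOTAL_ROWS = 5319  # number of observations reported in the overview

# Missing-value counts taken from the report's alerts.
missing = {
    "cp_home": 5146,
    "cp_sanum": 2844,
    "cp_info": 1693,
    "cp_state": 5319,
    "cp_img": 5307,
}


def missing_share(count, total=TOTAL_ROWS):
    """Percentage of rows that are missing, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


shares = {name: missing_share(n) for name, n in missing.items()}

# One "#" per 5 percentage points, most incomplete column first.
for name, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name:<10} {pct:5.1f}% {'#' * int(pct // 5)}")
```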
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
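
One common way to quantify nullity correlation is the Pearson correlation between the missing-value indicators of two columns. The report does not state which method it uses, so treat the sketch below as an assumption. The function nullity_correlation and the toy columns are hypothetical.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns.

    Returns None when either column is never missing or always missing,
    because its indicator has zero variance and the correlation is undefined.
    """
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)


# Toy columns (hypothetical values): None marks a missing cell.
email = [None, "a@example.com", None, "b@example.com"]
sanum = [None, "101", None, "202"]      # missing in the same rows as email
info = ["x", None, "y", None]           # missing exactly where email is present
state = [None, None, None, None]        # always missing

print(nullity_correlation(email, sanum))   # columns go missing together
print(nullity_correlation(email, info))    # one is missing when the other is present
print(nullity_correlation(email, state))   # undefined for an all-missing column
```

A fully missing column such as cp_state has a constant indicator, so it carries no nullity-correlation information under this definition.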

Sample

skey cp_compname cp_home cp_class cp_hgu cp_ceoname cp_sanum cp_sidate cp_addr cp_tel cp_email cp_emailflag cp_info cp_woo cp_state cp_img cp_webflag last_load_dttm
02945556송정광일체육관<NA>체육시설해운대구박광현<NA>2015-11-03부산광역시 해운대구 송정동 송정중앙로 21번길 66051-704-6888<NA>Y<P STYLE="MARGIN-LEFT: 40PX">부산광역시태권도협회 소속 체육관 중 희망 체육관 다자녀가정 우대 참여P STYLE="MARGIN-LEFT: 40PX"><P STYLE="MARGIN-LEFT: 40PX">다자녀가정 1인당(부,모,자녀) 수련비 10%내외할인(※중복할인불가)<NA><NA>Y2020-12-21 20:49:52
12945557사자후체육관<NA>체육시설해운대구하태환<NA>2015-11-03부산광역시 해운대구 좌동 세실로 87 영진파스타051-704-4707<NA>N<P STYLE="MARGIN-LEFT: 40PX">부산광역시태권도협회 소속 체육관 중 희망 체육관 다자녀가정 우대 참여<P STYLE="MARGIN-LEFT: 40PX">다자녀가정 1인당(부,모,자녀) 수련비 10%내외할인(※중복할인불가) <P STYLE="MARGIN-LEFT: 40PX"> NBSP; <P STYLE="MARGIN-LEFT: 40PX">* 탈퇴함<NA><NA>N2020-12-21 20:49:52
22945558송천체육관<NA>체육시설영도구하태환<NA>2015-11-03부산광역시 영도구 동삼동로 5 (동삼동)051-404-5562<NA>YP STYLE="MARGIN-LEFT: 40PX" GT;부산광역시태권도협회 소속 체육관 중 희망 체육관 다자녀가정 우대 참여P STYLE="MARGIN-LEFT: 40PX" GT;P STYLE="MARGIN-LEFT: 40PX" GT;다자녀가정 1인당(부,모,자녀) 수련비 10%내외할인(※중복할인불가)<NA><NA>Y2020-12-21 20:49:52
32945559용오름태권도<NA>체육시설사하구김진영<NA>2015-11-03부산광역시 사하구 다대동 다대낙조2길 215 도시몰운대아파트051-264-3754<NA>Y<P STYLE="MARGIN-LEFT: 40PX;">부산광역시태권도협회 소속 체육관 중 희망 체육관 다자녀가정 우대 참여P STYLE="MARGIN-LEFT: 40PX" GT;P STYLE="MARGIN-LEFT: 40PX" GT;다자녀가정 1인당(부,모,자녀) 수련비 10%내외할인(※중복할인불가)<NA><NA>Y2020-12-21 20:49:52
42945560승리마루다대도장<NA>체육시설연제구김철재<NA>2015-11-03부산광역시 연제구 연산동 쌍미천로 11051-266-5452<NA>N부산광역시태권도협회 소속 체육관 중 희망 체육관 다자녀가정 우대 참여탈퇴<NA><NA>N2020-12-21 20:49:52
52945561머리사랑헤어<NA>이미용업연제구이분임<NA>2017-11-17부산광역시 연제구 연산동 과정로 238번길 15051-000-000<NA>Y미용료 10% 할인<BR><NA><NA>Y2020-12-21 20:49:52
62945562수림빛 어린이집<NA>어린이집연제구김현정<NA>2017-11-17부산광역시 연제구 연산동 과정로 연산9동051-753-9961<NA>Y첫째, 둘째자녀 입학금 30% 면제, 셋째자녀 50% 입학금 면제(가족사랑카드 제시시)<BR><NA><NA>Y2020-12-21 20:49:52
72945563금돼지<NA>요식업등연제구박말례<NA>2017-11-17부산광역시 연제구 연산동 배산로 연산6동051-868-9551<NA>N탈퇴<BR><NA><NA>N2020-12-21 20:49:52
82945564아이비츠 동구캠퍼스<NA>학원동구이숙자<NA>2013-02-27부산광역시 동구 수정동 수정동로 545-11051-467-3880<NA>Y수강료 30% 할인<NA><NA>N2020-12-21 20:49:52
92945565금정컴퓨터학원<NA>학원동래구황성윤60795414722013-02-27부산광역시 동래구 온천동 금강로 397-32 4층 온천초등학교삼거리051-556-6755kjhy96@hanmail.netY자격증전문 금정컴퓨터학원1인 5% 할인, 2인 10% 할인 폐업<BR><NA><NA>N2020-12-21 20:49:52
skey cp_compname cp_home cp_class cp_hgu cp_ceoname cp_sanum cp_sidate cp_addr cp_tel cp_email cp_emailflag cp_info cp_woo cp_state cp_img cp_webflag last_load_dttm
53092940350해금강<NA>요식업등부산시외최경희02000-01-01830-9051-645-0040<NA>Y· 음식값 10% 할인<BR><BR><NA><NA>N2020-12-21 20:49:53
53102940351롯데초록어린이집<NA>어린이집사하구윤정숙<NA>2014-04-25부산광역시 사하구 다대동 다대낙조2길 100 306동 104호051-263-3187<NA>N폐업<BR><BR>입학금 20%, 현장학습비 20% 할인<NA><NA>N2020-12-21 20:49:53
53112940352혜원칼라<NA>사진관부산진구채상민02000-01-01부산광역시 부산진구 가야1동 459-16051-892-9026<NA>Y가족, 베이비, 인물, 증명사진 20% 할인<NA><NA>N2020-12-21 20:49:53
53122940353디즈니영아전담 어린이집<NA>어린이집연제구정경옥<NA>2014-04-25부산광역시 연제구 거제동 해맞이로 23051-504-2234<NA>Y2인이상 등록시 입학준비금 5%할인<NA><NA>Y2020-12-21 20:49:53
53132940354연제루<NA>요식업등연제구전후봉<NA>2014-04-25부산광역시 연제구 연산동 안연로8번길 37051-754-4070<NA>Y<NA>음식값의 10%할인<NA><NA>N2020-12-21 20:49:53
53142940355명랑어린이집<NA>어린이집동래구김정희02000-01-01부산광역시 동래구 온천3동 1202-7051-505-6601<NA>Y입학금 면제<BR><BR><NA><NA>N2020-12-21 20:49:53
53152940356다음미용실<NA>이미용업동래구이순자02000-01-01부산광역시 동래구 온천3동 1448-1051-506-3951<NA>Y· 미용료 10% 할인(컷트 제외)스티커 부착<BR><BR><NA><NA>Y2020-12-21 20:49:53
53162940357온천이용원<NA>이미용업동래구양재승02000-01-01부산광역시 동래구 온천3동 1464-1051-503-2167Y<NA>· 이용료 10% 할인<BR><BR><NA><NA>Y2020-12-21 20:49:53
53172940358한국한의원<NA>한의원동래구윤경석02000-01-01부산광역시 동래구 온천3동 1441-17051-501-0025Y<NA>· 약값 10% 할인<BR><BR><NA><NA>Y2020-12-21 20:49:53
53182940359보명한의원<NA>한의원동래구정성엽02000-01-01부산광역시 동래구 온천3동 1411-9051-506-1255<NA>Y<NA>· 건강보험 비급여 항목 10% 할인<NA><NA>Y2020-12-21 20:49:53