Overview

Dataset statistics

Number of variables: 5
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.0 KiB
Average record size in memory: 41.3 B

Variable types

Categorical: 2
Text: 2
DateTime: 1

Alerts

ltrtr_se_cd is highly overall correlated with cyber_ltrtr_cd_nm (High correlation)
cyber_ltrtr_cd_nm is highly overall correlated with ltrtr_se_cd (High correlation)
ltrtr_se_cd is highly imbalanced (80.6%) (Imbalance)
cyber_ltrtr_cd_nm is highly imbalanced (80.6%) (Imbalance)
authr_sj has unique values (Unique)
rgs_de has unique values (Unique)
orginl_link_url has unique values (Unique)
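
The 80.6% imbalance figure is consistent with an entropy-based score: 1 minus the Shannon entropy of the class frequencies, divided by the maximum entropy for that number of classes. The sketch below is an assumption about how such a score can be computed, not a copy of the profiler's code. The function name `imbalance_score` is hypothetical, and the counts 97 and 3 come from the Common Values tables of this report.

```python
import math

def imbalance_score(counts):
    """Entropy-based imbalance: 0 means perfectly balanced classes,
    values close to 1 mean a single class dominates."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # assumption: a single-class column gets a score of 0
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    # Normalise by the maximum entropy, log2(number of classes)
    return 1 - entropy / math.log2(n_classes)

score = imbalance_score([97, 3])  # PM: 97, LT: 3
print(f"{score:.1%}")  # 80.6%
```

With two classes split 97 to 3 the entropy is about 0.194 bits out of a possible 1 bit, which gives the 80.6% shown in the alerts.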

Reproduction

Analysis started: 2023-12-10 09:49:10.601306
Analysis finished: 2023-12-10 09:49:12.048621
Duration: 1.45 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

ltrtr_se_cd
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
PM: 97
LT: 3

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PM
2nd row: LT
3rd row: PM
4th row: PM
5th row: PM

Common Values

Value | Count | Frequency (%)
PM | 97 | 97.0%
LT | 3 | 3.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
pm | 97 | 97.0%
lt | 3 | 3.0%

cyber_ltrtr_cd_nm
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
시배달: 97
문장배달: 3

Length

Max length: 4
Median length: 3
Mean length: 3.03
Min length: 3
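
These length statistics follow directly from the value counts: 97 values of 3 characters and 3 values of 4 characters. A minimal sketch, assuming length is counted in Unicode code points after NFC normalisation (the report does not state the normalisation, so that part is an assumption):

```python
import statistics
import unicodedata

# Value counts taken from the Common Values table for cyber_ltrtr_cd_nm
value_counts = {"시배달": 97, "문장배달": 3}

# One length per observation, counted in code points of the NFC form
lengths = [
    len(unicodedata.normalize("NFC", value))
    for value, count in value_counts.items()
    for _ in range(count)
]

print("max:", max(lengths))
print("median:", statistics.median(lengths))
print("mean:", statistics.mean(lengths))
print("min:", min(lengths))
```

The mean works out to (97 × 3 + 3 × 4) / 100 = 3.03, matching the reported value.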

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 시배달
2nd row: 문장배달
3rd row: 시배달
4th row: 시배달
5th row: 시배달

Common Values

Value | Count | Frequency (%)
시배달 | 97 | 97.0%
문장배달 | 3 | 3.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
시배달 | 97 | 97.0%
문장배달 | 3 | 3.0%

authr_sj
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 44
Median length: 21
Mean length: 13.75
Min length: 8

Characters and Unicode

Total characters: 1375
Distinct characters: 320
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
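
As an illustration of these character properties, the sketch below counts the Unicode general categories in one authr_sj value using Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, and this is only one way to derive such counts, not necessarily how the report was produced. The two-letter codes map to the category names used in this report: `Lo` is Other Letter, `Zs` is Space Separator, `Ps` and `Pe` are Open and Close Punctuation, `Po` is Other Punctuation, and `Nd` is Decimal Number.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count the characters of `text` by Unicode general category."""
    return Counter(unicodedata.category(ch) for ch in text)

sample = "김영승, 「반성 673」"  # 3rd sample row of authr_sj
counts = category_counts(sample)
for category, n in counts.most_common():
    print(category, n)
```

For this value the Hangul syllables fall under `Lo`, the corner brackets under `Ps` and `Pe`, and the digits under `Nd`.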

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: 신대철, 「반딧불 하나 내려보낼까요?」
2nd row: 김승옥의「무진기행」
3rd row: 김영승, 「반성 673」
4th row: 김행숙, 「입맞춤-사춘기2」
5th row: 박용래, 「상치꽃 아욱꽃」

Value | Count | Frequency (%)
 | 7 | 2.2%
 | 2 | 0.6%
김행숙 | 2 | 0.6%
함민복 | 2 | 0.6%
이장욱 | 2 | 0.6%
진은영 | 2 | 0.6%
안도현 | 2 | 0.6%
김용택 | 2 | 0.6%
김혜순 | 2 | 0.6%
이원 | 2 | 0.6%
Other values (282) | 287 | 92.0%

Most occurring characters

Value | Count | Frequency (%)
 | 212 | 15.4%
, | 99 | 7.2%
 | 98 | 7.1%
 | 98 | 7.1%
 | 26 | 1.9%
 | 21 | 1.5%
 | 18 | 1.3%
 | 16 | 1.2%
 | 11 | 0.8%
 | 11 | 0.8%
Other values (310) | 765 | 55.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 843 | 61.3%
Space Separator | 212 | 15.4%
Close Punctuation | 101 | 7.3%
Open Punctuation | 101 | 7.3%
Other Punctuation | 100 | 7.3%
Decimal Number | 14 | 1.0%
Other Symbol | 1 | 0.1%
Initial Punctuation | 1 | 0.1%
Final Punctuation | 1 | 0.1%
Dash Punctuation | 1 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 26 | 3.1%
 | 21 | 2.5%
 | 18 | 2.1%
 | 16 | 1.9%
 | 11 | 1.3%
 | 11 | 1.3%
 | 11 | 1.3%
 | 10 | 1.2%
 | 10 | 1.2%
 | 10 | 1.2%
Other values (289) | 699 | 82.9%

Decimal Number
Value | Count | Frequency (%)
2 | 4 | 28.6%
4 | 2 | 14.3%
1 | 2 | 14.3%
9 | 2 | 14.3%
0 | 1 | 7.1%
6 | 1 | 7.1%
7 | 1 | 7.1%
3 | 1 | 7.1%

Close Punctuation
Value | Count | Frequency (%)
 | 98 | 97.0%
) | 2 | 2.0%
 | 1 | 1.0%

Open Punctuation
Value | Count | Frequency (%)
 | 98 | 97.0%
( | 2 | 2.0%
 | 1 | 1.0%

Other Punctuation
Value | Count | Frequency (%)
, | 99 | 99.0%
? | 1 | 1.0%

Space Separator
Value | Count | Frequency (%)
 | 212 | 100.0%

Other Symbol
Value | Count | Frequency (%)
 | 1 | 100.0%

Initial Punctuation
Value | Count | Frequency (%)
 | 1 | 100.0%

Final Punctuation
Value | Count | Frequency (%)
 | 1 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 837 | 60.9%
Common | 532 | 38.7%
Han | 6 | 0.4%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 26 | 3.1%
 | 21 | 2.5%
 | 18 | 2.2%
 | 16 | 1.9%
 | 11 | 1.3%
 | 11 | 1.3%
 | 11 | 1.3%
 | 10 | 1.2%
 | 10 | 1.2%
 | 10 | 1.2%
Other values (283) | 693 | 82.8%

Common
Value | Count | Frequency (%)
 | 212 | 39.8%
, | 99 | 18.6%
 | 98 | 18.4%
 | 98 | 18.4%
2 | 4 | 0.8%
( | 2 | 0.4%
4 | 2 | 0.4%
1 | 2 | 0.4%
) | 2 | 0.4%
9 | 2 | 0.4%
Other values (11) | 11 | 2.1%

Han
Value | Count | Frequency (%)
 | 1 | 16.7%
 | 1 | 16.7%
 | 1 | 16.7%
 | 1 | 16.7%
 | 1 | 16.7%
 | 1 | 16.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 837 | 60.9%
ASCII | 331 | 24.1%
None | 198 | 14.4%
CJK | 5 | 0.4%
Punctuation | 2 | 0.1%
Geometric Shapes | 1 | 0.1%
CJK Compat Ideographs | 1 | 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
 | 212 | 64.0%
, | 99 | 29.9%
2 | 4 | 1.2%
( | 2 | 0.6%
4 | 2 | 0.6%
1 | 2 | 0.6%
) | 2 | 0.6%
9 | 2 | 0.6%
0 | 1 | 0.3%
? | 1 | 0.3%
Other values (4) | 4 | 1.2%

None
Value | Count | Frequency (%)
 | 98 | 49.5%
 | 98 | 49.5%
 | 1 | 0.5%
 | 1 | 0.5%

Hangul
Value | Count | Frequency (%)
 | 26 | 3.1%
 | 21 | 2.5%
 | 18 | 2.2%
 | 16 | 1.9%
 | 11 | 1.3%
 | 11 | 1.3%
 | 11 | 1.3%
 | 10 | 1.2%
 | 10 | 1.2%
 | 10 | 1.2%
Other values (283) | 693 | 82.8%

Geometric Shapes
Value | Count | Frequency (%)
 | 1 | 100.0%

Punctuation
Value | Count | Frequency (%)
 | 1 | 50.0%
 | 1 | 50.0%

CJK
Value | Count | Frequency (%)
 | 1 | 20.0%
 | 1 | 20.0%
 | 1 | 20.0%
 | 1 | 20.0%
 | 1 | 20.0%

CJK Compat Ideographs
Value | Count | Frequency (%)
 | 1 | 100.0%

rgs_de
Date

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2007-05-31 00:00:00
Maximum: 2021-08-26 00:00:00
Histogram with fixed size bins (bins=50)
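
To give a sense of scale for the 50 fixed-size bins, the sketch below computes equal-width bin edges between the reported minimum and maximum of rgs_de. It assumes the bins span exactly that range; the report does not say how the edges were chosen, and the variable names are illustrative.

```python
from datetime import datetime

start = datetime(2007, 5, 31)  # reported Minimum
end = datetime(2021, 8, 26)    # reported Maximum
bins = 50

span = end - start   # total covered period as a timedelta
width = span / bins  # width of one equal-sized bin

# bins + 1 edges, from the minimum up to and including the maximum
edges = [start + i * width for i in range(bins + 1)]

print("span in days:", span.days)
print("bin width:", width)
print("first edges:", edges[:3])
```

Under this assumption each bin covers roughly 104 days, so the 100 observations spread over about 14 years are thinly distributed across the histogram.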

orginl_link_url
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 36
Median length: 31
Mean length: 30.77
Min length: 30

Characters and Unicode

Total characters: 3077
Distinct characters: 32
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: https://munjang.or.kr/?p=283305
2nd row: http://munjang.or.kr/archives/141504
3rd row: https://munjang.or.kr/?p=283205
4th row: https://munjang.or.kr/?p=283131
5th row: https://munjang.or.kr/?p=283077
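
The sample rows show two URL shapes: a `?p=<id>` query form and an `/archives/<id>` path form. A minimal sketch of splitting either shape into its scheme and trailing identifier with the standard library follows. The helper name `describe_url` and the reading of the number as a post identifier are assumptions made for illustration.

```python
from urllib.parse import urlparse, parse_qs

def describe_url(url):
    """Return (scheme, identifier) for the two URL shapes seen in the sample."""
    parts = urlparse(url)
    query = parse_qs(parts.query)
    if "p" in query:
        # Query form, e.g. https://munjang.or.kr/?p=283305
        identifier = query["p"][0]
    else:
        # Path form, e.g. http://munjang.or.kr/archives/141504
        identifier = parts.path.rstrip("/").split("/")[-1]
    return parts.scheme, identifier

print(describe_url("https://munjang.or.kr/?p=283305"))
print(describe_url("http://munjang.or.kr/archives/141504"))
```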

Value | Count | Frequency (%)
https://munjang.or.kr/?p=283305 | 1 | 1.0%
http://munjang.or.kr/?p=277757 | 1 | 1.0%
http://munjang.or.kr/?p=276980 | 1 | 1.0%
http://munjang.or.kr/?p=277086 | 1 | 1.0%
http://munjang.or.kr/?p=277171 | 1 | 1.0%
http://munjang.or.kr/?p=277209 | 1 | 1.0%
http://munjang.or.kr/?p=277243 | 1 | 1.0%
http://munjang.or.kr/?p=277291 | 1 | 1.0%
http://munjang.or.kr/?p=277370 | 1 | 1.0%
http://munjang.or.kr/?p=277464 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Most occurring characters

Value | Count | Frequency (%)
/ | 303 | 9.8%
r | 203 | 6.6%
n | 200 | 6.5%
. | 200 | 6.5%
t | 200 | 6.5%
p | 197 | 6.4%
2 | 131 | 4.3%
7 | 108 | 3.5%
a | 103 | 3.3%
h | 103 | 3.3%
Other values (22) | 1329 | 43.2%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1680 | 54.6%
Other Punctuation | 700 | 22.7%
Decimal Number | 600 | 19.5%
Math Symbol | 97 | 3.2%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
r | 203 | 12.1%
n | 200 | 11.9%
t | 200 | 11.9%
p | 197 | 11.7%
a | 103 | 6.1%
h | 103 | 6.1%
m | 100 | 6.0%
u | 100 | 6.0%
j | 100 | 6.0%
g | 100 | 6.0%
Other values (7) | 274 | 16.3%

Decimal Number
Value | Count | Frequency (%)
2 | 131 | 21.8%
7 | 108 | 18.0%
8 | 75 | 12.5%
0 | 46 | 7.7%
4 | 46 | 7.7%
9 | 44 | 7.3%
1 | 43 | 7.2%
3 | 39 | 6.5%
5 | 36 | 6.0%
6 | 32 | 5.3%

Other Punctuation
Value | Count | Frequency (%)
/ | 303 | 43.3%
. | 200 | 28.6%
: | 100 | 14.3%
? | 97 | 13.9%

Math Symbol
Value | Count | Frequency (%)
= | 97 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1680 | 54.6%
Common | 1397 | 45.4%

Most frequent character per script

Latin
Value | Count | Frequency (%)
r | 203 | 12.1%
n | 200 | 11.9%
t | 200 | 11.9%
p | 197 | 11.7%
a | 103 | 6.1%
h | 103 | 6.1%
m | 100 | 6.0%
u | 100 | 6.0%
j | 100 | 6.0%
g | 100 | 6.0%
Other values (7) | 274 | 16.3%

Common
Value | Count | Frequency (%)
/ | 303 | 21.7%
. | 200 | 14.3%
2 | 131 | 9.4%
7 | 108 | 7.7%
: | 100 | 7.2%
? | 97 | 6.9%
= | 97 | 6.9%
8 | 75 | 5.4%
0 | 46 | 3.3%
4 | 46 | 3.3%
Other values (5) | 194 | 13.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 3077 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
/ | 303 | 9.8%
r | 203 | 6.6%
n | 200 | 6.5%
. | 200 | 6.5%
t | 200 | 6.5%
p | 197 | 6.4%
2 | 131 | 4.3%
7 | 108 | 3.5%
a | 103 | 3.3%
h | 103 | 3.3%
Other values (22) | 1329 | 43.2%

Correlations

 | ltrtr_se_cd | cyber_ltrtr_cd_nm | authr_sj | rgs_de | orginl_link_url
ltrtr_se_cd | 1.000 | 0.963 | 1.000 | 1.000 | 1.000
cyber_ltrtr_cd_nm | 0.963 | 1.000 | 1.000 | 1.000 | 1.000
authr_sj | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
rgs_de | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
orginl_link_url | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

 | cyber_ltrtr_cd_nm | ltrtr_se_cd
cyber_ltrtr_cd_nm | 1.000 | 0.826
ltrtr_se_cd | 0.826 | 1.000

 | ltrtr_se_cd | cyber_ltrtr_cd_nm
ltrtr_se_cd | 1.000 | 0.826
cyber_ltrtr_cd_nm | 0.826 | 1.000
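
One way to arrive at the value 0.826 is a bias-corrected Cramér's V with Yates' continuity correction on the 2x2 cross-tabulation of the two categorical columns. The sketch below is an illustration under two assumptions that the report does not state: that PM always pairs with 시배달 and LT with 문장배달 (consistent with the sample rows and the identical 97/3 counts), and that this statistic is the one behind the smaller matrices. The function name `cramers_v_corrected` is hypothetical.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows).

    Yates' continuity correction is applied for 2x2 tables, an assumption
    made here because it reproduces the reported value.
    """
    n = sum(sum(row) for row in table)
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    r, k = len(row_sums), len(col_sums)
    yates = 0.5 if (r == 2 and k == 2) else 0.0

    # Pearson chi-squared statistic, with the optional continuity correction
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_sums[i] * col_sums[j] / n
            diff = abs(table[i][j] - expected)
            diff -= min(yates, diff)
            chi2 += diff ** 2 / expected

    # Bias correction of phi^2 and of the table dimensions
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

# Assumed cross-tab: rows PM/LT, columns 시배달/문장배달
table = [[97, 0], [0, 3]]
print(round(cramers_v_corrected(table), 3))  # 0.826
```

Without the continuity correction the same table yields 1.0, so the gap between 0.826 and a perfect association here reflects the correction applied to a very small minority class rather than any mismatch between the two columns.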

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

 | ltrtr_se_cd | cyber_ltrtr_cd_nm | authr_sj | rgs_de | orginl_link_url
0 | PM | 시배달 | 신대철, 「반딧불 하나 내려보낼까요?」 | 2021-08-26 | https://munjang.or.kr/?p=283305
1 | LT | 문장배달 | 김승옥의「무진기행」 | 2007-06-14 | http://munjang.or.kr/archives/141504
2 | PM | 시배달 | 김영승, 「반성 673」 | 2021-07-29 | https://munjang.or.kr/?p=283205
3 | PM | 시배달 | 김행숙, 「입맞춤-사춘기2」 | 2021-07-15 | https://munjang.or.kr/?p=283131
4 | PM | 시배달 | 박용래, 「상치꽃 아욱꽃」 | 2021-07-01 | https://munjang.or.kr/?p=283077
5 | PM | 시배달 | 황인찬 , 「법원」 | 2020-12-17 | https://munjang.or.kr/?p=282388
6 | PM | 시배달 | 신해욱, 「보고 싶은 친구에게」 | 2020-12-03 | https://munjang.or.kr/?p=282259
7 | LT | 문장배달 | 김소진의「눈사람 속의 검은 항아리」 | 2007-06-07 | http://munjang.or.kr/archives/141544
8 | PM | 시배달 | 김언희, 「트렁크」 | 2020-11-05 | https://munjang.or.kr/?p=282112
9 | PM | 시배달 | 이장욱, 「두번째 강물」 | 2020-10-22 | https://munjang.or.kr/?p=281917

 | ltrtr_se_cd | cyber_ltrtr_cd_nm | authr_sj | rgs_de | orginl_link_url
90 | PM | 시배달 | 함민복, 「숨 쉬기도 미안한 4월」 | 2017-04-13 | http://munjang.or.kr/?p=274275
91 | PM | 시배달 | 경종호, 「새싹 하나가 나기까지는」 | 2017-03-30 | http://munjang.or.kr/?p=274229
92 | PM | 시배달 | 신철규, 「눈물의 중력」 | 2017-03-16 | http://munjang.or.kr/?p=274108
93 | PM | 시배달 | 임순덕, 「부아가 나서」 | 2017-03-02 | http://munjang.or.kr/?p=274049
94 | PM | 시배달 | 이병률, 「반반」 | 2017-02-16 | http://munjang.or.kr/?p=273919
95 | PM | 시배달 | 정동철, 「포릉포릉」 | 2017-02-02 | http://munjang.or.kr/?p=273838
96 | PM | 시배달 | 박소란, 「지익」 | 2017-01-19 | http://munjang.or.kr/?p=273723
97 | PM | 시배달 | 장석남, 「여행의 메모」 | 2017-01-05 | http://munjang.or.kr/?p=273427
98 | PM | 시배달 | 성미정, 무상한 나라의 앨리스 | 2016-12-22 | http://munjang.or.kr/?p=273336
99 | PM | 시배달 | 송경동, 「참, 좆같은 풍경」 | 2016-12-08 | http://munjang.or.kr/?p=273280