Overview

Dataset statistics

Number of variables: 9
Number of observations: 155
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 11.6 KiB
Average record size in memory: 76.9 B

Variable types

Categorical: 7
Numeric: 1
Text: 1

Dataset

Description: Sample
Author: Korea Internet & Security Agency (KISA, 한국인터넷진흥원)
URL: https://www.bigdata-telecom.kr/invoke/SOKBP2603/?goodsCode=KIS00000000000000021

Alerts

수신년도 (received year) has constant value "2017" [Constant]
수신월 (received month) has constant value "1" [Constant]
이메일제목명 (email subject) has constant value "컨텐츠 비공개" (content not disclosed) [Constant]
이메일내용 (email content) has constant value "-" [Constant]
발송국가명 (sending country) is highly overall correlated with 수신일 and 1 other field [High correlation]
인터넷회사명 (internet company name) is highly overall correlated with 수신일 and 1 other field [High correlation]
수신일 (received day) is highly overall correlated with 발송국가명 and 1 other field [High correlation]
수신일 (received day) is highly imbalanced (61.8%) [Imbalance]
발송IP주소 (sending IP address) has unique values [Unique]

Reproduction

Analysis started: 2023-12-10 06:30:07.805480
Analysis finished: 2023-12-10 06:30:08.651211
Duration: 0.85 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
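
The reported duration can be cross-checked from the two timestamps. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the report.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of the report.
started = datetime.fromisoformat("2023-12-10 06:30:07.805480")
finished = datetime.fromisoformat("2023-12-10 06:30:08.651211")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```

Rounded to two decimals, the difference agrees with the 0.85 seconds shown above.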

Variables

수신년도 (received year)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2017
2nd row: 2017
3rd row: 2017
4th row: 2017
5th row: 2017

Common Values

Value | Count | Frequency (%)
2017 | 155 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Words

Value | Count | Frequency (%)
2017 | 155 | 100.0%

수신월 (received month)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 155 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Words

Value | Count | Frequency (%)
1 | 155 | 100.0%

수신일 (received day)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 3
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 2
Median length: 1
Mean length: 1.0645161
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 138 | 89.0%
10 | 10 | 6.5%
5 | 7 | 4.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Words

Value | Count | Frequency (%)
1 | 138 | 89.0%
10 | 10 | 6.5%
5 | 7 | 4.5%
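
The 61.8% imbalance figure reported for 수신일 is numerically consistent with one minus the normalized Shannon entropy of the class counts (138, 10, 7). The sketch below assumes that definition; this report does not state the formula, and the function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/log(k): 0 for a uniform split, near 1 when one class dominates.

    Assumed definition, chosen because it reproduces the reported figure.
    """
    n_classes = len(counts)
    if n_classes <= 1:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return 1 - entropy / math.log(n_classes)

# Class counts for 수신일 taken from the Common Values table.
score = imbalance_score([138, 10, 7])
print(f"{score * 100:.1f}%")
```

A perfectly balanced column would score 0 under this definition, so a value above 60% reflects how strongly day 1 dominates the 155 rows.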

수신시분초 (received hour, minute, second)
Real number (ℝ)

Distinct: 151
Distinct (%): 97.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 145250.89
Minimum: 41147
Maximum: 233645
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 41147
5-th percentile: 61169.3
Q1: 97872
Median: 150148
Q3: 192502.5
95-th percentile: 224413
Maximum: 233645
Range: 192498
Interquartile range (IQR): 94630.5

Descriptive statistics

Standard deviation: 53458.191
Coefficient of variation (CV): 0.36804037
Kurtosis: -1.1476444
Mean: 145250.89
Median Absolute Deviation (MAD): 43673
Skewness: -0.057833688
Sum: 22513888
Variance: 2.8577782 × 10^9
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
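
Several of the statistics above are related by simple identities (mean = sum / n, range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared). The following sketch recomputes them from the reported figures as a consistency check; the variable names are illustrative only.

```python
import math

# Figures copied from the report for 수신시분초.
n = 155
total = 22513888
std = 53458.191
minimum, maximum = 41147, 233645
q1, q3 = 97872, 192502.5

mean = total / n            # arithmetic mean
value_range = maximum - minimum
iqr = q3 - q1               # interquartile range
cv = std / mean             # coefficient of variation
variance = std ** 2

print(round(mean, 2), value_range, iqr, round(cv, 8), f"{variance:.7e}")
```

Each recomputed value agrees with the corresponding entry in the report to the precision shown there.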
Common values

Value | Count | Frequency (%)
74522 | 2 | 1.3%
74843 | 2 | 1.3%
94119 | 2 | 1.3%
74520 | 2 | 1.3%
205933 | 1 | 0.6%
214553 | 1 | 0.6%
213814 | 1 | 0.6%
202819 | 1 | 0.6%
123347 | 1 | 0.6%
123415 | 1 | 0.6%
Other values (141) | 141 | 91.0%

Minimum 10 values

Value | Count | Frequency (%)
41147 | 1 | 0.6%
50327 | 1 | 0.6%
53245 | 1 | 0.6%
53317 | 1 | 0.6%
53713 | 1 | 0.6%
60520 | 1 | 0.6%
60830 | 1 | 0.6%
60834 | 1 | 0.6%
61313 | 1 | 0.6%
62454 | 1 | 0.6%

Maximum 10 values

Value | Count | Frequency (%)
233645 | 1 | 0.6%
232431 | 1 | 0.6%
231330 | 1 | 0.6%
231321 | 1 | 0.6%
231131 | 1 | 0.6%
230724 | 1 | 0.6%
230659 | 1 | 0.6%
225540 | 1 | 0.6%
223930 | 1 | 0.6%
223901 | 1 | 0.6%
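
The column name 수신시분초 means "received hour, minute, second", and the values read naturally as HHMMSS integers with leading zeros dropped (for example, 41147 as 04:11:47). Because the profiler treats them as plain real numbers, statistics such as the mean are computed on the packed integers rather than on times of day. The sketch below shows one way to decode them, assuming the HHMMSS reading is correct; the helper names are hypothetical.

```python
from datetime import time

def hhmmss_to_time(value: int) -> time:
    """Decode an integer such as 41147 into a time of day (04:11:47)."""
    hours, remainder = divmod(value, 10000)
    minutes, seconds = divmod(remainder, 100)
    return time(hours, minutes, seconds)

def seconds_since_midnight(value: int) -> int:
    """Convert an HHMMSS integer into seconds elapsed since 00:00:00."""
    t = hhmmss_to_time(value)
    return t.hour * 3600 + t.minute * 60 + t.second

# Minimum and maximum values from the report.
earliest = hhmmss_to_time(41147)
latest = hhmmss_to_time(233645)
print(earliest.isoformat(), latest.isoformat())
```

Converting to seconds since midnight before aggregating gives statistics on an evenly spaced scale, which the packed HHMMSS integers do not provide.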

발송국가명 (sending country name)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 2
Median length: 2
Mean length: 1.6
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: -
2nd row: -
3rd row: -
4th row: -
5th row: -

Common Values

Value | Count | Frequency (%)
CN | 89 | 57.4%
- | 62 | 40.0%
US | 4 | 2.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Words

Value | Count | Frequency (%)
cn | 89 | 57.4%
(empty) | 62 | 40.0%
us | 4 | 2.6%

발송IP주소 (sending IP address)
Text

UNIQUE

Distinct: 155
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 12
Median length: 12
Mean length: 11.529032
Min length: 10

Characters and Unicode

Total characters: 1787
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
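
As an illustration of the character properties mentioned above, the following sketch counts Unicode general categories for the five sample IP strings listed in this report, using Python's standard `unicodedata` module. The category codes `Nd` and `Po` correspond to "Decimal Number" and "Other Punctuation"; the variable names are illustrative.

```python
import unicodedata
from collections import Counter

# The five sample rows shown for 발송IP주소 (second octet masked with "*").
samples = [
    "103.*.21.180",
    "103.*.21.182",
    "103.*.21.183",
    "103.*.21.184",
    "103.*.21.189",
]

# Tally the Unicode general category of every character.
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)
print(dict(category_counts))
```

Both the dot and the masking asterisk fall under Other Punctuation, which is why the full column reports only two distinct categories.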

Unique

Unique: 155
Unique (%): 100.0%

Sample

1st row: 103.*.21.180
2nd row: 103.*.21.182
3rd row: 103.*.21.183
4th row: 103.*.21.184
5th row: 103.*.21.189

Words

Value | Count | Frequency (%)
103.*.21.180 | 1 | 0.6%
119.*.78.62 | 1 | 0.6%
119.*.79.223 | 1 | 0.6%
119.*.78.233 | 1 | 0.6%
119.*.78.24 | 1 | 0.6%
119.*.78.245 | 1 | 0.6%
119.*.78.31 | 1 | 0.6%
119.*.78.54 | 1 | 0.6%
119.*.78.6 | 1 | 0.6%
119.*.78.88 | 1 | 0.6%
Other values (145) | 145 | 93.5%

Most occurring characters

Value | Count | Frequency (%)
. | 465 | 26.0%
1 | 337 | 18.9%
* | 155 | 8.7%
2 | 139 | 7.8%
3 | 135 | 7.6%
7 | 131 | 7.3%
9 | 124 | 6.9%
0 | 99 | 5.5%
6 | 76 | 4.3%
8 | 57 | 3.2%
Other values (2) | 69 | 3.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1167 | 65.3%
Other Punctuation | 620 | 34.7%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 337 | 28.9%
2 | 139 | 11.9%
3 | 135 | 11.6%
7 | 131 | 11.2%
9 | 124 | 10.6%
0 | 99 | 8.5%
6 | 76 | 6.5%
8 | 57 | 4.9%
4 | 37 | 3.2%
5 | 32 | 2.7%

Other Punctuation
Value | Count | Frequency (%)
. | 465 | 75.0%
* | 155 | 25.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1787 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
. | 465 | 26.0%
1 | 337 | 18.9%
* | 155 | 8.7%
2 | 139 | 7.8%
3 | 135 | 7.6%
7 | 131 | 7.3%
9 | 124 | 6.9%
0 | 99 | 5.5%
6 | 76 | 4.3%
8 | 57 | 3.2%
Other values (2) | 69 | 3.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1787 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
. | 465 | 26.0%
1 | 337 | 18.9%
* | 155 | 8.7%
2 | 139 | 7.8%
3 | 135 | 7.6%
7 | 131 | 7.3%
9 | 124 | 6.9%
0 | 99 | 5.5%
6 | 76 | 4.3%
8 | 57 | 3.2%
Other values (2) | 69 | 3.9%

이메일제목명 (email subject)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 컨텐츠 비공개 (content not disclosed)
2nd row: 컨텐츠 비공개
3rd row: 컨텐츠 비공개
4th row: 컨텐츠 비공개
5th row: 컨텐츠 비공개

Common Values

Value | Count | Frequency (%)
컨텐츠 비공개 | 155 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Words

Value | Count | Frequency (%)
컨텐츠 | 155 | 50.0%
비공개 | 155 | 50.0%

이메일내용 (email content)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: -
2nd row: -
3rd row: -
4th row: -
5th row: -

Common Values

Value | Count | Frequency (%)
- | 155 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Words

Value | Count | Frequency (%)
(empty) | 155 | 100.0%

인터넷회사명 (internet company name)
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 42
Median length: 31
Mean length: 19.83871
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: -
2nd row: -
3rd row: -
4th row: -
5th row: -

Common Values

Value | Count | Frequency (%)
CHINANET HUBEI PROVINCE NETWORK | 79 | 51.0%
- | 62 | 40.0%
JIANGSU GROUP CO. NANJING JIANGSU PROVINCE | 10 | 6.5%
E.I. DU PONT DE NEMOURS AND CO. INC. | 4 | 2.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Words

Value | Count | Frequency (%)
province | 89 | 18.9%
chinanet | 79 | 16.8%
hubei | 79 | 16.8%
network | 79 | 16.8%
(empty) | 62 | 13.2%
jiangsu | 20 | 4.3%
co | 14 | 3.0%
group | 10 | 2.1%
nanjing | 10 | 2.1%
e.i | 4 | 0.9%
Other values (6) | 24 | 5.1%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap 1]

 | 수신일 | 수신시분초 | 발송국가명 | 인터넷회사명
수신일 | 1.000 | 0.167 | 0.860 | 0.817
수신시분초 | 0.167 | 1.000 | 0.487 | 0.461
발송국가명 | 0.860 | 0.487 | 1.000 | 1.000
인터넷회사명 | 0.817 | 0.461 | 1.000 | 1.000

[Figure: correlation heatmap 2]

 | 발송국가명 | 인터넷회사명 | 수신일
발송국가명 | 1.000 | 0.997 | 0.548
인터넷회사명 | 0.997 | 1.000 | 0.880
수신일 | 0.548 | 0.880 | 1.000

[Figure: correlation heatmap 3]

 | 수신시분초 | 수신일 | 발송국가명 | 인터넷회사명
수신시분초 | 1.000 | 0.127 | 0.346 | 0.301
수신일 | 0.127 | 1.000 | 0.548 | 0.880
발송국가명 | 0.346 | 0.548 | 1.000 | 0.997
인터넷회사명 | 0.301 | 0.880 | 0.997 | 1.000
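
This extract does not say which coefficient each correlation matrix uses. For pairs of categorical columns such as 발송국가명 and 인터넷회사명, one common association measure is Cramér's V, and the sketch below shows a plain (uncorrected) version of it on hypothetical toy data. It is an illustration of the idea only, not necessarily the method that produced the numbers in this report, and the function name `cramers_v` and the toy labels are assumptions.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, row_count in row_totals.items():
        for y, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association
    return math.sqrt(chi2 / (n * k))

# Hypothetical toy data: each country maps to exactly one company label.
countries = ["CN", "CN", "CN", "-", "-", "US"]
companies = ["A", "A", "A", "-", "-", "B"]
v_perfect = cramers_v(countries, companies)

# Hypothetical toy data with no association at all.
v_none = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(v_perfect, v_none)
```

A value near 1 means one column almost determines the other, which matches the pattern in the sample rows where every "-" country has a "-" company and every Jiangsu row is labelled CN.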

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

# | 수신년도 | 수신월 | 수신일 | 수신시분초 | 발송국가명 | 발송IP주소 | 이메일제목명 | 이메일내용 | 인터넷회사명
0 | 2017 | 1 | 1 | 205933 | - | 103.*.21.180 | 컨텐츠 비공개 | - | -
1 | 2017 | 1 | 1 | 60834 | - | 103.*.21.182 | 컨텐츠 비공개 | - | -
2 | 2017 | 1 | 1 | 153935 | - | 103.*.21.183 | 컨텐츠 비공개 | - | -
3 | 2017 | 1 | 1 | 153740 | - | 103.*.21.184 | 컨텐츠 비공개 | - | -
4 | 2017 | 1 | 1 | 102401 | - | 103.*.21.189 | 컨텐츠 비공개 | - | -
5 | 2017 | 1 | 1 | 153916 | - | 103.*.21.190 | 컨텐츠 비공개 | - | -
6 | 2017 | 1 | 1 | 60830 | - | 103.*.21.191 | 컨텐츠 비공개 | - | -
7 | 2017 | 1 | 1 | 205923 | - | 103.*.21.192 | 컨텐츠 비공개 | - | -
8 | 2017 | 1 | 1 | 102308 | - | 103.*.21.66 | 컨텐츠 비공개 | - | -
9 | 2017 | 1 | 1 | 195253 | - | 103.*.21.69 | 컨텐츠 비공개 | - | -

# | 수신년도 | 수신월 | 수신일 | 수신시분초 | 발송국가명 | 발송IP주소 | 이메일제목명 | 이메일내용 | 인터넷회사명
145 | 2017 | 1 | 10 | 135713 | CN | 112.*.75.171 | 컨텐츠 비공개 | - | JIANGSU GROUP CO. NANJING JIANGSU PROVINCE
146 | 2017 | 1 | 10 | 222217 | CN | 112.*.75.187 | 컨텐츠 비공개 | - | JIANGSU GROUP CO. NANJING JIANGSU PROVINCE
147 | 2017 | 1 | 10 | 165315 | CN | 112.*.75.26 | 컨텐츠 비공개 | - | JIANGSU GROUP CO. NANJING JIANGSU PROVINCE
148 | 2017 | 1 | 10 | 221834 | CN | 112.*.75.46 | 컨텐츠 비공개 | - | JIANGSU GROUP CO. NANJING JIANGSU PROVINCE
149 | 2017 | 1 | 10 | 184752 | CN | 112.*.76.162 | 컨텐츠 비공개 | - | JIANGSU GROUP CO. NANJING JIANGSU PROVINCE
150 | 2017 | 1 | 10 | 193821 | CN | 112.*.76.30 | 컨텐츠 비공개 | - | JIANGSU GROUP CO. NANJING JIANGSU PROVINCE
151 | 2017 | 1 | 10 | 215324 | CN | 112.*.76.4 | 컨텐츠 비공개 | - | JIANGSU GROUP CO. NANJING JIANGSU PROVINCE
152 | 2017 | 1 | 10 | 185347 | CN | 112.*.76.62 | 컨텐츠 비공개 | - | JIANGSU GROUP CO. NANJING JIANGSU PROVINCE
153 | 2017 | 1 | 10 | 223053 | CN | 112.*.76.97 | 컨텐츠 비공개 | - | JIANGSU GROUP CO. NANJING JIANGSU PROVINCE
154 | 2017 | 1 | 10 | 214150 | CN | 112.*.77.130 | 컨텐츠 비공개 | - | JIANGSU GROUP CO. NANJING JIANGSU PROVINCE