Overview

Dataset statistics

Number of variables: 6
Number of observations: 400
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 20.1 KiB
Average record size in memory: 51.3 B

Variable types

Numeric: 3
Categorical: 2
Text: 1

Dataset

Description: Sample
Author: 코난테크놀로지
URL: https://www.bigdata-telecom.kr/invoke/SOKBP2603/?goodsCode=TPOGROUP

Alerts

"채널값" has constant value "블로그" (Constant)
"기본키값" is highly overall correlated with "해당일자" (High correlation)
"차례값" is highly overall correlated with "건수값" (High correlation)
"건수값" is highly overall correlated with "차례값" (High correlation)
"해당일자" is highly overall correlated with "기본키값" (High correlation)
"기본키값" has unique values (Unique)

Reproduction

Analysis started: 2023-12-10 06:40:05.416490
Analysis finished: 2023-12-10 06:40:07.867004
Duration: 2.45 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
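
The reported duration is the difference between the two analysis timestamps. A minimal sketch in Python that recomputes it from the values printed in this section (the variable names are illustrative and not part of the report):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of the report.
started = datetime.fromisoformat("2023-12-10 06:40:05.416490")
finished = datetime.fromisoformat("2023-12-10 06:40:07.867004")

# Subtracting two datetimes gives a timedelta; convert it to seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```

Rounded to two decimals, the result agrees with the Duration entry.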

Variables

"기본키값"
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 400
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20364.97
Minimum: 12156
Maximum: 44075
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: 12156
5th percentile: 12175.95
Q1: 12255.75
Median: 12355.5
Q3: 28157.25
95th percentile: 28237.05
Maximum: 44075
Range: 31919
Interquartile range (IQR): 15901.5

Descriptive statistics

Standard deviation: 8347.5253
Coefficient of variation (CV): 0.40989627
Kurtosis: -1.3648771
Mean: 20364.97
Median Absolute Deviation (MAD): 199
Skewness: 0.2216268
Sum: 8145988
Variance: 69681179
Monotonicity: Strictly increasing
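
Several of the figures for "기본키값" follow from others in the same section: the range is maximum minus minimum, the IQR is Q3 minus Q1, the CV is the standard deviation divided by the mean, the sum is the mean times the number of observations, and the variance is the square of the standard deviation. The sketch below rechecks these relationships from the rounded values printed in the report, so small rounding differences are expected. The variable names are illustrative.

```python
# Values copied from the report for "기본키값" (already rounded by the report).
n = 400
minimum, maximum = 12156, 44075
q1, q3 = 12255.75, 28157.25
mean, std = 20364.97, 8347.5253

value_range = maximum - minimum   # reported Range
iqr = q3 - q1                     # reported Interquartile range (IQR)
cv = std / mean                   # reported Coefficient of variation (CV)
total = mean * n                  # reported Sum
variance = std ** 2               # reported Variance

print(value_range, iqr, round(cv, 8), round(total), round(variance))
```

The same relationships can be checked for "차례값" and "건수값" by substituting their reported values.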
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
12156 | 1 | 0.2%
28122 | 1 | 0.2%
28132 | 1 | 0.2%
28131 | 1 | 0.2%
28130 | 1 | 0.2%
28129 | 1 | 0.2%
28128 | 1 | 0.2%
28127 | 1 | 0.2%
28126 | 1 | 0.2%
28125 | 1 | 0.2%
Other values (390) | 390 | 97.5%

Minimum 10 values
Value | Count | Frequency (%)
12156 | 1 | 0.2%
12157 | 1 | 0.2%
12158 | 1 | 0.2%
12159 | 1 | 0.2%
12160 | 1 | 0.2%
12161 | 1 | 0.2%
12162 | 1 | 0.2%
12163 | 1 | 0.2%
12164 | 1 | 0.2%
12165 | 1 | 0.2%

Maximum 10 values
Value | Count | Frequency (%)
44075 | 1 | 0.2%
44074 | 1 | 0.2%
44073 | 1 | 0.2%
44072 | 1 | 0.2%
44071 | 1 | 0.2%
28252 | 1 | 0.2%
28251 | 1 | 0.2%
28250 | 1 | 0.2%
28249 | 1 | 0.2%
28248 | 1 | 0.2%

"채널값"
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB
"블로그": 400

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: "블로그"
2nd row: "블로그"
3rd row: "블로그"
4th row: "블로그"
5th row: "블로그"

Common Values

Value | Count | Frequency (%)
"블로그" | 400 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
블로그 | 400 | 100.0%

"해당일자"
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB
2020-05-01: 201
2020-05-02: 194
2020-05-03: 5

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020-05-01
2nd row: 2020-05-01
3rd row: 2020-05-01
4th row: 2020-05-01
5th row: 2020-05-01

Common Values

Value | Count | Frequency (%)
2020-05-01 | 201 | 50.2%
2020-05-02 | 194 | 48.5%
2020-05-03 | 5 | 1.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2020-05-01 | 201 | 50.2%
2020-05-02 | 194 | 48.5%
2020-05-03 | 5 | 1.2%

"차례값"
Real number (ℝ)

HIGH CORRELATION 

Distinct: 201
Distinct (%): 50.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 98.0775
Minimum: 1
Maximum: 201
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: 1
5th percentile: 8
Q1: 48
Median: 98
Q3: 148
95th percentile: 188
Maximum: 201
Range: 200
Interquartile range (IQR): 100

Descriptive statistics

Standard deviation: 57.781079
Coefficient of variation (CV): 0.58913695
Kurtosis: -1.2011587
Mean: 98.0775
Median Absolute Deviation (MAD): 50
Skewness: 0.0076202642
Sum: 39231
Variance: 3338.6531
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1 | 3 | 0.8%
3 | 3 | 0.8%
4 | 3 | 0.8%
5 | 3 | 0.8%
2 | 3 | 0.8%
136 | 2 | 0.5%
127 | 2 | 0.5%
128 | 2 | 0.5%
129 | 2 | 0.5%
130 | 2 | 0.5%
Other values (191) | 375 | 93.8%

Minimum 10 values
Value | Count | Frequency (%)
1 | 3 | 0.8%
2 | 3 | 0.8%
3 | 3 | 0.8%
4 | 3 | 0.8%
5 | 3 | 0.8%
6 | 2 | 0.5%
7 | 2 | 0.5%
8 | 2 | 0.5%
9 | 2 | 0.5%
10 | 2 | 0.5%

Maximum 10 values
Value | Count | Frequency (%)
201 | 1 | 0.2%
200 | 1 | 0.2%
199 | 1 | 0.2%
198 | 1 | 0.2%
197 | 1 | 0.2%
196 | 1 | 0.2%
195 | 1 | 0.2%
194 | 2 | 0.5%
193 | 2 | 0.5%
192 | 2 | 0.5%

"이슈어값"
Text

Distinct: 209
Distinct (%): 52.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 8
Median length: 7
Mean length: 5.4075
Min length: 4

Characters and Unicode

Total characters: 2163
Distinct characters: 237
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
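
As a rough illustration of how such character properties can be read, the sketch below uses Python's standard `unicodedata` module to count general categories over a few values that appear in this column. It is a sketch under assumptions, not necessarily the method the profiling tool uses. The surrounding double quotes are kept because the report counts them as characters, and the `values` list and `LONG_NAMES` mapping are illustrative.

```python
import unicodedata
from collections import Counter

# A few values that appear in the report for this column, including the
# literal double quotes that surround every value.
values = ['"고객"', '"아기"', '"학생"', '"회사원"', '"5인가구"']

# Count the Unicode general category (two-letter code) of every character.
category_counts = Counter(
    unicodedata.category(ch) for value in values for ch in value
)

# Long names for the general category codes that occur in this column.
LONG_NAMES = {
    "Lo": "Other Letter",
    "Po": "Other Punctuation",
    "Nd": "Decimal Number",
    "Ll": "Lowercase Letter",
}

for code, count in category_counts.most_common():
    print(LONG_NAMES.get(code, code), count)
```

Hangul syllables fall under "Other Letter" and the double quote under "Other Punctuation", which is why those two categories dominate the tables in this section. The standard library does not expose the script or block properties, so those breakdowns would need a separate data source.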

Unique

Unique: 23
Unique (%): 5.8%

Sample

1st row: "고객"
2nd row: "아기"
3rd row: "학생"
4th row: "회사원"
5th row: "근로자"

Value | Count | Frequency (%)
고객 | 3 | 0.8%
어린이 | 3 | 0.8%
아기 | 3 | 0.8%
학생 | 3 | 0.8%
회사원 | 3 | 0.8%
복학생 | 2 | 0.5%
예비역 | 2 | 0.5%
하위층 | 2 | 0.5%
5인가구 | 2 | 0.5%
은퇴자 | 2 | 0.5%
Other values (199) | 375 | 93.8%

Most occurring characters

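Value | Count | Frequency (%)
" | 800 | 37.0%
[missing] | 50 | 2.3%
[missing] | 44 | 2.0%
[missing] | 39 | 1.8%
[missing] | 35 | 1.6%
[missing] | 33 | 1.5%
[missing] | 31 | 1.4%
[missing] | 26 | 1.2%
[missing] | 25 | 1.2%
[missing] | 25 | 1.2%
Other values (227) | 1055 | 48.8%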

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1286 | 59.5%
Other Punctuation | 800 | 37.0%
Decimal Number | 63 | 2.9%
Lowercase Letter | 14 | 0.6%

Most frequent character per category

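Other Letter
Value | Count | Frequency (%)
[missing] | 50 | 3.9%
[missing] | 44 | 3.4%
[missing] | 39 | 3.0%
[missing] | 35 | 2.7%
[missing] | 33 | 2.6%
[missing] | 31 | 2.4%
[missing] | 26 | 2.0%
[missing] | 25 | 1.9%
[missing] | 25 | 1.9%
[missing] | 24 | 1.9%
Other values (210) | 954 | 74.2%

Decimal Number
Value | Count | Frequency (%)
0 | 22 | 34.9%
3 | 9 | 14.3%
8 | 6 | 9.5%
4 | 6 | 9.5%
5 | 6 | 9.5%
6 | 5 | 7.9%
2 | 4 | 6.3%
7 | 3 | 4.8%
1 | 2 | 3.2%

Lowercase Letter
Value | Count | Frequency (%)
p | 2 | 14.3%
y | 2 | 14.3%
n | 2 | 14.3%
i | 2 | 14.3%
v | 2 | 14.3%
x | 2 | 14.3%
z | 2 | 14.3%

Other Punctuation
Value | Count | Frequency (%)
" | 800 | 100.0%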

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1286 | 59.5%
Common | 863 | 39.9%
Latin | 14 | 0.6%

Most frequent character per script

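Hangul
Value | Count | Frequency (%)
[missing] | 50 | 3.9%
[missing] | 44 | 3.4%
[missing] | 39 | 3.0%
[missing] | 35 | 2.7%
[missing] | 33 | 2.6%
[missing] | 31 | 2.4%
[missing] | 26 | 2.0%
[missing] | 25 | 1.9%
[missing] | 25 | 1.9%
[missing] | 24 | 1.9%
Other values (210) | 954 | 74.2%

Common
Value | Count | Frequency (%)
" | 800 | 92.7%
0 | 22 | 2.5%
3 | 9 | 1.0%
8 | 6 | 0.7%
4 | 6 | 0.7%
5 | 6 | 0.7%
6 | 5 | 0.6%
2 | 4 | 0.5%
7 | 3 | 0.3%
1 | 2 | 0.2%

Latin
Value | Count | Frequency (%)
p | 2 | 14.3%
y | 2 | 14.3%
n | 2 | 14.3%
i | 2 | 14.3%
v | 2 | 14.3%
x | 2 | 14.3%
z | 2 | 14.3%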

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1286 | 59.5%
ASCII | 877 | 40.5%

Most frequent character per block

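ASCII
Value | Count | Frequency (%)
" | 800 | 91.2%
0 | 22 | 2.5%
3 | 9 | 1.0%
8 | 6 | 0.7%
4 | 6 | 0.7%
5 | 6 | 0.7%
6 | 5 | 0.6%
2 | 4 | 0.5%
7 | 3 | 0.3%
p | 2 | 0.2%
Other values (7) | 14 | 1.6%

Hangul
Value | Count | Frequency (%)
[missing] | 50 | 3.9%
[missing] | 44 | 3.4%
[missing] | 39 | 3.0%
[missing] | 35 | 2.7%
[missing] | 33 | 2.6%
[missing] | 31 | 2.4%
[missing] | 26 | 2.0%
[missing] | 25 | 1.9%
[missing] | 25 | 1.9%
[missing] | 24 | 1.9%
Other values (210) | 954 | 74.2%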

"건수값"
Real number (ℝ)

HIGH CORRELATION 

Distinct: 234
Distinct (%): 58.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 866.3475
Minimum: 1
Maximum: 41399
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: 1
5th percentile: 2
Q1: 9
Median: 50
Q3: 362.5
95th percentile: 3271.85
Maximum: 41399
Range: 41398
Interquartile range (IQR): 353.5

Descriptive statistics

Standard deviation: 3510.8186
Coefficient of variation (CV): 4.0524369
Kurtosis: 82.50825
Mean: 866.3475
Median Absolute Deviation (MAD): 48
Skewness: 8.4880796
Sum: 346539
Variance: 12325847
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
3 | 22 | 5.5%
2 | 21 | 5.2%
1 | 17 | 4.2%
5 | 11 | 2.8%
4 | 8 | 2.0%
8 | 8 | 2.0%
18 | 7 | 1.8%
6 | 7 | 1.8%
17 | 6 | 1.5%
11 | 5 | 1.2%
Other values (224) | 288 | 72.0%
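
The "Other values" row can be derived from the rest of the table: its count is the number of observations minus the counts of the ten listed values, and the number in parentheses is the distinct count minus ten. A minimal sketch that rechecks this for "건수값" with figures copied from the report (the names are illustrative):

```python
# Figures copied from the report for "건수값".
n_rows = 400
n_distinct = 234

# The ten most common values and their counts, as listed in the table.
top_counts = {3: 22, 2: 21, 1: 17, 5: 11, 4: 8, 8: 8, 18: 7, 6: 7, 17: 6, 11: 5}

other_count = n_rows - sum(top_counts.values())   # rows not covered by the top ten
other_distinct = n_distinct - len(top_counts)     # distinct values not listed
other_share = 100 * other_count / n_rows          # percentage of all rows

print(f"Other values ({other_distinct}) | {other_count} | {other_share:.1f}%")
```

The result agrees with the last row of the table.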
Minimum 10 values
Value | Count | Frequency (%)
1 | 17 | 4.2%
2 | 21 | 5.2%
3 | 22 | 5.5%
4 | 8 | 2.0%
5 | 11 | 2.8%
6 | 7 | 1.8%
7 | 3 | 0.8%
8 | 8 | 2.0%
9 | 4 | 1.0%
10 | 3 | 0.8%

Maximum 10 values
Value | Count | Frequency (%)
41399 | 1 | 0.2%
35165 | 1 | 0.2%
31707 | 1 | 0.2%
15236 | 1 | 0.2%
14965 | 1 | 0.2%
13789 | 1 | 0.2%
8255 | 1 | 0.2%
6665 | 1 | 0.2%
6222 | 1 | 0.2%
6172 | 1 | 0.2%

Interactions

[Figures: 9 interaction plots]

Correlations

Variable | "기본키값" | "해당일자" | "차례값" | "건수값"
"기본키값" | 1.000 | 1.000 | 0.606 | 0.447
"해당일자" | 1.000 | 1.000 | 0.278 | 0.766
"차례값" | 0.606 | 0.278 | 1.000 | 0.411
"건수값" | 0.447 | 0.766 | 0.411 | 1.000

Variable | "기본키값" | "차례값" | "건수값" | "해당일자"
"기본키값" | 1.000 | 0.421 | -0.456 | 0.999
"차례값" | 0.421 | 1.000 | -0.999 | 0.164
"건수값" | -0.456 | -0.999 | 1.000 | 0.442
"해당일자" | 0.999 | 0.164 | 0.442 | 1.000

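The High correlation alerts in the Overview can be related to the second correlation matrix above by flagging every pair whose absolute coefficient reaches a cut-off. The sketch below assumes a cut-off of 0.5. That value is an assumption for illustration and is not stated in this report; any cut-off above 0.456 and at most 0.999 gives the same result for these figures.

```python
# Off-diagonal entries of the second correlation matrix in the report
# (the matrix is symmetric, so each pair is listed once).
pairs = {
    ("기본키값", "차례값"): 0.421,
    ("기본키값", "건수값"): -0.456,
    ("기본키값", "해당일자"): 0.999,
    ("차례값", "건수값"): -0.999,
    ("차례값", "해당일자"): 0.164,
    ("건수값", "해당일자"): 0.442,
}

THRESHOLD = 0.5  # assumed cut-off, not taken from the report

# Keep the pairs whose absolute correlation reaches the cut-off.
flagged = sorted(pair for pair, r in pairs.items() if abs(r) >= THRESHOLD)

for left, right in flagged:
    print(f'"{left}" is highly correlated with "{right}"')
```

The two flagged pairs are the ones named in the Alerts list, where each pair appears once per direction.
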
Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

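Sample

Index | "기본키값" | "채널값" | "해당일자" | "차례값" | "이슈어값" | "건수값"
0 | 12156 | "블로그" | 2020-05-01 | 1 | "고객" | 41399
1 | 12157 | "블로그" | 2020-05-01 | 2 | "아기" | 14965
2 | 12158 | "블로그" | 2020-05-01 | 3 | "학생" | 8255
3 | 12159 | "블로그" | 2020-05-01 | 4 | "회사원" | 6172
4 | 12160 | "블로그" | 2020-05-01 | 5 | "근로자" | 5823
5 | 12161 | "블로그" | 2020-05-01 | 6 | "어린이" | 5704
6 | 12162 | "블로그" | 2020-05-01 | 7 | "주부" | 4096
7 | 12163 | "블로그" | 2020-05-01 | 8 | "연예인" | 3421
8 | 12164 | "블로그" | 2020-05-01 | 9 | "교수" | 3050
9 | 12165 | "블로그" | 2020-05-01 | 10 | "생산자" | 2647

Index | "기본키값" | "채널값" | "해당일자" | "차례값" | "이슈어값" | "건수값"
390 | 28248 | "블로그" | 2020-05-02 | 190 | "미혼부" | 1
391 | 28249 | "블로그" | 2020-05-02 | 191 | "다이아수저" | 1
392 | 28250 | "블로그" | 2020-05-02 | 192 | "차상위층" | 1
393 | 28251 | "블로그" | 2020-05-02 | 193 | "다세대가족" | 1
394 | 28252 | "블로그" | 2020-05-02 | 194 | "워킹푸어" | 1
395 | 44071 | "블로그" | 2020-05-03 | 1 | "고객" | 31707
396 | 44072 | "블로그" | 2020-05-03 | 2 | "아기" | 13789
397 | 44073 | "블로그" | 2020-05-03 | 3 | "학생" | 6222
398 | 44074 | "블로그" | 2020-05-03 | 4 | "회사원" | 5723
399 | 44075 | "블로그" | 2020-05-03 | 5 | "어린이" | 5241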