Overview

Dataset statistics

Number of variables: 6
Number of observations: 400
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 20.1 KiB
Average record size in memory: 51.3 B

Variable types

Numeric: 3
Categorical: 1
DateTime: 1
Text: 1

Dataset

Description: Sample
Author: 코난테크놀로지 (Konan Technology)
URL: https://www.bigdata-telecom.kr/invoke/SOKBP2603/?goodsCode=TPORELATION

Alerts

"채널값" has constant value "" (Constant)
"차례값" is highly overall correlated with "건수값" (High correlation)
"건수값" is highly overall correlated with "차례값" (High correlation)
"기본키값" has unique values (Unique)

Reproduction

Analysis started: 2023-12-10 06:17:07.137498
Analysis finished: 2023-12-10 06:17:11.400065
Duration: 4.26 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
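
The reported duration follows directly from the two timestamps. The sketch below is an added illustration, not part of the report's output; it simply subtracts the start time from the finish time using the standard library.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of the report.
started = datetime.fromisoformat("2023-12-10 06:17:07.137498")
finished = datetime.fromisoformat("2023-12-10 06:17:11.400065")

# Elapsed wall-clock time in seconds, rounded to two decimals for display.
elapsed = (finished - started).total_seconds()
print(f"{elapsed:.2f} seconds")
```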

Variables

"기본키값"
Real number (ℝ)

UNIQUE 

Distinct: 400
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31751.19
Minimum: 12357
Maximum: 61564
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: 12357
5th percentile: 12376.95
Q1: 12456.75
Median: 28334.5
Q3: 44328.25
95th percentile: 61544.05
Maximum: 61564
Range: 49207
Interquartile range (IQR): 31871.5

Descriptive statistics

Standard deviation: 15834.912
Coefficient of variation (CV): 0.49871869
Kurtosis: -0.95766719
Mean: 31751.19
Median Absolute Deviation (MAD): 15936
Skewness: 0.2782427
Sum: 12700476
Variance: 2.5074444 × 10^8
Monotonicity: Strictly increasing
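
Several of these figures are derived from others, so they can be cross-checked with simple arithmetic. The sketch below is an added illustration, not the report's own code; it recomputes the mean, range, IQR, coefficient of variation and variance from the values printed for "기본키값". The variable names are placeholders, and the standard deviation is used as rounded in the report, so the last digits may differ slightly.

```python
# Values copied from the report for "기본키값".
n = 400
total = 12700476
minimum, maximum = 12357, 61564
q1, q3 = 12456.75, 44328.25
std = 15834.912  # rounded as printed in the report

mean = total / n                 # Sum divided by the number of observations
value_range = maximum - minimum  # Maximum minus Minimum
iqr = q3 - q1                    # Q3 minus Q1
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance is the squared standard deviation

print(mean, value_range, iqr)
print(f"{cv:.6f}", f"{variance:.4e}")
```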
Histogram with fixed size bins (bins=50)
Common values
Value  Count  Frequency (%)
12357  1  0.2%
44293  1  0.2%
44303  1  0.2%
44302  1  0.2%
44301  1  0.2%
44300  1  0.2%
44299  1  0.2%
44298  1  0.2%
44297  1  0.2%
44296  1  0.2%
Other values (390)  390  97.5%

Minimum 10 values
Value  Count  Frequency (%)
12357  1  0.2%
12358  1  0.2%
12359  1  0.2%
12360  1  0.2%
12361  1  0.2%
12362  1  0.2%
12363  1  0.2%
12364  1  0.2%
12365  1  0.2%
12366  1  0.2%

Maximum 10 values
Value  Count  Frequency (%)
61564  1  0.2%
61563  1  0.2%
61562  1  0.2%
61561  1  0.2%
61560  1  0.2%
61559  1  0.2%
61558  1  0.2%
61557  1  0.2%
61556  1  0.2%
61555  1  0.2%

"채널값"
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB
"블로그": 400

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: "블로그"
2nd row: "블로그"
3rd row: "블로그"
4th row: "블로그"
5th row: "블로그"

Common Values

Value  Count  Frequency (%)
"블로그"  400  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
블로그  400  100.0%

"해당일자"
DateTime

Distinct: 4
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB
Minimum: 2020-05-01 00:00:00
Maximum: 2020-05-04 00:00:00
Histogram with fixed size bins (bins=4)

"차례값"
Real number (ℝ)

HIGH CORRELATION 

Distinct: 122
Distinct (%): 30.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 56.51
Minimum: 1
Maximum: 122
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: 1
5th percentile: 5.95
Q1: 25.75
Median: 54
Q3: 87
95th percentile: 114
Maximum: 122
Range: 121
Interquartile range (IQR): 61.25

Descriptive statistics

Standard deviation: 35.236827
Coefficient of variation (CV): 0.62355029
Kurtosis: -1.2263758
Mean: 56.51
Median Absolute Deviation (MAD): 30.5
Skewness: 0.16115045
Sum: 22604
Variance: 1241.634
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value  Count  Frequency (%)
1  4  1.0%
22  4  1.0%
24  4  1.0%
25  4  1.0%
26  4  1.0%
27  4  1.0%
28  4  1.0%
29  4  1.0%
30  4  1.0%
31  4  1.0%
Other values (112)  360  90.0%

Minimum 10 values
Value  Count  Frequency (%)
1  4  1.0%
2  4  1.0%
3  4  1.0%
4  4  1.0%
5  4  1.0%
6  4  1.0%
7  4  1.0%
8  4  1.0%
9  4  1.0%
10  4  1.0%

Maximum 10 values
Value  Count  Frequency (%)
122  1  0.2%
121  1  0.2%
120  2  0.5%
119  2  0.5%
118  3  0.8%
117  3  0.8%
116  3  0.8%
115  3  0.8%
114  3  0.8%
113  3  0.8%

"이슈어값"
Text

Distinct: 122
Distinct (%): 30.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 8
Median length: 7
Mean length: 4.885
Min length: 4

Characters and Unicode

Total characters: 1954
Distinct characters: 109
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
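
As a rough illustration of how such character properties can be read, the sketch below counts characters by Unicode general category with Python's standard `unicodedata` module. It is an added example, not the report's own implementation. The helper name `category_counts` is hypothetical, and the sample strings are assumed to include the surrounding double quotes, as the sample rows and the minimum length of 4 suggest.

```python
import unicodedata
from collections import Counter

def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category (e.g. 'Lo', 'Po')."""
    return Counter(unicodedata.category(ch) for ch in text)

# Sample values from the report; the stored strings are assumed to include the quotes.
samples = ['"친구"', '"가족"', '"엄마"', '"혼자"', '"아기"']

totals = Counter()
for value in samples:
    totals += category_counts(value)

# 'Lo' is "Other Letter" (Hangul syllables); 'Po' is "Other Punctuation" (the quotes).
print(dict(totals))
```

Parentheses fall into 'Ps' (Open Punctuation) and 'Pe' (Close Punctuation), which matches the two smallest categories listed in the report.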

Unique

Unique: 2
Unique (%): 0.5%

Sample

1st row: "친구"
2nd row: "가족"
3rd row: "엄마"
4th row: "혼자"
5th row: "아기"
Value  Count  Frequency (%)
친구  4  1.0%
삼촌  4  1.0%
여자친구  4  1.0%
형제  4  1.0%
누나  4  1.0%
선배  4  1.0%
할아버지  4  1.0%
딸(女  4  1.0%
막내  4  1.0%
반려견  4  1.0%
Other values (112)  360  90.0%

Most occurring characters

ValueCountFrequency (%)
" 800
40.9%
49
 
2.5%
47
 
2.4%
44
 
2.3%
42
 
2.1%
42
 
2.1%
38
 
1.9%
30
 
1.5%
27
 
1.4%
27
 
1.4%
Other values (99) 808
41.4%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  1146  58.6%
Other Punctuation  800  40.9%
Close Punctuation  4  0.2%
Open Punctuation  4  0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
49
 
4.3%
47
 
4.1%
44
 
3.8%
42
 
3.7%
42
 
3.7%
38
 
3.3%
30
 
2.6%
27
 
2.4%
27
 
2.4%
26
 
2.3%
Other values (96) 774
67.5%
Other Punctuation
Value  Count  Frequency (%)
"  800  100.0%
Close Punctuation
Value  Count  Frequency (%)
)  4  100.0%
Open Punctuation
Value  Count  Frequency (%)
(  4  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  1142  58.4%
Common  808  41.4%
Han  4  0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
49
 
4.3%
47
 
4.1%
44
 
3.9%
42
 
3.7%
42
 
3.7%
38
 
3.3%
30
 
2.6%
27
 
2.4%
27
 
2.4%
26
 
2.3%
Other values (95) 770
67.4%
Common
Value  Count  Frequency (%)
"  800  99.0%
)  4  0.5%
(  4  0.5%
Han
ValueCountFrequency (%)
4
100.0%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  1142  58.4%
ASCII  808  41.4%
CJK Compat Ideographs  4  0.2%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
"  800  99.0%
)  4  0.5%
(  4  0.5%
Hangul
ValueCountFrequency (%)
49
 
4.3%
47
 
4.1%
44
 
3.9%
42
 
3.7%
42
 
3.7%
38
 
3.3%
30
 
2.6%
27
 
2.4%
27
 
2.4%
26
 
2.3%
Other values (95) 770
67.4%
CJK Compat Ideographs
ValueCountFrequency (%)
4
100.0%

"건수값"
Real number (ℝ)

HIGH CORRELATION 

Distinct: 282
Distinct (%): 70.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2212.52
Minimum: 1
Maximum: 28557
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: 1
5th percentile: 4.95
Q1: 31
Median: 184
Q3: 1435.75
95th percentile: 12925.1
Maximum: 28557
Range: 28556
Interquartile range (IQR): 1404.75

Descriptive statistics

Standard deviation: 4767.2909
Coefficient of variation (CV): 2.1546883
Kurtosis: 10.669678
Mean: 2212.52
Median Absolute Deviation (MAD): 176
Skewness: 3.1502129
Sum: 885008
Variance: 22727063
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value  Count  Frequency (%)
11  7  1.8%
1  7  1.8%
30  6  1.5%
24  6  1.5%
8  5  1.2%
17  5  1.2%
4  5  1.2%
2  4  1.0%
37  4  1.0%
56  4  1.0%
Other values (272)  347  86.8%

Minimum 10 values
Value  Count  Frequency (%)
1  7  1.8%
2  4  1.0%
3  4  1.0%
4  5  1.2%
5  4  1.0%
6  1  0.2%
7  3  0.8%
8  5  1.2%
9  4  1.0%
10  1  0.2%

Maximum 10 values
Value  Count  Frequency (%)
28557  1  0.2%
27997  1  0.2%
26419  1  0.2%
24182  1  0.2%
23957  1  0.2%
23398  1  0.2%
22753  1  0.2%
21880  1  0.2%
19244  1  0.2%
18120  1  0.2%
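
The frequency column can look inconsistent at first glance: 1 of 400 is 0.25% but is shown as 0.2%, while 3 of 400 is 0.75% and is shown as 0.8%. These figures are consistent with rounding half to even at one decimal place. The sketch below reproduces that behaviour with exact decimal arithmetic. It is an added illustration under that assumption, not the report's own formatting code, and `frequency_pct` is a hypothetical helper name.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def frequency_pct(count: int, total: int) -> str:
    """Format count/total as a percentage with one decimal, rounding half to even."""
    pct = Decimal(count) * 100 / Decimal(total)
    return f"{pct.quantize(Decimal('0.1'), rounding=ROUND_HALF_EVEN)}%"

# Counts taken from the frequency tables of the report (400 observations).
for count in (1, 3, 5, 7, 347):
    print(count, frequency_pct(count, 400))
```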

Interactions


Correlations

            "기본키값"  "해당일자"  "차례값"  "건수값"
"기본키값"  1.000  1.000  0.284  0.222
"해당일자"  1.000  1.000  0.282  0.185
"차례값"  0.284  0.282  1.000  0.813
"건수값"  0.222  0.185  0.813  1.000

            "기본키값"  "차례값"  "건수값"
"기본키값"  1.000  0.096  -0.094
"차례값"  0.096  1.000  -1.000
"건수값"  -0.094  -1.000  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
     "기본키값"  "채널값"  "해당일자"  "차례값"  "이슈어값"  "건수값"
0  12357  "블로그"  2020-05-01  1  "친구"  26419
1  12358  "블로그"  2020-05-01  2  "가족"  24182
2  12359  "블로그"  2020-05-01  3  "엄마"  18120
3  12360  "블로그"  2020-05-01  4  "혼자"  15575
4  12361  "블로그"  2020-05-01  5  "아기"  14965
5  12362  "블로그"  2020-05-01  6  "부모"  11697
6  12363  "블로그"  2020-05-01  7  "남편"  11489
7  12364  "블로그"  2020-05-01  8  "아빠"  10859
8  12365  "블로그"  2020-05-01  9  "지인"  7804
9  12366  "블로그"  2020-05-01  10  "자녀"  7621

Last rows
     "기본키값"  "채널값"  "해당일자"  "차례값"  "이슈어값"  "건수값"
390  61555  "블로그"  2020-05-04  31  "막내"  1182
391  61556  "블로그"  2020-05-04  32  "친정"  1151
392  61557  "블로그"  2020-05-04  33  "직장동료"  1070
393  61558  "블로그"  2020-05-04  34  "자매"  947
394  61559  "블로그"  2020-05-04  35  "동창"  841
395  61560  "블로그"  2020-05-04  36  "시댁"  818
396  61561  "블로그"  2020-05-04  37  "후배"  811
397  61562  "블로그"  2020-05-04  38  "친정어머니"  592
398  61563  "블로그"  2020-05-04  39  "삼촌"  570
399  61564  "블로그"  2020-05-04  40  "여동생"  559
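
In the first ten sample rows, "건수값" falls strictly as "차례값" rises, so a rank-based coefficient on those rows alone is exactly -1 while a linear one is weaker. The sketch below computes both with plain Python. It is an added illustration on ten rows only, not the report's own correlation method, and it makes no claim about which measure each matrix in the Correlations section shows. The helper names `ranks`, `pearson` and `spearman` are hypothetical.

```python
def ranks(values):
    """Return 1-based ranks; ties are not handled because this sample has none."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(ranks(x), ranks(y))

# "차례값" and "건수값" from the first ten sample rows (2020-05-01).
order_values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
count_values = [26419, 24182, 18120, 15575, 14965, 11697, 11489, 10859, 7804, 7621]

print(f"Pearson:  {pearson(order_values, count_values):.3f}")
print(f"Spearman: {spearman(order_values, count_values):.3f}")
```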