Overview

Dataset statistics

Number of variables: 9
Number of observations: 30
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.4 KiB
Average record size in memory: 81.4 B

Variable types

Numeric: 5
Categorical: 3
Text: 1

Dataset

Description: Sample data
Author: 성균관대학교 산학협력단 (Sungkyunkwan University Industry-Academic Cooperation Foundation)
URL: https://www.bigdata-environment.kr/user/data_market/detail.do?id=663752c0-2fb1-11ea-94b6-73a02796bba4

Alerts

연월일 has constant value "2020-10-01" (Constant)
일간언급량연번 is highly overall correlated with 환경플랫폼 하위 도메인명 and 1 other field (High correlation)
긍정언급량 is highly overall correlated with 부정언급량 and 2 other fields (High correlation)
부정언급량 is highly overall correlated with 긍정언급량 and 2 other fields (High correlation)
중립언급량 is highly overall correlated with 긍정언급량 and 2 other fields (High correlation)
총언급량 is highly overall correlated with 긍정언급량 and 2 other fields (High correlation)
환경플랫폼 하위 도메인명 is highly overall correlated with 일간언급량연번 (High correlation)
SNS 채널명 is highly overall correlated with 일간언급량연번 (High correlation)
일간언급량연번 has unique values (Unique)

Reproduction

Analysis started: 2023-12-10 13:26:14.091300
Analysis finished: 2023-12-10 13:26:18.553642
Duration: 4.46 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
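
A report of this kind can be regenerated with ydata-profiling. The sketch below is a minimal example, assuming the data is available as a local CSV file; the function name `build_report`, the default output path, and the report title are hypothetical, and the original run may have used other settings (see config.json).

```python
def build_report(csv_path: str, out_path: str = "report.html") -> str:
    """Profile a CSV file and write an HTML report (paths are hypothetical)."""
    # Imports are local so the function can be defined even where
    # pandas / ydata-profiling are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    # Default settings; the original configuration is not known here.
    report = ProfileReport(df, title="Sample data profile")
    report.to_file(out_path)
    return out_path
```

Calling `build_report("sample.csv")` with a hypothetical input file would write `report.html` next to the script.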

Variables

일간언급량연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 30
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.5
Minimum: 1
Maximum: 30
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 1
5-th percentile: 2.45
Q1: 8.25
median: 15.5
Q3: 22.75
95-th percentile: 28.55
Maximum: 30
Range: 29
Interquartile range (IQR): 14.5

Descriptive statistics

Standard deviation: 8.8034084
Coefficient of variation (CV): 0.56796183
Kurtosis: -1.2
Mean: 15.5
Median Absolute Deviation (MAD): 7.5
Skewness: 0
Sum: 465
Variance: 77.5
Monotonicity: Strictly increasing
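
Because 일간언급량연번 is simply the sequence 1 to 30, several of the figures above can be checked by hand. The sketch below uses only the Python standard library and assumes the sample (n - 1) variance and linear-interpolation quartiles; both assumptions agree with the reported values, but they are not taken from the report's configuration.

```python
import math
import statistics

values = list(range(1, 31))  # 일간언급량연번 runs from 1 to 30

mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (divides by n - 1)
std = math.sqrt(variance)
cv = std / mean                          # coefficient of variation
median = statistics.median(values)
# Median absolute deviation: median of |x - median|
mad = statistics.median(abs(v - median) for v in values)
# "inclusive" interpolates linearly between the minimum and maximum
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

print(mean, variance, round(std, 7), round(cv, 8), mad, q1, q3, iqr)
```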
Histogram with fixed size bins (bins=30)

Common values
Value | Count | Frequency (%)
1 | 1 | 3.3%
17 | 1 | 3.3%
30 | 1 | 3.3%
29 | 1 | 3.3%
28 | 1 | 3.3%
27 | 1 | 3.3%
26 | 1 | 3.3%
25 | 1 | 3.3%
24 | 1 | 3.3%
23 | 1 | 3.3%
Other values (20) | 20 | 66.7%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 3.3%
2 | 1 | 3.3%
3 | 1 | 3.3%
4 | 1 | 3.3%
5 | 1 | 3.3%
6 | 1 | 3.3%
7 | 1 | 3.3%
8 | 1 | 3.3%
9 | 1 | 3.3%
10 | 1 | 3.3%

Maximum 10 values
Value | Count | Frequency (%)
30 | 1 | 3.3%
29 | 1 | 3.3%
28 | 1 | 3.3%
27 | 1 | 3.3%
26 | 1 | 3.3%
25 | 1 | 3.3%
24 | 1 | 3.3%
23 | 1 | 3.3%
22 | 1 | 3.3%
21 | 1 | 3.3%

연월일
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

2020-10-01: 30

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020-10-01
2nd row: 2020-10-01
3rd row: 2020-10-01
4th row: 2020-10-01
5th row: 2020-10-01

Common Values

Value | Count | Frequency (%)
2020-10-01 | 30 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2020-10-01 | 30 | 100.0%

환경플랫폼 하위 도메인명
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

물환경: 12
자연환경: 12
생활환경: 6

Length

Max length: 4
Median length: 4
Mean length: 3.6
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물환경
2nd row: 물환경
3rd row: 물환경
4th row: 물환경
5th row: 물환경

Common Values

Value | Count | Frequency (%)
물환경 | 12 | 40.0%
자연환경 | 12 | 40.0%
생활환경 | 6 | 20.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
물환경 | 12 | 40.0%
자연환경 | 12 | 40.0%
생활환경 | 6 | 20.0%

도메인 하위 카테고리명
Text

Distinct: 15
Distinct (%): 50.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 4
Median length: 3
Mean length: 2.8
Min length: 2

Characters and Unicode

Total characters: 84
Distinct characters: 25
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물재난
2nd row: 상수도
3rd row: 지하수
4th row: 하수도
5th row: 하천

Value | Count | Frequency (%)
물재난 | 2 | 6.7%
상수도 | 2 | 6.7%
지하수 | 2 | 6.7%
하수도 | 2 | 6.7%
하천 | 2 | 6.7%
호소 | 2 | 6.7%
대기 | 2 | 6.7%
폐기물 | 2 | 6.7%
화학물질 | 2 | 6.7%
기상변화 | 2 | 6.7%
Other values (5) | 10 | 33.3%

Most occurring characters

Value | Count | Frequency (%)
기 | 8 | 9.5%
물 | 6 | 7.1%
수 | 6 | 7.1%
지 | 6 | 7.1%
하 | 6 | 7.1%
화 | 6 | 7.1%
상 | 4 | 4.8%
도 | 4 | 4.8%
질 | 4 | 4.8%
변 | 4 | 4.8%
Other values (15) | 30 | 35.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 84 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
기 | 8 | 9.5%
물 | 6 | 7.1%
수 | 6 | 7.1%
지 | 6 | 7.1%
하 | 6 | 7.1%
화 | 6 | 7.1%
상 | 4 | 4.8%
도 | 4 | 4.8%
질 | 4 | 4.8%
변 | 4 | 4.8%
Other values (15) | 30 | 35.7%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 84 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
기 | 8 | 9.5%
물 | 6 | 7.1%
수 | 6 | 7.1%
지 | 6 | 7.1%
하 | 6 | 7.1%
화 | 6 | 7.1%
상 | 4 | 4.8%
도 | 4 | 4.8%
질 | 4 | 4.8%
변 | 4 | 4.8%
Other values (15) | 30 | 35.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 84 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
기 | 8 | 9.5%
물 | 6 | 7.1%
수 | 6 | 7.1%
지 | 6 | 7.1%
하 | 6 | 7.1%
화 | 6 | 7.1%
상 | 4 | 4.8%
도 | 4 | 4.8%
질 | 4 | 4.8%
변 | 4 | 4.8%
Other values (15) | 30 | 35.7%

SNS 채널명
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 6.7%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

All: 15
blog: 15

Length

Max length: 4
Median length: 3.5
Mean length: 3.5
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: All
2nd row: All
3rd row: All
4th row: All
5th row: All

Common Values

Value | Count | Frequency (%)
All | 15 | 50.0%
blog | 15 | 50.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
All | 15 | 50.0%
blog | 15 | 50.0%

긍정언급량
Real number (ℝ)

HIGH CORRELATION 

Distinct: 15
Distinct (%): 50.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 158.8
Minimum: 9
Maximum: 700
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 9
5-th percentile: 9.9
Q1: 30
median: 110
Q3: 181
95-th percentile: 550.15
Maximum: 700
Range: 691
Interquartile range (IQR): 151

Descriptive statistics

Standard deviation: 183.58694
Coefficient of variation (CV): 1.1560891
Kurtosis: 3.4393402
Mean: 158.8
Median Absolute Deviation (MAD): 77
Skewness: 1.8760042
Sum: 4764
Variance: 33704.166
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=15)

Common values
Value | Count | Frequency (%)
29 | 2 | 6.7%
140 | 2 | 6.7%
11 | 2 | 6.7%
33 | 2 | 6.7%
178 | 2 | 6.7%
161 | 2 | 6.7%
53 | 2 | 6.7%
367 | 2 | 6.7%
21 | 2 | 6.7%
110 | 2 | 6.7%
Other values (5) | 10 | 33.3%

Minimum 10 values
Value | Count | Frequency (%)
9 | 2 | 6.7%
11 | 2 | 6.7%
21 | 2 | 6.7%
29 | 2 | 6.7%
33 | 2 | 6.7%
53 | 2 | 6.7%
54 | 2 | 6.7%
110 | 2 | 6.7%
140 | 2 | 6.7%
161 | 2 | 6.7%

Maximum 10 values
Value | Count | Frequency (%)
700 | 2 | 6.7%
367 | 2 | 6.7%
334 | 2 | 6.7%
182 | 2 | 6.7%
178 | 2 | 6.7%
161 | 2 | 6.7%
140 | 2 | 6.7%
110 | 2 | 6.7%
54 | 2 | 6.7%
53 | 2 | 6.7%

부정언급량
Real number (ℝ)

HIGH CORRELATION 

Distinct: 15
Distinct (%): 50.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 83.333333
Minimum: 2
Maximum: 496
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 2
5-th percentile: 2.45
Q1: 17.5
median: 40
Q3: 93.25
95-th percentile: 348.85
Maximum: 496
Range: 494
Interquartile range (IQR): 75.75

Descriptive statistics

Standard deviation: 123.47088
Coefficient of variation (CV): 1.4816505
Kurtosis: 7.3462611
Mean: 83.333333
Median Absolute Deviation (MAD): 32
Skewness: 2.7004569
Sum: 2500
Variance: 15245.057
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=15)

Common values
Value | Count | Frequency (%)
17 | 2 | 6.7%
63 | 2 | 6.7%
2 | 2 | 6.7%
19 | 2 | 6.7%
99 | 2 | 6.7%
43 | 2 | 6.7%
40 | 2 | 6.7%
159 | 2 | 6.7%
8 | 2 | 6.7%
34 | 2 | 6.7%
Other values (5) | 10 | 33.3%

Minimum 10 values
Value | Count | Frequency (%)
2 | 2 | 6.7%
3 | 2 | 6.7%
8 | 2 | 6.7%
17 | 2 | 6.7%
19 | 2 | 6.7%
22 | 2 | 6.7%
34 | 2 | 6.7%
40 | 2 | 6.7%
43 | 2 | 6.7%
63 | 2 | 6.7%

Maximum 10 values
Value | Count | Frequency (%)
496 | 2 | 6.7%
169 | 2 | 6.7%
159 | 2 | 6.7%
99 | 2 | 6.7%
76 | 2 | 6.7%
63 | 2 | 6.7%
43 | 2 | 6.7%
40 | 2 | 6.7%
34 | 2 | 6.7%
22 | 2 | 6.7%

중립언급량
Real number (ℝ)

HIGH CORRELATION 

Distinct: 15
Distinct (%): 50.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6601.9333
Minimum: 240
Maximum: 27867
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 240
5-th percentile: 371.4
Q1: 1281.5
median: 4288
Q3: 11461.25
95-th percentile: 21274.95
Maximum: 27867
Range: 27627
Interquartile range (IQR): 10179.75

Descriptive statistics

Standard deviation: 7360.9686
Coefficient of variation (CV): 1.1149717
Kurtosis: 2.9000958
Mean: 6601.9333
Median Absolute Deviation (MAD): 3373
Skewness: 1.7115741
Sum: 198058
Variance: 54183859
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=15)

Common values
Value | Count | Frequency (%)
1226 | 2 | 6.7%
13218 | 2 | 6.7%
240 | 2 | 6.7%
1448 | 2 | 6.7%
7661 | 2 | 6.7%
4288 | 2 | 6.7%
2079 | 2 | 6.7%
12728 | 2 | 6.7%
727 | 2 | 6.7%
5418 | 2 | 6.7%
Other values (5) | 10 | 33.3%

Minimum 10 values
Value | Count | Frequency (%)
240 | 2 | 6.7%
532 | 2 | 6.7%
727 | 2 | 6.7%
1226 | 2 | 6.7%
1448 | 2 | 6.7%
2079 | 2 | 6.7%
2543 | 2 | 6.7%
4288 | 2 | 6.7%
5418 | 2 | 6.7%
6140 | 2 | 6.7%

Maximum 10 values
Value | Count | Frequency (%)
27867 | 2 | 6.7%
13218 | 2 | 6.7%
12914 | 2 | 6.7%
12728 | 2 | 6.7%
7661 | 2 | 6.7%
6140 | 2 | 6.7%
5418 | 2 | 6.7%
4288 | 2 | 6.7%
2543 | 2 | 6.7%
2079 | 2 | 6.7%

총언급량
Real number (ℝ)

HIGH CORRELATION 

Distinct: 15
Distinct (%): 50.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6844.0667
Minimum: 253
Maximum: 29063
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 253
5-th percentile: 383.95
Q1: 1329
median: 4492
Q3: 11925
95-th percentile: 22024.1
Maximum: 29063
Range: 28810
Interquartile range (IQR): 10596

Descriptive statistics

Standard deviation: 7653.1862
Coefficient of variation (CV): 1.118222
Kurtosis: 3.0082319
Mean: 6844.0667
Median Absolute Deviation (MAD): 3446
Skewness: 1.7344586
Sum: 205322
Variance: 58571259
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=15)

Common values
Value | Count | Frequency (%)
1272 | 2 | 6.7%
13421 | 2 | 6.7%
253 | 2 | 6.7%
1500 | 2 | 6.7%
7938 | 2 | 6.7%
4492 | 2 | 6.7%
2172 | 2 | 6.7%
13254 | 2 | 6.7%
756 | 2 | 6.7%
5562 | 2 | 6.7%
Other values (5) | 10 | 33.3%

Minimum 10 values
Value | Count | Frequency (%)
253 | 2 | 6.7%
544 | 2 | 6.7%
756 | 2 | 6.7%
1272 | 2 | 6.7%
1500 | 2 | 6.7%
2172 | 2 | 6.7%
2619 | 2 | 6.7%
4492 | 2 | 6.7%
5562 | 2 | 6.7%
6398 | 2 | 6.7%

Maximum 10 values
Value | Count | Frequency (%)
29063 | 2 | 6.7%
13421 | 2 | 6.7%
13417 | 2 | 6.7%
13254 | 2 | 6.7%
7938 | 2 | 6.7%
6398 | 2 | 6.7%
5562 | 2 | 6.7%
4492 | 2 | 6.7%
2619 | 2 | 6.7%
2172 | 2 | 6.7%

Interactions

[Pairwise scatter plots of the five numeric variables; figures not reproduced]

Correlations

Variable | 일간언급량연번 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 긍정언급량 | 부정언급량 | 중립언급량 | 총언급량
일간언급량연번 | 1.000 | 1.000 | 0.000 | 1.000 | 0.091 | 0.000 | 0.000 | 0.000
환경플랫폼 하위 도메인명 | 1.000 | 1.000 | 1.000 | 0.000 | 0.729 | 0.253 | 0.000 | 0.000
도메인 하위 카테고리명 | 0.000 | 1.000 | 1.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000
SNS 채널명 | 1.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000
긍정언급량 | 0.091 | 0.729 | 1.000 | 0.000 | 1.000 | 0.972 | 0.886 | 0.886
부정언급량 | 0.000 | 0.253 | 1.000 | 0.000 | 0.972 | 1.000 | 0.889 | 0.889
중립언급량 | 0.000 | 0.000 | 1.000 | 0.000 | 0.886 | 0.889 | 1.000 | 1.000
총언급량 | 0.000 | 0.000 | 1.000 | 0.000 | 0.886 | 0.889 | 1.000 | 1.000

Variable | SNS 채널명 | 환경플랫폼 하위 도메인명
SNS 채널명 | 1.000 | 0.000
환경플랫폼 하위 도메인명 | 0.000 | 1.000

Variable | 일간언급량연번 | 긍정언급량 | 부정언급량 | 중립언급량 | 총언급량 | 환경플랫폼 하위 도메인명 | SNS 채널명
일간언급량연번 | 1.000 | 0.109 | 0.105 | 0.073 | 0.073 | 0.861 | 0.845
긍정언급량 | 0.109 | 1.000 | 0.975 | 0.929 | 0.929 | 0.378 | 0.000
부정언급량 | 0.105 | 0.975 | 1.000 | 0.946 | 0.946 | 0.230 | 0.000
중립언급량 | 0.073 | 0.929 | 0.946 | 1.000 | 1.000 | 0.000 | 0.000
총언급량 | 0.073 | 0.929 | 0.946 | 1.000 | 1.000 | 0.000 | 0.000
환경플랫폼 하위 도메인명 | 0.861 | 0.378 | 0.230 | 0.000 | 0.000 | 1.000 | 0.000
SNS 채널명 | 0.845 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

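First rows
Index | 일간언급량연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 긍정언급량 | 부정언급량 | 중립언급량 | 총언급량
0 | 1 | 2020-10-01 | 물환경 | 물재난 | All | 29 | 17 | 1226 | 1272
1 | 2 | 2020-10-01 | 물환경 | 상수도 | All | 140 | 63 | 13218 | 13421
2 | 3 | 2020-10-01 | 물환경 | 지하수 | All | 11 | 2 | 240 | 253
3 | 4 | 2020-10-01 | 물환경 | 하수도 | All | 33 | 19 | 1448 | 1500
4 | 5 | 2020-10-01 | 물환경 | 하천 | All | 178 | 99 | 7661 | 7938
5 | 6 | 2020-10-01 | 물환경 | 호소 | All | 161 | 43 | 4288 | 4492
6 | 7 | 2020-10-01 | 생활환경 | 대기 | All | 53 | 40 | 2079 | 2172
7 | 8 | 2020-10-01 | 생활환경 | 폐기물 | All | 367 | 159 | 12728 | 13254
8 | 9 | 2020-10-01 | 생활환경 | 화학물질 | All | 21 | 8 | 727 | 756
9 | 10 | 2020-10-01 | 자연환경 | 기상변화 | All | 110 | 34 | 5418 | 5562

Last rows
Index | 일간언급량연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 긍정언급량 | 부정언급량 | 중립언급량 | 총언급량
20 | 21 | 2020-10-01 | 물환경 | 호소 | blog | 161 | 43 | 4288 | 4492
21 | 22 | 2020-10-01 | 생활환경 | 대기 | blog | 53 | 40 | 2079 | 2172
22 | 23 | 2020-10-01 | 생활환경 | 폐기물 | blog | 367 | 159 | 12728 | 13254
23 | 24 | 2020-10-01 | 생활환경 | 화학물질 | blog | 21 | 8 | 727 | 756
24 | 25 | 2020-10-01 | 자연환경 | 기상변화 | blog | 110 | 34 | 5418 | 5562
25 | 26 | 2020-10-01 | 자연환경 | 기후변화 | blog | 700 | 496 | 27867 | 29063
26 | 27 | 2020-10-01 | 자연환경 | 생태계 | blog | 182 | 76 | 6140 | 6398
27 | 28 | 2020-10-01 | 자연환경 | 지질 | blog | 54 | 22 | 2543 | 2619
28 | 29 | 2020-10-01 | 자연환경 | 지형 | blog | 334 | 169 | 12914 | 13417
29 | 30 | 2020-10-01 | 자연환경 | 토양 | blog | 9 | 3 | 532 | 544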
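
The sample rows suggest that 총언급량 is the sum of 긍정언급량, 부정언급량 and 중립언급량, which would be consistent with the 1.000 correlation reported between 중립언급량 and 총언급량. The sketch below checks this on the 15 distinct categories visible in the first and last rows; the tuple layout and variable names are illustrative, and the assumption that each category appears exactly twice (once per SNS channel) comes from the frequency tables rather than from the full data.

```python
# (category, positive, negative, neutral, total), copied from the sample rows
rows = [
    ("물재난", 29, 17, 1226, 1272),
    ("상수도", 140, 63, 13218, 13421),
    ("지하수", 11, 2, 240, 253),
    ("하수도", 33, 19, 1448, 1500),
    ("하천", 178, 99, 7661, 7938),
    ("호소", 161, 43, 4288, 4492),
    ("대기", 53, 40, 2079, 2172),
    ("폐기물", 367, 159, 12728, 13254),
    ("화학물질", 21, 8, 727, 756),
    ("기상변화", 110, 34, 5418, 5562),
    ("기후변화", 700, 496, 27867, 29063),
    ("생태계", 182, 76, 6140, 6398),
    ("지질", 54, 22, 2543, 2619),
    ("지형", 334, 169, 12914, 13417),
    ("토양", 9, 3, 532, 544),
]

# Categories where positive + negative + neutral does not equal the total
mismatches = [name for name, pos, neg, neu, total in rows if pos + neg + neu != total]

# Assumption: every category occurs twice, so column sums are doubled
total_sum = 2 * sum(r[4] for r in rows)
neutral_share = sum(r[3] for r in rows) / sum(r[4] for r in rows)

print(mismatches, total_sum, round(neutral_share, 3))
```

Neutral mentions make up most of every total, so the two columns move almost in lockstep.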