Overview

Dataset statistics

Number of variables: 9
Number of observations: 56
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.2 KiB
Average record size in memory: 77.3 B

Variable types

Numeric: 3
DateTime: 1
Categorical: 4
Text: 1

Dataset

Description: Sample data
Author: Sungkyunkwan University Industry-Academic Cooperation Foundation (성균관대학교 산학협력단)
URL: https://www.bigdata-environment.kr/user/data_market/detail.do?id=f0997ad0-e841-11ea-a0ba-57da34c93da2

Alerts

연월일 has a constant value (Constant)
환경플랫폼 하위 도메인명 has a constant value (Constant)
도메인 하위 카테고리명 has a constant value (Constant)
SNS 채널명 has a constant value (Constant)
일간연관어언급량 is highly overall correlated with 일간연관어단어량 (High correlation)
일간연관어단어량 is highly overall correlated with 일간연관어언급량 (High correlation)
일간연관어연번 has unique values (Unique)
연관어명 has unique values (Unique)

Reproduction

Analysis started: 2024-04-21 10:08:05.454564
Analysis finished: 2024-04-21 10:08:08.635119
Duration: 3.18 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
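
A report of this kind can be regenerated with ydata-profiling. The following is a minimal sketch, assuming the source data is available as a CSV file. The file names `sample.csv` and `report.html`, the report title, and the `build_report` helper are hypothetical and do not come from this report. The imports are deferred into the function because pandas and ydata-profiling are third-party packages.

```python
def build_report(csv_path: str = "sample.csv", out_path: str = "report.html") -> str:
    """Profile a CSV file and write an HTML report (hypothetical file names)."""
    # Deferred imports: both packages must be installed separately.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Parse 연월일 as a date so it is profiled as a date column.
    df = pd.read_csv(csv_path, parse_dates=["연월일"])
    report = ProfileReport(df, title="Sample data profile")
    report.to_file(out_path)
    return out_path
```

Calling `build_report()` writes the HTML file and returns its path. The settings used for the original run are not shown in this report, so the output of this sketch may differ in detail.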

Variables

일간연관어연번 (daily related-term serial number)
Real number (ℝ)

UNIQUE

Distinct: 56
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.5
Minimum: 1
Maximum: 56
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 632.0 B

Quantile statistics

Minimum: 1
5-th percentile: 3.75
Q1: 14.75
Median: 28.5
Q3: 42.25
95-th percentile: 53.25
Maximum: 56
Range: 55
Interquartile range (IQR): 27.5

Descriptive statistics

Standard deviation: 16.309506
Coefficient of variation (CV): 0.57226338
Kurtosis: -1.2
Mean: 28.5
Median Absolute Deviation (MAD): 14
Skewness: 0
Sum: 1596
Variance: 266
Monotonicity: Strictly increasing
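
The figures above are what one would expect if the column simply holds the integers 1 to 56. The sketch below recomputes them with the Python standard library under that assumption (the report itself only states 56 distinct values, minimum 1, maximum 56, strictly increasing). It uses the sample variance (n - 1 denominator) and linearly interpolated quartiles.

```python
import math
import statistics

# Assumption: the column is exactly 1, 2, ..., 56.
values = list(range(1, 57))

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std = math.sqrt(variance)
cv = std / mean                         # coefficient of variation
median = statistics.median(values)
# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)
# Quartiles with linear interpolation between order statistics.
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")

print(mean, variance, round(std, 6), round(cv, 8), mad, q1, q3, q3 - q1)
```

Under this assumption the recomputed mean, variance, MAD, and quartiles agree with the reported 28.5, 266, 14, 14.75, and 42.25.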
Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
1  1  1.8%
30  1  1.8%
32  1  1.8%
33  1  1.8%
34  1  1.8%
35  1  1.8%
36  1  1.8%
37  1  1.8%
38  1  1.8%
39  1  1.8%
Other values (46)  46  82.1%

Minimum 10 values

Value  Count  Frequency (%)
1  1  1.8%
2  1  1.8%
3  1  1.8%
4  1  1.8%
5  1  1.8%
6  1  1.8%
7  1  1.8%
8  1  1.8%
9  1  1.8%
10  1  1.8%

Maximum 10 values

Value  Count  Frequency (%)
56  1  1.8%
55  1  1.8%
54  1  1.8%
53  1  1.8%
52  1  1.8%
51  1  1.8%
50  1  1.8%
49  1  1.8%
48  1  1.8%
47  1  1.8%

연월일 (date)
Date

CONSTANT

Distinct: 1
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 576.0 B
Minimum: 2020-07-01 00:00:00
Maximum: 2020-07-01 00:00:00
Histogram with fixed size bins (bins=1)

환경플랫폼 하위 도메인명 (environment platform sub-domain name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 576.0 B

Value  Count
물환경 (water environment)  56

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물환경
2nd row: 물환경
3rd row: 물환경
4th row: 물환경
5th row: 물환경

Common Values

Value  Count  Frequency (%)
물환경  56  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
물환경  56  100.0%

도메인 하위 카테고리명 (domain sub-category name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 576.0 B

Value  Count
물재난 (water disaster)  56

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물재난
2nd row: 물재난
3rd row: 물재난
4th row: 물재난
5th row: 물재난

Common Values

Value  Count  Frequency (%)
물재난  56  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
물재난  56  100.0%

SNS 채널명 (SNS channel name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 576.0 B

Value  Count
news  56

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: news
2nd row: news
3rd row: news
4th row: news
5th row: news

Common Values

Value  Count  Frequency (%)
news  56  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
news  56  100.0%

단어속성명 (word attribute name)
Categorical

Distinct: 10
Distinct (%): 17.9%
Missing: 0
Missing (%): 0.0%
Memory size: 576.0 B

Value  Count
속성  16
라이프  15
장소  8
기타  6
사회이슈  3
Other values (5)  8

Length

Max length: 6
Median length: 2
Mean length: 2.4464286
Min length: 2

Unique

Unique: 2
Unique (%): 3.6%

Sample

1st row: 속성
2nd row: 상품
3rd row: 속성
4th row: 라이프
5th row: 라이프

Common Values

Value  Count  Frequency (%)
속성 (attribute)  16  28.6%
라이프 (life)  15  26.8%
장소 (place)  8  14.3%
기타 (other)  6  10.7%
사회이슈 (social issue)  3  5.4%
시간 (time)  2  3.6%
단체 (organization)  2  3.6%
인물 (person)  2  3.6%
상품 (product)  1  1.8%
엔터테인먼트 (entertainment)  1  1.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
속성  16  28.6%
라이프  15  26.8%
장소  8  14.3%
기타  6  10.7%
사회이슈  3  5.4%
시간  2  3.6%
단체  2  3.6%
인물  2  3.6%
상품  1  1.8%
엔터테인먼트  1  1.8%

연관어명 (related term)
Text

UNIQUE

Distinct: 56
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 576.0 B

Length

Max length: 5
Median length: 2
Mean length: 2.4464286
Min length: 2

Characters and Unicode

Total characters: 137
Distinct characters: 77
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
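
As an illustration of those character properties, the sketch below inspects the five sample values of 연관어명 listed in this report with Python's `unicodedata` module. It is not how ydata-profiling computes its figures. In particular, the standard library has no direct script lookup, so the script is approximated here from the first word of each character's Unicode name, which is an assumption that happens to work for Hangul syllables.

```python
import unicodedata
from collections import Counter

# The five sample values of 연관어명 shown in this report.
words = ["가격", "가구", "가드레일", "가로수", "가뭄"]

chars = "".join(words)
char_counts = Counter(chars)
# General category of each code point; "Lo" means "Other Letter".
categories = {unicodedata.category(ch) for ch in chars}
# Illustrative shortcut: take the script from the Unicode character name,
# e.g. "HANGUL SYLLABLE GA" -> "Hangul".
scripts = {unicodedata.name(ch).split()[0].title() for ch in chars}
# Mean length = total characters / number of values.
mean_length = len(chars) / len(words)

print(char_counts.most_common(1), categories, scripts, mean_length)
```

The same ratio applied to the full column matches the report: 137 total characters over 56 values gives a mean length of about 2.446.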

Unique

Unique: 56
Unique (%): 100.0%

Sample

1st row: 가격
2nd row: 가구
3rd row: 가드레일
4th row: 가로수
5th row: 가뭄

Value  Count  Frequency (%)
가격  1  1.8%
가구  1  1.8%
건설  1  1.8%
강타  1  1.8%
강현  1  1.8%
강화군  1  1.8%
갖추다  1  1.8%
개체  1  1.8%
갯벌  1  1.8%
거닐다  1  1.8%
Other values (46)  46  82.1%

Most occurring characters

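Value  Count  Frequency (%)
[missing]  13  9.5%
[missing]  12  8.8%
[missing]  5  3.6%
[missing]  5  3.6%
[missing]  4  2.9%
[missing]  3  2.2%
[missing]  3  2.2%
[missing]  3  2.2%
[missing]  3  2.2%
[missing]  3  2.2%
Other values (67)  83  60.6%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  137  100.0%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
[missing]  13  9.5%
[missing]  12  8.8%
[missing]  5  3.6%
[missing]  5  3.6%
[missing]  4  2.9%
[missing]  3  2.2%
[missing]  3  2.2%
[missing]  3  2.2%
[missing]  3  2.2%
[missing]  3  2.2%
Other values (67)  83  60.6%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  137  100.0%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
[missing]  13  9.5%
[missing]  12  8.8%
[missing]  5  3.6%
[missing]  5  3.6%
[missing]  4  2.9%
[missing]  3  2.2%
[missing]  3  2.2%
[missing]  3  2.2%
[missing]  3  2.2%
[missing]  3  2.2%
Other values (67)  83  60.6%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  137  100.0%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
[missing]  13  9.5%
[missing]  12  8.8%
[missing]  5  3.6%
[missing]  5  3.6%
[missing]  4  2.9%
[missing]  3  2.2%
[missing]  3  2.2%
[missing]  3  2.2%
[missing]  3  2.2%
[missing]  3  2.2%
Other values (67)  83  60.6%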

일간연관어언급량 (daily related-term mention count)
Real number (ℝ)

HIGH CORRELATION

Distinct: 16
Distinct (%): 28.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.2321429
Minimum: 1
Maximum: 44
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 632.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 4
Q3: 7
95-th percentile: 18.5
Maximum: 44
Range: 43
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 7.9954329
Coefficient of variation (CV): 1.2829348
Kurtosis: 11.258215
Mean: 6.2321429
Median Absolute Deviation (MAD): 2
Skewness: 3.1047023
Sum: 349
Variance: 63.926948
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=16)

Common values

Value  Count  Frequency (%)
2  12  21.4%
1  11  19.6%
5  6  10.7%
4  5  8.9%
6  4  7.1%
7  4  7.1%
3  3  5.4%
12  2  3.6%
11  2  3.6%
8  1  1.8%
Other values (6)  6  10.7%

Minimum 10 values

Value  Count  Frequency (%)
1  11  19.6%
2  12  21.4%
3  3  5.4%
4  5  8.9%
5  6  10.7%
6  4  7.1%
7  4  7.1%
8  1  1.8%
11  2  3.6%
12  2  3.6%

Maximum 10 values

Value  Count  Frequency (%)
44  1  1.8%
36  1  1.8%
20  1  1.8%
18  1  1.8%
17  1  1.8%
14  1  1.8%
12  2  3.6%
11  2  3.6%
8  1  1.8%
7  4  7.1%

일간연관어단어량 (daily related-term word count)
Real number (ℝ)

HIGH CORRELATION

Distinct: 22
Distinct (%): 39.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.035714
Minimum: 1
Maximum: 122
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 632.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 4.5
Q3: 9.25
95-th percentile: 40.25
Maximum: 122
Range: 121
Interquartile range (IQR): 7.25

Descriptive statistics

Standard deviation: 22.331359
Coefficient of variation (CV): 1.8554245
Kurtosis: 15.85356
Mean: 12.035714
Median Absolute Deviation (MAD): 3
Skewness: 3.8233371
Sum: 674
Variance: 498.68961
Monotonicity: Not monotonic
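
Several of the reported statistics are tied together by simple identities: mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean, IQR = Q3 - Q1, and range = maximum - minimum. The sketch below checks those identities against the figures reported for the two count variables. The `check` helper and the tolerance are illustrative choices, not part of the report.

```python
import math

n = 56  # number of observations reported in the overview

# Figures copied from this report for the two count variables.
reported = {
    "일간연관어언급량": {
        "sum": 349, "mean": 6.2321429, "std": 7.9954329,
        "variance": 63.926948, "cv": 1.2829348,
        "q1": 2, "q3": 7, "iqr": 5, "min": 1, "max": 44, "range": 43,
    },
    "일간연관어단어량": {
        "sum": 674, "mean": 12.035714, "std": 22.331359,
        "variance": 498.68961, "cv": 1.8554245,
        "q1": 2, "q3": 9.25, "iqr": 7.25, "min": 1, "max": 122, "range": 121,
    },
}


def check(stats: dict, n: int, rel_tol: float = 1e-6) -> dict:
    """Return, for each identity, whether the reported figures agree."""
    def close(a: float, b: float) -> bool:
        return math.isclose(a, b, rel_tol=rel_tol)

    return {
        "mean = sum / n": close(stats["mean"], stats["sum"] / n),
        "variance = std^2": close(stats["variance"], stats["std"] ** 2),
        "cv = std / mean": close(stats["cv"], stats["std"] / stats["mean"]),
        "iqr = q3 - q1": close(stats["iqr"], stats["q3"] - stats["q1"]),
        "range = max - min": close(stats["range"], stats["max"] - stats["min"]),
    }


for name, stats in reported.items():
    print(name, check(stats, n))
```

Both variables have a CV above 1 and a mean well above the median, which is consistent with the strong positive skewness reported for them.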
Histogram with fixed size bins (bins=22)

Common values

Value  Count  Frequency (%)
1  9  16.1%
2  9  16.1%
4  8  14.3%
5  5  8.9%
8  3  5.4%
22  2  3.6%
3  2  3.6%
6  2  3.6%
7  2  3.6%
9  2  3.6%
Other values (12)  12  21.4%

Minimum 10 values

Value  Count  Frequency (%)
1  9  16.1%
2  9  16.1%
3  2  3.6%
4  8  14.3%
5  5  8.9%
6  2  3.6%
7  2  3.6%
8  3  5.4%
9  2  3.6%
10  1  1.8%

Maximum 10 values

Value  Count  Frequency (%)
122  1  1.8%
108  1  1.8%
44  1  1.8%
39  1  1.8%
35  1  1.8%
30  1  1.8%
24  1  1.8%
22  2  3.6%
19  1  1.8%
15  1  1.8%

Interactions

[Figure: nine pairwise interaction plots]

Correlations

Variable  일간연관어연번  단어속성명  연관어명  일간연관어언급량  일간연관어단어량
일간연관어연번  1.000  0.000  1.000  0.000  0.097
단어속성명  0.000  1.000  1.000  0.000  0.000
연관어명  1.000  1.000  1.000  1.000  1.000
일간연관어언급량  0.000  0.000  1.000  1.000  0.886
일간연관어단어량  0.097  0.000  1.000  0.886  1.000

Variable  일간연관어연번  일간연관어언급량  일간연관어단어량  단어속성명
일간연관어연번  1.000  -0.020  -0.022  0.000
일간연관어언급량  -0.020  1.000  0.932  0.000
일간연관어단어량  -0.022  0.932  1.000  0.000
단어속성명  0.000  0.000  0.000  1.000
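
The report does not state which coefficient each matrix uses. As a rough illustration of how a correlation between 일간연관어언급량 and 일간연관어단어량 can be computed, the sketch below applies Pearson's coefficient (one common choice, assumed here) to the 20 rows listed in the Sample section. Because the other 36 rows are not included in this report, the result is not expected to reproduce the 0.886 or 0.932 shown above. The `pearson` helper is hypothetical.

```python
import math

# Values from the first 10 and last 10 rows shown in the Sample section.
mentions = [1, 2, 2, 2, 8, 1, 7, 2, 7, 3, 2, 4, 4, 6, 2, 1, 2, 6, 4, 1]
words = [1, 2, 4, 2, 24, 1, 8, 2, 19, 3, 4, 4, 4, 6, 2, 1, 2, 7, 4, 1]


def pearson(x: list, y: list) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of centered cross-products and squares.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson(mentions, words)
print(round(r, 3))
```

On these 20 rows the coefficient is strongly positive, in line with the High correlation alert raised for the two variables.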

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

(index)  일간연관어연번  연월일  환경플랫폼 하위 도메인명  도메인 하위 카테고리명  SNS 채널명  단어속성명  연관어명  일간연관어언급량  일간연관어단어량
0  1  2020-07-01  물환경  물재난  news  속성  가격  1  1
1  2  2020-07-01  물환경  물재난  news  상품  가구  2  2
2  3  2020-07-01  물환경  물재난  news  속성  가드레일  2  4
3  4  2020-07-01  물환경  물재난  news  라이프  가로수  2  2
4  5  2020-07-01  물환경  물재난  news  라이프  가뭄  8  24
5  6  2020-07-01  물환경  물재난  news  속성  가옥  1  1
6  7  2020-07-01  물환경  물재난  news  시간  가을  7  8
7  8  2020-07-01  물환경  물재난  news  속성  가일  2  2
8  9  2020-07-01  물환경  물재난  news  속성  가입  7  19
9  10  2020-07-01  물환경  물재난  news  속성  가장자리  3  3

Last rows

(index)  일간연관어연번  연월일  환경플랫폼 하위 도메인명  도메인 하위 카테고리명  SNS 채널명  단어속성명  연관어명  일간연관어언급량  일간연관어단어량
46  47  2020-07-01  물환경  물재난  news  사회이슈  검출  2  4
47  48  2020-07-01  물환경  물재난  news  속성  게릴라  4  4
48  49  2020-07-01  물환경  물재난  news  속성  게시판  4  4
49  50  2020-07-01  물환경  물재난  news  시간  겨울  6  6
50  51  2020-07-01  물환경  물재난  news  기타  결박  2  2
51  52  2020-07-01  물환경  물재난  news  라이프  결연  1  1
52  53  2020-07-01  물환경  물재난  news  라이프  결항  2  2
53  54  2020-07-01  물환경  물재난  news  장소  경기도  6  7
54  55  2020-07-01  물환경  물재난  news  장소  경남  4  4
55  56  2020-07-01  물환경  물재난  news  라이프  경련  1  1