Overview

Dataset statistics

Number of variables: 9
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.5 KiB
Average record size in memory: 76.3 B
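
The figures above can be cross-checked with simple arithmetic. The sketch below is only an illustration: the variable names are made up for this example, and it assumes the total memory size is roughly the average record size multiplied by the number of observations.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 9
n_observations = 100
missing_cells = 0
avg_record_bytes = 76.3  # reported average record size, in bytes

# Total cell count and the share of missing cells.
total_cells = n_variables * n_observations
missing_pct = 100.0 * missing_cells / total_cells

# Assumption: total footprint ~ average record size x row count.
# Dividing by 1024 converts bytes to KiB.
total_kib = avg_record_bytes * n_observations / 1024

print(total_cells)            # 900
print(f"{missing_pct:.1f}%")  # 0.0%
print(round(total_kib, 1))    # 7.5
```

Under that assumption the rounded result agrees with the reported 7.5 KiB.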

Variable types

Numeric: 3
DateTime: 1
Categorical: 4
Text: 1

Dataset

Description: Sample data
Author: 성균관대학교 산학협력단
URL: https://www.bigdata-environment.kr/user/data_market/detail.do?id=f0997ad0-e841-11ea-a0ba-57da34c93da2

Alerts

연월일 has constant value "2020-10-01" (Constant)
환경플랫폼 하위 도메인명 has constant value "물환경" (Constant)
도메인 하위 카테고리명 has constant value "물재난" (Constant)
SNS 채널명 has constant value "news" (Constant)
일간연관어언급량 is highly overall correlated with 일간연관어단어량 (High correlation)
일간연관어단어량 is highly overall correlated with 일간연관어언급량 (High correlation)
일간연관어연번 has unique values (Unique)
연관어명 has unique values (Unique)
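
The Constant and Unique alerts can be reproduced on a small scale. The following is a minimal sketch, not the profiler's own implementation: `classify_columns` is a hypothetical helper, and it only looks at the first ten rows from the Sample section of this report, whereas the report itself is based on all 100 rows.

```python
# Column names and the first ten rows, copied from the Sample section.
COLUMNS = [
    "일간연관어연번", "연월일", "환경플랫폼 하위 도메인명",
    "도메인 하위 카테고리명", "SNS 채널명", "단어속성명",
    "연관어명", "일간연관어언급량", "일간연관어단어량",
]
ROWS = [
    (1, "2020-10-01", "물환경", "물재난", "news", "속성", "가경", 3, 9),
    (2, "2020-10-01", "물환경", "물재난", "news", "상품", "가공품", 4, 4),
    (3, "2020-10-01", "물환경", "물재난", "news", "상품", "가구", 1, 3),
    (4, "2020-10-01", "물환경", "물재난", "news", "상품", "가전제품", 1, 1),
    (5, "2020-10-01", "물환경", "물재난", "news", "라이프", "가정", 2, 6),
    (6, "2020-10-01", "물환경", "물재난", "news", "라이프", "가족", 3, 7),
    (7, "2020-10-01", "물환경", "물재난", "news", "속성", "가중", 4, 4),
    (8, "2020-10-01", "물환경", "물재난", "news", "속성", "가축", 13, 21),
    (9, "2020-10-01", "물환경", "물재난", "news", "장소", "가평", 2, 10),
    (10, "2020-10-01", "물환경", "물재난", "news", "장소", "가평군", 3, 4),
]


def classify_columns(columns, rows):
    """Return {column: alert} for columns that are constant or all-unique."""
    alerts = {}
    for i, name in enumerate(columns):
        values = [row[i] for row in rows]
        distinct = len(set(values))
        if distinct == 1:
            alerts[name] = "Constant"      # a single distinct value
        elif distinct == len(values):
            alerts[name] = "Unique"        # every value differs
    return alerts


alerts = classify_columns(COLUMNS, ROWS)
for name, alert in alerts.items():
    print(f"{name}: {alert}")
```

On these ten rows the helper flags the same four constant columns and two unique columns as the alerts above. With so few rows a column could be flagged Unique by chance, so the full dataset should be used for a real check.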

Reproduction

Analysis started: 2023-12-10 13:03:26.450536
Analysis finished: 2023-12-10 13:03:28.123809
Duration: 1.67 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
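
The reported duration follows directly from the two timestamps. A small sketch using only the Python standard library (the variable names are illustrative):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the Reproduction block.
started = datetime.strptime("2023-12-10 13:03:26.450536", FMT)
finished = datetime.strptime("2023-12-10 13:03:28.123809", FMT)

# Elapsed wall-clock time in seconds.
elapsed = (finished - started).total_seconds()
print(f"{elapsed:.2f} seconds")  # 1.67 seconds
```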

Variables

일간연관어연번
Real number (ℝ)

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.5
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5.95
Q1: 25.75
Median: 50.5
Q3: 75.25
95-th percentile: 95.05
Maximum: 100
Range: 99
Interquartile range (IQR): 49.5

Descriptive statistics

Standard deviation: 29.011492
Coefficient of variation (CV): 0.57448499
Kurtosis: -1.2
Mean: 50.5
Median Absolute Deviation (MAD): 25
Skewness: 0
Sum: 5050
Variance: 841.66667
Monotonicity: Strictly increasing
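
Because this variable is unique, strictly increasing and spans 1 to 100, it can be assumed to be the integers 1 through 100, which makes the statistics above easy to recompute. The sketch below uses the standard library only. It assumes linear-interpolation quantiles, a sample (n - 1) variance, and bias-corrected skewness and excess kurtosis; these conventions are inferred from the reported numbers, not confirmed from the profiler's source.

```python
import statistics

values = list(range(1, 101))  # assumed contents of 일간연관어연번
n = len(values)

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)
cv = std / mean                         # coefficient of variation

# "inclusive" quantiles interpolate linearly between order statistics.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
ventiles = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = ventiles[0], ventiles[-1]

# Median absolute deviation around the median.
mad = statistics.median([abs(v - median) for v in values])

# Central moments, then bias-corrected skewness and excess kurtosis.
m2 = sum((v - mean) ** 2 for v in values) / n
m3 = sum((v - mean) ** 3 for v in values) / n
m4 = sum((v - mean) ** 4 for v in values) / n
g1 = m3 / m2 ** 1.5
g2 = m4 / m2 ** 2 - 3
skewness = (n * (n - 1)) ** 0.5 / (n - 2) * g1
kurtosis = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)

print(mean, round(variance, 5), round(std, 6))
print(p5, q1, median, q3, p95, mad)
print(round(skewness, 6), round(kurtosis, 6))
```

Under these assumptions the recomputed values agree with the report to the number of digits shown.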

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1 | 1 | 1.0%
65 | 1 | 1.0%
75 | 1 | 1.0%
74 | 1 | 1.0%
73 | 1 | 1.0%
72 | 1 | 1.0%
71 | 1 | 1.0%
70 | 1 | 1.0%
69 | 1 | 1.0%
68 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 1.0%
2 | 1 | 1.0%
3 | 1 | 1.0%
4 | 1 | 1.0%
5 | 1 | 1.0%
6 | 1 | 1.0%
7 | 1 | 1.0%
8 | 1 | 1.0%
9 | 1 | 1.0%
10 | 1 | 1.0%

Maximum 10 values
Value | Count | Frequency (%)
100 | 1 | 1.0%
99 | 1 | 1.0%
98 | 1 | 1.0%
97 | 1 | 1.0%
96 | 1 | 1.0%
95 | 1 | 1.0%
94 | 1 | 1.0%
93 | 1 | 1.0%
92 | 1 | 1.0%
91 | 1 | 1.0%

연월일
Date

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2020-10-01 00:00:00
Maximum: 2020-10-01 00:00:00
Histogram with fixed size bins (bins=1)

환경플랫폼 하위 도메인명
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
물환경 | 100

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물환경
2nd row: 물환경
3rd row: 물환경
4th row: 물환경
5th row: 물환경

Common Values

Value | Count | Frequency (%)
물환경 | 100 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
물환경 | 100 | 100.0%

도메인 하위 카테고리명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
물재난 | 100

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물재난
2nd row: 물재난
3rd row: 물재난
4th row: 물재난
5th row: 물재난

Common Values

Value | Count | Frequency (%)
물재난 | 100 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
물재난 | 100 | 100.0%

SNS 채널명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
news | 100

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: news
2nd row: news
3rd row: news
4th row: news
5th row: news

Common Values

Value | Count | Frequency (%)
news | 100 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
news | 100 | 100.0%

단어속성명
Categorical

Distinct: 11
Distinct (%): 11.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
라이프 | 26
속성 | 25
장소 | 15
기타 | 13
인물 | 7
Other values (6) | 14

Length

Max length: 6
Median length: 2
Mean length: 2.37
Min length: 2

Unique

Unique: 3
Unique (%): 3.0%

Sample

1st row: 속성
2nd row: 상품
3rd row: 상품
4th row: 상품
5th row: 라이프

Common Values

Value | Count | Frequency (%)
라이프 | 26 | 26.0%
속성 | 25 | 25.0%
장소 | 15 | 15.0%
기타 | 13 | 13.0%
인물 | 7 | 7.0%
상품 | 5 | 5.0%
단체 | 4 | 4.0%
엔터테인먼트 | 2 | 2.0%
사회이슈 | 1 | 1.0%
브랜드 | 1 | 1.0%
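
The frequency table and the "Other values" bucket can be rebuilt from the counts listed under Common Values. This is a minimal sketch with a hypothetical helper, `frequency_table`. It covers only the ten listed categories: the report states 11 distinct values, and the eleventh is not named in the table, so it is left out here.

```python
from collections import Counter

# Counts for the ten categories listed under "Common Values".
listed = Counter({
    "라이프": 26, "속성": 25, "장소": 15, "기타": 13, "인물": 7,
    "상품": 5, "단체": 4, "엔터테인먼트": 2, "사회이슈": 1, "브랜드": 1,
})
n_observations = 100


def frequency_table(counts, total):
    """Return (value, count, percent) rows sorted by descending count."""
    return [(v, c, 100.0 * c / total) for v, c in counts.most_common()]


for value, count, pct in frequency_table(listed, n_observations):
    print(f"{value}\t{count}\t{pct:.1f}%")

# Rows that belong to the one category missing from the table.
unlisted_rows = n_observations - sum(listed.values())

# The overview shows the top five categories and folds the rest
# into a single "Other values" row.
top5_rows = sum(c for _, c in listed.most_common(5))
other_rows = n_observations - top5_rows

print(unlisted_rows, other_rows)  # 1 14
```

The single unlisted row is consistent with the reported 11 distinct values, and the 14 remaining rows match the "Other values (6)" entry in the overview.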

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
라이프 | 26 | 26.0%
속성 | 25 | 25.0%
장소 | 15 | 15.0%
기타 | 13 | 13.0%
인물 | 7 | 7.0%
상품 | 5 | 5.0%
단체 | 4 | 4.0%
엔터테인먼트 | 2 | 2.0%
사회이슈 | 1 | 1.0%
브랜드 | 1 | 1.0%

연관어명
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 6
Median length: 2
Mean length: 2.57
Min length: 2

Characters and Unicode

Total characters: 257
Distinct characters: 117
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
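
As an illustration of these character properties, the sketch below inspects the first five values of 연관어명 (taken from the Sample rows of this variable) with Python's standard `unicodedata` module. It is not how the profiler derives its figures, and because it covers only five words the counts will not match the totals above. The script check relies on character names, since the standard library has no direct script lookup.

```python
import unicodedata

# First five values of 연관어명, copied from this variable's Sample rows.
words = ["가경", "가공품", "가구", "가전제품", "가정"]
chars = [ch for word in words for ch in word]

# General category of each code point; "Lo" means "Letter, other".
categories = {unicodedata.category(ch) for ch in chars}

# Names of precomposed Hangul syllables start with "HANGUL SYLLABLE".
all_hangul = all(
    unicodedata.name(ch).startswith("HANGUL SYLLABLE") for ch in chars
)

# The Hangul Syllables block spans U+AC00 to U+D7AF.
in_block = all(0xAC00 <= ord(ch) <= 0xD7AF for ch in chars)

print(len(chars), len(set(chars)))  # total and distinct characters
print(categories, all_hangul, in_block)
```

For these five words every character falls in category "Lo" and in the Hangul Syllables block, which is in line with the single category, script and block reported for the full column.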

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: 가경
2nd row: 가공품
3rd row: 가구
4th row: 가전제품
5th row: 가정

Value | Count | Frequency (%)
가경 | 1 | 1.0%
국비 | 1 | 1.0%
기울이다 | 1 | 1.0%
기부 | 1 | 1.0%
기본 | 1 | 1.0%
기반 | 1 | 1.0%
기념일 | 1 | 1.0%
기관지 | 1 | 1.0%
금액 | 1 | 1.0%
군수 | 1 | 1.0%
Other values (90) | 90 | 90.0%
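
Most occurring characters

Value | Count | Frequency (%)
(not captured) | 12 | 4.7%
(not captured) | 10 | 3.9%
(not captured) | 9 | 3.5%
(not captured) | 9 | 3.5%
(not captured) | 9 | 3.5%
(not captured) | 9 | 3.5%
(not captured) | 8 | 3.1%
(not captured) | 7 | 2.7%
(not captured) | 7 | 2.7%
(not captured) | 5 | 1.9%
Other values (107) | 172 | 66.9%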

Most occurring categories

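Value | Count | Frequency (%)
Other Letter | 257 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not captured) | 12 | 4.7%
(not captured) | 10 | 3.9%
(not captured) | 9 | 3.5%
(not captured) | 9 | 3.5%
(not captured) | 9 | 3.5%
(not captured) | 9 | 3.5%
(not captured) | 8 | 3.1%
(not captured) | 7 | 2.7%
(not captured) | 7 | 2.7%
(not captured) | 5 | 1.9%
Other values (107) | 172 | 66.9%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 257 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not captured) | 12 | 4.7%
(not captured) | 10 | 3.9%
(not captured) | 9 | 3.5%
(not captured) | 9 | 3.5%
(not captured) | 9 | 3.5%
(not captured) | 9 | 3.5%
(not captured) | 8 | 3.1%
(not captured) | 7 | 2.7%
(not captured) | 7 | 2.7%
(not captured) | 5 | 1.9%
Other values (107) | 172 | 66.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 257 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not captured) | 12 | 4.7%
(not captured) | 10 | 3.9%
(not captured) | 9 | 3.5%
(not captured) | 9 | 3.5%
(not captured) | 9 | 3.5%
(not captured) | 9 | 3.5%
(not captured) | 8 | 3.1%
(not captured) | 7 | 2.7%
(not captured) | 7 | 2.7%
(not captured) | 5 | 1.9%
Other values (107) | 172 | 66.9%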

일간연관어언급량
Real number (ℝ)

HIGH CORRELATION 

Distinct: 15
Distinct (%): 15.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.25
Minimum: 1
Maximum: 25
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 3
Q3: 5
95-th percentile: 12.05
Maximum: 25
Range: 24
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 3.8908725
Coefficient of variation (CV): 0.91549941
Kurtosis: 8.504427
Mean: 4.25
Median Absolute Deviation (MAD): 1
Skewness: 2.4920092
Sum: 425
Variance: 15.138889
Monotonicity: Not monotonic
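
The frequency tables in this section list the ten smallest and ten largest values with their counts, which together cover all 15 distinct values of 일간연관어언급량. That is enough to rebuild the full column and recheck several of the statistics above. The sketch below does so with the standard library, assuming linear-interpolation quantiles and a sample (n - 1) variance; variable names are illustrative.

```python
import statistics

# Value counts combined from the frequency tables in this section
# (values 1-10 plus the five largest values 12, 13, 14, 17 and 25).
value_counts = {
    1: 16, 2: 30, 3: 6, 4: 20, 5: 5, 6: 6, 7: 3, 8: 1, 9: 4, 10: 3,
    12: 1, 13: 1, 14: 2, 17: 1, 25: 1,
}

# Expand the counts back into one sorted observation per row.
values = sorted(v for v, c in value_counts.items() for _ in range(c))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)

# Quartiles and the 95th percentile with linear interpolation.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
p95 = statistics.quantiles(values, n=20, method="inclusive")[-1]

# Median absolute deviation around the median.
mad = statistics.median([abs(v - median) for v in values])

print(len(values), len(value_counts), total, mean)
print(round(variance, 6), q1, median, q3, p95, mad)
```

The rebuilt column has 100 rows and 15 distinct values, and its sum, mean, variance, quartiles, 95th percentile and MAD agree with the reported figures.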

Histogram with fixed size bins (bins=15)

Common values
Value | Count | Frequency (%)
2 | 30 | 30.0%
4 | 20 | 20.0%
1 | 16 | 16.0%
3 | 6 | 6.0%
6 | 6 | 6.0%
5 | 5 | 5.0%
9 | 4 | 4.0%
7 | 3 | 3.0%
10 | 3 | 3.0%
14 | 2 | 2.0%
Other values (5) | 5 | 5.0%

Minimum 10 values
Value | Count | Frequency (%)
1 | 16 | 16.0%
2 | 30 | 30.0%
3 | 6 | 6.0%
4 | 20 | 20.0%
5 | 5 | 5.0%
6 | 6 | 6.0%
7 | 3 | 3.0%
8 | 1 | 1.0%
9 | 4 | 4.0%
10 | 3 | 3.0%

Maximum 10 values
Value | Count | Frequency (%)
25 | 1 | 1.0%
17 | 1 | 1.0%
14 | 2 | 2.0%
13 | 1 | 1.0%
12 | 1 | 1.0%
10 | 3 | 3.0%
9 | 4 | 4.0%
8 | 1 | 1.0%
7 | 3 | 3.0%
6 | 6 | 6.0%

일간연관어단어량
Real number (ℝ)

HIGH CORRELATION 

Distinct: 22
Distinct (%): 22.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.17
Minimum: 1
Maximum: 61
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 4
Q3: 8
95-th percentile: 22.35
Maximum: 61
Range: 60
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 9.1993138
Coefficient of variation (CV): 1.2830284
Kurtosis: 14.087238
Mean: 7.17
Median Absolute Deviation (MAD): 2
Skewness: 3.371997
Sum: 717
Variance: 84.627374
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=22)

Common values
Value | Count | Frequency (%)
4 | 21 | 21.0%
2 | 20 | 20.0%
1 | 10 | 10.0%
3 | 7 | 7.0%
6 | 7 | 7.0%
5 | 6 | 6.0%
9 | 5 | 5.0%
10 | 4 | 4.0%
8 | 3 | 3.0%
15 | 2 | 2.0%
Other values (12) | 15 | 15.0%

Minimum 10 values
Value | Count | Frequency (%)
1 | 10 | 10.0%
2 | 20 | 20.0%
3 | 7 | 7.0%
4 | 21 | 21.0%
5 | 6 | 6.0%
6 | 7 | 7.0%
7 | 2 | 2.0%
8 | 3 | 3.0%
9 | 5 | 5.0%
10 | 4 | 4.0%

Maximum 10 values
Value | Count | Frequency (%)
61 | 1 | 1.0%
43 | 1 | 1.0%
37 | 1 | 1.0%
29 | 2 | 2.0%
22 | 1 | 1.0%
21 | 1 | 1.0%
19 | 1 | 1.0%
18 | 1 | 1.0%
16 | 2 | 2.0%
15 | 2 | 2.0%

Interactions

[Figure: nine pairwise interaction plots]

Correlations

Variable | 일간연관어연번 | 단어속성명 | 연관어명 | 일간연관어언급량 | 일간연관어단어량
일간연관어연번 | 1.000 | 0.535 | 1.000 | 0.152 | 0.000
단어속성명 | 0.535 | 1.000 | 1.000 | 0.000 | 0.000
연관어명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
일간연관어언급량 | 0.152 | 0.000 | 1.000 | 1.000 | 0.970
일간연관어단어량 | 0.000 | 0.000 | 1.000 | 0.970 | 1.000

Variable | 일간연관어연번 | 일간연관어언급량 | 일간연관어단어량 | 단어속성명
일간연관어연번 | 1.000 | -0.009 | -0.035 | 0.258
일간연관어언급량 | -0.009 | 1.000 | 0.819 | 0.000
일간연관어단어량 | -0.035 | 0.819 | 1.000 | 0.000
단어속성명 | 0.258 | 0.000 | 0.000 | 1.000
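
To illustrate how a coefficient between 일간연관어언급량 and 일간연관어단어량 can be computed, the sketch below calculates Pearson and Spearman correlations with the standard library. It uses only the 20 rows shown in the Sample section of this report, so its results will differ from the matrices above, which are based on all 100 rows. The report does not state which method each matrix uses, so the choice of Pearson and Spearman here is an assumption, and the helper names are hypothetical.

```python
import math

# (일간연관어언급량, 일간연관어단어량) pairs from the 20 rows shown in the
# Sample section (first 10 and last 10 rows of the dataset).
pairs = [
    (3, 9), (4, 4), (1, 3), (1, 1), (2, 6), (3, 7), (4, 4), (13, 21),
    (2, 10), (3, 4), (6, 6), (1, 3), (1, 3), (2, 2), (6, 6), (2, 2),
    (2, 6), (14, 16), (4, 4), (4, 12),
]
xs = [p[0] for p in pairs]
ys = [p[1] for p in pairs]


def pearson(a, b):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(a), average_ranks(b))


r_pearson = pearson(xs, ys)
r_spearman = spearman(xs, ys)
print(f"Pearson:  {r_pearson:.3f}")
print(f"Spearman: {r_spearman:.3f}")
```

Both coefficients come out positive on this subset, which points in the same direction as the high-correlation alert for these two variables.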

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

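First rows
Index | 일간연관어연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 단어속성명 | 연관어명 | 일간연관어언급량 | 일간연관어단어량
0 | 1 | 2020-10-01 | 물환경 | 물재난 | news | 속성 | 가경 | 3 | 9
1 | 2 | 2020-10-01 | 물환경 | 물재난 | news | 상품 | 가공품 | 4 | 4
2 | 3 | 2020-10-01 | 물환경 | 물재난 | news | 상품 | 가구 | 1 | 3
3 | 4 | 2020-10-01 | 물환경 | 물재난 | news | 상품 | 가전제품 | 1 | 1
4 | 5 | 2020-10-01 | 물환경 | 물재난 | news | 라이프 | 가정 | 2 | 6
5 | 6 | 2020-10-01 | 물환경 | 물재난 | news | 라이프 | 가족 | 3 | 7
6 | 7 | 2020-10-01 | 물환경 | 물재난 | news | 속성 | 가중 | 4 | 4
7 | 8 | 2020-10-01 | 물환경 | 물재난 | news | 속성 | 가축 | 13 | 21
8 | 9 | 2020-10-01 | 물환경 | 물재난 | news | 장소 | 가평 | 2 | 10
9 | 10 | 2020-10-01 | 물환경 | 물재난 | news | 장소 | 가평군 | 3 | 4

Last rows
Index | 일간연관어연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 단어속성명 | 연관어명 | 일간연관어언급량 | 일간연관어단어량
90 | 91 | 2020-10-01 | 물환경 | 물재난 | news | 장소 | 남도 | 6 | 6
91 | 92 | 2020-10-01 | 물환경 | 물재난 | news | 장소 | 남원 | 1 | 3
92 | 93 | 2020-10-01 | 물환경 | 물재난 | news | 장소 | 남원시 | 1 | 3
93 | 94 | 2020-10-01 | 물환경 | 물재난 | news | 기타 | 내려가다 | 2 | 2
94 | 95 | 2020-10-01 | 물환경 | 물재난 | news | 기타 | 내천 | 6 | 6
95 | 96 | 2020-10-01 | 물환경 | 물재난 | news | 속성 | 노동 | 2 | 2
96 | 97 | 2020-10-01 | 물환경 | 물재난 | news | 단체 | 노동당 | 2 | 6
97 | 98 | 2020-10-01 | 물환경 | 물재난 | news | 속성 | 노력 | 14 | 16
98 | 99 | 2020-10-01 | 물환경 | 물재난 | news | 속성 | 노면 | 4 | 4
99 | 100 | 2020-10-01 | 물환경 | 물재난 | news | 속성 | 녹색 | 4 | 12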