Overview

Dataset statistics

Number of variables: 9
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.5 KiB
Average record size in memory: 76.3 B

Variable types

Numeric: 3
DateTime: 1
Categorical: 4
Text: 1

Dataset

Description: Sample data
Author: 성균관대학교 산학협력단 (the industry-academic cooperation foundation of Sungkyunkwan University)
URL: https://www.bigdata-environment.kr/user/data_market/detail.do?id=aafe9230-e841-11ea-a0ba-57da34c93da2

Alerts

연월일 has constant value "2020-10-01" (Constant)
환경플랫폼 하위 도메인명 has constant value "물환경" (Constant)
SNS 채널명 has constant value "news" (Constant)
일간감성어연번 is highly overall correlated with 도메인 하위 카테고리명 (High correlation)
일간감성어언급량 is highly overall correlated with 일간감성어단어량 (High correlation)
일간감성어단어량 is highly overall correlated with 일간감성어언급량 (High correlation)
도메인 하위 카테고리명 is highly overall correlated with 일간감성어연번 (High correlation)
일간감성어연번 has unique values (Unique)

Reproduction

Analysis started: 2023-12-10 11:57:49.843943
Analysis finished: 2023-12-10 11:57:51.845577
Duration: 2 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
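
A report of this kind can be regenerated roughly as follows. This is a minimal sketch, not the configuration actually used (that is in config.json): the CSV path, the report title, and the function name `build_report` are assumptions, and the metadata dictionary simply repeats the Dataset section above.

```python
# Sketch: regenerating a profile like this one with ydata-profiling.
# Assumptions (not taken from the report): the CSV path, the report title,
# and the helper name build_report.

DATASET_META = {
    "description": "샘플 데이터",
    "author": "성균관대학교 산학협력단",
    "url": "https://www.bigdata-environment.kr/user/data_market/detail.do"
           "?id=aafe9230-e841-11ea-a0ba-57da34c93da2",
}


def build_report(csv_path, output_html="report.html"):
    """Profile the CSV at csv_path and write an HTML report."""
    # Imported lazily so this module loads even without the third-party packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Parse the date column so it is profiled as DateTime rather than text.
    df = pd.read_csv(csv_path, parse_dates=["연월일"])
    profile = ProfileReport(df, title="Profiling Report", dataset=DATASET_META)
    profile.to_file(output_html)
    return profile
```

Calling `build_report("sample.csv")` (a hypothetical file name) would write `report.html` next to the script.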

Variables

일간감성어연번 (daily sentiment word serial number)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.5
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5.95
Q1: 25.75
Median: 50.5
Q3: 75.25
95-th percentile: 95.05
Maximum: 100
Range: 99
Interquartile range (IQR): 49.5

Descriptive statistics

Standard deviation: 29.011492
Coefficient of variation (CV): 0.57448499
Kurtosis: -1.2
Mean: 50.5
Median Absolute Deviation (MAD): 25
Skewness: 0
Sum: 5050
Variance: 841.66667
Monotonicity: Strictly increasing
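
The quantile and descriptive statistics above can be reproduced by hand. The sketch below assumes the column holds exactly the integers 1 to 100, which is consistent with the reported distinct count, minimum, maximum, sum, and strict monotonicity but is not stated outright. It also assumes percentiles use linear interpolation between order statistics. The names `percentile` and `summary` are illustrative.

```python
import math
import statistics


def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 1]) on pre-sorted data."""
    pos = (len(sorted_values) - 1) * q
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


values = list(range(1, 101))  # assumption: the column is exactly 1..100

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = math.sqrt(variance)
median = statistics.median(values)
# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)

summary = {
    "p5": percentile(values, 0.05),
    "q1": percentile(values, 0.25),
    "q3": percentile(values, 0.75),
    "p95": percentile(values, 0.95),
    "iqr": percentile(values, 0.75) - percentile(values, 0.25),
    "cv": std / mean,
    "mad": mad,
}
```

Under these assumptions the results agree with the report: 5.95, 25.75, 75.25, and 95.05 for the percentiles, 49.5 for the IQR, 25 for the MAD, and a standard deviation of about 29.0115.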
[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value | Count | Frequency (%)
1 | 1 | 1.0%
65 | 1 | 1.0%
75 | 1 | 1.0%
74 | 1 | 1.0%
73 | 1 | 1.0%
72 | 1 | 1.0%
71 | 1 | 1.0%
70 | 1 | 1.0%
69 | 1 | 1.0%
68 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 1.0%
2 | 1 | 1.0%
3 | 1 | 1.0%
4 | 1 | 1.0%
5 | 1 | 1.0%
6 | 1 | 1.0%
7 | 1 | 1.0%
8 | 1 | 1.0%
9 | 1 | 1.0%
10 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
100 | 1 | 1.0%
99 | 1 | 1.0%
98 | 1 | 1.0%
97 | 1 | 1.0%
96 | 1 | 1.0%
95 | 1 | 1.0%
94 | 1 | 1.0%
93 | 1 | 1.0%
92 | 1 | 1.0%
91 | 1 | 1.0%

연월일 (date: year, month, day)
Date

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2020-10-01 00:00:00
Maximum: 2020-10-01 00:00:00
[Figure: Histogram with fixed size bins (bins=1)]

환경플랫폼 하위 도메인명 (environment platform subdomain name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
물환경 (water environment): 100

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물환경
2nd row: 물환경
3rd row: 물환경
4th row: 물환경
5th row: 물환경

Common Values

Value | Count | Frequency (%)
물환경 | 100 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

도메인 하위 카테고리명 (domain subcategory name)
Categorical

HIGH CORRELATION

Distinct: 6
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 3
Median length: 2
Mean length: 2.47
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물재난
2nd row: 물재난
3rd row: 물재난
4th row: 물재난
5th row: 물재난

Common Values

Value | Count | Frequency (%)
하천 (rivers) | 44 | 44.0%
물재난 (water disasters) | 24 | 24.0%
지하수 (groundwater) | 9 | 9.0%
호소 (lakes and reservoirs) | 9 | 9.0%
상수도 (water supply) | 7 | 7.0%
하수도 (sewerage) | 7 | 7.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]
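
The length statistics for this variable follow directly from the category counts. The sketch below assumes that length means the number of Unicode code points after NFC normalization; the names `counts` and `length_stats` are illustrative.

```python
import statistics
import unicodedata

# Category counts as listed under Common Values.
counts = {"하천": 44, "물재난": 24, "지하수": 9, "호소": 9, "상수도": 7, "하수도": 7}

# One length per observation; NFC keeps each Hangul syllable as a single code point.
lengths = [
    len(unicodedata.normalize("NFC", category))
    for category, n in counts.items()
    for _ in range(n)
]

length_stats = {
    "max": max(lengths),
    "median": statistics.median(lengths),
    "mean": statistics.mean(lengths),
    "min": min(lengths),
}
```

Of the 100 observations, 53 have a two-character category and 47 a three-character one, which gives the reported median of 2 and mean of 2.47.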

SNS 채널명 (SNS channel name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
news: 100

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: news
2nd row: news
3rd row: news
4th row: news
5th row: news

Common Values

Value | Count | Frequency (%)
news | 100 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

감성어명 (sentiment word)
Text

Distinct: 65
Distinct (%): 65.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 5
Median length: 2
Mean length: 2.46
Min length: 2

Characters and Unicode

Total characters: 246
Distinct characters: 106
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 45
Unique (%): 45.0%

Sample

1st row: 감사 (gratitude)
2nd row: 갖추다 (to be equipped)
3rd row: 걱정 (worry)
4th row: 고충 (hardship)
5th row: 기대 (expectation)

Value | Count | Frequency (%)
대상 (grand prize or target; ambiguous) | 6 | 6.0%
이루다 (to achieve) | 4 | 4.0%
우려 (concern) | 4 | 4.0%
우수 (excellence) | 4 | 4.0%
기대 (expectation) | 4 | 4.0%
안전 (safety) | 3 | 3.0%
적극 (proactive) | 3 | 3.0%
예방 (prevention) | 3 | 3.0%
정상 (normal) | 2 | 2.0%
감사 (gratitude) | 2 | 2.0%
Other values (55) | 65 | 65.0%

Most occurring characters

Character | Count | Frequency (%)
(missing) | 24 | 9.8%
(missing) | 10 | 4.1%
(missing) | 8 | 3.3%
(missing) | 8 | 3.3%
(missing) | 8 | 3.3%
(missing) | 7 | 2.8%
(missing) | 7 | 2.8%
(missing) | 6 | 2.4%
(missing) | 6 | 2.4%
(missing) | 5 | 2.0%
Other values (96) | 157 | 63.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 246 | 100.0%

Most frequent character per category

Other Letter: same counts as under Most occurring characters.

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 246 | 100.0%

Most frequent character per script

Hangul: same counts as under Most occurring characters.

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 246 | 100.0%

Most frequent character per block

Hangul: same counts as under Most occurring characters.

감성타입 (sentiment type)
Categorical

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
p: 77
n: 23

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: p
2nd row: p
3rd row: n
4th row: n
5th row: p

Common Values

Value | Count | Frequency (%)
p | 77 | 77.0%
n | 23 | 23.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

일간감성어언급량 (daily sentiment word mention count)
Real number (ℝ)

HIGH CORRELATION

Distinct: 10
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.94
Minimum: 1
Maximum: 16
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 2
Q3: 3
95-th percentile: 9
Maximum: 16
Range: 15
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 2.4280349
Coefficient of variation (CV): 0.82586222
Kurtosis: 9.915068
Mean: 2.94
Median Absolute Deviation (MAD): 1
Skewness: 2.7672794
Sum: 294
Variance: 5.8953535
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=10)]
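
Common Values

Value | Count | Frequency (%)
2 | 35 | 35.0%
1 | 23 | 23.0%
3 | 19 | 19.0%
4 | 9 | 9.0%
5 | 5 | 5.0%
9 | 4 | 4.0%
6 | 2 | 2.0%
16 | 1 | 1.0%
12 | 1 | 1.0%
7 | 1 | 1.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 23 | 23.0%
2 | 35 | 35.0%
3 | 19 | 19.0%
4 | 9 | 9.0%
5 | 5 | 5.0%
6 | 2 | 2.0%
7 | 1 | 1.0%
9 | 4 | 4.0%
12 | 1 | 1.0%
16 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
16 | 1 | 1.0%
12 | 1 | 1.0%
9 | 4 | 4.0%
7 | 1 | 1.0%
6 | 2 | 2.0%
5 | 5 | 5.0%
4 | 9 | 9.0%
3 | 19 | 19.0%
2 | 35 | 35.0%
1 | 23 | 23.0%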
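
Because this variable has only 10 distinct values, the frequency table determines the whole distribution, and the reported skewness and kurtosis can be checked from it. The sketch below assumes the report uses bias-adjusted sample estimators (the adjusted Fisher-Pearson skewness and the matching excess kurtosis), which is what the figures are consistent with. The helper `central_moment` is an illustrative name.

```python
import math

# Value -> count, taken from the frequency table for 일간감성어언급량.
counts = {1: 23, 2: 35, 3: 19, 4: 9, 5: 5, 6: 2, 7: 1, 9: 4, 12: 1, 16: 1}

n = sum(counts.values())
mean = sum(v * c for v, c in counts.items()) / n


def central_moment(k):
    """k-th population central moment of the weighted sample."""
    return sum(c * (v - mean) ** k for v, c in counts.items()) / n


m2, m3, m4 = central_moment(2), central_moment(3), central_moment(4)

# Bias-adjusted estimators (assumed to match what the report computes).
skewness = math.sqrt(n * (n - 1)) / (n - 2) * m3 / m2 ** 1.5
excess_kurtosis = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * (m4 / m2 ** 2 - 3) + 6)
```

With these formulas the skewness comes out at about 2.7673 and the excess kurtosis at about 9.9151, in line with the values above. Both indicate a strongly right-skewed, heavy-tailed count distribution.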

일간감성어단어량 (daily sentiment word volume)
Real number (ℝ)

HIGH CORRELATION

Distinct: 16
Distinct (%): 16.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.06
Minimum: 1
Maximum: 33
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 2
Q3: 4
95-th percentile: 12.05
Maximum: 33
Range: 32
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 4.5986823
Coefficient of variation (CV): 1.1326804
Kurtosis: 16.559535
Mean: 4.06
Median Absolute Deviation (MAD): 1
Skewness: 3.495033
Sum: 406
Variance: 21.147879
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=16)]

Common Values

Value | Count | Frequency (%)
2 | 32 | 32.0%
1 | 19 | 19.0%
3 | 17 | 17.0%
4 | 9 | 9.0%
6 | 6 | 6.0%
9 | 4 | 4.0%
5 | 3 | 3.0%
12 | 2 | 2.0%
11 | 1 | 1.0%
8 | 1 | 1.0%
Other values (6) | 6 | 6.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 19 | 19.0%
2 | 32 | 32.0%
3 | 17 | 17.0%
4 | 9 | 9.0%
5 | 3 | 3.0%
6 | 6 | 6.0%
8 | 1 | 1.0%
9 | 4 | 4.0%
10 | 1 | 1.0%
11 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
33 | 1 | 1.0%
20 | 1 | 1.0%
16 | 1 | 1.0%
14 | 1 | 1.0%
13 | 1 | 1.0%
12 | 2 | 2.0%
11 | 1 | 1.0%
10 | 1 | 1.0%
9 | 4 | 4.0%
8 | 1 | 1.0%
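
The common-values table for this variable hides six values behind "Other values (6)", but the ten smallest and ten largest values together cover all 16 distinct values. A minimal sketch of merging the two extreme-value tables and checking the result against the reported statistics follows. The variable names are illustrative, and linear interpolation is assumed for the percentile.

```python
import math

# The two extreme-value tables for 일간감성어단어량, as value -> count.
minimum_10 = {1: 19, 2: 32, 3: 17, 4: 9, 5: 3, 6: 6, 8: 1, 9: 4, 10: 1, 11: 1}
maximum_10 = {33: 1, 20: 1, 16: 1, 14: 1, 13: 1, 12: 2, 11: 1, 10: 1, 9: 4, 8: 1}

# Values present in both tables (8, 9, 10, 11) carry identical counts,
# so a plain dictionary merge recovers the full distribution.
counts = {**minimum_10, **maximum_10}

values = sorted(v for v, c in counts.items() for _ in range(c))
n = len(values)
total = sum(values)
mean = total / n
variance = sum((v - mean) ** 2 for v in values) / (n - 1)  # sample variance

# 95th percentile by linear interpolation between neighbouring order statistics.
pos = (n - 1) * 0.95
lo = math.floor(pos)
p95 = values[lo] + (pos - lo) * (values[lo + 1] - values[lo])
```

The merged distribution has 16 distinct values over 100 observations, sums to 406, and gives a sample variance of about 21.1479 and a 95th percentile of 12.05, matching the report.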

Interactions

[Figures: interaction plots (9 panels)]

Correlations

[Figure: correlation heatmap]
Variable | 일간감성어연번 | 도메인 하위 카테고리명 | 감성어명 | 감성타입 | 일간감성어언급량 | 일간감성어단어량
일간감성어연번 | 1.000 | 0.949 | 0.000 | 0.000 | 0.421 | 0.000
도메인 하위 카테고리명 | 0.949 | 1.000 | 0.000 | 0.000 | 0.480 | 0.120
감성어명 | 0.000 | 0.000 | 1.000 | 1.000 | 0.000 | 0.000
감성타입 | 0.000 | 0.000 | 1.000 | 1.000 | 0.000 | 0.000
일간감성어언급량 | 0.421 | 0.480 | 0.000 | 0.000 | 1.000 | 0.858
일간감성어단어량 | 0.000 | 0.120 | 0.000 | 0.000 | 0.858 | 1.000

[Figure: correlation heatmap]
Variable | 감성타입 | 도메인 하위 카테고리명
감성타입 | 1.000 | 0.000
도메인 하위 카테고리명 | 0.000 | 1.000

[Figure: correlation heatmap]
Variable | 일간감성어연번 | 일간감성어언급량 | 일간감성어단어량 | 도메인 하위 카테고리명 | 감성타입
일간감성어연번 | 1.000 | -0.261 | -0.264 | 0.847 | 0.000
일간감성어언급량 | -0.261 | 1.000 | 0.888 | 0.238 | 0.000
일간감성어단어량 | -0.264 | 0.888 | 1.000 | 0.067 | 0.000
도메인 하위 카테고리명 | 0.847 | 0.238 | 0.067 | 1.000 | 0.000
감성타입 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000
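
To relate the first (six-variable) correlation matrix to the High correlation alerts, the pairs above a cutoff can be listed directly. This is only a sketch: the cutoff of 0.5 is an assumption, not a value stated in this report, and `THRESHOLD` and `high_pairs` are illustrative names.

```python
from itertools import combinations

names = [
    "일간감성어연번", "도메인 하위 카테고리명", "감성어명",
    "감성타입", "일간감성어언급량", "일간감성어단어량",
]

# First correlation matrix from the report, rows and columns in the order of `names`.
matrix = [
    [1.000, 0.949, 0.000, 0.000, 0.421, 0.000],
    [0.949, 1.000, 0.000, 0.000, 0.480, 0.120],
    [0.000, 0.000, 1.000, 1.000, 0.000, 0.000],
    [0.000, 0.000, 1.000, 1.000, 0.000, 0.000],
    [0.421, 0.480, 0.000, 0.000, 1.000, 0.858],
    [0.000, 0.120, 0.000, 0.000, 0.858, 1.000],
]

THRESHOLD = 0.5  # assumption: cutoff above which a pair counts as highly correlated

# Each unordered pair once, keeping those at or above the cutoff.
high_pairs = [
    (names[i], names[j], matrix[i][j])
    for i, j in combinations(range(len(names)), 2)
    if abs(matrix[i][j]) >= THRESHOLD
]
```

With this cutoff three pairs qualify. Two of them (일간감성어연번 with 도메인 하위 카테고리명, and 일간감성어언급량 with 일간감성어단어량) are the ones listed under Alerts; the 감성어명 and 감성타입 pair has a coefficient of 1.000 in the matrix but does not appear among the alerts.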

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

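First rows

Index | 일간감성어연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 감성어명 | 감성타입 | 일간감성어언급량 | 일간감성어단어량
0 | 1 | 2020-10-01 | 물환경 | 물재난 | news | 감사 | p | 1 | 1
1 | 2 | 2020-10-01 | 물환경 | 물재난 | news | 갖추다 | p | 2 | 2
2 | 3 | 2020-10-01 | 물환경 | 물재난 | news | 걱정 | n | 2 | 2
3 | 4 | 2020-10-01 | 물환경 | 물재난 | news | 고충 | n | 9 | 13
4 | 5 | 2020-10-01 | 물환경 | 물재난 | news | 기대 | p | 9 | 9
5 | 6 | 2020-10-01 | 물환경 | 물재난 | news | 기쁘다 | p | 2 | 2
6 | 7 | 2020-10-01 | 물환경 | 물재난 | news | 대상 | p | 9 | 9
7 | 8 | 2020-10-01 | 물환경 | 물재난 | news | 도움 | p | 3 | 6
8 | 9 | 2020-10-01 | 물환경 | 물재난 | news | 불량 | n | 3 | 3
9 | 10 | 2020-10-01 | 물환경 | 물재난 | news | 쏟아지다 | p | 2 | 2

Last rows

Index | 일간감성어연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 감성어명 | 감성타입 | 일간감성어언급량 | 일간감성어단어량
90 | 91 | 2020-10-01 | 물환경 | 하천 | news | 훼손 | n | 1 | 2
91 | 92 | 2020-10-01 | 물환경 | 호소 | news | 가깝다 | p | 3 | 3
92 | 93 | 2020-10-01 | 물환경 | 호소 | news | 고운 | p | 2 | 2
93 | 94 | 2020-10-01 | 물환경 | 호소 | news | 긍정 | p | 2 | 2
94 | 95 | 2020-10-01 | 물환경 | 호소 | news | 기대 | p | 2 | 2
95 | 96 | 2020-10-01 | 물환경 | 호소 | news | 기침 | n | 3 | 3
96 | 97 | 2020-10-01 | 물환경 | 호소 | news | 깨닫다 | p | 3 | 3
97 | 98 | 2020-10-01 | 물환경 | 호소 | news | 내놓다 | n | 2 | 2
98 | 99 | 2020-10-01 | 물환경 | 호소 | news | 대상 | p | 2 | 2
99 | 100 | 2020-10-01 | 물환경 | 호소 | news | 멋지다 | p | 2 | 2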