Overview

Dataset statistics

Number of variables: 9
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.5 KiB
Average record size in memory: 76.3 B

Variable types

Numeric: 3
Categorical: 5
Text: 1

Dataset

Description: Sample data
Author: Sungkyunkwan University Industry-Academic Cooperation Foundation (성균관대학교 산학협력단)
URL: https://www.bigdata-environment.kr/user/data_market/detail.do?id=7a4a32b0-e842-11ea-835f-5b142183dc74

Alerts

연월일 has constant value "2020-02-29" (Constant)
환경플랫폼 하위 도메인명 has constant value "생활환경" (Constant)
도메인 하위 카테고리명 has constant value "대기" (Constant)
SNS 채널명 has constant value "paper" (Constant)
일간연관어언급량 is highly overall correlated with 일간연관어단어량 (High correlation)
일간연관어단어량 is highly overall correlated with 일간연관어언급량 (High correlation)
일간연관어연번 has unique values (Unique)
연관어명 has unique values (Unique)
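
The Constant and Unique alerts above follow from the distinct counts reported in the per-variable sections. A minimal sketch of that rule, assuming a column is flagged Constant when it has exactly one distinct value and Unique when every row has a different value. The `classify` helper and the rule itself are illustrative assumptions, not ydata-profiling's actual implementation:

```python
# Distinct counts copied from the per-variable sections of this report.
N_ROWS = 100
DISTINCT_COUNTS = {
    "일간연관어연번": 100,
    "연월일": 1,
    "환경플랫폼 하위 도메인명": 1,
    "도메인 하위 카테고리명": 1,
    "SNS 채널명": 1,
    "단어속성명": 9,
    "연관어명": 100,
    "일간연관어언급량": 7,
    "일간연관어단어량": 20,
}


def classify(distinct, n_rows):
    """Hypothetical helper: return an alert label, or None if no alert applies."""
    if distinct == 1:
        return "Constant"
    if distinct == n_rows:
        return "Unique"
    return None


# Keep only the columns that trigger an alert.
flagged = {}
for name, distinct in DISTINCT_COUNTS.items():
    label = classify(distinct, N_ROWS)
    if label is not None:
        flagged[name] = label

for name, label in flagged.items():
    print(f"{name}: {label}")
```

Under these assumptions the sketch flags the same four constant columns and two unique columns that the alerts list; the high-correlation alerts need the correlation tables and are not covered here.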

Reproduction

Analysis started: 2024-04-18 07:00:06.169880
Analysis finished: 2024-04-18 07:00:07.456485
Duration: 1.29 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

일간연관어연번 (daily related-word serial number)
Real number (ℝ)

UNIQUE

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.5
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 5.95
Q1: 25.75
Median: 50.5
Q3: 75.25
95th percentile: 95.05
Maximum: 100
Range: 99
Interquartile range (IQR): 49.5

Descriptive statistics

Standard deviation: 29.011492
Coefficient of variation (CV): 0.57448499
Kurtosis: -1.2
Mean: 50.5
Median Absolute Deviation (MAD): 25
Skewness: 0
Sum: 5050
Variance: 841.66667
Monotonicity: Strictly increasing
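
Because this column is the run of integers 1 to 100, most of the statistics above can be checked by hand. A minimal sketch using only Python's standard library, assuming the sample (n - 1) variance and linear interpolation between the minimum and maximum for quantiles (the `inclusive` method of `statistics.quantiles`). These conventions are assumptions chosen because they match the reported figures; the report does not state how it computes them:

```python
import statistics

# The column holds the serial numbers 1..100, one per row.
values = [float(i) for i in range(1, 101)]

mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std = statistics.stdev(values)
cv = std / mean

# "inclusive" interpolates linearly between the smallest and largest value.
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
ventiles = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = ventiles[0], ventiles[-1]

# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)

# Strictly increasing: every value is smaller than its successor.
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(f"mean={mean} median={median} variance={variance:.5f} std={std:.6f}")
print(f"cv={cv:.8f} p5={p5:.2f} q1={q1} q3={q3} p95={p95:.2f} iqr={q3 - q1}")
print(f"mad={mad} strictly_increasing={strictly_increasing}")
```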
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
1 | 1 | 1.0%
65 | 1 | 1.0%
75 | 1 | 1.0%
74 | 1 | 1.0%
73 | 1 | 1.0%
72 | 1 | 1.0%
71 | 1 | 1.0%
70 | 1 | 1.0%
69 | 1 | 1.0%
68 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 1.0%
2 | 1 | 1.0%
3 | 1 | 1.0%
4 | 1 | 1.0%
5 | 1 | 1.0%
6 | 1 | 1.0%
7 | 1 | 1.0%
8 | 1 | 1.0%
9 | 1 | 1.0%
10 | 1 | 1.0%

Maximum 10 values
Value | Count | Frequency (%)
100 | 1 | 1.0%
99 | 1 | 1.0%
98 | 1 | 1.0%
97 | 1 | 1.0%
96 | 1 | 1.0%
95 | 1 | 1.0%
94 | 1 | 1.0%
93 | 1 | 1.0%
92 | 1 | 1.0%
91 | 1 | 1.0%

연월일 (date)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020-02-29
2nd row: 2020-02-29
3rd row: 2020-02-29
4th row: 2020-02-29
5th row: 2020-02-29

Common Values

Value | Count | Frequency (%)
2020-02-29 | 100 | 100.0%

Length

Histogram of lengths of the category

환경플랫폼 하위 도메인명 (environment platform subdomain name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 생활환경
2nd row: 생활환경
3rd row: 생활환경
4th row: 생활환경
5th row: 생활환경

Common Values

Value | Count | Frequency (%)
생활환경 | 100 | 100.0%

Length

Histogram of lengths of the category

도메인 하위 카테고리명 (domain subcategory name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 대기
2nd row: 대기
3rd row: 대기
4th row: 대기
5th row: 대기

Common Values

Value | Count | Frequency (%)
대기 | 100 | 100.0%

Length

Histogram of lengths of the category

SNS 채널명 (SNS channel name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: paper
2nd row: paper
3rd row: paper
4th row: paper
5th row: paper

Common Values

Value | Count | Frequency (%)
paper | 100 | 100.0%

Length

Histogram of lengths of the category

단어속성명 (word attribute name)
Categorical

Distinct: 9
Distinct (%): 9.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 2
Mean length: 2.2
Min length: 2

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 속성
2nd row: 상품
3rd row: 속성
4th row: 속성
5th row: 시간

Common Values

Value | Count | Frequency (%)
속성 | 53 | 53.0%
라이프 | 16 | 16.0%
기타 | 10 | 10.0%
장소 | 10 | 10.0%
상품 | 4 | 4.0%
시간 | 2 | 2.0%
사회이슈 | 2 | 2.0%
단체 | 2 | 2.0%
인물 | 1 | 1.0%
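
The Length and Unique figures for this variable can be recomputed from the Common Values table alone. A minimal sketch, assuming that length means the number of Unicode code points (so each precomposed Hangul syllable counts as one character) and that Unique counts the values occurring exactly once; both readings are assumptions that happen to agree with the reported numbers:

```python
from statistics import mean, median

# Value counts copied from the Common Values table above.
VALUE_COUNTS = {
    "속성": 53,
    "라이프": 16,
    "기타": 10,
    "장소": 10,
    "상품": 4,
    "시간": 2,
    "사회이슈": 2,
    "단체": 2,
    "인물": 1,
}

# Expand to one length per row; len() counts Unicode code points.
lengths = [len(value) for value, count in VALUE_COUNTS.items() for _ in range(count)]

n_rows = len(lengths)
distinct = len(VALUE_COUNTS)
# Values that occur exactly once in the column.
unique = sum(1 for count in VALUE_COUNTS.values() if count == 1)

mean_length = mean(lengths)
median_length = median(lengths)

print(f"rows={n_rows} distinct={distinct} unique={unique}")
print(f"max={max(lengths)} median={median_length} mean={mean_length} min={min(lengths)}")
```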

Length

Histogram of lengths of the category

연관어명 (related-word name)
Text

UNIQUE

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 5
Median length: 2
Mean length: 2.36
Min length: 2

Characters and Unicode

Total characters: 236
Distinct characters: 109
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: 가설
2nd row: 가솔린
3rd row: 가스
4th row: 가시
5th row: 가을

Value | Count | Frequency (%)
가설 | 1 | 1.0%
공란 | 1 | 1.0%
관찰 | 1 | 1.0%
관심사 | 1 | 1.0%
관성 | 1 | 1.0%
과소 | 1 | 1.0%
과도 | 1 | 1.0%
과거 | 1 | 1.0%
공학 | 1 | 1.0%
공정 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Most occurring characters

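Value | Count | Frequency (%)
(not shown) | 12 | 5.1%
(not shown) | 10 | 4.2%
(not shown) | 9 | 3.8%
(not shown) | 9 | 3.8%
(not shown) | 8 | 3.4%
(not shown) | 8 | 3.4%
(not shown) | 7 | 3.0%
(not shown) | 6 | 2.5%
(not shown) | 5 | 2.1%
(not shown) | 5 | 2.1%
Other values (99) | 157 | 66.5%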

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 236 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 236 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 236 | 100.0%

Most frequent character per category, script, and block

All 236 characters fall in one category (Other Letter), one script (Hangul), and one block (Hangul), so each of these three tables is identical to the "Most occurring characters" table.

일간연관어언급량 (daily related-word mention count)
Real number (ℝ)

HIGH CORRELATION

Distinct: 7
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.73
Minimum: 1
Maximum: 7
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 1
Q3: 2
95th percentile: 5
Maximum: 7
Range: 6
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.262153
Coefficient of variation (CV): 0.72956823
Kurtosis: 4.045228
Mean: 1.73
Median Absolute Deviation (MAD): 0
Skewness: 2.0336656
Sum: 173
Variance: 1.5930303
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
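Common values
Value | Count | Frequency (%)
1 | 65 | 65.0%
2 | 15 | 15.0%
3 | 11 | 11.0%
5 | 4 | 4.0%
4 | 3 | 3.0%
7 | 1 | 1.0%
6 | 1 | 1.0%

Minimum 10 values
Value | Count | Frequency (%)
1 | 65 | 65.0%
2 | 15 | 15.0%
3 | 11 | 11.0%
4 | 3 | 3.0%
5 | 4 | 4.0%
6 | 1 | 1.0%
7 | 1 | 1.0%

Maximum 10 values
Value | Count | Frequency (%)
7 | 1 | 1.0%
6 | 1 | 1.0%
5 | 4 | 4.0%
4 | 3 | 3.0%
3 | 11 | 11.0%
2 | 15 | 15.0%
1 | 65 | 65.0%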
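
Since this variable takes only seven distinct values, its sum, mean, variance, standard deviation, and coefficient of variation can be recomputed from the frequency table. A minimal sketch, assuming the variance uses the sample (n - 1) denominator; that convention is an assumption chosen because it matches the reported value:

```python
import math

# Value -> count, copied from the frequency table above.
FREQ = {1: 65, 2: 15, 3: 11, 4: 3, 5: 4, 6: 1, 7: 1}

n = sum(FREQ.values())
total = sum(value * count for value, count in FREQ.items())
mean = total / n

# Weighted sum of squared deviations, then the n - 1 (sample) denominator.
squared_dev = sum(count * (value - mean) ** 2 for value, count in FREQ.items())
variance = squared_dev / (n - 1)
std = math.sqrt(variance)
cv = std / mean

print(f"n={n} sum={total} mean={mean}")
print(f"variance={variance:.7f} std={std:.6f} cv={cv:.8f}")
```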

일간연관어단어량 (daily related-word word count)
Real number (ℝ)

HIGH CORRELATION

Distinct: 20
Distinct (%): 20.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.24
Minimum: 1
Maximum: 42
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 2
Q3: 6
95th percentile: 22.1
Maximum: 42
Range: 41
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 7.7929261
Coefficient of variation (CV): 1.4871996
Kurtosis: 8.2049817
Mean: 5.24
Median Absolute Deviation (MAD): 1
Skewness: 2.7916223
Sum: 524
Variance: 60.729697
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=20)
Common values
Value | Count | Frequency (%)
1 | 43 | 43.0%
3 | 13 | 13.0%
2 | 11 | 11.0%
8 | 4 | 4.0%
13 | 4 | 4.0%
4 | 4 | 4.0%
6 | 3 | 3.0%
7 | 3 | 3.0%
5 | 3 | 3.0%
9 | 2 | 2.0%
Other values (10) | 10 | 10.0%

Minimum 10 values
Value | Count | Frequency (%)
1 | 43 | 43.0%
2 | 11 | 11.0%
3 | 13 | 13.0%
4 | 4 | 4.0%
5 | 3 | 3.0%
6 | 3 | 3.0%
7 | 3 | 3.0%
8 | 4 | 4.0%
9 | 2 | 2.0%
10 | 1 | 1.0%

Maximum 10 values
Value | Count | Frequency (%)
42 | 1 | 1.0%
36 | 1 | 1.0%
31 | 1 | 1.0%
30 | 1 | 1.0%
24 | 1 | 1.0%
22 | 1 | 1.0%
21 | 1 | 1.0%
20 | 1 | 1.0%
13 | 4 | 4.0%
12 | 1 | 1.0%

Interactions

[9 interaction plots]

Correlations

Variable | 일간연관어연번 | 단어속성명 | 연관어명 | 일간연관어언급량 | 일간연관어단어량
일간연관어연번 | 1.000 | 0.000 | 1.000 | 0.000 | 0.000
단어속성명 | 0.000 | 1.000 | 1.000 | 0.420 | 0.451
연관어명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
일간연관어언급량 | 0.000 | 0.420 | 1.000 | 1.000 | 0.803
일간연관어단어량 | 0.000 | 0.451 | 1.000 | 0.803 | 1.000

Variable | 일간연관어연번 | 일간연관어언급량 | 일간연관어단어량 | 단어속성명
일간연관어연번 | 1.000 | -0.005 | -0.001 | 0.000
일간연관어언급량 | -0.005 | 1.000 | 0.756 | 0.233
일간연관어단어량 | -0.001 | 0.756 | 1.000 | 0.237
단어속성명 | 0.000 | 0.233 | 0.237 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

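First rows

(index) | 일간연관어연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 단어속성명 | 연관어명 | 일간연관어언급량 | 일간연관어단어량
0 | 1 | 2020-02-29 | 생활환경 | 대기 | paper | 속성 | 가설 | 1 | 1
1 | 2 | 2020-02-29 | 생활환경 | 대기 | paper | 상품 | 가솔린 | 3 | 3
2 | 3 | 2020-02-29 | 생활환경 | 대기 | paper | 속성 | 가스 | 4 | 8
3 | 4 | 2020-02-29 | 생활환경 | 대기 | paper | 속성 | 가시 | 1 | 2
4 | 5 | 2020-02-29 | 생활환경 | 대기 | paper | 시간 | 가을 | 7 | 30
5 | 6 | 2020-02-29 | 생활환경 | 대기 | paper | 속성 | 가이드라인 | 1 | 3
6 | 7 | 2020-02-29 | 생활환경 | 대기 | paper | 속성 | 가중 | 1 | 2
7 | 8 | 2020-02-29 | 생활환경 | 대기 | paper | 속성 | 가중치 | 1 | 1
8 | 9 | 2020-02-29 | 생활환경 | 대기 | paper | 속성 | 간격 | 5 | 12
9 | 10 | 2020-02-29 | 생활환경 | 대기 | paper | 라이프 | 간담회 | 1 | 1

Last rows

(index) | 일간연관어연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 단어속성명 | 연관어명 | 일간연관어언급량 | 일간연관어단어량
90 | 91 | 2020-02-29 | 생활환경 | 대기 | paper | 속성 | 구름 | 2 | 4
91 | 92 | 2020-02-29 | 생활환경 | 대기 | paper | 속성 | 구별 | 1 | 4
92 | 93 | 2020-02-29 | 생활환경 | 대기 | paper | 라이프 | 구부 | 3 | 3
93 | 94 | 2020-02-29 | 생활환경 | 대기 | paper | 장소 | 구청 | 1 | 1
94 | 95 | 2020-02-29 | 생활환경 | 대기 | paper | 기타 | 구하다 | 2 | 3
95 | 96 | 2020-02-29 | 생활환경 | 대기 | paper | 속성 | 국립 | 1 | 6
96 | 97 | 2020-02-29 | 생활환경 | 대기 | paper | 라이프 | 권고 | 1 | 3
97 | 98 | 2020-02-29 | 생활환경 | 대기 | paper | 속성 | 권역 | 1 | 1
98 | 99 | 2020-02-29 | 생활환경 | 대기 | paper | 속성 | 균일 | 2 | 2
99 | 100 | 2020-02-29 | 생활환경 | 대기 | paper | 속성 | 그래프 | 2 | 3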