Overview

Dataset statistics

Number of variables: 9
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.5 KiB
Average record size in memory: 76.3 B

Variable types

Numeric: 2
Categorical: 6
Text: 1

Dataset

Description: Sample data
Author: Sungkyunkwan University Industry-Academic Cooperation Foundation (성균관대학교 산학협력단)
URL: https://www.bigdata-environment.kr/user/data_market/detail.do?id=7a4a32b0-e842-11ea-835f-5b142183dc74

Alerts

연월일 has constant value "2021-04-30" (Constant)
환경플랫폼 하위 도메인명 has constant value "물환경" (Constant)
도메인 하위 카테고리명 has constant value "물재난" (Constant)
SNS 채널명 has constant value "paper" (Constant)
일간연관어언급량 has constant value "1" (Constant)
일간연관어연번 has unique values (Unique)
연관어명 has unique values (Unique)
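The Constant and Unique alerts above follow from each column's distinct count. The sketch below is one simple way to derive them, using the ten first rows shown in the Sample section. The rule (one distinct value means Constant, as many distinct values as rows means Unique) and the helper name `classify_columns` are assumptions for illustration, not the profiler's actual implementation.

```python
# First ten rows from the report's Sample section, one tuple per row.
COLUMNS = [
    "일간연관어연번", "연월일", "환경플랫폼 하위 도메인명", "도메인 하위 카테고리명",
    "SNS 채널명", "단어속성명", "연관어명", "일간연관어언급량", "일간연관어단어량",
]
ROWS = [
    (1, "2021-04-30", "물환경", "물재난", "paper", "사회이슈", "4대강사업", 1, 1),
    (2, "2021-04-30", "물환경", "물재난", "paper", "기타", "가역", 1, 1),
    (3, "2021-04-30", "물환경", "물재난", "paper", "속성", "가중치", 1, 3),
    (4, "2021-04-30", "물환경", "물재난", "paper", "속성", "감도", 1, 3),
    (5, "2021-04-30", "물환경", "물재난", "paper", "속성", "강보", 1, 1),
    (6, "2021-04-30", "물환경", "물재난", "paper", "라이프", "강수량", 1, 2),
    (7, "2021-04-30", "물환경", "물재난", "paper", "인물", "강인", 1, 1),
    (8, "2021-04-30", "물환경", "물재난", "paper", "속성", "개체", 1, 2),
    (9, "2021-04-30", "물환경", "물재난", "paper", "속성", "검정", 1, 1),
    (10, "2021-04-30", "물환경", "물재난", "paper", "속성", "경보", 1, 1),
]


def classify_columns(columns, rows):
    """Label each column 'Constant', 'Unique', or None from its distinct count."""
    alerts = {}
    for index, name in enumerate(columns):
        distinct = len({row[index] for row in rows})
        if distinct == 1:
            alerts[name] = "Constant"      # every row holds the same value
        elif distinct == len(rows):
            alerts[name] = "Unique"        # no value repeats
        else:
            alerts[name] = None            # no alert under this simple rule
    return alerts


alerts = classify_columns(COLUMNS, ROWS)
for name, alert in alerts.items():
    print(f"{name}: {alert}")
```

On these ten rows the rule flags the same five constant columns and two unique columns as the report does for all 100 observations.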

Reproduction

Analysis started: 2023-12-10 13:09:14.522803
Analysis finished: 2023-12-10 13:09:18.411601
Duration: 3.89 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
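As a quick consistency check, the reported duration can be recomputed from the two timestamps. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-10 13:09:14.522803")
finished = datetime.fromisoformat("2023-12-10 13:09:18.411601")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```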

Variables

일간연관어연번 (daily related-term serial number)
Real number (ℝ)

UNIQUE

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.5
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5.95
Q1: 25.75
Median: 50.5
Q3: 75.25
95-th percentile: 95.05
Maximum: 100
Range: 99
Interquartile range (IQR): 49.5

Descriptive statistics

Standard deviation: 29.011492
Coefficient of variation (CV): 0.57448499
Kurtosis: -1.2
Mean: 50.5
Median Absolute Deviation (MAD): 25
Skewness: 0
Sum: 5050
Variance: 841.66667
Monotonicity: Strictly increasing

[Figure: histogram with fixed size bins (bins=50)]
Common values

Value | Count | Frequency (%)
1 | 1 | 1.0%
65 | 1 | 1.0%
75 | 1 | 1.0%
74 | 1 | 1.0%
73 | 1 | 1.0%
72 | 1 | 1.0%
71 | 1 | 1.0%
70 | 1 | 1.0%
69 | 1 | 1.0%
68 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 1.0%
2 | 1 | 1.0%
3 | 1 | 1.0%
4 | 1 | 1.0%
5 | 1 | 1.0%
6 | 1 | 1.0%
7 | 1 | 1.0%
8 | 1 | 1.0%
9 | 1 | 1.0%
10 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
100 | 1 | 1.0%
99 | 1 | 1.0%
98 | 1 | 1.0%
97 | 1 | 1.0%
96 | 1 | 1.0%
95 | 1 | 1.0%
94 | 1 | 1.0%
93 | 1 | 1.0%
92 | 1 | 1.0%
91 | 1 | 1.0%
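The quantile and descriptive statistics for 일간연관어연번 can be reproduced by hand, since the column is strictly increasing with 100 distinct values between 1 and 100. The sketch below assumes the column is exactly the integers 1 to 100, uses sample (n - 1) variance, and computes quantiles by linear interpolation between ranks. Those conventions are assumptions chosen because they match the reported figures, and `quantile` is a hypothetical helper.

```python
import math
import statistics


def quantile(sorted_values, q):
    """Return the q-quantile using linear interpolation between ranks."""
    position = (len(sorted_values) - 1) * q
    lower = math.floor(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


# Assumption: the column contains each integer from 1 to 100 exactly once.
values = list(range(1, 101))

mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, divisor n - 1
std_dev = statistics.stdev(values)       # sample standard deviation
cv = std_dev / mean                      # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

p5 = quantile(values, 0.05)
q1 = quantile(values, 0.25)
q3 = quantile(values, 0.75)
p95 = quantile(values, 0.95)
iqr = q3 - q1

print(f"mean={mean}, variance={variance:.5f}, std={std_dev:.6f}, cv={cv:.8f}")
print(f"p5={p5:.2f}, q1={q1:.2f}, q3={q3:.2f}, p95={p95:.2f}, iqr={iqr:.2f}, mad={mad}")
```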

연월일 (date, year-month-day)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

2021-04-30: 100

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021-04-30
2nd row: 2021-04-30
3rd row: 2021-04-30
4th row: 2021-04-30
5th row: 2021-04-30

Common Values

Value | Count | Frequency (%)
2021-04-30 | 100 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the counts listed under Common Values]

환경플랫폼 하위 도메인명 (environment platform subdomain name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

물환경: 100

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물환경
2nd row: 물환경
3rd row: 물환경
4th row: 물환경
5th row: 물환경

Common Values

Value | Count | Frequency (%)
물환경 | 100 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the counts listed under Common Values]

도메인 하위 카테고리명 (domain subcategory name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

물재난: 100

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물재난
2nd row: 물재난
3rd row: 물재난
4th row: 물재난
5th row: 물재난

Common Values

Value | Count | Frequency (%)
물재난 | 100 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the counts listed under Common Values]

SNS 채널명 (SNS channel name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

paper: 100

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: paper
2nd row: paper
3rd row: paper
4th row: paper
5th row: paper

Common Values

Value | Count | Frequency (%)
paper | 100 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the counts listed under Common Values]

단어속성명 (word attribute name)
Categorical

Distinct: 9
Distinct (%): 9.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

[Figure: bar chart of top categories: 속성 (47), 기타 (26), 라이프, 인물, 장소, Other values (4)]

Length

Max length: 6
Median length: 2
Mean length: 2.27
Min length: 2

Unique

Unique: 2
Unique (%): 2.0%

Sample

1st row: 사회이슈
2nd row: 기타
3rd row: 속성
4th row: 속성
5th row: 속성

Common Values

Value | Count | Frequency (%)
속성 | 47 | 47.0%
기타 | 26 | 26.0%
라이프 | 8 | 8.0%
인물 | 6 | 6.0%
장소 | 5 | 5.0%
엔터테인먼트 | 4 | 4.0%
단체 | 2 | 2.0%
사회이슈 | 1 | 1.0%
브랜드 | 1 | 1.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the counts listed under Common Values]
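The Distinct, Unique, and Length figures for 단어속성명 follow directly from its Common Values counts. The sketch below rebuilds them from those nine counts. It assumes "Unique" means values that occur exactly once and that length is measured in Unicode code points, which is consistent with the reported numbers.

```python
from collections import Counter
from statistics import median

# Counts copied from the Common Values table for 단어속성명.
counts = Counter({
    "속성": 47, "기타": 26, "라이프": 8, "인물": 6, "장소": 5,
    "엔터테인먼트": 4, "단체": 2, "사회이슈": 1, "브랜드": 1,
})

n = sum(counts.values())                                # number of observations
distinct = len(counts)                                  # distinct category values
unique = sum(1 for c in counts.values() if c == 1)      # values seen exactly once

# One length per observation, so frequent categories weigh more.
lengths = [len(value) for value in counts.elements()]
mean_length = sum(lengths) / n

print(f"n={n}, distinct={distinct}, unique={unique}")
print(f"min={min(lengths)}, median={median(lengths)}, "
      f"mean={mean_length:.2f}, max={max(lengths)}")
```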

연관어명 (related-term name)
Text

UNIQUE

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 5
Median length: 2
Mean length: 2.41
Min length: 2

Characters and Unicode

Total characters: 241
Distinct characters: 133
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
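To illustrate how such character properties can be read, the sketch below counts Unicode general categories for the five sample values of 연관어명 using Python's standard `unicodedata` module. It covers general categories only ("Lo" is Other Letter, "Nd" is Decimal Number); scripts and blocks are not exposed by that module, and this is not necessarily how the profiler computes its tables.

```python
import unicodedata
from collections import Counter

# The five sample values listed for 연관어명.
sample = ["4대강사업", "가역", "가중치", "감도", "강보"]

# Count the general category of every character across all sample words.
category_counts = Counter(
    unicodedata.category(ch) for word in sample for ch in word
)
total_characters = sum(category_counts.values())

print(total_characters, dict(category_counts))
print(unicodedata.name("대"))  # official Unicode name of one Hangul syllable
```

In this sample the single digit in "4대강사업" is the only Decimal Number, matching the one Decimal Number character the report counts across all 241 characters.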

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: 4대강사업
2nd row: 가역
3rd row: 가중치
4th row: 감도
5th row: 강보

Value | Count | Frequency (%)
4대강사업 | 1 | 1.0%
상판 | 1 | 1.0%
수체 | 1 | 1.0%
수자원 | 1 | 1.0%
수온 | 1 | 1.0%
수리 | 1 | 1.0%
수노 | 1 | 1.0%
수계 | 1 | 1.0%
소원 | 1 | 1.0%
서론 | 1 | 1.0%
Other values (90) | 90 | 90.0%

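Most occurring characters

Value | Count | Frequency (%)
(unrecovered) | 13 | 5.4%
(unrecovered) | 7 | 2.9%
(unrecovered) | 6 | 2.5%
(unrecovered) | 6 | 2.5%
(unrecovered) | 5 | 2.1%
(unrecovered) | 4 | 1.7%
(unrecovered) | 4 | 1.7%
(unrecovered) | 4 | 1.7%
(unrecovered) | 4 | 1.7%
(unrecovered) | 4 | 1.7%
Other values (123) | 184 | 76.3%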

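Most occurring categories

Value | Count | Frequency (%)
Other Letter | 240 | 99.6%
Decimal Number | 1 | 0.4%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(unrecovered) | 13 | 5.4%
(unrecovered) | 7 | 2.9%
(unrecovered) | 6 | 2.5%
(unrecovered) | 6 | 2.5%
(unrecovered) | 5 | 2.1%
(unrecovered) | 4 | 1.7%
(unrecovered) | 4 | 1.7%
(unrecovered) | 4 | 1.7%
(unrecovered) | 4 | 1.7%
(unrecovered) | 4 | 1.7%
Other values (122) | 183 | 76.2%

Decimal Number
Value | Count | Frequency (%)
4 | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 240 | 99.6%
Common | 1 | 0.4%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(unrecovered) | 13 | 5.4%
(unrecovered) | 7 | 2.9%
(unrecovered) | 6 | 2.5%
(unrecovered) | 6 | 2.5%
(unrecovered) | 5 | 2.1%
(unrecovered) | 4 | 1.7%
(unrecovered) | 4 | 1.7%
(unrecovered) | 4 | 1.7%
(unrecovered) | 4 | 1.7%
(unrecovered) | 4 | 1.7%
Other values (122) | 183 | 76.2%

Common
Value | Count | Frequency (%)
4 | 1 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 240 | 99.6%
ASCII | 1 | 0.4%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(unrecovered) | 13 | 5.4%
(unrecovered) | 7 | 2.9%
(unrecovered) | 6 | 2.5%
(unrecovered) | 6 | 2.5%
(unrecovered) | 5 | 2.1%
(unrecovered) | 4 | 1.7%
(unrecovered) | 4 | 1.7%
(unrecovered) | 4 | 1.7%
(unrecovered) | 4 | 1.7%
(unrecovered) | 4 | 1.7%
Other values (122) | 183 | 76.2%

ASCII
Value | Count | Frequency (%)
4 | 1 | 100.0%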

일간연관어언급량 (daily related-term mention count)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

1: 100

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 100 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the counts listed under Common Values]

일간연관어단어량 (daily related-term word count)
Real number (ℝ)

Distinct: 9
Distinct (%): 9.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.45
Minimum: 1
Maximum: 59
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 2
95-th percentile: 4.05
Maximum: 59
Range: 58
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 6.3633658
Coefficient of variation (CV): 2.5972922
Kurtosis: 65.308769
Mean: 2.45
Median Absolute Deviation (MAD): 0
Skewness: 7.6985633
Sum: 245
Variance: 40.492424
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=9)]
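Common values

Value | Count | Frequency (%)
1 | 70 | 70.0%
2 | 12 | 12.0%
3 | 11 | 11.0%
4 | 2 | 2.0%
6 | 1 | 1.0%
23 | 1 | 1.0%
17 | 1 | 1.0%
5 | 1 | 1.0%
59 | 1 | 1.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 70 | 70.0%
2 | 12 | 12.0%
3 | 11 | 11.0%
4 | 2 | 2.0%
5 | 1 | 1.0%
6 | 1 | 1.0%
17 | 1 | 1.0%
23 | 1 | 1.0%
59 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
59 | 1 | 1.0%
23 | 1 | 1.0%
17 | 1 | 1.0%
6 | 1 | 1.0%
5 | 1 | 1.0%
4 | 2 | 2.0%
3 | 11 | 11.0%
2 | 12 | 12.0%
1 | 70 | 70.0%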
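Because the value counts for 일간연관어단어량 cover all 100 observations, the summary statistics can be rebuilt from the frequency table alone. The sketch below expands the counts into a sorted list and recomputes several of the reported figures. It assumes sample (n - 1) variance and linearly interpolated quantiles, conventions chosen because they reproduce the report's numbers; `quantile` is a hypothetical helper.

```python
import math
import statistics

# Value -> count, copied from the value tables for 일간연관어단어량.
VALUE_COUNTS = {1: 70, 2: 12, 3: 11, 4: 2, 5: 1, 6: 1, 17: 1, 23: 1, 59: 1}


def quantile(sorted_values, q):
    """Return the q-quantile using linear interpolation between ranks."""
    position = (len(sorted_values) - 1) * q
    lower = math.floor(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


# Expand the frequency table into one entry per observation.
values = sorted(v for v, count in VALUE_COUNTS.items() for _ in range(count))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, divisor n - 1
std_dev = statistics.stdev(values)
cv = std_dev / mean
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)
q3 = quantile(values, 0.75)
p95 = quantile(values, 0.95)

print(f"n={len(values)}, sum={total}, mean={mean}, variance={variance:.6f}")
print(f"std={std_dev:.7f}, cv={cv:.7f}, median={median}, mad={mad}")
print(f"q3={q3:.2f}, p95={p95:.2f}")
```

The large gap between the median (1) and the maximum (59) is what drives the high skewness and kurtosis reported for this column.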

Interactions

[Figures: four interaction plots; images not recoverable]

Correlations

[Figure: correlation heatmap]

 | 일간연관어연번 | 단어속성명 | 연관어명 | 일간연관어단어량
일간연관어연번 | 1.000 | 0.215 | 1.000 | 0.047
단어속성명 | 0.215 | 1.000 | 1.000 | 0.000
연관어명 | 1.000 | 1.000 | 1.000 | 1.000
일간연관어단어량 | 0.047 | 0.000 | 1.000 | 1.000

[Figure: correlation heatmap]

 | 일간연관어연번 | 일간연관어단어량 | 단어속성명
일간연관어연번 | 1.000 | -0.009 | 0.093
일간연관어단어량 | -0.009 | 1.000 | 0.000
단어속성명 | 0.093 | 0.000 | 1.000

Missing values

[Figure: a simple visualization of nullity by column.]
[Figure: nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

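First rows

 | 일간연관어연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 단어속성명 | 연관어명 | 일간연관어언급량 | 일간연관어단어량
0 | 1 | 2021-04-30 | 물환경 | 물재난 | paper | 사회이슈 | 4대강사업 | 1 | 1
1 | 2 | 2021-04-30 | 물환경 | 물재난 | paper | 기타 | 가역 | 1 | 1
2 | 3 | 2021-04-30 | 물환경 | 물재난 | paper | 속성 | 가중치 | 1 | 3
3 | 4 | 2021-04-30 | 물환경 | 물재난 | paper | 속성 | 감도 | 1 | 3
4 | 5 | 2021-04-30 | 물환경 | 물재난 | paper | 속성 | 강보 | 1 | 1
5 | 6 | 2021-04-30 | 물환경 | 물재난 | paper | 라이프 | 강수량 | 1 | 2
6 | 7 | 2021-04-30 | 물환경 | 물재난 | paper | 인물 | 강인 | 1 | 1
7 | 8 | 2021-04-30 | 물환경 | 물재난 | paper | 속성 | 개체 | 1 | 2
8 | 9 | 2021-04-30 | 물환경 | 물재난 | paper | 속성 | 검정 | 1 | 1
9 | 10 | 2021-04-30 | 물환경 | 물재난 | paper | 속성 | 경보 | 1 | 1

Last rows

 | 일간연관어연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 단어속성명 | 연관어명 | 일간연관어언급량 | 일간연관어단어량
90 | 91 | 2021-04-30 | 물환경 | 물재난 | paper | 라이프 | 유기물 | 1 | 3
91 | 92 | 2021-04-30 | 물환경 | 물재난 | paper | 속성 | 유역 | 1 | 1
92 | 93 | 2021-04-30 | 물환경 | 물재난 | paper | 단체 | 의경 | 1 | 1
93 | 94 | 2021-04-30 | 물환경 | 물재난 | paper | 장소 | 의령 | 1 | 1
94 | 95 | 2021-04-30 | 물환경 | 물재난 | paper | 기타 | 의수 | 1 | 1
95 | 96 | 2021-04-30 | 물환경 | 물재난 | paper | 라이프 | 의학 | 1 | 1
96 | 97 | 2021-04-30 | 물환경 | 물재난 | paper | 인물 | 이상민 | 1 | 4
97 | 98 | 2021-04-30 | 물환경 | 물재난 | paper | 브랜드 | 인산 | 1 | 1
98 | 99 | 2021-04-30 | 물환경 | 물재난 | paper | 기타 | 일사량 | 1 | 1
99 | 100 | 2021-04-30 | 물환경 | 물재난 | paper | 속성 | 자판 | 1 | 1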