Overview

Dataset statistics

Number of variables: 9
Number of observations: 28
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.2 KiB
Average record size in memory: 79.7 B

Variable types

Numeric: 1
Categorical: 7
Text: 1

Dataset

Description: Sample data
Author: 성균관대학교 산학협력단 (Sungkyunkwan University Industry-Academic Cooperation Foundation)
URL: https://www.bigdata-environment.kr/user/data_market/detail.do?id=7a4a32b0-e842-11ea-835f-5b142183dc74

Alerts

연월일 has constant value "2021-01-31" (Constant)
환경플랫폼 하위 도메인명 has constant value "물환경" (Constant)
도메인 하위 카테고리명 has constant value "하천" (Constant)
SNS 채널명 has constant value "paper" (Constant)
일간연관어언급량 has constant value "1" (Constant)
일간연관어단어량 is highly imbalanced (62.9%) (Imbalance)
일간연관어연번 has unique values (Unique)
연관어명 has unique values (Unique)
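
The 62.9% in the imbalance alert is consistent with an entropy-based score: one minus the Shannon entropy of the class shares, divided by the maximum entropy for that number of classes. The sketch below applies that formula to the value counts this report gives for 일간연관어단어량 (26 and 2). The function name `imbalance_score` is illustrative, and that ydata-profiling computes the alert exactly this way is an assumption.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the base-2 Shannon entropy of the class shares."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # a single class has no balance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Value counts reported for 일간연관어단어량: "2" occurs 26 times, "1" occurs 2 times.
score = imbalance_score([26, 2])
print(f"{score:.1%}")  # prints 62.9%
```

A perfectly balanced column scores 0 and a column dominated by one value approaches 1, which is why a 26-to-2 split is flagged.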

Reproduction

Analysis started: 2023-12-10 13:09:24.788010
Analysis finished: 2023-12-10 13:09:26.740329
Duration: 1.95 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
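
The report does not record the command that produced it. As a rough sketch only, a report of this kind can be generated with ydata-profiling's `ProfileReport` class. The function name `build_report`, the file paths passed to it, and the report title are hypothetical; the description and URL are copied from the Dataset section above.

```python
def build_report(csv_path, html_path):
    """Profile a CSV file and write an HTML report (needs pandas and ydata-profiling)."""
    # Imported lazily so the function can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    report = ProfileReport(
        df,
        title="Profiling Report",  # hypothetical title, not taken from the report
        dataset={
            "description": "샘플 데이터",
            "url": "https://www.bigdata-environment.kr/user/data_market/detail.do?id=7a4a32b0-e842-11ea-835f-5b142183dc74",
        },
    )
    report.to_file(html_path)
    return report
```

Calling `build_report("sample.csv", "report.html")` with a real CSV path would write the HTML file; both file names here are placeholders.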

Variables

일간연관어연번
Real number (ℝ)

UNIQUE 

Distinct: 28
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.5
Minimum: 1
Maximum: 28
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 384.0 B

Quantile statistics

Minimum: 1
5-th percentile: 2.35
Q1: 7.75
Median: 14.5
Q3: 21.25
95-th percentile: 26.65
Maximum: 28
Range: 27
Interquartile range (IQR): 13.5
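
Because 일간연관어연번 takes each integer from 1 to 28 exactly once (28 distinct values, minimum 1, maximum 28), the quantile statistics can be checked by hand. The sketch below uses the standard library with linear interpolation between sorted points, which reproduces the figures listed; that the report uses the same interpolation rule is inferred from the matching numbers, not stated in the report.

```python
import statistics

values = list(range(1, 29))  # the 28 values of 일간연관어연번

# method="inclusive" interpolates linearly between the sorted data points.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
ventiles = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = ventiles[0], ventiles[-1]

value_range = max(values) - min(values)
iqr = q3 - q1

print(p5, q1, median, q3, p95)  # 2.35 7.75 14.5 21.25 26.65
print(value_range, iqr)         # 27 13.5
```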

Descriptive statistics

Standard deviation: 8.2259751
Coefficient of variation (CV): 0.56730863
Kurtosis: -1.2
Mean: 14.5
Median Absolute Deviation (MAD): 7
Skewness: 0
Sum: 406
Variance: 67.666667
Monotonicity: Strictly increasing
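
The descriptive statistics follow from the same 28 values. This minimal sketch recomputes them with the standard library, using the sample variance (denominator n - 1), which is the convention that matches the reported 67.666667. Kurtosis and skewness are left out to keep the example short.

```python
import statistics

values = list(range(1, 29))  # the 28 values of 일간연관어연번

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std = statistics.stdev(values)
cv = std / mean                         # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)  # median absolute deviation
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(total, mean)                        # 406 14.5
print(round(variance, 6), round(std, 7))  # 67.666667 8.2259751
print(round(cv, 8), mad)                  # 0.56730863 7.0
print(strictly_increasing)                # True
```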
Histogram with fixed size bins (bins=28)
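
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 1 | 3.6% |
| 16 | 1 | 3.6% |
| 28 | 1 | 3.6% |
| 27 | 1 | 3.6% |
| 26 | 1 | 3.6% |
| 25 | 1 | 3.6% |
| 24 | 1 | 3.6% |
| 23 | 1 | 3.6% |
| 22 | 1 | 3.6% |
| 21 | 1 | 3.6% |
| Other values (18) | 18 | 64.3% |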
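
Minimum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 1 | 3.6% |
| 2 | 1 | 3.6% |
| 3 | 1 | 3.6% |
| 4 | 1 | 3.6% |
| 5 | 1 | 3.6% |
| 6 | 1 | 3.6% |
| 7 | 1 | 3.6% |
| 8 | 1 | 3.6% |
| 9 | 1 | 3.6% |
| 10 | 1 | 3.6% |

Maximum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 28 | 1 | 3.6% |
| 27 | 1 | 3.6% |
| 26 | 1 | 3.6% |
| 25 | 1 | 3.6% |
| 24 | 1 | 3.6% |
| 23 | 1 | 3.6% |
| 22 | 1 | 3.6% |
| 21 | 1 | 3.6% |
| 20 | 1 | 3.6% |
| 19 | 1 | 3.6% |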

연월일
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 356.0 B

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021-01-31
2nd row: 2021-01-31
3rd row: 2021-01-31
4th row: 2021-01-31
5th row: 2021-01-31

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 2021-01-31 | 28 | 100.0% |

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

환경플랫폼 하위 도메인명
Categorical

CONSTANT

Distinct: 1
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 356.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물환경
2nd row: 물환경
3rd row: 물환경
4th row: 물환경
5th row: 물환경

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 물환경 | 28 | 100.0% |

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

도메인 하위 카테고리명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 356.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 하천
2nd row: 하천
3rd row: 하천
4th row: 하천
5th row: 하천

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 하천 | 28 | 100.0% |

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

SNS 채널명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 356.0 B

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: paper
2nd row: paper
3rd row: paper
4th row: paper
5th row: paper

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| paper | 28 | 100.0% |

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

단어속성명
Categorical

Distinct: 6
Distinct (%): 21.4%
Missing: 0
Missing (%): 0.0%
Memory size: 356.0 B

Length

Max length: 6
Median length: 2
Mean length: 2.3214286
Min length: 2

Unique

Unique: 2
Unique (%): 7.1%

Sample

1st row: 기타
2nd row: 장소
3rd row: 기타
4th row: 인물
5th row: 인물

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 인물 | 11 | 39.3% |
| 기타 | 10 | 35.7% |
| 속성 | 3 | 10.7% |
| 엔터테인먼트 | 2 | 7.1% |
| 장소 | 1 | 3.6% |
| 라이프 | 1 | 3.6% |
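
The Distinct, Unique and Length figures for 단어속성명 can all be derived from the value counts in the Common Values table. The sketch below does that with the standard library, reading "Unique" as the number of values that occur exactly once, which is the reading that matches the reported 2. Strings are normalized to NFC so that each Hangul syllable counts as one character.

```python
import statistics
import unicodedata
from collections import Counter

# Value counts copied from the Common Values table for 단어속성명.
counts = Counter({"인물": 11, "기타": 10, "속성": 3, "엔터테인먼트": 2, "장소": 1, "라이프": 1})

n_rows = sum(counts.values())                       # 28 observations
distinct = len(counts)                              # number of different values
unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

# One length per observation, counted in precomposed (NFC) characters.
lengths = [len(unicodedata.normalize("NFC", v)) for v in counts.elements()]
mean_length = statistics.mean(lengths)
median_length = statistics.median(lengths)

print(distinct, f"{distinct / n_rows:.1%}")      # 6 21.4%
print(unique, f"{unique / n_rows:.1%}")          # 2 7.1%
print(max(lengths), median_length, min(lengths)) # 6 2.0 2
print(round(mean_length, 7))                     # 2.3214286
```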

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

연관어명
Text

UNIQUE 

Distinct: 28
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 356.0 B

Length

Max length: 6
Median length: 4
Mean length: 2.6785714
Min length: 2

Characters and Unicode

Total characters: 75
Distinct characters: 53
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
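
As a small illustration of those character properties, the sketch below looks up the general category of each character in the five sample values listed for 연관어명. Python's standard library exposes the category (`Lo`, which the report labels "Other Letter") but not the script property, so the first word of the character name is used here as a rough stand-in for the script. That proxy is an assumption of this example, not how the report derives its script counts.

```python
import unicodedata
from collections import Counter

# The five sample values listed for 연관어명.
words = ["경공", "경안천", "귀래", "김경민", "김민지"]

# Normalize to NFC so each Hangul syllable is a single code point.
chars = [ch for word in words for ch in unicodedata.normalize("NFC", word)]

categories = Counter(unicodedata.category(ch) for ch in chars)
# Rough script proxy: "HANGUL SYLLABLE GYEONG" -> "HANGUL".
name_prefixes = Counter(unicodedata.name(ch).split()[0] for ch in chars)

print(len(chars))     # 13
print(categories)     # Counter({'Lo': 13})
print(name_prefixes)  # Counter({'HANGUL': 13})
```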

Unique

Unique: 28
Unique (%): 100.0%

Sample

1st row: 경공
2nd row: 경안천
3rd row: 귀래
4th row: 김경민
5th row: 김민지

| Value | Count | Frequency (%) |
|---|---|---|
| 경공 | 1 | 3.6% |
| 경안천 | 1 | 3.6% |
| 해상풍력발전 | 1 | 3.6% |
| 통계 | 1 | 3.6% |
| 탐색 | 1 | 3.6% |
| 최연 | 1 | 3.6% |
| 채민 | 1 | 3.6% |
| 주진철 | 1 | 3.6% |
| 정석희 | 1 | 3.6% |
| 이준호 | 1 | 3.6% |
| Other values (18) | 18 | 64.3% |

Most occurring characters

The characters in the Value column were lost in extraction; only the counts survive.

| Value | Count | Frequency (%) |
|---|---|---|
| (not recovered) | 5 | 6.7% |
| (not recovered) | 4 | 5.3% |
| (not recovered) | 4 | 5.3% |
| (not recovered) | 3 | 4.0% |
| (not recovered) | 3 | 4.0% |
| (not recovered) | 3 | 4.0% |
| (not recovered) | 2 | 2.7% |
| (not recovered) | 2 | 2.7% |
| (not recovered) | 2 | 2.7% |
| (not recovered) | 2 | 2.7% |
| Other values (43) | 45 | 60.0% |

Most occurring categories

| Value | Count | Frequency (%) |
|---|---|---|
| Other Letter | 75 | 100.0% |

Most frequent character per category

Other Letter: same counts as under Most occurring characters, because every character falls in this category.

Most occurring scripts

| Value | Count | Frequency (%) |
|---|---|---|
| Hangul | 75 | 100.0% |

Most frequent character per script

Hangul: same counts as under Most occurring characters.

Most occurring blocks

| Value | Count | Frequency (%) |
|---|---|---|
| Hangul | 75 | 100.0% |

Most frequent character per block

Hangul: same counts as under Most occurring characters.

일간연관어언급량
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 356.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 28 | 100.0% |

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

일간연관어단어량
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 7.1%
Missing: 0
Missing (%): 0.0%
Memory size: 356.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 2
3rd row: 2
4th row: 2
5th row: 2

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 26 | 92.9% |
| 1 | 2 | 7.1% |

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

Interactions


Correlations

| | 일간연관어연번 | 단어속성명 | 연관어명 | 일간연관어단어량 |
|---|---|---|---|---|
| 일간연관어연번 | 1.000 | 0.000 | 1.000 | 0.000 |
| 단어속성명 | 0.000 | 1.000 | 1.000 | 0.000 |
| 연관어명 | 1.000 | 1.000 | 1.000 | 1.000 |
| 일간연관어단어량 | 0.000 | 0.000 | 1.000 | 1.000 |

| | 일간연관어단어량 | 단어속성명 |
|---|---|---|
| 일간연관어단어량 | 1.000 | 0.000 |
| 단어속성명 | 0.000 | 1.000 |

| | 일간연관어연번 | 단어속성명 | 일간연관어단어량 |
|---|---|---|---|
| 일간연관어연번 | 1.000 | 0.000 | 0.000 |
| 단어속성명 | 0.000 | 1.000 | 0.000 |
| 일간연관어단어량 | 0.000 | 0.000 | 1.000 |

Missing values

[Figure: A simple visualization of nullity by column.]

[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

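First rows

| | 일간연관어연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 단어속성명 | 연관어명 | 일간연관어언급량 | 일간연관어단어량 |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 2021-01-31 | 물환경 | 하천 | paper | 기타 | 경공 | 1 | 1 |
| 1 | 2 | 2021-01-31 | 물환경 | 하천 | paper | 장소 | 경안천 | 1 | 2 |
| 2 | 3 | 2021-01-31 | 물환경 | 하천 | paper | 기타 | 귀래 | 1 | 2 |
| 3 | 4 | 2021-01-31 | 물환경 | 하천 | paper | 인물 | 김경민 | 1 | 2 |
| 4 | 5 | 2021-01-31 | 물환경 | 하천 | paper | 인물 | 김민지 | 1 | 2 |
| 5 | 6 | 2021-01-31 | 물환경 | 하천 | paper | 인물 | 김성수 | 1 | 2 |
| 6 | 7 | 2021-01-31 | 물환경 | 하천 | paper | 인물 | 김진호 | 1 | 2 |
| 7 | 8 | 2021-01-31 | 물환경 | 하천 | paper | 엔터테인먼트 | 대호 | 1 | 2 |
| 8 | 9 | 2021-01-31 | 물환경 | 하천 | paper | 속성 | 목차 | 1 | 2 |
| 9 | 10 | 2021-01-31 | 물환경 | 하천 | paper | 인물 | 박민규 | 1 | 2 |

Last rows

| | 일간연관어연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 단어속성명 | 연관어명 | 일간연관어언급량 | 일간연관어단어량 |
|---|---|---|---|---|---|---|---|---|---|
| 18 | 19 | 2021-01-31 | 물환경 | 하천 | paper | 인물 | 윤석민 | 1 | 2 |
| 19 | 20 | 2021-01-31 | 물환경 | 하천 | paper | 인물 | 이준호 | 1 | 2 |
| 20 | 21 | 2021-01-31 | 물환경 | 하천 | paper | 기타 | 정석희 | 1 | 2 |
| 21 | 22 | 2021-01-31 | 물환경 | 하천 | paper | 인물 | 주진철 | 1 | 2 |
| 22 | 23 | 2021-01-31 | 물환경 | 하천 | paper | 기타 | 채민 | 1 | 2 |
| 23 | 24 | 2021-01-31 | 물환경 | 하천 | paper | 인물 | 최연 | 1 | 2 |
| 24 | 25 | 2021-01-31 | 물환경 | 하천 | paper | 속성 | 탐색 | 1 | 2 |
| 25 | 26 | 2021-01-31 | 물환경 | 하천 | paper | 라이프 | 통계 | 1 | 2 |
| 26 | 27 | 2021-01-31 | 물환경 | 하천 | paper | 기타 | 해상풍력발전 | 1 | 2 |
| 27 | 28 | 2021-01-31 | 물환경 | 하천 | paper | 속성 | 호소 | 1 | 2 |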