Overview

Dataset statistics

Number of variables: 8
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.6 KiB
Average record size in memory: 67.3 B
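
The figures above are related by simple arithmetic. The sketch below re-derives the cell count, the missing-cell percentage and the total size from the reported values. It assumes the average record size is the total memory size divided by the number of observations; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 8
n_observations = 100
missing_cells = 0
avg_record_bytes = 67.3  # reported average record size, in bytes

# Every variable has one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100.0 * missing_cells / total_cells

# Assumption: average record size = total size / number of observations.
total_bytes = avg_record_bytes * n_observations
total_kib = total_bytes / 1024

print(f"cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"total size: {total_kib:.1f} KiB")
```

Rounded to one decimal place, the derived total agrees with the reported 6.6 KiB.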

Variable types

Numeric: 2
DateTime: 1
Categorical: 4
Text: 1

Dataset

Description: Sample data
Author: 성균관대학교 산학협력단
URL: https://www.bigdata-environment.kr/user/data_market/detail.do?id=0d7f33b0-2fce-11ea-94b6-73a02796bba4

Alerts

연월일 has constant value "2017-01-02 00:00:00" (Constant)
환경플랫폼 하위 도메인명 has constant value "물환경" (Constant)
도메인 하위 카테고리명 has constant value "하천" (Constant)
SNS 채널명 has constant value "All" (Constant)
일간연관어연번 is highly overall correlated with 일간연관어언급량 (High correlation)
일간연관어언급량 is highly overall correlated with 일간연관어연번 (High correlation)
일간연관어연번 has unique values (Unique)
연관어명 has unique values (Unique)
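
The Constant and Unique alerts can be understood as simple distinct-value checks. The following is a minimal sketch of that idea, not the profiler's own implementation. It uses the first eight rows of the Sample table at the end of this report, restricted to five columns; the function name `classify` is hypothetical.

```python
# First eight rows of the report's Sample table, five columns only.
data = {
    "일간연관어연번": [1, 2, 3, 4, 5, 6, 7, 8],
    "연월일": ["2017-01-02"] * 8,
    "SNS 채널명": ["All"] * 8,
    "단어속성명": ["시간", "장소", "사회이슈", "단체", "라이프", "시간", "장소", "라이프"],
    "연관어명": ["새해", "서울", "탈당", "새누리", "특검", "2017년", "한국", "문화"],
}


def classify(values):
    """Return 'Constant', 'Unique', or None for a list of column values."""
    distinct = len(set(values))
    if distinct == 1:
        return "Constant"      # every row holds the same value
    if distinct == len(values):
        return "Unique"        # no value is repeated
    return None                # neither alert applies


alerts = {column: classify(values) for column, values in data.items()}
for column, alert in alerts.items():
    print(column, "->", alert)
```

On this excerpt the checks agree with the alerts listed above: 연월일 and SNS 채널명 are constant, 일간연관어연번 and 연관어명 are unique, and 단어속성명 triggers neither.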

Reproduction

Analysis started: 2023-12-10 12:43:05.450655
Analysis finished: 2023-12-10 12:43:06.403445
Duration: 0.95 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

일간연관어연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.5
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5.95
Q1: 25.75
median: 50.5
Q3: 75.25
95-th percentile: 95.05
Maximum: 100
Range: 99
Interquartile range (IQR): 49.5
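
Because 일간연관어연번 is strictly increasing from 1 to 100 with 100 distinct values, its quantiles can be reproduced by hand. The sketch below assumes the report uses linear interpolation between neighbouring ranks; the helper name `percentile` is illustrative.

```python
import math


def percentile(sorted_values, q):
    """q-th percentile (0-100) using linear interpolation between ranks."""
    position = (len(sorted_values) - 1) * q / 100.0
    lower = math.floor(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


values = list(range(1, 101))  # the 100 serial numbers 1..100

q05 = percentile(values, 5)
q1 = percentile(values, 25)
median = percentile(values, 50)
q3 = percentile(values, 75)
q95 = percentile(values, 95)
iqr = q3 - q1
value_range = values[-1] - values[0]

print(f"5%={q05:.2f} Q1={q1:.2f} median={median:.2f} Q3={q3:.2f} 95%={q95:.2f}")
print(f"IQR={iqr:.2f} range={value_range}")
```

Under that assumption the results match the quantile statistics listed above.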

Descriptive statistics

Standard deviation: 29.011492
Coefficient of variation (CV): 0.57448499
Kurtosis: -1.2
Mean: 50.5
Median Absolute Deviation (MAD): 25
Skewness: 0
Sum: 5050
Variance: 841.66667
Monotonicity: Strictly increasing
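
Most of these descriptive statistics follow directly from the values 1 to 100. The sketch below recomputes them with Python's standard library. The reported variance is consistent with the sample (n - 1) definition, so that is what the sketch uses.

```python
import statistics

values = [float(v) for v in range(1, 101)]  # the serial numbers 1..100

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, divides by n - 1
std_dev = statistics.stdev(values)      # square root of the sample variance
cv = std_dev / mean                     # coefficient of variation

# Median absolute deviation: median distance from the median.
median = statistics.median(values)
mad = statistics.median([abs(v - median) for v in values])

# Strictly increasing: each value is larger than the one before it.
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(f"sum={total:.0f} mean={mean} variance={variance:.5f}")
print(f"std={std_dev:.6f} cv={cv:.8f} mad={mad} increasing={strictly_increasing}")
```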
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | 1.0%
65 | 1 | 1.0%
75 | 1 | 1.0%
74 | 1 | 1.0%
73 | 1 | 1.0%
72 | 1 | 1.0%
71 | 1 | 1.0%
70 | 1 | 1.0%
69 | 1 | 1.0%
68 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Value | Count | Frequency (%)
1 | 1 | 1.0%
2 | 1 | 1.0%
3 | 1 | 1.0%
4 | 1 | 1.0%
5 | 1 | 1.0%
6 | 1 | 1.0%
7 | 1 | 1.0%
8 | 1 | 1.0%
9 | 1 | 1.0%
10 | 1 | 1.0%

Value | Count | Frequency (%)
100 | 1 | 1.0%
99 | 1 | 1.0%
98 | 1 | 1.0%
97 | 1 | 1.0%
96 | 1 | 1.0%
95 | 1 | 1.0%
94 | 1 | 1.0%
93 | 1 | 1.0%
92 | 1 | 1.0%
91 | 1 | 1.0%

연월일
Date

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2017-01-02 00:00:00
Maximum: 2017-01-02 00:00:00
Histogram with fixed size bins (bins=1)

환경플랫폼 하위 도메인명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

물환경: 100

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물환경
2nd row: 물환경
3rd row: 물환경
4th row: 물환경
5th row: 물환경

Common Values

Value | Count | Frequency (%)
물환경 | 100 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
물환경 | 100 | 100.0%

도메인 하위 카테고리명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

하천: 100

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 하천
2nd row: 하천
3rd row: 하천
4th row: 하천
5th row: 하천

Common Values

Value | Count | Frequency (%)
하천 | 100 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
하천 | 100 | 100.0%

SNS 채널명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

All: 100

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: All
2nd row: All
3rd row: All
4th row: All
5th row: All

Common Values

Value | Count | Frequency (%)
All | 100 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
all | 100 | 100.0%

단어속성명
Categorical

Distinct: 11
Distinct (%): 11.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

라이프: 28
속성: 28
장소: 12
인물: 11
시간: 6
Other values (6): 15

Length

Max length: 6
Median length: 2
Mean length: 2.43
Min length: 2

Unique

Unique: 2
Unique (%): 2.0%

Sample

1st row: 시간
2nd row: 장소
3rd row: 사회이슈
4th row: 단체
5th row: 라이프

Common Values

Value | Count | Frequency (%)
라이프 | 28 | 28.0%
속성 | 28 | 28.0%
장소 | 12 | 12.0%
인물 | 11 | 11.0%
시간 | 6 | 6.0%
단체 | 4 | 4.0%
기타 | 4 | 4.0%
사회이슈 | 3 | 3.0%
엔터테인먼트 | 2 | 2.0%
상품 | 1 | 1.0%
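
The table lists ten of the eleven distinct categories. The sketch below shows how the listed counts relate to the overview figures for this variable (the "Other values (6)" bucket of 15 and the two unique values). The counts are copied from the table; the variable names are illustrative.

```python
# Counts copied from the Common Values table (ten of the eleven categories).
counts = {
    "라이프": 28, "속성": 28, "장소": 12, "인물": 11, "시간": 6,
    "단체": 4, "기타": 4, "사회이슈": 3, "엔터테인먼트": 2, "상품": 1,
}
n_observations = 100
n_distinct = 11

# Rows and categories that the ten-row table does not show.
listed_total = sum(counts.values())
unlisted_rows = n_observations - listed_total
unlisted_categories = n_distinct - len(counts)

# The overview shows the five largest categories and pools the rest.
ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
top5_total = sum(count for _, count in ranked[:5])
other_total = n_observations - top5_total
other_categories = n_distinct - 5

# Categories that occur exactly once: listed singletons plus the unlisted
# category, which must hold the single remaining row.
unique_count = sum(1 for c in counts.values() if c == 1)
if unlisted_categories == 1 and unlisted_rows == 1:
    unique_count += 1

print(f"listed rows: {listed_total}, unlisted rows: {unlisted_rows}")
print(f"other values ({other_categories}): {other_total}")
print(f"unique categories: {unique_count}")
```

The listed counts sum to 99, so the eleventh category accounts for the one remaining row.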

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
라이프 | 28 | 28.0%
속성 | 28 | 28.0%
장소 | 12 | 12.0%
인물 | 11 | 11.0%
시간 | 6 | 6.0%
단체 | 4 | 4.0%
기타 | 4 | 4.0%
사회이슈 | 3 | 3.0%
엔터테인먼트 | 2 | 2.0%
상품 | 1 | 1.0%

연관어명
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 5
Median length: 2
Mean length: 2.52
Min length: 1

Characters and Unicode

Total characters: 252
Distinct characters: 149
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
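
As a small illustration of that idea, the sketch below counts characters by Unicode general category using Python's standard `unicodedata` module. It covers only the six 연관어명 values visible in the first Sample rows of this report, so its counts are for that excerpt and not for the full column. Script and block properties are not exposed by `unicodedata` and are left out.

```python
import unicodedata
from collections import Counter

# 연관어명 values from the first six rows of the report's Sample table.
words = ["새해", "서울", "탈당", "새누리", "특검", "2017년"]

# Readable names for the general-category codes that appear in the report.
category_names = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Ll": "Lowercase Letter",
}

characters = [ch for word in words for ch in word]
category_counts = Counter(unicodedata.category(ch) for ch in characters)

total_characters = len(characters)
distinct_characters = len(set(characters))

for code, count in category_counts.most_common():
    share = 100.0 * count / total_characters
    print(f"{category_names.get(code, code)}: {count} ({share:.1f}%)")
print(f"total={total_characters} distinct={distinct_characters}")
```

Hangul syllables fall under "Other Letter" and ASCII digits under "Decimal Number", which is why those categories dominate the tables that follow.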

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: 새해
2nd row: 서울
3rd row: 탈당
4th row: 새누리
5th row: 특검

Value | Count | Frequency (%)
새해 | 1 | 1.0%
날씨 | 1 | 1.0%
트위터 | 1 | 1.0%
이동관 | 1 | 1.0%
발표 | 1 | 1.0%
지시 | 1 | 1.0%
청와대 | 1 | 1.0%
맥락 | 1 | 1.0%
침묵 | 1 | 1.0%
대비 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Most occurring characters

ValueCountFrequency (%)
7
 
2.8%
6
 
2.4%
5
 
2.0%
5
 
2.0%
5
 
2.0%
4
 
1.6%
4
 
1.6%
4
 
1.6%
4
 
1.6%
4
 
1.6%
Other values (139) 204
81.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 244 | 96.8%
Decimal Number | 5 | 2.0%
Lowercase Letter | 3 | 1.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
7
 
2.9%
6
 
2.5%
5
 
2.0%
5
 
2.0%
5
 
2.0%
4
 
1.6%
4
 
1.6%
4
 
1.6%
4
 
1.6%
4
 
1.6%
Other values (132) 196
80.3%
Decimal Number
Value | Count | Frequency (%)
4 | 1 | 20.0%
2 | 1 | 20.0%
0 | 1 | 20.0%
1 | 1 | 20.0%
7 | 1 | 20.0%
Lowercase Letter
Value | Count | Frequency (%)
s | 2 | 66.7%
n | 1 | 33.3%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 244 | 96.8%
Common | 5 | 2.0%
Latin | 3 | 1.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
7
 
2.9%
6
 
2.5%
5
 
2.0%
5
 
2.0%
5
 
2.0%
4
 
1.6%
4
 
1.6%
4
 
1.6%
4
 
1.6%
4
 
1.6%
Other values (132) 196
80.3%
Common
Value | Count | Frequency (%)
4 | 1 | 20.0%
2 | 1 | 20.0%
0 | 1 | 20.0%
1 | 1 | 20.0%
7 | 1 | 20.0%
Latin
Value | Count | Frequency (%)
s | 2 | 66.7%
n | 1 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 244 | 96.8%
ASCII | 8 | 3.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
7
 
2.9%
6
 
2.5%
5
 
2.0%
5
 
2.0%
5
 
2.0%
4
 
1.6%
4
 
1.6%
4
 
1.6%
4
 
1.6%
4
 
1.6%
Other values (132) 196
80.3%
ASCII
Value | Count | Frequency (%)
s | 2 | 25.0%
4 | 1 | 12.5%
2 | 1 | 12.5%
0 | 1 | 12.5%
n | 1 | 12.5%
1 | 1 | 12.5%
7 | 1 | 12.5%

일간연관어언급량
Real number (ℝ)

HIGH CORRELATION 

Distinct: 55
Distinct (%): 55.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 75.84
Minimum: 43
Maximum: 240
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 43
5-th percentile: 44
Q1: 50
median: 63
Q3: 78.25
95-th percentile: 164.15
Maximum: 240
Range: 197
Interquartile range (IQR): 28.25

Descriptive statistics

Standard deviation: 40.448426
Coefficient of variation (CV): 0.53333895
Kurtosis: 4.4645281
Mean: 75.84
Median Absolute Deviation (MAD): 13
Skewness: 2.137975
Sum: 7584
Variance: 1636.0752
Monotonicity: Decreasing
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
44 | 7 | 7.0%
50 | 6 | 6.0%
74 | 5 | 5.0%
65 | 4 | 4.0%
46 | 4 | 4.0%
66 | 3 | 3.0%
59 | 3 | 3.0%
53 | 3 | 3.0%
54 | 3 | 3.0%
67 | 3 | 3.0%
Other values (45) | 59 | 59.0%

Value | Count | Frequency (%)
43 | 3 | 3.0%
44 | 7 | 7.0%
45 | 3 | 3.0%
46 | 4 | 4.0%
47 | 2 | 2.0%
48 | 1 | 1.0%
49 | 1 | 1.0%
50 | 6 | 6.0%
51 | 2 | 2.0%
52 | 1 | 1.0%

Value | Count | Frequency (%)
240 | 1 | 1.0%
217 | 1 | 1.0%
205 | 1 | 1.0%
187 | 1 | 1.0%
167 | 1 | 1.0%
164 | 1 | 1.0%
160 | 1 | 1.0%
159 | 1 | 1.0%
147 | 1 | 1.0%
144 | 1 | 1.0%

Interactions

[Figures: 4 interaction plots, images not preserved]

Correlations

Variable | 일간연관어연번 | 단어속성명 | 연관어명 | 일간연관어언급량
일간연관어연번 | 1.000 | 0.000 | 1.000 | 0.895
단어속성명 | 0.000 | 1.000 | 1.000 | 0.351
연관어명 | 1.000 | 1.000 | 1.000 | 1.000
일간연관어언급량 | 0.895 | 0.351 | 1.000 | 1.000

Variable | 일간연관어연번 | 일간연관어언급량 | 단어속성명
일간연관어연번 | 1.000 | -0.999 | 0.000
일간연관어언급량 | -0.999 | 1.000 | 0.153
단어속성명 | 0.000 | 0.153 | 1.000
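
The report does not state which coefficient each matrix uses. A value of -0.999 between a strictly increasing serial number and a decreasing count with ties is what a rank correlation would produce, so the sketch below assumes Spearman's coefficient for illustration. It runs on the first and last ten rows of the Sample table only, so it will not reproduce the full-data figure; the helper names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# First ten sample rows: no ties, strictly decreasing counts.
head_serial = list(range(1, 11))
head_mentions = [240, 217, 205, 187, 167, 164, 160, 159, 147, 144]

# Last ten sample rows: the counts contain ties (seven 44s, three 43s).
tail_serial = list(range(91, 101))
tail_mentions = [44, 44, 44, 44, 44, 44, 44, 43, 43, 43]

rho_head = spearman(head_serial, head_mentions)
rho_tail = spearman(tail_serial, tail_mentions)
print(f"first ten rows: {rho_head:.3f}")
print(f"last ten rows:  {rho_tail:.3f}")
```

Without ties the coefficient reaches -1, and ties pull it toward zero, which is consistent with a full-data value slightly above -1.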

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

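index | 일간연관어연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 단어속성명 | 연관어명 | 일간연관어언급량
0 | 1 | 2017-01-02 | 물환경 | 하천 | All | 시간 | 새해 | 240
1 | 2 | 2017-01-02 | 물환경 | 하천 | All | 장소 | 서울 | 217
2 | 3 | 2017-01-02 | 물환경 | 하천 | All | 사회이슈 | 탈당 | 205
3 | 4 | 2017-01-02 | 물환경 | 하천 | All | 단체 | 새누리 | 187
4 | 5 | 2017-01-02 | 물환경 | 하천 | All | 라이프 | 특검 | 167
5 | 6 | 2017-01-02 | 물환경 | 하천 | All | 시간 | 2017년 | 164
6 | 7 | 2017-01-02 | 물환경 | 하천 | All | 장소 | 한국 | 160
7 | 8 | 2017-01-02 | 물환경 | 하천 | All | 라이프 | 문화 | 159
8 | 9 | 2017-01-02 | 물환경 | 하천 | All | 라이프 | 국민 | 147
9 | 10 | 2017-01-02 | 물환경 | 하천 | All | 인물 | 이명박 | 144

index | 일간연관어연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 단어속성명 | 연관어명 | 일간연관어언급량
90 | 91 | 2017-01-02 | 물환경 | 하천 | All | 장소 | 지방 | 44
91 | 92 | 2017-01-02 | 물환경 | 하천 | All | 라이프 | 정책 | 44
92 | 93 | 2017-01-02 | 물환경 | 하천 | All | 라이프 | 안전 | 44
93 | 94 | 2017-01-02 | 물환경 | 하천 | All | 속성 | 수도 | 44
94 | 95 | 2017-01-02 | 물환경 | 하천 | All | 속성 | 설치 | 44
95 | 96 | 2017-01-02 | 물환경 | 하천 | All | 장소 | 대한민국 | 44
96 | 97 | 2017-01-02 | 물환경 | 하천 | All | 라이프 | 건설 | 44
97 | 98 | 2017-01-02 | 물환경 | 하천 | All | 속성 | 중심 | 43
98 | 99 | 2017-01-02 | 물환경 | 하천 | All | 인물 | 엄마 | 43
99 | 100 | 2017-01-02 | 물환경 | 하천 | All | 속성 | 녹조 | 43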