Overview

Dataset statistics

Number of variables: 9
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.5 KiB
Average record size in memory: 76.3 B
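
The two memory figures agree once rounding is taken into account. The following is a small arithmetic check, assuming the 76.3 B average is itself a rounded value and that 1 KiB is 1024 bytes:

```python
# Values reported in the dataset statistics above.
n_observations = 100
avg_record_bytes = 76.3  # rounded to one decimal place in the report

# Total size implied by the average record size.
total_bytes = avg_record_bytes * n_observations
total_kib = total_bytes / 1024  # 1 KiB = 1024 bytes

print(f"{total_bytes:.0f} B is about {total_kib:.2f} KiB")
```

Rounded to one decimal place, the implied total matches the reported 7.5 KiB.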

Variable types

Numeric: 3
DateTime: 1
Categorical: 4
Text: 1

Dataset

Description: Sample data
Author: Sungkyunkwan University Industry-Academic Cooperation Foundation (성균관대학교 산학협력단)
URL: https://www.bigdata-environment.kr/user/data_market/detail.do?id=369aed70-e842-11ea-a837-83d4a69b8aa7

Alerts

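연월일 (date) has constant value "2021-04-01" [Constant]
환경플랫폼 하위 도메인명 (environment platform subdomain name) has constant value "물환경" [Constant]
SNS 채널명 (SNS channel name) has constant value "news" [Constant]
일간지역언급량연번 (daily regional mention serial number) is highly overall correlated with 도메인 하위 카테고리명 (domain subcategory name) [High correlation]
도메인 하위 카테고리명 is highly overall correlated with 일간지역언급량연번 [High correlation]
일간지역언급량연번 has unique values [Unique]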
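
The Constant and Unique alerts can be reproduced by counting distinct values per column. The sketch below is not the profiling tool's own implementation; `flag_columns` is a hypothetical helper, and the four input rows are rows 0, 7, 90, and 96 from the Sample section of this report.

```python
from typing import Dict, List

COLUMNS = [
    "일간지역언급량연번", "연월일", "환경플랫폼 하위 도메인명", "도메인 하위 카테고리명",
    "SNS 채널명", "시도명", "시군구명", "일간시도언급량", "일간시도단어량",
]

def flag_columns(rows: List[Dict[str, object]]) -> Dict[str, List[str]]:
    """Hypothetical helper: list constant columns and all-distinct columns."""
    constant, unique = [], []
    for col in rows[0]:
        distinct = {row[col] for row in rows}
        if len(distinct) == 1:
            constant.append(col)      # one distinct value: Constant
        elif len(distinct) == len(rows):
            unique.append(col)        # every value differs: Unique
    return {"constant": constant, "unique": unique}

# Rows 0, 7, 90 and 96 from the Sample section of this report.
raw = [
    (1, "2021-04-01", "물환경", "물재난", "news", "경기", "광주시", 1, 2),
    (8, "2021-04-01", "물환경", "물재난", "news", "경남", "산청군", 1, 1),
    (91, "2021-04-01", "물환경", "하천", "news", "경북", "예천군", 1, 1),
    (97, "2021-04-01", "물환경", "하천", "news", "부산", "기장군", 3, 11),
]
rows = [dict(zip(COLUMNS, r)) for r in raw]
flags = flag_columns(rows)
print(flags["constant"])
```

On only four rows, some columns (for example 시도명) also look unique; over the full 100 rows only 일간지역언급량연번 is.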

Reproduction

Analysis started: 2023-12-10 12:43:43.163613
Analysis finished: 2023-12-10 12:43:45.094357
Duration: 1.93 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

일간지역언급량연번 (daily regional mention serial number)
Real number (ℝ)

Alerts: HIGH CORRELATION, UNIQUE

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.5
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 5.95
Q1: 25.75
Median: 50.5
Q3: 75.25
95th percentile: 95.05
Maximum: 100
Range: 99
Interquartile range (IQR): 49.5
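
The fractional percentiles (5.95, 25.75, and so on) are what linear interpolation between neighbouring ranks gives for the values 1 to 100. The sketch below assumes that interpolation rule; `quantile` is a hypothetical helper, not part of the profiling tool.

```python
import math
from typing import Sequence

def quantile(sorted_values: Sequence[float], q: float) -> float:
    """Quantile with linear interpolation between the two nearest ranks."""
    pos = (len(sorted_values) - 1) * q   # fractional zero-based rank
    lower = math.floor(pos)
    upper = math.ceil(pos)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])

values = list(range(1, 101))  # the serial numbers 1..100, already sorted
q05, q1, q3, q95 = (quantile(values, q) for q in (0.05, 0.25, 0.75, 0.95))
iqr = q3 - q1
print(q05, q1, q3, q95, iqr)
```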

Descriptive statistics

Standard deviation: 29.011492
Coefficient of variation (CV): 0.57448499
Kurtosis: -1.2
Mean: 50.5
Median Absolute Deviation (MAD): 25
Skewness: 0
Sum: 5050
Variance: 841.66667
Monotonicity: Strictly increasing
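
Because this column is simply the integers 1 to 100, most of these figures can be checked with the Python standard library. The sketch assumes the report uses the sample (n - 1) variance, which is what the reported values match.

```python
import statistics

values = list(range(1, 101))  # serial numbers 1..100

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, divides by n - 1
std = statistics.stdev(values)          # square root of the sample variance
cv = std / mean                         # coefficient of variation
median = statistics.median(values)
# Median absolute deviation: median distance from the median.
mad = statistics.median([abs(v - median) for v in values])
# Strictly increasing: each value is larger than the one before it.
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(mean, round(variance, 5), round(std, 6), round(cv, 8), mad)
```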
Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
1                  1      1.0%
65                 1      1.0%
75                 1      1.0%
74                 1      1.0%
73                 1      1.0%
72                 1      1.0%
71                 1      1.0%
70                 1      1.0%
69                 1      1.0%
68                 1      1.0%
Other values (90)  90     90.0%

Minimum values

Value  Count  Frequency (%)
1      1      1.0%
2      1      1.0%
3      1      1.0%
4      1      1.0%
5      1      1.0%
6      1      1.0%
7      1      1.0%
8      1      1.0%
9      1      1.0%
10     1      1.0%

Maximum values

Value  Count  Frequency (%)
100    1      1.0%
99     1      1.0%
98     1      1.0%
97     1      1.0%
96     1      1.0%
95     1      1.0%
94     1      1.0%
93     1      1.0%
92     1      1.0%
91     1      1.0%

연월일 (date)
Date

Alerts: CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2021-04-01 00:00:00
Maximum: 2021-04-01 00:00:00
Histogram with fixed size bins (bins=1)
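
환경플랫폼 하위 도메인명 (environment platform subdomain name)
Categorical

Alerts: CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

물환경 (water environment): 100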

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물환경
2nd row: 물환경
3rd row: 물환경
4th row: 물환경
5th row: 물환경

Common Values

Value   Count  Frequency (%)
물환경   100    100.0%

Length

Histogram of lengths of the category


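도메인 하위 카테고리명 (domain subcategory name)
Categorical

Alerts: HIGH CORRELATION

Distinct: 5
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

하천 (river): 43
하수도 (sewerage): 26
물재난 (water disaster): 16
상수도 (water supply): 10
지하수 (groundwater): 5

Length

Max length: 3
Median length: 3
Mean length: 2.57
Min length: 2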
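
The mean length of 2.57 follows from the category counts reported for this variable: one two-character value (하천, 43 rows) and four three-character values (57 rows). A minimal check, assuming length is counted in Unicode code points, as Python's `len` does:

```python
import statistics

# Category counts as reported for this variable.
counts = {"하천": 43, "하수도": 26, "물재난": 16, "상수도": 10, "지하수": 5}

# Expand the counts into one string length per row.
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

mean_length = sum(lengths) / len(lengths)
median_length = statistics.median(lengths)
print(mean_length, median_length, min(lengths), max(lengths))
```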

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물재난
2nd row: 물재난
3rd row: 물재난
4th row: 물재난
5th row: 물재난

Common Values

Value   Count  Frequency (%)
하천     43     43.0%
하수도   26     26.0%
물재난   16     16.0%
상수도   10     10.0%
지하수   5      5.0%

Length

Histogram of lengths of the category


SNS 채널명 (SNS channel name)
Categorical

Alerts: CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

news: 100

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: news
2nd row: news
3rd row: news
4th row: news
5th row: news

Common Values

Value  Count  Frequency (%)
news   100    100.0%

Length

Histogram of lengths of the category


시도명 (province or metropolitan city name)
Categorical

Distinct: 15
Distinct (%): 15.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

경기: 32
경남: 19
경북: 8
강원: 6
대구: 6
Other values (10): 29

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 2
Unique (%): 2.0%

Sample

1st row: 경기
2nd row: 경기
3rd row: 경기
4th row: 경기
5th row: 경기

Common Values

Value             Count  Frequency (%)
경기               32     32.0%
경남               19     19.0%
경북               8      8.0%
강원               6      6.0%
대구               6      6.0%
전북               5      5.0%
부산               5      5.0%
충남               5      5.0%
서울               3      3.0%
광주               3      3.0%
Other values (5)  8      8.0%

Length

Histogram of lengths of the category

시군구명 (city, county, or district name)
Text

Distinct: 57
Distinct (%): 57.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 3
Mean length: 2.88
Min length: 2

Characters and Unicode

Total characters: 288
Distinct characters: 62
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
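
As an illustration of such character properties, the sketch below counts characters in the five sample values listed for this variable and looks up each character's general category with Python's `unicodedata` module. That module has no script property, so the first word of the character name is used here as a rough stand-in; this is an assumption of the sketch, not how the report derives scripts.

```python
import unicodedata
from collections import Counter

# The five sample values shown for this variable.
sample = ["광주시", "동두천시", "양주시", "연천군", "용인시"]

# Character frequencies across all sample values.
chars = Counter("".join(sample))

# General category of each distinct character ("Lo" means Other Letter).
categories = {unicodedata.category(ch) for ch in chars}

# Rough script guess: first word of the Unicode character name.
scripts = {unicodedata.name(ch).split()[0] for ch in chars}

print(chars.most_common(3), categories, scripts)
```

Every character in the sample falls in the single category Other Letter, in line with the one category and one script reported for the full column.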

Unique

Unique: 31
Unique (%): 31.0%

Sample

1st row: 광주시
2nd row: 동두천시
3rd row: 양주시
4th row: 연천군
5th row: 용인시

Value              Count  Frequency (%)
서구                9      9.0%
남구                5      5.0%
고성군              4      4.0%
용인시              4      4.0%
김해시              3      3.0%
광주시              3      3.0%
창원시              3      3.0%
양주시              2      2.0%
청양군              2      2.0%
오산시              2      2.0%
Other values (47)  63     63.0%

Most occurring characters

Value              Count  Frequency (%)
[?]                47     16.3%
[?]                28     9.7%
[?]                27     9.4%
[?]                13     4.5%
[?]                12     4.2%
[?]                10     3.5%
[?]                9      3.1%
[?]                9      3.1%
[?]                8      2.8%
[?]                7      2.4%
Other values (52)  118    41.0%

Most occurring categories

ValueCountFrequency (%)
Other Letter 288
100.0%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
[?]                47     16.3%
[?]                28     9.7%
[?]                27     9.4%
[?]                13     4.5%
[?]                12     4.2%
[?]                10     3.5%
[?]                9      3.1%
[?]                9      3.1%
[?]                8      2.8%
[?]                7      2.4%
Other values (52)  118    41.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 288
100.0%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
[?]                47     16.3%
[?]                28     9.7%
[?]                27     9.4%
[?]                13     4.5%
[?]                12     4.2%
[?]                10     3.5%
[?]                9      3.1%
[?]                9      3.1%
[?]                8      2.8%
[?]                7      2.4%
Other values (52)  118    41.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 288
100.0%

Most frequent character per block

Hangul
Value              Count  Frequency (%)
[?]                47     16.3%
[?]                28     9.7%
[?]                27     9.4%
[?]                13     4.5%
[?]                12     4.2%
[?]                10     3.5%
[?]                9      3.1%
[?]                9      3.1%
[?]                8      2.8%
[?]                7      2.4%
Other values (52)  118    41.0%

일간시도언급량 (daily province-level mention count)
Real number (ℝ)

Distinct: 6
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.38
Minimum: 1
Maximum: 7
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 1
Q3: 1
95th percentile: 3
Maximum: 7
Range: 6
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.1171754
Coefficient of variation (CV): 0.80954737
Kurtosis: 15.185853
Mean: 1.38
Median Absolute Deviation (MAD): 0
Skewness: 3.8118289
Sum: 138
Variance: 1.2480808
Monotonicity: Not monotonic
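
The sum, mean, variance, standard deviation, and CV above can be rebuilt from the value counts reported for this variable (1 occurs 83 times, 2 nine times, 3 four times, 5 and 6 once each, 7 twice). The sketch assumes the sample (n - 1) variance, which is what the reported figures match.

```python
import math

# Value -> count, taken from the frequency table for this variable.
freq = {1: 83, 2: 9, 3: 4, 5: 1, 6: 1, 7: 2}

n = sum(freq.values())                        # number of observations
total = sum(v * c for v, c in freq.items())   # sum of all values
mean = total / n

# Sum of squared deviations from the mean, weighted by count.
ss = sum(c * (v - mean) ** 2 for v, c in freq.items())
variance = ss / (n - 1)                       # sample variance
std = math.sqrt(variance)
cv = std / mean                               # coefficient of variation

print(n, total, mean, round(variance, 7), round(std, 7))
```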
Histogram with fixed size bins (bins=6)

Common values

Value  Count  Frequency (%)
1      83     83.0%
2      9      9.0%
3      4      4.0%
7      2      2.0%
6      1      1.0%
5      1      1.0%

Minimum values

Value  Count  Frequency (%)
1      83     83.0%
2      9      9.0%
3      4      4.0%
5      1      1.0%
6      1      1.0%
7      2      2.0%

Maximum values

Value  Count  Frequency (%)
7      2      2.0%
6      1      1.0%
5      1      1.0%
3      4      4.0%
2      9      9.0%
1      83     83.0%

일간시도단어량 (daily province-level word count)
Real number (ℝ)

Distinct: 13
Distinct (%): 13.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.39
Minimum: 1
Maximum: 14
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 2
Q3: 4
95th percentile: 10.05
Maximum: 14
Range: 13
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.9979623
Coefficient of variation (CV): 0.88435465
Kurtosis: 2.9854071
Mean: 3.39
Median Absolute Deviation (MAD): 1
Skewness: 1.8062901
Sum: 339
Variance: 8.9877778
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)

Common values

Value             Count  Frequency (%)
1                 30     30.0%
2                 23     23.0%
3                 16     16.0%
4                 8      8.0%
6                 6      6.0%
5                 5      5.0%
8                 4      4.0%
10                2      2.0%
13                2      2.0%
14                1      1.0%
Other values (3)  3      3.0%

Minimum values

Value  Count  Frequency (%)
1      30     30.0%
2      23     23.0%
3      16     16.0%
4      8      8.0%
5      5      5.0%
6      6      6.0%
7      1      1.0%
8      4      4.0%
10     2      2.0%
11     1      1.0%

Maximum values

Value  Count  Frequency (%)
14     1      1.0%
13     2      2.0%
12     1      1.0%
11     1      1.0%
10     2      2.0%
8      4      4.0%
7      1      1.0%
6      6      6.0%
5      5      5.0%
4      8      8.0%

Correlations

Variable                일간지역언급량연번  도메인 하위 카테고리명  시도명  시군구명  일간시도언급량  일간시도단어량
일간지역언급량연번        1.000  0.974  0.738  0.414  0.460  0.350
도메인 하위 카테고리명    0.974  1.000  0.417  0.000  0.000  0.000
시도명                  0.738  0.417  1.000  0.911  0.000  0.000
시군구명                0.414  0.000  0.911  1.000  0.000  0.891
일간시도언급량           0.460  0.000  0.000  0.000  1.000  0.763
일간시도단어량           0.350  0.000  0.000  0.891  0.763  1.000

Variable                도메인 하위 카테고리명  시도명
도메인 하위 카테고리명    1.000  0.176
시도명                  0.176  1.000

Variable                일간지역언급량연번  일간시도언급량  일간시도단어량  도메인 하위 카테고리명  시도명
일간지역언급량연번        1.000  0.227  0.043  0.754  0.369
일간시도언급량           0.227  1.000  0.392  0.000  0.000
일간시도단어량           0.043  0.392  1.000  0.000  0.000
도메인 하위 카테고리명    0.754  0.000  0.000  1.000  0.176
시도명                  0.369  0.000  0.000  0.176  1.000
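
The report does not name the measure behind each matrix. For a pair of categorical columns such as 도메인 하위 카테고리명 and 시도명, one common association measure is Cramér's V, which is derived from the chi-squared statistic. The sketch below is an uncorrected version with no bias correction; `cramers_v` and the toy inputs are hypothetical and are not taken from the dataset.

```python
import math
from collections import Counter
from typing import Sequence

def cramers_v(x: Sequence[str], y: Sequence[str]) -> float:
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(x)
    joint = Counter(zip(x, y))   # observed counts per (x, y) pair
    row_totals = Counter(x)
    col_totals = Counter(y)
    chi2 = 0.0
    for r, r_n in row_totals.items():
        for c, c_n in col_totals.items():
            expected = r_n * c_n / n          # count expected under independence
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Toy data (hypothetical): perfectly associated, then independent.
v_same = cramers_v(["a", "a", "b", "b"], ["u", "u", "v", "v"])
v_indep = cramers_v(["a", "a", "b", "b"], ["u", "v", "u", "v"])
print(v_same, v_indep)
```

A value near 1 indicates that one column almost determines the other, and a value near 0 indicates little association.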

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Index  일간지역언급량연번  연월일  환경플랫폼 하위 도메인명  도메인 하위 카테고리명  SNS 채널명  시도명  시군구명  일간시도언급량  일간시도단어량
0      1    2021-04-01  물환경  물재난  news  경기  광주시    1  2
1      2    2021-04-01  물환경  물재난  news  경기  동두천시  1  2
2      3    2021-04-01  물환경  물재난  news  경기  양주시    1  3
3      4    2021-04-01  물환경  물재난  news  경기  연천군    1  2
4      5    2021-04-01  물환경  물재난  news  경기  용인시    2  4
5      6    2021-04-01  물환경  물재난  news  경기  의정부시  1  1
6      7    2021-04-01  물환경  물재난  news  경기  포천시    1  2
7      8    2021-04-01  물환경  물재난  news  경남  산청군    1  1
8      9    2021-04-01  물환경  물재난  news  경남  진주시    1  10
9      10   2021-04-01  물환경  물재난  news  경남  하동군    1  4

Last rows

Index  일간지역언급량연번  연월일  환경플랫폼 하위 도메인명  도메인 하위 카테고리명  SNS 채널명  시도명  시군구명  일간시도언급량  일간시도단어량
90     91   2021-04-01  물환경  하천  news  경북  예천군  1  1
91     92   2021-04-01  물환경  하천  news  광주  서구    1  2
92     93   2021-04-01  물환경  하천  news  대구  달서구  1  1
93     94   2021-04-01  물환경  하천  news  대구  달성군  1  1
94     95   2021-04-01  물환경  하천  news  대구  서구    1  2
95     96   2021-04-01  물환경  하천  news  대전  서구    1  2
96     97   2021-04-01  물환경  하천  news  부산  기장군  3  11
97     98   2021-04-01  물환경  하천  news  부산  서구    1  2
98     99   2021-04-01  물환경  하천  news  서울  노원구  1  2
99     100  2021-04-01  물환경  하천  news  서울  도봉구  1  2