Overview

Dataset statistics

Number of variables: 9
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.5 KiB
Average record size in memory: 76.3 B

Variable types

Numeric: 3
Categorical: 5
Text: 1

Dataset

Description: Sample data
Author: Sungkyunkwan University Industry-Academic Cooperation Foundation (성균관대학교 산학협력단)
URL: https://www.bigdata-environment.kr/user/data_market/detail.do?id=369aed70-e842-11ea-a837-83d4a69b8aa7

Alerts

연월일 has constant value "2020-01-01" (Constant)
환경플랫폼 하위 도메인명 has constant value "물환경" (Constant)
SNS 채널명 has constant value "news" (Constant)
일간지역언급량연번 is highly overall correlated with 도메인 하위 카테고리명 and 1 other field (High correlation)
일간시도언급량 is highly overall correlated with 도메인 하위 카테고리명 and 1 other field (High correlation)
도메인 하위 카테고리명 is highly overall correlated with 일간지역언급량연번 and 1 other field (High correlation)
시도명 is highly overall correlated with 일간지역언급량연번 and 1 other field (High correlation)
일간지역언급량연번 has unique values (Unique)
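
The Constant and Unique alerts above can be read straight off the distinct counts reported for each variable. The sketch below reproduces them from those counts. The `classify` helper and its two rules are illustrative assumptions rather than ydata-profiling's actual implementation, and the High correlation alerts are not covered.

```python
from typing import Optional

# Distinct counts per column, copied from the Variables section of this report.
N_ROWS = 100
distinct_counts = {
    "일간지역언급량연번": 100,
    "연월일": 1,
    "환경플랫폼 하위 도메인명": 1,
    "도메인 하위 카테고리명": 2,
    "SNS 채널명": 1,
    "시도명": 15,
    "시군구명": 61,
    "일간시도언급량": 6,
    "일간시도단어량": 17,
}


def classify(n_distinct: int, n_rows: int) -> Optional[str]:
    """Hypothetical rule: one distinct value is Constant, all distinct is Unique."""
    if n_distinct == 1:
        return "Constant"
    if n_distinct == n_rows:
        return "Unique"
    return None


# Keep only the columns that trigger one of the two alerts.
alerts = {}
for name, n_distinct in distinct_counts.items():
    label = classify(n_distinct, N_ROWS)
    if label is not None:
        alerts[name] = label

for name, label in alerts.items():
    print(f"{name}: {label}")
```

Under these assumed rules the result matches the four Constant and Unique alerts listed in this section.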

Reproduction

Analysis started: 2023-12-10 12:43:57.168412
Analysis finished: 2023-12-10 12:43:58.403813
Duration: 1.24 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
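
As a quick check, the reported duration can be recomputed from the two timestamps. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-10 12:43:57.168412")
finished = datetime.fromisoformat("2023-12-10 12:43:58.403813")

# Elapsed wall-clock time in seconds, rounded to two decimals as in the report.
duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```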

Variables

일간지역언급량연번 (daily regional mention serial number)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.5
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 5.95
Q1: 25.75
Median: 50.5
Q3: 75.25
95th percentile: 95.05
Maximum: 100
Range: 99
Interquartile range (IQR): 49.5

Descriptive statistics

Standard deviation: 29.011492
Coefficient of variation (CV): 0.57448499
Kurtosis: -1.2
Mean: 50.5
Median Absolute Deviation (MAD): 25
Skewness: 0
Sum: 5050
Variance: 841.66667
Monotonicity: Strictly increasing
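
Because this variable is simply the serial number 1 to 100, its quantile and descriptive statistics can be reproduced without the data file. The sketch below does so with the standard library. It assumes linear interpolation between order statistics for the quantiles, the sample (n - 1) variance, and a bias-corrected excess kurtosis; those conventions are assumptions, chosen because they reproduce the figures reported above.

```python
import math
import statistics

ids = list(range(1, 101))  # the serial number runs 1..100 with no gaps
n = len(ids)

mean = statistics.fmean(ids)
median = statistics.median(ids)
variance = statistics.variance(ids)  # sample variance, n - 1 denominator
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

# method="inclusive" interpolates linearly between order statistics.
q1, _, q3 = statistics.quantiles(ids, n=4, method="inclusive")
ventiles = statistics.quantiles(ids, n=20, method="inclusive")
p5, p95 = ventiles[0], ventiles[-1]
iqr = q3 - q1

# Median absolute deviation around the median.
mad = statistics.median([abs(x - median) for x in ids])

# Bias-corrected excess kurtosis built from the 2nd and 4th central moments.
m2 = sum((x - mean) ** 2 for x in ids) / n
m4 = sum((x - mean) ** 4 for x in ids) / n
g2 = m4 / m2 ** 2 - 3
kurtosis = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)

print(f"mean={mean} median={median} sum={sum(ids)}")
print(f"variance={variance:.5f} std={std:.6f} cv={cv:.8f}")
print(f"p5={p5:.2f} q1={q1} q3={q3} p95={p95:.2f} iqr={iqr}")
print(f"mad={mad} kurtosis={kurtosis:.4f}")
```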
Histogram with fixed size bins (bins=50)

Common Values

Value              Count  Frequency (%)
1                  1      1.0%
65                 1      1.0%
75                 1      1.0%
74                 1      1.0%
73                 1      1.0%
72                 1      1.0%
71                 1      1.0%
70                 1      1.0%
69                 1      1.0%
68                 1      1.0%
Other values (90)  90     90.0%

Minimum 10 values

Value  Count  Frequency (%)
1      1      1.0%
2      1      1.0%
3      1      1.0%
4      1      1.0%
5      1      1.0%
6      1      1.0%
7      1      1.0%
8      1      1.0%
9      1      1.0%
10     1      1.0%

Maximum 10 values

Value  Count  Frequency (%)
100    1      1.0%
99     1      1.0%
98     1      1.0%
97     1      1.0%
96     1      1.0%
95     1      1.0%
94     1      1.0%
93     1      1.0%
92     1      1.0%
91     1      1.0%

연월일 (date)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value       Count
2020-01-01  100

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020-01-01
2nd row: 2020-01-01
3rd row: 2020-01-01
4th row: 2020-01-01
5th row: 2020-01-01

Common Values

Value       Count  Frequency (%)
2020-01-01  100    100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]
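
환경플랫폼 하위 도메인명 (environment platform sub-domain name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value                       Count
물환경 (water environment)  100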

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물환경
2nd row: 물환경
3rd row: 물환경
4th row: 물환경
5th row: 물환경

Common Values

Value   Count  Frequency (%)
물환경  100    100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]

도메인 하위 카테고리명 (domain sub-category name)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value                    Count
상수도 (water supply)    59
물재난 (water disaster)  41

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물재난
2nd row: 물재난
3rd row: 물재난
4th row: 물재난
5th row: 물재난

Common Values

Value   Count  Frequency (%)
상수도  59     59.0%
물재난  41     41.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]

SNS 채널명 (SNS channel name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value  Count
news   100

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: news
2nd row: news
3rd row: news
4th row: news
5th row: news

Common Values

Value  Count  Frequency (%)
news   100    100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]

시도명 (province or metropolitan city name)
Categorical

HIGH CORRELATION

Distinct: 15
Distinct (%): 15.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value              Count
경기               22
부산               13
경북               11
대구               10
광주               8
Other values (10)  36

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 경기
2nd row: 경기
3rd row: 경기
4th row: 경기
5th row: 경기

Common Values

Value             Count  Frequency (%)
경기              22     22.0%
부산              13     13.0%
경북              11     11.0%
대구              10     10.0%
광주              8      8.0%
경남              6      6.0%
서울              6      6.0%
대전              5      5.0%
강원              5      5.0%
울산              3      3.0%
Other values (5)  11     11.0%

Length

[Figure: Histogram of lengths of the category]

시군구명 (city, county, or district name)
Text

Distinct: 61
Distinct (%): 61.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 3
Mean length: 2.63
Min length: 2

Characters and Unicode

Total characters: 263
Distinct characters: 72
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 52
Unique (%): 52.0%

Sample

1st row: 과천시
2nd row: 동두천시
3rd row: 양평군
4th row: 연천군
5th row: 포천시

Value              Count  Frequency (%)
동구               10     10.0%
북구               9      9.0%
서구               9      9.0%
남구               9      9.0%
중구               3      3.0%
양평군             2      2.0%
포천시             2      2.0%
화성시             2      2.0%
고성군             2      2.0%
이천시             1      1.0%
Other values (51)  51     51.0%

Most occurring characters

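Value              Count  Frequency (%)
(not shown)        56     21.3%
(not shown)        24     9.1%
(not shown)        21     8.0%
(not shown)        14     5.3%
(not shown)        10     3.8%
(not shown)        10     3.8%
(not shown)        10     3.8%
(not shown)        9      3.4%
(not shown)        9      3.4%
(not shown)        5      1.9%
Other values (62)  95     36.1%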

Most occurring categories

Value         Count  Frequency (%)
Other Letter  263    100.0%

Most frequent character per category

Other Letter: all 263 characters fall in this category, so the counts are identical to the Most occurring characters table.

Most occurring scripts

Value   Count  Frequency (%)
Hangul  263    100.0%

Most frequent character per script

Hangul: all 263 characters are in this script, so the counts are identical to the Most occurring characters table.

Most occurring blocks

Value   Count  Frequency (%)
Hangul  263    100.0%

Most frequent character per block

Hangul: all 263 characters are in this block, so the counts are identical to the Most occurring characters table.

일간시도언급량 (daily province-level mention count)
Real number (ℝ)

HIGH CORRELATION

Distinct: 6
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.68
Minimum: 1
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 1
Q3: 1
95th percentile: 5
Maximum: 8
Range: 7
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.496663
Coefficient of variation (CV): 0.89087081
Kurtosis: 3.4688069
Mean: 1.68
Median Absolute Deviation (MAD): 0
Skewness: 2.097283
Sum: 168
Variance: 2.24
Monotonicity: Not monotonic
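
With only six distinct values, every descriptive statistic for this variable can be recomputed from the value counts listed in this section. The sketch below does that with the standard library. The `central_moment` helper is illustrative, and the bias-corrected skewness and kurtosis formulas are an assumption about the estimator behind the report, chosen because they reproduce the reported figures.

```python
import math

# Value -> count, copied from the frequency table for 일간시도언급량.
freq = {1: 80, 2: 3, 3: 1, 4: 4, 5: 11, 8: 1}

n = sum(freq.values())
total = sum(value * count for value, count in freq.items())
mean = total / n


def central_moment(k: int) -> float:
    """k-th central moment with an n denominator, from the frequency table."""
    return sum(count * (value - mean) ** k for value, count in freq.items()) / n


m2, m3, m4 = central_moment(2), central_moment(3), central_moment(4)

variance = m2 * n / (n - 1)  # sample variance, n - 1 denominator
std = math.sqrt(variance)
cv = std / mean

# Bias-corrected sample skewness and excess kurtosis.
skewness = math.sqrt(n * (n - 1)) / (n - 2) * m3 / m2 ** 1.5
kurtosis = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * (m4 / m2 ** 2 - 3) + 6)

print(f"n={n} sum={total} mean={mean:.2f}")
print(f"variance={variance:.2f} std={std:.6f} cv={cv:.8f}")
print(f"skewness={skewness:.6f} kurtosis={kurtosis:.7f}")
```

The strong right skew comes from the 20 rows with more than one mention, against 80 rows with exactly one.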
Histogram with fixed size bins (bins=6)

Common Values

Value  Count  Frequency (%)
1      80     80.0%
5      11     11.0%
4      4      4.0%
2      3      3.0%
3      1      1.0%
8      1      1.0%

Minimum 10 values

Value  Count  Frequency (%)
1      80     80.0%
2      3      3.0%
3      1      1.0%
4      4      4.0%
5      11     11.0%
8      1      1.0%

Maximum 10 values

Value  Count  Frequency (%)
8      1      1.0%
5      11     11.0%
4      4      4.0%
3      1      1.0%
2      3      3.0%
1      80     80.0%

일간시도단어량 (daily province-level word count)
Real number (ℝ)

Distinct: 17
Distinct (%): 17.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.5
Minimum: 1
Maximum: 69
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 3
Median: 3
Q3: 9
95th percentile: 17
Maximum: 69
Range: 68
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 8.5993892
Coefficient of variation (CV): 1.322983
Kurtosis: 28.856724
Mean: 6.5
Median Absolute Deviation (MAD): 2
Skewness: 4.5924109
Sum: 650
Variance: 73.949495
Monotonicity: Not monotonic
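
Several of these figures are tied together by simple identities, which makes for a quick sanity check on the summary. The sketch below tests three of them against the reported values. The variable names and tolerances are illustrative, and the inputs are rounded as printed in the report.

```python
import math

# Figures copied from this section; n is the number of observations.
n = 100
total = 650
mean = 6.5
std = 8.5993892
variance = 73.949495
cv = 1.322983

# Each identity should hold up to the rounding of the printed values.
checks = {
    "mean == sum / n": math.isclose(total / n, mean),
    "variance == std ** 2": math.isclose(std ** 2, variance, rel_tol=1e-6),
    "cv == std / mean": math.isclose(std / mean, cv, rel_tol=1e-6),
}

for name, ok in checks.items():
    print(f"{name}: {'ok' if ok else 'MISMATCH'}")
```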
Histogram with fixed size bins (bins=17)

Common Values

Value             Count  Frequency (%)
3                 35     35.0%
1                 18     18.0%
6                 9      9.0%
9                 7      7.0%
5                 6      6.0%
13                6      6.0%
17                4      4.0%
4                 3      3.0%
2                 2      2.0%
10                2      2.0%
Other values (7)  8      8.0%

Minimum 10 values

Value  Count  Frequency (%)
1      18     18.0%
2      2      2.0%
3      35     35.0%
4      3      3.0%
5      6      6.0%
6      9      9.0%
8      1      1.0%
9      7      7.0%
10     2      2.0%
12     2      2.0%

Maximum 10 values

Value  Count  Frequency (%)
69     1      1.0%
33     1      1.0%
32     1      1.0%
18     1      1.0%
17     4      4.0%
14     1      1.0%
13     6      6.0%
12     2      2.0%
10     2      2.0%
9      7      7.0%

Interactions

[Figures: 9 interaction plots]

Correlations

[Figure: correlation heatmap]

Variable  일간지역언급량연번  도메인 하위 카테고리명  시도명  시군구명  일간시도언급량  일간시도단어량
일간지역언급량연번  1.000  0.999  0.894  0.499  0.430  0.000
도메인 하위 카테고리명  0.999  1.000  0.460  0.000  0.713  0.000
시도명  0.894  0.460  1.000  0.835  0.847  0.819
시군구명  0.499  0.000  0.835  1.000  0.981  0.966
일간시도언급량  0.430  0.713  0.847  0.981  1.000  0.680
일간시도단어량  0.000  0.000  0.819  0.966  0.680  1.000

[Figure: correlation heatmap]

Variable  도메인 하위 카테고리명  시도명
도메인 하위 카테고리명  1.000  0.390
시도명  0.390  1.000

[Figure: correlation heatmap]

Variable  일간지역언급량연번  일간시도언급량  일간시도단어량  도메인 하위 카테고리명  시도명
일간지역언급량연번  1.000  -0.431  0.204  0.938  0.584
일간시도언급량  -0.431  1.000  0.404  0.515  0.569
일간시도단어량  0.204  0.404  1.000  0.000  0.469
도메인 하위 카테고리명  0.938  0.515  0.000  1.000  0.390
시도명  0.584  0.569  0.469  0.390  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index  일간지역언급량연번  연월일  환경플랫폼 하위 도메인명  도메인 하위 카테고리명  SNS 채널명  시도명  시군구명  일간시도언급량  일간시도단어량
0  1  2020-01-01  물환경  물재난  news  경기  과천시  1  3
1  2  2020-01-01  물환경  물재난  news  경기  동두천시  1  1
2  3  2020-01-01  물환경  물재난  news  경기  양평군  1  1
3  4  2020-01-01  물환경  물재난  news  경기  연천군  1  1
4  5  2020-01-01  물환경  물재난  news  경기  포천시  1  1
5  6  2020-01-01  물환경  물재난  news  경기  화성시  1  3
6  7  2020-01-01  물환경  물재난  news  경남  사천시  1  2
7  8  2020-01-01  물환경  물재난  news  경북  경주시  1  1
8  9  2020-01-01  물환경  물재난  news  경북  남구  5  5
9  10  2020-01-01  물환경  물재난  news  경북  북구  1  1

Last rows

Index  일간지역언급량연번  연월일  환경플랫폼 하위 도메인명  도메인 하위 카테고리명  SNS 채널명  시도명  시군구명  일간시도언급량  일간시도단어량
90  91  2020-01-01  물환경  상수도  news  부산  사상구  1  3
91  92  2020-01-01  물환경  상수도  news  부산  서구  1  17
92  93  2020-01-01  물환경  상수도  news  부산  수영구  1  3
93  94  2020-01-01  물환경  상수도  news  부산  중구  1  3
94  95  2020-01-01  물환경  상수도  news  부산  해운대구  1  3
95  96  2020-01-01  물환경  상수도  news  서울  구로구  1  3
96  97  2020-01-01  물환경  상수도  news  서울  금천구  1  3
97  98  2020-01-01  물환경  상수도  news  서울  동대문구  1  3
98  99  2020-01-01  물환경  상수도  news  서울  성동구  1  3
99  100  2020-01-01  물환경  상수도  news  서울  영등포구  1  3