Overview

Dataset statistics

Number of variables: 8
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.6 KiB
Average record size in memory: 67.3 B

Variable types

Numeric: 2
DateTime: 1
Categorical: 5

Dataset

Description: Sample data
Author: 성균관대학교 산학협력단
URL: https://www.bigdata-environment.kr/user/data_market/detail.do?id=8a03ac90-2fc9-11ea-94b6-73a02796bba4

Alerts

연월일 has constant value "" (Constant)
SNS 채널명 has constant value "" (Constant)
도메인 하위 카테고리명 is highly overall correlated with 일간지역언급량연번 and 1 other field (High correlation)
환경플랫폼 하위 도메인명 is highly overall correlated with 일간지역언급량연번 and 2 other fields (High correlation)
일간지역언급량연번 is highly overall correlated with 환경플랫폼 하위 도메인명 and 1 other field (High correlation)
시군구명 is highly overall correlated with 환경플랫폼 하위 도메인명 (High correlation)
일간지역언급량연번 has unique values (Unique)
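
The Constant and Unique alerts follow from distinct-value counts. A minimal sketch of such a check, assuming a column counts as constant when it holds one distinct value and as unique when every value is distinct. The helper name `classify_column` is hypothetical, and this is not ydata-profiling's own implementation.

```python
def classify_column(values):
    """Return 'constant', 'unique', or 'other' based on distinct counts."""
    distinct = len(set(values))
    if distinct == 1:
        return "constant"
    if distinct == len(values):
        return "unique"
    return "other"


# Columns shaped like the ones in this report (100 observations each).
serial = list(range(1, 101))   # 일간지역언급량연번: 1..100
channel = ["All"] * 100        # SNS 채널명: a single value
domain = ["물환경"] * 55 + ["생활환경"] * 43 + ["자연환경"] * 2

labels = {
    "serial": classify_column(serial),
    "channel": classify_column(channel),
    "domain": classify_column(domain),
}
print(labels)
```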

Reproduction

Analysis started: 2023-12-10 13:45:44.402033
Analysis finished: 2023-12-10 13:45:46.201976
Duration: 1.8 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
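
The reported duration can be checked against the two timestamps. A small sketch using only the standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block above.
started = datetime.fromisoformat("2023-12-10 13:45:44.402033")
finished = datetime.fromisoformat("2023-12-10 13:45:46.201976")

duration_seconds = (finished - started).total_seconds()
print(round(duration_seconds, 1))  # rounds to the 1.8 shown in the report
```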

Variables

일간지역언급량연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.5
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 5.95
Q1: 25.75
Median: 50.5
Q3: 75.25
95th percentile: 95.05
Maximum: 100
Range: 99
Interquartile range (IQR): 49.5
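
The quantiles above are consistent with linear interpolation between order statistics over the values 1 to 100. A minimal sketch under that assumption; the `quantile` helper is hypothetical and written out by hand so the arithmetic is visible.

```python
def quantile(sorted_values, p):
    """Linearly interpolated quantile of an already sorted list, 0 <= p <= 1."""
    position = p * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


values = list(range(1, 101))  # the serial number column: 1..100

stats = {
    "p5": quantile(values, 0.05),
    "q1": quantile(values, 0.25),
    "median": quantile(values, 0.50),
    "q3": quantile(values, 0.75),
    "p95": quantile(values, 0.95),
}
iqr = stats["q3"] - stats["q1"]

for name, value in stats.items():
    print(name, round(value, 2))
print("iqr", round(iqr, 2))
```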

Descriptive statistics

Standard deviation: 29.011492
Coefficient of variation (CV): 0.57448499
Kurtosis: -1.2
Mean: 50.5
Median Absolute Deviation (MAD): 25
Skewness: 0
Sum: 5050
Variance: 841.66667
Monotonicity: Strictly increasing
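
Several of these figures can be recomputed from the values 1 to 100. A sketch with the standard library, assuming the variance uses the sample (n - 1) denominator and the MAD is the median of absolute deviations from the median; both assumptions agree with the numbers shown.

```python
import statistics

values = list(range(1, 101))  # the serial number column: 1..100

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std = statistics.stdev(values)          # square root of the sample variance
cv = std / mean                         # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

print(round(std, 6), round(cv, 8), round(variance, 5), mad, sum(values))
```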
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | 1.0%
65 | 1 | 1.0%
75 | 1 | 1.0%
74 | 1 | 1.0%
73 | 1 | 1.0%
72 | 1 | 1.0%
71 | 1 | 1.0%
70 | 1 | 1.0%
69 | 1 | 1.0%
68 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Value | Count | Frequency (%)
1 | 1 | 1.0%
2 | 1 | 1.0%
3 | 1 | 1.0%
4 | 1 | 1.0%
5 | 1 | 1.0%
6 | 1 | 1.0%
7 | 1 | 1.0%
8 | 1 | 1.0%
9 | 1 | 1.0%
10 | 1 | 1.0%

Value | Count | Frequency (%)
100 | 1 | 1.0%
99 | 1 | 1.0%
98 | 1 | 1.0%
97 | 1 | 1.0%
96 | 1 | 1.0%
95 | 1 | 1.0%
94 | 1 | 1.0%
93 | 1 | 1.0%
92 | 1 | 1.0%
91 | 1 | 1.0%

연월일
Date

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2020-01-01 00:00:00
Maximum: 2020-01-01 00:00:00
Histogram with fixed size bins (bins=1)

환경플랫폼 하위 도메인명
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
물환경: 55
생활환경: 43
자연환경: 2

Length

Max length: 4
Median length: 3
Mean length: 3.45
Min length: 3
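
These length statistics can be recovered from the category counts alone. A short sketch, assuming length is measured in Unicode code points (what Python's `len` returns for these strings).

```python
import statistics

# Category counts for 환경플랫폼 하위 도메인명, as listed in this section.
counts = {"물환경": 55, "생활환경": 43, "자연환경": 2}

# One length per observation: each category repeated by its count.
lengths = [len(value) for value, count in counts.items() for _ in range(count)]

max_length = max(lengths)
min_length = min(lengths)
median_length = statistics.median(lengths)
mean_length = statistics.mean(lengths)

print(max_length, median_length, mean_length, min_length)
```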

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물환경
2nd row: 물환경
3rd row: 물환경
4th row: 물환경
5th row: 물환경

Common Values

Value | Count | Frequency (%)
물환경 | 55 | 55.0%
생활환경 | 43 | 43.0%
자연환경 | 2 | 2.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
물환경 | 55 | 55.0%
생활환경 | 43 | 43.0%
자연환경 | 2 | 2.0%

도메인 하위 카테고리명
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 8.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
하천: 34
대기: 34
호소: 14
폐기물: 8
상수도: 6
Other values (3): 4

Length

Max length: 4
Median length: 2
Mean length: 2.21
Min length: 2

Unique

Unique: 2
Unique (%): 2.0%

Sample

1st row: 상수도
2nd row: 상수도
3rd row: 상수도
4th row: 상수도
5th row: 상수도

Common Values

Value | Count | Frequency (%)
하천 | 34 | 34.0%
대기 | 34 | 34.0%
호소 | 14 | 14.0%
폐기물 | 8 | 8.0%
상수도 | 6 | 6.0%
기후변화 | 2 | 2.0%
하수도 | 1 | 1.0%
화학물질 | 1 | 1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
하천 | 34 | 34.0%
대기 | 34 | 34.0%
호소 | 14 | 14.0%
폐기물 | 8 | 8.0%
상수도 | 6 | 6.0%
기후변화 | 2 | 2.0%
하수도 | 1 | 1.0%
화학물질 | 1 | 1.0%

SNS 채널명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
All: 100

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: All
2nd row: All
3rd row: All
4th row: All
5th row: All

Common Values

Value | Count | Frequency (%)
All | 100 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
All | 100 | 100.0%

시도명
Categorical

Distinct: 13
Distinct (%): 13.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
서울: 17
부산: 15
대구: 10
인천: 8
경기: 8
Other values (8): 42

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 광주
2nd row: 대구
3rd row: 대구
4th row: 대전
5th row: 부산

Common Values

Value | Count | Frequency (%)
서울 | 17 | 17.0%
부산 | 15 | 15.0%
대구 | 10 | 10.0%
인천 | 8 | 8.0%
경기 | 8 | 8.0%
광주 | 7 | 7.0%
경북 | 7 | 7.0%
울산 | 7 | 7.0%
대전 | 6 | 6.0%
경남 | 6 | 6.0%
Other values (3) | 9 | 9.0%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
서울 | 17 | 17.0%
부산 | 15 | 15.0%
대구 | 10 | 10.0%
인천 | 8 | 8.0%
경기 | 8 | 8.0%
광주 | 7 | 7.0%
경북 | 7 | 7.0%
울산 | 7 | 7.0%
대전 | 6 | 6.0%
경남 | 6 | 6.0%
Other values (3) | 9 | 9.0%

시군구명
Categorical

HIGH CORRELATION 

Distinct: 43
Distinct (%): 43.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
서구: 15
중구: 12
북구: 10
동구: 6
강서구: 6
Other values (38): 51

Length

Max length: 4
Median length: 2.5
Mean length: 2.51
Min length: 2

Unique

Unique: 30
Unique (%): 30.0%

Sample

1st row: 서구
2nd row: 달서구
3rd row: 서구
4th row: 서구
5th row: 서구

Common Values

Value | Count | Frequency (%)
서구 | 15 | 15.0%
중구 | 12 | 12.0%
북구 | 10 | 10.0%
동구 | 6 | 6.0%
강서구 | 6 | 6.0%
남구 | 5 | 5.0%
송파구 | 4 | 4.0%
고성군 | 2 | 2.0%
구리시 | 2 | 2.0%
일산 | 2 | 2.0%
Other values (33) | 36 | 36.0%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
서구 | 15 | 15.0%
중구 | 12 | 12.0%
북구 | 10 | 10.0%
동구 | 6 | 6.0%
강서구 | 6 | 6.0%
남구 | 5 | 5.0%
송파구 | 4 | 4.0%
사하구 | 2 | 2.0%
금천구 | 2 | 2.0%
강남구 | 2 | 2.0%
Other values (33) | 36 | 36.0%

일간시도언급량
Real number (ℝ)

Distinct: 7
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.69
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 1
Q3: 2
95th percentile: 5
Maximum: 31
Range: 30
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 5.7078627
Coefficient of variation (CV): 2.1218821
Kurtosis: 19.593263
Mean: 2.69
Median Absolute Deviation (MAD): 0
Skewness: 4.5195725
Sum: 269
Variance: 32.579697
Monotonicity: Not monotonic
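
The mean, sum, variance, and standard deviation can be cross-checked against the value counts reported for this variable. A sketch that expands the frequency table back into 100 observations, assuming the sample (n - 1) variance.

```python
import statistics

# Value -> count, taken from the frequency table for 일간시도언급량.
frequency = {1: 69, 2: 14, 3: 6, 4: 1, 5: 6, 29: 2, 31: 2}

# Expand the table into one entry per observation.
values = [value for value, count in frequency.items() for _ in range(count)]

total = sum(values)
mean = total / len(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std = statistics.stdev(values)
cv = std / mean

print(len(values), total, mean, round(variance, 6), round(std, 7), round(cv, 7))
```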
Histogram with fixed size bins (bins=7)
Value | Count | Frequency (%)
1 | 69 | 69.0%
2 | 14 | 14.0%
3 | 6 | 6.0%
5 | 6 | 6.0%
31 | 2 | 2.0%
29 | 2 | 2.0%
4 | 1 | 1.0%

Value | Count | Frequency (%)
1 | 69 | 69.0%
2 | 14 | 14.0%
3 | 6 | 6.0%
4 | 1 | 1.0%
5 | 6 | 6.0%
29 | 2 | 2.0%
31 | 2 | 2.0%

Value | Count | Frequency (%)
31 | 2 | 2.0%
29 | 2 | 2.0%
5 | 6 | 6.0%
4 | 1 | 1.0%
3 | 6 | 6.0%
2 | 14 | 14.0%
1 | 69 | 69.0%

Interactions

[4 interaction plots; images not reproduced]

Correlations

Variable | 일간지역언급량연번 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | 시도명 | 시군구명 | 일간시도언급량
일간지역언급량연번 | 1.000 | 0.803 | 0.854 | 0.652 | 0.666 | 0.374
환경플랫폼 하위 도메인명 | 0.803 | 1.000 | 1.000 | 0.513 | 0.909 | 0.316
도메인 하위 카테고리명 | 0.854 | 1.000 | 1.000 | 0.230 | 0.866 | 0.195
시도명 | 0.652 | 0.513 | 0.230 | 1.000 | 0.897 | 0.000
시군구명 | 0.666 | 0.909 | 0.866 | 0.897 | 1.000 | 0.775
일간시도언급량 | 0.374 | 0.316 | 0.195 | 0.000 | 0.775 | 1.000

Variable | 시도명 | 도메인 하위 카테고리명 | 시군구명 | 환경플랫폼 하위 도메인명
시도명 | 1.000 | 0.096 | 0.430 | 0.319
도메인 하위 카테고리명 | 0.096 | 1.000 | 0.436 | 0.974
시군구명 | 0.430 | 0.436 | 1.000 | 0.570
환경플랫폼 하위 도메인명 | 0.319 | 0.974 | 0.570 | 1.000

Variable | 일간지역언급량연번 | 일간시도언급량 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | 시도명 | 시군구명
일간지역언급량연번 | 1.000 | 0.056 | 0.669 | 0.627 | 0.332 | 0.228
일간시도언급량 | 0.056 | 1.000 | 0.081 | 0.050 | 0.000 | 0.371
환경플랫폼 하위 도메인명 | 0.669 | 0.081 | 1.000 | 0.974 | 0.319 | 0.570
도메인 하위 카테고리명 | 0.627 | 0.050 | 0.974 | 1.000 | 0.096 | 0.436
시도명 | 0.332 | 0.000 | 0.319 | 0.096 | 1.000 | 0.430
시군구명 | 0.228 | 0.371 | 0.570 | 0.436 | 0.430 | 1.000
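
Association between two categorical columns is commonly measured with Cramér's V. A minimal sketch follows, assuming the plain (uncorrected) statistic. Which estimator produced each matrix above is not stated in this report, so the result is not expected to match those figures. The pairs are the 환경플랫폼 하위 도메인명 and 도메인 하위 카테고리명 values from the 20 rows listed in the Sample section, and `cramers_v` is a hypothetical helper name.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-square over every cell of the contingency table.
    chi2 = 0.0
    for r, r_count in row_totals.items():
        for c, c_count in col_totals.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0  # undefined for a constant column; 0.0 is a choice made here
    return sqrt(chi2 / (n * k))


# (domain, category) pairs from the first and last ten rows of the dataset.
pairs = (
    [("물환경", "상수도")] * 6 + [("물환경", "하수도")] + [("물환경", "하천")] * 3
    + [("생활환경", "폐기물")] * 7 + [("생활환경", "화학물질")]
    + [("자연환경", "기후변화")] * 2
)
domains, categories = zip(*pairs)
v = cramers_v(domains, categories)
print(round(v, 3))
```

In these 20 rows every category belongs to exactly one domain, so the statistic comes out at (numerically close to) 1, the value for perfect association.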

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

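Row index | 일간지역언급량연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 시도명 | 시군구명 | 일간시도언급량
0 | 1 | 2020-01-01 | 물환경 | 상수도 | All | 광주 | 서구 | 1
1 | 2 | 2020-01-01 | 물환경 | 상수도 | All | 대구 | 달서구 | 1
2 | 3 | 2020-01-01 | 물환경 | 상수도 | All | 대구 | 서구 | 1
3 | 4 | 2020-01-01 | 물환경 | 상수도 | All | 대전 | 서구 | 1
4 | 5 | 2020-01-01 | 물환경 | 상수도 | All | 부산 | 서구 | 1
5 | 6 | 2020-01-01 | 물환경 | 상수도 | All | 인천 | 서구 | 1
6 | 7 | 2020-01-01 | 물환경 | 하수도 | All | 서울 | 송파구 | 1
7 | 8 | 2020-01-01 | 물환경 | 하천 | All | 강원 | 정선군 | 3
8 | 9 | 2020-01-01 | 물환경 | 하천 | All | 경기 | 구리시 | 2
9 | 10 | 2020-01-01 | 물환경 | 하천 | All | 경기 | 김포시 | 4

Row index | 일간지역언급량연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 시도명 | 시군구명 | 일간시도언급량
90 | 91 | 2020-01-01 | 생활환경 | 폐기물 | All | 경기 | 부천시 | 1
91 | 92 | 2020-01-01 | 생활환경 | 폐기물 | All | 광주 | 서구 | 2
92 | 93 | 2020-01-01 | 생활환경 | 폐기물 | All | 대구 | 서구 | 2
93 | 94 | 2020-01-01 | 생활환경 | 폐기물 | All | 대전 | 서구 | 2
94 | 95 | 2020-01-01 | 생활환경 | 폐기물 | All | 부산 | 서구 | 2
95 | 96 | 2020-01-01 | 생활환경 | 폐기물 | All | 인천 | 남동구 | 2
96 | 97 | 2020-01-01 | 생활환경 | 폐기물 | All | 인천 | 서구 | 2
97 | 98 | 2020-01-01 | 생활환경 | 화학물질 | All | 경북 | 포항시 | 2
98 | 99 | 2020-01-01 | 자연환경 | 기후변화 | All | 강원 | 강릉시 | 1
99 | 100 | 2020-01-01 | 자연환경 | 기후변화 | All | 강원 | 인제군 | 1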