Overview

Dataset statistics

Number of variables: 8
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.6 KiB
Average record size in memory: 67.3 B

Variable types

Numeric: 2
Categorical: 6

Dataset

Description: Sample data
Author: 성균관대학교 산학협력단 (Sungkyunkwan University Industry-Academic Cooperation Foundation)
URL: https://www.bigdata-environment.kr/user/data_market/detail.do?id=d58abcc0-2fca-11ea-94b6-73a02796bba4

Alerts

연월일 has constant value "2020-01-06" (Constant)
환경플랫폼 하위 도메인명 has constant value "물환경" (Constant)
주간지역언급량연번 is highly overall correlated with 도메인 하위 카테고리명 and 1 other field (High correlation)
주간시도언급량 is highly overall correlated with 시군구명 (High correlation)
도메인 하위 카테고리명 is highly overall correlated with 주간지역언급량연번 and 2 other fields (High correlation)
SNS 채널명 is highly overall correlated with 주간지역언급량연번 (High correlation)
시도명 is highly overall correlated with 도메인 하위 카테고리명 (High correlation)
시군구명 is highly overall correlated with 주간시도언급량 and 1 other field (High correlation)
주간지역언급량연번 has unique values (Unique)
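
The Constant and Unique alerts are consistent with the distinct counts reported per column in the Variables section: a column with one distinct value is constant, and a column with as many distinct values as rows is unique. The following is a minimal sketch of that rule, not ydata-profiling's own implementation; `flag_columns` is a hypothetical helper and the counts are copied by hand from this report.

```python
# Distinct counts per column, copied from the Variables section of this report.
N_ROWS = 100
distinct_counts = {
    "주간지역언급량연번": 100,
    "연월일": 1,
    "환경플랫폼 하위 도메인명": 1,
    "도메인 하위 카테고리명": 2,
    "SNS 채널명": 3,
    "시도명": 14,
    "시군구명": 43,
    "주간시도언급량": 7,
}


def flag_columns(counts, n_rows):
    """Hypothetical helper: label each column 'Constant' or 'Unique' where applicable."""
    flags = {}
    for name, distinct in counts.items():
        if distinct == 1:
            flags[name] = "Constant"      # every row holds the same value
        elif distinct == n_rows:
            flags[name] = "Unique"        # no value repeats
    return flags


flags = flag_columns(distinct_counts, N_ROWS)
for name, flag in flags.items():
    print(f"{name}: {flag}")
```

Columns that are neither constant nor unique are simply left out of the result.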

Reproduction

Analysis started: 2023-12-10 11:09:52.931126
Analysis finished: 2023-12-10 11:09:54.249996
Duration: 1.32 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
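
The reported duration can be checked against the two timestamps. This short sketch uses only the standard library and the values printed in this section.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-10 11:09:52.931126")
finished = datetime.fromisoformat("2023-12-10 11:09:54.249996")

# Elapsed wall-clock time in seconds, rounded to two decimals for display.
duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```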

Variables

주간지역언급량연번 (weekly regional mention volume, serial number)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.5
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 5.95
Q1: 25.75
Median: 50.5
Q3: 75.25
95th percentile: 95.05
Maximum: 100
Range: 99
Interquartile range (IQR): 49.5

Descriptive statistics

Standard deviation: 29.011492
Coefficient of variation (CV): 0.57448499
Kurtosis: -1.2
Mean: 50.5
Median Absolute Deviation (MAD): 25
Skewness: 0
Sum: 5050
Variance: 841.66667
Monotonicity: Strictly increasing
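
The quantile and descriptive statistics for this column match what one would expect from the integers 1 through 100. The sketch below recomputes them with the standard library under that assumption (the raw column is not part of this report). It also assumes linear-interpolation quartiles and a sample variance with an n - 1 denominator, which reproduce the figures above.

```python
import statistics

# Assumption: the serial-number column holds exactly the integers 1..100.
values = list(range(1, 101))

mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)
cv = std_dev / mean                      # coefficient of variation

# Quartiles with linear interpolation between the closest ranks.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)

print(mean, variance, std_dev, cv)
print(q1, median, q3, iqr, mad)
```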
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
1 | 1 | 1.0%
65 | 1 | 1.0%
75 | 1 | 1.0%
74 | 1 | 1.0%
73 | 1 | 1.0%
72 | 1 | 1.0%
71 | 1 | 1.0%
70 | 1 | 1.0%
69 | 1 | 1.0%
68 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 1.0%
2 | 1 | 1.0%
3 | 1 | 1.0%
4 | 1 | 1.0%
5 | 1 | 1.0%
6 | 1 | 1.0%
7 | 1 | 1.0%
8 | 1 | 1.0%
9 | 1 | 1.0%
10 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
100 | 1 | 1.0%
99 | 1 | 1.0%
98 | 1 | 1.0%
97 | 1 | 1.0%
96 | 1 | 1.0%
95 | 1 | 1.0%
94 | 1 | 1.0%
93 | 1 | 1.0%
92 | 1 | 1.0%
91 | 1 | 1.0%

연월일 (date, year-month-day)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Top values: 2020-01-06 (100)

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020-01-06
2nd row: 2020-01-06
3rd row: 2020-01-06
4th row: 2020-01-06
5th row: 2020-01-06

Common Values

Value | Count | Frequency (%)
2020-01-06 | 100 | 100.0%

[Figure: histogram of lengths of the category]
[Figure: common values plot]

환경플랫폼 하위 도메인명 (environment platform subdomain name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Top values: 물환경 (100)

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물환경
2nd row: 물환경
3rd row: 물환경
4th row: 물환경
5th row: 물환경

Common Values

Value | Count | Frequency (%)
물환경 | 100 | 100.0%

[Figure: histogram of lengths of the category]
[Figure: common values plot]

도메인 하위 카테고리명 (domain subcategory name)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Top values: 물재난 (77), 상수도 (23)

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물재난
2nd row: 물재난
3rd row: 물재난
4th row: 물재난
5th row: 물재난

Common Values

Value | Count | Frequency (%)
물재난 | 77 | 77.0%
상수도 | 23 | 23.0%

[Figure: histogram of lengths of the category]
[Figure: common values plot]

SNS 채널명 (SNS channel name)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Top values: All (60), blog (32), twitter (8)

Length

Max length: 7
Median length: 3
Mean length: 3.64
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: All
2nd row: All
3rd row: All
4th row: All
5th row: All

Common Values

Value | Count | Frequency (%)
All | 60 | 60.0%
blog | 32 | 32.0%
twitter | 8 | 8.0%

[Figure: histogram of lengths of the category]
[Figure: common values plot]
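
The length statistics reported for SNS 채널명 follow directly from its value counts. As a minimal sketch, using only the three values and counts listed in the Common Values table, the mean of 3.64 is the count-weighted average of the string lengths:

```python
import statistics

# Value counts copied from the Common Values table for SNS 채널명.
counts = {"All": 60, "blog": 32, "twitter": 8}

total = sum(counts.values())

# Count-weighted mean of the string lengths: (3*60 + 4*32 + 7*8) / 100.
mean_length = sum(len(value) * count for value, count in counts.items()) / total

# Expand to one length per row to obtain the median, minimum and maximum.
lengths = [len(value) for value, count in counts.items() for _ in range(count)]
median_length = statistics.median(lengths)

print(mean_length, median_length, min(lengths), max(lengths))
```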

시도명 (province or metropolitan city name)
Categorical

HIGH CORRELATION

Distinct: 14
Distinct (%): 14.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Top values: 경기 (23), 강원 (10), 부산 (10), 경남 (7), 경북 (6), Other values (9) (44)

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강원
2nd row: 강원
3rd row: 강원
4th row: 경기
5th row: 경기

Common Values

Value | Count | Frequency (%)
경기 | 23 | 23.0%
강원 | 10 | 10.0%
부산 | 10 | 10.0%
경남 | 7 | 7.0%
경북 | 6 | 6.0%
광주 | 6 | 6.0%
대구 | 6 | 6.0%
대전 | 6 | 6.0%
서울 | 6 | 6.0%
인천 | 6 | 6.0%
Other values (4) | 14 | 14.0%

[Figure: histogram of lengths of the category]

시군구명 (city, county, or district name)
Categorical

HIGH CORRELATION

Distinct: 43
Distinct (%): 43.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Top values: 동구 (12), 서구 (10), 남구 (10), 일산 (4), 강서구 (4), Other values (38) (60)

Length

Max length: 4
Median length: 3
Mean length: 2.7
Min length: 2

Unique

Unique: 19
Unique (%): 19.0%

Sample

1st row: 영월군
2nd row: 철원군
3rd row: 화천군
4th row: 수원시
5th row: 일산

Common Values

Value | Count | Frequency (%)
동구 | 12 | 12.0%
서구 | 10 | 10.0%
남구 | 10 | 10.0%
일산 | 4 | 4.0%
강서구 | 4 | 4.0%
화천군 | 3 | 3.0%
수원시 | 3 | 3.0%
진주시 | 3 | 3.0%
서천군 | 2 | 2.0%
강동구 | 2 | 2.0%
Other values (33) | 47 | 47.0%

[Figure: histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
동구 | 12 | 12.0%
남구 | 10 | 10.0%
서구 | 10 | 10.0%
일산 | 4 | 4.0%
강서구 | 4 | 4.0%
화천군 | 3 | 3.0%
수원시 | 3 | 3.0%
진주시 | 3 | 3.0%
여수시 | 2 | 2.0%
제주시 | 2 | 2.0%
Other values (33) | 47 | 47.0%

주간시도언급량 (weekly province-level mention volume)
Real number (ℝ)

HIGH CORRELATION

Distinct: 7
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.72
Minimum: 1
Maximum: 16
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 1
Q3: 2
95th percentile: 4
Maximum: 16
Range: 15
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.7983157
Coefficient of variation (CV): 1.0455324
Kurtosis: 41.373244
Mean: 1.72
Median Absolute Deviation (MAD): 0
Skewness: 5.7584811
Sum: 172
Variance: 3.2339394
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)

Common values

Value | Count | Frequency (%)
1 | 62 | 62.0%
2 | 29 | 29.0%
3 | 3 | 3.0%
7 | 2 | 2.0%
4 | 2 | 2.0%
5 | 1 | 1.0%
16 | 1 | 1.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 62 | 62.0%
2 | 29 | 29.0%
3 | 3 | 3.0%
4 | 2 | 2.0%
5 | 1 | 1.0%
7 | 2 | 2.0%
16 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
16 | 1 | 1.0%
7 | 2 | 2.0%
5 | 1 | 1.0%
4 | 2 | 2.0%
3 | 3 | 3.0%
2 | 29 | 29.0%
1 | 62 | 62.0%
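
Because 주간시도언급량 has only seven distinct values, its frequency table is enough to recompute the sum, mean, variance, standard deviation, and coefficient of variation reported above. The sketch below does so with the standard library; it assumes a sample variance with an n - 1 denominator, which reproduces the reported figures.

```python
import math

# Value -> count, copied from the frequency table for 주간시도언급량.
freq = {1: 62, 2: 29, 3: 3, 4: 2, 5: 1, 7: 2, 16: 1}

n = sum(freq.values())                                  # number of observations
total = sum(value * count for value, count in freq.items())
mean = total / n

# Sample variance from grouped data (n - 1 denominator).
variance = sum(count * (value - mean) ** 2 for value, count in freq.items()) / (n - 1)
std_dev = math.sqrt(variance)
cv = std_dev / mean                                     # coefficient of variation

print(n, total, mean)
print(round(variance, 7), round(std_dev, 7), round(cv, 7))
```

The single observation of 16 accounts for most of the spread, which is in line with the large skewness and kurtosis in the report.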

Interactions


Correlations

 | 주간지역언급량연번 | 도메인 하위 카테고리명 | SNS 채널명 | 시도명 | 시군구명 | 주간시도언급량
주간지역언급량연번 | 1.000 | 0.994 | 0.892 | 0.778 | 0.000 | 0.239
도메인 하위 카테고리명 | 0.994 | 1.000 | 0.263 | 0.870 | 0.973 | 0.313
SNS 채널명 | 0.892 | 0.263 | 1.000 | 0.000 | 0.000 | 0.000
시도명 | 0.778 | 0.870 | 0.000 | 1.000 | 0.936 | 0.000
시군구명 | 0.000 | 0.973 | 0.000 | 0.936 | 1.000 | 0.981
주간시도언급량 | 0.239 | 0.313 | 0.000 | 0.000 | 0.981 | 1.000

 | 시도명 | SNS 채널명 | 시군구명 | 도메인 하위 카테고리명
시도명 | 1.000 | 0.000 | 0.499 | 0.673
SNS 채널명 | 0.000 | 1.000 | 0.000 | 0.425
시군구명 | 0.499 | 0.000 | 1.000 | 0.708
도메인 하위 카테고리명 | 0.673 | 0.425 | 0.708 | 1.000

 | 주간지역언급량연번 | 주간시도언급량 | 도메인 하위 카테고리명 | SNS 채널명 | 시도명 | 시군구명
주간지역언급량연번 | 1.000 | 0.133 | 0.894 | 0.808 | 0.445 | 0.000
주간시도언급량 | 0.133 | 1.000 | 0.414 | 0.000 | 0.000 | 0.624
도메인 하위 카테고리명 | 0.894 | 0.414 | 1.000 | 0.425 | 0.673 | 0.708
SNS 채널명 | 0.808 | 0.000 | 0.425 | 1.000 | 0.000 | 0.000
시도명 | 0.445 | 0.000 | 0.673 | 0.000 | 1.000 | 0.499
시군구명 | 0.000 | 0.624 | 0.708 | 0.000 | 0.499 | 1.000
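
The High correlation alerts listed under Alerts can be reproduced from the last correlation matrix in this section. The sketch below assumes a cutoff of 0.5, which is not stated in this report but yields exactly the column pairs named in the alerts; `THRESHOLD` and `high` are illustrative names.

```python
# Column order and values copied from the last correlation matrix.
columns = [
    "주간지역언급량연번", "주간시도언급량", "도메인 하위 카테고리명",
    "SNS 채널명", "시도명", "시군구명",
]
matrix = [
    [1.000, 0.133, 0.894, 0.808, 0.445, 0.000],
    [0.133, 1.000, 0.414, 0.000, 0.000, 0.624],
    [0.894, 0.414, 1.000, 0.425, 0.673, 0.708],
    [0.808, 0.000, 0.425, 1.000, 0.000, 0.000],
    [0.445, 0.000, 0.673, 0.000, 1.000, 0.499],
    [0.000, 0.624, 0.708, 0.000, 0.499, 1.000],
]

THRESHOLD = 0.5  # assumption: cutoff chosen because it matches the listed alerts

# For each column, collect the other columns whose correlation meets the cutoff.
high = {
    row_name: [
        col_name
        for col_name, value in zip(columns, row)
        if col_name != row_name and value >= THRESHOLD
    ]
    for row_name, row in zip(columns, matrix)
}

for name, partners in high.items():
    print(f"{name}: {', '.join(partners)}")
```

Note that 시도명 and 시군구명 fall just under the cutoff at 0.499, so that pair does not appear in either column's alert.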

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

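First rows

(index) | 주간지역언급량연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 시도명 | 시군구명 | 주간시도언급량
0 | 1 | 2020-01-06 | 물환경 | 물재난 | All | 강원 | 영월군 | 1
1 | 2 | 2020-01-06 | 물환경 | 물재난 | All | 강원 | 철원군 | 1
2 | 3 | 2020-01-06 | 물환경 | 물재난 | All | 강원 | 화천군 | 2
3 | 4 | 2020-01-06 | 물환경 | 물재난 | All | 경기 | 수원시 | 1
4 | 5 | 2020-01-06 | 물환경 | 물재난 | All | 경기 | 일산 | 2
5 | 6 | 2020-01-06 | 물환경 | 물재난 | All | 경남 | 남해군 | 1
6 | 7 | 2020-01-06 | 물환경 | 물재난 | All | 경남 | 진주시 | 5
7 | 8 | 2020-01-06 | 물환경 | 물재난 | All | 경북 | 남구 | 2
8 | 9 | 2020-01-06 | 물환경 | 물재난 | All | 경북 | 영주시 | 1
9 | 10 | 2020-01-06 | 물환경 | 물재난 | All | 경북 | 포항시 | 1

Last rows

(index) | 주간지역언급량연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 시도명 | 시군구명 | 주간시도언급량
90 | 91 | 2020-01-06 | 물환경 | 상수도 | All | 경기 | 용인시 | 2
91 | 92 | 2020-01-06 | 물환경 | 상수도 | All | 경기 | 의정부시 | 2
92 | 93 | 2020-01-06 | 물환경 | 상수도 | All | 경기 | 일산 | 1
93 | 94 | 2020-01-06 | 물환경 | 상수도 | All | 경기 | 처인구 | 2
94 | 95 | 2020-01-06 | 물환경 | 상수도 | All | 경기 | 파주시 | 3
95 | 96 | 2020-01-06 | 물환경 | 상수도 | All | 경기 | 포천시 | 1
96 | 97 | 2020-01-06 | 물환경 | 상수도 | All | 경기 | 하남시 | 4
97 | 98 | 2020-01-06 | 물환경 | 상수도 | All | 경기 | 화성시 | 7
98 | 99 | 2020-01-06 | 물환경 | 상수도 | All | 경남 | 고성군 | 1
99 | 100 | 2020-01-06 | 물환경 | 상수도 | All | 경남 | 산청군 | 4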