Overview

Dataset statistics

Number of variables: 8
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.6 KiB
Average record size in memory: 67.3 B

Variable types

Numeric: 1
Categorical: 7

Dataset

Description: Sample data
Author: 성균관대학교 산학협력단 (Sungkyunkwan University industry-academic cooperation foundation)
URL: https://www.bigdata-environment.kr/user/data_market/detail.do?id=d58abcc0-2fca-11ea-94b6-73a02796bba4

Alerts

연월일 has constant value "2021-04-05" (Constant)
환경플랫폼 하위 도메인명 has constant value "물환경" (Constant)
주간지역언급량연번 is highly overall correlated with SNS 채널명 (High correlation)
도메인 하위 카테고리명 is highly overall correlated with 시군구명 and 1 other field (High correlation)
SNS 채널명 is highly overall correlated with 주간지역언급량연번 (High correlation)
시도명 is highly overall correlated with 시군구명 (High correlation)
시군구명 is highly overall correlated with 도메인 하위 카테고리명 and 2 other fields (High correlation)
주간시도언급량 is highly overall correlated with 도메인 하위 카테고리명 and 1 other field (High correlation)
도메인 하위 카테고리명 is highly imbalanced (85.9%) (Imbalance)
주간시도언급량 is highly imbalanced (52.2%) (Imbalance)
주간지역언급량연번 has unique values (Unique)
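
The two imbalance percentages can be reproduced from the value counts listed later in the report. The sketch below assumes the score is one minus the Shannon entropy of the class distribution divided by its maximum possible entropy; the report does not state its formula, so this definition and the function name `imbalance_score` are assumptions that happen to match both reported figures.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts.

    Assumed definition: 0 means perfectly balanced classes, 1 means a single class.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2 or total == 0:
        return 0.0
    # Entropy in nats; the log base cancels out in the ratio below.
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(n_classes)

# Counts taken from the Common Values tables of the two flagged variables.
domain_subcategory = imbalance_score([98, 2])      # 도메인 하위 카테고리명
weekly_mentions = imbalance_score([81, 18, 1])     # 주간시도언급량

print(f"{domain_subcategory:.1%}")  # 85.9%
print(f"{weekly_mentions:.1%}")     # 52.2%
```

Under this definition a 98/2 split scores far higher than an 81/18/1 split, which is consistent with the ordering of the two alerts.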

Reproduction

Analysis started: 2023-12-10 11:09:36.477845
Analysis finished: 2023-12-10 11:09:37.415719
Duration: 0.94 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
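
As a quick consistency check, the reported duration follows from the two timestamps. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-10 11:09:36.477845")
finished = datetime.fromisoformat("2023-12-10 11:09:37.415719")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")  # 0.94 seconds
```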

Variables

주간지역언급량연번 (weekly regional mention volume, serial number)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.5
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5.95
Q1: 25.75
Median: 50.5
Q3: 75.25
95-th percentile: 95.05
Maximum: 100
Range: 99
Interquartile range (IQR): 49.5
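
The quantile figures above are what linear interpolation gives for the integers 1 to 100. The sketch below assumes that the column holds exactly those values (consistent with 100 distinct values, minimum 1, maximum 100, and sum 5050) and that quantiles are linearly interpolated between neighbouring order statistics; `quantile_linear` is an illustrative helper, not a function from the profiling tool.

```python
def quantile_linear(sorted_values, q):
    """Quantile by linear interpolation between the two nearest order statistics."""
    position = (len(sorted_values) - 1) * q
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

values = list(range(1, 101))  # assumed column contents: 1, 2, ..., 100

p05 = quantile_linear(values, 0.05)
q1 = quantile_linear(values, 0.25)
median = quantile_linear(values, 0.50)
q3 = quantile_linear(values, 0.75)
p95 = quantile_linear(values, 0.95)
iqr = q3 - q1

print(round(p05, 2), q1, median, q3, round(p95, 2), iqr)
```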

Descriptive statistics

Standard deviation: 29.011492
Coefficient of variation (CV): 0.57448499
Kurtosis: -1.2
Mean: 50.5
Median Absolute Deviation (MAD): 25
Skewness: 0
Sum: 5050
Variance: 841.66667
Monotonicity: Strictly increasing
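
Several of these descriptive statistics can be checked by hand with the standard library. The sketch assumes the column holds the integers 1 to 100 and that variance and standard deviation use the sample (n - 1) denominator, which is what reproduces the reported 841.66667.

```python
import statistics

values = list(range(1, 101))  # assumed column contents: 1, 2, ..., 100

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, n - 1 denominator
std = statistics.stdev(values)           # sample standard deviation
cv = std / mean                          # coefficient of variation
median = statistics.median(values)
# Median absolute deviation: median of the absolute distances from the median.
mad = statistics.median([abs(v - median) for v in values])

print(total, mean, round(variance, 5), round(std, 6), round(cv, 8), mad)
```

The coefficient of variation is simply the standard deviation divided by the mean, so it carries no information beyond those two figures.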
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
1 | 1 | 1.0%
65 | 1 | 1.0%
75 | 1 | 1.0%
74 | 1 | 1.0%
73 | 1 | 1.0%
72 | 1 | 1.0%
71 | 1 | 1.0%
70 | 1 | 1.0%
69 | 1 | 1.0%
68 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 1.0%
2 | 1 | 1.0%
3 | 1 | 1.0%
4 | 1 | 1.0%
5 | 1 | 1.0%
6 | 1 | 1.0%
7 | 1 | 1.0%
8 | 1 | 1.0%
9 | 1 | 1.0%
10 | 1 | 1.0%

Maximum 10 values
Value | Count | Frequency (%)
100 | 1 | 1.0%
99 | 1 | 1.0%
98 | 1 | 1.0%
97 | 1 | 1.0%
96 | 1 | 1.0%
95 | 1 | 1.0%
94 | 1 | 1.0%
93 | 1 | 1.0%
92 | 1 | 1.0%
91 | 1 | 1.0%

연월일 (date, year-month-day)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021-04-05
2nd row: 2021-04-05
3rd row: 2021-04-05
4th row: 2021-04-05
5th row: 2021-04-05

Common Values

Value | Count | Frequency (%)
2021-04-05 | 100 | 100.0%

Length

Figure: histogram of lengths of the category.

Common Values (Plot)

Figure: bar chart of the common values; 2021-04-05 accounts for all 100 rows (100.0%).

환경플랫폼 하위 도메인명 (environment platform subdomain name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물환경
2nd row: 물환경
3rd row: 물환경
4th row: 물환경
5th row: 물환경

Common Values

Value | Count | Frequency (%)
물환경 | 100 | 100.0%

Length

Figure: histogram of lengths of the category.

Common Values (Plot)

Figure: bar chart of the common values; 물환경 accounts for all 100 rows (100.0%).

도메인 하위 카테고리명 (domain subcategory name)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물재난
2nd row: 물재난
3rd row: 물재난
4th row: 물재난
5th row: 물재난

Common Values

Value | Count | Frequency (%)
물재난 | 98 | 98.0%
상수도 | 2 | 2.0%

Length

Figure: histogram of lengths of the category.

Common Values (Plot)

Figure: bar chart of the common values; 물재난 98 (98.0%), 상수도 2 (2.0%).

SNS 채널명 (SNS channel name)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 3
Mean length: 3.49
Min length: 3
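
The length statistics follow directly from the two values and their counts (51 rows of "All" and 49 rows of "blog", as listed under Common Values). A minimal sketch of that arithmetic, with illustrative variable names:

```python
import statistics

# One string length per row, built from the reported value counts.
lengths = [len("All")] * 51 + [len("blog")] * 49

max_length = max(lengths)
median_length = statistics.median(lengths)
mean_length = statistics.mean(lengths)   # (51 * 3 + 49 * 4) / 100
min_length = min(lengths)

print(max_length, median_length, mean_length, min_length)
```

Because the shorter value holds a slight majority, the median stays at 3 while the mean lands just below 3.5.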

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: All
2nd row: All
3rd row: All
4th row: All
5th row: All

Common Values

Value | Count | Frequency (%)
All | 51 | 51.0%
blog | 49 | 49.0%

Length

Figure: histogram of lengths of the category.

Common Values (Plot)

Figure: bar chart of the common values; All 51 (51.0%), blog 49 (49.0%).

시도명 (province or metropolitan city name)
Categorical

HIGH CORRELATION

Distinct: 15
Distinct (%): 15.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강원
2nd row: 경기
3rd row: 경기
4th row: 경기
5th row: 경기

Common Values

Value | Count | Frequency (%)
경기 | 20 | 20.0%
전남 | 14 | 14.0%
대전 | 10 | 10.0%
경남 | 8 | 8.0%
부산 | 8 | 8.0%
대구 | 6 | 6.0%
인천 | 6 | 6.0%
전북 | 6 | 6.0%
광주 | 4 | 4.0%
서울 | 4 | 4.0%
Other values (5) | 14 | 14.0%

Length

Figure: histogram of lengths of the category.

시군구명 (city, county, or district name)
Categorical

HIGH CORRELATION

Distinct: 37
Distinct (%): 37.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 3
Mean length: 2.66
Min length: 2

Unique

Unique: 2
Unique (%): 2.0%

Sample

1st row: 철원군
2nd row: 가평군
3rd row: 수원시
4th row: 양평군
5th row: 연천군

Common Values

Value | Count | Frequency (%)
중구 | 12 | 12.0%
동구 | 12 | 12.0%
서구 | 10 | 10.0%
처인구 | 2 | 2.0%
유성구 | 2 | 2.0%
가평군 | 2 | 2.0%
수원시 | 2 | 2.0%
양평군 | 2 | 2.0%
연천군 | 2 | 2.0%
용인시 | 2 | 2.0%
Other values (27) | 52 | 52.0%

Length

Figure: histogram of lengths of the category.

Common Values (Plot)

Value | Count | Frequency (%)
중구 | 12 | 12.0%
동구 | 12 | 12.0%
서구 | 10 | 10.0%
담양군 | 2 | 2.0%
영동군 | 2 | 2.0%
제주시 | 2 | 2.0%
철원군 | 2 | 2.0%
보성군 | 2 | 2.0%
고흥군 | 2 | 2.0%
나주시 | 2 | 2.0%
Other values (27) | 52 | 52.0%

주간시도언급량 (weekly province-level mention volume)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 1.0%
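
"Distinct" and "Unique" measure different things here: the figures are consistent with Distinct counting the different values present and Unique counting the values that occur exactly once. The sketch below illustrates that reading with this variable's value counts (81, 18, and 1); the interpretation is inferred from the numbers in this report rather than stated in it.

```python
from collections import Counter

# Column contents rebuilt from the reported value counts (values are stored as text).
values = ["1"] * 81 + ["2"] * 18 + ["3"] * 1
counts = Counter(values)

distinct = len(counts)                                   # different values present
unique = sum(1 for c in counts.values() if c == 1)       # values seen exactly once

distinct_pct = distinct / len(values)
unique_pct = unique / len(values)

print(distinct, f"{distinct_pct:.1%}", unique, f"{unique_pct:.1%}")
```

Only the value 3 appears a single time, which gives one unique value out of three distinct ones.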

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 81 | 81.0%
2 | 18 | 18.0%
3 | 1 | 1.0%

Length

Figure: histogram of lengths of the category.

Common Values (Plot)

Figure: bar chart of the common values; 1: 81 (81.0%), 2: 18 (18.0%), 3: 1 (1.0%).

Interactions

Figure: interactions plot (image not preserved).

Correlations

Variable | 주간지역언급량연번 | 도메인 하위 카테고리명 | SNS 채널명 | 시도명 | 시군구명 | 주간시도언급량
주간지역언급량연번 | 1.000 | 0.418 | 0.996 | 0.846 | 0.000 | 0.048
도메인 하위 카테고리명 | 0.418 | 1.000 | 0.000 | 0.000 | 1.000 | 0.449
SNS 채널명 | 0.996 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000
시도명 | 0.846 | 0.000 | 0.000 | 1.000 | 0.954 | 0.000
시군구명 | 0.000 | 1.000 | 0.000 | 0.954 | 1.000 | 1.000
주간시도언급량 | 0.048 | 0.449 | 0.000 | 0.000 | 1.000 | 1.000

Variable | 시군구명 | 주간시도언급량 | 시도명 | SNS 채널명 | 도메인 하위 카테고리명
시군구명 | 1.000 | 0.806 | 0.583 | 0.000 | 0.802
주간시도언급량 | 0.806 | 1.000 | 0.000 | 0.000 | 0.693
시도명 | 0.583 | 0.000 | 1.000 | 0.000 | 0.000
SNS 채널명 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000
도메인 하위 카테고리명 | 0.802 | 0.693 | 0.000 | 0.000 | 1.000

Variable | 주간지역언급량연번 | 도메인 하위 카테고리명 | SNS 채널명 | 시도명 | 시군구명 | 주간시도언급량
주간지역언급량연번 | 1.000 | 0.306 | 0.904 | 0.499 | 0.000 | 0.000
도메인 하위 카테고리명 | 0.306 | 1.000 | 0.000 | 0.000 | 0.802 | 0.693
SNS 채널명 | 0.904 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000
시도명 | 0.499 | 0.000 | 0.000 | 1.000 | 0.583 | 0.000
시군구명 | 0.000 | 0.802 | 0.000 | 0.583 | 1.000 | 0.806
주간시도언급량 | 0.000 | 0.693 | 0.000 | 0.000 | 0.806 | 1.000
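
The report does not name the measure behind each matrix. One of them covers only the categorical variables, which is consistent with a categorical association measure such as Cramér's V, but that is an inference. The sketch below shows plain (uncorrected) Cramér's V on two hypothetical contingency tables; the tables and the function name `cramers_v` are illustrative and are not taken from this dataset, and the profiling tool may apply a bias correction that this sketch omits.

```python
from math import sqrt

def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 and n > 0 else 0.0

# Hypothetical tables: one with perfect association, one with none.
perfect = cramers_v([[10, 0], [0, 10]])
independent = cramers_v([[5, 5], [5, 5]])
print(perfect, independent)
```

Values near 1 indicate that one variable almost determines the other, while values near 0 indicate no association.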

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

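First rows
Index | 주간지역언급량연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 시도명 | 시군구명 | 주간시도언급량
0 | 1 | 2021-04-05 | 물환경 | 물재난 | All | 강원 | 철원군 | 1
1 | 2 | 2021-04-05 | 물환경 | 물재난 | All | 경기 | 가평군 | 1
2 | 3 | 2021-04-05 | 물환경 | 물재난 | All | 경기 | 수원시 | 1
3 | 4 | 2021-04-05 | 물환경 | 물재난 | All | 경기 | 양평군 | 1
4 | 5 | 2021-04-05 | 물환경 | 물재난 | All | 경기 | 연천군 | 1
5 | 6 | 2021-04-05 | 물환경 | 물재난 | All | 경기 | 용인시 | 1
6 | 7 | 2021-04-05 | 물환경 | 물재난 | All | 경기 | 처인구 | 1
7 | 8 | 2021-04-05 | 물환경 | 물재난 | All | 경기 | 포천시 | 1
8 | 9 | 2021-04-05 | 물환경 | 물재난 | All | 경기 | 하남시 | 1
9 | 10 | 2021-04-05 | 물환경 | 물재난 | All | 경기 | 화성시 | 1

Last rows
Index | 주간지역언급량연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 시도명 | 시군구명 | 주간시도언급량
90 | 91 | 2021-04-05 | 물환경 | 물재난 | blog | 전남 | 여수시 | 1
91 | 92 | 2021-04-05 | 물환경 | 물재난 | blog | 전남 | 화순군 | 2
92 | 93 | 2021-04-05 | 물환경 | 물재난 | blog | 전북 | 김제시 | 1
93 | 94 | 2021-04-05 | 물환경 | 물재난 | blog | 전북 | 남원시 | 1
94 | 95 | 2021-04-05 | 물환경 | 물재난 | blog | 전북 | 부안군 | 1
95 | 96 | 2021-04-05 | 물환경 | 물재난 | blog | 제주 | 서귀포시 | 1
96 | 97 | 2021-04-05 | 물환경 | 물재난 | blog | 제주 | 제주시 | 1
97 | 98 | 2021-04-05 | 물환경 | 물재난 | blog | 충북 | 영동군 | 1
98 | 99 | 2021-04-05 | 물환경 | 상수도 | All | 경기 | 광주시 | 3
99 | 100 | 2021-04-05 | 물환경 | 상수도 | All | 경기 | 구리시 | 1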