Overview

Dataset statistics

Number of variables: 8
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.6 KiB
Average record size in memory: 67.3 B

Variable types

Numeric: 1
DateTime: 1
Categorical: 5
Text: 1

Dataset

Description: Sample data
Author: 성균관대학교 산학협력단 (Sungkyunkwan University Research & Business Foundation)
URL: https://www.bigdata-environment.kr/user/data_market/detail.do?id=8a03ac90-2fc9-11ea-94b6-73a02796bba4

Alerts

연월일 has constant value "2021-04-01" (Constant)
SNS 채널명 has constant value "All" (Constant)
도메인 하위 카테고리명 is highly overall correlated with 일간지역언급량연번 and 1 other field (High correlation)
환경플랫폼 하위 도메인명 is highly overall correlated with 일간지역언급량연번 and 1 other field (High correlation)
일간지역언급량연번 is highly overall correlated with 환경플랫폼 하위 도메인명 and 1 other field (High correlation)
일간시도언급량 is highly imbalanced (80.6%) (Imbalance)
일간지역언급량연번 has unique values (Unique)
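
The 80.6% imbalance figure for 일간시도언급량 can be reproduced from the value counts reported for that variable (97 rows with value 1, 3 rows with value 2). The sketch below assumes the score is one minus the normalized Shannon entropy of the class frequencies; the function name `imbalance_score` is hypothetical, and the formula is an assumption about how the profiler derives the number, checked only against this one value.

```python
import math


def imbalance_score(counts):
    """Return 1 - H/H_max, where H is the Shannon entropy of the class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    Assumed formula, not taken from the report itself.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # a single class has no balance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Counts of 일간시도언급량 from the report: value 1 -> 97 rows, value 2 -> 3 rows.
score = imbalance_score([97, 3])
print(f"{score:.1%}")
```

With only two classes the maximum entropy is 1 bit, so the score is simply one minus the entropy of the 97/3 split.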

Reproduction

Analysis started: 2023-12-10 13:45:23.647050
Analysis finished: 2023-12-10 13:45:24.989156
Duration: 1.34 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

일간지역언급량연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.5
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5.95
Q1: 25.75
Median: 50.5
Q3: 75.25
95-th percentile: 95.05
Maximum: 100
Range: 99
Interquartile range (IQR): 49.5

Descriptive statistics

Standard deviation: 29.011492
Coefficient of variation (CV): 0.57448499
Kurtosis: -1.2
Mean: 50.5
Median Absolute Deviation (MAD): 25
Skewness: 0
Sum: 5050
Variance: 841.66667
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
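
Because 일간지역언급량연번 is reported as unique, strictly increasing, and ranging from 1 to 100 over 100 rows, its values must be the integers 1 through 100, so the quantile and descriptive statistics above can be checked by hand. The sketch below uses only the Python standard library; the helper `percentile` is a hypothetical name, and it assumes linearly interpolated percentiles and the sample (n - 1) variance, which are the conventions that reproduce the reported figures.

```python
import math
import statistics


def percentile(sorted_values, p):
    """Linearly interpolated percentile of pre-sorted data, with p in [0, 1]."""
    pos = (len(sorted_values) - 1) * p
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * frac


values = list(range(1, 101))  # the serial numbers 1..100

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (divides by n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)  # median absolute deviation
q1, q3 = percentile(values, 0.25), percentile(values, 0.75)

print(f"mean={mean} variance={variance:.5f} std={std:.6f} cv={cv:.8f}")
print(f"p5={percentile(values, 0.05):.2f} q1={q1} q3={q3} iqr={q3 - q1} mad={mad}")
```

The same check makes clear why the variable is flagged: a row counter carries no information beyond row order, so its statistics are those of any 1..n sequence.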

Common values

Value | Count | Frequency (%)
1 | 1 | 1.0%
65 | 1 | 1.0%
75 | 1 | 1.0%
74 | 1 | 1.0%
73 | 1 | 1.0%
72 | 1 | 1.0%
71 | 1 | 1.0%
70 | 1 | 1.0%
69 | 1 | 1.0%
68 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 1.0%
2 | 1 | 1.0%
3 | 1 | 1.0%
4 | 1 | 1.0%
5 | 1 | 1.0%
6 | 1 | 1.0%
7 | 1 | 1.0%
8 | 1 | 1.0%
9 | 1 | 1.0%
10 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
100 | 1 | 1.0%
99 | 1 | 1.0%
98 | 1 | 1.0%
97 | 1 | 1.0%
96 | 1 | 1.0%
95 | 1 | 1.0%
94 | 1 | 1.0%
93 | 1 | 1.0%
92 | 1 | 1.0%
91 | 1 | 1.0%

연월일
Date

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2021-04-01 00:00:00
Maximum: 2021-04-01 00:00:00
Histogram with fixed size bins (bins=1)

환경플랫폼 하위 도메인명
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 3
Mean length: 3.39
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물환경
2nd row: 물환경
3rd row: 물환경
4th row: 물환경
5th row: 물환경

Common Values

Value | Count | Frequency (%)
물환경 | 61 | 61.0%
생활환경 | 32 | 32.0%
자연환경 | 7 | 7.0%
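
The Length and Unique figures for 환경플랫폼 하위 도메인명 follow directly from the three value counts in the table above. The sketch below rebuilds the column from those counts and recomputes the summary with the standard library; it assumes that "Distinct" counts different values while "Unique" counts values occurring exactly once, and that length is measured in Unicode code points of the NFC-normalized string. Variable names are illustrative only.

```python
import statistics
import unicodedata
from collections import Counter

# Value counts taken from the Common Values table.
counts = {"물환경": 61, "생활환경": 32, "자연환경": 7}

# Rebuild the 100-row column (row order does not matter for these statistics).
column = [unicodedata.normalize("NFC", value)
          for value, n in counts.items() for _ in range(n)]

freq = Counter(column)
distinct = len(freq)                               # number of different values
unique = sum(1 for n in freq.values() if n == 1)   # values seen exactly once

lengths = [len(value) for value in column]
mean_length = statistics.mean(lengths)
median_length = statistics.median(lengths)

print(distinct, unique, min(lengths), max(lengths), mean_length, median_length)
```

Under these assumptions the mean length is (61 × 3 + 39 × 4) / 100 = 3.39, matching the report, and "Unique: 0" simply means that no category appears only once.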

Length

Histogram of lengths of the category

도메인 하위 카테고리명
Categorical

HIGH CORRELATION 

Distinct: 11
Distinct (%): 11.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 3
Mean length: 2.77
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물재난
2nd row: 물재난
3rd row: 물재난
4th row: 물재난
5th row: 물재난

Common Values

Value | Count | Frequency (%)
호소 | 20 | 20.0%
폐기물 | 17 | 17.0%
물재난 | 12 | 12.0%
하천 | 11 | 11.0%
화학물질 | 8 | 8.0%
하수도 | 7 | 7.0%
대기 | 7 | 7.0%
지하수 | 6 | 6.0%
상수도 | 5 | 5.0%
기후변화 | 5 | 5.0%

Length

Histogram of lengths of the category

SNS 채널명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: All
2nd row: All
3rd row: All
4th row: All
5th row: All

Common Values

Value | Count | Frequency (%)
All | 100 | 100.0%

Length

Histogram of lengths of the category

시도명
Categorical

Distinct: 13
Distinct (%): 13.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 광주
2nd row: 대구
3rd row: 대구
4th row: 대전
5th row: 부산

Common Values

Value | Count | Frequency (%)
경기 | 44 | 44.0%
서울 | 10 | 10.0%
경북 | 7 | 7.0%
대구 | 6 | 6.0%
광주 | 5 | 5.0%
부산 | 5 | 5.0%
경남 | 5 | 5.0%
대전 | 4 | 4.0%
울산 | 4 | 4.0%
충남 | 4 | 4.0%
Other values (3) | 6 | 6.0%

Length

Histogram of lengths of the category

시군구명
Text

Distinct: 59
Distinct (%): 59.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 3
Mean length: 2.74
Min length: 2

Characters and Unicode

Total characters: 274
Distinct characters: 68
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 39
Unique (%): 39.0%

Sample

1st row: 동구
2nd row: 동구
3rd row: 수성구
4th row: 동구
5th row: 동구

Value | Count | Frequency (%)
북구 | 12 | 12.0%
동구 | 6 | 6.0%
중구 | 6 | 6.0%
수원시 | 3 | 3.0%
여주시 | 3 | 3.0%
광주시 | 3 | 3.0%
오산시 | 2 | 2.0%
광명시 | 2 | 2.0%
일산 | 2 | 2.0%
가평군 | 2 | 2.0%
Other values (49) | 59 | 59.0%

Most occurring characters

ValueCountFrequency (%)
53
19.3%
37
 
13.5%
14
 
5.1%
9
 
3.3%
9
 
3.3%
8
 
2.9%
8
 
2.9%
8
 
2.9%
7
 
2.6%
7
 
2.6%
Other values (58) 114
41.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 274 | 100.0%


Most occurring scripts

Value | Count | Frequency (%)
Hangul | 274 | 100.0%


Most occurring blocks

Value | Count | Frequency (%)
Hangul | 274 | 100.0%


일간시도언급량
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 97 | 97.0%
2 | 3 | 3.0%

Length

Histogram of lengths of the category

Interactions


Correlations

Variable | 일간지역언급량연번 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | 시도명 | 시군구명 | 일간시도언급량
일간지역언급량연번 | 1.000 | 0.916 | 0.940 | 0.496 | 0.803 | 0.308
환경플랫폼 하위 도메인명 | 0.916 | 1.000 | 1.000 | 0.304 | 0.870 | 0.088
도메인 하위 카테고리명 | 0.940 | 1.000 | 1.000 | 0.524 | 0.895 | 0.276
시도명 | 0.496 | 0.304 | 0.524 | 1.000 | 0.851 | 0.000
시군구명 | 0.803 | 0.870 | 0.895 | 0.851 | 1.000 | 0.258
일간시도언급량 | 0.308 | 0.088 | 0.276 | 0.000 | 0.258 | 1.000

Variable | 시도명 | 도메인 하위 카테고리명 | 일간시도언급량 | 환경플랫폼 하위 도메인명
시도명 | 1.000 | 0.235 | 0.000 | 0.167
도메인 하위 카테고리명 | 0.235 | 1.000 | 0.250 | 0.958
일간시도언급량 | 0.000 | 0.250 | 1.000 | 0.144
환경플랫폼 하위 도메인명 | 0.167 | 0.958 | 0.144 | 1.000

Variable | 일간지역언급량연번 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | 시도명 | 일간시도언급량
일간지역언급량연번 | 1.000 | 0.848 | 0.771 | 0.224 | 0.225
환경플랫폼 하위 도메인명 | 0.848 | 1.000 | 0.958 | 0.167 | 0.144
도메인 하위 카테고리명 | 0.771 | 0.958 | 1.000 | 0.235 | 0.250
시도명 | 0.224 | 0.167 | 0.235 | 1.000 | 0.000
일간시도언급량 | 0.225 | 0.144 | 0.250 | 0.000 | 1.000
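
The three "High correlation" alerts near the top of the report are consistent with the last (five-variable) matrix in this section. As a minimal sketch, the code below flags every variable whose absolute correlation with another variable exceeds a cut-off. The cut-off of 0.5 and the function name `high_correlations` are assumptions made for illustration; any threshold between 0.25 and 0.771 yields the same three flagged variables for this matrix.

```python
# Five-variable correlation matrix copied from the last table in this section.
names = [
    "일간지역언급량연번",
    "환경플랫폼 하위 도메인명",
    "도메인 하위 카테고리명",
    "시도명",
    "일간시도언급량",
]
matrix = [
    [1.000, 0.848, 0.771, 0.224, 0.225],
    [0.848, 1.000, 0.958, 0.167, 0.144],
    [0.771, 0.958, 1.000, 0.235, 0.250],
    [0.224, 0.167, 0.235, 1.000, 0.000],
    [0.225, 0.144, 0.250, 0.000, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, not stated in the report


def high_correlations(names, matrix, threshold):
    """Map each variable to the other variables it correlates with above threshold."""
    flagged = {}
    for i, name in enumerate(names):
        partners = [
            names[j]
            for j, value in enumerate(matrix[i])
            if j != i and abs(value) > threshold
        ]
        if partners:
            flagged[name] = partners
    return flagged


flagged = high_correlations(names, matrix, THRESHOLD)
for name, partners in flagged.items():
    print(f"{name} is highly correlated with: {', '.join(partners)}")
```

Each flagged variable ends up with exactly two partners, which matches the "and 1 other field" wording of the alerts. 시도명 and 일간시도언급량 stay below the cut-off against every other variable and are not flagged.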

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

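First rows

Index | 일간지역언급량연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 시도명 | 시군구명 | 일간시도언급량
0 | 1 | 2021-04-01 | 물환경 | 물재난 | All | 광주 | 동구 | 1
1 | 2 | 2021-04-01 | 물환경 | 물재난 | All | 대구 | 동구 | 1
2 | 3 | 2021-04-01 | 물환경 | 물재난 | All | 대구 | 수성구 | 1
3 | 4 | 2021-04-01 | 물환경 | 물재난 | All | 대전 | 동구 | 1
4 | 5 | 2021-04-01 | 물환경 | 물재난 | All | 부산 | 동구 | 1
5 | 6 | 2021-04-01 | 물환경 | 물재난 | All | 부산 | 해운대구 | 1
6 | 7 | 2021-04-01 | 물환경 | 물재난 | All | 서울 | 강북구 | 1
7 | 8 | 2021-04-01 | 물환경 | 물재난 | All | 서울 | 도봉구 | 1
8 | 9 | 2021-04-01 | 물환경 | 물재난 | All | 울산 | 동구 | 1
9 | 10 | 2021-04-01 | 물환경 | 물재난 | All | 인천 | 동구 | 1

Last rows

Index | 일간지역언급량연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 시도명 | 시군구명 | 일간시도언급량
90 | 91 | 2021-04-01 | 생활환경 | 화학물질 | All | 서울 | 중구 | 1
91 | 92 | 2021-04-01 | 생활환경 | 화학물질 | All | 울산 | 중구 | 1
92 | 93 | 2021-04-01 | 생활환경 | 화학물질 | All | 인천 | 중구 | 1
93 | 94 | 2021-04-01 | 자연환경 | 기상변화 | All | 경북 | 경산시 | 1
94 | 95 | 2021-04-01 | 자연환경 | 기상변화 | All | 대전 | 유성구 | 1
95 | 96 | 2021-04-01 | 자연환경 | 기후변화 | All | 경기 | 구리시 | 2
96 | 97 | 2021-04-01 | 자연환경 | 기후변화 | All | 경기 | 파주시 | 1
97 | 98 | 2021-04-01 | 자연환경 | 기후변화 | All | 경북 | 북구 | 1
98 | 99 | 2021-04-01 | 자연환경 | 기후변화 | All | 광주 | 북구 | 1
99 | 100 | 2021-04-01 | 자연환경 | 기후변화 | All | 광주 | 서구 | 1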