Overview

Dataset statistics

Number of variables: 8
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.6 KiB
Average record size in memory: 67.3 B

Variable types

Numeric: 1
DateTime: 1
Categorical: 5
Text: 1

Dataset

Description: Sample data
Author: 성균관대학교 산학협력단 (Sungkyunkwan University industry-academic cooperation foundation)
URL: https://www.bigdata-environment.kr/user/data_market/detail.do?id=8a03ac90-2fc9-11ea-94b6-73a02796bba4

Alerts

Constant: 연월일 has constant value ""
Constant: SNS 채널명 has constant value ""
High correlation: 도메인 하위 카테고리명 is highly overall correlated with 일간지역언급량연번 and 1 other field
High correlation: 환경플랫폼 하위 도메인명 is highly overall correlated with 도메인 하위 카테고리명
High correlation: 일간지역언급량연번 is highly overall correlated with 도메인 하위 카테고리명
Imbalance: 환경플랫폼 하위 도메인명 is highly imbalanced (85.9%)
Imbalance: 일간시도언급량 is highly imbalanced (70.5%)
Unique: 일간지역언급량연번 has unique values
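
The two imbalance percentages above are consistent with a score of one minus the normalized Shannon entropy of the class shares. The sketch below is an assumption about how such a score can be computed, not a copy of the profiling library's code; the function name `imbalance_score` is hypothetical, and the class counts are taken from the Common Values tables of the two variables.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class shares and k is the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no meaningful imbalance score
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Class counts read from the report (assumed complete).
domain_score = imbalance_score([98, 2])      # 환경플랫폼 하위 도메인명
mention_score = imbalance_score([92, 6, 2])  # 일간시도언급량

print(f"{domain_score:.1%}, {mention_score:.1%}")
```

A score near 100% means almost all rows fall into one class; a score of 0% means the classes are evenly populated.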

Reproduction

Analysis started: 2024-04-22 00:56:31.587701
Analysis finished: 2024-04-22 00:56:33.472653
Duration: 1.88 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
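
The reported duration follows from the two timestamps. A minimal check using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction block above.
started = datetime.fromisoformat("2024-04-22 00:56:31.587701")
finished = datetime.fromisoformat("2024-04-22 00:56:33.472653")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```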

Variables

일간지역언급량연번 (serial number of the daily regional mention count)
Real number (ℝ)

Alerts: High correlation, Unique

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.5
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5.95
Q1: 25.75
Median: 50.5
Q3: 75.25
95-th percentile: 95.05
Maximum: 100
Range: 99
Interquartile range (IQR): 49.5
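
The quantiles above match linear interpolation between order statistics for the values 1 to 100. The sketch below assumes that interpolation rule (the default in pandas and NumPy); the helper `quantile` is hypothetical and written here only to make the arithmetic visible.

```python
def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two nearest order statistics."""
    position = (len(sorted_values) - 1) * q
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


# The column is a serial number running from 1 to 100 (see Minimum, Maximum, Distinct).
serials = list(range(1, 101))

p5, q1, median, q3, p95 = (quantile(serials, q) for q in (0.05, 0.25, 0.5, 0.75, 0.95))
iqr = q3 - q1
value_range = serials[-1] - serials[0]
```

For example, the 5-th percentile sits at position 99 × 0.05 = 4.95, which is 95% of the way from the fifth value (5) to the sixth (6).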

Descriptive statistics

Standard deviation: 29.011492
Coefficient of variation (CV): 0.57448499
Kurtosis: -1.2
Mean: 50.5
Median Absolute Deviation (MAD): 25
Skewness: 0
Sum: 5050
Variance: 841.66667
Monotonicity: Strictly increasing
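
Most of the descriptive statistics above can be reproduced with the standard library, assuming the column holds the integers 1 to 100 and that the variance uses the sample (n - 1) denominator, which is what the reported 841.66667 implies. Kurtosis and skewness are left out of this sketch.

```python
import statistics

serials = list(range(1, 101))  # assumed contents of the column

mean = statistics.mean(serials)
variance = statistics.variance(serials)  # sample variance, n - 1 denominator
std = statistics.stdev(serials)
cv = std / mean                          # coefficient of variation
med = statistics.median(serials)
mad = statistics.median(abs(x - med) for x in serials)  # median absolute deviation
total = sum(serials)

# Strictly increasing: every value is larger than its predecessor.
strictly_increasing = all(a < b for a, b in zip(serials, serials[1:]))
```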
Histogram with fixed size bins (bins=50)
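
As a rough illustration of what 50 fixed-size bins mean for this column, the sketch below assumes equal-width bins spanning the minimum to the maximum, with the maximum placed in the last bin (the NumPy convention). The report's plotting code may choose its bin edges differently.

```python
serials = list(range(1, 101))  # assumed contents of the column
bins = 50

low, high = min(serials), max(serials)
width = (high - low) / bins  # 99 / 50 = 1.98

counts = [0] * bins
for value in serials:
    # Clamp so that the maximum value lands in the last bin.
    index = min(int((value - low) / width), bins - 1)
    counts[index] += 1
```

Under these assumptions every bin holds exactly two values, so the histogram is flat.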

Common values

Value | Count | Frequency (%)
1 | 1 | 1.0%
65 | 1 | 1.0%
75 | 1 | 1.0%
74 | 1 | 1.0%
73 | 1 | 1.0%
72 | 1 | 1.0%
71 | 1 | 1.0%
70 | 1 | 1.0%
69 | 1 | 1.0%
68 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 1.0%
2 | 1 | 1.0%
3 | 1 | 1.0%
4 | 1 | 1.0%
5 | 1 | 1.0%
6 | 1 | 1.0%
7 | 1 | 1.0%
8 | 1 | 1.0%
9 | 1 | 1.0%
10 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
100 | 1 | 1.0%
99 | 1 | 1.0%
98 | 1 | 1.0%
97 | 1 | 1.0%
96 | 1 | 1.0%
95 | 1 | 1.0%
94 | 1 | 1.0%
93 | 1 | 1.0%
92 | 1 | 1.0%
91 | 1 | 1.0%

연월일 (date)
Date

Alerts: Constant

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2021-01-01 00:00:00
Maximum: 2021-01-01 00:00:00
Histogram with fixed size bins (bins=1)

환경플랫폼 하위 도메인명 (environment platform subdomain name)
Categorical

Alerts: High correlation, Imbalance

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Category counts (value: count): 물환경 (water environment): 98; 생활환경 (living environment): 2

Length

Max length: 4
Median length: 3
Mean length: 3.02
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물환경
2nd row: 물환경
3rd row: 물환경
4th row: 물환경
5th row: 물환경

Common Values

Value | Count | Frequency (%)
물환경 | 98 | 98.0%
생활환경 | 2 | 2.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
물환경 | 98 | 98.0%
생활환경 | 2 | 2.0%

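도메인 하위 카테고리명 (domain subcategory name)
Categorical

Alerts: High correlation

Distinct: 7
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Category counts (value: count): 호소 (lakes and marshes): 52; 하수도 (sewerage): 17; 상수도 (water supply): 12; 물재난 (water disaster): 9; 지하수 (groundwater): 7; other values (2): 3

Length

Max length: 3
Median length: 2
Mean length: 2.45
Min length: 2

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 물재난
2nd row: 물재난
3rd row: 물재난
4th row: 물재난
5th row: 물재난

Common Values

Value | Count | Frequency (%)
호소 | 52 | 52.0%
하수도 | 17 | 17.0%
상수도 | 12 | 12.0%
물재난 | 9 | 9.0%
지하수 | 7 | 7.0%
대기 | 2 | 2.0%
하천 | 1 | 1.0%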
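
The Distinct, Unique, and Length figures for this variable can be recomputed from the counts in the Common Values table. A minimal sketch, assuming "Unique" counts the values that occur exactly once and that length is measured in characters:

```python
import statistics

# Counts copied from the Common Values table above.
category_counts = {
    "호소": 52, "하수도": 17, "상수도": 12, "물재난": 9,
    "지하수": 7, "대기": 2, "하천": 1,
}

total = sum(category_counts.values())
distinct = len(category_counts)
unique = sum(1 for count in category_counts.values() if count == 1)

# One length per row, so frequent categories weigh more.
lengths = [len(name) for name, count in category_counts.items() for _ in range(count)]
mean_length = statistics.mean(lengths)
median_length = statistics.median(lengths)
```

Only 하천 occurs once, which gives the single unique value, and the 55 rows with two-character names pull the median length down to 2.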

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
호소 | 52 | 52.0%
하수도 | 17 | 17.0%
상수도 | 12 | 12.0%
물재난 | 9 | 9.0%
지하수 | 7 | 7.0%
대기 | 2 | 2.0%
하천 | 1 | 1.0%

SNS 채널명 (SNS channel name)
Categorical

Alerts: Constant

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Category counts (value: count): All: 100

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: All
2nd row: All
3rd row: All
4th row: All
5th row: All

Common Values

Value | Count | Frequency (%)
All | 100 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
all | 100 | 100.0%

시도명 (province or metropolitan city name)
Categorical

Distinct: 14
Distinct (%): 14.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Category counts (value: count): 경기: 32; 인천: 10; 서울: 9; 강원: 7; 전남: 6; other values (9): 36

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 경기
2nd row: 경기
3rd row: 경북
4th row: 서울
5th row: 전남

Common Values

Value | Count | Frequency (%)
경기 | 32 | 32.0%
인천 | 10 | 10.0%
서울 | 9 | 9.0%
강원 | 7 | 7.0%
전남 | 6 | 6.0%
대구 | 6 | 6.0%
대전 | 6 | 6.0%
부산 | 6 | 6.0%
충남 | 5 | 5.0%
광주 | 4 | 4.0%
Other values (4) | 9 | 9.0%

Length

Histogram of lengths of the category

시군구명 (city, county, or district name)
Text

Distinct: 58
Distinct (%): 58.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 3
Mean length: 2.69
Min length: 2

Characters and Unicode

Total characters: 269
Distinct characters: 63
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 45
Unique (%): 45.0%

Sample

1st row: 시흥시
2nd row: 연천군
3rd row: 경주시
4th row: 강북구
5th row: 광양시

Value | Count | Frequency (%)
서구 | 15 | 15.0%
중구 | 12 | 12.0%
동구 | 6 | 6.0%
연천군 | 3 | 3.0%
양평군 | 3 | 3.0%
제주시 | 2 | 2.0%
용인시 | 2 | 2.0%
강릉시 | 2 | 2.0%
단원구 | 2 | 2.0%
시흥시 | 2 | 2.0%
Other values (48) | 51 | 51.0%

Most occurring characters

Value | Count | Frequency (%)
 | 49 | 18.2%
 | 35 | 13.0%
 | 20 | 7.4%
 | 17 | 6.3%
 | 12 | 4.5%
 | 9 | 3.3%
 | 9 | 3.3%
 | 9 | 3.3%
 | 7 | 2.6%
 | 7 | 2.6%
Other values (53) | 95 | 35.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 269 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 49 | 18.2%
 | 35 | 13.0%
 | 20 | 7.4%
 | 17 | 6.3%
 | 12 | 4.5%
 | 9 | 3.3%
 | 9 | 3.3%
 | 9 | 3.3%
 | 7 | 2.6%
 | 7 | 2.6%
Other values (53) | 95 | 35.3%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 269 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 49 | 18.2%
 | 35 | 13.0%
 | 20 | 7.4%
 | 17 | 6.3%
 | 12 | 4.5%
 | 9 | 3.3%
 | 9 | 3.3%
 | 9 | 3.3%
 | 7 | 2.6%
 | 7 | 2.6%
Other values (53) | 95 | 35.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 269 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 49 | 18.2%
 | 35 | 13.0%
 | 20 | 7.4%
 | 17 | 6.3%
 | 12 | 4.5%
 | 9 | 3.3%
 | 9 | 3.3%
 | 9 | 3.3%
 | 7 | 2.6%
 | 7 | 2.6%
Other values (53) | 95 | 35.3%

일간시도언급량 (daily province-level mention count)
Categorical

Alerts: Imbalance

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Category counts (value: count): 1: 92; 2: 6; 3: 2

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 92 | 92.0%
2 | 6 | 6.0%
3 | 2 | 2.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
1 | 92 | 92.0%
2 | 6 | 6.0%
3 | 2 | 2.0%

Interactions

[Figure: interactions plot]

Correlations

Variable | 일간지역언급량연번 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | 시도명 | 시군구명 | 일간시도언급량
일간지역언급량연번 | 1.000 | 0.418 | 0.889 | 0.686 | 0.707 | 0.269
환경플랫폼 하위 도메인명 | 0.418 | 1.000 | 1.000 | 0.000 | 0.773 | 0.000
도메인 하위 카테고리명 | 0.889 | 1.000 | 1.000 | 0.000 | 0.789 | 0.000
시도명 | 0.686 | 0.000 | 0.000 | 1.000 | 0.816 | 0.000
시군구명 | 0.707 | 0.773 | 0.789 | 0.816 | 1.000 | 0.245
일간시도언급량 | 0.269 | 0.000 | 0.000 | 0.000 | 0.245 | 1.000

Variable | 도메인 하위 카테고리명 | 일간시도언급량 | 환경플랫폼 하위 도메인명 | 시도명
도메인 하위 카테고리명 | 1.000 | 0.000 | 0.974 | 0.000
일간시도언급량 | 0.000 | 1.000 | 0.000 | 0.000
환경플랫폼 하위 도메인명 | 0.974 | 0.000 | 1.000 | 0.000
시도명 | 0.000 | 0.000 | 0.000 | 1.000
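
The 0.974 between 도메인 하위 카테고리명 and 환경플랫폼 하위 도메인명 in the matrix above is consistent with a bias-corrected Cramér's V. The sketch below rests on two assumptions: that the report uses this statistic for categorical pairs, and that the contingency table looks as written, with every 대기 row belonging to 생활환경 and all other categories to 물환경 (inferred from the category counts and the sample rows, not stated by the report). The function name is hypothetical.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic without continuity correction.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(table), len(table[0])
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denominator = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denominator) if denominator > 0 else 0.0


# Columns: 호소, 하수도, 상수도, 물재난, 지하수, 대기, 하천 (assumed cross-tabulation).
contingency = [
    [52, 17, 12, 9, 7, 0, 1],  # 물환경
    [0, 0, 0, 0, 0, 2, 0],     # 생활환경
]

v = cramers_v_corrected(contingency)
print(round(v, 3))
```

Without the bias correction, this perfectly nested table would give a value of exactly 1; the correction is what pulls it slightly below 1 for a sample of 100 rows.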

Variable | 일간지역언급량연번 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | 시도명 | 일간시도언급량
일간지역언급량연번 | 1.000 | 0.306 | 0.716 | 0.350 | 0.158
환경플랫폼 하위 도메인명 | 0.306 | 1.000 | 0.974 | 0.000 | 0.000
도메인 하위 카테고리명 | 0.716 | 0.974 | 1.000 | 0.000 | 0.000
시도명 | 0.350 | 0.000 | 0.000 | 1.000 | 0.000
일간시도언급량 | 0.158 | 0.000 | 0.000 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

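First rows

Index | 일간지역언급량연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 시도명 | 시군구명 | 일간시도언급량
0 | 1 | 2021-01-01 | 물환경 | 물재난 | All | 경기 | 시흥시 | 1
1 | 2 | 2021-01-01 | 물환경 | 물재난 | All | 경기 | 연천군 | 1
2 | 3 | 2021-01-01 | 물환경 | 물재난 | All | 경북 | 경주시 | 1
3 | 4 | 2021-01-01 | 물환경 | 물재난 | All | 서울 | 강북구 | 1
4 | 5 | 2021-01-01 | 물환경 | 물재난 | All | 전남 | 광양시 | 1
5 | 6 | 2021-01-01 | 물환경 | 물재난 | All | 전남 | 구례군 | 1
6 | 7 | 2021-01-01 | 물환경 | 물재난 | All | 전북 | 순창군 | 1
7 | 8 | 2021-01-01 | 물환경 | 물재난 | All | 충남 | 당진시 | 1
8 | 9 | 2021-01-01 | 물환경 | 물재난 | All | 충남 | 천안시 | 1
9 | 10 | 2021-01-01 | 물환경 | 상수도 | All | 경기 | 양평군 | 1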
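
Last rows

Index | 일간지역언급량연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 시도명 | 시군구명 | 일간시도언급량
90 | 91 | 2021-01-01 | 물환경 | 호소 | All | 전남 | 순천시 | 1
91 | 92 | 2021-01-01 | 물환경 | 호소 | All | 전남 | 신안군 | 1
92 | 93 | 2021-01-01 | 물환경 | 호소 | All | 전북 | 군산시 | 1
93 | 94 | 2021-01-01 | 물환경 | 호소 | All | 전북 | 부안군 | 1
94 | 95 | 2021-01-01 | 물환경 | 호소 | All | 전북 | 부안군 | 1
95 | 96 | 2021-01-01 | 물환경 | 호소 | All | 충남 | 서산시 | 1
96 | 97 | 2021-01-01 | 물환경 | 호소 | All | 충남 | 아산시 | 1
97 | 98 | 2021-01-01 | 물환경 | 호소 | All | 충남 | 태안군 | 1
98 | 99 | 2021-01-01 | 생활환경 | 대기 | All | 경기 | 고양시 | 1
99 | 100 | 2021-01-01 | 생활환경 | 대기 | All | 경기 | 단원구 | 1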
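
The Constant alerts for 연월일 and SNS 채널명 can be illustrated with a few of the sample rows: a column is constant when it holds a single distinct value. The sketch below uses four rows copied from the sample (indices 0, 9, 90, and 99) and is only an illustration of the check, not the library's implementation.

```python
columns = [
    "일간지역언급량연번", "연월일", "환경플랫폼 하위 도메인명", "도메인 하위 카테고리명",
    "SNS 채널명", "시도명", "시군구명", "일간시도언급량",
]

# Four rows taken from the sample tables above.
rows = [
    (1, "2021-01-01", "물환경", "물재난", "All", "경기", "시흥시", 1),
    (10, "2021-01-01", "물환경", "상수도", "All", "경기", "양평군", 1),
    (91, "2021-01-01", "물환경", "호소", "All", "전남", "순천시", 1),
    (100, "2021-01-01", "생활환경", "대기", "All", "경기", "단원구", 1),
]

# Transpose the rows into columns and keep those with one distinct value.
constant_columns = [
    name for name, values in zip(columns, zip(*rows)) if len(set(values)) == 1
]
print(constant_columns)
```

On these four rows 일간시도언급량 also looks constant, although the full dataset contains the values 1, 2, and 3. A constant check is only reliable when it runs over all 100 observations.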