Overview

Dataset statistics

Number of variables: 8
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.6 KiB
Average record size in memory: 67.3 B

Variable types

Numeric: 2
Categorical: 5
Text: 1

Dataset

Description: Sample data
Author: 성균관대학교 산학협력단
URL: https://www.bigdata-environment.kr/user/data_market/detail.do?id=d58abcc0-2fca-11ea-94b6-73a02796bba4

Alerts

연월일 has constant value "2020-07-01" (Constant)
SNS 채널명 has constant value "All" (Constant)
환경플랫폼 하위 도메인명 is highly overall correlated with 주간지역언급량연번 and 1 other field (High correlation)
도메인 하위 카테고리명 is highly overall correlated with 주간지역언급량연번 and 1 other field (High correlation)
주간지역언급량연번 is highly overall correlated with 환경플랫폼 하위 도메인명 and 1 other field (High correlation)
주간지역언급량연번 has unique values (Unique)
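
The Constant and Unique alerts can be cross-checked against the distinct counts reported per variable. The sketch below assumes two simple rules (one distinct value means constant, as many distinct values as rows means unique); these rules and the `classify` helper are illustrative assumptions, not the profiler's actual implementation.

```python
from typing import Optional

# Distinct counts per column, copied from the Variables section of this report.
N_ROWS = 100
DISTINCT = {
    "주간지역언급량연번": 100,
    "연월일": 1,
    "환경플랫폼 하위 도메인명": 2,
    "도메인 하위 카테고리명": 8,
    "SNS 채널명": 1,
    "시도명": 16,
    "시군구명": 58,
    "주간시도언급량": 8,
}


def classify(distinct: int, n_rows: int) -> Optional[str]:
    """Return an alert label for a column, or None (assumed rules, hypothetical helper)."""
    if distinct == 1:
        return "Constant"
    if distinct == n_rows:
        return "Unique"
    return None


alerts = {name: classify(d, N_ROWS) for name, d in DISTINCT.items()}
flagged = {name: label for name, label in alerts.items() if label is not None}

for name, label in flagged.items():
    print(f"{name}: {label}")
```

Under these assumed rules the flagged columns are the same three that carry Constant or Unique alerts in this report.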

Reproduction

Analysis started: 2023-12-10 11:09:46.632988
Analysis finished: 2023-12-10 11:09:48.114743
Duration: 1.48 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
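
As a quick arithmetic check, the reported duration follows from the two timestamps. This is a minimal sketch using only the standard library; the format string is an assumption about how the timestamps are laid out.

```python
from datetime import datetime

# Assumed layout of the timestamps shown under Reproduction.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2023-12-10 11:09:46.632988", FMT)
finished = datetime.strptime("2023-12-10 11:09:48.114743", FMT)

# Elapsed wall-clock time in seconds, rounded to two decimals for display.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```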

Variables

주간지역언급량연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.5
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 5.95
Q1: 25.75
Median: 50.5
Q3: 75.25
95th percentile: 95.05
Maximum: 100
Range: 99
Interquartile range (IQR): 49.5

Descriptive statistics

Standard deviation: 29.011492
Coefficient of variation (CV): 0.57448499
Kurtosis: -1.2
Mean: 50.5
Median Absolute Deviation (MAD): 25
Skewness: 0
Sum: 5050
Variance: 841.66667
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
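
The quantile and descriptive statistics for 주간지역언급량연번 can be reproduced by hand. The sketch below assumes the column holds exactly the integers 1 to 100 (consistent with the reported minimum, maximum, sum and distinct count), uses the sample variance with an n - 1 denominator, and uses linearly interpolated percentiles; the `percentile` helper is hypothetical and written only for this illustration.

```python
import statistics

values = list(range(1, 101))  # assumption: the column holds exactly 1..100


def percentile(sorted_values, q):
    """Linearly interpolated percentile of pre-sorted data, q in [0, 1]."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)
cv = std / mean                          # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)  # median absolute deviation
q1, q3 = percentile(values, 0.25), percentile(values, 0.75)
p5, p95 = percentile(values, 0.05), percentile(values, 0.95)

print(mean, round(variance, 5), round(std, 6), round(cv, 8))
print(p5, q1, median, q3, p95, q3 - q1, mad)
```

Under these assumptions the computed values agree with the figures listed for this variable.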

Common Values

Value | Count | Frequency (%)
1 | 1 | 1.0%
65 | 1 | 1.0%
75 | 1 | 1.0%
74 | 1 | 1.0%
73 | 1 | 1.0%
72 | 1 | 1.0%
71 | 1 | 1.0%
70 | 1 | 1.0%
69 | 1 | 1.0%
68 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Minimum values

Value | Count | Frequency (%)
1 | 1 | 1.0%
2 | 1 | 1.0%
3 | 1 | 1.0%
4 | 1 | 1.0%
5 | 1 | 1.0%
6 | 1 | 1.0%
7 | 1 | 1.0%
8 | 1 | 1.0%
9 | 1 | 1.0%
10 | 1 | 1.0%

Maximum values

Value | Count | Frequency (%)
100 | 1 | 1.0%
99 | 1 | 1.0%
98 | 1 | 1.0%
97 | 1 | 1.0%
96 | 1 | 1.0%
95 | 1 | 1.0%
94 | 1 | 1.0%
93 | 1 | 1.0%
92 | 1 | 1.0%
91 | 1 | 1.0%

연월일
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
2020-07-01 | 100

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020-07-01
2nd row: 2020-07-01
3rd row: 2020-07-01
4th row: 2020-07-01
5th row: 2020-07-01

Common Values

Value | Count | Frequency (%)
2020-07-01 | 100 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of the value counts listed under Common Values]

환경플랫폼 하위 도메인명
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
물환경 | 86
생활환경 | 14

Length

Max length: 4
Median length: 3
Mean length: 3.14
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물환경
2nd row: 물환경
3rd row: 물환경
4th row: 물환경
5th row: 물환경

Common Values

Value | Count | Frequency (%)
물환경 | 86 | 86.0%
생활환경 | 14 | 14.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of the value counts listed under Common Values]

도메인 하위 카테고리명
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 8.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
하천 | 33
호소 | 24
하수도 | 12
상수도 | 9
폐기물 | 8
Other values (3) | 14

Length

Max length: 3
Median length: 2
Mean length: 2.37
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물재난
2nd row: 물재난
3rd row: 물재난
4th row: 물재난
5th row: 상수도

Common Values

Value | Count | Frequency (%)
하천 | 33 | 33.0%
호소 | 24 | 24.0%
하수도 | 12 | 12.0%
상수도 | 9 | 9.0%
폐기물 | 8 | 8.0%
대기 | 6 | 6.0%
물재난 | 4 | 4.0%
지하수 | 4 | 4.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of the value counts listed under Common Values]
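
The length statistics of 도메인 하위 카테고리명 follow directly from its value counts. The sketch below expands the counts into one length per observation and summarises them; it assumes length is measured in code points (one per precomposed Hangul syllable), which is how Python's `len` counts these strings.

```python
import statistics

# Value counts copied from the Common Values table of 도메인 하위 카테고리명.
counts = {"하천": 33, "호소": 24, "하수도": 12, "상수도": 9,
          "폐기물": 8, "대기": 6, "물재난": 4, "지하수": 4}

# One length per observation; len() counts code points (assumed unit of length).
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

length_stats = {
    "max": max(lengths),
    "median": statistics.median(lengths),
    "mean": statistics.mean(lengths),
    "min": min(lengths),
}
print(length_stats)
```

With 63 two-character and 37 three-character observations, the mean length is 2.37 and the median is 2, in line with the Length figures for this variable.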

SNS 채널명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
All | 100

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: All
2nd row: All
3rd row: All
4th row: All
5th row: All

Common Values

Value | Count | Frequency (%)
All | 100 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of the value counts listed under Common Values]

시도명
Categorical

Distinct: 16
Distinct (%): 16.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
경기 | 30
경북 | 10
대구 | 10
강원 | 7
부산 | 6
Other values (11) | 37

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 경남
2nd row: 서울
3rd row: 전북
4th row: 제주
5th row: 경기

Common Values

Value | Count | Frequency (%)
경기 | 30 | 30.0%
경북 | 10 | 10.0%
대구 | 10 | 10.0%
강원 | 7 | 7.0%
부산 | 6 | 6.0%
광주 | 5 | 5.0%
울산 | 5 | 5.0%
충남 | 5 | 5.0%
대전 | 5 | 5.0%
충북 | 4 | 4.0%
Other values (6) | 13 | 13.0%

Length

Histogram of lengths of the category

시군구명
Text

Distinct: 58
Distinct (%): 58.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 3
Mean length: 2.74
Min length: 2

Characters and Unicode

Total characters: 274
Distinct characters: 64
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
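
To illustrate what these character properties look like, the sketch below inspects the five sample rows listed for this variable with Python's standard `unicodedata` module. It covers only those five values, not the full column, and it uses the first word of each character's Unicode name as a stand-in for the script, which is an assumption made for this example.

```python
import unicodedata
from collections import Counter

# The five sample rows listed for this variable.
samples = ["함양군", "서초구", "부안군", "서귀포시", "양평군"]

chars = [ch for value in samples for ch in value]
char_counts = Counter(chars)

# General category of each character; "Lo" is the code for "Other Letter".
categories = Counter(unicodedata.category(ch) for ch in chars)

# Assumption: the first word of the character name approximates the script.
scripts = Counter(unicodedata.name(ch).split()[0] for ch in chars)

print(len(chars), categories, scripts)
print(char_counts.most_common(1))
```

Every character in these samples falls into the single category "Other Letter" (`Lo`) and has a name starting with `HANGUL`, matching the one category and one script reported for the full column.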

Unique

Unique: 40
Unique (%): 40.0%

Sample

1st row: 함양군
2nd row: 서초구
3rd row: 부안군
4th row: 서귀포시
5th row: 양평군

Value | Count | Frequency (%)
동구 | 12 | 12.0%
남구 | 10 | 10.0%
북구 | 5 | 5.0%
달성군 | 4 | 4.0%
평택시 | 3 | 3.0%
청주시 | 2 | 2.0%
일산 | 2 | 2.0%
광명시 | 2 | 2.0%
고양시 | 2 | 2.0%
대덕구 | 2 | 2.0%
Other values (48) | 56 | 56.0%

Most occurring characters

Value | Count | Frequency (%)
 | 43 | 15.7%
 | 39 | 14.2%
 | 20 | 7.3%
 | 16 | 5.8%
 | 13 | 4.7%
 | 11 | 4.0%
 | 11 | 4.0%
 | 9 | 3.3%
 | 6 | 2.2%
 | 5 | 1.8%
Other values (54) | 101 | 36.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 274 | 100.0%

Most frequent character per category

Other Letter

Value | Count | Frequency (%)
 | 43 | 15.7%
 | 39 | 14.2%
 | 20 | 7.3%
 | 16 | 5.8%
 | 13 | 4.7%
 | 11 | 4.0%
 | 11 | 4.0%
 | 9 | 3.3%
 | 6 | 2.2%
 | 5 | 1.8%
Other values (54) | 101 | 36.9%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 274 | 100.0%

Most frequent character per script

Hangul

Value | Count | Frequency (%)
 | 43 | 15.7%
 | 39 | 14.2%
 | 20 | 7.3%
 | 16 | 5.8%
 | 13 | 4.7%
 | 11 | 4.0%
 | 11 | 4.0%
 | 9 | 3.3%
 | 6 | 2.2%
 | 5 | 1.8%
Other values (54) | 101 | 36.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 274 | 100.0%

Most frequent character per block

Hangul

Value | Count | Frequency (%)
 | 43 | 15.7%
 | 39 | 14.2%
 | 20 | 7.3%
 | 16 | 5.8%
 | 13 | 4.7%
 | 11 | 4.0%
 | 11 | 4.0%
 | 9 | 3.3%
 | 6 | 2.2%
 | 5 | 1.8%
Other values (54) | 101 | 36.9%

주간시도언급량
Real number (ℝ)

Distinct: 8
Distinct (%): 8.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.66
Minimum: 1
Maximum: 16
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 1
Q3: 4
95th percentile: 5.1
Maximum: 16
Range: 15
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 3.0689715
Coefficient of variation (CV): 1.1537487
Kurtosis: 10.330911
Mean: 2.66
Median Absolute Deviation (MAD): 0
Skewness: 3.045579
Sum: 266
Variance: 9.4185859
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
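
Common Values

Value | Count | Frequency (%)
1 | 55 | 55.0%
2 | 15 | 15.0%
5 | 15 | 15.0%
4 | 7 | 7.0%
3 | 3 | 3.0%
15 | 2 | 2.0%
16 | 2 | 2.0%
7 | 1 | 1.0%

Minimum values

Value | Count | Frequency (%)
1 | 55 | 55.0%
2 | 15 | 15.0%
3 | 3 | 3.0%
4 | 7 | 7.0%
5 | 15 | 15.0%
7 | 1 | 1.0%
15 | 2 | 2.0%
16 | 2 | 2.0%

Maximum values

Value | Count | Frequency (%)
16 | 2 | 2.0%
15 | 2 | 2.0%
7 | 1 | 1.0%
5 | 15 | 15.0%
4 | 7 | 7.0%
3 | 3 | 3.0%
2 | 15 | 15.0%
1 | 55 | 55.0%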
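
Because 주간시도언급량 takes only eight distinct values, its summary statistics can be rebuilt from the frequency table alone. The sketch below expands the value counts into the 100 observations and recomputes a few of the reported figures; it assumes the sample variance (n - 1 denominator) is the one being reported.

```python
import statistics

# Value -> count, copied from the frequency table of 주간시도언급량.
counts = {1: 55, 2: 15, 3: 3, 4: 7, 5: 15, 7: 1, 15: 2, 16: 2}

# Expand the table into the individual observations.
values = sorted(v for v, n in counts.items() for _ in range(n))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, assumed n - 1 denominator
std = statistics.stdev(values)
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)  # median absolute deviation

print(total, mean, round(variance, 7), round(std, 7), median, mad)
```

More than half of the observations equal 1, which is why both the median and the median absolute deviation collapse to 1 and 0 while the mean is pulled up to 2.66 by the few values of 15 and 16.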

Interactions


Correlations

Variable | 주간지역언급량연번 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | 시도명 | 시군구명 | 주간시도언급량
주간지역언급량연번 | 1.000 | 0.982 | 0.898 | 0.577 | 0.796 | 0.546
환경플랫폼 하위 도메인명 | 0.982 | 1.000 | 1.000 | 0.304 | 0.612 | 0.000
도메인 하위 카테고리명 | 0.898 | 1.000 | 1.000 | 0.583 | 0.852 | 0.107
시도명 | 0.577 | 0.304 | 0.583 | 1.000 | 0.951 | 0.334
시군구명 | 0.796 | 0.612 | 0.852 | 0.951 | 1.000 | 0.845
주간시도언급량 | 0.546 | 0.000 | 0.107 | 0.334 | 0.845 | 1.000

Variable | 시도명 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명
시도명 | 1.000 | 0.218 | 0.229
환경플랫폼 하위 도메인명 | 0.218 | 1.000 | 0.969
도메인 하위 카테고리명 | 0.229 | 0.969 | 1.000

Variable | 주간지역언급량연번 | 주간시도언급량 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | 시도명
주간지역언급량연번 | 1.000 | 0.286 | 0.847 | 0.713 | 0.256
주간시도언급량 | 0.286 | 1.000 | 0.000 | 0.000 | 0.201
환경플랫폼 하위 도메인명 | 0.847 | 0.000 | 1.000 | 0.969 | 0.218
도메인 하위 카테고리명 | 0.713 | 0.000 | 0.969 | 1.000 | 0.229
시도명 | 0.256 | 0.201 | 0.218 | 0.229 | 1.000
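
One way to read the last matrix in this section (the one with five variables) is to list the variable pairs whose coefficient exceeds a cut-off. The sketch below does that with a hypothetical threshold of 0.5; the threshold and the choice of this matrix are assumptions made for illustration, not settings confirmed by this report.

```python
from itertools import combinations

# Correlation values copied from the last (five-variable) matrix in this section.
names = ["주간지역언급량연번", "주간시도언급량", "환경플랫폼 하위 도메인명",
         "도메인 하위 카테고리명", "시도명"]
matrix = [
    [1.000, 0.286, 0.847, 0.713, 0.256],
    [0.286, 1.000, 0.000, 0.000, 0.201],
    [0.847, 0.000, 1.000, 0.969, 0.218],
    [0.713, 0.000, 0.969, 1.000, 0.229],
    [0.256, 0.201, 0.218, 0.229, 1.000],
]

THRESHOLD = 0.5  # hypothetical cut-off, not taken from this report

# Each unordered pair once, kept only if the coefficient reaches the cut-off.
high_pairs = [
    (names[i], names[j], matrix[i][j])
    for i, j in combinations(range(len(names)), 2)
    if abs(matrix[i][j]) >= THRESHOLD
]

# Count, per variable, how many other variables it is highly correlated with.
partners = {name: 0 for name in names}
for a, b, _ in high_pairs:
    partners[a] += 1
    partners[b] += 1

for a, b, value in high_pairs:
    print(f"{a} ~ {b}: {value:.3f}")
```

With this assumed threshold, three variables each end up with two highly correlated partners, which is consistent with the three High correlation alerts listed near the top of the report.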

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

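First rows

Index | 주간지역언급량연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 시도명 | 시군구명 | 주간시도언급량
0 | 1 | 2020-07-01 | 물환경 | 물재난 | All | 경남 | 함양군 | 1
1 | 2 | 2020-07-01 | 물환경 | 물재난 | All | 서울 | 서초구 | 1
2 | 3 | 2020-07-01 | 물환경 | 물재난 | All | 전북 | 부안군 | 3
3 | 4 | 2020-07-01 | 물환경 | 물재난 | All | 제주 | 서귀포시 | 1
4 | 5 | 2020-07-01 | 물환경 | 상수도 | All | 경기 | 양평군 | 2
5 | 6 | 2020-07-01 | 물환경 | 상수도 | All | 경북 | 남구 | 1
6 | 7 | 2020-07-01 | 물환경 | 상수도 | All | 광주 | 남구 | 1
7 | 8 | 2020-07-01 | 물환경 | 상수도 | All | 대구 | 남구 | 1
8 | 9 | 2020-07-01 | 물환경 | 상수도 | All | 부산 | 남구 | 1
9 | 10 | 2020-07-01 | 물환경 | 상수도 | All | 울산 | 남구 | 1

Last rows

Index | 주간지역언급량연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 시도명 | 시군구명 | 주간시도언급량
90 | 91 | 2020-07-01 | 생활환경 | 대기 | All | 충남 | 서산시 | 1
91 | 92 | 2020-07-01 | 생활환경 | 대기 | All | 충북 | 청주시 | 2
92 | 93 | 2020-07-01 | 생활환경 | 폐기물 | All | 경기 | 고양시 | 2
93 | 94 | 2020-07-01 | 생활환경 | 폐기물 | All | 경기 | 과천시 | 2
94 | 95 | 2020-07-01 | 생활환경 | 폐기물 | All | 경기 | 광명시 | 2
95 | 96 | 2020-07-01 | 생활환경 | 폐기물 | All | 경기 | 광주시 | 2
96 | 97 | 2020-07-01 | 생활환경 | 폐기물 | All | 경기 | 구리시 | 2
97 | 98 | 2020-07-01 | 생활환경 | 폐기물 | All | 경기 | 군포시 | 2
98 | 99 | 2020-07-01 | 생활환경 | 폐기물 | All | 경기 | 김포시 | 4
99 | 100 | 2020-07-01 | 생활환경 | 폐기물 | All | 경기 | 남양주시 | 4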