Overview

Dataset statistics

Number of variables: 9
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.5 KiB
Average record size in memory: 76.3 B

Variable types

Numeric: 3
Categorical: 5
Text: 1

Dataset

Description: Sample data
Author: 성균관대학교 산학협력단
URL: https://www.bigdata-environment.kr/user/data_market/detail.do?id=369aed70-e842-11ea-a837-83d4a69b8aa7

Alerts

연월일 has constant value "2020-10-01" [Constant]
SNS 채널명 has constant value "news" [Constant]
환경플랫폼 하위 도메인명 is highly overall correlated with 일간지역언급량연번 and 2 other fields [High correlation]
도메인 하위 카테고리명 is highly overall correlated with 일간지역언급량연번 and 1 other field [High correlation]
일간지역언급량연번 is highly overall correlated with 일간시도단어량 and 2 other fields [High correlation]
일간시도언급량 is highly overall correlated with 일간시도단어량 and 1 other field [High correlation]
일간시도단어량 is highly overall correlated with 일간지역언급량연번 and 1 other field [High correlation]
일간지역언급량연번 has unique values [Unique]

Reproduction

Analysis started: 2023-12-10 12:44:01.644514
Analysis finished: 2023-12-10 12:44:02.900743
Duration: 1.26 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
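
The reported duration follows directly from the two timestamps. A minimal sketch using only the Python standard library; the variable names are illustrative and not part of the report:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2023-12-10 12:44:01.644514")
finished = datetime.fromisoformat("2023-12-10 12:44:02.900743")

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")  # 1.26 seconds
```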

Variables

일간지역언급량연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.5
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 5.95
Q1: 25.75
Median: 50.5
Q3: 75.25
95th percentile: 95.05
Maximum: 100
Range: 99
Interquartile range (IQR): 49.5
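
These quantiles are consistent with linear interpolation between order statistics. The sketch below rests on two assumptions that the report does not state explicitly: that the column holds the integers 1 to 100 (suggested by the Unique alert, the minimum, the maximum and the count of 100 observations), and that linear interpolation is the method used. `percentile` is a helper written for this illustration only.

```python
def percentile(sorted_values, q):
    """Return the q-th quantile (0 <= q <= 1) by linear interpolation."""
    position = (len(sorted_values) - 1) * q
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction

# Assumption: the serial-number column is exactly 1, 2, ..., 100.
values = list(range(1, 101))

p05 = percentile(values, 0.05)
q1 = percentile(values, 0.25)
median = percentile(values, 0.50)
q3 = percentile(values, 0.75)
p95 = percentile(values, 0.95)
iqr = q3 - q1
value_range = values[-1] - values[0]

print(round(p05, 2), q1, median, q3, round(p95, 2), iqr, value_range)
```

Under these assumptions the printed values agree with the figures listed above.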

Descriptive statistics

Standard deviation: 29.011492
Coefficient of variation (CV): 0.57448499
Kurtosis: -1.2
Mean: 50.5
Median Absolute Deviation (MAD): 25
Skewness: 0
Sum: 5050
Variance: 841.66667
Monotonicity: Strictly increasing
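
Several of these descriptive statistics can be checked with the Python standard library. This is a sketch, assuming the column is the sequence 1 to 100 (an inference from the Unique alert, the minimum, the maximum and the sum of 5050) and that the variance uses the sample form with an n - 1 denominator, which is what reproduces the reported value.

```python
import statistics

# Assumption: the serial-number column is exactly 1, 2, ..., 100.
values = list(range(1, 101))

total = sum(values)
mean = statistics.fmean(values)
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)       # square root of the sample variance
cv = std_dev / mean                      # coefficient of variation

# Median absolute deviation: median of the distances to the median.
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

# Strictly increasing: every value is larger than the one before it.
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(total, mean, round(variance, 5), round(std_dev, 6), round(cv, 8), mad)
```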
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
1 | 1 | 1.0%
65 | 1 | 1.0%
75 | 1 | 1.0%
74 | 1 | 1.0%
73 | 1 | 1.0%
72 | 1 | 1.0%
71 | 1 | 1.0%
70 | 1 | 1.0%
69 | 1 | 1.0%
68 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 1.0%
2 | 1 | 1.0%
3 | 1 | 1.0%
4 | 1 | 1.0%
5 | 1 | 1.0%
6 | 1 | 1.0%
7 | 1 | 1.0%
8 | 1 | 1.0%
9 | 1 | 1.0%
10 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
100 | 1 | 1.0%
99 | 1 | 1.0%
98 | 1 | 1.0%
97 | 1 | 1.0%
96 | 1 | 1.0%
95 | 1 | 1.0%
94 | 1 | 1.0%
93 | 1 | 1.0%
92 | 1 | 1.0%
91 | 1 | 1.0%

연월일
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
2020-10-01: 100

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020-10-01
2nd row: 2020-10-01
3rd row: 2020-10-01
4th row: 2020-10-01
5th row: 2020-10-01

Common Values

Value | Count | Frequency (%)
2020-10-01 | 100 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

환경플랫폼 하위 도메인명
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
물환경: 48
자연환경: 41
생활환경: 11

Length

Max length: 4
Median length: 4
Mean length: 3.52
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물환경
2nd row: 물환경
3rd row: 물환경
4th row: 물환경
5th row: 물환경

Common Values

Value | Count | Frequency (%)
물환경 | 48 | 48.0%
자연환경 | 41 | 41.0%
생활환경 | 11 | 11.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

도메인 하위 카테고리명
Categorical

HIGH CORRELATION 

Distinct: 10
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
기상변화: 41
물재난: 23
하천: 15
폐기물: 7
하수도: 5
Other values (5): 9

Length

Max length: 4
Median length: 3
Mean length: 3.21
Min length: 2

Unique

Unique: 3
Unique (%): 3.0%

Sample

1st row: 물재난
2nd row: 물재난
3rd row: 물재난
4th row: 물재난
5th row: 물재난

Common Values

Value | Count | Frequency (%)
기상변화 | 41 | 41.0%
물재난 | 23 | 23.0%
하천 | 15 | 15.0%
폐기물 | 7 | 7.0%
하수도 | 5 | 5.0%
호소 | 3 | 3.0%
대기 | 3 | 3.0%
상수도 | 1 | 1.0%
지하수 | 1 | 1.0%
화학물질 | 1 | 1.0%
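
The Distinct, Unique and mean-length figures for 도메인 하위 카테고리명 can be recomputed from the frequency table above. The sketch below assumes the usual reading of the two terms: "distinct" counts the different values, while "unique" counts the values that occur exactly once. The variable names are illustrative.

```python
# Frequency table copied from the Common Values block: value -> count.
counts = {
    "기상변화": 41, "물재난": 23, "하천": 15, "폐기물": 7, "하수도": 5,
    "호소": 3, "대기": 3, "상수도": 1, "지하수": 1, "화학물질": 1,
}

n_rows = sum(counts.values())

# Distinct: how many different values appear at all.
distinct = len(counts)

# Unique: how many values appear exactly once.
unique = sum(1 for count in counts.values() if count == 1)

# Mean length: string length (in code points) weighted by frequency.
mean_length = sum(len(value) * count for value, count in counts.items()) / n_rows

print(n_rows, distinct, unique, round(mean_length, 2))  # 100 10 3 3.21
```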

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

SNS 채널명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
news: 100

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: news
2nd row: news
3rd row: news
4th row: news
5th row: news

Common Values

Value | Count | Frequency (%)
news | 100 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

시도명
Categorical

Distinct: 10
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
경기: 41
강원: 20
전남: 11
충북: 8
서울: 6
Other values (5): 14

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 경기
2nd row: 경기
3rd row: 경기
4th row: 경기
5th row: 경기

Common Values

Value | Count | Frequency (%)
경기 | 41 | 41.0%
강원 | 20 | 20.0%
전남 | 11 | 11.0%
충북 | 8 | 8.0%
서울 | 6 | 6.0%
경북 | 4 | 4.0%
전북 | 4 | 4.0%
경남 | 3 | 3.0%
부산 | 2 | 2.0%
충남 | 1 | 1.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

시군구명
Text

Distinct: 73
Distinct (%): 73.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 3
Mean length: 3.02
Min length: 3

Characters and Unicode

Total characters: 302
Distinct characters: 75
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
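
As an illustration of these character properties, the sketch below inspects the five sample values of this variable with Python's standard `unicodedata` module. The general category `Lo` is the abbreviation for "Other Letter". The standard library exposes no script property, so the script is approximated here from the first word of each character name; that shortcut is an assumption of this example, not the method the report uses.

```python
import unicodedata
from collections import Counter

# Sample values copied from the Sample block of this variable.
sample = ["가평군", "성남시", "안성시", "양평군", "연천군"]

# Flatten the strings into individual code points.
characters = [ch for value in sample for ch in value]
total_characters = len(characters)

# General category of each code point ("Lo" means Other Letter).
categories = Counter(unicodedata.category(ch) for ch in characters)

# Rough script label taken from the character name, e.g. "HANGUL SYLLABLE GA".
scripts = Counter(unicodedata.name(ch).split()[0] for ch in characters)

print(total_characters, dict(categories), dict(scripts))
```

For these five values every character falls in a single category and a single script, which matches the counts of 1 reported above for the full column.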

Unique

Unique: 49
Unique (%): 49.0%

Sample

1st row: 가평군
2nd row: 성남시
3rd row: 안성시
4th row: 양평군
5th row: 연천군

Value | Count | Frequency (%)
강서구 | 4 | 4.0%
성남시 | 3 | 3.0%
가평군 | 2 | 2.0%
용인시 | 2 | 2.0%
김포시 | 2 | 2.0%
시흥시 | 2 | 2.0%
서초구 | 2 | 2.0%
문경시 | 2 | 2.0%
홍천군 | 2 | 2.0%
흥덕구 | 2 | 2.0%
Other values (63) | 77 | 77.0%

Most occurring characters

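Value | Count | Frequency (%)
(not extracted) | 54 | 17.9%
(not extracted) | 35 | 11.6%
(not extracted) | 17 | 5.6%
(not extracted) | 14 | 4.6%
(not extracted) | 12 | 4.0%
(not extracted) | 12 | 4.0%
(not extracted) | 10 | 3.3%
(not extracted) | 8 | 2.6%
(not extracted) | 8 | 2.6%
(not extracted) | 7 | 2.3%
Other values (65) | 125 | 41.4%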

Most occurring categories

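Value | Count | Frequency (%)
Other Letter | 302 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not extracted) | 54 | 17.9%
(not extracted) | 35 | 11.6%
(not extracted) | 17 | 5.6%
(not extracted) | 14 | 4.6%
(not extracted) | 12 | 4.0%
(not extracted) | 12 | 4.0%
(not extracted) | 10 | 3.3%
(not extracted) | 8 | 2.6%
(not extracted) | 8 | 2.6%
(not extracted) | 7 | 2.3%
Other values (65) | 125 | 41.4%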

Most occurring scripts

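Value | Count | Frequency (%)
Hangul | 302 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not extracted) | 54 | 17.9%
(not extracted) | 35 | 11.6%
(not extracted) | 17 | 5.6%
(not extracted) | 14 | 4.6%
(not extracted) | 12 | 4.0%
(not extracted) | 12 | 4.0%
(not extracted) | 10 | 3.3%
(not extracted) | 8 | 2.6%
(not extracted) | 8 | 2.6%
(not extracted) | 7 | 2.3%
Other values (65) | 125 | 41.4%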

Most occurring blocks

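Value | Count | Frequency (%)
Hangul | 302 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not extracted) | 54 | 17.9%
(not extracted) | 35 | 11.6%
(not extracted) | 17 | 5.6%
(not extracted) | 14 | 4.6%
(not extracted) | 12 | 4.0%
(not extracted) | 12 | 4.0%
(not extracted) | 10 | 3.3%
(not extracted) | 8 | 2.6%
(not extracted) | 8 | 2.6%
(not extracted) | 7 | 2.3%
Other values (65) | 125 | 41.4%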

일간시도언급량
Real number (ℝ)

HIGH CORRELATION 

Distinct: 6
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.87
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 1
Q3: 2
95th percentile: 4.05
Maximum: 9
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.3307171
Coefficient of variation (CV): 0.71161344
Kurtosis: 7.6654476
Mean: 1.87
Median Absolute Deviation (MAD): 0
Skewness: 2.2901996
Sum: 187
Variance: 1.7708081
Monotonicity: Not monotonic
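
Because 일간시도언급량 takes only six distinct values, these statistics can be rebuilt from the value counts reported for this variable. The sketch below assumes the sample variance (n - 1 denominator) and the adjusted Fisher-Pearson skewness coefficient; both reproduce the reported figures, although the report does not name the formulas it uses.

```python
import math

# Value counts reported for this variable: value -> count.
counts = {1: 56, 2: 21, 3: 12, 4: 6, 5: 4, 9: 1}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n

# Central moments about the mean (population form, divided by n).
m2 = sum(count * (value - mean) ** 2 for value, count in counts.items()) / n
m3 = sum(count * (value - mean) ** 3 for value, count in counts.items()) / n

variance = m2 * n / (n - 1)   # sample variance
std_dev = math.sqrt(variance)
cv = std_dev / mean           # coefficient of variation

# Adjusted Fisher-Pearson skewness (bias-corrected for sample size).
skewness = (m3 / m2 ** 1.5) * math.sqrt(n * (n - 1)) / (n - 2)

print(total, mean, round(variance, 7), round(std_dev, 7), round(cv, 8), round(skewness, 5))
```

The long right tail (a single observation of 9 against a median of 1) is what drives the positive skewness.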
Histogram with fixed size bins (bins=6)
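Common values

Value | Count | Frequency (%)
1 | 56 | 56.0%
2 | 21 | 21.0%
3 | 12 | 12.0%
4 | 6 | 6.0%
5 | 4 | 4.0%
9 | 1 | 1.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 56 | 56.0%
2 | 21 | 21.0%
3 | 12 | 12.0%
4 | 6 | 6.0%
5 | 4 | 4.0%
9 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
9 | 1 | 1.0%
5 | 4 | 4.0%
4 | 6 | 6.0%
3 | 12 | 12.0%
2 | 21 | 21.0%
1 | 56 | 56.0%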

일간시도단어량
Real number (ℝ)

HIGH CORRELATION 

Distinct: 11
Distinct (%): 11.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.31
Minimum: 1
Maximum: 40
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 2
Q3: 4
95th percentile: 9
Maximum: 40
Range: 39
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 4.7026642
Coefficient of variation (CV): 1.4207445
Kurtosis: 38.619451
Mean: 3.31
Median Absolute Deviation (MAD): 1
Skewness: 5.4673571
Sum: 331
Variance: 22.115051
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
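Common values

Value | Count | Frequency (%)
1 | 43 | 43.0%
2 | 15 | 15.0%
4 | 13 | 13.0%
3 | 12 | 12.0%
9 | 5 | 5.0%
5 | 5 | 5.0%
6 | 2 | 2.0%
10 | 2 | 2.0%
40 | 1 | 1.0%
8 | 1 | 1.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 43 | 43.0%
2 | 15 | 15.0%
3 | 12 | 12.0%
4 | 13 | 13.0%
5 | 5 | 5.0%
6 | 2 | 2.0%
8 | 1 | 1.0%
9 | 5 | 5.0%
10 | 2 | 2.0%
20 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
40 | 1 | 1.0%
20 | 1 | 1.0%
10 | 2 | 2.0%
9 | 5 | 5.0%
8 | 1 | 1.0%
6 | 2 | 2.0%
5 | 5 | 5.0%
4 | 13 | 13.0%
3 | 12 | 12.0%
2 | 15 | 15.0%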

Interactions

[Figures: 9 interaction plots]

Correlations

[Figure: correlation heatmap]

Variable | 일간지역언급량연번 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | 시도명 | 시군구명 | 일간시도언급량 | 일간시도단어량
일간지역언급량연번 | 1.000 | 0.937 | 0.940 | 0.877 | 0.747 | 0.697 | 0.404
환경플랫폼 하위 도메인명 | 0.937 | 1.000 | 1.000 | 0.633 | 0.000 | 0.848 | 0.307
도메인 하위 카테고리명 | 0.940 | 1.000 | 1.000 | 0.795 | 0.000 | 0.687 | 0.585
시도명 | 0.877 | 0.633 | 0.795 | 1.000 | 0.984 | 0.659 | 0.766
시군구명 | 0.747 | 0.000 | 0.000 | 0.984 | 1.000 | 0.842 | 0.961
일간시도언급량 | 0.697 | 0.848 | 0.687 | 0.659 | 0.842 | 1.000 | 0.746
일간시도단어량 | 0.404 | 0.307 | 0.585 | 0.766 | 0.961 | 0.746 | 1.000

[Figure: correlation heatmap]

Variable | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | 시도명
환경플랫폼 하위 도메인명 | 1.000 | 0.963 | 0.461
도메인 하위 카테고리명 | 0.963 | 1.000 | 0.359
시도명 | 0.461 | 0.359 | 1.000

[Figure: correlation heatmap]

Variable | 일간지역언급량연번 | 일간시도언급량 | 일간시도단어량 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | 시도명
일간지역언급량연번 | 1.000 | -0.468 | -0.670 | 0.885 | 0.592 | 0.460
일간시도언급량 | -0.468 | 1.000 | 0.789 | 0.533 | 0.440 | 0.413
일간시도단어량 | -0.670 | 0.789 | 1.000 | 0.240 | 0.272 | 0.410
환경플랫폼 하위 도메인명 | 0.885 | 0.533 | 0.240 | 1.000 | 0.963 | 0.461
도메인 하위 카테고리명 | 0.592 | 0.440 | 0.272 | 0.963 | 1.000 | 0.359
시도명 | 0.460 | 0.413 | 0.410 | 0.461 | 0.359 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

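First rows

(index) | 일간지역언급량연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 시도명 | 시군구명 | 일간시도언급량 | 일간시도단어량
0 | 1 | 2020-10-01 | 물환경 | 물재난 | news | 경기 | 가평군 | 3 | 4
1 | 2 | 2020-10-01 | 물환경 | 물재난 | news | 경기 | 성남시 | 1 | 1
2 | 3 | 2020-10-01 | 물환경 | 물재난 | news | 경기 | 안성시 | 1 | 2
3 | 4 | 2020-10-01 | 물환경 | 물재난 | news | 경기 | 양평군 | 1 | 2
4 | 5 | 2020-10-01 | 물환경 | 물재난 | news | 경기 | 연천군 | 1 | 3
5 | 6 | 2020-10-01 | 물환경 | 물재난 | news | 경기 | 오산시 | 1 | 1
6 | 7 | 2020-10-01 | 물환경 | 물재난 | news | 경기 | 용인시 | 1 | 4
7 | 8 | 2020-10-01 | 물환경 | 물재난 | news | 경기 | 이천시 | 1 | 3
8 | 9 | 2020-10-01 | 물환경 | 물재난 | news | 경기 | 포천시 | 3 | 4
9 | 10 | 2020-10-01 | 물환경 | 물재난 | news | 경남 | 합천군 | 9 | 40

Last rows

(index) | 일간지역언급량연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 시도명 | 시군구명 | 일간시도언급량 | 일간시도단어량
90 | 91 | 2020-10-01 | 자연환경 | 기상변화 | news | 경기 | 시흥시 | 1 | 1
91 | 92 | 2020-10-01 | 자연환경 | 기상변화 | news | 경기 | 안산시 | 1 | 1
92 | 93 | 2020-10-01 | 자연환경 | 기상변화 | news | 경기 | 안성시 | 1 | 1
93 | 94 | 2020-10-01 | 자연환경 | 기상변화 | news | 경기 | 안양시 | 1 | 1
94 | 95 | 2020-10-01 | 자연환경 | 기상변화 | news | 경기 | 양주시 | 1 | 1
95 | 96 | 2020-10-01 | 자연환경 | 기상변화 | news | 경기 | 양평군 | 1 | 1
96 | 97 | 2020-10-01 | 자연환경 | 기상변화 | news | 경기 | 여주시 | 1 | 1
97 | 98 | 2020-10-01 | 자연환경 | 기상변화 | news | 경기 | 연천군 | 1 | 1
98 | 99 | 2020-10-01 | 자연환경 | 기상변화 | news | 경기 | 오산시 | 1 | 1
99 | 100 | 2020-10-01 | 자연환경 | 기상변화 | news | 경기 | 용인시 | 1 | 1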