Overview

Dataset statistics

Number of variables: 8
Number of observations: 100
Missing cells: 9
Missing cells (%): 1.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.6 KiB
Average record size in memory: 67.3 B
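The missing-cell percentage follows from the counts above: 9 missing cells out of 8 × 100 = 800 cells. A minimal sketch of that arithmetic, with illustrative variable names that are not taken from the profiling tool. It also suggests that the 6.6 KiB total is a rounded figure, since 100 records at 67.3 B come to roughly 6.57 KiB.

```python
# Figures copied from the Dataset statistics above.
n_variables = 8
n_observations = 100
missing_cells = 9
avg_record_bytes = 67.3

total_cells = n_variables * n_observations        # every cell in the table
missing_pct = 100 * missing_cells / total_cells   # share of cells that are missing

# Total memory implied by the average record size, in KiB (1 KiB = 1024 B).
total_kib = avg_record_bytes * n_observations / 1024

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Approximate size in memory: {total_kib:.1f} KiB")
```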

Variable types

Numeric: 2
DateTime: 1
Categorical: 4
Text: 1

Dataset

Description: Sample data
Author: 성균관대학교 산학협력단 (Sungkyunkwan University Industry-Academic Cooperation Foundation)
URL: https://www.bigdata-environment.kr/user/data_market/detail.do?id=d58abcc0-2fca-11ea-94b6-73a02796bba4

Alerts

연월일 has constant value "2017-01-02" (Constant)
환경플랫폼 하위 도메인명 has constant value "물환경" (Constant)
도메인 하위 카테고리명 has constant value "하천" (Constant)
SNS 채널명 has constant value "All" (Constant)
주간지역언급량연번 is highly overall correlated with 시도명 (High correlation)
시도명 is highly overall correlated with 주간지역언급량연번 (High correlation)
시군구명 has 9 (9.0%) missing values (Missing)
주간지역언급량연번 has unique values (Unique)
주간시도언급량 has 29 (29.0%) zeros (Zeros)

Reproduction

Analysis started: 2023-12-10 11:10:02.760968
Analysis finished: 2023-12-10 11:10:03.910630
Duration: 1.15 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
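The reported duration is the difference between the two timestamps above. A small sketch using only the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block above.
started = datetime.fromisoformat("2023-12-10 11:10:02.760968")
finished = datetime.fromisoformat("2023-12-10 11:10:03.910630")

# Subtracting two datetimes gives a timedelta; convert it to seconds.
duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```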

Variables

주간지역언급량연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.5
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5.95
Q1: 25.75
Median: 50.5
Q3: 75.25
95-th percentile: 95.05
Maximum: 100
Range: 99
Interquartile range (IQR): 49.5
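The quantiles above match linear interpolation between the minimum and maximum. The sketch below reproduces them with the standard library, under the assumption that the column holds the integers 1 to 100 (consistent with the reported count, minimum, maximum and sum, but not stated outright in the report).

```python
import statistics

# Assumption: the column holds the integers 1..100.
values = list(range(1, 101))

# method="inclusive" interpolates linearly, treating the data's minimum and
# maximum as the 0th and 100th percentiles.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
twentieths = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]

print("5-th percentile:", p5)
print("Q1, median, Q3:", q1, median, q3)
print("95-th percentile:", p95)
print("Range:", max(values) - min(values))
print("IQR:", q3 - q1)
```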

Descriptive statistics

Standard deviation: 29.011492
Coefficient of variation (CV): 0.57448499
Kurtosis: -1.2
Mean: 50.5
Median Absolute Deviation (MAD): 25
Skewness: 0
Sum: 5050
Variance: 841.66667
Monotonicity: Strictly increasing
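Most of the descriptive statistics above can be rechecked by hand. The sketch below again assumes the column is the integers 1 to 100. It uses the sample (n - 1) variance, which is what matches the reported 841.66667, and computes the MAD as the median of absolute deviations from the median.

```python
import statistics

# Assumption: the column holds the integers 1..100.
values = list(range(1, 101))

mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, n - 1 denominator
std = statistics.stdev(values)           # square root of the sample variance
cv = std / mean                          # coefficient of variation

med = statistics.median(values)
mad = statistics.median(abs(v - med) for v in values)

print(f"Mean: {mean}, Sum: {sum(values)}")
print(f"Variance: {variance:.5f}, Standard deviation: {std:.6f}")
print(f"CV: {cv:.8f}, MAD: {mad}")
```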
Histogram with fixed size bins (bins=50)
Common values

Value              Count  Frequency (%)
1                  1      1.0%
65                 1      1.0%
75                 1      1.0%
74                 1      1.0%
73                 1      1.0%
72                 1      1.0%
71                 1      1.0%
70                 1      1.0%
69                 1      1.0%
68                 1      1.0%
Other values (90)  90     90.0%

Minimum 10 values

Value  Count  Frequency (%)
1      1      1.0%
2      1      1.0%
3      1      1.0%
4      1      1.0%
5      1      1.0%
6      1      1.0%
7      1      1.0%
8      1      1.0%
9      1      1.0%
10     1      1.0%

Maximum 10 values

Value  Count  Frequency (%)
100    1      1.0%
99     1      1.0%
98     1      1.0%
97     1      1.0%
96     1      1.0%
95     1      1.0%
94     1      1.0%
93     1      1.0%
92     1      1.0%
91     1      1.0%

연월일
Date

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2017-01-02 00:00:00
Maximum: 2017-01-02 00:00:00
Histogram with fixed size bins (bins=1)

환경플랫폼 하위 도메인명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
물환경: 100

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물환경
2nd row: 물환경
3rd row: 물환경
4th row: 물환경
5th row: 물환경

Common Values

Value   Count  Frequency (%)
물환경  100    100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the counts listed under Common Values]

도메인 하위 카테고리명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
하천: 100

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 하천
2nd row: 하천
3rd row: 하천
4th row: 하천
5th row: 하천

Common Values

Value  Count  Frequency (%)
하천   100    100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the counts listed under Common Values]

SNS 채널명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
All: 100

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: All
2nd row: All
3rd row: All
4th row: All
5th row: All

Common Values

Value  Count  Frequency (%)
All    100    100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the counts listed under Common Values]

시도명
Categorical

HIGH CORRELATION 

Distinct: 9
Distinct (%): 9.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
서울: 26
경기: 18
부산: 17
인천: 11
대구: 9
Other values (4): 19

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 서울
2nd row: 서울
3rd row: 서울
4th row: 서울
5th row: 서울

Common Values

Value  Count  Frequency (%)
서울   26     26.0%
경기   18     18.0%
부산   17     17.0%
인천   11     11.0%
대구   9      9.0%
광주   6      6.0%
대전   6      6.0%
울산   6      6.0%
세종   1      1.0%
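The overview figures for 시도명 can be derived from the Common Values counts. A minimal sketch, assuming the overview shows the five most frequent categories and folds the rest into "Other values", and that "Unique" counts categories occurring exactly once:

```python
# Counts copied from the Common Values table for 시도명.
counts = {
    "서울": 26, "경기": 18, "부산": 17, "인천": 11, "대구": 9,
    "광주": 6, "대전": 6, "울산": 6, "세종": 1,
}

total = sum(counts.values())

# Five most frequent categories; everything else is the "Other values" bucket.
ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
top5 = ranked[:5]
other_categories = len(ranked) - len(top5)
other_count = total - sum(count for _, count in top5)

# Categories that occur exactly once.
occurring_once = [name for name, count in counts.items() if count == 1]

print("Total rows:", total)
print(f"Other values ({other_categories}): {other_count}")
print("Unique:", len(occurring_once), occurring_once)
```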

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the counts listed under Common Values]

시군구명
Text

MISSING 

Distinct: 70
Distinct (%): 76.9%
Missing: 9
Missing (%): 9.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 3
Mean length: 2.8131868
Min length: 2

Characters and Unicode

Total characters: 256
Distinct characters: 74
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 64
Unique (%): 70.3%
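The percentages for 시군구명 only add up if they are taken over the 91 non-missing values rather than all 100 rows. A short check of that reading, using figures copied from above (the variable names are illustrative):

```python
# Figures copied from the 시군구명 summary above.
n_rows = 100
n_missing = 9
distinct = 70
unique = 64
total_characters = 256

n_present = n_rows - n_missing   # rows that actually hold a value

distinct_pct = 100 * distinct / n_present
unique_pct = 100 * unique / n_present
mean_length = total_characters / n_present

print(f"Distinct (%): {distinct_pct:.1f}%")
print(f"Unique (%): {unique_pct:.1f}%")
print(f"Mean length: {mean_length:.7f}")
```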

Sample

1st row: 종로구
2nd row: 중구
3rd row: 용산구
4th row: 성동구
5th row: 광진구

Value              Count  Frequency (%)
중구               6      6.6%
동구               6      6.6%
서구               5      5.5%
북구               4      4.4%
남구               4      4.4%
강서구             2      2.2%
울주군             1      1.1%
달서구             1      1.1%
달성군             1      1.1%
미추홀구           1      1.1%
Other values (60)  60     65.9%

Most occurring characters

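Value              Count  Frequency (%)
[unreadable]       79     30.9%
[unreadable]       14     5.5%
[unreadable]       10     3.9%
[unreadable]       8      3.1%
[unreadable]       8      3.1%
[unreadable]       7      2.7%
[unreadable]       6      2.3%
[unreadable]       6      2.3%
[unreadable]       6      2.3%
[unreadable]       5      2.0%
Other values (64)  107    41.8%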

Most occurring categories

Value         Count  Frequency (%)
Other Letter  256    100.0%

Most frequent character per category

Other Letter
Same counts as under Most occurring characters, since all 256 characters fall in this category.

Most occurring scripts

Value   Count  Frequency (%)
Hangul  256    100.0%

Most frequent character per script

Hangul
Same counts as under Most occurring characters, since all 256 characters are in this script.

Most occurring blocks

Value   Count  Frequency (%)
Hangul  256    100.0%

Most frequent character per block

Hangul
Same counts as under Most occurring characters, since all 256 characters are in this block.

주간시도언급량
Real number (ℝ)

ZEROS 

Distinct: 25
Distinct (%): 25.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.92
Minimum: 0
Maximum: 69
Zeros: 29
Zeros (%): 29.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 6
Q3: 20.5
95-th percentile: 48
Maximum: 69
Range: 69
Interquartile range (IQR): 20.5

Descriptive statistics

Standard deviation: 15.663559
Coefficient of variation (CV): 1.2123497
Kurtosis: 1.2451085
Mean: 12.92
Median Absolute Deviation (MAD): 6
Skewness: 1.4126844
Sum: 1292
Variance: 245.34707
Monotonicity: Not monotonic
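The full data for 주간시도언급량 is not in this report, so the statistics cannot be recomputed here, but the reported figures can be checked against each other. A sketch of such a consistency check. The tolerances are an assumption, chosen to allow for the rounding of the displayed values.

```python
import math

# Figures copied from the report for 주간시도언급량 (all rounded for display).
n = 100
mean = 12.92
std = 15.663559
variance = 245.34707
cv = 1.2123497
total = 1292
zeros = 29

# Each check compares one reported figure with the value implied by the others.
checks = {
    "standard deviation is sqrt(variance)": math.isclose(math.sqrt(variance), std, abs_tol=1e-5),
    "CV is standard deviation / mean": math.isclose(std / mean, cv, abs_tol=1e-6),
    "sum is mean * n": math.isclose(mean * n, total, abs_tol=1e-6),
    "zeros (%) is 29.0": math.isclose(100 * zeros / n, 29.0),
}

for name, ok in checks.items():
    print(f"{name}: {ok}")
```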
Histogram with fixed size bins (bins=25)
Common values

Value              Count  Frequency (%)
0                  29     29.0%
4                  9      9.0%
25                 6      6.0%
48                 6      6.0%
40                 5      5.0%
6                  5      5.0%
8                  5      5.0%
13                 4      4.0%
14                 4      4.0%
3                  4      4.0%
Other values (15)  23     23.0%

Minimum 10 values

Value  Count  Frequency (%)
0      29     29.0%
3      4      4.0%
4      9      9.0%
5      4      4.0%
6      5      5.0%
7      3      3.0%
8      5      5.0%
10     2      2.0%
11     2      2.0%
12     1      1.0%

Maximum 10 values

Value  Count  Frequency (%)
69     1      1.0%
48     6      6.0%
46     1      1.0%
40     5      5.0%
34     1      1.0%
30     2      2.0%
25     6      6.0%
24     1      1.0%
23     1      1.0%
22     1      1.0%

Interactions

[Figures: four interaction plots, images not preserved]

Correlations

[Figure: correlation heatmap]

Variable            주간지역언급량연번  시도명  시군구명  주간시도언급량
주간지역언급량연번  1.000               0.904   0.000     0.000
시도명              0.904               1.000   0.000     0.000
시군구명            0.000               0.000   1.000     1.000
주간시도언급량      0.000               0.000   1.000     1.000

[Figure: correlation heatmap]

Variable            주간지역언급량연번  주간시도언급량  시도명
주간지역언급량연번  1.000               -0.261          0.704
주간시도언급량      -0.261              1.000           0.000
시도명              0.704               0.000           1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out visual patterns in data completion.]

Sample

First rows

Index  주간지역언급량연번  연월일  환경플랫폼 하위 도메인명  도메인 하위 카테고리명  SNS 채널명  시도명  시군구명  주간시도언급량
0  1  2017-01-02  물환경  하천  All  서울  <NA>  0
1  2  2017-01-02  물환경  하천  All  서울  종로구  11
2  3  2017-01-02  물환경  하천  All  서울  중구  25
3  4  2017-01-02  물환경  하천  All  서울  용산구  46
4  5  2017-01-02  물환경  하천  All  서울  성동구  13
5  6  2017-01-02  물환경  하천  All  서울  광진구  17
6  7  2017-01-02  물환경  하천  All  서울  동대문구  3
7  8  2017-01-02  물환경  하천  All  서울  중랑구  6
8  9  2017-01-02  물환경  하천  All  서울  성북구  4
9  10  2017-01-02  물환경  하천  All  서울  강북구  12

Last rows

Index  주간지역언급량연번  연월일  환경플랫폼 하위 도메인명  도메인 하위 카테고리명  SNS 채널명  시도명  시군구명  주간시도언급량
90  91  2017-01-02  물환경  하천  All  경기  중원구  0
91  92  2017-01-02  물환경  하천  All  경기  분당구  0
92  93  2017-01-02  물환경  하천  All  경기  의정부시  3
93  94  2017-01-02  물환경  하천  All  경기  안양시  13
94  95  2017-01-02  물환경  하천  All  경기  만안구  0
95  96  2017-01-02  물환경  하천  All  경기  동안구  0
96  97  2017-01-02  물환경  하천  All  경기  부천시  4
97  98  2017-01-02  물환경  하천  All  경기  광명시  7
98  99  2017-01-02  물환경  하천  All  경기  평택시  4
99  100  2017-01-02  물환경  하천  All  경기  동두천시  4