Overview

Dataset statistics

Number of variables: 9
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.5 KiB
Average record size in memory: 76.3 B

Variable types

Numeric: 2
DateTime: 1
Categorical: 5
Text: 1

Dataset

Description: Sample data
Author: Sungkyunkwan University Industry-Academic Cooperation Foundation (성균관대학교 산학협력단)
URL: https://www.bigdata-environment.kr/user/data_market/detail.do?id=b9de2350-e842-11ea-a837-83d4a69b8aa7

Alerts

연월일 has constant value "2021-03-02" (Constant)
환경플랫폼 하위 도메인명 has constant value "물환경" (Constant)
SNS 채널명 has constant value "patent" (Constant)
일간연관어언급량 has constant value "1" (Constant)
일간연관어연번 is highly overall correlated with 도메인 하위 카테고리명 (High correlation)
도메인 하위 카테고리명 is highly overall correlated with 일간연관어연번 (High correlation)
일간연관어연번 has unique values (Unique)
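
The Constant and Unique alerts follow from simple distinct-value counts. The sketch below illustrates that rule; it is not ydata-profiling's own implementation. The helper `column_alerts` is hypothetical, and the three columns are reconstructed from the counts reported in this document rather than loaded from the dataset.

```python
from collections import Counter


def column_alerts(values):
    """Return the alert tags that apply to one column's values."""
    counts = Counter(values)
    alerts = []
    if len(counts) == 1:
        # Every row holds the same value.
        alerts.append("Constant")
    if len(values) > 1 and len(counts) == len(values):
        # No value is repeated.
        alerts.append("Unique")
    return alerts


# Hypothetical columns shaped like the report's data (100 rows each).
columns = {
    "일간연관어연번": list(range(1, 101)),
    "SNS 채널명": ["patent"] * 100,
    "도메인 하위 카테고리명": ["물재난"] * 62 + ["하수도"] * 38,
}
alerts = {name: column_alerts(vals) for name, vals in columns.items()}
print(alerts)
```

Under these assumptions the serial-number column is flagged Unique, the channel column Constant, and the two-valued category column gets neither tag.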

Reproduction

Analysis started: 2024-04-19 21:47:34.290220
Analysis finished: 2024-04-19 21:47:36.514093
Duration: 2.22 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
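
As a quick consistency check, the reported duration can be recomputed from the two timestamps. This is a minimal sketch using only the standard library; the format string is an assumption about how the timestamps are laid out.

```python
from datetime import datetime

# Assumed layout of the timestamps shown in the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-04-19 21:47:34.290220", FMT)
finished = datetime.strptime("2024-04-19 21:47:36.514093", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```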

Variables

일간연관어연번 (daily related-term serial number)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.5
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 5.95
Q1: 25.75
Median: 50.5
Q3: 75.25
95th percentile: 95.05
Maximum: 100
Range: 99
Interquartile range (IQR): 49.5
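
The fractional quantiles (5.95, 25.75, and so on) are consistent with linear interpolation between neighbouring ranks. The sketch below assumes the column holds the integers 1 to 100, which matches the reported minimum, maximum, distinct count, and sum but is not stated outright in the report. The helper `quantile` is hypothetical.

```python
def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two nearest ranks."""
    pos = (len(sorted_values) - 1) * q          # fractional zero-based rank
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


values = list(range(1, 101))  # assumed contents: the integers 1..100

q05, q1, med, q3, q95 = (quantile(values, q) for q in (0.05, 0.25, 0.5, 0.75, 0.95))
iqr = q3 - q1
print(q05, q1, med, q3, q95, iqr)
```

Up to floating-point rounding, the printed values should agree with the quantile statistics listed above.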

Descriptive statistics

Standard deviation: 29.011492
Coefficient of variation (CV): 0.57448499
Kurtosis: -1.2
Mean: 50.5
Median Absolute Deviation (MAD): 25
Skewness: 0
Sum: 5050
Variance: 841.66667
Monotonicity: Strictly increasing
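
The variance of 841.66667 corresponds to the sample variance (denominator n - 1) of the integers 1 to 100. The sketch below recomputes several of the descriptive statistics with the standard library, again assuming the column holds exactly those integers.

```python
import statistics

values = list(range(1, 101))  # assumed contents: the integers 1..100

mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, n - 1 denominator
std = statistics.stdev(values)           # square root of the sample variance
cv = std / mean                          # coefficient of variation
median = statistics.median(values)
# Median absolute deviation: median distance from the median.
mad = statistics.median(abs(v - median) for v in values)

print(sum(values), mean, variance, std, cv, mad)
```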
[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value | Count | Frequency (%)
1 | 1 | 1.0%
65 | 1 | 1.0%
75 | 1 | 1.0%
74 | 1 | 1.0%
73 | 1 | 1.0%
72 | 1 | 1.0%
71 | 1 | 1.0%
70 | 1 | 1.0%
69 | 1 | 1.0%
68 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 1.0%
2 | 1 | 1.0%
3 | 1 | 1.0%
4 | 1 | 1.0%
5 | 1 | 1.0%
6 | 1 | 1.0%
7 | 1 | 1.0%
8 | 1 | 1.0%
9 | 1 | 1.0%
10 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
100 | 1 | 1.0%
99 | 1 | 1.0%
98 | 1 | 1.0%
97 | 1 | 1.0%
96 | 1 | 1.0%
95 | 1 | 1.0%
94 | 1 | 1.0%
93 | 1 | 1.0%
92 | 1 | 1.0%
91 | 1 | 1.0%

연월일 (date)
Date

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2021-03-02 00:00:00
Maximum: 2021-03-02 00:00:00
[Figure: Histogram with fixed size bins (bins=1)]

환경플랫폼 하위 도메인명 (environment platform subdomain name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
물환경 (water environment) | 100

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물환경
2nd row: 물환경
3rd row: 물환경
4th row: 물환경
5th row: 물환경

Common Values

Value | Count | Frequency (%)
물환경 | 100 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

도메인 하위 카테고리명 (domain subcategory name)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
물재난 (water disaster) | 62
하수도 (sewerage) | 38

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물재난
2nd row: 물재난
3rd row: 물재난
4th row: 물재난
5th row: 물재난

Common Values

Value | Count | Frequency (%)
물재난 | 62 | 62.0%
하수도 | 38 | 38.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

SNS 채널명 (SNS channel name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
patent | 100

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: patent
2nd row: patent
3rd row: patent
4th row: patent
5th row: patent

Common Values

Value | Count | Frequency (%)
patent | 100 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

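단어속성명 (word attribute name)
Categorical

Distinct: 5
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
속성 (attribute) | 49
기타 (other) | 17
라이프 (life) | 15
상품 (product) | 11
장소 (place) | 8

Length

Max length: 3
Median length: 2
Mean length: 2.15
Min length: 2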
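
The length statistics follow directly from the category counts: only 라이프 has three characters, so the mean length is (15 × 3 + 85 × 2) / 100 = 2.15. A minimal sketch of that arithmetic, with the counts copied from this report rather than computed from the dataset:

```python
# Category counts as reported for 단어속성명.
counts = {"속성": 49, "기타": 17, "라이프": 15, "상품": 11, "장소": 8}

total = sum(counts.values())

# One length per row, sorted so the middle elements give the median.
lengths = sorted(len(name) for name, n in counts.items() for _ in range(n))

mean_length = sum(lengths) / total
median_length = (lengths[total // 2 - 1] + lengths[total // 2]) / 2  # even row count
min_length, max_length = lengths[0], lengths[-1]

print(total, mean_length, median_length, min_length, max_length)
```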

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 라이프
2nd row: 기타
3rd row: 기타
4th row: 상품
5th row: 속성

Common Values

Value | Count | Frequency (%)
속성 | 49 | 49.0%
기타 | 17 | 17.0%
라이프 | 15 | 15.0%
상품 | 11 | 11.0%
장소 | 8 | 8.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

연관어명 (related term)
Text

Distinct: 62
Distinct (%): 62.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 5
Median length: 2
Mean length: 2.41
Min length: 2

Characters and Unicode

Total characters: 241
Distinct characters: 101
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 24
Unique (%): 24.0%

Sample

1st row: 감내
2nd row: 감천동
3rd row: 거듭나다
4th row: 거치대
5th row: 곡선

Value | Count | Frequency (%)
감내 | 2 | 2.0%
복구 | 2 | 2.0%
배관 | 2 | 2.0%
배수구 | 2 | 2.0%
벌레 | 2 | 2.0%
보정 | 2 | 2.0%
볼트 | 2 | 2.0%
부력 | 2 | 2.0%
부산광역시 | 2 | 2.0%
받침 | 2 | 2.0%
Other values (52) | 80 | 80.0%

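Most occurring characters

Value | Count | Frequency (%)
(missing) | 9 | 3.7%
(missing) | 8 | 3.3%
(missing) | 7 | 2.9%
(missing) | 6 | 2.5%
(missing) | 6 | 2.5%
(missing) | 5 | 2.1%
(missing) | 5 | 2.1%
(missing) | 4 | 1.7%
(missing) | 4 | 1.7%
(missing) | 4 | 1.7%
Other values (91) | 183 | 75.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 241 | 100.0%

Most frequent character per category

Other Letter

Value | Count | Frequency (%)
(missing) | 9 | 3.7%
(missing) | 8 | 3.3%
(missing) | 7 | 2.9%
(missing) | 6 | 2.5%
(missing) | 6 | 2.5%
(missing) | 5 | 2.1%
(missing) | 5 | 2.1%
(missing) | 4 | 1.7%
(missing) | 4 | 1.7%
(missing) | 4 | 1.7%
Other values (91) | 183 | 75.9%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 241 | 100.0%

Most frequent character per script

Hangul

Value | Count | Frequency (%)
(missing) | 9 | 3.7%
(missing) | 8 | 3.3%
(missing) | 7 | 2.9%
(missing) | 6 | 2.5%
(missing) | 6 | 2.5%
(missing) | 5 | 2.1%
(missing) | 5 | 2.1%
(missing) | 4 | 1.7%
(missing) | 4 | 1.7%
(missing) | 4 | 1.7%
Other values (91) | 183 | 75.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 241 | 100.0%

Most frequent character per block

Hangul

Value | Count | Frequency (%)
(missing) | 9 | 3.7%
(missing) | 8 | 3.3%
(missing) | 7 | 2.9%
(missing) | 6 | 2.5%
(missing) | 6 | 2.5%
(missing) | 5 | 2.1%
(missing) | 5 | 2.1%
(missing) | 4 | 1.7%
(missing) | 4 | 1.7%
(missing) | 4 | 1.7%
Other values (91) | 183 | 75.9%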

일간연관어언급량 (daily related-term mention count)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
1 | 100

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 100 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

일간연관어단어량 (daily related-term word count)
Real number (ℝ)

Distinct: 10
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.89
Minimum: 1
Maximum: 27
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 2
Q3: 3
95th percentile: 8.1
Maximum: 27
Range: 26
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 3.9693141
Coefficient of variation (CV): 1.3734651
Kurtosis: 18.247478
Mean: 2.89
Median Absolute Deviation (MAD): 1
Skewness: 3.9515419
Sum: 289
Variance: 15.755455
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=10)]
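
Common Values

Value | Count | Frequency (%)
1 | 49 | 49.0%
2 | 19 | 19.0%
3 | 14 | 14.0%
6 | 6 | 6.0%
4 | 5 | 5.0%
15 | 2 | 2.0%
8 | 2 | 2.0%
21 | 1 | 1.0%
27 | 1 | 1.0%
10 | 1 | 1.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 49 | 49.0%
2 | 19 | 19.0%
3 | 14 | 14.0%
4 | 5 | 5.0%
6 | 6 | 6.0%
8 | 2 | 2.0%
10 | 1 | 1.0%
15 | 2 | 2.0%
21 | 1 | 1.0%
27 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
27 | 1 | 1.0%
21 | 1 | 1.0%
15 | 2 | 2.0%
10 | 1 | 1.0%
8 | 2 | 2.0%
6 | 6 | 6.0%
4 | 5 | 5.0%
3 | 14 | 14.0%
2 | 19 | 19.0%
1 | 49 | 49.0%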
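
Because the column has only 10 distinct values, its frequency table fully determines the summary statistics. The sketch below rebuilds the 100 values from the reported counts and recomputes the sum, mean, sample variance, median, and 95th percentile. The linear-interpolation rule used for the percentile is an assumption, chosen because it agrees with the reported 8.1.

```python
import statistics

# Value -> count, copied from the frequency table for 일간연관어단어량.
freq = {1: 49, 2: 19, 3: 14, 4: 5, 6: 6, 8: 2, 10: 1, 15: 2, 21: 1, 27: 1}

# Expand the table back into a sorted list of 100 observations.
values = sorted(v for v, n in freq.items() for _ in range(n))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, n - 1 denominator
median = statistics.median(values)

# 95th percentile by linear interpolation between neighbouring ranks.
pos = (len(values) - 1) * 0.95
lower = int(pos)
p95 = values[lower] + (pos - lower) * (values[lower + 1] - values[lower])

print(total, mean, variance, median, p95)
```

The long right tail (a handful of values between 10 and 27 against a median of 2) is what drives the high skewness and kurtosis in the descriptive statistics.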

Interactions

[Figures: four interaction plots]

Correlations

Variable | 일간연관어연번 | 도메인 하위 카테고리명 | 단어속성명 | 연관어명 | 일간연관어단어량
일간연관어연번 | 1.000 | 0.998 | 0.000 | 0.000 | 0.000
도메인 하위 카테고리명 | 0.998 | 1.000 | 0.000 | 0.000 | 0.000
단어속성명 | 0.000 | 0.000 | 1.000 | 1.000 | 0.000
연관어명 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000
일간연관어단어량 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000

Variable | 도메인 하위 카테고리명 | 단어속성명
도메인 하위 카테고리명 | 1.000 | 0.000
단어속성명 | 0.000 | 1.000

Variable | 일간연관어연번 | 일간연관어단어량 | 도메인 하위 카테고리명 | 단어속성명
일간연관어연번 | 1.000 | 0.010 | 0.922 | 0.000
일간연관어단어량 | 0.010 | 1.000 | 0.000 | 0.000
도메인 하위 카테고리명 | 0.922 | 0.000 | 1.000 | 0.000
단어속성명 | 0.000 | 0.000 | 0.000 | 1.000
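
The strong association between 일간연관어연번 and 도메인 하위 카테고리명 is plausible if the rows are ordered by category, as the first and last sample rows suggest. The sketch below assumes, without confirmation from the report, that rows 1 to 62 are 물재난 and rows 63 to 100 are 하수도, and computes the correlation ratio η between category and serial number. This is a different measure from whichever ones the report used, so the result (about 0.84) is not expected to match 0.998 or 0.922. The helper `correlation_ratio` is hypothetical.

```python
import math

ids = list(range(1, 101))                  # assumed 일간연관어연번 values
labels = ["물재난"] * 62 + ["하수도"] * 38    # assumed row order by category


def correlation_ratio(categories, numbers):
    """Square root of between-group variation over total variation."""
    overall = sum(numbers) / len(numbers)
    groups = {}
    for category, x in zip(categories, numbers):
        groups.setdefault(category, []).append(x)
    between = sum(
        len(g) * (sum(g) / len(g) - overall) ** 2 for g in groups.values()
    )
    total = sum((x - overall) ** 2 for x in numbers)
    return math.sqrt(between / total)


eta = correlation_ratio(labels, ids)
print(round(eta, 4))
```

A serial number that simply counts rows carries no information of its own, so an association like this usually reflects sort order rather than a real relationship in the data.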

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

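First rows

Index | 일간연관어연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 단어속성명 | 연관어명 | 일간연관어언급량 | 일간연관어단어량
0 | 1 | 2021-03-02 | 물환경 | 물재난 | patent | 라이프 | 감내 | 1 | 2
1 | 2 | 2021-03-02 | 물환경 | 물재난 | patent | 기타 | 감천동 | 1 | 2
2 | 3 | 2021-03-02 | 물환경 | 물재난 | patent | 기타 | 거듭나다 | 1 | 1
3 | 4 | 2021-03-02 | 물환경 | 물재난 | patent | 상품 | 거치대 | 1 | 1
4 | 5 | 2021-03-02 | 물환경 | 물재난 | patent | 속성 | 곡선 | 1 | 3
5 | 6 | 2021-03-02 | 물환경 | 물재난 | patent | 상품 | 나사 | 1 | 3
6 | 7 | 2021-03-02 | 물환경 | 물재난 | patent | 속성 | 낙후 | 1 | 1
7 | 8 | 2021-03-02 | 물환경 | 물재난 | patent | 속성 | 너트 | 1 | 1
8 | 9 | 2021-03-02 | 물환경 | 물재난 | patent | 기타 | 대각 | 1 | 2
9 | 10 | 2021-03-02 | 물환경 | 물재난 | patent | 속성 | 대각선 | 1 | 1

Last rows

Index | 일간연관어연번 | 연월일 | 환경플랫폼 하위 도메인명 | 도메인 하위 카테고리명 | SNS 채널명 | 단어속성명 | 연관어명 | 일간연관어언급량 | 일간연관어단어량
90 | 91 | 2021-03-02 | 물환경 | 하수도 | patent | 장소 | 부산광역시 | 1 | 2
91 | 92 | 2021-03-02 | 물환경 | 하수도 | patent | 장소 | 사하구 | 1 | 2
92 | 93 | 2021-03-02 | 물환경 | 하수도 | patent | 상품 | 세면대 | 1 | 2
93 | 94 | 2021-03-02 | 물환경 | 하수도 | patent | 상품 | 세탁기 | 1 | 4
94 | 95 | 2021-03-02 | 물환경 | 하수도 | patent | 라이프 | 손해 | 1 | 1
95 | 96 | 2021-03-02 | 물환경 | 하수도 | patent | 속성 | 심포 | 1 | 1
96 | 97 | 2021-03-02 | 물환경 | 하수도 | patent | 기타 | 쏠리다 | 1 | 1
97 | 98 | 2021-03-02 | 물환경 | 하수도 | patent | 기타 | 쏠림 | 1 | 2
98 | 99 | 2021-03-02 | 물환경 | 하수도 | patent | 속성 | 씽크 | 1 | 3
99 | 100 | 2021-03-02 | 물환경 | 하수도 | patent | 속성 | 아키 | 1 | 1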