Overview

Dataset statistics

Number of variables: 7
Number of observations: 30
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.9 KiB
Average record size in memory: 64.4 B

Variable types

Text: 1
Categorical: 5
DateTime: 1

Dataset

Description: Sample data
Author: 한국신용데이터 (Korea Credit Data)
URL: https://bigdata-region.kr/#/dataset/274db76e-3965-4e59-98ef-50e76cb49de3

Alerts

연월 has the constant value 2023-04-01 (Constant)
게시글내키워드평균중요도 is highly overall correlated with 키워드포함게시글작성비율 and 1 other field (High correlation)
키워드포함게시글작성비율 is highly overall correlated with 게시글내키워드평균빈도 and 1 other field (High correlation)
게시글내키워드평균빈도 is highly overall correlated with 키워드포함게시글작성비율 and 1 other field (High correlation)
키워드분류 is highly overall correlated with 키워드길이 (High correlation)
키워드길이 is highly overall correlated with 키워드분류 (High correlation)
게시글내키워드평균빈도 is highly imbalanced (78.9%) (Imbalance)
키워드 has unique values (Unique)
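
The 78.9% imbalance figure for 게시글내키워드평균빈도 can be reproduced from its two class counts (29 and 1). The sketch below assumes the score is one minus the Shannon entropy divided by its maximum, log2 of the number of classes. That formula is an assumption about how the profiler works, although it does match the reported value, and the function name `imbalance_score` is illustrative only.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/H_max for a list of class counts (0 = balanced, near 1 = dominated by one class)."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # assumption: with a single class there is nothing to compare
    # Shannon entropy in bits of the observed class distribution
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    # Normalize by the entropy of a perfectly balanced distribution
    return 1.0 - entropy / math.log2(n_classes)

score = imbalance_score([29, 1])  # counts taken from the report
print(f"{score:.1%}")
```

A perfectly balanced column, for example `[15, 15]`, scores 0 under this definition.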

Reproduction

Analysis started: 2023-12-10 14:02:18.983953
Analysis finished: 2023-12-10 14:02:19.700586
Duration: 0.72 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

키워드 (keyword)
Text

UNIQUE

Distinct: 30
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 5
Median length: 4
Mean length: 3.1666667
Min length: 2
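
The length statistics are plain summaries of per-value character counts. The sketch below uses only the five keywords listed under Sample, so its results describe that subset and not the full 30-row column.

```python
import statistics

# First five sample values of 키워드 as shown in the report (a subset, not the full column).
keywords = ["가가", "가건물", "가게", "가게내", "가게라며"]
lengths = [len(k) for k in keywords]  # len() counts Unicode code points

summary = {
    "max": max(lengths),
    "median": statistics.median(lengths),
    "mean": statistics.mean(lengths),
    "min": min(lengths),
}
print(summary)
```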

Characters and Unicode

Total characters: 95
Distinct characters: 39
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
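
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category and the official name of a code point. In the sketch below the helper name `describe` is hypothetical, and the block check is a manual range test against the Hangul Syllables block (U+AC00 to U+D7AF), which is presumably what the report labels "Hangul".

```python
import unicodedata

def describe(ch):
    """Return (general category, official name, inside the Hangul Syllables block?) for one character."""
    in_hangul_syllables = 0xAC00 <= ord(ch) <= 0xD7AF  # block range per the Unicode code charts
    return unicodedata.category(ch), unicodedata.name(ch), in_hangul_syllables

info = describe("가")
print(info)

# Every character of a sample keyword falls in the same category, "Lo" (Other Letter).
categories = {unicodedata.category(ch) for ch in "가게라며"}
print(categories)
```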

Unique

Unique: 30
Unique (%): 100.0%

Sample

1st row: 가가
2nd row: 가건물
3rd row: 가게
4th row: 가게내
5th row: 가게라며

Value | Count | Frequency (%)
가가 | 1 | 3.3%
가건물 | 1 | 3.3%
가까이 | 1 | 3.3%
가까워지고 | 1 | 3.3%
가기도 | 1 | 3.3%
가기 | 1 | 3.3%
가구 | 1 | 3.3%
가과 | 1 | 3.3%
가공 | 1 | 3.3%
가고 | 1 | 3.3%
Other values (20) | 20 | 66.7%

Most occurring characters

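Value | Count | Frequency (%)
(not captured) | 31 | 32.6%
(not captured) | 10 | 10.5%
(not captured) | 5 | 5.3%
(not captured) | 5 | 5.3%
(not captured) | 4 | 4.2%
(not captured) | 3 | 3.2%
(not captured) | 2 | 2.1%
(not captured) | 2 | 2.1%
(not captured) | 2 | 2.1%
(not captured) | 2 | 2.1%
Other values (29) | 29 | 30.5%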

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 95 | 100.0%

Most frequent character per category

Other Letter
Same counts and frequencies as the Most occurring characters table, because all 95 characters fall in this category.

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 95 | 100.0%

Most frequent character per script

Hangul
Same counts and frequencies as the Most occurring characters table, because all 95 characters are in this script.

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 95 | 100.0%

Most frequent character per block

Hangul
Same counts and frequencies as the Most occurring characters table, because all 95 characters are in this block.

키워드분류 (keyword category)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 6.7%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B
Value counts (bar chart): 동사 (verb): 18, 명사 (noun): 12

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 명사
2nd row: 명사
3rd row: 명사
4th row: 동사
5th row: 동사

Common Values

Value | Count | Frequency (%)
동사 | 18 | 60.0%
명사 | 12 | 40.0%

Length

Histogram of lengths of the category (figure not reproduced)

Common Values (Plot)

Bar chart of the counts in the Common Values table (figure not reproduced)

키워드길이 (keyword length)
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 13.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B
Value counts (bar chart): 2: 11, 3: 8, 4: 6, 5: 5

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 3
3rd row: 2
4th row: 3
5th row: 4

Common Values

Value | Count | Frequency (%)
2 | 11 | 36.7%
3 | 8 | 26.7%
4 | 6 | 20.0%
5 | 5 | 16.7%
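
The Frequency (%) column is each count divided by the 30 observations. The sketch below rebuilds the 키워드길이 column from the counts in the table above (the row order is arbitrary and the variable names are illustrative) and recomputes the percentages with `collections.Counter`.

```python
from collections import Counter

# Column rebuilt from the reported counts; values are strings because the
# report treats this variable as Categorical.
values = ["2"] * 11 + ["3"] * 8 + ["4"] * 6 + ["5"] * 5
counts = Counter(values)
n = len(values)

# (value, count, percentage rounded to one decimal), most frequent first
table = [(value, count, round(100 * count / n, 1)) for value, count in counts.most_common()]
for value, count, pct in table:
    print(f"{value}\t{count}\t{pct}%")
```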

Length

Histogram of lengths of the category (figure not reproduced)

Common Values (Plot)

Bar chart of the counts in the Common Values table (figure not reproduced)

키워드포함게시글작성비율 (ratio of posts containing the keyword)
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 13.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B
Value counts (bar chart): 0: 24, 1: 3, 2: 2, 19: 1

Length

Max length: 2
Median length: 1
Mean length: 1.0333333
Min length: 1

Unique

Unique: 1
Unique (%): 3.3%
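
Distinct and Unique measure different things. The figures for this variable are consistent with Distinct counting the different values present and Unique counting the values that occur exactly once. The sketch below applies that reading (an interpretation, not a statement of the profiler's internals) to the column rebuilt from the counts listed under Common Values.

```python
from collections import Counter

# 키워드포함게시글작성비율 rebuilt from its reported value counts.
values = ["0"] * 24 + ["1"] * 3 + ["2"] * 2 + ["19"] * 1
counts = Counter(values)
n = len(values)

distinct = len(counts)                              # different values present
unique = sum(1 for c in counts.values() if c == 1)  # values that occur exactly once

print(distinct, f"{distinct / n:.1%}", unique, f"{unique / n:.1%}")
```

Here only the value 19 occurs once, which gives 1 unique value out of 4 distinct ones.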

Sample

1st row: 0
2nd row: 0
3rd row: 19
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 24 | 80.0%
1 | 3 | 10.0%
2 | 2 | 6.7%
19 | 1 | 3.3%

Length

Histogram of lengths of the category (figure not reproduced)

Common Values (Plot)

Bar chart of the counts in the Common Values table (figure not reproduced)

게시글내키워드평균빈도 (average keyword frequency within posts)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 6.7%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B
Value counts (bar chart): 1: 29, 2: 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 3.3%

Sample

1st row: 1
2nd row: 1
3rd row: 2
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 29 | 96.7%
2 | 1 | 3.3%

Length

Histogram of lengths of the category (figure not reproduced)

Common Values (Plot)

Bar chart of the counts in the Common Values table (figure not reproduced)

게시글내키워드평균중요도 (average keyword importance within posts)
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 16.7%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B
Value counts (bar chart): 8: 19, 7: 5, 6: 4, 9: 1, 4: 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 2
Unique (%): 6.7%

Sample

1st row: 9
2nd row: 7
3rd row: 4
4th row: 8
5th row: 8

Common Values

Value | Count | Frequency (%)
8 | 19 | 63.3%
7 | 5 | 16.7%
6 | 4 | 13.3%
9 | 1 | 3.3%
4 | 1 | 3.3%

Length

Histogram of lengths of the category (figure not reproduced)

Common Values (Plot)

Bar chart of the counts in the Common Values table (figure not reproduced)

연월 (year-month)
Date

CONSTANT

Distinct: 1
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B
Minimum: 2023-04-01 00:00:00
Maximum: 2023-04-01 00:00:00
Histogram with fixed size bins (bins=1) (figure not reproduced)

Correlations

 | 키워드 | 키워드분류 | 키워드길이 | 키워드포함게시글작성비율 | 게시글내키워드평균빈도 | 게시글내키워드평균중요도
키워드 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
키워드분류 | 1.000 | 1.000 | 0.947 | 0.533 | 0.000 | 0.255
키워드길이 | 1.000 | 0.947 | 1.000 | 0.000 | 0.000 | 0.000
키워드포함게시글작성비율 | 1.000 | 0.533 | 0.000 | 1.000 | 1.000 | 0.783
게시글내키워드평균빈도 | 1.000 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000
게시글내키워드평균중요도 | 1.000 | 0.255 | 0.000 | 0.783 | 1.000 | 1.000

 | 게시글내키워드평균중요도 | 키워드포함게시글작성비율 | 키워드분류 | 키워드길이 | 게시글내키워드평균빈도
게시글내키워드평균중요도 | 1.000 | 0.721 | 0.287 | 0.000 | 0.945
키워드포함게시글작성비율 | 0.721 | 1.000 | 0.346 | 0.000 | 0.964
키워드분류 | 0.287 | 0.346 | 1.000 | 0.763 | 0.000
키워드길이 | 0.000 | 0.000 | 0.763 | 1.000 | 0.000
게시글내키워드평균빈도 | 0.945 | 0.964 | 0.000 | 0.000 | 1.000

 | 키워드분류 | 키워드길이 | 키워드포함게시글작성비율 | 게시글내키워드평균빈도 | 게시글내키워드평균중요도
키워드분류 | 1.000 | 0.763 | 0.346 | 0.000 | 0.287
키워드길이 | 0.763 | 1.000 | 0.000 | 0.000 | 0.000
키워드포함게시글작성비율 | 0.346 | 0.000 | 1.000 | 0.964 | 0.721
게시글내키워드평균빈도 | 0.000 | 0.000 | 0.964 | 1.000 | 0.945
게시글내키워드평균중요도 | 0.287 | 0.000 | 0.721 | 0.945 | 1.000
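
The matrices above report pairwise association between columns that are mostly categorical. The report text does not name the coefficient. Cramér's V is a common choice for pairs of categorical variables and is used below purely as an illustrative assumption, in its uncorrected form. The labels are toy data, not the dataset, so the sketch does not reproduce the values in the matrices, and the function name `cramers_v` is hypothetical.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length sequences of category labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed cell counts
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n  # expected count under independence
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Toy labels (hypothetical, not taken from the dataset).
perfect = cramers_v(["a"] * 10 + ["b"] * 10, ["x"] * 10 + ["y"] * 10)
independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(perfect, independent)
```

A value near 1 means one column almost determines the other, and a value near 0 means the columns look independent.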

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

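 | 키워드 | 키워드분류 | 키워드길이 | 키워드포함게시글작성비율 | 게시글내키워드평균빈도 | 게시글내키워드평균중요도 | 연월
0 | 가가 | 명사 | 2 | 0 | 1 | 9 | 2023-04
1 | 가건물 | 명사 | 3 | 0 | 1 | 7 | 2023-04
2 | 가게 | 명사 | 2 | 19 | 2 | 4 | 2023-04
3 | 가게내 | 동사 | 3 | 0 | 1 | 8 | 2023-04
4 | 가게라며 | 동사 | 4 | 0 | 1 | 8 | 2023-04
5 | 가게세 | 동사 | 3 | 0 | 1 | 7 | 2023-04
6 | 가게세는 | 동사 | 4 | 0 | 1 | 7 | 2023-04
7 | 가게안에서 | 동사 | 5 | 0 | 1 | 8 | 2023-04
8 | 가게였고 | 동사 | 4 | 0 | 1 | 8 | 2023-04
9 | 가게였는데 | 동사 | 5 | 0 | 1 | 7 | 2023-04

 | 키워드 | 키워드분류 | 키워드길이 | 키워드포함게시글작성비율 | 게시글내키워드평균빈도 | 게시글내키워드평균중요도 | 연월
20 | 가계 | 명사 | 2 | 1 | 1 | 8 | 2023-04
21 | 가고 | 동사 | 2 | 1 | 1 | 6 | 2023-04
22 | 가공 | 명사 | 2 | 0 | 1 | 8 | 2023-04
23 | 가과 | 명사 | 2 | 0 | 1 | 8 | 2023-04
24 | 가구 | 명사 | 2 | 0 | 1 | 8 | 2023-04
25 | 가기 | 명사 | 2 | 0 | 1 | 7 | 2023-04
26 | 가기도 | 동사 | 3 | 0 | 1 | 8 | 2023-04
27 | 가까워지고 | 동사 | 5 | 0 | 1 | 8 | 2023-04
28 | 가까이 | 명사 | 3 | 1 | 1 | 6 | 2023-04
29 | 가끔 | 명사 | 2 | 2 | 1 | 6 | 2023-04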