Overview

Dataset statistics

Number of variables: 8
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.7 KiB
Average record size in memory: 68.3 B

Variable types

Text: 4
Categorical: 1
Numeric: 3

Alerts

KWRD_RANK_CO is highly overall correlated with PASSNGR_KWRD_RANK_CO and 1 other field (High correlation)
PASSNGR_KWRD_RANK_CO is highly overall correlated with KWRD_RANK_CO and 1 other field (High correlation)
LCLS_KWRD_RANK_CO is highly overall correlated with KWRD_RANK_CO and 1 other field (High correlation)
ID has unique values (Unique)

Reproduction

Analysis started: 2023-12-10 10:13:14.084239
Analysis finished: 2023-12-10 10:13:17.632703
Duration: 3.55 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
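
The reported duration can be checked against the two timestamps. A minimal sketch using only the Python standard library; the variable names are illustrative and not part of the report:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of the report.
started = datetime.fromisoformat("2023-12-10 10:13:14.084239")
finished = datetime.fromisoformat("2023-12-10 10:13:17.632703")

# Elapsed wall-clock time in seconds, rounded the way the report displays it.
duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```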

Variables

ID
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 6
Median length: 6
Mean length: 5.6
Min length: 5

Characters and Unicode

Total characters: 560
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
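
As an illustration of how such character properties can be tallied, the sketch below counts Unicode general categories for the five sample ID values shown in this report, using Python's standard `unicodedata` module. It is not the profiler's own implementation, and the function name `category_counts` is hypothetical. The standard library exposes the general category but not the script or block, so only the category is covered here.

```python
import unicodedata
from collections import Counter

# The five sample ID values listed in this report.
sample_ids = ["TTL_01", "TTL_02", "TTL_03", "TTL_04", "TTL_05"]


def category_counts(values):
    """Count Unicode general categories over all characters in the values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


counts = category_counts(sample_ids)
# Lu = Uppercase Letter, Nd = Decimal Number, Pc = Connector Punctuation
for code, count in counts.most_common():
    print(code, count)
```

Hangul syllables such as those in the keyword columns fall under the category code `Lo` (Other Letter).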

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: TTL_01
2nd row: TTL_02
3rd row: TTL_03
4th row: TTL_04
5th row: TTL_05

Value | Count | Frequency (%)
ttl_01 | 1 | 1.0%
bsn_03 | 1 | 1.0%
bsn_14 | 1 | 1.0%
bsn_13 | 1 | 1.0%
bsn_12 | 1 | 1.0%
bsn_11 | 1 | 1.0%
bsn_10 | 1 | 1.0%
bsn_09 | 1 | 1.0%
bsn_08 | 1 | 1.0%
bsn_07 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Most occurring characters

Value | Count | Frequency (%)
_ | 100 | 17.9%
T | 60 | 10.7%
S | 60 | 10.7%
L | 60 | 10.7%
0 | 46 | 8.2%
1 | 41 | 7.3%
2 | 40 | 7.1%
N | 30 | 5.4%
B | 30 | 5.4%
3 | 13 | 2.3%
Other values (8) | 80 | 14.3%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 260 | 46.4%
Decimal Number | 200 | 35.7%
Connector Punctuation | 100 | 17.9%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 46 | 23.0%
1 | 41 | 20.5%
2 | 40 | 20.0%
3 | 13 | 6.5%
7 | 10 | 5.0%
9 | 10 | 5.0%
8 | 10 | 5.0%
6 | 10 | 5.0%
5 | 10 | 5.0%
4 | 10 | 5.0%

Uppercase Letter
Value | Count | Frequency (%)
T | 60 | 23.1%
S | 60 | 23.1%
L | 60 | 23.1%
N | 30 | 11.5%
B | 30 | 11.5%
D | 10 | 3.8%
G | 10 | 3.8%

Connector Punctuation
Value | Count | Frequency (%)
_ | 100 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 300 | 53.6%
Latin | 260 | 46.4%

Most frequent character per script

Common
Value | Count | Frequency (%)
_ | 100 | 33.3%
0 | 46 | 15.3%
1 | 41 | 13.7%
2 | 40 | 13.3%
3 | 13 | 4.3%
7 | 10 | 3.3%
9 | 10 | 3.3%
8 | 10 | 3.3%
6 | 10 | 3.3%
5 | 10 | 3.3%

Latin
Value | Count | Frequency (%)
T | 60 | 23.1%
S | 60 | 23.1%
L | 60 | 23.1%
N | 30 | 11.5%
B | 30 | 11.5%
D | 10 | 3.8%
G | 10 | 3.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 560 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
_ | 100 | 17.9%
T | 60 | 10.7%
S | 60 | 10.7%
L | 60 | 10.7%
0 | 46 | 8.2%
1 | 41 | 7.3%
2 | 40 | 7.1%
N | 30 | 5.4%
B | 30 | 5.4%
3 | 13 | 2.3%
Other values (8) | 80 | 14.3%

AREA_NM
Categorical

Distinct: 4
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

전국: 30
서울특별시: 30
부산광역시: 30
대구광역시: 10

Length

Max length: 5
Median length: 5
Mean length: 4.1
Min length: 2
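
The length figures follow directly from the four category counts listed for AREA_NM. A small sketch that recomputes them from those counts; the dictionary and variable names are illustrative only:

```python
# Category counts for AREA_NM as listed in this report.
area_counts = {"전국": 30, "서울특별시": 30, "부산광역시": 30, "대구광역시": 10}

total = sum(area_counts.values())

# Length statistics weight each category's character length by its count.
mean_length = sum(len(name) * n for name, n in area_counts.items()) / total
min_length = min(len(name) for name in area_counts)
max_length = max(len(name) for name in area_counts)

# Share of distinct values among all observations, in percent.
distinct_pct = 100 * len(area_counts) / total

print(total, mean_length, min_length, max_length, distinct_pct)
```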

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전국
2nd row: 전국
3rd row: 전국
4th row: 전국
5th row: 전국

Common Values

Value | Count | Frequency (%)
전국 | 30 | 30.0%
서울특별시 | 30 | 30.0%
부산광역시 | 30 | 30.0%
대구광역시 | 10 | 10.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
전국 | 30 | 30.0%
서울특별시 | 30 | 30.0%
부산광역시 | 30 | 30.0%
대구광역시 | 10 | 10.0%

KWRD_RANK_CO
Real number (ℝ)

HIGH CORRELATION 

Distinct: 30
Distinct (%): 30.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.5
Minimum: 1
Maximum: 30
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 2
Q1: 7
Median: 14
Q3: 22
95th percentile: 29
Maximum: 30
Range: 29
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 8.8334763
Coefficient of variation (CV): 0.60920526
Kurtosis: -1.2329291
Mean: 14.5
Median Absolute Deviation (MAD): 8
Skewness: 0.16149815
Sum: 1450
Variance: 78.030303
Monotonicity: Not monotonic
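
Several of these figures can be cross-checked from the frequency tables of this variable. The sketch below rebuilds the column under one assumption: ranks 1 to 10 occur 4 times each and ranks 21 to 30 occur 3 times each (both listed in the report), and ranks 11 to 20 are assumed to occur 3 times each, which is what the remaining 30 observations and 30 distinct values suggest. It uses the sample variance (divisor n - 1), which matches the reported value under that assumption.

```python
import statistics

# Assumed reconstruction of the column from the frequency tables:
# ranks 1-10 appear 4 times each, ranks 11-30 appear 3 times each.
values = [r for r in range(1, 11) for _ in range(4)]
values += [r for r in range(11, 31) for _ in range(3)]

n = len(values)
total = sum(values)
mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance, divisor n - 1
std_dev = statistics.stdev(values)      # square root of the sample variance
cv = std_dev / mean                     # coefficient of variation

print(n, total, mean, median)
print(round(variance, 6), round(std_dev, 7), round(cv, 8))
```

The coefficient of variation is simply the standard deviation divided by the mean, so it is only meaningful here because the mean is well away from zero.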
Histogram with fixed size bins (bins=30)
Value | Count | Frequency (%)
1 | 4 | 4.0%
3 | 4 | 4.0%
4 | 4 | 4.0%
5 | 4 | 4.0%
6 | 4 | 4.0%
7 | 4 | 4.0%
8 | 4 | 4.0%
9 | 4 | 4.0%
10 | 4 | 4.0%
2 | 4 | 4.0%
Other values (20) | 60 | 60.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 4 | 4.0%
2 | 4 | 4.0%
3 | 4 | 4.0%
4 | 4 | 4.0%
5 | 4 | 4.0%
6 | 4 | 4.0%
7 | 4 | 4.0%
8 | 4 | 4.0%
9 | 4 | 4.0%
10 | 4 | 4.0%

Maximum 10 values

Value | Count | Frequency (%)
30 | 3 | 3.0%
29 | 3 | 3.0%
28 | 3 | 3.0%
27 | 3 | 3.0%
26 | 3 | 3.0%
25 | 3 | 3.0%
24 | 3 | 3.0%
23 | 3 | 3.0%
22 | 3 | 3.0%
21 | 3 | 3.0%

KWRD_NM
Text

Distinct: 70
Distinct (%): 70.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 6
Median length: 2
Mean length: 2.6
Min length: 1

Characters and Unicode

Total characters: 260
Distinct characters: 115
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 50
Unique (%): 50.0%

Sample

1st row: 먹거리
2nd row: 바다
3rd row: 계곡
4th row: 해변
5th row: 공원

Value | Count | Frequency (%)
먹거리 | 4 | 4.0%
공원 | 4 | 4.0%
축제 | 3 | 3.0%
호텔 | 3 | 3.0%
맛집 | 3 | 3.0%
카페 | 3 | 3.0%
시장 | 3 | 3.0%
재래시장 | 3 | 3.0%
산책 | 2 | 2.0%
온천 | 2 | 2.0%
Other values (60) | 70 | 70.0%

Most occurring characters

ValueCountFrequency (%)
13
 
5.0%
11
 
4.2%
10
 
3.8%
10
 
3.8%
8
 
3.1%
8
 
3.1%
8
 
3.1%
7
 
2.7%
6
 
2.3%
4
 
1.5%
Other values (105) 175
67.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 260
100.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
13
 
5.0%
11
 
4.2%
10
 
3.8%
10
 
3.8%
8
 
3.1%
8
 
3.1%
8
 
3.1%
7
 
2.7%
6
 
2.3%
4
 
1.5%
Other values (105) 175
67.3%

Most occurring scripts

ValueCountFrequency (%)
Hangul 260
100.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
13
 
5.0%
11
 
4.2%
10
 
3.8%
10
 
3.8%
8
 
3.1%
8
 
3.1%
8
 
3.1%
7
 
2.7%
6
 
2.3%
4
 
1.5%
Other values (105) 175
67.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 260
100.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
13
 
5.0%
11
 
4.2%
10
 
3.8%
10
 
3.8%
8
 
3.1%
8
 
3.1%
8
 
3.1%
7
 
2.7%
6
 
2.3%
4
 
1.5%
Other values (105) 175
67.3%

PASSNGR_KWRD_RANK_CO
Real number (ℝ)

HIGH CORRELATION 

Distinct: 30
Distinct (%): 30.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.5
Minimum: 1
Maximum: 30
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 2
Q1: 7
Median: 14
Q3: 22
95th percentile: 29
Maximum: 30
Range: 29
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 8.8334763
Coefficient of variation (CV): 0.60920526
Kurtosis: -1.2329291
Mean: 14.5
Median Absolute Deviation (MAD): 8
Skewness: 0.16149815
Sum: 1450
Variance: 78.030303
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=30)
Value | Count | Frequency (%)
1 | 4 | 4.0%
3 | 4 | 4.0%
4 | 4 | 4.0%
5 | 4 | 4.0%
6 | 4 | 4.0%
7 | 4 | 4.0%
8 | 4 | 4.0%
9 | 4 | 4.0%
10 | 4 | 4.0%
2 | 4 | 4.0%
Other values (20) | 60 | 60.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 4 | 4.0%
2 | 4 | 4.0%
3 | 4 | 4.0%
4 | 4 | 4.0%
5 | 4 | 4.0%
6 | 4 | 4.0%
7 | 4 | 4.0%
8 | 4 | 4.0%
9 | 4 | 4.0%
10 | 4 | 4.0%

Maximum 10 values

Value | Count | Frequency (%)
30 | 3 | 3.0%
29 | 3 | 3.0%
28 | 3 | 3.0%
27 | 3 | 3.0%
26 | 3 | 3.0%
25 | 3 | 3.0%
24 | 3 | 3.0%
23 | 3 | 3.0%
22 | 3 | 3.0%
21 | 3 | 3.0%

PASSNGR_KWRD_NM
Text

Distinct: 70
Distinct (%): 70.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 5
Median length: 2
Mean length: 2.61
Min length: 1

Characters and Unicode

Total characters: 261
Distinct characters: 115
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 50
Unique (%): 50.0%

Sample

1st row: 바다
2nd row: 먹거리
3rd row: 해변
4th row: 계곡
5th row: 펜션

Value | Count | Frequency (%)
먹거리 | 4 | 4.0%
공원 | 4 | 4.0%
맛집 | 3 | 3.0%
호텔 | 3 | 3.0%
시장 | 3 | 3.0%
카페 | 3 | 3.0%
볼거리 | 3 | 3.0%
산책 | 3 | 3.0%
아이 | 2 | 2.0%
길거리음식 | 2 | 2.0%
Other values (60) | 70 | 70.0%

Most occurring characters

ValueCountFrequency (%)
13
 
5.0%
13
 
5.0%
11
 
4.2%
10
 
3.8%
9
 
3.4%
7
 
2.7%
6
 
2.3%
6
 
2.3%
4
 
1.5%
4
 
1.5%
Other values (105) 178
68.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 261
100.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
13
 
5.0%
13
 
5.0%
11
 
4.2%
10
 
3.8%
9
 
3.4%
7
 
2.7%
6
 
2.3%
6
 
2.3%
4
 
1.5%
4
 
1.5%
Other values (105) 178
68.2%

Most occurring scripts

ValueCountFrequency (%)
Hangul 261
100.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
13
 
5.0%
13
 
5.0%
11
 
4.2%
10
 
3.8%
9
 
3.4%
7
 
2.7%
6
 
2.3%
6
 
2.3%
4
 
1.5%
4
 
1.5%
Other values (105) 178
68.2%

Most occurring blocks

ValueCountFrequency (%)
Hangul 261
100.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
13
 
5.0%
13
 
5.0%
11
 
4.2%
10
 
3.8%
9
 
3.4%
7
 
2.7%
6
 
2.3%
6
 
2.3%
4
 
1.5%
4
 
1.5%
Other values (105) 178
68.2%

LCLS_KWRD_RANK_CO
Real number (ℝ)

HIGH CORRELATION 

Distinct: 30
Distinct (%): 30.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.5
Minimum: 1
Maximum: 30
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 2
Q1: 7
Median: 14
Q3: 22
95th percentile: 29
Maximum: 30
Range: 29
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 8.8334763
Coefficient of variation (CV): 0.60920526
Kurtosis: -1.2329291
Mean: 14.5
Median Absolute Deviation (MAD): 8
Skewness: 0.16149815
Sum: 1450
Variance: 78.030303
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=30)
Value | Count | Frequency (%)
1 | 4 | 4.0%
3 | 4 | 4.0%
4 | 4 | 4.0%
5 | 4 | 4.0%
6 | 4 | 4.0%
7 | 4 | 4.0%
8 | 4 | 4.0%
9 | 4 | 4.0%
10 | 4 | 4.0%
2 | 4 | 4.0%
Other values (20) | 60 | 60.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 4 | 4.0%
2 | 4 | 4.0%
3 | 4 | 4.0%
4 | 4 | 4.0%
5 | 4 | 4.0%
6 | 4 | 4.0%
7 | 4 | 4.0%
8 | 4 | 4.0%
9 | 4 | 4.0%
10 | 4 | 4.0%

Maximum 10 values

Value | Count | Frequency (%)
30 | 3 | 3.0%
29 | 3 | 3.0%
28 | 3 | 3.0%
27 | 3 | 3.0%
26 | 3 | 3.0%
25 | 3 | 3.0%
24 | 3 | 3.0%
23 | 3 | 3.0%
22 | 3 | 3.0%
21 | 3 | 3.0%

LCLS_KWRD_NM
Text

Distinct: 70
Distinct (%): 70.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 6
Median length: 2
Mean length: 2.65
Min length: 1

Characters and Unicode

Total characters: 265
Distinct characters: 114
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 50
Unique (%): 50.0%

Sample

1st row: 먹거리
2nd row: 축제
3rd row: 공원
4th row: 계곡
5th row: 바다

Value | Count | Frequency (%)
먹거리 | 4 | 4.0%
공원 | 4 | 4.0%
시장 | 3 | 3.0%
맛집 | 3 | 3.0%
카페 | 3 | 3.0%
재래시장 | 3 | 3.0%
축제 | 3 | 3.0%
호텔 | 3 | 3.0%
쇼핑 | 2 | 2.0%
산책 | 2 | 2.0%
Other values (60) | 70 | 70.0%

Most occurring characters

ValueCountFrequency (%)
17
 
6.4%
10
 
3.8%
10
 
3.8%
10
 
3.8%
10
 
3.8%
9
 
3.4%
8
 
3.0%
7
 
2.6%
5
 
1.9%
5
 
1.9%
Other values (104) 174
65.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 265
100.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
17
 
6.4%
10
 
3.8%
10
 
3.8%
10
 
3.8%
10
 
3.8%
9
 
3.4%
8
 
3.0%
7
 
2.6%
5
 
1.9%
5
 
1.9%
Other values (104) 174
65.7%

Most occurring scripts

ValueCountFrequency (%)
Hangul 265
100.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
17
 
6.4%
10
 
3.8%
10
 
3.8%
10
 
3.8%
10
 
3.8%
9
 
3.4%
8
 
3.0%
7
 
2.6%
5
 
1.9%
5
 
1.9%
Other values (104) 174
65.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 265
100.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
17
 
6.4%
10
 
3.8%
10
 
3.8%
10
 
3.8%
10
 
3.8%
9
 
3.4%
8
 
3.0%
7
 
2.6%
5
 
1.9%
5
 
1.9%
Other values (104) 174
65.7%

Interactions

[Figure: 9 interaction plots]

Correlations

Variable | ID | AREA_NM | KWRD_RANK_CO | KWRD_NM | PASSNGR_KWRD_RANK_CO | PASSNGR_KWRD_NM | LCLS_KWRD_RANK_CO | LCLS_KWRD_NM
ID | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
AREA_NM | 1.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
KWRD_RANK_CO | 1.000 | 0.000 | 1.000 | 0.798 | 1.000 | 0.861 | 1.000 | 0.705
KWRD_NM | 1.000 | 0.000 | 0.798 | 1.000 | 0.798 | 0.945 | 0.798 | 0.990
PASSNGR_KWRD_RANK_CO | 1.000 | 0.000 | 1.000 | 0.798 | 1.000 | 0.861 | 1.000 | 0.705
PASSNGR_KWRD_NM | 1.000 | 0.000 | 0.861 | 0.945 | 0.861 | 1.000 | 0.861 | 0.964
LCLS_KWRD_RANK_CO | 1.000 | 0.000 | 1.000 | 0.798 | 1.000 | 0.861 | 1.000 | 0.705
LCLS_KWRD_NM | 1.000 | 0.000 | 0.705 | 0.990 | 0.705 | 0.964 | 0.705 | 1.000

Variable | KWRD_RANK_CO | PASSNGR_KWRD_RANK_CO | LCLS_KWRD_RANK_CO | AREA_NM
KWRD_RANK_CO | 1.000 | 1.000 | 1.000 | 0.000
PASSNGR_KWRD_RANK_CO | 1.000 | 1.000 | 1.000 | 0.000
LCLS_KWRD_RANK_CO | 1.000 | 1.000 | 1.000 | 0.000
AREA_NM | 0.000 | 0.000 | 0.000 | 1.000
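
In the sample rows of this report the three rank columns carry the same value in every row, which is consistent with coefficients of 1.000 between them. The sketch below illustrates this with Pearson's r on the first ten ranks. Pearson's r is used here only as a simple stand-in; this document does not state which coefficient the report computes, and the helper `pearson` is hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Rank values of the first ten sample rows; all three columns are identical.
kwrd_rank = list(range(1, 11))
passngr_rank = list(range(1, 11))
lcls_rank = list(range(1, 11))

r_kwrd_passngr = pearson(kwrd_rank, passngr_rank)
r_kwrd_lcls = pearson(kwrd_rank, lcls_rank)
print(r_kwrd_passngr, r_kwrd_lcls)
```

Columns that are exact copies of each other add no information, so an alert like this one is usually a prompt to check whether all three rank fields are needed.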

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
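
The two views can be approximated in a few lines. This is a rough sketch with hypothetical helper names (`nullity_by_column`, `nullity_matrix`), applied to the first two sample rows of this report, with `None` standing for a missing cell:

```python
def nullity_by_column(rows):
    """Count missing (None) cells per column across a list of row dicts."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + int(value is None)
    return counts


def nullity_matrix(rows):
    """One list of booleans per row: True where the cell is present."""
    return [[value is not None for value in row.values()] for row in rows]


# First two sample rows of the dataset, as shown in this report.
rows = [
    {"ID": "TTL_01", "AREA_NM": "전국", "KWRD_RANK_CO": 1, "KWRD_NM": "먹거리",
     "PASSNGR_KWRD_RANK_CO": 1, "PASSNGR_KWRD_NM": "바다",
     "LCLS_KWRD_RANK_CO": 1, "LCLS_KWRD_NM": "먹거리"},
    {"ID": "TTL_02", "AREA_NM": "전국", "KWRD_RANK_CO": 2, "KWRD_NM": "바다",
     "PASSNGR_KWRD_RANK_CO": 2, "PASSNGR_KWRD_NM": "먹거리",
     "LCLS_KWRD_RANK_CO": 2, "LCLS_KWRD_NM": "축제"},
]

missing_per_column = nullity_by_column(rows)
matrix = nullity_matrix(rows)
print(missing_per_column)
```

With no missing cells, every count is zero and every matrix entry is `True`, which matches the 0 missing cells reported in the overview.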

Sample

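First rows

Index | ID | AREA_NM | KWRD_RANK_CO | KWRD_NM | PASSNGR_KWRD_RANK_CO | PASSNGR_KWRD_NM | LCLS_KWRD_RANK_CO | LCLS_KWRD_NM
0 | TTL_01 | 전국 | 1 | 먹거리 | 1 | 바다 | 1 | 먹거리
1 | TTL_02 | 전국 | 2 | 바다 | 2 | 먹거리 | 2 | 축제
2 | TTL_03 | 전국 | 3 | 계곡 | 3 | 해변 | 3 | 공원
3 | TTL_04 | 전국 | 4 | 해변 | 4 | 계곡 | 4 | 계곡
4 | TTL_05 | 전국 | 5 | 공원 | 5 | 펜션 | 5 | 바다
5 | TTL_06 | 전국 | 6 | 축제 | 6 | 해산물 | 6 | 맛집
6 | TTL_07 | 전국 | 7 | 펜션 | 7 |  | 7 |
7 | TTL_08 | 전국 | 8 | 해산물 | 8 | 호텔 | 8 | 펜션
8 | TTL_09 | 전국 | 9 |  | 9 |  | 9 | 해산물
9 | TTL_10 | 전국 | 10 |  | 10 | 공원 | 10 | 해변

Last rows

Index | ID | AREA_NM | KWRD_RANK_CO | KWRD_NM | PASSNGR_KWRD_RANK_CO | PASSNGR_KWRD_NM | LCLS_KWRD_RANK_CO | LCLS_KWRD_NM
90 | DG_01 | 대구광역시 | 1 | 막창 | 1 | 막창 | 1 | 팔공산
91 | DG_02 | 대구광역시 | 2 | 대구 | 2 | 대구 | 2 | 막창
92 | DG_03 | 대구광역시 | 3 | 팔공산 | 3 | 먹거리 | 3 | 대구
93 | DG_04 | 대구광역시 | 4 | 먹거리 | 4 | 팔공산 | 4 | 먹거리
94 | DG_05 | 대구광역시 | 5 | 수성못 | 5 | 수성못 | 5 | 앞산
95 | DG_06 | 대구광역시 | 6 | 공원 | 6 | 서문시장 | 6 | 공원
96 | DG_07 | 대구광역시 | 7 | 앞산 | 7 | 김광석거리 | 7 | 수성못
97 | DG_08 | 대구광역시 | 8 | 서문시장 | 8 | 공원 | 8 | 두류공원
98 | DG_09 | 대구광역시 | 9 | 골목 | 9 | 곱창 | 9 | 골목
99 | DG_10 | 대구광역시 | 10 | 이월드 | 10 | 동성로 | 10 | 이월드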