Overview

Dataset statistics

Number of variables: 5
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 3
Duplicate rows (%): 3.0%
Total size in memory: 4.2 KiB
Average record size in memory: 43.3 B

Variable types

DateTime: 2
Numeric: 1
Categorical: 2

Alerts

Dataset has 3 (3.0%) duplicate rows (Duplicates)
state_nm is highly overall correlated with state_cd (High correlation)
state_cd is highly overall correlated with state_nm (High correlation)
state_cd is highly imbalanced (63.4%) (Imbalance)
state_nm is highly imbalanced (63.4%) (Imbalance)
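
The 63.4% imbalance figure is consistent with a normalized-entropy score, 1 - H / log2(k), applied to the 93/7 split reported for state_cd and state_nm. The sketch below assumes that definition; the function name `imbalance_score` is illustrative and is not taken from the report.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (bits) of the class shares.

    Assumed definition: 0.0 means perfectly balanced classes, 1.0 means a single class.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Value counts reported for state_cd (and state_nm): 93 and 7 out of 100 rows.
score = imbalance_score([93, 7])
print(f"{score:.1%}")  # 63.4%
```

A perfectly balanced 50/50 column would score 0 under the same definition.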

Reproduction

Analysis started: 2023-12-10 10:12:31.267844
Analysis finished: 2023-12-10 10:12:32.071487
Duration: 0.8 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
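
The reported duration can be checked from the two timestamps. This is a minimal sketch using only the start and finish times listed above.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-10 10:12:31.267844")
finished = datetime.fromisoformat("2023-12-10 10:12:32.071487")

duration = (finished - started).total_seconds()
print(f"{duration:.1f} seconds")  # 0.8 seconds
```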

Variables

watch_date
DateTime

Distinct: 55
Distinct (%): 55.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2015-01-12 00:00:00
Maximum: 2019-11-21 00:00:00
Histogram with fixed size bins (bins=50)

tour_time_nm
DateTime

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2023-12-10 09:00:00
Maximum: 2023-12-10 11:00:00
Histogram with fixed size bins (bins=3)

watch_member_no
Real number (ℝ)

Distinct: 38
Distinct (%): 38.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49.87
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 17.9
Q1: 24
Median: 40
Q3: 72.25
95-th percentile: 100
Maximum: 100
Range: 99
Interquartile range (IQR): 48.25

Descriptive statistics

Standard deviation: 31.123298
Coefficient of variation (CV): 0.6240886
Kurtosis: -1.1256649
Mean: 49.87
Median Absolute Deviation (MAD): 19.5
Skewness: 0.57054187
Sum: 4987
Variance: 968.6597
Monotonicity: Not monotonic
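
Several of the watch_member_no figures are derived from one another. The sketch below recomputes them from the reported sum, standard deviation, quartiles, and the 100 observations stated in the Overview; it checks arithmetic only and does not recompute anything from the raw data.

```python
# Figures copied from the watch_member_no summary.
n = 100                # number of observations (from the Overview)
total = 4987           # Sum
std = 31.123298        # Standard deviation
minimum, q1, q3, maximum = 1, 24, 72.25, 100

mean = total / n                   # 49.87
value_range = maximum - minimum    # 99
iqr = q3 - q1                      # 48.25
cv = std / mean                    # about 0.6240886
variance = std ** 2                # about 968.6597

print(mean, value_range, iqr, round(cv, 7), round(variance, 4))
```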
Histogram with fixed size bins (bins=38)
Most frequent values

Value               Count  Frequency (%)
100                    20  20.0%
20                     10  10.0%
24                      7  7.0%
30                      5  5.0%
50                      5  5.0%
70                      4  4.0%
25                      4  4.0%
27                      3  3.0%
31                      3  3.0%
23                      3  3.0%
Other values (28)      36  36.0%

Smallest 10 values

Value  Count  Frequency (%)
1          2  2.0%
11         1  1.0%
14         1  1.0%
16         1  1.0%
18         1  1.0%
20        10  10.0%
21         1  1.0%
22         1  1.0%
23         3  3.0%
24         7  7.0%

Largest 10 values

Value  Count  Frequency (%)
100       20  20.0%
95         1  1.0%
90         1  1.0%
80         1  1.0%
75         1  1.0%
73         1  1.0%
72         1  1.0%
70         4  4.0%
65         2  2.0%
60         2  2.0%

state_cd
Categorical

Alerts: High correlation, Imbalance

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Value counts: 2 (93), 0 (7)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value  Count  Frequency (%)
2         93  93.0%
0          7  7.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of the common values]

state_nm
Categorical

Alerts: High correlation, Imbalance

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Value counts: 확정 ("confirmed") (93), 예약신청 ("reservation requested") (7)

Length

Max length: 4
Median length: 2
Mean length: 2.14
Min length: 2
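
The length statistics follow directly from the two category values and their counts (93 and 7). The sketch below assumes length is measured in characters, which matches the reported minimum of 2 and maximum of 4.

```python
from statistics import mean, median

# One length per row: 93 rows of "확정" (2 characters), 7 rows of "예약신청" (4 characters).
lengths = [len("확정")] * 93 + [len("예약신청")] * 7

print(max(lengths), median(lengths), mean(lengths), min(lengths))
```

The mean works out to (93 × 2 + 7 × 4) / 100 = 2.14.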

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 예약신청
2nd row: 예약신청
3rd row: 확정
4th row: 확정
5th row: 확정

Common Values

Value                                Count  Frequency (%)
확정 ("confirmed")                      93  93.0%
예약신청 ("reservation requested")       7  7.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of the common values]

Interactions

[Figure: interactions plot]

Correlations

                 watch_date  tour_time_nm  watch_member_no  state_cd  state_nm
watch_date            1.000         0.827            0.792     0.777     0.777
tour_time_nm          0.827         1.000            0.651     0.000     0.000
watch_member_no       0.792         0.651            1.000     0.307     0.307
state_cd              0.777         0.000            0.307     1.000     0.993
state_nm              0.777         0.000            0.307     0.993     1.000

          state_nm  state_cd
state_nm     1.000     0.922
state_cd     0.922     1.000

                 watch_member_no  state_cd  state_nm
watch_member_no            1.000     0.224     0.224
state_cd                   0.224     1.000     0.922
state_nm                   0.224     0.922     1.000
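
The 0.922 between state_cd and state_nm can be reproduced with a bias-corrected Cramér's V that applies Yates' continuity correction. The sketch below rests on two assumptions that the report does not state: that this is the statistic behind the 2×2 matrix, and that every state_cd 0 row is 예약신청 and every state_cd 2 row is 확정 (as in the sample rows). The function name `cramers_v_corrected` is illustrative.

```python
import math

def cramers_v_corrected(table, yates=True):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    r, k = len(row_tot), len(col_tot)
    # Yates' continuity correction applies only to 2x2 tables (one degree of freedom).
    use_yates = yates and r == 2 and k == 2
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            diff = abs(observed - expected)
            if use_yates:
                diff = max(0.0, diff - 0.5)
            chi2 += diff ** 2 / expected
    phi2 = chi2 / n
    # Bias correction for finite sample size.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return 1.0 if denom == 0 else math.sqrt(phi2_corr / denom)

# Assumed cross-tabulation: rows are state_cd (0, 2), columns are state_nm (예약신청, 확정).
table = [[7, 0],
         [0, 93]]
v = cramers_v_corrected(table)
print(round(v, 3))  # 0.922
```

Under these assumptions the value stays below 1.0 even though the two columns map one-to-one, because the continuity correction shrinks the chi-squared statistic; with `yates=False` the same table gives 1.0.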

Missing values

A simple visualization of nullity by column.

The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

First 10 rows

    watch_date  tour_time_nm  watch_member_no  state_cd  state_nm
0   2015-01-12  10:00                      20         0  예약신청
1   2019-11-14  10:00                      31         0  예약신청
2   2015-11-12  10:00                      26         2  확정
3   2015-11-26  11:00                      14         2  확정
4   2016-04-06  11:00                      24         2  확정
5   2016-04-06  10:00                      42         2  확정
6   2016-04-06  10:00                      58         2  확정
7   2019-11-14  10:00                      31         0  예약신청
8   2016-04-07  10:00                      51         2  확정
9   2016-04-07  11:00                      60         2  확정

Last 10 rows

    watch_date  tour_time_nm  watch_member_no  state_cd  state_nm
90  2017-06-21  11:00                      20         2  확정
91  2017-06-21  11:00                      18         0  예약신청
92  2017-06-22  10:00                      20         2  확정
93  2017-06-29  11:00                      43         2  확정
94  2017-09-06  11:00                      31         2  확정
95  2017-09-13  11:00                     100         2  확정
96  2017-09-14  11:00                      49         2  확정
97  2017-09-14  10:00                      40         2  확정
98  2017-09-20  10:00                     100         2  확정
99  2017-09-21  10:00                      23         2  확정

Duplicate rows

Most frequently occurring

   watch_date  tour_time_nm  watch_member_no  state_cd  state_nm  # duplicates
0  2016-05-04  10:00                      24         2  확정                 2
1  2016-09-28  11:00                      30         2  확정                 2
2  2019-11-14  10:00                      31         0  예약신청             2
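
Duplicate rows of this kind can be found by counting identical tuples. The sketch below is illustrative only: it uses just the first ten rows from the Sample section rather than the full 100-row dataset, so it recovers only the 2019-11-14 duplicate.

```python
from collections import Counter

# First ten sample rows: (watch_date, tour_time_nm, watch_member_no, state_cd, state_nm)
rows = [
    ("2015-01-12", "10:00", 20, 0, "예약신청"),
    ("2019-11-14", "10:00", 31, 0, "예약신청"),
    ("2015-11-12", "10:00", 26, 2, "확정"),
    ("2015-11-26", "11:00", 14, 2, "확정"),
    ("2016-04-06", "11:00", 24, 2, "확정"),
    ("2016-04-06", "10:00", 42, 2, "확정"),
    ("2016-04-06", "10:00", 58, 2, "확정"),
    ("2019-11-14", "10:00", 31, 0, "예약신청"),
    ("2016-04-07", "10:00", 51, 2, "확정"),
    ("2016-04-07", "11:00", 60, 2, "확정"),
]

# Keep only rows that occur more than once, with their occurrence counts.
counts = Counter(rows)
duplicates = {row: c for row, c in counts.items() if c > 1}

for row, c in duplicates.items():
    print(row, c)
```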