Overview

Dataset statistics

Number of variables: 7
Number of observations: 100
Missing cells: 67
Missing cells (%): 9.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.8 KiB
Average record size in memory: 59.3 B
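
The missing-cell percentage follows directly from the counts in the same table: 67 missing cells out of 7 × 100 cells. A minimal sketch of that arithmetic is shown here; the function name is illustrative and is not part of ydata-profiling.

```python
def missing_cell_percentage(missing_cells: int, n_variables: int, n_observations: int) -> float:
    """Return the share of empty cells in an n_observations x n_variables table, in percent."""
    total_cells = n_variables * n_observations
    return 100.0 * missing_cells / total_cells


# Figures taken from the dataset statistics in this report.
pct = missing_cell_percentage(missing_cells=67, n_variables=7, n_observations=100)
print(f"{pct:.1f}%")  # prints 9.6%
```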

Variable types

Categorical: 3
Numeric: 1
Text: 3

Alerts

High correlation: stnd_year is highly overall correlated with repr_dt and 1 other field
High correlation: equip_tpe is highly overall correlated with repr_tpe_nms
High correlation: repr_tpe_nms is highly overall correlated with repr_dt and 2 other fields
High correlation: repr_dt is highly overall correlated with stnd_year and 1 other field
Imbalance: stnd_year is highly imbalanced (80.6%)
Imbalance: equip_tpe is highly imbalanced (91.9%)
Imbalance: repr_tpe_nms is highly imbalanced (86.0%)
Missing: mjr_parts has 67 (67.0%) missing values
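
The imbalance percentages in these alerts are consistent with one minus the normalized Shannon entropy of the category counts. The sketch below reproduces the three reported figures under that assumption; `imbalance_score` is an illustrative name, not a documented ydata-profiling function, and the counts come from the Common Values tables of each variable.

```python
import math


def imbalance_score(counts: list[int]) -> float:
    """1 - H / log2(k), where H is the Shannon entropy of the class counts
    and k is the number of classes. 0 means balanced, 1 means one class only."""
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class is treated as not scorable
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Category counts as reported for each variable in this report.
scores = {
    "stnd_year": imbalance_score([97, 3]),
    "equip_tpe": imbalance_score([99, 1]),
    "repr_tpe_nms": imbalance_score([97, 2, 1]),
}
for name, score in scores.items():
    print(f"{name}: {score:.1%}")
```

With only 100 rows, a single dominant category (97 to 99 rows) is enough to push all three scores above 80%.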

Reproduction

Analysis started: 2023-12-10 09:55:57.912196
Analysis finished: 2023-12-10 09:55:59.506422
Duration: 1.59 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

stnd_year
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value  Count
2003   97
2021   3

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2003
2nd row: 2021
3rd row: 2003
4th row: 2003
5th row: 2003

Common Values

Value  Count  Frequency (%)
2003   97     97.0%
2021   3      3.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values]

Words

Value  Count  Frequency (%)
2003   97     97.0%
2021   3      3.0%

repr_dt
Real number (ℝ)

HIGH CORRELATION

Distinct: 34
Distinct (%): 34.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20036418
Minimum: 20030723
Maximum: 20210727
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 20030723
5th percentile: 20030725
Q1: 20030924
Median: 20031002
Q3: 20031201
95th percentile: 20031210
Maximum: 20210727
Range: 180004
Interquartile range (IQR): 277.25

Descriptive statistics

Standard deviation: 30808.666
Coefficient of variation (CV): 0.0015376334
Kurtosis: 29.896274
Mean: 20036418
Median Absolute Deviation (MAD): 124
Skewness: 5.5944458
Sum: 2.0036418 × 10^9
Variance: 9.4917389 × 10^8
Monotonicity: Not monotonic
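
Several of these figures are derived from one another: the range is maximum minus minimum, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below checks those relationships using the rounded values printed in this report, so small rounding differences are expected.

```python
import math

# Values as printed in the quantile and descriptive statistics.
minimum, maximum = 20030723, 20210727
mean = 20036418
std = 30808.666

value_range = maximum - minimum  # Range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance

print(value_range)
print(f"{cv:.10f}")
print(f"{variance:.7e}")

# The recomputed figures agree with the reported ones up to rounding.
assert value_range == 180004
assert math.isclose(cv, 0.0015376334, rel_tol=1e-6)
assert math.isclose(variance, 9.4917389e8, rel_tol=1e-6)
```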
[Figure: Histogram with fixed size bins (bins=34)]

Common values

Value              Count  Frequency (%)
20031001           10     10.0%
20031204           9      9.0%
20030924           7      7.0%
20031127           7      7.0%
20031002           6      6.0%
20030925           4      4.0%
20030917           4      4.0%
20030929           4      4.0%
20031126           4      4.0%
20031210           3      3.0%
Other values (24)  42     42.0%

Minimum 10 values

Value     Count  Frequency (%)
20030723  1      1.0%
20030724  3      3.0%
20030725  2      2.0%
20030728  1      1.0%
20030730  2      2.0%
20030731  1      1.0%
20030917  4      4.0%
20030919  1      1.0%
20030920  2      2.0%
20030922  2      2.0%

Maximum 10 values

Value     Count  Frequency (%)
20210727  1      1.0%
20210722  1      1.0%
20210721  1      1.0%
20031210  3      3.0%
20031209  3      3.0%
20031206  2      2.0%
20031204  9      9.0%
20031203  3      3.0%
20031202  2      2.0%
20031201  2      2.0%
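
The repr_dt values look like dates encoded as YYYYMMDD integers (for example 20030723 and 20210727, which match the two stnd_year values), although the profiler treats the column as a real number. Under that assumption, statistics such as the mean or range of the raw integers are not meaningful as durations. A minimal sketch of parsing the values into dates first; `parse_yyyymmdd` is an illustrative helper, not part of the report's tooling.

```python
from datetime import date, datetime


def parse_yyyymmdd(value: int) -> date:
    """Interpret an integer such as 20030723 as the date 2003-07-23."""
    return datetime.strptime(str(value), "%Y%m%d").date()


# Minimum and maximum of repr_dt as reported above.
first = parse_yyyymmdd(20030723)
last = parse_yyyymmdd(20210727)

# The span in calendar days, as opposed to the integer "Range" of 180004.
span_days = (last - first).days
print(first, last, span_days)
```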

equip_tpe
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value  Count
보트    99
모터    1

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 보트
2nd row: 보트
3rd row: 보트
4th row: 보트
5th row: 보트

Common Values

Value         Count  Frequency (%)
보트 (boat)    99     99.0%
모터 (motor)   1      1.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values]

Words

Value  Count  Frequency (%)
보트    99     99.0%
모터    1      1.0%

repr_tpe_nms
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value          Count
기타정비        97
낙수방지 고무    2
기화기          1

Length

Max length: 7
Median length: 4
Mean length: 4.05
Min length: 3

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 기타정비
2nd row: 낙수방지 고무
3rd row: 기타정비
4th row: 기타정비
5th row: 기타정비

Common Values

Value                                  Count  Frequency (%)
기타정비 (other maintenance)             97     97.0%
낙수방지 고무 (drip-prevention rubber)    2      2.0%
기화기 (carburetor)                      1      1.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values]

Words

Value     Count  Frequency (%)
기타정비   97     95.1%
낙수방지   2      2.0%
고무      2      2.0%
기화기     1      1.0%

mjr_parts
Text

MISSING

Distinct: 25
Distinct (%): 75.8%
Missing: 67
Missing (%): 67.0%
Memory size: 932.0 B

Length

Max length: 23
Median length: 16
Mean length: 8
Min length: 2

Characters and Unicode

Total characters: 264
Distinct characters: 54
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
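
As an illustration of the character properties mentioned here, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for one mjr_parts sample value from this report; it is not the profiler's own implementation, and `category_counts` is an illustrative name.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count Unicode general categories (Lo, Zs, Ps, Pe, ...) in a string."""
    return Counter(unicodedata.category(ch) for ch in text)


# One of the mjr_parts sample values shown in this report.
sample = "낙수방지 고무(좌)"
counts = category_counts(sample)

# Lo = Other Letter, Zs = Space Separator,
# Ps = Open Punctuation, Pe = Close Punctuation.
for category, n in counts.most_common():
    print(category, n)
```

Hangul syllables fall under Other Letter (Lo), which is why that category dominates the mjr_parts and repr_desc tables.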

Unique

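Unique: 20
Unique (%): 60.6%

Sample

1st row: 낙수방지 고무(좌) (drip-prevention rubber, left)
2nd row: 펜더(좌현,상) (fender, port side, upper)
3rd row: 스로틀레버 (throttle lever)
4th row: 낙수방지 고무(좌) (drip-prevention rubber, left)
5th row: 선저외판/펜더 (bottom shell plating / fender)

Words

Value                    Count  Frequency (%)
카울(좌                   6      14.3%
소프트바우                 3      7.1%
스티어링와이어              2      4.8%
펜더(우현,상               2      4.8%
낙수방지                   2      4.8%
고무(좌                   2      4.8%
스트링거고무(우             2      4.8%
펜더(좌현,상               2      4.8%
카울(우                   2      4.8%
(token missing in source)  2      4.8%
Other values (16)         17     40.5%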

Most occurring characters

ValueCountFrequency (%)
( 22
 
8.3%
) 22
 
8.3%
14
 
5.3%
11
 
4.2%
11
 
4.2%
11
 
4.2%
, 10
 
3.8%
9
 
3.4%
/ 9
 
3.4%
9
 
3.4%
Other values (44) 136
51.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 190
72.0%
Open Punctuation 22
 
8.3%
Close Punctuation 22
 
8.3%
Other Punctuation 20
 
7.6%
Space Separator 9
 
3.4%
Decimal Number 1
 
0.4%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
14
 
7.4%
11
 
5.8%
11
 
5.8%
11
 
5.8%
9
 
4.7%
9
 
4.7%
8
 
4.2%
7
 
3.7%
7
 
3.7%
6
 
3.2%
Other values (37) 97
51.1%
Other Punctuation
ValueCountFrequency (%)
, 10
50.0%
/ 9
45.0%
# 1
 
5.0%
Open Punctuation
ValueCountFrequency (%)
( 22
100.0%
Close Punctuation
ValueCountFrequency (%)
) 22
100.0%
Space Separator
ValueCountFrequency (%)
9
100.0%
Decimal Number
ValueCountFrequency (%)
3 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 190
72.0%
Common 74
 
28.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
14
 
7.4%
11
 
5.8%
11
 
5.8%
11
 
5.8%
9
 
4.7%
9
 
4.7%
8
 
4.2%
7
 
3.7%
7
 
3.7%
6
 
3.2%
Other values (37) 97
51.1%
Common
ValueCountFrequency (%)
( 22
29.7%
) 22
29.7%
, 10
13.5%
/ 9
12.2%
9
12.2%
3 1
 
1.4%
# 1
 
1.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 190
72.0%
ASCII 74
 
28.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
( 22
29.7%
) 22
29.7%
, 10
13.5%
/ 9
12.2%
9
12.2%
3 1
 
1.4%
# 1
 
1.4%
Hangul
ValueCountFrequency (%)
14
 
7.4%
11
 
5.8%
11
 
5.8%
11
 
5.8%
9
 
4.7%
9
 
4.7%
8
 
4.2%
7
 
3.7%
7
 
3.7%
6
 
3.2%
Other values (37) 97
51.1%

repr_desc
Text

Distinct: 93
Distinct (%): 93.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 59
Median length: 36
Mean length: 18.19
Min length: 4

Characters and Unicode

Total characters: 1819
Distinct characters: 58
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 87
Unique (%): 87.0%

Sample

1st row: 펜더 좌현,상(50) ,하(110) 보수 (fender, port side, upper (50), lower (110) repair)
2nd row: 낙수방지 고무(좌)교체 (drip-prevention rubber (left) replacement)
3rd row: 펜더 우현,상(600) 보수 (fender, starboard side, upper (600) repair)
4th row: 펜더 좌현,하(1190) 보수 (fender, port side, lower (1190) repair)
5th row: 펜더 우현,하 보수(320) (fender, starboard side, lower repair (320))

Words

Value                      Count  Frequency (%)
펜더                        64     20.7%
보수                        64     20.7%
선저                        17     5.5%
(token missing in source)   10     3.2%
데크                        8      2.6%
판재보수                     7      2.3%
좌현,하(60                  3      1.0%
우현,하(100                 3      1.0%
좌현,상                     3      1.0%
좌현,하                     3      1.0%
Other values (113)          127    41.1%

Most occurring characters

ValueCountFrequency (%)
213
 
11.7%
0 182
 
10.0%
) 117
 
6.4%
( 115
 
6.3%
104
 
5.7%
100
 
5.5%
, 100
 
5.5%
80
 
4.4%
70
 
3.8%
1 70
 
3.8%
Other values (48) 668
36.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 805
44.3%
Decimal Number 418
23.0%
Space Separator 213
 
11.7%
Other Punctuation 119
 
6.5%
Close Punctuation 117
 
6.4%
Open Punctuation 115
 
6.3%
Lowercase Letter 32
 
1.8%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
104
12.9%
100
12.4%
80
9.9%
70
8.7%
69
8.6%
56
 
7.0%
55
 
6.8%
51
 
6.3%
50
 
6.2%
25
 
3.1%
Other values (32) 145
18.0%
Decimal Number
ValueCountFrequency (%)
0 182
43.5%
1 70
 
16.7%
5 38
 
9.1%
3 25
 
6.0%
2 22
 
5.3%
6 20
 
4.8%
7 16
 
3.8%
8 15
 
3.6%
4 15
 
3.6%
9 15
 
3.6%
Other Punctuation
ValueCountFrequency (%)
, 100
84.0%
/ 19
 
16.0%
Space Separator
ValueCountFrequency (%)
213
100.0%
Close Punctuation
ValueCountFrequency (%)
) 117
100.0%
Open Punctuation
ValueCountFrequency (%)
( 115
100.0%
Lowercase Letter
ValueCountFrequency (%)
x 32
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 982
54.0%
Hangul 805
44.3%
Latin 32
 
1.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
104
12.9%
100
12.4%
80
9.9%
70
8.7%
69
8.6%
56
 
7.0%
55
 
6.8%
51
 
6.3%
50
 
6.2%
25
 
3.1%
Other values (32) 145
18.0%
Common
ValueCountFrequency (%)
213
21.7%
0 182
18.5%
) 117
11.9%
( 115
11.7%
, 100
10.2%
1 70
 
7.1%
5 38
 
3.9%
3 25
 
2.5%
2 22
 
2.2%
6 20
 
2.0%
Other values (5) 80
 
8.1%
Latin
ValueCountFrequency (%)
x 32
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1014
55.7%
Hangul 805
44.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
213
21.0%
0 182
17.9%
) 117
11.5%
( 115
11.3%
, 100
9.9%
1 70
 
6.9%
5 38
 
3.7%
x 32
 
3.2%
3 25
 
2.5%
2 22
 
2.2%
Other values (6) 100
9.9%
Hangul
ValueCountFrequency (%)
104
12.9%
100
12.4%
80
9.9%
70
8.7%
69
8.6%
56
 
7.0%
55
 
6.8%
51
 
6.3%
50
 
6.2%
25
 
3.1%
Other values (32) 145
18.0%

equip_no
Text

Distinct: 62
Distinct (%): 62.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 800
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 34
Unique (%): 34.0%

Sample

1st row: B2003065
2nd row: B2020033
3rd row: B2003037
4th row: B2003023
5th row: B2003013

Words

Value              Count  Frequency (%)
b2003062           4      4.0%
b2003065           3      3.0%
b2003075           3      3.0%
b2003037           3      3.0%
b2003023           3      3.0%
b2003044           3      3.0%
b2003061           3      3.0%
b2003025           3      3.0%
b2003004           3      3.0%
b2003066           2      2.0%
Other values (52)  70     70.0%

Most occurring characters

ValueCountFrequency (%)
0 320
40.0%
2 127
 
15.9%
3 122
 
15.2%
B 99
 
12.4%
6 26
 
3.2%
5 25
 
3.1%
1 22
 
2.8%
7 21
 
2.6%
4 20
 
2.5%
8 11
 
1.4%
Other values (2) 7
 
0.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 700
87.5%
Uppercase Letter 100
 
12.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 320
45.7%
2 127
 
18.1%
3 122
 
17.4%
6 26
 
3.7%
5 25
 
3.6%
1 22
 
3.1%
7 21
 
3.0%
4 20
 
2.9%
8 11
 
1.6%
9 6
 
0.9%
Uppercase Letter
ValueCountFrequency (%)
B 99
99.0%
M 1
 
1.0%

Most occurring scripts

ValueCountFrequency (%)
Common 700
87.5%
Latin 100
 
12.5%

Most frequent character per script

Common
ValueCountFrequency (%)
0 320
45.7%
2 127
 
18.1%
3 122
 
17.4%
6 26
 
3.7%
5 25
 
3.6%
1 22
 
3.1%
7 21
 
3.0%
4 20
 
2.9%
8 11
 
1.6%
9 6
 
0.9%
Latin
ValueCountFrequency (%)
B 99
99.0%
M 1
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 800
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 320
40.0%
2 127
 
15.9%
3 122
 
15.2%
B 99
 
12.4%
6 26
 
3.2%
5 25
 
3.1%
1 22
 
2.8%
7 21
 
2.6%
4 20
 
2.5%
8 11
 
1.4%
Other values (2) 7
 
0.9%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

              stnd_year  repr_dt  equip_tpe  repr_tpe_nms  mjr_parts  repr_desc  equip_no
stnd_year     1.000      0.963    0.396      1.000         1.000      1.000      1.000
repr_dt       0.963      1.000    0.396      1.000         1.000      1.000      1.000
equip_tpe     0.396      0.396    1.000      1.000         1.000      1.000      1.000
repr_tpe_nms  1.000      1.000    1.000      1.000         1.000      1.000      1.000
mjr_parts     1.000      1.000    1.000      1.000         1.000      1.000      0.000
repr_desc     1.000      1.000    1.000      1.000         1.000      1.000      0.892
equip_no      1.000      1.000    1.000      1.000         0.000      0.892      1.000

[Figure: correlation heatmap]

              stnd_year  equip_tpe  repr_tpe_nms
stnd_year     1.000      0.259      0.995
equip_tpe     0.259      1.000      0.995
repr_tpe_nms  0.995      0.995      1.000

[Figure: correlation heatmap]

              repr_dt  stnd_year  equip_tpe  repr_tpe_nms
repr_dt       1.000    0.826      0.259      0.995
stnd_year     0.826    1.000      0.259      0.995
equip_tpe     0.259    0.259      1.000      0.995
repr_tpe_nms  0.995    0.995      0.995      1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
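
As a rough illustration of what the nullity-by-column view summarizes, the sketch below counts missing entries per column. It uses only the mjr_parts and equip_no values from the first ten sample rows of this report, so the resulting share for mjr_parts (5 of 10) differs from the full-dataset figure of 67 of 100. The function name and the use of `None` for `<NA>` are assumptions made for the example.

```python
def nullity_by_column(rows: list[dict]) -> dict[str, int]:
    """Count missing (None) entries for every column seen in the rows."""
    counts: dict[str, int] = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + (1 if value is None else 0)
    return counts


# mjr_parts and equip_no for sample rows 0-9; None stands in for <NA>.
mjr_parts = [None, "낙수방지 고무(좌)", None, None, "펜더(좌현,상)",
             "스로틀레버", None, "낙수방지 고무(좌)", "선저외판/펜더", None]
equip_no = ["B2003065", "B2020033", "B2003037", "B2003023", "B2003013",
            "B2003078", "B2003066", "B2020017", "B2003010", "B2003072"]
rows = [{"mjr_parts": p, "equip_no": e} for p, e in zip(mjr_parts, equip_no)]

missing = nullity_by_column(rows)
print(missing)
```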

Sample

First rows

row | stnd_year | repr_dt | equip_tpe | repr_tpe_nms | mjr_parts | repr_desc | equip_no
0 | 2003 | 20030723 | 보트 | 기타정비 | <NA> | 펜더 좌현,상(50) ,하(110) 보수 | B2003065
1 | 2021 | 20210721 | 보트 | 낙수방지 고무 | 낙수방지 고무(좌) | 낙수방지 고무(좌)교체 | B2020033
2 | 2003 | 20031210 | 보트 | 기타정비 | <NA> | 펜더 우현,상(600) 보수 | B2003037
3 | 2003 | 20031210 | 보트 | 기타정비 | <NA> | 펜더 좌현,하(1190) 보수 | B2003023
4 | 2003 | 20031210 | 보트 | 기타정비 | 펜더(좌현,상) | 펜더 우현,하 보수(320) | B2003013
5 | 2003 | 20031209 | 보트 | 기타정비 | 스로틀레버 | 펜더 좌현,상(850) 보수 | B2003078
6 | 2003 | 20031209 | 보트 | 기타정비 | <NA> | 펜더 우현,상(1500) , 하(700) 보수 | B2003066
7 | 2021 | 20210722 | 보트 | 낙수방지 고무 | 낙수방지 고무(좌) | 낙수방지 고무(좌)교체 | B2020017
8 | 2003 | 20031209 | 보트 | 기타정비 | 선저외판/펜더 | 우현선저선수부(1310x315)수리/우현펜더(하)교체 | B2003010
9 | 2003 | 20031206 | 보트 | 기타정비 | <NA> | 펜더 우현,하(1200) 보수 | B2003072

Last rows

row | stnd_year | repr_dt | equip_tpe | repr_tpe_nms | mjr_parts | repr_desc | equip_no
90 | 2003 | 20030923 | 보트 | 기타정비 | <NA> | 펜더 좌현,하 보수(100) | B2003013
91 | 2003 | 20030922 | 보트 | 기타정비 | <NA> | 펜더 우현,하(160) 보수 | B2003077
92 | 2003 | 20030922 | 보트 | 기타정비 | 깃봉베이스 | 선저 우현,상(545x300)수리 | B2003062
93 | 2003 | 20030920 | 보트 | 기타정비 | <NA> | 펜더 좌현펜,하(90) 보수 | B2003076
94 | 2003 | 20030920 | 보트 | 기타정비 | <NA> | 펜더 좌현,상 보수(120) | B2003011
95 | 2003 | 20030919 | 보트 | 기타정비 | 소프트바우 | 펜더 우현,하(130) 보수 | B2003060
96 | 2003 | 20030917 | 보트 | 기타정비 | 소프트바우 | 펜더 좌현,상(55) 보수 | B2003075
97 | 2003 | 20030917 | 보트 | 기타정비 | <NA> | 선저 좌현,상(75x50) 판재보수 | B2003030
98 | 2003 | 20030917 | 보트 | 기타정비 | 폐들고정고무 | 선저 좌현,중(150x150)(170x150)수리 | B2003025
99 | 2003 | 20030917 | 보트 | 기타정비 | <NA> | 펜더 좌현,상(100) 보수 | B2003061