Overview

Dataset statistics

Number of variables: 5
Number of observations: 100
Missing cells: 2
Missing cells (%): 0.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.0 KiB
Average record size in memory: 41.3 B
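
The percentages above follow from simple ratios over the table's cell count. The sketch below recomputes them from the figures in this block; the variable names are illustrative, and the byte total is an assumption based on the rounded "4.0 KiB" display value, so the recomputed record size only approximates the reported 41.3 B.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 5
n_observations = 100
missing_cells = 2

total_cells = n_variables * n_observations        # 500 cells in total
missing_pct = 100 * missing_cells / total_cells   # share of empty cells

print(f"Missing cells (%): {missing_pct:.1f}%")

# Assumption: the total size is taken from the rounded display value (4.0 KiB),
# so this average differs slightly from the 41.3 B shown in the report.
approx_total_bytes = 4.0 * 1024
approx_record_bytes = approx_total_bytes / n_observations
print(f"Approximate record size: {approx_record_bytes:.1f} B")
```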

Variable types

Text: 4
Categorical: 1

Alerts

parts_tpe has constant value "M" (Constant)
spec has 2 (2.0%) missing values (Missing)
parts_cd has unique values (Unique)

Reproduction

Analysis started: 2023-12-10 09:49:00.169611
Analysis finished: 2023-12-10 09:49:01.284102
Duration: 1.11 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
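
The reported duration can be checked against the two timestamps. This is a small sketch using only the standard library; the format string is an assumption about how the timestamps are laid out, inferred from the values shown above.

```python
from datetime import datetime

# Assumed timestamp layout, inferred from the values printed in the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-10 09:49:00.169611", FMT)
finished = datetime.strptime("2023-12-10 09:49:01.284102", FMT)

# Elapsed wall-clock time between the two timestamps, in seconds.
duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```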

Variables

parts_cd
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 13
Median length: 10
Mean length: 7.69
Min length: 7
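
Length statistics of this kind are computed over the character count of each value. The sketch below applies the same four measures to the five parts_cd sample rows shown in this report only, so its results describe that small sample and not the full 100-row column.

```python
from statistics import mean, median

# Hypothetical input: only the five sample rows of parts_cd from this report.
sample_values = ["A010100", "Z1601-0203", "A010200", "A010200P", "A010300"]

# Character count of each value.
lengths = [len(value) for value in sample_values]

length_stats = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": mean(lengths),
    "min": min(lengths),
}
print(length_stats)
```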

Characters and Unicode

Total characters: 769
Distinct characters: 18
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
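
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point (for example `Lu` for Uppercase Letter, `Nd` for Decimal Number, `Pd` for Dash Punctuation, `Lo` for Other Letter). The sketch below counts categories over the five parts_cd sample rows only; the standard library does not expose script or block properties, so those are not covered here.

```python
import unicodedata
from collections import Counter

# Hypothetical input: the five parts_cd sample rows shown in this report.
sample_values = ["A010100", "Z1601-0203", "A010200", "A010200P", "A010300"]

# Count the Unicode general category of every character across all values.
category_counts = Counter(
    unicodedata.category(ch) for value in sample_values for ch in value
)
print(category_counts)

# Hangul syllables fall under "Lo" (Other Letter).
hangul_category = unicodedata.category("년")
print(hangul_category)
```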

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: A010100
2nd row: Z1601-0203
3rd row: A010200
4th row: A010200P
5th row: A010300

Value | Count | Frequency (%)
a010100 | 1 | 1.0%
a012103p | 1 | 1.0%
a012600 | 1 | 1.0%
a012400p | 1 | 1.0%
a012400 | 1 | 1.0%
a012314 | 1 | 1.0%
a012302p | 1 | 1.0%
a012302 | 1 | 1.0%
a012300p | 1 | 1.0%
a012300(11년형 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Most occurring characters

Value | Count | Frequency (%)
0 | 275 | 35.8%
1 | 176 | 22.9%
A | 97 | 12.6%
2 | 64 | 8.3%
P | 48 | 6.2%
8 | 21 | 2.7%
3 | 20 | 2.6%
6 | 17 | 2.2%
4 | 11 | 1.4%
9 | 10 | 1.3%
Other values (8) | 30 | 3.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 610 | 79.3%
Uppercase Letter | 148 | 19.2%
Other Letter | 4 | 0.5%
Dash Punctuation | 3 | 0.4%
Open Punctuation | 2 | 0.3%
Close Punctuation | 2 | 0.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 275 | 45.1%
1 | 176 | 28.9%
2 | 64 | 10.5%
8 | 21 | 3.4%
3 | 20 | 3.3%
6 | 17 | 2.8%
4 | 11 | 1.8%
9 | 10 | 1.6%
5 | 8 | 1.3%
7 | 8 | 1.3%

Uppercase Letter
Value | Count | Frequency (%)
A | 97 | 65.5%
P | 48 | 32.4%
Z | 3 | 2.0%

Other Letter
Value | Count | Frequency (%)
[?] | 2 | 50.0%
[?] | 2 | 50.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 3 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 2 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 617 | 80.2%
Latin | 148 | 19.2%
Hangul | 4 | 0.5%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 275 | 44.6%
1 | 176 | 28.5%
2 | 64 | 10.4%
8 | 21 | 3.4%
3 | 20 | 3.2%
6 | 17 | 2.8%
4 | 11 | 1.8%
9 | 10 | 1.6%
5 | 8 | 1.3%
7 | 8 | 1.3%
Other values (3) | 7 | 1.1%

Latin
Value | Count | Frequency (%)
A | 97 | 65.5%
P | 48 | 32.4%
Z | 3 | 2.0%

Hangul
Value | Count | Frequency (%)
[?] | 2 | 50.0%
[?] | 2 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 765 | 99.5%
Hangul | 4 | 0.5%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 275 | 35.9%
1 | 176 | 23.0%
A | 97 | 12.7%
2 | 64 | 8.4%
P | 48 | 6.3%
8 | 21 | 2.7%
3 | 20 | 2.6%
6 | 17 | 2.2%
4 | 11 | 1.4%
9 | 10 | 1.3%
Other values (6) | 26 | 3.4%

Hangul
Value | Count | Frequency (%)
[?] | 2 | 50.0%
[?] | 2 | 50.0%

parts_tpe
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
M: 100

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: M
2nd row: M
3rd row: M
4th row: M
5th row: M

Common Values

Value | Count | Frequency (%)
M | 100 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
m | 100 | 100.0%

module_code
Text

Distinct: 55
Distinct (%): 55.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 700
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 12
Unique (%): 12.0%
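
The gap between Distinct (55) and Unique (12) suggests the two measures differ: Distinct appears to count different values, while Unique appears to count values that occur exactly once (parts_tpe, with one value repeated 100 times, reports Distinct 1 and Unique 0). The sketch below illustrates that reading on the five module_code sample rows only; it is an interpretation of the report's figures, not the profiler's own code.

```python
from collections import Counter

# Hypothetical input: the five module_code sample rows shown in this report.
sample_values = ["A010100", "M202100", "A010200", "A010200", "A010300"]

counts = Counter(sample_values)

# Distinct: how many different values appear at all.
n_distinct = len(counts)
# Unique: how many values appear exactly once.
n_unique = sum(1 for c in counts.values() if c == 1)

n_rows = len(sample_values)
print(f"Distinct: {n_distinct} ({100 * n_distinct / n_rows:.1f}%)")
print(f"Unique: {n_unique} ({100 * n_unique / n_rows:.1f}%)")
```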

Sample

1st row: A010100
2nd row: M202100
3rd row: A010200
4th row: A010200
5th row: A010300

Value | Count | Frequency (%)
a012100 | 3 | 3.0%
a012300 | 3 | 3.0%
a012901 | 2 | 2.0%
a011900 | 2 | 2.0%
a012000 | 2 | 2.0%
a011113 | 2 | 2.0%
a012802 | 2 | 2.0%
a012801 | 2 | 2.0%
a011500 | 2 | 2.0%
a011600 | 2 | 2.0%
Other values (45) | 78 | 78.0%

Most occurring characters

Value | Count | Frequency (%)
0 | 274 | 39.1%
1 | 168 | 24.0%
A | 97 | 13.9%
2 | 67 | 9.6%
8 | 21 | 3.0%
3 | 18 | 2.6%
6 | 15 | 2.1%
4 | 11 | 1.6%
9 | 10 | 1.4%
7 | 8 | 1.1%
Other values (2) | 11 | 1.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 600 | 85.7%
Uppercase Letter | 100 | 14.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 274 | 45.7%
1 | 168 | 28.0%
2 | 67 | 11.2%
8 | 21 | 3.5%
3 | 18 | 3.0%
6 | 15 | 2.5%
4 | 11 | 1.8%
9 | 10 | 1.7%
7 | 8 | 1.3%
5 | 8 | 1.3%

Uppercase Letter
Value | Count | Frequency (%)
A | 97 | 97.0%
M | 3 | 3.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 600 | 85.7%
Latin | 100 | 14.3%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 274 | 45.7%
1 | 168 | 28.0%
2 | 67 | 11.2%
8 | 21 | 3.5%
3 | 18 | 3.0%
6 | 15 | 2.5%
4 | 11 | 1.8%
9 | 10 | 1.7%
7 | 8 | 1.3%
5 | 8 | 1.3%

Latin
Value | Count | Frequency (%)
A | 97 | 97.0%
M | 3 | 3.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 700 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 274 | 39.1%
1 | 168 | 24.0%
A | 97 | 13.9%
2 | 67 | 9.6%
8 | 21 | 3.0%
3 | 18 | 2.6%
6 | 15 | 2.1%
4 | 11 | 1.6%
9 | 10 | 1.4%
7 | 8 | 1.1%
Other values (2) | 11 | 1.6%

parts_item_cd_nm
Text

Distinct: 68
Distinct (%): 68.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 14
Median length: 11
Mean length: 7.79
Min length: 2

Characters and Unicode

Total characters: 779
Distinct characters: 91
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 36
Unique (%): 36.0%

Sample

1st row: 크랭크샤프트조합
2nd row: 스페이스링
3rd row: 니들베어링(센터)
4th row: 니들베어링(센터)
5th row: 센터베어링레이스

Value | Count | Frequency (%)
조합 | 11 | 9.2%
피스톤링 | 4 | 3.4%
[?] | 3 | 2.5%
드로틀바 | 2 | 1.7%
워터호스 | 2 | 1.7%
오링 | 2 | 1.7%
흡기매니폴드가스켓 | 2 | 1.7%
실린더블럭가스켓 | 2 | 1.7%
실린더헤드가스켓 | 2 | 1.7%
배기커버가스켓 | 2 | 1.7%
Other values (61) | 87 | 73.1%

Most occurring characters

Value | Count | Frequency (%)
[?] | 71 | 9.1%
[?] | 40 | 5.1%
[?] | 30 | 3.9%
) | 28 | 3.6%
( | 28 | 3.6%
[?] | 23 | 3.0%
[?] | 23 | 3.0%
[?] | 22 | 2.8%
[?] | 22 | 2.8%
(space) | 19 | 2.4%
Other values (81) | 473 | 60.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 687 | 88.2%
Close Punctuation | 28 | 3.6%
Open Punctuation | 28 | 3.6%
Space Separator | 19 | 2.4%
Lowercase Letter | 8 | 1.0%
Decimal Number | 6 | 0.8%
Dash Punctuation | 3 | 0.4%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[?] | 71 | 10.3%
[?] | 40 | 5.8%
[?] | 30 | 4.4%
[?] | 23 | 3.3%
[?] | 23 | 3.3%
[?] | 22 | 3.2%
[?] | 22 | 3.2%
[?] | 18 | 2.6%
[?] | 18 | 2.6%
[?] | 18 | 2.6%
Other values (71) | 402 | 58.5%

Lowercase Letter
Value | Count | Frequency (%)
d | 2 | 25.0%
s | 2 | 25.0%
t | 2 | 25.0%
n | 2 | 25.0%

Decimal Number
Value | Count | Frequency (%)
1 | 4 | 66.7%
2 | 2 | 33.3%

Close Punctuation
Value | Count | Frequency (%)
) | 28 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 28 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 19 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 687 | 88.2%
Common | 84 | 10.8%
Latin | 8 | 1.0%

Most frequent character per script

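Hangul
Value | Count | Frequency (%)
[?] | 71 | 10.3%
[?] | 40 | 5.8%
[?] | 30 | 4.4%
[?] | 23 | 3.3%
[?] | 23 | 3.3%
[?] | 22 | 3.2%
[?] | 22 | 3.2%
[?] | 18 | 2.6%
[?] | 18 | 2.6%
[?] | 18 | 2.6%
Other values (71) | 402 | 58.5%

Common
Value | Count | Frequency (%)
) | 28 | 33.3%
( | 28 | 33.3%
(space) | 19 | 22.6%
1 | 4 | 4.8%
- | 3 | 3.6%
2 | 2 | 2.4%

Latin
Value | Count | Frequency (%)
d | 2 | 25.0%
s | 2 | 25.0%
t | 2 | 25.0%
n | 2 | 25.0%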

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 687 | 88.2%
ASCII | 92 | 11.8%

Most frequent character per block

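Hangul
Value | Count | Frequency (%)
[?] | 71 | 10.3%
[?] | 40 | 5.8%
[?] | 30 | 4.4%
[?] | 23 | 3.3%
[?] | 23 | 3.3%
[?] | 22 | 3.2%
[?] | 22 | 3.2%
[?] | 18 | 2.6%
[?] | 18 | 2.6%
[?] | 18 | 2.6%
Other values (71) | 402 | 58.5%

ASCII
Value | Count | Frequency (%)
) | 28 | 30.4%
( | 28 | 30.4%
(space) | 19 | 20.7%
1 | 4 | 4.3%
- | 3 | 3.3%
d | 2 | 2.2%
s | 2 | 2.2%
t | 2 | 2.2%
n | 2 | 2.2%
2 | 2 | 2.2%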

spec
Text

MISSING 

Distinct: 50
Distinct (%): 51.0%
Missing: 2
Missing (%): 2.0%
Memory size: 932.0 B

Length

Max length: 20
Median length: 16
Mean length: 9.9285714
Min length: 2

Characters and Unicode

Total characters: 973
Distinct characters: 51
Distinct categories: 8
Distinct scripts: 4
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 23
Unique (%): 23.5%

Sample

1st row: SUB-ASS'Y
2nd row: 02년도계약분
3rd row: KTEG 354525
4th row: KTEG 354525
5th row: CBR456133

Value | Count | Frequency (%)
sub-ass'y | 15 | 10.4%
sm45c | 14 | 9.7%
sts304 | 10 | 6.9%
1t | 8 | 5.6%
viton | 6 | 4.2%
ac4c-t6 | 5 | 3.5%
φ15 | 4 | 2.8%
a0 | 4 | 2.8%
0.3t | 4 | 2.8%
fcd | 4 | 2.8%
Other values (43) | 70 | 48.6%

Most occurring characters

Value | Count | Frequency (%)
S | 89 | 9.1%
4 | 63 | 6.5%
5 | 61 | 6.3%
, | 53 | 5.4%
(space) | 46 | 4.7%
0 | 46 | 4.7%
C | 44 | 4.5%
2 | 37 | 3.8%
1 | 37 | 3.8%
3 | 36 | 3.7%
Other values (41) | 461 | 47.4%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 385 | 39.6%
Decimal Number | 329 | 33.8%
Other Punctuation | 100 | 10.3%
Space Separator | 46 | 4.7%
Math Symbol | 35 | 3.6%
Lowercase Letter | 31 | 3.2%
Dash Punctuation | 24 | 2.5%
Other Letter | 23 | 2.4%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
S | 89 | 23.1%
C | 44 | 11.4%
A | 33 | 8.6%
M | 32 | 8.3%
T | 29 | 7.5%
Φ | 21 | 5.5%
B | 19 | 4.9%
U | 17 | 4.4%
Y | 15 | 3.9%
L | 14 | 3.6%
Other values (11) | 72 | 18.7%

Decimal Number
Value | Count | Frequency (%)
4 | 63 | 19.1%
5 | 61 | 18.5%
0 | 46 | 14.0%
2 | 37 | 11.2%
1 | 37 | 11.2%
3 | 36 | 10.9%
6 | 22 | 6.7%
8 | 18 | 5.5%
7 | 6 | 1.8%
9 | 3 | 0.9%

Other Letter
Value | Count | Frequency (%)
[?] | 3 | 13.0%
[?] | 3 | 13.0%
[?] | 3 | 13.0%
[?] | 3 | 13.0%
[?] | 3 | 13.0%
[?] | 2 | 8.7%
[?] | 2 | 8.7%
[?] | 2 | 8.7%
[?] | 2 | 8.7%

Other Punctuation
Value | Count | Frequency (%)
, | 53 | 53.0%
. | 31 | 31.0%
' | 15 | 15.0%
/ | 1 | 1.0%

Lowercase Letter
Value | Count | Frequency (%)
t | 25 | 80.6%
d | 2 | 6.5%
n | 2 | 6.5%
s | 2 | 6.5%

Space Separator
Value | Count | Frequency (%)
(space) | 46 | 100.0%

Math Symbol
Value | Count | Frequency (%)
× | 35 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 24 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 534 | 54.9%
Latin | 395 | 40.6%
Hangul | 23 | 2.4%
Greek | 21 | 2.2%

Most frequent character per script

Latin
Value | Count | Frequency (%)
S | 89 | 22.5%
C | 44 | 11.1%
A | 33 | 8.4%
M | 32 | 8.1%
T | 29 | 7.3%
t | 25 | 6.3%
B | 19 | 4.8%
U | 17 | 4.3%
Y | 15 | 3.8%
L | 14 | 3.5%
Other values (14) | 78 | 19.7%

Common
Value | Count | Frequency (%)
4 | 63 | 11.8%
5 | 61 | 11.4%
, | 53 | 9.9%
(space) | 46 | 8.6%
0 | 46 | 8.6%
2 | 37 | 6.9%
1 | 37 | 6.9%
3 | 36 | 6.7%
× | 35 | 6.6%
. | 31 | 5.8%
Other values (7) | 89 | 16.7%

Hangul
Value | Count | Frequency (%)
[?] | 3 | 13.0%
[?] | 3 | 13.0%
[?] | 3 | 13.0%
[?] | 3 | 13.0%
[?] | 3 | 13.0%
[?] | 2 | 8.7%
[?] | 2 | 8.7%
[?] | 2 | 8.7%
[?] | 2 | 8.7%

Greek
Value | Count | Frequency (%)
Φ | 21 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 894 | 91.9%
None | 56 | 5.8%
Hangul | 23 | 2.4%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
S | 89 | 10.0%
4 | 63 | 7.0%
5 | 61 | 6.8%
, | 53 | 5.9%
(space) | 46 | 5.1%
0 | 46 | 5.1%
C | 44 | 4.9%
2 | 37 | 4.1%
1 | 37 | 4.1%
3 | 36 | 4.0%
Other values (30) | 382 | 42.7%

None
Value | Count | Frequency (%)
× | 35 | 62.5%
Φ | 21 | 37.5%

Hangul
Value | Count | Frequency (%)
[?] | 3 | 13.0%
[?] | 3 | 13.0%
[?] | 3 | 13.0%
[?] | 3 | 13.0%
[?] | 3 | 13.0%
[?] | 2 | 8.7%
[?] | 2 | 8.7%
[?] | 2 | 8.7%
[?] | 2 | 8.7%

Correlations

variable | parts_cd | module_code | parts_item_cd_nm | spec
parts_cd | 1.000 | 1.000 | 1.000 | 1.000
module_code | 1.000 | 1.000 | 0.999 | 0.995
parts_item_cd_nm | 1.000 | 0.999 | 1.000 | 0.993
spec | 1.000 | 0.995 | 0.993 | 1.000
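
The report does not state which association measure fills this matrix. Assuming it is a Cramér's V style statistic for categorical pairs (an assumption, and the profiler may additionally apply a bias correction that this sketch omits), the following shows why a column with all-unique values, such as parts_cd, scores 1.000 against every other column. The function name `cramers_v` and the five-row inputs are illustrative.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed count per (x, y) pair
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared over every cell of the contingency table,
    # including cells whose observed count is zero.
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            chi2 += (joint[(x, y)] - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Hypothetical input: the first five sample rows of two columns in this report.
parts_cd = ["A010100", "Z1601-0203", "A010200", "A010200P", "A010300"]
module_code = ["A010100", "M202100", "A010200", "A010200", "A010300"]

v_sample = cramers_v(parts_cd, module_code)
print(f"{v_sample:.3f}")
```

Because every parts_cd value is different, each one determines the other column's value exactly, which drives the statistic to its maximum. With high-cardinality columns and only 100 rows, values near 1 are therefore expected and say little about a meaningful relationship.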

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
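
Both views reduce to a per-cell test for missing values. The sketch below builds the two summaries in plain Python; the three rows are hypothetical records shaped like this dataset (one spec value is deliberately blanked), not rows copied from it.

```python
# Hypothetical rows: shaped like this dataset, with one spec value blanked
# out to illustrate a missing cell.
rows = [
    {"parts_cd": "A010100", "parts_tpe": "M", "spec": "SUB-ASS'Y"},
    {"parts_cd": "A010200", "parts_tpe": "M", "spec": None},
    {"parts_cd": "A010300", "parts_tpe": "M", "spec": "CBR456133"},
]
columns = list(rows[0])

# Nullity by column: number of missing cells in each column.
missing_per_column = {c: sum(r[c] is None for r in rows) for c in columns}

# Nullity matrix: one row per record, True where a value is present.
nullity_matrix = [[r[c] is not None for c in columns] for r in rows]

print(missing_per_column)
for line in nullity_matrix:
    print("".join("#" if present else "." for present in line))
```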

Sample

(index) | parts_cd | parts_tpe | module_code | parts_item_cd_nm | spec
0 | A010100 | M | A010100 | 크랭크샤프트조합 | SUB-ASS'Y
1 | Z1601-0203 | M | M202100 | 스페이스링 | 02년도계약분
2 | A010200 | M | A010200 | 니들베어링(센터) | KTEG 354525
3 | A010200P | M | A010200 | 니들베어링(센터) | KTEG 354525
4 | A010300 | M | A010300 | 센터베어링레이스 | CBR456133
5 | A010300P | M | A010300 | 센터베어링레이스 | CBR456133
6 | A010400 | M | A010400 | 센터베어링레이스클립 | Φ62
7 | Z1601-0300 | M | M200600 | 스페이스링 | 02년도계약분
8 | A010500 | M | A010500 | 피스톤 | AC8A
9 | A010500P | M | A010500 | 피스톤 | AC8A-T6,

(index) | parts_cd | parts_tpe | module_code | parts_item_cd_nm | spec
90 | A012805P | M | A012805 | 오일실스페이스 | A5056, Φ36×Φ44×2t
91 | A012806 | M | A012806 | 오일씰 | VITON, TC25448
92 | A012806P | M | A012806 | 오일실-(베어링케이스 상) | VITON, TC254408
93 | A012807 | M | A012807 | 오일씰스토퍼 | STS304, 0.8t
94 | A012807P | M | A012807 | 오일실스토퍼 | STS304, 0.8t
95 | A012901 | M | A012901 | 스타트풀리 | AC4C-F
96 | A012901P | M | A012901 | 스타트풀리 | AC4C-F,
97 | A012902 | M | A012902 | 키(스타트풀리) | SM45C
98 | A012902P | M | A012902 | 키(스타트풀리) | SM45C,
99 | A012903 | M | A012903 | 특수평와셔(스타트풀리) | SM45C, Φ8.2×Φ25×5t