Overview

Dataset statistics

Number of variables: 1
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 932.0 B
Average record size in memory: 9.3 B

Variable types

Text: 1

Reproduction

Analysis started: 2023-12-10 09:51:19.698968
Analysis finished: 2023-12-10 09:51:19.991737
Duration: 0.29 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 68
Median length: 65
Mean length: 62.4
Min length: 56
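The length figures above can be reproduced for any list of strings. The following is a minimal sketch, assuming that "length" means the number of Unicode code points (which is what Python's `len` reports); the sample values are hypothetical and are not taken from the profiled dataset.

```python
from statistics import mean, median


def length_summary(values):
    """Return the max, median, mean and min string length of `values`."""
    lengths = [len(v) for v in values]  # len() counts Unicode code points
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }


# Hypothetical values, used only to illustrate the calculation.
sample = ["abc", "abcde", "abcdefg"]
summary = length_summary(sample)
print(summary)
```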

Characters and Unicode

Total characters: 6240
Distinct characters: 49
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
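As an illustration of these character properties, the sketch below counts characters by Unicode general category using Python's standard `unicodedata` module. The category names used in this report correspond to the two-letter codes `Nd` (Decimal Number), `Ll` (Lowercase Letter), `Lu` (Uppercase Letter), `Lo` (Other Letter), `Cc` (Control), `Pd` (Dash Punctuation), `Sm` (Math Symbol), `Po` (Other Punctuation) and `Zs` (Space Separator). The record string is hypothetical, and `"\x01"` is only a stand-in for the control character, which the report does not identify.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# Hypothetical record; "\x01" is an assumed placeholder for the
# unidentified control character counted in the report.
record = "0\x01직장인\x0125-30세\x01N\x01Y"
counts = category_counts(record)
print(counts)
```

Script and block properties are not exposed by `unicodedata`, so this sketch covers categories only.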

Unique

Unique: 100
Unique (%): 100.0%
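"Distinct" and "Unique" coincide here because every value occurs exactly once, but the two counts can differ. A minimal sketch of the difference, assuming that distinct means the number of different values and unique means the number of values that occur exactly once; the input list is hypothetical.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    n_distinct = len(counts)
    n_unique = sum(1 for c in counts.values() if c == 1)
    return n_distinct, n_unique


# Hypothetical values: "b" repeats, so it is distinct but not unique.
result = distinct_and_unique(["a", "b", "b", "c"])
print(result)
```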

Sample

1st row: 86d64b59-544d-4a5d-af16-1389ddbccde30직장인225-30세NNNNYY
2nd row: d658b72e-4b61-4b62-b5c3-4b3e451718370직장인225-30세NNNNNN
3rd row: c9fdd6fc-0ad8-4c69-aa4e-6048b323007f0직장인436-40세NNNNNN
4th row: df2457b0-bc7e-4d53-839c-e2305f5fe2e899225-30세NNNNNN
5th row: f3d1d1e5-48f8-48fb-b27c-86522337c14c9재택인사331-35세NNNNNN
Value | Count | Frequency (%)
노동직"225-30세nnnnyn | 3 | 2.8%
86d64b59-544d-4a5d-af16-1389ddbccde30직장인225-30세nnnnyy | 1 | 0.9%
d658b72e-4b61-4b62-b5c3-4b3e451718370직장인225-30세nnnnnn | 1 | 0.9%
6aa0d06e-83b4-4571-85ab-da9a7135715d1개인사업자119~24세nnnyyy | 1 | 0.9%
3fbe83c7-9145-4db3-9e87-32eb48f766fb0직장인225-30세nnnnny | 1 | 0.9%
246cade9-82e9-479b-9741-c11b4ad3830b99331-35세nnnnnn | 1 | 0.9%
cfc8c683-122b-49c4-8cb1-83bd6842d95d99331-35세nnnnny | 1 | 0.9%
d1b6eecd-5ef0-4726-a553-0c8b83e23e532학생119~24세nnnyyy | 1 | 0.9%
5a029c75-ade1-4a0f-b0ac-a1ee3172d20f9재택인사119~24세nnnnnn | 1 | 0.9%
8874efb2-c141-4b90-b3f2-c489c140fc802학생225-30세nynnyn | 1 | 0.9%
Other values (95) | 95 | 88.8%

Most occurring characters

Value | Count | Frequency (%)
(control character) | 1000 | 16.0%
N | 493 | 7.9%
- | 477 | 7.6%
4 | 330 | 5.3%
9 | 316 | 5.1%
3 | 302 | 4.8%
2 | 292 | 4.7%
0 | 288 | 4.6%
5 | 276 | 4.4%
1 | 242 | 3.9%
Other values (39) | 2224 | 35.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 2674 | 42.9%
Lowercase Letter | 1150 | 18.4%
Control | 1000 | 16.0%
Uppercase Letter | 600 | 9.6%
Dash Punctuation | 477 | 7.6%
Other Letter | 303 | 4.9%
Math Symbol | 15 | 0.2%
Other Punctuation | 14 | 0.2%
Space Separator | 7 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 95 | 31.4%
 | 47 | 15.5%
 | 37 | 12.2%
 | 32 | 10.6%
 | 12 | 4.0%
 | 10 | 3.3%
 | 10 | 3.3%
 | 7 | 2.3%
 | 6 | 2.0%
 | 5 | 1.7%
Other values (16) | 42 | 13.9%

Decimal Number
Value | Count | Frequency (%)
4 | 330 | 12.3%
9 | 316 | 11.8%
3 | 302 | 11.3%
2 | 292 | 10.9%
0 | 288 | 10.8%
5 | 276 | 10.3%
1 | 242 | 9.1%
8 | 240 | 9.0%
7 | 195 | 7.3%
6 | 193 | 7.2%

Lowercase Letter
Value | Count | Frequency (%)
b | 209 | 18.2%
e | 199 | 17.3%
c | 191 | 16.6%
d | 191 | 16.6%
f | 185 | 16.1%
a | 175 | 15.2%

Uppercase Letter
Value | Count | Frequency (%)
N | 493 | 82.2%
Y | 107 | 17.8%

Control
Value | Count | Frequency (%)
(control character) | 1000 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 477 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 15 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
" | 14 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 7 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 4187 | 67.1%
Latin | 1750 | 28.0%
Hangul | 303 | 4.9%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 95 | 31.4%
 | 47 | 15.5%
 | 37 | 12.2%
 | 32 | 10.6%
 | 12 | 4.0%
 | 10 | 3.3%
 | 10 | 3.3%
 | 7 | 2.3%
 | 6 | 2.0%
 | 5 | 1.7%
Other values (16) | 42 | 13.9%

Common
Value | Count | Frequency (%)
(control character) | 1000 | 23.9%
- | 477 | 11.4%
4 | 330 | 7.9%
9 | 316 | 7.5%
3 | 302 | 7.2%
2 | 292 | 7.0%
0 | 288 | 6.9%
5 | 276 | 6.6%
1 | 242 | 5.8%
8 | 240 | 5.7%
Other values (5) | 424 | 10.1%

Latin
Value | Count | Frequency (%)
N | 493 | 28.2%
b | 209 | 11.9%
e | 199 | 11.4%
c | 191 | 10.9%
d | 191 | 10.9%
f | 185 | 10.6%
a | 175 | 10.0%
Y | 107 | 6.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 5937 | 95.1%
Hangul | 303 | 4.9%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(control character) | 1000 | 16.8%
N | 493 | 8.3%
- | 477 | 8.0%
4 | 330 | 5.6%
9 | 316 | 5.3%
3 | 302 | 5.1%
2 | 292 | 4.9%
0 | 288 | 4.9%
5 | 276 | 4.6%
1 | 242 | 4.1%
Other values (13) | 1921 | 32.4%

Hangul
Value | Count | Frequency (%)
 | 95 | 31.4%
 | 47 | 15.5%
 | 37 | 12.2%
 | 32 | 10.6%
 | 12 | 4.0%
 | 10 | 3.3%
 | 10 | 3.3%
 | 7 | 2.3%
 | 6 | 2.0%
 | 5 | 1.7%
Other values (16) | 42 | 13.9%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
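The counts behind a nullity chart are simple to compute. The following is a minimal sketch, assuming records are held as Python dicts and that a missing cell is represented by `None`; the records and column names are hypothetical and do not come from the profiled dataset.

```python
def nullity_by_column(rows):
    """Count missing (None) cells per column for a list of dict records."""
    columns = rows[0].keys() if rows else []
    return {col: sum(1 for row in rows if row.get(col) is None) for col in columns}


# Hypothetical records, used only to illustrate the calculation.
rows = [
    {"id": "a", "age": "25-30"},
    {"id": "b", "age": None},
    {"id": "c", "age": None},
]
missing = nullity_by_column(rows)
total_cells = len(rows) * len(missing)
missing_pct = 100 * sum(missing.values()) / total_cells
print(missing, round(missing_pct, 1))
```

For the dataset in this report, the same calculation would give zero missing cells, matching the "Missing cells" figure in the overview.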

Sample

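 | USID | OCCP_CL | OCCP_NM | YEAR_SE | YEAR_NM | TOUR_INTRST_AT | DTFD_INTRST_AT | STAR_INTRST_AT | PHOTO_INTRST_AT | SOCICNTC_INTRST_AT | FASHN_INTRST_AT
0 | 86d64b59-544d-4a5d-af16-1389ddbccde3 | 0 | 직장인 | 2 | 25-30세 | N | N | N | N | Y | Y
1 | d658b72e-4b61-4b62-b5c3-4b3e45171837 | 0 | 직장인 | 2 | 25-30세 | N | N | N | N | N | N
2 | c9fdd6fc-0ad8-4c69-aa4e-6048b323007f | 0 | 직장인 | 4 | 36-40세 | N | N | N | N | N | N
3 | df2457b0-bc7e-4d53-839c-e2305f5fe2e8 | 99 |  | 2 | 25-30세 | N | N | N | N | N | N
4 | f3d1d1e5-48f8-48fb-b27c-86522337c14c | 9 | 재택인사 | 3 | 31-35세 | N | N | N | N | N | N
5 | c52faf49-d9b1-422b-9c59-76cfc1285279 | 99 |  | 2 | 25-30세 | Y | N | N | N | Y | N
6 | d180de66-0623-4a79-89b7-980a1d5c9178 | 99 |  | 3 | 31-35세 | Y | N | N | Y | Y | Y
7 | 9fe45335-d54c-419b-b9fc-14869c49b935 | 5 | 전통노동직 | 6 | 46세이상 | N | N | N | N | Y | Y
8 | 85b5e3b3-3ab9-430c-a4b9-a3283f1cd989 | 0 | 직장인 | 2 | 25-30세 | N | N | N | N | Y | N
9 | 72060b7d-850c-4250-9cd6-d9c27c2ed52b | 0 | 직장인 | 2 | 25-30세 | N | N | N | N | Y | N

 | USID | OCCP_CL | OCCP_NM | YEAR_SE | YEAR_NM | TOUR_INTRST_AT | DTFD_INTRST_AT | STAR_INTRST_AT | PHOTO_INTRST_AT | SOCICNTC_INTRST_AT | FASHN_INTRST_AT
90 | 26aaf034-a8ba-414a-bdf1-d483423a6143 | 99 |  | 2 | 25-30세 | Y | Y | N | Y | Y | Y
91 | add18819-2de2-4748-8042-5a585c1c40c2 | 99 |  | 2 | 25-30세 | N | N | N | N | N | N
92 | bd9b6b26-59de-4e7d-b568-5aec4737ef50 | 99 |  | 1 | 19~24세 | N | N | N | N | Y | N
93 | 05908ffd-3e74-4fa2-8cf9-4e884b2bfb92 | 99 |  | 3 | 31-35세 | N | N | N | Y | Y | N
94 | be8f81c2-bb17-448b-84e9-cd2c40fdcae1 | 99 |  | 6 | 46세이상 | N | N | N | N | Y | N
95 | ee84db85-559c-4de4-bc2f-7b7ed7c159b0 | 0 | 직장인 | 2 | 25-30세 | N | N | N | N | Y | N
96 | 52f1f710-7faf-4572-8afe-8711894c93f0 | 99 |  | 2 | 25-30세 | N | N | N | N | N | N
97 | 3714d9bf-7d6a-4e92-83ce-20fea637cd19 | 10 | "새 노동직" | 3 | 31-35세 | N | N | N | N | Y | N
98 | a2d794b2-1d7d-4a9f-b3b6-ba8b3828c696 | 99 |  | 2 | 25-30세 | Y | N | N | N | Y | N
99 | c46835cd-e918-4655-9b75-813e2c2d500f | 0 | 직장인 | 2 | 25-30세 | N | N | N | N | N | N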
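
The sample header appears to be eleven field names run together, and the character counts report 1000 control characters across 100 rows, which is ten per row. One reading, not confirmed by the report itself, is that each record holds eleven fields joined by a single control character and was profiled as one text variable. The sketch below splits such a record under that assumption. The delimiter `"\x01"` is a hypothetical stand-in, since the report does not say which control character is used, and the field values are those read from the first sample row.

```python
# Field names as they appear, run together, in the sample header.
FIELDS = [
    "USID", "OCCP_CL", "OCCP_NM", "YEAR_SE", "YEAR_NM",
    "TOUR_INTRST_AT", "DTFD_INTRST_AT", "STAR_INTRST_AT",
    "PHOTO_INTRST_AT", "SOCICNTC_INTRST_AT", "FASHN_INTRST_AT",
]

# Assumption: the actual control character is not identified in the report.
DELIMITER = "\x01"


def split_record(line, delimiter=DELIMITER):
    """Split one fused record into a dict keyed by field name."""
    values = line.split(delimiter)
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


# First sample row, rebuilt with the assumed delimiter.
line = DELIMITER.join([
    "86d64b59-544d-4a5d-af16-1389ddbccde3", "0", "직장인", "2", "25-30세",
    "N", "N", "N", "N", "Y", "Y",
])
record = split_record(line)
print(record["USID"], record["OCCP_NM"], record["YEAR_NM"])
```

If this reading is right, re-reading the source file with the correct delimiter would yield eleven variables rather than one, and the profile statistics would then describe each field separately.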