Overview

Dataset statistics

Number of variables: 1
Number of observations: 26
Missing cells: 10
Missing cells (%): 38.5%
Duplicate rows: 2
Duplicate rows (%): 7.7%
Total size in memory: 340.0 B
Average record size in memory: 13.1 B
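
The percentages and the average record size follow directly from the raw counts in this panel. The snippet below is a minimal sketch of that arithmetic, assuming the figures are copied as shown; the variable names are illustrative and are not part of ydata-profiling.

```python
# Figures copied from the "Dataset statistics" panel.
n_variables = 1
n_observations = 26
missing_cells = 10
duplicate_rows = 2
total_bytes = 340.0

# Cells = variables x observations; with one variable this equals the row count.
total_cells = n_variables * n_observations

missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_bytes / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Rounded to one decimal place, these reproduce the 38.5%, 7.7% and 13.1 B reported in the panel.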

Variable types

Text: 1

Alerts

Dataset has 2 (7.7%) duplicate rows (Duplicates)
;[1] has 10 (38.5%) missing values (Missing)

Reproduction

Analysis started: 2024-03-14 00:09:58.106383
Analysis finished: 2024-03-14 00:09:58.216612
Duration: 0.11 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

;[1]
Text

MISSING 

Distinct: 15
Distinct (%): 93.8%
Missing: 10
Missing (%): 38.5%
Memory size: 340.0 B

Length

Max length: 76
Median length: 3
Mean length: 7.3125
Min length: 1

Characters and Unicode

Total characters: 117
Distinct characters: 39
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
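
To make the idea concrete, the sketch below counts Unicode general categories for one value from the sample (the 3rd row) using Python's standard `unicodedata` module. It illustrates the concept only and is not necessarily how ydata-profiling computes its tables. The two-letter codes map to the category names used in this report: `Lo` is Other Letter, `Nd` is Decimal Number, `Zs` is Space Separator, `Ps` and `Pe` are Open and Close Punctuation, `Pi` is Initial Punctuation, and `Po` is Other Punctuation.

```python
import unicodedata
from collections import Counter

# One value from the report's sample (3rd row).
text = "(‘16년 6월말 현재) (개소, 두)"

# Map every character to its two-letter general category code and tally them.
categories = Counter(unicodedata.category(ch) for ch in text)

for code, count in categories.most_common():
    print(code, count)
```

Script and block names are not exposed by `unicodedata`, so this sketch covers categories only.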

Unique

Unique: 14
Unique (%): 87.5%
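
The difference between "Distinct" (15) and "Unique" (14) is easy to miss. A minimal sketch, assuming "distinct" means the number of different values and "unique" means the number of values that occur exactly once, both taken over the 16 non-missing rows. The column contents here are hypothetical stand-ins in which only ";" repeats, matching the duplicate shown later in the report.

```python
from collections import Counter

# Hypothetical stand-in for the 16 non-missing values: ";" twice, 14 others once.
values = [";", ";"] + [f"value_{i}" for i in range(14)]

counts = Counter(values)
n = len(values)

distinct = len(counts)                              # different values
unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

print(f"Distinct: {distinct} ({100 * distinct / n:.1f}%)")
print(f"Unique: {unique} ({100 * unique / n:.1f}%)")
```

One repeated value lowers the distinct count by one (16 to 15) and the unique count by two (16 to 14), since neither copy of the repeated value counts as unique.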

Sample

1st row: (blank)
2nd row: ;
3rd row: (‘16년 6월말 현재) (개소, 두) [English: (as of the end of June 2016) (sites, head)]
4th row: ;[2]
5th row: 시군 [English: city/county]
Value              Count   Frequency (%)
(not shown)        2       10.0%
전주시             1       5.0%
고창군             1       5.0%
장수군             1       5.0%
김제시             1       5.0%
남원시             1       5.0%
정읍시             1       5.0%
익산시             1       5.0%
군산시             1       5.0%
9                  1       5.0%
Other values (9)   9       45.0%

Most occurring characters

Value               Count   Frequency (%)
(space)             61      52.1%
(not shown)         7       6.0%
(not shown)         5       4.3%
)                   3       2.6%
;                   3       2.6%
(                   3       2.6%
6                   2       1.7%
(not shown)         2       1.7%
(not shown)         1       0.9%
(not shown)         1       0.9%
Other values (29)   29      24.8%

Most occurring categories

Value                 Count   Frequency (%)
Space Separator       61      52.1%
Other Letter          38      32.5%
Decimal Number        5       4.3%
Close Punctuation     4       3.4%
Other Punctuation     4       3.4%
Open Punctuation      4       3.4%
Initial Punctuation   1       0.9%

Most frequent character per category

Other Letter
Value               Count   Frequency (%)
(not shown)         7       18.4%
(not shown)         5       13.2%
(not shown)         2       5.3%
(not shown)         1       2.6%
(not shown)         1       2.6%
(not shown)         1       2.6%
(not shown)         1       2.6%
(not shown)         1       2.6%
(not shown)         1       2.6%
(not shown)         1       2.6%
Other values (17)   17      44.7%

Decimal Number
Value   Count   Frequency (%)
6       2       40.0%
9       1       20.0%
2       1       20.0%
1       1       20.0%

Close Punctuation
Value   Count   Frequency (%)
)       3       75.0%
]       1       25.0%

Other Punctuation
Value   Count   Frequency (%)
;       3       75.0%
,       1       25.0%

Open Punctuation
Value   Count   Frequency (%)
(       3       75.0%
[       1       25.0%

Space Separator
Value     Count   Frequency (%)
(space)   61      100.0%

Initial Punctuation
Value         Count   Frequency (%)
(not shown)   1       100.0%

Most occurring scripts

Value    Count   Frequency (%)
Common   79      67.5%
Hangul   38      32.5%

Most frequent character per script

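Hangul
Value               Count   Frequency (%)
(not shown)         7       18.4%
(not shown)         5       13.2%
(not shown)         2       5.3%
(not shown)         1       2.6%
(not shown)         1       2.6%
(not shown)         1       2.6%
(not shown)         1       2.6%
(not shown)         1       2.6%
(not shown)         1       2.6%
(not shown)         1       2.6%
Other values (17)   17      44.7%

Common
Value              Count   Frequency (%)
(space)            61      77.2%
)                  3       3.8%
;                  3       3.8%
(                  3       3.8%
6                  2       2.5%
9                  1       1.3%
(not shown)        1       1.3%
]                  1       1.3%
2                  1       1.3%
[                  1       1.3%
Other values (2)   2       2.5%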

Most occurring blocks

Value         Count   Frequency (%)
ASCII         78      66.7%
Hangul        38      32.5%
Punctuation   1       0.9%

Most frequent character per block

ASCII
Value     Count   Frequency (%)
(space)   61      78.2%
)         3       3.8%
;         3       3.8%
(         3       3.8%
6         2       2.6%
9         1       1.3%
]         1       1.3%
2         1       1.3%
[         1       1.3%
,         1       1.3%

Hangul
Value               Count   Frequency (%)
(not shown)         7       18.4%
(not shown)         5       13.2%
(not shown)         2       5.3%
(not shown)         1       2.6%
(not shown)         1       2.6%
(not shown)         1       2.6%
(not shown)         1       2.6%
(not shown)         1       2.6%
(not shown)         1       2.6%
(not shown)         1       2.6%
Other values (17)   17      44.7%

Punctuation
Value         Count   Frequency (%)
(not shown)   1       100.0%

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
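
The two plots did not survive extraction, but the idea behind a nullity matrix can be sketched in a few lines: mark each row as present or missing and draw one cell per row. The example below is a text-only illustration with hypothetical data, not the plotting code used by the report.

```python
# Hypothetical single-column data; None marks a missing cell.
column = ["a", None, "b", None, None, "c"]

# True where a value is present, False where it is missing.
present = [value is not None for value in column]

# One character per row: a filled block for present, a dot for missing.
strip = "".join("█" if ok else "." for ok in present)
completeness = 100 * sum(present) / len(present)

print(strip)
print(f"{completeness:.1f}% complete")
```

Runs of dots in the strip correspond to the blocks of consecutive missing rows that a nullity matrix makes visible.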

Sample

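     ;[1]
0    (blank)
1    ;
2    (‘16년 6월말 현재) (개소, 두) [English: (as of the end of June 2016) (sites, head)]
3    ;[2]
4    시군 [English: city/county]
5    <NA>
6    계 (9) [English: total (9)]
7    전주시
8    군산시
9    익산시

     ;[1]
16   남원시
17   <NA>
18   김제시
19   <NA>
20   장수군
21   <NA>
22   <NA>
23   고창군
24   부안군
25   ;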

Duplicate rows

Most frequently occurring

     ;[1]    # duplicates
1    <NA>    10
0    ;       2