Overview

Dataset statistics

Number of variables: 1
Number of observations: 128
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 6
Duplicate rows (%): 4.7%
Total size in memory: 1.1 KiB
Average record size in memory: 9.0 B
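
The percentage and per-record figures above follow from the raw counts. The sketch below reproduces that arithmetic under two assumptions that the report does not state: that the duplicate percentage is the number of duplicated row values divided by the number of observations, and that "1.1 KiB" is a rounded display of 1,152 bytes.

```python
# Counts taken from the Dataset statistics block.
n_observations = 128
n_duplicate_rows = 6

# Share of duplicate rows as a percentage, rounded to one decimal place.
duplicate_pct = round(100 * n_duplicate_rows / n_observations, 1)

# Assumed exact footprint in bytes (hypothetical; the report shows only "1.1 KiB").
total_bytes = 1152
avg_record_bytes = total_bytes / n_observations

print(f"{duplicate_pct}%")
print(f"{total_bytes / 1024:.1f} KiB")
print(f"{avg_record_bytes:.1f} B")
```

With these inputs the three printed values match the rounded figures in the block above, which suggests (but does not prove) that this is how they were derived.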

Variable types

Text: 1

Dataset

Description: Status of carbon-industry-related companies in Jeollabuk-do (전라북도탄소산업연관기업현황)
Author: Jeollabuk-do (전라북도)
URL: https://www.bigdatahub.go.kr/opendata/dataSet/detail.nm?contentId=37&rlik=49451aebf056b486&serviceId=202132

Alerts

Duplicates: dataset has 6 (4.7%) duplicate rows

Reproduction

Analysis started: 2024-03-14 02:48:01.979334
Analysis finished: 2024-03-14 02:48:02.126573
Duration: 0.15 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
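
The reported duration can be checked against the two timestamps. This is a minimal sketch using only the standard library; it assumes both timestamps are in the same time zone, which the report does not say.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2024-03-14 02:48:01.979334")
finished = datetime.fromisoformat("2024-03-14 02:48:02.126573")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```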

Variables

Distinct: 116
Distinct (%): 90.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 17
Median length: 2
Mean length: 2.2265625
Min length: 1
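
The length statistics are ordinary summary statistics over the character count of each cell. The sketch below computes them for a handful of hypothetical values in the style of this column (not the full data), then cross-checks the reported mean against the totals given elsewhere in the report (285 characters over 128 observations).

```python
from statistics import mean, median

# Hypothetical cell values in the style of the Sample block; not the full column.
values = ["연번", "총계", "1", "2", ";[1]"]
lengths = [len(v) for v in values]

print(max(lengths), median(lengths), mean(lengths), min(lengths))

# Cross-check of the reported mean: total characters / number of observations.
reported_mean = 285 / 128
print(reported_mean)
```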

Characters and Unicode

Total characters: 285
Distinct characters: 35
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
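
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories for three rows copied from the Sample block; it is not how the profiler itself is implemented, and the standard library does not cover scripts or blocks.

```python
import unicodedata
from collections import Counter

# Three rows copied from the report's Sample block.
rows = ["(‘16. 3. 16기준)", ";[1]", "연번"]

# Tally the two-letter general category of every character.
# Nd = Decimal Number, Lo = Other Letter, Po = Other Punctuation,
# Ps/Pe = Open/Close Punctuation, Zs = Space Separator, Pi = Initial Punctuation.
categories = Counter(unicodedata.category(ch) for row in rows for ch in row)

for code, count in categories.most_common():
    print(code, count)
```

The seven categories that appear here are the same seven the report lists, although the counts differ because only three rows are used.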

Unique

Unique: 110
Unique (%): 85.9%
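
"Distinct" and "Unique" are easy to confuse: distinct counts every different value, while unique counts only values that occur exactly once. The sketch below rebuilds a synthetic column with the same frequency profile as the report (110 values seen once, 2 values seen 5 times, 4 values seen twice). The placeholder values are hypothetical and stand in for the real data.

```python
from collections import Counter

# Synthetic column with the report's frequency profile (placeholder values).
column = [f"u{i}" for i in range(110)]   # 110 values occurring once
column += ["a"] * 5 + ["b"] * 5          # 2 values occurring 5 times
column += ["c", "d", "e", "f"] * 2       # 4 values occurring twice

counts = Counter(column)
n = len(column)
distinct = len(counts)                               # different values present
unique = sum(1 for c in counts.values() if c == 1)   # values seen exactly once

print(n, distinct, unique)
print(f"{100 * distinct / n:.1f}% {100 * unique / n:.1f}%")
```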

Sample

1st row: (‘16. 3. 16기준) [as of '16. 3. 16]
2nd row: ;[1]
3rd row: 연번 [serial number]
4th row: 총계 [total]
5th row: 1
Value               Count  Frequency (%)
(blank)             5      3.7%
연번                5      3.7%
3                   4      3.0%
1                   3      2.2%
2                   3      2.2%
4                   3      2.2%
5                   2      1.5%
62                  1      0.7%
63                  1      0.7%
66                  1      0.7%
Other values (106)  106    79.1%

Most occurring characters

Value              Count  Frequency (%)
1                  32     11.2%
3                  24     8.4%
6                  23     8.1%
2                  23     8.1%
4                  23     8.1%
5                  22     7.7%
7                  20     7.0%
9                  20     7.0%
8                  20     7.0%
0                  17     6.0%
Other values (25)  61     21.4%

Most occurring categories

Value                Count  Frequency (%)
Decimal Number       224    78.6%
Other Letter         27     9.5%
Other Punctuation    12     4.2%
Space Separator      9      3.2%
Open Punctuation     6      2.1%
Close Punctuation    6      2.1%
Initial Punctuation  1      0.4%

Most frequent character per category

Other Letter

Value             Count  Frequency (%)
(not preserved)   5      18.5%
(not preserved)   5      18.5%
(not preserved)   2      7.4%
(not preserved)   2      7.4%
(not preserved)   1      3.7%
(not preserved)   1      3.7%
(not preserved)   1      3.7%
(not preserved)   1      3.7%
(not preserved)   1      3.7%
(not preserved)   1      3.7%
Other values (7)  7      25.9%

Decimal Number

Value  Count  Frequency (%)
1      32     14.3%
3      24     10.7%
6      23     10.3%
2      23     10.3%
4      23     10.3%
5      22     9.8%
7      20     8.9%
9      20     8.9%
8      20     8.9%
0      17     7.6%

Other Punctuation

Value  Count  Frequency (%)
;      10     83.3%
.      2      16.7%

Open Punctuation

Value  Count  Frequency (%)
[      5      83.3%
(      1      16.7%

Close Punctuation

Value  Count  Frequency (%)
]      5      83.3%
)      1      16.7%

Space Separator

Value    Count  Frequency (%)
(space)  9      100.0%

Initial Punctuation

Value  Count  Frequency (%)
‘      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  258    90.5%
Hangul  27     9.5%

Most frequent character per script

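Common

Value             Count  Frequency (%)
1                 32     12.4%
3                 24     9.3%
6                 23     8.9%
2                 23     8.9%
4                 23     8.9%
5                 22     8.5%
7                 20     7.8%
9                 20     7.8%
8                 20     7.8%
0                 17     6.6%
Other values (8)  34     13.2%

Hangul

Value             Count  Frequency (%)
(not preserved)   5      18.5%
(not preserved)   5      18.5%
(not preserved)   2      7.4%
(not preserved)   2      7.4%
(not preserved)   1      3.7%
(not preserved)   1      3.7%
(not preserved)   1      3.7%
(not preserved)   1      3.7%
(not preserved)   1      3.7%
(not preserved)   1      3.7%
Other values (7)  7      25.9%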

Most occurring blocks

Value        Count  Frequency (%)
ASCII        257    90.2%
Hangul       27     9.5%
Punctuation  1      0.4%

Most frequent character per block

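ASCII

Value             Count  Frequency (%)
1                 32     12.5%
3                 24     9.3%
6                 23     8.9%
2                 23     8.9%
4                 23     8.9%
5                 22     8.6%
7                 20     7.8%
9                 20     7.8%
8                 20     7.8%
0                 17     6.6%
Other values (7)  33     12.8%

Hangul

Value             Count  Frequency (%)
(not preserved)   5      18.5%
(not preserved)   5      18.5%
(not preserved)   2      7.4%
(not preserved)   2      7.4%
(not preserved)   1      3.7%
(not preserved)   1      3.7%
(not preserved)   1      3.7%
(not preserved)   1      3.7%
(not preserved)   1      3.7%
(not preserved)   1      3.7%
Other values (7)  7      25.9%

Punctuation

Value  Count  Frequency (%)
‘      1      100.0%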

Missing values

[Figure: a simple visualization of nullity by column.]
[Figure: nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]
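
Both displays are built from per-cell presence flags. The sketch below shows the idea on a few hypothetical rows using plain Python; the profiled dataset itself has no missing cells, and the report's actual plots come from its own plotting code rather than from anything like this.

```python
# Hypothetical rows; None marks a missing cell.
rows = [
    {"name": "A", "count": 3},
    {"name": None, "count": 5},
    {"name": "C", "count": None},
    {"name": "D", "count": 7},
]
columns = ["name", "count"]

# Nullity by column: how many cells are missing in each column.
missing_per_column = {c: sum(r[c] is None for r in rows) for c in columns}

# Nullity matrix: one list of flags per record, True where a value is present.
matrix = [[r[c] is not None for c in columns] for r in rows]

print(missing_per_column)
for flags in matrix:
    # "#" marks a present value, "." a missing one.
    print("".join("#" if present else "." for present in flags))
```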

Sample

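First rows

Index  전라북도 탄소산업 연관 기업 현황 (Status of carbon-industry-related companies in Jeollabuk-do)
0      (‘16. 3. 16기준) [as of '16. 3. 16]
1      ;[1]
2      연번 [serial number]
3      총계 [total]
4      1
5      2
6      3
7      4
8      5
9      6

Last rows

Index  전라북도 탄소산업 연관 기업 현황 (Status of carbon-industry-related companies in Jeollabuk-do)
118    106
119    ;
120    탄소소재 적용 관심 기업체 현황 [Status of companies interested in applying carbon materials]
121    ;[5]
122    연번 [serial number]
123    1
124    2
125    3
126    4
127    ;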

Duplicate rows

Most frequently occurring

Index  전라북도 탄소산업 연관 기업 현황  # duplicates
4      ;                               5
5      연번                            5
0      1                               2
1      2                               2
2      3                               2
3      4                               2