Overview

Dataset statistics

Number of variables: 3
Number of observations: 419
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 7
Duplicate rows (%): 1.7%
Total size in memory: 9.9 KiB
Average record size in memory: 24.3 B

Variable types

Text: 1
Categorical: 2

Dataset

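Description: Construction machinery and heavy equipment rental businesses in Jeonbuk Special Self-Governing Province (전북특별자치도), as of 230501 (1 May 2023). Fields include the business name and the registration type (general or individual). 419 establishments in total.
Author: 전북특별자치도 (Jeonbuk Special Self-Governing Province)
URL: https://www.data.go.kr/data/15113787/fileData.do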

Alerts

Duplicates: Dataset has 7 (1.7%) duplicate rows
Imbalance: 상태 is highly imbalanced (97.6%)
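
The 97.6% figure in the imbalance alert can be reproduced from the class counts reported for 상태 (418 and 1). The following is a minimal sketch, assuming the score is one minus the Shannon entropy of the class shares divided by its maximum possible value. That definition is an assumption about the tool, not something stated in this report, and `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max, with H the base-2 Shannon entropy of the class shares.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    Assumed definition: the report itself does not state the formula.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        # Undefined for fewer than two classes; return 0.0 by convention here.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

score = imbalance_score([418, 1])   # class counts reported for 상태
print(f"{score:.1%}")               # 97.6%
```

Under the same assumption, the 239/180 split of 등록종별 scores close to zero, which is consistent with that variable not raising an alert.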

Reproduction

Analysis started: 2024-03-15 00:14:55.541030
Analysis finished: 2024-03-15 00:14:56.157049
Duration: 0.62 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

상호명칭 (business name)
Text

Distinct: 411
Distinct (%): 98.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.4 KiB

Length

Max length: 16
Median length: 13
Mean length: 7.2792363
Min length: 2
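
These length statistics can be cross-checked against the totals reported elsewhere in this report. The mean should equal the total character count (3050, listed under Characters and Unicode) divided by the number of observations (419), and a given median implies a minimum possible total. A short sketch of that arithmetic, using only figures from this report; the variable names are illustrative.

```python
n_rows = 419          # Number of observations
total_chars = 3050    # Total characters
min_len = 2           # Min length
median_len = 13       # Median length as reported

# The mean length follows directly from the totals.
mean_len = total_chars / n_rows
print(round(mean_len, 7))  # 7.2792363

# With an odd number of values, the median is the middle sorted value, so at
# least (n + 1) // 2 values are >= the median and the rest are >= the minimum.
at_least_median = (n_rows + 1) // 2
lower_bound_total = (at_least_median * median_len
                     + (n_rows - at_least_median) * min_len)
print(lower_bound_total)                 # 3148
print(lower_bound_total <= total_chars)  # False
```

The mean agrees with the totals. A median of 13, however, would require at least 3148 characters in total, more than the 3050 reported, so the median figure does not appear to describe the same set of string lengths and is best treated with caution.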

Characters and Unicode

Total characters: 3050
Distinct characters: 252
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
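
As an illustration of those character properties, the general category of each code point can be looked up with Python's standard `unicodedata` module. Below is a minimal sketch applied to one of the sample values of this variable, (유)두임건설. `category_counts` is a hypothetical helper, and the two-letter codes `Lo`, `Ps`, and `Pe` are Unicode's abbreviations for Other Letter, Open Punctuation, and Close Punctuation.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count the characters of `text` by Unicode general category (e.g. 'Lo', 'Ps')."""
    return Counter(unicodedata.category(ch) for ch in text)

counts = category_counts("(유)두임건설")
print(dict(counts))  # {'Ps': 1, 'Lo': 5, 'Pe': 1}
```

Summing such counts over every value of the column is one way to arrive at a per-category breakdown like the one in this report.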

Unique

Unique: 404
Unique (%): 96.4%

Sample

1st row: (유)두임건설
2nd row: (유)완산특수운송사
3rd row: (유)전북기업
4th row: (유)다올산업
5th row: (유)규림건설중기

Value | Count | Frequency (%)
유한회사 | 17 | 3.7%
개별중기 | 3 | 0.7%
주식회사 | 3 | 0.7%
고창지게차 | 2 | 0.4%
공단지게차 | 2 | 0.4%
유)해성종합중기 | 2 | 0.4%
전북대형지게차 | 2 | 0.4%
새고창지게차 | 2 | 0.4%
건설중기 | 2 | 0.4%
유)유일건기 | 2 | 0.4%
Other values (417) | 419 | 91.9%

Most occurring characters

Value | Count | Frequency (%)
 | 219 | 7.2%
( | 198 | 6.5%
) | 198 | 6.5%
 | 169 | 5.5%
 | 166 | 5.4%
 | 131 | 4.3%
 | 99 | 3.2%
 | 75 | 2.5%
 | 62 | 2.0%
 | 62 | 2.0%
Other values (242) | 1671 | 54.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2602 | 85.3%
Open Punctuation | 198 | 6.5%
Close Punctuation | 198 | 6.5%
Space Separator | 37 | 1.2%
Decimal Number | 8 | 0.3%
Uppercase Letter | 4 | 0.1%
Other Punctuation | 2 | 0.1%
Dash Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 219 | 8.4%
 | 169 | 6.5%
 | 166 | 6.4%
 | 131 | 5.0%
 | 99 | 3.8%
 | 75 | 2.9%
 | 62 | 2.4%
 | 62 | 2.4%
 | 56 | 2.2%
 | 52 | 2.0%
Other values (228) | 1511 | 58.1%

Decimal Number
Value | Count | Frequency (%)
9 | 3 | 37.5%
1 | 2 | 25.0%
2 | 2 | 25.0%
7 | 1 | 12.5%

Uppercase Letter
Value | Count | Frequency (%)
D | 1 | 25.0%
J | 1 | 25.0%
O | 1 | 25.0%
K | 1 | 25.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 1 | 50.0%
. | 1 | 50.0%

Open Punctuation
Value | Count | Frequency (%)
( | 198 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 198 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 37 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2602 | 85.3%
Common | 444 | 14.6%
Latin | 4 | 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 219 | 8.4%
 | 169 | 6.5%
 | 166 | 6.4%
 | 131 | 5.0%
 | 99 | 3.8%
 | 75 | 2.9%
 | 62 | 2.4%
 | 62 | 2.4%
 | 56 | 2.2%
 | 52 | 2.0%
Other values (228) | 1511 | 58.1%

Common
Value | Count | Frequency (%)
( | 198 | 44.6%
) | 198 | 44.6%
(space) | 37 | 8.3%
9 | 3 | 0.7%
1 | 2 | 0.5%
2 | 2 | 0.5%
/ | 1 | 0.2%
- | 1 | 0.2%
7 | 1 | 0.2%
. | 1 | 0.2%

Latin
Value | Count | Frequency (%)
D | 1 | 25.0%
J | 1 | 25.0%
O | 1 | 25.0%
K | 1 | 25.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2602 | 85.3%
ASCII | 448 | 14.7%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 219 | 8.4%
 | 169 | 6.5%
 | 166 | 6.4%
 | 131 | 5.0%
 | 99 | 3.8%
 | 75 | 2.9%
 | 62 | 2.4%
 | 62 | 2.4%
 | 56 | 2.2%
 | 52 | 2.0%
Other values (228) | 1511 | 58.1%

ASCII
Value | Count | Frequency (%)
( | 198 | 44.2%
) | 198 | 44.2%
(space) | 37 | 8.3%
9 | 3 | 0.7%
1 | 2 | 0.4%
2 | 2 | 0.4%
/ | 1 | 0.2%
- | 1 | 0.2%
7 | 1 | 0.2%
. | 1 | 0.2%
Other values (4) | 4 | 0.9%

등록종별 (registration type)
Categorical

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.4 KiB
일반 (general): 239
개별 (individual): 180

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일반
2nd row: 일반
3rd row: 일반
4th row: 일반
5th row: 일반

Common Values

Value | Count | Frequency (%)
일반 | 239 | 57.0%
개별 | 180 | 43.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
일반 | 239 | 57.0%
개별 | 180 | 43.0%

상태 (status)
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.4 KiB
영업 (operating): 418
재개업 (reopened): 1

Length

Max length: 3
Median length: 2
Mean length: 2.0023866
Min length: 2

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: 영업
2nd row: 영업
3rd row: 영업
4th row: 영업
5th row: 영업

Common Values

Value | Count | Frequency (%)
영업 | 418 | 99.8%
재개업 | 1 | 0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
영업 | 418 | 99.8%
재개업 | 1 | 0.2%

Correlations

 | 등록종별 | 상태
등록종별 | 1.000 | 0.000
상태 | 0.000 | 1.000

 | 상태 | 등록종별
상태 | 1.000 | 0.000
등록종별 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completeness.

Sample

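 | 상호명칭 | 등록종별 | 상태
0 | (유)두임건설 | 일반 | 영업
1 | (유)완산특수운송사 | 일반 | 영업
2 | (유)전북기업 | 일반 | 영업
3 | (유)다올산업 | 일반 | 영업
4 | (유)규림건설중기 | 일반 | 영업
5 | 신토중기 | 일반 | 영업
6 | (유)인프라종합중기 | 일반 | 영업
7 | 연합중기 | 개별 | 영업
8 | 우주종합중기 | 일반 | 영업
9 | 임광토건중기 | 개별 | 영업

 | 상호명칭 | 등록종별 | 상태
409 | (유)호림건설 | 일반 | 영업
410 | 바다콤 | 개별 | 영업
411 | 서해안건설중기 | 일반 | 영업
412 | (유)세대건설 | 일반 | 영업
413 | 부안중앙농업협동조합 | 일반 | 영업
414 | 하서농업협동조합 | 일반 | 영업
415 | 남부안농업협동조합 | 일반 | 영업
416 | (유)세계산업 | 일반 | 영업
417 | 린협동조합 | 개별 | 영업
418 | 부안농업협동조합 | 일반 | 영업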

Duplicate rows

Most frequently occurring

 | 상호명칭 | 등록종별 | 상태 | # duplicates
2 | 개별중기 | 개별 | 영업 | 3
0 | (유)유일건기 | 일반 | 영업 | 2
1 | (유)해성종합중기 | 일반 | 영업 | 2
3 | 고창지게차 | 개별 | 영업 | 2
4 | 공단지게차 | 개별 | 영업 | 2
5 | 새고창지게차 | 개별 | 영업 | 2
6 | 전북대형지게차 | 개별 | 영업 | 2
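
The relationship between this table and the 7 duplicate rows reported in the overview can be checked by grouping identical records and counting them. The following is a minimal sketch using only the standard library. The `rows` list is a small illustrative sample built from values that appear in this report, not the full dataset, and `duplicate_groups` is a hypothetical helper.

```python
from collections import Counter

def duplicate_groups(rows):
    """Return {row: count} for every row that occurs more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

# Illustrative sample only: (상호명칭, 등록종별, 상태) tuples.
rows = [
    ("개별중기", "개별", "영업"),
    ("개별중기", "개별", "영업"),
    ("개별중기", "개별", "영업"),
    ("고창지게차", "개별", "영업"),
    ("고창지게차", "개별", "영업"),
    ("바다콤", "개별", "영업"),
]

groups = duplicate_groups(rows)
print(len(groups))                          # 2 distinct duplicated records
print(sum(n - 1 for n in groups.values()))  # 3 surplus copies
```

Applied to the table above, the first measure gives 7 distinct duplicated records, which matches the overview figure. The second would give 8 surplus copies (2 for 개별중기 plus 1 for each of the other six).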