Overview

Dataset statistics

Number of variables: 6
Number of observations: 990
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 47.5 KiB
Average record size in memory: 49.1 B
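
The average record size is the total memory divided by the number of observations. A quick check in Python, assuming 1 KiB = 1024 bytes (the variable names are illustrative only):

```python
# Figures copied from the dataset statistics above.
total_size_kib = 47.5
n_observations = 990

# Assumption: 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_size = total_size_bytes / n_observations

print(f"{avg_record_size:.1f} B")  # prints: 49.1 B
```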

Variable types

Text: 1
Categorical: 4
Boolean: 1

Dataset

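Description: Data from the Gyeongsangnam-do bus company business income and expenditure analysis system (경상남도 버스업체경영수지분석시스템), providing information such as registration date, update date, employee ID, and employee type.
Author: Gyeongsangnam-do (경상남도)
URL: https://www.data.go.kr/data/15065978/fileData.do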

Alerts

USE_YN has constant value "" (Constant)
UPD_DT is highly overall correlated with REMARKS and 1 other field (High correlation)
REMARKS is highly overall correlated with ENT_DT and 1 other field (High correlation)
ENT_DT is highly overall correlated with REMARKS and 1 other field (High correlation)
REMARKS is highly imbalanced (92.7%) (Imbalance)
EMP_TAG is highly imbalanced (54.0%) (Imbalance)
No has unique values (Unique)
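
The imbalance percentages in these alerts are consistent with one minus the normalized Shannon entropy of the class frequencies. The sketch below reproduces both figures from the counts listed under Common Values for each variable. It is an illustration under that assumption, not necessarily the library's exact implementation, and `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the class
    frequencies and k is the number of classes.
    0 means perfectly balanced, values near 1 mean one class dominates."""
    n = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # convention chosen for this sketch only
    entropy = -sum((c / n) * math.log(c / n) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)


# Counts copied from the Common Values tables of this report.
emp_tag_counts = [844, 106, 40]
remarks_counts = [967, 9, 9, 2, 1, 1, 1]

print(f"EMP_TAG: {imbalance_score(emp_tag_counts):.1%}")
print(f"REMARKS: {imbalance_score(remarks_counts):.1%}")
```

Run on the reported counts, this gives 54.0% for EMP_TAG and 92.7% for REMARKS, matching the alerts.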

Reproduction

Analysis started: 2023-12-12 14:59:10.747902
Analysis finished: 2023-12-12 14:59:11.149548
Duration: 0.4 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

No
Text

UNIQUE 

Distinct: 990
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Length

Max length: 3
Median length: 3
Mean length: 2.8919192
Min length: 1

Characters and Unicode

Total characters: 2863
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
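
The length and character figures for this column can be reproduced with Python's standard `unicodedata` module. The sketch assumes that No holds the string "▶1" followed by the plain integers 2 to 990, which is consistent with the statistics in this report but is not confirmed by it.

```python
import unicodedata
from collections import Counter

# Assumption: No holds "▶1" followed by the integers 2..990 as strings.
values = ["▶1"] + [str(i) for i in range(2, 991)]

lengths = [len(v) for v in values]
chars = Counter("".join(values))
# General category codes: "Nd" = Decimal Number, "So" = Other Symbol.
categories = Counter(unicodedata.category(ch) for v in values for ch in v)

mean_length = sum(lengths) / len(values)

print(sum(lengths), len(chars))    # total characters, distinct characters
print(min(lengths), max(lengths))  # min and max length
print(mean_length)                 # mean length
print(dict(categories))            # character counts per general category
```

Under that assumption the totals agree with the report: 2863 characters, 11 distinct characters, 2862 decimal digits and a single Other Symbol.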

Unique

Unique: 990
Unique (%): 100.0%

Sample

1st row: ▶1
2nd row: 2
3rd row: 3
4th row: 4
5th row: 5

Value               Count  Frequency (%)
▶1                  1      0.1%
650                 1      0.1%
652                 1      0.1%
653                 1      0.1%
654                 1      0.1%
655                 1      0.1%
656                 1      0.1%
657                 1      0.1%
658                 1      0.1%
659                 1      0.1%
Other values (980)  980    99.0%

Most occurring characters

Value  Count  Frequency (%)
1      299    10.4%
3      299    10.4%
4      299    10.4%
5      299    10.4%
6      299    10.4%
7      299    10.4%
8      299    10.4%
2      299    10.4%
9      281    9.8%
0      189    6.6%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  2862   > 99.9%
Other Symbol    1      < 0.1%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      299    10.4%
3      299    10.4%
4      299    10.4%
5      299    10.4%
6      299    10.4%
7      299    10.4%
8      299    10.4%
2      299    10.4%
9      281    9.8%
0      189    6.6%

Other Symbol
Value  Count  Frequency (%)
▶      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  2863   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
1      299    10.4%
3      299    10.4%
4      299    10.4%
5      299    10.4%
6      299    10.4%
7      299    10.4%
8      299    10.4%
2      299    10.4%
9      281    9.8%
0      189    6.6%

Most occurring blocks

Value             Count  Frequency (%)
ASCII             2862   > 99.9%
Geometric Shapes  1      < 0.1%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1      299    10.4%
3      299    10.4%
4      299    10.4%
5      299    10.4%
6      299    10.4%
7      299    10.4%
8      299    10.4%
2      299    10.4%
9      281    9.8%
0      189    6.6%

Geometric Shapes
Value  Count  Frequency (%)
▶      1      100.0%

REMARKS
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 7
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Length

Max length: 5
Median length: 4
Mean length: 3.9707071
Min length: 2

Unique

Unique: 3
Unique (%): 0.3%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

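Value                            Count  Frequency (%)
<NA>                             967    97.7%
촉탁 (contract staff)              9      0.9%
영업소 (branch office)              9      0.9%
청소원 (cleaner)                    2      0.2%
세차원 (car washer)                 1      0.1%
대표이사 (representative director)   1      0.1%
노조지부장 (union branch head)        1      0.1%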

Length

Histogram of lengths of the category


EMP_TAG
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      844    85.3%
2      106    10.7%
3      40     4.0%

Length

Histogram of lengths of the category


ENT_DT
Categorical

HIGH CORRELATION 

Distinct: 9
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 20,100,915
2nd row: 20,100,916
3rd row: 20,100,916
4th row: 20,100,916
5th row: 20,100,916

Common Values

Value       Count  Frequency (%)
20,100,906  354    35.8%
20,100,901  206    20.8%
20,100,908  180    18.2%
20,100,905  158    16.0%
20,100,907  72     7.3%
20,100,916  14     1.4%
20,100,825  3      0.3%
20,100,902  2      0.2%
20,100,915  1      0.1%
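
These values look like eight-digit dates (YYYYMMDD) that picked up thousands separators somewhere upstream, so that 20,100,906 would stand for 2010-09-06. That reading is an assumption, based on the dataset description mentioning registration and update dates and on every value being 10 characters long. Under that assumption, a minimal way to recover real dates is sketched below; `parse_grouped_date` is a hypothetical helper name.

```python
from datetime import date, datetime


def parse_grouped_date(text: str) -> date:
    """Drop the thousands separators and read the digits as YYYYMMDD.

    Assumption: the commas are formatting noise, not part of the value.
    Raises ValueError if the remaining digits are not a valid date.
    """
    return datetime.strptime(text.replace(",", ""), "%Y%m%d").date()


print(parse_grouped_date("20,100,906"))  # prints: 2010-09-06
```

Parsing the column this way would let it be profiled as a date rather than as a categorical variable.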

Length

Histogram of lengths of the category


UPD_DT
Categorical

HIGH CORRELATION 

Distinct: 10
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 20,100,915
2nd row: 20,100,916
3rd row: 20,100,916
4th row: 20,100,916
5th row: 20,100,916

Common Values

Value       Count  Frequency (%)
20,100,906  352    35.6%
20,100,901  206    20.8%
20,100,908  180    18.2%
20,100,905  158    16.0%
20,100,907  72     7.3%
20,100,916  14     1.4%
20,100,825  3      0.3%
20,100,902  2      0.2%
20,101,101  2      0.2%
20,100,915  1      0.1%

Length

Histogram of lengths of the category


USE_YN
Boolean

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value  Count  Frequency (%)
True   990    100.0%

Correlations

         REMARKS  EMP_TAG  ENT_DT  UPD_DT
REMARKS  1.000    0.649    NaN     NaN
EMP_TAG  0.649    1.000    0.421   0.346
ENT_DT   NaN      0.421    1.000   1.000
UPD_DT   NaN      0.346    1.000   1.000

         UPD_DT  REMARKS  ENT_DT  EMP_TAG
UPD_DT   1.000   1.000    0.999   0.220
REMARKS  1.000   1.000    1.000   0.298
ENT_DT   0.999   1.000    1.000   0.204
EMP_TAG  0.220   0.298    0.204   1.000

         REMARKS  EMP_TAG  ENT_DT  UPD_DT
REMARKS  1.000    0.298    1.000   1.000
EMP_TAG  0.298    1.000    0.204   0.220
ENT_DT   1.000    0.204    1.000   0.999
UPD_DT   1.000    0.220    0.999   1.000
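
The report does not state which association measure each matrix uses. For pairs of categorical variables, one common choice is Cramér's V, computed from the chi-squared statistic of their contingency table. The sketch below implements the plain (uncorrected) version; the two contingency tables are hypothetical toy data, not taken from this dataset, and the result should not be expected to match the matrices above.

```python
import math


def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows.

    Returns a value between 0 (no association) and 1 (perfect association),
    or NaN when the table has fewer than two rows or columns.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 and n > 0 else float("nan")


# Hypothetical toy tables: rows are categories of one variable,
# columns are categories of the other.
perfect = [[10, 0], [0, 10]]      # each row category maps to one column
independent = [[5, 5], [5, 5]]    # rows carry no information about columns

print(cramers_v(perfect), cramers_v(independent))
```

A value near 1, as reported between ENT_DT and UPD_DT, would mean that knowing one variable almost determines the other.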

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First 10 rows

     No   REMARKS  EMP_TAG  ENT_DT      UPD_DT      USE_YN
0    ▶1   <NA>     2        20,100,915  20,100,915  Y
1    2    <NA>     1        20,100,916  20,100,916  Y
2    3    <NA>     1        20,100,916  20,100,916  Y
3    4    <NA>     1        20,100,916  20,100,916  Y
4    5    <NA>     1        20,100,916  20,100,916  Y
5    6    <NA>     1        20,100,916  20,100,916  Y
6    7    <NA>     1        20,100,916  20,100,916  Y
7    8    <NA>     1        20,100,916  20,100,916  Y
8    9    <NA>     1        20,100,916  20,100,916  Y
9    10   <NA>     1        20,100,916  20,100,916  Y

Last 10 rows

     No   REMARKS  EMP_TAG  ENT_DT      UPD_DT      USE_YN
980  981  <NA>     1        20,100,906  20,100,906  Y
981  982  <NA>     1        20,100,906  20,100,906  Y
982  983  <NA>     1        20,100,906  20,100,906  Y
983  984  <NA>     1        20,100,906  20,100,906  Y
984  985  <NA>     1        20,100,906  20,100,906  Y
985  986  <NA>     3        20,100,825  20,100,825  Y
986  987  <NA>     2        20,100,908  20,100,908  Y
987  988  <NA>     2        20,100,908  20,100,908  Y
988  989  <NA>     2        20,100,908  20,100,908  Y
989  990  <NA>     2        20,100,908  20,100,908  Y