Dataset statistics
Number of variables | 6 |
---|---|
Number of observations | 990 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 47.5 KiB |
Average record size in memory | 49.1 B |
Variable types
Text | 1 |
---|---|
Categorical | 4 |
Boolean | 1 |
Dataset
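Description | Data from the Gyeongsangnam-do bus company management balance analysis system (경상남도 버스업체경영수지분석시스템), providing information such as registration date, update date, employee ID, and employee type classification. |
---|---|
Author | Gyeongsangnam-do (경상남도) |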
URL | https://www.data.go.kr/data/15065978/fileData.do |
USE_YN has constant value "" | Constant |
UPD_DT is highly overall correlated with REMARKS and 1 other fields | High correlation |
REMARKS is highly overall correlated with ENT_DT and 1 other fields | High correlation |
ENT_DT is highly overall correlated with REMARKS and 1 other fields | High correlation |
REMARKS is highly imbalanced (92.7%) | Imbalance |
EMP_TAG is highly imbalanced (54.0%) | Imbalance |
No has unique values | Unique |
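The imbalance percentages in the alerts above can be reproduced from the class counts listed in the Common Values tables. The sketch below assumes the score is one minus the Shannon entropy of the class frequencies divided by the logarithm of the number of classes. That definition happens to match both reported figures, but it is an assumption here and is not stated anywhere in this report. The function name `imbalance_score` is hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k): 0 for a uniform column, close to 1 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class is treated as not imbalanced in this sketch
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)


# Class counts copied from the Common Values tables of this report.
remarks_counts = [967, 9, 9, 2, 1, 1, 1]
emp_tag_counts = [844, 106, 40]

remarks_imbalance = imbalance_score(remarks_counts)
emp_tag_imbalance = imbalance_score(emp_tag_counts)

print(f"REMARKS: {remarks_imbalance:.1%}")
print(f"EMP_TAG: {emp_tag_imbalance:.1%}")
```

A score near 1 means almost every row carries the same value, which is the case for REMARKS, where 967 of 990 rows share one value.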
Reproduction
Analysis started | 2023-12-12 14:59:10.747902 |
---|---|
Analysis finished | 2023-12-12 14:59:11.149548 |
Duration | 0.4 seconds |
Software version | ydata-profiling v4.5.1 |
No
Text
UNIQUE
 
Distinct | 990 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 7.9 KiB |
Value | Count | Frequency (%) |
▶1 | 1 | 0.1% |
650 | 1 | 0.1% |
652 | 1 | 0.1% |
653 | 1 | 0.1% |
654 | 1 | 0.1% |
655 | 1 | 0.1% |
656 | 1 | 0.1% |
657 | 1 | 0.1% |
658 | 1 | 0.1% |
659 | 1 | 0.1% |
Other values (980) | 980 | 99.0% |
Most occurring characters
Value | Count | Frequency (%) |
1 | 299 | 10.4% |
3 | 299 | 10.4% |
4 | 299 | 10.4% |
5 | 299 | 10.4% |
6 | 299 | 10.4% |
7 | 299 | 10.4% |
8 | 299 | 10.4% |
2 | 299 | 10.4% |
9 | 281 | 9.8% |
0 | 189 | 6.6% |
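These digit counts are what one would expect if No simply holds the integers 1 through 990 written in decimal. The sketch below checks that under exactly this assumption. The single stray `▶` character in the first value is ignored.

```python
from collections import Counter

# Assumption: No holds the integers 1..990 in decimal, one per row.
digit_counts = Counter("".join(str(n) for n in range(1, 991)))
total_digits = sum(digit_counts.values())

# List digits from most to least frequent, ties broken by digit.
for digit, count in sorted(digit_counts.items(), key=lambda kv: (-kv[1], kv[0])):
    print(digit, count, f"{count / total_digits:.1%}")
```

Under that assumption the digits 1 to 8 each occur 299 times, 9 occurs 281 times, and 0 occurs 189 times, for 2862 decimal characters in total, matching the table.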
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 2862 | > 99.9% |
Other Symbol | 1 | < 0.1% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
1 | 299 | 10.4% |
3 | 299 | 10.4% |
4 | 299 | 10.4% |
5 | 299 | 10.4% |
6 | 299 | 10.4% |
7 | 299 | 10.4% |
8 | 299 | 10.4% |
2 | 299 | 10.4% |
9 | 281 | 9.8% |
0 | 189 | 6.6% |
Other Symbol
Value | Count | Frequency (%) |
▶ | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 2863 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
1 | 299 | 10.4% |
3 | 299 | 10.4% |
4 | 299 | 10.4% |
5 | 299 | 10.4% |
6 | 299 | 10.4% |
7 | 299 | 10.4% |
8 | 299 | 10.4% |
2 | 299 | 10.4% |
9 | 281 | 9.8% |
0 | 189 | 6.6% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 2862 | > 99.9% |
Geometric Shapes | 1 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
1 | 299 | 10.4% |
3 | 299 | 10.4% |
4 | 299 | 10.4% |
5 | 299 | 10.4% |
6 | 299 | 10.4% |
7 | 299 | 10.4% |
8 | 299 | 10.4% |
2 | 299 | 10.4% |
9 | 281 | 9.8% |
0 | 189 | 6.6% |
Geometric Shapes
Value | Count | Frequency (%) |
▶ | 1 |
REMARKS
Categorical
HIGH CORRELATION
  IMBALANCE
 
Distinct | 7 |
---|---|
Distinct (%) | 0.7% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 7.9 KiB |
Length
Max length | 5 |
---|---|
Median length | 4 |
Mean length | 3.9707071 |
Min length | 2 |
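The length statistics are consistent with `<NA>` being stored as a literal four-character string rather than as a true missing value, which would also explain why REMARKS reports 0 missing cells. The sketch below recomputes the length statistics from the value counts in the Common Values table. It assumes lengths are counted in Unicode code points.

```python
# Value counts copied from the Common Values table for REMARKS.
remarks_counts = {
    "<NA>": 967,
    "촉탁": 9,
    "영업소": 9,
    "청소원": 2,
    "세차원": 1,
    "대표이사": 1,
    "노조지부장": 1,
}

total_rows = sum(remarks_counts.values())
# Assumption: length is the number of code points, so "<NA>" counts as 4.
total_chars = sum(len(value) * count for value, count in remarks_counts.items())

mean_length = total_chars / total_rows
min_length = min(len(value) for value in remarks_counts)
max_length = max(len(value) for value in remarks_counts)

print(f"mean={mean_length:.7f} min={min_length} max={max_length}")
```

If the placeholder is meant to represent missing data, converting it to a real null before profiling would change the Missing, Distinct, and imbalance figures for this column.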
Unique
Unique | 3 |
---|---|
Unique (%) | 0.3% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 967 | 97.7% |
촉탁 | 9 | 0.9% |
영업소 | 9 | 0.9% |
청소원 | 2 | 0.2% |
세차원 | 1 | 0.1% |
대표이사 | 1 | 0.1% |
노조지부장 | 1 | 0.1% |
EMP_TAG
Categorical
IMBALANCE
 
Distinct | 3 |
---|---|
Distinct (%) | 0.3% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 7.9 KiB |
Length
Max length | 1 |
---|---|
Median length | 1 |
Mean length | 1 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 2 |
---|---|
2nd row | 1 |
3rd row | 1 |
4th row | 1 |
5th row | 1 |
Common Values
Value | Count | Frequency (%) |
1 | 844 | 85.3% |
2 | 106 | 10.7% |
3 | 40 | 4.0% |
ENT_DT
Categorical
HIGH CORRELATION
 
Distinct | 9 |
---|---|
Distinct (%) | 0.9% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 7.9 KiB |
Length
Max length | 10 |
---|---|
Median length | 10 |
Mean length | 10 |
Min length | 10 |
Unique
Unique | 1 |
---|---|
Unique (%) | 0.1% |
Sample
1st row | 20,100,915 |
---|---|
2nd row | 20,100,916 |
3rd row | 20,100,916 |
4th row | 20,100,916 |
5th row | 20,100,916 |
Common Values
Value | Count | Frequency (%) |
20,100,906 | 354 | 35.8% |
20,100,901 | 206 | 20.8% |
20,100,908 | 180 | 18.2% |
20,100,905 | 158 | 16.0% |
20,100,907 | 72 | 7.3% |
20,100,916 | 14 | 1.4% |
20,100,825 | 3 | 0.3% |
20,100,902 | 2 | 0.2% |
20,100,915 | 1 | 0.1% |
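The ENT_DT values look like eight-digit YYYYMMDD dates that were written out with thousands separators, so `20,100,906` would stand for 2010-09-06. That reading is an assumption based only on the values shown here. Under it, a minimal sketch for recovering real dates follows. The helper name `parse_grouped_date` is hypothetical.

```python
from datetime import date, datetime


def parse_grouped_date(text: str) -> date:
    """Parse a value such as '20,100,906' as a YYYYMMDD date (assumed encoding)."""
    digits = text.replace(",", "")  # drop the thousands separators
    return datetime.strptime(digits, "%Y%m%d").date()


# Values copied from the Common Values table for ENT_DT.
samples = ["20,100,906", "20,100,825", "20,100,916"]
parsed = [parse_grouped_date(s) for s in samples]

for raw, value in zip(samples, parsed):
    print(raw, "->", value.isoformat())
```

Parsing the column this way before profiling would let it be treated as a date rather than as a categorical string of fixed length 10.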
UPD_DT
Categorical
HIGH CORRELATION
 
Distinct | 10 |
---|---|
Distinct (%) | 1.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 7.9 KiB |
Length
Max length | 10 |
---|---|
Median length | 10 |
Mean length | 10 |
Min length | 10 |
Unique
Unique | 1 |
---|---|
Unique (%) | 0.1% |
Sample
1st row | 20,100,915 |
---|---|
2nd row | 20,100,916 |
3rd row | 20,100,916 |
4th row | 20,100,916 |
5th row | 20,100,916 |
Common Values
Value | Count | Frequency (%) |
20,100,906 | 352 | 35.6% |
20,100,901 | 206 | 20.8% |
20,100,908 | 180 | 18.2% |
20,100,905 | 158 | 16.0% |
20,100,907 | 72 | 7.3% |
20,100,916 | 14 | 1.4% |
20,100,825 | 3 | 0.3% |
20,100,902 | 2 | 0.2% |
20,101,101 | 2 | 0.2% |
20,100,915 | 1 | 0.1% |
USE_YN
Boolean
CONSTANT
 
Distinct | 1 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.1 KiB |
Value | Count | Frequency (%) |
True | 990 |
Variable | REMARKS | EMP_TAG | ENT_DT | UPD_DT |
---|---|---|---|---|
REMARKS | 1.000 | 0.649 | NaN | NaN |
EMP_TAG | 0.649 | 1.000 | 0.421 | 0.346 |
ENT_DT | NaN | 0.421 | 1.000 | 1.000 |
UPD_DT | NaN | 0.346 | 1.000 | 1.000 |
Variable | UPD_DT | REMARKS | ENT_DT | EMP_TAG |
---|---|---|---|---|
UPD_DT | 1.000 | 1.000 | 0.999 | 0.220 |
REMARKS | 1.000 | 1.000 | 1.000 | 0.298 |
ENT_DT | 0.999 | 1.000 | 1.000 | 0.204 |
EMP_TAG | 0.220 | 0.298 | 0.204 | 1.000 |
Variable | REMARKS | EMP_TAG | ENT_DT | UPD_DT |
---|---|---|---|---|
REMARKS | 1.000 | 0.298 | 1.000 | 1.000 |
EMP_TAG | 0.298 | 1.000 | 0.204 | 0.220 |
ENT_DT | 1.000 | 0.204 | 1.000 | 0.999 |
UPD_DT | 1.000 | 0.220 | 0.999 | 1.000 |
Index | No | REMARKS | EMP_TAG | ENT_DT | UPD_DT | USE_YN |
---|---|---|---|---|---|---|
0 | ▶1 | <NA> | 2 | 20,100,915 | 20,100,915 | Y |
1 | 2 | <NA> | 1 | 20,100,916 | 20,100,916 | Y |
2 | 3 | <NA> | 1 | 20,100,916 | 20,100,916 | Y |
3 | 4 | <NA> | 1 | 20,100,916 | 20,100,916 | Y |
4 | 5 | <NA> | 1 | 20,100,916 | 20,100,916 | Y |
5 | 6 | <NA> | 1 | 20,100,916 | 20,100,916 | Y |
6 | 7 | <NA> | 1 | 20,100,916 | 20,100,916 | Y |
7 | 8 | <NA> | 1 | 20,100,916 | 20,100,916 | Y |
8 | 9 | <NA> | 1 | 20,100,916 | 20,100,916 | Y |
9 | 10 | <NA> | 1 | 20,100,916 | 20,100,916 | Y |
Index | No | REMARKS | EMP_TAG | ENT_DT | UPD_DT | USE_YN |
---|---|---|---|---|---|---|
980 | 981 | <NA> | 1 | 20,100,906 | 20,100,906 | Y |
981 | 982 | <NA> | 1 | 20,100,906 | 20,100,906 | Y |
982 | 983 | <NA> | 1 | 20,100,906 | 20,100,906 | Y |
983 | 984 | <NA> | 1 | 20,100,906 | 20,100,906 | Y |
984 | 985 | <NA> | 1 | 20,100,906 | 20,100,906 | Y |
985 | 986 | <NA> | 3 | 20,100,825 | 20,100,825 | Y |
986 | 987 | <NA> | 2 | 20,100,908 | 20,100,908 | Y |
987 | 988 | <NA> | 2 | 20,100,908 | 20,100,908 | Y |
988 | 989 | <NA> | 2 | 20,100,908 | 20,100,908 | Y |
989 | 990 | <NA> | 2 | 20,100,908 | 20,100,908 | Y |