Overview

Dataset statistics

Number of variables: 10
Number of observations: 85
Missing cells: 412
Missing cells (%): 48.5%
Duplicate rows: 1
Duplicate rows (%): 1.2%
Total size in memory: 7.1 KiB
Average record size in memory: 85.6 B
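
The percentages above follow from the raw counts. A minimal sketch of the arithmetic, assuming each percentage is the count divided by the relevant total (all cells for missing values, all rows for duplicates); the variable names are illustrative and not part of the report:

```python
# Counts taken from the dataset statistics above.
n_variables = 10
n_observations = 85
missing_cells = 412
duplicate_rows = 1

# Assumption: missing cells are measured against every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: duplicate rows are measured against the number of rows.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```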

Variable types

Categorical: 5
Text: 1
Unsupported: 4

Dataset

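Description: Information on the databases of the waste barcode system (쓰레기바코드시스템) of Gimhae-si, Gyeongsangnam-do that can be opened to the public, consisting of data such as the operating department, information system name, DB name, and English table name.
Author: Gimhae-si, Gyeongsangnam-do (경상남도 김해시)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15063876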

Alerts

[Duplicates] Dataset has 1 (1.2%) duplicate rows
[High correlation] 정보시스템명 is highly overall correlated with 운영부서 and 3 other fields
[High correlation] 영문 테이블명 is highly overall correlated with 운영부서 and 2 other fields
[High correlation] 한글 테이블명 is highly overall correlated with 운영부서 and 2 other fields
[High correlation] DB명 is highly overall correlated with 운영부서 and 3 other fields
[High correlation] 운영부서 is highly overall correlated with 정보시스템명 and 3 other fields
[Imbalance] 영문 테이블명 is highly imbalanced (57.3%)
[Imbalance] 한글 테이블명 is highly imbalanced (57.3%)
[Missing] 한글 컬럼명 has 72 (84.7%) missing values
[Missing] Unnamed: 6 has 85 (100.0%) missing values
[Missing] Unnamed: 7 has 85 (100.0%) missing values
[Missing] Unnamed: 8 has 85 (100.0%) missing values
[Missing] Unnamed: 9 has 85 (100.0%) missing values
[Unsupported] Unnamed: 6 is an unsupported type, check if it needs cleaning or further analysis
[Unsupported] Unnamed: 7 is an unsupported type, check if it needs cleaning or further analysis
[Unsupported] Unnamed: 8 is an unsupported type, check if it needs cleaning or further analysis
[Unsupported] Unnamed: 9 is an unsupported type, check if it needs cleaning or further analysis
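
The alerts point to two likely cleanup steps before further analysis: dropping the columns that are entirely empty and dropping rows that carry no values at all. The sketch below shows the idea in plain Python on a small stand-in table. The three rows are hypothetical, `None` stands for a missing cell, and the helper names are illustrative. With pandas, a similar effect is usually obtained with `DataFrame.dropna(axis=1, how="all")` followed by `DataFrame.dropna(how="all")`.

```python
def drop_empty_columns(rows):
    """Keep only the keys that hold a non-missing value in at least one row."""
    keep = [key for key in rows[0] if any(row[key] is not None for row in rows)]
    return [{key: row[key] for key in keep} for row in rows]


def drop_empty_rows(rows):
    """Keep only the rows that hold at least one non-missing value."""
    return [row for row in rows if any(value is not None for value in row.values())]


# Hypothetical stand-in for the profiled table (None = missing cell).
rows = [
    {"운영부서": "청소과", "영문 테이블명": "GBTT011", "Unnamed: 6": None},
    {"운영부서": "청소과", "영문 테이블명": "GBMT110", "Unnamed: 6": None},
    {"운영부서": None, "영문 테이블명": None, "Unnamed: 6": None},
]

cleaned = drop_empty_rows(drop_empty_columns(rows))
print(cleaned)
```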

Reproduction

Analysis started: 2023-12-11 00:46:53.959808
Analysis finished: 2023-12-11 00:46:54.484467
Duration: 0.52 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

운영부서
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Memory size: 812.0 B
<NA>: 72
청소과: 13

Length

Max length: 4
Median length: 4
Mean length: 3.8470588
Min length: 3
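
The mean length of 3.8470588 is consistent with the `<NA>` marker being counted as a four-character string alongside the three-character value 청소과. A quick check of that arithmetic, using the 72 and 13 counts listed above (the variable names are illustrative):

```python
# Category counts as listed for this variable.
counts = {"<NA>": 72, "청소과": 13}

# Weighted mean of the string lengths: (72 * 4 + 13 * 3) / 85.
total_chars = sum(len(value) * n for value, n in counts.items())
mean_length = total_chars / sum(counts.values())

print(round(mean_length, 7))
```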

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 청소과
2nd row: 청소과
3rd row: 청소과
4th row: 청소과
5th row: 청소과

Common Values

Value | Count | Frequency (%)
<NA> | 72 | 84.7%
청소과 | 13 | 15.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 72 | 84.7%
청소과 | 13 | 15.3%

정보시스템명
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Memory size: 812.0 B
<NA>: 72
쓰레기바코드시스템: 13

Length

Max length: 9
Median length: 4
Mean length: 4.7647059
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 쓰레기바코드시스템
2nd row: 쓰레기바코드시스템
3rd row: 쓰레기바코드시스템
4th row: 쓰레기바코드시스템
5th row: 쓰레기바코드시스템

Common Values

Value | Count | Frequency (%)
<NA> | 72 | 84.7%
쓰레기바코드시스템 | 13 | 15.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 72 | 84.7%
쓰레기바코드시스템 | 13 | 15.3%

DB명
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Memory size: 812.0 B
<NA>: 72
SMTSERVER: 13

Length

Max length: 9
Median length: 4
Mean length: 4.7647059
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SMTSERVER
2nd row: SMTSERVER
3rd row: SMTSERVER
4th row: SMTSERVER
5th row: SMTSERVER

Common Values

Value | Count | Frequency (%)
<NA> | 72 | 84.7%
SMTSERVER | 13 | 15.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 72 | 84.7%
smtserver | 13 | 15.3%

영문 테이블명
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Memory size: 812.0 B
<NA>: 72
GBMT110: 12
GBTT011: 1

Length

Max length: 7
Median length: 4
Mean length: 4.4588235
Min length: 4

Unique

Unique: 1
Unique (%): 1.2%

Sample

1st row: GBTT011
2nd row: GBMT110
3rd row: GBMT110
4th row: GBMT110
5th row: GBMT110

Common Values

Value | Count | Frequency (%)
<NA> | 72 | 84.7%
GBMT110 | 12 | 14.1%
GBTT011 | 1 | 1.2%
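
The 57.3% imbalance reported for this variable can be reproduced from the counts 72, 12, and 1 if the score is taken to be one minus the Shannon entropy divided by its maximum, log2 of the number of classes. That definition is an assumption here, since the report does not state the formula; `imbalance_score` is an illustrative helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy of the class
    frequencies and k is the number of classes (assumed definition)."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    max_entropy = math.log2(len(counts))
    # A single-class variable has no entropy to normalize; treat it as 0.
    return 1 - entropy / max_entropy if max_entropy > 0 else 0.0


# Category counts for <NA>, GBMT110, GBTT011.
score = imbalance_score([72, 12, 1])
print(f"{score:.1%}")
```

A score of 0 would mean perfectly even classes, and a score near 1 would mean almost all rows fall in one class.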

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 72 | 84.7%
gbmt110 | 12 | 14.1%
gbtt011 | 1 | 1.2%

한글 테이블명
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Memory size: 812.0 B
<NA>: 72
포장단위: 12
발주 상세 내역: 1

Length

Max length: 8
Median length: 4
Mean length: 4.0470588
Min length: 4

Unique

Unique: 1
Unique (%): 1.2%

Sample

1st row: 발주 상세 내역
2nd row: 포장단위
3rd row: 포장단위
4th row: 포장단위
5th row: 포장단위

Common Values

Value | Count | Frequency (%)
<NA> | 72 | 84.7%
포장단위 | 12 | 14.1%
발주 상세 내역 | 1 | 1.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 72 | 82.8%
포장단위 | 12 | 13.8%
발주 | 1 | 1.1%
상세 | 1 | 1.1%
내역 | 1 | 1.1%

한글 컬럼명
Text

MISSING 

Distinct: 13
Distinct (%): 100.0%
Missing: 72
Missing (%): 84.7%
Memory size: 812.0 B

Length

Max length: 5
Median length: 4
Mean length: 3.5384615
Min length: 1

Characters and Unicode

Total characters: 46
Distinct characters: 31
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
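
As a small illustration of those character properties, the sketch below tallies the Unicode general category of every character in the five sample values of this variable, using Python's standard `unicodedata` module. It covers only those five samples, so its totals are smaller than the report's 46 characters. `Lo` is "Other Letter" and `Po` is "Other Punctuation".

```python
import unicodedata
from collections import Counter

# The five sample values shown for this variable.
samples = ["구.군코드", "지정코드", "봉투구분", "봉투재질", "봉투용량"]

# Normalize to NFC so each Hangul syllable is a single code point,
# then count the general category of every character.
category_counts = Counter(
    unicodedata.category(ch)
    for name in samples
    for ch in unicodedata.normalize("NFC", name)
)

print(category_counts)
```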

Unique

Unique: 13
Unique (%): 100.0%

Sample

1st row: 구.군코드
2nd row: 지정코드
3rd row: 봉투구분
4th row: 봉투재질
5th row: 봉투용량

Value | Count | Frequency (%)
구.군코드 | 1 | 7.7%
지정코드 | 1 | 7.7%
봉투구분 | 1 | 7.7%
봉투재질 | 1 | 7.7%
봉투용량 | 1 | 7.7%
박스 | 1 | 7.7%
(value not rendered) | 1 | 7.7%
낱장구분 | 1 | 7.7%
만료기간 | 1 | 7.7%
수량 | 1 | 7.7%
Other values (3) | 3 | 23.1%

Most occurring characters

Value | Count | Frequency (%)
(not rendered) | 4 | 8.7%
(not rendered) | 4 | 8.7%
(not rendered) | 3 | 6.5%
(not rendered) | 3 | 6.5%
(not rendered) | 3 | 6.5%
(not rendered) | 2 | 4.3%
(not rendered) | 2 | 4.3%
(not rendered) | 2 | 4.3%
(not rendered) | 1 | 2.2%
(not rendered) | 1 | 2.2%
Other values (21) | 21 | 45.7%

Most occurring categories

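Value | Count | Frequency (%)
Other Letter | 45 | 97.8%
Other Punctuation | 1 | 2.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not rendered) | 4 | 8.9%
(not rendered) | 4 | 8.9%
(not rendered) | 3 | 6.7%
(not rendered) | 3 | 6.7%
(not rendered) | 3 | 6.7%
(not rendered) | 2 | 4.4%
(not rendered) | 2 | 4.4%
(not rendered) | 2 | 4.4%
(not rendered) | 1 | 2.2%
(not rendered) | 1 | 2.2%
Other values (20) | 20 | 44.4%

Other Punctuation
Value | Count | Frequency (%)
. | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 45 | 97.8%
Common | 1 | 2.2%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not rendered) | 4 | 8.9%
(not rendered) | 4 | 8.9%
(not rendered) | 3 | 6.7%
(not rendered) | 3 | 6.7%
(not rendered) | 3 | 6.7%
(not rendered) | 2 | 4.4%
(not rendered) | 2 | 4.4%
(not rendered) | 2 | 4.4%
(not rendered) | 1 | 2.2%
(not rendered) | 1 | 2.2%
Other values (20) | 20 | 44.4%

Common
Value | Count | Frequency (%)
. | 1 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 45 | 97.8%
ASCII | 1 | 2.2%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not rendered) | 4 | 8.9%
(not rendered) | 4 | 8.9%
(not rendered) | 3 | 6.7%
(not rendered) | 3 | 6.7%
(not rendered) | 3 | 6.7%
(not rendered) | 2 | 4.4%
(not rendered) | 2 | 4.4%
(not rendered) | 2 | 4.4%
(not rendered) | 1 | 2.2%
(not rendered) | 1 | 2.2%
Other values (20) | 20 | 44.4%

ASCII
Value | Count | Frequency (%)
. | 1 | 100.0%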

Unnamed: 6
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 85
Missing (%): 100.0%
Memory size: 897.0 B

Unnamed: 7
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 85
Missing (%): 100.0%
Memory size: 897.0 B

Unnamed: 8
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 85
Missing (%): 100.0%
Memory size: 897.0 B

Unnamed: 9
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 85
Missing (%): 100.0%
Memory size: 897.0 B

Correlations

 | 영문 테이블명 | 한글 테이블명 | 한글 컬럼명
영문 테이블명 | 1.000 | 0.562 | 1.000
한글 테이블명 | 0.562 | 1.000 | 1.000
한글 컬럼명 | 1.000 | 1.000 | 1.000

 | 정보시스템명 | 영문 테이블명 | 한글 테이블명 | DB명 | 운영부서
정보시스템명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
영문 테이블명 | 1.000 | 1.000 | 0.372 | 1.000 | 1.000
한글 테이블명 | 1.000 | 0.372 | 1.000 | 1.000 | 1.000
DB명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
운영부서 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

 | 운영부서 | 정보시스템명 | DB명 | 영문 테이블명 | 한글 테이블명
운영부서 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
정보시스템명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
DB명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
영문 테이블명 | 1.000 | 1.000 | 1.000 | 1.000 | 0.372
한글 테이블명 | 1.000 | 1.000 | 1.000 | 0.372 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

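 | 운영부서 | 정보시스템명 | DB명 | 영문 테이블명 | 한글 테이블명 | 한글 컬럼명 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9
0 | 청소과 | 쓰레기바코드시스템 | SMTSERVER | GBTT011 | 발주 상세 내역 | 구.군코드 | <NA> | <NA> | <NA> | <NA>
1 | 청소과 | 쓰레기바코드시스템 | SMTSERVER | GBMT110 | 포장단위 | 지정코드 | <NA> | <NA> | <NA> | <NA>
2 | 청소과 | 쓰레기바코드시스템 | SMTSERVER | GBMT110 | 포장단위 | 봉투구분 | <NA> | <NA> | <NA> | <NA>
3 | 청소과 | 쓰레기바코드시스템 | SMTSERVER | GBMT110 | 포장단위 | 봉투재질 | <NA> | <NA> | <NA> | <NA>
4 | 청소과 | 쓰레기바코드시스템 | SMTSERVER | GBMT110 | 포장단위 | 봉투용량 | <NA> | <NA> | <NA> | <NA>
5 | 청소과 | 쓰레기바코드시스템 | SMTSERVER | GBMT110 | 포장단위 | 박스 | <NA> | <NA> | <NA> | <NA>
6 | 청소과 | 쓰레기바코드시스템 | SMTSERVER | GBMT110 | 포장단위 | (value not rendered) | <NA> | <NA> | <NA> | <NA>
7 | 청소과 | 쓰레기바코드시스템 | SMTSERVER | GBMT110 | 포장단위 | 낱장구분 | <NA> | <NA> | <NA> | <NA>
8 | 청소과 | 쓰레기바코드시스템 | SMTSERVER | GBMT110 | 포장단위 | 만료기간 | <NA> | <NA> | <NA> | <NA>
9 | 청소과 | 쓰레기바코드시스템 | SMTSERVER | GBMT110 | 포장단위 | 수량 | <NA> | <NA> | <NA> | <NA>

 | 운영부서 | 정보시스템명 | DB명 | 영문 테이블명 | 한글 테이블명 | 한글 컬럼명 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9
75 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
76 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
77 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
78 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
79 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
80 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
81 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
82 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
83 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
84 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>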

Duplicate rows

Most frequently occurring

 | 운영부서 | 정보시스템명 | DB명 | 영문 테이블명 | 한글 테이블명 | 한글 컬럼명 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 72
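
The table lists a single duplicated row pattern (all six supported columns `<NA>`) that occurs 72 times. This matches the "Duplicate rows: 1" figure in the overview if that figure counts distinct duplicated patterns rather than surplus rows. Under that assumption, the count can be sketched with `collections.Counter`. The 13 populated rows below are placeholders that differ only in their last field (`col0` through `col12` are hypothetical values, not the real column names).

```python
from collections import Counter

NA = "<NA>"
empty_row = (NA,) * 6

# 13 populated rows (placeholder last field) plus 72 entirely empty rows.
populated = [
    ("청소과", "쓰레기바코드시스템", "SMTSERVER", "GBMT110", "포장단위", f"col{i}")
    for i in range(13)
]
rows = populated + [empty_row] * 72

# Count how often each distinct row occurs and keep those seen more than once.
pattern_counts = Counter(rows)
duplicated = {row: n for row, n in pattern_counts.items() if n > 1}

print(len(duplicated), duplicated[empty_row])
```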