Overview

Dataset statistics

Number of variables: 4
Number of observations: 64
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.1 KiB
Average record size in memory: 34.1 B

Variable types

Text: 1
DateTime: 1
Categorical: 2

Dataset

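Description: Daegu Metropolitan City, Medical R&D District tenant company status, 2021-03-24 (대구광역시_의료R&D지구 입주기업 현황_20210324)
Author: Daegu Metropolitan City (대구광역시)
URL: http://data.daegu.go.kr/open/data/dataView.do?dataSetId=15085792&dataSetDetailId=150857921cd41b6d3e273&provdMethod=FILE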

Alerts

Imbalance: 입주현황 (occupancy status) is highly imbalanced (55.1%)
Imbalance: 업종 (industry) is highly imbalanced (64.7%)
Unique: 업체명 (company name) has unique values
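
The imbalance percentages in the alerts can be reproduced from the category counts reported later in this document. The sketch below assumes the score is one minus the Shannon entropy of the class frequencies, normalised by the maximum entropy log2(k) for k classes; that assumption is consistent with both reported figures, but the helper name `imbalance_score` is hypothetical and not part of ydata-profiling.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for class counts (assumed formula).

    H is the Shannon entropy in bits of the class frequencies and
    k is the number of classes. 0 means balanced, 1 means one class only.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Counts taken from the Common Values tables of this report.
status_counts = [58, 6]          # 입주현황: 입주완료, 준비중
industry_counts = [56, 5, 2, 1]  # 업종: 의료기기, 제약,바이오, 기타, 화장품

print(f"{imbalance_score(status_counts):.1%}")    # 55.1%
print(f"{imbalance_score(industry_counts):.1%}")  # 64.7%
```

A score near 0% would indicate evenly spread categories; both columns here are dominated by a single value.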

Reproduction

Analysis started: 2023-12-10 19:14:33.788760
Analysis finished: 2023-12-10 19:14:35.413093
Duration: 1.62 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
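
As a quick consistency check, the reported duration can be recomputed from the two timestamps above. This is a minimal sketch using only the Python standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction panel.
started = datetime.fromisoformat("2023-12-10 19:14:33.788760")
finished = datetime.fromisoformat("2023-12-10 19:14:35.413093")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 1.62 seconds
```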

Variables

업체명 (company name)
Text

UNIQUE

Distinct: 64
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 644.0 B

Length

Max length: 12
Median length: 10
Mean length: 5.84375
Min length: 2

Characters and Unicode

Total characters: 374
Distinct characters: 143
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
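
To illustrate how such character properties are obtained, the sketch below counts Unicode general categories for a few company names taken from the Sample section, using Python's standard `unicodedata` module. The report prints long category names (for example "Other Letter"), while `unicodedata.category` returns the two-letter abbreviations (`Lo`, `So`, `Lu`, `Ps`, `Pe`). The four-name list is only a small illustration, not the full column.

```python
import unicodedata
from collections import Counter

# A handful of values from the 업체명 column (see the Sample section).
names = ["㈜케이엠에프", "프라나(주)", "㈜SLC", "㈜대원GSI"]

# Tally the general category of every character across the names.
category_counts = Counter(
    unicodedata.category(ch) for name in names for ch in name
)

for category, count in category_counts.most_common():
    print(category, count)
```

Hangul syllables fall under `Lo` (Other Letter), the enclosed symbol ㈜ under `So` (Other Symbol), and Latin capitals under `Lu` (Uppercase Letter).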

Unique

Unique: 64
Unique (%): 100.0%

Sample

1st row: ㈜케이엠에프
2nd row: ㈜토탈소프트뱅크
3rd row: ㈜파인메딕스
4th row: ㈜신라시스템
5th row: 씨에스텍

Value | Count | Frequency (%)
㈜케이엠에프 | 1 | 1.6%
㈜토탈소프트뱅크 | 1 | 1.6%
㈜튜링겐코리아 | 1 | 1.6%
벡트론 | 1 | 1.6%
유타스㈜ | 1 | 1.6%
㈜대일산업 | 1 | 1.6%
㈜올소테크 | 1 | 1.6%
㈜메가콤 | 1 | 1.6%
㈜제이에스테크윈 | 1 | 1.6%
㈜우성티오티 | 1 | 1.6%
Other values (54) | 54 | 84.4%

Most occurring characters

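Value | Count | Frequency (%)
㈜ | 55 | 14.7%
(glyph missing) | 20 | 5.3%
(glyph missing) | 19 | 5.1%
(glyph missing) | 12 | 3.2%
(glyph missing) | 11 | 2.9%
(glyph missing) | 10 | 2.7%
(glyph missing) | 7 | 1.9%
( | 6 | 1.6%
) | 6 | 1.6%
(glyph missing) | 6 | 1.6%
Other values (133) | 222 | 59.4%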

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 291 | 77.8%
Other Symbol | 55 | 14.7%
Uppercase Letter | 12 | 3.2%
Open Punctuation | 6 | 1.6%
Close Punctuation | 6 | 1.6%
Other Punctuation | 2 | 0.5%
Decimal Number | 1 | 0.3%
Dash Punctuation | 1 | 0.3%

Most frequent character per category

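Other Letter
Value | Count | Frequency (%)
(glyph missing) | 20 | 6.9%
(glyph missing) | 19 | 6.5%
(glyph missing) | 12 | 4.1%
(glyph missing) | 11 | 3.8%
(glyph missing) | 10 | 3.4%
(glyph missing) | 7 | 2.4%
(glyph missing) | 6 | 2.1%
(glyph missing) | 5 | 1.7%
(glyph missing) | 5 | 1.7%
(glyph missing) | 5 | 1.7%
Other values (118) | 191 | 65.6%

Uppercase Letter
Value | Count | Frequency (%)
S | 3 | 25.0%
I | 2 | 16.7%
C | 1 | 8.3%
L | 1 | 8.3%
H | 1 | 8.3%
T | 1 | 8.3%
O | 1 | 8.3%
E | 1 | 8.3%
G | 1 | 8.3%

Other Symbol
Value | Count | Frequency (%)
㈜ | 55 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 6 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 6 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 2 | 100.0%

Decimal Number
Value | Count | Frequency (%)
3 | 1 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%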

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 346 | 92.5%
Common | 16 | 4.3%
Latin | 12 | 3.2%

Most frequent character per script

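Hangul
Value | Count | Frequency (%)
㈜ | 55 | 15.9%
(glyph missing) | 20 | 5.8%
(glyph missing) | 19 | 5.5%
(glyph missing) | 12 | 3.5%
(glyph missing) | 11 | 3.2%
(glyph missing) | 10 | 2.9%
(glyph missing) | 7 | 2.0%
(glyph missing) | 6 | 1.7%
(glyph missing) | 5 | 1.4%
(glyph missing) | 5 | 1.4%
Other values (119) | 196 | 56.6%

Latin
Value | Count | Frequency (%)
S | 3 | 25.0%
I | 2 | 16.7%
C | 1 | 8.3%
L | 1 | 8.3%
H | 1 | 8.3%
T | 1 | 8.3%
O | 1 | 8.3%
E | 1 | 8.3%
G | 1 | 8.3%

Common
Value | Count | Frequency (%)
( | 6 | 37.5%
) | 6 | 37.5%
. | 2 | 12.5%
3 | 1 | 6.2%
- | 1 | 6.2%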

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 291 | 77.8%
None | 55 | 14.7%
ASCII | 28 | 7.5%

Most frequent character per block

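None
Value | Count | Frequency (%)
㈜ | 55 | 100.0%

Hangul
Value | Count | Frequency (%)
(glyph missing) | 20 | 6.9%
(glyph missing) | 19 | 6.5%
(glyph missing) | 12 | 4.1%
(glyph missing) | 11 | 3.8%
(glyph missing) | 10 | 3.4%
(glyph missing) | 7 | 2.4%
(glyph missing) | 6 | 2.1%
(glyph missing) | 5 | 1.7%
(glyph missing) | 5 | 1.7%
(glyph missing) | 5 | 1.7%
Other values (118) | 191 | 65.6%

ASCII
Value | Count | Frequency (%)
( | 6 | 21.4%
) | 6 | 21.4%
S | 3 | 10.7%
. | 2 | 7.1%
I | 2 | 7.1%
C | 1 | 3.6%
3 | 1 | 3.6%
L | 1 | 3.6%
H | 1 | 3.6%
- | 1 | 3.6%
Other values (4) | 4 | 14.3%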

입주계약일 (move-in contract date)
DateTime

Distinct: 49
Distinct (%): 76.6%
Missing: 0
Missing (%): 0.0%
Memory size: 644.0 B
Minimum: 2012-10-08 00:00:00
Maximum: 2018-07-25 00:00:00
Histogram with fixed size bins (bins=49)

입주현황 (occupancy status)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Memory size: 644.0 B

Value | Count
입주완료 (move-in complete) | 58
준비중 (in preparation) | 6

Length

Max length: 4
Median length: 4
Mean length: 3.90625
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 입주완료
2nd row: 입주완료
3rd row: 입주완료
4th row: 입주완료
5th row: 입주완료

Common Values

Value | Count | Frequency (%)
입주완료 | 58 | 90.6%
준비중 | 6 | 9.4%

Length

Histogram of lengths of the category

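업종 (industry)
Categorical

IMBALANCE

Distinct: 4
Distinct (%): 6.2%
Missing: 0
Missing (%): 0.0%
Memory size: 644.0 B

Value | Count
의료기기 (medical devices) | 56
제약,바이오 (pharmaceuticals, bio) | 5
기타 (other) | 2
화장품 (cosmetics) | 1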

Length

Max length: 6
Median length: 4
Mean length: 4.078125
Min length: 2
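
The mean lengths reported for the two categorical columns follow directly from their value counts: each value's character length is weighted by how often it occurs. The sketch below is a minimal check under that reading; `mean_length` is a hypothetical helper, and lengths are counted in Unicode code points, as Python's `len` does.

```python
def mean_length(value_counts):
    """Count-weighted mean string length over a {value: count} mapping."""
    total_chars = sum(len(value) * count for value, count in value_counts.items())
    total_rows = sum(value_counts.values())
    return total_chars / total_rows


# Counts copied from the Common Values tables of this report.
status = {"입주완료": 58, "준비중": 6}
industry = {"의료기기": 56, "제약,바이오": 5, "기타": 2, "화장품": 1}

print(mean_length(status))    # 3.90625
print(mean_length(industry))  # 4.078125
```

The comma in 제약,바이오 counts as a character, which is why that value has length 6 and sets the maximum for 업종.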

Unique

Unique: 1
Unique (%): 1.6%

Sample

1st row: 제약,바이오
2nd row: 의료기기
3rd row: 의료기기
4th row: 의료기기
5th row: 의료기기

Common Values

Value | Count | Frequency (%)
의료기기 | 56 | 87.5%
제약,바이오 | 5 | 7.8%
기타 | 2 | 3.1%
화장품 | 1 | 1.6%

Length

Histogram of lengths of the category

Correlations

Variable | 업체명 | 입주계약일 | 입주현황 | 업종
업체명 | 1.000 | 1.000 | 1.000 | 1.000
입주계약일 | 1.000 | 1.000 | 0.891 | 0.000
입주현황 | 1.000 | 0.891 | 1.000 | 0.513
업종 | 1.000 | 0.000 | 0.513 | 1.000

Variable | 입주현황 | 업종
입주현황 | 1.000 | 0.342
업종 | 0.342 | 1.000

Variable | 입주현황 | 업종
입주현황 | 1.000 | 0.342
업종 | 0.342 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

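First rows

Index | 업체명 | 입주계약일 | 입주현황 | 업종
0 | ㈜케이엠에프 | 2012-10-08 | 입주완료 | 제약,바이오
1 | ㈜토탈소프트뱅크 | 2012-10-08 | 입주완료 | 의료기기
2 | ㈜파인메딕스 | 2012-10-09 | 입주완료 | 의료기기
3 | ㈜신라시스템 | 2012-11-14 | 입주완료 | 의료기기
4 | 씨에스텍 | 2012-11-15 | 입주완료 | 의료기기
5 | ㈜덴스타 | 2013-02-21 | 입주완료 | 의료기기
6 | ㈜덴티스 | 2013-10-29 | 입주완료 | 의료기기
7 | 프라나(주) | 2013-10-30 | 입주완료 | 의료기기
8 | ㈜레이월드 | 2013-10-31 | 입주완료 | 의료기기
9 | ㈜한국센서 | 2013-10-31 | 입주완료 | 의료기기

Last rows

Index | 업체명 | 입주계약일 | 입주현황 | 업종
54 | 에비던스임플란트 | 2013-01-23 | 입주완료 | 의료기기
55 | ㈜젬텍 | 2012-11-14 | 입주완료 | 의료기기
56 | ㈜세양 | 2014-06-27 | 준비중 | 의료기기
57 | ㈜덕산코트랜 | 2015-06-29 | 준비중 | 의료기기
58 | 무한컴퍼니 | 2017-12-20 | 입주완료 | 기타
59 | ㈜SLC | 2018-07-25 | 준비중 | 화장품
60 | ㈜제이디자인윅스 | 2014-06-09 | 입주완료 | 의료기기
61 | ㈜대원GSI | 2014-07-18 | 입주완료 | 제약,바이오
62 | 씨아이에스㈜ | 2017-07-24 | 입주완료 | 의료기기
63 | 명성 | 2017-12-20 | 입주완료 | 의료기기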