Overview

Dataset statistics

Number of variables: 9
Number of observations: 44
Missing cells: 44
Missing cells (%): 11.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.4 KiB
Average record size in memory: 78.0 B
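
As a quick sanity check on the figures above, the missing-cell percentage can be recomputed from the variable and observation counts. This is only a sketch of the arithmetic; the variable names are illustrative and not part of the report.

```python
# Figures taken from the dataset statistics above.
n_variables = 9
n_observations = 44
missing_cells = 44

# Every variable has one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

The 44 missing cells equal the number of observations, which is consistent with a single column (담당본부) being entirely empty.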

Variable types

Categorical: 4
Unsupported: 1
Text: 3
DateTime: 1

Alerts

업체수 has constant value "" (Constant)
일평균 사용량 has constant value "" (Constant)
광역구분 is highly overall correlated with 공단구분 (High correlation)
공단구분 is highly overall correlated with 광역구분 (High correlation)
광역구분 is highly imbalanced (84.4%) (Imbalance)
담당본부 has 44 (100.0%) missing values (Missing)
공단명 has unique values (Unique)
단지코드 has unique values (Unique)
단지주소 has unique values (Unique)
담당본부 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
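
The constant and missing alerts can be approximated with a few lines of plain Python. The sketch below is not the profiler's own logic; `flag_columns` is a hypothetical helper, and the column contents are rebuilt from the counts reported in this document, with `None` standing in for missing values.

```python
def flag_columns(columns):
    """Return (constant, all_missing) column names for a dict of name -> values."""
    constant, all_missing = [], []
    for name, values in columns.items():
        present = [v for v in values if v is not None]
        if not present:
            all_missing.append(name)      # no non-missing value at all
        elif len(set(present)) == 1:
            constant.append(name)         # exactly one distinct value
    return constant, all_missing


# Columns rebuilt from the reported counts (44 observations each).
columns = {
    "업체수": ["0"] * 44,
    "일평균 사용량": ["0"] * 44,
    "담당본부": [None] * 44,
    "공단구분": ["일반산업단지"] * 33 + ["도시첨단산업단지"] * 6
               + ["국가산업단지"] * 4 + ["농공단지"],
}

constant, all_missing = flag_columns(columns)
print("constant:", constant)
print("all missing:", all_missing)
```

Columns flagged this way carry no information for modelling and are usually candidates for removal before further analysis.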

Reproduction

Analysis started: 2024-04-17 01:23:03.984758
Analysis finished: 2024-04-17 01:23:05.762395
Duration: 1.78 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
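
A report of this kind can be regenerated roughly as follows. This is a minimal sketch, assuming the data is available as a CSV file; the file paths, the report title, and the `build_report` helper are hypothetical, and the original run may have used different options (see config.json).

```python
def build_report(csv_path, output_path="report.html"):
    """Profile a CSV file and write an HTML report (paths are placeholders)."""
    # Imported lazily so this module loads even without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Profiling Report")
    profile.to_file(output_path)
    return profile
```

Calling `build_report("data.csv")` would write `report.html` next to the script. How pandas infers column types affects which variables the profiler treats as Categorical, Text, or DateTime.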

Variables

광역구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Memory size: 484.0 B
미가동: 43
<NA>: 1

Length

Max length: 4
Median length: 3
Mean length: 3.0227273
Min length: 3
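
The length statistics follow directly from the value counts if the `<NA>` entry is counted as a literal four-character string. That is an assumption, though it is consistent with the reported maximum length of 4. A short check:

```python
from statistics import mean, median

# 43 rows of "미가동" (3 characters) and one "<NA>" (4 characters, assumed literal).
values = ["미가동"] * 43 + ["<NA>"]
lengths = [len(v) for v in values]

max_len = max(lengths)
median_len = median(lengths)
mean_len = round(mean(lengths), 7)   # (43 * 3 + 4) / 44
min_len = min(lengths)

print(max_len, median_len, mean_len, min_len)
```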

Unique

Unique: 1
Unique (%): 2.3%

Sample

1st row: <NA>
2nd row: 미가동
3rd row: 미가동
4th row: 미가동
5th row: 미가동

Common Values

Value | Count | Frequency (%)
미가동 | 43 | 97.7%
<NA> | 1 | 2.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
미가동 | 43 | 97.7%
na | 1 | 2.3%
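
The 84.4% imbalance figure reported for 광역구분 is consistent with one minus the normalized Shannon entropy of the class counts. The sketch below assumes that definition; `imbalance_score` is an illustrative helper, not a call into the profiling library.

```python
import math


def imbalance_score(counts):
    """1 - normalized Shannon entropy: 0 is perfectly balanced, near 1 is dominated by one class."""
    n_classes = len(counts)
    if n_classes <= 1:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)


# Class counts from the Common Values table: 43 x 미가동, 1 x <NA>.
score = imbalance_score([43, 1])
print(f"{score:.1%}")
```

With a 43-to-1 split the entropy is small relative to the one-bit maximum for two classes, so the score lands close to 1.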

담당본부
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing44
Missing (%)100.0%
Memory size528.0 B

공단구분
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Memory size: 484.0 B
일반산업단지: 33
도시첨단산업단지: 6
국가산업단지: 4
농공단지: 1

Length

Max length: 8
Median length: 6
Mean length: 6.2272727
Min length: 4

Unique

Unique: 1
Unique (%): 2.3%

Sample

1st row: 국가산업단지
2nd row: 도시첨단산업단지
3rd row: 일반산업단지
4th row: 일반산업단지
5th row: 일반산업단지

Common Values

Value | Count | Frequency (%)
일반산업단지 | 33 | 75.0%
도시첨단산업단지 | 6 | 13.6%
국가산업단지 | 4 | 9.1%
농공단지 | 1 | 2.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
일반산업단지 | 33 | 75.0%
도시첨단산업단지 | 6 | 13.6%
국가산업단지 | 4 | 9.1%
농공단지 | 1 | 2.3%

공단명
Text

UNIQUE 

Distinct: 44
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 484.0 B

Length

Max length: 17
Median length: 14.5
Mean length: 10.386364
Min length: 6

Characters and Unicode

Total characters: 457
Distinct characters: 123
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
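
As an illustration of the character properties mentioned above, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for one value from the 공단명 sample; it is an illustration only and does not claim to be how the report computes its tables.

```python
import unicodedata
from collections import Counter

# Second sample row of 공단명: Hangul syllables plus three Latin capitals.
name = "용인기흥ICT밸리도시첨단산업단지"

# "Lo" = Other Letter (Hangul syllables), "Lu" = Uppercase Letter.
categories = Counter(unicodedata.category(ch) for ch in name)

print(len(name), dict(categories))
```

This value is 17 characters long, matching the maximum length reported for the variable.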

Unique

Unique: 44
Unique (%): 100.0%

Sample

1st row: 경남항공국가산업단지
2nd row: 용인기흥ICT밸리도시첨단산업단지
3rd row: 동방일반산업단지
4th row: 지사글로벌일반산업단지
5th row: 성안산업단지

Value | Count | Frequency (%)
경남항공국가산업단지 | 1 | 2.2%
영진바이오일반산업단지 | 1 | 2.2%
오창테크노폴리스일반산업단지 | 1 | 2.2%
학운4-1일반산업단지 | 1 | 2.2%
에코그린일반산업단지 | 1 | 2.2%
검단일반산업단지 | 1 | 2.2%
연천bix(은통일반산업단지 | 1 | 2.2%
무촌일반산업단지 | 1 | 2.2%
국사일반산업단지 | 1 | 2.2%
화석산업단지 | 1 | 2.2%
Other values (35) | 35 | 77.8%

Most occurring characters

ValueCountFrequency (%)
50
 
10.9%
47
 
10.3%
43
 
9.4%
40
 
8.8%
30
 
6.6%
29
 
6.3%
7
 
1.5%
6
 
1.3%
6
 
1.3%
6
 
1.3%
Other values (113) 193
42.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 421 | 92.1%
Uppercase Letter | 12 | 2.6%
Decimal Number | 7 | 1.5%
Lowercase Letter | 6 | 1.3%
Close Punctuation | 4 | 0.9%
Open Punctuation | 4 | 0.9%
Dash Punctuation | 2 | 0.4%
Space Separator | 1 | 0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
50
 
11.9%
47
 
11.2%
43
 
10.2%
40
 
9.5%
30
 
7.1%
29
 
6.9%
7
 
1.7%
6
 
1.4%
6
 
1.4%
6
 
1.4%
Other values (90) 157
37.3%
Uppercase Letter
ValueCountFrequency (%)
I 3
25.0%
S 2
16.7%
B 1
 
8.3%
X 1
 
8.3%
F 1
 
8.3%
D 1
 
8.3%
C 1
 
8.3%
T 1
 
8.3%
P 1
 
8.3%
Lowercase Letter
ValueCountFrequency (%)
o 2
33.3%
d 1
16.7%
a 1
16.7%
r 1
16.7%
k 1
16.7%
Decimal Number
ValueCountFrequency (%)
1 2
28.6%
2 2
28.6%
4 1
14.3%
6 1
14.3%
3 1
14.3%
Close Punctuation
ValueCountFrequency (%)
) 4
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 421 | 92.1%
Common | 18 | 3.9%
Latin | 18 | 3.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
50
 
11.9%
47
 
11.2%
43
 
10.2%
40
 
9.5%
30
 
7.1%
29
 
6.9%
7
 
1.7%
6
 
1.4%
6
 
1.4%
6
 
1.4%
Other values (90) 157
37.3%
Latin
ValueCountFrequency (%)
I 3
16.7%
o 2
11.1%
S 2
11.1%
B 1
 
5.6%
X 1
 
5.6%
d 1
 
5.6%
F 1
 
5.6%
D 1
 
5.6%
C 1
 
5.6%
T 1
 
5.6%
Other values (4) 4
22.2%
Common
ValueCountFrequency (%)
) 4
22.2%
( 4
22.2%
1 2
11.1%
- 2
11.1%
2 2
11.1%
4 1
 
5.6%
6 1
 
5.6%
3 1
 
5.6%
1
 
5.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 421 | 92.1%
ASCII | 36 | 7.9%

Most frequent character per block

Hangul
ValueCountFrequency (%)
50
 
11.9%
47
 
11.2%
43
 
10.2%
40
 
9.5%
30
 
7.1%
29
 
6.9%
7
 
1.7%
6
 
1.4%
6
 
1.4%
6
 
1.4%
Other values (90) 157
37.3%
ASCII
ValueCountFrequency (%)
) 4
 
11.1%
( 4
 
11.1%
I 3
 
8.3%
o 2
 
5.6%
1 2
 
5.6%
- 2
 
5.6%
2 2
 
5.6%
S 2
 
5.6%
4 1
 
2.8%
6 1
 
2.8%
Other values (13) 13
36.1%

단지코드
Text

UNIQUE 

Distinct: 44
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 484.0 B

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 264
Distinct characters: 22
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 44
Unique (%): 100.0%

Sample

1st row: 148080
2nd row: 341060
3rd row: 241BO0
4th row: 226300
5th row: 243800

Value | Count | Frequency (%)
148080 | 1 | 2.3%
341060 | 1 | 2.3%
243840 | 1 | 2.3%
241bt0 | 1 | 2.3%
241bv0 | 1 | 2.3%
247710 | 1 | 2.3%
241bw0 | 1 | 2.3%
241bu0 | 1 | 2.3%
243820 | 1 | 2.3%
243830 | 1 | 2.3%
Other values (34) | 34 | 77.3%

Most occurring characters

Value | Count | Frequency (%)
0 | 56 | 21.2%
2 | 47 | 17.8%
4 | 40 | 15.2%
1 | 26 | 9.8%
8 | 26 | 9.8%
3 | 17 | 6.4%
B | 10 | 3.8%
6 | 7 | 2.7%
7 | 7 | 2.7%
A | 7 | 2.7%
Other values (12) | 21 | 8.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 230 | 87.1%
Uppercase Letter | 34 | 12.9%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
B 10
29.4%
A 7
20.6%
T 2
 
5.9%
W 2
 
5.9%
V 2
 
5.9%
U 2
 
5.9%
R 2
 
5.9%
S 2
 
5.9%
X 2
 
5.9%
P 1
 
2.9%
Other values (2) 2
 
5.9%
Decimal Number
ValueCountFrequency (%)
0 56
24.3%
2 47
20.4%
4 40
17.4%
1 26
11.3%
8 26
11.3%
3 17
 
7.4%
6 7
 
3.0%
7 7
 
3.0%
9 2
 
0.9%
5 2
 
0.9%

Most occurring scripts

Value | Count | Frequency (%)
Common | 230 | 87.1%
Latin | 34 | 12.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
B 10
29.4%
A 7
20.6%
T 2
 
5.9%
W 2
 
5.9%
V 2
 
5.9%
U 2
 
5.9%
R 2
 
5.9%
S 2
 
5.9%
X 2
 
5.9%
P 1
 
2.9%
Other values (2) 2
 
5.9%
Common
ValueCountFrequency (%)
0 56
24.3%
2 47
20.4%
4 40
17.4%
1 26
11.3%
8 26
11.3%
3 17
 
7.4%
6 7
 
3.0%
7 7
 
3.0%
9 2
 
0.9%
5 2
 
0.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 264 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 56
21.2%
2 47
17.8%
4 40
15.2%
1 26
9.8%
8 26
9.8%
3 17
 
6.4%
B 10
 
3.8%
6 7
 
2.7%
7 7
 
2.7%
A 7
 
2.7%
Other values (12) 21
 
8.0%

단지주소
Text

UNIQUE 

Distinct: 44
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 484.0 B

Length

Max length: 54
Median length: 35
Mean length: 26.318182
Min length: 18

Characters and Unicode

Total characters: 1158
Distinct characters: 138
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 44
Unique (%): 100.0%

Sample

1st row: 경상남도 진주시 정촌면 예하리. 대축리. 화개리 일원. 사천시 용현면 선진리.신촌리. 통양리 일원
2nd row: 경기도 용인시 기흥구 구갈동 259-1번지 일원
3rd row: 경기도 화성시 팔탄면 덕우리 산66-2번지 일원
4th row: 부산광역시 강서구 지사동 산137번지 일원
5th row: 가. 충청북도 음성군 금왕읍 봉곡리 산36-77번지 일원

Value | Count | Frequency (%)
일원 | 44 | 16.2%
경상남도 | 12 | 4.4%
경기도 | 12 | 4.4%
충청북도 | 5 | 1.8%
인천광역시 | 3 | 1.1%
지사동 | 2 | 0.7%
음성군 | 2 | 0.7%
금왕읍 | 2 | 0.7%
봉곡리 | 2 | 0.7%
안성시 | 2 | 0.7%
Other values (159) | 185 | 68.3%

Most occurring characters

ValueCountFrequency (%)
227
 
19.6%
49
 
4.2%
48
 
4.1%
44
 
3.8%
38
 
3.3%
36
 
3.1%
34
 
2.9%
30
 
2.6%
1 27
 
2.3%
27
 
2.3%
Other values (128) 598
51.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 782 | 67.5%
Space Separator | 227 | 19.6%
Decimal Number | 112 | 9.7%
Other Punctuation | 22 | 1.9%
Dash Punctuation | 15 | 1.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
49
 
6.3%
48
 
6.1%
44
 
5.6%
38
 
4.9%
36
 
4.6%
34
 
4.3%
30
 
3.8%
27
 
3.5%
25
 
3.2%
25
 
3.2%
Other values (115) 426
54.5%
Decimal Number
ValueCountFrequency (%)
1 27
24.1%
2 16
14.3%
7 15
13.4%
4 11
9.8%
6 9
 
8.0%
5 9
 
8.0%
3 9
 
8.0%
0 7
 
6.2%
8 5
 
4.5%
9 4
 
3.6%
Space Separator
ValueCountFrequency (%)
227
100.0%
Other Punctuation
ValueCountFrequency (%)
. 22
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 15
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 782 | 67.5%
Common | 376 | 32.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
49
 
6.3%
48
 
6.1%
44
 
5.6%
38
 
4.9%
36
 
4.6%
34
 
4.3%
30
 
3.8%
27
 
3.5%
25
 
3.2%
25
 
3.2%
Other values (115) 426
54.5%
Common
ValueCountFrequency (%)
227
60.4%
1 27
 
7.2%
. 22
 
5.9%
2 16
 
4.3%
- 15
 
4.0%
7 15
 
4.0%
4 11
 
2.9%
6 9
 
2.4%
5 9
 
2.4%
3 9
 
2.4%
Other values (3) 16
 
4.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 782 | 67.5%
ASCII | 376 | 32.5%

Most frequent character per block

ASCII
ValueCountFrequency (%)
227
60.4%
1 27
 
7.2%
. 22
 
5.9%
2 16
 
4.3%
- 15
 
4.0%
7 15
 
4.0%
4 11
 
2.9%
6 9
 
2.4%
5 9
 
2.4%
3 9
 
2.4%
Other values (3) 16
 
4.3%
Hangul
ValueCountFrequency (%)
49
 
6.3%
48
 
6.1%
44
 
5.6%
38
 
4.9%
36
 
4.6%
34
 
4.3%
30
 
3.8%
27
 
3.5%
25
 
3.2%
25
 
3.2%
Other values (115) 426
54.5%

지정일자
DateTime

Distinct: 36
Distinct (%): 81.8%
Missing: 0
Missing (%): 0.0%
Memory size: 484.0 B
Minimum: 2017-01-05 00:00:00
Maximum: 2017-12-29 00:00:00
Histogram with fixed size bins (bins=36)

업체수
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Memory size: 484.0 B
0: 44

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 44 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 44 | 100.0%

일평균 사용량
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Memory size: 484.0 B
0: 44

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 44 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 44 | 100.0%

Correlations

Variable | 공단구분 | 공단명 | 단지코드 | 단지주소 | 지정일자
공단구분 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
공단명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
단지코드 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
단지주소 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
지정일자 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Variable | 광역구분 | 공단구분
광역구분 | 1.000 | 1.000
공단구분 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 광역구분 | 담당본부 | 공단구분 | 공단명 | 단지코드 | 단지주소 | 지정일자 | 업체수 | 일평균 사용량
0 | <NA> | <NA> | 국가산업단지 | 경남항공국가산업단지 | 148080 | 경상남도 진주시 정촌면 예하리. 대축리. 화개리 일원. 사천시 용현면 선진리.신촌리. 통양리 일원 | 2017-05-02 | 0 | 0
1 | 미가동 | <NA> | 도시첨단산업단지 | 용인기흥ICT밸리도시첨단산업단지 | 341060 | 경기도 용인시 기흥구 구갈동 259-1번지 일원 | 2017-01-13 | 0 | 0
2 | 미가동 | <NA> | 일반산업단지 | 동방일반산업단지 | 241BO0 | 경기도 화성시 팔탄면 덕우리 산66-2번지 일원 | 2017-02-01 | 0 | 0
3 | 미가동 | <NA> | 일반산업단지 | 지사글로벌일반산업단지 | 226300 | 부산광역시 강서구 지사동 산137번지 일원 | 2017-02-15 | 0 | 0
4 | 미가동 | <NA> | 일반산업단지 | 성안산업단지 | 243800 | 가. 충청북도 음성군 금왕읍 봉곡리 산36-77번지 일원 | 2017-03-24 | 0 | 0
5 | 미가동 | <NA> | 일반산업단지 | 가유일반산업단지 | 241BP0 | 경기도 안성시 고삼면 가유리 777-1번지 일원 | 2017-04-10 | 0 | 0
6 | 미가동 | <NA> | 일반산업단지 | 동문일반산업단지 | 241BR0 | 경기도 안성시 원곡면 지문리 산110번지 일원 | 2017-04-21 | 0 | 0
7 | 미가동 | <NA> | 국가산업단지 | 경남항공(사천지구) | 148082 | 경상남도 사천시 용현면 선진리.신촌리. 통양리 일원 | 2017-05-02 | 0 | 0
8 | 미가동 | <NA> | 국가산업단지 | 경남항공(진주지구) | 148081 | 경상남도 진주시 정촌면 예하리. 대축리. 화개리 일원 | 2017-05-02 | 0 | 0
9 | 미가동 | <NA> | 도시첨단산업단지 | 삼성SDS춘천센터도시첨단산업단지 | 342040 | 강원도 춘천시 칠전동 324번지 일원 | 2017-06-01 | 0 | 0

Index | 광역구분 | 담당본부 | 공단구분 | 공단명 | 단지코드 | 단지주소 | 지정일자 | 업체수 | 일평균 사용량
34 | 미가동 | <NA> | 일반산업단지 | 진목일반산업단지 | 241BX0 | 경기도 포천시 내촌면 진목리 186번지 일원 | 2017-12-08 | 0 | 0
35 | 미가동 | <NA> | 일반산업단지 | 대곡2일반산업단지 | 247810 | 경상북도 경주시 건천읍 대곡리 산37-1번지 일원 | 2017-12-11 | 0 | 0
36 | 미가동 | <NA> | 도시첨단산업단지 | 용인일양히포도시첨단산업단지 | 341070 | 경기도 용인시 기흥구 하갈동 182-4번지 일원 | 2017-12-22 | 0 | 0
37 | 미가동 | <NA> | 일반산업단지 | 세종벤처밸리일반산업단지 | 236140 | 세종특별자치시 전동면 심중리 574번지 일원 | 2017-12-28 | 0 | 0
38 | 미가동 | <NA> | 일반산업단지 | 원지일반산업단지 | 248AX0 | 경상남도 김해시 주촌면 원지리 산167-1번지 일원 | 2017-12-28 | 0 | 0
39 | 미가동 | <NA> | 일반산업단지 | 세종스마트그린일반산업단지 | 236150 | 세종특별자치시 소정면 고등리 산65번지. 전의면 읍내리 47번지 일원 | 2017-12-28 | 0 | 0
40 | 미가동 | <NA> | 도시첨단산업단지 | 남동도시첨단산업단지 | 328020 | 인천광역시 남동구 남촌동 210-6번지 일원 | 2017-12-29 | 0 | 0
41 | 미가동 | <NA> | 도시첨단산업단지 | 율하도시첨단산업단지 | 327020 | 대구광역시 동구 율하동 814-3 일원 | 2017-12-29 | 0 | 0
42 | 미가동 | <NA> | 일반산업단지 | 영남일반산업단지 | 248AR0 | 경상남도 창녕군 대합면 대동리 일원 | 2017-01-05 | 0 | 0
43 | 미가동 | <NA> | 도시첨단산업단지 | 순천도시첨단산업단지 | 346010 | 전라남도 순천시 야흥동 142-1번지 일원 | 2017-12-29 | 0 | 0