Overview

Dataset statistics

Number of variables: 9
Number of observations: 34
Missing cells: 159
Missing cells (%): 52.0%
Duplicate rows: 7
Duplicate rows (%): 20.6%
Total size in memory: 2.5 KiB
Average record size in memory: 75.9 B
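
The two percentages above can be reproduced from the raw counts. The sketch below assumes that the missing-cell ratio is taken over all cells (variables × observations) and the duplicate ratio over all observations; the variable names are illustrative only.

```python
# Counts taken from the dataset statistics of this report.
n_variables = 9
n_observations = 34
missing_cells = 159
duplicate_rows = 7

# Assumption: missing % is relative to all cells (variables x observations).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: duplicate % is relative to the number of observations.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"{total_cells} cells, {missing_pct:.1f}% missing, {duplicate_pct:.1f}% duplicate rows")
```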

Variable types

Text: 1
Unsupported: 8

Dataset

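Description: National industrial complex status statistics (by province and city: all industrial complexes, national, general, urban high-tech, and agro-industrial complexes, new designations and cancellations, free trade zones, foreign investment zones, etc.)
Author: 한국산업단지공단 (Korea Industrial Complex Corporation)
URL: https://www.data.go.kr/data/3041272/fileData.do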

Alerts

Dataset has 7 (20.6%) duplicate rows [Duplicates]
Unnamed: 0 has 7 (20.6%) missing values [Missing]
2023년 4분기 전국산업단지 현황통계 요약 has 15 (44.1%) missing values [Missing]
Unnamed: 2 has 16 (47.1%) missing values [Missing]
Unnamed: 3 has 16 (47.1%) missing values [Missing]
Unnamed: 4 has 14 (41.2%) missing values [Missing]
Unnamed: 5 has 14 (41.2%) missing values [Missing]
Unnamed: 6 has 22 (64.7%) missing values [Missing]
Unnamed: 7 has 28 (82.4%) missing values [Missing]
Unnamed: 8 has 27 (79.4%) missing values [Missing]
2023년 4분기 전국산업단지 현황통계 요약 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 2 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 3 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 4 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 5 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 6 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 7 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 8 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
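
One plausible reason for the "unsupported type" alerts is that each of these columns mixes header labels, numbers, and missing values, as the sample rows of this report suggest. The sketch below shows that mix and one way to coerce such a column to numbers. The helper `to_number` and the example column are hypothetical, and this is not the profiler's own logic.

```python
import math

def to_number(cell):
    """Return the cell as a float, or None if it is missing or not numeric (hypothetical helper)."""
    if cell is None:
        return None
    if isinstance(cell, float) and math.isnan(cell):
        return None
    try:
        return float(str(cell).replace(",", ""))
    except ValueError:
        return None  # header text, footnotes, or the masking marker "X"

# Illustrative column: a header label, numeric cells, a NaN, and a masked value.
column = ["단지수", 50, 731, 44, 481, 1306, float("nan"), "X"]

# The set of Python types present in the column.
kinds = sorted({type(c).__name__ for c in column})
cleaned = [to_number(c) for c in column]

print(kinds)
print(cleaned)
```

After coercion, only the numeric cells survive; label and marker cells become `None` and can be dropped or handled separately.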

Reproduction

Analysis started: 2024-03-23 04:26:49.156588
Analysis finished: 2024-03-23 04:26:51.275976
Duration: 2.12 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

Unnamed: 0
Text

MISSING 

Distinct: 16
Distinct (%): 59.3%
Missing: 7
Missing (%): 20.6%
Memory size: 404.0 B

Length

Max length: 83
Median length: 61
Mean length: 14.518519
Min length: 2

Characters and Unicode

Total characters: 392
Distinct characters: 117
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
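
As an illustration of these character properties, the sketch below counts Unicode general categories for the five sample values of `Unnamed: 0` listed in this report, using Python's standard `unicodedata` module. It covers only those five strings, so the counts are not expected to match the full-column figures.

```python
import unicodedata
from collections import Counter

# Sample values of "Unnamed: 0" as listed in this report.
samples = ["(1) 조성 및 분양(천㎡)", "단지유형", "국가", "일반", "도시첨단"]

# Normalize to NFC so each Hangul syllable is a single code point,
# then count characters by general category (e.g. "Lo" = Other Letter).
categories = Counter(
    unicodedata.category(ch)
    for text in samples
    for ch in unicodedata.normalize("NFC", text)
)

for code, count in categories.most_common():
    print(code, count)
```

The two-letter codes map to the category names used in this report: `Lo` is Other Letter, `Zs` is Space Separator, `Nd` is Decimal Number, `Ps` and `Pe` are Open and Close Punctuation, and `So` is Other Symbol.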

Unique

Unique: 10
Unique (%): 37.0%
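
The distinct and unique percentages, and the mean length, are consistent with using the non-missing rows as the denominator. That choice is inferred from the numbers rather than stated by the report, so treat the sketch below as a check under that assumption.

```python
# Figures reported for the variable "Unnamed: 0".
n_observations = 34
n_missing = 7
n_distinct = 16
n_unique = 10            # values that occur exactly once
total_characters = 392

# Assumption: ratios are taken over non-missing values only.
n_present = n_observations - n_missing

distinct_pct = 100 * n_distinct / n_present
unique_pct = 100 * n_unique / n_present
mean_length = total_characters / n_present

print(f"{distinct_pct:.1f}% distinct, {unique_pct:.1f}% unique, mean length {mean_length:.6f}")
```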

Sample

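1st row: (1) 조성 및 분양(천㎡) [(1) Development and lot sales (thousand m²)]
2nd row: 단지유형 [complex type]
3rd row: 국가 [national]
4th row: 일반 [general]
5th row: 도시첨단 [urban high-tech]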
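Value | Count | Frequency (%)
(not rendered) | 5 | 5.2%
(not rendered) | 4 | 4.2%
일반 [general] | 3 | 3.1%
국가 [national] | 3 | 3.1%
도시첨단 [urban high-tech] | 3 | 3.1%
생산 [production] | 3 | 3.1%
총합 [total] | 3 | 3.1%
농공 [agro-industrial] | 3 | 3.1%
(not rendered) | 2 | 2.1%
수출 [exports] | 2 | 2.1%
Other values (61) | 65 | 67.7%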

Most occurring characters

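Value | Count | Frequency (%)
(space) | 81 | 20.7%
(not rendered) | 10 | 2.6%
, | 10 | 2.6%
(not rendered) | 9 | 2.3%
(not rendered) | 9 | 2.3%
) | 8 | 2.0%
2 | 8 | 2.0%
(not rendered) | 8 | 2.0%
( | 7 | 1.8%
(not rendered) | 6 | 1.5%
Other values (107) | 236 | 60.2%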

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 253 | 64.5%
Space Separator | 81 | 20.7%
Decimal Number | 22 | 5.6%
Other Punctuation | 15 | 3.8%
Close Punctuation | 8 | 2.0%
Open Punctuation | 7 | 1.8%
Dash Punctuation | 3 | 0.8%
Math Symbol | 1 | 0.3%
Other Symbol | 1 | 0.3%
Uppercase Letter | 1 | 0.3%

Most frequent character per category

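Other Letter
Value | Count | Frequency (%)
(not rendered) | 10 | 4.0%
(not rendered) | 9 | 3.6%
(not rendered) | 9 | 3.6%
(not rendered) | 8 | 3.2%
(not rendered) | 6 | 2.4%
(not rendered) | 6 | 2.4%
(not rendered) | 6 | 2.4%
(not rendered) | 6 | 2.4%
(not rendered) | 6 | 2.4%
(not rendered) | 5 | 2.0%
Other values (90) | 182 | 71.9%

Decimal Number
Value | Count | Frequency (%)
2 | 8 | 36.4%
1 | 5 | 22.7%
3 | 5 | 22.7%
8 | 2 | 9.1%
7 | 1 | 4.5%
6 | 1 | 4.5%

Other Punctuation
Value | Count | Frequency (%)
, | 10 | 66.7%
. | 3 | 20.0%
: | 1 | 6.7%
* | 1 | 6.7%

Space Separator
Value | Count | Frequency (%)
(space) | 81 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 8 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 7 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 3 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 1 | 100.0%

Other Symbol
Value | Count | Frequency (%)
(not rendered) | 1 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
X | 1 | 100.0%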

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 253 | 64.5%
Common | 138 | 35.2%
Latin | 1 | 0.3%

Most frequent character per script

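Hangul
Value | Count | Frequency (%)
(not rendered) | 10 | 4.0%
(not rendered) | 9 | 3.6%
(not rendered) | 9 | 3.6%
(not rendered) | 8 | 3.2%
(not rendered) | 6 | 2.4%
(not rendered) | 6 | 2.4%
(not rendered) | 6 | 2.4%
(not rendered) | 6 | 2.4%
(not rendered) | 6 | 2.4%
(not rendered) | 5 | 2.0%
Other values (90) | 182 | 71.9%

Common
Value | Count | Frequency (%)
(space) | 81 | 58.7%
, | 10 | 7.2%
) | 8 | 5.8%
2 | 8 | 5.8%
( | 7 | 5.1%
1 | 5 | 3.6%
3 | 5 | 3.6%
. | 3 | 2.2%
- | 3 | 2.2%
8 | 2 | 1.4%
Other values (6) | 6 | 4.3%

Latin
Value | Count | Frequency (%)
X | 1 | 100.0%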

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 253 | 64.5%
ASCII | 138 | 35.2%
CJK Compat | 1 | 0.3%

Most frequent character per block

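ASCII
Value | Count | Frequency (%)
(space) | 81 | 58.7%
, | 10 | 7.2%
) | 8 | 5.8%
2 | 8 | 5.8%
( | 7 | 5.1%
1 | 5 | 3.6%
3 | 5 | 3.6%
. | 3 | 2.2%
- | 3 | 2.2%
8 | 2 | 1.4%
Other values (6) | 6 | 4.3%

Hangul
Value | Count | Frequency (%)
(not rendered) | 10 | 4.0%
(not rendered) | 9 | 3.6%
(not rendered) | 9 | 3.6%
(not rendered) | 8 | 3.2%
(not rendered) | 6 | 2.4%
(not rendered) | 6 | 2.4%
(not rendered) | 6 | 2.4%
(not rendered) | 6 | 2.4%
(not rendered) | 6 | 2.4%
(not rendered) | 5 | 2.0%
Other values (90) | 182 | 71.9%

CJK Compat
Value | Count | Frequency (%)
(not rendered) | 1 | 100.0%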

2023년 4분기 전국산업단지 현황통계 요약
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 15
Missing (%): 44.1%
Memory size: 404.0 B

Unnamed: 2
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 16
Missing (%): 47.1%
Memory size: 404.0 B

Unnamed: 3
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 16
Missing (%): 47.1%
Memory size: 404.0 B

Unnamed: 4
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 14
Missing (%): 41.2%
Memory size: 404.0 B

Unnamed: 5
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 14
Missing (%): 41.2%
Memory size: 404.0 B

Unnamed: 6
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 22
Missing (%): 64.7%
Memory size: 404.0 B

Unnamed: 7
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 28
Missing (%): 82.4%
Memory size: 404.0 B

Unnamed: 8
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 27
Missing (%): 79.4%
Memory size: 404.0 B

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completeness.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
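
Nullity correlation can be read as the Pearson correlation between per-column missingness indicators (1 where a cell is missing, 0 otherwise). The sketch below assumes that reading and uses three small made-up columns, not the real data; `pearson` and `is_missing` are illustrative helpers.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def is_missing(col):
    """Turn a column into 0/1 missingness indicators (None marks a missing cell)."""
    return [1 if v is None else 0 for v in col]

# Illustrative toy columns (not the real data).
col_a = [1.0, None, 3.0, None, 5.0, 6.0]
col_b = [2.0, None, 1.0, None, 4.0, 9.0]    # missing in the same rows as col_a
col_c = [None, 7.0, None, 8.0, None, None]  # present only where col_a is missing

same = pearson(is_missing(col_a), is_missing(col_b))
opposite = pearson(is_missing(col_a), is_missing(col_c))
print(same, opposite)
```

Columns that are always missing together score close to 1, and columns that are never present together score close to -1.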

Sample

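(index) | Unnamed: 0 | 2023년 4분기 전국산업단지 현황통계 요약 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8
0 | <NA> | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
1 | (1) 조성 및 분양(천㎡) [(1) Development and lot sales (thousand m²)] | NaN | NaN | NaN | NaN | NaN | NaN | NaN | (단위 : 개, 천㎡, %) [(units: count, thousand m², %)]
2 | 단지유형 [complex type] | 단지수 [number of complexes] | 지정면적 [designated area] | 관리면적 [managed area] | 산업시설구역 [industrial facility zone] | NaN | NaN | NaN | NaN
3 | <NA> | NaN | NaN | NaN | 전체면적 [total area] | 분양대상 [area for sale] | 분양 [sold] | 미분양 [unsold] | 분양률(%) [sale rate (%)]
4 | 국가 [national] | 50 | 789424 | 483379 | 306915 | 282017 | 275417 | 6600 | 97.66
5 | 일반 [general] | 731 | 571481 | 561973 | 344355 | 270826 | 257593 | 13233 | 95.11
6 | 도시첨단 [urban high-tech] | 44 | 11311 | 11286 | 5016 | 3472 | 2729 | 743 | 78.6
7 | 농공 [agro-industrial] | 481 | 78050 | 77554 | 58847 | 56487 | 54543 | 1944 | 96.56
8 | 총합 [total] | 1306 | 1450266 | 1134192 | 715133 | 612802 | 590282 | 22520 | 96.33
9 | Note 1) Total area means the full area of the industrial facility zone, including undeveloped land; area for sale is the developed portion of the industrial facility zone | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN

(index) | Unnamed: 0 | 2023년 4분기 전국산업단지 현황통계 요약 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8
24 | 국가 [national] | 691929849 | 689211651 | 0.394392 | 229871399 | 236948729 | -2.986861 | NaN | NaN
25 | 일반 [general] | 501706632 | 498553882 | 0.632379 | 177526028 | 195530790 | -9.208147 | NaN | NaN
26 | 도시첨단 [urban high-tech] | 4245045 | 4017405 | 5.666344 | 582971 | 550800 | 5.840777 | NaN | NaN
27 | 농공 [agro-industrial] | 65034774 | 64850731 | 0.283795 | 11976570 | 11837820 | 1.172091 | NaN | NaN
28 | 총합 [total] | 1262916300 | 1256633669 | 0.499957 | 419956968 | 444868139 | -5.599675 | NaN | NaN
29 | <NA> | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
30 | * 유의사항 [* Notes] | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
31 | - Industrial complexes that span two provinces or cities (한국수출, 빛그린, 아산, 명지녹산) are counted twice, so the total appears as 1,287, but the actual number of complexes is 1,283 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
32 | - For complexes with two or fewer operating companies, production, export, and employment figures are shown as X to protect individual company information | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
33 | - Production and exports are cumulative year-to-date figures (cumulative production for Q2 2023: total production value from January through June 2023) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN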
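
Rows 4 to 8 of the sample can be checked for internal consistency. The sketch below assumes that the fused digits in those rows split into area for sale, sold, unsold, and sale rate as listed in `rows`; under that reading, unsold should equal area for sale minus sold, and the sale rate should equal sold divided by area for sale.

```python
# (label, area for sale, sold, unsold, reported sale rate in %).
# Assumption: this is how the fused digits of sample rows 4-8 split;
# areas are in thousand m².
rows = [
    ("국가", 282017, 275417, 6600, 97.66),
    ("일반", 270826, 257593, 13233, 95.11),
    ("도시첨단", 3472, 2729, 743, 78.6),
    ("농공", 56487, 54543, 1944, 96.56),
    ("총합", 612802, 590282, 22520, 96.33),
]

checks = []
for label, for_sale, sold, unsold, reported in rows:
    rate = round(100 * sold / for_sale, 2)
    # Unsold area should be the remainder; the rate should match to 2 decimals.
    ok = (for_sale - sold == unsold) and (rate == reported)
    checks.append(ok)
    print(label, rate, ok)
```

That both identities hold for every row supports this column split, although the report itself does not state the formulas.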

Duplicate rows

Most frequently occurring

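(index) | Unnamed: 0 | # duplicates
6 | <NA> | 7
0 | 국가 [national] | 3
1 | 농공 [agro-industrial] | 3
3 | 도시첨단 [urban high-tech] | 3
4 | 일반 [general] | 3
5 | 총합 [total] | 3
2 | 단지유형 [complex type] | 2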