Overview

Dataset statistics

Number of variables: 9
Number of observations: 30
Missing cells: 122
Missing cells (%): 45.2%
Duplicate rows: 1
Duplicate rows (%): 3.3%
Total size in memory: 2.2 KiB
Average record size in memory: 76.4 B
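
The percentages above can be checked by hand. The sketch below assumes that the missing-cell percentage is missing cells divided by (variables × observations) and that the duplicate percentage is duplicate rows divided by observations; the per-variable missing counts are copied from the Alerts and Variables sections of this report.

```python
# Figures copied from this report; the formulas are an assumption about
# how the percentages are derived, not taken from the profiler's source.
n_variables = 9
n_observations = 30
missing_cells = 122
duplicate_rows = 1

# Missing counts per variable, in column order (Unnamed: 3 has none).
missing_per_variable = [1, 6, 5, 0, 7, 22, 25, 27, 29]

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells (%): {missing_pct:.1f}%")
print(f"duplicate rows (%): {duplicate_pct:.1f}%")
print(f"sum of per-variable missing counts: {sum(missing_per_variable)}")
```

Under these assumptions the per-variable missing counts add up to the reported total of missing cells.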

Variable types

Unsupported: 3
Text: 4
Categorical: 1
Boolean: 1

Dataset

Description: District information progress history data (지구정보진행이력 데이터)
Author: Korea Land and Housing Corporation (한국토지주택공사)
URL: https://www.vworld.kr/dtmk/dtmk_ntads_s002.do?dsId=30485

Alerts

Unnamed: 5 has constant value "" (Constant)
Unnamed: 8 has constant value "" (Constant)
Dataset has 1 (3.3%) duplicate rows (Duplicates)
테이블정의서 has 1 (3.3%) missing values (Missing)
Unnamed: 1 has 6 (20.0%) missing values (Missing)
Unnamed: 2 has 5 (16.7%) missing values (Missing)
Unnamed: 4 has 7 (23.3%) missing values (Missing)
Unnamed: 5 has 22 (73.3%) missing values (Missing)
Unnamed: 6 has 25 (83.3%) missing values (Missing)
Unnamed: 7 has 27 (90.0%) missing values (Missing)
Unnamed: 8 has 29 (96.7%) missing values (Missing)
테이블정의서 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)
Unnamed: 4 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)
Unnamed: 7 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)

Reproduction

Analysis started: 2024-04-18 07:54:11.689006
Analysis finished: 2024-04-18 07:54:12.314148
Duration: 0.63 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
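
As a quick consistency check, the reported duration can be recomputed from the two timestamps. This is a minimal sketch using only the standard library; it assumes both timestamps are in the same time zone.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-04-18 07:54:11.689006")
finished = datetime.fromisoformat("2024-04-18 07:54:12.314148")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```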

Variables

테이블정의서
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 1
Missing (%): 3.3%
Memory size: 372.0 B

Unnamed: 1
Text

MISSING 

Distinct: 24
Distinct (%): 100.0%
Missing: 6
Missing (%): 20.0%
Memory size: 372.0 B

Length

Max length: 18
Median length: 14.5
Mean length: 11.75
Min length: 4

Characters and Unicode

Total characters: 282
Distinct characters: 25
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
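
To illustrate what "character properties" means here, the sketch below counts Unicode general categories for the five sample values of Unnamed: 1 listed in this report. It uses Python's standard `unicodedata` module, which exposes the general category (for example `Lu` for Uppercase Letter, `Lo` for Other Letter, `Pc` for Connector Punctuation) but not script or block. It is an illustration of the idea, not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# First five sample values of "Unnamed: 1", copied from this report.
samples = [
    "컬럼ID",
    "DSTRC_APPN_NO",
    "DSTRC_HIST_NO",
    "NTFC_BSNS_DSTRC_NO",
    "NTFC_MNTRNG_NO",
]


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


counts = category_counts(samples)
for category, count in counts.most_common():
    print(category, count)
```

Because only five of the 24 values are used, the counts are smaller than the totals reported for the full column.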

Unique

Unique: 24
Unique (%): 100.0%

Sample

1st row: 컬럼ID
2nd row: DSTRC_APPN_NO
3rd row: DSTRC_HIST_NO
4th row: NTFC_BSNS_DSTRC_NO
5th row: NTFC_MNTRNG_NO

Value | Count | Frequency (%)
컬럼id | 1 | 4.2%
dstrc_appn_no | 1 | 4.2%
updt_dt | 1 | 4.2%
updt_id | 1 | 4.2%
regist_dt | 1 | 4.2%
regist_id | 1 | 4.2%
regist_sttus | 1 | 4.2%
chrg_instt_cttpc | 1 | 4.2%
chrg_instt_dept | 1 | 4.2%
chrg_instt_code | 1 | 4.2%
Other values (14) | 14 | 58.3%

Most occurring characters

Value | Count | Frequency (%)
_ | 39 | 13.8%
T | 34 | 12.1%
S | 25 | 8.9%
N | 24 | 8.5%
C | 24 | 8.5%
D | 23 | 8.2%
O | 17 | 6.0%
R | 16 | 5.7%
E | 16 | 5.7%
I | 11 | 3.9%
Other values (15) | 53 | 18.8%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 237 | 84.0%
Connector Punctuation | 39 | 13.8%
Decimal Number | 4 | 1.4%
Other Letter | 2 | 0.7%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
T | 34 | 14.3%
S | 25 | 10.5%
N | 24 | 10.1%
C | 24 | 10.1%
D | 23 | 9.7%
O | 17 | 7.2%
R | 16 | 6.8%
E | 16 | 6.8%
I | 11 | 4.6%
P | 9 | 3.8%
Other values (10) | 38 | 16.0%

Decimal Number
Value | Count | Frequency (%)
2 | 2 | 50.0%
1 | 2 | 50.0%

Other Letter
Value | Count | Frequency (%)
컬 | 1 | 50.0%
럼 | 1 | 50.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 39 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 237 | 84.0%
Common | 43 | 15.2%
Hangul | 2 | 0.7%

Most frequent character per script

Latin
Value | Count | Frequency (%)
T | 34 | 14.3%
S | 25 | 10.5%
N | 24 | 10.1%
C | 24 | 10.1%
D | 23 | 9.7%
O | 17 | 7.2%
R | 16 | 6.8%
E | 16 | 6.8%
I | 11 | 4.6%
P | 9 | 3.8%
Other values (10) | 38 | 16.0%

Common
Value | Count | Frequency (%)
_ | 39 | 90.7%
2 | 2 | 4.7%
1 | 2 | 4.7%

Hangul
Value | Count | Frequency (%)
컬 | 1 | 50.0%
럼 | 1 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 280 | 99.3%
Hangul | 2 | 0.7%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
_ | 39 | 13.9%
T | 34 | 12.1%
S | 25 | 8.9%
N | 24 | 8.6%
C | 24 | 8.6%
D | 23 | 8.2%
O | 17 | 6.1%
R | 16 | 5.7%
E | 16 | 5.7%
I | 11 | 3.9%
Other values (13) | 51 | 18.2%

Hangul
Value | Count | Frequency (%)
컬 | 1 | 50.0%
럼 | 1 | 50.0%

Unnamed: 2
Text

MISSING 

Distinct: 25
Distinct (%): 100.0%
Missing: 5
Missing (%): 16.7%
Memory size: 372.0 B

Length

Max length: 8
Median length: 7
Mean length: 5.24
Min length: 3

Characters and Unicode

Total characters: 131
Distinct characters: 53
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 25
Unique (%): 100.0%

Sample

1st row: 컬럼명
2nd row: 지구지정번호
3rd row: 지구이력번호
4th row: 고시사업지구번호
5th row: 고시모니터링번호

Value | Count | Frequency (%)
컬럼명 | 1 | 4.0%
고시일자 | 1 | 4.0%
고시년도 | 1 | 4.0%
수정일시 | 1 | 4.0%
수정아이디 | 1 | 4.0%
등록일시 | 1 | 4.0%
등록아이디 | 1 | 4.0%
등록상태 | 1 | 4.0%
담당기관연락처 | 1 | 4.0%
담당기관부서 | 1 | 4.0%
Other values (15) | 15 | 60.0%

Most occurring characters

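Value | Count | Frequency (%)
(character not preserved) | 8 | 6.1%
(character not preserved) | 8 | 6.1%
(character not preserved) | 8 | 6.1%
(character not preserved) | 6 | 4.6%
(character not preserved) | 5 | 3.8%
(character not preserved) | 5 | 3.8%
(character not preserved) | 5 | 3.8%
(character not preserved) | 4 | 3.1%
(character not preserved) | 4 | 3.1%
(character not preserved) | 4 | 3.1%
Other values (43) | 74 | 56.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 127 | 96.9%
Decimal Number | 4 | 3.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(character not preserved) | 8 | 6.3%
(character not preserved) | 8 | 6.3%
(character not preserved) | 8 | 6.3%
(character not preserved) | 6 | 4.7%
(character not preserved) | 5 | 3.9%
(character not preserved) | 5 | 3.9%
(character not preserved) | 5 | 3.9%
(character not preserved) | 4 | 3.1%
(character not preserved) | 4 | 3.1%
(character not preserved) | 4 | 3.1%
Other values (41) | 70 | 55.1%

Decimal Number
Value | Count | Frequency (%)
2 | 2 | 50.0%
1 | 2 | 50.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 127 | 96.9%
Common | 4 | 3.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(character not preserved) | 8 | 6.3%
(character not preserved) | 8 | 6.3%
(character not preserved) | 8 | 6.3%
(character not preserved) | 6 | 4.7%
(character not preserved) | 5 | 3.9%
(character not preserved) | 5 | 3.9%
(character not preserved) | 5 | 3.9%
(character not preserved) | 4 | 3.1%
(character not preserved) | 4 | 3.1%
(character not preserved) | 4 | 3.1%
Other values (41) | 70 | 55.1%

Common
Value | Count | Frequency (%)
2 | 2 | 50.0%
1 | 2 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 127 | 96.9%
ASCII | 4 | 3.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(character not preserved) | 8 | 6.3%
(character not preserved) | 8 | 6.3%
(character not preserved) | 8 | 6.3%
(character not preserved) | 6 | 4.7%
(character not preserved) | 5 | 3.9%
(character not preserved) | 5 | 3.9%
(character not preserved) | 5 | 3.9%
(character not preserved) | 4 | 3.1%
(character not preserved) | 4 | 3.1%
(character not preserved) | 4 | 3.1%
Other values (41) | 70 | 55.1%

ASCII
Value | Count | Frequency (%)
2 | 2 | 50.0%
1 | 2 | 50.0%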

Unnamed: 3
Categorical

Distinct: 7
Distinct (%): 23.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 8
Median length: 7
Mean length: 5.5666667
Min length: 2

Unique

Unique: 2
Unique (%): 6.7%

Sample

1st row: <NA>
2nd row: 테이블ID
3rd row: <NA>
4th row: 타입
5th row: CHAR

Common Values

Value | Count | Frequency (%)
VARCHAR2 | 11 | 36.7%
CHAR | 8 | 26.7%
<NA> | 5 | 16.7%
NUMBER | 2 | 6.7%
DATE | 2 | 6.7%
테이블ID | 1 | 3.3%
타입 | 1 | 3.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
varchar2 | 11 | 36.7%
char | 8 | 26.7%
na | 5 | 16.7%
number | 2 | 6.7%
date | 2 | 6.7%
테이블id | 1 | 3.3%
타입 | 1 | 3.3%

Unnamed: 4
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 7
Missing (%): 23.3%
Memory size: 372.0 B

Unnamed: 5
Boolean

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 12.5%
Missing: 22
Missing (%): 73.3%
Memory size: 192.0 B
Value | Count | Frequency (%)
False | 8 | 26.7%
(Missing) | 22 | 73.3%

Unnamed: 6
Text

MISSING 

Distinct: 4
Distinct (%): 80.0%
Missing: 25
Missing (%): 83.3%
Memory size: 372.0 B

Length

Max length: 5
Median length: 4
Mean length: 3.2
Min length: 2

Characters and Unicode

Total characters: 16
Distinct characters: 11
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 60.0%

Sample

1st row: 작성일
2nd row: 테이블명
3rd row: PK/FK
4th row: PK
5th row: PK

Value | Count | Frequency (%)
pk | 2 | 40.0%
작성일 | 1 | 20.0%
테이블명 | 1 | 20.0%
pk/fk | 1 | 20.0%

Most occurring characters

Value | Count | Frequency (%)
K | 4 | 25.0%
P | 3 | 18.8%
작 | 1 | 6.2%
성 | 1 | 6.2%
일 | 1 | 6.2%
테 | 1 | 6.2%
이 | 1 | 6.2%
블 | 1 | 6.2%
명 | 1 | 6.2%
/ | 1 | 6.2%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 8 | 50.0%
Other Letter | 7 | 43.8%
Other Punctuation | 1 | 6.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
작 | 1 | 14.3%
성 | 1 | 14.3%
일 | 1 | 14.3%
테 | 1 | 14.3%
이 | 1 | 14.3%
블 | 1 | 14.3%
명 | 1 | 14.3%

Uppercase Letter
Value | Count | Frequency (%)
K | 4 | 50.0%
P | 3 | 37.5%
F | 1 | 12.5%

Other Punctuation
Value | Count | Frequency (%)
/ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 8 | 50.0%
Hangul | 7 | 43.8%
Common | 1 | 6.2%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
작 | 1 | 14.3%
성 | 1 | 14.3%
일 | 1 | 14.3%
테 | 1 | 14.3%
이 | 1 | 14.3%
블 | 1 | 14.3%
명 | 1 | 14.3%

Latin
Value | Count | Frequency (%)
K | 4 | 50.0%
P | 3 | 37.5%
F | 1 | 12.5%

Common
Value | Count | Frequency (%)
/ | 1 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 9 | 56.2%
Hangul | 7 | 43.8%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
K | 4 | 44.4%
P | 3 | 33.3%
/ | 1 | 11.1%
F | 1 | 11.1%

Hangul
Value | Count | Frequency (%)
작 | 1 | 14.3%
성 | 1 | 14.3%
일 | 1 | 14.3%
테 | 1 | 14.3%
이 | 1 | 14.3%
블 | 1 | 14.3%
명 | 1 | 14.3%

Unnamed: 7
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 27
Missing (%): 90.0%
Memory size: 372.0 B

Unnamed: 8
Text

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 100.0%
Missing: 29
Missing (%): 96.7%
Memory size: 372.0 B

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 9
Distinct characters: 9
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 참조테이블명/비고

Value | Count | Frequency (%)
참조테이블명/비고 | 1 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
참 | 1 | 11.1%
조 | 1 | 11.1%
테 | 1 | 11.1%
이 | 1 | 11.1%
블 | 1 | 11.1%
명 | 1 | 11.1%
/ | 1 | 11.1%
비 | 1 | 11.1%
고 | 1 | 11.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 8 | 88.9%
Other Punctuation | 1 | 11.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
참 | 1 | 12.5%
조 | 1 | 12.5%
테 | 1 | 12.5%
이 | 1 | 12.5%
블 | 1 | 12.5%
명 | 1 | 12.5%
비 | 1 | 12.5%
고 | 1 | 12.5%

Other Punctuation
Value | Count | Frequency (%)
/ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 8 | 88.9%
Common | 1 | 11.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
참 | 1 | 12.5%
조 | 1 | 12.5%
테 | 1 | 12.5%
이 | 1 | 12.5%
블 | 1 | 12.5%
명 | 1 | 12.5%
비 | 1 | 12.5%
고 | 1 | 12.5%

Common
Value | Count | Frequency (%)
/ | 1 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 8 | 88.9%
ASCII | 1 | 11.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
참 | 1 | 12.5%
조 | 1 | 12.5%
테 | 1 | 12.5%
이 | 1 | 12.5%
블 | 1 | 12.5%
명 | 1 | 12.5%
비 | 1 | 12.5%
고 | 1 | 12.5%

ASCII
Value | Count | Frequency (%)
/ | 1 | 100.0%

Correlations

Variable | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 6
Unnamed: 1 | 1.000 | 1.000 | 1.000 | 1.000
Unnamed: 2 | 1.000 | 1.000 | 1.000 | 1.000
Unnamed: 3 | 1.000 | 1.000 | 1.000 | 1.000
Unnamed: 6 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
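
A minimal sketch of nullity correlation, assuming it is the Pearson correlation between the "is missing" indicators of two columns. The toy columns below are hypothetical and not taken from this dataset; `None` marks a missing cell.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Undefined when a column is never missing or always missing.
        return float("nan")
    return cov / sqrt(var_a * var_b)


# Hypothetical toy columns: the first two are missing in the same rows,
# the third is missing exactly where the others are present.
col_id = ["A", None, "C", None]
col_name = ["a", None, "c", None]
col_key = [None, "PK", None, "PK"]

same_pattern = nullity_correlation(col_id, col_name)
opposite_pattern = nullity_correlation(col_id, col_key)
print(same_pattern, opposite_pattern)
```

A value near 1 means two variables tend to be missing together; a value near -1 means one tends to be present when the other is missing.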

Sample

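Index | 테이블정의서 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8
0 | 작성자 | <NA> | <NA> | <NA> | NaN | <NA> | 작성일 | 2015-12-23 00:00:00 | <NA>
1 | 주제영역명 | <NA> | <NA> | 테이블ID | Z_LHSDW_BLS5_DSTRC_PROGRS_HIST | <NA> | 테이블명 | 지구정보진행이력 | <NA>
2 | 테이블설명 | <NA> | <NA> | <NA> | NaN | <NA> | <NA> | NaN | <NA>
3 | No | 컬럼ID | 컬럼명 | 타입 | 길이(Byte) | <NA> | PK/FK | Default | 참조테이블명/비고
4 | 1 | DSTRC_APPN_NO | 지구지정번호 | CHAR | 14 | N | PK | NaN | <NA>
5 | 2 | DSTRC_HIST_NO | 지구이력번호 | NUMBER | 22 | N | PK | NaN | <NA>
6 | 3 | NTFC_BSNS_DSTRC_NO | 고시사업지구번호 | NUMBER | 22 | <NA> | <NA> | NaN | <NA>
7 | 4 | NTFC_MNTRNG_NO | 고시모니터링번호 | VARCHAR2 | 128 | <NA> | <NA> | NaN | <NA>
8 | 5 | NTFC_BSNS_DSTRC_NM | 고시사업지구명 | VARCHAR2 | 100 | <NA> | <NA> | NaN | <NA>
9 | 6 | STEP_CODE | 단계코드 | VARCHAR2 | 10 | N | <NA> | NaN | <NA>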
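
Index | 테이블정의서 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8
20 | 17 | CHRG_INSTT_CTTPC | 담당기관연락처 | VARCHAR2 | 20 | <NA> | <NA> | NaN | <NA>
21 | 18 | REGIST_STTUS | 등록상태 | VARCHAR2 | 3 | N | <NA> | NaN | <NA>
22 | 19 | REGIST_ID | 등록아이디 | VARCHAR2 | 20 | N | <NA> | NaN | <NA>
23 | 20 | REGIST_DT | 등록일시 | DATE | NaN | N | <NA> | NaN | <NA>
24 | 21 | UPDT_ID | 수정아이디 | VARCHAR2 | 20 | <NA> | <NA> | NaN | <NA>
25 | 22 | UPDT_DT | 수정일시 | DATE | NaN | <NA> | <NA> | NaN | <NA>
26 | 23 | NTFC_YEAR | 고시년도 | CHAR | 4 | <NA> | <NA> | NaN | <NA>
27 | 인덱스명 | <NA> | 인덱스키 | <NA> | NaN | <NA> | <NA> | NaN | <NA>
28 | NaN | <NA> | <NA> | <NA> | NaN | <NA> | <NA> | NaN | <NA>
29 | 업무규칙 | <NA> | <NA> | <NA> | NaN | <NA> | <NA> | NaN | <NA>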
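
The sample rows suggest that the file is a table definition sheet: a few metadata rows, then a real header row beginning with "No", then one row per column definition. That would explain the "Unnamed" variables and the high missing rates. The sketch below shows one way to recover the column definitions under that assumption. The rows are a hand-transcribed subset of the sample with every cell written as a string and `None` standing for <NA>/NaN; `extract_columns` and the fallback name `column_<i>` are hypothetical helpers, not part of the profiling tool.

```python
# Hand-transcribed subset of the sample rows (None stands for <NA>/NaN).
rows = [
    ["작성자", None, None, None, None, None, "작성일", "2015-12-23 00:00:00", None],
    ["No", "컬럼ID", "컬럼명", "타입", "길이(Byte)", None, "PK/FK", "Default", "참조테이블명/비고"],
    ["1", "DSTRC_APPN_NO", "지구지정번호", "CHAR", "14", "N", "PK", None, None],
    ["2", "DSTRC_HIST_NO", "지구이력번호", "NUMBER", "22", "N", "PK", None, None],
    ["인덱스명", None, "인덱스키", None, None, None, None, None, None],
]


def extract_columns(rows):
    """Return one dict per column definition found below the 'No' header row."""
    header_idx = next(i for i, row in enumerate(rows) if row[0] == "No")
    # Give unnamed header cells a positional fallback name.
    header = [
        name if name is not None else f"column_{i}"
        for i, name in enumerate(rows[header_idx])
    ]
    definitions = []
    for row in rows[header_idx + 1:]:
        # Stop at the first row that is not a numbered column definition.
        if row[0] is None or not row[0].isdigit():
            break
        definitions.append(dict(zip(header, row)))
    return definitions


columns = extract_columns(rows)
for col in columns:
    print(col["컬럼ID"], col["타입"], col["길이(Byte)"])
```

Profiling the extracted definitions instead of the raw sheet would likely give more meaningful variable names and types, though that depends on how the source file is actually laid out.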

Duplicate rows

Most frequently occurring

Index | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 5 | Unnamed: 6 | Unnamed: 8 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 3