Overview

Dataset statistics

Number of variables: 9
Number of observations: 30
Missing cells: 60
Missing cells (%): 22.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.2 KiB
Average record size in memory: 76.4 B
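
The missing-cell percentage follows from the counts above: 9 variables times 30 observations gives 270 cells, of which 60 are empty. A minimal check of that arithmetic, with variable names chosen here purely for illustration:

```python
# Figures copied from the dataset statistics above.
n_variables = 9
n_observations = 30
missing_cells = 60

total_cells = n_variables * n_observations        # every cell in the table
missing_pct = 100 * missing_cells / total_cells   # share of empty cells

print(f"{missing_cells} of {total_cells} cells missing ({missing_pct:.1f}%)")
```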

Variable types

Unsupported: 3
Text: 2
Categorical: 3
Boolean: 1

Dataset

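Description: Information on site-right share ratios for multi-unit buildings (집합건물에 대한 대지권지분비율 정보)
Author: Ministry of Land, Infrastructure and Transport (국토교통부)
URL: https://www.vworld.kr/dtmk/dtmk_ntads_s002.do?dsId=30580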

Alerts

Unnamed: 5 has constant value "" (Constant)
Unnamed: 8 is highly overall correlated with Unnamed: 3 and 1 other field (High correlation)
Unnamed: 3 is highly overall correlated with Unnamed: 6 and 1 other field (High correlation)
Unnamed: 6 is highly overall correlated with Unnamed: 3 and 1 other field (High correlation)
Unnamed: 1 has 7 (23.3%) missing values (Missing)
Unnamed: 2 has 1 (3.3%) missing values (Missing)
Unnamed: 4 has 6 (20.0%) missing values (Missing)
Unnamed: 5 has 19 (63.3%) missing values (Missing)
Unnamed: 7 has 27 (90.0%) missing values (Missing)
테이블정의서 (table definition document) is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)
Unnamed: 4 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)
Unnamed: 7 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)
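
One plausible reason for an "Unsupported" alert is a column whose cells mix several value types. The sketch below counts the Python types in a list modelled on the Unnamed: 4 values shown in the Sample section. The exact types held by the real DataFrame are an assumption, and `type_profile` is a hypothetical helper, not part of ydata-profiling.

```python
from collections import Counter

# Hypothetical cell values modelled on "Unnamed: 4" in the Sample section;
# the actual Python types in the loaded data are an assumption.
unnamed_4 = [float("nan"), "ABPD_CBLDG_LAND_RGT_JIBUN_RATE", float("nan"),
             "길이(Byte)", 5, 5, 150, 150, 150, 150]

def type_profile(values):
    """Count how many cells of each Python type a column holds."""
    return Counter(type(v).__name__ for v in values)

profile = type_profile(unnamed_4)
is_mixed = len(profile) > 1  # more than one type suggests the column needs cleaning
print(dict(profile), is_mixed)
```

A column that reports more than one type here is a candidate for cleaning before it is profiled again.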

Reproduction

Analysis started: 2024-04-18 06:58:37.104990
Analysis finished: 2024-04-18 06:58:38.750653
Duration: 1.65 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
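
The reported duration can be recomputed from the two timestamps. This is a small check using only the standard library, with the timestamps copied from the lines above:

```python
from datetime import datetime

# Start and end timestamps as printed in the Reproduction block.
started = datetime.fromisoformat("2024-04-18 06:58:37.104990")
finished = datetime.fromisoformat("2024-04-18 06:58:38.750653")

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```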

Variables

테이블정의서
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Unnamed: 1
Text

MISSING 

Distinct: 23
Distinct (%): 100.0%
Missing: 7
Missing (%): 23.3%
Memory size: 372.0 B

Length

Max length: 24
Median length: 16
Mean length: 11.304348
Min length: 2

Characters and Unicode

Total characters: 260
Distinct characters: 26
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
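
As an illustration of those character properties, the sketch below uses Python's standard `unicodedata` module to tally the general category of each character in two values from this variable. It covers only the category property; the standard library has no direct script lookup, so the script and block counts in this report are not reproduced here.

```python
import unicodedata
from collections import Counter

# Two values that occur in this variable.
values = ["컬럼ID", "ADM_SECT_CD"]

# General category of every character: Lu = Uppercase Letter,
# Lo = Other Letter, Pc = Connector Punctuation.
categories = Counter(unicodedata.category(ch) for v in values for ch in v)
print(dict(categories))
```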

Unique

Unique: 23
Unique (%): 100.0%

Sample

1st row: 컬럼ID
2nd row: ADM_SECT_CD
3rd row: LAND_LOC_CD
4th row: DONG
5th row: FLR

Value | Count | Frequency (%)
컬럼id | 1 | 4.3%
rel_land_loc_cd | 1 | 4.3%
col_adm_sect_cd | 1 | 4.3%
hist_expr_gbn | 1 | 4.3%
prsn_own_rgt_hist_odrno | 1 | 4.3%
chrg_man_id | 1 | 4.3%
land_rgt_end_ymd | 1 | 4.3%
land_rgt_bgn_ymd | 1 | 4.3%
land_rgt_jibun_rate_nume | 1 | 4.3%
land_rgt_jibun_rate_deno | 1 | 4.3%
Other values (13) | 13 | 56.5%

Most occurring characters

Value | Count | Frequency (%)
_ | 40 | 15.4%
N | 28 | 10.8%
D | 25 | 9.6%
L | 16 | 6.2%
G | 16 | 6.2%
R | 15 | 5.8%
T | 14 | 5.4%
B | 14 | 5.4%
A | 13 | 5.0%
O | 12 | 4.6%
Other values (16) | 67 | 25.8%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 218 | 83.8%
Connector Punctuation | 40 | 15.4%
Other Letter | 2 | 0.8%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
N | 28 | 12.8%
D | 25 | 11.5%
L | 16 | 7.3%
G | 16 | 7.3%
R | 15 | 6.9%
T | 14 | 6.4%
B | 14 | 6.4%
A | 13 | 6.0%
O | 12 | 5.5%
E | 12 | 5.5%
Other values (13) | 53 | 24.3%

Other Letter
Value | Count | Frequency (%)
컬 | 1 | 50.0%
럼 | 1 | 50.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 40 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 218 | 83.8%
Common | 40 | 15.4%
Hangul | 2 | 0.8%

Most frequent character per script

Latin
Value | Count | Frequency (%)
N | 28 | 12.8%
D | 25 | 11.5%
L | 16 | 7.3%
G | 16 | 7.3%
R | 15 | 6.9%
T | 14 | 6.4%
B | 14 | 6.4%
A | 13 | 6.0%
O | 12 | 5.5%
E | 12 | 5.5%
Other values (13) | 53 | 24.3%

Hangul
Value | Count | Frequency (%)
컬 | 1 | 50.0%
럼 | 1 | 50.0%

Common
Value | Count | Frequency (%)
_ | 40 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 258 | 99.2%
Hangul | 2 | 0.8%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
_ | 40 | 15.5%
N | 28 | 10.9%
D | 25 | 9.7%
L | 16 | 6.2%
G | 16 | 6.2%
R | 15 | 5.8%
T | 14 | 5.4%
B | 14 | 5.4%
A | 13 | 5.0%
O | 12 | 4.7%
Other values (14) | 65 | 25.2%

Hangul
Value | Count | Frequency (%)
컬 | 1 | 50.0%
럼 | 1 | 50.0%

Unnamed: 2
Text

MISSING 

Distinct: 29
Distinct (%): 100.0%
Missing: 1
Missing (%): 3.3%
Memory size: 372.0 B

Length

Max length: 79
Median length: 9
Mean length: 9.6896552
Min length: 1

Characters and Unicode

Total characters: 281
Distinct characters: 89
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 29
Unique (%): 100.0%

Sample

1st row: 최민선
2nd row: 행정업무
3rd row: 집합건물에 대한 대지권지분비율 정보
4th row: 컬럼명
5th row: 행정구역코드

Value | Count | Frequency (%)
대지권지분비율 | 2 | 6.2%
최민선 | 1 | 3.1%
집합건물일련번호 | 1 | 3.1%
adm_sect_cd,land_loc_cd,ledg_gbn,bobn,bubn,dong,flr,ho,sil,cbldg_seqno,dnst_gbn | 1 | 3.1%
인덱스키 | 1 | 3.1%
건물식별번호 | 1 | 3.1%
원천시군구코드 | 1 | 3.1%
연혁표시구분 | 1 | 3.1%
현재소유권연혁순번 | 1 | 3.1%
담당자id | 1 | 3.1%
Other values (21) | 21 | 65.6%

Most occurring characters

Value | Count | Frequency (%)
, | 16 | 5.7%
d | 14 | 5.0%
[Hangul character] | 14 | 5.0%
_ | 12 | 4.3%
l | 11 | 3.9%
n | 11 | 3.9%
c | 10 | 3.6%
[Hangul character] | 9 | 3.2%
o | 9 | 3.2%
[Hangul character] | 8 | 2.8%
Other values (79) | 167 | 59.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 146 | 52.0%
Lowercase Letter | 102 | 36.3%
Other Punctuation | 16 | 5.7%
Connector Punctuation | 12 | 4.3%
Space Separator | 3 | 1.1%
Uppercase Letter | 2 | 0.7%

Most frequent character per category

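Other Letter
Value | Count | Frequency (%)
[Hangul character] | 14 | 9.6%
[Hangul character] | 9 | 6.2%
[Hangul character] | 8 | 5.5%
[Hangul character] | 7 | 4.8%
[Hangul character] | 5 | 3.4%
[Hangul character] | 5 | 3.4%
[Hangul character] | 4 | 2.7%
[Hangul character] | 4 | 2.7%
[Hangul character] | 4 | 2.7%
[Hangul character] | 4 | 2.7%
Other values (56) | 82 | 56.2%

Lowercase Letter
Value | Count | Frequency (%)
d | 14 | 13.7%
l | 11 | 10.8%
n | 11 | 10.8%
c | 10 | 9.8%
o | 9 | 8.8%
b | 8 | 7.8%
g | 7 | 6.9%
s | 7 | 6.9%
e | 5 | 4.9%
a | 4 | 3.9%
Other values (8) | 16 | 15.7%

Uppercase Letter
Value | Count | Frequency (%)
I | 1 | 50.0%
D | 1 | 50.0%

Other Punctuation
Value | Count | Frequency (%)
, | 16 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 12 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 3 | 100.0%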

Most occurring scripts

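Value | Count | Frequency (%)
Hangul | 146 | 52.0%
Latin | 104 | 37.0%
Common | 31 | 11.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[Hangul character] | 14 | 9.6%
[Hangul character] | 9 | 6.2%
[Hangul character] | 8 | 5.5%
[Hangul character] | 7 | 4.8%
[Hangul character] | 5 | 3.4%
[Hangul character] | 5 | 3.4%
[Hangul character] | 4 | 2.7%
[Hangul character] | 4 | 2.7%
[Hangul character] | 4 | 2.7%
[Hangul character] | 4 | 2.7%
Other values (56) | 82 | 56.2%

Latin
Value | Count | Frequency (%)
d | 14 | 13.5%
l | 11 | 10.6%
n | 11 | 10.6%
c | 10 | 9.6%
o | 9 | 8.7%
b | 8 | 7.7%
g | 7 | 6.7%
s | 7 | 6.7%
e | 5 | 4.8%
a | 4 | 3.8%
Other values (10) | 18 | 17.3%

Common
Value | Count | Frequency (%)
, | 16 | 51.6%
_ | 12 | 38.7%
(space) | 3 | 9.7%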

Most occurring blocks

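Value | Count | Frequency (%)
Hangul | 146 | 52.0%
ASCII | 135 | 48.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
, | 16 | 11.9%
d | 14 | 10.4%
_ | 12 | 8.9%
l | 11 | 8.1%
n | 11 | 8.1%
c | 10 | 7.4%
o | 9 | 6.7%
b | 8 | 5.9%
g | 7 | 5.2%
s | 7 | 5.2%
Other values (13) | 30 | 22.2%

Hangul
Value | Count | Frequency (%)
[Hangul character] | 14 | 9.6%
[Hangul character] | 9 | 6.2%
[Hangul character] | 8 | 5.5%
[Hangul character] | 7 | 4.8%
[Hangul character] | 5 | 3.4%
[Hangul character] | 5 | 3.4%
[Hangul character] | 4 | 2.7%
[Hangul character] | 4 | 2.7%
[Hangul character] | 4 | 2.7%
[Hangul character] | 4 | 2.7%
Other values (56) | 82 | 56.2%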

Unnamed: 3
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 20.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 8
Median length: 4
Mean length: 5.5
Min length: 2

Unique

Unique: 3
Unique (%): 10.0%

Sample

1st row: <NA>
2nd row: 테이블ID
3rd row: <NA>
4th row: 타입
5th row: CHAR

Common Values

Value | Count | Frequency (%)
VARCHAR2 | 11 | 36.7%
CHAR | 10 | 33.3%
<NA> | 6 | 20.0%
테이블ID | 1 | 3.3%
타입 | 1 | 3.3%
NUMBER | 1 | 3.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
varchar2 | 11 | 36.7%
char | 10 | 33.3%
na | 6 | 20.0%
테이블id | 1 | 3.3%
타입 | 1 | 3.3%
number | 1 | 3.3%

Unnamed: 4
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 6
Missing (%): 20.0%
Memory size: 372.0 B

Unnamed: 5
Boolean

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 9.1%
Missing: 19
Missing (%): 63.3%
Memory size: 192.0 B
Value | Count | Frequency (%)
False | 11 | 36.7%
(Missing) | 19 | 63.3%
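
The two percentages for this variable use different denominators. The figures are consistent with Distinct (%) being taken over the 11 non-missing cells and Missing (%) over all 30 observations; that reading is inferred from the numbers, not stated in the report. A short check:

```python
n_observations = 30   # rows in the dataset
n_missing = 19        # missing cells in "Unnamed: 5"
n_distinct = 1        # only the value False occurs

n_present = n_observations - n_missing           # cells that hold a value
distinct_pct = 100 * n_distinct / n_present      # relative to present cells
missing_pct = 100 * n_missing / n_observations   # relative to all rows
print(f"distinct: {distinct_pct:.1f}%, missing: {missing_pct:.1f}%")
```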

Unnamed: 6
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 16.7%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 5
Median length: 4
Mean length: 3.9666667
Min length: 2

Unique

Unique: 2
Unique (%): 6.7%

Sample

1st row: 작성일
2nd row: 테이블명
3rd row: <NA>
4th row: PK/FK
5th row: PK/FK

Common Values

Value | Count | Frequency (%)
<NA> | 16 | 53.3%
PK/FK | 8 | 26.7%
PK | 4 | 13.3%
작성일 | 1 | 3.3%
테이블명 | 1 | 3.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 16 | 53.3%
pk/fk | 8 | 26.7%
pk | 4 | 13.3%
작성일 | 1 | 3.3%
테이블명 | 1 | 3.3%

Unnamed: 7
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 27
Missing (%): 90.0%
Memory size: 372.0 B

Unnamed: 8
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 9
Median length: 4
Mean length: 4.8666667
Min length: 4

Unique

Unique: 1
Unique (%): 3.3%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: 참조테이블명/비고
5th row: 집합건물전유부

Common Values

Value | Count | Frequency (%)
<NA> | 22 | 73.3%
집합건물전유부 | 7 | 23.3%
참조테이블명/비고 | 1 | 3.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 22 | 73.3%
집합건물전유부 | 7 | 23.3%
참조테이블명/비고 | 1 | 3.3%

Correlations

 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 6 | Unnamed: 8
Unnamed: 1 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
Unnamed: 2 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
Unnamed: 3 | 1.000 | 1.000 | 1.000 | 0.721 | 1.000
Unnamed: 6 | 1.000 | 1.000 | 0.721 | 1.000 | NaN
Unnamed: 8 | 1.000 | 1.000 | 1.000 | NaN | 1.000

 | Unnamed: 8 | Unnamed: 3 | Unnamed: 6
Unnamed: 8 | 1.000 | 0.913 | 1.000
Unnamed: 3 | 0.913 | 1.000 | 0.717
Unnamed: 6 | 1.000 | 0.717 | 1.000

 | Unnamed: 3 | Unnamed: 6 | Unnamed: 8
Unnamed: 3 | 1.000 | 0.717 | 0.913
Unnamed: 6 | 0.717 | 1.000 | 1.000
Unnamed: 8 | 0.913 | 1.000 | 1.000
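
The report text does not state which association measure each matrix uses. For pairs of categorical variables a common choice is Cramér's V, so the sketch below assumes that measure and implements the uncorrected form in plain Python. It is not claimed to reproduce the 0.717 or 0.913 values above; `cramers_v` and the toy inputs are hypothetical.

```python
from collections import Counter
from math import sqrt

def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length sequences of categories."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed count of each (x, y) pair
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    for x, rx in row_totals.items():
        for y, cy in col_totals.items():
            expected = rx * cy / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else float("nan")

# Toy inputs: the first pair is perfectly associated, the second independent.
perfect = cramers_v(["A", "A", "B", "B"], ["x", "x", "y", "y"])
independent = cramers_v(["A", "A", "B", "B"], ["x", "y", "x", "y"])
print(perfect, independent)
```

Values near 1 indicate that one variable almost determines the other. With only 30 observations and several rare categories, high values such as those above should be read with caution.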

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
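
Nullity correlation can be sketched as the Pearson correlation between the missing-value indicators of two columns. The example below is a plain-Python illustration under that assumption; `nullity_correlation` and the toy columns are hypothetical, with `None` marking a missing cell.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # a column that is never or always missing has no variance
    return cov / sqrt(var_a * var_b)

# Toy columns: the first two are missing in the same rows, the third is not.
pk_flag = ["N", "N", None, None]
ref_table = ["t", "t", None, None]
default = [None, "x", None, "y"]

same_pattern = nullity_correlation(pk_flag, ref_table)
unrelated = nullity_correlation(pk_flag, default)
print(same_pattern, unrelated)
```

A value of 1 means the two columns are always missing together, and a value near 0 means the presence of one says little about the other.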

Sample

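First rows

(index) | 테이블정의서 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8
0 | 작성자 | <NA> | 최민선 | <NA> | NaN | <NA> | 작성일 | 2016-01-19 00:00:00 | <NA>
1 | 주제영역명 | <NA> | 행정업무 | 테이블ID | ABPD_CBLDG_LAND_RGT_JIBUN_RATE | <NA> | 테이블명 | 집합건물대지권지분비율 | <NA>
2 | 테이블설명 | <NA> | 집합건물에 대한 대지권지분비율 정보 | <NA> | NaN | <NA> | <NA> | NaN | <NA>
3 | No | 컬럼ID | 컬럼명 | 타입 | 길이(Byte) | <NA> | PK/FK | Default | 참조테이블명/비고
4 | 1 | ADM_SECT_CD | 행정구역코드 | CHAR | 5 | N | PK/FK | NaN | 집합건물전유부
5 | 2 | LAND_LOC_CD | 토지소재지코드 | CHAR | 5 | N | PK/FK | NaN | 집합건물전유부
6 | 3 | DONG |  | VARCHAR2 | 150 | N | PK/FK | NaN | 집합건물전유부
7 | 4 | FLR |  | VARCHAR2 | 150 | N | PK/FK | NaN | 집합건물전유부
8 | 5 | HO |  | VARCHAR2 | 150 | N | PK/FK | NaN | 집합건물전유부
9 | 6 | SIL |  | VARCHAR2 | 150 | N | PK/FK | NaN | 집합건물전유부

Last rows

(index) | 테이블정의서 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8
20 | 17 | LAND_RGT_END_YMD | 대지권종료일자 | VARCHAR2 | 8 | <NA> | <NA> | NaN | <NA>
21 | 18 | CHRG_MAN_ID | 담당자ID | VARCHAR2 | 20 | <NA> | <NA> | NaN | <NA>
22 | 19 | PRSN_OWN_RGT_HIST_ODRNO | 현재소유권연혁순번 | CHAR | 4 | <NA> | <NA> | NaN | <NA>
23 | 20 | HIST_EXPR_GBN | 연혁표시구분 | CHAR | 1 | <NA> | <NA> | NaN | <NA>
24 | 21 | COL_ADM_SECT_CD | 원천시군구코드 | VARCHAR2 | 5 | <NA> | <NA> | NaN | <NA>
25 | 22 | BLDG_GBN_NO | 건물식별번호 | NUMBER | 28 | <NA> | <NA> | NaN | <NA>
26 | 인덱스명 | <NA> | 인덱스키 | <NA> | NaN | <NA> | <NA> | NaN | <NA>
27 | ABPD_CBLDG_LAND_RGT_JIBUN_RATE_PK | <NA> | adm_sect_cd,land_loc_cd,ledg_gbn,bobn,bubn,dong,flr,ho,sil,cbldg_seqno,dnst_gbn | <NA> | NaN | <NA> | <NA> | NaN | <NA>
28 | ABPD_CBLDG_LAND_RGT_JIBUN_RATE_FK01 | <NA> | adm_sect_cd,land_loc_cd,dong,flr,ho,sil,cbldg_seqno | <NA> | NaN | <NA> | <NA> | NaN | <NA>
29 | 업무규칙 | <NA> | <NA> | <NA> | NaN | <NA> | <NA> | NaN | <NA>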
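
The `Unnamed: N` column names and the row at index 3 (No, 컬럼ID, 컬럼명, 타입, ...) suggest that the sheet's real header sits a few rows below the top, after a block of document metadata. The sketch below shows one way to promote that row to column names, using a few rows copied from the sample. It is a plain-Python illustration; `promote_header` and the `col_` fallback name are hypothetical. In pandas, the `header` or `skiprows` arguments of `read_excel` would likely serve the same purpose.

```python
# Rows copied from the sample (indices 3 to 5); None stands for <NA> or NaN.
# 컬럼ID = column ID, 컬럼명 = column name, 타입 = type, 길이 = length.
raw_rows = [
    ["No", "컬럼ID", "컬럼명", "타입", "길이(Byte)", None, "PK/FK", "Default", "참조테이블명/비고"],
    [1, "ADM_SECT_CD", "행정구역코드", "CHAR", 5, "N", "PK/FK", None, "집합건물전유부"],
    [2, "LAND_LOC_CD", "토지소재지코드", "CHAR", 5, "N", "PK/FK", None, "집합건물전유부"],
]

def promote_header(rows, header_marker="No"):
    """Use the row whose first cell equals header_marker as the column names."""
    start = next(i for i, row in enumerate(rows) if row[0] == header_marker)
    # Unlabelled header cells get a positional fallback name.
    header = [name if name is not None else f"col_{j}"
              for j, name in enumerate(rows[start])]
    return [dict(zip(header, row)) for row in rows[start + 1:]]

columns = promote_header(raw_rows)
for col in columns:
    print(col["컬럼ID"], col["타입"], col["길이(Byte)"])
```

Profiling the table after such a step would describe the documented columns themselves instead of the layout of the definition sheet.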