Overview

Dataset statistics

Number of variables: 9
Number of observations: 21
Missing cells: 89
Missing cells (%): 47.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.6 KiB
Average record size in memory: 79.3 B
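
The derived figures in the dataset statistics can be checked from the raw counts. The sketch below is only an illustration: the variable names are not part of the report, and it assumes the average record size is the total memory divided by the number of observations.

```python
# Figures copied from the dataset statistics; the names are illustrative.
n_variables = 9
n_observations = 21
missing_cells = 89
avg_record_bytes = 79.3

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: total memory = average record size * number of observations.
total_kib = avg_record_bytes * n_observations / 1024

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"total size: {total_kib:.1f} KiB")
```

Nearly half of the cells are empty, which fits a spreadsheet-style layout where most rows fill only a few columns.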

Variable types

Unsupported: 4
Text: 4
Categorical: 1

Dataset

Description: Individual housing price information (개별주택 가격정보)
Author: Ministry of Land, Infrastructure and Transport (국토교통부)
URL: https://www.vworld.kr/dtmk/dtmk_ntads_s002.do?dsId=30520

Alerts

Unnamed: 8 has constant value "참조테이블명/비고" (Constant)
Unnamed: 1 has 6 (28.6%) missing values (Missing)
Unnamed: 2 has 1 (4.8%) missing values (Missing)
Unnamed: 4 has 5 (23.8%) missing values (Missing)
Unnamed: 5 has 21 (100.0%) missing values (Missing)
Unnamed: 6 has 18 (85.7%) missing values (Missing)
Unnamed: 7 has 18 (85.7%) missing values (Missing)
Unnamed: 8 has 20 (95.2%) missing values (Missing)
테이블정의서 (table definition document) is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)
Unnamed: 4 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)
Unnamed: 5 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)
Unnamed: 7 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)

Reproduction

Analysis started: 2024-04-16 02:28:54.680732
Analysis finished: 2024-04-16 02:28:56.502381
Duration: 1.82 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

테이블정의서
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Unnamed: 1
Text

MISSING 

Distinct: 15
Distinct (%): 100.0%
Missing: 6
Missing (%): 28.6%
Memory size: 300.0 B

Length

Max length: 17
Median length: 11
Mean length: 8.6
Min length: 3

Characters and Unicode

Total characters: 129
Distinct characters: 25
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
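
As a rough illustration of how such character properties can be read, the sketch below uses Python's standard `unicodedata` module to count general categories over the values of Unnamed: 1. The value list is transcribed from the Sample and frequency tables of this report. `PJJI_YN` is an assumption: it is upper-cased from the `pjji_yn` entry in the frequency table, because that row does not appear in the sample. This is not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# The 15 non-missing values of Unnamed: 1, transcribed from the report.
# Assumption: "PJJI_YN" is inferred from the lower-cased "pjji_yn" frequency entry.
values = [
    "컬럼ID", "PNU", "BILD_REGSTR_UNQNO", "DONG_NO", "PANN_YEAR",
    "STDMT", "POTVALE", "PJJI_YN", "PANN_GBN", "LNDBUK_AREA",
    "CALC_LAREA", "HPRC_GAREA", "RES_AREA", "COL_ADM_SECT_CD", "PANN_YMD",
]

text = "".join(values)

# General category per code point: Lu = Uppercase Letter,
# Pc = Connector Punctuation, Lo = Other Letter (Hangul syllables).
category_counts = Counter(unicodedata.category(ch) for ch in text)
char_counts = Counter(text)

print("total characters:", len(text))
print("categories:", dict(category_counts))
print("most common character:", char_counts.most_common(1))
```

With these values the category counts agree with the "Most occurring categories" figures reported for this variable.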

Unique

Unique: 15
Unique (%): 100.0%

Sample

1st row: 컬럼ID (column ID)
2nd row: PNU
3rd row: BILD_REGSTR_UNQNO
4th row: DONG_NO
5th row: PANN_YEAR
Value | Count | Frequency (%)
컬럼id | 1 | 6.7%
pnu | 1 | 6.7%
bild_regstr_unqno | 1 | 6.7%
dong_no | 1 | 6.7%
pann_year | 1 | 6.7%
stdmt | 1 | 6.7%
potvale | 1 | 6.7%
pjji_yn | 1 | 6.7%
pann_gbn | 1 | 6.7%
lndbuk_area | 1 | 6.7%
Other values (5) | 5 | 33.3%

Most occurring characters

Value | Count | Frequency (%)
A | 15 | 11.6%
N | 14 | 10.9%
_ | 14 | 10.9%
R | 9 | 7.0%
E | 9 | 7.0%
D | 8 | 6.2%
P | 7 | 5.4%
C | 6 | 4.7%
L | 6 | 4.7%
O | 5 | 3.9%
Other values (15) | 36 | 27.9%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 113 | 87.6%
Connector Punctuation | 14 | 10.9%
Other Letter | 2 | 1.6%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
A | 15 | 13.3%
N | 14 | 12.4%
R | 9 | 8.0%
E | 9 | 8.0%
D | 8 | 7.1%
P | 7 | 6.2%
C | 6 | 5.3%
L | 6 | 5.3%
O | 5 | 4.4%
T | 5 | 4.4%
Other values (12) | 29 | 25.7%

Other Letter
Value | Count | Frequency (%)
컬 | 1 | 50.0%
럼 | 1 | 50.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 14 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 113 | 87.6%
Common | 14 | 10.9%
Hangul | 2 | 1.6%

Most frequent character per script

Latin
Value | Count | Frequency (%)
A | 15 | 13.3%
N | 14 | 12.4%
R | 9 | 8.0%
E | 9 | 8.0%
D | 8 | 7.1%
P | 7 | 6.2%
C | 6 | 5.3%
L | 6 | 5.3%
O | 5 | 4.4%
T | 5 | 4.4%
Other values (12) | 29 | 25.7%

Hangul
Value | Count | Frequency (%)
컬 | 1 | 50.0%
럼 | 1 | 50.0%

Common
Value | Count | Frequency (%)
_ | 14 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 127 | 98.4%
Hangul | 2 | 1.6%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
A | 15 | 11.8%
N | 14 | 11.0%
_ | 14 | 11.0%
R | 9 | 7.1%
E | 9 | 7.1%
D | 8 | 6.3%
P | 7 | 5.5%
C | 6 | 4.7%
L | 6 | 4.7%
O | 5 | 3.9%
Other values (13) | 34 | 26.8%

Hangul
Value | Count | Frequency (%)
컬 | 1 | 50.0%
럼 | 1 | 50.0%

Unnamed: 2
Text

MISSING 

Distinct: 20
Distinct (%): 100.0%
Missing: 1
Missing (%): 4.8%
Memory size: 300.0 B

Length

Max length: 49
Median length: 9
Mean length: 7.1
Min length: 3

Characters and Unicode

Total characters: 142
Distinct characters: 76
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 20
Unique (%): 100.0%

Sample

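1st row: 허재민 (a personal name)
2nd row: 가격업무 (pricing operations)
3rd row: 개별주택 가격정보 (individual housing price information)
4th row: 컬럼명 (column name)
5th row: 토지코드 (land code)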
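Value | Count | Frequency (%)
허재민 (a personal name) | 1 | 4.0%
토지대장면적 (land register area) | 1 | 4.0%
pann_year | 1 | 4.0%
dong_no | 1 | 4.0%
bild_regstr_unqno | 1 | 4.0%
pnu | 1 | 4.0%
인덱스키 (index key) | 1 | 4.0%
공시일자 (announcement date) | 1 | 4.0%
원천시군구코드 (source city/county/district code) | 1 | 4.0%
주거면적 (residential area) | 1 | 4.0%
Other values (15) | 15 | 60.0%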

Most occurring characters

ValueCountFrequency (%)
N 7
 
4.9%
5
 
3.5%
5
 
3.5%
_ 4
 
2.8%
4
 
2.8%
, 4
 
2.8%
4
 
2.8%
4
 
2.8%
4
 
2.8%
4
 
2.8%
Other values (66) 97
68.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 92 | 64.8%
Uppercase Letter | 37 | 26.1%
Space Separator | 5 | 3.5%
Connector Punctuation | 4 | 2.8%
Other Punctuation | 4 | 2.8%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5
 
5.4%
4
 
4.3%
4
 
4.3%
4
 
4.3%
4
 
4.3%
4
 
4.3%
4
 
4.3%
3
 
3.3%
3
 
3.3%
2
 
2.2%
Other values (46) 55
59.8%
Uppercase Letter
ValueCountFrequency (%)
N 7
18.9%
T 3
 
8.1%
O 3
 
8.1%
R 3
 
8.1%
D 3
 
8.1%
U 2
 
5.4%
P 2
 
5.4%
E 2
 
5.4%
G 2
 
5.4%
S 2
 
5.4%
Other values (7) 8
21.6%
Space Separator
ValueCountFrequency (%)
5
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 4
100.0%
Other Punctuation
ValueCountFrequency (%)
, 4
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 92 | 64.8%
Latin | 37 | 26.1%
Common | 13 | 9.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5
 
5.4%
4
 
4.3%
4
 
4.3%
4
 
4.3%
4
 
4.3%
4
 
4.3%
4
 
4.3%
3
 
3.3%
3
 
3.3%
2
 
2.2%
Other values (46) 55
59.8%
Latin
ValueCountFrequency (%)
N 7
18.9%
T 3
 
8.1%
O 3
 
8.1%
R 3
 
8.1%
D 3
 
8.1%
U 2
 
5.4%
P 2
 
5.4%
E 2
 
5.4%
G 2
 
5.4%
S 2
 
5.4%
Other values (7) 8
21.6%
Common
ValueCountFrequency (%)
5
38.5%
_ 4
30.8%
, 4
30.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 92 | 64.8%
ASCII | 50 | 35.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 7
14.0%
5
 
10.0%
_ 4
 
8.0%
, 4
 
8.0%
T 3
 
6.0%
O 3
 
6.0%
R 3
 
6.0%
D 3
 
6.0%
U 2
 
4.0%
P 2
 
4.0%
Other values (10) 14
28.0%
Hangul
ValueCountFrequency (%)
5
 
5.4%
4
 
4.3%
4
 
4.3%
4
 
4.3%
4
 
4.3%
4
 
4.3%
4
 
4.3%
3
 
3.3%
3
 
3.3%
2
 
2.2%
Other values (46) 55
59.8%

Unnamed: 3
Categorical

Distinct: 6
Distinct (%): 28.6%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Length

Max length: 8
Median length: 6
Mean length: 5.5714286
Min length: 2

Unique

Unique: 2
Unique (%): 9.5%
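
The distinct, unique, and length figures for this variable can be reproduced from its value counts. The following is a minimal sketch, assuming that "distinct" means the number of different values, that "unique" means the number of values occurring exactly once, and that both percentages are taken over all 21 observations. The counts are copied from the Common Values table of this variable, and the variable names are illustrative.

```python
from collections import Counter
from statistics import mean, median

# Value counts copied from the Common Values table of Unnamed: 3.
# "<NA>" is kept as a literal string because the report counts it as a category.
common_values = {
    "VARCHAR2": 6,
    "<NA>": 5,
    "NUMBER": 5,
    "CHAR": 3,
    "테이블ID": 1,
    "타입": 1,
}

# Expand the counts back into one value per observation.
values = [value for value, count in common_values.items() for _ in range(count)]
counts = Counter(values)

n = len(values)
distinct = len(counts)                                  # different values
unique = sum(1 for c in counts.values() if c == 1)      # values seen exactly once
lengths = [len(value) for value in values]              # length in code points

print(f"distinct: {distinct} ({100 * distinct / n:.1f}%)")
print(f"unique: {unique} ({100 * unique / n:.1f}%)")
print(f"length: min={min(lengths)} median={median(lengths)} "
      f"mean={mean(lengths):.7f} max={max(lengths)}")
```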

Sample

1st row: <NA>
2nd row: 테이블ID (table ID)
3rd row: <NA>
4th row: 타입 (type)
5th row: VARCHAR2

Common Values

Value | Count | Frequency (%)
VARCHAR2 | 6 | 28.6%
<NA> | 5 | 23.8%
NUMBER | 5 | 23.8%
CHAR | 3 | 14.3%
테이블ID (table ID) | 1 | 4.8%
타입 (type) | 1 | 4.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
varchar2 | 6 | 28.6%
na | 5 | 23.8%
number | 5 | 23.8%
char | 3 | 14.3%
테이블id | 1 | 4.8%
타입 | 1 | 4.8%

Unnamed: 4
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 5
Missing (%): 23.8%
Memory size: 300.0 B

Unnamed: 5
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 21
Missing (%): 100.0%
Memory size: 321.0 B

Unnamed: 6
Text

MISSING 

Distinct: 3
Distinct (%): 100.0%
Missing: 18
Missing (%): 85.7%
Memory size: 300.0 B

Length

Max length: 5
Median length: 4
Mean length: 4
Min length: 3

Characters and Unicode

Total characters: 12
Distinct characters: 11
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 100.0%

Sample

1st row: 작성일 (date written)
2nd row: 테이블명 (table name)
3rd row: PK/FK
Value | Count | Frequency (%)
작성일 | 1 | 33.3%
테이블명 | 1 | 33.3%
pk/fk | 1 | 33.3%

Most occurring characters

ValueCountFrequency (%)
K 2
16.7%
1
8.3%
1
8.3%
1
8.3%
1
8.3%
1
8.3%
1
8.3%
1
8.3%
P 1
8.3%
/ 1
8.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 7
58.3%
Uppercase Letter 4
33.3%
Other Punctuation 1
 
8.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%
Uppercase Letter
ValueCountFrequency (%)
K 2
50.0%
P 1
25.0%
F 1
25.0%
Other Punctuation
ValueCountFrequency (%)
/ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 7
58.3%
Latin 4
33.3%
Common 1
 
8.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%
Latin
ValueCountFrequency (%)
K 2
50.0%
P 1
25.0%
F 1
25.0%
Common
ValueCountFrequency (%)
/ 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 7
58.3%
ASCII 5
41.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
K 2
40.0%
P 1
20.0%
/ 1
20.0%
F 1
20.0%
Hangul
ValueCountFrequency (%)
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%

Unnamed: 7
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 18
Missing (%): 85.7%
Memory size: 300.0 B

Unnamed: 8
Text

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 100.0%
Missing: 20
Missing (%): 95.2%
Memory size: 300.0 B

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 9
Distinct characters: 9
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 참조테이블명/비고 (referenced table name / remarks)
Value | Count | Frequency (%)
참조테이블명/비고 | 1 | 100.0%

Most occurring characters

ValueCountFrequency (%)
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
/ 1
11.1%
1
11.1%
1
11.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 8
88.9%
Other Punctuation 1
 
11.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Other Punctuation
ValueCountFrequency (%)
/ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 8
88.9%
Common 1
 
11.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Common
ValueCountFrequency (%)
/ 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 8
88.9%
ASCII 1
 
11.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
ASCII
ValueCountFrequency (%)
/ 1
100.0%

Correlations

Variable | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 6
Unnamed: 1 | 1.000 | 1.000 | 1.000 | NaN
Unnamed: 2 | 1.000 | 1.000 | 1.000 | 1.000
Unnamed: 3 | 1.000 | 1.000 | 1.000 | 0.000
Unnamed: 6 | NaN | 1.000 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
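
One common way to obtain such a nullity correlation is to turn each column into a 0/1 missing indicator and take the Pearson correlation between the indicators. The sketch below does this in plain Python on hypothetical toy data (the columns `a`, `b`, `c` are not from this dataset). It illustrates the idea only and does not claim to reproduce the report's plotting code.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; NaN if either is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


def nullity_correlation(columns):
    """Correlate missing-value indicators (1 = missing) for every pair of columns."""
    indicators = {
        name: [1 if value is None else 0 for value in column]
        for name, column in columns.items()
    }
    return {
        (a, b): pearson(indicators[a], indicators[b])
        for a in indicators
        for b in indicators
    }


# Hypothetical toy data; None marks a missing cell.
toy = {
    "a": [1, None, 3, None],
    "b": ["x", None, "y", None],   # missing exactly where "a" is missing
    "c": [None, 2.0, None, 4.0],   # missing exactly where "a" is present
}

corr = nullity_correlation(toy)
print(corr[("a", "b")], corr[("a", "c")])
```

A value near 1 means two columns tend to be missing together, and a value near -1 means one tends to be present when the other is missing. Columns that are always present or always missing have a constant indicator, so no correlation is defined for them.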

Sample

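First rows

Index | 테이블정의서 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8
0 | 작성자 (author) | <NA> | 허재민 | <NA> | NaN | <NA> | 작성일 (date written) | 2016-01-19 00:00:00 | <NA>
1 | 주제영역명 (subject area name) | <NA> | 가격업무 (pricing operations) | 테이블ID (table ID) | APMM_HP_PRC_MNG | <NA> | 테이블명 (table name) | 개별주택 가격정보 (individual housing price information) | <NA>
2 | 테이블설명 (table description) | <NA> | 개별주택 가격정보 (individual housing price information) | <NA> | NaN | <NA> | <NA> | NaN | <NA>
3 | No | 컬럼ID (column ID) | 컬럼명 (column name) | 타입 (type) | 길이(Byte) (length in bytes) | <NA> | PK/FK | Default | 참조테이블명/비고 (referenced table name / remarks)
4 | 1 | PNU | 토지코드 (land code) | VARCHAR2 | 19 | <NA> | <NA> | NaN | <NA>
5 | 2 | BILD_REGSTR_UNQNO | 건축물대장고유번호 (building register unique number) | VARCHAR2 | 19 | <NA> | <NA> | NaN | <NA>
6 | 3 | DONG_NO | 동번호 (building number) | VARCHAR2 | 5 | <NA> | <NA> | NaN | <NA>
7 | 4 | PANN_YEAR | 공시년도 (announcement year) | VARCHAR2 | 4 | <NA> | <NA> | NaN | <NA>
8 | 5 | STDMT | 기준월 (reference month) | CHAR | 2 | <NA> | <NA> | NaN | <NA>
9 | 6 | POTVALE | 공시가격 (announced price) | NUMBER | 13 | <NA> | <NA> | NaN | <NA>

Last rows

Index | 테이블정의서 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8
11 | 8 | PANN_GBN | 공시구분 (announcement classification) | CHAR | 1 | <NA> | <NA> | NaN | <NA>
12 | 9 | LNDBUK_AREA | 토지대장면적 (land register area) | NUMBER | 13,2 | <NA> | <NA> | NaN | <NA>
13 | 10 | CALC_LAREA | 산정대지면적 (calculated site area) | NUMBER | 13,2 | <NA> | <NA> | NaN | <NA>
14 | 11 | HPRC_GAREA | 주택가격연면적 (housing price total floor area) | NUMBER | 13,2 | <NA> | <NA> | NaN | <NA>
15 | 12 | RES_AREA | 주거면적 (residential area) | NUMBER | 13,2 | <NA> | <NA> | NaN | <NA>
16 | 13 | COL_ADM_SECT_CD | 원천시군구코드 (source city/county/district code) | VARCHAR2 | 5 | <NA> | <NA> | NaN | <NA>
17 | 14 | PANN_YMD | 공시일자 (announcement date) | VARCHAR2 | 8 | <NA> | <NA> | NaN | <NA>
18 | 인덱스명 (index name) | <NA> | 인덱스키 (index key) | <NA> | NaN | <NA> | <NA> | NaN | <NA>
19 | APMM_HP_PRC_MNG_INX1 | <NA> | PNU, BILD_REGSTR_UNQNO, DONG_NO, PANN_YEAR, STDMT | <NA> | NaN | <NA> | <NA> | NaN | <NA>
20 | 업무규칙 (business rules) | <NA> | <NA> | <NA> | NaN | <NA> | <NA> | NaN | <NA>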
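
The sample suggests that the sheet's title row was read as the header, and that the real column header (No, 컬럼ID, 컬럼명, ...) sits a few rows further down. This would explain the `Unnamed: N` column names and the unsupported-type alerts. As a hedged sketch of the cleaning step, the code below locates that embedded header row and collects the column definitions beneath it. It assumes the rows are already available as Python lists, with `None` for `<NA>`/NaN and integer running numbers. The function name `extract_column_definitions` is hypothetical, and only a few rows are transcribed.

```python
# Rows transcribed from the Sample table; None stands in for <NA> / NaN.
# Only a few rows are included to keep the sketch short.
raw_rows = [
    ["작성자", None, "허재민", None, None, None, "작성일", "2016-01-19 00:00:00", None],
    ["No", "컬럼ID", "컬럼명", "타입", "길이(Byte)", None, "PK/FK", "Default", "참조테이블명/비고"],
    [1, "PNU", "토지코드", "VARCHAR2", 19, None, None, None, None],
    [6, "POTVALE", "공시가격", "NUMBER", 13, None, None, None, None],
    [9, "LNDBUK_AREA", "토지대장면적", "NUMBER", "13,2", None, None, None, None],
    ["인덱스명", None, "인덱스키", None, None, None, None, None, None],
]


def extract_column_definitions(rows):
    """Return one dict per column definition found below the embedded header row."""
    # The embedded header is the row whose first cell is "No".
    header_index = next(i for i, row in enumerate(rows) if row[0] == "No")
    header = rows[header_index]
    # Skip positions without a header label (the all-missing Unnamed: 5 column).
    keep = [j for j, label in enumerate(header) if label is not None]

    definitions = []
    for row in rows[header_index + 1:]:
        # Definition rows start with a running number; anything else ends the block.
        if not isinstance(row[0], int):
            break
        definitions.append({header[j]: row[j] for j in keep})
    return definitions


definitions = extract_column_definitions(raw_rows)
for definition in definitions:
    print(definition["컬럼ID"], definition["타입"], definition["길이(Byte)"])
```

Profiling the extracted definitions rather than the raw sheet would give each column a single meaning and a consistent type.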