Overview

Dataset statistics

Number of variables: 8
Number of observations: 45
Missing cells: 91
Missing cells (%): 25.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.0 KiB
Average record size in memory: 67.9 B
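
The missing-cell percentage follows directly from the counts in this report: 8 variables times 45 observations gives 360 cells, and the per-column missing counts listed under Alerts (44, 44, and 3) add up to the 91 missing cells. A minimal Python check, with illustrative variable names that are not part of the report:

```python
# Figures copied from the dataset statistics and alerts in this report.
n_variables = 8
n_observations = 45
missing_cells = 91
per_column_missing = {"테이블영문명": 44, "데이블한글명": 44, "길이": 3}

total_cells = n_variables * n_observations        # every cell in the table
missing_pct = 100 * missing_cells / total_cells   # share of empty cells

# The per-column counts should account for every missing cell.
assert sum(per_column_missing.values()) == missing_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```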

Variable types

Text: 4
Numeric: 1
Categorical: 1
Unsupported: 1
Boolean: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-1323/S/1/datasetView.do

Alerts

테이블영문명 (table English name) has constant value "TN_NRSTR_ND_OBT" (Constant)
데이블한글명 (table Korean name) has constant value "보호수및노거수" (Constant)
데이터타입 (data type) is highly imbalanced (50.1%) (Imbalance)
Null is highly imbalanced (84.6%) (Imbalance)
테이블영문명 has 44 (97.8%) missing values (Missing)
데이블한글명 has 44 (97.8%) missing values (Missing)
길이 (length) has 3 (6.7%) missing values (Missing)
컬럼순서 (column order) has unique values (Unique)
컬럼영문명 (column English name) has unique values (Unique)
컬럼한글명 (column Korean name) has unique values (Unique)
길이 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)
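
The two imbalance percentages are consistent with one minus the normalized Shannon entropy of the value counts reported for each variable (33/9/1/1/1 for 데이터타입, 44/1 for Null). The sketch below is an assumption about how such a score can be computed, not a copy of the profiler's own code; `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the counts.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(probs) < 2:
        return 1.0  # single class: treated as maximally imbalanced (assumption)
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(probs))

data_type_counts = [33, 9, 1, 1, 1]  # NVARCHAR2, NUMBER, NCLOB, VARCHAR2, DATE
null_counts = [44, 1]                # True, False

print(f"{imbalance_score(data_type_counts):.1%}")
print(f"{imbalance_score(null_counts):.1%}")
```

Under this formulation both scores round to the percentages shown in the alerts.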

Reproduction

Analysis started: 2023-12-11 06:09:11.635891
Analysis finished: 2023-12-11 06:09:12.425452
Duration: 0.79 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

테이블영문명 (table English name)
Text

CONSTANT, MISSING

Distinct: 1
Distinct (%): 100.0%
Missing: 44
Missing (%): 97.8%
Memory size: 492.0 B

Length

Max length: 15
Median length: 15
Mean length: 15
Min length: 15

Characters and Unicode

Total characters: 15
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: TN_NRSTR_ND_OBT

Value | Count | Frequency (%)
tn_nrstr_nd_obt | 1 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
T | 3 | 20.0%
N | 3 | 20.0%
_ | 3 | 20.0%
R | 2 | 13.3%
S | 1 | 6.7%
D | 1 | 6.7%
O | 1 | 6.7%
B | 1 | 6.7%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 12 | 80.0%
Connector Punctuation | 3 | 20.0%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
T | 3 | 25.0%
N | 3 | 25.0%
R | 2 | 16.7%
S | 1 | 8.3%
D | 1 | 8.3%
O | 1 | 8.3%
B | 1 | 8.3%

Connector Punctuation
Value | Count | Frequency (%)
_ | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 12 | 80.0%
Common | 3 | 20.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
T | 3 | 25.0%
N | 3 | 25.0%
R | 2 | 16.7%
S | 1 | 8.3%
D | 1 | 8.3%
O | 1 | 8.3%
B | 1 | 8.3%

Common
Value | Count | Frequency (%)
_ | 3 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 15 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
T | 3 | 20.0%
N | 3 | 20.0%
_ | 3 | 20.0%
R | 2 | 13.3%
S | 1 | 6.7%
D | 1 | 6.7%
O | 1 | 6.7%
B | 1 | 6.7%

데이블한글명 (table Korean name)
Text

CONSTANT, MISSING

Distinct: 1
Distinct (%): 100.0%
Missing: 44
Missing (%): 97.8%
Memory size: 492.0 B

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 7
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 보호수및노거수 (protected trees and old large trees)

Value | Count | Frequency (%)
보호수및노거수 | 1 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
수 | 2 | 28.6%
보 | 1 | 14.3%
호 | 1 | 14.3%
및 | 1 | 14.3%
노 | 1 | 14.3%
거 | 1 | 14.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 7 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
수 | 2 | 28.6%
보 | 1 | 14.3%
호 | 1 | 14.3%
및 | 1 | 14.3%
노 | 1 | 14.3%
거 | 1 | 14.3%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 7 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
수 | 2 | 28.6%
보 | 1 | 14.3%
호 | 1 | 14.3%
및 | 1 | 14.3%
노 | 1 | 14.3%
거 | 1 | 14.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 7 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
수 | 2 | 28.6%
보 | 1 | 14.3%
호 | 1 | 14.3%
및 | 1 | 14.3%
노 | 1 | 14.3%
거 | 1 | 14.3%

컬럼순서 (column order)
Real number (ℝ)

UNIQUE

Distinct: 45
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23
Minimum: 1
Maximum: 45
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 537.0 B

Quantile statistics

Minimum: 1
5-th percentile: 3.2
Q1: 12
Median: 23
Q3: 34
95-th percentile: 42.8
Maximum: 45
Range: 44
Interquartile range (IQR): 22

Descriptive statistics

Standard deviation: 13.133926
Coefficient of variation (CV): 0.57104024
Kurtosis: -1.2
Mean: 23
Median Absolute Deviation (MAD): 11
Skewness: 0
Sum: 1035
Variance: 172.5
Monotonicity: Strictly increasing
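
Because 컬럼순서 (column order) takes each integer from 1 to 45 exactly once, most of the statistics above can be reproduced by hand. The sketch below uses only the Python standard library; `percentile` is a hypothetical helper, and its linear interpolation is assumed (not confirmed) to match the method used by the report.

```python
import math
import statistics

values = list(range(1, 46))  # 컬럼순서 runs from 1 to 45, strictly increasing

def percentile(sorted_values, q):
    """Linear-interpolation percentile for q in [0, 1] (assumed method)."""
    pos = q * (len(sorted_values) - 1)
    lower = math.floor(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std_dev = math.sqrt(variance)
cv = std_dev / mean                     # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, variance, mad)
print(round(std_dev, 6), round(cv, 8))
print(percentile(values, 0.05), percentile(values, 0.25), percentile(values, 0.95))
```

The sample variance of the integers 1..n is n(n+1)/12, which for n = 45 gives 172.5, matching the reported value.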
Histogram with fixed size bins (bins=45)

Common values

Value | Count | Frequency (%)
1 | 1 | 2.2%
35 | 1 | 2.2%
26 | 1 | 2.2%
27 | 1 | 2.2%
28 | 1 | 2.2%
29 | 1 | 2.2%
30 | 1 | 2.2%
31 | 1 | 2.2%
32 | 1 | 2.2%
33 | 1 | 2.2%
Other values (35) | 35 | 77.8%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 2.2%
2 | 1 | 2.2%
3 | 1 | 2.2%
4 | 1 | 2.2%
5 | 1 | 2.2%
6 | 1 | 2.2%
7 | 1 | 2.2%
8 | 1 | 2.2%
9 | 1 | 2.2%
10 | 1 | 2.2%

Maximum 10 values

Value | Count | Frequency (%)
45 | 1 | 2.2%
44 | 1 | 2.2%
43 | 1 | 2.2%
42 | 1 | 2.2%
41 | 1 | 2.2%
40 | 1 | 2.2%
39 | 1 | 2.2%
38 | 1 | 2.2%
37 | 1 | 2.2%
36 | 1 | 2.2%

컬럼영문명 (column English name)
Text

UNIQUE

Distinct: 45
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 492.0 B

Length

Max length: 10
Median length: 9
Mean length: 7.6
Min length: 3

Characters and Unicode

Total characters: 342
Distinct characters: 24
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 45
Unique (%): 100.0%

Sample

1st row: OBJECTID
2nd row: GU_NM
3rd row: HNR_NAM
4th row: MTC_AT
5th row: MASTERNO

Value | Count | Frequency (%)
objectid | 1 | 2.2%
pss_man | 1 | 2.2%
sde_knd_nm | 1 | 2.2%
sde_dme_et | 1 | 2.2%
sde_mge_st | 1 | 2.2%
sde_mge_me | 1 | 2.2%
vde_knd_nm | 1 | 2.2%
vde_dme_et | 1 | 2.2%
vde_mge_st | 1 | 2.2%
vde_mge_me | 1 | 2.2%
Other values (35) | 35 | 77.8%

Most occurring characters

Value | Count | Frequency (%)
_ | 51 | 14.9%
E | 41 | 12.0%
T | 37 | 10.8%
M | 29 | 8.5%
N | 23 | 6.7%
D | 22 | 6.4%
S | 16 | 4.7%
A | 16 | 4.7%
R | 15 | 4.4%
C | 15 | 4.4%
Other values (14) | 77 | 22.5%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 291 | 85.1%
Connector Punctuation | 51 | 14.9%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
E | 41 | 14.1%
T | 37 | 12.7%
M | 29 | 10.0%
N | 23 | 7.9%
D | 22 | 7.6%
S | 16 | 5.5%
A | 16 | 5.5%
R | 15 | 5.2%
C | 15 | 5.2%
L | 10 | 3.4%
Other values (13) | 67 | 23.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 51 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 291 | 85.1%
Common | 51 | 14.9%

Most frequent character per script

Latin
Value | Count | Frequency (%)
E | 41 | 14.1%
T | 37 | 12.7%
M | 29 | 10.0%
N | 23 | 7.9%
D | 22 | 7.6%
S | 16 | 5.5%
A | 16 | 5.5%
R | 15 | 5.2%
C | 15 | 5.2%
L | 10 | 3.4%
Other values (13) | 67 | 23.0%

Common
Value | Count | Frequency (%)
_ | 51 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 342 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
_ | 51 | 14.9%
E | 41 | 12.0%
T | 37 | 10.8%
M | 29 | 8.5%
N | 23 | 6.7%
D | 22 | 6.4%
S | 16 | 4.7%
A | 16 | 4.7%
R | 15 | 4.4%
C | 15 | 4.4%
Other values (14) | 77 | 22.5%

컬럼한글명 (column Korean name)
Text

UNIQUE

Distinct: 45
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 492.0 B

Length

Max length: 7
Median length: 5
Mean length: 4.0222222
Min length: 2

Characters and Unicode

Total characters: 181
Distinct characters: 80
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 45
Unique (%): 100.0%

Sample

1st row: 고유번호 (unique ID)
2nd row: 구명 (district name)
3rd row: 법정동명 (legal dong name)
4th row: 산지여부 (mountain land flag)
5th row: 주지번 (main lot number)

Value | Count | Frequency (%)
고유번호 (unique ID) | 1 | 2.1%
지정사유 (designation reason) | 1 | 2.1%
병해피해도 (disease damage level) | 1 | 2.1%
병해관리상태 (disease management status) | 1 | 2.1%
병해관리방안 (disease management plan) | 1 | 2.1%
충해종류명 (pest type name) | 1 | 2.1%
충해피해도 (pest damage level) | 1 | 2.1%
충해관리상태 (pest management status) | 1 | 2.1%
충해관리방안 (pest management plan) | 1 | 2.1%
기타피해상태 (other damage status) | 1 | 2.1%
Other values (37) | 37 | 78.7%

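Most occurring characters

Value | Count | Frequency (%)
(not preserved) | 12 | 6.6%
(not preserved) | 8 | 4.4%
(not preserved) | 8 | 4.4%
(not preserved) | 8 | 4.4%
(not preserved) | 7 | 3.9%
(not preserved) | 6 | 3.3%
(not preserved) | 5 | 2.8%
(not preserved) | 5 | 2.8%
(not preserved) | 4 | 2.2%
(not preserved) | 4 | 2.2%
Other values (70) | 114 | 63.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 179 | 98.9%
Space Separator | 2 | 1.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 12 | 6.7%
(not preserved) | 8 | 4.5%
(not preserved) | 8 | 4.5%
(not preserved) | 8 | 4.5%
(not preserved) | 7 | 3.9%
(not preserved) | 6 | 3.4%
(not preserved) | 5 | 2.8%
(not preserved) | 5 | 2.8%
(not preserved) | 4 | 2.2%
(not preserved) | 4 | 2.2%
Other values (69) | 112 | 62.6%

Space Separator
Value | Count | Frequency (%)
(space) | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 179 | 98.9%
Common | 2 | 1.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 12 | 6.7%
(not preserved) | 8 | 4.5%
(not preserved) | 8 | 4.5%
(not preserved) | 8 | 4.5%
(not preserved) | 7 | 3.9%
(not preserved) | 6 | 3.4%
(not preserved) | 5 | 2.8%
(not preserved) | 5 | 2.8%
(not preserved) | 4 | 2.2%
(not preserved) | 4 | 2.2%
Other values (69) | 112 | 62.6%

Common
Value | Count | Frequency (%)
(space) | 2 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 179 | 98.9%
ASCII | 2 | 1.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not preserved) | 12 | 6.7%
(not preserved) | 8 | 4.5%
(not preserved) | 8 | 4.5%
(not preserved) | 8 | 4.5%
(not preserved) | 7 | 3.9%
(not preserved) | 6 | 3.4%
(not preserved) | 5 | 2.8%
(not preserved) | 5 | 2.8%
(not preserved) | 4 | 2.2%
(not preserved) | 4 | 2.2%
Other values (69) | 112 | 62.6%

ASCII
Value | Count | Frequency (%)
(space) | 2 | 100.0%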

데이터타입 (data type)
Categorical

IMBALANCE

Distinct: 5
Distinct (%): 11.1%
Missing: 0
Missing (%): 0.0%
Memory size: 492.0 B

Length

Max length: 9
Median length: 9
Mean length: 8.1777778
Min length: 4

Unique

Unique: 3
Unique (%): 6.7%

Sample

1st row: NUMBER
2nd row: NVARCHAR2
3rd row: NVARCHAR2
4th row: NVARCHAR2
5th row: NVARCHAR2

Common Values

Value | Count | Frequency (%)
NVARCHAR2 | 33 | 73.3%
NUMBER | 9 | 20.0%
NCLOB | 1 | 2.2%
VARCHAR2 | 1 | 2.2%
DATE | 1 | 2.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
nvarchar2 | 33 | 73.3%
number | 9 | 20.0%
nclob | 1 | 2.2%
varchar2 | 1 | 2.2%
date | 1 | 2.2%

길이 (length)
Unsupported

MISSING, REJECTED, UNSUPPORTED

Missing: 3
Missing (%): 6.7%
Memory size: 492.0 B

Null
Boolean

IMBALANCE

Distinct: 2
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Memory size: 177.0 B

Value | Count | Frequency (%)
True | 44 | 97.8%
False | 1 | 2.2%

Interactions


Correlations

Variable | 컬럼순서 | 컬럼영문명 | 컬럼한글명 | 데이터타입 | Null
컬럼순서 | 1.000 | 1.000 | 1.000 | 0.566 | 0.000
컬럼영문명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
컬럼한글명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
데이터타입 | 0.566 | 1.000 | 1.000 | 1.000 | 0.038
Null | 0.000 | 1.000 | 1.000 | 0.038 | 1.000

Variable | Null | 데이터타입
Null | 1.000 | 0.000
데이터타입 | 0.000 | 1.000

Variable | 컬럼순서 | 데이터타입 | Null
컬럼순서 | 1.000 | 0.269 | 0.000
데이터타입 | 0.269 | 1.000 | 0.000
Null | 0.000 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
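
Nullity correlation can be illustrated with the missingness pattern visible in this report: the two table-name columns are filled only in row 0, and 길이 (length) is missing in rows 0, 7, and 41. The sketch below computes a plain Pearson correlation between 0/1 missingness indicators. This formulation is an assumption made for illustration, and `pearson` is a hypothetical helper rather than the report's own code.

```python
import math

N_ROWS = 45

# 1 = value missing, 0 = value present (patterns read off the sample rows).
table_en_missing = [0 if i == 0 else 1 for i in range(N_ROWS)]
table_ko_missing = list(table_en_missing)  # same pattern as the English name
length_missing = [1 if i in (0, 7, 41) else 0 for i in range(N_ROWS)]

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

r_names = pearson(table_en_missing, table_ko_missing)
r_length = pearson(table_en_missing, length_missing)
print(round(r_names, 3), round(r_length, 3))
```

A value near 1 means two columns tend to be missing together. The negative value for 길이 reflects that one of its three gaps falls in row 0, the only row where the table name is present.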

Sample

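First rows

Row | 테이블영문명 | 데이블한글명 | 컬럼순서 | 컬럼영문명 | 컬럼한글명 | 데이터타입 | 길이 | Null
0 | TN_NRSTR_ND_OBT | 보호수및노거수 | 1 | OBJECTID | 고유번호 (unique ID) | NUMBER | NaN | N
1 | <NA> | <NA> | 2 | GU_NM | 구명 (district name) | NVARCHAR2 | 252 | Y
2 | <NA> | <NA> | 3 | HNR_NAM | 법정동명 (legal dong name) | NVARCHAR2 | 50 | Y
3 | <NA> | <NA> | 4 | MTC_AT | 산지여부 (mountain land flag) | NVARCHAR2 | 1 | Y
4 | <NA> | <NA> | 5 | MASTERNO | 주지번 (main lot number) | NVARCHAR2 | 4 | Y
5 | <NA> | <NA> | 6 | SLAVENO | 부지번 (sub lot number) | NVARCHAR2 | 4 | Y
6 | <NA> | <NA> | 7 | NEADRES_NM | 새주소명 (new address name) | NVARCHAR2 | 90 | Y
7 | <NA> | <NA> | 8 | LOCPLC_CN | 소재지 (location) | NCLOB | NaN | Y
8 | <NA> | <NA> | 9 | JMK_KOR | 지목한글 (land category, Korean) | NVARCHAR2 | 50 | Y
9 | <NA> | <NA> | 10 | WDPT_AR | 수목면적 (tree area) | NUMBER | 38,8 | Y

Last rows

Row | 테이블영문명 | 데이블한글명 | 컬럼순서 | 컬럼영문명 | 컬럼한글명 | 데이터타입 | 길이 | Null
35 | <NA> | <NA> | 36 | ATT_WHY | 지정사유 (designation reason) | NVARCHAR2 | 100 | Y
36 | <NA> | <NA> | 37 | TRE_CRR | 나무의특징 (tree characteristics) | NVARCHAR2 | 50 | Y
37 | <NA> | <NA> | 38 | HSY_TDN_CT | 연혁 및 전설 (history and legend) | NVARCHAR2 | 50 | Y
38 | <NA> | <NA> | 39 | ETC | 기타 (other) | NVARCHAR2 | 255 | Y
39 | <NA> | <NA> | 40 | TRE_IDN | 수목고유번호 (tree unique ID) | NVARCHAR2 | 50 | Y
40 | <NA> | <NA> | 41 | ITM_LVL | 품계등급 (rank grade) | NUMBER | 38,8 | Y
41 | <NA> | <NA> | 42 | CREAT_DE | 생성일 (creation date) | DATE | NaN | Y
42 | <NA> | <NA> | 43 | PO_FE_NM | 사진파일명 (photo file name) | NVARCHAR2 | 30 | Y
43 | <NA> | <NA> | 44 | LNG | 경도 (longitude) | NVARCHAR2 | 11 | Y
44 | <NA> | <NA> | 45 | LAT | 위도 (latitude) | NVARCHAR2 | 11 | Y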