Overview

Dataset statistics

Number of variables: 9
Number of observations: 31
Missing cells: 132
Missing cells (%): 47.3%
Duplicate rows: 1
Duplicate rows (%): 3.2%
Total size in memory: 2.3 KiB
Average record size in memory: 77.3 B

Variable types

Unsupported: 4
Text: 4
Categorical: 1

Dataset

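Description: Site information for river-water stations in the water quality monitoring network (수질측정망 하천수 지점 정보)
Author: 국립환경과학원 (National Institute of Environmental Research)
URL: https://www.vworld.kr/dtmk/dtmk_ntads_s002.do?dsId=30124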

Alerts

Unnamed: 8 has constant value "" (Constant)
Dataset has 1 (3.2%) duplicate rows (Duplicates)
테이블정의서 has 1 (3.2%) missing values (Missing)
Unnamed: 1 has 6 (19.4%) missing values (Missing)
Unnamed: 2 has 2 (6.5%) missing values (Missing)
Unnamed: 4 has 6 (19.4%) missing values (Missing)
Unnamed: 5 has 31 (100.0%) missing values (Missing)
Unnamed: 6 has 28 (90.3%) missing values (Missing)
Unnamed: 7 has 28 (90.3%) missing values (Missing)
Unnamed: 8 has 30 (96.8%) missing values (Missing)
테이블정의서 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
Unnamed: 4 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
Unnamed: 5 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
Unnamed: 7 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
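
The per-column missing counts in the alerts can be cross-checked against the dataset-level figures (132 missing cells, 47.3%). The sketch below simply re-adds them; the dictionary is typed in by hand from the alerts, and Unnamed: 3 is set to 0 because no alert is raised for it.

```python
# Per-column missing counts copied from the alerts above.
# "Unnamed: 3" has no missing-value alert, so it is entered as 0.
missing_per_column = {
    "테이블정의서": 1,
    "Unnamed: 1": 6,
    "Unnamed: 2": 2,
    "Unnamed: 3": 0,
    "Unnamed: 4": 6,
    "Unnamed: 5": 31,
    "Unnamed: 6": 28,
    "Unnamed: 7": 28,
    "Unnamed: 8": 30,
}
n_rows = 31  # number of observations

total_cells = n_rows * len(missing_per_column)    # 31 rows * 9 columns
missing_cells = sum(missing_per_column.values())
missing_pct = round(100 * missing_cells / total_cells, 1)

print(missing_cells, total_cells, missing_pct)  # 132 279 47.3
```

Both numbers agree with the Dataset statistics block, so the alerts account for every missing cell.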

Reproduction

Analysis started: 2024-04-17 23:21:49.436428
Analysis finished: 2024-04-17 23:21:50.610445
Duration: 1.17 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
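
The reported duration follows directly from the two timestamps. A minimal check with the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2024-04-17 23:21:49.436428")
finished = datetime.fromisoformat("2024-04-17 23:21:50.610445")

duration_s = (finished - started).total_seconds()
print(round(duration_s, 2))  # 1.17
```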

Variables

테이블정의서
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 1
Missing (%): 3.2%
Memory size: 380.0 B

Unnamed: 1
Text

MISSING 

Distinct: 25
Distinct (%): 100.0%
Missing: 6
Missing (%): 19.4%
Memory size: 380.0 B

Length

Max length: 10
Median length: 8
Mean length: 6.44
Min length: 3

Characters and Unicode

Total characters: 161
Distinct characters: 28
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
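
As an illustration of what the category counts measure, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. It only uses the five values shown in this column's Sample list, so the totals are smaller than the report's; it is not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# The five sample values listed for "Unnamed: 1" (not the full column).
values = ["컬럼ID", "FID", "AREA", "AREA_CD", "WATER"]

# Lu = Uppercase Letter, Lo = Other Letter, Pc = Connector Punctuation.
category_counts = Counter(
    unicodedata.category(ch) for value in values for ch in value
)
print(category_counts)

# Hangul syllables fall under "Other Letter" because they have no case.
print(unicodedata.category("컬"), unicodedata.name("컬"))
```

On the full column the same tally would also pick up Decimal Number (`Nd`) from identifiers such as WGS84_X.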

Unique

Unique: 25
Unique (%): 100.0%

Sample

1st row: 컬럼ID
2nd row: FID
3rd row: AREA
4th row: AREA_CD
5th row: WATER

Value | Count | Frequency (%)
컬럼id | 1 | 4.0%
flow_pt | 1 | 4.0%
o_env_std | 1 | 4.0%
wgs84_y | 1 | 4.0%
wgs84_x | 1 | 4.0%
bjd_cd | 1 | 4.0%
major_pt | 1 | 4.0%
water_re | 1 | 4.0%
m_area_re | 1 | 4.0%
clo_year | 1 | 4.0%
Other values (15) | 15 | 60.0%

Most occurring characters

Value | Count | Frequency (%)
_ | 23 | 14.3%
T | 15 | 9.3%
E | 14 | 8.7%
A | 14 | 8.7%
R | 11 | 6.8%
D | 11 | 6.8%
S | 8 | 5.0%
C | 7 | 4.3%
M | 7 | 4.3%
N | 7 | 4.3%
Other values (18) | 44 | 27.3%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 132 | 82.0%
Connector Punctuation | 23 | 14.3%
Decimal Number | 4 | 2.5%
Other Letter | 2 | 1.2%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
T | 15 | 11.4%
E | 14 | 10.6%
A | 14 | 10.6%
R | 11 | 8.3%
D | 11 | 8.3%
S | 8 | 6.1%
C | 7 | 5.3%
M | 7 | 5.3%
N | 7 | 5.3%
W | 6 | 4.5%
Other values (13) | 32 | 24.2%

Decimal Number
Value | Count | Frequency (%)
4 | 2 | 50.0%
8 | 2 | 50.0%

Other Letter
Value | Count | Frequency (%)
컬 | 1 | 50.0%
럼 | 1 | 50.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 23 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 132 | 82.0%
Common | 27 | 16.8%
Hangul | 2 | 1.2%

Most frequent character per script

Latin
Value | Count | Frequency (%)
T | 15 | 11.4%
E | 14 | 10.6%
A | 14 | 10.6%
R | 11 | 8.3%
D | 11 | 8.3%
S | 8 | 6.1%
C | 7 | 5.3%
M | 7 | 5.3%
N | 7 | 5.3%
W | 6 | 4.5%
Other values (13) | 32 | 24.2%

Common
Value | Count | Frequency (%)
_ | 23 | 85.2%
4 | 2 | 7.4%
8 | 2 | 7.4%

Hangul
Value | Count | Frequency (%)
컬 | 1 | 50.0%
럼 | 1 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 159 | 98.8%
Hangul | 2 | 1.2%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
_ | 23 | 14.5%
T | 15 | 9.4%
E | 14 | 8.8%
A | 14 | 8.8%
R | 11 | 6.9%
D | 11 | 6.9%
S | 8 | 5.0%
C | 7 | 4.4%
M | 7 | 4.4%
N | 7 | 4.4%
Other values (16) | 42 | 26.4%

Hangul
Value | Count | Frequency (%)
컬 | 1 | 50.0%
럼 | 1 | 50.0%

Unnamed: 2
Text

MISSING 

Distinct: 29
Distinct (%): 100.0%
Missing: 2
Missing (%): 6.5%
Memory size: 380.0 B

Length

Max length: 20
Median length: 6
Mean length: 4.6551724
Min length: 2

Characters and Unicode

Total characters: 135
Distinct characters: 70
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 29
Unique (%): 100.0%

Sample

1st row: 한기철
2nd row: WEIS
3rd row: 물환경정보시스템의 수질측정망하천수지점
4th row: 컬럼명
5th row: 객체ID

Value | Count | Frequency (%)
weis | 1 | 3.3%
물환경정보시스템의 | 1 | 3.3%
인덱스키 | 1 | 3.3%
구측정소코드 | 1 | 3.3%
구환경기준 | 1 | 3.3%
기준y좌표 | 1 | 3.3%
기준x좌표 | 1 | 3.3%
법정동코드 | 1 | 3.3%
주요지점부 | 1 | 3.3%
수계대표 | 1 | 3.3%
Other values (20) | 20 | 66.7%

Most occurring characters

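Value | Count | Frequency (%)
[Hangul character] | 6 | 4.4%
[Hangul character] | 6 | 4.4%
[Hangul character] | 5 | 3.7%
[Hangul character] | 5 | 3.7%
[Hangul character] | 5 | 3.7%
[Hangul character] | 5 | 3.7%
[Hangul character] | 5 | 3.7%
[Hangul character] | 4 | 3.0%
[Hangul character] | 4 | 3.0%
[Hangul character] | 4 | 3.0%
Other values (60) | 86 | 63.7%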

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 126 | 93.3%
Uppercase Letter | 8 | 5.9%
Space Separator | 1 | 0.7%

Most frequent character per category

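Other Letter
Value | Count | Frequency (%)
[Hangul character] | 6 | 4.8%
[Hangul character] | 6 | 4.8%
[Hangul character] | 5 | 4.0%
[Hangul character] | 5 | 4.0%
[Hangul character] | 5 | 4.0%
[Hangul character] | 5 | 4.0%
[Hangul character] | 5 | 4.0%
[Hangul character] | 4 | 3.2%
[Hangul character] | 4 | 3.2%
[Hangul character] | 4 | 3.2%
Other values (52) | 77 | 61.1%

Uppercase Letter
Value | Count | Frequency (%)
I | 2 | 25.0%
W | 1 | 12.5%
X | 1 | 12.5%
Y | 1 | 12.5%
S | 1 | 12.5%
D | 1 | 12.5%
E | 1 | 12.5%

Space Separator
Value | Count | Frequency (%)
(space) | 1 | 100.0%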

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 126 | 93.3%
Latin | 8 | 5.9%
Common | 1 | 0.7%

Most frequent character per script

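Hangul
Value | Count | Frequency (%)
[Hangul character] | 6 | 4.8%
[Hangul character] | 6 | 4.8%
[Hangul character] | 5 | 4.0%
[Hangul character] | 5 | 4.0%
[Hangul character] | 5 | 4.0%
[Hangul character] | 5 | 4.0%
[Hangul character] | 5 | 4.0%
[Hangul character] | 4 | 3.2%
[Hangul character] | 4 | 3.2%
[Hangul character] | 4 | 3.2%
Other values (52) | 77 | 61.1%

Latin
Value | Count | Frequency (%)
I | 2 | 25.0%
W | 1 | 12.5%
X | 1 | 12.5%
Y | 1 | 12.5%
S | 1 | 12.5%
D | 1 | 12.5%
E | 1 | 12.5%

Common
Value | Count | Frequency (%)
(space) | 1 | 100.0%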

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 126 | 93.3%
ASCII | 9 | 6.7%

Most frequent character per block

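Hangul
Value | Count | Frequency (%)
[Hangul character] | 6 | 4.8%
[Hangul character] | 6 | 4.8%
[Hangul character] | 5 | 4.0%
[Hangul character] | 5 | 4.0%
[Hangul character] | 5 | 4.0%
[Hangul character] | 5 | 4.0%
[Hangul character] | 5 | 4.0%
[Hangul character] | 4 | 3.2%
[Hangul character] | 4 | 3.2%
[Hangul character] | 4 | 3.2%
Other values (52) | 77 | 61.1%

ASCII
Value | Count | Frequency (%)
I | 2 | 22.2%
W | 1 | 11.1%
X | 1 | 11.1%
Y | 1 | 11.1%
S | 1 | 11.1%
(space) | 1 | 11.1%
D | 1 | 11.1%
E | 1 | 11.1%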

Unnamed: 3
Categorical

Distinct: 6
Distinct (%): 19.4%
Missing: 0
Missing (%): 0.0%
Memory size: 380.0 B

Length

Max length: 7
Median length: 7
Mean length: 6.2903226
Min length: 2

Unique

Unique: 3
Unique (%): 9.7%

Sample

1st row: <NA>
2nd row: 테이블ID
3rd row: <NA>
4th row: 타입
5th row: INTEGER

Common Values

Value | Count | Frequency (%)
VARCHAR | 21 | 67.7%
<NA> | 5 | 16.1%
NUMERIC | 2 | 6.5%
테이블ID | 1 | 3.2%
타입 | 1 | 3.2%
INTEGER | 1 | 3.2%
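
The length statistics reported for this column can be reproduced from the Common Values counts. Note the assumption that makes the arithmetic work: `<NA>` has to be counted as a literal four-character string, which would also explain why the column shows 0 missing values. This is an inference from the numbers, not something the report states.

```python
import statistics

# Value counts copied from the Common Values table.
# Assumption: "<NA>" is treated as an ordinary 4-character string.
value_counts = {
    "VARCHAR": 21,
    "<NA>": 5,
    "NUMERIC": 2,
    "테이블ID": 1,
    "타입": 1,
    "INTEGER": 1,
}

# Expand to one length per row (31 rows in total).
lengths = [len(value) for value, n in value_counts.items() for _ in range(n)]

mean_len = sum(lengths) / len(lengths)
print(len(lengths), min(lengths), max(lengths), statistics.median(lengths))
print(round(mean_len, 7))  # 6.2903226
```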

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
varchar | 21 | 67.7%
na | 5 | 16.1%
numeric | 2 | 6.5%
테이블id | 1 | 3.2%
타입 | 1 | 3.2%
integer | 1 | 3.2%

Unnamed: 4
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 6
Missing (%): 19.4%
Memory size: 380.0 B

Unnamed: 5
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 31
Missing (%): 100.0%
Memory size: 411.0 B

Unnamed: 6
Text

MISSING 

Distinct: 3
Distinct (%): 100.0%
Missing: 28
Missing (%): 90.3%
Memory size: 380.0 B

Length

Max length: 5
Median length: 4
Mean length: 4
Min length: 3

Characters and Unicode

Total characters: 12
Distinct characters: 11
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 100.0%

Sample

1st row: 작성일
2nd row: 테이블명
3rd row: PK/FK

Value | Count | Frequency (%)
작성일 | 1 | 33.3%
테이블명 | 1 | 33.3%
pk/fk | 1 | 33.3%

Most occurring characters

Value | Count | Frequency (%)
K | 2 | 16.7%
작 | 1 | 8.3%
성 | 1 | 8.3%
일 | 1 | 8.3%
테 | 1 | 8.3%
이 | 1 | 8.3%
블 | 1 | 8.3%
명 | 1 | 8.3%
P | 1 | 8.3%
/ | 1 | 8.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 7 | 58.3%
Uppercase Letter | 4 | 33.3%
Other Punctuation | 1 | 8.3%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
작 | 1 | 14.3%
성 | 1 | 14.3%
일 | 1 | 14.3%
테 | 1 | 14.3%
이 | 1 | 14.3%
블 | 1 | 14.3%
명 | 1 | 14.3%

Uppercase Letter
Value | Count | Frequency (%)
K | 2 | 50.0%
P | 1 | 25.0%
F | 1 | 25.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 7 | 58.3%
Latin | 4 | 33.3%
Common | 1 | 8.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
작 | 1 | 14.3%
성 | 1 | 14.3%
일 | 1 | 14.3%
테 | 1 | 14.3%
이 | 1 | 14.3%
블 | 1 | 14.3%
명 | 1 | 14.3%

Latin
Value | Count | Frequency (%)
K | 2 | 50.0%
P | 1 | 25.0%
F | 1 | 25.0%

Common
Value | Count | Frequency (%)
/ | 1 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 7 | 58.3%
ASCII | 5 | 41.7%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
K | 2 | 40.0%
P | 1 | 20.0%
/ | 1 | 20.0%
F | 1 | 20.0%

Hangul
Value | Count | Frequency (%)
작 | 1 | 14.3%
성 | 1 | 14.3%
일 | 1 | 14.3%
테 | 1 | 14.3%
이 | 1 | 14.3%
블 | 1 | 14.3%
명 | 1 | 14.3%

Unnamed: 7
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 28
Missing (%): 90.3%
Memory size: 380.0 B

Unnamed: 8
Text

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 100.0%
Missing: 30
Missing (%): 96.8%
Memory size: 380.0 B

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 9
Distinct characters: 9
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 참조테이블명/비고

Value | Count | Frequency (%)
참조테이블명/비고 | 1 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
참 | 1 | 11.1%
조 | 1 | 11.1%
테 | 1 | 11.1%
이 | 1 | 11.1%
블 | 1 | 11.1%
명 | 1 | 11.1%
/ | 1 | 11.1%
비 | 1 | 11.1%
고 | 1 | 11.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 8 | 88.9%
Other Punctuation | 1 | 11.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
참 | 1 | 12.5%
조 | 1 | 12.5%
테 | 1 | 12.5%
이 | 1 | 12.5%
블 | 1 | 12.5%
명 | 1 | 12.5%
비 | 1 | 12.5%
고 | 1 | 12.5%

Other Punctuation
Value | Count | Frequency (%)
/ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 8 | 88.9%
Common | 1 | 11.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
참 | 1 | 12.5%
조 | 1 | 12.5%
테 | 1 | 12.5%
이 | 1 | 12.5%
블 | 1 | 12.5%
명 | 1 | 12.5%
비 | 1 | 12.5%
고 | 1 | 12.5%

Common
Value | Count | Frequency (%)
/ | 1 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 8 | 88.9%
ASCII | 1 | 11.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
참 | 1 | 12.5%
조 | 1 | 12.5%
테 | 1 | 12.5%
이 | 1 | 12.5%
블 | 1 | 12.5%
명 | 1 | 12.5%
비 | 1 | 12.5%
고 | 1 | 12.5%

ASCII
Value | Count | Frequency (%)
/ | 1 | 100.0%

Correlations

Variable | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 6
Unnamed: 1 | 1.000 | 1.000 | 1.000 | NaN
Unnamed: 2 | 1.000 | 1.000 | 1.000 | 1.000
Unnamed: 3 | 1.000 | 1.000 | 1.000 | 0.000
Unnamed: 6 | NaN | 1.000 | 0.000 | 1.000

Missing values

The nullity matrix is a data-dense display that lets you quickly spot patterns in data completeness.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
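
To make nullity correlation concrete, the sketch below computes a plain Pearson correlation between per-row presence indicators. That choice of formula is an assumption (the report does not state how its heatmap is computed), and the indicator vectors are reconstructed from the Sample table and the missing counts: Unnamed: 6 and Unnamed: 7 are filled only in rows 0, 1 and 3, and Unnamed: 8 only in row 3.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

N_ROWS = 31

def presence(filled_rows):
    """1 where the cell is present, 0 where it is missing."""
    return [1 if i in filled_rows else 0 for i in range(N_ROWS)]

# Reconstructed from the Sample table and the per-column missing counts.
present_6 = presence({0, 1, 3})   # Unnamed: 6, 28 missing
present_7 = presence({0, 1, 3})   # Unnamed: 7, 28 missing
present_8 = presence({3})         # Unnamed: 8, 30 missing

r_67 = pearson(present_6, present_7)
r_68 = pearson(present_6, present_8)
print(round(r_67, 3), round(r_68, 3))
```

Columns that are always missing together, such as Unnamed: 6 and Unnamed: 7, score 1; Unnamed: 8 scores lower against them because it is filled in only one of their three rows.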

Sample

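Index | 테이블정의서 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8
0 | 작성자 | <NA> | 한기철 | <NA> | NaN | <NA> | 작성일 | 2017-07-05 00:00:00 | <NA>
1 | 주제영역명 | <NA> | WEIS | 테이블ID | Z_CBDMS_WQS_SITE_MANUAL_A | <NA> | 테이블명 | 수질측정망 하천수기점 | <NA>
2 | 테이블설명 | <NA> | 물환경정보시스템의 수질측정망하천수지점 | <NA> | NaN | <NA> | <NA> | NaN | <NA>
3 | No | 컬럼ID | 컬럼명 | 타입 | 길이(Byte) | <NA> | PK/FK | Default | 참조테이블명/비고
4 | 1 | FID | 객체ID | INTEGER | NaN | <NA> | <NA> | NaN | <NA>
5 | 2 | AREA | 권역 | VARCHAR | 256 | <NA> | <NA> | NaN | <NA>
6 | 3 | AREA_CD | 권역코드 | VARCHAR | 256 | <NA> | <NA> | NaN | <NA>
7 | 4 | WATER | 수계 | VARCHAR | 256 | <NA> | <NA> | NaN | <NA>
8 | 5 | AM_NM | 중권역명 | VARCHAR | 256 | <NA> | <NA> | NaN | <NA>
9 | 6 | AM_CD | 중권역코드 | VARCHAR | 256 | <NA> | <NA> | NaN | <NA>

Index | 테이블정의서 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8
21 | 18 | WATER_RE | 수계대표 | VARCHAR | 256 | <NA> | <NA> | NaN | <NA>
22 | 19 | MAJOR_PT | 주요지점부 | VARCHAR | 256 | <NA> | <NA> | NaN | <NA>
23 | 20 | BJD_CD | 법정동코드 | VARCHAR | 256 | <NA> | <NA> | NaN | <NA>
24 | 21 | WGS84_X | 기준X좌표 | NUMERIC | 23 | <NA> | <NA> | NaN | <NA>
25 | 22 | WGS84_Y | 기준Y좌표 | NUMERIC | 23 | <NA> | <NA> | NaN | <NA>
26 | 23 | O_ENV_STD | 구환경기준 | VARCHAR | 256 | <NA> | <NA> | NaN | <NA>
27 | 24 | O_ST_CD | 구측정소코드 | VARCHAR | 256 | <NA> | <NA> | NaN | <NA>
28 | 인덱스명 | <NA> | 인덱스키 | <NA> | NaN | <NA> | <NA> | NaN | <NA>
29 | NaN | <NA> | <NA> | <NA> | NaN | <NA> | <NA> | NaN | <NA>
30 | 업무규칙 | <NA> | <NA> | <NA> | NaN | <NA> | <NA> | NaN | <NA>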
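
The sample suggests why so many columns are flagged as unnamed, unsupported, or mostly missing: the sheet is a table-definition form, and the real column headers (No, 컬럼ID, 컬럼명, 타입, ...) sit in row 3 rather than in the first row. One possible cleaning step, sketched here in plain Python on a few rows copied from the sample, is to promote that row to the header and keep only the numbered definition rows. The variable names are hypothetical and this is not a step the report itself performs.

```python
# A few rows copied from the sample; None stands for a missing cell.
raw_rows = [
    ["테이블설명", None, "물환경정보시스템의 수질측정망하천수지점",
     None, None, None, None, None, None],
    ["No", "컬럼ID", "컬럼명", "타입", "길이(Byte)", None,
     "PK/FK", "Default", "참조테이블명/비고"],
    [1, "FID", "객체ID", "INTEGER", None, None, None, None, None],
    [2, "AREA", "권역", "VARCHAR", 256, None, None, None, None],
    [3, "AREA_CD", "권역코드", "VARCHAR", 256, None, None, None, None],
]

# The row whose first cell is "No" carries the real column headers.
header_idx = next(i for i, row in enumerate(raw_rows) if row[0] == "No")
header = raw_rows[header_idx]

# Keep only positions that have a header label (drops the empty 6th column).
keep = [i for i, name in enumerate(header) if name is not None]

# Definition rows are the ones numbered with an integer in the "No" column.
columns = [
    {header[i]: row[i] for i in keep}
    for row in raw_rows[header_idx + 1:]
    if isinstance(row[0], int)
]

for col in columns:
    print(col["No"], col["컬럼ID"], col["타입"], col["길이(Byte)"])
```

Profiling the cleaned rows instead of the raw sheet would likely remove most of the Missing and Unsupported alerts, since the form's title block and footer rows would no longer be mixed in with the column definitions.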

Duplicate rows

Most frequently occurring

Index | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 6 | Unnamed: 8 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | 2
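
The duplicate table lists only the five supported columns, which suggests (as an assumption) that duplicates are detected on those columns alone. Under that reading, rows 29 and 30 of the sample are the matching pair, since both are empty in all five. A small sketch of the count, using the last three sample rows:

```python
from collections import Counter

# Values of Unnamed: 1, 2, 3, 6, 8 for the last three sample rows;
# None stands for a missing cell.
tail_rows = {
    28: (None, "인덱스키", None, None, None),
    29: (None, None, None, None, None),
    30: (None, None, None, None, None),
}

counts = Counter(tail_rows.values())
repeated = {values: n for values, n in counts.items() if n > 1}

# A group of 2 identical rows contributes 1 surplus (duplicate) row.
surplus_rows = sum(n - 1 for n in repeated.values())
surplus_pct = round(100 * surplus_rows / 31, 1)  # 31 observations in total
print(repeated, surplus_rows, surplus_pct)
```

This matches the overview: one group occurring twice, reported as 1 duplicate row (3.2%).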