Overview

Dataset statistics

Number of variables: 6
Number of observations: 75
Missing cells: 122
Missing cells (%): 27.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.6 KiB
Average record size in memory: 49.8 B
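
The percentages and sizes above can be cross-checked with simple arithmetic. The sketch below assumes that "Missing cells (%)" is the missing-cell count divided by variables times observations, and that the total size is roughly the average record size times the number of rows; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics above.
n_variables = 6
n_observations = 75
missing_cells = 122
avg_record_bytes = 49.8  # already rounded in the report

# Share of missing cells among all cells (assumed definition).
total_cells = n_variables * n_observations
missing_pct = round(100 * missing_cells / total_cells, 1)

# Approximate total size in KiB, rebuilt from the rounded average record size.
total_kib = round(avg_record_bytes * n_observations / 1024, 1)

print(missing_pct, total_kib)
```

Both values agree with the reported 27.1% and 3.6 KiB to one decimal place.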

Variable types

Text: 5
DateTime: 1

Dataset

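Description: Soil contamination information for Changnyeong-gun (창녕군). The dataset provides the business name (업소명), address (주소), phone number (전화번호), Fire Services Act completion inspection date (소방법완공검사일), 2023 soil contamination inspection target (2023년 토양오염도검사 조사대상), and 2023 leak inspection target (2023년 누출검사 조사대상).
URL: https://www.data.go.kr/data/15013292/fileData.do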

Alerts

전화번호 has 10 (13.3%) missing values [Missing]
2023년 토양오염도검사 조사대상 has 43 (57.3%) missing values [Missing]
2023년 누출검사 조사대상 has 69 (92.0%) missing values [Missing]
업소명 has unique values [Unique]
주소 has unique values [Unique]
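
The missing-value alerts can be checked against the overview. This is a minimal sketch, assuming each percentage is the per-column missing count divided by the 75 observations; the dictionary and variable names are illustrative only.

```python
# Missing counts per column, copied from the alerts above.
missing_counts = {
    "전화번호": 10,
    "2023년 토양오염도검사 조사대상": 43,
    "2023년 누출검사 조사대상": 69,
}
n_observations = 75

# Per-column missing percentage, rounded the way the report displays it.
missing_pct = {
    column: round(100 * count / n_observations, 1)
    for column, count in missing_counts.items()
}

# The three columns together account for every missing cell in the overview.
total_missing = sum(missing_counts.values())

print(missing_pct, total_missing)
```

The per-column counts sum to 122, which matches the "Missing cells" figure in the dataset statistics.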

Reproduction

Analysis started: 2023-12-12 16:33:00.784621
Analysis finished: 2023-12-12 16:33:01.786755
Duration: 1 second
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업소명
Text

UNIQUE 

Distinct: 75
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 732.0 B

Length

Max length: 30
Median length: 24
Mean length: 7.6
Min length: 3

Characters and Unicode

Total characters: 570
Distinct characters: 149
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
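
As an illustration of how such character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own implementation; the helper name `category_counts` is hypothetical, and the three names are taken from values that appear in this report.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over all characters in the given strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

# Business names that appear in this report.
names = ["대창주유소", "㈜대동", "육군 제9715부대(8611부대)"]
counts = category_counts(names)

# Two-letter codes map to the labels used in the report:
# Lo = Other Letter, Nd = Decimal Number, Zs = Space Separator,
# Ps = Open Punctuation, Pe = Close Punctuation, So = Other Symbol.
print(counts)
```

Hangul syllables fall under Other Letter, while the enclosed symbol ㈜ is an Other Symbol, which is consistent with the category labels listed for this variable.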

Unique

Unique: 75
Unique (%): 100.0%

Sample

1st row대창주유소
2nd row화명주유소
3rd row남지농협 고곡주유소
4th row길곡주유소
5th row부곡주유소
Value | Count | Frequency (%)
육군 | 2 | 2.3%
착한주유소 | 2 | 2.3%
부영주유소 | 1 | 1.1%
대영주유소 | 1 | 1.1%
㈜대동 | 1 | 1.1%
공단주유소 | 1 | 1.1%
경남대로주유소 | 1 | 1.1%
창녕주유소 | 1 | 1.1%
㈜해연 | 1 | 1.1%
㈜제일주유소 | 1 | 1.1%
Other values (75) | 75 | 86.2%

Most occurring characters

ValueCountFrequency (%)
64
 
11.2%
60
 
10.5%
59
 
10.4%
17
 
3.0%
16
 
2.8%
13
 
2.3%
12
 
2.1%
( 12
 
2.1%
) 12
 
2.1%
10
 
1.8%
Other values (139) 295
51.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 488 | 85.6%
Other Symbol | 17 | 3.0%
Decimal Number | 16 | 2.8%
Open Punctuation | 13 | 2.3%
Close Punctuation | 13 | 2.3%
Space Separator | 12 | 2.1%
Uppercase Letter | 6 | 1.1%
Other Punctuation | 5 | 0.9%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
64
 
13.1%
60
 
12.3%
59
 
12.1%
16
 
3.3%
13
 
2.7%
10
 
2.0%
10
 
2.0%
9
 
1.8%
9
 
1.8%
8
 
1.6%
Other values (118) 230
47.1%
Decimal Number
ValueCountFrequency (%)
1 4
25.0%
6 2
12.5%
8 2
12.5%
5 2
12.5%
7 2
12.5%
9 1
 
6.2%
0 1
 
6.2%
2 1
 
6.2%
3 1
 
6.2%
Uppercase Letter
ValueCountFrequency (%)
C 3
50.0%
I 1
 
16.7%
S 1
 
16.7%
J 1
 
16.7%
Open Punctuation
ValueCountFrequency (%)
( 12
92.3%
[ 1
 
7.7%
Close Punctuation
ValueCountFrequency (%)
) 12
92.3%
] 1
 
7.7%
Other Punctuation
ValueCountFrequency (%)
, 4
80.0%
. 1
 
20.0%
Other Symbol
ValueCountFrequency (%)
17
100.0%
Space Separator
ValueCountFrequency (%)
12
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 505 | 88.6%
Common | 59 | 10.4%
Latin | 6 | 1.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
64
 
12.7%
60
 
11.9%
59
 
11.7%
17
 
3.4%
16
 
3.2%
13
 
2.6%
10
 
2.0%
10
 
2.0%
9
 
1.8%
9
 
1.8%
Other values (119) 238
47.1%
Common
ValueCountFrequency (%)
12
20.3%
( 12
20.3%
) 12
20.3%
, 4
 
6.8%
1 4
 
6.8%
6 2
 
3.4%
8 2
 
3.4%
5 2
 
3.4%
7 2
 
3.4%
9 1
 
1.7%
Other values (6) 6
10.2%
Latin
ValueCountFrequency (%)
C 3
50.0%
I 1
 
16.7%
S 1
 
16.7%
J 1
 
16.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 488 | 85.6%
ASCII | 65 | 11.4%
None | 17 | 3.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
64
 
13.1%
60
 
12.3%
59
 
12.1%
16
 
3.3%
13
 
2.7%
10
 
2.0%
10
 
2.0%
9
 
1.8%
9
 
1.8%
8
 
1.6%
Other values (118) 230
47.1%
None
ValueCountFrequency (%)
17
100.0%
ASCII
ValueCountFrequency (%)
12
18.5%
( 12
18.5%
) 12
18.5%
, 4
 
6.2%
1 4
 
6.2%
C 3
 
4.6%
6 2
 
3.1%
8 2
 
3.1%
5 2
 
3.1%
7 2
 
3.1%
Other values (10) 10
15.4%

주소
Text

UNIQUE 

Distinct: 75
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 732.0 B

Length

Max length: 42
Median length: 34
Mean length: 28.906667
Min length: 19

Characters and Unicode

Total characters: 2168
Distinct characters: 109
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 75
Unique (%): 100.0%

Sample

1st row경상남도 창녕군 대합면 우포2로 850(소야리 523-6)
2nd row경상남도 창녕군 부곡면 온천로 1173(수다리 315-5)
3rd row경상남도 창녕군 남지읍 박진로 992(고곡리 362-4)
4th row경상남도 창녕군 길곡면 길곡로 4(증산리 557-7)
5th row경상남도 창녕군 부곡면 부곡로 103
Value | Count | Frequency (%)
경상남도 | 75 | 17.6%
창녕군 | 75 | 17.6%
창녕읍 | 14 | 3.3%
부곡면 | 11 | 2.6%
온천로 | 11 | 2.6%
계성면 | 8 | 1.9%
경남대로 | 8 | 1.9%
남지읍 | 7 | 1.6%
대합면 | 7 | 1.6%
도천면 | 7 | 1.6%
Other values (166) | 203 | 47.7%

Most occurring characters

ValueCountFrequency (%)
352
 
16.2%
97
 
4.5%
97
 
4.5%
94
 
4.3%
89
 
4.1%
83
 
3.8%
76
 
3.5%
75
 
3.5%
1 73
 
3.4%
62
 
2.9%
Other values (99) 1070
49.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1201 | 55.4%
Decimal Number | 457 | 21.1%
Space Separator | 352 | 16.2%
Open Punctuation | 51 | 2.4%
Close Punctuation | 51 | 2.4%
Dash Punctuation | 51 | 2.4%
Other Punctuation | 5 | 0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
97
 
8.1%
97
 
8.1%
94
 
7.8%
89
 
7.4%
83
 
6.9%
76
 
6.3%
75
 
6.2%
62
 
5.2%
55
 
4.6%
54
 
4.5%
Other values (84) 419
34.9%
Decimal Number
ValueCountFrequency (%)
1 73
16.0%
3 60
13.1%
4 58
12.7%
2 55
12.0%
5 54
11.8%
6 43
9.4%
0 36
7.9%
7 28
 
6.1%
9 27
 
5.9%
8 23
 
5.0%
Space Separator
ValueCountFrequency (%)
352
100.0%
Open Punctuation
ValueCountFrequency (%)
( 51
100.0%
Close Punctuation
ValueCountFrequency (%)
) 51
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 51
100.0%
Other Punctuation
ValueCountFrequency (%)
, 5
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1201 | 55.4%
Common | 967 | 44.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
97
 
8.1%
97
 
8.1%
94
 
7.8%
89
 
7.4%
83
 
6.9%
76
 
6.3%
75
 
6.2%
62
 
5.2%
55
 
4.6%
54
 
4.5%
Other values (84) 419
34.9%
Common
ValueCountFrequency (%)
352
36.4%
1 73
 
7.5%
3 60
 
6.2%
4 58
 
6.0%
2 55
 
5.7%
5 54
 
5.6%
( 51
 
5.3%
) 51
 
5.3%
- 51
 
5.3%
6 43
 
4.4%
Other values (5) 119
 
12.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1201 | 55.4%
ASCII | 967 | 44.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
352
36.4%
1 73
 
7.5%
3 60
 
6.2%
4 58
 
6.0%
2 55
 
5.7%
5 54
 
5.6%
( 51
 
5.3%
) 51
 
5.3%
- 51
 
5.3%
6 43
 
4.4%
Other values (5) 119
 
12.3%
Hangul
ValueCountFrequency (%)
97
 
8.1%
97
 
8.1%
94
 
7.8%
89
 
7.4%
83
 
6.9%
76
 
6.3%
75
 
6.2%
62
 
5.2%
55
 
4.6%
54
 
4.5%
Other values (84) 419
34.9%

전화번호
Text

MISSING 

Distinct: 64
Distinct (%): 98.5%
Missing: 10
Missing (%): 13.3%
Memory size: 732.0 B

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12
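
Because every non-missing phone number is exactly 12 characters long, a fixed pattern check is a natural sanity test. The sketch below assumes the format is three digits, a dash, three digits, a dash, and four digits, as in the values shown in this report; `PHONE_RE` and `is_valid_phone` are illustrative names, not part of the dataset or the profiler.

```python
import re

# Assumed pattern: NNN-NNN-NNNN (12 characters including two dashes).
PHONE_RE = re.compile(r"\d{3}-\d{3}-\d{4}")

def is_valid_phone(text):
    """Return True when the whole string matches the assumed phone format."""
    return PHONE_RE.fullmatch(text) is not None

# Values that appear in this report.
samples = ["055-533-1210", "055-536-5667", "055-521-8189", "051-559-1032"]
all_valid = all(is_valid_phone(s) for s in samples)

# Consistency with the report: 75 rows minus 10 missing leaves 65 values.
non_missing = 75 - 10
expected_total_chars = non_missing * 12
expected_dashes = non_missing * 2

print(all_valid, expected_total_chars, expected_dashes)
```

The expected totals of 780 characters and 130 dashes match the character counts reported for this variable.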

Characters and Unicode

Total characters: 780
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 63
Unique (%): 96.9%

Sample

1st row055-533-1210
2nd row055-536-5667
3rd row055-521-8189
4th row055-536-9942
5th row055-536-7800
Value | Count | Frequency (%)
055-536-2100 | 2 | 3.1%
055-536-8600 | 1 | 1.5%
055-533-3336 | 1 | 1.5%
055-533-1210 | 1 | 1.5%
055-536-4028 | 1 | 1.5%
055-532-2339 | 1 | 1.5%
055-530-7127 | 1 | 1.5%
055-536-9084 | 1 | 1.5%
055-526-0300 | 1 | 1.5%
055-532-2222 | 1 | 1.5%
Other values (54) | 54 | 83.1%

Most occurring characters

ValueCountFrequency (%)
5 225
28.8%
- 130
16.7%
0 117
15.0%
3 71
 
9.1%
2 68
 
8.7%
1 42
 
5.4%
6 37
 
4.7%
7 27
 
3.5%
4 27
 
3.5%
8 24
 
3.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 650 | 83.3%
Dash Punctuation | 130 | 16.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 225
34.6%
0 117
18.0%
3 71
 
10.9%
2 68
 
10.5%
1 42
 
6.5%
6 37
 
5.7%
7 27
 
4.2%
4 27
 
4.2%
8 24
 
3.7%
9 12
 
1.8%
Dash Punctuation
ValueCountFrequency (%)
- 130
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 780
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 225
28.8%
- 130
16.7%
0 117
15.0%
3 71
 
9.1%
2 68
 
8.7%
1 42
 
5.4%
6 37
 
4.7%
7 27
 
3.5%
4 27
 
3.5%
8 24
 
3.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 780
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 225
28.8%
- 130
16.7%
0 117
15.0%
3 71
 
9.1%
2 68
 
8.7%
1 42
 
5.4%
6 37
 
4.7%
7 27
 
3.5%
4 27
 
3.5%
8 24
 
3.1%

소방법완공검사일
DateTime

Distinct: 72
Distinct (%): 96.0%
Missing: 0
Missing (%): 0.0%
Memory size: 732.0 B
Minimum: 1984-06-26 00:00:00
Maximum: 2020-07-03 00:00:00
Histogram with fixed size bins (bins=50)

2023년 토양오염도검사 조사대상
Text

MISSING

Distinct: 29
Distinct (%): 90.6%
Missing: 43
Missing (%): 57.3%
Memory size: 732.0 B

Length

Max length: 21
Median length: 21
Mean length: 21
Min length: 21
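
The constant length of 21 is consistent with two ISO dates of 10 characters joined by a tilde, as in 2023-02-20~2023-05-19. The sketch below shows one way such a value could be split into start and end dates for further analysis; `parse_range` is a hypothetical helper, and the interpretation of the value as an inspection window is an assumption based on the column name.

```python
from datetime import date

def parse_range(text):
    """Split 'YYYY-MM-DD~YYYY-MM-DD' into a (start, end) pair of dates."""
    start_text, end_text = text.split("~")
    return date.fromisoformat(start_text), date.fromisoformat(end_text)

# A value that appears in this report.
start, end = parse_range("2023-02-20~2023-05-19")
window_days = (end - start).days

print(start, end, window_days)
```

Storing the two dates in separate columns would let the variable be profiled as DateTime rather than Text.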

Characters and Unicode

Total characters: 672
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 26
Unique (%): 81.2%

Sample

1st row2023-02-20~2023-05-19
2nd row2023-02-01~2023-04-30
3rd row2023-02-06~2023-05-05
4th row2023-06-30~2023-09-29
5th row2023-12-28~2024-03-27
Value | Count | Frequency (%)
2023-01-17~2023-04-16 | 2 | 6.2%
2023-02-06~2023-05-05 | 2 | 6.2%
2022-12-23~2023-03-22 | 2 | 6.2%
2023-05-31~2023-08-30 | 1 | 3.1%
2022-12-30~2023-03-29 | 1 | 3.1%
2023-07-29~2023-10-28 | 1 | 3.1%
2023-01-23~2023-04-22 | 1 | 3.1%
2023-10-13~2024-01-12 | 1 | 3.1%
2023-01-05~2023-04-04 | 1 | 3.1%
2023-08-20~2023-11-19 | 1 | 3.1%
Other values (19) | 19 | 59.4%

Most occurring characters

ValueCountFrequency (%)
2 184
27.4%
0 139
20.7%
- 128
19.0%
3 72
 
10.7%
1 56
 
8.3%
~ 32
 
4.8%
4 18
 
2.7%
5 11
 
1.6%
6 9
 
1.3%
9 9
 
1.3%
Other values (2) 14
 
2.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number | 512 | 76.2%
Dash Punctuation | 128 | 19.0%
Math Symbol | 32 | 4.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 184
35.9%
0 139
27.1%
3 72
 
14.1%
1 56
 
10.9%
4 18
 
3.5%
5 11
 
2.1%
6 9
 
1.8%
9 9
 
1.8%
8 8
 
1.6%
7 6
 
1.2%
Dash Punctuation
ValueCountFrequency (%)
- 128
100.0%
Math Symbol
ValueCountFrequency (%)
~ 32
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 672
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 184
27.4%
0 139
20.7%
- 128
19.0%
3 72
 
10.7%
1 56
 
8.3%
~ 32
 
4.8%
4 18
 
2.7%
5 11
 
1.6%
6 9
 
1.3%
9 9
 
1.3%
Other values (2) 14
 
2.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 672
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 184
27.4%
0 139
20.7%
- 128
19.0%
3 72
 
10.7%
1 56
 
8.3%
~ 32
 
4.8%
4 18
 
2.7%
5 11
 
1.6%
6 9
 
1.3%
9 9
 
1.3%
Other values (2) 14
 
2.1%

2023년 누출검사 조사대상
Text

MISSING

Distinct: 6
Distinct (%): 100.0%
Missing: 69
Missing (%): 92.0%
Memory size: 732.0 B

Length

Max length: 21
Median length: 21
Mean length: 21
Min length: 21

Characters and Unicode

Total characters: 126
Distinct characters: 11
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6
Unique (%): 100.0%

Sample

1st row2023-06-30~2023-09-29
2nd row2023-12-20~2024-03-19
3rd row2023-02-21~2023-05-20
4th row2023-01-23~2023-04-22
5th row2023-03-20~2023-06-19
Value | Count | Frequency (%)
2023-06-30~2023-09-29 | 1 | 16.7%
2023-12-20~2024-03-19 | 1 | 16.7%
2023-02-21~2023-05-20 | 1 | 16.7%
2023-01-23~2023-04-22 | 1 | 16.7%
2023-03-20~2023-06-19 | 1 | 16.7%
2023-04-12~2023-07-11 | 1 | 16.7%

Most occurring characters

ValueCountFrequency (%)
2 35
27.8%
0 27
21.4%
- 24
19.0%
3 15
11.9%
1 8
 
6.3%
~ 6
 
4.8%
9 4
 
3.2%
4 3
 
2.4%
6 2
 
1.6%
5 1
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number | 96 | 76.2%
Dash Punctuation | 24 | 19.0%
Math Symbol | 6 | 4.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 35
36.5%
0 27
28.1%
3 15
15.6%
1 8
 
8.3%
9 4
 
4.2%
4 3
 
3.1%
6 2
 
2.1%
5 1
 
1.0%
7 1
 
1.0%
Dash Punctuation
ValueCountFrequency (%)
- 24
100.0%
Math Symbol
ValueCountFrequency (%)
~ 6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 126
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 35
27.8%
0 27
21.4%
- 24
19.0%
3 15
11.9%
1 8
 
6.3%
~ 6
 
4.8%
9 4
 
3.2%
4 3
 
2.4%
6 2
 
1.6%
5 1
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 126
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 35
27.8%
0 27
21.4%
- 24
19.0%
3 15
11.9%
1 8
 
6.3%
~ 6
 
4.8%
9 4
 
3.2%
4 3
 
2.4%
6 2
 
1.6%
5 1
 
0.8%

Correlations

Variable | 업소명 | 주소 | 전화번호 | 소방법완공검사일 | 2023년 토양오염도검사 조사대상 | 2023년 누출검사 조사대상
업소명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
주소 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
전화번호 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
소방법완공검사일 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
2023년 토양오염도검사 조사대상 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
2023년 누출검사 조사대상 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
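
To make the idea of nullity correlation concrete, the sketch below computes a Pearson correlation between presence indicators (1 = value present, 0 = missing) for the two inspection-target columns. It assumes Pearson correlation on such indicators is a reasonable stand-in for what the heatmap shows, and it uses only the 20 first and last rows listed in this report, so the result is illustrative and will not match a heatmap computed on all 75 rows. The function name `pearson` and the list names are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Presence indicators for rows 0-9 and 65-74 of this report's sample.
# 1 = value present, 0 = <NA>.
soil_present = [0, 0, 0, 1, 0, 1, 1, 1, 1, 0,
                0, 1, 0, 0, 0, 1, 1, 0, 0, 0]
leak_present = [0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
                0, 1, 1, 1, 0, 0, 0, 0, 0, 0]

nullity_corr = pearson(soil_present, leak_present)
print(round(nullity_corr, 3))
```

A value near 1 would mean the two columns tend to be filled in together, and a value near -1 would mean one tends to be present when the other is missing.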

Sample

First rows

Index | 업소명 | 주소 | 전화번호 | 소방법완공검사일 | 2023년 토양오염도검사 조사대상 | 2023년 누출검사 조사대상
0 | 대창주유소 | 경상남도 창녕군 대합면 우포2로 850(소야리 523-6) | 055-533-1210 | 1993-01-06 | <NA> | <NA>
1 | 화명주유소 | 경상남도 창녕군 부곡면 온천로 1173(수다리 315-5) | 055-536-5667 | 1987-08-22 | <NA> | <NA>
2 | 남지농협 고곡주유소 | 경상남도 창녕군 남지읍 박진로 992(고곡리 362-4) | 055-521-8189 | 1993-02-08 | <NA> | <NA>
3 | 길곡주유소 | 경상남도 창녕군 길곡면 길곡로 4(증산리 557-7) | 055-536-9942 | 1992-02-20 | 2023-02-20~2023-05-19 | <NA>
4 | 부곡주유소 | 경상남도 창녕군 부곡면 부곡로 103 | 055-536-7800 | 1987-06-17 | <NA> | <NA>
5 | 청룡휴게주유소 | 경상남도 창녕군 창녕읍 경남대로 4237(여초리 281-24) | 055-532-3244 | 1996-02-01 | 2023-02-01~2023-04-30 | <NA>
6 | 고암주유소 | 경상남도 창녕군 고암면 창밀로 366(우천리 974-1) | 055-533-0888 | 1992-02-06 | 2023-02-06~2023-05-05 | <NA>
7 | 대성주유소 | 경상남도 창녕군 유어면 우포1대로 642 | 055-532-7049 | 2004-06-25 | 2023-06-30~2023-09-29 | 2023-06-30~2023-09-29
8 | 계성주유소 | 경상남도 창녕군 계성면 영산계성로 463 | 055-521-0001 | 1990-12-28 | 2023-12-28~2024-03-27 | <NA>
9 | 대초주유소 | 경상남도 창녕군 대지면 우포1대로 1185(본초리 536) | 055-532-9751 | 1991-08-07 | <NA> | <NA>

Last rows

Index | 업소명 | 주소 | 전화번호 | 소방법완공검사일 | 2023년 토양오염도검사 조사대상 | 2023년 누출검사 조사대상
65 | 육군 제9715부대(8611부대) | 경상남도 창녕군 고암면 경남대로 5078-66(원촌리 306번지 외55필지) | 055-880-4560 | 2012-11-12 | <NA> | <NA>
66 | 부곡알뜰주유소 | 경상남도 창녕군 부곡면 온천로 692(부곡리 631-14,62) | 055-521-4661 | 2013-01-23 | 2023-01-23~2023-04-22 | 2023-01-23~2023-04-22
67 | 창녕농협클린주유소 | 경상남도 창녕군 창녕읍 술정리 302-4 | 055-532-9701 | 2013-03-20 | <NA> | 2023-03-20~2023-06-19
68 | ㈜세아베스틸 | 경상남도 창녕군 대합면 대합산업단지로 100 | 055-530-8515 | 2013-04-12 | <NA> | 2023-04-12~2023-07-11
69 | 육군 제5870부대 | 경상남도 창녕군 고암면 경남대로 5078-66 | <NA> | 2014-07-03 | <NA> | <NA>
70 | ㈜두남환경 | 경상남도 창녕군 대지면 우포1대로 1218 | 055-532-5700 | 2008-07-29 | 2023-07-29~2023-10-28 | <NA>
71 | 천일여객(창녕지사) | 경상남도 창녕군 창녕읍 창녕읍 창녕대로 11 | 051-559-1032 | 2000-08-11 | 2023-08-11~2023-11-10 | <NA>
72 | 제이엠6주유소 | 경상남도 창녕군 성산면 경남대로 5742-6 | <NA> | 2017-11-30 | <NA> | <NA>
73 | ㈜잼텍 | 경상남도 창녕군 영산면 서리상촌길 307-50 | 055-357-1641 | 2012-06-11 | <NA> | <NA>
74 | (주)엘엠에이티 | 경상남도 창녕군 대합면 대합산업단지로 87 | <NA> | 2020-07-03 | <NA> | <NA>