Overview

Dataset statistics

Number of variables: 5
Number of observations: 110
Missing cells: 13
Missing cells (%): 2.4%
Duplicate rows: 1
Duplicate rows (%): 0.9%
Total size in memory: 4.5 KiB
Average record size in memory: 42.2 B

Variable types

Numeric: 1
Text: 3
DateTime: 1

Dataset

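Description: Status of the facilities listed in the soil contamination management register within Seo-gu, Daegu Metropolitan City. Includes the business name and the lot-number and road-name addresses of each site.
Author: 대구광역시 서구 (Seo-gu, Daegu Metropolitan City)
URL: https://www.data.go.kr/data/15092019/fileData.do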

Alerts

데이터기준일자 has constant value "" [Constant]
Dataset has 1 (0.9%) duplicate rows [Duplicates]
연번 has 2 (1.8%) missing values [Missing]
상호 has 2 (1.8%) missing values [Missing]
소재지도로명주소 has 3 (2.7%) missing values [Missing]
소재지지번주소 has 4 (3.6%) missing values [Missing]
데이터기준일자 has 2 (1.8%) missing values [Missing]
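
The percentages in the dataset statistics and alerts can be cross-checked by hand. The sketch below assumes that per-column percentages are taken over the 110 observations and that the overall figure is taken over all 110 x 5 cells; the variable names are illustrative and not part of the report.

```python
# Missing-value counts per column, copied from the alerts in this report.
missing_by_column = {
    "연번": 2,
    "상호": 2,
    "소재지도로명주소": 3,
    "소재지지번주소": 4,
    "데이터기준일자": 2,
}
n_rows = 110     # "Number of observations"
n_columns = 5    # "Number of variables"


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


total_missing = sum(missing_by_column.values())
total_cells = n_rows * n_columns

# Assumption: column percentages use the row count as the denominator.
per_column_pct = {
    name: percent(count, n_rows) for name, count in missing_by_column.items()
}
overall_pct = percent(total_missing, total_cells)
duplicate_pct = percent(1, n_rows)  # 1 duplicate row out of 110

print(total_missing, overall_pct, duplicate_pct)
```

Under these assumptions the per-column counts add up to the 13 missing cells reported in the overview.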

Reproduction

Analysis started: 2024-03-14 12:20:20.172701
Analysis finished: 2024-03-14 12:20:21.595663
Duration: 1.42 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
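
A report like this one is typically produced by loading the source file into a pandas DataFrame and passing it to `ProfileReport`. The following is a minimal sketch, not the configuration actually used here: the helper name `build_report`, the report title, and the `cp949` default encoding are assumptions, and the real settings live in the downloadable `config.json`.

```python
def build_report(csv_path, html_path, encoding="cp949"):
    """Profile a CSV file and write an HTML report (hypothetical helper).

    The encoding default is an assumption; adjust it to match the file.
    """
    # Imported lazily so the helper can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    # The title is illustrative; the original report's title is not shown here.
    report = ProfileReport(df, title="Soil contamination facility register")
    report.to_file(html_path)
    return report
```

Calling `build_report("input.csv", "report.html")` with a local copy of the dataset would write the HTML file; both file names are placeholders.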

Variables

연번
Real number (ℝ)

MISSING 

Distinct: 108
Distinct (%): 100.0%
Missing: 2
Missing (%): 1.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 54.5
Minimum: 1
Maximum: 108
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 6.35
Q1: 27.75
Median: 54.5
Q3: 81.25
95-th percentile: 102.65
Maximum: 108
Range: 107
Interquartile range (IQR): 53.5

Descriptive statistics

Standard deviation: 31.32092
Coefficient of variation (CV): 0.57469577
Kurtosis: -1.2
Mean: 54.5
Median Absolute Deviation (MAD): 27
Skewness: 0
Sum: 5886
Variance: 981
Monotonicity: Strictly increasing
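
These figures are what one would expect from a plain running index. As a sketch, assuming the 108 non-missing 연번 values are exactly the integers 1 through 108 (consistent with the distinct count, minimum, maximum, sum, and strict monotonicity above, but not stated outright in the report), the statistics can be recomputed with the standard library:

```python
import statistics

# Assumption: the non-missing values are the integers 1..108.
values = list(range(1, 109))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)       # sample standard deviation
cv = std_dev / mean                      # coefficient of variation
median = statistics.median(values)
# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)
# Quartiles with linear interpolation between the closest ranks.
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

print(total, mean, variance, round(std_dev, 5), round(cv, 8), mad, q1, q3, iqr)
```

The sample (n - 1) variance and the inclusive quantile method are the choices that reproduce the reported numbers under this assumption.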
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
70 | 1 | 0.9%
81 | 1 | 0.9%
80 | 1 | 0.9%
79 | 1 | 0.9%
78 | 1 | 0.9%
77 | 1 | 0.9%
76 | 1 | 0.9%
75 | 1 | 0.9%
74 | 1 | 0.9%
73 | 1 | 0.9%
Other values (98) | 98 | 89.1%
(Missing) | 2 | 1.8%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 0.9%
2 | 1 | 0.9%
3 | 1 | 0.9%
4 | 1 | 0.9%
5 | 1 | 0.9%
6 | 1 | 0.9%
7 | 1 | 0.9%
8 | 1 | 0.9%
9 | 1 | 0.9%
10 | 1 | 0.9%

Maximum 10 values

Value | Count | Frequency (%)
108 | 1 | 0.9%
107 | 1 | 0.9%
106 | 1 | 0.9%
105 | 1 | 0.9%
104 | 1 | 0.9%
103 | 1 | 0.9%
102 | 1 | 0.9%
101 | 1 | 0.9%
100 | 1 | 0.9%
99 | 1 | 0.9%

상호
Text

MISSING 

Distinct: 108
Distinct (%): 100.0%
Missing: 2
Missing (%): 1.8%
Memory size: 1008.0 B

Length

Max length: 21
Median length: 17
Mean length: 6.9814815
Min length: 3

Characters and Unicode

Total characters: 754
Distinct characters: 164
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
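
As an illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's `unicodedata` module over the five sample values listed for this column. It only covers the general category; the standard library does not expose script or block properties, so those parts of the report are not reproduced.

```python
import unicodedata
from collections import Counter

# The five sample values shown for the 상호 column in this report.
samples = [
    "(주)대일종합에너지 서부지점 행복주유소",
    "지에스칼텍스(주)달구벌대로주유소",
    "이현공단주유소",
    "지민주유소",
    "꽉주유소",
]

# Two-letter general category codes, e.g. Lo = Other Letter,
# Ps = Open Punctuation, Pe = Close Punctuation, Zs = Space Separator.
category_counts = Counter(
    unicodedata.category(ch) for name in samples for ch in name
)
lengths = [len(name) for name in samples]

print(dict(category_counts), lengths)
```

Hangul syllables fall under "Other Letter" (Lo), which is why that category dominates the tables for this column.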

Unique

Unique: 108
Unique (%): 100.0%

Sample

1st row: (주)대일종합에너지 서부지점 행복주유소
2nd row: 지에스칼텍스(주)달구벌대로주유소
3rd row: 이현공단주유소
4th row: 지민주유소
5th row: 꽉주유소

Value | Count | Frequency (%)
주유소 | 2 | 1.7%
주)동진상사 | 1 | 0.8%
미창도금 | 1 | 0.8%
한라테크 | 1 | 0.8%
우진표면테크 | 1 | 0.8%
jp산업 | 1 | 0.8%
amp테크 | 1 | 0.8%
진영산업 | 1 | 0.8%
삼보산업 | 1 | 0.8%
주)부광에프디 | 1 | 0.8%
Other values (109) | 109 | 90.8%

Most occurring characters

ValueCountFrequency (%)
65
 
8.6%
50
 
6.6%
36
 
4.8%
( 34
 
4.5%
) 34
 
4.5%
28
 
3.7%
17
 
2.3%
14
 
1.9%
13
 
1.7%
13
 
1.7%
Other values (154) 450
59.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 653
86.6%
Open Punctuation 34
 
4.5%
Close Punctuation 34
 
4.5%
Uppercase Letter 15
 
2.0%
Space Separator 12
 
1.6%
Decimal Number 4
 
0.5%
Lowercase Letter 1
 
0.1%
Other Symbol 1
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
65
 
10.0%
50
 
7.7%
36
 
5.5%
28
 
4.3%
17
 
2.6%
14
 
2.1%
13
 
2.0%
13
 
2.0%
13
 
2.0%
12
 
1.8%
Other values (137) 392
60.0%
Uppercase Letter
ValueCountFrequency (%)
C 3
20.0%
T 2
13.3%
M 2
13.3%
P 2
13.3%
I 2
13.3%
A 1
 
6.7%
J 1
 
6.7%
S 1
 
6.7%
F 1
 
6.7%
Decimal Number
ValueCountFrequency (%)
8 2
50.0%
1 1
25.0%
2 1
25.0%
Open Punctuation
ValueCountFrequency (%)
( 34
100.0%
Close Punctuation
ValueCountFrequency (%)
) 34
100.0%
Space Separator
ValueCountFrequency (%)
12
100.0%
Lowercase Letter
ValueCountFrequency (%)
e 1
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 654
86.7%
Common 84
 
11.1%
Latin 16
 
2.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
65
 
9.9%
50
 
7.6%
36
 
5.5%
28
 
4.3%
17
 
2.6%
14
 
2.1%
13
 
2.0%
13
 
2.0%
13
 
2.0%
12
 
1.8%
Other values (138) 393
60.1%
Latin
ValueCountFrequency (%)
C 3
18.8%
T 2
12.5%
M 2
12.5%
P 2
12.5%
I 2
12.5%
A 1
 
6.2%
J 1
 
6.2%
S 1
 
6.2%
e 1
 
6.2%
F 1
 
6.2%
Common
ValueCountFrequency (%)
( 34
40.5%
) 34
40.5%
12
 
14.3%
8 2
 
2.4%
1 1
 
1.2%
2 1
 
1.2%

Most occurring blocks

ValueCountFrequency (%)
Hangul 653
86.6%
ASCII 100
 
13.3%
None 1
 
0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
65
 
10.0%
50
 
7.7%
36
 
5.5%
28
 
4.3%
17
 
2.6%
14
 
2.1%
13
 
2.0%
13
 
2.0%
13
 
2.0%
12
 
1.8%
Other values (137) 392
60.0%
ASCII
ValueCountFrequency (%)
( 34
34.0%
) 34
34.0%
12
 
12.0%
C 3
 
3.0%
T 2
 
2.0%
M 2
 
2.0%
P 2
 
2.0%
I 2
 
2.0%
8 2
 
2.0%
A 1
 
1.0%
Other values (6) 6
 
6.0%
None
ValueCountFrequency (%)
1
100.0%

소재지도로명주소
Text

MISSING
Distinct: 105
Distinct (%): 98.1%
Missing: 3
Missing (%): 2.7%
Memory size: 1008.0 B

Length

Max length: 44
Median length: 28
Mean length: 24.448598
Min length: 21

Characters and Unicode

Total characters: 2616
Distinct characters: 70
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 103
Unique (%): 96.3%

Sample

1st row: 대구광역시 서구 국채보상로 122 (중리동)
2nd row: 대구광역시 서구 달구벌대로 1845 (내당동)
3rd row: 대구광역시 서구 국채보상로 181 (평리동)
4th row: 대구광역시 서구 북비산로 177 (평리동)
5th row: 대구광역시 서구 국채보상로 317 (평리동)

Value | Count | Frequency (%)
대구광역시 | 107 | 19.9%
서구 | 106 | 19.7%
이현동 | 33 | 6.1%
비산동 | 25 | 4.6%
평리동 | 21 | 3.9%
중리동 | 19 | 3.5%
북비산로 | 11 | 2.0%
서대구로 | 7 | 1.3%
국채보상로 | 7 | 1.3%
와룡로 | 6 | 1.1%
Other values (152) | 196 | 36.4%

Most occurring characters

ValueCountFrequency (%)
434
16.6%
223
 
8.5%
119
 
4.5%
119
 
4.5%
( 108
 
4.1%
) 108
 
4.1%
108
 
4.1%
107
 
4.1%
107
 
4.1%
107
 
4.1%
Other values (60) 1076
41.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 1570
60.0%
Space Separator 434
 
16.6%
Decimal Number 370
 
14.1%
Open Punctuation 108
 
4.1%
Close Punctuation 108
 
4.1%
Dash Punctuation 22
 
0.8%
Other Punctuation 4
 
0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
223
14.2%
119
 
7.6%
119
 
7.6%
108
 
6.9%
107
 
6.8%
107
 
6.8%
107
 
6.8%
107
 
6.8%
53
 
3.4%
46
 
2.9%
Other values (45) 474
30.2%
Decimal Number
ValueCountFrequency (%)
1 79
21.4%
2 48
13.0%
7 48
13.0%
3 40
10.8%
4 35
9.5%
8 28
 
7.6%
6 28
 
7.6%
0 26
 
7.0%
5 20
 
5.4%
9 18
 
4.9%
Space Separator
ValueCountFrequency (%)
434
100.0%
Open Punctuation
ValueCountFrequency (%)
( 108
100.0%
Close Punctuation
ValueCountFrequency (%)
) 108
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 22
100.0%
Other Punctuation
ValueCountFrequency (%)
, 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1570
60.0%
Common 1046
40.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
223
14.2%
119
 
7.6%
119
 
7.6%
108
 
6.9%
107
 
6.8%
107
 
6.8%
107
 
6.8%
107
 
6.8%
53
 
3.4%
46
 
2.9%
Other values (45) 474
30.2%
Common
ValueCountFrequency (%)
434
41.5%
( 108
 
10.3%
) 108
 
10.3%
1 79
 
7.6%
2 48
 
4.6%
7 48
 
4.6%
3 40
 
3.8%
4 35
 
3.3%
8 28
 
2.7%
6 28
 
2.7%
Other values (5) 90
 
8.6%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1570
60.0%
ASCII 1046
40.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
434
41.5%
( 108
 
10.3%
) 108
 
10.3%
1 79
 
7.6%
2 48
 
4.6%
7 48
 
4.6%
3 40
 
3.8%
4 35
 
3.3%
8 28
 
2.7%
6 28
 
2.7%
Other values (5) 90
 
8.6%
Hangul
ValueCountFrequency (%)
223
14.2%
119
 
7.6%
119
 
7.6%
108
 
6.9%
107
 
6.8%
107
 
6.8%
107
 
6.8%
107
 
6.8%
53
 
3.4%
46
 
2.9%
Other values (45) 474
30.2%

소재지지번주소
Text

MISSING 

Distinct: 104
Distinct (%): 98.1%
Missing: 4
Missing (%): 3.6%
Memory size: 1008.0 B

Length

Max length: 23
Median length: 22
Mean length: 19.660377
Min length: 17

Characters and Unicode

Total characters: 2084
Distinct characters: 36
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 102
Unique (%): 96.2%

Sample

1st row: 대구광역시 서구 중리동 1082-7
2nd row: 대구광역시 서구 내당동 62-1
3rd row: 대구광역시 서구 평리동 1527-2
4th row: 대구광역시 서구 평리동 544
5th row: 대구광역시 서구 평리동 1052-9

Value | Count | Frequency (%)
대구광역시 | 106 | 24.8%
서구 | 105 | 24.6%
이현동 | 33 | 7.7%
비산동 | 26 | 6.1%
평리동 | 21 | 4.9%
중리동 | 18 | 4.2%
내당동 | 3 | 0.7%
42-226 | 2 | 0.5%
44-16 | 2 | 0.5%
상리동 | 2 | 0.5%
Other values (109) | 109 | 25.5%

Most occurring characters

ValueCountFrequency (%)
427
20.5%
212
 
10.2%
108
 
5.2%
107
 
5.1%
106
 
5.1%
106
 
5.1%
106
 
5.1%
105
 
5.0%
- 92
 
4.4%
1 92
 
4.4%
Other values (26) 623
29.9%

Most occurring categories

ValueCountFrequency (%)
Other Letter 1068
51.2%
Decimal Number 496
23.8%
Space Separator 427
 
20.5%
Dash Punctuation 92
 
4.4%
Other Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
212
19.9%
108
10.1%
107
10.0%
106
9.9%
106
9.9%
106
9.9%
105
9.8%
41
 
3.8%
33
 
3.1%
33
 
3.1%
Other values (13) 111
10.4%
Decimal Number
ValueCountFrequency (%)
1 92
18.5%
2 91
18.3%
4 74
14.9%
0 59
11.9%
3 40
8.1%
5 37
7.5%
6 35
 
7.1%
7 25
 
5.0%
8 24
 
4.8%
9 19
 
3.8%
Space Separator
ValueCountFrequency (%)
427
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 92
100.0%
Other Punctuation
ValueCountFrequency (%)
, 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1068
51.2%
Common 1016
48.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
212
19.9%
108
10.1%
107
10.0%
106
9.9%
106
9.9%
106
9.9%
105
9.8%
41
 
3.8%
33
 
3.1%
33
 
3.1%
Other values (13) 111
10.4%
Common
ValueCountFrequency (%)
427
42.0%
- 92
 
9.1%
1 92
 
9.1%
2 91
 
9.0%
4 74
 
7.3%
0 59
 
5.8%
3 40
 
3.9%
5 37
 
3.6%
6 35
 
3.4%
7 25
 
2.5%
Other values (3) 44
 
4.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1068
51.2%
ASCII 1016
48.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
427
42.0%
- 92
 
9.1%
1 92
 
9.1%
2 91
 
9.0%
4 74
 
7.3%
0 59
 
5.8%
3 40
 
3.9%
5 37
 
3.6%
6 35
 
3.4%
7 25
 
2.5%
Other values (3) 44
 
4.3%
Hangul
ValueCountFrequency (%)
212
19.9%
108
10.1%
107
10.0%
106
9.9%
106
9.9%
106
9.9%
105
9.8%
41
 
3.8%
33
 
3.1%
33
 
3.1%
Other values (13) 111
10.4%

데이터기준일자
Date

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 0.9%
Missing: 2
Missing (%): 1.8%
Memory size: 1008.0 B
Minimum: 2024-02-07 00:00:00
Maximum: 2024-02-07 00:00:00
Histogram with fixed size bins (bins=1)

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
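
Nullity correlation can be read as the ordinary Pearson correlation between per-column "is missing" indicators. The sketch below illustrates the idea on two hypothetical indicator columns (1 = missing, 0 = present); the data and the `pearson` helper are invented for illustration and are not taken from this dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical missingness indicators for two columns over six rows.
missing_a = [0, 0, 0, 0, 1, 1]
missing_b = [0, 0, 0, 1, 1, 1]

nullity_corr = pearson(missing_a, missing_b)
print(round(nullity_corr, 4))
```

A value near 1 means the two columns tend to be missing in the same rows, and a value near -1 means one tends to be present when the other is missing.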

Sample

First rows

(index) | 연번 | 상호 | 소재지도로명주소 | 소재지지번주소 | 데이터기준일자
0 | 1 | (주)대일종합에너지 서부지점 행복주유소 | 대구광역시 서구 국채보상로 122 (중리동) | 대구광역시 서구 중리동 1082-7 | 2024-02-07
1 | 2 | 지에스칼텍스(주)달구벌대로주유소 | 대구광역시 서구 달구벌대로 1845 (내당동) | 대구광역시 서구 내당동 62-1 | 2024-02-07
2 | 3 | 이현공단주유소 | 대구광역시 서구 국채보상로 181 (평리동) | 대구광역시 서구 평리동 1527-2 | 2024-02-07
3 | 4 | 지민주유소 | 대구광역시 서구 북비산로 177 (평리동) | 대구광역시 서구 평리동 544 | 2024-02-07
4 | 5 | 꽉주유소 | 대구광역시 서구 국채보상로 317 (평리동) | 대구광역시 서구 평리동 1052-9 | 2024-02-07
5 | 6 | 명조주유소 | 대구광역시 서구 평리로 156 (중리동) | 대구광역시 서구 중리동 703-1 | 2024-02-07
6 | 7 | 나혜주유소 | 대구광역시 서구 서대구로 170 (평리동) | 대구광역시 서구 평리동 1086-4 | 2024-02-07
7 | 8 | (주)에스앤에스주유소 | 대구광역시 서구 달구벌대로 1833 (내당동) | 대구광역시 서구 내당동 63-4 | 2024-02-07
8 | 9 | 태화주유소 | 대구광역시 서구 평리로 260 (내당동) | 대구광역시 서구 내당동 300-4 | 2024-02-07
9 | 10 | 행복제1주유소 | 대구광역시 서구 고성로 104 (원대동1가) | 대구광역시 서구 원대동1가 208 | 2024-02-07

Last rows

(index) | 연번 | 상호 | 소재지도로명주소 | 소재지지번주소 | 데이터기준일자
100 | 101 | 삼진산업(대구경북삼진세탁동우회) | 대구광역시 서구 와룡로 447-7 (이현동) | 대구광역시 서구 이현동 42-481 | 2024-02-07
101 | 102 | (주)명진섬유 | 대구광역시 서구 달서천로 42 (이현동) | 대구광역시 서구 이현동 526-1 | 2024-02-07
102 | 103 | 한신윤활유 | 대구광역시 서구 염색공단천로 62 (비산동) | 대구광역시 서구 비산동 2028-51 | 2024-02-07
103 | 104 | 안성염직공업사 | 대구광역시 서구 염색공단천로19길 36 (비산동) | 대구광역시 서구 비산동 3186-2 | 2024-02-07
104 | 105 | 원대금속 | 대구광역시 서구 와룡로 350 (중리동) | 대구광역시 서구 중리동 1119-1 | 2024-02-07
105 | 106 | 태성금속 | 대구광역시 서구 와룡로 377-8 (중리동) | 대구광역시 서구 중리동 1030-24 | 2024-02-07
106 | 107 | 우신일랙코 | 대구광역시 서구 문화로7길 32 (이현동) | 대구광역시 서구 이현동 42-402 | 2024-02-07
107 | 108 | 경원인더스트리(주) | 대구광역시 서구 와룡로66길 7-8 (중리동) | <NA> | 2024-02-07
108 | <NA> | <NA> | <NA> | <NA> | <NA>
109 | <NA> | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

(index) | 연번 | 상호 | 소재지도로명주소 | 소재지지번주소 | 데이터기준일자 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | 2
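
The duplicate group here is simply the two trailing rows in which every field is missing. As a rough sketch of the idea, duplicates can be found by counting identical row tuples; the rows below are the last three sample rows of this report, with `None` standing in for `<NA>` as an assumption about how missing values would be represented in plain Python.

```python
from collections import Counter

NA = None  # placeholder for <NA>; an assumption made for this sketch

# The last three rows shown in the sample (연번, 상호, 도로명주소, 지번주소, 기준일자).
last_rows = [
    (108, "경원인더스트리(주)", "대구광역시 서구 와룡로66길 7-8 (중리동)", NA, "2024-02-07"),
    (NA, NA, NA, NA, NA),
    (NA, NA, NA, NA, NA),
]

# Count identical rows and keep only those seen more than once.
row_counts = Counter(last_rows)
duplicate_groups = {row: n for row, n in row_counts.items() if n > 1}

print(duplicate_groups)
```

Dropping rows that are entirely empty before profiling would likely clear both the duplicate alert and most of the missing-value alerts, though whether that is appropriate depends on how the source file is maintained.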