Overview

Dataset statistics

Number of variables: 9
Number of observations: 44
Missing cells: 80
Missing cells (%): 20.2%
Duplicate rows: 1
Duplicate rows (%): 2.3%
Total size in memory: 3.3 KiB
Average record size in memory: 76.0 B
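
The percentages in these statistics follow directly from the raw counts. A minimal sketch of that arithmetic in plain Python, using only figures stated in this overview; the helper name `percent` is illustrative and is not part of ydata-profiling:

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


n_variables = 9
n_observations = 44
missing_cells = 80
duplicate_rows = 1

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations

print(percent(missing_cells, total_cells))      # share of missing cells
print(percent(duplicate_rows, n_observations))  # share of duplicate rows
```

The same helper reproduces the per-variable figure: 9 missing values out of 44 observations.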

Variable types

Text: 4
Categorical: 4
Unsupported: 1

Dataset

Description: Environmental specialty construction business registration status (환경전문공사업등록현황)
Author: Jeollabuk-do (전라북도)
URL: https://www.bigdatahub.go.kr/opendata/dataSet/detail.nm?contentId=37&rlik=49451aebf056b486&serviceId=202067

Alerts

Dataset has 1 (2.3%) duplicate rows [Duplicates]
대기 is highly overall correlated with the blank-named column and 2 other fields [High correlation]
The blank-named column is highly overall correlated with 대기 and 2 other fields [High correlation]
소음진동 is highly overall correlated with the blank-named column and 2 other fields [High correlation]
수질 is highly overall correlated with the blank-named column and 2 other fields [High correlation]
연번 has 9 (20.5%) missing values [Missing]
업소명 has 9 (20.5%) missing values [Missing]
영업소 소재지 has 9 (20.5%) missing values [Missing]
전화번호 has 9 (20.5%) missing values [Missing]
Unnamed: 8 has 44 (100.0%) missing values [Missing]
Unnamed: 8 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
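
Several of these alerts point at the same cleanup: a column that is empty in every row and rows that are empty in every column. A minimal, dependency-free sketch of that cleanup follows. It assumes rows are held as dictionaries with `None` for a missing cell; the helper name `drop_empty` and the three sample rows are illustrative, not the full dataset.

```python
def drop_empty(rows):
    """Drop columns missing in every row, then rows missing in every column."""
    if not rows:
        return []
    columns = list(rows[0].keys())
    # Keep a column only if at least one row has a value for it.
    kept_cols = [c for c in columns if any(r[c] is not None for r in rows)]
    trimmed = [{c: r[c] for c in kept_cols} for r in rows]
    # Keep a row only if at least one remaining cell has a value.
    return [r for r in trimmed if any(v is not None for v in r.values())]


rows = [
    {"연번": "1", "업소명": "광진환경㈜", "Unnamed: 8": None},
    {"연번": "2", "업소명": "㈜유성테크", "Unnamed: 8": None},
    {"연번": None, "업소명": None, "Unnamed: 8": None},
]
cleaned = drop_empty(rows)
print(cleaned)
```

With pandas, `DataFrame.dropna(axis=1, how="all")` followed by `dropna(axis=0, how="all")` expresses the same idea.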

Reproduction

Analysis started: 2024-03-14 01:17:24.953859
Analysis finished: 2024-03-14 01:17:25.569861
Duration: 0.62 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번 (serial number)
Text

MISSING

Distinct: 35
Distinct (%): 100.0%
Missing: 9
Missing (%): 20.5%
Memory size: 484.0 B

Length

Max length: 2
Median length: 2
Mean length: 1.7142857
Min length: 1

Characters and Unicode

Total characters: 60
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
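
As a rough illustration of how such character properties can be tallied, the sketch below counts characters by Unicode general category using Python's standard `unicodedata` module. The function name `category_counts` is illustrative. The standard library exposes the general category (for example `Lo`, `Nd`, `Ps`) but not the script or block, so only the category breakdown is shown.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category code (e.g. 'Lo', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)


# Two values taken from the sample rows of this report.
print(category_counts("(유)지구엔비텍"))
print(category_counts("063-254-8150"))
```

The two-letter codes correspond to the category names used in this report: `Lo` is Other Letter, `Nd` is Decimal Number, `Ps` and `Pe` are Open and Close Punctuation, and `Pd` is Dash Punctuation.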

Unique

Unique: 35
Unique (%): 100.0%

Sample

1st row: (a single Hangul character, not preserved in this extract)
2nd row: 1
3rd row: 2
4th row: 3
5th row: 4

Value | Count | Frequency (%)
1 | 1 | 2.9%
27 | 1 | 2.9%
20 | 1 | 2.9%
21 | 1 | 2.9%
22 | 1 | 2.9%
23 | 1 | 2.9%
24 | 1 | 2.9%
25 | 1 | 2.9%
34 | 1 | 2.9%
19 | 1 | 2.9%
Other values (25) | 25 | 71.4%

Most occurring characters

Value | Count | Frequency (%)
1 | 14 | 23.3%
2 | 14 | 23.3%
3 | 9 | 15.0%
4 | 4 | 6.7%
8 | 3 | 5.0%
5 | 3 | 5.0%
6 | 3 | 5.0%
7 | 3 | 5.0%
9 | 3 | 5.0%
0 | 3 | 5.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 59 | 98.3%
Other Letter | 1 | 1.7%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 14 | 23.7%
2 | 14 | 23.7%
3 | 9 | 15.3%
4 | 4 | 6.8%
8 | 3 | 5.1%
5 | 3 | 5.1%
6 | 3 | 5.1%
7 | 3 | 5.1%
9 | 3 | 5.1%
0 | 3 | 5.1%

Other Letter
Value | Count | Frequency (%)
(not preserved) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 59 | 98.3%
Hangul | 1 | 1.7%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 14 | 23.7%
2 | 14 | 23.7%
3 | 9 | 15.3%
4 | 4 | 6.8%
8 | 3 | 5.1%
5 | 3 | 5.1%
6 | 3 | 5.1%
7 | 3 | 5.1%
9 | 3 | 5.1%
0 | 3 | 5.1%

Hangul
Value | Count | Frequency (%)
(not preserved) | 1 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 59 | 98.3%
Hangul | 1 | 1.7%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 14 | 23.7%
2 | 14 | 23.7%
3 | 9 | 15.3%
4 | 4 | 6.8%
8 | 3 | 5.1%
5 | 3 | 5.1%
6 | 3 | 5.1%
7 | 3 | 5.1%
9 | 3 | 5.1%
0 | 3 | 5.1%

Hangul
Value | Count | Frequency (%)
(not preserved) | 1 | 100.0%

업소명 (business name)
Text

MISSING

Distinct: 35
Distinct (%): 100.0%
Missing: 9
Missing (%): 20.5%
Memory size: 484.0 B

Length

Max length: 12
Median length: 10
Mean length: 7.2857143
Min length: 1

Characters and Unicode

Total characters: 255
Distinct characters: 82
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 35
Unique (%): 100.0%

Sample

1st row: -
2nd row: 광진환경㈜
3rd row: ㈜유성테크
4th row: (유)지구엔비텍
5th row: (유)호일엔지니어링

Value | Count | Frequency (%)
광진환경㈜ | 1 | 2.7%
(blank) | 1 | 2.7%
유)대신환경개발 | 1 | 2.7%
㈜새롬이엔지 | 1 | 2.7%
유)부광건설 | 1 | 2.7%
유)푸른이엔텍 | 1 | 2.7%
유)일토씨엔엠 | 1 | 2.7%
유)미래건설 | 1 | 2.7%
주)파인리포먼스 | 1 | 2.7%
유)대금환경 | 1 | 2.7%
Other values (27) | 27 | 73.0%

Most occurring characters

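In the tables below, "(not preserved)" marks a character that is missing from this extract.

Value | Count | Frequency (%)
( | 23 | 9.0%
) | 23 | 9.0%
(not preserved) | 18 | 7.1%
(not preserved) | 11 | 4.3%
(not preserved) | 9 | 3.5%
(not preserved) | 9 | 3.5%
(not preserved) | 8 | 3.1%
(not preserved) | 8 | 3.1%
(not preserved) | 8 | 3.1%
(not preserved) | 8 | 3.1%
Other values (72) | 130 | 51.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 195 | 76.5%
Open Punctuation | 23 | 9.0%
Close Punctuation | 23 | 9.0%
Other Symbol | 11 | 4.3%
Space Separator | 2 | 0.8%
Dash Punctuation | 1 | 0.4%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 18 | 9.2%
(not preserved) | 9 | 4.6%
(not preserved) | 9 | 4.6%
(not preserved) | 8 | 4.1%
(not preserved) | 8 | 4.1%
(not preserved) | 8 | 4.1%
(not preserved) | 8 | 4.1%
(not preserved) | 6 | 3.1%
(not preserved) | 6 | 3.1%
(not preserved) | 6 | 3.1%
Other values (67) | 109 | 55.9%

Open Punctuation
Value | Count | Frequency (%)
( | 23 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 23 | 100.0%

Other Symbol
Value | Count | Frequency (%)
(not preserved) | 11 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 2 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 206 | 80.8%
Common | 49 | 19.2%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 18 | 8.7%
(not preserved) | 11 | 5.3%
(not preserved) | 9 | 4.4%
(not preserved) | 9 | 4.4%
(not preserved) | 8 | 3.9%
(not preserved) | 8 | 3.9%
(not preserved) | 8 | 3.9%
(not preserved) | 8 | 3.9%
(not preserved) | 6 | 2.9%
(not preserved) | 6 | 2.9%
Other values (68) | 115 | 55.8%

Common
Value | Count | Frequency (%)
( | 23 | 46.9%
) | 23 | 46.9%
(space) | 2 | 4.1%
- | 1 | 2.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 195 | 76.5%
ASCII | 49 | 19.2%
None | 11 | 4.3%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
( | 23 | 46.9%
) | 23 | 46.9%
(space) | 2 | 4.1%
- | 1 | 2.0%

Hangul
Value | Count | Frequency (%)
(not preserved) | 18 | 9.2%
(not preserved) | 9 | 4.6%
(not preserved) | 9 | 4.6%
(not preserved) | 8 | 4.1%
(not preserved) | 8 | 4.1%
(not preserved) | 8 | 4.1%
(not preserved) | 8 | 4.1%
(not preserved) | 6 | 3.1%
(not preserved) | 6 | 3.1%
(not preserved) | 6 | 3.1%
Other values (67) | 109 | 55.9%

None
Value | Count | Frequency (%)
(not preserved) | 11 | 100.0%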

영업소 소재지 (business office location)
Text

MISSING

Distinct: 35
Distinct (%): 100.0%
Missing: 9
Missing (%): 20.5%
Memory size: 484.0 B

Length

Max length: 35
Median length: 25
Mean length: 21.485714
Min length: 1

Characters and Unicode

Total characters: 752
Distinct characters: 115
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 35
Unique (%): 100.0%

Sample

1st row: -
2nd row: 전주시 덕진구 호반3길 7 (덕진동2가)
3rd row: 전주시 완산구 송정중앙로 29 (효자동1가)
4th row: 전주시 덕진구 비석날로 99 (팔복동2가)
5th row: 익산시 춘포면 궁성로 272

Value | Count | Frequency (%)
전주시 | 22 | 13.6%
덕진구 | 14 | 8.6%
완산구 | 8 | 4.9%
익산시 | 5 | 3.1%
우아동3가 | 4 | 2.5%
군산시 | 3 | 1.9%
효자동1가 | 2 | 1.2%
인후동1가 | 2 | 1.2%
남원시 | 2 | 1.2%
11 | 2 | 1.2%
Other values (96) | 98 | 60.5%

Most occurring characters

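In the tables below, "(not preserved)" marks a character that is missing from this extract.

Value | Count | Frequency (%)
(space) | 127 | 16.9%
1 | 33 | 4.4%
(not preserved) | 33 | 4.4%
(not preserved) | 30 | 4.0%
( | 25 | 3.3%
) | 25 | 3.3%
2 | 25 | 3.3%
(not preserved) | 23 | 3.1%
(not preserved) | 22 | 2.9%
(not preserved) | 22 | 2.9%
Other values (105) | 387 | 51.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 415 | 55.2%
Decimal Number | 141 | 18.8%
Space Separator | 127 | 16.9%
Open Punctuation | 25 | 3.3%
Close Punctuation | 25 | 3.3%
Dash Punctuation | 11 | 1.5%
Other Punctuation | 7 | 0.9%
Uppercase Letter | 1 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 33 | 8.0%
(not preserved) | 30 | 7.2%
(not preserved) | 23 | 5.5%
(not preserved) | 22 | 5.3%
(not preserved) | 22 | 5.3%
(not preserved) | 20 | 4.8%
(not preserved) | 19 | 4.6%
(not preserved) | 18 | 4.3%
(not preserved) | 17 | 4.1%
(not preserved) | 16 | 3.9%
Other values (88) | 195 | 47.0%

Decimal Number
Value | Count | Frequency (%)
1 | 33 | 23.4%
2 | 25 | 17.7%
3 | 20 | 14.2%
0 | 13 | 9.2%
7 | 13 | 9.2%
4 | 11 | 7.8%
5 | 10 | 7.1%
8 | 6 | 4.3%
9 | 6 | 4.3%
6 | 4 | 2.8%

Other Punctuation
Value | Count | Frequency (%)
, | 6 | 85.7%
; | 1 | 14.3%

Space Separator
Value | Count | Frequency (%)
(space) | 127 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 25 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 25 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 11 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
A | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 415 | 55.2%
Common | 336 | 44.7%
Latin | 1 | 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 33 | 8.0%
(not preserved) | 30 | 7.2%
(not preserved) | 23 | 5.5%
(not preserved) | 22 | 5.3%
(not preserved) | 22 | 5.3%
(not preserved) | 20 | 4.8%
(not preserved) | 19 | 4.6%
(not preserved) | 18 | 4.3%
(not preserved) | 17 | 4.1%
(not preserved) | 16 | 3.9%
Other values (88) | 195 | 47.0%

Common
Value | Count | Frequency (%)
(space) | 127 | 37.8%
1 | 33 | 9.8%
( | 25 | 7.4%
) | 25 | 7.4%
2 | 25 | 7.4%
3 | 20 | 6.0%
0 | 13 | 3.9%
7 | 13 | 3.9%
- | 11 | 3.3%
4 | 11 | 3.3%
Other values (6) | 33 | 9.8%

Latin
Value | Count | Frequency (%)
A | 1 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 415 | 55.2%
ASCII | 337 | 44.8%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 127 | 37.7%
1 | 33 | 9.8%
( | 25 | 7.4%
) | 25 | 7.4%
2 | 25 | 7.4%
3 | 20 | 5.9%
0 | 13 | 3.9%
7 | 13 | 3.9%
- | 11 | 3.3%
4 | 11 | 3.3%
Other values (7) | 34 | 10.1%

Hangul
Value | Count | Frequency (%)
(not preserved) | 33 | 8.0%
(not preserved) | 30 | 7.2%
(not preserved) | 23 | 5.5%
(not preserved) | 22 | 5.3%
(not preserved) | 22 | 5.3%
(not preserved) | 20 | 4.8%
(not preserved) | 19 | 4.6%
(not preserved) | 18 | 4.3%
(not preserved) | 17 | 4.1%
(not preserved) | 16 | 3.9%
Other values (88) | 195 | 47.0%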

전화번호 (phone number)
Text

MISSING

Distinct: 35
Distinct (%): 100.0%
Missing: 9
Missing (%): 20.5%
Memory size: 484.0 B

Length

Max length: 12
Median length: 12
Mean length: 11.685714
Min length: 1

Characters and Unicode

Total characters: 409
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 35
Unique (%): 100.0%

Sample

1st row: -
2nd row: 063-254-8150
3rd row: 063-222-7968
4th row: 063-211-8001
5th row: 063-832-5271

Value | Count | Frequency (%)
063-254-8150 | 1 | 2.9%
063-242-7532 | 1 | 2.9%
063-221-2211 | 1 | 2.9%
063-564-1940 | 1 | 2.9%
063-245-0984 | 1 | 2.9%
063-452-1367 | 1 | 2.9%
063-631-5050 | 1 | 2.9%
063-236-3777 | 1 | 2.9%
063-858-0850 | 1 | 2.9%
063-245-7234 | 1 | 2.9%
Other values (25) | 25 | 71.4%

Most occurring characters

Value | Count | Frequency (%)
- | 69 | 16.9%
3 | 63 | 15.4%
0 | 55 | 13.4%
6 | 48 | 11.7%
2 | 48 | 11.7%
5 | 32 | 7.8%
1 | 30 | 7.3%
4 | 26 | 6.4%
8 | 20 | 4.9%
7 | 11 | 2.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 340 | 83.1%
Dash Punctuation | 69 | 16.9%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
3 | 63 | 18.5%
0 | 55 | 16.2%
6 | 48 | 14.1%
2 | 48 | 14.1%
5 | 32 | 9.4%
1 | 30 | 8.8%
4 | 26 | 7.6%
8 | 20 | 5.9%
7 | 11 | 3.2%
9 | 7 | 2.1%

Dash Punctuation
Value | Count | Frequency (%)
- | 69 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 409 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 69 | 16.9%
3 | 63 | 15.4%
0 | 55 | 13.4%
6 | 48 | 11.7%
2 | 48 | 11.7%
5 | 32 | 7.8%
1 | 30 | 7.3%
4 | 26 | 6.4%
8 | 20 | 4.9%
7 | 11 | 2.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 409 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 69 | 16.9%
3 | 63 | 15.4%
0 | 55 | 13.4%
6 | 48 | 11.7%
2 | 48 | 11.7%
5 | 32 | 7.8%
1 | 30 | 7.3%
4 | 26 | 6.4%
8 | 20 | 4.9%
7 | 11 | 2.7%


(column with a blank name)
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 11.4%
Missing: 0
Missing (%): 0.0%
Memory size: 484.0 B

Length

Max length: 4
Median length: 1
Mean length: 1.6363636
Min length: 1

Unique

Unique: 2
Unique (%): 4.5%

Sample

1st row: 38
2nd row: 1
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value | Count | Frequency (%)
1 | 28 | 63.6%
<NA> | 9 | 20.5%
2 | 5 | 11.4%
38 | 1 | 2.3%
- | 1 | 2.3%

Length

[Figure omitted: histogram of lengths of the category]

Common Values (Plot)

[Figure omitted; legend values follow]
Value | Count | Frequency (%)
1 | 28 | 63.6%
na | 9 | 20.5%
2 | 5 | 11.4%
38 | 1 | 2.3%
(label not preserved) | 1 | 2.3%

대기 (air)
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 11.4%
Missing: 0
Missing (%): 0.0%
Memory size: 484.0 B

Length

Max length: 4
Median length: 1
Mean length: 1.6136364
Min length: 1

Unique

Unique: 2
Unique (%): 4.5%

Sample

1st row: 8
2nd row: (blank)
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
- | 25 | 56.8%
<NA> | 9 | 20.5%
1 | 8 | 18.2%
8 | 1 | 2.3%
(blank) | 1 | 2.3%

Length

[Figure omitted: histogram of lengths of the category]

Common Values (Plot)

[Figure omitted; legend values follow]
Value | Count | Frequency (%)
(label not preserved) | 25 | 58.1%
na | 9 | 20.9%
1 | 8 | 18.6%
8 | 1 | 2.3%

수질 (water quality)
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Memory size: 484.0 B

Length

Max length: 4
Median length: 1
Mean length: 1.6363636
Min length: 1

Unique

Unique: 1
Unique (%): 2.3%

Sample

1st row: 31
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 31 | 70.5%
<NA> | 9 | 20.5%
- | 3 | 6.8%
31 | 1 | 2.3%

Length

[Figure omitted: histogram of lengths of the category]

Common Values (Plot)

[Figure omitted; legend values follow]
Value | Count | Frequency (%)
1 | 31 | 70.5%
na | 9 | 20.5%
(label not preserved) | 3 | 6.8%
31 | 1 | 2.3%

소음진동 (noise and vibration)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 6.8%
Missing: 0
Missing (%): 0.0%
Memory size: 484.0 B

Length

Max length: 4
Median length: 1
Mean length: 1.6136364
Min length: 1

Unique

Unique: 1
Unique (%): 2.3%

Sample

1st row: 0
2nd row: -
3rd row: -
4th row: -
5th row: -

Common Values

Value | Count | Frequency (%)
- | 34 | 77.3%
<NA> | 9 | 20.5%
0 | 1 | 2.3%

Length

[Figure omitted: histogram of lengths of the category]

Common Values (Plot)

[Figure omitted; legend values follow]
Value | Count | Frequency (%)
(label not preserved) | 34 | 77.3%
na | 9 | 20.5%
0 | 1 | 2.3%

Unnamed: 8
Unsupported

MISSING, REJECTED, UNSUPPORTED

Missing: 44
Missing (%): 100.0%
Memory size: 528.0 B

Correlations

[Figure omitted: correlation heatmap]

Variable | 연번 | 업소명 | 영업소 소재지 | 전화번호 | (blank) | 대기 | 수질 | 소음진동
연번 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
업소명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
영업소 소재지 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
전화번호 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
(blank) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.946 | 0.656 | 1.000
대기 | 1.000 | 1.000 | 1.000 | 1.000 | 0.946 | 1.000 | 0.743 | 1.000
수질 | 1.000 | 1.000 | 1.000 | 1.000 | 0.656 | 0.743 | 1.000 | 1.000
소음진동 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

[Figure omitted: correlation heatmap]

Variable | 대기 | (blank) | 소음진동 | 수질
대기 | 1.000 | 0.689 | 0.969 | 0.778
(blank) | 0.689 | 1.000 | 0.969 | 0.670
소음진동 | 0.969 | 0.969 | 1.000 | 0.985
수질 | 0.778 | 0.670 | 0.985 | 1.000

[Figure omitted: correlation heatmap]

Variable | (blank) | 대기 | 수질 | 소음진동
(blank) | 1.000 | 0.689 | 0.670 | 0.969
대기 | 0.689 | 1.000 | 0.778 | 0.969
수질 | 0.670 | 0.778 | 1.000 | 0.985
소음진동 | 0.969 | 0.969 | 0.985 | 1.000
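
The report does not state which association measure produced each matrix. One common choice for a pair of categorical columns is Cramér's V, and the sketch below assumes that measure purely for illustration. It uses the plain (uncorrected) formula, so its output should not be expected to match the figures above, and the name `cramers_v` is illustrative.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    # Pearson chi-square statistic over the full contingency table.
    chi2 = 0.0
    for x, nx in count_x.items():
        for y, ny in count_y.items():
            expected = nx * ny / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(count_x), len(count_y)) - 1
    if k == 0:
        return float("nan")  # undefined when a column has a single level
    return sqrt(chi2 / (n * k))


print(cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"]))  # fully associated
print(cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"]))  # independent
```

A value near 1 means one column almost determines the other. With only 44 rows, 9 of them empty, and a totals row whose values are unique, values close to 1.000 are easy to obtain and should be read with caution.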

Missing values

[Figure omitted] A simple visualization of nullity by column.
[Figure omitted] The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
[Figure omitted] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
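
Nullity correlation can be sketched as the Pearson correlation between the missingness indicators of two columns. The version below is a minimal, dependency-free illustration that assumes missing cells are represented as `None`; the function name `nullity_correlation` and the toy columns are illustrative.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns.

    Returns None when either column is never or always missing, because the
    indicator then has zero variance and the correlation is undefined.
    """
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)


serial = ["1", "2", None, None]
name = ["광진환경㈜", "㈜유성테크", None, None]
empty = [None, None, None, None]

print(nullity_correlation(serial, name))   # missing in exactly the same rows
print(nullity_correlation(serial, empty))  # always-missing column: undefined
```

In this dataset the four text columns are missing in the same 9 rows, which is the pattern the first call illustrates; a column such as Unnamed: 8, missing everywhere, has no defined nullity correlation.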

Sample

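First rows

Index | 연번 | 업소명 | 영업소 소재지 | 전화번호 | (blank) | 대기 | 수질 | 소음진동 | Unnamed: 8
0 | (not preserved) | - | - | - | 38 | 8 | 31 | 0 | <NA>
1 | 1 | 광진환경㈜ | 전주시 덕진구 호반3길 7 (덕진동2가) | 063-254-8150 | 1 | (blank) | 1 | - | <NA>
2 | 2 | ㈜유성테크 | 전주시 완산구 송정중앙로 29 (효자동1가) | 063-222-7968 | 2 | 1 | 1 | - | <NA>
3 | 3 | (유)지구엔비텍 | 전주시 덕진구 비석날로 99 (팔복동2가) | 063-211-8001 | 2 | 1 | 1 | - | <NA>
4 | 4 | (유)호일엔지니어링 | 익산시 춘포면 궁성로 272 | 063-832-5271 | 2 | 1 | 1 | - | <NA>
5 | 5 | (주) 이앤에스테크 | 전주시 완산구 서곡6길 23-4 (효자동3가) | 063-273-5251 | 2 | 1 | 1 | - | <NA>
6 | 6 | ㈜청송이앤티 | 전주시 덕진구 동부대로 727 (우아동3가) | 063-242-4690 | 1 | 1 | - | - | <NA>
7 | 7 | ㈜광산테크 | 익산시 석암로13길 71 (팔봉동) | 063-835-2823 | 1 | 1 | - | - | <NA>
8 | 8 | 바다정수산업㈜ | 전주시 덕진구 원만성로 5 (만성동) | 063-211-4331 | 1 | - | 1 | - | <NA>
9 | 9 | (유)도영종합건설 | 전주시 완산구 메너머2길 21-10 (중화산동2가) | 063-229-5025 | 1 | - | 1 | - | <NA>

Last rows

Index | 연번 | 업소명 | 영업소 소재지 | 전화번호 | (blank) | 대기 | 수질 | 소음진동 | Unnamed: 8
34 | 34 | ㈜보원건설산업 | 익산시 황등면 후정4길 58-36 | 063-858-0850 | 1 | - | 1 | - | <NA>
35 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
36 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
37 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
38 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
39 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
40 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
41 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
42 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
43 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>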

Duplicate rows

Most frequently occurring

Index | 연번 | 업소명 | 영업소 소재지 | 전화번호 | (blank) | 대기 | 수질 | 소음진동 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 9
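
The overview reports 1 duplicate row, while this table shows a single all-empty pattern occurring 9 times. One reading, offered here as an assumption rather than a statement about ydata-profiling internals, is that the overview counts distinct duplicated patterns and the table counts their occurrences. A minimal sketch that yields both numbers, with the illustrative helper `duplicate_groups` and toy two-column rows:

```python
from collections import Counter


def duplicate_groups(rows):
    """Return {row: occurrences} for every row that appears more than once."""
    counts = Counter(tuple(r) for r in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Two distinct records followed by nine entirely empty rows.
rows = [("1", "광진환경㈜"), ("2", "㈜유성테크")] + [(None, None)] * 9
groups = duplicate_groups(rows)

print(len(groups))           # number of distinct duplicated patterns
print(list(groups.values())) # occurrences of each pattern
```

Dropping rows that are empty in every column would remove this duplicate group entirely.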