Overview

Dataset statistics

Number of variables: 5
Number of observations: 21
Missing cells: 4
Missing cells (%): 3.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 993.0 B
Average record size in memory: 47.3 B
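The percentage and per-record figures follow from the raw counts. The short check below is a sketch that reproduces them; the variable names are illustrative, and the rounding to one decimal place is an assumption about how the report formats its output.

```python
# Figures copied from the dataset statistics in this report.
n_variables = 5
n_observations = 21
missing_cells = 4
total_bytes = 993.0

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
avg_record_bytes = total_bytes / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells (%): {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```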

Variable types

Numeric: 1
Text: 3
Categorical: 1

Dataset

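Description: Status of mechanical equipment construction businesses (기계설비공사업), a class of specialty construction business (전문건설업), registered within Jangseong-gun. Four fields are provided: business name, business type, road-name address, and main contact number.
Author: Jangseong-gun, Jeollanam-do (전라남도 장성군)
URL: https://www.data.go.kr/data/15125085/fileData.do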

Alerts

연번 is highly overall correlated with 업종 (High correlation)
업종 is highly overall correlated with 연번 (High correlation)
업종 is highly imbalanced (72.4%) (Imbalance)
연번 has 1 (4.8%) missing values (Missing)
상호 has 1 (4.8%) missing values (Missing)
주소 has 1 (4.8%) missing values (Missing)
전화번호 has 1 (4.8%) missing values (Missing)
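The 72.4% imbalance figure is consistent with an entropy-based score of the form 1 - H / log2(k), where H is the Shannon entropy of the class frequencies and k is the number of classes. The report does not state its formula, so treat the sketch below as an assumption that happens to match the number shown; `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits) of the counts."""
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Class counts for 업종 as listed in this report: 20 rows of one category, 1 "<NA>" row.
score = imbalance_score([20, 1])
print(f"{score:.1%}")
```

A perfectly balanced variable scores 0 under this definition, and the score approaches 1 as a single class dominates.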

Reproduction

Analysis started: 2023-12-12 22:17:18.648080
Analysis finished: 2023-12-12 22:17:19.497902
Duration: 0.85 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
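A report of this kind can be regenerated along the following lines. This is a minimal sketch, not the configuration actually used: the function name `build_report`, the file paths, the report title, and the `cp949` encoding (common for CSV files published on data.go.kr) are all assumptions, and the settings stored in config.json are not reproduced here.

```python
def build_report(csv_path, html_path, encoding="cp949"):
    """Profile a CSV file and write the HTML report to html_path."""
    # Imported inside the function so that defining it does not require
    # the optional third-party packages to be installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the source file is a CSV in the given encoding.
    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Jangseong-gun mechanical equipment construction businesses")
    report.to_file(html_path)
    return report
```

For example, `build_report("businesses.csv", "report.html")` would write the report next to a hypothetical local copy of the dataset.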

Variables

연번
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 20
Distinct (%): 100.0%
Missing: 1
Missing (%): 4.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.5
Minimum: 1
Maximum: 20
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 321.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1.95
Q1: 5.75
Median: 10.5
Q3: 15.25
95-th percentile: 19.05
Maximum: 20
Range: 19
Interquartile range (IQR): 9.5
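These quantiles are consistent with linear interpolation between order statistics. The sketch below assumes the 20 non-missing 연번 values are the integers 1 to 20 (as the minimum, maximum, distinct count, and sum suggest) and uses the standard library's `statistics.quantiles` with `method="inclusive"`; whether the profiler uses exactly this method is an assumption.

```python
import statistics

# Assumption: the non-missing 연번 values are the integers 1..20.
values = list(range(1, 21))

# "inclusive" interpolates linearly between the sorted data points,
# treating the minimum and maximum as the 0th and 100th percentiles.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
ventiles = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = ventiles[0], ventiles[-1]

print("Q1, median, Q3:", q1, median, q3)
print("5th and 95th percentiles:", p5, p95)
print("IQR:", q3 - q1)
print("Range:", max(values) - min(values))
```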

Descriptive statistics

Standard deviation: 5.9160798
Coefficient of variation (CV): 0.56343617
Kurtosis: -1.2
Mean: 10.5
Median Absolute Deviation (MAD): 5
Skewness: 0
Sum: 210
Variance: 35
Monotonicity: Strictly increasing
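The descriptive statistics can be cross-checked with the standard library. The sketch assumes the non-missing values are the integers 1 to 20, that variance and standard deviation use the sample (n - 1) denominator, and that kurtosis is the bias-corrected excess kurtosis; these estimator choices are assumptions that happen to reproduce the figures shown.

```python
import statistics

# Assumption: the non-missing 연번 values are the integers 1..20.
values = list(range(1, 21))
n = len(values)

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)          # sample standard deviation
cv = std / mean                         # coefficient of variation

# Median absolute deviation around the median.
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

# Bias-corrected excess kurtosis built from the 2nd and 4th central moments.
m2 = sum((v - mean) ** 2 for v in values) / n
m4 = sum((v - mean) ** 4 for v in values) / n
g2 = m4 / m2**2 - 3
kurtosis = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)

print(total, mean, variance, round(std, 7), round(cv, 8), mad, round(kurtosis, 4))
```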
Histogram with fixed size bins (bins=20)
Common values
Value | Count | Frequency (%)
1 | 1 | 4.8%
12 | 1 | 4.8%
20 | 1 | 4.8%
19 | 1 | 4.8%
18 | 1 | 4.8%
17 | 1 | 4.8%
16 | 1 | 4.8%
15 | 1 | 4.8%
14 | 1 | 4.8%
13 | 1 | 4.8%
Other values (10) | 10 | 47.6%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 4.8%
2 | 1 | 4.8%
3 | 1 | 4.8%
4 | 1 | 4.8%
5 | 1 | 4.8%
6 | 1 | 4.8%
7 | 1 | 4.8%
8 | 1 | 4.8%
9 | 1 | 4.8%
10 | 1 | 4.8%

Maximum 10 values
Value | Count | Frequency (%)
20 | 1 | 4.8%
19 | 1 | 4.8%
18 | 1 | 4.8%
17 | 1 | 4.8%
16 | 1 | 4.8%
15 | 1 | 4.8%
14 | 1 | 4.8%
13 | 1 | 4.8%
12 | 1 | 4.8%
11 | 1 | 4.8%

상호
Text

MISSING 

Distinct: 20
Distinct (%): 100.0%
Missing: 1
Missing (%): 4.8%
Memory size: 300.0 B

Length

Max length: 15
Median length: 10.5
Mean length: 8.55
Min length: 5

Characters and Unicode

Total characters: 171
Distinct characters: 74
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 20
Unique (%): 100.0%

Sample

1st row: (주)가온테크
2nd row: (주)강호이엔지
3rd row: (주)동명산업개발
4th row: (주)동천기공
5th row: (주)바우만테크
Value | Count | Frequency (%)
주)가온테크 | 1 | 5.0%
주)강호이엔지 | 1 | 5.0%
태금기계(주 | 1 | 5.0%
용호금속(주 | 1 | 5.0%
엘씨안전연구소(주 | 1 | 5.0%
에이치티씨시스템(주 | 1 | 5.0%
성호그린테크(주 | 1 | 5.0%
농업회사법인주식회사동광이엔지 | 1 | 5.0%
광주데코임포트(주 | 1 | 5.0%
공명환경기계(주 | 1 | 5.0%
Other values (10) | 10 | 50.0%

Most occurring characters

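Value | Count | Frequency (%)
(glyph lost) | 21 | 12.3%
( | 19 | 11.1%
) | 19 | 11.1%
(glyph lost) | 9 | 5.3%
(glyph lost) | 4 | 2.3%
(glyph lost) | 4 | 2.3%
(glyph lost) | 4 | 2.3%
(glyph lost) | 3 | 1.8%
(glyph lost) | 3 | 1.8%
(glyph lost) | 3 | 1.8%
Other values (64) | 82 | 48.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 133 | 77.8%
Open Punctuation | 19 | 11.1%
Close Punctuation | 19 | 11.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(glyph lost) | 21 | 15.8%
(glyph lost) | 9 | 6.8%
(glyph lost) | 4 | 3.0%
(glyph lost) | 4 | 3.0%
(glyph lost) | 4 | 3.0%
(glyph lost) | 3 | 2.3%
(glyph lost) | 3 | 2.3%
(glyph lost) | 3 | 2.3%
(glyph lost) | 3 | 2.3%
(glyph lost) | 3 | 2.3%
Other values (62) | 76 | 57.1%
Open Punctuation
Value | Count | Frequency (%)
( | 19 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 19 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 133 | 77.8%
Common | 38 | 22.2%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(glyph lost) | 21 | 15.8%
(glyph lost) | 9 | 6.8%
(glyph lost) | 4 | 3.0%
(glyph lost) | 4 | 3.0%
(glyph lost) | 4 | 3.0%
(glyph lost) | 3 | 2.3%
(glyph lost) | 3 | 2.3%
(glyph lost) | 3 | 2.3%
(glyph lost) | 3 | 2.3%
(glyph lost) | 3 | 2.3%
Other values (62) | 76 | 57.1%
Common
Value | Count | Frequency (%)
( | 19 | 50.0%
) | 19 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 133 | 77.8%
ASCII | 38 | 22.2%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(glyph lost) | 21 | 15.8%
(glyph lost) | 9 | 6.8%
(glyph lost) | 4 | 3.0%
(glyph lost) | 4 | 3.0%
(glyph lost) | 4 | 3.0%
(glyph lost) | 3 | 2.3%
(glyph lost) | 3 | 2.3%
(glyph lost) | 3 | 2.3%
(glyph lost) | 3 | 2.3%
(glyph lost) | 3 | 2.3%
Other values (62) | 76 | 57.1%
ASCII
Value | Count | Frequency (%)
( | 19 | 50.0%
) | 19 | 50.0%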

업종
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 9.5%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B
기계설비ㆍ가스공사업: 20
<NA>: 1

Length

Max length: 10
Median length: 10
Mean length: 9.7142857
Min length: 4

Unique

Unique: 1
Unique (%): 4.8%

Sample

1st row: 기계설비ㆍ가스공사업
2nd row: 기계설비ㆍ가스공사업
3rd row: 기계설비ㆍ가스공사업
4th row: 기계설비ㆍ가스공사업
5th row: 기계설비ㆍ가스공사업

Common Values

Value | Count | Frequency (%)
기계설비ㆍ가스공사업 | 20 | 95.2%
<NA> | 1 | 4.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
기계설비ㆍ가스공사업 | 20 | 95.2%
na | 1 | 4.8%

주소
Text

MISSING 

Distinct: 20
Distinct (%): 100.0%
Missing: 1
Missing (%): 4.8%
Memory size: 300.0 B

Length

Max length: 39
Median length: 32.5
Mean length: 23.45
Min length: 19

Characters and Unicode

Total characters: 469
Distinct characters: 67
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 20
Unique (%): 100.0%

Sample

1st row: 전라남도 장성군 삼계면 사창로 34
2nd row: 전라남도 장성군 남면 나노산단3로 77
3rd row: 전라남도 장성군 진원면 노사로 502
4th row: 전라남도 장성군 장성읍 영천로 107-1
5th row: 전라남도 장성군 동화면 전자농공단지2길 43
Value | Count | Frequency (%)
전라남도 | 20 | 18.9%
장성군 | 20 | 18.9%
남면 | 8 | 7.5%
동화면 | 3 | 2.8%
삼계면 | 3 | 2.8%
강변로 | 2 | 1.9%
황룡면 | 2 | 1.9%
283-5 | 2 | 1.9%
황토로 | 2 | 1.9%
나노산단로 | 2 | 1.9%
Other values (38) | 42 | 39.6%

Most occurring characters

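Value | Count | Frequency (%)
(space) | 86 | 18.3%
(glyph lost) | 28 | 6.0%
(glyph lost) | 22 | 4.7%
(glyph lost) | 22 | 4.7%
(glyph lost) | 21 | 4.5%
(glyph lost) | 20 | 4.3%
(glyph lost) | 20 | 4.3%
(glyph lost) | 20 | 4.3%
(glyph lost) | 19 | 4.1%
(glyph lost) | 17 | 3.6%
Other values (57) | 194 | 41.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 294 | 62.7%
Space Separator | 86 | 18.3%
Decimal Number | 77 | 16.4%
Dash Punctuation | 7 | 1.5%
Other Punctuation | 3 | 0.6%
Close Punctuation | 1 | 0.2%
Open Punctuation | 1 | 0.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(glyph lost) | 28 | 9.5%
(glyph lost) | 22 | 7.5%
(glyph lost) | 22 | 7.5%
(glyph lost) | 21 | 7.1%
(glyph lost) | 20 | 6.8%
(glyph lost) | 20 | 6.8%
(glyph lost) | 20 | 6.8%
(glyph lost) | 19 | 6.5%
(glyph lost) | 17 | 5.8%
(glyph lost) | 10 | 3.4%
Other values (42) | 95 | 32.3%
Decimal Number
Value | Count | Frequency (%)
2 | 14 | 18.2%
1 | 13 | 16.9%
5 | 10 | 13.0%
4 | 9 | 11.7%
0 | 6 | 7.8%
7 | 6 | 7.8%
3 | 6 | 7.8%
6 | 5 | 6.5%
8 | 5 | 6.5%
9 | 3 | 3.9%
Space Separator
Value | Count | Frequency (%)
(space) | 86 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 7 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
, | 3 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 1 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 294 | 62.7%
Common | 175 | 37.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(glyph lost) | 28 | 9.5%
(glyph lost) | 22 | 7.5%
(glyph lost) | 22 | 7.5%
(glyph lost) | 21 | 7.1%
(glyph lost) | 20 | 6.8%
(glyph lost) | 20 | 6.8%
(glyph lost) | 20 | 6.8%
(glyph lost) | 19 | 6.5%
(glyph lost) | 17 | 5.8%
(glyph lost) | 10 | 3.4%
Other values (42) | 95 | 32.3%
Common
Value | Count | Frequency (%)
(space) | 86 | 49.1%
2 | 14 | 8.0%
1 | 13 | 7.4%
5 | 10 | 5.7%
4 | 9 | 5.1%
- | 7 | 4.0%
0 | 6 | 3.4%
7 | 6 | 3.4%
3 | 6 | 3.4%
6 | 5 | 2.9%
Other values (5) | 13 | 7.4%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 294 | 62.7%
ASCII | 175 | 37.3%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 86 | 49.1%
2 | 14 | 8.0%
1 | 13 | 7.4%
5 | 10 | 5.7%
4 | 9 | 5.1%
- | 7 | 4.0%
0 | 6 | 3.4%
7 | 6 | 3.4%
3 | 6 | 3.4%
6 | 5 | 2.9%
Other values (5) | 13 | 7.4%
Hangul
Value | Count | Frequency (%)
(glyph lost) | 28 | 9.5%
(glyph lost) | 22 | 7.5%
(glyph lost) | 22 | 7.5%
(glyph lost) | 21 | 7.1%
(glyph lost) | 20 | 6.8%
(glyph lost) | 20 | 6.8%
(glyph lost) | 20 | 6.8%
(glyph lost) | 19 | 6.5%
(glyph lost) | 17 | 5.8%
(glyph lost) | 10 | 3.4%
Other values (42) | 95 | 32.3%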

전화번호
Text

MISSING 

Distinct: 20
Distinct (%): 100.0%
Missing: 1
Missing (%): 4.8%
Memory size: 300.0 B

Length

Max length: 13
Median length: 12
Mean length: 12.15
Min length: 12

Characters and Unicode

Total characters: 243
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 20
Unique (%): 100.0%

Sample

1st row: 061-393-0479
2nd row: 062-971-7742
3rd row: 061-392-6166
4th row: 061-452-5201
5th row: 061-394-1185
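The length range (12 to 13 characters) and the sample rows suggest a dash-separated layout of digits. The sketch below checks values against one plausible pattern; the regular expression and the helper name `is_valid_phone` are assumptions for illustration, not a rule stated by the dataset publisher.

```python
import re

# Assumed layout: a 2-3 digit prefix starting with 0, a 3-4 digit exchange,
# and a 4 digit line number, separated by dashes.
PHONE_RE = re.compile(r"0\d{1,2}-\d{3,4}-\d{4}")


def is_valid_phone(value):
    """Return True if the whole string matches the assumed phone layout."""
    return PHONE_RE.fullmatch(value) is not None


# Sample rows copied from this report.
samples = ["061-393-0479", "062-971-7742", "061-392-6166", "061-452-5201", "061-394-1185"]
invalid = [s for s in samples if not is_valid_phone(s)]
print("invalid samples:", invalid)
```

Anchoring with `fullmatch` rejects values that carry extra text before or after the number, which keeps a stray extension or note from passing silently.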
Value | Count | Frequency (%)
061-393-0479 | 1 | 5.0%
062-971-7742 | 1 | 5.0%
061-392-3643 | 1 | 5.0%
061-336-0604 | 1 | 5.0%
061-395-9511 | 1 | 5.0%
062-953-5072 | 1 | 5.0%
062-674-0939 | 1 | 5.0%
061-392-2180 | 1 | 5.0%
061-395-0636 | 1 | 5.0%
061-878-7125 | 1 | 5.0%
Other values (10) | 10 | 50.0%

Most occurring characters

Value | Count | Frequency (%)
- | 40 | 16.5%
0 | 37 | 15.2%
6 | 33 | 13.6%
1 | 28 | 11.5%
3 | 20 | 8.2%
9 | 17 | 7.0%
7 | 17 | 7.0%
2 | 16 | 6.6%
4 | 15 | 6.2%
5 | 14 | 5.8%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 203 | 83.5%
Dash Punctuation | 40 | 16.5%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 37 | 18.2%
6 | 33 | 16.3%
1 | 28 | 13.8%
3 | 20 | 9.9%
9 | 17 | 8.4%
7 | 17 | 8.4%
2 | 16 | 7.9%
4 | 15 | 7.4%
5 | 14 | 6.9%
8 | 6 | 3.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 40 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 243 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 40 | 16.5%
0 | 37 | 15.2%
6 | 33 | 13.6%
1 | 28 | 11.5%
3 | 20 | 8.2%
9 | 17 | 7.0%
7 | 17 | 7.0%
2 | 16 | 6.6%
4 | 15 | 6.2%
5 | 14 | 5.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 243 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 40 | 16.5%
0 | 37 | 15.2%
6 | 33 | 13.6%
1 | 28 | 11.5%
3 | 20 | 8.2%
9 | 17 | 7.0%
7 | 17 | 7.0%
2 | 16 | 6.6%
4 | 15 | 6.2%
5 | 14 | 5.8%

Interactions

[Figure: interactions plot]

Correlations

Variable | 연번 | 상호 | 주소 | 전화번호
연번 | 1.000 | 1.000 | 1.000 | 1.000
상호 | 1.000 | 1.000 | 1.000 | 1.000
주소 | 1.000 | 1.000 | 1.000 | 1.000
전화번호 | 1.000 | 1.000 | 1.000 | 1.000

Variable | 연번 | 업종
연번 | 1.000 | 1.000
업종 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows
Index | 연번 | 상호 | 업종 | 주소 | 전화번호
0 | 1 | (주)가온테크 | 기계설비ㆍ가스공사업 | 전라남도 장성군 삼계면 사창로 34 | 061-393-0479
1 | 2 | (주)강호이엔지 | 기계설비ㆍ가스공사업 | 전라남도 장성군 남면 나노산단3로 77 | 062-971-7742
2 | 3 | (주)동명산업개발 | 기계설비ㆍ가스공사업 | 전라남도 장성군 진원면 노사로 502 | 061-392-6166
3 | 4 | (주)동천기공 | 기계설비ㆍ가스공사업 | 전라남도 장성군 장성읍 영천로 107-1 | 061-452-5201
4 | 5 | (주)바우만테크 | 기계설비ㆍ가스공사업 | 전라남도 장성군 동화면 전자농공단지2길 43 | 061-394-1185
5 | 6 | (주)상무냉동이앤지 | 기계설비ㆍ가스공사업 | 전라남도 장성군 남면 나노산단로 112 | 070-4866-0757
6 | 7 | (주)온빛 | 기계설비ㆍ가스공사업 | 전라남도 장성군 삼계면 영장로 1679 | 061-394-1714
7 | 8 | (주)우리아이텍 | 기계설비ㆍ가스공사업 | 전라남도 장성군 남면 황토로 283-5, 주건축물제2동 | 062-943-9250
8 | 9 | (주)쿨테이너 | 기계설비ㆍ가스공사업 | 전라남도 장성군 북일면 신흥로 416 | 070-7176-1460
9 | 10 | (주)티와이이엔지 | 기계설비ㆍ가스공사업 | 전라남도 장성군 황룡면 강변로 452-16 | 061-395-4426

Last rows
Index | 연번 | 상호 | 업종 | 주소 | 전화번호
11 | 12 | 공명환경기계(주) | 기계설비ㆍ가스공사업 | 전라남도 장성군 남면 나노산단1로 85 | 061-878-7125
12 | 13 | 광주데코임포트(주) | 기계설비ㆍ가스공사업 | 전라남도 장성군 동화면 전자농공단지2길 78 | 061-395-0636
13 | 14 | 농업회사법인주식회사동광이엔지 | 기계설비ㆍ가스공사업 | 전라남도 장성군 동화면 농공단지길 49 | 061-392-2180
14 | 15 | 성호그린테크(주) | 기계설비ㆍ가스공사업 | 전라남도 장성군 진원면 나노산단2로 81 | 062-674-0939
15 | 16 | 에이치티씨시스템(주) | 기계설비ㆍ가스공사업 | 전라남도 장성군 남면 나노산단로 100 | 062-953-5072
16 | 17 | 엘씨안전연구소(주) | 기계설비ㆍ가스공사업 | 전라남도 장성군 남면 황토로 283-5 | 061-395-9511
17 | 18 | 용호금속(주) | 기계설비ㆍ가스공사업 | 전라남도 장성군 삼계면 사창로 64-72, 상가동 104호(금광아파트) | 061-336-0604
18 | 19 | 태금기계(주) | 기계설비ㆍ가스공사업 | 전라남도 장성군 황룡면 강변로 451-36 | 061-392-3643
19 | 20 | 해표산업(주) | 기계설비ㆍ가스공사업 | 전라남도 장성군 남면 나노산단5로 45, 5동 2층 202-1호 | 070-4035-2673
20 | <NA> | <NA> | <NA> | <NA> | <NA>