Overview

Dataset statistics

Number of variables: 5
Number of observations: 95
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.8 KiB
Average record size in memory: 41.4 B
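
The two memory figures above agree with each other, which is a quick sanity check on the table. A minimal sketch of the arithmetic, assuming the average record size is simply total memory divided by the number of observations and that both figures are rounded to one decimal (the report does not state the profiler's exact accounting):

```python
# Figures copied from the Dataset statistics table.
n_observations = 95
avg_record_bytes = 41.4  # reported value, already rounded

# Back out the approximate total and express it in KiB (1 KiB = 1024 B).
total_bytes = n_observations * avg_record_bytes
total_kib = total_bytes / 1024

print(f"{total_bytes:.0f} B is about {total_kib:.1f} KiB")
```

The result rounds to the reported 3.8 KiB.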

Variable types

Text: 3
Categorical: 2

Dataset

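Description: Status of general construction businesses in Siheung-si, Gyeonggi-do. The dataset lists each business's company name (업체명), business type (업종), business location as a road-name address (도로명주소), and phone number (전화번호).
Author: Siheung-si, Gyeonggi-do (경기도 시흥시)
URL: https://www.data.go.kr/data/15048917/fileData.do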

Alerts

데이터기준일자 has constant value "2022-09-08" (Constant)
업종 is highly imbalanced (70.2%) (Imbalance)
업체명 has unique values (Unique)
도로명주소 has unique values (Unique)
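
The 70.2% imbalance figure for 업종 can be reproduced from the category counts listed under Common Values for that variable. The sketch below assumes the score is one minus the Shannon entropy of the class distribution, normalized by log2 of the number of classes. That formula is an assumption about how the profiler scores imbalance, not something the report states; it is used here because it reproduces the reported value.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for uniform classes, close to 1 for one dominant class."""
    total = sum(counts)
    k = len(counts)
    if k <= 1:
        return 0.0  # a single class has no distribution to compare
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# 업종 counts from the Common Values table: 95 rows across 7 classes.
counts = [83, 5, 2, 2, 1, 1, 1]
score = imbalance_score(counts)
print(f"{score:.1%}")
```

Under this assumed formula the dominant class (건축공사업, 83 of 95 rows) drives the score.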

Reproduction

Analysis started: 2023-12-12 09:18:14.112559
Analysis finished: 2023-12-12 09:18:14.591986
Duration: 0.48 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
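
A report like this one can be regenerated with a few lines of Python. The following is a minimal sketch, not the exact procedure used for this report: the helper name `build_report`, the file names, the report title, and the default encoding are all assumptions. It relies only on `pandas.read_csv`, `ProfileReport(df, title=...)`, and `ProfileReport.to_file`.

```python
def build_report(csv_path, html_path="report.html", encoding="utf-8"):
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so the helper can be defined without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    # The title is an illustrative choice, not taken from the original report.
    profile = ProfileReport(df, title="Siheung-si general construction businesses")
    profile.to_file(html_path)
    return profile
```

For example, `build_report("siheung_construction.csv", encoding="cp949")`, where both the file name and the `cp949` encoding are guesses about how the source CSV is published.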

Variables

업체명
Text

UNIQUE 

Distinct: 95
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 892.0 B

Length

Max length: 12
Median length: 11
Mean length: 8.6421053
Min length: 5

Characters and Unicode

Total characters: 821
Distinct characters: 128
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
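
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using the five 업체명 sample rows from this report. The two-letter codes it returns correspond to the category names used in the report: `Lo` is Other Letter, `Ps` is Open Punctuation, and `Pe` is Close Punctuation. The helper name `category_counts` is introduced here for illustration.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over every character of every value."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)

# The five sample rows shown for 업체명 in this report.
sample = ["(주)가산씨에이치오아이", "(주)고려건설", "(주)공간건설", "(주)공명", "(주)다온건설"]
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```

Hangul syllables fall under `Lo`, which is why Other Letter dominates the category table for this variable.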

Unique

Unique: 95
Unique (%): 100.0%

Sample

1st row: (주)가산씨에이치오아이
2nd row: (주)고려건설
3rd row: (주)공간건설
4th row: (주)공명
5th row: (주)다온건설
Value | Count | Frequency (%)
주)가산씨에이치오아이 | 1 | 1.1%
에스디종합건설(주 | 1 | 1.1%
신원종합건설(주 | 1 | 1.1%
신아종합건설(주 | 1 | 1.1%
성훈종합건설(주 | 1 | 1.1%
성지종합건설(주 | 1 | 1.1%
성원종합건설(주 | 1 | 1.1%
서울종합건설(주 | 1 | 1.1%
부영토건(주 | 1 | 1.1%
두우종합조경(주 | 1 | 1.1%
Other values (85) | 85 | 89.5%

Most occurring characters

ValueCountFrequency (%)
97
 
11.8%
( 92
 
11.2%
) 92
 
11.2%
71
 
8.6%
71
 
8.6%
50
 
6.1%
50
 
6.1%
21
 
2.6%
11
 
1.3%
11
 
1.3%
Other values (118) 255
31.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 637 | 77.6%
Open Punctuation | 92 | 11.2%
Close Punctuation | 92 | 11.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
97
15.2%
71
 
11.1%
71
 
11.1%
50
 
7.8%
50
 
7.8%
21
 
3.3%
11
 
1.7%
11
 
1.7%
10
 
1.6%
9
 
1.4%
Other values (116) 236
37.0%
Open Punctuation
Value | Count | Frequency (%)
( | 92 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 92 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 637 | 77.6%
Common | 184 | 22.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
97
15.2%
71
 
11.1%
71
 
11.1%
50
 
7.8%
50
 
7.8%
21
 
3.3%
11
 
1.7%
11
 
1.7%
10
 
1.6%
9
 
1.4%
Other values (116) 236
37.0%
Common
Value | Count | Frequency (%)
( | 92 | 50.0%
) | 92 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 637 | 77.6%
ASCII | 184 | 22.4%

Most frequent character per block

Hangul
ValueCountFrequency (%)
97
15.2%
71
 
11.1%
71
 
11.1%
50
 
7.8%
50
 
7.8%
21
 
3.3%
11
 
1.7%
11
 
1.7%
10
 
1.6%
9
 
1.4%
Other values (116) 236
37.0%
ASCII
Value | Count | Frequency (%)
( | 92 | 50.0%
) | 92 | 50.0%

업종
Categorical

IMBALANCE 

Distinct: 7
Distinct (%): 7.4%
Missing: 0
Missing (%): 0.0%
Memory size: 892.0 B

Value | Count
건축공사업 | 83
토목공사업 | 5
조경공사업 | 2
토목건축공사업 | 2
토목공사업, 건축공사업(영업정지) | 1
Other values (2) | 2

Length

Max length: 23
Median length: 5
Mean length: 5.4421053
Min length: 5

Unique

Unique: 3
Unique (%): 3.2%

Sample

1st row: 건축공사업
2nd row: 건축공사업
3rd row: 건축공사업
4th row: 건축공사업
5th row: 건축공사업

Common Values

Value | Count | Frequency (%)
건축공사업 | 83 | 87.4%
토목공사업 | 5 | 5.3%
조경공사업 | 2 | 2.1%
토목건축공사업 | 2 | 2.1%
토목공사업, 건축공사업(영업정지) | 1 | 1.1%
조경공사업, 토목건축공사업, 산업설비공사업 | 1 | 1.1%
토목공사업, 건축공사업 | 1 | 1.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
건축공사업 | 84 | 84.8%
토목공사업 | 7 | 7.1%
조경공사업 | 3 | 3.0%
토목건축공사업 | 3 | 3.0%
건축공사업(영업정지 | 1 | 1.0%
산업설비공사업 | 1 | 1.0%

도로명주소
Text

UNIQUE 

Distinct: 95
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 892.0 B

Length

Max length: 56
Median length: 41
Mean length: 33.221053
Min length: 20

Characters and Unicode

Total characters: 3156
Distinct characters: 176
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 95
Unique (%): 100.0%

Sample

1st row: 경기도 시흥시 월곶중앙로 58 4층 402-1호(월곶동, 월곶중앙프라자)
2nd row: 경기도 시흥시 수인로3395번길 11 2층 (신천동)
3rd row: 경기도 시흥시 신천3길 23 3층 (신천동)
4th row: 경기도 시흥시 오이도로 21 , 103호(정왕동, 비-랜드랜드프라자)
5th row: 경기도 시흥시 봉우재로209번길 70-2 (정왕동)
Value | Count | Frequency (%)
경기도 | 95 | 16.2%
시흥시 | 95 | 16.2%
(blank) | 30 | 5.1%
신천동 | 7 | 1.2%
시청로68번길 | 7 | 1.2%
3층 | 6 | 1.0%
정왕동 | 6 | 1.0%
2층 | 5 | 0.9%
은행동 | 5 | 0.9%
45 | 4 | 0.7%
Other values (271) | 328 | 55.8%

Most occurring characters

ValueCountFrequency (%)
493
 
15.6%
204
 
6.5%
, 121
 
3.8%
1 115
 
3.6%
2 104
 
3.3%
103
 
3.3%
100
 
3.2%
99
 
3.1%
99
 
3.1%
( 97
 
3.1%
Other values (166) 1621
51.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1719 | 54.5%
Decimal Number | 602 | 19.1%
Space Separator | 493 | 15.6%
Other Punctuation | 123 | 3.9%
Open Punctuation | 97 | 3.1%
Close Punctuation | 97 | 3.1%
Dash Punctuation | 25 | 0.8%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
204
 
11.9%
103
 
6.0%
100
 
5.8%
99
 
5.8%
99
 
5.8%
96
 
5.6%
85
 
4.9%
71
 
4.1%
55
 
3.2%
38
 
2.2%
Other values (150) 769
44.7%
Decimal Number
Value | Count | Frequency (%)
1 | 115 | 19.1%
2 | 104 | 17.3%
3 | 79 | 13.1%
0 | 77 | 12.8%
6 | 48 | 8.0%
4 | 46 | 7.6%
5 | 45 | 7.5%
9 | 31 | 5.1%
7 | 29 | 4.8%
8 | 28 | 4.7%
Other Punctuation
ValueCountFrequency (%)
, 121
98.4%
2
 
1.6%
Space Separator
Value | Count | Frequency (%)
(space) | 493 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 97 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 97 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 25 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1719 | 54.5%
Common | 1437 | 45.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
204
 
11.9%
103
 
6.0%
100
 
5.8%
99
 
5.8%
99
 
5.8%
96
 
5.6%
85
 
4.9%
71
 
4.1%
55
 
3.2%
38
 
2.2%
Other values (150) 769
44.7%
Common
Value | Count | Frequency (%)
(space) | 493 | 34.3%
, | 121 | 8.4%
1 | 115 | 8.0%
2 | 104 | 7.2%
( | 97 | 6.8%
) | 97 | 6.8%
3 | 79 | 5.5%
0 | 77 | 5.4%
6 | 48 | 3.3%
4 | 46 | 3.2%
Other values (6) | 160 | 11.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1719 | 54.5%
ASCII | 1435 | 45.5%
None | 2 | 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 493 | 34.4%
, | 121 | 8.4%
1 | 115 | 8.0%
2 | 104 | 7.2%
( | 97 | 6.8%
) | 97 | 6.8%
3 | 79 | 5.5%
0 | 77 | 5.4%
6 | 48 | 3.3%
4 | 46 | 3.2%
Other values (5) | 158 | 11.0%
Hangul
ValueCountFrequency (%)
204
 
11.9%
103
 
6.0%
100
 
5.8%
99
 
5.8%
99
 
5.8%
96
 
5.6%
85
 
4.9%
71
 
4.1%
55
 
3.2%
38
 
2.2%
Other values (150) 769
44.7%
None
ValueCountFrequency (%)
2
100.0%

전화번호
Text

Distinct: 93
Distinct (%): 97.9%
Missing: 0
Missing (%): 0.0%
Memory size: 892.0 B

Length

Max length: 13
Median length: 12
Mean length: 12.042105
Min length: 11

Characters and Unicode

Total characters: 1144
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 91
Unique (%): 95.8%
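
The report separates Distinct (how many different values occur) from Unique (how many values occur exactly once). For this variable the two figures are consistent: 91 numbers appear once and 2 numbers appear twice, giving 93 distinct values over 95 rows. A minimal sketch of that bookkeeping on a toy column follows; the helper name `distinct_and_unique` is introduced here for illustration.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique): different values vs. values seen exactly once."""
    freq = Counter(values)
    distinct = len(freq)
    unique = sum(1 for n in freq.values() if n == 1)
    return distinct, unique

# Toy column: one number repeated, two numbers seen once.
phones = ["031-451-0048", "031-451-0048", "031-317-1028", "031-314-8939"]
print(distinct_and_unique(phones))

# Same bookkeeping at the report's scale: 91 singletons plus 2 values seen twice.
assert 91 + 2 == 93 and 91 + 2 * 2 == 95
```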

Sample

1st row: 031-317-1028
2nd row: 031-314-8939
3rd row: 054-471-8601
4th row: 070-8150-2342
5th row: 031-404-1327
Value | Count | Frequency (%)
031-451-0048 | 2 | 2.1%
031-318-1860 | 2 | 2.1%
031-434-8196 | 1 | 1.1%
031-317-1028 | 1 | 1.1%
031-317-1110 | 1 | 1.1%
031-503-0462 | 1 | 1.1%
031-431-3001 | 1 | 1.1%
031-435-9859 | 1 | 1.1%
031-832-0002 | 1 | 1.1%
031-498-8886 | 1 | 1.1%
Other values (83) | 83 | 87.4%

Most occurring characters

Value | Count | Frequency (%)
- | 190 | 16.6%
1 | 189 | 16.5%
3 | 184 | 16.1%
0 | 181 | 15.8%
4 | 86 | 7.5%
8 | 66 | 5.8%
2 | 60 | 5.2%
7 | 52 | 4.5%
9 | 50 | 4.4%
5 | 47 | 4.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 954 | 83.4%
Dash Punctuation | 190 | 16.6%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 189 | 19.8%
3 | 184 | 19.3%
0 | 181 | 19.0%
4 | 86 | 9.0%
8 | 66 | 6.9%
2 | 60 | 6.3%
7 | 52 | 5.5%
9 | 50 | 5.2%
5 | 47 | 4.9%
6 | 39 | 4.1%
Dash Punctuation
Value | Count | Frequency (%)
- | 190 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1144 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 190 | 16.6%
1 | 189 | 16.5%
3 | 184 | 16.1%
0 | 181 | 15.8%
4 | 86 | 7.5%
8 | 66 | 5.8%
2 | 60 | 5.2%
7 | 52 | 4.5%
9 | 50 | 4.4%
5 | 47 | 4.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1144 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 190 | 16.6%
1 | 189 | 16.5%
3 | 184 | 16.1%
0 | 181 | 15.8%
4 | 86 | 7.5%
8 | 66 | 5.8%
2 | 60 | 5.2%
7 | 52 | 4.5%
9 | 50 | 4.4%
5 | 47 | 4.1%

데이터기준일자
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 892.0 B

Value | Count
2022-09-08 | 95

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022-09-08
2nd row: 2022-09-08
3rd row: 2022-09-08
4th row: 2022-09-08
5th row: 2022-09-08

Common Values

Value | Count | Frequency (%)
2022-09-08 | 95 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2022-09-08 | 95 | 100.0%

Correlations

Variable | 업체명 | 업종 | 도로명주소 | 전화번호
업체명 | 1.000 | 1.000 | 1.000 | 1.000
업종 | 1.000 | 1.000 | 1.000 | 0.000
도로명주소 | 1.000 | 1.000 | 1.000 | 1.000
전화번호 | 1.000 | 0.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
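
The two nullity views can be approximated in plain Python. The sketch below counts missing cells per column and renders a text version of a nullity matrix. It uses two complete rows taken from this report plus one hypothetical row with a gap, since the dataset itself has no missing cells. Treating empty strings as missing is an assumption, and the helper names are introduced here for illustration.

```python
MISSING = (None, "")  # assumption: empty strings count as missing

def missing_per_column(rows, columns):
    """Count missing cells for each column."""
    return {c: sum(1 for r in rows if r.get(c) in MISSING) for c in columns}

def nullity_matrix(rows, columns):
    """One text line per row: '#' marks a present cell, '.' a missing one."""
    return ["".join("." if r.get(c) in MISSING else "#" for c in columns) for r in rows]

columns = ["업체명", "업종", "전화번호"]
rows = [
    {"업체명": "(주)고려건설", "업종": "건축공사업", "전화번호": "031-314-8939"},
    {"업체명": "(주)공간건설", "업종": "건축공사업", "전화번호": "054-471-8601"},
    # Hypothetical row with a gap, added only to show what a missing cell looks like.
    {"업체명": "example", "업종": "건축공사업", "전화번호": None},
]

print(missing_per_column(rows, columns))
print("\n".join(nullity_matrix(rows, columns)))
```

For the real dataset every count would be 0 and every matrix line would be fully filled, matching the Missing cells figure of 0 in the overview.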

Sample

First rows

Row | 업체명 | 업종 | 도로명주소 | 전화번호 | 데이터기준일자
0 | (주)가산씨에이치오아이 | 건축공사업 | 경기도 시흥시 월곶중앙로 58 4층 402-1호(월곶동, 월곶중앙프라자) | 031-317-1028 | 2022-09-08
1 | (주)고려건설 | 건축공사업 | 경기도 시흥시 수인로3395번길 11 2층 (신천동) | 031-314-8939 | 2022-09-08
2 | (주)공간건설 | 건축공사업 | 경기도 시흥시 신천3길 23 3층 (신천동) | 054-471-8601 | 2022-09-08
3 | (주)공명 | 건축공사업 | 경기도 시흥시 오이도로 21 , 103호(정왕동, 비-랜드랜드프라자) | 070-8150-2342 | 2022-09-08
4 | (주)다온건설 | 건축공사업 | 경기도 시흥시 봉우재로209번길 70-2 (정왕동) | 031-404-1327 | 2022-09-08
5 | (주)다운종합건설 | 건축공사업 | 경기도 시흥시 승지로60번길 17, 708호(능곡동, 현대프라자) | 031-498-0006 | 2022-09-08
6 | (주)더블유종합건설 | 건축공사업 | 경기도 시흥시 거북섬둘레길 34, 112호(정왕동, 더오션2) | 031-438-8822 | 2022-09-08
7 | (주)디투 | 건축공사업 | 경기도 시흥시 엠티브이북로 127 , 5층(정왕동) | 031-492-2145 | 2022-09-08
8 | (주)라스코종합건설 | 건축공사업 | 경기도 시흥시 은계로 260 , 5층 504호(은행동, 우평라비엔빌딩) | 031-216-0955 | 2022-09-08
9 | (주)미강씨앤씨 | 건축공사업 | 경기도 시흥시 배미골길 10-9 , 1층 (목감동) | 031-401-1114 | 2022-09-08

Last rows

Row | 업체명 | 업종 | 도로명주소 | 전화번호 | 데이터기준일자
85 | 지아이종합건설(주) | 건축공사업 | 경기도 시흥시 매화산단2길 27 , 2층 201호 (도창동) | 031-318-6555 | 2022-09-08
86 | 진석종합건설(주) | 건축공사업 | 경기도 시흥시 시청로68번길 25, 301호(장현동, 삼하빌딩) | 031-404-6777 | 2022-09-08
87 | 태진건설산업(주) | 건축공사업 | 경기도 시흥시 비둘기공원7길 45 , 비동 302호 (대야동, 트윈프라자) | 031-314-6263 | 2022-09-08
88 | 태풍종합건설(주) | 건축공사업 | 경기도 시흥시 목감남서로 9-29 ,701호,702호,703호(조남동, 센타프라자2) | 031-398-8242 | 2022-09-08
89 | 티와이건설(주) | 건축공사업 | 경기도 시흥시 공단1대로 13 907호 (정왕동) | 031-497-0471 | 2022-09-08
90 | 팀텍주식회사 | 건축공사업 | 경기도 시흥시 공단1대로196번길 12 (정왕동) | 031-495-8821 | 2022-09-08
91 | 프라임건설(주) | 건축공사업 | 경기도 시흥시 마유로 433 ,401호(정왕동,아이에스빌딩) | 031-319-7479 | 2022-09-08
92 | 하늬이앤씨(주) | 건축공사업 | 경기도 시흥시 서울대학로278번길 34, 421호(정왕동, 시흥배곧아브뉴프랑센트럴옐로우) | 031-432-3351 | 2022-09-08
93 | 하우스디종합건설(주) | 건축공사업 | 경기도 시흥시 시청로68번길 31 ,제5층501호(장현동, 대성빌딩) | 031-318-8522 | 2022-09-08
94 | 효원종합건설(주) | 건축공사업 | 경기도 시흥시 황고개로 283 (군자동) | 031-434-8010 | 2022-09-08