Overview

Dataset statistics

Number of variables: 5
Number of observations: 359
Missing cells: 6
Missing cells (%): 0.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 14.1 KiB
Average record size in memory: 40.4 B
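As a cross-check of the missing-value percentages, the short sketch below recomputes them from the counts in this report. The variable names are illustrative only. It assumes, as the Alerts section indicates, that all 6 missing cells sit in the single column 사무소 전화번호.

```python
# Figures taken from the dataset statistics of this report.
n_variables = 5
n_observations = 359
missing_cells = 6

# Share of missing cells over the whole table (5 x 359 cells).
total_cells = n_variables * n_observations
missing_cells_pct = 100 * missing_cells / total_cells

# Share of missing values within the one affected column (359 rows).
missing_column_pct = 100 * missing_cells / n_observations

print(f"{missing_cells_pct:.1f}%")   # table-wide share
print(f"{missing_column_pct:.1f}%")  # share within the affected column
```

The two results correspond to the 0.3% shown under "Missing cells (%)" and the 1.7% shown in the missing-value alert.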

Variable types

Categorical: 2
Text: 3

Dataset

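Description: Data on surveying companies registered and managed by the Gyeongsangbuk-do Provincial Office (경상북도청). It provides each company's business category, company name, office telephone number, and road-name address.
Author: 경상북도 (Gyeongsangbuk-do)
URL: https://www.data.go.kr/data/15075669/fileData.do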

Alerts

지자체 is highly imbalanced (50.4%) [Imbalance]
사무소 전화번호 has 6 (1.7%) missing values [Missing]
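The report does not say how the 50.4% imbalance figure is derived. One common definition is 1 minus the normalised Shannon entropy of the class shares, and that definition reproduces the reported value from the 지자체 counts (320 and 39). The function below is a sketch under that assumption; `imbalance_score` is a hypothetical helper name, not part of any library.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the base-2 Shannon entropy
    of the class shares and k is the number of classes.
    0 means perfectly balanced, 1 means a single class holds everything."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts for 지자체 as listed in this report.
score = imbalance_score([320, 39])
print(f"{score:.1%}")
```

Under this definition a 50/50 split scores 0, so a score near 0.5 for a two-class column reflects the roughly 89/11 split seen here.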

Reproduction

Analysis started: 2024-03-14 16:21:26.908436
Analysis finished: 2024-03-14 16:21:27.866486
Duration: 0.96 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
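The duration can be verified from the two timestamps. This is a minimal sketch using only the standard library; the timestamp strings are copied from the lines above.

```python
from datetime import datetime

# Timestamps as printed in the Reproduction section.
started = datetime.fromisoformat("2024-03-14 16:21:26.908436")
finished = datetime.fromisoformat("2024-03-14 16:21:27.866486")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```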

Variables

지자체
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.9 KiB
경상북도: 320
경상북도 포항시: 39

Length

Max length: 8
Median length: 4
Mean length: 4.4345404
Min length: 4
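These length statistics follow directly from the two category values and their counts. The sketch below rebuilds the list of string lengths and recomputes them; it assumes lengths are counted in Unicode code points on precomposed Hangul, which is what Python's `len` does for these literals.

```python
from statistics import mean, median

# Category values and their counts as listed for 지자체.
counts = {"경상북도": 320, "경상북도 포항시": 39}

# One length entry per row of the dataset (359 in total).
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

print("max   ", max(lengths))
print("median", median(lengths))
print("mean  ", round(mean(lengths), 7))
print("min   ", min(lengths))
```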

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 경상북도
2nd row: 경상북도
3rd row: 경상북도
4th row: 경상북도
5th row: 경상북도 포항시

Common Values

Value | Count | Frequency (%)
경상북도 | 320 | 89.1%
경상북도 포항시 | 39 | 10.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
경상북도 | 359 | 90.2%
포항시 | 39 | 9.8%
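The counts in this plot legend (359 and 39) differ from the Common Values table (320 and 39). They are consistent with counting whitespace-separated words rather than whole values. That reading is an inference from the numbers, not something the report states. A sketch of the word count:

```python
from collections import Counter

# Whole-value counts from the Common Values table for 지자체.
counts = {"경상북도": 320, "경상북도 포항시": 39}

# Split every value on whitespace and weight each word by the value's count.
tokens = Counter()
for value, n in counts.items():
    for token in value.split():
        tokens[token] += n

total = sum(tokens.values())
for token, n in tokens.most_common():
    print(token, n, f"{100 * n / total:.1f}%")
```

The word 경상북도 occurs in every row, which explains why its count equals the number of observations.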

업종
Categorical

Distinct: 3
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.9 KiB
일반측량: 224
공공측량: 111
지적측량: 24

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일반측량
2nd row: 일반측량
3rd row: 일반측량
4th row: 공공측량
5th row: 일반측량

Common Values

Value | Count | Frequency (%)
일반측량 | 224 | 62.4%
공공측량 | 111 | 30.9%
지적측량 | 24 | 6.7%
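The frequency column is each count divided by the number of observations. A minimal sketch that recomputes the shares for 업종 from the counts above:

```python
# Category counts for 업종 as listed in this report.
counts = {"일반측량": 224, "공공측량": 111, "지적측량": 24}

total = sum(counts.values())  # should equal the number of observations

# Percentage share of each category, rounded to one decimal place.
shares = {value: round(100 * n / total, 1) for value, n in counts.items()}

for value, pct in shares.items():
    print(value, counts[value], f"{pct}%")
```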

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
일반측량 | 224 | 62.4%
공공측량 | 111 | 30.9%
지적측량 | 24 | 6.7%

측량업체명
Text

Distinct: 348
Distinct (%): 96.9%
Missing: 0
Missing (%): 0.0%
Memory size: 2.9 KiB

Length

Max length: 18
Median length: 17
Mean length: 9.0250696
Min length: 4

Characters and Unicode

Total characters: 3240
Distinct characters: 191
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
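To illustrate how such per-character properties are obtained, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The two input strings are sample values from this report; `category_counts` is a hypothetical helper name. The two-letter codes map to the category names used in the tables: Lo is Other Letter, Ps is Open Punctuation, Pe is Close Punctuation, Nd is Decimal Number, Pd is Dash Punctuation, and Zs is Space Separator.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count the Unicode general category of every character in text."""
    return Counter(unicodedata.category(ch) for ch in text)

# A company name and a phone number taken from the samples in this report.
name_categories = category_counts("(주)나라이엔지")
phone_categories = category_counts("054-841-1229")

print(name_categories)
print(phone_categories)
```

Summing such counters over every row of a column would give a table of the same shape as "Most occurring categories".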

Unique

Unique: 337
Unique (%): 93.9%
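"Distinct" and "Unique" measure different things: distinct counts the different values present, while unique counts the values that occur exactly once. Here 348 distinct minus 337 unique leaves 11 repeated names, and those names cover 359 - 337 = 22 rows, so each repeated name appears exactly twice. The sketch below shows the distinction on a small hypothetical list; `distinct_and_unique` and the toy data are illustrative only.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    freq = Counter(values)
    distinct = len(freq)
    unique = sum(1 for n in freq.values() if n == 1)
    return distinct, unique

# Hypothetical toy column: B and C are repeated, A and D occur once.
toy = ["A", "B", "B", "C", "C", "D"]
distinct, unique = distinct_and_unique(toy)
print(distinct, unique)

# Arithmetic on the figures reported for this variable.
repeated_values = 348 - 337          # names that occur more than once
rows_in_repeats = 359 - 337          # rows taken up by those names
print(repeated_values, rows_in_repeats)
```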

Sample

1st row: (주)나라이엔지
2nd row: (주)더원이앤씨
3rd row: 주식회사 주인엔지니어링
4th row: 에이치디이엔지 주식회사
5th row: (주)보보이엔지

Value | Count | Frequency (%)
주식회사 | 23 | 6.0%
주)부린 | 2 | 0.5%
주)지오스카이 | 2 | 0.5%
주)한빛기술단 | 2 | 0.5%
주)가온이앤씨 | 2 | 0.5%
영원지리정보 | 2 | 0.5%
티엘엔지니어링(주 | 2 | 0.5%
주)대국지아이에스 | 2 | 0.5%
주)성산 | 2 | 0.5%
주)도원디엔씨 | 2 | 0.5%
Other values (340) | 342 | 89.3%

Most occurring characters

ValueCountFrequency (%)
318
 
9.8%
( 281
 
8.7%
) 281
 
8.7%
163
 
5.0%
161
 
5.0%
127
 
3.9%
103
 
3.2%
103
 
3.2%
103
 
3.2%
88
 
2.7%
Other values (181) 1512
46.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2650 | 81.8%
Open Punctuation | 281 | 8.7%
Close Punctuation | 281 | 8.7%
Space Separator | 24 | 0.7%
Other Symbol | 2 | 0.1%
Uppercase Letter | 2 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
318
 
12.0%
163
 
6.2%
161
 
6.1%
127
 
4.8%
103
 
3.9%
103
 
3.9%
103
 
3.9%
88
 
3.3%
82
 
3.1%
70
 
2.6%
Other values (175) 1332
50.3%
Uppercase Letter
ValueCountFrequency (%)
G 1
50.0%
C 1
50.0%
Open Punctuation
ValueCountFrequency (%)
( 281
100.0%
Close Punctuation
ValueCountFrequency (%)
) 281
100.0%
Space Separator
ValueCountFrequency (%)
24
100.0%
Other Symbol
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2652 | 81.9%
Common | 586 | 18.1%
Latin | 2 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
318
 
12.0%
163
 
6.1%
161
 
6.1%
127
 
4.8%
103
 
3.9%
103
 
3.9%
103
 
3.9%
88
 
3.3%
82
 
3.1%
70
 
2.6%
Other values (176) 1334
50.3%
Common
ValueCountFrequency (%)
( 281
48.0%
) 281
48.0%
24
 
4.1%
Latin
ValueCountFrequency (%)
G 1
50.0%
C 1
50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2650 | 81.8%
ASCII | 588 | 18.1%
None | 2 | 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
318
 
12.0%
163
 
6.2%
161
 
6.1%
127
 
4.8%
103
 
3.9%
103
 
3.9%
103
 
3.9%
88
 
3.3%
82
 
3.1%
70
 
2.6%
Other values (175) 1332
50.3%
ASCII
ValueCountFrequency (%)
( 281
47.8%
) 281
47.8%
24
 
4.1%
G 1
 
0.2%
C 1
 
0.2%
None
ValueCountFrequency (%)
2
100.0%

소재지
Text

Distinct: 347
Distinct (%): 96.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.9 KiB

Length

Max length: 49
Median length: 41
Mean length: 28.18663
Min length: 17

Characters and Unicode

Total characters: 10119
Distinct characters: 245
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 335
Unique (%): 93.3%

Sample

1st row: 경상북도 안동시 법상길 61 (법상동)
2nd row: 경상북도 경주시 동천로93번길 6 2층 (동천동)
3rd row: 경상북도 봉화군 봉화읍 신시장길 11 라동 306호
4th row: 경상북도 경산시 경안로67길 13-5 1층 102호(대평동)
5th row: 경상북도 포항시 남구 대이로45번길 6-4 ,2층(대잠동)

Value | Count | Frequency (%)
경상북도 | 357 | 17.3%
2층 | 76 | 3.7%
포항시 | 39 | 1.9%
경산시 | 33 | 1.6%
경주시 | 33 | 1.6%
3층 | 32 | 1.6%
남구 | 30 | 1.5%
안동시 | 25 | 1.2%
구미시 | 25 | 1.2%
문경시 | 23 | 1.1%
Other values (737) | 1386 | 67.3%

Most occurring characters

ValueCountFrequency (%)
2256
22.3%
467
 
4.6%
413
 
4.1%
393
 
3.9%
380
 
3.8%
1 370
 
3.7%
2 334
 
3.3%
289
 
2.9%
261
 
2.6%
234
 
2.3%
Other values (235) 4722
46.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 5372 | 53.1%
Space Separator | 2256 | 22.3%
Decimal Number | 1720 | 17.0%
Other Punctuation | 229 | 2.3%
Close Punctuation | 197 | 1.9%
Open Punctuation | 197 | 1.9%
Dash Punctuation | 148 | 1.5%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
467
 
8.7%
413
 
7.7%
393
 
7.3%
380
 
7.1%
289
 
5.4%
261
 
4.9%
234
 
4.4%
194
 
3.6%
183
 
3.4%
115
 
2.1%
Other values (220) 2443
45.5%
Decimal Number
ValueCountFrequency (%)
1 370
21.5%
2 334
19.4%
3 206
12.0%
0 165
9.6%
4 149
8.7%
5 147
 
8.5%
6 106
 
6.2%
8 96
 
5.6%
7 87
 
5.1%
9 60
 
3.5%
Space Separator
ValueCountFrequency (%)
2256
100.0%
Other Punctuation
ValueCountFrequency (%)
, 229
100.0%
Close Punctuation
ValueCountFrequency (%)
) 197
100.0%
Open Punctuation
ValueCountFrequency (%)
( 197
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 148
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 5372 | 53.1%
Common | 4747 | 46.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
467
 
8.7%
413
 
7.7%
393
 
7.3%
380
 
7.1%
289
 
5.4%
261
 
4.9%
234
 
4.4%
194
 
3.6%
183
 
3.4%
115
 
2.1%
Other values (220) 2443
45.5%
Common
ValueCountFrequency (%)
2256
47.5%
1 370
 
7.8%
2 334
 
7.0%
, 229
 
4.8%
3 206
 
4.3%
) 197
 
4.1%
( 197
 
4.1%
0 165
 
3.5%
4 149
 
3.1%
- 148
 
3.1%
Other values (5) 496
 
10.4%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 5372 | 53.1%
ASCII | 4747 | 46.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2256
47.5%
1 370
 
7.8%
2 334
 
7.0%
, 229
 
4.8%
3 206
 
4.3%
) 197
 
4.1%
( 197
 
4.1%
0 165
 
3.5%
4 149
 
3.1%
- 148
 
3.1%
Other values (5) 496
 
10.4%
Hangul
ValueCountFrequency (%)
467
 
8.7%
413
 
7.7%
393
 
7.3%
380
 
7.1%
289
 
5.4%
261
 
4.9%
234
 
4.4%
194
 
3.6%
183
 
3.4%
115
 
2.1%
Other values (220) 2443
45.5%

사무소 전화번호
Text

Distinct: 334
Distinct (%): 94.6%
Missing: 6
Missing (%): 1.7%
Memory size: 2.9 KiB

Length

Max length: 13
Median length: 12
Mean length: 12.056657
Min length: 12

Characters and Unicode

Total characters: 4256
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 317
Unique (%): 89.8%

Sample

1st row: 054-841-1229
2nd row: 054-741-5672
3rd row: 053-721-7661
4th row: 053-814-5550
5th row: 054-272-6627
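The sampled numbers share a hyphenated three-part layout, and the reported lengths of 12 to 13 characters match a three-digit prefix followed by three or four digits and then four digits. The sketch below is one possible format check under that assumption. The pattern is inferred from the values shown in this report, not from any official numbering specification, and `is_valid_phone` is a hypothetical helper.

```python
import re

# Assumed layout: prefix starting with 0 (2-3 digits), 3-4 digits, 4 digits.
PHONE_RE = re.compile(r"0\d{1,2}-\d{3,4}-\d{4}")

def is_valid_phone(value):
    """Return True if value is a string that matches the assumed layout."""
    return isinstance(value, str) and PHONE_RE.fullmatch(value) is not None

# Values taken from this report, plus two ways a missing entry might appear.
for candidate in ["054-841-1229", "070-8196-2125", "<NA>", None]:
    print(candidate, is_valid_phone(candidate))
```

A check like this would flag the 6 missing entries as well as any entry that deviates from the layout.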
Value | Count | Frequency (%)
053-745-0871 | 3 | 0.8%
053-744-2799 | 3 | 0.8%
070-8196-2125 | 2 | 0.6%
070-4226-6114 | 2 | 0.6%
053-759-2500 | 2 | 0.6%
054-781-8882 | 2 | 0.6%
054-624-0365 | 2 | 0.6%
054-774-2478 | 2 | 0.6%
053-814-2226 | 2 | 0.6%
054-278-0344 | 2 | 0.6%
Other values (324) | 331 | 93.8%

Most occurring characters

ValueCountFrequency (%)
- 706
16.6%
0 618
14.5%
5 594
14.0%
4 452
10.6%
3 365
8.6%
7 357
8.4%
8 273
 
6.4%
2 258
 
6.1%
1 254
 
6.0%
6 206
 
4.8%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 3550 | 83.4%
Dash Punctuation | 706 | 16.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 618
17.4%
5 594
16.7%
4 452
12.7%
3 365
10.3%
7 357
10.1%
8 273
7.7%
2 258
7.3%
1 254
7.2%
6 206
 
5.8%
9 173
 
4.9%
Dash Punctuation
ValueCountFrequency (%)
- 706
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 4256 | 100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 706
16.6%
0 618
14.5%
5 594
14.0%
4 452
10.6%
3 365
8.6%
7 357
8.4%
8 273
 
6.4%
2 258
 
6.1%
1 254
 
6.0%
6 206
 
4.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 4256 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 706
16.6%
0 618
14.5%
5 594
14.0%
4 452
10.6%
3 365
8.6%
7 357
8.4%
8 273
 
6.4%
2 258
 
6.1%
1 254
 
6.0%
6 206
 
4.8%

Correlations

Variable | 지자체 | 업종
지자체 | 1.000 | 0.000
업종 | 0.000 | 1.000
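The report does not name the association measure behind this matrix. For pairs of categorical variables a frequently used choice is Cramér's V with a bias correction, and a value of 0.000 is what that measure returns when two variables look independent. The sketch below implements the measure under that assumption. The contingency tables are hypothetical, because the report does not include the 지자체 by 업종 cross-tabulation.

```python
import math

def cramers_v(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows)."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic (no continuity correction).
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    # Bias correction of phi squared and of the table dimensions.
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0

# Hypothetical tables: proportional rows (independent) and a perfect association.
independent = cramers_v([[10, 20, 30], [20, 40, 60]])
associated = cramers_v([[50, 0], [0, 50]])
print(round(independent, 3), round(associated, 3))
```

Because of the correction term, weak associations in a small sample are clipped to exactly zero, which is one way a reported value of 0.000 can arise.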

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 지자체 | 업종 | 측량업체명 | 소재지 | 사무소 전화번호
0 | 경상북도 | 일반측량 | (주)나라이엔지 | 경상북도 안동시 법상길 61 (법상동) | 054-841-1229
1 | 경상북도 | 일반측량 | (주)더원이앤씨 | 경상북도 경주시 동천로93번길 6 2층 (동천동) | 054-741-5672
2 | 경상북도 | 일반측량 | 주식회사 주인엔지니어링 | 경상북도 봉화군 봉화읍 신시장길 11 라동 306호 | 053-721-7661
3 | 경상북도 | 공공측량 | 에이치디이엔지 주식회사 | 경상북도 경산시 경안로67길 13-5 1층 102호(대평동) | 053-814-5550
4 | 경상북도 포항시 | 일반측량 | (주)보보이엔지 | 경상북도 포항시 남구 대이로45번길 6-4 ,2층(대잠동) | 054-272-6627
5 | 경상북도 | 일반측량 | 주식회사 창성이앤씨 | 경상북도 문경시 신흥시장길 25 1층(흥덕동) | 053-561-3484
6 | 경상북도 | 일반측량 | (주)황용건설엔지니어링 | 경상북도 문경시 농암면 농암길 68 1층 | 053-939-1197
7 | 경상북도 | 공공측량 | 명보이앤씨 주식회사 | 경상북도 안동시 풍천면 천년숲서로 7-19 611호 | 070-5184-8102
8 | 경상북도 | 일반측량 | 주식회사 보검엔지니어링 | 경상북도 예천군 예천읍 시장로 161 1층 | <NA>
9 | 경상북도 | 일반측량 | (주)유한종합기술 | 경상북도 상주시 화서면 문장로 94 | 053-721-7510

Last rows

Index | 지자체 | 업종 | 측량업체명 | 소재지 | 사무소 전화번호
349 | 경상북도 | 일반측량 | 금오측지토목설계공사 | 경상북도 구미시 송원서로 23 2층 | 054-452-8480
350 | 경상북도 | 일반측량 | (주)주원건설엔지니어링 | 경상북도 영주시 시청로 20 | 054-638-9999
351 | 경상북도 | 일반측량 | 정우건설엔지니어링 | 경상북도 김천시 시청6길 8-23 (신음동) | 054-434-9614
352 | 경상북도 포항시 | 일반측량 | (주)보보 | 경상북도 포항시 남구 대이로45번길 6-4 (대잠동), | 054-283-6317
353 | 경상북도 | 공공측량 | (주)세원 | 경상북도 예천군 예천읍 충효로 408-20 | 054-654-7401
354 | 경상북도 | 일반측량 | 동진측량설계 | 경상북도 영천시 충효로 75-2, 2층 | 054-331-8460
355 | 경상북도 | 일반측량 | 경상측량설계공사 | 경상북도 상주시 상산로 215-16 2층(남성동) | 054-535-1904
356 | 경상북도 포항시 | 일반측량 | (주)대성건설엔지니어링 | 경상북도 포항시 북구 아치로 15, 4층 (관음빌딩) | 054-247-1733
357 | 경상북도 포항시 | 일반측량 | 기암이엔씨 | 경상북도 포항시 남구 상공로6번길 34 | 054-272-2058
358 | 경상북도 | 일반측량 | 동양측량토목설계사무소 | 경상북도 안동시 퇴계로 106 | 054-857-4301