Overview

Dataset statistics

Number of variables: 6
Number of observations: 206
Missing cells: 50
Missing cells (%): 4.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 9.8 KiB
Average record size in memory: 48.6 B
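
The missing-cell percentage can be re-derived from the counts above. A minimal sketch, assuming the percentage is the number of missing cells divided by variables times observations (the variable names are illustrative):

```python
# Figures copied from the dataset statistics above.
n_variables = 6
n_observations = 206
missing_cells = 50

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{missing_cells} of {total_cells} cells missing ({missing_pct:.1f}%)")
```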

Variable types

Categorical: 3
Text: 3

Dataset

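Description: Provides data on administrative agents (행정사) in Gangwon-do: business name, representative, type, contact number, road-name address, latitude/longitude of the location, and so on.
Author: 강원특별자치도
URL: https://www.data.go.kr/data/15033702/fileData.do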

Alerts

시도명 is highly imbalanced (95.6%) [Imbalance]
종류 is highly imbalanced (94.4%) [Imbalance]
연락처 has 50 (24.3%) missing values [Missing]
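
The report does not say how the imbalance percentage is computed. One definition that reproduces both figures from the value counts listed under Variables is one minus the normalized Shannon entropy. The sketch below works under that assumption and is not a statement of the tool's implementation; `imbalance_score` is an illustrative name.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # assumption: a single class is treated as not imbalanced
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# Counts taken from the 시도명 and 종류 sections of this report.
sido_score = imbalance_score([205, 1])
kind_score = imbalance_score([204, 1, 1])
print(f"시도명: {sido_score:.1%}")
print(f"종류: {kind_score:.1%}")
```

A score near 100% means almost all rows share one value, which is the case for both columns.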

Reproduction

Analysis started: 2023-12-12 15:18:55.350428
Analysis finished: 2023-12-12 15:18:56.034162
Duration: 0.68 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
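
The duration follows from the two timestamps. A short check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block above.
started = datetime.fromisoformat("2023-12-12 15:18:55.350428")
finished = datetime.fromisoformat("2023-12-12 15:18:56.034162")

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```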

Variables

시도명
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB
강원특별자치도 | 205
강원특별자치도 | 1

Length

Max length: 8
Median length: 7
Mean length: 7.0048544
Min length: 7

Unique

Unique: 1
Unique (%): 0.5%

Sample

1st row: 강원특별자치도
2nd row: 강원특별자치도
3rd row: 강원특별자치도
4th row: 강원특별자치도
5th row: 강원특별자치도

Common Values

Value | Count | Frequency (%)
강원특별자치도 | 205 | 99.5%
강원특별자치도 | 1 | 0.5%
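
The two values above render identically, yet the report counts them as distinct and the length statistics give a maximum of 8 against a minimum of 7. One plausible explanation, and it is only an inference, is that a single record carries a stray extra character such as trailing whitespace. The sketch below shows how such labels could be merged, with a trailing space standing in for the unknown extra character:

```python
from collections import Counter

# Hypothetical reconstruction: 205 clean labels plus one with a trailing space.
labels = ["강원특별자치도"] * 205 + ["강원특별자치도 "]

raw_counts = Counter(labels)                             # counts before cleaning
clean_counts = Counter(label.strip() for label in labels)  # counts after stripping
mean_length = sum(len(label) for label in labels) / len(labels)

print(len(raw_counts), len(clean_counts), round(mean_length, 7))
```

Under this assumption the mean length matches the reported 7.0048544, and stripping collapses the column to a single value.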

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
강원특별자치도 | 206 | 100.0%

시군구명
Categorical

Distinct: 16
Distinct (%): 7.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB
원주시 | 50
춘천시 | 48
강릉시 | 27
동해시 | 10
속초시 | 10
Other values (11) | 61

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강릉시
2nd row: 강릉시
3rd row: 강릉시
4th row: 강릉시
5th row: 강릉시

Common Values

Value | Count | Frequency (%)
원주시 | 50 | 24.3%
춘천시 | 48 | 23.3%
강릉시 | 27 | 13.1%
동해시 | 10 | 4.9%
속초시 | 10 | 4.9%
홍천군 | 10 | 4.9%
인제군 | 9 | 4.4%
횡성군 | 9 | 4.4%
평창군 | 8 | 3.9%
삼척시 | 6 | 2.9%
Other values (6) | 19 | 9.2%
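
The frequency column is each count divided by the 206 observations. A small sketch that recomputes it from the counts listed above; the six districts folded into "Other values" are not named in the report, so they stay aggregated here:

```python
# Counts copied from the Common Values table for 시군구명.
counts = {
    "원주시": 50, "춘천시": 48, "강릉시": 27, "동해시": 10, "속초시": 10,
    "홍천군": 10, "인제군": 9, "횡성군": 9, "평창군": 8, "삼척시": 6,
    "Other values (6)": 19,
}

total = sum(counts.values())
frequencies = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, pct in frequencies.items():
    print(f"{name}: {counts[name]} ({pct}%)")
```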

Length

Histogram of lengths of the category

업체명
Text

Distinct: 204
Distinct (%): 99.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 19
Median length: 15
Mean length: 8.6407767
Min length: 2

Characters and Unicode

Total characters: 1780
Distinct characters: 193
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 202
Unique (%): 98.1%

Sample

1st row: 성진행정사사무소
2nd row: 행정사 김낙국사무소
3rd row: 행정사 배형각
4th row: 행정사 김찬기사무소
5th row: 행정사 송암사무소

Value | Count | Frequency (%)
행정사 | 32 | 11.8%
사무소 | 16 | 5.9%
행정서사 | 6 | 2.2%
행정사사무소 | 5 | 1.8%
행정사무소 | 2 | 0.7%
해운행정사사무소 | 2 | 0.7%
심우석행정사사무소 | 1 | 0.4%
최호성행정사사무소 | 1 | 0.4%
김삼기 | 1 | 0.4%
영문번역행정사사무소 | 1 | 0.4%
Other values (205) | 205 | 75.4%

Most occurring characters

ValueCountFrequency (%)
359
20.2%
209
11.7%
204
 
11.5%
170
 
9.6%
168
 
9.4%
66
 
3.7%
30
 
1.7%
23
 
1.3%
16
 
0.9%
13
 
0.7%
Other values (183) 522
29.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 1702
95.6%
Space Separator 66
 
3.7%
Lowercase Letter 4
 
0.2%
Decimal Number 3
 
0.2%
Uppercase Letter 3
 
0.2%
Close Punctuation 1
 
0.1%
Open Punctuation 1
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
359
21.1%
209
12.3%
204
12.0%
170
 
10.0%
168
 
9.9%
30
 
1.8%
23
 
1.4%
16
 
0.9%
13
 
0.8%
13
 
0.8%
Other values (172) 497
29.2%
Lowercase Letter
ValueCountFrequency (%)
o 2
50.0%
y 1
25.0%
a 1
25.0%
Uppercase Letter
ValueCountFrequency (%)
S 1
33.3%
N 1
33.3%
H 1
33.3%
Decimal Number
ValueCountFrequency (%)
1 2
66.7%
4 1
33.3%
Space Separator
ValueCountFrequency (%)
66
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1702
95.6%
Common 71
 
4.0%
Latin 7
 
0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
359
21.1%
209
12.3%
204
12.0%
170
 
10.0%
168
 
9.9%
30
 
1.8%
23
 
1.4%
16
 
0.9%
13
 
0.8%
13
 
0.8%
Other values (172) 497
29.2%
Latin
ValueCountFrequency (%)
o 2
28.6%
S 1
14.3%
N 1
14.3%
y 1
14.3%
a 1
14.3%
H 1
14.3%
Common
ValueCountFrequency (%)
66
93.0%
1 2
 
2.8%
) 1
 
1.4%
( 1
 
1.4%
4 1
 
1.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1702
95.6%
ASCII 78
 
4.4%

Most frequent character per block

Hangul
ValueCountFrequency (%)
359
21.1%
209
12.3%
204
12.0%
170
 
10.0%
168
 
9.9%
30
 
1.8%
23
 
1.4%
16
 
0.9%
13
 
0.8%
13
 
0.8%
Other values (172) 497
29.2%
ASCII
ValueCountFrequency (%)
66
84.6%
1 2
 
2.6%
o 2
 
2.6%
S 1
 
1.3%
N 1
 
1.3%
y 1
 
1.3%
a 1
 
1.3%
) 1
 
1.3%
( 1
 
1.3%
H 1
 
1.3%

대표자명
Text

Distinct: 204
Distinct (%): 99.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 5
Median length: 3
Mean length: 3
Min length: 2

Characters and Unicode

Total characters: 618
Distinct characters: 138
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 202
Unique (%): 98.1%

Sample

1st row: 김성진
2nd row: 김낙국
3rd row: 배형각
4th row: 김찬기
5th row: 최기순

Value | Count | Frequency (%)
이원근 | 2 | 1.0%
김찬욱 | 2 | 1.0%
김영적 | 1 | 0.5%
이기연 | 1 | 0.5%
함영일 | 1 | 0.5%
최호성 | 1 | 0.5%
김성진 | 1 | 0.5%
오웅철 | 1 | 0.5%
이희춘 | 1 | 0.5%
차태환 | 1 | 0.5%
Other values (194) | 194 | 94.2%

Most occurring characters

ValueCountFrequency (%)
42
 
6.8%
33
 
5.3%
19
 
3.1%
16
 
2.6%
16
 
2.6%
15
 
2.4%
14
 
2.3%
14
 
2.3%
13
 
2.1%
13
 
2.1%
Other values (128) 423
68.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 617
99.8%
Decimal Number 1
 
0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
42
 
6.8%
33
 
5.3%
19
 
3.1%
16
 
2.6%
16
 
2.6%
15
 
2.4%
14
 
2.3%
14
 
2.3%
13
 
2.1%
13
 
2.1%
Other values (127) 422
68.4%
Decimal Number
ValueCountFrequency (%)
1 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 617
99.8%
Common 1
 
0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
42
 
6.8%
33
 
5.3%
19
 
3.1%
16
 
2.6%
16
 
2.6%
15
 
2.4%
14
 
2.3%
14
 
2.3%
13
 
2.1%
13
 
2.1%
Other values (127) 422
68.4%
Common
ValueCountFrequency (%)
1 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 617
99.8%
ASCII 1
 
0.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
42
 
6.8%
33
 
5.3%
19
 
3.1%
16
 
2.6%
16
 
2.6%
15
 
2.4%
14
 
2.3%
14
 
2.3%
13
 
2.1%
13
 
2.1%
Other values (127) 422
68.4%
ASCII
ValueCountFrequency (%)
1 1
100.0%

종류
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB
일반 | 204
기술 | 1
번역(영어) | 1

Length

Max length: 6
Median length: 2
Mean length: 2.0194175
Min length: 2

Unique

Unique: 2
Unique (%): 1.0%

Sample

1st row: 일반
2nd row: 일반
3rd row: 일반
4th row: 일반
5th row: 일반

Common Values

Value | Count | Frequency (%)
일반 | 204 | 99.0%
기술 | 1 | 0.5%
번역(영어) | 1 | 0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
일반 | 204 | 99.0%
기술 | 1 | 0.5%
번역(영어) | 1 | 0.5%

연락처
Text

MISSING 

Distinct: 155
Distinct (%): 99.4%
Missing: 50
Missing (%): 24.3%
Memory size: 1.7 KiB
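
Distinct (%) is 99.4% here even though 155 is only about three quarters of the 206 observations. The figure is consistent with the percentage being taken over non-missing values only. A quick check under that assumption:

```python
# Figures copied from the 연락처 statistics above.
n_observations = 206
n_missing = 50
n_distinct = 155

n_present = n_observations - n_missing          # rows that actually hold a value
distinct_pct = 100 * n_distinct / n_present     # relative to non-missing rows
missing_pct = 100 * n_missing / n_observations  # relative to all rows

print(f"{n_present} present, distinct {distinct_pct:.1f}%, missing {missing_pct:.1f}%")
```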

Length

Max length: 13
Median length: 12
Mean length: 12.012821
Min length: 12

Characters and Unicode

Total characters: 1874
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 154
Unique (%): 98.7%

Sample

1st row: 033-641-3344
2nd row: 033-642-0159
3rd row: 033-642-5030
4th row: 033-643-1115
5th row: 033-643-8811

Value | Count | Frequency (%)
033-433-6969 | 2 | 1.2%
033 | 2 | 1.2%
033-241-0081 | 1 | 0.6%
033-244-5301 | 1 | 0.6%
033-252-5666 | 1 | 0.6%
033-252-4831 | 1 | 0.6%
033-452-1340 | 1 | 0.6%
033-452-2563 | 1 | 0.6%
033-455-9020 | 1 | 0.6%
033-241-9833 | 1 | 0.6%
Other values (148) | 148 | 92.5%
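
The sample values all follow the 033-NNN-NNNN shape, but the standalone "033" token in the table above, the 13-character maximum length, and the space characters reported under Characters and Unicode suggest that a couple of entries use spaces where the others use hyphens. That reading is an inference from the summary statistics, not something the report states. The sketch below is one possible normalizer and validator; the pattern, the helper name `normalize_phone`, and the space-separated and bare "033" inputs are all hypothetical:

```python
import re

# Assumed pattern: area code 033, a 3- or 4-digit exchange, a 4-digit line number.
PHONE_RE = re.compile(r"033-\d{3,4}-\d{4}")

def normalize_phone(raw):
    """Turn whitespace separators into hyphens; return None if the result is malformed."""
    if raw is None:
        return None
    cleaned = re.sub(r"\s+", "-", raw.strip())
    return cleaned if PHONE_RE.fullmatch(cleaned) else None

# The first two values appear in the report; the rest are made-up edge cases.
samples = ["033-641-3344", "033-433-6969", "033 1234 5678", None, "033"]
normalized = [normalize_phone(s) for s in samples]
print(normalized)
```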

Most occurring characters

ValueCountFrequency (%)
3 446
23.8%
- 308
16.4%
0 247
13.2%
2 145
 
7.7%
5 133
 
7.1%
4 132
 
7.0%
6 120
 
6.4%
7 117
 
6.2%
1 95
 
5.1%
8 67
 
3.6%
Other values (2) 64
 
3.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1562
83.4%
Dash Punctuation 308
 
16.4%
Space Separator 4
 
0.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 446
28.6%
0 247
15.8%
2 145
 
9.3%
5 133
 
8.5%
4 132
 
8.5%
6 120
 
7.7%
7 117
 
7.5%
1 95
 
6.1%
8 67
 
4.3%
9 60
 
3.8%
Dash Punctuation
ValueCountFrequency (%)
- 308
100.0%
Space Separator
ValueCountFrequency (%)
4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1874
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
3 446
23.8%
- 308
16.4%
0 247
13.2%
2 145
 
7.7%
5 133
 
7.1%
4 132
 
7.0%
6 120
 
6.4%
7 117
 
6.2%
1 95
 
5.1%
8 67
 
3.6%
Other values (2) 64
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1874
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 446
23.8%
- 308
16.4%
0 247
13.2%
2 145
 
7.7%
5 133
 
7.1%
4 132
 
7.0%
6 120
 
6.4%
7 117
 
6.2%
1 95
 
5.1%
8 67
 
3.6%
Other values (2) 64
 
3.4%

Correlations

Variable | 시도명 | 시군구명 | 종류
시도명 | 1.000 | 0.000 | 0.000
시군구명 | 0.000 | 1.000 | 0.000
종류 | 0.000 | 0.000 | 1.000

Variable | 종류 | 시도명 | 시군구명
종류 | 1.000 | 0.000 | 0.000
시도명 | 0.000 | 1.000 | 0.000
시군구명 | 0.000 | 0.000 | 1.000

Variable | 시도명 | 시군구명 | 종류
시도명 | 1.000 | 0.000 | 0.000
시군구명 | 0.000 | 1.000 | 0.000
종류 | 0.000 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
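
The per-column null counts behind such charts can be tallied directly. A minimal sketch using the first ten rows shown in the Sample section, with None standing in for <NA>; only the two columns needed for the illustration are included:

```python
# First ten rows of the Sample section (업체명 and 연락처 only).
rows = [
    {"업체명": "성진행정사사무소", "연락처": None},
    {"업체명": "행정사 김낙국사무소", "연락처": None},
    {"업체명": "행정사 배형각", "연락처": None},
    {"업체명": "행정사 김찬기사무소", "연락처": None},
    {"업체명": "행정사 송암사무소", "연락처": None},
    {"업체명": "원영래 행정사사무소", "연락처": "033-641-3344"},
    {"업체명": "행정서사 윤조익사무소", "연락처": "033-642-0159"},
    {"업체명": "행정서사 최종설사무소", "연락처": "033-642-5030"},
    {"업체명": "행정사 백규선사무소", "연락처": "033-643-1115"},
    {"업체명": "행정사 박재영사무소", "연락처": "033-643-8811"},
]

columns = ["업체명", "연락처"]
# Count the rows in which each column holds no value.
missing_by_column = {col: sum(row[col] is None for row in rows) for col in columns}
print(missing_by_column)
```

In this excerpt half of the 연락처 values are missing, while 업체명 is complete.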

Sample

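First rows

Index | 시도명 | 시군구명 | 업체명 | 대표자명 | 종류 | 연락처
0 | 강원특별자치도 | 강릉시 | 성진행정사사무소 | 김성진 | 일반 | <NA>
1 | 강원특별자치도 | 강릉시 | 행정사 김낙국사무소 | 김낙국 | 일반 | <NA>
2 | 강원특별자치도 | 강릉시 | 행정사 배형각 | 배형각 | 일반 | <NA>
3 | 강원특별자치도 | 강릉시 | 행정사 김찬기사무소 | 김찬기 | 일반 | <NA>
4 | 강원특별자치도 | 강릉시 | 행정사 송암사무소 | 최기순 | 일반 | <NA>
5 | 강원특별자치도 | 강릉시 | 원영래 행정사사무소 | 원영래 | 일반 | 033-641-3344
6 | 강원특별자치도 | 강릉시 | 행정서사 윤조익사무소 | 윤조익 | 일반 | 033-642-0159
7 | 강원특별자치도 | 강릉시 | 행정서사 최종설사무소 | 최종설 | 일반 | 033-642-5030
8 | 강원특별자치도 | 강릉시 | 행정사 백규선사무소 | 백규선 | 일반 | 033-643-1115
9 | 강원특별자치도 | 강릉시 | 행정사 박재영사무소 | 박재영 | 일반 | 033-643-8811

Last rows

Index | 시도명 | 시군구명 | 업체명 | 대표자명 | 종류 | 연락처
196 | 강원특별자치도 | 화천군 | 정인행정사무소 | 김인규 | 일반 | <NA>
197 | 강원특별자치도 | 횡성군 | 김철호행정사 | 김철호 | 일반 | <NA>
198 | 강원특별자치도 | 횡성군 | 선의행정사사무소 | 정선교 | 일반 | <NA>
199 | 강원특별자치도 | 횡성군 | 박희구행정사 | 박희구 | 일반 | <NA>
200 | 강원특별자치도 | 횡성군 | 고치용행정사 | 고치용 | 일반 | <NA>
201 | 강원특별자치도 | 횡성군 | 김준구행정사 | 김준구 | 일반 | 033-343-3169
202 | 강원특별자치도 | 횡성군 | 행정사김순철사무소 | 김순철 | 일반 | 033-343-3634
203 | 강원특별자치도 | 횡성군 | 고만수행정사 | 고만수 | 일반 | 033-343-3754
204 | 강원특별자치도 | 횡성군 | 오후행정사 | 이재호 | 일반 | 033-343-5789
205 | 강원특별자치도 | 횡성군 | 이관형행정사 | 이원근 | 일반 | 033-345-0505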