Overview

Dataset statistics

Number of variables: 7
Number of observations: 393
Missing cells: 35
Missing cells (%): 1.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 21.6 KiB
Average record size in memory: 56.3 B

Variable types

Categorical: 3
Text: 4

Dataset

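Description: Information on city gas regional management offices (city gas company name, customer center name, supply area, telephone number). These offices perform tasks such as connecting household gas ranges, reading meters for billing, and carrying out safety inspections under contract from city gas companies.
Author: Korea Gas Safety Corporation (한국가스안전공사)
URL: https://www.data.go.kr/data/15001493/fileData.do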

Alerts

연락처 is highly overall correlated with 구분 and 1 other field (High correlation)
도시가스 회사명 is highly overall correlated with 구분 and 1 other field (High correlation)
구분 is highly overall correlated with 도시가스 회사명 and 1 other field (High correlation)
고객센터명 has 34 (8.7%) missing values (Missing)

Reproduction

Analysis started: 2023-12-12 21:21:14.200115
Analysis finished: 2023-12-12 21:21:14.953923
Duration: 0.75 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

구분
Categorical

HIGH CORRELATION 

Distinct: 17
Distinct (%): 4.3%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Value | Count
서울특별시 | 89
경기도 | 77
부산광역시 | 32
경상북도 | 28
전라남도 | 21
Other values (12) | 146

Length

Max length: 7
Median length: 5
Mean length: 4.2340967
Min length: 3

Unique

Unique: 1
Unique (%): 0.3%

Sample

1st row: 강원도
2nd row: 강원도
3rd row: 강원도
4th row: 강원도
5th row: 강원도

Common Values

Value | Count | Frequency (%)
서울특별시 | 89 | 22.6%
경기도 | 77 | 19.6%
부산광역시 | 32 | 8.1%
경상북도 | 28 | 7.1%
전라남도 | 21 | 5.3%
대구광역시 | 21 | 5.3%
전라북도 | 20 | 5.1%
충청남도 | 19 | 4.8%
경상남도 | 18 | 4.6%
충청북도 | 15 | 3.8%
Other values (7) | 53 | 13.5%

Length

[Figure: Histogram of lengths of the category]

지역구
Text

Distinct: 188
Distinct (%): 47.8%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Length

Max length: 4
Median length: 3
Mean length: 2.9516539
Min length: 2

Characters and Unicode

Total characters: 1160
Distinct characters: 127
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
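
As a rough illustration of how such per-character statistics can be derived, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not taken from the profiling software itself; the helper name `category_counts` is hypothetical, and the input values are the five sample rows shown in this section. The report's labels map to the standard two-letter codes, for example "Other Letter" is `Lo`, "Decimal Number" is `Nd`, and "Space Separator" is `Zs`.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`.

    `category_counts` is a hypothetical helper for illustration only.
    Missing entries (None) are skipped.
    """
    counts = Counter()
    for value in values:
        if value is None:
            continue
        counts.update(unicodedata.category(ch) for ch in value)
    return counts


# The five sample rows listed for this variable.
sample = ["강릉시", "고성군", "동해시", "삼척시", "속초시"]
counts = category_counts(sample)
total_chars = sum(counts.values())

for code, count in counts.most_common():
    print(f"{code}: {count} ({count / total_chars:.1%})")
```

Script and block statistics need data that the standard library does not expose directly, so they are left out of this sketch.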

Unique

Unique: 108
Unique (%): 27.5%

Sample

1st row: 강릉시
2nd row: 고성군
3rd row: 동해시
4th row: 삼척시
5th row: 속초시

Value | Count | Frequency (%)
서구 | 11 | 2.8%
동구 | 10 | 2.5%
중구 | 9 | 2.3%
전주시 | 8 | 2.0%
성북구 | 7 | 1.8%
성남시 | 7 | 1.8%
강남구 | 6 | 1.5%
북구 | 6 | 1.5%
남구 | 6 | 1.5%
송파구 | 5 | 1.3%
Other values (178) | 318 | 80.9%

Most occurring characters

ValueCountFrequency (%)
181
 
15.6%
153
 
13.2%
73
 
6.3%
40
 
3.4%
38
 
3.3%
33
 
2.8%
31
 
2.7%
31
 
2.7%
29
 
2.5%
28
 
2.4%
Other values (117) 523
45.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 1160
100.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
181
 
15.6%
153
 
13.2%
73
 
6.3%
40
 
3.4%
38
 
3.3%
33
 
2.8%
31
 
2.7%
31
 
2.7%
29
 
2.5%
28
 
2.4%
Other values (117) 523
45.1%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1160
100.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
181
 
15.6%
153
 
13.2%
73
 
6.3%
40
 
3.4%
38
 
3.3%
33
 
2.8%
31
 
2.7%
31
 
2.7%
29
 
2.5%
28
 
2.4%
Other values (117) 523
45.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1160
100.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
181
 
15.6%
153
 
13.2%
73
 
6.3%
40
 
3.4%
38
 
3.3%
33
 
2.8%
31
 
2.7%
31
 
2.7%
29
 
2.5%
28
 
2.4%
Other values (117) 523
45.1%

도시가스 회사명
Categorical

HIGH CORRELATION 

Distinct: 33
Distinct (%): 8.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Value | Count
삼천리 | 42
예스코 | 35
부산도시가스 | 32
코원에너지서비스 | 32
서울도시가스 | 31
Other values (28) | 221

Length

Max length: 9
Median length: 8
Mean length: 5.7379135
Min length: 3

Unique

Unique: 3
Unique (%): 0.8%

Sample

1st row: 참빛영동도시가스
2nd row: 참빛도시가스
3rd row: 참빛영동도시가스
4th row: 참빛영동도시가스
5th row: 참빛도시가스

Common Values

Value | Count | Frequency (%)
삼천리 | 42 | 10.7%
예스코 | 35 | 8.9%
부산도시가스 | 32 | 8.1%
코원에너지서비스 | 32 | 8.1%
서울도시가스 | 31 | 7.9%
대성에너지 | 26 | 6.6%
대륜E&S | 23 | 5.9%
영남에너지서비스 | 17 | 4.3%
충청에너지서비스 | 15 | 3.8%
전북도시가스 | 14 | 3.6%
Other values (23) | 126 | 32.1%

Length

[Figure: Histogram of lengths of the category]

고객센터명
Text

MISSING 

Distinct: 252
Distinct (%): 70.2%
Missing: 34
Missing (%): 8.7%
Memory size: 3.2 KiB

Length

Max length: 11
Median length: 9
Mean length: 4.7075209
Min length: 2

Characters and Unicode

Total characters: 1690
Distinct characters: 148
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 176
Unique (%): 49.0%

Sample

1st row: 강릉지역관리소
2nd row: 동해지역관리소
3rd row: 살림에너지
4th row: 살림에너지
5th row: 살림에너지
ValueCountFrequency (%)
14
 
3.6%
도시가스 9
 
2.3%
살림에너지 5
 
1.3%
2센터 5
 
1.3%
서부고객센터 5
 
1.3%
제3서비스센터 4
 
1.0%
동부고객센터 4
 
1.0%
남부고객센터 4
 
1.0%
서비스센터 4
 
1.0%
중앙 3
 
0.8%
Other values (247) 331
85.3%

Most occurring characters

ValueCountFrequency (%)
108
 
6.4%
108
 
6.4%
108
 
6.4%
79
 
4.7%
77
 
4.6%
77
 
4.6%
73
 
4.3%
72
 
4.3%
68
 
4.0%
1 47
 
2.8%
Other values (138) 873
51.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 1488
88.0%
Decimal Number 152
 
9.0%
Space Separator 29
 
1.7%
Open Punctuation 7
 
0.4%
Close Punctuation 7
 
0.4%
Other Symbol 4
 
0.2%
Other Punctuation 3
 
0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
108
 
7.3%
108
 
7.3%
108
 
7.3%
79
 
5.3%
77
 
5.2%
77
 
5.2%
73
 
4.9%
72
 
4.8%
68
 
4.6%
42
 
2.8%
Other values (123) 676
45.4%
Decimal Number
ValueCountFrequency (%)
1 47
30.9%
2 28
18.4%
3 28
18.4%
4 16
 
10.5%
5 14
 
9.2%
8 6
 
3.9%
6 5
 
3.3%
7 4
 
2.6%
9 2
 
1.3%
0 2
 
1.3%
Space Separator
ValueCountFrequency (%)
29
100.0%
Open Punctuation
ValueCountFrequency (%)
( 7
100.0%
Close Punctuation
ValueCountFrequency (%)
) 7
100.0%
Other Symbol
ValueCountFrequency (%)
4
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1492
88.3%
Common 198
 
11.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
108
 
7.2%
108
 
7.2%
108
 
7.2%
79
 
5.3%
77
 
5.2%
77
 
5.2%
73
 
4.9%
72
 
4.8%
68
 
4.6%
42
 
2.8%
Other values (124) 680
45.6%
Common
ValueCountFrequency (%)
1 47
23.7%
29
14.6%
2 28
14.1%
3 28
14.1%
4 16
 
8.1%
5 14
 
7.1%
( 7
 
3.5%
) 7
 
3.5%
8 6
 
3.0%
6 5
 
2.5%
Other values (4) 11
 
5.6%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1488
88.0%
ASCII 198
 
11.7%
None 4
 
0.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
108
 
7.3%
108
 
7.3%
108
 
7.3%
79
 
5.3%
77
 
5.2%
77
 
5.2%
73
 
4.9%
72
 
4.8%
68
 
4.6%
42
 
2.8%
Other values (123) 676
45.4%
ASCII
ValueCountFrequency (%)
1 47
23.7%
29
14.6%
2 28
14.1%
3 28
14.1%
4 16
 
8.1%
5 14
 
7.1%
( 7
 
3.5%
) 7
 
3.5%
8 6
 
3.0%
6 5
 
2.5%
Other values (4) 11
 
5.6%
None
ValueCountFrequency (%)
4
100.0%

공급지역
Text

Distinct: 385
Distinct (%): 98.2%
Missing: 1
Missing (%): 0.3%
Memory size: 3.2 KiB

Length

Max length: 245
Median length: 114
Mean length: 27.875
Min length: 3

Characters and Unicode

Total characters: 10927
Distinct characters: 327
Distinct categories: 9
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 381
Unique (%): 97.2%

Sample

1st row: 강릉시
2nd row: 고성군 일부
3rd row: 동해시
4th row: 삼척시
5th row: 속초시
ValueCountFrequency (%)
44
 
2.3%
일부 25
 
1.3%
전역 25
 
1.3%
제외 14
 
0.7%
1~2동 11
 
0.6%
1~3동 8
 
0.4%
1~3가 8
 
0.4%
전체 7
 
0.4%
처인구 6
 
0.3%
수원시 5
 
0.3%
Other values (1558) 1754
92.0%

Most occurring characters

ValueCountFrequency (%)
, 1601
 
14.7%
1529
 
14.0%
1522
 
13.9%
1 330
 
3.0%
2 236
 
2.2%
~ 168
 
1.5%
3 138
 
1.3%
134
 
1.2%
114
 
1.0%
112
 
1.0%
Other values (317) 5043
46.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 6335
58.0%
Other Punctuation 1670
 
15.3%
Space Separator 1529
 
14.0%
Decimal Number 1014
 
9.3%
Math Symbol 168
 
1.5%
Close Punctuation 104
 
1.0%
Open Punctuation 104
 
1.0%
Dash Punctuation 2
 
< 0.1%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1522
 
24.0%
134
 
2.1%
114
 
1.8%
112
 
1.8%
105
 
1.7%
101
 
1.6%
88
 
1.4%
87
 
1.4%
86
 
1.4%
83
 
1.3%
Other values (296) 3903
61.6%
Decimal Number
ValueCountFrequency (%)
1 330
32.5%
2 236
23.3%
3 138
13.6%
4 80
 
7.9%
5 49
 
4.8%
6 45
 
4.4%
7 37
 
3.6%
9 36
 
3.6%
0 34
 
3.4%
8 29
 
2.9%
Other Punctuation
ValueCountFrequency (%)
, 1601
95.9%
: 41
 
2.5%
. 13
 
0.8%
/ 12
 
0.7%
· 3
 
0.2%
Space Separator
ValueCountFrequency (%)
1529
100.0%
Math Symbol
ValueCountFrequency (%)
~ 168
100.0%
Close Punctuation
ValueCountFrequency (%)
) 104
100.0%
Open Punctuation
ValueCountFrequency (%)
( 104
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%
Uppercase Letter
ValueCountFrequency (%)
A 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 6324
57.9%
Common 4591
42.0%
Han 11
 
0.1%
Latin 1
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1522
 
24.1%
134
 
2.1%
114
 
1.8%
112
 
1.8%
105
 
1.7%
101
 
1.6%
88
 
1.4%
87
 
1.4%
86
 
1.4%
83
 
1.3%
Other values (295) 3892
61.5%
Common
ValueCountFrequency (%)
, 1601
34.9%
1529
33.3%
1 330
 
7.2%
2 236
 
5.1%
~ 168
 
3.7%
3 138
 
3.0%
) 104
 
2.3%
( 104
 
2.3%
4 80
 
1.7%
5 49
 
1.1%
Other values (10) 252
 
5.5%
Han
ValueCountFrequency (%)
11
100.0%
Latin
ValueCountFrequency (%)
A 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 6323
57.9%
ASCII 4589
42.0%
CJK 11
 
0.1%
None 3
 
< 0.1%
Compat Jamo 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
, 1601
34.9%
1529
33.3%
1 330
 
7.2%
2 236
 
5.1%
~ 168
 
3.7%
3 138
 
3.0%
) 104
 
2.3%
( 104
 
2.3%
4 80
 
1.7%
5 49
 
1.1%
Other values (10) 250
 
5.4%
Hangul
ValueCountFrequency (%)
1522
 
24.1%
134
 
2.1%
114
 
1.8%
112
 
1.8%
105
 
1.7%
101
 
1.6%
88
 
1.4%
87
 
1.4%
86
 
1.4%
83
 
1.3%
Other values (294) 3891
61.5%
CJK
ValueCountFrequency (%)
11
100.0%
None
ValueCountFrequency (%)
· 3
100.0%
Compat Jamo
ValueCountFrequency (%)
1
100.0%

연락처
Categorical

HIGH CORRELATION 

Distinct: 33
Distinct (%): 8.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Value | Count
1544-3002 | 42
1544-3131 | 35
1599-3366 | 32
1544-0009 | 32
1588-5788 | 31
Other values (28) | 221

Length

Max length: 12
Median length: 9
Mean length: 9.3358779
Min length: 9

Unique

Unique: 6
Unique (%): 1.5%

Sample

1st row: 1899-9100
2nd row: 1899-9100
3rd row: 1899-9100
4th row: 1899-9100
5th row: 1899-9100

Common Values

Value | Count | Frequency (%)
1544-3002 | 42 | 10.7%
1544-3131 | 35 | 8.9%
1599-3366 | 32 | 8.1%
1544-0009 | 32 | 8.1%
1588-5788 | 31 | 7.9%
1577-1190 | 26 | 6.6%
1566-6116 | 23 | 5.9%
1599-0009 | 17 | 4.3%
1599-3131 | 15 | 3.8%
1666-0009 | 14 | 3.6%
Other values (23) | 126 | 32.1%

Length

[Figure: Histogram of lengths of the category]

연락처2
Text

Distinct: 219
Distinct (%): 55.7%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Length

Max length: 13
Median length: 12
Mean length: 10.888041
Min length: 9

Characters and Unicode

Total characters: 4279
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 145
Unique (%): 36.9%

Sample

1st row: 1899-9100
2nd row: 1899-9100
3rd row: 1899-9100
4th row: 1899-9100
5th row: 1899-9100

Value | Count | Frequency (%)
1544-3131 | 35 | 8.9%
055-260-4000 | 12 | 3.1%
1899-9100 | 8 | 2.0%
1670-4700 | 8 | 2.0%
1544-1115 | 7 | 1.8%
1577-6580 | 6 | 1.5%
1577-8181 | 6 | 1.5%
1670-0799 | 6 | 1.5%
1599-2232 | 6 | 1.5%
1577-1169 | 5 | 1.3%
Other values (209) | 294 | 74.8%

Most occurring characters

ValueCountFrequency (%)
0 704
16.5%
- 650
15.2%
1 559
13.1%
3 468
10.9%
4 356
8.3%
5 340
7.9%
6 286
6.7%
2 284
6.6%
7 228
 
5.3%
8 225
 
5.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3629
84.8%
Dash Punctuation 650
 
15.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 704
19.4%
1 559
15.4%
3 468
12.9%
4 356
9.8%
5 340
9.4%
6 286
7.9%
2 284
7.8%
7 228
 
6.3%
8 225
 
6.2%
9 179
 
4.9%
Dash Punctuation
ValueCountFrequency (%)
- 650
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4279
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 704
16.5%
- 650
15.2%
1 559
13.1%
3 468
10.9%
4 356
8.3%
5 340
7.9%
6 286
6.7%
2 284
6.6%
7 228
 
5.3%
8 225
 
5.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4279
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 704
16.5%
- 650
15.2%
1 559
13.1%
3 468
10.9%
4 356
8.3%
5 340
7.9%
6 286
6.7%
2 284
6.6%
7 228
 
5.3%
8 225
 
5.3%

Correlations

Variable | 구분 | 도시가스 회사명 | 연락처
구분 | 1.000 | 0.989 | 0.988
도시가스 회사명 | 0.989 | 1.000 | 1.000
연락처 | 0.988 | 1.000 | 1.000

Variable | 구분 | 연락처 | 도시가스 회사명
구분 | 1.000 | 0.845 | 0.850
연락처 | 0.845 | 1.000 | 0.948
도시가스 회사명 | 0.850 | 0.948 | 1.000

Variable | 구분 | 도시가스 회사명 | 연락처
구분 | 1.000 | 0.850 | 0.845
도시가스 회사명 | 0.850 | 1.000 | 0.948
연락처 | 0.845 | 0.948 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
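
One way to read that description is as a Pearson correlation between presence/absence indicators of two columns. The sketch below works under that assumption and is not taken from the report's software; the function name `nullity_correlation` and the toy columns are hypothetical, and missing values are represented as `None`.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns.

    Hypothetical helper for illustration. Each value becomes 1 if present
    and 0 if missing (None). Returns None when either column is entirely
    present or entirely missing, since the correlation is then undefined.
    """
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)


# Toy columns, not taken from the dataset.
center = ["A", None, "B", None, "C", "D"]
phone = ["1", None, "2", None, "3", "4"]   # missing in the same rows as `center`
area = ["x", "y", None, "z", "w", "v"]     # missing in a different row
full = ["p", "q", "r", "s", "t", "u"]      # never missing

same_pattern = nullity_correlation(center, phone)
different_pattern = nullity_correlation(center, area)
undefined = nullity_correlation(center, full)
```

A value near 1 means the two columns tend to be missing in the same rows, a negative value means one tends to be present where the other is missing, and a column with no missing values has no defined nullity correlation.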

Sample

First rows

Index | 구분 | 지역구 | 도시가스 회사명 | 고객센터명 | 공급지역 | 연락처 | 연락처2
0 | 강원도 | 강릉시 | 참빛영동도시가스 | 강릉지역관리소 | 강릉시 | 1899-9100 | 1899-9100
1 | 강원도 | 고성군 | 참빛도시가스 | <NA> | 고성군 일부 | 1899-9100 | 1899-9100
2 | 강원도 | 동해시 | 참빛영동도시가스 | 동해지역관리소 | 동해시 | 1899-9100 | 1899-9100
3 | 강원도 | 삼척시 | 참빛영동도시가스 | <NA> | 삼척시 | 1899-9100 | 1899-9100
4 | 강원도 | 속초시 | 참빛도시가스 | <NA> | 속초시 | 1899-9100 | 1899-9100
5 | 강원도 | 영월군 | 강원도시가스 | 살림에너지 | 영월군 | 1599-2232 | 1599-2232
6 | 강원도 | 원주시 | 참빛원주도시가스 | <NA> | 원주시 | 1899-9100 | 1899-9100
7 | 강원도 | 태백시 | 강원도시가스 | 살림에너지 | 태백시 | 1599-2232 | 1599-2232
8 | 강원도 | 춘천시 | 강원도시가스 | 살림에너지 | 춘천시(남산면, 동면, 동내면, 동산면, 석사동, 후평1~3동) | 1599-2232 | 1599-2232
9 | 강원도 | 정선군 | 강원도시가스 | 살림에너지 | 정선군 | 1599-2232 | 1599-2232

Last rows

Index | 구분 | 지역구 | 도시가스 회사명 | 고객센터명 | 공급지역 | 연락처 | 연락처2
383 | 충청북도 | 보은군 | 충청에너지서비스 | 제3서비스센터 | 보은군 | 1599-3131 | 043-291-9421
384 | 충청북도 | 영동군 | 충청에너지서비스 | 제3서비스센터 | 영동군 | 1599-3131 | 043-291-9421
385 | 충청북도 | 옥천군 | 충청에너지서비스 | 제3서비스센터 | 옥천군 | 1599-3131 | 043-291-9421
386 | 충청북도 | 청주시 | 충청에너지서비스 | 제3서비스센터 | 금천동, 용암동, 방서동, 운동동, 평촌동, 지북동, 대성동, 문화동, 용담동, 탑동,서문동, 남주동, 북문로1가, 용정동, 남문로1가, 남문로2가, 서운동, 석교동 | 1599-3131 | 043-291-9421
387 | 충청북도 | 단양군 | 충청에너지서비스 | 북부서비스센터 | 단양군 | 1599-3131 | 043-645-8182
388 | 충청북도 | 제천시 | 충청에너지서비스 | 북부서비스센터 | 제천시 | 1599-3131 | 043-645-8182
389 | 충청북도 | 괴산군 | 충청에너지서비스 | 서부서비스센터 | 괴산군 | 1599-3131 | 043-883-0334
390 | 충청북도 | 음성군 | 충청에너지서비스 | 서부서비스센터 | 음성군 | 1599-3131 | 043-883-0334
391 | 충청북도 | 진천군 | 충청에너지서비스 | 서부서비스센터 | 진천군 | 1599-3131 | 043-883-0334
392 | 충청북도 | 충주시 | 참빛충북도시가스 | <NA> | 충주시 | 1899-9100 | 1899-9100