Overview

Dataset statistics

Number of variables: 7
Number of observations: 151
Missing cells: 543
Missing cells (%): 51.4%
Duplicate rows: 2
Duplicate rows (%): 1.3%
Total size in memory: 8.4 KiB
Average record size in memory: 56.9 B
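
The two percentages above appear consistent with simple ratios of the reported counts. A minimal Python sketch of that arithmetic follows; the `percent` helper and the variable names are illustrative assumptions and are not taken from the profiling tool.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Counts copied from the dataset statistics in this report.
n_variables = 7
n_observations = 151
missing_cells = 543
duplicate_rows = 2

total_cells = n_variables * n_observations            # every cell in the table
missing_pct = percent(missing_cells, total_cells)     # share of empty cells
duplicate_pct = percent(duplicate_rows, n_observations)

print(missing_pct, duplicate_pct)
```

Under these assumptions the script prints 51.4 and 1.3, which agrees with the reported figures.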

Variable types

Text: 6
DateTime: 1

Dataset

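Description: In connection with the safe and stable supply of natural gas, the core business of Korea Gas Corporation (한국가스공사), this dataset provides private companies and gas consumers with the first gas injection status for each section of the main natural gas pipeline, with the aim of improving public convenience and understanding.
URL: https://www.data.go.kr/data/15102902/fileData.do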

Alerts

구간6 has constant value "" [Constant]
Dataset has 2 (1.3%) duplicate rows [Duplicates]
구간3 has 105 (69.5%) missing values [Missing]
구간4 has 141 (93.4%) missing values [Missing]
구간5 has 146 (96.7%) missing values [Missing]
구간6 has 150 (99.3%) missing values [Missing]

Reproduction

Analysis started: 2023-12-12 21:44:43.012423
Analysis finished: 2023-12-12 21:44:43.723683
Duration: 0.71 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

구간1
Text

Distinct: 122
Distinct (%): 80.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 7
Median length: 2
Mean length: 2.1854305
Min length: 2
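
The mean length appears to equal the total character count of a column divided by its number of non-missing values (330 characters over 151 values here). A small sketch assuming that definition; `mean_length` is a hypothetical helper and the inputs are figures copied from this report.

```python
def mean_length(total_characters: int, n_observations: int, n_missing: int) -> float:
    """Average string length over the non-missing values of a column."""
    return total_characters / (n_observations - n_missing)

# (total characters, missing values) as reported for three of the text columns.
reported = {"구간1": (330, 0), "구간2": (346, 1), "구간3": (94, 105)}

means = {
    name: round(mean_length(chars, 151, missing), 7)
    for name, (chars, missing) in reported.items()
}
print(means)
```

The rounded results (2.1854305, 2.3066667, 2.0434783) agree with the mean lengths listed for these columns.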

Characters and Unicode

Total characters: 330
Distinct characters: 128
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
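
As an illustration of these properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories over a few strings; the value `"L/T"` is a hypothetical example, and the name mapping covers only the three categories that occur here. Script and block lookups are not provided by `unicodedata`, so they are left out.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes used in this example only.
CATEGORY_NAMES = {"Lo": "Other Letter", "Lu": "Uppercase Letter", "Po": "Other Punctuation"}

def category_counts(values):
    """Count Unicode general categories over every character of the given strings."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)

values = ["기지", "안중", "L/T"]  # "L/T" is a hypothetical value, not taken from the data
counts = category_counts(values)
named = {CATEGORY_NAMES[code]: n for code, n in counts.items()}
print(named)
```

Hangul syllables fall under "Other Letter", which is why that category dominates the tables in this report.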

Unique

Unique: 97
Unique (%): 64.2%

Sample

1st row: 기지
2nd row: 안중
3rd row: 목감
4th row: 율도
5th row: 가좌

Value    Count    Frequency (%)
반월    6    4.0%
남양주    2    1.3%
이리    2    1.3%
승주    2    1.3%
신북    2    1.3%
안강    2    1.3%
홍성    2    1.3%
영종    2    1.3%
통영    2    1.3%
동김천    2    1.3%
Other values (112)    127    84.1%

Most occurring characters

Value    Count    Frequency (%)
(not captured)    13    3.9%
(not captured)    13    3.9%
(not captured)    12    3.6%
(not captured)    10    3.0%
(not captured)    8    2.4%
(not captured)    8    2.4%
(not captured)    8    2.4%
(not captured)    7    2.1%
(not captured)    6    1.8%
(not captured)    6    1.8%
Other values (118)    239    72.4%

Most occurring categories

Value    Count    Frequency (%)
Other Letter    320    97.0%
Uppercase Letter    8    2.4%
Other Punctuation    2    0.6%

Most frequent character per category

Other Letter
Value    Count    Frequency (%)
(not captured)    13    4.1%
(not captured)    13    4.1%
(not captured)    12    3.8%
(not captured)    10    3.1%
(not captured)    8    2.5%
(not captured)    8    2.5%
(not captured)    8    2.5%
(not captured)    7    2.2%
(not captured)    6    1.9%
(not captured)    6    1.9%
Other values (113)    229    71.6%

Uppercase Letter
Value    Count    Frequency (%)
L    3    37.5%
T    3    37.5%
B    1    12.5%
V    1    12.5%

Other Punctuation
Value    Count    Frequency (%)
/    2    100.0%

Most occurring scripts

Value    Count    Frequency (%)
Hangul    320    97.0%
Latin    8    2.4%
Common    2    0.6%

Most frequent character per script

Hangul
Value    Count    Frequency (%)
(not captured)    13    4.1%
(not captured)    13    4.1%
(not captured)    12    3.8%
(not captured)    10    3.1%
(not captured)    8    2.5%
(not captured)    8    2.5%
(not captured)    8    2.5%
(not captured)    7    2.2%
(not captured)    6    1.9%
(not captured)    6    1.9%
Other values (113)    229    71.6%

Latin
Value    Count    Frequency (%)
L    3    37.5%
T    3    37.5%
B    1    12.5%
V    1    12.5%

Common
Value    Count    Frequency (%)
/    2    100.0%

Most occurring blocks

Value    Count    Frequency (%)
Hangul    320    97.0%
ASCII    10    3.0%

Most frequent character per block

Hangul
Value    Count    Frequency (%)
(not captured)    13    4.1%
(not captured)    13    4.1%
(not captured)    12    3.8%
(not captured)    10    3.1%
(not captured)    8    2.5%
(not captured)    8    2.5%
(not captured)    8    2.5%
(not captured)    7    2.2%
(not captured)    6    1.9%
(not captured)    6    1.9%
Other values (113)    229    71.6%

ASCII
Value    Count    Frequency (%)
L    3    30.0%
T    3    30.0%
/    2    20.0%
B    1    10.0%
V    1    10.0%

구간2
Text

Distinct: 137
Distinct (%): 91.3%
Missing: 1
Missing (%): 0.7%
Memory size: 1.3 KiB

Length

Max length: 11
Median length: 2
Mean length: 2.3066667
Min length: 2

Characters and Unicode

Total characters: 346
Distinct characters: 148
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 124
Unique (%): 82.7%

Sample

1st row: 안중
2nd row: 목감
3rd row: 율도
4th row: 일도
5th row: 고척

Value    Count    Frequency (%)
진월    2    1.3%
청주    2    1.3%
상계    2    1.3%
군산    2    1.3%
부여    2    1.3%
용인    2    1.3%
가좌    2    1.3%
의정부    2    1.3%
합정    2    1.3%
경서    2    1.3%
Other values (127)    130    86.7%

Most occurring characters

Value    Count    Frequency (%)
(not captured)    20    5.8%
(not captured)    8    2.3%
(not captured)    7    2.0%
(not captured)    7    2.0%
(not captured)    7    2.0%
(not captured)    6    1.7%
(not captured)    6    1.7%
(not captured)    6    1.7%
(not captured)    6    1.7%
(not captured)    6    1.7%
Other values (138)    267    77.2%

Most occurring categories

Value    Count    Frequency (%)
Other Letter    323    93.4%
Lowercase Letter    8    2.3%
Uppercase Letter    8    2.3%
Other Punctuation    3    0.9%
Dash Punctuation    2    0.6%
Open Punctuation    1    0.3%
Close Punctuation    1    0.3%

Most frequent character per category

Other Letter
Value    Count    Frequency (%)
(not captured)    20    6.2%
(not captured)    8    2.5%
(not captured)    7    2.2%
(not captured)    7    2.2%
(not captured)    7    2.2%
(not captured)    6    1.9%
(not captured)    6    1.9%
(not captured)    6    1.9%
(not captured)    6    1.9%
(not captured)    6    1.9%
Other values (127)    244    75.5%

Uppercase Letter
Value    Count    Frequency (%)
T    3    37.5%
B    2    25.0%
V    2    25.0%
L    1    12.5%

Lowercase Letter
Value    Count    Frequency (%)
i    4    50.0%
e    2    25.0%
n    2    25.0%

Other Punctuation
Value    Count    Frequency (%)
/    3    100.0%

Dash Punctuation
Value    Count    Frequency (%)
-    2    100.0%

Open Punctuation
Value    Count    Frequency (%)
(    1    100.0%

Close Punctuation
Value    Count    Frequency (%)
)    1    100.0%

Most occurring scripts

Value    Count    Frequency (%)
Hangul    323    93.4%
Latin    16    4.6%
Common    7    2.0%

Most frequent character per script

Hangul
Value    Count    Frequency (%)
(not captured)    20    6.2%
(not captured)    8    2.5%
(not captured)    7    2.2%
(not captured)    7    2.2%
(not captured)    7    2.2%
(not captured)    6    1.9%
(not captured)    6    1.9%
(not captured)    6    1.9%
(not captured)    6    1.9%
(not captured)    6    1.9%
Other values (127)    244    75.5%

Latin
Value    Count    Frequency (%)
i    4    25.0%
T    3    18.8%
B    2    12.5%
V    2    12.5%
e    2    12.5%
n    2    12.5%
L    1    6.2%

Common
Value    Count    Frequency (%)
/    3    42.9%
-    2    28.6%
(    1    14.3%
)    1    14.3%

Most occurring blocks

Value    Count    Frequency (%)
Hangul    323    93.4%
ASCII    23    6.6%

Most frequent character per block

Hangul
Value    Count    Frequency (%)
(not captured)    20    6.2%
(not captured)    8    2.5%
(not captured)    7    2.2%
(not captured)    7    2.2%
(not captured)    7    2.2%
(not captured)    6    1.9%
(not captured)    6    1.9%
(not captured)    6    1.9%
(not captured)    6    1.9%
(not captured)    6    1.9%
Other values (127)    244    75.5%

ASCII
Value    Count    Frequency (%)
i    4    17.4%
T    3    13.0%
/    3    13.0%
B    2    8.7%
V    2    8.7%
e    2    8.7%
-    2    8.7%
n    2    8.7%
L    1    4.3%
(    1    4.3%

구간3
Text

MISSING 

Distinct: 43
Distinct (%): 93.5%
Missing: 105
Missing (%): 69.5%
Memory size: 1.3 KiB

Length

Max length: 3
Median length: 2
Mean length: 2.0434783
Min length: 2

Characters and Unicode

Total characters: 94
Distinct characters: 61
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 41
Unique (%): 89.1%

Sample

1st row: 일산
2nd row: 분당
3rd row: 진장
4th row: 창원
5th row: 하동

Value    Count    Frequency (%)
하동    3    6.5%
가채    2    4.3%
작동    1    2.2%
신원    1    2.2%
해남    1    2.2%
율천    1    2.2%
미사리    1    2.2%
금남    1    2.2%
옥정    1    2.2%
동내    1    2.2%
Other values (33)    33    71.7%

Most occurring characters

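Value    Count    Frequency (%)
(not captured)    9    9.6%
(not captured)    5    5.3%
(not captured)    4    4.3%
(not captured)    4    4.3%
(not captured)    4    4.3%
(not captured)    3    3.2%
(not captured)    2    2.1%
(not captured)    2    2.1%
(not captured)    2    2.1%
(not captured)    2    2.1%
Other values (51)    57    60.6%

Most occurring categories

Value    Count    Frequency (%)
Other Letter    94    100.0%

Most frequent character per category

Other Letter
Value    Count    Frequency (%)
(not captured)    9    9.6%
(not captured)    5    5.3%
(not captured)    4    4.3%
(not captured)    4    4.3%
(not captured)    4    4.3%
(not captured)    3    3.2%
(not captured)    2    2.1%
(not captured)    2    2.1%
(not captured)    2    2.1%
(not captured)    2    2.1%
Other values (51)    57    60.6%

Most occurring scripts

Value    Count    Frequency (%)
Hangul    94    100.0%

Most frequent character per script

Hangul
Value    Count    Frequency (%)
(not captured)    9    9.6%
(not captured)    5    5.3%
(not captured)    4    4.3%
(not captured)    4    4.3%
(not captured)    4    4.3%
(not captured)    3    3.2%
(not captured)    2    2.1%
(not captured)    2    2.1%
(not captured)    2    2.1%
(not captured)    2    2.1%
Other values (51)    57    60.6%

Most occurring blocks

Value    Count    Frequency (%)
Hangul    94    100.0%

Most frequent character per block

Hangul
Value    Count    Frequency (%)
(not captured)    9    9.6%
(not captured)    5    5.3%
(not captured)    4    4.3%
(not captured)    4    4.3%
(not captured)    4    4.3%
(not captured)    3    3.2%
(not captured)    2    2.1%
(not captured)    2    2.1%
(not captured)    2    2.1%
(not captured)    2    2.1%
Other values (51)    57    60.6%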

구간4
Text

MISSING 

Distinct: 10
Distinct (%): 100.0%
Missing: 141
Missing (%): 93.4%
Memory size: 1.3 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 20
Distinct characters: 18
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10
Unique (%): 100.0%

Sample

1st row: 광교
2nd row: 안동
3rd row: 거창
4th row: 신북
5th row: 봉암

Value    Count    Frequency (%)
광교    1    10.0%
안동    1    10.0%
거창    1    10.0%
신북    1    10.0%
봉암    1    10.0%
오량    1    10.0%
연풍    1    10.0%
은현    1    10.0%
덕현    1    10.0%
청량    1    10.0%

Most occurring characters

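Value    Count    Frequency (%)
(not captured)    2    10.0%
(not captured)    2    10.0%
(not captured)    1    5.0%
(not captured)    1    5.0%
(not captured)    1    5.0%
(not captured)    1    5.0%
(not captured)    1    5.0%
(not captured)    1    5.0%
(not captured)    1    5.0%
(not captured)    1    5.0%
Other values (8)    8    40.0%

Most occurring categories

Value    Count    Frequency (%)
Other Letter    20    100.0%

Most frequent character per category

Other Letter
Value    Count    Frequency (%)
(not captured)    2    10.0%
(not captured)    2    10.0%
(not captured)    1    5.0%
(not captured)    1    5.0%
(not captured)    1    5.0%
(not captured)    1    5.0%
(not captured)    1    5.0%
(not captured)    1    5.0%
(not captured)    1    5.0%
(not captured)    1    5.0%
Other values (8)    8    40.0%

Most occurring scripts

Value    Count    Frequency (%)
Hangul    20    100.0%

Most frequent character per script

Hangul
Value    Count    Frequency (%)
(not captured)    2    10.0%
(not captured)    2    10.0%
(not captured)    1    5.0%
(not captured)    1    5.0%
(not captured)    1    5.0%
(not captured)    1    5.0%
(not captured)    1    5.0%
(not captured)    1    5.0%
(not captured)    1    5.0%
(not captured)    1    5.0%
Other values (8)    8    40.0%

Most occurring blocks

Value    Count    Frequency (%)
Hangul    20    100.0%

Most frequent character per block

Hangul
Value    Count    Frequency (%)
(not captured)    2    10.0%
(not captured)    2    10.0%
(not captured)    1    5.0%
(not captured)    1    5.0%
(not captured)    1    5.0%
(not captured)    1    5.0%
(not captured)    1    5.0%
(not captured)    1    5.0%
(not captured)    1    5.0%
(not captured)    1    5.0%
Other values (8)    8    40.0%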

구간5
Text

MISSING 

Distinct: 4
Distinct (%): 80.0%
Missing: 146
Missing (%): 96.7%
Memory size: 1.3 KiB

Length

Max length: 3
Median length: 2
Mean length: 2.4
Min length: 2

Characters and Unicode

Total characters: 12
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 60.0%

Sample

1st row: 포천
2nd row: 동두천
3rd row: 성포
4th row: 동두천
5th row: 상북

Value    Count    Frequency (%)
동두천    2    40.0%
포천    1    20.0%
성포    1    20.0%
상북    1    20.0%
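
Distinct counts the different values in a column, while Unique counts the values that occur exactly once. A minimal sketch using the five non-missing 구간5 values listed above; the variable names are illustrative and the percentages are assumed to be taken over the non-missing values.

```python
from collections import Counter

# The five non-missing 구간5 values, as shown in the sample rows of this report.
values = ["포천", "동두천", "성포", "동두천", "상북"]

counts = Counter(values)
distinct = len(counts)                                # different values
unique = sum(1 for n in counts.values() if n == 1)    # values seen exactly once

distinct_pct = round(100 * distinct / len(values), 1)
unique_pct = round(100 * unique / len(values), 1)
print(distinct, distinct_pct, unique, unique_pct)
```

This gives 4 distinct values (80.0%) and 3 unique values (60.0%), in line with the figures reported for this column.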

Most occurring characters

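Value    Count    Frequency (%)
(not captured)    3    25.0%
(not captured)    2    16.7%
(not captured)    2    16.7%
(not captured)    2    16.7%
(not captured)    1    8.3%
(not captured)    1    8.3%
(not captured)    1    8.3%

Most occurring categories

Value    Count    Frequency (%)
Other Letter    12    100.0%

Most frequent character per category

Other Letter
Value    Count    Frequency (%)
(not captured)    3    25.0%
(not captured)    2    16.7%
(not captured)    2    16.7%
(not captured)    2    16.7%
(not captured)    1    8.3%
(not captured)    1    8.3%
(not captured)    1    8.3%

Most occurring scripts

Value    Count    Frequency (%)
Hangul    12    100.0%

Most frequent character per script

Hangul
Value    Count    Frequency (%)
(not captured)    3    25.0%
(not captured)    2    16.7%
(not captured)    2    16.7%
(not captured)    2    16.7%
(not captured)    1    8.3%
(not captured)    1    8.3%
(not captured)    1    8.3%

Most occurring blocks

Value    Count    Frequency (%)
Hangul    12    100.0%

Most frequent character per block

Hangul
Value    Count    Frequency (%)
(not captured)    3    25.0%
(not captured)    2    16.7%
(not captured)    2    16.7%
(not captured)    2    16.7%
(not captured)    1    8.3%
(not captured)    1    8.3%
(not captured)    1    8.3%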

구간6
Text

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 100.0%
Missing: 150
Missing (%): 99.3%
Memory size: 1.3 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 2
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 거제

Value    Count    Frequency (%)
거제    1    100.0%

Most occurring characters

Value    Count    Frequency (%)
(not captured)    1    50.0%
(not captured)    1    50.0%

Most occurring categories

Value    Count    Frequency (%)
Other Letter    2    100.0%

Most frequent character per category

Other Letter
Value    Count    Frequency (%)
(not captured)    1    50.0%
(not captured)    1    50.0%

Most occurring scripts

Value    Count    Frequency (%)
Hangul    2    100.0%

Most frequent character per script

Hangul
Value    Count    Frequency (%)
(not captured)    1    50.0%
(not captured)    1    50.0%

Most occurring blocks

Value    Count    Frequency (%)
Hangul    2    100.0%

Most frequent character per block

Hangul
Value    Count    Frequency (%)
(not captured)    1    50.0%
(not captured)    1    50.0%

일자
Date

Distinct: 141
Distinct (%): 93.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB
Minimum: 1986-12-01 00:00:00
Maximum: 2023-06-29 00:00:00
Histogram with fixed size bins (bins=50)
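
Fixed-size binning divides the range between the minimum and the maximum into equal-width intervals and counts the values that fall into each. The sketch below applies this to a handful of dates from the sample rows with 50 bins; `fixed_bin_counts` is a hypothetical helper and is not necessarily how the report's plotting backend computes its bins.

```python
from datetime import date

def fixed_bin_counts(dates, n_bins=50):
    """Count dates in n_bins equal-width bins spanning min(dates)..max(dates)."""
    lo, hi = min(dates), max(dates)
    span_days = (hi - lo).days
    counts = [0] * n_bins
    for d in dates:
        if span_days == 0:
            index = 0  # all dates identical: everything lands in the first bin
        else:
            # Scale the offset into [0, n_bins]; the maximum is clamped into the last bin.
            index = min((d - lo).days * n_bins // span_days, n_bins - 1)
        counts[index] += 1
    return counts

# A few dates copied from the sample rows, including the reported minimum and maximum.
dates = [
    date(1986, 12, 1), date(1986, 12, 2), date(1986, 12, 3),
    date(1987, 1, 30), date(1991, 12, 24), date(2023, 6, 29),
]
counts = fixed_bin_counts(dates)
print(counts[0], counts[-1], sum(counts))
```

With roughly 36.5 years between the minimum and maximum, each of the 50 bins covers a little under nine months.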

Correlations

         구간3     구간4     구간5
구간3    1.000    1.000    1.000
구간4    1.000    1.000    1.000
구간5    1.000    1.000    1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
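
One way to read this is as a Pearson correlation between per-column presence indicators (1 where a value is present, 0 where it is missing). A minimal sketch under that assumption, using made-up columns; `nullity_correlation` is a hypothetical helper and is not taken from the tool that produced the heatmap.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns.

    Returns None when either column is always present or always missing,
    because the correlation is undefined without variation.
    """
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)

# Hypothetical columns: None marks a missing value.
col_x = ["a", None, "b", None]
col_y = ["c", None, "d", None]   # missing in exactly the same rows as col_x
col_z = [None, "e", None, "f"]   # present exactly where col_x is missing
full = ["g", "h", "i", "j"]      # never missing

same = nullity_correlation(col_x, col_y)
opposite = nullity_correlation(col_x, col_z)
undefined = nullity_correlation(col_x, full)
print(same, opposite, undefined)
```

A value near 1 means two columns tend to be filled in together, a value near -1 means one tends to be present when the other is missing, and fully populated columns carry no nullity signal.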

Sample

Index    구간1    구간2    구간3    구간4    구간5    구간6    일자
0    기지    안중    <NA>    <NA>    <NA>    <NA>    1986-12-01
1    안중    목감    <NA>    <NA>    <NA>    <NA>    1986-12-02
2    목감    율도    <NA>    <NA>    <NA>    <NA>    1986-12-03
3    율도    일도    <NA>    <NA>    <NA>    <NA>    1991-12-24
4    가좌    고척    <NA>    <NA>    <NA>    <NA>    1987-01-30
5    고척    독산    <NA>    <NA>    <NA>    <NA>    1987-02-05
6    목감    대치    <NA>    <NA>    <NA>    <NA>    1987-02-14
7    대치    군자    <NA>    <NA>    <NA>    <NA>    1987-03-27
8    독산    목동    <NA>    <NA>    <NA>    <NA>    1987-04-11
9    반월    수원    <NA>    <NA>    <NA>    <NA>    1987-11-10

Index    구간1    구간2    구간3    구간4    구간5    구간6    일자
141    전동    청주    <NA>    <NA>    <NA>    <NA>    2019-11-29
142    홍성    청양    <NA>    <NA>    <NA>    <NA>    2021-07-27
143    법원    광탄    <NA>    <NA>    <NA>    <NA>    2021-11-17
144    고령    합천    <NA>    <NA>    <NA>    <NA>    2021-11-30
145    함양    산청    <NA>    <NA>    <NA>    <NA>    2022-03-02
146    여수    여천    <NA>    <NA>    <NA>    <NA>    2022-02-11
147    청양    은산    <NA>    <NA>    <NA>    <NA>    2022-06-30
148    은산    부여    <NA>    <NA>    <NA>    <NA>    2022-09-06
149    홍성    홍북    <NA>    <NA>    <NA>    <NA>    2023-01-19
150    동내    경산    <NA>    <NA>    <NA>    <NA>    2023-06-29

Duplicate rows

Most frequently occurring

Index    구간1    구간2    구간3    구간4    구간5    구간6    일자    # duplicates
0    반월    수원    <NA>    <NA>    <NA>    <NA>    1987-11-10    2
1    승주    진월    하동    <NA>    <NA>    <NA>    1999-11-27    2
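
Duplicate rows of this kind can be found by counting identical records. The sketch below does so on a small illustrative subset built from rows shown in this report; `duplicate_rows` is a hypothetical helper, and `None` stands in for `<NA>`.

```python
from collections import Counter

def duplicate_rows(rows):
    """Return each row that occurs more than once, with its number of occurrences."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

# Illustrative subset: (구간1, 구간2, 구간3, 구간4, 구간5, 구간6, 일자)
rows = [
    ("반월", "수원", None, None, None, None, "1987-11-10"),
    ("대치", "군자", None, None, None, None, "1987-03-27"),
    ("반월", "수원", None, None, None, None, "1987-11-10"),
    ("승주", "진월", "하동", None, None, None, "1999-11-27"),
    ("승주", "진월", "하동", None, None, None, "1999-11-27"),
]

dups = duplicate_rows(rows)
for row, n in dups.items():
    print(row, n)
```

Tuples are used because they are hashable, so whole records can serve as counter keys; rows are treated as duplicates only when every field, including the date, matches.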