Overview

Dataset statistics

Number of variables: 10
Number of observations: 5121
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 5
Duplicate rows (%): 0.1%
Total size in memory: 415.2 KiB
Average record size in memory: 83.0 B
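
The derived figures in this block follow from the raw counts. As a quick sanity check, the sketch below recomputes the duplicate share and the average record size from the numbers printed above. The variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" block.
n_rows = 5121
n_duplicate_rows = 5
total_size_kib = 415.2

# Share of duplicate rows, in percent.
duplicate_pct = 100 * n_duplicate_rows / n_rows

# Average record size in bytes (1 KiB = 1024 bytes).
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"{duplicate_pct:.1f}%")      # 0.1%
print(f"{avg_record_bytes:.1f} B")  # 83.0 B
```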

Variable types

Categorical: 4
Text: 3
Numeric: 3

Dataset

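Description: Fee information for the gas facility inspections carried out by the Korea Gas Safety Corporation, broken down by inspection type, governing law, capacity, and other factors. The data is intended to help the public, especially people working in the gas industry.
Author: 한국가스안전공사 (Korea Gas Safety Corporation)
URL: https://www.data.go.kr/data/15067809/fileData.do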

Alerts

Dataset has 5 (0.1%) duplicate rows (Duplicates)
검사종류 is highly overall correlated with 수수료 and 2 other fields (High correlation)
업무구분 is highly overall correlated with 검사종류 and 1 other fields (High correlation)
용량(FROM) is highly overall correlated with 용량(TO) (High correlation)
용량(TO) is highly overall correlated with 용량(FROM) (High correlation)
수수료 is highly overall correlated with 검사종류 (High correlation)
법구분 is highly overall correlated with 업무구분 and 1 other fields (High correlation)
업무구분 is highly imbalanced (58.4%) (Imbalance)
계산방식 is highly imbalanced (79.6%) (Imbalance)
용량(FROM) has 1767 (34.5%) zeros (Zeros)
용량(TO) has 1181 (23.1%) zeros (Zeros)

Reproduction

Analysis started: 2023-12-12 09:47:40.056357
Analysis finished: 2023-12-12 09:47:43.010369
Duration: 2.95 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
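
The reported duration is the difference between the two timestamps. A minimal check using only the Python standard library, with the timestamps copied verbatim from this block:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2023-12-12 09:47:40.056357")
finished = datetime.fromisoformat("2023-12-12 09:47:43.010369")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")  # 2.95 seconds
```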

Variables

업무구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct13
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size40.1 KiB
제품검사: 3634
시설검사: 612
심사: 295
교육: 273
검사용역: 135
Other values (8): 172

Length

Max length13
Median length4
Mean length3.7873462
Min length2

Unique

Unique1
Unique (%)< 0.1%

Sample

1st row검사용역
2nd row검사용역
3rd row검사용역
4th row검사용역
5th row검사용역

Common Values

Value | Count | Frequency (%)
제품검사 | 3634 | 71.0%
시설검사 | 612 | 12.0%
심사 | 295 | 5.8%
교육 | 273 | 5.3%
검사용역 | 135 | 2.6%
시험인증 | 110 | 2.1%
교재비등 | 33 | 0.6%
기술용역 | 11 | 0.2%
민원검사서비스 | 6 | 0.1%
인증심사 | 5 | 0.1%
Other values (3) | 7 | 0.1%

Length

Histogram of lengths of the category

검사종류
Categorical

HIGH CORRELATION 

Distinct49
Distinct (%)1.0%
Missing0
Missing (%)0.0%
Memory size40.1 KiB
제품확인(상시샘플)검사: 694
생산공정(필증발급)검사: 647
종합공정(필증발급)검사: 646
정기검사: 253
특정설비(제품확인검사): 246
Other values (44): 2635

Length

Max length17
Median length16
Mean length9.7365749
Min length3

Unique

Unique3
Unique (%)0.1%

Sample

1st row자율검사
2nd row자율검사
3rd row자율검사
4th row자율검사
5th row자율검사

Common Values

Value | Count | Frequency (%)
제품확인(상시샘플)검사 | 694 | 13.6%
생산공정(필증발급)검사 | 647 | 12.6%
종합공정(필증발급)검사 | 646 | 12.6%
정기검사 | 253 | 4.9%
특정설비(제품확인검사) | 246 | 4.8%
기술검토 | 245 | 4.8%
설계단계(정밀)검사 | 227 | 4.4%
완성검사 | 219 | 4.3%
특정설비(종합공정_각인등검사) | 160 | 3.1%
특정설비(생산공정_각인등검사) | 160 | 3.1%
Other values (39) | 1624 | 31.7%

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
제품확인(상시샘플)검사 | 694 | 13.4%
생산공정(필증발급)검사 | 647 | 12.5%
종합공정(필증발급)검사 | 646 | 12.4%
정기검사 | 253 | 4.9%
특정설비(제품확인검사 | 246 | 4.7%
기술검토 | 245 | 4.7%
설계단계(정밀)검사 | 227 | 4.4%
완성검사 | 219 | 4.2%
특정설비(종합공정_각인등검사 | 160 | 3.1%
특정설비(생산공정_각인등검사 | 160 | 3.1%
Other values (40) | 1695 | 32.6%

법구분
Categorical

HIGH CORRELATION 

Distinct5
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size40.1 KiB
액법: 2387
고법: 2118
도법: 262
수소법: 182
미구분: 172

Length

Max length3
Median length2
Mean length2.0691271
Min length2

Unique

Unique0
Unique (%)0.0%

Sample

1st row도법
2nd row도법
3rd row도법
4th row도법
5th row도법

Common Values

Value | Count | Frequency (%)
액법 | 2387 | 46.6%
고법 | 2118 | 41.4%
도법 | 262 | 5.1%
수소법 | 182 | 3.6%
미구분 | 172 | 3.4%

Length

Histogram of lengths of the category

Common Values (Plot)


대분류
Text

Distinct233
Distinct (%)4.5%
Missing0
Missing (%)0.0%
Memory size40.1 KiB

Length

Max length24
Median length22
Mean length5.721734
Min length1

Characters and Unicode

Total characters29301
Distinct characters228
Distinct categories9
Distinct scripts3
Distinct blocks4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique67
Unique (%)1.3%

Sample

1st row도시가스회사
2nd row도시가스회사
3rd row도시가스회사
4th row도시가스회사
5th row도시가스회사

Words

Value | Count | Frequency (%)
가스난방기 | 446 | 8.6%
배관용밸브 | 398 | 7.7%
압력조정기 | 350 | 6.7%
저장탱크 | 298 | 5.7%
가스온수보일러 | 285 | 5.5%
용접용기 | 190 | 3.7%
충전시설 | 190 | 3.7%
압력용기 | 126 | 2.4%
저장시설 | 98 | 1.9%
특정제조 | 95 | 1.8%
Other values (226) | 2718 | 52.3%

Most occurring characters

ValueCountFrequency (%)
2128
 
7.3%
1863
 
6.4%
1553
 
5.3%
1511
 
5.2%
745
 
2.5%
735
 
2.5%
710
 
2.4%
650
 
2.2%
612
 
2.1%
564
 
1.9%
Other values (218) 18230
62.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 28801
98.3%
Uppercase Letter 254
 
0.9%
Space Separator 74
 
0.3%
Open Punctuation 69
 
0.2%
Close Punctuation 69
 
0.2%
Other Punctuation 19
 
0.1%
Decimal Number 9
 
< 0.1%
Dash Punctuation 3
 
< 0.1%
Letter Number 3
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2128
 
7.4%
1863
 
6.5%
1553
 
5.4%
1511
 
5.2%
745
 
2.6%
735
 
2.6%
710
 
2.5%
650
 
2.3%
612
 
2.1%
564
 
2.0%
Other values (201) 17730
61.6%
Uppercase Letter
ValueCountFrequency (%)
G 80
31.5%
L 78
30.7%
P 73
28.7%
N 14
 
5.5%
C 6
 
2.4%
E 3
 
1.2%
Decimal Number
ValueCountFrequency (%)
2 3
33.3%
1 3
33.3%
3 3
33.3%
Other Punctuation
ValueCountFrequency (%)
· 13
68.4%
, 6
31.6%
Letter Number
ValueCountFrequency (%)
2
66.7%
1
33.3%
Space Separator
ValueCountFrequency (%)
74
100.0%
Open Punctuation
ValueCountFrequency (%)
( 69
100.0%
Close Punctuation
ValueCountFrequency (%)
) 69
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 28801
98.3%
Latin 257
 
0.9%
Common 243
 
0.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2128
 
7.4%
1863
 
6.5%
1553
 
5.4%
1511
 
5.2%
745
 
2.6%
735
 
2.6%
710
 
2.5%
650
 
2.3%
612
 
2.1%
564
 
2.0%
Other values (201) 17730
61.6%
Common
ValueCountFrequency (%)
74
30.5%
( 69
28.4%
) 69
28.4%
· 13
 
5.3%
, 6
 
2.5%
2 3
 
1.2%
1 3
 
1.2%
- 3
 
1.2%
3 3
 
1.2%
Latin
ValueCountFrequency (%)
G 80
31.1%
L 78
30.4%
P 73
28.4%
N 14
 
5.4%
C 6
 
2.3%
E 3
 
1.2%
2
 
0.8%
1
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 28801
98.3%
ASCII 484
 
1.7%
None 13
 
< 0.1%
Number Forms 3
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2128
 
7.4%
1863
 
6.5%
1553
 
5.4%
1511
 
5.2%
745
 
2.6%
735
 
2.6%
710
 
2.5%
650
 
2.3%
612
 
2.1%
564
 
2.0%
Other values (201) 17730
61.6%
ASCII
ValueCountFrequency (%)
G 80
16.5%
L 78
16.1%
74
15.3%
P 73
15.1%
( 69
14.3%
) 69
14.3%
N 14
 
2.9%
C 6
 
1.2%
, 6
 
1.2%
2 3
 
0.6%
Other values (4) 12
 
2.5%
None
ValueCountFrequency (%)
· 13
100.0%
Number Forms
ValueCountFrequency (%)
2
66.7%
1
33.3%

중분류
Text

Distinct523
Distinct (%)10.2%
Missing0
Missing (%)0.0%
Memory size40.1 KiB

Length

Max length41
Median length25
Mean length5.91818
Min length1

Characters and Unicode

Total characters30307
Distinct characters365
Distinct categories12
Distinct scripts3
Distinct blocks5

Unique

Unique194
Unique (%)3.8%

Sample

1st row도매사업 배관
2nd row일반사업 배관
3rd row공동주택내 배관
4th row도매사업 배관
5th row일반사업 배관

Words

Value | Count | Frequency (%)
기타 | 236 | 4.2%
충전시설 | 172 | 3.1%
볼밸브 | 139 | 2.5%
kgs | 131 | 2.4%
글로브밸브 | 100 | 1.8%
특정제조 | 95 | 1.7%
일반제조 | 93 | 1.7%
저장시설 | 80 | 1.4%
원통형(입형 | 66 | 1.2%
원통형(횡형 | 66 | 1.2%
Other values (549) | 4396 | 78.9%

Most occurring characters

ValueCountFrequency (%)
1477
 
4.9%
1192
 
3.9%
853
 
2.8%
853
 
2.8%
801
 
2.6%
757
 
2.5%
637
 
2.1%
575
 
1.9%
532
 
1.8%
528
 
1.7%
Other values (355) 22102
72.9%

Most occurring categories

ValueCountFrequency (%)
Other Letter 26062
86.0%
Uppercase Letter 1983
 
6.5%
Decimal Number 855
 
2.8%
Space Separator 532
 
1.8%
Close Punctuation 273
 
0.9%
Open Punctuation 273
 
0.9%
Lowercase Letter 221
 
0.7%
Dash Punctuation 62
 
0.2%
Math Symbol 30
 
0.1%
Other Punctuation 13
 
< 0.1%
Other values (2) 3
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1477
 
5.7%
1192
 
4.6%
853
 
3.3%
853
 
3.3%
801
 
3.1%
757
 
2.9%
637
 
2.4%
575
 
2.2%
528
 
2.0%
523
 
2.0%
Other values (301) 17866
68.6%
Uppercase Letter
ValueCountFrequency (%)
G 306
15.4%
F 200
10.1%
C 181
9.1%
B 157
7.9%
S 154
7.8%
P 149
7.5%
L 144
7.3%
A 141
7.1%
K 136
6.9%
N 128
6.5%
Other values (9) 287
14.5%
Lowercase Letter
ValueCountFrequency (%)
e 30
13.6%
t 28
12.7%
n 24
10.9%
u 23
10.4%
l 22
10.0%
a 19
8.6%
g 14
6.3%
i 13
5.9%
k 12
 
5.4%
p 10
 
4.5%
Other values (7) 26
11.8%
Decimal Number
ValueCountFrequency (%)
1 293
34.3%
2 284
33.2%
3 249
29.1%
4 21
 
2.5%
7 7
 
0.8%
5 1
 
0.1%
Other Punctuation
ValueCountFrequency (%)
· 6
46.2%
, 4
30.8%
/ 2
 
15.4%
. 1
 
7.7%
Letter Number
ValueCountFrequency (%)
1
50.0%
1
50.0%
Space Separator
ValueCountFrequency (%)
532
100.0%
Close Punctuation
ValueCountFrequency (%)
) 273
100.0%
Open Punctuation
ValueCountFrequency (%)
( 273
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 62
100.0%
Math Symbol
ValueCountFrequency (%)
+ 30
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 26062
86.0%
Latin 2206
 
7.3%
Common 2039
 
6.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1477
 
5.7%
1192
 
4.6%
853
 
3.3%
853
 
3.3%
801
 
3.1%
757
 
2.9%
637
 
2.4%
575
 
2.2%
528
 
2.0%
523
 
2.0%
Other values (301) 17866
68.6%
Latin
ValueCountFrequency (%)
G 306
13.9%
F 200
 
9.1%
C 181
 
8.2%
B 157
 
7.1%
S 154
 
7.0%
P 149
 
6.8%
L 144
 
6.5%
A 141
 
6.4%
K 136
 
6.2%
N 128
 
5.8%
Other values (28) 510
23.1%
Common
ValueCountFrequency (%)
532
26.1%
1 293
14.4%
2 284
13.9%
) 273
13.4%
( 273
13.4%
3 249
12.2%
- 62
 
3.0%
+ 30
 
1.5%
4 21
 
1.0%
7 7
 
0.3%
Other values (6) 15
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 26061
86.0%
ASCII 4237
 
14.0%
None 6
 
< 0.1%
Number Forms 2
 
< 0.1%
Compat Jamo 1
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1477
 
5.7%
1192
 
4.6%
853
 
3.3%
853
 
3.3%
801
 
3.1%
757
 
2.9%
637
 
2.4%
575
 
2.2%
528
 
2.0%
523
 
2.0%
Other values (300) 17865
68.6%
ASCII
ValueCountFrequency (%)
532
 
12.6%
G 306
 
7.2%
1 293
 
6.9%
2 284
 
6.7%
) 273
 
6.4%
( 273
 
6.4%
3 249
 
5.9%
F 200
 
4.7%
C 181
 
4.3%
B 157
 
3.7%
Other values (41) 1489
35.1%
None
ValueCountFrequency (%)
· 6
100.0%
Compat Jamo
ValueCountFrequency (%)
1
100.0%
Number Forms
ValueCountFrequency (%)
1
50.0%
1
50.0%

소분류
Text

Distinct629
Distinct (%)12.3%
Missing0
Missing (%)0.0%
Memory size40.1 KiB

Length

Max length41
Median length28
Mean length5.7389182
Min length1

Characters and Unicode

Total characters29389
Distinct characters396
Distinct categories13
Distinct scripts3
Distinct blocks6

Unique

Unique290
Unique (%)5.7%

Sample

1st row도매사업 배관
2nd row일반사업 배관
3rd row공동주택내 배관
4th row도매사업 배관
5th row일반사업 배관

Words

Value | Count | Frequency (%)
기타 | 177 | 3.2%
신규 | 175 | 3.1%
압축가스 | 174 | 3.1%
액화가스 | 142 | 2.5%
kgs | 127 | 2.3%
갱신 | 107 | 1.9%
상온 | 102 | 1.8%
초저온 | 102 | 1.8%
저온 | 90 | 1.6%
압축 | 80 | 1.4%
Other values (699) | 4319 | 77.2%

Most occurring characters

ValueCountFrequency (%)
1401
 
4.8%
1113
 
3.8%
1066
 
3.6%
968
 
3.3%
945
 
3.2%
628
 
2.1%
575
 
2.0%
477
 
1.6%
473
 
1.6%
459
 
1.6%
Other values (386) 21284
72.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 24945
84.9%
Uppercase Letter 1904
 
6.5%
Decimal Number 1040
 
3.5%
Space Separator 477
 
1.6%
Lowercase Letter 322
 
1.1%
Open Punctuation 256
 
0.9%
Close Punctuation 256
 
0.9%
Other Punctuation 78
 
0.3%
Dash Punctuation 69
 
0.2%
Math Symbol 33
 
0.1%
Other values (3) 9
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1401
 
5.6%
1113
 
4.5%
1066
 
4.3%
968
 
3.9%
945
 
3.8%
628
 
2.5%
575
 
2.3%
473
 
1.9%
459
 
1.8%
396
 
1.6%
Other values (326) 16921
67.8%
Uppercase Letter
ValueCountFrequency (%)
G 268
14.1%
F 202
10.6%
P 167
8.8%
C 158
8.3%
S 150
7.9%
B 149
7.8%
A 144
7.6%
L 136
7.1%
K 132
6.9%
N 95
 
5.0%
Other values (12) 303
15.9%
Lowercase Letter
ValueCountFrequency (%)
g 54
16.8%
k 52
16.1%
h 44
13.7%
a 25
7.8%
e 25
7.8%
t 22
6.8%
n 18
 
5.6%
l 18
 
5.6%
u 17
 
5.3%
i 11
 
3.4%
Other values (8) 36
11.2%
Decimal Number
ValueCountFrequency (%)
1 351
33.8%
2 314
30.2%
3 239
23.0%
0 103
 
9.9%
4 22
 
2.1%
5 6
 
0.6%
7 5
 
0.5%
Other Punctuation
ValueCountFrequency (%)
/ 52
66.7%
. 13
 
16.7%
, 7
 
9.0%
· 6
 
7.7%
Letter Number
ValueCountFrequency (%)
1
50.0%
1
50.0%
Space Separator
ValueCountFrequency (%)
477
100.0%
Open Punctuation
ValueCountFrequency (%)
( 256
100.0%
Close Punctuation
ValueCountFrequency (%)
) 256
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 69
100.0%
Math Symbol
ValueCountFrequency (%)
+ 33
100.0%
Other Symbol
ValueCountFrequency (%)
6
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 24945
84.9%
Latin 2228
 
7.6%
Common 2216
 
7.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1401
 
5.6%
1113
 
4.5%
1066
 
4.3%
968
 
3.9%
945
 
3.8%
628
 
2.5%
575
 
2.3%
473
 
1.9%
459
 
1.8%
396
 
1.6%
Other values (326) 16921
67.8%
Latin
ValueCountFrequency (%)
G 268
12.0%
F 202
 
9.1%
P 167
 
7.5%
C 158
 
7.1%
S 150
 
6.7%
B 149
 
6.7%
A 144
 
6.5%
L 136
 
6.1%
K 132
 
5.9%
N 95
 
4.3%
Other values (32) 627
28.1%
Common
ValueCountFrequency (%)
477
21.5%
1 351
15.8%
2 314
14.2%
( 256
11.6%
) 256
11.6%
3 239
10.8%
0 103
 
4.6%
- 69
 
3.1%
/ 52
 
2.3%
+ 33
 
1.5%
Other values (8) 66
 
3.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 24944
84.9%
ASCII 4430
 
15.1%
None 6
 
< 0.1%
CJK Compat 6
 
< 0.1%
Number Forms 2
 
< 0.1%
Compat Jamo 1
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1401
 
5.6%
1113
 
4.5%
1066
 
4.3%
968
 
3.9%
945
 
3.8%
628
 
2.5%
575
 
2.3%
473
 
1.9%
459
 
1.8%
396
 
1.6%
Other values (325) 16920
67.8%
ASCII
ValueCountFrequency (%)
477
 
10.8%
1 351
 
7.9%
2 314
 
7.1%
G 268
 
6.0%
( 256
 
5.8%
) 256
 
5.8%
3 239
 
5.4%
F 202
 
4.6%
P 167
 
3.8%
C 158
 
3.6%
Other values (46) 1742
39.3%
None
ValueCountFrequency (%)
· 6
100.0%
CJK Compat
ValueCountFrequency (%)
6
100.0%
Compat Jamo
ValueCountFrequency (%)
1
100.0%
Number Forms
ValueCountFrequency (%)
1
50.0%
1
50.0%

계산방식
Categorical

IMBALANCE 

Distinct4
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size40.1 KiB
용량: 4792
용량 + 초과분: 255
조단위계산: 39
0: 35

Length

Max length8
Median length2
Mean length2.3147823
Min length1

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

Value | Count | Frequency (%)
용량 | 4792 | 93.6%
용량 + 초과분 | 255 | 5.0%
조단위계산 | 39 | 0.8%
0 | 35 | 0.7%
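
The report flags 계산방식 as highly imbalanced (79.6%) without defining the score here. One formula that reproduces the figure from the four counts above is one minus the normalized Shannon entropy. The sketch below assumes that definition; the function name is illustrative and is not taken from the profiling library.

```python
from math import log2

def imbalance_score(counts):
    """1 minus normalized Shannon entropy: 0 for a uniform split,
    approaching 1 as a single class dominates."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    entropy = -sum((c / total) * log2(c / total) for c in counts if c > 0)
    return 1 - entropy / log2(n_classes)

# Counts of the four 계산방식 values: 용량, 용량 + 초과분, 조단위계산, 0.
counts = [4792, 255, 39, 35]
score = imbalance_score(counts)
print(f"{100 * score:.1f}%")  # 79.6%
```

The same calculation cannot be repeated for 업무구분 from this text alone, because only 10 of its 13 category counts are listed.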

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
용량 | 5047 | 89.6%
+ | 255 | 4.5%
초과분 | 255 | 4.5%
조단위계산 | 39 | 0.7%
0 | 35 | 0.6%

용량(FROM)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct46
Distinct (%)0.9%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean63170.398
Minimum0
Maximum2000000
Zeros1767
Zeros (%)34.5%
Negative0
Negative (%)0.0%
Memory size45.1 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median80
Q35000
95-th percentile500000
Maximum2000000
Range2000000
Interquartile range (IQR)5000

Descriptive statistics

Standard deviation253851.61
Coefficient of variation (CV)4.0185217
Kurtosis36.205169
Mean63170.398
Median Absolute Deviation (MAD)80
Skewness5.7031734
Sum3.2349561 × 10^8
Variance6.4440642 × 10^10
MonotonicityNot monotonic
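
Several of the descriptive statistics for 용량(FROM) are functions of the others, which gives a way to cross-check the printed values. A minimal sketch using only the figures reported above (the variable names are illustrative):

```python
from math import isclose

# Figures copied from the 용량(FROM) statistics.
n = 5121
mean = 63170.398
std = 253851.61
q1, q3 = 0, 5000

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the squared standard deviation
total = mean * n       # sum of all values
iqr = q3 - q1          # interquartile range

# Compare against the values printed in the report.
print(isclose(cv, 4.0185217, rel_tol=1e-6))
print(isclose(variance, 6.4440642e10, rel_tol=1e-6))
print(isclose(total, 3.2349561e8, rel_tol=1e-6))
print(iqr == 5000)
```

All four comparisons come out true, which also confirms that the sum and variance are of order 10^8 and 10^10 respectively.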
Histogram with fixed size bins (bins=46)

Common values

Value | Count | Frequency (%)
0 | 1767 | 34.5%
1000 | 222 | 4.3%
10000 | 211 | 4.1%
5000 | 205 | 4.0%
100 | 184 | 3.6%
20000 | 178 | 3.5%
50 | 171 | 3.3%
2000 | 162 | 3.2%
50000 | 144 | 2.8%
10 | 131 | 2.6%
Other values (36) | 1746 | 34.1%

Minimum 10 values

Value | Count | Frequency (%)
0 | 1767 | 34.5%
1 | 8 | 0.2%
2 | 5 | 0.1%
3 | 3 | 0.1%
4 | 5 | 0.1%
5 | 131 | 2.6%
8 | 19 | 0.4%
10 | 131 | 2.6%
15 | 61 | 1.2%
20 | 89 | 1.7%

Maximum 10 values

Value | Count | Frequency (%)
2000000 | 54 | 1.1%
1000000 | 79 | 1.5%
700000 | 53 | 1.0%
500000 | 79 | 1.5%
300000 | 54 | 1.1%
200000 | 85 | 1.7%
100000 | 85 | 1.7%
70000 | 1 | < 0.1%
50000 | 144 | 2.8%
40000 | 6 | 0.1%

용량(TO)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct55
Distinct (%)1.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1.3015643 × 10^8
Minimum0
Maximum1.4100654 × 10^9
Zeros1181
Zeros (%)23.1%
Negative0
Negative (%)0.0%
Memory size45.1 KiB

Quantile statistics

Minimum0
5-th percentile0
Q15
median500
Q320000
95-th percentile1.4100654 × 10^9
Maximum1.4100654 × 10^9
Range1.4100654 × 10^9
Interquartile range (IQR)19995

Descriptive statistics

Standard deviation3.8593976 × 10^8
Coefficient of variation (CV)2.9651994
Kurtosis5.6262233
Mean1.3015643 × 10^8
Median Absolute Deviation (MAD)500
Skewness2.7174029
Sum6.6653108 × 10^11
Variance1.489495 × 10^17
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 1181 | 23.1%
1410065407 | 318 | 6.2%
1000 | 221 | 4.3%
999999999 | 217 | 4.2%
10000 | 210 | 4.1%
5000 | 206 | 4.0%
100 | 189 | 3.7%
20000 | 178 | 3.5%
50 | 171 | 3.3%
2000 | 162 | 3.2%
Other values (45) | 2068 | 40.4%

Minimum 10 values

Value | Count | Frequency (%)
0 | 1181 | 23.1%
1 | 5 | 0.1%
2 | 5 | 0.1%
3 | 5 | 0.1%
4 | 5 | 0.1%
5 | 131 | 2.6%
8 | 19 | 0.4%
10 | 135 | 2.6%
15 | 61 | 1.2%
20 | 92 | 1.8%

Maximum 10 values

Value | Count | Frequency (%)
1410065407 | 318 | 6.2%
999999999 | 217 | 4.2%
141065407 | 1 | < 0.1%
141006540 | 3 | 0.1%
99999999 | 1 | < 0.1%
99991231 | 1 | < 0.1%
9999999 | 4 | 0.1%
2000000 | 54 | 1.1%
1000000 | 79 | 1.5%
999999 | 2 | < 0.1%
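
The largest 용량(TO) value, 1410065407, is not a round number, yet it occurs 318 times. The report does not explain it. One possible reading, offered here only as a hypothesis, is that it is an "unlimited" marker of 9,999,999,999 that was truncated to 32 bits somewhere upstream, since the arithmetic matches exactly. The sketch below shows that arithmetic and a hypothetical helper that treats such markers as open-ended upper bounds; the `OPEN_ENDED` set is an assumption, not something the report states.

```python
# Hypothesis only: 1410065407 is what 9,999,999,999 becomes
# when it is reduced modulo 2**32 (32-bit truncation).
INT32_MODULUS = 2 ** 32

sentinel = 9_999_999_999
wrapped = sentinel % INT32_MODULUS
print(wrapped)  # 1410065407

# Values assumed (for illustration) to mean "no upper limit".
OPEN_ENDED = {1410065407, 999999999}

def upper_bound(value):
    """Return None for an open-ended range, otherwise the numeric bound."""
    return None if value in OPEN_ENDED else value

print(upper_bound(1410065407))  # None
print(upper_bound(5000))        # 5000
```

If this reading is right, the mean, standard deviation, and upper percentiles of 용량(TO) are dominated by marker values rather than real capacities, so the median and quartiles are the more informative summaries.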

수수료
Real number (ℝ)

HIGH CORRELATION 

Distinct658
Distinct (%)12.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean442568.73
Minimum30
Maximum16756000
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size45.1 KiB

Quantile statistics

Minimum30
5-th percentile560
Q15810
median55000
Q3333000
95-th percentile2939000
Maximum16756000
Range16755970
Interquartile range (IQR)327190

Descriptive statistics

Standard deviation1128745.7
Coefficient of variation (CV)2.5504417
Kurtosis36.041497
Mean442568.73
Median Absolute Deviation (MAD)54000
Skewness4.9093034
Sum2.2663945 × 10^9
Variance1.2740669 × 10^12
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
370000 | 158 | 3.1%
74000 | 83 | 1.6%
130000 | 81 | 1.6%
246000 | 80 | 1.6%
39000 | 77 | 1.5%
22000 | 76 | 1.5%
920 | 76 | 1.5%
4540000 | 62 | 1.2%
3020000 | 60 | 1.2%
190 | 58 | 1.1%
Other values (648) | 4310 | 84.2%

Minimum 10 values

Value | Count | Frequency (%)
30 | 1 | < 0.1%
50 | 5 | 0.1%
60 | 11 | 0.2%
80 | 6 | 0.1%
90 | 9 | 0.2%
120 | 6 | 0.1%
130 | 7 | 0.1%
140 | 4 | 0.1%
150 | 1 | < 0.1%
160 | 4 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
16756000 | 1 | < 0.1%
16634000 | 1 | < 0.1%
13640000 | 1 | < 0.1%
11638000 | 1 | < 0.1%
11069000 | 2 | < 0.1%
10820000 | 1 | < 0.1%
10105000 | 1 | < 0.1%
9710000 | 1 | < 0.1%
6043000 | 1 | < 0.1%
5980000 | 46 | 0.9%

Interactions


Correlations

Variable | 업무구분 | 검사종류 | 법구분 | 계산방식 | 용량(FROM) | 용량(TO) | 수수료
업무구분 | 1.000 | 0.996 | 0.775 | 0.291 | 0.077 | 0.155 | 0.100
검사종류 | 0.996 | 1.000 | 0.921 | 0.535 | 0.185 | 0.313 | 0.869
법구분 | 0.775 | 0.921 | 1.000 | 0.169 | 0.151 | 0.065 | 0.180
계산방식 | 0.291 | 0.535 | 0.169 | 1.000 | 0.131 | 0.755 | 0.029
용량(FROM) | 0.077 | 0.185 | 0.151 | 0.131 | 1.000 | 0.333 | 0.110
용량(TO) | 0.155 | 0.313 | 0.065 | 0.755 | 0.333 | 1.000 | 0.040
수수료 | 0.100 | 0.869 | 0.180 | 0.029 | 0.110 | 0.040 | 1.000

Variable | 법구분 | 검사종류 | 계산방식 | 업무구분
법구분 | 1.000 | 0.710 | 0.139 | 0.564
검사종류 | 0.710 | 1.000 | 0.295 | 0.942
계산방식 | 0.139 | 0.295 | 1.000 | 0.173
업무구분 | 0.564 | 0.942 | 0.173 | 1.000

Variable | 용량(FROM) | 용량(TO) | 수수료 | 업무구분 | 검사종류 | 법구분 | 계산방식
용량(FROM) | 1.000 | 0.859 | 0.010 | 0.038 | 0.079 | 0.102 | 0.085
용량(TO) | 0.859 | 1.000 | -0.039 | 0.091 | 0.161 | 0.053 | 0.392
수수료 | 0.010 | -0.039 | 1.000 | 0.045 | 0.554 | 0.111 | 0.013
업무구분 | 0.038 | 0.091 | 0.045 | 1.000 | 0.942 | 0.564 | 0.173
검사종류 | 0.079 | 0.161 | 0.554 | 0.942 | 1.000 | 0.710 | 0.295
법구분 | 0.102 | 0.053 | 0.111 | 0.564 | 0.710 | 1.000 | 0.139
계산방식 | 0.085 | 0.392 | 0.013 | 0.173 | 0.295 | 0.139 | 1.000
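
This text does not say which association measure each matrix uses. The matrix restricted to the four categorical variables has the form one would expect from Cramér's V, so the following is a minimal plain-Python sketch of that statistic. It is the uncorrected version (profiling tools may apply a bias correction), and the toy labels are hypothetical, not rows from the dataset.

```python
from collections import Counter
from math import sqrt

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long sequences of labels."""
    n = len(xs)
    pair_counts = Counter(zip(xs, ys))
    x_counts = Counter(xs)
    y_counts = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, x_n in x_counts.items():
        for y, y_n in y_counts.items():
            expected = x_n * y_n / n
            observed = pair_counts.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(x_counts), len(y_counts)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Toy data (hypothetical).
law = ["액법", "액법", "고법", "고법"]
kind_dependent = ["A", "A", "B", "B"]      # fully determined by `law`
kind_independent = ["A", "B", "A", "B"]    # unrelated to `law`

print(cramers_v(law, kind_dependent))    # 1.0
print(cramers_v(law, kind_independent))  # 0.0
```

A value near 1, such as the 0.942 reported between 업무구분 and 검사종류, means one variable is almost fully determined by the other.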

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

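Row | 업무구분 | 검사종류 | 법구분 | 대분류 | 중분류 | 소분류 | 계산방식 | 용량(FROM) | 용량(TO) | 수수료
0 | 검사용역 | 자율검사 | 도법 | 도시가스회사 | 도매사업 배관 | 도매사업 배관 | 0 | 0 | 0 | 69000
1 | 검사용역 | 자율검사 | 도법 | 도시가스회사 | 일반사업 배관 | 일반사업 배관 | 0 | 0 | 0 | 69000
2 | 검사용역 | 자율검사 | 도법 | 도시가스회사 | 공동주택내 배관 | 공동주택내 배관 | 0 | 0 | 0 | 69000
3 | 검사용역 | 자율검사 | 도법 | 도시가스회사 | 도매사업 배관 | 도매사업 배관 | 0 | 0 | 0 | 69000
4 | 검사용역 | 자율검사 | 도법 | 도시가스회사 | 일반사업 배관 | 일반사업 배관 | 0 | 0 | 0 | 69000
5 | 검사용역 | 자율검사 | 도법 | 도시가스회사 | 공동주택내 배관 | 공동주택내 배관 | 0 | 0 | 0 | 69000
6 | 검사용역 | 자율검사 | 액법 | 저장시설 | 저장시설 | 저장시설 | 용량 + 초과분 | 1000 | 1410065407 | 948000
7 | 검사용역 | 자율검사 | 액법 | 저장시설 | 저장시설 | 저장시설 | 용량 | 0 | 5 | 139000
8 | 검사용역 | 자율검사 | 액법 | 저장시설 | 저장시설 | 저장시설 | 용량 | 5 | 10 | 191000
9 | 검사용역 | 자율검사 | 액법 | 저장시설 | 저장시설 | 저장시설 | 용량 | 10 | 20 | 216000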
업무구분검사종류법구분대분류중분류소분류계산방식용량(FROM)용량(TO)수수료
5111제품검사설계단계(정밀)검사고법특정설비설계검토특정설비설계검토특정설비설계검토용량00140000
5112제품검사설계단계(정밀)검사고법독성가스배관용밸브글로우브밸브글로우브밸브용량00505000
5113제품검사특정재검고법저장탱크원통형(입형)초저온용량 + 초과분50000999999999256000
5114제품검사특정재검고법압력용기탑류탑류용량1000500046000
5115제품검사특정재검고법압력용기탑류탑류용량50001000082000
5116제품검사특정재검고법압력용기탑류탑류용량1000020000177000
5117제품검사특정재검고법압력용기탑류탑류용량2000050000256000
5118홍보(정보회원/자료판매)정보회원미구분정보회원가스안전정보회원(C회원)용량0050000
5119홍보(정보회원/자료판매)정보회원미구분정보회원가스안전정보회원(B회원)용량00100000
5120홍보(정보회원/자료판매)정보회원미구분정보회원가스안전정보회원(G회원)용량00120000

Duplicate rows

Most frequently occurring

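Row | 업무구분 | 검사종류 | 법구분 | 대분류 | 중분류 | 소분류 | 계산방식 | 용량(FROM) | 용량(TO) | 수수료 | # duplicates
0 | 검사용역 | 자율검사 | 도법 | 도시가스회사 | 공동주택내 배관 | 공동주택내 배관 | 0 | 0 | 0 | 69000 | 2
1 | 검사용역 | 자율검사 | 도법 | 도시가스회사 | 도매사업 배관 | 도매사업 배관 | 0 | 0 | 0 | 69000 | 2
2 | 검사용역 | 자율검사 | 도법 | 도시가스회사 | 일반사업 배관 | 일반사업 배관 | 0 | 0 | 0 | 69000 | 2
3 | 교육 | 위탁교육 | 고법 | 주문형맞춤식과정 | SIL(안전 신뢰성 등급) | SIL(안전 신뢰성 등급) | 용량 | 0 | 0 | 418000 | 2
4 | 교재비등 | 시험응시료 | 미구분 | 재시험응시수수료 | 재시험응시수수료 | 재시험응시수수료 | 용량 | 0 | 0 | 10000 | 2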
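
The table lists five distinct rows that each occur twice, which matches the overview figure of 5 duplicate rows if that figure counts the surplus copies. That interpretation is an assumption. A minimal standard-library sketch of the count, on shortened toy rows rather than the full ten-column records:

```python
from collections import Counter

def duplicate_summary(rows):
    """Return (number of surplus copies, {row: occurrences} for repeated rows)."""
    counts = Counter(rows)
    repeated = {row: n for row, n in counts.items() if n > 1}
    surplus = sum(n - 1 for n in repeated.values())
    return surplus, repeated

# Toy rows shaped like (업무구분, 검사종류, 법구분, 수수료); illustrative only.
rows = [
    ("검사용역", "자율검사", "도법", 69000),
    ("검사용역", "자율검사", "도법", 69000),
    ("교육", "위탁교육", "고법", 418000),
    ("교육", "위탁교육", "고법", 418000),
    ("제품검사", "특정재검", "고법", 46000),
]

surplus, repeated = duplicate_summary(rows)
print(surplus)        # 2
print(len(repeated))  # 2
```

Rows must be hashable for `Counter`, which is why each record is a tuple here.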