Overview

Dataset statistics

Number of variables: 16
Number of observations: 518
Missing cells: 1938
Missing cells (%): 23.4%
Duplicate rows: 1
Duplicate rows (%): 0.2%
Total size in memory: 67.9 KiB
Average record size in memory: 134.3 B
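
The two percentage figures follow directly from the counts in this list. A minimal Python sketch reproduces them; the helper name `pct` is illustrative and not part of the report.

```python
# Reproduce the overview percentages from the raw counts reported above.
n_variables = 16
n_observations = 518
missing_cells = 1938
duplicate_rows = 1


def pct(part: int, total: int) -> float:
    """Return part/total as a percentage rounded to one decimal place."""
    return round(100 * part / total, 1)


total_cells = n_variables * n_observations      # every row/column intersection
missing_pct = pct(missing_cells, total_cells)   # share of empty cells
duplicate_pct = pct(duplicate_rows, n_observations)

print(f"Missing cells (%): {missing_pct}")
print(f"Duplicate rows (%): {duplicate_pct}")
```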

Variable types

Categorical: 4
Text: 4
DateTime: 2
Unsupported: 2
Numeric: 4

Dataset

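Description: Status of petroleum and petroleum-substitute fuel sales businesses (석유및석유대체연료판매업체 현황)
Author: Ministry of the Interior and Safety (행정안전부)
URL: https://data.gg.go.kr/portal/data/service/selectServicePage.do?&infId=KLYUWPFDELV41Y17CWE814442660&infSeq=1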

Alerts

Dataset has 1 (0.2%) duplicate rows [Duplicates]
통합영업상태명 is highly overall correlated with 영업상태명 [High correlation]
영업상태명 is highly overall correlated with 통합영업상태명 [High correlation]
WGS84위도 is highly overall correlated with 시군명 [High correlation]
WGS84경도 is highly overall correlated with 시군명 [High correlation]
시군명 is highly overall correlated with WGS84위도 and WGS84경도 [High correlation]
거래처 is highly imbalanced (57.3%) [Imbalance]
인허가취소일자 has 518 (100.0%) missing values [Missing]
폐업일자 has 364 (70.3%) missing values [Missing]
소재지시설전화번호 has 270 (52.1%) missing values [Missing]
소재지면적정보 has 518 (100.0%) missing values [Missing]
소재지도로명주소 has 23 (4.4%) missing values [Missing]
WGS84위도 has 20 (3.9%) missing values [Missing]
WGS84경도 has 20 (3.9%) missing values [Missing]
자본금 has 205 (39.6%) missing values [Missing]
인허가취소일자 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
소재지면적정보 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]

Reproduction

Analysis started: 2023-12-10 21:34:15.579795
Analysis finished: 2023-12-10 21:34:19.167022
Duration: 3.59 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시군명
Categorical

HIGH CORRELATION 

Distinct31
Distinct (%)6.0%
Missing0
Missing (%)0.0%
Memory size4.2 KiB
<NA>
76 
고양시
44 
평택시
41 
양주시
35 
화성시
34 
Other values (26)
288 

Length

Max length4
Median length3
Mean length3.2065637
Min length3

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row가평군
2nd row가평군
3rd row가평군
4th row가평군
5th row고양시

Common Values

Value | Count | Frequency (%)
<NA> | 76 | 14.7%
고양시 | 44 | 8.5%
평택시 | 41 | 7.9%
양주시 | 35 | 6.8%
화성시 | 34 | 6.6%
성남시 | 31 | 6.0%
수원시 | 26 | 5.0%
안산시 | 21 | 4.1%
부천시 | 18 | 3.5%
용인시 | 16 | 3.1%
Other values (21) | 176 | 34.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
na 76
14.7%
고양시 44
 
8.5%
평택시 41
 
7.9%
양주시 35
 
6.8%
화성시 34
 
6.6%
성남시 31
 
6.0%
수원시 26
 
5.0%
안산시 21
 
4.1%
부천시 18
 
3.5%
용인시 16
 
3.1%
Other values (21) 176
34.0%

사업장명
Text

Distinct336
Distinct (%)64.9%
Missing0
Missing (%)0.0%
Memory size4.2 KiB

Length

Max length18
Median length17
Mean length7.7606178
Min length4

Characters and Unicode

Total characters4020
Distinct characters230
Distinct categories7 ?
Distinct scripts3 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique183 ?
Unique (%)35.3%

Sample

1st row(주)케이피아이
2nd row(주)피에이
3rd row(주)케이피아이
4th row(주)피에이
5th row(주)대영에너지
ValueCountFrequency (%)
주식회사 21
 
3.8%
주)오티씨 6
 
1.1%
주)엠씨에너지 6
 
1.1%
주)엔맥스 5
 
0.9%
주)동해석유 4
 
0.7%
주)오조 4
 
0.7%
주)경진플러스 4
 
0.7%
주)밸룩스 4
 
0.7%
주)일성트레이드 4
 
0.7%
주)코렉스 4
 
0.7%
Other values (333) 489
88.7%

Most occurring characters

ValueCountFrequency (%)
474
 
11.8%
( 450
 
11.2%
) 450
 
11.2%
256
 
6.4%
209
 
5.2%
193
 
4.8%
102
 
2.5%
88
 
2.2%
87
 
2.2%
60
 
1.5%
Other values (220) 1651
41.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 3057
76.0%
Open Punctuation 451
 
11.2%
Close Punctuation 451
 
11.2%
Space Separator 35
 
0.9%
Uppercase Letter 23
 
0.6%
Decimal Number 2
 
< 0.1%
Other Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
474
 
15.5%
256
 
8.4%
209
 
6.8%
193
 
6.3%
102
 
3.3%
88
 
2.9%
87
 
2.8%
60
 
2.0%
56
 
1.8%
53
 
1.7%
Other values (204) 1479
48.4%
Uppercase Letter
ValueCountFrequency (%)
S 8
34.8%
T 5
21.7%
X 3
 
13.0%
K 3
 
13.0%
C 1
 
4.3%
J 1
 
4.3%
D 1
 
4.3%
M 1
 
4.3%
Open Punctuation
ValueCountFrequency (%)
( 450
99.8%
[ 1
 
0.2%
Close Punctuation
ValueCountFrequency (%)
) 450
99.8%
] 1
 
0.2%
Decimal Number
ValueCountFrequency (%)
2 1
50.0%
1 1
50.0%
Space Separator
ValueCountFrequency (%)
35
100.0%
Other Punctuation
ValueCountFrequency (%)
& 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 3057
76.0%
Common 940
 
23.4%
Latin 23
 
0.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
474
 
15.5%
256
 
8.4%
209
 
6.8%
193
 
6.3%
102
 
3.3%
88
 
2.9%
87
 
2.8%
60
 
2.0%
56
 
1.8%
53
 
1.7%
Other values (204) 1479
48.4%
Common
ValueCountFrequency (%)
( 450
47.9%
) 450
47.9%
35
 
3.7%
] 1
 
0.1%
[ 1
 
0.1%
& 1
 
0.1%
2 1
 
0.1%
1 1
 
0.1%
Latin
ValueCountFrequency (%)
S 8
34.8%
T 5
21.7%
X 3
 
13.0%
K 3
 
13.0%
C 1
 
4.3%
J 1
 
4.3%
D 1
 
4.3%
M 1
 
4.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 3057
76.0%
ASCII 963
 
24.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
474
 
15.5%
256
 
8.4%
209
 
6.8%
193
 
6.3%
102
 
3.3%
88
 
2.9%
87
 
2.8%
60
 
2.0%
56
 
1.8%
53
 
1.7%
Other values (204) 1479
48.4%
ASCII
ValueCountFrequency (%)
( 450
46.7%
) 450
46.7%
35
 
3.6%
S 8
 
0.8%
T 5
 
0.5%
X 3
 
0.3%
K 3
 
0.3%
] 1
 
0.1%
[ 1
 
0.1%
& 1
 
0.1%
Other values (6) 6
 
0.6%

인허가일자
Date

Distinct321
Distinct (%)62.0%
Missing0
Missing (%)0.0%
Memory size4.2 KiB
Minimum1988-07-07 00:00:00
Maximum2019-08-19 00:00:00
Histogram with fixed size bins (bins=50)

인허가취소일자
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing518
Missing (%)100.0%
Memory size4.7 KiB

통합영업상태명
Categorical

HIGH CORRELATION 

Distinct5
Distinct (%)1.0%
Missing0
Missing (%)0.0%
Memory size4.2 KiB
<NA>
155 
폐업
153 
영업/정상
119 
취소/말소/만료/정지/중지
89 
휴업
 
2

Length

Max length14
Median length5
Mean length5.3494208
Min length2

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row<NA>
2nd row<NA>
3rd row폐업
4th row폐업
5th row취소/말소/만료/정지/중지

Common Values

Value | Count | Frequency (%)
<NA> | 155 | 29.9%
폐업 | 153 | 29.5%
영업/정상 | 119 | 23.0%
취소/말소/만료/정지/중지 | 89 | 17.2%
휴업 | 2 | 0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
na 155
29.9%
폐업 153
29.5%
영업/정상 119
23.0%
취소/말소/만료/정지/중지 89
17.2%
휴업 2
 
0.4%

영업상태명
Categorical

HIGH CORRELATION 

Distinct8
Distinct (%)1.5%
Missing0
Missing (%)0.0%
Memory size4.2 KiB
폐지
153 
영업개시
101 
운영중
94 
등록취소
89 
폐업 등
61 
Other values (3)
20 

Length

Max length6
Median length4
Mean length3.1660232
Min length2

Unique

Unique1 ?
Unique (%)0.2%

Sample

1st row폐업 등
2nd row폐업 등
3rd row폐지
4th row폐지
5th row등록취소

Common Values

Value | Count | Frequency (%)
폐지 | 153 | 29.5%
영업개시 | 101 | 19.5%
운영중 | 94 | 18.1%
등록취소 | 89 | 17.2%
폐업 등 | 61 | 11.8%
신규 | 17 | 3.3%
사업휴지 | 2 | 0.4%
휴지사업재개 | 1 | 0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
폐지 153
26.4%
영업개시 101
17.4%
운영중 94
16.2%
등록취소 89
15.4%
폐업 61
 
10.5%
61
 
10.5%
신규 17
 
2.9%
사업휴지 2
 
0.3%
휴지사업재개 1
 
0.2%

폐업일자
Date

MISSING 

Distinct87
Distinct (%)56.5%
Missing364
Missing (%)70.3%
Memory size4.2 KiB
Minimum2009-05-25 00:00:00
Maximum2019-08-30 00:00:00
Histogram with fixed size bins (bins=50)

소재지시설전화번호
Real number (ℝ)

MISSING 

Distinct: 223
Distinct (%): 89.9%
Missing: 270
Missing (%): 52.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.7551393 × 10^8
Minimum: 0
Maximum: 7.0823861 × 10^9
Zeros: 1
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 4.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 27827137
Q1: 3.1357519 × 10^8
median: 3.1667897 × 10^8
Q3: 3.1906241 × 10^8
95-th percentile: 3.2637186 × 10^8
Maximum: 7.0823861 × 10^9
Range: 7.0823861 × 10^9
Interquartile range (IQR): 5487218.5

Descriptive statistics

Standard deviation: 9.5690183 × 10^8
Coefficient of variation (CV): 2.0123529
Kurtosis: 35.816648
Mean: 4.7551393 × 10^8
Median Absolute Deviation (MAD): 2545487
Skewness: 5.8671592
Sum: 1.1792746 × 10^11
Variance: 9.1566111 × 10^17
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
313575189 4
 
0.8%
316775183 3
 
0.6%
317904395 2
 
0.4%
315129922 2
 
0.4%
3180179933 2
 
0.4%
316957181 2
 
0.4%
312937711 2
 
0.4%
318368951 2
 
0.4%
318610004 2
 
0.4%
315435004 2
 
0.4%
Other values (213) 225
43.4%
(Missing) 270
52.1%
ValueCountFrequency (%)
0 1
0.2%
23169871 1
0.2%
23712356 1
0.2%
24169545 1
0.2%
24654881 2
0.4%
24685189 1
0.2%
24725155 1
0.2%
25308800 1
0.2%
25562017 1
0.2%
25618777 1
0.2%
ValueCountFrequency (%)
7082386073 1
0.2%
7078001799 1
0.2%
7047146884 1
0.2%
7044327217 1
0.2%
3180865253 1
0.2%
3180397881 1
0.2%
3180327813 1
0.2%
3180179933 2
0.4%
3162690173 1
0.2%
328887600 1
0.2%
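
Because this column was parsed as a real number, values such as 313575189 and 7082386073 look like telephone numbers whose leading zero was dropped. That reading is an assumption; the report itself does not confirm it. The sketch below turns the numeric values back into digit strings under that assumption. The helper name `restore_phone` is hypothetical.

```python
from typing import Optional


def restore_phone(value: Optional[float]) -> Optional[str]:
    """Turn a numerically parsed phone number back into a digit string.

    Assumption (not confirmed by the report): the source values lost a
    leading "0" when the column was read as a number.
    """
    if value is None or value != value:   # None or NaN means missing
        return None
    number = int(value)
    if number <= 0:                       # the single 0 entry carries no number
        return None
    return "0" + str(number)


for raw in (313575189, 7082386073, 0, float("nan")):
    print(raw, "->", restore_phone(raw))
```

Keeping the column as text from the start would avoid the problem and would also make the mean, skewness and other numeric statistics above unnecessary, since they have no meaning for identifiers.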

소재지면적정보
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing518
Missing (%)100.0%
Memory size4.7 KiB

소재지도로명주소
Text

Distinct314
Distinct (%)63.4%
Missing23
Missing (%)4.4%
Memory size4.2 KiB

Length

Max length54
Median length44
Mean length30.456566
Min length14

Characters and Unicode

Total characters15076
Distinct characters348
Distinct categories9 ?
Distinct scripts3 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique169 ?
Unique (%)34.1%

Sample

1st row경기도 가평군 가평읍 경춘로 1766
2nd row경기도 가평군 설악면 유명로 1642-66
3rd row경기도 가평군 가평읍 경춘로 1766
4th row경기도 가평군 설악면 유명로 1642-66
5th row경기도 고양시 일산동구 무궁화로 11, 비동 412호 (장항동,한라밀라트 )
ValueCountFrequency (%)
경기도 419
 
13.9%
고양시 44
 
1.5%
평택시 39
 
1.3%
서울특별시 36
 
1.2%
양주시 32
 
1.1%
성남시 31
 
1.0%
화성시 30
 
1.0%
수원시 26
 
0.9%
분당구 23
 
0.8%
안산시 21
 
0.7%
Other values (1057) 2310
76.7%

Most occurring characters

ValueCountFrequency (%)
2516
 
16.7%
1 637
 
4.2%
519
 
3.4%
505
 
3.3%
477
 
3.2%
453
 
3.0%
440
 
2.9%
435
 
2.9%
0 388
 
2.6%
, 381
 
2.5%
Other values (338) 8325
55.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 8548
56.7%
Decimal Number 2821
 
18.7%
Space Separator 2516
 
16.7%
Other Punctuation 381
 
2.5%
Close Punctuation 335
 
2.2%
Open Punctuation 335
 
2.2%
Dash Punctuation 102
 
0.7%
Uppercase Letter 22
 
0.1%
Lowercase Letter 16
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
519
 
6.1%
505
 
5.9%
477
 
5.6%
453
 
5.3%
440
 
5.1%
435
 
5.1%
247
 
2.9%
226
 
2.6%
191
 
2.2%
158
 
1.8%
Other values (305) 4897
57.3%
Decimal Number
ValueCountFrequency (%)
1 637
22.6%
0 388
13.8%
2 370
13.1%
3 300
10.6%
4 274
9.7%
5 195
 
6.9%
6 184
 
6.5%
9 163
 
5.8%
8 156
 
5.5%
7 154
 
5.5%
Uppercase Letter
ValueCountFrequency (%)
B 8
36.4%
S 3
 
13.6%
G 3
 
13.6%
A 2
 
9.1%
O 2
 
9.1%
K 1
 
4.5%
V 1
 
4.5%
J 1
 
4.5%
L 1
 
4.5%
Lowercase Letter
ValueCountFrequency (%)
n 3
18.8%
e 2
12.5%
k 2
12.5%
a 2
12.5%
l 2
12.5%
i 2
12.5%
c 1
 
6.2%
t 1
 
6.2%
r 1
 
6.2%
Space Separator
ValueCountFrequency (%)
2516
100.0%
Other Punctuation
ValueCountFrequency (%)
, 381
100.0%
Close Punctuation
ValueCountFrequency (%)
) 335
100.0%
Open Punctuation
ValueCountFrequency (%)
( 335
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 102
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 8548
56.7%
Common 6490
43.0%
Latin 38
 
0.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
519
 
6.1%
505
 
5.9%
477
 
5.6%
453
 
5.3%
440
 
5.1%
435
 
5.1%
247
 
2.9%
226
 
2.6%
191
 
2.2%
158
 
1.8%
Other values (305) 4897
57.3%
Latin
ValueCountFrequency (%)
B 8
21.1%
n 3
 
7.9%
S 3
 
7.9%
G 3
 
7.9%
e 2
 
5.3%
A 2
 
5.3%
k 2
 
5.3%
a 2
 
5.3%
l 2
 
5.3%
O 2
 
5.3%
Other values (8) 9
23.7%
Common
ValueCountFrequency (%)
2516
38.8%
1 637
 
9.8%
0 388
 
6.0%
, 381
 
5.9%
2 370
 
5.7%
) 335
 
5.2%
( 335
 
5.2%
3 300
 
4.6%
4 274
 
4.2%
5 195
 
3.0%
Other values (5) 759
 
11.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 8548
56.7%
ASCII 6528
43.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2516
38.5%
1 637
 
9.8%
0 388
 
5.9%
, 381
 
5.8%
2 370
 
5.7%
) 335
 
5.1%
( 335
 
5.1%
3 300
 
4.6%
4 274
 
4.2%
5 195
 
3.0%
Other values (23) 797
 
12.2%
Hangul
ValueCountFrequency (%)
519
 
6.1%
505
 
5.9%
477
 
5.6%
453
 
5.3%
440
 
5.1%
435
 
5.1%
247
 
2.9%
226
 
2.6%
191
 
2.2%
158
 
1.8%
Other values (305) 4897
57.3%

소재지지번주소
Text

Distinct425
Distinct (%)82.0%
Missing0
Missing (%)0.0%
Memory size4.2 KiB

Length

Max length50
Median length42
Mean length28.801158
Min length16

Characters and Unicode

Total characters14919
Distinct characters324
Distinct categories9 ?
Distinct scripts3 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique340 ?
Unique (%)65.6%

Sample

1st row경기도 가평군 가평읍 상색리 280번지 1호
2nd row경기도 가평군 설악면 신천리 596번지
3rd row경기도 가평군 가평읍 상색리 280번지 1호
4th row경기도 가평군 설악면 신천리 596번지
5th row경기도 고양시일산동구 장항2동 742번지 1호 한라밀라트 비동 412호
ValueCountFrequency (%)
경기도 433
 
13.6%
1호 73
 
2.3%
2호 50
 
1.6%
평택시 41
 
1.3%
서울특별시 38
 
1.2%
양주시 35
 
1.1%
화성시 34
 
1.1%
3호 31
 
1.0%
5호 30
 
0.9%
성남시 29
 
0.9%
Other values (1019) 2395
75.1%

Most occurring characters

ValueCountFrequency (%)
3087
20.7%
1 686
 
4.6%
559
 
3.7%
554
 
3.7%
528
 
3.5%
524
 
3.5%
482
 
3.2%
481
 
3.2%
446
 
3.0%
444
 
3.0%
Other values (314) 7128
47.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 8649
58.0%
Decimal Number 3088
 
20.7%
Space Separator 3087
 
20.7%
Dash Punctuation 37
 
0.2%
Uppercase Letter 27
 
0.2%
Lowercase Letter 16
 
0.1%
Other Punctuation 7
 
< 0.1%
Close Punctuation 4
 
< 0.1%
Open Punctuation 4
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
559
 
6.5%
554
 
6.4%
528
 
6.1%
524
 
6.1%
482
 
5.6%
481
 
5.6%
446
 
5.2%
444
 
5.1%
251
 
2.9%
153
 
1.8%
Other values (276) 4227
48.9%
Uppercase Letter
ValueCountFrequency (%)
B 8
29.6%
G 4
14.8%
S 3
 
11.1%
A 3
 
11.1%
L 2
 
7.4%
O 2
 
7.4%
P 1
 
3.7%
T 1
 
3.7%
K 1
 
3.7%
V 1
 
3.7%
Decimal Number
ValueCountFrequency (%)
1 686
22.2%
0 424
13.7%
2 382
12.4%
4 317
10.3%
3 303
9.8%
5 249
 
8.1%
6 216
 
7.0%
7 200
 
6.5%
9 157
 
5.1%
8 154
 
5.0%
Lowercase Letter
ValueCountFrequency (%)
n 3
18.8%
e 2
12.5%
k 2
12.5%
a 2
12.5%
l 2
12.5%
i 2
12.5%
c 1
 
6.2%
t 1
 
6.2%
r 1
 
6.2%
Other Punctuation
ValueCountFrequency (%)
, 4
57.1%
# 1
 
14.3%
; 1
 
14.3%
& 1
 
14.3%
Space Separator
ValueCountFrequency (%)
3087
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 37
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 8649
58.0%
Common 6227
41.7%
Latin 43
 
0.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
559
 
6.5%
554
 
6.4%
528
 
6.1%
524
 
6.1%
482
 
5.6%
481
 
5.6%
446
 
5.2%
444
 
5.1%
251
 
2.9%
153
 
1.8%
Other values (276) 4227
48.9%
Latin
ValueCountFrequency (%)
B 8
18.6%
G 4
 
9.3%
S 3
 
7.0%
n 3
 
7.0%
A 3
 
7.0%
L 2
 
4.7%
e 2
 
4.7%
k 2
 
4.7%
a 2
 
4.7%
l 2
 
4.7%
Other values (10) 12
27.9%
Common
ValueCountFrequency (%)
3087
49.6%
1 686
 
11.0%
0 424
 
6.8%
2 382
 
6.1%
4 317
 
5.1%
3 303
 
4.9%
5 249
 
4.0%
6 216
 
3.5%
7 200
 
3.2%
9 157
 
2.5%
Other values (8) 206
 
3.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 8649
58.0%
ASCII 6270
42.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3087
49.2%
1 686
 
10.9%
0 424
 
6.8%
2 382
 
6.1%
4 317
 
5.1%
3 303
 
4.8%
5 249
 
4.0%
6 216
 
3.4%
7 200
 
3.2%
9 157
 
2.5%
Other values (28) 249
 
4.0%
Hangul
ValueCountFrequency (%)
559
 
6.5%
554
 
6.4%
528
 
6.1%
524
 
6.1%
482
 
5.6%
481
 
5.6%
446
 
5.2%
444
 
5.1%
251
 
2.9%
153
 
1.8%
Other values (276) 4227
48.9%

소재지우편번호
Text

Distinct407
Distinct (%)78.6%
Missing0
Missing (%)0.0%
Memory size4.2 KiB

Length

Max length7
Median length7
Mean length6.3397683
Min length4

Characters and Unicode

Total characters3284
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique325 ?
Unique (%)62.7%

Sample

1st row477806
2nd row477853
3rd row477-806
4th row477-853
5th row410-837
ValueCountFrequency (%)
482-844 4
 
0.8%
411-745 4
 
0.8%
10404 4
 
0.8%
451-821 4
 
0.8%
410-837 4
 
0.8%
13524 4
 
0.8%
443-734 3
 
0.6%
420-861 3
 
0.6%
12245 3
 
0.6%
451821 3
 
0.6%
Other values (397) 482
93.1%

Most occurring characters

ValueCountFrequency (%)
4 578
17.6%
1 428
13.0%
8 362
11.0%
0 325
9.9%
2 295
9.0%
- 288
8.8%
5 261
7.9%
3 243
7.4%
7 206
 
6.3%
6 183
 
5.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2996
91.2%
Dash Punctuation 288
 
8.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 578
19.3%
1 428
14.3%
8 362
12.1%
0 325
10.8%
2 295
9.8%
5 261
8.7%
3 243
8.1%
7 206
 
6.9%
6 183
 
6.1%
9 115
 
3.8%
Dash Punctuation
ValueCountFrequency (%)
- 288
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3284
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 578
17.6%
1 428
13.0%
8 362
11.0%
0 325
9.9%
2 295
9.0%
- 288
8.8%
5 261
7.9%
3 243
7.4%
7 206
 
6.3%
6 183
 
5.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3284
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 578
17.6%
1 428
13.0%
8 362
11.0%
0 325
9.9%
2 295
9.0%
- 288
8.8%
5 261
7.9%
3 243
7.4%
7 206
 
6.3%
6 183
 
5.6%
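
The sample and frequency tables for this postal-code column show the same code written both with and without a hyphen (477806 and 477-806, 451821 and 451-821), alongside five-digit codes such as 10404. A minimal sketch of one way to make these comparable is shown below; the function names are illustrative, and treating the five-digit values as a separate, newer format is an assumption.

```python
import re
from collections import Counter


def normalize_postcode(raw: str) -> str:
    """Drop every non-digit so '477-806' and '477806' compare equal."""
    return re.sub(r"\D", "", raw)


def postcode_kind(raw: str) -> str:
    """Classify a code by digit count: six digits, five digits, or other."""
    digits = normalize_postcode(raw)
    return {6: "six-digit", 5: "five-digit"}.get(len(digits), "other")


samples = ["477806", "477-806", "451-821", "451821", "10404"]
print(Counter(normalize_postcode(s) for s in samples))
print([postcode_kind(s) for s in samples])
```

After normalization the distinct count for this column would drop, because hyphenated and unhyphenated spellings of the same code collapse into one value.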

WGS84위도
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct311
Distinct (%)62.4%
Missing20
Missing (%)3.9%
Infinite0
Infinite (%)0.0%
Mean37.400502
Minimum34.94878
Maximum37.965351
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size4.7 KiB

Quantile statistics

Minimum34.94878
5-th percentile36.943096
Q137.253451
median37.425667
Q337.660759
95-th percentile37.855774
Maximum37.965351
Range3.016571
Interquartile range (IQR)0.40730758

Descriptive statistics

Standard deviation0.37129583
Coefficient of variation (CV)0.0099275627
Kurtosis9.7193235
Mean37.400502
Median Absolute Deviation (MAD)0.19774217
Skewness-2.1818749
Sum18625.45
Variance0.13786059
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
37.0479995151 6
 
1.2%
37.0621641372 5
 
1.0%
37.8155134657 5
 
1.0%
37.8256662798 5
 
1.0%
37.190196224 5
 
1.0%
37.0031126755 5
 
1.0%
36.9641156187 5
 
1.0%
37.6636912131 4
 
0.8%
37.3974963089 4
 
0.8%
37.8162615864 4
 
0.8%
Other values (301) 450
86.9%
(Missing) 20
 
3.9%
ValueCountFrequency (%)
34.9487800111 1
0.2%
35.1299874403 1
0.2%
35.2485942255 1
0.2%
35.8446461387 1
0.2%
35.8536288764 1
0.2%
35.879765229 1
0.2%
35.977615205 1
0.2%
36.3325592805 1
0.2%
36.3387873614 1
0.2%
36.3475580191 1
0.2%
ValueCountFrequency (%)
37.9653510104 2
0.4%
37.9539516912 2
0.4%
37.9295364898 2
0.4%
37.9089149454 2
0.4%
37.908352228 1
0.2%
37.9081817102 1
0.2%
37.8955436804 1
0.2%
37.8945520949 1
0.2%
37.8884729705 2
0.4%
37.8790874896 2
0.4%

WGS84경도
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct311
Distinct (%)62.4%
Missing20
Missing (%)3.9%
Infinite0
Infinite (%)0.0%
Mean127.01022
Minimum126.55571
Maximum129.17394
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size4.7 KiB

Quantile statistics

Minimum126.55571
5-th percentile126.71033
Q1126.8281
median126.98537
Q3127.12362
95-th percentile127.48626
Maximum129.17394
Range2.6182303
Interquartile range (IQR)0.29551891

Descriptive statistics

Standard deviation0.26450391
Coefficient of variation (CV)0.0020825403
Kurtosis11.868691
Mean127.01022
Median Absolute Deviation (MAD)0.14363904
Skewness2.187222
Sum63251.091
Variance0.069962316
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
126.8972035588 6
 
1.2%
127.2904135496 5
 
1.0%
127.1067236966 5
 
1.0%
126.9881430785 5
 
1.0%
126.7140361533 5
 
1.0%
127.0726440763 5
 
1.0%
126.845011878 5
 
1.0%
126.7707832758 4
 
0.8%
127.1120479774 4
 
0.8%
127.0608192501 4
 
0.8%
Other values (301) 450
86.9%
(Missing) 20
 
3.9%
ValueCountFrequency (%)
126.5557053894 1
 
0.2%
126.5679601305 2
0.4%
126.5792751124 2
0.4%
126.6164666237 4
0.8%
126.6169329742 2
0.4%
126.6247348072 1
 
0.2%
126.6344247938 1
 
0.2%
126.6483570868 1
 
0.2%
126.6485886247 1
 
0.2%
126.6563082518 1
 
0.2%
ValueCountFrequency (%)
129.1739356652 1
0.2%
128.6534201371 1
0.2%
128.1436634717 1
0.2%
127.9440786552 1
0.2%
127.7568344076 1
0.2%
127.7560084398 1
0.2%
127.652400038 1
0.2%
127.6374065838 2
0.4%
127.6270616643 2
0.4%
127.6185579871 1
0.2%
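
The extreme coordinates (minimum latitude 34.95, maximum longitude 129.17) lie far from the 5th to 95th percentile ranges, and the sample rows include addresses in 전라남도 and 충청북도. A rough sketch for flagging such records follows. The bounding-box limits are assumed values chosen for illustration, not an official boundary of Gyeonggi-do.

```python
from typing import Optional

# Rough, assumed bounding box around Gyeonggi-do (illustrative values only).
LAT_RANGE = (36.8, 38.4)
LON_RANGE = (126.3, 127.9)


def outside_box(lat: Optional[float], lon: Optional[float]) -> bool:
    """Return True when both coordinates exist and fall outside the box."""
    if lat is None or lon is None:
        return False                      # missing coordinates are not flagged
    in_lat = LAT_RANGE[0] <= lat <= LAT_RANGE[1]
    in_lon = LON_RANGE[0] <= lon <= LON_RANGE[1]
    return not (in_lat and in_lon)


# Names and coordinates taken from the report's sample rows.
rows = [
    ("(주)케이피아이", 37.796799, 127.481372),
    ("(주)부건에스앤에스(서울지점)", 35.248594, 126.998223),
    ("(주)무한에너지", None, None),
]
flagged = [name for name, lat, lon in rows if outside_box(lat, lon)]
print(flagged)
```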

자본금
Real number (ℝ)

MISSING 

Distinct: 48
Distinct (%): 15.3%
Missing: 205
Missing (%): 39.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.6711852 × 10^8
Minimum: 50000000
Maximum: 9.5 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.7 KiB

Quantile statistics

Minimum: 50000000
5-th percentile: 50000000
Q1: 1 × 10^8
median: 1 × 10^8
Q3: 2 × 10^8
95-th percentile: 1.38 × 10^9
Maximum: 9.5 × 10^9
Range: 9.45 × 10^9
Interquartile range (IQR): 1 × 10^8

Descriptive statistics

Standard deviation: 1.3775243 × 10^9
Coefficient of variation (CV): 2.9489824
Kurtosis: 25.011606
Mean: 4.6711852 × 10^8
Median Absolute Deviation (MAD): 0
Skewness: 4.9826913
Sum: 1.462081 × 10^11
Variance: 1.8975731 × 10^18
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=48)
ValueCountFrequency (%)
100000000 178
34.4%
200000000 25
 
4.8%
50000000 24
 
4.6%
300000000 12
 
2.3%
500000000 7
 
1.4%
150000000 7
 
1.4%
110000000 5
 
1.0%
400000000 3
 
0.6%
700000000 3
 
0.6%
900000000 2
 
0.4%
Other values (38) 47
 
9.1%
(Missing) 205
39.6%
ValueCountFrequency (%)
50000000 24
 
4.6%
92489000 1
 
0.2%
100000000 178
34.4%
105000000 1
 
0.2%
107260324 1
 
0.2%
110000000 5
 
1.0%
116000000 1
 
0.2%
120000000 1
 
0.2%
130000000 1
 
0.2%
146000000 1
 
0.2%
ValueCountFrequency (%)
9500000000 1
0.2%
8694541000 1
0.2%
8652193750 1
0.2%
8582500000 1
0.2%
7937210000 1
0.2%
7500000000 2
0.4%
6266239975 1
0.2%
5580824500 1
0.2%
5500000000 1
0.2%
4683330000 1
0.2%
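
Several of the 자본금 statistics are derived from one another, which gives a quick consistency check. The sketch below recomputes the coefficient of variation, the interquartile range and the variance from the reported mean, standard deviation and quartiles. The variable names are illustrative.

```python
# Values copied from the 자본금 statistics in this section.
mean = 4.6711852e8
std = 1.3775243e9
q1, q3 = 1e8, 2e8

cv = std / mean          # coefficient of variation = std / mean
iqr = q3 - q1            # interquartile range = Q3 - Q1
variance = std ** 2      # variance is the squared standard deviation

print(f"CV       = {cv:.4f}")
print(f"IQR      = {iqr:.3g}")
print(f"Variance = {variance:.4e}")
```

A standard deviation almost three times the mean, together with a median absolute deviation of 0, is consistent with the frequency table: over half of the non-missing values equal 100000000, while a few entries reach several billion.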

거래처
Categorical

IMBALANCE 

Distinct35
Distinct (%)6.8%
Missing0
Missing (%)0.0%
Memory size4.2 KiB
기타
271 
<NA>
155 
현대
 
20
SK
 
12
GS칼텍스
 
9
Other values (30)
51 

Length

Max length12
Median length2
Mean length2.9073359
Min length1

Unique

Unique21 ?
Unique (%)4.1%

Sample

1st row<NA>
2nd row<NA>
3rd row기타
4th row기타
5th rowGS칼텍스

Common Values

Value | Count | Frequency (%)
기타 | 271 | 52.3%
<NA> | 155 | 29.9%
현대 | 20 | 3.9%
SK | 12 | 2.3%
GS칼텍스 | 9 | 1.7%
- | 8 | 1.5%
S-Oil | 6 | 1.2%
인천정유 | 3 | 0.6%
호남석유화학 | 3 | 0.6%
롯데대산유화 | 2 | 0.4%
Other values (25) | 29 | 5.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
기타 271
52.1%
na 155
29.8%
현대 20
 
3.8%
sk 12
 
2.3%
10
 
1.9%
gs칼텍스 9
 
1.7%
s-oil 6
 
1.2%
호남석유화학 4
 
0.8%
인천정유 3
 
0.6%
미정 2
 
0.4%
Other values (23) 28
 
5.4%
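
The IMBALANCE flag on 거래처 can be checked against the category counts. The sketch below assumes the score is defined as 1 - H / log2(k), where H is the Shannon entropy of the counts and k is the number of distinct values; the report does not state the formula, so this definition is an assumption. The counts come from the Common Values table: the 25 values grouped under "Other values" total 29 occurrences, and since 21 values are unique, the remaining 4 must occur twice each.

```python
import math
from typing import Iterable


def imbalance_score(counts: Iterable[int]) -> float:
    """Return 1 - H/log2(k): 0 for uniform counts, near 1 when one class dominates."""
    counts = [c for c in counts if c > 0]
    k = len(counts)
    if k <= 1:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts)
    return 1 - entropy / math.log2(k)


# Ten listed values, then four values occurring twice and 21 occurring once.
counts = [271, 155, 20, 12, 9, 8, 6, 3, 3, 2] + [2] * 4 + [1] * 21
score = imbalance_score(counts)
print(f"{len(counts)} classes, {sum(counts)} rows, imbalance = {100 * score:.1f}%")
```

Under this definition the result agrees with the 57.3% shown in the alert, which supports the assumed formula without proving it.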

Interactions

(16 interaction plots; images not preserved in this text version)

Correlations

Variable | 시군명 | 통합영업상태명 | 영업상태명 | 폐업일자 | 소재지시설전화번호 | WGS84위도 | WGS84경도 | 자본금 | 거래처
시군명 | 1.000 | 0.000 | 0.000 | 0.948 | 0.000 | 0.963 | 0.970 | 0.000 | 0.228
통합영업상태명 | 0.000 | 1.000 | 1.000 | NaN | 0.000 | 0.061 | 0.000 | 0.000 | 0.000
영업상태명 | 0.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.101 | 0.000 | 0.000 | 0.000
폐업일자 | 0.948 | NaN | 1.000 | 1.000 | 0.000 | 0.987 | 0.923 | 0.000 | 0.000
소재지시설전화번호 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.107 | 0.431 | 0.000
WGS84위도 | 0.963 | 0.061 | 0.101 | 0.987 | 0.000 | 1.000 | 0.829 | 0.000 | 0.000
WGS84경도 | 0.970 | 0.000 | 0.000 | 0.923 | 0.107 | 0.829 | 1.000 | 0.000 | 0.000
자본금 | 0.000 | 0.000 | 0.000 | 0.000 | 0.431 | 0.000 | 0.000 | 1.000 | 0.329
거래처 | 0.228 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.329 | 1.000

Variable | 통합영업상태명 | 영업상태명 | 시군명 | 거래처
통합영업상태명 | 1.000 | 0.997 | 0.000 | 0.000
영업상태명 | 0.997 | 1.000 | 0.000 | 0.000
시군명 | 0.000 | 0.000 | 1.000 | 0.049
거래처 | 0.000 | 0.000 | 0.049 | 1.000

Variable | 소재지시설전화번호 | WGS84위도 | WGS84경도 | 자본금 | 시군명 | 통합영업상태명 | 영업상태명 | 거래처
소재지시설전화번호 | 1.000 | 0.312 | -0.124 | -0.120 | 0.000 | 0.000 | 0.000 | 0.000
WGS84위도 | 0.312 | 1.000 | -0.140 | -0.098 | 0.747 | 0.038 | 0.053 | 0.000
WGS84경도 | -0.124 | -0.140 | 1.000 | 0.032 | 0.770 | 0.000 | 0.000 | 0.000
자본금 | -0.120 | -0.098 | 0.032 | 1.000 | 0.000 | 0.000 | 0.000 | 0.173
시군명 | 0.000 | 0.747 | 0.770 | 0.000 | 1.000 | 0.000 | 0.000 | 0.049
통합영업상태명 | 0.000 | 0.038 | 0.000 | 0.000 | 0.000 | 1.000 | 0.997 | 0.000
영업상태명 | 0.000 | 0.053 | 0.000 | 0.000 | 0.000 | 0.997 | 1.000 | 0.000
거래처 | 0.000 | 0.000 | 0.000 | 0.173 | 0.049 | 0.000 | 0.000 | 1.000
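
The 0.997 value between 통합영업상태명 and 영업상태명 indicates that one status column is almost fully determined by the other. One common measure of association between two categorical columns is Cramér's V. The sketch below computes the plain, uncorrected version with the standard library. The report does not state which measure produced its tables, and the toy pairs are invented for illustration.

```python
import math
from collections import Counter
from typing import Sequence


def cramers_v(xs: Sequence[str], ys: Sequence[str]) -> float:
    """Plain Cramér's V: sqrt(chi2 / (n * (min(rows, cols) - 1)))."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    k = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or k == 0:
        return 0.0
    chi2 = 0.0
    for r, r_count in row_totals.items():
        for c, c_count in col_totals.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))


# Invented toy pairs: each detailed status maps to exactly one integrated status.
integrated = ["폐업", "폐업", "영업/정상", "영업/정상", "휴업", "휴업"]
detailed = ["폐지", "폐지", "영업개시", "신규", "사업휴지", "사업휴지"]
print(round(cramers_v(integrated, detailed), 3))
```

When every detailed status belongs to a single integrated status, as in the toy data, the measure reaches its maximum of 1; two unrelated columns give a value near 0.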

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

Index | 시군명 | 사업장명 | 인허가일자 | 인허가취소일자 | 통합영업상태명 | 영업상태명 | 폐업일자 | 소재지시설전화번호 | 소재지면적정보 | 소재지도로명주소 | 소재지지번주소 | 소재지우편번호 | WGS84위도 | WGS84경도 | 자본금 | 거래처
0 | 가평군 | (주)케이피아이 | 20020722 | <NA> | <NA> | 폐업 등 | 20120403 | <NA> | <NA> | 경기도 가평군 가평읍 경춘로 1766 | 경기도 가평군 가평읍 상색리 280번지 1호 | 477806 | 37.796799 | 127.481372 | <NA> | <NA>
1 | 가평군 | (주)피에이 | 20100427 | <NA> | <NA> | 폐업 등 | 20120801 | <NA> | <NA> | 경기도 가평군 설악면 유명로 1642-66 | 경기도 가평군 설악면 신천리 596번지 | 477853 | 37.674003 | 127.487776 | <NA> | <NA>
2 | 가평군 | (주)케이피아이 | 2002-07-22 | <NA> | 폐업 | 폐지 | 2012-04-03 | 315815182 | <NA> | 경기도 가평군 가평읍 경춘로 1766 | 경기도 가평군 가평읍 상색리 280번지 1호 | 477-806 | 37.796799 | 127.481372 | 200000000 | 기타
3 | 가평군 | (주)피에이 | 2010-04-27 | <NA> | 폐업 | 폐지 | 2012-08-01 | 315851752 | <NA> | 경기도 가평군 설악면 유명로 1642-66 | 경기도 가평군 설악면 신천리 596번지 | 477-853 | 37.674003 | 127.487776 | 100000000 | 기타
4 | 고양시 | (주)대영에너지 | 2005-12-21 | <NA> | 취소/말소/만료/정지/중지 | 등록취소 | <NA> | 316535145 | <NA> | 경기도 고양시 일산동구 무궁화로 11, 비동 412호 (장항동,한라밀라트 ) | 경기도 고양시일산동구 장항2동 742번지 1호 한라밀라트 비동 412호 | 410-837 | 37.661894 | 126.765926 | 300000000 | GS칼텍스
5 | 고양시 | (주)지축에너지 | 2009-03-02 | <NA> | 취소/말소/만료/정지/중지 | 등록취소 | <NA> | 23712356 | <NA> | 경기도 고양시 덕양구 북한산로 424-19 (효자동) | 경기도 고양시덕양구 효자동 128번지 4호 | 412-140 | 37.662557 | 126.952609 | 100000000 | 현대
6 | 고양시 | (주)하이에너지 | 2006-12-11 | <NA> | 취소/말소/만료/정지/중지 | 등록취소 | <NA> | <NA> | <NA> | 경기도 고양시 덕양구 중앙로 323 | 경기도 고양시덕양구 강매동 8-1번지 대한송유관공사 본관204호 | 412-290 | 37.607663 | 126.851896 | 100000000 | 기타
7 | 고양시 | (주)동서이엔지 | 2003-11-07 | <NA> | 취소/말소/만료/정지/중지 | 등록취소 | <NA> | 319795185 | <NA> | 경기도 고양시 덕양구 중앙로 323, 본관1층 (화전동,대한송유관공사경인지사) | 경기도 고양시 덕양구 화전동 784번지 3호 대한송유관공사 경인지사 본관1층 | 412-160 | 37.608778 | 126.856905 | 100000000 | 기타
8 | 고양시 | (주)다인에너지 | 2012-10-25 | <NA> | 취소/말소/만료/정지/중지 | 등록취소 | <NA> | <NA> | <NA> | 경기도 고양시 일산서구 산현로 34 | 경기도 고양시일산서구 탄현동 17번지 52호 동문1차아파트 104동 801호 | 411-320 | 37.68913 | 126.765464 | 100000000 | 현대
9 | 고양시 | (주)다원상사 | 2010-12-17 | <NA> | 취소/말소/만료/정지/중지 | 등록취소 | <NA> | 0 | <NA> | 경기도 고양시 덕양구 무원로6번길 61 (행신동) | 경기도 고양시 덕양구 행신동 713번지 | 412-825 | 37.614785 | 126.833183 | 100000000 | 기타

Index | 시군명 | 사업장명 | 인허가일자 | 인허가취소일자 | 통합영업상태명 | 영업상태명 | 폐업일자 | 소재지시설전화번호 | 소재지면적정보 | 소재지도로명주소 | 소재지지번주소 | 소재지우편번호 | WGS84위도 | WGS84경도 | 자본금 | 거래처
508 | <NA> | (주)노일에너지 | 2008-05-13 | <NA> | 폐업 | 폐지 | <NA> | 319388161 | <NA> | 서울특별시 강서구 화곡로53가길 7 (화곡동,현대빌라트가동201) | 서울특별시 강서구 화곡1동 988번지 18호 현대빌라트가동201 | 157-909 | 37.551061 | 126.847869 | 100000000 | 현대
509 | <NA> | (주)신한에너지 | 2010-02-09 | <NA> | 폐업 | 폐지 | 2013-09-06 | <NA> | <NA> | 서울특별시 성북구 보문로29다길 31, 105동 304호 (삼선동2가,대우푸르지오아파트) | 서울특별시 성북구 삼선동2가 425번지 대우푸르지오아파트 105동 304호 | 136-721 | 37.584086 | 127.014334 | <NA> | -
510 | <NA> | 중부석유 | 2011-03-16 | <NA> | 폐업 | 폐지 | <NA> | <NA> | <NA> | 충청북도 증평군 증평읍 연정1길 1-11 | 충청북도 증평군 증평읍 덕상리 236번지 | 368-902 | 36.744599 | 127.618558 | 100000000 | 기타
511 | <NA> | 에스제이투(주) | 2011-01-31 | <NA> | 폐업 | 폐지 | 2014-07-22 | 323225189 | <NA> | 서울특별시 강서구 화곡로13길 107, 147동 1201호 (화곡동,화곡푸르지오) | 서울특별시 강서구 화곡동 1091번지 화곡푸르지오 147동 1201호 | 157-773 | 37.542535 | 126.831812 | <NA> | 호남석유화학
512 | <NA> | (주)서경에너지 | 2010-09-13 | <NA> | 폐업 | 폐지 | <NA> | <NA> | <NA> | 충청북도 청주시 청원구 공항로138번길 65-5 | 충청북도 청주시상당구 율량동 811번지 102호 | 360-818 | 36.669834 | 127.488205 | 100000000 | S-Oil
513 | <NA> | (주)화이트오일 | 2007-04-13 | <NA> | 폐업 | 폐지 | <NA> | <NA> | <NA> | 서울특별시 송파구 오금로 126 (송파동,한진로즈힐레이트401) | 서울특별시 송파구 송파1동 58번지 4호 한진로즈힐레이트401 | 138-849 | 37.511365 | 127.108823 | 100000000 | 기타
514 | <NA> | (주)하이트오일 | 2007-04-12 | <NA> | 폐업 | 폐지 | <NA> | <NA> | <NA> | 서울특별시 송파구 오금로 126 | 서울특별시 송파구 송파1동 58-4번지 한진로즈힐이크 401동 | 138-849 | 37.511365 | 127.108823 | 100000000 | 기타
515 | <NA> | (주)오일뱅크 | 2006-07-03 | <NA> | 폐업 | 폐지 | <NA> | 318265103 | <NA> | <NA> | 서울특별시 양천구 신월2동 612번지 12호 청화연립 가-102 | 158-843 | 37.519 | 126.847267 | 200000000 | 기타
516 | <NA> | (주)부건에스앤에스(서울지점) | 2015-03-31 | <NA> | 폐업 | 폐지 | 2018-11-01 | 323245789 | <NA> | 전라남도 담양군 화양길 31 | 전라남도 담양군 창평면 295번지 | 517-881 | 35.248594 | 126.998223 | 350000000 | GS칼텍스
517 | <NA> | (주)무한에너지 | 2006-04-13 | <NA> | 폐업 | 폐지 | <NA> | <NA> | <NA> | <NA> | 서울특별시 중랑구 중화2동 319번지 1호 | 131-881 | <NA> | <NA> | 200000000 | 기타
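
The sample rows show the date columns 인허가일자 and 폐업일자 in two layouts: compact (20020722) and hyphenated (2002-07-22). A minimal sketch for parsing both into one representation is given below; the function name `parse_date` and the handling of `<NA>` as missing are illustrative choices.

```python
from datetime import date, datetime
from typing import Optional

DATE_FORMATS = ("%Y-%m-%d", "%Y%m%d")   # both layouts seen in the sample rows


def parse_date(raw: Optional[str]) -> Optional[date]:
    """Parse a date in either layout; return None for missing or unparseable text."""
    if raw is None or raw.strip() in ("", "<NA>"):
        return None
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue
    return None


for raw in ("20020722", "2002-07-22", "<NA>"):
    print(raw, "->", parse_date(raw))
```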

Duplicate rows

Most frequently occurring

Index | 시군명 | 사업장명 | 인허가일자 | 통합영업상태명 | 영업상태명 | 폐업일자 | 소재지시설전화번호 | 소재지도로명주소 | 소재지지번주소 | 소재지우편번호 | WGS84위도 | WGS84경도 | 자본금 | 거래처 | # duplicates
0 | 평택시 | (주)엠씨에너지 | 20100325 | <NA> | 폐업 등 | 20131205 | <NA> | 경기도 평택시 청북면 청북중앙로 406 | 경기도 평택시 청북면 936번지 3호 | 451832 | 37.048 | 126.897204 | <NA> | <NA> | 3
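
Exact-match duplicate detection finds only one repeated row, but the first sample rows suggest that some businesses appear twice with different formatting (rows 0 and 2, rows 1 and 3). The sketch below groups records on a normalized key to surface such cases. The choice of key fields (business name, licence date, postal code) is an assumption made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (index, 사업장명, 인허가일자, 소재지우편번호) taken from the first sample rows.
rows = [
    (0, "(주)케이피아이", "20020722", "477806"),
    (1, "(주)피에이", "20100427", "477853"),
    (2, "(주)케이피아이", "2002-07-22", "477-806"),
    (3, "(주)피에이", "2010-04-27", "477-853"),
    (4, "(주)대영에너지", "2005-12-21", "410-837"),
]


def record_key(name: str, licensed: str, postcode: str) -> Tuple[str, str, str]:
    """Build a comparison key that ignores hyphens and surrounding spaces."""
    return (name.strip(), licensed.replace("-", ""), postcode.replace("-", ""))


groups: Dict[Tuple[str, str, str], List[int]] = defaultdict(list)
for index, name, licensed, postcode in rows:
    groups[record_key(name, licensed, postcode)].append(index)

likely_duplicates = [indexes for indexes in groups.values() if len(indexes) > 1]
print(likely_duplicates)
```

Whether such pairs are true duplicates still needs a manual check, since the paired rows differ in other columns such as 통합영업상태명 and 자본금.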