Overview

Dataset statistics

Number of variables: 15
Number of observations: 165
Missing cells: 488
Missing cells (%): 19.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 20.8 KiB
Average record size in memory: 128.8 B

Variable types

Categorical: 3
Text: 4
Unsupported: 2
Numeric: 6

Dataset

Description: Status of water supply construction contractors (급수공사 대행업체 현황)
Author: Ministry of the Interior and Safety (행정안전부)
URL: https://data.gg.go.kr/portal/data/service/selectServicePage.do?&infId=E0K8XT6UBEU93WOUNY551127039&infSeq=1

Alerts

소재지우편번호 is highly overall correlated with WGS84위도 and 1 other field (High correlation)
WGS84위도 is highly overall correlated with 소재지우편번호 and 1 other field (High correlation)
WGS84경도 is highly overall correlated with 시군명 (High correlation)
종업원수 is highly overall correlated with 통합영업상태명 and 1 other field (High correlation)
허가시작일자 is highly overall correlated with 허가종료일자 and 1 other field (High correlation)
허가종료일자 is highly overall correlated with 허가시작일자 and 1 other field (High correlation)
시군명 is highly overall correlated with 소재지우편번호 and 4 other fields (High correlation)
통합영업상태명 is highly overall correlated with 종업원수 (High correlation)
영업상태명 is highly overall correlated with 종업원수 (High correlation)
통합영업상태명 is highly imbalanced (94.7%) (Imbalance)
영업상태명 is highly imbalanced (94.7%) (Imbalance)
인허가취소일자 has 165 (100.0%) missing values (Missing)
소재지시설전화번호 has 27 (16.4%) missing values (Missing)
소재지면적정보 has 165 (100.0%) missing values (Missing)
소재지도로명주소 has 14 (8.5%) missing values (Missing)
소재지우편번호 has 14 (8.5%) missing values (Missing)
WGS84위도 has 12 (7.3%) missing values (Missing)
WGS84경도 has 12 (7.3%) missing values (Missing)
종업원수 has 67 (40.6%) missing values (Missing)
허가시작일자 has 5 (3.0%) missing values (Missing)
허가종료일자 has 7 (4.2%) missing values (Missing)
인허가취소일자 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)
소재지면적정보 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)
종업원수 has 25 (15.2%) zeros (Zeros)
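
The per-variable counts in the Missing alerts can be cross-checked against the totals in the overview. The sketch below is only an arithmetic check on figures copied from this report; the name missing_by_variable is illustrative and not part of the profiling tool.

```python
# Missing-value counts copied from the "Missing" alerts in this report.
missing_by_variable = {
    "인허가취소일자": 165,
    "소재지시설전화번호": 27,
    "소재지면적정보": 165,
    "소재지도로명주소": 14,
    "소재지우편번호": 14,
    "WGS84위도": 12,
    "WGS84경도": 12,
    "종업원수": 67,
    "허가시작일자": 5,
    "허가종료일자": 7,
}

n_variables = 15      # "Number of variables" in the overview
n_observations = 165  # "Number of observations" in the overview

# Every cell is one (row, column) pair, so the table has 15 * 165 cells.
total_cells = n_variables * n_observations
missing_cells = sum(missing_by_variable.values())
missing_pct = 100 * missing_cells / total_cells

print(missing_cells, total_cells, f"{missing_pct:.1f}%")
```

The sum of the alert counts matches the "Missing cells" figure, and two fully empty columns (인허가취소일자 and 소재지면적정보) account for 330 of those cells.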

Reproduction

Analysis started: 2023-12-10 22:13:50.370793
Analysis finished: 2023-12-10 22:13:55.839214
Duration: 5.47 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시군명
Categorical

HIGH CORRELATION 

Distinct18
Distinct (%)10.9%
Missing0
Missing (%)0.0%
Memory size1.4 KiB
양평군 33
광주시 33
화성시 17
오산시 16
수원시 14
Other values (13) 52

Length

Max length4
Median length3
Mean length3.0181818
Min length3

Unique

Unique1
Unique (%)0.6%

Sample

1st row고양시
2nd row광주시
3rd row광주시
4th row광주시
5th row광주시

Common Values

Value | Count | Frequency (%)
양평군 | 33 | 20.0%
광주시 | 33 | 20.0%
화성시 | 17 | 10.3%
오산시 | 16 | 9.7%
수원시 | 14 | 8.5%
안양시 | 8 | 4.8%
평택시 | 8 | 4.8%
연천군 | 7 | 4.2%
안성시 | 5 | 3.0%
시흥시 | 4 | 2.4%
Other values (8) | 20 | 12.1%

Length

Histogram of lengths of the category

사업장명
Text

Distinct126
Distinct (%)76.4%
Missing0
Missing (%)0.0%
Memory size1.4 KiB

Length

Max length11
Median length7
Mean length6.8484848
Min length3

Characters and Unicode

Total characters1130
Distinct characters111
Distinct categories6
Distinct scripts2
Distinct blocks3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique100
Unique (%)60.6%

Sample

1st row강서건설(주)
2nd row(주)토우
3rd row하나공영(주)
4th row(주)금남환경건설
5th row시유건설(주)
ValueCountFrequency (%)
주)이영설비 5
 
3.0%
주)양평수도 4
 
2.4%
주식회사 4
 
2.4%
주)민영토건 4
 
2.4%
동원건설(주 4
 
2.4%
주)토우 3
 
1.8%
주)대신기업 3
 
1.8%
주)송원건설 3
 
1.8%
주)우린건설 3
 
1.8%
동남토건(주 2
 
1.2%
Other values (117) 134
79.3%

Most occurring characters

ValueCountFrequency (%)
130
 
11.5%
( 119
 
10.5%
) 119
 
10.5%
102
 
9.0%
98
 
8.7%
23
 
2.0%
23
 
2.0%
22
 
1.9%
21
 
1.9%
20
 
1.8%
Other values (101) 453
40.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 884
78.2%
Open Punctuation 119
 
10.5%
Close Punctuation 119
 
10.5%
Space Separator 4
 
0.4%
Other Symbol 2
 
0.2%
Decimal Number 2
 
0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
130
 
14.7%
102
 
11.5%
98
 
11.1%
23
 
2.6%
23
 
2.6%
22
 
2.5%
21
 
2.4%
20
 
2.3%
19
 
2.1%
17
 
1.9%
Other values (95) 409
46.3%
Decimal Number
ValueCountFrequency (%)
5 1
50.0%
0 1
50.0%
Open Punctuation
ValueCountFrequency (%)
( 119
100.0%
Close Punctuation
ValueCountFrequency (%)
) 119
100.0%
Space Separator
ValueCountFrequency (%)
4
100.0%
Other Symbol
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 886
78.4%
Common 244
 
21.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
130
 
14.7%
102
 
11.5%
98
 
11.1%
23
 
2.6%
23
 
2.6%
22
 
2.5%
21
 
2.4%
20
 
2.3%
19
 
2.1%
17
 
1.9%
Other values (96) 411
46.4%
Common
ValueCountFrequency (%)
( 119
48.8%
) 119
48.8%
4
 
1.6%
5 1
 
0.4%
0 1
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 884
78.2%
ASCII 244
 
21.6%
None 2
 
0.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
130
 
14.7%
102
 
11.5%
98
 
11.1%
23
 
2.6%
23
 
2.6%
22
 
2.5%
21
 
2.4%
20
 
2.3%
19
 
2.1%
17
 
1.9%
Other values (95) 409
46.3%
ASCII
ValueCountFrequency (%)
( 119
48.8%
) 119
48.8%
4
 
1.6%
5 1
 
0.4%
0 1
 
0.4%
None
ValueCountFrequency (%)
2
100.0%

인허가취소일자
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing165
Missing (%)100.0%
Memory size1.6 KiB

통합영업상태명
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)1.2%
Missing0
Missing (%)0.0%
Memory size1.4 KiB
영업/정상 164
제외/삭제/전출 1

Length

Max length8
Median length5
Mean length5.0181818
Min length5

Unique

Unique1
Unique (%)0.6%

Sample

1st row영업/정상
2nd row영업/정상
3rd row영업/정상
4th row영업/정상
5th row영업/정상

Common Values

Value | Count | Frequency (%)
영업/정상 | 164 | 99.4%
제외/삭제/전출 | 1 | 0.6%
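
The 94.7% imbalance reported for this variable is consistent with a score of one minus the normalized Shannon entropy of the class counts. The sketch below assumes that definition (the report itself does not state the formula) and uses a hypothetical helper, imbalance_score, applied to the counts 164 and 1 from the table above.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for class counts.

    H is the Shannon entropy (in bits) of the class frequencies and k is the
    number of classes. This definition is an assumption made for illustration.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no defined balance; treat as 0 here
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Counts for 영업/정상 and 제외/삭제/전출 copied from the Common Values table.
score = imbalance_score([164, 1])
print(f"{100 * score:.1f}%")
```

A perfectly balanced two-class variable scores 0 under this definition, and the score approaches 1 as a single class dominates.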

Length

Histogram of lengths of the category


영업상태명
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)1.2%
Missing0
Missing (%)0.0%
Memory size1.4 KiB
정상 164
제외사항 1

Length

Max length4
Median length2
Mean length2.0121212
Min length2

Unique

Unique1
Unique (%)0.6%

Sample

1st row정상
2nd row정상
3rd row정상
4th row정상
5th row정상

Common Values

ValueCountFrequency (%)
정상 164
99.4%
제외사항 1
 
0.6%

Length

Histogram of lengths of the category


소재지시설전화번호
Text

Distinct108
Distinct (%)78.3%
Missing27
Missing (%)16.4%
Memory size1.4 KiB

Length

Max length12
Median length12
Mean length10.702899
Min length7

Characters and Unicode

Total characters1477
Distinct characters12
Distinct categories3
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique80
Unique (%)58.0%

Sample

1st row031-768-5800
2nd row031-798-1222
3rd row767-5508
4th row031-764-8385
5th row031-767-0504
ValueCountFrequency (%)
031-772-5925 3
 
2.2%
031-373-7417 3
 
2.2%
031-353-5154 2
 
1.4%
031-374-8333 2
 
1.4%
031-771-6611 2
 
1.4%
031-771-2010 2
 
1.4%
772-5925 2
 
1.4%
031-768-5800 2
 
1.4%
031-774-0410 2
 
1.4%
771-2010 2
 
1.4%
Other values (98) 116
84.1%

Most occurring characters

ValueCountFrequency (%)
- 222
15.0%
3 203
13.7%
1 190
12.9%
7 180
12.2%
0 169
11.4%
6 106
7.2%
4 95
6.4%
2 93
6.3%
5 79
 
5.3%
9 73
 
4.9%
Other values (2) 67
 
4.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1251
84.7%
Dash Punctuation 222
 
15.0%
Close Punctuation 4
 
0.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 203
16.2%
1 190
15.2%
7 180
14.4%
0 169
13.5%
6 106
8.5%
4 95
7.6%
2 93
7.4%
5 79
 
6.3%
9 73
 
5.8%
8 63
 
5.0%
Dash Punctuation
ValueCountFrequency (%)
- 222
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1477
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 222
15.0%
3 203
13.7%
1 190
12.9%
7 180
12.2%
0 169
11.4%
6 106
7.2%
4 95
6.4%
2 93
6.3%
5 79
 
5.3%
9 73
 
4.9%
Other values (2) 67
 
4.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1477
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 222
15.0%
3 203
13.7%
1 190
12.9%
7 180
12.2%
0 169
11.4%
6 106
7.2%
4 95
6.4%
2 93
6.3%
5 79
 
5.3%
9 73
 
4.9%
Other values (2) 67
 
4.5%
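
The phone-number statistics above suggest inconsistent formatting: lengths range from 7 to 12 characters, a closing parenthesis appears 4 times, and values such as 772-5925 and 031-772-5925 are counted separately. The following is a minimal normalization sketch, assuming that numbers without a prefix belong to the 031 area code and that every area code is three digits long. Both are assumptions, and the names normalize_phone and DEFAULT_AREA_CODE are hypothetical.

```python
import re

DEFAULT_AREA_CODE = "031"  # assumption: unprefixed numbers use this area code


def normalize_phone(raw):
    """Return a phone number as AREA-EXCHANGE-LINE, or None if it cannot be parsed."""
    if raw is None:
        return None
    digits = re.sub(r"\D", "", raw)  # drop dashes, parentheses, spaces
    if len(digits) in (7, 8):        # local number without an area code
        digits = DEFAULT_AREA_CODE + digits
    if len(digits) not in (10, 11) or not digits.startswith("0"):
        return None                  # unexpected shape: leave for manual review
    area, local = digits[:3], digits[3:]
    return f"{area}-{local[:-4]}-{local[-4:]}"


for sample in ["031-772-5925", "772-5925", "767-5508"]:
    print(sample, "->", normalize_phone(sample))
```

After normalization, 772-5925 and 031-772-5925 map to the same string, so they would no longer be counted as distinct values. Numbers with a two-digit area code would need separate handling.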

소재지면적정보
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing165
Missing (%)100.0%
Memory size1.6 KiB

소재지도로명주소
Text

Distinct119
Distinct (%)78.8%
Missing14
Missing (%)8.5%
Memory size1.4 KiB

Length

Max length49
Median length34
Mean length24.834437
Min length17

Characters and Unicode

Total characters3750
Distinct characters187
Distinct categories7
Distinct scripts3
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique91
Unique (%)60.3%

Sample

1st row경기도 고양시 일산동구 호수로 ***-** ***동 ***호 (백석동,동문굿모닝타워*차)
2nd row경기도 광주시 초월읍 설월길**번길 **-* (지월리)
3rd row경기도 광주시 광주대로***번길 * (송정동)
4th row경기도 광주시 탄벌길 *, 태경빌딩 (탄벌동)
5th row경기도 광주시 초월읍 현산로***번길 *-**
ValueCountFrequency (%)
152
18.3%
경기도 151
18.2%
양평군 33
 
4.0%
광주시 33
 
4.0%
오산시 16
 
1.9%
15
 
1.8%
양평읍 15
 
1.8%
화성시 15
 
1.8%
13
 
1.6%
용문면 12
 
1.4%
Other values (196) 375
45.2%

Most occurring characters

ValueCountFrequency (%)
679
18.1%
* 621
16.6%
158
 
4.2%
154
 
4.1%
153
 
4.1%
120
 
3.2%
107
 
2.9%
94
 
2.5%
92
 
2.5%
) 84
 
2.2%
Other values (177) 1488
39.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 2186
58.3%
Space Separator 679
 
18.1%
Other Punctuation 661
 
17.6%
Close Punctuation 84
 
2.2%
Open Punctuation 84
 
2.2%
Dash Punctuation 54
 
1.4%
Uppercase Letter 2
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
158
 
7.2%
154
 
7.0%
153
 
7.0%
120
 
5.5%
107
 
4.9%
94
 
4.3%
92
 
4.2%
72
 
3.3%
55
 
2.5%
48
 
2.2%
Other values (169) 1133
51.8%
Other Punctuation
ValueCountFrequency (%)
* 621
93.9%
, 39
 
5.9%
. 1
 
0.2%
Space Separator
ValueCountFrequency (%)
679
100.0%
Close Punctuation
ValueCountFrequency (%)
) 84
100.0%
Open Punctuation
ValueCountFrequency (%)
( 84
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 54
100.0%
Uppercase Letter
ValueCountFrequency (%)
A 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 2186
58.3%
Common 1562
41.7%
Latin 2
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
158
 
7.2%
154
 
7.0%
153
 
7.0%
120
 
5.5%
107
 
4.9%
94
 
4.3%
92
 
4.2%
72
 
3.3%
55
 
2.5%
48
 
2.2%
Other values (169) 1133
51.8%
Common
ValueCountFrequency (%)
679
43.5%
* 621
39.8%
) 84
 
5.4%
( 84
 
5.4%
- 54
 
3.5%
, 39
 
2.5%
. 1
 
0.1%
Latin
ValueCountFrequency (%)
A 2
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 2186
58.3%
ASCII 1564
41.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
679
43.4%
* 621
39.7%
) 84
 
5.4%
( 84
 
5.4%
- 54
 
3.5%
, 39
 
2.5%
A 2
 
0.1%
. 1
 
0.1%
Hangul
ValueCountFrequency (%)
158
 
7.2%
154
 
7.0%
153
 
7.0%
120
 
5.5%
107
 
4.9%
94
 
4.3%
92
 
4.2%
72
 
3.3%
55
 
2.5%
48
 
2.2%
Other values (169) 1133
51.8%

소재지지번주소
Text

Distinct134
Distinct (%)81.2%
Missing0
Missing (%)0.0%
Memory size1.4 KiB

Length

Max length41
Median length35
Mean length23.642424
Min length15

Characters and Unicode

Total characters3901
Distinct characters153
Distinct categories6
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique112
Unique (%)67.9%

Sample

1st row경기도 고양시일산동구 백석동 ****번지 동문굿모닝타워 ***동 ***호
2nd row경기도 광주시 초월면 지월리 ***번지 ** 호
3rd row경기도 광주시 송정동 ***-**번지
4th row경기도 광주시 탄벌동 ***번지 태경빌딩
5th row경기도 광주시 초월읍 지월리 ***-**번지
ValueCountFrequency (%)
경기도 165
18.9%
번지 109
 
12.5%
98
 
11.2%
66
 
7.6%
양평군 33
 
3.8%
광주시 33
 
3.8%
화성시 17
 
1.9%
오산시 16
 
1.8%
양평읍 15
 
1.7%
용문면 12
 
1.4%
Other values (145) 308
35.3%

Most occurring characters

ValueCountFrequency (%)
951
24.4%
* 701
18.0%
169
 
4.3%
167
 
4.3%
165
 
4.2%
129
 
3.3%
120
 
3.1%
109
 
2.8%
101
 
2.6%
81
 
2.1%
Other values (143) 1208
31.0%

Most occurring categories

ValueCountFrequency (%)
Other Letter 2174
55.7%
Space Separator 951
24.4%
Other Punctuation 701
 
18.0%
Dash Punctuation 71
 
1.8%
Close Punctuation 2
 
0.1%
Open Punctuation 2
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
169
 
7.8%
167
 
7.7%
165
 
7.6%
129
 
5.9%
120
 
5.5%
109
 
5.0%
101
 
4.6%
81
 
3.7%
70
 
3.2%
69
 
3.2%
Other values (138) 994
45.7%
Space Separator
ValueCountFrequency (%)
951
100.0%
Other Punctuation
ValueCountFrequency (%)
* 701
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 71
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 2174
55.7%
Common 1727
44.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
169
 
7.8%
167
 
7.7%
165
 
7.6%
129
 
5.9%
120
 
5.5%
109
 
5.0%
101
 
4.6%
81
 
3.7%
70
 
3.2%
69
 
3.2%
Other values (138) 994
45.7%
Common
ValueCountFrequency (%)
951
55.1%
* 701
40.6%
- 71
 
4.1%
) 2
 
0.1%
( 2
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 2174
55.7%
ASCII 1727
44.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
951
55.1%
* 701
40.6%
- 71
 
4.1%
) 2
 
0.1%
( 2
 
0.1%
Hangul
ValueCountFrequency (%)
169
 
7.8%
167
 
7.7%
165
 
7.6%
129
 
5.9%
120
 
5.5%
109
 
5.0%
101
 
4.6%
81
 
3.7%
70
 
3.2%
69
 
3.2%
Other values (138) 994
45.7%

소재지우편번호
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct100
Distinct (%)66.2%
Missing14
Missing (%)8.5%
Infinite0
Infinite (%)0.0%
Mean14720.079
Minimum10449
Maximum18629
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.6 KiB

Quantile statistics

Minimum10449
5-th percentile11025
Q112571
median13959
Q317651.5
95-th percentile18331.5
Maximum18629
Range8180
Interquartile range (IQR)5080.5

Descriptive statistics

Standard deviation2599.1039
Coefficient of variation (CV)0.17656861
Kurtosis-1.5557318
Mean14720.079
Median Absolute Deviation (MAD)2099
Skewness0.21922016
Sum2222732
Variance6755341.2
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
12747 6
 
3.6%
18104 6
 
3.6%
12522 4
 
2.4%
12739 4
 
2.4%
18135 4
 
2.4%
12515 4
 
2.4%
12547 4
 
2.4%
12563 3
 
1.8%
12571 3
 
1.8%
12559 3
 
1.8%
Other values (90) 110
66.7%
(Missing) 14
 
8.5%
ValueCountFrequency (%)
10449 1
0.6%
10844 1
0.6%
10931 1
0.6%
10946 1
0.6%
11000 1
0.6%
11012 1
0.6%
11023 2
1.2%
11027 1
0.6%
11029 1
0.6%
11034 1
0.6%
ValueCountFrequency (%)
18629 1
0.6%
18593 1
0.6%
18567 1
0.6%
18515 2
1.2%
18401 1
0.6%
18336 1
0.6%
18334 1
0.6%
18329 1
0.6%
18325 1
0.6%
18316 1
0.6%
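
소재지우편번호 is profiled here as a real number, so statistics such as the mean and variance describe the code values rather than any quantity. Since a postal code is an identifier, it may be more useful to carry it as text before profiling. A minimal sketch, assuming five-digit codes and that missing values arrive as None or NaN; the helper name postal_code_to_text is hypothetical.

```python
import math


def postal_code_to_text(value):
    """Convert a numeric postal code to a 5-character string; keep missing as None."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return None
    # int() drops the ".0" that appears when the column was read as float.
    return f"{int(value):05d}"


print(postal_code_to_text(10449.0))
print(postal_code_to_text(float("nan")))
```

With the column stored as text, a profiler would be expected to report value frequencies instead of numeric moments for it.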

WGS84위도
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct131
Distinct (%)85.6%
Missing12
Missing (%)7.3%
Infinite0
Infinite (%)0.0%
Mean37.363855
Minimum36.96186
Maximum38.183708
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.6 KiB

Quantile statistics

Minimum36.96186
5-th percentile37.01716
Q137.20485
median37.379888
Q337.468768
95-th percentile37.807035
Maximum38.183708
Range1.2218474
Interquartile range (IQR)0.26391829

Descriptive statistics

Standard deviation0.23012963
Coefficient of variation (CV)0.0061591512
Kurtosis2.0246972
Mean37.363855
Median Absolute Deviation (MAD)0.11384104
Skewness1.0902713
Sum5716.6699
Variance0.052959649
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
37.4764411 3
 
1.8%
37.4910273 3
 
1.8%
37.4852774867 2
 
1.2%
37.3206684 2
 
1.2%
37.4031649 2
 
1.2%
37.4851681 2
 
1.2%
37.4251126 2
 
1.2%
37.485891 2
 
1.2%
37.2947764193 2
 
1.2%
37.1526614694 2
 
1.2%
Other values (121) 131
79.4%
(Missing) 12
 
7.3%
ValueCountFrequency (%)
36.9618604 1
0.6%
36.9841412 1
0.6%
36.990659 1
0.6%
36.9907725 1
0.6%
36.9912159 1
0.6%
37.0002492 1
0.6%
37.0101892 1
0.6%
37.0152768 1
0.6%
37.0184158 1
0.6%
37.0563236 1
0.6%
ValueCountFrequency (%)
38.1837078 1
0.6%
38.0975584 1
0.6%
38.0294197 1
0.6%
38.024229 1
0.6%
38.0191774 1
0.6%
38.0038684 1
0.6%
37.9936517 1
0.6%
37.8162222832 1
0.6%
37.8009100798 1
0.6%
37.7665123806 1
0.6%

WGS84경도
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct131
Distinct (%)85.6%
Missing12
Missing (%)7.3%
Infinite0
Infinite (%)0.0%
Mean127.14051
Minimum126.74444
Maximum127.59344
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.6 KiB

Quantile statistics

Minimum126.74444
5-th percentile126.80034
Q1127.00539
median127.07085
Q3127.26744
95-th percentile127.54819
Maximum127.59344
Range0.8490042
Interquartile range (IQR)0.2620438

Descriptive statistics

Standard deviation0.21640404
Coefficient of variation (CV)0.0017020856
Kurtosis-0.50906957
Mean127.14051
Median Absolute Deviation (MAD)0.15653418
Skewness0.44058467
Sum19452.499
Variance0.046830707
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
127.4664863 3
 
1.8%
127.4999217 3
 
1.8%
127.5926053639 2
 
1.2%
126.832379 2
 
1.2%
126.9143125 2
 
1.2%
127.592607 2
 
1.2%
127.2827553 2
 
1.2%
127.5903853 2
 
1.2%
127.0236918202 2
 
1.2%
127.0708466815 2
 
1.2%
Other values (121) 131
79.4%
(Missing) 12
 
7.3%
ValueCountFrequency (%)
126.7444402 1
0.6%
126.7455281 1
0.6%
126.7610029769 1
0.6%
126.7779608418 1
0.6%
126.7864895 1
0.6%
126.7889737576 1
0.6%
126.7970704 1
0.6%
126.8002206 1
0.6%
126.800417645 1
0.6%
126.8190583218 1
0.6%
ValueCountFrequency (%)
127.5934444 1
 
0.6%
127.592607 2
1.2%
127.5926053639 2
1.2%
127.5904795115 1
 
0.6%
127.5903853 2
1.2%
127.5200614 1
 
0.6%
127.5138678 1
 
0.6%
127.5100584172 1
 
0.6%
127.5100584 2
1.2%
127.4999217 3
1.8%

종업원수
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct13
Distinct (%)13.3%
Missing67
Missing (%)40.6%
Infinite0
Infinite (%)0.0%
Mean3.2959184
Minimum0
Maximum13
Zeros25
Zeros (%)15.2%
Negative0
Negative (%)0.0%
Memory size1.6 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10.25
median3
Q35
95-th percentile9.15
Maximum13
Range13
Interquartile range (IQR)4.75

Descriptive statistics

Standard deviation2.9817631
Coefficient of variation (CV)0.90468353
Kurtosis0.75563133
Mean3.2959184
Median Absolute Deviation (MAD)2
Skewness0.99753762
Sum323
Variance8.890911
MonotonicityNot monotonic
Histogram with fixed size bins (bins=13)
ValueCountFrequency (%)
0 25
 
15.2%
3 20
 
12.1%
2 15
 
9.1%
4 9
 
5.5%
6 8
 
4.8%
5 7
 
4.2%
8 4
 
2.4%
9 2
 
1.2%
10 2
 
1.2%
11 2
 
1.2%
Other values (3) 4
 
2.4%
(Missing) 67
40.6%
ValueCountFrequency (%)
0 25
15.2%
1 2
 
1.2%
2 15
9.1%
3 20
12.1%
4 9
 
5.5%
5 7
 
4.2%
6 8
 
4.8%
7 1
 
0.6%
8 4
 
2.4%
9 2
 
1.2%
ValueCountFrequency (%)
13 1
 
0.6%
11 2
 
1.2%
10 2
 
1.2%
9 2
 
1.2%
8 4
 
2.4%
7 1
 
0.6%
6 8
 
4.8%
5 7
 
4.2%
4 9
5.5%
3 20
12.1%
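
The summary statistics for 종업원수 can be reproduced from the value counts listed in this section. The sketch below rebuilds the 98 non-missing values from those counts and recomputes the sum, mean and median. It also computes a mean that excludes zeros, which would only be relevant if the 25 zeros were placeholders rather than real head counts; the report does not say which, so that part is an assumption.

```python
from statistics import median

# Value -> count, copied from the frequency tables in this section
# (the 67 missing rows are not included).
employee_counts = {0: 25, 1: 2, 2: 15, 3: 20, 4: 9, 5: 7, 6: 8,
                   7: 1, 8: 4, 9: 2, 10: 2, 11: 2, 13: 1}

# Expand the counts back into a flat list of observed values.
values = [value for value, count in employee_counts.items() for _ in range(count)]

n = len(values)
total = sum(values)
mean = total / n

# Assumption for illustration: treat zeros as "not reported" and drop them.
nonzero = [v for v in values if v > 0]
mean_nonzero = sum(nonzero) / len(nonzero)

print(n, total, round(mean, 7), median(values), round(mean_nonzero, 2))
```

The recomputed sum and mean agree with the "Sum" and "Mean" entries above, which indicates that the reported mean is taken over non-missing rows only.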

허가시작일자
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct65
Distinct (%)40.6%
Missing5
Missing (%)3.0%
Infinite0
Infinite (%)0.0%
Mean20139306
Minimum20020902
Maximum20220130
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.6 KiB

Quantile statistics

Minimum20020902
5-th percentile20040209
Q120087934
median20150229
Q320190301
95-th percentile20220130
Maximum20220130
Range199228
Interquartile range (IQR)102367

Descriptive statistics

Standard deviation63372.141
Coefficient of variation (CV)0.0031466895
Kurtosis-1.2882399
Mean20139306
Median Absolute Deviation (MAD)59162
Skewness-0.34281529
Sum3.2222889 × 10^9
Variance4.0160282 × 10^9
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
20210101 16
 
9.7%
20190101 11
 
6.7%
20220130 10
 
6.1%
20041213 10
 
6.1%
20190130 9
 
5.5%
20210301 7
 
4.2%
20190301 7
 
4.2%
20160130 6
 
3.6%
20170101 6
 
3.6%
20130130 5
 
3.0%
Other values (55) 73
44.2%
(Missing) 5
 
3.0%
ValueCountFrequency (%)
20020902 1
 
0.6%
20021007 1
 
0.6%
20030724 1
 
0.6%
20031007 2
1.2%
20031112 1
 
0.6%
20031124 1
 
0.6%
20040209 3
1.8%
20040818 1
 
0.6%
20040901 1
 
0.6%
20041005 2
1.2%
ValueCountFrequency (%)
20220130 10
6.1%
20210302 1
 
0.6%
20210301 7
4.2%
20210223 1
 
0.6%
20210101 16
9.7%
20190415 4
 
2.4%
20190301 7
4.2%
20190130 9
5.5%
20190101 11
6.7%
20171124 1
 
0.6%

허가종료일자
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct64
Distinct (%)40.5%
Missing7
Missing (%)4.2%
Infinite0
Infinite (%)0.0%
Mean20158024
Minimum20040901
Maximum20250129
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.6 KiB

Quantile statistics

Minimum20040901
5-th percentile20060208
Q120103379
median20160630
Q320211231
95-th percentile20250129
Maximum20250129
Range209228
Interquartile range (IQR)107852

Descriptive statistics

Standard deviation63142.125
Coefficient of variation (CV)0.0031323569
Kurtosis-1.2243958
Mean20158024
Median Absolute Deviation (MAD)50601
Skewness-0.29473684
Sum3.1849678 × 10^9
Variance3.986928 × 10^9
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
20211231 16
 
9.7%
20201231 11
 
6.7%
20061212 10
 
6.1%
20250129 10
 
6.1%
20220129 9
 
5.5%
20210228 7
 
4.2%
20240229 7
 
4.2%
20160129 5
 
3.0%
20190129 5
 
3.0%
20191231 4
 
2.4%
Other values (54) 74
44.8%
(Missing) 7
 
4.2%
ValueCountFrequency (%)
20040901 1
0.6%
20041006 1
0.6%
20050723 1
0.6%
20051006 2
1.2%
20051111 1
0.6%
20051123 1
0.6%
20060208 2
1.2%
20060817 1
0.6%
20060831 1
0.6%
20061004 2
1.2%
ValueCountFrequency (%)
20250129 10
6.1%
20240229 7
4.2%
20220414 4
 
2.4%
20220129 9
5.5%
20211231 16
9.7%
20210228 7
4.2%
20201231 11
6.7%
20191231 4
 
2.4%
20190130 1
 
0.6%
20190129 5
 
3.0%
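
허가시작일자 and 허가종료일자 hold dates encoded as YYYYMMDD integers, which is why they are profiled as real numbers and why interpolated statistics such as the Q1 of 20087934 for 허가시작일자 do not correspond to a valid calendar date. A minimal sketch of parsing such values into dates with the Python standard library follows; the helper name parse_yyyymmdd is hypothetical, and the example pair 20041213 / 20061212 is taken from the sample rows of this report.

```python
from datetime import date, datetime


def parse_yyyymmdd(value):
    """Parse an integer or string such as 20041213 into a date; None stays None."""
    if value is None:
        return None
    return datetime.strptime(str(int(value)), "%Y%m%d").date()


start = parse_yyyymmdd(20041213)  # 허가시작일자
end = parse_yyyymmdd(20061212)    # 허가종료일자
duration_days = (end - start).days

print(start, end, duration_days)
```

Once parsed, differences between the two columns are expressed in days rather than as differences between eight-digit integers.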

Interactions


Correlations

 | 시군명 | 통합영업상태명 | 영업상태명 | 소재지우편번호 | WGS84위도 | WGS84경도 | 종업원수 | 허가시작일자 | 허가종료일자
시군명 | 1.000 | 0.000 | 0.000 | 0.997 | 0.995 | 0.970 | 0.682 | 0.860 | 0.834
통합영업상태명 | 0.000 | 1.000 | 0.699 | 0.000 | 0.000 | 0.140 | NaN | 0.000 | 0.000
영업상태명 | 0.000 | 0.699 | 1.000 | 0.000 | 0.000 | 0.140 | NaN | 0.000 | 0.000
소재지우편번호 | 0.997 | 0.000 | 0.000 | 1.000 | 0.946 | 0.886 | 0.571 | 0.738 | 0.720
WGS84위도 | 0.995 | 0.000 | 0.000 | 0.946 | 1.000 | 0.883 | 0.500 | 0.659 | 0.666
WGS84경도 | 0.970 | 0.140 | 0.140 | 0.886 | 0.883 | 1.000 | 0.213 | 0.595 | 0.496
종업원수 | 0.682 | NaN | NaN | 0.571 | 0.500 | 0.213 | 1.000 | 0.432 | 0.437
허가시작일자 | 0.860 | 0.000 | 0.000 | 0.738 | 0.659 | 0.595 | 0.432 | 1.000 | 0.967
허가종료일자 | 0.834 | 0.000 | 0.000 | 0.720 | 0.666 | 0.496 | 0.437 | 0.967 | 1.000

 | 통합영업상태명 | 영업상태명 | 시군명
통합영업상태명 | 1.000 | 0.492 | 0.000
영업상태명 | 0.492 | 1.000 | 0.000
시군명 | 0.000 | 0.000 | 1.000

 | 소재지우편번호 | WGS84위도 | WGS84경도 | 종업원수 | 허가시작일자 | 허가종료일자 | 시군명 | 통합영업상태명 | 영업상태명
소재지우편번호 | 1.000 | -0.950 | -0.483 | 0.322 | -0.204 | -0.216 | 0.955 | 0.000 | 0.000
WGS84위도 | -0.950 | 1.000 | 0.346 | -0.404 | 0.209 | 0.224 | 0.853 | 0.000 | 0.000
WGS84경도 | -0.483 | 0.346 | 1.000 | 0.148 | 0.245 | 0.309 | 0.705 | 0.135 | 0.135
종업원수 | 0.322 | -0.404 | 0.148 | 1.000 | -0.261 | -0.275 | 0.334 | 1.000 | 1.000
허가시작일자 | -0.204 | 0.209 | 0.245 | -0.261 | 1.000 | 0.985 | 0.543 | 0.000 | 0.000
허가종료일자 | -0.216 | 0.224 | 0.309 | -0.275 | 0.985 | 1.000 | 0.509 | 0.000 | 0.000
시군명 | 0.955 | 0.853 | 0.705 | 0.334 | 0.543 | 0.509 | 1.000 | 0.000 | 0.000
통합영업상태명 | 0.000 | 0.000 | 0.135 | 1.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.492
영업상태명 | 0.000 | 0.000 | 0.135 | 1.000 | 0.000 | 0.000 | 0.000 | 0.492 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

(index) | 시군명 | 사업장명 | 인허가취소일자 | 통합영업상태명 | 영업상태명 | 소재지시설전화번호 | 소재지면적정보 | 소재지도로명주소 | 소재지지번주소 | 소재지우편번호 | WGS84위도 | WGS84경도 | 종업원수 | 허가시작일자 | 허가종료일자
0 | 고양시 | 강서건설(주) | <NA> | 영업/정상 | 정상 | <NA> | <NA> | 경기도 고양시 일산동구 호수로 ***-** ***동 ***호 (백석동,동문굿모닝타워*차) | 경기도 고양시일산동구 백석동 ****번지 동문굿모닝타워 ***동 ***호 | 10449 | 37.640383 | 126.78649 | 0 | 20080711 | 20100710
1 | 광주시 | (주)토우 | <NA> | 영업/정상 | 정상 | <NA> | <NA> | 경기도 광주시 초월읍 설월길**번길 **-* (지월리) | 경기도 광주시 초월면 지월리 ***번지 ** 호 | 12740 | 37.425113 | 127.282755 | 9 | 20031112 | 20051111
2 | 광주시 | 하나공영(주) | <NA> | 영업/정상 | 정상 | <NA> | <NA> | 경기도 광주시 광주대로***번길 * (송정동) | 경기도 광주시 송정동 ***-**번지 | 12739 | 37.426529 | 127.258007 | 13 | <NA> | <NA>
3 | 광주시 | (주)금남환경건설 | <NA> | 영업/정상 | 정상 | 031-768-5800 | <NA> | 경기도 광주시 탄벌길 *, 태경빌딩 (탄벌동) | 경기도 광주시 탄벌동 ***번지 태경빌딩 | 12747 | 37.419113 | 127.245782 | <NA> | 20190101 | 20201231
4 | 광주시 | 시유건설(주) | <NA> | 영업/정상 | 정상 | 031-798-1222 | <NA> | 경기도 광주시 초월읍 현산로***번길 *-** | 경기도 광주시 초월읍 지월리 ***-**번지 | 12728 | 37.416221 | 127.285862 | <NA> | 20190101 | 20201231
5 | 광주시 | 시유건설(주) | <NA> | 영업/정상 | 정상 | 767-5508 | <NA> | 경기도 광주시 중앙로***번길 ** (경안동) | 경기도 광주시 경안동 **-** | 12755 | 37.412256 | 127.25448 | 0 | 20031124 | 20051123
6 | 광주시 | 선우종합건설(주) | <NA> | 영업/정상 | 정상 | <NA> | <NA> | 경기도 광주시 중앙로 *** (송정동) | 경기도 광주시 송정동 ***-*번지 | 12743 | 37.415503 | 127.257032 | <NA> | <NA> | <NA>
7 | 광주시 | (주)우린건설 | <NA> | 영업/정상 | 정상 | 031-764-8385 | <NA> | 경기도 광주시 초월읍 선장동길 **-**, *층 | 경기도 광주시 초월읍 선동리 ** | 12732 | 37.390878 | 127.322241 | <NA> | 20210101 | 20211231
8 | 광주시 | 하나공영주식회사 | <NA> | 영업/정상 | 정상 | 031-767-0504 | <NA> | 경기도 광주시 광주대로***번길 *, *층 (송정동) | 경기도 광주시 송정동 ***-** | 12739 | 37.426585 | 127.258072 | <NA> | 20210101 | 20211231
9 | 광주시 | 선우종합건설 주식회사 | <NA> | 영업/정상 | 정상 | 031-797-0432 | <NA> | 경기도 광주시 중앙로 *** (송정동) | 경기도 광주시 송정동 ***-* | 12743 | 37.415568 | 127.25709 | <NA> | 20210101 | 20211231

(index) | 시군명 | 사업장명 | 인허가취소일자 | 통합영업상태명 | 영업상태명 | 소재지시설전화번호 | 소재지면적정보 | 소재지도로명주소 | 소재지지번주소 | 소재지우편번호 | WGS84위도 | WGS84경도 | 종업원수 | 허가시작일자 | 허가종료일자
155 | 화성시 | 한진토건(주) | <NA> | 영업/정상 | 정상 | 031-375-2670 | <NA> | 경기도 화성시 정남면 발안로 **** | 경기도 화성시 정남면 덕절리 **-* | 18515 | 37.134348 | 127.028854 | <NA> | 20170101 | 20191231
156 | 화성시 | 수도건설(주) | <NA> | 영업/정상 | 정상 | 031-237-3578 | <NA> | 경기도 화성시 효행로 ***-** | 경기도 화성시 태안읍 진안리 ***번지 **호 | 18401 | 37.210251 | 127.03706 | 4 | 20041213 | 20061212
157 | 화성시 | 지원종합건설(주) | <NA> | 영업/정상 | 정상 | <NA> | <NA> | 경기도 화성시 향남읍 삼천병마로 *** | 경기도 화성시 향남면 장짐리 ***번지 **호 | 18593 | 37.135525 | 126.911214 | 5 | 20041213 | 20061212
158 | 화성시 | 만능건설(주) | <NA> | 영업/정상 | 정상 | 031-357-9595 | <NA> | 경기도 화성시 남양읍 화성로 **** | 경기도 화성시 남양동 ***번지 *호 | 18254 | 37.203781 | 126.800418 | 6 | 20041209 | 20061208
159 | 화성시 | 새생활설비 | <NA> | 영업/정상 | 정상 | <NA> | <NA> | 경기도 화성시 우정읍 *.*만세로 **-* | 경기도 화성시 우정읍 조암리 ***번지 *호 | 18567 | 37.08267 | 126.819286 | 1 | 20041213 | 20061212
160 | 화성시 | 대한건설(주) | <NA> | 영업/정상 | 정상 | <NA> | <NA> | 경기도 화성시 남양읍 남양성지로 *** | 경기도 화성시 남양동 ****번지 아이리스프라자 *** | 18261 | 37.209839 | 126.819058 | 2 | 20041213 | 20061212
161 | 화성시 | 한진토건(주) | <NA> | 영업/정상 | 정상 | <NA> | <NA> | 경기도 화성시 정남면 발안로 **** | 경기도 화성시 정남면 덕절리 **번지 *호 | 18515 | 37.134348 | 127.028854 | 3 | 20041213 | 20061212
162 | 화성시 | 대산개발(주) | <NA> | 영업/정상 | 정상 | 031-354-3322 | <NA> | <NA> | 경기도 화성시 양감면 송산리 ***번지 *호 | 18629 | 37.104479 | 126.990867 | 6 | 20041213 | 20061212
163 | 화성시 | 대성건설(주) | <NA> | 영업/정상 | 정상 | 031-376-4100 | <NA> | <NA> | 경기도 화성시 동탄면 오산리 ***번지 **호 | <NA> | 37.185697 | 127.087278 | 3 | 20041213 | 20061212
164 | 화성시 | 효성건설(주) | <NA> | 영업/정상 | 정상 | <NA> | <NA> | 경기도 화성시 봉담읍 동화역말길 **-** | 경기도 화성시 봉담읍 동화리 ***번지 **호 | 18298 | 37.217907 | 126.966873 | 8 | 20041213 | 20061212