Overview

Dataset statistics

Number of variables: 5
Number of observations: 747
Missing cells: 96
Missing cells (%): 2.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 30.0 KiB
Average record size in memory: 41.2 B
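
The two missing-value percentages in this report (2.6% of all cells, 12.9% of the 전화번호 column) can be cross-checked from the raw counts. The sketch below assumes all 96 missing cells fall in a single column, as the Alerts section suggests; the variable names are illustrative only.

```python
# Counts taken from the report; variable names are illustrative.
n_rows = 747
n_cols = 5
missing_cells = 96

total_cells = n_rows * n_cols  # 3735 cells in the whole table
missing_pct_overall = 100 * missing_cells / total_cells
# Assumption: every missing cell belongs to the same column.
missing_pct_column = 100 * missing_cells / n_rows

print(f"overall: {missing_pct_overall:.1f}%")  # overall: 2.6%
print(f"column: {missing_pct_column:.1f}%")    # column: 12.9%
```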

Variable types

Numeric: 1
Text: 3
Categorical: 1

Dataset

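Description: Data on the status of other water pollution sources in Goyang-si, Gyeonggi-do (경기도_고양시_기타수질오염원현황). It provides the business name (업소명), address (소재지), telephone number (전화번호), business type, and other fields for these sources.
Author: Goyang-si, Gyeonggi-do (경기도 고양시)
URL: https://www.data.go.kr/data/3078985/fileData.do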

Alerts

전화번호 has 96 (12.9%) missing values (Missing)
번호 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 10:42:37.751824
Analysis finished: 2023-12-12 10:42:38.674094
Duration: 0.92 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

번호
Real number (ℝ)

UNIQUE 

Distinct: 747
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 374
Minimum: 1
Maximum: 747
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 38.3
Q1: 187.5
Median: 374
Q3: 560.5
95-th percentile: 709.7
Maximum: 747
Range: 746
Interquartile range (IQR): 373

Descriptive statistics

Standard deviation: 215.78461
Coefficient of variation (CV): 0.57696421
Kurtosis: -1.2
Mean: 374
Median Absolute Deviation (MAD): 187
Skewness: 0
Sum: 279378
Variance: 46563
Monotonicity: Strictly increasing
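
These figures are consistent with 번호 being a plain row counter 1, 2, ..., 747. The sketch below recomputes them with the Python standard library under that assumption. The `quantile` helper is hypothetical and uses linear interpolation, which is assumed (not confirmed by the report) to be the convention behind the percentiles shown.

```python
import statistics

values = list(range(1, 748))  # assumption: 번호 is exactly 1, 2, ..., 747

def quantile(sorted_vals, q):
    """Linear-interpolation quantile (the convention assumed here)."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
stdev = statistics.stdev(values)
median = statistics.median(values)
# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)
iqr = quantile(values, 0.75) - quantile(values, 0.25)

print(sum(values), mean, variance, stdev, stdev / mean)
print(quantile(values, 0.05), quantile(values, 0.95), iqr, mad)
```

For a sequence 1..n the sample variance is n(n+1)/12, which gives 747 × 748 / 12 = 46563 and matches the reported value.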
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | 0.1%
503 | 1 | 0.1%
494 | 1 | 0.1%
495 | 1 | 0.1%
496 | 1 | 0.1%
497 | 1 | 0.1%
498 | 1 | 0.1%
499 | 1 | 0.1%
500 | 1 | 0.1%
501 | 1 | 0.1%
Other values (737) | 737 | 98.7%

Value | Count | Frequency (%)
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 1 | 0.1%
7 | 1 | 0.1%
8 | 1 | 0.1%
9 | 1 | 0.1%
10 | 1 | 0.1%

Value | Count | Frequency (%)
747 | 1 | 0.1%
746 | 1 | 0.1%
745 | 1 | 0.1%
744 | 1 | 0.1%
743 | 1 | 0.1%
742 | 1 | 0.1%
741 | 1 | 0.1%
740 | 1 | 0.1%
739 | 1 | 0.1%
738 | 1 | 0.1%

업소명
Text

Distinct: 731
Distinct (%): 97.9%
Missing: 0
Missing (%): 0.0%
Memory size: 6.0 KiB

Length

Max length: 22
Median length: 19
Mean length: 7.253012
Min length: 2

Characters and Unicode

Total characters: 5418
Distinct characters: 452
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 715
Unique (%): 95.7%

Sample

1st row: 명덕사진관
2nd row: 1,2,3칼라
3rd row: 유한사진실
4th row: 세계사진관
5th row: ㈜한양컨트리클럽
Value | Count | Frequency (%)
안경 | 9 | 1.0%
의료법인 | 6 | 0.7%
안경나라 | 4 | 0.5%
으뜸플러스안경 | 4 | 0.5%
안경박사 | 3 | 0.3%
글라스 | 3 | 0.3%
다비치안경 | 3 | 0.3%
안경원 | 3 | 0.3%
motors | 3 | 0.3%
안경마을 | 3 | 0.3%
Other values (791) | 840 | 95.3%

Most occurring characters

ValueCountFrequency (%)
232
 
4.3%
204
 
3.8%
148
 
2.7%
143
 
2.6%
138
 
2.5%
137
 
2.5%
127
 
2.3%
112
 
2.1%
106
 
2.0%
99
 
1.8%
Other values (442) 3972
73.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4942 | 91.2%
Space Separator | 137 | 2.5%
Uppercase Letter | 114 | 2.1%
Other Symbol | 112 | 2.1%
Decimal Number | 46 | 0.8%
Lowercase Letter | 25 | 0.5%
Close Punctuation | 16 | 0.3%
Open Punctuation | 16 | 0.3%
Other Punctuation | 10 | 0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
232
 
4.7%
204
 
4.1%
148
 
3.0%
143
 
2.9%
138
 
2.8%
127
 
2.6%
106
 
2.1%
99
 
2.0%
99
 
2.0%
89
 
1.8%
Other values (392) 3557
72.0%
Uppercase Letter
ValueCountFrequency (%)
O 15
13.2%
S 12
10.5%
M 12
10.5%
T 10
 
8.8%
R 8
 
7.0%
B 7
 
6.1%
C 7
 
6.1%
A 5
 
4.4%
K 5
 
4.4%
E 5
 
4.4%
Other values (12) 28
24.6%
Lowercase Letter
ValueCountFrequency (%)
o 6
24.0%
g 4
16.0%
a 2
 
8.0%
r 2
 
8.0%
e 2
 
8.0%
n 2
 
8.0%
t 2
 
8.0%
p 1
 
4.0%
d 1
 
4.0%
y 1
 
4.0%
Other values (2) 2
 
8.0%
Decimal Number
ValueCountFrequency (%)
1 16
34.8%
0 10
21.7%
2 5
 
10.9%
5 5
 
10.9%
9 4
 
8.7%
7 3
 
6.5%
3 3
 
6.5%
Other Punctuation
ValueCountFrequency (%)
& 3
30.0%
. 3
30.0%
, 2
20.0%
1
 
10.0%
/ 1
 
10.0%
Space Separator
ValueCountFrequency (%)
137
100.0%
Other Symbol
ValueCountFrequency (%)
112
100.0%
Close Punctuation
ValueCountFrequency (%)
) 16
100.0%
Open Punctuation
ValueCountFrequency (%)
( 16
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 5054
93.3%
Common 225
 
4.2%
Latin 139
 
2.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
232
 
4.6%
204
 
4.0%
148
 
2.9%
143
 
2.8%
138
 
2.7%
127
 
2.5%
112
 
2.2%
106
 
2.1%
99
 
2.0%
99
 
2.0%
Other values (393) 3646
72.1%
Latin
ValueCountFrequency (%)
O 15
 
10.8%
S 12
 
8.6%
M 12
 
8.6%
T 10
 
7.2%
R 8
 
5.8%
B 7
 
5.0%
C 7
 
5.0%
o 6
 
4.3%
A 5
 
3.6%
K 5
 
3.6%
Other values (24) 52
37.4%
Common
ValueCountFrequency (%)
137
60.9%
) 16
 
7.1%
1 16
 
7.1%
( 16
 
7.1%
0 10
 
4.4%
2 5
 
2.2%
5 5
 
2.2%
9 4
 
1.8%
& 3
 
1.3%
7 3
 
1.3%
Other values (5) 10
 
4.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 4942
91.2%
ASCII 363
 
6.7%
None 113
 
2.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
232
 
4.7%
204
 
4.1%
148
 
3.0%
143
 
2.9%
138
 
2.8%
127
 
2.6%
106
 
2.1%
99
 
2.0%
99
 
2.0%
89
 
1.8%
Other values (392) 3557
72.0%
ASCII
ValueCountFrequency (%)
137
37.7%
) 16
 
4.4%
1 16
 
4.4%
( 16
 
4.4%
O 15
 
4.1%
S 12
 
3.3%
M 12
 
3.3%
T 10
 
2.8%
0 10
 
2.8%
R 8
 
2.2%
Other values (38) 111
30.6%
None
ValueCountFrequency (%)
112
99.1%
1
 
0.9%

소재지
Text

Distinct: 705
Distinct (%): 94.4%
Missing: 0
Missing (%): 0.0%
Memory size: 6.0 KiB

Length

Max length: 39
Median length: 35
Mean length: 22.261044
Min length: 14

Characters and Unicode

Total characters: 16629
Distinct characters: 187
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 668
Unique (%): 89.4%

Sample

1st row: 고양시 일산서구 원일로 66  (일산동)
2nd row: 고양시 일산서구 일중로79번길 6-6 (일산동)
3rd row: 고양시 덕양구 혜음로 23 (고양동)
4th row: 고양시 덕양구 통일로 763 (관산동)
5th row: 고양시 덕양구 고양대로1643번길 164 (원당동)
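
The sample rows suggest a loose pattern of city, district (구), road address, and an optional neighbourhood (동) in parentheses, with inconsistent spacing. The sketch below shows one possible way to normalise whitespace and split an address into those parts. `ADDRESS_RE` and `parse_address` are hypothetical names, and the pattern is an assumption inferred from the five sample rows, not a rule documented for the dataset.

```python
import re

# Hypothetical pattern: "고양시 <구> <road address> (<동>)".
# The trailing "(동)" part is optional because some rows omit it.
ADDRESS_RE = re.compile(
    r"^고양시 (?P<gu>\S+구) (?P<road>.*?)\s*(?:\((?P<dong>[^)]+)\))?$"
)

def parse_address(raw):
    """Return a dict with gu, road and dong, or None if the pattern fails."""
    # str.split() with no arguments splits on any Unicode whitespace,
    # so doubled spaces and non-ASCII spaces collapse to single spaces.
    cleaned = " ".join(raw.split())
    match = ADDRESS_RE.match(cleaned)
    return match.groupdict() if match else None

print(parse_address("고양시 일산서구 원일로 66  (일산동) "))
print(parse_address("고양시 덕양구 호국로 840"))
```

Rows that do not fit the assumed pattern return `None`, so they can be reviewed by hand rather than silently mangled.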
Value | Count | Frequency (%)
고양시 | 747 | 20.8%
일산동구 | 287 | 8.0%
덕양구 | 232 | 6.5%
일산서구 | 227 | 6.3%
중앙로 | 80 | 2.2%
일산로 | 42 | 1.2%
덕이동 | 35 | 1.0%
장항동 | 34 | 0.9%
행신동 | 32 | 0.9%
고양대로 | 30 | 0.8%
Other values (839) | 1846 | 51.4%

Most occurring characters

ValueCountFrequency (%)
2885
17.3%
1029
 
6.2%
829
 
5.0%
810
 
4.9%
757
 
4.6%
748
 
4.5%
709
 
4.3%
682
 
4.1%
650
 
3.9%
1 613
 
3.7%
Other values (177) 6917
41.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 9555 | 57.5%
Space Separator | 2924 | 17.6%
Decimal Number | 2854 | 17.2%
Open Punctuation | 505 | 3.0%
Close Punctuation | 505 | 3.0%
Dash Punctuation | 149 | 0.9%
Other Punctuation | 126 | 0.8%
Uppercase Letter | 9 | 0.1%
Lowercase Letter | 1 | < 0.1%
Math Symbol | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1029
 
10.8%
829
 
8.7%
810
 
8.5%
757
 
7.9%
748
 
7.8%
709
 
7.4%
682
 
7.1%
650
 
6.8%
304
 
3.2%
233
 
2.4%
Other values (155) 2804
29.3%
Decimal Number
ValueCountFrequency (%)
1 613
21.5%
2 381
13.3%
3 308
10.8%
4 284
10.0%
5 275
9.6%
6 221
 
7.7%
7 217
 
7.6%
0 210
 
7.4%
8 184
 
6.4%
9 161
 
5.6%
Uppercase Letter
ValueCountFrequency (%)
B 5
55.6%
A 3
33.3%
E 1
 
11.1%
Space Separator
ValueCountFrequency (%)
2885
98.7%
  39
 
1.3%
Other Punctuation
ValueCountFrequency (%)
, 121
96.0%
. 5
 
4.0%
Open Punctuation
ValueCountFrequency (%)
( 505
100.0%
Close Punctuation
ValueCountFrequency (%)
) 505
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 149
100.0%
Lowercase Letter
ValueCountFrequency (%)
b 1
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 9555
57.5%
Common 7064
42.5%
Latin 10
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1029
 
10.8%
829
 
8.7%
810
 
8.5%
757
 
7.9%
748
 
7.8%
709
 
7.4%
682
 
7.1%
650
 
6.8%
304
 
3.2%
233
 
2.4%
Other values (155) 2804
29.3%
Common
ValueCountFrequency (%)
2885
40.8%
1 613
 
8.7%
( 505
 
7.1%
) 505
 
7.1%
2 381
 
5.4%
3 308
 
4.4%
4 284
 
4.0%
5 275
 
3.9%
6 221
 
3.1%
7 217
 
3.1%
Other values (8) 870
 
12.3%
Latin
ValueCountFrequency (%)
B 5
50.0%
A 3
30.0%
E 1
 
10.0%
b 1
 
10.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 9555
57.5%
ASCII 7035
42.3%
None 39
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2885
41.0%
1 613
 
8.7%
( 505
 
7.2%
) 505
 
7.2%
2 381
 
5.4%
3 308
 
4.4%
4 284
 
4.0%
5 275
 
3.9%
6 221
 
3.1%
7 217
 
3.1%
Other values (11) 841
 
12.0%
Hangul
ValueCountFrequency (%)
1029
 
10.8%
829
 
8.7%
810
 
8.5%
757
 
7.9%
748
 
7.8%
709
 
7.4%
682
 
7.1%
650
 
6.8%
304
 
3.2%
233
 
2.4%
Other values (155) 2804
29.3%
None
ValueCountFrequency (%)
  39
100.0%

전화번호
Text

MISSING 

Distinct: 631
Distinct (%): 96.9%
Missing: 96
Missing (%): 12.9%
Memory size: 6.0 KiB

Length

Max length: 26
Median length: 12
Mean length: 12
Min length: 9

Characters and Unicode

Total characters: 7812
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 613
Unique (%): 94.2%

Sample

1st row: 031-975-2373
2nd row: 031-976-7171
3rd row: 031-962-9151
4th row: 031-962-0332
5th row: 031-969-0810
Value | Count | Frequency (%)
031-971-5101 | 3 | 0.5%
031-970-1194 | 3 | 0.5%
031-910-1172 | 2 | 0.3%
031-978-3100 | 2 | 0.3%
031-972-8222 | 2 | 0.3%
031-975-2677 | 2 | 0.3%
031-967-5432 | 2 | 0.3%
031-932-7755 | 2 | 0.3%
031-905-1693 | 2 | 0.3%
031-912-8292 | 2 | 0.3%
Other values (623) | 631 | 96.6%

Most occurring characters

ValueCountFrequency (%)
- 1301
16.7%
0 1202
15.4%
1 1151
14.7%
3 950
12.2%
9 850
10.9%
7 578
7.4%
2 411
 
5.3%
5 385
 
4.9%
8 375
 
4.8%
6 359
 
4.6%
Other values (3) 250
 
3.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 6507 | 83.3%
Dash Punctuation | 1301 | 16.7%
Other Punctuation | 2 | < 0.1%
Space Separator | 2 | < 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1202
18.5%
1 1151
17.7%
3 950
14.6%
9 850
13.1%
7 578
8.9%
2 411
 
6.3%
5 385
 
5.9%
8 375
 
5.8%
6 359
 
5.5%
4 246
 
3.8%
Dash Punctuation
ValueCountFrequency (%)
- 1301
100.0%
Other Punctuation
ValueCountFrequency (%)
, 2
100.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 7812
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 1301
16.7%
0 1202
15.4%
1 1151
14.7%
3 950
12.2%
9 850
10.9%
7 578
7.4%
2 411
 
5.3%
5 385
 
4.9%
8 375
 
4.8%
6 359
 
4.6%
Other values (3) 250
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7812
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 1301
16.7%
0 1202
15.4%
1 1151
14.7%
3 950
12.2%
9 850
10.9%
7 578
7.4%
2 411
 
5.3%
5 385
 
4.9%
8 375
 
4.8%
6 359
 
4.6%
Other values (3) 250
 
3.2%

시설종류
Categorical

Distinct: 8
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.0 KiB

운수장비정비: 239
안경원: 167
X-ray시설: 121
X-Ray시설: 104
사진처리시설: 99
Other values (3): 17

Length

Max length: 7
Median length: 6
Mean length: 5.5983936
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 사진처리시설
2nd row: 사진처리시설
3rd row: 사진처리시설
4th row: 사진처리시설
5th row: 골프장

Common Values

Value | Count | Frequency (%)
운수장비정비 | 239 | 32.0%
안경원 | 167 | 22.4%
X-ray시설 | 121 | 16.2%
X-Ray시설 | 104 | 13.9%
사진처리시설 | 99 | 13.3%
폐차장 시설 | 9 | 1.2%
골프장 | 4 | 0.5%
양어장 | 4 | 0.5%
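
The categories "X-ray시설" and "X-Ray시설" differ only in letter case, which splits what is probably one facility type into two. The sketch below merges them with a case-insensitive key; treating the two spellings as the same category is an assumption, and `normalize` is an illustrative helper, not part of the profiling tool.

```python
from collections import Counter

# Counts copied from the Common Values table.
raw_counts = {
    "운수장비정비": 239,
    "안경원": 167,
    "X-ray시설": 121,
    "X-Ray시설": 104,
    "사진처리시설": 99,
    "폐차장 시설": 9,
    "골프장": 4,
    "양어장": 4,
}

def normalize(label):
    # Assumption: labels that differ only in letter case are one category.
    return label.casefold()

merged = Counter()
for label, count in raw_counts.items():
    merged[normalize(label)] += count

print(merged.most_common(3))
# [('운수장비정비', 239), ('x-ray시설', 225), ('안경원', 167)]
```

After merging, the X-ray category has 225 rows (121 + 104) and becomes the second largest, leaving 7 distinct categories instead of 8.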

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
운수장비정비 | 239 | 31.6%
x-ray시설 | 225 | 29.8%
안경원 | 167 | 22.1%
사진처리시설 | 99 | 13.1%
폐차장 | 9 | 1.2%
시설 | 9 | 1.2%
골프장 | 4 | 0.5%
양어장 | 4 | 0.5%

Interactions


Correlations

 | 번호 | 시설종류
번호 | 1.000 | 0.727
시설종류 | 0.727 | 1.000

 | 번호 | 시설종류
번호 | 1.000 | 0.458
시설종류 | 0.458 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 번호 | 업소명 | 소재지 | 전화번호 | 시설종류
0 | 1 | 명덕사진관 | 고양시 일산서구 원일로 66  (일산동) | 031-975-2373 | 사진처리시설
1 | 2 | 1,2,3칼라 | 고양시 일산서구 일중로79번길 6-6 (일산동) | 031-976-7171 | 사진처리시설
2 | 3 | 유한사진실 | 고양시 덕양구 혜음로 23 (고양동) | 031-962-9151 | 사진처리시설
3 | 4 | 세계사진관 | 고양시 덕양구 통일로 763 (관산동) | 031-962-0332 | 사진처리시설
4 | 5 | ㈜한양컨트리클럽 | 고양시 덕양구 고양대로1643번길 164 (원당동) | 031-969-0810 | 골프장
5 | 6 | 아카데미 17분칼라현상소 | 고양시 덕양구 원당로 55(주교동) | 031-966-3067 | 사진처리시설
6 | 7 | 훼미리스튜디오 | 고양시 덕양구 호국로789번길 19 (주교동) | 031-965-0385 | 사진처리시설
7 | 8 | 미화칼라 | 고양시 덕양구 호국로 840 | 031-964-0164 | 사진처리시설
8 | 9 | 77칼라 | 고양시 덕양구 토당로104번길 48-22 (토당동) | 031-970-8787 | 사진처리시설
9 | 10 | 허리우드사진관 | 고양시 덕양구 혜음로 65(고양동) | 031-963-9923 | 사진처리시설

Index | 번호 | 업소명 | 소재지 | 전화번호 | 시설종류
737 | 738 | 모토니티자동차공업사 | 고양시 일산서구 덕산로173번길 85-13 | 031-924-0681 | 운수장비정비
738 | 739 | 안경진정성 행신점 | 고양시 덕양구 중앙로 462, 1층 | 031-978-6001 | 안경원
739 | 740 | ㈜리플랜자동차 | 고양시 일산동구 성현로 141, 1동 | <NA> | 운수장비정비
740 | 741 | 대승자동차정비 | 고양시 일산동구 장항로 297-30 | <NA> | 운수장비정비
741 | 742 | 으뜸플러스안경 일산풍동점 | 고양시 일산동구 숲속마을로 26, 202호 | 031-901-1213 | 안경원
742 | 743 | ㈜선진운수 | 고양시 일산서구 덕산로 174 | 031-923-8921, 02-355-5855 | 운수장비정비
743 | 744 | 으뜸50안경 킨텍스점 | 고양시 일산서구 호수로 817, 레이킨스몰 223호 | 031-922-1550 | 안경원
744 | 745 | 프로(PRO) MOTORS | 고양시 일산동구 장진천길82번길 54 | 031-911-4972 | 운수장비정비
745 | 746 | 아주오토리움㈜ | 고양시 일산동구 은행마을로 46-43 | <NA> | 운수장비정비
746 | 747 | 으뜸 플러스 | 고양시 덕양구 동송로 70 판매시설동 201호 | 02-2266-5503 | 안경원
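
The sample rows show that a 전화번호 cell can be missing (`<NA>`) or hold two comma-separated numbers. The sketch below shows one way to split such cells into individual numbers. `PHONE_RE` and `split_phones` are hypothetical names, and the area-code pattern is an assumption based on the values visible in this report, so numbers in other formats would be dropped by it.

```python
import re

# Assumed format: area code, 3-4 digit exchange, 4 digit line number,
# e.g. 031-975-2373 or 02-2266-5503.
PHONE_RE = re.compile(r"^0\d{1,2}-\d{3,4}-\d{4}$")

def split_phones(cell):
    """Return the phone numbers found in one 전화번호 cell."""
    if cell is None or cell == "<NA>":
        return []  # missing value
    parts = [part.strip() for part in cell.split(",")]
    # Keep only the parts that match the assumed format.
    return [part for part in parts if PHONE_RE.match(part)]

print(split_phones("031-923-8921, 02-355-5855"))
# ['031-923-8921', '02-355-5855']
print(split_phones("<NA>"))
# []
```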