Overview

Dataset statistics

Number of variables: 13
Number of observations: 1231
Missing cells: 1945
Missing cells (%): 12.2%
Duplicate rows: 1
Duplicate rows (%): 0.1%
Total size in memory: 131.2 KiB
Average record size in memory: 109.1 B
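
The overview figures can be cross-checked against the per-column missing counts listed under Alerts. The following is a minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (variables × observations) and the average record size is the total memory divided by the row count; the variable names are illustrative and not part of the report.

```python
# Figures copied from the report's overview and its "Missing" alerts.
n_variables = 13
n_observations = 1231
missing_per_column = {
    "처분대상여부": 17,
    "점검사항": 15,
    "점검결과": 1231,
    "소재지도로명주소": 606,
    "소재지주소": 76,
}

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_cells = sum(missing_per_column.values())
missing_pct = 100 * missing_cells / total_cells

# Assumption: average record size = total memory (131.2 KiB) / row count.
avg_record_bytes = 131.2 * 1024 / n_observations

print(missing_cells, round(missing_pct, 1), round(avg_record_bytes, 1))
```

Under these assumptions the per-column counts add up to the reported 1945 missing cells, and the two derived figures round to the reported 12.2% and 109.1 B.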

Variable types

Text: 4
Numeric: 2
Categorical: 5
Boolean: 1
Unsupported: 1

Dataset

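Description: 업체(시설)명 (business/facility name), 인허가번호 (permit number), 업종코드 (business type code), 업종명 (business type name), 지도점검일자 (inspection date), 점검기관 (inspecting agency code), 점검기관명 (inspecting agency name), 지도점검구분 (inspection type), 처분대상여부 (whether subject to administrative action), 점검사항 (inspection items), 점검결과 (inspection result), 소재지도로명주소 (road-name address), 소재지주소 (lot-number address)
Author: 구로구 (Guro-gu)
URL: https://data.seoul.go.kr/dataList/OA-2502/S/1/datasetView.do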

Alerts

점검기관 has constant value "3160000" [Constant]
점검기관명 has constant value "서울특별시 구로구" [Constant]
Dataset has 1 (0.1%) duplicate row [Duplicates]
업종명 is highly overall correlated with 인허가번호 and 1 other field [High correlation]
업종코드 is highly overall correlated with 인허가번호 and 1 other field [High correlation]
인허가번호 is highly overall correlated with 업종코드 and 1 other field [High correlation]
업종코드 is highly imbalanced (53.5%) [Imbalance]
업종명 is highly imbalanced (55.1%) [Imbalance]
지도점검구분 is highly imbalanced (64.1%) [Imbalance]
처분대상여부 is highly imbalanced (90.9%) [Imbalance]
처분대상여부 has 17 (1.4%) missing values [Missing]
점검사항 has 15 (1.2%) missing values [Missing]
점검결과 has 1231 (100.0%) missing values [Missing]
소재지도로명주소 has 606 (49.2%) missing values [Missing]
소재지주소 has 76 (6.2%) missing values [Missing]
점검결과 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
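
The imbalance percentages in the alerts above can be reproduced from the value counts reported later for each variable. The sketch below assumes the score is one minus the Shannon entropy of the class distribution divided by the logarithm of the number of classes; that assumption matches the reported figures, but the function name and the handling of single-class input are choices made for this example only.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k) for a list of class counts.

    0.0 means perfectly balanced classes, values near 1.0 mean that one
    class dominates. Missing values are assumed to be excluded beforehand.
    """
    k = len(counts)
    if k < 2:
        # Choice made for this sketch: a single class has no imbalance to score.
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)


# 처분대상여부: False = 1200, True = 14 (the 17 missing rows are left out).
print(round(100 * imbalance_score([1200, 14]), 1))
```

With the 처분대상여부 counts this gives roughly 90.9%, and with the 지도점검구분 counts (1077, 90, 38, 26) roughly 64.1%, in line with the alerts.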

Reproduction

Analysis started: 2024-05-11 06:56:30.995990
Analysis finished: 2024-05-11 06:56:34.035792
Duration: 3.04 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업체(시설)명
Text

Distinct: 315
Distinct (%): 25.6%
Missing: 0
Missing (%): 0.0%
Memory size: 9.7 KiB

Length

Max length: 28
Median length: 20
Mean length: 7.1754671
Min length: 2

Characters and Unicode

Total characters: 8833
Distinct characters: 315
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
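
As an illustration of those character properties, the general category of each character can be looked up with Python's standard `unicodedata` module. This is a small sketch using the second sample row of this variable; it covers only the general category (the "Most occurring categories" view), since script and block lookups are not available in the standard library. The helper name is illustrative.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters by Unicode general category (two-letter codes)."""
    return Counter(unicodedata.category(ch) for ch in text)


sample = "서울석유(주) 풀페이주유소"  # 2nd sample row of 업체(시설)명
counts = category_counts(sample)

# Lo = Other Letter, Ps = Open Punctuation,
# Pe = Close Punctuation, Zs = Space Separator
for code, n in counts.most_common():
    print(code, n)
```

Hangul syllables fall under "Other Letter" (Lo), which is why that category dominates the tables for the Korean text variables in this report.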

Unique

Unique: 92
Unique (%): 7.5%

Sample

1st row삼호세차장
2nd row서울석유(주) 풀페이주유소
3rd row아주지오텍(주)-신부평~영서2구간 전력구 공사
4th row소낙스
5th row구로성심병원별관
Value | Count | Frequency (%)
성금사 | 24 | 1.8%
대한특수연마도금 | 21 | 1.6%
덕영특수금속 | 19 | 1.4%
대광특수 | 17 | 1.3%
명원금속 | 16 | 1.2%
한일시멘트(주)영등포공장 | 16 | 1.2%
동광금속 | 15 | 1.1%
현대그랜드 | 15 | 1.1%
태성특수도금 | 15 | 1.1%
구로그린주유소 | 14 | 1.1%
Other values (331) | 1158 | 87.1%

Most occurring characters

ValueCountFrequency (%)
480
 
5.4%
( 363
 
4.1%
) 363
 
4.1%
292
 
3.3%
236
 
2.7%
217
 
2.5%
208
 
2.4%
201
 
2.3%
186
 
2.1%
184
 
2.1%
Other values (305) 6103
69.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 7902
89.5%
Open Punctuation 367
 
4.2%
Close Punctuation 367
 
4.2%
Space Separator 99
 
1.1%
Uppercase Letter 46
 
0.5%
Dash Punctuation 14
 
0.2%
Lowercase Letter 12
 
0.1%
Decimal Number 9
 
0.1%
Other Symbol 7
 
0.1%
Other Punctuation 6
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
480
 
6.1%
292
 
3.7%
236
 
3.0%
217
 
2.7%
208
 
2.6%
201
 
2.5%
186
 
2.4%
184
 
2.3%
169
 
2.1%
159
 
2.0%
Other values (278) 5570
70.5%
Uppercase Letter
ValueCountFrequency (%)
S 10
21.7%
K 9
19.6%
I 7
15.2%
C 4
 
8.7%
D 4
 
8.7%
T 3
 
6.5%
F 3
 
6.5%
E 2
 
4.3%
P 2
 
4.3%
J 1
 
2.2%
Lowercase Letter
ValueCountFrequency (%)
e 3
25.0%
s 3
25.0%
l 3
25.0%
f 3
25.0%
Decimal Number
ValueCountFrequency (%)
2 4
44.4%
1 4
44.4%
5 1
 
11.1%
Open Punctuation
ValueCountFrequency (%)
( 363
98.9%
[ 4
 
1.1%
Close Punctuation
ValueCountFrequency (%)
) 363
98.9%
] 4
 
1.1%
Space Separator
ValueCountFrequency (%)
99
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 14
100.0%
Other Symbol
ValueCountFrequency (%)
7
100.0%
Other Punctuation
ValueCountFrequency (%)
& 6
100.0%
Math Symbol
ValueCountFrequency (%)
~ 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 7909
89.5%
Common 866
 
9.8%
Latin 58
 
0.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
480
 
6.1%
292
 
3.7%
236
 
3.0%
217
 
2.7%
208
 
2.6%
201
 
2.5%
186
 
2.4%
184
 
2.3%
169
 
2.1%
159
 
2.0%
Other values (279) 5577
70.5%
Latin
ValueCountFrequency (%)
S 10
17.2%
K 9
15.5%
I 7
12.1%
C 4
 
6.9%
D 4
 
6.9%
e 3
 
5.2%
s 3
 
5.2%
l 3
 
5.2%
T 3
 
5.2%
f 3
 
5.2%
Other values (5) 9
15.5%
Common
ValueCountFrequency (%)
( 363
41.9%
) 363
41.9%
99
 
11.4%
- 14
 
1.6%
& 6
 
0.7%
[ 4
 
0.5%
] 4
 
0.5%
2 4
 
0.5%
~ 4
 
0.5%
1 4
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 7902
89.5%
ASCII 924
 
10.5%
None 7
 
0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
480
 
6.1%
292
 
3.7%
236
 
3.0%
217
 
2.7%
208
 
2.6%
201
 
2.5%
186
 
2.4%
184
 
2.3%
169
 
2.1%
159
 
2.0%
Other values (278) 5570
70.5%
ASCII
ValueCountFrequency (%)
( 363
39.3%
) 363
39.3%
99
 
10.7%
- 14
 
1.5%
S 10
 
1.1%
K 9
 
1.0%
I 7
 
0.8%
& 6
 
0.6%
[ 4
 
0.4%
] 4
 
0.4%
Other values (16) 45
 
4.9%
None
ValueCountFrequency (%)
7
100.0%

인허가번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 371
Distinct (%): 30.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.1600002 × 10^17
Minimum: 3.1600002 × 10^17
Maximum: 3.1600004 × 10^17
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.9 KiB

Quantile statistics

Minimum: 3.1600002 × 10^17
5-th percentile: 3.1600002 × 10^17
Q1: 3.1600002 × 10^17
Median: 3.1600002 × 10^17
Q3: 3.1600002 × 10^17
95-th percentile: 3.1600002 × 10^17
Maximum: 3.1600004 × 10^17
Range: 2.10031 × 10^10
Interquartile range (IQR): 9.9989997 × 10^8

Descriptive statistics

Standard deviation: 9.7723194 × 10^8
Coefficient of variation (CV): 3.0925059 × 10^-9
Kurtosis: 312.06105
Mean: 3.1600002 × 10^17
Median Absolute Deviation (MAD): 2600000
Skewness: 15.011218
Sum: 1.6144012 × 10^18
Variance: 9.5498227 × 10^17
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
316000022199300022 | 14 | 1.1%
316000021200900004 | 13 | 1.1%
316000021200600024 | 12 | 1.0%
316000022200000250 | 12 | 1.0%
316000022200600007 | 11 | 0.9%
316000022200900007 | 11 | 0.9%
316000021200500024 | 10 | 0.8%
316000022200400003 | 9 | 0.7%
316000022200600031 | 9 | 0.7%
316000022200500042 | 9 | 0.7%
Other values (361) | 1121 | 91.1%

Minimum 10 values

Value | Count | Frequency (%)
316000021197200032 | 5 | 0.4%
316000021197600021 | 3 | 0.2%
316000021198400027 | 4 | 0.3%
316000021198800006 | 1 | 0.1%
316000021198900002 | 3 | 0.2%
316000021198900010 | 4 | 0.3%
316000021198900020 | 5 | 0.4%
316000021199200004 | 4 | 0.3%
316000021199300017 | 1 | 0.1%
316000021199500013 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
316000042200300034 | 1 | 0.1%
316000042200000074 | 1 | 0.1%
316000025200600003 | 1 | 0.1%
316000023201000001 | 1 | 0.1%
316000023200200002 | 1 | 0.1%
316000023200100002 | 1 | 0.1%
316000023199700057 | 1 | 0.1%
316000023198900017 | 1 | 0.1%
316000023198000001 | 1 | 0.1%
316000023197900006 | 1 | 0.1%
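
Although the profiler typed 인허가번호 as a real number, the values are 18-digit identifiers rather than quantities, so statistics such as the mean or sum above carry little meaning. The sketch below shows two points under stated assumptions: an 18-digit integer does not survive a round trip through a 64-bit float, and the digits appear to embed other columns. The field layout used here (7-digit agency code, 2-digit 업종코드, 4-digit year, 5-digit serial) is a guess inferred from the values in this report, not a documented format, and `split_permit_id` is a hypothetical helper.

```python
def split_permit_id(permit_id: str) -> dict:
    """Split an 18-digit permit number into ASSUMED fields (layout is a guess)."""
    if len(permit_id) != 18 or not permit_id.isdigit():
        raise ValueError(f"expected 18 digits, got {permit_id!r}")
    return {
        "agency": permit_id[:7],       # matches the constant 점검기관 value
        "type_code": permit_id[7:9],   # matches 업종코드 in the sample rows
        "year": permit_id[9:13],       # assumed to be a year
        "serial": permit_id[13:],      # assumed to be a running number
    }


raw = 316000022199300022  # most common value in the table above

# float64 has a 53-bit significand, so nearby 18-digit integers collapse.
round_tripped = int(float(raw))
precision_lost = round_tripped != raw

print(split_permit_id(str(raw)))
print(precision_lost)
```

If the layout guess holds, it would explain the near-perfect correlation reported between 인허가번호 and 업종코드. Reading the column as a string instead of a number avoids the precision issue.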

업종코드
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct5
Distinct (%)0.4%
Missing0
Missing (%)0.0%
Memory size9.7 KiB

Length

Max length2
Median length2
Mean length2
Min length2

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row22
2nd row22
3rd row22
4th row22
5th row22

Common Values

Value | Count | Frequency (%)
21 | 630 | 51.2%
22 | 589 | 47.8%
23 | 9 | 0.7%
42 | 2 | 0.2%
25 | 1 | 0.1%

Length

Histogram of lengths of the category (figure not reproduced)

Common Values (Plot)

(figure not reproduced)

업종명
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct6
Distinct (%)0.5%
Missing0
Missing (%)0.0%
Memory size9.7 KiB

Length

Max length9
Median length8
Mean length7.9374492
Min length4

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row폐수배출업소관리
2nd row폐수배출업소관리
3rd row폐수배출업소관리
4th row폐수배출업소관리
5th row폐수배출업소관리

Common Values

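Value | Count | Frequency (%)
대기배출업소관리 (air-emission facility management) | 625 | 50.8%
폐수배출업소관리 (wastewater-discharge facility management) | 579 | 47.0%
<NA> | 15 | 1.2%
소음진동관리 (noise and vibration management) | 9 | 0.7%
유독물판매업관리 (toxic-substance sales management) | 2 | 0.2%
기타수질오염원관리 (other water-pollution source management) | 1 | 0.1%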

Length

Histogram of lengths of the category (figure not reproduced)

Common Values (Plot)

(figure not reproduced)

지도점검일자
Real number (ℝ)

Distinct281
Distinct (%)22.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean20134206
Minimum20100127
Maximum20171130
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size10.9 KiB

Quantile statistics

Minimum20100127
5-th percentile20100704
Q120110926
median20130628
Q320160421
95-th percentile20170918
Maximum20171130
Range71003
Interquartile range (IQR)49495

Descriptive statistics

Standard deviation: 23795.578
Coefficient of variation (CV): 0.0011818483
Kurtosis: -1.3405838
Mean: 20134206
Median Absolute Deviation (MAD): 20020
Skewness: 0.087440585
Sum: 2.4785208 × 10^10
Variance: 5.6622953 × 10^8
Monotonicity: Decreasing
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
20121126 | 16 | 1.3%
20120622 | 15 | 1.2%
20120625 | 13 | 1.1%
20160616 | 12 | 1.0%
20160421 | 12 | 1.0%
20100723 | 12 | 1.0%
20160429 | 11 | 0.9%
20140919 | 11 | 0.9%
20150304 | 11 | 0.9%
20101217 | 10 | 0.8%
Other values (271) | 1108 | 90.0%

Minimum 10 values

Value | Count | Frequency (%)
20100127 | 2 | 0.2%
20100212 | 1 | 0.1%
20100218 | 3 | 0.2%
20100224 | 1 | 0.1%
20100302 | 1 | 0.1%
20100316 | 2 | 0.2%
20100330 | 1 | 0.1%
20100401 | 1 | 0.1%
20100402 | 1 | 0.1%
20100405 | 2 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
20171130 | 3 | 0.2%
20171128 | 4 | 0.3%
20171114 | 3 | 0.2%
20171113 | 4 | 0.3%
20171110 | 2 | 0.2%
20171109 | 2 | 0.2%
20171027 | 2 | 0.2%
20171026 | 3 | 0.2%
20170929 | 4 | 0.3%
20170928 | 4 | 0.3%
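
The values of 지도점검일자 look like calendar dates written as YYYYMMDD integers, which would make the numeric statistics above (mean, range, standard deviation) hard to interpret as they stand. The following is a minimal sketch, assuming that encoding holds for every row, of converting such integers with the standard library; `parse_yyyymmdd` is a hypothetical helper name.

```python
from datetime import date, datetime


def parse_yyyymmdd(value: int) -> date:
    """Convert an integer such as 20171130 into a date (assumes YYYYMMDD)."""
    return datetime.strptime(str(value), "%Y%m%d").date()


first = parse_yyyymmdd(20100127)  # reported minimum
last = parse_yyyymmdd(20171130)   # reported maximum
span_days = (last - first).days

print(first.isoformat(), last.isoformat(), span_days)
```

Read this way, the data covers inspections from late January 2010 to the end of November 2017, whereas the reported numeric range of 71003 is only the difference between the two integers.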

점검기관
Categorical

CONSTANT 

Distinct1
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size9.7 KiB

Length

Max length7
Median length7
Mean length7
Min length7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row3160000
2nd row3160000
3rd row3160000
4th row3160000
5th row3160000

Common Values

Value | Count | Frequency (%)
3160000 | 1231 | 100.0%

Length

Histogram of lengths of the category (figure not reproduced)

Common Values (Plot)

(figure not reproduced)

점검기관명
Categorical

CONSTANT 

Distinct1
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size9.7 KiB

Length

Max length9
Median length9
Mean length9
Min length9

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row서울특별시 구로구
2nd row서울특별시 구로구
3rd row서울특별시 구로구
4th row서울특별시 구로구
5th row서울특별시 구로구

Common Values

Value | Count | Frequency (%)
서울특별시 구로구 | 1231 | 100.0%

Length

Histogram of lengths of the category (figure not reproduced)

Common Values (Plot)

(figure not reproduced)

지도점검구분
Categorical

IMBALANCE 

Distinct4
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size9.7 KiB

Length

Max length2
Median length2
Mean length2
Min length2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row정기
2nd row정기
3rd row정기
4th row정기
5th row정기

Common Values

Value | Count | Frequency (%)
정기 (regular) | 1077 | 87.5%
합동 (joint) | 90 | 7.3%
수시 (ad hoc) | 38 | 3.1%
기타 (other) | 26 | 2.1%

Length

Histogram of lengths of the category (figure not reproduced)

Common Values (Plot)

(figure not reproduced)

처분대상여부
Boolean

IMBALANCE  MISSING 

Distinct2
Distinct (%)0.2%
Missing17
Missing (%)1.4%
Memory size2.5 KiB
Value | Count | Frequency (%)
False | 1200 | 97.5%
True | 14 | 1.1%
(Missing) | 17 | 1.4%

점검사항
Text

MISSING 

Distinct139
Distinct (%)11.4%
Missing15
Missing (%)1.2%
Memory size9.7 KiB

Length

Max length38
Median length34
Mean length16.860197
Min length2

Characters and Unicode

Total characters: 20502
Distinct characters: 88
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 81
Unique (%): 6.7%

Sample

1st row폐수
2nd row폐수
3rd row폐수
4th row폐수
5th row폐수
ValueCountFrequency (%)
940
17.1%
방지시설 913
16.6%
배출시설 857
15.6%
여부 697
12.7%
305
 
5.6%
정상가동 261
 
4.8%
적정 181
 
3.3%
정상운영 125
 
2.3%
적정운영 117
 
2.1%
대기 104
 
1.9%
Other values (76) 987
18.0%

Most occurring characters

ValueCountFrequency (%)
4271
20.8%
1960
 
9.6%
1959
 
9.6%
998
 
4.9%
998
 
4.9%
998
 
4.9%
986
 
4.8%
964
 
4.7%
811
 
4.0%
810
 
4.0%
Other values (78) 5747
28.0%

Most occurring categories

ValueCountFrequency (%)
Other Letter 16214
79.1%
Space Separator 4271
 
20.8%
Other Punctuation 13
 
0.1%
Close Punctuation 2
 
< 0.1%
Open Punctuation 2
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1960
 
12.1%
1959
 
12.1%
998
 
6.2%
998
 
6.2%
998
 
6.2%
986
 
6.1%
964
 
5.9%
811
 
5.0%
810
 
5.0%
787
 
4.9%
Other values (74) 4943
30.5%
Space Separator
ValueCountFrequency (%)
4271
100.0%
Other Punctuation
ValueCountFrequency (%)
, 13
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 16214
79.1%
Common 4288
 
20.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1960
 
12.1%
1959
 
12.1%
998
 
6.2%
998
 
6.2%
998
 
6.2%
986
 
6.1%
964
 
5.9%
811
 
5.0%
810
 
5.0%
787
 
4.9%
Other values (74) 4943
30.5%
Common
ValueCountFrequency (%)
4271
99.6%
, 13
 
0.3%
) 2
 
< 0.1%
( 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 16214
79.1%
ASCII 4288
 
20.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4271
99.6%
, 13
 
0.3%
) 2
 
< 0.1%
( 2
 
< 0.1%
Hangul
ValueCountFrequency (%)
1960
 
12.1%
1959
 
12.1%
998
 
6.2%
998
 
6.2%
998
 
6.2%
986
 
6.1%
964
 
5.9%
811
 
5.0%
810
 
5.0%
787
 
4.9%
Other values (74) 4943
30.5%

점검결과
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 1231
Missing (%): 100.0%
Memory size: 10.9 KiB

소재지도로명주소
Text

MISSING

Distinct: 174
Distinct (%): 27.8%
Missing: 606
Missing (%): 49.2%
Memory size: 9.7 KiB

Length

Max length67
Median length51
Mean length27.8896
Min length21

Characters and Unicode

Total characters: 17431
Distinct characters: 173
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 38
Unique (%): 6.1%

Sample

1st row서울특별시 구로구 중앙로 93 (고척동)
2nd row서울특별시 구로구 경인로 41 (온수동)
3rd row서울특별시 구로구 경인로 15-20 (온수동)
4th row서울특별시 구로구 가마산로 91-30 (구로동)
5th row서울특별시 구로구 중앙로 10-15 (고척동)
Value | Count | Frequency (%)
서울특별시 | 625 | 19.0%
구로구 | 625 | 19.0%
구로동 | 311 | 9.4%
신도림동 | 129 | 3.9%
경인로 | 72 | 2.2%
온수동 | 58 | 1.8%
경인로55길 | 57 | 1.7%
61 | 57 | 1.7%
구로중앙로42길 | 57 | 1.7%
고척동 | 48 | 1.5%
Other values (259) | 1259 | 38.2%

Most occurring characters

ValueCountFrequency (%)
2673
 
15.3%
1701
 
9.8%
1693
 
9.7%
669
 
3.8%
634
 
3.6%
630
 
3.6%
625
 
3.6%
625
 
3.6%
625
 
3.6%
) 625
 
3.6%
Other values (163) 6931
39.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 10496
60.2%
Decimal Number 2709
 
15.5%
Space Separator 2673
 
15.3%
Close Punctuation 625
 
3.6%
Open Punctuation 625
 
3.6%
Other Punctuation 177
 
1.0%
Dash Punctuation 101
 
0.6%
Uppercase Letter 17
 
0.1%
Math Symbol 8
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1701
16.2%
1693
16.1%
669
 
6.4%
634
 
6.0%
630
 
6.0%
625
 
6.0%
625
 
6.0%
625
 
6.0%
404
 
3.8%
185
 
1.8%
Other values (140) 2705
25.8%
Decimal Number
ValueCountFrequency (%)
1 564
20.8%
2 417
15.4%
5 344
12.7%
3 280
10.3%
4 271
10.0%
6 203
 
7.5%
0 202
 
7.5%
7 199
 
7.3%
8 127
 
4.7%
9 102
 
3.8%
Uppercase Letter
ValueCountFrequency (%)
B 8
47.1%
G 5
29.4%
S 2
 
11.8%
C 1
 
5.9%
J 1
 
5.9%
Other Punctuation
ValueCountFrequency (%)
, 175
98.9%
& 2
 
1.1%
Math Symbol
ValueCountFrequency (%)
~ 5
62.5%
+ 3
37.5%
Space Separator
ValueCountFrequency (%)
2673
100.0%
Close Punctuation
ValueCountFrequency (%)
) 625
100.0%
Open Punctuation
ValueCountFrequency (%)
( 625
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 101
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 10496
60.2%
Common 6918
39.7%
Latin 17
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1701
16.2%
1693
16.1%
669
 
6.4%
634
 
6.0%
630
 
6.0%
625
 
6.0%
625
 
6.0%
625
 
6.0%
404
 
3.8%
185
 
1.8%
Other values (140) 2705
25.8%
Common
ValueCountFrequency (%)
2673
38.6%
) 625
 
9.0%
( 625
 
9.0%
1 564
 
8.2%
2 417
 
6.0%
5 344
 
5.0%
3 280
 
4.0%
4 271
 
3.9%
6 203
 
2.9%
0 202
 
2.9%
Other values (8) 714
 
10.3%
Latin
ValueCountFrequency (%)
B 8
47.1%
G 5
29.4%
S 2
 
11.8%
C 1
 
5.9%
J 1
 
5.9%

Most occurring blocks

ValueCountFrequency (%)
Hangul 10496
60.2%
ASCII 6935
39.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2673
38.5%
) 625
 
9.0%
( 625
 
9.0%
1 564
 
8.1%
2 417
 
6.0%
5 344
 
5.0%
3 280
 
4.0%
4 271
 
3.9%
6 203
 
2.9%
0 202
 
2.9%
Other values (13) 731
 
10.5%
Hangul
ValueCountFrequency (%)
1701
16.2%
1693
16.1%
669
 
6.4%
634
 
6.0%
630
 
6.0%
625
 
6.0%
625
 
6.0%
625
 
6.0%
404
 
3.8%
185
 
1.8%
Other values (140) 2705
25.8%

소재지주소
Text

MISSING 

Distinct271
Distinct (%)23.5%
Missing76
Missing (%)6.2%
Memory size9.7 KiB

Length

Max length42
Median length38
Mean length24.038095
Min length14

Characters and Unicode

Total characters: 27764
Distinct characters: 110
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 77
Unique (%): 6.7%

Sample

1st row서울특별시 구로구 고척동 185-9번지
2nd row서울특별시 구로구 온수동 35-3번지
3rd row서울특별시 구로구 온수동 45-24번지
4th row서울특별시 구로구 구로동 701-22번지
5th row서울특별시 구로구 고척동 76-147번지
Value | Count | Frequency (%)
서울특별시 | 1155 | 23.3%
구로구 | 1155 | 23.3%
구로동 | 542 | 10.9%
신도림동 | 254 | 5.1%
온수동 | 107 | 2.2%
616-1번지 | 106 | 2.1%
고척동 | 105 | 2.1%
개봉동 | 59 | 1.2%
오류동 | 57 | 1.1%
286-3번지 | 46 | 0.9%
Other values (311) | 1372 | 27.7%

Most occurring characters

ValueCountFrequency (%)
4892
17.6%
2853
 
10.3%
1697
 
6.1%
1 1322
 
4.8%
1163
 
4.2%
1158
 
4.2%
1155
 
4.2%
1155
 
4.2%
1155
 
4.2%
1155
 
4.2%
Other values (100) 10059
36.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 15962
57.5%
Decimal Number 5685
 
20.5%
Space Separator 4892
 
17.6%
Dash Punctuation 1117
 
4.0%
Other Punctuation 37
 
0.1%
Close Punctuation 28
 
0.1%
Open Punctuation 28
 
0.1%
Uppercase Letter 15
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2853
17.9%
1697
10.6%
1163
7.3%
1158
7.3%
1155
7.2%
1155
7.2%
1155
7.2%
1155
7.2%
1155
7.2%
1144
7.2%
Other values (80) 2172
13.6%
Decimal Number
ValueCountFrequency (%)
1 1322
23.3%
2 808
14.2%
6 775
13.6%
0 472
 
8.3%
8 440
 
7.7%
3 414
 
7.3%
7 406
 
7.1%
5 376
 
6.6%
4 354
 
6.2%
9 318
 
5.6%
Uppercase Letter
ValueCountFrequency (%)
I 6
40.0%
T 5
33.3%
B 2
 
13.3%
E 1
 
6.7%
Z 1
 
6.7%
Space Separator
ValueCountFrequency (%)
4892
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1117
100.0%
Other Punctuation
ValueCountFrequency (%)
, 37
100.0%
Close Punctuation
ValueCountFrequency (%)
) 28
100.0%
Open Punctuation
ValueCountFrequency (%)
( 28
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 15962
57.5%
Common 11787
42.5%
Latin 15
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2853
17.9%
1697
10.6%
1163
7.3%
1158
7.3%
1155
7.2%
1155
7.2%
1155
7.2%
1155
7.2%
1155
7.2%
1144
7.2%
Other values (80) 2172
13.6%
Common
ValueCountFrequency (%)
4892
41.5%
1 1322
 
11.2%
- 1117
 
9.5%
2 808
 
6.9%
6 775
 
6.6%
0 472
 
4.0%
8 440
 
3.7%
3 414
 
3.5%
7 406
 
3.4%
5 376
 
3.2%
Other values (5) 765
 
6.5%
Latin
ValueCountFrequency (%)
I 6
40.0%
T 5
33.3%
B 2
 
13.3%
E 1
 
6.7%
Z 1
 
6.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 15962
57.5%
ASCII 11802
42.5%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4892
41.5%
1 1322
 
11.2%
- 1117
 
9.5%
2 808
 
6.8%
6 775
 
6.6%
0 472
 
4.0%
8 440
 
3.7%
3 414
 
3.5%
7 406
 
3.4%
5 376
 
3.2%
Other values (10) 780
 
6.6%
Hangul
ValueCountFrequency (%)
2853
17.9%
1697
10.6%
1163
7.3%
1158
7.3%
1155
7.2%
1155
7.2%
1155
7.2%
1155
7.2%
1155
7.2%
1144
7.2%
Other values (80) 2172
13.6%

Interactions


Correlations

 | 인허가번호 | 업종코드 | 업종명 | 지도점검일자 | 지도점검구분 | 처분대상여부
인허가번호 | 1.000 | 1.000 | 1.000 | 0.066 | 0.000 | 0.000
업종코드 | 1.000 | 1.000 | 1.000 | 0.140 | 0.053 | 0.062
업종명 | 1.000 | 1.000 | 1.000 | 0.137 | 0.059 | 0.062
지도점검일자 | 0.066 | 0.140 | 0.137 | 1.000 | 0.561 | 0.112
지도점검구분 | 0.000 | 0.053 | 0.059 | 0.561 | 1.000 | 0.233
처분대상여부 | 0.000 | 0.062 | 0.062 | 0.112 | 0.233 | 1.000

 | 업종명 | 업종코드 | 처분대상여부 | 지도점검구분
업종명 | 1.000 | 1.000 | 0.076 | 0.048
업종코드 | 1.000 | 1.000 | 0.076 | 0.043
처분대상여부 | 0.076 | 0.076 | 1.000 | 0.154
지도점검구분 | 0.048 | 0.043 | 0.154 | 1.000

 | 인허가번호 | 지도점검일자 | 업종코드 | 업종명 | 지도점검구분 | 처분대상여부
인허가번호 | 1.000 | 0.046 | 0.999 | 0.999 | 0.000 | 0.000
지도점검일자 | 0.046 | 1.000 | 0.085 | 0.083 | 0.277 | 0.084
업종코드 | 0.999 | 0.085 | 1.000 | 1.000 | 0.043 | 0.076
업종명 | 0.999 | 0.083 | 1.000 | 1.000 | 0.048 | 0.076
지도점검구분 | 0.000 | 0.277 | 0.043 | 0.048 | 1.000 | 0.154
처분대상여부 | 0.000 | 0.084 | 0.076 | 0.076 | 0.154 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completeness.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
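
One common way to compute such a nullity correlation is the Pearson correlation between per-column presence indicators (1 where a value is present, 0 where it is missing). The sketch below follows that assumption; it is not taken from the report's implementation, the function name is illustrative, and the two short columns are hypothetical toy data rather than rows from this dataset.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is always present or always missing has no variation.
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical toy columns: None marks a missing cell.
road_address = ["road A", None, None, "road B"]
lot_address = ["lot A", None, "lot C", "lot D"]

r = nullity_correlation(road_address, lot_address)
print(round(r, 3))
```

A value near 1 would mean the two columns tend to be missing together, a value near -1 that one is present when the other is missing. Columns with no missing values, or with only missing values, have no defined nullity correlation under this definition.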

Sample

First rows

 | 업체(시설)명 | 인허가번호 | 업종코드 | 업종명 | 지도점검일자 | 점검기관 | 점검기관명 | 지도점검구분 | 처분대상여부 | 점검사항 | 점검결과 | 소재지도로명주소 | 소재지주소
0 | 삼호세차장 | 316000022198400059 | 22 | 폐수배출업소관리 | 20171130 | 3160000 | 서울특별시 구로구 | 정기 | N | 폐수 | <NA> | 서울특별시 구로구 중앙로 93 (고척동) | 서울특별시 구로구 고척동 185-9번지
1 | 서울석유(주) 풀페이주유소 | 316000022199700098 | 22 | 폐수배출업소관리 | 20171130 | 3160000 | 서울특별시 구로구 | 정기 | N | 폐수 | <NA> | 서울특별시 구로구 경인로 41 (온수동) | 서울특별시 구로구 온수동 35-3번지
2 | 아주지오텍(주)-신부평~영서2구간 전력구 공사 | 316000022201400007 | 22 | 폐수배출업소관리 | 20171130 | 3160000 | 서울특별시 구로구 | 정기 | N | 폐수 | <NA> | 서울특별시 구로구 경인로 15-20 (온수동) | 서울특별시 구로구 온수동 45-24번지
3 | 소낙스 | 316000022201400003 | 22 | 폐수배출업소관리 | 20171128 | 3160000 | 서울특별시 구로구 | 정기 | N | 폐수 | <NA> | 서울특별시 구로구 가마산로 91-30 (구로동) | 서울특별시 구로구 구로동 701-22번지
4 | 구로성심병원별관 | 316000022201000010 | 22 | 폐수배출업소관리 | 20171128 | 3160000 | 서울특별시 구로구 | 정기 | N | 폐수 | <NA> | 서울특별시 구로구 중앙로 10-15 (고척동) | 서울특별시 구로구 고척동 76-147번지
5 | 대성모터스 | 316000022199600085 | 22 | 폐수배출업소관리 | 20171128 | 3160000 | 서울특별시 구로구 | 정기 | N | 폐수 | <NA> | 서울특별시 구로구 구로동로 3 (가리봉동) | 서울특별시 구로구 가리봉동 121-43번지
6 | 구로아지트셀프세차장 | 316000022200000250 | 22 | 폐수배출업소관리 | 20171128 | 3160000 | 서울특별시 구로구 | 정기 | N | 폐수 | <NA> | 서울특별시 구로구 가마산로 91-30, 110호 (구로동) | 서울특별시 구로구 구로동 701-22번지
7 | 대성산업(주) 오류충전소 | 316000022200200041 | 22 | 폐수배출업소관리 | 20171114 | 3160000 | 서울특별시 구로구 | 정기 | N | 폐수 | <NA> | <NA> | 서울특별시 구로구 오류동 74-8번지
8 | 대형상운(주) | 316000022200700006 | 22 | 폐수배출업소관리 | 20171114 | 3160000 | 서울특별시 구로구 | 정기 | N | 폐수 | <NA> | <NA> | 서울특별시 구로구 오류동 123번지
9 | (주)삼신교통 | 316000022198700070 | 22 | 폐수배출업소관리 | 20171114 | 3160000 | 서울특별시 구로구 | 정기 | N | 폐수 | <NA> | 서울특별시 구로구 경인로 107 (오류동) | 서울특별시 구로구 오류동 90-5번지

Last rows

 | 업체(시설)명 | 인허가번호 | 업종코드 | 업종명 | 지도점검일자 | 점검기관 | 점검기관명 | 지도점검구분 | 처분대상여부 | 점검사항 | 점검결과 | 소재지도로명주소 | 소재지주소
1221 | ㈜청룡환경 | 316000022200800015 | 22 | 폐수배출업소관리 | 20100316 | 3160000 | 서울특별시 구로구 | 합동 | N | 환경관련법 준수사항 여부 등 | <NA> | <NA> | 서울특별시 구로구 구로동 197-22번지 에이스테크노타워 209호
1222 | (주)청룡환경 | 316000022200900008 | 22 | 폐수배출업소관리 | 20100316 | 3160000 | 서울특별시 구로구 | 정기 | N | 환경관련법 준수사항 여부 등 | <NA> | <NA> | 서울특별시 구로구 구로동 197-22번지 에이크테크노타워5차 403호
1223 | 한일시멘트(주)영등포공장 | 316000021197200032 | 21 | 대기배출업소관리 | 20100302 | 3160000 | 서울특별시 구로구 | 정기 | N | 환경관련법 준수사항 이행여부 등 | <NA> | 서울특별시 구로구 경인로 302 (개봉동) | 서울특별시 구로구 개봉동 222번지
1224 | 덴트젠구로점 | 316000022200900009 | 22 | 폐수배출업소관리 | 20100224 | 3160000 | 서울특별시 구로구 | 정기 | N | 배출시설 및 방지시설 적정 설치 운영 여부 | <NA> | <NA> | 서울특별시 구로구 구로동 808-34번지 4필지
1225 | 보성운수-주 | 316000022198000026 | 22 | 폐수배출업소관리 | 20100218 | 3160000 | 서울특별시 구로구 | 정기 | N | 방지시설 적정운영 및 개선명령이행 여부 확인 | <NA> | <NA> | 서울특별시 구로구 구로동 145-17번지
1226 | 대성산업(주)제5주유소 | 316000022199500011 | 22 | 폐수배출업소관리 | 20100218 | 3160000 | 서울특별시 구로구 | 정기 | N | 방지시설 적정운영 및 개선명령이행 여부 확인 | <NA> | <NA> | 서울특별시 구로구 신도림동 361번지
1227 | 현대오일뱅크(주)구로공단주유소 | 316000022200500016 | 22 | 폐수배출업소관리 | 20100218 | 3160000 | 서울특별시 구로구 | 정기 | N | 방지시설 적정운영 및 개선명령이행 여부 확인 | <NA> | <NA> | 서울특별시 구로구 구로동 1131-4번지
1228 | 동선주유소 | 316000022200600021 | 22 | 폐수배출업소관리 | 20100212 | 3160000 | 서울특별시 구로구 | 정기 | N | 방지시설 적정운영 및 개선명령이행 여부 | <NA> | <NA> | 서울특별시 구로구 오류동 77-10번지
1229 | 한일시멘트(주)영등포공장 | 316000022197200063 | 22 | 폐수배출업소관리 | 20100127 | 3160000 | 서울특별시 구로구 | 정기 | N | 폐수배출시설 및 방지시설 운영 적정 여부 등 | <NA> | 서울특별시 구로구 경인로 302 (개봉동) | 서울특별시 구로구 개봉동 222번지
1230 | (주)유진개발 | 316000022200600007 | 22 | 폐수배출업소관리 | 20100127 | 3160000 | 서울특별시 구로구 | 정기 | N | 폐수배출시설 및 방지시설 적정 운영 여부 등 | <NA> | <NA> | 서울특별시 구로구 오류동 331-25번지

Duplicate rows

Most frequently occurring

 | 업체(시설)명 | 인허가번호 | 업종코드 | 업종명 | 지도점검일자 | 점검기관 | 점검기관명 | 지도점검구분 | 처분대상여부 | 점검사항 | 소재지도로명주소 | 소재지주소 | # duplicates
0 | 직암특수금속 | 316000021200400051 | 21 | 대기배출업소관리 | 20101025 | 3160000 | 서울특별시 구로구 | 정기 | N | 배출시설 및 방지시설 운영사항 | <NA> | 서울특별시 구로구 신도림동 286-3번지 | 2