Overview

Dataset statistics

Number of variables: 13
Number of observations: 701
Missing cells: 963
Missing cells (%): 10.6%
Duplicate rows: 2
Duplicate rows (%): 0.3%
Total size in memory: 74.7 KiB
Average record size in memory: 109.2 B
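
The two percentages above follow from the raw counts. A minimal sketch of the arithmetic, assuming the missing-cell rate is taken over all cells (variables × observations) and the duplicate rate over all rows; the variable names are illustrative only:

```python
# Counts copied from the dataset statistics above.
n_variables = 13
n_observations = 701
missing_cells = 963
duplicate_rows = 2

# Assumption: the missing-cell rate is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: the duplicate rate is taken over all rows.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"total cells:    {total_cells}")
print(f"missing cells:  {missing_pct:.1f}%")
print(f"duplicate rows: {duplicate_pct:.1f}%")
```

Under these assumptions the rounded results agree with the 10.6% and 0.3% reported above.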

Variable types

Text: 4
Numeric: 2
Categorical: 5
Boolean: 1
Unsupported: 1

Dataset

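Description: 업체(시설)명 (business or facility name), 인허가번호 (permit number), 업종코드 (business type code), 업종명 (business type name), 지도점검일자 (inspection date), 점검기관 (inspecting agency code), 점검기관명 (inspecting agency name), 지도점검구분 (inspection type), 처분대상여부 (subject to administrative disposition), 점검사항 (inspection items), 점검결과 (inspection result), 소재지도로명주소 (road-name address), 소재지주소 (lot-number address)
Author: 강동구 (Gangdong-gu)
URL: https://data.seoul.go.kr/dataList/OA-10663/S/1/datasetView.do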

Alerts

점검기관 has constant value "3240000" (Constant)
점검기관명 has constant value "서울특별시 강동구" (Constant)
Dataset has 2 (0.3%) duplicate rows (Duplicates)
업종코드 is highly overall correlated with 업종명 (High correlation)
업종명 is highly overall correlated with 업종코드 (High correlation)
업종코드 is highly imbalanced (50.2%) (Imbalance)
업종명 is highly imbalanced (56.3%) (Imbalance)
지도점검구분 is highly imbalanced (64.2%) (Imbalance)
처분대상여부 is highly imbalanced (73.6%) (Imbalance)
처분대상여부 has 9 (1.3%) missing values (Missing)
점검결과 has 701 (100.0%) missing values (Missing)
소재지도로명주소 has 239 (34.1%) missing values (Missing)
소재지주소 has 12 (1.7%) missing values (Missing)
점검결과 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)
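
The report does not state how the imbalance percentages are computed. One formula that reproduces all four figures from the value counts listed in the variable sections is 1 minus the normalized Shannon entropy. The sketch below treats that formula as an inferred assumption, not as the tool's documented definition; `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k) for k class counts (0 = balanced, 1 = one dominant class)."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0  # a single class has no defined balance; treated as 0 in this sketch
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Value counts copied from the variable sections of this report.
# <NA> is counted as its own class where the report lists it as a value.
value_counts = {
    "업종코드": [585, 83, 33],
    "업종명": [579, 80, 33, 9],
    "지도점검구분": [595, 63, 26, 15, 2],
    "처분대상여부": [661, 31],  # the 9 missing values are excluded here
}

scores = {name: round(100 * imbalance_score(c), 1) for name, c in value_counts.items()}
for name, pct in scores.items():
    print(f"{name}: {pct}%")
```

A score near 100% means almost all rows share one value, which is why 처분대상여부 (661 False against 31 True) scores highest.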

Reproduction

Analysis started: 2024-05-11 06:50:11.213151
Analysis finished: 2024-05-11 06:50:13.815076
Duration: 2.6 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업체(시설)명
Text

Distinct: 270
Distinct (%): 38.5%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

Length

Max length: 20
Median length: 18
Mean length: 7.3908702
Min length: 1

Characters and Unicode

Total characters: 5181
Distinct characters: 288
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
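
As a rough illustration of how such per-category counts can be derived, the sketch below uses Python's standard `unicodedata` module on the first two sample values of this variable. The helper name `category_counts` is hypothetical, and this is not necessarily how the profiling tool itself does it.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories (e.g. 'Lo', 'Ps') over all characters."""
    counts = Counter()
    for text in values:
        counts.update(unicodedata.category(ch) for ch in text)
    return counts

# First two sample values of this variable, copied from the report.
samples = ["(주)강산", "강동JJ24시셀프세차장"]
counts = category_counts(samples)
print(counts.most_common())
```

The two-letter codes map onto the category names used in the report: `Lo` is Other Letter (Hangul syllables), `Lu` Uppercase Letter, `Nd` Decimal Number, `Ps` Open Punctuation and `Pe` Close Punctuation.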

Unique

Unique: 99
Unique (%): 14.1%

Sample

1st row: (주)강산
2nd row: 강동JJ24시셀프세차장
3rd row: (주)알엠앤에스
4th row: (주)경기에너지
5th row: (주)신흥이엔지

Value | Count | Frequency (%)
삼호섬유 | 14 | 1.8%
신일섬유 | 12 | 1.5%
서울탁주강동연합제조장 | 11 | 1.4%
세정섬유 | 9 | 1.2%
도이치모터스(주)강동서비스센터 | 8 | 1.0%
암사아리수정수센터 | 8 | 1.0%
주)서울승합 | 8 | 1.0%
주)나엔 | 8 | 1.0%
둔촌충전소 | 8 | 1.0%
코스모24시셀프세차장 | 7 | 0.9%
Other values (290) | 687 | 88.1%

Most occurring characters

ValueCountFrequency (%)
250
 
4.8%
168
 
3.2%
( 168
 
3.2%
) 168
 
3.2%
156
 
3.0%
126
 
2.4%
122
 
2.4%
121
 
2.3%
95
 
1.8%
87
 
1.7%
Other values (278) 3720
71.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4559 | 88.0%
Open Punctuation | 168 | 3.2%
Close Punctuation | 168 | 3.2%
Uppercase Letter | 123 | 2.4%
Space Separator | 79 | 1.5%
Decimal Number | 38 | 0.7%
Lowercase Letter | 33 | 0.6%
Other Punctuation | 5 | 0.1%
Dash Punctuation | 5 | 0.1%
Other Symbol | 3 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
250
 
5.5%
168
 
3.7%
156
 
3.4%
126
 
2.8%
122
 
2.7%
121
 
2.7%
95
 
2.1%
87
 
1.9%
83
 
1.8%
82
 
1.8%
Other values (245) 3269
71.7%
Uppercase Letter
ValueCountFrequency (%)
S 23
18.7%
P 17
13.8%
J 16
13.0%
K 16
13.0%
I 11
8.9%
O 10
8.1%
T 8
 
6.5%
V 6
 
4.9%
N 5
 
4.1%
G 4
 
3.3%
Other values (3) 7
 
5.7%
Lowercase Letter
ValueCountFrequency (%)
e 7
21.2%
r 5
15.2%
t 4
12.1%
s 3
9.1%
k 3
9.1%
c 3
9.1%
f 3
9.1%
o 3
9.1%
a 1
 
3.0%
m 1
 
3.0%
Decimal Number
ValueCountFrequency (%)
2 19
50.0%
4 17
44.7%
5 1
 
2.6%
3 1
 
2.6%
Open Punctuation
ValueCountFrequency (%)
( 168
100.0%
Close Punctuation
ValueCountFrequency (%)
) 168
100.0%
Space Separator
ValueCountFrequency (%)
79
100.0%
Other Punctuation
ValueCountFrequency (%)
& 5
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5
100.0%
Other Symbol
ValueCountFrequency (%)
3
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4562 | 88.1%
Common | 463 | 8.9%
Latin | 156 | 3.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
250
 
5.5%
168
 
3.7%
156
 
3.4%
126
 
2.8%
122
 
2.7%
121
 
2.7%
95
 
2.1%
87
 
1.9%
83
 
1.8%
82
 
1.8%
Other values (246) 3272
71.7%
Latin
ValueCountFrequency (%)
S 23
14.7%
P 17
10.9%
J 16
10.3%
K 16
10.3%
I 11
 
7.1%
O 10
 
6.4%
T 8
 
5.1%
e 7
 
4.5%
V 6
 
3.8%
N 5
 
3.2%
Other values (13) 37
23.7%
Common
ValueCountFrequency (%)
( 168
36.3%
) 168
36.3%
79
17.1%
2 19
 
4.1%
4 17
 
3.7%
& 5
 
1.1%
- 5
 
1.1%
5 1
 
0.2%
3 1
 
0.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4559 | 88.0%
ASCII | 619 | 11.9%
None | 3 | 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
250
 
5.5%
168
 
3.7%
156
 
3.4%
126
 
2.8%
122
 
2.7%
121
 
2.7%
95
 
2.1%
87
 
1.9%
83
 
1.8%
82
 
1.8%
Other values (245) 3269
71.7%
ASCII
ValueCountFrequency (%)
( 168
27.1%
) 168
27.1%
79
12.8%
S 23
 
3.7%
2 19
 
3.1%
4 17
 
2.7%
P 17
 
2.7%
J 16
 
2.6%
K 16
 
2.6%
I 11
 
1.8%
Other values (22) 85
13.7%
None
ValueCountFrequency (%)
3
100.0%

인허가번호
Real number (ℝ)

Distinct: 242
Distinct (%): 34.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.2400002 × 10^17
Minimum: 3.2400002 × 10^17
Maximum: 3.2400006 × 10^17
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.3 KiB

Quantile statistics

Minimum: 3.2400002 × 10^17
5-th percentile: 3.2400002 × 10^17
Q1: 3.2400002 × 10^17
median: 3.2400002 × 10^17
Q3: 3.2400002 × 10^17
95-th percentile: 3.2400003 × 10^17
Maximum: 3.2400006 × 10^17
Range: 4.10018 × 10^10
Interquartile range (IQR): 1399936

Descriptive statistics

Standard deviation: 2.670968 × 10^9
Coefficient of variation (CV): 8.243728 × 10^-9
Kurtosis: 196.5565
Mean: 3.2400002 × 10^17
Median Absolute Deviation (MAD): 699712
Skewness: 13.575066
Sum: 5.7630868 × 10^18
Variance: 7.1340703 × 10^18
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
324000022199800013 | 18 | 2.6%
324000022199800017 | 14 | 2.0%
324000022199800252 | 12 | 1.7%
324000022197200001 | 11 | 1.6%
324000022199800012 | 9 | 1.3%
324000022199500001 | 8 | 1.1%
324000022200900011 | 8 | 1.1%
324000021199800012 | 8 | 1.1%
324000022200200027 | 7 | 1.0%
324000022200000375 | 7 | 1.0%
Other values (232) | 599 | 85.4%

Minimum 10 values

Value | Count | Frequency (%)
324000021199800005 | 2 | 0.3%
324000021199800006 | 2 | 0.3%
324000021199800007 | 5 | 0.7%
324000021199800011 | 1 | 0.1%
324000021199800012 | 8 | 1.1%
324000021199900024 | 3 | 0.4%
324000021200000028 | 2 | 0.3%
324000021200100001 | 5 | 0.7%
324000021200200001 | 5 | 0.7%
324000021200300001 | 6 | 0.9%

Maximum 10 values

Value | Count | Frequency (%)
324000062201600001 | 1 | 0.1%
324000061201400001 | 1 | 0.1%
324000061201100001 | 1 | 0.1%
324000025201100001 | 1 | 0.1%
324000025200900013 | 2 | 0.3%
324000025200900011 | 1 | 0.1%
324000025200800008 | 2 | 0.3%
324000025200800002 | 1 | 0.1%
324000025200700014 | 1 | 0.1%
324000025200700013 | 1 | 0.1%
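
The permit numbers are 18-digit identifiers, which is more than a 64-bit float can represent exactly. The sketch below illustrates this with the most frequent value in this section. Note that the IQR (1399936) and MAD (699712) reported above are both multiples of 64, which is the spacing of floats at this magnitude; this suggests, but does not prove, that the statistics were computed in floating point. Treating the column as text is one possible remedy, not something the report prescribes.

```python
# Most frequent permit number, copied from this section of the report.
permit_number = 324000022199800013

# float64 has a 53-bit significand, so not every integer above 2**53 is representable.
as_float = float(permit_number)
round_trip = int(as_float)

print(permit_number > 2**53)        # True
print(round_trip == permit_number)  # False: the trailing digits changed
print(round_trip % 64)              # 0: floats near 3.24e17 are spaced 64 apart

# The reported IQR and MAD are also multiples of 64.
print(1399936 % 64, 699712 % 64)    # 0 0

# Keeping the identifier as text preserves every digit.
as_text = str(permit_number)
print(as_text, len(as_text))
```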

업종코드
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 22
2nd row: 22
3rd row: 22
4th row: 22
5th row: 22

Common Values

Value | Count | Frequency (%)
22 | 585 | 83.5%
21 | 83 | 11.8%
25 | 33 | 4.7%

Length

Histogram of lengths of the category

업종명
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

Length

Max length: 9
Median length: 8
Mean length: 7.9957204
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 폐수배출업소관리
2nd row: 폐수배출업소관리
3rd row: 폐수배출업소관리
4th row: 폐수배출업소관리
5th row: 폐수배출업소관리

Common Values

Value | Count | Frequency (%)
폐수배출업소관리 | 579 | 82.6%
대기배출업소관리 | 80 | 11.4%
기타수질오염원관리 | 33 | 4.7%
<NA> | 9 | 1.3%

Length

Histogram of lengths of the category

지도점검일자
Real number (ℝ)

Distinct: 248
Distinct (%): 35.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20128951
Minimum: 20100205
Maximum: 20170627
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.3 KiB

Quantile statistics

Minimum: 20100205
5-th percentile: 20100723
Q1: 20111005
median: 20121120
Q3: 20150304
95-th percentile: 20161014
Maximum: 20170627
Range: 70422
Interquartile range (IQR): 39299

Descriptive statistics

Standard deviation: 20296.038
Coefficient of variation (CV): 0.0010083008
Kurtosis: -1.0374577
Mean: 20128951
Median Absolute Deviation (MAD): 10910
Skewness: 0.3627867
Sum: 1.4110395 × 10^10
Variance: 4.1192915 × 10^8
Monotonicity: Decreasing
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
20120222 | 10 | 1.4%
20100908 | 8 | 1.1%
20121120 | 8 | 1.1%
20120509 | 7 | 1.0%
20111213 | 6 | 0.9%
20100909 | 6 | 0.9%
20100928 | 6 | 0.9%
20120416 | 6 | 0.9%
20141007 | 6 | 0.9%
20101020 | 6 | 0.9%
Other values (238) | 632 | 90.2%

Minimum 10 values

Value | Count | Frequency (%)
20100205 | 2 | 0.3%
20100209 | 5 | 0.7%
20100311 | 2 | 0.3%
20100518 | 2 | 0.3%
20100520 | 1 | 0.1%
20100617 | 3 | 0.4%
20100621 | 2 | 0.3%
20100630 | 2 | 0.3%
20100714 | 5 | 0.7%
20100720 | 3 | 0.4%

Maximum 10 values

Value | Count | Frequency (%)
20170627 | 5 | 0.7%
20170623 | 4 | 0.6%
20170608 | 4 | 0.6%
20170515 | 1 | 0.1%
20170425 | 1 | 0.1%
20170406 | 2 | 0.3%
20161216 | 2 | 0.3%
20161213 | 3 | 0.4%
20161122 | 2 | 0.3%
20161107 | 2 | 0.3%
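
The values in this variable look like dates written as YYYYMMDD integers, so numeric statistics such as the range (70422) do not measure elapsed time. A minimal sketch, assuming that encoding holds for every row, of converting the reported minimum, median and maximum to calendar dates; `parse_yyyymmdd` is a hypothetical helper:

```python
from datetime import datetime

def parse_yyyymmdd(value):
    """Convert an integer such as 20170627 into a date."""
    return datetime.strptime(str(value), "%Y%m%d").date()

# Minimum, median and maximum copied from the quantile statistics of this variable.
first = parse_yyyymmdd(20100205)
median = parse_yyyymmdd(20121120)
last = parse_yyyymmdd(20170627)

numeric_range = 20170627 - 20100205  # the "Range" the report computes on raw integers
calendar_span = (last - first).days  # elapsed days between first and last inspection

print(first, median, last)
print(numeric_range, calendar_span)
```

Parsing the column as dates before profiling would make the quantiles, range and histogram directly interpretable as time.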

점검기관
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3240000
2nd row: 3240000
3rd row: 3240000
4th row: 3240000
5th row: 3240000

Common Values

Value | Count | Frequency (%)
3240000 | 701 | 100.0%

Length

Histogram of lengths of the category

점검기관명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시 강동구
2nd row: 서울특별시 강동구
3rd row: 서울특별시 강동구
4th row: 서울특별시 강동구
5th row: 서울특별시 강동구

Common Values

Value | Count | Frequency (%)
서울특별시 강동구 | 701 | 100.0%

Length

Histogram of lengths of the category

지도점검구분
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

Length

Max length: 4
Median length: 2
Mean length: 2.0057061
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 정기
2nd row: 정기
3rd row: 정기
4th row: 정기
5th row: 정기

Common Values

Value | Count | Frequency (%)
정기 | 595 | 84.9%
수시 | 63 | 9.0%
기타 | 26 | 3.7%
합동 | 15 | 2.1%
<NA> | 2 | 0.3%

Length

Histogram of lengths of the category

처분대상여부
Boolean

IMBALANCE  MISSING 

Distinct: 2
Distinct (%): 0.3%
Missing: 9
Missing (%): 1.3%
Memory size: 1.5 KiB

Value | Count | Frequency (%)
False | 661 | 94.3%
True | 31 | 4.4%
(Missing) | 9 | 1.3%

점검사항
Text

Distinct: 122
Distinct (%): 17.5%
Missing: 2
Missing (%): 0.3%
Memory size: 5.6 KiB

Length

Max length: 43
Median length: 38
Mean length: 17.626609
Min length: 6

Characters and Unicode

Total characters: 12321
Distinct characters: 129
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 66
Unique (%): 9.4%

Sample

1st row: 배출시설및 방지시설 적정운영
2nd row: 배출시설및 방지시설 적정운영 여부
3rd row: 배출시설및 방지시설 적정운영 여부
4th row: 배출시설및 방지시설 적정운영 여부
5th row: 배출시설및 방지시설 적정운영 여부

Value | Count | Frequency (%)
방지시설 | 519 | 16.5%
여부 | 460 | 14.6%
 | 391 | 12.4%
배출시설 | 381 | 12.1%
적정운영 | 333 | 10.6%
적정 | 203 | 6.4%
운영 | 121 | 3.8%
배출시설및 | 114 | 3.6%
발생폐수 | 56 | 1.8%
확인 | 43 | 1.4%
Other values (136) | 534 | 16.9%

Most occurring characters

ValueCountFrequency (%)
2462
20.0%
1093
 
8.9%
1080
 
8.8%
606
 
4.9%
591
 
4.8%
576
 
4.7%
553
 
4.5%
543
 
4.4%
541
 
4.4%
538
 
4.4%
Other values (119) 3738
30.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 9802 | 79.6%
Space Separator | 2462 | 20.0%
Uppercase Letter | 16 | 0.1%
Close Punctuation | 11 | 0.1%
Open Punctuation | 11 | 0.1%
Lowercase Letter | 10 | 0.1%
Dash Punctuation | 5 | < 0.1%
Other Punctuation | 4 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1093
 
11.2%
1080
 
11.0%
606
 
6.2%
591
 
6.0%
576
 
5.9%
553
 
5.6%
543
 
5.5%
541
 
5.5%
538
 
5.5%
538
 
5.5%
Other values (107) 3143
32.1%
Uppercase Letter
ValueCountFrequency (%)
X 5
31.2%
R 5
31.2%
T 2
 
12.5%
M 2
 
12.5%
S 2
 
12.5%
Lowercase Letter
ValueCountFrequency (%)
a 5
50.0%
y 5
50.0%
Space Separator
ValueCountFrequency (%)
2462
100.0%
Close Punctuation
ValueCountFrequency (%)
) 11
100.0%
Open Punctuation
ValueCountFrequency (%)
( 11
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5
100.0%
Other Punctuation
ValueCountFrequency (%)
, 4
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 9802 | 79.6%
Common | 2493 | 20.2%
Latin | 26 | 0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1093
 
11.2%
1080
 
11.0%
606
 
6.2%
591
 
6.0%
576
 
5.9%
553
 
5.6%
543
 
5.5%
541
 
5.5%
538
 
5.5%
538
 
5.5%
Other values (107) 3143
32.1%
Latin
ValueCountFrequency (%)
a 5
19.2%
X 5
19.2%
y 5
19.2%
R 5
19.2%
T 2
 
7.7%
M 2
 
7.7%
S 2
 
7.7%
Common
ValueCountFrequency (%)
2462
98.8%
) 11
 
0.4%
( 11
 
0.4%
- 5
 
0.2%
, 4
 
0.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 9801 | 79.5%
ASCII | 2519 | 20.4%
Compat Jamo | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2462
97.7%
) 11
 
0.4%
( 11
 
0.4%
a 5
 
0.2%
X 5
 
0.2%
- 5
 
0.2%
y 5
 
0.2%
R 5
 
0.2%
, 4
 
0.2%
T 2
 
0.1%
Other values (2) 4
 
0.2%
Hangul
ValueCountFrequency (%)
1093
 
11.2%
1080
 
11.0%
606
 
6.2%
591
 
6.0%
576
 
5.9%
553
 
5.6%
543
 
5.5%
541
 
5.5%
538
 
5.5%
538
 
5.5%
Other values (106) 3142
32.1%
Compat Jamo
ValueCountFrequency (%)
1
100.0%

점검결과
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 701
Missing (%): 100.0%
Memory size: 6.3 KiB

소재지도로명주소
Text

Distinct: 203
Distinct (%): 43.9%
Missing: 239
Missing (%): 34.1%
Memory size: 5.6 KiB

Length

Max length: 36
Median length: 32
Mean length: 25.645022
Min length: 21

Characters and Unicode

Total characters: 11848
Distinct characters: 117
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 87
Unique (%): 18.8%

Sample

1st row: 서울특별시 강동구 성안로 46 (성내동)
2nd row: 서울특별시 강동구 아리수로87길 272 (고덕동)
3rd row: 서울특별시 강동구 천호대로 1056 (성내동)
4th row: 서울특별시 강동구 성안로13길 지하 42 (성내동)
5th row: 서울특별시 강동구 성내로13길 지하 12-15 (성내동)

Value | Count | Frequency (%)
서울특별시 | 462 | 18.8%
강동구 | 462 | 18.8%
성내동 | 125 | 5.1%
지하 | 111 | 4.5%
천호동 | 75 | 3.0%
둔촌동 | 68 | 2.8%
암사동 | 60 | 2.4%
길동 | 51 | 2.1%
올림픽로 | 44 | 1.8%
성안로 | 38 | 1.5%
Other values (267) | 966 | 39.2%

Most occurring characters

ValueCountFrequency (%)
2004
 
16.9%
948
 
8.0%
485
 
4.1%
481
 
4.1%
463
 
3.9%
463
 
3.9%
) 462
 
3.9%
462
 
3.9%
( 462
 
3.9%
462
 
3.9%
Other values (107) 5156
43.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 7208 | 60.8%
Space Separator | 2004 | 16.9%
Decimal Number | 1625 | 13.7%
Close Punctuation | 462 | 3.9%
Open Punctuation | 462 | 3.9%
Dash Punctuation | 44 | 0.4%
Other Punctuation | 39 | 0.3%
Math Symbol | 4 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
948
13.2%
485
 
6.7%
481
 
6.7%
463
 
6.4%
463
 
6.4%
462
 
6.4%
462
 
6.4%
462
 
6.4%
462
 
6.4%
239
 
3.3%
Other values (91) 2281
31.6%
Decimal Number
ValueCountFrequency (%)
1 316
19.4%
2 200
12.3%
8 196
12.1%
5 164
10.1%
6 156
9.6%
3 149
9.2%
0 123
 
7.6%
7 118
 
7.3%
4 102
 
6.3%
9 101
 
6.2%
Space Separator
ValueCountFrequency (%)
2004
100.0%
Close Punctuation
ValueCountFrequency (%)
) 462
100.0%
Open Punctuation
ValueCountFrequency (%)
( 462
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 44
100.0%
Other Punctuation
ValueCountFrequency (%)
, 39
100.0%
Math Symbol
ValueCountFrequency (%)
~ 4
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 7208 | 60.8%
Common | 4640 | 39.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
948
13.2%
485
 
6.7%
481
 
6.7%
463
 
6.4%
463
 
6.4%
462
 
6.4%
462
 
6.4%
462
 
6.4%
462
 
6.4%
239
 
3.3%
Other values (91) 2281
31.6%
Common
ValueCountFrequency (%)
2004
43.2%
) 462
 
10.0%
( 462
 
10.0%
1 316
 
6.8%
2 200
 
4.3%
8 196
 
4.2%
5 164
 
3.5%
6 156
 
3.4%
3 149
 
3.2%
0 123
 
2.7%
Other values (6) 408
 
8.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 7208 | 60.8%
ASCII | 4640 | 39.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2004
43.2%
) 462
 
10.0%
( 462
 
10.0%
1 316
 
6.8%
2 200
 
4.3%
8 196
 
4.2%
5 164
 
3.5%
6 156
 
3.4%
3 149
 
3.2%
0 123
 
2.7%
Other values (6) 408
 
8.8%
Hangul
ValueCountFrequency (%)
948
13.2%
485
 
6.7%
481
 
6.7%
463
 
6.4%
463
 
6.4%
462
 
6.4%
462
 
6.4%
462
 
6.4%
462
 
6.4%
239
 
3.3%
Other values (91) 2281
31.6%

소재지주소
Text

MISSING 

Distinct: 246
Distinct (%): 35.7%
Missing: 12
Missing (%): 1.7%
Memory size: 5.6 KiB

Length

Max length: 35
Median length: 30
Mean length: 22.014514
Min length: 17

Characters and Unicode

Total characters: 15168
Distinct characters: 58
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 88
Unique (%): 12.8%

Sample

1st row: 서울특별시 강동구 강일동 -718번지
2nd row: 서울특별시 강동구 성내동 423-3번지
3rd row: 서울특별시 강동구 고덕동 85번지
4th row: 서울특별시 강동구 성내동 19-1번지
5th row: 서울특별시 강동구 강일동 724번지

Value | Count | Frequency (%)
서울특별시 | 689 | 24.5%
강동구 | 689 | 24.5%
성내동 | 176 | 6.3%
천호동 | 132 | 4.7%
둔촌동 | 96 | 3.4%
암사동 | 86 | 3.1%
길동 | 74 | 2.6%
고덕동 | 45 | 1.6%
상일동 | 33 | 1.2%
강일동 | 25 | 0.9%
Other values (248) | 768 | 27.3%

Most occurring characters

ValueCountFrequency (%)
2814
18.6%
1380
 
9.1%
714
 
4.7%
694
 
4.6%
689
 
4.5%
689
 
4.5%
689
 
4.5%
689
 
4.5%
689
 
4.5%
689
 
4.5%
Other values (48) 5432
35.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 8959 | 59.1%
Space Separator | 2814 | 18.6%
Decimal Number | 2814 | 18.6%
Dash Punctuation | 574 | 3.8%
Other Punctuation | 5 | < 0.1%
Math Symbol | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1380
15.4%
714
8.0%
694
7.7%
689
7.7%
689
7.7%
689
7.7%
689
7.7%
689
7.7%
689
7.7%
678
7.6%
Other values (34) 1359
15.2%
Decimal Number
ValueCountFrequency (%)
4 474
16.8%
1 442
15.7%
5 372
13.2%
3 356
12.7%
2 311
11.1%
0 195
6.9%
7 191
6.8%
8 159
 
5.7%
9 157
 
5.6%
6 157
 
5.6%
Space Separator
ValueCountFrequency (%)
2814
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 574
100.0%
Other Punctuation
ValueCountFrequency (%)
, 5
100.0%
Math Symbol
ValueCountFrequency (%)
~ 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 8959 | 59.1%
Common | 6209 | 40.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1380
15.4%
714
8.0%
694
7.7%
689
7.7%
689
7.7%
689
7.7%
689
7.7%
689
7.7%
689
7.7%
678
7.6%
Other values (34) 1359
15.2%
Common
ValueCountFrequency (%)
2814
45.3%
- 574
 
9.2%
4 474
 
7.6%
1 442
 
7.1%
5 372
 
6.0%
3 356
 
5.7%
2 311
 
5.0%
0 195
 
3.1%
7 191
 
3.1%
8 159
 
2.6%
Other values (4) 321
 
5.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 8959 | 59.1%
ASCII | 6209 | 40.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2814
45.3%
- 574
 
9.2%
4 474
 
7.6%
1 442
 
7.1%
5 372
 
6.0%
3 356
 
5.7%
2 311
 
5.0%
0 195
 
3.1%
7 191
 
3.1%
8 159
 
2.6%
Other values (4) 321
 
5.2%
Hangul
ValueCountFrequency (%)
1380
15.4%
714
8.0%
694
7.7%
689
7.7%
689
7.7%
689
7.7%
689
7.7%
689
7.7%
689
7.7%
678
7.6%
Other values (34) 1359
15.2%

Interactions


Correlations

Variable | 인허가번호 | 업종코드 | 업종명 | 지도점검일자 | 지도점검구분 | 처분대상여부
인허가번호 | 1.000 | 0.071 | 0.072 | 0.000 | 0.330 | 0.387
업종코드 | 0.071 | 1.000 | 1.000 | 0.273 | 0.054 | 0.008
업종명 | 0.072 | 1.000 | 1.000 | 0.266 | 0.051 | 0.000
지도점검일자 | 0.000 | 0.273 | 0.266 | 1.000 | 0.489 | 0.224
지도점검구분 | 0.330 | 0.054 | 0.051 | 0.489 | 1.000 | 0.340
처분대상여부 | 0.387 | 0.008 | 0.000 | 0.224 | 0.340 | 1.000

Variable | 업종코드 | 처분대상여부 | 지도점검구분 | 업종명
업종코드 | 1.000 | 0.014 | 0.050 | 1.000
처분대상여부 | 0.014 | 1.000 | 0.226 | 0.000
지도점검구분 | 0.050 | 0.226 | 1.000 | 0.048
업종명 | 1.000 | 0.000 | 0.048 | 1.000

Variable | 인허가번호 | 지도점검일자 | 업종코드 | 업종명 | 지도점검구분 | 처분대상여부
인허가번호 | 1.000 | 0.035 | 0.098 | 0.100 | 0.220 | 0.249
지도점검일자 | 0.035 | 1.000 | 0.179 | 0.175 | 0.232 | 0.168
업종코드 | 0.098 | 0.179 | 1.000 | 1.000 | 0.050 | 0.014
업종명 | 0.100 | 0.175 | 1.000 | 1.000 | 0.048 | 0.000
지도점검구분 | 0.220 | 0.232 | 0.050 | 0.048 | 1.000 | 0.226
처분대상여부 | 0.249 | 0.168 | 0.014 | 0.000 | 0.226 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
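
Nullity correlation can be sketched as the Pearson correlation between the "is present" indicators of two columns. The example below uses hypothetical toy columns, not values from this dataset, and `nullity_correlation` is an illustrative helper, not the tool's own function.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 'is present' indicators of two columns."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # undefined when a column is fully present or fully missing
    return cov / math.sqrt(var_a * var_b)

# Hypothetical toy columns (None marks a missing cell).
road_address = ["a", None, "c", None, "e", "f"]
lot_address = ["a", None, "c", None, "e", "f"]  # missing in exactly the same rows
result = [None] * 6                             # entirely missing, like 점검결과

same_pattern = nullity_correlation(road_address, lot_address)
no_variation = nullity_correlation(road_address, result)
print(same_pattern, no_variation)
```

Columns that are missing in the same rows score close to 1, while a column with no variation in presence (fully missing or fully populated) has no defined nullity correlation.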

Sample

Index | 업체(시설)명 | 인허가번호 | 업종코드 | 업종명 | 지도점검일자 | 점검기관 | 점검기관명 | 지도점검구분 | 처분대상여부 | 점검사항 | 점검결과 | 소재지도로명주소 | 소재지주소
0 | (주)강산 | 324000022201600006 | 22 | 폐수배출업소관리 | 20170627 | 3240000 | 서울특별시 강동구 | 정기 | N | 배출시설및 방지시설 적정운영 | <NA> | <NA> | 서울특별시 강동구 강일동 -718번지
1 | 강동JJ24시셀프세차장 | 324000022201200001 | 22 | 폐수배출업소관리 | 20170627 | 3240000 | 서울특별시 강동구 | 정기 | N | 배출시설및 방지시설 적정운영 여부 | <NA> | 서울특별시 강동구 성안로 46 (성내동) | 서울특별시 강동구 성내동 423-3번지
2 | (주)알엠앤에스 | 324000022201500003 | 22 | 폐수배출업소관리 | 20170627 | 3240000 | 서울특별시 강동구 | 정기 | N | 배출시설및 방지시설 적정운영 여부 | <NA> | 서울특별시 강동구 아리수로87길 272 (고덕동) | 서울특별시 강동구 고덕동 85번지
3 | (주)경기에너지 | 324000022201300004 | 22 | 폐수배출업소관리 | 20170627 | 3240000 | 서울특별시 강동구 | 정기 | N | 배출시설및 방지시설 적정운영 여부 | <NA> | 서울특별시 강동구 천호대로 1056 (성내동) | 서울특별시 강동구 성내동 19-1번지
4 | (주)신흥이엔지 | 324000022201600009 | 22 | 폐수배출업소관리 | 20170627 | 3240000 | 서울특별시 강동구 | 정기 | N | 배출시설및 방지시설 적정운영 여부 | <NA> | <NA> | 서울특별시 강동구 강일동 724번지
5 | 티티안 | 324000022201100007 | 22 | 폐수배출업소관리 | 20170623 | 3240000 | 서울특별시 강동구 | 정기 | N | 배출시설 및 방지시설 적정운영 여부 | <NA> | 서울특별시 강동구 성안로13길 지하 42 (성내동) | 서울특별시 강동구 성내동 505-3번지
6 | 성우사 | 324000022201200008 | 22 | 폐수배출업소관리 | 20170623 | 3240000 | 서울특별시 강동구 | 정기 | N | 배출시설및 방지시설 적정운영 | <NA> | 서울특별시 강동구 성내로13길 지하 12-15 (성내동) | 서울특별시 강동구 성내동 543-11번지
7 | 원색 | 324000022201600004 | 22 | 폐수배출업소관리 | 20170623 | 3240000 | 서울특별시 강동구 | 정기 | N | 배출시설및 방지시설 적정운영 여부 | <NA> | 서울특별시 강동구 양재대로85길 75 (성내동) | <NA>
8 | 동아 | 324000022200800007 | 22 | 폐수배출업소관리 | 20170623 | 3240000 | 서울특별시 강동구 | 정기 | N | 배출시설및 방지시설 적정 운영 | <NA> | 서울특별시 강동구 풍성로45길 지하 64 (성내동) | 서울특별시 강동구 성내동 501번지
9 | 서울금속 | 324000022201600003 | 22 | 폐수배출업소관리 | 20170608 | 3240000 | 서울특별시 강동구 | 정기 | N | 배출시설및 방지시설 적정운영 | <NA> | 서울특별시 강동구 풍성로 160 (성내동) | <NA>

Index | 업체(시설)명 | 인허가번호 | 업종코드 | 업종명 | 지도점검일자 | 점검기관 | 점검기관명 | 지도점검구분 | 처분대상여부 | 점검사항 | 점검결과 | 소재지도로명주소 | 소재지주소
691 | 아름체인 | 324000022200000255 | 22 | 폐수배출업소관리 | 20100518 | 3240000 | 서울특별시 강동구 | 정기 | N | 배출시설 및 방지시설 적정 운영 여부 | <NA> | <NA> | 서울특별시 강동구 성내동 499-15번지
692 | (주)나엔 | 324000022200600001 | 22 | 폐수배출업소관리 | 20100311 | 3240000 | 서울특별시 강동구 | 정기 | N | 배출시설 및 방지시설 적정여부 | <NA> | 서울특별시 강동구 아리수로87가길 275 (고덕동) | 서울특별시 강동구 고덕동 360번지
693 | (주)나엔 | 324000021200700002 | 21 | 대기배출업소관리 | 20100311 | 3240000 | 서울특별시 강동구 | 정기 | N | 배출시설 및 방지시설 적정여부 | <NA> | <NA> | 서울특별시 강동구 고덕동 360번지
694 | 차사랑카세차장 | 324000022200900010 | 22 | 폐수배출업소관리 | 20100209 | 3240000 | 서울특별시 강동구 | 정기 | N | 배출및방지시설 운영사항 | <NA> | <NA> | 서울특별시 강동구 둔촌동 54번지
695 | 의료법인강동성심병원 | 324000022198600002 | 22 | 폐수배출업소관리 | 20100209 | 3240000 | 서울특별시 강동구 | 정기 | N | 배출및방지시설 운영사항 | <NA> | 서울특별시 강동구 성안로 150 (길동) | 서울특별시 강동구 길동 445번지
696 | (주)선우지에스엠 | 324000022199400028 | 22 | 폐수배출업소관리 | 20100209 | 3240000 | 서울특별시 강동구 | 정기 | N | 배출및방지시설 운영사항 | <NA> | <NA> | 서울특별시 강동구 길동 368-5번지
697 | 신일섬유 | 324000022199800013 | 22 | 폐수배출업소관리 | 20100209 | 3240000 | 서울특별시 강동구 | 정기 | N | 배출및방지시설 운영사항 | <NA> | 서울특별시 강동구 천호대로186길 8 (둔촌동) | 서울특별시 강동구 둔촌동 67-10번지
698 | 세정섬유 | 324000022199800012 | 22 | 폐수배출업소관리 | 20100209 | 3240000 | 서울특별시 강동구 | 정기 | N | 배출및방지시설 운영사항 | <NA> | 서울특별시 강동구 양재대로102길 25 (둔촌동) | 서울특별시 강동구 둔촌동 427-3번지
699 | 태양체인 | 324000022200200027 | 22 | 폐수배출업소관리 | 20100205 | 3240000 | 서울특별시 강동구 | 정기 | N | 배출시설 및 방지시설 적정운영 여부 | <NA> | <NA> | 서울특별시 강동구 성내동 544-7번지
700 | 일오삼케스팅 | 324000022200000359 | 22 | 폐수배출업소관리 | 20100205 | 3240000 | 서울특별시 강동구 | 정기 | N | 배출시설 및 방지시설 적정운영 여부 | <NA> | <NA> | 서울특별시 강동구 둔촌동 147-3번지

Duplicate rows

Most frequently occurring

Index | 업체(시설)명 | 인허가번호 | 업종코드 | 업종명 | 지도점검일자 | 점검기관 | 점검기관명 | 지도점검구분 | 처분대상여부 | 점검사항 | 소재지도로명주소 | 소재지주소 | # duplicates
0 | VIP자동차 공업사 | 324000021200100001 | 21 | 대기배출업소관리 | 20110927 | 3240000 | 서울특별시 강동구 | 정기 | N | 배출시설 및 방지시설 적정운영 여부 | <NA> | 서울특별시 강동구 성내동 548번지 | 2
1 | 코스모24셀프세차타운 | 324000022199800134 | 22 | 폐수배출업소관리 | 20110516 | 3240000 | 서울특별시 강동구 | 정기 | N | 배출시설 및 방지시설 적정운영 여부 | <NA> | 서울특별시 강동구 둔촌동 63-1번지 | 2
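
A minimal sketch of how fully identical rows can be found and counted, using a hypothetical three-column miniature of the table (facility name, permit number, inspection date). The helper `duplicate_rows` is illustrative and not part of the profiling tool.

```python
from collections import Counter

def duplicate_rows(rows):
    """Return {row: count} for every row that appears more than once."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}

# Hypothetical miniature table: (facility name, permit number, inspection date).
rows = [
    ("VIP자동차 공업사", "324000021200100001", "20110927"),
    ("VIP자동차 공업사", "324000021200100001", "20110927"),
    ("신일섬유", "324000022199800013", "20100209"),
]

dups = duplicate_rows(rows)
for row, n in dups.items():
    print(n, row)
```

A row counts as a duplicate only when every compared column matches, so repeat inspections of the same facility on different dates are not flagged.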