Overview

Dataset statistics

Number of variables: 14
Number of observations: 3171
Missing cells: 3920
Missing cells (%): 8.8%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 362.4 KiB
Average record size in memory: 117.0 B
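
The derived figures in the dataset statistics can be checked from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (variables × observations) and that the average record size is total memory divided by the row count; the variable names are illustrative and not part of the report.

```python
# Raw figures copied from the dataset statistics.
n_variables = 14
n_observations = 3171
missing_cells = 3920
total_memory_kib = 362.4

# Assumption: "Missing cells (%)" is missing cells over all cells in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: "Average record size" is total memory divided by the row count.
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the two computed values round to the percentages and sizes shown in the statistics.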

Variable types

Categorical: 5
Text: 3
Numeric: 5
Boolean: 1

Dataset

Description: Status of public-use facilities (complex buildings) (공중이용시설 현황(복합건축물))
Author: Ministry of the Interior and Safety (행정안전부)
URL: https://data.gg.go.kr/portal/data/service/selectServicePage.do?&infId=36LDS75V730UWM772RLH1697725&infSeq=1

Alerts

다중이용업소여부 has constant value "" [Constant]
Dataset has 1 (< 0.1%) duplicate row [Duplicates]
시군명 is highly overall correlated with 폐업일자 and 6 other fields [High correlation]
위생업종명 is highly overall correlated with 인허가일자 and 8 other fields [High correlation]
건물소유구분명 is highly overall correlated with 폐업일자 and 5 other fields [High correlation]
위생업태명 is highly overall correlated with 인허가일자 and 8 other fields [High correlation]
영업상태명 is highly overall correlated with 폐업일자 and 2 other fields [High correlation]
인허가일자 is highly overall correlated with 위생업종명 and 1 other field [High correlation]
폐업일자 is highly overall correlated with 시군명 and 4 other fields [High correlation]
소재지우편번호 is highly overall correlated with WGS84경도 and 3 other fields [High correlation]
WGS84위도 is highly overall correlated with 시군명 and 3 other fields [High correlation]
WGS84경도 is highly overall correlated with 소재지우편번호 and 4 other fields [High correlation]
영업상태명 is highly imbalanced (75.3%) [Imbalance]
폐업일자 has 3041 (95.9%) missing values [Missing]
다중이용업소여부 has 586 (18.5%) missing values [Missing]
소재지도로명주소 has 121 (3.8%) missing values [Missing]
WGS84위도 has 85 (2.7%) missing values [Missing]
WGS84경도 has 85 (2.7%) missing values [Missing]
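
The imbalance alert for 영업상태명 can be approximated from the class counts listed for that variable (3041 and 130). The sketch below assumes the score is one minus the normalised Shannon entropy of the class shares; this is an assumption about how the profiler defines imbalance, not something the report states, and `imbalance_score` is a hypothetical helper.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the base-2 Shannon entropy of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts of 영업상태명 as listed in its Common Values table.
score = imbalance_score([3041, 130])
print(f"{score:.1%}")
```

A score of 0 would mean perfectly balanced classes and a score near 1 a single dominant class; for these counts the result should land close to the 75.3% in the alert.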

Reproduction

Analysis started: 2023-12-10 22:02:56.892001
Analysis finished: 2023-12-10 22:03:00.952628
Duration: 4.06 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시군명
Categorical

HIGH CORRELATION 

Distinct: 29
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 24.9 KiB

부천시: 586
성남시: 555
안산시: 395
수원시: 263
시흥시: 234
Other values (24): 1138

Length

Max length: 4
Median length: 3
Mean length: 3.0264901
Min length: 3

Unique

Unique: 3
Unique (%): 0.1%

Sample

1st row: 가평군
2nd row: 고양시
3rd row: 고양시
4th row: 고양시
5th row: 고양시

Common Values

Value: Count (Frequency %)
부천시: 586 (18.5%)
성남시: 555 (17.5%)
안산시: 395 (12.5%)
수원시: 263 (8.3%)
시흥시: 234 (7.4%)
용인시: 197 (6.2%)
김포시: 138 (4.4%)
광명시: 124 (3.9%)
군포시: 109 (3.4%)
고양시: 92 (2.9%)
Other values (19): 478 (15.1%)

Length

Histogram of lengths of the category

사업장명
Text

Distinct: 2599
Distinct (%): 82.0%
Missing: 0
Missing (%): 0.0%
Memory size: 24.9 KiB

Length

Max length: 25
Median length: 19
Mean length: 5.688111
Min length: 1

Characters and Unicode

Total characters: 18037
Distinct characters: 540
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2368
Unique (%): 74.7%

Sample

1st row: 에덴농산물센타
2nd row: 삼흥프라자
3rd row: 화정종합상가
4th row: 능곡프라자
5th row: 동양쇼핑(주)

Value: Count (Frequency %)
명칭없음: 80 (2.4%)
0: 69 (2.0%)
없음: 30 (0.9%)
서현동: 16 (0.5%)
제일프라자: 13 (0.4%)
현대프라자: 12 (0.4%)
건축물: 12 (0.4%)
수내동: 10 (0.3%)
월드프라자: 10 (0.3%)
중앙프라자: 10 (0.3%)
Other values (2666): 3108 (92.2%)

Most occurring characters

ValueCountFrequency (%)
885
 
4.9%
837
 
4.6%
834
 
4.6%
833
 
4.6%
802
 
4.4%
364
 
2.0%
303
 
1.7%
250
 
1.4%
247
 
1.4%
237
 
1.3%
Other values (530) 12445
69.0%

Most occurring categories

Value: Count (Frequency %)
Other Letter: 16525 (91.6%)
Decimal Number: 619 (3.4%)
Space Separator: 199 (1.1%)
Close Punctuation: 195 (1.1%)
Open Punctuation: 194 (1.1%)
Uppercase Letter: 173 (1.0%)
Dash Punctuation: 90 (0.5%)
Other Punctuation: 22 (0.1%)
Lowercase Letter: 12 (0.1%)
Letter Number: 8 (< 0.1%)

Most frequent character per category

Other Letter
ValueCountFrequency (%)
885
 
5.4%
837
 
5.1%
834
 
5.0%
833
 
5.0%
802
 
4.9%
364
 
2.2%
303
 
1.8%
250
 
1.5%
247
 
1.5%
237
 
1.4%
Other values (478) 10933
66.2%
Uppercase Letter
ValueCountFrequency (%)
A 24
13.9%
B 20
 
11.6%
K 11
 
6.4%
S 11
 
6.4%
G 11
 
6.4%
M 10
 
5.8%
C 9
 
5.2%
E 9
 
5.2%
T 9
 
5.2%
L 7
 
4.0%
Other values (14) 52
30.1%
Decimal Number
ValueCountFrequency (%)
2 128
20.7%
0 116
18.7%
1 112
18.1%
3 74
12.0%
5 42
 
6.8%
6 39
 
6.3%
4 31
 
5.0%
7 29
 
4.7%
8 25
 
4.0%
9 23
 
3.7%
Lowercase Letter
ValueCountFrequency (%)
l 4
33.3%
o 3
25.0%
s 1
 
8.3%
g 1
 
8.3%
n 1
 
8.3%
i 1
 
8.3%
e 1
 
8.3%
Other Punctuation
ValueCountFrequency (%)
. 9
40.9%
, 8
36.4%
& 3
 
13.6%
/ 2
 
9.1%
Letter Number
ValueCountFrequency (%)
5
62.5%
2
 
25.0%
1
 
12.5%
Space Separator
ValueCountFrequency (%)
199
100.0%
Close Punctuation
ValueCountFrequency (%)
) 195
100.0%
Open Punctuation
ValueCountFrequency (%)
( 194
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 90
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 16525
91.6%
Common 1319
 
7.3%
Latin 193
 
1.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
885
 
5.4%
837
 
5.1%
834
 
5.0%
833
 
5.0%
802
 
4.9%
364
 
2.2%
303
 
1.8%
250
 
1.5%
247
 
1.5%
237
 
1.4%
Other values (478) 10933
66.2%
Latin
ValueCountFrequency (%)
A 24
 
12.4%
B 20
 
10.4%
K 11
 
5.7%
S 11
 
5.7%
G 11
 
5.7%
M 10
 
5.2%
C 9
 
4.7%
E 9
 
4.7%
T 9
 
4.7%
L 7
 
3.6%
Other values (24) 72
37.3%
Common
ValueCountFrequency (%)
199
15.1%
) 195
14.8%
( 194
14.7%
2 128
9.7%
0 116
8.8%
1 112
8.5%
- 90
6.8%
3 74
 
5.6%
5 42
 
3.2%
6 39
 
3.0%
Other values (8) 130
9.9%

Most occurring blocks

ValueCountFrequency (%)
Hangul 16525
91.6%
ASCII 1504
 
8.3%
Number Forms 8
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
885
 
5.4%
837
 
5.1%
834
 
5.0%
833
 
5.0%
802
 
4.9%
364
 
2.2%
303
 
1.8%
250
 
1.5%
247
 
1.5%
237
 
1.4%
Other values (478) 10933
66.2%
ASCII
ValueCountFrequency (%)
199
13.2%
) 195
13.0%
( 194
12.9%
2 128
8.5%
0 116
 
7.7%
1 112
 
7.4%
- 90
 
6.0%
3 74
 
4.9%
5 42
 
2.8%
6 39
 
2.6%
Other values (39) 315
20.9%
Number Forms
ValueCountFrequency (%)
5
62.5%
2
 
25.0%
1
 
12.5%

인허가일자
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1162
Distinct (%): 36.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20027964
Minimum: 19720320
Maximum: 20160126
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 28.0 KiB

Quantile statistics

Minimum: 19720320
5-th percentile: 19910824
Q1: 19970322
median: 20040702
Q3: 20071205
95-th percentile: 20101117
Maximum: 20160126
Range: 439806
Interquartile range (IQR): 100883.5

Descriptive statistics

Standard deviation: 64704.713
Coefficient of variation (CV): 0.0032307184
Kurtosis: -0.30798758
Mean: 20027964
Median Absolute Deviation (MAD): 49399
Skewness: -0.62081641
Sum: 6.3508674 × 10^10
Variance: 4.1866999 × 10^9
Monotonicity: Not monotonic
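
The values of 인허가일자 look like dates written as YYYYMMDD integers, so statistics such as the range describe digit arithmetic rather than elapsed time. The sketch below, which assumes that YYYYMMDD reading and uses a hypothetical helper `parse_yyyymmdd`, converts the reported minimum and maximum to calendar dates and compares the numeric range with the real span in days.

```python
from datetime import datetime

def parse_yyyymmdd(value):
    """Convert an integer such as 19720320 into a datetime.date."""
    return datetime.strptime(str(value), "%Y%m%d").date()

earliest = parse_yyyymmdd(19720320)  # Minimum from the report
latest = parse_yyyymmdd(20160126)    # Maximum from the report

numeric_range = 20160126 - 19720320      # the "Range" shown in the report
span_days = (latest - earliest).days     # actual elapsed days

print(earliest, latest)
print("numeric range:", numeric_range)
print("calendar span in days:", span_days)
```

Parsing the column as dates before profiling would make the quantiles and the histogram directly interpretable.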
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
20070406 205
 
6.5%
20031001 116
 
3.7%
20100517 99
 
3.1%
20070226 77
 
2.4%
20070207 75
 
2.4%
20070223 66
 
2.1%
20080102 63
 
2.0%
20100302 63
 
2.0%
20040702 42
 
1.3%
20070227 42
 
1.3%
Other values (1152) 2323
73.3%
ValueCountFrequency (%)
19720320 1
< 0.1%
19740518 1
< 0.1%
19781007 1
< 0.1%
19800227 1
< 0.1%
19800515 1
< 0.1%
19800628 1
< 0.1%
19800725 1
< 0.1%
19800814 1
< 0.1%
19810226 1
< 0.1%
19820520 1
< 0.1%
ValueCountFrequency (%)
20160126 8
 
0.3%
20140325 13
0.4%
20130911 7
 
0.2%
20130318 2
 
0.1%
20130226 5
 
0.2%
20130125 1
 
< 0.1%
20120913 3
 
0.1%
20120912 20
0.6%
20120905 8
 
0.3%
20120903 8
 
0.3%

영업상태명
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.9 KiB

운영중: 3041
폐업 등: 130

Length

Max length: 4
Median length: 3
Mean length: 3.0409965
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 운영중
2nd row: 운영중
3rd row: 운영중
4th row: 운영중
5th row: 운영중

Common Values

Value: Count (Frequency %)
운영중: 3041 (95.9%)
폐업 등: 130 (4.1%)

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
운영중 3041
92.1%
폐업 130
 
3.9%
130
 
3.9%

폐업일자
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 22
Distinct (%): 16.9%
Missing: 3041
Missing (%): 95.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 20092495
Minimum: 20030102
Maximum: 20160401
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 28.0 KiB

Quantile statistics

Minimum: 20030102
5-th percentile: 20070319
Q1: 20070319
median: 20071213
Q3: 20101231
95-th percentile: 20140379
Maximum: 20160401
Range: 130299
Interquartile range (IQR): 30912

Descriptive statistics

Standard deviation: 27088.176
Coefficient of variation (CV): 0.0013481738
Kurtosis: -0.67329945
Mean: 20092495
Median Absolute Deviation (MAD): 24499
Skewness: 0.57866208
Sum: 2.6120244 × 10^9
Variance: 7.3376928 × 10^8
Monotonicity: Not monotonic
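
The 95-th percentile of 폐업일자, 20140379, is not a calendar date. One plausible explanation, assuming the profiler interpolates linearly between neighbouring sorted values: with 130 non-missing values the 95th percentile sits at position 0.95 × 129 = 122.55, which according to the frequency tables falls between 20140318 and 20140429. The sketch below reproduces that arithmetic and contrasts it with interpolating on real dates; the helper names are hypothetical.

```python
from datetime import datetime, timedelta

def to_date(value):
    """Parse a YYYYMMDD integer into a datetime.date (raises ValueError if invalid)."""
    return datetime.strptime(str(value), "%Y%m%d").date()

def is_valid_yyyymmdd(value):
    """Return True when the integer encodes a real calendar date."""
    try:
        to_date(value)
        return True
    except ValueError:
        return False

lower, upper = 20140318, 20140429  # neighbouring values around the 95th percentile
fraction = 0.55                    # fractional part of position 122.55

# Interpolating on the raw integers yields a number that is not a date.
as_integer = round(lower + fraction * (upper - lower))

# Interpolating on parsed dates stays inside the calendar.
gap_days = (to_date(upper) - to_date(lower)).days
as_date = to_date(lower) + timedelta(days=round(fraction * gap_days))

print(as_integer, is_valid_yyyymmdd(as_integer))
print(as_date)
```

The same caveat applies to the mean, standard deviation and IQR of any date stored as an integer.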
Histogram with fixed size bins (bins=22)
ValueCountFrequency (%)
20070319 46
 
1.5%
20101231 30
 
0.9%
20140318 10
 
0.3%
20070416 7
 
0.2%
20071213 6
 
0.2%
20141023 4
 
0.1%
20120704 3
 
0.1%
20130416 3
 
0.1%
20140429 2
 
0.1%
20120903 2
 
0.1%
Other values (12) 17
 
0.5%
(Missing) 3041
95.9%
ValueCountFrequency (%)
20030102 1
 
< 0.1%
20040102 1
 
< 0.1%
20070305 2
 
0.1%
20070319 46
1.5%
20070416 7
 
0.2%
20070820 1
 
< 0.1%
20071207 2
 
0.1%
20071213 6
 
0.2%
20091001 1
 
< 0.1%
20100423 2
 
0.1%
ValueCountFrequency (%)
20160401 1
 
< 0.1%
20141023 4
 
0.1%
20140429 2
 
0.1%
20140318 10
0.3%
20130806 2
 
0.1%
20130416 3
 
0.1%
20120903 2
 
0.1%
20120810 1
 
< 0.1%
20120704 3
 
0.1%
20110224 2
 
0.1%

건물소유구분명
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.9 KiB

<NA>: 2247
자가: 832
임대: 92

Length

Max length: 4
Median length: 4
Mean length: 3.4172185
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 자가
2nd row: 임대
3rd row: 임대
4th row: 임대
5th row: 임대

Common Values

Value: Count (Frequency %)
<NA>: 2247 (70.9%)
자가: 832 (26.2%)
임대: 92 (2.9%)

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
na 2247
70.9%
자가 832
 
26.2%
임대 92
 
2.9%

다중이용업소여부
Boolean

CONSTANT  MISSING 

Distinct: 1
Distinct (%): < 0.1%
Missing: 586
Missing (%): 18.5%
Memory size: 6.3 KiB

Value: Count (Frequency %)
False: 2585 (81.5%)
(Missing): 586 (18.5%)

위생업종명
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.9 KiB

공중이용시설: 2585
<NA>: 586

Length

Max length: 6
Median length: 6
Mean length: 5.6304005
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 공중이용시설
2nd row: 공중이용시설
3rd row: 공중이용시설
4th row: 공중이용시설
5th row: 공중이용시설

Common Values

Value: Count (Frequency %)
공중이용시설: 2585 (81.5%)
<NA>: 586 (18.5%)

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
공중이용시설 2585
81.5%
na 586
 
18.5%

위생업태명
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.9 KiB

복합건축물: 2585
<NA>: 586

Length

Max length: 5
Median length: 5
Mean length: 4.8152003
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 복합건축물
2nd row: 복합건축물
3rd row: 복합건축물
4th row: 복합건축물
5th row: 복합건축물

Common Values

Value: Count (Frequency %)
복합건축물: 2585 (81.5%)
<NA>: 586 (18.5%)

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
복합건축물 2585
81.5%
na 586
 
18.5%

소재지도로명주소
Text

Distinct: 3010
Distinct (%): 98.7%
Missing: 121
Missing (%): 3.8%
Memory size: 24.9 KiB

Length

Max length: 47
Median length: 41
Mean length: 25.40623
Min length: 13

Characters and Unicode

Total characters: 77489
Distinct characters: 359
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2978
Unique (%): 97.6%

Sample

1st row: 경기도 가평군 청평면 경춘로 1444
2nd row: 경기도 고양시 덕양구 행신로 95 (토당동)
3rd row: 경기도 고양시 덕양구 화중로 60 (화정동)
4th row: 경기도 고양시 덕양구 행당로 9 (토당동,,9)
5th row: 경기도 고양시 덕양구 고양시청로 13-2 (주교동)

Value: Count (Frequency %)
경기도: 3050 (17.9%)
부천시: 579 (3.4%)
성남시: 533 (3.1%)
분당구: 459 (2.7%)
안산시: 391 (2.3%)
단원구: 277 (1.6%)
수원시: 252 (1.5%)
시흥시: 218 (1.3%)
용인시: 184 (1.1%)
팔달구: 183 (1.1%)
Other values (2656): 10937 (64.1%)

Most occurring characters

ValueCountFrequency (%)
14013
 
18.1%
3300
 
4.3%
3214
 
4.1%
3169
 
4.1%
3129
 
4.0%
3111
 
4.0%
2918
 
3.8%
) 2882
 
3.7%
( 2882
 
3.7%
1 2206
 
2.8%
Other values (349) 36665
47.3%

Most occurring categories

Value: Count (Frequency %)
Other Letter: 45058 (58.1%)
Space Separator: 14013 (18.1%)
Decimal Number: 11514 (14.9%)
Close Punctuation: 2882 (3.7%)
Open Punctuation: 2882 (3.7%)
Other Punctuation: 596 (0.8%)
Dash Punctuation: 532 (0.7%)
Uppercase Letter: 11 (< 0.1%)
Modifier Symbol: 1 (< 0.1%)

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3300
 
7.3%
3214
 
7.1%
3169
 
7.0%
3129
 
6.9%
3111
 
6.9%
2918
 
6.5%
1702
 
3.8%
1139
 
2.5%
929
 
2.1%
876
 
1.9%
Other values (324) 21571
47.9%
Decimal Number
ValueCountFrequency (%)
1 2206
19.2%
2 1639
14.2%
3 1432
12.4%
4 1061
9.2%
7 1043
9.1%
5 975
8.5%
6 874
 
7.6%
0 806
 
7.0%
8 745
 
6.5%
9 733
 
6.4%
Uppercase Letter
ValueCountFrequency (%)
B 3
27.3%
A 2
18.2%
C 1
 
9.1%
S 1
 
9.1%
M 1
 
9.1%
D 1
 
9.1%
J 1
 
9.1%
W 1
 
9.1%
Other Punctuation
ValueCountFrequency (%)
, 592
99.3%
. 4
 
0.7%
Space Separator
ValueCountFrequency (%)
14013
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2882
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2882
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 532
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 45058
58.1%
Common 32420
41.8%
Latin 11
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3300
 
7.3%
3214
 
7.1%
3169
 
7.0%
3129
 
6.9%
3111
 
6.9%
2918
 
6.5%
1702
 
3.8%
1139
 
2.5%
929
 
2.1%
876
 
1.9%
Other values (324) 21571
47.9%
Common
ValueCountFrequency (%)
14013
43.2%
) 2882
 
8.9%
( 2882
 
8.9%
1 2206
 
6.8%
2 1639
 
5.1%
3 1432
 
4.4%
4 1061
 
3.3%
7 1043
 
3.2%
5 975
 
3.0%
6 874
 
2.7%
Other values (7) 3413
 
10.5%
Latin
ValueCountFrequency (%)
B 3
27.3%
A 2
18.2%
C 1
 
9.1%
S 1
 
9.1%
M 1
 
9.1%
D 1
 
9.1%
J 1
 
9.1%
W 1
 
9.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 45058
58.1%
ASCII 32431
41.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
14013
43.2%
) 2882
 
8.9%
( 2882
 
8.9%
1 2206
 
6.8%
2 1639
 
5.1%
3 1432
 
4.4%
4 1061
 
3.3%
7 1043
 
3.2%
5 975
 
3.0%
6 874
 
2.7%
Other values (15) 3424
 
10.6%
Hangul
ValueCountFrequency (%)
3300
 
7.3%
3214
 
7.1%
3169
 
7.0%
3129
 
6.9%
3111
 
6.9%
2918
 
6.5%
1702
 
3.8%
1139
 
2.5%
929
 
2.1%
876
 
1.9%
Other values (324) 21571
47.9%

소재지지번주소
Text

Distinct: 3128
Distinct (%): 98.6%
Missing: 0
Missing (%): 0.0%
Memory size: 24.9 KiB

Length

Max length: 46
Median length: 40
Mean length: 21.486282
Min length: 15

Characters and Unicode

Total characters: 68133
Distinct characters: 305
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3092
Unique (%): 97.5%

Sample

1st row: 경기도 가평군 청평면 상천리 257-4번지
2nd row: 경기도 고양시 덕양구 토당동 887-5번지
3rd row: 경기도 고양시 덕양구 화정동 983번지
4th row: 경기도 고양시 덕양구 토당동 877-8번지 ,9
5th row: 경기도 고양시 덕양구 성사동 502-57번지

Value: Count (Frequency %)
경기도: 3171 (21.5%)
부천시: 586 (4.0%)
성남시: 555 (3.8%)
분당구: 462 (3.1%)
안산시: 395 (2.7%)
단원구: 280 (1.9%)
수원시: 263 (1.8%)
시흥시: 234 (1.6%)
용인시: 197 (1.3%)
팔달구: 193 (1.3%)
Other values (3337): 8412 (57.0%)

Most occurring characters

ValueCountFrequency (%)
11577
17.0%
3405
 
5.0%
3361
 
4.9%
3307
 
4.9%
3205
 
4.7%
3204
 
4.7%
3177
 
4.7%
3174
 
4.7%
1 2720
 
4.0%
- 2447
 
3.6%
Other values (295) 28556
41.9%

Most occurring categories

Value: Count (Frequency %)
Other Letter: 41054 (60.3%)
Decimal Number: 12928 (19.0%)
Space Separator: 11577 (17.0%)
Dash Punctuation: 2447 (3.6%)
Other Punctuation: 108 (0.2%)
Uppercase Letter: 9 (< 0.1%)
Open Punctuation: 5 (< 0.1%)
Close Punctuation: 5 (< 0.1%)

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3405
 
8.3%
3361
 
8.2%
3307
 
8.1%
3205
 
7.8%
3204
 
7.8%
3177
 
7.7%
3174
 
7.7%
1714
 
4.2%
767
 
1.9%
686
 
1.7%
Other values (269) 15054
36.7%
Decimal Number
ValueCountFrequency (%)
1 2720
21.0%
2 1539
11.9%
3 1505
11.6%
5 1277
9.9%
4 1238
9.6%
7 1148
8.9%
6 973
 
7.5%
0 926
 
7.2%
8 895
 
6.9%
9 707
 
5.5%
Uppercase Letter
ValueCountFrequency (%)
B 2
22.2%
D 1
11.1%
C 1
11.1%
S 1
11.1%
M 1
11.1%
W 1
11.1%
J 1
11.1%
A 1
11.1%
Other Punctuation
ValueCountFrequency (%)
, 102
94.4%
. 4
 
3.7%
@ 1
 
0.9%
/ 1
 
0.9%
Space Separator
ValueCountFrequency (%)
11577
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2447
100.0%
Open Punctuation
ValueCountFrequency (%)
( 5
100.0%
Close Punctuation
ValueCountFrequency (%)
) 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 41054
60.3%
Common 27070
39.7%
Latin 9
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3405
 
8.3%
3361
 
8.2%
3307
 
8.1%
3205
 
7.8%
3204
 
7.8%
3177
 
7.7%
3174
 
7.7%
1714
 
4.2%
767
 
1.9%
686
 
1.7%
Other values (269) 15054
36.7%
Common
ValueCountFrequency (%)
11577
42.8%
1 2720
 
10.0%
- 2447
 
9.0%
2 1539
 
5.7%
3 1505
 
5.6%
5 1277
 
4.7%
4 1238
 
4.6%
7 1148
 
4.2%
6 973
 
3.6%
0 926
 
3.4%
Other values (8) 1720
 
6.4%
Latin
ValueCountFrequency (%)
B 2
22.2%
D 1
11.1%
C 1
11.1%
S 1
11.1%
M 1
11.1%
W 1
11.1%
J 1
11.1%
A 1
11.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 41054
60.3%
ASCII 27079
39.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
11577
42.8%
1 2720
 
10.0%
- 2447
 
9.0%
2 1539
 
5.7%
3 1505
 
5.6%
5 1277
 
4.7%
4 1238
 
4.6%
7 1148
 
4.2%
6 973
 
3.6%
0 926
 
3.4%
Other values (16) 1729
 
6.4%
Hangul
ValueCountFrequency (%)
3405
 
8.3%
3361
 
8.2%
3307
 
8.1%
3205
 
7.8%
3204
 
7.8%
3177
 
7.7%
3174
 
7.7%
1714
 
4.2%
767
 
1.9%
686
 
1.7%
Other values (269) 15054
36.7%

소재지우편번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 787
Distinct (%): 24.8%
Missing: 2
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 363678.23
Minimum: 14405
Maximum: 487912
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 28.0 KiB

Quantile statistics

Minimum: 14405
5-th percentile: 14538
Q1: 415831
median: 429874
Q3: 459822
95-th percentile: 471823
Maximum: 487912
Range: 473507
Interquartile range (IQR): 43991

Descriptive statistics

Standard deviation: 166849.63
Coefficient of variation (CV): 0.45878365
Kurtosis: 0.59351682
Mean: 363678.23
Median Absolute Deviation (MAD): 17057
Skewness: -1.5881275
Sum: 1.1524963 × 10^9
Variance: 2.7838797 × 10^10
Monotonicity: Not monotonic
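
The quantiles of 소재지우편번호 mix five-digit values (for example 14405) with six-digit values (for example 487912). South Korea moved from six-digit to five-digit postal codes in 2015, so the column appears to contain both formats, and numeric statistics across them are hard to interpret. A minimal sketch that separates the two by digit count, assuming codes are stored as numbers without leading zeros; `classify_postcode` is a hypothetical helper.

```python
def classify_postcode(code):
    """Label a postal code by its digit count (assumption: 5 = current, 6 = legacy format)."""
    digits = str(int(code))
    if len(digits) == 5:
        return "5-digit"
    if len(digits) == 6:
        return "6-digit"
    return "unexpected"

# Example values taken from the quantile and frequency tables of the report.
examples = [14405, 14548, 415831, 463824, 487912]
labels = {code: classify_postcode(code) for code in examples}

for code, label in labels.items():
    print(code, label)
```

Treating the column as text, or splitting it by format before profiling, would avoid averaging over two incompatible code systems.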
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
463824 75
 
2.4%
425868 72
 
2.3%
442834 67
 
2.1%
463825 50
 
1.6%
425807 46
 
1.5%
14548 44
 
1.4%
429874 39
 
1.2%
435805 38
 
1.2%
463828 33
 
1.0%
442835 33
 
1.0%
Other values (777) 2672
84.3%
ValueCountFrequency (%)
14405 1
 
< 0.1%
14406 1
 
< 0.1%
14407 2
0.1%
14408 3
0.1%
14411 2
0.1%
14413 1
 
< 0.1%
14414 1
 
< 0.1%
14416 4
0.1%
14418 1
 
< 0.1%
14419 2
0.1%
ValueCountFrequency (%)
487912 2
 
0.1%
487911 2
 
0.1%
487857 1
 
< 0.1%
487853 2
 
0.1%
487823 19
0.6%
487805 1
 
< 0.1%
487804 1
 
< 0.1%
487803 1
 
< 0.1%
487050 1
 
< 0.1%
487020 1
 
< 0.1%

WGS84위도
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 3032
Distinct (%): 98.3%
Missing: 85
Missing (%): 2.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.417794
Minimum: 36.953958
Maximum: 38.030808
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 28.0 KiB

Quantile statistics

Minimum: 36.953958
5-th percentile: 37.26204
Q1: 37.319983
median: 37.386613
Q3: 37.499699
95-th percentile: 37.647434
Maximum: 38.030808
Range: 1.0768501
Interquartile range (IQR): 0.17971619

Descriptive statistics

Standard deviation: 0.13780014
Coefficient of variation (CV): 0.0036827437
Kurtosis: 2.0204308
Mean: 37.417794
Median Absolute Deviation (MAD): 0.084412877
Skewness: 0.79340666
Sum: 115471.31
Variance: 0.01898888
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
37.3716254258 6
 
0.2%
37.3377771272 3
 
0.1%
37.6165068253 3
 
0.1%
37.5235727573 3
 
0.1%
37.3759788963 2
 
0.1%
37.3661175242 2
 
0.1%
37.3520167812 2
 
0.1%
37.2661834411 2
 
0.1%
37.3858474259 2
 
0.1%
37.8291668315 2
 
0.1%
Other values (3022) 3059
96.5%
(Missing) 85
 
2.7%
ValueCountFrequency (%)
36.9539581621 1
< 0.1%
36.9567947948 1
< 0.1%
36.9806819927 1
< 0.1%
36.9896617484 1
< 0.1%
36.9911199178 1
< 0.1%
36.9912796065 1
< 0.1%
36.9913004352 1
< 0.1%
36.9915887354 1
< 0.1%
36.9924129235 1
< 0.1%
36.9926341108 1
< 0.1%
ValueCountFrequency (%)
38.0308082416 1
< 0.1%
38.0278453361 1
< 0.1%
37.9831850705 1
< 0.1%
37.9742518697 1
< 0.1%
37.9675908676 1
< 0.1%
37.9150871886 1
< 0.1%
37.9108059116 1
< 0.1%
37.9081162039 1
< 0.1%
37.9081012046 1
< 0.1%
37.9080783485 1
< 0.1%

WGS84경도
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 3032
Distinct (%): 98.3%
Missing: 85
Missing (%): 2.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 126.93354
Minimum: 126.53419
Maximum: 127.63589
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 28.0 KiB

Quantile statistics

Minimum: 126.53419
5-th percentile: 126.72802
Q1: 126.786
median: 126.88351
Q3: 127.10677
95-th percentile: 127.14692
Maximum: 127.63589
Range: 1.1017
Interquartile range (IQR): 0.3207643

Descriptive statistics

Standard deviation: 0.16091465
Coefficient of variation (CV): 0.0012677079
Kurtosis: -0.94562405
Mean: 126.93354
Median Absolute Deviation (MAD): 0.13107868
Skewness: 0.26441732
Sum: 391716.91
Variance: 0.025893525
Monotonicity: Not monotonic
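
The latitude and longitude ranges are easier to judge as distances. The sketch below uses the haversine formula on a spherical Earth (radius 6371 km, an approximation) to estimate the north-south and east-west extent of the bounding box given by the reported minima and maxima; the function name and the choice of the mean latitude for the east-west measurement are illustrative assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two WGS84 points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Minimum, maximum and mean values copied from the WGS84위도 and WGS84경도 statistics.
lat_min, lat_max, lat_mean = 36.953958, 38.030808, 37.417794
lon_min, lon_max = 126.53419, 127.63589

north_south_km = haversine_km(lat_min, lon_min, lat_max, lon_min)
east_west_km = haversine_km(lat_mean, lon_min, lat_mean, lon_max)

print(f"north-south extent: {north_south_km:.1f} km")
print(f"east-west extent:   {east_west_km:.1f} km")
```

Extents on the order of a hundred kilometres are consistent with points confined to a single province; a much larger figure would suggest mis-geocoded rows.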
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
126.7572663613 6
 
0.2%
126.7498217877 3
 
0.1%
126.621303491 3
 
0.1%
126.7668756786 3
 
0.1%
127.1171172326 2
 
0.1%
126.9292363809 2
 
0.1%
127.1244991912 2
 
0.1%
127.1069368459 2
 
0.1%
127.1250648 2
 
0.1%
127.1490370888 2
 
0.1%
Other values (3022) 3059
96.5%
(Missing) 85
 
2.7%
ValueCountFrequency (%)
126.5341928032 1
< 0.1%
126.5519172752 1
< 0.1%
126.5602792108 1
< 0.1%
126.5639366925 1
< 0.1%
126.5649396805 1
< 0.1%
126.5882834947 1
< 0.1%
126.5976057487 1
< 0.1%
126.5984311406 1
< 0.1%
126.5986993879 1
< 0.1%
126.5989605649 1
< 0.1%
ValueCountFrequency (%)
127.6358927578 1
< 0.1%
127.635598597 1
< 0.1%
127.4891064272 1
< 0.1%
127.4647290322 1
< 0.1%
127.4511242582 1
< 0.1%
127.4471528681 1
< 0.1%
127.4465867053 1
< 0.1%
127.4450748263 1
< 0.1%
127.4434388848 1
< 0.1%
127.4391970024 1
< 0.1%

Interactions

[Figures omitted: 25 pairwise interaction plots]

Correlations

 | 시군명 | 인허가일자 | 영업상태명 | 폐업일자 | 건물소유구분명 | 소재지우편번호 | WGS84위도 | WGS84경도
시군명 | 1.000 | 0.727 | 0.292 | 0.966 | 0.937 | 0.998 | 0.974 | 0.965
인허가일자 | 0.727 | 1.000 | 0.088 | 0.738 | 0.351 | 0.323 | 0.508 | 0.560
영업상태명 | 0.292 | 0.088 | 1.000 | NaN | 0.084 | 0.071 | 0.082 | 0.127
폐업일자 | 0.966 | 0.738 | NaN | 1.000 | NaN | 0.932 | 0.797 | 0.885
건물소유구분명 | 0.937 | 0.351 | 0.084 | NaN | 1.000 | 0.422 | 0.664 | 0.599
소재지우편번호 | 0.998 | 0.323 | 0.071 | 0.932 | 0.422 | 1.000 | 0.764 | 0.826
WGS84위도 | 0.974 | 0.508 | 0.082 | 0.797 | 0.664 | 0.764 | 1.000 | 0.734
WGS84경도 | 0.965 | 0.560 | 0.127 | 0.885 | 0.599 | 0.826 | 0.734 | 1.000

 | 시군명 | 위생업종명 | 건물소유구분명 | 위생업태명 | 영업상태명
시군명 | 1.000 | 1.000 | 0.802 | 1.000 | 0.249
위생업종명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
건물소유구분명 | 0.802 | 1.000 | 1.000 | 1.000 | 0.054
위생업태명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
영업상태명 | 0.249 | 1.000 | 0.054 | 1.000 | 1.000

 | 인허가일자 | 폐업일자 | 소재지우편번호 | WGS84위도 | WGS84경도 | 시군명 | 영업상태명 | 건물소유구분명 | 위생업종명 | 위생업태명
인허가일자 | 1.000 | 0.177 | 0.038 | -0.074 | 0.001 | 0.358 | 0.068 | 0.262 | 1.000 | 1.000
폐업일자 | 0.177 | 1.000 | 0.048 | 0.264 | 0.194 | 0.809 | 1.000 | 1.000 | 1.000 | 1.000
소재지우편번호 | 0.038 | 0.048 | 1.000 | -0.333 | 0.876 | 0.994 | 0.119 | 0.277 | 1.000 | 1.000
WGS84위도 | -0.074 | 0.264 | -0.333 | 1.000 | -0.320 | 0.831 | 0.063 | 0.669 | 1.000 | 1.000
WGS84경도 | 0.001 | 0.194 | 0.876 | -0.320 | 1.000 | 0.792 | 0.097 | 0.602 | 1.000 | 1.000
시군명 | 0.358 | 0.809 | 0.994 | 0.831 | 0.792 | 1.000 | 0.249 | 0.802 | 1.000 | 1.000
영업상태명 | 0.068 | 1.000 | 0.119 | 0.063 | 0.097 | 0.249 | 1.000 | 0.054 | 1.000 | 1.000
건물소유구분명 | 0.262 | 1.000 | 0.277 | 0.669 | 0.602 | 0.802 | 0.054 | 1.000 | 1.000 | 1.000
위생업종명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
위생업태명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

index | 시군명 | 사업장명 | 인허가일자 | 영업상태명 | 폐업일자 | 건물소유구분명 | 다중이용업소여부 | 위생업종명 | 위생업태명 | 소재지도로명주소 | 소재지지번주소 | 소재지우편번호 | WGS84위도 | WGS84경도
0 | 가평군 | 에덴농산물센타 | 19911113 | 운영중 | <NA> | 자가 | N | 공중이용시설 | 복합건축물 | 경기도 가평군 청평면 경춘로 1444 | 경기도 가평군 청평면 상천리 257-4번지 | 477814 | 37.778273 | 127.464729
1 | 고양시 | 삼흥프라자 | 19961024 | 운영중 | <NA> | 임대 | N | 공중이용시설 | 복합건축물 | 경기도 고양시 덕양구 행신로 95 (토당동) | 경기도 고양시 덕양구 토당동 887-5번지 | 412821 | 37.617783 | 126.826335
2 | 고양시 | 화정종합상가 | 19961024 | 운영중 | <NA> | 임대 | N | 공중이용시설 | 복합건축물 | 경기도 고양시 덕양구 화중로 60 (화정동) | 경기도 고양시 덕양구 화정동 983번지 | 412827 | 37.632437 | 126.830815
3 | 고양시 | 능곡프라자 | 19961024 | 운영중 | <NA> | 임대 | N | 공중이용시설 | 복합건축물 | 경기도 고양시 덕양구 행당로 9 (토당동,,9) | 경기도 고양시 덕양구 토당동 877-8번지 ,9 | 412210 | 37.619988 | 126.825737
4 | 고양시 | 동양쇼핑(주) | 19961024 | 운영중 | <NA> | 임대 | N | 공중이용시설 | 복합건축물 | <NA> | 경기도 고양시 덕양구 성사동 502-57번지 | 412806 | <NA> | <NA>
5 | 고양시 | 성광빌딩 | 19961226 | 운영중 | <NA> | 임대 | N | 공중이용시설 | 복합건축물 | 경기도 고양시 덕양구 고양시청로 13-2 (주교동) | 경기도 고양시 덕양구 주교동 602번지 | 412811 | 37.657456 | 126.832008
6 | 고양시 | 충현빌딩 | 19961226 | 운영중 | <NA> | 임대 | N | 공중이용시설 | 복합건축물 | 경기도 고양시 덕양구 고양시청로 3 (주교동) | 경기도 고양시 덕양구 주교동 601-4번지 | 412811 | 37.657246 | 126.83311
7 | 고양시 | 태흥빌딩 | 19961226 | 운영중 | <NA> | 임대 | N | 공중이용시설 | 복합건축물 | 경기도 고양시 덕양구 호국로 822 (성사동) | 경기도 고양시 덕양구 성사동 501-2번지 | 412806 | 37.658531 | 126.838846
8 | 고양시 | 건우프라자 | 19961226 | 운영중 | <NA> | 임대 | N | 공중이용시설 | 복합건축물 | 경기도 고양시 덕양구 고양시청로 13-5 (주교동) | 경기도 고양시 덕양구 주교동 601-9번지 | 412811 | 37.657289 | 126.83269
9 | 고양시 | 태성빌딩 | 19961226 | 운영중 | <NA> | 임대 | N | 공중이용시설 | 복합건축물 | 경기도 고양시 덕양구 호국로811번길 19 (주교동) | 경기도 고양시 덕양구 주교동 619번지 | 412812 | 37.658664 | 126.836861

index | 시군명 | 사업장명 | 인허가일자 | 영업상태명 | 폐업일자 | 건물소유구분명 | 다중이용업소여부 | 위생업종명 | 위생업태명 | 소재지도로명주소 | 소재지지번주소 | 소재지우편번호 | WGS84위도 | WGS84경도
3161 | 하남시 | 대유빌딩 | 19980210 | 운영중 | <NA> | 자가 | N | 공중이용시설 | 복합건축물 | 경기도 하남시 대청로 29 (신장동,지하3층,지상7층) | 경기도 하남시 신장동 523-4번지 지하3층,지상7층 | 465810 | 37.541054 | 127.215571
3162 | 하남시 | 은혜빌딩 | 19980416 | 운영중 | <NA> | 자가 | N | 공중이용시설 | 복합건축물 | 경기도 하남시 하남대로784번안길 20 (신장동) | 경기도 하남시 신장동 386-2번지 | 465810 | 37.54119 | 127.212174
3163 | 하남시 | (상호미정)김경재 | 20040706 | 운영중 | <NA> | 자가 | N | 공중이용시설 | 복합건축물 | 경기도 하남시 하남대로 833 (덕풍동) | 경기도 하남시 덕풍동 318-4번지 | 465800 | 37.542272 | 127.205163
3164 | 하남시 | (상호미정) 조성강 | 20040706 | 운영중 | <NA> | 자가 | N | 공중이용시설 | 복합건축물 | 경기도 하남시 하남대로801번길 65 (신장동) | 경기도 하남시 신장동 427-38번지 | 465820 | 37.538211 | 127.206662
3165 | 하남시 | (상호미정) 박석주 | 20040706 | 운영중 | <NA> | 자가 | N | 공중이용시설 | 복합건축물 | 경기도 하남시 하남대로 804 (신장동) | 경기도 하남시 신장동 564-1번지 | 465815 | 37.541235 | 127.208438
3166 | 하남시 | (상호미정) 최성학 | 20040706 | 운영중 | <NA> | 자가 | N | 공중이용시설 | 복합건축물 | 경기도 하남시 하남대로 812 (신장동) | 경기도 하남시 신장동 564-7번지 | 465815 | 37.541558 | 127.207781
3167 | 하남시 | (상호미정) 이명희 | 20040706 | 운영중 | <NA> | 자가 | N | 공중이용시설 | 복합건축물 | 경기도 하남시 대성로319번길 50 (덕풍동) | 경기도 하남시 덕풍동 459-8번지 | 465805 | 37.537315 | 127.205024
3168 | 하남시 | 한양빌딩 | 19921215 | 폐업 등 | 20030102 | 자가 | N | 공중이용시설 | 복합건축물 | 경기도 하남시 신장로 156 | 경기도 하남시 덕풍동 394-1번지 | 465813 | 37.53993 | 127.201946
3169 | 하남시 | 덕화빌딩 | 19870619 | 폐업 등 | 20040102 | 자가 | N | 공중이용시설 | 복합건축물 | <NA> | 경기도 하남시 덕풍동 459-1번지 | 465805 | <NA> | <NA>
3170 | 화성시 | 르노삼성자동차(주) | 19980317 | 운영중 | <NA> | <NA> | N | 공중이용시설 | 복합건축물 | <NA> | 경기도 화성시 영천동 76번지 | 445130 | 37.216274 | 127.103884

Duplicate rows

Most frequently occurring

index | 시군명 | 사업장명 | 인허가일자 | 영업상태명 | 폐업일자 | 건물소유구분명 | 다중이용업소여부 | 위생업종명 | 위생업태명 | 소재지도로명주소 | 소재지지번주소 | 소재지우편번호 | WGS84위도 | WGS84경도 | # duplicates
0 | 시흥시 | (주)시흥관광호텔 | 20031001 | 폐업 등 | 20070319 | <NA> | N | 공중이용시설 | 복합건축물 | 경기도 시흥시 평안상가4길 21 | 경기도 시흥시 정왕동 1622-6번지 | 429858 | 37.337777 | 126.749822 | 2
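
Fully duplicated rows like the one listed here can be found by counting identical records. A minimal sketch with the standard library, assuming each row is available as a sequence of values; the sample rows are shortened to three columns for illustration and `duplicate_rows` is a hypothetical helper.

```python
from collections import Counter

def duplicate_rows(rows):
    """Return {row: count} for every row that occurs more than once."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}

# Illustrative rows (시군명, 사업장명, 인허가일자); the real table has 14 columns.
rows = [
    ("시흥시", "(주)시흥관광호텔", 20031001),
    ("하남시", "한양빌딩", 19921215),
    ("시흥시", "(주)시흥관광호텔", 20031001),
]

dups = duplicate_rows(rows)
for row, n in dups.items():
    print(row, n)
```

Whether such a row should be dropped depends on whether the source registry really lists the same facility twice, which the report alone cannot tell.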