Overview

Dataset statistics

Number of variables: 14
Number of observations: 330
Missing cells: 1225
Missing cells (%): 26.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 37.5 KiB
Average record size in memory: 116.4 B
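
The percentage and per-record figures above can be recovered from the raw counts. The following is a minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (variables × observations) and the record size is total memory divided by observations. The variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics above.
n_variables = 14
n_observations = 330
missing_cells = 1225
total_size_kib = 37.5

# Assumed definition: share of all cells (rows x columns) that are missing.
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Assumed definition: total memory spread evenly across observations.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both results round to the values the report shows (26.5% and 116.4 B).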

Variable types

Categorical: 1
Text: 6
DateTime: 3
Numeric: 4

Dataset

Description: 경기도_공동주택 시공 현황 (Gyeonggi-do multi-unit housing construction status)
Author: 가평군 (Gapyeong-gun)
URL: https://data.gg.go.kr/portal/data/service/selectServicePage.do?&infId=O7S825IBO42JHKPB03C426764301&infSeq=1

Alerts

우편번호 is highly overall correlated with WGS84위도 and 1 other field [High correlation]
WGS84위도 is highly overall correlated with 우편번호 and 1 other field [High correlation]
WGS84경도 is highly overall correlated with 시군명 [High correlation]
시군명 is highly overall correlated with 우편번호 and 2 other fields [High correlation]
공사완료일 has 11 (3.3%) missing values [Missing]
시공사전화번호 has 90 (27.3%) missing values [Missing]
세대수 has 21 (6.4%) missing values [Missing]
입주예정일 has 65 (19.7%) missing values [Missing]
우편번호 has 173 (52.4%) missing values [Missing]
시공위치지번주소 has 89 (27.0%) missing values [Missing]
시공위치도로명주소 has 286 (86.7%) missing values [Missing]
WGS84위도 has 113 (34.2%) missing values [Missing]
WGS84경도 has 113 (34.2%) missing values [Missing]
비고 has 260 (78.8%) missing values [Missing]

Reproduction

Analysis started: 2024-04-29 13:28:21.783775
Analysis finished: 2024-04-29 13:28:26.180915
Duration: 4.4 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
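
A report of this kind can be regenerated with the ydata-profiling package named above. The sketch below assumes the dataset has been exported to a CSV file. The helper name `build_report`, the file paths, and the encoding are hypothetical, and the exact settings used for this report (stored in config.json) are not reproduced here.

```python
def build_report(csv_path, html_path="report.html", encoding="utf-8"):
    """Profile a CSV export and write an HTML report (hypothetical helper)."""
    # Imported lazily so the function can be defined where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # File name and encoding are assumptions; adjust them to the actual export.
    df = pd.read_csv(csv_path, encoding=encoding)

    # The title reuses the Description field shown in the Dataset section.
    profile = ProfileReport(df, title="경기도_공동주택 시공 현황")
    profile.to_file(html_path)
    return profile
```

Calling `build_report("apartment_construction.csv")` (an illustrative file name) would write `report.html`. Inferred variable types may differ from those above unless the date columns are parsed when the file is loaded.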

Variables

시군명
Categorical

HIGH CORRELATION 

Distinct: 29
Distinct (%): 8.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

Value | Count
화성시 | 35
안양시 | 33
양주시 | 32
평택시 | 31
용인시 | 25
Other values (24) | 174

Length

Max length: 4
Median length: 3
Mean length: 3.0393939
Min length: 3

Unique

Unique: 4
Unique (%): 1.2%
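
Distinct and Unique measure different things: distinct counts how many different values occur, while unique counts the values that occur exactly once. Here that is 29 distinct city names, 4 of which appear on a single row (4 / 330 ≈ 1.2%). A minimal sketch of the distinction on a toy column follows. The values are illustrative and are not the real 330 rows.

```python
from collections import Counter

# Toy column (illustrative values only).
values = ["가평군", "가평군", "고양시", "화성시", "화성시", "화성시", "과천시"]

counts = Counter(values)
n_distinct = len(counts)  # number of different values present
n_unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

print(n_distinct, n_unique)
```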

Sample

1st row: 가평군
2nd row: 가평군
3rd row: 가평군
4th row: 가평군
5th row: 고양시

Common Values

Value | Count | Frequency (%)
화성시 | 35 | 10.6%
안양시 | 33 | 10.0%
양주시 | 32 | 9.7%
평택시 | 31 | 9.4%
용인시 | 25 | 7.6%
파주시 | 20 | 6.1%
고양시 | 20 | 6.1%
김포시 | 16 | 4.8%
오산시 | 14 | 4.2%
광주시 | 12 | 3.6%
Other values (19) | 92 | 27.9%

Length

[Figure: Histogram of lengths of the category]

공사명
Text

Distinct: 311
Distinct (%): 94.2%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

Length

Max length: 40
Median length: 29
Mean length: 16.521212
Min length: 5

Characters and Unicode

Total characters: 5452
Distinct characters: 320
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
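
As an illustration of how such character properties can be tallied, the sketch below counts Unicode general categories for one sample value from this report using Python's standard `unicodedata` module. This is not necessarily how the profiling tool computes its tables, and the standard library exposes categories only, not scripts or blocks.

```python
import unicodedata
from collections import Counter

# First sample value listed for this text variable in the report.
text = "가평 대곡2지구 공동주택 신축공사"

# unicodedata.category returns the two-letter general category of a character,
# e.g. "Lo" (Other Letter), "Zs" (Space Separator), "Nd" (Decimal Number).
category_counts = Counter(unicodedata.category(ch) for ch in text)

print(category_counts)
```

For this string the tally is 14 Other Letter characters (Hangul syllables), 3 space separators and 1 decimal number.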

Unique

Unique: 298
Unique (%): 90.3%

Sample

1st row: 가평 대곡2지구 공동주택 신축공사
2nd row: 가평 대곡지구 공동주택 신축공사
3rd row: 가평 읍내지구 공동주택 신축공사
4th row: 디엘본가평설악지역 주택조합 아파트 신축공사
5th row: 고양 덕은 A2블록 공동주택 신축공사

Value | Count | Frequency (%)
공동주택 | 121 | 11.4%
신축공사 | 79 | 7.4%
양주 | 32 | 3.0%
옥정지구 | 20 | 1.9%
재개발정비사업 | 19 | 1.8%
아파트 | 17 | 1.6%
힐스테이트 | 16 | 1.5%
파주운정3지구 | 13 | 1.2%
세교2지구 | 11 | 1.0%
고양 | 10 | 0.9%
Other values (489) | 726 | 68.2%

Most occurring characters

ValueCountFrequency (%)
734
 
13.5%
242
 
4.4%
238
 
4.4%
195
 
3.6%
178
 
3.3%
162
 
3.0%
145
 
2.7%
142
 
2.6%
105
 
1.9%
103
 
1.9%
Other values (310) 3208
58.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 3956
72.6%
Space Separator 734
 
13.5%
Decimal Number 338
 
6.2%
Uppercase Letter 272
 
5.0%
Dash Punctuation 59
 
1.1%
Open Punctuation 28
 
0.5%
Close Punctuation 28
 
0.5%
Lowercase Letter 21
 
0.4%
Other Punctuation 14
 
0.3%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
242
 
6.1%
238
 
6.0%
195
 
4.9%
178
 
4.5%
162
 
4.1%
145
 
3.7%
142
 
3.6%
105
 
2.7%
103
 
2.6%
87
 
2.2%
Other values (260) 2359
59.6%
Uppercase Letter
ValueCountFrequency (%)
B 96
35.3%
A 72
26.5%
L 69
25.4%
C 10
 
3.7%
P 4
 
1.5%
E 4
 
1.5%
S 4
 
1.5%
D 3
 
1.1%
H 3
 
1.1%
T 2
 
0.7%
Other values (5) 5
 
1.8%
Lowercase Letter
ValueCountFrequency (%)
e 5
23.8%
d 2
 
9.5%
t 2
 
9.5%
a 2
 
9.5%
i 2
 
9.5%
r 2
 
9.5%
k 1
 
4.8%
o 1
 
4.8%
m 1
 
4.8%
s 1
 
4.8%
Other values (2) 2
 
9.5%
Decimal Number
ValueCountFrequency (%)
2 92
27.2%
1 90
26.6%
3 52
15.4%
4 32
 
9.5%
6 16
 
4.7%
9 15
 
4.4%
5 14
 
4.1%
0 11
 
3.3%
8 8
 
2.4%
7 8
 
2.4%
Other Punctuation
ValueCountFrequency (%)
, 7
50.0%
/ 2
 
14.3%
? 1
 
7.1%
& 1
 
7.1%
; 1
 
7.1%
· 1
 
7.1%
# 1
 
7.1%
Space Separator
ValueCountFrequency (%)
734
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 59
100.0%
Open Punctuation
ValueCountFrequency (%)
( 28
100.0%
Close Punctuation
ValueCountFrequency (%)
) 28
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%
Currency Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 3956
72.6%
Common 1203
 
22.1%
Latin 293
 
5.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
242
 
6.1%
238
 
6.0%
195
 
4.9%
178
 
4.5%
162
 
4.1%
145
 
3.7%
142
 
3.6%
105
 
2.7%
103
 
2.6%
87
 
2.2%
Other values (260) 2359
59.6%
Latin
ValueCountFrequency (%)
B 96
32.8%
A 72
24.6%
L 69
23.5%
C 10
 
3.4%
e 5
 
1.7%
P 4
 
1.4%
E 4
 
1.4%
S 4
 
1.4%
D 3
 
1.0%
H 3
 
1.0%
Other values (17) 23
 
7.8%
Common
ValueCountFrequency (%)
734
61.0%
2 92
 
7.6%
1 90
 
7.5%
- 59
 
4.9%
3 52
 
4.3%
4 32
 
2.7%
( 28
 
2.3%
) 28
 
2.3%
6 16
 
1.3%
9 15
 
1.2%
Other values (13) 57
 
4.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 3956
72.6%
ASCII 1494
 
27.4%
None 1
 
< 0.1%
Currency Symbols 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
734
49.1%
B 96
 
6.4%
2 92
 
6.2%
1 90
 
6.0%
A 72
 
4.8%
L 69
 
4.6%
- 59
 
3.9%
3 52
 
3.5%
4 32
 
2.1%
( 28
 
1.9%
Other values (38) 170
 
11.4%
Hangul
ValueCountFrequency (%)
242
 
6.1%
238
 
6.0%
195
 
4.9%
178
 
4.5%
162
 
4.1%
145
 
3.7%
142
 
3.6%
105
 
2.7%
103
 
2.6%
87
 
2.2%
Other values (260) 2359
59.6%
None
ValueCountFrequency (%)
· 1
100.0%
Currency Symbols
ValueCountFrequency (%)
1
100.0%

공사시작일
Date

Distinct: 244
Distinct (%): 74.4%
Missing: 2
Missing (%): 0.6%
Memory size: 2.7 KiB
Minimum: 2015-08-10 00:00:00
Maximum: 2023-12-07 00:00:00
[Figure: histogram with fixed-size bins (bins=50)]

공사완료일
Date

MISSING 

Distinct: 180
Distinct (%): 56.4%
Missing: 11
Missing (%): 3.3%
Memory size: 2.7 KiB
Minimum: 2018-01-01 00:00:00
Maximum: 2027-07-01 00:00:00
[Figure: histogram with fixed-size bins (bins=50)]

시공사명
Text

Distinct: 168
Distinct (%): 51.2%
Missing: 2
Missing (%): 0.6%
Memory size: 2.7 KiB

Length

Max length: 31
Median length: 28
Mean length: 6.6432927
Min length: 2

Characters and Unicode

Total characters: 2179
Distinct characters: 156
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 110
Unique (%): 33.5%

Sample

1st row: 지에스건설㈜
2nd row: 대림산업주식회사
3rd row: 현대건설㈜
4th row: 선원건설
5th row: ㈜중흥토건

Value | Count | Frequency (%)
현대건설㈜ | 24 | 6.3%
지에스건설㈜ | 13 | 3.4%
㈜현대건설 | 8 | 2.1%
㈜대우건설 | 8 | 2.1%
제일건설㈜ | 8 | 2.1%
gs건설 | 7 | 1.8%
주)포스코건설 | 7 | 1.8%
현대엔지니어링㈜ | 7 | 1.8%
주식회사 | 7 | 1.8%
계룡건설산업㈜ | 7 | 1.8%
Other values (173) | 286 | 74.9%

Most occurring characters

ValueCountFrequency (%)
258
 
11.8%
239
 
11.0%
238
 
10.9%
102
 
4.7%
71
 
3.3%
57
 
2.6%
55
 
2.5%
) 50
 
2.3%
( 50
 
2.3%
49
 
2.2%
Other values (146) 1010
46.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 1708
78.4%
Other Symbol 238
 
10.9%
Space Separator 55
 
2.5%
Close Punctuation 50
 
2.3%
Open Punctuation 50
 
2.3%
Uppercase Letter 46
 
2.1%
Other Punctuation 28
 
1.3%
Decimal Number 3
 
0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
258
 
15.1%
239
 
14.0%
102
 
6.0%
71
 
4.2%
57
 
3.3%
49
 
2.9%
49
 
2.9%
37
 
2.2%
31
 
1.8%
27
 
1.6%
Other values (128) 788
46.1%
Uppercase Letter
ValueCountFrequency (%)
S 15
32.6%
G 12
26.1%
C 8
17.4%
K 7
15.2%
R 1
 
2.2%
E 1
 
2.2%
H 1
 
2.2%
D 1
 
2.2%
Other Punctuation
ValueCountFrequency (%)
, 26
92.9%
/ 1
 
3.6%
& 1
 
3.6%
Decimal Number
ValueCountFrequency (%)
1 2
66.7%
2 1
33.3%
Other Symbol
ValueCountFrequency (%)
238
100.0%
Space Separator
ValueCountFrequency (%)
55
100.0%
Close Punctuation
ValueCountFrequency (%)
) 50
100.0%
Open Punctuation
ValueCountFrequency (%)
( 50
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1946
89.3%
Common 187
 
8.6%
Latin 46
 
2.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
258
 
13.3%
239
 
12.3%
238
 
12.2%
102
 
5.2%
71
 
3.6%
57
 
2.9%
49
 
2.5%
49
 
2.5%
37
 
1.9%
31
 
1.6%
Other values (129) 815
41.9%
Common
ValueCountFrequency (%)
55
29.4%
) 50
26.7%
( 50
26.7%
, 26
13.9%
1 2
 
1.1%
/ 1
 
0.5%
& 1
 
0.5%
2 1
 
0.5%
+ 1
 
0.5%
Latin
ValueCountFrequency (%)
S 15
32.6%
G 12
26.1%
C 8
17.4%
K 7
15.2%
R 1
 
2.2%
E 1
 
2.2%
H 1
 
2.2%
D 1
 
2.2%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1708
78.4%
None 238
 
10.9%
ASCII 233
 
10.7%

Most frequent character per block

Hangul
ValueCountFrequency (%)
258
 
15.1%
239
 
14.0%
102
 
6.0%
71
 
4.2%
57
 
3.3%
49
 
2.9%
49
 
2.9%
37
 
2.2%
31
 
1.8%
27
 
1.6%
Other values (128) 788
46.1%
None
ValueCountFrequency (%)
238
100.0%
ASCII
ValueCountFrequency (%)
55
23.6%
) 50
21.5%
( 50
21.5%
, 26
11.2%
S 15
 
6.4%
G 12
 
5.2%
C 8
 
3.4%
K 7
 
3.0%
1 2
 
0.9%
R 1
 
0.4%
Other values (7) 7
 
3.0%

시공사전화번호
Text

MISSING 

Distinct: 204
Distinct (%): 85.0%
Missing: 90
Missing (%): 27.3%
Memory size: 2.7 KiB

Length

Max length: 13
Median length: 12
Mean length: 12.083333
Min length: 9

Characters and Unicode

Total characters: 2900
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 179
Unique (%): 74.6%

Sample

1st row: 02-2154-7631
2nd row: 031-581-9431
3rd row: 031-582-8410
4th row: 031-584-5235
5th row: 02-504-7780

Value | Count | Frequency (%)
031-868-7884 | 6 | 2.5%
054-223-6114 | 4 | 1.7%
02-746-8946 | 3 | 1.2%
031-668-9430 | 3 | 1.2%
031-868-7209 | 3 | 1.2%
031-321-1238 | 3 | 1.2%
031-333-9361 | 3 | 1.2%
1588-9707 | 2 | 0.8%
031-692-5788 | 2 | 0.8%
1577-7755 | 2 | 0.8%
Other values (194) | 209 | 87.1%

Most occurring characters

ValueCountFrequency (%)
- 476
16.4%
0 431
14.9%
3 352
12.1%
1 329
11.3%
7 242
8.3%
8 229
7.9%
6 218
7.5%
4 178
 
6.1%
2 173
 
6.0%
5 140
 
4.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2424
83.6%
Dash Punctuation 476
 
16.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 431
17.8%
3 352
14.5%
1 329
13.6%
7 242
10.0%
8 229
9.4%
6 218
9.0%
4 178
7.3%
2 173
7.1%
5 140
 
5.8%
9 132
 
5.4%
Dash Punctuation
ValueCountFrequency (%)
- 476
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2900
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 476
16.4%
0 431
14.9%
3 352
12.1%
1 329
11.3%
7 242
8.3%
8 229
7.9%
6 218
7.5%
4 178
 
6.1%
2 173
 
6.0%
5 140
 
4.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2900
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 476
16.4%
0 431
14.9%
3 352
12.1%
1 329
11.3%
7 242
8.3%
8 229
7.9%
6 218
7.5%
4 178
 
6.1%
2 173
 
6.0%
5 140
 
4.8%

세대수
Real number (ℝ)

MISSING 

Distinct: 259
Distinct (%): 83.8%
Missing: 21
Missing (%): 6.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 814.17476
Minimum: 56
Maximum: 4154
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.0 KiB

Quantile statistics

Minimum: 56
5th percentile: 126.2
Q1: 406
Median: 650
Q3: 983
95th percentile: 2331.4
Maximum: 4154
Range: 4098
Interquartile range (IQR): 577

Descriptive statistics

Standard deviation: 650.02974
Coefficient of variation (CV): 0.79839092
Kurtosis: 3.4734825
Mean: 814.17476
Median Absolute Deviation (MAD): 284
Skewness: 1.7223332
Sum: 251580
Variance: 422538.66
Monotonicity: Not monotonic
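
Several of the statistics above are derived from others, which gives a quick consistency check. The sketch below recomputes them from the reported values for 세대수, assuming the usual definitions (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared, mean = sum / non-missing count). The variable names are illustrative.

```python
# Values copied from the 세대수 statistics in this report.
minimum, maximum = 56, 4154
q1, q3 = 406, 983
mean, std = 814.17476, 650.02974
total, n_obs, n_missing = 251580, 330, 21

value_range = maximum - minimum              # Range
iqr = q3 - q1                                # Interquartile range
cv = std / mean                              # Coefficient of variation
variance = std ** 2                          # Variance
mean_from_sum = total / (n_obs - n_missing)  # Sum over non-missing rows

print(value_range, iqr, round(cv, 6), round(variance, 2), round(mean_from_sum, 5))
```

Each recomputed value agrees with the corresponding figure in the report to the precision shown.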
[Figure: histogram with fixed-size bins (bins=50)]
ValueCountFrequency (%)
2739 3
 
0.9%
406 3
 
0.9%
1021 3
 
0.9%
2417 3
 
0.9%
448 3
 
0.9%
230 2
 
0.6%
1199 2
 
0.6%
125 2
 
0.6%
921 2
 
0.6%
2329 2
 
0.6%
Other values (249) 284
86.1%
(Missing) 21
 
6.4%
ValueCountFrequency (%)
56 1
0.3%
60 2
0.6%
61 1
0.3%
72 1
0.3%
73 1
0.3%
88 1
0.3%
94 1
0.3%
99 1
0.3%
107 1
0.3%
116 1
0.3%
ValueCountFrequency (%)
4154 1
 
0.3%
2886 2
0.6%
2739 3
0.9%
2737 2
0.6%
2736 2
0.6%
2530 1
 
0.3%
2456 1
 
0.3%
2417 3
0.9%
2333 1
 
0.3%
2329 2
0.6%

입주예정일
Date

MISSING 

Distinct: 151
Distinct (%): 57.0%
Missing: 65
Missing (%): 19.7%
Memory size: 2.7 KiB
Minimum: 2018-01-01 00:00:00
Maximum: 2027-04-30 00:00:00
[Figure: histogram with fixed-size bins (bins=50)]

우편번호
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 111
Distinct (%): 70.7%
Missing: 173
Missing (%): 52.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 14476.79
Minimum: 10031
Maximum: 18151
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.0 KiB

Quantile statistics

Minimum: 10031
5th percentile: 10105
Q1: 12416
Median: 14081
Q3: 17338
95th percentile: 18117
Maximum: 18151
Range: 8120
Interquartile range (IQR): 4922

Descriptive statistics

Standard deviation: 2737.3794
Coefficient of variation (CV): 0.18908746
Kurtosis: -1.4235424
Mean: 14476.79
Median Absolute Deviation (MAD): 2954
Skewness: -0.095931898
Sum: 2272856
Variance: 7493246.2
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=50)]
ValueCountFrequency (%)
13544 7
 
2.1%
11800 5
 
1.5%
10865 4
 
1.2%
12790 4
 
1.2%
17161 3
 
0.9%
17035 3
 
0.9%
14081 3
 
0.9%
17842 3
 
0.9%
17845 3
 
0.9%
17564 2
 
0.6%
Other values (101) 120
36.4%
(Missing) 173
52.4%
ValueCountFrequency (%)
10031 1
0.3%
10056 1
0.3%
10057 1
0.3%
10061 1
0.3%
10068 2
0.6%
10099 1
0.3%
10105 2
0.6%
10121 2
0.6%
10124 1
0.3%
10130 1
0.3%
ValueCountFrequency (%)
18151 1
 
0.3%
18145 1
 
0.3%
18127 1
 
0.3%
18125 2
0.6%
18124 2
0.6%
18117 2
0.6%
17996 1
 
0.3%
17877 2
0.6%
17845 3
0.9%
17842 3
0.9%

시공위치지번주소
Text

MISSING

Distinct: 225
Distinct (%): 93.4%
Missing: 89
Missing (%): 27.0%
Memory size: 2.7 KiB

Length

Max length: 39
Median length: 28
Mean length: 21.037344
Min length: 15

Characters and Unicode

Total characters: 5070
Distinct characters: 183
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 214
Unique (%): 88.8%

Sample

1st row: 경기도 가평군 가평읍 대곡리 390-2
2nd row: 경기도 가평군 가평읍 대곡리 291-6
3rd row: 경기도 가평군 가평읍 읍내리 205
4th row: 경기도 가평군 설악면 신천리 산45-27번지 외3필지
5th row: 경기도 과천시 갈현동 641

Value | Count | Frequency (%)
경기도 | 241 | 20.3%
일원 | 44 | 3.7%
안양시 | 33 | 2.8%
평택시 | 31 | 2.6%
용인시 | 25 | 2.1%
동안구 | 22 | 1.9%
파주시 | 18 | 1.5%
김포시 | 16 | 1.3%
오산시 | 14 | 1.2%
처인구 | 14 | 1.2%
Other values (433) | 729 | 61.4%

Most occurring characters

ValueCountFrequency (%)
946
18.7%
250
 
4.9%
248
 
4.9%
242
 
4.8%
238
 
4.7%
237
 
4.7%
1 187
 
3.7%
2 149
 
2.9%
- 147
 
2.9%
121
 
2.4%
Other values (173) 2305
45.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 3071
60.6%
Space Separator 946
 
18.7%
Decimal Number 889
 
17.5%
Dash Punctuation 147
 
2.9%
Uppercase Letter 10
 
0.2%
Close Punctuation 3
 
0.1%
Open Punctuation 3
 
0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
250
 
8.1%
248
 
8.1%
242
 
7.9%
238
 
7.7%
237
 
7.7%
121
 
3.9%
111
 
3.6%
97
 
3.2%
80
 
2.6%
69
 
2.2%
Other values (155) 1378
44.9%
Decimal Number
ValueCountFrequency (%)
1 187
21.0%
2 149
16.8%
4 92
10.3%
3 87
9.8%
5 81
9.1%
8 76
8.5%
9 62
 
7.0%
6 61
 
6.9%
7 50
 
5.6%
0 44
 
4.9%
Uppercase Letter
ValueCountFrequency (%)
A 8
80.0%
C 1
 
10.0%
B 1
 
10.0%
Space Separator
ValueCountFrequency (%)
946
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 147
100.0%
Close Punctuation
ValueCountFrequency (%)
) 3
100.0%
Open Punctuation
ValueCountFrequency (%)
( 3
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 3071
60.6%
Common 1989
39.2%
Latin 10
 
0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
250
 
8.1%
248
 
8.1%
242
 
7.9%
238
 
7.7%
237
 
7.7%
121
 
3.9%
111
 
3.6%
97
 
3.2%
80
 
2.6%
69
 
2.2%
Other values (155) 1378
44.9%
Common
ValueCountFrequency (%)
946
47.6%
1 187
 
9.4%
2 149
 
7.5%
- 147
 
7.4%
4 92
 
4.6%
3 87
 
4.4%
5 81
 
4.1%
8 76
 
3.8%
9 62
 
3.1%
6 61
 
3.1%
Other values (5) 101
 
5.1%
Latin
ValueCountFrequency (%)
A 8
80.0%
C 1
 
10.0%
B 1
 
10.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 3071
60.6%
ASCII 1999
39.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
946
47.3%
1 187
 
9.4%
2 149
 
7.5%
- 147
 
7.4%
4 92
 
4.6%
3 87
 
4.4%
5 81
 
4.1%
8 76
 
3.8%
9 62
 
3.1%
6 61
 
3.1%
Other values (8) 111
 
5.6%
Hangul
ValueCountFrequency (%)
250
 
8.1%
248
 
8.1%
242
 
7.9%
238
 
7.7%
237
 
7.7%
121
 
3.9%
111
 
3.6%
97
 
3.2%
80
 
2.6%
69
 
2.2%
Other values (155) 1378
44.9%

시공위치도로명주소
Text

MISSING

Distinct: 39
Distinct (%): 88.6%
Missing: 286
Missing (%): 86.7%
Memory size: 2.7 KiB

Length

Max length: 25
Median length: 21.5
Mean length: 16.636364
Min length: 1

Characters and Unicode

Total characters: 732
Distinct characters: 108
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 35
Unique (%): 79.5%

Sample

1st row: 경기도 가평군 가평읍 문화로 167
2nd row: 경기도 광명시 목감로 130
3rd row: 경기도 광주시 태봉로 225
4th row: 경기도 구리시 인창1로 35
5th row: 경기도 김포시 양촌읍 양곡로449번길 15

Value | Count | Frequency (%)
경기도 | 39 | 21.4%
김포시 | 7 | 3.8%
안양시 | 6 | 3.3%
동안구 | 3 | 1.6%
파주시 | 3 | 1.6%
양촌읍 | 3 | 1.6%
미정 | 3 | 1.6%
만안구 | 3 | 1.6%
15 | 3 | 1.6%
용인시 | 2 | 1.1%
Other values (97) | 110 | 60.4%

Most occurring characters

ValueCountFrequency (%)
138
18.9%
42
 
5.7%
40
 
5.5%
40
 
5.5%
40
 
5.5%
36
 
4.9%
1 33
 
4.5%
2 21
 
2.9%
16
 
2.2%
13
 
1.8%
Other values (98) 313
42.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 457
62.4%
Space Separator 138
 
18.9%
Decimal Number 129
 
17.6%
Dash Punctuation 8
 
1.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
42
 
9.2%
40
 
8.8%
40
 
8.8%
40
 
8.8%
36
 
7.9%
16
 
3.5%
13
 
2.8%
13
 
2.8%
10
 
2.2%
9
 
2.0%
Other values (86) 198
43.3%
Decimal Number
ValueCountFrequency (%)
1 33
25.6%
2 21
16.3%
5 12
 
9.3%
8 11
 
8.5%
0 10
 
7.8%
6 10
 
7.8%
3 9
 
7.0%
7 9
 
7.0%
9 8
 
6.2%
4 6
 
4.7%
Space Separator
ValueCountFrequency (%)
138
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 8
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 457
62.4%
Common 275
37.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
42
 
9.2%
40
 
8.8%
40
 
8.8%
40
 
8.8%
36
 
7.9%
16
 
3.5%
13
 
2.8%
13
 
2.8%
10
 
2.2%
9
 
2.0%
Other values (86) 198
43.3%
Common
ValueCountFrequency (%)
138
50.2%
1 33
 
12.0%
2 21
 
7.6%
5 12
 
4.4%
8 11
 
4.0%
0 10
 
3.6%
6 10
 
3.6%
3 9
 
3.3%
7 9
 
3.3%
- 8
 
2.9%
Other values (2) 14
 
5.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 457
62.4%
ASCII 275
37.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
138
50.2%
1 33
 
12.0%
2 21
 
7.6%
5 12
 
4.4%
8 11
 
4.0%
0 10
 
3.6%
6 10
 
3.6%
3 9
 
3.3%
7 9
 
3.3%
- 8
 
2.9%
Other values (2) 14
 
5.1%
Hangul
ValueCountFrequency (%)
42
 
9.2%
40
 
8.8%
40
 
8.8%
40
 
8.8%
36
 
7.9%
16
 
3.5%
13
 
2.8%
13
 
2.8%
10
 
2.2%
9
 
2.0%
Other values (86) 198
43.3%

WGS84위도
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 200
Distinct (%): 92.2%
Missing: 113
Missing (%): 34.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.404499
Minimum: 36.955941
Maximum: 38.012491
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.0 KiB

Quantile statistics

Minimum: 36.955941
5th percentile: 37.021459
Q1: 37.267403
Median: 37.374475
Q3: 37.601486
95th percentile: 37.744081
Maximum: 38.012491
Range: 1.0565507
Interquartile range (IQR): 0.33408252

Descriptive statistics

Standard deviation: 0.22143525
Coefficient of variation (CV): 0.0059200164
Kurtosis: -0.4921961
Mean: 37.404499
Median Absolute Deviation (MAD): 0.12747653
Skewness: 0.23829742
Sum: 8116.7764
Variance: 0.04903357
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=50)]
ValueCountFrequency (%)
37.40110553 3
 
0.9%
37.34434545 3
 
0.9%
37.40741801 3
 
0.9%
37.38121932 3
 
0.9%
37.22904851 3
 
0.9%
37.39749483 2
 
0.6%
37.07307036 2
 
0.6%
37.39837833 2
 
0.6%
37.38699805 2
 
0.6%
37.3913317 2
 
0.6%
Other values (190) 192
58.2%
(Missing) 113
34.2%
ValueCountFrequency (%)
36.9559407 1
0.3%
36.990548 1
0.3%
36.992098 1
0.3%
36.99498281 1
0.3%
36.996356 1
0.3%
37.003179 1
0.3%
37.00526772 1
0.3%
37.005286 1
0.3%
37.013485 1
0.3%
37.018872 1
0.3%
ValueCountFrequency (%)
38.01249139 1
0.3%
37.868661694 1
0.3%
37.85253179 1
0.3%
37.8376733321 1
0.3%
37.83259331 1
0.3%
37.8272910619 1
0.3%
37.821066668 1
0.3%
37.8063598598 1
0.3%
37.7641130781 1
0.3%
37.7474235309 1
0.3%

WGS84경도
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 200
Distinct (%): 92.2%
Missing: 113
Missing (%): 34.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 127.0553
Minimum: 126.5961
Maximum: 127.6482
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.0 KiB

Quantile statistics

Minimum: 126.5961
5th percentile: 126.70312
Q1: 126.92278
Median: 127.05549
Q3: 127.19863
95th percentile: 127.49539
Maximum: 127.6482
Range: 1.0521
Interquartile range (IQR): 0.2758479

Descriptive statistics

Standard deviation: 0.23532467
Coefficient of variation (CV): 0.0018521436
Kurtosis: -0.23436986
Mean: 127.0553
Median Absolute Deviation (MAD): 0.1370314
Skewness: 0.3068173
Sum: 27571.001
Variance: 0.055377698
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=50)]
ValueCountFrequency (%)
126.9337067 3
 
0.9%
127.2550883 3
 
0.9%
126.9436968 3
 
0.9%
126.9492839 3
 
0.9%
127.2839785 3
 
0.9%
126.9387599 2
 
0.6%
127.4250436 2
 
0.6%
126.9275118 2
 
0.6%
126.9230941 2
 
0.6%
126.9227846 2
 
0.6%
Other values (190) 192
58.2%
(Missing) 113
34.2%
ValueCountFrequency (%)
126.5961 1
0.3%
126.6263531 1
0.3%
126.6305512 1
0.3%
126.6318084 1
0.3%
126.6342161 1
0.3%
126.6356032 1
0.3%
126.6409313 1
0.3%
126.6837523 1
0.3%
126.6993196 1
0.3%
126.7007953 1
0.3%
ValueCountFrequency (%)
127.6482 1
0.3%
127.6284 1
0.3%
127.6253 1
0.3%
127.6236127331 1
0.3%
127.5950812 1
0.3%
127.5423 1
0.3%
127.5121331002 1
0.3%
127.5085208775 1
0.3%
127.5076488162 1
0.3%
127.5066664832 1
0.3%

비고
Text

MISSING 

Distinct: 51
Distinct (%): 72.9%
Missing: 260
Missing (%): 78.8%
Memory size: 2.7 KiB

Length

Max length: 86
Median length: 22
Mean length: 20.285714
Min length: 3

Characters and Unicode

Total characters: 1420
Distinct characters: 106
Distinct categories: 8
Distinct scripts: 4
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 48
Unique (%): 68.6%

Sample

1st row: 2020-02-20
2nd row: 2020-02-20
3rd row: 2020-02-20
4th row: 2020-02-20
5th row: 2020-02-20

Value | Count | Frequency (%)
경기도 | 34 | 12.1%
양주시 | 32 | 11.4%
옥정지구 | 20 | 7.1%
2020-02-20 | 16 | 5.7%
대장동 | 9 | 3.2%
성남 | 6 | 2.1%
판교대장 | 6 | 2.1%
도시개발구역 | 5 | 1.8%
위도 | 4 | 1.4%
입력 | 4 | 1.4%
Other values (86) | 145 | 51.6%

Most occurring characters

ValueCountFrequency (%)
211
 
14.9%
2 86
 
6.1%
0 73
 
5.1%
- 61
 
4.3%
61
 
4.3%
51
 
3.6%
46
 
3.2%
45
 
3.2%
B 43
 
3.0%
41
 
2.9%
Other values (96) 702
49.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 782
55.1%
Decimal Number 224
 
15.8%
Space Separator 211
 
14.9%
Uppercase Letter 108
 
7.6%
Dash Punctuation 61
 
4.3%
Close Punctuation 13
 
0.9%
Open Punctuation 13
 
0.9%
Math Symbol 8
 
0.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
61
 
7.8%
51
 
6.5%
46
 
5.9%
45
 
5.8%
41
 
5.2%
40
 
5.1%
38
 
4.9%
36
 
4.6%
32
 
4.1%
21
 
2.7%
Other values (76) 371
47.4%
Decimal Number
ValueCountFrequency (%)
2 86
38.4%
0 73
32.6%
1 29
 
12.9%
4 8
 
3.6%
6 6
 
2.7%
3 6
 
2.7%
7 5
 
2.2%
5 5
 
2.2%
9 4
 
1.8%
8 2
 
0.9%
Uppercase Letter
ValueCountFrequency (%)
B 43
39.8%
L 34
31.5%
A 24
22.2%
D 4
 
3.7%
C 3
 
2.8%
Space Separator
ValueCountFrequency (%)
211
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 61
100.0%
Close Punctuation
ValueCountFrequency (%)
) 13
100.0%
Open Punctuation
ValueCountFrequency (%)
( 13
100.0%
Math Symbol
ValueCountFrequency (%)
+ 8
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 778
54.8%
Common 530
37.3%
Latin 108
 
7.6%
Han 4
 
0.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
61
 
7.8%
51
 
6.6%
46
 
5.9%
45
 
5.8%
41
 
5.3%
40
 
5.1%
38
 
4.9%
36
 
4.6%
32
 
4.1%
21
 
2.7%
Other values (75) 367
47.2%
Common
ValueCountFrequency (%)
211
39.8%
2 86
16.2%
0 73
 
13.8%
- 61
 
11.5%
1 29
 
5.5%
) 13
 
2.5%
( 13
 
2.5%
4 8
 
1.5%
+ 8
 
1.5%
6 6
 
1.1%
Other values (5) 22
 
4.2%
Latin
ValueCountFrequency (%)
B 43
39.8%
L 34
31.5%
A 24
22.2%
D 4
 
3.7%
C 3
 
2.8%
Han
ValueCountFrequency (%)
4
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 778
54.8%
ASCII 638
44.9%
CJK 4
 
0.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
211
33.1%
2 86
13.5%
0 73
 
11.4%
- 61
 
9.6%
B 43
 
6.7%
L 34
 
5.3%
1 29
 
4.5%
A 24
 
3.8%
) 13
 
2.0%
( 13
 
2.0%
Other values (10) 51
 
8.0%
Hangul
ValueCountFrequency (%)
61
 
7.8%
51
 
6.6%
46
 
5.9%
45
 
5.8%
41
 
5.3%
40
 
5.1%
38
 
4.9%
36
 
4.6%
32
 
4.1%
21
 
2.7%
Other values (75) 367
47.2%
CJK
ValueCountFrequency (%)
4
100.0%

Interactions

[Figures: 16 interaction plots, not reproduced in this text version]

Correlations

Variable | 시군명 | 세대수 | 우편번호 | 시공위치도로명주소 | WGS84위도 | WGS84경도 | 비고
시군명 | 1.000 | 0.295 | 0.989 | 1.000 | 0.972 | 0.939 | 1.000
세대수 | 0.295 | 1.000 | 0.179 | 0.971 | 0.240 | 0.415 | 0.000
우편번호 | 0.989 | 0.179 | 1.000 | 1.000 | 0.855 | 0.879 | 1.000
시공위치도로명주소 | 1.000 | 0.971 | 1.000 | 1.000 | 1.000 | 0.986 | 1.000
WGS84위도 | 0.972 | 0.240 | 0.855 | 1.000 | 1.000 | 0.808 | 0.974
WGS84경도 | 0.939 | 0.415 | 0.879 | 0.986 | 0.808 | 1.000 | 0.734
비고 | 1.000 | 0.000 | 1.000 | 1.000 | 0.974 | 0.734 | 1.000

Variable | 세대수 | 우편번호 | WGS84위도 | WGS84경도 | 시군명
세대수 | 1.000 | 0.138 | -0.022 | -0.141 | 0.113
우편번호 | 0.138 | 1.000 | -0.900 | 0.412 | 0.888
WGS84위도 | -0.022 | -0.900 | 1.000 | -0.435 | 0.807
WGS84경도 | -0.141 | 0.412 | -0.435 | 1.000 | 0.687
시군명 | 0.113 | 0.888 | 0.807 | 0.687 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
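
A minimal sketch of nullity correlation, assuming it is the Pearson correlation between the present/absent indicators of two columns. The helper name `nullity_correlation` and the toy columns are illustrative. A column with no missing values (or only missing values) has zero variance and would need to be excluded before calling it.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    # 1.0 where a value is present, 0.0 where it is missing (None).
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

# Toy columns: latitude and longitude are missing on exactly the same rows.
lat = [37.82, 37.83, None, None, 37.68, None]
lon = [127.51, 127.51, None, None, 127.49, None]
phone = [None, "031-581-9431", "031-582-8410", None, None, "02-504-7780"]

print(nullity_correlation(lat, lon))    # identical missingness pattern
print(nullity_correlation(lat, phone))  # mostly opposite pattern
```

In this report WGS84위도 and WGS84경도 each have 113 missing values, which is the kind of pairing that produces a nullity correlation near 1.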

Sample

First rows

index | 시군명 | 공사명 | 공사시작일 | 공사완료일 | 시공사명 | 시공사전화번호 | 세대수 | 입주예정일 | 우편번호 | 시공위치지번주소 | 시공위치도로명주소 | WGS84위도 | WGS84경도 | 비고
0 | 가평군 | 가평 대곡2지구 공동주택 신축공사 | 2021-01-15 | 2023-08-15 | 지에스건설㈜ | 02-2154-7631 | 505 | 2023-08-31 | 12416 | 경기도 가평군 가평읍 대곡리 390-2 | <NA> | 37.821067 | 127.508521 | <NA>
1 | 가평군 | 가평 대곡지구 공동주택 신축공사 | 2021-01-25 | 2023-07-24 | 대림산업주식회사 | 031-581-9431 | 472 | 2023-08-31 | 12417 | 경기도 가평군 가평읍 대곡리 291-6 | 경기도 가평군 가평읍 문화로 167 | 37.827291 | 127.506666 | <NA>
2 | 가평군 | 가평 읍내지구 공동주택 신축공사 | 2021-09-15 | 2023-10-30 | 현대건설㈜ | 031-582-8410 | 451 | 2023-11-30 | 12412 | 경기도 가평군 가평읍 읍내리 205 | <NA> | 37.837673 | 127.512133 | <NA>
3 | 가평군 | 디엘본가평설악지역 주택조합 아파트 신축공사 | 2022-01-25 | 2024-11-01 | 선원건설 | 031-584-5235 | 420 | 2024-11-30 | 12467 | 경기도 가평군 설악면 신천리 산45-27번지 외3필지 | <NA> | 37.676533 | 127.490173 | <NA>
4 | 고양시 | 고양 덕은 A2블록 공동주택 신축공사 | 2019-10-07 | 2022-10-06 | ㈜중흥토건 | <NA> | 894 | 2022-10-06 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
5 | 고양시 | 고양 덕은 A4블록 공동주택 신축공사 | 2020-04-14 | 2022-07-30 | ㈜GS건설 | <NA> | 702 | 2022-07-30 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
6 | 고양시 | 고양 덕은 A5블록 공동주택 신축공사 | 2019-07-03 | 2022-07-02 | ㈜대방건설 | <NA> | 622 | 2022-07-02 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
7 | 고양시 | 고양 덕은 A6블록 공동주택 신축공사 | 2020-05-09 | 2022-07-31 | ㈜GS건설 | <NA> | 620 | 2022-07-31 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
8 | 고양시 | 고양 덕은 A7블록 공동주택 신축공사 | 2020-04-14 | 2022-08-31 | ㈜GS건설 | <NA> | 318 | 2022-08-31 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
9 | 고양시 | 고양 식사2구역 A2 공동주택 신축공사 | 2018-12-11 | 2022-02-28 | ㈜GS건설 | <NA> | 1333 | 2022-02-28 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

Last rows

index | 시군명 | 공사명 | 공사시작일 | 공사완료일 | 시공사명 | 시공사전화번호 | 세대수 | 입주예정일 | 우편번호 | 시공위치지번주소 | 시공위치도로명주소 | WGS84위도 | WGS84경도 | 비고
320 | 화성시 | 제일풍경채 퍼스티어 | 2021-10-26 | 2024-04-30 | 제일건설(주) | 070-7704-1388 | 308 | 2024-04-30 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
321 | 화성시 | 조암 써밋 아파트 | 2022-09-20 | 2024-11-30 | ㈜군장종합건설 | 031-351-9621 | 224 | 2024-11-30 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
322 | 화성시 | 중흥S클래스아파트 | 2020-12-14 | 2023-12-13 | 중흥건설㈜ | 031-227-5835 | 808 | 2023-12-13 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
323 | 화성시 | 중흥S클래스아파트(C-1) | 2020-12-07 | 2023-12-13 | 중흥토건㈜ | 031-298-2123 | 707 | 2023-12-13 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
324 | 화성시 | 청도 솔리움 더 테라스 | 2021-09-03 | 2023-08-31 | 청도건설주식회사 | 031-8003-7700 | 73 | 2023-08-31 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
325 | 화성시 | 향남 상신지구A1-1블럭 공동주택 신축공사 | 2021-05-06 | 2024-01-31 | ㈜한양 | 031-8059-6940 | 945 | 2024-01-31 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
326 | 화성시 | 화성 향남2지구 16블록 공동주택 신축공사 | 2020-12-02 | 2024-04-30 | ㈜동광주택 | 02-3774-5500 | 772 | 2023-04-30 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
327 | 화성시 | 화성비봉 우미린 | 2022-03-14 | 2024-11-30 | 우미건설㈜ | 1588-9707 | 198 | 2024-11-30 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
328 | 화성시 | 화성신남지역주택조합 공동주택 신축공사 | 2020-09-15 | 2023-08-31 | ㈜서희건설 | 070-5038-0260 | 1846 | 2023-08-31 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
329 | 화성시 | 힐스테이트 동탄 더 테라스 | 2021-12-30 | 2023-11-30 | 현대엔지니어링㈜ | 070-5142-6301 | 125 | 2023-11-30 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>