Overview

Dataset statistics

Number of variables: 3
Number of observations: 1414
Missing cells: 9
Missing cells (%): 0.2%
Duplicate rows: 16
Duplicate rows (%): 1.1%
Total size in memory: 33.3 KiB
Average record size in memory: 24.1 B
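The derived figures above follow from the raw counts by simple arithmetic. The sketch below recomputes them from the reported values only; the variable names are illustrative, and it assumes the missing-cell percentage is taken over all cells (variables times observations).

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 3
n_observations = 1414
missing_cells = 9
duplicate_rows = 16
total_size_kib = 33.3

# Assumption: percentages use all cells / all rows as the denominator.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Rounded to one decimal place, these match the 0.2%, 1.1%, and 24.1 B shown in the report.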

Variable types

Text: 2
Categorical: 1

Dataset

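Description: Data on the status of licensed real estate agents in Siheung-si, Gyeonggi-do. It consists of the business name (상호명), location or address (소재지), and related fields for licensed real estate agents within Siheung-si.
URL: https://www.data.go.kr/data/3071673/fileData.do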

Alerts

Dataset has 16 (1.1%) duplicate rows (Duplicates)
데이터기준일자 is highly imbalanced (97.2%) (Imbalance)
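The 97.2% imbalance figure is consistent with a normalized-entropy score, 1 - H / log2(k), applied to the two value counts this report lists for 데이터기준일자 (1410 and 4). The sketch below is one way to reproduce the number. Whether ydata-profiling computes it in exactly this way is an assumption, and `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H(counts) / log2(k), where H is Shannon entropy in bits.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    k = len(counts)
    if k <= 1:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Counts taken from the Common Values table for 데이터기준일자.
score = imbalance_score([1410, 4])
print(f"{score:.1%}")
```

A perfectly balanced two-class column would score 0 under this definition, so a value this close to 1 indicates that nearly every row shares the same date.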

Reproduction

Analysis started: 2023-12-12 06:05:23.351306
Analysis finished: 2023-12-12 06:05:24.047406
Duration: 0.7 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

상호명
Text

Distinct: 1134
Distinct (%): 80.4%
Missing: 4
Missing (%): 0.3%
Memory size: 11.2 KiB

Length

Max length: 33
Median length: 25
Mean length: 12.592908
Min length: 5

Characters and Unicode

Total characters: 17756
Distinct characters: 409
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
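As an illustration of how such per-character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It covers categories only (the standard library does not expose script or block properties), and the name mapping lists just the nine categories that appear in this report. `category_counts` is a hypothetical helper, not part of the profiling tool.

```python
import unicodedata
from collections import Counter

# Two-letter general-category codes mapped to the names used in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
}

def category_counts(text):
    """Count characters in `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

# First sample value of the 상호명 column.
counts = category_counts("탑 공인중개사사무소")
for code, n in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), n)
```

Hangul syllables fall under "Other Letter", which is why that category dominates both text columns.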

Unique

Unique: 965
Unique (%): 68.4%
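"Distinct" and "Unique" measure different things: distinct counts how many different values occur, while unique counts values that occur exactly once. The sketch below shows the difference on a hypothetical toy column, then checks that the reported percentages are consistent with using the non-missing row count as the denominator (an inference from the numbers, not a documented rule).

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique) counts, ignoring missing (None) entries."""
    counts = Counter(v for v in values if v is not None)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique

# Hypothetical toy column, not taken from the dataset.
toy_column = ["A", "B", "B", "C", None]
distinct, unique = distinct_and_unique(toy_column)
print(distinct, unique)

# Reported figures: 1414 observations, 4 missing, 1134 distinct, 965 unique.
non_missing = 1414 - 4
distinct_pct = 100 * 1134 / non_missing
unique_pct = 100 * 965 / non_missing
print(f"{distinct_pct:.1f}% {unique_pct:.1f}%")
```

In the toy column, "B" is distinct but not unique because it appears twice.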

Sample

1st row: 탑 공인중개사사무소
2nd row: 이룸부동산 공인중개사사무소
3rd row: 성실 공인중개사사무소
4th row: 대동랜드 공인중개사사무소
5th row: 시화부동산경매컨설팅 공인중개사
Value | Count | Frequency (%)
공인중개사사무소 | 1251 | 43.4%
공인중개사 | 36 | 1.2%
사무소 | 33 | 1.1%
부동산 | 15 | 0.5%
삼성 | 11 | 0.4%
대우 | 10 | 0.3%
장현지구 | 9 | 0.3%
행운 | 9 | 0.3%
미래 | 9 | 0.3%
우리 | 8 | 0.3%
Other values (1111) | 1494 | 51.8%

Most occurring characters

Value | Count | Frequency (%)
 | 2750 | 15.5%
(space) | 1477 | 8.3%
 | 1435 | 8.1%
 | 1415 | 8.0%
 | 1411 | 7.9%
 | 1386 | 7.8%
 | 1385 | 7.8%
 | 1378 | 7.8%
 | 315 | 1.8%
 | 299 | 1.7%
Other values (399) | 4505 | 25.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 16055 | 90.4%
Space Separator | 1477 | 8.3%
Uppercase Letter | 81 | 0.5%
Decimal Number | 78 | 0.4%
Open Punctuation | 27 | 0.2%
Close Punctuation | 27 | 0.2%
Lowercase Letter | 5 | < 0.1%
Dash Punctuation | 4 | < 0.1%
Other Punctuation | 2 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 2750 | 17.1%
 | 1435 | 8.9%
 | 1415 | 8.8%
 | 1411 | 8.8%
 | 1386 | 8.6%
 | 1385 | 8.6%
 | 1378 | 8.6%
 | 315 | 2.0%
 | 299 | 1.9%
 | 281 | 1.8%
Other values (363) | 4000 | 24.9%

Uppercase Letter
Value | Count | Frequency (%)
K | 15 | 18.5%
M | 11 | 13.6%
S | 10 | 12.3%
T | 9 | 11.1%
O | 8 | 9.9%
V | 7 | 8.6%
B | 4 | 4.9%
L | 4 | 4.9%
A | 3 | 3.7%
P | 2 | 2.5%
Other values (7) | 8 | 9.9%

Decimal Number
Value | Count | Frequency (%)
1 | 35 | 44.9%
4 | 11 | 14.1%
9 | 8 | 10.3%
5 | 6 | 7.7%
6 | 5 | 6.4%
3 | 4 | 5.1%
2 | 4 | 5.1%
0 | 3 | 3.8%
7 | 1 | 1.3%
8 | 1 | 1.3%

Lowercase Letter
Value | Count | Frequency (%)
e | 3 | 60.0%
h | 1 | 20.0%
o | 1 | 20.0%

Other Punctuation
Value | Count | Frequency (%)
! | 1 | 50.0%
& | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1477 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 27 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 27 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 4 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 16055 | 90.4%
Common | 1615 | 9.1%
Latin | 86 | 0.5%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 2750 | 17.1%
 | 1435 | 8.9%
 | 1415 | 8.8%
 | 1411 | 8.8%
 | 1386 | 8.6%
 | 1385 | 8.6%
 | 1378 | 8.6%
 | 315 | 2.0%
 | 299 | 1.9%
 | 281 | 1.8%
Other values (363) | 4000 | 24.9%

Latin
Value | Count | Frequency (%)
K | 15 | 17.4%
M | 11 | 12.8%
S | 10 | 11.6%
T | 9 | 10.5%
O | 8 | 9.3%
V | 7 | 8.1%
B | 4 | 4.7%
L | 4 | 4.7%
e | 3 | 3.5%
A | 3 | 3.5%
Other values (10) | 12 | 14.0%

Common
Value | Count | Frequency (%)
(space) | 1477 | 91.5%
1 | 35 | 2.2%
( | 27 | 1.7%
) | 27 | 1.7%
4 | 11 | 0.7%
9 | 8 | 0.5%
5 | 6 | 0.4%
6 | 5 | 0.3%
- | 4 | 0.2%
3 | 4 | 0.2%
Other values (6) | 11 | 0.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 16055 | 90.4%
ASCII | 1701 | 9.6%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 2750 | 17.1%
 | 1435 | 8.9%
 | 1415 | 8.8%
 | 1411 | 8.8%
 | 1386 | 8.6%
 | 1385 | 8.6%
 | 1378 | 8.6%
 | 315 | 2.0%
 | 299 | 1.9%
 | 281 | 1.8%
Other values (363) | 4000 | 24.9%

ASCII
Value | Count | Frequency (%)
(space) | 1477 | 86.8%
1 | 35 | 2.1%
( | 27 | 1.6%
) | 27 | 1.6%
K | 15 | 0.9%
M | 11 | 0.6%
4 | 11 | 0.6%
S | 10 | 0.6%
T | 9 | 0.5%
O | 8 | 0.5%
Other values (26) | 71 | 4.2%

소재지
Text

Distinct: 1370
Distinct (%): 97.2%
Missing: 5
Missing (%): 0.4%
Memory size: 11.2 KiB

Length

Max length: 62
Median length: 46
Mean length: 32.437899
Min length: 15

Characters and Unicode

Total characters: 45705
Distinct characters: 355
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1334
Unique (%): 94.7%

Sample

1st row: 경기도 시흥시 진말로 36-1 103호(장곡동,우성@상가)
2nd row: 경기도 시흥시 하중로 193 102호(하중동)
3rd row: 경기도 시흥시 정왕시장길 38 (정왕동1동)
4th row: 경기도 시흥시 함송로29번길 37
5th row: 경기도 시흥시 옥구공원로 189 120호(정왕2동)
Value | Count | Frequency (%)
경기도 | 1407 | 17.0%
시흥시 | 1404 | 17.0%
1층 | 108 | 1.3%
상가동 | 102 | 1.2%
공단1대로 | 74 | 0.9%
101호 | 45 | 0.5%
244 | 37 | 0.4%
중심상가로 | 36 | 0.4%
서울대학로278번길 | 34 | 0.4%
상가 | 33 | 0.4%
Other values (1961) | 4997 | 60.4%

Most occurring characters

Value | Count | Frequency (%)
(space) | 6902 | 15.1%
 | 3002 | 6.6%
1 | 2967 | 6.5%
 | 1669 | 3.7%
 | 1495 | 3.3%
 | 1469 | 3.2%
 | 1460 | 3.2%
 | 1425 | 3.1%
2 | 1289 | 2.8%
 | 1246 | 2.7%
Other values (345) | 22781 | 49.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 26049 | 57.0%
Decimal Number | 9153 | 20.0%
Space Separator | 6902 | 15.1%
Close Punctuation | 1237 | 2.7%
Open Punctuation | 1236 | 2.7%
Other Punctuation | 628 | 1.4%
Dash Punctuation | 326 | 0.7%
Uppercase Letter | 168 | 0.4%
Lowercase Letter | 6 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 3002 | 11.5%
 | 1669 | 6.4%
 | 1495 | 5.7%
 | 1469 | 5.6%
 | 1460 | 5.6%
 | 1425 | 5.5%
 | 1246 | 4.8%
 | 1200 | 4.6%
 | 644 | 2.5%
 | 590 | 2.3%
Other values (309) | 11849 | 45.5%

Uppercase Letter
Value | Count | Frequency (%)
A | 43 | 25.6%
B | 38 | 22.6%
T | 20 | 11.9%
S | 15 | 8.9%
C | 13 | 7.7%
I | 9 | 5.4%
N | 9 | 5.4%
M | 4 | 2.4%
Y | 4 | 2.4%
V | 3 | 1.8%
Other values (5) | 10 | 6.0%

Decimal Number
Value | Count | Frequency (%)
1 | 2967 | 32.4%
2 | 1289 | 14.1%
0 | 1216 | 13.3%
3 | 874 | 9.5%
4 | 810 | 8.8%
6 | 452 | 4.9%
7 | 428 | 4.7%
5 | 423 | 4.6%
9 | 356 | 3.9%
8 | 338 | 3.7%

Lowercase Letter
Value | Count | Frequency (%)
e | 2 | 33.3%
i | 1 | 16.7%
s | 1 | 16.7%
h | 1 | 16.7%
x | 1 | 16.7%

Other Punctuation
Value | Count | Frequency (%)
, | 574 | 91.4%
@ | 54 | 8.6%

Space Separator
Value | Count | Frequency (%)
(space) | 6902 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1237 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1236 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 326 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 26049 | 57.0%
Common | 19482 | 42.6%
Latin | 174 | 0.4%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 3002 | 11.5%
 | 1669 | 6.4%
 | 1495 | 5.7%
 | 1469 | 5.6%
 | 1460 | 5.6%
 | 1425 | 5.5%
 | 1246 | 4.8%
 | 1200 | 4.6%
 | 644 | 2.5%
 | 590 | 2.3%
Other values (309) | 11849 | 45.5%

Latin
Value | Count | Frequency (%)
A | 43 | 24.7%
B | 38 | 21.8%
T | 20 | 11.5%
S | 15 | 8.6%
C | 13 | 7.5%
I | 9 | 5.2%
N | 9 | 5.2%
M | 4 | 2.3%
Y | 4 | 2.3%
V | 3 | 1.7%
Other values (10) | 16 | 9.2%

Common
Value | Count | Frequency (%)
(space) | 6902 | 35.4%
1 | 2967 | 15.2%
2 | 1289 | 6.6%
) | 1237 | 6.3%
( | 1236 | 6.3%
0 | 1216 | 6.2%
3 | 874 | 4.5%
4 | 810 | 4.2%
, | 574 | 2.9%
6 | 452 | 2.3%
Other values (6) | 1925 | 9.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 26049 | 57.0%
ASCII | 19656 | 43.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 6902 | 35.1%
1 | 2967 | 15.1%
2 | 1289 | 6.6%
) | 1237 | 6.3%
( | 1236 | 6.3%
0 | 1216 | 6.2%
3 | 874 | 4.4%
4 | 810 | 4.1%
, | 574 | 2.9%
6 | 452 | 2.3%
Other values (26) | 2099 | 10.7%

Hangul
Value | Count | Frequency (%)
 | 3002 | 11.5%
 | 1669 | 6.4%
 | 1495 | 5.7%
 | 1469 | 5.6%
 | 1460 | 5.6%
 | 1425 | 5.5%
 | 1246 | 4.8%
 | 1200 | 4.6%
 | 644 | 2.5%
 | 590 | 2.3%
Other values (309) | 11849 | 45.5%

데이터기준일자
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.2 KiB
2023-08-02: 1410
<NA>: 4

Length

Max length: 10
Median length: 10
Mean length: 9.9830269
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-08-02
2nd row: 2023-08-02
3rd row: 2023-08-02
4th row: 2023-08-02
5th row: 2023-08-02

Common Values

Value | Count | Frequency (%)
2023-08-02 | 1410 | 99.7%
<NA> | 4 | 0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2023-08-02 | 1410 | 99.7%
na | 4 | 0.3%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
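
One common way to quantify nullity correlation is the Pearson correlation between per-column missing-value indicators (1 where a cell is missing, 0 otherwise). The sketch below implements that idea in plain Python on hypothetical toy columns. It is an illustration of the concept, not the profiling tool's own code, and `nullity_correlation` is an assumed helper name.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns.

    Returns None when either column has no variation in missingness
    (all present or all missing), because the correlation is undefined there.
    """
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / math.sqrt(var_a * var_b)

# Hypothetical toy columns: names and addresses are missing in the same rows.
names = ["a", None, "b", None, "c", "d"]
addresses = ["x", None, "y", None, "z", "w"]
dates = ["2023-08-02"] * 6  # never missing

both = nullity_correlation(names, addresses)
with_dates = nullity_correlation(names, dates)
print(both, with_dates)
```

A value near 1 means two columns tend to be missing together; a column with no missing values has no defined nullity correlation, which is why the helper returns None in that case.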

Sample

First rows

Index | 상호명 | 소재지 | 데이터기준일자
0 | 탑 공인중개사사무소 | 경기도 시흥시 진말로 36-1 103호(장곡동,우성@상가) | 2023-08-02
1 | 이룸부동산 공인중개사사무소 | 경기도 시흥시 하중로 193 102호(하중동) | 2023-08-02
2 | 성실 공인중개사사무소 | 경기도 시흥시 정왕시장길 38 (정왕동1동) | 2023-08-02
3 | 대동랜드 공인중개사사무소 | 경기도 시흥시 함송로29번길 37 | 2023-08-02
4 | 시화부동산경매컨설팅 공인중개사 | 경기도 시흥시 옥구공원로 189 120호(정왕2동) | 2023-08-02
5 | 비전중개인사무소 이봉희 | 경기도 시흥시 호현로 42-19 1층(대야동) | 2023-08-02
6 | 롯데 부동산중개사무소 | 경기도 시흥시 호현로34번안길 8 (대야동) | 2023-08-02
7 | 경인부동산중개사무소 | 경기도 시흥시 호현로 102 1층(대야동) | 2023-08-02
8 | 그랑트리홈즈 공인중개사사무소 | 경기도 시흥시 은계호수로 49 1층 1131호(은행동, 시흥센트럴돔 그랑트리) | 2023-08-02
9 | 까치 공인중개사사무소 | 경기도 시흥시 군자로 521 지하 1층 B01호(거모동, ST타워) | 2023-08-02

Last rows

Index | 상호명 | 소재지 | 데이터기준일자
1404 | 한라정문 공인중개사사무소 | 경기도 시흥시 서울대학로 172-20 상가2동 101호(배곧2동) | 2023-08-02
1405 | 서준 공인중개사사무소 | 경기도 시흥시 신천1길 9 103호(신천동) | 2023-08-02
1406 | 금솔부동산 공인중개사사무소 | 경기도 시흥시 목감남서로 92-15 목감중흥레이크힐스 제상가2동 110호(목감동) | 2023-08-02
1407 | 호반써밋 공인중개사사무소 | 경기도 시흥시 서울대학로264번길 12 호반써밋플레이스 207동 103호 | 2023-08-02
1408 | 더클래스 공인중개사사무소 | 경기도 시흥시 서촌상가1길 13 315호(정왕동, 신안프라자) | 2023-08-02
1409 | 풍림아이원 공인중개사사무소 | 경기도 시흥시 월곶중앙로14번길 27 101호(월곶동,풍림아이원4차@상가) | 2023-08-02
1410 | <NA> | <NA> | <NA>
1411 | <NA> | <NA> | <NA>
1412 | <NA> | <NA> | <NA>
1413 | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

Index | 상호명 | 소재지 | 데이터기준일자 | # duplicates
15 | <NA> | <NA> | <NA> | 4
9 | 이화 공인중개사사무소 | 경기도 시흥시 봉우재로 199 103호(정왕본동) | 2023-08-02 | 3
0 | 거모 공인중개사사무소 | 경기도 시흥시 군자로 475 102호(거모동) | 2023-08-02 | 2
1 | 공장뱅크 공인중개사사무소 | 경기도 시흥시 공단1대로 204 공구상가 32동 210호 | 2023-08-02 | 2
2 | 동원베네스트 공인중개사사무소 | 경기도 시흥시 소망공원로 232 103호(정왕1동,시화동원베네스트) | 2023-08-02 | 2
3 | 럭키 부동산공인중개사사무소 | 경기도 시흥시 연성로29번길 32 101호(하중동, 참이슬아파트상가동) | 2023-08-02 | 2
4 | 매산플러스 공인중개사사무소 | 경기도 시흥시 매화산단2길 31 , 103호 (도창동, TNS프라자 상가) | 2023-08-02 | 2
5 | 부동산채널 공인중개사사무소 | 경기도 시흥시 은계남로 12 상가110호(은행동, 시흥은계 호반써밋플레이스 판매시설) | 2023-08-02 | 2
6 | 시흥시청역 트리플포레 공인중개사사무소 | 경기도 시흥시 장현천로 61 1층 4호(능곡동) | 2023-08-02 | 2
7 | 원룸 공인중개사사무소 | 경기도 시흥시 오이도1길 2 (정왕3동) | 2023-08-02 | 2
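
A duplicate listing like the one above can be produced by grouping identical rows and keeping the groups that occur more than once. The sketch below does this with `collections.Counter` on a small excerpt. The row index in the listing runs from 0 to 15, which suggests the reported 16 duplicate rows counts distinct duplicated groups rather than surplus copies; that reading is an inference. `duplicate_groups` is a hypothetical helper name.

```python
from collections import Counter

def duplicate_groups(rows):
    """Map each row that occurs more than once to its occurrence count."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

# Rows taken from this report; repetition counts follow the "# duplicates" column.
ihwa = ("이화 공인중개사사무소", "경기도 시흥시 봉우재로 199 103호(정왕본동)", "2023-08-02")
geomo = ("거모 공인중개사사무소", "경기도 시흥시 군자로 475 102호(거모동)", "2023-08-02")
top = ("탑 공인중개사사무소", "경기도 시흥시 진말로 36-1 103호(장곡동,우성@상가)", "2023-08-02")

rows = [ihwa, geomo, top, ihwa, geomo, ihwa]
groups = duplicate_groups(rows)

# Most frequently occurring groups first.
for row, n in sorted(groups.items(), key=lambda item: -item[1]):
    print(n, row[0])
```

Rows must be hashable (tuples here) for this approach to work; the row that appears only once is excluded from the result.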