Overview

Dataset statistics

Number of variables: 5
Number of observations: 229
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 9.3 KiB
Average record size in memory: 41.6 B

Variable types

Numeric: 1
Text: 3
Categorical: 1

Dataset

Description: Data on the status of real estate brokerage offices licensed in Seosan-si (서산시), providing information such as business name, road-name address, broker name, and telephone number.
Author: Chungcheongnam-do (충청남도)
URL: https://alldam.chungnam.go.kr/index.chungnam?menuCd=DOM_000000201001001001&st=&cds=&orgCd=&apiType=&isOpen=Y&pageIndex=451&beforeMenuCd=DOM_000000201001001000&publicdatapk=15000665

Alerts

운영상태 (operating status) is highly imbalanced (87.3%) [Imbalance]
등록번호 (registration number) has unique values [Unique]
대표자명 (representative name) has unique values [Unique]
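
The 87.3% figure can be reproduced from the 운영상태 value counts given in this report (영업중: 225, 휴업: 4). The sketch below assumes the imbalance score is one minus the Shannon entropy of the class distribution, normalized by its maximum (log2 of the number of classes). The function name `imbalance_score` is illustrative and is not claimed to be part of ydata-profiling's public API.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max for a list of class counts.

    0 means perfectly balanced; values near 1 mean one class dominates.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2 or total == 0:
        return 0.0  # assumption: treat degenerate input as balanced
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(n_classes)

# Value counts of 운영상태 from this report: 영업중 = 225, 휴업 = 4
score = imbalance_score([225, 4])
print(f"{score:.1%}")
```

Under this assumption the 225-to-4 split scores about 0.873, which agrees with the alert.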

Reproduction

Analysis started: 2024-01-09 21:41:11.146102
Analysis finished: 2024-01-09 21:41:11.587760
Duration: 0.44 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

등록번호 (registration number)
Real number (ℝ)

UNIQUE

Distinct: 229
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 583.08297
Minimum: 1
Maximum: 904
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 23.4
Q1: 349
Median: 718
Q3: 828
95-th percentile: 892.6
Maximum: 904
Range: 903
Interquartile range (IQR): 479

Descriptive statistics

Standard deviation: 309.4877
Coefficient of variation (CV): 0.53077815
Kurtosis: -0.86239334
Mean: 583.08297
Median Absolute Deviation (MAD): 148
Skewness: -0.83019527
Sum: 133526
Variance: 95782.638
Monotonicity: Strictly increasing
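
Several of these figures are derived from others, which allows a quick consistency check on the report. The sketch below recomputes the range, IQR, mean, coefficient of variation, and variance from the reported values. The variable names are illustrative, and small differences are expected because the inputs are rounded.

```python
# Values copied from the 등록번호 statistics in this report
minimum, maximum = 1, 904
q1, q3 = 349, 828
total, n = 133526, 229      # Sum and number of observations
std = 309.4877              # Standard deviation (rounded in the report)

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
mean = total / n                  # Mean = Sum / count
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance = standard deviation squared

print(value_range, iqr)
print(round(mean, 5), round(cv, 4), round(variance, 1))
```

The recomputed values match the reported Range (903), IQR (479), Mean, CV, and Variance to within rounding.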
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
1 | 1 | 0.4%
793 | 1 | 0.4%
795 | 1 | 0.4%
797 | 1 | 0.4%
799 | 1 | 0.4%
800 | 1 | 0.4%
803 | 1 | 0.4%
804 | 1 | 0.4%
807 | 1 | 0.4%
808 | 1 | 0.4%
Other values (219) | 219 | 95.6%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 0.4%
4 | 1 | 0.4%
5 | 1 | 0.4%
7 | 1 | 0.4%
8 | 1 | 0.4%
9 | 1 | 0.4%
12 | 1 | 0.4%
15 | 1 | 0.4%
19 | 1 | 0.4%
21 | 1 | 0.4%

Maximum 10 values
Value | Count | Frequency (%)
904 | 1 | 0.4%
903 | 1 | 0.4%
902 | 1 | 0.4%
901 | 1 | 0.4%
900 | 1 | 0.4%
899 | 1 | 0.4%
898 | 1 | 0.4%
897 | 1 | 0.4%
896 | 1 | 0.4%
895 | 1 | 0.4%

명칭 (name)
Text

Distinct: 228
Distinct (%): 99.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Length

Max length: 14
Median length: 13
Mean length: 8.3886463
Min length: 6

Characters and Unicode

Total characters: 1921
Distinct characters: 210
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 227
Unique (%): 99.1%
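
Distinct counts the different values in the column, while Unique counts the values that occur exactly once. With 229 rows and a single name used twice, that gives 228 distinct and 227 unique values. A minimal sketch, using placeholder names as a hypothetical stand-in for the real column:

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values occurring exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Hypothetical stand-in for the 명칭 column: 227 one-off names plus one name used twice
names = [f"office_{i}" for i in range(227)] + ["duplicate_office"] * 2

distinct, unique = distinct_and_unique(names)
print(len(names), distinct, unique)
```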

Sample

1st row: 양지부동산중개인사무소
2nd row: 영창부동산중개인사무소
3rd row: 황금부동산중개인사무소
4th row: 성지부동산중개인사무소
5th row: 극동부동산중개인사무소

Value | Count | Frequency (%)
대명공인중개사 | 2 | 0.9%
양지부동산중개인사무소 | 1 | 0.4%
장수공인중개사 | 1 | 0.4%
롯데공인중개사 | 1 | 0.4%
청구공인중개사 | 1 | 0.4%
국민공인중개사 | 1 | 0.4%
지오공인중개사 | 1 | 0.4%
호수공인중개사 | 1 | 0.4%
성심공인중개사 | 1 | 0.4%
음암공인중개사 | 1 | 0.4%
Other values (219) | 219 | 95.2%

Most occurring characters

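Value | Count | Frequency (%)
(lost) | 235 | 12.2%
(lost) | 232 | 12.1%
(lost) | 229 | 11.9%
(lost) | 228 | 11.9%
(lost) | 198 | 10.3%
(lost) | 60 | 3.1%
(lost) | 54 | 2.8%
(lost) | 53 | 2.8%
(lost) | 39 | 2.0%
(lost) | 38 | 2.0%
Other values (200) | 555 | 28.9%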

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1903 | 99.1%
Decimal Number | 8 | 0.4%
Uppercase Letter | 6 | 0.3%
Space Separator | 2 | 0.1%
Lowercase Letter | 2 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(lost) | 235 | 12.3%
(lost) | 232 | 12.2%
(lost) | 229 | 12.0%
(lost) | 228 | 12.0%
(lost) | 198 | 10.4%
(lost) | 60 | 3.2%
(lost) | 54 | 2.8%
(lost) | 53 | 2.8%
(lost) | 39 | 2.0%
(lost) | 38 | 2.0%
Other values (188) | 537 | 28.2%

Uppercase Letter
Value | Count | Frequency (%)
T | 1 | 16.7%
O | 1 | 16.7%
K | 1 | 16.7%
A | 1 | 16.7%
B | 1 | 16.7%
L | 1 | 16.7%

Decimal Number
Value | Count | Frequency (%)
1 | 5 | 62.5%
4 | 2 | 25.0%
2 | 1 | 12.5%

Lowercase Letter
Value | Count | Frequency (%)
h | 1 | 50.0%
e | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1903 | 99.1%
Common | 10 | 0.5%
Latin | 8 | 0.4%

Most frequent character per script

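Hangul
Value | Count | Frequency (%)
(lost) | 235 | 12.3%
(lost) | 232 | 12.2%
(lost) | 229 | 12.0%
(lost) | 228 | 12.0%
(lost) | 198 | 10.4%
(lost) | 60 | 3.2%
(lost) | 54 | 2.8%
(lost) | 53 | 2.8%
(lost) | 39 | 2.0%
(lost) | 38 | 2.0%
Other values (188) | 537 | 28.2%

Latin
Value | Count | Frequency (%)
T | 1 | 12.5%
h | 1 | 12.5%
e | 1 | 12.5%
O | 1 | 12.5%
K | 1 | 12.5%
A | 1 | 12.5%
B | 1 | 12.5%
L | 1 | 12.5%

Common
Value | Count | Frequency (%)
1 | 5 | 50.0%
4 | 2 | 20.0%
(space) | 2 | 20.0%
2 | 1 | 10.0%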

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1903 | 99.1%
ASCII | 18 | 0.9%

Most frequent character per block

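Hangul
Value | Count | Frequency (%)
(lost) | 235 | 12.3%
(lost) | 232 | 12.2%
(lost) | 229 | 12.0%
(lost) | 228 | 12.0%
(lost) | 198 | 10.4%
(lost) | 60 | 3.2%
(lost) | 54 | 2.8%
(lost) | 53 | 2.8%
(lost) | 39 | 2.0%
(lost) | 38 | 2.0%
Other values (188) | 537 | 28.2%

ASCII
Value | Count | Frequency (%)
1 | 5 | 27.8%
4 | 2 | 11.1%
(space) | 2 | 11.1%
T | 1 | 5.6%
h | 1 | 5.6%
e | 1 | 5.6%
2 | 1 | 5.6%
O | 1 | 5.6%
K | 1 | 5.6%
A | 1 | 5.6%
Other values (2) | 2 | 11.1%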

주 소 (address)
Text

Distinct: 199
Distinct (%): 86.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Length

Max length: 40
Median length: 35
Mean length: 20.925764
Min length: 15

Characters and Unicode

Total characters: 4792
Distinct characters: 120
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
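
As an illustration of these character properties, the sketch below counts Unicode general categories for the first sample address of this variable, using Python's standard `unicodedata` module. It covers categories only, since `unicodedata` does not expose scripts or blocks, and the mapping from two-letter codes to the names used in this report is written out by hand for the codes that occur in the example.

```python
import unicodedata
from collections import Counter

# Two-letter general category codes mapped to the names shown in this report
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
}

def category_counts(text):
    """Count the characters of text by Unicode general category name."""
    counts = Counter()
    for ch in text:
        code = unicodedata.category(ch)
        # Fall back to the raw code for categories not listed above
        counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

address = "충청남도 서산시 대산읍 대산리 714-3"
counts = category_counts(address)
for name, count in counts.most_common():
    print(name, count)
```

Summing such counts over every row of the column is one way to arrive at a per-category table like the one in this section.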

Unique

Unique: 174
Unique (%): 76.0%

Sample

1st row: 충청남도 서산시 대산읍 대산리 714-3
2nd row: 충청남도 서산시 동문동 423-15
3rd row: 충청남도 서산시 동문동 426
4th row: 충청남도 서산시 대산읍 대산리 137
5th row: 충청남도 서산시 대산읍 대산리 133-22

Value | Count | Frequency (%)
충청남도 | 229 | 21.8%
서산시 | 229 | 21.8%
동문동 | 43 | 4.1%
예천동 | 34 | 3.2%
대산읍 | 25 | 2.4%
석림동 | 22 | 2.1%
잠홍동 | 21 | 2.0%
대산리 | 20 | 1.9%
읍내동 | 19 | 1.8%
해미면 | 13 | 1.2%
Other values (265) | 394 | 37.6%

Most occurring characters

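Value | Count | Frequency (%)
(space) | 824 | 17.2%
(lost) | 285 | 5.9%
(lost) | 240 | 5.0%
(lost) | 232 | 4.8%
(lost) | 231 | 4.8%
(lost) | 230 | 4.8%
(lost) | 229 | 4.8%
(lost) | 229 | 4.8%
1 | 220 | 4.6%
(lost) | 208 | 4.3%
Other values (110) | 1864 | 38.9%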

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2714 | 56.6%
Decimal Number | 1029 | 21.5%
Space Separator | 824 | 17.2%
Dash Punctuation | 197 | 4.1%
Open Punctuation | 9 | 0.2%
Close Punctuation | 9 | 0.2%
Other Punctuation | 6 | 0.1%
Lowercase Letter | 2 | < 0.1%
Uppercase Letter | 2 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(lost) | 285 | 10.5%
(lost) | 240 | 8.8%
(lost) | 232 | 8.5%
(lost) | 231 | 8.5%
(lost) | 230 | 8.5%
(lost) | 229 | 8.4%
(lost) | 229 | 8.4%
(lost) | 208 | 7.7%
(lost) | 79 | 2.9%
(lost) | 53 | 2.0%
Other values (91) | 698 | 25.7%

Decimal Number
Value | Count | Frequency (%)
1 | 220 | 21.4%
2 | 154 | 15.0%
5 | 111 | 10.8%
3 | 88 | 8.6%
6 | 87 | 8.5%
0 | 83 | 8.1%
4 | 79 | 7.7%
7 | 76 | 7.4%
9 | 73 | 7.1%
8 | 58 | 5.6%

Other Punctuation
Value | Count | Frequency (%)
@ | 5 | 83.3%
, | 1 | 16.7%

Uppercase Letter
Value | Count | Frequency (%)
C | 1 | 50.0%
R | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 824 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 197 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 9 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 9 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
a | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2714 | 56.6%
Common | 2074 | 43.3%
Latin | 4 | 0.1%

Most frequent character per script

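Hangul
Value | Count | Frequency (%)
(lost) | 285 | 10.5%
(lost) | 240 | 8.8%
(lost) | 232 | 8.5%
(lost) | 231 | 8.5%
(lost) | 230 | 8.5%
(lost) | 229 | 8.4%
(lost) | 229 | 8.4%
(lost) | 208 | 7.7%
(lost) | 79 | 2.9%
(lost) | 53 | 2.0%
Other values (91) | 698 | 25.7%

Common
Value | Count | Frequency (%)
(space) | 824 | 39.7%
1 | 220 | 10.6%
- | 197 | 9.5%
2 | 154 | 7.4%
5 | 111 | 5.4%
3 | 88 | 4.2%
6 | 87 | 4.2%
0 | 83 | 4.0%
4 | 79 | 3.8%
7 | 76 | 3.7%
Other values (6) | 155 | 7.5%

Latin
Value | Count | Frequency (%)
a | 2 | 50.0%
C | 1 | 25.0%
R | 1 | 25.0%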

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2714 | 56.6%
ASCII | 2078 | 43.4%

Most frequent character per block

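ASCII
Value | Count | Frequency (%)
(space) | 824 | 39.7%
1 | 220 | 10.6%
- | 197 | 9.5%
2 | 154 | 7.4%
5 | 111 | 5.3%
3 | 88 | 4.2%
6 | 87 | 4.2%
0 | 83 | 4.0%
4 | 79 | 3.8%
7 | 76 | 3.7%
Other values (9) | 159 | 7.7%

Hangul
Value | Count | Frequency (%)
(lost) | 285 | 10.5%
(lost) | 240 | 8.8%
(lost) | 232 | 8.5%
(lost) | 231 | 8.5%
(lost) | 230 | 8.5%
(lost) | 229 | 8.4%
(lost) | 229 | 8.4%
(lost) | 208 | 7.7%
(lost) | 79 | 2.9%
(lost) | 53 | 2.0%
Other values (91) | 698 | 25.7%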

대표자명 (representative name)
Text

UNIQUE

Distinct: 229
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Length

Max length: 3
Median length: 3
Mean length: 2.9956332
Min length: 2

Characters and Unicode

Total characters: 686
Distinct characters: 143
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 229
Unique (%): 100.0%

Sample

1st row: 배영원
2nd row: 도경희
3rd row: 김영호
4th row: 김웅곤
5th row: 김후경

Value | Count | Frequency (%)
배영원 | 1 | 0.4%
최은미 | 1 | 0.4%
박한이 | 1 | 0.4%
서수애 | 1 | 0.4%
인명란 | 1 | 0.4%
김주원 | 1 | 0.4%
조경자 | 1 | 0.4%
서근원 | 1 | 0.4%
오인규 | 1 | 0.4%
정휘근 | 1 | 0.4%
Other values (220) | 220 | 95.7%

Most occurring characters

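Value | Count | Frequency (%)
(lost) | 58 | 8.5%
(lost) | 32 | 4.7%
(lost) | 26 | 3.8%
(lost) | 23 | 3.4%
(lost) | 22 | 3.2%
(lost) | 16 | 2.3%
(lost) | 16 | 2.3%
(lost) | 15 | 2.2%
(lost) | 13 | 1.9%
(lost) | 12 | 1.7%
Other values (133) | 453 | 66.0%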

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 685 | 99.9%
Space Separator | 1 | 0.1%

Most frequent character per category

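Other Letter
Value | Count | Frequency (%)
(lost) | 58 | 8.5%
(lost) | 32 | 4.7%
(lost) | 26 | 3.8%
(lost) | 23 | 3.4%
(lost) | 22 | 3.2%
(lost) | 16 | 2.3%
(lost) | 16 | 2.3%
(lost) | 15 | 2.2%
(lost) | 13 | 1.9%
(lost) | 12 | 1.8%
Other values (132) | 452 | 66.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1 | 100.0%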

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 685 | 99.9%
Common | 1 | 0.1%

Most frequent character per script

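Hangul
Value | Count | Frequency (%)
(lost) | 58 | 8.5%
(lost) | 32 | 4.7%
(lost) | 26 | 3.8%
(lost) | 23 | 3.4%
(lost) | 22 | 3.2%
(lost) | 16 | 2.3%
(lost) | 16 | 2.3%
(lost) | 15 | 2.2%
(lost) | 13 | 1.9%
(lost) | 12 | 1.8%
Other values (132) | 452 | 66.0%

Common
Value | Count | Frequency (%)
(space) | 1 | 100.0%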

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 685 | 99.9%
ASCII | 1 | 0.1%

Most frequent character per block

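Hangul
Value | Count | Frequency (%)
(lost) | 58 | 8.5%
(lost) | 32 | 4.7%
(lost) | 26 | 3.8%
(lost) | 23 | 3.4%
(lost) | 22 | 3.2%
(lost) | 16 | 2.3%
(lost) | 16 | 2.3%
(lost) | 15 | 2.2%
(lost) | 13 | 1.9%
(lost) | 12 | 1.8%
Other values (132) | 452 | 66.0%

ASCII
Value | Count | Frequency (%)
(space) | 1 | 100.0%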

운영상태 (operating status)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB
영업중 (operating): 225
휴업 (temporarily closed): 4

Length

Max length: 3
Median length: 3
Mean length: 2.9825328
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 영업중
2nd row: 영업중
3rd row: 영업중
4th row: 영업중
5th row: 영업중

Common Values

Value | Count | Frequency (%)
영업중 | 225 | 98.3%
휴업 | 4 | 1.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
영업중 | 225 | 98.3%
휴업 | 4 | 1.7%

Interactions


Correlations

(variable) | 등록번호 | 운영상태
등록번호 | 1.000 | 0.000
운영상태 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
(index) | 등록번호 | 명칭 | 주 소 | 대표자명 | 운영상태
0 | 1 | 양지부동산중개인사무소 | 충청남도 서산시 대산읍 대산리 714-3 | 배영원 | 영업중
1 | 4 | 영창부동산중개인사무소 | 충청남도 서산시 동문동 423-15 | 도경희 | 영업중
2 | 5 | 황금부동산중개인사무소 | 충청남도 서산시 동문동 426 | 김영호 | 영업중
3 | 7 | 성지부동산중개인사무소 | 충청남도 서산시 대산읍 대산리 137 | 김웅곤 | 영업중
4 | 8 | 극동부동산중개인사무소 | 충청남도 서산시 대산읍 대산리 133-22 | 김후경 | 영업중
5 | 9 | 서해부동산중개인사무소 | 충청남도 서산시 대산읍 화곡리 11 | 서갑석 | 영업중
6 | 12 | 진영부동산중개인사무소 | 충청남도 서산시 해미면 읍내리 321-15 | 진희태 | 영업중
7 | 15 | 로타리부동산중개사무소 | 충청남도 서산시 해미면 휴암리 228-5 | 오병은 | 영업중
8 | 19 | 팔구사부동산중개인사무소 | 충청남도 서산시 잠홍동 769-12 | 강신구 | 영업중
9 | 21 | 푸른부동산중개인사무소 | 충청남도 서산시 성연면 평리 292-2 | 표세웅 | 휴업

Last rows
(index) | 등록번호 | 명칭 | 주 소 | 대표자명 | 운영상태
219 | 895 | 서산롯데공인중개사 | 충청남도 서산시 읍내동 742-9 | 윤영희 | 영업중
220 | 896 | 엘리트공인중개사 | 충청남도 서산시 대산읍 화곡리 831-1 | 변지수 | 영업중
221 | 897 | 골드공인중개사 | 충청남도 서산시 잠홍동 502-1 | 차경환 | 영업중
222 | 898 | 한은정공인중개사 | 충청남도 서산시 예천동 1271-3 | 한은정 | 영업중
223 | 899 | 세종공인중개사 | 충청남도 서산시 해미면 읍내리 184-1 해미시장117호 | 김부배 | 영업중
224 | 900 | 토지공인중개사 | 충청남도 서산시 운산면 갈산리 430-14 | 정종옥 | 영업중
225 | 901 | 억대부동산공인중개 | 충청남도 서산시 해미면 읍내리 153-7 | 최교용 | 영업중
226 | 902 | 한빛공인중개사 | 충청남도 서산시 동문동 193-8 | 박홍규 | 영업중
227 | 903 | 오용호공인중개사 | 충청남도 서산시 예천동 1280 | 오용호 | 영업중
228 | 904 | 국보공인중개사 | 충청남도 서산시 잠홍동 515-6 | 손종근 | 영업중