Overview

Dataset statistics

Number of variables: 21
Number of observations: 10000
Missing cells: 13224
Missing cells (%): 6.3%
Duplicate rows: 45
Duplicate rows (%): 0.4%
Total size in memory: 1.8 MiB
Average record size in memory: 187.0 B
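
The percentages above follow from the raw counts. As a quick sanity check, the sketch below recomputes them in plain Python. It assumes the missing-cell percentage is taken over all cells (variables × observations) and the duplicate percentage over all rows; the variable names are illustrative only.

```python
# Counts copied from the dataset statistics above.
n_variables = 21
n_observations = 10_000
missing_cells = 13_224
duplicate_rows = 45

# Assumption: percentages are relative to all cells and to all rows.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"total cells:    {total_cells}")
print(f"missing cells:  {missing_pct:.2f}%")
print(f"duplicate rows: {duplicate_pct:.2f}%")
```

Under these assumptions the duplicate share works out to 0.45%, which the report displays as 0.4%.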

Variable types

Categorical: 7
Numeric: 9
Text: 3
Unsupported: 2

Dataset

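Description: 접수연도 (filing year), 자치구코드 (district code), 자치구명 (district name), 법정동코드 (legal dong code), 법정동명 (legal dong name), 지번구분 (lot number type), 지번구분명 (lot number type name), 본번 (main lot number), 부번 (sub lot number), 건물명 (building name), 계약일 (contract date), 물건금액(만원) (transaction amount, in units of 10,000 KRW), 건물면적(㎡) (building area), 토지면적(㎡) (land area), 층 (floor), 권리구분 (right type), 취소일 (cancellation date), 건축년도 (construction year), 건물용도 (building use), 신고구분 (report type), 신고한 개업공인중개사 시군구명 (city/county/district of the reporting licensed real estate agent)
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-21275/S/1/datasetView.do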

Alerts

Dataset has 45 (0.4%) duplicate rows [Duplicates]
지번구분 is highly imbalanced (87.8%) [Imbalance]
지번구분명 is highly imbalanced (87.8%) [Imbalance]
권리구분 is highly imbalanced (95.9%) [Imbalance]
본번 has 390 (3.9%) missing values [Missing]
부번 has 390 (3.9%) missing values [Missing]
건물명 has 388 (3.9%) missing values [Missing]
토지면적(㎡) has 390 (3.9%) missing values [Missing]
층 has 387 (3.9%) missing values [Missing]
취소일 has 9597 (96.0%) missing values [Missing]
신고한 개업공인중개사 시군구명 has 1642 (16.4%) missing values [Missing]
토지면적(㎡) is highly skewed (γ1 = 95.92368168) [Skewed]
본번 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
부번 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
토지면적(㎡) has 4622 (46.2%) zeros [Zeros]
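
The imbalance percentages in the alerts are consistent with a normalized-entropy score, 1 - H / log2(k), where H is the Shannon entropy of the category counts and k is the number of categories. The report does not state the formula, so treat the sketch below as an assumption. It is evaluated on the counts listed in the 지번구분 and 권리구분 variable sections, and `imbalance_score` is an illustrative name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of category counts."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    # Shannon entropy in bits; empty categories contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

jibun = imbalance_score([9605, 389, 5, 1])   # counts for 지번구분
right = imbalance_score([9932, 42, 26])      # counts for 권리구분
print(f"{jibun:.1%}  {right:.1%}")
```

A score near 100% means almost all rows fall into one category, which is why these columns carry little information on their own.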

Reproduction

Analysis started: 2024-05-11 07:01:53.115485
Analysis finished: 2024-05-11 07:01:55.346086
Duration: 2.23 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

접수연도
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
2023
5377 
2024
4623 

Length

Max length4
Median length4
Mean length4
Min length4

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2024
2nd row2024
3rd row2023
4th row2023
5th row2024

Common Values

ValueCountFrequency (%)
2023 5377
53.8%
2024 4623
46.2%

Length

Histogram of lengths of the category


자치구코드
Real number (ℝ)

Distinct25
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean11453.503
Minimum11110
Maximum11740
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size166.0 KiB

Quantile statistics

Minimum11110
5-th percentile11170
Q111305
median11470
Q311590
95-th percentile11740
Maximum11740
Range630
Interquartile range (IQR)285

Descriptive statistics

Standard deviation175.57802
Coefficient of variation (CV)0.015329635
Kurtosis-1.0995539
Mean11453.503
Median Absolute Deviation (MAD)150
Skewness-0.052074353
Sum1.1453503 × 10^8
Variance30827.642
MonotonicityNot monotonic
Histogram with fixed size bins (bins=25)
ValueCountFrequency (%)
11500 697
 
7.0%
11710 590
 
5.9%
11380 542
 
5.4%
11740 513
 
5.1%
11440 513
 
5.1%
11350 494
 
4.9%
11680 493
 
4.9%
11530 454
 
4.5%
11290 451
 
4.5%
11560 440
 
4.4%
Other values (15) 4813
48.1%
ValueCountFrequency (%)
11110 175
 
1.8%
11140 136
 
1.4%
11170 204
2.0%
11200 361
3.6%
11215 322
3.2%
11230 366
3.7%
11260 331
3.3%
11290 451
4.5%
11305 339
3.4%
11320 332
3.3%
ValueCountFrequency (%)
11740 513
5.1%
11710 590
5.9%
11680 493
4.9%
11650 390
3.9%
11620 365
3.6%
11590 412
4.1%
11560 440
4.4%
11545 347
3.5%
11530 454
4.5%
11500 697
7.0%

자치구명
Categorical

Distinct25
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
강서구
 
697
송파구
 
590
은평구
 
542
강동구
 
513
마포구
 
513
Other values (20)
7145 

Length

Max length4
Median length3
Mean length3.101
Min length2

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row노원구
2nd row강서구
3rd row구로구
4th row종로구
5th row서초구

Common Values

ValueCountFrequency (%)
강서구 697
 
7.0%
송파구 590
 
5.9%
은평구 542
 
5.4%
강동구 513
 
5.1%
마포구 513
 
5.1%
노원구 494
 
4.9%
강남구 493
 
4.9%
구로구 454
 
4.5%
성북구 451
 
4.5%
영등포구 440
 
4.4%
Other values (15) 4813
48.1%

Length

Histogram of lengths of the category

법정동코드
Real number (ℝ)

Distinct76
Distinct (%)0.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean10963.96
Minimum10100
Maximum18700
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size166.0 KiB

Quantile statistics

Minimum10100
5-th percentile10100
Q110200
median10600
Q311000
95-th percentile13400
Maximum18700
Range8600
Interquartile range (IQR)800

Descriptive statistics

Standard deviation1257.9473
Coefficient of variation (CV)0.11473476
Kurtosis10.418164
Mean10963.96
Median Absolute Deviation (MAD)400
Skewness2.9432304
Sum1.096396 × 10^8
Variance1582431.4
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
10200 1292
12.9%
10100 1251
12.5%
10300 1240
12.4%
10500 800
 
8.0%
10700 751
 
7.5%
10800 634
 
6.3%
10600 620
 
6.2%
10900 477
 
4.8%
10400 392
 
3.9%
11000 218
 
2.2%
Other values (66) 2325
23.2%
ValueCountFrequency (%)
10100 1251
12.5%
10200 1292
12.9%
10300 1240
12.4%
10400 392
 
3.9%
10500 800
8.0%
10600 620
6.2%
10700 751
7.5%
10800 634
6.3%
10900 477
 
4.8%
11000 218
 
2.2%
ValueCountFrequency (%)
18700 5
0.1%
18600 2
 
< 0.1%
18500 1
 
< 0.1%
18400 3
 
< 0.1%
18300 10
0.1%
18200 1
 
< 0.1%
18100 4
 
< 0.1%
18000 3
 
< 0.1%
17900 2
 
< 0.1%
17700 4
 
< 0.1%

법정동명
Text

Distinct337
Distinct (%)3.4%
Missing0
Missing (%)0.0%
Memory size156.2 KiB

Length

Max length6
Median length3
Mean length3.1219
Min length2

Characters and Unicode

Total characters31219
Distinct characters192
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique41 ?
Unique (%)0.4%

Sample

1st row중계동
2nd row화곡동
3rd row항동
4th row원서동
5th row우면동
ValueCountFrequency (%)
화곡동 299
 
3.0%
상계동 190
 
1.9%
봉천동 183
 
1.8%
신림동 163
 
1.6%
구로동 157
 
1.6%
목동 151
 
1.5%
수유동 147
 
1.5%
천호동 141
 
1.4%
신정동 133
 
1.3%
독산동 133
 
1.3%
Other values (327) 8303
83.0%

Most occurring characters

ValueCountFrequency (%)
9973
31.9%
873
 
2.8%
821
 
2.6%
586
 
1.9%
525
 
1.7%
475
 
1.5%
460
 
1.5%
442
 
1.4%
411
 
1.3%
410
 
1.3%
Other values (182) 16243
52.0%

Most occurring categories

ValueCountFrequency (%)
Other Letter 30719
98.4%
Decimal Number 500
 
1.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
9973
32.5%
873
 
2.8%
821
 
2.7%
586
 
1.9%
525
 
1.7%
475
 
1.5%
460
 
1.5%
442
 
1.4%
411
 
1.3%
410
 
1.3%
Other values (174) 15743
51.2%
Decimal Number
ValueCountFrequency (%)
1 109
21.8%
2 102
20.4%
3 98
19.6%
4 71
14.2%
5 62
12.4%
7 23
 
4.6%
6 23
 
4.6%
8 12
 
2.4%

Most occurring scripts

ValueCountFrequency (%)
Hangul 30719
98.4%
Common 500
 
1.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
9973
32.5%
873
 
2.8%
821
 
2.7%
586
 
1.9%
525
 
1.7%
475
 
1.5%
460
 
1.5%
442
 
1.4%
411
 
1.3%
410
 
1.3%
Other values (174) 15743
51.2%
Common
ValueCountFrequency (%)
1 109
21.8%
2 102
20.4%
3 98
19.6%
4 71
14.2%
5 62
12.4%
7 23
 
4.6%
6 23
 
4.6%
8 12
 
2.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 30719
98.4%
ASCII 500
 
1.6%

Most frequent character per block

Hangul
ValueCountFrequency (%)
9973
32.5%
873
 
2.8%
821
 
2.7%
586
 
1.9%
525
 
1.7%
475
 
1.5%
460
 
1.5%
442
 
1.4%
411
 
1.3%
410
 
1.3%
Other values (174) 15743
51.2%
ASCII
ValueCountFrequency (%)
1 109
21.8%
2 102
20.4%
3 98
19.6%
4 71
14.2%
5 62
12.4%
7 23
 
4.6%
6 23
 
4.6%
8 12
 
2.4%

지번구분
Categorical

IMBALANCE 

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
1
9605 
<NA>
 
389
3
 
5
2
 
1

Length

Max length4
Median length1
Mean length1.1167
Min length1

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st row1
2nd row1
3rd row1
4th row1
5th row1

Common Values

ValueCountFrequency (%)
1 9605
96.0%
<NA> 389
 
3.9%
3 5
 
0.1%
2 1
 
< 0.1%
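
The report counts `<NA>` as an ordinary category (389 rows) and therefore shows Missing 0, which suggests the placeholder reached the profiler as a literal string. A minimal sketch, assuming that is the case, maps such placeholders to `None` before counting. `PLACEHOLDERS` and `normalize` are illustrative names, and the list is rebuilt from the counts above rather than read from the actual data.

```python
from collections import Counter

PLACEHOLDERS = {"<NA>", "na", ""}  # assumed spellings of a missing value

def normalize(value):
    """Return None for placeholder strings, otherwise the value unchanged."""
    return None if value in PLACEHOLDERS else value

# Values reconstructed from the Common Values table for 지번구분.
values = ["1"] * 9605 + ["<NA>"] * 389 + ["3"] * 5 + ["2"] * 1
cleaned = [normalize(v) for v in values]

n_missing = sum(v is None for v in cleaned)
counts = Counter(v for v in cleaned if v is not None)
print(n_missing, dict(counts))
```

With the placeholder treated as missing, the column would have three real categories instead of four, and its missing count would line up with the roughly 390 missing values reported for 본번 and 부번.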

Length

Histogram of lengths of the category


지번구분명
Categorical

IMBALANCE 

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
대지
9605 
<NA>
 
389
블럭
 
5
 
1

Length

Max length4
Median length2
Mean length2.0777
Min length1

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st row대지
2nd row대지
3rd row대지
4th row대지
5th row대지

Common Values

ValueCountFrequency (%)
대지 9605
96.0%
<NA> 389
 
3.9%
블럭 5
 
0.1%
1
 
< 0.1%

Length

Histogram of lengths of the category


본번
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing390
Missing (%)3.9%
Memory size156.2 KiB

부번
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing390
Missing (%)3.9%
Memory size156.2 KiB

건물명
Text

MISSING 

Distinct5154
Distinct (%)53.6%
Missing388
Missing (%)3.9%
Memory size156.2 KiB

Length

Max length29
Median length26
Mean length6.9507907
Min length2

Characters and Unicode

Total characters66811
Distinct characters604
Distinct categories12 ?
Distinct scripts4 ?
Distinct blocks6 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3621 ?
Unique (%)37.7%

Sample

1st row삼창타워주상프라자
2nd row경원아트빌B동
3rd row하버라인2단지
4th row(103-1)
5th row서초리슈빌S글로벌
ValueCountFrequency (%)
가산 72
 
0.7%
시빅 70
 
0.7%
오피스텔 62
 
0.6%
현대 50
 
0.5%
아이파크 38
 
0.4%
마곡나루역 37
 
0.4%
힐스테이트에코 37
 
0.4%
서울마곡 37
 
0.4%
라마다앙코르 37
 
0.4%
238-7 35
 
0.3%
Other values (5290) 10023
95.5%

Most occurring characters

ValueCountFrequency (%)
2001
 
3.0%
1881
 
2.8%
1 1791
 
2.7%
) 1515
 
2.3%
1515
 
2.3%
( 1515
 
2.3%
1388
 
2.1%
1337
 
2.0%
2 1316
 
2.0%
1236
 
1.8%
Other values (594) 51316
76.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 52120
78.0%
Decimal Number 7617
 
11.4%
Close Punctuation 1515
 
2.3%
Open Punctuation 1515
 
2.3%
Uppercase Letter 1432
 
2.1%
Dash Punctuation 1083
 
1.6%
Space Separator 911
 
1.4%
Lowercase Letter 310
 
0.5%
Other Punctuation 145
 
0.2%
Math Symbol 77
 
0.1%
Other values (2) 86
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2001
 
3.8%
1881
 
3.6%
1515
 
2.9%
1388
 
2.7%
1337
 
2.6%
1236
 
2.4%
1116
 
2.1%
1110
 
2.1%
1062
 
2.0%
1038
 
2.0%
Other values (518) 38436
73.7%
Uppercase Letter
ValueCountFrequency (%)
C 155
10.8%
A 141
 
9.8%
B 129
 
9.0%
M 111
 
7.8%
I 102
 
7.1%
E 94
 
6.6%
S 93
 
6.5%
D 88
 
6.1%
K 77
 
5.4%
L 61
 
4.3%
Other values (15) 381
26.6%
Lowercase Letter
ValueCountFrequency (%)
e 168
54.2%
l 24
 
7.7%
i 16
 
5.2%
n 15
 
4.8%
s 14
 
4.5%
a 13
 
4.2%
c 11
 
3.5%
u 9
 
2.9%
h 8
 
2.6%
r 7
 
2.3%
Other values (9) 25
 
8.1%
Decimal Number
ValueCountFrequency (%)
1 1791
23.5%
2 1316
17.3%
3 849
11.1%
4 659
 
8.7%
0 620
 
8.1%
5 601
 
7.9%
6 534
 
7.0%
7 455
 
6.0%
8 421
 
5.5%
9 371
 
4.9%
Other Punctuation
ValueCountFrequency (%)
, 88
60.7%
. 19
 
13.1%
& 9
 
6.2%
/ 9
 
6.2%
' 7
 
4.8%
5
 
3.4%
? 5
 
3.4%
# 3
 
2.1%
Letter Number
ValueCountFrequency (%)
21
36.2%
15
25.9%
11
19.0%
6
 
10.3%
2
 
3.4%
2
 
3.4%
1
 
1.7%
Math Symbol
ValueCountFrequency (%)
~ 76
98.7%
1
 
1.3%
Close Punctuation
ValueCountFrequency (%)
) 1515
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1515
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1083
100.0%
Space Separator
ValueCountFrequency (%)
911
100.0%
Control
ValueCountFrequency (%)
28
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 52098
78.0%
Common 12891
 
19.3%
Latin 1800
 
2.7%
Han 22
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2001
 
3.8%
1881
 
3.6%
1515
 
2.9%
1388
 
2.7%
1337
 
2.6%
1236
 
2.4%
1116
 
2.1%
1110
 
2.1%
1062
 
2.0%
1038
 
2.0%
Other values (514) 38414
73.7%
Latin
ValueCountFrequency (%)
e 168
 
9.3%
C 155
 
8.6%
A 141
 
7.8%
B 129
 
7.2%
M 111
 
6.2%
I 102
 
5.7%
E 94
 
5.2%
S 93
 
5.2%
D 88
 
4.9%
K 77
 
4.3%
Other values (41) 642
35.7%
Common
ValueCountFrequency (%)
1 1791
13.9%
) 1515
11.8%
( 1515
11.8%
2 1316
10.2%
- 1083
8.4%
911
7.1%
3 849
6.6%
4 659
 
5.1%
0 620
 
4.8%
5 601
 
4.7%
Other values (15) 2031
15.8%
Han
ValueCountFrequency (%)
7
31.8%
6
27.3%
6
27.3%
3
13.6%

Most occurring blocks

ValueCountFrequency (%)
Hangul 52098
78.0%
ASCII 14627
 
21.9%
Number Forms 58
 
0.1%
CJK 22
 
< 0.1%
None 5
 
< 0.1%
Math Operators 1
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2001
 
3.8%
1881
 
3.6%
1515
 
2.9%
1388
 
2.7%
1337
 
2.6%
1236
 
2.4%
1116
 
2.1%
1110
 
2.1%
1062
 
2.0%
1038
 
2.0%
Other values (514) 38414
73.7%
ASCII
ValueCountFrequency (%)
1 1791
12.2%
) 1515
10.4%
( 1515
10.4%
2 1316
 
9.0%
- 1083
 
7.4%
911
 
6.2%
3 849
 
5.8%
4 659
 
4.5%
0 620
 
4.2%
5 601
 
4.1%
Other values (57) 3767
25.8%
Number Forms
ValueCountFrequency (%)
21
36.2%
15
25.9%
11
19.0%
6
 
10.3%
2
 
3.4%
2
 
3.4%
1
 
1.7%
CJK
ValueCountFrequency (%)
7
31.8%
6
27.3%
6
27.3%
3
13.6%
None
ValueCountFrequency (%)
5
100.0%
Math Operators
ValueCountFrequency (%)
1
100.0%

계약일
Real number (ℝ)

Distinct302
Distinct (%)3.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean20234956
Minimum20230712
Maximum20240509
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size166.0 KiB

Quantile statistics

Minimum20230712
5-th percentile20230725
Q120230913
median20231207
Q320240227
95-th percentile20240415
Maximum20240509
Range9797
Interquartile range (IQR)9314

Descriptive statistics

Standard deviation4613.6992
Coefficient of variation (CV)0.00022800638
Kurtosis-1.9149379
Mean20234956
Median Absolute Deviation (MAD)478
Skewness0.28555568
Sum2.0234956 × 10^11
Variance21286220
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
20240215 103
 
1.0%
20240330 83
 
0.8%
20240323 80
 
0.8%
20230909 77
 
0.8%
20240316 71
 
0.7%
20240309 70
 
0.7%
20230826 68
 
0.7%
20240406 67
 
0.7%
20231214 67
 
0.7%
20230821 67
 
0.7%
Other values (292) 9247
92.5%
ValueCountFrequency (%)
20230712 24
0.2%
20230713 39
0.4%
20230714 32
0.3%
20230715 54
0.5%
20230716 15
 
0.1%
20230717 36
0.4%
20230718 31
0.3%
20230719 43
0.4%
20230720 45
0.4%
20230721 46
0.5%
ValueCountFrequency (%)
20240509 1
 
< 0.1%
20240508 9
0.1%
20240507 5
 
0.1%
20240506 2
 
< 0.1%
20240505 1
 
< 0.1%
20240504 14
0.1%
20240503 14
0.1%
20240502 20
0.2%
20240501 18
0.2%
20240430 15
0.1%
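
계약일 is stored as an integer in YYYYMMDD form, so the report profiles it as a real number. Statistics such as the mean or the range of 9797 are therefore differences between integers, not calendar spans. A minimal sketch, assuming every value is an eight-digit YYYYMMDD integer (`parse_yyyymmdd` is an illustrative helper, not part of the report), converts the reported minimum and maximum to dates and measures the actual span:

```python
from datetime import date, datetime

def parse_yyyymmdd(value: int) -> date:
    """Convert an integer such as 20230712 to a datetime.date."""
    return datetime.strptime(str(value), "%Y%m%d").date()

first = parse_yyyymmdd(20230712)  # Minimum reported for 계약일
last = parse_yyyymmdd(20240509)   # Maximum reported for 계약일

integer_range = 20240509 - 20230712
calendar_span = (last - first).days

print("integer range:", integer_range)
print("calendar span:", calendar_span, "days")
```

Parsing the column this way before profiling would likely let the tool treat it as a date rather than a number. The same reasoning applies to 취소일.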

물건금액(만원)
Real number (ℝ)

Distinct1686
Distinct (%)16.9%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean75205.451
Minimum3000
Maximum1800000
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size166.0 KiB

Quantile statistics

Minimum3000
5-th percentile14000
Q127000
median50544
Q395000
95-th percentile215000
Maximum1800000
Range1797000
Interquartile range (IQR)68000

Descriptive statistics

Standard deviation81054.123
Coefficient of variation (CV)1.0777693
Kurtosis47.811948
Mean75205.451
Median Absolute Deviation (MAD)28044
Skewness4.699505
Sum7.5205451 × 10^8
Variance6.5697708 × 10^9
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
25000 115
 
1.1%
30000 102
 
1.0%
50000 87
 
0.9%
20000 85
 
0.9%
26000 84
 
0.8%
27000 78
 
0.8%
15000 77
 
0.8%
28000 71
 
0.7%
40000 71
 
0.7%
35000 69
 
0.7%
Other values (1676) 9161
91.6%
ValueCountFrequency (%)
3000 1
 
< 0.1%
3500 1
 
< 0.1%
3800 1
 
< 0.1%
4300 1
 
< 0.1%
4500 1
 
< 0.1%
4600 1
 
< 0.1%
4700 1
 
< 0.1%
5230 1
 
< 0.1%
5500 1
 
< 0.1%
5599 3
< 0.1%
ValueCountFrequency (%)
1800000 1
< 0.1%
1350000 1
< 0.1%
1030000 2
< 0.1%
1000000 2
< 0.1%
995000 1
< 0.1%
985000 1
< 0.1%
850000 1
< 0.1%
818000 1
< 0.1%
790000 1
< 0.1%
745000 1
< 0.1%
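
The amounts in this column are expressed in 만원, that is, in units of 10,000 KRW. The sketch below converts a few of the statistics reported above to KRW; `MANWON` and `to_krw` are illustrative names.

```python
MANWON = 10_000  # 1 만원 = 10,000 KRW

def to_krw(amount_manwon):
    """Convert an amount in 만원 to KRW."""
    return amount_manwon * MANWON

# Minimum, median and maximum as reported for 물건금액(만원).
stats_manwon = {"minimum": 3000, "median": 50544, "maximum": 1800000}
stats_krw = {name: to_krw(value) for name, value in stats_manwon.items()}

for name, value in stats_krw.items():
    print(f"{name}: {value:,} KRW")
```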

건물면적(㎡)
Real number (ℝ)

Distinct4312
Distinct (%)43.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean64.60445
Minimum8.48
Maximum1775.16
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size166.0 KiB

Quantile statistics

Minimum8.48
5-th percentile19.58
Q135.89
median59.231
Q384.6
95-th percentile130.17
Maximum1775.16
Range1766.68
Interquartile range (IQR)48.71

Descriptive statistics

Standard deviation57.584644
Coefficient of variation (CV)0.89134176
Kurtosis224.55841
Mean64.60445
Median Absolute Deviation (MAD)25.009
Skewness10.46099
Sum646044.5
Variance3315.9912
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
84.98 169
 
1.7%
84.99 141
 
1.4%
84.97 122
 
1.2%
59.98 112
 
1.1%
84.96 102
 
1.0%
59.99 101
 
1.0%
59.96 84
 
0.8%
59.97 79
 
0.8%
84.95 67
 
0.7%
59.94 66
 
0.7%
Other values (4302) 8957
89.6%
ValueCountFrequency (%)
8.48 1
< 0.1%
8.6 1
< 0.1%
9.85 1
< 0.1%
9.94 1
< 0.1%
11.16 1
< 0.1%
11.61 1
< 0.1%
12.01 1
< 0.1%
12.02 2
< 0.1%
12.04 1
< 0.1%
12.05 2
< 0.1%
ValueCountFrequency (%)
1775.16 1
< 0.1%
1757.18 1
< 0.1%
1469.49 1
< 0.1%
1175.75 1
< 0.1%
924.81 1
< 0.1%
727.16 1
< 0.1%
711.65 1
< 0.1%
651.86 1
< 0.1%
646.42 1
< 0.1%
639.45 1
< 0.1%

토지면적(㎡)
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct2531
Distinct (%)26.3%
Missing390
Missing (%)3.9%
Infinite0
Infinite (%)0.0%
Mean24.298698
Minimum0
Maximum30822
Zeros4622
Zeros (%)46.2%
Negative0
Negative (%)0.0%
Memory size166.0 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median12.23
Q330
95-th percentile71.6336
Maximum30822
Range30822
Interquartile range (IQR)30

Descriptive statistics

Standard deviation316.49117
Coefficient of variation (CV)13.025026
Kurtosis9334.2789
Mean24.298698
Median Absolute Deviation (MAD)12.23
Skewness95.923682
Sum233510.49
Variance100166.66
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.0 4622
46.2%
18.0 58
 
0.6%
17.0 48
 
0.5%
28.0 47
 
0.5%
19.0 42
 
0.4%
27.0 37
 
0.4%
62.17 37
 
0.4%
20.0 36
 
0.4%
16.0 36
 
0.4%
26.0 36
 
0.4%
Other values (2521) 4611
46.1%
(Missing) 390
 
3.9%
ValueCountFrequency (%)
0.0 4622
46.2%
3.38 1
 
< 0.1%
3.7 1
 
< 0.1%
4.0 1
 
< 0.1%
4.8 1
 
< 0.1%
4.96 1
 
< 0.1%
5.0 6
 
0.1%
5.02 1
 
< 0.1%
5.43 1
 
< 0.1%
5.6 1
 
< 0.1%
ValueCountFrequency (%)
30822.0 1
< 0.1%
720.0 1
< 0.1%
641.3 1
< 0.1%
600.0 1
< 0.1%
594.3 1
< 0.1%
552.0 1
< 0.1%
499.0 1
< 0.1%
452.0 1
< 0.1%
446.0 1
< 0.1%
442.0 1
< 0.1%
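
The extreme skewness of 토지면적(㎡) is driven in large part by a single observation: the maximum of 30822 is followed by 720 as the next largest value. Using only the aggregates reported above, the sketch below estimates how the mean changes when that one record is set aside. It assumes the reported mean is computed over non-missing values only.

```python
n_rows = 10_000
n_missing = 390
total = 233510.49   # Sum reported for 토지면적(㎡)
largest = 30822.0   # Maximum, which appears once

n_valid = n_rows - n_missing
mean_all = total / n_valid
mean_without_max = (total - largest) / (n_valid - 1)
share_of_sum = largest / total

print(f"mean over {n_valid} values: {mean_all:.4f}")
print(f"mean without the maximum: {mean_without_max:.4f}")
print(f"share of the sum from one record: {share_of_sum:.1%}")
```

The first figure matches the reported mean of 24.298698, which supports the assumption about the denominator.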


층
Real number (ℝ)

MISSING 

Distinct50
Distinct (%)0.5%
Missing387
Missing (%)3.9%
Infinite0
Infinite (%)0.0%
Mean7.3982108
Minimum-1
Maximum58
Zeros0
Zeros (%)0.0%
Negative183
Negative (%)1.8%
Memory size166.0 KiB

Quantile statistics

Minimum-1
5-th percentile1
Q13
median5
Q311
95-th percentile19
Maximum58
Range59
Interquartile range (IQR)8

Descriptive statistics

Standard deviation6.0638993
Coefficient of variation (CV)0.81964403
Kurtosis3.5249033
Mean7.3982108
Median Absolute Deviation (MAD)3
Skewness1.4907304
Sum71119
Variance36.770875
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
2 1159
11.6%
3 1090
10.9%
4 1030
 
10.3%
5 909
 
9.1%
1 586
 
5.9%
6 540
 
5.4%
8 446
 
4.5%
7 398
 
4.0%
10 394
 
3.9%
9 379
 
3.8%
Other values (40) 2682
26.8%
(Missing) 387
 
3.9%
ValueCountFrequency (%)
-1 183
 
1.8%
1 586
5.9%
2 1159
11.6%
3 1090
10.9%
4 1030
10.3%
5 909
9.1%
6 540
5.4%
7 398
 
4.0%
8 446
 
4.5%
9 379
 
3.8%
ValueCountFrequency (%)
58 1
< 0.1%
55 1
< 0.1%
50 1
< 0.1%
49 1
< 0.1%
47 1
< 0.1%
45 1
< 0.1%
43 1
< 0.1%
42 2
< 0.1%
41 1
< 0.1%
40 1
< 0.1%

권리구분
Categorical

IMBALANCE 

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
<NA>
9932 
입주권
 
42
분양권
 
26

Length

Max length4
Median length4
Mean length3.9932
Min length3

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row<NA>
2nd row<NA>
3rd row<NA>
4th row<NA>
5th row<NA>

Common Values

ValueCountFrequency (%)
<NA> 9932
99.3%
입주권 42
 
0.4%
분양권 26
 
0.3%

Length

Histogram of lengths of the category


취소일
Real number (ℝ)

MISSING 

Distinct175
Distinct (%)43.4%
Missing9597
Missing (%)96.0%
Infinite0
Infinite (%)0.0%
Mean20235613
Minimum20230804
Maximum20240508
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size166.0 KiB

Quantile statistics

Minimum20230804
5-th percentile20230901
Q120231025
median20231229
Q320240314
95-th percentile20240429
Maximum20240508
Range9704
Interquartile range (IQR)9289.5

Descriptive statistics

Standard deviation4637.6194
Coefficient of variation (CV)0.00022918106
Kurtosis-2.0066003
Mean20235613
Median Absolute Deviation (MAD)419
Skewness0.02469818
Sum8.1549522 × 10^9
Variance21507513
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
20231121 6
 
0.1%
20240205 6
 
0.1%
20240401 6
 
0.1%
20240319 6
 
0.1%
20240329 6
 
0.1%
20240416 5
 
0.1%
20231226 5
 
0.1%
20240227 5
 
0.1%
20230918 5
 
0.1%
20240429 5
 
0.1%
Other values (165) 348
 
3.5%
(Missing) 9597
96.0%
ValueCountFrequency (%)
20230804 1
< 0.1%
20230808 1
< 0.1%
20230810 1
< 0.1%
20230811 2
< 0.1%
20230815 1
< 0.1%
20230816 1
< 0.1%
20230817 1
< 0.1%
20230818 1
< 0.1%
20230821 1
< 0.1%
20230822 1
< 0.1%
ValueCountFrequency (%)
20240508 1
 
< 0.1%
20240503 3
< 0.1%
20240502 4
< 0.1%
20240501 4
< 0.1%
20240430 4
< 0.1%
20240429 5
0.1%
20240427 1
 
< 0.1%
20240426 2
 
< 0.1%
20240425 1
 
< 0.1%
20240424 3
< 0.1%

건축년도
Real number (ℝ)

Distinct74
Distinct (%)0.7%
Missing40
Missing (%)0.4%
Infinite0
Infinite (%)0.0%
Mean1991.6652
Minimum0
Maximum2024
Zeros68
Zeros (%)0.7%
Negative0
Negative (%)0.0%
Memory size166.0 KiB

Quantile statistics

Minimum0
5-th percentile1985
Q11997
median2005
Q32016
95-th percentile2023
Maximum2024
Range2024
Interquartile range (IQR)19

Descriptive statistics

Standard deviation165.60408
Coefficient of variation (CV)0.083148553
Kurtosis139.93358
Mean1991.6652
Median Absolute Deviation (MAD)10
Skewness-11.878317
Sum19836985
Variance27424.71
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
2023 544
 
5.4%
2003 417
 
4.2%
2004 390
 
3.9%
2002 390
 
3.9%
2017 377
 
3.8%
2016 353
 
3.5%
2019 349
 
3.5%
2014 290
 
2.9%
2022 288
 
2.9%
2015 286
 
2.9%
Other values (64) 6276
62.8%
ValueCountFrequency (%)
0 68
0.7%
1900 3
 
< 0.1%
1930 1
 
< 0.1%
1935 1
 
< 0.1%
1940 1
 
< 0.1%
1954 1
 
< 0.1%
1955 1
 
< 0.1%
1956 1
 
< 0.1%
1958 3
 
< 0.1%
1959 3
 
< 0.1%
ValueCountFrequency (%)
2024 55
 
0.5%
2023 544
5.4%
2022 288
2.9%
2021 236
2.4%
2020 269
2.7%
2019 349
3.5%
2018 261
2.6%
2017 377
3.8%
2016 353
3.5%
2015 286
2.9%
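
A construction year of 0 is presumably a placeholder for an unknown year, and the 68 zeros pull the mean (1991.67) well below the median (2005). The sketch below recomputes the mean from the reported aggregates with the zeros left out. It assumes the reported mean is taken over non-missing values and that every zero is a placeholder.

```python
n_rows = 10_000
n_missing = 40
n_zeros = 68
total = 19_836_985  # Sum reported for 건축년도

n_valid = n_rows - n_missing
mean_reported = total / n_valid
# Zeros add nothing to the sum, so dropping them only shrinks the denominator.
mean_without_zeros = total / (n_valid - n_zeros)

print(f"mean including zeros: {mean_reported:.4f}")
print(f"mean excluding zeros: {mean_without_zeros:.4f}")
```

Excluding the zeros moves the mean close to the reported median, which suggests the zeros are better treated as missing values before any further analysis.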

건물용도
Categorical

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
아파트
5012 
연립다세대
3333 
오피스텔
1268 
단독다가구
 
387

Length

Max length5
Median length3
Mean length3.8708
Min length3

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row아파트
2nd row연립다세대
3rd row아파트
4th row연립다세대
5th row오피스텔

Common Values

ValueCountFrequency (%)
아파트 5012
50.1%
연립다세대 3333
33.3%
오피스텔 1268
 
12.7%
단독다가구 387
 
3.9%

Length

Histogram of lengths of the category


신고구분
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
중개거래
8367 
직거래
1633 

Length

Max length4
Median length4
Mean length3.8367
Min length3

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row중개거래
2nd row직거래
3rd row중개거래
4th row중개거래
5th row중개거래

Common Values

ValueCountFrequency (%)
중개거래 8367
83.7%
직거래 1633
 
16.3%

Length

Histogram of lengths of the category

신고한 개업공인중개사 시군구명
Text

Distinct309
Distinct (%)3.7%
Missing1642
Missing (%)16.4%
Memory size156.2 KiB

Length

Max length22
Median length6
Mean length6.6555396
Min length3

Characters and Unicode

Total characters55627
Distinct characters83
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique187 ?
Unique (%)2.2%

Sample

1st row서울 노원구
2nd row경기 부천시, 서울 구로구
3rd row대전 유성구
4th row서울 서초구
5th row서울 강북구
ValueCountFrequency (%)
서울 8732
49.0%
강남구 550
 
3.1%
강서구 546
 
3.1%
송파구 545
 
3.1%
마포구 507
 
2.8%
노원구 459
 
2.6%
은평구 452
 
2.5%
강동구 436
 
2.4%
서초구 414
 
2.3%
성북구 392
 
2.2%
Other values (87) 4789
26.9%

Most occurring characters

ValueCountFrequency (%)
9994
18.0%
9464
17.0%
9195
16.5%
8733
15.7%
1813
 
3.3%
1483
 
2.7%
898
 
1.6%
742
 
1.3%
678
 
1.2%
624
 
1.1%
Other values (73) 12003
21.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 45616
82.0%
Space Separator 9464
 
17.0%
Other Punctuation 547
 
1.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
9994
21.9%
9195
20.2%
8733
19.1%
1813
 
4.0%
1483
 
3.3%
898
 
2.0%
742
 
1.6%
678
 
1.5%
624
 
1.4%
620
 
1.4%
Other values (71) 10836
23.8%
Space Separator
ValueCountFrequency (%)
9464
100.0%
Other Punctuation
ValueCountFrequency (%)
, 547
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 45616
82.0%
Common 10011
 
18.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
9994
21.9%
9195
20.2%
8733
19.1%
1813
 
4.0%
1483
 
3.3%
898
 
2.0%
742
 
1.6%
678
 
1.5%
624
 
1.4%
620
 
1.4%
Other values (71) 10836
23.8%
Common
ValueCountFrequency (%)
9464
94.5%
, 547
 
5.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 45616
82.0%
ASCII 10011
 
18.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
9994
21.9%
9195
20.2%
8733
19.1%
1813
 
4.0%
1483
 
3.3%
898
 
2.0%
742
 
1.6%
678
 
1.5%
624
 
1.4%
620
 
1.4%
Other values (71) 10836
23.8%
ASCII
ValueCountFrequency (%)
9464
94.5%
, 547
 
5.5%

Sample

접수연도자치구코드자치구명법정동코드법정동명지번구분지번구분명본번부번건물명계약일물건금액(만원)건물면적(㎡)토지면적(㎡)권리구분취소일건축년도건물용도신고구분신고한 개업공인중개사 시군구명
11077202411350노원구10600중계동1대지05840001삼창타워주상프라자202403144450059.390.018<NA><NA>1997아파트중개거래서울 노원구
10866202411500강서구10300화곡동1대지10560027경원아트빌B동202403151930028.814.01<NA><NA>2001연립다세대직거래<NA>
42123202311530구로구11200항동1대지237.00.0하버라인2단지202309196400059.860.02<NA><NA>2019아파트중개거래경기 부천시, 서울 구로구
28487202311110종로구14900원서동1대지01030001(103-1)202312081300013.776.92<NA><NA>1996연립다세대중개거래대전 유성구
23613202411650서초구10300우면동1대지07870000서초리슈빌S글로벌202401101600021.6830.3810<NA><NA>2015오피스텔중개거래서울 서초구
58172202311320도봉구10700창동1대지731.029.0혜성루첸리202307121830029.7312.475<NA><NA>2015연립다세대중개거래서울 강북구
22241202411320도봉구10500쌍문동1대지00590000한양2202401173400059.670.09<NA><NA>1988아파트중개거래서울 도봉구
34339202311560영등포구13300대림동1대지1118.06.0(1118-6)202311022020042.5416.04<NA><NA>2000연립다세대중개거래서울 영등포구
28354202311500강서구10300화곡동1대지03710042창대빌라(371-42)202312092000054.9130.282<NA><NA>2009연립다세대중개거래서울 강서구
12030202411410서대문구11100홍제동1대지00820000홍제한양202403106370060.060.07<NA><NA>1993아파트중개거래서울 강남구
접수연도자치구코드자치구명법정동코드법정동명지번구분지번구분명본번부번건물명계약일물건금액(만원)건물면적(㎡)토지면적(㎡)권리구분취소일건축년도건물용도신고구분신고한 개업공인중개사 시군구명
48941202311380은평구10700응암동1대지586.021.0월드샤인빌202308222900067.6538.123<NA><NA>2003연립다세대중개거래서울 은평구
36914202311470양천구10200목동1대지743.04.0한솔파크빌202310192550066.441.083<NA><NA>2007연립다세대중개거래서울 양천구
28997202411680강남구10500삼성동1대지00160002삼성동힐스테이트 1단지20231205322000114.460.014<NA><NA>2008아파트중개거래서울 강남구
19077202411650서초구10700반포동1대지00010001래미안원베일리2024020329100059.960.033<NA><NA>2023아파트중개거래서울 서초구
14179202411620관악구10200신림동1대지16630005성일주택202402292490053.0226.04<NA><NA>2001연립다세대중개거래서울 마포구
50194202311650서초구10800서초동1대지1303.06.0현대썬앤빌 강남 더 인피닛202308172300017.9531.338<NA><NA>2018오피스텔중개거래서울 서초구
57265202311620관악구10200신림동1대지415.048.0(415-48)202307172000027.9222.02<NA><NA>1990연립다세대직거래<NA>
54662202311500강서구10200등촌동1대지519.017.0호성탑스빌가동202307274150061.126.073<NA><NA>2002연립다세대중개거래서울 강서구
4439202411680강남구10100역삼동1대지09020000강남센트럴아이파크2024040829300084.990.07<NA><NA>2022아파트중개거래서울 광진구
54494202311290성북구13800장위동1대지68.0925.0베아트리스202307283900028.9218.72<NA><NA>2019연립다세대중개거래서울 동대문구, 서울 성북구

Duplicate rows

Most frequently occurring

접수연도자치구코드자치구명법정동코드법정동명지번구분지번구분명건물명계약일물건금액(만원)건물면적(㎡)토지면적(㎡)권리구분취소일건축년도건물용도신고구분신고한 개업공인중개사 시군구명# duplicates
17202311500강서구10500마곡동1대지힐스테이트에코 마곡나루역 라마다앙코르 서울마곡202312142141719.5833.0114<NA><NA>2017오피스텔직거래<NA>4
0202311110종로구13500관철동1대지프라움스테이202312073661725.5334.623<NA><NA>2023오피스텔직거래<NA>3
3202311110종로구13500관철동1대지프라움스테이202312073713225.5334.628<NA><NA>2023오피스텔직거래<NA>3
6202311200성동구12200용답동1대지(228-1)20230920559923.4331.645<NA><NA>1990오피스텔직거래<NA>3
8202311440마포구12300망원동1대지보라맨션2023091210000090.378.581<NA><NA>1981연립다세대중개거래서울 강남구, 서울 서초구3
10202311500강서구10500마곡동1대지힐스테이트에코 마곡나루역 라마다앙코르 서울마곡202312142030919.5833.014<NA><NA>2017오피스텔직거래<NA>3
11202311500강서구10500마곡동1대지힐스테이트에코 마곡나루역 라마다앙코르 서울마곡202312142062519.5833.016<NA><NA>2017오피스텔직거래<NA>3
14202311500강서구10500마곡동1대지힐스테이트에코 마곡나루역 라마다앙코르 서울마곡202312142099519.5833.019<NA><NA>2017오피스텔직거래<NA>3
16202311500강서구10500마곡동1대지힐스테이트에코 마곡나루역 라마다앙코르 서울마곡202312142120619.5833.0113<NA><NA>2017오피스텔직거래<NA>3
19202311500강서구10500마곡동1대지힐스테이트에코 마곡나루역 라마다앙코르 서울마곡202312142162820.2134.058<NA><NA>2017오피스텔직거래<NA>3
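
Duplicate rows like these can be located by counting identical records. The sketch below uses a handful of made-up rows (not taken from the dataset) and `collections.Counter`. With pandas, `DataFrame.duplicated` and `DataFrame.drop_duplicates` serve the same purpose.

```python
from collections import Counter

# Hypothetical rows: (building name, contract date, amount in 만원, floor)
rows = [
    ("A", 20231214, 21417, 14),
    ("A", 20231214, 21417, 14),
    ("B", 20231207, 36617, 3),
    ("A", 20231214, 21417, 14),
    ("C", 20230912, 100000, 1),
]

# Count identical rows and keep those that occur more than once.
counts = Counter(rows)
duplicates = {row: n for row, n in counts.items() if n > 1}

# dict.fromkeys keeps the first occurrence of each row and preserves order.
deduplicated = list(dict.fromkeys(rows))

print(duplicates)
print(len(deduplicated), "rows after de-duplication")
```

Whether identical records are true duplicates or separate transactions for different units in the same building cannot be decided from the profile alone, so it is worth inspecting them before dropping any rows.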