Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 3164
Missing cells (%): 6.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
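
The derived figures in this block follow from the raw counts. The sketch below redoes that arithmetic, assuming that KiB means 1024 bytes and that the three per-column missing counts listed under Alerts account for every missing cell. The variable names are illustrative and not part of the report.

```python
# Figures copied from the Dataset statistics and Alerts blocks of this report.
n_variables = 5
n_observations = 10_000
total_size_kib = 488.3  # assumption: 1 KiB = 1024 bytes

missing_by_column = {
    "동명_공시가격": 896,
    "동명_건축물대장": 918,
    "동명_도로명주소": 1350,
}

# Total missing cells is the sum of the per-column missing counts.
missing_cells = sum(missing_by_column.values())

# Missing cells (%) is relative to all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size is the total memory divided by the number of rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_cells} ({missing_pct:.1f}%)")
print(f"Average record size: {avg_record_bytes:.1f} B")
```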

Variable types

Numeric: 2
Text: 3

Dataset

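Description: Building-level (동) records from the apartment-complex identification data provided by the Korea Real Estate Board (한국부동산원, formerly the Korea Appraisal Board, 한국감정원). Fields: 단지고유번호 (complex unique ID), 동명 (building name), 지상층수 (number of above-ground floors).
URL: https://www.data.go.kr/data/15106866/fileData.do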

Alerts

동명_공시가격 has 896 (9.0%) missing values [Missing]
동명_건축물대장 has 918 (9.2%) missing values [Missing]
동명_도로명주소 has 1350 (13.5%) missing values [Missing]

Reproduction

Analysis started: 2023-12-12 05:13:19.590692
Analysis finished: 2023-12-12 05:13:21.301848
Duration: 1.71 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
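
The reported duration can be checked from the two timestamps. This is a minimal sketch using only the standard library; it assumes both timestamps are in the same time zone.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2023-12-12 05:13:19.590692")
finished = datetime.fromisoformat("2023-12-12 05:13:21.301848")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```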

Variables

단지고유번호
Real number (ℝ)

Distinct: 7227
Distinct (%): 72.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.5177329 × 10^13
Minimum: 1.11101 × 10^13
Maximum: 5.013012 × 10^13
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1.11101 × 10^13
5-th percentile: 1.138012 × 10^13
Q1: 2.81771 × 10^13
Median: 4.119012 × 10^13
Q3: 4.31301 × 10^13
95-th percentile: 4.82501 × 10^13
Maximum: 5.013012 × 10^13
Range: 3.902002 × 10^13
Interquartile range (IQR): 1.4953 × 10^13

Descriptive statistics

Standard deviation: 1.1635043 × 10^13
Coefficient of variation (CV): 0.33075401
Kurtosis: -0.34990561
Mean: 3.5177329 × 10^13
Median Absolute Deviation (MAD): 6.0299902 × 10^12
Skewness: -0.88686671
Sum: 3.5177329 × 10^17
Variance: 1.3537422 × 10^26
Monotonicity: Not monotonic
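
Several of these statistics are derived from others, which gives a quick consistency check on the extracted numbers, including the powers of ten. The sketch below assumes the usual definitions (CV = standard deviation / mean, variance = standard deviation squared, range = maximum - minimum, IQR = Q3 - Q1, sum = mean × n). The inputs are the rounded values printed in the report, so the results agree only up to rounding.

```python
# Rounded values copied from the statistics for 단지고유번호.
mean = 3.5177329e13
std = 1.1635043e13
minimum, maximum = 1.11101e13, 5.013012e13
q1, q3 = 2.81771e13, 4.31301e13
n = 10_000  # number of observations, none missing for this variable

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range
total = mean * n                 # sum of all values

print(f"CV       = {cv:.6f}")
print(f"Variance = {variance:.7e}")
print(f"Range    = {value_range:.6e}")
print(f"IQR      = {iqr:.4e}")
print(f"Sum      = {total:.7e}")
```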
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
11710120384641 | 15 | 0.1%
11650120113618 | 12 | 0.1%
36110120412626 | 8 | 0.1%
11740120394388 | 7 | 0.1%
41463100008311 | 7 | 0.1%
11710120099495 | 7 | 0.1%
41463100008313 | 7 | 0.1%
41220120412751 | 6 | 0.1%
47290100015364 | 6 | 0.1%
41390100009380 | 6 | 0.1%
Other values (7217) | 9919 | 99.2%

Minimum 10 values
Value | Count | Frequency (%)
11110100000004 | 1 | < 0.1%
11110100000034 | 1 | < 0.1%
11110100000039 | 1 | < 0.1%
11110100000077 | 1 | < 0.1%
11110100000081 | 1 | < 0.1%
11110100000090 | 1 | < 0.1%
11110100248227 | 1 | < 0.1%
11110100448199 | 1 | < 0.1%
11110100600280 | 1 | < 0.1%
11110120090775 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
50130120391368 | 1 | < 0.1%
50130120383793 | 1 | < 0.1%
50130120356582 | 1 | < 0.1%
50130120348716 | 3 | < 0.1%
50130120346520 | 1 | < 0.1%
50130120332195 | 1 | < 0.1%
50130120328243 | 1 | < 0.1%
50130120299502 | 1 | < 0.1%
50130120284622 | 2 | < 0.1%
50130120162775 | 1 | < 0.1%

동명_공시가격
Text

MISSING 

Distinct: 1135
Distinct (%): 12.5%
Missing: 896
Missing (%): 9.0%
Memory size: 156.2 KiB

Length

Max length: 17
Median length: 3
Mean length: 3.1260984
Min length: 1

Characters and Unicode

Total characters: 28460
Distinct characters: 247
Distinct categories: 12
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
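
As an illustration of those character properties, the sketch below counts Unicode general categories for the first five sample values listed for this variable. It uses Python's standard `unicodedata` module, which exposes the general category (`Nd` is the report's "Decimal Number", `Lo` its "Other Letter") but not script or block, so only the category breakdown is reproduced. This is not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# First five sample values of 동명_공시가격 as listed in this report.
samples = ["109", "2402동", "206동", "106", "4"]

# Count the general category of every character across all sample values.
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)

for category, count in category_counts.most_common():
    print(category, count)

# The building suffix is a Hangul syllable, classified as an "Other Letter".
print(unicodedata.category("동"), unicodedata.name("동"))
```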

Unique

Unique: 678
Unique (%): 7.4%

Sample

1st row: 109
2nd row: 2402동
3rd row: 206동
4th row: 106
5th row: 4
Value | Count | Frequency (%)
102 | 583 | 6.4%
101 | 582 | 6.3%
103 | 439 | 4.8%
105 | 327 | 3.6%
104 | 321 | 3.5%
106 | 274 | 3.0%
107 | 226 | 2.5%
108 | 180 | 2.0%
1 | 175 | 1.9%
109 | 144 | 1.6%
Other values (1141) | 5920 | 64.6%

Most occurring characters

Value | Count | Frequency (%)
1 | 7957 | 28.0%
0 | 6900 | 24.2%
2 | 2965 | 10.4%
3 | 1963 | 6.9%
(not preserved) | 1517 | 5.3%
4 | 1309 | 4.6%
5 | 1211 | 4.3%
6 | 1022 | 3.6%
7 | 828 | 2.9%
8 | 703 | 2.5%
Other values (237) | 2085 | 7.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 25420 | 89.3%
Other Letter | 2792 | 9.8%
Uppercase Letter | 125 | 0.4%
Space Separator | 71 | 0.2%
Close Punctuation | 18 | 0.1%
Open Punctuation | 18 | 0.1%
Dash Punctuation | 9 | < 0.1%
Other Punctuation | 3 | < 0.1%
Lowercase Letter | 1 | < 0.1%
Math Symbol | 1 | < 0.1%
Other values (2) | 2 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 1517 | 54.3%
(not preserved) | 85 | 3.0%
(not preserved) | 77 | 2.8%
(not preserved) | 65 | 2.3%
(not preserved) | 64 | 2.3%
(not preserved) | 59 | 2.1%
(not preserved) | 58 | 2.1%
(not preserved) | 36 | 1.3%
(not preserved) | 35 | 1.3%
(not preserved) | 31 | 1.1%
Other values (204) | 765 | 27.4%

Uppercase Letter
Value | Count | Frequency (%)
A | 52 | 41.6%
B | 45 | 36.0%
C | 12 | 9.6%
S | 3 | 2.4%
H | 2 | 1.6%
D | 2 | 1.6%
P | 2 | 1.6%
L | 1 | 0.8%
T | 1 | 0.8%
M | 1 | 0.8%
Other values (4) | 4 | 3.2%

Decimal Number
Value | Count | Frequency (%)
1 | 7957 | 31.3%
0 | 6900 | 27.1%
2 | 2965 | 11.7%
3 | 1963 | 7.7%
4 | 1309 | 5.1%
5 | 1211 | 4.8%
6 | 1022 | 4.0%
7 | 828 | 3.3%
8 | 703 | 2.8%
9 | 562 | 2.2%

Space Separator
Value | Count | Frequency (%)
(space) | 71 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 18 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 18 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 9 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 3 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
e | 1 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 1 | 100.0%

Letter Number
Value | Count | Frequency (%)
(not preserved) | 1 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 25541 | 89.7%
Hangul | 2791 | 9.8%
Latin | 127 | 0.4%
Han | 1 | < 0.1%

Most frequent character per script

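Hangul
Value | Count | Frequency (%)
(not preserved) | 1517 | 54.4%
(not preserved) | 85 | 3.0%
(not preserved) | 77 | 2.8%
(not preserved) | 65 | 2.3%
(not preserved) | 64 | 2.3%
(not preserved) | 59 | 2.1%
(not preserved) | 58 | 2.1%
(not preserved) | 36 | 1.3%
(not preserved) | 35 | 1.3%
(not preserved) | 31 | 1.1%
Other values (203) | 764 | 27.4%

Common
Value | Count | Frequency (%)
1 | 7957 | 31.2%
0 | 6900 | 27.0%
2 | 2965 | 11.6%
3 | 1963 | 7.7%
4 | 1309 | 5.1%
5 | 1211 | 4.7%
6 | 1022 | 4.0%
7 | 828 | 3.2%
8 | 703 | 2.8%
9 | 562 | 2.2%
Other values (7) | 121 | 0.5%

Latin
Value | Count | Frequency (%)
A | 52 | 40.9%
B | 45 | 35.4%
C | 12 | 9.4%
S | 3 | 2.4%
H | 2 | 1.6%
D | 2 | 1.6%
P | 2 | 1.6%
L | 1 | 0.8%
T | 1 | 0.8%
M | 1 | 0.8%
Other values (6) | 6 | 4.7%

Han
Value | Count | Frequency (%)
(not preserved) | 1 | 100.0%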

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 25667 | 90.2%
Hangul | 2791 | 9.8%
CJK | 1 | < 0.1%
Number Forms | 1 | < 0.1%

Most frequent character per block

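ASCII
Value | Count | Frequency (%)
1 | 7957 | 31.0%
0 | 6900 | 26.9%
2 | 2965 | 11.6%
3 | 1963 | 7.6%
4 | 1309 | 5.1%
5 | 1211 | 4.7%
6 | 1022 | 4.0%
7 | 828 | 3.2%
8 | 703 | 2.7%
9 | 562 | 2.2%
Other values (22) | 247 | 1.0%

Hangul
Value | Count | Frequency (%)
(not preserved) | 1517 | 54.4%
(not preserved) | 85 | 3.0%
(not preserved) | 77 | 2.8%
(not preserved) | 65 | 2.3%
(not preserved) | 64 | 2.3%
(not preserved) | 59 | 2.1%
(not preserved) | 58 | 2.1%
(not preserved) | 36 | 1.3%
(not preserved) | 35 | 1.3%
(not preserved) | 31 | 1.1%
Other values (203) | 764 | 27.4%

CJK
Value | Count | Frequency (%)
(not preserved) | 1 | 100.0%

Number Forms
Value | Count | Frequency (%)
(not preserved) | 1 | 100.0%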

동명_건축물대장
Text

MISSING

Distinct: 1131
Distinct (%): 12.5%
Missing: 918
Missing (%): 9.2%
Memory size: 156.2 KiB

Length

Max length: 18
Median length: 4
Mean length: 3.9883286
Min length: 1

Characters and Unicode

Total characters: 36222
Distinct characters: 292
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 728
Unique (%): 8.0%

Sample

1st row: 109동
2nd row: 2402동
3rd row: 206동
4th row: 106동
5th row: 4동
Value | Count | Frequency (%)
101동 | 669 | 7.3%
102동 | 638 | 6.9%
103동 | 525 | 5.7%
104동 | 382 | 4.1%
105동 | 368 | 4.0%
106동 | 335 | 3.6%
107동 | 264 | 2.9%
108동 | 221 | 2.4%
109동 | 163 | 1.8%
201동 | 147 | 1.6%
Other values (1159) | 5503 | 59.7%

Most occurring characters

Value | Count | Frequency (%)
(not preserved) | 8640 | 23.9%
1 | 7846 | 21.7%
0 | 6844 | 18.9%
2 | 2938 | 8.1%
3 | 1957 | 5.4%
4 | 1304 | 3.6%
5 | 1207 | 3.3%
6 | 1017 | 2.8%
7 | 827 | 2.3%
8 | 702 | 1.9%
Other values (282) | 2940 | 8.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 25203 | 69.6%
Other Letter | 10736 | 29.6%
Space Separator | 133 | 0.4%
Uppercase Letter | 103 | 0.3%
Open Punctuation | 14 | < 0.1%
Close Punctuation | 14 | < 0.1%
Dash Punctuation | 10 | < 0.1%
Other Punctuation | 4 | < 0.1%
Lowercase Letter | 3 | < 0.1%
Letter Number | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 8640 | 80.5%
(not preserved) | 127 | 1.2%
(not preserved) | 125 | 1.2%
(not preserved) | 124 | 1.2%
(not preserved) | 119 | 1.1%
(not preserved) | 81 | 0.8%
(not preserved) | 67 | 0.6%
(not preserved) | 63 | 0.6%
(not preserved) | 47 | 0.4%
(not preserved) | 44 | 0.4%
Other values (250) | 1299 | 12.1%

Uppercase Letter
Value | Count | Frequency (%)
A | 42 | 40.8%
B | 32 | 31.1%
C | 11 | 10.7%
S | 3 | 2.9%
P | 3 | 2.9%
K | 2 | 1.9%
D | 2 | 1.9%
T | 2 | 1.9%
M | 1 | 1.0%
E | 1 | 1.0%
Other values (4) | 4 | 3.9%

Decimal Number
Value | Count | Frequency (%)
1 | 7846 | 31.1%
0 | 6844 | 27.2%
2 | 2938 | 11.7%
3 | 1957 | 7.8%
4 | 1304 | 5.2%
5 | 1207 | 4.8%
6 | 1017 | 4.0%
7 | 827 | 3.3%
8 | 702 | 2.8%
9 | 561 | 2.2%

Space Separator
Value | Count | Frequency (%)
(space) | 133 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 14 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 14 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 10 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 4 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
e | 3 | 100.0%

Letter Number
Value | Count | Frequency (%)
(not preserved) | 1 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 25379 | 70.1%
Hangul | 10736 | 29.6%
Latin | 107 | 0.3%

Most frequent character per script

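Hangul
Value | Count | Frequency (%)
(not preserved) | 8640 | 80.5%
(not preserved) | 127 | 1.2%
(not preserved) | 125 | 1.2%
(not preserved) | 124 | 1.2%
(not preserved) | 119 | 1.1%
(not preserved) | 81 | 0.8%
(not preserved) | 67 | 0.6%
(not preserved) | 63 | 0.6%
(not preserved) | 47 | 0.4%
(not preserved) | 44 | 0.4%
Other values (250) | 1299 | 12.1%

Common
Value | Count | Frequency (%)
1 | 7846 | 30.9%
0 | 6844 | 27.0%
2 | 2938 | 11.6%
3 | 1957 | 7.7%
4 | 1304 | 5.1%
5 | 1207 | 4.8%
6 | 1017 | 4.0%
7 | 827 | 3.3%
8 | 702 | 2.8%
9 | 561 | 2.2%
Other values (6) | 176 | 0.7%

Latin
Value | Count | Frequency (%)
A | 42 | 39.3%
B | 32 | 29.9%
C | 11 | 10.3%
S | 3 | 2.8%
P | 3 | 2.8%
e | 3 | 2.8%
K | 2 | 1.9%
D | 2 | 1.9%
T | 2 | 1.9%
M | 1 | 0.9%
Other values (6) | 6 | 5.6%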

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 25485 | 70.4%
Hangul | 10736 | 29.6%
Number Forms | 1 | < 0.1%

Most frequent character per block

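Hangul
Value | Count | Frequency (%)
(not preserved) | 8640 | 80.5%
(not preserved) | 127 | 1.2%
(not preserved) | 125 | 1.2%
(not preserved) | 124 | 1.2%
(not preserved) | 119 | 1.1%
(not preserved) | 81 | 0.8%
(not preserved) | 67 | 0.6%
(not preserved) | 63 | 0.6%
(not preserved) | 47 | 0.4%
(not preserved) | 44 | 0.4%
Other values (250) | 1299 | 12.1%

ASCII
Value | Count | Frequency (%)
1 | 7846 | 30.8%
0 | 6844 | 26.9%
2 | 2938 | 11.5%
3 | 1957 | 7.7%
4 | 1304 | 5.1%
5 | 1207 | 4.7%
6 | 1017 | 4.0%
7 | 827 | 3.2%
8 | 702 | 2.8%
9 | 561 | 2.2%
Other values (21) | 282 | 1.1%

Number Forms
Value | Count | Frequency (%)
(not preserved) | 1 | 100.0%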

동명_도로명주소
Text

MISSING

Distinct: 939
Distinct (%): 10.9%
Missing: 1350
Missing (%): 13.5%
Memory size: 156.2 KiB

Length

Max length: 16
Median length: 4
Mean length: 3.9338728
Min length: 1

Characters and Unicode

Total characters: 34028
Distinct characters: 230
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 552
Unique (%): 6.4%

Sample

1st row: 109동
2nd row: 2402
3rd row: 206동
4th row: 106동
5th row: 4동
Value | Count | Frequency (%)
101동 | 658 | 7.6%
102동 | 635 | 7.3%
103동 | 512 | 5.9%
105동 | 366 | 4.2%
104동 | 364 | 4.2%
106동 | 326 | 3.7%
107동 | 261 | 3.0%
108동 | 211 | 2.4%
109동 | 160 | 1.8%
201동 | 142 | 1.6%
Other values (942) | 5069 | 58.2%

Most occurring characters

Value | Count | Frequency (%)
(not preserved) | 8342 | 24.5%
1 | 7557 | 22.2%
0 | 6612 | 19.4%
2 | 2862 | 8.4%
3 | 1899 | 5.6%
4 | 1238 | 3.6%
5 | 1165 | 3.4%
6 | 983 | 2.9%
7 | 799 | 2.3%
8 | 674 | 2.0%
Other values (220) | 1897 | 5.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 24334 | 71.5%
Other Letter | 9427 | 27.7%
Uppercase Letter | 104 | 0.3%
Space Separator | 54 | 0.2%
Close Punctuation | 47 | 0.1%
Open Punctuation | 47 | 0.1%
Dash Punctuation | 6 | < 0.1%
Other Punctuation | 5 | < 0.1%
Lowercase Letter | 3 | < 0.1%
Letter Number | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 8342 | 88.5%
(not preserved) | 80 | 0.8%
(not preserved) | 74 | 0.8%
(not preserved) | 53 | 0.6%
(not preserved) | 51 | 0.5%
(not preserved) | 43 | 0.5%
(not preserved) | 41 | 0.4%
(not preserved) | 31 | 0.3%
(not preserved) | 27 | 0.3%
(not preserved) | 23 | 0.2%
Other values (195) | 662 | 7.0%

Decimal Number
Value | Count | Frequency (%)
1 | 7557 | 31.1%
0 | 6612 | 27.2%
2 | 2862 | 11.8%
3 | 1899 | 7.8%
4 | 1238 | 5.1%
5 | 1165 | 4.8%
6 | 983 | 4.0%
7 | 799 | 3.3%
8 | 674 | 2.8%
9 | 545 | 2.2%

Uppercase Letter
Value | Count | Frequency (%)
A | 50 | 48.1%
B | 36 | 34.6%
C | 13 | 12.5%
D | 2 | 1.9%
V | 1 | 1.0%
H | 1 | 1.0%
E | 1 | 1.0%

Other Punctuation
Value | Count | Frequency (%)
, | 3 | 60.0%
. | 2 | 40.0%

Space Separator
Value | Count | Frequency (%)
(space) | 54 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 47 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 47 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 6 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
e | 3 | 100.0%

Letter Number
Value | Count | Frequency (%)
(not preserved) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 24493 | 72.0%
Hangul | 9427 | 27.7%
Latin | 108 | 0.3%

Most frequent character per script

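Hangul
Value | Count | Frequency (%)
(not preserved) | 8342 | 88.5%
(not preserved) | 80 | 0.8%
(not preserved) | 74 | 0.8%
(not preserved) | 53 | 0.6%
(not preserved) | 51 | 0.5%
(not preserved) | 43 | 0.5%
(not preserved) | 41 | 0.4%
(not preserved) | 31 | 0.3%
(not preserved) | 27 | 0.3%
(not preserved) | 23 | 0.2%
Other values (195) | 662 | 7.0%

Common
Value | Count | Frequency (%)
1 | 7557 | 30.9%
0 | 6612 | 27.0%
2 | 2862 | 11.7%
3 | 1899 | 7.8%
4 | 1238 | 5.1%
5 | 1165 | 4.8%
6 | 983 | 4.0%
7 | 799 | 3.3%
8 | 674 | 2.8%
9 | 545 | 2.2%
Other values (6) | 159 | 0.6%

Latin
Value | Count | Frequency (%)
A | 50 | 46.3%
B | 36 | 33.3%
C | 13 | 12.0%
e | 3 | 2.8%
D | 2 | 1.9%
V | 1 | 0.9%
(not preserved) | 1 | 0.9%
H | 1 | 0.9%
E | 1 | 0.9%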

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 24600 | 72.3%
Hangul | 9427 | 27.7%
Number Forms | 1 | < 0.1%

Most frequent character per block

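Hangul
Value | Count | Frequency (%)
(not preserved) | 8342 | 88.5%
(not preserved) | 80 | 0.8%
(not preserved) | 74 | 0.8%
(not preserved) | 53 | 0.6%
(not preserved) | 51 | 0.5%
(not preserved) | 43 | 0.5%
(not preserved) | 41 | 0.4%
(not preserved) | 31 | 0.3%
(not preserved) | 27 | 0.3%
(not preserved) | 23 | 0.2%
Other values (195) | 662 | 7.0%

ASCII
Value | Count | Frequency (%)
1 | 7557 | 30.7%
0 | 6612 | 26.9%
2 | 2862 | 11.6%
3 | 1899 | 7.7%
4 | 1238 | 5.0%
5 | 1165 | 4.7%
6 | 983 | 4.0%
7 | 799 | 3.2%
8 | 674 | 2.7%
9 | 545 | 2.2%
Other values (14) | 266 | 1.1%

Number Forms
Value | Count | Frequency (%)
(not preserved) | 1 | 100.0%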

지상층수
Real number (ℝ)

Distinct: 53
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.7258
Minimum: 1
Maximum: 59
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5
Q1: 11
Median: 15
Q3: 20
95-th percentile: 28
Maximum: 59
Range: 58
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 7.1671299
Coefficient of variation (CV): 0.45575614
Kurtosis: 1.3250506
Mean: 15.7258
Median Absolute Deviation (MAD): 5
Skewness: 0.62752395
Sum: 157258
Variance: 51.367751
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
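
To make "fixed size bins" concrete, the sketch below splits the 지상층수 range (1 to 59) into 50 equal-width bins. It assumes the bins span exactly minimum to maximum with the last bin closed on the right; the report does not state its binning rule, so this shows the general technique rather than the profiler's code. The helper name `bin_index` is hypothetical.

```python
from collections import Counter

minimum, maximum, bins = 1, 59, 50

# Every bin has the same width: (max - min) / bins.
width = (maximum - minimum) / bins
edges = [minimum + i * width for i in range(bins + 1)]


def bin_index(value: int) -> int:
    """Return the 0-based bin for an integer floor count.

    Integer arithmetic avoids floating-point trouble at bin edges;
    the maximum value is clamped into the last bin.
    """
    index = (value - minimum) * bins // (maximum - minimum)
    return min(index, bins - 1)


# How many distinct integer floor counts land in each bin.
integers_per_bin = Counter(bin_index(v) for v in range(minimum, maximum + 1))
print(f"bin width = {width}")
print(f"bins covering two integers: "
      f"{sum(1 for c in integers_per_bin.values() if c == 2)}")
```

Because the width (1.16) is not a whole number, some bins cover two integer floor counts and the rest cover one. That alone can produce uneven bar heights in such a histogram.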
Common values
Value | Count | Frequency (%)
15 | 2090 | 20.9%
5 | 1109 | 11.1%
20 | 911 | 9.1%
18 | 513 | 5.1%
25 | 411 | 4.1%
10 | 389 | 3.9%
14 | 379 | 3.8%
12 | 367 | 3.7%
13 | 333 | 3.3%
19 | 326 | 3.3%
Other values (43) | 3172 | 31.7%

Minimum 10 values
Value | Count | Frequency (%)
1 | 5 | 0.1%
2 | 6 | 0.1%
3 | 17 | 0.2%
4 | 28 | 0.3%
5 | 1109 | 11.1%
6 | 313 | 3.1%
7 | 196 | 2.0%
8 | 153 | 1.5%
9 | 186 | 1.9%
10 | 389 | 3.9%

Maximum 10 values
Value | Count | Frequency (%)
59 | 1 | < 0.1%
57 | 1 | < 0.1%
55 | 1 | < 0.1%
53 | 2 | < 0.1%
49 | 15 | 0.1%
48 | 5 | 0.1%
47 | 2 | < 0.1%
46 | 3 | < 0.1%
45 | 1 | < 0.1%
44 | 1 | < 0.1%

Interactions

[Figures: four pairwise interaction plots; images not preserved]

Correlations

Variable | 단지고유번호 | 지상층수
단지고유번호 | 1.000 | 0.197
지상층수 | 0.197 | 1.000

Variable | 단지고유번호 | 지상층수
단지고유번호 | 1.000 | -0.074
지상층수 | -0.074 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
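
Nullity correlation is commonly computed as the Pearson correlation between per-column "is missing" indicators. The sketch below applies that definition to a small made-up table: the rows and the column names `a` and `b` are illustrative, not this dataset, and the helper `pearson` is hypothetical. It may differ in detail from how the report's heatmap was produced.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up rows; None marks a missing value.
rows = [
    {"a": "109", "b": "109동"},
    {"a": None, "b": None},
    {"a": "2", "b": None},
    {"a": "101", "b": "101동"},
]

# 1 where the value is missing, 0 where it is present.
null_a = [int(row["a"] is None) for row in rows]
null_b = [int(row["b"] is None) for row in rows]

nullity_corr = pearson(null_a, null_b)
print(f"nullity correlation(a, b) = {nullity_corr:.3f}")
```

A value near 1 means the two columns tend to be missing in the same rows. A value near -1 means one tends to be present where the other is missing.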

Sample

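First rows
Index | 단지고유번호 | 동명_공시가격 | 동명_건축물대장 | 동명_도로명주소 | 지상층수
14797 | 48310100051960 | 109 | 109동 | 109동 | 15
37586 | 28260120423636 | 2402동 | 2402동 | 2402 | 29
94019 | 41590120356242 | 206동 | 206동 | 206동 | 19
79870 | 30200100219654 | 106 | 106동 | 106동 | 10
70234 | 11470100003223 | 4 | 4동 | 4동 | 5
30177 | 42760100220718 | 2 | 2동 | <NA> | 13
23823 | 41480120411618 | 1205동 | 1205동 | 1205동 | 24
14453 | 11680120440294 | 909동 | 909동 | 909동 | 17
47164 | 48330120068191 | 104 | 104동 | 104동 | 15
22438 | 31200100017210 | <NA> | <NA> | <NA> | 5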

Last rows
Index | 단지고유번호 | 동명_공시가격 | 동명_건축물대장 | 동명_도로명주소 | 지상층수
49180 | 29170100013462 | 103 | 103동 | 103동 | 21
52083 | 11350120008131 | 101 | 101동 | 101동 | 7
34403 | 45180100012454 | 101 | 101동 | 101동 | 14
59345 | 36110120355145 | 503동 | 503동 | 503동 | 22
67919 | 41287100007954 | 1103 | 1103동 | 1103동 | 20
8407 | 46230120172235 | 109 | 109동 | 109동 | 17
30038 | 41220100007343 | 101 | 101동 | 101동 | 14
38335 | 46130100012932 | 109 | 109동 | 109동 | 18
77147 | 47190100014736 | 316 | 316동 | 316동 | 15
19394 | 26230100003985 | 214 | 214동 | 214동 | 19
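
The sample rows suggest that the three 동명 columns often hold the same building name with or without the trailing 동 suffix (for example 109 next to 109동). The sketch below shows one way to compare the sources after normalization. The rule of stripping a single trailing 동 is an assumption, not something the report documents, and it could be wrong for names that legitimately end in that syllable. The helpers `normalize_dong` and `sources_agree` are hypothetical.

```python
def normalize_dong(name):
    """Strip surrounding whitespace and one trailing '동'; keep None as None."""
    if name is None:
        return None
    name = name.strip()
    # Leave a bare "동" untouched so the name never becomes empty.
    if len(name) > 1 and name.endswith("동"):
        return name[:-1]
    return name


def sources_agree(row):
    """True if all non-missing names in the row match after normalization."""
    present = {normalize_dong(value) for value in row if value is not None}
    return len(present) <= 1


# (동명_공시가격, 동명_건축물대장, 동명_도로명주소), patterned on the sample rows.
sample = [
    ("109", "109동", "109동"),
    ("2402동", "2402동", "2402"),
    ("106", "106동", "106동"),
    ("2", "2동", None),
]

agreement = [sources_agree(row) for row in sample]
print(agreement)
```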