Overview

Dataset statistics

Number of variables: 10
Number of observations: 10000
Missing cells: 3167
Missing cells (%): 3.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 908.2 KiB
Average record size in memory: 93.0 B
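
The derived figures above follow from the raw counts. A minimal sketch in plain Python that recomputes them; the variable names are illustrative, and the two per-column missing counts are the ones listed under Alerts in this report:

```python
# Recompute the derived overview figures from the raw counts in the report.
n_variables = 10
n_observations = 10_000
missing_cells = 1579 + 1588  # the two columns flagged "Missing" under Alerts

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

total_size_kib = 908.2
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_cells} ({missing_pct:.1f}%)")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both results round to the values shown in the overview (3.2% and 93.0 B).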

Variable types

Numeric: 4
Text: 5
Categorical: 1

Dataset

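Description: Basic-information data from the apartment-complex identification information provided by the Korea Real Estate Board (formerly the Korea Appraisal Board). Fields (basic information): complex unique ID (단지고유번호), parcel unique ID (필지고유번호), address (주소), complex name (단지명), complex type (단지종류), number of buildings (동수), number of households (세대수), use-approval date (사용승인일).
URL: https://www.data.go.kr/data/15106861/fileData.do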

Alerts

단지종류 has constant value "1" (Constant)
단지고유번호 is highly overall correlated with 필지고유번호 (High correlation)
필지고유번호 is highly overall correlated with 단지고유번호 (High correlation)
동수 is highly overall correlated with 세대수 (High correlation)
세대수 is highly overall correlated with 동수 (High correlation)
단지명_건축물대장 has 1579 (15.8%) missing values (Missing)
단지명_도로명주소 has 1588 (15.9%) missing values (Missing)
단지고유번호 has unique values (Unique)
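
Three of the alert kinds above (Constant, Unique, Missing) can be reasoned about from per-column counts alone. The sketch below is an illustration only: `ColumnSummary`, `simple_alerts`, and the 5% missing threshold are hypothetical names and values chosen for this example, not ydata-profiling's actual rules or API. The counts are copied from this report.

```python
from dataclasses import dataclass


@dataclass
class ColumnSummary:
    name: str
    n: int          # number of rows
    distinct: int   # number of distinct non-missing values
    missing: int    # number of missing values


def simple_alerts(col: ColumnSummary, missing_threshold: float = 0.05) -> list[str]:
    """Return illustrative alert labels; the threshold is an assumption."""
    alerts = []
    if col.distinct == 1:
        alerts.append("Constant")
    if col.distinct == col.n and col.missing == 0:
        alerts.append("Unique")
    if col.missing / col.n > missing_threshold:
        alerts.append("Missing")
    return alerts


columns = [
    ColumnSummary("단지종류", 10_000, 1, 0),
    ColumnSummary("단지고유번호", 10_000, 10_000, 0),
    ColumnSummary("단지명_건축물대장", 10_000, 7376, 1579),
    ColumnSummary("단지명_도로명주소", 10_000, 7227, 1588),
    ColumnSummary("주소", 10_000, 9982, 0),
]

for col in columns:
    print(col.name, simple_alerts(col))
```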

Reproduction

Analysis started: 2023-09-13 05:45:35.548384
Analysis finished: 2023-09-13 05:45:44.391710
Duration: 8.84 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
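
The reported duration is the difference between the two timestamps. A small check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-09-13 05:45:35.548384")
finished = datetime.fromisoformat("2023-09-13 05:45:44.391710")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```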

Variables

단지고유번호
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.2623982 × 10^13
Minimum: 1.11101 × 10^13
Maximum: 5.013012 × 10^13
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1.11101 × 10^13
5-th percentile: 1.13201 × 10^13
Q1: 2.626012 × 10^13
Median: 3.17101 × 10^13
Q3: 4.413312 × 10^13
95-th percentile: 4.827012 × 10^13
Maximum: 5.013012 × 10^13
Range: 3.902002 × 10^13
Interquartile range (IQR): 1.7873 × 10^13

Descriptive statistics

Standard deviation: 1.3347209 × 10^13
Coefficient of variation (CV): 0.40912261
Kurtosis: -1.1885311
Mean: 3.2623982 × 10^13
Median Absolute Deviation (MAD): 1.140102 × 10^13
Skewness: -0.45433073
Sum: 3.2623982 × 10^17
Variance: 1.7814799 × 10^26
Monotonicity: Not monotonic
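
Several of the statistics for 단지고유번호 are simple functions of the others. The sketch below recomputes the range, IQR, coefficient of variation, and variance from the rounded values printed in this report, so small rounding differences are expected.

```python
import math

# Rounded values copied from the report for 단지고유번호.
minimum, q1, q3, maximum = 1.11101e13, 2.626012e13, 4.413312e13, 5.013012e13
mean, std = 3.2623982e13, 1.3347209e13

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # IQR = Q3 - Q1
cv = std / mean                   # CV = standard deviation / mean
variance = std ** 2               # Variance = standard deviation squared

print(f"Range    {value_range:.6e}")
print(f"IQR      {iqr:.4e}")
print(f"CV       {cv:.6f}")
print(f"Variance {variance:.6e}")
```

Because the column is an identifier rather than a measurement, these values describe how the ID codes are distributed, not any physical quantity.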
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
41281100007869 1
 
< 0.1%
47170100014648 1
 
< 0.1%
11380120091516 1
 
< 0.1%
26170100448582 1
 
< 0.1%
27200100014071 1
 
< 0.1%
46870100013245 1
 
< 0.1%
48310120354539 1
 
< 0.1%
27200100014066 1
 
< 0.1%
11470100003224 1
 
< 0.1%
11545100003529 1
 
< 0.1%
Other values (9990) 9990
99.9%
ValueCountFrequency (%)
11110100000012 1
< 0.1%
11110100000016 1
< 0.1%
11110100000027 1
< 0.1%
11110100000030 1
< 0.1%
11110100000032 1
< 0.1%
11110100000033 1
< 0.1%
11110100000034 1
< 0.1%
11110100000039 1
< 0.1%
11110100000045 1
< 0.1%
11110100000047 1
< 0.1%
ValueCountFrequency (%)
50130120433550 1
< 0.1%
50130120427628 1
< 0.1%
50130120426526 1
< 0.1%
50130120392764 1
< 0.1%
50130120384832 1
< 0.1%
50130120383793 1
< 0.1%
50130120366080 1
< 0.1%
50130120362681 1
< 0.1%
50130120360868 1
< 0.1%
50130120360555 1
< 0.1%

필지고유번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9982
Distinct (%): 99.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.2624009 × 10^18
Minimum: 1.1110115 × 10^18
Maximum: 5.013032 × 10^18
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1.1110115 × 10^18
5-th percentile: 1.1320107 × 10^18
Q1: 2.6260108 × 10^18
Median: 3.1710255 × 10^18
Q3: 4.4133144 × 10^18
95-th percentile: 4.8270253 × 10^18
Maximum: 5.013032 × 10^18
Range: 3.9020205 × 10^18
Interquartile range (IQR): 1.7873036 × 10^18

Descriptive statistics

Standard deviation: 1.3347231 × 10^18
Coefficient of variation (CV): 0.40912294
Kurtosis: -1.1885327
Mean: 3.2624009 × 10^18
Median Absolute Deviation (MAD): 1.140087 × 10^18
Skewness: -0.45432902
Sum: -8.2811784 × 10^18 (wrapped by 64-bit integer overflow; mean × count gives about 3.2624009 × 10^22)
Variance: 1.7814857 × 10^36
Monotonicity: Not monotonic
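
The negative sum reported for 필지고유번호 cannot be a true sum, since every value is positive (Negative: 0). The sketch below shows why a 64-bit signed accumulator would wrap here. `wrap_int64` is a hypothetical helper written for this illustration; the three IDs are taken from the value tables of this report.

```python
INT64_MAX = 2**63 - 1


def wrap_int64(value: int) -> int:
    """Interpret an arbitrary Python int as a wrapped two's-complement int64."""
    return (value + 2**63) % 2**64 - 2**63


# The approximate true sum (mean x count) is far beyond the int64 range.
approx_true_sum = 3.2624009e18 * 10_000
print(approx_true_sum > INT64_MAX)

# Adding just three 19-digit IDs is already enough to overflow.
ids = [4127310900110790000, 5013032021115910005, 4711132000300310008]
exact = sum(ids)            # Python ints do not overflow
wrapped = wrap_int64(exact)  # what a 64-bit signed accumulator would hold
print(exact, wrapped)
```

Reading such identifiers as strings avoids both the overflow and the meaningless arithmetic on ID codes.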
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
4127310900110790000 2
 
< 0.1%
4711132000300310008 2
 
< 0.1%
4825010600103210000 2
 
< 0.1%
4159031021100920000 2
 
< 0.1%
4611015800107400000 2
 
< 0.1%
4128112300108700000 2
 
< 0.1%
4128510600111350000 2
 
< 0.1%
4127310900110850000 2
 
< 0.1%
4128510200115740000 2
 
< 0.1%
2826011500300000000 2
 
< 0.1%
Other values (9972) 9980
99.8%
ValueCountFrequency (%)
1111011500100090000 1
< 0.1%
1111011700101450000 1
< 0.1%
1111013300100300006 1
< 0.1%
1111013300100550000 1
< 0.1%
1111013400100220000 1
< 0.1%
1111013700100220001 1
< 0.1%
1111016200100650002 1
< 0.1%
1111016500100090001 1
< 0.1%
1111016700100600000 1
< 0.1%
1111016800100040157 1
< 0.1%
ValueCountFrequency (%)
5013032021115910005 1
< 0.1%
5013032021115340001 1
< 0.1%
5013032021106810000 1
< 0.1%
5013025924111320021 1
< 0.1%
5013025321113020003 1
< 0.1%
5013025321112940003 1
< 0.1%
5013025022112210001 1
< 0.1%
5013025022111870002 1
< 0.1%
5013011600102000000 1
< 0.1%
5013011600101890000 1
< 0.1%

주소
Text

Distinct: 9982
Distinct (%): 99.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 26
Mean length: 18.752
Min length: 12

Characters and Unicode

Total characters: 187520
Distinct characters: 342
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
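
As an illustration of these character properties, the sketch below counts Unicode general categories for one of the sample addresses using Python's standard `unicodedata` module. The mapping from two-letter category codes to the names used in this report covers only the categories that occur in the example.

```python
import unicodedata
from collections import Counter

# Two-letter general category codes and the names used in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
    "Lu": "Uppercase Letter",
}


def category_counts(text: str) -> Counter:
    """Count characters in `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


sample = "서울특별시 강동구 명일동 352-18"  # 3rd sample row of 주소
counts = category_counts(sample)
for code, n in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), n)
```

Hangul syllables fall under "Other Letter", which is why that category dominates the address column.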

Unique

Unique: 9964
Unique (%): 99.6%

Sample

1st row: 경기도 고양덕양구 행신동 953
2nd row: 경상남도 거제시 장승포동 367-4
3rd row: 서울특별시 강동구 명일동 352-18
4th row: 부산광역시 남구 대연동 876-11
5th row: 전라북도 임실군 임실읍 이도리 212
ValueCountFrequency (%)
서울특별시 2159
 
5.2%
경기도 1739
 
4.2%
부산광역시 1083
 
2.6%
경상남도 801
 
1.9%
경상북도 628
 
1.5%
인천광역시 475
 
1.1%
대구광역시 419
 
1.0%
울산광역시 379
 
0.9%
충청남도 324
 
0.8%
전라남도 309
 
0.7%
Other values (9532) 33085
79.9%
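
The percentages in this table are consistent with shares of all whitespace-separated tokens rather than shares of rows: the listed counts sum to 41401, and 2159 / 41401 is about 5.2%. A minimal sketch of that reading, using the five sample addresses; the tokenization rule is an assumption inferred from the numbers, not taken from the tool's documentation:

```python
from collections import Counter

# The five sample rows of 주소 shown in this report.
addresses = [
    "경기도 고양덕양구 행신동 953",
    "경상남도 거제시 장승포동 367-4",
    "서울특별시 강동구 명일동 352-18",
    "부산광역시 남구 대연동 876-11",
    "전라북도 임실군 임실읍 이도리 212",
]

# Assumed tokenization: split each value on whitespace and count the tokens.
tokens = Counter(tok for addr in addresses for tok in addr.split())
print(tokens.most_common(3))

# Counts from the report's table, including the "Other values" row.
report_counts = [2159, 1739, 1083, 801, 628, 475, 419, 379, 324, 309, 33085]
total_tokens = sum(report_counts)
seoul_share = 100 * report_counts[0] / total_tokens
print(total_tokens, f"{seoul_share:.1f}%")
```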

Most occurring characters

ValueCountFrequency (%)
31475
 
16.8%
9819
 
5.2%
8081
 
4.3%
1 7920
 
4.2%
7380
 
3.9%
- 6286
 
3.4%
5408
 
2.9%
2 4560
 
2.4%
3 4150
 
2.2%
4 3738
 
2.0%
Other values (332) 98703
52.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 110644
59.0%
Decimal Number 39079
 
20.8%
Space Separator 31475
 
16.8%
Dash Punctuation 6286
 
3.4%
Uppercase Letter 36
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
9819
 
8.9%
8081
 
7.3%
7380
 
6.7%
5408
 
4.9%
3473
 
3.1%
3453
 
3.1%
3352
 
3.0%
3138
 
2.8%
2941
 
2.7%
2627
 
2.4%
Other values (318) 60972
55.1%
Decimal Number
ValueCountFrequency (%)
1 7920
20.3%
2 4560
11.7%
3 4150
10.6%
4 3738
9.6%
5 3646
9.3%
6 3409
8.7%
7 3223
8.2%
8 2864
 
7.3%
0 2841
 
7.3%
9 2728
 
7.0%
Uppercase Letter
ValueCountFrequency (%)
L 18
50.0%
B 18
50.0%
Space Separator
ValueCountFrequency (%)
31475
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6286
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 110644
59.0%
Common 76840
41.0%
Latin 36
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
9819
 
8.9%
8081
 
7.3%
7380
 
6.7%
5408
 
4.9%
3473
 
3.1%
3453
 
3.1%
3352
 
3.0%
3138
 
2.8%
2941
 
2.7%
2627
 
2.4%
Other values (318) 60972
55.1%
Common
ValueCountFrequency (%)
31475
41.0%
1 7920
 
10.3%
- 6286
 
8.2%
2 4560
 
5.9%
3 4150
 
5.4%
4 3738
 
4.9%
5 3646
 
4.7%
6 3409
 
4.4%
7 3223
 
4.2%
8 2864
 
3.7%
Other values (2) 5569
 
7.2%
Latin
ValueCountFrequency (%)
L 18
50.0%
B 18
50.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 110644
59.0%
ASCII 76876
41.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
31475
40.9%
1 7920
 
10.3%
- 6286
 
8.2%
2 4560
 
5.9%
3 4150
 
5.4%
4 3738
 
4.9%
5 3646
 
4.7%
6 3409
 
4.4%
7 3223
 
4.2%
8 2864
 
3.7%
Other values (4) 5605
 
7.3%
Hangul
ValueCountFrequency (%)
9819
 
8.9%
8081
 
7.3%
7380
 
6.7%
5408
 
4.9%
3473
 
3.1%
3453
 
3.1%
3352
 
3.0%
3138
 
2.8%
2941
 
2.7%
2627
 
2.4%
Other values (318) 60972
55.1%

단지명_공시가격
Text

Distinct: 8841
Distinct (%): 88.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 32
Median length: 26
Mean length: 6.1889
Min length: 1

Characters and Unicode

Total characters: 61889
Distinct characters: 690
Distinct categories: 11
Distinct scripts: 4
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8303
Unique (%): 83.0%

Sample

1st row: 햇빛주공23
2nd row: SG펠리체A동
3rd row: 태천해오름102동B
4th row: 삼정그린타운
5th row: 아도훼밀리
ValueCountFrequency (%)
현대 49
 
0.5%
삼성 15
 
0.1%
주공 14
 
0.1%
삼익 13
 
0.1%
우성 12
 
0.1%
신동아 12
 
0.1%
벽산 11
 
0.1%
한성 11
 
0.1%
현대2 10
 
0.1%
삼호 10
 
0.1%
Other values (8833) 9846
98.4%

Most occurring characters

ValueCountFrequency (%)
1 1529
 
2.5%
1451
 
2.3%
1439
 
2.3%
1403
 
2.3%
1358
 
2.2%
1230
 
2.0%
1152
 
1.9%
2 1145
 
1.9%
1107
 
1.8%
1074
 
1.7%
Other values (680) 49001
79.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 52350
84.6%
Decimal Number 5568
 
9.0%
Close Punctuation 1031
 
1.7%
Open Punctuation 1031
 
1.7%
Uppercase Letter 978
 
1.6%
Dash Punctuation 522
 
0.8%
Lowercase Letter 279
 
0.5%
Other Punctuation 86
 
0.1%
Letter Number 21
 
< 0.1%
Math Symbol 20
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1451
 
2.8%
1439
 
2.7%
1403
 
2.7%
1358
 
2.6%
1230
 
2.3%
1152
 
2.2%
1107
 
2.1%
1074
 
2.1%
959
 
1.8%
934
 
1.8%
Other values (607) 40243
76.9%
Uppercase Letter
ValueCountFrequency (%)
A 135
13.8%
B 92
 
9.4%
S 78
 
8.0%
C 73
 
7.5%
I 69
 
7.1%
L 68
 
7.0%
K 54
 
5.5%
H 50
 
5.1%
T 43
 
4.4%
E 37
 
3.8%
Other values (16) 279
28.5%
Lowercase Letter
ValueCountFrequency (%)
e 103
36.9%
i 26
 
9.3%
l 24
 
8.6%
t 18
 
6.5%
a 17
 
6.1%
r 13
 
4.7%
y 11
 
3.9%
u 10
 
3.6%
s 9
 
3.2%
k 8
 
2.9%
Other values (12) 40
 
14.3%
Decimal Number
ValueCountFrequency (%)
1 1529
27.5%
2 1145
20.6%
3 581
 
10.4%
0 570
 
10.2%
4 359
 
6.4%
5 350
 
6.3%
7 331
 
5.9%
6 295
 
5.3%
9 216
 
3.9%
8 192
 
3.4%
Other Punctuation
ValueCountFrequency (%)
, 69
80.2%
. 9
 
10.5%
& 3
 
3.5%
' 3
 
3.5%
: 2
 
2.3%
Math Symbol
ValueCountFrequency (%)
~ 18
90.0%
> 1
 
5.0%
< 1
 
5.0%
Letter Number
ValueCountFrequency (%)
14
66.7%
4
 
19.0%
3
 
14.3%
Close Punctuation
ValueCountFrequency (%)
) 1031
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1031
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 522
100.0%
Space Separator
ValueCountFrequency (%)
3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 52345
84.6%
Common 8261
 
13.3%
Latin 1278
 
2.1%
Han 5
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1451
 
2.8%
1439
 
2.7%
1403
 
2.7%
1358
 
2.6%
1230
 
2.3%
1152
 
2.2%
1107
 
2.1%
1074
 
2.1%
959
 
1.8%
934
 
1.8%
Other values (604) 40238
76.9%
Latin
ValueCountFrequency (%)
A 135
 
10.6%
e 103
 
8.1%
B 92
 
7.2%
S 78
 
6.1%
C 73
 
5.7%
I 69
 
5.4%
L 68
 
5.3%
K 54
 
4.2%
H 50
 
3.9%
T 43
 
3.4%
Other values (41) 513
40.1%
Common
ValueCountFrequency (%)
1 1529
18.5%
2 1145
13.9%
) 1031
12.5%
( 1031
12.5%
3 581
 
7.0%
0 570
 
6.9%
- 522
 
6.3%
4 359
 
4.3%
5 350
 
4.2%
7 331
 
4.0%
Other values (12) 812
9.8%
Han
ValueCountFrequency (%)
3
60.0%
1
 
20.0%
1
 
20.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 52344
84.6%
ASCII 9518
 
15.4%
Number Forms 21
 
< 0.1%
CJK 4
 
< 0.1%
CJK Compat Ideographs 1
 
< 0.1%
Compat Jamo 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1529
16.1%
2 1145
12.0%
) 1031
10.8%
( 1031
10.8%
3 581
 
6.1%
0 570
 
6.0%
- 522
 
5.5%
4 359
 
3.8%
5 350
 
3.7%
7 331
 
3.5%
Other values (60) 2069
21.7%
Hangul
ValueCountFrequency (%)
1451
 
2.8%
1439
 
2.7%
1403
 
2.7%
1358
 
2.6%
1230
 
2.3%
1152
 
2.2%
1107
 
2.1%
1074
 
2.1%
959
 
1.8%
934
 
1.8%
Other values (603) 40237
76.9%
Number Forms
ValueCountFrequency (%)
14
66.7%
4
 
19.0%
3
 
14.3%
CJK
ValueCountFrequency (%)
3
75.0%
1
 
25.0%
CJK Compat Ideographs
ValueCountFrequency (%)
1
100.0%
Compat Jamo
ValueCountFrequency (%)
1
100.0%

단지명_건축물대장
Text

MISSING

Distinct: 7376
Distinct (%): 87.6%
Missing: 1579
Missing (%): 15.8%
Memory size: 156.2 KiB

Length

Max length: 29
Median length: 25
Mean length: 7.4997031
Min length: 1

Characters and Unicode

Total characters: 63155
Distinct characters: 683
Distinct categories: 11
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6895
Unique (%): 81.9%

Sample

1st row: SG펠리체
2nd row: 태천해오름아파트
3rd row: 삼정그린타운
4th row: 아도훼밀리아파트
5th row: 동양라파크
ValueCountFrequency (%)
아파트 375
 
3.3%
현대아파트 76
 
0.7%
52
 
0.5%
2차 51
 
0.4%
푸르지오 46
 
0.4%
2단지 45
 
0.4%
주공아파트 44
 
0.4%
e편한세상 40
 
0.3%
1단지 38
 
0.3%
롯데캐슬 31
 
0.3%
Other values (8000) 10686
93.1%

Most occurring characters

ValueCountFrequency (%)
4573
 
7.2%
4474
 
7.1%
4329
 
6.9%
3067
 
4.9%
1276
 
2.0%
1215
 
1.9%
1137
 
1.8%
1010
 
1.6%
922
 
1.5%
837
 
1.3%
Other values (673) 40315
63.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 56753
89.9%
Space Separator 3067
 
4.9%
Decimal Number 1880
 
3.0%
Uppercase Letter 776
 
1.2%
Lowercase Letter 285
 
0.5%
Dash Punctuation 111
 
0.2%
Other Punctuation 98
 
0.2%
Open Punctuation 82
 
0.1%
Close Punctuation 82
 
0.1%
Letter Number 20
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4573
 
8.1%
4474
 
7.9%
4329
 
7.6%
1276
 
2.2%
1215
 
2.1%
1137
 
2.0%
1010
 
1.8%
922
 
1.6%
837
 
1.5%
796
 
1.4%
Other values (599) 36184
63.8%
Uppercase Letter
ValueCountFrequency (%)
A 83
 
10.7%
S 64
 
8.2%
C 62
 
8.0%
I 55
 
7.1%
L 50
 
6.4%
K 49
 
6.3%
T 42
 
5.4%
H 39
 
5.0%
P 38
 
4.9%
B 38
 
4.9%
Other values (16) 256
33.0%
Lowercase Letter
ValueCountFrequency (%)
e 101
35.4%
i 29
 
10.2%
l 24
 
8.4%
t 20
 
7.0%
a 14
 
4.9%
s 12
 
4.2%
n 11
 
3.9%
r 11
 
3.9%
o 11
 
3.9%
y 11
 
3.9%
Other values (12) 41
14.4%
Decimal Number
ValueCountFrequency (%)
1 576
30.6%
2 556
29.6%
3 224
 
11.9%
0 146
 
7.8%
5 97
 
5.2%
4 94
 
5.0%
6 75
 
4.0%
7 45
 
2.4%
8 44
 
2.3%
9 23
 
1.2%
Other Punctuation
ValueCountFrequency (%)
, 39
39.8%
. 38
38.8%
& 6
 
6.1%
· 4
 
4.1%
/ 4
 
4.1%
' 3
 
3.1%
: 3
 
3.1%
# 1
 
1.0%
Letter Number
ValueCountFrequency (%)
13
65.0%
4
 
20.0%
3
 
15.0%
Space Separator
ValueCountFrequency (%)
3067
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 111
100.0%
Open Punctuation
ValueCountFrequency (%)
( 82
100.0%
Close Punctuation
ValueCountFrequency (%)
) 82
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 56748
89.9%
Common 5321
 
8.4%
Latin 1081
 
1.7%
Han 5
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4573
 
8.1%
4474
 
7.9%
4329
 
7.6%
1276
 
2.2%
1215
 
2.1%
1137
 
2.0%
1010
 
1.8%
922
 
1.6%
837
 
1.5%
796
 
1.4%
Other values (597) 36179
63.8%
Latin
ValueCountFrequency (%)
e 101
 
9.3%
A 83
 
7.7%
S 64
 
5.9%
C 62
 
5.7%
I 55
 
5.1%
L 50
 
4.6%
K 49
 
4.5%
T 42
 
3.9%
H 39
 
3.6%
P 38
 
3.5%
Other values (41) 498
46.1%
Common
ValueCountFrequency (%)
3067
57.6%
1 576
 
10.8%
2 556
 
10.4%
3 224
 
4.2%
0 146
 
2.7%
- 111
 
2.1%
5 97
 
1.8%
4 94
 
1.8%
( 82
 
1.5%
) 82
 
1.5%
Other values (13) 286
 
5.4%
Han
ValueCountFrequency (%)
4
80.0%
1
 
20.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 56748
89.9%
ASCII 6378
 
10.1%
Number Forms 20
 
< 0.1%
CJK 5
 
< 0.1%
None 4
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
4573
 
8.1%
4474
 
7.9%
4329
 
7.6%
1276
 
2.2%
1215
 
2.1%
1137
 
2.0%
1010
 
1.8%
922
 
1.6%
837
 
1.5%
796
 
1.4%
Other values (597) 36179
63.8%
ASCII
ValueCountFrequency (%)
3067
48.1%
1 576
 
9.0%
2 556
 
8.7%
3 224
 
3.5%
0 146
 
2.3%
- 111
 
1.7%
e 101
 
1.6%
5 97
 
1.5%
4 94
 
1.5%
A 83
 
1.3%
Other values (60) 1323
20.7%
Number Forms
ValueCountFrequency (%)
13
65.0%
4
 
20.0%
3
 
15.0%
None
ValueCountFrequency (%)
· 4
100.0%
CJK
ValueCountFrequency (%)
4
80.0%
1
 
20.0%

단지명_도로명주소
Text

MISSING

Distinct: 7227
Distinct (%): 85.9%
Missing: 1588
Missing (%): 15.9%
Memory size: 156.2 KiB

Length

Max length: 29
Median length: 26
Mean length: 7.3392772
Min length: 1

Characters and Unicode

Total characters: 61738
Distinct characters: 674
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6725
Unique (%): 79.9%

Sample

1st row: 햇빛마을
2nd row: SG펠리체
3rd row: 태천해오름아파트
4th row: 삼정그린타운
5th row: 아도훼밀리아파트
ValueCountFrequency (%)
아파트 328
 
2.9%
현대아파트 97
 
0.9%
주공아파트 53
 
0.5%
2차 44
 
0.4%
43
 
0.4%
2단지 42
 
0.4%
푸르지오 40
 
0.4%
1단지 35
 
0.3%
e편한세상 34
 
0.3%
101동 30
 
0.3%
Other values (7815) 10398
93.3%

Most occurring characters

ValueCountFrequency (%)
4721
 
7.6%
4578
 
7.4%
4469
 
7.2%
2740
 
4.4%
1208
 
2.0%
1193
 
1.9%
1168
 
1.9%
936
 
1.5%
862
 
1.4%
821
 
1.3%
Other values (664) 39042
63.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 55739
90.3%
Space Separator 2740
 
4.4%
Decimal Number 1931
 
3.1%
Uppercase Letter 755
 
1.2%
Lowercase Letter 242
 
0.4%
Dash Punctuation 96
 
0.2%
Open Punctuation 79
 
0.1%
Close Punctuation 79
 
0.1%
Other Punctuation 57
 
0.1%
Letter Number 20
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4721
 
8.5%
4578
 
8.2%
4469
 
8.0%
1208
 
2.2%
1193
 
2.1%
1168
 
2.1%
936
 
1.7%
862
 
1.5%
821
 
1.5%
793
 
1.4%
Other values (590) 34990
62.8%
Uppercase Letter
ValueCountFrequency (%)
A 89
 
11.8%
S 61
 
8.1%
C 55
 
7.3%
I 52
 
6.9%
T 47
 
6.2%
P 47
 
6.2%
K 45
 
6.0%
L 44
 
5.8%
B 40
 
5.3%
E 31
 
4.1%
Other values (16) 244
32.3%
Lowercase Letter
ValueCountFrequency (%)
e 91
37.6%
l 23
 
9.5%
i 22
 
9.1%
t 15
 
6.2%
s 11
 
4.5%
o 10
 
4.1%
n 9
 
3.7%
a 9
 
3.7%
r 8
 
3.3%
u 8
 
3.3%
Other values (12) 36
 
14.9%
Decimal Number
ValueCountFrequency (%)
1 592
30.7%
2 538
27.9%
3 233
 
12.1%
0 200
 
10.4%
5 94
 
4.9%
4 88
 
4.6%
6 75
 
3.9%
7 45
 
2.3%
8 41
 
2.1%
9 25
 
1.3%
Other Punctuation
ValueCountFrequency (%)
. 23
40.4%
, 16
28.1%
& 7
 
12.3%
' 3
 
5.3%
· 3
 
5.3%
: 2
 
3.5%
/ 1
 
1.8%
@ 1
 
1.8%
# 1
 
1.8%
Letter Number
ValueCountFrequency (%)
14
70.0%
3
 
15.0%
3
 
15.0%
Space Separator
ValueCountFrequency (%)
2740
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 96
100.0%
Open Punctuation
ValueCountFrequency (%)
( 79
100.0%
Close Punctuation
ValueCountFrequency (%)
) 79
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 55733
90.3%
Common 4982
 
8.1%
Latin 1017
 
1.6%
Han 6
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4721
 
8.5%
4578
 
8.2%
4469
 
8.0%
1208
 
2.2%
1193
 
2.1%
1168
 
2.1%
936
 
1.7%
862
 
1.5%
821
 
1.5%
793
 
1.4%
Other values (587) 34984
62.8%
Latin
ValueCountFrequency (%)
e 91
 
8.9%
A 89
 
8.8%
S 61
 
6.0%
C 55
 
5.4%
I 52
 
5.1%
T 47
 
4.6%
P 47
 
4.6%
K 45
 
4.4%
L 44
 
4.3%
B 40
 
3.9%
Other values (41) 446
43.9%
Common
ValueCountFrequency (%)
2740
55.0%
1 592
 
11.9%
2 538
 
10.8%
3 233
 
4.7%
0 200
 
4.0%
- 96
 
1.9%
5 94
 
1.9%
4 88
 
1.8%
( 79
 
1.6%
) 79
 
1.6%
Other values (13) 243
 
4.9%
Han
ValueCountFrequency (%)
4
66.7%
1
 
16.7%
1
 
16.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 55733
90.3%
ASCII 5976
 
9.7%
Number Forms 20
 
< 0.1%
CJK 5
 
< 0.1%
None 3
 
< 0.1%
CJK Compat Ideographs 1
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
4721
 
8.5%
4578
 
8.2%
4469
 
8.0%
1208
 
2.2%
1193
 
2.1%
1168
 
2.1%
936
 
1.7%
862
 
1.5%
821
 
1.5%
793
 
1.4%
Other values (587) 34984
62.8%
ASCII
ValueCountFrequency (%)
2740
45.9%
1 592
 
9.9%
2 538
 
9.0%
3 233
 
3.9%
0 200
 
3.3%
- 96
 
1.6%
5 94
 
1.6%
e 91
 
1.5%
A 89
 
1.5%
4 88
 
1.5%
Other values (60) 1215
20.3%
Number Forms
ValueCountFrequency (%)
14
70.0%
3
 
15.0%
3
 
15.0%
CJK
ValueCountFrequency (%)
4
80.0%
1
 
20.0%
None
ValueCountFrequency (%)
· 3
100.0%
CJK Compat Ideographs
ValueCountFrequency (%)
1
100.0%

단지종류
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
1: 10000

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

ValueCountFrequency (%)
1 10000
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 10000
100.0%

동수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 48
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.6015
Minimum: 1
Maximum: 72
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 5
95-th percentile: 12
Maximum: 72
Range: 71
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 4.6779575
Coefficient of variation (CV): 1.2988914
Kurtosis: 23.586366
Mean: 3.6015
Median Absolute Deviation (MAD): 0
Skewness: 3.5630939
Sum: 36015
Variance: 21.883286
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=48)
ValueCountFrequency (%)
1 5404
54.0%
2 958
 
9.6%
3 554
 
5.5%
4 485
 
4.9%
5 420
 
4.2%
6 418
 
4.2%
7 308
 
3.1%
8 279
 
2.8%
9 224
 
2.2%
10 187
 
1.9%
Other values (38) 763
 
7.6%
ValueCountFrequency (%)
1 5404
54.0%
2 958
 
9.6%
3 554
 
5.5%
4 485
 
4.9%
5 420
 
4.2%
6 418
 
4.2%
7 308
 
3.1%
8 279
 
2.8%
9 224
 
2.2%
10 187
 
1.9%
ValueCountFrequency (%)
72 1
< 0.1%
66 1
< 0.1%
65 1
< 0.1%
60 1
< 0.1%
51 1
< 0.1%
49 1
< 0.1%
46 1
< 0.1%
44 1
< 0.1%
41 1
< 0.1%
40 1
< 0.1%

세대수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1330
Distinct (%): 13.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 274.4745
Minimum: 4
Maximum: 6864
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 4
5-th percentile: 12
Q1: 21
Median: 95
Q3: 378
95-th percentile: 1056.1
Maximum: 6864
Range: 6860
Interquartile range (IQR): 357

Descriptive statistics

Standard deviation: 416.78995
Coefficient of variation (CV): 1.5185015
Kurtosis: 23.375225
Mean: 274.4745
Median Absolute Deviation (MAD): 79
Skewness: 3.5227466
Sum: 2744745
Variance: 173713.86
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
19 716
 
7.2%
18 402
 
4.0%
14 209
 
2.1%
12 204
 
2.0%
16 165
 
1.7%
10 163
 
1.6%
40 132
 
1.3%
15 132
 
1.3%
29 111
 
1.1%
28 111
 
1.1%
Other values (1320) 7655
76.5%
ValueCountFrequency (%)
4 4
 
< 0.1%
5 71
 
0.7%
6 16
 
0.2%
7 16
 
0.2%
8 29
 
0.3%
9 40
 
0.4%
10 163
1.6%
11 63
 
0.6%
12 204
2.0%
13 68
 
0.7%
ValueCountFrequency (%)
6864 1
< 0.1%
5678 1
< 0.1%
5563 1
< 0.1%
5076 1
< 0.1%
4089 1
< 0.1%
3853 1
< 0.1%
3850 1
< 0.1%
3806 1
< 0.1%
3728 1
< 0.1%
3696 1
< 0.1%

사용승인일
Text

Distinct: 6142
Distinct (%): 61.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 100000
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3669
Unique (%): 36.7%

Sample

1st row: 1996-11-22
2nd row: 2015-07-05
3rd row: 2004-10-30
4th row: 2004-06-07
5th row: 1993-07-16
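
Every value in this column is exactly 10 characters long and uses only digits and hyphens, which matches ISO dates (YYYY-MM-DD), although the column was profiled as Text. A minimal sketch, assuming that format holds for all rows, that parses the sample values into real dates:

```python
from collections import Counter
from datetime import date

# The five sample values shown above.
samples = ["1996-11-22", "2015-07-05", "2004-10-30", "2004-06-07", "1993-07-16"]

# date.fromisoformat raises ValueError on anything that is not YYYY-MM-DD,
# so it doubles as a format check.
parsed = [date.fromisoformat(s) for s in samples]

oldest = min(parsed)
approvals_per_year = Counter(d.year for d in parsed)

print(oldest.isoformat())
print(sorted(approvals_per_year.items()))
```

With the column parsed as dates, the profile could report a date range and a yearly distribution instead of character statistics.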
ValueCountFrequency (%)
2008-10-31 9
 
0.1%
2002-11-14 8
 
0.1%
2002-11-01 8
 
0.1%
2004-08-27 8
 
0.1%
2004-07-30 8
 
0.1%
2017-10-27 8
 
0.1%
2004-09-24 8
 
0.1%
2002-12-18 7
 
0.1%
2003-01-22 7
 
0.1%
2003-09-08 7
 
0.1%
Other values (6132) 9922
99.2%

Most occurring characters

ValueCountFrequency (%)
0 21799
21.8%
- 20000
20.0%
1 16043
16.0%
2 14619
14.6%
9 9158
9.2%
8 3587
 
3.6%
3 3553
 
3.6%
7 2900
 
2.9%
4 2887
 
2.9%
6 2757
 
2.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 80000
80.0%
Dash Punctuation 20000
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 21799
27.2%
1 16043
20.1%
2 14619
18.3%
9 9158
11.4%
8 3587
 
4.5%
3 3553
 
4.4%
7 2900
 
3.6%
4 2887
 
3.6%
6 2757
 
3.4%
5 2697
 
3.4%
Dash Punctuation
ValueCountFrequency (%)
- 20000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 100000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 21799
21.8%
- 20000
20.0%
1 16043
16.0%
2 14619
14.6%
9 9158
9.2%
8 3587
 
3.6%
3 3553
 
3.6%
7 2900
 
2.9%
4 2887
 
2.9%
6 2757
 
2.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 100000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 21799
21.8%
- 20000
20.0%
1 16043
16.0%
2 14619
14.6%
9 9158
9.2%
8 3587
 
3.6%
3 3553
 
3.6%
7 2900
 
2.9%
4 2887
 
2.9%
6 2757
 
2.8%

Interactions

[16 interaction plots (Matplotlib v3.7.2 SVG images) are not reproduced in this text version.]

Correlations

 | 단지고유번호 | 필지고유번호 | 동수 | 세대수
단지고유번호 | 1.000 | 1.000 | 0.145 | 0.131
필지고유번호 | 1.000 | 1.000 | 0.145 | 0.131
동수 | 0.145 | 0.145 | 1.000 | 0.872
세대수 | 0.131 | 0.131 | 0.872 | 1.000

 | 단지고유번호 | 필지고유번호 | 동수 | 세대수
단지고유번호 | 1.000 | 1.000 | 0.114 | 0.124
필지고유번호 | 1.000 | 1.000 | 0.115 | 0.125
동수 | 0.114 | 0.115 | 1.000 | 0.858
세대수 | 0.124 | 0.125 | 0.858 | 1.000
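
The report does not say here which coefficient each matrix uses. As a generic illustration of how such pairwise values are obtained, the sketch below implements Pearson and Spearman correlation in plain Python. The (동수, 세대수) pairs are made up for the example and are not rows of the dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical (동수, 세대수) pairs, invented for illustration only.
buildings = [1, 1, 2, 5, 12, 20]
households = [16, 19, 64, 200, 868, 1813]

print(round(pearson(buildings, households), 3))
print(round(spearman(buildings, households), 3))
```

A rank-based coefficient is often preferred for heavily skewed counts such as these, since a few very large complexes would otherwise dominate the result.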

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
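
Nullity correlation can be sketched as the correlation between two 0/1 "is missing" indicators. The example below uses the missing pattern of 단지명_건축물대장 and 단지명_도로명주소 in the 20 rows of the Sample section of this report (1 marks `<NA>`). It illustrates the idea only and makes no claim about how the heatmap in the report was computed.

```python
from math import sqrt


def nullity_correlation(a, b):
    """Pearson correlation between two 0/1 missingness indicators."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sqrt(sum((x - ma) ** 2 for x in a))
    sb = sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)


# Missing flags for the 20 sample rows (first 10 rows, then last 10 rows).
building_register_missing = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
                             1, 0, 0, 0, 0, 0, 1, 0, 0, 0]
road_address_missing = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
                        1, 0, 1, 0, 0, 0, 1, 0, 0, 0]

phi = nullity_correlation(building_register_missing, road_address_missing)
print(round(phi, 3))
```

A value near 1 would mean the two names tend to be missing in the same rows; 20 rows are far too few to draw that conclusion for the full dataset.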

Sample

단지고유번호 | 필지고유번호 | 주소 | 단지명_공시가격 | 단지명_건축물대장 | 단지명_도로명주소 | 단지종류 | 동수 | 세대수 | 사용승인일
9130412811000078694128112800109530000경기도 고양덕양구 행신동 953햇빛주공23<NA>햇빛마을12018131996-11-22
20967483101203226074831010200103670004경상남도 거제시 장승포동 367-4SG펠리체A동SG펠리체SG펠리체11162015-07-05
32957117401200227391174010100103520018서울특별시 강동구 명일동 352-18태천해오름102동B태천해오름아파트태천해오름아파트1162004-10-30
15859262901002500982629010600108760011부산광역시 남구 대연동 876-11삼정그린타운삼정그린타운삼정그린타운11142004-06-07
41950457501000123864575025022102120000전라북도 임실군 임실읍 이도리 212아도훼밀리아도훼밀리아파트아도훼밀리아파트11581993-07-16
4080412871000080864128710500103820013경기도 고양일산서구 덕이동 382-13동양라파크동양라파크동양라파크152002002-02-28
35061262301000039922623010400112560002부산광역시 부산진구 범천동 1256-2서면항도타워맨션서면항도타워맨션서면항도타워맨션111881994-05-16
19914412101201319014121010100107840000경기도 광명시 광명동 784광명제일풍경채제일풍경채아파트제일풍경채아파트151952010-09-03
29382482401200690514824025021103040001경상남도 사천시 사천읍 선인리 304-1진성아트빌진성아트빌진성아트빌11172007-04-25
12089115901000011091159010500103270000서울특별시 동작구 흑석동 327청호청호아파트청호아파트153461997-10-14
단지고유번호 | 필지고유번호 | 주소 | 단지명_공시가격 | 단지명_건축물대장 | 단지명_도로명주소 | 단지종류 | 동수 | 세대수 | 사용승인일
27473115601000010581156013200145180000서울특별시 영등포구 신길동 4518우성2<NA><NA>177251986-09-25
6923115001000018221150010300109130001서울특별시 강서구 화곡동 913-1삼도삼도아파트삼도아파트12641999-11-10
29771427701203816574277025321104400000강원도 정선군 고한읍 고한리 440파인앤유아파트파인앤유 아파트<NA>152992018-11-29
25330501101204047115011013700103060000제주특별자치도 제주시 연동 306제주연동중흥S-클래스제주 연동 중흥S-클래스제주 연동 중흥S-클래스111512020-05-15
3627111101002482001111016800100040157서울특별시 종로구 동숭동 4-157동성아파트(2동)동성아파트동성아파트11181999-08-21
24587115001203164091150010300109170014서울특별시 강서구 화곡동 917-14삼성다빈치(917-14)삼성다빈치삼성다빈치111042015-04-15
32425264701000053112647010200111220001부산광역시 연제구 연산동 1122-1남일<NA><NA>131181976-11-05
31706442501200099554425010100101670001충청남도 계룡시 금암동 167-1우림루미아트우림루미아트우림루미아트1148682005-06-16
1356447701000113414477025026103030007충청남도 서천군 장항읍 화천리 303-7신흥신흥아파트신흥아파트112521995-09-23
11455115451000506931154510200109870011서울특별시 금천구 독산동 987-11한아(987-11)한아아파트한아아파트11122003-11-20
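
In the sample rows above, the 19-digit 필지고유번호 appears to encode the lot number of 주소: the last eight digits read as a four-digit main number and a four-digit sub-number (for example, ...0367 0004 for 장승포동 367-4). Its first five digits also match the first five digits of 단지고유번호 in these rows, which is consistent with the 1.000 correlation between the two ID columns. The field layout used below is inferred from these samples and should be treated as an assumption; `ParcelId`, `parse_parcel_id`, and `lot_number` are hypothetical names.

```python
from typing import NamedTuple


class ParcelId(NamedTuple):
    area_code: str   # first 10 digits (assumed: administrative area code)
    land_type: str   # 11th digit (assumed: land category flag)
    main_no: int     # digits 12-15: main lot number
    sub_no: int      # digits 16-19: sub lot number (0 if none)


def parse_parcel_id(pid: str) -> ParcelId:
    """Split a 19-digit parcel ID using the layout inferred from the samples."""
    if len(pid) != 19 or not pid.isdigit():
        raise ValueError(f"expected 19 digits, got {pid!r}")
    return ParcelId(pid[:10], pid[10], int(pid[11:15]), int(pid[15:19]))


def lot_number(parcel: ParcelId) -> str:
    """Format the lot number the way it appears at the end of 주소."""
    if parcel.sub_no:
        return f"{parcel.main_no}-{parcel.sub_no}"
    return str(parcel.main_no)


# (필지고유번호, 주소) pairs copied from the sample rows.
rows = [
    ("4128112800109530000", "경기도 고양덕양구 행신동 953"),
    ("4831010200103670004", "경상남도 거제시 장승포동 367-4"),
    ("1174010100103520018", "서울특별시 강동구 명일동 352-18"),
]

for pid, address in rows:
    lot = lot_number(parse_parcel_id(pid))
    print(pid, lot, address.endswith(lot))
```

Keeping these IDs as strings, as this sketch does, preserves all 19 digits and makes such field-level checks straightforward.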