Overview

Dataset statistics

Number of variables: 13
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.1 MiB
Average record size in memory: 114.0 B

Variable types

Numeric: 2
Categorical: 3
Text: 8

Alerts

High correlation: SIDO_NM is highly overall correlated with APT_POTVALE_RLT_RT_CMPR_INFO_NO and 1 other field
High correlation: TNSHP_NM is highly overall correlated with APT_POTVALE_RLT_RT_CMPR_INFO_NO and 2 other fields
High correlation: APT_POTVALE_RLT_RT_CMPR_INFO_NO is highly overall correlated with SIDO_NM and 1 other field
High correlation: EMD_ACCTO_POTVALE_RLT_RT is highly overall correlated with TNSHP_NM
Imbalance: TNSHP_NM is highly imbalanced (62.7%)
Unique: APT_POTVALE_RLT_RT_CMPR_INFO_NO has unique values

Reproduction

Analysis started: 2023-12-11 22:32:15.280230
Analysis finished: 2023-12-11 22:32:18.348337
Duration: 3.07 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

APT_POTVALE_RLT_RT_CMPR_INFO_NO
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11748.064
Minimum: 1
Maximum: 23550
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1139.85
Q1: 5852.75
Median: 11701
Q3: 17694.5
95-th percentile: 22386.15
Maximum: 23550
Range: 23549
Interquartile range (IQR): 11841.75

Descriptive statistics

Standard deviation: 6813.232
Coefficient of variation (CV): 0.57994511
Kurtosis: -1.2086111
Mean: 11748.064
Median Absolute Deviation (MAD): 5919
Skewness: 0.0073313905
Sum: 1.1748064 × 10^8
Variance: 46420130
Monotonicity: Not monotonic
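
Several of the figures above are derived from others in the same block. The following is a minimal sketch, not part of the profiling tool itself, that recomputes the range, IQR, coefficient of variation, variance, and sum from the rounded values printed in this report. The variable names are illustrative, and because the inputs are already rounded, the results agree with the reported figures only to roughly the precision shown.

```python
# Values copied from the report for APT_POTVALE_RLT_RT_CMPR_INFO_NO.
n_observations = 10000
mean = 11748.064
std = 6813.232
q1, q3 = 5852.75, 17694.5
minimum, maximum = 1, 23550

# Derived statistics, using the usual textbook definitions.
value_range = maximum - minimum   # Range = max - min
iqr = q3 - q1                     # Interquartile range = Q3 - Q1
cv = std / mean                   # Coefficient of variation = std / mean
variance = std ** 2               # Variance = std squared
total = mean * n_observations     # Sum = mean * number of observations

print(f"range={value_range}, iqr={iqr}, cv={cv:.6f}, "
      f"variance={variance:.0f}, sum={total:.0f}")
```

A CV near 0.58 together with skewness near zero and negative kurtosis is what one would expect from a roughly uniform spread of identifiers, which fits the UNIQUE alert on this column.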
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
619 | 1 | < 0.1%
14376 | 1 | < 0.1%
6749 | 1 | < 0.1%
17485 | 1 | < 0.1%
7556 | 1 | < 0.1%
19924 | 1 | < 0.1%
6698 | 1 | < 0.1%
12875 | 1 | < 0.1%
20898 | 1 | < 0.1%
12144 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Value | Count | Frequency (%)
1 | 1 | < 0.1%
4 | 1 | < 0.1%
8 | 1 | < 0.1%
10 | 1 | < 0.1%
11 | 1 | < 0.1%
12 | 1 | < 0.1%
13 | 1 | < 0.1%
15 | 1 | < 0.1%
17 | 1 | < 0.1%
20 | 1 | < 0.1%

Value | Count | Frequency (%)
23550 | 1 | < 0.1%
23549 | 1 | < 0.1%
23548 | 1 | < 0.1%
23543 | 1 | < 0.1%
23542 | 1 | < 0.1%
23532 | 1 | < 0.1%
23529 | 1 | < 0.1%
23526 | 1 | < 0.1%
23520 | 1 | < 0.1%
23517 | 1 | < 0.1%

SIDO_NM
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
경기도: 6738
인천광역시: 1731
서울특별시: 1531

Length

Max length: 5
Median length: 3
Mean length: 3.6524
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 경기도
3rd row: 경기도
4th row: 인천광역시
5th row: 경기도

Common Values

Value | Count | Frequency (%)
경기도 | 6738 | 67.4%
인천광역시 | 1731 | 17.3%
서울특별시 | 1531 | 15.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
경기도 | 6738 | 67.4%
인천광역시 | 1731 | 17.3%
서울특별시 | 1531 | 15.3%

SIGNGU_NM
Text

Distinct: 64
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 3
Mean length: 3.0433
Min length: 2

Characters and Unicode

Total characters: 30433
Distinct characters: 65
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강서구
2nd row: 포천시
3rd row: 안양시
4th row: 계양구
5th row: 수원시

Value | Count | Frequency (%)
평택시 | 681 | 6.8%
수원시 | 564 | 5.6%
화성시 | 544 | 5.4%
고양시 | 481 | 4.8%
용인시 | 467 | 4.7%
서구 | 374 | 3.7%
시흥시 | 358 | 3.6%
남동구 | 350 | 3.5%
부천시 | 319 | 3.2%
파주시 | 295 | 2.9%
Other values (54) | 5567 | 55.7%

Most occurring characters

ValueCountFrequency (%)
7022
23.1%
3362
 
11.0%
1404
 
4.6%
1094
 
3.6%
1071
 
3.5%
1029
 
3.4%
984
 
3.2%
832
 
2.7%
789
 
2.6%
780
 
2.6%
Other values (55) 12066
39.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 30433
100.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
7022
23.1%
3362
 
11.0%
1404
 
4.6%
1094
 
3.6%
1071
 
3.5%
1029
 
3.4%
984
 
3.2%
832
 
2.7%
789
 
2.6%
780
 
2.6%
Other values (55) 12066
39.6%

Most occurring scripts

ValueCountFrequency (%)
Hangul 30433
100.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
7022
23.1%
3362
 
11.0%
1404
 
4.6%
1094
 
3.6%
1071
 
3.5%
1029
 
3.4%
984
 
3.2%
832
 
2.7%
789
 
2.6%
780
 
2.6%
Other values (55) 12066
39.6%

Most occurring blocks

ValueCountFrequency (%)
Hangul 30433
100.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
7022
23.1%
3362
 
11.0%
1404
 
4.6%
1094
 
3.6%
1071
 
3.5%
1029
 
3.4%
984
 
3.2%
832
 
2.7%
789
 
2.6%
780
 
2.6%
Other values (55) 12066
39.6%

TNSHP_NM
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 18
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
<NA>: 7920
영통구: 240
일산서구: 178
덕양구: 175
권선구: 173
Other values (13): 1314

Length

Max length: 4
Median length: 4
Mean length: 3.8226
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: 동안구
4th row: <NA>
5th row: 권선구

Common Values

Value | Count | Frequency (%)
<NA> | 7920 | 79.2%
영통구 | 240 | 2.4%
일산서구 | 178 | 1.8%
덕양구 | 175 | 1.8%
권선구 | 173 | 1.7%
기흥구 | 166 | 1.7%
수지구 | 160 | 1.6%
처인구 | 141 | 1.4%
일산동구 | 128 | 1.3%
단원구 | 123 | 1.2%
Other values (8) | 596 | 6.0%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
na | 7920 | 79.2%
영통구 | 240 | 2.4%
일산서구 | 178 | 1.8%
덕양구 | 175 | 1.8%
권선구 | 173 | 1.7%
기흥구 | 166 | 1.7%
수지구 | 160 | 1.6%
처인구 | 141 | 1.4%
일산동구 | 128 | 1.3%
단원구 | 123 | 1.2%
Other values (8) | 596 | 6.0%
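
The report lists 0 missing values for TNSHP_NM, yet 7920 of the 10000 values (79.2%) are the literal text <NA>. This suggests the placeholder was loaded as an ordinary string rather than as a missing value. The sketch below shows one way to map such placeholders to a real missing marker before profiling. The function names and the placeholder set are assumptions for illustration; the input is the five-row sample shown above.

```python
from typing import Iterable, List, Optional, Set

# Assumption: "<NA>" is the only placeholder used for missing values.
PLACEHOLDERS: Set[str] = {"<NA>"}

def normalize_missing(values: Iterable[str],
                      placeholders: Set[str] = PLACEHOLDERS) -> List[Optional[str]]:
    """Replace placeholder strings with None so they count as missing."""
    return [None if v in placeholders else v for v in values]

def missing_share(values: List[Optional[str]]) -> float:
    """Fraction of entries that are None."""
    return sum(v is None for v in values) / len(values)

# The five TNSHP_NM sample rows from the report.
sample = ["<NA>", "<NA>", "동안구", "<NA>", "권선구"]
cleaned = normalize_missing(sample)
print(cleaned, missing_share(cleaned))

# Share implied by the reported counts for the full column.
reported_share = 7920 / 10000
```

Treating the placeholder as missing would presumably also change the imbalance and correlation alerts for this column, since the dominant category would no longer be counted as a value.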

EMD_NM
Text

Distinct: 678
Distinct (%): 6.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 6
Median length: 3
Mean length: 3.0121
Min length: 2

Characters and Unicode

Total characters: 30121
Distinct characters: 251
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 72
Unique (%): 0.7%

Sample

1st row: 마곡동
2nd row: 신북면
3rd row: 관양동
4th row: 계산동
5th row: 호매실동

Value | Count | Frequency (%)
공도읍 | 124 | 1.2%
송도동 | 107 | 1.1%
영통동 | 102 | 1.0%
만수동 | 97 | 1.0%
배곧동 | 84 | 0.8%
구월동 | 83 | 0.8%
논현동 | 83 | 0.8%
중산동 | 79 | 0.8%
안양동 | 75 | 0.8%
청라동 | 74 | 0.7%
Other values (668) | 9092 | 90.9%

Most occurring characters

ValueCountFrequency (%)
8962
29.8%
1007
 
3.3%
780
 
2.6%
568
 
1.9%
443
 
1.5%
396
 
1.3%
385
 
1.3%
376
 
1.2%
334
 
1.1%
313
 
1.0%
Other values (241) 16557
55.0%

Most occurring categories

ValueCountFrequency (%)
Other Letter 30015
99.6%
Decimal Number 106
 
0.4%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
8962
29.9%
1007
 
3.4%
780
 
2.6%
568
 
1.9%
443
 
1.5%
396
 
1.3%
385
 
1.3%
376
 
1.3%
334
 
1.1%
313
 
1.0%
Other values (233) 16451
54.8%
Decimal Number
ValueCountFrequency (%)
2 25
23.6%
1 24
22.6%
3 20
18.9%
7 18
17.0%
5 7
 
6.6%
4 7
 
6.6%
6 3
 
2.8%
8 2
 
1.9%

Most occurring scripts

ValueCountFrequency (%)
Hangul 30015
99.6%
Common 106
 
0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
8962
29.9%
1007
 
3.4%
780
 
2.6%
568
 
1.9%
443
 
1.5%
396
 
1.3%
385
 
1.3%
376
 
1.3%
334
 
1.1%
313
 
1.0%
Other values (233) 16451
54.8%
Common
ValueCountFrequency (%)
2 25
23.6%
1 24
22.6%
3 20
18.9%
7 18
17.0%
5 7
 
6.6%
4 7
 
6.6%
6 3
 
2.8%
8 2
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
Hangul 30015
99.6%
ASCII 106
 
0.4%

Most frequent character per block

Hangul
ValueCountFrequency (%)
8962
29.9%
1007
 
3.4%
780
 
2.6%
568
 
1.9%
443
 
1.5%
396
 
1.3%
385
 
1.3%
376
 
1.3%
334
 
1.1%
313
 
1.0%
Other values (233) 16451
54.8%
ASCII
ValueCountFrequency (%)
2 25
23.6%
1 24
22.6%
3 20
18.9%
7 18
17.0%
5 7
 
6.6%
4 7
 
6.6%
6 3
 
2.8%
8 2
 
1.9%

HONO_NM
Text

Distinct: 3037
Distinct (%): 30.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 8
Mean length: 4.9727
Min length: 2

Characters and Unicode

Total characters: 49727
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1222
Unique (%): 12.2%

Sample

1st row: 748
2nd row: 101-6
3rd row: 1589
4th row: 62-1
5th row: 1408

Value | Count | Frequency (%)
752 | 39 | 0.4%
955 | 29 | 0.3%
717 | 28 | 0.3%
1149 | 27 | 0.3%
176 | 25 | 0.2%
36-1 | 25 | 0.2%
693 | 25 | 0.2%
1142 | 24 | 0.2%
23 | 23 | 0.2%
736 | 23 | 0.2%
Other values (3027) | 9732 | 97.3%

Most occurring characters

ValueCountFrequency (%)
10000
20.1%
1 7425
14.9%
- 3949
 
7.9%
2 3678
 
7.4%
3 3522
 
7.1%
5 3464
 
7.0%
6 3248
 
6.5%
7 3066
 
6.2%
4 2964
 
6.0%
0 2909
 
5.8%
Other values (2) 5502
11.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 35778
71.9%
Control 10000
 
20.1%
Dash Punctuation 3949
 
7.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 7425
20.8%
2 3678
10.3%
3 3522
9.8%
5 3464
9.7%
6 3248
9.1%
7 3066
8.6%
4 2964
 
8.3%
0 2909
 
8.1%
8 2804
 
7.8%
9 2698
 
7.5%
Control
ValueCountFrequency (%)
10000
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3949
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 49727
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
10000
20.1%
1 7425
14.9%
- 3949
 
7.9%
2 3678
 
7.4%
3 3522
 
7.1%
5 3464
 
7.0%
6 3248
 
6.5%
7 3066
 
6.2%
4 2964
 
6.0%
0 2909
 
5.8%
Other values (2) 5502
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 49727
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
10000
20.1%
1 7425
14.9%
- 3949
 
7.9%
2 3678
 
7.4%
3 3522
 
7.1%
5 3464
 
7.0%
6 3248
 
6.5%
7 3066
 
6.2%
4 2964
 
6.0%
0 2909
 
5.8%
Other values (2) 5502
11.1%

APT_NM
Text

Distinct: 3919
Distinct (%): 39.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 29
Median length: 25
Mean length: 7.2647
Min length: 2

Characters and Unicode

Total characters: 72647
Distinct characters: 576
Distinct categories: 11
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1930
Unique (%): 19.3%

Sample

1st row: 마곡13단지힐스테이트마스터
2nd row: 신포천
3rd row: 한가람(세경)
4th row: 현대
5th row: 엘에이치호매실스타힐스

Value | Count | Frequency (%)
현대 | 60 | 0.6%
벽산 | 38 | 0.4%
주은풍림 | 37 | 0.4%
동남 | 35 | 0.3%
동탄역 | 34 | 0.3%
옥정센트럴파크푸르지오 | 29 | 0.3%
주은청설 | 27 | 0.3%
산내마을9단지힐스테이트운정 | 27 | 0.3%
삼성래미안 | 26 | 0.3%
한신 | 24 | 0.2%
Other values (3960) | 10002 | 96.7%

Most occurring characters

ValueCountFrequency (%)
2042
 
2.8%
1 1635
 
2.3%
1608
 
2.2%
1591
 
2.2%
1560
 
2.1%
1497
 
2.1%
1449
 
2.0%
1445
 
2.0%
1437
 
2.0%
1201
 
1.7%
Other values (566) 57182
78.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 62803
86.4%
Decimal Number 5292
 
7.3%
Open Punctuation 1190
 
1.6%
Close Punctuation 1190
 
1.6%
Uppercase Letter 890
 
1.2%
Dash Punctuation 355
 
0.5%
Space Separator 344
 
0.5%
Lowercase Letter 330
 
0.5%
Other Punctuation 178
 
0.2%
Math Symbol 49
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2042
 
3.3%
1608
 
2.6%
1591
 
2.5%
1560
 
2.5%
1497
 
2.4%
1449
 
2.3%
1445
 
2.3%
1437
 
2.3%
1201
 
1.9%
1121
 
1.8%
Other values (500) 47852
76.2%
Uppercase Letter
ValueCountFrequency (%)
S 148
16.6%
I 88
9.9%
K 82
 
9.2%
C 81
 
9.1%
E 58
 
6.5%
V 53
 
6.0%
L 46
 
5.2%
W 45
 
5.1%
B 40
 
4.5%
D 38
 
4.3%
Other values (15) 211
23.7%
Lowercase Letter
ValueCountFrequency (%)
e 207
62.7%
k 18
 
5.5%
a 17
 
5.2%
r 15
 
4.5%
h 10
 
3.0%
i 9
 
2.7%
y 9
 
2.7%
w 6
 
1.8%
u 6
 
1.8%
t 6
 
1.8%
Other values (9) 27
 
8.2%
Decimal Number
ValueCountFrequency (%)
1 1635
30.9%
2 1149
21.7%
3 544
 
10.3%
5 400
 
7.6%
0 341
 
6.4%
4 340
 
6.4%
6 280
 
5.3%
7 215
 
4.1%
9 206
 
3.9%
8 182
 
3.4%
Other Punctuation
ValueCountFrequency (%)
, 133
74.7%
. 38
 
21.3%
' 4
 
2.2%
& 3
 
1.7%
Letter Number
ValueCountFrequency (%)
16
61.5%
5
 
19.2%
5
 
19.2%
Open Punctuation
ValueCountFrequency (%)
( 1190
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1190
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 355
100.0%
Space Separator
ValueCountFrequency (%)
344
100.0%
Math Symbol
ValueCountFrequency (%)
~ 49
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 62802
86.4%
Common 8598
 
11.8%
Latin 1246
 
1.7%
Han 1
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2042
 
3.3%
1608
 
2.6%
1591
 
2.5%
1560
 
2.5%
1497
 
2.4%
1449
 
2.3%
1445
 
2.3%
1437
 
2.3%
1201
 
1.9%
1121
 
1.8%
Other values (499) 47851
76.2%
Latin
ValueCountFrequency (%)
e 207
16.6%
S 148
 
11.9%
I 88
 
7.1%
K 82
 
6.6%
C 81
 
6.5%
E 58
 
4.7%
V 53
 
4.3%
L 46
 
3.7%
W 45
 
3.6%
B 40
 
3.2%
Other values (37) 398
31.9%
Common
ValueCountFrequency (%)
1 1635
19.0%
( 1190
13.8%
) 1190
13.8%
2 1149
13.4%
3 544
 
6.3%
5 400
 
4.7%
- 355
 
4.1%
344
 
4.0%
0 341
 
4.0%
4 340
 
4.0%
Other values (9) 1110
12.9%
Han
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 62802
86.4%
ASCII 9818
 
13.5%
Number Forms 26
 
< 0.1%
CJK 1
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2042
 
3.3%
1608
 
2.6%
1591
 
2.5%
1560
 
2.5%
1497
 
2.4%
1449
 
2.3%
1445
 
2.3%
1437
 
2.3%
1201
 
1.9%
1121
 
1.8%
Other values (499) 47851
76.2%
ASCII
ValueCountFrequency (%)
1 1635
16.7%
( 1190
12.1%
) 1190
12.1%
2 1149
11.7%
3 544
 
5.5%
5 400
 
4.1%
- 355
 
3.6%
344
 
3.5%
0 341
 
3.5%
4 340
 
3.5%
Other values (53) 2330
23.7%
Number Forms
ValueCountFrequency (%)
16
61.5%
5
 
19.2%
5
 
19.2%
CJK
ValueCountFrequency (%)
1
100.0%

SMOEU
Real number (ℝ)

Distinct: 2596
Distinct (%): 26.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 71.386963
Minimum: 11.72
Maximum: 270.25
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 11.72
5-th percentile: 24.4095
Q1: 59.1275
Median: 72.6
Q3: 84.94
95-th percentile: 120.82
Maximum: 270.25
Range: 258.53
Interquartile range (IQR): 25.8125

Descriptive statistics

Standard deviation: 27.487324
Coefficient of variation (CV): 0.38504684
Kurtosis: 3.5277313
Mean: 71.386963
Median Absolute Deviation (MAD): 12.64
Skewness: 0.85249405
Sum: 713869.63
Variance: 755.553
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
84.99 | 382 | 3.8%
84.98 | 295 | 2.9%
59.99 | 212 | 2.1%
84.97 | 212 | 2.1%
84.96 | 208 | 2.1%
59.97 | 140 | 1.4%
59.98 | 133 | 1.3%
84.94 | 130 | 1.3%
59.94 | 125 | 1.2%
59.96 | 123 | 1.2%
Other values (2586) | 8040 | 80.4%

Value | Count | Frequency (%)
11.72 | 2 | < 0.1%
11.74 | 1 | < 0.1%
11.96 | 1 | < 0.1%
12.02 | 1 | < 0.1%
12.03 | 2 | < 0.1%
12.04 | 2 | < 0.1%
12.1 | 1 | < 0.1%
12.11 | 1 | < 0.1%
12.16 | 1 | < 0.1%
12.19 | 4 | < 0.1%

Value | Count | Frequency (%)
270.25 | 2 | < 0.1%
258.28 | 1 | < 0.1%
244.55 | 1 | < 0.1%
244.22 | 1 | < 0.1%
244.07 | 1 | < 0.1%
242.34 | 1 | < 0.1%
240.98 | 1 | < 0.1%
239.19 | 1 | < 0.1%
235.31 | 1 | < 0.1%
226.45 | 1 | < 0.1%

SAPR
Text

Distinct: 1085
Distinct (%): 10.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 7
Median length: 6
Mean length: 6.0165
Min length: 5

Characters and Unicode

Total characters: 60165
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 378
Unique (%): 3.8%

Sample

1st row: 113,000
2nd row: 6,500
3rd row: 47,000
4th row: 37,800
5th row: 38,500

Value | Count | Frequency (%)
45,000 | 165 | 1.7%
30,000 | 164 | 1.6%
50,000 | 158 | 1.6%
40,000 | 157 | 1.6%
60,000 | 142 | 1.4%
35,000 | 141 | 1.4%
28,000 | 109 | 1.1%
25,000 | 108 | 1.1%
31,000 | 94 | 0.9%
55,000 | 94 | 0.9%
Other values (1075) | 8668 | 86.7%

Most occurring characters

ValueCountFrequency (%)
0 27437
45.6%
, 10000
 
16.6%
5 4312
 
7.2%
1 3093
 
5.1%
2 2971
 
4.9%
3 2916
 
4.8%
4 2501
 
4.2%
8 1845
 
3.1%
7 1820
 
3.0%
6 1780
 
3.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50165
83.4%
Other Punctuation 10000
 
16.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 27437
54.7%
5 4312
 
8.6%
1 3093
 
6.2%
2 2971
 
5.9%
3 2916
 
5.8%
4 2501
 
5.0%
8 1845
 
3.7%
7 1820
 
3.6%
6 1780
 
3.5%
9 1490
 
3.0%
Other Punctuation
ValueCountFrequency (%)
, 10000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 60165
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 27437
45.6%
, 10000
 
16.6%
5 4312
 
7.2%
1 3093
 
5.1%
2 2971
 
4.9%
3 2916
 
4.8%
4 2501
 
4.2%
8 1845
 
3.1%
7 1820
 
3.0%
6 1780
 
3.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 60165
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 27437
45.6%
, 10000
 
16.6%
5 4312
 
7.2%
1 3093
 
5.1%
2 2971
 
4.9%
3 2916
 
4.8%
4 2501
 
4.2%
8 1845
 
3.1%
7 1820
 
3.0%
6 1780
 
3.0%

MOLIT_POTVALE_AMT
Text

Distinct: 1696
Distinct (%): 17.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 7
Median length: 6
Mean length: 5.9025
Min length: 5

Characters and Unicode

Total characters: 59025
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 526
Unique (%): 5.3%

Sample

1st row: 104,900
2nd row: 3,500
3rd row: 45,300
4th row: 30,300
5th row: 34,200

Value | Count | Frequency (%)
13,500 | 37 | 0.4%
23,200 | 37 | 0.4%
18,600 | 32 | 0.3%
10,500 | 32 | 0.3%
29,800 | 31 | 0.3%
28,800 | 30 | 0.3%
22,000 | 30 | 0.3%
40,500 | 30 | 0.3%
19,300 | 30 | 0.3%
10,700 | 29 | 0.3%
Other values (1686) | 9682 | 96.8%

Most occurring characters

ValueCountFrequency (%)
0 20938
35.5%
, 10000
16.9%
1 4384
 
7.4%
2 3963
 
6.7%
3 3666
 
6.2%
4 3307
 
5.6%
5 2992
 
5.1%
6 2549
 
4.3%
9 2437
 
4.1%
8 2436
 
4.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 49025
83.1%
Other Punctuation 10000
 
16.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 20938
42.7%
1 4384
 
8.9%
2 3963
 
8.1%
3 3666
 
7.5%
4 3307
 
6.7%
5 2992
 
6.1%
6 2549
 
5.2%
9 2437
 
5.0%
8 2436
 
5.0%
7 2353
 
4.8%
Other Punctuation
ValueCountFrequency (%)
, 10000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 59025
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 20938
35.5%
, 10000
16.9%
1 4384
 
7.4%
2 3963
 
6.7%
3 3666
 
6.2%
4 3307
 
5.6%
5 2992
 
5.1%
6 2549
 
4.3%
9 2437
 
4.1%
8 2436
 
4.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 59025
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 20938
35.5%
, 10000
16.9%
1 4384
 
7.4%
2 3963
 
6.7%
3 3666
 
6.2%
4 3307
 
5.6%
5 2992
 
5.1%
6 2549
 
4.3%
9 2437
 
4.1%
8 2436
 
4.1%

POTVALE_RLT_RT
Text

Distinct: 99
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 3
Mean length: 3.0396
Min length: 3

Characters and Unicode

Total characters: 30396
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 13
Unique (%): 0.1%

Sample

1st row: 93%
2nd row: 54%
3rd row: 96%
4th row: 80%
5th row: 89%

Value | Count | Frequency (%)
72 | 295 | 2.9%
67 | 293 | 2.9%
68 | 284 | 2.8%
69 | 283 | 2.8%
71 | 281 | 2.8%
76 | 280 | 2.8%
66 | 272 | 2.7%
64 | 267 | 2.7%
70 | 264 | 2.6%
65 | 263 | 2.6%
Other values (89) | 7218 | 72.2%

Most occurring characters

ValueCountFrequency (%)
% 10000
32.9%
6 3579
 
11.8%
7 3568
 
11.7%
8 3066
 
10.1%
9 2208
 
7.3%
5 2061
 
6.8%
1 1434
 
4.7%
0 1361
 
4.5%
4 1130
 
3.7%
3 999
 
3.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 20396
67.1%
Other Punctuation 10000
32.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
6 3579
17.5%
7 3568
17.5%
8 3066
15.0%
9 2208
10.8%
5 2061
10.1%
1 1434
7.0%
0 1361
 
6.7%
4 1130
 
5.5%
3 999
 
4.9%
2 990
 
4.9%
Other Punctuation
ValueCountFrequency (%)
% 10000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 30396
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
% 10000
32.9%
6 3579
 
11.8%
7 3568
 
11.7%
8 3066
 
10.1%
9 2208
 
7.3%
5 2061
 
6.8%
1 1434
 
4.7%
0 1361
 
4.5%
4 1130
 
3.7%
3 999
 
3.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 30396
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
% 10000
32.9%
6 3579
 
11.8%
7 3568
 
11.7%
8 3066
 
10.1%
9 2208
 
7.3%
5 2061
 
6.8%
1 1434
 
4.7%
0 1361
 
4.5%
4 1130
 
3.7%
3 999
 
3.3%

APT_ACCTO_POTVALE_RLT_RT
Text

Distinct: 83
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 3
Mean length: 3.0058
Min length: 3

Characters and Unicode

Total characters: 30058
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 13
Unique (%): 0.1%

Sample

1st row: 89%
2nd row: 59%
3rd row: 86%
4th row: 75%
5th row: 84%

Value | Count | Frequency (%)
81 | 403 | 4.0%
71 | 386 | 3.9%
69 | 362 | 3.6%
79 | 348 | 3.5%
70 | 343 | 3.4%
75 | 335 | 3.4%
73 | 330 | 3.3%
76 | 320 | 3.2%
74 | 316 | 3.2%
72 | 315 | 3.1%
Other values (73) | 6542 | 65.4%

Most occurring characters

ValueCountFrequency (%)
% 10000
33.3%
7 4163
13.8%
6 3518
 
11.7%
8 3412
 
11.4%
9 2022
 
6.7%
5 1664
 
5.5%
1 1285
 
4.3%
4 1036
 
3.4%
3 1035
 
3.4%
0 1027
 
3.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 20058
66.7%
Other Punctuation 10000
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
7 4163
20.8%
6 3518
17.5%
8 3412
17.0%
9 2022
10.1%
5 1664
 
8.3%
1 1285
 
6.4%
4 1036
 
5.2%
3 1035
 
5.2%
0 1027
 
5.1%
2 896
 
4.5%
Other Punctuation
ValueCountFrequency (%)
% 10000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 30058
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
% 10000
33.3%
7 4163
13.8%
6 3518
 
11.7%
8 3412
 
11.4%
9 2022
 
6.7%
5 1664
 
5.5%
1 1285
 
4.3%
4 1036
 
3.4%
3 1035
 
3.4%
0 1027
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 30058
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
% 10000
33.3%
7 4163
13.8%
6 3518
 
11.7%
8 3412
 
11.4%
9 2022
 
6.7%
5 1664
 
5.5%
1 1285
 
4.3%
4 1036
 
3.4%
3 1035
 
3.4%
0 1027
 
3.4%

EMD_ACCTO_POTVALE_RLT_RT
Categorical

HIGH CORRELATION 

Distinct: 48
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
77%: 654
72%: 588
78%: 515
81%: 478
70%: 435
Other values (43): 7330

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 79%
2nd row: 61%
3rd row: 81%
4th row: 71%
5th row: 85%

Common Values

Value | Count | Frequency (%)
77% | 654 | 6.5%
72% | 588 | 5.9%
78% | 515 | 5.1%
81% | 478 | 4.8%
70% | 435 | 4.3%
71% | 431 | 4.3%
73% | 431 | 4.3%
82% | 416 | 4.2%
76% | 416 | 4.2%
74% | 402 | 4.0%
Other values (38) | 5234 | 52.3%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
77 | 654 | 6.5%
72 | 588 | 5.9%
78 | 515 | 5.1%
81 | 478 | 4.8%
70 | 435 | 4.3%
71 | 431 | 4.3%
73 | 431 | 4.3%
82 | 416 | 4.2%
76 | 416 | 4.2%
74 | 402 | 4.0%
Other values (38) | 5234 | 52.3%

Interactions


Correlations

Columns: APT_POTVALE_RLT_RT_CMPR_INFO_NO, SIDO_NM, SIGNGU_NM, TNSHP_NM, SMOEU, POTVALE_RLT_RT, APT_ACCTO_POTVALE_RLT_RT, EMD_ACCTO_POTVALE_RLT_RT
APT_POTVALE_RLT_RT_CMPR_INFO_NO: 1.000, 0.917, 0.996, 0.960, 0.266, 0.302, 0.501, 0.744
SIDO_NM: 0.917, 1.000, 0.999, NaN, 0.247, 0.179, 0.323, 0.571
SIGNGU_NM: 0.996, 0.999, 1.000, 1.000, 0.445, 0.487, 0.735, 0.923
TNSHP_NM: 0.960, NaN, 1.000, 1.000, 0.335, 0.411, 0.715, 0.909
SMOEU: 0.266, 0.247, 0.445, 0.335, 1.000, 0.312, 0.463, 0.334
POTVALE_RLT_RT: 0.302, 0.179, 0.487, 0.411, 0.312, 1.000, 0.968, 0.648
APT_ACCTO_POTVALE_RLT_RT: 0.501, 0.323, 0.735, 0.715, 0.463, 0.968, 1.000, 0.842
EMD_ACCTO_POTVALE_RLT_RT: 0.744, 0.571, 0.923, 0.909, 0.334, 0.648, 0.842, 1.000

Columns: EMD_ACCTO_POTVALE_RLT_RT, SIDO_NM, TNSHP_NM
EMD_ACCTO_POTVALE_RLT_RT: 1.000, 0.323, 0.521
SIDO_NM: 0.323, 1.000, 1.000
TNSHP_NM: 0.521, 1.000, 1.000

Columns: APT_POTVALE_RLT_RT_CMPR_INFO_NO, SMOEU, SIDO_NM, TNSHP_NM, EMD_ACCTO_POTVALE_RLT_RT
APT_POTVALE_RLT_RT_CMPR_INFO_NO: 1.000, 0.031, 0.883, 0.882, 0.360
SMOEU: 0.031, 1.000, 0.152, 0.137, 0.120
SIDO_NM: 0.883, 0.152, 1.000, 1.000, 0.323
TNSHP_NM: 0.882, 0.137, 1.000, 1.000, 0.521
EMD_ACCTO_POTVALE_RLT_RT: 0.360, 0.120, 0.323, 0.521, 1.000
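
Several of the coefficients above pair two categorical or text fields, such as the regional name columns. A common association measure for such pairs is Cramér's V. Whether this report used that measure, a bias-corrected variant, or something else is not stated in the text, so the following is only a sketch of the plain, uncorrected form, with a hypothetical function name. The inputs are the SIDO_NM and SIGNGU_NM values from the five sample rows shown earlier in the report.

```python
import math
from collections import Counter
from typing import Hashable, Sequence

def cramers_v(xs: Sequence[Hashable], ys: Sequence[Hashable]) -> float:
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed counts per (x, y) pair
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for r, r_count in row_totals.items():
        for c, c_count in col_totals.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association
    return math.sqrt(chi2 / (n * k))

# SIDO_NM and SIGNGU_NM for the five sample rows in the report.
sido = ["서울특별시", "경기도", "경기도", "인천광역시", "경기도"]
signgu = ["강서구", "포천시", "안양시", "계양구", "수원시"]
v_regions = cramers_v(sido, signgu)

# Two columns with no association at all, for contrast.
v_independent = cramers_v(["A", "A", "B", "B"], ["x", "y", "x", "y"])
print(v_regions, v_independent)
```

In the five sample rows each SIGNGU_NM value maps to exactly one SIDO_NM value, so the measure comes out at its maximum. That hierarchical nesting of region names is a plausible reason for the near-1.000 entries between the regional columns, rather than an independent statistical finding.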

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

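First rows

(index) | APT_POTVALE_RLT_RT_CMPR_INFO_NO | SIDO_NM | SIGNGU_NM | TNSHP_NM | EMD_NM | HONO_NM | APT_NM | SMOEU | SAPR | MOLIT_POTVALE_AMT | POTVALE_RLT_RT | APT_ACCTO_POTVALE_RLT_RT | EMD_ACCTO_POTVALE_RLT_RT
618 | 619 | 서울특별시 | 강서구 | <NA> | 마곡동 | 748 | 마곡13단지힐스테이트마스터 | 84.98 | 113,000 | 104,900 | 93% | 89% | 79%
21919 | 21920 | 경기도 | 포천시 | <NA> | 신북면 | 101-6 | 신포천 | 50.67 | 6,500 | 3,500 | 54% | 59% | 61%
15591 | 15592 | 경기도 | 안양시 | 동안구 | 관양동 | 1589 | 한가람(세경) | 49.68 | 47,000 | 45,300 | 96% | 86% | 81%
3680 | 3681 | 인천광역시 | 계양구 | <NA> | 계산동 | 62-1 | 현대 | 84.95 | 37,800 | 30,300 | 80% | 75% | 71%
12408 | 12409 | 경기도 | 수원시 | 권선구 | 호매실동 | 1408 | 엘에이치호매실스타힐스 | 59.98 | 38,500 | 34,200 | 89% | 84% | 85%
9396 | 9397 | 경기도 | 군포시 | <NA> | 당동 | 954 | 무지개마을대림 | 84.87 | 55,000 | 42,300 | 77% | 79% | 70%
21925 | 21926 | 경기도 | 포천시 | <NA> | 신북면 | 101-6 | 신포천 | 44.22 | 5,850 | 3,280 | 56% | 59% | 61%
9430 | 9431 | 경기도 | 군포시 | <NA> | 대야미동 | 652-7 | 신안실크밸리(28-7) | 59.92 | 37,000 | 23,900 | 65% | 65% | 73%
20539 | 20540 | 경기도 | 평택시 | <NA> | 서정동 | 787-2 | 대옥3 | 52.23 | 13,700 | 5,860 | 43% | 44% | 57%
21320 | 21321 | 경기도 | 평택시 | <NA> | 청북읍 | 1104 | 부영사랑으로2단지 | 59.96 | 15,900 | 13,500 | 85% | 74% | 71%

Last rows

(index) | APT_POTVALE_RLT_RT_CMPR_INFO_NO | SIDO_NM | SIGNGU_NM | TNSHP_NM | EMD_NM | HONO_NM | APT_NM | SMOEU | SAPR | MOLIT_POTVALE_AMT | POTVALE_RLT_RT | APT_ACCTO_POTVALE_RLT_RT | EMD_ACCTO_POTVALE_RLT_RT
898 | 899 | 서울특별시 | 광진구 | <NA> | 화양동 | 110-37 | 화양타워 | 105.04 | 74,000 | 41,600 | 56% | 56% | 67%
1224 | 1225 | 서울특별시 | 노원구 | <NA> | 상계동 | 1320 | 미라보(성림) | 60.0 | 42,500 | 31,200 | 73% | 73% | 77%
7484 | 7485 | 인천광역시 | 중구 | <NA> | 인현동 | 3-2 | 뉴코아 | 60.0 | 13,000 | 8,440 | 65% | 59% | 59%
16346 | 16347 | 경기도 | 양주시 | <NA> | 옥정동 | 1051 | e편한세상옥정에듀써밋 | 74.98 | 38,800 | 36,000 | 93% | 91% | 91%
10998 | 10999 | 경기도 | 부천시 | <NA> | 고강동 | 367-5 | 건일 | 45.0 | 17,500 | 11,100 | 63% | 59% | 63%
22687 | 22688 | 경기도 | 화성시 | <NA> | 병점동 | 859 | 병점역에듀포레 | 75.99 | 31,500 | 25,600 | 81% | 82% | 80%
19946 | 19947 | 경기도 | 파주시 | <NA> | 문산읍 | 1352 | 양우내안애3단지 | 59.93 | 19,300 | 11,500 | 60% | 59% | 58%
6458 | 6459 | 인천광역시 | 서구 | <NA> | 석남동 | 559 | 경인 | 44.82 | 13,300 | 7,960 | 60% | 60% | 70%
11203 | 11204 | 경기도 | 부천시 | <NA> | 상동 | 413 | 상동스카이뷰자이 | 84.96 | 78,000 | 59,800 | 77% | 73% | 80%
21930 | 21931 | 경기도 | 포천시 | <NA> | 신북면 | 101-3 | 후레쉬빌 | 49.97 | 8,450 | 5,080 | 60% | 64% | 61%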
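
SAPR, MOLIT_POTVALE_AMT, and the three rate columns are profiled as Text because their values carry thousands separators or a percent sign. The sketch below parses such strings into numbers using only the standard library. It also checks an observation drawn from the sample rows, not from any documentation: POTVALE_RLT_RT appears to equal MOLIT_POTVALE_AMT divided by SAPR, rounded to a whole percent. The function names are illustrative, and the relationship should be treated as an assumption until confirmed against the data dictionary.

```python
def parse_amount(text: str) -> int:
    """Convert a string such as '113,000' to the integer 113000."""
    return int(text.strip().replace(",", ""))

def parse_percent(text: str) -> float:
    """Convert a string such as '93%' to the fraction 0.93."""
    return float(text.strip().rstrip("%")) / 100.0

def ratio_matches(sapr: str, molit: str, reported: str) -> bool:
    """Check whether MOLIT / SAPR rounds to the reported whole percent."""
    ratio_pct = 100.0 * parse_amount(molit) / parse_amount(sapr)
    return round(ratio_pct) == int(reported.strip().rstrip("%"))

# (SAPR, MOLIT_POTVALE_AMT, POTVALE_RLT_RT) from the first five sample rows.
rows = [
    ("113,000", "104,900", "93%"),
    ("6,500", "3,500", "54%"),
    ("47,000", "45,300", "96%"),
    ("37,800", "30,300", "80%"),
    ("38,500", "34,200", "89%"),
]
all_match = all(ratio_matches(*row) for row in rows)
print(all_match)
```

Converting these columns before profiling would let the report treat them as numeric, which gives quantiles and histograms instead of character-level statistics.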