Overview

Dataset statistics

Number of variables: 10
Number of observations: 10000
Missing cells: 9997
Missing cells (%): 10.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 908.2 KiB
Average record size in memory: 93.0 B
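
The missing-cell totals can be cross-checked against the per-variable `Missing` counts that appear in the Variables section. The sketch below assumes that the three non-zero counts (3, 83 and 9911) are the only contributors; the assignment of the first two counts to 호_명칭 and 평형_구분_명 is inferred from the column order in the Description line, not stated by the report.

```python
# Per-variable missing counts read from the Variables section.
# Assumption: every variable not listed here has 0 missing values,
# and the first two names are inferred from the column order.
missing_per_variable = {
    "호_명칭": 3,
    "평형_구분_명": 83,
    "관리_건축물대장_참조_pk": 9911,
}
n_variables = 10
n_observations = 10_000

missing_cells = sum(missing_per_variable.values())
missing_pct = 100 * missing_cells / (n_variables * n_observations)

print(missing_cells)          # 9997
print(f"{missing_pct:.1f}%")  # 10.0%
```

Nearly all of the missing cells therefore come from a single column, so the 10.0% headline figure says little about the other nine variables.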

Variable types

Text: 5
Numeric: 3
Categorical: 2

Dataset

Description: 관리_호별_명세_pk, 관리_동별_개요_pk, 호_번호, 호_명칭, 평형_구분_명, 층_번호, 층_구분_코드, 관리_건축물대장_참조_pk, 변경_구분_코드, 작업_일자
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15674/S/1/datasetView.do

Alerts

변경_구분_코드 is highly overall correlated with 호_번호 and 3 other fields (High correlation)
층_구분_코드 is highly overall correlated with 변경_구분_코드 (High correlation)
호_번호 is highly overall correlated with 변경_구분_코드 (High correlation)
층_번호 is highly overall correlated with 변경_구분_코드 (High correlation)
작업_일자 is highly overall correlated with 변경_구분_코드 (High correlation)
층_구분_코드 is highly imbalanced (92.6%) (Imbalance)
변경_구분_코드 is highly imbalanced (92.5%) (Imbalance)
관리_건축물대장_참조_pk has 9911 (99.1%) missing values (Missing)
관리_호별_명세_pk has unique values (Unique)
호_번호 has 8076 (80.8%) zeros (Zeros)
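
The two imbalance percentages can be reproduced from the category counts in the Common Values tables. The sketch below assumes the score is one minus the normalised Shannon entropy of the class frequencies. The report does not state the formula, and `imbalance_score` is an illustrative helper, not a library function.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the class
    frequencies and k is the number of classes. 0 means a uniform split."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)

floor_code = imbalance_score([9910, 90])    # 층_구분_코드: "20" vs "10"
change_code = imbalance_score([9908, 92])   # 변경_구분_코드: "<NA>" vs "1"

print(f"{floor_code:.1%}")   # 92.6%
print(f"{change_code:.1%}")  # 92.5%
```

Because the entropy is divided by log(k), the logarithm base does not affect the result.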

Reproduction

Analysis started: 2024-05-11 02:08:15.074658
Analysis finished: 2024-05-11 02:08:20.576925
Duration: 5.5 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

관리_호별_명세_pk
Text

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 15
Mean length: 18.3254
Min length: 15

Characters and Unicode

Total characters: 183254
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
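
As a rough illustration of how such character properties can be read, the sketch below uses Python's standard `unicodedata` module on the first sample value of this variable. It is not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

sample = "11230-1000000000000000054318"  # 1st sample row of this variable

# General category of every character: "Nd" is Decimal Number,
# "Pd" is Dash Punctuation.
categories = Counter(unicodedata.category(ch) for ch in sample)

print(sorted(categories))  # ['Nd', 'Pd']
print(categories["Pd"])    # 1
```

The two categories found here correspond to the "Decimal Number" and "Dash Punctuation" rows of the category tables.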

Unique

Unique: 10000
Unique (%): 100.0%

Sample

1st row: 11230-1000000000000000054318
2nd row: 11350-1000000000000000521372
3rd row: 11000-100017041
4th row: 11000-100006342
5th row: 11380-1000000000000000594147

Value | Count | Frequency (%)
11230-1000000000000000054318 | 1 | < 0.1%
11000-100036835 | 1 | < 0.1%
11350-100040673 | 1 | < 0.1%
11000-100035002 | 1 | < 0.1%
11000-100043739 | 1 | < 0.1%
11215-1000000000000000371670 | 1 | < 0.1%
11350-100041472 | 1 | < 0.1%
11000-100010812 | 1 | < 0.1%
11000-100030430 | 1 | < 0.1%
11380-1000000000000000593763 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Most occurring characters

ValueCountFrequency (%)
0 84338
46.0%
1 37133
20.3%
- 10000
 
5.5%
2 9118
 
5.0%
3 9004
 
4.9%
5 6543
 
3.6%
4 5758
 
3.1%
7 5705
 
3.1%
9 5526
 
3.0%
6 5117
 
2.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 173254
94.5%
Dash Punctuation 10000
 
5.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 84338
48.7%
1 37133
21.4%
2 9118
 
5.3%
3 9004
 
5.2%
5 6543
 
3.8%
4 5758
 
3.3%
7 5705
 
3.3%
9 5526
 
3.2%
6 5117
 
3.0%
8 5012
 
2.9%
Dash Punctuation
ValueCountFrequency (%)
- 10000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 183254
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 84338
46.0%
1 37133
20.3%
- 10000
 
5.5%
2 9118
 
5.0%
3 9004
 
4.9%
5 6543
 
3.6%
4 5758
 
3.1%
7 5705
 
3.1%
9 5526
 
3.0%
6 5117
 
2.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 183254
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 84338
46.0%
1 37133
20.3%
- 10000
 
5.5%
2 9118
 
5.0%
3 9004
 
4.9%
5 6543
 
3.6%
4 5758
 
3.1%
7 5705
 
3.1%
9 5526
 
3.0%
6 5117
 
2.8%

관리_동별_개요_pk
Text

Distinct: 1399
Distinct (%): 14.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 15
Mean length: 18.3254
Min length: 15

Characters and Unicode

Total characters: 183254
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 170
Unique (%): 1.7%

Sample

1st row: 11230-1000000000000000007437
2nd row: 11350-1000000000000000097236
3rd row: 11000-100004454
4th row: 11000-100003664
5th row: 11380-1000000000000000106283

Value | Count | Frequency (%)
11230-1000000000000000101255 | 139 | 1.4%
11305-100006084 | 91 | 0.9%
11140-1000000000000000066327 | 81 | 0.8%
11230-1000000000000000101254 | 78 | 0.8%
11170-100006321 | 75 | 0.8%
11140-1000000000000000053703 | 61 | 0.6%
11170-100005821 | 58 | 0.6%
11170-100005902 | 58 | 0.6%
11170-100005942 | 57 | 0.6%
11170-100005941 | 56 | 0.6%
Other values (1389) | 9246 | 92.5%

Most occurring characters

ValueCountFrequency (%)
0 90490
49.4%
1 37478
20.5%
- 10000
 
5.5%
2 7872
 
4.3%
3 7588
 
4.1%
5 5711
 
3.1%
6 5662
 
3.1%
4 4943
 
2.7%
7 4788
 
2.6%
8 4571
 
2.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 173254
94.5%
Dash Punctuation 10000
 
5.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 90490
52.2%
1 37478
21.6%
2 7872
 
4.5%
3 7588
 
4.4%
5 5711
 
3.3%
6 5662
 
3.3%
4 4943
 
2.9%
7 4788
 
2.8%
8 4571
 
2.6%
9 4151
 
2.4%
Dash Punctuation
ValueCountFrequency (%)
- 10000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 183254
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 90490
49.4%
1 37478
20.5%
- 10000
 
5.5%
2 7872
 
4.3%
3 7588
 
4.1%
5 5711
 
3.1%
6 5662
 
3.1%
4 4943
 
2.7%
7 4788
 
2.6%
8 4571
 
2.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 183254
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 90490
49.4%
1 37478
20.5%
- 10000
 
5.5%
2 7872
 
4.3%
3 7588
 
4.1%
5 5711
 
3.1%
6 5662
 
3.1%
4 4943
 
2.7%
7 4788
 
2.6%
8 4571
 
2.5%

호_번호
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 452
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 34.9901
Minimum: 0
Maximum: 752
Zeros: 8076
Zeros (%): 80.8%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 309.05
Maximum: 752
Range: 752
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 102.33018
Coefficient of variation (CV): 2.9245465
Kurtosis: 11.63231
Mean: 34.9901
Median Absolute Deviation (MAD): 0
Skewness: 3.4276364
Sum: 349901
Variance: 10471.465
Monotonicity: Not monotonic
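
Two of these descriptive statistics are derived from the others: the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. A short check using the rounded figures listed for 호_번호 (small differences in the last digits are expected):

```python
mean = 34.9901
std = 102.33018

cv = std / mean       # coefficient of variation
variance = std ** 2   # variance is the squared standard deviation

print(round(cv, 4))        # 2.9245
print(round(variance, 1))  # 10471.5
```

A CV near 3 reflects the heavy concentration at zero (80.8% of rows) combined with a long right tail up to 752.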
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 8076 | 80.8%
404 | 22 | 0.2%
407 | 20 | 0.2%
403 | 20 | 0.2%
4 | 19 | 0.2%
30 | 19 | 0.2%
3 | 18 | 0.2%
7 | 17 | 0.2%
401 | 17 | 0.2%
315 | 16 | 0.2%
Other values (442) | 1756 | 17.6%

Value | Count | Frequency (%)
0 | 8076 | 80.8%
1 | 12 | 0.1%
2 | 14 | 0.1%
3 | 18 | 0.2%
4 | 19 | 0.2%
5 | 11 | 0.1%
6 | 12 | 0.1%
7 | 17 | 0.2%
8 | 15 | 0.1%
9 | 7 | 0.1%

Value | Count | Frequency (%)
752 | 1 | < 0.1%
746 | 1 | < 0.1%
741 | 1 | < 0.1%
707 | 1 | < 0.1%
705 | 1 | < 0.1%
701 | 1 | < 0.1%
697 | 1 | < 0.1%
692 | 1 | < 0.1%
687 | 1 | < 0.1%
667 | 1 | < 0.1%

호_명칭
Text

Distinct: 1340
Distinct (%): 13.4%
Missing: 3
Missing (%): < 0.1%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 3
Mean length: 3.5331599
Min length: 1

Characters and Unicode

Total characters: 35321
Distinct characters: 30
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2

Unique

Unique: 854
Unique (%): 8.5%

Sample

1st row: 1003
2nd row: 202
3rd row: 902
4th row: 103
5th row: 302

Value | Count | Frequency (%)
202 | 141 | 1.4%
401 | 136 | 1.4%
602 | 132 | 1.3%
301 | 132 | 1.3%
303 | 124 | 1.2%
501 | 120 | 1.2%
504 | 119 | 1.2%
403 | 118 | 1.2%
203 | 116 | 1.2%
402 | 116 | 1.2%
Other values (1320) | 8769 | 87.5%

Most occurring characters

ValueCountFrequency (%)
0 9947
28.2%
1 6854
19.4%
2 4467
12.6%
3 3345
 
9.5%
4 2967
 
8.4%
5 1984
 
5.6%
6 1568
 
4.4%
7 1248
 
3.5%
8 1057
 
3.0%
9 985
 
2.8%
Other values (20) 899
 
2.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 34422
97.5%
Other Letter 586
 
1.7%
Uppercase Letter 266
 
0.8%
Space Separator 26
 
0.1%
Dash Punctuation 21
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
395
67.4%
156
 
26.6%
6
 
1.0%
5
 
0.9%
5
 
0.9%
3
 
0.5%
3
 
0.5%
3
 
0.5%
3
 
0.5%
3
 
0.5%
Other values (2) 4
 
0.7%
Decimal Number
ValueCountFrequency (%)
0 9947
28.9%
1 6854
19.9%
2 4467
13.0%
3 3345
 
9.7%
4 2967
 
8.6%
5 1984
 
5.8%
6 1568
 
4.6%
7 1248
 
3.6%
8 1057
 
3.1%
9 985
 
2.9%
Uppercase Letter
ValueCountFrequency (%)
B 115
43.2%
C 59
22.2%
A 59
22.2%
D 30
 
11.3%
T 2
 
0.8%
H 1
 
0.4%
Space Separator
ValueCountFrequency (%)
26
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 21
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 34469
97.6%
Hangul 586
 
1.7%
Latin 266
 
0.8%

Most frequent character per script

Common
ValueCountFrequency (%)
0 9947
28.9%
1 6854
19.9%
2 4467
13.0%
3 3345
 
9.7%
4 2967
 
8.6%
5 1984
 
5.8%
6 1568
 
4.5%
7 1248
 
3.6%
8 1057
 
3.1%
9 985
 
2.9%
Other values (2) 47
 
0.1%
Hangul
ValueCountFrequency (%)
395
67.4%
156
 
26.6%
6
 
1.0%
5
 
0.9%
5
 
0.9%
3
 
0.5%
3
 
0.5%
3
 
0.5%
3
 
0.5%
3
 
0.5%
Other values (2) 4
 
0.7%
Latin
ValueCountFrequency (%)
B 115
43.2%
C 59
22.2%
A 59
22.2%
D 30
 
11.3%
T 2
 
0.8%
H 1
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 34735
98.3%
Hangul 586
 
1.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 9947
28.6%
1 6854
19.7%
2 4467
12.9%
3 3345
 
9.6%
4 2967
 
8.5%
5 1984
 
5.7%
6 1568
 
4.5%
7 1248
 
3.6%
8 1057
 
3.0%
9 985
 
2.8%
Other values (8) 313
 
0.9%
Hangul
ValueCountFrequency (%)
395
67.4%
156
 
26.6%
6
 
1.0%
5
 
0.9%
5
 
0.9%
3
 
0.5%
3
 
0.5%
3
 
0.5%
3
 
0.5%
3
 
0.5%
Other values (2) 4
 
0.7%

평형_구분_명
Text

Distinct: 920
Distinct (%): 9.3%
Missing: 83
Missing (%): 0.8%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 3
Mean length: 3.3445598
Min length: 1

Characters and Unicode

Total characters: 33168
Distinct characters: 91
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3

Unique

Unique: 497
Unique (%): 5.0%

Sample

1st row: 84C
2nd row: 97A
3rd row: 49C
4th row: 59B-1
5th row: 39

Value | Count | Frequency (%)
84a | 1099 | 11.0%
59a | 946 | 9.5%
84b | 585 | 5.9%
49a | 386 | 3.9%
59b | 284 | 2.9%
84c | 277 | 2.8%
49b | 262 | 2.6%
59 | 237 | 2.4%
39a | 199 | 2.0%
74a | 184 | 1.8%
Other values (899) | 5497 | 55.2%

Most occurring characters

ValueCountFrequency (%)
4 5255
15.8%
A 4174
12.6%
9 3760
11.3%
8 3197
9.6%
1 2681
8.1%
5 2546
 
7.7%
B 2077
 
6.3%
3 1152
 
3.5%
2 1002
 
3.0%
- 954
 
2.9%
Other values (81) 6370
19.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 21415
64.6%
Uppercase Letter 8803
26.5%
Other Letter 1376
 
4.1%
Dash Punctuation 954
 
2.9%
Lowercase Letter 217
 
0.7%
Other Punctuation 120
 
0.4%
Open Punctuation 117
 
0.4%
Close Punctuation 117
 
0.4%
Space Separator 39
 
0.1%
Other Symbol 10
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
158
 
11.5%
133
 
9.7%
73
 
5.3%
72
 
5.2%
67
 
4.9%
67
 
4.9%
65
 
4.7%
63
 
4.6%
60
 
4.4%
60
 
4.4%
Other values (33) 558
40.6%
Uppercase Letter
ValueCountFrequency (%)
A 4174
47.4%
B 2077
23.6%
C 872
 
9.9%
D 475
 
5.4%
E 239
 
2.7%
O 185
 
2.1%
S 168
 
1.9%
F 152
 
1.7%
T 148
 
1.7%
M 76
 
0.9%
Other values (12) 237
 
2.7%
Decimal Number
ValueCountFrequency (%)
4 5255
24.5%
9 3760
17.6%
8 3197
14.9%
1 2681
12.5%
5 2546
11.9%
3 1152
 
5.4%
2 1002
 
4.7%
0 771
 
3.6%
7 647
 
3.0%
6 404
 
1.9%
Lowercase Letter
ValueCountFrequency (%)
a 94
43.3%
b 25
 
11.5%
e 21
 
9.7%
p 21
 
9.7%
y 21
 
9.7%
t 21
 
9.7%
d 9
 
4.1%
c 5
 
2.3%
Other Punctuation
ValueCountFrequency (%)
. 98
81.7%
' 21
 
17.5%
, 1
 
0.8%
Dash Punctuation
ValueCountFrequency (%)
- 954
100.0%
Open Punctuation
ValueCountFrequency (%)
( 117
100.0%
Close Punctuation
ValueCountFrequency (%)
) 117
100.0%
Space Separator
ValueCountFrequency (%)
39
100.0%
Other Symbol
ValueCountFrequency (%)
10
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 22772
68.7%
Latin 9020
 
27.2%
Hangul 1376
 
4.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
158
 
11.5%
133
 
9.7%
73
 
5.3%
72
 
5.2%
67
 
4.9%
67
 
4.9%
65
 
4.7%
63
 
4.6%
60
 
4.4%
60
 
4.4%
Other values (33) 558
40.6%
Latin
ValueCountFrequency (%)
A 4174
46.3%
B 2077
23.0%
C 872
 
9.7%
D 475
 
5.3%
E 239
 
2.6%
O 185
 
2.1%
S 168
 
1.9%
F 152
 
1.7%
T 148
 
1.6%
a 94
 
1.0%
Other values (20) 436
 
4.8%
Common
ValueCountFrequency (%)
4 5255
23.1%
9 3760
16.5%
8 3197
14.0%
1 2681
11.8%
5 2546
11.2%
3 1152
 
5.1%
2 1002
 
4.4%
- 954
 
4.2%
0 771
 
3.4%
7 647
 
2.8%
Other values (8) 807
 
3.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 31782
95.8%
Hangul 1376
 
4.1%
CJK Compat 10
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 5255
16.5%
A 4174
13.1%
9 3760
11.8%
8 3197
10.1%
1 2681
8.4%
5 2546
8.0%
B 2077
 
6.5%
3 1152
 
3.6%
2 1002
 
3.2%
- 954
 
3.0%
Other values (37) 4984
15.7%
Hangul
ValueCountFrequency (%)
158
 
11.5%
133
 
9.7%
73
 
5.3%
72
 
5.2%
67
 
4.9%
67
 
4.9%
65
 
4.7%
63
 
4.6%
60
 
4.4%
60
 
4.4%
Other values (33) 558
40.6%
CJK Compat
ValueCountFrequency (%)
10
100.0%

층_번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 63
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.2905
Minimum: 0
Maximum: 65
Zeros: 2
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 4
median: 8
Q3: 14
95-th percentile: 26
Maximum: 65
Range: 65
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 8.0813638
Coefficient of variation (CV): 0.78532275
Kurtosis: 3.9588667
Mean: 10.2905
Median Absolute Deviation (MAD): 5
Skewness: 1.5757461
Sum: 102905
Variance: 65.308441
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
2 | 685 | 6.9%
4 | 681 | 6.8%
3 | 664 | 6.6%
5 | 641 | 6.4%
6 | 636 | 6.4%
1 | 620 | 6.2%
7 | 594 | 5.9%
9 | 539 | 5.4%
8 | 538 | 5.4%
10 | 491 | 4.9%
Other values (53) | 3911 | 39.1%

Value | Count | Frequency (%)
0 | 2 | < 0.1%
1 | 620 | 6.2%
2 | 685 | 6.9%
3 | 664 | 6.6%
4 | 681 | 6.8%
5 | 641 | 6.4%
6 | 636 | 6.4%
7 | 594 | 5.9%
8 | 538 | 5.4%
9 | 539 | 5.4%

Value | Count | Frequency (%)
65 | 1 | < 0.1%
63 | 2 | < 0.1%
62 | 2 | < 0.1%
61 | 2 | < 0.1%
60 | 3 | < 0.1%
58 | 2 | < 0.1%
57 | 1 | < 0.1%
56 | 2 | < 0.1%
55 | 2 | < 0.1%
54 | 3 | < 0.1%

층_구분_코드
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20
2nd row: 20
3rd row: 20
4th row: 20
5th row: 20

Common Values

Value | Count | Frequency (%)
20 | 9910 | 99.1%
10 | 90 | 0.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
20 | 9910 | 99.1%
10 | 90 | 0.9%

관리_건축물대장_참조_pk
Text

Distinct: 89
Distinct (%): 100.0%
Missing: 9911
Missing (%): 99.1%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 15
Mean length: 13.685393
Min length: 11

Characters and Unicode

Total characters: 1218
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 89
Unique (%): 100.0%

Sample

1st row: 11305-52214
2nd row: 11215-56321
3rd row: 11380-56951
4th row: 11320-107198
5th row: 11215-56280

Value | Count | Frequency (%)
11215-56237 | 1 | 1.1%
11215-56316 | 1 | 1.1%
11350-100181860 | 1 | 1.1%
11350-172291 | 1 | 1.1%
11350-172194 | 1 | 1.1%
11260-1000000000000002690854 | 1 | 1.1%
11320-98292 | 1 | 1.1%
11350-100181850 | 1 | 1.1%
11320-84169 | 1 | 1.1%
11215-56331 | 1 | 1.1%
Other values (79) | 79 | 88.8%

Most occurring characters

ValueCountFrequency (%)
1 339
27.8%
0 245
20.1%
2 98
 
8.0%
5 95
 
7.8%
- 89
 
7.3%
3 85
 
7.0%
8 76
 
6.2%
6 62
 
5.1%
9 57
 
4.7%
7 37
 
3.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1129
92.7%
Dash Punctuation 89
 
7.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 339
30.0%
0 245
21.7%
2 98
 
8.7%
5 95
 
8.4%
3 85
 
7.5%
8 76
 
6.7%
6 62
 
5.5%
9 57
 
5.0%
7 37
 
3.3%
4 35
 
3.1%
Dash Punctuation
ValueCountFrequency (%)
- 89
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1218
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 339
27.8%
0 245
20.1%
2 98
 
8.0%
5 95
 
7.8%
- 89
 
7.3%
3 85
 
7.0%
8 76
 
6.2%
6 62
 
5.1%
9 57
 
4.7%
7 37
 
3.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1218
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 339
27.8%
0 245
20.1%
2 98
 
8.0%
5 95
 
7.8%
- 89
 
7.3%
3 85
 
7.0%
8 76
 
6.2%
6 62
 
5.1%
9 57
 
4.7%
7 37
 
3.0%

변경_구분_코드
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 4
Mean length: 3.9724
Min length: 1
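
These length figures suggest that `<NA>` is stored in this column as a literal four-character string rather than as a true missing value, which would also explain why the variable reports 0 missing values. A small check under that assumption, using the counts from the Common Values table:

```python
# Assumption: "<NA>" is a literal string of length 4, not a null marker.
value_counts = {"<NA>": 9908, "1": 92}

total = sum(value_counts.values())
mean_length = sum(len(value) * n for value, n in value_counts.items()) / total

print(mean_length)  # 3.9724
```

The result matches the reported mean length. If that reading is right, the column effectively holds a single real value ("1") in 92 rows, and the "Missing" figure understates how sparse it is.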

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 9908 | 99.1%
1 | 92 | 0.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 9908 | 99.1%
1 | 92 | 0.9%

작업_일자
Real number (ℝ)

HIGH CORRELATION 

Distinct: 94
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20229949
Minimum: 20201201
Maximum: 20240510
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20201201
5-th percentile: 20210806
Q1: 20220416
median: 20230831
Q3: 20240510
95-th percentile: 20240510
Maximum: 20240510
Range: 39309
Interquartile range (IQR): 20094
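
The values of 작업_일자 look like dates written as YYYYMMDD integers, although the report treats the column as a plain real number and does not confirm the encoding. Under that assumption, the range and IQR above are differences between integers, not durations. The sketch below, with an illustrative helper `parse_yyyymmdd`, contrasts the two readings:

```python
from datetime import datetime

def parse_yyyymmdd(value: int) -> datetime:
    """Interpret an integer such as 20240510 as the date 2024-05-10."""
    return datetime.strptime(str(value), "%Y%m%d")

earliest = parse_yyyymmdd(20201201)
latest = parse_yyyymmdd(20240510)

print(20240510 - 20201201)       # 39309, the integer "Range" in the report
print((latest - earliest).days)  # 1256 calendar days
```

The mean, standard deviation and correlations reported for this column carry the same caveat, since they are computed on the integer encoding.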

Descriptive statistics

Standard deviation: 11750.572
Coefficient of variation (CV): 0.00058085033
Kurtosis: -0.91715369
Mean: 20229949
Median Absolute Deviation (MAD): 9679
Skewness: -0.70663813
Sum: 2.0229949 × 10^11
Variance: 1.3807595 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
20240510 | 3978 | 39.8%
20211029 | 427 | 4.3%
20230808 | 399 | 4.0%
20211123 | 319 | 3.2%
20240102 | 303 | 3.0%
20230110 | 265 | 2.6%
20230704 | 265 | 2.6%
20220929 | 221 | 2.2%
20231028 | 217 | 2.2%
20220204 | 208 | 2.1%
Other values (84) | 3398 | 34.0%

Value | Count | Frequency (%)
20201201 | 105 | 1.1%
20201208 | 1 | < 0.1%
20201216 | 1 | < 0.1%
20201230 | 26 | 0.3%
20210106 | 26 | 0.3%
20210216 | 118 | 1.2%
20210309 | 110 | 1.1%
20210601 | 94 | 0.9%
20210628 | 3 | < 0.1%
20210806 | 150 | 1.5%

Value | Count | Frequency (%)
20240510 | 3978 | 39.8%
20240507 | 3 | < 0.1%
20240425 | 9 | 0.1%
20240402 | 6 | 0.1%
20240327 | 6 | 0.1%
20240302 | 95 | 0.9%
20240227 | 5 | 0.1%
20240223 | 1 | < 0.1%
20240222 | 4 | < 0.1%
20240221 | 1 | < 0.1%

Interactions


Correlations

Variable | 호_번호 | 층_번호 | 층_구분_코드 | 관리_건축물대장_참조_pk | 작업_일자
호_번호 | 1.000 | 0.454 | 0.023 | NaN | 0.216
층_번호 | 0.454 | 1.000 | 0.150 | 1.000 | 0.264
층_구분_코드 | 0.023 | 0.150 | 1.000 | 1.000 | 0.064
관리_건축물대장_참조_pk | NaN | 1.000 | 1.000 | 1.000 | 1.000
작업_일자 | 0.216 | 0.264 | 0.064 | 1.000 | 1.000

Variable | 변경_구분_코드 | 층_구분_코드
변경_구분_코드 | 1.000 | 1.000
층_구분_코드 | 1.000 | 1.000

Variable | 호_번호 | 층_번호 | 작업_일자 | 층_구분_코드 | 변경_구분_코드
호_번호 | 1.000 | 0.073 | -0.070 | 0.018 | 1.000
층_번호 | 0.073 | 1.000 | -0.214 | 0.115 | 1.000
작업_일자 | -0.070 | -0.214 | 1.000 | 0.044 | 1.000
층_구분_코드 | 0.018 | 0.115 | 0.044 | 1.000 | 1.000
변경_구분_코드 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

(index) | 관리_호별_명세_pk | 관리_동별_개요_pk | 호_번호 | 호_명칭 | 평형_구분_명 | 층_번호 | 층_구분_코드 | 관리_건축물대장_참조_pk | 변경_구분_코드 | 작업_일자
60341 | 11230-1000000000000000054318 | 11230-1000000000000000007437 | 0 | 1003 | 84C | 10 | 20 | <NA> | <NA> | 20230523
84252 | 11350-1000000000000000521372 | 11350-1000000000000000097236 | 0 | 202 | 97A | 2 | 20 | <NA> | <NA> | 20230704
14501 | 11000-100017041 | 11000-100004454 | 401 | 902 | 49C | 9 | 20 | <NA> | <NA> | 20240510
4332 | 11000-100006342 | 11000-100003664 | 0 | 103 | 59B-1 | 1 | 20 | <NA> | <NA> | 20240510
96233 | 11380-1000000000000000594147 | 11380-1000000000000000106283 | 0 | 302 | 39 | 3 | 20 | <NA> | <NA> | 20230808
93797 | 11380-1000000000000000282565 | 11380-1000000000000000052678 | 0 | 1003 | 84A | 10 | 20 | <NA> | <NA> | 20230831
69914 | 11260-100036087 | 11260-100010226 | 0 | 1203 | 59 | 12 | 20 | <NA> | <NA> | 20211029
28628 | 11000-100031977 | 11000-100003327 | 0 | 302 | 49A | 3 | 20 | <NA> | <NA> | 20240510
89375 | 11350-100044361 | 11350-100017623 | 0 | 602 | 74B | 6 | 20 | <NA> | <NA> | 20220914
52804 | 11170-100035175 | 11170-100005943 | 30 | 102동 113호 | 102동 113호 | 1 | 20 | <NA> | <NA> | 20211029

(index) | 관리_호별_명세_pk | 관리_동별_개요_pk | 호_번호 | 호_명칭 | 평형_구분_명 | 층_번호 | 층_구분_코드 | 관리_건축물대장_참조_pk | 변경_구분_코드 | 작업_일자
1619011000-10001874311000-10000494832280149B1820<NA><NA>20240510
5638311215-10002615811215-1000052040120240C1220<NA><NA>20210901
2189311000-10002509711000-10000336401201114A1220<NA><NA>20240510
3937611000-10004879011000-100011918090570A920<NA><NA>20210806
7145011260-10003821811260-1000109276330484A320<NA><NA>20211204
8670311350-10004137411350-1000164570804<NA>82011350-100181783120211029
6359211230-100000000000000056159111230-10000000000000001012550A동310684A3120<NA><NA>20231028
9138711380-100000000000000026725411380-1000000000000000047608070259B-1720<NA><NA>20230110
5727711215-10002705211215-10000524301102C1120<NA><NA>20211112
7699311290-10007677311290-1000077070270184A2720<NA><NA>20240207