Overview

Dataset statistics

Number of variables: 5
Number of observations: 180
Missing cells: 15
Missing cells (%): 1.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.2 KiB
Average record size in memory: 40.7 B
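
The missing-cell percentage can be checked by hand: 15 missing cells out of 180 × 5 = 900 cells. The sketch below reproduces that arithmetic in Python. The variable names are illustrative and are not part of the report, and the second figure assumes all 15 missing cells fall in a single column, as the alert for 조경면적 indicates.

```python
# Figures taken from the dataset statistics in this report.
n_variables = 5
n_observations = 180
missing_cells = 15

total_cells = n_variables * n_observations  # 900 cells in total

# Share of all cells that are missing.
missing_pct = 100 * missing_cells / total_cells

# Share of rows missing in one column, assuming (as the alerts state)
# that all 15 missing cells belong to the same column.
column_missing_pct = 100 * missing_cells / n_observations

print(f"{missing_pct:.1f}%")         # 1.7%
print(f"{column_missing_pct:.1f}%")  # 8.3%
```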

Variable types

Text: 4
Categorical: 1

Dataset

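Description: Landscaped-area data for apartment complexes in Uijeongbu-si, Gyeonggi-do, consisting of the following fields: apartment name (아파트명), housing type (주택유형), lot-number address (지번주소), landscaped area (조경면적), and number of households (세대수).
Author: Uijeongbu-si, Gyeonggi-do (경기도 의정부시)
URL: https://www.data.go.kr/data/15063134/fileData.do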

Alerts

주택유형 is highly imbalanced (87.6%) [Imbalance]
조경면적 has 15 (8.3%) missing values [Missing]
아파트명 has unique values [Unique]
지번주소 has unique values [Unique]

Reproduction

Analysis started: 2023-12-12 13:08:14.439796
Analysis finished: 2023-12-12 13:08:14.809799
Duration: 0.37 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명
Text

UNIQUE 

Distinct: 180
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Length

Max length: 20
Median length: 17
Mean length: 7.2555556
Min length: 2

Characters and Unicode

Total characters: 1306
Distinct characters: 221
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
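
As a rough illustration of how such per-character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own implementation: the function name `category_counts` is hypothetical, and the standard library exposes only general categories (script and block lookups would need another source). The two sample strings are the first two rows of this variable.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            # e.g. 'Lo' = Other Letter, 'Zs' = Space Separator, 'Nd' = Decimal Number
            counts[unicodedata.category(ch)] += 1
    return counts


sample = ["호원 한신1차", "용현 동문"]
counts = category_counts(sample)
print(counts)
```

Hangul syllables fall under `Lo` (Other Letter), which is why that category dominates the tables for this variable.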

Unique

Unique: 180
Unique (%): 100.0%

Sample

1st row: 호원 한신1차
2nd row: 용현 동문
3rd row: 용현 현대1차
4th row: 장암 우성
5th row: 호원 건영
Value | Count | Frequency (%)
호원 | 35 | 11.1%
신곡 | 22 | 7.0%
용현 | 8 | 2.5%
장암 | 8 | 2.5%
금오 | 7 | 2.2%
민락 | 7 | 2.2%
녹양 | 4 | 1.3%
신도브래뉴 | 3 | 0.9%
e편한세상 | 3 | 0.9%
1차 | 3 | 0.9%
Other values (193) | 216 | 68.4%

Most occurring characters

ValueCountFrequency (%)
136
 
10.4%
58
 
4.4%
49
 
3.8%
47
 
3.6%
43
 
3.3%
1 27
 
2.1%
24
 
1.8%
22
 
1.7%
22
 
1.7%
2 21
 
1.6%
Other values (211) 857
65.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1072 | 82.1%
Space Separator | 136 | 10.4%
Decimal Number | 68 | 5.2%
Other Punctuation | 7 | 0.5%
Uppercase Letter | 6 | 0.5%
Close Punctuation | 5 | 0.4%
Open Punctuation | 5 | 0.4%
Lowercase Letter | 4 | 0.3%
Dash Punctuation | 3 | 0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
58
 
5.4%
49
 
4.6%
47
 
4.4%
43
 
4.0%
24
 
2.2%
22
 
2.1%
22
 
2.1%
20
 
1.9%
19
 
1.8%
19
 
1.8%
Other values (190) 749
69.9%
Decimal Number
ValueCountFrequency (%)
1 27
39.7%
2 21
30.9%
3 6
 
8.8%
4 4
 
5.9%
5 3
 
4.4%
0 3
 
4.4%
6 2
 
2.9%
7 1
 
1.5%
9 1
 
1.5%
Uppercase Letter
ValueCountFrequency (%)
S 2
33.3%
K 1
16.7%
U 1
16.7%
L 1
16.7%
P 1
16.7%
Other Punctuation
ValueCountFrequency (%)
, 6
85.7%
& 1
 
14.3%
Space Separator
ValueCountFrequency (%)
136
100.0%
Close Punctuation
ValueCountFrequency (%)
) 5
100.0%
Open Punctuation
ValueCountFrequency (%)
( 5
100.0%
Lowercase Letter
ValueCountFrequency (%)
e 4
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1072
82.1%
Common 224
 
17.2%
Latin 10
 
0.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
58
 
5.4%
49
 
4.6%
47
 
4.4%
43
 
4.0%
24
 
2.2%
22
 
2.1%
22
 
2.1%
20
 
1.9%
19
 
1.8%
19
 
1.8%
Other values (190) 749
69.9%
Common
ValueCountFrequency (%)
136
60.7%
1 27
 
12.1%
2 21
 
9.4%
, 6
 
2.7%
3 6
 
2.7%
) 5
 
2.2%
( 5
 
2.2%
4 4
 
1.8%
5 3
 
1.3%
0 3
 
1.3%
Other values (5) 8
 
3.6%
Latin
ValueCountFrequency (%)
e 4
40.0%
S 2
20.0%
K 1
 
10.0%
U 1
 
10.0%
L 1
 
10.0%
P 1
 
10.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1072
82.1%
ASCII 234
 
17.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
136
58.1%
1 27
 
11.5%
2 21
 
9.0%
, 6
 
2.6%
3 6
 
2.6%
) 5
 
2.1%
( 5
 
2.1%
e 4
 
1.7%
4 4
 
1.7%
5 3
 
1.3%
Other values (11) 17
 
7.3%
Hangul
ValueCountFrequency (%)
58
 
5.4%
49
 
4.6%
47
 
4.4%
43
 
4.0%
24
 
2.2%
22
 
2.1%
22
 
2.1%
20
 
1.9%
19
 
1.8%
19
 
1.8%
Other values (190) 749
69.9%

주택유형
Categorical

IMBALANCE 

Distinct: 6
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB
아파트: 173
아파트(복합): 3
아파트: 1
다세대: 1
연립주택: 1

Length

Max length: 7
Median length: 3
Mean length: 3.0833333
Min length: 3

Unique

Unique: 4
Unique (%): 2.2%

Sample

1st row: 아파트
2nd row: 아파트
3rd row: 아파트
4th row: 아파트
5th row: 아파트

Common Values

Value | Count | Frequency (%)
아파트 | 173 | 96.1%
아파트(복합) | 3 | 1.7%
아파트 | 1 | 0.6%
다세대 | 1 | 0.6%
연립주택 | 1 | 0.6%
소형주택 | 1 | 0.6%
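
The 87.6% figure in the imbalance alert is consistent with one minus the Shannon entropy of the class frequencies, normalised by the maximum entropy for six classes. The report does not state the formula, so the sketch below is an assumption that happens to reproduce the reported value; the function name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max: 0 for a uniform distribution, near 1 when one class dominates."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # a single class has no imbalance to measure
    # Shannon entropy in bits; empty classes contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)


# Counts from the Common Values table for 주택유형.
counts = [173, 3, 1, 1, 1, 1]
score = imbalance_score(counts)
print(f"{score:.1%}")  # 87.6%
```

A perfectly balanced column would score 0, so a value this close to 1 reflects that a single category accounts for roughly 96% of the rows.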

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
아파트 | 174 | 96.7%
아파트(복합 | 3 | 1.7%
다세대 | 1 | 0.6%
연립주택 | 1 | 0.6%
소형주택 | 1 | 0.6%

지번주소
Text

UNIQUE 

Distinct: 180
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Length

Max length: 11
Median length: 10
Mean length: 8.4833333
Min length: 6

Characters and Unicode

Total characters: 1527
Distinct characters: 37
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 180
Unique (%): 100.0%

Sample

1st row: 호원동 10-1
2nd row: 용현동 227
3rd row: 용현동 46
4th row: 장암동 14-27
5th row: 호원동 121
Value | Count | Frequency (%)
호원동 | 52 | 14.5%
신곡동 | 36 | 10.0%
의정부동 | 20 | 5.6%
가능동 | 16 | 4.5%
용현동 | 12 | 3.3%
민락동 | 10 | 2.8%
금오동 | 9 | 2.5%
장암동 | 7 | 1.9%
녹양동 | 6 | 1.7%
낙양동 | 5 | 1.4%
Other values (182) | 186 | 51.8%

Most occurring characters

ValueCountFrequency (%)
180
 
11.8%
179
 
11.7%
- 115
 
7.5%
1 110
 
7.2%
4 95
 
6.2%
2 86
 
5.6%
7 72
 
4.7%
3 67
 
4.4%
6 63
 
4.1%
5 60
 
3.9%
Other values (27) 500
32.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 670 | 43.9%
Other Letter | 560 | 36.7%
Space Separator | 179 | 11.7%
Dash Punctuation | 115 | 7.5%
Other Punctuation | 2 | 0.1%
Uppercase Letter | 1 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
180
32.1%
52
 
9.3%
52
 
9.3%
38
 
6.8%
37
 
6.6%
20
 
3.6%
20
 
3.6%
20
 
3.6%
16
 
2.9%
16
 
2.9%
Other values (13) 109
19.5%
Decimal Number
ValueCountFrequency (%)
1 110
16.4%
4 95
14.2%
2 86
12.8%
7 72
10.7%
3 67
10.0%
6 63
9.4%
5 60
9.0%
8 47
7.0%
9 37
 
5.5%
0 33
 
4.9%
Space Separator
ValueCountFrequency (%)
179
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 115
100.0%
Other Punctuation
ValueCountFrequency (%)
, 2
100.0%
Uppercase Letter
ValueCountFrequency (%)
C 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 966
63.3%
Hangul 560
36.7%
Latin 1
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
180
32.1%
52
 
9.3%
52
 
9.3%
38
 
6.8%
37
 
6.6%
20
 
3.6%
20
 
3.6%
20
 
3.6%
16
 
2.9%
16
 
2.9%
Other values (13) 109
19.5%
Common
ValueCountFrequency (%)
179
18.5%
- 115
11.9%
1 110
11.4%
4 95
9.8%
2 86
8.9%
7 72
7.5%
3 67
 
6.9%
6 63
 
6.5%
5 60
 
6.2%
8 47
 
4.9%
Other values (3) 72
7.5%
Latin
ValueCountFrequency (%)
C 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 967
63.3%
Hangul 560
36.7%

Most frequent character per block

Hangul
ValueCountFrequency (%)
180
32.1%
52
 
9.3%
52
 
9.3%
38
 
6.8%
37
 
6.6%
20
 
3.6%
20
 
3.6%
20
 
3.6%
16
 
2.9%
16
 
2.9%
Other values (13) 109
19.5%
ASCII
ValueCountFrequency (%)
179
18.5%
- 115
11.9%
1 110
11.4%
4 95
9.8%
2 86
8.9%
7 72
7.4%
3 67
 
6.9%
6 63
 
6.5%
5 60
 
6.2%
8 47
 
4.9%
Other values (4) 73
7.5%

조경면적
Text

MISSING 

Distinct: 161
Distinct (%): 97.6%
Missing: 15
Missing (%): 8.3%
Memory size: 1.5 KiB

Length

Max length: 8
Median length: 4
Mean length: 4.1575758
Min length: 2

Characters and Unicode

Total characters: 686
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 158
Unique (%): 95.8%

Sample

1st row: 1447
2nd row: 582
3rd row: 9481
4th row: 5183
5th row: 6590
Value | Count | Frequency (%)
1590 | 3 | 1.8%
211 | 2 | 1.2%
176 | 2 | 1.2%
80.22 | 1 | 0.6%
4301 | 1 | 0.6%
6808 | 1 | 0.6%
484 | 1 | 0.6%
7465 | 1 | 0.6%
4473 | 1 | 0.6%
1016 | 1 | 0.6%
Other values (151) | 151 | 91.5%

Most occurring characters

ValueCountFrequency (%)
1 102
14.9%
2 79
11.5%
4 74
10.8%
3 72
10.5%
9 66
9.6%
0 64
9.3%
6 61
8.9%
5 57
8.3%
8 56
8.2%
7 39
 
5.7%
Other values (2) 16
 
2.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 670 | 97.7%
Other Punctuation | 16 | 2.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 102
15.2%
2 79
11.8%
4 74
11.0%
3 72
10.7%
9 66
9.9%
0 64
9.6%
6 61
9.1%
5 57
8.5%
8 56
8.4%
7 39
 
5.8%
Other Punctuation
ValueCountFrequency (%)
. 9
56.2%
, 7
43.8%
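
조경면적 is typed as Text, and the punctuation counts here (9 periods, 7 commas) suggest why: some values carry a decimal point or a thousands separator. Below is a minimal sketch for converting such strings to numbers. It assumes commas are always thousands separators and periods are always decimal points, which the report does not confirm. The helper name `parse_area` is hypothetical, and `"1,234"` is an illustrative input rather than a value quoted from the dataset.

```python
def parse_area(text):
    """Convert an area string such as '1447', '80.22' or '1,234' to a float.

    Returns None for missing or unparseable input instead of raising.
    Assumption: ',' is a thousands separator and '.' is a decimal point.
    """
    if text is None:
        return None
    cleaned = text.strip().replace(",", "")
    if not cleaned:
        return None
    try:
        return float(cleaned)
    except ValueError:
        return None


print(parse_area("1447"))   # 1447.0
print(parse_area("80.22"))  # 80.22
print(parse_area("1,234"))  # 1234.0
print(parse_area(None))     # None
```

Returning `None` rather than raising keeps the 15 missing entries distinguishable from parsed values after conversion.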

Most occurring scripts

ValueCountFrequency (%)
Common 686
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 102
14.9%
2 79
11.5%
4 74
10.8%
3 72
10.5%
9 66
9.6%
0 64
9.3%
6 61
8.9%
5 57
8.3%
8 56
8.2%
7 39
 
5.7%
Other values (2) 16
 
2.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 686
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 102
14.9%
2 79
11.5%
4 74
10.8%
3 72
10.5%
9 66
9.6%
0 64
9.3%
6 61
8.9%
5 57
8.3%
8 56
8.2%
7 39
 
5.7%
Other values (2) 16
 
2.3%

세대수
Text

Distinct: 161
Distinct (%): 89.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Length

Max length: 5
Median length: 3
Mean length: 2.9888889
Min length: 2

Characters and Unicode

Total characters: 538
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 144
Unique (%): 80.0%

Sample

1st row: 204
2nd row: 160
3rd row: 986
4th row: 510
5th row: 900
Value | Count | Frequency (%)
420 | 3 | 1.7%
36 | 3 | 1.7%
832 | 2 | 1.1%
32 | 2 | 1.1%
162 | 2 | 1.1%
494 | 2 | 1.1%
144 | 2 | 1.1%
135 | 2 | 1.1%
47 | 2 | 1.1%
606 | 2 | 1.1%
Other values (151) | 158 | 87.8%

Most occurring characters

ValueCountFrequency (%)
1 81
15.1%
4 66
12.3%
2 66
12.3%
6 55
10.2%
3 53
9.9%
0 47
8.7%
8 47
8.7%
9 45
8.4%
7 44
8.2%
5 32
 
5.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 536 | 99.6%
Other Punctuation | 2 | 0.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 81
15.1%
4 66
12.3%
2 66
12.3%
6 55
10.3%
3 53
9.9%
0 47
8.8%
8 47
8.8%
9 45
8.4%
7 44
8.2%
5 32
 
6.0%
Other Punctuation
ValueCountFrequency (%)
, 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 538
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 81
15.1%
4 66
12.3%
2 66
12.3%
6 55
10.2%
3 53
9.9%
0 47
8.7%
8 47
8.7%
9 45
8.4%
7 44
8.2%
5 32
 
5.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 538
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 81
15.1%
4 66
12.3%
2 66
12.3%
6 55
10.2%
3 53
9.9%
0 47
8.7%
8 47
8.7%
9 45
8.4%
7 44
8.2%
5 32
 
5.9%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 아파트명 | 주택유형 | 지번주소 | 조경면적 | 세대수
0 | 호원 한신1차 | 아파트 | 호원동 10-1 | 1447 | 204
1 | 용현 동문 | 아파트 | 용현동 227 | 582 | 160
2 | 용현 현대1차 | 아파트 | 용현동 46 | 9481 | 986
3 | 장암 우성 | 아파트 | 장암동 14-27 | 5183 | 510
4 | 호원 건영 | 아파트 | 호원동 121 | 6590 | 900
5 | 금오 세아 | 아파트 | 금오동 80-204119250
6 | 청솔 (제일생명주상복합) | 아파트 | 가능동 717-22649106
7 | 신곡 성원1차 | 아파트 | 신곡동 169-203623525
8 | 호원 우성3차 | 아파트 | 호원동 4304192615
9 | 가능 동원1차 | 아파트 | 가능동 106-211870406
Index | 아파트명 | 주택유형 | 지번주소 | 조경면적 | 세대수
170 | 고산3단지 | 아파트 | 고산동 90321,3461,331
171 | 이안더메트로 | 아파트(복합) | 의정부동 208-2340170
172 | 리버카운티 | 아파트(복합) | 의정부동 20-19751
173 | 드림팰리스 | 아파트(복합) | 호원동 414-2340.3636
174 | 더샵파크에비뉴 | 아파트 | 가능동 224-446196.43420
175 | 탑석센트럴자이 | 아파트 | 용현동 24135273.882571
176 | 의정부역 센트럴자이앤위브캐슬 | 아파트 | 의정부동 38035332.752473
177 | 혜성루첸리 | 아파트 | 의정부동 214-7680.2247
178 | 아이디캐슬 | 연립주택 | 호원동 419-5246.1424
179 | 나이키빌 | 소형주택 | 의정부동 131-2390.9107