Overview

Dataset statistics

Number of variables: 4
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.3 KiB
Average record size in memory: 33.3 B

Variable types

Text: 3
Categorical: 1

Dataset

Description: Sample data
Author: 오픈메이트
URL: https://bigdata.seoul.go.kr/data/selectSampleData.do?sample_data_seq=6

Alerts

아파트_단지_코드 has unique values (Unique)
지번 has unique values (Unique)

Reproduction

Analysis started: 2023-12-10 14:58:29.560978
Analysis finished: 2023-12-10 14:58:32.751442
Duration: 3.19 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트_단지_코드 (apartment complex code)
Text

UNIQUE

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 1000
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
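
As an illustration of how such character properties can be tallied, here is a minimal sketch using Python's standard `unicodedata` module. It only looks at the five sample values this report lists for the variable, not the full column, and the helper name `category_counts` is made up for this example. The report's own implementation may differ.

```python
import unicodedata
from collections import Counter

# The five sample values shown for 아파트_단지_코드 in this report.
samples = ["B000002699", "A106000620", "B000049483", "B000071503", "A000016043"]


def category_counts(values):
    """Count characters per Unicode general category code (e.g. 'Nd', 'Lu')."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


counts = category_counts(samples)
total = sum(counts.values())
for category, count in counts.most_common():
    # 'Nd' is Decimal Number, 'Lu' is Uppercase Letter.
    print(f"{category}: {count} ({count / total:.1%})")
# Nd: 45 (90.0%)
# Lu: 5 (10.0%)
```

On these five values the split is the same 90% / 10% that the report shows for the whole column, because every code is one uppercase letter followed by nine digits.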

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: B000002699
2nd row: A106000620
3rd row: B000049483
4th row: B000071503
5th row: A000016043

Value              Count  Frequency (%)
b000002699         1      1.0%
a102814000         1      1.0%
a000078340         1      1.0%
b000034772         1      1.0%
b000005930         1      1.0%
a000013541         1      1.0%
b000008626         1      1.0%
u000003074         1      1.0%
a106010049         1      1.0%
a000036236         1      1.0%
Other values (90)  90     90.0%

Most occurring characters

Value             Count  Frequency (%)
0                 401    40.1%
1                 89     8.9%
6                 60     6.0%
4                 60     6.0%
2                 54     5.4%
3                 54     5.4%
B                 50     5.0%
9                 47     4.7%
5                 47     4.7%
A                 46     4.6%
Other values (3)  92     9.2%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    900    90.0%
Uppercase Letter  100    10.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      401    44.6%
1      89     9.9%
6      60     6.7%
4      60     6.7%
2      54     6.0%
3      54     6.0%
9      47     5.2%
5      47     5.2%
7      45     5.0%
8      43     4.8%

Uppercase Letter
Value  Count  Frequency (%)
B      50     50.0%
A      46     46.0%
U      4      4.0%
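
The per-category percentages are taken relative to the category total, not the column total, which is why the digit 0 appears as 44.6% here but as 40.1% in the overall character table. The following sketch reproduces that arithmetic from the counts printed in the tables; the variable names are illustrative only.

```python
# Counts copied from the tables in this report.
count_zero = 401       # occurrences of the character "0"
total_chars = 1000     # all characters in the column
decimal_chars = 900    # characters in the "Decimal Number" category

share_overall = count_zero / total_chars        # denominator: whole column
share_in_category = count_zero / decimal_chars  # denominator: category only

print(f"{share_overall:.1%}")      # 40.1%
print(f"{share_in_category:.1%}")  # 44.6%
```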

Most occurring scripts

Value   Count  Frequency (%)
Common  900    90.0%
Latin   100    10.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      401    44.6%
1      89     9.9%
6      60     6.7%
4      60     6.7%
2      54     6.0%
3      54     6.0%
9      47     5.2%
5      47     5.2%
7      45     5.0%
8      43     4.8%

Latin
Value  Count  Frequency (%)
B      50     50.0%
A      46     46.0%
U      4      4.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1000   100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
0                 401    40.1%
1                 89     8.9%
6                 60     6.0%
4                 60     6.0%
2                 54     5.4%
3                 54     5.4%
B                 50     5.0%
9                 47     4.7%
5                 47     4.7%
A                 46     4.6%
Other values (3)  92     9.2%

지번 (lot number)
Text

UNIQUE

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 1900
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: 1*5*0*0*0*0*3*6*0*2
2nd row: 1*5*0*0*0*0*2*7*0*4
3rd row: 1*7*0*0*0*0*2*0*0*3
4th row: 1*6*0*0*0*0*0*7*0*4
5th row: 1*5*0*0*0*0*2*0*0*9

Value                Count  Frequency (%)
1*5*0*0*0*0*3*6*0*2  1      1.0%
1*7*0*0*0*0*2*1*0*3  1      1.0%
1*3*5*0*0*0*5*3*0*1  1      1.0%
1*6*0*0*0*0*0*0*0*0  1      1.0%
1*4*0*1*0*0*0*5*0*2  1      1.0%
1*5*0*0*0*0*6*0*0*0  1      1.0%
1*2*0*0*0*0*0*5*0*9  1      1.0%
1*3*0*0*0*0*4*4*0*7  1      1.0%
1*6*0*0*0*0*9*0*0*4  1      1.0%
1*4*0*0*0*0*5*5*0*4  1      1.0%
Other values (90)    90     90.0%

Most occurring characters

Value  Count  Frequency (%)
*      900    47.4%
0      505    26.6%
1      156    8.2%
5      62     3.3%
3      57     3.0%
2      55     2.9%
6      53     2.8%
4      40     2.1%
7      31     1.6%
9      22     1.2%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     1000   52.6%
Other Punctuation  900    47.4%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      505    50.5%
1      156    15.6%
5      62     6.2%
3      57     5.7%
2      55     5.5%
6      53     5.3%
4      40     4.0%
7      31     3.1%
9      22     2.2%
8      19     1.9%

Other Punctuation
Value  Count  Frequency (%)
*      900    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  1900   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
*      900    47.4%
0      505    26.6%
1      156    8.2%
5      62     3.3%
3      57     3.0%
2      55     2.9%
6      53     2.8%
4      40     2.1%
7      31     1.6%
9      22     1.2%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1900   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
*      900    47.4%
0      505    26.6%
1      156    8.2%
5      62     3.3%
3      57     3.0%
2      55     2.9%
6      53     2.8%
4      40     2.1%
7      31     1.6%
9      22     1.2%

아파트_단지_명 (apartment complex name)
Text

Distinct: 77
Distinct (%): 77.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 12
Median length: 4
Mean length: 4.92
Min length: 2
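
These figures are summary statistics over the per-value character counts. The sketch below shows one way to compute them, assuming lengths are counted in Unicode code points after NFC normalization (an assumption; the report does not state how it counts). It uses only the five sample values this report lists for the variable, so its results differ from the full-column figures above.

```python
import unicodedata
from statistics import mean, median

# The five sample values shown for 아파트_단지_명 in this report.
samples = ["에*원*우*", "다*아*빌*", "명*없*", "복*라*", "초*빌*트*"]

# Normalize to NFC so each Hangul syllable counts as a single code point.
lengths = [len(unicodedata.normalize("NFC", value)) for value in samples]

summary = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": mean(lengths),
    "min": min(lengths),
}
print(lengths)  # [6, 6, 4, 4, 6]
print(summary)
```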

Characters and Unicode

Total characters: 492
Distinct characters: 95
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 75
Unique (%): 75.0%
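
"Distinct" and "Unique" measure different things: distinct counts the different values present, while unique counts the values that occur exactly once. The sketch below illustrates the difference on a stand-in column built to match this variable's figures. It assumes the two repeated values are the ones the frequency table lists 23 and 2 times; the 75 singleton names are hypothetical placeholders, not real data.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for count in counts.values() if count == 1)
    return distinct, unique


# Stand-in column: two repeated values plus 75 placeholder singletons.
values = ["명*없*"] * 23 + ["삼*빌*"] * 2 + [f"placeholder-{i}" for i in range(75)]

print(len(values))                  # 100
print(distinct_and_unique(values))  # (77, 75)
```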

Sample

1st row: 에*원*우*
2nd row: 다*아*빌*
3rd row: 명*없*
4th row: 복*라*
5th row: 초*빌*트*

Value              Count  Frequency (%)
명*없              23     23.0%
삼*빌              2      2.0%
중*맨              1      1.0%
에*원*우           1      1.0%
한*스*이           1      1.0%
우*라*프*파        1      1.0%
나*스              1      1.0%
길*그*빌           1      1.0%
패*리              1      1.0%
서*팰*스           1      1.0%
Other values (67)  67     67.0%

Most occurring characters

Value              Count  Frequency (%)
*                  246    50.0%
[?]                34     6.9%
[?]                24     4.9%
[?]                23     4.7%
[?]                11     2.2%
[?]                7      1.4%
[?]                6      1.2%
[?]                5      1.0%
[?]                4      0.8%
[?]                4      0.8%
Other values (85)  128    26.0%

Most occurring categories

Value              Count  Frequency (%)
Other Punctuation  246    50.0%
Other Letter       243    49.4%
Decimal Number     3      0.6%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
[?]                34     14.0%
[?]                24     9.9%
[?]                23     9.5%
[?]                11     4.5%
[?]                7      2.9%
[?]                6      2.5%
[?]                5      2.1%
[?]                4      1.6%
[?]                4      1.6%
[?]                4      1.6%
Other values (82)  121    49.8%

Decimal Number
Value  Count  Frequency (%)
2      2      66.7%
8      1      33.3%

Other Punctuation
Value  Count  Frequency (%)
*      246    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  249    50.6%
Hangul  243    49.4%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
[?]                34     14.0%
[?]                24     9.9%
[?]                23     9.5%
[?]                11     4.5%
[?]                7      2.9%
[?]                6      2.5%
[?]                5      2.1%
[?]                4      1.6%
[?]                4      1.6%
[?]                4      1.6%
Other values (82)  121    49.8%

Common
Value  Count  Frequency (%)
*      246    98.8%
2      2      0.8%
8      1      0.4%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   249    50.6%
Hangul  243    49.4%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
*      246    98.8%
2      2      0.8%
8      1      0.4%

Hangul
Value              Count  Frequency (%)
[?]                34     14.0%
[?]                24     9.9%
[?]                23     9.5%
[?]                11     4.5%
[?]                7      2.9%
[?]                6      2.5%
[?]                5      2.1%
[?]                4      1.6%
[?]                4      1.6%
[?]                4      1.6%
Other values (82)  121    49.8%

아파트_빌라_구분_코드 (apartment/villa classification code)
Categorical

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: V
2nd row: V
3rd row: V
4th row: V
5th row: V

Common Values

Value  Count  Frequency (%)
V      87     87.0%
A      13     13.0%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
v      87     87.0%
a      13     13.0%

Correlations

Variable  아파트_단지_코드  지번  아파트_단지_명  아파트_빌라_구분_코드
아파트_단지_코드  1.000  1.000  1.000  1.000
지번  1.000  1.000  1.000  1.000
아파트_단지_명  1.000  1.000  1.000  0.000
아파트_빌라_구분_코드  1.000  1.000  0.000  1.000
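
The report does not name the association measure behind this matrix. One plausible reading, offered here as an assumption only, is a Cramér's V style statistic for categorical pairs. Under that assumption, the 1.000 entries involving 아파트_단지_코드 and 지번 would follow from those columns being fully unique: a column with no repeated values perfectly "predicts" any other column. The sketch below implements plain, uncorrected Cramér's V and checks this on five rows taken from the sample table. It does not attempt to reproduce the 0.000 entry, and the function name `cramers_v` is illustrative.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            chi2 += (observed[(x, y)] - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    # V is undefined when either variable is constant; return 0.0 in that case.
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Rows 0-3 and 95 of the sample table: complex code and apartment/villa code.
codes = ["B000002699", "A106000620", "B000049483", "B000071503", "A106005196"]
kinds = ["V", "V", "V", "V", "A"]

print(round(cramers_v(codes, kinds), 3))  # 1.0
```

If this reading is right, the 1.000 values say more about the uniqueness of the identifier columns than about any meaningful relationship between variables.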

Missing values

Figure: A simple visualization of nullity by column.
Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
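
The numbers behind a nullity-by-column chart are simply per-column counts of missing cells. A minimal sketch follows, using three records copied from this report's sample table. Treating both `None` and the empty string as missing is an assumption of this example, and `missing_per_column` is a hypothetical helper name.

```python
# Three records copied from the sample table in this report.
rows = [
    {"아파트_단지_코드": "B000002699", "지번": "1*5*0*0*0*0*3*6*0*2",
     "아파트_단지_명": "에*원*우*", "아파트_빌라_구분_코드": "V"},
    {"아파트_단지_코드": "A106000620", "지번": "1*5*0*0*0*0*2*7*0*4",
     "아파트_단지_명": "다*아*빌*", "아파트_빌라_구분_코드": "V"},
    {"아파트_단지_코드": "A106005196", "지번": "1*2*0*0*0*0*2*7*0*1",
     "아파트_단지_명": "명*없*", "아파트_빌라_구분_코드": "A"},
]


def missing_per_column(records):
    """Return {column: number of cells that are None or an empty string}."""
    columns = list(records[0].keys())
    return {
        column: sum(1 for record in records if record.get(column) in (None, ""))
        for column in columns
    }


missing = missing_per_column(rows)
print(missing)  # every column maps to 0 for these records
```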

Sample

First rows

Index  아파트_단지_코드  지번  아파트_단지_명  아파트_빌라_구분_코드
0   B000002699  1*5*0*0*0*0*3*6*0*2  에*원*우*  V
1   A106000620  1*5*0*0*0*0*2*7*0*4  다*아*빌*  V
2   B000049483  1*7*0*0*0*0*2*0*0*3  명*없*  V
3   B000071503  1*6*0*0*0*0*0*7*0*4  복*라*  V
4   A000016043  1*5*0*0*0*0*2*0*0*9  초*빌*트*  V
5   A000042899  1*3*0*0*0*0*0*1*1*1  엘*이*강*8*지*파*  V
6   B000026242  1*1*0*2*0*0*3*2*0*9  세*빌*  V
7   A101447400  1*3*0*0*0*0*0*1*0*9  신*림*스*이*아*트*  V
8   A105853649  1*3*0*0*0*0*2*4*0*6  명*없*  V
9   B000075965  1*1*0*0*0*0*0*7*0*5  명*없*  V

Last rows

Index  아파트_단지_코드  지번  아파트_단지_명  아파트_빌라_구분_코드
90  B000049618  1*6*0*0*0*0*5*1*0*8  새*센*빌*  V
91  B000032362  1*5*0*0*0*0*1*0*0*2  홍*한*아*트*  V
92  B000006440  1*6*0*0*0*0*1*6*2*1  현*빌*  V
93  B000062373  1*5*5*0*0*0*2*3*0*0  하*빌*  V
94  B000077193  1*3*0*0*0*0*0*4*0*1  잠*월*메*디*  V
95  A106005196  1*2*0*0*0*0*2*7*0*1  명*없*  A
96  A000038819  1*3*0*0*0*0*6*8*0*8  명*없*  V
97  B000000368  1*6*0*0*0*0*2*3*0*7  명*없*  V
98  A001002426  1*6*0*0*0*0*4*7*0*9  명*없*  V
99  B000077806  1*4*0*2*0*0*3*8*0*6  평*  V