Dataset statistics
Number of variables | 7 |
---|---|
Number of observations | 500 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 28.9 KiB |
Average record size in memory | 59.3 B |
Variable types
Text | 4 |
---|---|
Numeric | 2 |
Categorical | 1 |
Dataset
Description | Sample data |
---|---|
Author | 오픈메이트 |
URL | https://bigdata.seoul.go.kr/data/selectSampleData.do?sample_data_seq=6 |
Alerts
엑스좌표_값 is highly overall correlated with 행정구역_중복_여부 | High correlation |
와이좌표_값 is highly overall correlated with 행정구역_중복_여부 | High correlation |
행정구역_중복_여부 is highly overall correlated with 엑스좌표_값 and 1 other fields | High correlation |
행정구역_중복_여부 is highly imbalanced (83.7%) | Imbalance |
아파트_동_코드 has unique values | Unique |
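The 83.7% figure in the imbalance alert can be reproduced from the value counts listed later in this report (488 and 12). The sketch below assumes the score is one minus the Shannon entropy of the class frequencies, normalised by the maximum entropy for that number of classes. The function name `imbalance_score` is hypothetical, and the formula is an assumption that happens to match the reported value, not a confirmed description of the profiler's internals.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (bits) of the class frequencies."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no imbalance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Value counts reported for 행정구역_중복_여부: "0" x 488, "<NA>" x 12.
score = imbalance_score([488, 12])
print(f"{score:.1%}")  # prints 83.7%
```

A score of 0 would mean perfectly balanced classes, and a score close to 1 means one class dominates.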
Reproduction
Analysis started | 2023-12-10 14:58:51.273383 |
---|---|
Analysis finished | 2023-12-10 14:58:52.781056 |
Duration | 1.51 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
아파트_동_코드
Text
UNIQUE
Distinct | 500 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 4.0 KiB |
Length
Max length | 10 |
---|---|
Median length | 10 |
Mean length | 10 |
Min length | 10 |
Characters and Unicode
Total characters | 5000 |
---|---|
Distinct characters | 14 |
Distinct categories | 2 |
Distinct scripts | 2 |
Distinct blocks | 1 |
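The category counts above can be illustrated with Python's standard `unicodedata` module, where `Lu` is Uppercase Letter and `Nd` is Decimal Number. This is a minimal sketch over the five sample values shown in this report, not the full 500-row column, and it is not meant to mirror the profiler's own implementation. The standard library does not expose script or block properties, so only categories are counted here.

```python
import unicodedata
from collections import Counter

# First five sample values listed for 아파트_동_코드 in this report.
samples = ["B000073363", "B000045014", "B000071436", "U000010740", "A000081751"]

# Count the Unicode general category of every character across the samples.
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)

print(category_counts["Lu"])  # uppercase letters: 5 (one per code)
print(category_counts["Nd"])  # decimal digits: 45 (nine per code)
```

Each code is one uppercase letter followed by nine digits, which is consistent with the 10% / 90% split reported for the full column.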
Unique
Unique | 500 |
---|---|
Unique (%) | 100.0% |
Sample
1st row | B000073363 |
---|---|
2nd row | B000045014 |
3rd row | B000071436 |
4th row | U000010740 |
5th row | A000081751 |
Value | Count | Frequency (%) |
b000073363 | 1 | 0.2% |
a001027259 | 1 | 0.2% |
b000063377 | 1 | 0.2% |
b000020746 | 1 | 0.2% |
u000004143 | 1 | 0.2% |
b000056248 | 1 | 0.2% |
b000001552 | 1 | 0.2% |
a001024947 | 1 | 0.2% |
b000088761 | 1 | 0.2% |
b000010493 | 1 | 0.2% |
Other values (490) | 490 | 98.0% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 2202 | 44.0% |
1 | 379 | 7.6% |
B | 287 | 5.7% |
2 | 270 | 5.4% |
4 | 253 | 5.1% |
5 | 247 | 4.9% |
3 | 245 | 4.9% |
8 | 241 | 4.8% |
7 | 230 | 4.6% |
6 | 219 | 4.4% |
Other values (4) | 427 | 8.5% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 4500 | 90.0% |
Uppercase Letter | 500 | 10.0% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 2202 | 48.9% |
1 | 379 | 8.4% |
2 | 270 | 6.0% |
4 | 253 | 5.6% |
5 | 247 | 5.5% |
3 | 245 | 5.4% |
8 | 241 | 5.4% |
7 | 230 | 5.1% |
6 | 219 | 4.9% |
9 | 214 | 4.8% |
Uppercase Letter
Value | Count | Frequency (%) |
B | 287 | 57.4% |
A | 187 | 37.4% |
U | 18 | 3.6% |
X | 8 | 1.6% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 4500 | 90.0% |
Latin | 500 | 10.0% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 2202 | 48.9% |
1 | 379 | 8.4% |
2 | 270 | 6.0% |
4 | 253 | 5.6% |
5 | 247 | 5.5% |
3 | 245 | 5.4% |
8 | 241 | 5.4% |
7 | 230 | 5.1% |
6 | 219 | 4.9% |
9 | 214 | 4.8% |
Latin
Value | Count | Frequency (%) |
B | 287 | 57.4% |
A | 187 | 37.4% |
U | 18 | 3.6% |
X | 8 | 1.6% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 5000 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 2202 | 44.0% |
1 | 379 | 7.6% |
B | 287 | 5.7% |
2 | 270 | 5.4% |
4 | 253 | 5.1% |
5 | 247 | 4.9% |
3 | 245 | 4.9% |
8 | 241 | 4.8% |
7 | 230 | 4.6% |
6 | 219 | 4.4% |
Other values (4) | 427 | 8.5% |
아파트_단지_코드
Text
Distinct | 493 |
---|---|
Distinct (%) | 98.6% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 4.0 KiB |
Length
Max length | 10 |
---|---|
Median length | 10 |
Mean length | 10 |
Min length | 10 |
Characters and Unicode
Total characters | 5000 |
---|---|
Distinct characters | 14 |
Distinct categories | 2 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 486 |
---|---|
Unique (%) | 97.2% |
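The gap between Distinct (493) and Unique (486) follows from how the two are counted: distinct counts every different value once, while unique counts only values that occur exactly once. The sketch below illustrates the difference on a toy list and then checks the reported figures under the assumption that every repeated value occurs exactly twice, which matches the seven count-2 entries in the value table for this column. The helper name `distinct_and_unique` is hypothetical.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values occurring exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Toy column: "a" and "b" repeat, "c" and "d" occur once.
print(distinct_and_unique(["a", "a", "b", "b", "c", "d"]))  # (4, 2)

# Consistency check against the figures reported for 아파트_단지_코드.
n_rows, distinct, unique = 500, 493, 486
repeated = distinct - unique             # 7 values occur more than once
assert unique + 2 * repeated == n_rows   # holds if each repeat occurs exactly twice
```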
Sample
1st row | B000077122 |
---|---|
2nd row | B000002197 |
3rd row | B000015590 |
4th row | A000044415 |
5th row | B000062484 |
Value | Count | Frequency (%) |
x000011476 | 2 | 0.4% |
a000068640 | 2 | 0.4% |
a000017357 | 2 | 0.4% |
u000000678 | 2 | 0.4% |
u000000113 | 2 | 0.4% |
a000069718 | 2 | 0.4% |
a000058015 | 2 | 0.4% |
a000068231 | 1 | 0.2% |
b000010874 | 1 | 0.2% |
a000068698 | 1 | 0.2% |
Other values (483) | 483 | 96.6% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 2093 | 41.9% |
1 | 417 | 8.3% |
5 | 305 | 6.1% |
4 | 276 | 5.5% |
2 | 263 | 5.3% |
7 | 254 | 5.1% |
6 | 251 | 5.0% |
3 | 248 | 5.0% |
B | 239 | 4.8% |
A | 231 | 4.6% |
Other values (4) | 423 | 8.5% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 4500 | 90.0% |
Uppercase Letter | 500 | 10.0% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 2093 | 46.5% |
1 | 417 | 9.3% |
5 | 305 | 6.8% |
4 | 276 | 6.1% |
2 | 263 | 5.8% |
7 | 254 | 5.6% |
6 | 251 | 5.6% |
3 | 248 | 5.5% |
8 | 224 | 5.0% |
9 | 169 | 3.8% |
Uppercase Letter
Value | Count | Frequency (%) |
B | 239 | 47.8% |
A | 231 | 46.2% |
U | 23 | 4.6% |
X | 7 | 1.4% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 4500 | 90.0% |
Latin | 500 | 10.0% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 2093 | 46.5% |
1 | 417 | 9.3% |
5 | 305 | 6.8% |
4 | 276 | 6.1% |
2 | 263 | 5.8% |
7 | 254 | 5.6% |
6 | 251 | 5.6% |
3 | 248 | 5.5% |
8 | 224 | 5.0% |
9 | 169 | 3.8% |
Latin
Value | Count | Frequency (%) |
B | 239 | 47.8% |
A | 231 | 46.2% |
U | 23 | 4.6% |
X | 7 | 1.4% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 5000 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 2093 | 41.9% |
1 | 417 | 8.3% |
5 | 305 | 6.1% |
4 | 276 | 5.5% |
2 | 263 | 5.3% |
7 | 254 | 5.1% |
6 | 251 | 5.0% |
3 | 248 | 5.0% |
B | 239 | 4.8% |
A | 231 | 4.6% |
Other values (4) | 423 | 8.5% |
아파트_동_명
Text
Distinct | 73 |
---|---|
Distinct (%) | 14.6% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 4.0 KiB |
Value | Count | Frequency (%) |
동명없음1 | 304 | 60.8% |
1 | 34 | 6.8% |
b | 15 | 3.0% |
a | 12 | 2.4% |
101 | 11 | 2.2% |
가 | 10 | 2.0% |
103 | 9 | 1.8% |
나 | 9 | 1.8% |
2 | 7 | 1.4% |
108 | 4 | 0.8% |
Other values (63) | 85 | 17.0% |
Most occurring characters
Value | Count | Frequency (%) |
1 | 420 | 22.1% |
동 | 304 | 16.0% |
없 | 304 | 16.0% |
음 | 304 | 16.0% |
명 | 304 | 16.0% |
0 | 72 | 3.8% |
2 | 39 | 2.1% |
3 | 23 | 1.2% |
5 | 16 | 0.8% |
B | 15 | 0.8% |
Other values (16) | 99 | 5.2% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 1241 | 65.3% |
Decimal Number | 628 | 33.1% |
Uppercase Letter | 31 | 1.6% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
동 | 304 | 24.5% |
없 | 304 | 24.5% |
음 | 304 | 24.5% |
명 | 304 | 24.5% |
나 | 10 | 0.8% |
가 | 10 | 0.8% |
다 | 1 | 0.1% |
프 | 1 | 0.1% |
리 | 1 | 0.1% |
우 | 1 | 0.1% |
Decimal Number
Value | Count | Frequency (%) |
1 | 420 | 66.9% |
0 | 72 | 11.5% |
2 | 39 | 6.2% |
3 | 23 | 3.7% |
5 | 16 | 2.5% |
7 | 14 | 2.2% |
6 | 14 | 2.2% |
4 | 13 | 2.1% |
8 | 10 | 1.6% |
9 | 7 | 1.1% |
Uppercase Letter
Value | Count | Frequency (%) |
B | 15 | 48.4% |
A | 12 | 38.7% |
E | 2 | 6.5% |
I | 1 | 3.2% |
D | 1 | 3.2% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 1241 | 65.3% |
Common | 628 | 33.1% |
Latin | 31 | 1.6% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
동 | 304 | 24.5% |
없 | 304 | 24.5% |
음 | 304 | 24.5% |
명 | 304 | 24.5% |
나 | 10 | 0.8% |
가 | 10 | 0.8% |
다 | 1 | 0.1% |
프 | 1 | 0.1% |
리 | 1 | 0.1% |
우 | 1 | 0.1% |
Common
Value | Count | Frequency (%) |
1 | 420 | 66.9% |
0 | 72 | 11.5% |
2 | 39 | 6.2% |
3 | 23 | 3.7% |
5 | 16 | 2.5% |
7 | 14 | 2.2% |
6 | 14 | 2.2% |
4 | 13 | 2.1% |
8 | 10 | 1.6% |
9 | 7 | 1.1% |
Latin
Value | Count | Frequency (%) |
B | 15 | 48.4% |
A | 12 | 38.7% |
E | 2 | 6.5% |
I | 1 | 3.2% |
D | 1 | 3.2% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 1241 | 65.3% |
ASCII | 659 | 34.7% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
1 | 420 | 63.7% |
0 | 72 | 10.9% |
2 | 39 | 5.9% |
3 | 23 | 3.5% |
5 | 16 | 2.4% |
B | 15 | 2.3% |
7 | 14 | 2.1% |
6 | 14 | 2.1% |
4 | 13 | 2.0% |
A | 12 | 1.8% |
Other values (5) | 21 | 3.2% |
Hangul
Value | Count | Frequency (%) |
동 | 304 | 24.5% |
없 | 304 | 24.5% |
음 | 304 | 24.5% |
명 | 304 | 24.5% |
나 | 10 | 0.8% |
가 | 10 | 0.8% |
다 | 1 | 0.1% |
프 | 1 | 0.1% |
리 | 1 | 0.1% |
우 | 1 | 0.1% |
엑스좌표_값
Real number (ℝ)
HIGH CORRELATION
Distinct | 495 |
---|---|
Distinct (%) | 99.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 198940.48 |
Minimum | 182866 |
---|---|
Maximum | 215235 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 4.5 KiB |
Quantile statistics
Minimum | 182866 |
---|---|
5-th percentile | 186439.3 |
Q1 | 192717 |
median | 199141.5 |
Q3 | 204794.25 |
95-th percentile | 211412.45 |
Maximum | 215235 |
Range | 32369 |
Interquartile range (IQR) | 12077.25 |
Descriptive statistics
Standard deviation | 7813.1494 |
---|---|
Coefficient of variation (CV) | 0.039273804 |
Kurtosis | -1.0649422 |
Mean | 198940.48 |
Median Absolute Deviation (MAD) | 6305 |
Skewness | 0.012637556 |
Sum | 99470241 |
Variance | 61045304 |
Monotonicity | Not monotonic |
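Several of the figures above are derived from others in the same tables, so they can be cross-checked with plain arithmetic. The sketch below copies the reported values for 엑스좌표_값 and recomputes the range, IQR, coefficient of variation, variance and sum. Because the inputs are the rounded values printed in this report, the last two agree with the reported figures only approximately.

```python
# Figures copied from the 엑스좌표_값 tables in this report.
minimum, maximum = 182866, 215235
q1, q3 = 192717, 204794.25
mean, std = 198940.48, 7813.1494
n = 500

value_range = maximum - minimum   # range = max - min
iqr = q3 - q1                     # interquartile range = Q3 - Q1
cv = std / mean                   # coefficient of variation = std / mean
variance = std ** 2               # variance = std squared
total = mean * n                  # sum = mean * number of observations

print(value_range)      # 32369
print(iqr)              # 12077.25
print(round(cv, 6))     # 0.039274
```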
Value | Count | Frequency (%) |
192321 | 2 | 0.4% |
192474 | 2 | 0.4% |
196750 | 2 | 0.4% |
192036 | 2 | 0.4% |
194602 | 2 | 0.4% |
201484 | 1 | 0.2% |
203248 | 1 | 0.2% |
204660 | 1 | 0.2% |
186287 | 1 | 0.2% |
201453 | 1 | 0.2% |
Other values (485) | 485 | 97.0% |
Value | Count | Frequency (%) |
182866 | 1 | 0.2% |
183174 | 1 | 0.2% |
184697 | 1 | 0.2% |
184987 | 1 | 0.2% |
185071 | 1 | 0.2% |
185146 | 1 | 0.2% |
185154 | 1 | 0.2% |
185185 | 1 | 0.2% |
185202 | 1 | 0.2% |
185216 | 1 | 0.2% |
Value | Count | Frequency (%) |
215235 | 1 | 0.2% |
214973 | 1 | 0.2% |
213554 | 1 | 0.2% |
213134 | 1 | 0.2% |
213125 | 1 | 0.2% |
213088 | 1 | 0.2% |
212959 | 1 | 0.2% |
212893 | 1 | 0.2% |
212785 | 1 | 0.2% |
212652 | 1 | 0.2% |
와이좌표_값
Real number (ℝ)
HIGH CORRELATION
Distinct | 495 |
---|---|
Distinct (%) | 99.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 450742.75 |
Minimum | 439209 |
---|---|
Maximum | 464512 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 4.5 KiB |
Quantile statistics
Minimum | 439209 |
---|---|
5-th percentile | 442031.9 |
Q1 | 445025.25 |
median | 450361.5 |
Q3 | 455572.5 |
95-th percentile | 461156 |
Maximum | 464512 |
Range | 25303 |
Interquartile range (IQR) | 10547.25 |
Descriptive statistics
Standard deviation | 6046.4312 |
---|---|
Coefficient of variation (CV) | 0.013414373 |
Kurtosis | -0.95141797 |
Mean | 450742.75 |
Median Absolute Deviation (MAD) | 5226 |
Skewness | 0.19292169 |
Sum | 2.2537138 × 10^8 |
Variance | 36559330 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
449898 | 2 | 0.4% |
448731 | 2 | 0.4% |
443328 | 2 | 0.4% |
441984 | 2 | 0.4% |
443193 | 2 | 0.4% |
458217 | 1 | 0.2% |
442980 | 1 | 0.2% |
452612 | 1 | 0.2% |
445144 | 1 | 0.2% |
443131 | 1 | 0.2% |
Other values (485) | 485 | 97.0% |
Value | Count | Frequency (%) |
439209 | 1 | 0.2% |
439261 | 1 | 0.2% |
439601 | 1 | 0.2% |
439755 | 1 | 0.2% |
440086 | 1 | 0.2% |
440281 | 1 | 0.2% |
440554 | 1 | 0.2% |
440577 | 1 | 0.2% |
441005 | 1 | 0.2% |
441015 | 1 | 0.2% |
Value | Count | Frequency (%) |
464512 | 1 | 0.2% |
464065 | 1 | 0.2% |
463850 | 1 | 0.2% |
463834 | 1 | 0.2% |
463424 | 1 | 0.2% |
463299 | 1 | 0.2% |
463069 | 1 | 0.2% |
463051 | 1 | 0.2% |
462860 | 1 | 0.2% |
462777 | 1 | 0.2% |
행정구역_중복_여부
Categorical
HIGH CORRELATION
IMBALANCE
Distinct | 2 |
---|---|
Distinct (%) | 0.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 4.0 KiB |
0 | 488 |
---|---|
<NA> | 12 |
Length
Max length | 4 |
---|---|
Median length | 1 |
Mean length | 1.072 |
Min length | 1 |
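The length statistics can be reproduced from the value counts for this column (488 and 12). The sketch below assumes that the 12 `<NA>` entries are stored as literal four-character strings, which would also explain why the Missing count above is 0. That reading is inferred from the numbers in this report and is not confirmed by the source data.

```python
# Value counts reported for 행정구역_중복_여부 in this report.
value_counts = {"0": 488, "<NA>": 12}

n = sum(value_counts.values())
# Weighted mean of string lengths: (488 * 1 + 12 * 4) / 500.
mean_length = sum(len(value) * count for value, count in value_counts.items()) / n
max_length = max(len(value) for value in value_counts)
min_length = min(len(value) for value in value_counts)

print(mean_length)  # 1.072
print(max_length)   # 4
print(min_length)   # 1
```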
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 0 |
---|---|
2nd row | 0 |
3rd row | 0 |
4th row | 0 |
5th row | 0 |
Common Values
Value | Count | Frequency (%) |
0 | 488 | 97.6% |
<NA> | 12 | 2.4% |
블록_코드
Text
Distinct | 327 |
---|---|
Distinct (%) | 65.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 4.0 KiB |
Value | Count | Frequency (%) |
2*1*3 | 7 | 1.4% |
2*0*9 | 6 | 1.2% |
2*2*6 | 6 | 1.2% |
2*0*4 | 6 | 1.2% |
2*9*2 | 5 | 1.0% |
2*1*9 | 5 | 1.0% |
2*2*5 | 5 | 1.0% |
2*1*0 | 5 | 1.0% |
3*3*2 | 5 | 1.0% |
2*1*5 | 5 | 1.0% |
Other values (255) | 445 | 89.0% |
Most occurring characters
Value | Count | Frequency (%) |
* | 1353 | 47.8% |
2 | 336 | 11.9% |
1 | 227 | 8.0% |
3 | 197 | 7.0% |
4 | 134 | 4.7% |
5 | 111 | 3.9% |
0 | 102 | 3.6% |
9 | 101 | 3.6% |
8 | 93 | 3.3% |
7 | 90 | 3.2% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 1477 | 52.2% |
Other Punctuation | 1353 | 47.8% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
2 | 336 | 22.7% |
1 | 227 | 15.4% |
3 | 197 | 13.3% |
4 | 134 | 9.1% |
5 | 111 | 7.5% |
0 | 102 | 6.9% |
9 | 101 | 6.8% |
8 | 93 | 6.3% |
7 | 90 | 6.1% |
6 | 86 | 5.8% |
Other Punctuation
Value | Count | Frequency (%) |
* | 1353 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 2830 | 100.0% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
* | 1353 | 47.8% |
2 | 336 | 11.9% |
1 | 227 | 8.0% |
3 | 197 | 7.0% |
4 | 134 | 4.7% |
5 | 111 | 3.9% |
0 | 102 | 3.6% |
9 | 101 | 3.6% |
8 | 93 | 3.3% |
7 | 90 | 3.2% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 2830 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
* | 1353 | 47.8% |
2 | 336 | 11.9% |
1 | 227 | 8.0% |
3 | 197 | 7.0% |
4 | 134 | 4.7% |
5 | 111 | 3.9% |
0 | 102 | 3.6% |
9 | 101 | 3.6% |
8 | 93 | 3.3% |
7 | 90 | 3.2% |
Variable | 아파트_동_명 | 엑스좌표_값 | 와이좌표_값 |
---|---|---|---|
아파트_동_명 | 1.000 | 0.000 | 0.000 |
엑스좌표_값 | 0.000 | 1.000 | 0.000 |
와이좌표_값 | 0.000 | 0.000 | 1.000 |
Variable | 엑스좌표_값 | 와이좌표_값 | 행정구역_중복_여부 |
---|---|---|---|
엑스좌표_값 | 1.000 | 0.020 | 1.000 |
와이좌표_값 | 0.020 | 1.000 | 1.000 |
행정구역_중복_여부 | 1.000 | 1.000 | 1.000 |
Index | 아파트_동_코드 | 아파트_단지_코드 | 아파트_동_명 | 엑스좌표_값 | 와이좌표_값 | 행정구역_중복_여부 | 블록_코드 |
---|---|---|---|---|---|---|---|
0 | B000073363 | B000077122 | 동명없음1 | 186642 | 443131 | 0 | 1*3*8* |
1 | B000045014 | B000002197 | 동명없음1 | 194350 | 453906 | 0 | 2*7*1* |
2 | B000071436 | B000015590 | 동명없음1 | 196868 | 450193 | 0 | 2*8*6* |
3 | U000010740 | A000044415 | 1 | 212652 | 445020 | 0 | 2*7*7* |
4 | A000081751 | B000062484 | 동명없음1 | 196046 | 457856 | 0 | 3*3*1* |
5 | A000024502 | A105860897 | 동명없음1 | 193854 | 461155 | 0 | 2*1*3* |
6 | B000057029 | U000001154 | 109 | 190546 | 458467 | 0 | 3*2*6 |
7 | B000016708 | U000000978 | 동명없음1 | 204752 | 462777 | 0 | 2*1*0* |
8 | A002013853 | B000038494 | 906 | 200489 | 444104 | 0 | 6*4*5 |
9 | U000009344 | A000050414 | 동명없음1 | 194012 | 450371 | 0 | 2*3*0* |
Index | 아파트_동_코드 | 아파트_단지_코드 | 아파트_동_명 | 엑스좌표_값 | 와이좌표_값 | 행정구역_중복_여부 | 블록_코드 |
---|---|---|---|---|---|---|---|
490 | B000065770 | X000011476 | 동명없음1 | 204666 | 456182 | 0 | 1*7*7* |
491 | A001048542 | B000072973 | 동명없음1 | 208148 | 452012 | <NA> | 2*2*9* |
492 | B000050926 | B000079111 | 동명없음1 | 185071 | 453464 | 0 | 1*3*0 |
493 | A001050924 | B000012552 | 동명없음1 | 207482 | 445806 | 0 | 1*0*0 |
494 | A001032657 | A001004646 | 동명없음1 | 198930 | 448916 | 0 | 1*0*8 |
495 | B000004270 | U000001904 | 동명없음1 | 212302 | 451324 | 0 | 9*8* |
496 | B000022282 | X000010919 | 동명없음1 | 193297 | 444990 | 0 | 8*4* |
497 | A000049258 | B000030390 | 1 | 187020 | 461031 | 0 | 1*7*4* |
498 | B000066007 | A101480300 | 1 | 197713 | 458128 | 0 | 1*3*5* |
499 | A000060625 | U000002837 | 301 | 201276 | 442638 | <NA> | 2*4*0* |