Dataset statistics
Number of variables | 7 |
---|---|
Number of observations | 10000 |
Missing cells | 9343 |
Missing cells (%) | 13.3% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 654.3 KiB |
Average record size in memory | 67.0 B |
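The derived figures in this table follow from the raw counts. Below is a minimal sketch of that arithmetic. It assumes the missing percentage is taken over all cells (rows times columns) and that 1 KiB is 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Raw figures copied from the Dataset statistics table.
n_variables = 7
n_observations = 10_000
missing_cells = 9_343
total_size_kib = 654.3

# Assumption: "Missing cells (%)" is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = round(missing_cells / total_cells * 100, 1)

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = round(total_size_kib * 1024 / n_observations, 1)

print(f"Missing cells (%): {missing_pct}%")
print(f"Average record size: {avg_record_bytes} B")
```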
Variable types
Text | 3 |
---|---|
Categorical | 4 |
Dataset
Description | File download |
---|---|
Author | 서울특별시 (Seoul Metropolitan Government) |
URL | https://data.seoul.go.kr/dataList/OA-15658/S/1/datasetView.do |
작업_일자 has constant value "20111227" | Constant |
지역지구구역_구분_코드 is highly overall correlated with 지역지구구역_코드 | High correlation |
지역지구구역_코드 is highly overall correlated with 지역지구구역_구분_코드 and 1 other field | High correlation |
대표_여부 is highly overall correlated with 지역지구구역_코드 | High correlation |
지역지구구역_코드 is highly imbalanced (87.7%) | Imbalance |
대표_여부 is highly imbalanced (98.1%) | Imbalance |
기타_지역지구구역 has 9343 (93.4%) missing values | Missing |
관리_지역지구구역 has unique values | Unique |
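The imbalance percentages in the alerts are consistent with a score of one minus the normalized Shannon entropy of a column's value counts. The sketch below works under that assumption and is not a statement of the tool's internals. `imbalance_score` is a hypothetical helper, and the counts are the ones the report lists for 대표_여부 (9971, 27, 2).

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform column, near 1 when one value dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumed convention when only one class is present
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Value counts for 대표_여부 as listed in its Common Values table.
score = imbalance_score([9971, 27, 2])
print(f"{score:.1%}")
```

Under this assumption the score for 대표_여부 rounds to the 98.1% shown in the alert. The 87.7% for 지역지구구역_코드 cannot be rechecked the same way because the report only lists its ten most common values.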
Reproduction
Analysis started | 2024-04-20 21:16:24.074126 |
---|---|
Analysis finished | 2024-04-20 21:16:25.842555 |
Duration | 1.77 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
관리_지역지구구역
Text
UNIQUE
 
Distinct | 10000 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 15 |
---|---|
Median length | 11 |
Mean length | 10.5154 |
Min length | 7 |
Characters and Unicode
Total characters | 105154 |
---|---|
Distinct characters | 11 |
Distinct categories | 2 |
Distinct scripts | 1 |
Distinct blocks | 1 |
Unique
Unique | 10000 |
---|---|
Unique (%) | 100.0% |
Sample
1st row | 11290-29008 |
---|---|
2nd row | 11230-7204 |
3rd row | 11215-2577 |
4th row | 11290-26242 |
5th row | 11290-34809 |
Value | Count | Frequency (%) |
11290-29008 | 1 | < 0.1% |
11320-7063 | 1 | < 0.1% |
11290-12026 | 1 | < 0.1% |
11290-23658 | 1 | < 0.1% |
11290-23341 | 1 | < 0.1% |
11110-5583 | 1 | < 0.1% |
11290-4804 | 1 | < 0.1% |
11290-38324 | 1 | < 0.1% |
11305-5470 | 1 | < 0.1% |
11200-8895 | 1 | < 0.1% |
Other values (9990) | 9990 | 99.9% |
Most occurring characters
Value | Count | Frequency (%) |
1 | 28002 | 26.6% |
0 | 14124 | 13.4% |
2 | 13442 | 12.8% |
- | 10000 | 9.5% |
9 | 8726 | 8.3% |
3 | 8610 | 8.2% |
5 | 5534 | 5.3% |
4 | 4680 | 4.5% |
6 | 4607 | 4.4% |
8 | 3790 | 3.6% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 95154 | 90.5% |
Dash Punctuation | 10000 | 9.5% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
1 | 28002 | 29.4% |
0 | 14124 | 14.8% |
2 | 13442 | 14.1% |
9 | 8726 | 9.2% |
3 | 8610 | 9.0% |
5 | 5534 | 5.8% |
4 | 4680 | 4.9% |
6 | 4607 | 4.8% |
8 | 3790 | 4.0% |
7 | 3639 | 3.8% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 10000 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 105154 | 100.0% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
1 | 28002 | 26.6% |
0 | 14124 | 13.4% |
2 | 13442 | 12.8% |
- | 10000 | 9.5% |
9 | 8726 | 8.3% |
3 | 8610 | 8.2% |
5 | 5534 | 5.3% |
4 | 4680 | 4.5% |
6 | 4607 | 4.4% |
8 | 3790 | 3.6% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 105154 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
1 | 28002 | 26.6% |
0 | 14124 | 13.4% |
2 | 13442 | 12.8% |
- | 10000 | 9.5% |
9 | 8726 | 8.3% |
3 | 8610 | 8.2% |
5 | 5534 | 5.3% |
4 | 4680 | 4.5% |
6 | 4607 | 4.4% |
8 | 3790 | 3.6% |
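The category breakdown for 관리_지역지구구역 is a character-level tally. Below is a minimal sketch of such a tally using Python's standard `unicodedata` module. It runs on the five sample rows only, so its counts describe those five strings and not the full column. The report's labels "Decimal Number" and "Dash Punctuation" correspond to the Unicode general categories `Nd` and `Pd`.

```python
import unicodedata
from collections import Counter

# The five sample rows listed for 관리_지역지구구역.
samples = ["11290-29008", "11230-7204", "11215-2577", "11290-26242", "11290-34809"]

# String lengths, the basis of the Max/Median/Mean/Min length rows.
lengths = [len(s) for s in samples]

# Unicode general category of every character across the samples.
category_counts = Counter(unicodedata.category(ch) for s in samples for ch in s)

print(lengths)
print(dict(category_counts))
```

Only two categories appear in the samples, which agrees with the "Distinct categories | 2" row.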
관리_폐쇄말소대장
Text
Distinct | 8571 |
---|---|
Distinct (%) | 85.7% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 15 |
---|---|
Median length | 10 |
Mean length | 10.2561 |
Min length | 7 |
Characters and Unicode
Total characters | 102561 |
---|---|
Distinct characters | 11 |
Distinct categories | 2 |
Distinct scripts | 1 |
Distinct blocks | 1 |
Unique
Unique | 7226 |
---|---|
Unique (%) | 72.3% |
Sample
1st row | 11290-13273 |
---|---|
2nd row | 11230-3489 |
3rd row | 11215-1343 |
4th row | 11290-12097 |
5th row | 11290-15217 |
Value | Count | Frequency (%) |
11410-1422 | 3 | < 0.1% |
11110-4817 | 3 | < 0.1% |
11290-15776 | 3 | < 0.1% |
11110-1776 | 3 | < 0.1% |
11290-2256 | 3 | < 0.1% |
11290-1379 | 3 | < 0.1% |
11230-7161 | 3 | < 0.1% |
11215-2761 | 3 | < 0.1% |
11260-804 | 3 | < 0.1% |
11410-933 | 3 | < 0.1% |
Other values (8561) | 9970 | 99.7% |
Most occurring characters
Value | Count | Frequency (%) |
1 | 28524 | 27.8% |
0 | 13767 | 13.4% |
2 | 11879 | 11.6% |
- | 10000 | 9.8% |
9 | 8313 | 8.1% |
3 | 7109 | 6.9% |
5 | 5632 | 5.5% |
4 | 5027 | 4.9% |
6 | 4841 | 4.7% |
7 | 3767 | 3.7% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 92561 | 90.2% |
Dash Punctuation | 10000 | 9.8% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
1 | 28524 | 30.8% |
0 | 13767 | 14.9% |
2 | 11879 | 12.8% |
9 | 8313 | 9.0% |
3 | 7109 | 7.7% |
5 | 5632 | 6.1% |
4 | 5027 | 5.4% |
6 | 4841 | 5.2% |
7 | 3767 | 4.1% |
8 | 3702 | 4.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 10000 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 102561 | 100.0% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
1 | 28524 | 27.8% |
0 | 13767 | 13.4% |
2 | 11879 | 11.6% |
- | 10000 | 9.8% |
9 | 8313 | 8.1% |
3 | 7109 | 6.9% |
5 | 5632 | 5.5% |
4 | 5027 | 4.9% |
6 | 4841 | 4.7% |
7 | 3767 | 3.7% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 102561 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
1 | 28524 | 27.8% |
0 | 13767 | 13.4% |
2 | 11879 | 11.6% |
- | 10000 | 9.8% |
9 | 8313 | 8.1% |
3 | 7109 | 6.9% |
5 | 5632 | 5.5% |
4 | 5027 | 4.9% |
6 | 4841 | 4.7% |
7 | 3767 | 3.7% |
지역지구구역_구분_코드
Categorical
HIGH CORRELATION
 
Distinct | 3 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
1 | 3604 |
---|---|
2 | 3268 |
3 | 3128 |
Length
Max length | 1 |
---|---|
Median length | 1 |
Mean length | 1 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 1 |
---|---|
2nd row | 2 |
3rd row | 1 |
4th row | 2 |
5th row | 2 |
Common Values
Value | Count | Frequency (%) |
1 | 3604 | 36.0% |
2 | 3268 | 32.7% |
3 | 3128 | 31.3% |
지역지구구역_코드
Categorical
HIGH CORRELATION
  IMBALANCE
 
Distinct | 37 |
---|---|
Distinct (%) | 0.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
<NA> | 9208 |
---|---|
1020 | 315 |
260 | 127 |
070 | 101 |
1022 | 61 |
Other values (32) | 188 |
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 3.9683 |
Min length | 2 |
Unique
Unique | 10 |
---|---|
Unique (%) | 0.1% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | 1022 |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 9208 | 92.1% |
1020 | 315 | 3.1% |
260 | 127 | 1.3% |
070 | 101 | 1.0% |
1022 | 61 | 0.6% |
1330 | 48 | 0.5% |
1023 | 18 | 0.2% |
103 | 18 | 0.2% |
1120 | 14 | 0.1% |
1021 | 10 | 0.1% |
Other values (27) | 80 | 0.8% |
대표_여부
Categorical
HIGH CORRELATION
  IMBALANCE
 
Distinct | 3 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
1 | 9971 |
---|---|
0 | 27 |
<NA> | 2 |
Length
Max length | 4 |
---|---|
Median length | 1 |
Mean length | 1.0006 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 1 |
---|---|
2nd row | 1 |
3rd row | 1 |
4th row | 1 |
5th row | 1 |
Common Values
Value | Count | Frequency (%) |
1 | 9971 | 99.7% |
0 | 27 | 0.3% |
<NA> | 2 | < 0.1% |
기타_지역지구구역
Text
MISSING
 
Distinct | 60 |
---|---|
Distinct (%) | 9.1% |
Missing | 9343 |
Missing (%) | 93.4% |
Memory size | 156.2 KiB |
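The two percentages in this overview appear to use different denominators: the missing share is taken over all rows, while the distinct share only matches if it is taken over non-missing rows. The sketch below checks that reading. It is an inference from the numbers, not something the report states, and the variable names are illustrative.

```python
# Figures copied from the 기타_지역지구구역 overview and the dataset statistics.
n_observations = 10_000
n_missing = 9_343
n_distinct = 60

# Rows that actually carry a value.
non_missing = n_observations - n_missing

# Missing (%) is taken over all rows.
missing_pct = round(n_missing / n_observations * 100, 1)

# Assumption: Distinct (%) is taken over non-missing rows only.
distinct_pct = round(n_distinct / non_missing * 100, 1)

print(non_missing, missing_pct, distinct_pct)
```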
Value | Count | Frequency (%) |
일반주거지역 | 288 | 43.1% |
주차장정비지구 | 102 | 15.3% |
자연녹지지역 | 47 | 7.0% |
개발제한구역 | 47 | 7.0% |
일반주거 | 34 | 5.1% |
주차장정비 | 17 | 2.5% |
제2종일반주거지역 | 12 | 1.8% |
2종일반주거지역 | 11 | 1.6% |
일반상업지역 | 8 | 1.2% |
4종미관,공원용지 | 6 | 0.9% |
Other values (53) | 96 | 14.4% |
Most occurring characters
Value | Count | Frequency (%) |
지 | 590 | 14.4% |
주 | 497 | 12.1% |
역 | 454 | 11.0% |
일 | 379 | 9.2% |
거 | 372 | 9.0% |
반 | 372 | 9.0% |
구 | 194 | 4.7% |
차 | 125 | 3.0% |
정 | 125 | 3.0% |
비 | 125 | 3.0% |
Other values (58) | 878 | 21.4% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 4031 | 98.1% |
Decimal Number | 54 | 1.3% |
Other Punctuation | 12 | 0.3% |
Space Separator | 11 | 0.3% |
Open Punctuation | 1 | < 0.1% |
Lowercase Letter | 1 | < 0.1% |
Close Punctuation | 1 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
지 | 590 | 14.6% |
주 | 497 | 12.3% |
역 | 454 | 11.3% |
일 | 379 | 9.4% |
거 | 372 | 9.2% |
반 | 372 | 9.2% |
구 | 194 | 4.8% |
차 | 125 | 3.1% |
정 | 125 | 3.1% |
비 | 125 | 3.1% |
Other values (47) | 798 | 19.8% |
Decimal Number
Value | Count | Frequency (%) |
2 | 28 | 51.9% |
4 | 9 | 16.7% |
1 | 8 | 14.8% |
3 | 8 | 14.8% |
5 | 1 | 1.9% |
Other Punctuation
Value | Count | Frequency (%) |
, | 11 | 91.7% |
/ | 1 | 8.3% |
Space Separator
Value | Count | Frequency (%) |
(space) | 11 | 100.0% |
Open Punctuation
Value | Count | Frequency (%) |
( | 1 | 100.0% |
Lowercase Letter
Value | Count | Frequency (%) |
m | 1 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 1 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 4031 | 98.1% |
Common | 79 | 1.9% |
Latin | 1 | < 0.1% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
지 | 590 | 14.6% |
주 | 497 | 12.3% |
역 | 454 | 11.3% |
일 | 379 | 9.4% |
거 | 372 | 9.2% |
반 | 372 | 9.2% |
구 | 194 | 4.8% |
차 | 125 | 3.1% |
정 | 125 | 3.1% |
비 | 125 | 3.1% |
Other values (47) | 798 | 19.8% |
Common
Value | Count | Frequency (%) |
2 | 28 | 35.4% |
(space) | 11 | 13.9% |
, | 11 | 13.9% |
4 | 9 | 11.4% |
1 | 8 | 10.1% |
3 | 8 | 10.1% |
5 | 1 | 1.3% |
( | 1 | 1.3% |
) | 1 | 1.3% |
/ | 1 | 1.3% |
Latin
Value | Count | Frequency (%) |
m | 1 | 100.0% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 4031 | 98.1% |
ASCII | 80 | 1.9% |
Most frequent character per block
Hangul
Value | Count | Frequency (%) |
지 | 590 | 14.6% |
주 | 497 | 12.3% |
역 | 454 | 11.3% |
일 | 379 | 9.4% |
거 | 372 | 9.2% |
반 | 372 | 9.2% |
구 | 194 | 4.8% |
차 | 125 | 3.1% |
정 | 125 | 3.1% |
비 | 125 | 3.1% |
Other values (47) | 798 | 19.8% |
ASCII
Value | Count | Frequency (%) |
2 | 28 | 35.0% |
(space) | 11 | 13.8% |
, | 11 | 13.8% |
4 | 9 | 11.2% |
1 | 8 | 10.0% |
3 | 8 | 10.0% |
5 | 1 | 1.2% |
( | 1 | 1.2% |
m | 1 | 1.2% |
) | 1 | 1.2% |
작업_일자
Categorical
CONSTANT
 
Distinct | 1 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
20111227 | 10000 |
---|---|
Length
Max length | 8 |
---|---|
Median length | 8 |
Mean length | 8 |
Min length | 8 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 20111227 |
---|---|
2nd row | 20111227 |
3rd row | 20111227 |
4th row | 20111227 |
5th row | 20111227 |
Common Values
Value | Count | Frequency (%) |
20111227 | 10000 | 100.0% |
Variable | 지역지구구역_구분_코드 | 지역지구구역_코드 | 대표_여부 | 기타_지역지구구역 |
---|---|---|---|---|
지역지구구역_구분_코드 | 1.000 | 0.989 | 0.000 | 1.000 |
지역지구구역_코드 | 0.989 | 1.000 | 0.819 | 0.997 |
대표_여부 | 0.000 | 0.819 | 1.000 | 0.812 |
기타_지역지구구역 | 1.000 | 0.997 | 0.812 | 1.000 |
Variable | 지역지구구역_구분_코드 | 지역지구구역_코드 | 대표_여부 |
---|---|---|---|
지역지구구역_구분_코드 | 1.000 | 0.874 | 0.000 |
지역지구구역_코드 | 0.874 | 1.000 | 0.667 |
대표_여부 | 0.000 | 0.667 | 1.000 |
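The report does not say which measure fills these matrices. For pairs of categorical columns a common choice is Cramér's V, which scales the chi-squared statistic of a contingency table into the range 0 to 1. The sketch below implements the plain, uncorrected form. `cramers_v` and the toy tables are hypothetical, and a profiling tool may apply a bias correction, so the values above need not match this formula exactly.

```python
import math

def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    n = sum(row_sums)

    # Chi-squared statistic: squared deviation from the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_sums[i] * col_sums[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected

    # Normalize by sample size and the smaller table dimension minus one.
    min_dim = min(len(row_sums), len(col_sums)) - 1
    if min_dim == 0:
        return 0.0
    return math.sqrt(chi2 / (n * min_dim))

# Hypothetical toy tables, not taken from the dataset.
perfect = cramers_v([[10, 0], [0, 10]])   # each row value determines the column value
independent = cramers_v([[5, 5], [5, 5]])  # no association
print(perfect, independent)
```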
Index | 관리_지역지구구역 | 관리_폐쇄말소대장 | 지역지구구역_구분_코드 | 지역지구구역_코드 | 대표_여부 | 기타_지역지구구역 | 작업_일자 |
---|---|---|---|---|---|---|---|
25334 | 11290-29008 | 11290-13273 | 1 | <NA> | 1 | <NA> | 20111227 |
994 | 11230-7204 | 11230-3489 | 2 | <NA> | 1 | <NA> | 20111227 |
30666 | 11215-2577 | 11215-1343 | 1 | 1022 | 1 | 2종일반주거지역 | 20111227 |
51952 | 11290-26242 | 11290-12097 | 2 | <NA> | 1 | <NA> | 20111227 |
51066 | 11290-34809 | 11290-15217 | 2 | <NA> | 1 | <NA> | 20111227 |
47656 | 11305-3137 | 11305-1589 | 2 | <NA> | 1 | <NA> | 20111227 |
51797 | 11305-4856 | 11305-2405 | 3 | <NA> | 1 | <NA> | 20111227 |
16642 | 11290-5963 | 11290-3402 | 3 | <NA> | 1 | <NA> | 20111227 |
55187 | 11410-2713 | 11410-1610 | 1 | <NA> | 1 | <NA> | 20111227 |
13313 | 11290-320 | 11290-1351 | 1 | <NA> | 1 | <NA> | 20111227 |
Index | 관리_지역지구구역 | 관리_폐쇄말소대장 | 지역지구구역_구분_코드 | 지역지구구역_코드 | 대표_여부 | 기타_지역지구구역 | 작업_일자 |
---|---|---|---|---|---|---|---|
56865 | 11350-4870 | 11350-2979 | 1 | <NA> | 1 | <NA> | 20111227 |
4301 | 11230-2176 | 11230-1565 | 2 | <NA> | 1 | <NA> | 20111227 |
3905 | 11260-2654 | 11260-1225 | 3 | <NA> | 1 | <NA> | 20111227 |
53139 | 11305-9847 | 11305-5018 | 2 | <NA> | 1 | <NA> | 20111227 |
45396 | 11320-2155 | 11320-34 | 1 | <NA> | 1 | <NA> | 20111227 |
43597 | 11320-6611 | 11320-2571 | 1 | <NA> | 1 | <NA> | 20111227 |
54455 | 11410-10119 | 11410-4600 | 1 | <NA> | 1 | <NA> | 20111227 |
20777 | 11290-10678 | 11290-5599 | 3 | <NA> | 1 | <NA> | 20111227 |
27291 | 11290-18278 | 11290-8661 | 1 | <NA> | 1 | <NA> | 20111227 |
24730 | 11290-38649 | 11290-16662 | 2 | <NA> | 1 | <NA> | 20111227 |