Dataset statistics
Number of variables | 8 |
---|---|
Number of observations | 30 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 1 |
Duplicate rows (%) | 3.3% |
Total size in memory | 2.2 KiB |
Average record size in memory | 74.4 B |
Variable types
Numeric | 4 |
---|---|
Categorical | 3 |
Text | 1 |
Dataset
Description | Sample data |
---|---|
Author | 국토연구원 (Korea Research Institute for Human Settlements) |
URL | https://bigdata-region.kr/#/dataset/db83d675-b728-4a34-a144-a6acd0379722 |
ROAD_ST_NM has constant value "현황도로" | Constant |
Dataset has 1 (3.3%) duplicate rows | Duplicates |
MANAGE_NO is highly overall correlated with RN_CD and 2 other fields | High correlation |
ROAD_LT is highly overall correlated with NTFC_DE | High correlation |
RN_CD is highly overall correlated with MANAGE_NO and 2 other fields | High correlation |
SIG_CD is highly overall correlated with MANAGE_NO and 2 other fields | High correlation |
NTFC_DE is highly overall correlated with MANAGE_NO and 3 other fields | High correlation |
Reproduction
Analysis started | 2023-12-10 14:07:23.467648 |
---|---|
Analysis finished | 2023-12-10 14:07:26.956405 |
Duration | 3.49 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
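The reported duration can be cross-checked from the two timestamps in the table above. This is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the report.

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
started = datetime.fromisoformat("2023-12-10 14:07:23.467648")
finished = datetime.fromisoformat("2023-12-10 14:07:26.956405")

# Elapsed wall-clock time between start and finish, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```

Rounded to two decimals, the difference agrees with the Duration row.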
MANAGE_NO
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 22 |
---|---|
Distinct (%) | 73.3% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 4026933 |
Minimum | 3005032 |
---|---|
Maximum | 4856583 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 402.0 B |
Quantile statistics
Minimum | 3005032 |
---|---|
5-th percentile | 3050475 |
Q1 | 4115316.5 |
median | 4121417 |
Q3 | 4121749.8 |
95-th percentile | 4525921.6 |
Maximum | 4856583 |
Range | 1851551 |
Interquartile range (IQR) | 6433.25 |
Descriptive statistics
Standard deviation | 430158.78 |
---|---|
Coefficient of variation (CV) | 0.10682045 |
Kurtosis | 2.3194104 |
Mean | 4026933 |
Median Absolute Deviation (MAD) | 1658.5 |
Skewness | -1.1755145 |
Sum | 1.2080799 × 10^8 |
Variance | 1.8503657 × 10^11 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
4121751 | 3 | 10.0% |
4118463 | 2 | 6.7% |
4121749 | 2 | 6.7% |
4115019 | 2 | 6.7% |
4115654 | 2 | 6.7% |
4856583 | 2 | 6.7% |
4121750 | 2 | 6.7% |
4121780 | 1 | 3.3% |
4121713 | 1 | 3.3% |
4121328 | 1 | 3.3% |
Other values (12) | 12 | 40.0% |
Value | Count | Frequency (%) |
3005032 | 1 | 3.3% |
3005038 | 1 | 3.3% |
3106009 | 1 | 3.3% |
3107013 | 1 | 3.3% |
4115019 | 2 | 6.7% |
4115203 | 1 | 3.3% |
4115204 | 1 | 3.3% |
4115654 | 2 | 6.7% |
4118461 | 1 | 3.3% |
4118463 | 2 | 6.7% |
Value | Count | Frequency (%) |
4856583 | 2 | 6.7% |
4121780 | 1 | 3.3% |
4121751 | 3 | 10.0% |
4121750 | 2 | 6.7% |
4121749 | 2 | 6.7% |
4121713 | 1 | 3.3% |
4121698 | 1 | 3.3% |
4121578 | 1 | 3.3% |
4121509 | 1 | 3.3% |
4121506 | 1 | 3.3% |
SIG_CD
Categorical
HIGH CORRELATION
 
Distinct | 3 |
---|---|
Distinct (%) | 10.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 372.0 B |
11290 | 19 |
---|---|
11230 | 7 |
11260 | 4 |
Length
Max length | 5 |
---|---|
Median length | 5 |
Mean length | 5 |
Min length | 5 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 11260 |
---|---|
2nd row | 11290 |
3rd row | 11230 |
4th row | 11230 |
5th row | 11290 |
Common Values
Value | Count | Frequency (%) |
11290 | 19 | 63.3% |
11230 | 7 | 23.3% |
11260 | 4 | 13.3% |
ROAD_LT
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 28 |
---|---|
Distinct (%) | 93.3% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 520.9 |
Minimum | 6 |
---|---|
Maximum | 11800 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 402.0 B |
Quantile statistics
Minimum | 6 |
---|---|
5-th percentile | 11.45 |
Q1 | 28 |
median | 61 |
Q3 | 201.75 |
95-th percentile | 496 |
Maximum | 11800 |
Range | 11794 |
Interquartile range (IQR) | 173.75 |
Descriptive statistics
Standard deviation | 2135.7804 |
---|---|
Coefficient of variation (CV) | 4.1001735 |
Kurtosis | 29.65894 |
Mean | 520.9 |
Median Absolute Deviation (MAD) | 48.5 |
Skewness | 5.4325656 |
Sum | 15627 |
Variance | 4561557.9 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
12 | 2 | 6.7% |
496 | 2 | 6.7% |
21 | 1 | 3.3% |
434 | 1 | 3.3% |
32 | 1 | 3.3% |
452 | 1 | 3.3% |
40 | 1 | 3.3% |
65 | 1 | 3.3% |
16 | 1 | 3.3% |
292 | 1 | 3.3% |
Other values (18) | 18 | 60.0% |
Value | Count | Frequency (%) |
6 | 1 | 3.3% |
11 | 1 | 3.3% |
12 | 2 | 6.7% |
15 | 1 | 3.3% |
16 | 1 | 3.3% |
21 | 1 | 3.3% |
27 | 1 | 3.3% |
31 | 1 | 3.3% |
32 | 1 | 3.3% |
40 | 1 | 3.3% |
Value | Count | Frequency (%) |
11800 | 1 | 3.3% |
496 | 2 | 6.7% |
452 | 1 | 3.3% |
434 | 1 | 3.3% |
292 | 1 | 3.3% |
217 | 1 | 3.3% |
203 | 1 | 3.3% |
198 | 1 | 3.3% |
164 | 1 | 3.3% |
138 | 1 | 3.3% |
RN
Text
Distinct | 22 |
---|---|
Distinct (%) | 73.3% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 372.0 B |
Value | Count | Frequency (%) |
화랑로18나길 | 3 | 10.0% |
경희대로1길 | 2 | 6.7% |
홍릉로1가길 | 2 | 6.7% |
고려대로1길 | 2 | 6.7% |
화랑로18길 | 2 | 6.7% |
용마공원로4길 | 2 | 6.7% |
화랑로18가길 | 2 | 6.7% |
장월로14길 | 1 | 3.3% |
한천로76길 | 1 | 3.3% |
성북로16가길 | 1 | 3.3% |
Other values (12) | 12 | 40.0% |
Most occurring characters
Value | Count | Frequency (%) |
로 | 30 | 16.1% |
길 | 26 | 14.0% |
1 | 17 | 9.1% |
화 | 8 | 4.3% |
랑 | 8 | 4.3% |
8 | 7 | 3.8% |
가 | 6 | 3.2% |
4 | 6 | 3.2% |
대 | 6 | 3.2% |
마 | 4 | 2.2% |
Other values (33) | 68 | 36.6% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 145 | 78.0% |
Decimal Number | 41 | 22.0% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
로 | 30 | 20.7% |
길 | 26 | 17.9% |
화 | 8 | 5.5% |
랑 | 8 | 5.5% |
가 | 6 | 4.1% |
대 | 6 | 4.1% |
마 | 4 | 2.8% |
용 | 4 | 2.8% |
공 | 4 | 2.8% |
장 | 4 | 2.8% |
Other values (24) | 45 | 31.0% |
Decimal Number
Value | Count | Frequency (%) |
1 | 17 | 41.5% |
8 | 7 | 17.1% |
4 | 6 | 14.6% |
5 | 2 | 4.9% |
3 | 2 | 4.9% |
6 | 2 | 4.9% |
7 | 2 | 4.9% |
0 | 2 | 4.9% |
2 | 1 | 2.4% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 145 | 78.0% |
Common | 41 | 22.0% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
로 | 30 | 20.7% |
길 | 26 | 17.9% |
화 | 8 | 5.5% |
랑 | 8 | 5.5% |
가 | 6 | 4.1% |
대 | 6 | 4.1% |
마 | 4 | 2.8% |
용 | 4 | 2.8% |
공 | 4 | 2.8% |
장 | 4 | 2.8% |
Other values (24) | 45 | 31.0% |
Common
Value | Count | Frequency (%) |
1 | 17 | 41.5% |
8 | 7 | 17.1% |
4 | 6 | 14.6% |
5 | 2 | 4.9% |
3 | 2 | 4.9% |
6 | 2 | 4.9% |
7 | 2 | 4.9% |
0 | 2 | 4.9% |
2 | 1 | 2.4% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 145 | 78.0% |
ASCII | 41 | 22.0% |
Most frequent character per block
Hangul
Value | Count | Frequency (%) |
로 | 30 | 20.7% |
길 | 26 | 17.9% |
화 | 8 | 5.5% |
랑 | 8 | 5.5% |
가 | 6 | 4.1% |
대 | 6 | 4.1% |
마 | 4 | 2.8% |
용 | 4 | 2.8% |
공 | 4 | 2.8% |
장 | 4 | 2.8% |
Other values (24) | 45 | 31.0% |
ASCII
Value | Count | Frequency (%) |
1 | 17 | 41.5% |
8 | 7 | 17.1% |
4 | 6 | 14.6% |
5 | 2 | 4.9% |
3 | 2 | 4.9% |
6 | 2 | 4.9% |
7 | 2 | 4.9% |
0 | 2 | 4.9% |
2 | 1 | 2.4% |
RN_CD
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 22 |
---|---|
Distinct (%) | 73.3% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 4026933 |
Minimum | 3005032 |
---|---|
Maximum | 4856583 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 402.0 B |
Quantile statistics
Minimum | 3005032 |
---|---|
5-th percentile | 3050475 |
Q1 | 4115316.5 |
median | 4121417 |
Q3 | 4121749.8 |
95-th percentile | 4525921.6 |
Maximum | 4856583 |
Range | 1851551 |
Interquartile range (IQR) | 6433.25 |
Descriptive statistics
Standard deviation | 430158.78 |
---|---|
Coefficient of variation (CV) | 0.10682045 |
Kurtosis | 2.3194104 |
Mean | 4026933 |
Median Absolute Deviation (MAD) | 1658.5 |
Skewness | -1.1755145 |
Sum | 1.2080799 × 10^8 |
Variance | 1.8503657 × 10^11 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
4121751 | 3 | 10.0% |
4118463 | 2 | 6.7% |
4121749 | 2 | 6.7% |
4115019 | 2 | 6.7% |
4115654 | 2 | 6.7% |
4856583 | 2 | 6.7% |
4121750 | 2 | 6.7% |
4121780 | 1 | 3.3% |
4121713 | 1 | 3.3% |
4121328 | 1 | 3.3% |
Other values (12) | 12 | 40.0% |
Value | Count | Frequency (%) |
3005032 | 1 | 3.3% |
3005038 | 1 | 3.3% |
3106009 | 1 | 3.3% |
3107013 | 1 | 3.3% |
4115019 | 2 | 6.7% |
4115203 | 1 | 3.3% |
4115204 | 1 | 3.3% |
4115654 | 2 | 6.7% |
4118461 | 1 | 3.3% |
4118463 | 2 | 6.7% |
Value | Count | Frequency (%) |
4856583 | 2 | 6.7% |
4121780 | 1 | 3.3% |
4121751 | 3 | 10.0% |
4121750 | 2 | 6.7% |
4121749 | 2 | 6.7% |
4121713 | 1 | 3.3% |
4121698 | 1 | 3.3% |
4121578 | 1 | 3.3% |
4121509 | 1 | 3.3% |
4121506 | 1 | 3.3% |
NTFC_DE
Categorical
HIGH CORRELATION
 
Distinct | 5 |
---|---|
Distinct (%) | 16.7% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 372.0 B |
20100610 | 20 |
---|---|
20100617 | 6 |
20181224 | 2 |
20100422 | 1 |
20090710 | 1 |
Length
Max length | 8 |
---|---|
Median length | 8 |
Mean length | 8 |
Min length | 8 |
Unique
Unique | 2 |
---|---|
Unique (%) | 6.7% |
Sample
1st row | 20100610 |
---|---|
2nd row | 20100422 |
3rd row | 20100617 |
4th row | 20100617 |
5th row | 20100610 |
Common Values
Value | Count | Frequency (%) |
20100610 | 20 | 66.7% |
20100617 | 6 | 20.0% |
20181224 | 2 | 6.7% |
20100422 | 1 | 3.3% |
20090710 | 1 | 3.3% |
ROAD_BT
Real number (ℝ)
Distinct | 9 |
---|---|
Distinct (%) | 30.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 4.5666667 |
Minimum | 1 |
---|---|
Maximum | 21 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 402.0 B |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 1.45 |
Q1 | 2 |
median | 4 |
Q3 | 5 |
95-th percentile | 7.55 |
Maximum | 21 |
Range | 20 |
Interquartile range (IQR) | 3 |
Descriptive statistics
Standard deviation | 3.6263721 |
---|---|
Coefficient of variation (CV) | 0.79409608 |
Kurtosis | 14.764185 |
Mean | 4.5666667 |
Median Absolute Deviation (MAD) | 2 |
Skewness | 3.3276168 |
Sum | 137 |
Variance | 13.150575 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
2 | 7 | 23.3% |
5 | 7 | 23.3% |
4 | 4 | 13.3% |
6 | 3 | 10.0% |
3 | 3 | 10.0% |
1 | 2 | 6.7% |
7 | 2 | 6.7% |
8 | 1 | 3.3% |
21 | 1 | 3.3% |
Value | Count | Frequency (%) |
1 | 2 | 6.7% |
2 | 7 | 23.3% |
3 | 3 | 10.0% |
4 | 4 | 13.3% |
5 | 7 | 23.3% |
6 | 3 | 10.0% |
7 | 2 | 6.7% |
8 | 1 | 3.3% |
21 | 1 | 3.3% |
Value | Count | Frequency (%) |
21 | 1 | 3.3% |
8 | 1 | 3.3% |
7 | 2 | 6.7% |
6 | 3 | 10.0% |
5 | 7 | 23.3% |
4 | 4 | 13.3% |
3 | 3 | 10.0% |
2 | 7 | 23.3% |
1 | 2 | 6.7% |
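Because the ROAD_BT frequency table lists every distinct value, the summary statistics for this variable can be recomputed from it. The sketch below assumes the sample (n - 1) variance and linearly interpolated quantiles; these conventions are inferred from the reported numbers rather than stated in the report, and the helper name `quantile` is illustrative.

```python
import math

# Value -> count pairs copied from the ROAD_BT frequency table.
value_counts = {1: 2, 2: 7, 3: 3, 4: 4, 5: 7, 6: 3, 7: 2, 8: 1, 21: 1}

# Expand the counts back into a sorted list of observations.
values = sorted(v for v, c in value_counts.items() for _ in range(c))
n = len(values)


def quantile(sorted_values, q):
    """Quantile with linear interpolation between neighbouring ranks (assumed convention)."""
    position = q * (len(sorted_values) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


total = sum(values)
mean = total / n
variance = sum((x - mean) ** 2 for x in values) / (n - 1)  # sample variance (assumed)
std_dev = math.sqrt(variance)
cv = std_dev / mean  # coefficient of variation
median = quantile(values, 0.5)
iqr = quantile(values, 0.75) - quantile(values, 0.25)
# Median absolute deviation: median of distances from the median.
mad = quantile(sorted(abs(x - median) for x in values), 0.5)

print(n, total, round(mean, 7), round(variance, 6), round(std_dev, 7))
print(quantile(values, 0.05), median, quantile(values, 0.95), iqr, mad)
```

Under these assumptions the recomputed sum, mean, variance, quantiles, IQR and MAD agree with the values in the ROAD_BT tables above.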
ROAD_ST_NM
Categorical
CONSTANT
 
Distinct | 1 |
---|---|
Distinct (%) | 3.3% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 372.0 B |
현황도로 | 30 |
---|---|
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 4 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 현황도로 |
---|---|
2nd row | 현황도로 |
3rd row | 현황도로 |
4th row | 현황도로 |
5th row | 현황도로 |
Common Values
Value | Count | Frequency (%) |
현황도로 | 30 | 100.0% |
Variable | MANAGE_NO | SIG_CD | ROAD_LT | RN | RN_CD | NTFC_DE | ROAD_BT |
---|---|---|---|---|---|---|---|
MANAGE_NO | 1.000 | 0.656 | 0.680 | 1.000 | 1.000 | 0.990 | 0.000 |
SIG_CD | 0.656 | 1.000 | 0.000 | 1.000 | 0.656 | 0.686 | 0.252 |
ROAD_LT | 0.680 | 0.000 | 1.000 | 1.000 | 0.680 | 1.000 | 0.000 |
RN | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.860 |
RN_CD | 1.000 | 0.656 | 0.680 | 1.000 | 1.000 | 0.990 | 0.000 |
NTFC_DE | 0.990 | 0.686 | 1.000 | 1.000 | 0.990 | 1.000 | 0.483 |
ROAD_BT | 0.000 | 0.252 | 0.000 | 0.860 | 0.000 | 0.483 | 1.000 |
Variable | NTFC_DE | SIG_CD |
---|---|---|
NTFC_DE | 1.000 | 0.637 |
SIG_CD | 0.637 | 1.000 |
Variable | MANAGE_NO | ROAD_LT | RN_CD | ROAD_BT | SIG_CD | NTFC_DE |
---|---|---|---|---|---|---|
MANAGE_NO | 1.000 | 0.062 | 1.000 | 0.139 | 0.608 | 0.870 |
ROAD_LT | 0.062 | 1.000 | 0.062 | 0.208 | 0.000 | 0.945 |
RN_CD | 1.000 | 0.062 | 1.000 | 0.139 | 0.608 | 0.870 |
ROAD_BT | 0.139 | 0.208 | 0.139 | 1.000 | 0.312 | 0.155 |
SIG_CD | 0.608 | 0.000 | 0.608 | 0.312 | 1.000 | 0.637 |
NTFC_DE | 0.870 | 0.945 | 0.870 | 0.155 | 0.637 | 1.000 |
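The "High correlation" alerts near the top of the report can be reproduced from the six-variable matrix directly above. The sketch below assumes a cut-off of 0.5; that threshold is an assumption that happens to match every alert, not a setting quoted from the report, and the function name `highly_correlated` is illustrative.

```python
# Correlation matrix copied from the table above (rows and columns in the same order).
columns = ["MANAGE_NO", "ROAD_LT", "RN_CD", "ROAD_BT", "SIG_CD", "NTFC_DE"]
matrix = [
    [1.000, 0.062, 1.000, 0.139, 0.608, 0.870],
    [0.062, 1.000, 0.062, 0.208, 0.000, 0.945],
    [1.000, 0.062, 1.000, 0.139, 0.608, 0.870],
    [0.139, 0.208, 0.139, 1.000, 0.312, 0.155],
    [0.608, 0.000, 0.608, 0.312, 1.000, 0.637],
    [0.870, 0.945, 0.870, 0.155, 0.637, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, chosen because it reproduces the alerts


def highly_correlated(names, values, threshold):
    """Map each variable to the other variables whose |correlation| exceeds the threshold."""
    result = {}
    for i, name in enumerate(names):
        partners = [
            names[j]
            for j, value in enumerate(values[i])
            if j != i and abs(value) > threshold
        ]
        if partners:
            result[name] = partners
    return result


alerts = highly_correlated(columns, matrix, THRESHOLD)
for name, partners in alerts.items():
    print(f"{name} is highly correlated with {', '.join(partners)}")
```

With this threshold, MANAGE_NO, RN_CD and SIG_CD each get three partners, NTFC_DE gets four, ROAD_LT gets only NTFC_DE, and ROAD_BT gets none, which matches the alert list.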
Index | MANAGE_NO | SIG_CD | ROAD_LT | RN | RN_CD | NTFC_DE | ROAD_BT | ROAD_ST_NM |
---|---|---|---|---|---|---|---|---|
0 | 4118463 | 11260 | 21 | 용마공원로4길 | 4118463 | 20100610 | 2 | 현황도로 |
1 | 3005038 | 11290 | 11800 | 한천로 | 3005038 | 20100422 | 2 | 현황도로 |
2 | 4115019 | 11230 | 109 | 경희대로1길 | 4115019 | 20100617 | 6 | 현황도로 |
3 | 4115654 | 11230 | 31 | 홍릉로1가길 | 4115654 | 20100617 | 1 | 현황도로 |
4 | 4121578 | 11290 | 12 | 장위로40길 | 4121578 | 20100610 | 5 | 현황도로 |
5 | 3106009 | 11260 | 58 | 용마공원로 | 3106009 | 20100610 | 4 | 현황도로 |
6 | 4115019 | 11230 | 217 | 경희대로1길 | 4115019 | 20100617 | 7 | 현황도로 |
7 | 4121506 | 11290 | 138 | 장월로10길 | 4121506 | 20100610 | 4 | 현황도로 |
8 | 4118461 | 11260 | 57 | 용마공원로2길 | 4118461 | 20100610 | 4 | 현황도로 |
9 | 4856583 | 11290 | 496 | 고려대로1길 | 4856583 | 20181224 | 5 | 현황도로 |
Index | MANAGE_NO | SIG_CD | ROAD_LT | RN | RN_CD | NTFC_DE | ROAD_BT | ROAD_ST_NM |
---|---|---|---|---|---|---|---|---|
20 | 4121751 | 11290 | 12 | 화랑로18나길 | 4121751 | 20100610 | 5 | 현황도로 |
21 | 4121751 | 11290 | 203 | 화랑로18나길 | 4121751 | 20100610 | 3 | 현황도로 |
22 | 4121750 | 11290 | 198 | 화랑로18길 | 4121750 | 20100610 | 6 | 현황도로 |
23 | 4121509 | 11290 | 292 | 장월로14길 | 4121509 | 20100610 | 3 | 현황도로 |
24 | 4121229 | 11290 | 16 | 북악산로15길 | 4121229 | 20100610 | 4 | 현황도로 |
25 | 4121698 | 11290 | 65 | 창경궁로43길 | 4121698 | 20100610 | 5 | 현황도로 |
26 | 4121328 | 11290 | 40 | 성북로16가길 | 4121328 | 20100610 | 2 | 현황도로 |
27 | 4121713 | 11290 | 452 | 한천로76길 | 4121713 | 20100610 | 21 | 현황도로 |
28 | 4856583 | 11290 | 496 | 고려대로1길 | 4856583 | 20181224 | 5 | 현황도로 |
29 | 4121780 | 11290 | 32 | 화랑로37가길 | 4121780 | 20100610 | 2 | 현황도로 |
Most frequently occurring
Index | MANAGE_NO | SIG_CD | ROAD_LT | RN | RN_CD | NTFC_DE | ROAD_BT | ROAD_ST_NM | # duplicates |
---|---|---|---|---|---|---|---|---|---|
0 | 4856583 | 11290 | 496 | 고려대로1길 | 4856583 | 20181224 | 5 | 현황도로 | 2 |
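The duplicate row reported above is visible in the sample rows (indices 9 and 28). The sketch below uses only the 20 rows shown in the first-rows and last-rows tables, not the full 30-row dataset, and assumes the percentage is the number of distinct duplicated rows divided by the 30 observations. The same rows also show MANAGE_NO and RN_CD carrying identical values, which is consistent with their correlation of 1.000.

```python
from collections import Counter

# The 20 sample rows shown in the report (indices 0-9 and 20-29).
# Columns: MANAGE_NO, SIG_CD, ROAD_LT, RN, RN_CD, NTFC_DE, ROAD_BT, ROAD_ST_NM
rows = [
    (4118463, 11260, 21, "용마공원로4길", 4118463, 20100610, 2, "현황도로"),
    (3005038, 11290, 11800, "한천로", 3005038, 20100422, 2, "현황도로"),
    (4115019, 11230, 109, "경희대로1길", 4115019, 20100617, 6, "현황도로"),
    (4115654, 11230, 31, "홍릉로1가길", 4115654, 20100617, 1, "현황도로"),
    (4121578, 11290, 12, "장위로40길", 4121578, 20100610, 5, "현황도로"),
    (3106009, 11260, 58, "용마공원로", 3106009, 20100610, 4, "현황도로"),
    (4115019, 11230, 217, "경희대로1길", 4115019, 20100617, 7, "현황도로"),
    (4121506, 11290, 138, "장월로10길", 4121506, 20100610, 4, "현황도로"),
    (4118461, 11260, 57, "용마공원로2길", 4118461, 20100610, 4, "현황도로"),
    (4856583, 11290, 496, "고려대로1길", 4856583, 20181224, 5, "현황도로"),
    (4121751, 11290, 12, "화랑로18나길", 4121751, 20100610, 5, "현황도로"),
    (4121751, 11290, 203, "화랑로18나길", 4121751, 20100610, 3, "현황도로"),
    (4121750, 11290, 198, "화랑로18길", 4121750, 20100610, 6, "현황도로"),
    (4121509, 11290, 292, "장월로14길", 4121509, 20100610, 3, "현황도로"),
    (4121229, 11290, 16, "북악산로15길", 4121229, 20100610, 4, "현황도로"),
    (4121698, 11290, 65, "창경궁로43길", 4121698, 20100610, 5, "현황도로"),
    (4121328, 11290, 40, "성북로16가길", 4121328, 20100610, 2, "현황도로"),
    (4121713, 11290, 452, "한천로76길", 4121713, 20100610, 21, "현황도로"),
    (4856583, 11290, 496, "고려대로1길", 4856583, 20181224, 5, "현황도로"),
    (4121780, 11290, 32, "화랑로37가길", 4121780, 20100610, 2, "현황도로"),
]

TOTAL_ROWS = 30  # number of observations stated in the report

# Rows that occur more than once, with their occurrence counts.
duplicates = {row: count for row, count in Counter(rows).items() if count > 1}
duplicate_pct = 100 * len(duplicates) / TOTAL_ROWS  # assumed definition of the percentage

# MANAGE_NO (position 0) and RN_CD (position 4) agree in every visible row.
codes_identical = all(row[0] == row[4] for row in rows)

print(duplicates)
print(f"{duplicate_pct:.1f}%", codes_identical)
```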