Overview

Dataset statistics

Number of variables: 8
Number of observations: 30
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1
Duplicate rows (%): 3.3%
Total size in memory: 2.2 KiB
Average record size in memory: 74.4 B
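
The percentage and size figures above follow from the raw counts. The sketch below is a minimal illustration, assuming one-decimal rounding and 1 KiB = 1024 B; the variable names are illustrative and not part of the report.

```python
# Raw counts as reported in the dataset statistics.
n_variables = 8
n_observations = 30
missing_cells = 0
duplicate_rows = 1
avg_record_bytes = 74.4  # average record size in memory, in bytes

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
total_kib = avg_record_bytes * n_observations / 1024  # assumption: 1 KiB = 1024 B

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Total size in memory: {total_kib:.1f} KiB")
```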

Variable types

Numeric: 4
Categorical: 3
Text: 1

Dataset

Description: Sample data
Author: 국토연구원 (Korea Research Institute for Human Settlements)
URL: https://bigdata-region.kr/#/dataset/db83d675-b728-4a34-a144-a6acd0379722

Alerts

ROAD_ST_NM has constant value "현황도로" (Constant)
Dataset has 1 (3.3%) duplicate rows (Duplicates)
MANAGE_NO is highly overall correlated with RN_CD and 2 other fields (High correlation)
ROAD_LT is highly overall correlated with NTFC_DE (High correlation)
RN_CD is highly overall correlated with MANAGE_NO and 2 other fields (High correlation)
SIG_CD is highly overall correlated with MANAGE_NO and 2 other fields (High correlation)
NTFC_DE is highly overall correlated with MANAGE_NO and 3 other fields (High correlation)

Reproduction

Analysis started: 2023-12-10 14:07:23.467648
Analysis finished: 2023-12-10 14:07:26.956405
Duration: 3.49 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
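
The reported duration can be checked against the two timestamps. This is a minimal sketch using only the Python standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-10 14:07:23.467648")
finished = datetime.fromisoformat("2023-12-10 14:07:26.956405")

duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```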

Variables

MANAGE_NO
Real number (ℝ)

HIGH CORRELATION 

Distinct: 22
Distinct (%): 73.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4026933
Minimum: 3005032
Maximum: 4856583
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 3005032
5-th percentile: 3050475
Q1: 4115316.5
Median: 4121417
Q3: 4121749.8
95-th percentile: 4525921.6
Maximum: 4856583
Range: 1851551
Interquartile range (IQR): 6433.25

Descriptive statistics

Standard deviation: 430158.78
Coefficient of variation (CV): 0.10682045
Kurtosis: 2.3194104
Mean: 4026933
Median Absolute Deviation (MAD): 1658.5
Skewness: -1.1755145
Sum: 1.2080799 × 10^8
Variance: 1.8503657 × 10^11
Monotonicity: Not monotonic
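
Several of the MANAGE_NO statistics are derived from one another, which gives a quick consistency check. The sketch below assumes Q3 = 4121749.75 (the report shows it rounded to 4121749.8; this value is implied by Q1 plus the IQR) and treats the printed mean and standard deviation as exact.

```python
import math

# Values copied from the MANAGE_NO statistics.
minimum, maximum = 3005032, 4856583
q1 = 4115316.5
q3 = 4121749.75  # assumption: unrounded Q3, implied by Q1 + IQR
mean, std = 4026933.0, 430158.78
n = 30

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance
total = mean * n                 # Sum

print(value_range, iqr, round(cv, 4), f"{variance:.4e}", f"{total:.4e}")
```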
Histogram with fixed size bins (bins=22)

Common values

Value              Count  Frequency (%)
4121751            3      10.0%
4118463            2      6.7%
4121749            2      6.7%
4115019            2      6.7%
4115654            2      6.7%
4856583            2      6.7%
4121750            2      6.7%
4121780            1      3.3%
4121713            1      3.3%
4121328            1      3.3%
Other values (12)  12     40.0%

Minimum 10 values

Value    Count  Frequency (%)
3005032  1      3.3%
3005038  1      3.3%
3106009  1      3.3%
3107013  1      3.3%
4115019  2      6.7%
4115203  1      3.3%
4115204  1      3.3%
4115654  2      6.7%
4118461  1      3.3%
4118463  2      6.7%

Maximum 10 values

Value    Count  Frequency (%)
4856583  2      6.7%
4121780  1      3.3%
4121751  3      10.0%
4121750  2      6.7%
4121749  2      6.7%
4121713  1      3.3%
4121698  1      3.3%
4121578  1      3.3%
4121509  1      3.3%
4121506  1      3.3%

SIG_CD
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 11260
2nd row: 11290
3rd row: 11230
4th row: 11230
5th row: 11290

Common Values

Value  Count  Frequency (%)
11290  19     63.3%
11230  7      23.3%
11260  4      13.3%

Length

Histogram of lengths of the category

Common Values (Plot)

[Bar chart of SIG_CD value counts; image not preserved. Counts: 11290 (19, 63.3%), 11230 (7, 23.3%), 11260 (4, 13.3%).]

ROAD_LT
Real number (ℝ)

HIGH CORRELATION 

Distinct: 28
Distinct (%): 93.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 520.9
Minimum: 6
Maximum: 11800
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 6
5-th percentile: 11.45
Q1: 28
Median: 61
Q3: 201.75
95-th percentile: 496
Maximum: 11800
Range: 11794
Interquartile range (IQR): 173.75

Descriptive statistics

Standard deviation: 2135.7804
Coefficient of variation (CV): 4.1001735
Kurtosis: 29.65894
Mean: 520.9
Median Absolute Deviation (MAD): 48.5
Skewness: 5.4325656
Sum: 15627
Variance: 4561557.9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=28)

Common values

Value              Count  Frequency (%)
12                 2      6.7%
496                2      6.7%
21                 1      3.3%
434                1      3.3%
32                 1      3.3%
452                1      3.3%
40                 1      3.3%
65                 1      3.3%
16                 1      3.3%
292                1      3.3%
Other values (18)  18     60.0%

Minimum 10 values

Value  Count  Frequency (%)
6      1      3.3%
11     1      3.3%
12     2      6.7%
15     1      3.3%
16     1      3.3%
21     1      3.3%
27     1      3.3%
31     1      3.3%
32     1      3.3%
40     1      3.3%

Maximum 10 values

Value  Count  Frequency (%)
11800  1      3.3%
496    2      6.7%
452    1      3.3%
434    1      3.3%
292    1      3.3%
217    1      3.3%
203    1      3.3%
198    1      3.3%
164    1      3.3%
138    1      3.3%
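
The gap between the ROAD_LT mean (520.9) and median (61), together with the large skewness and kurtosis, points to a few extreme values. The sketch below applies the conventional 1.5 × IQR upper fence to the largest reported values. The fence rule is an assumption chosen for illustration; the report itself does not flag outliers this way.

```python
# Quartiles copied from the ROAD_LT quantile statistics.
q1, q3 = 28, 201.75
iqr = q3 - q1
upper_fence = q3 + 1.5 * iqr  # conventional Tukey fence (illustrative choice)

# Largest ROAD_LT observations from the report (496 occurs twice).
largest = [11800, 496, 496, 452, 434, 292, 217, 203, 198, 164, 138]
outliers = [value for value in largest if value > upper_fence]

# Mean after dropping only the single maximum, using the reported sum.
total, n = 15627, 30
trimmed_mean = (total - max(largest)) / (n - 1)

print(upper_fence, outliers, round(trimmed_mean, 1))
```

Dropping the single 11800 observation brings the mean much closer to the bulk of the data, which suggests that one record drives most of the reported skew.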

RN
Text

Distinct: 22
Distinct (%): 73.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 8
Median length: 7
Mean length: 6.2
Min length: 3

Characters and Unicode

Total characters: 186
Distinct characters: 43
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 15
Unique (%): 50.0%

Sample

1st row: 용마공원로4길
2nd row: 한천로
3rd row: 경희대로1길
4th row: 홍릉로1가길
5th row: 장위로40길

Common values

Value              Count  Frequency (%)
화랑로18나길       3      10.0%
경희대로1길        2      6.7%
홍릉로1가길        2      6.7%
고려대로1길        2      6.7%
화랑로18길         2      6.7%
용마공원로4길      2      6.7%
화랑로18가길       2      6.7%
장월로14길         1      3.3%
한천로76길         1      3.3%
성북로16가길       1      3.3%
Other values (12)  12     40.0%

Most occurring characters

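Value              Count  Frequency (%)
(not preserved)    30     16.1%
(not preserved)    26     14.0%
1                  17     9.1%
(not preserved)    8      4.3%
(not preserved)    8      4.3%
8                  7      3.8%
(not preserved)    6      3.2%
4                  6      3.2%
(not preserved)    6      3.2%
(not preserved)    4      2.2%
Other values (33)  68     36.6%

Most occurring categories

Value           Count  Frequency (%)
Other Letter    145    78.0%
Decimal Number  41     22.0%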
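
The two categories above correspond to the Unicode general categories Lo (Other Letter) and Nd (Decimal Number). The sketch below counts them for the five sample road names only, using Python's standard unicodedata module, so its totals are smaller than the report's 145 and 41, which cover all 30 rows.

```python
import unicodedata
from collections import Counter

# The five sample rows shown for the RN variable.
sample_rows = ["용마공원로4길", "한천로", "경희대로1길", "홍릉로1가길", "장위로40길"]

# Count the Unicode general category of every character.
category_counts = Counter(
    unicodedata.category(ch) for name in sample_rows for ch in name
)
total_chars = sum(len(name) for name in sample_rows)

print(dict(category_counts), total_chars)
```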

Most frequent character per category

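Other Letter

Value              Count  Frequency (%)
(not preserved)    30     20.7%
(not preserved)    26     17.9%
(not preserved)    8      5.5%
(not preserved)    8      5.5%
(not preserved)    6      4.1%
(not preserved)    6      4.1%
(not preserved)    4      2.8%
(not preserved)    4      2.8%
(not preserved)    4      2.8%
(not preserved)    4      2.8%
Other values (24)  45     31.0%

Decimal Number

Value  Count  Frequency (%)
1      17     41.5%
8      7      17.1%
4      6      14.6%
5      2      4.9%
3      2      4.9%
6      2      4.9%
7      2      4.9%
0      2      4.9%
2      1      2.4%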

Most occurring scripts

Value   Count  Frequency (%)
Hangul  145    78.0%
Common  41     22.0%

Most frequent character per script

Hangul

Value              Count  Frequency (%)
(not preserved)    30     20.7%
(not preserved)    26     17.9%
(not preserved)    8      5.5%
(not preserved)    8      5.5%
(not preserved)    6      4.1%
(not preserved)    6      4.1%
(not preserved)    4      2.8%
(not preserved)    4      2.8%
(not preserved)    4      2.8%
(not preserved)    4      2.8%
Other values (24)  45     31.0%

Common

Value  Count  Frequency (%)
1      17     41.5%
8      7      17.1%
4      6      14.6%
5      2      4.9%
3      2      4.9%
6      2      4.9%
7      2      4.9%
0      2      4.9%
2      1      2.4%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  145    78.0%
ASCII   41     22.0%

Most frequent character per block

Hangul

Value              Count  Frequency (%)
(not preserved)    30     20.7%
(not preserved)    26     17.9%
(not preserved)    8      5.5%
(not preserved)    8      5.5%
(not preserved)    6      4.1%
(not preserved)    6      4.1%
(not preserved)    4      2.8%
(not preserved)    4      2.8%
(not preserved)    4      2.8%
(not preserved)    4      2.8%
Other values (24)  45     31.0%

ASCII

Value  Count  Frequency (%)
1      17     41.5%
8      7      17.1%
4      6      14.6%
5      2      4.9%
3      2      4.9%
6      2      4.9%
7      2      4.9%
0      2      4.9%
2      1      2.4%

RN_CD
Real number (ℝ)

HIGH CORRELATION 

Distinct: 22
Distinct (%): 73.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4026933
Minimum: 3005032
Maximum: 4856583
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 3005032
5-th percentile: 3050475
Q1: 4115316.5
Median: 4121417
Q3: 4121749.8
95-th percentile: 4525921.6
Maximum: 4856583
Range: 1851551
Interquartile range (IQR): 6433.25

Descriptive statistics

Standard deviation: 430158.78
Coefficient of variation (CV): 0.10682045
Kurtosis: 2.3194104
Mean: 4026933
Median Absolute Deviation (MAD): 1658.5
Skewness: -1.1755145
Sum: 1.2080799 × 10^8
Variance: 1.8503657 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=22)

Common values

Value              Count  Frequency (%)
4121751            3      10.0%
4118463            2      6.7%
4121749            2      6.7%
4115019            2      6.7%
4115654            2      6.7%
4856583            2      6.7%
4121750            2      6.7%
4121780            1      3.3%
4121713            1      3.3%
4121328            1      3.3%
Other values (12)  12     40.0%

Minimum 10 values

Value    Count  Frequency (%)
3005032  1      3.3%
3005038  1      3.3%
3106009  1      3.3%
3107013  1      3.3%
4115019  2      6.7%
4115203  1      3.3%
4115204  1      3.3%
4115654  2      6.7%
4118461  1      3.3%
4118463  2      6.7%

Maximum 10 values

Value    Count  Frequency (%)
4856583  2      6.7%
4121780  1      3.3%
4121751  3      10.0%
4121750  2      6.7%
4121749  2      6.7%
4121713  1      3.3%
4121698  1      3.3%
4121578  1      3.3%
4121509  1      3.3%
4121506  1      3.3%

NTFC_DE
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 16.7%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 2
Unique (%): 6.7%

Sample

1st row: 20100610
2nd row: 20100422
3rd row: 20100617
4th row: 20100617
5th row: 20100610

Common Values

Value     Count  Frequency (%)
20100610  20     66.7%
20100617  6      20.0%
20181224  2      6.7%
20100422  1      3.3%
20090710  1      3.3%
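
All NTFC_DE values are eight characters long and look like dates in YYYYMMDD form, although the report treats the column as categorical. The sketch below parses the five distinct values under that assumption; the date interpretation is not confirmed by the report.

```python
from datetime import datetime

# Distinct NTFC_DE values and their counts, copied from the Common Values table.
ntfc_counts = {
    "20100610": 20,
    "20100617": 6,
    "20181224": 2,
    "20100422": 1,
    "20090710": 1,
}

# Assumption: each value encodes a calendar date as YYYYMMDD.
parsed = {
    datetime.strptime(raw, "%Y%m%d").date(): count
    for raw, count in ntfc_counts.items()
}
earliest, latest = min(parsed), max(parsed)

print(earliest.isoformat(), latest.isoformat(), sum(parsed.values()))
```

If the interpretation holds, converting the column to a date type before profiling would let the report show a time range instead of category counts.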

Length

Histogram of lengths of the category

Common Values (Plot)

[Bar chart of NTFC_DE value counts; image not preserved. Counts: 20100610 (20, 66.7%), 20100617 (6, 20.0%), 20181224 (2, 6.7%), 20100422 (1, 3.3%), 20090710 (1, 3.3%).]

ROAD_BT
Real number (ℝ)

Distinct: 9
Distinct (%): 30.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.5666667
Minimum: 1
Maximum: 21
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1.45
Q1: 2
Median: 4
Q3: 5
95-th percentile: 7.55
Maximum: 21
Range: 20
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 3.6263721
Coefficient of variation (CV): 0.79409608
Kurtosis: 14.764185
Mean: 4.5666667
Median Absolute Deviation (MAD): 2
Skewness: 3.3276168
Sum: 137
Variance: 13.150575
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)

Common values

Value  Count  Frequency (%)
2      7      23.3%
5      7      23.3%
4      4      13.3%
6      3      10.0%
3      3      10.0%
1      2      6.7%
7      2      6.7%
8      1      3.3%
21     1      3.3%

Minimum 10 values

Value  Count  Frequency (%)
1      2      6.7%
2      7      23.3%
3      3      10.0%
4      4      13.3%
5      7      23.3%
6      3      10.0%
7      2      6.7%
8      1      3.3%
21     1      3.3%

Maximum 10 values

Value  Count  Frequency (%)
21     1      3.3%
8      1      3.3%
7      2      6.7%
6      3      10.0%
5      7      23.3%
4      4      13.3%
3      3      10.0%
2      7      23.3%
1      2      6.7%
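
Because ROAD_BT has only nine distinct values, its frequency table lists every observation, so the summary statistics can be recomputed from it. The sketch below uses Python's standard statistics module and assumes the report's variance is the sample variance (n - 1 in the denominator), which matches the reported figure.

```python
import statistics

# ROAD_BT value -> count, copied from the frequency table.
road_bt_counts = {2: 7, 5: 7, 4: 4, 6: 3, 3: 3, 1: 2, 7: 2, 8: 1, 21: 1}

# Expand the table back into the 30 individual observations.
values = [value for value, count in road_bt_counts.items() for _ in range(count)]

total = sum(values)
mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance (n - 1)
std = statistics.stdev(values)

print(total, round(mean, 4), median, round(variance, 4), round(std, 4))
```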

ROAD_ST_NM
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 현황도로
2nd row: 현황도로
3rd row: 현황도로
4th row: 현황도로
5th row: 현황도로

Common Values

Value     Count  Frequency (%)
현황도로  30     100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Bar chart of ROAD_ST_NM value counts; image not preserved. Counts: 현황도로 (30, 100.0%).]

Interactions

[16 interaction plots; images not preserved.]

Correlations

           MANAGE_NO  SIG_CD  ROAD_LT  RN     RN_CD  NTFC_DE  ROAD_BT
MANAGE_NO  1.000      0.656   0.680    1.000  1.000  0.990    0.000
SIG_CD     0.656      1.000   0.000    1.000  0.656  0.686    0.252
ROAD_LT    0.680      0.000   1.000    1.000  0.680  1.000    0.000
RN         1.000      1.000   1.000    1.000  1.000  1.000    0.860
RN_CD      1.000      0.656   0.680    1.000  1.000  0.990    0.000
NTFC_DE    0.990      0.686   1.000    1.000  0.990  1.000    0.483
ROAD_BT    0.000      0.252   0.000    0.860  0.000  0.483    1.000

         NTFC_DE  SIG_CD
NTFC_DE  1.000    0.637
SIG_CD   0.637    1.000

           MANAGE_NO  ROAD_LT  RN_CD  ROAD_BT  SIG_CD  NTFC_DE
MANAGE_NO  1.000      0.062    1.000  0.139    0.608   0.870
ROAD_LT    0.062      1.000    0.062  0.208    0.000   0.945
RN_CD      1.000      0.062    1.000  0.139    0.608   0.870
ROAD_BT    0.139      0.208    0.139  1.000    0.312   0.155
SIG_CD     0.608      0.000    0.608  0.312    1.000   0.637
NTFC_DE    0.870      0.945    0.870  0.155    0.637   1.000
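
The high-correlation alerts in the Alerts section appear to line up with the last (six-variable) matrix above. The sketch below lists, for each variable, the other variables whose coefficient reaches a cutoff. The cutoff of 0.5 and the choice of matrix are assumptions made for illustration, not settings confirmed by the report.

```python
THRESHOLD = 0.5  # assumption: cutoff for a "high" correlation

# Last correlation matrix from the report (six variables).
columns = ["MANAGE_NO", "ROAD_LT", "RN_CD", "ROAD_BT", "SIG_CD", "NTFC_DE"]
matrix = [
    [1.000, 0.062, 1.000, 0.139, 0.608, 0.870],
    [0.062, 1.000, 0.062, 0.208, 0.000, 0.945],
    [1.000, 0.062, 1.000, 0.139, 0.608, 0.870],
    [0.139, 0.208, 0.139, 1.000, 0.312, 0.155],
    [0.608, 0.000, 0.608, 0.312, 1.000, 0.637],
    [0.870, 0.945, 0.870, 0.155, 0.637, 1.000],
]

# For each variable, collect the other variables at or above the cutoff.
high = {
    row_name: [
        col for col, value in zip(columns, row)
        if col != row_name and abs(value) >= THRESHOLD
    ]
    for row_name, row in zip(columns, matrix)
}

for name, partners in high.items():
    print(f"{name}: {partners}")
```

Under these assumptions the counts agree with the alerts: three partners each for MANAGE_NO, RN_CD and SIG_CD, one for ROAD_LT, four for NTFC_DE, and none for ROAD_BT.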

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

    MANAGE_NO  SIG_CD  ROAD_LT  RN             RN_CD    NTFC_DE   ROAD_BT  ROAD_ST_NM
0   4118463    11260   21       용마공원로4길  4118463  20100610  2        현황도로
1   3005038    11290   11800    한천로         3005038  20100422  2        현황도로
2   4115019    11230   109      경희대로1길    4115019  20100617  6        현황도로
3   4115654    11230   31       홍릉로1가길    4115654  20100617  1        현황도로
4   4121578    11290   12       장위로40길     4121578  20100610  5        현황도로
5   3106009    11260   58       용마공원로     3106009  20100610  4        현황도로
6   4115019    11230   217      경희대로1길    4115019  20100617  7        현황도로
7   4121506    11290   138      장월로10길     4121506  20100610  4        현황도로
8   4118461    11260   57       용마공원로2길  4118461  20100610  4        현황도로
9   4856583    11290   496      고려대로1길    4856583  20181224  5        현황도로

Last rows

    MANAGE_NO  SIG_CD  ROAD_LT  RN             RN_CD    NTFC_DE   ROAD_BT  ROAD_ST_NM
20  4121751    11290   12       화랑로18나길   4121751  20100610  5        현황도로
21  4121751    11290   203      화랑로18나길   4121751  20100610  3        현황도로
22  4121750    11290   198      화랑로18길     4121750  20100610  6        현황도로
23  4121509    11290   292      장월로14길     4121509  20100610  3        현황도로
24  4121229    11290   16       북악산로15길   4121229  20100610  4        현황도로
25  4121698    11290   65       창경궁로43길   4121698  20100610  5        현황도로
26  4121328    11290   40       성북로16가길   4121328  20100610  2        현황도로
27  4121713    11290   452      한천로76길     4121713  20100610  21       현황도로
28  4856583    11290   496      고려대로1길    4856583  20181224  5        현황도로
29  4121780    11290   32       화랑로37가길   4121780  20100610  2        현황도로

Duplicate rows

Most frequently occurring

   MANAGE_NO  SIG_CD  ROAD_LT  RN           RN_CD    NTFC_DE   ROAD_BT  ROAD_ST_NM  # duplicates
0  4856583    11290   496      고려대로1길  4856583  20181224  5        현황도로    2
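
A duplicated row like the one above can be found by counting identical records. The sketch below is a minimal standard-library illustration using four of the sample rows (indices 9, 27, 28 and 29); it is not how the profiling tool itself detects duplicates.

```python
from collections import Counter

# Four rows from the sample (indices 9, 27, 28, 29); columns in report order:
# MANAGE_NO, SIG_CD, ROAD_LT, RN, RN_CD, NTFC_DE, ROAD_BT, ROAD_ST_NM
rows = [
    (4856583, "11290", 496, "고려대로1길", 4856583, "20181224", 5, "현황도로"),
    (4121713, "11290", 452, "한천로76길", 4121713, "20100610", 21, "현황도로"),
    (4856583, "11290", 496, "고려대로1길", 4856583, "20181224", 5, "현황도로"),
    (4121780, "11290", 32, "화랑로37가길", 4121780, "20100610", 2, "현황도로"),
]

# Count identical tuples and keep those that occur more than once.
counts = Counter(rows)
duplicates = {row: n for row, n in counts.items() if n > 1}

for row, n in duplicates.items():
    print(n, row)
```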