Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 673.8 KiB
Average record size in memory: 69.0 B
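
The average record size appears to be the total in-memory size divided by the number of observations. The short check below reproduces it from the two figures reported above; the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" block of this report.
total_kib = 673.8   # total size in memory, in KiB
n_rows = 10_000     # number of observations

total_bytes = total_kib * 1024            # 1 KiB = 1024 bytes
avg_record_bytes = total_bytes / n_rows   # bytes per observation

print(f"{avg_record_bytes:.1f} B per record")
```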

Variable types

Numeric: 4
Categorical: 2
Text: 1

Dataset

Description: Gyeonggi-do BMS GIS route-unit information (경기도_BMS GIS 경로 단위 정보)
Author: Gyeonggi-do (경기도)
URL: https://data.gg.go.kr/portal/data/service/selectServicePage.do?&infId=HNIP5FSE7EYEJITHJR3X33138663&infSeq=1

Alerts

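등록일자 (registration date) is highly overall correlated with 사용구분 (usage flag) [High correlation]
사용구분 (usage flag) is highly overall correlated with 등록일자 (registration date) [High correlation]
시외버스추가경로구분 (intercity bus additional route flag) is highly imbalanced (96.9%) [Imbalance]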
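
The 96.9% imbalance figure is consistent with a score of one minus the normalized Shannon entropy of the class counts. The sketch below assumes that definition (it is inferred from the numbers, not stated in this report) and applies it to the two class counts reported for 시외버스추가경로구분 (9968 and 32); `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/H_max, where H is the Shannon entropy of the class counts.

    Assumed definition: 0 means perfectly balanced, 1 means a single class.
    """
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(counts))

# Class counts for 시외버스추가경로구분 taken from this report.
score = imbalance_score([9968, 32])
print(f"{score:.1%}")
```

Under this definition a 50/50 split scores 0, so a value this close to 1 says the minority class carries almost no information on its own.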

Reproduction

Analysis started: 2023-12-10 21:01:00.925421
Analysis finished: 2023-12-10 21:01:03.860670
Duration: 2.94 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
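
As a quick consistency check, the reported duration can be recomputed from the two timestamps above. This is only a worked example using the standard library, not part of the report's own tooling.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # timestamp layout used in this section

started = datetime.strptime("2023-12-10 21:01:00.925421", FMT)
finished = datetime.strptime("2023-12-10 21:01:03.860670", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```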

Variables

노선아이디 (route ID)
Real number (ℝ)

Distinct: 1851
Distinct (%): 18.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.2962998 × 10^8
Minimum: 2.0000001 × 10^8
Maximum: 2.4120301 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2.0000001 × 10^8
5-th percentile: 2.0000028 × 10^8
Q1: 2.2300004 × 10^8
Median: 2.3400047 × 10^8
Q3: 2.4100102 × 10^8
95-th percentile: 2.4100704 × 10^8
Maximum: 2.4120301 × 10^8
Range: 41203001
Interquartile range (IQR): 18000982

Descriptive statistics

Standard deviation: 12197687
Coefficient of variation (CV): 0.05311888
Kurtosis: 0.032676724
Mean: 2.2962998 × 10^8
Median Absolute Deviation (MAD): 7002126
Skewness: -1.0501257
Sum: 2.2962998 × 10^12
Variance: 1.4878358 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
200000028 | 87 | 0.9%
241006942 | 79 | 0.8%
241006799 | 69 | 0.7%
224000014 | 67 | 0.7%
241006390 | 65 | 0.7%
241000040 | 63 | 0.6%
229000035 | 56 | 0.6%
241003900 | 48 | 0.5%
241000320 | 48 | 0.5%
233000383 | 47 | 0.5%
Other values (1841) | 9371 | 93.7%

Value | Count | Frequency (%)
200000009 | 3 | < 0.1%
200000015 | 16 | 0.2%
200000017 | 5 | 0.1%
200000021 | 1 | < 0.1%
200000024 | 6 | 0.1%
200000028 | 87 | 0.9%
200000032 | 8 | 0.1%
200000034 | 3 | < 0.1%
200000042 | 2 | < 0.1%
200000057 | 2 | < 0.1%

Value | Count | Frequency (%)
241203010 | 7 | 0.1%
241105900 | 6 | 0.1%
241103250 | 1 | < 0.1%
241103010 | 5 | 0.1%
241007245 | 16 | 0.2%
241007243 | 24 | 0.2%
241007225 | 18 | 0.2%
241007203 | 22 | 0.2%
241007199 | 39 | 0.4%
241007197 | 10 | 0.1%

링크순서 (link sequence)
Real number (ℝ)

Distinct: 440
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 80.5772
Minimum: 1
Maximum: 584
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5
Q1: 24
Median: 55
Q3: 112
95-th percentile: 236
Maximum: 584
Range: 583
Interquartile range (IQR): 88

Descriptive statistics

Standard deviation: 79.314539
Coefficient of variation (CV): 0.9843298
Kurtosis: 5.0094006
Mean: 80.5772
Median Absolute Deviation (MAD): 37
Skewness: 1.938253
Sum: 805772
Variance: 6290.7961
Monotonicity: Not monotonic
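
Several of the derived statistics for 링크순서 follow directly from the others. The worked check below uses only the figures printed in this section and the usual textbook definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 − Q1, range = maximum − minimum); small differences in the last digits would come from rounding in the report.

```python
# Figures copied from the 링크순서 statistics in this report.
mean = 80.5772
std = 79.314539
q1, q3 = 24, 112
minimum, maximum = 1, 584

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV={cv:.7f} variance={variance:.4f} IQR={iqr} range={value_range}")
```

A CV close to 1 together with a skewness near 1.94 points to a long right tail: most links sit early in a route, with a few routes running past 500 links.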
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
2 | 128 | 1.3%
17 | 124 | 1.2%
11 | 123 | 1.2%
5 | 116 | 1.2%
10 | 115 | 1.1%
24 | 113 | 1.1%
1 | 113 | 1.1%
25 | 111 | 1.1%
6 | 111 | 1.1%
18 | 109 | 1.1%
Other values (430) | 8837 | 88.4%

Value | Count | Frequency (%)
1 | 113 | 1.1%
2 | 128 | 1.3%
3 | 101 | 1.0%
4 | 82 | 0.8%
5 | 116 | 1.2%
6 | 111 | 1.1%
7 | 102 | 1.0%
8 | 94 | 0.9%
9 | 96 | 1.0%
10 | 115 | 1.1%

Value | Count | Frequency (%)
584 | 1 | < 0.1%
582 | 1 | < 0.1%
571 | 1 | < 0.1%
558 | 1 | < 0.1%
544 | 1 | < 0.1%
542 | 1 | < 0.1%
539 | 1 | < 0.1%
537 | 1 | < 0.1%
527 | 1 | < 0.1%
526 | 1 | < 0.1%

링크아이디 (link ID)
Real number (ℝ)

Distinct: 6226
Distinct (%): 62.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.1537979 × 10^9
Minimum: 1.0000009 × 10^9
Maximum: 3.6900008 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1.0000009 × 10^9
5-th percentile: 1.1700209 × 10^9
Q1: 2.1200302 × 10^9
Median: 2.2700615 × 10^9
Q3: 2.3303149 × 10^9
95-th percentile: 2.4000749 × 10^9
Maximum: 3.6900008 × 10^9
Range: 2.6899999 × 10^9
Interquartile range (IQR): 2.1028475 × 10^8

Descriptive statistics

Standard deviation: 3.6841695 × 10^8
Coefficient of variation (CV): 0.17105456
Kurtosis: 3.6584331
Mean: 2.1537979 × 10^9
Median Absolute Deviation (MAD): 90063350
Skewness: -1.5255746
Sum: 2.1537979 × 10^13
Variance: 1.3573105 × 10^17
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
2370156600 | 22 | 0.2%
2300118900 | 20 | 0.2%
2390057100 | 20 | 0.2%
2060114700 | 18 | 0.2%
1040022100 | 18 | 0.2%
2390056700 | 17 | 0.2%
2390056900 | 16 | 0.2%
2330113200 | 15 | 0.1%
1040021900 | 15 | 0.1%
2370160400 | 15 | 0.1%
Other values (6216) | 9824 | 98.2%

Value | Count | Frequency (%)
1000000900 | 1 | < 0.1%
1000001000 | 2 | < 0.1%
1000001700 | 1 | < 0.1%
1000001800 | 1 | < 0.1%
1000002400 | 1 | < 0.1%
1000013800 | 1 | < 0.1%
1000014200 | 2 | < 0.1%
1000014300 | 1 | < 0.1%
1000014500 | 1 | < 0.1%
1000014600 | 1 | < 0.1%

Value | Count | Frequency (%)
3690000800 | 1 | < 0.1%
3690000700 | 1 | < 0.1%
3650000401 | 1 | < 0.1%
3630004700 | 2 | < 0.1%
3630004300 | 1 | < 0.1%
3630004200 | 1 | < 0.1%
3630002301 | 1 | < 0.1%
3630002201 | 1 | < 0.1%
3630002101 | 1 | < 0.1%
3630001001 | 1 | < 0.1%

사용구분 (usage flag)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Y: 5153
9: 3121
0: 1726

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Y
2nd row: Y
3rd row: 9
4th row: 9
5th row: Y

Common Values

Value | Count | Frequency (%)
Y | 5153 | 51.5%
9 | 3121 | 31.2%
0 | 1726 | 17.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
y | 5153 | 51.5%
9 | 3121 | 31.2%
0 | 1726 | 17.3%

등록아이디 (registrant ID)
Text

Distinct: 86
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 11
Median length: 10
Mean length: 4.2542
Min length: 1

Characters and Unicode

Total characters: 42542
Distinct characters: 34
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
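
As an illustration of how such character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is a minimal example over the five sample values listed for this variable, not the profiler's own implementation; `category_counts` is a hypothetical helper name.

```python
import unicodedata
from collections import Counter

# The five sample values shown for this variable in the report.
sample_ids = ["kdbus01", "suwon345", "0", "0", "kd0112"]

def category_counts(values):
    """Count characters per Unicode general category (e.g. 'Ll', 'Nd', 'Pd')."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

counts = category_counts(sample_ids)
# 'Ll' = Lowercase Letter, 'Nd' = Decimal Number, 'Pd' = Dash Punctuation
print(dict(counts))
```

The two-letter codes map onto the category names used in the tables of this section: `Ll` is Lowercase Letter, `Nd` is Decimal Number, and `Pd` is Dash Punctuation.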

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: kdbus01
2nd row: suwon345
3rd row: 0
4th row: 0
5th row: kd0112

Value | Count | Frequency (%)
0 | 4847 | 48.5%
kdbus01 | 500 | 5.0%
busdy000 | 276 | 2.8%
seedsp1400 | 232 | 2.3%
kdbus03 | 211 | 2.1%
kwbus119 | 164 | 1.6%
kd0112 | 143 | 1.4%
hckim082 | 140 | 1.4%
kkks1952 | 139 | 1.4%
shinsung00 | 137 | 1.4%
Other values (76) | 3211 | 32.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 10519 | 24.7%
s | 3338 | 7.8%
k | 3084 | 7.2%
1 | 3064 | 7.2%
d | 2579 | 6.1%
u | 2072 | 4.9%
b | 1586 | 3.7%
2 | 1402 | 3.3%
n | 1121 | 2.6%
8 | 1057 | 2.5%
Other values (24) | 12720 | 29.9%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 21675 | 50.9%
Decimal Number | 20830 | 49.0%
Dash Punctuation | 37 | 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
s | 3338 | 15.4%
k | 3084 | 14.2%
d | 2579 | 11.9%
u | 2072 | 9.6%
b | 1586 | 7.3%
n | 1121 | 5.2%
h | 1009 | 4.7%
i | 849 | 3.9%
y | 842 | 3.9%
w | 726 | 3.3%
Other values (13) | 4469 | 20.6%

Decimal Number
Value | Count | Frequency (%)
0 | 10519 | 50.5%
1 | 3064 | 14.7%
2 | 1402 | 6.7%
8 | 1057 | 5.1%
3 | 1020 | 4.9%
4 | 1017 | 4.9%
5 | 854 | 4.1%
9 | 703 | 3.4%
6 | 635 | 3.0%
7 | 559 | 2.7%

Dash Punctuation
Value | Count | Frequency (%)
- | 37 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 21675 | 50.9%
Common | 20867 | 49.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
s | 3338 | 15.4%
k | 3084 | 14.2%
d | 2579 | 11.9%
u | 2072 | 9.6%
b | 1586 | 7.3%
n | 1121 | 5.2%
h | 1009 | 4.7%
i | 849 | 3.9%
y | 842 | 3.9%
w | 726 | 3.3%
Other values (13) | 4469 | 20.6%

Common
Value | Count | Frequency (%)
0 | 10519 | 50.4%
1 | 3064 | 14.7%
2 | 1402 | 6.7%
8 | 1057 | 5.1%
3 | 1020 | 4.9%
4 | 1017 | 4.9%
5 | 854 | 4.1%
9 | 703 | 3.4%
6 | 635 | 3.0%
7 | 559 | 2.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 42542 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 10519 | 24.7%
s | 3338 | 7.8%
k | 3084 | 7.2%
1 | 3064 | 7.2%
d | 2579 | 6.1%
u | 2072 | 4.9%
b | 1586 | 3.7%
2 | 1402 | 3.3%
n | 1121 | 2.6%
8 | 1057 | 2.5%
Other values (24) | 12720 | 29.9%

등록일자 (registration date)
Real number (ℝ)

HIGH CORRELATION

Distinct: 321
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0206503 × 10^13
Minimum: 2.019083 × 10^13
Maximum: 2.0230907 × 10^13
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2.019083 × 10^13
5-th percentile: 2.019083 × 10^13
Q1: 2.019083 × 10^13
Median: 2.0200207 × 10^13
Q3: 2.0230327 × 10^13
95-th percentile: 2.0230719 × 10^13
Maximum: 2.0230907 × 10^13
Range: 4.007694 × 10^10
Interquartile range (IQR): 3.9496926 × 10^10

Descriptive statistics

Standard deviation: 1.78032 × 10^10
Coefficient of variation (CV): 0.00088106291
Kurtosis: -1.6229774
Mean: 2.0206503 × 10^13
Median Absolute Deviation (MAD): 9.376918 × 10^9
Skewness: 0.47146739
Sum: 2.0206503 × 10^17
Variance: 3.1695394 × 10^20
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
20190830185541 | 4847 | 48.5%
20230329172451 | 87 | 0.9%
20200211182427 | 79 | 0.8%
20210312163443 | 69 | 0.7%
20230327111357 | 67 | 0.7%
20200214101250 | 65 | 0.7%
20200219153506 | 63 | 0.6%
20230418104933 | 56 | 0.6%
20210527170513 | 48 | 0.5%
20200210093348 | 48 | 0.5%
Other values (311) | 4571 | 45.7%

Value | Count | Frequency (%)
20190830185541 | 4847 | 48.5%
20191031104625 | 7 | 0.1%
20191031104656 | 2 | < 0.1%
20191101132221 | 2 | < 0.1%
20191217154625 | 5 | 0.1%
20191218173216 | 12 | 0.1%
20191220145012 | 2 | < 0.1%
20191226110423 | 6 | 0.1%
20200115164018 | 43 | 0.4%
20200117171026 | 8 | 0.1%

Value | Count | Frequency (%)
20230907125904 | 7 | 0.1%
20230831175513 | 18 | 0.2%
20230830153749 | 37 | 0.4%
20230830112556 | 13 | 0.1%
20230829104407 | 27 | 0.3%
20230829104238 | 30 | 0.3%
20230823112003 | 6 | 0.1%
20230821152026 | 7 | 0.1%
20230821110819 | 6 | 0.1%
20230821110809 | 11 | 0.1%
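
The 등록일자 values look like timestamps packed into 14 digits (YYYYMMDDhhmmss) that were profiled as plain numbers, which would explain why the mean, variance and similar statistics above are hard to interpret. Assuming that layout (an inference from the value pattern, not something the report states), the sketch below converts the smallest and largest reported values into `datetime` objects; `parse_registered_at` is a hypothetical helper.

```python
from datetime import datetime

def parse_registered_at(value: int) -> datetime:
    """Parse a 14-digit YYYYMMDDhhmmss number (assumed layout) into a datetime."""
    return datetime.strptime(str(value), "%Y%m%d%H%M%S")

# Smallest and largest 등록일자 values reported in this section.
first = parse_registered_at(20190830185541)
last = parse_registered_at(20230907125904)

# Whole days between the earliest and latest registration.
span_days = (last - first).days
print(first.isoformat(), last.isoformat(), span_days)
```

Treating the column as a datetime before profiling would likely give more meaningful summaries than the numeric statistics shown here.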

시외버스추가경로구분 (intercity bus additional route flag)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

0: 9968
1: 32

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9968 | 99.7%
1 | 32 | 0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 9968 | 99.7%
1 | 32 | 0.3%

Interactions

[Interaction plots not reproduced: 16 panels in the original report.]

Correlations

 | 노선아이디 | 링크순서 | 링크아이디 | 사용구분 | 등록아이디 | 등록일자 | 시외버스추가경로구분
노선아이디 | 1.000 | 0.374 | 0.498 | 0.357 | 0.960 | 0.528 | 0.096
링크순서 | 0.374 | 1.000 | 0.187 | 0.187 | 0.386 | 0.186 | 0.041
링크아이디 | 0.498 | 0.187 | 1.000 | 0.316 | 0.753 | 0.349 | 0.129
사용구분 | 0.357 | 0.187 | 0.316 | 1.000 | 0.896 | NaN | 0.032
등록아이디 | 0.960 | 0.386 | 0.753 | 0.896 | 1.000 | 0.893 | 0.788
등록일자 | 0.528 | 0.186 | 0.349 | NaN | 0.893 | 1.000 | 0.204
시외버스추가경로구분 | 0.096 | 0.041 | 0.129 | 0.032 | 0.788 | 0.204 | 1.000

 | 사용구분 | 시외버스추가경로구분
사용구분 | 1.000 | 0.053
시외버스추가경로구분 | 0.053 | 1.000

 | 노선아이디 | 링크순서 | 링크아이디 | 등록일자 | 사용구분 | 시외버스추가경로구분
노선아이디 | 1.000 | 0.193 | 0.205 | -0.293 | 0.229 | 0.074
링크순서 | 0.193 | 1.000 | -0.106 | -0.019 | 0.113 | 0.032
링크아이디 | 0.205 | -0.106 | 1.000 | -0.080 | 0.146 | 0.129
등록일자 | -0.293 | -0.019 | -0.080 | 1.000 | 0.702 | 0.158
사용구분 | 0.229 | 0.113 | 0.146 | 0.702 | 1.000 | 0.053
시외버스추가경로구분 | 0.074 | 0.032 | 0.129 | 0.158 | 0.053 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index) | 노선아이디 | 링크순서 | 링크아이디 | 사용구분 | 등록아이디 | 등록일자 | 시외버스추가경로구분
20072 | 241006799 | 81 | 2280182500 | Y | kdbus01 | 20210312163443 | 0
42380 | 200000265 | 4 | 2000020800 | Y | suwon345 | 20230328135443 | 0
44126 | 241006870 | 174 | 1240029100 | 9 | 0 | 20190830185541 | 0
13725 | 241005570 | 138 | 2050043100 | 9 | 0 | 20190830185541 | 0
13880 | 233000281 | 21 | 2330154100 | Y | kd0112 | 20230830153749 | 0
16020 | 219000004 | 108 | 1180016900 | Y | crowkyh | 20230420112158 | 0
16248 | 214000193 | 18 | 2140115900 | Y | pt8556 | 20230406164433 | 0
3656 | 231000135 | 50 | 2310010700 | Y | chh72018 | 20230907125904 | 0
31421 | 234001333 | 77 | 2300114100 | 9 | 0 | 20190830185541 | 0
27615 | 222000129 | 50 | 2210038100 | 0 | 0 | 20190830185541 | 0

(index) | 노선아이디 | 링크순서 | 링크아이디 | 사용구분 | 등록아이디 | 등록일자 | 시외버스추가경로구분
45677 | 234001570 | 126 | 2330141400 | Y | kd0112 | 20230330100741 | 0
34616 | 241006837 | 500 | 2260058700 | 0 | 0 | 20190830185541 | 0
23184 | 241006270 | 37 | 2240235700 | 9 | 0 | 20190830185541 | 0
25211 | 204000139 | 95 | 1230009200 | Y | sncbbgy | 20230719141349 | 0
1778 | 204000158 | 14 | 1210030000 | Y | kd0004 | 20210419132415 | 0
4196 | 210000013 | 85 | 1150011600 | Y | soshin11 | 20230608130353 | 0
51737 | 229000153 | 118 | 2290083200 | 9 | 0 | 20190830185541 | 0
47875 | 241000040 | 152 | 2310148800 | Y | ky7266 | 20200219153506 | 0
36427 | 226000032 | 14 | 2260072400 | Y | hagi2205 | 20230522140908 | 0
30030 | 229000035 | 195 | 2290136000 | Y | 4110400 | 20230418104933 | 0