Overview

Dataset statistics

Number of variables: 7
Number of observations: 522
Missing cells: 412
Missing cells (%): 11.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 31.2 KiB
Average record size in memory: 61.3 B
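
The missing-cell percentage can be checked by hand. The sketch below assumes the percentage is taken over all cells (observations × variables) and that the only columns with missing values are the two flagged in the Alerts section (위도 and 경도, 206 each); the variable names are illustrative.

```python
# Figures copied from the report; the variable names are illustrative.
n_rows = 522
n_cols = 7
missing_per_column = {"위도": 206, "경도": 206}  # columns flagged as Missing

total_cells = n_rows * n_cols                     # 3654 cells in total
missing_cells = sum(missing_per_column.values())  # 412 missing cells
missing_pct = 100 * missing_cells / total_cells

print(f"{missing_cells} of {total_cells} cells missing ({missing_pct:.1f}%)")
```

Rounded to one decimal place this agrees with the 11.3% shown above.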

Variable types

Numeric: 5
Categorical: 1
Text: 1

Dataset

Description: Bus route IDs and stop names for Naju City, as provided through the Bitgaram Portal (빛가람포탈)
Author: KEPCO KDN Co., Ltd. (한전KDN(주))
URL: https://www.data.go.kr/data/15038341/fileData.do

Alerts

노선ID is highly overall correlated with 노선명 (High correlation)
정류소ID is highly overall correlated with 노선명 (High correlation)
노선명 is highly overall correlated with 노선ID and 1 other field (High correlation)
위도 has 206 (39.5%) missing values (Missing)
경도 has 206 (39.5%) missing values (Missing)

Reproduction

Analysis started: 2023-12-12 10:12:01.804505
Analysis finished: 2023-12-12 10:12:05.426963
Duration: 3.62 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
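
A report of this kind could be regenerated roughly as follows. This is a minimal sketch, not the exact invocation used here: the function name, file paths, report title, and the `cp949` encoding are all assumptions, and the configuration in config.json is not reproduced.

```python
def build_report(csv_path: str, out_path: str = "report.html",
                 encoding: str = "cp949") -> None:
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so the function can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding=encoding)

    # The title is hypothetical; default profiling settings are used.
    profile = ProfileReport(df, title="Naju City bus routes and stops")
    profile.to_file(out_path)
```

Calling `build_report("naju_bus.csv")` with a local copy of the dataset (the file name is hypothetical) would write `report.html` next to the script.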

Variables

노선ID (route ID)
Real number (ℝ)

HIGH CORRELATION

Distinct: 9
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.4770115
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 6
Q3: 7
95-th percentile: 7
Maximum: 9
Range: 8
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.5448847
Coefficient of variation (CV): 0.5684338
Kurtosis: -1.5391763
Mean: 4.4770115
Median Absolute Deviation (MAD): 1
Skewness: -0.22285058
Sum: 2337
Variance: 6.4764379
Monotonicity: Increasing

Histogram with fixed size bins (bins=9)
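Common values

Value | Count | Frequency (%)
7 | 132 | 25.3%
1 | 124 | 23.8%
6 | 120 | 23.0%
3 | 63 | 12.1%
2 | 37 | 7.1%
8 | 18 | 3.4%
4 | 10 | 1.9%
5 | 10 | 1.9%
9 | 8 | 1.5%

Extreme values (smallest)

Value | Count | Frequency (%)
1 | 124 | 23.8%
2 | 37 | 7.1%
3 | 63 | 12.1%
4 | 10 | 1.9%
5 | 10 | 1.9%
6 | 120 | 23.0%
7 | 132 | 25.3%
8 | 18 | 3.4%
9 | 8 | 1.5%

Extreme values (largest)

Value | Count | Frequency (%)
9 | 8 | 1.5%
8 | 18 | 3.4%
7 | 132 | 25.3%
6 | 120 | 23.0%
5 | 10 | 1.9%
4 | 10 | 1.9%
3 | 63 | 12.1%
2 | 37 | 7.1%
1 | 124 | 23.8%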
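
Because 노선ID takes only nine distinct values, its summary statistics can be recomputed from the frequency table alone. The sketch below uses only the Python standard library and assumes the report uses the sample variance (n - 1 denominator) and linearly interpolated quantiles; both assumptions are consistent with the figures above.

```python
import statistics

# Value -> count, copied from the 노선ID frequency table.
counts = {1: 124, 2: 37, 3: 63, 4: 10, 5: 10, 6: 120, 7: 132, 8: 18, 9: 8}

# Expand the table back into one value per observation, in ascending order.
values = [v for v, c in sorted(counts.items()) for _ in range(c)]

n = len(values)
total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)
# "inclusive" interpolates between the observed minimum and maximum.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
cv = std / mean                         # coefficient of variation

print(n, total, round(mean, 7), round(variance, 7), round(std, 7))
print(q1, median, q3, q3 - q1, round(cv, 7))
```

The recomputed count (522), sum (2337), mean, variance, quartiles, IQR, and CV agree with the values listed for this variable.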

정류소ID (stop ID)
Real number (ℝ)

HIGH CORRELATION

Distinct: 221
Distinct (%): 42.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 298522.91
Minimum: 47
Maximum: 7021070
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.7 KiB

Quantile statistics

Minimum: 47
5-th percentile: 1200.5
Q1: 1630
Median: 2315
Q3: 4490
95-th percentile: 6748.95
Maximum: 7021070
Range: 7021023
Interquartile range (IQR): 2860

Descriptive statistics

Standard deviation: 1409954.8
Coefficient of variation (CV): 4.7231042
Kurtosis: 18.963927
Mean: 298522.91
Median Absolute Deviation (MAD): 1275
Skewness: 4.5706926
Sum: 1.5582896 × 10^8
Variance: 1.9879726 × 10^12
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
1250 | 6 | 1.1%
6748 | 5 | 1.0%
1400 | 5 | 1.0%
6749 | 5 | 1.0%
2210 | 5 | 1.0%
1870 | 4 | 0.8%
1300 | 4 | 0.8%
1740 | 4 | 0.8%
1240 | 4 | 0.8%
1380 | 4 | 0.8%
Other values (211) | 476 | 91.2%

Extreme values (smallest)

Value | Count | Frequency (%)
47 | 1 | 0.2%
48 | 1 | 0.2%
811 | 1 | 0.2%
817 | 1 | 0.2%
1010 | 1 | 0.2%
1040 | 3 | 0.6%
1050 | 3 | 0.6%
1080 | 4 | 0.8%
1090 | 3 | 0.6%
1100 | 2 | 0.4%

Extreme values (largest)

Value | Count | Frequency (%)
7021070 | 1 | 0.2%
7021060 | 1 | 0.2%
7021050 | 1 | 0.2%
7021040 | 1 | 0.2%
7021030 | 1 | 0.2%
7021020 | 1 | 0.2%
7021010 | 1 | 0.2%
7021000 | 1 | 0.2%
7011110 | 1 | 0.2%
7011100 | 1 | 0.2%
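
The mean of 정류소ID (about 298,523) is far above its median (2,315) because a small number of seven-digit IDs dominate the sum. The sketch below applies Tukey's 1.5 × IQR rule of thumb to the reported quartiles to show where those IDs fall; the rule and the candidate list are illustrative choices, not part of the report.

```python
# Quartiles copied from the 정류소ID quantile statistics.
q1, q3 = 1630, 4490
iqr = q3 - q1                 # 2860, matching the reported IQR

# Tukey's rule of thumb: values beyond 1.5 * IQR from the quartiles are flagged.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# A few IDs taken from the value tables for this variable.
candidates = [47, 1250, 6749, 7011100, 7021070]
flagged = [x for x in candidates if x < lower_fence or x > upper_fence]

print(upper_fence, flagged)
```

Since the column holds identifiers rather than measurements, its mean, variance, and skewness carry little meaning; treating it as a categorical variable may be more appropriate.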

순서 (stop sequence)
Real number (ℝ)

Distinct: 70
Distinct (%): 13.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 260.64176
Minimum: 10
Maximum: 690
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.7 KiB

Quantile statistics

Minimum: 10
5-th percentile: 20
Q1: 90
Median: 230
Q3: 410
95-th percentile: 590
Maximum: 690
Range: 680
Interquartile range (IQR): 320

Descriptive statistics

Standard deviation: 186.73377
Coefficient of variation (CV): 0.71643841
Kurtosis: -1.0332659
Mean: 260.64176
Median Absolute Deviation (MAD): 150
Skewness: 0.43278674
Sum: 136055
Variance: 34869.501
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
20 | 16 | 3.1%
30 | 16 | 3.1%
40 | 16 | 3.1%
10 | 16 | 3.1%
50 | 14 | 2.7%
60 | 14 | 2.7%
70 | 14 | 2.7%
80 | 14 | 2.7%
90 | 14 | 2.7%
100 | 12 | 2.3%
Other values (60) | 376 | 72.0%

Extreme values (smallest)

Value | Count | Frequency (%)
10 | 16 | 3.1%
20 | 16 | 3.1%
30 | 16 | 3.1%
40 | 16 | 3.1%
50 | 14 | 2.7%
60 | 14 | 2.7%
70 | 14 | 2.7%
80 | 14 | 2.7%
90 | 14 | 2.7%
100 | 12 | 2.3%

Extreme values (largest)

Value | Count | Frequency (%)
690 | 1 | 0.2%
680 | 1 | 0.2%
670 | 1 | 0.2%
660 | 1 | 0.2%
650 | 1 | 0.2%
640 | 2 | 0.4%
630 | 3 | 0.6%
620 | 4 | 0.8%
610 | 4 | 0.8%
600 | 5 | 1.0%

노선명 (route name)
Categorical

HIGH CORRELATION

Distinct: 9
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 4.2 KiB

Value | Count
999번(그린로) | 132
999번(빛가람로) | 124
999번(우정로) | 120
700 | 63
1160 | 37
Other values (4) | 46

Length

Max length: 10
Median length: 9
Mean length: 7.8218391
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 999번(빛가람로)
2nd row: 999번(빛가람로)
3rd row: 999번(빛가람로)
4th row: 999번(빛가람로)
5th row: 999번(빛가람로)

Common Values

Value | Count | Frequency (%)
999번(그린로) | 132 | 25.3%
999번(빛가람로) | 124 | 23.8%
999번(우정로) | 120 | 23.0%
700 | 63 | 12.1%
1160 | 37 | 7.1%
701 | 18 | 3.4%
혁신도시셔틀1번 | 10 | 1.9%
혁신도시셔틀2번 | 10 | 1.9%
702 | 8 | 1.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
999번(그린로 | 132 | 25.3%
999번(빛가람로 | 124 | 23.8%
999번(우정로 | 120 | 23.0%
700 | 63 | 12.1%
1160 | 37 | 7.1%
701 | 18 | 3.4%
혁신도시셔틀1번 | 10 | 1.9%
혁신도시셔틀2번 | 10 | 1.9%
702 | 8 | 1.5%

정류소명 (stop name)
Text

Distinct: 128
Distinct (%): 24.5%
Missing: 0
Missing (%): 0.0%
Memory size: 4.2 KiB

Length

Max length: 18
Median length: 10
Mean length: 5.4214559
Min length: 2

Characters and Unicode

Total characters: 2830
Distinct characters: 187
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6
Unique (%): 1.1%

Sample

1st row: 전남대후문(남)
2nd row: 전남대후문위
3rd row: 동강대후문
4th row: 산수오거리
5th row: 살레시오여고

Words

Value | Count | Frequency (%)
전력거래소입구 | 10 | 1.8%
농어촌공사 | 9 | 1.6%
나주역 | 9 | 1.6%
1차 | 9 | 1.6%
나주시청 | 8 | 1.5%
소방삼거리 | 8 | 1.5%
노인복지회관 | 8 | 1.5%
나주터미널 | 8 | 1.5%
영산포터미널 | 8 | 1.5%
매성(산포농협 | 6 | 1.1%
Other values (123) | 463 | 84.8%

Most occurring characters

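Value | Count | Frequency (%)
(n/a) | 69 | 2.4%
(n/a) | 68 | 2.4%
(n/a) | 64 | 2.3%
(n/a) | 55 | 1.9%
(n/a) | 54 | 1.9%
(n/a) | 51 | 1.8%
(n/a) | 51 | 1.8%
(n/a) | 50 | 1.8%
(n/a) | 49 | 1.7%
(n/a) | 48 | 1.7%
Other values (177) | 2271 | 80.2%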

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2604 | 92.0%
Uppercase Letter | 56 | 2.0%
Close Punctuation | 45 | 1.6%
Open Punctuation | 45 | 1.6%
Decimal Number | 30 | 1.1%
Space Separator | 24 | 0.8%
Lowercase Letter | 20 | 0.7%
Dash Punctuation | 4 | 0.1%
Other Punctuation | 2 | 0.1%

Most frequent character per category

Other Letter

Value | Count | Frequency (%)
(n/a) | 69 | 2.6%
(n/a) | 68 | 2.6%
(n/a) | 64 | 2.5%
(n/a) | 55 | 2.1%
(n/a) | 54 | 2.1%
(n/a) | 51 | 2.0%
(n/a) | 51 | 2.0%
(n/a) | 50 | 1.9%
(n/a) | 49 | 1.9%
(n/a) | 48 | 1.8%
Other values (154) | 2045 | 78.5%

Uppercase Letter

Value | Count | Frequency (%)
H | 19 | 33.9%
L | 19 | 33.9%
S | 9 | 16.1%
N | 2 | 3.6%
D | 2 | 3.6%
K | 2 | 3.6%
G | 1 | 1.8%
E | 1 | 1.8%
P | 1 | 1.8%

Decimal Number

Value | Count | Frequency (%)
1 | 14 | 46.7%
6 | 5 | 16.7%
5 | 4 | 13.3%
2 | 4 | 13.3%
3 | 3 | 10.0%

Lowercase Letter

Value | Count | Frequency (%)
s | 8 | 40.0%
c | 4 | 20.0%
l | 4 | 20.0%
a | 4 | 20.0%

Close Punctuation

Value | Count | Frequency (%)
) | 45 | 100.0%

Open Punctuation

Value | Count | Frequency (%)
( | 45 | 100.0%

Space Separator

Value | Count | Frequency (%)
(space) | 24 | 100.0%

Dash Punctuation

Value | Count | Frequency (%)
- | 4 | 100.0%

Other Punctuation

Value | Count | Frequency (%)
· | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2604 | 92.0%
Common | 150 | 5.3%
Latin | 76 | 2.7%

Most frequent character per script

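Hangul

Value | Count | Frequency (%)
(n/a) | 69 | 2.6%
(n/a) | 68 | 2.6%
(n/a) | 64 | 2.5%
(n/a) | 55 | 2.1%
(n/a) | 54 | 2.1%
(n/a) | 51 | 2.0%
(n/a) | 51 | 2.0%
(n/a) | 50 | 1.9%
(n/a) | 49 | 1.9%
(n/a) | 48 | 1.8%
Other values (154) | 2045 | 78.5%

Latin

Value | Count | Frequency (%)
H | 19 | 25.0%
L | 19 | 25.0%
S | 9 | 11.8%
s | 8 | 10.5%
c | 4 | 5.3%
l | 4 | 5.3%
a | 4 | 5.3%
N | 2 | 2.6%
D | 2 | 2.6%
K | 2 | 2.6%
Other values (3) | 3 | 3.9%

Common

Value | Count | Frequency (%)
) | 45 | 30.0%
( | 45 | 30.0%
(space) | 24 | 16.0%
1 | 14 | 9.3%
6 | 5 | 3.3%
- | 4 | 2.7%
5 | 4 | 2.7%
2 | 4 | 2.7%
3 | 3 | 2.0%
· | 2 | 1.3%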

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2604 | 92.0%
ASCII | 224 | 7.9%
None | 2 | 0.1%

Most frequent character per block

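Hangul

Value | Count | Frequency (%)
(n/a) | 69 | 2.6%
(n/a) | 68 | 2.6%
(n/a) | 64 | 2.5%
(n/a) | 55 | 2.1%
(n/a) | 54 | 2.1%
(n/a) | 51 | 2.0%
(n/a) | 51 | 2.0%
(n/a) | 50 | 1.9%
(n/a) | 49 | 1.9%
(n/a) | 48 | 1.8%
Other values (154) | 2045 | 78.5%

ASCII

Value | Count | Frequency (%)
) | 45 | 20.1%
( | 45 | 20.1%
(space) | 24 | 10.7%
H | 19 | 8.5%
L | 19 | 8.5%
1 | 14 | 6.2%
S | 9 | 4.0%
s | 8 | 3.6%
6 | 5 | 2.2%
- | 4 | 1.8%
Other values (12) | 32 | 14.3%

None

Value | Count | Frequency (%)
· | 2 | 100.0%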

위도 (latitude)
Real number (ℝ)

MISSING

Distinct: 119
Distinct (%): 37.7%
Missing: 206
Missing (%): 39.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 126.77195
Minimum: 126.71006
Maximum: 126.93046
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.7 KiB

Quantile statistics

Minimum: 126.71006
5-th percentile: 126.71191
Q1: 126.71658
Median: 126.7802
Q3: 126.79545
95-th percentile: 126.91046
Maximum: 126.93046
Range: 0.2204037
Interquartile range (IQR): 0.0788743

Descriptive statistics

Standard deviation: 0.054892995
Coefficient of variation (CV): 0.00043300585
Kurtosis: 1.0251024
Mean: 126.77195
Median Absolute Deviation (MAD): 0.03142395
Skewness: 0.97939122
Sum: 40059.936
Variance: 0.0030132409
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
126.7165752 | 9 | 1.7%
126.7126551 | 8 | 1.5%
126.7215608 | 8 | 1.5%
126.7955037 | 8 | 1.5%
126.8043715 | 6 | 1.1%
126.803639 | 6 | 1.1%
126.7780739 | 5 | 1.0%
126.779335 | 5 | 1.0%
126.7944375 | 5 | 1.0%
126.7792358 | 5 | 1.0%
Other values (109) | 251 | 48.1%
(Missing) | 206 | 39.5%

Extreme values (smallest)

Value | Count | Frequency (%)
126.7100583 | 3 | 0.6%
126.7102023 | 3 | 0.6%
126.7112393 | 4 | 0.8%
126.7116075 | 1 | 0.2%
126.7117562 | 4 | 0.8%
126.7119066 | 4 | 0.8%
126.7120273 | 3 | 0.6%
126.7126551 | 8 | 1.5%
126.7130638 | 1 | 0.2%
126.7131303 | 1 | 0.2%

Extreme values (largest)

Value | Count | Frequency (%)
126.930462 | 3 | 0.6%
126.9304597 | 3 | 0.6%
126.9222343 | 3 | 0.6%
126.922233 | 3 | 0.6%
126.9112582 | 3 | 0.6%
126.910456 | 3 | 0.6%
126.8999653 | 3 | 0.6%
126.8405704 | 3 | 0.6%
126.8405367 | 3 | 0.6%
126.8268571 | 3 | 0.6%

경도 (longitude)
Real number (ℝ)

MISSING

Distinct: 120
Distinct (%): 38.0%
Missing: 206
Missing (%): 39.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 35.034345
Minimum: 34.994633
Maximum: 35.181632
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.7 KiB

Quantile statistics

Minimum: 34.994633
5-th percentile: 34.999597
Q1: 35.016824
Median: 35.025575
Q3: 35.037016
95-th percentile: 35.144234
Maximum: 35.181632
Range: 0.1869987
Interquartile range (IQR): 0.020192

Descriptive statistics

Standard deviation: 0.037017048
Coefficient of variation (CV): 0.0010565931
Kurtosis: 6.4632986
Mean: 35.034345
Median Absolute Deviation (MAD): 0.0109741
Skewness: 2.6091432
Sum: 11070.853
Variance: 0.0013702619
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
34.994633 | 8 | 1.5%
35.0342576 | 8 | 1.5%
35.025575 | 8 | 1.5%
35.0408805 | 6 | 1.1%
35.0370197 | 6 | 1.1%
35.01446 | 6 | 1.1%
35.0214993 | 5 | 1.0%
35.02118 | 5 | 1.0%
35.0211754 | 5 | 1.0%
35.019404 | 5 | 1.0%
Other values (110) | 254 | 48.7%
(Missing) | 206 | 39.5%

Extreme values (smallest)

Value | Count | Frequency (%)
34.994633 | 8 | 1.5%
34.9968108 | 3 | 0.6%
34.9974177 | 3 | 0.6%
34.999597 | 4 | 0.8%
34.9996798 | 4 | 0.8%
35.0077 | 2 | 0.4%
35.0086884 | 1 | 0.2%
35.0086948 | 4 | 0.8%
35.00938 | 4 | 0.8%
35.0094199 | 3 | 0.6%

Extreme values (largest)

Value | Count | Frequency (%)
35.1816317 | 3 | 0.6%
35.1801761 | 3 | 0.6%
35.1480602 | 3 | 0.6%
35.1477176 | 3 | 0.6%
35.1444144 | 3 | 0.6%
35.1442341 | 3 | 0.6%
35.14410716 | 1 | 0.2%
35.14401842 | 1 | 0.2%
35.1382692 | 1 | 0.2%
35.1382213 | 1 | 0.2%
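
The column named 위도 (latitude) holds values near 126.8, while 경도 (longitude) holds values near 35.0. A latitude cannot exceed 90 degrees, and the Naju area lies at roughly 35°N, 126.7°E, which suggests the two columns may be swapped in the source file. The sketch below is one way to test for this using only the reported minimum and maximum; the function name is hypothetical.

```python
def looks_swapped(lat_min, lat_max, lon_min, lon_max):
    """Return True when the 'latitude' range is impossible but fits as a longitude."""
    lat_valid = -90 <= lat_min and lat_max <= 90
    # After swapping, the old longitude must be a valid latitude and vice versa.
    swapped_valid = (-90 <= lon_min and lon_max <= 90
                     and -180 <= lat_min and lat_max <= 180)
    return (not lat_valid) and swapped_valid

# Minimum and maximum as reported for each column.
lat_range = (126.71006, 126.93046)  # 위도, labelled latitude
lon_range = (34.994633, 35.181632)  # 경도, labelled longitude

print(looks_swapped(*lat_range, *lon_range))
```

If the check returns True, swapping the two columns before any mapping or distance calculation is likely the safer choice, though the source data should be confirmed first.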

Interactions

[Figures: 25 pairwise interaction plots among the five numeric variables (노선ID, 정류소ID, 순서, 위도, 경도)]

Correlations

Variable | 노선ID | 정류소ID | 순서 | 노선명 | 위도 | 경도
노선ID | 1.000 | 0.819 | 0.418 | 1.000 | 0.514 | 0.347
정류소ID | 0.819 | 1.000 | 0.368 | 0.819 | NaN | NaN
순서 | 0.418 | 0.368 | 1.000 | 0.418 | 0.673 | 0.554
노선명 | 1.000 | 0.819 | 0.418 | 1.000 | 0.514 | 0.347
위도 | 0.514 | NaN | 0.673 | 0.514 | 1.000 | 0.808
경도 | 0.347 | NaN | 0.554 | 0.347 | 0.808 | 1.000

Variable | 노선ID | 정류소ID | 순서 | 위도 | 경도 | 노선명
노선ID | 1.000 | 0.132 | 0.004 | 0.022 | 0.067 | 1.000
정류소ID | 0.132 | 1.000 | 0.112 | 0.149 | 0.301 | 0.839
순서 | 0.004 | 0.112 | 1.000 | 0.108 | 0.283 | 0.204
위도 | 0.022 | 0.149 | 0.108 | 1.000 | 0.456 | 0.193
경도 | 0.067 | 0.301 | 0.283 | 0.456 | 1.000 | 0.200
노선명 | 1.000 | 0.839 | 0.204 | 0.193 | 0.200 | 1.000
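
The High correlation alerts listed near the top of the report are consistent with the second correlation table. The sketch below lists every pair whose coefficient reaches a cut-off; the 0.5 threshold and the helper name are assumptions made for illustration, not values read from the report's configuration.

```python
from itertools import combinations

# Second correlation table from the report (symmetric, so one triangle suffices).
columns = ["노선ID", "정류소ID", "순서", "위도", "경도", "노선명"]
matrix = [
    [1.000, 0.132, 0.004, 0.022, 0.067, 1.000],
    [0.132, 1.000, 0.112, 0.149, 0.301, 0.839],
    [0.004, 0.112, 1.000, 0.108, 0.283, 0.204],
    [0.022, 0.149, 0.108, 1.000, 0.456, 0.193],
    [0.067, 0.301, 0.283, 0.456, 1.000, 0.200],
    [1.000, 0.839, 0.204, 0.193, 0.200, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off for a "high" correlation


def high_pairs(columns, matrix, threshold):
    """Return (column_a, column_b, coefficient) for each pair at or above threshold."""
    pairs = []
    for i, j in combinations(range(len(columns)), 2):
        if matrix[i][j] >= threshold:
            pairs.append((columns[i], columns[j], matrix[i][j]))
    return pairs


result = high_pairs(columns, matrix, THRESHOLD)
for a, b, r in result:
    print(f"{a} <-> {b}: {r:.3f}")
```

Under that assumption the only pairs reported are 노선ID with 노선명 and 정류소ID with 노선명, matching the alerts.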

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
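
As an illustration of nullity correlation, the sketch below computes the Pearson correlation between missing-value indicators for 위도 and 경도. It uses only the first ten rows shown in the Sample section, so it is a toy check rather than the figure the heatmap displays; the helper function is written out by hand for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# 1 = value missing, 0 = value present, for sample rows 0 to 9.
lat_missing = [1, 0, 1, 1, 1, 0, 0, 1, 1, 1]  # 위도
lon_missing = [1, 0, 1, 1, 1, 0, 0, 1, 1, 1]  # 경도

nullity_corr = pearson(lat_missing, lon_missing)
print(round(nullity_corr, 3))
```

In these rows the two columns are always missing together, so the indicators are perfectly correlated. The identical missing counts (206 each) are consistent with the same pattern holding across the full dataset, though the sample alone does not prove it.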

Sample

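First rows

Row | 노선ID | 정류소ID | 순서 | 노선명 | 정류소명 | 위도 | 경도
0 | 1 | 2090 | 630 | 999번(빛가람로) | 전남대후문(남) | <NA> | <NA>
1 | 1 | 4410 | 640 | 999번(빛가람로) | 전남대후문위 | 126.911258 | 35.181632
2 | 1 | 1480 | 30 | 999번(빛가람로) | 동강대후문 | <NA> | <NA>
3 | 1 | 1670 | 40 | 999번(빛가람로) | 산수오거리 | <NA> | <NA>
4 | 1 | 1690 | 50 | 999번(빛가람로) | 살레시오여고 | <NA> | <NA>
5 | 1 | 2170 | 60 | 999번(빛가람로) | 조선대학교 | 126.93046 | 35.14806
6 | 1 | 2080 | 70 | 999번(빛가람로) | 전남대병원 | 126.922233 | 35.144414
7 | 1 | 1830 | 80 | 999번(빛가람로) | 양림 (휴먼시아) 1차 | <NA> | <NA>
8 | 1 | 1340 | 90 | 999번(빛가람로) | 남광주농협 | <NA> | <NA>
9 | 1 | 1580 | 100 | 999번(빛가람로) | 백운광장 | <NA> | <NA>

Last rows

Row | 노선ID | 정류소ID | 순서 | 노선명 | 정류소명 | 위도 | 경도
512 | 8 | 7011110 | 10 | 701 | 신도산단 | <NA> | <NA>
513 | 8 | 7011090 | 20 | 701 | 산림연구원 | <NA> | <NA>
514 | 9 | 7021070 | 10 | 702 | 중흥리조트 | <NA> | <NA>
515 | 9 | 7021060 | 40 | 702 | 중흥리조트 | <NA> | <NA>
516 | 9 | 7021040 | 30 | 702 | 인암마을 | <NA> | <NA>
517 | 9 | 7021020 | 20 | 702 | 양우내안애 | <NA> | <NA>
518 | 9 | 7021000 | 10 | 702 | 남평터미널 | <NA> | <NA>
519 | 9 | 7021030 | 30 | 702 | 양우내안애 | <NA> | <NA>
520 | 9 | 7021050 | 20 | 702 | 인암마을 | <NA> | <NA>
521 | 9 | 7021010 | 40 | 702 | 남평터미널 | <NA> | <NA>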