Overview

Dataset statistics

Number of variables: 5
Number of observations: 6645
Missing cells: 887
Missing cells (%): 2.7%
Duplicate rows: 2
Duplicate rows (%): < 0.1%
Total size in memory: 285.7 KiB
Average record size in memory: 44.0 B
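
The percentage and per-record figures above follow from the raw counts. The sketch below redoes that arithmetic in Python. It assumes (the report does not say so) that the missing-cell percentage is taken over all cells, that is variables times observations, and that 1 KiB is 1024 bytes.

```python
# Raw counts copied from the dataset statistics above.
n_variables = 5
n_observations = 6645
missing_cells = 887
total_size_kib = 285.7

# Assumption: the percentage is taken over all cells (variables x observations).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes; average size is total bytes per observation.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under those assumptions both values round to the figures listed above (2.7% and 44.0 B).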

Variable types

Categorical: 1
Numeric: 3
DateTime: 1

Dataset

Description: Naju City bus times and route IDs, as provided on the Bitgaram Information Portal (빛가람정보포탈)
Author: KEPCO KDN Co., Ltd. (한전KDN(주))
URL: https://www.data.go.kr/data/15038342/fileData.do

Alerts

Dataset has 2 (< 0.1%) duplicate rows [Duplicates]
시간 (time) has 887 (13.3%) missing values [Missing]
시간순서 (time order) has 207 (3.1%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-12 00:46:33.077391
Analysis finished: 2023-12-12 00:46:35.201831
Duration: 2.12 seconds
Software version: ydata-profiling v4.5.1
Configuration: config.json
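
A report like this one is typically produced by passing a pandas DataFrame to `ProfileReport`. The sketch below shows one way to do that; it is not the script that generated this report. The function name `build_report`, the CSV path, the output path, and the title are all hypothetical. The `dataset` values are copied from the Dataset block above.

```python
from pathlib import Path


def build_report(csv_path: str, out_path: str = "report.html") -> Path:
    """Profile a CSV file and write an HTML report (sketch, not the original script)."""
    # Third-party imports are kept local so this module can be imported
    # even where pandas / ydata-profiling are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the source file is a plain CSV. Korean public datasets are
    # often encoded as cp949, so an explicit encoding= argument may be needed.
    df = pd.read_csv(csv_path)

    profile = ProfileReport(
        df,
        title="Naju bus timetable",  # hypothetical title
        dataset={  # metadata rendered in the report's "Dataset" block
            "description": "빛가람정보포탈 내 제공중인 나주시 버스시간 및 노선ID",
            "author": "한전KDN(주)",
            "url": "https://www.data.go.kr/data/15038342/fileData.do",
        },
    )
    profile.to_file(out_path)
    return Path(out_path)
```

Calling `build_report("naju_bus.csv")` (a placeholder file name) would write `report.html` next to the script.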

Variables

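평일주말구분(0:평일, 1:주말) (weekday/weekend flag; 0: weekday, 1: weekend)
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.0 KiB

Value counts: 0 (4832), 1 (1813)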

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      4832   72.7%
1      1813   27.3%
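
A value/count/frequency table of this kind can be rebuilt with `collections.Counter`. The helper below is an illustrative sketch. The name `common_values` and the way it folds the remainder into an "Other values" row are assumptions modelled on the tables in this report, not ydata-profiling's own code.

```python
from collections import Counter


def common_values(values, top_n=10):
    """Return (value, count, percent) rows, most frequent first."""
    counts = Counter(values)
    total = sum(counts.values())
    rows = [(v, c, round(100 * c / total, 1)) for v, c in counts.most_common(top_n)]
    # Anything beyond the top_n values is folded into one "Other values" row.
    other = total - sum(c for _, c, _ in rows)
    if other:
        n_other = len(counts) - len(rows)
        rows.append((f"Other values ({n_other})", other, round(100 * other / total, 1)))
    return rows


# Rebuild the flag column from the counts reported above.
flags = [0] * 4832 + [1] * 1813
for value, count, pct in common_values(flags):
    print(value, count, f"{pct}%")
```

With the two counts above this reproduces the 72.7% / 27.3% split.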

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar plot of the common values listed above]

정류장ID (stop ID)
Real number (ℝ)

Distinct: 229
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 610013.4
Minimum: 47
Maximum: 7021070
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 58.5 KiB

Quantile statistics

Minimum: 47
5-th percentile: 1250
Q1: 1510
Median: 2010
Q3: 2330
95-th percentile: 7011050
Maximum: 7021070
Range: 7021023
Interquartile range (IQR): 820

Descriptive statistics

Standard deviation: 1972578.8
Coefficient of variation (CV): 3.2336648
Kurtosis: 6.6372933
Mean: 610013.4
Median Absolute Deviation (MAD): 500
Skewness: 2.9385842
Sum: 4.053539 × 10^9
Variance: 3.8910673 × 10^12
Monotonicity: Increasing
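
Several of the rows above are derived from others, so they can be cross-checked. The sketch below assumes the usual definitions: range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. It recomputes these from the reported inputs.

```python
# Values copied from the 정류장ID statistics above.
n_observations = 6645
minimum, q1, q3, maximum = 47, 1510, 2330, 7021070
mean, std = 610013.4, 1972578.8

value_range = maximum - minimum  # report lists 7021023
iqr = q3 - q1                    # report lists 820
cv = std / mean                  # report lists 3.2336648
variance = std ** 2              # report lists 3.8910673 x 10^12
total = mean * n_observations    # report lists 4.053539 x 10^9

print(value_range, iqr, round(cv, 4), f"{variance:.4e}", f"{total:.4e}")
```

The mean (610013.4) sits far above the median (2010) because a small share of the IDs are seven-digit values, as the 95-th percentile of 7011050 shows.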
[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value               Count  Frequency (%)
1870                217    3.3%
2020                210    3.2%
2010                210    3.2%
1300                210    3.2%
1290                210    3.2%
1860                193    2.9%
2170                171    2.6%
2070                168    2.5%
2160                168    2.5%
1580                168    2.5%
Other values (219)  4720   71.0%

Minimum Values

Value  Count  Frequency (%)
47     25     0.4%
48     24     0.4%
811    25     0.4%
817    25     0.4%
1010   1      < 0.1%
1020   2      < 0.1%
1040   44     0.7%
1050   24     0.4%
1080   47     0.7%
1090   4      0.1%

Maximum Values

Value    Count  Frequency (%)
7021070  21     0.3%
7021060  21     0.3%
7021050  21     0.3%
7021040  21     0.3%
7021030  21     0.3%
7021020  21     0.3%
7021010  21     0.3%
7021000  21     0.3%
7011110  27     0.4%
7011100  27     0.4%

시간순서 (time order)
Real number (ℝ)

Alert: ZEROS

Distinct: 42
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.12611
Minimum: 0
Maximum: 41
Zeros: 207
Zeros (%): 3.1%
Negative: 0
Negative (%): 0.0%
Memory size: 58.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 4
Median: 12
Q3: 20
95-th percentile: 30
Maximum: 41
Range: 41
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 9.8741503
Coefficient of variation (CV): 0.7522526
Kurtosis: -0.45844639
Mean: 13.12611
Median Absolute Deviation (MAD): 8
Skewness: 0.52840591
Sum: 87223
Variance: 97.498844
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=42)]

Common Values

Value              Count  Frequency (%)
1                  888    13.4%
11                 208    3.1%
20                 208    3.1%
18                 208    3.1%
17                 208    3.1%
16                 208    3.1%
15                 208    3.1%
14                 208    3.1%
13                 208    3.1%
12                 208    3.1%
Other values (32)  3885   58.5%

Minimum Values

Value  Count  Frequency (%)
0      207    3.1%
1      888    13.4%
2      208    3.1%
3      208    3.1%
4      208    3.1%
5      208    3.1%
6      208    3.1%
7      208    3.1%
8      208    3.1%
9      208    3.1%

Maximum Values

Value  Count  Frequency (%)
41     28     0.4%
40     28     0.4%
39     28     0.4%
38     28     0.4%
37     28     0.4%
36     28     0.4%
35     28     0.4%
34     28     0.4%
33     28     0.4%
32     28     0.4%

노선ID (route ID)
Real number (ℝ)

Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.5504891
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 58.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 5
Q3: 7
95-th percentile: 8
Maximum: 9
Range: 8
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.5216197
Coefficient of variation (CV): 0.55414256
Kurtosis: -1.4232131
Mean: 4.5504891
Median Absolute Deviation (MAD): 2
Skewness: -0.11387087
Sum: 30238
Variance: 6.3585658
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=9)]

Common Values

Value  Count  Frequency (%)
7      1338   20.1%
1      1328   20.0%
6      1320   19.9%
3      1209   18.2%
8      486    7.3%
2      398    6.0%
5      209    3.1%
4      189    2.8%
9      168    2.5%

Minimum Values

Value  Count  Frequency (%)
1      1328   20.0%
2      398    6.0%
3      1209   18.2%
4      189    2.8%
5      209    3.1%
6      1320   19.9%
7      1338   20.1%
8      486    7.3%
9      168    2.5%

Maximum Values

Value  Count  Frequency (%)
9      168    2.5%
8      486    7.3%
7      1338   20.1%
6      1320   19.9%
5      209    3.1%
4      189    2.8%
3      1209   18.2%
2      398    6.0%
1      1328   20.0%

시간 (time)
Date

Alert: MISSING

Distinct: 1014
Distinct (%): 17.6%
Missing: 887
Missing (%): 13.3%
Memory size: 52.0 KiB
Minimum: 2023-12-12 05:33:00
Maximum: 2023-12-12 23:48:00

[Figure: Histogram with fixed size bins (bins=50)]

Interactions

[Figures: 9 interaction plots; images not preserved]

Correlations

[Figure: correlation heatmap for the matrix below]

Variable                      평일주말구분(0:평일, 1:주말)  정류장ID  시간순서  노선ID
평일주말구분(0:평일, 1:주말)  1.000                         0.291     0.318     0.498
정류장ID                      0.291                         1.000     0.083     0.783
시간순서                      0.318                         0.083     1.000     0.367
노선ID                        0.498                         0.783     0.367     1.000

[Figure: correlation heatmap for the matrix below]

Variable                      정류장ID  시간순서  노선ID  평일주말구분(0:평일, 1:주말)
정류장ID                      1.000     -0.127    0.200   0.188
시간순서                      -0.127    1.000     -0.052  0.244
노선ID                        0.200     -0.052    1.000   0.499
평일주말구분(0:평일, 1:주말)  0.188     0.244     0.499   1.000
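
The report does not say which coefficient produced each matrix. As an illustration of how one rank-based coefficient is computed, the sketch below implements Spearman's rho in plain Python: rank both series, giving ties the average rank, then take the Pearson correlation of the ranks. The function names and the toy inputs are hypothetical, and the result is not claimed to match either matrix above.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of equal values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data (not taken from the dataset): a monotonic relationship with ties.
print(spearman([0, 1, 2, 3, 4], [1, 1, 2, 3, 3]))
```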

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
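
The per-column nullity counts behind such a chart can be computed without any plotting library. This is a minimal sketch with a hypothetical helper, `nullity_by_column`, and a few illustrative rows modelled on the sample shown in this report, with `None` standing in for `<NA>`.

```python
def nullity_by_column(rows):
    """Count missing (None) cells per column for a list of dict rows."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            # bool adds as 0 or 1, so every column gets an entry even at zero.
            counts[column] = counts.get(column, 0) + (value is None)
    return counts


# Illustrative rows only; None marks a missing value.
rows = [
    {"정류장ID": 47, "시간순서": 0, "시간": None},
    {"정류장ID": 47, "시간순서": 1, "시간": "07:55"},
    {"정류장ID": 47, "시간순서": 2, "시간": "08:50"},
]
print(nullity_by_column(rows))
```

Here only the 시간 column has a missing cell, which mirrors the report's single Missing alert.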

Sample

First rows

Index  평일주말구분(0:평일, 1:주말)  정류장ID  시간순서  노선ID  시간
0      0                             47        0         2       <NA>
1      0                             47        1         2       07:55
2      0                             47        2         2       08:50
3      0                             47        3         2       09:30
4      0                             47        4         2       10:10
5      0                             47        5         2       10:25
6      0                             47        6         2       11:35
7      0                             47        7         2       12:15
8      0                             47        8         2       13:00
9      0                             47        9         2       13:15

Last rows

Index  평일주말구분(0:평일, 1:주말)  정류장ID  시간순서  노선ID  시간
6635   0                             7021070   11        9       15:05
6636   0                             7021070   12        9       15:45
6637   0                             7021070   13        9       16:25
6638   0                             7021070   14        9       17:05
6639   0                             7021070   15        9       18:35
6640   0                             7021070   16        9       19:15
6641   0                             7021070   17        9       19:40
6642   0                             7021070   18        9       20:20
6643   0                             7021070   19        9       20:45
6644   0                             7021070   20        9       21:30

Duplicate rows

Most frequently occurring

Index  평일주말구분(0:평일, 1:주말)  정류장ID  시간순서  노선ID  시간  # duplicates
0      0                             4670      1         1       <NA>  2
1      0                             4680      1         1       <NA>  2
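
Fully duplicated rows can be found by counting identical tuples. The sketch below uses a hypothetical helper, `duplicate_rows`, on made-up rows. `None` stands in for `<NA>`, so two rows that differ only by both having a missing time compare as equal, which is assumed here to match how the report counts duplicates.

```python
from collections import Counter


def duplicate_rows(rows):
    """Return {row: occurrences} for rows that appear more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Hypothetical rows as (flag, stop_id, order, route_id, time) tuples.
rows = [
    (0, 100, 1, 1, None),
    (0, 100, 1, 1, None),
    (0, 100, 2, 1, "06:00"),
]
print(duplicate_rows(rows))
```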