Overview

Dataset statistics

Number of variables: 4
Number of observations: 372
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 12.5 KiB
Average record size in memory: 34.4 B

Variable types

Numeric: 2
Text: 1
Categorical: 1

Dataset

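Description: Provides information on Hi-Pass usage rates by toll office on Korea Expressway Corporation expressways. Columns: 구분 (sequence number), 영업소 (toll office), 하이패스 이용비율 (Hi-Pass usage rate), 비고 (remarks).
URL: https://www.data.go.kr/data/15101912/fileData.do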

Alerts

구분 is highly overall correlated with 하이패스 이용비율 and 1 other field (High correlation)
하이패스 이용비율 is highly overall correlated with 구분 (High correlation)
비고 is highly overall correlated with 구분 (High correlation)
비고 is highly imbalanced (68.7%) (Imbalance)
구분 has unique values (Unique)
영업소 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 04:57:41.788404
Analysis finished: 2023-12-12 04:57:42.689157
Duration: 0.9 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
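
A report like this one can be regenerated along the following lines. This is a minimal sketch, not the exact invocation used here: the function name `build_profile_report`, the report title, and the CP949 encoding are assumptions for illustration, and the settings in `config.json` are not reproduced.

```python
from pathlib import Path


def build_profile_report(csv_path: str, html_path: str = "report.html") -> Path:
    """Profile a CSV file and write an HTML report; returns the output path."""
    # Imported lazily so that this module still loads where the optional
    # third-party packages (pandas, ydata-profiling) are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the downloaded CSV is CP949-encoded, which is common for
    # Korean public data; switch to encoding="utf-8" if that fails.
    df = pd.read_csv(csv_path, encoding="cp949")

    # Default settings; the original run may have used a custom config.
    report = ProfileReport(df, title="Hi-Pass usage rate by toll office")
    report.to_file(html_path)
    return Path(html_path)

# Example call (hypothetical file name for the CSV from the dataset URL):
# build_profile_report("hipass_usage_rate.csv")
```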

Variables

구분
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 372
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 186.5
Minimum: 1
Maximum: 372
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 19.55
Q1: 93.75
Median: 186.5
Q3: 279.25
95-th percentile: 353.45
Maximum: 372
Range: 371
Interquartile range (IQR): 185.5

Descriptive statistics

Standard deviation: 107.53139
Coefficient of variation (CV): 0.57657582
Kurtosis: -1.2
Mean: 186.5
Median Absolute Deviation (MAD): 93
Skewness: 0
Sum: 69378
Variance: 11563
Monotonicity: Strictly increasing
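
The figures above are what one would expect if 구분 is simply the sequence 1, 2, ..., 372. The sketch below checks that under this assumption, using only the standard library; the `percentile` helper is a hypothetical stand-in that interpolates linearly between neighbouring ranks, and the variance uses the sample (n - 1) denominator.

```python
import math
import statistics


def percentile(sorted_values, q):
    """q-th percentile with linear interpolation between neighbouring ranks."""
    pos = q / 100 * (len(sorted_values) - 1)
    lower = math.floor(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac


# Assumption: 구분 is exactly the integers 1..372 in order.
values = list(range(1, 373))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std = math.sqrt(variance)
cv = std / mean                         # coefficient of variation

p5 = percentile(values, 5)
q1 = percentile(values, 25)
q3 = percentile(values, 75)
iqr = q3 - q1

# Median absolute deviation: median of |x - median(x)|.
med = statistics.median(values)
mad = statistics.median([abs(v - med) for v in values])
```

Under that assumption the computed sum, mean, variance, quartiles, and MAD agree with the reported values, which also explains the zero skewness and the "Strictly increasing" monotonicity.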
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
1                   1      0.3%
247                 1      0.3%
256                 1      0.3%
255                 1      0.3%
254                 1      0.3%
253                 1      0.3%
252                 1      0.3%
251                 1      0.3%
250                 1      0.3%
249                 1      0.3%
Other values (362)  362    97.3%

Value  Count  Frequency (%)
1      1      0.3%
2      1      0.3%
3      1      0.3%
4      1      0.3%
5      1      0.3%
6      1      0.3%
7      1      0.3%
8      1      0.3%
9      1      0.3%
10     1      0.3%

Value  Count  Frequency (%)
372    1      0.3%
371    1      0.3%
370    1      0.3%
369    1      0.3%
368    1      0.3%
367    1      0.3%
366    1      0.3%
365    1      0.3%
364    1      0.3%
363    1      0.3%

영업소
Text

UNIQUE 

Distinct: 372
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

Length

Max length: 9
Median length: 2
Mean length: 2.5295699
Min length: 2

Characters and Unicode

Total characters: 941
Distinct characters: 190
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 372
Unique (%): 100.0%

Sample

1st row: 기흥동탄
2nd row: 수원신갈
3rd row: 동수원
4th row: 북여주
5th row: 북수원
Value               Count  Frequency (%)
기흥동탄            1      0.3%
동고령              1      0.3%
사천                1      0.3%
서순천              1      0.3%
서안동              1      0.3%
상주                1      0.3%
영동                1      0.3%
동순천              1      0.3%
북영천              1      0.3%
목포                1      0.3%
Other values (362)  362    97.3%

Most occurring characters

ValueCountFrequency (%)
42
 
4.5%
40
 
4.3%
38
 
4.0%
37
 
3.9%
37
 
3.9%
34
 
3.6%
28
 
3.0%
24
 
2.6%
24
 
2.6%
19
 
2.0%
Other values (180) 618
65.7%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       929    98.7%
Close Punctuation  4      0.4%
Open Punctuation   4      0.4%
Uppercase Letter   3      0.3%
Decimal Number     1      0.1%
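
The category labels above correspond to Unicode general categories. As a rough illustration, the sketch below counts them for three 영업소 values taken from this report using Python's standard `unicodedata` module; the `CATEGORY_NAMES` mapping and the `category_counts` helper are names introduced here for illustration and are not part of the profiling tool.

```python
import unicodedata
from collections import Counter

# Two-letter general-category codes and the labels used in the table above.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}


def category_counts(names):
    """Count characters per Unicode general category across all names."""
    counts = Counter()
    for name in names:
        for ch in name:
            code = unicodedata.category(ch)
            # Fall back to the raw code for categories not listed above.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Three toll office names that appear in the report's sample rows.
sample = ["기흥동탄", "수원신갈", "가락(개)"]
counts = category_counts(sample)
```

Hangul syllables fall under "Other Letter", which is why that category dominates; the parentheses in names such as 가락(개) account for the open and close punctuation counts.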

Most frequent character per category

Other Letter
ValueCountFrequency (%)
42
 
4.5%
40
 
4.3%
38
 
4.1%
37
 
4.0%
37
 
4.0%
34
 
3.7%
28
 
3.0%
24
 
2.6%
24
 
2.6%
19
 
2.0%
Other values (174) 606
65.2%
Uppercase Letter
Value  Count  Frequency (%)
E      1      33.3%
K      1      33.3%
C      1      33.3%
Close Punctuation
Value  Count  Frequency (%)
)      4      100.0%
Open Punctuation
Value  Count  Frequency (%)
(      4      100.0%
Decimal Number
Value  Count  Frequency (%)
2      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  929    98.7%
Common  9      1.0%
Latin   3      0.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
42
 
4.5%
40
 
4.3%
38
 
4.1%
37
 
4.0%
37
 
4.0%
34
 
3.7%
28
 
3.0%
24
 
2.6%
24
 
2.6%
19
 
2.0%
Other values (174) 606
65.2%
Common
Value  Count  Frequency (%)
)      4      44.4%
(      4      44.4%
2      1      11.1%
Latin
Value  Count  Frequency (%)
E      1      33.3%
K      1      33.3%
C      1      33.3%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  929    98.7%
ASCII   12     1.3%

Most frequent character per block

Hangul
ValueCountFrequency (%)
42
 
4.5%
40
 
4.3%
38
 
4.1%
37
 
4.0%
37
 
4.0%
34
 
3.7%
28
 
3.0%
24
 
2.6%
24
 
2.6%
19
 
2.0%
Other values (174) 606
65.2%
ASCII
Value  Count  Frequency (%)
)      4      33.3%
(      4      33.3%
E      1      8.3%
K      1      8.3%
C      1      8.3%
2      1      8.3%

하이패스 이용비율
Real number (ℝ)

HIGH CORRELATION 

Distinct: 96
Distinct (%): 25.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 86.595968
Minimum: 78.9
Maximum: 92.6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.4 KiB

Quantile statistics

Minimum: 78.9
5-th percentile: 83.155
Q1: 85.1
Median: 86.4
Q3: 87.9
95-th percentile: 90.9
Maximum: 92.6
Range: 13.7
Interquartile range (IQR): 2.8

Descriptive statistics

Standard deviation: 2.3041881
Coefficient of variation (CV): 0.026608492
Kurtosis: 0.13871039
Mean: 86.595968
Median Absolute Deviation (MAD): 1.4
Skewness: 0.17994416
Sum: 32213.7
Variance: 5.3092829
Monotonicity: Not monotonic
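
Several of these statistics are derived from one another, which gives a quick consistency check that needs only the reported numbers. The sketch below recomputes the mean, standard deviation, CV, range, and IQR from the sum, variance, extremes, and quartiles listed for 하이패스 이용비율; the variable names are chosen here for illustration.

```python
import math

# Values copied from the statistics reported for 하이패스 이용비율.
n = 372
total = 32213.7
variance = 5.3092829
minimum, maximum = 78.9, 92.6
q1, q3 = 85.1, 87.9

mean = total / n               # Mean = Sum / number of observations
std = math.sqrt(variance)      # Standard deviation = sqrt(Variance)
cv = std / mean                # Coefficient of variation = std / mean
value_range = maximum - minimum
iqr = q3 - q1                  # Interquartile range = Q3 - Q1
```

Each derived value matches the reported one to within rounding of the displayed digits.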
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
85.2               11     3.0%
85.3               11     3.0%
85.8               10     2.7%
86.7               9      2.4%
86.4               9      2.4%
86.2               8      2.2%
86.8               8      2.2%
86.6               8      2.2%
85.0               8      2.2%
85.4               8      2.2%
Other values (86)  282    75.8%

Value  Count  Frequency (%)
78.9   1      0.3%
80.3   1      0.3%
81.0   1      0.3%
81.3   1      0.3%
81.4   1      0.3%
81.5   1      0.3%
81.8   2      0.5%
82.0   1      0.3%
82.1   2      0.5%
82.4   1      0.3%

Value  Count  Frequency (%)
92.6   1      0.3%
92.3   1      0.3%
92.0   2      0.5%
91.7   3      0.8%
91.6   2      0.5%
91.5   2      0.5%
91.4   1      0.3%
91.3   1      0.3%
91.2   2      0.5%
91.1   2      0.5%

비고
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB
폐쇄식: 351
개방식: 21

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 폐쇄식
2nd row: 폐쇄식
3rd row: 폐쇄식
4th row: 폐쇄식
5th row: 폐쇄식

Common Values

Value   Count  Frequency (%)
폐쇄식  351    94.4%
개방식  21     5.6%
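
The 68.7% imbalance figure reported for 비고 is consistent with one minus the normalized Shannon entropy of the class counts. The sketch below assumes that definition (it is not taken from the tool's source) and applies it to the two counts above; `imbalance_score` is a name introduced here for illustration.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits) of
    the class distribution and k is the number of classes.
    0 means perfectly balanced, 1 means a single class holds everything."""
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(k)


# Class counts for 비고: 폐쇄식 = 351, 개방식 = 21.
score = imbalance_score([351, 21])
```

With these counts the score rounds to 68.7%, matching the alert; an evenly split column would score 0.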

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
폐쇄식  351    94.4%
개방식  21     5.6%

Interactions


Correlations

                   구분     하이패스 이용비율  비고
구분               1.000    0.954              0.882
하이패스 이용비율  0.954    1.000              0.479
비고               0.882    0.479              1.000

                   구분     하이패스 이용비율  비고
구분               1.000    -0.804             0.709
하이패스 이용비율  -0.804   1.000              0.365
비고               0.709    0.365              1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

     구분  영업소      하이패스 이용비율  비고
0    1     기흥동탄    92.6               폐쇄식
1    2     수원신갈    92.3               폐쇄식
2    3     동수원      92.0               폐쇄식
3    4     북여주      92.0               폐쇄식
4    5     북수원      91.7               폐쇄식
5    6     마성        91.6               폐쇄식
6    7     서안산      91.5               폐쇄식
7    8     부곡        91.4               폐쇄식
8    9     경기광주    91.2               폐쇄식
9    10    남여주      91.2               폐쇄식

     구분  영업소      하이패스 이용비율  비고
362  363   다사        89.1               개방식
363  364   김포        88.5               개방식
364  365   인천        87.8               개방식
365  366   내서        86.4               개방식
366  367   가락(개)    85.5               개방식
367  368   대동(개)    84.2               개방식
368  369   순천만      82.1               개방식
369  370   서영암(개)  81.0               개방식
370  371   일로        80.3               개방식
371  372   동부산      89.2               개방식