Overview

Dataset statistics

Number of variables: 7
Number of observations: 78
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.6 KiB
Average record size in memory: 60.7 B

Variable types

Text: 1
Categorical: 3
Numeric: 3

Dataset

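Description: Korean National Police Agency (경찰청) data on vehicle counts by engine displacement. It provides the fields 배기량 (engine displacement), 차량대수 (number of vehicles), 승용 (passenger cars), 승합 (vans and buses), 화물 (trucks), 특수 (special-purpose vehicles), and 이륜 (two-wheeled vehicles). Data as of 2020-12-31.
Author: 경찰청 (Korean National Police Agency)
URL: https://www.data.go.kr/data/15065779/fileData.do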

Alerts

화물 is highly overall correlated with 차량대수 and 1 other field (High correlation)
특수 is highly overall correlated with 차량대수 and 1 other field (High correlation)
이륜 is highly overall correlated with 차량대수 (High correlation)
차량대수 is highly overall correlated with 화물 and 3 other fields (High correlation)
승용 is highly overall correlated with 차량대수 (High correlation)
승합 is highly overall correlated with 화물 and 1 other field (High correlation)
승합 is highly imbalanced (52.9%) (Imbalance)
배기량 has unique values (Unique)
화물 has 67 (85.9%) zeros (Zeros)
특수 has 49 (62.8%) zeros (Zeros)
이륜 has 71 (91.0%) zeros (Zeros)
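
The Zeros and Imbalance figures in these alerts can be re-derived from counts reported elsewhere in this report. The following is a minimal sketch, not the profiler's own code: the report does not state how the imbalance score is defined, so the formula used here (one minus Shannon entropy, normalized by the log of the number of classes) is an assumption that happens to reproduce the 52.9% shown for 승합. The function names are illustrative.

```python
import math

N_ROWS = 78  # number of observations reported in the overview


def zero_share(zero_count: int, n_rows: int = N_ROWS) -> float:
    """Percentage of rows equal to zero, rounded to one decimal place."""
    return round(100 * zero_count / n_rows, 1)


def imbalance_score(class_counts: list) -> float:
    """1 - H / log2(k): 0 for a uniform column, 1 for a single class.

    Assumed definition; it is not confirmed by the report itself.
    """
    k = len(class_counts)
    if k < 2:
        return 1.0  # a single class is maximally imbalanced
    total = sum(class_counts)
    entropy = -sum(c / total * math.log2(c / total) for c in class_counts if c)
    return 1 - entropy / math.log2(k)


# 승합 counts from its frequency table: 0 -> 56, 1 -> 4, 2 -> 3, 4 -> 2,
# plus 13 values that each occur once (17 distinct values in total).
seunghap_counts = [56, 4, 3, 2] + [1] * 13

print(zero_share(67), zero_share(49), zero_share(71))
print(round(100 * imbalance_score(seunghap_counts), 1))
```

With the counts listed for 화물, 특수, and 이륜 (67, 49, and 71 zeros out of 78 rows), the first line matches the three Zeros alerts.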

Reproduction

Analysis started: 2023-12-12 04:03:34.421405
Analysis finished: 2023-12-12 04:03:35.828196
Duration: 1.41 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

배기량
Text

UNIQUE 

Distinct: 78
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 756.0 B

Length

Max length: 9
Median length: 8
Mean length: 7.8076923
Min length: 4

Characters and Unicode

Total characters: 609
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
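
As an illustration of how such per-character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's implementation; the helper name is hypothetical, and the input is limited to the first five 배기량 values listed in this report's sample. The two-letter codes map to the category names used in the report: `Nd` is Decimal Number, `Ll` is Lowercase Letter, `Zs` is Space Separator, and `Po` is Other Punctuation.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    return Counter(unicodedata.category(ch) for text in values for ch in text)


# First five 배기량 values, as listed in the report's sample.
sample = ["0 cc", "1 cc", "20 cc", "100 cc", "113 cc"]
counts = category_counts(sample)
print(counts)

# A value with a thousands separator adds one Other Punctuation character.
print(category_counts(["1,170 cc"]))
```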

Unique

Unique: 78
Unique (%): 100.0%

Sample

1st row: 0 cc
2nd row: 1 cc
3rd row: 20 cc
4th row: 100 cc
5th row: 113 cc

Most occurring words

Value               Count  Frequency (%)
cc                  78     50.0%
2,696               1      0.6%
4,565               1      0.6%
3,933               1      0.6%
3,907               1      0.6%
3,800               1      0.6%
3,778               1      0.6%
3,773               1      0.6%
6,299               1      0.6%
3,470               1      0.6%
Other values (69)   69     44.2%

Most occurring characters

Value               Count  Frequency (%)
c                   156    25.6%
(space)             78     12.8%
,                   69     11.3%
9                   54     8.9%
1                   47     7.7%
0                   41     6.7%
2                   33     5.4%
6                   28     4.6%
7                   26     4.3%
3                   23     3.8%
Other values (3)    54     8.9%

Most occurring categories

Value               Count  Frequency (%)
Decimal Number      306    50.2%
Lowercase Letter    156    25.6%
Space Separator     78     12.8%
Other Punctuation   69     11.3%

Most frequent character per category

Decimal Number
Value               Count  Frequency (%)
9                   54     17.6%
1                   47     15.4%
0                   41     13.4%
2                   33     10.8%
6                   28     9.2%
7                   26     8.5%
3                   23     7.5%
5                   23     7.5%
4                   18     5.9%
8                   13     4.2%

Lowercase Letter
Value               Count  Frequency (%)
c                   156    100.0%

Space Separator
Value               Count  Frequency (%)
(space)             78     100.0%

Other Punctuation
Value               Count  Frequency (%)
,                   69     100.0%

Most occurring scripts

Value               Count  Frequency (%)
Common              453    74.4%
Latin               156    25.6%

Most frequent character per script

Common
Value               Count  Frequency (%)
(space)             78     17.2%
,                   69     15.2%
9                   54     11.9%
1                   47     10.4%
0                   41     9.1%
2                   33     7.3%
6                   28     6.2%
7                   26     5.7%
3                   23     5.1%
5                   23     5.1%
Other values (2)    31     6.8%

Latin
Value               Count  Frequency (%)
c                   156    100.0%

Most occurring blocks

Value               Count  Frequency (%)
ASCII               609    100.0%

Most frequent character per block

ASCII
Value               Count  Frequency (%)
c                   156    25.6%
(space)             78     12.8%
,                   69     11.3%
9                   54     8.9%
1                   47     7.7%
0                   41     6.7%
2                   33     5.4%
6                   28     4.6%
7                   26     4.3%
3                   23     3.8%
Other values (3)    54     8.9%

차량대수
Categorical

HIGH CORRELATION 

Distinct: 38
Distinct (%): 48.7%
Missing: 0
Missing (%): 0.0%
Memory size: 756.0 B

Most frequent categories
Value               Count
1                   12
2                   11
7                   6
6                   4
8                   4
Other values (33)   41

Length

Max length: 5
Median length: 1
Mean length: 1.7051282
Min length: 1

Unique

Unique: 29
Unique (%): 37.2%

Sample

1st row: 6
2nd row: 9
3rd row: 2
4th row: 73
5th row: 6

Common Values

Value               Count  Frequency (%)
1                   12     15.4%
2                   11     14.1%
7                   6      7.7%
6                   4      5.1%
8                   4      5.1%
4                   4      5.1%
5                   3      3.8%
9                   3      3.8%
16                  2      2.6%
1,463               1      1.3%
Other values (28)   28     35.9%

Length

[Figure: histogram of lengths of the category]

Value               Count  Frequency (%)
1                   12     15.4%
2                   11     14.1%
7                   6      7.7%
6                   4      5.1%
8                   4      5.1%
4                   4      5.1%
5                   3      3.8%
9                   3      3.8%
16                  2      2.6%
3,756               1      1.3%
Other values (28)   28     35.9%

승용
Categorical

HIGH CORRELATION 

Distinct: 21
Distinct (%): 26.9%
Missing: 0
Missing (%): 0.0%
Memory size: 756.0 B

Most frequent categories
Value               Count
0                   42
1                   7
2                   6
7                   4
4                   3
Other values (16)   16

Length

Max length: 5
Median length: 1
Mean length: 1.3589744
Min length: 1

Unique

Unique: 16
Unique (%): 20.5%

Sample

1st row: 0
2nd row: 7
3rd row: 0
4th row: 0
5th row: 6

Common Values

Value               Count  Frequency (%)
0                   42     53.8%
1                   7      9.0%
2                   6      7.7%
7                   4      5.1%
4                   3      3.8%
53                  1      1.3%
6                   1      1.3%
156                 1      1.3%
274                 1      1.3%
364                 1      1.3%
Other values (11)   11     14.1%

Length

[Figure: histogram of lengths of the category]

Value               Count  Frequency (%)
0                   42     53.8%
1                   7      9.0%
2                   6      7.7%
7                   4      5.1%
4                   3      3.8%
1,463               1      1.3%
16                  1      1.3%
27                  1      1.3%
115                 1      1.3%
3,756               1      1.3%
Other values (11)   11     14.1%

승합
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 17
Distinct (%): 21.8%
Missing: 0
Missing (%): 0.0%
Memory size: 756.0 B

Most frequent categories
Value               Count
0                   56
1                   4
2                   3
4                   2
629                 1
Other values (12)   12

Length

Max length: 5
Median length: 1
Mean length: 1.2051282
Min length: 1

Unique

Unique: 13
Unique (%): 16.7%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value               Count  Frequency (%)
0                   56     71.8%
1                   4      5.1%
2                   3      3.8%
4                   2      2.6%
629                 1      1.3%
38                  1      1.3%
624                 1      1.3%
853                 1      1.3%
3,808               1      1.3%
3                   1      1.3%
Other values (7)    7      9.0%

Length

[Figure: histogram of lengths of the category]

Value               Count  Frequency (%)
0                   56     71.8%
1                   4      5.1%
2                   3      3.8%
4                   2      2.6%
6                   1      1.3%
8                   1      1.3%
5                   1      1.3%
48                  1      1.3%
10                  1      1.3%
15                  1      1.3%
Other values (7)    7      9.0%

화물
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 11
Distinct (%): 14.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.487179
Minimum: 0
Maximum: 500
Zeros: 67
Zeros (%): 85.9%
Negative: 0
Negative (%): 0.0%
Memory size: 834.0 B

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 0
95th percentile: 23.05
Maximum: 500
Range: 500
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 59.610586
Coefficient of variation (CV): 5.684139
Kurtosis: 61.096328
Mean: 10.487179
Median Absolute Deviation (MAD): 0
Skewness: 7.5611634
Sum: 818
Variance: 3553.4219
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=11)]

Common values

Value               Count  Frequency (%)
0                   67     85.9%
2                   2      2.6%
29                  1      1.3%
120                 1      1.3%
22                  1      1.3%
500                 1      1.3%
7                   1      1.3%
128                 1      1.3%
1                   1      1.3%
4                   1      1.3%

Minimum 10 values

Value               Count  Frequency (%)
0                   67     85.9%
1                   1      1.3%
2                   2      2.6%
3                   1      1.3%
4                   1      1.3%
7                   1      1.3%
22                  1      1.3%
29                  1      1.3%
120                 1      1.3%
128                 1      1.3%

Maximum 10 values

Value               Count  Frequency (%)
500                 1      1.3%
128                 1      1.3%
120                 1      1.3%
29                  1      1.3%
22                  1      1.3%
7                   1      1.3%
4                   1      1.3%
3                   1      1.3%
2                   2      2.6%
1                   1      1.3%
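
The value tables for 화물 between them list every distinct value with its count (67 zeros plus 11 non-zero rows), so the column can be rebuilt and the summary statistics rechecked. The sketch below uses only the Python standard library. It assumes the report uses the sample standard deviation (n − 1 denominator) and linear interpolation for percentiles; neither is stated in the report, but both reproduce the reported figures.

```python
import statistics

# value -> count, copied from the 화물 value tables in this report
freq = {0: 67, 1: 1, 2: 2, 3: 1, 4: 1, 7: 1, 22: 1, 29: 1, 120: 1, 128: 1, 500: 1}
column = sorted(v for v, c in freq.items() for _ in range(c))

mean = statistics.mean(column)
std = statistics.stdev(column)  # sample standard deviation (n - 1)
cv = std / mean                 # coefficient of variation
# Cutting the data into 20 groups gives 19 cut points; the last is the 95th percentile.
p95 = statistics.quantiles(column, n=20, method="inclusive")[18]

print(len(column), sum(column))
print(round(mean, 6), round(std, 6), round(cv, 6), p95)
```

The 95th percentile falls between the sorted values 22 and 29, which is why it is a non-integer (23.05) even though every value in the column is a whole number.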

특수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 15
Distinct (%): 19.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.474359
Minimum: 0
Maximum: 64
Zeros: 49
Zeros (%): 62.8%
Negative: 0
Negative (%): 0.0%
Memory size: 834.0 B

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 2
95th percentile: 11.9
Maximum: 64
Range: 64
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 10.141498
Coefficient of variation (CV): 2.9189552
Kurtosis: 23.615804
Mean: 3.474359
Median Absolute Deviation (MAD): 0
Skewness: 4.6970205
Sum: 271
Variance: 102.84998
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=15)]

Common values

Value               Count  Frequency (%)
0                   49     62.8%
1                   7      9.0%
6                   4      5.1%
2                   3      3.8%
5                   3      3.8%
4                   2      2.6%
7                   2      2.6%
11                  1      1.3%
8                   1      1.3%
9                   1      1.3%
Other values (5)    5      6.4%

Minimum 10 values

Value               Count  Frequency (%)
0                   49     62.8%
1                   7      9.0%
2                   3      3.8%
3                   1      1.3%
4                   2      2.6%
5                   3      3.8%
6                   4      5.1%
7                   2      2.6%
8                   1      1.3%
9                   1      1.3%

Maximum 10 values

Value               Count  Frequency (%)
64                  1      1.3%
53                  1      1.3%
32                  1      1.3%
17                  1      1.3%
11                  1      1.3%
9                   1      1.3%
8                   1      1.3%
7                   2      2.6%
6                   4      5.1%
5                   3      3.8%

이륜
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 8
Distinct (%): 10.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.935897
Minimum: 0
Maximum: 951
Zeros: 71
Zeros (%): 91.0%
Negative: 0
Negative (%): 0.0%
Memory size: 834.0 B

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 0
95th percentile: 85.75
Maximum: 951
Range: 951
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 114.07532
Coefficient of variation (CV): 5.4487903
Kurtosis: 59.251711
Mean: 20.935897
Median Absolute Deviation (MAD): 0
Skewness: 7.398674
Sum: 1633
Variance: 13013.178
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=8)]

Common values

Value               Count  Frequency (%)
0                   71     91.0%
2                   1      1.3%
73                  1      1.3%
951                 1      1.3%
200                 1      1.3%
158                 1      1.3%
1                   1      1.3%
248                 1      1.3%

Minimum 10 values

Value               Count  Frequency (%)
0                   71     91.0%
1                   1      1.3%
2                   1      1.3%
73                  1      1.3%
158                 1      1.3%
200                 1      1.3%
248                 1      1.3%
951                 1      1.3%

Maximum 10 values

Value               Count  Frequency (%)
951                 1      1.3%
248                 1      1.3%
200                 1      1.3%
158                 1      1.3%
73                  1      1.3%
2                   1      1.3%
1                   1      1.3%
0                   71     91.0%

Interactions

[Figures: 9 interaction plots; images not reproduced in this text export]

Correlations

            배기량   차량대수  승용     승합     화물     특수     이륜
배기량      1.000   1.000   1.000   1.000   1.000   1.000   1.000
차량대수    1.000   1.000   0.986   0.929   1.000   0.984   1.000
승용        1.000   0.986   1.000   0.000   0.000   0.000   0.000
승합        1.000   0.929   0.000   1.000   1.000   0.852   0.000
화물        1.000   1.000   0.000   1.000   1.000   0.986   0.000
특수        1.000   0.984   0.000   0.852   0.986   1.000   0.000
이륜        1.000   1.000   0.000   0.000   0.000   0.000   1.000

            승용     차량대수  승합
승용        1.000   0.681   0.000
차량대수    0.681   1.000   0.456
승합        0.000   0.456   1.000

            화물     특수     이륜     차량대수  승용     승합
화물        1.000   0.403   -0.127  0.730   0.000   0.902
특수        0.403   1.000   -0.233  0.673   0.000   0.565
이륜        -0.127  -0.233  1.000   0.735   0.000   0.000
차량대수    0.730   0.673   0.735   1.000   0.681   0.456
승용        0.000   0.000   0.000   0.681   1.000   0.000
승합        0.902   0.565   0.000   0.456   0.000   1.000

Missing values

[Figure: nullity bar chart] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First 10 rows

     배기량       차량대수  승용   승합   화물   특수   이륜
0    0 cc        6       0     0     0     6     0
1    1 cc        9       7     0     0     2     0
2    20 cc       2       0     0     0     0     2
3    100 cc      73      0     0     0     0     73
4    113 cc      6       6     0     0     0     0
5    120 cc      4       0     4     0     0     0
6    124 cc      951     0     0     0     0     951
7    995 cc      156     156   0     0     0     0
8    999 cc      274     274   0     0     0     0
9    1,170 cc    200     0     0     0     0     200

Last 10 rows

     배기량       차량대수  승용   승합   화물   특수   이륜
68   7,640 cc    7       0     1     0     6     0
69   9,960 cc    16      0     10    0     6     0
70   10,964 cc   48      0     48    0     0     0
71   11,051 cc   6       0     5     0     1     0
72   11,120 cc   1       0     0     0     1     0
73   11,149 cc   8       0     8     0     0     0
74   12,300 cc   2       0     2     0     0     0
75   12,344 cc   2       0     1     0     1     0
76   12,700 cc   644     0     644   0     0     0
77   12,742 cc   2       0     2     0     0     0
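
배기량 is typed as Text, and 차량대수, 승용, and 승합 as Categorical. That is consistent with the raw strings carrying a " cc" suffix and thousands separators (for example "1,170 cc" and "3,756"). If numeric analysis of these columns is wanted, a cleaning step along the following lines could be applied before profiling. This is only a sketch: the helper names are hypothetical and are not part of the dataset or of ydata-profiling.

```python
def parse_count(text: str) -> int:
    """Convert a count such as '3,756' to an int by dropping thousands separators."""
    return int(text.replace(",", "").strip())


def parse_displacement(text: str) -> int:
    """Convert a displacement such as '1,170 cc' to an integer number of cc."""
    number = text.strip()
    if number.endswith("cc"):
        number = number[:-2]  # drop the unit suffix
    return parse_count(number)


# A few 배기량 values taken from the sample rows of this report.
rows = ["0 cc", "995 cc", "1,170 cc", "12,742 cc"]
print([parse_displacement(r) for r in rows])
print(parse_count("3,756"))
```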