Overview

Dataset statistics

Number of variables: 9
Number of observations: 6147
Missing cells: 11939
Missing cells (%): 21.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 438.3 KiB
Average record size in memory: 73.0 B
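
The two derived figures above can be reproduced from the raw counts. The sketch below is a worked check, not the profiler's own code, and it assumes that 1 KiB means 1024 bytes.

```python
# Figures copied from the Dataset statistics above.
n_variables = 9
n_observations = 6147
missing_cells = 11939
total_size_kib = 438.3

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes (binary convention).
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both results round to the reported 21.6% and 73.0 B.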

Variable types

DateTime: 1
Categorical: 3
Text: 4
Numeric: 1

Dataset

Description: File download (파일 다운로드)
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15640/F/1/datasetView.do

Alerts

특수 is highly overall correlated with 용도별 (High correlation)
용도별 is highly overall correlated with 특수 (High correlation)
승용 has 621 (10.1%) missing values (Missing)
승합 has 3210 (52.2%) missing values (Missing)
화물 has 3212 (52.3%) missing values (Missing)
특수 has 4896 (79.6%) missing values (Missing)
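
The missing-value alerts are consistent with the dataset-level count of 11939 missing cells. The sketch below is a worked check using only figures from this report: it recomputes each percentage against the 6147 observations and sums the per-column counts.

```python
# Counts copied from the Alerts above; 6147 is the number of observations.
n_observations = 6147
missing_by_column = {"승용": 621, "승합": 3210, "화물": 3212, "특수": 4896}

missing_pct = {
    column: 100 * missing / n_observations
    for column, missing in missing_by_column.items()
}
for column, pct in missing_pct.items():
    print(f"{column}: {pct:.1f}% missing")

# These four columns account for every missing cell in the dataset.
total_missing = sum(missing_by_column.values())
print(f"Total missing cells: {total_missing}")
```

Because the four counts sum to 11939, the remaining five variables have no missing cells, which matches their per-variable summaries.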

Reproduction

Analysis started: 2024-04-06 11:20:38.432993
Analysis finished: 2024-04-06 11:20:39.869561
Duration: 1.44 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

년월 (year-month)
Date

Distinct: 12
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 48.2 KiB
Minimum: 2019-01-01 00:00:00
Maximum: 2019-12-01 00:00:00
Histogram with fixed size bins (bins=12)

시군구별 (by district)
Categorical

Distinct: 25
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 48.2 KiB

금천구: 268
송파구: 264
강남구: 260
양천구: 257
영등포구: 256
Other values (20): 4842

Length

Max length: 4
Median length: 3
Mean length: 3.0806898
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 종로구
2nd row: 종로구
3rd row: 종로구
4th row: 종로구
5th row: 종로구

Common Values

Value | Count | Frequency (%)
금천구 | 268 | 4.4%
송파구 | 264 | 4.3%
강남구 | 260 | 4.2%
양천구 | 257 | 4.2%
영등포구 | 256 | 4.2%
서대문구 | 252 | 4.1%
노원구 | 252 | 4.1%
서초구 | 252 | 4.1%
강동구 | 252 | 4.1%
은평구 | 250 | 4.1%
Other values (15) | 3584 | 58.3%

Length

Histogram of lengths of the category

연료별 (by fuel type)
Categorical

Distinct: 13
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 48.2 KiB

CNG: 600
경유: 600
기타연료: 600
엘피지: 600
휘발유(무연): 600
Other values (8): 3147

Length

Max length: 13
Median length: 12
Mean length: 5.4947129
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CNG
2nd row: CNG
3rd row: 경유
4th row: 경유
5th row: 기타연료

Common Values

Value | Count | Frequency (%)
CNG | 600 | 9.8%
경유 | 600 | 9.8%
기타연료 | 600 | 9.8%
엘피지 | 600 | 9.8%
휘발유(무연) | 600 | 9.8%
휘발유 | 593 | 9.6%
하이브리드(휘발유+전기) | 591 | 9.6%
전기 | 586 | 9.5%
하이브리드(LPG+전기) | 359 | 5.8%
휘발유(유연) | 355 | 5.8%
Other values (3) | 663 | 10.8%

Length

Histogram of lengths of the category

용도별 (by use)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 48.2 KiB

비사업용 (non-commercial): 3528
사업용 (commercial): 2619

Length

Max length: 4
Median length: 4
Mean length: 3.5739385
Min length: 3
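
The mean length follows directly from the two category counts. The sketch below is a worked check, assuming that length means the number of Unicode code points, which is what Python's len() returns for these precomposed Hangul strings.

```python
# Category counts copied from this variable's summary.
counts = {"비사업용": 3528, "사업용": 2619}

total_rows = sum(counts.values())

# Weighted mean of the string lengths (4 and 3 code points respectively).
total_characters = sum(len(value) * n for value, n in counts.items())
mean_length = total_characters / total_rows

print(f"Rows: {total_rows}")
print(f"Mean length: {mean_length:.7f}")
```

The result agrees with the reported mean length of 3.5739385.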

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 비사업용
2nd row: 사업용
3rd row: 비사업용
4th row: 사업용
5th row: 비사업용

Common Values

Value | Count | Frequency (%)
비사업용 | 3528 | 57.4%
사업용 | 2619 | 42.6%

Length

Histogram of lengths of the category

승용 (passenger cars)
Text

MISSING

Distinct: 2288
Distinct (%): 41.4%
Missing: 621
Missing (%): 10.1%
Memory size: 48.2 KiB
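
For columns with missing values, the Distinct (%) figures in this report only reproduce when the denominator is the number of non-missing rows rather than all 6147 observations. That denominator is inferred from the numbers themselves, not taken from the ydata-profiling documentation, and the helper name distinct_pct is hypothetical.

```python
# Counts copied from the per-variable summaries in this report.
N_OBSERVATIONS = 6147
summaries = {
    "승용": {"distinct": 2288, "missing": 621},
    "승합": {"distinct": 785, "missing": 3210},
    "화물": {"distinct": 1039, "missing": 3212},
    "특수": {"distinct": 240, "missing": 4896},
}


def distinct_pct(distinct: int, missing: int, n_rows: int = N_OBSERVATIONS) -> float:
    """Distinct values as a percentage of non-missing rows (assumed denominator)."""
    present = n_rows - missing
    return 100 * distinct / present


for name, s in summaries.items():
    print(f"{name}: {distinct_pct(s['distinct'], s['missing']):.1f}%")
```

Dividing by all 6147 rows instead would give 37.2% for 승용, not the reported 41.4%.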

Length

Max length: 6
Median length: 3
Mean length: 3.1471227
Min length: 1

Characters and Unicode

Total characters: 17391
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
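
As a small illustration of those character properties, the sketch below classifies the characters of the five sample values with Python's standard unicodedata module. It is not the profiler's implementation. The mapping from the two-letter category codes to the names used in this report is written out by hand.

```python
import unicodedata
from collections import Counter

# The five sample values shown for this variable.
samples = ["6", "11,053", "56", "2", "6"]

# Two-letter general-category codes mapped to the names used in the report.
CATEGORY_NAMES = {"Nd": "Decimal Number", "Po": "Other Punctuation"}

category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)

for code, count in category_counts.items():
    print(f"{CATEGORY_NAMES.get(code, code)}: {count}")
```

Digits fall under Decimal Number (Nd) and the thousands separator under Other Punctuation (Po), which is why the report finds exactly two categories in a column of formatted counts.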

Unique

Unique: 1889
Unique (%): 34.2%

Sample

1st row: 6
2nd row: 11,053
3rd row: 56
4th row: 2
5th row: 6
Value | Count | Frequency (%)
1 | 333 | 6.0%
2 | 312 | 5.6%
3 | 183 | 3.3%
4 | 103 | 1.9%
6 | 89 | 1.6%
5 | 71 | 1.3%
10 | 57 | 1.0%
8 | 54 | 1.0%
14 | 48 | 0.9%
7 | 48 | 0.9%
Other values (2278) | 4228 | 76.5%

Most occurring characters

Value | Count | Frequency (%)
1 | 2564 | 14.7%
2 | 2430 | 14.0%
, | 1900 | 10.9%
3 | 1809 | 10.4%
4 | 1495 | 8.6%
5 | 1380 | 7.9%
6 | 1315 | 7.6%
7 | 1222 | 7.0%
0 | 1114 | 6.4%
9 | 1086 | 6.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 15491 | 89.1%
Other Punctuation | 1900 | 10.9%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 2564 | 16.6%
2 | 2430 | 15.7%
3 | 1809 | 11.7%
4 | 1495 | 9.7%
5 | 1380 | 8.9%
6 | 1315 | 8.5%
7 | 1222 | 7.9%
0 | 1114 | 7.2%
9 | 1086 | 7.0%
8 | 1076 | 6.9%

Other Punctuation
Value | Count | Frequency (%)
, | 1900 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 17391 | 100.0%

Most frequent character per script

Common: same rows as the Most occurring characters table, since every character belongs to the Common script.

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 17391 | 100.0%

Most frequent character per block

ASCII: same rows as the Most occurring characters table, since every character belongs to the ASCII block.

승합 (passenger vans and buses)
Text

MISSING

Distinct: 785
Distinct (%): 26.7%
Missing: 3210
Missing (%): 52.2%
Memory size: 48.2 KiB

Length

Max length: 5
Median length: 3
Mean length: 2.3125638
Min length: 1

Characters and Unicode

Total characters: 6792
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 555
Unique (%): 18.9%

Sample

1st row: 2
2nd row: 92
3rd row: 3,266
4th row: 224
5th row: 25
Value | Count | Frequency (%)
2 | 213 | 7.3%
1 | 205 | 7.0%
3 | 101 | 3.4%
14 | 59 | 2.0%
10 | 55 | 1.9%
5 | 54 | 1.8%
6 | 52 | 1.8%
4 | 50 | 1.7%
15 | 49 | 1.7%
8 | 42 | 1.4%
Other values (775) | 2057 | 70.0%

Most occurring characters

Value | Count | Frequency (%)
1 | 1092 | 16.1%
2 | 1031 | 15.2%
3 | 810 | 11.9%
4 | 709 | 10.4%
5 | 555 | 8.2%
6 | 529 | 7.8%
8 | 471 | 6.9%
7 | 464 | 6.8%
9 | 451 | 6.6%
0 | 353 | 5.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 6465 | 95.2%
Other Punctuation | 327 | 4.8%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 1092 | 16.9%
2 | 1031 | 15.9%
3 | 810 | 12.5%
4 | 709 | 11.0%
5 | 555 | 8.6%
6 | 529 | 8.2%
8 | 471 | 7.3%
7 | 464 | 7.2%
9 | 451 | 7.0%
0 | 353 | 5.5%

Other Punctuation
Value | Count | Frequency (%)
, | 327 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 6792 | 100.0%

Most frequent character per script

Common: same rows as the Most occurring characters table, since every character belongs to the Common script.

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6792 | 100.0%

Most frequent character per block

ASCII: same rows as the Most occurring characters table, since every character belongs to the ASCII block.

화물 (freight vehicles)
Text

MISSING

Distinct: 1039
Distinct (%): 35.4%
Missing: 3212
Missing (%): 52.3%
Memory size: 48.2 KiB

Length

Max length: 6
Median length: 5
Mean length: 2.8156729
Min length: 1

Characters and Unicode

Total characters: 8264
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 721
Unique (%): 24.6%

Sample

1st row: 16
2nd row: 4,114
3rd row: 602
4th row: 47
5th row: 91
Value | Count | Frequency (%)
1 | 174 | 5.9%
4 | 56 | 1.9%
6 | 46 | 1.6%
16 | 42 | 1.4%
23 | 40 | 1.4%
28 | 35 | 1.2%
7 | 34 | 1.2%
11 | 31 | 1.1%
14 | 30 | 1.0%
17 | 30 | 1.0%
Other values (1029) | 2417 | 82.4%

Most occurring characters

Value | Count | Frequency (%)
1 | 1561 | 18.9%
2 | 920 | 11.1%
3 | 878 | 10.6%
4 | 761 | 9.2%
, | 649 | 7.9%
7 | 617 | 7.5%
8 | 612 | 7.4%
0 | 603 | 7.3%
6 | 561 | 6.8%
9 | 557 | 6.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 7615 | 92.1%
Other Punctuation | 649 | 7.9%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 1561 | 20.5%
2 | 920 | 12.1%
3 | 878 | 11.5%
4 | 761 | 10.0%
7 | 617 | 8.1%
8 | 612 | 8.0%
0 | 603 | 7.9%
6 | 561 | 7.4%
9 | 557 | 7.3%
5 | 545 | 7.2%

Other Punctuation
Value | Count | Frequency (%)
, | 649 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 8264 | 100.0%

Most frequent character per script

Common: same rows as the Most occurring characters table, since every character belongs to the Common script.

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 8264 | 100.0%

Most frequent character per block

ASCII: same rows as the Most occurring characters table, since every character belongs to the ASCII block.

특수 (special-purpose vehicles)
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 240
Distinct (%): 19.2%
Missing: 4896
Missing (%): 79.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 77.735412
Minimum: 1
Maximum: 630
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 54.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 17
Q3: 119
95-th percentile: 266.5
Maximum: 630
Range: 629
Interquartile range (IQR): 118

Descriptive statistics

Standard deviation: 112.13797
Coefficient of variation (CV): 1.4425597
Kurtosis: 7.0459505
Mean: 77.735412
Median Absolute Deviation (MAD): 16
Skewness: 2.3513697
Sum: 97247
Variance: 12574.924
Monotonicity: Not monotonic
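
Several of these statistics are simple functions of the others, which gives a quick consistency check. The sketch below uses only figures from this report and assumes the mean is taken over the 1251 non-missing values (6147 observations minus 4896 missing).

```python
# Figures copied from the 특수 summary above.
n_observations = 6147
missing = 4896
total_sum = 97247
mean = 77.735412
std = 112.13797
q1, q3 = 1, 119
minimum, maximum = 1, 630

non_missing = n_observations - missing   # rows that carry a value
mean_from_sum = total_sum / non_missing  # Sum / count
cv = std / mean                          # coefficient of variation
variance = std ** 2                      # variance is the squared std
iqr = q3 - q1                            # interquartile range
value_range = maximum - minimum

print(f"Non-missing rows: {non_missing}")
print(f"Mean from sum: {mean_from_sum:.6f}")
print(f"CV: {cv:.7f}")
print(f"Variance: {variance:.3f}")
print(f"IQR: {iqr}, Range: {value_range}")
```

Each derived value matches the reported figure to within the rounding of the printed inputs.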
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
1 | 322 | 5.2%
2 | 104 | 1.7%
3 | 49 | 0.8%
5 | 44 | 0.7%
4 | 31 | 0.5%
8 | 23 | 0.4%
17 | 19 | 0.3%
6 | 16 | 0.3%
60 | 15 | 0.2%
82 | 14 | 0.2%
Other values (230) | 614 | 10.0%
(Missing) | 4896 | 79.6%

Minimum 10 values

Value | Count | Frequency (%)
1 | 322 | 5.2%
2 | 104 | 1.7%
3 | 49 | 0.8%
4 | 31 | 0.5%
5 | 44 | 0.7%
6 | 16 | 0.3%
7 | 12 | 0.2%
8 | 23 | 0.4%
9 | 1 | < 0.1%
10 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
630 | 1 | < 0.1%
629 | 1 | < 0.1%
628 | 2 | < 0.1%
627 | 1 | < 0.1%
625 | 1 | < 0.1%
622 | 1 | < 0.1%
621 | 1 | < 0.1%
618 | 2 | < 0.1%
617 | 1 | < 0.1%
613 | 1 | < 0.1%


(Unnamed column; in the sample rows its value equals the sum of 승용, 승합, 화물, and 특수)
Text

Distinct: 2474
Distinct (%): 40.2%
Missing: 0
Missing (%): 0.0%
Memory size: 48.2 KiB

Length

Max length: 6
Median length: 3
Mean length: 3.2446722
Min length: 1

Characters and Unicode

Total characters: 19945
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1982
Unique (%): 32.2%

Sample

1st row: 24
2nd row: 92
3rd row: 18,553
4th row: 935
5th row: 84
Value | Count | Frequency (%)
1 | 287 | 4.7%
2 | 188 | 3.1%
3 | 165 | 2.7%
4 | 100 | 1.6%
8 | 75 | 1.2%
5 | 75 | 1.2%
6 | 72 | 1.2%
10 | 50 | 0.8%
27 | 49 | 0.8%
38 | 46 | 0.7%
Other values (2464) | 5040 | 82.0%

Most occurring characters

Value | Count | Frequency (%)
1 | 2790 | 14.0%
2 | 2616 | 13.1%
3 | 2152 | 10.8%
, | 2145 | 10.8%
4 | 1729 | 8.7%
5 | 1563 | 7.8%
6 | 1525 | 7.6%
7 | 1435 | 7.2%
8 | 1367 | 6.9%
9 | 1340 | 6.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 17800 | 89.2%
Other Punctuation | 2145 | 10.8%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 2790 | 15.7%
2 | 2616 | 14.7%
3 | 2152 | 12.1%
4 | 1729 | 9.7%
5 | 1563 | 8.8%
6 | 1525 | 8.6%
7 | 1435 | 8.1%
8 | 1367 | 7.7%
9 | 1340 | 7.5%
0 | 1283 | 7.2%

Other Punctuation
Value | Count | Frequency (%)
, | 2145 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 19945 | 100.0%

Most frequent character per script

Common: same rows as the Most occurring characters table, since every character belongs to the Common script.

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 19945 | 100.0%

Most frequent character per block

ASCII: same rows as the Most occurring characters table, since every character belongs to the ASCII block.

Interactions

[Figure: interaction plot]

Correlations

 | 년월 | 시군구별 | 연료별 | 용도별 | 특수
년월 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000
시군구별 | 0.000 | 1.000 | 0.082 | 0.000 | 0.757
연료별 | 0.000 | 0.082 | 1.000 | 0.360 | 0.586
용도별 | 0.000 | 0.000 | 0.360 | 1.000 | 0.600
특수 | 0.000 | 0.757 | 0.586 | 0.600 | 1.000

 | 용도별 | 시군구별 | 연료별
용도별 | 1.000 | 0.000 | 0.335
시군구별 | 0.000 | 1.000 | 0.026
연료별 | 0.335 | 0.026 | 1.000

 | 특수 | 시군구별 | 연료별 | 용도별
특수 | 1.000 | 0.409 | 0.359 | 0.603
시군구별 | 0.409 | 1.000 | 0.026 | 0.000
연료별 | 0.359 | 0.026 | 1.000 | 0.335
용도별 | 0.603 | 0.000 | 0.335 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

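First rows

Index | 년월 | 시군구별 | 연료별 | 용도별 | 승용 | 승합 | 화물 | 특수 | (unnamed)
0 | 2019-01 | 종로구 | CNG | 비사업용 | 6 | 2 | 16 | <NA> | 24
1 | 2019-01 | 종로구 | CNG | 사업용 | <NA> | 92 | <NA> | <NA> | 92
2 | 2019-01 | 종로구 | 경유 | 비사업용 | 11,053 | 3,266 | 4,114 | 120 | 18,553
3 | 2019-01 | 종로구 | 경유 | 사업용 | 56 | 224 | 602 | 53 | 935
4 | 2019-01 | 종로구 | 기타연료 | 비사업용 | 2 | 25 | 47 | 10 | 84
5 | 2019-01 | 종로구 | 기타연료 | 사업용 | <NA> | <NA> | 91 | <NA> | 91
6 | 2019-01 | 종로구 | 수소 | 비사업용 | 6 | <NA> | <NA> | <NA> | 6
7 | 2019-01 | 종로구 | 엘피지 | 비사업용 | 1,736 | 236 | 341 | <NA> | 2,313
8 | 2019-01 | 종로구 | 엘피지 | 사업용 | 223 | <NA> | 124 | <NA> | 347
9 | 2019-01 | 종로구 | 전기 | 비사업용 | 127 | <NA> | 4 | <NA> | 131

Last rows

Index | 년월 | 시군구별 | 연료별 | 용도별 | 승용 | 승합 | 화물 | 특수 | (unnamed)
6137 | 2019-12 | 강동구 | 하이브리드(CNG+전기) | 사업용 | <NA> | 2 | <NA> | <NA> | 2
6138 | 2019-12 | 강동구 | 하이브리드(LPG+전기) | 비사업용 | 73 | <NA> | <NA> | <NA> | 73
6139 | 2019-12 | 강동구 | 하이브리드(경유+전기) | 비사업용 | 1 | <NA> | <NA> | <NA> | 1
6140 | 2019-12 | 강동구 | 하이브리드(휘발유+전기) | 비사업용 | 3,695 | <NA> | <NA> | <NA> | 3,695
6141 | 2019-12 | 강동구 | 하이브리드(휘발유+전기) | 사업용 | 30 | <NA> | <NA> | <NA> | 30
6142 | 2019-12 | 강동구 | 휘발유 | 비사업용 | 28,180 | 19 | 55 | <NA> | 28,254
6143 | 2019-12 | 강동구 | 휘발유 | 사업용 | 70 | 3 | <NA> | <NA> | 73
6144 | 2019-12 | 강동구 | 휘발유(무연) | 비사업용 | 41,276 | 19 | 28 | <NA> | 41,323
6145 | 2019-12 | 강동구 | 휘발유(무연) | 사업용 | 232 | <NA> | <NA> | <NA> | 232
6146 | 2019-12 | 강동구 | 휘발유(유연) | 비사업용 | 52 | <NA> | <NA> | <NA> | 52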
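
The vehicle-count columns are typed as Text because values such as 11,053 carry a thousands separator. The sketch below shows one way to turn such cells into integers and to check that the unnamed last column is the row total. The helper parse_count is hypothetical, and the three rows are indices 0, 2, and 6142 as read from the sample above, with column boundaries inferred from the values.

```python
from typing import Optional


def parse_count(cell: str) -> Optional[int]:
    """Turn a count such as '11,053' into 11053; '<NA>' becomes None."""
    if cell == "<NA>":
        return None
    return int(cell.replace(",", ""))


# (승용, 승합, 화물, 특수, unnamed last column) for sample rows 0, 2 and 6142.
rows = [
    ("6", "2", "16", "<NA>", "24"),
    ("11,053", "3,266", "4,114", "120", "18,553"),
    ("28,180", "19", "55", "<NA>", "28,254"),
]

matches = []
for *vehicle_cells, total_cell in rows:
    parts = [parse_count(cell) for cell in vehicle_cells]
    # Missing cells contribute nothing to the row total.
    row_sum = sum(p for p in parts if p is not None)
    matches.append(row_sum == parse_count(total_cell))
    print(row_sum, parse_count(total_cell))

print(all(matches))
```

Treating missing cells as zero when summing is an assumption that holds for these rows. It would need to be confirmed against the full dataset before relying on it.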