Overview

Dataset statistics

Number of variables: 6
Number of observations: 10000
Missing cells: 2861
Missing cells (%): 4.8%
Duplicate rows: 283
Duplicate rows (%): 2.8%
Total size in memory: 556.6 KiB
Average record size in memory: 57.0 B
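
The percentage and per-record figures above follow from the raw counts. A minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB = 1024 B; the variable names are illustrative and not part of the report:

```python
# Values copied from the Dataset statistics table.
n_variables = 6
n_observations = 10_000
missing_cells = 2_861
duplicate_rows = 283
total_size_kib = 556.6

# Assumption: missing-cell percentage is relative to all cells in the table.
missing_pct = 100 * missing_cells / (n_variables * n_observations)
duplicate_pct = 100 * duplicate_rows / n_observations
# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the three printed values round to the 4.8%, 2.8% and 57.0 B reported above.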

Variable types

DateTime: 2
Text: 2
Categorical: 1
Numeric: 1

Dataset

Description: File download (파일 다운로드)
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-21245/F/1/datasetView.do

Alerts

기준연월 has constant value "" [Constant]
Dataset has 283 (2.8%) duplicate rows [Duplicates]
연료 is highly imbalanced (75.7%) [Imbalance]
현소유자의출생년도 has 2861 (28.6%) missing values [Missing]

Reproduction

Analysis started: 2024-03-13 07:47:39.155044
Analysis finished: 2024-03-13 07:47:39.718397
Duration: 0.56 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

기준연월
Date

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2015-12-01 00:00:00
Maximum: 2015-12-01 00:00:00
Histogram with fixed size bins (bins=1)

사용본거지시읍면동_행정동기준
Text

Distinct: 424
Distinct (%): 4.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 21
Median length: 14
Mean length: 13.855
Min length: 11

Characters and Unicode

Total characters: 138550
Distinct characters: 194
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
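
As a rough illustration of how such character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using the five sample rows listed for this variable. It is not the profiler's own implementation, and the helper name `category_counts` is hypothetical. The two-letter codes correspond to the category names used in this report: `Lo` is Other Letter, `Zs` is Space Separator, `Nd` is Decimal Number and `Po` is Other Punctuation.

```python
import unicodedata
from collections import Counter

# Sample rows copied from this variable's Sample list.
rows = [
    "서울특별시 마포구 서강동",
    "서울특별시 강남구 수서동",
    "서울특별시 관악구 보라매동",
    "서울특별시 중랑구 망우본동",
    "서울특별시 강서구 가양1동",
]

def category_counts(texts):
    """Count characters per Unicode general category (e.g. 'Lo', 'Zs', 'Nd')."""
    counts = Counter()
    for text in texts:
        counts.update(unicodedata.category(ch) for ch in text)
    return counts

counts = category_counts(rows)
for code, n in counts.most_common():
    print(code, n)
```

On these five rows Hangul syllables dominate, matching the overall picture in the category table for this variable.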

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 서울특별시 마포구 서강동
2nd row: 서울특별시 강남구 수서동
3rd row: 서울특별시 관악구 보라매동
4th row: 서울특별시 중랑구 망우본동
5th row: 서울특별시 강서구 가양1동
Value | Count | Frequency (%)
서울특별시 | 10000 | 33.3%
강남구 | 1339 | 4.5%
서초구 | 1017 | 3.4%
강서구 | 887 | 3.0%
송파구 | 658 | 2.2%
영등포구 | 601 | 2.0%
역삼1동 | 536 | 1.8%
마포구 | 427 | 1.4%
강동구 | 392 | 1.3%
노원구 | 366 | 1.2%
Other values (439) | 13777 | 45.9%

Most occurring characters

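Value | Count | Frequency (%)
(space) | 20000 | 14.4%
[character not preserved] | 12406 | 9.0%
[character not preserved] | 11259 | 8.1%
[character not preserved] | 10641 | 7.7%
[character not preserved] | 10078 | 7.3%
[character not preserved] | 10000 | 7.2%
[character not preserved] | 10000 | 7.2%
[character not preserved] | 10000 | 7.2%
1 | 3130 | 2.3%
[character not preserved] | 2878 | 2.1%
Other values (184) | 38158 | 27.5%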

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 111476 | 80.5%
Space Separator | 20000 | 14.4%
Decimal Number | 6902 | 5.0%
Other Punctuation | 172 | 0.1%

Most frequent character per category

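Other Letter
Value | Count | Frequency (%)
[character not preserved] | 12406 | 11.1%
[character not preserved] | 11259 | 10.1%
[character not preserved] | 10641 | 9.5%
[character not preserved] | 10078 | 9.0%
[character not preserved] | 10000 | 9.0%
[character not preserved] | 10000 | 9.0%
[character not preserved] | 10000 | 9.0%
[character not preserved] | 2878 | 2.6%
[character not preserved] | 1417 | 1.3%
[character not preserved] | 1291 | 1.2%
Other values (172) | 31506 | 28.3%

Decimal Number
Value | Count | Frequency (%)
1 | 3130 | 45.3%
2 | 2084 | 30.2%
3 | 755 | 10.9%
4 | 508 | 7.4%
5 | 142 | 2.1%
6 | 117 | 1.7%
7 | 110 | 1.6%
8 | 27 | 0.4%
0 | 16 | 0.2%
9 | 13 | 0.2%

Space Separator
Value | Count | Frequency (%)
(space) | 20000 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 172 | 100.0%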

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 111476 | 80.5%
Common | 27074 | 19.5%

Most frequent character per script

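Hangul
Value | Count | Frequency (%)
[character not preserved] | 12406 | 11.1%
[character not preserved] | 11259 | 10.1%
[character not preserved] | 10641 | 9.5%
[character not preserved] | 10078 | 9.0%
[character not preserved] | 10000 | 9.0%
[character not preserved] | 10000 | 9.0%
[character not preserved] | 10000 | 9.0%
[character not preserved] | 2878 | 2.6%
[character not preserved] | 1417 | 1.3%
[character not preserved] | 1291 | 1.2%
Other values (172) | 31506 | 28.3%

Common
Value | Count | Frequency (%)
(space) | 20000 | 73.9%
1 | 3130 | 11.6%
2 | 2084 | 7.7%
3 | 755 | 2.8%
4 | 508 | 1.9%
. | 172 | 0.6%
5 | 142 | 0.5%
6 | 117 | 0.4%
7 | 110 | 0.4%
8 | 27 | 0.1%
Other values (2) | 29 | 0.1%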

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 111476 | 80.5%
ASCII | 27074 | 19.5%

Most frequent character per block

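ASCII
Value | Count | Frequency (%)
(space) | 20000 | 73.9%
1 | 3130 | 11.6%
2 | 2084 | 7.7%
3 | 755 | 2.8%
4 | 508 | 1.9%
. | 172 | 0.6%
5 | 142 | 0.5%
6 | 117 | 0.4%
7 | 110 | 0.4%
8 | 27 | 0.1%
Other values (2) | 29 | 0.1%

Hangul
Value | Count | Frequency (%)
[character not preserved] | 12406 | 11.1%
[character not preserved] | 11259 | 10.1%
[character not preserved] | 10641 | 9.5%
[character not preserved] | 10078 | 9.0%
[character not preserved] | 10000 | 9.0%
[character not preserved] | 10000 | 9.0%
[character not preserved] | 10000 | 9.0%
[character not preserved] | 2878 | 2.6%
[character not preserved] | 1417 | 1.3%
[character not preserved] | 1291 | 1.2%
Other values (172) | 31506 | 28.3%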

차명
Text

Distinct: 109
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 22
Median length: 20
Mean length: 13.4965
Min length: 2

Characters and Unicode

Total characters: 134965
Distinct characters: 118
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 44
Unique (%): 0.4%

Sample

1st row: 도요타 프리우스(하이브리드)
2nd row: 그랜저 하이브리드 (GRANDEUR
3rd row: K5 하이브리드
4th row: 쏘나타 (SONATA) 하이브리드
5th row: 그랜저(GRANDEUR) 하이브리드
Value | Count | Frequency (%)
하이브리드 | 5090 | 22.2%
렉서스 | 1712 | 7.5%
쏘나타 | 1710 | 7.5%
k5 | 1444 | 6.3%
토요타 | 1340 | 5.8%
es300h | 1140 | 5.0%
hyb | 1109 | 4.8%
sonata | 1036 | 4.5%
prius | 931 | 4.1%
그랜저(grandeur | 843 | 3.7%
Other values (106) | 6559 | 28.6%

Most occurring characters

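Value | Count | Frequency (%)
(space) | 13295 | 9.9%
A | 7620 | 5.6%
[character not preserved] | 6559 | 4.9%
[character not preserved] | 6446 | 4.8%
[character not preserved] | 6428 | 4.8%
[character not preserved] | 6399 | 4.7%
[character not preserved] | 6394 | 4.7%
S | 4925 | 3.6%
N | 4471 | 3.3%
( | 4342 | 3.2%
Other values (108) | 68086 | 50.4%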

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 56090 | 41.6%
Uppercase Letter | 45869 | 34.0%
Space Separator | 13295 | 9.9%
Decimal Number | 7305 | 5.4%
Lowercase Letter | 4967 | 3.7%
Open Punctuation | 4342 | 3.2%
Close Punctuation | 2845 | 2.1%
Other Punctuation | 224 | 0.2%
Dash Punctuation | 28 | < 0.1%

Most frequent character per category

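Other Letter
Value | Count | Frequency (%)
[character not preserved] | 6559 | 11.7%
[character not preserved] | 6446 | 11.5%
[character not preserved] | 6428 | 11.5%
[character not preserved] | 6399 | 11.4%
[character not preserved] | 6394 | 11.4%
[character not preserved] | 3904 | 7.0%
[character not preserved] | 2642 | 4.7%
[character not preserved] | 2573 | 4.6%
[character not preserved] | 1773 | 3.2%
[character not preserved] | 1742 | 3.1%
Other values (51) | 11230 | 20.0%

Uppercase Letter
Value | Count | Frequency (%)
A | 7620 | 16.6%
S | 4925 | 10.7%
N | 4471 | 9.7%
R | 4018 | 8.8%
T | 3319 | 7.2%
E | 3091 | 6.7%
O | 2597 | 5.7%
U | 2176 | 4.7%
K | 1737 | 3.8%
H | 1702 | 3.7%
Other values (15) | 10213 | 22.3%

Lowercase Letter
Value | Count | Frequency (%)
h | 1746 | 35.2%
r | 616 | 12.4%
y | 607 | 12.2%
i | 589 | 11.9%
b | 482 | 9.7%
d | 482 | 9.7%
a | 125 | 2.5%
m | 124 | 2.5%
n | 40 | 0.8%
c | 27 | 0.5%
Other values (8) | 129 | 2.6%

Decimal Number
Value | Count | Frequency (%)
0 | 3388 | 46.4%
5 | 1621 | 22.2%
3 | 1374 | 18.8%
2 | 282 | 3.9%
7 | 276 | 3.8%
4 | 252 | 3.4%
6 | 98 | 1.3%
8 | 14 | 0.2%

Other Punctuation
Value | Count | Frequency (%)
. | 223 | 99.6%
, | 1 | 0.4%

Space Separator
Value | Count | Frequency (%)
(space) | 13295 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 4342 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 2845 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 28 | 100.0%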

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 56090 | 41.6%
Latin | 50836 | 37.7%
Common | 28039 | 20.8%

Most frequent character per script

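Hangul
Value | Count | Frequency (%)
[character not preserved] | 6559 | 11.7%
[character not preserved] | 6446 | 11.5%
[character not preserved] | 6428 | 11.5%
[character not preserved] | 6399 | 11.4%
[character not preserved] | 6394 | 11.4%
[character not preserved] | 3904 | 7.0%
[character not preserved] | 2642 | 4.7%
[character not preserved] | 2573 | 4.6%
[character not preserved] | 1773 | 3.2%
[character not preserved] | 1742 | 3.1%
Other values (51) | 11230 | 20.0%

Latin
Value | Count | Frequency (%)
A | 7620 | 15.0%
S | 4925 | 9.7%
N | 4471 | 8.8%
R | 4018 | 7.9%
T | 3319 | 6.5%
E | 3091 | 6.1%
O | 2597 | 5.1%
U | 2176 | 4.3%
h | 1746 | 3.4%
K | 1737 | 3.4%
Other values (33) | 15136 | 29.8%

Common
Value | Count | Frequency (%)
(space) | 13295 | 47.4%
( | 4342 | 15.5%
0 | 3388 | 12.1%
) | 2845 | 10.1%
5 | 1621 | 5.8%
3 | 1374 | 4.9%
2 | 282 | 1.0%
7 | 276 | 1.0%
4 | 252 | 0.9%
. | 223 | 0.8%
Other values (4) | 141 | 0.5%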

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 78875 | 58.4%
Hangul | 56090 | 41.6%

Most frequent character per block

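ASCII
Value | Count | Frequency (%)
(space) | 13295 | 16.9%
A | 7620 | 9.7%
S | 4925 | 6.2%
N | 4471 | 5.7%
( | 4342 | 5.5%
R | 4018 | 5.1%
0 | 3388 | 4.3%
T | 3319 | 4.2%
E | 3091 | 3.9%
) | 2845 | 3.6%
Other values (47) | 27561 | 34.9%

Hangul
Value | Count | Frequency (%)
[character not preserved] | 6559 | 11.7%
[character not preserved] | 6446 | 11.5%
[character not preserved] | 6428 | 11.5%
[character not preserved] | 6399 | 11.4%
[character not preserved] | 6394 | 11.4%
[character not preserved] | 3904 | 7.0%
[character not preserved] | 2642 | 4.7%
[character not preserved] | 2573 | 4.6%
[character not preserved] | 1773 | 3.2%
[character not preserved] | 1742 | 3.1%
Other values (51) | 11230 | 20.0%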

연료
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Value | Count
하이브리드(휘발유+전기) | 9025
하이브리드(LPG+전기) | 644
전기 | 309
하이브리드(CNG+전기) | 14
하이브리드(경유+전기) | 8

Length

Max length: 13
Median length: 13
Mean length: 12.6593
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 하이브리드(휘발유+전기)
2nd row: 하이브리드(휘발유+전기)
3rd row: 하이브리드(휘발유+전기)
4th row: 하이브리드(휘발유+전기)
5th row: 하이브리드(휘발유+전기)

Common Values

Value | Count | Frequency (%)
하이브리드(휘발유+전기) | 9025 | 90.2%
하이브리드(LPG+전기) | 644 | 6.4%
전기 | 309 | 3.1%
하이브리드(CNG+전기) | 14 | 0.1%
하이브리드(경유+전기) | 8 | 0.1%
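
The Imbalance alert reports 75.7% for this variable. That figure is consistent with one minus the normalized Shannon entropy of the class frequencies. The sketch below assumes that definition, which the report itself does not state, and uses the counts from the Common Values table; the function name `imbalance_score` is illustrative.

```python
import math

# Class counts copied from the Common Values table for 연료.
counts = [9025, 644, 309, 14, 8]

def imbalance_score(class_counts):
    """Return 1 - H / H_max for a list of class counts.

    H is the Shannon entropy of the class distribution and H_max is the
    log of the number of classes. Assumed definition: 0 means perfectly
    balanced, and values near 1 mean a single class dominates.
    """
    n_classes = len(class_counts)
    if n_classes < 2:
        return 0.0  # convention chosen here: no imbalance with fewer than two classes
    total = sum(class_counts)
    probs = [c / total for c in class_counts if c > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return 1 - entropy / math.log(n_classes)

score = imbalance_score(counts)
print(f"{score:.1%}")
```

With the five counts above the score comes out at about 0.757, which matches the alert. A perfectly balanced variable such as `[5, 5]` scores 0 under the same definition.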

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
하이브리드(휘발유+전기 | 9025 | 90.2%
하이브리드(lpg+전기 | 644 | 6.4%
전기 | 309 | 3.1%
하이브리드(cng+전기 | 14 | 0.1%
하이브리드(경유+전기 | 8 | 0.1%

최초등록일
Date

Distinct: 1709
Distinct (%): 17.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2004-12-13 00:00:00
Maximum: 2015-12-31 00:00:00
Histogram with fixed size bins (bins=50)

현소유자의출생년도
Real number (ℝ)

MISSING 

Distinct: 72
Distinct (%): 1.0%
Missing: 2861
Missing (%): 28.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 1969.9831
Minimum: 1922
Maximum: 2011
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1922
5-th percentile: 1949
Q1: 1962
median: 1971
Q3: 1979
95-th percentile: 1985
Maximum: 2011
Range: 89
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 11.376901
Coefficient of variation (CV): 0.0057751261
Kurtosis: -0.1092293
Mean: 1969.9831
Median Absolute Deviation (MAD): 9
Skewness: -0.50510652
Sum: 14063709
Variance: 129.43387
Monotonicity: Not monotonic
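
Several of these figures are related by simple identities, which makes them easy to cross-check. The sketch below recomputes the derived values from the reported ones, assuming the sum and mean are taken over non-missing values only. Small differences are expected because the inputs are already rounded, and the variable names are illustrative.

```python
# Values copied from the statistics reported for 현소유자의출생년도.
n_observations = 10_000
n_missing = 2_861
mean = 1969.9831
std = 11.376901
total = 14_063_709
minimum, maximum = 1922, 2011
q1, q3 = 1962, 1979

n_valid = n_observations - n_missing   # number of non-missing values
implied_count = round(total / mean)    # Sum / Mean should recover that count
cv = std / mean                        # coefficient of variation
variance = std ** 2                    # variance is the squared standard deviation
value_range = maximum - minimum
iqr = q3 - q1

print(n_valid, implied_count)
print(f"CV={cv:.7f} variance={variance:.3f} range={value_range} IQR={iqr}")
```

Under that assumption the implied count equals the number of non-missing observations, and the CV, variance, range and IQR agree with the reported values up to rounding.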
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1982 | 275 | 2.8%
1983 | 265 | 2.6%
1981 | 258 | 2.6%
1979 | 256 | 2.6%
1980 | 254 | 2.5%
1974 | 240 | 2.4%
1973 | 235 | 2.4%
1977 | 231 | 2.3%
1978 | 230 | 2.3%
1971 | 221 | 2.2%
Other values (62) | 4674 | 46.7%
(Missing) | 2861 | 28.6%

Value | Count | Frequency (%)
1922 | 1 | < 0.1%
1926 | 1 | < 0.1%
1927 | 1 | < 0.1%
1929 | 2 | < 0.1%
1931 | 1 | < 0.1%
1932 | 2 | < 0.1%
1933 | 2 | < 0.1%
1934 | 3 | < 0.1%
1935 | 6 | 0.1%
1936 | 4 | < 0.1%

Value | Count | Frequency (%)
2011 | 7 | 0.1%
2009 | 1 | < 0.1%
2001 | 1 | < 0.1%
2000 | 1 | < 0.1%
1994 | 1 | < 0.1%
1993 | 4 | < 0.1%
1992 | 5 | 0.1%
1991 | 7 | 0.1%
1990 | 14 | 0.1%
1989 | 28 | 0.3%

Interactions


Correlations

Variable | 연료 | 현소유자의출생년도
연료 | 1.000 | 0.027
현소유자의출생년도 | 0.027 | 1.000

Variable | 현소유자의출생년도 | 연료
현소유자의출생년도 | 1.000 | 0.017
연료 | 0.017 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index) | 기준연월 | 사용본거지시읍면동_행정동기준 | 차명 | 연료 | 최초등록일 | 현소유자의출생년도
19233 | 2015-12 | 서울특별시 마포구 서강동 | 도요타 프리우스(하이브리드) | 하이브리드(휘발유+전기) | 2006-09-08 | 1960
3615 | 2015-12 | 서울특별시 강남구 수서동 | 그랜저 하이브리드 (GRANDEUR | 하이브리드(휘발유+전기) | 2015-09-01 | 1947
32478 | 2015-12 | 서울특별시 관악구 보라매동 | K5 하이브리드 | 하이브리드(휘발유+전기) | 2012-06-08 | 1978
34828 | 2015-12 | 서울특별시 중랑구 망우본동 | 쏘나타 (SONATA) 하이브리드 | 하이브리드(휘발유+전기) | 2012-04-25 | 1968
17480 | 2015-12 | 서울특별시 강서구 가양1동 | 그랜저(GRANDEUR) 하이브리드 | 하이브리드(휘발유+전기) | 2014-08-07 | <NA>
32489 | 2015-12 | 서울특별시 관악구 성현동 | 토요타 CAMRY Hybrid | 하이브리드(휘발유+전기) | 2009-12-14 | 1961
28097 | 2015-12 | 서울특별시 중구 필동 | 쏘나타 (SONATA) 하이브리드 | 하이브리드(휘발유+전기) | 2011-09-08 | 1976
24646 | 2015-12 | 서울특별시 강남구 역삼1동 | 렉서스 NX300h | 하이브리드(휘발유+전기) | 2015-07-16 | <NA>
30663 | 2015-12 | 서울특별시 서초구 방배4동 | 쏘나타 (SONATA) 하이브리드 | 하이브리드(휘발유+전기) | 2012-07-24 | 1969
23321 | 2015-12 | 서울특별시 은평구 진관동 | 아반떼 하이브리드(AVANTE HYB | 하이브리드(LPG+전기) | 2009-07-31 | 1972

(index) | 기준연월 | 사용본거지시읍면동_행정동기준 | 차명 | 연료 | 최초등록일 | 현소유자의출생년도
28818 | 2015-12 | 서울특별시 강남구 역삼1동 | 렉서스 ES300h | 하이브리드(휘발유+전기) | 2015-01-30 | <NA>
33167 | 2015-12 | 서울특별시 영등포구 양평1동 | 토요타 PRIUS | 하이브리드(휘발유+전기) | 2014-09-11 | 1971
25186 | 2015-12 | 서울특별시 강서구 가양1동 | 쏘나타(SONATA) 하이브리드 | 하이브리드(휘발유+전기) | 2013-06-21 | <NA>
19921 | 2015-12 | 서울특별시 금천구 가산동 | 쏘나타(SONATA) 하이브리드 | 하이브리드(휘발유+전기) | 2014-05-08 | <NA>
20611 | 2015-12 | 서울특별시 마포구 도화동 | 쏘나타 (SONATA) 하이브리드 | 하이브리드(휘발유+전기) | 2012-11-09 | 1981
2424 | 2015-12 | 서울특별시 양천구 목1동 | 레이 전기차 | 전기 | 2013-02-14 | <NA>
28699 | 2015-12 | 서울특별시 동작구 사당1동 | 렉서스 ES300h | 하이브리드(휘발유+전기) | 2014-09-24 | 1981
8552 | 2015-12 | 서울특별시 영등포구 여의동 | 쏘나타 (SONATA) 하이브리드 | 하이브리드(휘발유+전기) | 2013-05-09 | <NA>
17383 | 2015-12 | 서울특별시 서초구 방배3동 | 렉서스 CT200h | 하이브리드(휘발유+전기) | 2011-07-29 | 1981
13558 | 2015-12 | 서울특별시 강남구 대치2동 | 그랜저 하이브리드 (GRANDEUR | 하이브리드(휘발유+전기) | 2015-12-04 | 1943

Duplicate rows

Most frequently occurring

(index) | 기준연월 | 사용본거지시읍면동_행정동기준 | 차명 | 연료 | 최초등록일 | 현소유자의출생년도 | # duplicates
213 | 2015-12 | 서울특별시 서초구 양재2동 | K5 하이브리드 | 하이브리드(휘발유+전기) | 2011-11-04 | <NA> | 58
212 | 2015-12 | 서울특별시 서초구 양재2동 | K5 하이브리드 | 하이브리드(휘발유+전기) | 2011-10-19 | <NA> | 38
203 | 2015-12 | 서울특별시 서초구 양재1동 | 쏘나타 하이브리드(SONATA HYB | 하이브리드(휘발유+전기) | 2014-12-24 | <NA> | 35
200 | 2015-12 | 서울특별시 서초구 양재1동 | 쏘나타 하이브리드(SONATA HYB | 하이브리드(휘발유+전기) | 2014-12-19 | <NA> | 30
100 | 2015-12 | 서울특별시 강서구 가양1동 | 그랜저(GRANDEUR) 하이브리드 | 하이브리드(휘발유+전기) | 2014-01-24 | <NA> | 23
204 | 2015-12 | 서울특별시 서초구 양재1동 | 쏘나타 하이브리드(SONATA HYB | 하이브리드(휘발유+전기) | 2014-12-26 | <NA> | 21
214 | 2015-12 | 서울특별시 서초구 양재2동 | K5 하이브리드 | 하이브리드(휘발유+전기) | 2015-11-27 | <NA> | 18
220 | 2015-12 | 서울특별시 서초구 양재2동 | 쏘나타 (SONATA) 하이브리드 | 하이브리드(휘발유+전기) | 2011-09-05 | <NA> | 18
201 | 2015-12 | 서울특별시 서초구 양재1동 | 쏘나타 하이브리드(SONATA HYB | 하이브리드(휘발유+전기) | 2014-12-22 | <NA> | 16
202 | 2015-12 | 서울특별시 서초구 양재1동 | 쏘나타 하이브리드(SONATA HYB | 하이브리드(휘발유+전기) | 2014-12-23 | <NA> | 16
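
The "# duplicates" column appears to give the number of times each identical row occurs. A minimal sketch of that kind of grouping with the standard library is shown below. It uses a handful of illustrative rows rather than the real dataset, and the tuple layout and the helper name `duplicate_groups` are assumptions made for this example. `None` stands in for `<NA>`, so rows whose missing values line up are treated as identical.

```python
from collections import Counter

# Illustrative rows only: (행정동, 차명, 최초등록일, 현소유자의출생년도).
rows = [
    ("서울특별시 서초구 양재2동", "K5 하이브리드", "2011-11-04", None),
    ("서울특별시 서초구 양재2동", "K5 하이브리드", "2011-11-04", None),
    ("서울특별시 서초구 양재2동", "K5 하이브리드", "2011-10-19", None),
    ("서울특별시 마포구 서강동", "도요타 프리우스(하이브리드)", "2006-09-08", 1960),
    ("서울특별시 서초구 양재2동", "K5 하이브리드", "2011-11-04", None),
]

def duplicate_groups(records):
    """Return (row, count) pairs for rows that occur more than once,
    ordered from most to least frequent."""
    counts = Counter(records)
    return [(row, n) for row, n in counts.most_common() if n > 1]

groups = duplicate_groups(rows)
for row, n in groups:
    print(n, row)
```

In this toy input only one row repeats, so the result is a single group with a count of 3.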