Overview

Dataset statistics

Number of variables: 17
Number of observations: 5170
Missing cells: 9803
Missing cells (%): 11.2%
Duplicate rows: 18
Duplicate rows (%): 0.3%
Total size in memory: 686.8 KiB
Average record size in memory: 136.0 B
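
The derived figures above can be recomputed from the raw counts. The following is a minimal sketch, assuming the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Raw counts copied from the dataset statistics above.
n_variables = 17
n_observations = 5170
missing_cells = 9803
duplicate_rows = 18
total_size_kib = 686.8

# Share of missing cells over all cells (variables x observations).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Share of duplicate rows over all observations.
duplicate_pct = 100 * duplicate_rows / n_observations

# Average record size in bytes, assuming 1 KiB = 1024 B.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{missing_pct:.1f}% missing, {duplicate_pct:.1f}% duplicates, "
      f"{avg_record_bytes:.1f} B per record")
```

Under these assumptions the rounded results agree with the reported 11.2%, 0.3% and 136.0 B.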

Variable types

Categorical: 2
Text: 2
Unsupported: 13

Dataset

Description: File download (파일 다운로드)
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-21236/F/1/datasetView.do

Alerts

Dataset has 18 (0.3%) duplicate rows [Duplicates]
읍면동별 연료별 자동차 등록현황 (행정동) is highly imbalanced (87.7%) [Imbalance]
Unnamed: 1 has 5168 (> 99.9%) missing values [Missing]
Unnamed: 2 has 4433 (85.7%) missing values [Missing]
Unnamed: 4 has 136 (2.6%) missing values [Missing]
Unnamed: 4 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 5 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 6 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 7 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 8 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 9 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 10 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 11 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 12 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 13 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 14 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 15 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 16 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
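
The missing-value alerts for Unnamed: 1 and Unnamed: 2 look consistent with a spreadsheet export in which label cells are filled only on the first row of each group (rows 7 to 9 of the Sample section show this pattern). One possible cleaning step is to forward-fill those label columns. The sketch below does this in plain Python on toy rows; the function name, the three-column layout and the choice of columns are assumptions made for illustration, not something the report prescribes.

```python
from typing import Dict, List, Optional, Sequence

Row = List[Optional[str]]


def forward_fill(rows: Sequence[Row], columns: Sequence[int]) -> List[Row]:
    """Copy the last non-missing label down into missing cells of the given columns."""
    last_seen: Dict[int, Optional[str]] = {}
    filled: List[Row] = []
    for row in rows:
        new_row = list(row)  # work on a copy so the input rows stay untouched
        for col in columns:
            if new_row[col] is None:
                new_row[col] = last_seen.get(col)
            else:
                last_seen[col] = new_row[col]
        filled.append(new_row)
    return filled


# Toy rows shaped like the data rows in the sample: district, dong, fuel type.
# None stands in for the <NA> cells.
rows = [
    ["서울특별시 종로구", "기타", "휘발유(무연)"],
    [None, None, "휘발유"],
    [None, None, "경유"],
]

cleaned = forward_fill(rows, columns=[0, 1])
for row in cleaned:
    print(row)
```

A real cleaning pass would probably also need to drop the report-header rows at the top of the file and reset the dong label whenever the district changes; neither is handled here.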

Reproduction

Analysis started: 2024-05-11 06:17:30.483208
Analysis finished: 2024-05-11 06:17:31.915769
Duration: 1.43 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

읍면동별 연료별 자동차 등록현황 (행정동)
Categorical

Distinct: 31
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 40.5 KiB

Value | Count
<NA> | 4861
서울특별시 송파구 | 18
서울특별시 강남구 | 16
서울특별시 관악구 | 15
서울특별시 강서구 | 14
Other values (26) | 246

Length

Max length: 11
Median length: 4
Mean length: 4.3025145
Min length: 2

Unique

Unique: 5
Unique (%): 0.1%

Sample

1st row: 자동차관리정보시스템
2nd row: <NA>
3rd row: PROG_ID :
4th row: <NA>
5th row: 기준일자 :

Common Values

Value | Count | Frequency (%)
<NA> | 4861 | 94.0%
서울특별시 송파구 | 18 | 0.3%
서울특별시 강남구 | 16 | 0.3%
서울특별시 관악구 | 15 | 0.3%
서울특별시 강서구 | 14 | 0.3%
서울특별시 노원구 | 14 | 0.3%
서울특별시 강동구 | 13 | 0.3%
서울특별시 양천구 | 13 | 0.3%
서울특별시 성북구 | 13 | 0.3%
서울특별시 영등포구 | 13 | 0.3%
Other values (21) | 180 | 3.5%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
na | 4861 | 88.8%
서울특별시 | 304 | 5.6%
송파구 | 18 | 0.3%
강남구 | 16 | 0.3%
관악구 | 15 | 0.3%
강서구 | 14 | 0.3%
노원구 | 14 | 0.3%
영등포구 | 13 | 0.2%
성북구 | 13 | 0.2%
양천구 | 13 | 0.2%
Other values (24) | 196 | 3.6%

Unnamed: 1
Text

MISSING 

Distinct: 2
Distinct (%): 100.0%
Missing: 5168
Missing (%): > 99.9%
Memory size: 40.5 KiB

Length

Max length: 9
Median length: 7.5
Mean length: 7.5
Min length: 6

Characters and Unicode

Total characters: 15
Distinct characters: 9
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: STA029Q24
2nd row: 202403

Value | Count | Frequency (%)
sta029q24 | 1 | 50.0%
202403 | 1 | 50.0%

Most occurring characters

Value | Count | Frequency (%)
2 | 4 | 26.7%
0 | 3 | 20.0%
4 | 2 | 13.3%
S | 1 | 6.7%
T | 1 | 6.7%
A | 1 | 6.7%
9 | 1 | 6.7%
Q | 1 | 6.7%
3 | 1 | 6.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 11 | 73.3%
Uppercase Letter | 4 | 26.7%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 4 | 36.4%
0 | 3 | 27.3%
4 | 2 | 18.2%
9 | 1 | 9.1%
3 | 1 | 9.1%
Uppercase Letter
Value | Count | Frequency (%)
S | 1 | 25.0%
T | 1 | 25.0%
A | 1 | 25.0%
Q | 1 | 25.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 11 | 73.3%
Latin | 4 | 26.7%

Most frequent character per script

Common
Value | Count | Frequency (%)
2 | 4 | 36.4%
0 | 3 | 27.3%
4 | 2 | 18.2%
9 | 1 | 9.1%
3 | 1 | 9.1%
Latin
Value | Count | Frequency (%)
S | 1 | 25.0%
T | 1 | 25.0%
A | 1 | 25.0%
Q | 1 | 25.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 15 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
2 | 4 | 26.7%
0 | 3 | 20.0%
4 | 2 | 13.3%
S | 1 | 6.7%
T | 1 | 6.7%
A | 1 | 6.7%
9 | 1 | 6.7%
Q | 1 | 6.7%
3 | 1 | 6.7%
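
The character statistics for Unnamed: 1 can be reproduced from its two non-missing values. The sketch below uses Python's standard `unicodedata` module to count characters and their Unicode general categories (`Nd` is Decimal Number, `Lu` is Uppercase Letter). It is an independent illustration and makes no claim about how the profiling tool computes these figures internally.

```python
import unicodedata
from collections import Counter

# The two non-missing values of "Unnamed: 1" shown in the sample above.
values = ["STA029Q24", "202403"]

# Concatenate all characters and count each distinct one.
chars = "".join(values)
char_counts = Counter(chars)

# Count characters per Unicode general category.
category_counts = Counter(unicodedata.category(ch) for ch in chars)

print("total characters:", len(chars))
print("distinct characters:", len(char_counts))
print("most common:", char_counts.most_common(3))
print("categories:", dict(category_counts))
```

The counts agree with the tables above: 15 characters in total, 9 distinct, 11 decimal digits and 4 uppercase letters.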

Unnamed: 2
Text

MISSING 

Distinct: 454
Distinct (%): 61.6%
Missing: 4433
Missing (%): 85.7%
Memory size: 40.5 KiB

Length

Max length: 16
Median length: 14
Mean length: 8.3188602
Min length: 2

Characters and Unicode

Total characters: 6131
Distinct characters: 200
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 230
Unique (%): 31.2%

Sample

1st row: 읍면동 (행정동)
2nd row: 기타
3rd row: 종로구 청운효자동
4th row: 종로구 사직동
5th row: 종로구 사직동

Value | Count | Frequency (%)
기타 | 59 | 4.2%
송파구 | 40 | 2.8%
강남구 | 33 | 2.3%
강동구 | 33 | 2.3%
중구 | 33 | 2.3%
강서구 | 32 | 2.3%
노원구 | 32 | 2.3%
관악구 | 32 | 2.3%
성북구 | 29 | 2.0%
영등포구 | 29 | 2.0%
Other values (475) | 1064 | 75.1%

Most occurring characters

ValueCountFrequency (%)
(space) 1356
22.1%
784
 
12.8%
715
 
11.7%
1 166
 
2.7%
2 142
 
2.3%
124
 
2.0%
96
 
1.6%
78
 
1.3%
78
 
1.3%
73
 
1.2%
Other values (190) 2519
41.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4287 | 69.9%
Space Separator | 1356 | 22.1%
Decimal Number | 468 | 7.6%
Other Punctuation | 18 | 0.3%
Close Punctuation | 1 | < 0.1%
Open Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
784
 
18.3%
715
 
16.7%
124
 
2.9%
96
 
2.2%
78
 
1.8%
78
 
1.8%
73
 
1.7%
70
 
1.6%
63
 
1.5%
62
 
1.4%
Other values (176) 2144
50.0%
Decimal Number
Value | Count | Frequency (%)
1 | 166 | 35.5%
2 | 142 | 30.3%
3 | 71 | 15.2%
4 | 40 | 8.5%
5 | 19 | 4.1%
6 | 13 | 2.8%
7 | 9 | 1.9%
8 | 4 | 0.9%
9 | 2 | 0.4%
0 | 2 | 0.4%
Space Separator
Value | Count | Frequency (%)
(space) | 1356 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
. | 18 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 1 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4287 | 69.9%
Common | 1844 | 30.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
784
 
18.3%
715
 
16.7%
124
 
2.9%
96
 
2.2%
78
 
1.8%
78
 
1.8%
73
 
1.7%
70
 
1.6%
63
 
1.5%
62
 
1.4%
Other values (176) 2144
50.0%
Common
Value | Count | Frequency (%)
(space) | 1356 | 73.5%
1 | 166 | 9.0%
2 | 142 | 7.7%
3 | 71 | 3.9%
4 | 40 | 2.2%
5 | 19 | 1.0%
. | 18 | 1.0%
6 | 13 | 0.7%
7 | 9 | 0.5%
8 | 4 | 0.2%
Other values (4) | 6 | 0.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4287 | 69.9%
ASCII | 1844 | 30.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 1356 | 73.5%
1 | 166 | 9.0%
2 | 142 | 7.7%
3 | 71 | 3.9%
4 | 40 | 2.2%
5 | 19 | 1.0%
. | 18 | 1.0%
6 | 13 | 0.7%
7 | 9 | 0.5%
8 | 4 | 0.2%
Other values (4) | 6 | 0.3%
Hangul
ValueCountFrequency (%)
784
 
18.3%
715
 
16.7%
124
 
2.9%
96
 
2.2%
78
 
1.8%
78
 
1.8%
73
 
1.7%
70
 
1.6%
63
 
1.5%
62
 
1.4%
Other values (176) 2144
50.0%

Unnamed: 3
Categorical

Distinct: 16
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 40.5 KiB

Value | Count
경유 | 535
휘발유 | 505
휘발유(무연) | 499
엘피지 | 461
하이브리드(휘발유+전기) | 435
Other values (11) | 2735

Length

Max length: 13
Median length: 12
Mean length: 5.7882012
Min length: 2

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
경유 | 535 | 10.3%
휘발유 | 505 | 9.8%
휘발유(무연) | 499 | 9.7%
엘피지 | 461 | 8.9%
하이브리드(휘발유+전기) | 435 | 8.4%
기타연료 | 435 | 8.4%
전기 | 429 | 8.3%
하이브리드(경유+전기) | 418 | 8.1%
수소 | 415 | 8.0%
하이브리드(LPG+전기) | 371 | 7.2%
Other values (6) | 667 | 12.9%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
경유 | 535 | 10.3%
휘발유 | 505 | 9.8%
휘발유(무연 | 499 | 9.7%
엘피지 | 461 | 8.9%
하이브리드(휘발유+전기 | 435 | 8.4%
기타연료 | 435 | 8.4%
전기 | 429 | 8.3%
하이브리드(경유+전기 | 418 | 8.1%
수소 | 415 | 8.0%
하이브리드(lpg+전기 | 371 | 7.2%
Other values (6) | 667 | 12.9%
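
The word-level tables in this report show tokens such as `na`, `휘발유(무연` and `하이브리드(lpg+전기`. That pattern suggests the values are lowercased, split on whitespace, and stripped of punctuation at both ends of each token. This is an inference from the output, not a confirmed description of the tool's implementation. A minimal sketch of such a tokenizer, with a hypothetical function name and toy input:

```python
import string
from collections import Counter
from typing import Iterable


def word_counts(values: Iterable[str]) -> Counter:
    """Lowercase, split on whitespace, and strip punctuation from both ends of each token."""
    counts: Counter = Counter()
    for value in values:
        for token in value.lower().split():
            token = token.strip(string.punctuation)
            if token:  # skip tokens that were pure punctuation
                counts[token] += 1
    return counts


# Toy values taken from the tables in this report; the repeat counts are illustrative only.
values = [
    "<NA>",
    "서울특별시 송파구",
    "서울특별시 강남구",
    "휘발유(무연)",
    "하이브리드(LPG+전기)",
]

counts = word_counts(values)
for token, n in counts.most_common():
    print(token, n)
```

Because only the ends of each token are stripped, the opening parenthesis inside `휘발유(무연)` survives while the closing one is removed, which matches the truncated labels in the table.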

Unnamed: 4
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 136
Missing (%): 2.6%
Memory size: 40.5 KiB

Unnamed: 5
Unsupported

REJECTED  UNSUPPORTED 

Missing: 5
Missing (%): 0.1%
Memory size: 40.5 KiB

Unnamed: 6
Unsupported

REJECTED  UNSUPPORTED 

Missing: 6
Missing (%): 0.1%
Memory size: 40.5 KiB

Unnamed: 7
Unsupported

REJECTED  UNSUPPORTED 

Missing: 6
Missing (%): 0.1%
Memory size: 40.5 KiB

Unnamed: 8
Unsupported

REJECTED  UNSUPPORTED 

Missing: 6
Missing (%): 0.1%
Memory size: 40.5 KiB

Unnamed: 9
Unsupported

REJECTED  UNSUPPORTED 

Missing: 5
Missing (%): 0.1%
Memory size: 40.5 KiB

Unnamed: 10
Unsupported

REJECTED  UNSUPPORTED 

Missing: 6
Missing (%): 0.1%
Memory size: 40.5 KiB

Unnamed: 11
Unsupported

REJECTED  UNSUPPORTED 

Missing: 6
Missing (%): 0.1%
Memory size: 40.5 KiB

Unnamed: 12
Unsupported

REJECTED  UNSUPPORTED 

Missing: 6
Missing (%): 0.1%
Memory size: 40.5 KiB

Unnamed: 13
Unsupported

REJECTED  UNSUPPORTED 

Missing: 3
Missing (%): 0.1%
Memory size: 40.5 KiB

Unnamed: 14
Unsupported

REJECTED  UNSUPPORTED 

Missing: 6
Missing (%): 0.1%
Memory size: 40.5 KiB

Unnamed: 15
Unsupported

REJECTED  UNSUPPORTED 

Missing: 5
Missing (%): 0.1%
Memory size: 40.5 KiB

Unnamed: 16
Unsupported

REJECTED  UNSUPPORTED 

Missing: 6
Missing (%): 0.1%
Memory size: 40.5 KiB

Correlations

Variable | 읍면동별 연료별 자동차 등록현황 (행정동) | Unnamed: 1 | Unnamed: 3
읍면동별 연료별 자동차 등록현황 (행정동) | 1.000 | 0.000 | 0.684
Unnamed: 1 | 0.000 | 1.000 | NaN
Unnamed: 3 | 0.684 | NaN | 1.000

Variable | Unnamed: 3 | 읍면동별 연료별 자동차 등록현황 (행정동)
Unnamed: 3 | 1.000 | 0.248
읍면동별 연료별 자동차 등록현황 (행정동) | 0.248 | 1.000

Variable | 읍면동별 연료별 자동차 등록현황 (행정동) | Unnamed: 3
읍면동별 연료별 자동차 등록현황 (행정동) | 1.000 | 0.248
Unnamed: 3 | 0.248 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
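
To make the idea of nullity correlation concrete, the sketch below computes a Pearson correlation between the "is present" indicators of two columns. It is a hand-rolled illustration on hypothetical toy columns; the function name is invented here, and the heatmap in the report may be computed differently.

```python
from math import sqrt
from typing import Optional, Sequence


def nullity_correlation(col_a: Sequence[Optional[str]],
                        col_b: Sequence[Optional[str]]) -> float:
    """Pearson correlation between the presence indicators (1 = present, 0 = missing)."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is always (or never) present has no defined correlation.
        return float("nan")
    return cov / sqrt(var_a * var_b)


# Hypothetical toy columns; None marks a missing cell.
district = ["종로구", None, None, "중구", None, None]
dong = ["기타", None, None, "소공동", None, "회현동"]
fuel = ["경유", "휘발유", "전기", "경유", "수소", "전기"]

print(nullity_correlation(district, dong))  # positive: the two tend to be present together
print(nullity_correlation(district, fuel))  # nan: fuel is never missing
```

A value near 1 means the two columns tend to be present or missing on the same rows, and a value near -1 means one tends to be present when the other is missing. Columns that are fully present give no defined value.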

Sample

읍면동별 연료별 자동차 등록현황 (행정동)Unnamed: 1Unnamed: 2Unnamed: 3Unnamed: 4Unnamed: 5Unnamed: 6Unnamed: 7Unnamed: 8Unnamed: 9Unnamed: 10Unnamed: 11Unnamed: 12Unnamed: 13Unnamed: 14Unnamed: 15Unnamed: 16
0자동차관리정보시스템<NA><NA><NA>NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
1<NA><NA><NA><NA>NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
2PROG_ID :STA029Q24<NA><NA>NaNNaNNaNNaNNaNNaNNaNNaNNaNPage No.:NaN1NaN
3<NA><NA><NA><NA>NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
4기준일자 :202403<NA><NA>NaNNaNNaNNaNNaNNaNNaNNaNNaN출력일시 :NaNNaNNaN
5사용본거지 시군구<NA>읍면동 (행정동)연료관용NaNNaNNaN자가용NaNNaNNaN영업용NaNNaNNaN
6<NA><NA><NA><NA>NaN승용승합화물특수승용승합화물특수승용승합화물특수
7서울특별시 종로구<NA>기타휘발유(무연)1000010000000
8<NA><NA><NA>휘발유2000020000000
9<NA><NA><NA>경유1000001000000
읍면동별 연료별 자동차 등록현황 (행정동)Unnamed: 1Unnamed: 2Unnamed: 3Unnamed: 4Unnamed: 5Unnamed: 6Unnamed: 7Unnamed: 8Unnamed: 9Unnamed: 10Unnamed: 11Unnamed: 12Unnamed: 13Unnamed: 14Unnamed: 15Unnamed: 16
5160<NA><NA><NA>하이브리드(휘발유+전기)59600005940002000
5161<NA><NA><NA>하이브리드(경유+전기)9000090000000
5162<NA><NA><NA>하이브리드(LPG+전기)2000020000000
5163<NA><NA><NA>수소7000070000000
5164<NA><NA><NA>기타연료160000021220000
5165<NA><NA>기타휘발유240000221100000
5166<NA><NA><NA>경유180000041400000
5167<NA><NA><NA>휘발유(무연)4000030100000
5168서울특별시 강동구<NA><NA><NA>15452210741147813011033361202621249767982539222
5169서울<NA><NA><NA>3187776470137953900426265338671231246878578811729914861602715240

Duplicate rows

Most frequently occurring

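Index | 읍면동별 연료별 자동차 등록현황 (행정동) | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | # duplicates
5 | <NA> | <NA> | <NA> | 경유 | 486
15 | <NA> | <NA> | <NA> | 휘발유(무연) | 472
8 | <NA> | <NA> | <NA> | 엘피지 | 433
6 | <NA> | <NA> | <NA> | 기타연료 | 414
13 | <NA> | <NA> | <NA> | 하이브리드(휘발유+전기) | 412
9 | <NA> | <NA> | <NA> | 전기 | 408
7 | <NA> | <NA> | <NA> | 수소 | 399
12 | <NA> | <NA> | <NA> | 하이브리드(경유+전기) | 388
11 | <NA> | <NA> | <NA> | 하이브리드(LPG+전기) | 358
16 | <NA> | <NA> | <NA> | 휘발유(유연) | 346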
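
A minimal sketch of how identical rows could be counted, using a `Counter` over row tuples. The toy rows and the three-column layout are hypothetical, and the counting rule (occurrences per identical row, keeping only rows seen more than once) is an assumption about what "# duplicates" means here.

```python
from collections import Counter

# Hypothetical toy rows (district, dong, fuel type); None stands in for <NA>.
rows = [
    (None, None, "경유"),
    (None, None, "경유"),
    (None, None, "전기"),
    ("서울특별시 종로구", "기타", "휘발유"),
    (None, None, "경유"),
    (None, None, "전기"),
]

# Count identical rows and keep only those that occur more than once,
# ordered from most to least frequent.
row_counts = Counter(rows)
duplicates = [(row, n) for row, n in row_counts.most_common() if n > 1]

for row, n in duplicates:
    print(row, n)
```

Rows that carry only a fuel label, with the district and dong cells missing, collapse into a few large groups, which is the same shape as the table above.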