Overview

Dataset statistics

Number of variables: 17
Number of observations: 5224
Missing cells: 9938
Missing cells (%): 11.2%
Duplicate rows: 19
Duplicate rows (%): 0.4%
Total size in memory: 693.9 KiB
Average record size in memory: 136.0 B
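
The derived figures in this table can be checked by hand from the raw counts. The sketch below assumes the usual definitions (missing cells divided by variables times observations, duplicate rows divided by observations, total memory divided by observations); the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 17
n_observations = 5224
missing_cells = 9938
duplicate_rows = 19
total_size_kib = 693.9

total_cells = n_variables * n_observations             # every cell in the table
missing_pct = 100 * missing_cells / total_cells        # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_observations  # share of repeated rows
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the three results agree with the reported 11.2%, 0.4% and 136.0 B.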

Variable types

Categorical: 2
Text: 2
Unsupported: 13

Dataset

Description: 파일 다운로드 [file download]
Author: 서울특별시 [Seoul Metropolitan Government]
URL: https://data.seoul.go.kr/dataList/OA-21236/F/1/datasetView.do

Alerts

Dataset has 19 (0.4%) duplicate rows [Duplicates]
읍면동별 연료별 자동차 등록현황 (행정동) is highly imbalanced (87.7%) [Imbalance]
Unnamed: 1 has 5222 (> 99.9%) missing values [Missing]
Unnamed: 2 has 4484 (85.8%) missing values [Missing]
Unnamed: 4 has 167 (3.2%) missing values [Missing]
Unnamed: 4 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 5 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 6 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 7 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 8 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 9 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 10 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 11 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 12 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 13 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 14 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 15 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 16 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
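
One plausible reading of these alerts, judging from the rows listed in the Sample section of this report, is that the file was loaded with its report banner and header rows still in place, so the count columns mix text and numbers and the district and dong labels appear only on the first row of each group. The sketch below shows one way such a layout might be cleaned with pandas. The toy frame, the header position and the English column names are all assumptions made for illustration, not the actual file.

```python
import pandas as pd

# Hypothetical miniature of the raw sheet: banner rows, a header row, then
# data rows where district and dong are written only once per group.
raw = pd.DataFrame(
    [
        ["자동차관리정보시스템", None, None, None],
        ["PROG_ID :", None, None, None],
        ["사용본거지 시군구", "읍면동 (행정동)", "연료", "관용"],
        ["서울특별시 종로구", "기타", "휘발유(무연)", "1"],
        [None, None, "휘발유", "2"],
        [None, None, "경유", "1"],
    ]
)

HEADER_ROW = 2  # assumption: index of the header row in this toy frame

# Drop the banner and header rows and give the columns simple names.
clean = raw.iloc[HEADER_ROW + 1 :].reset_index(drop=True)
clean.columns = ["district", "dong", "fuel", "count"]  # hypothetical names

# Repeat the district and dong labels down each group of fuel rows.
clean[["district", "dong"]] = clean[["district", "dong"]].ffill()

# Convert the count column from text to a numeric dtype.
clean["count"] = pd.to_numeric(clean["count"], errors="coerce")

print(clean)
```

After a step like this, a profiler would normally see the count columns as numeric rather than unsupported, although the real file may need additional handling.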

Reproduction

Analysis started: 2024-04-06 12:33:25.406331
Analysis finished: 2024-04-06 12:33:28.383737
Duration: 2.98 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
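
A report of this kind is typically produced by passing a pandas DataFrame to ydata-profiling. The following is a minimal sketch, not the configuration actually used for this run: the function name, the report title and the output path are assumptions.

```python
def build_report(df, output_path="report.html"):
    """Profile a pandas DataFrame and write the result as an HTML report."""
    # Imported lazily so the function can be defined without the package installed.
    from ydata_profiling import ProfileReport

    # The title is an assumption; any descriptive string works.
    profile = ProfileReport(df, title="읍면동별 연료별 자동차 등록현황 (행정동)")
    profile.to_file(output_path)
    return output_path
```

Calling `build_report(df)` on the loaded dataset would write `report.html` to the working directory.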

Variables

읍면동별 연료별 자동차 등록현황 (행정동)
Categorical

Distinct: 31
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 40.9 KiB

<NA>: 4911
서울특별시 송파구: 19
서울특별시 강남구: 16
서울특별시 노원구: 14
서울특별시 강서구: 14
Other values (26): 250

Length

Max length: 11
Median length: 4
Mean length: 4.3032159
Min length: 2

Unique

Unique: 5
Unique (%): 0.1%

Sample

1st row: 자동차관리정보시스템 [Vehicle Management Information System]
2nd row: <NA>
3rd row: PROG_ID :
4th row: <NA>
5th row: 기준일자 : [reference date]

Common Values

Value: Count (Frequency %)
<NA>: 4911 (94.0%)
서울특별시 송파구: 19 (0.4%)
서울특별시 강남구: 16 (0.3%)
서울특별시 노원구: 14 (0.3%)
서울특별시 강서구: 14 (0.3%)
서울특별시 관악구: 14 (0.3%)
서울특별시 강동구: 14 (0.3%)
서울특별시 마포구: 13 (0.2%)
서울특별시 영등포구: 13 (0.2%)
서울특별시 성북구: 13 (0.2%)
Other values (21): 183 (3.5%)

Length

[Figure: Histogram of lengths of the category]

Value: Count (Frequency %)
na: 4911 (88.7%)
서울특별시: 308 (5.6%)
송파구: 19 (0.3%)
강남구: 16 (0.3%)
노원구: 14 (0.3%)
강서구: 14 (0.3%)
관악구: 14 (0.3%)
강동구: 14 (0.3%)
서초구: 13 (0.2%)
성북구: 13 (0.2%)
Other values (24): 199 (3.6%)

Unnamed: 1
Text

MISSING 

Distinct: 2
Distinct (%): 100.0%
Missing: 5222
Missing (%): > 99.9%
Memory size: 40.9 KiB

Length

Max length: 9
Median length: 7.5
Mean length: 7.5
Min length: 6

Characters and Unicode

Total characters: 15
Distinct characters: 9
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: STA029Q24
2nd row: 202302

Value: Count (Frequency %)
sta029q24: 1 (50.0%)
202302: 1 (50.0%)

Most occurring characters

Value: Count (Frequency %)
2: 5 (33.3%)
0: 3 (20.0%)
S: 1 (6.7%)
T: 1 (6.7%)
A: 1 (6.7%)
9: 1 (6.7%)
Q: 1 (6.7%)
4: 1 (6.7%)
3: 1 (6.7%)

Most occurring categories

Value: Count (Frequency %)
Decimal Number: 11 (73.3%)
Uppercase Letter: 4 (26.7%)
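
The character and category counts for this column can be reproduced with Python's standard `unicodedata` module. This is a small sketch using the two sample values shown for the column; the lookup table of long category names is hand-written for the example and covers only the two categories that occur here.

```python
import unicodedata
from collections import Counter

# The two non-missing values listed in the Sample for "Unnamed: 1".
values = ["STA029Q24", "202302"]
text = "".join(values)

# Long names for the general categories that occur in these values.
CATEGORY_NAMES = {"Nd": "Decimal Number", "Lu": "Uppercase Letter"}

# Count each character, then count characters per Unicode general category.
char_counts = Counter(text)
category_counts = Counter(
    CATEGORY_NAMES.get(unicodedata.category(ch), unicodedata.category(ch))
    for ch in text
)

print("Total characters:", len(text))
print("Distinct characters:", len(char_counts))
print("Most occurring characters:", char_counts.most_common(2))
print("Categories:", dict(category_counts))
```

The totals match the report: 15 characters, 9 of them distinct, split into 11 decimal digits and 4 uppercase letters.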

Most frequent character per category

Decimal Number
Value: Count (Frequency %)
2: 5 (45.5%)
0: 3 (27.3%)
9: 1 (9.1%)
4: 1 (9.1%)
3: 1 (9.1%)

Uppercase Letter
Value: Count (Frequency %)
S: 1 (25.0%)
T: 1 (25.0%)
A: 1 (25.0%)
Q: 1 (25.0%)

Most occurring scripts

Value: Count (Frequency %)
Common: 11 (73.3%)
Latin: 4 (26.7%)

Most frequent character per script

Common
Value: Count (Frequency %)
2: 5 (45.5%)
0: 3 (27.3%)
9: 1 (9.1%)
4: 1 (9.1%)
3: 1 (9.1%)

Latin
Value: Count (Frequency %)
S: 1 (25.0%)
T: 1 (25.0%)
A: 1 (25.0%)
Q: 1 (25.0%)

Most occurring blocks

Value: Count (Frequency %)
ASCII: 15 (100.0%)

Most frequent character per block

ASCII
Value: Count (Frequency %)
2: 5 (33.3%)
0: 3 (20.0%)
S: 1 (6.7%)
T: 1 (6.7%)
A: 1 (6.7%)
9: 1 (6.7%)
Q: 1 (6.7%)
4: 1 (6.7%)
3: 1 (6.7%)

Unnamed: 2
Text

MISSING 

Distinct: 454
Distinct (%): 61.4%
Missing: 4484
Missing (%): 85.8%
Memory size: 40.9 KiB

Length

Max length: 16
Median length: 14
Mean length: 8.3027027
Min length: 2

Characters and Unicode

Total characters: 6144
Distinct characters: 198
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 230
Unique (%): 31.1%

Sample

1st row: 읍면동 (행정동) [eup/myeon/dong (administrative dong)]
2nd row: 기타 [other]
3rd row: 종로구 청운효자동
4th row: 종로구 사직동
5th row: 종로구 사직동

Value: Count (Frequency %)
기타: 63 (4.4%)
송파구: 41 (2.9%)
강동구: 33 (2.3%)
중구: 32 (2.3%)
노원구: 32 (2.3%)
강남구: 32 (2.3%)
성북구: 31 (2.2%)
관악구: 31 (2.2%)
서초구: 29 (2.0%)
강서구: 29 (2.0%)
Other values (474): 1065 (75.1%)

Most occurring characters

Value: Count (Frequency %)
(space): 1354 (22.0%)
[Hangul character lost in extraction]: 787 (12.8%)
[Hangul character lost in extraction]: 716 (11.7%)
1: 155 (2.5%)
2: 151 (2.5%)
[Hangul character lost in extraction]: 120 (2.0%)
[Hangul character lost in extraction]: 99 (1.6%)
[Hangul character lost in extraction]: 83 (1.4%)
[Hangul character lost in extraction]: 75 (1.2%)
[Hangul character lost in extraction]: 75 (1.2%)
Other values (188): 2529 (41.2%)

Most occurring categories

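Value: Count (Frequency %)
Other Letter: 4307 (70.1%)
Space Separator: 1354 (22.0%)
Decimal Number: 466 (7.6%)
Other Punctuation: 15 (0.2%)
Close Punctuation: 1 (< 0.1%)
Open Punctuation: 1 (< 0.1%)

Most frequent character per category

Other Letter
Value: Count (Frequency %)
[Hangul character lost in extraction]: 787 (18.3%)
[Hangul character lost in extraction]: 716 (16.6%)
[Hangul character lost in extraction]: 120 (2.8%)
[Hangul character lost in extraction]: 99 (2.3%)
[Hangul character lost in extraction]: 83 (1.9%)
[Hangul character lost in extraction]: 75 (1.7%)
[Hangul character lost in extraction]: 75 (1.7%)
[Hangul character lost in extraction]: 70 (1.6%)
[Hangul character lost in extraction]: 68 (1.6%)
[Hangul character lost in extraction]: 67 (1.6%)
Other values (174): 2147 (49.8%)

Decimal Number
Value: Count (Frequency %)
1: 155 (33.3%)
2: 151 (32.4%)
3: 69 (14.8%)
4: 46 (9.9%)
5: 17 (3.6%)
6: 11 (2.4%)
7: 8 (1.7%)
8: 6 (1.3%)
0: 2 (0.4%)
9: 1 (0.2%)

Space Separator
Value: Count (Frequency %)
(space): 1354 (100.0%)

Other Punctuation
Value: Count (Frequency %)
.: 15 (100.0%)

Close Punctuation
Value: Count (Frequency %)
): 1 (100.0%)

Open Punctuation
Value: Count (Frequency %)
(: 1 (100.0%)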

Most occurring scripts

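Value: Count (Frequency %)
Hangul: 4307 (70.1%)
Common: 1837 (29.9%)

Most frequent character per script

Hangul
Value: Count (Frequency %)
[Hangul character lost in extraction]: 787 (18.3%)
[Hangul character lost in extraction]: 716 (16.6%)
[Hangul character lost in extraction]: 120 (2.8%)
[Hangul character lost in extraction]: 99 (2.3%)
[Hangul character lost in extraction]: 83 (1.9%)
[Hangul character lost in extraction]: 75 (1.7%)
[Hangul character lost in extraction]: 75 (1.7%)
[Hangul character lost in extraction]: 70 (1.6%)
[Hangul character lost in extraction]: 68 (1.6%)
[Hangul character lost in extraction]: 67 (1.6%)
Other values (174): 2147 (49.8%)

Common
Value: Count (Frequency %)
(space): 1354 (73.7%)
1: 155 (8.4%)
2: 151 (8.2%)
3: 69 (3.8%)
4: 46 (2.5%)
5: 17 (0.9%)
.: 15 (0.8%)
6: 11 (0.6%)
7: 8 (0.4%)
8: 6 (0.3%)
Other values (4): 5 (0.3%)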

Most occurring blocks

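Value: Count (Frequency %)
Hangul: 4307 (70.1%)
ASCII: 1837 (29.9%)

Most frequent character per block

ASCII
Value: Count (Frequency %)
(space): 1354 (73.7%)
1: 155 (8.4%)
2: 151 (8.2%)
3: 69 (3.8%)
4: 46 (2.5%)
5: 17 (0.9%)
.: 15 (0.8%)
6: 11 (0.6%)
7: 8 (0.4%)
8: 6 (0.3%)
Other values (4): 5 (0.3%)

Hangul
Value: Count (Frequency %)
[Hangul character lost in extraction]: 787 (18.3%)
[Hangul character lost in extraction]: 716 (16.6%)
[Hangul character lost in extraction]: 120 (2.8%)
[Hangul character lost in extraction]: 99 (2.3%)
[Hangul character lost in extraction]: 83 (1.9%)
[Hangul character lost in extraction]: 75 (1.7%)
[Hangul character lost in extraction]: 75 (1.7%)
[Hangul character lost in extraction]: 70 (1.6%)
[Hangul character lost in extraction]: 68 (1.6%)
[Hangul character lost in extraction]: 67 (1.6%)
Other values (174): 2147 (49.8%)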

Unnamed: 3
Categorical

Distinct: 17
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 40.9 KiB

경유: 545
휘발유: 515
휘발유(무연): 510
엘피지: 462
기타연료: 437
Other values (12): 2755

Length

Max length: 13
Median length: 12
Mean length: 5.7666539
Min length: 2

Unique

Unique: 3
Unique (%): 0.1%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

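Value: Count (Frequency %)
경유 [diesel]: 545 (10.4%)
휘발유 [gasoline]: 515 (9.9%)
휘발유(무연) [unleaded gasoline]: 510 (9.8%)
엘피지 [LPG]: 462 (8.8%)
기타연료 [other fuel]: 437 (8.4%)
하이브리드(휘발유+전기) [hybrid, gasoline + electric]: 436 (8.3%)
전기 [electric]: 429 (8.2%)
수소 [hydrogen]: 412 (7.9%)
하이브리드(경유+전기) [hybrid, diesel + electric]: 409 (7.8%)
하이브리드(LPG+전기) [hybrid, LPG + electric]: 379 (7.3%)
Other values (7): 690 (13.2%)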

Length

[Figure: Histogram of lengths of the category]

Value: Count (Frequency %)
경유: 545 (10.4%)
휘발유: 515 (9.9%)
휘발유(무연: 510 (9.8%)
엘피지: 462 (8.8%)
기타연료: 437 (8.4%)
하이브리드(휘발유+전기: 436 (8.3%)
전기: 429 (8.2%)
수소: 412 (7.9%)
하이브리드(경유+전기: 409 (7.8%)
하이브리드(lpg+전기: 379 (7.3%)
Other values (7): 690 (13.2%)

Unnamed: 4
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 167
Missing (%): 3.2%
Memory size: 40.9 KiB

Unnamed: 5
Unsupported

REJECTED  UNSUPPORTED 

Missing: 5
Missing (%): 0.1%
Memory size: 40.9 KiB

Unnamed: 6
Unsupported

REJECTED  UNSUPPORTED 

Missing: 6
Missing (%): 0.1%
Memory size: 40.9 KiB

Unnamed: 7
Unsupported

REJECTED  UNSUPPORTED 

Missing: 6
Missing (%): 0.1%
Memory size: 40.9 KiB

Unnamed: 8
Unsupported

REJECTED  UNSUPPORTED 

Missing: 6
Missing (%): 0.1%
Memory size: 40.9 KiB

Unnamed: 9
Unsupported

REJECTED  UNSUPPORTED 

Missing: 5
Missing (%): 0.1%
Memory size: 40.9 KiB

Unnamed: 10
Unsupported

REJECTED  UNSUPPORTED 

Missing: 6
Missing (%): 0.1%
Memory size: 40.9 KiB

Unnamed: 11
Unsupported

REJECTED  UNSUPPORTED 

Missing: 6
Missing (%): 0.1%
Memory size: 40.9 KiB

Unnamed: 12
Unsupported

REJECTED  UNSUPPORTED 

Missing: 6
Missing (%): 0.1%
Memory size: 40.9 KiB

Unnamed: 13
Unsupported

REJECTED  UNSUPPORTED 

Missing: 3
Missing (%): 0.1%
Memory size: 40.9 KiB

Unnamed: 14
Unsupported

REJECTED  UNSUPPORTED 

Missing: 6
Missing (%): 0.1%
Memory size: 40.9 KiB

Unnamed: 15
Unsupported

REJECTED  UNSUPPORTED 

Missing: 4
Missing (%): 0.1%
Memory size: 40.9 KiB

Unnamed: 16
Unsupported

REJECTED  UNSUPPORTED 

Missing: 6
Missing (%): 0.1%
Memory size: 40.9 KiB

Correlations

[Figure: correlation heatmap]
Columns: 읍면동별 연료별 자동차 등록현황 (행정동), Unnamed: 1, Unnamed: 3
읍면동별 연료별 자동차 등록현황 (행정동): 1.000, 0.000, 0.714
Unnamed: 1: 0.000, 1.000, NaN
Unnamed: 3: 0.714, NaN, 1.000

[Figure: correlation heatmap]
Columns: Unnamed: 3, 읍면동별 연료별 자동차 등록현황 (행정동)
Unnamed: 3: 1.000, 0.269
읍면동별 연료별 자동차 등록현황 (행정동): 0.269, 1.000

[Figure: correlation heatmap]
Columns: 읍면동별 연료별 자동차 등록현황 (행정동), Unnamed: 3
읍면동별 연료별 자동차 등록현황 (행정동): 1.000, 0.269
Unnamed: 3: 0.269, 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
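
Nullity correlation is commonly computed as the Pearson correlation between the "is missing" indicators of two columns. The sketch below implements that definition in plain Python, assuming this is the measure behind the heatmap; the function name and the toy columns are hypothetical and are not taken from the dataset.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is never (or always) missing has no defined correlation.
        return float("nan")
    return cov / sqrt(var_a * var_b)

# Hypothetical toy columns; None marks a missing cell.
district = ["종로구", None, None, "중구", None, None]
dong = ["사직동", None, None, "소공동", None, None]
fuel = ["경유", "휘발유", "전기", "경유", "휘발유", "전기"]

print(nullity_correlation(district, dong))  # columns missing in the same rows
print(nullity_correlation(district, fuel))  # fuel is never missing
```

Two columns that are empty in exactly the same rows score close to 1, while a column with no missing values yields NaN because its indicator has zero variance.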

Sample

읍면동별 연료별 자동차 등록현황 (행정동)Unnamed: 1Unnamed: 2Unnamed: 3Unnamed: 4Unnamed: 5Unnamed: 6Unnamed: 7Unnamed: 8Unnamed: 9Unnamed: 10Unnamed: 11Unnamed: 12Unnamed: 13Unnamed: 14Unnamed: 15Unnamed: 16
0자동차관리정보시스템<NA><NA><NA>NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
1<NA><NA><NA><NA>NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
2PROG_ID :STA029Q24<NA><NA>NaNNaNNaNNaNNaNNaNNaNNaNNaNPage No.:NaN1NaN
3<NA><NA><NA><NA>NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
4기준일자 :202302<NA><NA>NaNNaNNaNNaNNaNNaNNaNNaNNaN출력일시 :NaN2023-03-23 11:15:38NaN
5사용본거지 시군구<NA>읍면동 (행정동)연료관용NaNNaNNaN자가용NaNNaNNaN영업용NaNNaNNaN
6<NA><NA><NA><NA>NaN승용승합화물특수승용승합화물특수승용승합화물특수
7서울특별시 종로구<NA>기타휘발유(무연)1000010000000
8<NA><NA><NA>휘발유2000020000000
9<NA><NA><NA>경유1000001000000
읍면동별 연료별 자동차 등록현황 (행정동)Unnamed: 1Unnamed: 2Unnamed: 3Unnamed: 4Unnamed: 5Unnamed: 6Unnamed: 7Unnamed: 8Unnamed: 9Unnamed: 10Unnamed: 11Unnamed: 12Unnamed: 13Unnamed: 14Unnamed: 15Unnamed: 16
5214<NA><NA><NA>하이브리드(휘발유+전기)41400004140000000
5215<NA><NA><NA>하이브리드(경유+전기)7000070000000
5216<NA><NA><NA>하이브리드(LPG+전기)2000020000000
5217<NA><NA><NA>수소120000120000000
5218<NA><NA><NA>기타연료160000031030000
5219<NA><NA>기타휘발유240000221100000
5220<NA><NA><NA>경유180000041400000
5221<NA><NA><NA>휘발유(무연)4000030100000
5222서울특별시 강동구<NA><NA><NA>15360910243155612858335481234419151178222475223
5223서울<NA><NA><NA>3191681470538523897409263593876373257113520412409815185596095298

Duplicate rows

Most frequently occurring

Index, 읍면동별 연료별 자동차 등록현황 (행정동), Unnamed: 1, Unnamed: 2, Unnamed: 3, # duplicates
6, <NA>, <NA>, <NA>, 경유, 491
16, <NA>, <NA>, <NA>, 휘발유(무연), 483
9, <NA>, <NA>, <NA>, 엘피지, 438
14, <NA>, <NA>, <NA>, 하이브리드(휘발유+전기), 419
7, <NA>, <NA>, <NA>, 기타연료, 416
10, <NA>, <NA>, <NA>, 전기, 407
8, <NA>, <NA>, <NA>, 수소, 391
13, <NA>, <NA>, <NA>, 하이브리드(경유+전기), 388
12, <NA>, <NA>, <NA>, 하이브리드(LPG+전기), 363
17, <NA>, <NA>, <NA>, 휘발유(유연), 336
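
A table of this kind can be built by counting how often each distinct row occurs and keeping those that appear more than once. The sketch below does this with the standard library on a few hypothetical rows shaped like the four columns listed here; the values are illustrative, not the real data.

```python
from collections import Counter

# Hypothetical rows for the four listed columns; None stands for <NA>.
rows = [
    (None, None, None, "경유"),
    (None, None, None, "경유"),
    (None, None, None, "전기"),
    ("서울특별시 종로구", None, "기타", "경유"),
    (None, None, None, "경유"),
]

# Count identical rows, then keep only those that occur more than once,
# most frequent first.
counts = Counter(rows)
duplicates = [(row, n) for row, n in counts.most_common() if n > 1]

for row, n in duplicates:
    print(row, n)
```

In this toy input only the row with 경유 and three empty cells repeats, so it is the single entry in `duplicates`.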