Overview

Dataset statistics

Number of variables: 9
Number of observations: 73
Missing cells: 20
Missing cells (%): 3.0%
Duplicate rows: 1
Duplicate rows (%): 1.4%
Total size in memory: 5.3 KiB
Average record size in memory: 73.8 B
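
The percentages and sizes in this table can be cross-checked with simple arithmetic. The sketch below assumes the missing-cell percentage is taken over all cells (variables × observations) and that the total size is the average record size times the row count; both are assumptions that happen to agree with the reported figures, not a statement of how the profiler computes them.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 9
n_observations = 73
missing_cells = 20
duplicate_rows = 1
avg_record_bytes = 73.8

# Assumption: percentages are relative to all cells / all rows.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: total size = average record size * number of rows (1 KiB = 1024 B).
total_kib = avg_record_bytes * n_observations / 1024

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Total size: {total_kib:.1f} KiB")
```

The 20 missing cells also match the five variables that each report 4 missing values.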

Variable types

Text: 2
Categorical: 4
Boolean: 2
DateTime: 1

Dataset

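Description: Vehicle inspection data managed by the Korea Transportation Safety Authority (KOTSA, 한국교통안전공단) under the Motor Vehicle Management Act (자동차관리법) and the Rules on the Implementation of Comprehensive Vehicle Inspection (자동차종합검사 시행등에 관한 규칙).
Author: Korea Transportation Safety Authority (한국교통안전공단)
URL: https://www.data.go.kr/data/15088064/fileData.do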

Alerts

취소여부 has constant value "" | Constant
Dataset has 1 (1.4%) duplicate rows | Duplicates
전문정비업체법정동코드 is highly overall correlated with 전문정비업체사업자등록번호 and 3 other fields | High correlation
전문정비업체사업자등록번호 is highly overall correlated with 전문정비업체법정동코드 and 3 other fields | High correlation
수정일시 is highly overall correlated with 전문정비업체사업자등록번호 and 3 other fields | High correlation
사용여부 is highly overall correlated with 전문정비업체사업자등록번호 and 3 other fields | High correlation
전문정비업체구분 is highly overall correlated with 전문정비업체사업자등록번호 and 3 other fields | High correlation
전문정비업체사업자등록번호 is highly imbalanced (69.4%) | Imbalance
전문정비업체법정동코드 is highly imbalanced (69.4%) | Imbalance
전문정비업체구분 is highly imbalanced (69.4%) | Imbalance
사용여부 is highly imbalanced (57.4%) | Imbalance
수정일시 is highly imbalanced (77.8%) | Imbalance
전문정비업체명 has 4 (5.5%) missing values | Missing
전문정비업체주소명 has 4 (5.5%) missing values | Missing
취소여부 has 4 (5.5%) missing values | Missing
사용여부 has 4 (5.5%) missing values | Missing
등록일시 has 4 (5.5%) missing values | Missing
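
The imbalance percentages in the alerts above are consistent with a score of one minus the normalized Shannon entropy of the category counts. The sketch below is an assumption about the formula, inferred from the numbers in this report rather than taken from the profiler's documentation; the helper name `imbalance` is hypothetical.

```python
import math

def imbalance(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the category distribution and k is the number of categories."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single category has no entropy to normalize
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Category counts read from the variable sections of this report.
examples = {
    "전문정비업체사업자등록번호": [69, 4],
    "사용여부": [63, 6],                      # missing values excluded
    "수정일시": [67, 1, 1, 1, 1, 1, 1],
}
for name, counts in examples.items():
    print(f"{name}: {100 * imbalance(counts):.1f}%")
```

Under this assumption the three inputs reproduce the reported 69.4%, 57.4% and 77.8%. A score near 100% means almost all rows share one value; a score of 0% means the categories are evenly populated.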

Reproduction

Analysis started: 2023-12-12 13:49:04.876468
Analysis finished: 2023-12-12 13:49:06.068046
Duration: 1.19 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

전문정비업체명 (specialized maintenance shop name)
Text

MISSING

Distinct: 69
Distinct (%): 100.0%
Missing: 4
Missing (%): 5.5%
Memory size: 716.0 B

Length

Max length: 15
Median length: 12
Mean length: 8.1884058
Min length: 3

Characters and Unicode

Total characters: 565
Distinct characters: 135
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 69
Unique (%): 100.0%

Sample

1st row: 광주수양자동차공업사 주식회사
2nd row: 그린 기능장 카 서비스
3rd row: (주)동인천서비스
4th row: 현대하이모터스
5th row: 현대기아모터스

Value | Count | Frequency (%)
제이디피모터스 | 5 | 5.7%
주식회사 | 3 | 3.4%
카매니져 | 1 | 1.1%
일동카센타 | 1 | 1.1%
주)구미우본정비 | 1 | 1.1%
선경산업(주 | 1 | 1.1%
프로 | 1 | 1.1%
타이어 | 1 | 1.1%
규암쌍용카 | 1 | 1.1%
상아종합모터스 | 1 | 1.1%
Other values (72) | 72 | 81.8%

Most occurring characters

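Value | Count | Frequency (%)
(blank) | 38 | 6.7%
(blank) | 24 | 4.2%
(blank) | 20 | 3.5%
(blank) | 19 | 3.4%
(blank) | 19 | 3.4%
(blank) | 18 | 3.2%
(blank) | 18 | 3.2%
(blank) | 17 | 3.0%
(blank) | 16 | 2.8%
(blank) | 15 | 2.7%
Other values (125) | 361 | 63.9%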

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 524 | 92.7%
Space Separator | 19 | 3.4%
Open Punctuation | 10 | 1.8%
Close Punctuation | 10 | 1.8%
Decimal Number | 2 | 0.4%

Most frequent character per category

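Other Letter
Value | Count | Frequency (%)
(blank) | 38 | 7.3%
(blank) | 24 | 4.6%
(blank) | 20 | 3.8%
(blank) | 19 | 3.6%
(blank) | 18 | 3.4%
(blank) | 18 | 3.4%
(blank) | 17 | 3.2%
(blank) | 16 | 3.1%
(blank) | 15 | 2.9%
(blank) | 14 | 2.7%
Other values (120) | 325 | 62.0%

Decimal Number
Value | Count | Frequency (%)
1 | 1 | 50.0%
2 | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 19 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 10 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 10 | 100.0%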

Most occurring scripts

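Value | Count | Frequency (%)
Hangul | 524 | 92.7%
Common | 41 | 7.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(blank) | 38 | 7.3%
(blank) | 24 | 4.6%
(blank) | 20 | 3.8%
(blank) | 19 | 3.6%
(blank) | 18 | 3.4%
(blank) | 18 | 3.4%
(blank) | 17 | 3.2%
(blank) | 16 | 3.1%
(blank) | 15 | 2.9%
(blank) | 14 | 2.7%
Other values (120) | 325 | 62.0%

Common
Value | Count | Frequency (%)
(space) | 19 | 46.3%
( | 10 | 24.4%
) | 10 | 24.4%
1 | 1 | 2.4%
2 | 1 | 2.4%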

Most occurring blocks

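Value | Count | Frequency (%)
Hangul | 524 | 92.7%
ASCII | 41 | 7.3%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(blank) | 38 | 7.3%
(blank) | 24 | 4.6%
(blank) | 20 | 3.8%
(blank) | 19 | 3.6%
(blank) | 18 | 3.4%
(blank) | 18 | 3.4%
(blank) | 17 | 3.2%
(blank) | 16 | 3.1%
(blank) | 15 | 2.9%
(blank) | 14 | 2.7%
Other values (120) | 325 | 62.0%

ASCII
Value | Count | Frequency (%)
(space) | 19 | 46.3%
( | 10 | 24.4%
) | 10 | 24.4%
1 | 1 | 2.4%
2 | 1 | 2.4%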

전문정비업체사업자등록번호 (specialized maintenance shop business registration number)
Categorical

HIGH CORRELATION  IMBALANCE

Distinct: 2
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Memory size: 716.0 B

Value | Count
************* | 69
<NA> | 4

Length

Max length: 13
Median length: 13
Mean length: 12.506849
Min length: 4
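
The length statistics suggest why this variable reports 0 missing values even though four rows show `<NA>`: the figures are consistent with `<NA>` being stored as a literal 4-character string next to 69 masked 13-character values. The sketch below only checks that arithmetic; treating `<NA>` as a string is an inference from the numbers, not something the report states.

```python
# Value lengths and counts read from this variable's frequency table.
masked_value = "*************"   # 13-character masked registration number
na_literal = "<NA>"              # assumed to be stored as a plain string

length_counts = {len(masked_value): 69, len(na_literal): 4}

total_values = sum(length_counts.values())
total_chars = sum(length * count for length, count in length_counts.items())
mean_length = total_chars / total_values

print(total_values)            # number of rows covered
print(round(mean_length, 6))   # compare with the reported mean length
```

The result matches the reported mean length of 12.506849 and the minimum length of 4.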

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: *************
2nd row: *************
3rd row: *************
4th row: *************
5th row: *************

Common Values

Value | Count | Frequency (%)
************* | 69 | 94.5%
<NA> | 4 | 5.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
(blank) | 69 | 94.5%
na | 4 | 5.5%

전문정비업체법정동코드 (specialized maintenance shop legal-dong code)
Categorical

HIGH CORRELATION  IMBALANCE

Distinct: 2
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Memory size: 716.0 B

Value | Count
(blank) | 69
<NA> | 4

Length

Max length: 4
Median length: 1
Mean length: 1.1643836
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

Value | Count | Frequency (%)
(blank) | 69 | 94.5%
<NA> | 4 | 5.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 4 | 100.0%

전문정비업체주소명 (specialized maintenance shop address)
Text

MISSING

Distinct: 68
Distinct (%): 98.6%
Missing: 4
Missing (%): 5.5%
Memory size: 716.0 B

Length

Max length: 40
Median length: 33
Mean length: 26.550725
Min length: 20

Characters and Unicode

Total characters: 1832
Distinct characters: 211
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 67
Unique (%): 97.1%

Sample

1st row: 경기도 광주시 곤지암읍 구수동길 14-10
2nd row: 강원도 원주시 우산로 21 (우산동)
3rd row: 인천광역시 중구 인중로 44 (신흥동3가)
4th row: 경기도 의정부시 평화로 359 (호원동,회룡한주아파트)
5th row: 부산광역시 수영구 연수로 339 (망미동)

Value | Count | Frequency (%)
경기도 | 9 | 2.5%
충청남도 | 9 | 2.5%
제주특별자치도 | 7 | 2.0%
전라남도 | 7 | 2.0%
경상북도 | 6 | 1.7%
경상남도 | 6 | 1.7%
순천시 | 5 | 1.4%
제주시 | 5 | 1.4%
서울특별시 | 4 | 1.1%
대구광역시 | 4 | 1.1%
Other values (247) | 293 | 82.5%

Most occurring characters

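Value | Count | Frequency (%)
(space) | 343 | 18.7%
(blank) | 74 | 4.0%
(blank) | 67 | 3.7%
(blank) | 61 | 3.3%
(blank) | 59 | 3.2%
( | 59 | 3.2%
) | 59 | 3.2%
1 | 44 | 2.4%
(blank) | 35 | 1.9%
(blank) | 32 | 1.7%
Other values (201) | 999 | 54.5%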

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1120 | 61.1%
Space Separator | 343 | 18.7%
Decimal Number | 213 | 11.6%
Open Punctuation | 59 | 3.2%
Close Punctuation | 59 | 3.2%
Other Punctuation | 17 | 0.9%
Dash Punctuation | 12 | 0.7%
Uppercase Letter | 9 | 0.5%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(blank) | 74 | 6.6%
(blank) | 67 | 6.0%
(blank) | 61 | 5.4%
(blank) | 59 | 5.3%
(blank) | 35 | 3.1%
(blank) | 32 | 2.9%
(blank) | 26 | 2.3%
(blank) | 24 | 2.1%
(blank) | 23 | 2.1%
(blank) | 23 | 2.1%
Other values (179) | 696 | 62.1%

Decimal Number
Value | Count | Frequency (%)
1 | 44 | 20.7%
3 | 30 | 14.1%
2 | 29 | 13.6%
5 | 26 | 12.2%
0 | 18 | 8.5%
7 | 17 | 8.0%
9 | 16 | 7.5%
8 | 12 | 5.6%
4 | 11 | 5.2%
6 | 10 | 4.7%

Uppercase Letter
Value | Count | Frequency (%)
L | 2 | 22.2%
E | 2 | 22.2%
A | 1 | 11.1%
N | 1 | 11.1%
D | 1 | 11.1%
S | 1 | 11.1%
T | 1 | 11.1%

Space Separator
Value | Count | Frequency (%)
(space) | 343 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 59 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 59 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 17 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 12 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1120 | 61.1%
Common | 703 | 38.4%
Latin | 9 | 0.5%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(blank) | 74 | 6.6%
(blank) | 67 | 6.0%
(blank) | 61 | 5.4%
(blank) | 59 | 5.3%
(blank) | 35 | 3.1%
(blank) | 32 | 2.9%
(blank) | 26 | 2.3%
(blank) | 24 | 2.1%
(blank) | 23 | 2.1%
(blank) | 23 | 2.1%
Other values (179) | 696 | 62.1%

Common
Value | Count | Frequency (%)
(space) | 343 | 48.8%
( | 59 | 8.4%
) | 59 | 8.4%
1 | 44 | 6.3%
3 | 30 | 4.3%
2 | 29 | 4.1%
5 | 26 | 3.7%
0 | 18 | 2.6%
7 | 17 | 2.4%
, | 17 | 2.4%
Other values (5) | 61 | 8.7%

Latin
Value | Count | Frequency (%)
L | 2 | 22.2%
E | 2 | 22.2%
A | 1 | 11.1%
N | 1 | 11.1%
D | 1 | 11.1%
S | 1 | 11.1%
T | 1 | 11.1%

Most occurring blocks

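Value | Count | Frequency (%)
Hangul | 1120 | 61.1%
ASCII | 712 | 38.9%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 343 | 48.2%
( | 59 | 8.3%
) | 59 | 8.3%
1 | 44 | 6.2%
3 | 30 | 4.2%
2 | 29 | 4.1%
5 | 26 | 3.7%
0 | 18 | 2.5%
7 | 17 | 2.4%
, | 17 | 2.4%
Other values (12) | 70 | 9.8%

Hangul
Value | Count | Frequency (%)
(blank) | 74 | 6.6%
(blank) | 67 | 6.0%
(blank) | 61 | 5.4%
(blank) | 59 | 5.3%
(blank) | 35 | 3.1%
(blank) | 32 | 2.9%
(blank) | 26 | 2.3%
(blank) | 24 | 2.1%
(blank) | 23 | 2.1%
(blank) | 23 | 2.1%
Other values (179) | 696 | 62.1%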

전문정비업체구분 (specialized maintenance shop type)
Categorical

HIGH CORRELATION  IMBALANCE

Distinct: 2
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Memory size: 716.0 B

Value | Count
전문정비업체 | 69
<NA> | 4

Length

Max length: 6
Median length: 6
Mean length: 5.890411
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전문정비업체
2nd row: 전문정비업체
3rd row: 전문정비업체
4th row: 전문정비업체
5th row: 전문정비업체

Common Values

Value | Count | Frequency (%)
전문정비업체 | 69 | 94.5%
<NA> | 4 | 5.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
전문정비업체 | 69 | 94.5%
na | 4 | 5.5%

취소여부 (cancelled or not)
Boolean

CONSTANT  MISSING

Distinct: 1
Distinct (%): 1.4%
Missing: 4
Missing (%): 5.5%
Memory size: 278.0 B

Value | Count
False | 69
(Missing) | 4

Value | Count | Frequency (%)
False | 69 | 94.5%
(Missing) | 4 | 5.5%

사용여부 (in use or not)
Boolean

HIGH CORRELATION  IMBALANCE  MISSING

Distinct: 2
Distinct (%): 2.9%
Missing: 4
Missing (%): 5.5%
Memory size: 278.0 B

Value | Count
True | 63
False | 6
(Missing) | 4

Value | Count | Frequency (%)
True | 63 | 86.3%
False | 6 | 8.2%
(Missing) | 4 | 5.5%

등록일시 (registration date and time)
Date

MISSING

Distinct: 57
Distinct (%): 82.6%
Missing: 4
Missing (%): 5.5%
Memory size: 716.0 B
Minimum: 2018-05-31 00:00:00
Maximum: 2020-11-11 00:00:00
Histogram with fixed size bins (bins=50)

수정일시 (modification date and time)
Categorical

HIGH CORRELATION  IMBALANCE

Distinct: 7
Distinct (%): 9.6%
Missing: 0
Missing (%): 0.0%
Memory size: 716.0 B

Value | Count
(blank) | 67
2020-07-31 | 1
2019-03-29 | 1
2021-06-10 | 1
2021-04-29 | 1
Other values (2) | 2

Length

Max length: 10
Median length: 1
Mean length: 1.739726
Min length: 1

Unique

Unique: 6
Unique (%): 8.2%

Sample

1st row
2nd row
3rd row
4th row
5th row: 2020-07-31

Common Values

Value | Count | Frequency (%)
(blank) | 67 | 91.8%
2020-07-31 | 1 | 1.4%
2019-03-29 | 1 | 1.4%
2021-06-10 | 1 | 1.4%
2021-04-29 | 1 | 1.4%
2021-01-19 | 1 | 1.4%
2021-03-04 | 1 | 1.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2020-07-31 | 1 | 16.7%
2019-03-29 | 1 | 16.7%
2021-06-10 | 1 | 16.7%
2021-04-29 | 1 | 16.7%
2021-01-19 | 1 | 16.7%
2021-03-04 | 1 | 16.7%

Correlations

Variable | 전문정비업체명 | 전문정비업체주소명 | 사용여부 | 등록일시 | 수정일시
전문정비업체명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
전문정비업체주소명 | 1.000 | 1.000 | 0.000 | 0.996 | 0.000
사용여부 | 1.000 | 0.000 | 1.000 | 0.828 | 1.000
등록일시 | 1.000 | 0.996 | 0.828 | 1.000 | 0.946
수정일시 | 1.000 | 0.000 | 1.000 | 0.946 | 1.000

Variable | 전문정비업체법정동코드 | 전문정비업체사업자등록번호 | 수정일시 | 사용여부 | 전문정비업체구분
전문정비업체법정동코드 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
전문정비업체사업자등록번호 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
수정일시 | 1.000 | 1.000 | 1.000 | 0.962 | 1.000
사용여부 | 1.000 | 1.000 | 0.962 | 1.000 | 1.000
전문정비업체구분 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Variable | 전문정비업체사업자등록번호 | 전문정비업체법정동코드 | 전문정비업체구분 | 사용여부 | 수정일시
전문정비업체사업자등록번호 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
전문정비업체법정동코드 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
전문정비업체구분 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
사용여부 | 1.000 | 1.000 | 1.000 | 1.000 | 0.962
수정일시 | 1.000 | 1.000 | 1.000 | 0.962 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
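
Nullity correlation can be illustrated by correlating 0/1 missingness indicators. The sketch below is a minimal stdlib example, assuming a plain Pearson correlation on those indicators; the rows 69 to 72 are the ones shown as `<NA>` in the sample of last rows, while `hypothetical_missing` is an invented column added only for contrast.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

n_rows = 73
missing_rows = {69, 70, 71, 72}  # rows shown as <NA> in the sample

# 1 = value missing, 0 = value present.
name_missing = [1 if i in missing_rows else 0 for i in range(n_rows)]
date_missing = [1 if i in missing_rows else 0 for i in range(n_rows)]

# Hypothetical column (not in the dataset) that is missing on other rows.
hypothetical_missing = [1 if i in {0, 70} else 0 for i in range(n_rows)]

r_same = pearson(name_missing, date_missing)
r_other = pearson(name_missing, hypothetical_missing)
print(round(r_same, 3), round(r_other, 3))
```

Columns that are missing on exactly the same rows, as the five flagged variables appear to be here, give a nullity correlation of 1. Columns with no missing values have zero variance in their indicator and cannot be correlated this way.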

Sample

First rows

Row | 전문정비업체명 | 전문정비업체사업자등록번호 | 전문정비업체법정동코드 | 전문정비업체주소명 | 전문정비업체구분 | 취소여부 | 사용여부 | 등록일시 | 수정일시
0 | 광주수양자동차공업사 주식회사 | ************* | | 경기도 광주시 곤지암읍 구수동길 14-10 | 전문정비업체 | N | Y | 2018-05-31 |
1 | 그린 기능장 카 서비스 | ************* | | 강원도 원주시 우산로 21 (우산동) | 전문정비업체 | N | Y | 2018-06-15 |
2 | (주)동인천서비스 | ************* | | 인천광역시 중구 인중로 44 (신흥동3가) | 전문정비업체 | N | Y | 2018-06-15 |
3 | 현대하이모터스 | ************* | | 경기도 의정부시 평화로 359 (호원동,회룡한주아파트) | 전문정비업체 | N | Y | 2018-06-18 |
4 | 현대기아모터스 | ************* | | 부산광역시 수영구 연수로 339 (망미동) | 전문정비업체 | N | N | 2018-06-21 | 2020-07-31
5 | 심상철자동차정비 | ************* | | 경상남도 양산시 웅상대로 929 (평산동,심상철자동차정비) | 전문정비업체 | N | Y | 2018-06-26 |
6 | 애니카랜드 호원점 | ************* | | 경기도 의정부시 평화로 251 (호원동,애니카랜드) | 전문정비업체 | N | Y | 2018-06-26 |
7 | 거제현대서비스 | ************* | | 경상남도 거제시 연초면 연하해안로 328 | 전문정비업체 | N | Y | 2018-07-04 |
8 | 카매니져 | ************* | | 대구광역시 달서구 성서4차첨단로 169 (월암동) | 전문정비업체 | N | Y | 2018-07-13 |
9 | 오성자동차 | ************* | | 경기도 성남시 중원구 사기막골로 193 (상대원동,럭키참좋은힐) | 전문정비업체 | N | N | 2018-07-17 | 2019-03-29

Last rows

Row | 전문정비업체명 | 전문정비업체사업자등록번호 | 전문정비업체법정동코드 | 전문정비업체주소명 | 전문정비업체구분 | 취소여부 | 사용여부 | 등록일시 | 수정일시
63 | 신평자동차공업사(주) | ************* | | 충청남도 당진시 신평면 서해로 7093-1 (신평자동차공업사) | 전문정비업체 | N | N | 2020-11-05 | 2021-03-04
64 | 공단카센타 | ************* | | 전라북도 군산시 외항로 155 (산북동,성진자원) | 전문정비업체 | N | Y | 2020-11-05 |
65 | 보쉬 여수디젤서비스 | ************* | | 전라남도 여수시 좌수영로 540 (둔덕동) | 전문정비업체 | N | Y | 2020-11-06 |
66 | 동천현대모터스 | ************* | | 경상북도 경주시 백률로57번길 6-7 (동천동) | 전문정비업체 | N | Y | 2020-11-09 |
67 | 백마카센타(군산) | ************* | | 전라북도 군산시 내사길 52 (사정동,삼성라디에터전북대리점) | 전문정비업체 | N | Y | 2020-11-11 |
68 | 거라지21 | ************* | | 광주광역시 서구 군분2로 31-3 (화정동) | 전문정비업체 | N | Y | 2020-11-11 |
69 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
70 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
71 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
72 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |

Duplicate rows

Most frequently occurring

Row | 전문정비업체명 | 전문정비업체사업자등록번호 | 전문정비업체법정동코드 | 전문정비업체주소명 | 전문정비업체구분 | 취소여부 | 사용여부 | 등록일시 | 수정일시 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | | 4
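
The overview reports 1 duplicate row (1.4%), while this table shows a count of 4. One reading consistent with both figures is that the overview counts distinct duplicated row patterns and `# duplicates` counts how often that pattern occurs. The sketch below illustrates that reading with simplified stand-in rows; the row contents are hypothetical and the interpretation is an inference from the numbers, not documented profiler behavior.

```python
from collections import Counter

# Hypothetical stand-in data: 69 distinct records plus 4 identical all-<NA> rows.
rows = [(f"shop-{i}", "N", "Y") for i in range(69)]
rows += [("<NA>", "<NA>", "<NA>")] * 4

# Count how often each full row occurs, then keep rows seen more than once.
occurrences = Counter(rows)
duplicated = {row: n for row, n in occurrences.items() if n > 1}

distinct_duplicated = len(duplicated)
duplicate_pct = 100 * distinct_duplicated / len(rows)

print(distinct_duplicated)          # distinct duplicated row patterns
print(list(duplicated.values()))    # occurrences of each pattern
print(f"{duplicate_pct:.1f}%")
```

With 73 rows and a single pattern repeated four times, this yields 1 duplicated pattern (1.4% of rows) with 4 occurrences, which matches both the overview and the table.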