Overview

Dataset statistics

Number of variables: 9
Number of observations: 79
Missing cells: 323
Missing cells (%): 45.4%
Duplicate rows: 1
Duplicate rows (%): 1.3%
Total size in memory: 5.7 KiB
Average record size in memory: 73.7 B
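
The two percentages above follow directly from the counts. As a minimal sketch (the helper name `percent` is illustrative and not part of the report), the missing-cell share is taken over all cells (variables × observations) and the duplicate share over all rows:

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Counts copied from the dataset statistics above.
n_variables = 9
n_observations = 79
missing_cells = 323
duplicate_rows = 1

total_cells = n_variables * n_observations  # every cell in the table
missing_pct = percent(missing_cells, total_cells)
duplicate_pct = percent(duplicate_rows, n_observations)

print(f"Missing cells (%): {missing_pct}")
print(f"Duplicate rows (%): {duplicate_pct}")
```

Nearly half of all cells are empty, which is consistent with the alerts below: four of the nine columns are more than 90% missing.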

Variable types

Unsupported: 3
Text: 4
Categorical: 1
Boolean: 1

Dataset

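Description: Traffic Culture Index (교통문화지수; a score derived from surveying and analyzing driving behavior, traffic safety, and the traffic environment), broken down by domain (driving/traffic/pedestrian/culture...)
Author: 교통안전공단 (Korea Transportation Safety Authority)
URL: https://www.vworld.kr/dtmk/dtmk_ntads_s002.do?dsId=30035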

Alerts

Unnamed: 5 has constant value "" (Constant)
Unnamed: 8 has constant value "" (Constant)
Dataset has 1 (1.3%) duplicate rows (Duplicates)
Unnamed: 3 is highly imbalanced (61.1%) (Imbalance)
테이블정의서 has 1 (1.3%) missing values (Missing)
Unnamed: 1 has 6 (7.6%) missing values (Missing)
Unnamed: 2 has 5 (6.3%) missing values (Missing)
Unnamed: 4 has 6 (7.6%) missing values (Missing)
Unnamed: 5 has 77 (97.5%) missing values (Missing)
Unnamed: 6 has 74 (93.7%) missing values (Missing)
Unnamed: 7 has 76 (96.2%) missing values (Missing)
Unnamed: 8 has 78 (98.7%) missing values (Missing)
테이블정의서 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)
Unnamed: 4 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)
Unnamed: 7 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)

Reproduction

Analysis started: 2024-04-18 00:13:06.716370
Analysis finished: 2024-04-18 00:13:08.276232
Duration: 1.56 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

테이블정의서
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 1
Missing (%): 1.3%
Memory size: 764.0 B

Unnamed: 1
Text

MISSING 

Distinct: 73
Distinct (%): 100.0%
Missing: 6
Missing (%): 7.6%
Memory size: 764.0 B

Length

Max length: 37
Median length: 29
Mean length: 19.342466
Min length: 4
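
As a rough illustration of how such length statistics can be derived, the sketch below summarizes string lengths while skipping missing entries. The helper `length_summary` is hypothetical, and the input is only the five sample values listed for this variable plus one missing entry, not the full column. The last line checks that the reported mean is consistent with the 1412 total characters reported for this variable spread over its 73 non-missing values.

```python
from statistics import mean, median


def length_summary(values):
    """Summarize string lengths, skipping missing (None) entries."""
    lengths = [len(v) for v in values if v is not None]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }


# Five sample values from this variable, plus one missing entry.
sample = ["컬럼ID", "YEAR_CD", "JIJACE_CD", "DRV_BHV_GRD", "TRF_SAF_GRD", None]
summary = length_summary(sample)
print(summary)

# Consistency check against the report: total characters / non-missing values.
reported_mean = round(1412 / 73, 6)
print(reported_mean)
```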

Characters and Unicode

Total characters: 1412
Distinct characters: 27
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
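
A minimal sketch of the category breakdown, using Python's standard `unicodedata` module (the helper `category_counts` and the three-value sample are illustrative assumptions, not the profiler's own code). The two-letter codes map to the category names used in this report: `Lu` is Uppercase Letter, `Pc` is Connector Punctuation, and `Lo` is Other Letter, which is where Hangul syllables fall.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters, skipping None."""
    counts = Counter()
    for value in values:
        if value is None:
            continue
        counts.update(unicodedata.category(ch) for ch in value)
    return counts


# A few sample values from this variable, plus one missing entry.
sample = ["컬럼ID", "YEAR_CD", "JIJACE_CD", None]
counts = category_counts(sample)
print(dict(counts))
```

The standard library exposes categories but not scripts or blocks, so the script and block tables in this report would need another data source.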

Unique

Unique: 73
Unique (%): 100.0%

Sample

1st row: 컬럼ID (column ID)
2nd row: YEAR_CD
3rd row: JIJACE_CD
4th row: DRV_BHV_GRD
5th row: TRF_SAF_GRD

Value | Count | Frequency (%)
saf_blt_rank | 1 | 1.4%
local_safty_perform_rank | 1 | 1.4%
sgn_cnfm_rat_avg | 1 | 1.4%
drct_sgl_rgtrat_avg | 1 | 1.4%
crswk_stp_ln_cnfm_rat_avg | 1 | 1.4%
not_crsw_smart_userat_rank | 1 | 1.4%
not_crsw_smart_userat_avg | 1 | 1.4%
busi_car_cnt_road_acc_death_rank | 1 | 1.4%
busi_car_cnt_road_acc_death_cnt | 1 | 1.4%
people_road_perdestrian_death_rank | 1 | 1.4%
Other values (63) | 63 | 86.3%

Most occurring characters

Value | Count | Frequency (%)
_ | 221 | 15.7%
A | 155 | 11.0%
R | 139 | 9.8%
T | 95 | 6.7%
S | 93 | 6.6%
E | 82 | 5.8%
C | 70 | 5.0%
D | 59 | 4.2%
G | 57 | 4.0%
N | 56 | 4.0%
Other values (17) | 385 | 27.3%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 1189 | 84.2%
Connector Punctuation | 221 | 15.7%
Other Letter | 2 | 0.1%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
A | 155 | 13.0%
R | 139 | 11.7%
T | 95 | 8.0%
S | 93 | 7.8%
E | 82 | 6.9%
C | 70 | 5.9%
D | 59 | 5.0%
G | 57 | 4.8%
N | 56 | 4.7%
V | 40 | 3.4%
Other values (14) | 343 | 28.8%

Other Letter
Value | Count | Frequency (%)
컬 | 1 | 50.0%
럼 | 1 | 50.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 221 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1189 | 84.2%
Common | 221 | 15.7%
Hangul | 2 | 0.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
A | 155 | 13.0%
R | 139 | 11.7%
T | 95 | 8.0%
S | 93 | 7.8%
E | 82 | 6.9%
C | 70 | 5.9%
D | 59 | 5.0%
G | 57 | 4.8%
N | 56 | 4.7%
V | 40 | 3.4%
Other values (14) | 343 | 28.8%

Hangul
Value | Count | Frequency (%)
컬 | 1 | 50.0%
럼 | 1 | 50.0%

Common
Value | Count | Frequency (%)
_ | 221 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1410 | 99.9%
Hangul | 2 | 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
_ | 221 | 15.7%
A | 155 | 11.0%
R | 139 | 9.9%
T | 95 | 6.7%
S | 93 | 6.6%
E | 82 | 5.8%
C | 70 | 5.0%
D | 59 | 4.2%
G | 57 | 4.0%
N | 56 | 4.0%
Other values (15) | 383 | 27.2%

Hangul
Value | Count | Frequency (%)
컬 | 1 | 50.0%
럼 | 1 | 50.0%

Unnamed: 2
Text

MISSING 

Distinct: 74
Distinct (%): 100.0%
Missing: 5
Missing (%): 6.3%
Memory size: 764.0 B

Length

Max length: 33
Median length: 25
Mean length: 13.418919
Min length: 3

Characters and Unicode

Total characters: 993
Distinct characters: 116
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

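Unique: 74
Unique (%): 100.0%

Sample

1st row: 컬럼명 (column name)
2nd row: 년도_코드 (year code)
3rd row: 지자체_코드 (local government code)
4th row: 운전_행태_점수 (driving behavior score)
5th row: 교통_안전_점수 (traffic safety score)

Value | Count | Frequency (%)
랭크 (rank) | 11 | 5.4%
(not preserved) | 9 | 4.4%
(not preserved) | 9 | 4.4%
지자체 (local government) | 9 | 4.4%
사망자 (fatalities) | 9 | 4.4%
도로연장 (road length) | 9 | 4.4%
교통사고 (traffic accidents) | 6 | 2.9%
인구 (population) | 6 | 2.9%
교통안전 (traffic safety) | 6 | 2.9%
자동차 (automobiles) | 6 | 2.9%
Other values (74) | 124 | 60.8%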

Most occurring characters

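Value | Count | Frequency (%)
(space) | 130 | 13.1%
_ | 76 | 7.7%
(Hangul syllable) | 35 | 3.5%
(Hangul syllable) | 32 | 3.2%
(Hangul syllable) | 31 | 3.1%
(Hangul syllable) | 29 | 2.9%
(Hangul syllable) | 27 | 2.7%
(Hangul syllable) | 23 | 2.3%
(Hangul syllable) | 20 | 2.0%
(Hangul syllable) | 20 | 2.0%
Other values (106) | 570 | 57.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 765 | 77.0%
Space Separator | 130 | 13.1%
Connector Punctuation | 76 | 7.7%
Uppercase Letter | 22 | 2.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(Hangul syllable) | 35 | 4.6%
(Hangul syllable) | 32 | 4.2%
(Hangul syllable) | 31 | 4.1%
(Hangul syllable) | 29 | 3.8%
(Hangul syllable) | 27 | 3.5%
(Hangul syllable) | 23 | 3.0%
(Hangul syllable) | 20 | 2.6%
(Hangul syllable) | 20 | 2.6%
(Hangul syllable) | 19 | 2.5%
(Hangul syllable) | 19 | 2.5%
Other values (89) | 510 | 66.7%

Uppercase Letter
Value | Count | Frequency (%)
E | 2 | 9.1%
S | 2 | 9.1%
D | 2 | 9.1%
I | 2 | 9.1%
R | 2 | 9.1%
T | 2 | 9.1%
U | 2 | 9.1%
X | 1 | 4.5%
P | 1 | 4.5%
A | 1 | 4.5%
Other values (5) | 5 | 22.7%

Space Separator
Value | Count | Frequency (%)
(space) | 130 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 76 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 765 | 77.0%
Common | 206 | 20.7%
Latin | 22 | 2.2%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(Hangul syllable) | 35 | 4.6%
(Hangul syllable) | 32 | 4.2%
(Hangul syllable) | 31 | 4.1%
(Hangul syllable) | 29 | 3.8%
(Hangul syllable) | 27 | 3.5%
(Hangul syllable) | 23 | 3.0%
(Hangul syllable) | 20 | 2.6%
(Hangul syllable) | 20 | 2.6%
(Hangul syllable) | 19 | 2.5%
(Hangul syllable) | 19 | 2.5%
Other values (89) | 510 | 66.7%

Latin
Value | Count | Frequency (%)
E | 2 | 9.1%
S | 2 | 9.1%
D | 2 | 9.1%
I | 2 | 9.1%
R | 2 | 9.1%
T | 2 | 9.1%
U | 2 | 9.1%
X | 1 | 4.5%
P | 1 | 4.5%
A | 1 | 4.5%
Other values (5) | 5 | 22.7%

Common
Value | Count | Frequency (%)
(space) | 130 | 63.1%
_ | 76 | 36.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 765 | 77.0%
ASCII | 228 | 23.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 130 | 57.0%
_ | 76 | 33.3%
E | 2 | 0.9%
S | 2 | 0.9%
D | 2 | 0.9%
I | 2 | 0.9%
R | 2 | 0.9%
T | 2 | 0.9%
U | 2 | 0.9%
X | 1 | 0.4%
Other values (7) | 7 | 3.1%

Hangul
Value | Count | Frequency (%)
(Hangul syllable) | 35 | 4.6%
(Hangul syllable) | 32 | 4.2%
(Hangul syllable) | 31 | 4.1%
(Hangul syllable) | 29 | 3.8%
(Hangul syllable) | 27 | 3.5%
(Hangul syllable) | 23 | 3.0%
(Hangul syllable) | 20 | 2.6%
(Hangul syllable) | 20 | 2.6%
(Hangul syllable) | 19 | 2.5%
(Hangul syllable) | 19 | 2.5%
Other values (89) | 510 | 66.7%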

Unnamed: 3
Categorical

IMBALANCE 

Distinct: 6
Distinct (%): 7.6%
Missing: 0
Missing (%): 0.0%
Memory size: 764.0 B
Most frequent values: NUMBER (65), VARCHAR (6), <NA> (5), 테이블ID (1), 타입 (1)

Length

Max length: 7
Median length: 6
Mean length: 5.8607595
Min length: 2

Unique

Unique: 3
Unique (%): 3.8%

Sample

1st row: <NA>
2nd row: 테이블ID (table ID)
3rd row: <NA>
4th row: 타입 (type)
5th row: VARCHAR

Common Values

Value | Count | Frequency (%)
NUMBER | 65 | 82.3%
VARCHAR | 6 | 7.6%
<NA> | 5 | 6.3%
테이블ID | 1 | 1.3%
타입 | 1 | 1.3%
DATE | 1 | 1.3%
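
The 61.1% imbalance figure in the alert for this variable is consistent with one minus the normalized Shannon entropy of the counts above. The sketch below reproduces that arithmetic. It is an assumption about the formula, inferred from the numbers matching, and the function name `imbalance_score` and its handling of a single class are choices made for this example.

```python
import math


def imbalance_score(counts):
    """1 - entropy / log(n_classes): 0 for a uniform split, near 1 when one class dominates."""
    total = sum(counts.values())
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # convention chosen for this sketch; entropy is not normalizable
    entropy = -sum(
        (c / total) * math.log(c / total) for c in counts.values() if c > 0
    )
    return 1 - entropy / math.log(n_classes)


# Counts copied from the Common Values table above.
common_values = {
    "NUMBER": 65,
    "VARCHAR": 6,
    "<NA>": 5,
    "테이블ID": 1,
    "타입": 1,
    "DATE": 1,
}
score = imbalance_score(common_values)
print(f"Imbalance: {100 * score:.1f}%")
```

The score is high because a single value, NUMBER, accounts for 65 of the 79 rows.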

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
number | 65 | 82.3%
varchar | 6 | 7.6%
na | 5 | 6.3%
테이블id | 1 | 1.3%
타입 | 1 | 1.3%
date | 1 | 1.3%

Unnamed: 4
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 6
Missing (%): 7.6%
Memory size: 764.0 B

Unnamed: 5
Boolean

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 50.0%
Missing: 77
Missing (%): 97.5%
Memory size: 290.0 B

Value | Count | Frequency (%)
False | 2 | 2.5%
(Missing) | 77 | 97.5%

Unnamed: 6
Text

MISSING 

Distinct: 4
Distinct (%): 80.0%
Missing: 74
Missing (%): 93.7%
Memory size: 764.0 B

Length

Max length: 5
Median length: 4
Mean length: 3.2
Min length: 2

Characters and Unicode

Total characters: 16
Distinct characters: 11
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 60.0%

Sample

1st row: 작성일 (date written)
2nd row: 테이블명 (table name)
3rd row: PK/FK
4th row: PK
5th row: PK

Value | Count | Frequency (%)
pk | 2 | 40.0%
작성일 | 1 | 20.0%
테이블명 | 1 | 20.0%
pk/fk | 1 | 20.0%

Most occurring characters

Value | Count | Frequency (%)
K | 4 | 25.0%
P | 3 | 18.8%
작 | 1 | 6.2%
성 | 1 | 6.2%
일 | 1 | 6.2%
테 | 1 | 6.2%
이 | 1 | 6.2%
블 | 1 | 6.2%
명 | 1 | 6.2%
/ | 1 | 6.2%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 8 | 50.0%
Other Letter | 7 | 43.8%
Other Punctuation | 1 | 6.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
작 | 1 | 14.3%
성 | 1 | 14.3%
일 | 1 | 14.3%
테 | 1 | 14.3%
이 | 1 | 14.3%
블 | 1 | 14.3%
명 | 1 | 14.3%

Uppercase Letter
Value | Count | Frequency (%)
K | 4 | 50.0%
P | 3 | 37.5%
F | 1 | 12.5%

Other Punctuation
Value | Count | Frequency (%)
/ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 8 | 50.0%
Hangul | 7 | 43.8%
Common | 1 | 6.2%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
작 | 1 | 14.3%
성 | 1 | 14.3%
일 | 1 | 14.3%
테 | 1 | 14.3%
이 | 1 | 14.3%
블 | 1 | 14.3%
명 | 1 | 14.3%

Latin
Value | Count | Frequency (%)
K | 4 | 50.0%
P | 3 | 37.5%
F | 1 | 12.5%

Common
Value | Count | Frequency (%)
/ | 1 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 9 | 56.2%
Hangul | 7 | 43.8%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
K | 4 | 44.4%
P | 3 | 33.3%
/ | 1 | 11.1%
F | 1 | 11.1%

Hangul
Value | Count | Frequency (%)
작 | 1 | 14.3%
성 | 1 | 14.3%
일 | 1 | 14.3%
테 | 1 | 14.3%
이 | 1 | 14.3%
블 | 1 | 14.3%
명 | 1 | 14.3%

Unnamed: 7
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 76
Missing (%): 96.2%
Memory size: 764.0 B

Unnamed: 8
Text

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 100.0%
Missing: 78
Missing (%): 98.7%
Memory size: 764.0 B

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 9
Distinct characters: 9
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 참조테이블명/비고 (referenced table name / remarks)

Value | Count | Frequency (%)
참조테이블명/비고 | 1 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
참 | 1 | 11.1%
조 | 1 | 11.1%
테 | 1 | 11.1%
이 | 1 | 11.1%
블 | 1 | 11.1%
명 | 1 | 11.1%
/ | 1 | 11.1%
비 | 1 | 11.1%
고 | 1 | 11.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 8 | 88.9%
Other Punctuation | 1 | 11.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
참 | 1 | 12.5%
조 | 1 | 12.5%
테 | 1 | 12.5%
이 | 1 | 12.5%
블 | 1 | 12.5%
명 | 1 | 12.5%
비 | 1 | 12.5%
고 | 1 | 12.5%

Other Punctuation
Value | Count | Frequency (%)
/ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 8 | 88.9%
Common | 1 | 11.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
참 | 1 | 12.5%
조 | 1 | 12.5%
테 | 1 | 12.5%
이 | 1 | 12.5%
블 | 1 | 12.5%
명 | 1 | 12.5%
비 | 1 | 12.5%
고 | 1 | 12.5%

Common
Value | Count | Frequency (%)
/ | 1 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 8 | 88.9%
ASCII | 1 | 11.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
참 | 1 | 12.5%
조 | 1 | 12.5%
테 | 1 | 12.5%
이 | 1 | 12.5%
블 | 1 | 12.5%
명 | 1 | 12.5%
비 | 1 | 12.5%
고 | 1 | 12.5%

ASCII
Value | Count | Frequency (%)
/ | 1 | 100.0%

Correlations

Variable | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 6
Unnamed: 1 | 1.000 | 1.000 | 1.000 | 1.000
Unnamed: 2 | 1.000 | 1.000 | 1.000 | 1.000
Unnamed: 3 | 1.000 | 1.000 | 1.000 | 1.000
Unnamed: 6 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
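
One common way to compute a nullity correlation is the Pearson correlation between the is-missing indicators of two columns. The sketch below assumes that definition. The helper `nullity_correlation` and the toy columns are illustrative and not taken from this dataset.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the is-missing indicators of two columns.

    Returns None when either column has no variation in missingness
    (fully present or fully missing), where the correlation is undefined.
    """
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / math.sqrt(var_a * var_b)


# Toy columns: x and y are missing in the same rows, z in the opposite rows.
col_x = ["a", None, "c", None]
col_y = [1, None, 3, None]
col_z = [None, 2, None, 4]
col_full = [1, 2, 3, 4]

print(nullity_correlation(col_x, col_y))     # missing together
print(nullity_correlation(col_x, col_z))     # missing in opposite rows
print(nullity_correlation(col_x, col_full))  # undefined: no missing values
```

A value near 1 means two columns tend to be missing in the same rows, and a value near -1 means one is present where the other is missing.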

Sample

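Row | 테이블정의서 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8
0 | 작성자 | <NA> | <NA> | <NA> | NaN | <NA> | 작성일 | 2019-09-04 00:00:00 | <NA>
1 | 주제영역명 | <NA> | <NA> | 테이블ID | Z_TMACS_T_W_BASE_TRF_CULT_IDX | <NA> | 테이블명 | 기초통계 교통문화지수 영역별 | <NA>
2 | 테이블설명 | <NA> | <NA> | <NA> | NaN | <NA> | <NA> | NaN | <NA>
3 | No | 컬럼ID | 컬럼명 | 타입 | 길이(Byte) | <NA> | PK/FK | Default | 참조테이블명/비고
4 | 1 | YEAR_CD | 년도_코드 | VARCHAR | 4 | N | PK | NaN | <NA>
5 | 2 | JIJACE_CD | 지자체_코드 | VARCHAR | 5 | N | PK | NaN | <NA>
6 | 3 | DRV_BHV_GRD | 운전_행태_점수 | NUMBER | 10,2 | <NA> | <NA> | NaN | <NA>
7 | 4 | TRF_SAF_GRD | 교통_안전_점수 | NUMBER | 10,2 | <NA> | <NA> | NaN | <NA>
8 | 5 | WK_BHV_GRD | 보행_행태_점수 | NUMBER | 10,2 | <NA> | <NA> | NaN | <NA>
9 | 6 | TRF_WKP_GRD | 교통_약자_점수 | NUMBER | 10,2 | <NA> | <NA> | NaN | <NA>

Row | 테이블정의서 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8
69 | 65 | CRSW_SMART_USERAT_AVG | <NA> | NUMBER | 10,2 | <NA> | <NA> | NaN | <NA>
70 | 65 | CRSW_SMART_USERAT_AVG_AVG | 횡단중 스마트기기 사용률_평균 | NUMBER | 10,2 | <NA> | <NA> | NaN | <NA>
71 | 65 | NOT_CRSW_SMART_USERAT_AVG_AVG | 횡단보도가 아닌 도로에서의 무단횡단 빈도_평균 | NUMBER | 10,2 | <NA> | <NA> | NaN | <NA>
72 | 65 | DRV_BHV_GRD_RK | 운전_행태_순위 | NUMBER | 10 | <NA> | <NA> | NaN | <NA>
73 | 65 | TRF_SAF_GRD_RK | 교통_안전_순위 | NUMBER | 10 | <NA> | <NA> | NaN | <NA>
74 | 65 | WK_BHV_GRD_RK | 보행_행태_순위 | NUMBER | 10 | <NA> | <NA> | NaN | <NA>
75 | 65 | TRF_WKP_GRDRK | 교통_약자_순위 | NUMBER | 10 | <NA> | <NA> | NaN | <NA>
76 | 인덱스명 | <NA> | 인덱스키 | <NA> | NaN | <NA> | <NA> | NaN | <NA>
77 | NaN | <NA> | BASE_TRF_CULT_IDX_PK | <NA> | NaN | <NA> | <NA> | NaN | <NA>
78 | 업무규칙 | <NA> | <NA> | <NA> | NaN | <NA> | <NA> | NaN | <NA>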
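
The sample suggests that the sheet is a table-definition form: the real column headers (No, 컬럼ID, 컬럼명, 타입, ...) sit in row 3, which would explain the `Unnamed: N` column names and the many missing cells. A minimal sketch of one possible cleanup follows. It assumes rows are available as plain lists with `None` for missing cells, and the helper `promote_header`, the `"No"` marker, and the `col_N` placeholder names are all choices made for this example.

```python
def promote_header(rows, marker="No"):
    """Use the first row whose first cell equals `marker` as the header.

    Returns (header, records); records keep only rows below the header
    whose first cell is a running number, dropping form metadata rows.
    """
    for i, row in enumerate(rows):
        if row and row[0] == marker:
            # Give header cells that are empty a placeholder name.
            header = [
                cell if cell is not None else f"col_{j}"
                for j, cell in enumerate(row)
            ]
            records = [
                dict(zip(header, r))
                for r in rows[i + 1:]
                if str(r[0]).isdigit()
            ]
            return header, records
    raise ValueError(f"no header row starting with {marker!r}")


# A few rows copied from the sample above (None stands for <NA>/NaN).
rows = [
    ["작성자", None, None, None, None, None, "작성일", "2019-09-04 00:00:00", None],
    ["No", "컬럼ID", "컬럼명", "타입", "길이(Byte)", None, "PK/FK", "Default", "참조테이블명/비고"],
    ["1", "YEAR_CD", "년도_코드", "VARCHAR", "4", "N", "PK", None, None],
    ["2", "JIJACE_CD", "지자체_코드", "VARCHAR", "5", "N", "PK", None, None],
    ["인덱스명", None, "인덱스키", None, None, None, None, None, None],
]
header, records = promote_header(rows)
print(header)
print(records[0]["컬럼ID"], records[0]["타입"])
```

Profiling the cleaned records instead of the raw sheet would likely remove most of the Missing and Unsupported alerts, although that has not been verified here.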

Duplicate rows

Most frequently occurring

Row | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 5 | Unnamed: 6 | Unnamed: 8 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2
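
The table compares rows on six columns only; the three unsupported columns do not appear in it. A rough sketch of that kind of duplicate count is shown below, with the helper `duplicate_summary` and the toy rows being assumptions made for illustration. Whether the tool counts duplicated groups or surplus copies is not determined here; both conventions give 1 for this dataset, where one all-missing row occurs twice.

```python
from collections import Counter


def duplicate_summary(rows):
    """Return (groups, surplus): rows seen more than once with their counts,
    and the number of copies beyond the first occurrence."""
    counts = Counter(tuple(row) for row in rows)
    groups = {row: n for row, n in counts.items() if n > 1}
    surplus = sum(n - 1 for n in groups.values())
    return groups, surplus


# Toy rows restricted to six compared columns (None stands for <NA>).
rows = [
    [None, None, None, None, None, None],
    ["YEAR_CD", "년도_코드", "VARCHAR", False, "PK", None],
    [None, None, None, None, None, None],
]
groups, surplus = duplicate_summary(rows)
print(len(groups), surplus)
```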