Overview

Dataset statistics

Number of variables: 19
Number of observations: 693
Missing cells: 631
Missing cells (%): 4.8%
Duplicate rows: 31
Duplicate rows (%): 4.5%
Total size in memory: 103.7 KiB
Average record size in memory: 153.2 B
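
The percentages and the average record size follow directly from the raw counts in this block. A minimal sketch of that arithmetic, using only the figures reported here (the variable names are illustrative and not part of the report):

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 19
n_observations = 693
missing_cells = 631
duplicate_rows = 31
total_size_kib = 103.7

# Missing cells are counted against every cell in the table (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Duplicate rows are counted against the number of rows.
duplicate_pct = 100 * duplicate_rows / n_observations

# Average record size is the total memory footprint divided by the row count.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

The three results round to the 4.8%, 4.5%, and 153.2 B shown in the report.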

Variable types

Categorical: 15
Text: 2
Boolean: 1
Numeric: 1

Dataset

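Description: Magnetic particle testing data produced by 국토안전관리원 (Korea Authority of Land and Infrastructure Safety). Facilities are classified as water supply and sewerage (상하수도) or bridges (교량), and the facility types are regional waterworks (광역상수도), road bridges (도로교량), and railway bridges (철도교량). The dataset provides the test interpretation (판독내용), weld type (용접형태), and inspection length (검사길이) for each test.
Author: 국토안전관리원
URL: https://www.data.go.kr/data/15110996/fileData.do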

Alerts

종별 has constant value "1종" [Constant]
탈자여부 has constant value "False" [Constant]
Dataset has 31 (4.5%) duplicate rows [Duplicates]
용접형태 is highly overall correlated with 검사길이(mm) and 9 other fields [High correlation]
통전법 is highly overall correlated with 검사길이(mm) and 11 other fields [High correlation]
전류타입 is highly overall correlated with 시설물종류 and 9 other fields [High correlation]
표면상태 is highly overall correlated with 검사길이(mm) and 13 other fields [High correlation]
극간거리(cm) is highly overall correlated with 시설물종류 and 10 other fields [High correlation]
시설물구분 is highly overall correlated with 시설물종류 and 9 other fields [High correlation]
판독내용 is highly overall correlated with 시설물구분 and 8 other fields [High correlation]
자화방법 is highly overall correlated with 시설물번호 and 7 other fields [High correlation]
시험일자 is highly overall correlated with 시설물구분 and 9 other fields [High correlation]
식별번호 is highly overall correlated with 검사길이(mm) and 11 other fields [High correlation]
시설물종류 is highly overall correlated with 시설물구분 and 11 other fields [High correlation]
자분형식 is highly overall correlated with 검사길이(mm) and 13 other fields [High correlation]
시설물번호 is highly overall correlated with 시설물구분 and 11 other fields [High correlation]
비고 is highly overall correlated with 검사길이(mm) and 11 other fields [High correlation]
검사길이(mm) is highly overall correlated with 통전법 and 5 other fields [High correlation]
자화방법 is highly imbalanced (81.1%) [Imbalance]
극간거리(cm) is highly imbalanced (54.6%) [Imbalance]
자분형식 is highly imbalanced (84.2%) [Imbalance]
표면상태 is highly imbalanced (84.2%) [Imbalance]
판독내용 is highly imbalanced (58.6%) [Imbalance]
비고 is highly imbalanced (73.8%) [Imbalance]
탈자여부 has 16 (2.3%) missing values [Missing]
결함 길이(mm) has 551 (79.5%) missing values [Missing]
검사길이(mm) has 64 (9.2%) missing values [Missing]

Reproduction

Analysis started: 2023-12-12 06:23:30.231163
Analysis finished: 2023-12-12 06:23:33.036630
Duration: 2.81 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시설물구분
Categorical

HIGH CORRELATION 

Distinct2
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size5.5 KiB
교량
350 
상하수도
343 

Length

Max length4
Median length2
Mean length2.989899
Min length2

Unique

Unique0
Unique (%)0.0%

Sample

1st row상하수도
2nd row상하수도
3rd row상하수도
4th row상하수도
5th row상하수도

Common Values

ValueCountFrequency (%)
교량 350
50.5%
상하수도 343
49.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
교량 350
50.5%
상하수도 343
49.5%

시설물종류
Categorical

HIGH CORRELATION 

Distinct3
Distinct (%)0.4%
Missing0
Missing (%)0.0%
Memory size5.5 KiB
광역상수도
343 
철도교량
201 
도로교량
149 

Length

Max length5
Median length4
Mean length4.4949495
Min length4

Unique

Unique0
Unique (%)0.0%

Sample

1st row광역상수도
2nd row광역상수도
3rd row광역상수도
4th row광역상수도
5th row광역상수도

Common Values

ValueCountFrequency (%)
광역상수도 343
49.5%
철도교량 201
29.0%
도로교량 149
21.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
광역상수도 343
49.5%
철도교량 201
29.0%
도로교량 149
21.5%

종별
Categorical

CONSTANT 

Distinct1
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size5.5 KiB
1종
693 

Length

Max length2
Median length2
Mean length2
Min length2

Unique

Unique0
Unique (%)0.0%

Sample

1st row1종
2nd row1종
3rd row1종
4th row1종
5th row1종

Common Values

ValueCountFrequency (%)
1종 693
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1종 693
100.0%

시설물번호
Categorical

HIGH CORRELATION 

Distinct17
Distinct (%)2.5%
Missing0
Missing (%)0.0%
Memory size5.5 KiB
BR0017
164 
WS0003
107 
BR0015
90 
WS0002
78 
BR0007
53 
Other values (12)
201 

Length

Max length6
Median length6
Mean length6
Min length6

Unique

Unique1
Unique (%)0.1%

Sample

1st rowWS0002
2nd rowWS0002
3rd rowWS0002
4th rowWS0002
5th rowWS0002

Common Values

ValueCountFrequency (%)
BR0017 164
23.7%
WS0003 107
15.4%
BR0015 90
13.0%
WS0002 78
11.3%
BR0007 53
 
7.6%
WS0008 51
 
7.4%
WS0001 42
 
6.1%
WS0009 31
 
4.5%
WS0005 22
 
3.2%
BR0002 14
 
2.0%
Other values (7) 41
 
5.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
br0017 164
23.7%
ws0003 107
15.4%
br0015 90
13.0%
ws0002 78
11.3%
br0007 53
 
7.6%
ws0008 51
 
7.4%
ws0001 42
 
6.1%
ws0009 31
 
4.5%
ws0005 22
 
3.2%
br0002 14
 
2.0%
Other values (7) 41
 
5.9%

세부위치
Text

Distinct626
Distinct (%)90.3%
Missing0
Missing (%)0.0%
Memory size5.5 KiB

Length

Max length38
Median length32
Mean length22.167388
Min length8

Characters and Unicode

Total characters15362
Distinct characters135
Distinct categories10
Distinct scripts3
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique559
Unique (%)80.7%

Sample

1st row1단계 도수 01-01(01) 80+11.43 A1
2nd row1단계 도수 01-01(01) 80+11.43 A2
3rd row1단계 충무 02-02(05) 164+3.79 A1
4th row1단계 충무 02-02(05) 164+3.79 A2
5th row1단계 충무 02-03(02) 247+26 A1
ValueCountFrequency (%)
a1 129
 
4.0%
r 121
 
3.8%
l 111
 
3.4%
a2 101
 
3.1%
1단계 75
 
2.3%
충무 71
 
2.2%
a1측 66
 
2.0%
a2측 62
 
1.9%
w2 48
 
1.5%
s(남측 48
 
1.5%
Other values (640) 2394
74.2%

Most occurring characters

ValueCountFrequency (%)
2537
 
16.5%
1 1418
 
9.2%
2 933
 
6.1%
0 933
 
6.1%
- 787
 
5.1%
3 679
 
4.4%
A 495
 
3.2%
) 436
 
2.8%
( 436
 
2.8%
4 349
 
2.3%
Other values (125) 6359
41.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5392
35.1%
Uppercase Letter 2595
16.9%
Space Separator 2537
16.5%
Other Letter 2468
16.1%
Dash Punctuation 787
 
5.1%
Close Punctuation 436
 
2.8%
Open Punctuation 436
 
2.8%
Math Symbol 419
 
2.7%
Other Punctuation 208
 
1.4%
Lowercase Letter 84
 
0.5%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
251
 
10.2%
218
 
8.8%
125
 
5.1%
112
 
4.5%
91
 
3.7%
78
 
3.2%
76
 
3.1%
72
 
2.9%
71
 
2.9%
71
 
2.9%
Other values (79) 1303
52.8%
Uppercase Letter
ValueCountFrequency (%)
A 495
19.1%
S 284
10.9%
D 256
9.9%
R 224
8.6%
T 195
 
7.5%
L 182
 
7.0%
U 156
 
6.0%
C 149
 
5.7%
B 138
 
5.3%
G 108
 
4.2%
Other values (12) 408
15.7%
Decimal Number
ValueCountFrequency (%)
1 1418
26.3%
2 933
17.3%
0 933
17.3%
3 679
12.6%
4 349
 
6.5%
5 308
 
5.7%
6 241
 
4.5%
8 211
 
3.9%
7 170
 
3.2%
9 150
 
2.8%
Lowercase Letter
ValueCountFrequency (%)
b 28
33.3%
i 24
28.6%
m 12
14.3%
t 11
 
13.1%
h 5
 
6.0%
e 4
 
4.8%
Math Symbol
ValueCountFrequency (%)
+ 321
76.6%
~ 98
 
23.4%
Other Punctuation
ValueCountFrequency (%)
. 200
96.2%
/ 8
 
3.8%
Space Separator
ValueCountFrequency (%)
2537
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 787
100.0%
Close Punctuation
ValueCountFrequency (%)
) 436
100.0%
Open Punctuation
ValueCountFrequency (%)
( 436
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 10215
66.5%
Latin 2679
 
17.4%
Hangul 2468
 
16.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
251
 
10.2%
218
 
8.8%
125
 
5.1%
112
 
4.5%
91
 
3.7%
78
 
3.2%
76
 
3.1%
72
 
2.9%
71
 
2.9%
71
 
2.9%
Other values (79) 1303
52.8%
Latin
ValueCountFrequency (%)
A 495
18.5%
S 284
10.6%
D 256
9.6%
R 224
8.4%
T 195
 
7.3%
L 182
 
6.8%
U 156
 
5.8%
C 149
 
5.6%
B 138
 
5.2%
G 108
 
4.0%
Other values (18) 492
18.4%
Common
ValueCountFrequency (%)
2537
24.8%
1 1418
13.9%
2 933
 
9.1%
0 933
 
9.1%
- 787
 
7.7%
3 679
 
6.6%
) 436
 
4.3%
( 436
 
4.3%
4 349
 
3.4%
+ 321
 
3.1%
Other values (8) 1386
13.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12894
83.9%
Hangul 2468
 
16.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2537
19.7%
1 1418
 
11.0%
2 933
 
7.2%
0 933
 
7.2%
- 787
 
6.1%
3 679
 
5.3%
A 495
 
3.8%
) 436
 
3.4%
( 436
 
3.4%
4 349
 
2.7%
Other values (36) 3891
30.2%
Hangul
ValueCountFrequency (%)
251
 
10.2%
218
 
8.8%
125
 
5.1%
112
 
4.5%
91
 
3.7%
78
 
3.2%
76
 
3.1%
72
 
2.9%
71
 
2.9%
71
 
2.9%
Other values (79) 1303
52.8%

자화방법
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct3
Distinct (%)0.4%
Missing0
Missing (%)0.0%
Memory size5.5 KiB
YOKE
663 
<NA>
 
16
YOKE
 
14

Length

Max length5
Median length4
Mean length4.020202
Min length4

Unique

Unique0
Unique (%)0.0%

Sample

1st rowYOKE
2nd rowYOKE
3rd rowYOKE
4th rowYOKE
5th rowYOKE

Common Values

ValueCountFrequency (%)
YOKE 663
95.7%
<NA> 16
 
2.3%
YOKE 14
 
2.0%
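
The table above lists `YOKE` twice (663 and 14 rows), and the reported mean length of 4.020202 is what results if 14 values are five characters long while `YOKE` and the literal string `<NA>` are four. One plausible reading is that those 14 values carry a trailing space; this is an assumption, since the report does not show the extra character. A minimal sketch of that check and of a normalization step (the helper `normalize_label` and the reconstructed `raw_values` list are hypothetical):

```python
from collections import Counter

# Reconstructed from the counts above; the trailing space is an assumption.
raw_values = ["YOKE"] * 663 + ["<NA>"] * 16 + ["YOKE "] * 14

# Mean string length over all 693 rows, as the report computes it.
mean_length = sum(len(v) for v in raw_values) / len(raw_values)


def normalize_label(value):
    """Strip surrounding whitespace and map the literal '<NA>' to None."""
    text = value.strip()
    return None if text == "<NA>" else text


counts = Counter(normalize_label(v) for v in raw_values)
print(round(mean_length, 6))
print(counts)
```

After normalization there is a single `YOKE` category with 677 rows and 16 missing values, so the variable would be constant apart from the gaps.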

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
yoke 677
97.7%
na 16
 
2.3%

통전법
Categorical

HIGH CORRELATION 

Distinct2
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size5.5 KiB
<NA>
590 
연속법
103 

Length

Max length4
Median length4
Mean length3.8513709
Min length3

Unique

Unique0
Unique (%)0.0%

Sample

1st row<NA>
2nd row<NA>
3rd row<NA>
4th row<NA>
5th row<NA>

Common Values

ValueCountFrequency (%)
<NA> 590
85.1%
연속법 103
 
14.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
na 590
85.1%
연속법 103
 
14.9%

탈자여부
Boolean

CONSTANT  MISSING 

Distinct1
Distinct (%)0.1%
Missing16
Missing (%)2.3%
Memory size1.5 KiB
False
677 
(Missing)
 
16
ValueCountFrequency (%)
False 677
97.7%
(Missing) 16
 
2.3%

시험일자
Categorical

HIGH CORRELATION 

Distinct12
Distinct (%)1.7%
Missing0
Missing (%)0.0%
Memory size5.5 KiB
315 
2020-12
184 
2013-04
53 
2010-06
50 
2010-10
42 
Other values (7)
49 

Length

Max length8
Median length8
Mean length4.8181818
Min length1

Unique

Unique1
Unique (%)0.1%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

ValueCountFrequency (%)
315
45.5%
2020-12 184
26.6%
2013-04 53
 
7.6%
2010-06 50
 
7.2%
2010-10 42
 
6.1%
2015-12 24
 
3.5%
2013-12 9
 
1.3%
2019-11 6
 
0.9%
2015-09 5
 
0.7%
2019-12 2
 
0.3%
Other values (2) 3
 
0.4%
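
The most common value (315 rows, 45.5%) renders as blank, yet the variable reports 0 missing values, so blank test dates appear to be stored as strings rather than as nulls. The mean length of 4.8181818 is consistent with 315 one-character blanks and 378 eight-character values, one character longer than the `YYYY-MM` text shown. A minimal sketch of a parser under the assumption that the extra character is whitespace (the function name `parse_test_month` is hypothetical):

```python
from datetime import date, datetime


def parse_test_month(value):
    """Parse a 'YYYY-MM' string into the first day of that month.

    Blank or whitespace-only values are treated as missing and return None.
    """
    text = value.strip()
    if not text:
        return None
    return datetime.strptime(text, "%Y-%m").date()


# 315 one-character blanks plus 378 eight-character values over 693 rows.
mean_length = (315 * 1 + 378 * 8) / 693

print(round(mean_length, 7))
print(parse_test_month("2020-12 "))
print(parse_test_month(" "))
```

Treating the blanks as missing would raise the missing share of 시험일자 from 0.0% to 45.5%.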

Length

Histogram of lengths of the category
ValueCountFrequency (%)
2020-12 184
48.7%
2013-04 53
 
14.0%
2010-06 50
 
13.2%
2010-10 42
 
11.1%
2015-12 24
 
6.3%
2013-12 9
 
2.4%
2019-11 6
 
1.6%
2015-09 5
 
1.3%
2019-12 2
 
0.5%
2011-09 2
 
0.5%

전류타입
Categorical

HIGH CORRELATION 

Distinct4
Distinct (%)0.6%
Missing0
Missing (%)0.0%
Memory size5.5 KiB
교류
473 
직류
154 
AC 교류
50 
<NA>
 
16

Length

Max length5
Median length2
Mean length2.2626263
Min length2

Unique

Unique0
Unique (%)0.0%

Sample

1st row교류
2nd row교류
3rd row교류
4th row교류
5th row교류

Common Values

ValueCountFrequency (%)
교류 473
68.3%
직류 154
 
22.2%
AC 교류 50
 
7.2%
<NA> 16
 
2.3%
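
`교류` (alternating current) and `AC 교류` look like two spellings of the same current type. If that reading is right, which is an assumption the report itself does not confirm, the labels could be merged before analysis. A minimal sketch with a hypothetical alias table:

```python
from collections import Counter

# Hypothetical alias table: treats "AC 교류" as a spelling variant of "교류".
ALIASES = {"AC 교류": "교류"}


def canonical_current_type(value):
    """Return the canonical label, or None for the literal '<NA>' string."""
    text = value.strip()
    if text == "<NA>":
        return None
    return ALIASES.get(text, text)


# Counts copied from the Common Values table above.
raw_counts = {"교류": 473, "직류": 154, "AC 교류": 50, "<NA>": 16}

merged = Counter()
for label, count in raw_counts.items():
    merged[canonical_current_type(label)] += count

print(merged)
```

With the alias applied, 교류 covers 523 rows, 직류 154, and 16 rows are missing.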

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
교류 523
70.4%
직류 154
 
20.7%
ac 50
 
6.7%
na 16
 
2.2%

극간거리(cm)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct4
Distinct (%)0.6%
Missing0
Missing (%)0.0%
Memory size5.5 KiB
10.16~17.78
574 
7.62~15.24
 
53
10.16~15.24
 
50
<NA>
 
16

Length

Max length11
Median length11
Mean length10.761905
Min length4

Unique

Unique0
Unique (%)0.0%

Sample

1st row10.16~17.78
2nd row10.16~17.78
3rd row10.16~17.78
4th row10.16~17.78
5th row10.16~17.78

Common Values

ValueCountFrequency (%)
10.16~17.78 574
82.8%
7.62~15.24 53
 
7.6%
10.16~15.24 50
 
7.2%
<NA> 16
 
2.3%
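
The pole spacing is stored as a text range of the form `low~high` in centimetres, which is why the profiler types it as categorical rather than numeric. A minimal sketch of splitting each label into numeric bounds (the helper names are hypothetical); the inch conversion is included only because the bounds are whole multiples of 2.54 cm:

```python
def parse_pole_spacing(value):
    """Split a 'low~high' string into a (low, high) tuple of floats in cm.

    Returns None for the literal '<NA>' placeholder.
    """
    text = value.strip()
    if text == "<NA>":
        return None
    low, high = text.split("~")
    return float(low), float(high)


def cm_to_inches(cm):
    """Convert centimetres to inches, rounded to two decimals."""
    return round(cm / 2.54, 2)


for label in ["10.16~17.78", "7.62~15.24", "10.16~15.24", "<NA>"]:
    bounds = parse_pole_spacing(label)
    inches = None if bounds is None else tuple(cm_to_inches(b) for b in bounds)
    print(label, bounds, inches)
```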

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
10.16~17.78 574
82.8%
7.62~15.24 53
 
7.6%
10.16~15.24 50
 
7.2%
na 16
 
2.3%

자분형식
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size5.5 KiB
습식
677 
<NA>
 
16

Length

Max length4
Median length2
Mean length2.046176
Min length2

Unique

Unique0
Unique (%)0.0%

Sample

1st row습식
2nd row습식
3rd row습식
4th row습식
5th row습식

Common Values

ValueCountFrequency (%)
습식 677
97.7%
<NA> 16
 
2.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
습식 677
97.7%
na 16
 
2.3%

표면상태
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size5.5 KiB
AS WELDED
677 
<NA>
 
16

Length

Max length9
Median length9
Mean length8.8845599
Min length4

Unique

Unique0
Unique (%)0.0%

Sample

1st rowAS WELDED
2nd rowAS WELDED
3rd rowAS WELDED
4th rowAS WELDED
5th rowAS WELDED

Common Values

ValueCountFrequency (%)
AS WELDED 677
97.7%
<NA> 16
 
2.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
as 677
49.4%
welded 677
49.4%
na 16
 
1.2%

식별번호
Categorical

HIGH CORRELATION 

Distinct2
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size5.5 KiB
<NA>
494 
MP3123
199 

Length

Max length6
Median length4
Mean length4.5743146
Min length4

Unique

Unique0
Unique (%)0.0%

Sample

1st row<NA>
2nd row<NA>
3rd row<NA>
4th row<NA>
5th row<NA>

Common Values

ValueCountFrequency (%)
<NA> 494
71.3%
MP3123 199
28.7%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
na 494
71.3%
mp3123 199
28.7%

판독내용
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct11
Distinct (%)1.6%
Missing0
Missing (%)0.0%
Memory size5.5 KiB
결함사항 없음
394 
<NA>
247 
원형상 지시
 
26
용접 누락
 
13
원형, 선형상 지시
 
6
Other values (6)
 
7

Length

Max length29
Median length7
Mean length5.9336219
Min length3

Unique

Unique5
Unique (%)0.7%

Sample

1st row결함사항 없음
2nd row결함사항 없음
3rd row결함사항 없음
4th row결함사항 없음
5th row결함사항 없음

Common Values

ValueCountFrequency (%)
결함사항 없음 394
56.9%
<NA> 247
35.6%
원형상 지시 26
 
3.8%
용접 누락 13
 
1.9%
원형, 선형상 지시 6
 
0.9%
선상 지시 2
 
0.3%
결함번호:1, 종류:균열, X:73, Y:-, L:3 1
 
0.1%
원형상 지시, 용접 누락 1
 
0.1%
NO RECORDABLE INDICATION 1
 
0.1%
선형상 지시 1
 
0.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
결함사항 394
34.3%
없음 394
34.3%
na 247
21.5%
지시 36
 
3.1%
원형상 27
 
2.3%
용접 14
 
1.2%
누락 14
 
1.2%
선형상 7
 
0.6%
원형 6
 
0.5%
선상 2
 
0.2%
Other values (9) 9
 
0.8%

용접형태
Categorical

HIGH CORRELATION 

Distinct7
Distinct (%)1.0%
Missing0
Missing (%)0.0%
Memory size5.5 KiB
<NA>
350 
겹치기
136 
겹치기(외면)
128 
맞대기
 
31
맞대기(외면)
 
23
Other values (2)
 
25

Length

Max length7
Median length4
Mean length4.3694084
Min length2

Unique

Unique0
Unique (%)0.0%

Sample

1st row겹치기
2nd row겹치기
3rd row겹치기
4th row겹치기
5th row겹치기

Common Values

ValueCountFrequency (%)
<NA> 350
50.5%
겹치기 136
 
19.6%
겹치기(외면) 128
 
18.5%
맞대기 31
 
4.5%
맞대기(외면) 23
 
3.3%
필렛 21
 
3.0%
겹치기(내면) 4
 
0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
na 350
50.5%
겹치기 136
 
19.6%
겹치기(외면 128
 
18.5%
맞대기 31
 
4.5%
맞대기(외면 23
 
3.3%
필렛 21
 
3.0%
겹치기(내면 4
 
0.6%

결함 길이(mm)
Text

MISSING 

Distinct65
Distinct (%)45.8%
Missing551
Missing (%)79.5%
Memory size5.5 KiB

Length

Max length10
Median length2
Mean length2.2957746
Min length1

Characters and Unicode

Total characters326
Distinct characters13
Distinct categories4
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique42
Unique (%)29.6%

Sample

1st row150
2nd row10
3rd row50
4th row11
5th row0
ValueCountFrequency (%)
0 19
 
12.6%
10 10
 
6.6%
30 9
 
6.0%
20 9
 
6.0%
50 7
 
4.6%
15 6
 
4.0%
9 6
 
4.0%
40 5
 
3.3%
12 5
 
3.3%
25 4
 
2.6%
Other values (47) 71
47.0%
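
This variable is typed as Text rather than numeric. The character statistics for it report commas, spaces, and plus signs alongside the digits, which suggests that some cells hold more than one length. The exact cell formats are not shown in the report, so the multi-value strings below are hypothetical. A minimal sketch that extracts every integer from a cell (the function name `extract_lengths` is an assumption):

```python
import re


def extract_lengths(cell):
    """Return every integer found in a cell as a list; None stays None."""
    if cell is None:
        return None
    return [int(token) for token in re.findall(r"\d+", cell)]


# "150" and "0" appear in the sample rows; the multi-value strings are
# hypothetical illustrations of comma-separated and plus-joined cells.
examples = ["150", "0", "10, 20", "5+3", None]
for cell in examples:
    print(repr(cell), "->", extract_lengths(cell))
```

Whether several lengths in one cell should be summed, kept separate, or split into rows depends on how the inspectors recorded them, which the report does not say.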

Most occurring characters

ValueCountFrequency (%)
0 93
28.5%
1 54
16.6%
5 36
 
11.0%
2 35
 
10.7%
3 31
 
9.5%
4 19
 
5.8%
6 10
 
3.1%
9 9
 
2.8%
, 9
 
2.8%
9
 
2.8%
Other values (3) 21
 
6.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 301
92.3%
Other Punctuation 9
 
2.8%
Space Separator 9
 
2.8%
Math Symbol 7
 
2.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 93
30.9%
1 54
17.9%
5 36
 
12.0%
2 35
 
11.6%
3 31
 
10.3%
4 19
 
6.3%
6 10
 
3.3%
9 9
 
3.0%
8 8
 
2.7%
7 6
 
2.0%
Other Punctuation
ValueCountFrequency (%)
, 9
100.0%
Space Separator
ValueCountFrequency (%)
9
100.0%
Math Symbol
ValueCountFrequency (%)
+ 7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 326
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 93
28.5%
1 54
16.6%
5 36
 
11.0%
2 35
 
10.7%
3 31
 
9.5%
4 19
 
5.8%
6 10
 
3.1%
9 9
 
2.8%
, 9
 
2.8%
9
 
2.8%
Other values (3) 21
 
6.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 326
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 93
28.5%
1 54
16.6%
5 36
 
11.0%
2 35
 
10.7%
3 31
 
9.5%
4 19
 
5.8%
6 10
 
3.1%
9 9
 
2.8%
, 9
 
2.8%
9
 
2.8%
Other values (3) 21
 
6.4%

검사길이(mm)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct26
Distinct (%)4.1%
Missing64
Missing (%)9.2%
Infinite0
Infinite (%)0.0%
Mean291.47854
Minimum80
Maximum6910
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size6.2 KiB

Quantile statistics

Minimum80
5-th percentile100
Q1200
median250
Q3250
95-th percentile900
Maximum6910
Range6830
Interquartile range (IQR)50

Descriptive statistics

Standard deviation328.32116
Coefficient of variation (CV)1.1263991
Kurtosis263.67484
Mean291.47854
Median Absolute Deviation (MAD)0
Skewness13.651748
Sum183340
Variance107794.78
MonotonicityNot monotonic
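
The derived statistics for 검사길이(mm) can be cross-checked from the reported totals: the mean is the sum divided by the number of non-missing rows, the coefficient of variation is the standard deviation divided by the mean, and the variance is the standard deviation squared. A minimal sketch using only figures from this section (variable names are illustrative):

```python
# Figures copied from the report for 검사길이(mm).
n_observations = 693
n_missing = 64
total = 183340            # "Sum"
std_dev = 328.32116       # "Standard deviation"
q1, q3 = 200, 250         # "Q1" and "Q3"
minimum, maximum = 80, 6910

# Statistics are computed over non-missing rows only.
n_valid = n_observations - n_missing
mean = total / n_valid
cv = std_dev / mean
variance = std_dev ** 2
iqr = q3 - q1
value_range = maximum - minimum

print(n_valid, round(mean, 5), round(cv, 7), round(variance, 2), iqr, value_range)
```

An IQR of 50 next to a range of 6830 and a skewness above 13 points to a few very long inspections, including the single 6910 mm record, dominating the spread.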
Histogram with fixed size bins (bins=26)
ValueCountFrequency (%)
250 325
46.9%
200 111
 
16.0%
180 68
 
9.8%
100 32
 
4.6%
900 22
 
3.2%
1000 17
 
2.5%
300 14
 
2.0%
150 8
 
1.2%
400 5
 
0.7%
600 3
 
0.4%
Other values (16) 24
 
3.5%
(Missing) 64
 
9.2%
ValueCountFrequency (%)
80 1
 
0.1%
100 32
 
4.6%
120 2
 
0.3%
150 8
 
1.2%
160 2
 
0.3%
180 68
 
9.8%
200 111
 
16.0%
230 1
 
0.1%
250 325
46.9%
270 1
 
0.1%
ValueCountFrequency (%)
6910 1
 
0.1%
1200 1
 
0.1%
1000 17
2.5%
950 1
 
0.1%
900 22
3.2%
850 2
 
0.3%
820 1
 
0.1%
800 2
 
0.3%
700 1
 
0.1%
600 3
 
0.4%

비고
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct20
Distinct (%)2.9%
Missing0
Missing (%)0.0%
Memory size5.5 KiB
<NA>
573 
2회
68 
결함없음
 
9
피로균열없음
 
5
제1경간
 
4
Other values (15)
 
34

Length

Max length9
Median length4
Mean length3.8975469
Min length2

Unique

Unique2
Unique (%)0.3%

Sample

1st row<NA>
2nd row<NA>
3rd row<NA>
4th row<NA>
5th row<NA>

Common Values

ValueCountFrequency (%)
<NA> 573
82.7%
2회 68
 
9.8%
결함없음 9
 
1.3%
피로균열없음 5
 
0.7%
제1경간 4
 
0.6%
A TRUSS 4
 
0.6%
C TRUSS 4
 
0.6%
과년도검사 3
 
0.4%
신규 3
 
0.4%
TRUSS B.G 2
 
0.3%
Other values (10) 18
 
2.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
na 573
80.4%
2회 68
 
9.5%
truss 18
 
2.5%
결함없음 9
 
1.3%
피로균열없음 5
 
0.7%
제1경간 4
 
0.6%
a 4
 
0.6%
c 4
 
0.6%
과년도검사 3
 
0.4%
신규 3
 
0.4%
Other values (12) 22
 
3.1%

Interactions


Correlations

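 | 시설물구분 | 시설물종류 | 시설물번호 | 자화방법 | 시험일자 | 전류타입 | 극간거리(cm) | 판독내용 | 용접형태 | 결함 길이(mm) | 검사길이(mm) | 비고
시설물구분 | 1.000 | 1.000 | 1.000 | 0.205 | 0.997 | 0.250 | 0.262 | 0.813 | NaN | NaN | 0.015 | NaN
시설물종류 | 1.000 | 1.000 | 1.000 | 0.131 | 0.999 | 0.871 | 0.894 | 0.813 | NaN | 0.910 | 0.536 | 1.000
시설물번호 | 1.000 | 1.000 | 1.000 | 1.000 | 0.980 | 0.946 | 0.956 | 0.612 | 0.703 | 0.956 | 0.688 | 1.000
자화방법 | 0.205 | 0.131 | 1.000 | 1.000 | 0.261 | 0.159 | 0.017 | NaN | NaN | 0.875 | 0.000 | NaN
시험일자 | 0.997 | 0.999 | 0.980 | 0.261 | 1.000 | 0.989 | 1.000 | 0.815 | 0.536 | 0.943 | 0.291 | 1.000
전류타입 | 0.250 | 0.871 | 0.946 | 0.159 | 0.989 | 1.000 | 0.974 | 0.628 | 0.885 | 0.881 | 0.244 | 1.000
극간거리(cm) | 0.262 | 0.894 | 0.956 | 0.017 | 1.000 | 0.974 | 1.000 | 0.795 | NaN | 0.575 | 0.099 | 1.000
판독내용 | 0.813 | 0.813 | 0.612 | NaN | 0.815 | 0.628 | 0.795 | 1.000 | 0.244 | 0.512 | 0.234 | NaN
용접형태 | NaN | NaN | 0.703 | NaN | 0.536 | 0.885 | NaN | 0.244 | 1.000 | NaN | 0.969 | NaN
결함 길이(mm) | NaN | 0.910 | 0.956 | 0.875 | 0.943 | 0.881 | 0.575 | 0.512 | NaN | 1.000 | 0.950 | 1.000
검사길이(mm) | 0.015 | 0.536 | 0.688 | 0.000 | 0.291 | 0.244 | 0.099 | 0.234 | 0.969 | 0.950 | 1.000 | NaN
비고 | NaN | 1.000 | 1.000 | NaN | 1.000 | 1.000 | 1.000 | NaN | NaN | 1.000 | NaN | 1.000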
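 | 용접형태 | 통전법 | 전류타입 | 표면상태 | 극간거리(cm) | 시설물구분 | 판독내용 | 자화방법 | 시험일자 | 식별번호 | 시설물종류 | 자분형식 | 시설물번호 | 비고
용접형태 | 1.000 | NaN | 0.695 | 1.000 | 1.000 | 1.000 | 0.174 | 1.000 | 0.259 | 1.000 | 1.000 | 1.000 | 0.515 | NaN
통전법 | NaN | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | NaN | 1.000 | 1.000 | 1.000 | 1.000
전류타입 | 0.695 | 1.000 | 1.000 | 1.000 | 0.800 | 0.407 | 0.470 | 0.262 | 0.866 | 1.000 | 0.565 | 1.000 | 0.874 | 0.929
표면상태 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
극간거리(cm) | 1.000 | 1.000 | 0.800 | 1.000 | 1.000 | 0.426 | 0.679 | 0.029 | 0.993 | 1.000 | 0.604 | 1.000 | 0.896 | 0.925
시설물구분 | 1.000 | 1.000 | 0.407 | 1.000 | 0.426 | 1.000 | 0.641 | 0.131 | 0.947 | 1.000 | 0.999 | 1.000 | 0.989 | 1.000
판독내용 | 0.174 | 1.000 | 0.470 | 1.000 | 0.679 | 0.641 | 1.000 | 1.000 | 0.470 | 1.000 | 0.641 | 1.000 | 0.334 | 1.000
자화방법 | 1.000 | 1.000 | 0.262 | 1.000 | 0.029 | 0.131 | 1.000 | 1.000 | 0.201 | 1.000 | 0.217 | 1.000 | 0.989 | 1.000
시험일자 | 0.259 | 1.000 | 0.866 | 1.000 | 0.993 | 0.947 | 0.470 | 0.201 | 1.000 | 1.000 | 0.956 | 1.000 | 0.888 | 0.945
식별번호 | 1.000 | NaN | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | NaN
시설물종류 | 1.000 | 1.000 | 0.565 | 1.000 | 0.604 | 0.999 | 0.641 | 0.217 | 0.956 | 1.000 | 1.000 | 1.000 | 0.990 | 0.925
자분형식 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
시설물번호 | 0.515 | 1.000 | 0.874 | 1.000 | 0.896 | 0.989 | 0.334 | 0.989 | 0.888 | 1.000 | 0.990 | 1.000 | 1.000 | 0.941
비고 | NaN | 1.000 | 0.929 | 1.000 | 0.925 | 1.000 | 1.000 | 1.000 | 0.945 | NaN | 0.925 | 1.000 | 0.941 | 1.000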
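 | 검사길이(mm) | 시설물구분 | 시설물종류 | 시설물번호 | 자화방법 | 통전법 | 시험일자 | 전류타입 | 극간거리(cm) | 자분형식 | 표면상태 | 식별번호 | 판독내용 | 용접형태 | 비고
검사길이(mm) | 1.000 | 0.025 | 0.226 | 0.411 | 0.000 | 1.000 | 0.133 | 0.397 | 0.163 | 1.000 | 1.000 | 1.000 | 0.142 | 0.783 | 1.000
시설물구분 | 0.025 | 1.000 | 0.999 | 0.989 | 0.131 | 1.000 | 0.947 | 0.407 | 0.426 | 1.000 | 1.000 | 1.000 | 0.641 | 1.000 | 1.000
시설물종류 | 0.226 | 0.999 | 1.000 | 0.990 | 0.217 | 1.000 | 0.956 | 0.565 | 0.604 | 1.000 | 1.000 | 1.000 | 0.641 | 1.000 | 0.925
시설물번호 | 0.411 | 0.989 | 0.990 | 1.000 | 0.989 | 1.000 | 0.888 | 0.874 | 0.896 | 1.000 | 1.000 | 1.000 | 0.334 | 0.515 | 0.941
자화방법 | 0.000 | 0.131 | 0.217 | 0.989 | 1.000 | 1.000 | 0.201 | 0.262 | 0.029 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
통전법 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 1.000 | 0.000 | 1.000
시험일자 | 0.133 | 0.947 | 0.956 | 0.888 | 0.201 | 1.000 | 1.000 | 0.866 | 0.993 | 1.000 | 1.000 | 1.000 | 0.470 | 0.259 | 0.945
전류타입 | 0.397 | 0.407 | 0.565 | 0.874 | 0.262 | 1.000 | 0.866 | 1.000 | 0.800 | 1.000 | 1.000 | 1.000 | 0.470 | 0.695 | 0.929
극간거리(cm) | 0.163 | 0.426 | 0.604 | 0.896 | 0.029 | 1.000 | 0.993 | 0.800 | 1.000 | 1.000 | 1.000 | 1.000 | 0.679 | 1.000 | 0.925
자분형식 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
표면상태 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
식별번호 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000
판독내용 | 0.142 | 0.641 | 0.641 | 0.334 | 1.000 | 1.000 | 0.470 | 0.470 | 0.679 | 1.000 | 1.000 | 1.000 | 1.000 | 0.174 | 1.000
용접형태 | 0.783 | 1.000 | 1.000 | 0.515 | 1.000 | 0.000 | 0.259 | 0.695 | 1.000 | 1.000 | 1.000 | 1.000 | 0.174 | 1.000 | 0.000
비고 | 1.000 | 1.000 | 0.925 | 0.941 | 1.000 | 1.000 | 0.945 | 0.929 | 0.925 | 1.000 | 1.000 | 0.000 | 1.000 | 0.000 | 1.000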

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

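First rows

index | 시설물구분 | 시설물종류 | 종별 | 시설물번호 | 세부위치 | 자화방법 | 통전법 | 탈자여부 | 시험일자 | 전류타입 | 극간거리(cm) | 자분형식 | 표면상태 | 식별번호 | 판독내용 | 용접형태 | 결함 길이(mm) | 검사길이(mm) | 비고
0 | 상하수도 | 광역상수도 | 1종 | WS0002 | 1단계 도수 01-01(01) 80+11.43 A1 | YOKE | <NA> | NO |  | 교류 | 10.16~17.78 | 습식 | AS WELDED | <NA> | 결함사항 없음 | 겹치기 | <NA> | 250 | <NA>
1 | 상하수도 | 광역상수도 | 1종 | WS0002 | 1단계 도수 01-01(01) 80+11.43 A2 | YOKE | <NA> | NO |  | 교류 | 10.16~17.78 | 습식 | AS WELDED | <NA> | 결함사항 없음 | 겹치기 | <NA> | 250 | <NA>
2 | 상하수도 | 광역상수도 | 1종 | WS0002 | 1단계 충무 02-02(05) 164+3.79 A1 | YOKE | <NA> | NO |  | 교류 | 10.16~17.78 | 습식 | AS WELDED | <NA> | 결함사항 없음 | 겹치기 | <NA> | 250 | <NA>
3 | 상하수도 | 광역상수도 | 1종 | WS0002 | 1단계 충무 02-02(05) 164+3.79 A2 | YOKE | <NA> | NO |  | 교류 | 10.16~17.78 | 습식 | AS WELDED | <NA> | 결함사항 없음 | 겹치기 | <NA> | 250 | <NA>
4 | 상하수도 | 광역상수도 | 1종 | WS0002 | 1단계 충무 02-03(02) 247+26 A1 | YOKE | <NA> | NO |  | 교류 | 10.16~17.78 | 습식 | AS WELDED | <NA> | 결함사항 없음 | 겹치기 | <NA> | 250 | <NA>
5 | 상하수도 | 광역상수도 | 1종 | WS0002 | 1단계 충무 02-03(02) 247+26 A2 | YOKE | <NA> | NO |  | 교류 | 10.16~17.78 | 습식 | AS WELDED | <NA> | 결함사항 없음 | 겹치기 | <NA> | 250 | <NA>
6 | 상하수도 | 광역상수도 | 1종 | WS0002 | 1단계 충무 도수터널 +100m A8 | YOKE | <NA> | NO |  | 교류 | 10.16~17.78 | 습식 | AS WELDED | <NA> | 결함사항 없음 | 겹치기 | <NA> | 250 | <NA>
7 | 상하수도 | 광역상수도 | 1종 | WS0002 | 1단계 충무 도수터널 +100m A9 | YOKE | <NA> | NO |  | 교류 | 10.16~17.78 | 습식 | AS WELDED | <NA> | 결함사항 없음 | 겹치기 | <NA> | 250 | <NA>
8 | 상하수도 | 광역상수도 | 1종 | WS0002 | 1단계 충무 도수터널 +312m A8 | YOKE | <NA> | NO |  | 교류 | 10.16~17.78 | 습식 | AS WELDED | <NA> | 결함사항 없음 | 겹치기 | <NA> | 250 | <NA>
9 | 상하수도 | 광역상수도 | 1종 | WS0002 | 1단계 충무 도수터널 +312m A9 | YOKE | <NA> | NO |  | 교류 | 10.16~17.78 | 습식 | AS WELDED | <NA> | 결함사항 없음 | 겹치기 | <NA> | 250 | <NA>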
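
Last rows

index | 시설물구분 | 시설물종류 | 종별 | 시설물번호 | 세부위치 | 자화방법 | 통전법 | 탈자여부 | 시험일자 | 전류타입 | 극간거리(cm) | 자분형식 | 표면상태 | 식별번호 | 판독내용 | 용접형태 | 결함 길이(mm) | 검사길이(mm) | 비고
683 | 상하수도 | 광역상수도 | 1종 | WS0002 | 충무 04관로 제수밸브실 344+8.25 A01 | YOKE | <NA> | NO |  | 교류 | 10.16~17.78 | 습식 | AS WELDED | MP3123 | 결함사항 없음 | 겹치기(외면) | <NA> | 250 | <NA>
684 | 상하수도 | 광역상수도 | 1종 | WS0002 | 충무 04관로 제수밸브실 392+8.72 A01 | YOKE | <NA> | NO |  | 교류 | 10.16~17.78 | 습식 | AS WELDED | MP3123 | 결함사항 없음 | 맞대기(외면) | <NA> | 250 | <NA>
685 | 상하수도 | 광역상수도 | 1종 | WS0002 | 충무 04관로 제수밸브실 421+7 A01 | YOKE | <NA> | NO |  | 교류 | 10.16~17.78 | 습식 | AS WELDED | MP3123 | 결함사항 없음 | 겹치기(외면) | <NA> | 250 | <NA>
686 | 상하수도 | 광역상수도 | 1종 | WS0002 | 충무 04관로 암거 노출관 427+11.85 A09 | YOKE | <NA> | NO |  | 교류 | 10.16~17.78 | 습식 | AS WELDED | MP3123 | 결함사항 없음 | 맞대기(외면) | <NA> | 250 | <NA>
687 | 상하수도 | 광역상수도 | 1종 | WS0002 | 충무 04관로 암거 노출관 427+11.85 A10 | YOKE | <NA> | NO |  | 교류 | 10.16~17.78 | 습식 | AS WELDED | MP3123 | 결함사항 없음 | 맞대기(외면) | <NA> | 250 | <NA>
688 | 상하수도 | 광역상수도 | 1종 | WS0002 | 충무 04관로 암거 노출관 427+11.85 B09 | YOKE | <NA> | NO |  | 교류 | 10.16~17.78 | 습식 | AS WELDED | MP3123 | 결함사항 없음 | 겹치기(외면) | <NA> | 250 | <NA>
689 | 상하수도 | 광역상수도 | 1종 | WS0002 | 충무 04관로 암거 노출관 427+11.85 B10 | YOKE | <NA> | NO |  | 교류 | 10.16~17.78 | 습식 | AS WELDED | MP3123 | 결함사항 없음 | 겹치기(외면) | <NA> | 250 | <NA>
690 | 상하수도 | 광역상수도 | 1종 | WS0002 | 충무 04관로 제수밸브실 444+14.98 A01 | YOKE | <NA> | NO |  | 교류 | 10.16~17.78 | 습식 | AS WELDED | MP3123 | 결함사항 없음 | 겹치기(외면) | <NA> | 250 | <NA>
691 | 상하수도 | 광역상수도 | 1종 | WS0002 | 충무 04관로 유량계실 444+30 A09 | YOKE | <NA> | NO |  | 교류 | 10.16~17.78 | 습식 | AS WELDED | MP3123 | 결함사항 없음 | 맞대기(외면) | <NA> | 250 | <NA>
692 | 상하수도 | 광역상수도 | 1종 | WS0002 | 충무 04관로 제수밸브실 495+9.8 A01 | YOKE | <NA> | NO |  | 교류 | 10.16~17.78 | 습식 | AS WELDED | MP3123 | 결함사항 없음 | 겹치기(외면) | <NA> | 250 | <NA>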

Duplicate rows

Most frequently occurring

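index | 시설물구분 | 시설물종류 | 종별 | 시설물번호 | 세부위치 | 자화방법 | 통전법 | 탈자여부 | 시험일자 | 전류타입 | 극간거리(cm) | 자분형식 | 표면상태 | 식별번호 | 판독내용 | 용접형태 | 결함 길이(mm) | 검사길이(mm) | 비고 | # duplicates
0 | 교량 | 철도교량 | 1종 | BR0017 | E2 N(북측) A1측 상단 L | YOKE | <NA> | NO | 2020-12 | 교류 | 10.16~17.78 | 습식 | AS WELDED | <NA> | <NA> | <NA> | <NA> | 200 | <NA> | 2
1 | 교량 | 철도교량 | 1종 | BR0017 | E2 N(북측) A1측 중단 L | YOKE | <NA> | NO | 2020-12 | 교류 | 10.16~17.78 | 습식 | AS WELDED | <NA> | <NA> | <NA> | <NA> | 200 | <NA> | 2
2 | 교량 | 철도교량 | 1종 | BR0017 | E2 N(북측) A1측 중단 R | YOKE | <NA> | NO | 2020-12 | 교류 | 10.16~17.78 | 습식 | AS WELDED | <NA> | <NA> | <NA> | <NA> | 200 | <NA> | 2
3 | 교량 | 철도교량 | 1종 | BR0017 | E2 N(북측) A2측 상단 L | YOKE | <NA> | NO | 2020-12 | 교류 | 10.16~17.78 | 습식 | AS WELDED | <NA> | <NA> | <NA> | <NA> | 200 | <NA> | 2
4 | 교량 | 철도교량 | 1종 | BR0017 | E2 N(북측) A2측 상단 R | YOKE | <NA> | NO | 2020-12 | 교류 | 10.16~17.78 | 습식 | AS WELDED | <NA> | <NA> | <NA> | <NA> | 200 | <NA> | 2
5 | 교량 | 철도교량 | 1종 | BR0017 | E2 N(북측) A2측 중단 L | YOKE | <NA> | NO | 2020-12 | 교류 | 10.16~17.78 | 습식 | AS WELDED | <NA> | <NA> | <NA> | <NA> | 200 | <NA> | 2
6 | 교량 | 철도교량 | 1종 | BR0017 | E2 N(북측) A2측 중단 R | YOKE | <NA> | NO | 2020-12 | 교류 | 10.16~17.78 | 습식 | AS WELDED | <NA> | <NA> | <NA> | <NA> | 200 | <NA> | 2
7 | 교량 | 철도교량 | 1종 | BR0017 | E2 S(남측) A1측 상단 R | YOKE | <NA> | NO | 2020-12 | 교류 | 10.16~17.78 | 습식 | AS WELDED | <NA> | <NA> | <NA> | <NA> | 200 | <NA> | 2
8 | 교량 | 철도교량 | 1종 | BR0017 | E2 S(남측) A1측 중단 L | YOKE | <NA> | NO | 2020-12 | 교류 | 10.16~17.78 | 습식 | AS WELDED | <NA> | <NA> | <NA> | <NA> | 200 | <NA> | 2
9 | 교량 | 철도교량 | 1종 | BR0017 | E2 S(남측) A1측 중단 R | YOKE | <NA> | NO | 2020-12 | 교류 | 10.16~17.78 | 습식 | AS WELDED | <NA> | <NA> | <NA> | <NA> | 200 | <NA> | 2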