Overview

Dataset statistics

Number of variables: 7
Number of observations: 2295
Missing cells: 1
Missing cells (%): < 0.1%
Duplicate rows: 31
Duplicate rows (%): 1.4%
Total size in memory: 127.9 KiB
Average record size in memory: 57.1 B
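
The percentages and the average record size above can be recomputed from the raw counts. The following is a minimal Python sketch, assuming that the missing-cell percentage is taken over all cells (variables × observations), that the duplicate percentage is taken over the number of observations, and that 1 KiB = 1024 B. The variable names are illustrative and are not part of the report.

```python
# Counts copied from the "Dataset statistics" block of this report.
n_variables = 7
n_observations = 2295
missing_cells = 1
duplicate_rows = 31
total_size_kib = 127.9

# Assumption: missing cells are expressed relative to all cells in the table.
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Assumption: duplicate rows are expressed relative to the number of observations.
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.4f}")
print(f"Duplicate rows (%): {duplicate_pct:.1f}")
print(f"Average record size (B): {avg_record_bytes:.1f}")
```

Under these assumptions the results round to the reported figures: well below 0.1% missing cells, 1.4% duplicate rows, and 57.1 B per record.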

Variable types

Categorical: 4
Text: 2
Numeric: 1

Dataset

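Description: This dataset contains the results of chloride penetration tests carried out by 국토안전관리원. The facilities tested are railway tunnels (철도터널), and the facility class (종별) is Class 1 (1종). Chloride penetration is measured in kg/m^3. The dataset gives the measured value and the evaluation grade for each chloride penetration test.
Author: 국토안전관리원
URL: https://www.data.go.kr/data/15110982/fileData.do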

Alerts

종별 has constant value "1종" (Constant)
Dataset has 31 (1.4%) duplicate rows (Duplicates)
시설물종류 is highly overall correlated with 시설물구분 (High correlation)
시설물구분 is highly overall correlated with 시설물종류 (High correlation)

Reproduction

Analysis started: 2023-12-12 01:27:08.573464
Analysis finished: 2023-12-12 01:27:09.777436
Duration: 1.2 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시설물구분
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 18.1 KiB

Value | Count
터널 | 1223
교량 | 933
상하수도 | 79
하천 | 45
항만 | 15

Length

Max length: 4
Median length: 2
Mean length: 2.0688453
Min length: 2
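
The length statistics above follow directly from the value counts. The following is a minimal Python sketch, assuming that "length" means the number of Unicode code points in each value (Python's `len`). The dictionary name is illustrative.

```python
# Value counts copied from the 시설물구분 summary in this report.
value_counts = {"터널": 1223, "교량": 933, "상하수도": 79, "하천": 45, "항만": 15}

total = sum(value_counts.values())

# Assumption: "length" is the number of Unicode code points, i.e. Python's len().
mean_length = sum(len(value) * n for value, n in value_counts.items()) / total
max_length = max(len(value) for value in value_counts)
min_length = min(len(value) for value in value_counts)

print(total, mean_length, max_length, min_length)
```

Only 상하수도 has four characters, which is why the mean sits just above 2 while the median stays at 2.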

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 터널
2nd row: 터널
3rd row: 터널
4th row: 터널
5th row: 터널

Common Values

Value | Count | Frequency (%)
터널 | 1223 | 53.3%
교량 | 933 | 40.7%
상하수도 | 79 | 3.4%
하천 | 45 | 2.0%
항만 | 15 | 0.7%

Length

[Figure: Histogram of lengths of the category]


시설물종류
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 18.1 KiB

Value | Count
도로교량 | 734
도로터널 | 658
철도터널 | 565
철도교량 | 199
광역상수도 | 79
Other values (2) | 60

Length

Max length: 5
Median length: 4
Mean length: 4.0148148
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 철도터널
2nd row: 철도터널
3rd row: 철도터널
4th row: 철도터널
5th row: 철도터널

Common Values

Value | Count | Frequency (%)
도로교량 | 734 | 32.0%
도로터널 | 658 | 28.7%
철도터널 | 565 | 24.6%
철도교량 | 199 | 8.7%
광역상수도 | 79 | 3.4%
하구둑 | 45 | 2.0%
계류시설 | 15 | 0.7%

Length

[Figure: Histogram of lengths of the category]


종별
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 18.1 KiB

Value | Count
1종 | 2295

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1종
2nd row: 1종
3rd row: 1종
4th row: 1종
5th row: 1종

Common Values

Value | Count | Frequency (%)
1종 | 2295 | 100.0%

Length

[Figure: Histogram of lengths of the category]


시설물번호
Text

Distinct: 65
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Memory size: 18.1 KiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 13770
Distinct characters: 20
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: TU0002
2nd row: TU0002
3rd row: TU0002
4th row: TU0002
5th row: TU0002

Words

Value | Count | Frequency (%)
br0013 | 161 | 7.0%
tu0037 | 116 | 5.1%
br0018 | 104 | 4.5%
br0005 | 92 | 4.0%
br0004 | 89 | 3.9%
br0017 | 69 | 3.0%
tu0006 | 69 | 3.0%
tu0022 | 64 | 2.8%
tu0017 | 60 | 2.6%
tu0035 | 48 | 2.1%
Other values (55) | 1423 | 62.0%

Most occurring characters

Value | Count | Frequency (%)
0 | 5453 | 39.6%
T | 1223 | 8.9%
U | 1223 | 8.9%
1 | 1037 | 7.5%
B | 951 | 6.9%
R | 933 | 6.8%
3 | 681 | 4.9%
2 | 576 | 4.2%
7 | 333 | 2.4%
4 | 275 | 2.0%
Other values (10) | 1085 | 7.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 9180 | 66.7%
Uppercase Letter | 4590 | 33.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 5453 | 59.4%
1 | 1037 | 11.3%
3 | 681 | 7.4%
2 | 576 | 6.3%
7 | 333 | 3.6%
4 | 275 | 3.0%
5 | 264 | 2.9%
8 | 230 | 2.5%
6 | 171 | 1.9%
9 | 160 | 1.7%

Uppercase Letter
Value | Count | Frequency (%)
T | 1223 | 26.6%
U | 1223 | 26.6%
B | 951 | 20.7%
R | 933 | 20.3%
W | 79 | 1.7%
S | 79 | 1.7%
E | 45 | 1.0%
D | 27 | 0.6%
H | 15 | 0.3%
M | 15 | 0.3%

Most occurring scripts

Value | Count | Frequency (%)
Common | 9180 | 66.7%
Latin | 4590 | 33.3%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 5453 | 59.4%
1 | 1037 | 11.3%
3 | 681 | 7.4%
2 | 576 | 6.3%
7 | 333 | 3.6%
4 | 275 | 3.0%
5 | 264 | 2.9%
8 | 230 | 2.5%
6 | 171 | 1.9%
9 | 160 | 1.7%

Latin
Value | Count | Frequency (%)
T | 1223 | 26.6%
U | 1223 | 26.6%
B | 951 | 20.7%
R | 933 | 20.3%
W | 79 | 1.7%
S | 79 | 1.7%
E | 45 | 1.0%
D | 27 | 0.6%
H | 15 | 0.3%
M | 15 | 0.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 13770 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 5453 | 39.6%
T | 1223 | 8.9%
U | 1223 | 8.9%
1 | 1037 | 7.5%
B | 951 | 6.9%
R | 933 | 6.8%
3 | 681 | 4.9%
2 | 576 | 4.2%
7 | 333 | 2.4%
4 | 275 | 2.0%
Other values (10) | 1085 | 7.9%

세부위치
Text

Distinct: 1804
Distinct (%): 78.6%
Missing: 0
Missing (%): 0.0%
Memory size: 18.1 KiB

Length

Max length: 50
Median length: 39
Mean length: 19.506318
Min length: 5

Characters and Unicode

Total characters: 44767
Distinct characters: 225
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1741
Unique (%): 75.9%

Sample

1st row: 상선 박스 17k580 LW 0~15mm
2nd row: 상선 박스 17k580 LW 15~30mm
3rd row: 상선 박스 17k580 LW 30~45mm
4th row: 상선 박스 18k065 LW 0~15mm
5th row: 상선 박스 18k065 LW 15~30mm

Words

Value | Count | Frequency (%)
lw | 292 | 3.3%
30~45mm | 287 | 3.2%
15~30mm | 286 | 3.2%
rc | 277 | 3.1%
lc | 256 | 2.9%
rw | 229 | 2.6%
0~2mm | 203 | 2.3%
2~15mm | 189 | 2.1%
20~40mm | 173 | 1.9%
0~20mm | 163 | 1.8%
Other values (698) | 6548 | 73.5%

Most occurring characters

Value | Count | Frequency (%)
(space) | 6611 | 14.8%
m | 4459 | 10.0%
0 | 3488 | 7.8%
~ | 2194 | 4.9%
1 | 1804 | 4.0%
2 | 1772 | 4.0%
5 | 1600 | 3.6%
3 | 1210 | 2.7%
4 | 1129 | 2.5%
6 | 756 | 1.7%
Other values (215) | 19744 | 44.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 12953 | 28.9%
Other Letter | 11225 | 25.1%
Space Separator | 6611 | 14.8%
Lowercase Letter | 4948 | 11.1%
Uppercase Letter | 4576 | 10.2%
Math Symbol | 2408 | 5.4%
Open Punctuation | 716 | 1.6%
Close Punctuation | 716 | 1.6%
Dash Punctuation | 430 | 1.0%
Other Punctuation | 184 | 0.4%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[?] | 682 | 6.1%
[?] | 570 | 5.1%
[?] | 498 | 4.4%
[?] | 479 | 4.3%
[?] | 457 | 4.1%
[?] | 383 | 3.4%
[?] | 380 | 3.4%
[?] | 292 | 2.6%
[?] | 257 | 2.3%
[?] | 249 | 2.2%
Other values (162) | 6978 | 62.2%

Uppercase Letter
Value | Count | Frequency (%)
R | 691 | 15.1%
P | 671 | 14.7%
C | 669 | 14.6%
W | 638 | 13.9%
L | 627 | 13.7%
S | 475 | 10.4%
B | 220 | 4.8%
A | 171 | 3.7%
M | 159 | 3.5%
K | 64 | 1.4%
Other values (9) | 191 | 4.2%

Lowercase Letter
Value | Count | Frequency (%)
m | 4459 | 90.1%
k | 267 | 5.4%
t | 57 | 1.2%
a | 57 | 1.2%
n | 33 | 0.7%
e | 21 | 0.4%
i | 19 | 0.4%
c | 12 | 0.2%
b | 5 | 0.1%
u | 5 | 0.1%
Other values (3) | 13 | 0.3%

Decimal Number
Value | Count | Frequency (%)
0 | 3488 | 26.9%
1 | 1804 | 13.9%
2 | 1772 | 13.7%
5 | 1600 | 12.4%
3 | 1210 | 9.3%
4 | 1129 | 8.7%
6 | 756 | 5.8%
8 | 516 | 4.0%
7 | 360 | 2.8%
9 | 318 | 2.5%

Other Punctuation
Value | Count | Frequency (%)
/ | 69 | 37.5%
, | 62 | 33.7%
. | 42 | 22.8%
' | 11 | 6.0%

Math Symbol
Value | Count | Frequency (%)
~ | 2194 | 91.1%
[?] | 158 | 6.6%
+ | 56 | 2.3%

Space Separator
Value | Count | Frequency (%)
(space) | 6611 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 716 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 716 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 430 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 24018 | 53.7%
Hangul | 11225 | 25.1%
Latin | 9524 | 21.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[?] | 682 | 6.1%
[?] | 570 | 5.1%
[?] | 498 | 4.4%
[?] | 479 | 4.3%
[?] | 457 | 4.1%
[?] | 383 | 3.4%
[?] | 380 | 3.4%
[?] | 292 | 2.6%
[?] | 257 | 2.3%
[?] | 249 | 2.2%
Other values (162) | 6978 | 62.2%

Latin
Value | Count | Frequency (%)
m | 4459 | 46.8%
R | 691 | 7.3%
P | 671 | 7.0%
C | 669 | 7.0%
W | 638 | 6.7%
L | 627 | 6.6%
S | 475 | 5.0%
k | 267 | 2.8%
B | 220 | 2.3%
A | 171 | 1.8%
Other values (22) | 636 | 6.7%

Common
Value | Count | Frequency (%)
(space) | 6611 | 27.5%
0 | 3488 | 14.5%
~ | 2194 | 9.1%
1 | 1804 | 7.5%
2 | 1772 | 7.4%
5 | 1600 | 6.7%
3 | 1210 | 5.0%
4 | 1129 | 4.7%
6 | 756 | 3.1%
( | 716 | 3.0%
Other values (11) | 2738 | 11.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 33384 | 74.6%
Hangul | 11225 | 25.1%
Math Operators | 158 | 0.4%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 6611 | 19.8%
m | 4459 | 13.4%
0 | 3488 | 10.4%
~ | 2194 | 6.6%
1 | 1804 | 5.4%
2 | 1772 | 5.3%
5 | 1600 | 4.8%
3 | 1210 | 3.6%
4 | 1129 | 3.4%
6 | 756 | 2.3%
Other values (42) | 8361 | 25.0%

Hangul
Value | Count | Frequency (%)
[?] | 682 | 6.1%
[?] | 570 | 5.1%
[?] | 498 | 4.4%
[?] | 479 | 4.3%
[?] | 457 | 4.1%
[?] | 383 | 3.4%
[?] | 380 | 3.4%
[?] | 292 | 2.6%
[?] | 257 | 2.3%
[?] | 249 | 2.2%
Other values (162) | 6978 | 62.2%

Math Operators
Value | Count | Frequency (%)
[?] | 158 | 100.0%

염화물침투량
Real number (ℝ)

Distinct: 355
Distinct (%): 15.5%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.0793984
Minimum: 0.02
Maximum: 40.68
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 20.3 KiB

Quantile statistics

Minimum: 0.02
5-th percentile: 0.19
Q1: 0.33
median: 0.5
Q3: 0.83
95-th percentile: 3.874
Maximum: 40.68
Range: 40.66
Interquartile range (IQR): 0.5
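
The range and IQR above are simple differences of the listed quantiles. The following minimal Python sketch recomputes them and, as an illustration only, adds outlier fences under Tukey's 1.5 × IQR rule. That rule is a common convention assumed here; the report itself does not apply it.

```python
# Quantile statistics copied from the 염화물침투량 section of this report.
minimum, q1, median, q3, maximum = 0.02, 0.33, 0.5, 0.83, 40.68

value_range = maximum - minimum  # the report lists 40.66
iqr = q3 - q1                    # the report lists 0.5

# Assumption: Tukey's 1.5 * IQR rule, which the report does not use.
upper_fence = q3 + 1.5 * iqr
lower_fence = q1 - 1.5 * iqr

print(round(value_range, 2), round(iqr, 2), round(upper_fence, 2), round(lower_fence, 2))
```

Under that convention the upper fence is about 1.58 kg/m^3, which lies below the reported 95-th percentile of 3.874, so at least about 5% of the measurements would fall above it.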

Descriptive statistics

Standard deviation: 2.3154506
Coefficient of variation (CV): 2.1451306
Kurtosis: 78.708173
Mean: 1.0793984
Median Absolute Deviation (MAD): 0.21
Skewness: 7.4503214
Sum: 2476.14
Variance: 5.3613114
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
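
Several of the descriptive statistics are related to one another. The following minimal Python sketch checks three of those relations, assuming that the mean and sum are taken over the 2294 non-missing values (2295 observations minus 1 missing). The variable names are illustrative.

```python
# Values copied from the "Descriptive statistics" block of this report.
std = 2.3154506
mean = 1.0793984

# Assumption: statistics are computed over non-missing values only.
n_non_missing = 2295 - 1

cv = std / mean               # coefficient of variation
variance = std ** 2           # variance is the squared standard deviation
total = mean * n_non_missing  # sum recovered from the mean

print(round(cv, 5), round(variance, 5), round(total, 2))
```

The results agree with the reported CV, variance, and sum up to rounding of the inputs. The mean (about 1.08) is also well above the median (0.5), which is consistent with the large positive skewness.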

Common values

Value | Count | Frequency (%)
0.29 | 74 | 3.2%
0.27 | 49 | 2.1%
0.32 | 47 | 2.0%
0.39 | 46 | 2.0%
0.36 | 45 | 2.0%
0.38 | 44 | 1.9%
0.34 | 43 | 1.9%
0.33 | 42 | 1.8%
0.4 | 42 | 1.8%
0.35 | 41 | 1.8%
Other values (345) | 1821 | 79.3%

Minimum 10 values

Value | Count | Frequency (%)
0.02 | 1 | < 0.1%
0.07 | 3 | 0.1%
0.08 | 1 | < 0.1%
0.09 | 4 | 0.2%
0.1 | 7 | 0.3%
0.11 | 1 | < 0.1%
0.12 | 3 | 0.1%
0.13 | 7 | 0.3%
0.14 | 9 | 0.4%
0.15 | 15 | 0.7%

Maximum 10 values

Value | Count | Frequency (%)
40.68 | 1 | < 0.1%
28.11 | 1 | < 0.1%
27.08 | 1 | < 0.1%
25.79 | 1 | < 0.1%
25.78 | 1 | < 0.1%
24.06 | 1 | < 0.1%
22.01 | 1 | < 0.1%
17.69 | 1 | < 0.1%
17.58 | 1 | < 0.1%
15.35 | 1 | < 0.1%

염화물침투량 평가등급
Categorical

Distinct: 5
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 18.1 KiB

Value | Count
b | 1297
a | 461
c | 200
<NA> | 174
d | 163

Length

Max length: 4
Median length: 1
Mean length: 1.227451
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: b
2nd row: b
3rd row: b
4th row: a
5th row: a

Common Values

Value | Count | Frequency (%)
b | 1297 | 56.5%
a | 461 | 20.1%
c | 200 | 8.7%
<NA> | 174 | 7.6%
d | 163 | 7.1%

Length

[Figure: Histogram of lengths of the category]


Interactions

[Figure: Interactions plot]

Correlations

Variable | 시설물구분 | 시설물종류 | 시설물번호 | 염화물침투량 | 염화물침투량 평가등급
시설물구분 | 1.000 | 1.000 | 1.000 | 0.152 | 0.315
시설물종류 | 1.000 | 1.000 | 1.000 | 0.156 | 0.420
시설물번호 | 1.000 | 1.000 | 1.000 | 0.274 | 0.755
염화물침투량 | 0.152 | 0.156 | 0.274 | 1.000 | 0.768
염화물침투량 평가등급 | 0.315 | 0.420 | 0.755 | 0.768 | 1.000

Variable | 시설물종류 | 염화물침투량 평가등급 | 시설물구분
시설물종류 | 1.000 | 0.301 | 1.000
염화물침투량 평가등급 | 0.301 | 1.000 | 0.262
시설물구분 | 1.000 | 0.262 | 1.000

Variable | 염화물침투량 | 시설물구분 | 시설물종류 | 염화물침투량 평가등급
염화물침투량 | 1.000 | 0.093 | 0.084 | 0.435
시설물구분 | 0.093 | 1.000 | 1.000 | 0.262
시설물종류 | 0.084 | 1.000 | 1.000 | 0.301
염화물침투량 평가등급 | 0.435 | 0.262 | 0.301 | 1.000
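
The report does not name the association measure behind each matrix. As one possible illustration of why 시설물구분 and 시설물종류 score 1.000, the following minimal Python sketch computes the uncorrected Cramér's V, a measure chosen here as an assumption. The pairing of values is inferred from matching counts in the value tables (for example 734 + 199 = 933) and is not stated by the report. The function name `cramers_v` is hypothetical.

```python
import math

# Counts copied from the "Common Values" tables of this report.
# Assumption: each 시설물종류 value belongs to exactly one 시설물구분 value;
# the pairing is inferred from matching counts, not stated by the report.
pairs = {
    ("교량", "도로교량"): 734,
    ("교량", "철도교량"): 199,
    ("터널", "도로터널"): 658,
    ("터널", "철도터널"): 565,
    ("상하수도", "광역상수도"): 79,
    ("하천", "하구둑"): 45,
    ("항만", "계류시설"): 15,
}


def cramers_v(table):
    """Uncorrected Cramér's V for a sparse contingency table {(row, col): count}."""
    rows = sorted({r for r, _ in table})
    cols = sorted({c for _, c in table})
    n = sum(table.values())
    row_tot = {r: sum(table.get((r, c), 0) for c in cols) for r in rows}
    col_tot = {c: sum(table.get((r, c), 0) for r in rows) for c in cols}
    chi2 = 0.0
    for r in rows:
        for c in cols:
            expected = row_tot[r] * col_tot[c] / n
            observed = table.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (min(len(rows), len(cols)) - 1)))


v = cramers_v(pairs)
print(round(v, 3))
```

Because every 시설물종류 value maps to a single 시설물구분 value under this assumption, knowing one column fixes the other, and the statistic reaches its maximum of 1.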

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

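First rows

Index | 시설물구분 | 시설물종류 | 종별 | 시설물번호 | 세부위치 | 염화물침투량 | 염화물침투량 평가등급
0 | 터널 | 철도터널 | 1종 | TU0002 | 상선 박스 17k580 LW 0~15mm | 0.72 | b
1 | 터널 | 철도터널 | 1종 | TU0002 | 상선 박스 17k580 LW 15~30mm | 0.63 | b
2 | 터널 | 철도터널 | 1종 | TU0002 | 상선 박스 17k580 LW 30~45mm | 0.52 | b
3 | 터널 | 철도터널 | 1종 | TU0002 | 상선 박스 18k065 LW 0~15mm | 0.21 | a
4 | 터널 | 철도터널 | 1종 | TU0002 | 상선 박스 18k065 LW 15~30mm | 0.18 | a
5 | 터널 | 철도터널 | 1종 | TU0002 | 상선 박스 18k065 LW 30~45mm | 0.17 | a
6 | 터널 | 철도터널 | 1종 | TU0002 | 상선 박스 18k080 LW 0~15mm | 0.2 | a
7 | 터널 | 철도터널 | 1종 | TU0002 | 상선 박스 18k080 LW 15~30mm | 0.16 | a
8 | 터널 | 철도터널 | 1종 | TU0002 | 상선 박스 18k080 LW 30~45mm | 0.15 | a
9 | 터널 | 철도터널 | 1종 | TU0002 | 하선 박스 17k545 RW 0~15mm | 0.53 | b

Last rows

Index | 시설물구분 | 시설물종류 | 종별 | 시설물번호 | 세부위치 | 염화물침투량 | 염화물침투량 평가등급
2285 | 터널 | 철도터널 | 1종 | TU0015 | LW 15~30mm | 0.56 | b
2286 | 터널 | 철도터널 | 1종 | TU0015 | LW 30~45mm | 0.42 | b
2287 | 터널 | 철도터널 | 1종 | TU0018 | 31k290 하선 벽체 0~2mm | 0.72 | b
2288 | 터널 | 철도터널 | 1종 | TU0018 | 31k290 하선 벽체 2~15mm | 0.48 | b
2289 | 터널 | 철도터널 | 1종 | TU0018 | 31k290 하선 벽체 15~30mm | 0.46 | b
2290 | 터널 | 철도터널 | 1종 | TU0018 | 31k290 하선 벽체 30~45mm | 0.44 | b
2291 | 터널 | 철도터널 | 1종 | TU0018 | 31k835 상선 라이닝 0~2mm | 0.49 | b
2292 | 터널 | 철도터널 | 1종 | TU0018 | 31k835 상선 라이닝 2~15mm | 0.48 | b
2293 | 터널 | 철도터널 | 1종 | TU0018 | 31k835 상선 라이닝 15~30mm | 0.45 | b
2294 | 터널 | 철도터널 | 1종 | TU0018 | 31k835 상선 라이닝 30~45mm | 0.44 | b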

Duplicate rows

Most frequently occurring

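Index | 시설물구분 | 시설물종류 | 종별 | 시설물번호 | 세부위치 | 염화물침투량 | 염화물침투량 평가등급 | # duplicates
9 | 터널 | 도로터널 | 1종 | TU0013 | LC 15~30mm | 0.26 | a | 3
26 | 터널 | 철도터널 | 1종 | TU0015 | LW 0~2mm | 0.66 | b | 3
27 | 터널 | 철도터널 | 1종 | TU0015 | LW 15~30mm | 0.56 | b | 3
28 | 터널 | 철도터널 | 1종 | TU0015 | LW 2~15mm | 0.63 | b | 3
0 | 터널 | 도로터널 | 1종 | TU0007 | RC 20~40mm | 0.29 | a | 2
1 | 터널 | 도로터널 | 1종 | TU0008 | RC 20~40mm | 0.28 | a | 2
2 | 터널 | 도로터널 | 1종 | TU0008 | RC 20~40mm | 0.35 | b | 2
3 | 터널 | 도로터널 | 1종 | TU0008 | RC 2~20mm | 0.36 | b | 2
4 | 터널 | 도로터널 | 1종 | TU0010 | LC 15~30mm | 0.43 | b | 2
5 | 터널 | 도로터널 | 1종 | TU0010 | LC 30~45mm | 0.42 | b | 2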
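
A per-row duplicate count like the one in this table can be obtained by counting identical rows. The following is a minimal Python sketch using only the standard library. The rows are toy data built from two records that appear in this report, with the first one repeated three times to mimic its "# duplicates" value; this is not the full dataset, and it is not necessarily how the profiling tool computes the figure.

```python
from collections import Counter

# Toy rows for illustration only (not the full dataset).
# Column order: 시설물구분, 시설물종류, 종별, 시설물번호, 세부위치,
# 염화물침투량, 염화물침투량 평가등급.
rows = [
    ("터널", "철도터널", "1종", "TU0015", "LW 15~30mm", 0.56, "b"),
    ("터널", "철도터널", "1종", "TU0015", "LW 15~30mm", 0.56, "b"),
    ("터널", "철도터널", "1종", "TU0015", "LW 15~30mm", 0.56, "b"),
    ("터널", "철도터널", "1종", "TU0015", "LW 30~45mm", 0.42, "b"),
]

# Count how often each complete row occurs, then keep rows seen more than once.
counts = Counter(rows)
duplicates = {row: n for row, n in counts.items() if n > 1}

# List the most frequently repeated rows first.
for row, n in sorted(duplicates.items(), key=lambda item: -item[1]):
    print(n, row)
```

Rows are compared on all seven columns, so two measurements at the same location count as duplicates only if the value and grade also match.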