Overview

Dataset statistics

Number of variables: 8
Number of observations: 489
Missing cells: 31
Missing cells (%): 0.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 32.1 KiB
Average record size in memory: 67.3 B
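
The missing-cell percentage follows from the counts above. A minimal sketch in Python, assuming the percentage is missing cells divided by all cells (variables times observations); the helper name `percent` is hypothetical and not part of the report:

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

# Figures copied from the dataset statistics above.
n_variables = 8
n_observations = 489
missing_cells = 31

# Share of missing cells over the whole table (assumed definition).
missing_cells_pct = percent(missing_cells, n_variables * n_observations)

# All 31 missing cells sit in one column (시설물준공년도), so the
# column-level share divides by the number of rows only.
column_missing_pct = percent(missing_cells, n_observations)

print(missing_cells_pct, column_missing_pct)
```

Under that assumption the two values agree with the 0.8% reported here and the 6.3% reported for 시설물준공년도.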

Variable types

Categorical: 5
Text: 1
Numeric: 2

Alerts

시설물위치 is highly overall correlated with 시군명 (High correlation)
시군명 is highly overall correlated with 시설물위치 (High correlation)
집계년도 is highly imbalanced (74.3%) (Imbalance)
시설물구분명 is highly imbalanced (71.3%) (Imbalance)
시설물준공년도 has 31 (6.3%) missing values (Missing)
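
The imbalance percentages can be reproduced from the category counts listed under Variables. The sketch below assumes the score is one minus the Shannon entropy of the class distribution, normalized by the maximum entropy for that number of classes. The report does not state this definition, but it yields the figures shown in the alerts. The function name `imbalance_score` is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform distribution, 1 for a single class."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    # Shannon entropy in bits; empty classes contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Counts copied from the Common Values tables of the two flagged variables.
year_counts = [451, 15, 15, 8]                # 집계년도
facility_type_counts = [431, 31, 13, 6, 5, 3]  # 시설물구분명

year_score = round(100 * imbalance_score(year_counts), 1)
type_score = round(100 * imbalance_score(facility_type_counts), 1)
print(year_score, type_score)
```

A score near 100% means one category dominates, as 2020 does in 집계년도 with 451 of 489 rows.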

Reproduction

Analysis started: 2023-12-10 22:00:41.109368
Analysis finished: 2023-12-10 22:00:42.139363
Duration: 1.03 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

집계년도
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 KiB

Value | Count
2020 | 451
2023 | 15
2021 | 15
2022 | 8

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023
2nd row: 2023
3rd row: 2023
4th row: 2023
5th row: 2023

Common Values

Value | Count | Frequency (%)
2020 | 451 | 92.2%
2023 | 15 | 3.1%
2021 | 15 | 3.1%
2022 | 8 | 1.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

시군명
Categorical

HIGH CORRELATION 

Distinct: 9
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 KiB

Value | Count
양평군 | 95
안성시 | 91
용인시 | 77
화성시 | 64
여주시 | 46
Other values (4) | 116

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 김포시
2nd row: 안성시
3rd row: 안성시
4th row: 안성시
5th row: 안성시

Common Values

Value | Count | Frequency (%)
양평군 | 95 | 19.4%
안성시 | 91 | 18.6%
용인시 | 77 | 15.7%
화성시 | 64 | 13.1%
여주시 | 46 | 9.4%
광주시 | 35 | 7.2%
이천시 | 31 | 6.3%
김포시 | 25 | 5.1%
평택시 | 25 | 5.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

시설물명
Text

Distinct: 487
Distinct (%): 99.6%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 KiB

Length

Max length: 16
Median length: 15
Mean length: 4.7484663
Min length: 3

Characters and Unicode

Total characters: 2322
Distinct characters: 231
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 485
Unique (%): 99.2%

Sample

1st row: 고음달천교
2nd row: 구수교(1)
3rd row: 동평교
4th row: 남풍교
5th row: 가율교
ValueCountFrequency (%)
3
 
0.6%
3
 
0.6%
보도육교 3
 
0.6%
한천교 2
 
0.4%
숙곡교 2
 
0.4%
군월교 2
 
0.4%
아곡교 2
 
0.4%
봉명ic교 2
 
0.4%
구포교 2
 
0.4%
청미교 1
 
0.2%
Other values (481) 481
95.6%

Most occurring characters

ValueCountFrequency (%)
441
 
19.0%
( 87
 
3.7%
) 87
 
3.7%
1 67
 
2.9%
63
 
2.7%
2 61
 
2.6%
49
 
2.1%
41
 
1.8%
35
 
1.5%
34
 
1.5%
Other values (221) 1357
58.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1855 | 79.9%
Decimal Number | 237 | 10.2%
Open Punctuation | 87 | 3.7%
Close Punctuation | 87 | 3.7%
Dash Punctuation | 26 | 1.1%
Uppercase Letter | 16 | 0.7%
Space Separator | 14 | 0.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
441
23.8%
63
 
3.4%
49
 
2.6%
41
 
2.2%
35
 
1.9%
34
 
1.8%
31
 
1.7%
31
 
1.7%
31
 
1.7%
31
 
1.7%
Other values (204) 1068
57.6%
Decimal Number
ValueCountFrequency (%)
1 67
28.3%
2 61
25.7%
3 26
 
11.0%
7 16
 
6.8%
8 15
 
6.3%
0 13
 
5.5%
5 12
 
5.1%
6 12
 
5.1%
4 10
 
4.2%
9 5
 
2.1%
Uppercase Letter
ValueCountFrequency (%)
C 8
50.0%
I 7
43.8%
F 1
 
6.2%
Open Punctuation
ValueCountFrequency (%)
( 87
100.0%
Close Punctuation
ValueCountFrequency (%)
) 87
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 26
100.0%
Space Separator
ValueCountFrequency (%)
14
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1855 | 79.9%
Common | 451 | 19.4%
Latin | 16 | 0.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
441
23.8%
63
 
3.4%
49
 
2.6%
41
 
2.2%
35
 
1.9%
34
 
1.8%
31
 
1.7%
31
 
1.7%
31
 
1.7%
31
 
1.7%
Other values (204) 1068
57.6%
Common
ValueCountFrequency (%)
( 87
19.3%
) 87
19.3%
1 67
14.9%
2 61
13.5%
- 26
 
5.8%
3 26
 
5.8%
7 16
 
3.5%
8 15
 
3.3%
14
 
3.1%
0 13
 
2.9%
Other values (4) 39
8.6%
Latin
ValueCountFrequency (%)
C 8
50.0%
I 7
43.8%
F 1
 
6.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1855 | 79.9%
ASCII | 467 | 20.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
441
23.8%
63
 
3.4%
49
 
2.6%
41
 
2.2%
35
 
1.9%
34
 
1.8%
31
 
1.7%
31
 
1.7%
31
 
1.7%
31
 
1.7%
Other values (204) 1068
57.6%
ASCII
ValueCountFrequency (%)
( 87
18.6%
) 87
18.6%
1 67
14.3%
2 61
13.1%
- 26
 
5.6%
3 26
 
5.6%
7 16
 
3.4%
8 15
 
3.2%
14
 
3.0%
0 13
 
2.8%
Other values (7) 55
11.8%

시설물구분명
Categorical

IMBALANCE 

Distinct: 6
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 KiB

Value | Count
교량 | 431
절토사면 | 31
터널 | 13
옹벽 | 6
보도육교 | 5

Length

Max length: 4
Median length: 2
Mean length: 2.1595092
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 교량
2nd row: 교량
3rd row: 교량
4th row: 교량
5th row: 교량

Common Values

Value | Count | Frequency (%)
교량 | 431 | 88.1%
절토사면 | 31 | 6.3%
터널 | 13 | 2.7%
옹벽 | 6 | 1.2%
보도육교 | 5 | 1.0%
지하차도 | 3 | 0.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

시설물위치
Categorical

HIGH CORRELATION 

Distinct: 9
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 KiB

Value | Count
양평 | 95
안성 | 91
용인 | 77
화성 | 64
여주 | 46
Other values (4) | 116

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 김포
2nd row: 안성
3rd row: 안성
4th row: 안성
5th row: 안성

Common Values

Value | Count | Frequency (%)
양평 | 95 | 19.4%
안성 | 91 | 18.6%
용인 | 77 | 15.7%
화성 | 64 | 13.1%
여주 | 46 | 9.4%
광주 | 35 | 7.2%
이천 | 31 | 6.3%
김포 | 25 | 5.1%
평택 | 25 | 5.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

시설물준공년도
Real number (ℝ)

MISSING 

Distinct: 49
Distinct (%): 10.7%
Missing: 31
Missing (%): 6.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 2002.2664
Minimum: 1960
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.4 KiB

Quantile statistics

Minimum: 1960
5-th percentile: 1985.85
Q1: 1994
median: 2002
Q3: 2012
95-th percentile: 2020
Maximum: 2022
Range: 62
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 11.735926
Coefficient of variation (CV): 0.0058613209
Kurtosis: 0.12599325
Mean: 2002.2664
Median Absolute Deviation (MAD): 9
Skewness: -0.41287969
Sum: 917038
Variance: 137.73195
Monotonicity: Not monotonic
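
Several of these statistics are tied together by simple identities, which gives a quick consistency check on the reported figures. The sketch below is an illustration, not the profiler's own code. It assumes the mean is taken over the non-missing rows only (489 minus 31), and the dictionary keys are hypothetical names chosen for the example.

```python
import math

# Values copied from the 시설물준공년도 statistics in this report.
reported = {
    "n_rows": 489, "n_missing": 31, "sum": 917038,
    "mean": 2002.2664, "std": 11.735926, "variance": 137.73195,
    "cv": 0.0058613209, "min": 1960, "max": 2022, "q1": 1994, "q3": 2012,
}

# Assumption: missing rows are excluded before aggregating.
n_valid = reported["n_rows"] - reported["n_missing"]

derived = {
    "mean": reported["sum"] / n_valid,             # sum / count
    "variance": reported["std"] ** 2,              # std squared
    "cv": reported["std"] / reported["mean"],      # std / mean
    "range": reported["max"] - reported["min"],    # max - min
    "iqr": reported["q3"] - reported["q1"],        # Q3 - Q1
}

for name, value in derived.items():
    print(name, value)
```

The derived mean, variance, and CV match the reported values up to rounding, and the range and IQR come out as 62 and 18.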
[Figure: Histogram with fixed size bins (bins=49)]

Value | Count | Frequency (%)
1992 | 24 | 4.9%
2013 | 23 | 4.7%
1996 | 21 | 4.3%
2001 | 20 | 4.1%
2010 | 19 | 3.9%
2016 | 19 | 3.9%
1993 | 19 | 3.9%
1994 | 19 | 3.9%
2005 | 18 | 3.7%
1997 | 17 | 3.5%
Other values (39) | 259 | 53.0%
(Missing) | 31 | 6.3%

Value | Count | Frequency (%)
1960 | 1 | 0.2%
1966 | 1 | 0.2%
1967 | 1 | 0.2%
1970 | 3 | 0.6%
1972 | 2 | 0.4%
1973 | 4 | 0.8%
1974 | 2 | 0.4%
1975 | 1 | 0.2%
1976 | 1 | 0.2%
1980 | 4 | 0.8%

Value | Count | Frequency (%)
2022 | 13 | 2.7%
2021 | 1 | 0.2%
2020 | 15 | 3.1%
2019 | 15 | 3.1%
2018 | 2 | 0.4%
2017 | 14 | 2.9%
2016 | 19 | 3.9%
2015 | 2 | 0.4%
2014 | 8 | 1.6%
2013 | 23 | 4.7%

종별구분명
Categorical

Distinct: 4
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 KiB

Value | Count
3 | 187
기타 | 180
2 | 108
1 | 14

Length

Max length: 2
Median length: 1
Mean length: 1.3680982
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 기타
2nd row: 3
3rd row: 기타
4th row: 기타
5th row: 기타

Common Values

Value | Count | Frequency (%)
3 | 187 | 38.2%
기타 | 180 | 36.8%
2 | 108 | 22.1%
1 | 14 | 2.9%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

노선명
Real number (ℝ)

Distinct: 35
Distinct (%): 7.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 237.66462
Minimum: 23
Maximum: 391
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.4 KiB

Quantile statistics

Minimum: 23
5-th percentile: 23
Q1: 88
median: 314
Q3: 337
95-th percentile: 349
Maximum: 391
Range: 368
Interquartile range (IQR): 249

Descriptive statistics

Standard deviation: 125.32316
Coefficient of variation (CV): 0.52731098
Kurtosis: -1.4688366
Mean: 237.66462
Median Absolute Deviation (MAD): 28
Skewness: -0.64271371
Sum: 116218
Variance: 15705.895
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=35)]

Value | Count | Frequency (%)
318 | 38 | 7.8%
23 | 32 | 6.5%
70 | 29 | 5.9%
88 | 27 | 5.5%
98 | 26 | 5.3%
341 | 26 | 5.3%
325 | 24 | 4.9%
321 | 22 | 4.5%
345 | 22 | 4.5%
342 | 21 | 4.3%
Other values (25) | 222 | 45.4%

Value | Count | Frequency (%)
23 | 32 | 6.5%
56 | 2 | 0.4%
57 | 20 | 4.1%
70 | 29 | 5.9%
78 | 4 | 0.8%
82 | 18 | 3.7%
84 | 13 | 2.7%
88 | 27 | 5.5%
98 | 26 | 5.3%
301 | 1 | 0.2%

Value | Count | Frequency (%)
391 | 3 | 0.6%
356 | 7 | 1.4%
355 | 5 | 1.0%
352 | 8 | 1.6%
351 | 1 | 0.2%
349 | 14 | 2.9%
345 | 22 | 4.5%
342 | 21 | 4.3%
341 | 26 | 5.3%
338 | 3 | 0.6%

Interactions

[Figures: interaction plots]

Correlations

[Figure: correlation heatmap]

Variable | 집계년도 | 시군명 | 시설물구분명 | 시설물위치 | 시설물준공년도 | 종별구분명 | 노선명
집계년도 | 1.000 | 0.418 | 0.056 | 0.418 | 0.443 | 0.274 | 0.322
시군명 | 0.418 | 1.000 | 0.265 | 1.000 | 0.417 | 0.242 | 0.739
시설물구분명 | 0.056 | 0.265 | 1.000 | 0.265 | 0.000 | 0.459 | 0.224
시설물위치 | 0.418 | 1.000 | 0.265 | 1.000 | 0.417 | 0.242 | 0.739
시설물준공년도 | 0.443 | 0.417 | 0.000 | 0.417 | 1.000 | 0.529 | 0.361
종별구분명 | 0.274 | 0.242 | 0.459 | 0.242 | 0.529 | 1.000 | 0.252
노선명 | 0.322 | 0.739 | 0.224 | 0.739 | 0.361 | 0.252 | 1.000

[Figure: correlation heatmap]

Variable | 시설물위치 | 집계년도 | 시군명 | 시설물구분명 | 종별구분명
시설물위치 | 1.000 | 0.278 | 1.000 | 0.134 | 0.156
집계년도 | 0.278 | 1.000 | 0.278 | 0.036 | 0.110
시군명 | 1.000 | 0.278 | 1.000 | 0.134 | 0.156
시설물구분명 | 0.134 | 0.036 | 0.134 | 1.000 | 0.312
종별구분명 | 0.156 | 0.110 | 0.156 | 0.312 | 1.000

[Figure: correlation heatmap]

Variable | 시설물준공년도 | 노선명 | 집계년도 | 시군명 | 시설물구분명 | 시설물위치 | 종별구분명
시설물준공년도 | 1.000 | -0.135 | 0.278 | 0.211 | 0.000 | 0.211 | 0.341
노선명 | -0.135 | 1.000 | 0.212 | 0.477 | 0.083 | 0.477 | 0.164
집계년도 | 0.278 | 0.212 | 1.000 | 0.278 | 0.036 | 0.278 | 0.110
시군명 | 0.211 | 0.477 | 0.278 | 1.000 | 0.134 | 1.000 | 0.156
시설물구분명 | 0.000 | 0.083 | 0.036 | 0.134 | 1.000 | 0.134 | 0.312
시설물위치 | 0.211 | 0.477 | 0.278 | 1.000 | 0.134 | 1.000 | 0.156
종별구분명 | 0.341 | 0.164 | 0.110 | 0.156 | 0.312 | 0.156 | 1.000
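
The 1.000 correlation between 시군명 and 시설물위치 is what one would expect if 시설물위치 were simply 시군명 without its trailing administrative suffix (시 or 군). The sketch below tests that idea against the Common Values counts of the two variables. The suffix-stripping rule is an assumption inferred from the listed values, and `strip_suffix` is a hypothetical helper. Matching counts are consistent with a one-to-one mapping but do not prove it row by row.

```python
# Counts copied from the Common Values tables of the two variables.
sigun_counts = {
    "양평군": 95, "안성시": 91, "용인시": 77, "화성시": 64, "여주시": 46,
    "광주시": 35, "이천시": 31, "김포시": 25, "평택시": 25,
}
location_counts = {
    "양평": 95, "안성": 91, "용인": 77, "화성": 64, "여주": 46,
    "광주": 35, "이천": 31, "김포": 25, "평택": 25,
}

def strip_suffix(name: str) -> str:
    """Drop a trailing 시 (city) or 군 (county) marker, if present."""
    return name[:-1] if name.endswith(("시", "군")) else name

# Re-aggregate the 시군명 counts under the stripped names.
derived = {}
for name, count in sigun_counts.items():
    key = strip_suffix(name)
    derived[key] = derived.get(key, 0) + count

is_one_to_one = len(derived) == len(sigun_counts)  # no two names collapse together
counts_match = derived == location_counts
print(is_one_to_one, counts_match)
```

If the relationship holds on the full data, one of the two columns is redundant and could be dropped before modelling.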

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

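Index | 집계년도 | 시군명 | 시설물명 | 시설물구분명 | 시설물위치 | 시설물준공년도 | 종별구분명 | 노선명
0 | 2023 | 김포시 | 고음달천교 | 교량 | 김포 | 2022 | 기타 | 84
1 | 2023 | 안성시 | 구수교(1) | 교량 | 안성 | 1992 | 3 | 23
2 | 2023 | 안성시 | 동평교 | 교량 | 안성 | 1992 | 기타 | 325
3 | 2023 | 안성시 | 남풍교 | 교량 | 안성 | 2022 | 기타 | 325
4 | 2023 | 안성시 | 가율교 | 교량 | 안성 | 2022 | 기타 | 325
5 | 2023 | 안성시 | 신장교 | 교량 | 안성 | 2022 | 기타 | 325
6 | 2023 | 안성시 | 산하교 | 교량 | 안성 | 1997 | 기타 | 306
7 | 2023 | 용인시 | 봉명IC교 | 교량 | 용인 | 2022 | 2 | 23
8 | 2023 | 용인시 | 봉명IC교 우측 보강토옹벽 | 옹벽 | 용인 | 2022 | 2 | 23
9 | 2023 | 이천시 | 돈실교 | 교량 | 이천 | 1999 | 3 | 325

Index | 집계년도 | 시군명 | 시설물명 | 시설물구분명 | 시설물위치 | 시설물준공년도 | 종별구분명 | 노선명
479 | 2020 | 화성시 | 상안교(하) | 교량 | 화성 | 2008 | 2 | 318
480 | 2020 | 화성시 | 상안육교(송산방향) | 교량 | 화성 | 2008 | 2 | 318
481 | 2020 | 화성시 | 상안육교(탄도방향) | 교량 | 화성 | 2008 | 2 | 318
482 | 2020 | 화성시 | 무송2교 | 교량 | 화성 | 2016 | 2 | 318
483 | 2020 | 화성시 | 천천지하차도 | 지하차도 | 화성 | 2005 | 2 | 84
484 | 2020 | 화성시 | 숙곡지하차도 | 지하차도 | 화성 | 2016 | 2 | 98
485 | 2020 | 화성시 | 구봉터널(상) | 터널 | 화성 | 2002 | 2 | 318
486 | 2020 | 화성시 | 구봉터널(하) | 터널 | 화성 | 2008 | 2 | 318
487 | 2020 | 화성시 | 북양리절토사면(565-36) | 절토사면 | 화성 | <NA> | 2 | 322
488 | 2020 | 화성시 | 상안리절토사면(8-12) | 절토사면 | 화성 | <NA> | 2 | 318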