Overview

Dataset statistics

Number of variables: 18
Number of observations: 2347
Missing cells: 4293
Missing cells (%): 10.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 348.5 KiB
Average record size in memory: 152.1 B
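
The two derived figures above can be recomputed from the raw counts. This is a minimal sketch, assuming the missing percentage is taken over all cells (observations × variables) and the record size is total memory divided by the number of observations:

```python
# Figures copied from the dataset statistics above.
n_variables = 18
n_observations = 2347
missing_cells = 4293
total_memory_kib = 348.5

# Share of missing cells across the whole table.
missing_pct = 100 * missing_cells / (n_observations * n_variables)

# Average bytes per record (1 KiB = 1024 bytes).
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"{missing_pct:.1f}%")        # 10.2%
print(f"{avg_record_bytes:.1f} B")  # 152.1 B
```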

Variable types

Text: 6
Numeric: 5
Categorical: 7

Dataset

Description: 보행자작동신호기 관리번호, 지주관리번호, 방향 (공통), 설치일, 교체일, X좌표, Y좌표, 제조업체, 작업구분 (공통), 표출구분 (공통), 종류, 신규정규화ID, 상태 (공통), 공사관리번호, 보행자작동신호기 관리번호, 이력ID, 위치정보, 공사형태 (공통)
Author: 서울특별시
URL: https://data.seoul.go.kr/dataList/OA-15545/S/1/datasetView.do

Alerts

공사형태 (공통) is highly overall correlated with 표출구분 (공통) and 2 other fields [High correlation]
위치정보 is highly overall correlated with 이력ID and 3 other fields [High correlation]
제조업체 is highly overall correlated with 작업구분 (공통) and 2 other fields [High correlation]
상태 (공통) is highly overall correlated with 제조업체 and 2 other fields [High correlation]
방향 (공통) is highly overall correlated with 이력ID [High correlation]
X좌표 is highly overall correlated with 신규정규화ID [High correlation]
신규정규화ID is highly overall correlated with X좌표 [High correlation]
이력ID is highly overall correlated with 방향 (공통) and 1 other fields [High correlation]
작업구분 (공통) is highly overall correlated with 제조업체 [High correlation]
표출구분 (공통) is highly overall correlated with 공사형태 (공통) [High correlation]
제조업체 is highly imbalanced (61.5%) [Imbalance]
상태 (공통) is highly imbalanced (91.4%) [Imbalance]
방향 (공통) has 152 (6.5%) missing values [Missing]
설치일 has 1237 (52.7%) missing values [Missing]
교체일 has 1443 (61.5%) missing values [Missing]
X좌표 has 152 (6.5%) missing values [Missing]
Y좌표 has 152 (6.5%) missing values [Missing]
신규정규화ID has 760 (32.4%) missing values [Missing]
공사관리번호 has 397 (16.9%) missing values [Missing]
이력ID has unique values [Unique]
방향 (공통) has 336 (14.3%) zeros [Zeros]
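
The seven missing-value alerts account for every missing cell reported under the dataset statistics. A short check, using only the counts listed above:

```python
n_observations = 2347

# Missing counts per column, copied from the alerts above.
missing = {
    "방향 (공통)": 152,
    "설치일": 1237,
    "교체일": 1443,
    "X좌표": 152,
    "Y좌표": 152,
    "신규정규화ID": 760,
    "공사관리번호": 397,
}

total_missing = sum(missing.values())
print(total_missing)  # 4293

# Per-column share of missing rows.
for name, count in missing.items():
    print(f"{name}: {100 * count / n_observations:.1f}%")
```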

Reproduction

Analysis started: 2024-05-04 05:44:21.818415
Analysis finished: 2024-05-04 05:44:32.020177
Duration: 10.2 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
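
The reported duration is the difference between the two timestamps. A minimal check with the standard library:

```python
from datetime import datetime

# Timestamps copied from the reproduction details above.
started = datetime.fromisoformat("2024-05-04 05:44:21.818415")
finished = datetime.fromisoformat("2024-05-04 05:44:32.020177")

duration = (finished - started).total_seconds()
print(f"{duration:.1f} seconds")  # 10.2 seconds
```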

Variables

보행자작동신호기 관리번호
Text

Distinct2154
Distinct (%)91.8%
Missing0
Missing (%)0.0%
Memory size18.5 KiB

Length

Max length13
Median length13
Mean length13
Min length13

Characters and Unicode

Total characters30511
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1977 ?
Unique (%)84.2%
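
Distinct and Unique measure different things: the former counts different values, the latter counts values that occur exactly once. A sketch of the distinction, using made-up identifiers rather than rows from the dataset:

```python
from collections import Counter

# Hypothetical identifiers, not taken from the dataset.
values = ["29-0000000001", "29-0000000001", "29-0000000093",
          "29-0000001015", "29-0000001040"]

counts = Counter(values)
n_distinct = len(counts)                              # different values
n_unique = sum(1 for c in counts.values() if c == 1)  # values seen once

print(n_distinct, n_unique)  # 4 3
```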

Sample

1st row29-0000001015
2nd row29-0000001040
3rd row29-0000001058
4th row29-0000001064
5th row29-0000001092
ValueCountFrequency (%)
29-0000000001 5
 
0.2%
29-0000000093 4
 
0.2%
29-0000000039 3
 
0.1%
29-0000000035 3
 
0.1%
29-0000000658 3
 
0.1%
29-0000000092 3
 
0.1%
29-0000000030 3
 
0.1%
29-0000000033 3
 
0.1%
29-0000000031 3
 
0.1%
29-0000000034 3
 
0.1%
Other values (2144) 2314
98.6%

Most occurring characters

ValueCountFrequency (%)
0 16127
52.9%
2 3212
 
10.5%
9 3025
 
9.9%
- 2347
 
7.7%
1 1730
 
5.7%
6 693
 
2.3%
3 684
 
2.2%
5 679
 
2.2%
7 678
 
2.2%
4 674
 
2.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 28164
92.3%
Dash Punctuation 2347
 
7.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 16127
57.3%
2 3212
 
11.4%
9 3025
 
10.7%
1 1730
 
6.1%
6 693
 
2.5%
3 684
 
2.4%
5 679
 
2.4%
7 678
 
2.4%
4 674
 
2.4%
8 662
 
2.4%
Dash Punctuation
ValueCountFrequency (%)
- 2347
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 30511
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 16127
52.9%
2 3212
 
10.5%
9 3025
 
9.9%
- 2347
 
7.7%
1 1730
 
5.7%
6 693
 
2.3%
3 684
 
2.2%
5 679
 
2.2%
7 678
 
2.2%
4 674
 
2.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 30511
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 16127
52.9%
2 3212
 
10.5%
9 3025
 
9.9%
- 2347
 
7.7%
1 1730
 
5.7%
6 693
 
2.3%
3 684
 
2.2%
5 679
 
2.2%
7 678
 
2.2%
4 674
 
2.2%

지주관리번호
Text

Distinct1917
Distinct (%)81.7%
Missing0
Missing (%)0.0%
Memory size18.5 KiB

Length

Max length13
Median length13
Mean length13
Min length13

Characters and Unicode

Total characters30511
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1636 ?
Unique (%)69.7%

Sample

1st row02-0000000030
2nd row02-0000117327
3rd row02-0000120394
4th row02-0000050216
5th row02-0000009634
ValueCountFrequency (%)
02-0000162102 8
 
0.3%
02-0000162105 8
 
0.3%
02-0000162103 6
 
0.3%
02-0000006646 5
 
0.2%
02-0000160966 5
 
0.2%
02-0000136757 5
 
0.2%
02-0000080250 5
 
0.2%
02-0000134220 5
 
0.2%
02-0000160968 4
 
0.2%
02-0000001343 4
 
0.2%
Other values (1907) 2292
97.7%

Most occurring characters

ValueCountFrequency (%)
0 14079
46.1%
2 3634
 
11.9%
1 2380
 
7.8%
- 2347
 
7.7%
6 1328
 
4.4%
5 1260
 
4.1%
7 1232
 
4.0%
9 1102
 
3.6%
8 1088
 
3.6%
3 1063
 
3.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 28164
92.3%
Dash Punctuation 2347
 
7.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 14079
50.0%
2 3634
 
12.9%
1 2380
 
8.5%
6 1328
 
4.7%
5 1260
 
4.5%
7 1232
 
4.4%
9 1102
 
3.9%
8 1088
 
3.9%
3 1063
 
3.8%
4 998
 
3.5%
Dash Punctuation
ValueCountFrequency (%)
- 2347
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 30511
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 14079
46.1%
2 3634
 
11.9%
1 2380
 
7.8%
- 2347
 
7.7%
6 1328
 
4.4%
5 1260
 
4.1%
7 1232
 
4.0%
9 1102
 
3.6%
8 1088
 
3.6%
3 1063
 
3.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 30511
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 14079
46.1%
2 3634
 
11.9%
1 2380
 
7.8%
- 2347
 
7.7%
6 1328
 
4.4%
5 1260
 
4.1%
7 1232
 
4.0%
9 1102
 
3.6%
8 1088
 
3.6%
3 1063
 
3.5%

방향 (공통)
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct9
Distinct (%)0.4%
Missing152
Missing (%)6.5%
Infinite0
Infinite (%)0.0%
Mean117.84055
Minimum-45
Maximum315
Zeros336
Zeros (%)14.3%
Negative6
Negative (%)0.3%
Memory size20.8 KiB

Quantile statistics

Minimum: -45
5-th percentile: 0
Q1: 90
median: 90
Q3: 180
95-th percentile: 180
Maximum: 315
Range: 360
Interquartile range (IQR): 90

Descriptive statistics

Standard deviation70.308489
Coefficient of variation (CV)0.59664089
Kurtosis-0.55948267
Mean117.84055
Median Absolute Deviation (MAD)90
Skewness-0.12277135
Sum258660
Variance4943.2836
MonotonicityNot monotonic
Histogram with fixed size bins (bins=9)
Value | Count | Frequency (%)
90 | 881 | 37.5%
180 | 862 | 36.7%
0 | 336 | 14.3%
270 | 64 | 2.7%
45 | 16 | 0.7%
135 | 11 | 0.5%
225 | 11 | 0.5%
315 | 8 | 0.3%
-45 | 6 | 0.3%
(Missing) | 152 | 6.5%

Value | Count | Frequency (%)
-45 | 6 | 0.3%
0 | 336 | 14.3%
45 | 16 | 0.7%
90 | 881 | 37.5%
135 | 11 | 0.5%
180 | 862 | 36.7%
225 | 11 | 0.5%
270 | 64 | 2.7%
315 | 8 | 0.3%

Value | Count | Frequency (%)
315 | 8 | 0.3%
270 | 64 | 2.7%
225 | 11 | 0.5%
180 | 862 | 36.7%
135 | 11 | 0.5%
90 | 881 | 37.5%
45 | 16 | 0.7%
0 | 336 | 14.3%
-45 | 6 | 0.3%
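
Because 방향 (공통) takes only nine distinct values, its summary statistics can be recomputed from the frequency counts alone. The sketch below expands the counts into a list and uses the sample variance (n − 1 denominator), which matches the reported figure. The final line rests on an assumption the report does not state: that the field is a bearing in degrees, in which case -45 and 315 describe the same direction.

```python
import statistics

# Value -> count, copied from the frequency table above (missing rows excluded).
freq = {90: 881, 180: 862, 0: 336, 270: 64, 45: 16,
        135: 11, 225: 11, 315: 8, -45: 6}

# Expand the counts into one entry per observation.
values = [v for v, n in freq.items() for _ in range(n)]

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)

print(len(values), total)  # 2195 258660
print(round(mean, 5))      # 117.84055
print(round(variance, 4))  # 4943.2836

# Assumption: the field is a bearing in degrees, so -45 wraps to 315.
print(-45 % 360)           # 315
```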

설치일
Text

MISSING 

Distinct304
Distinct (%)27.4%
Missing1237
Missing (%)52.7%
Memory size18.5 KiB

Length

Max length8
Median length8
Mean length7.9873874
Min length1

Characters and Unicode

Total characters8866
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique70 ?
Unique (%)6.3%

Sample

1st row20201215
2nd row20201215
3rd row20011203
4th row20201215
5th row20161111
ValueCountFrequency (%)
20100625 34
 
3.1%
20201215 25
 
2.3%
20130430 25
 
2.3%
20171031 24
 
2.2%
20090101 23
 
2.1%
20130725 18
 
1.6%
20131220 16
 
1.4%
20131231 16
 
1.4%
20130920 15
 
1.4%
20150922 14
 
1.3%
Other values (293) 898
81.0%

Most occurring characters

ValueCountFrequency (%)
0 2698
30.4%
1 2083
23.5%
2 1909
21.5%
3 530
 
6.0%
5 404
 
4.6%
9 297
 
3.3%
4 280
 
3.2%
6 277
 
3.1%
8 194
 
2.2%
7 192
 
2.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8864
> 99.9%
Space Separator 2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 2698
30.4%
1 2083
23.5%
2 1909
21.5%
3 530
 
6.0%
5 404
 
4.6%
9 297
 
3.4%
4 280
 
3.2%
6 277
 
3.1%
8 194
 
2.2%
7 192
 
2.2%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8866
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 2698
30.4%
1 2083
23.5%
2 1909
21.5%
3 530
 
6.0%
5 404
 
4.6%
9 297
 
3.3%
4 280
 
3.2%
6 277
 
3.1%
8 194
 
2.2%
7 192
 
2.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8866
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 2698
30.4%
1 2083
23.5%
2 1909
21.5%
3 530
 
6.0%
5 404
 
4.6%
9 297
 
3.3%
4 280
 
3.2%
6 277
 
3.1%
8 194
 
2.2%
7 192
 
2.2%

교체일
Text

MISSING 

Distinct260
Distinct (%)28.8%
Missing1443
Missing (%)61.5%
Memory size18.5 KiB

Length

Max length8
Median length8
Mean length7.9845133
Min length1

Characters and Unicode

Total characters7218
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique63 ?
Unique (%)7.0%

Sample

1st row20201215
2nd row20201215
3rd row20201215
4th row20201215
5th row20201215
ValueCountFrequency (%)
20201215 42
 
4.7%
20171031 22
 
2.4%
20131220 16
 
1.8%
20171215 16
 
1.8%
20131231 16
 
1.8%
20190913 16
 
1.8%
20201016 15
 
1.7%
20190912 14
 
1.6%
20171220 14
 
1.6%
20140425 14
 
1.6%
Other values (249) 717
79.5%

Most occurring characters

ValueCountFrequency (%)
0 1929
26.7%
2 1769
24.5%
1 1664
23.1%
3 377
 
5.2%
5 345
 
4.8%
9 313
 
4.3%
6 239
 
3.3%
7 221
 
3.1%
4 215
 
3.0%
8 144
 
2.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7216
> 99.9%
Space Separator 2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1929
26.7%
2 1769
24.5%
1 1664
23.1%
3 377
 
5.2%
5 345
 
4.8%
9 313
 
4.3%
6 239
 
3.3%
7 221
 
3.1%
4 215
 
3.0%
8 144
 
2.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 7218
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1929
26.7%
2 1769
24.5%
1 1664
23.1%
3 377
 
5.2%
5 345
 
4.8%
9 313
 
4.3%
6 239
 
3.3%
7 221
 
3.1%
4 215
 
3.0%
8 144
 
2.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7218
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1929
26.7%
2 1769
24.5%
1 1664
23.1%
3 377
 
5.2%
5 345
 
4.8%
9 313
 
4.3%
6 239
 
3.3%
7 221
 
3.1%
4 215
 
3.0%
8 144
 
2.0%
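
설치일 and 교체일 are profiled as Text, but their sample values look like YYYYMMDD dates. The minimum length of 1 and the two space characters in each column suggest a few blank placeholders. A minimal parsing sketch under that assumption (the function name `parse_yyyymmdd` is illustrative, not part of the dataset or of ydata-profiling):

```python
from datetime import date, datetime
from typing import Optional


def parse_yyyymmdd(raw: Optional[str]) -> Optional[date]:
    """Return a date for 'YYYYMMDD' text, or None for blanks and bad values."""
    if raw is None:
        return None
    text = raw.strip()
    if len(text) != 8 or not text.isdigit():
        return None
    try:
        return datetime.strptime(text, "%Y%m%d").date()
    except ValueError:  # e.g. month 13 or day 32
        return None


print(parse_yyyymmdd("20201215"))  # 2020-12-15
print(parse_yyyymmdd(" "))         # None
print(parse_yyyymmdd(None))        # None
```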

X좌표
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct1922
Distinct (%)87.6%
Missing152
Missing (%)6.5%
Infinite0
Infinite (%)0.0%
Mean199413.05
Minimum182613.69
Maximum215848.52
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size20.8 KiB

Quantile statistics

Minimum: 182613.69
5-th percentile: 187931.67
Q1: 193500.77
median: 201173.34
Q3: 204601.21
95-th percentile: 209579.13
Maximum: 215848.52
Range: 33234.824
Interquartile range (IQR): 11100.435

Descriptive statistics

Standard deviation6941.2835
Coefficient of variation (CV)0.034808571
Kurtosis-0.6921701
Mean199413.05
Median Absolute Deviation (MAD)4986.3851
Skewness-0.22726675
Sum4.3771165 × 10^8
Variance48181416
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
190934.63327 8
 
0.3%
192304.7565 8
 
0.3%
205745.67406 5
 
0.2%
207062.173 5
 
0.2%
207057.53485 5
 
0.2%
207039.62379 4
 
0.2%
196710.89419 4
 
0.2%
204945.33963 4
 
0.2%
204955.31176 4
 
0.2%
201347.58665 4
 
0.2%
Other values (1912) 2144
91.4%
(Missing) 152
 
6.5%
ValueCountFrequency (%)
182613.69228 1
< 0.1%
183176.56511 1
< 0.1%
183187.03519 1
< 0.1%
183521.12614 1
< 0.1%
183525.41352 1
< 0.1%
183551.46591 1
< 0.1%
183551.5259 1
< 0.1%
183558.63468 1
< 0.1%
183558.75466 1
< 0.1%
183772.7444 1
< 0.1%
ValueCountFrequency (%)
215848.51616 1
< 0.1%
215834.65854 1
< 0.1%
215126.15046 1
< 0.1%
215113.1247 2
0.1%
215101.13852 1
< 0.1%
215095.4747 2
0.1%
214933.75982 1
< 0.1%
214929.32808 1
< 0.1%
214187.61946 1
< 0.1%
214181.54551 1
< 0.1%

Y좌표
Real number (ℝ)

MISSING 

Distinct1921
Distinct (%)87.5%
Missing152
Missing (%)6.5%
Infinite0
Infinite (%)0.0%
Mean551362.48
Minimum539291.37
Maximum565368.17
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size20.8 KiB

Quantile statistics

Minimum: 539291.37
5-th percentile: 541969.51
Q1: 547190.92
median: 550806.82
Q3: 554889.3
95-th percentile: 561977.15
Maximum: 565368.17
Range: 26076.806
Interquartile range (IQR): 7698.3781

Descriptive statistics

Standard deviation5824.2743
Coefficient of variation (CV)0.010563421
Kurtosis-0.47890259
Mean551362.48
Median Absolute Deviation (MAD)3865.3558
Skewness0.23294895
Sum1.2102407 × 10^9
Variance33922171
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
548293.31251 8
 
0.3%
549168.58356 8
 
0.3%
552374.0823 5
 
0.2%
553910.32527 5
 
0.2%
552756.64403 5
 
0.2%
552607.95843 4
 
0.2%
552587.21478 4
 
0.2%
553163.29356 4
 
0.2%
548515.29834 4
 
0.2%
551529.3468 4
 
0.2%
Other values (1911) 2144
91.4%
(Missing) 152
 
6.5%
ValueCountFrequency (%)
539291.36638 1
< 0.1%
539298.7976 1
< 0.1%
539308.26847 1
< 0.1%
539315.62095 1
< 0.1%
539664.75454 1
< 0.1%
539665.84256 1
< 0.1%
539700.94402 1
< 0.1%
539735.44746 1
< 0.1%
539735.71367 1
< 0.1%
539744.00724 1
< 0.1%
ValueCountFrequency (%)
565368.17229 1
< 0.1%
565365.97236 1
< 0.1%
565360.45237 1
< 0.1%
565358.06919 1
< 0.1%
565346.92577 1
< 0.1%
565345.19616 1
< 0.1%
565310.74999 1
< 0.1%
565306.49072 1
< 0.1%
565305.83083 1
< 0.1%
565303.1013 1
< 0.1%

제조업체
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct45
Distinct (%)1.9%
Missing0
Missing (%)0.0%
Memory size18.5 KiB
<NA>
1137 
대한신호
830 
대한신호(주)
 
100
.
 
63
0
 
24
Other values (40)
193 

Length

Max length10
Median length4
Mean length4.0877716
Min length1

Unique

Unique11 ?
Unique (%)0.5%

Sample

1st row대한신호
2nd row.
3rd row.
4th row.
5th row대한신호

Common Values

ValueCountFrequency (%)
<NA> 1137
48.4%
대한신호 830
35.4%
대한신호(주) 100
 
4.3%
. 63
 
2.7%
0 24
 
1.0%
대한신호㈜ 24
 
1.0%
서돌전자통신 17
 
0.7%
한길에이치씨 14
 
0.6%
대한 14
 
0.6%
1 13
 
0.6%
Other values (35) 111
 
4.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
na 1137
48.5%
대한신호 830
35.4%
대한신호(주 101
 
4.3%
63
 
2.7%
0 24
 
1.0%
대한신호㈜ 24
 
1.0%
서돌전자통신 17
 
0.7%
한길에이치씨 14
 
0.6%
대한 14
 
0.6%
1 13
 
0.6%
Other values (33) 108
 
4.6%
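
Several 제조업체 values appear to be spelling variants of one company (대한신호, 대한신호(주), 대한신호㈜), and others look like placeholders (".", "0", "1", and the literal text "<NA>", which the report counts as a category rather than as missing). The sketch below folds these together. Both the placeholder list and the decision to drop the "(주)" corporate marker are assumptions, and `normalize_maker` is a hypothetical helper name.

```python
import unicodedata
from typing import Optional

# Assumption: these values are placeholders rather than company names.
PLACEHOLDERS = {"", ".", "0", "1", "<NA>"}


def normalize_maker(raw: str) -> Optional[str]:
    """Fold spelling variants of a company name; None for placeholders."""
    # NFKC expands the single character "㈜" into the three characters "(주)".
    text = unicodedata.normalize("NFKC", raw).strip()
    if text in PLACEHOLDERS:
        return None
    # Assumption: the "(주)" corporate marker does not distinguish companies.
    text = text.replace("(주)", "").strip()
    return text or None


for raw in ["대한신호", "대한신호(주)", "대한신호㈜", ".", "<NA>"]:
    print(repr(raw), "->", normalize_maker(raw))
```

Whether a short value such as 대한 refers to the same company cannot be decided from the report, so the sketch leaves it untouched.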

작업구분 (공통)
Categorical

HIGH CORRELATION 

Distinct5
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size18.5 KiB
1
1735 
6
261 
2
259 
4
 
74
3
 
18

Length

Max length1
Median length1
Mean length1
Min length1

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row2
3rd row2
4th row2
5th row1

Common Values

ValueCountFrequency (%)
1 1735
73.9%
6 261
 
11.1%
2 259
 
11.0%
4 74
 
3.2%
3 18
 
0.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 1735
73.9%
6 261
 
11.1%
2 259
 
11.0%
4 74
 
3.2%
3 18
 
0.8%

표출구분 (공통)
Categorical

HIGH CORRELATION 

Distinct2
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size18.5 KiB
1
1783 
2
564 

Length

Max length1
Median length1
Mean length1
Min length1

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row1
3rd row1
4th row1
5th row1

Common Values

ValueCountFrequency (%)
1 1783
76.0%
2 564
 
24.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 1783
76.0%
2 564
 
24.0%

종류
Categorical

Distinct3
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size18.5 KiB
0
1791 
1
537 
<NA>
 
19

Length

Max length4
Median length1
Mean length1.0242863
Min length1

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row0
3rd row0
4th row0
5th row1

Common Values

ValueCountFrequency (%)
0 1791
76.3%
1 537
 
22.9%
<NA> 19
 
0.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 1791
76.3%
1 537
 
22.9%
na 19
 
0.8%

신규정규화ID
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct1578
Distinct (%)99.4%
Missing760
Missing (%)32.4%
Infinite0
Infinite (%)0.0%
Mean4041538.4
Minimum289141
Maximum52858210
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size20.8 KiB

Quantile statistics

Minimum: 289141
5-th percentile: 1178773
Q1: 2303672.5
median: 4221694
Q3: 5108596
95-th percentile: 6214761.7
Maximum: 52858210
Range: 52569069
Interquartile range (IQR): 2804923.5

Descriptive statistics

Standard deviation3628310.5
Coefficient of variation (CV)0.89775481
Kurtosis92.332027
Mean4041538.4
Median Absolute Deviation (MAD)1084922
Skewness8.4893263
Sum6.4139215 × 10^9
Variance1.3164637 × 10^13
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
32035810 3
 
0.1%
2979810 3
 
0.1%
32674510 2
 
0.1%
20069010 2
 
0.1%
52858210 2
 
0.1%
12716410 2
 
0.1%
44652510 2
 
0.1%
2373936 1
 
< 0.1%
4432912 1
 
< 0.1%
1106496 1
 
< 0.1%
Other values (1568) 1568
66.8%
(Missing) 760
32.4%
ValueCountFrequency (%)
289141 1
< 0.1%
289244 1
< 0.1%
297972 1
< 0.1%
297989 1
< 0.1%
352541 1
< 0.1%
352542 1
< 0.1%
352811 1
< 0.1%
362942 1
< 0.1%
372042 1
< 0.1%
380205 1
< 0.1%
ValueCountFrequency (%)
52858210 2
0.1%
45309410 1
 
< 0.1%
44652510 2
0.1%
42562410 1
 
< 0.1%
33840610 1
 
< 0.1%
32674510 2
0.1%
32035810 3
0.1%
20069010 2
0.1%
20059910 1
 
< 0.1%
12716410 2
0.1%

상태 (공통)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct5
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size18.5 KiB
001
2288 
<NA>
 
41
004
 
14
002
 
2
 
2

Length

Max length4
Median length3
Mean length3.0157648
Min length1

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row001
2nd row001
3rd row001
4th row001
5th row001

Common Values

ValueCountFrequency (%)
001 2288
97.5%
<NA> 41
 
1.7%
004 14
 
0.6%
002 2
 
0.1%
2
 
0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
001 2288
97.6%
na 41
 
1.7%
004 14
 
0.6%
002 2
 
0.1%
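
The 91.4% imbalance reported for 상태 (공통) can be reproduced as one minus the normalized Shannon entropy of the category counts. That this is how the report derives the figure is an assumption, checked here only against this one column:

```python
import math

# Category counts for 상태 (공통), copied from the Common Values table above.
counts = [2288, 41, 14, 2, 2]


def imbalance(counts):
    """1 - (Shannon entropy / maximum entropy); 0 = balanced, 1 = one class."""
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(counts))


print(f"{100 * imbalance(counts):.1f}%")  # 91.4%
```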

공사관리번호
Text

MISSING 

Distinct312
Distinct (%)16.0%
Missing397
Missing (%)16.9%
Memory size18.5 KiB

Length

Max length13
Median length13
Mean length12.987692
Min length1

Characters and Unicode

Total characters25326
Distinct characters12
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique53 ?
Unique (%)2.7%

Sample

1st row2012-0101-006
2nd row2012-0101-091
3rd row2012-0101-112
4th row2012-0101-112
5th row2012-0101-111
ValueCountFrequency (%)
2000-0000-000 702
36.0%
2019-0201-040 58
 
3.0%
2020-0101-114 42
 
2.2%
2011-0101-084 32
 
1.6%
2009-1101-151 26
 
1.3%
2012-0101-012 26
 
1.3%
2016-2301-001 23
 
1.2%
2013-0111-004 18
 
0.9%
2014-0101-034 18
 
0.9%
2020-0101-011 16
 
0.8%
Other values (301) 987
50.7%

Most occurring characters

ValueCountFrequency (%)
0 12138
47.9%
1 4182
 
16.5%
- 3896
 
15.4%
2 2896
 
11.4%
4 453
 
1.8%
3 395
 
1.6%
6 308
 
1.2%
7 302
 
1.2%
9 297
 
1.2%
5 247
 
1.0%
Other values (2) 212
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 21428
84.6%
Dash Punctuation 3896
 
15.4%
Space Separator 2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 12138
56.6%
1 4182
 
19.5%
2 2896
 
13.5%
4 453
 
2.1%
3 395
 
1.8%
6 308
 
1.4%
7 302
 
1.4%
9 297
 
1.4%
5 247
 
1.2%
8 210
 
1.0%
Dash Punctuation
ValueCountFrequency (%)
- 3896
100.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 25326
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 12138
47.9%
1 4182
 
16.5%
- 3896
 
15.4%
2 2896
 
11.4%
4 453
 
1.8%
3 395
 
1.6%
6 308
 
1.2%
7 302
 
1.2%
9 297
 
1.2%
5 247
 
1.0%
Other values (2) 212
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 25326
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 12138
47.9%
1 4182
 
16.5%
- 3896
 
15.4%
2 2896
 
11.4%
4 453
 
1.8%
3 395
 
1.6%
6 308
 
1.2%
7 302
 
1.2%
9 297
 
1.2%
5 247
 
1.0%
Other values (2) 212
 
0.8%
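
공사관리번호 values follow a fixed NNNN-NNNN-NNN shape, and the most frequent value, 2000-0000-000 (36.0%), looks like a placeholder. The sketch below splits a value into its three groups and flags that pattern. The report does not say what the groups mean, so the field names and the placeholder rule are assumptions.

```python
import re
from typing import NamedTuple, Optional

PATTERN = re.compile(r"(\d{4})-(\d{4})-(\d{3})")


class WorkId(NamedTuple):
    head: str            # first group; looks like a year in the samples
    middle: str
    tail: str
    is_placeholder: bool


def parse_work_id(raw: str) -> Optional[WorkId]:
    """Split 'NNNN-NNNN-NNN' text; None when the value does not match."""
    match = PATTERN.fullmatch(raw.strip())
    if match is None:
        return None
    head, middle, tail = match.groups()
    # Assumption: an all-zero middle and tail mark an unknown project.
    return WorkId(head, middle, tail, middle == "0000" and tail == "000")


print(parse_work_id("2012-0101-006"))
print(parse_work_id("2000-0000-000").is_placeholder)  # True
print(parse_work_id(" "))                             # None
```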

보행자작동신호기 관리번호.1
Text

Distinct2154
Distinct (%)91.8%
Missing0
Missing (%)0.0%
Memory size18.5 KiB

Length

Max length9
Median length9
Mean length9
Min length9

Characters and Unicode

Total characters21123
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1977 ?
Unique (%)84.2%

Sample

1st row29-001015
2nd row29-001040
3rd row29-001058
4th row29-001064
5th row29-001092
ValueCountFrequency (%)
29-000001 5
 
0.2%
29-000093 4
 
0.2%
29-000039 3
 
0.1%
29-000035 3
 
0.1%
29-000658 3
 
0.1%
29-000092 3
 
0.1%
29-000030 3
 
0.1%
29-000033 3
 
0.1%
29-000031 3
 
0.1%
29-000034 3
 
0.1%
Other values (2144) 2314
98.6%

Most occurring characters

ValueCountFrequency (%)
0 6739
31.9%
2 3212
15.2%
9 3025
14.3%
- 2347
 
11.1%
1 1730
 
8.2%
6 693
 
3.3%
3 684
 
3.2%
5 679
 
3.2%
7 678
 
3.2%
4 674
 
3.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 18776
88.9%
Dash Punctuation 2347
 
11.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 6739
35.9%
2 3212
17.1%
9 3025
16.1%
1 1730
 
9.2%
6 693
 
3.7%
3 684
 
3.6%
5 679
 
3.6%
7 678
 
3.6%
4 674
 
3.6%
8 662
 
3.5%
Dash Punctuation
ValueCountFrequency (%)
- 2347
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 21123
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 6739
31.9%
2 3212
15.2%
9 3025
14.3%
- 2347
 
11.1%
1 1730
 
8.2%
6 693
 
3.3%
3 684
 
3.2%
5 679
 
3.2%
7 678
 
3.2%
4 674
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 21123
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 6739
31.9%
2 3212
15.2%
9 3025
14.3%
- 2347
 
11.1%
1 1730
 
8.2%
6 693
 
3.3%
3 684
 
3.2%
5 679
 
3.2%
7 678
 
3.2%
4 674
 
3.2%
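
In the sample rows, this 9-character column matches the 13-character 보행자작동신호기 관리번호 with shorter zero padding (29-0000001015 becomes 29-001015), and the two columns share the same Distinct (2154) and Unique (1977) counts. A conversion sketch under that assumption; the rule is inferred from the samples only, and `to_short_id` is a hypothetical helper:

```python
import re

LONG_ID = re.compile(r"(\d{2})-(\d{10})")


def to_short_id(long_id: str) -> str:
    """'29-0000001015' -> '29-001015' (serial re-padded to six digits)."""
    match = LONG_ID.fullmatch(long_id)
    if match is None:
        raise ValueError(f"unexpected ID format: {long_id!r}")
    prefix, serial = match.groups()
    return f"{prefix}-{int(serial):06d}"


print(to_short_id("29-0000001015"))  # 29-001015
print(to_short_id("29-0000000001"))  # 29-000001
```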

이력ID
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct2347
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1182.0017
Minimum1
Maximum2374
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size20.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 118.3
Q1: 587.5
median: 1174
Q3: 1776.5
95-th percentile: 2256.7
Maximum: 2374
Range: 2373
Interquartile range (IQR): 1189

Descriptive statistics

Standard deviation686.5346
Coefficient of variation (CV)0.58082369
Kurtosis-1.202622
Mean1182.0017
Median Absolute Deviation (MAD)595
Skewness0.014284317
Sum2774158
Variance471329.75
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
3 1
 
< 0.1%
2028 1
 
< 0.1%
1769 1
 
< 0.1%
1768 1
 
< 0.1%
1949 1
 
< 0.1%
1939 1
 
< 0.1%
1420 1
 
< 0.1%
1990 1
 
< 0.1%
2002 1
 
< 0.1%
1980 1
 
< 0.1%
Other values (2337) 2337
99.6%
ValueCountFrequency (%)
1 1
< 0.1%
2 1
< 0.1%
3 1
< 0.1%
4 1
< 0.1%
5 1
< 0.1%
6 1
< 0.1%
7 1
< 0.1%
8 1
< 0.1%
9 1
< 0.1%
10 1
< 0.1%
ValueCountFrequency (%)
2374 1
< 0.1%
2373 1
< 0.1%
2372 1
< 0.1%
2371 1
< 0.1%
2370 1
< 0.1%
2369 1
< 0.1%
2368 1
< 0.1%
2367 1
< 0.1%
2366 1
< 0.1%
2365 1
< 0.1%

위치정보
Categorical

HIGH CORRELATION 

Distinct3
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size18.5 KiB
1
1289 
<NA>
1056 
 
2

Length

Max length4
Median length1
Mean length2.3498083
Min length1

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row1
3rd row1
4th row1
5th row1

Common Values

ValueCountFrequency (%)
1 1289
54.9%
<NA> 1056
45.0%
2
 
0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 1289
55.0%
na 1056
45.0%

공사형태 (공통)
Categorical

HIGH CORRELATION 

Distinct7
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size18.5 KiB
<NA>
949 
003
805 
001
415 
002
 
90
004
 
79
Other values (2)
 
9

Length

Max length4
Median length3
Mean length3.4026417
Min length1

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row003
2nd row<NA>
3rd row<NA>
4th row<NA>
5th row003

Common Values

ValueCountFrequency (%)
<NA> 949
40.4%
003 805
34.3%
001 415
17.7%
002 90
 
3.8%
004 79
 
3.4%
006 7
 
0.3%
2
 
0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
na 949
40.5%
003 805
34.3%
001 415
17.7%
002 90
 
3.8%
004 79
 
3.4%
006 7
 
0.3%

Interactions

(25 interaction plots, one per ordered pair of the 5 numeric variables; images not preserved)

Correlations

Variable | 방향 (공통) | X좌표 | Y좌표 | 제조업체 | 작업구분 (공통) | 표출구분 (공통) | 종류 | 신규정규화ID | 상태 (공통) | 이력ID | 위치정보 | 공사형태 (공통)
방향 (공통) | 1.000 | 0.223 | 0.303 | 0.476 | 0.440 | 0.157 | 0.216 | 0.000 | 0.000 | 0.612 | 0.223 | 0.268
X좌표 | 0.223 | 1.000 | 0.664 | 0.656 | 0.329 | 0.132 | 0.159 | 0.635 | 0.155 | 0.523 | 0.304 | 0.233
Y좌표 | 0.303 | 0.664 | 1.000 | 0.633 | 0.471 | 0.300 | 0.224 | 0.244 | 0.159 | 0.594 | 0.358 | 0.392
제조업체 | 0.476 | 0.656 | 0.633 | 1.000 | 0.823 | 0.450 | 0.395 | 0.000 | 0.880 | 0.635 | 1.000 | 0.794
작업구분 (공통) | 0.440 | 0.329 | 0.471 | 0.823 | 1.000 | 0.223 | 0.131 | 0.114 | 0.000 | 0.580 | 0.000 | 0.392
표출구분 (공통) | 0.157 | 0.132 | 0.300 | 0.450 | 0.223 | 1.000 | 0.222 | 0.024 | 0.091 | 0.585 | 0.100 | 0.911
종류 | 0.216 | 0.159 | 0.224 | 0.395 | 0.131 | 0.222 | 1.000 | 0.073 | 0.072 | 0.399 | 0.090 | 0.145
신규정규화ID | 0.000 | 0.635 | 0.244 | 0.000 | 0.114 | 0.024 | 0.073 | 1.000 | 0.046 | 0.163 | 0.135 | 0.101
상태 (공통) | 0.000 | 0.155 | 0.159 | 0.880 | 0.000 | 0.091 | 0.072 | 0.046 | 1.000 | 0.099 | 1.000 | 0.741
이력ID | 0.612 | 0.523 | 0.594 | 0.635 | 0.580 | 0.585 | 0.399 | 0.163 | 0.099 | 1.000 | 1.000 | 0.448
위치정보 | 0.223 | 0.304 | 0.358 | 1.000 | 0.000 | 0.100 | 0.090 | 0.135 | 1.000 | 1.000 | 1.000 | 1.000
공사형태 (공통) | 0.268 | 0.233 | 0.392 | 0.794 | 0.392 | 0.911 | 0.145 | 0.101 | 0.741 | 0.448 | 1.000 | 1.000

Variable | 공사형태 (공통) | 종류 | 위치정보 | 작업구분 (공통) | 제조업체 | 표출구분 (공통) | 상태 (공통)
공사형태 (공통) | 1.000 | 0.104 | 0.997 | 0.282 | 0.487 | 0.734 | 0.577
종류 | 0.104 | 1.000 | 0.057 | 0.160 | 0.309 | 0.142 | 0.047
위치정보 | 0.997 | 0.057 | 1.000 | 0.000 | 0.981 | 0.064 | 0.999
작업구분 (공통) | 0.282 | 0.160 | 0.000 | 1.000 | 0.533 | 0.272 | 0.000
제조업체 | 0.487 | 0.309 | 0.981 | 0.533 | 1.000 | 0.353 | 0.683
표출구분 (공통) | 0.734 | 0.142 | 0.064 | 0.272 | 0.353 | 1.000 | 0.060
상태 (공통) | 0.577 | 0.047 | 0.999 | 0.000 | 0.683 | 0.060 | 1.000

Variable | 방향 (공통) | X좌표 | Y좌표 | 신규정규화ID | 이력ID | 제조업체 | 작업구분 (공통) | 표출구분 (공통) | 종류 | 상태 (공통) | 위치정보 | 공사형태 (공통)
방향 (공통) | 1.000 | -0.083 | -0.208 | -0.095 | 0.524 | 0.199 | 0.272 | 0.157 | 0.215 | 0.000 | 0.223 | 0.152
X좌표 | -0.083 | 1.000 | 0.275 | 0.940 | 0.029 | 0.283 | 0.142 | 0.101 | 0.122 | 0.093 | 0.233 | 0.124
Y좌표 | -0.208 | 0.275 | 1.000 | 0.374 | -0.091 | 0.270 | 0.214 | 0.231 | 0.174 | 0.095 | 0.274 | 0.219
신규정규화ID | -0.095 | 0.940 | 0.374 | 1.000 | 0.071 | 0.000 | 0.122 | 0.026 | 0.078 | 0.031 | 0.223 | 0.060
이력ID | 0.524 | 0.029 | -0.091 | 0.071 | 1.000 | 0.271 | 0.278 | 0.452 | 0.306 | 0.059 | 0.998 | 0.255
제조업체 | 0.199 | 0.283 | 0.270 | 0.000 | 0.271 | 1.000 | 0.533 | 0.353 | 0.309 | 0.683 | 0.981 | 0.487
작업구분 (공통) | 0.272 | 0.142 | 0.214 | 0.122 | 0.278 | 0.533 | 1.000 | 0.272 | 0.160 | 0.000 | 0.000 | 0.282
표출구분 (공통) | 0.157 | 0.101 | 0.231 | 0.026 | 0.452 | 0.353 | 0.272 | 1.000 | 0.142 | 0.060 | 0.064 | 0.734
종류 | 0.215 | 0.122 | 0.174 | 0.078 | 0.306 | 0.309 | 0.160 | 0.142 | 1.000 | 0.047 | 0.057 | 0.104
상태 (공통) | 0.000 | 0.093 | 0.095 | 0.031 | 0.059 | 0.683 | 0.000 | 0.060 | 0.047 | 1.000 | 0.999 | 0.577
위치정보 | 0.223 | 0.233 | 0.274 | 0.223 | 0.998 | 0.981 | 0.000 | 0.064 | 0.057 | 0.999 | 1.000 | 0.997
공사형태 (공통) | 0.152 | 0.124 | 0.219 | 0.060 | 0.255 | 0.487 | 0.282 | 0.734 | 0.104 | 0.577 | 0.997 | 1.000
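
The High correlation alerts in the Overview are consistent with flagging every pair in the last matrix whose absolute coefficient is at least 0.5. The cutoff is an assumption inferred from the alerts, not a value stated in the report. A sketch using two rows of that matrix:

```python
# Column order and two rows copied from the last correlation matrix above.
FIELDS = ["방향 (공통)", "X좌표", "Y좌표", "신규정규화ID", "이력ID", "제조업체",
          "작업구분 (공통)", "표출구분 (공통)", "종류", "상태 (공통)", "위치정보",
          "공사형태 (공통)"]
ROWS = {
    "위치정보": [0.223, 0.233, 0.274, 0.223, 0.998, 0.981,
                 0.000, 0.064, 0.057, 0.999, 1.000, 0.997],
    "Y좌표": [-0.208, 0.275, 1.000, 0.374, -0.091, 0.270,
              0.214, 0.231, 0.174, 0.095, 0.274, 0.219],
}
THRESHOLD = 0.5  # assumed cutoff


def highly_correlated(name):
    """Other fields whose absolute correlation with `name` meets the cutoff."""
    return [f for f, r in zip(FIELDS, ROWS[name])
            if f != name and abs(r) >= THRESHOLD]


print(highly_correlated("위치정보"))
# ['이력ID', '제조업체', '상태 (공통)', '공사형태 (공통)']
print(highly_correlated("Y좌표"))  # []
```

The first result matches the alert "위치정보 is highly overall correlated with 이력ID and 3 other fields", and the empty second result matches the absence of any such alert for Y좌표.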

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

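(index) | 보행자작동신호기 관리번호 | 지주관리번호 | 방향 (공통) | 설치일 | 교체일 | X좌표 | Y좌표 | 제조업체 | 작업구분 (공통) | 표출구분 (공통) | 종류 | 신규정규화ID | 상태 (공통) | 공사관리번호 | 보행자작동신호기 관리번호.1 | 이력ID | 위치정보 | 공사형태 (공통)
0 | 29-0000001015 | 02-0000000030 | 315 | <NA> | <NA> | 197694.00138 | 550450.82047 | 대한신호 | 1 | 1 | 1 | 3267046 | 001 | 2012-0101-006 | 29-001015 | 3 | 1 | 003
1 | 29-0000001040 | 02-0000117327 | 90 | <NA> | <NA> | 201594.49009 | 558600.64063 | . | 2 | 1 | 0 | <NA> | 001 | 2012-0101-091 | 29-001040 | 807 | 1 | <NA>
2 | 29-0000001058 | 02-0000120394 | 90 | <NA> | <NA> | 206861.93441 | 550509.99666 | . | 2 | 1 | 0 | <NA> | 001 | 2012-0101-112 | 29-001058 | 836 | 1 | <NA>
3 | 29-0000001064 | 02-0000050216 | 90 | <NA> | <NA> | 206777.83936 | 550295.15259 | . | 2 | 1 | 0 | <NA> | 001 | 2012-0101-112 | 29-001064 | 263 | 1 | <NA>
4 | 29-0000001092 | 02-0000009634 | 90 | <NA> | <NA> | 198742.67717 | 550794.67193 | 대한신호 | 1 | 1 | 1 | 3288102 | 001 | 2012-0101-111 | 29-001092 | 103 | 1 | 003
5 | 29-0000001087 | 02-0000139703 | 90 | <NA> | <NA> | 197867.46342 | 550996.25356 | . | 2 | 1 | 0 | <NA> | 001 | 2012-0101-111 | 29-001087 | 940 | 1 | <NA>
6 | 29-0000000950 | 02-0000079912 | 90 | <NA> | <NA> | 206522.05977 | 554275.64231 | 0 | 1 | 1 | 0 | <NA> | 001 | 2011-0101-084 | 29-000950 | 496 | 1 | <NA>
7 | 29-0000000929 | 02-0000129123 | 0 | <NA> | <NA> | 204960.44412 | 551607.19086 | 0 | 1 | 1 | 0 | <NA> | 001 | 2011-0101-084 | 29-000929 | 860 | 1 | <NA>
8 | 29-0000000618 | 02-0000068290 | 90 | <NA> | <NA> | 206029.1903 | 553142.69337 | <NA> | 1 | 1 | 0 | 5322773 | 001 | 2000-0000-000 | 29-000618 | 399 | 1 | 003
9 | 29-0000000647 | 02-0000068759 | 90 | <NA> | <NA> | 205673.63057 | 552209.97284 | <NA> | 1 | 1 | 1 | 5310993 | 001 | 2000-0000-000 | 29-000647 | 401 | 1 | 003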

(index) | 보행자작동신호기 관리번호 | 지주관리번호 | 방향 (공통) | 설치일 | 교체일 | X좌표 | Y좌표 | 제조업체 | 작업구분 (공통) | 표출구분 (공통) | 종류 | 신규정규화ID | 상태 (공통) | 공사관리번호 | 보행자작동신호기 관리번호.1 | 이력ID | 위치정보 | 공사형태 (공통)
2337 | 29-0000002176 | 02-0000223061 | <NA> | 20230724 | 20230724 | <NA> | <NA> | 대한신호 | 1 | 2 | 1 | 352541 | 001 | 2023-0101-068 | 29-002176 | 2369 | <NA> | 001
2338 | 29-0000002175 | 02-0000223059 | <NA> | 20230724 | 20230724 | <NA> | <NA> | 대한신호(주) | 1 | 2 | 1 | 1331081 | 001 | 2023-0101-062 | 29-002175 | 2368 | <NA> | 001
2339 | 29-0000002174 | 02-0000070042 | <NA> | 20230724 | 20230724 | <NA> | <NA> | 대한신호(주) | 1 | 2 | 1 | 1331091 | 001 | 2023-0101-062 | 29-002174 | 2367 | <NA> | 001
2340 | 29-0000001145 | 02-0000157316 | 180 | 20230930 | 20230930 | 199951.69226 | 545467.93916 | 대한신호 | 1 | 2 | 1 | 4107547 | 001 | 2023-0101-010 | 29-001145 | 1338 | <NA> | 002
2341 | 29-0000001621 | 02-0000180594 | 180 | 20140806 | 20230725 | 203736.54785 | 565288.73377 | 대한신호(주) | 1 | 2 | 0 | 4587104 | 001 | 2023-0101-133 | 29-001621 | 1814 | <NA> | 002
2342 | 29-0000001622 | 02-0000180595 | 180 | 20140806 | 20230725 | 203707.51284 | 565283.3347 | 대한신호(주) | 1 | 2 | 0 | 4587004 | 001 | 2023-0101-133 | 29-001622 | 1815 | <NA> | 002
2343 | 29-0000001295 | 02-0000161621 | 180 | 20230615 | 20230615 | 201527.47298 | 561459.06215 | <NA> | 1 | 2 | 0 | 4439743 | 001 | 2023-0101-133 | 29-001295 | 1488 | <NA> | 002
2344 | 29-0000001294 | 02-0000161622 | 180 | 20230615 | 20230615 | 201517.11351 | 561448.54239 | <NA> | 1 | 2 | 0 | 4439633 | 001 | 2023-0101-133 | 29-001294 | 1487 | <NA> | 002
2345 | 29-0000001564 | 02-0000178446 | 180 | 20231128 | 20231128 | 198126.13139 | 550491.07917 | 대한신호 | 1 | 2 | 1 | 3267945 | 001 | 2023-0101-148 | 29-001564 | 1757 | <NA> | 002
2346 | 29-0000001563 | 02-0000178448 | 180 | 20130101 | 20231128 | 198138.24931 | 550498.78784 | 대한신호 | 1 | 2 | 1 | 3267944 | 001 | 2023-0101-148 | 29-001563 | 1756 | <NA> | 002