Overview

Dataset statistics

Number of variables: 4
Number of observations: 33
Missing cells: 75
Missing cells (%): 56.8%
Duplicate rows: 1
Duplicate rows (%): 3.0%
Total size in memory: 1.2 KiB
Average record size in memory: 37.0 B
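
The two percentages above follow from simple ratios over the table's dimensions. A minimal Python sketch of that arithmetic, using only the counts reported in this section (the variable names are illustrative and not part of the report):

```python
# Counts copied from the Dataset statistics listed above.
n_variables = 4
n_observations = 33
missing_cells = 75
duplicate_rows = 1

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations

# Share of cells that are empty, and share of rows flagged as duplicates.
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```

The 75 missing cells correspond to 25 missing values in each of the three variables flagged as MISSING in the Alerts section.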

Variable types

Numeric: 1
Text: 1
DateTime: 1
Categorical: 1

Dataset

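Description: 부산광역시남구_정보통신공사사용전검사현황_20220308 (Busan Metropolitan City Nam-gu, status of pre-use inspections of information and communications construction work, 2022-03-08)
Author: Busan Metropolitan City Nam-gu (부산광역시 남구)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=3080535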

Alerts

Duplicates: Dataset has 1 (3.0%) duplicate row
High correlation: 순번 is highly correlated overall with 공사의종류
High correlation: 공사의종류 is highly correlated overall with 순번
Missing: 순번 has 25 (75.8%) missing values
Missing: 현장주소 has 25 (75.8%) missing values
Missing: 교부연월일 has 25 (75.8%) missing values

Reproduction

Analysis started: 2023-12-10 17:06:21.255896
Analysis finished: 2023-12-10 17:06:22.344156
Duration: 1.09 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

순번 (sequence number)
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 8
Distinct (%): 100.0%
Missing: 25
Missing (%): 75.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.5
Minimum: 1
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 429.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1.35
Q1: 2.75
Median: 4.5
Q3: 6.25
95-th percentile: 7.65
Maximum: 8
Range: 7
Interquartile range (IQR): 3.5

Descriptive statistics

Standard deviation: 2.4494897
Coefficient of variation (CV): 0.54433105
Kurtosis: -1.2
Mean: 4.5
Median Absolute Deviation (MAD): 2
Skewness: 0
Sum: 36
Variance: 6
Monotonicity: Strictly increasing
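
Several of the quantile and descriptive statistics can be checked by hand, since the eight non-missing values are simply 1 through 8. The sketch below uses only the Python standard library. The `percentile` helper and its linear-interpolation rule are assumptions chosen because they reproduce the reported figures; the report itself does not state which convention it uses.

```python
import statistics

# The eight non-missing 순번 values listed in the frequency table.
values = list(range(1, 9))


def percentile(sorted_vals, q):
    """Hypothetical helper: percentile by linear interpolation between ranks."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)       # square root of the sample variance
cv = std_dev / mean                      # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

p05 = percentile(values, 0.05)
q1 = percentile(values, 0.25)
q3 = percentile(values, 0.75)
p95 = percentile(values, 0.95)
iqr = q3 - q1

print(mean, variance, round(std_dev, 7), round(cv, 8), mad)
print(round(p05, 2), q1, q3, round(p95, 2), iqr)
```

Under these assumptions the variance is computed with the n - 1 denominator; the population variance of the same values would be 5.25 rather than 6.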
Histogram with fixed size bins (bins=8)

Value | Count | Frequency (%)
1 | 1 | 3.0%
2 | 1 | 3.0%
3 | 1 | 3.0%
4 | 1 | 3.0%
5 | 1 | 3.0%
6 | 1 | 3.0%
7 | 1 | 3.0%
8 | 1 | 3.0%
(Missing) | 25 | 75.8%

Value | Count | Frequency (%)
1 | 1 | 3.0%
2 | 1 | 3.0%
3 | 1 | 3.0%
4 | 1 | 3.0%
5 | 1 | 3.0%
6 | 1 | 3.0%
7 | 1 | 3.0%
8 | 1 | 3.0%

Value | Count | Frequency (%)
8 | 1 | 3.0%
7 | 1 | 3.0%
6 | 1 | 3.0%
5 | 1 | 3.0%
4 | 1 | 3.0%
3 | 1 | 3.0%
2 | 1 | 3.0%
1 | 1 | 3.0%

현장주소 (site address)
Text

MISSING

Distinct: 8
Distinct (%): 100.0%
Missing: 25
Missing (%): 75.8%
Memory size: 396.0 B

Length

Max length: 21
Median length: 18.5
Mean length: 16.5
Min length: 12

Characters and Unicode

Total characters: 132
Distinct characters: 32
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
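
As a rough illustration of how such character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The eight addresses are taken from the Sample table at the end of this report, so their exact spelling is an assumption based on that table. Detecting Hangul through the character name is a simplification; the standard library has no script lookup.

```python
import unicodedata
from collections import Counter

# The eight non-missing 현장주소 values, as read from the Sample table.
addresses = [
    "남구 용호동 554-21번지",
    "대연동 894-18번지",
    "남구 용호동 407-10 외 6필지",
    "남구 감만동 33-8번지",
    "남구 문현동 127-78번지",
    "부산광역시 남구 대연동 231-56번지",
    "부산광역시 남구 대연동 1736-3",
    "부산광역시 남구 대연동 252-7",
]

# General category of every character: Lo = Other Letter, Nd = Decimal Number,
# Zs = Space Separator, Pd = Dash Punctuation.
categories = Counter(
    unicodedata.category(ch) for addr in addresses for ch in addr
)

# Approximate script detection through the official character name.
hangul = sum(
    unicodedata.name(ch, "").startswith("HANGUL")
    for addr in addresses
    for ch in addr
)

total = sum(len(addr) for addr in addresses)
print(total, dict(categories), hangul)
```

With these inputs the tallies agree with the figures reported in the following tables: 132 characters in total, of which 66 are Hangul letters, 38 digits, 20 spaces and 8 hyphens.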

Unique

Unique: 8
Unique (%): 100.0%

Sample

1st row: 남구 용호동 554-21번지
2nd row: 대연동 894-18번지
3rd row: 남구 용호동 407-10 외 6필지
4th row: 남구 감만동 33-8번지
5th row: 남구 문현동 127-78번지

Value | Count | Frequency (%)
남구 | 7 | 25.0%
대연동 | 4 | 14.3%
부산광역시 | 3 | 10.7%
용호동 | 2 | 7.1%
554-21번지 | 1 | 3.6%
894-18번지 | 1 | 3.6%
407-10 | 1 | 3.6%
외 | 1 | 3.6%
6필지 | 1 | 3.6%
감만동 | 1 | 3.6%
Other values (6) | 6 | 21.4%

Most occurring characters

Value | Count | Frequency (%)
(space) | 20 | 15.2%
- | 8 | 6.1%
동 | 8 | 6.1%
남 | 7 | 5.3%
구 | 7 | 5.3%
지 | 6 | 4.5%
1 | 6 | 4.5%
2 | 5 | 3.8%
번 | 5 | 3.8%
7 | 5 | 3.8%
Other values (22) | 55 | 41.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 66 | 50.0%
Decimal Number | 38 | 28.8%
Space Separator | 20 | 15.2%
Dash Punctuation | 8 | 6.1%

Most frequent character per category

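Other Letter
Value | Count | Frequency (%)
동 | 8 | 12.1%
남 | 7 | 10.6%
구 | 7 | 10.6%
지 | 6 | 9.1%
번 | 5 | 7.6%
대 | 4 | 6.1%
연 | 4 | 6.1%
(character lost) | 3 | 4.5%
(character lost) | 3 | 4.5%
(character lost) | 3 | 4.5%
Other values (10) | 16 | 24.2%

Decimal Number
Value | Count | Frequency (%)
1 | 6 | 15.8%
2 | 5 | 13.2%
7 | 5 | 13.2%
3 | 5 | 13.2%
5 | 4 | 10.5%
8 | 4 | 10.5%
6 | 3 | 7.9%
4 | 3 | 7.9%
0 | 2 | 5.3%
9 | 1 | 2.6%

Space Separator
Value | Count | Frequency (%)
(space) | 20 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 8 | 100.0%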

Most occurring scripts

Value | Count | Frequency (%)
Common | 66 | 50.0%
Hangul | 66 | 50.0%

Most frequent character per script

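Hangul
Value | Count | Frequency (%)
동 | 8 | 12.1%
남 | 7 | 10.6%
구 | 7 | 10.6%
지 | 6 | 9.1%
번 | 5 | 7.6%
대 | 4 | 6.1%
연 | 4 | 6.1%
(character lost) | 3 | 4.5%
(character lost) | 3 | 4.5%
(character lost) | 3 | 4.5%
Other values (10) | 16 | 24.2%

Common
Value | Count | Frequency (%)
(space) | 20 | 30.3%
- | 8 | 12.1%
1 | 6 | 9.1%
2 | 5 | 7.6%
7 | 5 | 7.6%
3 | 5 | 7.6%
5 | 4 | 6.1%
8 | 4 | 6.1%
6 | 3 | 4.5%
4 | 3 | 4.5%
Other values (2) | 3 | 4.5%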

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 66 | 50.0%
Hangul | 66 | 50.0%

Most frequent character per block

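ASCII
Value | Count | Frequency (%)
(space) | 20 | 30.3%
- | 8 | 12.1%
1 | 6 | 9.1%
2 | 5 | 7.6%
7 | 5 | 7.6%
3 | 5 | 7.6%
5 | 4 | 6.1%
8 | 4 | 6.1%
6 | 3 | 4.5%
4 | 3 | 4.5%
Other values (2) | 3 | 4.5%

Hangul
Value | Count | Frequency (%)
동 | 8 | 12.1%
남 | 7 | 10.6%
구 | 7 | 10.6%
지 | 6 | 9.1%
번 | 5 | 7.6%
대 | 4 | 6.1%
연 | 4 | 6.1%
(character lost) | 3 | 4.5%
(character lost) | 3 | 4.5%
(character lost) | 3 | 4.5%
Other values (10) | 16 | 24.2%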

교부연월일 (date of issue)
Date

MISSING

Distinct: 7
Distinct (%): 87.5%
Missing: 25
Missing (%): 75.8%
Memory size: 396.0 B
Minimum: 2022-01-10 00:00:00
Maximum: 2022-02-23 00:00:00
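
The distinct and missing percentages for this variable use different denominators, which is easy to overlook. A small sketch with the standard `datetime` module, assuming the eight dates shown in the Sample table at the end of this report (the variable names are illustrative):

```python
from datetime import date

# The eight non-missing 교부연월일 values, as read from the Sample table.
issue_dates = [
    date(2022, 1, 10), date(2022, 1, 20), date(2022, 1, 26), date(2022, 1, 27),
    date(2022, 2, 10), date(2022, 2, 10), date(2022, 2, 18), date(2022, 2, 23),
]
n_rows = 33  # total observations in the dataset

distinct = len(set(issue_dates))
# Distinct (%) is taken relative to the non-missing values only.
distinct_pct = 100 * distinct / len(issue_dates)

missing = n_rows - len(issue_dates)
# Missing (%) is taken relative to all rows.
missing_pct = 100 * missing / n_rows

print(distinct, distinct_pct, missing, round(missing_pct, 1))
print(min(issue_dates), max(issue_dates))
```

The date 2022-02-10 occurs twice, which is why eight values yield only seven distinct dates.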
Histogram with fixed size bins (bins=7)

공사의종류 (type of construction work)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Memory size: 396.0 B

Length

Max length: 44
Median length: 4
Mean length: 10.818182
Min length: 4
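
The length statistics only make sense once the 25 empty rows are counted as the four-character string `<NA>`, which is also why this variable reports zero missing values. A minimal sketch of that reading, using the value counts from the Common Values table below; treating `<NA>` as an ordinary string is an assumption inferred from the reported numbers, and the names `CABLE_ONLY` and `ALL_BROADCAST` are illustrative:

```python
import statistics

# The two real categories and the placeholder string for empty rows.
CABLE_ONLY = "구내통신선로설비,방송공동수신설비(종합유선방송)"
ALL_BROADCAST = "구내통신선로설비,방송공동수신설비(지상파TV,위성방송,FM라디오방송,종합유선방송)"
PLACEHOLDER = "<NA>"

# Value counts as reported: 25 placeholders, 5 and 3 real entries.
values = [PLACEHOLDER] * 25 + [CABLE_ONLY] * 5 + [ALL_BROADCAST] * 3
lengths = [len(v) for v in values]

max_length = max(lengths)
min_length = min(lengths)
median_length = statistics.median(lengths)
mean_length = statistics.mean(lengths)

print(max_length, median_length, round(mean_length, 6), min_length)
```

Because 25 of the 33 lengths equal 4, both the minimum and the median are 4, while the two long category names pull the mean up to about 10.8.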

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 구내통신선로설비,방송공동수신설비(종합유선방송)
2nd row: 구내통신선로설비,방송공동수신설비(지상파TV,위성방송,FM라디오방송,종합유선방송)
3rd row: 구내통신선로설비,방송공동수신설비(지상파TV,위성방송,FM라디오방송,종합유선방송)
4th row: 구내통신선로설비,방송공동수신설비(지상파TV,위성방송,FM라디오방송,종합유선방송)
5th row: 구내통신선로설비,방송공동수신설비(종합유선방송)

Common Values

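Value | Count | Frequency (%)
<NA> | 25 | 75.8%
구내통신선로설비,방송공동수신설비(종합유선방송) (premises communication line facilities, shared broadcast reception facilities: cable TV) | 5 | 15.2%
구내통신선로설비,방송공동수신설비(지상파TV,위성방송,FM라디오방송,종합유선방송) (premises communication line facilities, shared broadcast reception facilities: terrestrial TV, satellite broadcasting, FM radio, cable TV) | 3 | 9.1%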

Length

Histogram of lengths of the category

Common Values (Plot)


Value | Count | Frequency (%)
na | 25 | 75.8%
구내통신선로설비,방송공동수신설비(종합유선방송 | 5 | 15.2%
구내통신선로설비,방송공동수신설비(지상파tv,위성방송,fm라디오방송,종합유선방송 | 3 | 9.1%

Interactions


Correlations

Variable | 순번 | 현장주소 | 교부연월일 | 공사의종류
순번 | 1.000 | 1.000 | 1.000 | 1.000
현장주소 | 1.000 | 1.000 | 1.000 | 1.000
교부연월일 | 1.000 | 1.000 | 1.000 | 1.000
공사의종류 | 1.000 | 1.000 | 1.000 | 1.000

Variable | 순번 | 공사의종류
순번 | 1.000 | 1.000
공사의종류 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
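
To make the idea of nullity correlation concrete, the sketch below computes a Pearson correlation between presence indicators (1 = value present, 0 = missing). It is a hand-rolled illustration under stated assumptions, not the report's own implementation: the `pearson` helper is hypothetical, and the indicators are built from the counts in this report (8 filled rows followed by 25 empty ones, with 공사의종류 reported as never missing).

```python
import math


def pearson(x, y):
    """Hypothetical helper: Pearson correlation, or None if a column is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # no variation in presence, so the correlation is undefined
    return sxy / math.sqrt(sxx * syy)


# Presence indicators per column: 8 filled rows, then 25 empty rows.
filled_then_empty = [1] * 8 + [0] * 25
presence = {
    "순번": filled_then_empty,
    "현장주소": filled_then_empty,
    "교부연월일": filled_then_empty,
    "공사의종류": [1] * 33,  # reported with 0 missing values
}

r_same_pattern = pearson(presence["순번"], presence["현장주소"])
r_never_missing = pearson(presence["순번"], presence["공사의종류"])
print(r_same_pattern, r_never_missing)
```

Columns that are missing in exactly the same rows correlate perfectly, while a column with no missing values has no variation to correlate with.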

Sample

Row | 순번 | 현장주소 | 교부연월일 | 공사의종류
0 | 1 | 남구 용호동 554-21번지 | 2022-01-10 | 구내통신선로설비,방송공동수신설비(종합유선방송)
1 | 2 | 대연동 894-18번지 | 2022-01-20 | 구내통신선로설비,방송공동수신설비(지상파TV,위성방송,FM라디오방송,종합유선방송)
2 | 3 | 남구 용호동 407-10 외 6필지 | 2022-01-26 | 구내통신선로설비,방송공동수신설비(지상파TV,위성방송,FM라디오방송,종합유선방송)
3 | 4 | 남구 감만동 33-8번지 | 2022-01-27 | 구내통신선로설비,방송공동수신설비(지상파TV,위성방송,FM라디오방송,종합유선방송)
4 | 5 | 남구 문현동 127-78번지 | 2022-02-10 | 구내통신선로설비,방송공동수신설비(종합유선방송)
5 | 6 | 부산광역시 남구 대연동 231-56번지 | 2022-02-10 | 구내통신선로설비,방송공동수신설비(종합유선방송)
6 | 7 | 부산광역시 남구 대연동 1736-3 | 2022-02-18 | 구내통신선로설비,방송공동수신설비(종합유선방송)
7 | 8 | 부산광역시 남구 대연동 252-7 | 2022-02-23 | 구내통신선로설비,방송공동수신설비(종합유선방송)
8 | <NA> | <NA> | <NA> | <NA>
9 | <NA> | <NA> | <NA> | <NA>

Row | 순번 | 현장주소 | 교부연월일 | 공사의종류
23 | <NA> | <NA> | <NA> | <NA>
24 | <NA> | <NA> | <NA> | <NA>
25 | <NA> | <NA> | <NA> | <NA>
26 | <NA> | <NA> | <NA> | <NA>
27 | <NA> | <NA> | <NA> | <NA>
28 | <NA> | <NA> | <NA> | <NA>
29 | <NA> | <NA> | <NA> | <NA>
30 | <NA> | <NA> | <NA> | <NA>
31 | <NA> | <NA> | <NA> | <NA>
32 | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

Row | 순번 | 현장주소 | 교부연월일 | 공사의종류 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | 25
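
The figure of 1 duplicate row in the overview and the count of 25 in this table describe the same thing from two angles: one distinct row pattern that occurs 25 times. The sketch below reproduces that with `collections.Counter`, assuming the rows shown in the Sample table. Whether the profiler derives its 3.0% in exactly this way is an assumption; the arithmetic simply matches the reported numbers.

```python
from collections import Counter

CABLE_ONLY = "구내통신선로설비,방송공동수신설비(종합유선방송)"
ALL_BROADCAST = "구내통신선로설비,방송공동수신설비(지상파TV,위성방송,FM라디오방송,종합유선방송)"

# The eight filled rows, as read from the Sample table.
filled_rows = [
    (1, "남구 용호동 554-21번지", "2022-01-10", CABLE_ONLY),
    (2, "대연동 894-18번지", "2022-01-20", ALL_BROADCAST),
    (3, "남구 용호동 407-10 외 6필지", "2022-01-26", ALL_BROADCAST),
    (4, "남구 감만동 33-8번지", "2022-01-27", ALL_BROADCAST),
    (5, "남구 문현동 127-78번지", "2022-02-10", CABLE_ONLY),
    (6, "부산광역시 남구 대연동 231-56번지", "2022-02-10", CABLE_ONLY),
    (7, "부산광역시 남구 대연동 1736-3", "2022-02-18", CABLE_ONLY),
    (8, "부산광역시 남구 대연동 252-7", "2022-02-23", CABLE_ONLY),
]
# The remaining 25 rows are empty in every column.
empty_row = ("<NA>", "<NA>", "<NA>", "<NA>")
rows = filled_rows + [empty_row] * 25

# Count how often each complete row occurs and keep those seen more than once.
occurrences = Counter(rows)
duplicated = {row: n for row, n in occurrences.items() if n > 1}

duplicate_patterns = len(duplicated)
duplicate_pct = 100 * duplicate_patterns / len(rows)
print(duplicate_patterns, duplicated[empty_row], round(duplicate_pct, 1))
```

Dropping the empty rows before profiling would remove the duplicate alert and all three missing-value alerts at once.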