Overview

Dataset statistics

Number of variables: 6
Number of observations: 33
Missing cells: 37
Missing cells (%): 18.7%
Duplicate rows: 1
Duplicate rows (%): 3.0%
Total size in memory: 1.8 KiB
Average record size in memory: 57.0 B
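
The percentages above follow from the raw counts. A minimal sketch of the arithmetic, assuming the per-column missing counts are the ones reported in the Alerts section (the `pct` helper is a hypothetical name, not part of ydata-profiling):

```python
# Figures copied from the report; pct() is a hypothetical helper.
n_variables = 6
n_observations = 33
duplicate_rows = 1

# Per-column missing counts as listed in the Alerts section.
missing_per_column = {"종류": 2, "합계": 2, "Unnamed: 5": 33}


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


total_cells = n_variables * n_observations        # 6 * 33 = 198 cells
missing_cells = sum(missing_per_column.values())  # 2 + 2 + 33 = 37

missing_pct = pct(missing_cells, total_cells)
duplicate_pct = pct(duplicate_rows, n_observations)

print(f"Missing cells: {missing_cells} ({missing_pct}%)")
print(f"Duplicate rows: {duplicate_rows} ({duplicate_pct}%)")
```

The three columns 대인, 용산 and 지산 contribute nothing to the missing-cell total because the report counts their `<NA>` entries as a category value rather than as missing.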

Variable types

Text: 1
Numeric: 1
Categorical: 3
Unsupported: 1

Dataset

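Description: Data on facilities subject to Class 1 fire safety manager appointment (1급선임대상) under the Dongbu Fire Station of Gwangju Metropolitan City, provided as counts broken down by safety center within the station's jurisdiction.
Author: Gwangju Metropolitan City (광주광역시)
URL: https://www.data.go.kr/data/15054938/fileData.do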

Alerts

Duplicates: Dataset has 1 (3.0%) duplicate rows
High correlation: 지산 is highly overall correlated with 대인
High correlation: 용산 is highly overall correlated with 합계 and 1 other fields
High correlation: 대인 is highly overall correlated with 합계 and 2 other fields
High correlation: 합계 is highly overall correlated with 대인 and 1 other fields
Imbalance: 용산 is highly imbalanced (60.2%)
Missing: 종류 has 2 (6.1%) missing values
Missing: 합계 has 2 (6.1%) missing values
Missing: Unnamed: 5 has 33 (100.0%) missing values
Unsupported: Unnamed: 5 is an unsupported type, check if it needs cleaning or further analysis
Zeros: 합계 has 20 (60.6%) zeros

Reproduction

Analysis started: 2023-12-12 08:49:23.335710
Analysis finished: 2023-12-12 08:49:24.040570
Duration: 0.7 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

종류 (facility type)
Text

MISSING 

Distinct: 31
Distinct (%): 100.0%
Missing: 2
Missing (%): 6.1%
Memory size: 396.0 B

Length

Max length: 10
Median length: 9
Mean length: 5.4516129
Min length: 3

Characters and Unicode

Total characters: 169
Distinct characters: 81
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
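
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It uses only the five sample values shown in this report, so the totals are smaller than the full-column figures, and the `category_counts` helper is a hypothetical name rather than anything ydata-profiling exposes.

```python
import unicodedata
from collections import Counter

# The five sample values listed for the 종류 column in this report.
samples = [
    "공동주택(아파트)",
    "공동주택(기숙사)",
    "근린생활",
    "문화 및 집회시설",
    "종교시설",
]


def category_counts(values):
    """Count Unicode general categories (e.g. 'Lo', 'Zs') over all characters."""
    counts = Counter()
    for value in values:
        counts.update(unicodedata.category(ch) for ch in value)
    return counts


counts = category_counts(samples)
# 'Lo' = Other Letter, 'Zs' = Space Separator,
# 'Ps' = Open Punctuation, 'Pe' = Close Punctuation
for code, n in counts.most_common():
    print(code, n)
```

Hangul syllables fall under Other Letter (`Lo`), which is why that category dominates the tables for this column.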

Unique

Unique: 31
Unique (%): 100.0%

Sample

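1st row: 공동주택(아파트) (multi-unit housing, apartments)
2nd row: 공동주택(기숙사) (multi-unit housing, dormitories)
3rd row: 근린생활 (neighborhood living facilities)
4th row: 문화 및 집회시설 (cultural and assembly facilities)
5th row: 종교시설 (religious facilities)
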
Value  Count  Frequency (%)
(not captured)  6  13.3%
공동주택(기숙사  1  2.2%
방송통신시설  1  2.2%
동물  1  2.2%
식물  1  2.2%
관련  1  2.2%
분뇨  1  2.2%
쓰레기  1  2.2%
교정  1  2.2%
군사  1  2.2%
Other values (30)  30  66.7%

Most occurring characters

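Value  Count  Frequency (%)
(not captured)  17  10.1%
(not captured)  17  10.1%
(space)  14  8.3%
(not captured)  6  3.6%
(not captured)  5  3.0%
(not captured)  4  2.4%
(not captured)  4  2.4%
(not captured)  4  2.4%
(not captured)  3  1.8%
(not captured)  3  1.8%
Other values (71)  92  54.4%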

Most occurring categories

Value  Count  Frequency (%)
Other Letter  151  89.3%
Space Separator  14  8.3%
Open Punctuation  2  1.2%
Close Punctuation  2  1.2%

Most frequent character per category

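Other Letter
Value  Count  Frequency (%)
(not captured)  17  11.3%
(not captured)  17  11.3%
(not captured)  6  4.0%
(not captured)  5  3.3%
(not captured)  4  2.6%
(not captured)  4  2.6%
(not captured)  4  2.6%
(not captured)  3  2.0%
(not captured)  3  2.0%
(not captured)  3  2.0%
Other values (68)  85  56.3%

Space Separator
Value  Count  Frequency (%)
(space)  14  100.0%

Open Punctuation
Value  Count  Frequency (%)
(  2  100.0%

Close Punctuation
Value  Count  Frequency (%)
)  2  100.0%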

Most occurring scripts

Value  Count  Frequency (%)
Hangul  151  89.3%
Common  18  10.7%

Most frequent character per script

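Hangul
Value  Count  Frequency (%)
(not captured)  17  11.3%
(not captured)  17  11.3%
(not captured)  6  4.0%
(not captured)  5  3.3%
(not captured)  4  2.6%
(not captured)  4  2.6%
(not captured)  4  2.6%
(not captured)  3  2.0%
(not captured)  3  2.0%
(not captured)  3  2.0%
Other values (68)  85  56.3%

Common
Value  Count  Frequency (%)
(space)  14  77.8%
(  2  11.1%
)  2  11.1%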

Most occurring blocks

Value  Count  Frequency (%)
Hangul  151  89.3%
ASCII  18  10.7%

Most frequent character per block

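Hangul
Value  Count  Frequency (%)
(not captured)  17  11.3%
(not captured)  17  11.3%
(not captured)  6  4.0%
(not captured)  5  3.3%
(not captured)  4  2.6%
(not captured)  4  2.6%
(not captured)  4  2.6%
(not captured)  3  2.0%
(not captured)  3  2.0%
(not captured)  3  2.0%
Other values (68)  85  56.3%

ASCII
Value  Count  Frequency (%)
(space)  14  77.8%
(  2  11.1%
)  2  11.1%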

합계 (total)
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 7
Distinct (%): 22.6%
Missing: 2
Missing (%): 6.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.6451613
Minimum: 0
Maximum: 19
Zeros: 20
Zeros (%): 60.6%
Negative: 0
Negative (%): 0.0%
Memory size: 429.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1.5
95-th percentile: 8.5
Maximum: 19
Range: 19
Interquartile range (IQR): 1.5
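
The quantiles can be rebuilt from the value counts reported for 합계. This is a minimal sketch assuming linear interpolation between order statistics, which reproduces the figures above. The `percentile` helper is a hypothetical name, and the two missing rows are excluded.

```python
# Value counts for 합계 as reported (value -> count), missing rows excluded.
value_counts = {0: 20, 1: 3, 2: 4, 4: 1, 5: 1, 12: 1, 19: 1}
values = sorted(v for v, n in value_counts.items() for _ in range(n))


def percentile(sorted_values, p):
    """Percentile by linear interpolation; p is a fraction in [0, 1]."""
    pos = p * (len(sorted_values) - 1)       # fractional index into the data
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


q1 = percentile(values, 0.25)
median = percentile(values, 0.50)
q3 = percentile(values, 0.75)
p95 = percentile(values, 0.95)
iqr = q3 - q1
value_range = values[-1] - values[0]

print(q1, median, q3, p95, iqr, value_range)
```

With 31 non-missing values, Q3 sits halfway between the 23rd and 24th sorted values (1 and 2), which is where the 1.5 comes from.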

Descriptive statistics

Standard deviation: 4.0045673
Coefficient of variation (CV): 2.4341487
Kurtosis: 12.9082
Mean: 1.6451613
Median Absolute Deviation (MAD): 0
Skewness: 3.4947345
Sum: 51
Variance: 16.036559
Monotonicity: Not monotonic
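
Several of these figures can be checked with Python's standard `statistics` module. The sketch below assumes the sample (n - 1) variance, which matches the reported value. Kurtosis and skewness are left out because the exact estimator used is not stated in the report.

```python
import statistics

# Value counts for 합계 as reported (value -> count), missing rows excluded.
value_counts = {0: 20, 1: 3, 2: 4, 4: 1, 5: 1, 12: 1, 19: 1}
values = [v for v, n in value_counts.items() for _ in range(n)]

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (divides by n - 1)
std = statistics.stdev(values)          # square root of the sample variance
cv = std / mean                         # coefficient of variation

# Median absolute deviation: median of |x - median(x)|.
center = statistics.median(values)
mad = statistics.median(abs(v - center) for v in values)

print(total, round(mean, 7), round(variance, 6), round(std, 7), round(cv, 7), mad)
```

The MAD of 0 reflects that more than half of the values are exactly 0, so the median deviation from the median is also 0.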
Histogram with fixed size bins (bins=7)

Common values
Value  Count  Frequency (%)
0  20  60.6%
2  4  12.1%
1  3  9.1%
4  1  3.0%
5  1  3.0%
12  1  3.0%
19  1  3.0%
(Missing)  2  6.1%

Extreme values (smallest first)
Value  Count  Frequency (%)
0  20  60.6%
1  3  9.1%
2  4  12.1%
4  1  3.0%
5  1  3.0%
12  1  3.0%
19  1  3.0%

Extreme values (largest first)
Value  Count  Frequency (%)
19  1  3.0%
12  1  3.0%
5  1  3.0%
4  1  3.0%
2  4  12.1%
1  3  9.1%
0  20  60.6%

대인
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 18.2%
Missing: 0
Missing (%): 0.0%
Memory size: 396.0 B

Length

Max length: 4
Median length: 4
Mean length: 3.3333333
Min length: 1

Unique

Unique: 3
Unique (%): 9.1%

Sample

1st row: 1
2nd row: <NA>
3rd row: <NA>
4th row: 2
5th row: <NA>

Common Values

Value  Count  Frequency (%)
<NA>  25  75.8%
1  3  9.1%
2  2  6.1%
4  1  3.0%
10  1  3.0%
15  1  3.0%

Length

Histogram of lengths of the category

Common Values (Plot)

(Plot of the value counts listed under Common Values.)

용산
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Memory size: 396.0 B

Length

Max length: 4
Median length: 4
Mean length: 3.6363636
Min length: 1

Unique

Unique: 1
Unique (%): 3.0%

Sample

1st row: 1
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value  Count  Frequency (%)
<NA>  29  87.9%
1  3  9.1%
3  1  3.0%
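
The 60.2% imbalance reported for 용산 can be reproduced from these counts. This is a sketch that assumes the score is one minus the normalised Shannon entropy of the class frequencies. That assumption matches the reported figure but is not confirmed by the report itself, and `imbalance_score` is a hypothetical helper name.

```python
import math

# Common-value counts for 용산 as reported; "<NA>" is counted as a category.
counts = {"<NA>": 29, "1": 3, "3": 1}


def imbalance_score(class_counts):
    """Return 1 - H / log2(k): 0 for uniform classes, near 1 for one dominant class.

    Assumed definition; H is the Shannon entropy in bits and k the class count.
    """
    total = sum(class_counts.values())
    k = len(class_counts)
    if k < 2:
        # Balance is undefined for a single class; 0.0 is an arbitrary choice.
        return 0.0
    entropy = -sum(
        (c / total) * math.log2(c / total)
        for c in class_counts.values()
        if c > 0
    )
    return 1 - entropy / math.log2(k)


score = imbalance_score(counts)
print(f"{100 * score:.1f}%")
```

Here the entropy is about 0.631 bits against a maximum of log2(3), roughly 1.585 bits, which gives a score of about 0.602.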

Length

Histogram of lengths of the category

Common Values (Plot)

(Plot of the value counts listed under Common Values.)

지산
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Memory size: 396.0 B

Length

Max length: 4
Median length: 4
Mean length: 3.4545455
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value  Count  Frequency (%)
<NA>  27  81.8%
1  3  9.1%
2  3  9.1%

Length

Histogram of lengths of the category

Common Values (Plot)

(Plot of the value counts listed under Common Values.)

Unnamed: 5
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 33
Missing (%): 100.0%
Memory size: 429.0 B

Interactions


Correlations

Variable  종류  합계  대인  용산  지산
종류  1.000  1.000  1.000  1.000  1.000
합계  1.000  1.000  0.936  1.000  0.000
대인  1.000  0.936  1.000  1.000  1.000
용산  1.000  1.000  1.000  1.000  0.000
지산  1.000  0.000  1.000  0.000  1.000

Variable  지산  용산  대인
지산  1.000  0.000  1.000
용산  0.000  1.000  1.000
대인  1.000  1.000  1.000

Variable  합계  대인  용산  지산
합계  1.000  0.565  0.707  0.000
대인  0.565  1.000  1.000  1.000
용산  0.707  1.000  1.000  0.000
지산  0.000  1.000  0.000  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows

Row  종류  합계  대인  용산  지산  Unnamed: 5
0  공동주택(아파트)  2  1  1  <NA>  <NA>
1  공동주택(기숙사)  0  <NA>  <NA>  <NA>  <NA>
2  근린생활  0  <NA>  <NA>  <NA>  <NA>
3  문화 및 집회시설  2  2  <NA>  <NA>  <NA>
4  종교시설  0  <NA>  <NA>  <NA>  <NA>
5  판매시설  4  4  <NA>  <NA>  <NA>
6  운수시설  0  <NA>  <NA>  <NA>  <NA>
7  의료시설  2  <NA>  1  1  <NA>
8  교육연구시설  5  2  1  2  <NA>
9  노유자시설  0  <NA>  <NA>  <NA>  <NA>

Last rows

Row  종류  합계  대인  용산  지산  Unnamed: 5
23  발전시설  0  <NA>  <NA>  <NA>  <NA>
24  묘지관련시설  0  <NA>  <NA>  <NA>  <NA>
25  관광휴게시설  0  <NA>  <NA>  <NA>  <NA>
26  장례식장  0  <NA>  <NA>  <NA>  <NA>
27  지하가  1  1  <NA>  <NA>  <NA>
28  지하구  0  <NA>  <NA>  <NA>  <NA>
29  문화재  0  <NA>  <NA>  <NA>  <NA>
30  복합건축물  19  15  3  1  <NA>
31  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>
32  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>

Duplicate rows

Most frequently occurring

Row  종류  합계  대인  용산  지산  # duplicates
0  <NA>  <NA>  <NA>  <NA>  <NA>  2
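
The only duplicated record is the fully empty row, and Unnamed: 5 is empty throughout, so one plausible cleanup is to drop both. The sketch below is plain Python over a few rows copied from the sample table, with `None` standing in for `<NA>`. The `drop_empty` helper is a hypothetical name, and this cleanup is a suggestion rather than a step the report performs.

```python
# Column names and a few rows copied from the sample table; None stands for <NA>.
COLUMNS = ["종류", "합계", "대인", "용산", "지산", "Unnamed: 5"]
rows = [
    ["공동주택(아파트)", 2, 1, 1, None, None],
    ["복합건축물", 19, 15, 3, 1, None],
    [None, None, None, None, None, None],  # row 31 in the report
    [None, None, None, None, None, None],  # row 32 in the report
]


def drop_empty(columns, data):
    """Drop rows that are entirely missing, then columns that are entirely missing."""
    kept_rows = [row for row in data if any(cell is not None for cell in row)]
    keep = [
        i for i in range(len(columns))
        if any(row[i] is not None for row in kept_rows)
    ]
    new_columns = [columns[i] for i in keep]
    new_rows = [[row[i] for i in keep] for row in kept_rows]
    return new_columns, new_rows


clean_columns, clean_rows = drop_empty(COLUMNS, rows)
print(clean_columns)
for row in clean_rows:
    print(row)
```

Removing the empty rows first matters: a column is only judged empty against the rows that remain.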