Overview

Dataset statistics

Number of variables: 3
Number of observations: 10000
Missing cells: 2278
Missing cells (%): 7.6%
Duplicate rows: 599
Duplicate rows (%): 6.0%
Total size in memory: 332.0 KiB
Average record size in memory: 34.0 B
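
The percentages and the average record size follow from the raw counts. A minimal sketch of that arithmetic, assuming the profiler divides missing cells by rows × variables and that 1 KiB is 1024 bytes (the variable names are illustrative, not taken from the report):

```python
# Figures copied from the dataset statistics.
n_variables = 3
n_rows = 10_000
missing_cells = 2_278
duplicate_rows = 599
total_kib = 332.0

# Assumption: the missing-cell share is taken over all cells (rows x variables).
missing_pct = 100 * missing_cells / (n_rows * n_variables)
duplicate_pct = 100 * duplicate_rows / n_rows
# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_kib * 1024 / n_rows

print(f"{missing_pct:.1f}% missing cells")
print(f"{duplicate_pct:.1f}% duplicate rows")
print(f"{avg_record_bytes:.1f} B per record")
```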

Variable types

Text: 1
Numeric: 2

Dataset

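Description: Data on floor area ratio status from the building thematic map of the Urban Planning Information System of Gwangju-si, Gyeonggi-do. It provides the lot number code (지번코드), building group management number (건물군관리번호), and floor area ratio (용적률) fields.
Author: Gwangju-si, Gyeonggi-do (경기도 광주시)
URL: https://www.data.go.kr/data/15122653/fileData.do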

Alerts

Dataset has 599 (6.0%) duplicate rows [Duplicates]
건물군관리번호 has 2278 (22.8%) missing values [Missing]
용적률_심볼 has 3741 (37.4%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-12 08:22:20.268833
Analysis finished: 2023-12-12 08:22:21.391517
Duration: 1.12 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
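
A report of this kind can be regenerated along the following lines. This is a sketch only: the CSV path is hypothetical, the file is assumed to be a plain CSV, and reading 지번코드 as a string is an assumption made so that it is profiled as text, as in this report.

```python
from pathlib import Path


def build_report(csv_path: str, out_path: str = "report.html") -> Path:
    """Profile a CSV file and write the HTML report to out_path."""
    # Imported lazily so this module loads without the optional packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: a plain CSV. Korean public data is often cp949-encoded,
    # so pass encoding="cp949" to read_csv if the default fails.
    df = pd.read_csv(csv_path, dtype={"지번코드": str})
    report = ProfileReport(df, title="Overview")
    report.to_file(out_path)
    return Path(out_path)
```

Calling `build_report("floor_area_ratio.csv")` (a placeholder file name) would write `report.html` to the working directory.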

Variables

지번코드
Text

Distinct: 9330
Distinct (%): 93.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 64
Median length: 19
Mean length: 19.0135
Min length: 19
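
The mean length of 19.0135 sits only slightly above the minimum and median of 19, so almost every value must be 19 characters long. One reading that reproduces the reported character total exactly is 9,997 values of length 19 plus 3 values of length 64. This is an assumption, not something the report states.

```python
n_rows = 10_000
long_rows = 3                    # assumed number of 64-character values
short_rows = n_rows - long_rows  # values of the usual length 19

total_chars = short_rows * 19 + long_rows * 64
mean_length = total_chars / n_rows

print(total_chars)
print(mean_length)
```

Under the same assumption, 3 × 64 = 192 matches the Space Separator count, which would mean the three long values consist entirely of spaces.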

Characters and Unicode

Total characters: 190135
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
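
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point: `Nd` is Decimal Number and `Zs` is Space Separator. The helper name and the sample values below are illustrative only.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in values."""
    counts = Counter()
    for value in values:
        counts.update(unicodedata.category(ch) for ch in value)
    return counts


# Two illustrative values: a 19-digit code and a string of four spaces.
sample = ["4161025324100170002", " " * 4]
print(category_counts(sample))
```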

Unique

Unique: 8731
Unique (%): 87.3%

Sample

1st row: 4161025324100170002
2nd row: 4161035025102830002
3rd row: 4161010300100930040
4th row: 4161025027100770029
5th row: 4161025023101450029
Value                Count  Frequency (%)
4161025027105020009  6      0.1%
4161025934101490000  5      0.1%
4161035021101160000  5      0.1%
4161025936100720001  4      < 0.1%
4161034021101880000  4      < 0.1%
4161011100107570000  4      < 0.1%
4161034028100890000  4      < 0.1%
4161010800102110008  4      < 0.1%
4161010500100400000  4      < 0.1%
4161011300100910004  3      < 0.1%
Other values (9319)  9954   99.6%

Most occurring characters

Value  Count  Frequency (%)
0      65951  34.7%
1      41935  22.1%
2      18927  10.0%
4      15785  8.3%
6      14243  7.5%
3      11796  6.2%
5      9965   5.2%
7      4158   2.2%
9      4157   2.2%
8      3026   1.6%

Most occurring categories

Value            Count   Frequency (%)
Decimal Number   189943  99.9%
Space Separator  192     0.1%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
0      65951  34.7%
1      41935  22.1%
2      18927  10.0%
4      15785  8.3%
6      14243  7.5%
3      11796  6.2%
5      9965   5.2%
7      4158   2.2%
9      4157   2.2%
8      3026   1.6%

Space Separator

Value    Count  Frequency (%)
(space)  192    100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  190135  100.0%

Most frequent character per script

Common

Value  Count  Frequency (%)
0      65951  34.7%
1      41935  22.1%
2      18927  10.0%
4      15785  8.3%
6      14243  7.5%
3      11796  6.2%
5      9965   5.2%
7      4158   2.2%
9      4157   2.2%
8      3026   1.6%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  190135  100.0%

Most frequent character per block

ASCII

Value  Count  Frequency (%)
0      65951  34.7%
1      41935  22.1%
2      18927  10.0%
4      15785  8.3%
6      14243  7.5%
3      11796  6.2%
5      9965   5.2%
7      4158   2.2%
9      4157   2.2%
8      3026   1.6%

건물군관리번호
Real number (ℝ)

MISSING 

Distinct: 7223
Distinct (%): 93.5%
Missing: 2278
Missing (%): 22.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 32356917
Minimum: 1
Maximum: 1.0030381 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 983.05
Q1: 13790.5
Median: 25582.5
Q3: 1.0019085 × 10^8
95-th percentile: 1.0027774 × 10^8
Maximum: 1.0030381 × 10^8
Range: 1.0030381 × 10^8
Interquartile range (IQR): 1.0017706 × 10^8

Descriptive statistics

Standard deviation: 46852877
Coefficient of variation (CV): 1.4480019
Kurtosis: -1.4249457
Mean: 32356917
Median Absolute Deviation (MAD): 15011.5
Skewness: 0.75856592
Sum: 2.4986011 × 10^11
Variance: 2.1951921 × 10^15
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
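
The sum and mean reported for 건물군관리번호 are consistent with the 2278 missing rows being excluded from both. A short check of that reading, with the figures copied from the report and illustrative variable names:

```python
total = 2.4986011e11   # Sum, as reported (rounded to 8 significant digits)
mean = 32_356_917      # Mean, as reported
n_rows = 10_000
missing = 2_278

# If missing rows are excluded, Sum / Mean should equal the non-missing count.
implied_count = total / mean
print(round(implied_count), n_rows - missing)
```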
Common values

Value                Count  Frequency (%)
17060                6      0.1%
28619                5      0.1%
13494                5      0.1%
16930                4      < 0.1%
100199797            4      < 0.1%
146                  4      < 0.1%
7807                 4      < 0.1%
18537                4      < 0.1%
25899                3      < 0.1%
29784                3      < 0.1%
Other values (7213)  7680   76.8%
(Missing)            2278   22.8%

Minimum 10 values

Value  Count  Frequency (%)
1      2      < 0.1%
2      1      < 0.1%
3      1      < 0.1%
5      1      < 0.1%
6      1      < 0.1%
8      1      < 0.1%
9      1      < 0.1%
12     1      < 0.1%
17     2      < 0.1%
18     1      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
100303807  1      < 0.1%
100303803  1      < 0.1%
100303722  1      < 0.1%
100303703  1      < 0.1%
100303702  1      < 0.1%
100303543  1      < 0.1%
100303482  1      < 0.1%
100303448  1      < 0.1%
100303378  2      < 0.1%
100303358  1      < 0.1%

용적률_심볼
Real number (ℝ)

ZEROS 

Distinct: 4082
Distinct (%): 40.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 39.11274
Minimum: 0
Maximum: 523.85
Zeros: 3741
Zeros (%): 37.4%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 29.92
Q3: 53.7975
95-th percentile: 144.261
Maximum: 523.85
Range: 523.85
Interquartile range (IQR): 53.7975

Descriptive statistics

Standard deviation: 46.909809
Coefficient of variation (CV): 1.1993486
Kurtosis: 4.4484797
Mean: 39.11274
Median Absolute Deviation (MAD): 29.92
Skewness: 1.7140783
Sum: 391127.4
Variance: 2200.5302
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
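
Several of the descriptive statistics for 용적률_심볼 are simple functions of one another. A minimal sketch that recomputes them from the reported mean, standard deviation, and quartiles, assuming the usual definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1):

```python
# Figures copied from the statistics for 용적률_심볼.
mean = 39.11274
std = 46.909809
q1, q3 = 0.0, 53.7975
n_rows = 10_000          # no missing values, so every row contributes

cv = std / mean          # coefficient of variation
variance = std ** 2
iqr = q3 - q1
total = mean * n_rows    # sum of the column

print(f"CV={cv:.7f} variance={variance:.4f} IQR={iqr} sum={total:.1f}")
```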
Common values

Value                Count  Frequency (%)
0.0                  3741   37.4%
50.0                 14     0.1%
99.62                11     0.1%
99.75                11     0.1%
30.0                 11     0.1%
99.77                11     0.1%
99.83                11     0.1%
99.97                11     0.1%
39.9                 10     0.1%
99.92                10     0.1%
Other values (4072)  6159   61.6%

Minimum 10 values

Value  Count  Frequency (%)
0.0    3741   37.4%
0.03   1      < 0.1%
0.49   1      < 0.1%
0.61   2      < 0.1%
0.64   2      < 0.1%
0.7    1      < 0.1%
1.34   1      < 0.1%
1.59   1      < 0.1%
1.67   1      < 0.1%
1.73   3      < 0.1%

Maximum 10 values

Value   Count  Frequency (%)
523.85  1      < 0.1%
468.98  1      < 0.1%
365.63  1      < 0.1%
314.29  1      < 0.1%
298.25  1      < 0.1%
296.19  1      < 0.1%
296.11  1      < 0.1%
288.82  1      < 0.1%
287.53  2      < 0.1%
282.42  1      < 0.1%

Interactions

[Figures: interaction plots; images not included]

Correlations

                건물군관리번호  용적률_심볼
건물군관리번호  1.000           0.297
용적률_심볼     0.297           1.000

                건물군관리번호  용적률_심볼
건물군관리번호  1.000           0.213
용적률_심볼     0.213           1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
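
The per-column nullity that these charts visualize amounts to counting missing entries in each column. A minimal standard-library sketch, in which `None` stands in for `<NA>` and the three rows are illustrative only:

```python
def nullity_by_column(rows):
    """Return {column: number of missing (None) values} for a list of dict rows."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            # A bool adds as 0 or 1, so this counts the None entries.
            counts[column] = counts.get(column, 0) + (value is None)
    return counts


# Illustrative rows only; None stands in for <NA>.
rows = [
    {"지번코드": "4161025324100170002", "건물군관리번호": None, "용적률_심볼": 0.0},
    {"지번코드": "4161010100101480065", "건물군관리번호": None, "용적률_심볼": 0.0},
    {"지번코드": "4161025027105020009", "건물군관리번호": 17060, "용적률_심볼": 0.0},
]
print(nullity_by_column(rows))
```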

Sample

First rows

Index  지번코드             건물군관리번호용적률_심볼
34454  4161025324100170002  <NA>0.0
22772  4161035025102830002  100220.0
41756  4161010300100930040  28321111.55
29232  4161025027100770029  1214134.54
15447  4161025023101450029  8545148.36
39771  4161025924100750012  1163721.97
6588   4161025921104470001  68790.0
17333  4161011200100230030  168270.0
29408  4161025027103650003  2752671.9
17112  4161010300101100016  321200.0

Last rows

Index  지번코드             건물군관리번호용적률_심볼
30460  4161025327103970000  196600.0
16650  4161025933100300000  2718928.85
24743  4161025023103680011  10025034096.07
13676  4161010100101480065  <NA>0.0
3252   4161025933101130000  306500.0
35975  4161025022105220015  <NA>0.0
4368   4161025324100770002  10020718022.95
34436  4161025324101170001  <NA>0.0
11839  4161025329102760000  10020272998.61
15193  4161025022103660006  100286839146.54

Duplicate rows

Most frequently occurring

Index  지번코드             건물군관리번호  용적률_심볼  # duplicates
246    4161025027105020009  17060           0.0          6
437    4161025934101490000  28619           0.0          5
555    4161035021101160000  13494           17.36        5
52     4161010500100400000  16930           0.0          4
83     4161010800102110008  <NA>            0.0          4
122    4161011100107570000  100199797       199.3        4
452    4161025936100720001  146             5.88         4
518    4161034021101880000  7807            0.0          4
548    4161034028100890000  18537           0.0          4
0      (blank)              <NA>            0.0          3
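
The "# duplicates" column gives the number of times each repeated row occurs. A sketch of that kind of count using only the standard library, in which rows are tuples of (지번코드, 건물군관리번호, 용적률_심볼), `None` stands in for `<NA>`, and the data is illustrative:

```python
from collections import Counter


def duplicate_counts(rows):
    """Return {row: count} for rows that occur more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Illustrative rows only; the first two are identical.
rows = [
    ("4161010800102110008", None, 0.0),
    ("4161010800102110008", None, 0.0),
    ("4161025324100170002", None, 0.0),
]
print(duplicate_counts(rows))
```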