Overview

Dataset statistics

Number of variables: 3
Number of observations: 1014
Missing cells: 45
Missing cells (%): 1.5%
Duplicate rows: 11
Duplicate rows (%): 1.1%
Total size in memory: 24.9 KiB
Average record size in memory: 25.1 B
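
The percentage and average figures above can be checked from the raw counts. The following is a minimal sketch of that arithmetic, assuming the missing-cell share is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes. The variable names are illustrative and are not part of the report.

```python
# Counts copied from the dataset statistics above.
n_variables = 3
n_observations = 1014
missing_cells = 45
duplicate_rows = 11
total_size_kib = 24.9

# Assumption: the missing-cell share is taken over every cell in the table.
missing_pct = 100 * missing_cells / (n_variables * n_observations)
duplicate_pct = 100 * duplicate_rows / n_observations
# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Rounded to one decimal place, these agree with the 1.5%, 1.1% and 25.1 B reported above.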

Variable types

Text: 2
Numeric: 1

Dataset

Description: This dataset shows the status of public buildings in Incheon Metropolitan City. It lists the name (상호), address (주소), and area (면적) of each building.
URL: https://www.data.go.kr/data/15112899/fileData.do

Alerts

Dataset has 11 (1.1%) duplicate rows [Duplicates]
상호 has 45 (4.4%) missing values [Missing]
면적 has 13 (1.3%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-12 01:43:46.992977
Analysis finished: 2023-12-12 01:43:47.539026
Duration: 0.55 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
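
A report like this one could be regenerated roughly as follows. This is a sketch rather than the configuration actually used: the function name `build_report`, the report title, the file names, and the `cp949` encoding (common for Korean public-data CSV files) are all assumptions. It relies on `pandas.read_csv` and on `ProfileReport` and its `to_file` method from ydata-profiling.

```python
def build_report(csv_path: str, html_path: str, encoding: str = "cp949") -> None:
    """Profile a CSV file and write the resulting report as HTML."""
    # Imported inside the function so that defining it does not require
    # the third-party packages to be installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the downloaded file is a CSV in the given encoding.
    df = pd.read_csv(csv_path, encoding=encoding)
    # The title is illustrative only.
    report = ProfileReport(df, title="Incheon public buildings")
    report.to_file(html_path)


# Example call with hypothetical file names:
# build_report("incheon_public_buildings.csv", "report.html")
```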

Variables

상호 (name)
Text

MISSING

Distinct: 887
Distinct (%): 91.5%
Missing: 45
Missing (%): 4.4%
Memory size: 8.1 KiB

Length

Max length: 55
Median length: 29
Mean length: 10.94324
Min length: 2

Characters and Unicode

Total characters: 10604
Distinct characters: 421
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
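
As an illustration of such character properties, the sketch below counts Unicode general categories in one of the sample values using Python's standard `unicodedata` module. The helper name `category_counts` is illustrative. The two-letter codes correspond to the category names used in this report: `Lo` is Other Letter, `Zs` is Space Separator, `Ps` and `Pe` are Open and Close Punctuation, and `Po` is Other Punctuation.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count Unicode general categories (e.g. 'Lo', 'Zs') in text."""
    # Normalise to NFC so that each Hangul syllable is a single code point.
    normalized = unicodedata.normalize("NFC", text)
    return Counter(unicodedata.category(ch) for ch in normalized)


# Fifth sample value of the 상호 column.
sample = "풍납취수장(가압펌프장, 수배전반실)"
counts = category_counts(sample)
for category, count in counts.most_common():
    print(category, count)
```

For this value the 15 Hangul syllables fall under `Lo`, and the parentheses, comma and space account for one character each in `Ps`, `Pe`, `Po` and `Zs`.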

Unique

Unique: 846
Unique (%): 87.3%

Sample

1st row: 성산가압펌프장
2nd row: 성산가압장(경비소)
3rd row: 풍납취수장(수위실)
4th row: 풍납취수장(경비소)
5th row: 풍납취수장(가압펌프장, 수배전반실)

Value | Count | Frequency (%)
(value lost) | 24 | 1.5%
공중화장실 | 20 | 1.2%
인천광역시 | 19 | 1.2%
화장실 | 16 | 1.0%
관리동 | 16 | 1.0%
동인천역북광장조성사업 | 15 | 0.9%
인천대공원 | 15 | 0.9%
검수구실 | 14 | 0.9%
남동정수사업소 | 13 | 0.8%
공촌하수종말처리장 | 11 | 0.7%
Other values (1046) | 1446 | 89.9%

Most occurring characters

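Value | Count | Frequency (%)
(space) | 646 | 6.1%
(value lost) | 398 | 3.8%
(value lost) | 334 | 3.1%
(value lost) | 321 | 3.0%
(value lost) | 244 | 2.3%
(value lost) | 234 | 2.2%
) | 231 | 2.2%
( | 231 | 2.2%
(value lost) | 222 | 2.1%
1 | 207 | 2.0%
Other values (411) | 7536 | 71.1%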

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 8868 | 83.6%
Space Separator | 646 | 6.1%
Decimal Number | 488 | 4.6%
Close Punctuation | 231 | 2.2%
Open Punctuation | 231 | 2.2%
Uppercase Letter | 83 | 0.8%
Dash Punctuation | 25 | 0.2%
Other Punctuation | 20 | 0.2%
Math Symbol | 9 | 0.1%
Letter Number | 3 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(value lost) | 398 | 4.5%
(value lost) | 334 | 3.8%
(value lost) | 321 | 3.6%
(value lost) | 244 | 2.8%
(value lost) | 234 | 2.6%
(value lost) | 222 | 2.5%
(value lost) | 194 | 2.2%
(value lost) | 181 | 2.0%
(value lost) | 168 | 1.9%
(value lost) | 149 | 1.7%
Other values (373) | 6423 | 72.4%

Uppercase Letter
Value | Count | Frequency (%)
B | 20 | 24.1%
A | 16 | 19.3%
T | 8 | 9.6%
C | 6 | 7.2%
R | 6 | 7.2%
G | 6 | 7.2%
P | 3 | 3.6%
L | 3 | 3.6%
N | 2 | 2.4%
E | 2 | 2.4%
Other values (7) | 11 | 13.3%

Decimal Number
Value | Count | Frequency (%)
1 | 207 | 42.4%
9 | 66 | 13.5%
2 | 60 | 12.3%
3 | 40 | 8.2%
4 | 28 | 5.7%
5 | 21 | 4.3%
6 | 19 | 3.9%
7 | 18 | 3.7%
8 | 16 | 3.3%
0 | 13 | 2.7%

Other Punctuation
Value | Count | Frequency (%)
, | 13 | 65.0%
/ | 4 | 20.0%
. | 3 | 15.0%

Letter Number
Value | Count | Frequency (%)
(value lost) | 1 | 33.3%
(value lost) | 1 | 33.3%
(value lost) | 1 | 33.3%

Space Separator
Value | Count | Frequency (%)
(space) | 646 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 231 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 231 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 25 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 9 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 8868 | 83.6%
Common | 1650 | 15.6%
Latin | 86 | 0.8%

Most frequent character per script

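Hangul
Value | Count | Frequency (%)
(value lost) | 398 | 4.5%
(value lost) | 334 | 3.8%
(value lost) | 321 | 3.6%
(value lost) | 244 | 2.8%
(value lost) | 234 | 2.6%
(value lost) | 222 | 2.5%
(value lost) | 194 | 2.2%
(value lost) | 181 | 2.0%
(value lost) | 168 | 1.9%
(value lost) | 149 | 1.7%
Other values (373) | 6423 | 72.4%

Latin
Value | Count | Frequency (%)
B | 20 | 23.3%
A | 16 | 18.6%
T | 8 | 9.3%
C | 6 | 7.0%
R | 6 | 7.0%
G | 6 | 7.0%
P | 3 | 3.5%
L | 3 | 3.5%
N | 2 | 2.3%
E | 2 | 2.3%
Other values (10) | 14 | 16.3%

Common
Value | Count | Frequency (%)
(space) | 646 | 39.2%
) | 231 | 14.0%
( | 231 | 14.0%
1 | 207 | 12.5%
9 | 66 | 4.0%
2 | 60 | 3.6%
3 | 40 | 2.4%
4 | 28 | 1.7%
- | 25 | 1.5%
5 | 21 | 1.3%
Other values (8) | 95 | 5.8%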

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 8868 | 83.6%
ASCII | 1733 | 16.3%
Number Forms | 3 | < 0.1%

Most frequent character per block

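ASCII
Value | Count | Frequency (%)
(space) | 646 | 37.3%
) | 231 | 13.3%
( | 231 | 13.3%
1 | 207 | 11.9%
9 | 66 | 3.8%
2 | 60 | 3.5%
3 | 40 | 2.3%
4 | 28 | 1.6%
- | 25 | 1.4%
5 | 21 | 1.2%
Other values (25) | 178 | 10.3%

Hangul
Value | Count | Frequency (%)
(value lost) | 398 | 4.5%
(value lost) | 334 | 3.8%
(value lost) | 321 | 3.6%
(value lost) | 244 | 2.8%
(value lost) | 234 | 2.6%
(value lost) | 222 | 2.5%
(value lost) | 194 | 2.2%
(value lost) | 181 | 2.0%
(value lost) | 168 | 1.9%
(value lost) | 149 | 1.7%
Other values (373) | 6423 | 72.4%

Number Forms
Value | Count | Frequency (%)
(value lost) | 1 | 33.3%
(value lost) | 1 | 33.3%
(value lost) | 1 | 33.3%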

주소 (address)
Text

Distinct: 582
Distinct (%): 57.4%
Missing: 0
Missing (%): 0.0%
Memory size: 8.1 KiB

Length

Max length: 40
Median length: 35
Mean length: 21.16568
Min length: 16

Characters and Unicode

Total characters: 21462
Distinct characters: 204
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 456
Unique (%): 45.0%

Sample

1st row: 서울특별시 영등포구 양화동 1-26
2nd row: 서울특별시 영등포구 양화동 1-26
3rd row: 서울특별시 송파구 풍납동 419
4th row: 서울특별시 송파구 풍납동 419
5th row: 서울특별시 송파구 풍납동 419

Value | Count | Frequency (%)
인천광역시 | 1006 | 21.9%
서구 | 228 | 5.0%
남동구 | 183 | 4.0%
연수구 | 133 | 2.9%
중구 | 110 | 2.4%
부평구 | 91 | 2.0%
(value lost) | 89 | 1.9%
미추홀구 | 87 | 1.9%
(value lost) | 68 | 1.5%
가좌동 | 67 | 1.5%
Other values (779) | 2529 | 55.1%

Most occurring characters

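Value | Count | Frequency (%)
(space) | 4502 | 21.0%
(value lost) | 1223 | 5.7%
(value lost) | 1017 | 4.7%
(value lost) | 1015 | 4.7%
(value lost) | 1010 | 4.7%
(value lost) | 1006 | 4.7%
(value lost) | 1006 | 4.7%
(value lost) | 979 | 4.6%
1 | 817 | 3.8%
- | 664 | 3.1%
Other values (194) | 8223 | 38.3%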

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 12079 | 56.3%
Space Separator | 4502 | 21.0%
Decimal Number | 4182 | 19.5%
Dash Punctuation | 664 | 3.1%
Other Punctuation | 16 | 0.1%
Open Punctuation | 9 | < 0.1%
Close Punctuation | 9 | < 0.1%
Uppercase Letter | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(value lost) | 1223 | 10.1%
(value lost) | 1017 | 8.4%
(value lost) | 1015 | 8.4%
(value lost) | 1010 | 8.4%
(value lost) | 1006 | 8.3%
(value lost) | 1006 | 8.3%
(value lost) | 979 | 8.1%
(value lost) | 265 | 2.2%
(value lost) | 259 | 2.1%
(value lost) | 193 | 1.6%
Other values (175) | 4106 | 34.0%

Decimal Number
Value | Count | Frequency (%)
1 | 817 | 19.5%
2 | 561 | 13.4%
5 | 404 | 9.7%
4 | 395 | 9.4%
0 | 382 | 9.1%
9 | 380 | 9.1%
3 | 366 | 8.8%
8 | 347 | 8.3%
7 | 271 | 6.5%
6 | 259 | 6.2%

Other Punctuation
Value | Count | Frequency (%)
, | 12 | 75.0%
. | 2 | 12.5%
: | 1 | 6.2%
' | 1 | 6.2%

Space Separator
Value | Count | Frequency (%)
(space) | 4502 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 664 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 9 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 9 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
B | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 12079 | 56.3%
Common | 9382 | 43.7%
Latin | 1 | < 0.1%

Most frequent character per script

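Hangul
Value | Count | Frequency (%)
(value lost) | 1223 | 10.1%
(value lost) | 1017 | 8.4%
(value lost) | 1015 | 8.4%
(value lost) | 1010 | 8.4%
(value lost) | 1006 | 8.3%
(value lost) | 1006 | 8.3%
(value lost) | 979 | 8.1%
(value lost) | 265 | 2.2%
(value lost) | 259 | 2.1%
(value lost) | 193 | 1.6%
Other values (175) | 4106 | 34.0%

Common
Value | Count | Frequency (%)
(space) | 4502 | 48.0%
1 | 817 | 8.7%
- | 664 | 7.1%
2 | 561 | 6.0%
5 | 404 | 4.3%
4 | 395 | 4.2%
0 | 382 | 4.1%
9 | 380 | 4.1%
3 | 366 | 3.9%
8 | 347 | 3.7%
Other values (8) | 564 | 6.0%

Latin
Value | Count | Frequency (%)
B | 1 | 100.0%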

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 12079 | 56.3%
ASCII | 9383 | 43.7%

Most frequent character per block

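ASCII
Value | Count | Frequency (%)
(space) | 4502 | 48.0%
1 | 817 | 8.7%
- | 664 | 7.1%
2 | 561 | 6.0%
5 | 404 | 4.3%
4 | 395 | 4.2%
0 | 382 | 4.1%
9 | 380 | 4.0%
3 | 366 | 3.9%
8 | 347 | 3.7%
Other values (9) | 565 | 6.0%

Hangul
Value | Count | Frequency (%)
(value lost) | 1223 | 10.1%
(value lost) | 1017 | 8.4%
(value lost) | 1015 | 8.4%
(value lost) | 1010 | 8.4%
(value lost) | 1006 | 8.3%
(value lost) | 1006 | 8.3%
(value lost) | 979 | 8.1%
(value lost) | 265 | 2.2%
(value lost) | 259 | 2.1%
(value lost) | 193 | 1.6%
Other values (175) | 4106 | 34.0%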

면적 (area)
Real number (ℝ)

ZEROS

Distinct: 853
Distinct (%): 84.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2884.9508
Minimum: 0
Maximum: 205309
Zeros: 13
Zeros (%): 1.3%
Negative: 0
Negative (%): 0.0%
Memory size: 9.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 7.6
Q1: 58.86
Median: 300.91
Q3: 1519.48
95-th percentile: 11585.982
Maximum: 205309
Range: 205309
Interquartile range (IQR): 1460.62

Descriptive statistics

Standard deviation: 11530.971
Coefficient of variation (CV): 3.9969386
Kurtosis: 129.92446
Mean: 2884.9508
Median Absolute Deviation (MAD): 280.91
Skewness: 9.883666
Sum: 2925340.1
Variance: 1.329633 × 10^8
Monotonicity: Not monotonic
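
Several of the derived figures for 면적 follow directly from the others. The sketch below recomputes them from the reported values using the standard definitions (range = maximum − minimum, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared, mean = sum / count). The variable names are illustrative, and small differences in the last digits are expected because the inputs are already rounded.

```python
# Values copied from the 면적 statistics above.
count = 1014
total = 2925340.1
mean = 2884.9508
std_dev = 11530.971
minimum, maximum = 0.0, 205309.0
q1, q3 = 58.86, 1519.48

value_range = maximum - minimum   # range
iqr = q3 - q1                     # interquartile range
cv = std_dev / mean               # coefficient of variation
variance = std_dev ** 2           # variance
mean_from_sum = total / count     # mean recomputed from the sum

print(f"Range: {value_range:.0f}")
print(f"IQR: {iqr:.2f}")
print(f"CV: {cv:.4f}")
print(f"Variance: {variance:.4e}")
print(f"Mean from sum: {mean_from_sum:.2f}")
```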
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0.0 | 13 | 1.3%
25.41 | 8 | 0.8%
720.0 | 7 | 0.7%
5.76 | 6 | 0.6%
12.0 | 6 | 0.6%
33.0 | 5 | 0.5%
45.9 | 5 | 0.5%
27.0 | 5 | 0.5%
7.6 | 4 | 0.4%
140.0 | 4 | 0.4%
Other values (843) | 951 | 93.8%

Value | Count | Frequency (%)
0.0 | 13 | 1.3%
1.0 | 1 | 0.1%
2.0 | 1 | 0.1%
2.25 | 1 | 0.1%
2.43 | 1 | 0.1%
2.76 | 1 | 0.1%
2.8 | 1 | 0.1%
3.26 | 1 | 0.1%
3.85 | 1 | 0.1%
4.09 | 1 | 0.1%

Value | Count | Frequency (%)
205309.0 | 1 | 0.1%
128870.12 | 1 | 0.1%
117163.6 | 1 | 0.1%
110153.08 | 1 | 0.1%
86164.88 | 1 | 0.1%
66630.56 | 1 | 0.1%
63195.0 | 2 | 0.2%
54428.94 | 1 | 0.1%
51977.07 | 1 | 0.1%
51681.62 | 1 | 0.1%

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
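
The idea behind both displays can be sketched in plain Python: count the non-missing values per column, and mark each cell of each row as present or missing. The three rows below are taken from this report's sample and duplicate-row tables, with `None` standing in for `<NA>`. The helper names are illustrative, and this is not how the report draws its plots.

```python
columns = ("상호", "주소", "면적")
# None stands for a missing 상호 value (<NA> in the report).
rows = [
    ("성산가압펌프장", "서울특별시 영등포구 양화동 1-26", 774.35),
    (None, "인천광역시 남동구 구월동 1446 외 2필지", 98.66),
    (None, "인천광역시 남동구 구월동 1446 외 2필지", 124.94),
]


def non_null_counts(rows):
    """Return the number of non-missing values in each column."""
    return [sum(value is not None for value in column) for column in zip(*rows)]


def nullity_matrix(rows):
    """Return one string per row: '#' for present, '.' for missing."""
    return ["".join("#" if value is not None else "." for value in row) for row in rows]


for name, present in zip(columns, non_null_counts(rows)):
    print(name, present)
for line in nullity_matrix(rows):
    print(line)
```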

Sample

Index | 상호 (name) | 주소 (address) | 면적 (area)
0 | 성산가압펌프장 | 서울특별시 영등포구 양화동 1-26 | 774.35
1 | 성산가압장(경비소) | 서울특별시 영등포구 양화동 1-26 | 2.25
2 | 풍납취수장(수위실) | 서울특별시 송파구 풍납동 419 | 13.44
3 | 풍납취수장(경비소) | 서울특별시 송파구 풍납동 419 | 2.0
4 | 풍납취수장(가압펌프장, 수배전반실) | 서울특별시 송파구 풍납동 419 | 1405.1
5 | 풍납취수장(염소투입실) | 서울특별시 송파구 풍납동 419 | 271.76
6 | 인천아트플랫폼 | 인천광역시 중구 해안동1가 10-1 | 542.1
7 | 미술문화공간조성사업 | 인천광역시 중구 해안동1가 10-1 | 0.0
8 | 미술문화공간조성사업 | 인천광역시 중구 해안동1가 10-1 | 0.0
9 | 미술문화공간조성사업 | 인천광역시 중구 해안동1가 10-1 | 0.0

Index | 상호 (name) | 주소 (address) and 면적 (area), run together
1004 | 영흥수협수산물위판장 바동 | 인천광역시 옹진군 영흥면 내리 8-16543.0
1005 | 영흥수협수산물위판장 사동 | 인천광역시 옹진군 영흥면 내리 8-16572.0
1006 | 아동 | 인천광역시 옹진군 영흥면 내리 8-16582.8
1007 | 자월119지역대 | 인천광역시 옹진군 자월면 자월리 1065-11765.78
1008 | 대연평도해수담수화시설 당섬담수화시설 | 인천광역시 옹진군 연평면 연평리 산 16-142.88
1009 | 대연평도해수담수화시설 해수취수시설 | 인천광역시 옹진군 연평면 연평리 187-3198.56
1010 | 대연평도해수담수화시설 중부리담수화시설 | 인천광역시 옹진군 연평면 연평리 325-160385.34
1011 | 연평119지역대 | 인천광역시 옹진군 연평면 연평리 493-3210.73
1012 | 군경합동검문소 | 경기도 김포시 대곶면 약암리 437-200777.0
1013 | 과적차량 검문소 | 경기도 김포시 대곶면 약암리 1152-380.5

Duplicate rows

Most frequently occurring

Index | 상호 (name) | 주소 (address) and 면적 (area), run together | # duplicates
3 | 수산정수사업소(초소동) | 인천광역시 남동구 만수동 산 30 외 60필지7.6 | 4
0 | 미술문화공간조성사업 | 인천광역시 중구 해안동1가 10-10.0 | 3
1 | 송도연장선 | 인천광역시 연수구 송도동 48-163195.0 | 2
2 | 송암미술관 | 인천광역시 미추홀구 학익동 587-1452527.43 | 2
4 | 약품투입 및 급수시설 | 인천광역시 서구 가좌동 5981270.5 | 2
5 | 어린이과학관 | 인천광역시 계양구 방축동 108-114998.0 | 2
6 | 영종도서관건립공사 | 인천광역시 중구 운서동 2709-10.0 | 2
7 | 판매시설(수위실) | 인천광역시 부평구 삼산동 7-130.55 | 2
8 | <NA> | 인천광역시 남동구 구월동 1446 외 2필지98.66 | 2
9 | <NA> | 인천광역시 남동구 구월동 1446 외 2필지124.94 | 2
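
One way to derive a "# duplicates" count like the one above is to tally identical rows and keep those seen more than once. The sketch below does this with `collections.Counter` on a handful of rows from this report; it illustrates the idea only and is not necessarily how the report computed its duplicate figures. The variable names are illustrative, and `None` stands in for `<NA>`.

```python
from collections import Counter

# A few (상호, 주소, 면적) rows taken from this report.
rows = [
    ("성산가압펌프장", "서울특별시 영등포구 양화동 1-26", 774.35),
    ("미술문화공간조성사업", "인천광역시 중구 해안동1가 10-1", 0.0),
    ("미술문화공간조성사업", "인천광역시 중구 해안동1가 10-1", 0.0),
    ("미술문화공간조성사업", "인천광역시 중구 해안동1가 10-1", 0.0),
    (None, "인천광역시 남동구 구월동 1446 외 2필지", 98.66),
    (None, "인천광역시 남동구 구월동 1446 외 2필지", 98.66),
]

# Tally identical rows; rows are tuples, so they are hashable.
row_counts = Counter(rows)
# Keep only rows that occur more than once, most frequent first.
duplicated = [(row, n) for row, n in row_counts.most_common() if n > 1]

for (name, address, area), n in duplicated:
    print(name, address, area, n)
```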