Overview

Dataset statistics

Number of variables: 6
Number of observations: 70
Missing cells: 11
Missing cells (%): 2.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.5 KiB
Average record size in memory: 51.9 B
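The missing-cell percentage can be checked by hand from the figures in this report. The sketch below assumes the percentage is the missing count divided by all cells (variables × observations); the per-column missing counts are taken from the Alerts section, and the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" and "Alerts" sections of this report.
n_variables = 6
n_observations = 70
missing_per_column = {"건물명": 9, "도로명주소": 2}  # all other columns report 0 missing

total_cells = n_variables * n_observations        # 6 * 70 cells in the table
missing_cells = sum(missing_per_column.values())  # 9 + 2
missing_pct = 100 * missing_cells / total_cells

print(f"Missing cells: {missing_cells} of {total_cells} ({missing_pct:.1f}%)")
```

Under that assumption the result rounds to the 2.6% shown above.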

Variable types

Numeric: 2
Categorical: 2
Text: 2

Dataset

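Description: This file lists general buildings in 인천광역시 계양구 (Gyeyang-gu, Incheon Metropolitan City) with a total floor area of 10,000 square meters or more. It contains the serial number (연번), main use code name (주용도코드명), other use details (기타용도내용), building name (건물명), road-name address (도로명주소), and total floor area (연면적).
Author: 인천광역시 계양구
URL: https://data.incheon.go.kr/findData/publicDataDetail?dataId=15124647&srcSe=7661IVAWM27C61E190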

Alerts

연면적 is highly overall correlated with 기타용도내용 (High correlation)
주용도코드명 is highly overall correlated with 기타용도내용 (High correlation)
기타용도내용 is highly overall correlated with 연면적 and 1 other field (High correlation)
건물명 has 9 (12.9%) missing values (Missing)
도로명주소 has 2 (2.9%) missing values (Missing)
연번 has unique values (Unique)
연면적 has unique values (Unique)

Reproduction

Analysis started: 2024-01-28 12:32:59.998512
Analysis finished: 2024-01-28 12:33:00.885535
Duration: 0.89 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
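A report of this kind could be regenerated roughly as follows. This is a minimal sketch, not the configuration actually used here: the CSV path, the `cp949` encoding, the report title, and the helper name `build_report` are all assumptions, and the `config.json` mentioned above is not applied.

```python
# Column names as listed in the dataset description of this report.
EXPECTED_COLUMNS = ["연번", "주용도코드명", "기타용도내용", "건물명", "도로명주소", "연면적"]


def build_report(csv_path, output_path="report.html", encoding="cp949"):
    """Profile a CSV export and write an HTML report (hypothetical helper)."""
    # Imported lazily so this module still loads where the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding=encoding)

    missing = [name for name in EXPECTED_COLUMNS if name not in df.columns]
    if missing:
        raise ValueError(f"CSV is missing expected columns: {missing}")

    # The title is illustrative; default profiling settings are used.
    report = ProfileReport(df, title="Gyeyang-gu general buildings")
    report.to_file(output_path)
    return output_path
```

Calling `build_report("buildings.csv")` with a locally downloaded file (the file name is hypothetical) would write `report.html` next to the script.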

Variables

연번
Real number (ℝ)

UNIQUE 

Distinct: 70
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35.5
Minimum: 1
Maximum: 70
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 762.0 B

Quantile statistics

Minimum: 1
5-th percentile: 4.45
Q1: 18.25
Median: 35.5
Q3: 52.75
95-th percentile: 66.55
Maximum: 70
Range: 69
Interquartile range (IQR): 34.5

Descriptive statistics

Standard deviation: 20.351085
Coefficient of variation (CV): 0.57327
Kurtosis: -1.2
Mean: 35.5
Median Absolute Deviation (MAD): 17.5
Skewness: 0
Sum: 2485
Variance: 414.16667
Monotonicity: Strictly increasing
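Because 연번 is simply the sequence 1 to 70, most of the figures above can be recomputed without the source file. The sketch below uses only the Python standard library and assumes the sample (n − 1) variance and linearly interpolated quartiles; those choices are inferred from the reported numbers, not taken from the report's configuration.

```python
import math
import statistics

values = list(range(1, 71))  # 연번 runs from 1 to 70 with no gaps

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = math.sqrt(variance)
cv = std / mean                         # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)
# "inclusive" interpolates linearly between the minimum and maximum.
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")

print(sum(values), mean, round(variance, 5), round(std, 6), round(cv, 5), mad, q1, q3)
```

The population variance of the same sequence would be 408.25, so the reported 414.16667 is consistent with the n − 1 denominator.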
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
1 | 1 | 1.4%
46 | 1 | 1.4%
52 | 1 | 1.4%
51 | 1 | 1.4%
50 | 1 | 1.4%
49 | 1 | 1.4%
48 | 1 | 1.4%
47 | 1 | 1.4%
45 | 1 | 1.4%
54 | 1 | 1.4%
Other values (60) | 60 | 85.7%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 1.4%
2 | 1 | 1.4%
3 | 1 | 1.4%
4 | 1 | 1.4%
5 | 1 | 1.4%
6 | 1 | 1.4%
7 | 1 | 1.4%
8 | 1 | 1.4%
9 | 1 | 1.4%
10 | 1 | 1.4%

Maximum 10 values

Value | Count | Frequency (%)
70 | 1 | 1.4%
69 | 1 | 1.4%
68 | 1 | 1.4%
67 | 1 | 1.4%
66 | 1 | 1.4%
65 | 1 | 1.4%
64 | 1 | 1.4%
63 | 1 | 1.4%
62 | 1 | 1.4%
61 | 1 | 1.4%

주용도코드명
Categorical

HIGH CORRELATION 

Distinct: 16
Distinct (%): 22.9%
Missing: 0
Missing (%): 0.0%
Memory size: 692.0 B

Length

Max length: 9
Median length: 7
Mean length: 4.9714286
Min length: 2

Unique

Unique: 8
Unique (%): 11.4%

Sample

1st row: 교육연구시설
2nd row: 판매시설
3rd row: 방송통신시설
4th row: 교육연구및복지시설
5th row: 판매시설

Common Values

Value | Count | Frequency (%)
교육연구시설 | 23 | 32.9%
공장 | 16 | 22.9%
교육연구및복지시설 | 7 | 10.0%
판매시설 | 5 | 7.1%
의료시설 | 5 | 7.1%
운수시설 | 2 | 2.9%
문화및집회시설 | 2 | 2.9%
업무시설 | 2 | 2.9%
방송통신시설 | 1 | 1.4%
교정및군사시설 | 1 | 1.4%
Other values (6) | 6 | 8.6%
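The frequency column is each count divided by the 70 observations. As a quick check, the following sketch recomputes the percentages from the counts listed above; the dictionary is typed in by hand from this table, and "Other values (6)" is kept as a single aggregated entry.

```python
# Counts copied from the Common Values table for 주용도코드명.
counts = {
    "교육연구시설": 23,
    "공장": 16,
    "교육연구및복지시설": 7,
    "판매시설": 5,
    "의료시설": 5,
    "운수시설": 2,
    "문화및집회시설": 2,
    "업무시설": 2,
    "방송통신시설": 1,
    "교정및군사시설": 1,
    "Other values (6)": 6,
}

total = sum(counts.values())  # should equal the number of observations
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, pct in shares.items():
    print(f"{name}: {counts[name]} ({pct}%)")
```

The counts sum to 70, which matches the reported number of observations with no missing values in this column.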

Length

Histogram of lengths of the category

기타용도내용
Categorical

HIGH CORRELATION 

Distinct: 33
Distinct (%): 47.1%
Missing: 0
Missing (%): 0.0%
Memory size: 692.0 B

Length

Max length: 28
Median length: 23.5
Mean length: 8.6285714
Min length: 2

Unique

Unique: 30
Unique (%): 42.9%

Sample

1st row: 교육연구시설
2nd row: 판매시설,운동시설,제1,2종근린생활시설
3rd row: 방송통신시설
4th row: 학교교육시설
5th row: 판매및영업시설, 운동시설, 제1종근린생활시설

Common Values

Value | Count | Frequency (%)
교육연구시설 | 18 | 25.7%
공장 | 14 | 20.0%
교육연구및복지시설 | 8 | 11.4%
방송통신시설 | 1 | 1.4%
학교교육시설 | 1 | 1.4%
판매및영업시설, 운동시설, 제1종근린생활시설 | 1 | 1.4%
통합막사 | 1 | 1.4%
근린생활시설, 관람집회시설, 업무시설 | 1 | 1.4%
교육연구 및 복지시설 | 1 | 1.4%
의료시설(종합병원), 제1,2종근린생활시설 | 1 | 1.4%
Other values (23) | 23 | 32.9%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
교육연구시설 | 18 | 19.8%
공장 | 15 | 16.5%
교육연구및복지시설 | 8 | 8.8%
판매시설 | 4 | 4.4%
의료시설 | 3 | 3.3%
업무시설 | 2 | 2.2%
 | 2 | 2.2%
1 | 2 | 2.2%
 | 2 | 2.2%
문화및집회시설 | 2 | 2.2%
Other values (30) | 33 | 36.3%

건물명
Text

MISSING 

Distinct: 60
Distinct (%): 98.4%
Missing: 9
Missing (%): 12.9%
Memory size: 692.0 B

Length

Max length: 28
Median length: 17
Mean length: 8.3606557
Min length: 2

Characters and Unicode

Total characters: 510
Distinct characters: 152
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 59
Unique (%): 96.7%

Sample

1st row: 신대초등학교
2nd row: (주)농협하나로유통 하나로마트 인천점
3rd row: 인천 계양 방송통신시설
4th row: 인천안남초등학교
5th row: HOME PLUS 작전점

Value | Count | Frequency (%)
계산종합의료단지 | 2 | 2.4%
공장 | 2 | 2.4%
계양 | 2 | 2.4%
계수중학교 | 1 | 1.2%
계양체육관 | 1 | 1.2%
대원루스터(주 | 1 | 1.2%
서운공장 | 1 | 1.2%
센서텍(주 | 1 | 1.2%
계산중앙감리교회 | 1 | 1.2%
인천어린이과학관 | 1 | 1.2%
Other values (69) | 69 | 84.1%

Most occurring characters

ValueCountFrequency (%)
30
 
5.9%
30
 
5.9%
21
 
4.1%
19
 
3.7%
17
 
3.3%
17
 
3.3%
) 15
 
2.9%
( 15
 
2.9%
14
 
2.7%
14
 
2.7%
Other values (142) 318
62.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 428 | 83.9%
Space Separator | 21 | 4.1%
Uppercase Letter | 21 | 4.1%
Close Punctuation | 15 | 2.9%
Open Punctuation | 15 | 2.9%
Decimal Number | 8 | 1.6%
Dash Punctuation | 2 | 0.4%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
30
 
7.0%
30
 
7.0%
19
 
4.4%
17
 
4.0%
17
 
4.0%
14
 
3.3%
14
 
3.3%
10
 
2.3%
9
 
2.1%
9
 
2.1%
Other values (118) 259
60.5%
Uppercase Letter
ValueCountFrequency (%)
M 3
14.3%
S 3
14.3%
A 2
9.5%
O 2
9.5%
E 2
9.5%
V 1
 
4.8%
K 1
 
4.8%
W 1
 
4.8%
H 1
 
4.8%
T 1
 
4.8%
Other values (4) 4
19.0%
Decimal Number
ValueCountFrequency (%)
2 2
25.0%
1 2
25.0%
0 1
12.5%
6 1
12.5%
5 1
12.5%
3 1
12.5%
Space Separator
ValueCountFrequency (%)
21
100.0%
Close Punctuation
ValueCountFrequency (%)
) 15
100.0%
Open Punctuation
ValueCountFrequency (%)
( 15
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 428 | 83.9%
Common | 61 | 12.0%
Latin | 21 | 4.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
30
 
7.0%
30
 
7.0%
19
 
4.4%
17
 
4.0%
17
 
4.0%
14
 
3.3%
14
 
3.3%
10
 
2.3%
9
 
2.1%
9
 
2.1%
Other values (118) 259
60.5%
Latin
ValueCountFrequency (%)
M 3
14.3%
S 3
14.3%
A 2
9.5%
O 2
9.5%
E 2
9.5%
V 1
 
4.8%
K 1
 
4.8%
W 1
 
4.8%
H 1
 
4.8%
T 1
 
4.8%
Other values (4) 4
19.0%
Common
ValueCountFrequency (%)
21
34.4%
) 15
24.6%
( 15
24.6%
2 2
 
3.3%
1 2
 
3.3%
- 2
 
3.3%
0 1
 
1.6%
6 1
 
1.6%
5 1
 
1.6%
3 1
 
1.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 428 | 83.9%
ASCII | 82 | 16.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
30
 
7.0%
30
 
7.0%
19
 
4.4%
17
 
4.0%
17
 
4.0%
14
 
3.3%
14
 
3.3%
10
 
2.3%
9
 
2.1%
9
 
2.1%
Other values (118) 259
60.5%
ASCII
ValueCountFrequency (%)
21
25.6%
) 15
18.3%
( 15
18.3%
M 3
 
3.7%
S 3
 
3.7%
A 2
 
2.4%
2 2
 
2.4%
1 2
 
2.4%
- 2
 
2.4%
O 2
 
2.4%
Other values (14) 15
18.3%

도로명주소
Text

MISSING 

Distinct: 65
Distinct (%): 95.6%
Missing: 2
Missing (%): 2.9%
Memory size: 692.0 B

Length

Max length: 23
Median length: 21
Mean length: 18.485294
Min length: 15

Characters and Unicode

Total characters: 1257
Distinct characters: 69
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 62
Unique (%): 91.2%

Sample

1st row: 인천광역시 계양구 계산새로 133
2nd row: 인천광역시 계양구 아나지로 341
3rd row: 인천광역시 계양구 오조산로 129
4th row: 인천광역시 계양구 경명대로1114번길 26
5th row: 인천광역시 계양구 계양대로 27

Value | Count | Frequency (%)
인천광역시 | 68 | 25.0%
계양구 | 68 | 25.0%
아나지로 | 7 | 2.6%
서운산단로1길 | 7 | 2.6%
서운산단로2길 | 5 | 1.8%
장제로 | 5 | 1.8%
6 | 3 | 1.1%
65 | 3 | 1.1%
계산새로 | 3 | 1.1%
63 | 3 | 1.1%
Other values (85) | 100 | 36.8%

Most occurring characters

ValueCountFrequency (%)
204
16.2%
77
 
6.1%
74
 
5.9%
68
 
5.4%
68
 
5.4%
68
 
5.4%
68
 
5.4%
68
 
5.4%
68
 
5.4%
64
 
5.1%
Other values (59) 430
34.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 851 | 67.7%
Space Separator | 204 | 16.2%
Decimal Number | 202 | 16.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
77
 
9.0%
74
 
8.7%
68
 
8.0%
68
 
8.0%
68
 
8.0%
68
 
8.0%
68
 
8.0%
68
 
8.0%
64
 
7.5%
30
 
3.5%
Other values (48) 198
23.3%
Decimal Number
ValueCountFrequency (%)
2 36
17.8%
1 34
16.8%
6 23
11.4%
5 19
9.4%
3 19
9.4%
8 18
8.9%
4 16
7.9%
7 15
7.4%
0 11
 
5.4%
9 11
 
5.4%
Space Separator
ValueCountFrequency (%)
204
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 851 | 67.7%
Common | 406 | 32.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
77
 
9.0%
74
 
8.7%
68
 
8.0%
68
 
8.0%
68
 
8.0%
68
 
8.0%
68
 
8.0%
68
 
8.0%
64
 
7.5%
30
 
3.5%
Other values (48) 198
23.3%
Common
ValueCountFrequency (%)
204
50.2%
2 36
 
8.9%
1 34
 
8.4%
6 23
 
5.7%
5 19
 
4.7%
3 19
 
4.7%
8 18
 
4.4%
4 16
 
3.9%
7 15
 
3.7%
0 11
 
2.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 851 | 67.7%
ASCII | 406 | 32.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
204
50.2%
2 36
 
8.9%
1 34
 
8.4%
6 23
 
5.7%
5 19
 
4.7%
3 19
 
4.7%
8 18
 
4.4%
4 16
 
3.9%
7 15
 
3.7%
0 11
 
2.7%
Hangul
ValueCountFrequency (%)
77
 
9.0%
74
 
8.7%
68
 
8.0%
68
 
8.0%
68
 
8.0%
68
 
8.0%
68
 
8.0%
68
 
8.0%
64
 
7.5%
30
 
3.5%
Other values (48) 198
23.3%

연면적
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 70
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18474.006
Minimum: 10000.69
Maximum: 58119.41
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 762.0 B

Quantile statistics

Minimum: 10000.69
5-th percentile: 10239.19
Q1: 11422.285
Median: 12876.76
Q3: 20846.368
95-th percentile: 41959.812
Maximum: 58119.41
Range: 48118.72
Interquartile range (IQR): 9424.0825

Descriptive statistics

Standard deviation: 11312.087
Coefficient of variation (CV): 0.61232452
Kurtosis: 2.6895965
Mean: 18474.006
Median Absolute Deviation (MAD): 2376.405
Skewness: 1.8367493
Sum: 1293180.4
Variance: 1.2796331 × 10^8
Monotonicity: Not monotonic
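Several of the 연면적 figures are derived from one another, so they can be cross-checked using only the numbers printed in this report. The sketch below works from those rounded values rather than the raw data, so small differences in the last digits are expected; the variable names are illustrative.

```python
import math

# Rounded values copied from the 연면적 statistics in this report.
mean, std = 18474.006, 11312.087
minimum, maximum = 10000.69, 58119.41
q1, q3 = 11422.285, 20846.368
n = 70

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum
iqr = q3 - q1                    # interquartile range
total = mean * n                 # sum recovered from the mean

print(f"CV={cv:.8f} variance={variance:.7e} range={value_range:.2f}")
print(f"IQR={iqr:.4f} sum={total:.1f}")
print("mean above median:", mean > 12876.76, math.isclose(cv, 0.61232452, abs_tol=1e-6))
```

The mean sitting well above the median (12876.76) agrees with the positive skewness reported above: a few very large buildings pull the average up.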
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
10503.8 | 1 | 1.4%
14996.43 | 1 | 1.4%
11420.21 | 1 | 1.4%
20121.92 | 1 | 1.4%
21087.85 | 1 | 1.4%
18718.68 | 1 | 1.4%
12168.3 | 1 | 1.4%
12687.87 | 1 | 1.4%
10370.25 | 1 | 1.4%
10120.81 | 1 | 1.4%
Other values (60) | 60 | 85.7%

Minimum 10 values

Value | Count | Frequency (%)
10000.69 | 1 | 1.4%
10025.73 | 1 | 1.4%
10120.81 | 1 | 1.4%
10131.96 | 1 | 1.4%
10370.25 | 1 | 1.4%
10385.29 | 1 | 1.4%
10496.91 | 1 | 1.4%
10503.8 | 1 | 1.4%
10536.45 | 1 | 1.4%
10587.35 | 1 | 1.4%

Maximum 10 values

Value | Count | Frequency (%)
58119.41 | 1 | 1.4%
51727.58 | 1 | 1.4%
49482.98 | 1 | 1.4%
42535.65 | 1 | 1.4%
41256.01 | 1 | 1.4%
40467.06 | 1 | 1.4%
40336.24 | 1 | 1.4%
35477.45 | 1 | 1.4%
35334.61 | 1 | 1.4%
33755.46 | 1 | 1.4%

Interactions


Correlations

Variable | 연번 | 주용도코드명 | 기타용도내용 | 건물명 | 도로명주소 | 연면적
연번 | 1.000 | 0.359 | 0.283 | 1.000 | 0.942 | 0.267
주용도코드명 | 0.359 | 1.000 | 0.997 | 1.000 | 1.000 | 0.659
기타용도내용 | 0.283 | 0.997 | 1.000 | 0.000 | 0.000 | 0.920
건물명 | 1.000 | 1.000 | 0.000 | 1.000 | 1.000 | 0.957
도로명주소 | 0.942 | 1.000 | 0.000 | 1.000 | 1.000 | 0.000
연면적 | 0.267 | 0.659 | 0.920 | 0.957 | 0.000 | 1.000

Variable | 주용도코드명 | 기타용도내용
주용도코드명 | 1.000 | 0.796
기타용도내용 | 0.796 | 1.000

Variable | 연번 | 연면적 | 주용도코드명 | 기타용도내용
연번 | 1.000 | 0.066 | 0.128 | 0.000
연면적 | 0.066 | 1.000 | 0.324 | 0.513
주용도코드명 | 0.128 | 0.324 | 1.000 | 0.796
기타용도내용 | 0.000 | 0.513 | 0.796 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows

index | 연번 | 주용도코드명 | 기타용도내용 | 건물명 | 도로명주소 | 연면적
0 | 1 | 교육연구시설 | 교육연구시설 | 신대초등학교 | 인천광역시 계양구 계산새로 133 | 10503.8
1 | 2 | 판매시설 | 판매시설,운동시설,제1,2종근린생활시설 | (주)농협하나로유통 하나로마트 인천점 | 인천광역시 계양구 아나지로 341 | 41256.01
2 | 3 | 방송통신시설 | 방송통신시설 | 인천 계양 방송통신시설 | 인천광역시 계양구 오조산로 129 | 15562.28
3 | 4 | 교육연구및복지시설 | 학교교육시설 | 인천안남초등학교 | 인천광역시 계양구 경명대로1114번길 26 | 10025.73
4 | 5 | 판매시설 | 판매및영업시설, 운동시설, 제1종근린생활시설 | HOME PLUS 작전점 | 인천광역시 계양구 계양대로 27 | 49482.98
5 | 6 | 교육연구시설 | 교육연구시설 | 서운중학교 | 인천광역시 계양구 아나지로 467 | 10751.57
6 | 7 | 교육연구시설 | 교육연구및복지시설 | 서운고등학교 | 인천광역시 계양구 아나지로 481 | 16235.37
7 | 8 | 공장 | 공장 | (주)피제이전자 SMT 공장 | 인천광역시 계양구 서운산단로2길 30 | 12967.96
8 | 9 | 교육연구시설 | 교육연구및복지시설 | 작전고등학교 | 인천광역시 계양구 봉오대로729번길 27 | 11921.21
9 | 10 | 교육연구시설 | 교육연구및복지시설 | 효성고등학교 | 인천광역시 계양구 새벌로171번길 21 | 12938.56

Last rows

index | 연번 | 주용도코드명 | 기타용도내용 | 건물명 | 도로명주소 | 연면적
60 | 61 | 의료시설 | 의료시설, 근린생활시설,교육연구시설 | 한림병원 | 인천광역시 계양구 장제로 722 | 24973.79
61 | 62 | 업무시설 | 공공업무시설 외 1 | 인천광역시 계양구청사 | 인천광역시 계양구 계산새로 88 | 35334.61
62 | 63 | 공장 | 공장 | (주)씨에스아이엔테크 | 인천광역시 계양구 서운산업로 61 | 15969.08
63 | 64 | 공장 | 공장 | (주)피스코코리아 | 인천광역시 계양구 서운산단로2길 42 | 12549.67
64 | 65 | 교육연구시설 | 교육연구시설 | 인천작전초등학교 | 인천광역시 계양구 주부토로 456 | 10587.35
65 | 66 | 숙박시설 | 숙박시설(관광호텔),업무시설,일반음식점 | (주)호텔카리스 | 인천광역시 계양구 계양대로 28 | 16766.86
66 | 67 | 공장 | 공장 | <NA> | 인천광역시 계양구 서운산단로1길 67 | 12399.11
67 | 68 | 운수시설 | 도시철도차량기지(주공장) | 2동 | 인천광역시 계양구 만봉길 65 | 26528.4
68 | 69 | 교육연구시설 | 교육연구시설 | 인천부현초등학교 | 인천광역시 계양구 장제로755번길 30 | 10643.55
69 | 70 | 창고시설 | 창고시설 | (주)지오영인천물류센터 | 인천광역시 계양구 마장로 537 | 21373.16