Overview

Dataset statistics

Number of variables: 6
Number of observations: 98
Missing cells: 98
Missing cells (%): 16.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.0 KiB
Average record size in memory: 52.3 B
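
The missing-cell percentage can be cross-checked against the per-column counts listed under Alerts (21 and 77 missing values). A minimal sketch, assuming the percentage is simply missing cells divided by rows × columns; the variable names are illustrative and not part of the report:

```python
# Figures copied from this report. The formula (missing cells / total cells)
# is an assumption about how the percentage is derived.
n_rows = 98
n_cols = 6
missing_per_column = {"연면적(건축물)": 21, "세대수(공동주택)": 77}

missing_cells = sum(missing_per_column.values())
total_cells = n_rows * n_cols
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)          # 98
print(f"{missing_pct:.1f}%")  # 16.7%
```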

Variable types

Numeric: 3
Text: 2
Categorical: 1

Dataset

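Description: Lists the buildings in Gimcheon-si, Gyeongsangbuk-do that are subject to mechanical equipment performance inspection, with serial number (연번), building name (건물명), address (주소), total floor area (연면적), number of households (세대수), and data reference date (데이터기준일자).
Author: Gimcheon-si, Gyeongsangbuk-do (경상북도 김천시)
URL: https://www.data.go.kr/data/15112040/fileData.do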

Alerts

데이터기준일자 has constant value "2023-12-31" [Constant]
연번 is highly overall correlated with 세대수(공동주택) [High correlation]
세대수(공동주택) is highly overall correlated with 연번 [High correlation]
연면적(건축물) has 21 (21.4%) missing values [Missing]
세대수(공동주택) has 77 (78.6%) missing values [Missing]
연번 has unique values [Unique]

Reproduction

Analysis started: 2024-03-15 02:07:03.684454
Analysis finished: 2024-03-15 02:07:06.902841
Duration: 3.22 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
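
The reported duration follows from the two timestamps. A small sketch using only the standard library, with the timestamp strings copied from the Reproduction block:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block of this report.
started = datetime.fromisoformat("2024-03-15 02:07:03.684454")
finished = datetime.fromisoformat("2024-03-15 02:07:06.902841")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 3.22 seconds
```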

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 98
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49.5
Minimum: 1
Maximum: 98
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1010.0 B

Quantile statistics

Minimum: 1
5-th percentile: 5.85
Q1: 25.25
Median: 49.5
Q3: 73.75
95-th percentile: 93.15
Maximum: 98
Range: 97
Interquartile range (IQR): 48.5

Descriptive statistics

Standard deviation: 28.434134
Coefficient of variation (CV): 0.57442696
Kurtosis: -1.2
Mean: 49.5
Median Absolute Deviation (MAD): 24.5
Skewness: 0
Sum: 4851
Variance: 808.5
Monotonicity: Strictly increasing
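
Because 연번 has 98 distinct values between 1 and 98, most of the figures above can be reproduced from the sequence 1..98. The sketch below assumes integer serial numbers, the sample (n − 1) variance, and linear-interpolation percentiles; the `percentile` helper is hypothetical and written only for this illustration:

```python
import statistics

def percentile(sorted_values, p):
    """Linear-interpolation percentile (p in [0, 1]) on pre-sorted data."""
    pos = (len(sorted_values) - 1) * p
    lower = int(pos)
    frac = pos - lower
    if lower + 1 >= len(sorted_values):
        return float(sorted_values[-1])
    step = sorted_values[lower + 1] - sorted_values[lower]
    return sorted_values[lower] + frac * step

values = list(range(1, 99))  # assumed: 연번 is the integer sequence 1..98

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1)
std = statistics.stdev(values)
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)
q1, q3 = percentile(values, 0.25), percentile(values, 0.75)

print(sum(values), mean, variance)  # 4851 49.5 808.5
print(q1, q3, q3 - q1, mad)         # 25.25 73.75 48.5 24.5
print(round(std, 6), round(std / mean, 8))
```

The last line prints the standard deviation and the coefficient of variation (standard deviation divided by mean), which can be compared with the values listed above.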
Histogram with fixed size bins (bins=50)
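
A rough sketch of what fixed-size binning does for this column, assuming 50 equal-width intervals spanning the minimum to the maximum with the last interval closed on the right. The function name is hypothetical and the report's own plotting code is not shown in the source:

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equal-width intervals spanning min..max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The last interval is closed on the right, so the maximum lands in it.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts

values = list(range(1, 99))  # 연번: 1..98
counts = fixed_width_histogram(values, bins=50)

print(len(counts), sum(counts))  # 50 98
print(min(counts), max(counts))  # 1 2
```

With 98 evenly spaced values and a bin width of 1.94, every bin holds one or two observations, so the histogram is essentially flat.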
Common values

Value               Count   Frequency (%)
1                   1       1.0%
75                  1       1.0%
73                  1       1.0%
72                  1       1.0%
71                  1       1.0%
70                  1       1.0%
69                  1       1.0%
68                  1       1.0%
67                  1       1.0%
66                  1       1.0%
Other values (88)   88      89.8%

Minimum 10 values

Value   Count   Frequency (%)
1       1       1.0%
2       1       1.0%
3       1       1.0%
4       1       1.0%
5       1       1.0%
6       1       1.0%
7       1       1.0%
8       1       1.0%
9       1       1.0%
10      1       1.0%

Maximum 10 values

Value   Count   Frequency (%)
98      1       1.0%
97      1       1.0%
96      1       1.0%
95      1       1.0%
94      1       1.0%
93      1       1.0%
92      1       1.0%
91      1       1.0%
90      1       1.0%
89      1       1.0%

건물명
Text

Distinct: 81
Distinct (%): 82.7%
Missing: 0
Missing (%): 0.0%
Memory size: 912.0 B

Length

Max length: 20
Median length: 14
Mean length: 9.1020408
Min length: 2

Characters and Unicode

Total characters: 892
Distinct characters: 197
Distinct categories: 7
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 73
Unique (%): 74.5%
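
Distinct and Unique measure different things here: 81 distinct building names but only 73 unique ones. The sketch below assumes that "unique" means a value occurring exactly once, which is consistent with those two figures. It uses the first ten 건물명 values from the Sample section of this report:

```python
from collections import Counter

# First ten 건물명 values, copied from the Sample section of this report.
names = [
    "김천제일병원", "로제니아호텔", "신남라인프라자", "메타폴리스", "한신休시티",
    "김천 파크드림 시티써밋", "김천 파크드림 시티써밋", "김천 파크드림 시티써밋",
    "스타메디", "영무메트로 오피스텔",
]

counts = Counter(names)
distinct = len(counts)                              # number of different values
unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

print(distinct, unique)  # 8 7
```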

Sample

1st row: 김천제일병원
2nd row: 로제니아호텔
3rd row: 신남라인프라자
4th row: 메타폴리스
5th row: 한신休시티
Value                 Count   Frequency (%)
김천공장              14      9.6%
유한킴벌리㈜          6       4.1%
김천                  5       3.4%
코오롱인더스트리㈜    5       3.4%
아주스틸㈜            4       2.7%
파크드림              4       2.7%
시설물관리공단        3       2.1%
유니투스              3       2.1%
시티써밋              3       2.1%
김천1공장             3       2.1%
Other values (91)     96      65.8%

Most occurring characters

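Value                Count   Frequency (%)
(space)              50      5.6%
[glyph lost]         48      5.4%
[glyph lost]         46      5.2%
[glyph lost]         31      3.5%
[glyph lost]         26      2.9%
㈜                   24      2.7%
[glyph lost]         23      2.6%
[glyph lost]         19      2.1%
[glyph lost]         17      1.9%
[glyph lost]         17      1.9%
Other values (187)   591     66.3%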

Most occurring categories

Value               Count   Frequency (%)
Other Letter        784     87.9%
Space Separator     50      5.6%
Other Symbol        24      2.7%
Decimal Number      14      1.6%
Open Punctuation    7       0.8%
Close Punctuation   7       0.8%
Uppercase Letter    6       0.7%

Most frequent character per category

Other Letter

Value                Count   Frequency (%)
[glyph lost]         48      6.1%
[glyph lost]         46      5.9%
[glyph lost]         31      4.0%
[glyph lost]         26      3.3%
[glyph lost]         23      2.9%
[glyph lost]         19      2.4%
[glyph lost]         17      2.2%
[glyph lost]         17      2.2%
[glyph lost]         16      2.0%
[glyph lost]         15      1.9%
Other values (173)   526     67.1%

Uppercase Letter

Value   Count   Frequency (%)
W       1       16.7%
X       1       16.7%
K       1       16.7%
T       1       16.7%
H       1       16.7%
J       1       16.7%

Decimal Number

Value   Count   Frequency (%)
1       7       50.0%
2       4       28.6%
3       2       14.3%
4       1       7.1%

Space Separator

Value     Count   Frequency (%)
(space)   50      100.0%

Other Symbol

Value   Count   Frequency (%)
㈜      24      100.0%

Open Punctuation

Value   Count   Frequency (%)
(       7       100.0%

Close Punctuation

Value   Count   Frequency (%)
)       7       100.0%

Most occurring scripts

Value    Count   Frequency (%)
Hangul   807     90.5%
Common   78      8.7%
Latin    6       0.7%
Han      1       0.1%

Most frequent character per script

Hangul

Value                Count   Frequency (%)
[glyph lost]         48      5.9%
[glyph lost]         46      5.7%
[glyph lost]         31      3.8%
[glyph lost]         26      3.2%
㈜                   24      3.0%
[glyph lost]         23      2.9%
[glyph lost]         19      2.4%
[glyph lost]         17      2.1%
[glyph lost]         17      2.1%
[glyph lost]         16      2.0%
Other values (173)   540     66.9%

Common

Value     Count   Frequency (%)
(space)   50      64.1%
(         7       9.0%
)         7       9.0%
1         7       9.0%
2         4       5.1%
3         2       2.6%
4         1       1.3%

Latin

Value   Count   Frequency (%)
W       1       16.7%
X       1       16.7%
K       1       16.7%
T       1       16.7%
H       1       16.7%
J       1       16.7%

Han

Value   Count   Frequency (%)
休      1       100.0%

Most occurring blocks

Value    Count   Frequency (%)
Hangul   783     87.8%
ASCII    84      9.4%
None     24      2.7%
CJK      1       0.1%

Most frequent character per block

ASCII

Value              Count   Frequency (%)
(space)            50      59.5%
(                  7       8.3%
)                  7       8.3%
1                  7       8.3%
2                  4       4.8%
3                  2       2.4%
W                  1       1.2%
X                  1       1.2%
K                  1       1.2%
T                  1       1.2%
Other values (3)   3       3.6%

Hangul

Value                Count   Frequency (%)
[glyph lost]         48      6.1%
[glyph lost]         46      5.9%
[glyph lost]         31      4.0%
[glyph lost]         26      3.3%
[glyph lost]         23      2.9%
[glyph lost]         19      2.4%
[glyph lost]         17      2.2%
[glyph lost]         17      2.2%
[glyph lost]         16      2.0%
[glyph lost]         15      1.9%
Other values (172)   525     67.0%

None

Value   Count   Frequency (%)
㈜      24      100.0%

CJK

Value   Count   Frequency (%)
休      1       100.0%

주소
Text

Distinct: 79
Distinct (%): 80.6%
Missing: 0
Missing (%): 0.0%
Memory size: 912.0 B

Length

Max length: 23
Median length: 16
Mean length: 17.469388
Min length: 15

Characters and Unicode

Total characters: 1712
Distinct characters: 72
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 68
Unique (%): 69.4%

Sample

1st row: 경상북도 김천시 신음1길 12
2nd row: 경상북도 김천시 혁신3로 16
3rd row: 경상북도 김천시 혁신2로 50
4th row: 경상북도 김천시 혁신로 185
5th row: 경상북도 김천시 혁신1로 81
Value                Count   Frequency (%)
경상북도             98      23.8%
김천시               98      23.8%
어모면               17      4.1%
공단로               10      2.4%
공단1길              8       1.9%
34                   6       1.5%
혁신1로              6       1.5%
75                   6       1.5%
공단3길              4       1.0%
혁신2로              4       1.0%
Other values (103)   155     37.6%

Most occurring characters

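Value               Count   Frequency (%)
(space)             314     18.3%
[glyph lost]        103     6.0%
[glyph lost]        98      5.7%
[glyph lost]        98      5.7%
[glyph lost]        98      5.7%
[glyph lost]        98      5.7%
[glyph lost]        98      5.7%
[glyph lost]        98      5.7%
1                   70      4.1%
[glyph lost]        62      3.6%
Other values (62)   575     33.6%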

Most occurring categories

Value              Count   Frequency (%)
Other Letter       1093    63.8%
Space Separator    314     18.3%
Decimal Number     298     17.4%
Dash Punctuation   7       0.4%

Most frequent character per category

Other Letter

Value               Count   Frequency (%)
[glyph lost]        103     9.4%
[glyph lost]        98      9.0%
[glyph lost]        98      9.0%
[glyph lost]        98      9.0%
[glyph lost]        98      9.0%
[glyph lost]        98      9.0%
[glyph lost]        98      9.0%
[glyph lost]        62      5.7%
[glyph lost]        39      3.6%
[glyph lost]        37      3.4%
Other values (50)   264     24.2%

Decimal Number

Value   Count   Frequency (%)
1       70      23.5%
3       34      11.4%
4       32      10.7%
5       31      10.4%
2       31      10.4%
7       28      9.4%
8       24      8.1%
6       21      7.0%
0       17      5.7%
9       10      3.4%

Space Separator

Value     Count   Frequency (%)
(space)   314     100.0%

Dash Punctuation

Value   Count   Frequency (%)
-       7       100.0%

Most occurring scripts

Value    Count   Frequency (%)
Hangul   1093    63.8%
Common   619     36.2%

Most frequent character per script

Hangul

Value               Count   Frequency (%)
[glyph lost]        103     9.4%
[glyph lost]        98      9.0%
[glyph lost]        98      9.0%
[glyph lost]        98      9.0%
[glyph lost]        98      9.0%
[glyph lost]        98      9.0%
[glyph lost]        98      9.0%
[glyph lost]        62      5.7%
[glyph lost]        39      3.6%
[glyph lost]        37      3.4%
Other values (50)   264     24.2%

Common

Value              Count   Frequency (%)
(space)            314     50.7%
1                  70      11.3%
3                  34      5.5%
4                  32      5.2%
5                  31      5.0%
2                  31      5.0%
7                  28      4.5%
8                  24      3.9%
6                  21      3.4%
0                  17      2.7%
Other values (2)   17      2.7%

Most occurring blocks

Value    Count   Frequency (%)
Hangul   1093    63.8%
ASCII    619     36.2%

Most frequent character per block

ASCII

Value              Count   Frequency (%)
(space)            314     50.7%
1                  70      11.3%
3                  34      5.5%
4                  32      5.2%
5                  31      5.0%
2                  31      5.0%
7                  28      4.5%
8                  24      3.9%
6                  21      3.4%
0                  17      2.7%
Other values (2)   17      2.7%

Hangul

Value               Count   Frequency (%)
[glyph lost]        103     9.4%
[glyph lost]        98      9.0%
[glyph lost]        98      9.0%
[glyph lost]        98      9.0%
[glyph lost]        98      9.0%
[glyph lost]        98      9.0%
[glyph lost]        98      9.0%
[glyph lost]        62      5.7%
[glyph lost]        39      3.6%
[glyph lost]        37      3.4%
Other values (50)   264     24.2%

연면적(건축물)
Real number (ℝ)

MISSING 

Distinct: 77
Distinct (%): 100.0%
Missing: 21
Missing (%): 21.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 22650
Minimum: 10042
Maximum: 135305
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1010.0 B

Quantile statistics

Minimum: 10042
5-th percentile: 10146.8
Q1: 12142
Median: 16160
Q3: 21660
95-th percentile: 53948.4
Maximum: 135305
Range: 125263
Interquartile range (IQR): 9518

Descriptive statistics

Standard deviation: 22398.339
Coefficient of variation (CV): 0.98888912
Kurtosis: 14.83476
Mean: 22650
Median Absolute Deviation (MAD): 4391
Skewness: 3.723087
Sum: 1744050
Variance: 5.0168557 × 10^8
Monotonicity: Not monotonic
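
Several of these figures are derived from others, which gives a quick consistency check. The sketch below assumes the mean is taken over the 77 non-missing values and that CV is standard deviation divided by mean; all inputs are copied from this report:

```python
# Figures copied from the 연면적(건축물) statistics in this report.
n_obs, n_missing = 98, 21
total, std = 1744050, 22398.339
minimum, maximum = 10042, 135305
q1, q3 = 12142, 21660

n_valid = n_obs - n_missing  # 77 non-missing values
mean = total / n_valid       # assumed: mean ignores missing cells
cv = std / mean
value_range = maximum - minimum
iqr = q3 - q1

print(n_valid, mean, value_range, iqr)  # 77 22650.0 125263 9518
print(round(cv, 4))                     # 0.9889
```

The strongly positive skewness (3.72) is consistent with the mean (22650) sitting well above the median (16160).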
Histogram with fixed size bins (bins=50)
Common values

Value               Count   Frequency (%)
10912               1       1.0%
19568               1       1.0%
15930               1       1.0%
12865               1       1.0%
12188               1       1.0%
20207               1       1.0%
10772               1       1.0%
10883               1       1.0%
16657               1       1.0%
10061               1       1.0%
Other values (67)   67      68.4%
(Missing)           21      21.4%

Minimum 10 values

Value   Count   Frequency (%)
10042   1       1.0%
10049   1       1.0%
10061   1       1.0%
10078   1       1.0%
10164   1       1.0%
10570   1       1.0%
10571   1       1.0%
10615   1       1.0%
10772   1       1.0%
10883   1       1.0%

Maximum 10 values

Value    Count   Frequency (%)
135305   1       1.0%
126569   1       1.0%
97180    1       1.0%
75278    1       1.0%
48616    1       1.0%
40867    1       1.0%
37623    1       1.0%
36317    1       1.0%
32990    1       1.0%
32225    1       1.0%

세대수(공동주택)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 19
Distinct (%): 90.5%
Missing: 77
Missing (%): 78.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 768.28571
Minimum: 388
Maximum: 1119
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1010.0 B

Quantile statistics

Minimum: 388
5-th percentile: 512
Q1: 642
Median: 783
Q3: 919
95-th percentile: 1084
Maximum: 1119
Range: 731
Interquartile range (IQR): 277

Descriptive statistics

Standard deviation: 195.59861
Coefficient of variation (CV): 0.25459097
Kurtosis: -0.66508487
Mean: 768.28571
Median Absolute Deviation (MAD): 141
Skewness: -0.049779085
Sum: 16134
Variance: 38258.814
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=19)
Common values

Value              Count   Frequency (%)
534                2       2.0%
700                2       2.0%
1084               1       1.0%
388                1       1.0%
512                1       1.0%
602                1       1.0%
642                1       1.0%
660                1       1.0%
714                1       1.0%
1119               1       1.0%
Other values (9)   9       9.2%
(Missing)          77      78.6%

Minimum 10 values

Value   Count   Frequency (%)
388     1       1.0%
512     1       1.0%
534     2       2.0%
602     1       1.0%
642     1       1.0%
660     1       1.0%
700     2       2.0%
714     1       1.0%
783     1       1.0%
793     1       1.0%

Maximum 10 values

Value   Count   Frequency (%)
1119    1       1.0%
1084    1       1.0%
959     1       1.0%
938     1       1.0%
930     1       1.0%
919     1       1.0%
916     1       1.0%
896     1       1.0%
811     1       1.0%
793     1       1.0%

데이터기준일자
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 912.0 B

Value        Count
2023-12-31   98

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-12-31
2nd row: 2023-12-31
3rd row: 2023-12-31
4th row: 2023-12-31
5th row: 2023-12-31

Common Values

Value        Count   Frequency (%)
2023-12-31   98      100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value        Count   Frequency (%)
2023-12-31   98      100.0%

Interactions

[Nine interaction plots; images not included.]

Correlations

                   연번     건물명   주소     연면적(건축물)   세대수(공동주택)
연번               1.000    0.985    0.993    0.095            0.823
건물명             0.985    1.000    1.000    0.940            1.000
주소               0.993    1.000    1.000    0.954            1.000
연면적(건축물)     0.095    0.940    0.954    1.000            NaN
세대수(공동주택)   0.823    1.000    1.000    NaN              1.000

                   연번     연면적(건축물)   세대수(공동주택)
연번               1.000    0.018            -0.997
연면적(건축물)     0.018    1.000            NaN
세대수(공동주택)   -0.997   NaN              1.000
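
The strong relationship between 연번 and 세대수(공동주택) is visible even in the last ten rows of the Sample section, where household counts fall as the serial number rises. The source does not name the coefficient behind either matrix, so the sketch below uses a plain Pearson correlation as an assumption. Because it covers only ten of the 21 non-missing rows, it is not expected to reproduce 0.823 or -0.997:

```python
import math

# 연번 and 세대수(공동주택) for the last ten rows, copied from the Sample section.
serial = [89, 90, 91, 92, 93, 94, 95, 96, 97, 98]
households = [714, 700, 700, 660, 642, 602, 534, 512, 534, 388]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

r = pearson(serial, households)
print(round(r, 2))  # -0.95
```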

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
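
In this dataset the two columns with gaps appear to be complementary: 21 + 77 = 98 missing values, and their mutual correlation is NaN, which suggests no row has both filled in. A minimal sketch of nullity correlation, assuming it is the Pearson correlation of present/absent indicators, using the twenty rows shown in the Sample section:

```python
import math

# None marks a missing cell. Values copied from the Sample section:
# the first ten rows have no 세대수, the last ten have no 연면적.
area = [19678, 25795, 10078, 14000, 48616,
        10571, 18257, 12738, 20230, 27152] + [None] * 10
households = [None] * 10 + [714, 700, 700, 660, 642,
                            602, 534, 512, 534, 388]

def nullity(column):
    """1 where a value is present, 0 where it is missing."""
    return [0 if v is None else 1 for v in column]

def pearson(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

r = pearson(nullity(area), nullity(households))
print(round(r, 3))  # -1.0
```

A value of -1 means that whenever one column is present the other is absent, at least within these twenty rows.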

Sample

First rows

#   연번  건물명  주소  연면적(건축물)  세대수(공동주택)  데이터기준일자
0   1   김천제일병원  경상북도 김천시 신음1길 12  19678  <NA>  2023-12-31
1   2   로제니아호텔  경상북도 김천시 혁신3로 16  25795  <NA>  2023-12-31
2   3   신남라인프라자  경상북도 김천시 혁신2로 50  10078  <NA>  2023-12-31
3   4   메타폴리스  경상북도 김천시 혁신로 185  14000  <NA>  2023-12-31
4   5   한신休시티  경상북도 김천시 혁신1로 81  48616  <NA>  2023-12-31
5   6   김천 파크드림 시티써밋  경상북도 김천시 혁신1로 75  10571  <NA>  2023-12-31
6   7   김천 파크드림 시티써밋  경상북도 김천시 혁신1로 75  18257  <NA>  2023-12-31
7   8   김천 파크드림 시티써밋  경상북도 김천시 혁신1로 75  12738  <NA>  2023-12-31
8   9   스타메디  경상북도 김천시 혁신3로 39  20230  <NA>  2023-12-31
9   10  영무메트로 오피스텔  경상북도 김천시 혁신3로 46  27152  <NA>  2023-12-31

Last rows

#   연번  건물명  주소  연면적(건축물)  세대수(공동주택)  데이터기준일자
88  89  김천주공 해돋이타운  경상북도 김천시 신음새동네길 110  <NA>  714  2023-12-31
89  90  남혁신 코아루푸르나임  경상북도 김천시 무실7길 25  <NA>  700  2023-12-31
90  91  코아루1차아파트  경상북도 김천시 삼락택지길 132  <NA>  700  2023-12-31
91  92  경북혁신케이씨씨스위첸  경상북도 김천시 율곡동 408  <NA>  660  2023-12-31
92  93  김천혁신도시 영무예다음1차아파트  경상북도 김천시 혁신8로 64  <NA>  642  2023-12-31
93  94  덕곡주공아파트  경상북도 김천시 무실7길 38  <NA>  602  2023-12-31
94  95  김천 삼도뷰엔빌W 아파트 1단지  경상북도 김천시 시청7길 62  <NA>  534  2023-12-31
95  96  김천혁신도시골드클래스  경상북도 김천시 혁신2로 76  <NA>  512  2023-12-31
96  97  부곡3주공아파트  경상북도 김천시 문지왈길 68  <NA>  534  2023-12-31
97  98  부곡2주공아파트  경상북도 김천시 문지왈길 68  <NA>  388  2023-12-31