Overview

Dataset statistics

Number of variables: 4
Number of observations: 96
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.1 KiB
Average record size in memory: 33.4 B

Variable types

Categorical: 2
Text: 2

Dataset

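Description: Provides the list of pollution-level inspections (오염도검사) of newly built apartment housing registered in the Indoor Air Quality Management Comprehensive Information Network operated by the Korea Environment Corporation.
Author: Korea Environment Corporation (한국환경공단)
URL: https://www.data.go.kr/data/15093403/fileData.do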

Alerts

구분 has constant value "오염도검사" (Constant)
시설명 has unique values (Unique)
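
Both alerts amount to simple cardinality checks on a column. The following is a minimal sketch of such checks, applied to the first five rows shown in the Sample section and restricted to three columns. The function name `cardinality_alerts` is hypothetical, and this is not ydata-profiling's own implementation.

```python
def cardinality_alerts(rows):
    """Flag columns whose values are all identical or all different."""
    alerts = {}
    for column in rows[0]:
        distinct = len({row[column] for row in rows})
        if distinct == 1:
            alerts[column] = "Constant"
        elif distinct == len(rows):
            alerts[column] = "Unique"
    return alerts


# First five rows of the dataset, restricted to three columns.
sample_rows = [
    {"시군": "대구광역시", "시설명": "월배삼정그린코아 포레스트 카운티", "구분": "오염도검사"},
    {"시군": "대구광역시", "시설명": "대구도남A-1BL 아파트", "구분": "오염도검사"},
    {"시군": "서울특별시", "시설명": "여의도 리미티오 148 오피스텔", "구분": "오염도검사"},
    {"시군": "대전광역시", "시설명": "레자미 탐앤탐", "구분": "오염도검사"},
    {"시군": "서울특별시", "시설명": "가좌역스타타워", "구분": "오염도검사"},
]

alerts = cardinality_alerts(sample_rows)
print(alerts)
```

On this excerpt only 시설명 and 구분 are flagged. 시군 repeats some values, so it is neither constant nor unique.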

Reproduction

Analysis started: 2024-03-16 06:43:23.768497
Analysis finished: 2024-03-16 06:43:30.324972
Duration: 6.56 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시군
Categorical

Distinct: 13
Distinct (%): 13.5%
Missing: 0
Missing (%): 0.0%
Memory size: 900.0 B

서울특별시  39
부산광역시  17
경기도  13
대구광역시  9
대전광역시  3
Other values (8)  15

Length

Max length: 7
Median length: 5
Mean length: 4.6875
Min length: 3

Unique

Unique: 4
Unique (%): 4.2%

Sample

1st row: 대구광역시
2nd row: 대구광역시
3rd row: 서울특별시
4th row: 대전광역시
5th row: 서울특별시

Common Values

Value  Count  Frequency (%)
서울특별시  39  40.6%
부산광역시  17  17.7%
경기도  13  13.5%
대구광역시  9  9.4%
대전광역시  3  3.1%
전라북도  3  3.1%
울산광역시  3  3.1%
인천광역시  3  3.1%
강원도  2  2.1%
세종특별자치시  1  1.0%
Other values (3)  3  3.1%
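
The frequency, distinct, and unique figures for 시군 can be cross-checked from the counts in this table. The sketch below assumes that the three values grouped under "Other values (3)" occur once each, which follows from their combined count of 3. The keys `other_1` to `other_3` are placeholders, not real category names.

```python
from collections import Counter

counts = Counter({
    "서울특별시": 39, "부산광역시": 17, "경기도": 13, "대구광역시": 9,
    "대전광역시": 3, "전라북도": 3, "울산광역시": 3, "인천광역시": 3,
    "강원도": 2, "세종특별자치시": 1,
    # Placeholders for the three categories hidden behind "Other values (3)".
    "other_1": 1, "other_2": 1, "other_3": 1,
})

total = sum(counts.values())                       # number of observations
frequency = {value: round(100 * n / total, 1)      # share of rows per value, in %
             for value, n in counts.items()}
distinct = len(counts)                             # number of different values
unique = sum(1 for n in counts.values() if n == 1) # values that occur exactly once

print(total, distinct, unique)
print(frequency["서울특별시"], frequency["대구광역시"])
```

Under that assumption the totals agree with the report: 96 observations, 13 distinct values, and 4 values that occur exactly once.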

Length

[Figure: Histogram of lengths of the category]

시군구
Text

Distinct: 55
Distinct (%): 57.3%
Missing: 0
Missing (%): 0.0%
Memory size: 900.0 B

Length

Max length: 7
Median length: 3
Mean length: 3
Min length: 2

Characters and Unicode

Total characters: 288
Distinct characters: 67
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
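
As an illustration of such character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using the five 시군구 sample values listed in this section. The helper name `category_counts` and the small `LONG_NAMES` lookup are assumptions made for this example. How ydata-profiling derives its own figures is not shown in the report.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes used in this example only.
LONG_NAMES = {"Lo": "Other Letter"}


def category_counts(values):
    """Count Unicode general categories over all characters in the given strings."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


sample_values = ["달서구", "북구", "영등포구", "유성구", "서대문구"]
counts = category_counts(sample_values)

for code, n in counts.items():
    print(LONG_NAMES.get(code, code), n)
```

Every character in these values is a Hangul syllable, which falls in the single category `Lo` (Other Letter). That is consistent with the one distinct category reported for this variable.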

Unique

Unique: 34
Unique (%): 35.4%

Sample

1st row: 달서구
2nd row: 북구
3rd row: 영등포구
4th row: 유성구
5th row: 서대문구

Value  Count  Frequency (%)
남구  5  5.2%
서구  5  5.2%
영등포구  5  5.2%
화성시  4  4.2%
강동구  4  4.2%
중구  3  3.1%
부산진구  3  3.1%
구로구  3  3.1%
서대문구  3  3.1%
익산시  3  3.1%
Other values (45)  58  60.4%

Most occurring characters

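Value  Count  Frequency (%)
(character lost)  74  25.7%
(character lost)  21  7.3%
(character lost)  12  4.2%
(character lost)  10  3.5%
(character lost)  10  3.5%
(character lost)  10  3.5%
(character lost)  9  3.1%
(character lost)  9  3.1%
(character lost)  7  2.4%
(character lost)  6  2.1%
Other values (57)  120  41.7%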

Most occurring categories

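Value  Count  Frequency (%)
Other Letter  288  100.0%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
(character lost)  74  25.7%
(character lost)  21  7.3%
(character lost)  12  4.2%
(character lost)  10  3.5%
(character lost)  10  3.5%
(character lost)  10  3.5%
(character lost)  9  3.1%
(character lost)  9  3.1%
(character lost)  7  2.4%
(character lost)  6  2.1%
Other values (57)  120  41.7%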

Most occurring scripts

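Value  Count  Frequency (%)
Hangul  288  100.0%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
(character lost)  74  25.7%
(character lost)  21  7.3%
(character lost)  12  4.2%
(character lost)  10  3.5%
(character lost)  10  3.5%
(character lost)  10  3.5%
(character lost)  9  3.1%
(character lost)  9  3.1%
(character lost)  7  2.4%
(character lost)  6  2.1%
Other values (57)  120  41.7%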

Most occurring blocks

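Value  Count  Frequency (%)
Hangul  288  100.0%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
(character lost)  74  25.7%
(character lost)  21  7.3%
(character lost)  12  4.2%
(character lost)  10  3.5%
(character lost)  10  3.5%
(character lost)  10  3.5%
(character lost)  9  3.1%
(character lost)  9  3.1%
(character lost)  7  2.4%
(character lost)  6  2.1%
Other values (57)  120  41.7%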

시설명
Text

UNIQUE 

Distinct: 96
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 900.0 B

Length

Max length: 23
Median length: 17
Mean length: 10.145833
Min length: 4

Characters and Unicode

Total characters: 974
Distinct characters: 240
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
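
To make the category and block counts concrete, the sketch below classifies the characters of one 시설명 sample value, 대구도남A-1BL 아파트. The helper `block_of` is hypothetical and deliberately simplified: it recognises only the ASCII range and the precomposed Hangul syllable range, and labels everything else "Other". General categories come from Python's standard `unicodedata` module.

```python
import unicodedata
from collections import Counter


def block_of(ch):
    """Rough block lookup covering only the two ranges needed for this example."""
    code_point = ord(ch)
    if code_point <= 0x7F:
        return "ASCII"
    if 0xAC00 <= code_point <= 0xD7A3:  # precomposed Hangul syllables
        return "Hangul"
    return "Other"


name = "대구도남A-1BL 아파트"
blocks = Counter(block_of(ch) for ch in name)
categories = Counter(unicodedata.category(ch) for ch in name)

print(dict(blocks))
print(dict(categories))
```

This single name already mixes two blocks and five general categories (`Lo`, `Lu`, `Nd`, `Pd`, `Zs`), which shows how a column of facility names can reach the 8 categories and 2 blocks reported for the whole column.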

Unique

Unique: 96
Unique (%): 100.0%

Sample

1st row: 월배삼정그린코아 포레스트 카운티
2nd row: 대구도남A-1BL 아파트
3rd row: 여의도 리미티오 148 오피스텔
4th row: 레자미 탐앤탐
5th row: 가좌역스타타워

Value  Count  Frequency (%)
힐스테이트  3  1.8%
아파트  3  1.8%
더샵  2  1.2%
기안우방아이유쉘  2  1.2%
역세권  2  1.2%
청년주택  2  1.2%
화성  2  1.2%
여의도  2  1.2%
e편한세상  2  1.2%
데시앙  1  0.6%
Other values (145)  145  87.3%

Most occurring characters

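Value  Count  Frequency (%)
(space)  72  7.4%
(character lost)  28  2.9%
(character lost)  28  2.9%
(character lost)  22  2.3%
(character lost)  19  2.0%
(character lost)  18  1.8%
(character lost)  16  1.6%
(character lost)  14  1.4%
(character lost)  14  1.4%
(character lost)  12  1.2%
Other values (230)  731  75.1%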

Most occurring categories

Value  Count  Frequency (%)
Other Letter  825  84.7%
Space Separator  72  7.4%
Decimal Number  33  3.4%
Uppercase Letter  24  2.5%
Close Punctuation  7  0.7%
Open Punctuation  7  0.7%
Lowercase Letter  4  0.4%
Dash Punctuation  2  0.2%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
(character lost)  28  3.4%
(character lost)  28  3.4%
(character lost)  22  2.7%
(character lost)  19  2.3%
(character lost)  18  2.2%
(character lost)  16  1.9%
(character lost)  14  1.7%
(character lost)  14  1.7%
(character lost)  12  1.5%
(character lost)  12  1.5%
Other values (204)  642  77.8%

Uppercase Letter
Value  Count  Frequency (%)
R  3  12.5%
D  3  12.5%
K  3  12.5%
A  3  12.5%
I  2  8.3%
P  2  8.3%
C  2  8.3%
M  2  8.3%
S  1  4.2%
N  1  4.2%
Other values (2)  2  8.3%

Decimal Number
Value  Count  Frequency (%)
1  9  27.3%
2  8  24.2%
4  7  21.2%
3  4  12.1%
6  1  3.0%
0  1  3.0%
8  1  3.0%
5  1  3.0%
9  1  3.0%

Space Separator
Value  Count  Frequency (%)
(space)  72  100.0%

Close Punctuation
Value  Count  Frequency (%)
)  7  100.0%

Open Punctuation
Value  Count  Frequency (%)
(  7  100.0%

Lowercase Letter
Value  Count  Frequency (%)
e  4  100.0%

Dash Punctuation
Value  Count  Frequency (%)
-  2  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  825  84.7%
Common  121  12.4%
Latin  28  2.9%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
(character lost)  28  3.4%
(character lost)  28  3.4%
(character lost)  22  2.7%
(character lost)  19  2.3%
(character lost)  18  2.2%
(character lost)  16  1.9%
(character lost)  14  1.7%
(character lost)  14  1.7%
(character lost)  12  1.5%
(character lost)  12  1.5%
Other values (204)  642  77.8%

Common
Value  Count  Frequency (%)
(space)  72  59.5%
1  9  7.4%
2  8  6.6%
)  7  5.8%
(  7  5.8%
4  7  5.8%
3  4  3.3%
-  2  1.7%
6  1  0.8%
0  1  0.8%
Other values (3)  3  2.5%

Latin
Value  Count  Frequency (%)
e  4  14.3%
R  3  10.7%
D  3  10.7%
K  3  10.7%
A  3  10.7%
I  2  7.1%
P  2  7.1%
C  2  7.1%
M  2  7.1%
S  1  3.6%
Other values (3)  3  10.7%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  825  84.7%
ASCII  149  15.3%

Most frequent character per block

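ASCII
Value  Count  Frequency (%)
(space)  72  48.3%
1  9  6.0%
2  8  5.4%
)  7  4.7%
(  7  4.7%
4  7  4.7%
e  4  2.7%
3  4  2.7%
R  3  2.0%
D  3  2.0%
Other values (16)  25  16.8%

Hangul
Value  Count  Frequency (%)
(character lost)  28  3.4%
(character lost)  28  3.4%
(character lost)  22  2.7%
(character lost)  19  2.3%
(character lost)  18  2.2%
(character lost)  16  1.9%
(character lost)  14  1.7%
(character lost)  14  1.7%
(character lost)  12  1.5%
(character lost)  12  1.5%
Other values (204)  642  77.8%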

구분
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 900.0 B

오염도검사  96

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 오염도검사
2nd row: 오염도검사
3rd row: 오염도검사
4th row: 오염도검사
5th row: 오염도검사

Common Values

Value  Count  Frequency (%)
오염도검사  96  100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; 오염도검사 accounts for all 96 rows (100.0%)]

Correlations

[Figure: correlation heatmap]

Variable  시군  시군구  시설명
시군  1.000  0.962  1.000
시군구  0.962  1.000  1.000
시설명  1.000  1.000  1.000
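
The report does not state which association measure produced this matrix. One common choice for pairs of categorical columns is Cramér's V. The sketch below is an uncorrected, pure-Python version offered only as an assumption, to illustrate why a column whose values are all unique (such as 시설명) scores 1.000 against every other column. The function name `cramers_v` is hypothetical, and the input is the first five rows from the Sample section.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    # V is undefined when either column is constant; return 0.0 here by convention.
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


sigun = ["대구광역시", "대구광역시", "서울특별시", "대전광역시", "서울특별시"]
facility = [
    "월배삼정그린코아 포레스트 카운티",
    "대구도남A-1BL 아파트",
    "여의도 리미티오 148 오피스텔",
    "레자미 탐앤탐",
    "가좌역스타타워",
]

v = cramers_v(sigun, facility)
print(round(v, 3))
```

Because each 시설명 value appears in exactly one row, it determines the 시군 value completely, so the statistic reaches its maximum. A value of 1.000 here says more about the uniqueness of 시설명 than about any meaningful relationship.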

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion visually.

Sample

Index  시군  시군구  시설명  구분
0  대구광역시  달서구  월배삼정그린코아 포레스트 카운티  오염도검사
1  대구광역시  북구  대구도남A-1BL 아파트  오염도검사
2  서울특별시  영등포구  여의도 리미티오 148 오피스텔  오염도검사
3  대전광역시  유성구  레자미 탐앤탐  오염도검사
4  서울특별시  서대문구  가좌역스타타워  오염도검사
5  대전광역시  중구  에이스퀘어  오염도검사
6  부산광역시  영도구  우성스마트시티뷰  오염도검사
7  서울특별시  관악구  신림동 주상복합 신축  오염도검사
8  서울특별시  송파구  호반써밋 송파 2차  오염도검사
9  대구광역시  동구  안심시티프라디움  오염도검사

Index  시군  시군구  시설명  구분
86  부산광역시  해운대구  해운대 센트럴 푸르지오  오염도검사
87  경기도  광명시  광명푸르지오센트베르  오염도검사
88  부산광역시  부산진구  서면스위트엠골드에비뉴  오염도검사
89  서울특별시  구로구  고척IPARK(RD블럭)  오염도검사
90  부산광역시  북구  포레나부산덕천1차  오염도검사
91  부산광역시  부산진구  시민공원 이편한세상  오염도검사
92  강원도  춘천시  롯데캐슬위너클래스  오염도검사
93  대전광역시  동구  신흥SK뷰 아파트  오염도검사
94  서울특별시  강동구  힐데스하임 천호  오염도검사
95  경기도  광주시  오포 더샵 센트럴포레(고산1지구 C1블럭)  오염도검사