Overview

Dataset statistics

Number of variables: 4
Number of observations: 828
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 2
Duplicate rows (%): 0.2%
Total size in memory: 26.8 KiB
Average record size in memory: 33.2 B
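
The counts above can be recomputed with plain pandas. The sketch below is only an illustration: the `overview` helper is a hypothetical name, and the small frame is built from rows shown in this report's Sample section with one row repeated on purpose, so its numbers differ from the full 828-row dataset (where 2 duplicates out of 828 rows gives roughly 0.2%).

```python
import pandas as pd


def overview(frame: pd.DataFrame) -> dict:
    """Recompute the headline dataset statistics for a DataFrame (hypothetical helper)."""
    n_rows, n_cols = frame.shape
    n_cells = n_rows * n_cols
    missing = int(frame.isna().sum().sum())
    # duplicated() flags every repeat after the first occurrence of a row.
    duplicates = int(frame.duplicated().sum())
    return {
        "variables": n_cols,
        "observations": n_rows,
        "missing_cells": missing,
        "missing_cells_pct": 100 * missing / n_cells if n_cells else 0.0,
        "duplicate_rows": duplicates,
        "duplicate_rows_pct": 100 * duplicates / n_rows if n_rows else 0.0,
    }


# Illustrative data only: three rows from the report's sample, the last one repeated.
sample = pd.DataFrame(
    {
        "시군명": ["천안시", "천안시", "천안시", "천안시"],
        "보유시설군": ["2", "1", "1", "1"],
        "시설명": ["천안역지하상가", "아우내도서관", "신방도서관", "신방도서관"],
        "주소": [
            "충청남도 천안시 동남구 버들로 2 지하상가 10호 관리사무실",
            "충청남도 천안시 동남구 병천2로 57",
            "충청남도 천안시 동남구 통정4로 7",
            "충청남도 천안시 동남구 통정4로 7",
        ],
    }
)

stats = overview(sample)
print(stats)
```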

Variable types

Categorical: 2
Text: 2

Dataset

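Description: Data on the status of multi-use facilities among the entities subject to Article 3 of the Indoor Air Quality Control Act (실내공기질 관리법), covering facility name, facility group, and facility address.
URL: https://www.data.go.kr/data/15072710/fileData.do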

Alerts

Dataset has 2 (0.2%) duplicate rows [Duplicates]
보유시설군 is highly imbalanced (88.4%) [Imbalance]
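
The 88.4% imbalance figure is consistent with a normalized-entropy score, that is, one minus the Shannon entropy of the class frequencies divided by the maximum entropy for that number of classes. The sketch below assumes this definition (the `imbalance_score` helper is a hypothetical name, not taken from the report) and uses the 보유시설군 counts listed under Common Values.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k): 0 for perfectly balanced classes, near 1 for one dominant class."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2 or total == 0:
        return 0.0
    # Shannon entropy in nats; empty classes contribute nothing.
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(n_classes)


# Counts for the values 1, 2, 3 and 0 of 보유시설군, as reported in this document.
score = imbalance_score([802, 20, 4, 2])
print(f"{score:.1%}")
```

Under this assumption the score rounds to the value shown in the alert; a perfectly balanced column would score 0.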

Reproduction

Analysis started: 2023-12-12 12:08:15.117928
Analysis finished: 2023-12-12 12:08:15.741651
Duration: 0.62 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시군명
Categorical

Distinct: 15
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 6.6 KiB

Value               Count
천안시                 259
아산시                 112
당진시                 68
서산시                 67
논산시                 59
Other values (10)   263

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 천안시
2nd row: 천안시
3rd row: 천안시
4th row: 천안시
5th row: 천안시

Common Values

Value               Count  Frequency (%)
천안시                 259    31.3%
아산시                 112    13.5%
당진시                 68     8.2%
서산시                 67     8.1%
논산시                 59     7.1%
공주시                 51     6.2%
홍성군                 47     5.7%
부여군                 34     4.1%
보령시                 25     3.0%
태안군                 25     3.0%
Other values (5)    81     9.8%
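
The Frequency (%) column is each count divided by the 828 observations. As a rough check, the sketch below rebuilds the percentages from the counts listed above; on a raw column, pandas' `Series.value_counts(normalize=True)` produces the same shares directly. The variable names are illustrative.

```python
import pandas as pd

TOTAL_ROWS = 828  # number of observations reported in the overview

# Counts for the ten most common 시군명 values, copied from the table above.
counts = pd.Series(
    {
        "천안시": 259,
        "아산시": 112,
        "당진시": 68,
        "서산시": 67,
        "논산시": 59,
        "공주시": 51,
        "홍성군": 47,
        "부여군": 34,
        "보령시": 25,
        "태안군": 25,
    }
)

# Share of each value, in percent, rounded to one decimal as in the report.
frequency_pct = (counts / TOTAL_ROWS * 100).round(1)

# Rows not covered by the ten listed values fall into the "Other values" bucket.
other_count = TOTAL_ROWS - int(counts.sum())
other_pct = round(other_count / TOTAL_ROWS * 100, 1)

print(frequency_pct)
print(other_count, other_pct)
```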

Length

Histogram of lengths of the category

보유시설군
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 6.6 KiB

Value  Count
1      802
2      20
3      4
0      2

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      802    96.9%
2      20     2.4%
3      4      0.5%
0      2      0.2%

Length

Histogram of lengths of the category

Common Values (Plot)


시설명
Text

Distinct: 805
Distinct (%): 97.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.6 KiB

Length

Max length: 25
Median length: 23
Mean length: 7.8985507
Min length: 2

Characters and Unicode

Total characters: 6540
Distinct characters: 489
Distinct categories: 12
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
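
As a minimal sketch of how such properties can be read, Python's standard `unicodedata` module exposes the general category of each code point (for example `Lo` for Other Letter, `Lu` for Uppercase Letter, `Zs` for Space Separator). The `category_counts` helper is a hypothetical name; the first two strings come from this report's sample rows and the third is an invented example to show non-Hangul categories. The standard library does not expose script or block names, so only categories are counted here.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character of every string."""
    counter = Counter()
    for text in values:
        for ch in text:
            counter[unicodedata.category(ch)] += 1
    return counter


names = [
    "천안역지하상가",  # from the report's sample rows
    "아우내도서관",    # from the report's sample rows
    "PC방 (1)",       # hypothetical value, added only to show other categories
]

counts = category_counts(names)
for code, n in counts.most_common():
    print(code, n)
```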

Unique

Unique: 783
Unique (%): 94.6%

Sample

1st row: 천안역지하상가
2nd row: 아우내도서관
3rd row: 신방도서관
4th row: 두정도서관
5th row: 쌍용도서관

Value                Count  Frequency (%)
어린이집                 14     1.4%
pc방                  6      0.6%
pc                   5      0.5%
천안점                  4      0.4%
예일어린이집               3      0.3%
롯데시네마                3      0.3%
사회복지법인               3      0.3%
도서관                  3      0.3%
롯데마트                 3      0.3%
메가박스                 3      0.3%
Other values (867)   922    95.1%

Most occurring characters

Value               Count  Frequency (%)
                    350    5.4%
                    299    4.6%
                    297    4.5%
                    295    4.5%
                    281    4.3%
                    161    2.5%
                    145    2.2%
(space)             143    2.2%
                    138    2.1%
                    99     1.5%
Other values (479)  4332   66.2%

Most occurring categories

Value               Count  Frequency (%)
Other Letter        6070   92.8%
Uppercase Letter    178    2.7%
Space Separator     143    2.2%
Close Punctuation   42     0.6%
Open Punctuation    41     0.6%
Decimal Number      27     0.4%
Lowercase Letter    20     0.3%
Other Symbol        11     0.2%
Other Punctuation   3      < 0.1%
Math Symbol         2      < 0.1%
Other values (2)    3      < 0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
                    350    5.8%
                    299    4.9%
                    297    4.9%
                    295    4.9%
                    281    4.6%
                    161    2.7%
                    145    2.4%
                    138    2.3%
                    99     1.6%
                    93     1.5%
Other values (433)  3912   64.4%

Uppercase Letter
Value               Count  Frequency (%)
C                   63     35.4%
P                   52     29.2%
G                   12     6.7%
V                   11     6.2%
A                   5      2.8%
R                   5      2.8%
O                   5      2.8%
T                   4      2.2%
E                   4      2.2%
N                   2      1.1%
Other values (10)   15     8.4%

Decimal Number
Value               Count  Frequency (%)
2                   8      29.6%
1                   4      14.8%
3                   4      14.8%
4                   3      11.1%
8                   2      7.4%
9                   2      7.4%
6                   2      7.4%
5                   2      7.4%

Lowercase Letter
Value               Count  Frequency (%)
c                   5      25.0%
e                   5      25.0%
p                   4      20.0%
h                   2      10.0%
a                   2      10.0%
f                   2      10.0%

Other Punctuation
Value               Count  Frequency (%)
,                   1      33.3%
.                   1      33.3%
&                   1      33.3%

Math Symbol
Value               Count  Frequency (%)
~                   1      50.0%
+                   1      50.0%

Letter Number
Value               Count  Frequency (%)
                    1      50.0%
                    1      50.0%

Space Separator
Value               Count  Frequency (%)
(space)             143    100.0%

Close Punctuation
Value               Count  Frequency (%)
)                   42     100.0%

Open Punctuation
Value               Count  Frequency (%)
(                   41     100.0%

Other Symbol
Value               Count  Frequency (%)
                    11     100.0%

Dash Punctuation
Value               Count  Frequency (%)
-                   1      100.0%

Most occurring scripts

Value               Count  Frequency (%)
Hangul              6081   93.0%
Common              259    4.0%
Latin               200    3.1%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
                    350    5.8%
                    299    4.9%
                    297    4.9%
                    295    4.9%
                    281    4.6%
                    161    2.6%
                    145    2.4%
                    138    2.3%
                    99     1.6%
                    93     1.5%
Other values (434)  3923   64.5%

Latin
Value               Count  Frequency (%)
C                   63     31.5%
P                   52     26.0%
G                   12     6.0%
V                   11     5.5%
c                   5      2.5%
A                   5      2.5%
R                   5      2.5%
e                   5      2.5%
O                   5      2.5%
T                   4      2.0%
Other values (18)   33     16.5%

Common
Value               Count  Frequency (%)
(space)             143    55.2%
)                   42     16.2%
(                   41     15.8%
2                   8      3.1%
1                   4      1.5%
3                   4      1.5%
4                   3      1.2%
8                   2      0.8%
9                   2      0.8%
6                   2      0.8%
Other values (7)    8      3.1%

Most occurring blocks

Value               Count  Frequency (%)
Hangul              6070   92.8%
ASCII               457    7.0%
None                11     0.2%
Number Forms        2      < 0.1%

Most frequent character per block

Hangul
Value               Count  Frequency (%)
                    350    5.8%
                    299    4.9%
                    297    4.9%
                    295    4.9%
                    281    4.6%
                    161    2.7%
                    145    2.4%
                    138    2.3%
                    99     1.6%
                    93     1.5%
Other values (433)  3912   64.4%

ASCII
Value               Count  Frequency (%)
(space)             143    31.3%
C                   63     13.8%
P                   52     11.4%
)                   42     9.2%
(                   41     9.0%
G                   12     2.6%
V                   11     2.4%
2                   8      1.8%
c                   5      1.1%
A                   5      1.1%
Other values (33)   75     16.4%

None
Value               Count  Frequency (%)
                    11     100.0%

Number Forms
Value               Count  Frequency (%)
                    1      50.0%
                    1      50.0%

주소
Text

Distinct: 805
Distinct (%): 97.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.6 KiB

Length

Max length: 59
Median length: 41
Mean length: 20.549517
Min length: 5

Characters and Unicode

Total characters: 17015
Distinct characters: 344
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 784
Unique (%): 94.7%

Sample

1st row: 충청남도 천안시 동남구 버들로 2 지하상가 10호 관리사무실
2nd row: 충청남도 천안시 동남구 병천2로 57
3rd row: 충청남도 천안시 동남구 통정4로 7
4th row: 충청남도 천안시 서북구 부성3길 9 (두정동)
5th row: 충청남도 천안시 서북구 월봉4로 153 (쌍용동)

Value                 Count  Frequency (%)
충청남도                  473    12.9%
천안시                   259    7.1%
서북구                   148    4.1%
동남구                   111    3.0%
충남                    70     1.9%
당진시                   68     1.9%
논산시                   59     1.6%
공주시                   49     1.3%
부여군                   34     0.9%
보령시                   25     0.7%
Other values (1404)   2358   64.5%

Most occurring characters

Value               Count  Frequency (%)
(space)             2844   16.7%
                    715    4.2%
1                   680    4.0%
                    585    3.4%
                    562    3.3%
                    533    3.1%
                    509    3.0%
                    502    3.0%
2                   497    2.9%
                    415    2.4%
Other values (334)  9173   53.9%

Most occurring categories

Value               Count  Frequency (%)
Other Letter        10135  59.6%
Decimal Number      3115   18.3%
Space Separator     2844   16.7%
Close Punctuation   253    1.5%
Open Punctuation    253    1.5%
Dash Punctuation    246    1.4%
Other Punctuation   143    0.8%
Math Symbol         16     0.1%
Uppercase Letter    7      < 0.1%
Lowercase Letter    2      < 0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
                    715    7.1%
                    585    5.8%
                    562    5.5%
                    533    5.3%
                    509    5.0%
                    502    5.0%
                    415    4.1%
                    390    3.8%
                    340    3.4%
                    323    3.2%
Other values (308)  5261   51.9%

Decimal Number
Value               Count  Frequency (%)
1                   680    21.8%
2                   497    16.0%
3                   359    11.5%
4                   275    8.8%
5                   272    8.7%
7                   236    7.6%
0                   224    7.2%
8                   204    6.5%
6                   197    6.3%
9                   171    5.5%

Uppercase Letter
Value               Count  Frequency (%)
C                   2      28.6%
G                   1      14.3%
V                   1      14.3%
T                   1      14.3%
Y                   1      14.3%
I                   1      14.3%

Other Punctuation
Value               Count  Frequency (%)
,                   142    99.3%
.                   1      0.7%

Lowercase Letter
Value               Count  Frequency (%)
l                   1      50.0%
h                   1      50.0%

Space Separator
Value               Count  Frequency (%)
(space)             2844   100.0%

Close Punctuation
Value               Count  Frequency (%)
)                   253    100.0%

Open Punctuation
Value               Count  Frequency (%)
(                   253    100.0%

Dash Punctuation
Value               Count  Frequency (%)
-                   246    100.0%

Math Symbol
Value               Count  Frequency (%)
~                   16     100.0%

Letter Number
Value               Count  Frequency (%)
                    1      100.0%

Most occurring scripts

Value               Count  Frequency (%)
Hangul              10135  59.6%
Common              6870   40.4%
Latin               10     0.1%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
                    715    7.1%
                    585    5.8%
                    562    5.5%
                    533    5.3%
                    509    5.0%
                    502    5.0%
                    415    4.1%
                    390    3.8%
                    340    3.4%
                    323    3.2%
Other values (308)  5261   51.9%

Common
Value               Count  Frequency (%)
(space)             2844   41.4%
1                   680    9.9%
2                   497    7.2%
3                   359    5.2%
4                   275    4.0%
5                   272    4.0%
)                   253    3.7%
(                   253    3.7%
-                   246    3.6%
7                   236    3.4%
Other values (7)    955    13.9%

Latin
Value               Count  Frequency (%)
C                   2      20.0%
G                   1      10.0%
V                   1      10.0%
l                   1      10.0%
h                   1      10.0%
T                   1      10.0%
                    1      10.0%
Y                   1      10.0%
I                   1      10.0%

Most occurring blocks

Value               Count  Frequency (%)
Hangul              10135  59.6%
ASCII               6879   40.4%
Number Forms        1      < 0.1%

Most frequent character per block

ASCII
Value               Count  Frequency (%)
(space)             2844   41.3%
1                   680    9.9%
2                   497    7.2%
3                   359    5.2%
4                   275    4.0%
5                   272    4.0%
)                   253    3.7%
(                   253    3.7%
-                   246    3.6%
7                   236    3.4%
Other values (15)   964    14.0%

Hangul
Value               Count  Frequency (%)
                    715    7.1%
                    585    5.8%
                    562    5.5%
                    533    5.3%
                    509    5.0%
                    502    5.0%
                    415    4.1%
                    390    3.8%
                    340    3.4%
                    323    3.2%
Other values (308)  5261   51.9%

Number Forms
Value               Count  Frequency (%)
                    1      100.0%

Correlations

            시군명     보유시설군
시군명         1.000   0.175
보유시설군       0.175   1.000

            시군명     보유시설군
시군명         1.000   0.099
보유시설군       0.099   1.000

            시군명     보유시설군
시군명         1.000   0.099
보유시설군       0.099   1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

     시군명  보유시설군  시설명  주소
0    천안시  2  천안역지하상가  충청남도 천안시 동남구 버들로 2 지하상가 10호 관리사무실
1    천안시  1  아우내도서관  충청남도 천안시 동남구 병천2로 57
2    천안시  1  신방도서관  충청남도 천안시 동남구 통정4로 7
3    천안시  1  두정도서관  충청남도 천안시 서북구 부성3길 9 (두정동)
4    천안시  1  쌍용도서관  충청남도 천안시 서북구 월봉4로 153 (쌍용동)
5    천안시  1  청수도서관  충청남도 천안시 동남구 청수16로 5-10
6    천안시  1  충청남도학생교육문화원  충청남도 천안시 동남구 옛농고1길 41(원성동,충청남도학생교육문화원)
7    천안시  1  독립기념관  충청남도 천안시 동남구 목천읍 남화리 230
8    천안시  1  천안박물관  충청남도 천안시 동남구 천안대로 429-13 (삼룡동)
9    천안시  1  천안요양병원  충청남도 천안시 동남구 삼룡천3길 10 (구성동, 이라의료재단 천안병원)

     시군명  보유시설군  시설명  주소
818  태안군  1  성은실버요양원  태안읍 안면대로 88
819  태안군  1  서혜원  이원면 발전로 939
820  태안군  1  효림요양원  태안읍 군청8길 42-7
821  태안군  1  어쎈독학기숙학원  안면읍 중신로 422
822  태안군  1  새빛어린이집  태안읍 동평로 16
823  태안군  1  태안문화원 영화관  태안읍 백화로 192
824  태안군  1  비상에듀기숙학원  안면읍 백사장2길 25-60
825  태안군  1  태안사랑어린이집  태안읍 소란길 29-11
826  태안군  1  국립해양문화재연구소(국립태안해양유물전시관)  근흥면 신진대교길 94-33
827  태안군  1  유류피해극복기념관  소원면 천리포1길 120

Duplicate rows

Most frequently occurring

   시군명  보유시설군  시설명  주소  # duplicates
0  공주시  3  지방공사 충남공주의료원  충청남도 공주시 무령로 77  2
1  당진시  1  당진종합병원  충남 당진시 반촌로 5-15  2
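
A table like the one above can be approximated in pandas by keeping every row that occurs more than once and counting each distinct combination. This is only a sketch under assumptions: `most_frequent_duplicates` is a hypothetical helper, and the frame below repeats the two duplicated rows from this report next to one unique sample row rather than loading the real file.

```python
import pandas as pd


def most_frequent_duplicates(frame: pd.DataFrame) -> pd.DataFrame:
    """List each fully duplicated row once, with the number of times it occurs."""
    # keep=False marks every member of a duplicate group, including the first.
    repeated = frame[frame.duplicated(keep=False)]
    return (
        repeated.groupby(list(frame.columns), sort=False)
        .size()
        .reset_index(name="# duplicates")
    )


columns = ["시군명", "보유시설군", "시설명", "주소"]
rows = [
    ("공주시", "3", "지방공사 충남공주의료원", "충청남도 공주시 무령로 77"),
    ("당진시", "1", "당진종합병원", "충남 당진시 반촌로 5-15"),
    ("천안시", "1", "아우내도서관", "충청남도 천안시 동남구 병천2로 57"),
]
# The first two rows appear twice; the third appears once and should be dropped.
frame = pd.DataFrame(rows[:2] * 2 + rows[2:], columns=columns)

result = most_frequent_duplicates(frame)
print(result)
```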