Overview

Dataset statistics

Number of variables: 6
Number of observations: 346
Missing cells: 1
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 16.7 KiB
Average record size in memory: 49.4 B
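
The percentage and per-record figures follow from the raw counts. A minimal sketch of that arithmetic, assuming the missing-cell share is taken over all cells (variables × observations) and that 1 KiB = 1024 B; the variable names are illustrative only:

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 6
n_observations = 346
missing_cells = 1
total_size_kib = 16.7

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.3f}%")                # below the 0.1% shown in the report
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the result agrees with the reported "< 0.1%" and "49.4 B".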

Variable types

Categorical: 3
Text: 2
Numeric: 1

Dataset

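Description: Current status of private sports facilities in Daejeon Metropolitan City (golf courses, riding grounds, comprehensive sports facilities, gymnasiums, swimming pools, fitness centers, screen golf practice ranges, etc.).
URL: https://www.data.go.kr/data/15061935/fileData.do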

Alerts

주관부서 is highly overall correlated with 면적 and 1 other field (High correlation)
등록_신고 is highly overall correlated with 면적 and 2 other fields (High correlation)
업종 is highly overall correlated with 면적 and 1 other field (High correlation)
면적 is highly overall correlated with 등록_신고 and 2 other fields (High correlation)
등록_신고 is highly imbalanced (92.8%) (Imbalance)
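
The 92.8% imbalance figure can be reproduced from the class counts of 등록_신고 (343 and 3). The sketch below assumes the score is one minus the Shannon entropy of the class shares, normalized by the maximum entropy log2(k); that definition is an assumption here, and the function name is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (bits) of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts of 등록_신고 as listed in this report: 신고 = 343, 등록 = 3.
score = imbalance_score([343, 3])
print(f"{score:.1%}")  # prints 92.8%
```

A perfectly balanced column scores 0 under this definition, and a column dominated by one class approaches 1.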

Reproduction

Analysis started: 2023-12-12 12:07:50.729876
Analysis finished: 2023-12-12 12:07:51.503600
Duration: 0.77 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
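
A report of this kind can be regenerated along the following lines. This is a sketch only: the file path, the output name, the report title and the cp949 encoding are assumptions, since the report does not state how the CSV was loaded.

```python
def build_report(csv_path, output_path="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report.

    csv_path, output_path and the cp949 default are illustrative assumptions;
    the report itself does not record the file name or encoding.
    """
    # Imported lazily so the function can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    profile = ProfileReport(df, title="Daejeon private sports facilities")
    profile.to_file(output_path)
    return profile
```

Calling `build_report("facilities.csv")` with a hypothetical local copy of the dataset would write `report.html` next to the script.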

Variables

등록_신고
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 KiB

Value | Count
신고 | 343
등록 | 3

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 등록
2nd row: 등록
3rd row: 등록
4th row: 신고
5th row: 신고

Common Values

Value | Count | Frequency (%)
신고 | 343 | 99.1%
등록 | 3 | 0.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
신고 | 343 | 99.1%
등록 | 3 | 0.9%

업종
Categorical

HIGH CORRELATION 

Distinct: 10
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 KiB

Value | Count
골프연습장업(스크린) | 162
골프연습장업(실내) | 99
수영장업(실내) | 32
골프연습장업(실외) | 32
종합체육시설업 | 9
Other values (5) | 12

Length

Max length: 11
Median length: 10
Mean length: 10.066474
Min length: 4

Unique

Unique: 2
Unique (%): 0.6%

Sample

1st row: 골프장업
2nd row: 골프장업
3rd row: 골프장업
4th row: 승마장업
5th row: 승마장업

Common Values

Value | Count | Frequency (%)
골프연습장업(스크린) | 162 | 46.8%
골프연습장업(실내) | 99 | 28.6%
수영장업(실내) | 32 | 9.2%
골프연습장업(실외) | 32 | 9.2%
종합체육시설업 | 9 | 2.6%
가상체험체육시설업 | 4 | 1.2%
골프장업 | 3 | 0.9%
승마장업 | 3 | 0.9%
수영장업(실외) | 1 | 0.3%
썰매장업 | 1 | 0.3%
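
The Distinct, Unique, frequency and length figures for 업종 can all be derived from the ten category counts. A minimal sketch, assuming "Unique" counts the values that occur exactly once and that length means the number of characters in each value:

```python
from collections import Counter
from statistics import mean, median

# Category counts copied from the "Common Values" table for 업종.
counts = Counter({
    "골프연습장업(스크린)": 162,
    "골프연습장업(실내)": 99,
    "수영장업(실내)": 32,
    "골프연습장업(실외)": 32,
    "종합체육시설업": 9,
    "가상체험체육시설업": 4,
    "골프장업": 3,
    "승마장업": 3,
    "수영장업(실외)": 1,
    "썰매장업": 1,
})

n = sum(counts.values())                               # observations
distinct = len(counts)                                 # different values
unique = sum(1 for c in counts.values() if c == 1)     # values seen exactly once
frequency = {value: 100 * c / n for value, c in counts.items()}

# Expand to one string length per observation for the length statistics.
lengths = [len(value) for value, c in counts.items() for _ in range(c)]

print(n, distinct, unique)
print(f"{frequency['골프연습장업(스크린)']:.1f}%")
print(max(lengths), median(lengths), round(mean(lengths), 6), min(lengths))
```

Under these assumptions the computed values agree with the report: 346 observations, 10 distinct values, 2 unique values, 46.8% for the largest category, and a mean length of 10.066474.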

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
골프연습장업(스크린 | 162 | 46.8%
골프연습장업(실내 | 99 | 28.6%
수영장업(실내 | 32 | 9.2%
골프연습장업(실외 | 32 | 9.2%
종합체육시설업 | 9 | 2.6%
가상체험체육시설업 | 4 | 1.2%
골프장업 | 3 | 0.9%
승마장업 | 3 | 0.9%
수영장업(실외 | 1 | 0.3%
썰매장업 | 1 | 0.3%

업소명
Text

Distinct: 339
Distinct (%): 98.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 KiB

Length

Max length: 26
Median length: 23
Mean length: 8.5693642
Min length: 3

Characters and Unicode

Total characters: 2965
Distinct characters: 337
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
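
As a small illustration of such character properties, the sketch below counts Unicode general categories for the five sample values of this variable using Python's standard `unicodedata` module. The standard library exposes general categories only, not scripts or blocks, so this covers just the "categories" part of the analysis and is not necessarily how the report computes it.

```python
import unicodedata
from collections import Counter

# The five sample values shown for this variable in the report.
names = ["유성 컨트리클럽", "대덕연구개발특구복지센터", "금실 컨트리클럽", "퀸즈승마장", "대전승마장"]

# General category of every character, e.g. "Lo" (Other Letter) or "Zs" (Space Separator).
categories = Counter(unicodedata.category(ch) for name in names for ch in name)

for code, count in categories.most_common():
    print(code, count)
```

For these five names every character is either a Hangul syllable (category Lo) or a space (category Zs), which mirrors the dominance of "Other Letter" in the full column.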

Unique

Unique: 332
Unique (%): 96.0%

Sample

1st row: 유성 컨트리클럽
2nd row: 대덕연구개발특구복지센터
3rd row: 금실 컨트리클럽
4th row: 퀸즈승마장
5th row: 대전승마장

Value | Count | Frequency (%)
스크린골프 | 23 | 4.3%
골프 | 21 | 4.0%
스크린 | 17 | 3.2%
골프연습장 | 12 | 2.3%
수영장 | 9 | 1.7%
대전 | 6 | 1.1%
아카데미 | 6 | 1.1%
골프존 | 5 | 0.9%
sg | 5 | 0.9%
스위밍키즈 | 4 | 0.8%
Other values (389) | 422 | 79.6%

Most occurring characters

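Value | Count | Frequency (%)
(lost) | 264 | 8.9%
(lost) | 252 | 8.5%
(lost) | 222 | 7.5%
(space) | 184 | 6.2%
(lost) | 151 | 5.1%
(lost) | 149 | 5.0%
(lost) | 69 | 2.3%
(lost) | 47 | 1.6%
(lost) | 44 | 1.5%
(lost) | 36 | 1.2%
Other values (327) | 1547 | 52.2%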

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2567 | 86.6%
Space Separator | 184 | 6.2%
Uppercase Letter | 109 | 3.7%
Lowercase Letter | 37 | 1.2%
Close Punctuation | 26 | 0.9%
Open Punctuation | 26 | 0.9%
Decimal Number | 8 | 0.3%
Dash Punctuation | 3 | 0.1%
Other Punctuation | 3 | 0.1%
Other Symbol | 2 | 0.1%

Most frequent character per category

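Other Letter

Value | Count | Frequency (%)
(lost) | 264 | 10.3%
(lost) | 252 | 9.8%
(lost) | 222 | 8.6%
(lost) | 151 | 5.9%
(lost) | 149 | 5.8%
(lost) | 69 | 2.7%
(lost) | 47 | 1.8%
(lost) | 44 | 1.7%
(lost) | 36 | 1.4%
(lost) | 32 | 1.2%
Other values (279) | 1301 | 50.7%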

Uppercase Letter

Value | Count | Frequency (%)
S | 27 | 24.8%
G | 26 | 23.9%
K | 9 | 8.3%
P | 6 | 5.5%
O | 6 | 5.5%
D | 5 | 4.6%
J | 4 | 3.7%
R | 4 | 3.7%
T | 3 | 2.8%
I | 3 | 2.8%
Other values (11) | 16 | 14.7%

Lowercase Letter

Value | Count | Frequency (%)
e | 5 | 13.5%
o | 5 | 13.5%
g | 4 | 10.8%
n | 4 | 10.8%
f | 3 | 8.1%
l | 3 | 8.1%
s | 2 | 5.4%
i | 2 | 5.4%
r | 2 | 5.4%
a | 2 | 5.4%
Other values (5) | 5 | 13.5%

Decimal Number

Value | Count | Frequency (%)
2 | 3 | 37.5%
1 | 2 | 25.0%
7 | 1 | 12.5%
8 | 1 | 12.5%
3 | 1 | 12.5%

Other Punctuation

Value | Count | Frequency (%)
& | 2 | 66.7%
' | 1 | 33.3%

Space Separator

Value | Count | Frequency (%)
(space) | 184 | 100.0%

Close Punctuation

Value | Count | Frequency (%)
) | 26 | 100.0%

Open Punctuation

Value | Count | Frequency (%)
( | 26 | 100.0%

Dash Punctuation

Value | Count | Frequency (%)
- | 3 | 100.0%

Other Symbol

Value | Count | Frequency (%)
(lost) | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2569 | 86.6%
Common | 250 | 8.4%
Latin | 146 | 4.9%

Most frequent character per script

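Hangul

Value | Count | Frequency (%)
(lost) | 264 | 10.3%
(lost) | 252 | 9.8%
(lost) | 222 | 8.6%
(lost) | 151 | 5.9%
(lost) | 149 | 5.8%
(lost) | 69 | 2.7%
(lost) | 47 | 1.8%
(lost) | 44 | 1.7%
(lost) | 36 | 1.4%
(lost) | 32 | 1.2%
Other values (280) | 1303 | 50.7%

Latin

Value | Count | Frequency (%)
S | 27 | 18.5%
G | 26 | 17.8%
K | 9 | 6.2%
P | 6 | 4.1%
O | 6 | 4.1%
D | 5 | 3.4%
e | 5 | 3.4%
o | 5 | 3.4%
J | 4 | 2.7%
g | 4 | 2.7%
Other values (26) | 49 | 33.6%

Common

Value | Count | Frequency (%)
(space) | 184 | 73.6%
) | 26 | 10.4%
( | 26 | 10.4%
- | 3 | 1.2%
2 | 3 | 1.2%
& | 2 | 0.8%
1 | 2 | 0.8%
7 | 1 | 0.4%
8 | 1 | 0.4%
3 | 1 | 0.4%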

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2567 | 86.6%
ASCII | 396 | 13.4%
None | 2 | 0.1%

Most frequent character per block

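Hangul

Value | Count | Frequency (%)
(lost) | 264 | 10.3%
(lost) | 252 | 9.8%
(lost) | 222 | 8.6%
(lost) | 151 | 5.9%
(lost) | 149 | 5.8%
(lost) | 69 | 2.7%
(lost) | 47 | 1.8%
(lost) | 44 | 1.7%
(lost) | 36 | 1.4%
(lost) | 32 | 1.2%
Other values (279) | 1301 | 50.7%

ASCII

Value | Count | Frequency (%)
(space) | 184 | 46.5%
S | 27 | 6.8%
) | 26 | 6.6%
( | 26 | 6.6%
G | 26 | 6.6%
K | 9 | 2.3%
P | 6 | 1.5%
O | 6 | 1.5%
D | 5 | 1.3%
e | 5 | 1.3%
Other values (37) | 76 | 19.2%

None

Value | Count | Frequency (%)
(lost) | 2 | 100.0%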

소재지
Text

Distinct: 342
Distinct (%): 98.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 KiB

Length

Max length: 49
Median length: 40
Mean length: 28.881503
Min length: 12

Characters and Unicode

Total characters: 9993
Distinct characters: 243
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 338
Unique (%): 97.7%

Sample

1st row: 대전 유성구 현충원로 200(덕명동 215-7번지)
2nd row: 대전 유성구 전민동 산11-2번지
3rd row: 대전 유성구 테크노중앙로 210(용산동 676번지)
4th row: 대전시 서구 언목재길 253 (흑석동)
5th row: 대전시 서구 조련길 50 (흑석동)

Value | Count | Frequency (%)
대전광역시 | 308 | 15.4%
서구 | 136 | 6.8%
유성구 | 117 | 5.9%
대덕구 | 35 | 1.8%
중구 | 34 | 1.7%
대전 | 32 | 1.6%
동구 | 24 | 1.2%
3층 | 23 | 1.1%
둔산동 | 21 | 1.1%
2층 | 21 | 1.1%
Other values (656) | 1249 | 62.5%

Most occurring characters

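Value | Count | Frequency (%)
(space) | 1672 | 16.7%
(lost) | 470 | 4.7%
(lost) | 418 | 4.2%
(lost) | 365 | 3.7%
(lost) | 355 | 3.6%
(lost) | 343 | 3.4%
) | 342 | 3.4%
( | 342 | 3.4%
1 | 340 | 3.4%
(lost) | 318 | 3.2%
Other values (233) | 5028 | 50.3%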

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 5699 | 57.0%
Space Separator | 1672 | 16.7%
Decimal Number | 1580 | 15.8%
Close Punctuation | 342 | 3.4%
Open Punctuation | 342 | 3.4%
Other Punctuation | 290 | 2.9%
Dash Punctuation | 41 | 0.4%
Uppercase Letter | 18 | 0.2%
Math Symbol | 9 | 0.1%

Most frequent character per category

Other Letter

Value | Count | Frequency (%)
(lost) | 470 | 8.2%
(lost) | 418 | 7.3%
(lost) | 365 | 6.4%
(lost) | 355 | 6.2%
(lost) | 343 | 6.0%
(lost) | 318 | 5.6%
(lost) | 309 | 5.4%
(lost) | 308 | 5.4%
(lost) | 183 | 3.2%
(lost) | 154 | 2.7%
Other values (207) | 2476 | 43.4%

Decimal Number

Value | Count | Frequency (%)
1 | 340 | 21.5%
2 | 206 | 13.0%
3 | 183 | 11.6%
0 | 149 | 9.4%
5 | 148 | 9.4%
4 | 123 | 7.8%
6 | 122 | 7.7%
7 | 111 | 7.0%
8 | 110 | 7.0%
9 | 88 | 5.6%

Uppercase Letter

Value | Count | Frequency (%)
B | 6 | 33.3%
A | 4 | 22.2%
K | 3 | 16.7%
P | 2 | 11.1%
Q | 1 | 5.6%
D | 1 | 5.6%
J | 1 | 5.6%

Other Punctuation

Value | Count | Frequency (%)
, | 286 | 98.6%
. | 2 | 0.7%
/ | 1 | 0.3%
@ | 1 | 0.3%

Space Separator

Value | Count | Frequency (%)
(space) | 1672 | 100.0%

Close Punctuation

Value | Count | Frequency (%)
) | 342 | 100.0%

Open Punctuation

Value | Count | Frequency (%)
( | 342 | 100.0%

Dash Punctuation

Value | Count | Frequency (%)
- | 41 | 100.0%

Math Symbol

Value | Count | Frequency (%)
~ | 9 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 5699 | 57.0%
Common | 4276 | 42.8%
Latin | 18 | 0.2%

Most frequent character per script

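Hangul

Value | Count | Frequency (%)
(lost) | 470 | 8.2%
(lost) | 418 | 7.3%
(lost) | 365 | 6.4%
(lost) | 355 | 6.2%
(lost) | 343 | 6.0%
(lost) | 318 | 5.6%
(lost) | 309 | 5.4%
(lost) | 308 | 5.4%
(lost) | 183 | 3.2%
(lost) | 154 | 2.7%
Other values (207) | 2476 | 43.4%

Common

Value | Count | Frequency (%)
(space) | 1672 | 39.1%
) | 342 | 8.0%
( | 342 | 8.0%
1 | 340 | 8.0%
, | 286 | 6.7%
2 | 206 | 4.8%
3 | 183 | 4.3%
0 | 149 | 3.5%
5 | 148 | 3.5%
4 | 123 | 2.9%
Other values (9) | 485 | 11.3%

Latin

Value | Count | Frequency (%)
B | 6 | 33.3%
A | 4 | 22.2%
K | 3 | 16.7%
P | 2 | 11.1%
Q | 1 | 5.6%
D | 1 | 5.6%
J | 1 | 5.6%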

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 5699 | 57.0%
ASCII | 4294 | 43.0%

Most frequent character per block

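ASCII

Value | Count | Frequency (%)
(space) | 1672 | 38.9%
) | 342 | 8.0%
( | 342 | 8.0%
1 | 340 | 7.9%
, | 286 | 6.7%
2 | 206 | 4.8%
3 | 183 | 4.3%
0 | 149 | 3.5%
5 | 148 | 3.4%
4 | 123 | 2.9%
Other values (16) | 503 | 11.7%

Hangul

Value | Count | Frequency (%)
(lost) | 470 | 8.2%
(lost) | 418 | 7.3%
(lost) | 365 | 6.4%
(lost) | 355 | 6.2%
(lost) | 343 | 6.0%
(lost) | 318 | 5.6%
(lost) | 309 | 5.4%
(lost) | 308 | 5.4%
(lost) | 183 | 3.2%
(lost) | 154 | 2.7%
Other values (207) | 2476 | 43.4%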

면적
Real number (ℝ)

HIGH CORRELATION 

Distinct: 268
Distinct (%): 77.7%
Missing: 1
Missing (%): 0.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 5966.7072
Minimum: 6
Maximum: 1156423
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.2 KiB

Quantile statistics

Minimum: 6
5-th percentile: 150
Q1: 284
Median: 384
Q3: 531
95-th percentile: 3368
Maximum: 1156423
Range: 1156417
Interquartile range (IQR): 247

Descriptive statistics

Standard deviation: 65950.016
Coefficient of variation (CV): 11.053
Kurtosis: 273.03711
Mean: 5966.7072
Median Absolute Deviation (MAD): 113
Skewness: 15.997831
Sum: 2058514
Variance: 4.3494046 × 10^9
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
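
Several of the 면적 statistics are simple functions of the others. The sketch below checks those relationships with the reported figures, assuming the mean is the sum divided by the number of non-missing observations (346 rows, 1 missing); the variable names are illustrative.

```python
# Figures copied from the statistics reported for 면적.
minimum, maximum = 6, 1156423
q1, q3 = 284, 531
mean, std = 5966.7072, 65950.016
total, n_observations, n_missing = 2058514, 346, 1

value_range = maximum - minimum                      # Range
iqr = q3 - q1                                        # Interquartile range
cv = std / mean                                      # Coefficient of variation
variance = std ** 2                                  # Variance
mean_check = total / (n_observations - n_missing)    # Sum over non-missing rows

print(value_range, iqr, round(cv, 3), f"{variance:.4e}", round(mean_check, 4))
```

The very large CV and the gap between the median (384) and the mean (about 5967) both reflect the three golf-course entries with areas above 250,000.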

Value | Count | Frequency (%)
219 | 4 | 1.2%
492 | 4 | 1.2%
335 | 4 | 1.2%
314 | 4 | 1.2%
320 | 4 | 1.2%
497 | 4 | 1.2%
205 | 4 | 1.2%
330 | 4 | 1.2%
441 | 3 | 0.9%
348 | 3 | 0.9%
Other values (258) | 307 | 88.7%

Value | Count | Frequency (%)
6 | 1 | 0.3%
55 | 1 | 0.3%
69 | 1 | 0.3%
79 | 1 | 0.3%
95 | 1 | 0.3%
100 | 1 | 0.3%
104 | 1 | 0.3%
115 | 1 | 0.3%
119 | 1 | 0.3%
121 | 1 | 0.3%

Value | Count | Frequency (%)
1156423 | 1 | 0.3%
318382 | 1 | 0.3%
259401 | 1 | 0.3%
28991 | 1 | 0.3%
19921 | 1 | 0.3%
18202 | 1 | 0.3%
15157 | 1 | 0.3%
11711 | 1 | 0.3%
9222 | 1 | 0.3%
8137 | 1 | 0.3%

주관부서
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 KiB

Value | Count
서구청 문화체육과 | 136
유성구청 문화관광과 | 113
대덕구청 문화관광체육과 | 35
중구청 문화체육과 | 34
동구청 관광문화체육과 | 24
Other values (2) | 4

Length

Max length: 12
Median length: 11
Mean length: 9.7803468
Min length: 4

Unique

Unique: 1
Unique (%): 0.3%

Sample

1st row: 대전광역시청 체육진흥과
2nd row: 대전광역시청 체육진흥과
3rd row: 대전광역시청 체육진흥과
4th row: 서구청 문화체육과
5th row: 서구청 문화체육과

Common Values

Value | Count | Frequency (%)
서구청 문화체육과 | 136 | 39.3%
유성구청 문화관광과 | 113 | 32.7%
대덕구청 문화관광체육과 | 35 | 10.1%
중구청 문화체육과 | 34 | 9.8%
동구청 관광문화체육과 | 24 | 6.9%
대전광역시청 체육진흥과 | 3 | 0.9%
<NA> | 1 | 0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
문화체육과 | 170 | 24.6%
서구청 | 136 | 19.7%
유성구청 | 113 | 16.4%
문화관광과 | 113 | 16.4%
대덕구청 | 35 | 5.1%
문화관광체육과 | 35 | 5.1%
중구청 | 34 | 4.9%
동구청 | 24 | 3.5%
관광문화체육과 | 24 | 3.5%
대전광역시청 | 3 | 0.4%
Other values (2) | 4 | 0.6%

Interactions


Correlations

Variable | 등록_신고 | 업종 | 면적 | 주관부서
등록_신고 | 1.000 | 1.000 | 1.000 | 1.000
업종 | 1.000 | 1.000 | 0.805 | 0.707
면적 | 1.000 | 0.805 | 1.000 | 0.939
주관부서 | 1.000 | 0.707 | 0.939 | 1.000

Variable | 주관부서 | 등록_신고 | 업종
주관부서 | 1.000 | 0.994 | 0.469
등록_신고 | 0.994 | 1.000 | 0.988
업종 | 0.469 | 0.988 | 1.000

Variable | 면적 | 등록_신고 | 업종 | 주관부서
면적 | 1.000 | 0.999 | 0.690 | 0.699
등록_신고 | 0.999 | 1.000 | 0.988 | 0.994
업종 | 0.690 | 0.988 | 1.000 | 0.469
주관부서 | 0.699 | 0.994 | 0.469 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

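Index | 등록_신고 | 업종 | 업소명 | 소재지 | 면적 | 주관부서
0 | 등록 | 골프장업 | 유성 컨트리클럽 | 대전 유성구 현충원로 200(덕명동 215-7번지) | 1156423 | 대전광역시청 체육진흥과
1 | 등록 | 골프장업 | 대덕연구개발특구복지센터 | 대전 유성구 전민동 산11-2번지 | 318382 | 대전광역시청 체육진흥과
2 | 등록 | 골프장업 | 금실 컨트리클럽 | 대전 유성구 테크노중앙로 210(용산동 676번지) | 259401 | 대전광역시청 체육진흥과
3 | 신고 | 승마장업 | 퀸즈승마장 | 대전시 서구 언목재길 253 (흑석동) | 855 | 서구청 문화체육과
4 | 신고 | 승마장업 | 대전승마장 | 대전시 서구 조련길 50 (흑석동) | 2471 | 서구청 문화체육과
5 | 신고 | 승마장업 | 복용승마장 | 대전광역시 유성구 덕명로56번길 199(덕명동) | 28991 | 유성구청 문화관광과
6 | 신고 | 종합체육시설업 | 동구국민체육센터 | 대전광역시 동구 가양로 9 (가양동) | 1244 | 동구청 관광문화체육과
7 | 신고 | 종합체육시설업 | 중구국민체육센터 | 대전광역시 중구 선화서로 701349 | (run together with 소재지 in the source) | 중구청 문화체육과
8 | 신고 | 종합체육시설업 | 삼부스포렉스 | 대전광역시 중구 태평로 836291 | (run together with 소재지 in the source) | 중구청 문화체육과
9 | 신고 | 종합체육시설업 | 대전시립산성종합복지관 | 대전광역시 중구 유등천동로 232(산성동) | 617 | 중구청 문화체육과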

Index | 등록_신고 | 업종 | 업소명 | 소재지 | 면적 | 주관부서
336 | 신고 | 골프연습장업(스크린) | 대화스크린골프 | 대전광역시 대덕구 대화10길 86 (대화동) | 418 | 대덕구청 문화관광체육과
337 | 신고 | 골프연습장업(스크린) | SP플러스 | 대전광역시 대덕구 대화로 21 (대화동) | 150 | 대덕구청 문화관광체육과
338 | 신고 | 골프연습장업(스크린) | SG골프 한밭대로점 | 대전광역시 대덕구 한밭대로 1149, 지하 1층 (중리동) | 492 | 대덕구청 문화관광체육과
339 | 신고 | 썰매장업 | 대전 오월드 사계절썰매장 | 대전시 중구 사정공원로 70 (사정동) | 3814 | 중구청 문화체육과
340 | 신고 | 가상체험체육시설업 | 크리스탈 스포츠센터 | 대전 동구 대전로 647 (효동) | 531 | 동구청 관광문화체육과
341 | 신고 | 가상체험체육시설업 | 스트라이크존 가오점 | 대전 동구 동구청로 101, 세이프존 5층(가오동) | 157 | 동구청 관광문화체육과
342 | 신고 | 가상체험체육시설업 | O2 스크린골프 | 대전광역시 유성구 학하로 98, 4층 402,403,404호 (학하동) | 484 | 유성구청 문화관광과
343 | 신고 | 가상체험체육시설업 | 티업프렌즈 스크린골프 | 대전광역시 유성구 테크노4로 80-7, 7층 (관평동) | 497 | 유성구청 문화관광과
344 | 신고 | 종합체육시설업 | 파랑새스포츠센터 | 대전광역시 유성구 탑립로 49, 지하1~2, 지상2층 | 1448 | 유성구청 문화관광과
345 | 신고 | 종합체육시설업 | 호텔오노마(오노마클럽) | 대전광역시 유성구 엑스포로 1, 27~28층(도룡동) | <NA> | <NA>