Overview

Dataset statistics

Number of variables: 4
Number of observations: 155
Missing cells: 118
Missing cells (%): 19.0%
Duplicate rows: 1
Duplicate rows (%): 0.6%
Total size in memory: 5.1 KiB
Average record size in memory: 33.9 B
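
The two percentages above can be checked from the reported counts. The following is a minimal sketch of that arithmetic, assuming the missing-cell share is taken over all cells (observations × variables) and the duplicate share over all rows; the variable names are illustrative only.

```python
# Counts copied from the dataset statistics above.
n_variables = 4
n_observations = 155
missing_cells = 118
duplicate_rows = 1

# Assumption: percentages are relative to all cells and all rows respectively.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```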

Variable types

Text: 1
Categorical: 2
Numeric: 1

Dataset

Description: Yearly list of the sports class programs offered at the Seodaemun Culture and Sports Center (서대문문화체육회관)
Author: 서울특별시서대문구도시관리공단
URL: https://www.data.go.kr/data/15074487/fileData.do

Alerts

Dataset has 1 (0.6%) duplicate rows (Duplicates)
강좌명 has 59 (38.1%) missing values (Missing)
프로그램 이용료 has 59 (38.1%) missing values (Missing)

Reproduction

Analysis started: 2023-12-12 18:23:31.251936
Analysis finished: 2023-12-12 18:23:32.300603
Duration: 1.05 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
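
A report like this one can be regenerated with ydata-profiling. The sketch below is not the original script: the function name `build_report`, the CSV path argument, and the output file name are hypothetical, and the encoding of the source file is not stated in the report, so `read_csv` may need an explicit `encoding` argument.

```python
from pathlib import Path


def build_report(csv_path: str, out_path: str = "report.html") -> Path:
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so the module can be loaded without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the file is a plain CSV readable with pandas defaults.
    df = pd.read_csv(csv_path)
    report = ProfileReport(df, title="Profiling Report")
    report.to_file(out_path)
    return Path(out_path)
```

Calling `build_report` with the path of the CSV downloaded from the URL above would write the report next to the script.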

Variables

강좌명
Text

MISSING 

Distinct: 52
Distinct (%): 54.2%
Missing: 59
Missing (%): 38.1%
Memory size: 1.3 KiB

Length

Max length: 17
Median length: 15
Mean length: 7.8958333
Min length: 2

Characters and Unicode

Total characters: 758
Distinct characters: 109
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
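
As an illustration of such character properties, the sketch below counts the Unicode general category of each character in one sample value from this column, using Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical. The standard library exposes the general category (`Lo` is Other Letter, `Zs` is Space Separator, `Sm` is Math Symbol) but not the script or block, so only categories are shown.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category code (e.g. 'Lo', 'Zs', 'Sm')."""
    return Counter(unicodedata.category(ch) for ch in text)


# First sample value of the 강좌명 column.
counts = category_counts("배드민턴 강습+개인연습")
print(counts)
```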

Unique

Unique: 33
Unique (%): 34.4%
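
The reported figures are consistent with "Distinct" counting the different values and "Unique" counting the values that occur exactly once, both relative to the 96 non-missing entries (155 observations minus 59 missing). The sketch below illustrates that reading on a toy list; the helper name and the toy data are assumptions, not part of the report.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (distinct, unique, present) counts, ignoring None as missing."""
    present = [v for v in values if v is not None]
    freq = Counter(present)
    distinct = len(freq)                                # different values
    unique = sum(1 for c in freq.values() if c == 1)    # values seen exactly once
    return distinct, unique, len(present)


toy = ["에어로빅", "에어로빅", "피클볼", None, "골프레슨"]
print(distinct_and_unique(toy))

# Percentages recomputed from the counts reported for 강좌명.
non_missing = 155 - 59
distinct_pct = 100 * 52 / non_missing
unique_pct = 100 * 33 / non_missing
```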

Sample

1st row: 배드민턴 강습+개인연습
2nd row: 어린이배드민턴
3rd row: 어린이배드민턴(초등학교 저학년)
4th row: 어린이배드민턴(초등학교 고학년)
5th row: 피클볼
Value | Count | Frequency (%)
서대문fc | 23 | 17.7%
에어로빅 | 10 | 7.7%
성장줄넘기 | 8 | 6.2%
기구필라테스 | 5 | 3.8%
골프레슨 | 4 | 3.1%
snpe | 3 | 2.3%
엘리트(초고 | 3 | 2.3%
골프자유연습 | 3 | 2.3%
방송댄스(어린이 | 3 | 2.3%
건강댄스 | 3 | 2.3%
Other values (47) | 65 | 50.0%

Most occurring characters

Value | Count | Frequency (%)
(space) | 49 | 6.5%
 | 32 | 4.2%
) | 30 | 4.0%
( | 30 | 4.0%
 | 24 | 3.2%
C | 24 | 3.2%
F | 24 | 3.2%
 | 24 | 3.2%
 | 24 | 3.2%
 | 19 | 2.5%
Other values (99) | 478 | 63.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 538 | 71.0%
Uppercase Letter | 60 | 7.9%
Space Separator | 49 | 6.5%
Close Punctuation | 30 | 4.0%
Open Punctuation | 30 | 4.0%
Decimal Number | 29 | 3.8%
Dash Punctuation | 8 | 1.1%
Math Symbol | 7 | 0.9%
Other Punctuation | 7 | 0.9%
0.9%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 32 | 5.9%
 | 24 | 4.5%
 | 24 | 4.5%
 | 24 | 4.5%
 | 19 | 3.5%
 | 15 | 2.8%
 | 15 | 2.8%
 | 14 | 2.6%
 | 13 | 2.4%
 | 13 | 2.4%
Other values (79) | 345 | 64.1%

Uppercase Letter
Value | Count | Frequency (%)
C | 24 | 40.0%
F | 24 | 40.0%
E | 3 | 5.0%
S | 3 | 5.0%
P | 3 | 5.0%
N | 3 | 5.0%

Decimal Number
Value | Count | Frequency (%)
1 | 9 | 31.0%
2 | 6 | 20.7%
3 | 5 | 17.2%
4 | 4 | 13.8%
6 | 4 | 13.8%
5 | 1 | 3.4%

Math Symbol
Value | Count | Frequency (%)
~ | 6 | 85.7%
+ | 1 | 14.3%

Other Punctuation
Value | Count | Frequency (%)
, | 4 | 57.1%
. | 3 | 42.9%

Space Separator
Value | Count | Frequency (%)
(space) | 49 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 30 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 30 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 8 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 538 | 71.0%
Common | 160 | 21.1%
Latin | 60 | 7.9%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 32 | 5.9%
 | 24 | 4.5%
 | 24 | 4.5%
 | 24 | 4.5%
 | 19 | 3.5%
 | 15 | 2.8%
 | 15 | 2.8%
 | 14 | 2.6%
 | 13 | 2.4%
 | 13 | 2.4%
Other values (79) | 345 | 64.1%

Common
Value | Count | Frequency (%)
(space) | 49 | 30.6%
) | 30 | 18.8%
( | 30 | 18.8%
1 | 9 | 5.6%
- | 8 | 5.0%
~ | 6 | 3.8%
2 | 6 | 3.8%
3 | 5 | 3.1%
4 | 4 | 2.5%
6 | 4 | 2.5%
Other values (4) | 9 | 5.6%

Latin
Value | Count | Frequency (%)
C | 24 | 40.0%
F | 24 | 40.0%
E | 3 | 5.0%
S | 3 | 5.0%
P | 3 | 5.0%
N | 3 | 5.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 538 | 71.0%
ASCII | 220 | 29.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 49 | 22.3%
) | 30 | 13.6%
( | 30 | 13.6%
C | 24 | 10.9%
F | 24 | 10.9%
1 | 9 | 4.1%
- | 8 | 3.6%
~ | 6 | 2.7%
2 | 6 | 2.7%
3 | 5 | 2.3%
Other values (10) | 29 | 13.2%

Hangul
Value | Count | Frequency (%)
 | 32 | 5.9%
 | 24 | 4.5%
 | 24 | 4.5%
 | 24 | 4.5%
 | 19 | 3.5%
 | 15 | 2.8%
 | 15 | 2.8%
 | 14 | 2.6%
 | 13 | 2.4%
 | 13 | 2.4%
Other values (79) | 345 | 64.1%

수업요일
Categorical

Distinct: 15
Distinct (%): 9.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Value | Count
<NA> | 59
화,목 | 25
월,수,금 | 17
 | 16
Other values (10) | 31

Length

Max length: 5
Median length: 4
Mean length: 3.2
Min length: 1

Unique

Unique: 2
Unique (%): 1.3%

Sample

1st row: 월~토
2nd row: 월,수,금
3rd row:
4th row:
5th row:

Common Values

Value | Count | Frequency (%)
<NA> | 59 | 38.1%
화,목 | 25 | 16.1%
월,수,금 | 17 | 11.0%
 | 16 | 10.3%
 | 7 | 4.5%
화,목,토 | 5 | 3.2%
 | 5 | 3.2%
월~토 | 4 | 2.6%
월~금 | 4 | 2.6%
월,수 | 3 | 1.9%
Other values (5) | 10 | 6.5%

Length

[Figure: histogram of lengths of the category]
Value | Count | Frequency (%)
na | 59 | 38.1%
화,목 | 25 | 16.1%
월,수,금 | 17 | 11.0%
 | 16 | 10.3%
 | 7 | 4.5%
화,목,토 | 5 | 3.2%
 | 5 | 3.2%
월~토 | 4 | 2.6%
월~금 | 4 | 2.6%
월,수 | 3 | 1.9%
Other values (5) | 10 | 6.5%

수업시간
Categorical

Distinct: 24
Distinct (%): 15.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Value | Count
<NA> | 59
16시 | 19
10시 | 16
15시 | 11
9시 | 6
Other values (19) | 44

Length

Max length: 10
Median length: 8
Mean length: 3.8322581
Min length: 2

Unique

Unique: 6
Unique (%): 3.9%

Sample

1st row: 6시~9시
2nd row: 15시
3rd row: 15시
4th row: 16시
5th row: 13시

Common Values

Value | Count | Frequency (%)
<NA> | 59 | 38.1%
16시 | 19 | 12.3%
10시 | 16 | 10.3%
15시 | 11 | 7.1%
9시 | 6 | 3.9%
6시 | 4 | 2.6%
11시 | 4 | 2.6%
19시 | 4 | 2.6%
7시~21시 | 4 | 2.6%
12시 | 4 | 2.6%
Other values (14) | 24 | 15.5%

Length

[Figure: histogram of lengths of the category]
Value | Count | Frequency (%)
na | 59 | 37.1%
16시 | 19 | 11.9%
10시 | 16 | 10.1%
15시 | 11 | 6.9%
9시 | 6 | 3.8%
19시 | 6 | 3.8%
7시~21시 | 6 | 3.8%
6시 | 4 | 2.5%
11시 | 4 | 2.5%
12시 | 4 | 2.5%
Other values (14) | 24 | 15.1%

프로그램 이용료
Real number (ℝ)

MISSING 

Distinct: 23
Distinct (%): 24.0%
Missing: 59
Missing (%): 38.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 48458.333
Minimum: 10000
Maximum: 180000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 10000
5th percentile: 20000
Q1: 25000
Median: 41000
Q3: 55000
95th percentile: 120000
Maximum: 180000
Range: 170000
Interquartile range (IQR): 30000

Descriptive statistics

Standard deviation: 34744.759
Coefficient of variation (CV): 0.71700276
Kurtosis: 5.0727431
Mean: 48458.333
Median Absolute Deviation (MAD): 16000
Skewness: 2.1193367
Sum: 4652000
Variance: 1.2071982 × 10^9
Monotonicity: Not monotonic
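
Several of these statistics are derived from one another, which makes them easy to cross-check. The sketch below recomputes the mean, coefficient of variation, variance, range, and IQR from the reported sum, standard deviation, and quantiles. It assumes the mean is taken over the 96 non-missing values (155 observations minus 59 missing); all variable names are illustrative.

```python
# Values copied from the report for 프로그램 이용료.
total = 4_652_000            # Sum
n = 155 - 59                 # assumption: statistics use non-missing values only
std = 34744.759              # Standard deviation
q1, q3 = 25_000, 55_000      # Q1 and Q3
minimum, maximum = 10_000, 180_000

mean = total / n             # arithmetic mean
cv = std / mean              # coefficient of variation = std / mean
variance = std ** 2          # variance is the squared standard deviation
iqr = q3 - q1                # interquartile range
value_range = maximum - minimum

print(f"mean={mean:.3f} cv={cv:.8f} variance={variance:.7e}")
print(f"iqr={iqr} range={value_range}")
```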
[Figure: histogram with fixed size bins (bins=23)]
Value | Count | Frequency (%)
25000 | 16 | 10.3%
50000 | 16 | 10.3%
21000 | 7 | 4.5%
20000 | 7 | 4.5%
35000 | 6 | 3.9%
120000 | 5 | 3.2%
41000 | 5 | 3.2%
80000 | 5 | 3.2%
60000 | 4 | 2.6%
30000 | 4 | 2.6%
Other values (13) | 21 | 13.5%
(Missing) | 59 | 38.1%

Value | Count | Frequency (%)
10000 | 2 | 1.3%
15000 | 1 | 0.6%
20000 | 7 | 4.5%
21000 | 7 | 4.5%
25000 | 16 | 10.3%
26000 | 1 | 0.6%
30000 | 4 | 2.6%
35000 | 6 | 3.9%
37000 | 1 | 0.6%
40000 | 2 | 1.3%

Value | Count | Frequency (%)
180000 | 3 | 1.9%
120000 | 5 | 3.2%
100000 | 1 | 0.6%
80000 | 5 | 3.2%
70000 | 2 | 1.3%
65000 | 2 | 1.3%
60000 | 4 | 2.6%
55000 | 3 | 1.9%
50000 | 16 | 10.3%
49000 | 1 | 0.6%

Interactions


Correlations

 | 강좌명 | 수업요일 | 수업시간 | 프로그램 이용료
강좌명 | 1.000 | 0.779 | 0.859 | 0.931
수업요일 | 0.779 | 1.000 | 0.000 | 0.533
수업시간 | 0.859 | 0.000 | 1.000 | 0.698
프로그램 이용료 | 0.931 | 0.533 | 0.698 | 1.000

 | 수업시간 | 수업요일
수업시간 | 1.000 | 0.000
수업요일 | 0.000 | 1.000

 | 프로그램 이용료 | 수업요일 | 수업시간
프로그램 이용료 | 1.000 | 0.265 | 0.328
수업요일 | 0.265 | 1.000 | 0.000
수업시간 | 0.328 | 0.000 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
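
Nullity correlation can be illustrated as the Pearson correlation between the missingness indicators of two columns. The following is a minimal sketch on toy data, not the report's own implementation; the helper names and the toy columns are assumptions. The value is undefined (division by zero) when a column is entirely present or entirely missing.

```python
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def nullity_correlation(col_a, col_b):
    """Correlate the 'is missing' indicators (1 = missing) of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    return pearson(a, b)


# Toy columns: names and fees are missing in the same rows, days are not.
names = ["에어로빅", "피클볼", None, None]
fees = [41000, 25000, None, None]
days = ["월~토", None, "화,목", None]

same_pattern = nullity_correlation(names, fees)
unrelated = nullity_correlation(names, days)
print(same_pattern, unrelated)
```

A value of 1 means two columns are always missing together, as in the toy `names` and `fees` columns; a value near 0 means the missingness patterns are unrelated.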

Sample

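Index | 강좌명 | 수업요일 | 수업시간 | 프로그램 이용료
0 | 배드민턴 강습+개인연습 | 월~토 | 6시~9시 | 43000
1 | 어린이배드민턴 | 월,수,금 | 15시 | 40000
2 | 어린이배드민턴(초등학교 저학년) | | 15시 | 25000
3 | 어린이배드민턴(초등학교 고학년) | | 16시 | 25000
4 | 피클볼 | | 13시 | 25000
5 | 에어로빅 | 월~토 | 6시 | 41000
6 | 에어로빅 | 월,수,금 | 6시 | 21000
7 | 에어로빅 | 화,목,토 | 6시 | 21000
8 | 에어로빅 | 월~토 | 10시 | 41000
9 | 에어로빅 | 월,수,금 | 10시 | 21000

Index | 강좌명 | 수업요일 | 수업시간 | 프로그램 이용료
145 | <NA> | <NA> | <NA> | <NA>
146 | <NA> | <NA> | <NA> | <NA>
147 | <NA> | <NA> | <NA> | <NA>
148 | <NA> | <NA> | <NA> | <NA>
149 | <NA> | <NA> | <NA> | <NA>
150 | <NA> | <NA> | <NA> | <NA>
151 | <NA> | <NA> | <NA> | <NA>
152 | <NA> | <NA> | <NA> | <NA>
153 | <NA> | <NA> | <NA> | <NA>
154 | <NA> | <NA> | <NA> | <NA>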

Duplicate rows

Most frequently occurring

Index | 강좌명 | 수업요일 | 수업시간 | 프로그램 이용료 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | 59