Overview

Dataset statistics

Number of variables: 7
Number of observations: 87
Missing cells: 8
Missing cells (%): 1.3%
Duplicate rows: 1
Duplicate rows (%): 1.1%
Total size in memory: 5.0 KiB
Average record size in memory: 58.5 B
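
The two percentages above can be checked by hand. The sketch below assumes that the missing-cell share is taken over all cells (rows times columns) and the duplicate share over the row count; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table above.
n_variables = 7
n_observations = 87
missing_cells = 8
duplicate_rows = 1

# Assumption: missing cells are a share of all cells (rows x columns),
# duplicate rows are a share of the number of rows.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
```

Both results round to the reported values (1.3% and 1.1%).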

Variable types

Categorical: 3
Text: 1
DateTime: 2
Numeric: 1

Dataset

Description: Course data (course name, start and end times, tuition fee, etc.) for 서재문화체육센터, a culture and sports center operated by 대구시설공단.
Author: 대구시설공단
URL: https://www.data.go.kr/data/15018724/fileData.do

Alerts

Dataset has 1 (1.1%) duplicate row [Duplicates]
수강료 is highly overall correlated with 분류 and 1 other field [High correlation]
분류 is highly overall correlated with 수강료 [High correlation]
강습요일 is highly overall correlated with 수강료 [High correlation]
대상 is highly imbalanced (53.8%) [Imbalance]
강좌명 / 반명 has 2 (2.3%) missing values [Missing]
시작시간 has 2 (2.3%) missing values [Missing]
종료시간 has 2 (2.3%) missing values [Missing]
수강료 has 2 (2.3%) missing values [Missing]
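
The 53.8% imbalance figure for 대상 can be reproduced from the class counts listed for that variable (성인 73, 어린이 12, <NA> 2). The sketch below assumes the score is one minus the Shannon entropy of the class distribution, normalized by the log of the number of classes, and that <NA> counts as its own class. Both are assumptions about how the report derives the number, and `imbalance_score` is a hypothetical helper, not a function of the profiling library.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy of the counts.

    0 means perfectly balanced classes, 1 means a single class holds all rows.
    Hypothetical helper: the formula is an assumption, not taken from the report.
    """
    k = len(counts)
    if k <= 1:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts for 대상 as listed in the report: 성인, 어린이, <NA>.
counts = [73, 12, 2]
score = imbalance_score(counts)
print(f"{score:.1%}")
```

Under these assumptions the score comes out at about 0.538, matching the alert.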

Reproduction

Analysis started: 2023-12-12 21:35:56.700984
Analysis finished: 2023-12-12 21:35:57.582730
Duration: 0.88 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
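
The duration follows directly from the two timestamps. A minimal check with the standard library (variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table above.
started = datetime.fromisoformat("2023-12-12 21:35:56.700984")
finished = datetime.fromisoformat("2023-12-12 21:35:57.582730")

# Difference between the two timestamps, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```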

Variables

분류
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 9.2%
Missing: 0
Missing (%): 0.0%
Memory size: 828.0 B

수영: 53
탁구: 11
배드민턴: 6
댄스: 5
아쿠아로빅: 4
Other values (3): 8

Length

Max length: 5
Median length: 2
Mean length: 2.3678161
Min length: 2
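
These length statistics can be rebuilt from the value counts listed under Common Values for this variable. The sketch below assumes that lengths are counted in characters and that the placeholder `<NA>` is treated as an ordinary 4-character string, which is consistent with the report showing 0 missing values for 분류.

```python
import statistics

# Value counts for 분류 as listed in the report; "<NA>" is kept as a literal string.
value_counts = {
    "수영": 53, "탁구": 11, "배드민턴": 6, "댄스": 5,
    "아쿠아로빅": 4, "요가": 4, "힐링몸짱": 2, "<NA>": 2,
}

# Expand the counts into one character length per row.
lengths = [len(value) for value, count in value_counts.items() for _ in range(count)]

mean_length = sum(lengths) / len(lengths)
median_length = statistics.median(lengths)

print(f"rows={len(lengths)} max={max(lengths)} min={min(lengths)}")
print(f"median={median_length} mean={mean_length:.7f}")
```

The mean works out to 206 / 87, which agrees with the reported 2.3678161.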

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 수영
2nd row: 수영
3rd row: 수영
4th row: 수영
5th row: 수영

Common Values

Value | Count | Frequency (%)
수영 | 53 | 60.9%
탁구 | 11 | 12.6%
배드민턴 | 6 | 6.9%
댄스 | 5 | 5.7%
아쿠아로빅 | 4 | 4.6%
요가 | 4 | 4.6%
힐링몸짱 | 2 | 2.3%
<NA> | 2 | 2.3%

Length

Histogram of lengths of the category


강좌명 / 반명
Text

MISSING 

Distinct: 85
Distinct (%): 100.0%
Missing: 2
Missing (%): 2.3%
Memory size: 828.0 B

Length

Max length: 20
Median length: 16
Mean length: 12.882353
Min length: 9

Characters and Unicode

Total characters: 1095
Distinct characters: 59
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 85
Unique (%): 100.0%

Sample

1st row: 주3회 초급반(07시)
2nd row: 주2회 초급반(20시)
3rd row: 주3회 중급반(06시)
4th row: 주3회 중급반(11시)
5th row: 주3회 중급반(20시)

Value | Count | Frequency (%)
주3회 | 48 | 24.5%
주2회 | 37 | 18.9%
배드민턴 | 6 | 3.1%
10시 | 6 | 3.1%
고급탁구 | 6 | 3.1%
초급탁구 | 5 | 2.6%
19시 | 5 | 2.6%
15시 | 4 | 2.0%
11시 | 4 | 2.0%
교정반(16시 | 2 | 1.0%
Other values (55) | 73 | 37.2%

Most occurring characters

ValueCountFrequency (%)
111
 
10.1%
85
 
7.8%
85
 
7.8%
( 85
 
7.8%
85
 
7.8%
) 85
 
7.8%
1 63
 
5.8%
53
 
4.8%
2 49
 
4.5%
3 49
 
4.5%
Other values (49) 345
31.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 533 | 48.7%
Decimal Number | 255 | 23.3%
Space Separator | 111 | 10.1%
Open Punctuation | 85 | 7.8%
Close Punctuation | 85 | 7.8%
Uppercase Letter | 25 | 2.3%
Dash Punctuation | 1 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
85
15.9%
85
15.9%
85
15.9%
53
9.9%
29
 
5.4%
27
 
5.1%
22
 
4.1%
15
 
2.8%
15
 
2.8%
11
 
2.1%
Other values (29) 106
19.9%
Decimal Number
Value | Count | Frequency (%)
1 | 63 | 24.7%
2 | 49 | 19.2%
3 | 49 | 19.2%
0 | 40 | 15.7%
9 | 18 | 7.1%
7 | 14 | 5.5%
6 | 14 | 5.5%
5 | 4 | 1.6%
4 | 2 | 0.8%
8 | 2 | 0.8%
Uppercase Letter
Value | Count | Frequency (%)
A | 10 | 40.0%
B | 10 | 40.0%
P | 2 | 8.0%
O | 1 | 4.0%
K | 1 | 4.0%
C | 1 | 4.0%
Space Separator
Value | Count | Frequency (%)
(space) | 111 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 85 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 85 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 537 | 49.0%
Hangul | 533 | 48.7%
Latin | 25 | 2.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
85
15.9%
85
15.9%
85
15.9%
53
9.9%
29
 
5.4%
27
 
5.1%
22
 
4.1%
15
 
2.8%
15
 
2.8%
11
 
2.1%
Other values (29) 106
19.9%
Common
Value | Count | Frequency (%)
(space) | 111 | 20.7%
( | 85 | 15.8%
) | 85 | 15.8%
1 | 63 | 11.7%
2 | 49 | 9.1%
3 | 49 | 9.1%
0 | 40 | 7.4%
9 | 18 | 3.4%
7 | 14 | 2.6%
6 | 14 | 2.6%
Other values (4) | 9 | 1.7%
Latin
Value | Count | Frequency (%)
A | 10 | 40.0%
B | 10 | 40.0%
P | 2 | 8.0%
O | 1 | 4.0%
K | 1 | 4.0%
C | 1 | 4.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 562 | 51.3%
Hangul | 533 | 48.7%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 111 | 19.8%
( | 85 | 15.1%
) | 85 | 15.1%
1 | 63 | 11.2%
2 | 49 | 8.7%
3 | 49 | 8.7%
0 | 40 | 7.1%
9 | 18 | 3.2%
7 | 14 | 2.5%
6 | 14 | 2.5%
Other values (10) | 34 | 6.0%
Hangul
ValueCountFrequency (%)
85
15.9%
85
15.9%
85
15.9%
53
9.9%
29
 
5.4%
27
 
5.1%
22
 
4.1%
15
 
2.8%
15
 
2.8%
11
 
2.1%
Other values (29) 106
19.9%

강습요일
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Memory size: 828.0 B

월,수,금: 48
화,목: 37
<NA>: 2

Length

Max length: 5
Median length: 5
Mean length: 4.1264368
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 월,수,금
2nd row: 화,목
3rd row: 월,수,금
4th row: 월,수,금
5th row: 월,수,금

Common Values

Value | Count | Frequency (%)
월,수,금 | 48 | 55.2%
화,목 | 37 | 42.5%
<NA> | 2 | 2.3%

Length

Histogram of lengths of the category


시작시간
Date

MISSING 

Distinct: 15
Distinct (%): 17.6%
Missing: 2
Missing (%): 2.3%
Memory size: 828.0 B
Minimum: 2023-12-13 06:00:00
Maximum: 2023-12-13 21:00:00
Histogram with fixed size bins (bins=15)

종료시간
Date

MISSING 

Distinct: 19
Distinct (%): 22.4%
Missing: 2
Missing (%): 2.3%
Memory size: 828.0 B
Minimum: 2023-12-13 06:50:00
Maximum: 2023-12-13 22:00:00
Histogram with fixed size bins (bins=19)

대상
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Memory size: 828.0 B

성인: 73
어린이: 12
<NA>: 2

Length

Max length: 4
Median length: 2
Mean length: 2.183908
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 성인
2nd row: 성인
3rd row: 성인
4th row: 성인
5th row: 성인

Common Values

Value | Count | Frequency (%)
성인 | 73 | 83.9%
어린이 | 12 | 13.8%
<NA> | 2 | 2.3%

Length

Histogram of lengths of the category


수강료
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 12
Distinct (%): 14.1%
Missing: 2
Missing (%): 2.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 49117.647
Minimum: 25000
Maximum: 104000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 915.0 B

Quantile statistics

Minimum: 25000
5-th percentile: 28400
Q1: 32000
Median: 50000
Q3: 50000
95-th percentile: 90000
Maximum: 104000
Range: 79000
Interquartile range (IQR): 18000
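
The quantiles can be reproduced from the value counts the report lists for 수강료 (85 non-missing rows). The sketch below assumes quantiles are taken by linear interpolation between order statistics; `quantile` is a hypothetical helper written for this check, and the interpolation rule is an assumption that happens to reproduce the reported 5-th percentile of 28400.

```python
import math

# Value counts for 수강료 as listed in the report's frequency tables.
fee_counts = {
    25000: 1, 28000: 4, 30000: 5, 32000: 19, 42000: 6, 45000: 6,
    50000: 27, 60000: 3, 71000: 2, 84000: 6, 90000: 3, 104000: 3,
}
fees = sorted(fee for fee, count in fee_counts.items() for _ in range(count))

def quantile(sorted_values, q):
    """Linear interpolation between the two nearest order statistics (assumed rule)."""
    position = q * (len(sorted_values) - 1)
    lower = math.floor(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

q1, median, q3 = (quantile(fees, q) for q in (0.25, 0.50, 0.75))
p5, p95 = quantile(fees, 0.05), quantile(fees, 0.95)

print(f"p5={p5:.0f} Q1={q1:.0f} median={median:.0f} Q3={q3:.0f} p95={p95:.0f}")
print(f"range={fees[-1] - fees[0]} IQR={q3 - q1:.0f}")
```

Because 27 of the 85 rows carry a fee of 50000, the median and Q3 coincide, which is why the IQR equals the gap between 50000 and Q1.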

Descriptive statistics

Standard deviation: 19775.774
Coefficient of variation (CV): 0.40262055
Kurtosis: 0.93879382
Mean: 49117.647
Median Absolute Deviation (MAD): 18000
Skewness: 1.2546531
Sum: 4175000
Variance: 3.9108123 × 10^8
Monotonicity: Not monotonic
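
Several of these figures can be cross-checked from the value counts the report lists for 수강료. The sketch below assumes the standard deviation and variance are sample statistics (divisor n - 1) and that the CV is the standard deviation divided by the mean; both assumptions reproduce the reported numbers. Skewness and kurtosis are left out because their exact estimator is not stated in the report.

```python
import statistics

# Value counts for 수강료 as listed in the report's frequency tables.
fee_counts = {
    25000: 1, 28000: 4, 30000: 5, 32000: 19, 42000: 6, 45000: 6,
    50000: 27, 60000: 3, 71000: 2, 84000: 6, 90000: 3, 104000: 3,
}
fees = [fee for fee, count in fee_counts.items() for _ in range(count)]

total = sum(fees)
mean = statistics.fmean(fees)
variance = statistics.variance(fees)   # sample variance, divisor n - 1 (assumption)
std_dev = statistics.stdev(fees)       # sample standard deviation
cv = std_dev / mean                    # coefficient of variation

# Median absolute deviation: median of the absolute distances to the median.
center = statistics.median(fees)
mad = statistics.median([abs(fee - center) for fee in fees])

print(f"sum={total} mean={mean:.3f}")
print(f"variance={variance:.4e} std={std_dev:.3f} cv={cv:.8f} mad={mad}")
```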
Histogram with fixed size bins (bins=12)

Common values

Value | Count | Frequency (%)
50000 | 27 | 31.0%
32000 | 19 | 21.8%
45000 | 6 | 6.9%
42000 | 6 | 6.9%
84000 | 6 | 6.9%
30000 | 5 | 5.7%
28000 | 4 | 4.6%
90000 | 3 | 3.4%
60000 | 3 | 3.4%
104000 | 3 | 3.4%
Other values (2) | 3 | 3.4%

Minimum 10 values

Value | Count | Frequency (%)
25000 | 1 | 1.1%
28000 | 4 | 4.6%
30000 | 5 | 5.7%
32000 | 19 | 21.8%
42000 | 6 | 6.9%
45000 | 6 | 6.9%
50000 | 27 | 31.0%
60000 | 3 | 3.4%
71000 | 2 | 2.3%
84000 | 6 | 6.9%

Maximum 10 values

Value | Count | Frequency (%)
104000 | 3 | 3.4%
90000 | 3 | 3.4%
84000 | 6 | 6.9%
71000 | 2 | 2.3%
60000 | 3 | 3.4%
50000 | 27 | 31.0%
45000 | 6 | 6.9%
42000 | 6 | 6.9%
32000 | 19 | 21.8%
30000 | 5 | 5.7%

Interactions


Correlations

Variable | 분류 | 강좌명 / 반명 | 강습요일 | 시작시간 | 종료시간 | 대상 | 수강료
분류 | 1.000 | 1.000 | 0.000 | 0.804 | 0.861 | 0.075 | 0.784
강좌명 / 반명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
강습요일 | 0.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.998
시작시간 | 0.804 | 1.000 | 0.000 | 1.000 | 0.994 | 1.000 | 0.569
종료시간 | 0.861 | 1.000 | 0.000 | 0.994 | 1.000 | 1.000 | 0.735
대상 | 0.075 | 1.000 | 0.000 | 1.000 | 1.000 | 1.000 | 0.575
수강료 | 0.784 | 1.000 | 0.998 | 0.569 | 0.735 | 0.575 | 1.000

Variable | 대상 | 분류 | 강습요일
대상 | 1.000 | 0.072 | 0.000
분류 | 0.072 | 1.000 | 0.000
강습요일 | 0.000 | 0.000 | 1.000

Variable | 수강료 | 분류 | 강습요일 | 대상
수강료 | 1.000 | 0.567 | 0.925 | 0.407
분류 | 0.567 | 1.000 | 0.000 | 0.072
강습요일 | 0.925 | 0.000 | 1.000 | 0.000
대상 | 0.407 | 0.072 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

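First rows

Row | 분류 | 강좌명 / 반명 | 강습요일 | 시작시간 | 종료시간 | 대상 | 수강료
0 | 수영 | 주3회 초급반(07시) | 월,수,금 | 7:00 | 7:50 | 성인 | 50000
1 | 수영 | 주2회 초급반(20시) | 화,목 | 20:00 | 20:50 | 성인 | 32000
2 | 수영 | 주3회 중급반(06시) | 월,수,금 | 6:00 | 6:50 | 성인 | 50000
3 | 수영 | 주3회 중급반(11시) | 월,수,금 | 11:00 | 11:50 | 성인 | 50000
4 | 수영 | 주3회 중급반(20시) | 월,수,금 | 20:00 | 20:50 | 성인 | 50000
5 | 수영 | 주2회 중급반(07시) | 화,목 | 7:00 | 7:50 | 성인 | 32000
6 | 수영 | 주2회 상급반(06시) | 화,목 | 6:00 | 6:50 | 성인 | 32000
7 | 수영 | 주2회 상급반(07시) | 화,목 | 7:00 | 7:50 | 성인 | 32000
8 | 수영 | 주2회 고급반(19시) | 화,목 | 19:00 | 19:50 | 성인 | 32000
9 | 수영 | 주3회 교정A반(06시) | 월,수,금 | 6:00 | 6:50 | 성인 | 50000

Last rows

Row | 분류 | 강좌명 / 반명 | 강습요일 | 시작시간 | 종료시간 | 대상 | 수강료
77 | 탁구 | 주3회 고급탁구 (15시) | 월,수,금 | 15:00 | 16:00 | 성인 | 104000
78 | 탁구 | 주3회 초급탁구 (19시) | 월,수,금 | 19:00 | 20:00 | 성인 | 84000
79 | 탁구 | 주3회 고급탁구 (19시) | 월,수,금 | 19:00 | 21:00 | 성인 | 104000
80 | 탁구 | 주2회 초급탁구 (10시) | 화,목 | 10:00 | 12:00 | 성인 | 71000
81 | 탁구 | 주2회 고급탁구 (11시) | 화,목 | 11:00 | 12:00 | 성인 | 84000
82 | 탁구 | 주2회 초급탁구 (15시) | 화,목 | 15:00 | 17:00 | 성인 | 71000
83 | 탁구 | 주2회 고급탁구 (15시) | 화,목 | 15:00 | 16:00 | 성인 | 84000
84 | 탁구 | 주2회 고급탁구 (19시) | 화,목 | 19:00 | 21:00 | 성인 | 84000
85 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
86 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>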

Duplicate rows

Most frequently occurring

Row | 분류 | 강좌명 / 반명 | 강습요일 | 시작시간 | 종료시간 | 대상 | 수강료 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2