Overview

Dataset statistics

Number of variables: 8
Number of observations: 52
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.5 KiB
Average record size in memory: 68.5 B

Variable types

Numeric: 1
Categorical: 6
Text: 1

Dataset

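Description: Course information for the education and culture programs at Dapsimni Library (답십리도서관), including program types, class schedules, and target audiences
Author: Dongdaemun-gu Facilities Management Corporation (동대문구시설관리공단)
URL: https://www.data.go.kr/data/15044064/fileData.do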

Alerts

분야 has constant value "문화" (Constant)
시간 is highly overall correlated with 정원 (High correlation)
요일 is highly overall correlated with 정원 (High correlation)
수강료(월) is highly overall correlated with 정원 (High correlation)
정원 is highly overall correlated with 시간 and 2 other fields (High correlation)
정원 is highly imbalanced (61.0%) (Imbalance)
연번 has unique values (Unique)
프로그램명 has unique values (Unique)
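
The 61.0% imbalance figure for 정원 can be checked by hand. The sketch below assumes the score is one minus the Shannon entropy of the class counts divided by the log of the number of classes; the report does not state its formula, but this definition reproduces the reported value from the 정원 counts (45, 3, 2, 2) listed under Common Values. The function name `imbalance_score` is illustrative only.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k), with H the Shannon entropy of the class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class
    dominates. A single-class input returns 0.0 by convention in this sketch.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)


# Counts of the four 정원 (capacity) values as listed in the report.
capacity_counts = [45, 3, 2, 2]
score = imbalance_score(capacity_counts)
print(f"{score:.1%}")  # 61.0%
```

Because 45 of the 52 rows share the value 12, the entropy is far below its maximum, which is what pushes the score above the alert threshold.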

Reproduction

Analysis started: 2023-12-12 19:47:03.817476
Analysis finished: 2023-12-12 19:47:04.742526
Duration: 0.93 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
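
A report like this one could be regenerated roughly as follows. This is a minimal sketch, not the configuration actually used: the CSV path is hypothetical, the `cp949` encoding is an assumption (common for Korean public-data files, but not confirmed by the report), and the settings stored in config.json are not reproduced here.

```python
def build_report(csv_path, html_path="report.html", encoding="cp949"):
    """Profile a CSV file and write the HTML report to html_path.

    csv_path, html_path and the default encoding are illustrative assumptions.
    """
    # Imports are local so this sketch can be loaded without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Dapsimni Library programs")
    report.to_file(html_path)
    return report
```

Calling `build_report("programs.csv")` with a local copy of the file would write `report.html` to the working directory.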

Variables

연번 (serial number)
Real number (ℝ)

UNIQUE

Distinct: 52
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.5
Minimum: 1
Maximum: 52
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 600.0 B

Quantile statistics

Minimum: 1
5-th percentile: 3.55
Q1: 13.75
Median: 26.5
Q3: 39.25
95-th percentile: 49.45
Maximum: 52
Range: 51
Interquartile range (IQR): 25.5

Descriptive statistics

Standard deviation: 15.154757
Coefficient of variation (CV): 0.57187763
Kurtosis: -1.2
Mean: 26.5
Median Absolute Deviation (MAD): 13
Skewness: 0
Sum: 1378
Variance: 229.66667
Monotonicity: Strictly increasing
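
Since 연번 is simply the sequence 1 to 52, most of the statistics above can be recomputed with the standard library. The sketch below assumes sample (n - 1) variance and linear-interpolation percentiles, which is what the reported values are consistent with; the `percentile` helper is illustrative, not the report's own code.

```python
import statistics


def percentile(sorted_values, q):
    """Linear-interpolation percentile for q in [0, 100] on pre-sorted data."""
    position = (len(sorted_values) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


serial_numbers = list(range(1, 53))  # 연번 runs from 1 to 52 with no gaps

mean = statistics.mean(serial_numbers)
variance = statistics.variance(serial_numbers)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(serial_numbers)
cv = std_dev / mean                             # coefficient of variation
median = statistics.median(serial_numbers)
mad = statistics.median(abs(v - median) for v in serial_numbers)
q1 = percentile(serial_numbers, 25)
q3 = percentile(serial_numbers, 75)

print(mean, round(variance, 5), round(std_dev, 6), round(cv, 8), mad, q1, q3)
```

The interquartile range follows as `q3 - q1`, matching the 25.5 listed under Quantile statistics.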
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | 1.9%
28 | 1 | 1.9%
30 | 1 | 1.9%
31 | 1 | 1.9%
32 | 1 | 1.9%
33 | 1 | 1.9%
34 | 1 | 1.9%
35 | 1 | 1.9%
36 | 1 | 1.9%
37 | 1 | 1.9%
Other values (42) | 42 | 80.8%

Value | Count | Frequency (%)
1 | 1 | 1.9%
2 | 1 | 1.9%
3 | 1 | 1.9%
4 | 1 | 1.9%
5 | 1 | 1.9%
6 | 1 | 1.9%
7 | 1 | 1.9%
8 | 1 | 1.9%
9 | 1 | 1.9%
10 | 1 | 1.9%

Value | Count | Frequency (%)
52 | 1 | 1.9%
51 | 1 | 1.9%
50 | 1 | 1.9%
49 | 1 | 1.9%
48 | 1 | 1.9%
47 | 1 | 1.9%
46 | 1 | 1.9%
45 | 1 | 1.9%
44 | 1 | 1.9%
43 | 1 | 1.9%

분야 (field)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 548.0 B

문화 | 52

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 문화
2nd row: 문화
3rd row: 문화
4th row: 문화
5th row: 문화

Common Values

Value | Count | Frequency (%)
문화 | 52 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
문화 | 52 | 100.0%

프로그램명 (program name)
Text

UNIQUE

Distinct: 52
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 548.0 B

Length

Max length: 21
Median length: 14
Mean length: 8.1538462
Min length: 3

Characters and Unicode

Total characters: 424
Distinct characters: 121
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
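
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for the first three program names from the Sample list; the two-letter codes map onto the category names used in this report (`Lo` = Other Letter, `Lu` = Uppercase Letter, `Ps` = Open Punctuation, `Pe` = Close Punctuation). The helper name `category_counts` is illustrative, and scripts and blocks are not covered because the standard library does not expose them.

```python
import unicodedata
from collections import Counter


def category_counts(texts):
    """Count Unicode general categories over every character in texts."""
    counts = Counter()
    for text in texts:
        counts.update(unicodedata.category(ch) for ch in text)
    return counts


# First three values of 프로그램명 shown in the Sample list.
sample_names = ["NIE(글쓰기)", "NIE(창의언어)", "NIE(토의)"]
for code, count in category_counts(sample_names).most_common():
    print(code, count)
```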

Unique

Unique: 52
Unique (%): 100.0%

Sample

1st row: NIE(글쓰기)
2nd row: NIE(창의언어)
3rd row: NIE(토의)
4th row: 과학창작 케이넥스B
5th row: 노부영 스토리텔링A
Value | Count | Frequency (%)
노부영 | 5 | 6.8%
뮤지컬 | 2 | 2.7%
영화제 | 2 | 2.7%
기초탄탄 | 2 | 2.7%
어린이 | 2 | 2.7%
동화속 | 2 | 2.7%
중국어회화(초급 | 1 | 1.4%
창의미술a | 1 | 1.4%
중국어회화(중급 | 1 | 1.4%
창의미술b | 1 | 1.4%
Other values (55) | 55 | 74.3%

Most occurring characters

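Value | Count | Frequency (%)
(space) | 22 | 5.2%
( | 17 | 4.0%
) | 17 | 4.0%
(character not preserved) | 14 | 3.3%
B | 13 | 3.1%
A | 12 | 2.8%
(character not preserved) | 12 | 2.8%
(character not preserved) | 10 | 2.4%
(character not preserved) | 10 | 2.4%
(character not preserved) | 10 | 2.4%
Other values (111) | 287 | 67.7%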

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 308 | 72.6%
Uppercase Letter | 43 | 10.1%
Space Separator | 22 | 5.2%
Open Punctuation | 17 | 4.0%
Close Punctuation | 17 | 4.0%
Lowercase Letter | 11 | 2.6%
Math Symbol | 6 | 1.4%

Most frequent character per category

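Other Letter
Value | Count | Frequency (%)
(character not preserved) | 14 | 4.5%
(character not preserved) | 12 | 3.9%
(character not preserved) | 10 | 3.2%
(character not preserved) | 10 | 3.2%
(character not preserved) | 10 | 3.2%
(character not preserved) | 10 | 3.2%
(character not preserved) | 10 | 3.2%
(character not preserved) | 10 | 3.2%
(character not preserved) | 8 | 2.6%
(character not preserved) | 8 | 2.6%
Other values (88) | 206 | 66.9%

Uppercase Letter
Value | Count | Frequency (%)
B | 13 | 30.2%
A | 12 | 27.9%
E | 4 | 9.3%
I | 3 | 7.0%
N | 3 | 7.0%
C | 3 | 7.0%
D | 2 | 4.7%
L | 1 | 2.3%
R | 1 | 2.3%
W | 1 | 2.3%

Lowercase Letter
Value | Count | Frequency (%)
e | 2 | 18.2%
d | 2 | 18.2%
i | 1 | 9.1%
s | 1 | 9.1%
t | 1 | 9.1%
n | 1 | 9.1%
a | 1 | 9.1%
o | 1 | 9.1%
r | 1 | 9.1%

Space Separator
Value | Count | Frequency (%)
(space) | 22 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 17 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 17 | 100.0%

Math Symbol
Value | Count | Frequency (%)
+ | 6 | 100.0%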

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 308 | 72.6%
Common | 62 | 14.6%
Latin | 54 | 12.7%

Most frequent character per script

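Hangul
Value | Count | Frequency (%)
(character not preserved) | 14 | 4.5%
(character not preserved) | 12 | 3.9%
(character not preserved) | 10 | 3.2%
(character not preserved) | 10 | 3.2%
(character not preserved) | 10 | 3.2%
(character not preserved) | 10 | 3.2%
(character not preserved) | 10 | 3.2%
(character not preserved) | 10 | 3.2%
(character not preserved) | 8 | 2.6%
(character not preserved) | 8 | 2.6%
Other values (88) | 206 | 66.9%

Latin
Value | Count | Frequency (%)
B | 13 | 24.1%
A | 12 | 22.2%
E | 4 | 7.4%
I | 3 | 5.6%
N | 3 | 5.6%
C | 3 | 5.6%
e | 2 | 3.7%
d | 2 | 3.7%
D | 2 | 3.7%
L | 1 | 1.9%
Other values (9) | 9 | 16.7%

Common
Value | Count | Frequency (%)
(space) | 22 | 35.5%
( | 17 | 27.4%
) | 17 | 27.4%
+ | 6 | 9.7%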

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 308 | 72.6%
ASCII | 116 | 27.4%

Most frequent character per block

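ASCII
Value | Count | Frequency (%)
(space) | 22 | 19.0%
( | 17 | 14.7%
) | 17 | 14.7%
B | 13 | 11.2%
A | 12 | 10.3%
+ | 6 | 5.2%
E | 4 | 3.4%
I | 3 | 2.6%
N | 3 | 2.6%
C | 3 | 2.6%
Other values (13) | 16 | 13.8%

Hangul
Value | Count | Frequency (%)
(character not preserved) | 14 | 4.5%
(character not preserved) | 12 | 3.9%
(character not preserved) | 10 | 3.2%
(character not preserved) | 10 | 3.2%
(character not preserved) | 10 | 3.2%
(character not preserved) | 10 | 3.2%
(character not preserved) | 10 | 3.2%
(character not preserved) | 10 | 3.2%
(character not preserved) | 8 | 2.6%
(character not preserved) | 8 | 2.6%
Other values (88) | 206 | 66.9%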

시간 (time)
Categorical

HIGH CORRELATION

Distinct: 24
Distinct (%): 46.2%
Missing: 0
Missing (%): 0.0%
Memory size: 548.0 B

16:00~16:50 | 10
17:00~17:50 | 8
11:00~11:50 | 4
10:00~11:20 | 3
10:30~11:50 | 3
Other values (19) | 24

Length

Max length: 11
Median length: 11
Mean length: 10.980769
Min length: 10

Unique

Unique: 15
Unique (%): 28.8%

Sample

1st row: 16:00~16:50
2nd row: 15:00~15:50
3rd row: 17:00~17:50
4th row: 16:00~16:50
5th row: 15:00~1:50
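
The fifth sample value, `15:00~1:50`, does not follow the `HH:MM~HH:MM` pattern of the other rows and accounts for the minimum length of 10. A small validation pass can surface entries like this before 시간 is parsed. The sketch below is one way to do it, assuming every valid value is a same-day range with two-digit hours and minutes; the function name is illustrative.

```python
import re

# Two-digit hours and minutes on both sides of the tilde.
TIME_RANGE = re.compile(r"^(\d{2}):(\d{2})~(\d{2}):(\d{2})$")


def malformed_time_ranges(values):
    """Return the values that are not well-formed, increasing HH:MM~HH:MM ranges."""
    bad = []
    for value in values:
        match = TIME_RANGE.match(value)
        if match is None:
            bad.append(value)
            continue
        start_h, start_m, end_h, end_m = map(int, match.groups())
        in_range = start_h < 24 and end_h < 24 and start_m < 60 and end_m < 60
        if not in_range or (start_h, start_m) >= (end_h, end_m):
            bad.append(value)
    return bad


# The five sample rows of 시간 shown in the report.
sample_times = ["16:00~16:50", "15:00~15:50", "17:00~17:50", "16:00~16:50", "15:00~1:50"]
print(malformed_time_ranges(sample_times))  # ['15:00~1:50']
```

Whether the intended value was `15:00~15:50` cannot be determined from the report alone, so the sketch only flags the entry rather than repairing it.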

Common Values

Value | Count | Frequency (%)
16:00~16:50 | 10 | 19.2%
17:00~17:50 | 8 | 15.4%
11:00~11:50 | 4 | 7.7%
10:00~11:20 | 3 | 5.8%
10:30~11:50 | 3 | 5.8%
15:00~15:50 | 3 | 5.8%
13:00~13:50 | 2 | 3.8%
10:00~10:50 | 2 | 3.8%
19:30~20:50 | 2 | 3.8%
12:10~13:30 | 1 | 1.9%
Other values (14) | 14 | 26.9%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
16:00~16:50 | 10 | 19.2%
17:00~17:50 | 8 | 15.4%
11:00~11:50 | 4 | 7.7%
10:00~11:20 | 3 | 5.8%
10:30~11:50 | 3 | 5.8%
15:00~15:50 | 3 | 5.8%
13:00~13:50 | 2 | 3.8%
10:00~10:50 | 2 | 3.8%
19:30~20:50 | 2 | 3.8%
19:00~19:50 | 1 | 1.9%
Other values (14) | 14 | 26.9%

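요일 (day of week)
Categorical

HIGH CORRELATION

Distinct: 7
Distinct (%): 13.5%
Missing: 0
Missing (%): 0.0%
Memory size: 548.0 B

(value not preserved) | 14
(value not preserved) | 10
(value not preserved) | 10
Other values (2)

Length

Max length: 6
Median length: 1
Mean length: 1.1923077
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: (value not preserved)
2nd row: (value not preserved)
3rd row: (value not preserved)
4th row: (value not preserved)
5th row: (value not preserved)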

Common Values

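Value | Count | Frequency (%)
(value not preserved) | 14 | 26.9%
(value not preserved) | 10 | 19.2%
(value not preserved) | 10 | 19.2%
(value not preserved) | 9 | 17.3%
(value not preserved) | 5 | 9.6%
1,3주 토 | 2 | 3.8%
(value not preserved) | 2 | 3.8%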

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
(value not preserved) | 14 | 25.9%
(value not preserved) | 11 | 20.4%
(value not preserved) | 10 | 18.5%
(value not preserved) | 10 | 18.5%
(value not preserved) | 5 | 9.3%
1,3주 | 2 | 3.7%
(value not preserved) | 2 | 3.7%

대상 (target audience)
Categorical

Distinct: 16
Distinct (%): 30.8%
Missing: 0
Missing (%): 0.0%
Memory size: 548.0 B

성인 | 12
6~7세 | 11
5~6세 | 7
7세~초등 | 4
누구나(초등이상) | 4
Other values (11) | 14

Length

Max length: 9
Median length: 7.5
Mean length: 3.9038462
Min length: 2

Unique

Unique: 9
Unique (%): 17.3%
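
The two counts measure different things: Distinct is the number of different values in the column, while Unique is the number of values that occur exactly once. A minimal sketch of the distinction, using the five sample rows of 대상 as toy input (the helper name is illustrative):

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values occurring exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for count in counts.values() if count == 1)
    return distinct, unique


# The five sample rows of 대상; 6~7세 appears twice.
sample_targets = ["초1~2", "6~7세", "초3~4", "6~7세", "5~6세"]
print(distinct_and_unique(sample_targets))  # (4, 3)
```

On the full column the same logic gives 16 distinct values, 9 of which appear in only one row.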

Sample

1st row: 초1~2
2nd row: 6~7세
3rd row: 초3~4
4th row: 6~7세
5th row: 5~6세

Common Values

Value | Count | Frequency (%)
성인 | 12 | 23.1%
6~7세 | 11 | 21.2%
5~6세 | 7 | 13.5%
7세~초등 | 4 | 7.7%
누구나(초등이상) | 4 | 7.7%
초1~2 | 3 | 5.8%
초1~3 | 2 | 3.8%
초3~4 | 1 | 1.9%
초3~5 | 1 | 1.9%
초3~6 | 1 | 1.9%
Other values (6) | 6 | 11.5%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
성인 | 12 | 23.1%
6~7세 | 11 | 21.2%
5~6세 | 7 | 13.5%
7세~초등 | 4 | 7.7%
누구나(초등이상 | 4 | 7.7%
초1~2 | 3 | 5.8%
초1~3 | 2 | 3.8%
초3~4 | 1 | 1.9%
초3~5 | 1 | 1.9%
초3~6 | 1 | 1.9%
Other values (6) | 6 | 11.5%

수강료(월) (monthly fee)
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 9.6%
Missing: 0
Missing (%): 0.0%
Memory size: 548.0 B

22000 | 29
25000 | 15
10000 | 4
무료 | 3
90000 | 1

Length

Max length: 5
Median length: 5
Mean length: 4.8269231
Min length: 2

Unique

Unique: 1
Unique (%): 1.9%

Sample

1st row: 22000
2nd row: 22000
3rd row: 22000
4th row: 22000
5th row: 25000

Common Values

Value | Count | Frequency (%)
22000 | 29 | 55.8%
25000 | 15 | 28.8%
10000 | 4 | 7.7%
무료 | 3 | 5.8%
90000 | 1 | 1.9%
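
수강료(월) is typed as Categorical because the text value 무료 ("free") is mixed in with the numeric fees. If a numeric column is wanted, one option is to map 무료 to 0 before converting. The sketch below makes that assumption explicit and recomputes the mean fee from the counts in the table above; `parse_fee` is a hypothetical helper, and treating free programs as a fee of 0 is a modelling choice, not something the report prescribes.

```python
def parse_fee(raw):
    """Convert a fee string to an integer amount, treating 무료 (free) as 0."""
    text = raw.strip()
    if text == "무료":
        return 0  # assumption: a free program is recorded as a fee of 0
    return int(text.replace(",", ""))


# Value counts of 수강료(월) as listed under Common Values.
fee_counts = {"22000": 29, "25000": 15, "10000": 4, "무료": 3, "90000": 1}

total_rows = sum(fee_counts.values())
mean_fee = sum(parse_fee(value) * count for value, count in fee_counts.items()) / total_rows
print(total_rows, round(mean_fee, 2))
```

The single 90000 entry stands well apart from the other fees and is worth checking against the source file before it is used in any average.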

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
22000 | 29 | 55.8%
25000 | 15 | 28.8%
10000 | 4 | 7.7%
무료 | 3 | 5.8%
90000 | 1 | 1.9%

정원 (capacity)
Categorical

HIGH CORRELATION  IMBALANCE

Distinct: 4
Distinct (%): 7.7%
Missing: 0
Missing (%): 0.0%
Memory size: 548.0 B

12 | 45
15 | 3
7 | 2
30 | 2

Length

Max length: 2
Median length: 2
Mean length: 1.9615385
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 12
2nd row: 12
3rd row: 12
4th row: 12
5th row: 12

Common Values

Value | Count | Frequency (%)
12 | 45 | 86.5%
15 | 3 | 5.8%
7 | 2 | 3.8%
30 | 2 | 3.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
12 | 45 | 86.5%
15 | 3 | 5.8%
7 | 2 | 3.8%
30 | 2 | 3.8%

Interactions


Correlations

 | 연번 | 프로그램명 | 시간 | 요일 | 대상 | 수강료(월) | 정원
연번 | 1.000 | 1.000 | 0.000 | 0.656 | 0.613 | 0.828 | 0.486
프로그램명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
시간 | 0.000 | 1.000 | 1.000 | 0.761 | 0.561 | 0.854 | 0.974
요일 | 0.656 | 1.000 | 0.761 | 1.000 | 0.789 | 0.542 | 0.816
대상 | 0.613 | 1.000 | 0.561 | 0.789 | 1.000 | 0.589 | 0.772
수강료(월) | 0.828 | 1.000 | 0.854 | 0.542 | 0.589 | 1.000 | 0.642
정원 | 0.486 | 1.000 | 0.974 | 0.816 | 0.772 | 0.642 | 1.000

 | 정원 | 요일 | 시간 | 대상 | 수강료(월)
정원 | 1.000 | 0.698 | 0.618 | 0.401 | 0.565
요일 | 0.698 | 1.000 | 0.348 | 0.457 | 0.375
시간 | 0.618 | 0.348 | 1.000 | 0.130 | 0.473
대상 | 0.401 | 0.457 | 0.130 | 1.000 | 0.297
수강료(월) | 0.565 | 0.375 | 0.473 | 0.297 | 1.000

 | 연번 | 시간 | 요일 | 대상 | 수강료(월) | 정원
연번 | 1.000 | 0.000 | 0.390 | 0.260 | 0.458 | 0.286
시간 | 0.000 | 1.000 | 0.348 | 0.130 | 0.473 | 0.618
요일 | 0.390 | 0.348 | 1.000 | 0.457 | 0.375 | 0.698
대상 | 0.260 | 0.130 | 0.457 | 1.000 | 0.297 | 0.401
수강료(월) | 0.565 | 0.473 | 0.375 | 0.297 | 1.000 | 0.565
정원 | 0.286 | 0.618 | 0.698 | 0.401 | 0.565 | 1.000
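
The report does not say which association measure each matrix uses. As a point of reference, Cramér's V is one common coefficient for pairs of categorical columns, and the sketch below computes its plain (uncorrected) form from two lists of labels. It is an illustration under that assumption and is not expected to reproduce the figures above exactly; the toy inputs are hypothetical.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long lists of category labels."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))  # missing cells count as 0
    smaller_dim = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or smaller_dim == 0:
        return 0.0  # undefined for a constant column such as 분야; 0.0 by convention here
    chi2 = 0.0
    for row, row_total in row_totals.items():
        for col, col_total in col_totals.items():
            expected = row_total * col_total / n
            chi2 += (observed[(row, col)] - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * smaller_dim))


print(cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"]))  # 1.0 (perfect association)
print(cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"]))  # 0.0 (independent)
```

Under this uncorrected form, a column whose values are all different scores 1.0 against any other non-constant column, which is worth keeping in mind when reading the 프로그램명 row. With only 52 rows and many sparse categories, high values for the other pairs should also be read with caution.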

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

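 | 연번 | 분야 | 프로그램명 | 시간 | 요일 | 대상 | 수강료(월) | 정원
0 | 1 | 문화 | NIE(글쓰기) | 16:00~16:50 |  | 초1~2 | 22000 | 12
1 | 2 | 문화 | NIE(창의언어) | 15:00~15:50 |  | 6~7세 | 22000 | 12
2 | 3 | 문화 | NIE(토의) | 17:00~17:50 |  | 초3~4 | 22000 | 12
3 | 4 | 문화 | 과학창작 케이넥스B | 16:00~16:50 |  | 6~7세 | 22000 | 12
4 | 5 | 문화 | 노부영 스토리텔링A | 15:00~1:50 |  | 5~6세 | 25000 | 12
5 | 6 | 문화 | 노부영 스토리텔링B | 16:00~16:50 |  | 6~7세 | 25000 | 12
6 | 7 | 문화 | 노부영 스토리텔링C | 17:00~17:50 |  | 6~7세 | 25000 | 12
7 | 8 | 문화 | 노부영 스토리텔링D | 13:00~13:50 |  | 5~6세 | 25000 | 12
8 | 9 | 문화 | 노부영 스토리텔링E | 14:00~14:50 |  | 6~7세 | 25000 | 12
9 | 10 | 문화 | 놀이수학A | 16:00~16:50 |  | 5~6세 | 22000 | 12

 | 연번 | 분야 | 프로그램명 | 시간 | 요일 | 대상 | 수강료(월) | 정원
42 | 43 | 문화 | 뮤지컬 잉글리시A | 16:00~16:50 |  | 6세 | 22000 | 12
43 | 44 | 문화 | 뮤지컬 잉글리시B | 17:00~17:50 |  | 7세 | 22000 | 12
44 | 45 | 문화 | 오카리나A | 11:00~11:50 |  | 성인 | 10000 | 12
45 | 46 | 문화 | 오카리나B | 11:00~11:50 |  | 누구나(초등이상) | 10000 | 12
46 | 47 | 문화 | 캘리그라피(자격증과정) | 10:20~12:20 |  | 성인 | 25000 | 15
47 | 48 | 문화 | 플룻A | 18:00~18:50 |  | 누구나(초등이상) | 25000 | 12
48 | 49 | 문화 | 플룻B | 19:00~19:50 |  | 누구나(초등이상) | 25000 | 12
49 | 50 | 문화 | 예비 중학생을 위한 문학 작품 읽기 | 17:00~18:20 |  | 초6 | 무료 | 12
50 | 51 | 문화 | 어린이 영화제 | 13:00~15:00 |  | 유아,어린이 | 무료 | 30
51 | 52 | 문화 | 도서관 영화제 | 16:00~18:00 |  | 성인 | 무료 | 30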