Overview

Dataset statistics

Number of variables: 7
Number of observations: 35
Missing cells: 35
Missing cells (%): 14.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.1 KiB
Average record size in memory: 61.8 B
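
As a quick consistency check, the missing-cell percentage follows from the counts above: 35 missing cells out of 7 × 35 cells. A minimal Python sketch (the variable names are illustrative, not taken from the report):

```python
# Counts taken from the dataset statistics above.
n_variables = 7
n_observations = 35
missing_cells = 35

total_cells = n_variables * n_observations  # every column has one cell per row
missing_pct = 100 * missing_cells / total_cells

print(f"{missing_pct:.1f}%")
```

According to the alerts in this report, all 35 missing cells belong to the single column Unnamed: 5.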

Variable types

Numeric: 1
Categorical: 2
Text: 3
Unsupported: 1

Dataset

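Description: 2017 status of private academies and tutoring centers within the Seongju jurisdiction (academy name, address, phone number)
Author: Gyeongsangbuk-do Office of Education, Gyeongsangbuk-do Seongju Office of Education (경상북도교육청 경상북도성주교육지원청)
URL: https://www.data.go.kr/data/15053328/fileData.do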

Alerts

연번 is highly overall correlated with 구분 [High correlation]
구분 is highly overall correlated with 연번 [High correlation]
비고 is highly imbalanced (56.2%) [Imbalance]
Unnamed: 5 has 35 (100.0%) missing values [Missing]
연번 has unique values [Unique]
학원(교습소)명 has unique values [Unique]
Unnamed: 5 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]

Reproduction

Analysis started: 2023-12-12 14:50:01.528284
Analysis finished: 2023-12-12 14:50:02.092604
Duration: 0.56 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
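
A report of this kind can be regenerated along the following lines. This is only a sketch: the helper name `build_report`, the CSV input, the default `cp949` encoding and the output file name are assumptions, and the exact settings used for this report live in config.json, which is not reproduced here.

```python
def build_report(csv_path, output_path="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so the function can be defined without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the source is a CSV file; the encoding is a guess and may need changing.
    df = pd.read_csv(csv_path, encoding=encoding)

    report = ProfileReport(
        df,
        title="Profiling Report",
        # Dataset metadata as listed in the Dataset section of this report.
        dataset={
            "description": "2017년 성주관내 학원교습소 현황(학원명, 주소, 전화번호)",
            "author": "경상북도교육청 경상북도성주교육지원청",
            "url": "https://www.data.go.kr/data/15053328/fileData.do",
        },
    )
    report.to_file(output_path)
    return report
```

Calling `build_report("academies.csv")` with a local copy of the data (the file name is hypothetical) would write `report.html` next to the script.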

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 35
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18
Minimum: 1
Maximum: 35
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 447.0 B

Quantile statistics

Minimum: 1
5-th percentile: 2.7
Q1: 9.5
Median: 18
Q3: 26.5
95-th percentile: 33.3
Maximum: 35
Range: 34
Interquartile range (IQR): 17
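
The quantiles above are consistent with linear interpolation between neighbouring order statistics. The sketch below assumes that 연번 is the integer sequence 1 to 35, which the distinct count, range and sum in this report suggest; `linear_quantile` is an illustrative helper, not part of the profiling library.

```python
def linear_quantile(sorted_values, q):
    """Quantile by linear interpolation between the two nearest order statistics."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

values = list(range(1, 36))  # assumed 연번 values: 1, 2, ..., 35

for q in (0.05, 0.25, 0.5, 0.75, 0.95):
    print(q, round(linear_quantile(values, q), 1))

iqr = linear_quantile(values, 0.75) - linear_quantile(values, 0.25)
print("IQR", iqr)
```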

Descriptive statistics

Standard deviation: 10.246951
Coefficient of variation (CV): 0.56927504
Kurtosis: -1.2
Mean: 18
Median Absolute Deviation (MAD): 9
Skewness: 0
Sum: 630
Variance: 105
Monotonicity: Strictly increasing
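
Several of these figures can be checked with the Python standard library. The sketch assumes, as the range, distinct count and sum suggest, that 연번 is simply 1 to 35, and it uses the sample (n - 1) variance, which matches the reported value of 105.

```python
import statistics

values = list(range(1, 36))  # assumed 연번 values: 1, 2, ..., 35

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)
cv = std_dev / mean                     # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)  # median absolute deviation

print(sum(values), mean, variance, round(std_dev, 6), round(cv, 8), mad)
```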
Histogram with fixed size bins (bins=35)
Common values

Value | Count | Frequency (%)
1 | 1 | 2.9%
2 | 1 | 2.9%
21 | 1 | 2.9%
22 | 1 | 2.9%
23 | 1 | 2.9%
24 | 1 | 2.9%
25 | 1 | 2.9%
26 | 1 | 2.9%
27 | 1 | 2.9%
28 | 1 | 2.9%
Other values (25) | 25 | 71.4%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 2.9%
2 | 1 | 2.9%
3 | 1 | 2.9%
4 | 1 | 2.9%
5 | 1 | 2.9%
6 | 1 | 2.9%
7 | 1 | 2.9%
8 | 1 | 2.9%
9 | 1 | 2.9%
10 | 1 | 2.9%

Maximum 10 values

Value | Count | Frequency (%)
35 | 1 | 2.9%
34 | 1 | 2.9%
33 | 1 | 2.9%
32 | 1 | 2.9%
31 | 1 | 2.9%
30 | 1 | 2.9%
29 | 1 | 2.9%
28 | 1 | 2.9%
27 | 1 | 2.9%
26 | 1 | 2.9%

구분 (category)
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 5.7%
Missing: 0
Missing (%): 0.0%
Memory size: 412.0 B

학원: 28
교습소: 7

Length

Max length: 3
Median length: 2
Mean length: 2.2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 학원
2nd row: 학원
3rd row: 학원
4th row: 학원
5th row: 학원

Common Values

Value | Count | Frequency (%)
학원 (academy) | 28 | 80.0%
교습소 (tutoring center) | 7 | 20.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
학원 | 28 | 80.0%
교습소 | 7 | 20.0%

학원(교습소)명 (academy or tutoring center name)
Text

UNIQUE 

Distinct: 35
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 412.0 B

Length

Max length: 14
Median length: 10
Mean length: 7.9142857
Min length: 4

Characters and Unicode

Total characters: 277
Distinct characters: 102
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
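
To illustrate how such per-category counts can be derived, the sketch below classifies each character of one academy name from this column with Python's standard `unicodedata` module. The helper name `category_counts` is illustrative. The two-letter codes map to the category names used in this report: `Lo` is Other Letter (Hangul syllables fall here), `Lu` is Uppercase Letter, `Ll` is Lowercase Letter, `Zs` is Space Separator and `Po` is Other Punctuation.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

# One value from the 학원(교습소)명 column.
counts = category_counts("GnB영어전문학원")
print(counts)
```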

Unique

Unique: 35
Unique (%): 100.0%

Sample

1st row: BB음악학원
2nd row: GGE지니어스입시학원
3rd row: GnB영어전문학원
4th row: 국제왕수학교실
5th row: 뉴아이들무용학원
Value | Count | Frequency (%)
bb음악학원 | 1 | 2.4%
피아노이야기학원 | 1 | 2.4%
학이학원 | 1 | 2.4%
한빛입시학원 | 1 | 2.4%
한솔외국어학원 | 1 | 2.4%
해법수학학원 | 1 | 2.4%
수학의 | 1 | 2.4%
달인 | 1 | 2.4%
성주학원 | 1 | 2.4%
생각나무학원 | 1 | 2.4%
Other values (32) | 32 | 76.2%

Most occurring characters

ValueCountFrequency (%)
35
 
12.6%
27
 
9.7%
9
 
3.2%
8
 
2.9%
8
 
2.9%
7
 
2.5%
7
 
2.5%
7
 
2.5%
7
 
2.5%
7
 
2.5%
Other values (92) 155
56.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 258 | 93.1%
Space Separator | 7 | 2.5%
Uppercase Letter | 7 | 2.5%
Lowercase Letter | 4 | 1.4%
Other Punctuation | 1 | 0.4%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
35
 
13.6%
27
 
10.5%
9
 
3.5%
8
 
3.1%
8
 
3.1%
7
 
2.7%
7
 
2.7%
7
 
2.7%
7
 
2.7%
6
 
2.3%
Other values (84) 137
53.1%
Uppercase Letter
Value | Count | Frequency (%)
B | 3 | 42.9%
G | 3 | 42.9%
E | 1 | 14.3%
Lowercase Letter
Value | Count | Frequency (%)
n | 2 | 50.0%
e | 1 | 25.0%
o | 1 | 25.0%
Space Separator
Value | Count | Frequency (%)
(space) | 7 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
& | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 258 | 93.1%
Latin | 11 | 4.0%
Common | 8 | 2.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
35
 
13.6%
27
 
10.5%
9
 
3.5%
8
 
3.1%
8
 
3.1%
7
 
2.7%
7
 
2.7%
7
 
2.7%
7
 
2.7%
6
 
2.3%
Other values (84) 137
53.1%
Latin
Value | Count | Frequency (%)
B | 3 | 27.3%
G | 3 | 27.3%
n | 2 | 18.2%
e | 1 | 9.1%
o | 1 | 9.1%
E | 1 | 9.1%
Common
Value | Count | Frequency (%)
(space) | 7 | 87.5%
& | 1 | 12.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 258 | 93.1%
ASCII | 19 | 6.9%

Most frequent character per block

Hangul
ValueCountFrequency (%)
35
 
13.6%
27
 
10.5%
9
 
3.5%
8
 
3.1%
8
 
3.1%
7
 
2.7%
7
 
2.7%
7
 
2.7%
7
 
2.7%
6
 
2.3%
Other values (84) 137
53.1%
ASCII
Value | Count | Frequency (%)
(space) | 7 | 36.8%
B | 3 | 15.8%
G | 3 | 15.8%
n | 2 | 10.5%
e | 1 | 5.3%
o | 1 | 5.3%
E | 1 | 5.3%
& | 1 | 5.3%

주소 (address)
Text

Distinct: 32
Distinct (%): 91.4%
Missing: 0
Missing (%): 0.0%
Memory size: 412.0 B

Length

Max length: 34
Median length: 29
Mean length: 26.457143
Min length: 19

Characters and Unicode

Total characters: 926
Distinct characters: 37
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 30
Unique (%): 85.7%

Sample

1st row: 경상북도 성주군 성주읍 성주읍4길 4 2층
2nd row: 경상북도 성주군 성주읍 성주로 3265 2층
3rd row: 경상북도 성주군 성주읍 성산6길 2 , 2층 (성주읍)
4th row: 경상북도 성주군 성주읍 성주읍1길 7-7
5th row: 경상북도 성주군 성주읍 성주읍2길 36-3 , 2층 (성주읍)
ValueCountFrequency (%)
성주읍 46
21.5%
경상북도 34
15.9%
성주군 34
15.9%
2층 14
 
6.5%
성주읍2길 9
 
4.2%
8
 
3.7%
성주로 6
 
2.8%
성주읍4길 5
 
2.3%
성주읍3길 4
 
1.9%
36-3 3
 
1.4%
Other values (43) 51
23.8%

Most occurring characters

ValueCountFrequency (%)
186
20.1%
110
11.9%
108
11.7%
67
 
7.2%
2 39
 
4.2%
37
 
4.0%
36
 
3.9%
35
 
3.8%
35
 
3.8%
35
 
3.8%
Other values (27) 238
25.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 544 | 58.7%
Space Separator | 186 | 20.1%
Decimal Number | 137 | 14.8%
Dash Punctuation | 18 | 1.9%
Open Punctuation | 16 | 1.7%
Close Punctuation | 16 | 1.7%
Other Punctuation | 9 | 1.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
110
20.2%
108
19.9%
67
12.3%
37
 
6.8%
36
 
6.6%
35
 
6.4%
35
 
6.4%
35
 
6.4%
29
 
5.3%
18
 
3.3%
Other values (11) 34
 
6.2%
Decimal Number
Value | Count | Frequency (%)
2 | 39 | 28.5%
3 | 27 | 19.7%
1 | 23 | 16.8%
4 | 14 | 10.2%
6 | 8 | 5.8%
5 | 8 | 5.8%
8 | 7 | 5.1%
7 | 6 | 4.4%
9 | 3 | 2.2%
0 | 2 | 1.5%
Other Punctuation
Value | Count | Frequency (%)
, | 8 | 88.9%
. | 1 | 11.1%
Space Separator
Value | Count | Frequency (%)
(space) | 186 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 18 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 16 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 16 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 544 | 58.7%
Common | 382 | 41.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
110
20.2%
108
19.9%
67
12.3%
37
 
6.8%
36
 
6.6%
35
 
6.4%
35
 
6.4%
35
 
6.4%
29
 
5.3%
18
 
3.3%
Other values (11) 34
 
6.2%
Common
ValueCountFrequency (%)
186
48.7%
2 39
 
10.2%
3 27
 
7.1%
1 23
 
6.0%
- 18
 
4.7%
( 16
 
4.2%
) 16
 
4.2%
4 14
 
3.7%
, 8
 
2.1%
6 8
 
2.1%
Other values (6) 27
 
7.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 544 | 58.7%
ASCII | 382 | 41.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
186
48.7%
2 39
 
10.2%
3 27
 
7.1%
1 23
 
6.0%
- 18
 
4.7%
( 16
 
4.2%
) 16
 
4.2%
4 14
 
3.7%
, 8
 
2.1%
6 8
 
2.1%
Other values (6) 27
 
7.1%
Hangul
ValueCountFrequency (%)
110
20.2%
108
19.9%
67
12.3%
37
 
6.8%
36
 
6.6%
35
 
6.4%
35
 
6.4%
35
 
6.4%
29
 
5.3%
18
 
3.3%
Other values (11) 34
 
6.2%

전 화 번 호 (phone number)
Text

Distinct: 25
Distinct (%): 71.4%
Missing: 0
Missing (%): 0.0%
Memory size: 412.0 B

Length

Max length: 13
Median length: 12
Mean length: 10.457143
Min length: 7

Characters and Unicode

Total characters: 366
Distinct characters: 18
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 24
Unique (%): 68.6%

Sample

1st row: 054-933-6879
2nd row: 054-931-4142
3rd row: 054-932-0594
4th row: 054-933-3929
5th row: 054-933-5174
Value | Count | Frequency (%)
유선전화 | 11 | 23.9%
없음 | 11 | 23.9%
054-933-3678 | 1 | 2.2%
070-4140-4627 | 1 | 2.2%
054-931-7955 | 1 | 2.2%
054-933-7890 | 1 | 2.2%
054-931-7813 | 1 | 2.2%
054-933-0509 | 1 | 2.2%
054-933-0231 | 1 | 2.2%
054-932-2126 | 1 | 2.2%
Other values (16) | 16 | 34.8%
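
Eleven entries in this column hold the placeholder text 유선전화 없음 ("no landline") instead of a number, so the column mixes real phone numbers with free text. One possible cleaning step is sketched below; the helper `normalize_phone` and the accepted number pattern are assumptions based on the sample values in this report, not part of the dataset or the profiling library.

```python
import re

# Matches values such as 054-933-6879 or 070-4140-4627.
PHONE_PATTERN = re.compile(r"^\d{2,3}-\d{3,4}-\d{4}$")
NO_LANDLINE = "유선전화 없음"  # placeholder used in the data ("no landline")

def normalize_phone(value):
    """Return the phone number if it matches the pattern, otherwise None."""
    value = value.strip()
    if value == NO_LANDLINE:
        return None
    return value if PHONE_PATTERN.match(value) else None

samples = ["054-933-6879", "070-4140-4627", "유선전화 없음"]
print([normalize_phone(s) for s in samples])
```

Mapping the placeholder to a missing value would let a later profiling run report the column's true missing rate instead of treating the text as a phone number.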

Most occurring characters

ValueCountFrequency (%)
- 48
13.1%
3 44
12.0%
4 36
9.8%
9 36
9.8%
0 34
9.3%
5 28
 
7.7%
1 20
 
5.5%
2 15
 
4.1%
7 12
 
3.3%
11
 
3.0%
Other values (8) 82
22.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 241 | 65.8%
Other Letter | 66 | 18.0%
Dash Punctuation | 48 | 13.1%
Space Separator | 11 | 3.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
3 | 44 | 18.3%
4 | 36 | 14.9%
9 | 36 | 14.9%
0 | 34 | 14.1%
5 | 28 | 11.6%
1 | 20 | 8.3%
2 | 15 | 6.2%
7 | 12 | 5.0%
6 | 9 | 3.7%
8 | 7 | 2.9%
Other Letter
ValueCountFrequency (%)
11
16.7%
11
16.7%
11
16.7%
11
16.7%
11
16.7%
11
16.7%
Dash Punctuation
Value | Count | Frequency (%)
- | 48 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 11 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 300 | 82.0%
Hangul | 66 | 18.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 48
16.0%
3 44
14.7%
4 36
12.0%
9 36
12.0%
0 34
11.3%
5 28
9.3%
1 20
6.7%
2 15
 
5.0%
7 12
 
4.0%
11
 
3.7%
Other values (2) 16
 
5.3%
Hangul
ValueCountFrequency (%)
11
16.7%
11
16.7%
11
16.7%
11
16.7%
11
16.7%
11
16.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 300 | 82.0%
Hangul | 66 | 18.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 48
16.0%
3 44
14.7%
4 36
12.0%
9 36
12.0%
0 34
11.3%
5 28
9.3%
1 20
6.7%
2 15
 
5.0%
7 12
 
4.0%
11
 
3.7%
Other values (2) 16
 
5.3%
Hangul
ValueCountFrequency (%)
11
16.7%
11
16.7%
11
16.7%
11
16.7%
11
16.7%
11
16.7%

Unnamed: 5
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 35
Missing (%): 100.0%
Memory size: 447.0 B

비고 (remarks)
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): 8.6%
Missing: 0
Missing (%): 0.0%
Memory size: 412.0 B

<NA>: 30
개원: 4
교습소명 변경: 1

Length

Max length: 7
Median length: 4
Mean length: 3.8571429
Min length: 2

Unique

Unique: 1
Unique (%): 2.9%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 30 | 85.7%
개원 (newly opened) | 4 | 11.4%
교습소명 변경 (tutoring center name changed) | 1 | 2.9%
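
The 56.2% imbalance reported for 비고 in the alerts can be reproduced from these counts as one minus the Shannon entropy of the value distribution, normalized by the logarithm of the number of distinct values. The sketch below is offered as a plausible reading of that figure, not as a statement of the profiling library's exact implementation; `imbalance_score` is an illustrative name.

```python
import math

def imbalance_score(counts):
    """One minus Shannon entropy, normalized by log2 of the number of classes."""
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(len(counts))

bigo_counts = [30, 4, 1]   # 비고: <NA>, 개원, 교습소명 변경
gubun_counts = [28, 7]     # 구분: 학원, 교습소

print(f"{imbalance_score(bigo_counts):.1%}")
print(f"{imbalance_score(gubun_counts):.1%}")
```

By the same formula, 구분 scores far lower than 비고, which fits the fact that the report raises an imbalance alert only for 비고.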

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 30 | 83.3%
개원 | 4 | 11.1%
교습소명 | 1 | 2.8%
변경 | 1 | 2.8%

Interactions


Correlations

Variable | 연번 | 구분 | 학원(교습소)명 | 주소 | 전 화 번 호 | 비고
연번 | 1.000 | 1.000 | 1.000 | 0.843 | 0.809 | 0.000
구분 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000
학원(교습소)명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
주소 | 0.843 | 0.000 | 1.000 | 1.000 | 0.957 | 1.000
전 화 번 호 | 0.809 | 0.000 | 1.000 | 0.957 | 1.000 | 1.000
비고 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000

Variable | 구분 | 비고
구분 | 1.000 | 0.000
비고 | 0.000 | 1.000

Variable | 연번 | 구분 | 비고
연번 | 1.000 | 0.870 | 0.000
구분 | 0.870 | 1.000 | 0.000
비고 | 0.000 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 연번 | 구분 | 학원(교습소)명 | 주소 | 전 화 번 호 | Unnamed: 5 | 비고
0 | 1 | 학원 | BB음악학원 | 경상북도 성주군 성주읍 성주읍4길 4 2층 | 054-933-6879 | <NA> | <NA>
1 | 2 | 학원 | GGE지니어스입시학원 | 경상북도 성주군 성주읍 성주로 3265 2층 | 054-931-4142 | <NA> | <NA>
2 | 3 | 학원 | GnB영어전문학원 | 경상북도 성주군 성주읍 성산6길 2 , 2층 (성주읍) | 054-932-0594 | <NA> | <NA>
3 | 4 | 학원 | 국제왕수학교실 | 경상북도 성주군 성주읍 성주읍1길 7-7 | 054-933-3929 | <NA> | <NA>
4 | 5 | 학원 | 뉴아이들무용학원 | 경상북도 성주군 성주읍 성주읍2길 36-3 , 2층 (성주읍) | 054-933-5174 | <NA> | <NA>
5 | 6 | 학원 | 대산학원 | 경상북도성주군 성주읍 성주읍2길 19 | 054-932-1314 | <NA> | <NA>
6 | 7 | 학원 | 상지입시학원 | 경상북도 성주군 선남면 도성3길 8-4번지 | 054-933-3969 | <NA> | <NA>
7 | 8 | 학원 | 성주다올학원 | 경상북도 성주군 성주읍 성주읍3길 4-1 , 2층 (성주읍) | 유선전화 없음 | <NA> | <NA>
8 | 9 | 학원 | 소나타음악전문학원 | 경상북도 성주군 성주읍 성주읍 3길 15-5 2층 | 054-931-4321 | <NA> | <NA>
9 | 10 | 학원 | 솔로몬영수학원 | 경상북도 성주군 성주읍 성주로 3218 3층 | 054-932-8133 | <NA> | <NA>

Last rows

Index | 연번 | 구분 | 학원(교습소)명 | 주소 | 전 화 번 호 | Unnamed: 5 | 비고
25 | 26 | 학원 | 생각나무학원 | 경상북도 성주군 성주읍 성주읍2길 33 | 유선전화 없음 | <NA> | 개원
26 | 27 | 학원 | 으뜸one수학학원 | 경상북도 성주군 성주읍 성주로 3236(성주읍) | 054-931-7955 | <NA> | 개원
27 | 28 | 학원 | 김쌤 스터디 학원 | 경상북도 성주군 초전면 대장길 115.2층 | 유선전화 없음 | <NA> | 개원
28 | 29 | 교습소 | 한우리독서토론논술성주교습소 | 경상북도 성주군 성주읍 성주읍2길 17 , 2층 (성주읍) | 070-4140-4627 | <NA> | <NA>
29 | 30 | 교습소 | 뉴아이들미술교습소 | 경상북도 성주군 성주읍 성주읍2길 36-3 , 2층 (성주읍) | 유선전화 없음 | <NA> | <NA>
30 | 31 | 교습소 | 뉴아이들수학교습소 | 경상북도 성주군 성주읍 성주읍2길 36-3 , 2층 (성주읍) | 유선전화 없음 | <NA> | <NA>
31 | 32 | 교습소 | 톡톡영어전문교습소 | 경상북도 성주군 초전면 대장길 119-7 (초전면) | 054-931-8797 | <NA> | 교습소명 변경
32 | 33 | 교습소 | 아이공부방교습소 | 경상북도 성주군 성주읍 성주읍1길 8-12 (성주읍) | 유선전화 없음 | <NA> | <NA>
33 | 34 | 교습소 | 다이룸 영어 교습소 | 경상북도 성주군 성주읍 성주로 3233 (성주읍) | 유선전화 없음 | <NA> | <NA>
34 | 35 | 교습소 | 드림수학교습소 | 경상북도 성주군 성주읍 성주읍4길 18-8 | 유선전화 없음 | <NA> | <NA>