Overview

Dataset statistics

Number of variables: 4
Number of observations: 499
Missing cells: 545
Missing cells (%): 27.3%
Duplicate rows: 1
Duplicate rows (%): 0.2%
Total size in memory: 16.7 KiB
Average record size in memory: 34.3 B
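The percentages above follow from the counts. As a quick sanity check, here is a minimal sketch in plain Python that recomputes them from the figures in this overview and the per-column missing counts listed under Alerts. The variable names are illustrative and not part of the report.

```python
# Figures copied from the report; names are illustrative only.
n_variables = 4
n_observations = 499
missing_per_column = {
    "학원명": 135,
    "주 소": 136,
    "총부지면적(㎡)": 137,
    "교육생정원(명)": 137,
}

total_cells = n_variables * n_observations        # 4 * 499 cells
missing_cells = sum(missing_per_column.values())  # should match "Missing cells"
missing_pct = 100 * missing_cells / total_cells

# Average record size: total memory divided by the number of rows.
total_bytes = 16.7 * 1024                         # 16.7 KiB as reported (rounded)
avg_record_bytes = total_bytes / n_observations

print(missing_cells, f"{missing_pct:.1f}%", f"{avg_record_bytes:.1f} B")
```

The recomputed values agree with the reported 545 missing cells, 27.3%, and 34.3 B.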

Variable types

Text: 2
Numeric: 2

Dataset

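Description: List of specialized driving schools (자동차 운전전문학원) located nationwide as of Q4 2020 (end of December), with school name, address, site area, and trainee capacity.
Author: 경찰청 (Korean National Police Agency)
URL: https://www.data.go.kr/data/15029970/fileData.do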

Alerts

Dataset has 1 (0.2%) duplicate rows [Duplicates]
학원명 (school name) has 135 (27.1%) missing values [Missing]
주 소 (address) has 136 (27.3%) missing values [Missing]
총부지면적(㎡) (total site area, ㎡) has 137 (27.5%) missing values [Missing]
교육생정원(명) (trainee capacity, persons) has 137 (27.5%) missing values [Missing]

Reproduction

Analysis started: 2023-12-12 22:28:44.090693
Analysis finished: 2023-12-12 22:28:45.110275
Duration: 1.02 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
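A report of this kind can be regenerated with ydata-profiling. The sketch below is an outline under stated assumptions, not the configuration actually used: the helper name `build_report`, the CSV path, the `cp949` encoding, and the report title are all hypothetical, and the original `config.json` is not reproduced here.

```python
def build_report(csv_path, output_path="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report (sketch).

    csv_path and encoding are assumptions; adjust them to the file
    downloaded from the data.go.kr page listed in the Dataset section.
    """
    # Imported lazily so this module loads even where the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    profile = ProfileReport(df, title="Driving school list (2020 Q4)")
    profile.to_file(output_path)
    return profile
```

Calling `build_report("schools.csv")` (a hypothetical file name) would write `report.html` next to the script, provided pandas and ydata-profiling are installed.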

Variables

학원명
Text

MISSING 

Distinct: 334
Distinct (%): 91.8%
Missing: 135
Missing (%): 27.1%
Memory size: 4.0 KiB

Length

Max length: 8
Median length: 2
Mean length: 2.989011
Min length: 1

Characters and Unicode

Total characters: 1088
Distinct characters: 189
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
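To illustrate how such character properties can be read, here is a minimal sketch using Python's standard `unicodedata` module. It counts Unicode general categories, which the report lists by long name (`Lo` is Other Letter, `Zs` Space Separator, `Nd` Decimal Number, `Po` Other Punctuation, `Ps` Open Punctuation, `Pe` Close Punctuation). The helper name `category_counts` is illustrative, and this is not necessarily how the profiling tool computes its figures.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count Unicode general categories (e.g. 'Lo', 'Nd') in a string."""
    return Counter(unicodedata.category(ch) for ch in text)


# Sample value taken from the 주 소 column of this report.
sample = "서울 강남구 헌릉로 733, (세곡동, 광일자동차운전학원)"
counts = category_counts(sample)

# Print categories from most to least frequent.
for code, n in counts.most_common():
    print(code, n)
```

For this one address, Hangul syllables (`Lo`) dominate, followed by spaces and digits, which mirrors the category ranking reported for the address column as a whole.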

Unique

Unique: 313
Unique (%): 86.0%

Sample

1st row: 광연
2nd row: 삼일
3rd row: 창동
4th row: 녹천
5th row: 사당
Value | Count | Frequency (%)
신진 | 6 | 1.6%
대성 | 4 | 1.1%
쌍용 | 3 | 0.8%
대우 | 3 | 0.8%
신세계 | 3 | 0.8%
한일 | 3 | 0.8%
신삼성 | 2 | 0.5%
영동 | 2 | 0.5%
삼성 | 2 | 0.5%
제일 | 2 | 0.5%
Other values (324) | 336 | 91.8%

Most occurring characters

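Value | Count | Frequency (%)
( | 57 | 5.2%
) | 57 | 5.2%
[not preserved] | 45 | 4.1%
[not preserved] | 44 | 4.0%
[not preserved] | 37 | 3.4%
[not preserved] | 35 | 3.2%
[not preserved] | 30 | 2.8%
[not preserved] | 30 | 2.8%
[not preserved] | 28 | 2.6%
[not preserved] | 22 | 2.0%
Other values (179) | 703 | 64.6%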

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 960 | 88.2%
Open Punctuation | 57 | 5.2%
Close Punctuation | 57 | 5.2%
Uppercase Letter | 6 | 0.6%
Decimal Number | 4 | 0.4%
Space Separator | 2 | 0.2%
Lowercase Letter | 2 | 0.2%
0.2%

Most frequent character per category

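Other Letter
Value | Count | Frequency (%)
[not preserved] | 45 | 4.7%
[not preserved] | 44 | 4.6%
[not preserved] | 37 | 3.9%
[not preserved] | 35 | 3.6%
[not preserved] | 30 | 3.1%
[not preserved] | 30 | 3.1%
[not preserved] | 28 | 2.9%
[not preserved] | 22 | 2.3%
[not preserved] | 21 | 2.2%
[not preserved] | 21 | 2.2%
Other values (167) | 647 | 67.4%

Uppercase Letter
Value | Count | Frequency (%)
C | 1 | 16.7%
K | 1 | 16.7%
O | 1 | 16.7%
W | 1 | 16.7%
E | 1 | 16.7%
N | 1 | 16.7%

Decimal Number
Value | Count | Frequency (%)
2 | 3 | 75.0%
1 | 1 | 25.0%

Open Punctuation
Value | Count | Frequency (%)
( | 57 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 57 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 2 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
e | 2 | 100.0%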

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 960 | 88.2%
Common | 120 | 11.0%
Latin | 8 | 0.7%

Most frequent character per script

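Hangul
Value | Count | Frequency (%)
[not preserved] | 45 | 4.7%
[not preserved] | 44 | 4.6%
[not preserved] | 37 | 3.9%
[not preserved] | 35 | 3.6%
[not preserved] | 30 | 3.1%
[not preserved] | 30 | 3.1%
[not preserved] | 28 | 2.9%
[not preserved] | 22 | 2.3%
[not preserved] | 21 | 2.2%
[not preserved] | 21 | 2.2%
Other values (167) | 647 | 67.4%

Latin
Value | Count | Frequency (%)
e | 2 | 25.0%
C | 1 | 12.5%
K | 1 | 12.5%
O | 1 | 12.5%
W | 1 | 12.5%
E | 1 | 12.5%
N | 1 | 12.5%

Common
Value | Count | Frequency (%)
( | 57 | 47.5%
) | 57 | 47.5%
2 | 3 | 2.5%
(space) | 2 | 1.7%
1 | 1 | 0.8%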

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 960 | 88.2%
ASCII | 128 | 11.8%

Most frequent character per block

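ASCII
Value | Count | Frequency (%)
( | 57 | 44.5%
) | 57 | 44.5%
2 | 3 | 2.3%
(space) | 2 | 1.6%
e | 2 | 1.6%
C | 1 | 0.8%
1 | 1 | 0.8%
K | 1 | 0.8%
O | 1 | 0.8%
W | 1 | 0.8%
Other values (2) | 2 | 1.6%

Hangul
Value | Count | Frequency (%)
[not preserved] | 45 | 4.7%
[not preserved] | 44 | 4.6%
[not preserved] | 37 | 3.9%
[not preserved] | 35 | 3.6%
[not preserved] | 30 | 3.1%
[not preserved] | 30 | 3.1%
[not preserved] | 28 | 2.9%
[not preserved] | 22 | 2.3%
[not preserved] | 21 | 2.2%
[not preserved] | 21 | 2.2%
Other values (167) | 647 | 67.4%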

주 소
Text

MISSING 

Distinct: 362
Distinct (%): 99.7%
Missing: 136
Missing (%): 27.3%
Memory size: 4.0 KiB

Length

Max length: 46
Median length: 36
Mean length: 26.721763
Min length: 11

Characters and Unicode

Total characters: 9700
Distinct characters: 323
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 361
Unique (%): 99.4%

Sample

1st row: 서울 강남구 헌릉로 733, (세곡동, 광일자동차운전학원)
2nd row: 서울 강남구 헌릉로 736, (세곡동, 삼일자동차운전학원)
3rd row: 서울 도봉구 도봉로136다길 4, (창동, 창동운전전문학원)
4th row: 서울 노원구 마들로5길 91, (월계동, 녹천자동차운전학원)
5th row: 서울 서초구 과천대로 904-6, (방배동, 사당자동차운전면허학원)
Value | Count | Frequency (%)
경기 | 70 | 3.5%
경북 | 40 | 2.0%
경남 | 32 | 1.6%
충남 | 27 | 1.4%
전남 | 26 | 1.3%
전북 | 23 | 1.2%
강원 | 23 | 1.2%
충북 | 20 | 1.0%
부산 | 18 | 0.9%
인천 | 17 | 0.9%
Other values (1286) | 1697 | 85.1%

Most occurring characters

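Value | Count | Frequency (%)
(space) | 1630 | 16.8%
[not preserved] | 405 | 4.2%
, | 342 | 3.5%
) | 254 | 2.6%
( | 254 | 2.6%
[not preserved] | 253 | 2.6%
[not preserved] | 250 | 2.6%
1 | 241 | 2.5%
[not preserved] | 230 | 2.4%
[not preserved] | 211 | 2.2%
Other values (313) | 5630 | 58.0%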

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 5785 | 59.6%
Space Separator | 1630 | 16.8%
Decimal Number | 1314 | 13.5%
Other Punctuation | 343 | 3.5%
Close Punctuation | 254 | 2.6%
Open Punctuation | 254 | 2.6%
Dash Punctuation | 110 | 1.1%
Lowercase Letter | 5 | 0.1%
Uppercase Letter | 4 | < 0.1%
Other Symbol | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[not preserved] | 405 | 7.0%
[not preserved] | 253 | 4.4%
[not preserved] | 250 | 4.3%
[not preserved] | 230 | 4.0%
[not preserved] | 211 | 3.6%
[not preserved] | 176 | 3.0%
[not preserved] | 171 | 3.0%
[not preserved] | 160 | 2.8%
[not preserved] | 158 | 2.7%
[not preserved] | 150 | 2.6%
Other values (288) | 3621 | 62.6%

Decimal Number
Value | Count | Frequency (%)
1 | 241 | 18.3%
2 | 204 | 15.5%
3 | 160 | 12.2%
6 | 122 | 9.3%
4 | 113 | 8.6%
5 | 108 | 8.2%
7 | 98 | 7.5%
0 | 94 | 7.2%
8 | 88 | 6.7%
9 | 86 | 6.5%

Lowercase Letter
Value | Count | Frequency (%)
e | 2 | 40.0%
o | 1 | 20.0%
k | 1 | 20.0%
c | 1 | 20.0%

Uppercase Letter
Value | Count | Frequency (%)
K | 1 | 25.0%
S | 1 | 25.0%
C | 1 | 25.0%
L | 1 | 25.0%

Other Punctuation
Value | Count | Frequency (%)
, | 342 | 99.7%
. | 1 | 0.3%

Space Separator
Value | Count | Frequency (%)
(space) | 1630 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 254 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 254 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 110 | 100.0%

Other Symbol
Value | Count | Frequency (%)
[not preserved] | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 5786 | 59.6%
Common | 3905 | 40.3%
Latin | 9 | 0.1%

Most frequent character per script

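Hangul
Value | Count | Frequency (%)
[not preserved] | 405 | 7.0%
[not preserved] | 253 | 4.4%
[not preserved] | 250 | 4.3%
[not preserved] | 230 | 4.0%
[not preserved] | 211 | 3.6%
[not preserved] | 176 | 3.0%
[not preserved] | 171 | 3.0%
[not preserved] | 160 | 2.8%
[not preserved] | 158 | 2.7%
[not preserved] | 150 | 2.6%
Other values (289) | 3622 | 62.6%

Common
Value | Count | Frequency (%)
(space) | 1630 | 41.7%
, | 342 | 8.8%
) | 254 | 6.5%
( | 254 | 6.5%
1 | 241 | 6.2%
2 | 204 | 5.2%
3 | 160 | 4.1%
6 | 122 | 3.1%
4 | 113 | 2.9%
- | 110 | 2.8%
Other values (6) | 475 | 12.2%

Latin
Value | Count | Frequency (%)
e | 2 | 22.2%
K | 1 | 11.1%
S | 1 | 11.1%
o | 1 | 11.1%
k | 1 | 11.1%
C | 1 | 11.1%
L | 1 | 11.1%
c | 1 | 11.1%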

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 5785 | 59.6%
ASCII | 3914 | 40.4%
None | 1 | < 0.1%

Most frequent character per block

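ASCII
Value | Count | Frequency (%)
(space) | 1630 | 41.6%
, | 342 | 8.7%
) | 254 | 6.5%
( | 254 | 6.5%
1 | 241 | 6.2%
2 | 204 | 5.2%
3 | 160 | 4.1%
6 | 122 | 3.1%
4 | 113 | 2.9%
- | 110 | 2.8%
Other values (14) | 484 | 12.4%

Hangul
Value | Count | Frequency (%)
[not preserved] | 405 | 7.0%
[not preserved] | 253 | 4.4%
[not preserved] | 250 | 4.3%
[not preserved] | 230 | 4.0%
[not preserved] | 211 | 3.6%
[not preserved] | 176 | 3.0%
[not preserved] | 171 | 3.0%
[not preserved] | 160 | 2.8%
[not preserved] | 158 | 2.7%
[not preserved] | 150 | 2.6%
Other values (288) | 3621 | 62.6%

None
Value | Count | Frequency (%)
[not preserved] | 1 | 100.0%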

총부지면적(㎡)
Real number (ℝ)

MISSING 

Distinct: 350
Distinct (%): 96.7%
Missing: 137
Missing (%): 27.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 11917.019
Minimum: 0
Maximum: 150514
Zeros: 1
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 0
5th percentile: 6715.4
Q1: 8612
Median: 9981
Q3: 13375.25
95th percentile: 19678.8
Maximum: 150514
Range: 150514
Interquartile range (IQR): 4763.25

Descriptive statistics

Standard deviation: 9912.9423
Coefficient of variation (CV): 0.83183068
Kurtosis: 130.06917
Mean: 11917.019
Median Absolute Deviation (MAD): 2066
Skewness: 10.264978
Sum: 4313961
Variance: 98266426
Monotonicity: Not monotonic
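Several of these statistics are derived from the others. The following minimal sketch, using only the figures reported for 총부지면적(㎡), shows the relationships: CV is the standard deviation divided by the mean, IQR is Q3 minus Q1, range is maximum minus minimum, and variance is the squared standard deviation. Variable names are illustrative.

```python
# Summary figures for 총부지면적(㎡), copied from the report.
mean = 11917.019
std = 9912.9423
q1, q3 = 8612, 13375.25
minimum, maximum = 0, 150514

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
variance = std ** 2              # variance is the squared standard deviation

print(f"CV={cv:.6f} IQR={iqr} range={value_range} variance={variance:.0f}")
```

The recomputed CV, IQR, and range match the report. The variance agrees with the reported value to within about one unit, which is consistent with the standard deviation having been rounded for display.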
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
9900 | 3 | 0.6%
9208 | 2 | 0.4%
6909 | 2 | 0.4%
11330 | 2 | 0.4%
8972 | 2 | 0.4%
6700 | 2 | 0.4%
6817 | 2 | 0.4%
14576 | 2 | 0.4%
11219 | 2 | 0.4%
9927 | 2 | 0.4%
Other values (340) | 341 | 68.3%
(Missing) | 137 | 27.5%

Minimum 10 values
Value | Count | Frequency (%)
0 | 1 | 0.2%
169 | 1 | 0.2%
196 | 1 | 0.2%
210 | 1 | 0.2%
3244 | 1 | 0.2%
3907 | 1 | 0.2%
6546 | 1 | 0.2%
6618 | 1 | 0.2%
6619 | 1 | 0.2%
6637 | 1 | 0.2%

Maximum 10 values
Value | Count | Frequency (%)
150514 | 1 | 0.2%
108310 | 1 | 0.2%
38160 | 1 | 0.2%
32016 | 1 | 0.2%
26471 | 1 | 0.2%
25850 | 1 | 0.2%
23288 | 1 | 0.2%
22008 | 1 | 0.2%
21795 | 1 | 0.2%
21629 | 1 | 0.2%

교육생정원(명)
Real number (ℝ)

MISSING 

Distinct: 130
Distinct (%): 35.9%
Missing: 137
Missing (%): 27.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 210.11326
Minimum: 0
Maximum: 1024
Zeros: 3
Zeros (%): 0.6%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 0
5th percentile: 56
Q1: 109
Median: 168.5
Q3: 252
95th percentile: 577.9
Maximum: 1024
Range: 1024
Interquartile range (IQR): 143

Descriptive statistics

Standard deviation: 158.46392
Coefficient of variation (CV): 0.75418336
Kurtosis: 4.7788725
Mean: 210.11326
Median Absolute Deviation (MAD): 69.5
Skewness: 1.9976094
Sum: 76061
Variance: 25110.815
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
56 | 23 | 4.6%
196 | 18 | 3.6%
168 | 12 | 2.4%
182 | 11 | 2.2%
84 | 11 | 2.2%
192 | 11 | 2.2%
112 | 10 | 2.0%
96 | 10 | 2.0%
126 | 10 | 2.0%
98 | 9 | 1.8%
Other values (120) | 237 | 47.5%
(Missing) | 137 | 27.5%

Minimum 10 values
Value | Count | Frequency (%)
0 | 3 | 0.6%
28 | 1 | 0.2%
32 | 1 | 0.2%
42 | 1 | 0.2%
45 | 1 | 0.2%
48 | 2 | 0.4%
56 | 23 | 4.6%
58 | 1 | 0.2%
60 | 3 | 0.6%
64 | 4 | 0.8%

Maximum 10 values
Value | Count | Frequency (%)
1024 | 1 | 0.2%
908 | 1 | 0.2%
833 | 1 | 0.2%
792 | 1 | 0.2%
780 | 1 | 0.2%
748 | 1 | 0.2%
742 | 1 | 0.2%
728 | 1 | 0.2%
648 | 1 | 0.2%
640 | 1 | 0.2%

Interactions

[Four interaction plots for the numeric variables 총부지면적(㎡) and 교육생정원(명); images not preserved]

Correlations

Variable | 총부지면적(㎡) | 교육생정원(명)
총부지면적(㎡) | 1.000 | 0.232
교육생정원(명) | 0.232 | 1.000

Variable | 총부지면적(㎡) | 교육생정원(명)
총부지면적(㎡) | 1.000 | 0.245
교육생정원(명) | 0.245 | 1.000
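The extracted text does not say which coefficients the two matrices show. As an illustration only, the sketch below computes Pearson and Spearman coefficients in plain Python for values read from the first ten rows of the Sample section. Because it uses ten rows rather than the full dataset, its output is not expected to match the figures above. The function names are illustrative.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def ranks(values):
    """Average ranks (1-based); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# First ten sample rows: site area (㎡) and trainee capacity.
area = [12771, 13586, 17914, 8232, 11013, 10491, 8383, 6905, 8389, 17195]
capacity = [405, 578, 416, 630, 640, 792, 432, 416, 476, 1024]

print(round(pearson(area, capacity), 3), round(spearman(area, capacity), 3))
```

Rank-based coefficients such as Spearman are less sensitive to extreme values, which matters here given the strong right skew of the site-area column.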

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
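One common way to express nullity correlation is the Pearson correlation between the "is present" indicators of two columns. The sketch below implements that idea in plain Python on hypothetical toy columns. The function name and the data are illustrative, and this is an assumption about the measure rather than the profiling tool's exact code.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 'is present' indicators of two columns."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None  # undefined when a column is fully present or fully missing
    return cov / sqrt(var_a * var_b)


# Hypothetical toy columns (None marks a missing cell), not the real data.
name = ["A", "B", None, None, "E"]
area = [100, 200, None, None, None]

print(nullity_correlation(name, area))
```

A value near 1 means the two columns tend to be missing in the same rows, which is what the near-identical missing counts in this dataset would suggest.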

Sample

First rows
Index | 학원명 | 주 소 | 총부지면적(㎡) | 교육생정원(명)
0 | 광연 | 서울 강남구 헌릉로 733, (세곡동, 광일자동차운전학원) | 12771 | 405
1 | 삼일 | 서울 강남구 헌릉로 736, (세곡동, 삼일자동차운전학원) | 13586 | 578
2 | 창동 | 서울 도봉구 도봉로136다길 4, (창동, 창동운전전문학원) | 17914 | 416
3 | 녹천 | 서울 노원구 마들로5길 91, (월계동, 녹천자동차운전학원) | 8232 | 630
4 | 사당 | 서울 서초구 과천대로 904-6, (방배동, 사당자동차운전면허학원) | 11013 | 640
5 | 서울 | 서울 강서구 남부순환로 222, (외발산동, 서울자동차학원) | 10491 | 792
6 | 온수역 | 서울 구로구 부일로1가길 18-44, (온수동) | 8383 | 432
7 | 중랑 | 서울 중랑구 상봉로 117, (상봉동, 상봉시외버스터미널) | 6905 | 416
8 | 양재 | 서울 서초구 남부순환로342길 62-26, (양재동, 양재자동차운전학원) | 8389 | 476
9 | 신도림 | 서울 구로구 신도림로19길 68, (신도림동, 신도림자동차전문학원) | 17195 | 1024

Last rows
Index | 학원명 | 주 소 | 총부지면적(㎡) | 교육생정원(명)
489 | <NA> | <NA> | <NA> | <NA>
490 | <NA> | <NA> | <NA> | <NA>
491 | <NA> | <NA> | <NA> | <NA>
492 | <NA> | <NA> | <NA> | <NA>
493 | <NA> | <NA> | <NA> | <NA>
494 | <NA> | <NA> | <NA> | <NA>
495 | <NA> | <NA> | <NA> | <NA>
496 | <NA> | <NA> | <NA> | <NA>
497 | <NA> | <NA> | <NA> | <NA>
498 | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

Index | 학원명 | 주 소 | 총부지면적(㎡) | 교육생정원(명) | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | 135
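The only duplicated row is one in which every cell is missing, so removing fully empty rows before analysis would likely eliminate both the duplicate alert and most of the missing-value alerts. The minimal sketch below shows the idea in plain Python on hypothetical toy rows. The data and variable names are illustrative, not taken from the dataset.

```python
from collections import Counter

# Hypothetical toy rows (학원명, 주 소, 총부지면적, 교육생정원); None marks a missing cell.
rows = [
    ("A", "addr 1", 9900, 56),
    (None, None, None, None),
    ("B", "addr 2", 6700, 196),
    (None, None, None, None),
    (None, None, None, None),
]

# Count how often each exact row occurs and keep those seen more than once.
counts = Counter(rows)
duplicated = {row: n for row, n in counts.items() if n > 1}

# Drop rows in which every cell is missing.
cleaned = [row for row in rows if any(cell is not None for cell in row)]

print(duplicated)
print(len(cleaned))
```

With a pandas DataFrame, `df.dropna(how="all")` performs the same filtering.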