Overview

Dataset statistics

Number of variables: 8
Number of observations: 28
Missing cells: 14
Missing cells (%): 6.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.9 KiB
Average record size in memory: 69.7 B
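
The missing-cell percentage follows directly from the counts above. A minimal Python sketch of the arithmetic, using only numbers reported in this section (the variable names are illustrative):

```python
# Values copied from the dataset statistics above.
n_variables = 8
n_observations = 28
missing_cells = 14

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

# Share of cells that are missing, as a percentage.
missing_pct = 100 * missing_cells / total_cells

print(total_cells)            # 224
print(f"{missing_pct:.2f}%")  # 6.25%
```

The report displays this value to one decimal place, as 6.2%.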

Variable types

Categorical: 3
Text: 4
Numeric: 1

Dataset

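Description: Data on specialized driving schools (자동차 운전전문학원) in Jeollabuk-do, providing fields such as region (지역), school name (학원명), telephone number (전화번호), fax (팩스), postal code (우편번호), and address (소재지).
URL: https://www.data.go.kr/data/15089820/fileData.do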

Alerts

지역 is highly overall correlated with 우편번호 and 1 other field [High correlation]
비고 is highly overall correlated with 구분 [High correlation]
구분 is highly overall correlated with 우편번호 and 2 other fields [High correlation]
우편번호 is highly overall correlated with 구분 and 1 other field [High correlation]
구분 is highly imbalanced (72.1%) [Imbalance]
학원명 has 2 (7.1%) missing values [Missing]
전화번호 has 3 (10.7%) missing values [Missing]
팩스 has 3 (10.7%) missing values [Missing]
우편번호 has 3 (10.7%) missing values [Missing]
소 재 지 has 3 (10.7%) missing values [Missing]
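
The 72.1% imbalance figure for 구분 is consistent with one minus the normalized Shannon entropy of that variable's value counts (26, 1, 1). The sketch below is an assumption about how such a score can be derived, not a copy of the ydata-profiling source; `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the base-2 Shannon entropy
    of the class distribution and k is the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no meaningful imbalance
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Value counts reported for 구분: one dominant class and two singletons.
score = imbalance_score([26, 1, 1])
print(f"{score:.1%}")  # 72.1%
```

A perfectly balanced column scores 0 under this definition, and the score approaches 1 as a single class dominates.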

Reproduction

Analysis started: 2023-12-11 22:58:31.788173
Analysis finished: 2023-12-11 22:58:32.562322
Duration: 0.77 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 10.7%
Missing: 0
Missing (%): 0.0%
Memory size: 356.0 B

Value | Count
전문학원 | 26
<NA> | 1
(blank) | 1

Length

Max length: 4
Median length: 4
Mean length: 3.9285714
Min length: 2

Unique

Unique: 2
Unique (%): 7.1%

Sample

1st row: 전문학원
2nd row: 전문학원
3rd row: 전문학원
4th row: 전문학원
5th row: 전문학원

Common Values

Value | Count | Frequency (%)
전문학원 | 26 | 92.9%
<NA> | 1 | 3.6%
(blank) | 1 | 3.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
전문학원 | 26 | 96.3%
na | 1 | 3.7%

지역
Categorical

HIGH CORRELATION 

Distinct: 9
Distinct (%): 32.1%
Missing: 0
Missing (%): 0.0%
Memory size: 356.0 B
전주
익산
군산
정읍
김제
Other values (4)

Length

Max length: 4
Median length: 2
Mean length: 2.1428571
Min length: 2

Unique

Unique: 3
Unique (%): 10.7%

Sample

1st row: 익산
2nd row: 익산
3rd row: 정읍
4th row: 군산
5th row: 전주

Common Values

Value | Count | Frequency (%)
전주 | 8 | 28.6%
익산 | 6 | 21.4%
군산 | 4 | 14.3%
정읍 | 3 | 10.7%
김제 | 2 | 7.1%
<NA> | 2 | 7.1%
남원 | 1 | 3.6%
고창 | 1 | 3.6%
부안 | 1 | 3.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
전주 | 8 | 28.6%
익산 | 6 | 21.4%
군산 | 4 | 14.3%
정읍 | 3 | 10.7%
김제 | 2 | 7.1%
na | 2 | 7.1%
남원 | 1 | 3.6%
고창 | 1 | 3.6%
부안 | 1 | 3.6%

학원명
Text

MISSING 

Distinct: 26
Distinct (%): 100.0%
Missing: 2
Missing (%): 7.1%
Memory size: 356.0 B

Length

Max length: 4
Median length: 2
Mean length: 2.2692308
Min length: 1

Characters and Unicode

Total characters: 59
Distinct characters: 44
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 26
Unique (%): 100.0%

Sample

1st row: 가나안
2nd row: 강남
3rd row: 구룡
4th row: 군산
5th row: 그랜드
ValueCountFrequency (%)
강남 1
 
3.8%
구룡 1
 
3.8%
현대 1
 
3.8%
한일 1
 
3.8%
퍼스트 1
 
3.8%
참길 1
 
3.8%
정읍 1
 
3.8%
전주 1
 
3.8%
전북전주 1
 
3.8%
이리 1
 
3.8%
Other values (16) 16
61.5%

Most occurring characters

ValueCountFrequency (%)
3
 
5.1%
3
 
5.1%
3
 
5.1%
2
 
3.4%
2
 
3.4%
2
 
3.4%
2
 
3.4%
2
 
3.4%
2
 
3.4%
2
 
3.4%
Other values (34) 36
61.0%

Most occurring categories

ValueCountFrequency (%)
Other Letter 59
100.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3
 
5.1%
3
 
5.1%
3
 
5.1%
2
 
3.4%
2
 
3.4%
2
 
3.4%
2
 
3.4%
2
 
3.4%
2
 
3.4%
2
 
3.4%
Other values (34) 36
61.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 59
100.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3
 
5.1%
3
 
5.1%
3
 
5.1%
2
 
3.4%
2
 
3.4%
2
 
3.4%
2
 
3.4%
2
 
3.4%
2
 
3.4%
2
 
3.4%
Other values (34) 36
61.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 59
100.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
3
 
5.1%
3
 
5.1%
3
 
5.1%
2
 
3.4%
2
 
3.4%
2
 
3.4%
2
 
3.4%
2
 
3.4%
2
 
3.4%
2
 
3.4%
Other values (34) 36
61.0%

전화번호
Text

MISSING 

Distinct: 25
Distinct (%): 100.0%
Missing: 3
Missing (%): 10.7%
Memory size: 356.0 B

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 300
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 25
Unique (%): 100.0%

Sample

1st row: 063-837-5000
2nd row: 063-834-8831
3rd row: 063-535-9400
4th row: 063-467-1123
5th row: 063-226-0031
ValueCountFrequency (%)
063-834-8831 1
 
4.0%
063-212-7771 1
 
4.0%
063-214-2700 1
 
4.0%
063-453-5881 1
 
4.0%
063-581-0234 1
 
4.0%
063-535-3192 1
 
4.0%
063-212-1177 1
 
4.0%
063-254-2580 1
 
4.0%
063-852-2073 1
 
4.0%
063-834-1088 1
 
4.0%
Other values (15) 15
60.0%

Most occurring characters

ValueCountFrequency (%)
- 50
16.7%
3 48
16.0%
0 47
15.7%
6 38
12.7%
1 28
9.3%
2 24
8.0%
5 18
 
6.0%
4 17
 
5.7%
8 15
 
5.0%
7 10
 
3.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 250
83.3%
Dash Punctuation 50
 
16.7%
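
The category counts above can be reproduced with Python's standard `unicodedata` module. The sketch below tallies Unicode general categories for the five sample phone numbers listed for this variable, not the full column; the `CATEGORY_NAMES` mapping is illustrative and covers only the two category codes that occur in these values.

```python
import unicodedata
from collections import Counter

# The five sample values listed for 전화번호.
samples = [
    "063-837-5000",
    "063-834-8831",
    "063-535-9400",
    "063-467-1123",
    "063-226-0031",
]

# Long names for the two general categories that occur in these values.
CATEGORY_NAMES = {"Nd": "Decimal Number", "Pd": "Dash Punctuation"}

# Count every character by its Unicode general category.
category_counts = Counter(
    CATEGORY_NAMES.get(unicodedata.category(ch), unicodedata.category(ch))
    for value in samples
    for ch in value
)

print(category_counts["Decimal Number"])    # 50
print(category_counts["Dash Punctuation"])  # 10
```

Each value holds 10 digits and 2 hyphens, so the 25 non-missing values account for the 250 digits and 50 dashes reported above.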

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 48
19.2%
0 47
18.8%
6 38
15.2%
1 28
11.2%
2 24
9.6%
5 18
 
7.2%
4 17
 
6.8%
8 15
 
6.0%
7 10
 
4.0%
9 5
 
2.0%
Dash Punctuation
ValueCountFrequency (%)
- 50
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 300
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 50
16.7%
3 48
16.0%
0 47
15.7%
6 38
12.7%
1 28
9.3%
2 24
8.0%
5 18
 
6.0%
4 17
 
5.7%
8 15
 
5.0%
7 10
 
3.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 300
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 50
16.7%
3 48
16.0%
0 47
15.7%
6 38
12.7%
1 28
9.3%
2 24
8.0%
5 18
 
6.0%
4 17
 
5.7%
8 15
 
5.0%
7 10
 
3.3%

팩스
Text

MISSING 

Distinct: 25
Distinct (%): 100.0%
Missing: 3
Missing (%): 10.7%
Memory size: 356.0 B

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 300
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 25
Unique (%): 100.0%

Sample

1st row: 063-854-0045
2nd row: 063-834-8861
3rd row: 063-537-7727
4th row: 063-464-9090
5th row: 063-226-0032
ValueCountFrequency (%)
063-834-8861 1
 
4.0%
063-212-7772 1
 
4.0%
063-214-4600 1
 
4.0%
063-453-5883 1
 
4.0%
063-581-0236 1
 
4.0%
063-532-1115 1
 
4.0%
063-213-0106 1
 
4.0%
063-255-3141 1
 
4.0%
063-856-2073 1
 
4.0%
063-834-2222 1
 
4.0%
Other values (15) 15
60.0%

Most occurring characters

ValueCountFrequency (%)
- 50
16.7%
3 45
15.0%
0 44
14.7%
6 40
13.3%
2 27
9.0%
1 26
8.7%
4 17
 
5.7%
5 17
 
5.7%
7 15
 
5.0%
8 13
 
4.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 250
83.3%
Dash Punctuation 50
 
16.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 45
18.0%
0 44
17.6%
6 40
16.0%
2 27
10.8%
1 26
10.4%
4 17
 
6.8%
5 17
 
6.8%
7 15
 
6.0%
8 13
 
5.2%
9 6
 
2.4%
Dash Punctuation
ValueCountFrequency (%)
- 50
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 300
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 50
16.7%
3 45
15.0%
0 44
14.7%
6 40
13.3%
2 27
9.0%
1 26
8.7%
4 17
 
5.7%
5 17
 
5.7%
7 15
 
5.0%
8 13
 
4.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 300
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 50
16.7%
3 45
15.0%
0 44
14.7%
6 40
13.3%
2 27
9.0%
1 26
8.7%
4 17
 
5.7%
5 17
 
5.7%
7 15
 
5.0%
8 13
 
4.3%

우편번호
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 22
Distinct (%): 88.0%
Missing: 3
Missing (%): 10.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 54886.12
Minimum: 54064
Maximum: 56453
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 384.0 B

Quantile statistics

Minimum: 54064
5-th percentile: 54071.6
Q1: 54525
median: 54811
Q3: 54878
95-th percentile: 56279.2
Maximum: 56453
Range: 2389
Interquartile range (IQR): 353

Descriptive statistics

Standard deviation: 715.6825
Coefficient of variation (CV): 0.013039408
Kurtosis: 0.22042595
Mean: 54886.12
Median Absolute Deviation (MAD): 276
Skewness: 1.1111404
Sum: 1372153
Variance: 512201.44
Monotonicity: Not monotonic
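
Several of the statistics above can be derived from others in the same list. A short Python check, using only the values reported for 우편번호 (so the results agree only to the precision printed in the report; the variable names are illustrative):

```python
# Values copied from the quantile and descriptive statistics for 우편번호.
minimum, maximum = 54064, 56453
q1, q3 = 54525, 54878
mean = 54886.12
std = 715.6825
n_non_missing = 28 - 3  # 28 observations, 3 of them missing

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance is the squared standard deviation
total = mean * n_non_missing      # Sum = mean x number of non-missing values

print(value_range, iqr)  # 2389 353
print(round(cv, 6))      # 0.013039
print(round(total))      # 1372153
```

The CV is small because postal codes in one province share their leading digits, so the spread is tiny relative to the mean.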
Histogram with fixed size bins (bins=22)
ValueCountFrequency (%)
54817 2
 
7.1%
54846 2
 
7.1%
54535 2
 
7.1%
54878 1
 
3.6%
54370 1
 
3.6%
54855 1
 
3.6%
54065 1
 
3.6%
56303 1
 
3.6%
56153 1
 
3.6%
54811 1
 
3.6%
Other values (12) 12
42.9%
(Missing) 3
 
10.7%
ValueCountFrequency (%)
54064 1
3.6%
54065 1
3.6%
54098 1
3.6%
54157 1
3.6%
54323 1
3.6%
54370 1
3.6%
54525 1
3.6%
54535 2
7.1%
54548 1
3.6%
54569 1
3.6%
ValueCountFrequency (%)
56453 1
3.6%
56303 1
3.6%
56184 1
3.6%
56153 1
3.6%
55734 1
3.6%
55077 1
3.6%
54878 1
3.6%
54855 1
3.6%
54846 2
7.1%
54817 2
7.1%

소 재 지
Text

MISSING 

Distinct: 25
Distinct (%): 100.0%
Missing: 3
Missing (%): 10.7%
Memory size: 356.0 B

Length

Max length: 19
Median length: 17
Mean length: 14.32
Min length: 10

Characters and Unicode

Total characters: 358
Distinct characters: 79
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 25
Unique (%): 100.0%

Sample

1st row: 익산시 무왕로 779-26
2nd row: 익산시 부송로16길 40
3rd row: 정읍시 용북길 57-11
4th row: 군산시 개사길 58-2
5th row: 전주시 완산구 콩쥐팥쥐로 1616
ValueCountFrequency (%)
전주시 8
 
9.0%
덕진구 6
 
6.7%
익산시 6
 
6.7%
군산시 4
 
4.5%
정읍시 2
 
2.2%
개정면 2
 
2.2%
완산구 2
 
2.2%
김제시 2
 
2.2%
추천로 2
 
2.2%
4 2
 
2.2%
Other values (53) 53
59.6%

Most occurring characters

ValueCountFrequency (%)
66
 
18.4%
23
 
6.4%
1 18
 
5.0%
16
 
4.5%
16
 
4.5%
12
 
3.4%
2 10
 
2.8%
6 9
 
2.5%
- 8
 
2.2%
5 8
 
2.2%
Other values (69) 172
48.0%

Most occurring categories

ValueCountFrequency (%)
Other Letter 203
56.7%
Decimal Number 81
 
22.6%
Space Separator 66
 
18.4%
Dash Punctuation 8
 
2.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
23
 
11.3%
16
 
7.9%
16
 
7.9%
12
 
5.9%
8
 
3.9%
8
 
3.9%
8
 
3.9%
7
 
3.4%
7
 
3.4%
6
 
3.0%
Other values (57) 92
45.3%
Decimal Number
ValueCountFrequency (%)
1 18
22.2%
2 10
12.3%
6 9
11.1%
5 8
9.9%
7 8
9.9%
3 8
9.9%
0 6
 
7.4%
4 6
 
7.4%
8 4
 
4.9%
9 4
 
4.9%
Space Separator
ValueCountFrequency (%)
66
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 8
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 203
56.7%
Common 155
43.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
23
 
11.3%
16
 
7.9%
16
 
7.9%
12
 
5.9%
8
 
3.9%
8
 
3.9%
8
 
3.9%
7
 
3.4%
7
 
3.4%
6
 
3.0%
Other values (57) 92
45.3%
Common
ValueCountFrequency (%)
66
42.6%
1 18
 
11.6%
2 10
 
6.5%
6 9
 
5.8%
- 8
 
5.2%
5 8
 
5.2%
7 8
 
5.2%
3 8
 
5.2%
0 6
 
3.9%
4 6
 
3.9%
Other values (2) 8
 
5.2%

Most occurring blocks

ValueCountFrequency (%)
Hangul 203
56.7%
ASCII 155
43.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
66
42.6%
1 18
 
11.6%
2 10
 
6.5%
6 9
 
5.8%
- 8
 
5.2%
5 8
 
5.2%
7 8
 
5.2%
3 8
 
5.2%
0 6
 
3.9%
4 6
 
3.9%
Other values (2) 8
 
5.2%
Hangul
ValueCountFrequency (%)
23
 
11.3%
16
 
7.9%
16
 
7.9%
12
 
5.9%
8
 
3.9%
8
 
3.9%
8
 
3.9%
7
 
3.4%
7
 
3.4%
6
 
3.0%
Other values (57) 92
45.3%

비고
Categorical

HIGH CORRELATION 

Distinct: 9
Distinct (%): 32.1%
Missing: 0
Missing (%): 0.0%
Memory size: 356.0 B
<NA>
대형
대형, 2종소형
대형, 대형견인, 2종소형
대형, 소형견인
Other values (4)

Length

Max length: 18
Median length: 14
Mean length: 5.8571429
Min length: 2

Unique

Unique: 4
Unique (%): 14.3%

Sample

1st row: 대형, 소형견인
2nd row: <NA>
3rd row: 대형
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 8 | 28.6%
대형 | 7 | 25.0%
대형, 2종소형 | 4 | 14.3%
대형, 대형견인, 2종소형 | 3 | 10.7%
대형, 소형견인 | 2 | 7.1%
대형, 대형견인, 구난, 2종소형 | 1 | 3.6%
대형견인 | 1 | 3.6%
2종소형 | 1 | 3.6%
휴원 | 1 | 3.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
대형 | 17 | 39.5%
2종소형 | 9 | 20.9%
na | 8 | 18.6%
대형견인 | 5 | 11.6%
소형견인 | 2 | 4.7%
구난 | 1 | 2.3%
휴원 | 1 | 2.3%

Interactions

[Figure: interactions plot]

Correlations

Variable | 구분 | 지역 | 학원명 | 전화번호 | 팩스 | 우편번호 | 소 재 지 | 비고
구분 | 1.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN
지역 | NaN | 1.000 | 1.000 | 1.000 | 1.000 | 0.985 | 1.000 | 0.389
학원명 | NaN | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
전화번호 | NaN | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
팩스 | NaN | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
우편번호 | NaN | 0.985 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.417
소 재 지 | NaN | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
비고 | NaN | 0.389 | 1.000 | 1.000 | 1.000 | 0.417 | 1.000 | 1.000

Variable | 지역 | 비고 | 구분
지역 | 1.000 | 0.000 | 1.000
비고 | 0.000 | 1.000 | 1.000
구분 | 1.000 | 1.000 | 1.000

Variable | 우편번호 | 구분 | 지역 | 비고
우편번호 | 1.000 | 1.000 | 0.798 | 0.157
구분 | 1.000 | 1.000 | 1.000 | 1.000
지역 | 0.798 | 1.000 | 1.000 | 0.000
비고 | 0.157 | 1.000 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
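
The per-column missing counts reported for each variable add up to the dataset-level total in the Overview. A small Python sketch using those reported counts (the dictionary is transcribed from this report, not computed from the raw file):

```python
n_observations = 28

# Missing-value counts per column, as reported in each variable summary.
missing_by_column = {
    "구분": 0,
    "지역": 0,
    "학원명": 2,
    "전화번호": 3,
    "팩스": 3,
    "우편번호": 3,
    "소 재 지": 3,
    "비고": 0,
}

# Dataset-level total and per-column percentages, rounded to one decimal.
total_missing = sum(missing_by_column.values())
missing_pct = {
    column: round(100 * count / n_observations, 1)
    for column, count in missing_by_column.items()
}

print(total_missing)           # 14
print(missing_pct["학원명"])    # 7.1
print(missing_pct["전화번호"])  # 10.7
```

Note that 구분, 지역 and 비고 report zero missing values even though `<NA>` appears among their values, which suggests the profiler treated `<NA>` in those columns as an ordinary string.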

Sample

First rows

Row | 구분 | 지역 | 학원명 | 전화번호 | 팩스 | 우편번호 | 소 재 지 | 비고
0 | 전문학원 | 익산 | 가나안 | 063-837-5000 | 063-854-0045 | 54535 | 익산시 무왕로 779-26 | 대형, 소형견인
1 | 전문학원 | 익산 | 강남 | 063-834-8831 | 063-834-8861 | 54548 | 익산시 부송로16길 40 | <NA>
2 | 전문학원 | 정읍 | 구룡 | 063-535-9400 | 063-537-7727 | 56184 | 정읍시 용북길 57-11 | 대형
3 | 전문학원 | 군산 | 군산 | 063-467-1123 | 063-464-9090 | 54157 | 군산시 개사길 58-2 | <NA>
4 | 전문학원 | 전주 | 그랜드 | 063-226-0031 | 063-226-0032 | 54878 | 전주시 완산구 콩쥐팥쥐로 1616 | <NA>
5 | 전문학원 | 군산 | 금강 | 063-463-6561 | 063-463-6571 | 54098 | 군산시 공단대로 179 | 대형
6 | 전문학원 | 익산 |  | 063-831-0123 | 063-833-1071 | 54569 | 익산시 신왕길 32 | 대형, 대형견인, 2종소형
7 | 전문학원 | 남원 | 남원 | 063-633-1312 | 063-631-0710 | 55734 | 남원시 요천로 1798-6 | 대형, 대형견인, 구난, 2종소형
8 | 전문학원 | 전주 | 동성 | 063-244-4660 | 063-723-2222 | 54817 | 전주시 덕진구 장재길 4 | 대형, 2종소형
9 | 전문학원 | 전주 | 동아 | 063-211-1100 | 063-211-1149 | 54846 | 전주시 덕진구 추천로 209 | <NA>

Last rows

Row | 구분 | 지역 | 학원명 | 전화번호 | 팩스 | 우편번호 | 소 재 지 | 비고
18 | 전문학원 | 익산 | 이리 | 063-852-2073 | 063-856-2073 | 54535 | 익산시 익산대로33길 46-5 | 대형
19 | 전문학원 | 전주 | 전북전주 | 063-254-2580 | 063-255-3141 | 54817 | 전주시 덕진구 초포다리로 117-6 | <NA>
20 | 전문학원 | 전주 | 전주 | 063-212-1177 | 063-213-0106 | 54811 | 전주시 덕진구 한내로 21 | 대형견인
21 | 전문학원 | 정읍 | 정읍 | 063-535-3192 | 063-532-1115 | 56153 | 정읍시 충정로 638 | 대형
22 | 전문학원 | 부안 | 참길 | 063-581-0234 | 063-581-0236 | 56303 | 부안군 부안읍 부령로 153-17 | 대형
23 | 전문학원 | 군산 | 퍼스트 | 063-453-5881 | 063-453-5883 | 54065 | 군산시 개정면 당산1길 20 | 대형, 소형견인
24 | 전문학원 | 전주 | 한일 | 063-214-2700 | 063-214-4600 | 54855 | 전주시 덕진구 비석날로 25 | 2종소형
25 | 전문학원 | 정읍 | 현대 | <NA> | <NA> | <NA> | <NA> | 휴원
26 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
27 |  | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>