Overview

Dataset statistics

Number of variables: 6
Number of observations: 21
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.2 KiB
Average record size in memory: 56.3 B

Variable types

Numeric: 2
Text: 3
Categorical: 1

Dataset

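Description: Data on hypermarkets (대형마트) and semi-large-scale stores (준대규모점포) located in Seo-gu, Daejeon Metropolitan City, covering business name, address, district, contact number, and other details.
URL: https://www.data.go.kr/data/15028914/fileData.do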

Alerts

번호 is highly overall correlated with 면적(제곱미터) [High correlation]
면적(제곱미터) is highly overall correlated with 번호 and 1 other field [High correlation]
구분 is highly overall correlated with 면적(제곱미터) [High correlation]
구분 is highly imbalanced (54.6%) [Imbalance]
번호 has unique values [Unique]
상호 has unique values [Unique]
주소 has unique values [Unique]
면적(제곱미터) has unique values [Unique]
전화번호 has unique values [Unique]
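
The 54.6% imbalance figure for 구분 can be reproduced from the two class counts this report gives for that variable (19 and 2). The sketch below assumes the score is one minus the normalized Shannon entropy of the class shares. That formula is an assumption that happens to match the reported value; the report itself does not state how the score is computed. The function name `imbalance_score` is hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max for a list of class counts.

    Assumed definition: 0.0 for perfectly balanced classes,
    values close to 1.0 when one class dominates.
    """
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    total = sum(counts)
    shares = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in shares)  # Shannon entropy in bits
    return 1.0 - entropy / math.log2(n_classes)


# Class counts for 구분: 19 준대규모점포 and 2 대형마트
score = imbalance_score([19, 2])
print(f"{score:.1%}")  # 54.6%
```

Under this definition a 50/50 split would score 0%, so 54.6% reflects the 19-to-2 split rather than any data-quality problem.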

Reproduction

Analysis started: 2023-12-12 23:18:38.434263
Analysis finished: 2023-12-12 23:18:39.229483
Duration: 0.8 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

번호
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 21
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11
Minimum: 1
Maximum: 21
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 321.0 B

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 6
Median: 11
Q3: 16
95-th percentile: 20
Maximum: 21
Range: 20
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 6.2048368
Coefficient of variation (CV): 0.56407607
Kurtosis: -1.2
Mean: 11
Median Absolute Deviation (MAD): 5
Skewness: 0
Sum: 231
Variance: 38.5
Monotonicity: Strictly increasing
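
Because 번호 is simply the sequence 1 to 21, several of the figures above can be checked by hand. The sketch below recomputes them with Python's standard `statistics` module. It assumes the reported variance and standard deviation use the sample (n - 1) denominator, which is the choice that reproduces 38.5.

```python
import statistics

values = list(range(1, 22))  # 번호 runs from 1 to 21 with no gaps

total = sum(values)                      # 231
mean = statistics.mean(values)           # 11
variance = statistics.variance(values)   # sample variance: 38.5
std_dev = statistics.stdev(values)       # square root of the sample variance
cv = std_dev / mean                      # coefficient of variation
median = statistics.median(values)       # 11
# Median absolute deviation: median distance from the median
mad = statistics.median([abs(v - median) for v in values])  # 5

print(total, mean, variance, round(std_dev, 7), round(cv, 8), mad)
```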
Histogram with fixed size bins (bins=21)

Common values

Value | Count | Frequency (%)
1 | 1 | 4.8%
2 | 1 | 4.8%
21 | 1 | 4.8%
20 | 1 | 4.8%
19 | 1 | 4.8%
18 | 1 | 4.8%
17 | 1 | 4.8%
16 | 1 | 4.8%
15 | 1 | 4.8%
14 | 1 | 4.8%
Other values (11) | 11 | 52.4%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 4.8%
2 | 1 | 4.8%
3 | 1 | 4.8%
4 | 1 | 4.8%
5 | 1 | 4.8%
6 | 1 | 4.8%
7 | 1 | 4.8%
8 | 1 | 4.8%
9 | 1 | 4.8%
10 | 1 | 4.8%

Maximum 10 values

Value | Count | Frequency (%)
21 | 1 | 4.8%
20 | 1 | 4.8%
19 | 1 | 4.8%
18 | 1 | 4.8%
17 | 1 | 4.8%
16 | 1 | 4.8%
15 | 1 | 4.8%
14 | 1 | 4.8%
13 | 1 | 4.8%
12 | 1 | 4.8%

상호
Text

UNIQUE 

Distinct: 21
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Length

Max length: 15
Median length: 14
Mean length: 12.047619
Min length: 8

Characters and Unicode

Total characters: 253
Distinct characters: 57
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
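
As a small illustration of those character properties, the sketch below counts Unicode general categories for the first sample value of 상호 using Python's standard `unicodedata` module. The module returns two-letter codes (`Lo`, `Zs`, `Ps`, `Pe`), which correspond to the long names used in this report (Other Letter, Space Separator, Open Punctuation, Close Punctuation). Scripts and blocks are not exposed by `unicodedata`, so only categories are shown.

```python
import unicodedata
from collections import Counter

name = "(주)이마트 대전둔산점"  # first sample row of 상호

# Map every character to its Unicode general category and tally the results
category_counts = Counter(unicodedata.category(ch) for ch in name)

for code, count in category_counts.most_common():
    print(code, count)
```

For this one value the Hangul syllables fall under `Lo`, which is why Other Letter dominates the category table for the whole column.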

Unique

Unique: 21
Unique (%): 100.0%

Sample

1st row: (주)이마트 대전둔산점
2nd row: (주)이마트트레이더스 월평점
3rd row: 롯데슈퍼 둔산점
4th row: 홈플러스익스프레스 괴정점
5th row: 홈플러스익스프레스 관저점

Value | Count | Frequency (%)
홈플러스익스프레스 | 8 | 19.0%
이마트노브랜드 | 5 | 11.9%
둔산점 | 3 | 7.1%
gs슈퍼 | 2 | 4.8%
월평점 | 2 | 4.8%
관저점 | 2 | 4.8%
갈마점 | 2 | 4.8%
탄방점 | 2 | 4.8%
대전둔산점 | 2 | 4.8%
이마트에브리데이 | 2 | 4.8%
Other values (12) | 12 | 28.6%

Most occurring characters

ValueCountFrequency (%)
25
 
9.9%
21
 
8.3%
21
 
8.3%
13
 
5.1%
12
 
4.7%
10
 
4.0%
9
 
3.6%
8
 
3.2%
8
 
3.2%
8
 
3.2%
Other values (47) 118
46.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 220 | 87.0%
Space Separator | 21 | 8.3%
Decimal Number | 4 | 1.6%
Uppercase Letter | 4 | 1.6%
Open Punctuation | 2 | 0.8%
Close Punctuation | 2 | 0.8%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
25
 
11.4%
21
 
9.5%
13
 
5.9%
12
 
5.5%
10
 
4.5%
9
 
4.1%
8
 
3.6%
8
 
3.6%
8
 
3.6%
8
 
3.6%
Other values (40) 98
44.5%
Decimal Number
ValueCountFrequency (%)
9 3
75.0%
2 1
 
25.0%
Uppercase Letter
ValueCountFrequency (%)
G 2
50.0%
S 2
50.0%
Space Separator
ValueCountFrequency (%)
21
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 220 | 87.0%
Common | 29 | 11.5%
Latin | 4 | 1.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
25
 
11.4%
21
 
9.5%
13
 
5.9%
12
 
5.5%
10
 
4.5%
9
 
4.1%
8
 
3.6%
8
 
3.6%
8
 
3.6%
8
 
3.6%
Other values (40) 98
44.5%
Common
ValueCountFrequency (%)
21
72.4%
9 3
 
10.3%
( 2
 
6.9%
) 2
 
6.9%
2 1
 
3.4%
Latin
ValueCountFrequency (%)
G 2
50.0%
S 2
50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 220 | 87.0%
ASCII | 33 | 13.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
25
 
11.4%
21
 
9.5%
13
 
5.9%
12
 
5.5%
10
 
4.5%
9
 
4.1%
8
 
3.6%
8
 
3.6%
8
 
3.6%
8
 
3.6%
Other values (40) 98
44.5%
ASCII
ValueCountFrequency (%)
21
63.6%
9 3
 
9.1%
G 2
 
6.1%
S 2
 
6.1%
( 2
 
6.1%
) 2
 
6.1%
2 1
 
3.0%

주소
Text

UNIQUE 

Distinct: 21
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Length

Max length: 39
Median length: 38
Mean length: 24.428571
Min length: 15

Characters and Unicode

Total characters: 513
Distinct characters: 91
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 21
Unique (%): 100.0%

Sample

1st row: 대전광역시 서구 둔산북로 41(둔산동)
2nd row: 대전광역시 서구 한밭대로 580(월평동)
3rd row: 대전광역시 서구 청사로 281, 샘머리코아(둔산동)
4th row: 대전광역시 서구 갈마로 257(괴정동)
5th row: 대전광역시 서구 관저북로 71(관저동)

Value | Count | Frequency (%)
대전광역시 | 21 | 22.1%
서구 | 21 | 22.1%
청사로 | 2 | 2.1%
관저북로 | 2 | 2.1%
12번길 | 1 | 1.1%
1층(탄방동 | 1 | 1.1%
진양빌딩 | 1 | 1.1%
구봉로 | 1 | 1.1%
139 | 1 | 1.1%
106~116호(관저동 | 1 | 1.1%
Other values (43) | 43 | 45.3%

Most occurring characters

ValueCountFrequency (%)
75
 
14.6%
24
 
4.7%
23
 
4.5%
1 23
 
4.5%
22
 
4.3%
21
 
4.1%
21
 
4.1%
21
 
4.1%
21
 
4.1%
21
 
4.1%
Other values (81) 241
47.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 323 | 63.0%
Space Separator | 75 | 14.6%
Decimal Number | 66 | 12.9%
Open Punctuation | 20 | 3.9%
Close Punctuation | 20 | 3.9%
Other Punctuation | 8 | 1.6%
Math Symbol | 1 | 0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
24
 
7.4%
23
 
7.1%
22
 
6.8%
21
 
6.5%
21
 
6.5%
21
 
6.5%
21
 
6.5%
21
 
6.5%
21
 
6.5%
7
 
2.2%
Other values (66) 121
37.5%
Decimal Number
ValueCountFrequency (%)
1 23
34.8%
2 11
16.7%
0 7
 
10.6%
8 6
 
9.1%
6 5
 
7.6%
9 4
 
6.1%
4 3
 
4.5%
5 3
 
4.5%
3 2
 
3.0%
7 2
 
3.0%
Space Separator
ValueCountFrequency (%)
75
100.0%
Open Punctuation
ValueCountFrequency (%)
( 20
100.0%
Close Punctuation
ValueCountFrequency (%)
) 20
100.0%
Other Punctuation
ValueCountFrequency (%)
, 8
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 323 | 63.0%
Common | 190 | 37.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
24
 
7.4%
23
 
7.1%
22
 
6.8%
21
 
6.5%
21
 
6.5%
21
 
6.5%
21
 
6.5%
21
 
6.5%
21
 
6.5%
7
 
2.2%
Other values (66) 121
37.5%
Common
ValueCountFrequency (%)
75
39.5%
1 23
 
12.1%
( 20
 
10.5%
) 20
 
10.5%
2 11
 
5.8%
, 8
 
4.2%
0 7
 
3.7%
8 6
 
3.2%
6 5
 
2.6%
9 4
 
2.1%
Other values (5) 11
 
5.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 323 | 63.0%
ASCII | 190 | 37.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
75
39.5%
1 23
 
12.1%
( 20
 
10.5%
) 20
 
10.5%
2 11
 
5.8%
, 8
 
4.2%
0 7
 
3.7%
8 6
 
3.2%
6 5
 
2.6%
9 4
 
2.1%
Other values (5) 11
 
5.8%
Hangul
ValueCountFrequency (%)
24
 
7.4%
23
 
7.1%
22
 
6.8%
21
 
6.5%
21
 
6.5%
21
 
6.5%
21
 
6.5%
21
 
6.5%
21
 
6.5%
7
 
2.2%
Other values (66) 121
37.5%

면적(제곱미터)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 21
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1331.4762
Minimum: 163
Maximum: 11319
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 321.0 B

Quantile statistics

Minimum: 163
5-th percentile: 193
Q1: 255
Median: 360
Q3: 423
95-th percentile: 9623
Maximum: 11319
Range: 11156
Interquartile range (IQR): 168

Descriptive statistics

Standard deviation: 3054.9658
Coefficient of variation (CV): 2.2944202
Kurtosis: 7.8871529
Mean: 1331.4762
Median Absolute Deviation (MAD): 105
Skewness: 3.0002061
Sum: 27961
Variance: 9332816.1
Monotonicity: Not monotonic
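
The gap between the mean (1331.4762) and the median (360) comes from a few very large stores. The sketch below recomputes the robust statistics from the 21 area values, which can be recovered from the lowest and highest values listed for this variable plus the median (they add up to the reported sum of 27961). The outlier rule used here, Tukey's 1.5 × IQR upper fence, is an illustrative choice and is not something the report applies.

```python
import statistics

# The 21 면적(제곱미터) values, sorted ascending
areas = [163, 193, 199, 223, 238, 255, 265, 330, 345, 347, 360,
         366, 367, 380, 410, 423, 629, 661, 865, 9623, 11319]

mean = statistics.mean(areas)
median = statistics.median(areas)
# "inclusive" interpolates between data points and matches the reported quartiles
q1, _, q3 = statistics.quantiles(areas, n=4, method="inclusive")
iqr = q3 - q1
mad = statistics.median([abs(a - median) for a in areas])

# Tukey's upper fence (illustrative, not part of the report)
upper_fence = q3 + 1.5 * iqr
outliers = [a for a in areas if a > upper_fence]

print(round(mean, 4), median, q1, q3, iqr, mad, outliers)
```

The median, IQR and MAD stay close to the typical store size, while the mean and standard deviation are dominated by the two largest values.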
Histogram with fixed size bins (bins=21)

Common values

Value | Count | Frequency (%)
9623 | 1 | 4.8%
11319 | 1 | 4.8%
163 | 1 | 4.8%
330 | 1 | 4.8%
193 | 1 | 4.8%
255 | 1 | 4.8%
223 | 1 | 4.8%
423 | 1 | 4.8%
380 | 1 | 4.8%
661 | 1 | 4.8%
Other values (11) | 11 | 52.4%

Minimum 10 values

Value | Count | Frequency (%)
163 | 1 | 4.8%
193 | 1 | 4.8%
199 | 1 | 4.8%
223 | 1 | 4.8%
238 | 1 | 4.8%
255 | 1 | 4.8%
265 | 1 | 4.8%
330 | 1 | 4.8%
345 | 1 | 4.8%
347 | 1 | 4.8%

Maximum 10 values

Value | Count | Frequency (%)
11319 | 1 | 4.8%
9623 | 1 | 4.8%
865 | 1 | 4.8%
661 | 1 | 4.8%
629 | 1 | 4.8%
423 | 1 | 4.8%
410 | 1 | 4.8%
380 | 1 | 4.8%
367 | 1 | 4.8%
366 | 1 | 4.8%

전화번호
Text

UNIQUE 

Distinct: 21
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12
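
A constant length of 12 suggests a fixed layout for every phone number. The sketch below checks the sample rows listed for this variable against a three-digit, three-digit, four-digit pattern separated by hyphens. The pattern itself is an assumption inferred from those samples and is not stated by the report; `PHONE_PATTERN` is a hypothetical name.

```python
import re

# Assumed layout: NNN-NNN-NNNN (12 characters including two hyphens)
PHONE_PATTERN = re.compile(r"[0-9]{3}-[0-9]{3}-[0-9]{4}")

samples = [
    "042-479-1234",
    "042-718-1234",
    "042-471-5608",
    "042-525-8546",
    "042-545-8545",
]

# fullmatch requires the whole string to fit the pattern
all_valid = all(PHONE_PATTERN.fullmatch(s) is not None for s in samples)
print(all_valid)
```

A check like this is useful before treating the column as an identifier, since a value with a four-digit exchange would have a different length and fail the match.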

Characters and Unicode

Total characters: 252
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 21
Unique (%): 100.0%

Sample

1st row: 042-479-1234
2nd row: 042-718-1234
3rd row: 042-471-5608
4th row: 042-525-8546
5th row: 042-545-8545

Value | Count | Frequency (%)
042-479-1234 | 1 | 4.8%
042-489-8382 | 1 | 4.8%
042-541-8911 | 1 | 4.8%
042-545-8591 | 1 | 4.8%
042-535-8546 | 1 | 4.8%
042-477-8546 | 1 | 4.8%
042-537-8680 | 1 | 4.8%
042-543-8177 | 1 | 4.8%
042-489-7801 | 1 | 4.8%
042-545-8521 | 1 | 4.8%
Other values (11) | 11 | 52.4%

Most occurring characters

Value | Count | Frequency (%)
4 | 46 | 18.3%
- | 42 | 16.7%
2 | 32 | 12.7%
0 | 29 | 11.5%
5 | 25 | 9.9%
8 | 22 | 8.7%
1 | 19 | 7.5%
7 | 13 | 5.2%
3 | 9 | 3.6%
9 | 8 | 3.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 210 | 83.3%
Dash Punctuation | 42 | 16.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 46
21.9%
2 32
15.2%
0 29
13.8%
5 25
11.9%
8 22
10.5%
1 19
9.0%
7 13
 
6.2%
3 9
 
4.3%
9 8
 
3.8%
6 7
 
3.3%
Dash Punctuation
ValueCountFrequency (%)
- 42
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 252 | 100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 46
18.3%
- 42
16.7%
2 32
12.7%
0 29
11.5%
5 25
9.9%
8 22
8.7%
1 19
7.5%
7 13
 
5.2%
3 9
 
3.6%
9 8
 
3.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 252 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 46
18.3%
- 42
16.7%
2 32
12.7%
0 29
11.5%
5 25
9.9%
8 22
8.7%
1 19
7.5%
7 13
 
5.2%
3 9
 
3.6%
9 8
 
3.2%

구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 9.5%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Length

Max length: 6
Median length: 6
Mean length: 5.8095238
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 대형마트
2nd row: 대형마트
3rd row: 준대규모점포
4th row: 준대규모점포
5th row: 준대규모점포

Common Values

Value | Count | Frequency (%)
준대규모점포 | 19 | 90.5%
대형마트 | 2 | 9.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
준대규모점포 | 19 | 90.5%
대형마트 | 2 | 9.5%

Interactions

[Figures: four interaction plots (images omitted)]

Correlations

Variable | 번호 | 상호 | 주소 | 면적(제곱미터) | 전화번호 | 구분
번호 | 1.000 | 1.000 | 1.000 | 0.448 | 1.000 | 1.000
상호 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
주소 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
면적(제곱미터) | 0.448 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
전화번호 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
구분 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Variable | 번호 | 면적(제곱미터) | 구분
번호 | 1.000 | -0.514 | 0.437
면적(제곱미터) | -0.514 | 1.000 | 0.973
구분 | 0.437 | 0.973 | 1.000
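
The -0.514 reported between 번호 and 면적(제곱미터) is consistent with a Spearman rank correlation, which the sketch below recomputes from scratch. Two assumptions apply. First, the report does not name the coefficient, so Spearman is a guess that happens to reproduce the value. Second, the area for 번호 11 (347) does not appear in the sample rows and is inferred here as the only area value left unassigned. The helper names `ranks` and `spearman` are hypothetical.

```python
def ranks(values):
    """Return 1-based ranks of the values; assumes there are no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman(xs, ys):
    """Spearman's rho via the no-ties formula 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


numbers = list(range(1, 22))  # 번호 1..21
# 면적(제곱미터) in 번호 order; the 11th entry (347) is inferred, see above
areas = [9623, 11319, 629, 238, 265, 367, 366, 360, 865, 199, 347,
         345, 410, 661, 380, 423, 223, 255, 193, 330, 163]

rho = spearman(numbers, areas)
print(round(rho, 3))  # -0.514
```

The negative sign reflects that the two largest stores are listed first, so area tends to fall as 번호 rises. Since 번호 is only a row counter, the correlation says more about listing order than about the stores themselves.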

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 번호 | 상호 | 주소 | 면적(제곱미터) | 전화번호 | 구분
0 | 1 | (주)이마트 대전둔산점 | 대전광역시 서구 둔산북로 41(둔산동) | 9623 | 042-479-1234 | 대형마트
1 | 2 | (주)이마트트레이더스 월평점 | 대전광역시 서구 한밭대로 580(월평동) | 11319 | 042-718-1234 | 대형마트
2 | 3 | 롯데슈퍼 둔산점 | 대전광역시 서구 청사로 281, 샘머리코아(둔산동) | 629 | 042-471-5608 | 준대규모점포
3 | 4 | 홈플러스익스프레스 괴정점 | 대전광역시 서구 갈마로 257(괴정동) | 238 | 042-525-8546 | 준대규모점포
4 | 5 | 홈플러스익스프레스 관저점 | 대전광역시 서구 관저북로 71(관저동) | 265 | 042-545-8545 | 준대규모점포
5 | 6 | 홈플러스익스프레스 용문점 | 대전광역시 서구 계룡로 648(용문동) | 367 | 042-528-8546 | 준대규모점포
6 | 7 | 홈플러스익스프레스 월평점 | 대전광역시 서구 월평북로 90(월평동) | 366 | 042-482-8543 | 준대규모점포
7 | 8 | 홈플러스익스프레스 탄방점 | 대전광역시 서구 문예로15 | 360 | 042-710-7212 | 준대규모점포
8 | 9 | GS슈퍼 도마점 | 대전광역시 서구 배재로 206(도마동) | 865 | 042-536-1002 | 준대규모점포
9 | 10 | 이마트에브리데이 관저효성점 | 대전광역시 서구 관저동로 12, 효성해링턴플레이스(관저동) | 199 | 042-710-1141 | 준대규모점포

Index | 번호 | 상호 | 주소 | 면적(제곱미터) | 전화번호 | 구분
11 | 12 | 이마트노브랜드 탄방점 | 대전광역시 서구 탄방로 8, 1층(탄방동, 진양빌딩) | 345 | 042-489-8382 | 준대규모점포
12 | 13 | 이마트노브랜드 관저점 | 대전광역시 서구 구봉로 139, 106~116호(관저동, 밀리온빌딩) | 410 | 042-545-8521 | 준대규모점포
13 | 14 | 이마트노브랜드 둔산점 | 대전광역시 서구 대덕대로 230 (둔산동) | 661 | 042-489-7801 | 준대규모점포
14 | 15 | 이마트노브랜드 가수원점 | 대전광역시 서구 도안동로 12번길 9(가수원동) | 380 | 042-543-8177 | 준대규모점포
15 | 16 | 이마트노브랜드 갈마점 | 대전광역시 서구 신갈마로 186(상가동 1층) | 423 | 042-537-8680 | 준대규모점포
16 | 17 | 홈플러스익스프레스 둔산점 | 대전광역시 서구 청사로 281(둔산동) | 223 | 042-477-8546 | 준대규모점포
17 | 18 | 홈플러스익스프레스 갈마점 | 대전광역시 서구 갈마중로 12(갈마동) | 255 | 042-535-8546 | 준대규모점포
18 | 19 | 홈플러스익스프레스 도안2점 | 대전광역시 서구 원도안로 191(도안동) | 193 | 042-545-8591 | 준대규모점포
19 | 20 | GS슈퍼 관저북로점 | 대전광역시 서구 관저북로 21(관저동) | 330 | 042-541-8911 | 준대규모점포
20 | 21 | 롯데마켓999 만년점 | 대전광역시 서구 만년남로11번길 42(만년동) | 163 | 042-484-4999 | 준대규모점포