Overview

Dataset statistics

Number of variables: 4
Number of observations: 1586
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 51.2 KiB
Average record size in memory: 33.1 B

Variable types

Numeric: 1
Categorical: 1
Text: 2

Dataset

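Description: 대구광역시_안전상비의약품 판매업소현황_20210617 (Daegu Metropolitan City, list of retailers of safety household medicines, as of 2021-06-17)
Author: 대구광역시 (Daegu Metropolitan City)
URL: http://data.daegu.go.kr/open/data/dataView.do?dataSetId=15054008&dataSetDetailId=1505400819b8a45ac57aa&provdMethod=FILE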

Alerts

연번 is highly overall correlated with 구군 (High correlation)
구군 is highly overall correlated with 연번 (High correlation)
연번 has unique values (Unique)

Reproduction

Analysis started: 2024-04-19 06:36:38.023050
Analysis finished: 2024-04-19 06:36:38.721244
Duration: 0.7 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
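
The reported duration can be checked against the two timestamps. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and do not come from the report.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of the report.
started = datetime.fromisoformat("2024-04-19 06:36:38.023050")
finished = datetime.fromisoformat("2024-04-19 06:36:38.721244")

# Elapsed wall-clock time between the two timestamps, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.1f} seconds")
```

Rounded to one decimal place, the difference agrees with the 0.7 seconds shown in the report.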

Variables

연번 (serial number)
Real number (ℝ)

Alerts: High correlation, Unique

Distinct: 1586
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 793.5
Minimum: 1
Maximum: 1586
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 80.25
Q1: 397.25
Median: 793.5
Q3: 1189.75
95-th percentile: 1506.75
Maximum: 1586
Range: 1585
Interquartile range (IQR): 792.5

Descriptive statistics

Standard deviation: 457.98308
Coefficient of variation (CV): 0.57716834
Kurtosis: -1.2
Mean: 793.5
Median Absolute Deviation (MAD): 396.5
Skewness: 0
Sum: 1258491
Variance: 209748.5
Monotonicity: Strictly increasing
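
The quantile and descriptive statistics above are consistent with 연번 being the integers 1 through 1586. The sketch below recomputes them under that assumption, using the sample variance (dividing by n - 1) and percentiles with linear interpolation. The `percentile` helper is hypothetical and written for this illustration; it is not taken from ydata-profiling.

```python
import math

# Assumption: 연번 is the serial number 1..1586 (unique, strictly increasing).
values = list(range(1, 1587))
n = len(values)

def percentile(sorted_values, p):
    """Percentile p (0-100) with linear interpolation between neighbours."""
    pos = p * (len(sorted_values) - 1) / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])

mean = sum(values) / n
# Sample variance (divides by n - 1).
variance = sum((x - mean) ** 2 for x in values) / (n - 1)
std = math.sqrt(variance)
cv = std / mean
median = percentile(values, 50)
# MAD: median of the absolute deviations from the median.
mad = percentile(sorted(abs(x - median) for x in values), 50)
iqr = percentile(values, 75) - percentile(values, 25)

print(sum(values), mean, variance, round(std, 5), round(cv, 8), mad, iqr)
```

The population variance (dividing by n) would give a slightly smaller figure, so the reported 209748.5 points to the sample definition.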
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value                  Count  Frequency (%)
1                      1      0.1%
1067                   1      0.1%
1065                   1      0.1%
1064                   1      0.1%
1063                   1      0.1%
1062                   1      0.1%
1061                   1      0.1%
1060                   1      0.1%
1059                   1      0.1%
1058                   1      0.1%
Other values (1576)    1576   99.4%

Minimum 10 values

Value                  Count  Frequency (%)
1                      1      0.1%
2                      1      0.1%
3                      1      0.1%
4                      1      0.1%
5                      1      0.1%
6                      1      0.1%
7                      1      0.1%
8                      1      0.1%
9                      1      0.1%
10                     1      0.1%

Maximum 10 values

Value                  Count  Frequency (%)
1586                   1      0.1%
1585                   1      0.1%
1584                   1      0.1%
1583                   1      0.1%
1582                   1      0.1%
1581                   1      0.1%
1580                   1      0.1%
1579                   1      0.1%
1578                   1      0.1%
1577                   1      0.1%

구군 (district)
Categorical

Alerts: High correlation

Distinct: 8
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 12.5 KiB

달서구: 379
북구: 283
수성구: 244
동구: 228
달성군: 141
Other values (3): 311

Length

Max length: 3
Median length: 2
Mean length: 2.481715
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 중구
2nd row: 중구
3rd row: 중구
4th row: 중구
5th row: 중구

Common Values

Value                  Count  Frequency (%)
달서구                 379    23.9%
북구                   283    17.8%
수성구                 244    15.4%
동구                   228    14.4%
달성군                 141    8.9%
남구                   116    7.3%
중구                   109    6.9%
서구                   86     5.4%
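
The percentages and the mean label length reported for 구군 follow directly from the counts. The sketch below recomputes them from the table; the variable names are illustrative.

```python
# Counts copied from the Common Values table for 구군.
counts = {
    "달서구": 379, "북구": 283, "수성구": 244, "동구": 228,
    "달성군": 141, "남구": 116, "중구": 109, "서구": 86,
}
total = sum(counts.values())

# Relative frequency of each district, in percent with one decimal.
shares = {name: round(100 * c / total, 1) for name, c in counts.items()}

# Mean label length in characters, weighted by how often each label occurs.
mean_length = sum(len(name) * c for name, c in counts.items()) / total

for name, share in shares.items():
    print(f"{name}: {counts[name]} ({share}%)")
print(f"total={total}, mean length={mean_length:.6f}")
```

The three smallest districts (남구, 중구, 서구) add up to the 311 shown as "Other values (3)" in the variable summary.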

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the most common values of 구군]

상호명 (business name)
Text

Distinct: 1582
Distinct (%): 99.7%
Missing: 0
Missing (%): 0.0%
Memory size: 12.5 KiB

Length

Max length: 20
Median length: 17
Mean length: 10.932535
Min length: 3

Characters and Unicode

Total characters: 17339
Distinct characters: 402
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
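
As an illustration of the sentence above, the sketch below counts characters per Unicode general category with Python's standard `unicodedata` module. The `category_counts` helper and the `CATEGORY_NAMES` mapping are hypothetical names written for this example, not ydata-profiling internals. The standard library exposes the general category but not the script or block, so only the category breakdown is sketched.

```python
import unicodedata
from collections import Counter

# Readable names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter", "Lu": "Uppercase Letter", "Ll": "Lowercase Letter",
    "Nd": "Decimal Number", "Zs": "Space Separator", "Ps": "Open Punctuation",
    "Pe": "Close Punctuation", "Pd": "Dash Punctuation", "Po": "Other Punctuation",
    "So": "Other Symbol", "Sm": "Math Symbol",
}

def category_counts(texts):
    """Count characters per Unicode general category across all strings."""
    counter = Counter()
    for text in texts:
        for ch in text:
            code = unicodedata.category(ch)
            counter[CATEGORY_NAMES.get(code, code)] += 1
    return counter

# One value taken from the Sample rows of this variable.
sample = ["지에스(GS)25대구시티센터점"]
print(category_counts(sample).most_common())
```

Hangul syllables fall under "Other Letter" because Hangul has no upper or lower case, which is why that category dominates both text variables.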

Unique

Unique: 1578
Unique (%): 99.5%

Sample

1st row: (주)코리아세븐대구드림점
2nd row: 씨유삼덕원룸점
3rd row: 지에스(GS)25대구시티센터점
4th row: 씨유대신태왕점
5th row: 씨유대봉더샵센트럴파크점

Value                  Count  Frequency (%)
세븐일레븐             247    9.4%
씨유                   234    8.9%
gs25                   143    5.4%
지에스25               118    4.5%
이마트24               63     2.4%
지에스(gs)25           62     2.4%
주)코리아세븐          56     2.1%
미니스톱               39     1.5%
위드미                 18     0.7%
cu                     12     0.5%
Other values (1531)    1633   62.2%
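
The lowercase entries such as gs25 and cu suggest that the word table is built from lowercased, whitespace-separated tokens. The report does not show the tokenizer, so the sketch below is an assumption and `word_counts` is a hypothetical helper. Entries such as 주)코리아세븐 suggest some additional punctuation handling that this sketch does not attempt.

```python
from collections import Counter

def word_counts(names):
    """Lowercase each name, split it on whitespace, and count the tokens."""
    counter = Counter()
    for name in names:
        counter.update(name.lower().split())
    return counter

# Two 상호명 values taken from the sample rows of this report.
names = ["GS25 논공베스트점", "GS25논공점"]
print(word_counts(names).most_common())
```

Under this assumption a brand is counted as its own word only when the name contains a space after it, so "GS25 논공베스트점" contributes gs25 while "GS25논공점" stays a single token.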

Most occurring characters

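Value                  Count  Frequency (%)
(not preserved)        1550   8.9%
(space)                1045   6.0%
(not preserved)        951    5.5%
(not preserved)        819    4.7%
(not preserved)        786    4.5%
2                      609    3.5%
5                      513    3.0%
(not preserved)        452    2.6%
(not preserved)        431    2.5%
(not preserved)        420    2.4%
Other values (392)     9763   56.3%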

Most occurring categories

Value                  Count  Frequency (%)
Other Letter           13822  79.7%
Decimal Number         1230   7.1%
Space Separator        1045   6.0%
Uppercase Letter       820    4.7%
Close Punctuation      203    1.2%
Open Punctuation       199    1.1%
Lowercase Letter       17     0.1%
Dash Punctuation       1      < 0.1%
Other Symbol           1      < 0.1%
Other Punctuation      1      < 0.1%

Most frequent character per category

Other Letter
Value                  Count  Frequency (%)
(not preserved)        1550   11.2%
(not preserved)        951    6.9%
(not preserved)        819    5.9%
(not preserved)        786    5.7%
(not preserved)        452    3.3%
(not preserved)        431    3.1%
(not preserved)        420    3.0%
(not preserved)        404    2.9%
(not preserved)        374    2.7%
(not preserved)        370    2.7%
Other values (350)     7265   52.6%

Uppercase Letter
Value                  Count  Frequency (%)
S                      347    42.3%
G                      346    42.2%
C                      52     6.3%
U                      46     5.6%
R                      6      0.7%
I                      4      0.5%
K                      3      0.4%
L                      3      0.4%
H                      3      0.4%
B                      3      0.4%
Other values (7)       7      0.9%

Lowercase Letter
Value                  Count  Frequency (%)
e                      6      35.3%
y                      2      11.8%
u                      2      11.8%
c                      1      5.9%
k                      1      5.9%
s                      1      5.9%
t                      1      5.9%
r                      1      5.9%
a                      1      5.9%
m                      1      5.9%

Decimal Number
Value                  Count  Frequency (%)
2                      609    49.5%
5                      513    41.7%
4                      87     7.1%
1                      9      0.7%
3                      8      0.7%
8                      1      0.1%
7                      1      0.1%
6                      1      0.1%
0                      1      0.1%

Space Separator
Value                  Count  Frequency (%)
(space)                1045   100.0%

Close Punctuation
Value                  Count  Frequency (%)
)                      203    100.0%

Open Punctuation
Value                  Count  Frequency (%)
(                      199    100.0%

Dash Punctuation
Value                  Count  Frequency (%)
-                      1      100.0%

Other Symbol
Value                  Count  Frequency (%)
(not preserved)        1      100.0%

Other Punctuation
Value                  Count  Frequency (%)
&                      1      100.0%

Most occurring scripts

Value                  Count  Frequency (%)
Hangul                 13823  79.7%
Common                 2679   15.5%
Latin                  837    4.8%

Most frequent character per script

Hangul
Value                  Count  Frequency (%)
(not preserved)        1550   11.2%
(not preserved)        951    6.9%
(not preserved)        819    5.9%
(not preserved)        786    5.7%
(not preserved)        452    3.3%
(not preserved)        431    3.1%
(not preserved)        420    3.0%
(not preserved)        404    2.9%
(not preserved)        374    2.7%
(not preserved)        370    2.7%
Other values (351)     7266   52.6%

Latin
Value                  Count  Frequency (%)
S                      347    41.5%
G                      346    41.3%
C                      52     6.2%
U                      46     5.5%
e                      6      0.7%
R                      6      0.7%
I                      4      0.5%
K                      3      0.4%
L                      3      0.4%
H                      3      0.4%
Other values (17)      21     2.5%

Common
Value                  Count  Frequency (%)
(space)                1045   39.0%
2                      609    22.7%
5                      513    19.1%
)                      203    7.6%
(                      199    7.4%
4                      87     3.2%
1                      9      0.3%
3                      8      0.3%
8                      1      < 0.1%
-                      1      < 0.1%
Other values (4)       4      0.1%

Most occurring blocks

Value                  Count  Frequency (%)
Hangul                 13822  79.7%
ASCII                  3516   20.3%
None                   1      < 0.1%

Most frequent character per block

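Hangul
Value                  Count  Frequency (%)
(not preserved)        1550   11.2%
(not preserved)        951    6.9%
(not preserved)        819    5.9%
(not preserved)        786    5.7%
(not preserved)        452    3.3%
(not preserved)        431    3.1%
(not preserved)        420    3.0%
(not preserved)        404    2.9%
(not preserved)        374    2.7%
(not preserved)        370    2.7%
Other values (350)     7265   52.6%

ASCII
Value                  Count  Frequency (%)
(space)                1045   29.7%
2                      609    17.3%
5                      513    14.6%
S                      347    9.9%
G                      346    9.8%
)                      203    5.8%
(                      199    5.7%
4                      87     2.5%
C                      52     1.5%
U                      46     1.3%
Other values (31)      69     2.0%

None
Value                  Count  Frequency (%)
(not preserved)        1      100.0%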

주소 (address)
Text

Distinct: 1563
Distinct (%): 98.5%
Missing: 0
Missing (%): 0.0%
Memory size: 12.5 KiB

Length

Max length: 68
Median length: 56
Mean length: 29.178436
Min length: 20

Characters and Unicode

Total characters: 46277
Distinct characters: 375
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1540
Unique (%): 97.1%

Sample

1st row: 대구광역시 중구 달구벌대로 2125-3, 1층 (봉산동)
2nd row: 대구광역시 중구 공평로 26-20, 1층 (삼덕동2가)
3rd row: 대구광역시 중구 국채보상로 621 (공평동)
4th row: 대구광역시 중구 달구벌대로393길 62, 서문목욕탕 상가동 1층 1-1호 (대신동)
5th row: 대구광역시 중구 대봉로 226, 1층 (대봉동, 대봉화성그린빌아파트)

Value                  Count  Frequency (%)
대구광역시             1586   16.9%
1층                    483    5.1%
달서구                 379    4.0%
북구                   283    3.0%
수성구                 244    2.6%
동구                   228    2.4%
달성군                 141    1.5%
남구                   116    1.2%
중구                   109    1.2%
서구                   86     0.9%
Other values (1967)    5738   61.1%

Most occurring characters

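Value                  Count  Frequency (%)
(space)                7807   16.9%
(not preserved)        3284   7.1%
1                      2216   4.8%
(not preserved)        2140   4.6%
(not preserved)        2138   4.6%
(not preserved)        1634   3.5%
(not preserved)        1627   3.5%
(not preserved)        1590   3.4%
(not preserved)        1553   3.4%
(                      1486   3.2%
Other values (365)     20802  45.0%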

Most occurring categories

Value                  Count  Frequency (%)
Other Letter           26885  58.1%
Space Separator        7807   16.9%
Decimal Number         7319   15.8%
Open Punctuation       1486   3.2%
Close Punctuation      1486   3.2%
Other Punctuation      1086   2.3%
Dash Punctuation       140    0.3%
Uppercase Letter       45     0.1%
Lowercase Letter       16     < 0.1%
Math Symbol            7      < 0.1%

Most frequent character per category

Other Letter
Value                  Count  Frequency (%)
(not preserved)        3284   12.2%
(not preserved)        2140   8.0%
(not preserved)        2138   8.0%
(not preserved)        1634   6.1%
(not preserved)        1627   6.1%
(not preserved)        1590   5.9%
(not preserved)        1553   5.8%
(not preserved)        729    2.7%
(not preserved)        658    2.4%
(not preserved)        658    2.4%
Other values (325)     10874  40.4%

Uppercase Letter
Value                  Count  Frequency (%)
S                      5      11.1%
H                      5      11.1%
O                      4      8.9%
I                      4      8.9%
L                      3      6.7%
T                      3      6.7%
D                      3      6.7%
W                      3      6.7%
A                      3      6.7%
R                      2      4.4%
Other values (7)       10     22.2%

Decimal Number
Value                  Count  Frequency (%)
1                      2216   30.3%
2                      946    12.9%
0                      770    10.5%
3                      729    10.0%
4                      611    8.3%
5                      554    7.6%
6                      460    6.3%
7                      373    5.1%
8                      338    4.6%
9                      322    4.4%

Lowercase Letter
Value                  Count  Frequency (%)
e                      10     62.5%
a                      2      12.5%
i                      1      6.2%
n                      1      6.2%
h                      1      6.2%
b                      1      6.2%

Other Punctuation
Value                  Count  Frequency (%)
,                      1084   99.8%
.                      2      0.2%

Space Separator
Value                  Count  Frequency (%)
(space)                7807   100.0%

Open Punctuation
Value                  Count  Frequency (%)
(                      1486   100.0%

Close Punctuation
Value                  Count  Frequency (%)
)                      1486   100.0%

Dash Punctuation
Value                  Count  Frequency (%)
-                      140    100.0%

Math Symbol
Value                  Count  Frequency (%)
~                      7      100.0%

Most occurring scripts

Value                  Count  Frequency (%)
Hangul                 26885  58.1%
Common                 19331  41.8%
Latin                  61     0.1%

Most frequent character per script

Hangul
Value                  Count  Frequency (%)
(not preserved)        3284   12.2%
(not preserved)        2140   8.0%
(not preserved)        2138   8.0%
(not preserved)        1634   6.1%
(not preserved)        1627   6.1%
(not preserved)        1590   5.9%
(not preserved)        1553   5.8%
(not preserved)        729    2.7%
(not preserved)        658    2.4%
(not preserved)        658    2.4%
Other values (325)     10874  40.4%

Latin
Value                  Count  Frequency (%)
e                      10     16.4%
S                      5      8.2%
H                      5      8.2%
O                      4      6.6%
I                      4      6.6%
L                      3      4.9%
T                      3      4.9%
D                      3      4.9%
W                      3      4.9%
A                      3      4.9%
Other values (13)      18     29.5%

Common
Value                  Count  Frequency (%)
(space)                7807   40.4%
1                      2216   11.5%
(                      1486   7.7%
)                      1486   7.7%
,                      1084   5.6%
2                      946    4.9%
0                      770    4.0%
3                      729    3.8%
4                      611    3.2%
5                      554    2.9%
Other values (7)       1642   8.5%

Most occurring blocks

Value                  Count  Frequency (%)
Hangul                 26885  58.1%
ASCII                  19392  41.9%

Most frequent character per block

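ASCII
Value                  Count  Frequency (%)
(space)                7807   40.3%
1                      2216   11.4%
(                      1486   7.7%
)                      1486   7.7%
,                      1084   5.6%
2                      946    4.9%
0                      770    4.0%
3                      729    3.8%
4                      611    3.2%
5                      554    2.9%
Other values (30)      1703   8.8%

Hangul
Value                  Count  Frequency (%)
(not preserved)        3284   12.2%
(not preserved)        2140   8.0%
(not preserved)        2138   8.0%
(not preserved)        1634   6.1%
(not preserved)        1627   6.1%
(not preserved)        1590   5.9%
(not preserved)        1553   5.8%
(not preserved)        729    2.7%
(not preserved)        658    2.4%
(not preserved)        658    2.4%
Other values (325)     10874  40.4%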

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

        연번     구군
연번    1.000    0.939
구군    0.939    1.000

[Figure: correlation heatmap]

        연번     구군
연번    1.000    0.817
구군    0.817    1.000

Missing values

[Figure: a simple visualization of nullity by column]
[Figure: nullity matrix, a data-dense display which lets you quickly pick out patterns in data completion]

Sample

First rows

(index) | 연번 | 구군 | 상호명 | 주소
0 | 1 | 중구 | (주)코리아세븐대구드림점 | 대구광역시 중구 달구벌대로 2125-3, 1층 (봉산동)
1 | 2 | 중구 | 씨유삼덕원룸점 | 대구광역시 중구 공평로 26-20, 1층 (삼덕동2가)
2 | 3 | 중구 | 지에스(GS)25대구시티센터점 | 대구광역시 중구 국채보상로 621 (공평동)
3 | 4 | 중구 | 씨유대신태왕점 | 대구광역시 중구 달구벌대로393길 62, 서문목욕탕 상가동 1층 1-1호 (대신동)
4 | 5 | 중구 | 씨유대봉더샵센트럴파크점 | 대구광역시 중구 대봉로 226, 1층 (대봉동, 대봉화성그린빌아파트)
5 | 6 | 중구 | 씨유남산e편한점 | 대구광역시 중구 달구벌대로 2020, 남산e편한세상 단지내상가 401동 106호 (남산동, 남산e편한세상)
6 | 7 | 중구 | 씨유삼덕대로점 | 대구광역시 중구 달구벌대로 2145 (삼덕동1가)
7 | 8 | 중구 | 지에스25남산휴먼점 | 대구광역시 중구 달구벌대로 1988-13, 1층 (남산동)
8 | 9 | 중구 | 지에스(GS)25교동타운점 | 대구광역시 중구 교동4길 30, 1층 (완전동)
9 | 10 | 중구 | 세븐일레븐 대구동성로스파크점 | 대구광역시 중구 동성로6길 61, 1층 103호 (공평동)

Last rows

(index) | 연번 | 구군 | 상호명 | 주소
1576 | 1577 | 달성군 | 세븐일레븐달성구지대로점 | 대구광역시 달성군 구지면 과학마을로 34
1577 | 1578 | 달성군 | 씨유대구화원점 | 대구광역시 달성군 화원읍 사문진로 434
1578 | 1579 | 달성군 | GS25 논공베스트점 | 대구광역시 달성군 논공읍 북리1길 15
1579 | 1580 | 달성군 | GS25다사주공점 | 대구광역시 달성군 다사읍 매곡로14길 11
1580 | 1581 | 달성군 | GS25달성본리점 | 대구광역시 달성군 화원읍 성암로1길 5
1581 | 1582 | 달성군 | GS25옥포대로점 | 대구광역시 달성군 옥포읍 비슬로 2180
1582 | 1583 | 달성군 | GS25논공점 | 대구광역시 달성군 논공읍 논공로5길 152
1583 | 1584 | 달성군 | 세븐일레븐대구다사역점 | 대구광역시 달성군 다사읍 다사역로 39
1584 | 1585 | 달성군 | 세븐일레븐논공점 | 대구광역시 달성군 논공읍 논공로30길 8-1
1585 | 1586 | 달성군 | 씨유논공행복점 | 대구광역시 달성군 논공읍 논공로24길 23