Overview

Dataset statistics

Number of variables: 4
Number of observations: 96
Missing cells: 103
Missing cells (%): 26.8%
Duplicate rows: 1
Duplicate rows (%): 1.0%
Total size in memory: 3.2 KiB
Average record size in memory: 34.4 B
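
The percentages above follow directly from the raw counts. The sketch below recomputes them, assuming the per-column missing counts reported in the Alerts and Variables sections (34, 34, 35, and 0). The variable names are illustrative and are not part of ydata-profiling.

```python
# Counts copied from this report; the variable names are illustrative only.
n_variables = 4
n_observations = 96
missing_per_column = {
    "연번": 34,
    "사업장명칭": 34,
    "소재지(도로명)": 35,
    "데이터기준일자": 0,
}

total_cells = n_variables * n_observations        # 4 * 96 = 384 cells
missing_cells = sum(missing_per_column.values())  # 34 + 34 + 35 + 0 = 103
missing_pct = 100 * missing_cells / total_cells

duplicate_rows = 1
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells: {missing_cells} ({missing_pct:.1f}%)")
print(f"Duplicate rows: {duplicate_rows} ({duplicate_pct:.1f}%)")
```

Rounded to one decimal place, these reproduce the 26.8% and 1.0% figures listed above.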

Variable types

Numeric: 1
Text: 2
Categorical: 1

Dataset

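Description: A data file listing animal retail businesses located in Seo-gu, Incheon Metropolitan City, including the business name (사업장명칭), lot-number address (소재지(지번)), and road-name address (소재지(도로명)).
URL: https://www.data.go.kr/data/15068794/fileData.do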

Alerts

Dataset has 1 (1.0%) duplicate rows [Duplicates]
연번 is highly overall correlated with 데이터기준일자 [High correlation]
데이터기준일자 is highly overall correlated with 연번 [High correlation]
연번 has 34 (35.4%) missing values [Missing]
사업장명칭 has 34 (35.4%) missing values [Missing]
소재지(도로명) has 35 (36.5%) missing values [Missing]

Reproduction

Analysis started: 2023-12-12 13:43:51.426281
Analysis finished: 2023-12-12 13:43:52.216296
Duration: 0.79 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 62
Distinct (%): 100.0%
Missing: 34
Missing (%): 35.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 31.5
Minimum: 1
Maximum: 62
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 996.0 B

Quantile statistics

Minimum: 1
5th percentile: 4.05
Q1: 16.25
Median: 31.5
Q3: 46.75
95th percentile: 58.95
Maximum: 62
Range: 61
Interquartile range (IQR): 30.5
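
These quantiles are consistent with a column holding the integers 1 through 62 under linear interpolation. The sketch below checks this with Python's standard library. It assumes the 62 non-missing 연번 values are exactly 1..62, which the report suggests (62 distinct values, minimum 1, maximum 62, sum 1953) but does not list in full.

```python
import statistics

# Assumption: the non-missing 연번 values are the integers 1..62.
values = list(range(1, 63))

# method="inclusive" interpolates linearly between the minimum and maximum.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
twentieths = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]

print(p5, q1, median, q3, p95)  # 4.05 16.25 31.5 46.75 58.95
print("Range:", max(values) - min(values))
print("IQR:", q3 - q1)
```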

Descriptive statistics

Standard deviation: 18.041619
Coefficient of variation (CV): 0.5727498
Kurtosis: -1.2
Mean: 31.5
Median Absolute Deviation (MAD): 15.5
Skewness: 0
Sum: 1953
Variance: 325.5
Monotonicity: Strictly increasing
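
Most of the descriptive statistics can be reproduced the same way. This is a sketch under the assumption that the non-missing 연번 values are the integers 1..62. It uses the sample variance (n - 1 denominator), which matches the reported value of 325.5.

```python
import math
import statistics

# Assumption: the non-missing 연번 values are the integers 1..62.
values = list(range(1, 63))

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std_dev = math.sqrt(variance)
cv = std_dev / mean                     # coefficient of variation
median = statistics.median(values)
# Median absolute deviation: median of distances from the median.
mad = statistics.median([abs(v - median) for v in values])
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(f"mean={mean} variance={variance} std={std_dev:.6f} cv={cv:.7f}")
print(f"mad={mad} sum={sum(values)} strictly_increasing={strictly_increasing}")
```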
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
48 | 1 | 1.0%
35 | 1 | 1.0%
36 | 1 | 1.0%
37 | 1 | 1.0%
38 | 1 | 1.0%
39 | 1 | 1.0%
40 | 1 | 1.0%
41 | 1 | 1.0%
42 | 1 | 1.0%
43 | 1 | 1.0%
Other values (52) | 52 | 54.2%
(Missing) | 34 | 35.4%

Value | Count | Frequency (%)
1 | 1 | 1.0%
2 | 1 | 1.0%
3 | 1 | 1.0%
4 | 1 | 1.0%
5 | 1 | 1.0%
6 | 1 | 1.0%
7 | 1 | 1.0%
8 | 1 | 1.0%
9 | 1 | 1.0%
10 | 1 | 1.0%

Value | Count | Frequency (%)
62 | 1 | 1.0%
61 | 1 | 1.0%
60 | 1 | 1.0%
59 | 1 | 1.0%
58 | 1 | 1.0%
57 | 1 | 1.0%
56 | 1 | 1.0%
55 | 1 | 1.0%
54 | 1 | 1.0%
53 | 1 | 1.0%

사업장명칭
Text

MISSING 

Distinct: 61
Distinct (%): 98.4%
Missing: 34
Missing (%): 35.4%
Memory size: 900.0 B

Length

Max length: 24
Median length: 20
Mean length: 7.8548387
Min length: 2

Characters and Unicode

Total characters: 487
Distinct characters: 161
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
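
As an illustration of how such category counts can be derived, the sketch below uses Python's standard `unicodedata` module on two of the sample business names from this report. The helper name `category_counts` is hypothetical, and this is not necessarily how ydata-profiling computes its figures.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# 'Nd' = Decimal Number, 'Lo' = Other Letter, 'Zs' = Space Separator
counts_a = category_counts("25시파랑새동물병원")
counts_b = category_counts("애견용품 할인매장")
print(dict(counts_a))
print(dict(counts_b))
```

Hangul syllables fall under "Other Letter" (`Lo`), which is why that category dominates this column.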

Unique

Unique: 60
Unique (%): 96.8%

Sample

1st row: 럭셔리하개
2nd row: 도그샵
3rd row: 애견용품 할인매장
4th row: 인천조류원
5th row: 25시파랑새동물병원

Value | Count | Frequency (%)
인천점 | 4 | 3.8%
고양이 | 3 | 2.9%
인천청라점 | 3 | 2.9%
[unreadable] | 2 | 1.9%
인천강아지분양 | 2 | 1.9%
[unreadable] | 2 | 1.9%
강아지고양이 | 2 | 1.9%
스퀘어독스 | 2 | 1.9%
[unreadable] | 2 | 1.9%
강아지 | 2 | 1.9%
Other values (78) | 81 | 77.1%

Most occurring characters

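Value | Count | Frequency (%)
(space) | 43 | 8.8%
[unreadable] | 17 | 3.5%
[unreadable] | 17 | 3.5%
[unreadable] | 16 | 3.3%
[unreadable] | 16 | 3.3%
[unreadable] | 16 | 3.3%
[unreadable] | 13 | 2.7%
[unreadable] | 13 | 2.7%
[unreadable] | 13 | 2.7%
[unreadable] | 12 | 2.5%
Other values (151) | 311 | 63.9%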

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 421 | 86.4%
Space Separator | 43 | 8.8%
Uppercase Letter | 8 | 1.6%
Close Punctuation | 5 | 1.0%
Open Punctuation | 5 | 1.0%
Decimal Number | 4 | 0.8%
Other Punctuation | 1 | 0.2%

Most frequent character per category

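Other Letter
Value | Count | Frequency (%)
[unreadable] | 17 | 4.0%
[unreadable] | 17 | 4.0%
[unreadable] | 16 | 3.8%
[unreadable] | 16 | 3.8%
[unreadable] | 16 | 3.8%
[unreadable] | 13 | 3.1%
[unreadable] | 13 | 3.1%
[unreadable] | 13 | 3.1%
[unreadable] | 12 | 2.9%
[unreadable] | 10 | 2.4%
Other values (137) | 278 | 66.0%

Uppercase Letter
Value | Count | Frequency (%)
K | 2 | 25.0%
I | 1 | 12.5%
G | 1 | 12.5%
O | 1 | 12.5%
D | 1 | 12.5%
H | 1 | 12.5%
P | 1 | 12.5%

Decimal Number
Value | Count | Frequency (%)
2 | 2 | 50.0%
5 | 1 | 25.0%
4 | 1 | 25.0%

Space Separator
Value | Count | Frequency (%)
(space) | 43 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 5 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 5 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
& | 1 | 100.0%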

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 421 | 86.4%
Common | 58 | 11.9%
Latin | 8 | 1.6%

Most frequent character per script

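Hangul
Value | Count | Frequency (%)
[unreadable] | 17 | 4.0%
[unreadable] | 17 | 4.0%
[unreadable] | 16 | 3.8%
[unreadable] | 16 | 3.8%
[unreadable] | 16 | 3.8%
[unreadable] | 13 | 3.1%
[unreadable] | 13 | 3.1%
[unreadable] | 13 | 3.1%
[unreadable] | 12 | 2.9%
[unreadable] | 10 | 2.4%
Other values (137) | 278 | 66.0%

Common
Value | Count | Frequency (%)
(space) | 43 | 74.1%
) | 5 | 8.6%
( | 5 | 8.6%
2 | 2 | 3.4%
& | 1 | 1.7%
5 | 1 | 1.7%
4 | 1 | 1.7%

Latin
Value | Count | Frequency (%)
K | 2 | 25.0%
I | 1 | 12.5%
G | 1 | 12.5%
O | 1 | 12.5%
D | 1 | 12.5%
H | 1 | 12.5%
P | 1 | 12.5%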

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 421 | 86.4%
ASCII | 66 | 13.6%

Most frequent character per block

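ASCII
Value | Count | Frequency (%)
(space) | 43 | 65.2%
) | 5 | 7.6%
( | 5 | 7.6%
K | 2 | 3.0%
2 | 2 | 3.0%
& | 1 | 1.5%
I | 1 | 1.5%
G | 1 | 1.5%
O | 1 | 1.5%
D | 1 | 1.5%
Other values (4) | 4 | 6.1%

Hangul
Value | Count | Frequency (%)
[unreadable] | 17 | 4.0%
[unreadable] | 17 | 4.0%
[unreadable] | 16 | 3.8%
[unreadable] | 16 | 3.8%
[unreadable] | 16 | 3.8%
[unreadable] | 13 | 3.1%
[unreadable] | 13 | 3.1%
[unreadable] | 13 | 3.1%
[unreadable] | 12 | 2.9%
[unreadable] | 10 | 2.4%
Other values (137) | 278 | 66.0%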

소재지(도로명)
Text

MISSING 

Distinct: 59
Distinct (%): 96.7%
Missing: 35
Missing (%): 36.5%
Memory size: 900.0 B

Length

Max length: 55
Median length: 38
Mean length: 32.57377
Min length: 21

Characters and Unicode

Total characters: 1987
Distinct characters: 147
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
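
The block split between Hangul and ASCII can be approximated by code point range. The sketch below is an assumption-laden illustration, not ydata-profiling's own implementation. It treats U+0000..U+007F as "ASCII" (the Basic Latin block) and U+AC00..U+D7AF as "Hangul" (the Hangul Syllables block), and the helper name `block_of` is hypothetical.

```python
from collections import Counter


def block_of(ch: str) -> str:
    """Classify a character into a coarse Unicode block by code point."""
    cp = ord(ch)
    if cp <= 0x7F:
        return "ASCII"   # Basic Latin block, U+0000..U+007F
    if 0xAC00 <= cp <= 0xD7AF:
        return "Hangul"  # Hangul Syllables block, U+AC00..U+D7AF
    return "Other"


# Second sample address from this report.
address = "인천광역시 서구 승학로 283 (연희동)"
block_counts = Counter(block_of(ch) for ch in address)
print(dict(block_counts))
```

For this address, digits, spaces, and parentheses land in the ASCII bucket, and every other character is a Hangul syllable.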

Unique

Unique: 57
Unique (%): 93.4%

Sample

1st row: 인천광역시 서구 승학로 457, 상가동 1층 101호 (검암동)
2nd row: 인천광역시 서구 승학로 283 (연희동)
3rd row: 인천광역시 서구 가정로 298 (석남동)
4th row: 인천광역시 서구 검단로609번길 3, 4층 (마전동)
5th row: 인천광역시 서구 완정로 182 (마전동)

Value | Count | Frequency (%)
인천광역시 | 61 | 15.7%
서구 | 61 | 15.7%
청라동 | 16 | 4.1%
1층 | 11 | 2.8%
가좌동 | 7 | 1.8%
연희동 | 6 | 1.5%
마전동 | 6 | 1.5%
승학로 | 6 | 1.5%
지하1층 | 5 | 1.3%
검암동 | 5 | 1.3%
Other values (147) | 205 | 52.7%

Most occurring characters

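Value | Count | Frequency (%)
(space) | 329 | 16.6%
1 | 97 | 4.9%
[unreadable] | 70 | 3.5%
[unreadable] | 65 | 3.3%
[unreadable] | 62 | 3.1%
[unreadable] | 62 | 3.1%
[unreadable] | 62 | 3.1%
( | 61 | 3.1%
[unreadable] | 61 | 3.1%
[unreadable] | 61 | 3.1%
Other values (137) | 1057 | 53.2%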

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1135 | 57.1%
Space Separator | 329 | 16.6%
Decimal Number | 323 | 16.3%
Open Punctuation | 61 | 3.1%
Close Punctuation | 61 | 3.1%
Other Punctuation | 51 | 2.6%
Dash Punctuation | 17 | 0.9%
Uppercase Letter | 10 | 0.5%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[unreadable] | 70 | 6.2%
[unreadable] | 65 | 5.7%
[unreadable] | 62 | 5.5%
[unreadable] | 62 | 5.5%
[unreadable] | 62 | 5.5%
[unreadable] | 61 | 5.4%
[unreadable] | 61 | 5.4%
[unreadable] | 61 | 5.4%
[unreadable] | 61 | 5.4%
[unreadable] | 47 | 4.1%
Other values (119) | 523 | 46.1%

Decimal Number
Value | Count | Frequency (%)
1 | 97 | 30.0%
2 | 41 | 12.7%
0 | 33 | 10.2%
8 | 29 | 9.0%
5 | 25 | 7.7%
7 | 23 | 7.1%
4 | 21 | 6.5%
3 | 20 | 6.2%
9 | 18 | 5.6%
6 | 16 | 5.0%

Uppercase Letter
Value | Count | Frequency (%)
B | 6 | 60.0%
J | 2 | 20.0%
A | 2 | 20.0%

Space Separator
Value | Count | Frequency (%)
(space) | 329 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 61 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 61 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 51 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 17 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1135 | 57.1%
Common | 842 | 42.4%
Latin | 10 | 0.5%

Most frequent character per script

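Hangul
Value | Count | Frequency (%)
[unreadable] | 70 | 6.2%
[unreadable] | 65 | 5.7%
[unreadable] | 62 | 5.5%
[unreadable] | 62 | 5.5%
[unreadable] | 62 | 5.5%
[unreadable] | 61 | 5.4%
[unreadable] | 61 | 5.4%
[unreadable] | 61 | 5.4%
[unreadable] | 61 | 5.4%
[unreadable] | 47 | 4.1%
Other values (119) | 523 | 46.1%

Common
Value | Count | Frequency (%)
(space) | 329 | 39.1%
1 | 97 | 11.5%
( | 61 | 7.2%
) | 61 | 7.2%
, | 51 | 6.1%
2 | 41 | 4.9%
0 | 33 | 3.9%
8 | 29 | 3.4%
5 | 25 | 3.0%
7 | 23 | 2.7%
Other values (5) | 92 | 10.9%

Latin
Value | Count | Frequency (%)
B | 6 | 60.0%
J | 2 | 20.0%
A | 2 | 20.0%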

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1135 | 57.1%
ASCII | 852 | 42.9%

Most frequent character per block

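ASCII
Value | Count | Frequency (%)
(space) | 329 | 38.6%
1 | 97 | 11.4%
( | 61 | 7.2%
) | 61 | 7.2%
, | 51 | 6.0%
2 | 41 | 4.8%
0 | 33 | 3.9%
8 | 29 | 3.4%
5 | 25 | 2.9%
7 | 23 | 2.7%
Other values (8) | 102 | 12.0%

Hangul
Value | Count | Frequency (%)
[unreadable] | 70 | 6.2%
[unreadable] | 65 | 5.7%
[unreadable] | 62 | 5.5%
[unreadable] | 62 | 5.5%
[unreadable] | 62 | 5.5%
[unreadable] | 61 | 5.4%
[unreadable] | 61 | 5.4%
[unreadable] | 61 | 5.4%
[unreadable] | 61 | 5.4%
[unreadable] | 47 | 4.1%
Other values (119) | 523 | 46.1%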

데이터기준일자
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Memory size: 900.0 B
2023-07-31: 62
<NA>: 34

Length

Max length: 10
Median length: 10
Mean length: 7.875
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-07-31
2nd row: 2023-07-31
3rd row: 2023-07-31
4th row: 2023-07-31
5th row: 2023-07-31

Common Values

Value | Count | Frequency (%)
2023-07-31 | 62 | 64.6%
<NA> | 34 | 35.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2023-07-31 | 62 | 64.6%
na | 34 | 35.4%

Interactions

[Figure: interactions plot]

Correlations

Variable | 연번 | 사업장명칭 | 소재지(도로명)
연번 | 1.000 | 1.000 | 1.000
사업장명칭 | 1.000 | 1.000 | 1.000
소재지(도로명) | 1.000 | 1.000 | 1.000

Variable | 연번 | 데이터기준일자
연번 | 1.000 | 1.000
데이터기준일자 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
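
To make nullity correlation concrete, the sketch below computes a Pearson correlation between the missing-value indicators of two columns. It uses only the 20 rows shown in the Sample section (indices 0 to 9 and 86 to 95), so the result illustrates the idea and is not the value plotted in the report's heatmap. The `pearson` helper is written out by hand for this example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# 1 = missing, 0 = present, for the 20 rows listed in the Sample section.
seq_missing = [0] * 10 + [1] * 10                # 연번
addr_missing = [0] * 5 + [1] + [0] * 4 + [1] * 10  # 소재지(도로명); index 5 is <NA>

r = pearson(seq_missing, addr_missing)
print(f"nullity correlation on the sample rows: {r:.3f}")
```

A value close to 1 means the two columns tend to be missing in the same rows, which is the pattern the trailing all-`<NA>` rows produce here.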

Sample

Index | 연번 | 사업장명칭 | 소재지(도로명) | 데이터기준일자
0 | 1 | 럭셔리하개 | 인천광역시 서구 승학로 457, 상가동 1층 101호 (검암동) | 2023-07-31
1 | 2 | 도그샵 | 인천광역시 서구 승학로 283 (연희동) | 2023-07-31
2 | 3 | 애견용품 할인매장 | 인천광역시 서구 가정로 298 (석남동) | 2023-07-31
3 | 4 | 인천조류원 | 인천광역시 서구 검단로609번길 3, 4층 (마전동) | 2023-07-31
4 | 5 | 25시파랑새동물병원 | 인천광역시 서구 완정로 182 (마전동) | 2023-07-31
5 | 6 | 행복한동물병원 | <NA> | 2023-07-31
6 | 7 | 아지와 옹이 | 인천광역시 서구 승학로 446 (검암동) | 2023-07-31
7 | 8 | 메인독 | 인천광역시 서구 신석로111번길 15 (석남동) | 2023-07-31
8 | 9 | 핫도그 | 인천광역시 서구 승학로 574 (검암동) | 2023-07-31
9 | 10 | 해피펫 | 인천광역시 서구 검단로501번안길 16 (마전동) | 2023-07-31
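
Index | 연번 | 사업장명칭 | 소재지(도로명) | 데이터기준일자
86 | <NA> | <NA> | <NA> | <NA>
87 | <NA> | <NA> | <NA> | <NA>
88 | <NA> | <NA> | <NA> | <NA>
89 | <NA> | <NA> | <NA> | <NA>
90 | <NA> | <NA> | <NA> | <NA>
91 | <NA> | <NA> | <NA> | <NA>
92 | <NA> | <NA> | <NA> | <NA>
93 | <NA> | <NA> | <NA> | <NA>
94 | <NA> | <NA> | <NA> | <NA>
95 | <NA> | <NA> | <NA> | <NA>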

Duplicate rows

Most frequently occurring

Index | 연번 | 사업장명칭 | 소재지(도로명) | 데이터기준일자 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | 34
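
The overview reports 1 duplicate row while this table shows a count of 34. One plausible reading, offered here as an assumption rather than a confirmed description of ydata-profiling's behavior, is that the overview counts distinct duplicated row patterns and this table counts occurrences of each pattern. The sketch below reproduces both numbers with placeholder rows. The names and addresses are hypothetical stand-ins, not the real data.

```python
from collections import Counter

# 62 distinct placeholder rows (hypothetical values) plus 34 all-empty rows,
# mirroring the shape of the dataset described in this report.
rows = [(i, f"name-{i}", f"address-{i}", "2023-07-31") for i in range(1, 63)]
rows += [(None, None, None, None)] * 34

occurrences = Counter(rows)
# Keep only row patterns that occur more than once.
duplicated = {row: n for row, n in occurrences.items() if n > 1}

print("rows:", len(rows))
print("distinct duplicated patterns:", len(duplicated))
print("occurrences of the all-empty row:", duplicated[(None, None, None, None)])
```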