Overview

Dataset statistics

Number of variables: 3
Number of observations: 135
Missing cells: 2
Missing cells (%): 0.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.3 KiB
Average record size in memory: 25.0 B
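
The derived figures in this table can be checked from the raw counts. The sketch below assumes the missing-cell percentage is missing cells divided by all cells (variables times observations), and that the average record size is total memory divided by the number of observations. The variable names are illustrative and the 3.3 KiB input is itself a rounded value.

```python
# Figures copied from the dataset statistics above.
n_variables = 3
n_observations = 135
missing_cells = 2
total_size_bytes = 3.3 * 1024  # 3.3 KiB as reported (already rounded)

# Assumed definitions: share of empty cells, and memory per row.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
avg_record_bytes = total_size_bytes / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions both values round to the reported 0.5% and 25.0 B.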

Variable types

Text: 2
Categorical: 1

Dataset

Description: File download
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-16013/S/1/datasetView.do

Alerts

Unnamed: 2 is highly imbalanced (66.5%) (alert type: Imbalance)
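
The 66.5% figure matches one minus the normalized Shannon entropy of the category counts reported for Unnamed: 2 (116, 17, 1 and 1). The sketch below works on that assumption. It is not taken from the profiler's source, and `imbalance_score` is a hypothetical helper name.

```python
from math import log2

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform split, near 1 for one dominant class."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no spread to measure
    entropy = -sum((c / total) * log2(c / total) for c in counts if c > 0)
    return 1 - entropy / log2(k)

# Category counts as listed under "Common Values" for Unnamed: 2.
category_counts = {"수입": 116, "제조": 17, "<NA>": 1, "수입/제조": 1}
score = imbalance_score(list(category_counts.values()))
print(f"{score:.1%}")
```

A perfectly even split over four categories would score 0%; here one category holds 85.9% of the rows, which pushes the score to about two thirds.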

Reproduction

Analysis started: 2024-05-11 01:30:53.746894
Analysis finished: 2024-05-11 01:30:55.332256
Duration: 1.59 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

반려동물 배합사료 업체 현황(19.11.15.기준)
Text

Distinct: 134
Distinct (%): 100.0%
Missing: 1
Missing (%): 0.7%
Memory size: 1.2 KiB

Length

Max length: 3
Median length: 2
Mean length: 2.1865672
Min length: 1
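
The mean length can be reproduced from other figures in this report, assuming it is the total character count divided by the number of non-missing values. The sketch below checks that for both text variables (293 and 961 total characters, one missing value each, 135 observations). `mean_length` is an illustrative helper, not a profiler function.

```python
def mean_length(total_characters, n_observations, n_missing):
    """Average string length over the non-missing values only."""
    return total_characters / (n_observations - n_missing)

# Inputs copied from the report: total characters, observations, missing values.
first_text_variable = mean_length(293, 135, 1)
second_text_variable = mean_length(961, 135, 1)

print(f"{first_text_variable:.7f}")
print(f"{second_text_variable:.7f}")
```

Both results agree with the reported mean lengths to seven decimal places.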

Characters and Unicode

Total characters: 293
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
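
As a rough illustration of such character properties, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The input strings are a few values taken from the sample rows of this report, and the code-to-name mapping covers only the categories that appear here. The standard library does not expose script or block, so those breakdowns are not reproduced.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in these values.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "So": "Other Symbol",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}

def category_counts(strings):
    """Count characters per Unicode general category across all strings."""
    return Counter(unicodedata.category(ch) for s in strings for ch in s)

# A handful of values from the sample rows (not the full dataset).
values = ["연번", "1", "2", "㈜구펍", "주식회사 와스컴퍼니(WAAS COMPANY)"]
counts = category_counts(values)

for code, n in counts.most_common():
    print(f"{CATEGORY_NAMES.get(code, code):<18} {n}")
```

Hangul syllables fall under Other Letter, digits under Decimal Number, and the enclosed symbol ㈜ under Other Symbol, which is why the company-name variable shows more categories than the serial-number variable.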

Unique

Unique: 134
Unique (%): 100.0%

Sample

1st row: 연번 (serial no.)
2nd row: 1
3rd row: 2
4th row: 3
5th row: 4

Value               Count  Frequency (%)
32                  1      0.7%
101                 1      0.7%
98                  1      0.7%
97                  1      0.7%
96                  1      0.7%
95                  1      0.7%
94                  1      0.7%
93                  1      0.7%
92                  1      0.7%
91                  1      0.7%
Other values (124)  124    92.5%

Most occurring characters

Value             Count  Frequency (%)
1                 67     22.9%
2                 34     11.6%
3                 29     9.9%
4                 24     8.2%
5                 23     7.8%
6                 23     7.8%
8                 23     7.8%
9                 23     7.8%
0                 23     7.8%
7                 22     7.5%
Other values (2)  2      0.7%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  291    99.3%
Other Letter    2      0.7%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      67     23.0%
2      34     11.7%
3      29     10.0%
4      24     8.2%
5      23     7.9%
6      23     7.9%
8      23     7.9%
9      23     7.9%
0      23     7.9%
7      22     7.6%
Other Letter
Value  Count  Frequency (%)
연     1      50.0%
번     1      50.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  291    99.3%
Hangul  2      0.7%

Most frequent character per script

Common
Value  Count  Frequency (%)
1      67     23.0%
2      34     11.7%
3      29     10.0%
4      24     8.2%
5      23     7.9%
6      23     7.9%
8      23     7.9%
9      23     7.9%
0      23     7.9%
7      22     7.6%
Hangul
Value  Count  Frequency (%)
연     1      50.0%
번     1      50.0%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   291    99.3%
Hangul  2      0.7%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1      67     23.0%
2      34     11.7%
3      29     10.0%
4      24     8.2%
5      23     7.9%
6      23     7.9%
8      23     7.9%
9      23     7.9%
0      23     7.9%
7      22     7.6%
Hangul
Value  Count  Frequency (%)
연     1      50.0%
번     1      50.0%

Unnamed: 1
Text

Distinct: 132
Distinct (%): 98.5%
Missing: 1
Missing (%): 0.7%
Memory size: 1.2 KiB

Length

Max length: 24
Median length: 18
Mean length: 7.1716418
Min length: 2

Characters and Unicode

Total characters: 961
Distinct characters: 250
Distinct categories: 7
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 130
Unique (%): 97.0%
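
The report distinguishes distinct values (how many different values occur) from unique values (how many occur exactly once). The sketch below illustrates the difference on a hypothetical toy list; the placeholder names and the duplication in it are invented for the example. It then checks that the reported figures (132 distinct, 130 unique, 134 non-missing) are mutually consistent.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(v for v in values if v is not None)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Hypothetical toy column: one name repeated, one missing entry.
toy_values = ["company_a", "company_b", "company_b", "company_c", None]
toy_distinct, toy_unique = distinct_and_unique(toy_values)
print(toy_distinct, toy_unique)

# Consistency check on the reported figures for this variable.
non_missing, distinct, unique = 135 - 1, 132, 130
repeated_values = distinct - unique        # values that occur more than once
repeated_occurrences = non_missing - unique  # rows taken up by those values
print(repeated_values, repeated_occurrences)
```

With 2 repeated values covering 4 rows, each repeated value must occur exactly twice, which is consistent with 130/134 = 97.0% unique and 132/134 = 98.5% distinct.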

Sample

1st row: 업체명 (company name)
2nd row: ㈜구펍
3rd row: ㈜굿트랜드
4th row: ㈜산시아코리아
5th row: ㈜삼경아이에스

Value               Count  Frequency (%)
주식회사            16     10.1%
㈜벨벳케어          2      1.3%
디지탈디자인(inter  2      1.3%
zoo                 2      1.3%
데이원              1      0.6%
㈜에스텍            1      0.6%
㈜에버셀            1      0.6%
㈜에스틴            1      0.6%
㈜페슬러            1      0.6%
㈜퍼플네스트        1      0.6%
Other values (130)  130    82.3%

Most occurring characters

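Value               Count  Frequency (%)
㈜                  57     5.9%
[glyph missing]     33     3.4%
[glyph missing]     31     3.2%
[glyph missing]     30     3.1%
[glyph missing]     30     3.1%
[glyph missing]     28     2.9%
[glyph missing]     27     2.8%
[glyph missing]     25     2.6%
[glyph missing]     24     2.5%
[glyph missing]     24     2.5%
Other values (240)  652    67.8%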

Most occurring categories

Value              Count  Frequency (%)
Other Letter       762    79.3%
Other Symbol       57     5.9%
Uppercase Letter   57     5.9%
Lowercase Letter   37     3.9%
Space Separator    24     2.5%
Open Punctuation   12     1.2%
Close Punctuation  12     1.2%

Most frequent character per category

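Other Letter
Value               Count  Frequency (%)
[glyph missing]     33     4.3%
[glyph missing]     31     4.1%
[glyph missing]     30     3.9%
[glyph missing]     30     3.9%
[glyph missing]     28     3.7%
[glyph missing]     27     3.5%
[glyph missing]     25     3.3%
[glyph missing]     24     3.1%
[glyph missing]     24     3.1%
[glyph missing]     21     2.8%
Other values (202)  489    64.2%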
Uppercase Letter
Value              Count  Frequency (%)
A                  7      12.3%
N                  6      10.5%
O                  5      8.8%
I                  4      7.0%
T                  4      7.0%
E                  3      5.3%
G                  3      5.3%
S                  3      5.3%
M                  2      3.5%
W                  2      3.5%
Other values (11)  18     31.6%
Lowercase Letter
Value             Count  Frequency (%)
e                 5      13.5%
l                 5      13.5%
n                 4      10.8%
t                 4      10.8%
a                 4      10.8%
o                 4      10.8%
r                 3      8.1%
i                 2      5.4%
u                 2      5.4%
b                 1      2.7%
Other values (3)  3      8.1%
Other Symbol
Value    Count  Frequency (%)
㈜       57     100.0%
Space Separator
Value    Count  Frequency (%)
(space)  24     100.0%
Open Punctuation
Value    Count  Frequency (%)
(        12     100.0%
Close Punctuation
Value    Count  Frequency (%)
)        12     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  818    85.1%
Latin   94     9.8%
Common  48     5.0%
Han     1      0.1%

Most frequent character per script

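Hangul
Value               Count  Frequency (%)
㈜                  57     7.0%
[glyph missing]     33     4.0%
[glyph missing]     31     3.8%
[glyph missing]     30     3.7%
[glyph missing]     30     3.7%
[glyph missing]     28     3.4%
[glyph missing]     27     3.3%
[glyph missing]     25     3.1%
[glyph missing]     24     2.9%
[glyph missing]     24     2.9%
Other values (202)  509    62.2%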
Latin
Value              Count  Frequency (%)
A                  7      7.4%
N                  6      6.4%
e                  5      5.3%
l                  5      5.3%
O                  5      5.3%
n                  4      4.3%
t                  4      4.3%
a                  4      4.3%
o                  4      4.3%
I                  4      4.3%
Other values (24)  46     48.9%
Common
Value            Count  Frequency (%)
(space)          24     50.0%
(                12     25.0%
)                12     25.0%
Han
Value            Count  Frequency (%)
[glyph missing]  1      100.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  761    79.2%
ASCII   142    14.8%
None    57     5.9%
CJK     1      0.1%

Most frequent character per block

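None
Value               Count  Frequency (%)
㈜                  57     100.0%
Hangul
Value               Count  Frequency (%)
[glyph missing]     33     4.3%
[glyph missing]     31     4.1%
[glyph missing]     30     3.9%
[glyph missing]     30     3.9%
[glyph missing]     28     3.7%
[glyph missing]     27     3.5%
[glyph missing]     25     3.3%
[glyph missing]     24     3.2%
[glyph missing]     24     3.2%
[glyph missing]     21     2.8%
Other values (201)  488    64.1%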
ASCII
Value              Count  Frequency (%)
(space)            24     16.9%
(                  12     8.5%
)                  12     8.5%
A                  7      4.9%
N                  6      4.2%
e                  5      3.5%
l                  5      3.5%
O                  5      3.5%
n                  4      2.8%
t                  4      2.8%
Other values (27)  58     40.8%
CJK
Value            Count  Frequency (%)
[glyph missing]  1      100.0%

Unnamed: 2
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB

수입 (import): 116
제조 (manufacture): 17
<NA>: 1
수입/제조 (import/manufacture): 1

Length

Max length: 5
Median length: 2
Mean length: 2.037037
Min length: 2

Unique

Unique: 2
Unique (%): 1.5%

Sample

1st row: <NA>
2nd row: 수입/제조 (import/manufacture)
3rd row: 수입 (import)
4th row: 수입 (import)
5th row: 수입 (import)

Common Values

Value                           Count  Frequency (%)
수입 (import)                   116    85.9%
제조 (manufacture)              17     12.6%
<NA>                            1      0.7%
수입/제조 (import/manufacture)  1      0.7%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; its legend repeats the counts and frequencies listed under Common Values]

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
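
One common way to quantify nullity correlation is the Pearson correlation between per-column missingness indicators (1 where a cell is missing, 0 otherwise). The sketch below assumes that definition and runs on hypothetical toy rows, not on this dataset. `pearson` is a hand-written helper so the example needs only the standard library.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation, or None when either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # undefined for a column that is never (or always) missing
    return cov / sqrt(var_x * var_y)

# Hypothetical toy rows; None marks a missing cell.
rows = [
    (None, None, "x"),
    ("1", "a", "x"),
    ("2", "b", "y"),
    ("3", None, "y"),
]

# One 0/1 missingness indicator list per column.
null_flags = [[int(v is None) for v in column] for column in zip(*rows)]
null_counts = [sum(flags) for flags in null_flags]

r_first_second = pearson(null_flags[0], null_flags[1])
r_first_third = pearson(null_flags[0], null_flags[2])
print(null_counts, r_first_second, r_first_third)
```

A column with no missing cells has a constant indicator, so its correlation with any other column is undefined; heatmaps of this kind usually leave such columns blank or omit them.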

Sample

Index  반려동물 배합사료 업체 현황(19.11.15.기준)  Unnamed: 1        Unnamed: 2
0      <NA>                                        <NA>              <NA>
1      연번                                        업체명            수입/제조
2      1                                           ㈜구펍            수입
3      2                                           ㈜굿트랜드        수입
4      3                                           ㈜산시아코리아    수입
5      4                                           ㈜삼경아이에스    수입
6      5                                           ㈜삼경에프에스    수입
7      6                                           ㈜스템텍코리아    수입
8      7                                           ㈜싱크라이크펫    수입
9      8                                           ㈜엔디펫          수입

Index  반려동물 배합사료 업체 현황(19.11.15.기준)  Unnamed: 1                          Unnamed: 2
125    125                                         ㈜벨벳케어                          제조
126    126                                         주식회사 알앤케이컴퍼니             제조
127    127                                         주식회사 와스컴퍼니(WAAS COMPANY)   제조
128    128                                         주식회사 펫픽                       제조
129    129                                         ㈜우리펫                            제조
130    130                                         키친반                              제조
131    131                                         트릿테이블                          제조
132    132                                         판타펫코리아                        제조
133    133                                         펫츠쿡                              제조
134    134                                         하쿠페쿠                            제조
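
The sample shows that the file's title line was read as the column header, while the real header (연번, 업체명, 수입/제조) sits in row 1 after an empty row. The sketch below shows one way to promote that row to the header using only the standard library. The `raw` text is a hypothetical reconstruction of the file layout based on the sample rows, not the actual export.

```python
import csv
import io

# Hypothetical raw layout: title line, empty line, real header, then data rows.
raw = """반려동물 배합사료 업체 현황(19.11.15.기준),,
,,
연번,업체명,수입/제조
1,㈜구펍,수입
2,㈜굿트랜드,수입
"""

rows = list(csv.reader(io.StringIO(raw)))

# Locate the real header by its first label, then keep only the rows below it.
header_idx = next(i for i, row in enumerate(rows) if row and row[0] == "연번")
header = rows[header_idx]
records = [dict(zip(header, row)) for row in rows[header_idx + 1:] if any(row)]

print(header)
print(records[0])
```

Profiling the data after this step would give named columns and remove the header labels from the value counts.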