Overview

Dataset statistics

Number of variables: 3
Number of observations: 478
Missing cells: 15
Missing cells (%): 1.0%
Duplicate rows: 1
Duplicate rows (%): 0.2%
Total size in memory: 11.3 KiB
Average record size in memory: 24.3 B
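
The missing-cell figures above follow from simple arithmetic: 15 missing cells out of 478 x 3 = 1434 cells is about 1.0%. The sketch below reproduces that calculation on a toy table with the same shape. The function name `missing_cell_stats` and the placeholder row values are illustrative assumptions, not part of the report or of ydata-profiling.

```python
def missing_cell_stats(rows):
    """Return (missing_cells, total_cells, percent_missing) for a list of rows."""
    total = sum(len(row) for row in rows)
    missing = sum(1 for row in rows for cell in row if cell is None)
    pct = 100.0 * missing / total if total else 0.0
    return missing, total, pct


# Toy table with the report's shape: 478 rows x 3 columns, 5 fully empty rows.
# The cell values are placeholders, not real records.
rows = [("2020-07-21", f"name-{i}", f"address-{i}") for i in range(473)]
rows += [(None, None, None)] * 5

missing, total, pct = missing_cell_stats(rows)
print(missing, total, round(pct, 1))  # 15 1434 1.0
```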

Variable types

DateTime: 1
Text: 2

Dataset

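Description: 2024 status data on disinfection and pest-control businesses in Chungcheongnam-do (충청남도), including the business registration date (영업신고일), business name (업소명), and office location (사무실 소재지).
Author: Chungcheongnam-do (충청남도)
URL: https://www.data.go.kr/data/15069639/fileData.do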

Alerts

Dataset has 1 (0.2%) duplicate rows [Duplicates]
영업신고일 has 5 (1.0%) missing values [Missing]
업소명 has 5 (1.0%) missing values [Missing]
사무실 소재지 has 5 (1.0%) missing values [Missing]

Reproduction

Analysis started: 2024-03-14 15:24:01.785551
Analysis finished: 2024-03-14 15:24:03.086736
Duration: 1.3 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

영업신고일
Date

MISSING 

Distinct: 414
Distinct (%): 87.5%
Missing: 5
Missing (%): 1.0%
Memory size: 3.9 KiB
Minimum: 1985-01-28 00:00:00
Maximum: 2024-01-11 00:00:00
Histogram with fixed size bins (bins=50)
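
As a rough illustration of what "fixed size bins" means for a date column, the sketch below splits the range between the earliest and latest date into equal-width intervals and counts the dates in each. It is a minimal sketch, not the report's plotting code; the function name `fixed_bin_histogram` is hypothetical, and the four example dates are the minimum, the maximum, and two sample rows from this report.

```python
from datetime import date


def fixed_bin_histogram(dates, bins=50):
    """Count dates in `bins` equal-width intervals spanning min(dates)..max(dates)."""
    ordinals = [d.toordinal() for d in dates]
    lo, hi = min(ordinals), max(ordinals)
    width = (hi - lo) / bins or 1  # fall back to 1 when all dates are identical
    counts = [0] * bins
    for o in ordinals:
        # Clamp so that the maximum date lands in the last bin.
        idx = min(int((o - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


dates = [date(1985, 1, 28), date(2020, 7, 21), date(2023, 11, 22), date(2024, 1, 11)]
counts = fixed_bin_histogram(dates, bins=50)
```

With these inputs the earliest date falls in the first bin and the two most recent dates share the last bin.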

업소명
Text

MISSING 

Distinct: 462
Distinct (%): 97.7%
Missing: 5
Missing (%): 1.0%
Memory size: 3.9 KiB

Length

Max length: 21
Median length: 17
Mean length: 6.8033827
Min length: 2
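
The mean length is consistent with dividing the total character count reported for this variable (3218) by the number of non-missing values (478 - 5 = 473). A minimal sketch of that calculation, with a hypothetical helper name `mean_length`:

```python
def mean_length(values):
    """Average string length over the non-missing values."""
    present = [v for v in values if v is not None]
    return sum(len(v) for v in present) / len(present)


# Check against the report's own totals: 3218 characters over 473 values.
report_mean = 3218 / (478 - 5)
print(round(report_mean, 7))  # 6.8033827

# Toy call using three names from the sample rows plus one missing value.
toy_mean = mean_length(["현씨앤씨", "소담코프", "영클린", None])
```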

Characters and Unicode

Total characters: 3218
Distinct characters: 361
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
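
To make the category counts concrete, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. It is an illustration under assumptions, not the profiler's implementation: the `CATEGORY_NAMES` mapping covers only the categories that appear in this report, and `category_counts` is a hypothetical helper. The two example strings are values listed in the word table for 업소명.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter", "Lu": "Uppercase Letter", "Ll": "Lowercase Letter",
    "Nd": "Decimal Number", "Nl": "Letter Number", "Zs": "Space Separator",
    "Ps": "Open Punctuation", "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation", "Po": "Other Punctuation", "So": "Other Symbol",
}


def category_counts(values):
    """Count characters per Unicode general category, skipping missing values."""
    counts = Counter()
    for value in values:
        if value is None:
            continue
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


counts = category_counts(["주)세스코", "그린환경", None])
```

Hangul syllables fall under "Other Letter", which is why that category dominates both text columns.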

Unique

Unique: 453
Unique (%): 95.8%
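
"Distinct" and "Unique" measure different things here. The sketch below assumes the usual reading: distinct is the number of different non-missing values, and unique is the number of values that occur exactly once. Under that reading both percentages are taken over the 473 non-missing values (462/473 is about 97.7%, 453/473 about 95.8%). The helper name and the toy input are illustrative.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (distinct, unique): different values, and values seen exactly once."""
    counts = Counter(v for v in values if v is not None)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Toy input: one repeated name, two names seen once, one missing value.
result = distinct_and_unique(["피엠텍", "피엠텍", "영클린", "현씨앤씨", None])
print(result)  # (3, 2)
```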

Sample

1st row: 현씨앤씨
2nd row: 소담코프
3rd row: 조현준의방역대통령
4th row: 일퍼스트
5th row: 영클린
Value | Count | Frequency (%)
주식회사 | 51 | 8.8%
(missing label) | 7 | 1.2%
피엠텍 | 5 | 0.9%
방역 | 4 | 0.7%
그린환경 | 3 | 0.5%
제일방역 | 3 | 0.5%
주)세스코 | 3 | 0.5%
방역소독 | 3 | 0.5%
솔루션 | 2 | 0.3%
에스웜 | 2 | 0.3%
Other values (490) | 498 | 85.7%
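
The counts in this table sum to 581, and 51/581 is about 8.8%, so the frequencies appear to be shares of all tokens rather than of rows. The sketch below assumes the values are split on whitespace before counting; that tokenization is an assumption about the profiler, and the toy strings merely combine tokens from the table.

```python
from collections import Counter


def word_counts(values):
    """Count whitespace-separated tokens across all non-missing values."""
    counts = Counter()
    for value in values:
        if value:
            counts.update(value.split())
    return counts


# Toy input: combinations of tokens from the table, not real records.
counts = word_counts(["주식회사 그린환경", "그린환경", "제일방역 주식회사", None])
total = sum(counts.values())
share = 100 * counts["주식회사"] / total  # share of all tokens, in percent
```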

Most occurring characters

ValueCountFrequency (%)
162
 
5.0%
115
 
3.6%
109
 
3.4%
103
 
3.2%
97
 
3.0%
) 91
 
2.8%
( 90
 
2.8%
79
 
2.5%
73
 
2.3%
72
 
2.2%
Other values (351) 2227
69.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2859 | 88.8%
Space Separator | 109 | 3.4%
Close Punctuation | 91 | 2.8%
Open Punctuation | 90 | 2.8%
Uppercase Letter | 34 | 1.1%
Lowercase Letter | 12 | 0.4%
Decimal Number | 10 | 0.3%
Other Symbol | 7 | 0.2%
Other Punctuation | 5 | 0.2%
Dash Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
162
 
5.7%
115
 
4.0%
103
 
3.6%
97
 
3.4%
79
 
2.8%
73
 
2.6%
72
 
2.5%
68
 
2.4%
67
 
2.3%
59
 
2.1%
Other values (312) 1964
68.7%
Uppercase Letter
ValueCountFrequency (%)
P 4
11.8%
C 3
8.8%
M 3
8.8%
O 3
8.8%
B 3
8.8%
K 3
8.8%
T 3
8.8%
A 2
 
5.9%
J 2
 
5.9%
R 2
 
5.9%
Other values (6) 6
17.6%
Lowercase Letter
ValueCountFrequency (%)
h 1
8.3%
d 1
8.3%
s 1
8.3%
c 1
8.3%
k 1
8.3%
y 1
8.3%
n 1
8.3%
a 1
8.3%
p 1
8.3%
m 1
8.3%
Other values (2) 2
16.7%
Decimal Number
ValueCountFrequency (%)
1 6
60.0%
2 2
 
20.0%
9 1
 
10.0%
4 1
 
10.0%
Other Punctuation
ValueCountFrequency (%)
& 4
80.0%
, 1
 
20.0%
Space Separator
ValueCountFrequency (%)
109
100.0%
Close Punctuation
ValueCountFrequency (%)
) 91
100.0%
Open Punctuation
ValueCountFrequency (%)
( 90
100.0%
Other Symbol
ValueCountFrequency (%)
7
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2866 | 89.1%
Common | 306 | 9.5%
Latin | 46 | 1.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
162
 
5.7%
115
 
4.0%
103
 
3.6%
97
 
3.4%
79
 
2.8%
73
 
2.5%
72
 
2.5%
68
 
2.4%
67
 
2.3%
59
 
2.1%
Other values (313) 1971
68.8%
Latin
ValueCountFrequency (%)
P 4
 
8.7%
C 3
 
6.5%
M 3
 
6.5%
O 3
 
6.5%
B 3
 
6.5%
K 3
 
6.5%
T 3
 
6.5%
A 2
 
4.3%
J 2
 
4.3%
R 2
 
4.3%
Other values (18) 18
39.1%
Common
ValueCountFrequency (%)
109
35.6%
) 91
29.7%
( 90
29.4%
1 6
 
2.0%
& 4
 
1.3%
2 2
 
0.7%
, 1
 
0.3%
9 1
 
0.3%
4 1
 
0.3%
- 1
 
0.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2859 | 88.8%
ASCII | 352 | 10.9%
None | 7 | 0.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
162
 
5.7%
115
 
4.0%
103
 
3.6%
97
 
3.4%
79
 
2.8%
73
 
2.6%
72
 
2.5%
68
 
2.4%
67
 
2.3%
59
 
2.1%
Other values (312) 1964
68.7%
ASCII
ValueCountFrequency (%)
109
31.0%
) 91
25.9%
( 90
25.6%
1 6
 
1.7%
P 4
 
1.1%
& 4
 
1.1%
C 3
 
0.9%
M 3
 
0.9%
O 3
 
0.9%
B 3
 
0.9%
Other values (28) 36
 
10.2%
None
ValueCountFrequency (%)
7
100.0%

사무실 소재지
Text

MISSING 

Distinct: 467
Distinct (%): 98.7%
Missing: 5
Missing (%): 1.0%
Memory size: 3.9 KiB

Length

Max length: 55
Median length: 44
Mean length: 28.281184
Min length: 15

Characters and Unicode

Total characters: 13377
Distinct characters: 326
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 462
Unique (%): 97.7%

Sample

1st row: 충청남도 천안시 동남구 신촌4로 42, 신방현대프라자 2층 (신방동)
2nd row: 충청남도 천안시 서북구 백석4길 12, 거성캐슬B 6층 602일부호 (백석동)
3rd row: 충청남도 천안시 동남구 복자1길 13, 1층 (성황동)
4th row: 충청남도 천안시 동남구 원거리6길 6, 1층 (원성동)
5th row: 충청남도 천안시 동남구 풍세로 920, 1층 (청수동)
Value | Count | Frequency (%)
충청남도 | 443 | 15.1%
천안시 | 161 | 5.5%
1층 | 93 | 3.2%
서북구 | 83 | 2.8%
동남구 | 78 | 2.7%
2층 | 60 | 2.0%
당진시 | 53 | 1.8%
아산시 | 45 | 1.5%
충청남도서산시 | 29 | 1.0%
태안군 | 26 | 0.9%
Other values (1007) | 1865 | 63.5%

Most occurring characters

ValueCountFrequency (%)
2478
 
18.5%
572
 
4.3%
1 556
 
4.2%
506
 
3.8%
485
 
3.6%
481
 
3.6%
397
 
3.0%
383
 
2.9%
2 340
 
2.5%
, 295
 
2.2%
Other values (316) 6884
51.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 7817 | 58.4%
Space Separator | 2478 | 18.5%
Decimal Number | 2102 | 15.7%
Other Punctuation | 298 | 2.2%
Open Punctuation | 268 | 2.0%
Close Punctuation | 268 | 2.0%
Dash Punctuation | 125 | 0.9%
Uppercase Letter | 16 | 0.1%
Lowercase Letter | 4 | < 0.1%
Letter Number | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
572
 
7.3%
506
 
6.5%
485
 
6.2%
481
 
6.2%
397
 
5.1%
383
 
4.9%
292
 
3.7%
236
 
3.0%
226
 
2.9%
219
 
2.8%
Other values (285) 4020
51.4%
Decimal Number
ValueCountFrequency (%)
1 556
26.5%
2 340
16.2%
3 238
11.3%
4 201
 
9.6%
0 196
 
9.3%
5 143
 
6.8%
6 128
 
6.1%
8 110
 
5.2%
7 106
 
5.0%
9 84
 
4.0%
Uppercase Letter
ValueCountFrequency (%)
B 5
31.2%
C 2
 
12.5%
S 2
 
12.5%
D 2
 
12.5%
A 2
 
12.5%
K 1
 
6.2%
F 1
 
6.2%
J 1
 
6.2%
Other Punctuation
ValueCountFrequency (%)
, 295
99.0%
/ 1
 
0.3%
: 1
 
0.3%
. 1
 
0.3%
Lowercase Letter
ValueCountFrequency (%)
e 1
25.0%
n 1
25.0%
o 1
25.0%
c 1
25.0%
Space Separator
ValueCountFrequency (%)
2478
100.0%
Open Punctuation
ValueCountFrequency (%)
( 268
100.0%
Close Punctuation
ValueCountFrequency (%)
) 268
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 125
100.0%
Letter Number
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 7817 | 58.4%
Common | 5539 | 41.4%
Latin | 21 | 0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
572
 
7.3%
506
 
6.5%
485
 
6.2%
481
 
6.2%
397
 
5.1%
383
 
4.9%
292
 
3.7%
236
 
3.0%
226
 
2.9%
219
 
2.8%
Other values (285) 4020
51.4%
Common
ValueCountFrequency (%)
2478
44.7%
1 556
 
10.0%
2 340
 
6.1%
, 295
 
5.3%
( 268
 
4.8%
) 268
 
4.8%
3 238
 
4.3%
4 201
 
3.6%
0 196
 
3.5%
5 143
 
2.6%
Other values (8) 556
 
10.0%
Latin
ValueCountFrequency (%)
B 5
23.8%
C 2
 
9.5%
S 2
 
9.5%
D 2
 
9.5%
A 2
 
9.5%
K 1
 
4.8%
e 1
 
4.8%
n 1
 
4.8%
F 1
 
4.8%
o 1
 
4.8%
Other values (3) 3
14.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 7817 | 58.4%
ASCII | 5559 | 41.6%
Number Forms | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2478
44.6%
1 556
 
10.0%
2 340
 
6.1%
, 295
 
5.3%
( 268
 
4.8%
) 268
 
4.8%
3 238
 
4.3%
4 201
 
3.6%
0 196
 
3.5%
5 143
 
2.6%
Other values (20) 576
 
10.4%
Hangul
ValueCountFrequency (%)
572
 
7.3%
506
 
6.5%
485
 
6.2%
481
 
6.2%
397
 
5.1%
383
 
4.9%
292
 
3.7%
236
 
3.0%
226
 
2.9%
219
 
2.8%
Other values (285) 4020
51.4%
Number Forms
ValueCountFrequency (%)
1
100.0%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
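
One common way to compute nullity correlation is the Pearson correlation between 0/1 presence indicators of two columns. The sketch below assumes that definition, which may differ in detail from what the report's heatmap uses, and the function name is hypothetical. In this dataset the five empty rows are missing in all three columns at once, so every pair of columns would score 1.0 under this definition.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation of presence indicators (1 = present, 0 = missing).

    Returns None when either column has no variation in presence.
    """
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)


# Toy columns with the report's shape: 473 present values, then 5 missing.
dates = ["2020-07-21"] * 473 + [None] * 5
names = ["name"] * 473 + [None] * 5
r = nullity_correlation(dates, names)
```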

Sample

(index) | 영업신고일 | 업소명 | 사무실 소재지
0 | 2020-07-21 | 현씨앤씨 | 충청남도 천안시 동남구 신촌4로 42, 신방현대프라자 2층 (신방동)
1 | 2023-11-22 | 소담코프 | 충청남도 천안시 서북구 백석4길 12, 거성캐슬B 6층 602일부호 (백석동)
2 | 2023-11-15 | 조현준의방역대통령 | 충청남도 천안시 동남구 복자1길 13, 1층 (성황동)
3 | 2023-11-02 | 일퍼스트 | 충청남도 천안시 동남구 원거리6길 6, 1층 (원성동)
4 | 2022-04-22 | 영클린 | 충청남도 천안시 동남구 풍세로 920, 1층 (청수동)
5 | 2023-09-13 | 진종합관리 | 충청남도 천안시 서북구 미라1길 33, 1층 (쌍용동)
6 | 2023-08-28 | 우정방역 | 충청남도 천안시 동남구 목천읍 남부대로 1012, 1층 일부호
7 | 2023-08-17 | 잡스콤 | 충청남도 천안시 동남구 수신면 신풍1길 12-61, 4동 1층
8 | 2023-08-04 | 엠케이크린코퍼레이션 | 충청남도 천안시 동남구 중앙로 281-2, 승지빌딩 2층 일부호 (신부동)
9 | 2023-04-24 | 빅스케어 | 충청남도 천안시 동남구 용곡3길 4, 1층 (용곡동)

(index) | 영업신고일 | 업소명 | 사무실 소재지
468 | 2015-08-05 | 천수위생방역 | 충청남도 태안군 태안읍 동백로 103
469 | 2013-01-24 | 하늘과바다사이리조트(주) | 충청남도 태안군 원북면 신두해변길 199
470 | 2021-04-27 | 태양종합환경 | 충청남도 태안군 태안읍 진벌로 34-3, 105호 (태안신동아아파트)
471 | 2012-07-10 | 서부위생방역 | 충청남도 태안군 태안읍 군청5길 22-4
472 | 2016-02-01 | 태안위생방역 | 충청남도 태안군 태안읍 백화로 32
473 | <NA> | <NA> | <NA>
474 | <NA> | <NA> | <NA>
475 | <NA> | <NA> | <NA>
476 | <NA> | <NA> | <NA>
477 | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

(index) | 영업신고일 | 업소명 | 사무실 소재지 | # duplicates
0 | <NA> | <NA> | <NA> | 5
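
The table above lists one repeated row pattern (all three fields missing) that occurs 5 times, which matches the "1 (0.2%)" duplicate figure in the overview if that figure counts distinct repeated patterns: 1/478 is about 0.2%. The sketch below groups identical rows under that assumption; `duplicate_groups` is a hypothetical helper and the non-missing rows are placeholders.

```python
from collections import Counter


def duplicate_groups(rows):
    """Map each row pattern that occurs more than once to its occurrence count."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Toy table with the report's shape: 473 distinct rows plus 5 fully empty rows.
rows = [("2020-07-21", f"name-{i}", f"address-{i}") for i in range(473)]
rows += [(None, None, None)] * 5

groups = duplicate_groups(rows)
pct = 100 * len(groups) / len(rows)
print(groups)         # {(None, None, None): 5}
print(round(pct, 1))  # 0.2
```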