Overview

Dataset statistics

Number of variables: 4
Number of observations: 1318
Missing cells: 176
Missing cells (%): 3.3%
Duplicate rows: 60
Duplicate rows (%): 4.6%
Total size in memory: 41.3 KiB
Average record size in memory: 32.1 B
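
The derived figures above follow from the raw counts. A minimal sketch of the arithmetic, assuming the missing-cell percentage is taken over all cells (observations × variables) and that 1 KiB is 1024 bytes; the variable names are illustrative only:

```python
# Raw counts copied from the dataset statistics above.
n_variables = 4
n_observations = 1318
missing_cells = 176
duplicate_rows = 60
total_size_kib = 41.3

# Assumption: percentages are relative to all cells and all rows respectively.
missing_pct = 100 * missing_cells / (n_variables * n_observations)
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```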

Variable types

Text: 3
Categorical: 1

Dataset

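Description: Data on waste generator reports filed by business sites in Cheongju-si, Chungcheongbuk-do. It provides the business name, location (address), telephone number, and data reference date.
URL: https://www.data.go.kr/data/15081133/fileData.do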

Alerts

기준일자 has constant value "2023-06-30" (Constant)
Dataset has 60 (4.6%) duplicate rows (Duplicates)
전화번호 has 176 (13.4%) missing values (Missing)
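
The Constant and Missing alerts can be reproduced on a small scale with a check like the following. This is a sketch, not the profiling tool's own implementation: the helper `column_alerts` is hypothetical, the three rows are copied from the Sample section, and `None` stands in for `<NA>`.

```python
# Three rows copied from the Sample section; None stands in for <NA>.
rows = [
    {"상호": "서원대학교", "주소": "충청북도 청주시 서원구 모충동 231-5",
     "전화번호": "043-299-8114", "기준일자": "2023-06-30"},
    {"상호": "충북대학교", "주소": "충청북도 청주시 서원구 충대로 1",
     "전화번호": "043-261-2994", "기준일자": "2023-06-30"},
    {"상호": "손경락내과의원", "주소": "충청북도 청주시 흥덕구 사직대로 74",
     "전화번호": None, "기준일자": "2023-06-30"},
]


def column_alerts(rows):
    """Return {column: [alert, ...]} for constant and missing-value columns."""
    alerts = {}
    for column in rows[0]:
        values = [row[column] for row in rows]
        present = [v for v in values if v is not None]
        found = []
        # A column is constant when its non-missing values are all identical.
        if len(set(present)) == 1:
            found.append("Constant")
        n_missing = len(values) - len(present)
        if n_missing:
            found.append(f"Missing ({n_missing})")
        if found:
            alerts[column] = found
    return alerts


print(column_alerts(rows))
```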

Reproduction

Analysis started: 2023-12-13 00:56:42.732201
Analysis finished: 2023-12-13 00:56:43.234896
Duration: 0.5 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
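
The report records the software version but not the command that produced it. One way such a report could be regenerated is sketched below, assuming the source file is a CSV that pandas can read. The function name `build_report`, the title string, and the `cp949` default encoding are assumptions for illustration, not details taken from the report.

```python
def build_report(csv_path, html_path, encoding="cp949"):
    """Profile a CSV file and write the HTML report to html_path."""
    # Imported lazily so this sketch can be loaded without the
    # third-party packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: every column is read as text so that phone numbers
    # keep their leading zeros and hyphens.
    df = pd.read_csv(csv_path, encoding=encoding, dtype=str)

    # The title is a placeholder chosen for this example.
    report = ProfileReport(df, title="Cheongju waste generator reports")
    report.to_file(html_path)
    return report
```

Calling `build_report("input.csv", "report.html")` with hypothetical file names would write the HTML report next to the input file.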

Variables

상호
Text

Distinct: 1191
Distinct (%): 90.4%
Missing: 0
Missing (%): 0.0%
Memory size: 10.4 KiB

Length

Max length: 31
Median length: 26
Mean length: 8.9567527
Min length: 2
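
Length statistics of this kind can be computed from string lengths. A minimal sketch using only the five values from this variable's Sample listing, so the numbers differ from the full-column figures above; the `summary` dictionary is an illustrative name:

```python
import statistics

# The five sample values listed for this variable.
names = [
    "서원대학교",
    "충북대학교",
    "충북대학교병원",
    "충청북도 청주의료원",
    "(주)이마트 청주점",
]

# len() counts Unicode code points, so each Hangul syllable counts as one.
lengths = [len(name) for name in names]

summary = {
    "max": max(lengths),
    "median": statistics.median(lengths),
    "mean": statistics.mean(lengths),
    "min": min(lengths),
}
print(summary)
```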

Characters and Unicode

Total characters: 11805
Distinct characters: 533
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
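
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for one sample value; the `CATEGORY_NAMES` mapping and the `category_counts` helper are illustrative and cover only the codes that occur in this example.

```python
import unicodedata
from collections import Counter

# Readable names for the general-category codes that occur below.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Zs": "Space Separator",
}


def category_counts(text):
    """Count the characters of text by Unicode general category."""
    counts = Counter()
    for char in text:
        code = unicodedata.category(char)
        # Fall back to the raw two-letter code for unmapped categories.
        counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


print(category_counts("(주)이마트 청주점"))
```

The standard library does not expose script or block properties, so those breakdowns would need a third-party package or the Unicode data files.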

Unique

Unique: 1080
Unique (%): 81.9%
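
Distinct and Unique are different counts. The sketch below assumes that Distinct is the number of different values and Unique is the number of values that occur exactly once, which is consistent with 1080 being smaller than 1191 here; the list of values is made up for illustration.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


# Illustrative list: one name repeated three times, two names seen once.
values = ["충북대학교", "한국전력공사", "한국전력공사", "서원대학교", "한국전력공사"]
print(distinct_and_unique(values))
```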

Sample

1st row: 서원대학교
2nd row: 충북대학교
3rd row: 충북대학교병원
4th row: 충청북도 청주의료원
5th row: (주)이마트 청주점
Value | Count | Frequency (%)
주식회사 | 90 | 5.7%
오송공장 | 13 | 0.8%
청주공장 | 8 | 0.5%
한국전력공사 | 8 | 0.5%
오창공장 | 7 | 0.4%
환경시설관리 | 6 | 0.4%
오창 | 6 | 0.4%
공군제17전투비행단 | 5 | 0.3%
현대바이오랜드 | 5 | 0.3%
한국교원대학교 | 4 | 0.3%
Other values (1264) | 1430 | 90.4%

Most occurring characters

ValueCountFrequency (%)
880
 
7.5%
) 684
 
5.8%
( 683
 
5.8%
343
 
2.9%
267
 
2.3%
253
 
2.1%
250
 
2.1%
216
 
1.8%
216
 
1.8%
182
 
1.5%
Other values (523) 7831
66.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 9928 | 84.1%
Close Punctuation | 691 | 5.9%
Open Punctuation | 690 | 5.8%
Space Separator | 267 | 2.3%
Decimal Number | 82 | 0.7%
Uppercase Letter | 73 | 0.6%
Other Symbol | 48 | 0.4%
Lowercase Letter | 14 | 0.1%
Other Punctuation | 8 | 0.1%
Connector Punctuation | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
880
 
8.9%
343
 
3.5%
253
 
2.5%
250
 
2.5%
216
 
2.2%
216
 
2.2%
182
 
1.8%
182
 
1.8%
178
 
1.8%
170
 
1.7%
Other values (478) 7058
71.1%
Uppercase Letter
ValueCountFrequency (%)
G 11
15.1%
L 9
12.3%
I 9
12.3%
S 8
11.0%
E 5
6.8%
F 5
6.8%
K 5
6.8%
T 4
 
5.5%
R 3
 
4.1%
A 3
 
4.1%
Other values (5) 11
15.1%
Lowercase Letter
ValueCountFrequency (%)
k 3
21.4%
e 3
21.4%
s 1
 
7.1%
r 1
 
7.1%
t 1
 
7.1%
w 1
 
7.1%
o 1
 
7.1%
p 1
 
7.1%
a 1
 
7.1%
c 1
 
7.1%
Decimal Number
ValueCountFrequency (%)
2 28
34.1%
1 19
23.2%
3 15
18.3%
7 6
 
7.3%
6 5
 
6.1%
5 4
 
4.9%
4 3
 
3.7%
9 2
 
2.4%
Other Punctuation
ValueCountFrequency (%)
. 6
75.0%
/ 1
 
12.5%
& 1
 
12.5%
Close Punctuation
ValueCountFrequency (%)
) 684
99.0%
] 7
 
1.0%
Open Punctuation
ValueCountFrequency (%)
( 683
99.0%
[ 7
 
1.0%
Math Symbol
ValueCountFrequency (%)
> 1
50.0%
< 1
50.0%
Space Separator
ValueCountFrequency (%)
267
100.0%
Other Symbol
ValueCountFrequency (%)
48
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 9976 | 84.5%
Common | 1742 | 14.8%
Latin | 87 | 0.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
880
 
8.8%
343
 
3.4%
253
 
2.5%
250
 
2.5%
216
 
2.2%
216
 
2.2%
182
 
1.8%
182
 
1.8%
178
 
1.8%
170
 
1.7%
Other values (479) 7106
71.2%
Latin
ValueCountFrequency (%)
G 11
12.6%
L 9
 
10.3%
I 9
 
10.3%
S 8
 
9.2%
E 5
 
5.7%
F 5
 
5.7%
K 5
 
5.7%
T 4
 
4.6%
k 3
 
3.4%
R 3
 
3.4%
Other values (15) 25
28.7%
Common
ValueCountFrequency (%)
) 684
39.3%
( 683
39.2%
267
 
15.3%
2 28
 
1.6%
1 19
 
1.1%
3 15
 
0.9%
] 7
 
0.4%
[ 7
 
0.4%
7 6
 
0.3%
. 6
 
0.3%
Other values (9) 20
 
1.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 9928 | 84.1%
ASCII | 1829 | 15.5%
None | 48 | 0.4%

Most frequent character per block

Hangul
ValueCountFrequency (%)
880
 
8.9%
343
 
3.5%
253
 
2.5%
250
 
2.5%
216
 
2.2%
216
 
2.2%
182
 
1.8%
182
 
1.8%
178
 
1.8%
170
 
1.7%
Other values (478) 7058
71.1%
ASCII
ValueCountFrequency (%)
) 684
37.4%
( 683
37.3%
267
 
14.6%
2 28
 
1.5%
1 19
 
1.0%
3 15
 
0.8%
G 11
 
0.6%
L 9
 
0.5%
I 9
 
0.5%
S 8
 
0.4%
Other values (34) 96
 
5.2%
None
ValueCountFrequency (%)
48
100.0%

주소
Text

Distinct: 1138
Distinct (%): 86.3%
Missing: 0
Missing (%): 0.0%
Memory size: 10.4 KiB

Length

Max length: 50
Median length: 46
Mean length: 28.58953
Min length: 18

Characters and Unicode

Total characters: 37681
Distinct characters: 358
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1001
Unique (%): 75.9%

Sample

1st row: 충청북도 청주시 서원구 모충동 231-5
2nd row: 충청북도 청주시 서원구 충대로 1
3rd row: 충청북도 청주시 서원구 1순환로 776
4th row: 충청북도 청주시 서원구 사직동 554-6
5th row: 충청북도 청주시 서원구 미평동 123-1
Value | Count | Frequency (%)
충청북도 | 1346 | 16.3%
청주시 | 1321 | 16.0%
흥덕구 | 569 | 6.9%
청원구 | 485 | 5.9%
오창읍 | 264 | 3.2%
서원구 | 147 | 1.8%
옥산면 | 124 | 1.5%
상당구 | 117 | 1.4%
오송읍 | 108 | 1.3%
북이면 | 89 | 1.1%
Other values (1411) | 3703 | 44.8%
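
The table above counts tokens rather than whole addresses. A minimal sketch of such a count, assuming tokens are obtained by splitting each address on whitespace (the report does not state its tokenization rule), applied to the five sample addresses only:

```python
from collections import Counter

# The five sample addresses listed for this variable.
addresses = [
    "충청북도 청주시 서원구 모충동 231-5",
    "충청북도 청주시 서원구 충대로 1",
    "충청북도 청주시 서원구 1순환로 776",
    "충청북도 청주시 서원구 사직동 554-6",
    "충청북도 청주시 서원구 미평동 123-1",
]

# Assumption: tokens are whitespace-separated pieces of each address.
tokens = Counter(token for address in addresses for token in address.split())

for token, count in tokens.most_common(3):
    print(token, count)
```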

Most occurring characters

ValueCountFrequency (%)
7076
18.8%
3265
 
8.7%
1463
 
3.9%
1418
 
3.8%
1402
 
3.7%
1380
 
3.7%
1362
 
3.6%
1361
 
3.6%
1 1024
 
2.7%
907
 
2.4%
Other values (348) 17023
45.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 23985 | 63.7%
Space Separator | 7076 | 18.8%
Decimal Number | 5029 | 13.3%
Open Punctuation | 424 | 1.1%
Close Punctuation | 424 | 1.1%
Dash Punctuation | 374 | 1.0%
Connector Punctuation | 259 | 0.7%
Uppercase Letter | 53 | 0.1%
Other Punctuation | 47 | 0.1%
Math Symbol | 9 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3265
 
13.6%
1463
 
6.1%
1418
 
5.9%
1402
 
5.8%
1380
 
5.8%
1362
 
5.7%
1361
 
5.7%
907
 
3.8%
687
 
2.9%
615
 
2.6%
Other values (313) 10125
42.2%
Uppercase Letter
ValueCountFrequency (%)
K 9
17.0%
S 8
15.1%
I 6
11.3%
T 5
9.4%
A 4
7.5%
F 4
7.5%
L 4
7.5%
G 4
7.5%
D 2
 
3.8%
H 1
 
1.9%
Other values (6) 6
11.3%
Decimal Number
ValueCountFrequency (%)
1 1024
20.4%
2 730
14.5%
3 608
12.1%
4 499
9.9%
5 444
8.8%
0 434
8.6%
6 395
 
7.9%
8 316
 
6.3%
7 303
 
6.0%
9 276
 
5.5%
Other Punctuation
ValueCountFrequency (%)
, 46
97.9%
& 1
 
2.1%
Space Separator
ValueCountFrequency (%)
7076
100.0%
Open Punctuation
ValueCountFrequency (%)
( 424
100.0%
Close Punctuation
ValueCountFrequency (%)
) 424
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 374
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 259
100.0%
Math Symbol
ValueCountFrequency (%)
~ 9
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 23986 | 63.7%
Common | 13642 | 36.2%
Latin | 53 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3265
 
13.6%
1463
 
6.1%
1418
 
5.9%
1402
 
5.8%
1380
 
5.8%
1362
 
5.7%
1361
 
5.7%
907
 
3.8%
687
 
2.9%
615
 
2.6%
Other values (314) 10126
42.2%
Common
ValueCountFrequency (%)
7076
51.9%
1 1024
 
7.5%
2 730
 
5.4%
3 608
 
4.5%
4 499
 
3.7%
5 444
 
3.3%
0 434
 
3.2%
( 424
 
3.1%
) 424
 
3.1%
6 395
 
2.9%
Other values (8) 1584
 
11.6%
Latin
ValueCountFrequency (%)
K 9
17.0%
S 8
15.1%
I 6
11.3%
T 5
9.4%
A 4
7.5%
F 4
7.5%
L 4
7.5%
G 4
7.5%
D 2
 
3.8%
H 1
 
1.9%
Other values (6) 6
11.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 23985 | 63.7%
ASCII | 13695 | 36.3%
None | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7076
51.7%
1 1024
 
7.5%
2 730
 
5.3%
3 608
 
4.4%
4 499
 
3.6%
5 444
 
3.2%
0 434
 
3.2%
( 424
 
3.1%
) 424
 
3.1%
6 395
 
2.9%
Other values (24) 1637
 
12.0%
Hangul
ValueCountFrequency (%)
3265
 
13.6%
1463
 
6.1%
1418
 
5.9%
1402
 
5.8%
1380
 
5.8%
1362
 
5.7%
1361
 
5.7%
907
 
3.8%
687
 
2.9%
615
 
2.6%
Other values (313) 10125
42.2%
None
ValueCountFrequency (%)
1
100.0%

전화번호
Text

MISSING 

Distinct: 1002
Distinct (%): 87.7%
Missing: 176
Missing (%): 13.4%
Memory size: 10.4 KiB

Length

Max length: 13
Median length: 12
Mean length: 11.879159
Min length: 1

Characters and Unicode

Total characters: 13566
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 893
Unique (%): 78.2%

Sample

1st row: 043-299-8114
2nd row: 043-261-2994
3rd row: 043-269-6684
4th row: 043-279-0134
5th row: 043-210-1234
Value | Count | Frequency (%)
043-218-0510 | 10 | 0.9%
043 | 5 | 0.4%
043-214-7588 | 3 | 0.3%
043-240-8600 | 3 | 0.3%
043-265-0071 | 3 | 0.3%
043-240-1069 | 3 | 0.3%
043-211-1837 | 3 | 0.3%
043-240-6051 | 3 | 0.3%
043-711-8899 | 3 | 0.3%
043-249-8759 | 3 | 0.3%
Other values (992) | 1094 | 96.6%
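
A minimum length of 1, a bare `043` token, and a stray `)` character suggest that some entries are not well-formed phone numbers. The sketch below flags such entries, assuming a well-formed value has the hyphenated shape of the sample rows; the pattern, the helper `is_well_formed`, and the candidate `043-240-8600)` are illustrative assumptions rather than values or rules taken from the dataset.

```python
import re

# Assumption: a well-formed entry is an area code, an exchange, and a line
# number separated by hyphens, e.g. 043-299-8114.
PHONE_PATTERN = re.compile(r"0\d{1,2}-\d{3,4}-\d{4}")


def is_well_formed(phone):
    """Return True when phone matches the assumed hyphenated pattern."""
    return PHONE_PATTERN.fullmatch(phone.strip()) is not None


# Two sample values, the bare "043" token from the table above, and a
# hypothetical entry with a stray closing parenthesis.
candidates = ["043-299-8114", "043-218-0510", "043", "043-240-8600)"]
malformed = [p for p in candidates if not is_well_formed(p)]
print(malformed)
```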

Most occurring characters

Value | Count | Frequency (%)
- | 2249 | 16.6%
0 | 2111 | 15.6%
3 | 1743 | 12.8%
2 | 1606 | 11.8%
4 | 1600 | 11.8%
1 | 1102 | 8.1%
7 | 792 | 5.8%
5 | 649 | 4.8%
8 | 599 | 4.4%
6 | 579 | 4.3%
Other values (3) | 536 | 4.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 11296 | 83.3%
Dash Punctuation | 2249 | 16.6%
Space Separator | 20 | 0.1%
Close Punctuation | 1 | < 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 2111
18.7%
3 1743
15.4%
2 1606
14.2%
4 1600
14.2%
1 1102
9.8%
7 792
 
7.0%
5 649
 
5.7%
8 599
 
5.3%
6 579
 
5.1%
9 515
 
4.6%
Dash Punctuation
ValueCountFrequency (%)
- 2249
100.0%
Space Separator
ValueCountFrequency (%)
20
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 13566 | 100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 2249
16.6%
0 2111
15.6%
3 1743
12.8%
2 1606
11.8%
4 1600
11.8%
1 1102
8.1%
7 792
 
5.8%
5 649
 
4.8%
8 599
 
4.4%
6 579
 
4.3%
Other values (3) 536
 
4.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 13566 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 2249
16.6%
0 2111
15.6%
3 1743
12.8%
2 1606
11.8%
4 1600
11.8%
1 1102
8.1%
7 792
 
5.8%
5 649
 
4.8%
8 599
 
4.4%
6 579
 
4.3%
Other values (3) 536
 
4.0%

기준일자
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 10.4 KiB
Value counts: 2023-06-30 (1318)

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-06-30
2nd row: 2023-06-30
3rd row: 2023-06-30
4th row: 2023-06-30
5th row: 2023-06-30

Common Values

Value | Count | Frequency (%)
2023-06-30 | 1318 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
2023-06-30 | 1318 | 100.0%

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

Index | 상호 | 주소 | 전화번호 | 기준일자
0 | 서원대학교 | 충청북도 청주시 서원구 모충동 231-5 | 043-299-8114 | 2023-06-30
1 | 충북대학교 | 충청북도 청주시 서원구 충대로 1 | 043-261-2994 | 2023-06-30
2 | 충북대학교병원 | 충청북도 청주시 서원구 1순환로 776 | 043-269-6684 | 2023-06-30
3 | 충청북도 청주의료원 | 충청북도 청주시 서원구 사직동 554-6 | 043-279-0134 | 2023-06-30
4 | (주)이마트 청주점 | 충청북도 청주시 서원구 미평동 123-1 | 043-210-1234 | 2023-06-30
5 | 넥상스코리아(주) | 충청북도 청주시 서원구 남이면 사동길 50 | 043-270-0217 | 2023-06-30
6 | (주)서일기업 | 충청북도 청주시 서원구 남이면 사동리 124 | 043-269-0846 | 2023-06-30
7 | 한국지역난방공사청주지사 | 충청북도 청주시 서원구 죽림동 224 | 043-230-4661 | 2023-06-30
8 | 성진엔지니어링(주) | 충청북도 청주시 서원구 남이면 갈원리 162-1 | 043-269-4222 | 2023-06-30
9 | 오비맥주 주식회사 청주공장 | 충청북도 청주시 서원구 현도면 중삼리 52번지 | 043-279-4723 | 2023-06-30

Index | 상호 | 주소 | 전화번호 | 기준일자
1308 | 손경락내과의원 | 충청북도 청주시 흥덕구 사직대로 74 | <NA> | 2023-06-30
1309 | 녹십자외과의원 | 충청북도 청주시 흥덕구 직지대로 575 | <NA> | 2023-06-30
1310 | 서울정형외과 | 충청북도 청주시 흥덕구 1순환로 509 | <NA> | 2023-06-30
1311 | 양치과의원 | 충청북도 청주시 흥덕구 직지대로 559 | <NA> | 2023-06-30
1312 | 김태룡내과의원 | 충청북도 청주시 흥덕구 1순환로 511-1 | <NA> | 2023-06-30
1313 | 스마일치과의원 | 충청북도 청주시 흥덕구 증안로 54 | <NA> | 2023-06-30
1314 | 임상헌치과의원 | 충청북도 청주시 흥덕구 가로수로 1340-1, 중앙빌딩 5층 | <NA> | 2023-06-30
1315 | 김박내과의원 | 충청북도 청주시 흥덕구 복대로 179, 메디포스 2층 | <NA> | 2023-06-30
1316 | 한국도자기(주) | 충청북도 청주시 흥덕구 송정동 27-10 | <NA> | 2023-06-30
1317 | 한국도자기(주) 슈퍼2부 | 충청북도 청주시 흥덕구 송정동 140-34 | <NA> | 2023-06-30

Duplicate rows

Most frequently occurring

Index | 상호 | 주소 | 전화번호 | 기준일자 | # duplicates
5 | (주)노바렉스(3공장) | 충청북도 청주시 청원구 오창읍 각리1길 64 | 043-218-0510 | 2023-06-30 | 3
9 | (주)렉스진바이오텍 | 충청북도 청주시 청원구 오창읍 각리1길 94 | 043-218-0510 | 2023-06-30 | 3
12 | (주)바이오톡스텍 | 충청북도 청주시 청원구 오창읍 연구단지로 53 | 043-210-7777 | 2023-06-30 | 3
16 | (주)유한양행 | 충청북도 청주시 청원구 오창읍 연구단지로 219 | 043-240-1069 | 2023-06-30 | 3
41 | 주식회사 현대바이오랜드 오창 | 충청북도 청주시 청원구 오창읍 과학산업3로 162 | 043-240-8600 | 2023-06-30 | 3
0 | (재)한국건설생활환경시험연구원 오창 | 충청북도 청주시 청원구 오창읍 양청3길 73 | 043-718-9003 | 2023-06-30 | 2
1 | (주)강내자동차해체재활용산업 | 충청북도 청주시 흥덕구 강내면 황탄리길 183 | 043-238-7777 | 2023-06-30 | 2
2 | (주)그린광학 오송공장 | 충청북도 청주시 흥덕구 오송읍 오송생명4로 168-19 (주)그린광학 | 043-218-2183 | 2023-06-30 | 2
3 | (주)나이스폐차 | 충청북도 청주시 청원구 오창읍 여천3길 149 | 043-213-4000 | 2023-06-30 | 2
4 | (주)노바렉스(2공장) | 충청북도 청주시 청원구 오창읍 각리1길 60 | 043-218-0510 | 2023-06-30 | 2
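
Duplicate groups like those listed above can be found by counting identical rows. The following is a sketch under the assumption that two rows are duplicates when all four fields match exactly; it does not reproduce how the report arrives at its total of 60, and the small `rows` list is assembled from values shown in this report purely for illustration.

```python
from collections import Counter

# Rows are (상호, 주소, 전화번호, 기준일자) tuples built from values in this report.
novarex = ("(주)노바렉스(3공장)", "충청북도 청주시 청원구 오창읍 각리1길 64",
           "043-218-0510", "2023-06-30")
nice = ("(주)나이스폐차", "충청북도 청주시 청원구 오창읍 여천3길 149",
        "043-213-4000", "2023-06-30")
seowon = ("서원대학교", "충청북도 청주시 서원구 모충동 231-5",
          "043-299-8114", "2023-06-30")

rows = [novarex, nice, seowon, novarex, nice, novarex]

# Assumption: rows are duplicates only when every field is identical.
counts = Counter(rows)
duplicates = {row: n for row, n in counts.items() if n > 1}

for row, n in duplicates.items():
    print(row[0], n)
```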