Overview

Dataset statistics

Number of variables: 3
Number of observations: 2335
Missing cells: 492
Missing cells (%): 7.0%
Duplicate rows: 3
Duplicate rows (%): 0.1%
Total size in memory: 54.9 KiB
Average record size in memory: 24.1 B
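
The derived figures in this block follow from the raw counts. The sketch below reproduces the arithmetic, assuming missing cells are measured against all cells (rows × columns), duplicates against all rows, and record size as total memory divided by the number of rows. The function names are illustrative and not part of ydata-profiling.

```python
def missing_cells_pct(missing_cells: int, n_rows: int, n_cols: int) -> float:
    """Share of empty cells among all cells (rows x columns), in percent."""
    return 100.0 * missing_cells / (n_rows * n_cols)


def duplicate_rows_pct(duplicate_rows: int, n_rows: int) -> float:
    """Share of duplicate rows among all rows, in percent."""
    return 100.0 * duplicate_rows / n_rows


def avg_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Average memory per row in bytes, given the total size in KiB."""
    return total_kib * 1024 / n_rows


# Counts taken from the dataset statistics in this report.
print(f"{missing_cells_pct(492, 2335, 3):.1f}%")      # 7.0%
print(f"{duplicate_rows_pct(3, 2335):.1f}%")          # 0.1%
print(f"{avg_record_size_bytes(54.9, 2335):.1f} B")   # 24.1 B
```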

Variable types

Text: 3

Dataset

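Description: Company name, location, and contact information; Gyeongsan-si (Hayang-eup, Jillyang-eup, Amnyang-eup, Wachon-myeon, Jain-myeon, Yongseong-myeon, Namsan-myeon, Namcheon-myeon, Jungang-dong, Dongbu-dong, Seobu 1-dong, Seobu 2-dong, Nambu-dong, Bukbu-dong, Jungbang-dong)
Author: Gyeongsan-si, Gyeongsangbuk-do (경상북도 경산시)
URL: https://www.data.go.kr/data/15100835/fileData.do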

Alerts

Dataset has 3 (0.1%) duplicate rows [Duplicates]
전화번호 has 492 (21.1%) missing values [Missing]

Reproduction

Analysis started: 2024-04-29 23:00:35.382826
Analysis finished: 2024-04-29 23:00:36.250818
Duration: 0.87 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
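
The reported duration is the difference between the two timestamps. A small check using only the standard library; the format string is an assumption based on how the timestamps are printed in this report.

```python
from datetime import datetime

# Timestamp layout as printed in the Reproduction block (assumed format).
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-04-29 23:00:35.382826", FMT)
finished = datetime.strptime("2024-04-29 23:00:36.250818", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 0.87 seconds
```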

Variables

회사명 (company name)
Text

Distinct: 2223
Distinct (%): 95.2%
Missing: 0
Missing (%): 0.0%
Memory size: 18.4 KiB

Length

Max length: 20
Median length: 19
Mean length: 6.2278373
Min length: 2
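
Length statistics of this kind can be computed from the string lengths of the non-missing values. A minimal sketch, assuming length is counted in Unicode code points (Python's `len`) and that missing values are skipped; `length_stats` is a hypothetical helper, not the profiler's own code.

```python
from statistics import mean, median


def length_stats(values):
    """Max, median, mean and min string length, ignoring missing (None) values."""
    lengths = [len(v) for v in values if v is not None]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }


# The five sample rows of the 회사명 column shown in this report.
sample = [
    "(유)농업회사법인 삼미식품",
    "(유)지오선",
    "(유)협신모직",
    "(주) 경도철강 가공센터",
    "(주) 썬로드",
]
print(length_stats(sample))
```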

Characters and Unicode

Total characters: 14542
Distinct characters: 535
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
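
As an illustration of these character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. That module exposes the general category but not the script or block, so only the category breakdown is shown; `category_counts` is a hypothetical helper.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters per Unicode general category (two-letter code)."""
    return Counter(unicodedata.category(ch) for ch in text)


# Lo = Other Letter, Ps = Open Punctuation,
# Pe = Close Punctuation, Zs = Space Separator
counts = category_counts("(주) 썬로드")
print(counts)  # Lo: 4, Ps: 1, Pe: 1, Zs: 1
```

Summing such counters over every value in a column gives a per-category table like the ones in this report.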

Unique

Unique: 2123
Unique (%): 90.9%
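
The report gives different numbers for "Distinct" and "Unique". A minimal sketch of one reading that is consistent with those numbers, assuming "distinct" is the number of different values and "unique" is the number of values that occur exactly once; the helper name and this interpretation are assumptions.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(v for v in values if v is not None)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


# Company names from the last rows of the sample; one name appears twice.
names = ["흥성실업", "흥창스틸(주)", "흥창스틸(주)", "희수엔지니어링(유)"]
print(distinct_and_unique(names))  # (3, 2)
```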

Sample

1st row: (유)농업회사법인 삼미식품
2nd row: (유)지오선
3rd row: (유)협신모직
4th row: (주) 경도철강 가공센터
5th row: (주) 썬로드
Value | Count | Frequency (%)
주식회사 | 48 | 1.9%
경산지점 | 14 | 0.6%
경산공장 | 13 | 0.5%
농업회사법인 | 10 | 0.4%
 | 5 | 0.2%
제2공장 | 4 | 0.2%
대성산업 | 4 | 0.2%
남경산업 | 4 | 0.2%
우정섬유 | 3 | 0.1%
대원금속(주 | 3 | 0.1%
Other values (2248) | 2382 | 95.7%

Most occurring characters

ValueCountFrequency (%)
1088
 
7.5%
( 1017
 
7.0%
) 1016
 
7.0%
466
 
3.2%
463
 
3.2%
323
 
2.2%
304
 
2.1%
247
 
1.7%
246
 
1.7%
228
 
1.6%
Other values (525) 9144
62.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 12006 | 82.6%
Open Punctuation | 1017 | 7.0%
Close Punctuation | 1016 | 7.0%
Uppercase Letter | 255 | 1.8%
Space Separator | 155 | 1.1%
Decimal Number | 51 | 0.4%
Other Punctuation | 33 | 0.2%
Dash Punctuation | 5 | < 0.1%
Lowercase Letter | 3 | < 0.1%
Other Symbol | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1088
 
9.1%
466
 
3.9%
463
 
3.9%
323
 
2.7%
304
 
2.5%
247
 
2.1%
246
 
2.0%
228
 
1.9%
220
 
1.8%
198
 
1.6%
Other values (483) 8223
68.5%
Uppercase Letter
ValueCountFrequency (%)
E 35
13.7%
N 26
 
10.2%
S 26
 
10.2%
C 26
 
10.2%
T 14
 
5.5%
G 14
 
5.5%
K 13
 
5.1%
M 10
 
3.9%
P 10
 
3.9%
A 9
 
3.5%
Other values (13) 72
28.2%
Decimal Number
ValueCountFrequency (%)
2 36
70.6%
1 7
 
13.7%
3 4
 
7.8%
0 1
 
2.0%
7 1
 
2.0%
6 1
 
2.0%
4 1
 
2.0%
Other Punctuation
ValueCountFrequency (%)
. 22
66.7%
& 9
27.3%
, 1
 
3.0%
/ 1
 
3.0%
Lowercase Letter
ValueCountFrequency (%)
c 1
33.3%
v 1
33.3%
a 1
33.3%
Open Punctuation
ValueCountFrequency (%)
( 1017
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1016
100.0%
Space Separator
ValueCountFrequency (%)
155
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 12007 | 82.6%
Common | 2277 | 15.7%
Latin | 258 | 1.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1088
 
9.1%
466
 
3.9%
463
 
3.9%
323
 
2.7%
304
 
2.5%
247
 
2.1%
246
 
2.0%
228
 
1.9%
220
 
1.8%
198
 
1.6%
Other values (484) 8224
68.5%
Latin
ValueCountFrequency (%)
E 35
13.6%
N 26
 
10.1%
S 26
 
10.1%
C 26
 
10.1%
T 14
 
5.4%
G 14
 
5.4%
K 13
 
5.0%
M 10
 
3.9%
P 10
 
3.9%
A 9
 
3.5%
Other values (16) 75
29.1%
Common
ValueCountFrequency (%)
( 1017
44.7%
) 1016
44.6%
155
 
6.8%
2 36
 
1.6%
. 22
 
1.0%
& 9
 
0.4%
1 7
 
0.3%
- 5
 
0.2%
3 4
 
0.2%
, 1
 
< 0.1%
Other values (5) 5
 
0.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 12006 | 82.6%
ASCII | 2535 | 17.4%
None | 1 | < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1088
 
9.1%
466
 
3.9%
463
 
3.9%
323
 
2.7%
304
 
2.5%
247
 
2.1%
246
 
2.0%
228
 
1.9%
220
 
1.8%
198
 
1.6%
Other values (483) 8223
68.5%
ASCII
ValueCountFrequency (%)
( 1017
40.1%
) 1016
40.1%
155
 
6.1%
2 36
 
1.4%
E 35
 
1.4%
N 26
 
1.0%
S 26
 
1.0%
C 26
 
1.0%
. 22
 
0.9%
T 14
 
0.6%
Other values (31) 162
 
6.4%
None
ValueCountFrequency (%)
1
100.0%

주소 (address)
Text

Distinct: 2190
Distinct (%): 93.8%
Missing: 0
Missing (%): 0.0%
Memory size: 18.4 KiB

Length

Max length: 63
Median length: 56
Mean length: 24.863812
Min length: 17

Characters and Unicode

Total characters: 58057
Distinct characters: 363
Distinct categories: 9
Distinct scripts: 4
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2068
Unique (%): 88.6%

Sample

1st row: 경상북도 경산시 자인면 자인공단2로4길 8
2nd row: 경상북도 경산시 삼풍로 27, 경북테크노파크 제2생산공장 105호, 106호 (삼풍동)
3rd row: 경상북도 경산시 중산길 21-15 (중산동)
4th row: 경상북도 경산시 남산면 하대리 7번지 외 7필지
5th row: 경상북도 경산시 남산면 서원천로 260-17
Value | Count | Frequency (%)
경상북도 | 2335 | 17.8%
경산시 | 2334 | 17.8%
진량읍 | 766 | 5.8%
압량면 | 358 | 2.7%
와촌면 | 286 | 2.2%
자인면 | 247 | 1.9%
남천면 | 215 | 1.6%
필지 | 176 | 1.3%
 | 176 | 1.3%
남산면 | 161 | 1.2%
Other values (2150) | 6052 | 46.2%

Most occurring characters

ValueCountFrequency (%)
10821
18.6%
4777
 
8.2%
2736
 
4.7%
2397
 
4.1%
2395
 
4.1%
2355
 
4.1%
2343
 
4.0%
1 1699
 
2.9%
2 1383
 
2.4%
1307
 
2.3%
Other values (353) 25844
44.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 35404 | 61.0%
Space Separator | 10821 | 18.6%
Decimal Number | 8902 | 15.3%
Dash Punctuation | 866 | 1.5%
Close Punctuation | 823 | 1.4%
Open Punctuation | 823 | 1.4%
Other Punctuation | 259 | 0.4%
Uppercase Letter | 136 | 0.2%
Lowercase Letter | 23 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4777
 
13.5%
2736
 
7.7%
2397
 
6.8%
2395
 
6.8%
2355
 
6.7%
2343
 
6.6%
1307
 
3.7%
1249
 
3.5%
1158
 
3.3%
1118
 
3.2%
Other values (302) 13569
38.3%
Uppercase Letter
ValueCountFrequency (%)
B 25
18.4%
D 17
12.5%
R 17
12.5%
M 9
 
6.6%
C 9
 
6.6%
G 9
 
6.6%
A 8
 
5.9%
T 6
 
4.4%
E 6
 
4.4%
S 5
 
3.7%
Other values (10) 25
18.4%
Lowercase Letter
ValueCountFrequency (%)
i 7
30.4%
o 4
17.4%
g 2
 
8.7%
s 2
 
8.7%
t 1
 
4.3%
c 1
 
4.3%
r 1
 
4.3%
u 1
 
4.3%
d 1
 
4.3%
n 1
 
4.3%
Other values (2) 2
 
8.7%
Decimal Number
ValueCountFrequency (%)
1 1699
19.1%
2 1383
15.5%
3 980
11.0%
4 932
10.5%
5 844
9.5%
6 709
8.0%
8 622
 
7.0%
7 620
 
7.0%
0 618
 
6.9%
9 495
 
5.6%
Other Punctuation
ValueCountFrequency (%)
, 232
89.6%
& 19
 
7.3%
. 8
 
3.1%
Close Punctuation
ValueCountFrequency (%)
) 818
99.4%
] 5
 
0.6%
Open Punctuation
ValueCountFrequency (%)
( 818
99.4%
[ 5
 
0.6%
Space Separator
ValueCountFrequency (%)
10821
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 866
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 35360 | 60.9%
Common | 22494 | 38.7%
Latin | 159 | 0.3%
Han | 44 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4777
 
13.5%
2736
 
7.7%
2397
 
6.8%
2395
 
6.8%
2355
 
6.7%
2343
 
6.6%
1307
 
3.7%
1249
 
3.5%
1158
 
3.3%
1118
 
3.2%
Other values (299) 13525
38.2%
Latin
ValueCountFrequency (%)
B 25
15.7%
D 17
 
10.7%
R 17
 
10.7%
M 9
 
5.7%
C 9
 
5.7%
G 9
 
5.7%
A 8
 
5.0%
i 7
 
4.4%
T 6
 
3.8%
E 6
 
3.8%
Other values (22) 46
28.9%
Common
ValueCountFrequency (%)
10821
48.1%
1 1699
 
7.6%
2 1383
 
6.1%
3 980
 
4.4%
4 932
 
4.1%
- 866
 
3.8%
5 844
 
3.8%
) 818
 
3.6%
( 818
 
3.6%
6 709
 
3.2%
Other values (9) 2624
 
11.7%
Han
ValueCountFrequency (%)
22
50.0%
17
38.6%
5
 
11.4%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 35360 | 60.9%
ASCII | 22653 | 39.0%
CJK | 44 | 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
10821
47.8%
1 1699
 
7.5%
2 1383
 
6.1%
3 980
 
4.3%
4 932
 
4.1%
- 866
 
3.8%
5 844
 
3.7%
) 818
 
3.6%
( 818
 
3.6%
6 709
 
3.1%
Other values (41) 2783
 
12.3%
Hangul
ValueCountFrequency (%)
4777
 
13.5%
2736
 
7.7%
2397
 
6.8%
2395
 
6.8%
2355
 
6.7%
2343
 
6.6%
1307
 
3.7%
1249
 
3.5%
1158
 
3.3%
1118
 
3.2%
Other values (299) 13525
38.2%
CJK
ValueCountFrequency (%)
22
50.0%
17
38.6%
5
 
11.4%

전화번호 (phone number)
Text

MISSING 

Distinct: 1693
Distinct (%): 91.9%
Missing: 492
Missing (%): 21.1%
Memory size: 18.4 KiB

Length

Max length: 13
Median length: 12
Mean length: 12.01682
Min length: 9

Characters and Unicode

Total characters: 22147
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1564
Unique (%): 84.9%

Sample

1st row: 053-856-9928
2nd row: 053-802-9700
3rd row: 053-811-4360
4th row: 053-851-8334
5th row: 064-792-8681
Value | Count | Frequency (%)
053-981-8806 | 4 | 0.2%
053-856-5101 | 4 | 0.2%
053-752-0573 | 4 | 0.2%
053-859-1100 | 4 | 0.2%
053-853-6868 | 3 | 0.2%
053-813-4518 | 3 | 0.2%
053-851-8600 | 3 | 0.2%
053-857-9097 | 3 | 0.2%
053-856-9032 | 3 | 0.2%
053-856-9100 | 3 | 0.2%
Other values (1683) | 1809 | 98.2%

Most occurring characters

Value | Count | Frequency (%)
5 | 3744 | 16.9%
- | 3685 | 16.6%
0 | 3041 | 13.7%
3 | 2808 | 12.7%
8 | 2443 | 11.0%
1 | 1798 | 8.1%
7 | 1129 | 5.1%
6 | 1017 | 4.6%
2 | 986 | 4.5%
4 | 815 | 3.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 18462 | 83.4%
Dash Punctuation | 3685 | 16.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 3744
20.3%
0 3041
16.5%
3 2808
15.2%
8 2443
13.2%
1 1798
9.7%
7 1129
 
6.1%
6 1017
 
5.5%
2 986
 
5.3%
4 815
 
4.4%
9 681
 
3.7%
Dash Punctuation
ValueCountFrequency (%)
- 3685
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 22147 | 100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 3744
16.9%
- 3685
16.6%
0 3041
13.7%
3 2808
12.7%
8 2443
11.0%
1 1798
8.1%
7 1129
 
5.1%
6 1017
 
4.6%
2 986
 
4.5%
4 815
 
3.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 22147 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 3744
16.9%
- 3685
16.6%
0 3041
13.7%
3 2808
12.7%
8 2443
11.0%
1 1798
8.1%
7 1129
 
5.1%
6 1017
 
4.6%
2 986
 
4.5%
4 815
 
3.7%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
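
Both nullity views can be approximated in plain text. A minimal sketch, assuming rows are tuples with `None` for a missing cell (the `<NA>` entries in the sample); the helper names are hypothetical and the output is a text stand-in for the plots, not the profiler's rendering.

```python
def nullity_by_column(rows, columns):
    """Count missing (None) cells per column."""
    counts = {name: 0 for name in columns}
    for row in rows:
        for name, value in zip(columns, row):
            if value is None:
                counts[name] += 1
    return counts


def nullity_matrix(rows):
    """One string per row: '#' marks a present cell, '.' a missing one."""
    return ["".join("." if value is None else "#" for value in row) for row in rows]


columns = ["회사명", "주소", "전화번호"]
# Three rows from the sample in this report; two have no phone number.
rows = [
    ("(주)E.V산업", "경상북도 경산시 진량읍 일연로 491-2 (유성ENG)", "053-816-1505"),
    ("(주)ESI 은성", "경상북도 경산시 압량면 가일길24길 33", None),
    ("훈텍스", "경상북도 경산시 압량면 가일길28길 11-27", None),
]
print(nullity_by_column(rows, columns))  # {'회사명': 0, '주소': 0, '전화번호': 2}
print(nullity_matrix(rows))              # ['###', '##.', '##.']
```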

Sample

Index | 회사명 | 주소 | 전화번호
0 | (유)농업회사법인 삼미식품 | 경상북도 경산시 자인면 자인공단2로4길 8 | 053-856-9928
1 | (유)지오선 | 경상북도 경산시 삼풍로 27, 경북테크노파크 제2생산공장 105호, 106호 (삼풍동) | 053-802-9700
2 | (유)협신모직 | 경상북도 경산시 중산길 21-15 (중산동) | 053-811-4360
3 | (주) 경도철강 가공센터 | 경상북도 경산시 남산면 하대리 7번지 외 7필지 | 053-851-8334
4 | (주) 썬로드 | 경상북도 경산시 남산면 서원천로 260-17 | 064-792-8681
5 | (주) 유덕 경산지점 | 경상북도 경산시 압량면 원효로 549-7 | 070-7576-1345
6 | (주) 현진 | 경상북도 경산시 남천면 대명리 309 | 053-751-1417
7 | (주) 화산 | 경상북도 경산시 진량읍 일연로115길 18 | 054-335-6666
8 | (주)E.V산업 | 경상북도 경산시 진량읍 일연로 491-2 (유성ENG) | 053-816-1505
9 | (주)ESI 은성 | 경상북도 경산시 압량면 가일길24길 33 | <NA>

Index | 회사명 | 주소 | 전화번호
2325 | 훈텍스 | 경상북도 경산시 압량면 가일길28길 11-27 | <NA>
2326 | 휴먼플러스(주)CNC사업본부 | 경상북도 경산시 진량읍 공단9로 6 | 053-710-2030
2327 | 흥생농장 | 경상북도 경산시 진량읍 선화리 227 | <NA>
2328 | 흥성실업 | 경상북도 경산시 남천면 신석길 46-7 | 053-812-3331
2329 | 흥창스틸(주) | 경상북도 경산시 자인면 한장군로 412, (북사리 1084-3) | 053-851-8486
2330 | 흥창스틸(주) | 경상북도 경산시 남산면 하남로 75 | 053-851-8486
2331 | 흥창스틸(주)(하남지점) | 경상북도 경산시 남산면 하남로 39 (남산면) | 053-851-8486
2332 | 희수엔지니어링(유) | 경상북도 경산시 진량읍 가야로67길 13-6 | 053-593-7700
2333 | 희승무역주식회사 | 경상북도 경산시 진량읍 진성로 407-22 (총 2 필지) | 053-853-7744
2334 | 히아브 특장 | 경상북도 경산시 와촌면 계당리 172-2 | 053-852-7708

Duplicate rows

Most frequently occurring

Index | 회사명 | 주소 | 전화번호 | # duplicates
0 | (주)대동가스텍 | 경상북도 경산시 진량읍 공단4로 96 (주대동가스텍) | 053-856-5142 | 2
1 | 남경산업 | 경상북도 경산시 남천면 상대로 127 (주동제C&P) | 053-813-6600 | 2
2 | 대우연사 | 경상북도 경산시 남산면 대왕로 60-9 | 053-792-1751 | 2
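
A table like this can be derived by counting identical rows. A minimal sketch, assuming each row is a tuple of (회사명, 주소, 전화번호) and that two rows are duplicates only when all three fields match exactly; `duplicate_rows` is a hypothetical helper, not the profiler's implementation.

```python
from collections import Counter


def duplicate_rows(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


rows = [
    ("대우연사", "경상북도 경산시 남산면 대왕로 60-9", "053-792-1751"),
    ("대우연사", "경상북도 경산시 남산면 대왕로 60-9", "053-792-1751"),
    ("흥성실업", "경상북도 경산시 남천면 신석길 46-7", "053-812-3331"),
]
for row, n in duplicate_rows(rows).items():
    print(row[0], n)  # 대우연사 2
```

Rows that share a company name and phone number but differ in address are not reported, because the comparison covers the whole row.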