Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 15
Missing cells (%): < 0.1%
Duplicate rows: 2
Duplicate rows (%): < 0.1%
Total size in memory: 644.5 KiB
Average record size in memory: 66.0 B
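
The percentages and the average record size can be re-derived from the counts above. The following is a minimal sketch, assuming that the cell total is variables × observations and that 1 KiB = 1024 B; the variable names are illustrative and do not come from the report.

```python
# Figures copied from the dataset statistics in this report.
n_variables = 7
n_observations = 10_000
missing_cells = 15
duplicate_rows = 2
total_size_kib = 644.5

# Assumption: total number of cells = variables x observations.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"missing cells: {missing_pct:.3f}%")
print(f"duplicate rows: {duplicate_pct:.3f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Both percentages fall below the 0.1% threshold, which is consistent with the "< 0.1%" entries.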

Variable types

Text: 5
Numeric: 2

Dataset

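Description: Information on the status of businesses located in Siheung-si, Gyeonggi-do. (The Siheung-si business status data includes company name (업체명), location (소재지: road-name address 도로명 and lot-number address 지번주소), industry name (업종명), products (생산품), site area (용지면적), and building area (건축면적).)
URL: https://www.data.go.kr/data/3077206/fileData.do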

Alerts

Dataset has 2 (< 0.1%) duplicate rows [Duplicates]
용지면적 is highly overall correlated with 건축면적 [High correlation]
건축면적 is highly overall correlated with 용지면적 [High correlation]
건축면적 is highly skewed (γ1 = 25.91639196) [Skewed]
용지면적 has 5673 (56.7%) zeros [Zeros]
건축면적 has 103 (1.0%) zeros [Zeros]
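
The Skewed and Zeros alerts rest on two simple quantities: the skewness coefficient γ1 and the share of zero values. The sketch below computes the population (Fisher-Pearson) form of γ1 on made-up data. Which estimator the report uses is an assumption here; a bias-corrected sample estimator gives slightly different numbers. The function names and the data are hypothetical.

```python
def skewness(values):
    """Population (Fisher-Pearson) skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant data: skewness treated as zero in this sketch
    return m3 / m2 ** 1.5


def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


symmetric = [1, 2, 3, 4, 5]        # hypothetical, symmetric around 3
right_tailed = [0, 0, 0, 1, 100]   # hypothetical, one large outlier

print(skewness(symmetric))
print(skewness(right_tailed))
print(zero_share(right_tailed))
```

A long right tail, as in the second list, yields a positive coefficient, which is the situation the alert flags for 건축면적.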

Reproduction

Analysis started: 2023-12-12 18:56:51.150583
Analysis finished: 2023-12-12 18:56:55.245828
Duration: 4.1 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업체명
Text

Distinct: 8827
Distinct (%): 88.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 32
Median length: 26
Mean length: 6.3207
Min length: 1

Characters and Unicode

Total characters: 63207
Distinct characters: 731
Distinct categories: 13
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
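
As an illustration of the character properties mentioned above, the sketch below counts Unicode general categories for the five sample company names using Python's standard `unicodedata` module. The two-letter codes map to the category names used in this report (for example `Lo` is Other Letter, `Ps` is Open Punctuation, `Pe` is Close Punctuation). Script and block are not exposed by `unicodedata`, so they are left out of this sketch.

```python
import unicodedata
from collections import Counter

# The five sample values listed for the first text variable.
names = ["삼오금형", "(주)엑소", "(주)한국코드", "(주)디앤더블유", "삼원테크"]

# Count the Unicode general category of every character.
category_counts = Counter(
    unicodedata.category(ch) for name in names for ch in name
)

for category, count in category_counts.most_common():
    print(category, count)
```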

Unique

Unique: 7945
Unique (%): 79.5%
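
The report lists both a Distinct and a Unique count. Assuming the usual reading (distinct = number of different values, unique = number of values that occur exactly once), the difference can be sketched on a small hypothetical list:

```python
from collections import Counter

# Hypothetical values; one name appears twice.
values = ["대성정밀", "하나테크", "대성정밀", "우리테크", "삼원테크"]

counts = Counter(values)
n_distinct = len(counts)                              # different values
n_unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

print(n_distinct, n_unique)
```

Under this reading, Unique can never exceed Distinct, which matches the figures reported for every text variable here.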

Sample

1st row: 삼오금형
2nd row: (주)엑소
3rd row: (주)한국코드
4th row: (주)디앤더블유
5th row: 삼원테크
Value | Count | Frequency (%)
주식회사 | 144 | 1.4%
태양광발전소 | 34 | 0.3%
tech | 27 | 0.3%
eng | 11 | 0.1%
제2공장 | 10 | 0.1%
시흥지점 | 10 | 0.1%
하나테크 | 9 | 0.1%
대성정밀 | 9 | 0.1%
우리테크 | 8 | 0.1%
제이에스테크 | 8 | 0.1%
Other values (8910) | 10157 | 97.4%

Most occurring characters

ValueCountFrequency (%)
4631
 
7.3%
( 4476
 
7.1%
) 4476
 
7.1%
2489
 
3.9%
1850
 
2.9%
1572
 
2.5%
1493
 
2.4%
1331
 
2.1%
956
 
1.5%
917
 
1.5%
Other values (721) 39016
61.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 51326 | 81.2%
Open Punctuation | 4478 | 7.1%
Close Punctuation | 4478 | 7.1%
Uppercase Letter | 1931 | 3.1%
Space Separator | 461 | 0.7%
Lowercase Letter | 200 | 0.3%
Decimal Number | 158 | 0.2%
Other Punctuation | 148 | 0.2%
Dash Punctuation | 16 | < 0.1%
Other Symbol | 8 | < 0.1%
Other values (3) | 3 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4631
 
9.0%
2489
 
4.8%
1850
 
3.6%
1572
 
3.1%
1493
 
2.9%
1331
 
2.6%
956
 
1.9%
917
 
1.8%
909
 
1.8%
900
 
1.8%
Other values (650) 34278
66.8%
Uppercase Letter
ValueCountFrequency (%)
E 254
13.2%
N 200
10.4%
G 183
 
9.5%
S 173
 
9.0%
T 155
 
8.0%
C 131
 
6.8%
M 105
 
5.4%
H 88
 
4.6%
K 73
 
3.8%
A 73
 
3.8%
Other values (15) 496
25.7%
Lowercase Letter
ValueCountFrequency (%)
e 43
21.5%
c 25
12.5%
h 25
12.5%
n 16
 
8.0%
o 15
 
7.5%
t 14
 
7.0%
i 8
 
4.0%
a 8
 
4.0%
r 7
 
3.5%
s 7
 
3.5%
Other values (12) 32
16.0%
Decimal Number
ValueCountFrequency (%)
2 60
38.0%
1 40
25.3%
3 19
 
12.0%
4 13
 
8.2%
0 10
 
6.3%
5 5
 
3.2%
8 4
 
2.5%
9 4
 
2.5%
6 2
 
1.3%
7 1
 
0.6%
Other Punctuation
ValueCountFrequency (%)
. 109
73.6%
& 32
 
21.6%
, 6
 
4.1%
/ 1
 
0.7%
Open Punctuation
ValueCountFrequency (%)
( 4476
> 99.9%
[ 2
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 4476
> 99.9%
] 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
461
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 16
100.0%
Other Symbol
ValueCountFrequency (%)
8
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%
Letter Number
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 51334 | 81.2%
Common | 9741 | 15.4%
Latin | 2132 | 3.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4631
 
9.0%
2489
 
4.8%
1850
 
3.6%
1572
 
3.1%
1493
 
2.9%
1331
 
2.6%
956
 
1.9%
917
 
1.8%
909
 
1.8%
900
 
1.8%
Other values (651) 34286
66.8%
Latin
ValueCountFrequency (%)
E 254
 
11.9%
N 200
 
9.4%
G 183
 
8.6%
S 173
 
8.1%
T 155
 
7.3%
C 131
 
6.1%
M 105
 
4.9%
H 88
 
4.1%
K 73
 
3.4%
A 73
 
3.4%
Other values (38) 697
32.7%
Common
ValueCountFrequency (%)
( 4476
46.0%
) 4476
46.0%
461
 
4.7%
. 109
 
1.1%
2 60
 
0.6%
1 40
 
0.4%
& 32
 
0.3%
3 19
 
0.2%
- 16
 
0.2%
4 13
 
0.1%
Other values (12) 39
 
0.4%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 51326 | 81.2%
ASCII | 11872 | 18.8%
None | 8 | < 0.1%
Number Forms | 1 | < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
4631
 
9.0%
2489
 
4.8%
1850
 
3.6%
1572
 
3.1%
1493
 
2.9%
1331
 
2.6%
956
 
1.9%
917
 
1.8%
909
 
1.8%
900
 
1.8%
Other values (650) 34278
66.8%
ASCII
ValueCountFrequency (%)
( 4476
37.7%
) 4476
37.7%
461
 
3.9%
E 254
 
2.1%
N 200
 
1.7%
G 183
 
1.5%
S 173
 
1.5%
T 155
 
1.3%
C 131
 
1.1%
. 109
 
0.9%
Other values (59) 1254
 
10.6%
None
ValueCountFrequency (%)
8
100.0%
Number Forms
ValueCountFrequency (%)
1
100.0%

소재지 지번주소
Text

Distinct: 7156
Distinct (%): 71.6%
Missing: 12
Missing (%): 0.1%
Memory size: 156.2 KiB

Length

Max length: 94
Median length: 78
Mean length: 38.804265
Min length: 14

Characters and Unicode

Total characters: 387577
Distinct characters: 397
Distinct categories: 11
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5506
Unique (%): 55.1%

Sample

1st row: 경기도 시흥시 황골길 87-11 (방산동)
2nd row: 경기도 시흥시 정왕천로 197, 3다402 동우디지털파크 A-313 (정왕동)
3rd row: 경기도 시흥시 매화산단로 165 (매화동)
4th row: 경기도 시흥시 시화벤처로 151, (1사302호)(정왕동)
5th row: 경기도 시흥시 공단2대로139번길 25, [정왕동 1702-1 2마 102] (정왕동)
Value | Count | Frequency (%)
경기도 | 9988 | 13.0%
시흥시 | 9988 | 13.0%
정왕동 | 8370 | 10.9%
시화단지 | 1162 | 1.5%
3바 | 917 | 1.2%
공단1대로 | 722 | 0.9%
2바 | 612 | 0.8%
정왕천로 | 567 | 0.7%
3마 | 520 | 0.7%
시화산단 | 450 | 0.6%
Other values (5455) | 43244 | 56.5%

Most occurring characters

ValueCountFrequency (%)
66915
 
17.3%
22721
 
5.9%
1 21643
 
5.6%
2 15636
 
4.0%
3 13161
 
3.4%
12804
 
3.3%
, 11295
 
2.9%
11143
 
2.9%
( 10978
 
2.8%
) 10976
 
2.8%
Other values (387) 190305
49.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 186422 | 48.1%
Decimal Number | 93875 | 24.2%
Space Separator | 66915 | 17.3%
Other Punctuation | 11321 | 2.9%
Open Punctuation | 11273 | 2.9%
Close Punctuation | 11271 | 2.9%
Dash Punctuation | 3483 | 0.9%
Uppercase Letter | 2594 | 0.7%
Lowercase Letter | 386 | 0.1%
Letter Number | 22 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
22721
 
12.2%
12804
 
6.9%
11143
 
6.0%
10580
 
5.7%
10159
 
5.4%
10111
 
5.4%
10039
 
5.4%
9988
 
5.4%
9831
 
5.3%
6510
 
3.5%
Other values (332) 72536
38.9%
Uppercase Letter
ValueCountFrequency (%)
B 622
24.0%
A 598
23.1%
M 415
16.0%
T 412
15.9%
V 409
15.8%
D 30
 
1.2%
C 21
 
0.8%
E 20
 
0.8%
G 14
 
0.5%
F 12
 
0.5%
Other values (10) 41
 
1.6%
Lowercase Letter
ValueCountFrequency (%)
l 347
89.9%
b 13
 
3.4%
m 5
 
1.3%
t 5
 
1.3%
v 5
 
1.3%
g 3
 
0.8%
a 3
 
0.8%
e 1
 
0.3%
j 1
 
0.3%
o 1
 
0.3%
Other values (2) 2
 
0.5%
Decimal Number
ValueCountFrequency (%)
1 21643
23.1%
2 15636
16.7%
3 13161
14.0%
0 10896
11.6%
4 6302
 
6.7%
5 5829
 
6.2%
6 5610
 
6.0%
7 5339
 
5.7%
8 5027
 
5.4%
9 4432
 
4.7%
Other Punctuation
ValueCountFrequency (%)
, 11295
99.8%
/ 13
 
0.1%
. 7
 
0.1%
: 4
 
< 0.1%
& 2
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 10978
97.4%
[ 295
 
2.6%
Close Punctuation
ValueCountFrequency (%)
) 10976
97.4%
] 295
 
2.6%
Space Separator
ValueCountFrequency (%)
66915
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3483
100.0%
Letter Number
ValueCountFrequency (%)
22
100.0%
Math Symbol
ValueCountFrequency (%)
~ 15
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 198153 | 51.1%
Hangul | 186419 | 48.1%
Latin | 3002 | 0.8%
Han | 3 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
22721
 
12.2%
12804
 
6.9%
11143
 
6.0%
10580
 
5.7%
10159
 
5.4%
10111
 
5.4%
10039
 
5.4%
9988
 
5.4%
9831
 
5.3%
6510
 
3.5%
Other values (329) 72533
38.9%
Latin
ValueCountFrequency (%)
B 622
20.7%
A 598
19.9%
M 415
13.8%
T 412
13.7%
V 409
13.6%
l 347
11.6%
D 30
 
1.0%
22
 
0.7%
C 21
 
0.7%
E 20
 
0.7%
Other values (23) 106
 
3.5%
Common
ValueCountFrequency (%)
66915
33.8%
1 21643
 
10.9%
2 15636
 
7.9%
3 13161
 
6.6%
, 11295
 
5.7%
( 10978
 
5.5%
) 10976
 
5.5%
0 10896
 
5.5%
4 6302
 
3.2%
5 5829
 
2.9%
Other values (12) 24522
 
12.4%
Han
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 201133 | 51.9%
Hangul | 186419 | 48.1%
Number Forms | 22 | < 0.1%
CJK | 3 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
66915
33.3%
1 21643
 
10.8%
2 15636
 
7.8%
3 13161
 
6.5%
, 11295
 
5.6%
( 10978
 
5.5%
) 10976
 
5.5%
0 10896
 
5.4%
4 6302
 
3.1%
5 5829
 
2.9%
Other values (44) 27502
13.7%
Hangul
ValueCountFrequency (%)
22721
 
12.2%
12804
 
6.9%
11143
 
6.0%
10580
 
5.7%
10159
 
5.4%
10111
 
5.4%
10039
 
5.4%
9988
 
5.4%
9831
 
5.3%
6510
 
3.5%
Other values (329) 72533
38.9%
Number Forms
ValueCountFrequency (%)
22
100.0%
CJK
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

소재지 도로명주소
Text

Distinct: 7071
Distinct (%): 70.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 102
Median length: 77
Mean length: 31.3617
Min length: 11

Characters and Unicode

Total characters: 313617
Distinct characters: 346
Distinct categories: 11
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5451
Unique (%): 54.5%

Sample

1st row: 경기도 시흥시 방산동 184번지
2nd row: 경기도 시흥시 정왕동 1288-2번지 3다402 동우디지털파크 A-313
3rd row: 경기도 시흥시 매화동 87-2번지
4th row: 경기도 시흥시 정왕동 2598-6 (1사302호)
5th row: 경기도 시흥시 정왕동 1702-1번지
Value | Count | Frequency (%)
경기도 | 9985 | 15.8%
시흥시 | 9983 | 15.8%
정왕동 | 8971 | 14.2%
3바 | 716 | 1.1%
시화단지 | 646 | 1.0%
2바 | 453 | 0.7%
3층 | 413 | 0.7%
2층 | 412 | 0.7%
1층 | 409 | 0.6%
3마 | 376 | 0.6%
Other values (6402) | 31011 | 48.9%

Most occurring characters

ValueCountFrequency (%)
54629
17.4%
1 21814
 
7.0%
21546
 
6.9%
2 17755
 
5.7%
12077
 
3.9%
- 10806
 
3.4%
10133
 
3.2%
10085
 
3.2%
10051
 
3.2%
10036
 
3.2%
Other values (336) 134685
42.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 145666 | 46.4%
Decimal Number | 94445 | 30.1%
Space Separator | 54629 | 17.4%
Dash Punctuation | 10806 | 3.4%
Uppercase Letter | 2449 | 0.8%
Open Punctuation | 1825 | 0.6%
Close Punctuation | 1823 | 0.6%
Other Punctuation | 1597 | 0.5%
Lowercase Letter | 344 | 0.1%
Letter Number | 25 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
21546
14.8%
12077
 
8.3%
10133
 
7.0%
10085
 
6.9%
10051
 
6.9%
10036
 
6.9%
9590
 
6.6%
9349
 
6.4%
9316
 
6.4%
8214
 
5.6%
Other values (283) 35269
24.2%
Uppercase Letter
ValueCountFrequency (%)
B 568
23.2%
A 531
21.7%
M 414
16.9%
T 410
16.7%
V 408
16.7%
D 27
 
1.1%
C 19
 
0.8%
E 18
 
0.7%
F 12
 
0.5%
G 11
 
0.4%
Other values (8) 31
 
1.3%
Lowercase Letter
ValueCountFrequency (%)
l 307
89.2%
b 12
 
3.5%
v 5
 
1.5%
t 5
 
1.5%
m 5
 
1.5%
g 3
 
0.9%
a 2
 
0.6%
e 1
 
0.3%
y 1
 
0.3%
j 1
 
0.3%
Other values (2) 2
 
0.6%
Decimal Number
ValueCountFrequency (%)
1 21814
23.1%
2 17755
18.8%
3 9917
10.5%
0 9912
10.5%
5 6481
 
6.9%
6 6277
 
6.6%
7 6185
 
6.5%
4 6050
 
6.4%
8 5867
 
6.2%
9 4187
 
4.4%
Other Punctuation
ValueCountFrequency (%)
, 1575
98.6%
/ 12
 
0.8%
. 6
 
0.4%
: 3
 
0.2%
& 1
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 1642
90.0%
[ 183
 
10.0%
Close Punctuation
ValueCountFrequency (%)
) 1640
90.0%
] 183
 
10.0%
Space Separator
ValueCountFrequency (%)
54629
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 10806
100.0%
Letter Number
ValueCountFrequency (%)
25
100.0%
Math Symbol
ValueCountFrequency (%)
~ 8
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 165133 | 52.7%
Hangul | 145663 | 46.4%
Latin | 2818 | 0.9%
Han | 3 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
21546
14.8%
12077
 
8.3%
10133
 
7.0%
10085
 
6.9%
10051
 
6.9%
10036
 
6.9%
9590
 
6.6%
9349
 
6.4%
9316
 
6.4%
8214
 
5.6%
Other values (280) 35266
24.2%
Latin
ValueCountFrequency (%)
B 568
20.2%
A 531
18.8%
M 414
14.7%
T 410
14.5%
V 408
14.5%
l 307
10.9%
D 27
 
1.0%
25
 
0.9%
C 19
 
0.7%
E 18
 
0.6%
Other values (21) 91
 
3.2%
Common
ValueCountFrequency (%)
54629
33.1%
1 21814
 
13.2%
2 17755
 
10.8%
- 10806
 
6.5%
3 9917
 
6.0%
0 9912
 
6.0%
5 6481
 
3.9%
6 6277
 
3.8%
7 6185
 
3.7%
4 6050
 
3.7%
Other values (12) 15307
 
9.3%
Han
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 167926 | 53.5%
Hangul | 145663 | 46.4%
Number Forms | 25 | < 0.1%
CJK | 3 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
54629
32.5%
1 21814
 
13.0%
2 17755
 
10.6%
- 10806
 
6.4%
3 9917
 
5.9%
0 9912
 
5.9%
5 6481
 
3.9%
6 6277
 
3.7%
7 6185
 
3.7%
4 6050
 
3.6%
Other values (42) 18100
 
10.8%
Hangul
ValueCountFrequency (%)
21546
14.8%
12077
 
8.3%
10133
 
7.0%
10085
 
6.9%
10051
 
6.9%
10036
 
6.9%
9590
 
6.6%
9349
 
6.4%
9316
 
6.4%
8214
 
5.6%
Other values (280) 35266
24.2%
Number Forms
ValueCountFrequency (%)
25
100.0%
CJK
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

업종명
Text

Distinct: 1010
Distinct (%): 10.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 35
Median length: 31
Mean length: 15.8477
Min length: 3

Characters and Unicode

Total characters: 158477
Distinct characters: 341
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 443
Unique (%): 4.4%

Sample

1st row: 주형 및 금형 제조업
2nd row: 그 외 기타 전자부품 제조업 외 1 종
3rd row: 기타 절연선 및 케이블 제조업 외 1 종
4th row: 그 외 기타 1차 철강 제조업
5th row: 육상 금속 골조 구조재 제조업 외 2 종
ValueCountFrequency (%)
제조업 7437
 
14.9%
4552
 
9.1%
4352
 
8.7%
기타 2844
 
5.7%
2521
 
5.1%
2030
 
4.1%
1 1607
 
3.2%
유사처리업 1163
 
2.3%
절삭가공 1163
 
2.3%
금속 810
 
1.6%
Other values (694) 21355
42.9%

Most occurring characters

ValueCountFrequency (%)
39840
25.1%
10348
 
6.5%
9503
 
6.0%
8455
 
5.3%
6902
 
4.4%
4570
 
2.9%
4352
 
2.7%
2849
 
1.8%
2730
 
1.7%
2565
 
1.6%
Other values (331) 66363
41.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 114809 | 72.4%
Space Separator | 39840 | 25.1%
Decimal Number | 2915 | 1.8%
Other Punctuation | 895 | 0.6%
Open Punctuation | 9 | < 0.1%
Close Punctuation | 9 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
10348
 
9.0%
9503
 
8.3%
8455
 
7.4%
6902
 
6.0%
4570
 
4.0%
4352
 
3.8%
2849
 
2.5%
2730
 
2.4%
2565
 
2.2%
2530
 
2.2%
Other values (316) 60005
52.3%
Decimal Number
ValueCountFrequency (%)
1 1997
68.5%
2 450
 
15.4%
3 232
 
8.0%
4 104
 
3.6%
5 44
 
1.5%
6 38
 
1.3%
7 23
 
0.8%
9 13
 
0.4%
8 11
 
0.4%
0 3
 
0.1%
Other Punctuation
ValueCountFrequency (%)
, 862
96.3%
. 33
 
3.7%
Space Separator
ValueCountFrequency (%)
39840
100.0%
Open Punctuation
ValueCountFrequency (%)
( 9
100.0%
Close Punctuation
ValueCountFrequency (%)
) 9
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 114809 | 72.4%
Common | 43668 | 27.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
10348
 
9.0%
9503
 
8.3%
8455
 
7.4%
6902
 
6.0%
4570
 
4.0%
4352
 
3.8%
2849
 
2.5%
2730
 
2.4%
2565
 
2.2%
2530
 
2.2%
Other values (316) 60005
52.3%
Common
ValueCountFrequency (%)
39840
91.2%
1 1997
 
4.6%
, 862
 
2.0%
2 450
 
1.0%
3 232
 
0.5%
4 104
 
0.2%
5 44
 
0.1%
6 38
 
0.1%
. 33
 
0.1%
7 23
 
0.1%
Other values (5) 45
 
0.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 114772 | 72.4%
ASCII | 43668 | 27.6%
Compat Jamo | 37 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
39840
91.2%
1 1997
 
4.6%
, 862
 
2.0%
2 450
 
1.0%
3 232
 
0.5%
4 104
 
0.2%
5 44
 
0.1%
6 38
 
0.1%
. 33
 
0.1%
7 23
 
0.1%
Other values (5) 45
 
0.1%
Hangul
ValueCountFrequency (%)
10348
 
9.0%
9503
 
8.3%
8455
 
7.4%
6902
 
6.0%
4570
 
4.0%
4352
 
3.8%
2849
 
2.5%
2730
 
2.4%
2565
 
2.2%
2530
 
2.2%
Other values (315) 59968
52.2%
Compat Jamo
ValueCountFrequency (%)
37
100.0%

생산품
Text

Distinct: 6123
Distinct (%): 61.2%
Missing: 3
Missing (%): < 0.1%
Memory size: 156.2 KiB

Length

Max length: 84
Median length: 53
Mean length: 7.4944483
Min length: 1

Characters and Unicode

Total characters: 74922
Distinct characters: 752
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5342
Unique (%): 53.4%

Sample

1st row: 모터코어 편칭 및 금형
2nd row: CD-ROM, USB
3rd row: 전선 및 전원플러그
4th row: 철강재 절단
5th row: 철구조물
ValueCountFrequency (%)
490
 
2.9%
기계부품 467
 
2.7%
402
 
2.4%
금형 352
 
2.1%
부품 324
 
1.9%
286
 
1.7%
자동차부품 277
 
1.6%
배전반 235
 
1.4%
제조업 203
 
1.2%
반도체 144
 
0.8%
Other values (5810) 13913
81.4%

Most occurring characters

ValueCountFrequency (%)
7389
 
9.9%
4145
 
5.5%
2873
 
3.8%
2428
 
3.2%
, 2338
 
3.1%
1710
 
2.3%
1625
 
2.2%
1505
 
2.0%
1396
 
1.9%
1375
 
1.8%
Other values (742) 48138
64.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 62090 | 82.9%
Space Separator | 7389 | 9.9%
Other Punctuation | 2393 | 3.2%
Uppercase Letter | 1584 | 2.1%
Lowercase Letter | 799 | 1.1%
Open Punctuation | 278 | 0.4%
Close Punctuation | 278 | 0.4%
Decimal Number | 92 | 0.1%
Dash Punctuation | 11 | < 0.1%
Control | 6 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4145
 
6.7%
2873
 
4.6%
2428
 
3.9%
1710
 
2.8%
1625
 
2.6%
1505
 
2.4%
1396
 
2.2%
1375
 
2.2%
1371
 
2.2%
1220
 
2.0%
Other values (672) 42442
68.4%
Uppercase Letter
ValueCountFrequency (%)
C 215
13.6%
D 143
 
9.0%
E 142
 
9.0%
L 139
 
8.8%
P 136
 
8.6%
A 84
 
5.3%
B 83
 
5.2%
T 82
 
5.2%
S 69
 
4.4%
R 66
 
4.2%
Other values (14) 425
26.8%
Lowercase Letter
ValueCountFrequency (%)
e 89
11.1%
l 70
 
8.8%
t 61
 
7.6%
c 61
 
7.6%
r 58
 
7.3%
a 56
 
7.0%
o 52
 
6.5%
s 46
 
5.8%
p 46
 
5.8%
n 45
 
5.6%
Other values (14) 215
26.9%
Other Punctuation
ValueCountFrequency (%)
, 2338
97.7%
/ 26
 
1.1%
. 20
 
0.8%
& 3
 
0.1%
' 3
 
0.1%
· 2
 
0.1%
% 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 38
41.3%
2 25
27.2%
3 12
 
13.0%
0 7
 
7.6%
5 7
 
7.6%
4 2
 
2.2%
6 1
 
1.1%
Open Punctuation
ValueCountFrequency (%)
( 277
99.6%
[ 1
 
0.4%
Close Punctuation
ValueCountFrequency (%)
) 277
99.6%
] 1
 
0.4%
Space Separator
ValueCountFrequency (%)
7389
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 11
100.0%
Control
ValueCountFrequency (%)
6
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 62090 | 82.9%
Common | 10449 | 13.9%
Latin | 2383 | 3.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4145
 
6.7%
2873
 
4.6%
2428
 
3.9%
1710
 
2.8%
1625
 
2.6%
1505
 
2.4%
1396
 
2.2%
1375
 
2.2%
1371
 
2.2%
1220
 
2.0%
Other values (672) 42442
68.4%
Latin
ValueCountFrequency (%)
C 215
 
9.0%
D 143
 
6.0%
E 142
 
6.0%
L 139
 
5.8%
P 136
 
5.7%
e 89
 
3.7%
A 84
 
3.5%
B 83
 
3.5%
T 82
 
3.4%
l 70
 
2.9%
Other values (38) 1200
50.4%
Common
ValueCountFrequency (%)
7389
70.7%
, 2338
 
22.4%
( 277
 
2.7%
) 277
 
2.7%
1 38
 
0.4%
/ 26
 
0.2%
2 25
 
0.2%
. 20
 
0.2%
3 12
 
0.1%
- 11
 
0.1%
Other values (12) 36
 
0.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 62089 | 82.9%
ASCII | 12830 | 17.1%
None | 2 | < 0.1%
Compat Jamo | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7389
57.6%
, 2338
 
18.2%
( 277
 
2.2%
) 277
 
2.2%
C 215
 
1.7%
D 143
 
1.1%
E 142
 
1.1%
L 139
 
1.1%
P 136
 
1.1%
e 89
 
0.7%
Other values (59) 1685
 
13.1%
Hangul
ValueCountFrequency (%)
4145
 
6.7%
2873
 
4.6%
2428
 
3.9%
1710
 
2.8%
1625
 
2.6%
1505
 
2.4%
1396
 
2.2%
1375
 
2.2%
1371
 
2.2%
1220
 
2.0%
Other values (671) 42441
68.4%
None
ValueCountFrequency (%)
· 2
100.0%
Compat Jamo
ValueCountFrequency (%)
1
100.0%

용지면적
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 2381
Distinct (%): 23.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 916.66339
Minimum: 0
Maximum: 85942.8
Zeros: 5673
Zeros (%): 56.7%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 525.475
95-th percentile: 3701.49
Maximum: 85942.8
Range: 85942.8
Interquartile range (IQR): 525.475

Descriptive statistics

Standard deviation: 3261.6625
Coefficient of variation (CV): 3.55819
Kurtosis: 179.02837
Mean: 916.66339
Median Absolute Deviation (MAD): 0
Skewness: 10.837791
Sum: 9166633.9
Variance: 10638442
Monotonicity: Not monotonic
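
Several of the descriptive statistics for 용지면적 are tied together by simple identities: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. The sketch below checks the reported figures against these identities; the variable names are illustrative, and small differences come from rounding in the report.

```python
import math

# Figures reported for 용지면적.
n_observations = 10_000
mean = 916.66339
std = 3261.6625
reported_cv = 3.55819
reported_variance = 10638442
reported_sum = 9166633.9

cv = std / mean                 # coefficient of variation
variance = std ** 2             # variance from the standard deviation
total = mean * n_observations   # sum from the mean

print(math.isclose(cv, reported_cv, rel_tol=1e-5))
print(math.isclose(variance, reported_variance, rel_tol=1e-6))
print(math.isclose(total, reported_sum, rel_tol=1e-9))
```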
Histogram with fixed size bins (bins=50)
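
A histogram with fixed-size bins divides the interval from the minimum to the maximum into equally wide bins and counts the values in each. The sketch below shows one way to do this in plain Python. The function name is hypothetical, the sample values are a handful of numbers picked for illustration rather than the dataset itself, and placing the maximum in the last bin is a convention assumed here.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if hi == lo:
        counts[0] = len(values)  # degenerate case: all values identical
        return counts
    width = (hi - lo) / bins
    for v in values:
        # The maximum falls into the last bin rather than a bin of its own.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


# Illustrative values only (many zeros, a few large entries).
sample = [0.0, 0.0, 0.0, 59.54, 42.2, 330.0, 525.475, 3701.49, 85942.8]
counts = fixed_width_histogram(sample, bins=50)
print(counts[0], counts[2], counts[-1])
```

With a heavily right-skewed variable, almost all values land in the first bin, so most of the 50 bins stay empty or nearly so.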
Most frequent values

Value | Count | Frequency (%)
0.0 | 5673 | 56.7%
59.54 | 95 | 0.9%
42.2 | 69 | 0.7%
69.28 | 61 | 0.6%
330.0 | 36 | 0.4%
80.82 | 33 | 0.3%
34.54 | 31 | 0.3%
29.0 | 30 | 0.3%
37.771 | 29 | 0.3%
57.24 | 27 | 0.3%
Other values (2371) | 3916 | 39.2%

Smallest values

Value | Count | Frequency (%)
0.0 | 5673 | 56.7%
0.2 | 1 | < 0.1%
4.0 | 1 | < 0.1%
14.56 | 1 | < 0.1%
15.0 | 1 | < 0.1%
15.43 | 1 | < 0.1%
18.918 | 3 | < 0.1%
20.04 | 3 | < 0.1%
25.56 | 1 | < 0.1%
25.98 | 10 | 0.1%

Largest values

Value | Count | Frequency (%)
85942.8 | 1 | < 0.1%
79745.8 | 1 | < 0.1%
75358.3 | 1 | < 0.1%
60663.8 | 1 | < 0.1%
52669.0 | 1 | < 0.1%
51744.0 | 1 | < 0.1%
51111.5 | 1 | < 0.1%
49840.9 | 1 | < 0.1%
48187.7 | 1 | < 0.1%
46879.4 | 1 | < 0.1%

건축면적
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 5082
Distinct (%): 50.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 871.68443
Minimum: -127.11
Maximum: 171614.75
Zeros: 103
Zeros (%): 1.0%
Negative: 1
Negative (%): < 0.1%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -127.11
5-th percentile: 33
Q1: 130
Median: 251.97
Q3: 617.025
95-th percentile: 3002.7945
Maximum: 171614.75
Range: 171741.86
Interquartile range (IQR): 487.025
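
The range and IQR follow directly from the other quantile statistics: range = maximum − minimum and IQR = Q3 − Q1. The sketch below checks this for 건축면적 and adds a small quantile helper. The helper uses linear interpolation between neighbouring sorted values, which is an assumption about the interpolation rule and not something the report states; the function name is hypothetical.

```python
# Figures reported for 건축면적.
minimum, q1, q3, maximum = -127.11, 130.0, 617.025, 171614.75

value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range


def quantile(sorted_values, q):
    """Quantile by linear interpolation (assumed rule) on sorted data."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


print(round(value_range, 2), round(iqr, 3))
print(quantile([1, 2, 3, 4], 0.25))
```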

Descriptive statistics

Standard deviation: 3851.3983
Coefficient of variation (CV): 4.4183402
Kurtosis: 934.35276
Mean: 871.68443
Median Absolute Deviation (MAD): 169.47
Skewness: 25.916392
Sum: 8716844.3
Variance: 14833269
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values

Value | Count | Frequency (%)
33.0 | 176 | 1.8%
165.0 | 167 | 1.7%
330.0 | 159 | 1.6%
66.0 | 149 | 1.5%
99.0 | 132 | 1.3%
100.0 | 109 | 1.1%
0.0 | 103 | 1.0%
132.0 | 85 | 0.9%
60.0 | 80 | 0.8%
198.0 | 77 | 0.8%
Other values (5072) | 8763 | 87.6%

Smallest values

Value | Count | Frequency (%)
-127.11 | 1 | < 0.1%
0.0 | 103 | 1.0%
0.5 | 1 | < 0.1%
1.0 | 3 | < 0.1%
3.3 | 1 | < 0.1%
4.0 | 1 | < 0.1%
6.6 | 1 | < 0.1%
8.529 | 1 | < 0.1%
9.0 | 1 | < 0.1%
9.3 | 1 | < 0.1%

Largest values

Value | Count | Frequency (%)
171614.75 | 1 | < 0.1%
158657.21 | 1 | < 0.1%
142192.47 | 1 | < 0.1%
87300.557 | 1 | < 0.1%
86805.33 | 1 | < 0.1%
69396.54 | 1 | < 0.1%
58335.1 | 1 | < 0.1%
56027.12 | 1 | < 0.1%
54687.16 | 1 | < 0.1%
54278.21 | 1 | < 0.1%

Interactions

[Figure: four interaction plots, rendered with Matplotlib v3.7.2]

Correlations

Variable | 용지면적 | 건축면적
용지면적 | 1.000 | 0.725
건축면적 | 0.725 | 1.000

Variable | 용지면적 | 건축면적
용지면적 | 1.000 | 0.506
건축면적 | 0.506 | 1.000
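
The report shows two correlation matrices for the same pair of variables but this extract does not say which coefficient each one uses. As a hedged illustration of why two coefficients can disagree, the sketch below computes a Pearson (linear) and a Spearman (rank-based) correlation in plain Python. The function names and the data are hypothetical and are not taken from the dataset.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical areas with tied zeros, loosely imitating a zero-heavy column.
site = [0.0, 0.0, 100.0, 400.0, 1700.0]
building = [66.0, 800.0, 200.0, 190.0, 1300.0]

print(round(pearson(site, building), 3))
print(round(spearman(site, building), 3))
```

Rank-based coefficients are less sensitive to a few very large values, so on skewed data the two measures often differ noticeably.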

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
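
One way to make nullity correlation concrete is to turn each column into a 0/1 indicator of missingness and correlate the indicators. The sketch below does this in plain Python on hypothetical columns. The function name is an assumption, and a column with no missing values (or only missing values) has no variance, so the sketch returns `None` for it.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None  # undefined when a column is never or always missing
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / math.sqrt(var_a * var_b)


# Hypothetical columns: None marks a missing cell.
address = [None, "x", None, "y"]
product = [None, "p", None, "q"]   # missing in the same rows as address
industry = ["a", None, "b", None]  # missing in exactly the other rows
name = ["k", "l", "m", "n"]        # never missing

print(nullity_correlation(address, product))
print(nullity_correlation(address, industry))
print(nullity_correlation(address, name))
```

A value near 1 means two columns tend to be missing together, a value near -1 means one is present when the other is missing.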

Sample

Index | 업체명 | 소재지 지번주소 | 소재지 도로명주소 | 업종명 | 생산품 | 용지면적 | 건축면적
6125삼오금형경기도 시흥시 황골길 87-11 (방산동)경기도 시흥시 방산동 184번지주형 및 금형 제조업모터코어 편칭 및 금형341.0198.86
1944(주)엑소경기도 시흥시 정왕천로 197, 3다402 동우디지털파크 A-313 (정왕동)경기도 시흥시 정왕동 1288-2번지 3다402 동우디지털파크 A-313그 외 기타 전자부품 제조업 외 1 종CD-ROM, USB52.058.91
3427(주)한국코드경기도 시흥시 매화산단로 165 (매화동)경기도 시흥시 매화동 87-2번지기타 절연선 및 케이블 제조업 외 1 종전선 및 전원플러그1715.01271.25
664(주)디앤더블유경기도 시흥시 시화벤처로 151, (1사302호)(정왕동)경기도 시흥시 정왕동 2598-6 (1사302호)그 외 기타 1차 철강 제조업철강재 절단1686.981161.43
6156삼원테크경기도 시흥시 공단2대로139번길 25, [정왕동 1702-1 2마 102] (정왕동)경기도 시흥시 정왕동 1702-1번지육상 금속 골조 구조재 제조업 외 2 종철구조물0.066.0
9630케이엔시시스템경기도 시흥시 금오로 326-4(과림동)경기도 시흥시 과림동 608-13전동기 및 발전기 제조업전동창(창호자동개폐장치)419.0192.08
8109우진실업(주)경기도 시흥시 은행로12번길 5 (은행동, 우진실업(주))경기도 시흥시 은행동 284-10번지산업용 송풍기 및 배기장치 제조업송풍기478.0849.1
5814미화전자개발경기도 시흥시 미산로 121 (미산동, 현우정밀)경기도 시흥시 미산동 339-2번지배전반 및 전기 자동제어반 제조업자동제어판제조292.0160.12
2144(주)와이에스지경기도 시흥시 마유로238번길 43, 3나 207호 (정왕동)경기도 시흥시 정왕동 1278-6번지 3나 207호유선 통신장비 제조업 외 1 종통신장비케이스0.0792.0
7203아진테크경기도 시흥시 협력로 188, 1다 307 (정왕동)경기도 시흥시 정왕동 1245-6번지 1다 307그 외 기타 특수목적용 기계 제조업 외 1 종수지가공기0.0292.58
Index | 업체명 | 소재지 지번주소 | 소재지 도로명주소 | 업종명 | 생산품 | 용지면적 | 건축면적
3482(주)한신경기도 시흥시 공단1대로260번안길 3, 시화단지 3다 712호 (정왕동)경기도 시흥시 정왕동 1275-11번지배전반 및 전기 자동제어반 제조업 외 2 종부품가공, 전동피더 외3322.82957.87
10389한성공업사경기도 시흥시 경기과기대로 145, 3라 204 (정왕동)경기도 시흥시 정왕동 1273-3번지비주거용 건물 임대업임대3302.31740.51
10411한신금속열처리경기도 시흥시 경제로 296, 3마 301 (정왕동)경기도 시흥시 정왕동 1379번지 3마 301금속 열처리업자동차부품0.0502.0
4534다온경기도 시흥시 협력로 197, 정왕동 1240-1번지, 105동, 1다 202-4 (정왕동)경기도 시흥시 정왕동 1240-1번지 정왕동 1240-1번지, 105동, 1다 202-4기타 가공 공작기계 제조업제관용 용기 제작 기계0.090.0
4340기룡공업경기도 시흥시 옥구천동로 230, (1255-14, 2나112-1) (정왕동)경기도 시흥시 정왕동 1255-14번지 (1255-14, 2나112-1)구조용 금속 판제품 및 공작물 제조업산업구조물0.0168.0
4232그린밸류(주)경기도 시흥시 서울대학로 59-69, 1003호(배곧동, 배곧테크노밸리)경기도 시흥시 배곧동 292-3 배곧테크노밸리 1003호탭, 밸브 및 유사장치 제조업 외 1 종밸브, 위생용플라스틱 제품51.2251.22
9538천호테크경기도 시흥시 공단1대로322번길 20, 3다 304호 (정왕동)경기도 시흥시 정왕동 1281-3번지 3다 304호반도체 제조용 기계 제조업반도체 장비 프레임0.0347.0
1719(주)에스아이멤브레인경기도 시흥시 서해안로 242, 415호 (정왕동, 시화하이테크 아파트형공장)경기도 시흥시 정왕동 1234-7번지 시화하이테크 아파트형공장 415호액체 여과기 제조업수처리설비, 폐수처리장치9918.722085.67
2270(주)월드씨앤지경기도 시흥시 소망공원로 323, 4층 404 (정왕동) 4층 404호경기도 시흥시 정왕동 1287-5번지 4층 404 4층 404호금속 문, 창, 셔터 및 관련제품 제조업금속 문, 창, 셔터0.030.0
8638일성테크경기도 시흥시 공단1대로 152, 정왕동1258-13 (101호)(2다 203-1) (정왕동)경기도 시흥시 정왕동 1258-13번지 정왕동1258-13 (101호)(2다 203-1)절삭가공 및 유사처리업부품절삭가공0.034.0

Duplicate rows

Most frequently occurring

Index | 업체명 | 소재지 지번주소 | 소재지 도로명주소 | 업종명 | 생산품 | 용지면적 | 건축면적 | # duplicates
0 | (주)태진 | 경기도 시흥시 정왕동 번지 | 경기도 시흥시 정왕동 번지 | 금속 위생용품 제조업 외 9 종 | 핵반응기 및 증기발생기 제조업외 | 0.0 | 300.0 | 2
1 | 한아정밀 | 경기도 시흥시 마유로10번길 121, 시화공단 3바 519-8 (정왕동) | 경기도 시흥시 정왕동 2195-17번지 | 주형 및 금형 제조업 | 금형 | 0.0 | 165.0 | 2
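
Duplicate rows are rows whose values match in every column. A minimal sketch of how such rows can be found by counting whole-row tuples is shown below. The rows are shortened, hypothetical examples in the style of this dataset, not the actual records.

```python
from collections import Counter

# Hypothetical rows: (업체명, 업종명, 생산품, 용지면적, 건축면적).
rows = [
    ("한아정밀", "주형 및 금형 제조업", "금형", 0.0, 165.0),
    ("삼원테크", "육상 금속 골조 구조재 제조업 외 2 종", "철구조물", 0.0, 66.0),
    ("한아정밀", "주형 및 금형 제조업", "금형", 0.0, 165.0),
]

# Count identical rows and keep those that occur more than once.
row_counts = Counter(rows)
duplicates = {row: count for row, count in row_counts.items() if count > 1}

for row, count in duplicates.items():
    print(row[0], count)
```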