Overview

Dataset statistics

Number of variables: 3
Number of observations: 1946
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 47.6 KiB
Average record size in memory: 25.1 B

Variable types

Numeric: 1
Text: 2

Dataset

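Description: Company information from the 기업사랑도우미시스템 (literally, "Company Love Helper System") of Gumi City, Gyeongsangbuk-do. The data provides the primary key, company name, representative name, company category, and address fields.
Author: Gumi City, Gyeongsangbuk-do
URL: https://www.data.go.kr/data/15039542/fileData.do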

Alerts

연번 (serial number) has unique values [Unique]
기업명 (company name) has unique values [Unique]

Reproduction

Analysis started: 2023-12-12 17:46:21.530744
Analysis finished: 2023-12-12 17:46:22.145551
Duration: 0.61 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
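
The reported duration follows from the two timestamps above. A minimal sketch of that arithmetic, using only the Python standard library (the variable names are illustrative and not part of the report):

```python
from datetime import datetime

# Timestamps copied from the "Analysis started" and "Analysis finished" rows.
started = datetime.fromisoformat("2023-12-12 17:46:21.530744")
finished = datetime.fromisoformat("2023-12-12 17:46:22.145551")

# Elapsed wall-clock time in seconds, rounded the way the report shows it.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```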

Variables

연번 (serial number)
Real number (ℝ)

UNIQUE 

Distinct: 1946
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 973.5
Minimum: 1
Maximum: 1946
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 17.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 98.25
Q1: 487.25
Median: 973.5
Q3: 1459.75
95-th percentile: 1848.75
Maximum: 1946
Range: 1945
Interquartile range (IQR): 972.5
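
These quantiles are consistent with linear interpolation over the values 1 through 1946. The sketch below assumes, based on the reported minimum, maximum, distinct count, and strict monotonicity, that 연번 is exactly that sequence. The `percentile` helper is a hypothetical stand-in, not the profiler's own code.

```python
def percentile(sorted_values, q):
    """Return the q-th percentile (0 <= q <= 100) by linear interpolation."""
    position = (len(sorted_values) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


# Assumption: the column holds the consecutive integers 1..1946.
serial_numbers = list(range(1, 1947))

q05 = percentile(serial_numbers, 5)
q1 = percentile(serial_numbers, 25)
median = percentile(serial_numbers, 50)
q3 = percentile(serial_numbers, 75)
q95 = percentile(serial_numbers, 95)
iqr = q3 - q1

print(q05, q1, median, q3, q95, iqr)
```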

Descriptive statistics

Standard deviation: 561.90613
Coefficient of variation (CV): 0.57720198
Kurtosis: -1.2
Mean: 973.5
Median Absolute Deviation (MAD): 486.5
Skewness: 0
Sum: 1894431
Variance: 315738.5
Monotonicity: Strictly increasing
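
Several of these figures can be checked by hand. The sketch below again assumes the column is the sequence 1..1946 (an assumption consistent with the reported statistics, not something the report states) and uses the sample variance with an n - 1 denominator, which matches the reported value.

```python
import statistics

# Assumption: the column holds the consecutive integers 1..1946.
serial_numbers = list(range(1, 1947))

total = sum(serial_numbers)
mean = statistics.mean(serial_numbers)
variance = statistics.variance(serial_numbers)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(serial_numbers)      # square root of the sample variance
cv = std_dev / mean                             # coefficient of variation

# Median absolute deviation: median of the absolute distances from the median.
median = statistics.median(serial_numbers)
mad = statistics.median(abs(x - median) for x in serial_numbers)

print(total, mean, variance, round(std_dev, 5), round(cv, 8), mad)
```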
Histogram with fixed size bins (bins=50)
Value                  Count   Frequency (%)
1                      1       0.1%
1279                   1       0.1%
1307                   1       0.1%
1306                   1       0.1%
1305                   1       0.1%
1304                   1       0.1%
1303                   1       0.1%
1302                   1       0.1%
1301                   1       0.1%
1300                   1       0.1%
Other values (1936)    1936    99.5%
Value   Count   Frequency (%)
1       1       0.1%
2       1       0.1%
3       1       0.1%
4       1       0.1%
5       1       0.1%
6       1       0.1%
7       1       0.1%
8       1       0.1%
9       1       0.1%
10      1       0.1%
Value   Count   Frequency (%)
1946    1       0.1%
1945    1       0.1%
1944    1       0.1%
1943    1       0.1%
1942    1       0.1%
1941    1       0.1%
1940    1       0.1%
1939    1       0.1%
1938    1       0.1%
1937    1       0.1%

기업명 (company name)
Text

UNIQUE 

Distinct: 1946
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 15.3 KiB

Length

Max length: 26
Median length: 17
Mean length: 6.2759507
Min length: 2

Characters and Unicode

Total characters: 12213
Distinct characters: 502
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
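
As an illustration of this kind of analysis, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The `CATEGORY_NAMES` mapping and the `category_counts` helper are hypothetical names introduced here, and the input is three company names from the report's sample rows. It is not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
    "Zs": "Space Separator",
    "So": "Other Symbol",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for char in value:
            code = unicodedata.category(char)
            # Fall back to the two-letter code for categories not listed above.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


sample_names = ["(주) 화성피엔지", "(유)클라리오스델코", "(주)AMC ENG"]
counts = category_counts(sample_names)
for name, count in counts.most_common():
    print(name, count)
```

Hangul syllables fall under "Other Letter", which is why that category dominates the tables that follow.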

Unique

Unique: 1946
Unique (%): 100.0%

Sample

1st row: (주) 화성피엔지
2nd row: (유)클라리오스델코
3rd row: (주) 새길건설산업
4th row: (주) 신성 하이텍
5th row: (주) 에스앤유
ValueCountFrequency (%)
주식회사 5
 
0.2%
4
 
0.2%
tech 4
 
0.2%
eng 4
 
0.2%
공장 3
 
0.1%
전자 3
 
0.1%
2공장 3
 
0.1%
구미사업장 2
 
0.1%
주)티케이케미칼 2
 
0.1%
영테크 2
 
0.1%
Other values (1989) 2005
98.4%

Most occurring characters

ValueCountFrequency (%)
) 888
 
7.3%
( 885
 
7.2%
880
 
7.2%
358
 
2.9%
317
 
2.6%
276
 
2.3%
253
 
2.1%
209
 
1.7%
199
 
1.6%
196
 
1.6%
Other values (492) 7752
63.5%

Most occurring categories

Value               Count   Frequency (%)
Other Letter        9689    79.3%
Close Punctuation   888     7.3%
Open Punctuation    885     7.2%
Uppercase Letter    489     4.0%
Space Separator     107     0.9%
Other Symbol        50      0.4%
Decimal Number      39      0.3%
Other Punctuation   29      0.2%
Lowercase Letter    29      0.2%
Dash Punctuation    8       0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
880
 
9.1%
358
 
3.7%
317
 
3.3%
276
 
2.8%
253
 
2.6%
209
 
2.2%
199
 
2.1%
196
 
2.0%
196
 
2.0%
191
 
2.0%
Other values (438) 6614
68.3%
Uppercase Letter
Value               Count   Frequency (%)
G                   56      11.5%
E                   53      10.8%
S                   46      9.4%
T                   43      8.8%
M                   39      8.0%
N                   37      7.6%
C                   32      6.5%
L                   29      5.9%
D                   23      4.7%
A                   20      4.1%
Other values (14)   111     22.7%
Lowercase Letter
Value              Count   Frequency (%)
n                  7       24.1%
c                  4       13.8%
e                  4       13.8%
o                  2       6.9%
a                  2       6.9%
i                  2       6.9%
k                  1       3.4%
s                  1       3.4%
l                  1       3.4%
u                  1       3.4%
Other values (4)   4       13.8%
Decimal Number
Value   Count   Frequency (%)
2       24      61.5%
1       9       23.1%
3       2       5.1%
7       1       2.6%
4       1       2.6%
5       1       2.6%
6       1       2.6%
Other Punctuation
Value   Count   Frequency (%)
.       14      48.3%
&       9       31.0%
,       5       17.2%
/       1       3.4%
Close Punctuation
ValueCountFrequency (%)
) 888
100.0%
Open Punctuation
ValueCountFrequency (%)
( 885
100.0%
Space Separator
ValueCountFrequency (%)
107
100.0%
Other Symbol
ValueCountFrequency (%)
50
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 8
100.0%

Most occurring scripts

Value    Count   Frequency (%)
Hangul   9735    79.7%
Common   1956    16.0%
Latin    518     4.2%
Han      4       < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
880
 
9.0%
358
 
3.7%
317
 
3.3%
276
 
2.8%
253
 
2.6%
209
 
2.1%
199
 
2.0%
196
 
2.0%
196
 
2.0%
191
 
2.0%
Other values (435) 6660
68.4%
Latin
Value               Count   Frequency (%)
G                   56      10.8%
E                   53      10.2%
S                   46      8.9%
T                   43      8.3%
M                   39      7.5%
N                   37      7.1%
C                   32      6.2%
L                   29      5.6%
D                   23      4.4%
A                   20      3.9%
Other values (28)   140     27.0%
Common
ValueCountFrequency (%)
) 888
45.4%
( 885
45.2%
107
 
5.5%
2 24
 
1.2%
. 14
 
0.7%
& 9
 
0.5%
1 9
 
0.5%
- 8
 
0.4%
, 5
 
0.3%
3 2
 
0.1%
Other values (5) 5
 
0.3%
Han
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Most occurring blocks

Value    Count   Frequency (%)
Hangul   9685    79.3%
ASCII    2474    20.3%
None     50      0.4%
CJK      4       < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
) 888
35.9%
( 885
35.8%
107
 
4.3%
G 56
 
2.3%
E 53
 
2.1%
S 46
 
1.9%
T 43
 
1.7%
M 39
 
1.6%
N 37
 
1.5%
C 32
 
1.3%
Other values (43) 288
 
11.6%
Hangul
ValueCountFrequency (%)
880
 
9.1%
358
 
3.7%
317
 
3.3%
276
 
2.8%
253
 
2.6%
209
 
2.2%
199
 
2.1%
196
 
2.0%
196
 
2.0%
191
 
2.0%
Other values (434) 6610
68.2%
None
ValueCountFrequency (%)
50
100.0%
CJK
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

주소 (address)
Text

Distinct: 1661
Distinct (%): 85.4%
Missing: 0
Missing (%): 0.0%
Memory size: 15.3 KiB

Length

Max length: 50
Median length: 42
Mean length: 22.373073
Min length: 1

Characters and Unicode

Total characters: 43538
Distinct characters: 300
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1538
Unique (%): 79.0%
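
"Distinct" (1661) and "Unique" (1538) measure different things: the first counts how many different values occur, while the second counts values that occur exactly once. A minimal sketch of the distinction, using a few addresses from the report's sample rows and a hypothetical helper name:

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for occurrences in counts.values() if occurrences == 1)
    return distinct, unique


addresses = [
    "경상북도 구미시 구포동 639",
    "경상북도 구미시 구포동 639",
    "경상북도 구미시 구포동 639",
    "경상북도 구미시 구포동 640",
    "경상북도 구미시 황상동 501",
]
distinct, unique = distinct_and_unique(addresses)
print(distinct, unique)
```

Here the repeated address counts once toward "distinct" but not at all toward "unique", which is how a column can have fewer unique values than distinct ones.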

Sample

1st row: 경상북도 구미시 공단동 282-6번지
2nd row: 경상북도 구미시 옥계2공단로 13
3rd row: 경상북도 구미시 오태동451-1
4th row: 경상북도 구미시 황상동 502-3
5th row: 경상북도 구미시 황상동 501
Value                 Count   Frequency (%)
구미시                1946    21.2%
경상북도              1936    21.1%
공단동                393     4.3%
산동읍                268     2.9%
1공단로               171     1.9%
비산동                143     1.6%
고아읍                140     1.5%
1공단로6길            80      0.9%
시미동                79      0.9%
장천면                74      0.8%
Other values (1888)   3960    43.1%
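
The table above counts tokens rather than whole addresses. The report does not state how the tokens were produced. The sketch below assumes lowercasing, whitespace splitting, and stripping of surrounding punctuation, which is one plausible reading of the entries shown. The `word_counts` helper is hypothetical, and the input is the five sample rows.

```python
import string
from collections import Counter


def word_counts(values):
    """Count whitespace-separated tokens after lowercasing and trimming punctuation."""
    strip_chars = string.punctuation + string.whitespace
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            word = token.strip(strip_chars)
            if word:  # skip tokens that were punctuation only
                counts[word] += 1
    return counts


sample_addresses = [
    "경상북도 구미시 공단동 282-6번지",
    "경상북도 구미시 옥계2공단로 13",
    "경상북도 구미시 오태동451-1",
    "경상북도 구미시 황상동 502-3",
    "경상북도 구미시 황상동 501",
]
counts = word_counts(sample_addresses)
for word, count in counts.most_common(3):
    print(word, count)
```

Because the third sample row has no space between the district and the lot number, `오태동451-1` stays a single token, which may explain part of the long tail of "Other values".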

Most occurring characters

ValueCountFrequency (%)
8245
18.9%
2067
 
4.7%
1 2063
 
4.7%
2043
 
4.7%
2043
 
4.7%
1988
 
4.6%
1956
 
4.5%
1942
 
4.5%
1939
 
4.5%
1306
 
3.0%
Other values (290) 17946
41.2%

Most occurring categories

Value               Count   Frequency (%)
Other Letter        24426   56.1%
Decimal Number      8863    20.4%
Space Separator     8245    18.9%
Dash Punctuation    1265    2.9%
Close Punctuation   248     0.6%
Open Punctuation    248     0.6%
Uppercase Letter    165     0.4%
Other Punctuation   48      0.1%
Lowercase Letter    18      < 0.1%
Math Symbol         12      < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2067
 
8.5%
2043
 
8.4%
2043
 
8.4%
1988
 
8.1%
1956
 
8.0%
1942
 
8.0%
1939
 
7.9%
1306
 
5.3%
1040
 
4.3%
869
 
3.6%
Other values (245) 7233
29.6%
Uppercase Letter
Value              Count   Frequency (%)
B                  71      43.0%
L                  69      41.8%
S                  4       2.4%
C                  3       1.8%
G                  3       1.8%
D                  3       1.8%
I                  2       1.2%
E                  2       1.2%
K                  2       1.2%
M                  1       0.6%
Other values (5)   5       3.0%
Lowercase Letter
Value   Count   Frequency (%)
s       4       22.2%
k       2       11.1%
n       2       11.1%
o       2       11.1%
t       2       11.1%
i       1       5.6%
h       1       5.6%
l       1       5.6%
u       1       5.6%
j       1       5.6%
Decimal Number
Value   Count   Frequency (%)
1       2063    23.3%
2       1305    14.7%
3       926     10.4%
4       921     10.4%
6       830     9.4%
7       650     7.3%
5       622     7.0%
8       527     5.9%
0       516     5.8%
9       503     5.7%
Other Punctuation
Value   Count   Frequency (%)
.       23      47.9%
,       15      31.2%
/       9       18.8%
@       1       2.1%
Space Separator
ValueCountFrequency (%)
8245
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1265
100.0%
Close Punctuation
ValueCountFrequency (%)
) 248
100.0%
Open Punctuation
ValueCountFrequency (%)
( 248
100.0%
Math Symbol
ValueCountFrequency (%)
~ 12
100.0%

Most occurring scripts

Value    Count   Frequency (%)
Hangul   24426   56.1%
Common   18929   43.5%
Latin    183     0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2067
 
8.5%
2043
 
8.4%
2043
 
8.4%
1988
 
8.1%
1956
 
8.0%
1942
 
8.0%
1939
 
7.9%
1306
 
5.3%
1040
 
4.3%
869
 
3.6%
Other values (245) 7233
29.6%
Latin
Value               Count   Frequency (%)
B                   71      38.8%
L                   69      37.7%
s                   4       2.2%
S                   4       2.2%
C                   3       1.6%
G                   3       1.6%
D                   3       1.6%
I                   2       1.1%
k                   2       1.1%
n                   2       1.1%
Other values (16)   20      10.9%
Common
ValueCountFrequency (%)
8245
43.6%
1 2063
 
10.9%
2 1305
 
6.9%
- 1265
 
6.7%
3 926
 
4.9%
4 921
 
4.9%
6 830
 
4.4%
7 650
 
3.4%
5 622
 
3.3%
8 527
 
2.8%
Other values (9) 1575
 
8.3%

Most occurring blocks

Value         Count   Frequency (%)
Hangul        24425   56.1%
ASCII         19112   43.9%
Compat Jamo   1       < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8245
43.1%
1 2063
 
10.8%
2 1305
 
6.8%
- 1265
 
6.6%
3 926
 
4.8%
4 921
 
4.8%
6 830
 
4.3%
7 650
 
3.4%
5 622
 
3.3%
8 527
 
2.8%
Other values (35) 1758
 
9.2%
Hangul
ValueCountFrequency (%)
2067
 
8.5%
2043
 
8.4%
2043
 
8.4%
1988
 
8.1%
1956
 
8.0%
1942
 
8.0%
1939
 
7.9%
1306
 
5.3%
1040
 
4.3%
869
 
3.6%
Other values (244) 7232
29.6%
Compat Jamo
ValueCountFrequency (%)
1
100.0%

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index   연번   기업명               주소
0       1      (주) 화성피엔지      경상북도 구미시 공단동 282-6번지
1       2      (유)클라리오스델코   경상북도 구미시 옥계2공단로 13
2       3      (주) 새길건설산업    경상북도 구미시 오태동451-1
3       4      (주) 신성 하이텍     경상북도 구미시 황상동 502-3
4       5      (주) 에스앤유        경상북도 구미시 황상동 501
5       6      (주)AMC ENG          경상북도 구미시 고아읍 대망리 4-1
6       7      (주)AST젯텍          경상북도 구미시 산동읍 봉산리 4단지 12-2B 15L
7       8      (주)DS인텍           경상북도 구미시 공단동 274-4
8       9      (주)GMT              경상북도 구미시 공단동 260-10
9       10     (주)GnB              경상북도 구미시 공단동 300-10
Index   연번   기업명                      주소
1936    1937   회명산업(주) 구미지점       경상북도 구미시 구포동 639
1937    1938   회명세미크린(주) 구미공장   경상북도 구미시 구포동 639
1938    1939   회명솔레니스(주)            경상북도 구미시 구포동 639
1939    1940   회명애쉬랜드                경상북도 구미시 1공단로6길 103-41
1940    1941   효성수지                    경상북도 구미시 장천면 신장2리 76-7
1941    1942   효성티엔에스                경상북도 구미시 구포동 640
1942    1943   효성하이텍                  경상북도 구미시 오태길 100-28
1943    1944   효신정밀                    경상북도 구미시 4공단로 131 4공단로 131
1944    1945   휴먼스(창고활용)            경상북도 구미시 산동읍 적림리 226-6
1945    1946   흥일기연                    경상북도 구미시 1공단로2길 13