Overview

Dataset statistics

Number of variables: 5
Number of observations: 1027
Missing cells: 394
Missing cells (%): 7.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 40.2 KiB
Average record size in memory: 40.1 B
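
The two derived figures above can be checked by hand. The sketch below assumes that the missing-cell percentage is taken over all cells (variables × observations) and that the average record size is the total memory divided by the number of observations; both assumptions reproduce the reported values.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 5
n_observations = 1027
missing_cells = 394
total_memory_kib = 40.2

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: average record size = total memory / number of rows.
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```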

Variable types

Text: 4
Categorical: 1

Dataset

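Description: Public data on the status of women-owned enterprises as of the end of September (9월말여성기업현황공공데이터)
Author: Jeollabuk-do (전라북도)
URL: https://www.bigdatahub.go.kr/opendata/dataSet/detail.nm?contentId=37&rlik=49451aebf056b486&serviceId=201851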

Alerts

주요 생산품목 (main products) has 394 (38.4%) missing values (alert type: Missing)

Reproduction

Analysis started: 2024-03-14 00:30:15.022589
Analysis finished: 2024-03-14 00:30:15.632981
Duration: 0.61 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업체명 (company name)
Text

Distinct: 1020
Distinct (%): 99.3%
Missing: 0
Missing (%): 0.0%
Memory size: 8.2 KiB

Length

Max length: 20
Median length: 18
Mean length: 7.345667
Min length: 1
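
As a rough illustration of how such length statistics can be derived, the sketch below measures string lengths in Unicode code points with `len()`. It uses only the five sample values listed for this variable, so its results differ from the full-column figures above; the variable names are illustrative.

```python
from statistics import mean, median

# The five sample values shown for this variable (not the full column).
names = [
    "안과밖",
    "주식회사 욱성건설",
    "행복한건축사사무소",
    "(유)잡플러스",
    "(유)우리여행사",
]

# len() counts code points, so each precomposed Hangul syllable counts as 1.
lengths = [len(name) for name in names]

print("Max length:", max(lengths))
print("Median length:", median(lengths))
print("Mean length:", mean(lengths))
print("Min length:", min(lengths))
```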

Characters and Unicode

Total characters: 7544
Distinct characters: 458
Distinct categories: 9
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
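
A minimal sketch of a per-category character count, using Python's standard `unicodedata` module. The function name `category_counts` and the two input strings (taken from the sample rows) are illustrative. The module returns two-letter General Category codes: `Lo` corresponds to "Other Letter", `Ps` to "Open Punctuation", `Pe` to "Close Punctuation" and `Zs` to "Space Separator". The standard library does not expose the script or block properties, so those are not covered here.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters per Unicode General Category code."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Two of the sample values shown for this variable.
sample = ["(유)잡플러스", "주식회사 욱성건설"]
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```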

Unique

Unique: 1013
Unique (%): 98.6%
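
The sketch below shows one reading of the two counts, stated here as an assumption: "distinct" is the number of different values, and "unique" is the number of values that occur exactly once. Under that reading, 1020 distinct and 1013 unique values mean 7 values are repeated, and if each appears twice that accounts for 1013 + 7 × 2 = 1027 rows. The helper name and the toy data are illustrative.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(v for v in values if v is not None)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Toy column: "B" and "C" are repeated, "A" and "D" occur once.
toy = ["A", "B", "B", "C", "C", "D"]
distinct, unique = distinct_and_unique(toy)
print(distinct, unique)
```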

Sample

1st row: 안과밖
2nd row: 주식회사 욱성건설
3rd row: 행복한건축사사무소
4th row: (유)잡플러스
5th row: (유)우리여행사
Value | Count | Frequency (%)
주식회사 | 80 | 6.5%
유한회사 | 70 | 5.7%
[?] | 9 | 0.7%
건축사사무소 | 5 | 0.4%
[?] | 5 | 0.4%
디자인 | 3 | 0.2%
농업회사법인 | 3 | 0.2%
주)서원산업 | 2 | 0.2%
신영 | 2 | 0.2%
비타민선물유통 | 2 | 0.2%
Other values (1045) | 1054 | 85.3%

Most occurring characters

Value | Count | Frequency (%)
) | 467 | 6.2%
( | 460 | 6.1%
[?] | 381 | 5.1%
[?] | 353 | 4.7%
[?] | 319 | 4.2%
[?] | 224 | 3.0%
(space) | 212 | 2.8%
[?] | 183 | 2.4%
[?] | 162 | 2.1%
[?] | 144 | 1.9%
Other values (448) | 4639 | 61.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 6296 | 83.5%
Close Punctuation | 467 | 6.2%
Open Punctuation | 460 | 6.1%
Space Separator | 212 | 2.8%
Uppercase Letter | 74 | 1.0%
Lowercase Letter | 17 | 0.2%
Decimal Number | 8 | 0.1%
Other Punctuation | 7 | 0.1%
Other Symbol | 3 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[?] | 381 | 6.1%
[?] | 353 | 5.6%
[?] | 319 | 5.1%
[?] | 224 | 3.6%
[?] | 183 | 2.9%
[?] | 162 | 2.6%
[?] | 144 | 2.3%
[?] | 138 | 2.2%
[?] | 130 | 2.1%
[?] | 106 | 1.7%
Other values (408) | 4156 | 66.0%

Uppercase Letter
Value | Count | Frequency (%)
S | 11 | 14.9%
A | 8 | 10.8%
G | 6 | 8.1%
O | 6 | 8.1%
E | 5 | 6.8%
M | 5 | 6.8%
N | 5 | 6.8%
D | 3 | 4.1%
Y | 3 | 4.1%
B | 3 | 4.1%
Other values (10) | 19 | 25.7%

Lowercase Letter
Value | Count | Frequency (%)
a | 3 | 17.6%
e | 3 | 17.6%
g | 2 | 11.8%
n | 2 | 11.8%
s | 2 | 11.8%
d | 1 | 5.9%
i | 1 | 5.9%
l | 1 | 5.9%
t | 1 | 5.9%
f | 1 | 5.9%

Decimal Number
Value | Count | Frequency (%)
0 | 3 | 37.5%
3 | 2 | 25.0%
1 | 2 | 25.0%
2 | 1 | 12.5%

Other Punctuation
Value | Count | Frequency (%)
. | 4 | 57.1%
& | 3 | 42.9%

Close Punctuation
Value | Count | Frequency (%)
) | 467 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 460 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 212 | 100.0%

Other Symbol
Value | Count | Frequency (%)
[?] | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 6297 | 83.5%
Common | 1154 | 15.3%
Latin | 91 | 1.2%
Han | 2 | < 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[?] | 381 | 6.1%
[?] | 353 | 5.6%
[?] | 319 | 5.1%
[?] | 224 | 3.6%
[?] | 183 | 2.9%
[?] | 162 | 2.6%
[?] | 144 | 2.3%
[?] | 138 | 2.2%
[?] | 130 | 2.1%
[?] | 106 | 1.7%
Other values (407) | 4157 | 66.0%

Latin
Value | Count | Frequency (%)
S | 11 | 12.1%
A | 8 | 8.8%
G | 6 | 6.6%
O | 6 | 6.6%
E | 5 | 5.5%
M | 5 | 5.5%
N | 5 | 5.5%
D | 3 | 3.3%
a | 3 | 3.3%
Y | 3 | 3.3%
Other values (20) | 36 | 39.6%

Common
Value | Count | Frequency (%)
) | 467 | 40.5%
( | 460 | 39.9%
(space) | 212 | 18.4%
. | 4 | 0.3%
& | 3 | 0.3%
0 | 3 | 0.3%
3 | 2 | 0.2%
1 | 2 | 0.2%
2 | 1 | 0.1%

Han
Value | Count | Frequency (%)
[?] | 1 | 50.0%
[?] | 1 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 6294 | 83.4%
ASCII | 1245 | 16.5%
None | 3 | < 0.1%
CJK | 2 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
) | 467 | 37.5%
( | 460 | 36.9%
(space) | 212 | 17.0%
S | 11 | 0.9%
A | 8 | 0.6%
G | 6 | 0.5%
O | 6 | 0.5%
E | 5 | 0.4%
M | 5 | 0.4%
N | 5 | 0.4%
Other values (29) | 60 | 4.8%

Hangul
Value | Count | Frequency (%)
[?] | 381 | 6.1%
[?] | 353 | 5.6%
[?] | 319 | 5.1%
[?] | 224 | 3.6%
[?] | 183 | 2.9%
[?] | 162 | 2.6%
[?] | 144 | 2.3%
[?] | 138 | 2.2%
[?] | 130 | 2.1%
[?] | 106 | 1.7%
Other values (406) | 4154 | 66.0%

None
Value | Count | Frequency (%)
[?] | 3 | 100.0%

CJK
Value | Count | Frequency (%)
[?] | 1 | 50.0%
[?] | 1 | 50.0%

대표자 (representative)
Text

Distinct: 906
Distinct (%): 88.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.2 KiB

Length

Max length: 8
Median length: 3
Mean length: 3.0934761
Min length: 2

Characters and Unicode

Total characters: 3177
Distinct characters: 178
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 811
Unique (%): 79.0%

Sample

1st row: 안윤정
2nd row: 채양숙
3rd row: 반은희
4th row: 강승민
5th row: 김도영
Value | Count | Frequency (%)
김미경 | 6 | 0.6%
김은희 | 4 | 0.4%
김영희 | 4 | 0.4%
김은숙 | 4 | 0.4%
윤영애 | 4 | 0.4%
최정숙 | 4 | 0.4%
최정희 | 3 | 0.3%
1명 | 3 | 0.3%
이숙현 | 3 | 0.3%
김현정 | 3 | 0.3%
Other values (918) | 1019 | 96.4%

Most occurring characters

Value | Count | Frequency (%)
[?] | 220 | 6.9%
[?] | 187 | 5.9%
[?] | 150 | 4.7%
[?] | 144 | 4.5%
[?] | 133 | 4.2%
[?] | 125 | 3.9%
[?] | 112 | 3.5%
[?] | 102 | 3.2%
[?] | 93 | 2.9%
[?] | 88 | 2.8%
Other values (168) | 1823 | 57.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3138 | 98.8%
Space Separator | 33 | 1.0%
Decimal Number | 5 | 0.2%
Other Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[?] | 220 | 7.0%
[?] | 187 | 6.0%
[?] | 150 | 4.8%
[?] | 144 | 4.6%
[?] | 133 | 4.2%
[?] | 125 | 4.0%
[?] | 112 | 3.6%
[?] | 102 | 3.3%
[?] | 93 | 3.0%
[?] | 88 | 2.8%
Other values (165) | 1784 | 56.9%

Space Separator
Value | Count | Frequency (%)
(space) | 33 | 100.0%

Decimal Number
Value | Count | Frequency (%)
1 | 5 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3138 | 98.8%
Common | 39 | 1.2%

Most frequent character per script

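Hangul
Value | Count | Frequency (%)
[?] | 220 | 7.0%
[?] | 187 | 6.0%
[?] | 150 | 4.8%
[?] | 144 | 4.6%
[?] | 133 | 4.2%
[?] | 125 | 4.0%
[?] | 112 | 3.6%
[?] | 102 | 3.3%
[?] | 93 | 3.0%
[?] | 88 | 2.8%
Other values (165) | 1784 | 56.9%

Common
Value | Count | Frequency (%)
(space) | 33 | 84.6%
1 | 5 | 12.8%
/ | 1 | 2.6%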

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3138 | 98.8%
ASCII | 39 | 1.2%

Most frequent character per block

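Hangul
Value | Count | Frequency (%)
[?] | 220 | 7.0%
[?] | 187 | 6.0%
[?] | 150 | 4.8%
[?] | 144 | 4.6%
[?] | 133 | 4.2%
[?] | 125 | 4.0%
[?] | 112 | 3.6%
[?] | 102 | 3.3%
[?] | 93 | 3.0%
[?] | 88 | 2.8%
Other values (165) | 1784 | 56.9%

ASCII
Value | Count | Frequency (%)
(space) | 33 | 84.6%
1 | 5 | 12.8%
/ | 1 | 2.6%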

소재지 (location)
Text

Distinct: 949
Distinct (%): 92.4%
Missing: 0
Missing (%): 0.0%
Memory size: 8.2 KiB

Length

Max length: 31
Median length: 27
Mean length: 19.9815
Min length: 12

Characters and Unicode

Total characters: 20521
Distinct characters: 280
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 890
Unique (%): 86.7%

Sample

1st row: 전라북도 군산시 상지곡안4길 20
2nd row: 전라북도 부안군 부안읍 봉서길 15
3rd row: 전북 고창군 고창읍 읍내리 201
4th row: 전북 전주시 완산구 서학로 22
5th row: 전북 전주시 완산구 마전로 5
Value | Count | Frequency (%)
전라북도 | 993 | 20.4%
전주시 | 407 | 8.4%
완산구 | 225 | 4.6%
덕진구 | 182 | 3.7%
익산시 | 157 | 3.2%
군산시 | 95 | 2.0%
김제시 | 58 | 1.2%
정읍시 | 56 | 1.2%
남원시 | 55 | 1.1%
완주군 | 51 | 1.0%
Other values (1344) | 2582 | 53.1%

Most occurring characters

Value | Count | Frequency (%)
(space) | 3851 | 18.8%
[?] | 1487 | 7.2%
[?] | 1071 | 5.2%
[?] | 1006 | 4.9%
[?] | 997 | 4.9%
[?] | 842 | 4.1%
1 | 631 | 3.1%
[?] | 619 | 3.0%
[?] | 612 | 3.0%
[?] | 500 | 2.4%
Other values (270) | 8905 | 43.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 13089 | 63.8%
Space Separator | 3851 | 18.8%
Decimal Number | 3257 | 15.9%
Dash Punctuation | 312 | 1.5%
Open Punctuation | 6 | < 0.1%
Close Punctuation | 4 | < 0.1%
Other Punctuation | 1 | < 0.1%
Uppercase Letter | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[?] | 1487 | 11.4%
[?] | 1071 | 8.2%
[?] | 1006 | 7.7%
[?] | 997 | 7.6%
[?] | 842 | 6.4%
[?] | 619 | 4.7%
[?] | 612 | 4.7%
[?] | 500 | 3.8%
[?] | 437 | 3.3%
[?] | 416 | 3.2%
Other values (254) | 5102 | 39.0%

Decimal Number
Value | Count | Frequency (%)
1 | 631 | 19.4%
2 | 484 | 14.9%
3 | 376 | 11.5%
4 | 314 | 9.6%
5 | 313 | 9.6%
6 | 282 | 8.7%
7 | 234 | 7.2%
0 | 217 | 6.7%
9 | 203 | 6.2%
8 | 203 | 6.2%

Space Separator
Value | Count | Frequency (%)
(space) | 3851 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 312 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 6 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 4 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 1 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
A | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 13089 | 63.8%
Common | 7431 | 36.2%
Latin | 1 | < 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[?] | 1487 | 11.4%
[?] | 1071 | 8.2%
[?] | 1006 | 7.7%
[?] | 997 | 7.6%
[?] | 842 | 6.4%
[?] | 619 | 4.7%
[?] | 612 | 4.7%
[?] | 500 | 3.8%
[?] | 437 | 3.3%
[?] | 416 | 3.2%
Other values (254) | 5102 | 39.0%

Common
Value | Count | Frequency (%)
(space) | 3851 | 51.8%
1 | 631 | 8.5%
2 | 484 | 6.5%
3 | 376 | 5.1%
4 | 314 | 4.2%
5 | 313 | 4.2%
- | 312 | 4.2%
6 | 282 | 3.8%
7 | 234 | 3.1%
0 | 217 | 2.9%
Other values (5) | 417 | 5.6%

Latin
Value | Count | Frequency (%)
A | 1 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 13089 | 63.8%
ASCII | 7432 | 36.2%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 3851 | 51.8%
1 | 631 | 8.5%
2 | 484 | 6.5%
3 | 376 | 5.1%
4 | 314 | 4.2%
5 | 313 | 4.2%
- | 312 | 4.2%
6 | 282 | 3.8%
7 | 234 | 3.1%
0 | 217 | 2.9%
Other values (6) | 418 | 5.6%

Hangul
Value | Count | Frequency (%)
[?] | 1487 | 11.4%
[?] | 1071 | 8.2%
[?] | 1006 | 7.7%
[?] | 997 | 7.6%
[?] | 842 | 6.4%
[?] | 619 | 4.7%
[?] | 612 | 4.7%
[?] | 500 | 3.8%
[?] | 437 | 3.3%
[?] | 416 | 3.2%
Other values (254) | 5102 | 39.0%

주업종 (main industry)
Categorical

Distinct: 43
Distinct (%): 4.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.2 KiB

Value | Count
건설업 | 332
제조업 | 265
도매 및 소매업 | 152
사업시설관리 및 사업지원 서비스업 | 57
전문 과학 및 기술 서비스업 | 35
Other values (38) | 186

Length

Max length: 45
Median length: 3
Mean length: 7.0292113
Min length: 2

Unique

Unique: 15
Unique (%): 1.5%

Sample

1st row: 출판 영상 방송통신 및 정보서비스업
2nd row: 건설업
3rd row: 전문 과학 및 기술 서비스업
4th row: 사업시설관리 및 사업지원 서비스업
5th row: 예술 스포츠 및 여가 관련 서비스업

Common Values

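Value | Count | Frequency (%)
건설업 (construction) | 332 | 32.3%
제조업 (manufacturing) | 265 | 25.8%
도매 및 소매업 (wholesale and retail trade) | 152 | 14.8%
사업시설관리 및 사업지원 서비스업 (business facilities management and business support services) | 57 | 5.6%
전문 과학 및 기술 서비스업 (professional, scientific and technical services) | 35 | 3.4%
출판 영상 방송통신 및 정보서비스업 (publishing, video, broadcasting, telecommunications and information services) | 28 | 2.7%
수리(修理) 및 기타 개인 서비스업 (repair and other personal services) | 24 | 2.3%
전기 가스 증기 및 수도사업 (electricity, gas, steam and water supply) | 15 | 1.5%
운수업 (transportation) | 13 | 1.3%
농업 임업 및 어업 (agriculture, forestry and fishing) | 10 | 1.0%
Other values (33) | 96 | 9.3%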

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
[?] | 376 | 17.0%
건설업 | 332 | 15.0%
제조업 | 300 | 13.6%
소매업 | 159 | 7.2%
도매 | 156 | 7.1%
서비스업 | 135 | 6.1%
사업시설관리 | 57 | 2.6%
사업지원 | 57 | 2.6%
전문 | 36 | 1.6%
과학 | 36 | 1.6%
Other values (75) | 563 | 25.5%
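
The counts in the table above look like word-level frequencies, where each category label is split into tokens and every token inherits the count of its category. The sketch below illustrates that idea under the assumption that tokens are separated by whitespace. It uses only the five most frequent categories, so its totals are lower than the reported ones wherever a token also appears in less frequent categories.

```python
from collections import Counter

# Counts of the five most frequent categories, copied from "Common Values".
top_categories = {
    "건설업": 332,
    "제조업": 265,
    "도매 및 소매업": 152,
    "사업시설관리 및 사업지원 서비스업": 57,
    "전문 과학 및 기술 서비스업": 35,
}

# Assumption: tokens are obtained by splitting on whitespace.
token_counts = Counter()
for category, n in top_categories.items():
    for token in category.split():
        token_counts[token] += n

for token, n in token_counts.most_common():
    print(token, n)
```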

주요 생산품목 (main products)
Text

MISSING 

Distinct: 471
Distinct (%): 74.4%
Missing: 394
Missing (%): 38.4%
Memory size: 8.2 KiB

Length

Max length: 41
Median length: 27
Mean length: 7.3696682
Min length: 1

Characters and Unicode

Total characters: 4665
Distinct characters: 377
Distinct categories: 8
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 403
Unique (%): 63.7%
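
For this variable the percentages do not all share one denominator. The arithmetic below is a sketch under the assumption that the missing percentage is taken over all 1027 rows, while the distinct and unique percentages are taken over the 633 non-missing rows; that assumption reproduces the three reported figures.

```python
# Figures copied from the statistics for this variable.
n_rows = 1027
missing = 394
distinct = 471
unique = 403

# Assumption: distinct/unique shares use only rows that have a value.
present = n_rows - missing

missing_pct = 100 * missing / n_rows
distinct_pct = 100 * distinct / present
unique_pct = 100 * unique / present

print(f"Missing (%): {missing_pct:.1f}%")
print(f"Distinct (%): {distinct_pct:.1f}%")
print(f"Unique (%): {unique_pct:.1f}%")
```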

Sample

1st row: 인쇄광고용역
2nd row: 건설업이라 생산품목은 없슴
3rd row: 건축설계
4th row: 경비청소시설관리용역
5th row: 여행관련용역
Value | Count | Frequency (%)
[?] | 17 | 2.1%
기타인쇄물 | 16 | 2.0%
건축설계 | 11 | 1.4%
건물청소서비스 | 8 | 1.0%
건설업 | 8 | 1.0%
철근콘크리트공사 | 8 | 1.0%
전기공사업 | 8 | 1.0%
시설물유지관리 | 7 | 0.9%
전기공사용역 | 7 | 0.9%
시설물유지관리업 | 6 | 0.8%
Other values (552) | 699 | 87.9%

Most occurring characters

Value | Count | Frequency (%)
(space) | 174 | 3.7%
[?] | 148 | 3.2%
[?] | 146 | 3.1%
[?] | 141 | 3.0%
[?] | 136 | 2.9%
[?] | 132 | 2.8%
[?] | 109 | 2.3%
[?] | 98 | 2.1%
[?] | 98 | 2.1%
[?] | 93 | 2.0%
Other values (367) | 3390 | 72.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4411 | 94.6%
Space Separator | 174 | 3.7%
Other Punctuation | 36 | 0.8%
Close Punctuation | 12 | 0.3%
Open Punctuation | 12 | 0.3%
Uppercase Letter | 11 | 0.2%
Decimal Number | 7 | 0.2%
Lowercase Letter | 2 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[?] | 148 | 3.4%
[?] | 146 | 3.3%
[?] | 141 | 3.2%
[?] | 136 | 3.1%
[?] | 132 | 3.0%
[?] | 109 | 2.5%
[?] | 98 | 2.2%
[?] | 98 | 2.2%
[?] | 93 | 2.1%
[?] | 74 | 1.7%
Other values (348) | 3236 | 73.4%

Decimal Number
Value | Count | Frequency (%)
1 | 2 | 28.6%
7 | 1 | 14.3%
2 | 1 | 14.3%
8 | 1 | 14.3%
4 | 1 | 14.3%
0 | 1 | 14.3%

Uppercase Letter
Value | Count | Frequency (%)
D | 3 | 27.3%
E | 3 | 27.3%
L | 3 | 27.3%
A | 1 | 9.1%
O | 1 | 9.1%

Other Punctuation
Value | Count | Frequency (%)
. | 26 | 72.2%
/ | 9 | 25.0%
; | 1 | 2.8%

Lowercase Letter
Value | Count | Frequency (%)
s | 1 | 50.0%
a | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 174 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 12 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 12 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4410 | 94.5%
Common | 241 | 5.2%
Latin | 13 | 0.3%
Han | 1 | < 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[?] | 148 | 3.4%
[?] | 146 | 3.3%
[?] | 141 | 3.2%
[?] | 136 | 3.1%
[?] | 132 | 3.0%
[?] | 109 | 2.5%
[?] | 98 | 2.2%
[?] | 98 | 2.2%
[?] | 93 | 2.1%
[?] | 74 | 1.7%
Other values (347) | 3235 | 73.4%

Common
Value | Count | Frequency (%)
(space) | 174 | 72.2%
. | 26 | 10.8%
) | 12 | 5.0%
( | 12 | 5.0%
/ | 9 | 3.7%
1 | 2 | 0.8%
7 | 1 | 0.4%
2 | 1 | 0.4%
8 | 1 | 0.4%
4 | 1 | 0.4%
Other values (2) | 2 | 0.8%

Latin
Value | Count | Frequency (%)
D | 3 | 23.1%
E | 3 | 23.1%
L | 3 | 23.1%
A | 1 | 7.7%
O | 1 | 7.7%
s | 1 | 7.7%
a | 1 | 7.7%

Han
Value | Count | Frequency (%)
[?] | 1 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4409 | 94.5%
ASCII | 254 | 5.4%
Compat Jamo | 1 | < 0.1%
CJK | 1 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 174 | 68.5%
. | 26 | 10.2%
) | 12 | 4.7%
( | 12 | 4.7%
/ | 9 | 3.5%
D | 3 | 1.2%
E | 3 | 1.2%
L | 3 | 1.2%
1 | 2 | 0.8%
7 | 1 | 0.4%
Other values (9) | 9 | 3.5%

Hangul
Value | Count | Frequency (%)
[?] | 148 | 3.4%
[?] | 146 | 3.3%
[?] | 141 | 3.2%
[?] | 136 | 3.1%
[?] | 132 | 3.0%
[?] | 109 | 2.5%
[?] | 98 | 2.2%
[?] | 98 | 2.2%
[?] | 93 | 2.1%
[?] | 74 | 1.7%
Other values (346) | 3234 | 73.3%

Compat Jamo
Value | Count | Frequency (%)
[?] | 1 | 100.0%

CJK
Value | Count | Frequency (%)
[?] | 1 | 100.0%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
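
A nullity matrix can be thought of as a boolean grid with one cell per value, marking whether the value is present. The sketch below builds such a grid with plain Python for two columns of the last five sample rows (indices 1022 to 1026), representing `<NA>` as `None`. The variable names and the text rendering are illustrative and are not how the report draws its plot.

```python
columns = ["업체명", "주요 생산품목"]

# Two columns of sample rows 1022-1026; <NA> is represented as None.
rows = [
    ("(유)태림건설", None),
    ("주식회사 현대씨엠", "점토벽돌"),
    ("(유)광명전기", None),
    ("주식회사 건양전력", None),
    ("주식회사 경동", None),
]

# True where a value is present, False where it is missing.
matrix = [[cell is not None for cell in row] for row in rows]

# Per-column missing counts, the quantity a nullity bar chart summarizes.
missing_per_column = {
    col: sum(1 for row in matrix if not row[i])
    for i, col in enumerate(columns)
}

# Text rendering: '#' marks a present value, '.' a missing one.
for row in matrix:
    print("".join("#" if present else "." for present in row))
print(missing_per_column)
```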

Sample

Index | 업체명 (company name) | 대표자 (representative) | 소재지 (location) | 주업종 (main industry) | 주요 생산품목 (main products)
0 | 안과밖 | 안윤정 | 전라북도 군산시 상지곡안4길 20 | 출판 영상 방송통신 및 정보서비스업 | 인쇄광고용역
1 | 주식회사 욱성건설 | 채양숙 | 전라북도 부안군 부안읍 봉서길 15 | 건설업 | 건설업이라 생산품목은 없슴
2 | 행복한건축사사무소 | 반은희 | 전북 고창군 고창읍 읍내리 201 | 전문 과학 및 기술 서비스업 | 건축설계
3 | (유)잡플러스 | 강승민 | 전북 전주시 완산구 서학로 22 | 사업시설관리 및 사업지원 서비스업 | 경비청소시설관리용역
4 | (유)우리여행사 | 김도영 | 전북 전주시 완산구 마전로 5 | 예술 스포츠 및 여가 관련 서비스업 | 여행관련용역
5 | (유)일동이엔지 | 박미정 | 전라북도 익산시 함열읍 함열10길 24 | 건설업 | 전기공사용역
6 | (주) 다일시스템 | 김순영 | 전라북도 전주시 덕진구 반월동 615-64 | 건설업 | 실내건축공사
7 | 고려기획 | 신은영 | 전라북도 전주시 완산구 서신로 57 | 도매 및 소매업 | 판촉사업
8 | 대유건설 주식회사 | 안공선 | 전라북도 정읍시 명륜길 3 | 건설업 | 건설기계및장비설치
9 | 숲속정원 | 이지혜 | 전라북도 군산시 대야면 번영로 961 | 도매 및 소매업 | 선인장

Index | 업체명 (company name) | 대표자 (representative) | 소재지 (location) | 주업종 (main industry) | 주요 생산품목 (main products)
1017 | 밀알 | 박애경 | 전라북도 전주시 완산구 아중로 33 | 출판 영상 방송통신 및 정보서비스업 | 아트디자인및그래픽용역
1018 | 향기나는 한지꽃 | 강수영 | 전라북도 전주시 완산구 천잠로 303 | 제조업 | 한지꽃 한지인테리어소품
1019 | (주)청원산업 | 유세아 | 전라북도 익산시 함열읍 용왕석재길 30 | 제조업 | <NA>
1020 | MS건축사사무소 | 최미선 | 전라북도 전주시 완산구 우전로 323 | 건설업 | 건축설계
1021 | 주식회사대유통신 | 박상미 | 전라북도 정읍시 천변로 158-9 | 출판 영상 방송통신 및 정보서비스업 | 통신사업소프트웨어사업
1022 | (유)태림건설 | 김숙희 | 전라북도 김제시 산정길 84-9 | 건설업 | <NA>
1023 | 주식회사 현대씨엠 | 한정완 | 전라북도 익산시 왕궁면 평장길 32 | 제조업 | 점토벽돌
1024 | (유)광명전기 | 유명희 | 전라북도 김제시 동서로 204 | 건설업 | <NA>
1025 | 주식회사 건양전력 | 강미화 | 전라북도 남원시 시청동로 10-6 | 건설업 | <NA>
1026 | 주식회사 경동 | 김경동 | 전라북도 익산시 부송로 120-1 | 사업시설관리 및 사업지원 서비스업 | <NA>