Overview

Dataset statistics

Number of variables: 14
Number of observations: 125
Missing cells: 238
Missing cells (%): 13.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 13.9 KiB
Average record size in memory: 114.1 B
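
As a quick check, the percentage and per-record figures above can be recomputed from the raw counts. This is a minimal sketch using only the numbers printed in this overview; the variable names are illustrative.

```python
# Figures copied from the dataset statistics above.
n_variables = 14
n_observations = 125
missing_cells = 238
total_size_kib = 13.9  # already rounded in the report

total_cells = n_variables * n_observations           # 14 * 125 cells in the table
missing_pct = 100 * missing_cells / total_cells      # share of cells that are empty
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

The missing-cell share comes out at 13.6%, matching the report. The recomputed record size lands slightly below the reported 114.1 B, presumably because the 13.9 KiB total is itself rounded.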

Variable types

Text: 7
Categorical: 5
DateTime: 1
Numeric: 1

Dataset

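Description: Information on construction companies in Cheongyang-gun, Chungcheongnam-do, including company name, business type, address, date of establishment, number of employees, website, and representative's name.
Author: Chungcheongnam-do (충청남도)
URL: https://alldam.chungnam.go.kr/index.chungnam?menuCd=DOM_000000201001001001&st=&cds=&orgCd=&apiType=&isOpen=Y&pageIndex=1&beforeMenuCd=DOM_000000201001001000&publicdatapk=15126669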

Alerts

주력사업 is highly overall correlated with 유형 and 3 other fields (High correlation)
주력업종 is highly overall correlated with 유형 and 3 other fields (High correlation)
사업자 업태업종 is highly overall correlated with 유형 and 3 other fields (High correlation)
면허 is highly overall correlated with 유형 and 3 other fields (High correlation)
유형 is highly overall correlated with 주력업종 and 3 other fields (High correlation)
유형 is highly imbalanced (93.3%) (Imbalance)
사업자 업태업종 is highly imbalanced (80.7%) (Imbalance)
사업자등록번호 has 2 (1.6%) missing values (Missing)
전화번호 has 4 (3.2%) missing values (Missing)
팩스 has 110 (88.0%) missing values (Missing)
홈페이지 주소 has 122 (97.6%) missing values (Missing)
업체명 has unique values (Unique)
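
The report does not say how the imbalance percentages are derived. The sketch below assumes a normalized-entropy score (one minus Shannon entropy divided by its maximum for the number of classes), which reproduces both figures from the class counts listed later in the report. The function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no defined imbalance in this sketch
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts taken from the Common Values tables of this report.
type_counts = [124, 1]               # 유형: 법인, 개인
sector_counts = [115] + [1] * 10     # 사업자 업태업종: 건설업 plus ten single-row classes

type_score = imbalance_score(type_counts)
sector_score = imbalance_score(sector_counts)
print(f"{type_score:.1%}", f"{sector_score:.1%}")
```

A score near 100% means almost all rows fall into one class, which is why both columns are flagged.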

Reproduction

Analysis started: 2024-03-13 11:47:06.923832
Analysis finished: 2024-03-13 11:47:08.642883
Duration: 1.72 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업체명
Text

UNIQUE 

Distinct: 125
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 12
Median length: 7
Mean length: 7.176
Min length: 4

Characters and Unicode

Total characters: 897
Distinct characters: 135
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
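
To make the category names concrete, the following sketch classifies the characters of three sample company names from this section with Python's standard `unicodedata` module. The two-letter codes it prints correspond to the report's labels: `Lo` is Other Letter, `So` is Other Symbol, `Ps` is Open Punctuation, `Pe` is Close Punctuation. This only illustrates the idea and is not necessarily how the report itself computes the counts.

```python
import unicodedata
from collections import Counter

# Three of the sample values listed for 업체명.
names = ["대광건기㈜", "㈜정산종합모터스", "(주)계룡"]

# Count the Unicode general category of every character.
counts = Counter(unicodedata.category(ch) for name in names for ch in name)

for code, n in counts.most_common():
    print(code, n)
```

The single-character symbol ㈜ falls under Other Symbol, while the spelled-out form (주) contributes one opening bracket, one letter, and one closing bracket.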

Unique

Unique: 125
Unique (%): 100.0%

Sample

1st row: 정산하나종합정비
2nd row: 정산농업협동조합
3rd row: 대광건기㈜
4th row: ㈜정산종합모터스
5th row: (주)계룡
ValueCountFrequency (%)
주식회사 3
 
2.3%
정산하나종합정비 1
 
0.8%
씨에스테크 1
 
0.8%
신행건설(주 1
 
0.8%
송산건설(주 1
 
0.8%
성진건설(주 1
 
0.8%
성아건설(주 1
 
0.8%
성심건축(종합설비 1
 
0.8%
서정건설(주 1
 
0.8%
서부건설(주 1
 
0.8%
Other values (116) 116
90.6%

Most occurring characters

ValueCountFrequency (%)
113
 
12.6%
( 105
 
11.7%
) 105
 
11.7%
80
 
8.9%
77
 
8.6%
16
 
1.8%
12
 
1.3%
11
 
1.2%
11
 
1.2%
11
 
1.2%
Other values (125) 356
39.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 680
75.8%
Open Punctuation 105
 
11.7%
Close Punctuation 105
 
11.7%
Other Symbol 4
 
0.4%
Space Separator 3
 
0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
113
 
16.6%
80
 
11.8%
77
 
11.3%
16
 
2.4%
12
 
1.8%
11
 
1.6%
11
 
1.6%
11
 
1.6%
11
 
1.6%
11
 
1.6%
Other values (121) 327
48.1%
Open Punctuation
ValueCountFrequency (%)
( 105
100.0%
Close Punctuation
ValueCountFrequency (%)
) 105
100.0%
Other Symbol
ValueCountFrequency (%)
4
100.0%
Space Separator
ValueCountFrequency (%)
3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 684
76.3%
Common 213
 
23.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
113
 
16.5%
80
 
11.7%
77
 
11.3%
16
 
2.3%
12
 
1.8%
11
 
1.6%
11
 
1.6%
11
 
1.6%
11
 
1.6%
11
 
1.6%
Other values (122) 331
48.4%
Common
ValueCountFrequency (%)
( 105
49.3%
) 105
49.3%
3
 
1.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 680
75.8%
ASCII 213
 
23.7%
None 4
 
0.4%

Most frequent character per block

Hangul
ValueCountFrequency (%)
113
 
16.6%
80
 
11.8%
77
 
11.3%
16
 
2.4%
12
 
1.8%
11
 
1.6%
11
 
1.6%
11
 
1.6%
11
 
1.6%
11
 
1.6%
Other values (121) 327
48.1%
ASCII
ValueCountFrequency (%)
( 105
49.3%
) 105
49.3%
3
 
1.4%
None
ValueCountFrequency (%)
4
100.0%

유형
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

법인: 124
개인: 1

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 0.8%

Sample

1st row: 개인
2nd row: 법인
3rd row: 법인
4th row: 법인
5th row: 법인

Common Values

ValueCountFrequency (%)
법인 124
99.2%
개인 1
 
0.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
법인 124
99.2%
개인 1
 
0.8%

사업자등록번호
Text

MISSING 

Distinct: 123
Distinct (%): 100.0%
Missing: 2
Missing (%): 1.6%
Memory size: 1.1 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 1476
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 123
Unique (%): 100.0%

Sample

1st row: 310-06-32828
2nd row: 307-82-00310
3rd row: 310-81-18783
4th row: 311-86-00108
5th row: 310-81-19026
ValueCountFrequency (%)
308-81-18209 1
 
0.8%
441-03-00032 1
 
0.8%
410-86-14461 1
 
0.8%
780-87-01090 1
 
0.8%
411-81-54215 1
 
0.8%
311-81-31343 1
 
0.8%
307-81-13161 1
 
0.8%
310-06-40895 1
 
0.8%
304-81-19002 1
 
0.8%
310-81-16728 1
 
0.8%
Other values (113) 113
91.9%

Most occurring characters

ValueCountFrequency (%)
- 246
16.7%
1 236
16.0%
8 198
13.4%
0 174
11.8%
3 129
8.7%
2 98
 
6.6%
6 92
 
6.2%
7 86
 
5.8%
5 80
 
5.4%
4 80
 
5.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1230
83.3%
Dash Punctuation 246
 
16.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 236
19.2%
8 198
16.1%
0 174
14.1%
3 129
10.5%
2 98
8.0%
6 92
 
7.5%
7 86
 
7.0%
5 80
 
6.5%
4 80
 
6.5%
9 57
 
4.6%
Dash Punctuation
ValueCountFrequency (%)
- 246
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1476
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 246
16.7%
1 236
16.0%
8 198
13.4%
0 174
11.8%
3 129
8.7%
2 98
 
6.6%
6 92
 
6.2%
7 86
 
5.8%
5 80
 
5.4%
4 80
 
5.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1476
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 246
16.7%
1 236
16.0%
8 198
13.4%
0 174
11.8%
3 129
8.7%
2 98
 
6.6%
6 92
 
6.2%
7 86
 
5.8%
5 80
 
5.4%
4 80
 
5.4%
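
The length and character counts for 사업자등록번호 are consistent with a fixed `NNN-NN-NNNNN` layout. The sketch below assumes that layout (it is inferred from the sample rows, not stated by the report), checks the five sample values against it, and cross-checks the reported totals.

```python
import re

# Assumed layout: 3 digits, 2 digits, 5 digits, separated by hyphens.
BRN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{5}")

samples = ["310-06-32828", "307-82-00310", "310-81-18783",
           "311-86-00108", "310-81-19026"]

all_valid = all(BRN_PATTERN.fullmatch(s) is not None for s in samples)
lengths = {len(s) for s in samples}

# Cross-check against the report: 125 rows, 2 missing.
non_missing = 125 - 2
expected_chars = non_missing * 12     # reported: Total characters 1476
expected_hyphens = non_missing * 2    # reported: Dash Punctuation 246
expected_digits = non_missing * 10    # reported: Decimal Number 1230

print(all_valid, lengths, expected_chars, expected_hyphens, expected_digits)
```

A pattern check like this catches malformed entries; it does not verify that a number is actually registered.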

주력업종
Categorical

HIGH CORRELATION 

Distinct: 39
Distinct (%): 31.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

철근ㆍ콘크리트공사: 28
상ㆍ하수도설비공사: 10
석공사: 7
철근ㆍ콘크리트공사: 6
석공사: 6
Other values (34): 68

Length

Max length: 27
Median length: 19
Mean length: 9.208
Min length: 3

Unique

Unique: 17
Unique (%): 13.6%

Sample

1st row: 건설.광업용 기계 및 장비 수리업
2nd row: 건설 및 토목공사용 기계ㆍ장비 임대업
3rd row: 건설 및 토목공사용 기계ㆍ장비 임대업
4th row: 금속류 해체, 선별 및 원료 재생업
5th row: 콘크리트 및 철근 공사업

Common Values

ValueCountFrequency (%)
철근ㆍ콘크리트공사 28
22.4%
상ㆍ하수도설비공사 10
 
8.0%
석공사 7
 
5.6%
철근ㆍ콘크리트공사 6
 
4.8%
석공사 6
 
4.8%
조경식재공사 6
 
4.8%
토공사 5
 
4.0%
콘크리트 및 철근 공사업 4
 
3.2%
조경식재공사, 조경시설물설치공사 4
 
3.2%
실내건축공사 3
 
2.4%
Other values (29) 46
36.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
철근ㆍ콘크리트공사 34
18.9%
석공사 18
 
10.0%
상ㆍ하수도설비공사 13
 
7.2%
조경식재공사 11
 
6.1%
9
 
5.0%
토공사 9
 
5.0%
공사업 8
 
4.4%
포장공사 7
 
3.9%
도장공사 6
 
3.3%
습식ㆍ방수공사 5
 
2.8%
Other values (33) 60
33.3%

사업자 업태업종
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 11
Distinct (%): 8.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

건설업: 115
자동차 종합 수리업: 1
금융: 1
운수 및 창고업: 1
수도, 하수 및 폐기물 처리, 원료 재생업: 1
Other values (6): 6

Length

Max length: 23
Median length: 3
Mean length: 3.448
Min length: 2

Unique

Unique: 10
Unique (%): 8.0%

Sample

1st row: 자동차 종합 수리업
2nd row: 금융
3rd row: 운수 및 창고업
4th row: 수도, 하수 및 폐기물 처리, 원료 재생업
5th row: 건설업

Common Values

ValueCountFrequency (%)
건설업 115
92.0%
자동차 종합 수리업 1
 
0.8%
금융 1
 
0.8%
운수 및 창고업 1
 
0.8%
수도, 하수 및 폐기물 처리, 원료 재생업 1
 
0.8%
목재생산업 1
 
0.8%
제조 1
 
0.8%
습식·방수공사업 1
 
0.8%
철근·콘크리트공사업 1
 
0.8%
가스난방공사업 1
 
0.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
건설업 115
83.3%
3
 
2.2%
원료 1
 
0.7%
임업 1
 
0.7%
농업 1
 
0.7%
가스난방공사업 1
 
0.7%
철근·콘크리트공사업 1
 
0.7%
습식·방수공사업 1
 
0.7%
제조 1
 
0.7%
목재생산업 1
 
0.7%
Other values (12) 12
 
8.7%

면허
Categorical

HIGH CORRELATION 

Distinct: 22
Distinct (%): 17.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

철근ㆍ콘크리트공사업: 31
도장ㆍ습식ㆍ방수ㆍ석공사업: 17
지반조성ㆍ포장공사업: 12
상ㆍ하수도설비공사업: 11
조경식재ㆍ시설물공사업: 10
Other values (17): 44

Length

Max length: 16
Median length: 10
Mean length: 10.656
Min length: 7

Unique

Unique: 6
Unique (%): 4.8%

Sample

1st row: 건설기계 정비업
2nd row: 건설기계 대여업
3rd row: 건설기계 대여업
4th row: 건설기계 해체재활용업
5th row: 철근·콘크리트공사업

Common Values

ValueCountFrequency (%)
철근ㆍ콘크리트공사업 31
24.8%
도장ㆍ습식ㆍ방수ㆍ석공사업 17
13.6%
지반조성ㆍ포장공사업 12
 
9.6%
상ㆍ하수도설비공사업 11
 
8.8%
조경식재ㆍ시설물공사업 10
 
8.0%
가스난방공사업 6
 
4.8%
금속창호ㆍ지붕건축물조립공사업 5
 
4.0%
실내건축공사업 4
 
3.2%
철근·콘크리트공사업 4
 
3.2%
지반조성·포장공사업 4
 
3.2%
Other values (12) 21
16.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
철근ㆍ콘크리트공사업 34
26.4%
도장ㆍ습식ㆍ방수ㆍ석공사업 21
16.3%
지반조성ㆍ포장공사업 13
 
10.1%
상ㆍ하수도설비공사업 13
 
10.1%
조경식재ㆍ시설물공사업 10
 
7.8%
가스난방공사업 6
 
4.7%
금속창호ㆍ지붕건축물조립공사업 6
 
4.7%
건설기계 4
 
3.1%
지반조성·포장공사업 4
 
3.1%
철근·콘크리트공사업 4
 
3.1%
Other values (8) 14
10.9%
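
The Common Values table for 면허 lists 철근ㆍ콘크리트공사업 (31) and 철근·콘크리트공사업 (4) as separate categories. The sketch below assumes the two spellings differ only in the separator character (U+318D versus U+00B7) and shows one possible cleanup before counting. The helper `normalize_label` is hypothetical and not part of the report or of ydata-profiling.

```python
import unicodedata

# Assumption: these are the two separator characters seen in the labels.
araea = "\u318d"        # the dot used in the larger category
middle_dot = "\u00b7"   # the dot used in the smaller category

label_a = f"철근{araea}콘크리트공사업"
label_b = f"철근{middle_dot}콘크리트공사업"

print(unicodedata.name(araea), "/", unicodedata.name(middle_dot))
print(label_a == label_b)  # the raw strings are not equal

# Hypothetical cleanup: map one separator onto the other and trim whitespace.
SEPARATORS = str.maketrans({araea: middle_dot})

def normalize_label(label: str) -> str:
    return label.translate(SEPARATORS).strip()

merged = normalize_label(label_a) == normalize_label(label_b)
print(merged)
```

An explicit mapping is used here because Unicode compatibility normalization does not convert one of these dots into the other. Whether the categories should be merged is a judgment about the source data that the report itself does not make.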

주력사업
Categorical

HIGH CORRELATION 

Distinct: 37
Distinct (%): 29.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

철근ㆍ콘크리트공사: 28
상ㆍ하수도설비공사: 10
석공사: 7
석공사: 6
조경식재공사: 6
Other values (32): 68

Length

Max length: 27
Median length: 18
Mean length: 8.656
Min length: 3

Unique

Unique: 16
Unique (%): 12.8%

Sample

1st row: 건설기계 정비
2nd row: 건설기계 대여
3rd row: 건설기계 대여
4th row: 건설기계 해체재활용
5th row: 철근·콘크리트공사

Common Values

ValueCountFrequency (%)
철근ㆍ콘크리트공사 28
22.4%
상ㆍ하수도설비공사 10
 
8.0%
석공사 7
 
5.6%
석공사 6
 
4.8%
조경식재공사 6
 
4.8%
토공사 6
 
4.8%
철근ㆍ콘크리트공사 6
 
4.8%
포장공사 5
 
4.0%
철근·콘크리트공사 4
 
3.2%
조경식재공사, 조경시설물설치공사 4
 
3.2%
Other values (27) 43
34.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
철근ㆍ콘크리트공사 34
23.1%
석공사 18
12.2%
상ㆍ하수도설비공사 13
 
8.8%
조경식재공사 11
 
7.5%
토공사 10
 
6.8%
포장공사 9
 
6.1%
도장공사 6
 
4.1%
습식ㆍ방수공사 5
 
3.4%
금속구조물ㆍ창호ㆍ온실공사 5
 
3.4%
조경시설물설치공사 4
 
2.7%
Other values (16) 32
21.8%

전화번호
Text

MISSING 

Distinct: 115
Distinct (%): 95.0%
Missing: 4
Missing (%): 3.2%
Memory size: 1.1 KiB

Length

Max length: 13
Median length: 12
Mean length: 12.041322
Min length: 12

Characters and Unicode

Total characters: 1457
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 109
Unique (%): 90.1%

Sample

1st row: 041-944-2121
2nd row: 041-942-0681
3rd row: 041-834-0501
4th row: 041-942-0707
5th row: 041-943-1019
ValueCountFrequency (%)
070-7782-9266 2
 
1.7%
041-942-1110 2
 
1.7%
041-944-0030 2
 
1.7%
041-943-6067 2
 
1.7%
041-943-8080 2
 
1.7%
041-943-6986 2
 
1.7%
041-942-7220 1
 
0.8%
041-944-0288 1
 
0.8%
041-943-4347 1
 
0.8%
041-943-1777 1
 
0.8%
Other values (105) 105
86.8%

Most occurring characters

ValueCountFrequency (%)
4 283
19.4%
- 242
16.6%
0 206
14.1%
1 172
11.8%
9 138
9.5%
3 94
 
6.5%
2 90
 
6.2%
7 80
 
5.5%
6 58
 
4.0%
8 53
 
3.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1215
83.4%
Dash Punctuation 242
 
16.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 283
23.3%
0 206
17.0%
1 172
14.2%
9 138
11.4%
3 94
 
7.7%
2 90
 
7.4%
7 80
 
6.6%
6 58
 
4.8%
8 53
 
4.4%
5 41
 
3.4%
Dash Punctuation
ValueCountFrequency (%)
- 242
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1457
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 283
19.4%
- 242
16.6%
0 206
14.1%
1 172
11.8%
9 138
9.5%
3 94
 
6.5%
2 90
 
6.2%
7 80
 
5.5%
6 58
 
4.0%
8 53
 
3.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1457
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 283
19.4%
- 242
16.6%
0 206
14.1%
1 172
11.8%
9 138
9.5%
3 94
 
6.5%
2 90
 
6.2%
7 80
 
5.5%
6 58
 
4.0%
8 53
 
3.6%
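
The summary figures for 전화번호 fit together if the Distinct and Unique percentages are taken over non-missing values while the Missing percentage is taken over all rows. That reading is inferred from the numbers, not stated in the report. A minimal check:

```python
# Figures copied from the 전화번호 summary.
n_rows = 125
missing = 4
distinct = 115
unique = 109
total_characters = 1457

non_missing = n_rows - missing                  # values actually present

missing_pct = 100 * missing / n_rows            # over all rows
distinct_pct = 100 * distinct / non_missing     # over present values only
unique_pct = 100 * unique / non_missing
mean_length = total_characters / non_missing

print(f"{missing_pct:.1f}% {distinct_pct:.1f}% {unique_pct:.1f}% {mean_length:.6f}")
```

The same relationship explains why 팩스 and 홈페이지 주소 show 100.0% distinct despite being mostly empty.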

주소
Text

Distinct: 119
Distinct (%): 95.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 36
Median length: 31
Mean length: 23.712
Min length: 18

Characters and Unicode

Total characters: 2964
Distinct characters: 103
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 113
Unique (%): 90.4%

Sample

1st row: 충청남도 청양군 정산면 칠갑산로 1828
2nd row: 충청남도 청양군 정산면 정현길 67
3rd row: 충청남도 청양군 청양읍 충절로 1176
4th row: 충청남도 청양군 정산면 충의로 1126
5th row: 충청남도 청양군 운곡면 청신로 567-83
ValueCountFrequency (%)
충청남도 125
18.1%
청양군 125
18.1%
청양읍 83
 
12.0%
정산면 18
 
2.6%
중앙로 17
 
2.5%
칠갑산로 12
 
1.7%
충절로 12
 
1.7%
2층 11
 
1.6%
비봉면 8
 
1.2%
1층 7
 
1.0%
Other values (163) 273
39.5%

Most occurring characters

ValueCountFrequency (%)
567
19.1%
342
 
11.5%
211
 
7.1%
1 155
 
5.2%
143
 
4.8%
130
 
4.4%
125
 
4.2%
125
 
4.2%
94
 
3.2%
2 89
 
3.0%
Other values (93) 983
33.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 1811
61.1%
Space Separator 567
 
19.1%
Decimal Number 528
 
17.8%
Dash Punctuation 32
 
1.1%
Other Punctuation 18
 
0.6%
Close Punctuation 4
 
0.1%
Open Punctuation 4
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
342
18.9%
211
11.7%
143
 
7.9%
130
 
7.2%
125
 
6.9%
125
 
6.9%
94
 
5.2%
83
 
4.6%
63
 
3.5%
51
 
2.8%
Other values (77) 444
24.5%
Decimal Number
ValueCountFrequency (%)
1 155
29.4%
2 89
16.9%
0 57
 
10.8%
3 41
 
7.8%
5 37
 
7.0%
6 33
 
6.2%
4 32
 
6.1%
9 30
 
5.7%
8 28
 
5.3%
7 26
 
4.9%
Other Punctuation
ValueCountFrequency (%)
, 15
83.3%
3
 
16.7%
Space Separator
ValueCountFrequency (%)
567
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 32
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1811
61.1%
Common 1153
38.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
342
18.9%
211
11.7%
143
 
7.9%
130
 
7.2%
125
 
6.9%
125
 
6.9%
94
 
5.2%
83
 
4.6%
63
 
3.5%
51
 
2.8%
Other values (77) 444
24.5%
Common
ValueCountFrequency (%)
567
49.2%
1 155
 
13.4%
2 89
 
7.7%
0 57
 
4.9%
3 41
 
3.6%
5 37
 
3.2%
6 33
 
2.9%
- 32
 
2.8%
4 32
 
2.8%
9 30
 
2.6%
Other values (6) 80
 
6.9%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1811
61.1%
ASCII 1150
38.8%
None 3
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
567
49.3%
1 155
 
13.5%
2 89
 
7.7%
0 57
 
5.0%
3 41
 
3.6%
5 37
 
3.2%
6 33
 
2.9%
- 32
 
2.8%
4 32
 
2.8%
9 30
 
2.6%
Other values (5) 77
 
6.7%
Hangul
ValueCountFrequency (%)
342
18.9%
211
11.7%
143
 
7.9%
130
 
7.2%
125
 
6.9%
125
 
6.9%
94
 
5.2%
83
 
4.6%
63
 
3.5%
51
 
2.8%
Other values (77) 444
24.5%
None
ValueCountFrequency (%)
3
100.0%

팩스
Text

MISSING 

Distinct: 15
Distinct (%): 100.0%
Missing: 110
Missing (%): 88.0%
Memory size: 1.1 KiB

Length

Max length: 14
Median length: 12
Mean length: 12.266667
Min length: 12

Characters and Unicode

Total characters: 184
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 15
Unique (%): 100.0%

Sample

1st row: 041-942-0052
2nd row: 041-834-0503
3rd row: 041-944-1019
4th row: 041-943-1950
5th row: 0303-3130-1285
ValueCountFrequency (%)
041-942-0052 1
 
6.7%
041-834-0503 1
 
6.7%
041-944-1019 1
 
6.7%
041-943-1950 1
 
6.7%
0303-3130-1285 1
 
6.7%
042-535-6243 1
 
6.7%
041-944-1338 1
 
6.7%
0504-411-0560 1
 
6.7%
041-943-7435 1
 
6.7%
070-4850-8556 1
 
6.7%
Other values (5) 5
33.3%

Most occurring characters

ValueCountFrequency (%)
0 30
16.3%
- 30
16.3%
4 29
15.8%
1 21
11.4%
3 20
10.9%
9 14
7.6%
5 13
7.1%
2 10
 
5.4%
8 7
 
3.8%
6 6
 
3.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 154
83.7%
Dash Punctuation 30
 
16.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 30
19.5%
4 29
18.8%
1 21
13.6%
3 20
13.0%
9 14
9.1%
5 13
8.4%
2 10
 
6.5%
8 7
 
4.5%
6 6
 
3.9%
7 4
 
2.6%
Dash Punctuation
ValueCountFrequency (%)
- 30
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 184
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 30
16.3%
- 30
16.3%
4 29
15.8%
1 21
11.4%
3 20
10.9%
9 14
7.6%
5 13
7.1%
2 10
 
5.4%
8 7
 
3.8%
6 6
 
3.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 184
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 30
16.3%
- 30
16.3%
4 29
15.8%
1 21
11.4%
3 20
10.9%
9 14
7.6%
5 13
7.1%
2 10
 
5.4%
8 7
 
3.8%
6 6
 
3.3%

설립일자(경력)
DateTime

Distinct: 121
Distinct (%): 96.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB
Minimum: 1970-07-21 00:00:00
Maximum: 2023-08-25 00:00:00
Histogram with fixed size bins (bins=50)

종업원수
Real number (ℝ)

Distinct: 14
Distinct (%): 11.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.76
Minimum: 2
Maximum: 21
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.2 KiB

Quantile statistics

Minimum: 2
5-th percentile: 2
Q1: 3
Median: 4
Q3: 6
95-th percentile: 11.6
Maximum: 21
Range: 19
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 3.3466401
Coefficient of variation (CV): 0.70307565
Kurtosis: 6.1821873
Mean: 4.76
Median Absolute Deviation (MAD): 2
Skewness: 2.2026406
Sum: 595
Variance: 11.2
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)
ValueCountFrequency (%)
2 31
24.8%
3 30
24.0%
5 17
13.6%
4 13
10.4%
7 10
 
8.0%
6 7
 
5.6%
8 5
 
4.0%
10 3
 
2.4%
12 2
 
1.6%
14 2
 
1.6%
Other values (4) 5
 
4.0%
ValueCountFrequency (%)
2 31
24.8%
3 30
24.0%
4 13
10.4%
5 17
13.6%
6 7
 
5.6%
7 10
 
8.0%
8 5
 
4.0%
9 2
 
1.6%
10 3
 
2.4%
12 2
 
1.6%
ValueCountFrequency (%)
21 1
 
0.8%
18 1
 
0.8%
15 1
 
0.8%
14 2
 
1.6%
12 2
 
1.6%
10 3
 
2.4%
9 2
 
1.6%
8 5
4.0%
7 10
8.0%
6 7
5.6%
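
The quantile and descriptive statistics for 종업원수 can be rebuilt from the value counts in the tables above. The sketch below uses only Python's standard `statistics` module and assumes linear interpolation between order statistics (`method="inclusive"`) and the sample (n - 1) variance; both choices are inferred from the fact that they reproduce the reported figures.

```python
import statistics

# Value counts for 종업원수, read from the frequency tables (value: count).
value_counts = {2: 31, 3: 30, 4: 13, 5: 17, 6: 7, 7: 10, 8: 5,
                9: 2, 10: 3, 12: 2, 14: 2, 15: 1, 18: 1, 21: 1}

# Expand the frequency table into a sorted list of observations.
data = sorted(v for v, c in value_counts.items() for _ in range(c))

mean = statistics.mean(data)
variance = statistics.variance(data)   # sample variance, divides by n - 1
std_dev = statistics.stdev(data)
cv = std_dev / mean                    # coefficient of variation

q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
p95 = statistics.quantiles(data, n=100, method="inclusive")[94]

print(len(data), sum(data), mean, variance, round(std_dev, 7), round(cv, 8))
print(q1, median, q3, p95, q3 - q1, max(data) - min(data))
```

The 95th percentile of 11.6 is not an observed value: it is interpolated between 10 and 12, which is why it carries a decimal even though every headcount is an integer.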

홈페이지 주소
Text

MISSING 

Distinct: 3
Distinct (%): 100.0%
Missing: 122
Missing (%): 97.6%
Memory size: 1.1 KiB

Length

Max length: 30
Median length: 15
Mean length: 18.666667
Min length: 11

Characters and Unicode

Total characters: 56
Distinct characters: 23
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 100.0%

Sample

1st row: https://jeongsan.nonghyup.com/
2nd row: www.나무들.com
3rd row: www.sjenc.co.kr
ValueCountFrequency (%)
https://jeongsan.nonghyup.com 1
33.3%
www.나무들.com 1
33.3%
www.sjenc.co.kr 1
33.3%

Most occurring characters

ValueCountFrequency (%)
. 7
12.5%
w 6
 
10.7%
o 5
 
8.9%
n 5
 
8.9%
c 4
 
7.1%
/ 3
 
5.4%
s 3
 
5.4%
j 2
 
3.6%
e 2
 
3.6%
g 2
 
3.6%
Other values (13) 17
30.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 42
75.0%
Other Punctuation 11
 
19.6%
Other Letter 3
 
5.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
w 6
14.3%
o 5
11.9%
n 5
11.9%
c 4
9.5%
s 3
 
7.1%
j 2
 
4.8%
e 2
 
4.8%
g 2
 
4.8%
t 2
 
4.8%
p 2
 
4.8%
Other values (7) 9
21.4%
Other Punctuation
ValueCountFrequency (%)
. 7
63.6%
/ 3
27.3%
: 1
 
9.1%
Other Letter
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 42
75.0%
Common 11
 
19.6%
Hangul 3
 
5.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
w 6
14.3%
o 5
11.9%
n 5
11.9%
c 4
9.5%
s 3
 
7.1%
j 2
 
4.8%
e 2
 
4.8%
g 2
 
4.8%
t 2
 
4.8%
p 2
 
4.8%
Other values (7) 9
21.4%
Common
ValueCountFrequency (%)
. 7
63.6%
/ 3
27.3%
: 1
 
9.1%
Hangul
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 53
94.6%
Hangul 3
 
5.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 7
13.2%
w 6
11.3%
o 5
 
9.4%
n 5
 
9.4%
c 4
 
7.5%
/ 3
 
5.7%
s 3
 
5.7%
j 2
 
3.8%
e 2
 
3.8%
g 2
 
3.8%
Other values (10) 14
26.4%
Hangul
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

대표자 이름
Text

Distinct: 124
Distinct (%): 99.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 7
Median length: 3
Mean length: 3.088
Min length: 2

Characters and Unicode

Total characters: 386
Distinct characters: 113
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 123
Unique (%): 98.4%

Sample

1st row: 홍종근
2nd row: 김봉락
3rd row: 정준희
4th row: 이성호
5th row: 우소제
ValueCountFrequency (%)
한미순 2
 
1.6%
박영희 1
 
0.8%
명재항 1
 
0.8%
김수형 1
 
0.8%
정희완 1
 
0.8%
오보섭 1
 
0.8%
정재성 1
 
0.8%
명재원 1
 
0.8%
이은지 1
 
0.8%
김경호 1
 
0.8%
Other values (114) 114
91.2%

Most occurring characters

ValueCountFrequency (%)
18
 
4.7%
17
 
4.4%
13
 
3.4%
13
 
3.4%
12
 
3.1%
11
 
2.8%
11
 
2.8%
9
 
2.3%
9
 
2.3%
8
 
2.1%
Other values (103) 265
68.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 383
99.2%
Other Punctuation 3
 
0.8%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
18
 
4.7%
17
 
4.4%
13
 
3.4%
13
 
3.4%
12
 
3.1%
11
 
2.9%
11
 
2.9%
9
 
2.3%
9
 
2.3%
8
 
2.1%
Other values (102) 262
68.4%
Other Punctuation
ValueCountFrequency (%)
, 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 383
99.2%
Common 3
 
0.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
18
 
4.7%
17
 
4.4%
13
 
3.4%
13
 
3.4%
12
 
3.1%
11
 
2.9%
11
 
2.9%
9
 
2.3%
9
 
2.3%
8
 
2.1%
Other values (102) 262
68.4%
Common
ValueCountFrequency (%)
, 3
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 383
99.2%
ASCII 3
 
0.8%

Most frequent character per block

Hangul
ValueCountFrequency (%)
18
 
4.7%
17
 
4.4%
13
 
3.4%
13
 
3.4%
12
 
3.1%
11
 
2.9%
11
 
2.9%
9
 
2.3%
9
 
2.3%
8
 
2.1%
Other values (102) 262
68.4%
ASCII
ValueCountFrequency (%)
, 3
100.0%

Interactions


Correlations

Variable | 유형 | 주력업종 | 사업자 업태업종 | 면허 | 주력사업 | 팩스 | 종업원수 | 홈페이지 주소
유형 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | NaN | 0.000 | NaN
주력업종 | 1.000 | 1.000 | 0.936 | 0.989 | 1.000 | 1.000 | 0.756 | 1.000
사업자 업태업종 | 1.000 | 0.936 | 1.000 | 0.926 | 0.938 | 1.000 | 0.266 | 1.000
면허 | 1.000 | 0.989 | 0.926 | 1.000 | 0.986 | 1.000 | 0.000 | 1.000
주력사업 | 1.000 | 1.000 | 0.938 | 0.986 | 1.000 | 1.000 | 0.769 | 1.000
팩스 | NaN | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
종업원수 | 0.000 | 0.756 | 0.266 | 0.000 | 0.769 | 1.000 | 1.000 | 1.000
홈페이지 주소 | NaN | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Variable | 주력사업 | 주력업종 | 사업자 업태업종 | 면허 | 유형
주력사업 | 1.000 | 0.989 | 0.594 | 0.743 | 0.846
주력업종 | 0.989 | 1.000 | 0.579 | 0.756 | 0.836
사업자 업태업종 | 0.594 | 0.579 | 1.000 | 0.546 | 0.963
면허 | 0.743 | 0.756 | 0.546 | 1.000 | 0.915
유형 | 0.846 | 0.836 | 0.963 | 0.915 | 1.000

Variable | 종업원수 | 유형 | 주력업종 | 사업자 업태업종 | 면허 | 주력사업
종업원수 | 1.000 | 0.000 | 0.333 | 0.119 | 0.000 | 0.351
유형 | 0.000 | 1.000 | 0.836 | 0.963 | 0.915 | 0.846
주력업종 | 0.333 | 0.836 | 1.000 | 0.579 | 0.756 | 0.989
사업자 업태업종 | 0.119 | 0.963 | 0.579 | 1.000 | 0.546 | 0.594
면허 | 0.000 | 0.915 | 0.756 | 0.546 | 1.000 | 0.743
주력사업 | 0.351 | 0.846 | 0.989 | 0.594 | 0.743 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows

# | 업체명 | 유형 | 사업자등록번호 | 주력업종 | 사업자 업태업종 | 면허 | 주력사업 | 전화번호 | 주소 | 팩스 | 설립일자(경력) | 종업원수 | 홈페이지 주소 | 대표자 이름
0 | 정산하나종합정비 | 개인 | 310-06-32828 | 건설.광업용 기계 및 장비 수리업 | 자동차 종합 수리업 | 건설기계 정비업 | 건설기계 정비 | 041-944-2121 | 충청남도 청양군 정산면 칠갑산로 1828 | <NA> | 2008-10-13 | 2 | <NA> | 홍종근
1 | 정산농업협동조합 | 법인 | 307-82-00310 | 건설 및 토목공사용 기계ㆍ장비 임대업 | 금융 | 건설기계 대여업 | 건설기계 대여 | 041-942-0681 | 충청남도 청양군 정산면 정현길 67 | 041-942-0052 | 1970-07-21 | 5 | https://jeongsan.nonghyup.com/ | 김봉락
2 | 대광건기㈜ | 법인 | 310-81-18783 | 건설 및 토목공사용 기계ㆍ장비 임대업 | 운수 및 창고업 | 건설기계 대여업 | 건설기계 대여 | 041-834-0501 | 충청남도 청양군 청양읍 충절로 1176 | 041-834-0503 | 2006-02-16 | 5 | <NA> | 정준희
3 | ㈜정산종합모터스 | 법인 | 311-86-00108 | 금속류 해체, 선별 및 원료 재생업 | 수도, 하수 및 폐기물 처리, 원료 재생업 | 건설기계 해체재활용업 | 건설기계 해체재활용 | 041-942-0707 | 충청남도 청양군 정산면 충의로 1126 | <NA> | 2015-07-01 | 3 | <NA> | 이성호
4 | (주)계룡 | 법인 | 310-81-19026 | 콘크리트 및 철근 공사업 | 건설업 | 철근·콘크리트공사업 | 철근·콘크리트공사 | 041-943-1019 | 충청남도 청양군 운곡면 청신로 567-83 | 041-944-1019 | 2006-07-19 | 5 | <NA> | 우소제
5 | (주)금강건설기술단 | 법인 | 788-86-01229 | 포장 공사업 | 건설업 | 지반조성·포장공사업 | 포장공사 | 041-943-1950 | 충청남도 청양군 청양읍 중앙로 184, 2층 | 041-943-1950 | 2019-08-23 | 5 | <NA> | 문권식
6 | (주)금호건설산업 | 법인 | 510-81-28047 | 포장 공사업 | 건설업 | 지반조성·포장공사업 | 포장공사 | <NA> | 충청남도 청양군 화성면 구숫골길 105-19 | 0303-3130-1285 | 2012-04-01 | 3 | <NA> | 김종례
7 | (주)나무들 | 법인 | 314-86-59404 | 조경 건설업 | 목재생산업 | 조경식재·시설물공사업 | 조경식재·시설물공사 | 041-943-6242 | 충청남도 청양군 운곡면 신대길 14-16 | 042-535-6243 | 2014-05-07 | 12 | www.나무들.com | 배은숙
8 | (주)대덕건설 | 법인 | 102-81-45388 | 콘크리트 및 철근 공사업 | 건설업 | 철근·콘크리트공사업 | 철근·콘크리트공사 | 041-943-3786 | 충청남도 청양군 청양읍 칠갑산로1길 54, 101호 | <NA> | 2018-01-17 | 3 | <NA> | 이대규
9 | (주)대명건설 | 법인 | 165-88-01075 | 콘크리트 및 철근 공사업 | 건설업 | 철근·콘크리트공사업 | 철근·콘크리트공사 | 041-944-1115 | 충청남도 청양군 청양읍 칠갑산로7길 9-1, 201호 | <NA> | 2018-02-14 | 5 | <NA> | 명노일

Last rows

# | 업체명 | 유형 | 사업자등록번호 | 주력업종 | 사업자 업태업종 | 면허 | 주력사업 | 전화번호 | 주소 | 팩스 | 설립일자(경력) | 종업원수 | 홈페이지 주소 | 대표자 이름
115 | 청운건설산업(주) | 법인 | 307-81-14520 | 토공사, 포장공사 | 건설업 | 지반조성ㆍ포장공사업 | 토공사, 포장공사 | 041-943-6569 | 충청남도 청양군 청양읍 칠갑산로12길 17 나동 201호 | <NA> | 2000-02-09 | 12 | <NA> | 강경구,강윤모
116 | 최고건설(주) | 법인 | 482-86-01369 | 철근ㆍ콘크리트공사 | 건설업 | 철근ㆍ콘크리트공사업 | 철근ㆍ콘크리트공사 | 041-940-0000 | 충청남도 청양군 청양읍 평촌1길 4-25 1층 | <NA> | 2019-01-03 | 2 | <NA> | 고종갑
117 | 충청건설(주) | 법인 | 307-81-11519 | 조경식재공사 | 건설업 | 조경식재ㆍ시설물공사업 | 조경식재공사 | 041-942-9787 | 충청남도 청양군 청양읍 칠갑산로1길 54 201호 | <NA> | 1998-02-06 | 7 | <NA> | 노만자
118 | 태광건설(주) | 법인 | 312-86-45898 | 도장공사, 석공사 | 건설업 | 도장ㆍ습식ㆍ방수ㆍ석공사업 | 도장공사, 석공사 | 041-532-5528 | 충청남도 청양군 청양읍 돌담불길 53, 103호 | <NA> | 2012-12-04 | 3 | <NA> | 박신영
119 | 태상건설주식회사 | 법인 | 897-87-00273 | 철근ㆍ콘크리트공사 | 건설업 | 철근ㆍ콘크리트공사업 | 철근ㆍ콘크리트공사 | <NA> | 충청남도 청양군 청양읍 학사길 42 1층(청명빌딩) | <NA> | 2016-07-14 | 3 | <NA> | 김상훈
120 | 태성건설(주) | 법인 | 762-87-00318 | 철근ㆍ콘크리트공사 | 건설업 | 철근ㆍ콘크리트공사업 | 철근ㆍ콘크리트공사 | 041-943-6067 | 충청남도 청양군 청양읍 중앙로11길 10 1층 101호 | <NA> | 2016-01-15 | 5 | <NA> | 한미순
121 | 태승건설(주) | 법인 | 314-81-89198 | 석공사 | 건설업 | 도장ㆍ습식ㆍ방수ㆍ석공사업 | 석공사 | 041-942-2031 | 충청남도 청양군 청양읍 돌담불길 53 101호 | <NA> | 2007-04-04 | 7 | <NA> | 박태진
122 | 태양건설(주) | 법인 | 310-81-16413 | 상ㆍ하수도설비공사 | 건설업 | 상ㆍ하수도설비공사업 | 상ㆍ하수도설비공사 | 041-943-6067 | 충청남도 청양군 청양읍 중앙로11길 10 1-101 | <NA> | 2003-01-14 | 2 | <NA> | 한미순
123 | 하늘건설(주) | 법인 | 268-86-00333 | 토공사 | 건설업 | 지반조성ㆍ포장공사업 | 토공사 | 041-943-8587 | 충청남도 청양군 청양읍 중앙로11길 10 1-101 | <NA> | 2016-01-20 | 4 | <NA> | 명노율
124 | 현우건설(주) | 법인 | 307-81-13569 | 토공사, 포장공사 | 건설업 | 지반조성ㆍ포장공사업 | 토공사, 포장공사 | 041-943-0077 | 충청남도 청양군 정산면 칠갑산로 1944-12 | <NA> | 1999-08-01 | 8 | <NA> | 윤종순