Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 26334
Missing cells (%): 37.6%
Duplicate rows: 1130
Duplicate rows (%): 11.3%
Total size in memory: 625.0 KiB
Average record size in memory: 64.0 B
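
The derived figures in this table follow from the raw counts. The sketch below only re-derives numbers already reported here; the variable names are illustrative and not part of the report.

```python
# Re-derive the percentage and per-record figures from the raw counts
# reported under "Dataset statistics".
n_variables = 7
n_observations = 10_000
missing_cells = 26_334
duplicate_rows = 1_130
total_size_kib = 625.0

total_cells = n_variables * n_observations            # every row x column cell
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

The missing-cell percentage is taken over all cells (variables times observations), not over rows.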

Variable types

Text: 5
Categorical: 1
Boolean: 1

Dataset

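Description: 사업번호 (project number), 업체분류명 (company category), 업체명 (company name), 주소 (address), 전화번호 (phone number), 홈페이지주소 (homepage URL), 계약해지여부 (contract terminated)
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-2255/S/1/datasetView.do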

Alerts

Dataset has 1130 (11.3%) duplicate rows [Duplicates]
업체분류명 is highly imbalanced (68.1%) [Imbalance]
계약해지여부 is highly imbalanced (84.7%) [Imbalance]
주소 has 8292 (82.9%) missing values [Missing]
전화번호 has 8451 (84.5%) missing values [Missing]
홈페이지주소 has 9591 (95.9%) missing values [Missing]
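
The imbalance percentages above are consistent with a score of one minus the normalized Shannon entropy of the class counts. The sketch below assumes that definition (it is not confirmed by the report itself) and applies it to the value counts listed for the two flagged variables; `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the base-2 Shannon entropy of the
    class frequencies and k is the number of classes."""
    k = len(counts)
    if k < 2:
        return 0.0  # arbitrary choice for this sketch: one class, no ratio
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Value counts as reported for the two flagged variables.
category_counts = [8795, 491, 400, 251, 63]   # 업체분류명
flag_counts = [9779, 221]                     # 계약해지여부

category_score = imbalance_score(category_counts)
flag_score = imbalance_score(flag_counts)
print(f"업체분류명: {category_score:.1%}")
print(f"계약해지여부: {flag_score:.1%}")
```

A score of 0 means perfectly balanced classes and a score near 1 means a single class dominates.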

Reproduction

Analysis started: 2024-03-13 07:43:30.061327
Analysis finished: 2024-03-13 07:43:30.945146
Duration: 0.88 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

사업번호
Text

Distinct: 563
Distinct (%): 5.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 15
Median length: 15
Mean length: 15
Min length: 15

Characters and Unicode

Total characters: 150000
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
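
As an illustration of these character properties, the sketch below classifies the characters of the first sample value listed for this variable with Python's standard `unicodedata` module. The module exposes general categories (such as `Nd` for Decimal Number and `Pd` for Dash Punctuation) but not script or block names, so only categories are shown.

```python
import unicodedata
from collections import Counter

# First sample value reported for this variable.
sample = "11230-900000737"

# Count characters per Unicode general category.
categories = Counter(unicodedata.category(ch) for ch in sample)

print(len(sample))
for category, count in categories.most_common():
    print(category, count)
```

Every value in this variable is 15 characters long, which matches the reported total of 150000 characters over 10000 rows.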

Unique

Unique: 53
Unique (%): 0.5%
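
"Distinct" and "Unique" measure different things: distinct counts the different values present, while unique (as this report appears to use the term) counts the values that occur exactly once. That is why 563 distinct values can coexist with only 53 unique ones. The sketch below shows the difference on a small hypothetical column; the values are made up for illustration.

```python
from collections import Counter

# Hypothetical toy column, not taken from the dataset.
values = ["A-1", "A-1", "B-2", "C-3", "C-3", "C-3", "D-4"]

counts = Counter(values)
n_distinct = len(counts)                                # different values
n_unique = sum(1 for c in counts.values() if c == 1)    # values seen once

print(n_distinct, n_unique)
```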

Sample

1st row: 11230-900000737
2nd row: 11560-100003012
3rd row: 11380-100001057
4th row: 11650-100002018
5th row: 11680-900000942

Value | Count | Frequency (%)
11710-100002008 | 136 | 1.4%
11230-100006043 | 114 | 1.1%
11290-100016009 | 101 | 1.0%
11380-100001045 | 92 | 0.9%
11290-100015000 | 85 | 0.9%
11110-100003002 | 84 | 0.8%
11620-100004002 | 84 | 0.8%
11650-900000161 | 80 | 0.8%
11740-100000049 | 76 | 0.8%
11590-100002006 | 76 | 0.8%
Other values (553) | 9072 | 90.7%

Most occurring characters

ValueCountFrequency (%)
0 66200
44.1%
1 35498
23.7%
- 10000
 
6.7%
2 6815
 
4.5%
9 5312
 
3.5%
6 5030
 
3.4%
5 4965
 
3.3%
3 4861
 
3.2%
4 4742
 
3.2%
7 3681
 
2.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 140000
93.3%
Dash Punctuation 10000
 
6.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 66200
47.3%
1 35498
25.4%
2 6815
 
4.9%
9 5312
 
3.8%
6 5030
 
3.6%
5 4965
 
3.5%
3 4861
 
3.5%
4 4742
 
3.4%
7 3681
 
2.6%
8 2896
 
2.1%
Dash Punctuation
ValueCountFrequency (%)
- 10000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 150000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 66200
44.1%
1 35498
23.7%
- 10000
 
6.7%
2 6815
 
4.5%
9 5312
 
3.5%
6 5030
 
3.4%
5 4965
 
3.3%
3 4861
 
3.2%
4 4742
 
3.2%
7 3681
 
2.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 150000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 66200
44.1%
1 35498
23.7%
- 10000
 
6.7%
2 6815
 
4.5%
9 5312
 
3.5%
6 5030
 
3.4%
5 4965
 
3.3%
3 4861
 
3.2%
4 4742
 
3.2%
7 3681
 
2.5%

업체분류명
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 2
Mean length: 2.1668
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 설계자
2nd row: 기타
3rd row: 기타
4th row: 기타
5th row: 기타

Common Values

Value | Count | Frequency (%)
기타 | 8795 | 87.9%
설계자 | 491 | 4.9%
정비업자 | 400 | 4.0%
시공자 | 251 | 2.5%
철거업자 | 63 | 0.6%

Length

Histogram of lengths of the category

업체명
Text

Distinct: 4269
Distinct (%): 42.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 35
Median length: 21
Mean length: 9.2611
Min length: 2

Characters and Unicode

Total characters: 92611
Distinct characters: 587
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2753
Unique (%): 27.5%

Sample

1st row: (주)그룹환경종합건축사사무소
2nd row: (주)개신건설
3rd row: (주) 나라감정평가법인
4th row: 법무법인엘케이비앤파트너스
5th row: (주)주성 시.엠.시

Value | Count | Frequency (%)
주식회사 | 957 | 7.3%
법무법인 | 902 | 6.9%
(not shown) | 227 | 1.7%
법률사무소 | 176 | 1.3%
을지 | 85 | 0.6%
법무법인(유한 | 76 | 0.6%
산하 | 75 | 0.6%
정비 | 74 | 0.6%
법무법인산하 | 68 | 0.5%
법무법인을지 | 58 | 0.4%
Other values (4028) | 10432 | 79.5%

Most occurring characters

ValueCountFrequency (%)
5689
 
6.1%
4914
 
5.3%
4716
 
5.1%
) 4651
 
5.0%
( 4640
 
5.0%
3676
 
4.0%
3226
 
3.5%
2995
 
3.2%
2026
 
2.2%
1868
 
2.0%
Other values (577) 54210
58.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 79555
85.9%
Close Punctuation 4657
 
5.0%
Open Punctuation 4646
 
5.0%
Space Separator 3226
 
3.5%
Uppercase Letter 340
 
0.4%
Lowercase Letter 86
 
0.1%
Other Punctuation 65
 
0.1%
Decimal Number 35
 
< 0.1%
Other Symbol 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5689
 
7.2%
4914
 
6.2%
4716
 
5.9%
3676
 
4.6%
2995
 
3.8%
2026
 
2.5%
1868
 
2.3%
1754
 
2.2%
1695
 
2.1%
1624
 
2.0%
Other values (521) 48598
61.1%
Uppercase Letter
ValueCountFrequency (%)
H 49
14.4%
P 47
13.8%
C 43
12.6%
S 40
11.8%
G 28
8.2%
M 18
 
5.3%
K 16
 
4.7%
N 15
 
4.4%
T 15
 
4.4%
D 13
 
3.8%
Other values (13) 56
16.5%
Lowercase Letter
ValueCountFrequency (%)
s 13
15.1%
o 10
11.6%
n 8
9.3%
t 7
8.1%
c 6
 
7.0%
d 6
 
7.0%
r 5
 
5.8%
i 5
 
5.8%
e 5
 
5.8%
h 4
 
4.7%
Other values (7) 17
19.8%
Decimal Number
ValueCountFrequency (%)
1 13
37.1%
2 12
34.3%
0 3
 
8.6%
5 3
 
8.6%
6 2
 
5.7%
3 2
 
5.7%
Other Punctuation
ValueCountFrequency (%)
. 51
78.5%
, 7
 
10.8%
& 6
 
9.2%
' 1
 
1.5%
Close Punctuation
ValueCountFrequency (%)
) 4651
99.9%
6
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 4640
99.9%
6
 
0.1%
Space Separator
ValueCountFrequency (%)
3226
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 79556
85.9%
Common 12629
 
13.6%
Latin 426
 
0.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5689
 
7.2%
4914
 
6.2%
4716
 
5.9%
3676
 
4.6%
2995
 
3.8%
2026
 
2.5%
1868
 
2.3%
1754
 
2.2%
1695
 
2.1%
1624
 
2.0%
Other values (522) 48599
61.1%
Latin
ValueCountFrequency (%)
H 49
 
11.5%
P 47
 
11.0%
C 43
 
10.1%
S 40
 
9.4%
G 28
 
6.6%
M 18
 
4.2%
K 16
 
3.8%
N 15
 
3.5%
T 15
 
3.5%
D 13
 
3.1%
Other values (30) 142
33.3%
Common
ValueCountFrequency (%)
) 4651
36.8%
( 4640
36.7%
3226
25.5%
. 51
 
0.4%
1 13
 
0.1%
2 12
 
0.1%
, 7
 
0.1%
6
 
< 0.1%
6
 
< 0.1%
& 6
 
< 0.1%
Other values (5) 11
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 79555
85.9%
ASCII 13043
 
14.1%
None 13
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
5689
 
7.2%
4914
 
6.2%
4716
 
5.9%
3676
 
4.6%
2995
 
3.8%
2026
 
2.5%
1868
 
2.3%
1754
 
2.2%
1695
 
2.1%
1624
 
2.0%
Other values (521) 48598
61.1%
ASCII
ValueCountFrequency (%)
) 4651
35.7%
( 4640
35.6%
3226
24.7%
. 51
 
0.4%
H 49
 
0.4%
P 47
 
0.4%
C 43
 
0.3%
S 40
 
0.3%
G 28
 
0.2%
M 18
 
0.1%
Other values (43) 250
 
1.9%
None
ValueCountFrequency (%)
6
46.2%
6
46.2%
1
 
7.7%

주소
Text

MISSING 

Distinct: 794
Distinct (%): 46.5%
Missing: 8292
Missing (%): 82.9%
Memory size: 156.2 KiB

Length

Max length: 48
Median length: 41
Mean length: 27.883489
Min length: 20

Characters and Unicode

Total characters: 47625
Distinct characters: 430
Distinct categories: 12
Distinct scripts: 4
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 495
Unique (%): 29.0%

Sample

1st row: 서울특별시 강남구 봉은사로 179 (논현동,H-TOWER)
2nd row: 경기도 안양시 만안구 만안로 49 (안양동)
3rd row: 서울특별시 서초구 서초대로8길 62 (방배동)
4th row: 인천광역시 계양구 서운로 33-8 (서운동)
5th row: 서울특별시 서초구 서초대로 253 (서초동,지천빌딩)

Value | Count | Frequency (%)
서울특별시 | 1490 | 16.9%
서초구 | 475 | 5.4%
강남구 | 297 | 3.4%
송파구 | 181 | 2.0%
경기도 | 164 | 1.9%
서초동 | 104 | 1.2%
마포구 | 94 | 1.1%
문정동 | 90 | 1.0%
서초대로 | 86 | 1.0%
역삼동 | 82 | 0.9%
Other values (1582) | 5776 | 65.3%

Most occurring characters

ValueCountFrequency (%)
7141
 
15.0%
2675
 
5.6%
1992
 
4.2%
1864
 
3.9%
1731
 
3.6%
1725
 
3.6%
) 1724
 
3.6%
( 1724
 
3.6%
1531
 
3.2%
1498
 
3.1%
Other values (420) 24020
50.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 30021
63.0%
Space Separator 7141
 
15.0%
Decimal Number 5800
 
12.2%
Close Punctuation 1724
 
3.6%
Open Punctuation 1724
 
3.6%
Other Punctuation 753
 
1.6%
Uppercase Letter 264
 
0.6%
Dash Punctuation 171
 
0.4%
Letter Number 10
 
< 0.1%
Lowercase Letter 9
 
< 0.1%
Other values (2) 8
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2675
 
8.9%
1992
 
6.6%
1864
 
6.2%
1731
 
5.8%
1725
 
5.7%
1531
 
5.1%
1498
 
5.0%
1496
 
5.0%
1045
 
3.5%
729
 
2.4%
Other values (371) 13735
45.8%
Uppercase Letter
ValueCountFrequency (%)
R 39
14.8%
I 33
12.5%
T 24
9.1%
E 20
 
7.6%
S 17
 
6.4%
O 17
 
6.4%
W 17
 
6.4%
A 16
 
6.1%
K 15
 
5.7%
D 13
 
4.9%
Other values (12) 53
20.1%
Decimal Number
ValueCountFrequency (%)
1 1253
21.6%
2 927
16.0%
3 640
11.0%
4 562
9.7%
6 545
9.4%
5 505
8.7%
0 391
 
6.7%
7 379
 
6.5%
8 337
 
5.8%
9 261
 
4.5%
Other Punctuation
ValueCountFrequency (%)
, 741
98.4%
& 5
 
0.7%
. 4
 
0.5%
/ 2
 
0.3%
1
 
0.1%
Lowercase Letter
ValueCountFrequency (%)
f 2
22.2%
e 2
22.2%
v 2
22.2%
i 2
22.2%
n 1
11.1%
Space Separator
ValueCountFrequency (%)
7141
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1724
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1724
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 171
100.0%
Letter Number
ValueCountFrequency (%)
10
100.0%
Other Symbol
ValueCountFrequency (%)
6
100.0%
Math Symbol
ValueCountFrequency (%)
~ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 30013
63.0%
Common 17321
36.4%
Latin 283
 
0.6%
Han 8
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2675
 
8.9%
1992
 
6.6%
1864
 
6.2%
1731
 
5.8%
1725
 
5.7%
1531
 
5.1%
1498
 
5.0%
1496
 
5.0%
1045
 
3.5%
729
 
2.4%
Other values (369) 13727
45.7%
Latin
ValueCountFrequency (%)
R 39
13.8%
I 33
11.7%
T 24
 
8.5%
E 20
 
7.1%
S 17
 
6.0%
O 17
 
6.0%
W 17
 
6.0%
A 16
 
5.7%
K 15
 
5.3%
D 13
 
4.6%
Other values (18) 72
25.4%
Common
ValueCountFrequency (%)
7141
41.2%
) 1724
 
10.0%
( 1724
 
10.0%
1 1253
 
7.2%
2 927
 
5.4%
, 741
 
4.3%
3 640
 
3.7%
4 562
 
3.2%
6 545
 
3.1%
5 505
 
2.9%
Other values (11) 1559
 
9.0%
Han
ValueCountFrequency (%)
4
50.0%
4
50.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 30013
63.0%
ASCII 17587
36.9%
Number Forms 10
 
< 0.1%
CJK 8
 
< 0.1%
Enclosed Alphanum 6
 
< 0.1%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7141
40.6%
) 1724
 
9.8%
( 1724
 
9.8%
1 1253
 
7.1%
2 927
 
5.3%
, 741
 
4.2%
3 640
 
3.6%
4 562
 
3.2%
6 545
 
3.1%
5 505
 
2.9%
Other values (36) 1825
 
10.4%
Hangul
ValueCountFrequency (%)
2675
 
8.9%
1992
 
6.6%
1864
 
6.2%
1731
 
5.8%
1725
 
5.7%
1531
 
5.1%
1498
 
5.0%
1496
 
5.0%
1045
 
3.5%
729
 
2.4%
Other values (369) 13727
45.7%
Number Forms
ValueCountFrequency (%)
10
100.0%
Enclosed Alphanum
ValueCountFrequency (%)
6
100.0%
CJK
ValueCountFrequency (%)
4
50.0%
4
50.0%
None
ValueCountFrequency (%)
1
100.0%

전화번호
Text

MISSING 

Distinct: 808
Distinct (%): 52.2%
Missing: 8451
Missing (%): 84.5%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 11
Mean length: 11.342156
Min length: 9

Characters and Unicode

Total characters: 17569
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 541
Unique (%): 34.9%
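
For a variable with many missing values, the percentages are not all taken over the same base. The figures reported for 전화번호 are consistent with Missing (%) being computed over all rows, while Distinct (%), Unique (%), the mean length, and the value frequencies are computed over non-missing rows only. The sketch below re-derives them from the reported counts; this reading of the denominators is an inference from the numbers, not something the report states.

```python
# Counts as reported for 전화번호.
n_rows = 10_000
n_missing = 8_451
n_distinct = 808
n_unique = 541
total_characters = 17_569
top_value_count = 38          # count of the most frequent value, 02-537-3322

n_present = n_rows - n_missing                    # rows that have a value

missing_pct = 100 * n_missing / n_rows            # base: all rows
distinct_pct = 100 * n_distinct / n_present       # base: non-missing rows
unique_pct = 100 * n_unique / n_present
top_value_pct = 100 * top_value_count / n_present
mean_length = total_characters / n_present

print(f"Missing (%): {missing_pct:.1f}%")
print(f"Distinct (%): {distinct_pct:.1f}%")
print(f"Unique (%): {unique_pct:.1f}%")
print(f"Top value frequency: {top_value_pct:.1f}%")
print(f"Mean length: {mean_length:.6f}")
```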

Sample

1st row: 02-518-8818
2nd row: 031-6909-5912
3rd row: 02-521-1122
4th row: 02-536-5805
5th row: 02-525-2733

Value | Count | Frequency (%)
02-537-3322 | 38 | 2.5%
02-2055-1919 | 33 | 2.1%
02-455-5503 | 18 | 1.2%
02-536-5805 | 17 | 1.1%
02-584-2581 | 17 | 1.1%
02-587-2130 | 16 | 1.0%
02-533-8505 | 13 | 0.8%
02-3019-1200 | 13 | 0.8%
02-6346-2287 | 13 | 0.8%
02-599-5333 | 13 | 0.8%
Other values (798) | 1358 | 87.7%

Most occurring characters

ValueCountFrequency (%)
0 3046
17.3%
- 2893
16.5%
2 2711
15.4%
3 1523
8.7%
5 1513
8.6%
1 1289
7.3%
4 1075
 
6.1%
7 971
 
5.5%
8 877
 
5.0%
9 844
 
4.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 14676
83.5%
Dash Punctuation 2893
 
16.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 3046
20.8%
2 2711
18.5%
3 1523
10.4%
5 1513
10.3%
1 1289
8.8%
4 1075
 
7.3%
7 971
 
6.6%
8 877
 
6.0%
9 844
 
5.8%
6 827
 
5.6%
Dash Punctuation
ValueCountFrequency (%)
- 2893
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 17569
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 3046
17.3%
- 2893
16.5%
2 2711
15.4%
3 1523
8.7%
5 1513
8.6%
1 1289
7.3%
4 1075
 
6.1%
7 971
 
5.5%
8 877
 
5.0%
9 844
 
4.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17569
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 3046
17.3%
- 2893
16.5%
2 2711
15.4%
3 1523
8.7%
5 1513
8.6%
1 1289
7.3%
4 1075
 
6.1%
7 971
 
5.5%
8 877
 
5.0%
9 844
 
4.8%

홈페이지주소
Text

MISSING 

Distinct: 252
Distinct (%): 61.6%
Missing: 9591
Missing (%): 95.9%
Memory size: 156.2 KiB

Length

Max length: 63
Median length: 41
Mean length: 23.848411
Min length: 16

Characters and Unicode

Total characters: 9754
Distinct characters: 54
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 180
Unique (%): 44.0%

Sample

1st row: http://www.eagar.co.kr
2nd row: http://www.grouphan.co.kr
3rd row: http://www.msapp.co.kr/
4th row: http://www.donginlaw.co.kr
5th row: http://www.sanhalaw.co.kr/

Value | Count | Frequency (%)
http://www.ulchi.co.kr | 14 | 3.4%
http://www.sanhalaw.co.kr | 12 | 2.9%
http://www.jblaw.kr | 12 | 2.9%
http://www.dh2002.co.kr | 7 | 1.7%
http://www.leeko.com:8080 | 6 | 1.5%
http://nksbar@gmail.com | 6 | 1.5%
http://www.eagar.co.kr | 6 | 1.5%
http://www.bkl.co.kr | 6 | 1.5%
http://www.eagroup.co.kr | 5 | 1.2%
http://www.cmeia.co.kr | 5 | 1.2%
Other values (222) | 330 | 80.7%

Most occurring characters

ValueCountFrequency (%)
w 1090
11.2%
. 1039
 
10.7%
/ 1028
 
10.5%
t 905
 
9.3%
o 560
 
5.7%
h 554
 
5.7%
c 547
 
5.6%
p 504
 
5.2%
: 415
 
4.3%
r 401
 
4.1%
Other values (44) 2711
27.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7062
72.4%
Other Punctuation 2507
 
25.7%
Decimal Number 145
 
1.5%
Dash Punctuation 20
 
0.2%
Uppercase Letter 14
 
0.1%
Math Symbol 4
 
< 0.1%
Connector Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
w 1090
15.4%
t 905
12.8%
o 560
 
7.9%
h 554
 
7.8%
c 547
 
7.7%
p 504
 
7.1%
r 401
 
5.7%
a 380
 
5.4%
k 363
 
5.1%
n 280
 
4.0%
Other values (15) 1478
20.9%
Decimal Number
ValueCountFrequency (%)
2 38
26.2%
0 36
24.8%
8 18
12.4%
1 15
 
10.3%
4 10
 
6.9%
3 9
 
6.2%
5 8
 
5.5%
7 8
 
5.5%
9 2
 
1.4%
6 1
 
0.7%
Other Punctuation
ValueCountFrequency (%)
. 1039
41.4%
/ 1028
41.0%
: 415
 
16.6%
@ 15
 
0.6%
# 4
 
0.2%
? 4
 
0.2%
& 1
 
< 0.1%
1
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
R 3
21.4%
W 3
21.4%
K 2
14.3%
A 2
14.3%
L 1
 
7.1%
N 1
 
7.1%
I 1
 
7.1%
C 1
 
7.1%
Dash Punctuation
ValueCountFrequency (%)
- 20
100.0%
Math Symbol
ValueCountFrequency (%)
= 4
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7076
72.5%
Common 2678
 
27.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
w 1090
15.4%
t 905
12.8%
o 560
 
7.9%
h 554
 
7.8%
c 547
 
7.7%
p 504
 
7.1%
r 401
 
5.7%
a 380
 
5.4%
k 363
 
5.1%
n 280
 
4.0%
Other values (23) 1492
21.1%
Common
ValueCountFrequency (%)
. 1039
38.8%
/ 1028
38.4%
: 415
 
15.5%
2 38
 
1.4%
0 36
 
1.3%
- 20
 
0.7%
8 18
 
0.7%
@ 15
 
0.6%
1 15
 
0.6%
4 10
 
0.4%
Other values (11) 44
 
1.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9753
> 99.9%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
w 1090
11.2%
. 1039
 
10.7%
/ 1028
 
10.5%
t 905
 
9.3%
o 560
 
5.7%
h 554
 
5.7%
c 547
 
5.6%
p 504
 
5.2%
: 415
 
4.3%
r 401
 
4.1%
Other values (43) 2710
27.8%
None
ValueCountFrequency (%)
1
100.0%

계약해지여부
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 87.9 KiB

Value | Count | Frequency (%)
False | 9779 | 97.8%
True | 221 | 2.2%

Correlations

Variable | 업체분류명 | 계약해지여부
업체분류명 | 1.000 | 0.093
계약해지여부 | 0.093 | 1.000

Variable | 계약해지여부 | 업체분류명
계약해지여부 | 1.000 | 0.114
업체분류명 | 0.114 | 1.000

Variable | 업체분류명 | 계약해지여부
업체분류명 | 1.000 | 0.114
계약해지여부 | 0.114 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
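
One common way to compute a nullity correlation is the Pearson correlation between per-column present/missing indicators. The sketch below assumes that definition and uses a handful of hypothetical rows (the addresses, phone numbers, and URLs are placeholders, not data from this report); `pearson` and `nullity` are illustrative helper names.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical rows; None marks a missing value.
rows = [
    {"주소": "addr-1", "전화번호": "02-000-0000", "홈페이지주소": None},
    {"주소": None, "전화번호": None, "홈페이지주소": None},
    {"주소": "addr-2", "전화번호": "02-000-0001", "홈페이지주소": "http://example.com"},
    {"주소": None, "전화번호": None, "홈페이지주소": "http://example.org"},
]

def nullity(column):
    """1 where the column has a value, 0 where it is missing."""
    return [0 if row[column] is None else 1 for row in rows]

addr_phone = pearson(nullity("주소"), nullity("전화번호"))
addr_home = pearson(nullity("주소"), nullity("홈페이지주소"))
print(addr_phone, addr_home)
```

In this toy data the address and phone number are always present or missing together, so their nullity correlation is 1, while the homepage is present independently of the address, giving 0.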

Sample

Index | 사업번호 | 업체분류명 | 업체명 | 주소 | 전화번호 | 홈페이지주소 | 계약해지여부
16131 | 11230-900000737 | 설계자 | (주)그룹환경종합건축사사무소 | <NA> | <NA> | <NA> | N
18736 | 11560-100003012 | 기타 | (주)개신건설 | <NA> | <NA> | <NA> | N
31978 | 11380-100001057 | 기타 | (주) 나라감정평가법인 | <NA> | <NA> | <NA> | N
5011 | 11650-100002018 | 기타 | 법무법인엘케이비앤파트너스 | <NA> | <NA> | <NA> | N
2738 | 11680-900000942 | 기타 | (주)주성 시.엠.시 | <NA> | <NA> | <NA> | N
25440 | 11590-900000481 | 기타 | 유안타증권(주) | <NA> | <NA> | <NA> | N
27405 | 11380-100001048 | 기타 | 대한지적공사 | <NA> | <NA> | <NA> | N
10558 | 11620-100004000 | 기타 | 주식회사 코리아이앤씨 | <NA> | <NA> | <NA> | N
5148 | 11290-100016005 | 기타 | 법무법인(유한) 시그니처 | <NA> | <NA> | <NA> | N
8227 | 11740-900000149 | 기타 | 한방유비스 | <NA> | <NA> | <NA> | N

Index | 사업번호 | 업체분류명 | 업체명 | 주소 | 전화번호 | 홈페이지주소 | 계약해지여부
28910 | 11440-100006006 | 기타 | (주)우현 엔지니어링 | <NA> | <NA> | <NA> | N
27762 | 11380-100001060 | 기타 | 주식회사 풍익개발 | <NA> | <NA> | <NA> | N
17523 | 11230-100006042 | 기타 | (주)씨엠닉스 | <NA> | <NA> | <NA> | N
24654 | 11140-100002004 | 기타 | 건화종합건축사사무소 | <NA> | <NA> | <NA> | N
3056 | 11650-900000553 | 기타 | 주식회사 플로우컴퍼니 | <NA> | <NA> | <NA> | N
25499 | 11680-900000694 | 기타 | (주)다올하우징 | <NA> | <NA> | <NA> | N
26438 | 11290-100016009 | 기타 | (주)삼호기술개발공사 | 서울특별시 서초구 서초대로22길 11-7 (방배동) | 031-426-3966 | <NA> | N
31676 | 11290-900000106 | 기타 | (주)제이앤비코퍼레이션 | <NA> | <NA> | <NA> | N
20069 | 11260-100002011 | 기타 | (주)도울기획 | <NA> | <NA> | <NA> | N
1324 | 11710-100002008 | 기타 | 주식회사 백야시스템 | <NA> | <NA> | <NA> | N

Duplicate rows

Most frequently occurring

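Index | 사업번호 | 업체분류명 | 업체명 | 주소 | 전화번호 | 홈페이지주소 | 계약해지여부 | # duplicates
226 | 11230-100012000 | 기타 | 법률사무소정비정경아 | <NA> | <NA> | <NA> | N | 45
796 | 11620-100004002 | 기타 | 국토속기사사무소 | <NA> | <NA> | <NA> | N | 35
438 | 11380-100001045 | 기타 | 법무법인산하 | <NA> | <NA> | <NA> | N | 31
726 | 11590-100001010 | 기타 | 변호사남기송법률사무소 | <NA> | <NA> | <NA> | N | 29
90 | 11200-100002000 | 기타 | 법무법인 혜안 | <NA> | <NA> | <NA> | N | 22
980 | 11680-900000542 | 기타 | 법무법인로드맵 | <NA> | <NA> | <NA> | N | 20
1047 | 11710-900000431 | 기타 | 법무법인케이씨엘 | <NA> | <NA> | <NA> | N | 20
249 | 11260-100003045 | 기타 | 중원종합법률사무소 | <NA> | <NA> | <NA> | N | 18
140 | 11215-100002008 | 기타 | 법무법인(유한)대륙아주 | <NA> | <NA> | <NA> | N | 17
507 | 11380-900000142 | 기타 | 법률사무소 정비 | <NA> | <NA> | <NA> | N | 17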