Overview

Dataset statistics

Number of variables: 10
Number of observations: 642
Missing cells: 874
Missing cells (%): 13.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 50.3 KiB
Average record size in memory: 80.2 B
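The two derived figures in these statistics can be checked from the raw counts. The sketch below assumes that the missing-cell percentage is missing cells divided by (observations × variables) and that the average record size is total memory divided by observations. The function names are hypothetical and are not part of ydata-profiling.

```python
def missing_cell_percentage(missing_cells: int, n_rows: int, n_cols: int) -> float:
    """Share of missing cells among all cells, in percent."""
    return 100.0 * missing_cells / (n_rows * n_cols)


def average_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Average memory per row in bytes, given the total size in KiB."""
    return total_kib * 1024 / n_rows


# Figures taken from the dataset statistics of this report.
pct = missing_cell_percentage(874, 642, 10)
avg = average_record_size_bytes(50.3, 642)
print(f"{pct:.1f}%")   # 13.6%
print(f"{avg:.1f} B")  # 80.2 B
```

Both results agree with the reported 13.6% and 80.2 B, which suggests the report uses these definitions.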

Variable types

Text: 9
Categorical: 1

Alerts

(2014. 8. 17) has constant value "" [Constant]
Unnamed: 8 has 233 (36.3%) missing values [Missing]
(2014. 8. 17) has 641 (99.8%) missing values [Missing]
Unnamed: 0 has unique values [Unique]
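The alerts above can be roughly reproduced from per-column summary counts. This is a sketch under simplified, assumed rules (any missing cell triggers "Missing", a single distinct value triggers "Constant", all-distinct with no missing cells triggers "Unique"). It is not the library's actual alert logic, which uses configurable thresholds, and `ColumnSummary` and `alerts_for` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ColumnSummary:
    name: str
    n: int           # number of observations
    n_missing: int   # number of missing cells
    n_distinct: int  # distinct non-missing values


def alerts_for(col: ColumnSummary) -> list[str]:
    """Return alert labels under the simplified rules described above."""
    present = col.n - col.n_missing
    found = []
    if present > 0 and col.n_distinct == 1:
        found.append("Constant")
    if col.n_missing > 0:
        found.append("Missing")
    if col.n_missing == 0 and col.n_distinct == col.n:
        found.append("Unique")
    return found


# Counts taken from the variable sections of this report.
columns = [
    ColumnSummary("Unnamed: 0", 642, 0, 642),
    ColumnSummary("Unnamed: 1", 642, 0, 632),
    ColumnSummary("Unnamed: 8", 642, 233, 310),
    ColumnSummary("(2014. 8. 17)", 642, 641, 1),
]
for col in columns:
    print(col.name, alerts_for(col))
```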

Reproduction

Analysis started: 2024-03-14 00:34:30.536015
Analysis finished: 2024-03-14 00:34:31.378528
Duration: 0.84 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

Unnamed: 0
Text

UNIQUE 

Distinct: 642
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

Length

Max length: 3
Median length: 3
Mean length: 2.8302181
Min length: 1

Characters and Unicode

Total characters: 1817
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
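As an illustration of these character properties, the per-category counts can be approximated with Python's standard `unicodedata` module. This is a minimal sketch, not the profiler's own implementation. It covers only the general category, because the standard library does not expose script or block, and `CATEGORY_NAMES` and `category_counts` are hypothetical names.

```python
import unicodedata
from collections import Counter

# Readable names for some general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Lo": "Other Letter",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Zs": "Space Separator",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# First five sample rows of Unnamed: 0.
sample = ["연번", "1", "2", "3", "4"]
print(category_counts(sample))
```

The two Hangul syllables fall under "Other Letter" and the digits under "Decimal Number", matching the two categories reported for this variable.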

Unique

Unique: 642
Unique (%): 100.0%

Sample

1st row: 연번 (header label: "serial number")
2nd row: 1
3rd row: 2
4th row: 3
5th row: 4
Value | Count | Frequency (%)
연번 | 1 | 0.2%
480 | 1 | 0.2%
440 | 1 | 0.2%
430 | 1 | 0.2%
423 | 1 | 0.2%
424 | 1 | 0.2%
425 | 1 | 0.2%
426 | 1 | 0.2%
427 | 1 | 0.2%
428 | 1 | 0.2%
Other values (632) | 632 | 98.4%

Most occurring characters

Value | Count | Frequency (%)
1 | 235 | 12.9%
3 | 234 | 12.9%
2 | 234 | 12.9%
4 | 226 | 12.4%
5 | 224 | 12.3%
6 | 166 | 9.1%
9 | 124 | 6.8%
7 | 124 | 6.8%
8 | 124 | 6.8%
0 | 124 | 6.8%
Other values (2) | 2 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1815 | 99.9%
Other Letter | 2 | 0.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 235 | 12.9%
3 | 234 | 12.9%
2 | 234 | 12.9%
4 | 226 | 12.5%
5 | 224 | 12.3%
6 | 166 | 9.1%
9 | 124 | 6.8%
7 | 124 | 6.8%
8 | 124 | 6.8%
0 | 124 | 6.8%
Other Letter
Value | Count | Frequency (%)
연 | 1 | 50.0%
번 | 1 | 50.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1815 | 99.9%
Hangul | 2 | 0.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 235 | 12.9%
3 | 234 | 12.9%
2 | 234 | 12.9%
4 | 226 | 12.5%
5 | 224 | 12.3%
6 | 166 | 9.1%
9 | 124 | 6.8%
7 | 124 | 6.8%
8 | 124 | 6.8%
0 | 124 | 6.8%
Hangul
Value | Count | Frequency (%)
연 | 1 | 50.0%
번 | 1 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1815 | 99.9%
Hangul | 2 | 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 235 | 12.9%
3 | 234 | 12.9%
2 | 234 | 12.9%
4 | 226 | 12.5%
5 | 224 | 12.3%
6 | 166 | 9.1%
9 | 124 | 6.8%
7 | 124 | 6.8%
8 | 124 | 6.8%
0 | 124 | 6.8%
Hangul
Value | Count | Frequency (%)
연 | 1 | 50.0%
번 | 1 | 50.0%

Unnamed: 1
Text

Distinct: 632
Distinct (%): 98.4%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

Length

Max length: 20
Median length: 16
Mean length: 7.2647975
Min length: 1

Characters and Unicode

Total characters: 4664
Distinct characters: 397
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 622
Unique (%): 96.9%

Sample

1st row: 업체명 (header label: "company name")
2nd row: (유)유송건설
3rd row: (유)지에스엔지니어링
4th row: (유)한양기공
5th row: 대원전력주식회사
Value | Count | Frequency (%)
주식회사 | 34 | 4.6%
유한회사 | 26 | 3.6%
[value lost in extraction] | 6 | 0.8%
[value lost in extraction] | 5 | 0.7%
그린숲 | 2 | 0.3%
대원전력주식회사 | 2 | 0.3%
주)휴먼제이앤씨 | 2 | 0.3%
에코하이테크 | 2 | 0.3%
유)경동건설 | 2 | 0.3%
유)신우개발 | 2 | 0.3%
Other values (643) | 649 | 88.7%

Most occurring characters

ValueCountFrequency (%)
) 364
 
7.8%
( 357
 
7.7%
269
 
5.8%
219
 
4.7%
136
 
2.9%
111
 
2.4%
99
 
2.1%
98
 
2.1%
94
 
2.0%
93
 
2.0%
Other values (387) 2824
60.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 3775
80.9%
Close Punctuation 364
 
7.8%
Open Punctuation 357
 
7.7%
Space Separator 94
 
2.0%
Uppercase Letter 25
 
0.5%
Lowercase Letter 20
 
0.4%
Decimal Number 17
 
0.4%
Other Punctuation 8
 
0.2%
Other Symbol 4
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
269
 
7.1%
219
 
5.8%
136
 
3.6%
111
 
2.9%
99
 
2.6%
98
 
2.6%
93
 
2.5%
74
 
2.0%
72
 
1.9%
66
 
1.7%
Other values (345) 2538
67.2%
Uppercase Letter
ValueCountFrequency (%)
S 4
16.0%
J 3
12.0%
E 2
 
8.0%
P 2
 
8.0%
L 2
 
8.0%
M 2
 
8.0%
C 1
 
4.0%
H 1
 
4.0%
G 1
 
4.0%
N 1
 
4.0%
Other values (6) 6
24.0%
Lowercase Letter
ValueCountFrequency (%)
s 4
20.0%
a 3
15.0%
e 3
15.0%
n 2
10.0%
g 2
10.0%
t 2
10.0%
f 1
 
5.0%
l 1
 
5.0%
i 1
 
5.0%
r 1
 
5.0%
Decimal Number
ValueCountFrequency (%)
1 4
23.5%
6 3
17.6%
2 3
17.6%
7 2
11.8%
3 1
 
5.9%
4 1
 
5.9%
0 1
 
5.9%
9 1
 
5.9%
5 1
 
5.9%
Other Punctuation
ValueCountFrequency (%)
. 4
50.0%
& 3
37.5%
' 1
 
12.5%
Close Punctuation
ValueCountFrequency (%)
) 364
100.0%
Open Punctuation
ValueCountFrequency (%)
( 357
100.0%
Space Separator
ValueCountFrequency (%)
94
100.0%
Other Symbol
ValueCountFrequency (%)
4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 3779
81.0%
Common 840
 
18.0%
Latin 45
 
1.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
269
 
7.1%
219
 
5.8%
136
 
3.6%
111
 
2.9%
99
 
2.6%
98
 
2.6%
93
 
2.5%
74
 
2.0%
72
 
1.9%
66
 
1.7%
Other values (346) 2542
67.3%
Latin
ValueCountFrequency (%)
S 4
 
8.9%
s 4
 
8.9%
a 3
 
6.7%
e 3
 
6.7%
J 3
 
6.7%
n 2
 
4.4%
g 2
 
4.4%
t 2
 
4.4%
E 2
 
4.4%
P 2
 
4.4%
Other values (16) 18
40.0%
Common
ValueCountFrequency (%)
) 364
43.3%
( 357
42.5%
94
 
11.2%
. 4
 
0.5%
1 4
 
0.5%
6 3
 
0.4%
& 3
 
0.4%
2 3
 
0.4%
7 2
 
0.2%
3 1
 
0.1%
Other values (5) 5
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
Hangul 3775
80.9%
ASCII 885
 
19.0%
None 4
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
) 364
41.1%
( 357
40.3%
94
 
10.6%
S 4
 
0.5%
s 4
 
0.5%
. 4
 
0.5%
1 4
 
0.5%
6 3
 
0.3%
a 3
 
0.3%
& 3
 
0.3%
Other values (31) 45
 
5.1%
Hangul
ValueCountFrequency (%)
269
 
7.1%
219
 
5.8%
136
 
3.6%
111
 
2.9%
99
 
2.6%
98
 
2.6%
93
 
2.5%
74
 
2.0%
72
 
1.9%
66
 
1.7%
Other values (345) 2538
67.2%
None
ValueCountFrequency (%)
4
100.0%

Unnamed: 2
Text

Distinct: 592
Distinct (%): 92.2%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

Length

Max length: 8
Median length: 3
Mean length: 3.0903427
Min length: 2

Characters and Unicode

Total characters: 1984
Distinct characters: 160
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 546
Unique (%): 85.0%

Sample

1st row: 대표자 (header label: "representative")
2nd row: 조은주
3rd row: 서봉순
4th row: 강순덕
5th row: 박태숙
Value | Count | Frequency (%)
최정희 | 3 | 0.5%
김영희 | 3 | 0.5%
김정숙 | 3 | 0.5%
김영미 | 3 | 0.5%
김선자 | 2 | 0.3%
윤정란 | 2 | 0.3%
안기령 | 2 | 0.3%
김은영 | 2 | 0.3%
신경숙 | 2 | 0.3%
정윤희 | 2 | 0.3%
Other values (596) | 636 | 96.4%

Most occurring characters

ValueCountFrequency (%)
127
 
6.4%
118
 
5.9%
94
 
4.7%
85
 
4.3%
78
 
3.9%
78
 
3.9%
69
 
3.5%
69
 
3.5%
66
 
3.3%
61
 
3.1%
Other values (150) 1139
57.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 1952
98.4%
Space Separator 21
 
1.1%
Other Punctuation 9
 
0.5%
Decimal Number 2
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
127
 
6.5%
118
 
6.0%
94
 
4.8%
85
 
4.4%
78
 
4.0%
78
 
4.0%
69
 
3.5%
69
 
3.5%
66
 
3.4%
61
 
3.1%
Other values (147) 1107
56.7%
Space Separator
ValueCountFrequency (%)
21
100.0%
Other Punctuation
ValueCountFrequency (%)
, 9
100.0%
Decimal Number
ValueCountFrequency (%)
1 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1952
98.4%
Common 32
 
1.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
127
 
6.5%
118
 
6.0%
94
 
4.8%
85
 
4.4%
78
 
4.0%
78
 
4.0%
69
 
3.5%
69
 
3.5%
66
 
3.4%
61
 
3.1%
Other values (147) 1107
56.7%
Common
ValueCountFrequency (%)
21
65.6%
, 9
28.1%
1 2
 
6.2%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1952
98.4%
ASCII 32
 
1.6%

Most frequent character per block

Hangul
ValueCountFrequency (%)
127
 
6.5%
118
 
6.0%
94
 
4.8%
85
 
4.4%
78
 
4.0%
78
 
4.0%
69
 
3.5%
69
 
3.5%
66
 
3.4%
61
 
3.1%
Other values (147) 1107
56.7%
ASCII
ValueCountFrequency (%)
21
65.6%
, 9
28.1%
1 2
 
6.2%

Unnamed: 3
Text

Distinct: 326
Distinct (%): 50.8%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

Length

Max length: 7
Median length: 7
Mean length: 6.9859813
Min length: 1

Characters and Unicode

Total characters: 4485
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 181
Unique (%): 28.2%

Sample

1st row: 우편번호 (header label: "postal code")
2nd row: 590-120
3rd row: 560-869
4th row: 573-140
5th row: 573-360
Value | Count | Frequency (%)
560-838 | 10 | 1.6%
560-822 | 9 | 1.4%
570-979 | 9 | 1.4%
560-821 | 9 | 1.4%
560-759 | 7 | 1.1%
561-330 | 7 | 1.1%
560-870 | 7 | 1.1%
573-885 | 7 | 1.1%
561-845 | 7 | 1.1%
561-370 | 7 | 1.1%
Other values (316) | 563 | 87.7%

Most occurring characters

Value | Count | Frequency (%)
5 | 822 | 18.3%
- | 640 | 14.3%
0 | 634 | 14.1%
8 | 457 | 10.2%
6 | 421 | 9.4%
7 | 354 | 7.9%
1 | 315 | 7.0%
9 | 265 | 5.9%
3 | 248 | 5.5%
2 | 209 | 4.7%
Other values (5) | 120 | 2.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 3841 | 85.6%
Dash Punctuation | 640 | 14.3%
Other Letter | 4 | 0.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
5 | 822 | 21.4%
0 | 634 | 16.5%
8 | 457 | 11.9%
6 | 421 | 11.0%
7 | 354 | 9.2%
1 | 315 | 8.2%
9 | 265 | 6.9%
3 | 248 | 6.5%
2 | 209 | 5.4%
4 | 116 | 3.0%
Other Letter
Value | Count | Frequency (%)
우 | 1 | 25.0%
편 | 1 | 25.0%
번 | 1 | 25.0%
호 | 1 | 25.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 640 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 4481 | 99.9%
Hangul | 4 | 0.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
5 | 822 | 18.3%
- | 640 | 14.3%
0 | 634 | 14.1%
8 | 457 | 10.2%
6 | 421 | 9.4%
7 | 354 | 7.9%
1 | 315 | 7.0%
9 | 265 | 5.9%
3 | 248 | 5.5%
2 | 209 | 4.7%
Hangul
Value | Count | Frequency (%)
우 | 1 | 25.0%
편 | 1 | 25.0%
번 | 1 | 25.0%
호 | 1 | 25.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 4481 | 99.9%
Hangul | 4 | 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
5 | 822 | 18.3%
- | 640 | 14.3%
0 | 634 | 14.1%
8 | 457 | 10.2%
6 | 421 | 9.4%
7 | 354 | 7.9%
1 | 315 | 7.0%
9 | 265 | 5.9%
3 | 248 | 5.5%
2 | 209 | 4.7%
Hangul
Value | Count | Frequency (%)
우 | 1 | 25.0%
편 | 1 | 25.0%
번 | 1 | 25.0%
호 | 1 | 25.0%

Unnamed: 4
Text

Distinct: 621
Distinct (%): 96.7%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

Length

Max length: 54
Median length: 44.5
Mean length: 26.936137
Min length: 2

Characters and Unicode

Total characters: 17293
Distinct characters: 349
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 601
Unique (%): 93.6%

Sample

1st row: 주소 (header label: "address")
2nd row: 전북 남원시 왕정동 194
3rd row: 전북 전주시 완산구 효자동3가 1450-2
4th row: 전북 군산시 문화동 904-1
5th row: 전북 군산시 미룡동 408-4번지
Value | Count | Frequency (%)
전라북도 | 581 | 15.8%
전주시 | 274 | 7.5%
완산구 | 150 | 4.1%
덕진구 | 124 | 3.4%
익산시 | 123 | 3.3%
전북 | 67 | 1.8%
군산시 | 61 | 1.7%
김제시 | 38 | 1.0%
남원시 | 34 | 0.9%
정읍시 | 28 | 0.8%
Other values (1259) | 2193 | 59.7%

Most occurring characters

ValueCountFrequency (%)
3231
 
18.7%
975
 
5.6%
698
 
4.0%
615
 
3.6%
594
 
3.4%
579
 
3.3%
1 563
 
3.3%
470
 
2.7%
440
 
2.5%
2 430
 
2.5%
Other values (339) 8698
50.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 10327
59.7%
Space Separator 3231
 
18.7%
Decimal Number 2691
 
15.6%
Open Punctuation 367
 
2.1%
Close Punctuation 367
 
2.1%
Dash Punctuation 269
 
1.6%
Other Punctuation 28
 
0.2%
Uppercase Letter 13
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
975
 
9.4%
698
 
6.8%
615
 
6.0%
594
 
5.8%
579
 
5.6%
470
 
4.6%
440
 
4.3%
368
 
3.6%
345
 
3.3%
294
 
2.8%
Other values (310) 4949
47.9%
Decimal Number
ValueCountFrequency (%)
1 563
20.9%
2 430
16.0%
3 354
13.2%
4 236
8.8%
5 231
8.6%
0 220
 
8.2%
6 196
 
7.3%
7 191
 
7.1%
8 142
 
5.3%
9 128
 
4.8%
Uppercase Letter
ValueCountFrequency (%)
F 2
15.4%
S 2
15.4%
J 2
15.4%
M 1
7.7%
I 1
7.7%
B 1
7.7%
A 1
7.7%
G 1
7.7%
T 1
7.7%
K 1
7.7%
Other Punctuation
ValueCountFrequency (%)
, 14
50.0%
10
35.7%
. 4
 
14.3%
Open Punctuation
ValueCountFrequency (%)
( 366
99.7%
[ 1
 
0.3%
Close Punctuation
ValueCountFrequency (%)
) 366
99.7%
] 1
 
0.3%
Space Separator
ValueCountFrequency (%)
3231
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 269
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 10327
59.7%
Common 6953
40.2%
Latin 13
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
975
 
9.4%
698
 
6.8%
615
 
6.0%
594
 
5.8%
579
 
5.6%
470
 
4.6%
440
 
4.3%
368
 
3.6%
345
 
3.3%
294
 
2.8%
Other values (310) 4949
47.9%
Common
ValueCountFrequency (%)
3231
46.5%
1 563
 
8.1%
2 430
 
6.2%
( 366
 
5.3%
) 366
 
5.3%
3 354
 
5.1%
- 269
 
3.9%
4 236
 
3.4%
5 231
 
3.3%
0 220
 
3.2%
Other values (9) 687
 
9.9%
Latin
ValueCountFrequency (%)
F 2
15.4%
S 2
15.4%
J 2
15.4%
M 1
7.7%
I 1
7.7%
B 1
7.7%
A 1
7.7%
G 1
7.7%
T 1
7.7%
K 1
7.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 10327
59.7%
ASCII 6956
40.2%
None 10
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3231
46.4%
1 563
 
8.1%
2 430
 
6.2%
( 366
 
5.3%
) 366
 
5.3%
3 354
 
5.1%
- 269
 
3.9%
4 236
 
3.4%
5 231
 
3.3%
0 220
 
3.2%
Other values (18) 690
 
9.9%
Hangul
ValueCountFrequency (%)
975
 
9.4%
698
 
6.8%
615
 
6.0%
594
 
5.8%
579
 
5.6%
470
 
4.6%
440
 
4.3%
368
 
3.6%
345
 
3.3%
294
 
2.8%
Other values (310) 4949
47.9%
None
ValueCountFrequency (%)
10
100.0%

Unnamed: 5
Text

Distinct: 625
Distinct (%): 97.4%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

Length

Max length: 14
Median length: 12
Mean length: 11.750779
Min length: 4

Characters and Unicode

Total characters: 7544
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 609
Unique (%): 94.9%

Sample

1st row: 전화번호 (header label: "telephone number")
2nd row: 063-246-6055
3rd row: 0632531337
4th row: 063-466-4012
5th row: 063-468-6658
Value | Count | Frequency (%)
063-451-8046 | 3 | 0.5%
063-833-7123 | 2 | 0.3%
063-548-0025 | 2 | 0.3%
063-837-5623 | 2 | 0.3%
0635640456 | 2 | 0.3%
063-225-2959 | 2 | 0.3%
563-2295 | 2 | 0.3%
063-236-4775 | 2 | 0.3%
063-653-8803 | 2 | 0.3%
063-246-7589 | 2 | 0.3%
Other values (615) | 621 | 96.7%

Most occurring characters

Value | Count | Frequency (%)
- | 1140 | 15.1%
3 | 1102 | 14.6%
0 | 1083 | 14.4%
6 | 986 | 13.1%
2 | 723 | 9.6%
5 | 558 | 7.4%
4 | 474 | 6.3%
1 | 447 | 5.9%
8 | 434 | 5.8%
7 | 368 | 4.9%
Other values (5) | 229 | 3.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 6400 | 84.8%
Dash Punctuation | 1140 | 15.1%
Other Letter | 4 | 0.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
3 | 1102 | 17.2%
0 | 1083 | 16.9%
6 | 986 | 15.4%
2 | 723 | 11.3%
5 | 558 | 8.7%
4 | 474 | 7.4%
1 | 447 | 7.0%
8 | 434 | 6.8%
7 | 368 | 5.8%
9 | 225 | 3.5%
Other Letter
Value | Count | Frequency (%)
전 | 1 | 25.0%
화 | 1 | 25.0%
번 | 1 | 25.0%
호 | 1 | 25.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 1140 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 7540 | 99.9%
Hangul | 4 | 0.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 1140 | 15.1%
3 | 1102 | 14.6%
0 | 1083 | 14.4%
6 | 986 | 13.1%
2 | 723 | 9.6%
5 | 558 | 7.4%
4 | 474 | 6.3%
1 | 447 | 5.9%
8 | 434 | 5.8%
7 | 368 | 4.9%
Hangul
Value | Count | Frequency (%)
전 | 1 | 25.0%
화 | 1 | 25.0%
번 | 1 | 25.0%
호 | 1 | 25.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 7540 | 99.9%
Hangul | 4 | 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 1140 | 15.1%
3 | 1102 | 14.6%
0 | 1083 | 14.4%
6 | 986 | 13.1%
2 | 723 | 9.6%
5 | 558 | 7.4%
4 | 474 | 6.3%
1 | 447 | 5.9%
8 | 434 | 5.8%
7 | 368 | 4.9%
Hangul
Value | Count | Frequency (%)
전 | 1 | 25.0%
화 | 1 | 25.0%
번 | 1 | 25.0%
호 | 1 | 25.0%

Unnamed: 6
Text

Distinct: 607
Distinct (%): 94.5%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

Length

Max length: 14
Median length: 12
Mean length: 11.623053
Min length: 4

Characters and Unicode

Total characters: 7462
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 572
Unique (%): 89.1%

Sample

1st row: 팩스번호 (header label: "fax number")
2nd row: 063-244-5988
3rd row: 063-253-1336
4th row: 063-463-7363
5th row: 063-468-6659
Value | Count | Frequency (%)
063-272-6354 | 2 | 0.3%
063-242-1078 | 2 | 0.3%
0505-304-9999 | 2 | 0.3%
063-467-3013 | 2 | 0.3%
063-220-3130 | 2 | 0.3%
063-225-2949 | 2 | 0.3%
063-243-7529 | 2 | 0.3%
063-561-5183 | 2 | 0.3%
063-270-3948 | 2 | 0.3%
0636259366 | 2 | 0.3%
Other values (597) | 622 | 96.9%

Most occurring characters

Value | Count | Frequency (%)
3 | 1129 | 15.1%
- | 1059 | 14.2%
0 | 1027 | 13.8%
6 | 1014 | 13.6%
2 | 723 | 9.7%
5 | 569 | 7.6%
4 | 480 | 6.4%
8 | 423 | 5.7%
1 | 407 | 5.5%
7 | 355 | 4.8%
Other values (5) | 276 | 3.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 6399 | 85.8%
Dash Punctuation | 1059 | 14.2%
Other Letter | 4 | 0.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
3 | 1129 | 17.6%
0 | 1027 | 16.0%
6 | 1014 | 15.8%
2 | 723 | 11.3%
5 | 569 | 8.9%
4 | 480 | 7.5%
8 | 423 | 6.6%
1 | 407 | 6.4%
7 | 355 | 5.5%
9 | 272 | 4.3%
Other Letter
Value | Count | Frequency (%)
팩 | 1 | 25.0%
스 | 1 | 25.0%
번 | 1 | 25.0%
호 | 1 | 25.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 1059 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 7458 | 99.9%
Hangul | 4 | 0.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
3 | 1129 | 15.1%
- | 1059 | 14.2%
0 | 1027 | 13.8%
6 | 1014 | 13.6%
2 | 723 | 9.7%
5 | 569 | 7.6%
4 | 480 | 6.4%
8 | 423 | 5.7%
1 | 407 | 5.5%
7 | 355 | 4.8%
Hangul
Value | Count | Frequency (%)
팩 | 1 | 25.0%
스 | 1 | 25.0%
번 | 1 | 25.0%
호 | 1 | 25.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 7458 | 99.9%
Hangul | 4 | 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
3 | 1129 | 15.1%
- | 1059 | 14.2%
0 | 1027 | 13.8%
6 | 1014 | 13.6%
2 | 723 | 9.7%
5 | 569 | 7.6%
4 | 480 | 6.4%
8 | 423 | 5.7%
1 | 407 | 5.5%
7 | 355 | 4.8%
Hangul
Value | Count | Frequency (%)
팩 | 1 | 25.0%
스 | 1 | 25.0%
번 | 1 | 25.0%
호 | 1 | 25.0%

Unnamed: 7
Categorical

Distinct: 16
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

제조업: 227
건설업: 223
도매 및 소매업: 71
사업시설관리 및 사업지원서비스업: 42
출판, 영상, 방송통신 및 정보서비스: 16
Other values (11): 63

Length

Max length: 21
Median length: 3
Mean length: 5.8847352
Min length: 2

Unique

Unique: 2
Unique (%): 0.3%

Sample

1st row: 주업종 (header label: "main industry")
2nd row: 건설업
3rd row: 건설업
4th row: 건설업
5th row: 전기, 가스, 증기 및 수도사업

Common Values

Value | Count | Frequency (%)
제조업 | 227 | 35.4%
건설업 | 223 | 34.7%
도매 및 소매업 | 71 | 11.1%
사업시설관리 및 사업지원서비스업 | 42 | 6.5%
출판, 영상, 방송통신 및 정보서비스 | 16 | 2.5%
수리 및 기타서비스업 | 16 | 2.5%
전기, 가스, 증기 및 수도사업 | 13 | 2.0%
전문, 과학 및 기술 서비스업 | 12 | 1.9%
부동산업 및 임대업 | 4 | 0.6%
운수업 | 4 | 0.6%
Other values (6) | 14 | 2.2%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
제조업 | 227 | 20.3%
건설업 | 223 | 20.0%
[value lost in extraction] | 184 | 16.5%
도매 | 71 | 6.4%
소매업 | 71 | 6.4%
사업시설관리 | 42 | 3.8%
사업지원서비스업 | 42 | 3.8%
출판 | 16 | 1.4%
영상 | 16 | 1.4%
정보서비스 | 16 | 1.4%
Other values (31) | 208 | 18.6%

Unnamed: 8
Text

MISSING 

Distinct: 310
Distinct (%): 75.8%
Missing: 233
Missing (%): 36.3%
Memory size: 5.1 KiB

Length

Max length: 29
Median length: 20
Mean length: 7.193154
Min length: 1

Characters and Unicode

Total characters: 2942
Distinct characters: 344
Distinct categories: 8
Distinct scripts: 4
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 276
Unique (%): 67.5%

Sample

1st row: 주생산품 (header label: "main product")
2nd row: 시설물유지관리
3rd row: 기계설비공사
4th row: 없음
5th row: 전기공사용역
Value | Count | Frequency (%)
건물청소용역 | 14 | 2.8%
전기공사 | 13 | 2.6%
의약품 | 10 | 2.0%
건설업 | 8 | 1.6%
전기공사용역 | 7 | 1.4%
[value lost in extraction] | 7 | 1.4%
시설물유지관리업 | 7 | 1.4%
실내건축공사 | 7 | 1.4%
토목공사 | 6 | 1.2%
철근콘크리트공사 | 6 | 1.2%
Other values (360) | 422 | 83.2%

Most occurring characters

ValueCountFrequency (%)
127
 
4.3%
125
 
4.2%
102
 
3.5%
, 85
 
2.9%
77
 
2.6%
74
 
2.5%
70
 
2.4%
68
 
2.3%
67
 
2.3%
64
 
2.2%
Other values (334) 2083
70.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 2683
91.2%
Other Punctuation 109
 
3.7%
Space Separator 102
 
3.5%
Decimal Number 13
 
0.4%
Uppercase Letter 12
 
0.4%
Close Punctuation 11
 
0.4%
Open Punctuation 11
 
0.4%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
127
 
4.7%
125
 
4.7%
77
 
2.9%
74
 
2.8%
70
 
2.6%
68
 
2.5%
67
 
2.5%
64
 
2.4%
54
 
2.0%
48
 
1.8%
Other values (313) 1909
71.2%
Uppercase Letter
ValueCountFrequency (%)
C 2
16.7%
D 2
16.7%
K 1
8.3%
T 1
8.3%
Y 1
8.3%
N 1
8.3%
I 1
8.3%
F 1
8.3%
R 1
8.3%
M 1
8.3%
Decimal Number
ValueCountFrequency (%)
9 9
69.2%
0 2
 
15.4%
2 1
 
7.7%
3 1
 
7.7%
Other Punctuation
ValueCountFrequency (%)
, 85
78.0%
. 18
 
16.5%
/ 6
 
5.5%
Space Separator
ValueCountFrequency (%)
102
100.0%
Close Punctuation
ValueCountFrequency (%)
) 11
100.0%
Open Punctuation
ValueCountFrequency (%)
( 11
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 2682
91.2%
Common 247
 
8.4%
Latin 12
 
0.4%
Han 1
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
127
 
4.7%
125
 
4.7%
77
 
2.9%
74
 
2.8%
70
 
2.6%
68
 
2.5%
67
 
2.5%
64
 
2.4%
54
 
2.0%
48
 
1.8%
Other values (312) 1908
71.1%
Common
ValueCountFrequency (%)
102
41.3%
, 85
34.4%
. 18
 
7.3%
) 11
 
4.5%
( 11
 
4.5%
9 9
 
3.6%
/ 6
 
2.4%
0 2
 
0.8%
- 1
 
0.4%
2 1
 
0.4%
Latin
ValueCountFrequency (%)
C 2
16.7%
D 2
16.7%
K 1
8.3%
T 1
8.3%
Y 1
8.3%
N 1
8.3%
I 1
8.3%
F 1
8.3%
R 1
8.3%
M 1
8.3%
Han
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 2682
91.2%
ASCII 259
 
8.8%
CJK 1
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
127
 
4.7%
125
 
4.7%
77
 
2.9%
74
 
2.8%
70
 
2.6%
68
 
2.5%
67
 
2.5%
64
 
2.4%
54
 
2.0%
48
 
1.8%
Other values (312) 1908
71.1%
ASCII
ValueCountFrequency (%)
102
39.4%
, 85
32.8%
. 18
 
6.9%
) 11
 
4.2%
( 11
 
4.2%
9 9
 
3.5%
/ 6
 
2.3%
C 2
 
0.8%
0 2
 
0.8%
D 2
 
0.8%
Other values (11) 11
 
4.2%
CJK
ValueCountFrequency (%)
1
100.0%

(2014. 8. 17)
Text

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 100.0%
Missing: 641
Missing (%): 99.8%
Memory size: 5.1 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 2
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 비고 (header label: "remarks")
Value | Count | Frequency (%)
비고 | 1 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
비 | 1 | 50.0%
고 | 1 | 50.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
비 | 1 | 50.0%
고 | 1 | 50.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
비 | 1 | 50.0%
고 | 1 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
비 | 1 | 50.0%
고 | 1 | 50.0%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
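To make the idea of nullity correlation concrete, the sketch below computes the Pearson correlation between the "is present" indicators of two columns. It assumes this is the quantity the heatmap displays. The function name and the toy columns are hypothetical and are not taken from the dataset.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 'is present' indicators of two columns.

    Returns None when either column is fully present or fully missing,
    because the indicator then has zero variance.
    """
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / math.sqrt(var_a * var_b)


# Hypothetical toy columns; None marks a missing cell.
product = ["a", None, "b", None]
remark = ["x", None, "y", None]
inverse = [None, "p", None, "q"]
print(nullity_correlation(product, remark))   # 1.0
print(nullity_correlation(product, inverse))  # -1.0
```

A value near 1 means the two columns tend to be missing together, a value near -1 means one tends to be present when the other is missing, and fully populated columns have no defined nullity correlation.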

Sample

First rows

(index) | Unnamed: 0 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | (2014. 8. 17)
0 | 연번 | 업체명 | 대표자 | 우편번호 | 주소 | 전화번호 | 팩스번호 | 주업종 | 주생산품 | 비고
1 | 1 | (유)유송건설 | 조은주 | 590-120 | 전북 남원시 왕정동 194 | 063-246-6055 | 063-244-5988 | 건설업 | 시설물유지관리 | <NA>
2 | 2 | (유)지에스엔지니어링 | 서봉순 | 560-869 | 전북 전주시 완산구 효자동3가 1450-2 | 0632531337 | 063-253-1336 | 건설업 | 기계설비공사 | <NA>
3 | 3 | (유)한양기공 | 강순덕 | 573-140 | 전북 군산시 문화동 904-1 | 063-466-4012 | 063-463-7363 | 건설업 | 없음 | <NA>
4 | 4 | 대원전력주식회사 | 박태숙 | 573-360 | 전북 군산시 미룡동 408-4번지 | 063-468-6658 | 063-468-6659 | 전기, 가스, 증기 및 수도사업 | 전기공사용역 | <NA>
5 | 5 | 유한회사 계림건설 | 강점이 | 570-986 | 전북 익산시 인화동2가 100-2 | 063-856-2650 | 063-855-0760 | 건설업 | 건축공사 | <NA>
6 | 6 | 오리엔트21 | 양태련 | 570-210 | 전북 익산시 어양동 513-16번지 | 011-659-2468 | 063-832-2283 | 제조업 | 석재타일 및 판석 | <NA>
7 | 7 | (주)빗살 | 이지수 | 570-811 | 전북 익산시 황등면 율촌리 907번지 | 063-858-6527 | 063-858-6529 | 제조업 | <NA> | <NA>
8 | 8 | 대한엔지니어링 | 김경미 | 560-822 | 전북 전주시 완산구 서신동 991-2 | 063-221-2948 | 063-221-8722 | 전기, 가스, 증기 및 수도사업 | 전기공사용역 | <NA>
9 | 9 | (주)영웅건설 | 임옥희 | 570-988 | 전북 익산시 중앙동3가 8-5 | 063-843-3700 | 063-843-3720 | 건설업 | 건설업 | <NA>

Last rows

(index) | Unnamed: 0 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | (2014. 8. 17)
632 | 632 | (유)우리산업 | 강춘화 | 590-911 | 전라북도 남원시 덕과면 덕오로 474 | 063-631-4088 | 063-631-5088 | 제조업 | <NA> | <NA>
633 | 633 | (유)범건설 | 유현자 | 590-989 | 전라북도 남원시 동림로 47 (향교동) | 063-636-6800 | 063-636-6801 | 건설업 | 건설업 | <NA>
634 | 634 | 주식회사 포올 | 최귀숙 | 585-821 | 전라북도 고창군 흥덕면 선운대로 3619-79 | 063-561-2390 | 063-561-2391 | 건설업 | <NA> | <NA>
635 | 635 | 태양전력(유) | 이덕례 | 580-060 | 전라북도 정읍시 청수1길 3 전북정읍시시기동204-117 (시기동) | 063-536-1520 | 0635371520 | 전기, 가스, 증기 및 수도사업 | <NA> | <NA>
636 | 636 | (유)하나석재 | 오민행 | 570-801 | 전라북도 익산시 함열읍 미륵사지로 1098 | 063-861-4555 | 063-861-4556 | 제조업 | 경계석 | <NA>
637 | 637 | 대원전력주식회사 | 박태숙 | 573-360 | 전라북도 군산시 미제길 10-18 408-4 (미룡동) | 063-468-6658 | 063-468-6659 | 전기, 가스, 증기 및 수도사업 | <NA> | <NA>
638 | 638 | 성원유통 | 민지은 | 560-870 | 전라북도 전주시 완산구 서곡2길 14-1 (효자동3가) | 063-252-5179 | 0303-3442-9762 | 도매 및 소매업 | 사무용품및액세사리 | <NA>
639 | 639 | (유)신흥공영 | 박순영 | 570-803 | 전라북도 익산시 함열읍 함열10길 15 | 063-861-1568 | 063-861-1350 | 건설업 | 창호공사 | <NA>
640 | 640 | 오에이원 | 정보경 | 560-901 | 전라북도 전주시 완산구 척동9길 5-6 204호 (효자동3가) | 063-222-9601 | 063-222-9602 | 도매 및 소매업 | 서비스 | <NA>
641 | 641 | 한들인쇄 | 이금순외1명 | 560-838 | 전라북도 전주시 완산구 메너머1길 15 (중화산동2가) | 063-278-5977 | 063-278-5976 | 제조업 | <NA> | <NA>
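Row 0 of the sample holds what appear to be the real column labels (연번, 업체명, and so on), while the parsed column names are placeholders such as `Unnamed: 0`. The sketch below promotes that row to the header before profiling. It assumes the source is a CSV-like file whose first line carries only a date stamp. The inline text is a hypothetical, shortened excerpt built from values in the sample, not the actual file.

```python
import csv
import io

# Hypothetical excerpt: a date-stamp line, the real header line, two records.
raw = io.StringIO(
    ",,,(2014. 8. 17)\n"
    "연번,업체명,주업종,비고\n"
    "1,(유)유송건설,건설업,\n"
    "2,(유)지에스엔지니어링,건설업,\n"
)

rows = list(csv.reader(raw))
header = rows[1]  # the second line carries the real labels
records = [dict(zip(header, row)) for row in rows[2:]]

for record in records:
    print(record)
```

Re-reading the data this way would remove the header labels from the value counts and would stop the last column from being reported as constant with 99.8% missing values.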