Overview

Dataset statistics

Number of variables: 3
Number of observations: 1107
Missing cells: 1
Missing cells (%): < 0.1%
Duplicate rows: 4
Duplicate rows (%): 0.4%
Total size in memory: 26.1 KiB
Average record size in memory: 24.1 B
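
The derived figures in this table can be cross-checked from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (variables × observations), that the duplicate percentage is taken over observations, and that the average record size is the total memory divided by the number of observations. These formulas are assumptions that happen to reproduce the reported values, not a statement of how the profiling tool computes them.

```python
# Figures copied from the "Dataset statistics" table of this report.
n_variables = 3
n_observations = 1107
missing_cells = 1
duplicate_rows = 4
total_size_kib = 26.1

# Assumed formulas (see the note above).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Total cells:         {total_cells}")
print(f"Missing cells (%):   {missing_pct:.3f}")
print(f"Duplicate rows (%):  {duplicate_pct:.1f}")
print(f"Average record size: {avg_record_bytes:.1f} B")
```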

Variable types

Text: 2
Categorical: 1

Dataset

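Description: Provides the addresses and bar association affiliations of law firms nationwide that have received establishment approval, drawn from the law firm data. (Fields provided: 법무법인 (law firm), 소속회 (affiliated bar association), 주소 (address).)
Author: Ministry of Justice (법무부)
URL: https://www.data.go.kr/data/15052564/fileData.do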

Alerts

Dataset has 4 (0.4%) duplicate rows [Duplicates]
소속회 is highly imbalanced (50.3%) [Imbalance]
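
One plausible reading of the imbalance percentage is one minus the normalized Shannon entropy of the category counts: 0 for a perfectly even distribution, approaching 1 as a single category dominates. The sketch below assumes that definition. The ten largest counts come from the Common Values table for 소속회 in this report; the remaining 76 rows are spread evenly over 13 categories purely as a hypothetical stand-in, because the report does not list them individually. The function name `imbalance_score` is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the counts and k is the number of categories."""
    k = len(counts)
    if k <= 1:
        return 0.0  # convention chosen for this sketch
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Top-10 counts from the Common Values table for 소속회.
listed = [723, 64, 40, 39, 38, 31, 29, 25, 24, 18]
# Hypothetical: the other 76 rows split evenly across the other 13 categories.
assumed_rest = [76 / 13] * 13

score = imbalance_score(listed + assumed_rest)
print(f"Approximate imbalance: {score:.1%}")
```

An even split is the most balanced arrangement the unlisted categories could have, so under this definition the true score would be at least as high as this approximation.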

Reproduction

Analysis started: 2023-12-12 04:56:07.266839
Analysis finished: 2023-12-12 04:56:07.803683
Duration: 0.54 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

법무법인
Text

Distinct: 1098
Distinct (%): 99.3%
Missing: 1
Missing (%): 0.1%
Memory size: 8.8 KiB

Length

Max length: 18
Median length: 7
Mean length: 7.6003617
Min length: 4
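
The reported mean length is consistent with dividing the total character count for this variable (8406) by the number of non-missing values (1107 observations minus 1 missing). The sketch below checks that arithmetic and shows how the four length statistics could be computed for a list of strings. Skipping missing values (`None`) is an assumption, and `length_stats` is an illustrative helper, not part of the profiling tool.

```python
import statistics

def length_stats(values):
    """Return (max, median, mean, min) of string lengths, skipping None."""
    lengths = [len(v) for v in values if v is not None]
    return (max(lengths), statistics.median(lengths),
            statistics.mean(lengths), min(lengths))

# Cross-check against the figures reported for this variable.
total_characters = 8406
non_missing = 1107 - 1
print(round(total_characters / non_missing, 7))

# The five sample rows shown for this variable, plus one missing value.
sample = ["광화문 법무법인", "동방종합 법무법인", "법무법인 삼덕",
          "법무법인 네이버스", "법무법인 동일", None]
stats = length_stats(sample)
print(stats)
```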

Characters and Unicode

Total characters: 8406
Distinct characters: 349
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
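
As an illustration of those character properties, the sketch below counts the characters of one firm name by Unicode general category using Python's standard `unicodedata` module. The two-letter codes correspond to the category names used in this report: `Lo` is Other Letter, `Zs` is Space Separator, `Ps` is Open Punctuation, and `Pe` is Close Punctuation. The standard library does not expose the script or block of a character, so only categories are shown. `category_counts` is an illustrative helper name.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

# A firm name that appears in the "Duplicate rows" table of this report.
name = "법무법인(유한) 원"
counts = category_counts(name)
for code, n in counts.most_common():
    print(code, n)
```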

Unique

Unique: 1090
Unique (%): 98.6%

Sample

1st row: 광화문 법무법인
2nd row: 동방종합 법무법인
3rd row: 법무법인 삼덕
4th row: 법무법인 네이버스
5th row: 법무법인 동일

Value | Count | Frequency (%)
법무법인 | 1043 | 47.5%
법무법인(유한 | 40 | 1.8%
법조 | 2 | 0.1%
사람 | 2 | 0.1%
종합법률사무소 | 2 | 0.1%
선인 | 2 | 0.1%
좋은친구 | 2 | 0.1%
대신 | 2 | 0.1%
호림 | 2 | 0.1%
호민 | 2 | 0.1%
Other values (1094) | 1097 | 50.0%
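
A word-frequency table of this kind can be approximated by splitting each value on whitespace and counting the tokens. The sketch below does that for the five sample rows of this variable. The report's own tokenizer appears to behave differently (for example, it lists `법무법인(유한` as a token), so this is a simplified stand-in rather than a reproduction, and `word_counts` is an illustrative helper name.

```python
from collections import Counter

def word_counts(values):
    """Count whitespace-separated tokens across all non-missing values."""
    counter = Counter()
    for value in values:
        if value is None:
            continue
        counter.update(value.split())
    return counter

# The five sample rows shown for this variable.
sample = ["광화문 법무법인", "동방종합 법무법인", "법무법인 삼덕",
          "법무법인 네이버스", "법무법인 동일"]
counts = word_counts(sample)
total = sum(counts.values())
for word, n in counts.most_common(3):
    print(f"{word} | {n} | {100 * n / total:.1f}%")
```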

Most occurring characters

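Value | Count | Frequency (%)
(missing) | 2235 | 26.6%
(missing) | 1149 | 13.7%
(missing) | 1123 | 13.4%
(space) | 1099 | 13.1%
(missing) | 97 | 1.2%
(missing) | 74 | 0.9%
(missing) | 68 | 0.8%
(missing) | 61 | 0.7%
(missing) | 58 | 0.7%
(missing) | 54 | 0.6%
Other values (339) | 2388 | 28.4%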

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 7200 | 85.7%
Space Separator | 1099 | 13.1%
Open Punctuation | 51 | 0.6%
Close Punctuation | 51 | 0.6%
Decimal Number | 4 | < 0.1%
Other Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(missing) | 2235 | 31.0%
(missing) | 1149 | 16.0%
(missing) | 1123 | 15.6%
(missing) | 97 | 1.3%
(missing) | 74 | 1.0%
(missing) | 68 | 0.9%
(missing) | 61 | 0.8%
(missing) | 58 | 0.8%
(missing) | 54 | 0.8%
(missing) | 54 | 0.8%
Other values (331) | 2227 | 30.9%

Decimal Number
Value | Count | Frequency (%)
1 | 1 | 25.0%
0 | 1 | 25.0%
2 | 1 | 25.0%
3 | 1 | 25.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1099 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 51 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 51 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 7198 | 85.6%
Common | 1206 | 14.3%
Han | 2 | < 0.1%

Most frequent character per script

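Hangul
Value | Count | Frequency (%)
(missing) | 2235 | 31.1%
(missing) | 1149 | 16.0%
(missing) | 1123 | 15.6%
(missing) | 97 | 1.3%
(missing) | 74 | 1.0%
(missing) | 68 | 0.9%
(missing) | 61 | 0.8%
(missing) | 58 | 0.8%
(missing) | 54 | 0.8%
(missing) | 54 | 0.8%
Other values (329) | 2225 | 30.9%

Common
Value | Count | Frequency (%)
(space) | 1099 | 91.1%
( | 51 | 4.2%
) | 51 | 4.2%
1 | 1 | 0.1%
0 | 1 | 0.1%
. | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%

Han
Value | Count | Frequency (%)
(missing) | 1 | 50.0%
(missing) | 1 | 50.0%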

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 7198 | 85.6%
ASCII | 1206 | 14.3%
CJK | 2 | < 0.1%

Most frequent character per block

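Hangul
Value | Count | Frequency (%)
(missing) | 2235 | 31.1%
(missing) | 1149 | 16.0%
(missing) | 1123 | 15.6%
(missing) | 97 | 1.3%
(missing) | 74 | 1.0%
(missing) | 68 | 0.9%
(missing) | 61 | 0.8%
(missing) | 58 | 0.8%
(missing) | 54 | 0.8%
(missing) | 54 | 0.8%
Other values (329) | 2225 | 30.9%

ASCII
Value | Count | Frequency (%)
(space) | 1099 | 91.1%
( | 51 | 4.2%
) | 51 | 4.2%
1 | 1 | 0.1%
0 | 1 | 0.1%
. | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%

CJK
Value | Count | Frequency (%)
(missing) | 1 | 50.0%
(missing) | 1 | 50.0%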

소속회
Categorical

IMBALANCE 

Distinct: 23
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.8 KiB

Value | Count
서울 | 723
부산 | 64
인천 | 40
수원 | 39
대구 | 38
Other values (18) | 203

Length

Max length: 4
Median length: 2
Mean length: 2.0569106
Min length: 2

Unique

Unique: 4
Unique (%): 0.4%

Sample

1st row: 서울
2nd row: 서울
3rd row: 부산
4th row: 서울
5th row: 서울

Common Values

Value | Count | Frequency (%)
서울 | 723 | 65.3%
부산 | 64 | 5.8%
인천 | 40 | 3.6%
수원 | 39 | 3.5%
대구 | 38 | 3.4%
대전 | 31 | 2.8%
경기 | 29 | 2.6%
경남 | 25 | 2.3%
광주 | 24 | 2.2%
전북 | 18 | 1.6%
Other values (13) | 76 | 6.9%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
서울 | 729 | 65.9%
부산 | 65 | 5.9%
인천 | 40 | 3.6%
수원 | 39 | 3.5%
대구 | 38 | 3.4%
대전 | 31 | 2.8%
경기 | 29 | 2.6%
경남 | 25 | 2.3%
광주 | 24 | 2.2%
전북 | 18 | 1.6%
Other values (11) | 69 | 6.2%

주 소
Text

Distinct: 1095
Distinct (%): 98.9%
Missing: 0
Missing (%): 0.0%
Memory size: 8.8 KiB

Length

Max length: 57
Median length: 45
Mean length: 30.706414
Min length: 16

Characters and Unicode

Total characters: 33992
Distinct characters: 416
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1083
Unique (%): 97.8%

Sample

1st row: 서울 종로구 신문로 1가 24(고려빌딩 613-4)
2nd row: 서울 중구 퇴계로 187(필동1가, 국제빌딩)
3rd row: 부산 연제구 거제동 1490-3(세헌빌딩 4층)
4th row: 서울 종로구 종로3길 34, 701, 702호(삼송빌딩)
5th row: 서울 광진구 구의동 246-3(민도빌딩 1층)

Value | Count | Frequency (%)
서울 | 629 | 9.9%
서초구 | 476 | 7.5%
서초대로 | 145 | 2.3%
강남구 | 109 | 1.7%
경기 | 81 | 1.3%
서초동 | 80 | 1.3%
서초중앙로 | 76 | 1.2%
반포대로 | 57 | 0.9%
법원로 | 55 | 0.9%
부산 | 55 | 0.9%
Other values (2226) | 4588 | 72.2%

Most occurring characters

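Value | Count | Frequency (%)
(space) | 5303 | 15.6%
(missing) | 1926 | 5.7%
, | 1577 | 4.6%
1 | 1302 | 3.8%
(missing) | 1127 | 3.3%
(missing) | 1124 | 3.3%
(missing) | 986 | 2.9%
2 | 958 | 2.8%
(missing) | 946 | 2.8%
( | 933 | 2.7%
Other values (406) | 17810 | 52.4%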

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 18260 | 53.7%
Decimal Number | 6622 | 19.5%
Space Separator | 5303 | 15.6%
Other Punctuation | 1591 | 4.7%
Open Punctuation | 933 | 2.7%
Close Punctuation | 928 | 2.7%
Dash Punctuation | 276 | 0.8%
Uppercase Letter | 50 | 0.1%
Math Symbol | 27 | 0.1%
Letter Number | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(missing) | 1926 | 10.5%
(missing) | 1127 | 6.2%
(missing) | 1124 | 6.2%
(missing) | 986 | 5.4%
(missing) | 946 | 5.2%
(missing) | 782 | 4.3%
(missing) | 778 | 4.3%
(missing) | 724 | 4.0%
(missing) | 720 | 3.9%
(missing) | 496 | 2.7%
Other values (363) | 8651 | 47.4%

Uppercase Letter
Value | Count | Frequency (%)
K | 8 | 16.0%
E | 5 | 10.0%
I | 4 | 8.0%
T | 4 | 8.0%
A | 3 | 6.0%
G | 3 | 6.0%
S | 3 | 6.0%
B | 3 | 6.0%
O | 3 | 6.0%
R | 2 | 4.0%
Other values (9) | 12 | 24.0%

Decimal Number
Value | Count | Frequency (%)
1 | 1302 | 19.7%
2 | 958 | 14.5%
0 | 868 | 13.1%
3 | 790 | 11.9%
4 | 667 | 10.1%
5 | 598 | 9.0%
6 | 449 | 6.8%
7 | 380 | 5.7%
8 | 341 | 5.1%
9 | 269 | 4.1%

Other Punctuation
Value | Count | Frequency (%)
, | 1577 | 99.1%
. | 9 | 0.6%
/ | 3 | 0.2%
& | 1 | 0.1%
\ | 1 | 0.1%

Math Symbol
Value | Count | Frequency (%)
~ | 23 | 85.2%
(missing) | 2 | 7.4%
(missing) | 2 | 7.4%

Space Separator
Value | Count | Frequency (%)
(space) | 5303 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 933 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 928 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 276 | 100.0%

Letter Number
Value | Count | Frequency (%)
(missing) | 1 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
p | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 18260 | 53.7%
Common | 15680 | 46.1%
Latin | 52 | 0.2%

Most frequent character per script

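Hangul
Value | Count | Frequency (%)
(missing) | 1926 | 10.5%
(missing) | 1127 | 6.2%
(missing) | 1124 | 6.2%
(missing) | 986 | 5.4%
(missing) | 946 | 5.2%
(missing) | 782 | 4.3%
(missing) | 778 | 4.3%
(missing) | 724 | 4.0%
(missing) | 720 | 3.9%
(missing) | 496 | 2.7%
Other values (363) | 8651 | 47.4%

Common
Value | Count | Frequency (%)
(space) | 5303 | 33.8%
, | 1577 | 10.1%
1 | 1302 | 8.3%
2 | 958 | 6.1%
( | 933 | 6.0%
) | 928 | 5.9%
0 | 868 | 5.5%
3 | 790 | 5.0%
4 | 667 | 4.3%
5 | 598 | 3.8%
Other values (12) | 1756 | 11.2%

Latin
Value | Count | Frequency (%)
K | 8 | 15.4%
E | 5 | 9.6%
I | 4 | 7.7%
T | 4 | 7.7%
A | 3 | 5.8%
G | 3 | 5.8%
S | 3 | 5.8%
B | 3 | 5.8%
O | 3 | 5.8%
R | 2 | 3.8%
Other values (11) | 14 | 26.9%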

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 18260 | 53.7%
ASCII | 15727 | 46.3%
None | 2 | < 0.1%
Math Operators | 2 | < 0.1%
Number Forms | 1 | < 0.1%

Most frequent character per block

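ASCII
Value | Count | Frequency (%)
(space) | 5303 | 33.7%
, | 1577 | 10.0%
1 | 1302 | 8.3%
2 | 958 | 6.1%
( | 933 | 5.9%
) | 928 | 5.9%
0 | 868 | 5.5%
3 | 790 | 5.0%
4 | 667 | 4.2%
5 | 598 | 3.8%
Other values (30) | 1803 | 11.5%

Hangul
Value | Count | Frequency (%)
(missing) | 1926 | 10.5%
(missing) | 1127 | 6.2%
(missing) | 1124 | 6.2%
(missing) | 986 | 5.4%
(missing) | 946 | 5.2%
(missing) | 782 | 4.3%
(missing) | 778 | 4.3%
(missing) | 724 | 4.0%
(missing) | 720 | 3.9%
(missing) | 496 | 2.7%
Other values (363) | 8651 | 47.4%

None
Value | Count | Frequency (%)
(missing) | 2 | 100.0%

Math Operators
Value | Count | Frequency (%)
(missing) | 2 | 100.0%

Number Forms
Value | Count | Frequency (%)
(missing) | 1 | 100.0%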

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Row | 법무법인 | 소속회 | 주 소
0 | 광화문 법무법인 | 서울 | 서울 종로구 신문로 1가 24(고려빌딩 613-4)
1 | 동방종합 법무법인 | 서울 | 서울 중구 퇴계로 187(필동1가, 국제빌딩)
2 | 법무법인 삼덕 | 부산 | 부산 연제구 거제동 1490-3(세헌빌딩 4층)
3 | 법무법인 네이버스 | 서울 | 서울 종로구 종로3길 34, 701, 702호(삼송빌딩)
4 | 법무법인 동일 | 서울 | 서울 광진구 구의동 246-3(민도빌딩 1층)
5 | 법무법인 광장 | 서울 | 서울 중구 남대문로 63, 18층(한진빌딩 본관)
6 | 법무법인 을지 | 서울 | 서울 서초구 서초동 1554-5(한승아스트라빌딩 11층)
7 | 법무법인 대종 | 서울 | 서울 종로구 새문안로5길 13, 303호(변호사회관)
8 | 성심종합 법무법인 | 서울 | 서울 광진구 구의동 243-7(덕운빌딩 2층)
9 | 법무법인 북부합동 | 서울 | 서울 도봉구 마들로 732, 1, 3층(거봉빌딩)

Row | 법무법인 | 소속회 | 주 소
1097 | 법무법인 상록수 | 충남 | 충남 당진시 청룡길 159, 202호, 203호(용기빌딩)
1098 | 법무법인 조운 | 서울 | 서울특별시 서초구 반포대로30길 33, 2층(서초동, 블리스빌딩)
1099 | 법무법인 제민 | 서울 | 서울특별시 서초구 서초중앙로24길 16, 3층(서초동)
1100 | 법무법인 지헌 | 서울 | 서울특별시 서초구 반포대로28길 33, 3층(서초동, 33빌딩)
1101 | 법무법인 나율 | 경기 | 경기도 부천시 원미구 상일로 126 뉴법조타운 201호, 206호, 207호
1102 | 법무법인 선유 | 충남 | 천안시 동남구 청수14로 102, 에이스법조타워 503, 504, 505호
1103 | 법무법인 서중 | 서울 | 서울특별시 서초구 서초대로 274, 9층
1104 | 법무법인 수오재 | 서울 | 서울시 강남구 강남대로 302, 16층
1105 | 법무법인 청음 | 서울 | 서울특별시 서초구 법원로3길 26, 2층(서초동, 한림빌딩)
1106 | 법무법인 정평재 | 서울 | 서울특별시 서대문구 연희로 239, 3층

Duplicate rows

Most frequently occurring

Row | 법무법인 | 소속회 | 주 소 | # duplicates
0 | 법무법인 대신 | 부산 | 부산 연제구 법원로 38, 801호(거제동, 로펌빌딩) | 2
1 | 법무법인 법조 | 서울 | 서울 서초구 반포대로 106, 2층(서초동, 태흥빌딩 | 2
2 | 법무법인 호림 | 전북 | 전북 전주시 덕진구 사평로 43, 2층(덕진동1가, 통일빌딩) | 2
3 | 법무법인(유한) 원 | 서울 | 서울 서초구 강남대로 343, 신덕빌딩 | 2
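
A table like this can be produced by treating each record as a tuple of its three fields and counting exact repeats. The sketch below uses two rows taken from this report, one of them entered twice. It assumes that duplicates are rows identical in every column; `duplicate_rows` is an illustrative helper name.

```python
from collections import Counter

def duplicate_rows(rows):
    """Return {row: count} for every row that occurs more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

# Each row is (법무법인, 소속회, 주 소).
rows = [
    ("법무법인 대신", "부산", "부산 연제구 법원로 38, 801호(거제동, 로펌빌딩)"),
    ("법무법인 동일", "서울", "서울 광진구 구의동 246-3(민도빌딩 1층)"),
    ("법무법인 대신", "부산", "부산 연제구 법원로 38, 801호(거제동, 로펌빌딩)"),
]
dupes = duplicate_rows(rows)
for row, n in dupes.items():
    print(" | ".join(row), "|", n)
```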