Overview

Dataset statistics

Number of variables: 11
Number of observations: 1494
Missing cells: 2449
Missing cells (%): 14.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 131.4 KiB
Average record size in memory: 90.1 B
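
The two derived figures in this block follow from the raw counts. The sketch below recomputes them from the numbers reported here; the variable names are illustrative and not part of the report.

```python
# Recompute the derived dataset statistics from the reported raw counts.
n_variables = 11
n_observations = 1494
missing_cells = 2449
total_size_kib = 131.4

total_cells = n_variables * n_observations           # every cell in the table
missing_pct = 100 * missing_cells / total_cells      # share of empty cells
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both values round to the figures shown in the report (14.9% and 90.1 B).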

Variable types

Text: 6
Numeric: 1
Categorical: 3
Unsupported: 1

Dataset

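Description: 키값 (key), 등록번호 (registration number), 상호 (business name), 기관구분 (institution type), 행정시 (city), 행정구 (district), 행정동 (administrative dong, i.e. neighborhood), 대표자 (representative), 주소 (address), 도로명주소 (road-name address), 타겟국가 (target countries)
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-12973/S/1/datasetView.do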

Alerts

High correlation: 행정시 is highly overall correlated with 행정구
High correlation: 행정구 is highly overall correlated with 행정시
Imbalance: 행정시 is highly imbalanced (98.6%)
Missing: 도로명주소 has 1494 (100.0%) missing values
Missing: 타겟국가 has 953 (63.8%) missing values
Unique: 키값 has unique values
Unique: 등록번호 has unique values
Unsupported: 도로명주소 is an unsupported type, check if it needs cleaning or further analysis
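
The 98.6% imbalance figure can be reproduced from the class counts this report lists for 행정시. The sketch below assumes the score is one minus the Shannon entropy normalized by its maximum; that definition is an assumption, chosen because it matches the reported value, and the function name is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/H_max, where H is the Shannon entropy of the class counts."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)

# Class counts reported for 행정시: 서울특별시, 경기도, and the literal "<NA>" category.
score = imbalance_score([1491, 2, 1])
print(f"{score:.1%}")
```

A perfectly balanced column scores 0 under this definition, and a column dominated by one class approaches 1.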

Reproduction

Analysis started: 2024-04-16 22:38:03.083797
Analysis finished: 2024-04-16 22:38:04.242041
Duration: 1.16 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

키값
Text

UNIQUE 

Distinct: 1494
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 11.8 KiB

Length

Max length: 14
Median length: 14
Mean length: 14
Min length: 14

Characters and Unicode

Total characters: 20916
Distinct characters: 18
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
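
As an illustration of these character properties, the general category of each character can be read with Python's standard `unicodedata` module. This is a minimal sketch applied to the first 키값 sample value; the helper name is hypothetical.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count characters by Unicode general category (e.g. 'Lu', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)

counts = category_counts("BE_LiST21-1460")   # first sample row of 키값
print(dict(counts))
```

Each key has 6 decimal digits (`Nd`), 5 uppercase letters (`Lu`), and one each of lowercase letter (`Ll`), connector punctuation (`Pc`), and dash punctuation (`Pd`). Multiplied by the 1494 rows, this matches the category totals reported for this variable (for example 6 × 1494 = 8964 decimal numbers).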

Unique

Unique: 1494
Unique (%): 100.0%

Sample

1st row: BE_LiST21-1460
2nd row: BE_LiST21-1461
3rd row: BE_LiST21-1462
4th row: BE_LiST21-1463
5th row: BE_LiST21-1464

Value | Count | Frequency (%)
be_list21-1460 | 1 | 0.1%
be_list21-1010 | 1 | 0.1%
be_list21-1019 | 1 | 0.1%
be_list21-1018 | 1 | 0.1%
be_list21-1017 | 1 | 0.1%
be_list21-1016 | 1 | 0.1%
be_list21-1015 | 1 | 0.1%
be_list21-1014 | 1 | 0.1%
be_list21-1013 | 1 | 0.1%
be_list21-1012 | 1 | 0.1%
Other values (1484) | 1484 | 99.3%

Most occurring characters

Value | Count | Frequency (%)
1 | 2489 | 11.9%
2 | 1994 | 9.5%
0 | 1496 | 7.2%
B | 1494 | 7.1%
T | 1494 | 7.1%
E | 1494 | 7.1%
- | 1494 | 7.1%
S | 1494 | 7.1%
i | 1494 | 7.1%
L | 1494 | 7.1%
Other values (8) | 4479 | 21.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 8964 | 42.9%
Uppercase Letter | 7470 | 35.7%
Dash Punctuation | 1494 | 7.1%
Lowercase Letter | 1494 | 7.1%
Connector Punctuation | 1494 | 7.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 2489 | 27.8%
2 | 1994 | 22.2%
0 | 1496 | 16.7%
3 | 500 | 5.6%
4 | 495 | 5.5%
6 | 399 | 4.5%
5 | 399 | 4.5%
7 | 399 | 4.5%
8 | 399 | 4.5%
9 | 394 | 4.4%

Uppercase Letter
Value | Count | Frequency (%)
B | 1494 | 20.0%
T | 1494 | 20.0%
E | 1494 | 20.0%
S | 1494 | 20.0%
L | 1494 | 20.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1494 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
i | 1494 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 1494 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 11952 | 57.1%
Latin | 8964 | 42.9%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 2489 | 20.8%
2 | 1994 | 16.7%
0 | 1496 | 12.5%
- | 1494 | 12.5%
_ | 1494 | 12.5%
3 | 500 | 4.2%
4 | 495 | 4.1%
6 | 399 | 3.3%
5 | 399 | 3.3%
7 | 399 | 3.3%
Other values (2) | 793 | 6.6%

Latin
Value | Count | Frequency (%)
B | 1494 | 16.7%
T | 1494 | 16.7%
E | 1494 | 16.7%
S | 1494 | 16.7%
i | 1494 | 16.7%
L | 1494 | 16.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 20916 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 2489 | 11.9%
2 | 1994 | 9.5%
0 | 1496 | 7.2%
B | 1494 | 7.1%
T | 1494 | 7.1%
E | 1494 | 7.1%
- | 1494 | 7.1%
S | 1494 | 7.1%
i | 1494 | 7.1%
L | 1494 | 7.1%
Other values (8) | 4479 | 21.4%

등록번호
Real number (ℝ)

UNIQUE 

Distinct: 1494
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2337.7544
Minimum: 1
Maximum: 9998
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 13.3 KiB

Quantile statistics

Minimum: 1
5th percentile: 184.3
Q1: 1364
Median: 2577.5
Q3: 3369.75
95th percentile: 3988.7
Maximum: 9998
Range: 9997
Interquartile range (IQR): 2005.75

Descriptive statistics

Standard deviation: 1237.5948
Coefficient of variation (CV): 0.52939472
Kurtosis: -0.19297006
Mean: 2337.7544
Median Absolute Deviation (MAD): 947
Skewness: -0.22293602
Sum: 3492605
Variance: 1531640.9
Monotonicity: Not monotonic
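
Several of these statistics are simple functions of the others. The sketch below recomputes them from the values reported for 등록번호. The outlier fence at the end uses Tukey's 1.5 × IQR rule, which is an added convention for illustration and not something the report itself applies.

```python
# Values as reported for 등록번호 in this section.
mean, std = 2337.7544, 1237.5948
q1, q3 = 1364.0, 3369.75
minimum, maximum = 1, 9998

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
variance = std ** 2              # variance is the squared standard deviation

# Tukey's rule (an assumption added here): values above this fence are unusual.
upper_fence = q3 + 1.5 * iqr
print(round(cv, 4), iqr, value_range, round(variance, 1))
print("maximum above upper fence:", maximum > upper_fence)
```

Under that rule the maximum of 9998 lies above the fence of 6378.375, while the next largest value listed in the report, 4138, does not.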
Histogram with fixed size bins (bins=50)
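
Fixed-size binning can be sketched as follows, assuming equal-width bins that span the reported minimum and maximum, with the top edge counted in the last bin. The function name and the clamping detail are assumptions, not taken from the report.

```python
def bin_index(value, lo, hi, bins=50):
    """Map a value in [lo, hi] to one of `bins` equal-width bins (0-based)."""
    width = (hi - lo) / bins
    # The top edge belongs to the last bin, so clamp the index.
    return min(int((value - lo) / width), bins - 1)

lo, hi = 1, 9998                       # reported minimum and maximum of 등록번호
print("bin width:", (hi - lo) / 50)
print(bin_index(1, lo, hi), bin_index(4138, lo, hi), bin_index(9998, lo, hi))
```

With these edges, 4138 (the second-largest value listed) falls in bin 20 and 9998 in bin 49, so under this sketch the bins in between would be empty.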
Value | Count | Frequency (%)
4074 | 1 | 0.1%
3121 | 1 | 0.1%
3126 | 1 | 0.1%
3119 | 1 | 0.1%
3109 | 1 | 0.1%
3101 | 1 | 0.1%
3137 | 1 | 0.1%
3127 | 1 | 0.1%
3148 | 1 | 0.1%
3125 | 1 | 0.1%
Other values (1484) | 1484 | 99.3%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 0.1%
2 | 1 | 0.1%
4 | 1 | 0.1%
16 | 1 | 0.1%
20 | 1 | 0.1%
21 | 1 | 0.1%
22 | 1 | 0.1%
24 | 1 | 0.1%
25 | 1 | 0.1%
27 | 1 | 0.1%

Maximum 10 values
Value | Count | Frequency (%)
9998 | 1 | 0.1%
4138 | 1 | 0.1%
4137 | 1 | 0.1%
4136 | 1 | 0.1%
4135 | 1 | 0.1%
4133 | 1 | 0.1%
4132 | 1 | 0.1%
4127 | 1 | 0.1%
4126 | 1 | 0.1%
4125 | 1 | 0.1%

상호
Text

Distinct: 1420
Distinct (%): 95.0%
Missing: 0
Missing (%): 0.0%
Memory size: 11.8 KiB

Length

Max length: 29
Median length: 21
Mean length: 7.7175368
Min length: 3

Characters and Unicode

Total characters: 11530
Distinct characters: 517
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1367
Unique (%): 91.5%
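
The Distinct and Unique figures measure different things. The sketch below assumes the usual reading of these fields: Distinct is the number of different values, and Unique is the number of values that occur exactly once. The data is a toy list, not taken from the dataset, and the function name is hypothetical.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique

# Toy data, not taken from the dataset.
names = ["A의원", "B치과", "A의원", "C한의원", "B치과", "D병원"]
print(distinct_and_unique(names))   # (4, 2)
```

For 상호, the same reading gives 1420 / 1494 = 95.0% distinct and 1367 / 1494 = 91.5% unique, which matches the percentages reported above.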

Sample

1st row: 별이성형외과
2nd row: 허윞업신경외과의원
3rd row: 더마주피부과의원
4th row: 아이템의원피부과
5th row: 정다운임치과의원

Value | Count | Frequency (%)
성형외과 | 13 | 0.8%
유디치과의원 | 12 | 0.8%
의원 | 8 | 0.5%
성형외과의원 | 6 | 0.4%
의료법인 | 6 | 0.4%
한의원 | 4 | 0.3%
라마르의원 | 4 | 0.3%
리뉴미피부과의원 | 4 | 0.3%
미소랑치과의원 | 4 | 0.3%
경희봄한의원 | 4 | 0.3%
Other values (1462) | 1523 | 95.9%

Most occurring characters

ValueCountFrequency (%)
1328
 
11.5%
1174
 
10.2%
933
 
8.1%
409
 
3.5%
369
 
3.2%
344
 
3.0%
299
 
2.6%
239
 
2.1%
216
 
1.9%
206
 
1.8%
Other values (507) 6013
52.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 11190 | 97.1%
Space Separator | 95 | 0.8%
Uppercase Letter | 92 | 0.8%
Close Punctuation | 45 | 0.4%
Open Punctuation | 43 | 0.4%
Decimal Number | 39 | 0.3%
Lowercase Letter | 16 | 0.1%
Other Punctuation | 8 | 0.1%
Dash Punctuation | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1328
 
11.9%
1174
 
10.5%
933
 
8.3%
409
 
3.7%
369
 
3.3%
344
 
3.1%
299
 
2.7%
239
 
2.1%
216
 
1.9%
206
 
1.8%
Other values (455) 5673
50.7%
Uppercase Letter
ValueCountFrequency (%)
S 10
 
10.9%
J 7
 
7.6%
K 7
 
7.6%
A 6
 
6.5%
B 6
 
6.5%
P 6
 
6.5%
C 6
 
6.5%
T 5
 
5.4%
Y 5
 
5.4%
U 5
 
5.4%
Other values (11) 29
31.5%
Lowercase Letter
ValueCountFrequency (%)
e 4
25.0%
k 2
12.5%
c 1
 
6.2%
m 1
 
6.2%
n 1
 
6.2%
t 1
 
6.2%
r 1
 
6.2%
u 1
 
6.2%
l 1
 
6.2%
o 1
 
6.2%
Other values (2) 2
12.5%
Decimal Number
ValueCountFrequency (%)
3 8
20.5%
6 8
20.5%
5 8
20.5%
2 4
10.3%
7 3
 
7.7%
1 3
 
7.7%
9 2
 
5.1%
8 1
 
2.6%
0 1
 
2.6%
4 1
 
2.6%
Other Punctuation
ValueCountFrequency (%)
. 2
25.0%
& 2
25.0%
/ 2
25.0%
, 1
12.5%
? 1
12.5%
Space Separator
ValueCountFrequency (%)
95
100.0%
Close Punctuation
ValueCountFrequency (%)
) 45
100.0%
Open Punctuation
ValueCountFrequency (%)
( 43
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 11190
97.1%
Common 232
 
2.0%
Latin 108
 
0.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1328
 
11.9%
1174
 
10.5%
933
 
8.3%
409
 
3.7%
369
 
3.3%
344
 
3.1%
299
 
2.7%
239
 
2.1%
216
 
1.9%
206
 
1.8%
Other values (455) 5673
50.7%
Latin
ValueCountFrequency (%)
S 10
 
9.3%
J 7
 
6.5%
K 7
 
6.5%
A 6
 
5.6%
B 6
 
5.6%
P 6
 
5.6%
C 6
 
5.6%
T 5
 
4.6%
Y 5
 
4.6%
U 5
 
4.6%
Other values (23) 45
41.7%
Common
ValueCountFrequency (%)
95
40.9%
) 45
19.4%
( 43
18.5%
3 8
 
3.4%
6 8
 
3.4%
5 8
 
3.4%
2 4
 
1.7%
7 3
 
1.3%
1 3
 
1.3%
. 2
 
0.9%
Other values (9) 13
 
5.6%

Most occurring blocks

ValueCountFrequency (%)
Hangul 11190
97.1%
ASCII 340
 
2.9%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1328
 
11.9%
1174
 
10.5%
933
 
8.3%
409
 
3.7%
369
 
3.3%
344
 
3.1%
299
 
2.7%
239
 
2.1%
216
 
1.9%
206
 
1.8%
Other values (455) 5673
50.7%
ASCII
ValueCountFrequency (%)
95
27.9%
) 45
13.2%
( 43
12.6%
S 10
 
2.9%
3 8
 
2.4%
6 8
 
2.4%
5 8
 
2.4%
J 7
 
2.1%
K 7
 
2.1%
A 6
 
1.8%
Other values (42) 103
30.3%

기관구분
Categorical

Distinct: 9
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 11.8 KiB

Length

Max length: 6
Median length: 2
Mean length: 2.5823293
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 의원
2nd row: 의원
3rd row: 의원
4th row: 의원
5th row: 치과의원

Common Values

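Value | Count | Frequency (%)
의원 (clinic) | 878 | 58.8%
치과의원 (dental clinic) | 245 | 16.4%
한의원 (Korean medicine clinic) | 154 | 10.3%
병원 (hospital) | 116 | 7.8%
치과병원 (dental hospital) | 38 | 2.5%
종합병원 (general hospital) | 31 | 2.1%
상급종합병원 (tertiary general hospital) | 16 | 1.1%
한방병원 (Korean medicine hospital) | 12 | 0.8%
기타 (other) | 4 | 0.3%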

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

행정시
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 11.8 KiB

Length

Max length: 5
Median length: 5
Mean length: 4.9966533
Min length: 3

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 서울특별시
2nd row: 서울특별시
3rd row: 서울특별시
4th row: 서울특별시
5th row: 서울특별시

Common Values

Value | Count | Frequency (%)
서울특별시 | 1491 | 99.8%
경기도 | 2 | 0.1%
<NA> | 1 | 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]
Value | Count | Frequency (%)
서울특별시 | 1491 | 99.8%
경기도 | 2 | 0.1%
na | 1 | 0.1%

행정구
Categorical

HIGH CORRELATION 

Distinct: 28
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 11.8 KiB

Length

Max length: 8
Median length: 3
Mean length: 3.0060241
Min length: 2

Unique

Unique: 3
Unique (%): 0.2%

Sample

1st row: 강남구
2nd row: 강남구
3rd row: 강남구
4th row: 강남구
5th row: 중랑구

Common Values

Value | Count | Frequency (%)
강남구 | 740 | 49.5%
서초구 | 193 | 12.9%
중구 | 94 | 6.3%
영등포구 | 46 | 3.1%
송파구 | 45 | 3.0%
강서구 | 40 | 2.7%
마포구 | 33 | 2.2%
동대문구 | 28 | 1.9%
종로구 | 24 | 1.6%
용산구 | 22 | 1.5%
Other values (18) | 229 | 15.3%

Length

[Figure: histogram of lengths of the category]
Value | Count | Frequency (%)
강남구 | 740 | 49.5%
서초구 | 193 | 12.9%
중구 | 94 | 6.3%
영등포구 | 46 | 3.1%
송파구 | 45 | 3.0%
강서구 | 40 | 2.7%
마포구 | 33 | 2.2%
동대문구 | 28 | 1.9%
종로구 | 24 | 1.6%
용산구 | 22 | 1.5%
Other values (19) | 231 | 15.4%

행정동
Text

Distinct: 246
Distinct (%): 16.5%
Missing: 1
Missing (%): 0.1%
Memory size: 11.8 KiB

Length

Max length: 11
Median length: 4
Mean length: 3.7139987
Min length: 2

Characters and Unicode

Total characters: 5545
Distinct characters: 163
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 118
Unique (%): 7.9%

Sample

1st row: 청담동
2nd row: 도곡1동
3rd row: 삼성1동
4th row: 논현1동
5th row: 중화2동

Value | Count | Frequency (%)
압구정동 | 140 | 9.4%
역삼1동 | 137 | 9.2%
신사동 | 123 | 8.2%
논현1동 | 91 | 6.1%
청담동 | 90 | 6.0%
서초4동 | 75 | 5.0%
명동 | 56 | 3.8%
논현2동 | 56 | 3.8%
잠원동 | 28 | 1.9%
삼성1동 | 21 | 1.4%
Other values (236) | 676 | 45.3%

Most occurring characters

ValueCountFrequency (%)
1497
27.0%
1 405
 
7.3%
2 203
 
3.7%
193
 
3.5%
169
 
3.0%
169
 
3.0%
157
 
2.8%
152
 
2.7%
147
 
2.7%
147
 
2.7%
Other values (153) 2306
41.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4633 | 83.6%
Decimal Number | 854 | 15.4%
Other Punctuation | 58 | 1.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1497
32.3%
193
 
4.2%
169
 
3.6%
169
 
3.6%
157
 
3.4%
152
 
3.3%
147
 
3.2%
147
 
3.2%
140
 
3.0%
140
 
3.0%
Other values (143) 1722
37.2%
Decimal Number
ValueCountFrequency (%)
1 405
47.4%
2 203
23.8%
4 121
 
14.2%
3 76
 
8.9%
6 22
 
2.6%
5 15
 
1.8%
7 10
 
1.2%
8 1
 
0.1%
0 1
 
0.1%
Other Punctuation
ValueCountFrequency (%)
. 58
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 4633
83.6%
Common 912
 
16.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1497
32.3%
193
 
4.2%
169
 
3.6%
169
 
3.6%
157
 
3.4%
152
 
3.3%
147
 
3.2%
147
 
3.2%
140
 
3.0%
140
 
3.0%
Other values (143) 1722
37.2%
Common
ValueCountFrequency (%)
1 405
44.4%
2 203
22.3%
4 121
 
13.3%
3 76
 
8.3%
. 58
 
6.4%
6 22
 
2.4%
5 15
 
1.6%
7 10
 
1.1%
8 1
 
0.1%
0 1
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 4633
83.6%
ASCII 912
 
16.4%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1497
32.3%
193
 
4.2%
169
 
3.6%
169
 
3.6%
157
 
3.4%
152
 
3.3%
147
 
3.2%
147
 
3.2%
140
 
3.0%
140
 
3.0%
Other values (143) 1722
37.2%
ASCII
ValueCountFrequency (%)
1 405
44.4%
2 203
22.3%
4 121
 
13.3%
3 76
 
8.3%
. 58
 
6.4%
6 22
 
2.4%
5 15
 
1.6%
7 10
 
1.1%
8 1
 
0.1%
0 1
 
0.1%

대표자
Text

Distinct: 1426
Distinct (%): 95.4%
Missing: 0
Missing (%): 0.0%
Memory size: 11.8 KiB

Length

Max length: 17
Median length: 3
Mean length: 3.2925033
Min length: 2

Characters and Unicode

Total characters: 4919
Distinct characters: 239
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1366
Unique (%): 91.4%

Sample

1st row: 홍왕광
2nd row: 안계훈
3rd row: 김주영
4th row: 이진화
5th row: 강철규
ValueCountFrequency (%)
8
 
0.5%
1명 5
 
0.3%
김영수 4
 
0.3%
이규장 3
 
0.2%
임영진 3
 
0.2%
김종민 3
 
0.2%
곽영태 3
 
0.2%
김형섭 3
 
0.2%
황경식 3
 
0.2%
2명 3
 
0.2%
Other values (1422) 1475
97.5%

Most occurring characters

ValueCountFrequency (%)
301
 
6.1%
237
 
4.8%
135
 
2.7%
134
 
2.7%
127
 
2.6%
124
 
2.5%
123
 
2.5%
103
 
2.1%
100
 
2.0%
90
 
1.8%
Other values (229) 3445
70.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4753 | 96.6%
Decimal Number | 103 | 2.1%
Other Punctuation | 32 | 0.7%
Space Separator | 19 | 0.4%
Uppercase Letter | 10 | 0.2%
Close Punctuation | 1 | < 0.1%
Open Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
301
 
6.3%
237
 
5.0%
135
 
2.8%
134
 
2.8%
127
 
2.7%
124
 
2.6%
123
 
2.6%
103
 
2.2%
100
 
2.1%
90
 
1.9%
Other values (208) 3279
69.0%
Uppercase Letter
ValueCountFrequency (%)
O 1
10.0%
S 1
10.0%
T 1
10.0%
A 1
10.0%
N 1
10.0%
C 1
10.0%
H 1
10.0%
E 1
10.0%
W 1
10.0%
I 1
10.0%
Decimal Number
ValueCountFrequency (%)
1 70
68.0%
2 25
 
24.3%
3 3
 
2.9%
5 2
 
1.9%
4 2
 
1.9%
7 1
 
1.0%
Other Punctuation
ValueCountFrequency (%)
/ 24
75.0%
, 8
 
25.0%
Space Separator
ValueCountFrequency (%)
19
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 4753
96.6%
Common 156
 
3.2%
Latin 10
 
0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
301
 
6.3%
237
 
5.0%
135
 
2.8%
134
 
2.8%
127
 
2.7%
124
 
2.6%
123
 
2.6%
103
 
2.2%
100
 
2.1%
90
 
1.9%
Other values (208) 3279
69.0%
Common
ValueCountFrequency (%)
1 70
44.9%
2 25
 
16.0%
/ 24
 
15.4%
19
 
12.2%
, 8
 
5.1%
3 3
 
1.9%
5 2
 
1.3%
4 2
 
1.3%
) 1
 
0.6%
( 1
 
0.6%
Latin
ValueCountFrequency (%)
O 1
10.0%
S 1
10.0%
T 1
10.0%
A 1
10.0%
N 1
10.0%
C 1
10.0%
H 1
10.0%
E 1
10.0%
W 1
10.0%
I 1
10.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 4753
96.6%
ASCII 166
 
3.4%

Most frequent character per block

Hangul
ValueCountFrequency (%)
301
 
6.3%
237
 
5.0%
135
 
2.8%
134
 
2.8%
127
 
2.7%
124
 
2.6%
123
 
2.6%
103
 
2.2%
100
 
2.1%
90
 
1.9%
Other values (208) 3279
69.0%
ASCII
ValueCountFrequency (%)
1 70
42.2%
2 25
 
15.1%
/ 24
 
14.5%
19
 
11.4%
, 8
 
4.8%
3 3
 
1.8%
5 2
 
1.2%
4 2
 
1.2%
O 1
 
0.6%
S 1
 
0.6%
Other values (11) 11
 
6.6%

주소
Text

Distinct: 1479
Distinct (%): 99.1%
Missing: 1
Missing (%): 0.1%
Memory size: 11.8 KiB

Length

Max length: 60
Median length: 45
Mean length: 27.843269
Min length: 9

Characters and Unicode

Total characters: 41570
Distinct characters: 458
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1468
Unique (%): 98.3%

Sample

1st row: 서울특별시 강남구 도산대로 443 청담빌딩 3,4층
2nd row: 서울특별시 강남구 강남대로 254 용문빌딩 4층
3rd row: 서울특별시 강남구 영동대로 607 랜드마크빌딩
4th row: 서울특별시 강남구 도산대로 134
5th row: 중랑구 중랑역로 51, 대종빌딩 3층

Value | Count | Frequency (%)
서울특별시 | 1404 | 17.6%
강남구 | 737 | 9.3%
서초구 | 192 | 2.4%
신사동 | 153 | 1.9%
강남대로 | 132 | 1.7%
3층 | 102 | 1.3%
중구 | 93 | 1.2%
2층 | 86 | 1.1%
도산대로 | 57 | 0.7%
4층 | 57 | 0.7%
Other values (2651) | 4950 | 62.2%

Most occurring characters

ValueCountFrequency (%)
6516
 
15.7%
1896
 
4.6%
1597
 
3.8%
1485
 
3.6%
1474
 
3.5%
1 1458
 
3.5%
1408
 
3.4%
1407
 
3.4%
1346
 
3.2%
2 1084
 
2.6%
Other values (448) 21899
52.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 24570 | 59.1%
Decimal Number | 7762 | 18.7%
Space Separator | 6516 | 15.7%
Other Punctuation | 805 | 1.9%
Dash Punctuation | 575 | 1.4%
Close Punctuation | 538 | 1.3%
Open Punctuation | 538 | 1.3%
Uppercase Letter | 180 | 0.4%
Math Symbol | 49 | 0.1%
Lowercase Letter | 35 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1896
 
7.7%
1597
 
6.5%
1485
 
6.0%
1474
 
6.0%
1408
 
5.7%
1407
 
5.7%
1346
 
5.5%
1028
 
4.2%
992
 
4.0%
946
 
3.9%
Other values (391) 10991
44.7%
Uppercase Letter
ValueCountFrequency (%)
B 31
17.2%
A 15
 
8.3%
K 14
 
7.8%
I 12
 
6.7%
M 12
 
6.7%
S 11
 
6.1%
E 10
 
5.6%
J 9
 
5.0%
F 9
 
5.0%
C 8
 
4.4%
Other values (12) 49
27.2%
Lowercase Letter
ValueCountFrequency (%)
e 6
17.1%
i 4
11.4%
v 4
11.4%
n 3
8.6%
u 3
8.6%
b 2
 
5.7%
r 2
 
5.7%
s 2
 
5.7%
t 2
 
5.7%
a 2
 
5.7%
Other values (4) 5
14.3%
Decimal Number
ValueCountFrequency (%)
1 1458
18.8%
2 1084
14.0%
3 963
12.4%
5 763
9.8%
4 753
9.7%
0 703
9.1%
6 670
8.6%
8 509
 
6.6%
7 499
 
6.4%
9 360
 
4.6%
Other Punctuation
ValueCountFrequency (%)
, 784
97.4%
. 10
 
1.2%
/ 6
 
0.7%
& 4
 
0.5%
: 1
 
0.1%
Space Separator
ValueCountFrequency (%)
6516
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 575
100.0%
Close Punctuation
ValueCountFrequency (%)
) 538
100.0%
Open Punctuation
ValueCountFrequency (%)
( 538
100.0%
Math Symbol
ValueCountFrequency (%)
~ 49
100.0%
Other Symbol
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 24572
59.1%
Common 16783
40.4%
Latin 215
 
0.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1896
 
7.7%
1597
 
6.5%
1485
 
6.0%
1474
 
6.0%
1408
 
5.7%
1407
 
5.7%
1346
 
5.5%
1028
 
4.2%
992
 
4.0%
946
 
3.8%
Other values (392) 10993
44.7%
Latin
ValueCountFrequency (%)
B 31
 
14.4%
A 15
 
7.0%
K 14
 
6.5%
I 12
 
5.6%
M 12
 
5.6%
S 11
 
5.1%
E 10
 
4.7%
J 9
 
4.2%
F 9
 
4.2%
C 8
 
3.7%
Other values (26) 84
39.1%
Common
ValueCountFrequency (%)
6516
38.8%
1 1458
 
8.7%
2 1084
 
6.5%
3 963
 
5.7%
, 784
 
4.7%
5 763
 
4.5%
4 753
 
4.5%
0 703
 
4.2%
6 670
 
4.0%
- 575
 
3.4%
Other values (10) 2514
 
15.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 24570
59.1%
ASCII 16998
40.9%
None 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6516
38.3%
1 1458
 
8.6%
2 1084
 
6.4%
3 963
 
5.7%
, 784
 
4.6%
5 763
 
4.5%
4 753
 
4.4%
0 703
 
4.1%
6 670
 
3.9%
- 575
 
3.4%
Other values (46) 2729
16.1%
Hangul
ValueCountFrequency (%)
1896
 
7.7%
1597
 
6.5%
1485
 
6.0%
1474
 
6.0%
1408
 
5.7%
1407
 
5.7%
1346
 
5.5%
1028
 
4.2%
992
 
4.0%
946
 
3.9%
Other values (391) 10991
44.7%
None
ValueCountFrequency (%)
2
100.0%

도로명주소
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 1494
Missing (%): 100.0%
Memory size: 13.3 KiB

타겟국가
Text

MISSING 

Distinct: 81
Distinct (%): 15.0%
Missing: 953
Missing (%): 63.8%
Memory size: 11.8 KiB

Length

Max length: 35
Median length: 20
Mean length: 10.500924
Min length: 2

Characters and Unicode

Total characters: 5681
Distinct characters: 35
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 36
Unique (%): 6.7%

Sample

1st row: 중국,
2nd row: 중국,러시아,몽골,
3rd row: 미국,일본,중국,러시아,중동,몽골,베트남,
4th row: 미국,중국,
5th row: 일본,중국,몽골,
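
Value | Count | Frequency (%)
미국,일본,중국 | 94 | 17.3%
중국 | 59 | 10.9%
미국,일본,중국,러시아,중동,몽골,베트남 | 49 | 9.0%
일본,중국 | 36 | 6.6%
미국,중국 | 31 | 5.7%
미국,일본,중국,러시아,몽골 | 30 | 5.5%
미국,일본,중국,러시아 | 22 | 4.1%
일본,중국,러시아 | 12 | 2.2%
중국,러시아 | 11 | 2.0%
미국 | 11 | 2.0%
Other values (72) | 188 | 34.6%

Country names: 미국 (United States), 일본 (Japan), 중국 (China), 러시아 (Russia), 중동 (Middle East), 몽골 (Mongolia), 베트남 (Vietnam).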
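
Each 타겟국가 cell is a comma-separated list with a trailing comma, so per-country counts require splitting the cells first. The sketch below does that for the five sample rows of this variable; the function name is hypothetical and `None` stands in for a missing cell.

```python
from collections import Counter

def parse_countries(cell):
    """Split a comma-separated 타겟국가 cell, dropping empty items such as the trailing one."""
    if cell is None:
        return []
    return [part.strip() for part in cell.split(",") if part.strip()]

# The five sample rows of this variable, plus one missing cell.
rows = [
    "중국,",
    "중국,러시아,몽골,",
    "미국,일본,중국,러시아,중동,몽골,베트남,",
    "미국,중국,",
    "일본,중국,몽골,",
    None,
]
per_country = Counter(c for cell in rows for c in parse_countries(cell))
print(per_country.most_common(2))
```

Counting per country rather than per cell avoids treating "미국,중국" and "중국" as unrelated categories.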

Most occurring characters

ValueCountFrequency (%)
, 1763
31.0%
821
14.5%
575
 
10.1%
334
 
5.9%
334
 
5.9%
334
 
5.9%
218
 
3.8%
210
 
3.7%
205
 
3.6%
192
 
3.4%
Other values (25) 695
 
12.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 3916
68.9%
Other Punctuation 1763
31.0%
Space Separator 2
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
821
21.0%
575
14.7%
334
8.5%
334
8.5%
334
8.5%
218
 
5.6%
210
 
5.4%
205
 
5.2%
192
 
4.9%
192
 
4.9%
Other values (23) 501
12.8%
Other Punctuation
ValueCountFrequency (%)
, 1763
100.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 3916
68.9%
Common 1765
31.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
821
21.0%
575
14.7%
334
8.5%
334
8.5%
334
8.5%
218
 
5.6%
210
 
5.4%
205
 
5.2%
192
 
4.9%
192
 
4.9%
Other values (23) 501
12.8%
Common
ValueCountFrequency (%)
, 1763
99.9%
2
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 3916
68.9%
ASCII 1765
31.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
, 1763
99.9%
2
 
0.1%
Hangul
ValueCountFrequency (%)
821
21.0%
575
14.7%
334
8.5%
334
8.5%
334
8.5%
218
 
5.6%
210
 
5.4%
205
 
5.2%
192
 
4.9%
192
 
4.9%
Other values (23) 501
12.8%

Interactions

[Figure: interaction plot]

Correlations

 | 등록번호 | 기관구분 | 행정시 | 행정구 | 타겟국가
등록번호 | 1.000 | 0.217 | 0.000 | 0.177 | 0.676
기관구분 | 0.217 | 1.000 | 0.000 | 0.460 | 0.628
행정시 | 0.000 | 0.000 | 1.000 | 1.000 | 0.000
행정구 | 0.177 | 0.460 | 1.000 | 1.000 | 0.634
타겟국가 | 0.676 | 0.628 | 0.000 | 0.634 | 1.000

 | 행정시 | 행정구 | 기관구분
행정시 | 1.000 | 0.992 | 0.000
행정구 | 0.992 | 1.000 | 0.170
기관구분 | 0.000 | 0.170 | 1.000

 | 등록번호 | 기관구분 | 행정시 | 행정구
등록번호 | 1.000 | 0.109 | 0.000 | 0.078
기관구분 | 0.109 | 1.000 | 0.000 | 0.170
행정시 | 0.000 | 0.000 | 1.000 | 0.992
행정구 | 0.078 | 0.170 | 0.992 | 1.000
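
For pairs of categorical variables such as 행정시 and 행정구, one standard association measure is Cramér's V. This text does not say which statistic produced each matrix, so the sketch below is only an illustration of the measure, on toy data and without the bias correction a profiler may apply. The function name is hypothetical.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Cramér's V (no bias correction) for two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row = Counter(xs)
    col = Counter(ys)
    k = min(len(row), len(col)) - 1
    if n == 0 or k == 0:
        return 0.0
    chi2 = 0.0
    for r, r_count in row.items():
        for c, c_count in col.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))

# Toy data: the district fully determines the city, so the association is perfect.
city = ["Seoul", "Seoul", "Seoul", "Gyeonggi", "Gyeonggi", "Gyeonggi"]
district = ["Gangnam", "Seocho", "Gangnam", "Bundang", "Bundang", "Bundang"]
print(cramers_v(city, district))
```

A district name nearly always implies its city, which is consistent with the high correlation alert between 행정시 and 행정구.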

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
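
Nullity correlation can be sketched as the Pearson correlation between the present/missing masks of two columns. This is a minimal version on toy columns; the function name is hypothetical and `None` marks a missing cell.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation of two present/missing masks; None if either mask is constant."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None  # a fully missing or fully present column has no defined correlation
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    return cov / math.sqrt(var_a * var_b)

# Toy columns, not taken from the dataset.
col_x = ["a", None, "c", None]
col_y = ["p", None, "r", None]
col_z = [None, None, None, None]
print(nullity_correlation(col_x, col_y))   # 1.0: missing in exactly the same rows
print(nullity_correlation(col_x, col_z))   # None: col_z is entirely missing
```

A column that is 100% missing, as 도로명주소 is here, has a constant mask, so this measure is undefined for it.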

Sample

First rows

(index) | 키값 | 등록번호 | 상호 | 기관구분 | 행정시 | 행정구 | 행정동 | 대표자 | 주소 | 도로명주소 | 타겟국가
0 | BE_LiST21-1460 | 4074 | 별이성형외과 | 의원 | 서울특별시 | 강남구 | 청담동 | 홍왕광 | 서울특별시 강남구 도산대로 443 청담빌딩 3,4층 | <NA> | 중국,
1 | BE_LiST21-1461 | 4055 | 허윞업신경외과의원 | 의원 | 서울특별시 | 강남구 | 도곡1동 | 안계훈 | 서울특별시 강남구 강남대로 254 용문빌딩 4층 | <NA> | 중국,러시아,몽골,
2 | BE_LiST21-1462 | 4052 | 더마주피부과의원 | 의원 | 서울특별시 | 강남구 | 삼성1동 | 김주영 | 서울특별시 강남구 영동대로 607 랜드마크빌딩 | <NA> | 미국,일본,중국,러시아,중동,몽골,베트남,
3 | BE_LiST21-1463 | 4081 | 아이템의원피부과 | 의원 | 서울특별시 | 강남구 | 논현1동 | 이진화 | 서울특별시 강남구 도산대로 134 | <NA> | 미국,중국,
4 | BE_LiST21-1464 | 4078 | 정다운임치과의원 | 치과의원 | 서울특별시 | 중랑구 | 중화2동 | 강철규 | 중랑구 중랑역로 51, 대종빌딩 3층 | <NA> | <NA>
5 | BE_LiST21-1465 | 4084 | 스타동안피부과의원 | 의원 | 서울특별시 | 서초구 | 잠원동 | 최호철 | 서울시 서초구 잠원동 6-1 리버사이드호텔 5층 스타동원피부과의원 | <NA> | <NA>
6 | BE_LiST21-1466 | 4088 | 디테일의원 | 의원 | 서울특별시 | 강남구 | 논현2동 | 김진명 | 서울시 강남구 논현동 61-6번지 지안빌딩 2층 | <NA> | 일본,중국,몽골,
7 | BE_LiST21-1467 | 4083 | 압구정현치과의원 | 치과의원 | 서울특별시 | 강남구 | 신사동 | 박현 | 서울특별시 강남구 압구정로 164 아세아빌딩 | <NA> | <NA>
8 | BE_LiST21-1326 | 3747 | 제이엠제이의원 | 의원 | 서울특별시 | 강남구 | 압구정동 | 박주연 | 서울특별시 강남구 신사동 659-13 3층 | <NA> | 중국,러시아,
9 | BE_LiST21-1327 | 3763 | 담의원 | 의원 | 서울특별시 | 강남구 | 삼성1동 | 김홍두 | 서울특별시 강남구 삼성로126길 6 보고재빌딩 B2 | <NA> | 미국,중국,

Last rows

(index) | 키값 | 등록번호 | 상호 | 기관구분 | 행정시 | 행정구 | 행정동 | 대표자 | 주소 | 도로명주소 | 타겟국가
1484 | BE_LiST21-1136 | 3337 | 서울복지한방병원 | 한방병원 | 서울특별시 | 영등포구 | 대림2동 | 김영대 | 서울특별시 영등포구 도림로 144 (대림동) | <NA> | <NA>
1485 | BE_LiST21-1137 | 3359 | 리영의원 | 의원 | 서울특별시 | 서초구 | 서초4동 | 안상태 | 서울특별시 서초구 서초대로77길 54, 서초더블유타워 602호 | <NA> | <NA>
1486 | BE_LiST21-1138 | 9998 | 의료기관명국문 | 병원 | 서울특별시 | 강서구 | 발산1동 | 대표자국문 | 서울 강서구 내발산동 마곡수명산파크아파트 아파트아파트아파트 408동 201호 11111 | <NA> | 미국,
1487 | BE_LiST21-1139 | 3373 | 키위 (KIWI) 성형외과 | 의원 | 서울특별시 | 강남구 | 역삼1동 | 김지훈 | 서울특별시 강남구 강남대로 406, 1402(역삼동) | <NA> | <NA>
1488 | BE_LiST21-1140 | 3384 | 남상천한의원 | 한의원 | 서울특별시 | 서초구 | 서초3동 | 정철 | 서울특별시 서초구 반포대로 109 (서초동,서초빌딩3층,5층) | <NA> | <NA>
1489 | BE_LiST21-0519 | 1610 | 남대문외과의원 | 의원 | 서울특별시 | 중구 | 회현동 | 임진상 | 서울특별시 중구 남창동 9-15 금오빌딩 9층 | <NA> | <NA>
1490 | BE_LiST21-0520 | 1616 | 최병기치과의원 | 치과의원 | 서울특별시 | 노원구 | 공릉1동 | 최병기 | 서울특별시 노원구 공릉1동 581-1 공릉쇼핑 3층 | <NA> | <NA>
1491 | BE_LiST21-0521 | 1617 | 플로렌치과의원 | 치과의원 | 서울특별시 | 강남구 | 압구정동 | 오경아 | 서울특별시 강남구 신사동 653-16 (2,3층) | <NA> | <NA>
1492 | BE_LiST21-0522 | 1625 | 서울탑치과병원 | 치과병원 | 서울특별시 | 서초구 | 서초3동 | 김현종외1명 | 서울특별시 서초구 서초중앙로 39(서초동,5층) | <NA> | <NA>
1493 | BE_LiST21-0523 | 1630 | 사계절한의원 | 한의원 | 서울특별시 | 중구 | 명동 | 김계진 | 서울특별시 중구 을지로2가 163-3 보승빌딩 2층 | <NA> | <NA>