Overview

Dataset statistics

Number of variables: 16
Number of observations: 5328
Missing cells: 15632
Missing cells (%): 18.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 692.1 KiB
Average record size in memory: 133.0 B
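
The percentage and the record size above follow directly from the other figures in this table. As a minimal worked check (the variable names are illustrative only):

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 16
n_observations = 5328
missing_cells = 15632
total_size_kib = 692.1

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size in bytes (1 KiB = 1024 bytes).
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```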

Variable types

Numeric: 3
Text: 4
DateTime: 4
Categorical: 3
Unsupported: 2

Dataset

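Description: 등록번호 (registration number), 상호 (business name), 허가일자 (permit date), 주소 (address), 상세주소 (detailed address), 업소상태코드 (business status code), 업소상태 (business status), 관할구청코드 (district office code), 관할구청 (district office), 전화번호 (phone number), 국적 (nationality), 보험시작일 (insurance start date), 보험만료일 (insurance expiry date), 변경일자 (change date), 변경사유코드 (change reason code), 변경사유 (change reason)
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-2246/S/1/datasetView.do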

Alerts

국적 has a high cardinality: 51 distinct values (High cardinality)
업소상태코드 is highly overall correlated with 업소상태 (High correlation)
관할구청코드 is highly overall correlated with 관할구청 (High correlation)
업소상태 is highly overall correlated with 업소상태코드 (High correlation)
관할구청 is highly overall correlated with 관할구청코드 (High correlation)
국적 is highly imbalanced (85.3%) (Imbalance)
상세주소 has 844 (15.8%) missing values (Missing)
전화번호 has 714 (13.4%) missing values (Missing)
보험시작일 has 181 (3.4%) missing values (Missing)
보험만료일 has 181 (3.4%) missing values (Missing)
변경일자 has 3056 (57.4%) missing values (Missing)
변경사유코드 has 5328 (100.0%) missing values (Missing)
변경사유 has 5328 (100.0%) missing values (Missing)
등록번호 has unique values (Unique)
변경사유코드 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
변경사유 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
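
The missing-value alerts can be cross-checked against the dataset statistics. The sketch below copies the per-column counts from the alerts, confirms that they add up to the reported 15632 missing cells, and separates the fully empty columns from the partially missing ones. The variable names are illustrative, and what to do with each group is left open.

```python
# Missing-value counts copied from the alerts above.
n_rows = 5328
missing = {
    "상세주소": 844,
    "전화번호": 714,
    "보험시작일": 181,
    "보험만료일": 181,
    "변경일자": 3056,
    "변경사유코드": 5328,
    "변경사유": 5328,
}

total_missing = sum(missing.values())

# Columns with no values at all carry no information for analysis.
empty_columns = [name for name, count in missing.items() if count == n_rows]

# Share of missing rows for the remaining columns, in percent.
partial_pct = {
    name: round(100 * count / n_rows, 1)
    for name, count in missing.items()
    if count < n_rows
}

print(total_missing)
print(empty_columns)
print(partial_pct)
```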

Reproduction

Analysis started: 2024-05-03 20:07:34.658530
Analysis finished: 2024-05-03 20:07:44.045846
Duration: 9.39 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
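
A report of this kind is usually generated with a few lines of Python. The following is only a sketch, assuming the dataset has been saved from the URL above as a local CSV file. The file names `dataset.csv` and `report.html`, the report title, and the function name `build_report` are hypothetical; the settings actually used for this run are the ones recorded in config.json.

```python
def build_report(csv_path="dataset.csv", html_path="report.html"):
    """Profile a CSV export and write an HTML report (file names are hypothetical)."""
    # Imported inside the function so that this sketch can be loaded
    # even where the optional dependencies are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The four columns that this profile reports as dates.
    date_columns = ["허가일자", "보험시작일", "보험만료일", "변경일자"]
    df = pd.read_csv(csv_path, parse_dates=date_columns)

    report = ProfileReport(df, title="Dataset profile")
    report.to_file(html_path)
    return html_path
```

The default text encoding of `read_csv` is assumed here; an export in a different encoding would need the `encoding` argument.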

Variables

등록번호
Real number (ℝ)

UNIQUE 

Distinct: 5328
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2933.3095
Minimum: 2
Maximum: 7452
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 47.0 KiB

Quantile statistics

Minimum: 2
5-th percentile: 339.35
Q1: 1629.75
Median: 2962.5
Q3: 4295.25
95-th percentile: 5364.65
Maximum: 7452
Range: 7450
Interquartile range (IQR): 2665.5

Descriptive statistics

Standard deviation: 1590.004
Coefficient of variation (CV): 0.54205124
Kurtosis: -1.1210164
Mean: 2933.3095
Median Absolute Deviation (MAD): 1333
Skewness: -0.067556596
Sum: 15628673
Variance: 2528112.8
Monotonicity: Not monotonic
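
Several of the figures for 등록번호 are derived from one another. The following worked check recomputes them from the reported mean, standard deviation, quartiles and extremes; small differences are expected because the report rounds its inputs.

```python
# Values copied from the quantile and descriptive statistics above.
n = 5328
mean = 2933.3095
std = 1590.004
q1, q3 = 1629.75, 4295.25
minimum, maximum = 2, 7452

value_range = maximum - minimum   # reported: 7450
iqr = q3 - q1                     # reported: 2665.5
cv = std / mean                   # reported: 0.54205124
variance = std ** 2               # reported: 2528112.8
total = mean * n                  # reported: 15628673

print(value_range, iqr)
print(f"CV = {cv:.6f}, variance = {variance:.1f}, sum = {total:.0f}")
```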
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
7452  1  < 0.1%
4962  1  < 0.1%
4927  1  < 0.1%
4707  1  < 0.1%
4452  1  < 0.1%
1636  1  < 0.1%
579  1  < 0.1%
649  1  < 0.1%
651  1  < 0.1%
2065  1  < 0.1%
Other values (5318)  5318  99.8%

Value  Count  Frequency (%)
2  1  < 0.1%
3  1  < 0.1%
4  1  < 0.1%
5  1  < 0.1%
7  1  < 0.1%
8  1  < 0.1%
9  1  < 0.1%
10  1  < 0.1%
11  1  < 0.1%
12  1  < 0.1%

Value  Count  Frequency (%)
7452  1  < 0.1%
5872  1  < 0.1%
5871  1  < 0.1%
5870  1  < 0.1%
5869  1  < 0.1%
5868  1  < 0.1%
5867  1  < 0.1%
5866  1  < 0.1%
5865  1  < 0.1%
5864  1  < 0.1%

상호
Text

Distinct: 5274
Distinct (%): 99.0%
Missing: 0
Missing (%): 0.0%
Memory size: 41.8 KiB

Length

Max length: 59
Median length: 50
Mean length: 10.414977
Min length: 2

Characters and Unicode

Total characters: 55491
Distinct characters: 628
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5221
Unique (%): 98.0%

Sample

1st row: (유)에이포앤트레이드
2nd row: 주식회사 거해물류
3rd row: (주)오토런
4th row: 주식회사 레인보우티엔에스
5th row: 대하에스앤에이 주식회사
ValueCountFrequency (%)
주식회사 1408
 
20.3%
26
 
0.4%
유한회사 21
 
0.3%
co 20
 
0.3%
ltd 19
 
0.3%
logistics 9
 
0.1%
co.,ltd 7
 
0.1%
international 5
 
0.1%
global 4
 
0.1%
유한책임회사 4
 
0.1%
Other values (5355) 5425
78.1%

Most occurring characters

ValueCountFrequency (%)
5261
 
9.5%
3879
 
7.0%
( 3845
 
6.9%
) 3844
 
6.9%
1853
 
3.3%
1779
 
3.2%
1634
 
2.9%
1490
 
2.7%
1465
 
2.6%
1436
 
2.6%
Other values (618) 29005
52.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 45409
81.8%
Open Punctuation 3846
 
6.9%
Close Punctuation 3845
 
6.9%
Space Separator 1634
 
2.9%
Uppercase Letter 301
 
0.5%
Lowercase Letter 293
 
0.5%
Other Punctuation 143
 
0.3%
Decimal Number 17
 
< 0.1%
Dash Punctuation 2
 
< 0.1%
Other Symbol 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5261
 
11.6%
3879
 
8.5%
1853
 
4.1%
1779
 
3.9%
1490
 
3.3%
1465
 
3.2%
1436
 
3.2%
1420
 
3.1%
1346
 
3.0%
978
 
2.2%
Other values (557) 24502
54.0%
Uppercase Letter
ValueCountFrequency (%)
L 57
18.9%
C 43
14.3%
I 21
 
7.0%
A 21
 
7.0%
T 20
 
6.6%
O 18
 
6.0%
N 16
 
5.3%
S 13
 
4.3%
R 12
 
4.0%
D 11
 
3.7%
Other values (13) 69
22.9%
Lowercase Letter
ValueCountFrequency (%)
o 49
16.7%
t 36
12.3%
i 35
11.9%
d 27
9.2%
s 24
8.2%
r 17
 
5.8%
e 17
 
5.8%
g 16
 
5.5%
a 16
 
5.5%
n 16
 
5.5%
Other values (10) 40
13.7%
Decimal Number
ValueCountFrequency (%)
2 4
23.5%
4 3
17.6%
0 3
17.6%
5 2
11.8%
1 2
11.8%
7 2
11.8%
3 1
 
5.9%
Other Punctuation
ValueCountFrequency (%)
, 87
60.8%
. 54
37.8%
& 1
 
0.7%
; 1
 
0.7%
Open Punctuation
ValueCountFrequency (%)
( 3845
> 99.9%
[ 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 3844
> 99.9%
] 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
1634
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 45410
81.8%
Common 9487
 
17.1%
Latin 594
 
1.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5261
 
11.6%
3879
 
8.5%
1853
 
4.1%
1779
 
3.9%
1490
 
3.3%
1465
 
3.2%
1436
 
3.2%
1420
 
3.1%
1346
 
3.0%
978
 
2.2%
Other values (558) 24503
54.0%
Latin
ValueCountFrequency (%)
L 57
 
9.6%
o 49
 
8.2%
C 43
 
7.2%
t 36
 
6.1%
i 35
 
5.9%
d 27
 
4.5%
s 24
 
4.0%
I 21
 
3.5%
A 21
 
3.5%
T 20
 
3.4%
Other values (33) 261
43.9%
Common
ValueCountFrequency (%)
( 3845
40.5%
) 3844
40.5%
1634
17.2%
, 87
 
0.9%
. 54
 
0.6%
2 4
 
< 0.1%
4 3
 
< 0.1%
0 3
 
< 0.1%
- 2
 
< 0.1%
5 2
 
< 0.1%
Other values (7) 9
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 45409
81.8%
ASCII 10081
 
18.2%
None 1
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
5261
 
11.6%
3879
 
8.5%
1853
 
4.1%
1779
 
3.9%
1490
 
3.3%
1465
 
3.2%
1436
 
3.2%
1420
 
3.1%
1346
 
3.0%
978
 
2.2%
Other values (557) 24502
54.0%
ASCII
ValueCountFrequency (%)
( 3845
38.1%
) 3844
38.1%
1634
16.2%
, 87
 
0.9%
L 57
 
0.6%
. 54
 
0.5%
o 49
 
0.5%
C 43
 
0.4%
t 36
 
0.4%
i 35
 
0.3%
Other values (50) 397
 
3.9%
None
ValueCountFrequency (%)
1
100.0%

허가일자
Date

Distinct: 2922
Distinct (%): 54.8%
Missing: 0
Missing (%): 0.0%
Memory size: 41.8 KiB
Minimum: 1992-08-19 00:00:00
Maximum: 2024-05-01 00:00:00
Histogram with fixed size bins (bins=50)

주소
Text

Distinct: 3667
Distinct (%): 68.8%
Missing: 0
Missing (%): 0.0%
Memory size: 41.8 KiB

Length

Max length: 57
Median length: 43
Mean length: 23.781907
Min length: 3

Characters and Unicode

Total characters: 126710
Distinct characters: 551
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3084
Unique (%): 57.9%

Sample

1st row: 서울특별시 강남구 역삼로 177 (역삼동,서현빌딩)
2nd row: 서울특별시 중구 세종대로 74 삼정빌딩
3rd row: 강남구 역삼동
4th row: 강남구 논현동 80-17
5th row: 마포구 마포대로 52,
ValueCountFrequency (%)
서울특별시 3100
 
12.2%
마포구 948
 
3.7%
강서구 924
 
3.6%
중구 595
 
2.3%
강남구 438
 
1.7%
종로구 394
 
1.6%
영등포구 380
 
1.5%
마곡동 294
 
1.2%
280
 
1.1%
280
 
1.1%
Other values (4208) 17702
69.9%

Most occurring characters

ValueCountFrequency (%)
21070
 
16.6%
5521
 
4.4%
5093
 
4.0%
5020
 
4.0%
4834
 
3.8%
1 3829
 
3.0%
3369
 
2.7%
( 3326
 
2.6%
) 3325
 
2.6%
3161
 
2.5%
Other values (541) 68162
53.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 78585
62.0%
Space Separator 21070
 
16.6%
Decimal Number 16774
 
13.2%
Open Punctuation 3327
 
2.6%
Close Punctuation 3326
 
2.6%
Other Punctuation 2623
 
2.1%
Dash Punctuation 572
 
0.5%
Uppercase Letter 300
 
0.2%
Lowercase Letter 99
 
0.1%
Letter Number 34
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5521
 
7.0%
5093
 
6.5%
5020
 
6.4%
4834
 
6.2%
3369
 
4.3%
3161
 
4.0%
3113
 
4.0%
3101
 
3.9%
1836
 
2.3%
1723
 
2.2%
Other values (487) 41814
53.2%
Uppercase Letter
ValueCountFrequency (%)
K 35
11.7%
S 32
10.7%
G 27
9.0%
T 24
 
8.0%
L 23
 
7.7%
C 22
 
7.3%
I 21
 
7.0%
A 19
 
6.3%
M 18
 
6.0%
B 13
 
4.3%
Other values (13) 66
22.0%
Decimal Number
ValueCountFrequency (%)
1 3829
22.8%
2 2432
14.5%
3 1842
11.0%
5 1566
9.3%
4 1459
 
8.7%
6 1361
 
8.1%
0 1265
 
7.5%
7 1058
 
6.3%
8 1027
 
6.1%
9 935
 
5.6%
Lowercase Letter
ValueCountFrequency (%)
e 29
29.3%
r 18
18.2%
n 13
13.1%
o 9
 
9.1%
c 9
 
9.1%
t 9
 
9.1%
w 8
 
8.1%
a 2
 
2.0%
u 1
 
1.0%
m 1
 
1.0%
Letter Number
ValueCountFrequency (%)
21
61.8%
12
35.3%
1
 
2.9%
Open Punctuation
ValueCountFrequency (%)
( 3326
> 99.9%
[ 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 3325
> 99.9%
] 1
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
, 2616
99.7%
. 7
 
0.3%
Space Separator
ValueCountFrequency (%)
21070
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 572
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 78585
62.0%
Common 47692
37.6%
Latin 433
 
0.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5521
 
7.0%
5093
 
6.5%
5020
 
6.4%
4834
 
6.2%
3369
 
4.3%
3161
 
4.0%
3113
 
4.0%
3101
 
3.9%
1836
 
2.3%
1723
 
2.2%
Other values (487) 41814
53.2%
Latin
ValueCountFrequency (%)
K 35
 
8.1%
S 32
 
7.4%
e 29
 
6.7%
G 27
 
6.2%
T 24
 
5.5%
L 23
 
5.3%
C 22
 
5.1%
21
 
4.8%
I 21
 
4.8%
A 19
 
4.4%
Other values (26) 180
41.6%
Common
ValueCountFrequency (%)
21070
44.2%
1 3829
 
8.0%
( 3326
 
7.0%
) 3325
 
7.0%
, 2616
 
5.5%
2 2432
 
5.1%
3 1842
 
3.9%
5 1566
 
3.3%
4 1459
 
3.1%
6 1361
 
2.9%
Other values (8) 4866
 
10.2%

Most occurring blocks

ValueCountFrequency (%)
Hangul 78585
62.0%
ASCII 48091
38.0%
Number Forms 34
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
21070
43.8%
1 3829
 
8.0%
( 3326
 
6.9%
) 3325
 
6.9%
, 2616
 
5.4%
2 2432
 
5.1%
3 1842
 
3.8%
5 1566
 
3.3%
4 1459
 
3.0%
6 1361
 
2.8%
Other values (41) 5265
 
10.9%
Hangul
ValueCountFrequency (%)
5521
 
7.0%
5093
 
6.5%
5020
 
6.4%
4834
 
6.2%
3369
 
4.3%
3161
 
4.0%
3113
 
4.0%
3101
 
3.9%
1836
 
2.3%
1723
 
2.2%
Other values (487) 41814
53.2%
Number Forms
ValueCountFrequency (%)
21
61.8%
12
35.3%
1
 
2.9%

상세주소
Text

MISSING 

Distinct: 3118
Distinct (%): 69.5%
Missing: 844
Missing (%): 15.8%
Memory size: 41.8 KiB

Length

Max length: 43
Median length: 29
Mean length: 8.7582516
Min length: 1

Characters and Unicode

Total characters: 39272
Distinct characters: 469
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2770
Unique (%): 61.8%

Sample

1st row: 비101호
2nd row: 648-18
3rd row: 양지빌딩 2층 1호
4th row: 고려아카데미텔2 1110호(도화동)
5th row: 678-10 재영빌딩 1층
ValueCountFrequency (%)
2층 167
 
2.1%
3층 160
 
2.1%
4층 141
 
1.8%
5층 105
 
1.3%
1층 87
 
1.1%
6층 78
 
1.0%
에이동 70
 
0.9%
7층 67
 
0.9%
10층 55
 
0.7%
비동 51
 
0.7%
Other values (3231) 6820
87.4%

Most occurring characters

ValueCountFrequency (%)
1 3874
 
9.9%
3496
 
8.9%
2894
 
7.4%
0 2740
 
7.0%
2 2025
 
5.2%
3 1559
 
4.0%
1354
 
3.4%
1281
 
3.3%
5 1213
 
3.1%
4 1122
 
2.9%
Other values (459) 17714
45.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 16308
41.5%
Decimal Number 15799
40.2%
Space Separator 3496
 
8.9%
Open Punctuation 991
 
2.5%
Close Punctuation 991
 
2.5%
Dash Punctuation 904
 
2.3%
Other Punctuation 593
 
1.5%
Uppercase Letter 143
 
0.4%
Lowercase Letter 28
 
0.1%
Math Symbol 19
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2894
 
17.7%
1354
 
8.3%
1281
 
7.9%
784
 
4.8%
754
 
4.6%
350
 
2.1%
316
 
1.9%
262
 
1.6%
214
 
1.3%
205
 
1.3%
Other values (399) 7894
48.4%
Uppercase Letter
ValueCountFrequency (%)
B 33
23.1%
A 27
18.9%
K 8
 
5.6%
C 8
 
5.6%
S 6
 
4.2%
E 6
 
4.2%
R 6
 
4.2%
T 6
 
4.2%
G 5
 
3.5%
L 5
 
3.5%
Other values (13) 33
23.1%
Lowercase Letter
ValueCountFrequency (%)
o 5
17.9%
e 3
10.7%
u 3
10.7%
n 3
10.7%
l 3
10.7%
g 3
10.7%
i 2
 
7.1%
a 1
 
3.6%
r 1
 
3.6%
f 1
 
3.6%
Other values (3) 3
10.7%
Decimal Number
ValueCountFrequency (%)
1 3874
24.5%
0 2740
17.3%
2 2025
12.8%
3 1559
9.9%
5 1213
 
7.7%
4 1122
 
7.1%
6 992
 
6.3%
7 825
 
5.2%
8 730
 
4.6%
9 719
 
4.6%
Other Punctuation
ValueCountFrequency (%)
, 580
97.8%
. 4
 
0.7%
/ 2
 
0.3%
: 2
 
0.3%
1
 
0.2%
@ 1
 
0.2%
; 1
 
0.2%
# 1
 
0.2%
& 1
 
0.2%
Space Separator
ValueCountFrequency (%)
3496
100.0%
Open Punctuation
ValueCountFrequency (%)
( 991
100.0%
Close Punctuation
ValueCountFrequency (%)
) 991
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 904
100.0%
Math Symbol
ValueCountFrequency (%)
~ 19
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 22793
58.0%
Hangul 16308
41.5%
Latin 171
 
0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2894
 
17.7%
1354
 
8.3%
1281
 
7.9%
784
 
4.8%
754
 
4.6%
350
 
2.1%
316
 
1.9%
262
 
1.6%
214
 
1.3%
205
 
1.3%
Other values (399) 7894
48.4%
Latin
ValueCountFrequency (%)
B 33
19.3%
A 27
15.8%
K 8
 
4.7%
C 8
 
4.7%
S 6
 
3.5%
E 6
 
3.5%
R 6
 
3.5%
T 6
 
3.5%
G 5
 
2.9%
L 5
 
2.9%
Other values (26) 61
35.7%
Common
ValueCountFrequency (%)
1 3874
17.0%
3496
15.3%
0 2740
12.0%
2 2025
8.9%
3 1559
6.8%
5 1213
 
5.3%
4 1122
 
4.9%
6 992
 
4.4%
( 991
 
4.3%
) 991
 
4.3%
Other values (14) 3790
16.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 22963
58.5%
Hangul 16308
41.5%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 3874
16.9%
3496
15.2%
0 2740
11.9%
2 2025
8.8%
3 1559
 
6.8%
5 1213
 
5.3%
4 1122
 
4.9%
6 992
 
4.3%
( 991
 
4.3%
) 991
 
4.3%
Other values (49) 3960
17.2%
Hangul
ValueCountFrequency (%)
2894
 
17.7%
1354
 
8.3%
1281
 
7.9%
784
 
4.8%
754
 
4.6%
350
 
2.1%
316
 
1.9%
262
 
1.6%
214
 
1.3%
205
 
1.3%
Other values (399) 7894
48.4%
None
ValueCountFrequency (%)
1
100.0%

업소상태코드
Real number (ℝ)

HIGH CORRELATION 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.707958
Minimum: 1
Maximum: 6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 47.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 3
Q3: 5
95-th percentile: 6
Maximum: 6
Range: 5
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 1.8895533
Coefficient of variation (CV): 0.69777792
Kurtosis: -1.3984381
Mean: 2.707958
Median Absolute Deviation (MAD): 2
Skewness: 0.48267921
Sum: 14428
Variance: 3.5704116
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count  Frequency (%)
1  2653  49.8%
5  1183  22.2%
3  1002  18.8%
6  455  8.5%
4  27  0.5%
2  8  0.2%

Value  Count  Frequency (%)
1  2653  49.8%
2  8  0.2%
3  1002  18.8%
4  27  0.5%
5  1183  22.2%
6  455  8.5%

Value  Count  Frequency (%)
6  455  8.5%
5  1183  22.2%
4  27  0.5%
3  1002  18.8%
2  8  0.2%
1  2653  49.8%

업소상태
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 41.8 KiB

영업: 2653
등록취소: 1183
폐지: 1002
타시도이관: 455
사업정지: 27

Length

Max length: 5
Median length: 2
Mean length: 2.7103979
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 영업
2nd row: 사업정지
3rd row: 폐지
4th row: 폐지
5th row: 영업

Common Values

Value  Count  Frequency (%)
영업  2653  49.8%
등록취소  1183  22.2%
폐지  1002  18.8%
타시도이관  455  8.5%
사업정지  27  0.5%
휴지  8  0.2%
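
The high-correlation alert between 업소상태코드 and 업소상태 is what one would expect if each code simply names one status. As a sketch, the two frequency tables can be paired by their counts, which works here only because all six counts are distinct. The sample rows confirm the pairs for codes 1, 3, 4 and 5; the pairs for codes 2 and 6 are inferred from the counts alone. The English glosses in the comments are approximate translations, not part of the dataset.

```python
# Counts copied from the 업소상태코드 and 업소상태 frequency tables.
code_counts = {1: 2653, 5: 1183, 3: 1002, 6: 455, 4: 27, 2: 8}
label_counts = {
    "영업": 2653,       # operating
    "등록취소": 1183,   # registration cancelled
    "폐지": 1002,       # business discontinued
    "타시도이관": 455,  # transferred to another city or province
    "사업정지": 27,     # business suspended
    "휴지": 8,          # temporarily closed
}

# Pair each code with the label that has the same count.
label_by_count = {count: label for label, count in label_counts.items()}
code_to_label = {code: label_by_count[count] for code, count in code_counts.items()}

for code in sorted(code_to_label):
    print(code, code_to_label[code])
```

If the mapping holds for every row, one of the two columns is redundant for modelling purposes.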

Length

Histogram of lengths of the category


관할구청코드
Real number (ℝ)

HIGH CORRELATION 

Distinct: 25
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4487297.3
Minimum: 4040000
Maximum: 4960000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 47.0 KiB

Quantile statistics

Minimum: 4040000
5-th percentile: 4040000
Q1: 4120000
Median: 4520000
Q3: 4680000
95-th percentile: 4880000
Maximum: 4960000
Range: 920000
Interquartile range (IQR): 560000

Descriptive statistics

Standard deviation: 285001.42
Coefficient of variation (CV): 0.063512934
Kurtosis: -1.1382001
Mean: 4487297.3
Median Absolute Deviation (MAD): 200000
Skewness: -0.3543943
Sum: 2.390832 × 10^10
Variance: 8.1225808 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=25)
Value  Count  Frequency (%)
4520000  997  18.7%
4600000  946  17.8%
4080000  881  16.5%
4880000  461  8.7%
4040000  398  7.5%
4720000  383  7.2%
4840000  262  4.9%
4680000  190  3.6%
4920000  116  2.2%
4640000  106  2.0%
Other values (15)  588  11.0%

Value  Count  Frequency (%)
4040000  398  7.5%
4080000  881  16.5%
4120000  75  1.4%
4160000  100  1.9%
4200000  31  0.6%
4240000  49  0.9%
4280000  20  0.4%
4320000  21  0.4%
4340000  14  0.3%
4360000  13  0.2%

Value  Count  Frequency (%)
4960000  26  0.5%
4920000  116  2.2%
4880000  461  8.7%
4840000  262  4.9%
4800000  17  0.3%
4760000  23  0.4%
4720000  383  7.2%
4680000  190  3.6%
4640000  106  2.0%
4600000  946  17.8%

관할구청
Categorical

HIGH CORRELATION 

Distinct: 25
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 41.8 KiB

마포구: 997
강서구: 946
중구: 881
강남구: 461
종로구: 398
Other values (20): 1645

Length

Max length: 4
Median length: 3
Mean length: 2.9286787
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강남구
2nd row: 중구
3rd row: 강남구
4th row: 강남구
5th row: 마포구

Common Values

Value  Count  Frequency (%)
마포구  997  18.7%
강서구  946  17.8%
중구  881  16.5%
강남구  461  8.7%
종로구  398  7.5%
영등포구  383  7.2%
서초구  262  4.9%
금천구  190  3.6%
송파구  116  2.2%
구로구  106  2.0%
Other values (15)  588  11.0%

Length

Histogram of lengths of the category

전화번호
Text

MISSING 

Distinct: 4487
Distinct (%): 97.2%
Missing: 714
Missing (%): 13.4%
Memory size: 41.8 KiB

Length

Max length: 15
Median length: 14
Mean length: 10.566537
Min length: 3

Characters and Unicode

Total characters: 48754
Distinct characters: 19
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4372
Unique (%): 94.8%

Sample

1st row: 775-5166
2nd row: 02-569-5393
3rd row: 725-5717
4th row: 02)3273-1866
5th row: 3664-7491
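
The sample rows show at least three notations for the same kind of number: with no area code, with `02-`, and with `02)`. Below is a minimal sketch of one possible normalisation. It assumes that a number without an area code is a Seoul number (area code 02); that default, the function name `normalize_phone` and the target format are assumptions for illustration, not something the report states. Values that do not look like a single number return `None` so that they can be reviewed by hand.

```python
import re

# Optional area code followed by "-", ")" or a space, then exchange and line number.
_PHONE = re.compile(r"^(?:(0\d{1,2})[-)\s])?(\d{3,4})-(\d{4})$")


def normalize_phone(raw, default_area="02"):
    """Return 'area-exchange-line', or None if the value is not one plain number."""
    if raw is None:
        return None
    match = _PHONE.match(raw.strip())
    if match is None:
        return None
    area, exchange, line = match.groups()
    return f"{area or default_area}-{exchange}-{line}"


for value in ["775-5166", "02-569-5393", "02)3273-1866", "3664-7491"]:
    print(value, "->", normalize_phone(value))
```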
ValueCountFrequency (%)
02-2151-9774 3
 
0.1%
02-752-8000 3
 
0.1%
02-2666-8266 3
 
0.1%
2149-3700 3
 
0.1%
756-9344 3
 
0.1%
02)6329-7715 3
 
0.1%
02-730-5445 3
 
0.1%
02-3142-3600 3
 
0.1%
02-322-7733 3
 
0.1%
334-8200 3
 
0.1%
Other values (4480) 4588
99.4%

Most occurring characters

ValueCountFrequency (%)
- 7681
15.8%
0 7270
14.9%
2 6854
14.1%
7 4490
9.2%
3 4315
8.9%
6 3745
7.7%
1 3588
7.4%
5 3353
6.9%
8 2584
 
5.3%
4 2532
 
5.2%
Other values (9) 2342
 
4.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 40904
83.9%
Dash Punctuation 7681
 
15.8%
Close Punctuation 122
 
0.3%
Math Symbol 30
 
0.1%
Other Letter 8
 
< 0.1%
Other Punctuation 5
 
< 0.1%
Space Separator 4
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 7270
17.8%
2 6854
16.8%
7 4490
11.0%
3 4315
10.5%
6 3745
9.2%
1 3588
8.8%
5 3353
8.2%
8 2584
 
6.3%
4 2532
 
6.2%
9 2173
 
5.3%
Other Letter
ValueCountFrequency (%)
2
25.0%
2
25.0%
2
25.0%
2
25.0%
Dash Punctuation
ValueCountFrequency (%)
- 7681
100.0%
Close Punctuation
ValueCountFrequency (%)
) 122
100.0%
Math Symbol
ValueCountFrequency (%)
~ 30
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 5
100.0%
Space Separator
ValueCountFrequency (%)
4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 48746
> 99.9%
Hangul 8
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
- 7681
15.8%
0 7270
14.9%
2 6854
14.1%
7 4490
9.2%
3 4315
8.9%
6 3745
7.7%
1 3588
7.4%
5 3353
6.9%
8 2584
 
5.3%
4 2532
 
5.2%
Other values (5) 2334
 
4.8%
Hangul
ValueCountFrequency (%)
2
25.0%
2
25.0%
2
25.0%
2
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 48746
> 99.9%
Hangul 8
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 7681
15.8%
0 7270
14.9%
2 6854
14.1%
7 4490
9.2%
3 4315
8.9%
6 3745
7.7%
1 3588
7.4%
5 3353
6.9%
8 2584
 
5.3%
4 2532
 
5.2%
Other values (5) 2334
 
4.8%
Hangul
ValueCountFrequency (%)
2
25.0%
2
25.0%
2
25.0%
2
25.0%

국적
Categorical

HIGH CARDINALITY  IMBALANCE 

Distinct: 51
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 41.8 KiB

한국: 4609
대한민국: 535
중국: 47
미국: 22
일본: 19
Other values (46): 96

Length

Max length: 9
Median length: 2
Mean length: 2.2321697
Min length: 2

Unique

Unique: 24
Unique (%): 0.5%

Sample

1st row: 한국
2nd row: 한국
3rd row: 한국
4th row: 한국
5th row: 한국

Common Values

Value  Count  Frequency (%)
한국  4609  86.5%
대한민국  535  10.0%
중국  47  0.9%
미국  22  0.4%
일본  19  0.4%
<NA>  11  0.2%
프랑스  5  0.1%
중화인민공화국  5  0.1%
독일  5  0.1%
캐나다  4  0.1%
Other values (41)  66  1.2%
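
Part of the cardinality of 국적 comes from different spellings of the same country: 한국 and 대한민국 both denote South Korea, and 중국 and 중화인민공화국 both denote China. The sketch below merges such aliases using the counts from the table above. The alias table is a hypothetical starting point covering only the two pairs visible here, not a complete mapping.

```python
from collections import Counter

# Counts copied from the "Common Values" table for 국적.
reported = Counter({
    "한국": 4609, "대한민국": 535, "중국": 47, "미국": 22, "일본": 19,
    "<NA>": 11, "프랑스": 5, "중화인민공화국": 5, "독일": 5, "캐나다": 4,
})
other_values = 66  # "Other values (41)" row, not broken down in the report
total_rows = 5328

# Hypothetical alias table: alternative spelling -> canonical label.
ALIASES = {"대한민국": "한국", "중화인민공화국": "중국"}

merged = Counter()
for label, count in reported.items():
    merged[ALIASES.get(label, label)] += count

korea_share = 100 * merged["한국"] / total_rows
print(merged.most_common(3))
print(f"한국 after merging: {korea_share:.1f}%")
```

After merging, the dominant category covers an even larger share of the rows, so the imbalance alert would still apply.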

Length

Histogram of lengths of the category
Value  Count  Frequency (%)
한국  4611  86.5%
대한민국  535  10.0%
중국  47  0.9%
미국  22  0.4%
일본  19  0.4%
na  11  0.2%
프랑스  6  0.1%
중화인민공화국  5  0.1%
독일  5  0.1%
캐나다  4  0.1%
Other values (38)  64  1.2%

보험시작일
Date

MISSING 

Distinct: 2767
Distinct (%): 53.8%
Missing: 181
Missing (%): 3.4%
Memory size: 41.8 KiB
Minimum: 1996-05-20 00:00:00
Maximum: 2024-12-26 00:00:00
Histogram with fixed size bins (bins=50)

보험만료일
Date

MISSING 

Distinct: 2755
Distinct (%): 53.5%
Missing: 181
Missing (%): 3.4%
Memory size: 41.8 KiB
Minimum: 1997-05-19 00:00:00
Maximum: 2028-09-14 00:00:00
Histogram with fixed size bins (bins=50)

변경일자
Date

MISSING 

Distinct: 1392
Distinct (%): 61.3%
Missing: 3056
Missing (%): 57.4%
Memory size: 41.8 KiB
Minimum: 1999-07-31 00:00:00
Maximum: 2024-04-30 00:00:00
Histogram with fixed size bins (bins=50)

변경사유코드
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 5328
Missing (%): 100.0%
Memory size: 47.0 KiB

변경사유
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 5328
Missing (%): 100.0%
Memory size: 47.0 KiB

Interactions

[Nine interaction plots; images not preserved]

Correlations

Variable  등록번호  업소상태코드  업소상태  관할구청코드  관할구청  국적
등록번호  1.000  0.396  0.396  0.191  0.216  0.642
업소상태코드  0.396  1.000  1.000  0.182  0.223  0.410
업소상태  0.396  1.000  1.000  0.182  0.223  0.410
관할구청코드  0.191  0.182  0.182  1.000  1.000  0.166
관할구청  0.216  0.223  0.223  1.000  1.000  0.189
국적  0.642  0.410  0.410  0.166  0.189  1.000

Variable  국적  관할구청  업소상태
국적  1.000  0.039  0.187
관할구청  0.039  1.000  0.101
업소상태  0.187  0.101  1.000

Variable  등록번호  업소상태코드  관할구청코드  업소상태  관할구청  국적
등록번호  1.000  -0.325  0.143  0.209  0.084  0.289
업소상태코드  -0.325  1.000  -0.057  1.000  0.101  0.187
관할구청코드  0.143  -0.057  1.000  0.091  0.999  0.055
업소상태  0.209  1.000  0.091  1.000  0.101  0.187
관할구청  0.084  0.101  0.999  0.101  1.000  0.039
국적  0.289  0.187  0.055  0.187  0.039  1.000
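
The strongly related pairs can be read off a correlation matrix mechanically. The sketch below scans the upper triangle of the first matrix in this section and lists every pair at or above a cut-off. The cut-off of 0.9 is an assumption chosen for illustration and is not the threshold the profiling tool uses for its alerts; with it, the scan returns the same two code and name pairs that the alerts report.

```python
# First correlation matrix of this section, in the order of its header row.
names = ["등록번호", "업소상태코드", "업소상태", "관할구청코드", "관할구청", "국적"]
matrix = [
    [1.000, 0.396, 0.396, 0.191, 0.216, 0.642],
    [0.396, 1.000, 1.000, 0.182, 0.223, 0.410],
    [0.396, 1.000, 1.000, 0.182, 0.223, 0.410],
    [0.191, 0.182, 0.182, 1.000, 1.000, 0.166],
    [0.216, 0.223, 0.223, 1.000, 1.000, 0.189],
    [0.642, 0.410, 0.410, 0.166, 0.189, 1.000],
]

THRESHOLD = 0.9  # illustrative cut-off, not the tool's setting

# Only the upper triangle is scanned, so each pair appears once.
strong_pairs = [
    (names[i], names[j], matrix[i][j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(matrix[i][j]) >= THRESHOLD
]

for left, right, value in strong_pairs:
    print(f"{left} ~ {right}: {value:.3f}")
```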

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

등록번호상호허가일자주소상세주소업소상태코드업소상태관할구청코드관할구청전화번호국적보험시작일보험만료일변경일자변경사유코드변경사유
07452(유)에이포앤트레이드2018-03-26서울특별시 강남구 역삼로 177 (역삼동,서현빌딩)비101호1영업4880000강남구<NA>한국2018-03-162019-03-16<NA><NA><NA>
13258주식회사 거해물류2009-05-18서울특별시 중구 세종대로 74 삼정빌딩<NA>4사업정지4080000중구775-5166한국2018-04-302019-04-30<NA><NA><NA>
22277(주)오토런2004-01-08강남구 역삼동648-183폐지4880000강남구02-569-5393한국2005-01-022006-01-022005-01-17<NA><NA>
33160주식회사 레인보우티엔에스2008-10-16강남구 논현동 80-17양지빌딩 2층 1호3폐지4880000강남구725-5717한국2008-09-182009-09-182009-02-04<NA><NA>
43069대하에스앤에이 주식회사2008-04-25마포구 마포대로 52,고려아카데미텔2 1110호(도화동)1영업4520000마포구02)3273-1866한국2024-04-222025-04-21<NA><NA><NA>
52765(주)로젠글로벌2006-07-10강서구 등촌동678-10 재영빌딩 1층5등록취소4600000강서구3664-7491한국2006-06-272007-06-27<NA><NA><NA>
62892중한세계물류(주)2007-03-26중구 남대문로 113 (다동)<NA>3폐지4080000중구773-8388한국2011-03-202012-03-202012-03-05<NA><NA>
72783(주)이온프로젝트2006-08-24중 구 충무로1가25-5 고려대연각타워304호5등록취소4080000중구02-754-5742한국2006-08-162007-08-16<NA><NA><NA>
83061주식회사 토브로지스틱스2008-04-15성북구 동소문로 43,흥화브라운빌 603호 (동소문동4가)1영업4320000성북구02-953-9761한국2024-04-182025-04-18<NA><NA><NA>
93920영성해운 주식회사2013-01-03서울특별시 마포구 마포대로 86 창강빌딩 (도화동 22)11층3폐지4520000마포구<NA>한국<NA><NA>2018-06-01<NA><NA>
등록번호상호허가일자주소상세주소업소상태코드업소상태관할구청코드관할구청전화번호국적보험시작일보험만료일변경일자변경사유코드변경사유
53182718용방물류(주)2006-03-21서초구 양재동2263폐지4840000서초구575-1361한국2006-03-152007-03-142007-03-16<NA><NA>
5319845(주)크로스오버1997-11-14중 구 다동 97산다빌딩 401호5등록취소4080000중구02-779-3754한국2005-11-042006-11-04<NA><NA><NA>
5320849피엠엘(주)1997-11-19영등포구 여의도동 43<NA>3폐지4720000영등포구02-785-0634한국1999-11-032000-11-022000-12-07<NA><NA>
5321884엠아이티항공해운(주)1998-02-19마포구 서교동395-755등록취소4520000마포구02-338-7400한국2000-02-162001-02-162002-06-03<NA><NA>
5322883동화물류(주)1998-02-19은평구 은평로 82,해태드림타운 207호 (응암동)3폐지4440000은평구02-338-3284한국2013-06-142014-06-142014-12-26<NA><NA>
5323801에스앤에이코리아(주)1997-07-31강남구 역삼동707-383폐지4880000강남구02-538-3021한국2002-03-182003-03-172003-03-17<NA><NA>
53242170천경해운(주)2003-06-07중구 을지로 80-1(을지로2가)3폐지4080000중구3788-6871한국<NA><NA>2017-08-09<NA><NA>
53252672(주)클럽리치항공2006-01-16중구 무교로 16,체육회관빌딩 4층 (무교동)5등록취소4080000중구776-2727한국2009-01-092010-01-09<NA><NA><NA>
53262556(주)동특2005-06-07서울특별시 영등포구 국제금융로 70,미원빌딩 10층1영업4720000영등포구3215-5000한국<NA><NA><NA><NA><NA>
53275103(주)팬더해운항공2020-11-18서울특별시 관악구 낙성대역4길 3 (봉천동)4층1영업4800000관악구<NA>한국2023-11-112024-11-11<NA><NA><NA>