Overview

Dataset statistics

Number of variables: 7
Number of observations: 5147
Missing cells: 1
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 286.6 KiB
Average record size in memory: 57.0 B
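As a quick sanity check on the last two figures, the average record size should be roughly the total size in memory divided by the number of observations. A minimal sketch in Python, using only the values reported above (the variable names are illustrative):

```python
# Values copied from the dataset statistics above.
total_kib = 286.6   # total size in memory, in KiB
n_rows = 5147       # number of observations

# 1 KiB = 1024 bytes
avg_record_bytes = total_kib * 1024 / n_rows
print(f"{avg_record_bytes:.1f} B")
```

The result rounds to 57.0 B, which agrees with the reported average record size.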

Variable types

Numeric: 1
Text: 3
Categorical: 2
DateTime: 1

Dataset

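Description: Provides the corporate or business name, operating status, representative's name, management number, and related information for the 5,147 mail-order businesses currently in normal operation in Saha-gu.
URL: https://www.data.go.kr/data/3079278/fileData.do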

Alerts

운영상태 has constant value "정상영업" (Constant)
데이터기준일자 has constant value "2023-05-23" (Constant)
법인구분 is highly imbalanced (74.8%) (Imbalance)
번호 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 07:15:08.985679
Analysis finished: 2023-12-12 07:15:10.302226
Duration: 1.32 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

번호
Real number (ℝ)

UNIQUE 

Distinct: 5147
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2574
Minimum: 1
Maximum: 5147
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 45.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 258.3
Q1: 1287.5
Median: 2574
Q3: 3860.5
95-th percentile: 4889.7
Maximum: 5147
Range: 5146
Interquartile range (IQR): 2573

Descriptive statistics

Standard deviation: 1485.9552
Coefficient of variation (CV): 0.57729419
Kurtosis: -1.2
Mean: 2574
Median Absolute Deviation (MAD): 1287
Skewness: 0
Sum: 13248378
Variance: 2208063
Monotonicity: Strictly increasing
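
These figures are what one would expect if 번호 is simply the running index 1, 2, ..., 5147. That is an assumption, but it is consistent with the reported minimum, maximum, distinct count, and strict monotonicity. The sketch below recomputes several of the statistics with Python's standard library. The `quantile` helper is illustrative; it interpolates linearly between neighbouring order statistics, which reproduces the reported percentiles.

```python
import math
import statistics

# Assumption: the column holds the integers 1..5147 in order.
values = list(range(1, 5148))

def quantile(sorted_values, q):
    """Quantile by linear interpolation between neighbouring order statistics."""
    pos = (len(sorted_values) - 1) * q
    lower = math.floor(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])

total = sum(values)                      # 13248378
mean = statistics.mean(values)           # 2574
variance = statistics.variance(values)   # sample variance: 2208063
std_dev = math.sqrt(variance)            # about 1485.9552
cv = std_dev / mean                      # about 0.57729
q1, q3 = quantile(values, 0.25), quantile(values, 0.75)
iqr = q3 - q1                            # 2573.0

# Median absolute deviation around the median.
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, variance, round(std_dev, 4), q1, q3, iqr, mad)
```

The sample variance of 1..n has the closed form n(n+1)/12, which for n = 5147 gives 2208063 and matches the reported value.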
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
1 | 1 | < 0.1%
3380 | 1 | < 0.1%
3438 | 1 | < 0.1%
3437 | 1 | < 0.1%
3436 | 1 | < 0.1%
3435 | 1 | < 0.1%
3434 | 1 | < 0.1%
3433 | 1 | < 0.1%
3432 | 1 | < 0.1%
3431 | 1 | < 0.1%
Other values (5137) | 5137 | 99.8%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
5147 | 1 | < 0.1%
5146 | 1 | < 0.1%
5145 | 1 | < 0.1%
5144 | 1 | < 0.1%
5143 | 1 | < 0.1%
5142 | 1 | < 0.1%
5141 | 1 | < 0.1%
5140 | 1 | < 0.1%
5139 | 1 | < 0.1%
5138 | 1 | < 0.1%

관리번호
Text

Distinct: 5130
Distinct (%): 99.7%
Missing: 0
Missing (%): 0.0%
Memory size: 40.3 KiB

Length

Max length: 15
Median length: 14
Mean length: 13.981154
Min length: 3

Characters and Unicode

Total characters: 71961
Distinct characters: 23
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
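
To make the category counts concrete, the sketch below classifies the characters of one sample value with Python's standard `unicodedata` module. The two-letter codes it returns (`Nd`, `Lo`, `Pd`) correspond to the Decimal Number, Other Letter, and Dash Punctuation categories reported for this variable. The sample string is one of the values listed under Sample.

```python
import unicodedata
from collections import Counter

sample = "2023-부산사하-0500"  # a sample value of this variable

# unicodedata.category returns the two-letter general category of a character.
category_counts = Counter(unicodedata.category(ch) for ch in sample)

for code, count in category_counts.most_common():
    print(code, count)
```

For this string the counts are 8 decimal digits, 4 Hangul syllables (Other Letter), and 2 hyphens (Dash Punctuation).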

Unique

Unique: 5113
Unique (%): 99.3%
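
Distinct counts the different values that occur, while Unique counts the values that occur exactly once. With 5147 rows, 5130 distinct values, and 5113 unique values, 17 values must be repeated, and together they cover 34 rows, which is consistent with each repeated value appearing exactly twice. A minimal sketch of the two definitions, using a small made-up list rather than the real column:

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of distinct values, number of values occurring exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Hypothetical toy data: "b" is repeated, so it is distinct but not unique.
toy = ["a", "b", "b", "c"]
print(distinct_and_unique(toy))  # (3, 2)

# Consistency check on the figures reported above.
rows, distinct, unique = 5147, 5130, 5113
repeated_values = distinct - unique   # 17 values occur more than once
rows_in_repeats = rows - unique       # 34 rows hold those repeated values
print(rows_in_repeats / repeated_values)  # 2.0
```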

Sample

1st row: 2023-부산사하-0500
2nd row: 2023-부산사하-0499
3rd row: 2023-부산사하-0498
4th row: 2023-부산사하-0497
5th row: 2023-부산사하-0496

Value | Count | Frequency (%)
2020-부산사하-00383 | 2 | < 0.1%
2020-부산사하-0890 | 2 | < 0.1%
2013-부산사하-0044 | 2 | < 0.1%
2016-부산사하-0317 | 2 | < 0.1%
2022-부산사하-0318 | 2 | < 0.1%
2018-부산사하-0071 | 2 | < 0.1%
2021-부산사하-0618 | 2 | < 0.1%
2020-부산사하-0025 | 2 | < 0.1%
2019-부산사하-00222 | 2 | < 0.1%
2019-부산사하-0019 | 2 | < 0.1%
Other values (5120) | 5127 | 99.6%
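
Most values follow the pattern of a four-digit year, the district name, and a four-digit serial (for example `2023-부산사하-0500`, 14 characters). The values above also include five-digit serials such as `2020-부산사하-00383`, which account for the maximum length of 15, and the sample rows at the end of the report contain short values such as `10-205` and `321`. A hedged sketch of how such deviations could be flagged follows. The regular expression is an assumption inferred from the samples, not a documented format.

```python
import re

# Assumed canonical format: 4-digit year, the literal "부산사하", 4-digit serial.
CANONICAL = re.compile(r"\d{4}-부산사하-\d{4}")

def is_canonical(management_number: str) -> bool:
    """True if the whole string matches the assumed canonical format."""
    return CANONICAL.fullmatch(management_number) is not None

examples = ["2023-부산사하-0500", "2020-부산사하-00383", "10-205", "321"]
for value in examples:
    print(value, len(value), is_canonical(value))
```

Only the first example matches the assumed format; the others would be flagged for review.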

Most occurring characters

ValueCountFrequency (%)
0 13506
18.8%
2 10636
14.8%
- 10225
14.2%
1 5108
 
7.1%
5082
 
7.1%
5082
 
7.1%
5079
 
7.1%
5079
 
7.1%
3 2296
 
3.2%
9 1719
 
2.4%
Other values (13) 8149
11.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 41404 | 57.5%
Other Letter | 20332 | 28.3%
Dash Punctuation | 10225 | 14.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5082
25.0%
5082
25.0%
5079
25.0%
5079
25.0%
2
 
< 0.1%
2
 
< 0.1%
1
 
< 0.1%
1
 
< 0.1%
1
 
< 0.1%
1
 
< 0.1%
Other values (2) 2
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
0 13506
32.6%
2 10636
25.7%
1 5108
 
12.3%
3 2296
 
5.5%
9 1719
 
4.2%
8 1703
 
4.1%
4 1657
 
4.0%
5 1613
 
3.9%
7 1587
 
3.8%
6 1579
 
3.8%
Dash Punctuation
ValueCountFrequency (%)
- 10225
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 51629
71.7%
Hangul 20332
 
28.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5082
25.0%
5082
25.0%
5079
25.0%
5079
25.0%
2
 
< 0.1%
2
 
< 0.1%
1
 
< 0.1%
1
 
< 0.1%
1
 
< 0.1%
1
 
< 0.1%
Other values (2) 2
 
< 0.1%
Common
ValueCountFrequency (%)
0 13506
26.2%
2 10636
20.6%
- 10225
19.8%
1 5108
 
9.9%
3 2296
 
4.4%
9 1719
 
3.3%
8 1703
 
3.3%
4 1657
 
3.2%
5 1613
 
3.1%
7 1587
 
3.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 51629
71.7%
Hangul 20332
 
28.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 13506
26.2%
2 10636
20.6%
- 10225
19.8%
1 5108
 
9.9%
3 2296
 
4.4%
9 1719
 
3.3%
8 1703
 
3.3%
4 1657
 
3.2%
5 1613
 
3.1%
7 1587
 
3.1%
Hangul
ValueCountFrequency (%)
5082
25.0%
5082
25.0%
5079
25.0%
5079
25.0%
2
 
< 0.1%
2
 
< 0.1%
1
 
< 0.1%
1
 
< 0.1%
1
 
< 0.1%
1
 
< 0.1%
Other values (2) 2
 
< 0.1%

대표자명
Text

Distinct: 4202
Distinct (%): 81.7%
Missing: 1
Missing (%): < 0.1%
Memory size: 40.3 KiB

Length

Max length: 25
Median length: 3
Mean length: 3.0991061
Min length: 2

Characters and Unicode

Total characters: 15948
Distinct characters: 326
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3580
Unique (%): 69.6%

Sample

1st row: 변다미
2nd row: 정종우
3rd row: 임혜민
4th row: 윤미연
5th row: 최성훈
ValueCountFrequency (%)
13
 
0.2%
김민지 10
 
0.2%
장경숙 8
 
0.2%
이지현 8
 
0.2%
김민정 7
 
0.1%
김재선 7
 
0.1%
1명 7
 
0.1%
김희정 7
 
0.1%
김민주 6
 
0.1%
김태훈 6
 
0.1%
Other values (4224) 5123
98.5%

Most occurring characters

ValueCountFrequency (%)
1122
 
7.0%
735
 
4.6%
721
 
4.5%
485
 
3.0%
464
 
2.9%
387
 
2.4%
338
 
2.1%
329
 
2.1%
313
 
2.0%
313
 
2.0%
Other values (316) 10741
67.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 15575
97.7%
Uppercase Letter 215
 
1.3%
Space Separator 57
 
0.4%
Other Punctuation 57
 
0.4%
Decimal Number 24
 
0.2%
Close Punctuation 9
 
0.1%
Open Punctuation 9
 
0.1%
Dash Punctuation 2
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1122
 
7.2%
735
 
4.7%
721
 
4.6%
485
 
3.1%
464
 
3.0%
387
 
2.5%
338
 
2.2%
329
 
2.1%
313
 
2.0%
313
 
2.0%
Other values (279) 10368
66.6%
Uppercase Letter
ValueCountFrequency (%)
I 31
14.4%
A 28
13.0%
N 25
11.6%
L 12
 
5.6%
O 11
 
5.1%
E 11
 
5.1%
U 11
 
5.1%
H 10
 
4.7%
G 10
 
4.7%
Y 9
 
4.2%
Other values (16) 57
26.5%
Decimal Number
ValueCountFrequency (%)
1 14
58.3%
8 4
 
16.7%
4 2
 
8.3%
0 2
 
8.3%
2 1
 
4.2%
6 1
 
4.2%
Space Separator
ValueCountFrequency (%)
57
100.0%
Other Punctuation
ValueCountFrequency (%)
57
100.0%
Close Punctuation
ValueCountFrequency (%)
) 9
100.0%
Open Punctuation
ValueCountFrequency (%)
( 9
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 15575
97.7%
Latin 215
 
1.3%
Common 158
 
1.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1122
 
7.2%
735
 
4.7%
721
 
4.6%
485
 
3.1%
464
 
3.0%
387
 
2.5%
338
 
2.2%
329
 
2.1%
313
 
2.0%
313
 
2.0%
Other values (279) 10368
66.6%
Latin
ValueCountFrequency (%)
I 31
14.4%
A 28
13.0%
N 25
11.6%
L 12
 
5.6%
O 11
 
5.1%
E 11
 
5.1%
U 11
 
5.1%
H 10
 
4.7%
G 10
 
4.7%
Y 9
 
4.2%
Other values (16) 57
26.5%
Common
ValueCountFrequency (%)
57
36.1%
57
36.1%
1 14
 
8.9%
) 9
 
5.7%
( 9
 
5.7%
8 4
 
2.5%
4 2
 
1.3%
- 2
 
1.3%
0 2
 
1.3%
2 1
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
Hangul 15575
97.7%
ASCII 316
 
2.0%
None 57
 
0.4%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1122
 
7.2%
735
 
4.7%
721
 
4.6%
485
 
3.1%
464
 
3.0%
387
 
2.5%
338
 
2.2%
329
 
2.1%
313
 
2.0%
313
 
2.0%
Other values (279) 10368
66.6%
ASCII
ValueCountFrequency (%)
57
18.0%
I 31
 
9.8%
A 28
 
8.9%
N 25
 
7.9%
1 14
 
4.4%
L 12
 
3.8%
O 11
 
3.5%
E 11
 
3.5%
U 11
 
3.5%
H 10
 
3.2%
Other values (26) 106
33.5%
None
ValueCountFrequency (%)
57
100.0%

법인또는상호
Text

Distinct: 5076
Distinct (%): 98.6%
Missing: 0
Missing (%): 0.0%
Memory size: 40.3 KiB

Length

Max length: 54
Median length: 35
Mean length: 6.2438314
Min length: 1

Characters and Unicode

Total characters: 32137
Distinct characters: 976
Distinct categories: 13
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5011
Unique (%): 97.4%

Sample

1st row: 디엠아이스
2nd row: 우기상점
3rd row: 하망이네
4th row: 리투
5th row: 훈씨
ValueCountFrequency (%)
주식회사 201
 
3.2%
16
 
0.3%
컴퍼니 14
 
0.2%
인셀덤 12
 
0.2%
스튜디오 12
 
0.2%
11
 
0.2%
아리따움 10
 
0.2%
하단점 10
 
0.2%
shop 8
 
0.1%
international 7
 
0.1%
Other values (5655) 5948
95.2%

Most occurring characters

ValueCountFrequency (%)
1205
 
3.7%
1116
 
3.5%
863
 
2.7%
( 792
 
2.5%
) 791
 
2.5%
560
 
1.7%
411
 
1.3%
411
 
1.3%
386
 
1.2%
341
 
1.1%
Other values (966) 25261
78.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 24153
75.2%
Lowercase Letter 2597
 
8.1%
Uppercase Letter 2133
 
6.6%
Space Separator 1116
 
3.5%
Open Punctuation 794
 
2.5%
Close Punctuation 793
 
2.5%
Decimal Number 278
 
0.9%
Other Punctuation 159
 
0.5%
Other Symbol 73
 
0.2%
Dash Punctuation 30
 
0.1%
Other values (3) 11
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1205
 
5.0%
863
 
3.6%
560
 
2.3%
411
 
1.7%
411
 
1.7%
386
 
1.6%
341
 
1.4%
338
 
1.4%
300
 
1.2%
292
 
1.2%
Other values (885) 19046
78.9%
Lowercase Letter
ValueCountFrequency (%)
e 320
12.3%
o 269
 
10.4%
a 221
 
8.5%
n 192
 
7.4%
i 184
 
7.1%
l 149
 
5.7%
r 148
 
5.7%
s 144
 
5.5%
t 113
 
4.4%
m 108
 
4.2%
Other values (16) 749
28.8%
Uppercase Letter
ValueCountFrequency (%)
A 175
 
8.2%
E 160
 
7.5%
S 155
 
7.3%
O 154
 
7.2%
M 124
 
5.8%
N 121
 
5.7%
T 120
 
5.6%
I 119
 
5.6%
L 104
 
4.9%
R 94
 
4.4%
Other values (16) 807
37.8%
Decimal Number
ValueCountFrequency (%)
1 62
22.3%
2 41
14.7%
0 40
14.4%
5 24
 
8.6%
3 21
 
7.6%
9 21
 
7.6%
6 20
 
7.2%
8 19
 
6.8%
7 19
 
6.8%
4 11
 
4.0%
Other Punctuation
ValueCountFrequency (%)
. 75
47.2%
& 46
28.9%
' 15
 
9.4%
? 9
 
5.7%
# 9
 
5.7%
: 2
 
1.3%
! 1
 
0.6%
/ 1
 
0.6%
· 1
 
0.6%
Open Punctuation
ValueCountFrequency (%)
( 792
99.7%
[ 2
 
0.3%
Close Punctuation
ValueCountFrequency (%)
) 791
99.7%
] 2
 
0.3%
Space Separator
ValueCountFrequency (%)
1116
100.0%
Other Symbol
ValueCountFrequency (%)
73
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 30
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 9
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 24221
75.4%
Latin 4730
 
14.7%
Common 3181
 
9.9%
Han 5
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1205
 
5.0%
863
 
3.6%
560
 
2.3%
411
 
1.7%
411
 
1.7%
386
 
1.6%
341
 
1.4%
338
 
1.4%
300
 
1.2%
292
 
1.2%
Other values (881) 19114
78.9%
Latin
ValueCountFrequency (%)
e 320
 
6.8%
o 269
 
5.7%
a 221
 
4.7%
n 192
 
4.1%
i 184
 
3.9%
A 175
 
3.7%
E 160
 
3.4%
S 155
 
3.3%
O 154
 
3.3%
l 149
 
3.2%
Other values (42) 2751
58.2%
Common
ValueCountFrequency (%)
1116
35.1%
( 792
24.9%
) 791
24.9%
. 75
 
2.4%
1 62
 
1.9%
& 46
 
1.4%
2 41
 
1.3%
0 40
 
1.3%
- 30
 
0.9%
5 24
 
0.8%
Other values (18) 164
 
5.2%
Han
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 24148
75.1%
ASCII 7910
 
24.6%
None 74
 
0.2%
CJK 5
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1205
 
5.0%
863
 
3.6%
560
 
2.3%
411
 
1.7%
411
 
1.7%
386
 
1.6%
341
 
1.4%
338
 
1.4%
300
 
1.2%
292
 
1.2%
Other values (880) 19041
78.9%
ASCII
ValueCountFrequency (%)
1116
 
14.1%
( 792
 
10.0%
) 791
 
10.0%
e 320
 
4.0%
o 269
 
3.4%
a 221
 
2.8%
n 192
 
2.4%
i 184
 
2.3%
A 175
 
2.2%
E 160
 
2.0%
Other values (69) 3690
46.6%
None
ValueCountFrequency (%)
73
98.6%
· 1
 
1.4%
CJK
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%

법인구분
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 40.3 KiB
개인: 4654
법인: 439
<NA>: 53
제공: 1

Length

Max length: 4
Median length: 2
Mean length: 2.0205945
Min length: 2

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 개인
2nd row: 개인
3rd row: 개인
4th row: 개인
5th row: 개인

Common Values

Value | Count | Frequency (%)
개인 | 4654 | 90.4%
법인 | 439 | 8.5%
<NA> | 53 | 1.0%
제공 | 1 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
개인 | 4654 | 90.4%
법인 | 439 | 8.5%
na | 53 | 1.0%
제공 | 1 | < 0.1%
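
The 74.8% imbalance figure reported for 법인구분 in the alerts can be reproduced from these counts under one common definition: one minus the Shannon entropy of the class distribution divided by its maximum possible value, the logarithm of the number of classes. Whether ydata-profiling uses exactly this formula is an assumption here, but the result matches the reported value.

```python
import math

def imbalance_score(counts):
    """1 - normalized Shannon entropy.

    0 for a perfectly balanced distribution, approaching 1 as one class dominates.
    """
    k = len(counts)
    if k < 2:
        raise ValueError("need at least two classes")
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Counts from the Common Values table: 개인, 법인, <NA>, 제공
counts = [4654, 439, 53, 1]
score = imbalance_score(counts)
print(f"{score:.1%}")
```

Note that the 53 `<NA>` entries are treated as a category of their own in this calculation, as they are in the table above.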

운영상태
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 40.3 KiB
정상영업: 5147

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 정상영업
2nd row: 정상영업
3rd row: 정상영업
4th row: 정상영업
5th row: 정상영업

Common Values

Value | Count | Frequency (%)
정상영업 | 5147 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
정상영업 | 5147 | 100.0%

데이터기준일자
Date

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 40.3 KiB
Minimum: 2023-05-23 00:00:00
Maximum: 2023-05-23 00:00:00
Histogram with fixed size bins (bins=1)

Interactions


Correlations

 | 번호 | 법인구분
번호 | 1.000 | 0.109
법인구분 | 0.109 | 1.000

 | 번호 | 법인구분
번호 | 1.000 | 0.065
법인구분 | 0.065 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

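index | 번호 | 관리번호 | 대표자명 | 법인또는상호 | 법인구분 | 운영상태 | 데이터기준일자
0 | 1 | 2023-부산사하-0500 | 변다미 | 디엠아이스 | 개인 | 정상영업 | 2023-05-23
1 | 2 | 2023-부산사하-0499 | 정종우 | 우기상점 | 개인 | 정상영업 | 2023-05-23
2 | 3 | 2023-부산사하-0498 | 임혜민 | 하망이네 | 개인 | 정상영업 | 2023-05-23
3 | 4 | 2023-부산사하-0497 | 윤미연 | 리투 | 개인 | 정상영업 | 2023-05-23
4 | 5 | 2023-부산사하-0496 | 최성훈 | 훈씨 | 개인 | 정상영업 | 2023-05-23
5 | 6 | 2023-부산사하-0493 | 장현우 | 주식회사 잇츠부산 | 법인 | 정상영업 | 2023-05-23
6 | 7 | 2023-부산사하-0492 | 백정인 | 행복한숍 | 개인 | 정상영업 | 2023-05-23
7 | 8 | 2023-부산사하-0491 | 강승구 | 플러스닷컴 | 개인 | 정상영업 | 2023-05-23
8 | 9 | 2023-부산사하-0490 | 김은애 | 라이징솔루션 | 개인 | 정상영업 | 2023-05-23
9 | 10 | 2023-부산사하-0495 | 최영준 | 모던팩토리 | 개인 | 정상영업 | 2023-05-23
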
번호관리번호대표자명법인또는상호법인구분운영상태데이터기준일자
5137513810-205김승용S.Y.K.상사개인정상영업2023-05-23
5138513910-187김태완하나스포츠개인정상영업2023-05-23
5139514010-162이충녕ezy개인정상영업2023-05-23
5140514110-121김승진원아사개인정상영업2023-05-23
5141514210-118박재근애드파골프(주)법인정상영업2023-05-23
514251432022-부산사하-0700박영동주식회사청산에식품법인정상영업2023-05-23
514351442021-부산사하-0291심정섭㈜오리엔탈코머스개인정상영업2023-05-23
514451451139김종묵한아툴스㈜개인정상영업2023-05-23
514551462021-부산사하-0361최현리캐스트개인정상영업2023-05-23
51465147321조선자남광식품상사개인정상영업2023-05-23