Overview

Dataset statistics

Number of variables: 6
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 26
Duplicate rows (%): 0.3%
Total size in memory: 546.9 KiB
Average record size in memory: 56.0 B
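
The derived figures in this table follow from the raw counts. A minimal sketch of the arithmetic, using only values reported above (the variable names are illustrative, not taken from the profiling tool):

```python
# Values copied from the "Dataset statistics" table.
n_rows = 10_000
n_duplicate_rows = 26
total_size_kib = 546.9

# Share of duplicate rows, as a percentage of all rows.
duplicate_pct = 100 * n_duplicate_rows / n_rows

# Average record size in bytes (1 KiB = 1024 bytes).
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"{duplicate_pct:.1f}%")        # one decimal place, as in the report
print(f"{avg_record_bytes:.1f} B")
```

Rounded to one decimal place, these reproduce the 0.3% and 56.0 B shown in the table.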

Variable types

Text: 2
Categorical: 2
DateTime: 2

Dataset

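Description: Data on complaints submitted through the e-Privacy Clean Service (e프라이버시 클린서비스) web page during December 2022. It provides fields such as business name (사업체명), domain address (도메인 주소), and complaint category (민원 구분).
Author: Personal Information Protection Commission (개인정보보호위원회)
URL: https://www.data.go.kr/data/15119766/fileData.do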

Alerts

Dataset has 26 (0.3%) duplicate rows [Duplicates]
민원 구분-상세 (complaint category, detail) is highly overall correlated with 민원 구분 (complaint category) [High correlation]
민원 구분 is highly overall correlated with 민원 구분-상세 [High correlation]
민원 구분 is highly imbalanced (79.8%) [Imbalance]
민원 구분-상세 is highly imbalanced (85.2%) [Imbalance]
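
The two imbalance percentages are consistent with a score of one minus the normalized Shannon entropy of the class counts. That formula is an assumption here (the report does not state it); the sketch below applies it to the value counts listed for the two categorical variables further down, and `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a balanced column, approaching 1 when one
    class dominates. H is the Shannon entropy, k the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single-class column is reported as 0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts taken from the report's "Common Values" tables.
score_category = imbalance_score([9684, 316])       # 민원 구분
score_detail = imbalance_score([9684, 163, 153])    # 민원 구분-상세

print(f"{score_category:.1%}, {score_detail:.1%}")
```

Under this assumption the scores come out at roughly 79.8% and 85.2%, matching the alerts.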

Reproduction

Analysis started: 2023-12-12 17:38:50.046132
Analysis finished: 2023-12-12 17:38:50.933198
Duration: 0.89 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
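
The reported duration is simply the difference between the two timestamps. A small sketch of that check with the standard library (variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" section.
started = datetime.fromisoformat("2023-12-12 17:38:50.046132")
finished = datetime.fromisoformat("2023-12-12 17:38:50.933198")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```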

Variables

사업체명 (business name)
Text

Distinct: 1937
Distinct (%): 19.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 26
Median length: 20
Mean length: 7.7009
Min length: 1
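
Length here is the number of characters per value, and the mean equals total characters divided by the number of observations (77009 / 10000 = 7.7009 for this variable). A minimal sketch, assuming length is counted in Unicode code points and using the five sample values this report lists for the variable:

```python
# The five sample values shown for this variable in the report.
values = ["한성대학교", "(NULL)", "한국세무사회", "서울특별시미래청년기획단", "한국환경공단"]

# len() counts Unicode code points, so each Hangul syllable counts as one.
lengths = [len(v) for v in values]

min_len = min(lengths)
max_len = max(lengths)
mean_len = sum(lengths) / len(lengths)
print(min_len, max_len, mean_len)

# Consistency check on the reported totals for the full column.
reported_mean = 77009 / 10000
```

The sample statistics describe only these five rows; the full-column figures are the ones in the table above.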

Characters and Unicode

Total characters: 77009
Distinct characters: 724
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
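
As a rough illustration of how such per-character properties can be tallied, the sketch below uses Python's standard `unicodedata` module to count general categories. The input string is a made-up example assembled from fragments seen in this report, and the standard library exposes categories only; scripts and blocks would need another source.

```python
import unicodedata
from collections import Counter

# Two-letter general category codes mapped to the long names used in the report.
CATEGORY_NAMES = {
    "Lo": "Other Letter", "Lu": "Uppercase Letter", "Ll": "Lowercase Letter",
    "Nd": "Decimal Number", "Ps": "Open Punctuation", "Pe": "Close Punctuation",
    "Po": "Other Punctuation", "Pd": "Dash Punctuation",
    "Pc": "Connector Punctuation", "Zs": "Space Separator", "So": "Other Symbol",
}

def category_counts(text):
    """Count characters per Unicode general category (two-letter code)."""
    return Counter(unicodedata.category(ch) for ch in text)

counts = category_counts("(주)KB손해보험 1")  # illustrative input, not a dataset row
named = {CATEGORY_NAMES.get(code, code): n for code, n in counts.items()}
print(named)
```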

Unique

Unique: 1000
Unique (%): 10.0%
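
"Distinct" and "Unique" are different counts: as the report appears to use the terms, distinct is the number of different values, while unique is the number of values that occur exactly once. A small sketch of that distinction on invented data:

```python
from collections import Counter

# Invented values for illustration only.
values = ["a.com", "b.com", "a.com", "c.com", "(NULL)", "(NULL)"]

counts = Counter(values)

# Distinct: how many different values appear at all.
n_distinct = len(counts)

# Unique: how many values appear exactly once.
n_unique = sum(1 for c in counts.values() if c == 1)

print(n_distinct, n_unique)
```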

Sample

1st row: 한성대학교
2nd row: (NULL)
3rd row: 한국세무사회
4th row: 서울특별시미래청년기획단
5th row: 한국환경공단
Value | Count | Frequency (%)
null | 198 | 1.8%
주식회사 | 163 | 1.5%
kb손해보험 | 155 | 1.4%
고용정보원 | 153 | 1.4%
주)이베이코리아 | 119 | 1.1%
엔에이치엔(주 | 119 | 1.1%
롯데멤버스(주 | 117 | 1.1%
위대한상상(요기요 | 113 | 1.0%
주)현대백화점 | 106 | 1.0%
주)11번가 | 105 | 1.0%
Other values (2047) | 9432 | 87.5%

Most occurring characters

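Value | Count | Frequency (%)
( | 6181 | 8.0%
) | 6181 | 8.0%
[character lost] | 5801 | 7.5%
[character lost] | 2831 | 3.7%
[character lost] | 2191 | 2.8%
[character lost] | 1265 | 1.6%
[character lost] | 1234 | 1.6%
[character lost] | 862 | 1.1%
[character lost] | 808 | 1.0%
(space) | 802 | 1.0%
Other values (714) | 48853 | 63.4%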

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 59946 | 77.8%
Open Punctuation | 6181 | 8.0%
Close Punctuation | 6181 | 8.0%
Uppercase Letter | 3036 | 3.9%
Space Separator | 802 | 1.0%
Decimal Number | 390 | 0.5%
Lowercase Letter | 348 | 0.5%
Other Symbol | 70 | 0.1%
Other Punctuation | 38 | < 0.1%
Dash Punctuation | 10 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[character lost] | 5801 | 9.7%
[character lost] | 2831 | 4.7%
[character lost] | 2191 | 3.7%
[character lost] | 1265 | 2.1%
[character lost] | 1234 | 2.1%
[character lost] | 862 | 1.4%
[character lost] | 808 | 1.3%
[character lost] | 779 | 1.3%
[character lost] | 760 | 1.3%
[character lost] | 738 | 1.2%
Other values (644) | 42677 | 71.2%

Uppercase Letter
Value | Count | Frequency (%)
L | 568 | 18.7%
N | 301 | 9.9%
K | 269 | 8.9%
S | 224 | 7.4%
B | 221 | 7.3%
U | 209 | 6.9%
G | 191 | 6.3%
T | 148 | 4.9%
C | 127 | 4.2%
E | 115 | 3.8%
Other values (15) | 663 | 21.8%

Lowercase Letter
Value | Count | Frequency (%)
s | 43 | 12.4%
e | 33 | 9.5%
p | 25 | 7.2%
t | 25 | 7.2%
g | 23 | 6.6%
r | 23 | 6.6%
a | 22 | 6.3%
o | 22 | 6.3%
i | 21 | 6.0%
c | 21 | 6.0%
Other values (13) | 90 | 25.9%

Decimal Number
Value | Count | Frequency (%)
1 | 224 | 57.4%
3 | 42 | 10.8%
6 | 27 | 6.9%
2 | 23 | 5.9%
8 | 20 | 5.1%
4 | 15 | 3.8%
9 | 14 | 3.6%
5 | 13 | 3.3%
0 | 11 | 2.8%
7 | 1 | 0.3%

Other Punctuation
Value | Count | Frequency (%)
& | 20 | 52.6%
. | 6 | 15.8%
/ | 5 | 13.2%
, | 3 | 7.9%
· | 3 | 7.9%
' | 1 | 2.6%

Open Punctuation
Value | Count | Frequency (%)
( | 6181 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 6181 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 802 | 100.0%

Other Symbol
Value | Count | Frequency (%)
[character lost] | 70 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 10 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 7 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 60016 | 77.9%
Common | 13609 | 17.7%
Latin | 3384 | 4.4%

Most frequent character per script

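Hangul
Value | Count | Frequency (%)
[character lost] | 5801 | 9.7%
[character lost] | 2831 | 4.7%
[character lost] | 2191 | 3.7%
[character lost] | 1265 | 2.1%
[character lost] | 1234 | 2.1%
[character lost] | 862 | 1.4%
[character lost] | 808 | 1.3%
[character lost] | 779 | 1.3%
[character lost] | 760 | 1.3%
[character lost] | 738 | 1.2%
Other values (645) | 42747 | 71.2%

Latin
Value | Count | Frequency (%)
L | 568 | 16.8%
N | 301 | 8.9%
K | 269 | 7.9%
S | 224 | 6.6%
B | 221 | 6.5%
U | 209 | 6.2%
G | 191 | 5.6%
T | 148 | 4.4%
C | 127 | 3.8%
E | 115 | 3.4%
Other values (38) | 1011 | 29.9%

Common
Value | Count | Frequency (%)
( | 6181 | 45.4%
) | 6181 | 45.4%
(space) | 802 | 5.9%
1 | 224 | 1.6%
3 | 42 | 0.3%
6 | 27 | 0.2%
2 | 23 | 0.2%
8 | 20 | 0.1%
& | 20 | 0.1%
4 | 15 | 0.1%
Other values (11) | 74 | 0.5%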

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 59946 | 77.8%
ASCII | 16990 | 22.1%
None | 73 | 0.1%

Most frequent character per block

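ASCII
Value | Count | Frequency (%)
( | 6181 | 36.4%
) | 6181 | 36.4%
(space) | 802 | 4.7%
L | 568 | 3.3%
N | 301 | 1.8%
K | 269 | 1.6%
S | 224 | 1.3%
1 | 224 | 1.3%
B | 221 | 1.3%
U | 209 | 1.2%
Other values (58) | 1810 | 10.7%

Hangul
Value | Count | Frequency (%)
[character lost] | 5801 | 9.7%
[character lost] | 2831 | 4.7%
[character lost] | 2191 | 3.7%
[character lost] | 1265 | 2.1%
[character lost] | 1234 | 2.1%
[character lost] | 862 | 1.4%
[character lost] | 808 | 1.3%
[character lost] | 779 | 1.3%
[character lost] | 760 | 1.3%
[character lost] | 738 | 1.2%
Other values (644) | 42677 | 71.2%

None
Value | Count | Frequency (%)
[character lost] | 70 | 95.9%
· | 3 | 4.1%
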
도메인 주소 (domain address)
Text

Distinct: 2155
Distinct (%): 21.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 34
Median length: 28
Mean length: 12.9571
Min length: 6

Characters and Unicode

Total characters: 129571
Distinct characters: 90
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1139
Unique (%): 11.4%

Sample

1st row: hansung.ac.kr
2nd row: (NULL)
3rd row: license.kacpta.or.kr
4th row: youth.seoul.go.kr
5th row: cpoint.or.kr
Value | Count | Frequency (%)
null | 198 | 2.0%
kbinsure.co.kr | 155 | 1.6%
work.go.kr | 144 | 1.4%
lpoint.com | 115 | 1.1%
yogiyo.co.kr | 113 | 1.1%
hangame.com | 108 | 1.1%
ehyundai.com | 105 | 1.1%
interpark.com | 99 | 1.0%
gmarket.co.kr | 98 | 1.0%
tmoney.co.kr | 94 | 0.9%
Other values (2145) | 8771 | 87.7%

Most occurring characters

Value | Count | Frequency (%)
. | 15858 | 12.2%
o | 15405 | 11.9%
c | 10784 | 8.3%
r | 9274 | 7.2%
m | 7822 | 6.0%
k | 7602 | 5.9%
e | 7449 | 5.7%
a | 7352 | 5.7%
n | 6025 | 4.6%
i | 5046 | 3.9%
Other values (80) | 36954 | 28.5%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 110697 | 85.4%
Other Punctuation | 15930 | 12.3%
Decimal Number | 1368 | 1.1%
Uppercase Letter | 792 | 0.6%
Dash Punctuation | 295 | 0.2%
Open Punctuation | 200 | 0.2%
Close Punctuation | 200 | 0.2%
Other Letter | 89 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[character lost] | 14 | 15.7%
[character lost] | 13 | 14.6%
[character lost] | 4 | 4.5%
[character lost] | 4 | 4.5%
[character lost] | 4 | 4.5%
[character lost] | 4 | 4.5%
[character lost] | 3 | 3.4%
[character lost] | 2 | 2.2%
[character lost] | 2 | 2.2%
[character lost] | 2 | 2.2%
Other values (35) | 37 | 41.6%

Lowercase Letter
Value | Count | Frequency (%)
o | 15405 | 13.9%
c | 10784 | 9.7%
r | 9274 | 8.4%
m | 7822 | 7.1%
k | 7602 | 6.9%
e | 7449 | 6.7%
a | 7352 | 6.6%
n | 6025 | 5.4%
i | 5046 | 4.6%
t | 4463 | 4.0%
Other values (16) | 29475 | 26.6%

Decimal Number
Value | Count | Frequency (%)
1 | 437 | 31.9%
2 | 250 | 18.3%
4 | 165 | 12.1%
9 | 160 | 11.7%
0 | 100 | 7.3%
8 | 59 | 4.3%
5 | 59 | 4.3%
6 | 57 | 4.2%
3 | 43 | 3.1%
7 | 38 | 2.8%

Other Punctuation
Value | Count | Frequency (%)
. | 15858 | 99.5%
/ | 40 | 0.3%
: | 32 | 0.2%

Uppercase Letter
Value | Count | Frequency (%)
L | 396 | 50.0%
U | 198 | 25.0%
N | 198 | 25.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 295 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 200 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 200 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 111489 | 86.0%
Common | 17993 | 13.9%
Hangul | 89 | 0.1%

Most frequent character per script

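Hangul
Value | Count | Frequency (%)
[character lost] | 14 | 15.7%
[character lost] | 13 | 14.6%
[character lost] | 4 | 4.5%
[character lost] | 4 | 4.5%
[character lost] | 4 | 4.5%
[character lost] | 4 | 4.5%
[character lost] | 3 | 3.4%
[character lost] | 2 | 2.2%
[character lost] | 2 | 2.2%
[character lost] | 2 | 2.2%
Other values (35) | 37 | 41.6%

Latin
Value | Count | Frequency (%)
o | 15405 | 13.8%
c | 10784 | 9.7%
r | 9274 | 8.3%
m | 7822 | 7.0%
k | 7602 | 6.8%
e | 7449 | 6.7%
a | 7352 | 6.6%
n | 6025 | 5.4%
i | 5046 | 4.5%
t | 4463 | 4.0%
Other values (19) | 30267 | 27.1%

Common
Value | Count | Frequency (%)
. | 15858 | 88.1%
1 | 437 | 2.4%
- | 295 | 1.6%
2 | 250 | 1.4%
( | 200 | 1.1%
) | 200 | 1.1%
4 | 165 | 0.9%
9 | 160 | 0.9%
0 | 100 | 0.6%
8 | 59 | 0.3%
Other values (6) | 269 | 1.5%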

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 129482 | 99.9%
Hangul | 89 | 0.1%

Most frequent character per block

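ASCII
Value | Count | Frequency (%)
. | 15858 | 12.2%
o | 15405 | 11.9%
c | 10784 | 8.3%
r | 9274 | 7.2%
m | 7822 | 6.0%
k | 7602 | 5.9%
e | 7449 | 5.8%
a | 7352 | 5.7%
n | 6025 | 4.7%
i | 5046 | 3.9%
Other values (35) | 36865 | 28.5%

Hangul
Value | Count | Frequency (%)
[character lost] | 14 | 15.7%
[character lost] | 13 | 14.6%
[character lost] | 4 | 4.5%
[character lost] | 4 | 4.5%
[character lost] | 4 | 4.5%
[character lost] | 4 | 4.5%
[character lost] | 3 | 3.4%
[character lost] | 2 | 2.2%
[character lost] | 2 | 2.2%
[character lost] | 2 | 2.2%
Other values (35) | 37 | 41.6%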

민원 구분 (complaint category)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 회원탈퇴
2nd row: 회원탈퇴
3rd row: 회원탈퇴
4th row: 회원탈퇴
5th row: 회원탈퇴

Common Values

Value | Count | Frequency (%)
회원탈퇴 (membership withdrawal) | 9684 | 96.8%
개인정보 (personal information) | 316 | 3.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; 회원탈퇴 9684 (96.8%), 개인정보 316 (3.2%)]

민원 구분-상세 (complaint category, detail)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 2
Mean length: 2.0306
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 탈퇴
2nd row: 탈퇴
3rd row: 탈퇴
4th row: 탈퇴
5th row: 탈퇴

Common Values

Value | Count | Frequency (%)
탈퇴 (withdrawal) | 9684 | 96.8%
열람 (access request) | 163 | 1.6%
처리정지 (suspension of processing) | 153 | 1.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; 탈퇴 9684 (96.8%), 열람 163 (1.6%), 처리정지 153 (1.5%)]

신청일자 (request date)
DateTime

Distinct: 5306
Distinct (%): 53.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2022-12-01 00:06:00
Maximum: 2022-12-31 23:57:00
[Figure: histogram with fixed size bins (bins=50)]

완료일자 (completion date)
DateTime

Distinct: 2874
Distinct (%): 28.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2022-12-01 01:22:00
Maximum: 2023-05-04 12:02:00
[Figure: histogram with fixed size bins (bins=50)]
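
A histogram with fixed size bins splits the range between the minimum and maximum timestamp into equal-width intervals and counts the values in each. The sketch below is one plausible way to do that with the standard library, not the profiling tool's own code; `fixed_bin_counts` is a hypothetical helper, and the input is the first five 완료일자 values from the sample rows.

```python
from datetime import datetime

def fixed_bin_counts(timestamps, bins=50):
    """Count timestamps in `bins` equal-width intervals between min and max."""
    lo, hi = min(timestamps), max(timestamps)
    span = (hi - lo).total_seconds()
    counts = [0] * bins
    for ts in timestamps:
        # Position of ts within the range, scaled to a bin index.
        idx = 0 if span == 0 else int((ts - lo).total_seconds() / span * bins)
        counts[min(idx, bins - 1)] += 1  # the maximum itself goes in the last bin
    return counts

# Completion timestamps of the first five sample rows.
sample = [
    datetime(2022, 12, 23, 14, 26),
    datetime(2022, 12, 28, 11, 1),
    datetime(2022, 12, 21, 19, 43),
    datetime(2023, 1, 17, 13, 40),
    datetime(2023, 2, 1, 18, 0),
]
counts = fixed_bin_counts(sample)
print(len(counts), sum(counts))
```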

Correlations

[Figure: correlation heatmap]
(variable) | 민원 구분 | 민원 구분-상세
민원 구분 | 1.000 | 1.000
민원 구분-상세 | 1.000 | 1.000
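
A correlation of 1.000 between two categorical variables is what an association measure such as Cramér's V gives when one variable fully determines the other. The report does not say which statistic it used, so Cramér's V is an assumption here, and the contingency table below is also assumed: it supposes every 회원탈퇴 row has detail 탈퇴 and every 개인정보 row has detail 열람 or 처리정지, which fits the marginal counts (9684, and 163 + 153 = 316) but is not shown directly in the report.

```python
import math

def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))

# Assumed joint counts: rows = 민원 구분, columns = 민원 구분-상세.
table = [
    [9684, 0, 0],    # 회원탈퇴 -> 탈퇴
    [0, 163, 153],   # 개인정보 -> 열람, 처리정지
]
v = cramers_v(table)
print(round(v, 3))
```

With a perfectly nested pair like this, the value is 1 up to floating-point rounding, which agrees with the matrix above.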

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completeness.
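
The report counts zero missing cells, yet the sample rows contain the literal string "(NULL)" in the 사업체명 and 도메인 주소 columns. If that string is a placeholder for a missing value (an assumption; the dataset description does not say), a per-column nullity count could treat it as missing, as in this sketch. The helper name and the placeholder list are illustrative.

```python
# Assumption: "(NULL)" marks a missing value in the source file.
PLACEHOLDERS = (None, "", "(NULL)")

def missing_per_column(rows, placeholders=PLACEHOLDERS):
    """Count, per column, how many cells hold a missing-value placeholder."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + (value in placeholders)
    return counts

# First three sample rows, restricted to the two text columns.
rows = [
    {"사업체명": "한성대학교", "도메인 주소": "hansung.ac.kr"},
    {"사업체명": "(NULL)", "도메인 주소": "(NULL)"},
    {"사업체명": "한국세무사회", "도메인 주소": "license.kacpta.or.kr"},
]
missing = missing_per_column(rows)
print(missing)
```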

Sample

(index) | 사업체명 | 도메인 주소 | 민원 구분 | 민원 구분-상세 | 신청일자 | 완료일자
36518 | 한성대학교 | hansung.ac.kr | 회원탈퇴 | 탈퇴 | 2022-12-23 14:25 | 2022-12-23 14:26
30703 | (NULL) | (NULL) | 회원탈퇴 | 탈퇴 | 2022-12-19 22:39 | 2022-12-28 11:01
5639 | 한국세무사회 | license.kacpta.or.kr | 회원탈퇴 | 탈퇴 | 2022-12-04 20:45 | 2022-12-21 19:43
44876 | 서울특별시미래청년기획단 | youth.seoul.go.kr | 회원탈퇴 | 탈퇴 | 2022-12-28 17:56 | 2023-01-17 13:40
2973 | 한국환경공단 | cpoint.or.kr | 회원탈퇴 | 탈퇴 | 2022-12-02 17:20 | 2023-02-01 18:00
11208 | KB손해보험 | kbinsure.co.kr | 회원탈퇴 | 탈퇴 | 2022-12-08 01:06 | 2022-12-13 19:10
22361 | 로카모빌리티 | cashbee.co.kr | 회원탈퇴 | 탈퇴 | 2022-12-14 16:57 | 2023-02-14 14:14
41086 | 블리자드엔터테인먼트(유) | blizzard.co.kr | 회원탈퇴 | 탈퇴 | 2022-12-26 18:27 | 2023-02-02 15:11
38491 | 러브밤 | lovebam.kr | 회원탈퇴 | 탈퇴 | 2022-12-24 22:07 | 2023-04-20 17:10
18814 | (주)팬딩 | fanding.kr | 회원탈퇴 | 탈퇴 | 2022-12-12 20:52 | 2023-02-13 18:34

(index) | 사업체명 | 도메인 주소 | 민원 구분 | 민원 구분-상세 | 신청일자 | 완료일자
4885 | 본아이에프(주) | bonif.co.kr | 회원탈퇴 | 탈퇴 | 2022-12-04 10:27 | 2022-12-23 08:43
22211 | (NULL) | (NULL) | 회원탈퇴 | 탈퇴 | 2022-12-14 16:06 | 2022-12-27 16:45
40977 | 주식회사 위대한상상(요기요) | yogiyo.co.kr | 회원탈퇴 | 탈퇴 | 2022-12-26 17:26 | 2023-01-19 11:44
42606 | BS몰 | banana69.co.kr | 회원탈퇴 | 탈퇴 | 2022-12-27 15:56 | 2023-04-20 17:10
45826 | (주)한섬 | sign.handsome.co.kr | 회원탈퇴 | 탈퇴 | 2022-12-29 12:39 | 2023-01-13 16:27
4921 | 국민권익위원회 | epeople.go.kr | 회원탈퇴 | 탈퇴 | 2022-12-04 10:52 | 2022-12-15 19:16
20221 | (주)알라딘커뮤니케이션 | aladin.co.kr | 회원탈퇴 | 탈퇴 | 2022-12-13 15:08 | 2022-12-21 19:30
746 | 콘텐츠웨이브(주) | wavve.com | 회원탈퇴 | 탈퇴 | 2022-12-01 11:42 | 2022-12-19 19:44
46666 | AKS&D(주)AK인터넷쇼핑몰 | akmall.com | 회원탈퇴 | 탈퇴 | 2022-12-29 20:20 | 2023-01-30 14:34
45602 | (주)이랜드이츠 | elandeat.com | 회원탈퇴 | 탈퇴 | 2022-12-29 09:59 | 2023-03-02 22:47

Duplicate rows

Most frequently occurring

(index) | 사업체명 | 도메인 주소 | 민원 구분 | 민원 구분-상세 | 신청일자 | 완료일자 | # duplicates
7 | (주)커리어케어 | careercare.co.kr | 회원탈퇴 | 탈퇴 | 2022-12-31 01:35 | 2023-01-13 16:23 | 3
8 | (주)티머니 | tmoney.co.kr | 회원탈퇴 | 탈퇴 | 2022-12-13 19:47 | 2022-12-23 19:10 | 3
24 | 한국전력공사 | recruit.kepco.co.kr | 회원탈퇴 | 탈퇴 | 2022-12-13 19:47 | 2023-03-20 15:53 | 3
0 | (주)11번가 | 11st.co.kr | 회원탈퇴 | 탈퇴 | 2022-12-30 14:45 | 2023-03-08 13:44 | 2
1 | (주)교보문고 | mobile.kyobobook.co.kr | 회원탈퇴 | 탈퇴 | 2022-12-29 15:11 | 2023-02-10 15:05 | 2
2 | (주)데일리펀딩 | daily-funding.com | 회원탈퇴 | 탈퇴 | 2022-12-10 14:53 | 2022-12-14 19:19 | 2
3 | (주)번개장터 | bunjang.co.kr | 회원탈퇴 | 탈퇴 | 2022-12-16 17:04 | 2023-01-16 19:42 | 2
4 | (주)에이비씨마트코리아 | abcmart.co.kr | 회원탈퇴 | 탈퇴 | 2022-12-16 11:56 | 2022-12-22 19:08 | 2
5 | (주)위메프 | wemakeprice.com | 회원탈퇴 | 탈퇴 | 2022-12-24 23:26 | 2023-01-20 12:13 | 2
6 | (주)지니뮤직 | genie.co.kr | 회원탈퇴 | 탈퇴 | 2022-12-09 19:23 | 2022-12-23 19:19 | 2
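
A table like this can be produced by grouping identical rows and keeping the groups that occur more than once. The sketch below shows the idea with the standard library on a tiny invented input built from rows that appear in this report. It assumes the "# duplicates" column is the number of occurrences of each row, which the report does not spell out.

```python
from collections import Counter

# Rows as tuples: (사업체명, 도메인 주소, 민원 구분, 민원 구분-상세, 신청일자, 완료일자).
tmoney = ("(주)티머니", "tmoney.co.kr", "회원탈퇴", "탈퇴", "2022-12-13 19:47", "2022-12-23 19:10")
genie = ("(주)지니뮤직", "genie.co.kr", "회원탈퇴", "탈퇴", "2022-12-09 19:23", "2022-12-23 19:19")
hansung = ("한성대학교", "hansung.ac.kr", "회원탈퇴", "탈퇴", "2022-12-23 14:25", "2022-12-23 14:26")

# Invented ordering and repetition, for illustration only.
rows = [tmoney, genie, tmoney, hansung, genie, tmoney]

# Count identical rows, then keep only those seen more than once.
occurrences = Counter(rows)
duplicates = [(row, n) for row, n in occurrences.most_common() if n > 1]

for row, n in duplicates:
    print(row[0], row[1], n)
```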