Overview

Dataset statistics

Number of variables: 6
Number of observations: 10000
Missing cells: 891
Missing cells (%): 1.5%
Duplicate rows: 798
Duplicate rows (%): 8.0%
Total size in memory: 546.9 KiB
Average record size in memory: 56.0 B
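
The percentages and the average record size above follow from the raw counts. The short sketch below re-derives them; the variable names are illustrative and not part of the report, and it assumes the missing-cell percentage is taken over all cells (variables times observations).

```python
# Figures copied from the "Dataset statistics" block of the report.
n_variables = 6
n_observations = 10_000
missing_cells = 891
duplicate_rows = 798
total_size_kib = 546.9


def pct(part: float, whole: float) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Assumption: missing cells are measured against every cell in the table.
total_cells = n_variables * n_observations
missing_pct = pct(missing_cells, total_cells)
duplicate_pct = pct(duplicate_rows, n_observations)

# 1 KiB = 1024 bytes, spread evenly over all observations.
avg_record_bytes = round(total_size_kib * 1024 / n_observations, 1)

print(missing_pct, duplicate_pct, avg_record_bytes)
```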

Variable types

Text: 4
DateTime: 1
Categorical: 1

Dataset

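Description: Data on documents sent through 스마트이통장넷 (Smart Itongjang Net) of Gwangju-si, Gyeonggi-do. It provides the title (제목), person in charge (담당자), registration date (등록일), status code (상태코드), author (작성자), sending department (발송부서), and related fields.
URL: https://www.data.go.kr/data/15101136/fileData.do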

Alerts

Dataset has 798 (8.0%) duplicate rows [Duplicates]
상태코드 is highly imbalanced (95.9%) [Imbalance]
담당자 has 297 (3.0%) missing values [Missing]
작성자 has 297 (3.0%) missing values [Missing]
발송부서 has 297 (3.0%) missing values [Missing]
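
The report does not say how the 95.9% imbalance figure is computed. One formula that reproduces it from the 상태코드 counts listed later in the report (A: 9956, D: 44) is one minus the normalised Shannon entropy of the class frequencies. The sketch below treats that formula as an assumption; `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for class counts.

    H is the Shannon entropy (base 2) of the class frequencies and k is the
    number of classes. 0 means perfectly balanced; values near 1 mean one
    class dominates. Returning 0 for fewer than two classes is a choice made
    for this sketch.
    """
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Counts of the two status codes as listed in the report.
status_counts = {"A": 9956, "D": 44}
score = imbalance_score(list(status_counts.values()))
print(f"{score:.1%}")  # prints 95.9%
```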

Reproduction

Analysis started: 2023-12-12 14:01:16.780470
Analysis finished: 2023-12-12 14:01:17.999119
Duration: 1.22 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

제목
Text

Distinct: 3699
Distinct (%): 37.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 67
Median length: 50
Mean length: 31.3575
Min length: 10

Characters and Unicode

Total characters: 313575
Distinct characters: 680
Distinct categories: 15
Distinct scripts: 4
Distinct blocks: 10
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
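
As a rough illustration of these character properties, the sketch below counts Unicode general categories for one of the sample titles using Python's standard `unicodedata` module. The category-name mapping covers only the codes that occur in this example, and the punctuation code points (U+00B7, U+300C, U+300D) are an assumption about how the title is encoded.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes seen in this example,
# spelled the way the report labels them.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}


def category_counts(text):
    """Count characters per Unicode general category (after NFC normalisation)."""
    text = unicodedata.normalize("NFC", text)
    codes = Counter(unicodedata.category(ch) for ch in text)
    return {CATEGORY_NAMES.get(code, code): n for code, n in codes.items()}


# Second sample title; punctuation written as escapes (assumed code points).
title = "돌풍\u00b7강수 대비 \u300c낙뢰\u300d 예방활동 철저"
counts = category_counts(title)
for name, n in sorted(counts.items(), key=lambda item: -item[1]):
    print(name, n)
```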

Unique

Unique: 2668
Unique (%): 26.7%
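
"Distinct" and "Unique" measure different things. The sketch below assumes the usual reading: distinct is the number of different values, while unique is the number of values that occur exactly once. The helper name and the sample list are hypothetical.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    n_distinct = len(counts)
    n_unique = sum(1 for n in counts.values() if n == 1)
    return n_distinct, n_unique


# Hypothetical toy column: "a" and "c" repeat, "b" and "d" occur once.
values = ["a", "a", "b", "c", "c", "d"]
n_distinct, n_unique = distinct_and_unique(values)
print(n_distinct, n_unique)  # prints 4 2
```

Under this reading, the 3699 distinct titles include 2668 that appear only once; the remaining 1031 distinct titles account for all repeated rows.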

Sample

1st row: 강설에 따른 적설 취약구조물 안전관리 강화
2nd row: 돌풍·강수 대비 「낙뢰」 예방활동 철저
3rd row: 제17회 퇴촌 토마토축제 포스터 배부 및 홍보 협조 요청
4th row: 2022년 7월 시민정보화교육 홍보 협조
5th row: 제2차 경기도 재난기본소득 오프라인 접수 알림

Value | Count | Frequency (%)
요청 | 2884 | 4.1%
알림 | 2496 | 3.5%
 | 2236 | 3.1%
홍보 | 2068 | 2.9%
철저 | 1887 | 2.7%
협조 | 1703 | 2.4%
대비 | 1548 | 2.2%
따른 | 1283 | 1.8%
안전관리 | 1163 | 1.6%
안내 | 1128 | 1.6%
Other values (4961) | 52660 | 74.1%
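
The frequency table for 제목 counts words rather than whole titles. The report does not describe its tokenizer, so the sketch below assumes a plain whitespace split and applies it to the five sample titles only.

```python
from collections import Counter

# The five sample titles listed in the report.
titles = [
    "강설에 따른 적설 취약구조물 안전관리 강화",
    "돌풍·강수 대비 「낙뢰」 예방활동 철저",
    "제17회 퇴촌 토마토축제 포스터 배부 및 홍보 협조 요청",
    "2022년 7월 시민정보화교육 홍보 협조",
    "제2차 경기도 재난기본소득 오프라인 접수 알림",
]


def token_counts(texts):
    """Count whitespace-separated tokens across all texts (assumed tokenizer)."""
    counts = Counter()
    for text in texts:
        counts.update(text.split())
    return counts


counts = token_counts(titles)
total = sum(counts.values())
for token, n in counts.most_common(3):
    # Frequency is the token's share of all tokens, not of all titles.
    print(token, n, f"{100 * n / total:.1f}%")
```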

Most occurring characters

Value | Count | Frequency (%)
(space) | 61062 | 19.5%
2 | 9327 | 3.0%
 | 5605 | 1.8%
0 | 4894 | 1.6%
 | 4888 | 1.6%
 | 4803 | 1.5%
1 | 4354 | 1.4%
 | 4269 | 1.4%
 | 4250 | 1.4%
 | 4227 | 1.3%
Other values (670) | 205896 | 65.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 210666 | 67.2%
Space Separator | 61062 | 19.5%
Decimal Number | 25434 | 8.1%
Other Punctuation | 4440 | 1.4%
Close Punctuation | 3968 | 1.3%
Open Punctuation | 3963 | 1.3%
Math Symbol | 1871 | 0.6%
Uppercase Letter | 1422 | 0.5%
Initial Punctuation | 248 | 0.1%
Final Punctuation | 243 | 0.1%
Other values (5) | 258 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 5605 | 2.7%
 | 4888 | 2.3%
 | 4803 | 2.3%
 | 4269 | 2.0%
 | 4250 | 2.0%
 | 4227 | 2.0%
 | 3889 | 1.8%
 | 3822 | 1.8%
 | 3625 | 1.7%
 | 3538 | 1.7%
Other values (584) | 167750 | 79.6%

Uppercase Letter
Value | Count | Frequency (%)
C | 334 | 23.5%
T | 200 | 14.1%
V | 155 | 10.9%
A | 107 | 7.5%
L | 85 | 6.0%
P | 81 | 5.7%
D | 75 | 5.3%
E | 74 | 5.2%
G | 70 | 4.9%
I | 60 | 4.2%
Other values (13) | 181 | 12.7%

Lowercase Letter
Value | Count | Frequency (%)
e | 42 | 26.1%
b | 32 | 19.9%
a | 18 | 11.2%
o | 17 | 10.6%
q | 16 | 9.9%
t | 10 | 6.2%
f | 10 | 6.2%
u | 8 | 5.0%
i | 2 | 1.2%
s | 1 | 0.6%
Other values (5) | 5 | 3.1%

Other Punctuation
Value | Count | Frequency (%)
. | 2342 | 52.7%
· | 1177 | 26.5%
' | 508 | 11.4%
" | 94 | 2.1%
; | 84 | 1.9%
# | 84 | 1.9%
& | 84 | 1.9%
% | 43 | 1.0%
! | 17 | 0.4%
/ | 3 | 0.1%
Other values (2) | 4 | 0.1%

Decimal Number
Value | Count | Frequency (%)
2 | 9327 | 36.7%
0 | 4894 | 19.2%
1 | 4354 | 17.1%
3 | 1764 | 6.9%
8 | 1172 | 4.6%
9 | 1078 | 4.2%
7 | 936 | 3.7%
4 | 733 | 2.9%
6 | 613 | 2.4%
5 | 563 | 2.2%

Math Symbol
Value | Count | Frequency (%)
~ | 1166 | 62.3%
 | 354 | 18.9%
+ | 347 | 18.5%
> | 1 | 0.1%
 | 1 | 0.1%
< | 1 | 0.1%
 | 1 | 0.1%

Close Punctuation
Value | Count | Frequency (%)
) | 2253 | 56.8%
 | 1225 | 30.9%
 | 258 | 6.5%
] | 211 | 5.3%
 | 21 | 0.5%

Open Punctuation
Value | Count | Frequency (%)
( | 2252 | 56.8%
 | 1223 | 30.9%
 | 256 | 6.5%
[ | 211 | 5.3%
 | 21 | 0.5%

Initial Punctuation
Value | Count | Frequency (%)
 | 192 | 77.4%
 | 56 | 22.6%

Final Punctuation
Value | Count | Frequency (%)
 | 186 | 76.5%
 | 57 | 23.5%

Space Separator
Value | Count | Frequency (%)
(space) | 61062 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 91 | 100.0%

Modifier Symbol
Value | Count | Frequency (%)
` | 4 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 1 | 100.0%

Letter Number
Value | Count | Frequency (%)
 | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 210569 | 67.2%
Common | 101325 | 32.3%
Latin | 1584 | 0.5%
Han | 97 | < 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 5605 | 2.7%
 | 4888 | 2.3%
 | 4803 | 2.3%
 | 4269 | 2.0%
 | 4250 | 2.0%
 | 4227 | 2.0%
 | 3889 | 1.8%
 | 3822 | 1.8%
 | 3625 | 1.7%
 | 3538 | 1.7%
Other values (573) | 167653 | 79.6%

Common
Value | Count | Frequency (%)
(space) | 61062 | 60.3%
2 | 9327 | 9.2%
0 | 4894 | 4.8%
1 | 4354 | 4.3%
. | 2342 | 2.3%
) | 2253 | 2.2%
( | 2252 | 2.2%
3 | 1764 | 1.7%
 | 1225 | 1.2%
 | 1223 | 1.2%
Other values (37) | 10629 | 10.5%

Latin
Value | Count | Frequency (%)
C | 334 | 21.1%
T | 200 | 12.6%
V | 155 | 9.8%
A | 107 | 6.8%
L | 85 | 5.4%
P | 81 | 5.1%
D | 75 | 4.7%
E | 74 | 4.7%
G | 70 | 4.4%
I | 60 | 3.8%
Other values (29) | 343 | 21.7%

Han
Value | Count | Frequency (%)
 | 25 | 25.8%
 | 25 | 25.8%
 | 15 | 15.5%
 | 10 | 10.3%
 | 10 | 10.3%
 | 7 | 7.2%
 | 1 | 1.0%
 | 1 | 1.0%
 | 1 | 1.0%
 | 1 | 1.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 210546 | 67.1%
ASCII | 97879 | 31.2%
None | 4181 | 1.3%
Punctuation | 492 | 0.2%
Arrows | 354 | 0.1%
CJK | 97 | < 0.1%
Compat Jamo | 23 | < 0.1%
Number Forms | 1 | < 0.1%
Geometric Shapes | 1 | < 0.1%
Math Operators | 1 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 61062 | 62.4%
2 | 9327 | 9.5%
0 | 4894 | 5.0%
1 | 4354 | 4.4%
. | 2342 | 2.4%
) | 2253 | 2.3%
( | 2252 | 2.3%
3 | 1764 | 1.8%
8 | 1172 | 1.2%
~ | 1166 | 1.2%
Other values (60) | 7293 | 7.5%

Hangul
Value | Count | Frequency (%)
 | 5605 | 2.7%
 | 4888 | 2.3%
 | 4803 | 2.3%
 | 4269 | 2.0%
 | 4250 | 2.0%
 | 4227 | 2.0%
 | 3889 | 1.8%
 | 3822 | 1.8%
 | 3625 | 1.7%
 | 3538 | 1.7%
Other values (572) | 167630 | 79.6%

None
Value | Count | Frequency (%)
 | 1225 | 29.3%
 | 1223 | 29.3%
· | 1177 | 28.2%
 | 258 | 6.2%
 | 256 | 6.1%
 | 21 | 0.5%
 | 21 | 0.5%

Arrows
Value | Count | Frequency (%)
 | 354 | 100.0%

Punctuation
Value | Count | Frequency (%)
 | 192 | 39.0%
 | 186 | 37.8%
 | 57 | 11.6%
 | 56 | 11.4%
 | 1 | 0.2%

CJK
Value | Count | Frequency (%)
 | 25 | 25.8%
 | 25 | 25.8%
 | 15 | 15.5%
 | 10 | 10.3%
 | 10 | 10.3%
 | 7 | 7.2%
 | 1 | 1.0%
 | 1 | 1.0%
 | 1 | 1.0%
 | 1 | 1.0%

Compat Jamo
Value | Count | Frequency (%)
 | 23 | 100.0%

Number Forms
Value | Count | Frequency (%)
 | 1 | 100.0%

Geometric Shapes
Value | Count | Frequency (%)
 | 1 | 100.0%

Math Operators
Value | Count | Frequency (%)
 | 1 | 100.0%

담당자
Text

MISSING 

Distinct: 359
Distinct (%): 3.7%
Missing: 297
Missing (%): 3.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 3
Mean length: 2.994847
Min length: 2

Characters and Unicode

Total characters: 29059
Distinct characters: 154
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 62
Unique (%): 0.6%

Sample

1st row: 신은빈
2nd row: 양희구
3rd row: 류기영
4th row: 백광훈
5th row: 윤태제

Value | Count | Frequency (%)
신은빈 | 1591 | 16.4%
이미소 | 423 | 4.4%
서영대 | 403 | 4.2%
박정아 | 371 | 3.8%
조지연 | 265 | 2.7%
우상권 | 261 | 2.7%
이호운 | 251 | 2.6%
윤태제 | 201 | 2.1%
배은지 | 167 | 1.7%
강현철 | 159 | 1.6%
Other values (349) | 5611 | 57.8%

Most occurring characters

Value | Count | Frequency (%)
 | 2126 | 7.3%
 | 1681 | 5.8%
 | 1663 | 5.7%
 | 1611 | 5.5%
 | 1332 | 4.6%
 | 1082 | 3.7%
 | 1028 | 3.5%
 | 950 | 3.3%
 | 781 | 2.7%
 | 746 | 2.6%
Other values (144) | 16059 | 55.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 29059 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 2126 | 7.3%
 | 1681 | 5.8%
 | 1663 | 5.7%
 | 1611 | 5.5%
 | 1332 | 4.6%
 | 1082 | 3.7%
 | 1028 | 3.5%
 | 950 | 3.3%
 | 781 | 2.7%
 | 746 | 2.6%
Other values (144) | 16059 | 55.3%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 29059 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 2126 | 7.3%
 | 1681 | 5.8%
 | 1663 | 5.7%
 | 1611 | 5.5%
 | 1332 | 4.6%
 | 1082 | 3.7%
 | 1028 | 3.5%
 | 950 | 3.3%
 | 781 | 2.7%
 | 746 | 2.6%
Other values (144) | 16059 | 55.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 29059 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 2126 | 7.3%
 | 1681 | 5.8%
 | 1663 | 5.7%
 | 1611 | 5.5%
 | 1332 | 4.6%
 | 1082 | 3.7%
 | 1028 | 3.5%
 | 950 | 3.3%
 | 781 | 2.7%
 | 746 | 2.6%
Other values (144) | 16059 | 55.3%

등록일
DateTime

Distinct: 1443
Distinct (%): 14.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2017-03-17 00:00:00
Maximum: 2023-07-21 00:00:00
Histogram with fixed size bins (bins=50)
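
A fixed-size-bin histogram over dates can be sketched as follows, using the minimum and maximum reported above and the registration dates from five of the sample rows. The function name is hypothetical, and the report's own binning code is not shown, so this is only one plausible way to get 50 equal-width bins.

```python
from datetime import datetime


def fixed_width_histogram(values, start, end, bins=50):
    """Count datetimes in `bins` equal-width intervals covering [start, end].

    Assumes every value lies within [start, end].
    """
    width = (end - start) / bins  # a timedelta
    counts = [0] * bins
    for value in values:
        index = int((value - start) / width)
        # The maximum itself would land on index == bins; keep it in the last bin.
        counts[min(index, bins - 1)] += 1
    return counts, width


start = datetime(2017, 3, 17)
end = datetime(2023, 7, 21)

# Registration dates taken from five of the sample rows in the report.
sample_dates = [
    datetime(2021, 3, 2),
    datetime(2021, 5, 4),
    datetime(2019, 5, 20),
    datetime(2022, 6, 10),
    datetime(2021, 2, 25),
]

counts, width = fixed_width_histogram(sample_dates, start, end)
print(width)  # each bin spans a little over 46 days
```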

상태코드
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: A
2nd row: A
3rd row: A
4th row: A
5th row: A

Common Values

Value | Count | Frequency (%)
A | 9956 | 99.6%
D | 44 | 0.4%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot; A 9956 (99.6%), D 44 (0.4%)]

작성자
Text

MISSING 

Distinct: 359
Distinct (%): 3.7%
Missing: 297
Missing (%): 3.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 3
Mean length: 2.994847
Min length: 2

Characters and Unicode

Total characters: 29059
Distinct characters: 154
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 62
Unique (%): 0.6%

Sample

1st row: 신은빈
2nd row: 양희구
3rd row: 류기영
4th row: 백광훈
5th row: 윤태제

Value | Count | Frequency (%)
신은빈 | 1591 | 16.4%
이미소 | 422 | 4.3%
서영대 | 403 | 4.2%
박정아 | 371 | 3.8%
조지연 | 265 | 2.7%
우상권 | 261 | 2.7%
이호운 | 251 | 2.6%
윤태제 | 201 | 2.1%
배은지 | 167 | 1.7%
강현철 | 159 | 1.6%
Other values (349) | 5612 | 57.8%

Most occurring characters

Value | Count | Frequency (%)
 | 2126 | 7.3%
 | 1680 | 5.8%
 | 1663 | 5.7%
 | 1611 | 5.5%
 | 1332 | 4.6%
 | 1083 | 3.7%
 | 1028 | 3.5%
 | 950 | 3.3%
 | 780 | 2.7%
 | 746 | 2.6%
Other values (144) | 16060 | 55.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 29059 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 2126 | 7.3%
 | 1680 | 5.8%
 | 1663 | 5.7%
 | 1611 | 5.5%
 | 1332 | 4.6%
 | 1083 | 3.7%
 | 1028 | 3.5%
 | 950 | 3.3%
 | 780 | 2.7%
 | 746 | 2.6%
Other values (144) | 16060 | 55.3%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 29059 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 2126 | 7.3%
 | 1680 | 5.8%
 | 1663 | 5.7%
 | 1611 | 5.5%
 | 1332 | 4.6%
 | 1083 | 3.7%
 | 1028 | 3.5%
 | 950 | 3.3%
 | 780 | 2.7%
 | 746 | 2.6%
Other values (144) | 16060 | 55.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 29059 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 2126 | 7.3%
 | 1680 | 5.8%
 | 1663 | 5.7%
 | 1611 | 5.5%
 | 1332 | 4.6%
 | 1083 | 3.7%
 | 1028 | 3.5%
 | 950 | 3.3%
 | 780 | 2.7%
 | 746 | 2.6%
Other values (144) | 16060 | 55.3%

발송부서
Text

MISSING 

Distinct: 76
Distinct (%): 0.8%
Missing: 297
Missing (%): 3.0%
Memory size: 156.2 KiB

Length

Max length: 7
Median length: 5
Mean length: 4.4368752
Min length: 3

Characters and Unicode

Total characters: 43051
Distinct characters: 121
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 안전총괄과
2nd row: 퇴촌면
3rd row: 정보통신담당관
4th row: 남종면
5th row: 안전총괄과

Value | Count | Frequency (%)
안전총괄과 | 2690 | 27.7%
오포읍 | 819 | 8.4%
곤지암읍 | 769 | 7.9%
지역안전과 | 494 | 5.1%
초월읍 | 453 | 4.7%
도척면 | 389 | 4.0%
교통정책과 | 261 | 2.7%
기업지원과 | 218 | 2.2%
정보통신과 | 205 | 2.1%
남한산성면 | 191 | 2.0%
Other values (66) | 3214 | 33.1%

Most occurring characters

Value | Count | Frequency (%)
 | 5718 | 13.3%
 | 3301 | 7.7%
 | 3208 | 7.5%
 | 2718 | 6.3%
 | 2690 | 6.2%
 | 2041 | 4.7%
 | 1653 | 3.8%
 | 1260 | 2.9%
 | 959 | 2.2%
 | 869 | 2.0%
Other values (111) | 18634 | 43.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 42803 | 99.4%
Decimal Number | 248 | 0.6%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 5718 | 13.4%
 | 3301 | 7.7%
 | 3208 | 7.5%
 | 2718 | 6.4%
 | 2690 | 6.3%
 | 2041 | 4.8%
 | 1653 | 3.9%
 | 1260 | 2.9%
 | 959 | 2.2%
 | 869 | 2.0%
Other values (109) | 18386 | 43.0%

Decimal Number
Value | Count | Frequency (%)
2 | 146 | 58.9%
1 | 102 | 41.1%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 42803 | 99.4%
Common | 248 | 0.6%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 5718 | 13.4%
 | 3301 | 7.7%
 | 3208 | 7.5%
 | 2718 | 6.4%
 | 2690 | 6.3%
 | 2041 | 4.8%
 | 1653 | 3.9%
 | 1260 | 2.9%
 | 959 | 2.2%
 | 869 | 2.0%
Other values (109) | 18386 | 43.0%

Common
Value | Count | Frequency (%)
2 | 146 | 58.9%
1 | 102 | 41.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 42803 | 99.4%
ASCII | 248 | 0.6%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 5718 | 13.4%
 | 3301 | 7.7%
 | 3208 | 7.5%
 | 2718 | 6.4%
 | 2690 | 6.3%
 | 2041 | 4.8%
 | 1653 | 3.9%
 | 1260 | 2.9%
 | 959 | 2.2%
 | 869 | 2.0%
Other values (109) | 18386 | 43.0%

ASCII
Value | Count | Frequency (%)
2 | 146 | 58.9%
1 | 102 | 41.1%

Correlations

 | 상태코드 | 발송부서
상태코드 | 1.000 | 0.455
발송부서 | 0.455 | 1.000
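
The report does not name the association measure behind the 0.455 value. For two categorical variables a common choice is Cramér's V, so the sketch below computes the plain (uncorrected) version from two lists of labels. Treat both the choice of measure and the helper name as assumptions; a bias-corrected variant would give somewhat different numbers.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long lists of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return float("nan")  # undefined when either variable is constant
    return math.sqrt(chi2 / (n * k))


# Hypothetical toy data: status fully determined by department.
status = ["A"] * 4 + ["D"] * 4
department = ["x"] * 4 + ["y"] * 4
print(cramers_v(status, department))
```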

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
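
Nullity correlation can be sketched as the Pearson correlation between presence indicators (1 if a value is present, 0 if missing). The example below uses the 담당자, 작성자 and 발송부서 values from five sample rows of the report, with `None` standing in for `<NA>`. The helper names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equally long numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when a column is never or always missing
    return cov / math.sqrt(var_x * var_y)


def nullity_correlation(rows, col_a, col_b):
    """Correlate the presence (1) or absence (0) of two columns across rows."""
    present_a = [0 if row[col_a] is None else 1 for row in rows]
    present_b = [0 if row[col_b] is None else 1 for row in rows]
    return pearson(present_a, present_b)


# Values from five sample rows of the report; None stands in for <NA>.
rows = [
    {"담당자": None, "작성자": None, "발송부서": None},
    {"담당자": "신은빈", "작성자": "신은빈", "발송부서": "안전총괄과"},
    {"담당자": "양희구", "작성자": "양희구", "발송부서": "퇴촌면"},
    {"담당자": "류기영", "작성자": "류기영", "발송부서": "정보통신담당관"},
    {"담당자": "백광훈", "작성자": "백광훈", "발송부서": "남종면"},
]

print(nullity_correlation(rows, "담당자", "발송부서"))
```

In these rows the three columns are missing together, which is consistent with each of them reporting exactly 297 missing values.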

Sample

 | 제목 | 담당자 | 등록일 | 상태코드 | 작성자 | 발송부서
6875 | 강설에 따른 적설 취약구조물 안전관리 강화 | <NA> | 2021-03-02 | A | <NA> | <NA>
5395 | 돌풍·강수 대비 「낙뢰」 예방활동 철저 | 신은빈 | 2021-05-04 | A | 신은빈 | 안전총괄과
1253 | 제17회 퇴촌 토마토축제 포스터 배부 및 홍보 협조 요청 | 양희구 | 2019-05-20 | A | 양희구 | 퇴촌면
4189 | 2022년 7월 시민정보화교육 홍보 협조 | 류기영 | 2022-06-10 | A | 류기영 | 정보통신담당관
6850 | 제2차 경기도 재난기본소득 오프라인 접수 알림 | 백광훈 | 2021-02-25 | A | 백광훈 | 남종면
931 | 2019년 농어업인(고교생)자녀 학자금 지원사업 알림 | <NA> | 2019-02-25 | A | <NA> | <NA>
3221 | 제10호 태풍 ‘하이선’ 북상에 따른 상황판단회의(9.4일자) 결과 통보 및 지시사항 시달 | 윤태제 | 2020-09-05 | A | 윤태제 | 안전총괄과
223 | 제7회 광주시 독서감상문대회 개최에 따른 홍보 협조 요청 | 권수란 | 2017-08-01 | A | 권수란 | 시립중앙도서관
2746 | 제9호 태풍 ‘마이삭’ 북상 대비 행정안전부장관 특별 지시사항 통보 | 윤태제 | 2020-09-01 | A | 윤태제 | 안전총괄과
1389 | 설날 귀성객 축산농가 등 방문 및 모임 자제 등 요청 | 김기영 | 2019-02-25 | A | 김기영 | 농업정책과

 | 제목 | 담당자 | 등록일 | 상태코드 | 작성자 | 발송부서
2802 | 「2020년 경기도 에너지자립 선도사업」2차 지원공고 알림 | 이호운 | 2020-07-16 | A | 이호운 | 기업지원과
9803 | 7.13~14일 호우 대비 인명피해우려지역 등 재해취약지역 안전관리 철저 | <NA> | 2022-07-13 | A | <NA> | <NA>
7807 | 11.8~10일 강설·한파 대비 사전 안전관리 및 예방활동 강화 | 우상권 | 2021-11-08 | A | 우상권 | 안전총괄과
5111 | 2021년 검천 평생학습센터 성인 프로그램 관내 수강생 모집 홍보 요청 | 정아름 | 2021-08-31 | A | 정아름 | 교육청소년과
528 | 7월 법정리 이장회의 자료 송부 | 최진 | 2018-07-17 | A | 최진 | 오포읍
2979 | 2021년도 곤지암읍 적십자회비 모금 목표 금액 안내 | 이지원 | 2020-12-11 | A | 이지원 | 곤지암읍
3008 | 경기지역화폐(광주사랑카드) 소비지원금 확대지원 안내 | 박정아 | 2020-11-16 | A | 박정아 | 오포읍
6643 | 1.28.(목)~1.29.(금) 강풍·대설·한파 대비 대응·대처 철저 | 윤태제 | 2021-01-28 | A | 윤태제 | 안전총괄과
2533 | 불법 주정차 단속시스템(CCTV) 신규설치에 따른 행정예고 알림 | 황춘식 | 2020-07-21 | A | 황춘식 | 교통정책과
671 | 2018년「열린 학습공간(열공)」지원사업 재공고 안내 | 김지현 | 2018-03-01 | A | 김지현 | 평생교육과

Duplicate rows

Most frequently occurring

 | 제목 | 담당자 | 등록일 | 상태코드 | 작성자 | 발송부서 | # duplicates
40 | 11.22.~25일 강설·한파 대비 사전 안전관리 및 예방활동 강화 요청 | 서영대 | 2021-11-19 | A | 서영대 | 안전총괄과 | 14
115 | 2021년 가정용 저녹스보일러 설치지원 사업 홍보요청 | 주현영 | 2021-01-04 | A | 주현영 | 환경정책과 | 14
500 | 광주시 경안 외 7개소 처리구역 하수관로 정비사업 착공 알림 | 김민기 | 2021-09-02 | A | 김민기 | 하수과 | 14
642 | 오수관로 설치사업 추진 시 오수받이 일괄 시공 계획 주민홍보 협조 요청 | 이형일 | 2022-01-25 | A | 이형일 | 하수과 | 14
15 | (지시사항)도내 거주 외국인 방역수칙 준수 및 불법체류 통보의무 면제사항 등 홍보요청 | 이호운 | 2021-02-26 | A | 이호운 | 기업지원과 | 13
16 | 1.10.~12일 강설·한파 대비 사전 안전관리 및 예방활동 강화 요청 | 서영대 | 2022-01-10 | A | 서영대 | 안전총괄과 | 13
19 | 1.17~18일 강설대비 사전대비(제설·제빙) 철저 및 한파 대응 준비 요청 | 신은빈 | 2021-01-15 | A | 신은빈 | 안전총괄과 | 13
20 | 1.17~18일 대설 관련 국무총리 긴급 지시사항 통보 | 신은빈 | 2021-01-17 | A | 신은빈 | 안전총괄과 | 13
24 | 1.19일 대설·한파 대비 사전 안전관리 및 예방활동 강화 | 김승태 | 2022-01-19 | A | 김승태 | 안전총괄과 | 13
25 | 1.19일(수) 강설 대비 안전관리 철저 요청 | 서영대 | 2022-01-18 | A | 서영대 | 안전총괄과 | 13
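
A duplicate-row summary like the one above can be sketched with a counter over whole rows. The report does not state whether its total of 798 counts every row in a duplicated group or only the repeats after the first occurrence. This sketch assumes the latter, which is how pandas `DataFrame.duplicated()` behaves with its default `keep="first"`. The helper name and the toy rows are hypothetical.

```python
from collections import Counter


def duplicate_summary(rows):
    """Return (number of repeated rows, [(row, occurrences), ...]).

    Rows must be hashable (for example tuples). A row seen n > 1 times
    contributes n - 1 repeats; the first occurrence is not counted.
    """
    counts = Counter(rows)
    n_repeats = sum(n - 1 for n in counts.values() if n > 1)
    most_frequent = [(row, n) for row, n in counts.most_common() if n > 1]
    return n_repeats, most_frequent


# Hypothetical toy rows: (title, date, status).
rows = [
    ("notice A", "2021-01-04", "A"),
    ("notice A", "2021-01-04", "A"),
    ("notice A", "2021-01-04", "A"),
    ("notice B", "2021-02-26", "A"),
    ("notice C", "2022-01-10", "A"),
    ("notice C", "2022-01-10", "A"),
]

n_repeats, most_frequent = duplicate_summary(rows)
print(n_repeats)
for row, n in most_frequent:
    print(n, row)
```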