Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 2023
Duplicate rows (%): 20.2%
Total size in memory: 410.2 KiB
Average record size in memory: 42.0 B
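
The derived figures above follow from the raw counts. As a minimal sketch (plain Python, no profiling library; the function and variable names are illustrative, and 1 KiB is assumed to be 1024 bytes), the duplicate percentage and the average record size can be recomputed like this:

```python
# Recompute the derived dataset statistics from the raw figures in the report.
n_rows = 10_000        # Number of observations
n_duplicates = 2_023   # Duplicate rows
total_kib = 410.2      # Total size in memory, in KiB


def duplicate_pct(duplicates: int, rows: int) -> float:
    """Share of duplicate rows, as a percentage of all rows."""
    return 100.0 * duplicates / rows


def avg_record_bytes(size_kib: float, rows: int) -> float:
    """Average bytes per row, assuming 1 KiB = 1024 bytes."""
    return size_kib * 1024 / rows


print(f"Duplicate rows (%): {duplicate_pct(n_duplicates, n_rows):.1f}%")
print(f"Average record size: {avg_record_bytes(total_kib, n_rows):.1f} B")
```

Both values round to the figures the report shows (20.2% and 42.0 B).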

Variable types

Text: 1
Categorical: 2
Numeric: 1

Dataset

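Description: Data on civil complaints processed through Daegu Metropolitan City's Dudeuriso (integrated civil complaint, proposal, and call management system), giving processing statistics by complaint type for each year and month of receipt.
URL: https://www.data.go.kr/data/15092070/fileData.do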

Alerts

Dataset has 2023 (20.2%) duplicate rows (Duplicates)
취하 is highly imbalanced (96.5%) (Imbalance)
해결 has 3298 (33.0%) zeros (Zeros)
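
The imbalance figure can be checked by hand. The sketch below assumes the score is one minus the Shannon entropy of the class distribution divided by its maximum possible value (log2 of the number of classes); that definition is an assumption made here, not something stated in this report. The function name is illustrative, and the counts are the ones listed for 취하 under Common Values.

```python
import math


def imbalance_score(counts):
    """1 - normalized Shannon entropy: 0 means balanced, 1 means a single class dominates.

    Assumed definition, for illustration only.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # a single class is treated as balanced here (assumption)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Counts for the values 0, 1, 2, 3 of the 취하 column.
withdrawn_counts = [9919, 76, 4, 1]
print(f"{imbalance_score(withdrawn_counts):.1%}")
```

Under this assumed definition the result rounds to 96.5%, which agrees with the alert.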

Reproduction

Analysis started: 2023-12-12 17:17:53.880075
Analysis finished: 2023-12-12 17:17:54.479048
Duration: 0.6 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

접수월
Text

Distinct: 90
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
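
As a small illustration of those character properties, the sketch below uses Python's standard `unicodedata` module to count the general categories in one value of this column. The helper name `category_counts` is made up for the example. Script and block lookups are not part of the standard library, so only categories are shown.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count Unicode general categories (e.g. 'Nd', 'Lo', 'Zs') in a string."""
    return Counter(unicodedata.category(ch) for ch in text)


sample = "2021년 04월"  # one value from the 접수월 column
counts = category_counts(sample)

# 'Nd' = Decimal Number, 'Lo' = Other Letter, 'Zs' = Space Separator
for code, n in sorted(counts.items()):
    print(code, n)
```

Each 9-character value contains six decimal digits, two Hangul letters, and one space. Across 10000 rows that gives the 60000, 20000, and 10000 totals listed under "Most occurring categories".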

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021년 04월
2nd row: 2019년 02월
3rd row: 2020년 09월
4th row: 2022년 08월
5th row: 2022년 02월
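
Every value in this column has the fixed shape "YYYY년 MM월" (length 9 throughout), so it can be turned into a numeric year and month before further analysis. The following is a minimal sketch using only the standard library; the pattern and the function name `parse_year_month` are assumptions made for illustration.

```python
import re

# Four-digit year, the Hangul suffix 년, a space, two-digit month, the suffix 월.
YEAR_MONTH = re.compile(r"^(\d{4})년 (\d{2})월$")


def parse_year_month(value: str) -> tuple[int, int]:
    """Parse a string such as '2021년 04월' into (year, month)."""
    match = YEAR_MONTH.match(value)
    if match is None:
        raise ValueError(f"unexpected format: {value!r}")
    year, month = int(match.group(1)), int(match.group(2))
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range: {value!r}")
    return year, month


for raw in ["2021년 04월", "2019년 02월", "2020년 09월"]:
    print(raw, "->", parse_year_month(raw))
```

Parsing to integers also makes it possible to sort the rows chronologically, which string sorting happens to do here only because the format is zero-padded.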
Value  Count  Frequency (%)
2022년  1482  7.4%
2021년  1421  7.1%
2020년  1379  6.9%
2019년  1343  6.7%
2018년  1317  6.6%
2017년  1256  6.3%
2016년  1189  5.9%
03월  951  4.8%
12월  949  4.7%
04월  875  4.4%
Other values (11)  7838  39.2%

Most occurring characters

Value  Count  Frequency (%)
0  19596  21.8%
2  18066  20.1%
1  10801  12.0%
년  10000  11.1%
월  10000  11.1%
(space)  10000  11.1%
9  2130  2.4%
8  2112  2.3%
7  2055  2.3%
6  1989  2.2%
Other values (3)  3251  3.6%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  60000  66.7%
Other Letter  20000  22.2%
Space Separator  10000  11.1%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0  19596  32.7%
2  18066  30.1%
1  10801  18.0%
9  2130  3.5%
8  2112  3.5%
7  2055  3.4%
6  1989  3.3%
3  1444  2.4%
5  932  1.6%
4  875  1.5%

Other Letter
Value  Count  Frequency (%)
년  10000  50.0%
월  10000  50.0%

Space Separator
Value  Count  Frequency (%)
(space)  10000  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Common  70000  77.8%
Hangul  20000  22.2%

Most frequent character per script

Common
Value  Count  Frequency (%)
0  19596  28.0%
2  18066  25.8%
1  10801  15.4%
(space)  10000  14.3%
9  2130  3.0%
8  2112  3.0%
7  2055  2.9%
6  1989  2.8%
3  1444  2.1%
5  932  1.3%

Hangul
Value  Count  Frequency (%)
년  10000  50.0%
월  10000  50.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  70000  77.8%
Hangul  20000  22.2%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0  19596  28.0%
2  18066  25.8%
1  10801  15.4%
(space)  10000  14.3%
9  2130  3.0%
8  2112  3.0%
7  2055  2.9%
6  1989  2.8%
3  1444  2.1%
5  932  1.3%

Hangul
Value  Count  Frequency (%)
년  10000  50.0%
월  10000  50.0%

민원유형
Categorical

Distinct: 24
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value  Count
제출  1059
신고  933
증명  879
교부  876
등록  815
Other values (19)  5438

Length

Max length: 4
Median length: 2
Mean length: 2.0032
Min length: 2

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 보고
2nd row: 확인
3rd row: 시험
4th row: 시험
5th row: 시험

Common Values

Value  Count  Frequency (%)
제출  1059  10.6%
신고  933  9.3%
증명  879  8.8%
교부  876  8.8%
등록  815  8.2%
검사  812  8.1%
허가  807  8.1%
보고  796  8.0%
신청  763  7.6%
시험  532  5.3%
Other values (14)  1728  17.3%

Length

[Figure: Histogram of lengths of the category]

해결
Real number (ℝ)

ZEROS 

Distinct: 167
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.2436
Minimum: 0
Maximum: 404
Zeros: 3298
Zeros (%): 33.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 6
95-th percentile: 61
Maximum: 404
Range: 404
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 24.360447
Coefficient of variation (CV): 2.1666057
Kurtosis: 24.691747
Mean: 11.2436
Median Absolute Deviation (MAD): 1
Skewness: 3.8699981
Sum: 112436
Variance: 593.4314
Monotonicity: Not monotonic
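
Several of these statistics are simple functions of the others, which gives a quick consistency check on the report. The sketch below recomputes them from the reported mean, standard deviation, quartiles, and extremes. The variable names are illustrative, and the inputs are the rounded values printed above, so agreement is only expected up to rounding.

```python
# Reported inputs for the 해결 column (rounded as printed in the report).
n = 10_000
mean = 11.2436
std = 24.360447
q1, q3 = 0, 6
minimum, maximum = 0, 404

# Derived quantities.
cv = std / mean                    # coefficient of variation
variance = std ** 2                # variance is the squared standard deviation
total = mean * n                   # sum of all values
iqr = q3 - q1                      # interquartile range
value_range = maximum - minimum    # range

print(f"CV       = {cv:.7f}")
print(f"Variance = {variance:.4f}")
print(f"Sum      = {total:.0f}")
print(f"IQR      = {iqr}")
print(f"Range    = {value_range}")
```

A coefficient of variation above 2, together with a median of 1 against a mean of about 11.24, is consistent with the strong right skew the report gives (skewness 3.87).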
[Figure: Histogram with fixed size bins (bins=50)]

Value  Count  Frequency (%)
0  3298  33.0%
1  1911  19.1%
2  883  8.8%
3  555  5.5%
4  413  4.1%
5  315  3.1%
6  236  2.4%
7  168  1.7%
8  119  1.2%
9  87  0.9%
Other values (157)  2015  20.2%

Minimum 10 values

Value  Count  Frequency (%)
0  3298  33.0%
1  1911  19.1%
2  883  8.8%
3  555  5.5%
4  413  4.1%
5  315  3.1%
6  236  2.4%
7  168  1.7%
8  119  1.2%
9  87  0.9%

Maximum 10 values

Value  Count  Frequency (%)
404  1  < 0.1%
320  1  < 0.1%
274  1  < 0.1%
267  1  < 0.1%
258  1  < 0.1%
240  1  < 0.1%
237  1  < 0.1%
230  1  < 0.1%
229  1  < 0.1%
227  1  < 0.1%

취하
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value  Count
0  9919
1  76
2  4
3  1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0  9919  99.2%
1  76  0.8%
2  4  < 0.1%
3  1  < 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values 0, 1, 2, and 3 with the counts and frequencies listed under Common Values]

Interactions

[Figure: interactions plot]

Correlations

          접수월  민원유형  해결  취하
접수월  1.000  0.200  0.156  0.000
민원유형  0.200  1.000  0.566  0.138
해결  0.156  0.566  1.000  0.034
취하  0.000  0.138  0.034  1.000

          민원유형  취하
민원유형  1.000  0.066
취하  0.066  1.000

          해결  민원유형  취하
해결  1.000  0.253  0.022
민원유형  0.253  1.000  0.066
취하  0.022  0.066  1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows

Index  접수월  민원유형  해결  취하
15357  2021년 04월  보고  1  0
9059  2019년 02월  확인  1  0
13796  2020년 09월  시험  0  0
19619  2022년 08월  시험  0  0
17944  2022년 02월  시험  0  0
19091  2022년 06월  증명  1  0
20932  2023년 01월  등록  0  0
1787  2016년 08월  제출  31  0
3035  2017년 01월  신고  1  0
12590  2020년 05월  등록  4  0

Last rows

Index  접수월  민원유형  해결  취하
4246  2017년 06월  증명  9  0
15467  2021년 04월  검사  1  0
17714  2022년 01월  허가  0  0
19906  2022년 09월  신청  0  0
11668  2020년 01월  신청  0  0
10945  2019년 10월  지정  0  0
21088  2023년 02월  승인  0  0
834  2016년 03월  신고  37  0
17476  2021년 12월  제출  62  0
6197  2018년 02월  제출  2  0

Duplicate rows

Most frequently occurring

Index  접수월  민원유형  해결  취하  # duplicates
52  2016년 03월  보고  0  0  13
536  2018년 03월  등록  0  0  13
227  2016년 11월  허가  0  0  12
861  2019년 05월  허가  0  0  12
1286  2020년 11월  허가  0  0  12
1972  2023년 02월  허가  0  0  12
126  2016년 06월  허가  0  0  11
1200  2020년 08월  시험  0  0  11
1534  2021년 09월  허가  0  0  11
1763  2022년 06월  허가  0  0  11
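
A table of this kind can be built by grouping identical rows and counting them. The sketch below does that with the standard library only. The function name and the five input rows are made up for illustration and are not taken from the dataset, and the exact rule the profiling tool uses to count duplicates is not stated in this report.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return {row: occurrences} for every row that appears more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Illustrative rows only: (접수월, 민원유형, 해결, 취하).
rows = [
    ("2016년 03월", "보고", 0, 0),
    ("2016년 03월", "보고", 0, 0),
    ("2016년 03월", "보고", 0, 0),
    ("2018년 03월", "등록", 0, 0),
    ("2018년 03월", "등록", 2, 0),
]

# Most frequent groups first, as in the table above.
for row, n in sorted(duplicate_groups(rows).items(), key=lambda kv: -kv[1]):
    print(row, n)
```

Rows must be hashable for `Counter`, which is why tuples are used. In this dataset most of the frequent duplicates have 해결 = 0 and 취하 = 0, so they may be distinct records that simply share the same month and complaint type rather than accidental copies.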