Overview

Dataset statistics

Number of variables: 5
Number of observations: 45
Missing cells: 42
Missing cells (%): 18.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.9 KiB
Average record size in memory: 43.9 B
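
The missing-cell percentage follows from the counts above: 5 variables times 45 observations gives 225 cells, of which 42 are empty. A minimal Python check of that arithmetic, with illustrative variable names that are not part of the report:

```python
# Figures copied from the dataset statistics above.
n_variables = 5
n_observations = 45
missing_cells = 42

total_cells = n_variables * n_observations        # every cell in the table
missing_pct = 100 * missing_cells / total_cells   # share of empty cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

According to the Alerts section, all 42 missing cells fall in the 비고 column.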

Variable types

Categorical: 2
Text: 2
Numeric: 1

Dataset

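Description: Fee information for the issuance of certificates and official documents by Seongdong-gu, Seoul. It covers the fee category (구분), item (종목), unit (단위), fee (수수료), and remarks (비고).
Author: 서울특별시 성동구 (Seongdong-gu, Seoul)
URL: https://www.data.go.kr/data/15084616/fileData.do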

Alerts

High correlation: 구분 is highly overall correlated with 단위
High correlation: 단위 is highly overall correlated with 구분
Missing: 비고 has 42 (93.3%) missing values
Unique: 종목 has unique values
Zeros: 수수료 has 3 (6.7%) zeros

Reproduction

Analysis started: 2023-12-12 03:09:49.315588
Analysis finished: 2023-12-12 03:09:49.798271
Duration: 0.48 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
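
A report of this kind can be regenerated with ydata-profiling. The sketch below is a minimal example under stated assumptions: the CSV file name is a placeholder, the CP949 encoding is a guess, and library defaults stand in for the settings in config.json, which are not reproduced here. Output may therefore differ from this report.

```python
from pathlib import Path


def build_report(csv_path: str, output_html: str = "report.html") -> Path:
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so this module loads even where the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the file is CP949-encoded; try "utf-8" if decoding fails.
    df = pd.read_csv(csv_path, encoding="cp949")

    # Library defaults are used here; the report's own config.json is unknown.
    profile = ProfileReport(df, title="Seongdong-gu certificate fees")
    profile.to_file(output_html)
    return Path(output_html)


# Hypothetical usage; the file name is a placeholder, not the real download name.
# build_report("seongdong_fees.csv")
```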

Variables

구분
Categorical

HIGH CORRELATION 

Distinct: 10
Distinct (%): 22.2%
Missing: 0
Missing (%): 0.0%
Memory size: 492.0 B

Length

Max length: 12
Median length: 10
Mean length: 7.9777778
Min length: 3
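
The maximum, minimum, and mean lengths can be recomputed from the value counts listed under Common Values for this variable. The sketch assumes that length means the number of Unicode code points, with each Hangul syllable stored as a single precomposed character:

```python
# Value counts copied from the Common Values table for 구분.
counts = {
    "무인민원발급창구": 11,
    "사실 및 실적": 7,
    "주택, 도시계획, 지적": 7,
    "주소, 신상, 직무": 5,
    "회계관계": 5,
    "재산 및 지방세": 3,
    "건축 및 주택": 3,
    "그 밖에 제증명": 2,
    "자동차": 1,
    "공부열람": 1,
}

n_rows = sum(counts.values())
total_chars = sum(len(value) * n for value, n in counts.items())

max_length = max(len(value) for value in counts)
min_length = min(len(value) for value in counts)
mean_length = total_chars / n_rows   # average over rows, not over distinct values

print(max_length, min_length, round(mean_length, 7))
```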

Unique

Unique: 2
Unique (%): 4.4%
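
The report separates "distinct" from "unique". The figures are consistent with distinct meaning the number of different values and unique meaning the number of values that occur exactly once. A small sketch of that reading, using the counts from the Common Values table for this variable:

```python
# Occurrences per value, copied from the Common Values table for 구분.
counts = {
    "무인민원발급창구": 11, "사실 및 실적": 7, "주택, 도시계획, 지적": 7,
    "주소, 신상, 직무": 5, "회계관계": 5, "재산 및 지방세": 3,
    "건축 및 주택": 3, "그 밖에 제증명": 2, "자동차": 1, "공부열람": 1,
}

n_rows = sum(counts.values())
distinct = len(counts)                                # different values
unique = sum(1 for n in counts.values() if n == 1)    # values seen exactly once

print(f"distinct: {distinct} ({100 * distinct / n_rows:.1f}%)")
print(f"unique:   {unique} ({100 * unique / n_rows:.1f}%)")
```

Here the two values that occur once are 자동차 and 공부열람.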

Sample

1st row: 재산 및 지방세
2nd row: 재산 및 지방세
3rd row: 재산 및 지방세
4th row: 건축 및 주택
5th row: 건축 및 주택

Common Values

Value | Count | Frequency (%)
무인민원발급창구 | 11 | 24.4%
사실 및 실적 | 7 | 15.6%
주택, 도시계획, 지적 | 7 | 15.6%
주소, 신상, 직무 | 5 | 11.1%
회계관계 | 5 | 11.1%
재산 및 지방세 | 3 | 6.7%
건축 및 주택 | 3 | 6.7%
그 밖에 제증명 | 2 | 4.4%
자동차 | 1 | 2.2%
공부열람 | 1 | 2.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
및 | 13 | 13.1%
무인민원발급창구 | 11 | 11.1%
주택 | 10 | 10.1%
사실 | 7 | 7.1%
실적 | 7 | 7.1%
도시계획 | 7 | 7.1%
지적 | 7 | 7.1%
회계관계 | 5 | 5.1%
직무 | 5 | 5.1%
신상 | 5 | 5.1%
Other values (9) | 22 | 22.2%

종목
Text

UNIQUE 

Distinct: 45
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 492.0 B

Length

Max length: 32
Median length: 18
Mean length: 12.555556
Min length: 4

Characters and Unicode

Total characters: 565
Distinct characters: 134
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
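
As an illustration of those character properties, Python's standard `unicodedata` module returns the general category of each character. The sketch below counts categories for one 종목 value that appears in this report's sample table. It is not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# A 종목 value taken from the sample table in this report.
text = "공장저당법 제7조 증명"

# Two-letter general category per character:
# "Lo" = Other Letter, "Zs" = Space Separator, "Nd" = Decimal Number.
categories = Counter(unicodedata.category(ch) for ch in text)

for code, n in categories.most_common():
    print(code, n)
```

The category names used in the tables that follow (Other Letter, Space Separator, Decimal Number, and so on) are the long forms of these two-letter codes.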

Unique

Unique: 45
Unique (%): 100.0%

Sample

1st row: 건축물 관리대장 무한 증명
2nd row: 공부의 등본, 초본, 하부원본
3rd row: 지방세 세목별 과세증명
4th row: 건축허가 또는 건축준공 검사증명
5th row: 시영주택 분양금 납부증명
Value | Count | Frequency (%)
증명 | 12 | 10.0%
또는 | 3 | 2.5%
졸업증명서 | 2 | 1.7%
성적증명서 | 2 | 1.7%
등본 | 2 | 1.7%
초본 | 2 | 1.7%
(token not preserved) | 2 | 1.7%
· | 2 | 1.7%
건축물 | 1 | 0.8%
등록증명 | 1 | 0.8%
Other values (91) | 91 | 75.8%

Most occurring characters

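Value | Count | Frequency (%)
(space) | 80 | 14.2%
(character not preserved) | 32 | 5.7%
(character not preserved) | 32 | 5.7%
(character not preserved) | 13 | 2.3%
(character not preserved) | 13 | 2.3%
(character not preserved) | 12 | 2.1%
(character not preserved) | 10 | 1.8%
(character not preserved) | 9 | 1.6%
) | 9 | 1.6%
(character not preserved) | 9 | 1.6%
Other values (124) | 346 | 61.2%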

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 455 | 80.5%
Space Separator | 80 | 14.2%
Other Punctuation | 11 | 1.9%
Close Punctuation | 9 | 1.6%
Open Punctuation | 9 | 1.6%
Decimal Number | 1 | 0.2%

Most frequent character per category

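Other Letter
Value | Count | Frequency (%)
(character not preserved) | 32 | 7.0%
(character not preserved) | 32 | 7.0%
(character not preserved) | 13 | 2.9%
(character not preserved) | 13 | 2.9%
(character not preserved) | 12 | 2.6%
(character not preserved) | 10 | 2.2%
(character not preserved) | 9 | 2.0%
(character not preserved) | 9 | 2.0%
(character not preserved) | 9 | 2.0%
(character not preserved) | 9 | 2.0%
Other values (118) | 307 | 67.5%
Other Punctuation
Value | Count | Frequency (%)
, | 6 | 54.5%
· | 5 | 45.5%
Space Separator
Value | Count | Frequency (%)
(space) | 80 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 9 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 9 | 100.0%
Decimal Number
Value | Count | Frequency (%)
7 | 1 | 100.0%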

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 455 | 80.5%
Common | 110 | 19.5%

Most frequent character per script

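Hangul
Value | Count | Frequency (%)
(character not preserved) | 32 | 7.0%
(character not preserved) | 32 | 7.0%
(character not preserved) | 13 | 2.9%
(character not preserved) | 13 | 2.9%
(character not preserved) | 12 | 2.6%
(character not preserved) | 10 | 2.2%
(character not preserved) | 9 | 2.0%
(character not preserved) | 9 | 2.0%
(character not preserved) | 9 | 2.0%
(character not preserved) | 9 | 2.0%
Other values (118) | 307 | 67.5%
Common
Value | Count | Frequency (%)
(space) | 80 | 72.7%
) | 9 | 8.2%
( | 9 | 8.2%
, | 6 | 5.5%
· | 5 | 4.5%
7 | 1 | 0.9%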

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 455 | 80.5%
ASCII | 105 | 18.6%
None | 5 | 0.9%

Most frequent character per block

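ASCII
Value | Count | Frequency (%)
(space) | 80 | 76.2%
) | 9 | 8.6%
( | 9 | 8.6%
, | 6 | 5.7%
7 | 1 | 1.0%
Hangul
Value | Count | Frequency (%)
(character not preserved) | 32 | 7.0%
(character not preserved) | 32 | 7.0%
(character not preserved) | 13 | 2.9%
(character not preserved) | 13 | 2.9%
(character not preserved) | 12 | 2.6%
(character not preserved) | 10 | 2.2%
(character not preserved) | 9 | 2.0%
(character not preserved) | 9 | 2.0%
(character not preserved) | 9 | 2.0%
(character not preserved) | 9 | 2.0%
Other values (118) | 307 | 67.5%
None
Value | Count | Frequency (%)
· | 5 | 100.0%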

단위
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 13.3%
Missing: 0
Missing (%): 0.0%
Memory size: 492.0 B

Length

Max length: 2
Median length: 1
Mean length: 1.2
Min length: 1

Unique

Unique: 2
Unique (%): 4.4%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

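Value | Count | Frequency (%)
(1-character value, not preserved) | 24 | 53.3%
(1-character value, not preserved) | 11 | 24.4%
필지 | 6 | 13.3%
주택 | 2 | 4.4%
시간 | 1 | 2.2%
(1-character value, not preserved) | 1 | 2.2%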

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
(1-character value, not preserved) | 24 | 53.3%
(1-character value, not preserved) | 11 | 24.4%
필지 | 6 | 13.3%
주택 | 2 | 4.4%
시간 | 1 | 2.2%
(1-character value, not preserved) | 1 | 2.2%

수수료
Real number (ℝ)

ZEROS 

Distinct: 13
Distinct (%): 28.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 700
Minimum: 0
Maximum: 5500
Zeros: 3
Zeros (%): 6.7%
Negative: 0
Negative (%): 0.0%
Memory size: 537.0 B

Quantile statistics

Minimum: 0
5-th percentile: 20
Q1: 350
Median: 350
Q3: 800
95-th percentile: 1400
Maximum: 5500
Range: 5500
Interquartile range (IQR): 450
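
These quantiles can be recomputed from the 수수료 value counts listed in the frequency tables of this report. The sketch assumes linear interpolation between neighbouring order statistics (the default in NumPy and pandas), which reproduces the figures above. The helper `percentile` is illustrative, not part of any library.

```python
# Value counts copied from the 수수료 frequency tables in this report.
fee_counts = {0: 3, 100: 1, 300: 2, 350: 17, 400: 2, 450: 1, 500: 1,
              600: 3, 800: 5, 1000: 7, 1500: 1, 3300: 1, 5500: 1}

# Expand the counts into the 45 sorted observations.
fees = sorted(value for value, n in fee_counts.items() for _ in range(n))


def percentile(sorted_values, q):
    """Quantile q in [0, 1] using linear interpolation between neighbours."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = pos - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


quantiles = {q: round(percentile(fees, q), 6) for q in (0.05, 0.25, 0.5, 0.75, 0.95)}
iqr = quantiles[0.75] - quantiles[0.25]   # Q3 - Q1
value_range = fees[-1] - fees[0]          # maximum - minimum

print(quantiles)
print("IQR:", iqr, "Range:", value_range)
```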

Descriptive statistics

Standard deviation: 900.3156
Coefficient of variation (CV): 1.2861651
Kurtosis: 19.707876
Mean: 700
Median Absolute Deviation (MAD): 150
Skewness: 4.1290917
Sum: 31500
Variance: 810568.18
Monotonicity: Not monotonic
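
Several of these figures can be checked with Python's standard `statistics` module, again starting from the 수수료 value counts in this report. The sketch assumes the sample (n - 1) variance and takes MAD as the median of absolute deviations from the median. Both assumptions match the reported numbers. Skewness and kurtosis are left out.

```python
import statistics

# Value counts copied from the 수수료 frequency tables in this report.
fee_counts = {0: 3, 100: 1, 300: 2, 350: 17, 400: 2, 450: 1, 500: 1,
              600: 3, 800: 5, 1000: 7, 1500: 1, 3300: 1, 5500: 1}
fees = [value for value, n in fee_counts.items() for _ in range(n)]

total = sum(fees)
mean = statistics.mean(fees)
variance = statistics.variance(fees)   # sample variance, n - 1 denominator
std = statistics.stdev(fees)           # square root of the sample variance
cv = std / mean                        # coefficient of variation
median = statistics.median(fees)
mad = statistics.median(abs(x - median) for x in fees)

print(total, mean, round(variance, 2), round(std, 4), round(cv, 7), mad)
```

A CV above 1 together with a mean (700) that is twice the median (350) is consistent with the strong right skew reported above, which comes from the two large fees of 3300 and 5500.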
Histogram with fixed size bins (bins=13)
Common values
Value | Count | Frequency (%)
350 | 17 | 37.8%
1000 | 7 | 15.6%
800 | 5 | 11.1%
600 | 3 | 6.7%
0 | 3 | 6.7%
400 | 2 | 4.4%
300 | 2 | 4.4%
3300 | 1 | 2.2%
450 | 1 | 2.2%
5500 | 1 | 2.2%
Other values (3) | 3 | 6.7%
Minimum 10 values
Value | Count | Frequency (%)
0 | 3 | 6.7%
100 | 1 | 2.2%
300 | 2 | 4.4%
350 | 17 | 37.8%
400 | 2 | 4.4%
450 | 1 | 2.2%
500 | 1 | 2.2%
600 | 3 | 6.7%
800 | 5 | 11.1%
1000 | 7 | 15.6%
Maximum 10 values
Value | Count | Frequency (%)
5500 | 1 | 2.2%
3300 | 1 | 2.2%
1500 | 1 | 2.2%
1000 | 7 | 15.6%
800 | 5 | 11.1%
600 | 3 | 6.7%
500 | 1 | 2.2%
450 | 1 | 2.2%
400 | 2 | 4.4%
350 | 17 | 37.8%

비고
Text

MISSING 

Distinct: 3
Distinct (%): 100.0%
Missing: 42
Missing (%): 93.3%
Memory size: 492.0 B

Length

Max length: 33
Median length: 13
Mean length: 18.666667
Min length: 10

Characters and Unicode

Total characters: 56
Distinct characters: 25
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 100.0%

Sample

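1st row: 4매를 초과하는 경우에는 초과수 4매마다 100원을 더한다. (If there are more than 4 sheets, 100 won is added for every additional 4 sheets.)
2nd row: 초과료 10분당 100원 (Excess charge: 100 won per 10 minutes)
3rd row: 초과1장당 100원 (100 won per additional sheet)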
Value | Count | Frequency (%)
100원 | 2 | 16.7%
4매를 | 1 | 8.3%
초과하는 | 1 | 8.3%
경우에는 | 1 | 8.3%
초과수 | 1 | 8.3%
4매마다 | 1 | 8.3%
100원을 | 1 | 8.3%
더한다 | 1 | 8.3%
초과료 | 1 | 8.3%
10분당 | 1 | 8.3%

Most occurring characters

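Value | Count | Frequency (%)
(space) | 9 | 16.1%
0 | 7 | 12.5%
1 | 5 | 8.9%
(character not preserved) | 4 | 7.1%
(character not preserved) | 4 | 7.1%
(character not preserved) | 3 | 5.4%
(character not preserved) | 2 | 3.6%
(character not preserved) | 2 | 3.6%
4 | 2 | 3.6%
(character not preserved) | 2 | 3.6%
Other values (15) | 16 | 28.6%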

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 32 | 57.1%
Decimal Number | 14 | 25.0%
Space Separator | 9 | 16.1%
Other Punctuation | 1 | 1.8%

Most frequent character per category

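Other Letter
Value | Count | Frequency (%)
(character not preserved) | 4 | 12.5%
(character not preserved) | 4 | 12.5%
(character not preserved) | 3 | 9.4%
(character not preserved) | 2 | 6.2%
(character not preserved) | 2 | 6.2%
(character not preserved) | 2 | 6.2%
(character not preserved) | 2 | 6.2%
(character not preserved) | 1 | 3.1%
(character not preserved) | 1 | 3.1%
(character not preserved) | 1 | 3.1%
Other values (10) | 10 | 31.2%
Decimal Number
Value | Count | Frequency (%)
0 | 7 | 50.0%
1 | 5 | 35.7%
4 | 2 | 14.3%
Space Separator
Value | Count | Frequency (%)
(space) | 9 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
. | 1 | 100.0%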

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 32 | 57.1%
Common | 24 | 42.9%

Most frequent character per script

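Hangul
Value | Count | Frequency (%)
(character not preserved) | 4 | 12.5%
(character not preserved) | 4 | 12.5%
(character not preserved) | 3 | 9.4%
(character not preserved) | 2 | 6.2%
(character not preserved) | 2 | 6.2%
(character not preserved) | 2 | 6.2%
(character not preserved) | 2 | 6.2%
(character not preserved) | 1 | 3.1%
(character not preserved) | 1 | 3.1%
(character not preserved) | 1 | 3.1%
Other values (10) | 10 | 31.2%
Common
Value | Count | Frequency (%)
(space) | 9 | 37.5%
0 | 7 | 29.2%
1 | 5 | 20.8%
4 | 2 | 8.3%
. | 1 | 4.2%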

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 32 | 57.1%
ASCII | 24 | 42.9%

Most frequent character per block

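ASCII
Value | Count | Frequency (%)
(space) | 9 | 37.5%
0 | 7 | 29.2%
1 | 5 | 20.8%
4 | 2 | 8.3%
. | 1 | 4.2%
Hangul
Value | Count | Frequency (%)
(character not preserved) | 4 | 12.5%
(character not preserved) | 4 | 12.5%
(character not preserved) | 3 | 9.4%
(character not preserved) | 2 | 6.2%
(character not preserved) | 2 | 6.2%
(character not preserved) | 2 | 6.2%
(character not preserved) | 2 | 6.2%
(character not preserved) | 1 | 3.1%
(character not preserved) | 1 | 3.1%
(character not preserved) | 1 | 3.1%
Other values (10) | 10 | 31.2%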

Interactions


Correlations

Variable | 구분 | 종목 | 단위 | 수수료 | 비고
구분 | 1.000 | 1.000 | 0.848 | 0.000 | 1.000
종목 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
단위 | 0.848 | 1.000 | 1.000 | 0.200 | 1.000
수수료 | 0.000 | 1.000 | 0.200 | 1.000 | NaN
비고 | 1.000 | 1.000 | 1.000 | NaN | 1.000

Variable | 단위 | 구분
단위 | 1.000 | 0.624
구분 | 0.624 | 1.000

Variable | 수수료 | 구분 | 단위
수수료 | 1.000 | 0.000 | 0.124
구분 | 0.000 | 1.000 | 0.624
단위 | 0.124 | 0.624 | 1.000
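
The extracted text does not say which association measure produced each matrix. For a pair of categorical columns such as 구분 and 단위, one widely used measure is Cramér's V, sketched below without bias correction. The contingency tables are toy data, not counts from this dataset, and `cramers_v` is an illustrative helper rather than the profiler's code.

```python
import math


def cramers_v(table):
    """Cramér's V for a contingency table given as a list of rows.

    Assumes at least two rows and two columns, with no empty row or column.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic: observed vs. expected under independence.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))


# Toy examples: one category fully determines the other, then no association.
perfect = cramers_v([[10, 0], [0, 10]])
independent = cramers_v([[5, 5], [5, 5]])
print(perfect, independent)
```

Values close to 1 indicate that knowing one category largely determines the other, which is the situation the High correlation alert flags for 구분 and 단위.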

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

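First rows
Index | 구분 | 종목 | 단위 | 수수료 | 비고
0 | 재산 및 지방세 | 건축물 관리대장 무한 증명 | (not preserved) | 350 | <NA>
1 | 재산 및 지방세 | 공부의 등본, 초본, 하부원본 | (not preserved) | 350 | <NA>
2 | 재산 및 지방세 | 지방세 세목별 과세증명 | (not preserved) | 800 | <NA>
3 | 건축 및 주택 | 건축허가 또는 건축준공 검사증명 | (not preserved) | 350 | <NA>
4 | 건축 및 주택 | 시영주택 분양금 납부증명 | (not preserved) | 350 | <NA>
5 | 건축 및 주택 | 무허가 건물 확인원 | (not preserved) | 350 | <NA>
6 | 사실 및 실적 | 공장등록 증명 | (not preserved) | 1000 | <NA>
7 | 사실 및 실적 | 생산실적 증명 | (not preserved) | 350 | <NA>
8 | 사실 및 실적 | 실수요자 증명 | (not preserved) | 350 | <NA>
9 | 사실 및 실적 | 공장저당법 제7조 증명 | (not preserved) | 350 | <NA>

Last rows
Index | 구분 | 종목 | 단위 | 수수료 | 비고
35 | 무인민원발급창구 | 개별공시지가확인서 | 필지 | 600 | <NA>
36 | 무인민원발급창구 | 토지이용계획확인서 | (not preserved) | 800 | <NA>
37 | 무인민원발급창구 | 토지(임야)대장 | (not preserved) | 400 | 초과1장당 100원
38 | 무인민원발급창구 | 건축물대장 | (not preserved) | 400 | <NA>
39 | 무인민원발급창구 | 자동차등록원부(갑 · 을) | (not preserved) | 300 | <NA>
40 | 무인민원발급창구 | 건설기계등록원부(갑 ·을) | (not preserved) | 500 | <NA>
41 | 무인민원발급창구 | 지방세세목별과세증명서 | (not preserved) | 600 | <NA>
42 | 무인민원발급창구 | 가족관계증명원 | (not preserved) | 0 | <NA>
43 | 무인민원발급창구 | 성적증명서, 졸업증명서 | (not preserved) | 0 | <NA>
44 | 무인민원발급창구 | 검정고시 성적증명서, 졸업증명서 | (not preserved) | 300 | <NA>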