Overview

Dataset statistics

Number of variables: 5
Number of observations: 61
Missing cells: 61
Missing cells (%): 20.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.6 KiB
Average record size in memory: 44.2 B
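The derived figures above follow from simple arithmetic on the reported counts. The sketch below is only an illustration with hypothetical variable names; it assumes the total size is the average record size multiplied by the number of observations.

```python
# Figures reported in the dataset statistics.
n_variables = 5
n_observations = 61
missing_cells = 61
avg_record_bytes = 44.2

total_cells = n_variables * n_observations            # 305 cells in the table
missing_pct = 100 * missing_cells / total_cells       # share of empty cells
total_kib = avg_record_bytes * n_observations / 1024  # bytes converted to KiB

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Total size in memory: {total_kib:.1f} KiB")
```

All 61 missing cells come from the single fully empty column 비고, which is why exactly one fifth of the cells are missing.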

Variable types

Categorical: 2
Text: 1
Numeric: 1
Unsupported: 1

Dataset

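Description: Fee information for permit and license applications in Seongdong-gu, Seoul. The columns are fee category (구분), item (종목), unit (단위), fee (수수료), and remarks (비고).
Author: Seongdong-gu, Seoul (서울특별시 성동구)
URL: https://www.data.go.kr/data/15084617/fileData.do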

Alerts

단위 is highly imbalanced (87.9%) | Imbalance
비고 has 61 (100.0%) missing values | Missing
종목 has unique values | Unique
비고 is an unsupported type, check if it needs cleaning or further analysis | Unsupported
수수료 has 3 (4.9%) zeros | Zeros
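The 87.9% imbalance figure for 단위 can be reproduced from its two value counts (60 and 1). The sketch below assumes the score is one minus the Shannon entropy normalized by its maximum; that definition is an assumption made here for illustration, although it does match the reported number. The function name is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    max_entropy = math.log2(len(counts))
    # A single class has zero maximum entropy; treat it as not imbalanced.
    return 1 - entropy / max_entropy if max_entropy > 0 else 0.0

# 단위 has two distinct values with counts 60 and 1.
score = imbalance_score([60, 1])
print(f"{score:.1%}")
```

A score of 0 means the classes are evenly spread, and a score near 1 means almost every row has the same value.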

Reproduction

Analysis started: 2023-12-12 10:52:30.080951
Analysis finished: 2023-12-12 10:52:30.930385
Duration: 0.85 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
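A report of this kind can be regenerated with ydata-profiling. The sketch below is a minimal example, not the configuration actually used for this report: the file paths, the report title, and the `cp949` encoding are assumptions, and `build_report` is a hypothetical helper.

```python
def build_report(csv_path, html_path, encoding="cp949"):
    """Profile a CSV file and write the resulting report as HTML."""
    # Imported here so the function can be defined even when the optional
    # third-party packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)  # encoding is an assumption
    report = ProfileReport(df, title="Seongdong-gu permit fee profile")
    report.to_file(html_path)
    return report

# Example call (both paths are placeholders):
# build_report("seongdong_fees.csv", "report.html")
```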

Variables

구분
Categorical

Distinct: 8
Distinct (%): 13.1%
Missing: 0
Missing (%): 0.0%
Memory size: 620.0 B

Length

Max length: 11
Median length: 4
Mean length: 4.8032787
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 인허가
2nd row: 인허가
3rd row: 인허가
4th row: 인허가
5th row: 인허가

Common Values

Value | Count | Frequency (%)
신고신청 | 19 | 31.1%
게임산업 | 10 | 16.4%
영화 및 비디오 산업 | 8 | 13.1%
인허가 | 7 | 11.5%
의료기관 | 7 | 11.5%
음악산업 | 6 | 9.8%
공연장 | 2 | 3.3%
체육시설업 | 2 | 3.3%
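The frequency percentages and the length statistics for 구분 can both be derived from the value counts in the Common Values table. The following sketch uses only those counts; the variable names are illustrative.

```python
from statistics import mean, median

# Value counts for 구분, copied from the Common Values table.
counts = {
    "신고신청": 19, "게임산업": 10, "영화 및 비디오 산업": 8, "인허가": 7,
    "의료기관": 7, "음악산업": 6, "공연장": 2, "체육시설업": 2,
}
n = sum(counts.values())  # 61 observations

# Relative frequency of each value, in percent.
freq_pct = {value: round(100 * c / n, 1) for value, c in counts.items()}

# Character length of every observation (each value repeated by its count).
lengths = [len(value) for value, c in counts.items() for _ in range(c)]

print(freq_pct)
print(max(lengths), median(lengths), round(mean(lengths), 7), min(lengths))
```

The mean length works out to 293 / 61, which agrees with the reported 4.8032787.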

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
신고신청 | 19 | 22.4%
게임산업 | 10 | 11.8%
영화 | 8 | 9.4%
및 | 8 | 9.4%
비디오 | 8 | 9.4%
산업 | 8 | 9.4%
인허가 | 7 | 8.2%
의료기관 | 7 | 8.2%
음악산업 | 6 | 7.1%
공연장 | 2 | 2.4%
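These percentages differ from the Common Values table because they are computed per word rather than per row: each value is split on whitespace, so the 8 rows of 영화 및 비디오 산업 contribute four words each. A small sketch of that counting, assuming a plain whitespace split:

```python
from collections import Counter

# Row-level value counts for 구분.
value_counts = {
    "신고신청": 19, "게임산업": 10, "영화 및 비디오 산업": 8, "인허가": 7,
    "의료기관": 7, "음악산업": 6, "공연장": 2, "체육시설업": 2,
}

# Split every value into words and weight each word by the row count.
word_counts = Counter()
for value, count in value_counts.items():
    for word in value.split():
        word_counts[word] += count

total_words = sum(word_counts.values())
for word, count in word_counts.most_common():
    print(word, count, f"{100 * count / total_words:.1f}%")
```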

종목
Text

UNIQUE 

Distinct: 61
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 620.0 B

Length

Max length: 42
Median length: 26
Mean length: 19.016393
Min length: 7

Characters and Unicode

Total characters: 1160
Distinct characters: 137
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
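As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point as a two-letter abbreviation (`Lo` for Other Letter, `Zs` for Space Separator, `Ps` and `Pe` for Open and Close Punctuation, `Po` for Other Punctuation). This is only a sketch; the report itself may rely on a different library, and `category_counts` is a hypothetical helper.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count the characters of `text` by Unicode general category."""
    return Counter(unicodedata.category(ch) for ch in text)

sample = "권리변경(재개발지구)행위 허가신청"  # 4th sample row of 종목
cats = category_counts(sample)
print(cats)
```

For this row, the Hangul syllables fall under `Lo`, the parentheses under `Ps` and `Pe`, and the single space under `Zs`.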

Unique

Unique: 61
Unique (%): 100.0%

Sample

1st row: 행정서사업 허가증 교부
2nd row: 도로하천부지점용(지하도, 지하상가, 노상주차장)허가
3rd row: 공유수면 점용허가
4th row: 권리변경(재개발지구)행위 허가신청
5th row: 사도개설허가 및 축조허가 신청

Words

Value | Count | Frequency (%)
· | 25 | 10.2%
신청 | 13 | 5.3%
신고 | 8 | 3.3%
변경신고 | 6 | 2.4%
등록신청 | 6 | 2.4%
등록사항 | 5 | 2.0%
또는 | 5 | 2.0%
음반 | 4 | 1.6%
개설등록 | 4 | 1.6%
(not preserved) | 4 | 1.6%
Other values (108) | 165 | 67.3%

Most occurring characters

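Value | Count | Frequency (%)
(space) | 186 | 16.0%
(not preserved) | 60 | 5.2%
(not preserved) | 40 | 3.4%
(not preserved) | 38 | 3.3%
· | 25 | 2.2%
(not preserved) | 24 | 2.1%
(not preserved) | 23 | 2.0%
(not preserved) | 23 | 2.0%
(not preserved) | 23 | 2.0%
(not preserved) | 22 | 1.9%
Other values (127) | 696 | 60.0%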

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 901 | 77.7%
Space Separator | 186 | 16.0%
Other Punctuation | 43 | 3.7%
Close Punctuation | 15 | 1.3%
Open Punctuation | 15 | 1.3%

Most frequent character per category

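Other Letter
Value | Count | Frequency (%)
(not preserved) | 60 | 6.7%
(not preserved) | 40 | 4.4%
(not preserved) | 38 | 4.2%
(not preserved) | 24 | 2.7%
(not preserved) | 23 | 2.6%
(not preserved) | 23 | 2.6%
(not preserved) | 23 | 2.6%
(not preserved) | 22 | 2.4%
(not preserved) | 22 | 2.4%
(not preserved) | 21 | 2.3%
Other values (122) | 605 | 67.1%

Other Punctuation
Value | Count | Frequency (%)
· | 25 | 58.1%
, | 18 | 41.9%

Space Separator
Value | Count | Frequency (%)
(space) | 186 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 15 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 15 | 100.0%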

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 901 | 77.7%
Common | 259 | 22.3%

Most frequent character per script

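Hangul
Value | Count | Frequency (%)
(not preserved) | 60 | 6.7%
(not preserved) | 40 | 4.4%
(not preserved) | 38 | 4.2%
(not preserved) | 24 | 2.7%
(not preserved) | 23 | 2.6%
(not preserved) | 23 | 2.6%
(not preserved) | 23 | 2.6%
(not preserved) | 22 | 2.4%
(not preserved) | 22 | 2.4%
(not preserved) | 21 | 2.3%
Other values (122) | 605 | 67.1%

Common
Value | Count | Frequency (%)
(space) | 186 | 71.8%
· | 25 | 9.7%
, | 18 | 6.9%
) | 15 | 5.8%
( | 15 | 5.8%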
5.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 901 | 77.7%
ASCII | 234 | 20.2%
None | 25 | 2.2%

Most frequent character per block

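ASCII
Value | Count | Frequency (%)
(space) | 186 | 79.5%
, | 18 | 7.7%
) | 15 | 6.4%
( | 15 | 6.4%

Hangul
Value | Count | Frequency (%)
(not preserved) | 60 | 6.7%
(not preserved) | 40 | 4.4%
(not preserved) | 38 | 4.2%
(not preserved) | 24 | 2.7%
(not preserved) | 23 | 2.6%
(not preserved) | 23 | 2.6%
(not preserved) | 23 | 2.6%
(not preserved) | 22 | 2.4%
(not preserved) | 22 | 2.4%
(not preserved) | 21 | 2.3%
Other values (122) | 605 | 67.1%

None
Value | Count | Frequency (%)
· | 25 | 100.0%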

단위
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 620.0 B

Length

Max length: 2
Median length: 1
Mean length: 1.0163934
Min length: 1

Unique

Unique: 1
Unique (%): 1.6%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

Value | Count | Frequency (%)
(not preserved) | 60 | 98.4%
필지 | 1 | 1.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
(not preserved) | 60 | 98.4%
필지 | 1 | 1.6%

수수료
Real number (ℝ)

ZEROS 

Distinct: 13
Distinct (%): 21.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14304.918
Minimum: 0
Maximum: 100000
Zeros: 3
Zeros (%): 4.9%
Negative: 0
Negative (%): 0.0%
Memory size: 681.0 B

Quantile statistics

Minimum: 0
5-th percentile: 550
Q1: 3000
Median: 10000
Q3: 20000
95-th percentile: 40000
Maximum: 100000
Range: 100000
Interquartile range (IQR): 17000

Descriptive statistics

Standard deviation: 18835.334
Coefficient of variation (CV): 1.3167034
Kurtosis: 12.876938
Mean: 14304.918
Median Absolute Deviation (MAD): 9450
Skewness: 3.2446595
Sum: 872600
Variance: 3.5476981 × 10^8
Monotonicity: Not monotonic
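Most of the quantile and descriptive statistics can be recomputed from the 13 distinct fee values and their counts listed in the value tables for 수수료. The sketch below uses only the Python standard library and assumes the sample (n - 1) standard deviation and linearly interpolated quartiles, which agree with the reported figures; the variable names are illustrative.

```python
import statistics

# (fee, count) pairs copied from the value tables of 수수료.
fee_counts = [
    (0, 3), (550, 7), (800, 1), (950, 1), (1000, 2), (2000, 1), (3000, 1),
    (5000, 10), (10000, 12), (20000, 16), (30000, 3), (40000, 2), (100000, 2),
]
# Expand the table back into one sorted value per observation.
fees = sorted(fee for fee, count in fee_counts for _ in range(count))

total = sum(fees)
mean = statistics.mean(fees)
std = statistics.stdev(fees)   # sample standard deviation (divides by n - 1)
cv = std / mean                # coefficient of variation
q1, med, q3 = statistics.quantiles(fees, n=4, method="inclusive")
iqr = q3 - q1
mad = statistics.median(abs(x - med) for x in fees)  # median absolute deviation

print(total, round(mean, 3), round(std, 3), round(cv, 4))
print(q1, med, q3, iqr, mad)
```

The coefficient of variation above 1 and the large gap between the mean and the median both reflect the two 100000 entries pulling the distribution to the right.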
Histogram with fixed size bins (bins=13)

Common values

Value | Count | Frequency (%)
20000 | 16 | 26.2%
10000 | 12 | 19.7%
5000 | 10 | 16.4%
550 | 7 | 11.5%
0 | 3 | 4.9%
30000 | 3 | 4.9%
1000 | 2 | 3.3%
100000 | 2 | 3.3%
40000 | 2 | 3.3%
2000 | 1 | 1.6%
Other values (3) | 3 | 4.9%

Minimum 10 values

Value | Count | Frequency (%)
0 | 3 | 4.9%
550 | 7 | 11.5%
800 | 1 | 1.6%
950 | 1 | 1.6%
1000 | 2 | 3.3%
2000 | 1 | 1.6%
3000 | 1 | 1.6%
5000 | 10 | 16.4%
10000 | 12 | 19.7%
20000 | 16 | 26.2%

Maximum 10 values

Value | Count | Frequency (%)
100000 | 2 | 3.3%
40000 | 2 | 3.3%
30000 | 3 | 4.9%
20000 | 16 | 26.2%
10000 | 12 | 19.7%
5000 | 10 | 16.4%
3000 | 1 | 1.6%
2000 | 1 | 1.6%
1000 | 2 | 3.3%
950 | 1 | 1.6%

비고
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 61
Missing (%): 100.0%
Memory size: 681.0 B

Interactions

[Figure: interactions plot]

Correlations

 | 구분 | 종목 | 단위 | 수수료
구분 | 1.000 | 1.000 | 0.000 | 0.528
종목 | 1.000 | 1.000 | 1.000 | 1.000
단위 | 0.000 | 1.000 | 1.000 | 0.000
수수료 | 0.528 | 1.000 | 0.000 | 1.000

 | 단위 | 구분
단위 | 1.000 | 0.000
구분 | 0.000 | 1.000

 | 수수료 | 구분 | 단위
수수료 | 1.000 | 0.347 | 0.000
구분 | 0.347 | 1.000 | 0.000
단위 | 0.000 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

 | 구분 | 종목 | 단위 | 수수료 | 비고
0 | 인허가 | 행정서사업 허가증 교부 | | 550 | <NA>
1 | 인허가 | 도로하천부지점용(지하도, 지하상가, 노상주차장)허가 | | 1000 | <NA>
2 | 인허가 | 공유수면 점용허가 | | 0 | <NA>
3 | 인허가 | 권리변경(재개발지구)행위 허가신청 | | 550 | <NA>
4 | 인허가 | 사도개설허가 및 축조허가 신청 | | 550 | <NA>
5 | 인허가 | 토지형질변경(택지조성)행위허가 신청 | | 2000 | <NA>
6 | 인허가 | 소음 · 진동 배출시설 설치허가 신청 | | 10000 | <NA>
7 | 신고신청 | 출판사 인쇄소 등록신고 | | 0 | <NA>
8 | 신고신청 | 이 · 미용업 개설 신고증 재교부 | | 0 | <NA>
9 | 신고신청 | 국 · 공유재산 대부신청(신규) | | 5000 | <NA>

Last rows

 | 구분 | 종목 | 단위 | 수수료 | 비고
51 | 영화 및 비디오 산업 | 영화상영관 등록신청 | | 20000 | <NA>
52 | 영화 및 비디오 산업 | 영화상영관 변경등록 신청 | | 10000 | <NA>
53 | 영화 및 비디오 산업 | 비디오물시청제공업 등록신청 | | 20000 | <NA>
54 | 영화 및 비디오 산업 | 비디오물시청제공업 변경등록 신청 | | 5000 | <NA>
55 | 영화 및 비디오 산업 | 비디오물제작업 · 비디오물배급업 신고 | | 20000 | <NA>
56 | 영화 및 비디오 산업 | 비디오물제작업 · 비디오물배급업 변경신고 | | 10000 | <NA>
57 | 영화 및 비디오 산업 | 영화제작 · 수입 · 배급 · 상영업 신규신고 | | 20000 | <NA>
58 | 영화 및 비디오 산업 | 영화제작 · 수입 · 배급 · 상영업 변경신고 | | 10000 | <NA>
59 | 체육시설업 | 체육시설업의 신고 | | 30000 | <NA>
60 | 체육시설업 | 체육시설업의 변경신고 | | 10000 | <NA>