Overview

Dataset statistics

Number of variables: 9
Number of observations: 93
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.1 KiB
Average record size in memory: 78.4 B
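
The size figures above can be cross-checked with a little arithmetic. The following is a minimal sketch that uses only the numbers reported in this overview. The variable names are illustrative, and the exact byte total is not stated in the report, so it is back-computed here from the rounded average record size.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 9
n_observations = 93
missing_cells = 0
avg_record_bytes = 78.4  # reported value, already rounded to one decimal

# Total number of cells in the table (rows x columns).
total_cells = n_variables * n_observations

# Share of missing cells, as a percentage.
missing_pct = 100 * missing_cells / total_cells

# Approximate total size implied by the average record size.
total_kib = avg_record_bytes * n_observations / 1024

print(total_cells)
print(f"{missing_pct:.1f}%")
print(f"{total_kib:.1f} KiB")
```

The implied total rounds to the same 7.1 KiB that the report lists, so the two memory figures agree with each other.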

Variable types

Numeric: 3
Categorical: 5
Text: 1

Dataset

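Description: 부산광역시남구_세원유형별과세현황_20211231 (Busan Metropolitan City Nam-gu, taxation status by tax source type, as of 2021-12-31)
Author: 부산광역시 남구 (Nam-gu, Busan Metropolitan City)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15078563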

Alerts

시도명 has constant value "부산광역시" [Constant]
시군구명 has constant value "남구" [Constant]
자치단체코드 has constant value "26290" [Constant]
141 is highly overall correlated with 과세년도 [High correlation]
부과건수 is highly overall correlated with 부과금액 and 1 other field [High correlation]
부과금액 is highly overall correlated with 부과건수 [High correlation]
과세년도 is highly overall correlated with 141 [High correlation]
세목명 is highly overall correlated with 부과건수 [High correlation]
141 has unique values [Unique]
부과건수 has 27 (29.0%) zeros [Zeros]
부과금액 has 27 (29.0%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-10 17:03:40.926223
Analysis finished: 2023-12-10 17:03:42.882786
Duration: 1.96 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
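
A report of this kind can be regenerated from the source CSV. The sketch below is one hedged way to do it with ydata-profiling; the function name `build_report`, the default output path, the report title, and the file encoding are all assumptions and are not taken from the report or its config.json.

```python
from pathlib import Path


def build_report(csv_path: str, out_path: str = "report.html") -> Path:
    """Profile a CSV file and write the HTML report to out_path."""
    # Imported lazily so this module still loads where the third-party
    # packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the file is UTF-8. Korean public datasets are often
    # distributed as CP949 instead; change `encoding` if decoding fails.
    df = pd.read_csv(csv_path, encoding="utf-8")

    # The title is illustrative; the original run's settings are unknown.
    report = ProfileReport(df, title="세원유형별과세현황")
    report.to_file(out_path)
    return Path(out_path)
```

Calling `build_report("data.csv")` with a hypothetical local copy of the dataset would write `report.html` next to the script.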

Variables

141
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 93
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 47
Minimum: 1
Maximum: 93
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 969.0 B

Quantile statistics

Minimum: 1
5-th percentile: 5.6
Q1: 24
Median: 47
Q3: 70
95-th percentile: 88.4
Maximum: 93
Range: 92
Interquartile range (IQR): 46

Descriptive statistics

Standard deviation: 26.990739
Coefficient of variation (CV): 0.57427105
Kurtosis: -1.2
Mean: 47
Median Absolute Deviation (MAD): 23
Skewness: 0
Sum: 4371
Variance: 728.5
Monotonicity: Strictly increasing
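
These figures are what one would expect if the column simply holds the integers 1 to 93, which the distinct count, minimum, maximum, sum, and strict monotonicity all suggest. The sketch below recomputes them under that assumption; the `percentile` helper is illustrative and uses plain linear interpolation, which may or may not be the exact method the profiling tool applies.

```python
import math
import statistics

# Assumption: the column is exactly 1, 2, ..., 93.
values = list(range(1, 94))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation


def percentile(sorted_values, q):
    """Linear-interpolation percentile for q in [0, 1]."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


print(total, mean, variance)
print(f"{std:.6f} {cv:.6f}")
print(percentile(values, 0.25), percentile(values, 0.75))
```

Under this assumption the sum, mean, variance, standard deviation, CV, and quartiles all match the values listed above.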
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
1                  1      1.1%
60                 1      1.1%
69                 1      1.1%
68                 1      1.1%
67                 1      1.1%
66                 1      1.1%
65                 1      1.1%
64                 1      1.1%
63                 1      1.1%
62                 1      1.1%
Other values (83)  83     89.2%

Minimum 10 values

Value  Count  Frequency (%)
1      1      1.1%
2      1      1.1%
3      1      1.1%
4      1      1.1%
5      1      1.1%
6      1      1.1%
7      1      1.1%
8      1      1.1%
9      1      1.1%
10     1      1.1%

Maximum 10 values

Value  Count  Frequency (%)
93     1      1.1%
92     1      1.1%
91     1      1.1%
90     1      1.1%
89     1      1.1%
88     1      1.1%
87     1      1.1%
86     1      1.1%
85     1      1.1%
84     1      1.1%

시도명 (province/metropolitan city name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 876.0 B

Value       Count
부산광역시  93

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 부산광역시
2nd row: 부산광역시
3rd row: 부산광역시
4th row: 부산광역시
5th row: 부산광역시

Common Values

Value       Count  Frequency (%)
부산광역시  93     100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

시군구명 (district name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 876.0 B

Value  Count
남구   93

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 남구
2nd row: 남구
3rd row: 남구
4th row: 남구
5th row: 남구

Common Values

Value  Count  Frequency (%)
남구   93     100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

자치단체코드 (local government code)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 876.0 B

Value  Count
26290  93

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 26290
2nd row: 26290
3rd row: 26290
4th row: 26290
5th row: 26290

Common Values

Value  Count  Frequency (%)
26290  93     100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

과세년도 (tax year)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 876.0 B

Value  Count
2020   47
2021   46

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020
2nd row: 2020
3rd row: 2020
4th row: 2020
5th row: 2020

Common Values

Value  Count  Frequency (%)
2020   47     50.5%
2021   46     49.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

세목명 (tax item name)
Categorical

HIGH CORRELATION

Distinct: 13
Distinct (%): 14.0%
Missing: 0
Missing (%): 0.0%
Memory size: 876.0 B

Value             Count
취득세            18
주민세            16
자동차세          14
재산세            10
지방소득세        8
Other values (8)  27

Length

Max length: 7
Median length: 3
Mean length: 3.7311828
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 담배소비세
2nd row: 교육세
3rd row: 도시계획세
4th row: 취득세
5th row: 취득세

Common Values

Value             Count  Frequency (%)
취득세            18     19.4%
주민세            16     17.2%
자동차세          14     15.1%
재산세            10     10.8%
지방소득세        8      8.6%
레저세            8      8.6%
지역자원시설세    5      5.4%
등록면허세        4      4.3%
담배소비세        2      2.2%
교육세            2      2.2%
Other values (3)  6      6.5%
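
The frequency column in a table like this is each count divided by the 93 observations. As a quick worked check, the sketch below recomputes the percentages from the counts listed for 세목명; the dictionary is transcribed by hand from the report and the variable names are illustrative.

```python
# Counts transcribed from the "Common Values" table for 세목명.
counts = {
    "취득세": 18,
    "주민세": 16,
    "자동차세": 14,
    "재산세": 10,
    "지방소득세": 8,
    "레저세": 8,
    "지역자원시설세": 5,
    "등록면허세": 4,
    "담배소비세": 2,
    "교육세": 2,
    "Other values (3)": 6,
}

# The counts should add up to the number of observations.
n = sum(counts.values())

# Frequency of each value as a percentage, rounded to one decimal.
freq = {name: round(100 * c / n, 1) for name, c in counts.items()}

for name, pct in freq.items():
    print(f"{name}: {pct}%")
```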

Length

[Figure: histogram of lengths of the category]

세원유형명 (tax source type name)
Text

Distinct: 50
Distinct (%): 53.8%
Missing: 0
Missing (%): 0.0%
Memory size: 876.0 B

Length

Max length: 11
Median length: 8
Mean length: 6.0322581
Min length: 2

Characters and Unicode

Total characters: 561
Distinct characters: 74
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7
Unique (%): 7.5%

Sample

1st row: 담배소비세
2nd row: 교육세
3rd row: 도시계획세
4th row: 건축물
5th row: 주택(개별)

Value                 Count  Frequency (%)
담배소비세            2      2.2%
도시계획세            2      2.2%
지역자원시설세(소방   2      2.2%
주민세(종합소득       2      2.2%
지방소득세(특별징수   2      2.2%
지방소득세(법인소득   2      2.2%
지방소득세(양도소득   2      2.2%
지방소득세(종합소득   2      2.2%
지방소비세            2      2.2%
승합                  2      2.2%
Other values (40)     73     78.5%

Most occurring characters

Value              Count  Frequency (%)
                   55     9.8%
(                  49     8.7%
)                  49     8.7%
                   27     4.8%
                   24     4.3%
                   19     3.4%
                   18     3.2%
                   16     2.9%
                   12     2.1%
                   11     2.0%
Other values (64)  281    50.1%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       461    82.2%
Open Punctuation   49     8.7%
Close Punctuation  49     8.7%
Decimal Number     2      0.4%

Most frequent character per category

Other Letter

Value              Count  Frequency (%)
                   55     11.9%
                   27     5.9%
                   24     5.2%
                   19     4.1%
                   18     3.9%
                   16     3.5%
                   12     2.6%
                   11     2.4%
                   11     2.4%
                   10     2.2%
Other values (61)  258    56.0%

Open Punctuation

Value  Count  Frequency (%)
(      49     100.0%

Close Punctuation

Value  Count  Frequency (%)
)      49     100.0%

Decimal Number

Value  Count  Frequency (%)
3      2      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  461    82.2%
Common  100    17.8%

Most frequent character per script

Hangul

Value              Count  Frequency (%)
                   55     11.9%
                   27     5.9%
                   24     5.2%
                   19     4.1%
                   18     3.9%
                   16     3.5%
                   12     2.6%
                   11     2.4%
                   11     2.4%
                   10     2.2%
Other values (61)  258    56.0%

Common

Value  Count  Frequency (%)
(      49     49.0%
)      49     49.0%
3      2      2.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  461    82.2%
ASCII   100    17.8%

Most frequent character per block

Hangul

Value              Count  Frequency (%)
                   55     11.9%
                   27     5.9%
                   24     5.2%
                   19     4.1%
                   18     3.9%
                   16     3.5%
                   12     2.6%
                   11     2.4%
                   11     2.4%
                   10     2.2%
Other values (61)  258    56.0%

ASCII

Value  Count  Frequency (%)
(      49     49.0%
)      49     49.0%
3      2      2.0%

부과건수 (number of assessments)
Real number (ℝ)

HIGH CORRELATION  ZEROS

Distinct: 65
Distinct (%): 69.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32750.86
Minimum: 0
Maximum: 559677
Zeros: 27
Zeros (%): 29.0%
Negative: 0
Negative (%): 0.0%
Memory size: 969.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1338
Q3: 17803
95-th percentile: 175153.4
Maximum: 559677
Range: 559677
Interquartile range (IQR): 17803

Descriptive statistics

Standard deviation: 89860.342
Coefficient of variation (CV): 2.7437552
Kurtosis: 23.148798
Mean: 32750.86
Median Absolute Deviation (MAD): 1338
Skewness: 4.5145974
Sum: 3045830
Variance: 8.074881 × 10^9
Monotonicity: Not monotonic
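
Several of these descriptive statistics are tied together by simple identities: the mean is the sum divided by the number of observations, the variance is the square of the standard deviation, and the CV is the standard deviation divided by the mean. The sketch below checks the reported 부과건수 figures against those identities; `check_summary` is a hypothetical helper written for this illustration, and the tolerance only allows for the rounding in the printed values.

```python
import math


def check_summary(n, total, mean, std, variance, cv, rel_tol=1e-6):
    """Check three identities between reported summary statistics."""
    return {
        "mean == sum / n": math.isclose(mean, total / n, rel_tol=rel_tol),
        "variance == std ** 2": math.isclose(variance, std ** 2, rel_tol=rel_tol),
        "cv == std / mean": math.isclose(cv, std / mean, rel_tol=rel_tol),
    }


# Values copied from the report for 부과건수 (93 observations).
checks = check_summary(
    n=93,
    total=3045830,
    mean=32750.86,
    std=89860.342,
    variance=8.074881e9,
    cv=2.7437552,
)

for name, ok in checks.items():
    print(f"{name}: {ok}")
```

The same helper can be applied to the 부과금액 figures by substituting that variable's reported values.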
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
0                  27     29.0%
3                  3      3.2%
1367               1      1.1%
99317              1      1.1%
11377              1      1.1%
18344              1      1.1%
305                1      1.1%
20825              1      1.1%
119146             1      1.1%
50                 1      1.1%
Other values (55)  55     59.1%

Minimum 10 values

Value  Count  Frequency (%)
0      27     29.0%
3      3      3.2%
10     1      1.1%
13     1      1.1%
14     1      1.1%
50     1      1.1%
70     1      1.1%
297    1      1.1%
305    1      1.1%
312    1      1.1%

Maximum 10 values

Value   Count  Frequency (%)
559677  1      1.1%
543254  1      1.1%
192393  1      1.1%
189350  1      1.1%
179459  1      1.1%
172283  1      1.1%
132020  1      1.1%
131238  1      1.1%
120678  1      1.1%
119146  1      1.1%

부과금액 (assessed amount)
Real number (ℝ)

HIGH CORRELATION  ZEROS

Distinct: 67
Distinct (%): 72.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.5907714 × 10^9
Minimum: 0
Maximum: 9.2190692 × 10^10
Zeros: 27
Zeros (%): 29.0%
Negative: 0
Negative (%): 0.0%
Memory size: 969.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 3.21064 × 10^8
Q3: 9.376677 × 10^9
95-th percentile: 3.243598 × 10^10
Maximum: 9.2190692 × 10^10
Range: 9.2190692 × 10^10
Interquartile range (IQR): 9.376677 × 10^9

Descriptive statistics

Standard deviation: 1.4257439 × 10^10
Coefficient of variation (CV): 1.8782595
Kurtosis: 13.751344
Mean: 7.5907714 × 10^9
Median Absolute Deviation (MAD): 3.21064 × 10^8
Skewness: 3.1895797
Sum: 7.0594174 × 10^11
Variance: 2.0327456 × 10^20
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
0                  27     29.0%
29393516000        1      1.1%
996821000          1      1.1%
1407374000         1      1.1%
5917537000         1      1.1%
53154000           1      1.1%
23715420000        1      1.1%
29783823000        1      1.1%
200058000          1      1.1%
22242472000        1      1.1%
Other values (57)  57     61.3%

Minimum 10 values

Value     Count  Frequency (%)
0         27     29.0%
10034000  1      1.1%
13860000  1      1.1%
15023000  1      1.1%
15603000  1      1.1%
18095000  1      1.1%
23356000  1      1.1%
36124000  1      1.1%
50513000  1      1.1%
53154000  1      1.1%

Maximum 10 values

Value        Count  Frequency (%)
92190692000  1      1.1%
51152538000  1      1.1%
48511527000  1      1.1%
34503341000  1      1.1%
33429916000  1      1.1%
31773356000  1      1.1%
29783823000  1      1.1%
29393516000  1      1.1%
28798706000  1      1.1%
23715420000  1      1.1%

Interactions

[Figure: interaction plots]

Correlations

            141     과세년도  세목명  세원유형명  부과건수  부과금액
141         1.000   1.000     0.788   0.000       0.000     0.000
과세년도    1.000   1.000     0.000   0.000       0.000     0.000
세목명      0.788   0.000     1.000   1.000       0.829     0.607
세원유형명  0.000   0.000     1.000   1.000       0.999     0.912
부과건수    0.000   0.000     0.829   0.999       1.000     0.495
부과금액    0.000   0.000     0.607   0.912       0.495     1.000

          과세년도  세목명
과세년도  1.000     0.000
세목명    0.000     1.000

          141     부과건수  부과금액  과세년도  세목명
141       1.000   0.046     -0.064    0.955     0.468
부과건수  0.046   1.000     0.839     0.000     0.611
부과금액  -0.064  0.839     1.000     0.000     0.339
과세년도  0.955   0.000     0.000     1.000     0.000
세목명    0.468   0.611     0.339     0.000     1.000
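
For pairs of categorical variables such as 과세년도 and 세목명, one widely used association measure is Cramér's V, which ydata-profiling lists among its available correlation options. The sketch below implements the bias-corrected variant in plain Python. Whether any of the matrices above was computed exactly this way is an assumption, and the function name and example tables are illustrative only.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (rows of counts)."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic, without continuity correction.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected

    # Bias correction: shrink phi-squared and the table dimensions.
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)

    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0
    return math.sqrt(phi2_corr / denom)


# Two made-up 2x2 tables: perfect association and exact independence.
perfect = cramers_v_corrected([[10, 0], [0, 10]])
independent = cramers_v_corrected([[5, 5], [5, 5]])
print(perfect, independent)
```

Because the correction clips small values to zero, a weak association between two categorical columns can be reported as exactly 0.000 under this variant.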

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

    141  시도명      시군구명  자치단체코드  과세년도  세목명      세원유형명  부과건수  부과금액
0   1    부산광역시  남구      26290         2020      담배소비세  담배소비세  0         0
1   2    부산광역시  남구      26290         2020      교육세      교육세      559677    22242472000
2   3    부산광역시  남구      26290         2020      도시계획세  도시계획세  0         0
3   4    부산광역시  남구      26290         2020      취득세      건축물      6457118039000
4   5    부산광역시  남구      26290         2020      취득세      주택(개별)  218913248744000
5   6    부산광역시  남구      26290         2020      취득세      주택(단독)  8438      51152538000
6   7    부산광역시  남구      26290         2020      취득세      기타        14        66725000
7   8    부산광역시  남구      26290         2020      취득세      항공기      0         0
8   9    부산광역시  남구      26290         2020      취득세      기계장비    3         10034000
9   10   부산광역시  남구      26290         2020      취득세      차량        2131      315202000

Last rows

    141  시도명      시군구명  자치단체코드  과세년도  세목명    세원유형명      부과건수  부과금액
83  84   부산광역시  남구      26290         2021      레저세    경륜            0         0
84  85   부산광역시  남구      26290         2021      레저세    경마            0         0
85  86   부산광역시  남구      26290         2021      자동차세  자동차세(주행)  0         0
86  87   부산광역시  남구      26290         2021      자동차세  3륜이하         1338      15603000
87  88   부산광역시  남구      26290         2021      자동차세  특수            5734184116000
88  89   부산광역시  남구      26290         2021      자동차세  화물            11102287606000
89  90   부산광역시  남구      26290         2021      자동차세  승합            3006      152193000
90  91   부산광역시  남구      26290         2021      자동차세  기타승용        565       36124000
91  92   부산광역시  남구      26290         2021      자동차세  승용            131238    18811057000
92  93   부산광역시  남구      26290         2021      체납      체납            172283    9365924000

In rows 3, 4, 87, and 88 the 부과건수 and 부과금액 digits run together in the source and are shown unsplit.