Overview

Dataset statistics

Number of variables: 7
Number of observations: 320
Missing cells: 288
Missing cells (%): 12.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 19.2 KiB
Average record size in memory: 61.4 B
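The percentage and per-record figures above follow from the raw counts. The sketch below recomputes them from the reported numbers only; the variable names are illustrative and are not taken from the profiling tool.

```python
# Figures copied from the dataset statistics above.
n_variables = 7
n_observations = 320
missing_cells = 288
total_size_kib = 19.2

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 1 KiB = 1024 bytes, so the average record size is the total size per row.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
print(f"{avg_record_bytes:.1f} B per record")
```

Both results round to the values shown in the report (12.9% and 61.4 B).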

Variable types

Categorical: 1
Text: 2
Numeric: 4

Dataset

Description: Own-source revenue per resident (주민 1인당 자체수입액 현황)
Author: Ministry of the Interior and Safety (행정안전부)
URL: https://data.gg.go.kr/portal/data/service/selectServicePage.do?&infId=MM5VCBBW0VJQNMTSDBNQ22787583&infSeq=1

Alerts

High correlation: 지방세금액(천원) is highly overall correlated with 세외수입액(천원) and 1 other field.
High correlation: 세외수입액(천원) is highly overall correlated with 지방세금액(천원) and 1 other field.
High correlation: 인구수(명) is highly overall correlated with 지방세금액(천원) and 1 other field.
Imbalance: 회계연도 is highly imbalanced (53.1%).
Missing: 시군명 has 288 (90.0%) missing values.
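The report does not say how the 53.1% imbalance figure is derived. One definition that reproduces it from the 회계연도 counts (288 and 32) is one minus the Shannon entropy of the class shares, normalized by the maximum entropy for that number of classes. The sketch below rests on that assumption, and `imbalance_score` is a hypothetical helper name, not a function from the profiling library.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits
    of the class shares and k is the number of classes.
    0 means perfectly balanced, 1 means a single class holds everything."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Counts of 2022 and 2023 in the fiscal-year column.
score = imbalance_score([288, 32])
print(f"{score:.1%}")
```

Under this definition a 90/10 split scores about 0.531, which matches the alert. A 50/50 split would score 0.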

Reproduction

Analysis started: 2023-12-10 21:45:04.931540
Analysis finished: 2023-12-10 21:45:06.688953
Duration: 1.76 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

회계연도 (fiscal year)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 KiB

2022: 288
2023: 32

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023
2nd row: 2023
3rd row: 2023
4th row: 2023
5th row: 2023

Common Values

Value | Count | Frequency (%)
2022 | 288 | 90.0%
2023 | 32 | 10.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the value counts listed under Common Values]

시군명 (city/county name)
Text

MISSING

Distinct: 32
Distinct (%): 100.0%
Missing: 288
Missing (%): 90.0%
Memory size: 2.6 KiB

Length

Max length: 4
Median length: 3
Mean length: 3.09375
Min length: 3

Characters and Unicode

Total characters: 99
Distinct characters: 41
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
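The mean length reported for a text variable is its total character count divided by the number of non-missing values. The sketch below checks this against the figures in the report for both text variables. The totals for 자치단체명 (1534 characters, no missing values) come from the second text-variable section, and the dictionary layout is only illustrative.

```python
# (total characters, non-missing values) as reported for each text variable
reported = {
    "시군명": (99, 320 - 288),   # 288 of the 320 values are missing
    "자치단체명": (1534, 320),    # no missing values
}

mean_lengths = {name: chars / n for name, (chars, n) in reported.items()}

for name, mean_len in mean_lengths.items():
    print(name, mean_len)
```

The results agree with the reported mean lengths of 3.09375 and 4.79375.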

Unique

Unique: 32
Unique (%): 100.0%

Sample

1st row: 가평군
2nd row: 경기도
3rd row: 고양시
4th row: 과천시
5th row: 광명시

Value | Count | Frequency (%)
경기도 | 1 | 3.1%
고양시 | 1 | 3.1%
화성시 | 1 | 3.1%
하남시 | 1 | 3.1%
포천시 | 1 | 3.1%
평택시 | 1 | 3.1%
파주시 | 1 | 3.1%
이천시 | 1 | 3.1%
의정부시 | 1 | 3.1%
의왕시 | 1 | 3.1%
Other values (22) | 22 | 68.8%

Most occurring characters

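The characters themselves were not preserved in this copy; only their counts remain, in descending order.

Rank | Count | Frequency (%)
1 | 29 | 29.3%
2 | 6 | 6.1%
3 | 5 | 5.1%
4 | 5 | 5.1%
5 | 4 | 4.0%
6 | 3 | 3.0%
7 | 3 | 3.0%
8 | 3 | 3.0%
9 | 3 | 3.0%
10 | 3 | 3.0%
Other values (31) | 35 | 35.4%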

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 99 | 100.0%

Most frequent character per category

Other Letter: same counts as under Most occurring characters, since every character belongs to this category.

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 99 | 100.0%

Most frequent character per script

Hangul: same counts as under Most occurring characters.

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 99 | 100.0%

Most frequent character per block

Hangul: same counts as under Most occurring characters.

자치단체명 (local government name)
Text

Distinct: 288
Distinct (%): 90.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 KiB

Length

Max length: 6
Median length: 5
Mean length: 4.79375
Min length: 3

Characters and Unicode

Total characters: 1534
Distinct characters: 135
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 256
Unique (%): 80.0%

Sample

1st row: 경기가평군
2nd row: 경기본청
3rd row: 경기고양시
4th row: 경기과천시
5th row: 경기광명시

Value | Count | Frequency (%)
경기가평군 | 2 | 0.6%
경기화성시 | 2 | 0.6%
경기안성시 | 2 | 0.6%
경기용인시 | 2 | 0.6%
경기여주시 | 2 | 0.6%
경기연천군 | 2 | 0.6%
경기오산시 | 2 | 0.6%
경기의왕시 | 2 | 0.6%
경기본청 | 2 | 0.6%
경기이천시 | 2 | 0.6%
Other values (278) | 300 | 93.8%

Most occurring characters

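The characters themselves were not preserved in this copy; only their counts remain, in descending order.

Rank | Count | Frequency (%)
1 | 119 | 7.8%
2 | 113 | 7.4%
3 | 101 | 6.6%
4 | 94 | 6.1%
5 | 82 | 5.3%
6 | 68 | 4.4%
7 | 66 | 4.3%
8 | 54 | 3.5%
9 | 47 | 3.1%
10 | 47 | 3.1%
Other values (125) | 743 | 48.4%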

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1534 | 100.0%

Most frequent character per category

Other Letter: same counts as under Most occurring characters, since every character belongs to this category.

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1534 | 100.0%

Most frequent character per script

Hangul: same counts as under Most occurring characters.

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1534 | 100.0%

Most frequent character per block

Hangul: same counts as under Most occurring characters.

주민1인당자체수입액(천원) (own-source revenue per resident, thousand KRW)
Real number (ℝ)

Distinct: 288
Distinct (%): 90.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1064.6531
Minimum: 220
Maximum: 2976
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.9 KiB

Quantile statistics

Minimum: 220
5-th percentile: 345.75
Q1: 764
Median: 1021.5
Q3: 1290.5
95-th percentile: 2068
Maximum: 2976
Range: 2756
Interquartile range (IQR): 526.5

Descriptive statistics

Standard deviation: 515.28124
Coefficient of variation (CV): 0.48398979
Kurtosis: 1.1969236
Mean: 1064.6531
Median Absolute Deviation (MAD): 264
Skewness: 0.92339612
Sum: 340689
Variance: 265514.75
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
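Several of the statistics listed above for this numeric variable are derived from others, which gives a quick consistency check. The sketch below recomputes the mean, range, IQR, coefficient of variation, and variance from the reported sum, count, quartiles, extremes, and standard deviation. It uses only the rounded figures printed in the report, so small differences in the last digits are expected.

```python
# Reported for this variable (values in thousand KRW per resident).
n_observations = 320
total = 340689
minimum, q1, q3, maximum = 220, 764, 1290.5, 2976
std = 515.28124

mean = total / n_observations      # sum divided by count
value_range = maximum - minimum    # spread between the extremes
iqr = q3 - q1                      # spread of the middle 50%
cv = std / mean                    # dispersion relative to the mean
variance = std ** 2                # square of the standard deviation

print(mean, value_range, iqr)
print(round(cv, 6), round(variance, 1))
```

The recomputed values agree with the report to the precision shown: mean 1064.6531, range 2756, IQR 526.5, CV about 0.484, and variance about 265514.8.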
Common values

Value | Count | Frequency (%)
849 | 3 | 0.9%
902 | 3 | 0.9%
1300 | 2 | 0.6%
1219 | 2 | 0.6%
296 | 2 | 0.6%
340 | 2 | 0.6%
894 | 2 | 0.6%
978 | 2 | 0.6%
1053 | 2 | 0.6%
329 | 2 | 0.6%
Other values (278) | 298 | 93.1%

Minimum 10 values

Value | Count | Frequency (%)
220 | 1 | 0.3%
281 | 1 | 0.3%
289 | 1 | 0.3%
292 | 1 | 0.3%
296 | 2 | 0.6%
307 | 1 | 0.3%
308 | 1 | 0.3%
315 | 1 | 0.3%
325 | 1 | 0.3%
327 | 1 | 0.3%

Maximum 10 values

Value | Count | Frequency (%)
2976 | 1 | 0.3%
2818 | 1 | 0.3%
2704 | 2 | 0.6%
2618 | 1 | 0.3%
2403 | 2 | 0.6%
2383 | 1 | 0.3%
2374 | 1 | 0.3%
2363 | 1 | 0.3%
2308 | 1 | 0.3%
2303 | 1 | 0.3%

지방세금액(천원) (local tax amount, thousand KRW)
Real number (ℝ)

HIGH CORRELATION

Distinct: 315
Distinct (%): 98.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.2168723 × 10^9
Minimum: 8356000
Maximum: 1.0850696 × 10^11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.9 KiB

Quantile statistics

Minimum: 8356000
5-th percentile: 20446108
Q1: 49458456
Median: 1.3389933 × 10^8
Q3: 4.0898676 × 10^8
95-th percentile: 3.9641314 × 10^9
Maximum: 1.0850696 × 10^11
Range: 1.0849861 × 10^11
Interquartile range (IQR): 3.5952831 × 10^8

Descriptive statistics

Standard deviation: 6.7282318 × 10^9
Coefficient of variation (CV): 5.5291191
Kurtosis: 205.59651
Mean: 1.2168723 × 10^9
Median Absolute Deviation (MAD): 1.0000333 × 10^8
Skewness: 13.37066
Sum: 3.8939913 × 10^11
Variance: 4.5269103 × 10^19
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
1666070000 | 2 | 0.6%
146800000 | 2 | 0.6%
284596764 | 2 | 0.6%
825098477 | 2 | 0.6%
198694990 | 2 | 0.6%
4372221602 | 1 | 0.3%
110338216 | 1 | 0.3%
148530813 | 1 | 0.3%
166158823 | 1 | 0.3%
84947110 | 1 | 0.3%
Other values (305) | 305 | 95.3%

Minimum 10 values

Value | Count | Frequency (%)
8356000 | 1 | 0.3%
12241787 | 1 | 0.3%
13801917 | 1 | 0.3%
14084000 | 1 | 0.3%
14699000 | 1 | 0.3%
15370000 | 1 | 0.3%
16712900 | 1 | 0.3%
16749822 | 1 | 0.3%
16854741 | 1 | 0.3%
17923754 | 1 | 0.3%

Maximum 10 values

Value | Count | Frequency (%)
108506961931 | 1 | 0.3%
29338920494 | 1 | 0.3%
25572342593 | 1 | 0.3%
23095574000 | 1 | 0.3%
17144600000 | 1 | 0.3%
16024600000 | 1 | 0.3%
11981763983 | 1 | 0.3%
6061044329 | 1 | 0.3%
6030582423 | 1 | 0.3%
5443672615 | 1 | 0.3%

세외수입액(천원) (non-tax revenue, thousand KRW)
Real number (ℝ)

HIGH CORRELATION

Distinct: 315
Distinct (%): 98.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.3849906 × 10^8
Minimum: 3784171
Maximum: 1.0439536 × 10^10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.9 KiB

Quantile statistics

Minimum: 3784171
5-th percentile: 10109482
Q1: 18941127
Median: 35536726
Q3: 77680794
95-th percentile: 3.965078 × 10^8
Maximum: 1.0439536 × 10^10
Range: 1.0435752 × 10^10
Interquartile range (IQR): 58739667

Descriptive statistics

Standard deviation: 6.3903619 × 10^8
Coefficient of variation (CV): 4.6140109
Kurtosis: 214.26404
Mean: 1.3849906 × 10^8
Median Absolute Deviation (MAD): 20336236
Skewness: 13.692361
Sum: 4.43197 × 10^10
Variance: 4.0836726 × 10^17
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
163570077 | 2 | 0.6%
26451652 | 2 | 0.6%
38387000 | 2 | 0.6%
68639298 | 2 | 0.6%
38097453 | 2 | 0.6%
589888647 | 1 | 0.3%
55295743 | 1 | 0.3%
52947370 | 1 | 0.3%
41609811 | 1 | 0.3%
32867740 | 1 | 0.3%
Other values (305) | 305 | 95.3%

Minimum 10 values

Value | Count | Frequency (%)
3784171 | 1 | 0.3%
5471568 | 1 | 0.3%
6011132 | 1 | 0.3%
6313998 | 1 | 0.3%
7342866 | 1 | 0.3%
7634007 | 1 | 0.3%
7827222 | 1 | 0.3%
8299234 | 1 | 0.3%
8867542 | 1 | 0.3%
9004612 | 1 | 0.3%

Maximum 10 values

Value | Count | Frequency (%)
10439536057 | 1 | 0.3%
2730930474 | 1 | 0.3%
2129551872 | 1 | 0.3%
1967131619 | 1 | 0.3%
1802338069 | 1 | 0.3%
1565131956 | 1 | 0.3%
833741219 | 1 | 0.3%
669048286 | 1 | 0.3%
589888647 | 1 | 0.3%
587433131 | 1 | 0.3%

인구수(명) (population, persons)
Real number (ℝ)

HIGH CORRELATION

Distinct: 297
Distinct (%): 92.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 885236.25
Minimum: 8867
Maximum: 51638809
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.9 KiB

Quantile statistics

Minimum: 8867
5-th percentile: 27894.1
Q1: 74429.5
Median: 233589.5
Q3: 524878.25
95-th percentile: 2852615.3
Maximum: 51638809
Range: 51629942
Interquartile range (IQR): 450448.75

Descriptive statistics

Standard deviation: 3374101
Coefficient of variation (CV): 3.811526
Kurtosis: 163.43402
Mean: 885236.25
Median Absolute Deviation (MAD): 183112
Skewness: 11.559738
Sum: 2.832756 × 10^8
Variance: 1.1384557 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
9509458 | 3 | 0.9%
1452251 | 3 | 0.9%
1441611 | 3 | 0.9%
1786855 | 2 | 0.6%
1538492 | 2 | 0.6%
2119257 | 2 | 0.6%
2385412 | 2 | 0.6%
262451 | 2 | 0.6%
2948375 | 2 | 0.6%
177125 | 2 | 0.6%
Other values (287) | 297 | 92.8%

Minimum 10 values

Value | Count | Frequency (%)
8867 | 1 | 0.3%
16320 | 1 | 0.3%
20342 | 1 | 0.3%
21695 | 1 | 0.3%
21748 | 1 | 0.3%
22945 | 1 | 0.3%
23748 | 1 | 0.3%
24195 | 1 | 0.3%
24539 | 1 | 0.3%
24987 | 1 | 0.3%

Maximum 10 values

Value | Count | Frequency (%)
51638809 | 1 | 0.3%
13589432 | 1 | 0.3%
13565450 | 2 | 0.6%
13339235 | 1 | 0.3%
9509458 | 3 | 0.9%
3350380 | 2 | 0.6%
3314183 | 2 | 0.6%
3173255 | 1 | 0.3%
2948375 | 2 | 0.6%
2858340 | 1 | 0.3%

Interactions

[Figures: 16 pairwise interaction plots; images not preserved]

Correlations

Variable | 회계연도 | 시군명 | 주민1인당자체수입액(천원) | 지방세금액(천원) | 세외수입액(천원) | 인구수(명)
회계연도 | 1.000 | NaN | 0.099 | 0.000 | 0.000 | 0.000
시군명 | NaN | 1.000 | 1.000 | 1.000 | NaN | 1.000
주민1인당자체수입액(천원) | 0.099 | 1.000 | 1.000 | 0.625 | 0.622 | 0.620
지방세금액(천원) | 0.000 | 1.000 | 0.625 | 1.000 | 0.970 | 0.991
세외수입액(천원) | 0.000 | NaN | 0.622 | 0.970 | 1.000 | 0.973
인구수(명) | 0.000 | 1.000 | 0.620 | 0.991 | 0.973 | 1.000

Variable | 주민1인당자체수입액(천원) | 지방세금액(천원) | 세외수입액(천원) | 인구수(명) | 회계연도
주민1인당자체수입액(천원) | 1.000 | 0.201 | 0.180 | -0.083 | 0.077
지방세금액(천원) | 0.201 | 1.000 | 0.916 | 0.927 | 0.000
세외수입액(천원) | 0.180 | 0.916 | 1.000 | 0.898 | 0.000
인구수(명) | -0.083 | 0.927 | 0.898 | 1.000 | 0.000
회계연도 | 0.077 | 0.000 | 0.000 | 0.000 | 1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

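First rows

Index | 회계연도 | 시군명 | 자치단체명 | 주민1인당자체수입액(천원) | 지방세금액(천원) | 세외수입액(천원) | 인구수(명)
0 | 2023 | 가평군 | 경기가평군 | 1526 | 77456000 | 17376889 | 62150
1 | 2023 | 경기도 | 경기본청 | 1222 | 16024600000 | 587433131 | 13589432
2 | 2023 | 고양시 | 경기고양시 | 779 | 715000000 | 123413280 | 1076535
3 | 2023 | 과천시 | 경기과천시 | 2192 | 137460000 | 33782170 | 78137
4 | 2023 | 광명시 | 경기광명시 | 1108 | 265826000 | 53086127 | 287945
5 | 2023 | 광주시 | 경기광주시 | 1021 | 334058000 | 65424886 | 391462
6 | 2023 | 구리시 | 경기구리시 | 898 | 130550000 | 38861184 | 188701
7 | 2023 | 군포시 | 경기군포시 | 889 | 212403000 | 24126322 | 266213
8 | 2023 | 김포시 | 경기김포시 | 1079 | 420200000 | 102369888 | 484267
9 | 2023 | 남양주시 | 경기남양주시 | 793 | 487100000 | 97256047 | 737353

Last rows

Index | 회계연도 | 시군명 | 자치단체명 | 주민1인당자체수입액(천원) | 지방세금액(천원) | 세외수입액(천원) | 인구수(명)
310 | 2022 | <NA> | 충북청주시 | 849 | 615186601 | 105171619 | 848482
311 | 2022 | <NA> | 충북충주시 | 876 | 151373000 | 31942912 | 209358
312 | 2022 | <NA> | 충북제천시 | 872 | 89267340 | 25425306 | 131591
313 | 2022 | <NA> | 충청북도군계 | 1318 | 421064384 | 116608078 | 407996
314 | 2022 | <NA> | 충북보은군 | 1185 | 28489947 | 9291241 | 31878
315 | 2022 | <NA> | 충북옥천군 | 1068 | 39582451 | 13921916 | 50093
316 | 2022 | <NA> | 충북영동군 | 940 | 30631482 | 12399495 | 45773
317 | 2022 | <NA> | 충북증평군 | 928 | 28330000 | 5471568 | 36426
318 | 2022 | <NA> | 충북진천군 | 1638 | 119128002 | 20378818 | 85176
319 | 2022 | <NA> | 충북괴산군 | 1006 | 26126737 | 12223890 | 38122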
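The sample rows run the four numeric fields together, so any split into columns is an interpretation. Reading each row as per-resident revenue, local tax, non-tax revenue, and population gives values that appear consistent with per-resident revenue ≈ (local tax + non-tax revenue) / population. The report does not state this relation; it is an assumption drawn from the dataset description, and the sketch below only checks it on four of the sample rows.

```python
# (local government, reported per-resident revenue, local tax,
#  non-tax revenue, population), read from the sample rows above.
rows = [
    ("경기가평군", 1526, 77_456_000, 17_376_889, 62_150),
    ("경기고양시", 779, 715_000_000, 123_413_280, 1_076_535),
    ("충북청주시", 849, 615_186_601, 105_171_619, 848_482),
    ("충북괴산군", 1006, 26_126_737, 12_223_890, 38_122),
]

derived_values = []
for name, reported, local_tax, non_tax, population in rows:
    # Assumed relation: own-source revenue is local tax plus non-tax revenue.
    derived = (local_tax + non_tax) / population
    derived_values.append(derived)
    print(f"{name}: reported {reported}, derived {derived:.1f}")
```

For these rows the derived figure is within one unit of the reported value. That supports the column split used here but does not confirm how the source agency rounds or defines the indicator.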