Overview

Dataset statistics

Number of variables: 6
Number of observations: 275
Missing cells: 243
Missing cells (%): 14.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 14.1 KiB
Average record size in memory: 52.5 B
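
The percentages and averages in this block can be re-derived from the raw counts. The sketch below is only a plausibility check on the reported figures; it assumes that the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes.

```python
# Figures copied from the "Dataset statistics" block of the report.
n_variables = 6
n_observations = 275
missing_cells = 243
total_size_kib = 14.1

# Assumption: the percentage is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Both derived values round to the figures shown in the report (14.7% and 52.5 B).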

Variable types

Categorical: 1
Text: 2
Numeric: 3

Alerts

연말지출원인행위액(원) is highly overall correlated with 세출결산액(원) [High correlation]
세출결산액(원) is highly overall correlated with 연말지출원인행위액(원) [High correlation]
시군명 has 243 (88.4%) missing values [Missing]
연말지출원인행위액(원) has unique values [Unique]
세출결산액(원) has unique values [Unique]

Reproduction

Analysis started: 2023-12-10 22:25:13.891186
Analysis finished: 2023-12-10 22:25:15.041951
Duration: 1.15 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
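
As a quick check, the reported duration follows from the two timestamps. This is a minimal sketch using only the standard library; it assumes both timestamps are in the same time zone.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2023-12-10 22:25:13.891186")
finished = datetime.fromisoformat("2023-12-10 22:25:15.041951")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```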

Variables

회계연도 (fiscal year)
Categorical

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Value  Count
2020  243
2021  32

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%
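
The report distinguishes "Distinct" from "Unique". The sketch below assumes that "Distinct" counts the different values present and "Unique" counts the values that occur exactly once, which is consistent with the figures for 회계연도 (2 distinct, 0 unique). The helper name `distinct_and_unique` is hypothetical.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Column rebuilt from the reported counts: 243 rows for 2020, 32 rows for 2021.
fiscal_year = ["2020"] * 243 + ["2021"] * 32

distinct, unique = distinct_and_unique(fiscal_year)
distinct_pct = 100 * distinct / len(fiscal_year)
print(distinct, unique, f"{distinct_pct:.1f}%")
```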

Sample

1st row: 2021
2nd row: 2021
3rd row: 2021
4th row: 2021
5th row: 2021

Common Values

Value  Count  Frequency (%)
2020  243  88.4%
2021  32  11.6%

Length

Histogram of lengths of the category

시군명 (city/county name)
Text

MISSING

Distinct: 32
Distinct (%): 100.0%
Missing: 243
Missing (%): 88.4%
Memory size: 2.3 KiB

Length

Max length: 4
Median length: 3
Mean length: 3.09375
Min length: 3

Characters and Unicode

Total characters: 99
Distinct characters: 41
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
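
To illustrate the kind of character properties meant here, the sketch below looks up the general category of each character in the five sample values of 시군명 using Python's standard `unicodedata` module. That module does not expose the script property directly, so the first word of the character name is used as a rough stand-in for the script; that shortcut is an assumption of this example, not how the report computes it.

```python
import unicodedata
from collections import Counter

# The five sample values listed for 시군명.
sample = ["가평군", "경기도", "고양시", "과천시", "광명시"]
chars = "".join(sample)

# General category per character ("Lo" is "Letter, other").
categories = Counter(unicodedata.category(ch) for ch in chars)

# Rough script proxy: first word of the Unicode character name,
# e.g. "HANGUL SYLLABLE GA" -> "HANGUL".
scripts = Counter(unicodedata.name(ch).split()[0] for ch in chars)

print(len(chars), dict(categories), dict(scripts))
```

Every character in the sample falls into the single category "Lo" and the Hangul script, matching the one category, one script, and one block reported for the column.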

Unique

Unique: 32
Unique (%): 100.0%

Sample

1st row: 가평군
2nd row: 경기도
3rd row: 고양시
4th row: 과천시
5th row: 광명시

Value  Count  Frequency (%)
경기도  1  3.1%
고양시  1  3.1%
화성시  1  3.1%
하남시  1  3.1%
포천시  1  3.1%
평택시  1  3.1%
파주시  1  3.1%
이천시  1  3.1%
의정부시  1  3.1%
의왕시  1  3.1%
Other values (22)  22  68.8%

Most occurring characters

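Value  Count  Frequency (%)
(not captured)  29  29.3%
(not captured)  6  6.1%
(not captured)  5  5.1%
(not captured)  5  5.1%
(not captured)  4  4.0%
(not captured)  3  3.0%
(not captured)  3  3.0%
(not captured)  3  3.0%
(not captured)  3  3.0%
(not captured)  3  3.0%
Other values (31)  35  35.4%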

Most occurring categories

Value  Count  Frequency (%)
Other Letter  99  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  99  100.0%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  99  100.0%


자치단체명 (local government name)
Text

Distinct: 243
Distinct (%): 88.4%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Length

Max length: 6
Median length: 5
Mean length: 4.8872727
Min length: 4

Characters and Unicode

Total characters: 1344
Distinct characters: 133
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 211
Unique (%): 76.7%

Sample

1st row: 경기가평군
2nd row: 경기본청
3rd row: 경기고양시
4th row: 경기과천시
5th row: 경기광명시

Value  Count  Frequency (%)
경기가평군  2  0.7%
경기평택시  2  0.7%
경기안성시  2  0.7%
경기여주시  2  0.7%
경기용인시  2  0.7%
경기연천군  2  0.7%
경기양평군  2  0.7%
경기의왕시  2  0.7%
경기하남시  2  0.7%
경기이천시  2  0.7%
Other values (233)  255  92.7%

Most occurring characters

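Value  Count  Frequency (%)
(not captured)  110  8.2%
(not captured)  105  7.8%
(not captured)  89  6.6%
(not captured)  84  6.2%
(not captured)  73  5.4%
(not captured)  65  4.8%
(not captured)  57  4.2%
(not captured)  45  3.3%
(not captured)  41  3.1%
(not captured)  39  2.9%
Other values (123)  636  47.3%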

Most occurring categories

Value  Count  Frequency (%)
Other Letter  1344  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  1344  100.0%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  1344  100.0%

연말지출원인행위액(원) (year-end expenditure commitment amount, KRW)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 275
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.6090087 × 10^10
Minimum: 2.3960806 × 10^9
Maximum: 7.8923815 × 10^10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.5 KiB

Quantile statistics

Minimum: 2.3960806 × 10^9
5-th percentile: 7.1811244 × 10^9
Q1: 1.5482909 × 10^10
Median: 2.3459045 × 10^10
Q3: 3.4833847 × 10^10
95-th percentile: 5.1833609 × 10^10
Maximum: 7.8923815 × 10^10
Range: 7.6527734 × 10^10
Interquartile range (IQR): 1.9350938 × 10^10

Descriptive statistics

Standard deviation: 1.4448841 × 10^10
Coefficient of variation (CV): 0.55380577
Kurtosis: 0.8427253
Mean: 2.6090087 × 10^10
Median Absolute Deviation (MAD): 9.6706011 × 10^9
Skewness: 0.90985952
Sum: 7.1747739 × 10^12
Variance: 2.0876899 × 10^20
Monotonicity: Not monotonic
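
Several of the statistics above are simple functions of one another, which makes it easy to confirm that the exponents have been read correctly. The sketch below recomputes the derived figures for 연말지출원인행위액(원) from the rounded values printed in the report, so small differences in the last digits are expected.

```python
import math

# Rounded values copied from the report for 연말지출원인행위액(원).
n = 275
mean = 2.6090087e10
std = 1.4448841e10
q1, q3 = 1.5482909e10, 3.4833847e10
minimum, maximum = 2.3960806e9, 7.8923815e10

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
total = mean * n                 # sum of all observations

print(f"CV       {cv:.6f}")
print(f"Variance {variance:.7e}")
print(f"IQR      {iqr:.7e}")
print(f"Range    {value_range:.7e}")
print(f"Sum      {total:.7e}")

# Each derived value agrees with the reported one to within rounding.
assert math.isclose(cv, 0.55380577, rel_tol=1e-6)
assert math.isclose(variance, 2.0876899e20, rel_tol=1e-6)
```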
Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
32747031863  1  0.4%
35884931629  1  0.4%
24588306590  1  0.4%
36573381150  1  0.4%
34253609988  1  0.4%
47414172998  1  0.4%
48912923565  1  0.4%
45595619884  1  0.4%
37935269486  1  0.4%
26216001940  1  0.4%
Other values (265)  265  96.4%

Minimum 10 values

Value  Count  Frequency (%)
2396080650  1  0.4%
3529341090  1  0.4%
4261834170  1  0.4%
4443855109  1  0.4%
5086022390  1  0.4%
5237512200  1  0.4%
5470971170  1  0.4%
5488295230  1  0.4%
5825297220  1  0.4%
6539109590  1  0.4%

Maximum 10 values

Value  Count  Frequency (%)
78923814920  1  0.4%
77623649385  1  0.4%
72852887240  1  0.4%
63785617050  1  0.4%
63620665700  1  0.4%
62501680850  1  0.4%
62055891390  1  0.4%
61233985570  1  0.4%
60935271450  1  0.4%
57953927367  1  0.4%

세출결산액(원) (settled expenditure amount, KRW)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 275
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.2326084 × 10^11
Minimum: 1.8556059 × 10^10
Maximum: 3.4015388 × 10^11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.5 KiB

Quantile statistics

Minimum: 1.8556059 × 10^10
5-th percentile: 4.0095881 × 10^10
Q1: 7.0999161 × 10^10
Median: 1.0962682 × 10^11
Q3: 1.7086319 × 10^11
95-th percentile: 2.3753145 × 10^11
Maximum: 3.4015388 × 10^11
Range: 3.2159782 × 10^11
Interquartile range (IQR): 9.9864031 × 10^10

Descriptive statistics

Standard deviation: 6.4838839 × 10^10
Coefficient of variation (CV): 0.5260295
Kurtosis: 0.02538837
Mean: 1.2326084 × 10^11
Median Absolute Deviation (MAD): 4.8169396 × 10^10
Skewness: 0.69190213
Sum: 3.3896731 × 10^13
Variance: 4.204075 × 10^21
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
144416699323  1  0.4%
205354939799  1  0.4%
96422896320  1  0.4%
135938031440  1  0.4%
173230739623  1  0.4%
227859801156  1  0.4%
210857160425  1  0.4%
180839731316  1  0.4%
213574776921  1  0.4%
102228630260  1  0.4%
Other values (265)  265  96.4%

Minimum 10 values

Value  Count  Frequency (%)
18556058950  1  0.4%
25432125370  1  0.4%
29017031173  1  0.4%
29437677730  1  0.4%
31134985900  1  0.4%
33642574610  1  0.4%
34112297640  1  0.4%
35597629160  1  0.4%
35864205560  1  0.4%
36272989893  1  0.4%

Maximum 10 values

Value  Count  Frequency (%)
340153882840  1  0.4%
321552849656  1  0.4%
295532748210  1  0.4%
292356241516  1  0.4%
290480259623  1  0.4%
287105305006  1  0.4%
285425440820  1  0.4%
266217296572  1  0.4%
261148994718  1  0.4%
260744949230  1  0.4%

연말지출비율(%) (year-end expenditure ratio, %)
Real number (ℝ)

Distinct: 265
Distinct (%): 96.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.147818
Minimum: 2.4
Maximum: 69.51
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.5 KiB

Quantile statistics

Minimum: 2.4
5-th percentile: 11.128
Q1: 16.185
Median: 20.51
Q3: 27.075
95-th percentile: 45.699
Maximum: 69.51
Range: 67.11
Interquartile range (IQR): 10.89

Descriptive statistics

Standard deviation: 11.143818
Coefficient of variation (CV): 0.4814198
Kurtosis: 3.4407623
Mean: 23.147818
Median Absolute Deviation (MAD): 4.99
Skewness: 1.5753488
Sum: 6365.65
Variance: 124.18468
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
20.33  2  0.7%
22.86  2  0.7%
23.72  2  0.7%
17.79  2  0.7%
17.21  2  0.7%
17.86  2  0.7%
34.65  2  0.7%
14.7  2  0.7%
13.17  2  0.7%
29.58  2  0.7%
Other values (255)  255  92.7%

Minimum 10 values

Value  Count  Frequency (%)
2.4  1  0.4%
2.66  1  0.4%
4.45  1  0.4%
4.63  1  0.4%
5.52  1  0.4%
7.61  1  0.4%
7.65  1  0.4%
8.04  1  0.4%
8.63  1  0.4%
8.79  1  0.4%

Maximum 10 values

Value  Count  Frequency (%)
69.51  1  0.4%
67.36  1  0.4%
66.81  1  0.4%
65.15  1  0.4%
65.05  1  0.4%
56.93  1  0.4%
51.62  1  0.4%
50.28  1  0.4%
49.46  1  0.4%
49.44  1  0.4%

Interactions

[Figures: nine interaction plots (images not available)]

Correlations

Variable  회계연도  시군명  연말지출원인행위액(원)  세출결산액(원)  연말지출비율(%)
회계연도  1.000  NaN  0.325  0.000  0.146
시군명  NaN  1.000  1.000  1.000  1.000
연말지출원인행위액(원)  0.325  1.000  1.000  0.768  0.495
세출결산액(원)  0.000  1.000  0.768  1.000  0.189
연말지출비율(%)  0.146  1.000  0.495  0.189  1.000

Variable  연말지출원인행위액(원)  세출결산액(원)  연말지출비율(%)  회계연도
연말지출원인행위액(원)  1.000  0.706  0.405  0.245
세출결산액(원)  0.706  1.000  -0.297  0.000
연말지출비율(%)  0.405  -0.297  1.000  0.110
회계연도  0.245  0.000  0.110  1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Row  회계연도  시군명  자치단체명  연말지출원인행위액(원)  세출결산액(원)  연말지출비율(%)
0  2021  가평군  경기가평군  32747031863  144416699323  22.68
1  2021  경기도  경기본청  10533869180  43771011260  24.07
2  2021  고양시  경기고양시  57869800010  83249070490  69.51
3  2021  과천시  경기과천시  12663638170  52656408650  24.05
4  2021  광명시  경기광명시  9655233190  40197915950  24.02
5  2021  광주시  경기광주시  15303635480  98193305860  15.59
6  2021  구리시  경기구리시  20083302580  84211432290  23.85
7  2021  군포시  경기군포시  16722892310  54798436960  30.52
8  2021  김포시  경기김포시  63785617050  200067824040  31.88
9  2021  남양주시  경기남양주시  33129646240  200276492320  16.54

Last rows

Row  회계연도  시군명  자치단체명  연말지출원인행위액(원)  세출결산액(원)  연말지출비율(%)
265  2020  <NA>  서울용산구  20323664490  48970746420  41.5
266  2020  <NA>  서울성동구  26945363130  53589573080  50.28
267  2020  <NA>  서울광진구  9077408939  37415447800  24.26
268  2020  <NA>  서울동대문구  29371532116  51593531930  56.93
269  2020  <NA>  서울중랑구  36066394180  92394151970  39.04
270  2020  <NA>  서울성북구  28443681870  69208756280  41.1
271  2020  <NA>  서울강북구  35885665850  53715761870  66.81
272  2020  <NA>  서울도봉구  41146256842  86777147825  47.42
273  2020  <NA>  서울노원구  32404291480  128028492840  25.31
274  2020  <NA>  서울은평구  33189244170  69770012280  47.57
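
In the sample rows, 연말지출비율(%) appears to equal 연말지출원인행위액(원) divided by 세출결산액(원), expressed as a percentage. The report does not state this relationship, so the sketch below treats it as an assumption and checks it against a handful of the sample rows; the helper name `ratio_pct` is hypothetical.

```python
# (자치단체명, 연말지출원인행위액(원), 세출결산액(원), reported 연말지출비율(%))
rows = [
    ("경기가평군", 32747031863, 144416699323, 22.68),
    ("경기본청", 10533869180, 43771011260, 24.07),
    ("경기고양시", 57869800010, 83249070490, 69.51),
    ("서울용산구", 20323664490, 48970746420, 41.5),
    ("서울은평구", 33189244170, 69770012280, 47.57),
]

def ratio_pct(commitment, settlement):
    """Assumed definition: commitment amount as a percentage of settled expenditure."""
    return 100 * commitment / settlement

checks = [(name, ratio_pct(c, s), reported) for name, c, s, reported in rows]
for name, computed, reported in checks:
    print(f"{name}: computed {computed:.2f}, reported {reported}")
```

For these rows the computed and reported percentages agree to two decimal places, which supports the assumed definition without proving it for the full dataset.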