Overview

Dataset statistics

Number of variables: 6
Number of observations: 275
Missing cells: 243
Missing cells (%): 14.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 14.1 KiB
Average record size in memory: 52.5 B
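
The derived figures above follow from the raw counts. A minimal sketch, assuming the percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; the variable names are illustrative only:

```python
# Figures copied from the dataset statistics in this report.
n_variables = 6
n_observations = 275
missing_cells = 243
total_size_kib = 14.1

# Missing cells as a share of every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size: total memory divided by the number of rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{missing_cells}/{total_cells} cells missing = {missing_pct:.1f}%")
print(f"average record size = {avg_record_bytes:.1f} B")
```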

Variable types

Categorical: 1
Text: 2
Numeric: 3

Alerts

업무추진비(원) is highly overall correlated with 세출결산액(원) (High correlation)
세출결산액(원) is highly overall correlated with 업무추진비(원) and 1 other field (High correlation)
업무추진비비율(%) is highly overall correlated with 세출결산액(원) (High correlation)
시군명 has 243 (88.4%) missing values (Missing)
업무추진비(원) has unique values (Unique)
세출결산액(원) has unique values (Unique)
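
The 243 missing 시군명 values equal the number of fiscal-year-2020 rows, which suggests (but does not prove) that the column was only populated for 2021. One possible repair, sketched here under that assumption, is to learn a 자치단체명 to 시군명 lookup from the rows that have both and apply it to the rows that do not. The first three rows are taken from the report's sample; the last row and the function name `fill_missing_city` are hypothetical.

```python
from typing import List, Optional, Tuple

Row = Tuple[int, Optional[str], str]  # (회계연도, 시군명, 자치단체명)

rows: List[Row] = [
    (2021, "광명시", "경기광명시"),
    (2021, "경기도", "경기본청"),
    (2020, None, "경기광명시"),
    (2020, None, "예시자치단체"),  # hypothetical name with no 2021 counterpart
]

def fill_missing_city(data: List[Row]) -> List[Row]:
    """Fill 시군명 from a lookup built on rows where it is present."""
    lookup = {org: city for _, city, org in data if city is not None}
    return [
        (year, city if city is not None else lookup.get(org), org)
        for year, city, org in data
    ]

filled = fill_missing_city(rows)
for row in filled:
    print(row)
```

Rows whose 자치단체명 never appears with a 시군명 stay empty, so this only recovers names that exist in both years.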

Reproduction

Analysis started: 2023-12-10 22:04:01.356354
Analysis finished: 2023-12-10 22:04:02.438289
Duration: 1.08 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
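
The reported duration can be checked from the two timestamps. A small sketch using only the standard library; it assumes both timestamps share the same time zone:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-10 22:04:01.356354")
finished = datetime.fromisoformat("2023-12-10 22:04:02.438289")

duration = (finished - started).total_seconds()
print(f"duration = {duration:.2f} seconds")
```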

Variables

회계연도 (fiscal year)
Categorical

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Value  Count
2020   243
2021   32

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021
2nd row: 2021
3rd row: 2021
4th row: 2021
5th row: 2021

Common Values

Value  Count  Frequency (%)
2020   243    88.4%
2021   32     11.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
2020   243    88.4%
2021   32     11.6%

시군명 (city/county name)
Text

MISSING

Distinct: 32
Distinct (%): 100.0%
Missing: 243
Missing (%): 88.4%
Memory size: 2.3 KiB

Length

Max length: 4
Median length: 3
Mean length: 3.09375
Min length: 3

Characters and Unicode

Total characters: 99
Distinct characters: 41
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 32
Unique (%): 100.0%

Sample

1st row: 가평군
2nd row: 경기도
3rd row: 고양시
4th row: 과천시
5th row: 광명시

Value              Count  Frequency (%)
경기도             1      3.1%
고양시             1      3.1%
화성시             1      3.1%
하남시             1      3.1%
포천시             1      3.1%
평택시             1      3.1%
파주시             1      3.1%
이천시             1      3.1%
의정부시           1      3.1%
의왕시             1      3.1%
Other values (22)  22     68.8%

Most occurring characters

Value              Count  Frequency (%)
(lost)             29     29.3%
(lost)             6      6.1%
(lost)             5      5.1%
(lost)             5      5.1%
(lost)             4      4.0%
(lost)             3      3.0%
(lost)             3      3.0%
(lost)             3      3.0%
(lost)             3      3.0%
(lost)             3      3.0%
Other values (31)  35     35.4%

Most occurring categories

Value         Count  Frequency (%)
Other Letter  99     100.0%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
(lost)             29     29.3%
(lost)             6      6.1%
(lost)             5      5.1%
(lost)             5      5.1%
(lost)             4      4.0%
(lost)             3      3.0%
(lost)             3      3.0%
(lost)             3      3.0%
(lost)             3      3.0%
(lost)             3      3.0%
Other values (31)  35     35.4%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  99     100.0%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
(lost)             29     29.3%
(lost)             6      6.1%
(lost)             5      5.1%
(lost)             5      5.1%
(lost)             4      4.0%
(lost)             3      3.0%
(lost)             3      3.0%
(lost)             3      3.0%
(lost)             3      3.0%
(lost)             3      3.0%
Other values (31)  35     35.4%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  99     100.0%

Most frequent character per block

Hangul
Value              Count  Frequency (%)
(lost)             29     29.3%
(lost)             6      6.1%
(lost)             5      5.1%
(lost)             5      5.1%
(lost)             4      4.0%
(lost)             3      3.0%
(lost)             3      3.0%
(lost)             3      3.0%
(lost)             3      3.0%
(lost)             3      3.0%
Other values (31)  35     35.4%

자치단체명 (local government name)
Text

Distinct: 243
Distinct (%): 88.4%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Length

Max length: 6
Median length: 5
Mean length: 4.8872727
Min length: 4

Characters and Unicode

Total characters: 1344
Distinct characters: 133
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 211
Unique (%): 76.7%

Sample

1st row: 경기가평군
2nd row: 경기본청
3rd row: 경기고양시
4th row: 경기과천시
5th row: 경기광명시

Value               Count  Frequency (%)
경기가평군          2      0.7%
경기평택시          2      0.7%
경기안성시          2      0.7%
경기여주시          2      0.7%
경기용인시          2      0.7%
경기연천군          2      0.7%
경기양평군          2      0.7%
경기의왕시          2      0.7%
경기하남시          2      0.7%
경기이천시          2      0.7%
Other values (233)  255    92.7%

Most occurring characters

Value               Count  Frequency (%)
(lost)              110    8.2%
(lost)              105    7.8%
(lost)              89     6.6%
(lost)              84     6.2%
(lost)              73     5.4%
(lost)              65     4.8%
(lost)              57     4.2%
(lost)              45     3.3%
(lost)              41     3.1%
(lost)              39     2.9%
Other values (123)  636    47.3%

Most occurring categories

Value         Count  Frequency (%)
Other Letter  1344   100.0%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
(lost)              110    8.2%
(lost)              105    7.8%
(lost)              89     6.6%
(lost)              84     6.2%
(lost)              73     5.4%
(lost)              65     4.8%
(lost)              57     4.2%
(lost)              45     3.3%
(lost)              41     3.1%
(lost)              39     2.9%
Other values (123)  636    47.3%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  1344   100.0%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
(lost)              110    8.2%
(lost)              105    7.8%
(lost)              89     6.6%
(lost)              84     6.2%
(lost)              73     5.4%
(lost)              65     4.8%
(lost)              57     4.2%
(lost)              45     3.3%
(lost)              41     3.1%
(lost)              39     2.9%
Other values (123)  636    47.3%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  1344   100.0%

Most frequent character per block

Hangul
Value               Count  Frequency (%)
(lost)              110    8.2%
(lost)              105    7.8%
(lost)              89     6.6%
(lost)              84     6.2%
(lost)              73     5.4%
(lost)              65     4.8%
(lost)              57     4.2%
(lost)              45     3.3%
(lost)              41     3.1%
(lost)              39     2.9%
Other values (123)  636    47.3%

업무추진비(원) (business promotion expenses, KRW)
Real number (ℝ)

HIGH CORRELATION  UNIQUE

Distinct: 275
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.1319184 × 10^8
Minimum: 54020080
Maximum: 5.3112305 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.5 KiB

Quantile statistics

Minimum: 54020080
5th percentile: 2.5909935 × 10^8
Q1: 3.3976227 × 10^8
Median: 4.6398687 × 10^8
Q3: 8.5662705 × 10^8
95th percentile: 1.8308898 × 10^9
Maximum: 5.3112305 × 10^9
Range: 5.2572104 × 10^9
Interquartile range (IQR): 5.1686478 × 10^8

Descriptive statistics

Standard deviation: 6.2106264 × 10^8
Coefficient of variation (CV): 0.87082129
Kurtosis: 14.24689
Mean: 7.1319184 × 10^8
Median Absolute Deviation (MAD): 1.4400721 × 10^8
Skewness: 3.0372197
Sum: 1.9612776 × 10^11
Variance: 3.857188 × 10^17
Monotonicity: Not monotonic
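
Several of the statistics for 업무추진비(원) are simple functions of the others, which makes them a useful consistency check. A minimal sketch using the values as printed in this report (so results agree only to the printed precision); the variable names are illustrative:

```python
import math

# Values for 업무추진비(원) as printed in the report.
n = 275
mean = 7.1319184e8
std = 6.2106264e8
minimum, maximum = 54020080, 5.3112305e9
q1, q3 = 3.3976227e8, 8.5662705e8

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # IQR = Q3 - Q1
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance = standard deviation squared
total = mean * n                 # Sum = Mean x number of observations

print(f"range={value_range:.7e} iqr={iqr:.7e} cv={cv:.8f}")
print(f"variance={variance:.6e} sum={total:.7e}")
```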
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
357338030           1      0.4%
1323855730          1      0.4%
1767640690          1      0.4%
1165489960          1      0.4%
1352695800          1      0.4%
1617293820          1      0.4%
1327752020          1      0.4%
1196480155          1      0.4%
1395417619          1      0.4%
1155859520          1      0.4%
Other values (265)  265    96.4%

Minimum 10 values

Value      Count  Frequency (%)
54020080   1      0.4%
183854360  1      0.4%
232664890  1      0.4%
235080630  1      0.4%
235788000  1      0.4%
242968930  1      0.4%
244098430  1      0.4%
244986060  1      0.4%
246221304  1      0.4%
249098050  1      0.4%

Maximum 10 values

Value       Count  Frequency (%)
5311230523  1      0.4%
3884853308  1      0.4%
3564171961  1      0.4%
2780697874  1      0.4%
2367045165  1      0.4%
2275630571  1      0.4%
2242330565  1      0.4%
2207385698  1      0.4%
2056452000  1      0.4%
2047865343  1      0.4%

세출결산액(원) (settled expenditure, KRW)
Real number (ℝ)

HIGH CORRELATION  UNIQUE

Distinct: 275
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.589749 × 10^12
Minimum: 1.9776779 × 10^11
Maximum: 3.5414408 × 10^13
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.5 KiB

Quantile statistics

Minimum: 1.9776779 × 10^11
5th percentile: 3.5484197 × 10^11
Q1: 5.2326295 × 10^11
Median: 7.738949 × 10^11
Q3: 1.214491 × 10^12
95th percentile: 5.2740029 × 10^12
Maximum: 3.5414408 × 10^13
Range: 3.521664 × 10^13
Interquartile range (IQR): 6.9122808 × 10^11

Descriptive statistics

Standard deviation: 3.6604453 × 10^12
Coefficient of variation (CV): 2.3025304
Kurtosis: 56.592463
Mean: 1.589749 × 10^12
Median Absolute Deviation (MAD): 2.7304387 × 10^11
Skewness: 7.0752606
Sum: 4.3718097 × 10^14
Variance: 1.339886 × 10^25
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
627701847937        1      0.4%
772718018860        1      0.4%
869476230669        1      0.4%
675105780094        1      0.4%
958325312802        1      0.4%
1262748705170       1      0.4%
915974340913        1      0.4%
818941892160        1      0.4%
1070007759745       1      0.4%
725348251920        1      0.4%
Other values (265)  265    96.4%

Minimum 10 values

Value         Count  Frequency (%)
197767788320  1      0.4%
206093632978  1      0.4%
220834492812  1      0.4%
241683889120  1      0.4%
271540996093  1      0.4%
314267279691  1      0.4%
328353120047  1      0.4%
336872668680  1      0.4%
341956394407  1      0.4%
343712572400  1      0.4%

Maximum 10 values

Value           Count  Frequency (%)
35414408138443  1      0.4%
31952141756841  1      0.4%
30265897238043  1      0.4%
11495800974570  1      0.4%
10551719887982  1      0.4%
10360701614267  1      0.4%
8919629759110   1      0.4%
8894496711282   1      0.4%
8251316785631   1      0.4%
7990852530568   1      0.4%

업무추진비비율(%) (business promotion expense ratio, %)
Real number (ℝ)

HIGH CORRELATION

Distinct: 20
Distinct (%): 7.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.069818182
Minimum: 0.01
Maximum: 0.28
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.5 KiB

Quantile statistics

Minimum: 0.01
5th percentile: 0.03
Q1: 0.05
Median: 0.06
Q3: 0.08
95th percentile: 0.16
Maximum: 0.28
Range: 0.27
Interquartile range (IQR): 0.03

Descriptive statistics

Standard deviation: 0.036811361
Coefficient of variation (CV): 0.52724606
Kurtosis: 5.6142394
Mean: 0.069818182
Median Absolute Deviation (MAD): 0.01
Skewness: 2.0425167
Sum: 19.2
Variance: 0.0013550763
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=20)

Common values

Value              Count  Frequency (%)
0.06               58     21.1%
0.05               53     19.3%
0.07               45     16.4%
0.04               28     10.2%
0.08               23     8.4%
0.09               13     4.7%
0.03               10     3.6%
0.16               8      2.9%
0.02               6      2.2%
0.1                6      2.2%
Other values (10)  25     9.1%

Minimum 10 values

Value  Count  Frequency (%)
0.01   3      1.1%
0.02   6      2.2%
0.03   10     3.6%
0.04   28     10.2%
0.05   53     19.3%
0.06   58     21.1%
0.07   45     16.4%
0.08   23     8.4%
0.09   13     4.7%
0.1    6      2.2%

Maximum 10 values

Value  Count  Frequency (%)
0.28   1      0.4%
0.2    2      0.7%
0.19   1      0.4%
0.18   2      0.7%
0.17   3      1.1%
0.16   8      2.9%
0.15   3      1.1%
0.14   3      1.1%
0.13   4      1.5%
0.11   3      1.1%

Interactions

[Figures: 9 interaction plots]

Correlations

                   회계연도  시군명  업무추진비(원)  세출결산액(원)  업무추진비비율(%)
회계연도           1.000     NaN     0.360           0.000           0.000
시군명             NaN       1.000   1.000           1.000           1.000
업무추진비(원)     0.360     1.000   1.000           0.941           0.780
세출결산액(원)     0.000     1.000   0.941           1.000           0.599
업무추진비비율(%)  0.000     1.000   0.780           0.599           1.000

                   업무추진비(원)  세출결산액(원)  업무추진비비율(%)  회계연도
업무추진비(원)     1.000           0.837           -0.128             0.356
세출결산액(원)     0.837           1.000           -0.595             0.000
업무추진비비율(%)  -0.128          -0.595          1.000              0.000
회계연도           0.356           0.000           0.000              1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

     회계연도  시군명    자치단체명    업무추진비(원)  세출결산액(원)  업무추진비비율(%)
0    2021      가평군    경기가평군    357338030       627701847937    0.06
1    2021      경기도    경기본청      3564171961      35414408138443  0.01
2    2021      고양시    경기고양시    1452080620      2691967817200   0.05
3    2021      과천시    경기과천시    324457210       444681750713    0.07
4    2021      광명시    경기광명시    750025610       863637995051    0.09
5    2021      광주시    경기광주시    758803994       1322366767129   0.06
6    2021      구리시    경기구리시    588885460       663763013520    0.09
7    2021      군포시    경기군포시    406004320       794836490903    0.05
8    2021      김포시    경기김포시    706419450       1496096603930   0.05
9    2021      남양주시  경기남양주시  908046032       2004655821405   0.05

Last rows

     회계연도  시군명  자치단체명  업무추진비(원)  세출결산액(원)  업무추진비비율(%)
265  2020      <NA>    경기광명시  812979830       855607039590    0.1
266  2020      <NA>    경기김포시  585535390       1366663603790   0.04
267  2020      <NA>    경기군포시  458657420       771263348550    0.06
268  2020      <NA>    경기광주시  738067900       1234795673645   0.06
269  2020      <NA>    경기이천시  711647483       1006682792138   0.07
270  2020      <NA>    경기양주시  541035620       911864045870    0.06
271  2020      <NA>    경기오산시  620361850       702608878916    0.09
272  2020      <NA>    경기구리시  589878930       622410412690    0.09
273  2020      <NA>    경기안성시  344520480       1027336340975   0.03
274  2020      <NA>    경기포천시  562425050       930833110950    0.06
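
In the sample rows, 업무추진비비율(%) appears to be 업무추진비(원) divided by 세출결산액(원), expressed as a percentage and rounded to two decimals. This is an inference from the data, not something the report states. The sketch below checks it on three sample rows, assuming the two amounts in each row separate at the digit boundary used here (for the first two rows both amounts also appear in the per-variable value tables); `ratio_pct` is an illustrative helper name.

```python
# (자치단체명, 업무추진비(원), 세출결산액(원), 업무추진비비율(%)) from the sample rows.
sample = [
    ("경기가평군", 357338030, 627701847937, 0.06),
    ("경기본청", 3564171961, 35414408138443, 0.01),
    ("경기고양시", 1452080620, 2691967817200, 0.05),
]

def ratio_pct(expense: int, expenditure: int) -> float:
    """Expense as a percentage of expenditure, rounded to two decimals."""
    return round(100 * expense / expenditure, 2)

recomputed = [ratio_pct(expense, expenditure) for _, expense, expenditure, _ in sample]
for (name, _, _, reported), value in zip(sample, recomputed):
    print(name, "reported:", reported, "recomputed:", value)
```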