Overview

Dataset statistics

Number of variables: 8
Number of observations: 7515
Missing cells: 20021
Missing cells (%): 33.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 506.5 KiB
Average record size in memory: 69.0 B
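
The two derived figures in this table can be recomputed from the raw counts. The snippet below is a minimal sketch, assuming that the missing-cell percentage is taken over variables × observations and that 1 KiB is 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 8
n_observations = 7515
missing_cells = 20021
total_size_kib = 506.5

# Assumption: the percentage is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = round(100 * missing_cells / total_cells, 1)

# Assumption: 1 KiB = 1024 bytes, averaged over all rows.
avg_record_bytes = round(total_size_kib * 1024 / n_observations, 1)

print(total_cells, missing_pct, avg_record_bytes)  # 60120 33.3 69.0
```

Both results agree with the reported 33.3% and 69.0 B.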

Variable types

DateTime: 1
Categorical: 2
Numeric: 4
Unsupported: 1

Dataset

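Description: Revenue status from the fiscal information disclosure system of Yeongju-si, Gyeongsangbuk-do (일자 date, 회계명 account name, 세입명 revenue name, 수입액 revenue amount, etc.)
Author: 경상북도 영주시 (Yeongju-si, Gyeongsangbuk-do)
URL: https://www.data.go.kr/data/15062964/fileData.do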

Alerts

High correlation: 세입명 is highly overall correlated with 회계명
High correlation: 회계명 is highly overall correlated with 세입명
High correlation: 전일누계(천원) is highly overall correlated with 수입액(천원) and 1 other field
High correlation: 수입액(천원) is highly overall correlated with 전일누계(천원) and 1 other field
High correlation: 합계(천원) is highly overall correlated with 전일누계(천원) and 1 other field
Missing: 전일누계(천원) has 161 (2.1%) missing values
Missing: 수입액(천원) has 5247 (69.8%) missing values
Missing: 과오납반환(천원) has 7098 (94.5%) missing values
Missing: 과목경정 has 7515 (100.0%) missing values
Unsupported: 과목경정 is an unsupported type, check if it needs cleaning or further analysis
Zeros: 합계(천원) has 149 (2.0%) zeros
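
The per-column missing counts in these alerts can be cross-checked against the dataset-level total of 20021 missing cells. A small sketch, assuming the four columns listed in the alerts are the only ones with missing values (the 일자, 회계명, 세입명 and 합계(천원) sections each report 0 missing):

```python
n_rows = 7515

# Missing counts copied from the alerts.
missing_by_column = {
    "전일누계(천원)": 161,
    "수입액(천원)": 5247,
    "과오납반환(천원)": 7098,
    "과목경정": 7515,
}

total_missing = sum(missing_by_column.values())
missing_pct = {
    column: round(100 * count / n_rows, 1)
    for column, count in missing_by_column.items()
}

print(total_missing)  # 20021
print(missing_pct)
```

The sum matches the "Missing cells" figure, and the rounded percentages match the alert text.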

Reproduction

Analysis started: 2023-12-12 05:46:14.233131
Analysis finished: 2023-12-12 05:46:16.971468
Duration: 2.74 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
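
The reported duration follows directly from the two timestamps. A minimal check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2023-12-12 05:46:14.233131")
finished = datetime.fromisoformat("2023-12-12 05:46:16.971468")

duration_s = (finished - started).total_seconds()
print(round(duration_s, 2))  # 2.74
```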

Variables

일자
Date

Distinct: 1033
Distinct (%): 13.7%
Missing: 0
Missing (%): 0.0%
Memory size: 58.8 KiB
Minimum: 2016-01-04 00:00:00
Maximum: 2020-03-17 00:00:00
Histogram with fixed size bins (bins=50)

회계명
Categorical

HIGH CORRELATION 

Distinct: 13
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 58.8 KiB
수질개선특별회계: 1006
주차장특별회계: 755
의료급여기금특별회계: 524
일반회계: 523
저소득주민주거안정기금특별회계: 523
Other values (8): 4184

Length

Max length: 21
Median length: 15
Mean length: 10.40519
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 수질개선특별회계
2nd row: 주차장특별회계
3rd row: 수질개선특별회계
4th row: 주차장특별회계
5th row: 수질개선특별회계

Common Values

Value  Count  Frequency (%)
수질개선특별회계  1006  13.4%
주차장특별회계  755  10.0%
의료급여기금특별회계  524  7.0%
일반회계  523  7.0%
저소득주민주거안정기금특별회계  523  7.0%
주민소득지원및생활안정기금특별회계  523  7.0%
장기미집행도시계획시설대지보상임시특별회계  523  7.0%
농공지구조성사업특별회계  523  7.0%
치수사업특별회계  523  7.0%
기반시설특별회계  523  7.0%
Other values (3)  1569  20.9%

Length

Histogram of lengths of the category

세입명
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 58.8 KiB
회계별총계: 5754
세외수입: 1761

Length

Max length: 5
Median length: 5
Mean length: 4.7656687
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 세외수입
2nd row: 세외수입
3rd row: 세외수입
4th row: 세외수입
5th row: 세외수입

Common Values

Value  Count  Frequency (%)
회계별총계  5754  76.6%
세외수입  1761  23.4%
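
The length statistics for 세입명 can be reproduced from the value counts alone, since both category values are listed. The sketch below assumes length is counted in Unicode code points after NFC normalization (one code point per Hangul syllable); that is an assumption about how the report measures length, not something the report states.

```python
import unicodedata

# Value counts copied from the "Common Values" table for 세입명.
value_counts = {"회계별총계": 5754, "세외수입": 1761}

total = sum(value_counts.values())

# Assumption: lengths are counted on NFC-normalized strings.
lengths = {
    value: len(unicodedata.normalize("NFC", value)) for value in value_counts
}
mean_length = sum(lengths[v] * n for v, n in value_counts.items()) / total
frequency_pct = {v: round(100 * n / total, 1) for v, n in value_counts.items()}

print(total, round(mean_length, 7), frequency_pct)
```

Under that assumption the weighted mean comes out at about 4.7656687, in line with the reported mean length.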

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of the 세입명 value counts (회계별총계 5754, 76.6%; 세외수입 1761, 23.4%); image not preserved in this text export.]

전일누계(천원)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 2566
Distinct (%): 34.9%
Missing: 161
Missing (%): 2.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 35986057
Minimum: 3
Maximum: 8.2544233 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 66.2 KiB

Quantile statistics

Minimum: 3
5-th percentile: 153359
Q1: 1355799
Median: 2576696
Q3: 10813684
95-th percentile: 3.3199534 × 10^8
Maximum: 8.2544233 × 10^8
Range: 8.2544233 × 10^8
Interquartile range (IQR): 9457885

Descriptive statistics

Standard deviation: 1.1778594 × 10^8
Coefficient of variation (CV): 3.2730994
Kurtosis: 16.010549
Mean: 35986057
Median Absolute Deviation (MAD): 1852796
Skewness: 4.0514186
Sum: 2.6464146 × 10^11
Variance: 1.3873528 × 10^16
Monotonicity: Not monotonic
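
Several of the statistics for 전일누계(천원) are functions of the others, which gives a quick way to confirm that the powers of ten were read correctly. A minimal sketch using the usual definitions (CV = standard deviation / mean, IQR = Q3 − Q1, variance = standard deviation squared); the inputs are the rounded values printed in the report, so the comparisons use a tolerance.

```python
# Values copied from the quantile and descriptive statistics.
mean = 35986057
std = 1.1778594e8
q1, q3 = 1355799, 10813684
minimum, maximum = 3, 8.2544233e8

cv = std / mean                   # coefficient of variation
iqr = q3 - q1                     # interquartile range
variance = std ** 2
value_range = maximum - minimum

print(round(cv, 4), iqr, f"{variance:.4e}", f"{value_range:.4e}")
```

The results agree with the reported CV (3.2730994), IQR (9457885), variance (1.3873528 × 10^16) and range to within rounding.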
Histogram with fixed size bins (bins=50)
Common values

Value                Count  Frequency (%)
574637               119    1.6%
300129               119    1.6%
16775                102    1.4%
1663725              96     1.3%
560474               67     0.9%
2298561              67     0.9%
992361               63     0.8%
345008               63     0.8%
2118444              62     0.8%
299847               61     0.8%
Other values (2556)  6535   87.0%
(Missing)            161    2.1%

Minimum 10 values

Value  Count  Frequency (%)
3      7      0.1%
264    1      < 0.1%
686    1      < 0.1%
1247   58     0.8%
1485   1      < 0.1%
2541   1      < 0.1%
3000   11     0.1%
3666   1      < 0.1%
5301   1      < 0.1%
5729   1      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
825442329  1      < 0.1%
799732053  1      < 0.1%
799128007  1      < 0.1%
774751772  1      < 0.1%
774381734  1      < 0.1%
773753352  1      < 0.1%
772397835  1      < 0.1%
747630801  1      < 0.1%
744375877  1      < 0.1%
743079198  1      < 0.1%

수입액(천원)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 1930
Distinct (%): 85.1%
Missing: 5247
Missing (%): 69.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 799544.11
Minimum: -4000000
Maximum: 1.5772321 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 37
Negative (%): 0.5%
Memory size: 66.2 KiB
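
The percentages for 수입액(천원) appear to use two different denominators. The sketch below is inferred from the reported numbers only, not from the tool's documentation: the distinct percentage is consistent with a denominator of non-missing rows, while the negative percentage is consistent with a denominator of all rows.

```python
n_rows = 7515

# Counts copied from the 수입액(천원) summary.
missing = 5247
distinct = 1930
negative = 37

non_missing = n_rows - missing

# Assumption inferred from the figures: distinct % is relative to non-missing rows.
distinct_pct = round(100 * distinct / non_missing, 1)

# Assumption inferred from the figures: negative % is relative to all rows.
negative_pct = round(100 * negative / n_rows, 1)

print(non_missing, distinct_pct, negative_pct)  # 2268 85.1 0.5
```

Keeping the denominators apart matters when comparing columns: 85.1% distinct here describes only the 2268 rows that carry a value.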

Quantile statistics

Minimum: -4000000
5-th percentile: 26.35
Q1: 1519.75
Median: 4764.5
Q3: 157718.5
95-th percentile: 3007578.2
Maximum: 1.5772321 × 10^8
Range: 1.6172321 × 10^8
Interquartile range (IQR): 156198.75

Descriptive statistics

Standard deviation: 4795910.7
Coefficient of variation (CV): 5.9983065
Kurtosis: 526.03024
Mean: 799544.11
Median Absolute Deviation (MAD): 4749
Skewness: 18.60138
Sum: 1.813366 × 10^9
Variance: 2.3000759 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                Count  Frequency (%)
6                    20     0.3%
100                  19     0.3%
2000                 17     0.2%
12000                17     0.2%
50                   12     0.2%
30                   12     0.2%
200                  11     0.1%
60                   11     0.1%
-75394               9      0.1%
5                    6      0.1%
Other values (1920)  2134   28.4%
(Missing)            5247   69.8%

Minimum 10 values

Value     Count  Frequency (%)
-4000000  1      < 0.1%
-3000000  4      0.1%
-2000000  2      < 0.1%
-1000000  1      < 0.1%
-816363   1      < 0.1%
-800000   1      < 0.1%
-668999   1      < 0.1%
-650000   1      < 0.1%
-325265   1      < 0.1%
-196673   1      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
157723214  1      < 0.1%
46837101   1      < 0.1%
44943245   1      < 0.1%
39959024   1      < 0.1%
38306213   1      < 0.1%
36088241   1      < 0.1%
34329605   1      < 0.1%
33670073   1      < 0.1%
33322061   1      < 0.1%
31796525   1      < 0.1%

과오납반환(천원)
Real number (ℝ)

MISSING 

Distinct: 368
Distinct (%): 88.2%
Missing: 7098
Missing (%): 94.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 7903.8801
Minimum: -5349
Maximum: 1056166
Zeros: 0
Zeros (%): 0.0%
Negative: 2
Negative (%): < 0.1%
Memory size: 66.2 KiB

Quantile statistics

Minimum: -5349
5-th percentile: 12.6
Q1: 144
Median: 757
Q3: 2866
95-th percentile: 15325.2
Maximum: 1056166
Range: 1061515
Interquartile range (IQR): 2722

Descriptive statistics

Standard deviation: 56425.214
Coefficient of variation (CV): 7.1389259
Kurtosis: 289.88453
Mean: 7903.8801
Median Absolute Deviation (MAD): 712
Skewness: 16.07238
Sum: 3295918
Variance: 3.1838048 × 10^9
Monotonicity: Not monotonic
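
With 94.5% of 과오납반환(천원) missing, the mean describes only the rows that carry a value. A short sketch, assuming the mean is the reported sum divided by the non-missing count:

```python
n_rows = 7515

# Figures copied from the 과오납반환(천원) summary and descriptive statistics.
missing = 7098
reported_sum = 3295918

non_missing = n_rows - missing
mean = reported_sum / non_missing  # averaged over observed values only

print(non_missing, round(mean, 4))  # 417 7903.8801
```

Dividing by all 7515 rows instead would give roughly 438.6, which does not match the report, so the mean here reflects the 417 observed refunds.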
Histogram with fixed size bins (bins=50)
Common values

Value               Count  Frequency (%)
4                   6      0.1%
40                  5      0.1%
2                   4      0.1%
50                  3      < 0.1%
581                 3      < 0.1%
30                  3      < 0.1%
494                 3      < 0.1%
10                  3      < 0.1%
26                  2      < 0.1%
20                  2      < 0.1%
Other values (358)  383    5.1%
(Missing)           7098   94.5%

Minimum 10 values

Value  Count  Frequency (%)
-5349  1      < 0.1%
-4110  1      < 0.1%
2      4      0.1%
3      2      < 0.1%
4      6      0.1%
6      1      < 0.1%
9      1      < 0.1%
10     3      < 0.1%
11     2      < 0.1%
13     2      < 0.1%

Maximum 10 values

Value    Count  Frequency (%)
1056166  1      < 0.1%
267120   1      < 0.1%
238361   1      < 0.1%
225116   1      < 0.1%
142731   1      < 0.1%
79611    1      < 0.1%
74882    1      < 0.1%
67707    1      < 0.1%
63051    1      < 0.1%
60106    1      < 0.1%

과목경정
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 7515
Missing (%): 100.0%
Memory size: 66.2 KiB

합계(천원)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 2626
Distinct (%): 34.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35456836
Minimum: 0
Maximum: 8.271938 × 10^8
Zeros: 149
Zeros (%): 2.0%
Negative: 0
Negative (%): 0.0%
Memory size: 66.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 16775
Q1: 1331660
Median: 2557843
Q3: 9717235.5
95-th percentile: 3.2947458 × 10^8
Maximum: 8.271938 × 10^8
Range: 8.271938 × 10^8
Interquartile range (IQR): 8385575.5

Descriptive statistics

Standard deviation: 1.1729455 × 10^8
Coefficient of variation (CV): 3.3080941
Kurtosis: 16.359811
Mean: 35456836
Median Absolute Deviation (MAD): 1982862
Skewness: 4.0921615
Sum: 2.6645812 × 10^11
Variance: 1.3758011 × 10^16
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                Count  Frequency (%)
0                    149    2.0%
574637               119    1.6%
300129               119    1.6%
16775                101    1.3%
1663725              97     1.3%
2298561              67     0.9%
560474               67     0.9%
345008               63     0.8%
992361               63     0.8%
2118444              61     0.8%
Other values (2616)  6609   87.9%

Minimum 10 values

Value  Count  Frequency (%)
0      149    2.0%
3      7      0.1%
351    1      < 0.1%
686    1      < 0.1%
1247   58     0.8%
2541   1      < 0.1%
3000   10     0.1%
3666   1      < 0.1%
5301   1      < 0.1%
5729   1      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
827193801  1      < 0.1%
800676932  1      < 0.1%
799735243  1      < 0.1%
775934697  1      < 0.1%
774467018  1      < 0.1%
773846727  1      < 0.1%
773716282  1      < 0.1%
772399837  1      < 0.1%
744931124  1      < 0.1%
744373947  1      < 0.1%

Interactions

[16 interaction plots rendered with Matplotlib v3.7.2; the images are not preserved in this text export.]

Correlations

Correlation matrix 1

                  회계명  세입명  전일누계(천원)  수입액(천원)  과오납반환(천원)  합계(천원)
회계명            1.000   1.000   0.637           0.140         0.000             0.638
세입명            1.000   1.000   0.194           0.052         0.000             0.190
전일누계(천원)    0.637   0.194   1.000           0.353         0.136             1.000
수입액(천원)      0.140   0.052   0.353           1.000         0.642             0.322
과오납반환(천원)  0.000   0.000   0.136           0.642         1.000             0.161
합계(천원)        0.638   0.190   1.000           0.322         0.161             1.000

Correlation matrix 2

        세입명  회계명
세입명  1.000   0.999
회계명  0.999   1.000

Correlation matrix 3

                  전일누계(천원)  수입액(천원)  과오납반환(천원)  합계(천원)  회계명  세입명
전일누계(천원)    1.000           0.643         0.197             0.997       0.327   0.148
수입액(천원)      0.643           1.000         0.135             0.651       0.076   0.064
과오납반환(천원)  0.197           0.135         1.000             0.201       0.000   0.000
합계(천원)        0.997           0.651         0.201             1.000       0.327   0.146
회계명            0.327           0.076         0.000             0.327       1.000   0.999
세입명            0.148           0.064         0.000             0.146       0.999   1.000
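
When a correlation matrix is exported as text without cell separators (a row then reads like `회계명1.0001.0000.6370.1400.0000.638`), the cells can be recovered because every coefficient is printed with exactly three decimals. The sketch below assumes that fixed format and that no coefficient is negative, both of which hold for the first matrix in this section; `split_row` is a hypothetical helper, not part of any library.

```python
import re

# Rows of the first correlation matrix as they appear in the raw text export.
fused_rows = [
    "회계명1.0001.0000.6370.1400.0000.638",
    "세입명1.0001.0000.1940.0520.0000.190",
    "전일누계(천원)0.6370.1941.0000.3530.1361.000",
    "수입액(천원)0.1400.0520.3531.0000.6420.322",
    "과오납반환(천원)0.0000.0000.1360.6421.0000.161",
    "합계(천원)0.6380.1901.0000.3220.1611.000",
]

def split_row(line):
    """Split a fused row into (label, [coefficients]).

    Assumes each coefficient is one digit, a dot, and three decimals,
    and that the label itself contains no digits.
    """
    matches = list(re.finditer(r"\d\.\d{3}", line))
    label = line[: matches[0].start()]
    return label, [float(m.group()) for m in matches]

matrix = dict(split_row(row) for row in fused_rows)
labels = list(matrix)

# A correlation matrix should be symmetric; use that as a parsing sanity check.
is_symmetric = all(
    matrix[a][j] == matrix[b][i]
    for i, a in enumerate(labels)
    for j, b in enumerate(labels)
)
print(labels)
print(is_symmetric)  # True
```

The symmetry check is a cheap guard against a mis-split row: a wrong cell boundary would almost certainly break it.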

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
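
Nullity correlation can be illustrated by correlating the missing/present indicators of two columns. The sketch below computes a plain Pearson correlation on 0/1 missingness flags; the six rows are toy data made up for illustration (they are not taken from the dataset), and the exact method used to draw the heatmap in this report is not stated, so this is only one reasonable reading of the caption.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical toy columns: None marks a missing cell.
receipts = [None, 2238, None, 1225, None, None]
refunds = [None, 40, None, None, None, None]

# 1 where the cell is missing, 0 where a value is present.
receipts_missing = [int(v is None) for v in receipts]
refunds_missing = [int(v is None) for v in refunds]

nullity_corr = pearson(receipts_missing, refunds_missing)
print(round(nullity_corr, 4))  # 0.6325
```

A value near 1 means the two columns tend to be missing in the same rows; a value near -1 means one tends to be present exactly when the other is missing.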

Sample

First rows

      일자        회계명  세입명  전일누계(천원)  수입액(천원)  과오납반환(천원)  과목경정  합계(천원)
0     2020-03-17  수질개선특별회계  세외수입  2438950   <NA>  <NA>  <NA>  2438950
1     2020-03-17  주차장특별회계  세외수입  11310093  2238  <NA>  <NA>  11312331
2     2020-03-16  수질개선특별회계  세외수입  2438950   <NA>  <NA>  <NA>  2438950
3     2020-03-16  주차장특별회계  세외수입  11308867  1225  <NA>  <NA>  11310092
4     2020-03-13  수질개선특별회계  세외수입  2438950   <NA>  <NA>  <NA>  2438950
5     2020-03-13  주차장특별회계  세외수입  11306250  <NA>  <NA>  <NA>  11306250
6     2020-03-12  수질개선특별회계  세외수입  2438950   <NA>  <NA>  <NA>  2438950
7     2020-03-12  주차장특별회계  세외수입  11304870  <NA>  <NA>  <NA>  11304870
8     2020-03-11  수질개선특별회계  세외수입  2438950   <NA>  <NA>  <NA>  2438950
9     2020-03-11  주차장특별회계  세외수입  11303619  <NA>  <NA>  <NA>  11303619

Last rows

      일자        회계명  세입명  전일누계(천원)  수입액(천원)  과오납반환(천원)  과목경정  합계(천원)
7505  2016-01-04  의료급여기금특별회계  회계별총계  2057475   <NA>  <NA>  <NA>  2057475
7506  2016-01-04  저소득주민주거안정기금특별회계  회계별총계  1093049   <NA>  <NA>  <NA>  1093049
7507  2016-01-04  주민소득지원및생활안정기금특별회계  회계별총계  2068506   <NA>  <NA>  <NA>  2068506
7508  2016-01-04  장기미집행도시계획시설대지보상임시특별회계  회계별총계  1247      <NA>  <NA>  <NA>  1247
7509  2016-01-04  농공지구조성사업특별회계  회계별총계  2962879   <NA>  <NA>  <NA>  2962879
7510  2016-01-04  치수사업특별회계  회계별총계  30838722  <NA>  <NA>  <NA>  30838722
7511  2016-01-04  기반시설특별회계  회계별총계  16775     <NA>  <NA>  <NA>  16775
7512  2016-01-04  주거환경개선사업특별회계  회계별총계  1494204   <NA>  <NA>  <NA>  1494204
7513  2016-01-04  수도사업특별회계  회계별총계  20343513  1735  <NA>  <NA>  20345248
7514  2016-01-04  하수도사업특별회계  회계별총계  42348193  <NA>  <NA>  <NA>  42348193
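
In the raw text export of these sample rows, three rows have 전일누계(천원) and 수입액(천원) run together as one digit string (`113100932238`, `113088671225`, `203435131735`). The boundary can be recovered from the row total, under the assumption that 합계(천원) equals 전일누계(천원) plus 수입액(천원) when 과오납반환(천원) is missing. That relation is inferred from the sample rows, not stated by the data provider, and `split_fused` is a hypothetical helper that only handles non-negative amounts.

```python
def split_fused(fused, total):
    """Return every (previous_cumulative, receipts) split of `fused`
    whose two parts add up to `total`."""
    candidates = []
    for i in range(1, len(fused)):
        left, right = fused[:i], fused[i:]
        if len(right) > 1 and right[0] == "0":
            continue  # skip splits that give the right part a leading zero
        if int(left) + int(right) == total:
            candidates.append((int(left), int(right)))
    return candidates

# (fused digits, 합계(천원)) for sample rows 1, 3 and 7513.
fused_rows = [
    ("113100932238", 11312331),
    ("113088671225", 11310092),
    ("203435131735", 20345248),
]

for fused, total in fused_rows:
    print(fused, "->", split_fused(fused, total))
```

Each row has exactly one split that satisfies the total: (11310093, 2238), (11308867, 1225) and (20343513, 1735).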