Overview

Dataset statistics

Number of variables: 12
Number of observations: 1127
Missing cells: 575
Missing cells (%): 4.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 109.1 KiB
Average record size in memory: 99.1 B
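
The two derived figures in this table follow from the raw counts. A minimal sketch of that arithmetic, using only the values reported above (the variable names are illustrative, not part of the report):

```python
# Values copied from the "Dataset statistics" table.
n_variables = 12
n_observations = 1127
missing_cells = 575
total_memory_kib = 109.1

# Share of missing cells over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average in-memory size of one record, in bytes (1 KiB = 1024 B).
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
print(f"{avg_record_bytes:.1f} B per record")
```

Both results round to the reported 4.3% and 99.1 B.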

Variable types

Numeric: 3
DateTime: 4
Text: 1
Boolean: 3
Categorical: 1

Dataset

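Description: Data on subsidy-recipient grant disbursement records managed in Dangjin City's integrated local subsidy management system. It provides fields such as 교부번호 (grant number), 사업관리번호 (project management number), 교부자번호 (grantee number), 사업수행시작일 (project start date), and 사업수행종료일 (project end date).
Author: Chungcheongnam-do (충청남도)
URL: https://alldam.chungnam.go.kr/index.chungnam?menuCd=DOM_000000201001001001&st=&cds=&orgCd=&apiType=&isOpen=Y&pageIndex=322&beforeMenuCd=DOM_000000201001001000&publicdatapk=15091602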

Alerts

High correlation: 작성자 is highly overall correlated with 교부번호 and 4 other fields
High correlation: 자부담집행여부 is highly overall correlated with 적합성여부 and 2 other fields
High correlation: 적정성여부 is highly overall correlated with 적합성여부 and 2 other fields
High correlation: 적합성여부 is highly overall correlated with 자부담집행여부 and 2 other fields
High correlation: 교부번호 is highly overall correlated with 작성자
High correlation: 사업관리번호 is highly overall correlated with 작성자
Imbalance: 적합성여부 is highly imbalanced (95.2%)
Imbalance: 자부담집행여부 is highly imbalanced (95.2%)
Imbalance: 적정성여부 is highly imbalanced (95.2%)
Missing: 사업수행시작일 has 283 (25.1%) missing values
Missing: 사업수행종료일 has 283 (25.1%) missing values
Unique: 교부번호 has unique values
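
The 95.2% imbalance figure is consistent with a normalized-entropy score, 1 - H / log2(k), applied to the 1121 False / 6 True split reported for each Boolean variable. The sketch below reproduces that number under this assumption; `imbalance_score` is an illustrative name, and the code is not quoted from the profiler's source.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class counts and k is the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no distribution to compare against
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts reported for 적합성여부, 자부담집행여부 and 적정성여부.
score = imbalance_score([1121, 6])
print(f"{score:.1%}")
```

A perfectly balanced split such as `[50, 50]` gives a score of 0, so values near 1 indicate that one class dominates.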

Reproduction

Analysis started: 2024-01-09 21:46:50.905281
Analysis finished: 2024-01-09 21:46:52.359400
Duration: 1.45 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
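
The reported duration is the difference between the two timestamps. A short check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2024-01-09 21:46:50.905281")
finished = datetime.fromisoformat("2024-01-09 21:46:52.359400")

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```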

Variables

교부번호
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 1127
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27808.962
Minimum: 24897
Maximum: 28950
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.0 KiB

Quantile statistics

Minimum: 24897
5-th percentile: 26854.3
Q1: 27287.5
Median: 27570
Q3: 28666.5
95-th percentile: 28891.7
Maximum: 28950
Range: 4053
Interquartile range (IQR): 1379

Descriptive statistics

Standard deviation: 767.22879
Coefficient of variation (CV): 0.027589264
Kurtosis: 0.19558572
Mean: 27808.962
Median Absolute Deviation (MAD): 419
Skewness: -0.16544114
Sum: 31340700
Variance: 588640.02
Monotonicity: Not monotonic
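
Several of the figures above are derived from the others. A minimal sketch of those relationships, using the 교부번호 values as reported (small differences are expected because the inputs are already rounded):

```python
# Values copied from the 교부번호 statistics above.
n = 1127
mean = 27808.962
std = 767.22879
minimum, maximum = 24897, 28950
q1, q3 = 27287.5, 28666.5

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance is the squared standard deviation
total = mean * n                 # Sum, approximate because the mean is rounded

print(value_range, iqr)
print(f"CV={cv:.9f} variance={variance:.2f} sum={total:.0f}")
```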
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
24897 | 1 | 0.1%
27307 | 1 | 0.1%
27313 | 1 | 0.1%
27312 | 1 | 0.1%
27311 | 1 | 0.1%
27310 | 1 | 0.1%
27309 | 1 | 0.1%
27308 | 1 | 0.1%
27306 | 1 | 0.1%
27315 | 1 | 0.1%
Other values (1117) | 1117 | 99.1%
ValueCountFrequency (%)
24897 1
0.1%
24898 1
0.1%
24900 1
0.1%
24901 1
0.1%
24902 1
0.1%
24927 1
0.1%
24939 1
0.1%
25014 1
0.1%
25065 1
0.1%
25078 1
0.1%
ValueCountFrequency (%)
28950 1
0.1%
28949 1
0.1%
28948 1
0.1%
28947 1
0.1%
28946 1
0.1%
28945 1
0.1%
28944 1
0.1%
28943 1
0.1%
28942 1
0.1%
28941 1
0.1%

사업관리번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 142
Distinct (%): 12.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2735.7489
Minimum: 2639
Maximum: 2861
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.0 KiB

Quantile statistics

Minimum: 2639
5-th percentile: 2664
Q1: 2666
Median: 2672
Q3: 2825
95-th percentile: 2834
Maximum: 2861
Range: 222
Interquartile range (IQR): 159

Descriptive statistics

Standard deviation: 76.651897
Coefficient of variation (CV): 0.028018616
Kurtosis: -1.8031229
Mean: 2735.7489
Median Absolute Deviation (MAD): 8
Skewness: 0.29504468
Sum: 3083189
Variance: 5875.5133
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
2666 | 382 | 33.9%
2834 | 97 | 8.6%
2833 | 95 | 8.4%
2825 | 72 | 6.4%
2664 | 69 | 6.1%
2668 | 66 | 5.9%
2812 | 36 | 3.2%
2836 | 28 | 2.5%
2813 | 27 | 2.4%
2669 | 23 | 2.0%
Other values (132) | 232 | 20.6%
ValueCountFrequency (%)
2639 1
0.1%
2640 1
0.1%
2641 1
0.1%
2646 1
0.1%
2647 1
0.1%
2648 1
0.1%
2649 1
0.1%
2657 1
0.1%
2659 1
0.1%
2661 1
0.1%
ValueCountFrequency (%)
2861 1
0.1%
2860 1
0.1%
2859 1
0.1%
2857 1
0.1%
2856 1
0.1%
2855 1
0.1%
2854 1
0.1%
2853 1
0.1%
2852 1
0.1%
2851 1
0.1%

교부자번호
Real number (ℝ)

Distinct: 699
Distinct (%): 62.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2220.6442
Minimum: 58
Maximum: 6259
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.0 KiB

Quantile statistics

Minimum: 58
5-th percentile: 705.3
Q1: 1111
Median: 1272
Q3: 2855
95-th percentile: 5749.1
Maximum: 6259
Range: 6201
Interquartile range (IQR): 1744

Descriptive statistics

Standard deviation: 1644.9178
Coefficient of variation (CV): 0.740739
Kurtosis: 0.0034024017
Mean: 2220.6442
Median Absolute Deviation (MAD): 520
Skewness: 1.1049121
Sum: 2502666
Variance: 2705754.4
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
5810 | 8 | 0.7%
207 | 6 | 0.5%
185 | 6 | 0.5%
1135 | 5 | 0.4%
2607 | 4 | 0.4%
2589 | 4 | 0.4%
1186 | 4 | 0.4%
2597 | 4 | 0.4%
1122 | 4 | 0.4%
1140 | 4 | 0.4%
Other values (689) | 1078 | 95.7%
ValueCountFrequency (%)
58 3
0.3%
72 2
 
0.2%
84 1
 
0.1%
89 1
 
0.1%
93 3
0.3%
99 1
 
0.1%
133 3
0.3%
185 6
0.5%
207 6
0.5%
209 3
0.3%
ValueCountFrequency (%)
6259 1
0.1%
6258 2
0.2%
6256 1
0.1%
6255 2
0.2%
6253 1
0.1%
6252 1
0.1%
6251 1
0.1%
6250 1
0.1%
6249 1
0.1%
5821 1
0.1%

사업수행시작일
Date

MISSING 

Distinct: 25
Distinct (%): 3.0%
Missing: 283
Missing (%): 25.1%
Memory size: 8.9 KiB
Minimum: 2021-01-01 00:00:00
Maximum: 2021-09-01 00:00:00
Histogram with fixed size bins (bins=25)
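
The percentages for 사업수행시작일 use different denominators: the missing share is taken over all 1127 observations, while the reported distinct share matches a denominator of non-missing values only. The sketch below is a check of that reading against the figures above, not a statement of how the profiler is implemented.

```python
# Values copied from the 사업수행시작일 summary and the dataset overview.
n_observations = 1127
missing = 283
distinct = 25

present = n_observations - missing            # non-missing values
missing_pct = 100 * missing / n_observations  # share of all rows
distinct_pct = 100 * distinct / present       # share of non-missing rows

print(f"missing {missing_pct:.1f}%, distinct {distinct_pct:.1f}%")
```

Dividing the distinct count by all 1127 rows would give about 2.2% instead of the reported 3.0%.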

사업수행종료일
Date

MISSING 

Distinct: 11
Distinct (%): 1.3%
Missing: 283
Missing (%): 25.1%
Memory size: 8.9 KiB
Minimum: 2021-03-31 00:00:00
Maximum: 2021-12-31 00:00:00
Histogram with fixed size bins (bins=11)

교부신청주소
Text

Distinct: 798
Distinct (%): 71.4%
Missing: 9
Missing (%): 0.8%
Memory size: 8.9 KiB

Length

Max length: 41
Median length: 37
Mean length: 14.982111
Min length: 6

Characters and Unicode

Total characters: 16750
Distinct characters: 253
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
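
As an illustration of those character properties, the sketch below counts Unicode general categories for one address taken from the sample rows of this report, using Python's standard `unicodedata` module. The category codes `Lo`, `Nd`, and `Zs` correspond to the "Other Letter", "Decimal Number", and "Space Separator" labels used in the tables that follow.

```python
import unicodedata
from collections import Counter

# One address value from the report's sample rows.
address = "충청남도 당진시 무수동로 55 2층"

# General category of every character (Lo = Other Letter,
# Nd = Decimal Number, Zs = Space Separator).
category_counts = Counter(unicodedata.category(ch) for ch in address)

for category, count in category_counts.most_common():
    print(category, count)
```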

Unique

Unique: 578
Unique (%): 51.7%

Sample

1st row: 충청남도 당진시 당진시장길 124 유림회관 2층
2nd row: 충청남도 당진시 당진시장길 124 유림회관 2층
3rd row: 충청남도 당진시 남산공원길 151-16 당진문화원
4th row: 충청남도 당진시 당진시장북길 49 1층
5th row: 충청남도 당진시 무수동로 55 2층
ValueCountFrequency (%)
당진시 294
 
7.7%
합덕읍 181
 
4.8%
충청남도 154
 
4.1%
고대면 136
 
3.6%
신평면 123
 
3.2%
면천면 116
 
3.1%
순성면 114
 
3.0%
대호지면 78
 
2.1%
송악읍 56
 
1.5%
석문면 53
 
1.4%
Other values (1118) 2496
65.7%

Most occurring characters

ValueCountFrequency (%)
2776
 
16.6%
1 974
 
5.8%
885
 
5.3%
- 747
 
4.5%
2 688
 
4.1%
3 521
 
3.1%
417
 
2.5%
407
 
2.4%
4 399
 
2.4%
0 388
 
2.3%
Other values (243) 8548
51.0%

Most occurring categories

ValueCountFrequency (%)
Other Letter 8690
51.9%
Decimal Number 4487
26.8%
Space Separator 2776
 
16.6%
Dash Punctuation 747
 
4.5%
Other Punctuation 22
 
0.1%
Uppercase Letter 14
 
0.1%
Open Punctuation 7
 
< 0.1%
Close Punctuation 7
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
885
 
10.2%
417
 
4.8%
407
 
4.7%
357
 
4.1%
350
 
4.0%
340
 
3.9%
298
 
3.4%
285
 
3.3%
242
 
2.8%
232
 
2.7%
Other values (223) 4877
56.1%
Decimal Number
ValueCountFrequency (%)
1 974
21.7%
2 688
15.3%
3 521
11.6%
4 399
8.9%
0 388
 
8.6%
5 357
 
8.0%
6 325
 
7.2%
8 304
 
6.8%
7 296
 
6.6%
9 235
 
5.2%
Other Punctuation
ValueCountFrequency (%)
, 19
86.4%
. 2
 
9.1%
· 1
 
4.5%
Uppercase Letter
ValueCountFrequency (%)
G 6
42.9%
J 6
42.9%
B 2
 
14.3%
Space Separator
ValueCountFrequency (%)
2776
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 747
100.0%
Open Punctuation
ValueCountFrequency (%)
( 7
100.0%
Close Punctuation
ValueCountFrequency (%)
) 7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 8690
51.9%
Common 8046
48.0%
Latin 14
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
885
 
10.2%
417
 
4.8%
407
 
4.7%
357
 
4.1%
350
 
4.0%
340
 
3.9%
298
 
3.4%
285
 
3.3%
242
 
2.8%
232
 
2.7%
Other values (223) 4877
56.1%
Common
ValueCountFrequency (%)
2776
34.5%
1 974
 
12.1%
- 747
 
9.3%
2 688
 
8.6%
3 521
 
6.5%
4 399
 
5.0%
0 388
 
4.8%
5 357
 
4.4%
6 325
 
4.0%
8 304
 
3.8%
Other values (7) 567
 
7.0%
Latin
ValueCountFrequency (%)
G 6
42.9%
J 6
42.9%
B 2
 
14.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 8688
51.9%
ASCII 8059
48.1%
Compat Jamo 2
 
< 0.1%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2776
34.4%
1 974
 
12.1%
- 747
 
9.3%
2 688
 
8.5%
3 521
 
6.5%
4 399
 
5.0%
0 388
 
4.8%
5 357
 
4.4%
6 325
 
4.0%
8 304
 
3.8%
Other values (9) 580
 
7.2%
Hangul
ValueCountFrequency (%)
885
 
10.2%
417
 
4.8%
407
 
4.7%
357
 
4.1%
350
 
4.0%
340
 
3.9%
298
 
3.4%
285
 
3.3%
242
 
2.8%
232
 
2.7%
Other values (222) 4875
56.1%
Compat Jamo
ValueCountFrequency (%)
2
100.0%
None
ValueCountFrequency (%)
· 1
100.0%

적합성여부
Boolean

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB
Value | Count | Frequency (%)
False | 1121 | 99.5%
True | 6 | 0.5%

자부담집행여부
Boolean

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB
Value | Count | Frequency (%)
False | 1121 | 99.5%
True | 6 | 0.5%

적정성여부
Boolean

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB
Value | Count | Frequency (%)
False | 1121 | 99.5%
True | 6 | 0.5%

작성자
Categorical

HIGH CORRELATION 

Distinct: 34
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 8.9 KiB
Top categories: 신희철 382, 전건수 221, 김영욱 179, 김소리 105, 배경석 87, Other values (29) 153

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 8
Unique (%): 0.7%

Sample

1st row: 이호경
2nd row: 이호경
3rd row: 김동빈
4th row: 성기쁜
5th row: 성기쁜

Common Values

Value | Count | Frequency (%)
신희철 | 382 | 33.9%
전건수 | 221 | 19.6%
김영욱 | 179 | 15.9%
김소리 | 105 | 9.3%
배경석 | 87 | 7.7%
유민재 | 21 | 1.9%
김정숙 | 16 | 1.4%
권혜미 | 15 | 1.3%
원주현 | 13 | 1.2%
신지영 | 11 | 1.0%
Other values (24) | 77 | 6.8%

Length

Histogram of lengths of the category

작성일
Date

Distinct: 155
Distinct (%): 13.8%
Missing: 0
Missing (%): 0.0%
Memory size: 8.9 KiB
Minimum: 2021-01-20 16:16:00
Maximum: 2021-09-15 10:22:00
Histogram with fixed size bins (bins=50)

변경일
Date

Distinct: 153
Distinct (%): 13.6%
Missing: 0
Missing (%): 0.0%
Memory size: 8.9 KiB
Minimum: 2021-01-20 16:16:00
Maximum: 2021-09-15 10:22:00
Histogram with fixed size bins (bins=50)

Interactions

(Nine pairwise interaction plots; images not included in this text version.)

Correlations

Variable | 교부번호 | 사업관리번호 | 교부자번호 | 사업수행시작일 | 사업수행종료일 | 적합성여부 | 자부담집행여부 | 적정성여부 | 작성자
교부번호 | 1.000 | 0.876 | 0.437 | 0.871 | 0.531 | 0.333 | 0.333 | 0.333 | 0.946
사업관리번호 | 0.876 | 1.000 | 0.598 | 0.937 | 0.694 | 0.504 | 0.504 | 0.504 | 0.958
교부자번호 | 0.437 | 0.598 | 1.000 | 0.578 | 0.494 | 0.211 | 0.211 | 0.211 | 0.735
사업수행시작일 | 0.871 | 0.937 | 0.578 | 1.000 | 0.955 | 1.000 | 1.000 | 1.000 | 0.988
사업수행종료일 | 0.531 | 0.694 | 0.494 | 0.955 | 1.000 | 0.000 | 0.000 | 0.000 | 0.915
적합성여부 | 0.333 | 0.504 | 0.211 | 1.000 | 0.000 | 1.000 | 0.991 | 0.991 | 1.000
자부담집행여부 | 0.333 | 0.504 | 0.211 | 1.000 | 0.000 | 0.991 | 1.000 | 0.991 | 1.000
적정성여부 | 0.333 | 0.504 | 0.211 | 1.000 | 0.000 | 0.991 | 0.991 | 1.000 | 1.000
작성자 | 0.946 | 0.958 | 0.735 | 0.988 | 0.915 | 1.000 | 1.000 | 1.000 | 1.000

Variable | 작성자 | 자부담집행여부 | 적정성여부 | 적합성여부
작성자 | 1.000 | 0.986 | 0.986 | 0.986
자부담집행여부 | 0.986 | 1.000 | 0.916 | 0.916
적정성여부 | 0.986 | 0.916 | 1.000 | 0.916
적합성여부 | 0.986 | 0.916 | 0.916 | 1.000

Variable | 교부번호 | 사업관리번호 | 교부자번호 | 적합성여부 | 자부담집행여부 | 적정성여부 | 작성자
교부번호 | 1.000 | -0.315 | -0.177 | 0.239 | 0.239 | 0.239 | 0.769
사업관리번호 | -0.315 | 1.000 | 0.011 | 0.387 | 0.387 | 0.387 | 0.753
교부자번호 | -0.177 | 0.011 | 1.000 | 0.161 | 0.161 | 0.161 | 0.358
적합성여부 | 0.239 | 0.387 | 0.161 | 1.000 | 0.916 | 0.916 | 0.986
자부담집행여부 | 0.239 | 0.387 | 0.161 | 0.916 | 1.000 | 0.916 | 0.986
적정성여부 | 0.239 | 0.387 | 0.161 | 0.916 | 0.916 | 1.000 | 0.986
작성자 | 0.769 | 0.753 | 0.358 | 0.986 | 0.986 | 0.986 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
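
One common way to express nullity correlation is the Pearson correlation between the missing-value indicators of two columns. The sketch below illustrates that idea on four start/end date pairs taken from the sample rows of this report; `nullity_correlation` is an illustrative helper, and the result describes only these four rows, not the full dataset.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # undefined if a column is never or always missing
    return cov / math.sqrt(var_a * var_b)

# 사업수행시작일 / 사업수행종료일 for sample rows 0, 3, 7 and 9 (None marks <NA>).
start = ["2021-01-01", "2021-01-19", None, "2021-03-16"]
end = ["2021-12-31", "2021-12-31", None, "2021-10-12"]

r = nullity_correlation(start, end)
print(r)
```

In these rows the two dates are missing together, so the indicators are identical and the correlation is 1. Both columns also report the same missing count (283) in the full data, which is compatible with, but does not by itself prove, the same pattern there.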

Sample

Row | 교부번호 | 사업관리번호 | 교부자번호 | 사업수행시작일 | 사업수행종료일 | 교부신청주소 | 적합성여부 | 자부담집행여부 | 적정성여부 | 작성자 | 작성일 | 변경일
0 | 24897 | 2640 | 4693 | 2021-01-01 | 2021-12-31 | 충청남도 당진시 당진시장길 124 유림회관 2층 | N | N | N | 이호경 | 2021-01-20 16:16 | 2021-01-20 16:16
1 | 24898 | 2641 | 4693 | 2021-01-01 | 2021-12-31 | 충청남도 당진시 당진시장길 124 유림회관 2층 | N | N | N | 이호경 | 2021-01-20 16:17 | 2021-01-20 16:17
2 | 24900 | 2646 | 58 | 2021-01-01 | 2021-12-31 | 충청남도 당진시 남산공원길 151-16 당진문화원 | N | N | N | 김동빈 | 2021-01-27 10:29 | 2021-01-27 10:29
3 | 24901 | 2648 | 1911 | 2021-01-19 | 2021-12-31 | 충청남도 당진시 당진시장북길 49 1층 | N | N | N | 성기쁜 | 2021-01-27 22:14 | 2021-01-27 22:14
4 | 24902 | 2649 | 84 | 2021-01-19 | 2021-12-31 | 충청남도 당진시 무수동로 55 2층 | N | N | N | 성기쁜 | 2021-01-27 22:38 | 2021-01-27 22:38
5 | 24927 | 2657 | 133 | 2021-01-01 | 2021-12-31 | 충청남도 당진시 무수동로 88 현대자동차영업소 2층 | N | N | N | 김정숙 | 2021-02-02 14:05 | 2021-02-02 14:05
6 | 24939 | 2659 | 5557 | 2021-02-01 | 2021-12-31 | 충청남도 당진시 무수동로 88 2층 | N | N | N | 김정숙 | 2021-02-08 10:02 | 2021-02-08 10:02
7 | 25065 | 2639 | 305 | <NA> | <NA> | 충청남도 당진시 원당로 183 | N | N | N | 전건수 | 2021-02-15 9:07 | 2021-02-15 9:07
8 | 25014 | 2647 | 185 | 2021-01-01 | 2021-12-31 | 충청남도 당진시 시청2로 20 401호(수청동 GJ빌딩) | N | N | N | 남슬기 | 2021-02-10 14:43 | 2021-02-10 14:43
9 | 25078 | 2661 | 3252 | 2021-03-16 | 2021-10-12 | 충청남도 당진시 무수동7길 35 2층 충남농아인협회 당진시지회 | N | N | N | 진영재 | 2021-02-17 10:03 | 2021-02-17 10:03

Row | 교부번호 | 사업관리번호 | 교부자번호 | 사업수행시작일 | 사업수행종료일 | 교부신청주소 | 적합성여부 | 자부담집행여부 | 적정성여부 | 작성자 | 작성일 | 변경일
1117 | 26981 | 2804 | 5798 | 2021-03-09 | 2021-06-30 | 충남 당진군 고대면 용두리 445-1외 | N | N | N | 신선희 | 2021-07-15 14:37 | 2021-07-15 14:39
1118 | 26980 | 2803 | 5797 | 2021-03-02 | 2021-06-30 | 충남 당진군 석문면 삼봉리 421-2외 | N | N | N | 신선희 | 2021-07-15 14:26 | 2021-07-15 14:28
1119 | 26982 | 2807 | 133 | 2021-06-01 | 2021-07-30 | 충청남도 당진시 무수동로 88 현대자동차영업소 2층 | N | N | N | 이이슬 | 2021-07-19 10:47 | 2021-07-19 10:47
1120 | 26983 | 2808 | 5799 | 2021-02-02 | 2021-07-07 | 충청남도 당진시 송악읍 상중말길 117 | N | N | N | 송수연 | 2021-07-22 16:10 | 2021-07-22 16:10
1121 | 26984 | 2808 | 4005 | 2021-02-02 | 2021-07-07 | 충청남도 당진시 송악읍 순성로 776-11 | N | N | N | 송수연 | 2021-07-22 16:11 | 2021-07-22 16:11
1122 | 26985 | 2808 | 4580 | 2021-02-02 | 2021-07-07 | 충청남도 당진시 순성면 가화로 128-39 | N | N | N | 송수연 | 2021-07-22 16:13 | 2021-07-22 16:13
1123 | 26986 | 2808 | 5351 | 2021-02-02 | 2021-07-07 | 충청남도 당진시 합덕읍 분두길 62 | N | N | N | 송수연 | 2021-07-22 16:14 | 2021-07-22 16:14
1124 | 26987 | 2808 | 3038 | 2021-02-02 | 2021-07-07 | 충청남도 당진시 송산면 사둘구지길 227-17 | N | N | N | 송수연 | 2021-07-22 16:15 | 2021-07-22 16:15
1125 | 26988 | 2808 | 5800 | 2021-02-02 | 2021-07-07 | 충청남도 당진시 합덕읍 오여미1길 79 | N | N | N | 송수연 | 2021-07-22 16:18 | 2021-07-22 16:18
1126 | 27098 | 2817 | 5810 | 2021-01-01 | 2021-12-31 | 충청남도 당진시 무수동7길 42 | N | N | N | 배경석 | 2021-08-02 16:19 | 2021-08-02 16:19
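
The sample rows show the Boolean columns as N and missing dates as <NA>. The sketch below shows one way such records could be read into typed Python values with the standard library. It rests on several assumptions that the report does not confirm: that the source is a comma-separated export, that missing dates appear as empty fields, and that Y maps to True. The `parse_row` helper is illustrative.

```python
import csv
import io
from datetime import date

# Two sample rows rewritten as CSV for illustration (layout is an assumption).
raw = """교부번호,사업관리번호,교부자번호,사업수행시작일,사업수행종료일,적합성여부
24897,2640,4693,2021-01-01,2021-12-31,N
25065,2639,305,,,N
"""

def to_date(text):
    """Parse an ISO date, treating an empty field as missing."""
    return date.fromisoformat(text) if text else None

def parse_row(row):
    """Convert one CSV row of strings into typed values."""
    return {
        "교부번호": int(row["교부번호"]),
        "사업관리번호": int(row["사업관리번호"]),
        "교부자번호": int(row["교부자번호"]),
        "사업수행시작일": to_date(row["사업수행시작일"]),
        "사업수행종료일": to_date(row["사업수행종료일"]),
        "적합성여부": row["적합성여부"] == "Y",  # assumed Y/N encoding
    }

records = [parse_row(r) for r in csv.DictReader(io.StringIO(raw))]
for record in records:
    print(record)
```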