Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1685
Duplicate rows (%): 16.9%
Total size in memory: 634.8 KiB
Average record size in memory: 65.0 B
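The derived figures in this panel follow from the raw counts. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes; the variable names are illustrative and are not part of the report:

```python
# Raw figures copied from the dataset statistics panel.
n_rows = 10_000
duplicate_rows = 1_685
total_kib = 634.8

# Share of duplicate rows, in percent (the report displays this as 16.9%).
duplicate_pct = 100 * duplicate_rows / n_rows

# Average record size in bytes, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_kib * 1024 / n_rows

print(f"duplicate rows: {duplicate_pct:.2f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```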

Variable types

DateTime: 3
Categorical: 2
Numeric: 1
Text: 1

Dataset

Description: N/A
Author: Jecheon City, Chungcheongbuk-do (충청북도 제천시)
URL: https://www.data.go.kr/data/15122293/fileData.do

Alerts

입금상태 has constant value "입금" (Constant)
데이터기준일 has constant value "2023-09-01" (Constant)
Dataset has 1685 (16.9%) duplicate rows (Duplicates)
고지구분 is highly imbalanced (56.3%) (Imbalance)

Reproduction

Analysis started: 2024-04-21 09:25:09.148870
Analysis finished: 2024-04-21 09:25:10.529002
Duration: 1.38 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
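The reported duration is the difference between the two timestamps. A small sketch using only the standard library; the format string is an assumption based on how the timestamps are printed here:

```python
from datetime import datetime

# Timestamp layout as printed in the report (assumed format string).
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-04-21 09:25:09.148870", FMT)
finished = datetime.strptime("2024-04-21 09:25:10.529002", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```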

Variables

거래일자 (transaction date)
Date

Distinct: 563
Distinct (%): 5.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2022-01-01 00:00:00
Maximum: 2023-07-31 00:00:00
Histogram with fixed size bins (bins=50)

입금상태 (payment status)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2
Histogram of lengths of the category

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 입금
2nd row: 입금
3rd row: 입금
4th row: 입금
5th row: 입금

Common Values

Value  Count  Frequency (%)
입금  10000  100.0%

납기일자 (due date)
Date

Distinct: 405
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2022-01-03 00:00:00
Maximum: 2023-11-27 00:00:00
Histogram with fixed size bins (bins=50)

고지구분 (notice type)
Categorical

IMBALANCE

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2
Histogram of lengths of the category

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 독촉
2nd row: 일반
3rd row: 독촉
4th row: 일반
5th row: 일반

Common Values

Value  Count  Frequency (%)
일반  8145  81.5%
독촉  1854  18.5%
체납  1  < 0.1%
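The 56.3% imbalance figure reported for 고지구분 is consistent with one minus the normalized Shannon entropy of the three class counts. The sketch below works under that assumption; `imbalance_score` is an illustrative name, not the profiler's API, and returning 0.0 for a single class is a convention chosen here.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits) of the
    class distribution and k is the number of classes.
    0.0 means perfectly balanced; values near 1.0 mean one class dominates."""
    k = len(counts)
    if k < 2:
        return 0.0  # convention for this sketch: a single class has no imbalance
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Class counts copied from the Common Values table for 고지구분.
counts = {"일반": 8145, "독촉": 1854, "체납": 1}
score = imbalance_score(list(counts.values()))
print(f"{score:.1%}")
```

A perfectly balanced three-class column would score 0.0 under this definition, so a value above 0.5 reflects how heavily 일반 dominates.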

세목코드 (tax item code)
Real number (ℝ)

Distinct: 127
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 247150.81
Minimum: 201001
Maximum: 715002
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 201001
5-th percentile: 202002
Q1: 205009
Median: 219216
Q3: 288130
95-th percentile: 294099
Maximum: 715002
Range: 514001
Interquartile range (IQR): 83121

Descriptive statistics

Standard deviation: 61101.136
Coefficient of variation (CV): 0.24722207
Kurtosis: 30.904812
Mean: 247150.81
Median Absolute Deviation (MAD): 17214
Skewness: 4.386167
Sum: 2.4715081 × 10^9
Variance: 3.7333488 × 10^9
Monotonicity: Not monotonic
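Several of the statistics above are simple functions of the others. The sketch below recomputes them from the reported values as a consistency check; it uses the rounded figures printed in the report, so small differences in the last digits are expected.

```python
# Values copied from the quantile and descriptive statistics for 세목코드.
n = 10_000
mean = 247150.81
std = 61101.136
q1, q3 = 205009, 288130
minimum, maximum = 201001, 715002

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance is the squared standard deviation
total = mean * n                 # Sum is the mean times the number of rows

print(value_range, iqr)
print(f"CV={cv:.5f} variance={variance:.4e} sum={total:.4e}")
```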
Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
205007  1155  11.6%
288131  1104  11.0%
211173  981  9.8%
288080  957  9.6%
219216  818  8.2%
202002  702  7.0%
288130  506  5.1%
205009  420  4.2%
294099  345  3.5%
206001  307  3.1%
Other values (117)  2705  27.1%

Minimum 10 values

Value  Count  Frequency (%)
201001  77  0.8%
201002  1  < 0.1%
202001  180  1.8%
202002  702  7.0%
202009  9  0.1%
202012  4  < 0.1%
202099  4  < 0.1%
205004  9  0.1%
205006  5  0.1%
205007  1155  11.6%

Maximum 10 values

Value  Count  Frequency (%)
715002  57  0.6%
715001  41  0.4%
299099  58  0.6%
295064  1  < 0.1%
295062  124  1.2%
295043  2  < 0.1%
295038  2  < 0.1%
295025  2  < 0.1%
294944  4  < 0.1%
294907  1  < 0.1%

세목명 (tax item name)
Text

Distinct: 136
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 18
Median length: 15
Mean length: 8.8094
Min length: 2

Characters and Unicode

Total characters: 88094
Distinct characters: 195
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
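As an illustration of these character properties, the sketch below counts Unicode general categories with the standard library's `unicodedata` module. It uses three sample values from this column rather than the full data, and normalizing to NFC is an assumption made so that each Hangul syllable counts as one character.

```python
import unicodedata
from collections import Counter

# Three sample values of 세목명 taken from this report (not the full column).
samples = ["자동차손해배상보장법위반과태료", "부가가치세", "차량출입시설"]

# Normalize to NFC so each Hangul syllable is a single code point.
samples = [unicodedata.normalize("NFC", s) for s in samples]

total_chars = sum(len(s) for s in samples)
mean_length = total_chars / len(samples)

# General category of every character: "Lo" is Other Letter, "Ps"/"Pe" are
# Open/Close Punctuation, "Nd" is Decimal Number, "Pd" is Dash Punctuation,
# "Zs" is Space Separator.
category_counts = Counter(unicodedata.category(ch) for s in samples for ch in s)

print(total_chars, round(mean_length, 2), dict(category_counts))

# The same division reproduces the reported mean length for the full column.
reported_mean_length = 88094 / 10000
```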

Unique

Unique: 29
Unique (%): 0.3%

Sample

1st row: 자동차손해배상보장법위반과태료
2nd row: 화물자동차운수사업법위반과징금
3rd row: 부가가치세
4th row: 차량출입시설
5th row: 학사사용료

Value  Count  Frequency (%)
차량출입시설  1155  11.6%
자동차검사지연과태료  1104  11.0%
학사사용료  981  9.8%
자동차손해배상보장법위반과태료  956  9.6%
보건진료소진료사업수입  818  8.2%
장애인주차구역위반과태료  506  5.1%
옥외간판도로공간사용료  396  4.0%
시군구재산대부료  376  3.8%
그외수입  346  3.5%
시군구재산임대료  326  3.3%
Other values (126)  3036  30.4%

Most occurring characters

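Value  Count  Frequency (%)
(missing)  8296  9.4%
(missing)  5311  6.0%
(missing)  3952  4.5%
(missing)  3293  3.7%
(missing)  3234  3.7%
(missing)  2706  3.1%
(missing)  2570  2.9%
(missing)  2331  2.6%
(missing)  2297  2.6%
(missing)  2256  2.6%
Other values (185)  51848  58.9%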

Most occurring categories

Value  Count  Frequency (%)
Other Letter  88079  > 99.9%
Close Punctuation  5  < 0.1%
Open Punctuation  5  < 0.1%
Decimal Number  2  < 0.1%
Dash Punctuation  2  < 0.1%
Space Separator  1  < 0.1%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
(missing)  8296  9.4%
(missing)  5311  6.0%
(missing)  3952  4.5%
(missing)  3293  3.7%
(missing)  3234  3.7%
(missing)  2706  3.1%
(missing)  2570  2.9%
(missing)  2331  2.6%
(missing)  2297  2.6%
(missing)  2256  2.6%
Other values (180)  51833  58.8%

Close Punctuation
Value  Count  Frequency (%)
)  5  100.0%

Open Punctuation
Value  Count  Frequency (%)
(  5  100.0%

Decimal Number
Value  Count  Frequency (%)
4  2  100.0%

Dash Punctuation
Value  Count  Frequency (%)
-  2  100.0%

Space Separator
Value  Count  Frequency (%)
(space)  1  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  88079  > 99.9%
Common  15  < 0.1%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
(missing)  8296  9.4%
(missing)  5311  6.0%
(missing)  3952  4.5%
(missing)  3293  3.7%
(missing)  3234  3.7%
(missing)  2706  3.1%
(missing)  2570  2.9%
(missing)  2331  2.6%
(missing)  2297  2.6%
(missing)  2256  2.6%
Other values (180)  51833  58.8%

Common
Value  Count  Frequency (%)
)  5  33.3%
(  5  33.3%
4  2  13.3%
-  2  13.3%
(space)  1  6.7%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  88079  > 99.9%
ASCII  15  < 0.1%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
(missing)  8296  9.4%
(missing)  5311  6.0%
(missing)  3952  4.5%
(missing)  3293  3.7%
(missing)  3234  3.7%
(missing)  2706  3.1%
(missing)  2570  2.9%
(missing)  2331  2.6%
(missing)  2297  2.6%
(missing)  2256  2.6%
Other values (180)  51833  58.8%

ASCII
Value  Count  Frequency (%)
)  5  33.3%
(  5  33.3%
4  2  13.3%
-  2  13.3%
(space)  1  6.7%

데이터기준일 (data reference date)
Date

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2023-09-01 00:00:00
Maximum: 2023-09-01 00:00:00
Histogram with fixed size bins (bins=1)

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]
          고지구분  세목코드
고지구분  1.000  0.416
세목코드  0.416  1.000

[Figure: correlation heatmap]
          세목코드  고지구분
세목코드  1.000  0.154
고지구분  0.154  1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

(index)  거래일자  입금상태  납기일자  고지구분  세목코드  세목명  데이터기준일
1489  2022-02-25  입금  2022-03-07  독촉  288080  자동차손해배상보장법위반과태료  2023-09-01
8686  2022-10-31  입금  2022-10-31  일반  288238  화물자동차운수사업법위반과징금  2023-09-01
3553  2022-05-27  입금  2022-05-27  독촉  299099  부가가치세  2023-09-01
15393  2023-06-23  입금  2023-06-30  일반  205007  차량출입시설  2023-09-01
9864  2022-12-27  입금  2022-12-27  일반  211173  학사사용료  2023-09-01
2820  2022-04-26  입금  2022-05-02  독촉  288131  자동차검사지연과태료  2023-09-01
4721  2022-06-20  입금  2022-06-30  일반  205007  차량출입시설  2023-09-01
44  2022-01-05  입금  2022-01-24  일반  288702  감염병예방및관리법과태료  2023-09-01
13456  2023-05-10  입금  2023-05-10  일반  211173  학사사용료  2023-09-01
12528  2023-03-28  입금  2023-05-22  일반  251002  시군구유재산매각수입금  2023-09-01

(index)  거래일자  입금상태  납기일자  고지구분  세목코드  세목명  데이터기준일
3320  2022-05-19  입금  2022-05-31  독촉  288131  자동차검사지연과태료  2023-09-01
5792  2022-06-30  입금  2022-06-30  일반  205008  사설안내표지판  2023-09-01
5780  2022-06-30  입금  2022-06-30  독촉  288080  자동차손해배상보장법위반과태료  2023-09-01
4971  2022-06-23  입금  2022-06-23  일반  219216  보건진료소진료사업수입  2023-09-01
10039  2023-01-04  입금  2023-01-04  일반  715001  국고보조금등반환금  2023-09-01
9453  2022-12-06  입금  2023-01-02  일반  288131  자동차검사지연과태료  2023-09-01
16558  2023-07-10  입금  2023-07-10  일반  219216  보건진료소진료사업수입  2023-09-01
5631  2022-06-30  입금  2022-06-30  일반  205007  차량출입시설  2023-09-01
17030  2023-07-26  입금  2023-08-31  일반  288133  쓰레기불법투기과태료  2023-09-01
14004  2023-05-31  입금  2023-05-31  독촉  288080  자동차손해배상보장법위반과태료  2023-09-01

Duplicate rows

Most frequently occurring

(index)  거래일자  입금상태  납기일자  고지구분  세목코드  세목명  데이터기준일  # duplicates
543  2022-06-30  입금  2022-06-30  일반  205007  차량출입시설  2023-09-01  95
1586  2023-06-30  입금  2023-06-30  일반  205007  차량출입시설  2023-09-01  87
461  2022-06-20  입금  2022-06-30  일반  205007  차량출입시설  2023-09-01  44
1491  2023-06-19  입금  2023-06-30  일반  205007  차량출입시설  2023-09-01  42
444  2022-06-17  입금  2022-06-30  일반  205007  차량출입시설  2023-09-01  40
536  2022-06-29  입금  2022-06-30  일반  205007  차량출입시설  2023-09-01  37
431  2022-06-16  입금  2022-06-30  일반  205007  차량출입시설  2023-09-01  36
1302  2023-04-10  입금  2023-04-10  일반  211173  학사사용료  2023-09-01  36
1467  2023-06-16  입금  2023-06-30  일반  205007  차량출입시설  2023-09-01  36
1365  2023-05-10  입금  2023-05-10  일반  211173  학사사용료  2023-09-01  35