gimi9 Pandas Profiling

Dataset statistics

Number of variables	3
Number of observations	21
Missing cells	0
Missing cells (%)	0.0%
Duplicate rows	0
Duplicate rows (%)	0.0%
Total size in memory	657.0 B
Average record size in memory	31.3 B

Variable types

Text	1
Categorical	1
Numeric	1

Dataset

Description	Billing criteria for water tariffs in Yeosu-si, Jeollanam-do. The data gives the water supply rate per ton by usage band and by customer category.
URL	https://www.data.go.kr/data/15093343/fileData.do

Reproduction

Analysis started	2023-12-12 02:28:00.770891
Analysis finished	2023-12-12 02:28:01.116590
Duration	0.35 seconds
Software version	ydata-profiling v4.5.1
Download configuration	config.json
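
A report of this kind can be regenerated roughly as follows. This is a minimal sketch rather than the configuration used for this run: the CSV path, the `cp949` encoding, and the report title are assumptions, and the settings in `config.json` are not applied.

```python
from pathlib import Path


def profile_csv(csv_path: str, out_path: str = "report.html",
                encoding: str = "cp949") -> Path:
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so this module still loads where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the source CSV is CP949-encoded; change this if it is UTF-8.
    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Pandas Profiling Report")
    report.to_file(out_path)
    return Path(out_path)
```

Calling `profile_csv("water_rates.csv")` (a hypothetical file name) would write `report.html` to the working directory.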

사용량 (usage band)
Text

Distinct	20
Distinct (%)	95.2%
Missing	0
Missing (%)	0.0%
Memory size	300.0 B

Length

Max length	7
Median length	6
Mean length	5.3809524
Min length	4

Characters and Unicode

Total characters	113
Distinct characters	9
Distinct categories	3
Distinct scripts	2
Distinct blocks	2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
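
As an illustration of the character properties mentioned above, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for two values taken from this column's sample rows. The helper name `category_counts` is hypothetical. The codes `Nd`, `Pd`, and `Lo` correspond to Decimal Number, Dash Punctuation, and Other Letter.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category (e.g. Nd, Pd, Lo)."""
    return Counter(unicodedata.category(ch) for ch in text)


# Two values that appear in this column.
for value in ("1-10", "501이상"):
    print(value, dict(category_counts(value)))
```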

Unique

Unique	19
Unique (%)	90.5%

Sample

1st row	1-10
2nd row	11-20
3rd row	21-30
4th row	31-40
5th row	41-50

Value	Count	Frequency (%)
51-100	2	9.5%
1-10	1	4.8%
1-30	1	4.8%
1-40000	1	4.8%
501이상	1	4.8%
301-500	1	4.8%
201-300	1	4.8%
1-200	1	4.8%
101이상	1	4.8%
31-50	1	4.8%
Other values (10)	10	47.6%

Most occurring characters

Value	Count	Frequency (%)
0	34	30.1%
1	27	23.9%
-	16	14.2%
5	8	7.1%
3	8	7.1%
2	6	5.3%
이	5	4.4%
상	5	4.4%
4	4	3.5%

Most occurring categories

Value	Count	Frequency (%)
Decimal Number	87	77.0%
Dash Punctuation	16	14.2%
Other Letter	10	8.8%

Most frequent character per category

Decimal Number

Value	Count	Frequency (%)
0	34	39.1%
1	27	31.0%
5	8	9.2%
3	8	9.2%
2	6	6.9%
4	4	4.6%

Other Letter

Value	Count	Frequency (%)
이	5	50.0%
상	5	50.0%

Dash Punctuation

Value	Count	Frequency (%)
-	16	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Common	103	91.2%
Hangul	10	8.8%

Most frequent character per script

Common

Value	Count	Frequency (%)
0	34	33.0%
1	27	26.2%
-	16	15.5%
5	8	7.8%
3	8	7.8%
2	6	5.8%
4	4	3.9%

Hangul

Value	Count	Frequency (%)
이	5	50.0%
상	5	50.0%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	103	91.2%
Hangul	10	8.8%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
0	34	33.0%
1	27	26.2%
-	16	15.5%
5	8	7.8%
3	8	7.8%
2	6	5.8%
4	4	3.9%

Hangul

Value	Count	Frequency (%)
이	5	50.0%
상	5	50.0%

업종별 (customer category)
Categorical

Distinct	5
Distinct (%)	23.8%
Missing	0
Missing (%)	0.0%
Memory size	300.0 B

가정용 (household)	6
업무용 (business)	5
영업용 (commercial)	4
대중탕용 (public bathhouse)	4
공업용 (industrial)	2

Length

Max length	4
Median length	3
Mean length	3.1904762
Min length	3

Unique

Unique	0
Unique (%)	0.0%
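
The distinct, unique, and length figures for this variable can be recomputed from the category counts listed in the report. In this sketch, "distinct" is the number of different values and "unique" is the number of values that occur exactly once. The variable names are illustrative.

```python
from collections import Counter

# Category counts as listed in the report for this variable.
COUNTS = {"가정용": 6, "업무용": 5, "영업용": 4, "대중탕용": 4, "공업용": 2}

# Expand the counts back into one value per observation.
values = [name for name, n in COUNTS.items() for _ in range(n)]
freq = Counter(values)

n_obs = len(values)
distinct = len(freq)                                # number of different values
unique = sum(1 for c in freq.values() if c == 1)    # values seen exactly once
distinct_pct = 100 * distinct / n_obs
mean_length = sum(len(v) for v in values) / n_obs   # length in characters

print(n_obs, distinct, unique, round(distinct_pct, 1), round(mean_length, 7))
```

No category occurs exactly once, which is why the unique count is 0 even though there are 5 distinct values.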

Sample

1st row	가정용
2nd row	가정용
3rd row	가정용
4th row	가정용
5th row	가정용

Common Values

Value	Count	Frequency (%)
가정용	6	28.6%
업무용	5	23.8%
영업용	4	19.0%
대중탕용	4	19.0%
공업용	2	9.5%

Length

Histogram of lengths of the category

상수도(원_톤) (water supply rate, KRW per ton)
Real number (ℝ)

Distinct	20
Distinct (%)	95.2%
Missing	0
Missing (%)	0.0%
Infinite	0
Infinite (%)	0.0%
Mean	1393.8095

Minimum	550
Maximum	2660
Zeros	0
Zeros (%)	0.0%
Negative	0
Negative (%)	0.0%
Memory size	321.0 B

Quantile statistics

Minimum	550
5-th percentile	590
Q1	950
median	1400
Q3	1780
95-th percentile	2450
Maximum	2660
Range	2110
Interquartile range (IQR)	830

Descriptive statistics

Standard deviation	612.85786
Coefficient of variation (CV)	0.43969987
Kurtosis	-0.54624057
Mean	1393.8095
Median Absolute Deviation (MAD)	420
Skewness	0.53536178
Sum	29270
Variance	375594.76
Monotonicity	Not monotonic
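
The statistics above can be cross-checked with the Python standard library. The sketch below uses the 21 rates listed in this variable's minimum and maximum value tables. It assumes the sample (n - 1) variance and linear interpolation between order statistics for the quartiles, which is consistent with the figures reported here.

```python
import statistics

# The 21 rates, taken from the minimum and maximum value tables of this report.
RATES = [550, 590, 710, 770, 770, 950, 1000, 1110, 1120, 1230, 1400,
         1420, 1450, 1470, 1660, 1780, 1820, 2120, 2240, 2450, 2660]

mean = statistics.mean(RATES)
variance = statistics.variance(RATES)   # sample variance (divides by n - 1)
std = statistics.stdev(RATES)
cv = std / mean                         # coefficient of variation
median = statistics.median(RATES)

# "inclusive" interpolates linearly between the observed order statistics.
q1, _, q3 = statistics.quantiles(RATES, n=4, method="inclusive")
iqr = q3 - q1

# Median absolute deviation: the median distance from the median.
mad = statistics.median(abs(x - median) for x in RATES)

print(sum(RATES), mean, variance, std, cv, median, q1, q3, iqr, mad)
```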

Histogram with fixed size bins (bins=20)

Common values

Value	Count	Frequency (%)
770	2	9.5%
590	1	4.8%
1400	1	4.8%
710	1	4.8%
550	1	4.8%
1470	1	4.8%
1230	1	4.8%
1000	1	4.8%
2660	1	4.8%
2240	1	4.8%
Other values (10)	10	47.6%

Minimum 10 values

Value	Count	Frequency (%)
550	1	4.8%
590	1	4.8%
710	1	4.8%
770	2	9.5%
950	1	4.8%
1000	1	4.8%
1110	1	4.8%
1120	1	4.8%
1230	1	4.8%
1400	1	4.8%

Maximum 10 values

Value	Count	Frequency (%)
2660	1	4.8%
2450	1	4.8%
2240	1	4.8%
2120	1	4.8%
1820	1	4.8%
1780	1	4.8%
1660	1	4.8%
1470	1	4.8%
1450	1	4.8%
1420	1	4.8%

Interaction plot: 상수도(원_톤) against 상수도(원_톤) (figure not reproduced)

Phik (φk)
Auto


	사용량	업종별	상수도(원_톤)
사용량	1.000	0.909	0.760
업종별	0.909	1.000	0.000
상수도(원_톤)	0.760	0.000	1.000


	상수도(원_톤)	업종별
상수도(원_톤)	1.000	0.051
업종별	0.051	1.000

Count: a simple visualization of nullity by column.

Matrix: the nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

First rows

	사용량	업종별	상수도(원_톤)
0	1-10	가정용	590
1	11-20	가정용	770
2	21-30	가정용	950
3	31-40	가정용	1120
4	41-50	가정용	1420
5	51이상	가정용	1660
6	1-20	업무용	1110
7	21-50	업무용	1450
8	51-100	업무용	1780
9	101-300	업무용	2120

Last rows

	사용량	업종별	상수도(원_톤)
11	1-30	영업용	1400
12	31-50	영업용	1820
13	51-100	영업용	2240
14	101이상	영업용	2660
15	1-200	대중탕용	770
16	201-300	대중탕용	1000
17	301-500	대중탕용	1230
18	501이상	대중탕용	1470
19	1-40000	공업용	550
20	40001이상	공업용	710
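
The 사용량 column stores each usage band as text: either `low-high` or `low이상`, where 이상 means "or more". A minimal sketch of turning these labels into numeric bounds is shown below. The function name `parse_band` and the use of `None` for an open upper bound are illustrative choices, not part of the dataset.

```python
import re
from typing import Optional, Tuple

# "low-high" for a closed band, "low이상" ("low or more") for an open-ended one.
BAND = re.compile(r"^(\d+)(?:-(\d+)|이상)$")


def parse_band(label: str) -> Tuple[int, Optional[int]]:
    """Return (low, high) for a usage band; high is None when open-ended."""
    m = BAND.match(label.strip())
    if m is None:
        raise ValueError(f"unrecognised usage band: {label!r}")
    low = int(m.group(1))
    high = int(m.group(2)) if m.group(2) is not None else None
    return low, high


# Labels taken from the rows shown above.
for label in ("1-10", "51이상", "1-40000"):
    print(label, parse_band(label))
```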
