gimi9 Pandas Profiling

Dataset statistics

Number of variables	4
Number of observations	131
Missing cells	0
Missing cells (%)	0.0%
Duplicate rows	0
Duplicate rows (%)	0.0%
Total size in memory	4.5 KiB
Average record size in memory	35.0 B

Variable types

Numeric	2
Text	2

Dataset

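Description	Provides bulky waste disposal fee information for Seo-gu, Daejeon Metropolitan City (bulky waste item name, bulky waste specification, bulky waste fee)
Author	Seo-gu, Daejeon Metropolitan City (대전광역시 서구)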
URL	https://www.data.go.kr/data/15089837/fileData.do

Alerts

순번 has unique values (Unique)

Reproduction

Analysis started	2023-12-12 20:00:14.486838
Analysis finished	2023-12-12 20:00:15.426964
Duration	0.94 seconds
Software version	ydata-profiling v4.5.1
Download configuration	config.json

순번
Real number (ℝ)

UNIQUE

Distinct	131
Distinct (%)	100.0%
Missing	0
Missing (%)	0.0%
Infinite	0
Infinite (%)	0.0%
Mean	66

Minimum	1
Maximum	131
Zeros	0
Zeros (%)	0.0%
Negative	0
Negative (%)	0.0%
Memory size	1.3 KiB

Quantile statistics

Minimum	1
5-th percentile	7.5
Q1	33.5
median	66
Q3	98.5
95-th percentile	124.5
Maximum	131
Range	130
Interquartile range (IQR)	65

Descriptive statistics

Standard deviation	37.960506
Coefficient of variation (CV)	0.57515918
Kurtosis	-1.2
Mean	66
Median Absolute Deviation (MAD)	33
Skewness	0
Sum	8646
Variance	1441
Monotonicity	Strictly increasing
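
The statistics above can be re-derived from the column alone, since 순번 is simply the sequence 1 to 131. The following is a minimal sketch using only Python's standard library. It assumes the report uses the sample (n - 1) variance and linearly interpolated quantiles, which is what the reported values are consistent with; the variable names are illustrative and not part of the report.

```python
import statistics

# 순번 is a strictly increasing index from 1 to 131 (see Minimum/Maximum above).
values = list(range(1, 132))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, divides by n - 1
std_dev = statistics.stdev(values)
cv = std_dev / mean                      # coefficient of variation

# method="inclusive" interpolates linearly between the observed min and max.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
twentieths = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]
iqr = q3 - q1

# Median absolute deviation: the median distance from the median.
mad = statistics.median(abs(v - median) for v in values)

print(f"sum={total} mean={mean} variance={variance} std={std_dev:.6f} cv={cv:.8f}")
print(f"p5={p5} q1={q1} median={median} q3={q3} p95={p95} iqr={iqr} mad={mad}")
```

For an unbroken sequence 1..n the sample variance is n(n + 1)/12, which gives 131 × 132 / 12 = 1441 here.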

Histogram with fixed size bins (bins=50) (figure not reproduced)

Common values

Value	Count	Frequency (%)
1	1	0.8%
84	1	0.8%
98	1	0.8%
97	1	0.8%
96	1	0.8%
95	1	0.8%
94	1	0.8%
93	1	0.8%
92	1	0.8%
91	1	0.8%
Other values (121)	121	92.4%

Minimum 10 values

Value	Count	Frequency (%)
1	1	0.8%
2	1	0.8%
3	1	0.8%
4	1	0.8%
5	1	0.8%
6	1	0.8%
7	1	0.8%
8	1	0.8%
9	1	0.8%
10	1	0.8%

Maximum 10 values

Value	Count	Frequency (%)
131	1	0.8%
130	1	0.8%
129	1	0.8%
128	1	0.8%
127	1	0.8%
126	1	0.8%
125	1	0.8%
124	1	0.8%
123	1	0.8%
122	1	0.8%

품명
Text

Distinct	67
Distinct (%)	51.1%
Missing	0
Missing (%)	0.0%
Memory size	1.2 KiB

Length

Max length	8
Median length	7
Mean length	3.2748092
Min length	1

Characters and Unicode

Total characters	429
Distinct characters	123
Distinct categories	5
Distinct scripts	3
Distinct blocks	2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
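
As an illustration of how such character properties can be read off, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` and the label mapping are assumptions made for this example, and the input is only a handful of values taken from the sample rows of this report, not the full column.

```python
import unicodedata
from collections import Counter

# Readable labels for the two-letter general category codes seen in this report.
CATEGORY_LABELS = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Zs": "Space Separator",
}

def category_counts(values):
    """Count Unicode general categories over every character in the values."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)

# A few 품명 and 규격 values copied from the sample rows of this report.
sample = ["냉장고", "TV", "1톤당(용랑기준)"]
counts = category_counts(sample)

for code, count in counts.most_common():
    print(CATEGORY_LABELS.get(code, code), count)
```

Hangul syllables fall under Other Letter, which is why that category dominates the tables for both text columns.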

Unique

Unique	31
Unique (%)	23.7%
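
Distinct and Unique are easy to confuse. The sketch below assumes the usual reading in this kind of report: Distinct is the number of different values, and Unique is the number of values that occur exactly once. The helper `distinct_and_unique` is hypothetical, and the input is limited to the 품명 values of the 20 sample rows shown in this report.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

def percent(part, whole):
    """Share of `part` in `whole`, rounded to one decimal as in the report."""
    return round(100 * part / whole, 1)

# 품명 values from the first 10 and last 10 rows of this report.
sample = (
    ["냉장고"] * 7 + ["TV"] * 3 + ["목재"] + ["오락기"] * 2
    + ["광고판"] * 4 + ["물탱크"] + ["이불류"] * 2
)

distinct, unique = distinct_and_unique(sample)
print(distinct, unique)

# The reported percentages are the counts divided by the 131 observations.
print(percent(67, 131), percent(31, 131))
```

On the 20 sample rows only 목재 and 물탱크 appear once, so Unique is smaller than Distinct, matching the pattern in the full column.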

Sample

1st row	냉장고
2nd row	냉장고
3rd row	냉장고
4th row	냉장고
5th row	냉장고

Value	Count	Frequency (%)
냉장고	7	5.3%
침대	6	4.6%
tv	6	4.6%
에어컨(온풍기	5	3.8%
소파	5	3.8%
광고판	4	3.1%
장농	4	3.1%
컴퓨터	4	3.1%
피아노	3	2.3%
책상	3	2.3%
Other values (57)	84	64.1%

Most occurring characters

Value	Count	Frequency (%)
장	27	6.3%
기	25	5.8%
대	16	3.7%
(	15	3.5%
)	15	3.5%
자	11	2.6%
고	11	2.6%
오	11	2.6%
풍	7	1.6%
냉	7	1.6%
Other values (113)	284	66.2%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	383	89.3%
Open Punctuation	15	3.5%
Close Punctuation	15	3.5%
Uppercase Letter	14	3.3%
Decimal Number	2	0.5%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
장	27	7.0%
기	25	6.5%
대	16	4.2%
자	11	2.9%
고	11	2.9%
오	11	2.9%
풍	7	1.8%
냉	7	1.8%
침	7	1.8%
소	6	1.6%
Other values (108)	255	66.6%

Uppercase Letter

Value	Count	Frequency (%)
V	7	50.0%
T	7	50.0%

Open Punctuation

Value	Count	Frequency (%)
(	15	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	15	100.0%

Decimal Number

Value	Count	Frequency (%)
1	2	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Hangul	383	89.3%
Common	32	7.5%
Latin	14	3.3%

Most frequent character per script

Hangul

Value	Count	Frequency (%)
장	27	7.0%
기	25	6.5%
대	16	4.2%
자	11	2.9%
고	11	2.9%
오	11	2.9%
풍	7	1.8%
냉	7	1.8%
침	7	1.8%
소	6	1.6%
Other values (108)	255	66.6%

Common

Value	Count	Frequency (%)
(	15	46.9%
)	15	46.9%
1	2	6.2%

Latin

Value	Count	Frequency (%)
V	7	50.0%
T	7	50.0%

Most occurring blocks

Value	Count	Frequency (%)
Hangul	383	89.3%
ASCII	46	10.7%

Most frequent character per block

Hangul

Value	Count	Frequency (%)
장	27	7.0%
기	25	6.5%
대	16	4.2%
자	11	2.9%
고	11	2.9%
오	11	2.9%
풍	7	1.8%
냉	7	1.8%
침	7	1.8%
소	6	1.6%
Other values (108)	255	66.6%

ASCII

Value	Count	Frequency (%)
(	15	32.6%
)	15	32.6%
V	7	15.2%
T	7	15.2%
1	2	4.3%

규격
Text

Distinct	88
Distinct (%)	67.2%
Missing	0
Missing (%)	0.0%
Memory size	1.2 KiB

Length

Max length	11
Median length	10
Mean length	5.5038168
Min length	2

Characters and Unicode

Total characters	721
Distinct characters	108
Distinct categories	8
Distinct scripts	3
Distinct blocks	2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique	81
Unique (%)	61.8%

Sample

1st row	800리터이상
2nd row	800리터미만
3rd row	600리터미만
4th row	400리터미만
5th row	300리터미만

Value	Count	Frequency (%)
모든규격	29	22.0%
높이1m미만	6	4.5%
높이1m이상	6	4.5%
5.5제곱미터이상	3	2.3%
편수	2	1.5%
양수	2	1.5%
5.5제곱미터미만	2	1.5%
전신(f.r.p	1	0.8%
어프라이드	1	0.8%
그랜드	1	0.8%
Other values (79)	79	59.8%

Most occurring characters

Value	Count	Frequency (%)
이	61	8.5%
1	41	5.7%
미	39	5.4%
상	31	4.3%
모	30	4.2%
규	29	4.0%
격	29	4.0%
든	29	4.0%
m	28	3.9%
만	27	3.7%
Other values (98)	377	52.3%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	522	72.4%
Decimal Number	125	17.3%
Lowercase Letter	42	5.8%
Other Punctuation	12	1.7%
Open Punctuation	7	1.0%
Close Punctuation	6	0.8%
Uppercase Letter	6	0.8%
Space Separator	1	0.1%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
이	61	11.7%
미	39	7.5%
상	31	5.9%
모	30	5.7%
규	29	5.6%
격	29	5.6%
든	29	5.6%
만	27	5.2%
터	22	4.2%
인	17	3.3%
Other values (76)	208	39.8%

Decimal Number

Value	Count	Frequency (%)
1	41	32.8%
5	27	21.6%
0	26	20.8%
2	10	8.0%
4	7	5.6%
9	4	3.2%
3	4	3.2%
6	4	3.2%
8	2	1.6%

Lowercase Letter

Value	Count	Frequency (%)
m	28	66.7%
c	8	19.0%
x	2	4.8%
k	2	4.8%
g	2	4.8%

Uppercase Letter

Value	Count	Frequency (%)
P	2	33.3%
R	2	33.3%
F	2	33.3%

Other Punctuation

Value	Count	Frequency (%)
.	11	91.7%
/	1	8.3%

Open Punctuation

Value	Count	Frequency (%)
(	7	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	6	100.0%

Space Separator

Value	Count	Frequency (%)
(space)	1	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Hangul	522	72.4%
Common	151	20.9%
Latin	48	6.7%

Most frequent character per script

Hangul

Value	Count	Frequency (%)
이	61	11.7%
미	39	7.5%
상	31	5.9%
모	30	5.7%
규	29	5.6%
격	29	5.6%
든	29	5.6%
만	27	5.2%
터	22	4.2%
인	17	3.3%
Other values (76)	208	39.8%

Common

Value	Count	Frequency (%)
1	41	27.2%
5	27	17.9%
0	26	17.2%
.	11	7.3%
2	10	6.6%
4	7	4.6%
(	7	4.6%
)	6	4.0%
9	4	2.6%
3	4	2.6%
Other values (4)	8	5.3%

Latin

Value	Count	Frequency (%)
m	28	58.3%
c	8	16.7%
P	2	4.2%
R	2	4.2%
F	2	4.2%
x	2	4.2%
k	2	4.2%
g	2	4.2%

Most occurring blocks

Value	Count	Frequency (%)
Hangul	522	72.4%
ASCII	199	27.6%

Most frequent character per block

Hangul

Value	Count	Frequency (%)
이	61	11.7%
미	39	7.5%
상	31	5.9%
모	30	5.7%
규	29	5.6%
격	29	5.6%
든	29	5.6%
만	27	5.2%
터	22	4.2%
인	17	3.3%
Other values (76)	208	39.8%

ASCII

Value	Count	Frequency (%)
1	41	20.6%
m	28	14.1%
5	27	13.6%
0	26	13.1%
.	11	5.5%
2	10	5.0%
c	8	4.0%
4	7	3.5%
(	7	3.5%
)	6	3.0%
Other values (12)	28	14.1%

수수료
Real number (ℝ)

Distinct	13
Distinct (%)	9.9%
Missing	0
Missing (%)	0.0%
Infinite	0
Infinite (%)	0.0%
Mean	5118.3206

Minimum	500
Maximum	20000
Zeros	0
Zeros (%)	0.0%
Negative	0
Negative (%)	0.0%
Memory size	1.3 KiB

Quantile statistics

Minimum	500
5-th percentile	2000
Q1	3000
median	4000
Q3	6000
95-th percentile	11000
Maximum	20000
Range	19500
Interquartile range (IQR)	3000

Descriptive statistics

Standard deviation	3457.8962
Coefficient of variation (CV)	0.67559196
Kurtosis	2.7839778
Mean	5118.3206
Median Absolute Deviation (MAD)	1000
Skewness	1.5645155
Sum	670500
Variance	11957046
Monotonicity	Not monotonic

Histogram with fixed size bins (bins=13) (figure not reproduced)

Common values

Value	Count	Frequency (%)
3000	30	22.9%
2000	20	15.3%
4000	19	14.5%
5000	18	13.7%
10000	13	9.9%
8000	10	7.6%
6000	7	5.3%
15000	4	3.1%
1000	4	3.1%
12000	2	1.5%
Other values (3)	4	3.1%

Minimum 10 values

Value	Count	Frequency (%)
500	1	0.8%
1000	4	3.1%
2000	20	15.3%
3000	30	22.9%
4000	19	14.5%
5000	18	13.7%
6000	7	5.3%
7000	2	1.5%
8000	10	7.6%
10000	13	9.9%

Maximum 10 values

Value	Count	Frequency (%)
20000	1	0.8%
15000	4	3.1%
12000	2	1.5%
10000	13	9.9%
8000	10	7.6%
7000	2	1.5%
6000	7	5.3%
5000	18	13.7%
4000	19	14.5%
3000	30	22.9%
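
Because 수수료 has only 13 distinct values, the two tables above together list every value and its count, so the summary statistics can be checked from them directly. The sketch below is one way to do that with Python's standard library. The dictionary is copied from those tables; the sample (n - 1) standard deviation and linearly interpolated quartiles are assumptions that happen to match the reported figures.

```python
import statistics

# fee -> number of rows, copied from the minimum and maximum value tables.
fee_counts = {
    500: 1, 1000: 4, 2000: 20, 3000: 30, 4000: 19, 5000: 18, 6000: 7,
    7000: 2, 8000: 10, 10000: 13, 12000: 2, 15000: 4, 20000: 1,
}

# Expand the frequency table back into one fee per row.
fees = sorted(fee for fee, count in fee_counts.items() for _ in range(count))

n = len(fees)
total = sum(fees)
mean = total / n
median = statistics.median(fees)
std_dev = statistics.stdev(fees)         # sample standard deviation
q1, _q2, q3 = statistics.quantiles(fees, n=4, method="inclusive")

print(f"n={n} sum={total} mean={mean:.4f} median={median}")
print(f"q1={q1} q3={q3} iqr={q3 - q1} std={std_dev:.4f}")
```

The counts add up to the 131 observations and the fees to the reported sum of 670500, which confirms that no value is hidden behind an "Other values" row.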

Interactions

Interaction plots for 순번 and 수수료 (figures not reproduced)

Correlations

Phik (φk)

	순번	품명	규격	수수료
순번	1.000	0.997	0.663	0.174
품명	0.997	1.000	0.000	0.000
규격	0.663	0.000	1.000	0.971
수수료	0.174	0.000	0.971	1.000

Auto

	순번	수수료
순번	1.000	-0.167
수수료	-0.167	1.000

Missing values

Count: A simple visualization of nullity by column.

Matrix: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

	순번	품명	규격	수수료
0	1	냉장고	800리터이상	15000
1	2	냉장고	800리터미만	12000
2	3	냉장고	600리터미만	10000
3	4	냉장고	400리터미만	8000
4	5	냉장고	300리터미만	6000
5	6	냉장고	200리터미만	4000
6	7	냉장고	1000리터 이상	20000
7	8	TV	55인치이상	15000
8	9	TV	55인치미만	10000
9	10	TV	45인치이하	8000

Last rows

	순번	품명	규격	수수료
121	122	목재	길이1m미만	500
122	123	오락기	높이1m이상	10000
123	124	오락기	높이1m미만	5000
124	125	광고판	5제곱미터이상	10000
125	126	광고판	3제곱미터이상	7000
126	127	광고판	1제곱미터이상	5000
127	128	광고판	1제곱미터이하	3000
128	129	물탱크	1톤당(용랑기준)	10000
129	130	이불류	솜이불1채	5000
130	131	이불류	홑이불1채	3000
