Overview

Dataset statistics

Number of variables: 3
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 332.0 KiB
Average record size in memory: 34.0 B

Variable types

Numeric: 2
Text: 1

Dataset

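Description: April water usage of water-service customers located in Gunsan-si, Jeollabuk-do, released as open data for reuse. Fields: number (번호), address (주소, covering both lot-number and road-name addresses), and April water usage in tons (4월 사용량(톤)).
Author: Gunsan-si, Jeollabuk-do (전라북도 군산시)
URL: https://www.data.go.kr/data/15080842/fileData.do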

Alerts

4월 사용량(톤) is highly skewed (γ1 = 66.73406893) [Skewed]
번호 has unique values [Unique]
4월 사용량(톤) has 1614 (16.1%) zeros [Zeros]
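
To make the Skewed and Zeros alerts concrete, the sketch below computes a moment-based skewness and the share of zeros for a small list. The toy values are hypothetical, not taken from the dataset, and the formula here has no bias correction, so it may differ slightly from the estimator the profiling tool uses.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


# Hypothetical toy sample: mostly small readings plus one very large one.
toy_usage = [0, 0, 3, 5, 9, 12, 19, 25, 80, 2000]

print(skewness(toy_usage))    # strongly positive: one large value dominates
print(zero_share(toy_usage))  # 0.2

# The zero share reported in the alert follows from the counts in this report.
reported_zero_pct = round(1614 / 10000 * 100, 1)
print(reported_zero_pct)      # 16.1
```

A single extreme reading is enough to push the skewness well above zero, which is consistent with the two very large maximum values listed for 4월 사용량(톤).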

Reproduction

Analysis started: 2023-12-12 16:36:55.556482
Analysis finished: 2023-12-12 16:36:56.489877
Duration: 0.93 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
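
A report of this kind can likely be regenerated along the following lines. This is a minimal sketch, not the exact invocation used here: the file names are hypothetical, the `cp949` encoding is an assumption (common for Korean public CSV files), and the default settings stand in for the `config.json` referenced above, whose contents are not reproduced in this report.

```python
def build_report(csv_path, html_path, encoding="cp949"):
    """Profile a CSV file and write the HTML report to html_path."""
    # Imported lazily so the sketch can be read and loaded without the
    # third-party packages installed; both are required to actually run it.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)  # encoding is an assumption
    profile = ProfileReport(df, title="Gunsan April water usage")
    profile.to_file(html_path)
    return profile


# Hypothetical usage (file names are placeholders):
# build_report("gunsan_water_april.csv", "gunsan_water_april_report.html")
```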

Variables

번호
Real number (ℝ)

UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22925.123
Minimum: 4
Maximum: 45932
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 4
5-th percentile: 2185.9
Q1: 11402.5
Median: 23145
Q3: 34395
95-th percentile: 43579.1
Maximum: 45932
Range: 45928
Interquartile range (IQR): 22992.5

Descriptive statistics

Standard deviation: 13304.668
Coefficient of variation (CV): 0.58035317
Kurtosis: -1.2067781
Mean: 22925.123
Median Absolute Deviation (MAD): 11486.5
Skewness: -0.006816128
Sum: 2.2925123 × 10^8
Variance: 1.7701418 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
8643  1  < 0.1%
31566  1  < 0.1%
2320  1  < 0.1%
26581  1  < 0.1%
23074  1  < 0.1%
8432  1  < 0.1%
11214  1  < 0.1%
23668  1  < 0.1%
27293  1  < 0.1%
6666  1  < 0.1%
Other values (9990)  9990  99.9%

Value  Count  Frequency (%)
4  1  < 0.1%
8  1  < 0.1%
9  1  < 0.1%
14  1  < 0.1%
16  1  < 0.1%
18  1  < 0.1%
19  1  < 0.1%
21  1  < 0.1%
23  1  < 0.1%
26  1  < 0.1%

Value  Count  Frequency (%)
45932  1  < 0.1%
45928  1  < 0.1%
45924  1  < 0.1%
45922  1  < 0.1%
45913  1  < 0.1%
45907  1  < 0.1%
45894  1  < 0.1%
45891  1  < 0.1%
45886  1  < 0.1%
45884  1  < 0.1%

주소
Text

Distinct: 267
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 23
Median length: 18
Mean length: 18.73
Min length: 14

Characters and Unicode

Total characters: 187300
Distinct characters: 176
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
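
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using the first sample address from this report. The two-letter codes correspond to the category names used in the tables: `Lo` is Other Letter, `Po` is Other Punctuation, `Zs` is Space Separator, `Nd` is Decimal Number, and `Pd` is Dash Punctuation. Script and block lookups are not available in the standard library, so only categories are shown; this is not necessarily how the profiling tool computes them.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters in text by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


sample = "전라북도 군산시 중동 ********"
counts = category_counts(sample)

print(counts["Lo"])  # 9 Hangul syllables (Other Letter)
print(counts["Zs"])  # 3 spaces (Space Separator)
print(counts["Po"])  # 8 asterisks (Other Punctuation)

# The per-category totals in this report add up to the total character count.
reported_total = 100297 + 57536 + 29232 + 230 + 5
print(reported_total)  # 187300
```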

Unique

Unique: 99
Unique (%): 1.0%

Sample

1st row: 전라북도 군산시 중동 ********
2nd row: 전라북도 군산시 경암동 *****
3rd row: 전라북도 군산시 장상리 *****
4th row: 전라북도 군산시 나운동 *****
5th row: 전라북도 군산시 나운동 *****
Value  Count  Frequency (%)
전라북도  10000  25.5%
군산시  9999  25.5%
(not shown)  9204  23.5%
나운동  1199  3.1%
산북동  641  1.6%
경암동  488  1.2%
조촌동  478  1.2%
소룡동  464  1.2%
수송동  411  1.0%
문화동  330  0.8%
Other values (245)  6015  15.3%
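
A word-frequency table like the one above can be approximated by splitting each address on whitespace and counting the tokens. The sketch below does this for the five sample rows listed earlier; whitespace splitting is an assumption, and the profiling tool may tokenize differently (for example, in how it treats the masked `*****` part).

```python
from collections import Counter


def token_counts(addresses):
    """Count whitespace-separated tokens across all address strings."""
    counts = Counter()
    for address in addresses:
        counts.update(address.split())
    return counts


# The five sample rows shown in this report.
rows = [
    "전라북도 군산시 중동 ********",
    "전라북도 군산시 경암동 *****",
    "전라북도 군산시 장상리 *****",
    "전라북도 군산시 나운동 *****",
    "전라북도 군산시 나운동 *****",
]

counts = token_counts(rows)
print(counts["전라북도"])  # 5
print(counts["나운동"])    # 2
```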

Most occurring characters

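Value  Count  Frequency (%)
*  57536  30.7%
(space)  29232  15.6%
(not shown)  11109  5.9%
(not shown)  10669  5.7%
(not shown)  10332  5.5%
(not shown)  10004  5.3%
(not shown)  10003  5.3%
(not shown)  10001  5.3%
(not shown)  10000  5.3%
(not shown)  7259  3.9%
Other values (166)  21155  11.3%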

Most occurring categories

Value  Count  Frequency (%)
Other Letter  100297  53.5%
Other Punctuation  57536  30.7%
Space Separator  29232  15.6%
Decimal Number  230  0.1%
Dash Punctuation  5  < 0.1%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
(not shown)  11109  11.1%
(not shown)  10669  10.6%
(not shown)  10332  10.3%
(not shown)  10004  10.0%
(not shown)  10003  10.0%
(not shown)  10001  10.0%
(not shown)  10000  10.0%
(not shown)  7259  7.2%
(not shown)  2304  2.3%
(not shown)  1240  1.2%
Other values (154)  17376  17.3%

Decimal Number
Value  Count  Frequency (%)
2  84  36.5%
1  77  33.5%
3  46  20.0%
5  9  3.9%
9  4  1.7%
4  4  1.7%
0  3  1.3%
7  2  0.9%
6  1  0.4%

Other Punctuation
Value  Count  Frequency (%)
*  57536  100.0%

Space Separator
Value  Count  Frequency (%)
(space)  29232  100.0%

Dash Punctuation
Value  Count  Frequency (%)
-  5  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  100297  53.5%
Common  87003  46.5%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
(not shown)  11109  11.1%
(not shown)  10669  10.6%
(not shown)  10332  10.3%
(not shown)  10004  10.0%
(not shown)  10003  10.0%
(not shown)  10001  10.0%
(not shown)  10000  10.0%
(not shown)  7259  7.2%
(not shown)  2304  2.3%
(not shown)  1240  1.2%
Other values (154)  17376  17.3%

Common
Value  Count  Frequency (%)
*  57536  66.1%
(space)  29232  33.6%
2  84  0.1%
1  77  0.1%
3  46  0.1%
5  9  < 0.1%
-  5  < 0.1%
9  4  < 0.1%
4  4  < 0.1%
0  3  < 0.1%
Other values (2)  3  < 0.1%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  100297  53.5%
ASCII  87003  46.5%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
*  57536  66.1%
(space)  29232  33.6%
2  84  0.1%
1  77  0.1%
3  46  0.1%
5  9  < 0.1%
-  5  < 0.1%
9  4  < 0.1%
4  4  < 0.1%
0  3  < 0.1%
Other values (2)  3  < 0.1%

Hangul
Value  Count  Frequency (%)
(not shown)  11109  11.1%
(not shown)  10669  10.6%
(not shown)  10332  10.3%
(not shown)  10004  10.0%
(not shown)  10003  10.0%
(not shown)  10001  10.0%
(not shown)  10000  10.0%
(not shown)  7259  7.2%
(not shown)  2304  2.3%
(not shown)  1240  1.2%
Other values (154)  17376  17.3%

4월 사용량(톤)
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 335
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 101.2835
Minimum: 0
Maximum: 240634
Zeros: 1614
Zeros (%): 16.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 3
Median: 9
Q3: 19
95-th percentile: 80
Maximum: 240634
Range: 240634
Interquartile range (IQR): 16
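
One common way to read these quantiles is through Tukey's fences, which flag values more than 1.5 × IQR beyond the quartiles. The report itself does not apply this rule; the sketch below is only an illustration using the Q1 and Q3 values listed above, with the conventional multiplier `k = 1.5` as an assumption.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) fences at k * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


lower, upper = tukey_fences(3, 19)
print(lower, upper)  # -21.0 43.0

# The ten largest values listed in this report all lie above the upper fence.
largest = [240634, 237895, 44522, 31042, 17918, 13561, 13019, 12351, 10065, 9550]
flagged = [v for v in largest if v > upper]
print(len(flagged))  # 10
```

Because usage cannot be negative, only the upper fence (43 tons) is informative here.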

Descriptive statistics

Standard deviation: 3454.3548
Coefficient of variation (CV): 34.1058
Kurtosis: 4600.2438
Mean: 101.2835
Median Absolute Deviation (MAD): 7
Skewness: 66.734069
Sum: 1012835
Variance: 11932567
Monotonicity: Not monotonic
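
Several of these figures are tied together by simple identities: CV is the standard deviation divided by the mean, variance is the standard deviation squared, and the sum is the mean times the number of observations. The sketch below checks the reported values against each other; small differences are expected because the inputs are already rounded.

```python
n = 10_000        # number of observations
mean = 101.2835   # reported mean
std = 3454.3548   # reported standard deviation

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance from the standard deviation
total = mean * n       # sum from the mean and the row count

print(round(cv, 4))     # 34.1058, matches the reported CV
print(round(variance))  # 11932567, matches the reported variance
print(round(total))     # 1012835, matches the reported sum
```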
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
0  1614  16.1%
5  437  4.4%
4  433  4.3%
1  409  4.1%
2  407  4.1%
6  406  4.1%
7  386  3.9%
3  384  3.8%
9  373  3.7%
10  353  3.5%
Other values (325)  4798  48.0%

Value  Count  Frequency (%)
0  1614  16.1%
1  409  4.1%
2  407  4.1%
3  384  3.8%
4  433  4.3%
5  437  4.4%
6  406  4.1%
7  386  3.9%
8  336  3.4%
9  373  3.7%

Value  Count  Frequency (%)
240634  1  < 0.1%
237895  1  < 0.1%
44522  1  < 0.1%
31042  1  < 0.1%
17918  1  < 0.1%
13561  1  < 0.1%
13019  1  < 0.1%
12351  1  < 0.1%
10065  1  < 0.1%
9550  1  < 0.1%

Interactions

[Figures: pairwise interaction plots of 번호 and 4월 사용량(톤)]

Correlations

                번호    4월 사용량(톤)
번호            1.000   0.050
4월 사용량(톤)   0.050   1.000

                번호    4월 사용량(톤)
번호            1.000   -0.045
4월 사용량(톤)   -0.045  1.000
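
The two matrices above show near-zero coefficients of opposite sign; the extracted text does not say which coefficient each one uses. As a general illustration only, the sketch below shows how a linear (Pearson) and a rank-based (Spearman) coefficient can disagree in sign when one variable has an extreme value. The toy data are hypothetical, and the rank helper assumes there are no tied values.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def ranks(values):
    """Ranks starting at 1; assumes all values are distinct (no ties)."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical toy data: y falls steadily, then one extreme value appears.
x = [1, 2, 3, 4, 5, 6]
y = [5, 4, 3, 2, 1, 1000]

print(pearson(x, y))   # positive, driven by the single extreme value
print(spearman(x, y))  # negative, about -0.143
```

With a variable as skewed as 4월 사용량(톤), rank-based coefficients are generally less sensitive to the few very large readings.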

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index)  번호  주소  4월 사용량(톤)
8642  8643  전라북도 군산시 중동 ********  4
15431  15432  전라북도 군산시 경암동 *****  3
45906  45907  전라북도 군산시 장상리 *****  7
24740  24741  전라북도 군산시 나운동 *****  8
20503  20504  전라북도 군산시 나운동 *****  25
38963  38964  전라북도 군산시 마룡리 *****  9
10157  10158  전라북도 군산시 서흥남동********  1
38143  38144  전라북도 군산시 서수리 *****  26
28328  28329  전라북도 군산시 소룡동 *****  1870
31723  31724  전라북도 군산시 산북동 *****  12

(index)  번호  주소  4월 사용량(톤)
36030  36031  전라북도 군산시 대정리 *****  7
44509  44510  전라북도 군산시 고봉리 *****  33
4586  4587  전라북도 군산시 삼학동 ********  20
40530  40531  전라북도 군산시 지경리 *****  4
6723  6724  전라북도 군산시 중앙로2********  0
23770  23771  전라북도 군산시 나운동 *****  0
12749  12750  전라북도 군산시 경장동 *****  0
40225  40226  전라북도 군산시 산월리 *****  6
27192  27193  전라북도 군산시 소룡동 *****  25
25322  25323  전라북도 군산시 나운동 *****  10