Dataset statistics
Number of variables | 7 |
---|---|
Number of observations | 10000 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 40 |
Duplicate rows (%) | 0.4% |
Total size in memory | 644.5 KiB |
Average record size in memory | 66.0 B |
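The two derived figures in this table follow from the raw ones by simple arithmetic. The sketch below re-derives them, assuming 1 KiB = 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
total_kib = 644.5       # Total size in memory, in KiB
n_rows = 10_000         # Number of observations
duplicate_rows = 40     # Duplicate rows

# Assumption: 1 KiB = 1024 bytes; dividing by the row count gives bytes per record.
avg_record_bytes = total_kib * 1024 / n_rows
duplicate_share = duplicate_rows / n_rows

print(f"Average record size: {avg_record_bytes:.1f} B")  # 66.0 B
print(f"Duplicate rows: {duplicate_share:.1%}")          # 0.4%
```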
Variable types
Text | 1 |
---|---|
DateTime | 2 |
Categorical | 2 |
Numeric | 2 |
Dataset
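Description | Analysis data on telephone-received civil complaints about water supply and sewerage, handled by the Billing Team of the Management Support Division, Chuncheon City, Gangwon Province, from June 16, 2015 to May 21, 2021 |
---|---|
Author | Chuncheon City, Gangwon Province (강원도 춘천시) |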
URL | https://www.data.go.kr/data/15097628/fileData.do |
데이터기준일 has constant value "2021-12-17" | Constant |
Dataset has 40 (0.4%) duplicate rows | Duplicates |
안내수 is highly overall correlated with 안내금액 | High correlation |
안내금액 is highly overall correlated with 안내수 | High correlation |
분류 is highly imbalanced (50.1%) | Imbalance |
부과유형 is highly imbalanced (57.8%) | Imbalance |
안내금액 is highly skewed (γ1 = 20.56845779) | Skewed |
안내수 has 3254 (32.5%) zeros | Zeros |
안내금액 has 1412 (14.1%) zeros | Zeros |
Reproduction
Analysis started | 2023-12-12 10:33:41.269576 |
---|---|
Analysis finished | 2023-12-12 10:33:42.538401 |
Duration | 1.27 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
수용가번호 (customer account number)
Text
Distinct | 5716 |
---|---|
Distinct (%) | 57.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 15 |
---|---|
Median length | 15 |
Mean length | 15 |
Min length | 15 |
Characters and Unicode
Total characters | 150000 |
---|---|
Distinct characters | 11 |
Distinct categories | 2 |
Distinct scripts | 1 |
Distinct blocks | 1 |
Unique
Unique | 3704 |
---|---|
Unique (%) | 37.0% |
Sample
1st row | 017-014-1125-70 |
---|---|
2nd row | 025-501-1250-00 |
3rd row | 016-072-3100-02 |
4th row | 013-051-1900-00 |
5th row | 020-017-0264-00 |
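Every sampled value has 15 characters in a 3-3-4-2 digit layout separated by dashes, which agrees with the character totals (10000 × 15 = 150000 characters, of which 10000 × 3 = 30000 are dashes). A minimal validation sketch follows; the pattern is inferred from the samples and is an assumption, not a documented specification of the field.

```python
import re

# Assumed layout inferred from the samples: 3-3-4-2 digits, dash-separated.
ACCOUNT_RE = re.compile(r"\d{3}-\d{3}-\d{4}-\d{2}")

def is_valid_account(value: str) -> bool:
    """Return True when the whole string matches the assumed layout."""
    return ACCOUNT_RE.fullmatch(value) is not None

# The five sample rows listed in the report.
samples = [
    "017-014-1125-70",
    "025-501-1250-00",
    "016-072-3100-02",
    "013-051-1900-00",
    "020-017-0264-00",
]

for s in samples:
    print(s, len(s), is_valid_account(s))
```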
Value | Count | Frequency (%) |
025-501-5920-00 | 11 | 0.1% |
030-010-0499-00 | 10 | 0.1% |
012-207-0140-70 | 10 | 0.1% |
016-100-9420-02 | 9 | 0.1% |
013-280-2210-70 | 9 | 0.1% |
015-040-2600-01 | 9 | 0.1% |
016-071-1800-01 | 9 | 0.1% |
007-050-0369-00 | 9 | 0.1% |
013-051-2400-04 | 9 | 0.1% |
015-013-6600-01 | 9 | 0.1% |
Other values (5706) | 9906 | 99.1% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 56251 | 37.5% |
- | 30000 | 20.0% |
1 | 17599 | 11.7% |
2 | 10103 | 6.7% |
3 | 7567 | 5.0% |
5 | 7183 | 4.8% |
7 | 6201 | 4.1% |
4 | 4978 | 3.3% |
6 | 4201 | 2.8% |
8 | 3266 | 2.2% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 120000 | 80.0% |
Dash Punctuation | 30000 | 20.0% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 56251 | 46.9% |
1 | 17599 | 14.7% |
2 | 10103 | 8.4% |
3 | 7567 | 6.3% |
5 | 7183 | 6.0% |
7 | 6201 | 5.2% |
4 | 4978 | 4.1% |
6 | 4201 | 3.5% |
8 | 3266 | 2.7% |
9 | 2651 | 2.2% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 30000 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 150000 | 100.0% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 56251 | 37.5% |
- | 30000 | 20.0% |
1 | 17599 | 11.7% |
2 | 10103 | 6.7% |
3 | 7567 | 5.0% |
5 | 7183 | 4.8% |
7 | 6201 | 4.1% |
4 | 4978 | 3.3% |
6 | 4201 | 2.8% |
8 | 3266 | 2.2% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 150000 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 56251 | 37.5% |
- | 30000 | 20.0% |
1 | 17599 | 11.7% |
2 | 10103 | 6.7% |
3 | 7567 | 5.0% |
5 | 7183 | 4.8% |
7 | 6201 | 4.1% |
4 | 4978 | 3.3% |
6 | 4201 | 2.8% |
8 | 3266 | 2.2% |
날짜 (date)
Date
Distinct | 1289 |
---|---|
Distinct (%) | 12.9% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Minimum | 2015-06-16 00:00:00 |
---|---|
Maximum | 2021-05-12 00:00:00 |
분류 (category)
Categorical
IMBALANCE
Distinct | 21 |
---|---|
Distinct (%) | 0.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 9 |
---|---|
Median length | 4 |
Mean length | 3.8222 |
Min length | 1 |
Unique
Unique | 4 |
---|---|
Unique (%) | < 0.1% |
Sample
1st row | 지침 |
---|---|
2nd row | 자동납부 |
3rd row | 요금체납 |
4th row | 요금안내 |
5th row | 요금안내 |
Common Values
Value | Count | Frequency (%) |
요금안내 | 3569 | 35.7% |
자동납부 | 3521 | 35.2% |
요금체납 | 1590 | 15.9% |
기타 | 674 | 6.7% |
고지서 | 143 | 1.4% |
단수예정 | 115 | 1.1% |
이사 | 114 | 1.1% |
지침 | 104 | 1.0% |
이사정산 | 63 | 0.6% |
수도검침 | 50 | 0.5% |
Other values (11) | 57 | 0.6% |
안내수 (number of notices)
Real number (ℝ)
HIGH CORRELATION, ZEROS
Distinct | 41 |
---|---|
Distinct (%) | 0.4% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1.7919 |
Minimum | 0 |
---|---|
Maximum | 67 |
Zeros | 3254 |
Zeros (%) | 32.5% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 0 |
median | 1 |
Q3 | 1 |
95-th percentile | 7 |
Maximum | 67 |
Range | 67 |
Interquartile range (IQR) | 1 |
Descriptive statistics
Standard deviation | 3.3689359 |
---|---|
Coefficient of variation (CV) | 1.8800915 |
Kurtosis | 60.780371 |
Mean | 1.7919 |
Median Absolute Deviation (MAD) | 1 |
Skewness | 5.8848345 |
Sum | 17919 |
Variance | 11.349729 |
Monotonicity | Not monotonic |
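Several entries in the quantile and descriptive tables are functions of the others. The sketch below recomputes them from the reported mean, standard deviation, extremes, and quartiles of 안내수 as a consistency check; it uses the textbook definitions, and the variable names are illustrative.

```python
import math

# Values copied from the 안내수 tables.
mean, std = 1.7919, 3.3689359
minimum, maximum = 0, 67
q1, q3 = 0, 1
n_rows = 10_000

cv = std / mean                 # Coefficient of variation
variance = std ** 2             # Variance is the squared standard deviation
value_range = maximum - minimum
iqr = q3 - q1                   # Interquartile range
total = mean * n_rows           # Sum of all values

print(f"CV={cv:.7f} variance={variance:.6f} range={value_range} IQR={iqr} sum={total:.0f}")
```

The same relations hold for 안내금액, for example IQR = 46432.5 − 4460 = 41972.5.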
Value | Count | Frequency (%) |
1 | 4295 | |
0 | 3254 | 32.5% |
2 | 606 | 6.1% |
4 | 412 | 4.1% |
3 | 402 | 4.0% |
5 | 249 | 2.5% |
6 | 179 | 1.8% |
7 | 105 | 1.1% |
8 | 95 | 0.9% |
9 | 77 | 0.8% |
Other values (31) | 326 | 3.3% |
Value | Count | Frequency (%) |
0 | 3254 | 32.5% |
1 | 4295 | |
2 | 606 | 6.1% |
3 | 402 | 4.0% |
4 | 412 | 4.1% |
5 | 249 | 2.5% |
6 | 179 | 1.8% |
7 | 105 | 1.1% |
8 | 95 | 0.9% |
9 | 77 | 0.8% |
Value | Count | Frequency (%) |
67 | 1 | < 0.1% |
56 | 2 | < 0.1% |
52 | 1 | < 0.1% |
49 | 1 | < 0.1% |
48 | 1 | < 0.1% |
45 | 1 | < 0.1% |
43 | 1 | < 0.1% |
38 | 1 | < 0.1% |
37 | 1 | < 0.1% |
36 | 4 | < 0.1% |
안내금액 (notified amount)
Real number (ℝ)
HIGH CORRELATION, SKEWED, ZEROS
Distinct | 4650 |
---|---|
Distinct (%) | 46.5% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 83241.073 |
Minimum | 0 |
---|---|
Maximum | 18438240 |
Zeros | 1412 |
Zeros (%) | 14.1% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 4460 |
median | 16010 |
Q3 | 46432.5 |
95-th percentile | 313802 |
Maximum | 18438240 |
Range | 18438240 |
Interquartile range (IQR) | 41972.5 |
Descriptive statistics
Standard deviation | 381723.69 |
---|---|
Coefficient of variation (CV) | 4.5857613 |
Kurtosis | 694.57151 |
Mean | 83241.073 |
Median Absolute Deviation (MAD) | 14600 |
Skewness | 20.568458 |
Sum | 8.3241073 × 10^8 |
Variance | 1.4571298 × 10^11 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
0 | 1412 | 14.1% |
1410 | 130 | 1.3% |
1260 | 101 | 1.0% |
1130 | 63 | 0.6% |
9210 | 28 | 0.3% |
3080 | 28 | 0.3% |
6660 | 27 | 0.3% |
2750 | 27 | 0.3% |
7810 | 27 | 0.3% |
4950 | 25 | 0.2% |
Other values (4640) | 8132 | 81.3% |
Value | Count | Frequency (%) |
0 | 1412 | 14.1% |
90 | 1 | < 0.1% |
430 | 1 | < 0.1% |
570 | 1 | < 0.1% |
650 | 1 | < 0.1% |
680 | 1 | < 0.1% |
770 | 1 | < 0.1% |
900 | 2 | < 0.1% |
910 | 1 | < 0.1% |
920 | 6 | 0.1% |
Value | Count | Frequency (%) |
18438240 | 1 | < 0.1% |
9562040 | 1 | < 0.1% |
8684980 | 1 | < 0.1% |
7890620 | 1 | < 0.1% |
7544720 | 1 | < 0.1% |
7032590 | 1 | < 0.1% |
6933120 | 1 | < 0.1% |
6872830 | 1 | < 0.1% |
6496530 | 1 | < 0.1% |
6041550 | 1 | < 0.1% |
부과유형 (billing type)
Categorical
IMBALANCE
Distinct | 9 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 7 |
---|---|
Median length | 4 |
Mean length | 3.5497 |
Min length | 3 |
Unique
Unique | 4 |
---|---|
Unique (%) | < 0.1% |
Sample
1st row | <NA> |
---|---|
2nd row | 가정용 |
3rd row | 일반용 |
4th row | <NA> |
5th row | 일반용 |
Common Values
Value | Count | Frequency (%) |
<NA> | 5258 | 52.6% |
가정용 | 4047 | 40.5% |
일반용 | 575 | 5.8% |
일반혼합용 | 106 | 1.1% |
전용공업용 | 10 | 0.1% |
혼합용 | 1 | < 0.1% |
일반용 | 1 | < 0.1% |
대중탕용 | 1 | < 0.1% |
가정용/일반용 | 1 | < 0.1% |
Length
Common Values (Plot)
Value | Count | Frequency (%) |
na | 5258 | 52.6% |
가정용 | 4047 | 40.5% |
일반용 | 576 | 5.8% |
일반혼합용 | 106 | 1.1% |
전용공업용 | 10 | 0.1% |
혼합용 | 1 | < 0.1% |
대중탕용 | 1 | < 0.1% |
가정용/일반용 | 1 | < 0.1% |
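The 57.8% imbalance alert for 부과유형 can be reproduced from the nine counts in the Common Values table if the score is taken to be one minus the normalized Shannon entropy, 1 − H / log(k). That this is the formula ydata-profiling uses is an assumption here, although the result matches the reported figure; `imbalance_score` is an illustrative helper, not a library function.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k): 0 for a uniform distribution, near 1 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    # Shannon entropy of the class frequencies (natural log; the base cancels in the ratio).
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)

# Counts of the 9 distinct 부과유형 values, as listed in the Common Values table.
counts = [5258, 4047, 575, 106, 10, 1, 1, 1, 1]
score = imbalance_score(counts)
print(f"{score:.1%}")  # 57.8%
```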
데이터기준일 (data reference date)
Date
CONSTANT
Distinct | 1 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Minimum | 2021-12-17 00:00:00 |
---|---|
Maximum | 2021-12-17 00:00:00 |
Variable | 분류 | 안내수 | 안내금액 | 부과유형 |
---|---|---|---|---|
분류 | 1.000 | 0.215 | 0.039 | 0.240 |
안내수 | 0.215 | 1.000 | 0.038 | 0.111 |
안내금액 | 0.039 | 0.038 | 1.000 | 0.060 |
부과유형 | 0.240 | 0.111 | 0.060 | 1.000 |
Variable | 부과유형 | 분류 |
---|---|---|
부과유형 | 1.000 | 0.116 |
분류 | 0.116 | 1.000 |
Variable | 안내수 | 안내금액 | 분류 | 부과유형 |
---|---|---|---|---|
안내수 | 1.000 | 0.507 | 0.081 | 0.050 |
안내금액 | 0.507 | 1.000 | 0.015 | 0.027 |
분류 | 0.081 | 0.015 | 1.000 | 0.116 |
부과유형 | 0.050 | 0.027 | 0.116 | 1.000 |
Row index | 수용가번호 | 날짜 | 분류 | 안내수 | 안내금액 | 부과유형 | 데이터기준일 |
---|---|---|---|---|---|---|---|
54913 | 017-014-1125-70 | 2019-09-25 | 지침 | 0 | 0 | <NA> | 2021-12-17 |
86186 | 025-501-1250-00 | 2020-12-15 | 자동납부 | 1 | 3120 | 가정용 | 2021-12-17 |
63817 | 016-072-3100-02 | 2020-01-13 | 요금체납 | 1 | 20950 | 일반용 | 2021-12-17 |
52796 | 013-051-1900-00 | 2019-09-02 | 요금안내 | 1 | 34860 | <NA> | 2021-12-17 |
35106 | 020-017-0264-00 | 2018-12-11 | 요금안내 | 1 | 23740 | 일반용 | 2021-12-17 |
15318 | 015-032-0030-05 | 2017-05-17 | 요금안내 | 0 | 0 | <NA> | 2021-12-17 |
37531 | 030-033-0500-02 | 2019-01-11 | 자동납부 | 0 | 8530 | 가정용 | 2021-12-17 |
61598 | 025-501-5400-00 | 2019-12-11 | 자동납부 | 1 | 11760 | 가정용 | 2021-12-17 |
6458 | 026-300-0250-70 | 2016-04-11 | 기타 | 0 | 0 | <NA> | 2021-12-17 |
51154 | 019-010-0056-00 | 2019-08-09 | 요금안내 | 0 | 148940 | 일반혼합용 | 2021-12-17 |
Row index | 수용가번호 | 날짜 | 분류 | 안내수 | 안내금액 | 부과유형 | 데이터기준일 |
---|---|---|---|---|---|---|---|
33644 | 030-417-0300-00 | 2018-11-09 | 자동납부 | 1 | 13920 | 일반용 | 2021-12-17 |
93112 | 029-100-0607-01 | 2021-03-09 | 이사정산 | 2 | 56030 | <NA> | 2021-12-17 |
40799 | 033-400-5670-00 | 2019-03-12 | 요금안내 | 0 | 18860 | 가정용 | 2021-12-17 |
20921 | 024-331-2700-00 | 2018-01-22 | 요금안내 | 5 | 204430 | <NA> | 2021-12-17 |
46960 | 007-090-1029-00 | 2019-06-12 | 자동납부 | 0 | 1260 | 가정용 | 2021-12-17 |
35757 | 025-501-0460-00 | 2018-12-11 | 자동납부 | 1 | 8610 | 가정용 | 2021-12-17 |
23599 | 020-016-0125-00 | 2018-04-30 | 요금안내 | 1 | 79680 | <NA> | 2021-12-17 |
98710 | 016-416-0200-02 | 2021-04-21 | 기타 | 0 | 0 | <NA> | 2021-12-17 |
64209 | 021-824-0330-50 | 2020-01-13 | 자동납부 | 1 | 6660 | 가정용 | 2021-12-17 |
68270 | 029-400-1900-40 | 2020-03-05 | 요금안내 | 1 | 98830 | <NA> | 2021-12-17 |
Most frequently occurring
Row index | 수용가번호 | 날짜 | 분류 | 안내수 | 안내금액 | 부과유형 | 데이터기준일 | # duplicates |
---|---|---|---|---|---|---|---|---|
35 | 030-020-0267-62 | 2019-01-31 | 요금안내 | 1 | 7360 | <NA> | 2021-12-17 | 3 |
0 | 002-209-0600-02 | 2020-02-25 | 요금체납 | 2 | 18980 | <NA> | 2021-12-17 | 2 |
1 | 004-110-0800-00 | 2018-10-15 | 자동납부 | 1 | 33240 | 일반혼합용 | 2021-12-17 | 2 |
2 | 005-020-1800-00 | 2018-03-05 | 기타 | 0 | 0 | <NA> | 2021-12-17 | 2 |
3 | 006-020-0990-70 | 2017-04-27 | 요금안내 | 0 | 0 | <NA> | 2021-12-17 | 2 |
4 | 007-090-1017-00 | 2017-12-04 | 요금안내 | 10 | 48080 | <NA> | 2021-12-17 | 2 |
5 | 007-090-1086-00 | 2020-01-09 | 자동납부 | 0 | 0 | <NA> | 2021-12-17 | 2 |
6 | 008-035-0400-01 | 2020-04-17 | 요금체납 | 3 | 33220 | <NA> | 2021-12-17 | 2 |
7 | 009-100-5400-00 | 2018-07-11 | 자동납부 | 1 | 2270 | 가정용 | 2021-12-17 | 2 |
8 | 010-096-5600-00 | 2017-08-30 | 요금안내 | 0 | 0 | <NA> | 2021-12-17 | 2 |
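A table like the one above can be built by counting how often each full row occurs and keeping the rows seen more than once. The sketch below does this with the standard library on a handful of rows copied from this report (with `None` standing in for `<NA>`); `duplicate_groups` is an illustrative helper and not how ydata-profiling itself is implemented.

```python
from collections import Counter

def duplicate_groups(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

# Rows copied from the tables in this report; the repeats mirror the "# duplicates" column.
triple = ("030-020-0267-62", "2019-01-31", "요금안내", 1, 7360, None, "2021-12-17")
double = ("002-209-0600-02", "2020-02-25", "요금체납", 2, 18980, None, "2021-12-17")
single = ("017-014-1125-70", "2019-09-25", "지침", 0, 0, None, "2021-12-17")
rows = [triple, double, single, triple, double, triple]

# Print the duplicated rows, most frequent first.
for row, n in sorted(duplicate_groups(rows).items(), key=lambda item: -item[1]):
    print(n, row[0], row[1])
```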