Dataset statistics
Number of variables | 8 |
---|---|
Number of observations | 6104 |
Missing cells | 6919 |
Missing cells (%) | 14.2% |
Duplicate rows | 98 |
Duplicate rows (%) | 1.6% |
Total size in memory | 393.6 KiB |
Average record size in memory | 66.0 B |
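The percentages in this table follow directly from the counts. A minimal sketch of the arithmetic, assuming missing cells are divided by observations × variables and duplicate rows by observations (the variable names are illustrative; the per-column missing counts come from the alerts in this report):

```python
# Figures copied from the Dataset statistics table and the Missing alerts.
n_variables = 8
n_observations = 6104
duplicate_rows = 98
missing_by_column = {"현금": 6104, "수입일자": 815}

missing_cells = sum(missing_by_column.values())  # 6919
total_cells = n_variables * n_observations

missing_pct = missing_cells / total_cells
duplicate_pct = duplicate_rows / n_observations

print(f"Missing cells: {missing_cells} ({missing_pct:.1%})")
print(f"Duplicate rows: {duplicate_rows} ({duplicate_pct:.1%})")
```

The two columns with missing values account for all 6919 missing cells.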
Variable types
Text | 1 |
---|---|
Unsupported | 1 |
Categorical | 1 |
DateTime | 4 |
Numeric | 1 |
Dataset
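Description | Provides accounting-item information on revenue collected at the culture and sports facilities operated by 수원도시공사, including 장안구민회관 (Jangan Community Center), the sports complex (종합운동장), and 가족여성회관 (Family and Women's Center). |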
---|---|
Author | 수원도시공사 |
URL | https://www.data.go.kr/data/15123871/fileData.do |
Dataset has 98 (1.6%) duplicate rows | Duplicates |
총금액 is highly overall correlated with 거래상태 | High correlation |
거래상태 is highly overall correlated with 총금액 | High correlation |
거래상태 is highly imbalanced (72.3%) | Imbalance |
현금 has 6104 (100.0%) missing values | Missing |
수입일자 has 815 (13.4%) missing values | Missing |
현금 is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
Reproduction
Analysis started | 2023-12-12 14:06:48.866809 |
---|---|
Analysis finished | 2023-12-12 14:06:50.235190 |
Duration | 1.37 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
카드 (card)
Text
Distinct | 58 |
---|---|
Distinct (%) | 1.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 47.8 KiB |
Value | Count | Frequency (%) |
현대 | 1637 | 26.6% |
국민 | 935 | 15.2% |
신한 | 934 | 15.2% |
삼성 | 893 | 14.5% |
구하나 | 438 | 7.1% |
비씨 | 302 | 4.9% |
롯데(신 | 188 | 3.1% |
농협 | 139 | 2.3% |
하나(외환 | 122 | 2.0% |
kb국민카드 | 80 | 1.3% |
Other values (49) | 489 | 7.9% |
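The counts in this table sum to 6157, which is 53 more than the 6104 observations and equals the observations plus the 53 space characters reported under Space Separator. That is consistent with values being lowercased and split on whitespace before counting, although the report does not state its tokenization. A minimal sketch under that assumption, using a few 카드 values from the sample rows (`count_words` is a hypothetical helper):

```python
from collections import Counter

def count_words(values):
    """Count lowercased, whitespace-separated tokens across all values."""
    counts = Counter()
    for value in values:
        counts.update(value.lower().split())
    return counts

# A handful of 카드 values as they appear in the sample rows of this report.
cards = ["삼성 마스타", "KB국민카드", "현대비자개인", "KB국민카드", "현대 카드", "삼성"]
words = count_words(cards)

n_spaces = sum(value.count(" ") for value in cards)
n_tokens = sum(words.values())
# Each single space splits a value into one extra token,
# so the token total is rows + spaces.
print(n_tokens, len(cards) + n_spaces)
```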
Most occurring characters
Value | Count | Frequency (%) |
대 | 1698 | 11.0% |
현 | 1697 | 11.0% |
신 | 1208 | 7.8% |
한 | 1020 | 6.6% |
국 | 1016 | 6.6% |
민 | 1016 | 6.6% |
삼 | 950 | 6.2% |
성 | 950 | 6.2% |
나 | 584 | 3.8% |
하 | 584 | 3.8% |
Other values (68) | 4693 | 30.4% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 14388 | 93.3% |
Uppercase Letter | 348 | 2.3% |
Open Punctuation | 310 | 2.0% |
Close Punctuation | 310 | 2.0% |
Space Separator | 53 | 0.3% |
Lowercase Letter | 7 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
대 | 1698 | 11.8% |
현 | 1697 | 11.8% |
신 | 1208 | 8.4% |
한 | 1020 | 7.1% |
국 | 1016 | 7.1% |
민 | 1016 | 7.1% |
삼 | 950 | 6.6% |
성 | 950 | 6.6% |
나 | 584 | 4.1% |
하 | 584 | 4.1% |
Other values (48) | 3665 | 25.5% |
Uppercase Letter
Value | Count | Frequency (%) |
B | 102 | 29.3% |
K | 96 | 27.6% |
N | 60 | 17.2% |
H | 60 | 17.2% |
C | 10 | 2.9% |
I | 7 | 2.0% |
J | 6 | 1.7% |
S | 4 | 1.1% |
P | 2 | 0.6% |
V | 1 | 0.3% |
Lowercase Letter
Value | Count | Frequency (%) |
u | 1 | 14.3% |
n | 1 | 14.3% |
i | 1 | 14.3% |
t | 1 | 14.3% |
a | 1 | 14.3% |
l | 1 | 14.3% |
m | 1 | 14.3% |
Open Punctuation
Value | Count | Frequency (%) |
( | 310 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 310 | 100.0% |
Space Separator
Value | Count | Frequency (%) |
(space) | 53 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 14388 | 93.3% |
Common | 673 | 4.4% |
Latin | 355 | 2.3% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
대 | 1698 | 11.8% |
현 | 1697 | 11.8% |
신 | 1208 | 8.4% |
한 | 1020 | 7.1% |
국 | 1016 | 7.1% |
민 | 1016 | 7.1% |
삼 | 950 | 6.6% |
성 | 950 | 6.6% |
나 | 584 | 4.1% |
하 | 584 | 4.1% |
Other values (48) | 3665 | 25.5% |
Latin
Value | Count | Frequency (%) |
B | 102 | 28.7% |
K | 96 | 27.0% |
N | 60 | 16.9% |
H | 60 | 16.9% |
C | 10 | 2.8% |
I | 7 | 2.0% |
J | 6 | 1.7% |
S | 4 | 1.1% |
P | 2 | 0.6% |
u | 1 | 0.3% |
Other values (7) | 7 | 2.0% |
Common
Value | Count | Frequency (%) |
( | 310 | 46.1% |
) | 310 | 46.1% |
(space) | 53 | 7.9% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 14388 | 93.3% |
ASCII | 1028 | 6.7% |
Most frequent character per block
Hangul
Value | Count | Frequency (%) |
대 | 1698 | 11.8% |
현 | 1697 | 11.8% |
신 | 1208 | 8.4% |
한 | 1020 | 7.1% |
국 | 1016 | 7.1% |
민 | 1016 | 7.1% |
삼 | 950 | 6.6% |
성 | 950 | 6.6% |
나 | 584 | 4.1% |
하 | 584 | 4.1% |
Other values (48) | 3665 | 25.5% |
ASCII
Value | Count | Frequency (%) |
( | 310 | 30.2% |
) | 310 | 30.2% |
B | 102 | 9.9% |
K | 96 | 9.3% |
N | 60 | 5.8% |
H | 60 | 5.8% |
(space) | 53 | 5.2% |
C | 10 | 1.0% |
I | 7 | 0.7% |
J | 6 | 0.6% |
Other values (10) | 14 | 1.4% |
현금 (cash)
Unsupported
MISSING, REJECTED, UNSUPPORTED
Missing | 6104 |
---|---|
Missing (%) | 100.0% |
Memory size | 53.8 KiB |
거래상태 (transaction status)
Categorical
HIGH CORRELATION, IMBALANCE
Distinct | 3 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 47.8 KiB |
결제완료 | 5570 |
---|---|
신용승인 | 525 |
신용취소 | 9 |
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 4 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 신용승인 |
---|---|
2nd row | 신용승인 |
3rd row | 신용승인 |
4th row | 신용승인 |
5th row | 신용취소 |
Common Values
Value | Count | Frequency (%) |
결제완료 | 5570 | 91.3% |
신용승인 | 525 | 8.6% |
신용취소 | 9 | 0.1% |
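The 72.3% imbalance figure in the alerts can be reproduced from these counts as one minus the normalized Shannon entropy of the class distribution. The sketch below assumes that definition; it agrees with the reported value but is not taken from the report itself, and `imbalance_score` is an illustrative name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits
    and k is the number of classes. 0 means balanced, 1 means one class."""
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Counts for 결제완료, 신용승인, 신용취소 from the Common Values table.
score = imbalance_score([5570, 525, 9])
print(f"{score:.1%}")  # 72.3%
```

A perfectly balanced three-class column would score 0% under this definition.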
징수결의일자 (collection resolution date)
Date
Distinct | 84 |
---|---|
Distinct (%) | 1.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 47.8 KiB |
Minimum | 2023-07-01 00:00:00 |
---|---|
Maximum | 2023-09-22 00:00:00 |
수입일자 (receipt date)
Date
MISSING
Distinct | 55 |
---|---|
Distinct (%) | 1.0% |
Missing | 815 |
Missing (%) | 13.4% |
Memory size | 47.8 KiB |
Minimum | 2023-07-07 00:00:00 |
---|---|
Maximum | 2023-09-22 00:00:00 |
총금액 (total amount)
Real number (ℝ)
HIGH CORRELATION
Distinct | 120 |
---|---|
Distinct (%) | 2.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 24574.723 |
Minimum | -90000 |
---|---|
Maximum | 180000 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 9 |
Negative (%) | 0.1% |
Memory size | 53.8 KiB |
Quantile statistics
Minimum | -90000 |
---|---|
5-th percentile | 400 |
Q1 | 2400 |
median | 4000 |
Q3 | 45000 |
95-th percentile | 90000 |
Maximum | 180000 |
Range | 270000 |
Interquartile range (IQR) | 42600 |
Descriptive statistics
Standard deviation | 31928.922 |
---|---|
Coefficient of variation (CV) | 1.2992587 |
Kurtosis | 3.1751652 |
Mean | 24574.723 |
Median Absolute Deviation (MAD) | 3200 |
Skewness | 1.5575097 |
Sum | 1.5000411 × 10^8 |
Variance | 1.0194561 × 10^9 |
Monotonicity | Not monotonic |
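Several of these statistics are derived from one another, which gives a quick consistency check on the table. A minimal sketch using the figures reported for 총금액, assuming the usual definitions (CV = standard deviation / mean, variance = standard deviation squared, sum = mean × count, IQR = Q3 − Q1, range = maximum − minimum):

```python
import math

# Figures copied from the 총금액 statistics in this report.
n = 6104
mean = 24574.723
std = 31928.922
q1, q3 = 2400, 45000
minimum, maximum = -90000, 180000

cv = std / mean              # coefficient of variation
variance = std ** 2
total = mean * n             # sum of all values
iqr = q3 - q1
value_range = maximum - minimum

# Compare against the reported values, allowing for display rounding.
assert math.isclose(cv, 1.2992587, rel_tol=1e-5)
assert math.isclose(variance, 1.0194561e9, rel_tol=1e-5)
assert math.isclose(total, 1.5000411e8, rel_tol=1e-5)
assert iqr == 42600 and value_range == 270000
print("reported statistics are mutually consistent")
```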
Common values
Value | Count | Frequency (%) |
4000 | 1297 | 21.2% |
2400 | 349 | 5.7% |
2000 | 266 | 4.4% |
45000 | 251 | 4.1% |
35000 | 245 | 4.0% |
90000 | 222 | 3.6% |
3000 | 208 | 3.4% |
900 | 178 | 2.9% |
1800 | 159 | 2.6% |
60000 | 151 | 2.5% |
Other values (110) | 2778 | 45.5% |
Minimum 10 values
Value | Count | Frequency (%) |
-90000 | 5 | 0.1% |
-15000 | 1 | < 0.1% |
-10000 | 2 | < 0.1% |
-5000 | 1 | < 0.1% |
50 | 2 | < 0.1% |
100 | 87 | 1.4% |
200 | 56 | 0.9% |
300 | 34 | 0.6% |
350 | 1 | < 0.1% |
400 | 141 | 2.3% |
Maximum 10 values
Value | Count | Frequency (%) |
180000 | 38 | 0.6% |
150000 | 14 | 0.2% |
125810 | 5 | 0.1% |
120000 | 24 | 0.4% |
108000 | 2 | < 0.1% |
106000 | 1 | < 0.1% |
100650 | 10 | 0.2% |
100000 | 68 | 1.1% |
90000 | 222 | 3.6% |
84000 | 35 | 0.6% |
거래일자 (transaction date)
Date
Distinct | 84 |
---|---|
Distinct (%) | 1.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 47.8 KiB |
Minimum | 2023-07-01 00:00:00 |
---|---|
Maximum | 2023-09-22 00:00:00 |
거래시간 (transaction time)
Date
Distinct | 1492 |
---|---|
Distinct (%) | 24.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 47.8 KiB |
Minimum | 2023-12-12 00:04:00 |
---|---|
Maximum | 2023-12-12 23:54:00 |
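The minimum and maximum above carry the date 2023-12-12, which is the day the analysis ran rather than a date in the data. The sample rows later in this report show that 거래시간 holds time-of-day strings such as `9:05:40` and `9:21`, so one plausible explanation is that a date-time parser filled in the current date. A minimal sketch of parsing the column as times instead, assuming only these two formats occur (`parse_time` is a hypothetical helper):

```python
from datetime import datetime, time

# Formats seen in the sample rows; any others would need to be added here.
TIME_FORMATS = ("%H:%M:%S", "%H:%M")

def parse_time(text: str) -> time:
    """Parse 'H:MM:SS' or 'H:MM' into a datetime.time with no date part."""
    cleaned = text.strip()
    for fmt in TIME_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).time()
        except ValueError:
            continue
    raise ValueError(f"unrecognized time format: {text!r}")

print(parse_time("9:05:40"))  # 09:05:40
print(parse_time("9:21"))     # 09:21:00
```

Keeping the values as `datetime.time` avoids attaching an arbitrary date to them.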
Variable | 카드 | 거래상태 | 징수결의일자 | 수입일자 | 총금액 | 거래일자 |
---|---|---|---|---|---|---|
카드 | 1.000 | 0.901 | 0.829 | 0.830 | 0.647 | 0.829 |
거래상태 | 0.901 | 1.000 | 0.923 | 1.000 | 0.841 | 0.923 |
징수결의일자 | 0.829 | 0.923 | 1.000 | 1.000 | 0.775 | 1.000 |
수입일자 | 0.830 | 1.000 | 1.000 | 1.000 | 0.736 | 1.000 |
총금액 | 0.647 | 0.841 | 0.775 | 0.736 | 1.000 | 0.775 |
거래일자 | 0.829 | 0.923 | 1.000 | 1.000 | 0.775 | 1.000 |
Variable | 총금액 | 거래상태 |
---|---|---|
총금액 | 1.000 | 0.830 |
거래상태 | 0.830 | 1.000 |
First rows
Index | 카드 | 현금 | 거래상태 | 징수결의일자 | 수입일자 | 총금액 | 거래일자 | 거래시간 |
---|---|---|---|---|---|---|---|---|
0 | 삼성 마스타 | <NA> | 신용승인 | 2023-07-03 | 2023-07-10 | 10000 | 2023-07-03 | 9:05:40 |
1 | KB국민카드 | <NA> | 신용승인 | 2023-07-03 | 2023-07-10 | 10000 | 2023-07-03 | 9:06:23 |
2 | 현대비자개인 | <NA> | 신용승인 | 2023-07-03 | 2023-07-10 | 10000 | 2023-07-03 | 9:08:37 |
3 | KB국민카드 | <NA> | 신용승인 | 2023-07-03 | <NA> | 10000 | 2023-07-03 | 9:08:49 |
4 | KB국민카드 | <NA> | 신용취소 | 2023-07-03 | <NA> | -10000 | 2023-07-03 | 9:08:49 |
5 | 신한카드체크 | <NA> | 신용승인 | 2023-07-03 | 2023-07-10 | 10000 | 2023-07-03 | 9:09:23 |
6 | 신한카드 | <NA> | 신용승인 | 2023-07-03 | 2023-07-10 | 10000 | 2023-07-03 | 9:10:57 |
7 | KB국민카드 | <NA> | 신용승인 | 2023-07-03 | 2023-07-10 | 10000 | 2023-07-03 | 9:12:12 |
8 | 현대 카드 | <NA> | 신용취소 | 2023-07-03 | <NA> | -10000 | 2023-07-03 | 9:13:12 |
9 | 현대 카드 | <NA> | 신용승인 | 2023-07-03 | <NA> | 10000 | 2023-07-03 | 9:13:14 |
Last rows
Index | 카드 | 현금 | 거래상태 | 징수결의일자 | 수입일자 | 총금액 | 거래일자 | 거래시간 |
---|---|---|---|---|---|---|---|---|
6094 | 삼성 | <NA> | 결제완료 | 20230720 | 20230727 | 63000 | 20230720 | 9:21 |
6095 | 삼성 | <NA> | 결제완료 | 20230920 | <NA> | 45290 | 20230920 | 10:34 |
6096 | 삼성 | <NA> | 결제완료 | 20230727 | 20230803 | 2400 | 20230727 | 19:54 |
6097 | 삼성 | <NA> | 결제완료 | 20230811 | 20230821 | 3000 | 20230811 | 11:42 |
6098 | 삼성 | <NA> | 결제완료 | 20230719 | 20230726 | 58500 | 20230719 | 18:10 |
6099 | 삼성 | <NA> | 결제완료 | 20230830 | 20230906 | 900 | 20230830 | 9:58 |
6100 | 삼성 | <NA> | 결제완료 | 20230903 | 20230908 | 4000 | 20230903 | 11:47 |
6101 | 삼성 | <NA> | 결제완료 | 20230814 | 20230822 | 4000 | 20230814 | 7:45 |
6102 | 삼성 | <NA> | 결제완료 | 20230803 | 20230810 | 4000 | 20230803 | 7:45 |
6103 | 삼성 | <NA> | 결제완료 | 20230822 | 20230829 | 4000 | 20230822 | 8:16 |
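The first rows write dates as `2023-07-03`, while the last rows use `20230720`. A minimal sketch of normalizing both spellings to `datetime.date`, treating `<NA>` as missing. These two formats are the only ones visible in the samples, so the format list is an assumption, and `parse_date` is a hypothetical helper:

```python
from datetime import date, datetime
from typing import Optional

# Date spellings seen in the sample rows of this report.
DATE_FORMATS = ("%Y-%m-%d", "%Y%m%d")
MISSING_MARKERS = {"", "<NA>"}

def parse_date(text: Optional[str]) -> Optional[date]:
    """Return a date for either spelling, or None for missing values."""
    if text is None or text.strip() in MISSING_MARKERS:
        return None
    cleaned = text.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")

print(parse_date("2023-07-03"))  # 2023-07-03
print(parse_date("20230720"))    # 2023-07-20
print(parse_date("<NA>"))        # None
```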
Most frequently occurring
Index | 카드 | 거래상태 | 징수결의일자 | 수입일자 | 총금액 | 거래일자 | 거래시간 | # duplicates |
---|---|---|---|---|---|---|---|---|
4 | 국민 | 결제완료 | 20230703 | 20230710 | 2400 | 20230703 | 18:04 | 3 |
66 | 신한 | 결제완료 | 20230828 | 20230904 | 54000 | 20230828 | 6:01 | 3 |
87 | 현대 | 결제완료 | 20230828 | 20230904 | 54000 | 20230828 | 6:03 | 3 |
89 | 현대 | 결제완료 | 20230828 | 20230904 | 65000 | 20230828 | 6:03 | 3 |
0 | 구하나 | 결제완료 | 20230724 | 20230731 | 45000 | 20230724 | 15:21 | 2 |
1 | 구하나 | 결제완료 | 20230801 | 20230808 | 900 | 20230801 | 15:41 | 2 |
2 | 구하나 | 결제완료 | 20230906 | 20230913 | 900 | 20230906 | 13:36 | 2 |
3 | 국민 | 결제완료 | 20230701 | 20230707 | 4000 | 20230701 | 15:02 | 2 |
5 | 국민 | 결제완료 | 20230703 | 20230710 | 2400 | 20230703 | 18:05 | 2 |
6 | 국민 | 결제완료 | 20230703 | 20230710 | 4000 | 20230703 | 17:53 | 2 |
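Duplicate groups like the ones listed here can be found by counting identical rows. A minimal sketch using `collections.Counter` on row tuples. The rows below are illustrative, built from entries in this report, and `duplicate_groups` is a hypothetical helper. Whether the report's total of 98 counts every repeated row or only the extra copies is not stated, so the sketch reports group sizes only.

```python
from collections import Counter

def duplicate_groups(rows):
    """Return {row: count} for every row that occurs more than once."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}

# Columns: 카드, 거래상태, 징수결의일자, 수입일자, 총금액, 거래일자, 거래시간
rows = [
    ("국민", "결제완료", "20230703", "20230710", 2400, "20230703", "18:04"),
    ("국민", "결제완료", "20230703", "20230710", 2400, "20230703", "18:04"),
    ("국민", "결제완료", "20230703", "20230710", 2400, "20230703", "18:04"),
    ("삼성", "결제완료", "20230822", "20230829", 4000, "20230822", "8:16"),
]

groups = duplicate_groups(rows)
for row, n in groups.items():
    print(n, row)
```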