Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
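
The two memory figures above are related by simple arithmetic. The sketch below is a minimal check, assuming binary units (1 KiB = 1024 bytes) and assuming the average is the total size divided by the number of observations; the variable names are illustrative and not taken from the report.

```python
# Values copied from the dataset statistics above.
total_kib = 488.3
n_observations = 10_000

KIB = 1024  # bytes per KiB (binary prefix)

# Average bytes per record, assuming total size divided by row count.
avg_record_bytes = total_kib * KIB / n_observations
print(f"{avg_record_bytes:.1f} B")  # 50.0 B
```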

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15820/S/1/datasetView.do

Alerts

년월일 (date) has constant value "202312" (Constant)
금액 (amount) has 2513 (25.1%) zeros (Zeros)
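
The two alerts above correspond to simple column checks. The following is a minimal sketch of such checks in plain Python, not ydata-profiling's own implementation; the function names `constant_columns` and `zero_fraction` are hypothetical, and the four rows are an illustrative subset built from values shown in this report's sample.

```python
def constant_columns(rows):
    """Return the column names whose value is identical in every row."""
    if not rows:
        return []
    return [col for col in rows[0] if len({row[col] for row in rows}) == 1]


def zero_fraction(rows, column):
    """Return the share of rows whose `column` equals zero."""
    return sum(1 for row in rows if row[column] == 0) / len(rows)


# Illustrative rows only; the full dataset has 10000 observations.
rows = [
    {"년월일": "202312", "금액": 2312497},
    {"년월일": "202312", "금액": 0},
    {"년월일": "202312", "금액": 529000},
    {"년월일": "202312", "금액": 0},
]

constant = constant_columns(rows)
zeros = zero_fraction(rows, "금액")
print(constant, zeros)
```

On the full dataset the same ratio is 2513 / 10000 = 25.1%, which matches the Zeros alert.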

Reproduction

Analysis started: 2024-05-11 05:56:58.625332
Analysis finished: 2024-05-11 05:56:59.974691
Duration: 1.35 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명 (apartment name)
Text

Distinct: 2157
Distinct (%): 21.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 20
Mean length: 7.4893
Min length: 2

Characters and Unicode

Total characters: 74893
Distinct characters: 435
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
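
As an illustration of the character properties mentioned above, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It uses the five sample values of this variable and is not the profiler's own code; the helper name `category_counts` is hypothetical. The two-letter codes `Lo` and `Nd` correspond to the report's "Other Letter" and "Decimal Number". The standard library does not expose script or block properties, so those are left out.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample rows listed for this variable.
samples = ["신사현대2차", "성수청구강변", "역삼래미안", "용산파크자이", "마천우방"]
counts = category_counts(samples)
print(counts)
```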

Unique

Unique: 120
Unique (%): 1.2%
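
The Distinct and Unique figures measure different things. The sketch below assumes the usual reading in profiling reports: Distinct counts the different values present, while Unique counts the values that occur exactly once. The helper name and the small input list are illustrative, not taken from the dataset.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of distinct values, number of values seen exactly once)."""
    freq = Counter(values)
    distinct = len(freq)
    unique = sum(1 for count in freq.values() if count == 1)
    return distinct, unique


# Hypothetical input: two values repeat, two occur once.
values = ["래미안", "래미안", "푸르지오", "힐스테이트", "푸르지오", "마천우방"]
distinct, unique = distinct_and_unique(values)
print(distinct, unique)
```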

Sample

1st row: 신사현대2차
2nd row: 성수청구강변
3rd row: 역삼래미안
4th row: 용산파크자이
5th row: 마천우방

Value | Count | Frequency (%)
아파트 | 181 | 1.7%
래미안 | 40 | 0.4%
e편한세상 | 24 | 0.2%
아이파크 | 22 | 0.2%
경남아너스빌 | 19 | 0.2%
고덕 | 16 | 0.1%
푸르지오 | 16 | 0.1%
힐스테이트 | 16 | 0.1%
이편한세상 | 16 | 0.1%
북한산 | 15 | 0.1%
Other values (2237) | 10487 | 96.6%

Most occurring characters

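Value | Count | Frequency (%)
(missing) | 2613 | 3.5%
(missing) | 2581 | 3.4%
(missing) | 2486 | 3.3%
(missing) | 1847 | 2.5%
(missing) | 1668 | 2.2%
(missing) | 1659 | 2.2%
(missing) | 1467 | 2.0%
(missing) | 1429 | 1.9%
(missing) | 1419 | 1.9%
(missing) | 1381 | 1.8%
Other values (425) | 56343 | 75.2%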

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 68634 | 91.6%
Decimal Number | 3618 | 4.8%
Space Separator | 938 | 1.3%
Uppercase Letter | 823 | 1.1%
Lowercase Letter | 354 | 0.5%
Open Punctuation | 135 | 0.2%
Close Punctuation | 135 | 0.2%
Dash Punctuation | 131 | 0.2%
Other Punctuation | 120 | 0.2%
Letter Number | 5 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(missing) | 2613 | 3.8%
(missing) | 2581 | 3.8%
(missing) | 2486 | 3.6%
(missing) | 1847 | 2.7%
(missing) | 1668 | 2.4%
(missing) | 1659 | 2.4%
(missing) | 1467 | 2.1%
(missing) | 1429 | 2.1%
(missing) | 1419 | 2.1%
(missing) | 1381 | 2.0%
Other values (380) | 50084 | 73.0%

Uppercase Letter
Value | Count | Frequency (%)
S | 144 | 17.5%
C | 120 | 14.6%
K | 102 | 12.4%
D | 94 | 11.4%
M | 94 | 11.4%
E | 40 | 4.9%
H | 37 | 4.5%
I | 34 | 4.1%
L | 33 | 4.0%
G | 29 | 3.5%
Other values (7) | 96 | 11.7%

Lowercase Letter
Value | Count | Frequency (%)
e | 201 | 56.8%
l | 42 | 11.9%
i | 34 | 9.6%
v | 23 | 6.5%
s | 12 | 3.4%
k | 9 | 2.5%
w | 8 | 2.3%
c | 8 | 2.3%
h | 7 | 2.0%
a | 5 | 1.4%

Decimal Number
Value | Count | Frequency (%)
1 | 1092 | 30.2%
2 | 1075 | 29.7%
3 | 465 | 12.9%
4 | 292 | 8.1%
5 | 183 | 5.1%
6 | 149 | 4.1%
7 | 111 | 3.1%
8 | 90 | 2.5%
0 | 83 | 2.3%
9 | 78 | 2.2%

Other Punctuation
Value | Count | Frequency (%)
, | 101 | 84.2%
. | 19 | 15.8%

Space Separator
Value | Count | Frequency (%)
(space) | 938 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 135 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 135 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 131 | 100.0%

Letter Number
Value | Count | Frequency (%)
(missing) | 5 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 68634 | 91.6%
Common | 5077 | 6.8%
Latin | 1182 | 1.6%

Most frequent character per script

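Hangul
Value | Count | Frequency (%)
(missing) | 2613 | 3.8%
(missing) | 2581 | 3.8%
(missing) | 2486 | 3.6%
(missing) | 1847 | 2.7%
(missing) | 1668 | 2.4%
(missing) | 1659 | 2.4%
(missing) | 1467 | 2.1%
(missing) | 1429 | 2.1%
(missing) | 1419 | 2.1%
(missing) | 1381 | 2.0%
Other values (380) | 50084 | 73.0%

Latin
Value | Count | Frequency (%)
e | 201 | 17.0%
S | 144 | 12.2%
C | 120 | 10.2%
K | 102 | 8.6%
D | 94 | 8.0%
M | 94 | 8.0%
l | 42 | 3.6%
E | 40 | 3.4%
H | 37 | 3.1%
I | 34 | 2.9%
Other values (19) | 274 | 23.2%

Common
Value | Count | Frequency (%)
1 | 1092 | 21.5%
2 | 1075 | 21.2%
(space) | 938 | 18.5%
3 | 465 | 9.2%
4 | 292 | 5.8%
5 | 183 | 3.6%
6 | 149 | 2.9%
( | 135 | 2.7%
) | 135 | 2.7%
- | 131 | 2.6%
Other values (6) | 482 | 9.5%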

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 68634 | 91.6%
ASCII | 6254 | 8.4%
Number Forms | 5 | < 0.1%

Most frequent character per block

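Hangul
Value | Count | Frequency (%)
(missing) | 2613 | 3.8%
(missing) | 2581 | 3.8%
(missing) | 2486 | 3.6%
(missing) | 1847 | 2.7%
(missing) | 1668 | 2.4%
(missing) | 1659 | 2.4%
(missing) | 1467 | 2.1%
(missing) | 1429 | 2.1%
(missing) | 1419 | 2.1%
(missing) | 1381 | 2.0%
Other values (380) | 50084 | 73.0%

ASCII
Value | Count | Frequency (%)
1 | 1092 | 17.5%
2 | 1075 | 17.2%
(space) | 938 | 15.0%
3 | 465 | 7.4%
4 | 292 | 4.7%
e | 201 | 3.2%
5 | 183 | 2.9%
6 | 149 | 2.4%
S | 144 | 2.3%
( | 135 | 2.2%
Other values (34) | 1580 | 25.3%

Number Forms
Value | Count | Frequency (%)
(missing) | 5 | 100.0%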

아파트코드 (apartment code)
Text

Distinct: 2161
Distinct (%): 21.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9
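
The length statistics above can be reproduced on the sample values with the standard library. This is a minimal sketch using the five sample codes listed for this variable; the helper name `length_summary` is hypothetical and the code is not the profiler's own.

```python
from statistics import mean, median


def length_summary(values):
    """Summarize string lengths: max, median, mean and min."""
    lengths = [len(value) for value in values]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }


# The five sample rows listed for this variable.
codes = ["A12208105", "A13383003", "A13592706", "A14075201", "A13812004"]
summary = length_summary(codes)
print(summary)
```

Because every code has 9 characters, the total character count for 10000 observations is 9 × 10000 = 90000, which agrees with the Total characters figure reported for this variable.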

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 121
Unique (%): 1.2%

Sample

1st row: A12208105
2nd row: A13383003
3rd row: A13592706
4th row: A14075201
5th row: A13812004

Value | Count | Frequency (%)
a13611007 | 13 | 0.1%
a15683402 | 13 | 0.1%
a41279918 | 13 | 0.1%
a15805115 | 12 | 0.1%
a12284701 | 12 | 0.1%
a13186801 | 12 | 0.1%
a13120001 | 12 | 0.1%
a15882104 | 11 | 0.1%
a13788208 | 11 | 0.1%
a13302001 | 11 | 0.1%
Other values (2151) | 9880 | 98.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 18619 | 20.7%
1 | 17558 | 19.5%
A | 9986 | 11.1%
3 | 8707 | 9.7%
2 | 8581 | 9.5%
5 | 6073 | 6.7%
8 | 5470 | 6.1%
7 | 4619 | 5.1%
4 | 4033 | 4.5%
6 | 3301 | 3.7%
Other values (2) | 3053 | 3.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18619 | 23.3%
1 | 17558 | 21.9%
3 | 8707 | 10.9%
2 | 8581 | 10.7%
5 | 6073 | 7.6%
8 | 5470 | 6.8%
7 | 4619 | 5.8%
4 | 4033 | 5.0%
6 | 3301 | 4.1%
9 | 3039 | 3.8%

Uppercase Letter
Value | Count | Frequency (%)
A | 9986 | 99.9%
B | 14 | 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 88.9%
Latin | 10000 | 11.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18619 | 23.3%
1 | 17558 | 21.9%
3 | 8707 | 10.9%
2 | 8581 | 10.7%
5 | 6073 | 7.6%
8 | 5470 | 6.8%
7 | 4619 | 5.8%
4 | 4033 | 5.0%
6 | 3301 | 4.1%
9 | 3039 | 3.8%

Latin
Value | Count | Frequency (%)
A | 9986 | 99.9%
B | 14 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18619 | 20.7%
1 | 17558 | 19.5%
A | 9986 | 11.1%
3 | 8707 | 9.7%
2 | 8581 | 9.5%
5 | 6073 | 6.7%
8 | 5470 | 6.1%
7 | 4619 | 5.1%
4 | 4033 | 4.5%
6 | 3301 | 3.7%
Other values (2) | 3053 | 3.4%

비용명 (cost item name)
Text

Distinct: 77
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 10
Mean length: 6.0122
Min length: 2

Characters and Unicode

Total characters: 60122
Distinct characters: 107
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 예수금
2nd row: 관리비미수금
3rd row: 선급금
4th row: 기타당좌자산
5th row: 수선유지비충당부채

Value | Count | Frequency (%)
비품 | 330 | 3.3%
장기수선충당부채 | 316 | 3.2%
예수금 | 303 | 3.0%
예금 | 303 | 3.0%
관리비미수금 | 303 | 3.0%
선급비용 | 301 | 3.0%
연차수당충당부채 | 298 | 3.0%
공동주택적립금 | 297 | 3.0%
미지급금 | 296 | 3.0%
미처분이익잉여금 | 294 | 2.9%
Other values (67) | 6959 | 69.6%

Most occurring characters

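Value | Count | Frequency (%)
(missing) | 4563 | 7.6%
(missing) | 3811 | 6.3%
(missing) | 3168 | 5.3%
(missing) | 3110 | 5.2%
(missing) | 3041 | 5.1%
(missing) | 2918 | 4.9%
(missing) | 2620 | 4.4%
(missing) | 2505 | 4.2%
(missing) | 1935 | 3.2%
(missing) | 1770 | 2.9%
Other values (97) | 30681 | 51.0%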

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 60122 | 100.0%

Most frequent character per category

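Other Letter
Value | Count | Frequency (%)
(missing) | 4563 | 7.6%
(missing) | 3811 | 6.3%
(missing) | 3168 | 5.3%
(missing) | 3110 | 5.2%
(missing) | 3041 | 5.1%
(missing) | 2918 | 4.9%
(missing) | 2620 | 4.4%
(missing) | 2505 | 4.2%
(missing) | 1935 | 3.2%
(missing) | 1770 | 2.9%
Other values (97) | 30681 | 51.0%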

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 60122 | 100.0%

Most frequent character per script

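Hangul
Value | Count | Frequency (%)
(missing) | 4563 | 7.6%
(missing) | 3811 | 6.3%
(missing) | 3168 | 5.3%
(missing) | 3110 | 5.2%
(missing) | 3041 | 5.1%
(missing) | 2918 | 4.9%
(missing) | 2620 | 4.4%
(missing) | 2505 | 4.2%
(missing) | 1935 | 3.2%
(missing) | 1770 | 2.9%
Other values (97) | 30681 | 51.0%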

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 60122 | 100.0%

Most frequent character per block

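Hangul
Value | Count | Frequency (%)
(missing) | 4563 | 7.6%
(missing) | 3811 | 6.3%
(missing) | 3168 | 5.3%
(missing) | 3110 | 5.2%
(missing) | 3041 | 5.1%
(missing) | 2918 | 4.9%
(missing) | 2620 | 4.4%
(missing) | 2505 | 4.2%
(missing) | 1935 | 3.2%
(missing) | 1770 | 2.9%
Other values (97) | 30681 | 51.0%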

년월일 (date)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
202312 | 10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202312
2nd row: 202312
3rd row: 202312
4th row: 202312
5th row: 202312

Common Values

Value | Count | Frequency (%)
202312 | 10000 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
202312 | 10000 | 100.0%

금액 (amount)
Real number (ℝ)

ZEROS

Distinct: 7139
Distinct (%): 71.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 86316338
Minimum: -2.6482084 × 10^8
Maximum: 6.9310378 × 10^9
Zeros: 2513
Zeros (%): 25.1%
Negative: 353
Negative (%): 3.5%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -2.6482084 × 10^8
5th percentile: 0
Q1: 0
Median: 3323710
Q3: 44404518
95th percentile: 4.3164009 × 10^8
Maximum: 6.9310378 × 10^9
Range: 7.1958587 × 10^9
Interquartile range (IQR): 44404518

Descriptive statistics

Standard deviation: 3.087784 × 10^8
Coefficient of variation (CV): 3.577288
Kurtosis: 104.22158
Mean: 86316338
Median Absolute Deviation (MAD): 3323710
Skewness: 8.6204177
Sum: 8.6316338 × 10^11
Variance: 9.5344102 × 10^16
Monotonicity: Not monotonic
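
Several of the figures above are tied together by standard identities: CV = standard deviation / mean, variance = standard deviation squared, sum = mean × number of observations, range = maximum - minimum, and IQR = Q3 - Q1. The sketch below re-derives them from the rounded values printed in this report, so agreement is only expected up to rounding; the dictionary names are illustrative.

```python
import math

# Rounded values copied from this report.
n = 10_000
mean = 86_316_338
std = 3.087784e8
minimum = -2.6482084e8
maximum = 6.9310378e9
q1, q3 = 0, 44_404_518

reported = {
    "cv": 3.577288,
    "variance": 9.5344102e16,
    "sum": 8.6316338e11,
    "range": 7.1958587e9,
    "iqr": 44_404_518,
}

derived = {
    "cv": std / mean,            # coefficient of variation
    "variance": std ** 2,        # variance is the squared standard deviation
    "sum": mean * n,             # sum is mean times number of observations
    "range": maximum - minimum,  # spread between the extremes
    "iqr": q3 - q1,              # interquartile range
}

for name, value in derived.items():
    ok = math.isclose(value, reported[name], rel_tol=1e-6)
    print(f"{name}: derived={value:.7g} reported={reported[name]:.7g} match={ok}")
```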
Histogram with fixed size bins (bins=50)
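
A histogram with fixed size bins splits the interval from the minimum to the maximum into equally wide bins and counts the values in each. The sketch below is a plain-Python illustration with bins=50, applied to the ten 금액 values from the report's first sample table; it is not the plotting code behind the figure, and the function name `fixed_width_histogram` is hypothetical. It follows the common convention that the last bin includes the maximum.

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equally wide bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for value in values:
        if width == 0:
            index = 0  # all values identical: everything lands in the first bin
        else:
            # Clamp so that the maximum falls into the last bin.
            index = min(int((value - lo) / width), bins - 1)
        counts[index] += 1
    return counts, width


# 금액 values from the first ten sample rows of this report.
amounts = [2312497, 22937320, 3129000, 529000, 0,
           40673170, 452410, 6044880, 3438520, 2055662]
counts, width = fixed_width_histogram(amounts, bins=50)
print(width, counts)
```

Even on ten rows most values land in the lowest bins, consistent with the strong right skew reported for this variable.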
Common values

Value | Count | Frequency (%)
0 | 2513 | 25.1%
500000 | 28 | 0.3%
300000 | 21 | 0.2%
250000 | 17 | 0.2%
242000 | 12 | 0.1%
200000 | 11 | 0.1%
484000 | 11 | 0.1%
400 | 11 | 0.1%
5000000 | 11 | 0.1%
3000000 | 10 | 0.1%
Other values (7129) | 7355 | 73.6%

Minimum 10 values

Value | Count | Frequency (%)
-264820840 | 1 | < 0.1%
-206158420 | 1 | < 0.1%
-198071844 | 1 | < 0.1%
-195908810 | 1 | < 0.1%
-159515815 | 1 | < 0.1%
-147774640 | 1 | < 0.1%
-144624600 | 1 | < 0.1%
-142530421 | 1 | < 0.1%
-137854113 | 1 | < 0.1%
-132142869 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
6931037845 | 1 | < 0.1%
4980841618 | 1 | < 0.1%
4909755056 | 1 | < 0.1%
4888235156 | 1 | < 0.1%
4853035871 | 1 | < 0.1%
4777615630 | 1 | < 0.1%
4699245158 | 1 | < 0.1%
4663387098 | 1 | < 0.1%
4271166575 | 1 | < 0.1%
4132811129 | 1 | < 0.1%

Interactions


Correlations

Variable | 비용명 | 금액
비용명 | 1.000 | 0.501
금액 | 0.501 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

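Index | 아파트명 (apartment name) | 아파트코드 (apartment code) | 비용명 (cost item name) | 년월일 (date) | 금액 (amount)
15061 | 신사현대2차 | A12208105 | 예수금 | 202312 | 2312497
25766 | 성수청구강변 | A13383003 | 관리비미수금 | 202312 | 22937320
32044 | 역삼래미안 | A13592706 | 선급금 | 202312 | 3129000
48228 | 용산파크자이 | A14075201 | 기타당좌자산 | 202312 | 529000
38896 | 마천우방 | A13812004 | 수선유지비충당부채 | 202312 | 0
42761 | 중계주공5단지 | A13922114 | 선급비용 | 202312 | 40673170
62516 | 상도동중앙하이츠빌아파트 | A15683402 | 현금 | 202312 | 452410
41023 | 거여현대2차 | A13881401 | 청소비충당부채 | 202312 | 6044880
66727 | 목동금호어울림 | A15805403 | 선급금 | 202312 | 3438520
56731 | 개봉삼환 | A15209205 | 예수금 | 202312 | 2055662

Index | 아파트명 (apartment name) | 아파트코드 (apartment code) | 비용명 (cost item name) | 년월일 (date) | 금액 (amount)
65076 | 강서센트레빌4차 | A15781201 | 임차보증금 | 202312 | 400000
14749 | 북한산현대홈타운 | A12204102 | 선수수도료 | 202312 | 427820
42138 | 상계극동늘푸른 | A13920106 | 퇴직급여충당부채 | 202312 | 34408615
65510 | 방화동 개화아파트 | A15785608 | 당기순이익 | 202312 | 26187328
9672 | 창신두산 | A11054101 | 기타충당예금 | 202312 | 0
29982 | 수서가람 | A13523003 | 비품감가상각누계액 | 202312 | -30846550
17196 | 래미안크레시티 | A13071302 | 비품 | 202312 | 80263380
2782 | DMC롯데캐슬더퍼스트 | A10024828 | 복리후생비충당부채 | 202312 | 3663620
41020 | 거여현대2차 | A13881401 | 퇴직급여충당부채 | 202312 | 22882110
18229 | 청량리홍릉동부 | A13086802 | 가수금 | 202312 | 215536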