Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
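
The average record size follows from the two figures above: total memory divided by the number of observations. A minimal check, assuming 1 KiB = 1024 bytes and using the rounded values as reported here (the variable names are illustrative only):

```python
# Reported figures from the dataset statistics.
total_kib = 488.3        # total size in memory, KiB
n_observations = 10_000  # number of rows

# Convert KiB to bytes (assumption: 1 KiB = 1024 B), then average per row.
bytes_per_record = total_kib * 1024 / n_observations
print(f"{bytes_per_record:.1f} B")
```

The result agrees with the reported 50.0 B to one decimal place.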

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15820/S/1/datasetView.do

Alerts

년월일 (date) has constant value "202006" [Constant]
금액 (amount) has 2231 (22.3%) zeros [Zeros]

Reproduction

Analysis started: 2024-05-11 06:00:12.604539
Analysis finished: 2024-05-11 06:00:13.622948
Duration: 1.02 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명 (apartment name)
Text

Distinct: 2183
Distinct (%): 21.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 24
Median length: 21
Mean length: 7.2304
Min length: 2

Characters and Unicode

Total characters: 72304
Distinct characters: 433
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
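
As a rough illustration of the character properties mentioned above, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It covers categories only, since the standard library does not expose the script or block properties. It normalises input to NFC, which is an assumption about how the names are stored, and uses three apartment names from this report as input. `category_counts` is a hypothetical helper, not part of ydata-profiling.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for text in values:
        # Assumption: compare precomposed (NFC) Hangul syllables.
        text = unicodedata.normalize("NFC", text)
        counts.update(unicodedata.category(ch) for ch in text)
    return counts


names = ["등촌태진아름", "상계주공2단지", "신내5단지대림두산"]
counts = category_counts(names)
# "Lo" is Other Letter (Hangul syllables), "Nd" is Decimal Number.
print(counts)
```

The two-letter codes map onto the category names used in the tables below, for example `Lo` for Other Letter and `Nd` for Decimal Number.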

Unique

Unique: 125
Unique (%): 1.2%
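
The report distinguishes Distinct from Unique. A minimal sketch of that distinction, assuming Distinct counts the different values in a column and Unique counts the values that occur exactly once (`distinct_and_unique` and the short input list are illustrative, not taken from the profiler's code):

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of distinct values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


codes = ["A15784402", "A13983004", "A15784402", "A13184610"]
distinct, unique = distinct_and_unique(codes)
print(distinct, unique)  # three different codes, two of which appear once
```

Under this reading, a column where most values repeat has a Unique count far below its Distinct count, which matches the 125 versus 2183 reported for this variable.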

Sample

1st row: 등촌태진아름
2nd row: 상계주공2단지
3rd row: 신내5단지대림두산
4th row: 우장산한화꿈에그린
5th row: 염창한마음삼성

Value | Count | Frequency (%)
아파트 | 115 | 1.1%
래미안 | 36 | 0.3%
북한산 | 17 | 0.2%
래미안밤섬리베뉴 | 16 | 0.2%
e편한세상 | 15 | 0.1%
신반포 | 14 | 0.1%
힐스테이트 | 13 | 0.1%
2단지 | 12 | 0.1%
고덕 | 12 | 0.1%
염창 | 12 | 0.1%
Other values (2246) | 10277 | 97.5%

Most occurring characters

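Value | Count | Frequency (%)
(missing) | 2352 | 3.3%
(missing) | 2321 | 3.2%
(missing) | 2076 | 2.9%
(missing) | 1868 | 2.6%
(missing) | 1829 | 2.5%
(missing) | 1762 | 2.4%
(missing) | 1561 | 2.2%
(missing) | 1529 | 2.1%
(missing) | 1433 | 2.0%
(missing) | 1311 | 1.8%
Other values (423) | 54262 | 75.0%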

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 66402 | 91.8%
Decimal Number | 3752 | 5.2%
Uppercase Letter | 686 | 0.9%
Space Separator | 607 | 0.8%
Lowercase Letter | 341 | 0.5%
Dash Punctuation | 134 | 0.2%
Close Punctuation | 133 | 0.2%
Open Punctuation | 133 | 0.2%
Other Punctuation | 102 | 0.1%
Letter Number | 9 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(missing) | 2352 | 3.5%
(missing) | 2321 | 3.5%
(missing) | 2076 | 3.1%
(missing) | 1868 | 2.8%
(missing) | 1829 | 2.8%
(missing) | 1762 | 2.7%
(missing) | 1561 | 2.4%
(missing) | 1529 | 2.3%
(missing) | 1433 | 2.2%
(missing) | 1311 | 2.0%
Other values (377) | 48360 | 72.8%

Uppercase Letter
Value | Count | Frequency (%)
S | 128 | 18.7%
K | 87 | 12.7%
C | 75 | 10.9%
L | 51 | 7.4%
D | 49 | 7.1%
M | 49 | 7.1%
H | 42 | 6.1%
G | 39 | 5.7%
I | 37 | 5.4%
E | 36 | 5.2%
Other values (7) | 93 | 13.6%

Lowercase Letter
Value | Count | Frequency (%)
e | 202 | 59.2%
i | 29 | 8.5%
l | 28 | 8.2%
v | 20 | 5.9%
k | 15 | 4.4%
s | 15 | 4.4%
w | 10 | 2.9%
c | 8 | 2.3%
g | 5 | 1.5%
a | 5 | 1.5%

Decimal Number
Value | Count | Frequency (%)
1 | 1164 | 31.0%
2 | 1040 | 27.7%
3 | 513 | 13.7%
4 | 252 | 6.7%
5 | 216 | 5.8%
6 | 165 | 4.4%
7 | 125 | 3.3%
8 | 98 | 2.6%
9 | 91 | 2.4%
0 | 88 | 2.3%

Other Punctuation
Value | Count | Frequency (%)
, | 79 | 77.5%
. | 23 | 22.5%

Space Separator
Value | Count | Frequency (%)
(space) | 607 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 134 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 133 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 133 | 100.0%

Letter Number
Value | Count | Frequency (%)
(missing) | 9 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 5 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 66402 | 91.8%
Common | 4866 | 6.7%
Latin | 1036 | 1.4%

Most frequent character per script

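Hangul
Value | Count | Frequency (%)
(missing) | 2352 | 3.5%
(missing) | 2321 | 3.5%
(missing) | 2076 | 3.1%
(missing) | 1868 | 2.8%
(missing) | 1829 | 2.8%
(missing) | 1762 | 2.7%
(missing) | 1561 | 2.4%
(missing) | 1529 | 2.3%
(missing) | 1433 | 2.2%
(missing) | 1311 | 2.0%
Other values (377) | 48360 | 72.8%

Latin
Value | Count | Frequency (%)
e | 202 | 19.5%
S | 128 | 12.4%
K | 87 | 8.4%
C | 75 | 7.2%
L | 51 | 4.9%
D | 49 | 4.7%
M | 49 | 4.7%
H | 42 | 4.1%
G | 39 | 3.8%
I | 37 | 3.6%
Other values (19) | 277 | 26.7%

Common
Value | Count | Frequency (%)
1 | 1164 | 23.9%
2 | 1040 | 21.4%
(space) | 607 | 12.5%
3 | 513 | 10.5%
4 | 252 | 5.2%
5 | 216 | 4.4%
6 | 165 | 3.4%
- | 134 | 2.8%
) | 133 | 2.7%
( | 133 | 2.7%
Other values (7) | 509 | 10.5%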

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 66402 | 91.8%
ASCII | 5893 | 8.2%
Number Forms | 9 | < 0.1%

Most frequent character per block

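Hangul
Value | Count | Frequency (%)
(missing) | 2352 | 3.5%
(missing) | 2321 | 3.5%
(missing) | 2076 | 3.1%
(missing) | 1868 | 2.8%
(missing) | 1829 | 2.8%
(missing) | 1762 | 2.7%
(missing) | 1561 | 2.4%
(missing) | 1529 | 2.3%
(missing) | 1433 | 2.2%
(missing) | 1311 | 2.0%
Other values (377) | 48360 | 72.8%

ASCII
Value | Count | Frequency (%)
1 | 1164 | 19.8%
2 | 1040 | 17.6%
(space) | 607 | 10.3%
3 | 513 | 8.7%
4 | 252 | 4.3%
5 | 216 | 3.7%
e | 202 | 3.4%
6 | 165 | 2.8%
- | 134 | 2.3%
) | 133 | 2.3%
Other values (35) | 1467 | 24.9%

Number Forms
Value | Count | Frequency (%)
(missing) | 9 | 100.0%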

아파트코드 (apartment code)
Text

Distinct: 2190
Distinct (%): 21.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 125
Unique (%): 1.2%

Sample

1st row: A15784402
2nd row: A13983004
3rd row: A13184610
4th row: A15701004
5th row: A15786118

Value | Count | Frequency (%)
a12170601 | 12 | 0.1%
a13204510 | 12 | 0.1%
a13684605 | 12 | 0.1%
a13486504 | 12 | 0.1%
a13010005 | 12 | 0.1%
a13611005 | 12 | 0.1%
a13880806 | 11 | 0.1%
a15080604 | 11 | 0.1%
a15807703 | 11 | 0.1%
a13290003 | 11 | 0.1%
Other values (2180) | 9884 | 98.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 18251 | 20.3%
1 | 17746 | 19.7%
A | 9983 | 11.1%
3 | 8820 | 9.8%
2 | 8349 | 9.3%
5 | 6194 | 6.9%
8 | 5785 | 6.4%
7 | 4800 | 5.3%
4 | 3828 | 4.3%
6 | 3319 | 3.7%
Other values (2) | 2925 | 3.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18251 | 22.8%
1 | 17746 | 22.2%
3 | 8820 | 11.0%
2 | 8349 | 10.4%
5 | 6194 | 7.7%
8 | 5785 | 7.2%
7 | 4800 | 6.0%
4 | 3828 | 4.8%
6 | 3319 | 4.1%
9 | 2908 | 3.6%

Uppercase Letter
Value | Count | Frequency (%)
A | 9983 | 99.8%
B | 17 | 0.2%

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 88.9%
Latin | 10000 | 11.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18251 | 22.8%
1 | 17746 | 22.2%
3 | 8820 | 11.0%
2 | 8349 | 10.4%
5 | 6194 | 7.7%
8 | 5785 | 7.2%
7 | 4800 | 6.0%
4 | 3828 | 4.8%
6 | 3319 | 4.1%
9 | 2908 | 3.6%

Latin
Value | Count | Frequency (%)
A | 9983 | 99.8%
B | 17 | 0.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18251 | 20.3%
1 | 17746 | 19.7%
A | 9983 | 11.1%
3 | 8820 | 9.8%
2 | 8349 | 9.3%
5 | 6194 | 6.9%
8 | 5785 | 6.4%
7 | 4800 | 5.3%
4 | 3828 | 4.3%
6 | 3319 | 3.7%
Other values (2) | 2925 | 3.2%

비용명 (cost item name)
Text

Distinct: 77
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 10
Mean length: 6.0154
Min length: 2

Characters and Unicode

Total characters: 60154
Distinct characters: 107
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 수선유지비충당부채
2nd row: 미처분이익잉여금
3rd row: 관리비미수금
4th row: 미부과관리비
5th row: 예금

Value | Count | Frequency (%)
장기수선충당예금 | 329 | 3.3%
당기순이익 | 319 | 3.2%
예금 | 317 | 3.2%
관리비미수금 | 313 | 3.1%
미처분이익잉여금 | 312 | 3.1%
공동주택적립금 | 310 | 3.1%
퇴직급여충당부채 | 304 | 3.0%
가수금 | 304 | 3.0%
예수금 | 300 | 3.0%
수선유지비충당부채 | 296 | 3.0%
Other values (67) | 6896 | 69.0%

Most occurring characters

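Value | Count | Frequency (%)
(missing) | 4700 | 7.8%
(missing) | 3857 | 6.4%
(missing) | 3234 | 5.4%
(missing) | 3160 | 5.3%
(missing) | 2998 | 5.0%
(missing) | 2997 | 5.0%
(missing) | 2695 | 4.5%
(missing) | 2425 | 4.0%
(missing) | 1919 | 3.2%
(missing) | 1820 | 3.0%
Other values (97) | 30349 | 50.5%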

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 60154 | 100.0%

Most frequent character per category

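Other Letter
Value | Count | Frequency (%)
(missing) | 4700 | 7.8%
(missing) | 3857 | 6.4%
(missing) | 3234 | 5.4%
(missing) | 3160 | 5.3%
(missing) | 2998 | 5.0%
(missing) | 2997 | 5.0%
(missing) | 2695 | 4.5%
(missing) | 2425 | 4.0%
(missing) | 1919 | 3.2%
(missing) | 1820 | 3.0%
Other values (97) | 30349 | 50.5%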

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 60154 | 100.0%

Most frequent character per script

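Hangul
Value | Count | Frequency (%)
(missing) | 4700 | 7.8%
(missing) | 3857 | 6.4%
(missing) | 3234 | 5.4%
(missing) | 3160 | 5.3%
(missing) | 2998 | 5.0%
(missing) | 2997 | 5.0%
(missing) | 2695 | 4.5%
(missing) | 2425 | 4.0%
(missing) | 1919 | 3.2%
(missing) | 1820 | 3.0%
Other values (97) | 30349 | 50.5%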

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 60154 | 100.0%

Most frequent character per block

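Hangul
Value | Count | Frequency (%)
(missing) | 4700 | 7.8%
(missing) | 3857 | 6.4%
(missing) | 3234 | 5.4%
(missing) | 3160 | 5.3%
(missing) | 2998 | 5.0%
(missing) | 2997 | 5.0%
(missing) | 2695 | 4.5%
(missing) | 2425 | 4.0%
(missing) | 1919 | 3.2%
(missing) | 1820 | 3.0%
Other values (97) | 30349 | 50.5%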

년월일 (date)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
202006: 10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202006
2nd row: 202006
3rd row: 202006
4th row: 202006
5th row: 202006

Common Values

Value | Count | Frequency (%)
202006 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
202006 | 10000 | 100.0%

금액 (amount)
Real number (ℝ)

ZEROS 

Distinct: 7426
Distinct (%): 74.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 75773341
Minimum: -5.5916345 × 10^8
Maximum: 6.7604684 × 10^9
Zeros: 2231
Zeros (%): 22.3%
Negative: 314
Negative (%): 3.1%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -5.5916345 × 10^8
5-th percentile: 0
Q1: 0
Median: 3233375
Q3: 34059952
95-th percentile: 3.7427926 × 10^8
Maximum: 6.7604684 × 10^9
Range: 7.3196319 × 10^9
Interquartile range (IQR): 34059952

Descriptive statistics

Standard deviation: 2.9133043 × 10^8
Coefficient of variation (CV): 3.8447616
Kurtosis: 134.05564
Mean: 75773341
Median Absolute Deviation (MAD): 3233375
Skewness: 9.6467215
Sum: 7.5773341 × 10^11
Variance: 8.4873421 × 10^16
Monotonicity: Not monotonic
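
Several of the figures above are derived from one another, so they can be cross-checked. The sketch below recomputes the sum, variance, coefficient of variation, range, and IQR from the reported mean, standard deviation, extremes, and quartiles. It assumes the usual definitions (CV = standard deviation / mean, IQR = Q3 - Q1) and works from the rounded values printed in this report, so agreement is only expected to a few significant digits.

```python
import math

# Values as reported for 금액 (rounded in the report).
n = 10_000
mean = 75_773_341
std = 2.9133043e8
minimum, maximum = -5.5916345e8, 6.7604684e9
q1, q3 = 0, 34_059_952

total = mean * n                 # reported Sum: 7.5773341e11
variance = std ** 2              # reported Variance: 8.4873421e16
cv = std / mean                  # reported CV: 3.8447616
value_range = maximum - minimum  # reported Range: 7.3196319e9
iqr = q3 - q1                    # reported IQR: 34059952

for name, value in [("sum", total), ("variance", variance), ("cv", cv),
                    ("range", value_range), ("iqr", iqr)]:
    print(f"{name}: {value:.8g}")
```

Because Q1 is 0 here, the IQR equals Q3, which is why the two reported values coincide.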
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 2231 | 22.3%
500000 | 28 | 0.3%
250000 | 27 | 0.3%
300000 | 18 | 0.2%
484000 | 14 | 0.1%
10000000 | 13 | 0.1%
2000000 | 13 | 0.1%
250400 | 12 | 0.1%
100000 | 10 | 0.1%
20000000 | 8 | 0.1%
Other values (7416) | 7626 | 76.3%

Minimum 10 values
Value | Count | Frequency (%)
-559163452 | 1 | < 0.1%
-349832798 | 1 | < 0.1%
-273320722 | 1 | < 0.1%
-246078175 | 1 | < 0.1%
-237355800 | 1 | < 0.1%
-152477584 | 1 | < 0.1%
-144739812 | 1 | < 0.1%
-134212500 | 1 | < 0.1%
-118828690 | 1 | < 0.1%
-113580500 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
6760468422 | 1 | < 0.1%
6555730743 | 1 | < 0.1%
5987020508 | 1 | < 0.1%
4850215901 | 1 | < 0.1%
4384188661 | 1 | < 0.1%
4090240000 | 1 | < 0.1%
3978461203 | 1 | < 0.1%
3942659299 | 1 | < 0.1%
3911517390 | 2 | < 0.1%
3775379084 | 1 | < 0.1%

Interactions


Correlations

Variable | 비용명 | 금액
비용명 | 1.000 | 0.597
금액 | 0.597 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

(index) | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
64248 | 등촌태진아름 | A15784402 | 수선유지비충당부채 | 202006 | 5271755
43120 | 상계주공2단지 | A13983004 | 미처분이익잉여금 | 202006 | 0
17240 | 신내5단지대림두산 | A13184610 | 관리비미수금 | 202006 | 15159980
61301 | 우장산한화꿈에그린 | A15701004 | 미부과관리비 | 202006 | 32546510
64729 | 염창한마음삼성 | A15786118 | 예금 | 202006 | 42297125
25866 | 천호금호 | A13486102 | 관리비미수금 | 202006 | 2146590
48762 | 광장삼성1,2차 | A14381506 | 기타유동부채 | 202006 | 3000000
6616 | 명륜아남1차 | A11052201 | 장기수선충당부채 | 202006 | 769209162
61592 | 등촌동성 | A15703302 | 기타의비유동부채 | 202006 | 0
25429 | 명일동우성 | A13482505 | 미지급금 | 202006 | 98450330

(index) | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
26765 | 청담래미안로이뷰 | A13510009 | 선급비용 | 202006 | 20860620
48746 | 광장동금호베스트빌 | A14381504 | 당기순이익 | 202006 | 23181185
51371 | 당산현대5차 | A15080507 | 관리비예치금 | 202006 | 152320000
38449 | 송파꿈에그린아파트 | A13876114 | 기타시설운영충당부채 | 202006 | 0
45719 | 한강타운 | A14004001 | 선수관리비 | 202006 | 34680000
11406 | 성산2차현대 | A12187703 | 가수금 | 202006 | 6891870
21467 | 서울숲삼부아파트 | A13307101 | 기타충당부채 | 202006 | 1853998
26542 | 삼성서광 | A13509006 | 전신전화가입권 | 202006 | 500000
66690 | 목동12단지 | A15807706 | 퇴직급여충당부채 | 202006 | 173366119
37233 | 잠실한솔 | A13819001 | 상여충당부채 | 202006 | 0