Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
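
The average record size follows from the total size and the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes; the function name is hypothetical and not part of the report:

```python
KIB = 1024  # bytes per kibibyte (assumed unit of the report's "KiB")


def average_record_size(total_kib: float, n_rows: int) -> float:
    """Average bytes per row, given the total in-memory size in KiB."""
    return total_kib * KIB / n_rows


# Figures taken from the dataset statistics above.
avg_bytes = average_record_size(488.3, 10_000)
print(f"{avg_bytes:.1f} B")  # 50.0 B
```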

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15820/S/1/datasetView.do

Alerts

년월일 has constant value "202003" (Constant)
금액 has 2093 (20.9%) zeros (Zeros)
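
Both alerts can be checked by hand. The sketch below is not the profiler's own implementation; the helper names are hypothetical, and the input is only the ten 금액 values from the first block of sample rows at the end of this report, so the zero share differs from the full-dataset 20.9%.

```python
from typing import Sequence, Tuple


def is_constant(values: Sequence[object]) -> bool:
    """True when every entry in a non-empty column is the same value."""
    return len(set(values)) == 1


def zero_share(values: Sequence[float]) -> Tuple[int, float]:
    """Return (number of zeros, percentage of zeros) for a non-empty column."""
    zeros = sum(1 for v in values if v == 0)
    return zeros, 100.0 * zeros / len(values)


# Ten sample rows only, not the full 10000 observations.
dates = ["202003"] * 10
amounts = [0, 0, 5419802, 0, 0, 55782045, 1426219, 79455000, 261435643, 1439410]

print(is_constant(dates))   # True
print(zero_share(amounts))  # (4, 40.0)
```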

Reproduction

Analysis started: 2024-05-11 06:00:32.860594
Analysis finished: 2024-05-11 06:00:33.981728
Duration: 1.12 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명
Text

Distinct: 2170
Distinct (%): 21.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 24
Median length: 21
Mean length: 7.226
Min length: 2

Characters and Unicode

Total characters: 72260
Distinct characters: 432
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
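
As an illustration of such character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own code. The input is the five sample values of this variable listed below, and the two-letter codes map to the report's labels as follows: `Lo` is Other Letter, `Nd` is Decimal Number, `Zs` is Space Separator.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for text in values:
        counts.update(unicodedata.category(ch) for ch in text)
    return counts


# The five sample rows of this variable.
sample = [
    "아크로리버뷰 신반포",
    "치현마을동일스위트리버아파트",
    "삼각산아이원임대",
    "방학우성2차",
    "현대성우",
]
counts = category_counts(sample)
print(counts["Lo"], counts["Nd"], counts["Zs"])  # 40 1 1
```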

Unique

Unique: 110
Unique (%): 1.1%

Sample

1st row: 아크로리버뷰 신반포
2nd row: 치현마을동일스위트리버아파트
3rd row: 삼각산아이원임대
4th row: 방학우성2차
5th row: 현대성우

Value | Count | Frequency (%)
아파트 | 115 | 1.1%
래미안 | 31 | 0.3%
힐스테이트 | 20 | 0.2%
입주자대표회의 | 15 | 0.1%
신동아파밀리에 | 14 | 0.1%
북한산 | 14 | 0.1%
고덕 | 13 | 0.1%
창동금용 | 13 | 0.1%
은평뉴타운상림마을6단지 | 13 | 0.1%
마곡수명산파크1단지 | 12 | 0.1%
Other values (2231) | 10268 | 97.5%

Most occurring characters

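Value | Count | Frequency (%)
[illegible] | 2230 | 3.1%
[illegible] | 2151 | 3.0%
[illegible] | 1949 | 2.7%
[illegible] | 1871 | 2.6%
[illegible] | 1851 | 2.6%
[illegible] | 1618 | 2.2%
[illegible] | 1538 | 2.1%
[illegible] | 1473 | 2.0%
[illegible] | 1449 | 2.0%
[illegible] | 1299 | 1.8%
Other values (422) | 54831 | 75.9%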

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 66151 | 91.5%
Decimal Number | 3843 | 5.3%
Uppercase Letter | 747 | 1.0%
Space Separator | 580 | 0.8%
Lowercase Letter | 355 | 0.5%
Dash Punctuation | 154 | 0.2%
Close Punctuation | 143 | 0.2%
Open Punctuation | 143 | 0.2%
Other Punctuation | 132 | 0.2%
Letter Number | 8 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[illegible] | 2230 | 3.4%
[illegible] | 2151 | 3.3%
[illegible] | 1949 | 2.9%
[illegible] | 1871 | 2.8%
[illegible] | 1851 | 2.8%
[illegible] | 1618 | 2.4%
[illegible] | 1538 | 2.3%
[illegible] | 1473 | 2.2%
[illegible] | 1449 | 2.2%
[illegible] | 1299 | 2.0%
Other values (376) | 48722 | 73.7%

Uppercase Letter
Value | Count | Frequency (%)
S | 110 | 14.7%
C | 102 | 13.7%
K | 98 | 13.1%
M | 61 | 8.2%
D | 61 | 8.2%
L | 56 | 7.5%
I | 42 | 5.6%
H | 38 | 5.1%
E | 32 | 4.3%
G | 31 | 4.1%
Other values (7) | 116 | 15.5%

Lowercase Letter
Value | Count | Frequency (%)
e | 211 | 59.4%
l | 34 | 9.6%
i | 29 | 8.2%
v | 20 | 5.6%
s | 15 | 4.2%
k | 13 | 3.7%
c | 10 | 2.8%
w | 8 | 2.3%
h | 7 | 2.0%
a | 4 | 1.1%

Decimal Number
Value | Count | Frequency (%)
1 | 1222 | 31.8%
2 | 1068 | 27.8%
3 | 515 | 13.4%
4 | 266 | 6.9%
5 | 218 | 5.7%
6 | 169 | 4.4%
7 | 117 | 3.0%
0 | 97 | 2.5%
9 | 91 | 2.4%
8 | 80 | 2.1%

Other Punctuation
Value | Count | Frequency (%)
, | 107 | 81.1%
. | 25 | 18.9%

Space Separator
Value | Count | Frequency (%)
(space) | 580 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 154 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 143 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 143 | 100.0%

Letter Number
Value | Count | Frequency (%)
[illegible] | 8 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 4 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 66151 | 91.5%
Common | 4999 | 6.9%
Latin | 1110 | 1.5%

Most frequent character per script

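Hangul
Value | Count | Frequency (%)
[illegible] | 2230 | 3.4%
[illegible] | 2151 | 3.3%
[illegible] | 1949 | 2.9%
[illegible] | 1871 | 2.8%
[illegible] | 1851 | 2.8%
[illegible] | 1618 | 2.4%
[illegible] | 1538 | 2.3%
[illegible] | 1473 | 2.2%
[illegible] | 1449 | 2.2%
[illegible] | 1299 | 2.0%
Other values (376) | 48722 | 73.7%

Latin
Value | Count | Frequency (%)
e | 211 | 19.0%
S | 110 | 9.9%
C | 102 | 9.2%
K | 98 | 8.8%
M | 61 | 5.5%
D | 61 | 5.5%
L | 56 | 5.0%
I | 42 | 3.8%
H | 38 | 3.4%
l | 34 | 3.1%
Other values (19) | 297 | 26.8%

Common
Value | Count | Frequency (%)
1 | 1222 | 24.4%
2 | 1068 | 21.4%
(space) | 580 | 11.6%
3 | 515 | 10.3%
4 | 266 | 5.3%
5 | 218 | 4.4%
6 | 169 | 3.4%
- | 154 | 3.1%
) | 143 | 2.9%
( | 143 | 2.9%
Other values (7) | 521 | 10.4%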

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 66151 | 91.5%
ASCII | 6101 | 8.4%
Number Forms | 8 | < 0.1%

Most frequent character per block

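Hangul
Value | Count | Frequency (%)
[illegible] | 2230 | 3.4%
[illegible] | 2151 | 3.3%
[illegible] | 1949 | 2.9%
[illegible] | 1871 | 2.8%
[illegible] | 1851 | 2.8%
[illegible] | 1618 | 2.4%
[illegible] | 1538 | 2.3%
[illegible] | 1473 | 2.2%
[illegible] | 1449 | 2.2%
[illegible] | 1299 | 2.0%
Other values (376) | 48722 | 73.7%

ASCII
Value | Count | Frequency (%)
1 | 1222 | 20.0%
2 | 1068 | 17.5%
(space) | 580 | 9.5%
3 | 515 | 8.4%
4 | 266 | 4.4%
5 | 218 | 3.6%
e | 211 | 3.5%
6 | 169 | 2.8%
- | 154 | 2.5%
) | 143 | 2.3%
Other values (35) | 1555 | 25.5%

Number Forms
Value | Count | Frequency (%)
[illegible] | 8 | 100.0%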

아파트코드
Text

Distinct: 2176
Distinct (%): 21.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 111
Unique (%): 1.1%

Sample

1st row: A10026227
2nd row: A15722304
3rd row: A14210001
4th row: A13282510
5th row: A14281701

Value | Count | Frequency (%)
a13204201 | 13 | 0.1%
a15728008 | 12 | 0.1%
a13611006 | 11 | 0.1%
a41279932 | 11 | 0.1%
a13078701 | 11 | 0.1%
a13302204 | 11 | 0.1%
a13528103 | 11 | 0.1%
a13309402 | 11 | 0.1%
a13812004 | 11 | 0.1%
a12220005 | 11 | 0.1%
Other values (2166) | 9887 | 98.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 18300 | 20.3%
1 | 17643 | 19.6%
A | 9991 | 11.1%
3 | 9010 | 10.0%
2 | 8265 | 9.2%
5 | 6128 | 6.8%
8 | 5658 | 6.3%
7 | 4845 | 5.4%
4 | 3811 | 4.2%
6 | 3309 | 3.7%
Other values (2) | 3040 | 3.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18300 | 22.9%
1 | 17643 | 22.1%
3 | 9010 | 11.3%
2 | 8265 | 10.3%
5 | 6128 | 7.7%
8 | 5658 | 7.1%
7 | 4845 | 6.1%
4 | 3811 | 4.8%
6 | 3309 | 4.1%
9 | 3031 | 3.8%

Uppercase Letter
Value | Count | Frequency (%)
A | 9991 | 99.9%
B | 9 | 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 88.9%
Latin | 10000 | 11.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18300 | 22.9%
1 | 17643 | 22.1%
3 | 9010 | 11.3%
2 | 8265 | 10.3%
5 | 6128 | 7.7%
8 | 5658 | 7.1%
7 | 4845 | 6.1%
4 | 3811 | 4.8%
6 | 3309 | 4.1%
9 | 3031 | 3.8%

Latin
Value | Count | Frequency (%)
A | 9991 | 99.9%
B | 9 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18300 | 20.3%
1 | 17643 | 19.6%
A | 9991 | 11.1%
3 | 9010 | 10.0%
2 | 8265 | 9.2%
5 | 6128 | 6.8%
8 | 5658 | 6.3%
7 | 4845 | 5.4%
4 | 3811 | 4.2%
6 | 3309 | 3.7%
Other values (2) | 3040 | 3.4%

비용명
Text

Distinct: 77
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 9
Mean length: 6.0008
Min length: 2

Characters and Unicode

Total characters: 60008
Distinct characters: 107
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
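
A block is a contiguous range of code points, so block membership can be tested with range checks. The sketch below hard-codes three ranges from the Unicode block list. It assumes the report's "Hangul" label corresponds to the Hangul Syllables block; the table and function names are hypothetical, and this is not the profiler's own lookup.

```python
from collections import Counter

# (name, first code point, last code point); only the blocks seen in this report.
BLOCKS = [
    ("ASCII", 0x0000, 0x007F),
    ("Number Forms", 0x2150, 0x218F),
    ("Hangul Syllables", 0xAC00, 0xD7AF),  # assumed to be the report's "Hangul"
]


def block_of(ch: str) -> str:
    """Return the name of the block containing `ch`, or "Other"."""
    cp = ord(ch)
    for name, start, end in BLOCKS:
        if start <= cp <= end:
            return name
    return "Other"


def block_counts(values):
    """Count characters per block over all strings in `values`."""
    return Counter(block_of(ch) for text in values for ch in text)


# First three sample rows of this variable.
sample = ["미처분이익잉여금", "가수금", "비품"]
counts = block_counts(sample)
print(counts)  # Counter({'Hangul Syllables': 13})
```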

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 미처분이익잉여금
2nd row: 상여충당부채
3rd row: 가수금
4th row: 비품
5th row: 기타충당부채

Value | Count | Frequency (%)
관리비미수금 | 340 | 3.4%
미처분이익잉여금 | 334 | 3.3%
선급비용 | 332 | 3.3%
퇴직급여충당부채 | 326 | 3.3%
장기수선충당부채 | 308 | 3.1%
당기순이익 | 305 | 3.0%
예금 | 299 | 3.0%
미부과관리비 | 298 | 3.0%
연차수당충당부채 | 294 | 2.9%
예수금 | 293 | 2.9%
Other values (67) | 6871 | 68.7%

Most occurring characters

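Value | Count | Frequency (%)
[illegible] | 4687 | 7.8%
[illegible] | 3747 | 6.2%
[illegible] | 3187 | 5.3%
[illegible] | 3052 | 5.1%
[illegible] | 3044 | 5.1%
[illegible] | 2966 | 4.9%
[illegible] | 2650 | 4.4%
[illegible] | 2356 | 3.9%
[illegible] | 1941 | 3.2%
[illegible] | 1706 | 2.8%
Other values (97) | 30672 | 51.1%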

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 60008 | 100.0%

Most frequent character per category

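Other Letter
Value | Count | Frequency (%)
[illegible] | 4687 | 7.8%
[illegible] | 3747 | 6.2%
[illegible] | 3187 | 5.3%
[illegible] | 3052 | 5.1%
[illegible] | 3044 | 5.1%
[illegible] | 2966 | 4.9%
[illegible] | 2650 | 4.4%
[illegible] | 2356 | 3.9%
[illegible] | 1941 | 3.2%
[illegible] | 1706 | 2.8%
Other values (97) | 30672 | 51.1%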

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 60008 | 100.0%

Most frequent character per script

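Hangul
Value | Count | Frequency (%)
[illegible] | 4687 | 7.8%
[illegible] | 3747 | 6.2%
[illegible] | 3187 | 5.3%
[illegible] | 3052 | 5.1%
[illegible] | 3044 | 5.1%
[illegible] | 2966 | 4.9%
[illegible] | 2650 | 4.4%
[illegible] | 2356 | 3.9%
[illegible] | 1941 | 3.2%
[illegible] | 1706 | 2.8%
Other values (97) | 30672 | 51.1%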

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 60008 | 100.0%

Most frequent character per block

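Hangul
Value | Count | Frequency (%)
[illegible] | 4687 | 7.8%
[illegible] | 3747 | 6.2%
[illegible] | 3187 | 5.3%
[illegible] | 3052 | 5.1%
[illegible] | 3044 | 5.1%
[illegible] | 2966 | 4.9%
[illegible] | 2650 | 4.4%
[illegible] | 2356 | 3.9%
[illegible] | 1941 | 3.2%
[illegible] | 1706 | 2.8%
Other values (97) | 30672 | 51.1%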

년월일
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
202003 | 10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202003
2nd row: 202003
3rd row: 202003
4th row: 202003
5th row: 202003

Common Values

Value | Count | Frequency (%)
202003 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
202003 | 10000 | 100.0%

금액
Real number (ℝ)

ZEROS 

Distinct: 7576
Distinct (%): 75.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 68755997
Minimum: -2.7179173 × 10^8
Maximum: 8.9543179 × 10^9
Zeros: 2093
Zeros (%): 20.9%
Negative: 338
Negative (%): 3.4%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -2.7179173 × 10^8
5-th percentile: 0
Q1: 7020
Median: 3253005
Q3: 34539869
95-th percentile: 3.38867 × 10^8
Maximum: 8.9543179 × 10^9
Range: 9.2261096 × 10^9
Interquartile range (IQR): 34532849
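
The range and the IQR are simple differences of the other quantile figures. The sketch below reproduces them, using the exact minimum and maximum that appear in the extreme-value tables further down; the variable names are illustrative only.

```python
# Exact extremes from the extreme-value tables; quartiles from the list above.
minimum = -271_791_731
maximum = 8_954_317_866
q1, q3 = 7_020, 34_539_869

value_range = maximum - minimum  # range = maximum - minimum
iqr = q3 - q1                    # interquartile range = Q3 - Q1

print(value_range)  # 9226109597, i.e. about 9.2261096 × 10^9
print(iqr)          # 34532849
```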

Descriptive statistics

Standard deviation: 2.7535831 × 10^8
Coefficient of variation (CV): 4.0048624
Kurtosis: 361.92368
Mean: 68755997
Median Absolute Deviation (MAD): 3253005
Skewness: 14.868295
Sum: 6.8755997 × 10^11
Variance: 7.5822197 × 10^16
Monotonicity: Not monotonic
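
Several of these figures are linked by definition: CV is the standard deviation divided by the mean, variance is the squared standard deviation, and the sum is the mean times the number of observations. A minimal consistency check on the rounded values printed above, so small rounding differences are expected:

```python
mean = 68_755_997   # reported mean (rounded)
std = 2.7535831e8   # reported standard deviation (rounded)
n = 10_000          # number of observations

cv = std / mean     # coefficient of variation
variance = std ** 2
total = mean * n    # sum of all values

print(round(cv, 4))      # 4.0049
print(f"{variance:.4e}") # 7.5822e+16
print(total)             # 687559970000
```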
Histogram with fixed size bins (bins=50)
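
Fixed-size binning splits the interval between the minimum and maximum into equal-width bins and counts the values in each. The sketch below is one common way to do this, with the maximum assigned to the last bin; it is not necessarily how the plot above was produced, and the function name is hypothetical.

```python
def fixed_width_histogram(values, bins=50):
    """Return (counts, edges) for equal-width bins spanning min..max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        if width == 0:
            idx = 0  # all values identical: everything lands in the first bin
        else:
            # The maximum would index one past the end, so clamp it into the last bin.
            idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return counts, edges


counts, edges = fixed_width_histogram([0, 0, 5, 10], bins=2)
print(counts)  # [2, 2]
print(edges)   # [0.0, 5.0, 10.0]
```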

Value | Count | Frequency (%)
0 | 2093 | 20.9%
250000 | 25 | 0.2%
500000 | 22 | 0.2%
2000000 | 19 | 0.2%
242000 | 17 | 0.2%
3000000 | 14 | 0.1%
484000 | 13 | 0.1%
1000000 | 13 | 0.1%
300000 | 13 | 0.1%
30000000 | 8 | 0.1%
Other values (7566) | 7763 | 77.6%

Value | Count | Frequency (%)
-271791731 | 1 | < 0.1%
-240552040 | 1 | < 0.1%
-137360880 | 1 | < 0.1%
-94069330 | 1 | < 0.1%
-88304416 | 1 | < 0.1%
-86645150 | 1 | < 0.1%
-83223130 | 1 | < 0.1%
-78679350 | 1 | < 0.1%
-78673037 | 1 | < 0.1%
-73095100 | 1 | < 0.1%

Value | Count | Frequency (%)
8954317866 | 1 | < 0.1%
8880669168 | 1 | < 0.1%
8003008457 | 1 | < 0.1%
6503858315 | 1 | < 0.1%
5907120472 | 1 | < 0.1%
4789802542 | 1 | < 0.1%
3757061363 | 1 | < 0.1%
3467278462 | 1 | < 0.1%
3000016676 | 1 | < 0.1%
2849974319 | 1 | < 0.1%

Interactions


Correlations

Variable | 비용명 | 금액
비용명 | 1.000 | 0.384
금액 | 0.384 | 1.000
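
The report does not say which coefficient produced the 0.384 between the text-valued 비용명 and the numeric 금액. One common choice for such a pair is to bin the numeric column and compute Cramér's V on the two sets of labels. The sketch below assumes that approach, applies no bias correction, and will not necessarily reproduce the reported value.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Cramér's V for two equally long sequences of categorical labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # observed counts per (x, y) pair
    row = Counter(xs)             # marginal counts of x
    col = Counter(ys)             # marginal counts of y
    chi2 = 0.0
    for x, rx in row.items():
        for y, cy in col.items():
            expected = rx * cy / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Toy labels (hypothetical): the second list stands in for a binned numeric column.
perfect = cramers_v(["a", "a", "b", "b"], ["low", "low", "high", "high"])
independent = cramers_v(["a", "a", "b", "b"], ["low", "high", "low", "high"])
print(perfect, independent)  # 1.0 0.0
```

A value of 1.0 means the labels of one column fully determine the other, and 0.0 means no association in the sample.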

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
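
The per-column nullity behind such a chart is just a count of missing cells per column. A minimal sketch, assuming rows are dictionaries and a missing cell is `None`; the function name and the third row are hypothetical, and the first two rows come from the sample at the end of this report.

```python
def nullity_by_column(rows, columns):
    """Return {column: number of missing (None or absent) cells}."""
    return {c: sum(1 for r in rows if r.get(c) is None) for c in columns}


columns = ["아파트명", "아파트코드", "비용명", "년월일", "금액"]
rows = [
    {"아파트명": "아크로리버뷰 신반포", "아파트코드": "A10026227",
     "비용명": "미처분이익잉여금", "년월일": "202003", "금액": 0},
    {"아파트명": "삼각산아이원임대", "아파트코드": "A14210001",
     "비용명": "가수금", "년월일": "202003", "금액": 5419802},
    # Hypothetical row with a missing amount, to show a non-zero count.
    {"아파트명": "예시", "아파트코드": "A00000000",
     "비용명": "비품", "년월일": "202003", "금액": None},
]

missing = nullity_by_column(rows, columns)
print(missing["금액"])  # 1
```

For the dataset profiled here, every column would report 0, matching the "Missing cells: 0" figure in the overview.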

Sample

(index) | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
2011 | 아크로리버뷰 신반포 | A10026227 | 미처분이익잉여금 | 202003 | 0
62635 | 치현마을동일스위트리버아파트 | A15722304 | 상여충당부채 | 202003 | 0
46711 | 삼각산아이원임대 | A14210001 | 가수금 | 202003 | 5419802
19684 | 방학우성2차 | A13282510 | 비품 | 202003 | 0
47291 | 현대성우 | A14281701 | 기타충당부채 | 202003 | 0
66940 | 목동1단지 | A15875101 | 비품 | 202003 | 55782045
63884 | 등촌라인 | A15783806 | 예수금 | 202003 | 1426219
30060 | 돈암범양 | A13606102 | 선수관리비 | 202003 | 79455000
19365 | 방학동부센트레빌 | A13272102 | 장기수선충당부채 | 202003 | 261435643
58493 | 보라매코오롱하늘채 | A15602002 | 경비비충당부채 | 202003 | 1439410

(index) | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
9671 | 마포래미안푸르지오 | A12175203 | 수선유지비충당부채 | 202003 | 0
7746 | 연희성원 | A12071101 | 비품감가상각누계액 | 202003 | -14151650
31458 | 동일하이빌뉴시티 | A13613011 | 기타유형자산 | 202003 | 2561500
38293 | 가락대림아파트 | A13880204 | 당기순이익 | 202003 | 7108743
34625 | 방배1차현대 | A13785203 | 장기수선충당부채 | 202003 | 382472336
61337 | 등촌8단지주공아파트 | A15703301 | 미수수익 | 202003 | 2720
6072 | 명륜아남1차 | A11052201 | 미수금 | 202003 | 2966090
5023 | 용두 롯데 캐슬리치 아파트 | A10028080 | 장기수선충당부채 | 202003 | 160166324
27921 | 래미안대치하이스턴 | A13528007 | 미수금 | 202003 | 0
47443 | 화양현대 | A14313001 | 선수관리비 | 202003 | 43536000