Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
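
The two memory figures are consistent with each other: the average record size is the total in-memory size divided by the number of observations. A minimal check, using only the values reported above (the variable names are illustrative, not part of the report):

```python
# Values copied from the dataset statistics above.
total_size_kib = 488.3    # total size in memory, in KiB
n_observations = 10_000   # number of rows

# 1 KiB = 1024 bytes; dividing by the row count gives the per-record average.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{avg_record_bytes:.1f} B")  # prints "50.0 B"
```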

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15820/S/1/datasetView.do

Alerts

년월일 has constant value "201903" (Constant)
금액 has 2028 (20.3%) zeros (Zeros)
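
Both alerts follow from simple per-column checks: a column is constant when it has exactly one distinct value, and the zero share is the number of zeros divided by the number of rows (here 2028 / 10000 = 20.3%). The sketch below illustrates the idea on toy data; `column_alerts` is a hypothetical helper, not the profiler's own implementation.

```python
def column_alerts(values):
    """Return simple alerts for one column: constant value and share of zeros."""
    alerts = []
    distinct = set(values)
    if len(distinct) == 1:
        alerts.append(f"constant value {next(iter(distinct))!r}")
    zeros = sum(1 for v in values if v == 0)
    if zeros:
        alerts.append(f"{zeros} ({zeros / len(values):.1%}) zeros")
    return alerts


# Toy columns for illustration only, not the real dataset.
print(column_alerts(["201903"] * 5))          # ["constant value '201903'"]
print(column_alerts([0, 5342640, 0, 10, 7]))  # ['2 (40.0%) zeros']
```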

Reproduction

Analysis started: 2024-05-11 06:01:59.266615
Analysis finished: 2024-05-11 06:02:00.851233
Duration: 1.58 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
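
The reported duration can be recomputed from the two timestamps. A minimal sketch using only the standard library (variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the reproduction details above.
started = datetime.fromisoformat("2024-05-11 06:01:59.266615")
finished = datetime.fromisoformat("2024-05-11 06:02:00.851233")

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")  # prints "1.58 seconds"
```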

Variables

아파트명
Text

Distinct: 2103
Distinct (%): 21.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 22
Median length: 19
Mean length: 7.1329
Min length: 2

Characters and Unicode

Total characters: 71329
Distinct characters: 430
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
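
As a rough illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It returns two-letter codes (`Lo` = Other Letter, `Lu` = Uppercase Letter, `Nd` = Decimal Number) rather than the long names used in this report. The helper name `category_counts` is hypothetical, and the input is three of this variable's sample rows, not the full column.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


# Three sample rows from this variable.
samples = ["LG한강자이", "서울숲푸르지오", "보라매파크빌"]
counts = category_counts(samples)
print(counts)  # Counter({'Lo': 17, 'Lu': 2})
```

Hangul syllables fall under `Lo`, which is why "Other Letter" dominates the category tables for the Korean text variables.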

Unique

Unique: 80
Unique (%): 0.8%

Sample

1st row: 서초롯데캐슬프레지던트아파트
2nd row: 구로한일유엔아이
3rd row: 서울숲푸르지오
4th row: LG한강자이
5th row: 보라매파크빌

Value | Count | Frequency (%)
아파트 | 108 | 1.0%
래미안 | 21 | 0.2%
신동아파밀리에 | 14 | 0.1%
래미안밤섬리베뉴 | 13 | 0.1%
브라운스톤 | 13 | 0.1%
흑석한강센트레빌2차 | 12 | 0.1%
하계청구 | 12 | 0.1%
등촌ipark | 12 | 0.1%
성산월드타운대림 | 12 | 0.1%
은평뉴타운상림마을6단지 | 12 | 0.1%
Other values (2155) | 10237 | 97.8%
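
The counts above sum to more than the 10000 observations, and entries such as `등촌ipark` are lowercased, which suggests the table counts lowercased, whitespace-separated tokens rather than whole cell values. That reading is an assumption drawn from the numbers, not something the report states. Under that assumption, the counting could be sketched as follows (`token_counts` and the input values are hypothetical):

```python
from collections import Counter


def token_counts(values):
    """Count lowercased, whitespace-separated tokens across all values."""
    return Counter(token for value in values for token in value.lower().split())


# Hypothetical cell values for illustration only.
names = ["등촌IPARK", "래미안 아파트", "래미안"]
tokens = token_counts(names)
print(tokens.most_common(1))  # [('래미안', 2)]
```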

Most occurring characters

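Value | Count | Frequency (%)
(not preserved) | 2192 | 3.1%
(not preserved) | 2088 | 2.9%
(not preserved) | 1911 | 2.7%
(not preserved) | 1872 | 2.6%
(not preserved) | 1812 | 2.5%
(not preserved) | 1711 | 2.4%
(not preserved) | 1554 | 2.2%
(not preserved) | 1481 | 2.1%
(not preserved) | 1479 | 2.1%
(not preserved) | 1400 | 2.0%
Other values (420) | 53829 | 75.5%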

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 65291 | 91.5%
Decimal Number | 3940 | 5.5%
Uppercase Letter | 725 | 1.0%
Space Separator | 512 | 0.7%
Lowercase Letter | 314 | 0.4%
Other Punctuation | 141 | 0.2%
Close Punctuation | 135 | 0.2%
Open Punctuation | 135 | 0.2%
Dash Punctuation | 127 | 0.2%
Letter Number | 5 | < 0.1%

Most frequent character per category

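Other Letter
Value | Count | Frequency (%)
(not preserved) | 2192 | 3.4%
(not preserved) | 2088 | 3.2%
(not preserved) | 1911 | 2.9%
(not preserved) | 1872 | 2.9%
(not preserved) | 1812 | 2.8%
(not preserved) | 1711 | 2.6%
(not preserved) | 1554 | 2.4%
(not preserved) | 1481 | 2.3%
(not preserved) | 1479 | 2.3%
(not preserved) | 1400 | 2.1%
Other values (374) | 47791 | 73.2%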

Uppercase Letter
Value | Count | Frequency (%)
S | 119 | 16.4%
K | 99 | 13.7%
C | 81 | 11.2%
L | 53 | 7.3%
M | 51 | 7.0%
D | 51 | 7.0%
H | 44 | 6.1%
I | 44 | 6.1%
E | 37 | 5.1%
A | 35 | 4.8%
Other values (7) | 111 | 15.3%

Lowercase Letter
Value | Count | Frequency (%)
e | 180 | 57.3%
l | 34 | 10.8%
i | 30 | 9.6%
v | 22 | 7.0%
w | 10 | 3.2%
s | 10 | 3.2%
k | 9 | 2.9%
c | 8 | 2.5%
h | 5 | 1.6%
g | 3 | 1.0%

Decimal Number
Value | Count | Frequency (%)
1 | 1209 | 30.7%
2 | 1180 | 29.9%
3 | 535 | 13.6%
4 | 256 | 6.5%
5 | 190 | 4.8%
6 | 161 | 4.1%
7 | 109 | 2.8%
9 | 106 | 2.7%
8 | 97 | 2.5%
0 | 97 | 2.5%

Other Punctuation
Value | Count | Frequency (%)
, | 111 | 78.7%
. | 30 | 21.3%

Space Separator
Value | Count | Frequency (%)
(space) | 512 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 135 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 135 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 127 | 100.0%

Letter Number
Value | Count | Frequency (%)
(not preserved) | 5 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 4 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 65291 | 91.5%
Common | 4994 | 7.0%
Latin | 1044 | 1.5%

Most frequent character per script

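Hangul
Value | Count | Frequency (%)
(not preserved) | 2192 | 3.4%
(not preserved) | 2088 | 3.2%
(not preserved) | 1911 | 2.9%
(not preserved) | 1872 | 2.9%
(not preserved) | 1812 | 2.8%
(not preserved) | 1711 | 2.6%
(not preserved) | 1554 | 2.4%
(not preserved) | 1481 | 2.3%
(not preserved) | 1479 | 2.3%
(not preserved) | 1400 | 2.1%
Other values (374) | 47791 | 73.2%

Latin
Value | Count | Frequency (%)
e | 180 | 17.2%
S | 119 | 11.4%
K | 99 | 9.5%
C | 81 | 7.8%
L | 53 | 5.1%
M | 51 | 4.9%
D | 51 | 4.9%
H | 44 | 4.2%
I | 44 | 4.2%
E | 37 | 3.5%
Other values (19) | 285 | 27.3%

Common
Value | Count | Frequency (%)
1 | 1209 | 24.2%
2 | 1180 | 23.6%
3 | 535 | 10.7%
(space) | 512 | 10.3%
4 | 256 | 5.1%
5 | 190 | 3.8%
6 | 161 | 3.2%
) | 135 | 2.7%
( | 135 | 2.7%
- | 127 | 2.5%
Other values (7) | 554 | 11.1%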

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 65291 | 91.5%
ASCII | 6033 | 8.5%
Number Forms | 5 | < 0.1%

Most frequent character per block

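Hangul
Value | Count | Frequency (%)
(not preserved) | 2192 | 3.4%
(not preserved) | 2088 | 3.2%
(not preserved) | 1911 | 2.9%
(not preserved) | 1872 | 2.9%
(not preserved) | 1812 | 2.8%
(not preserved) | 1711 | 2.6%
(not preserved) | 1554 | 2.4%
(not preserved) | 1481 | 2.3%
(not preserved) | 1479 | 2.3%
(not preserved) | 1400 | 2.1%
Other values (374) | 47791 | 73.2%

ASCII
Value | Count | Frequency (%)
1 | 1209 | 20.0%
2 | 1180 | 19.6%
3 | 535 | 8.9%
(space) | 512 | 8.5%
4 | 256 | 4.2%
5 | 190 | 3.1%
e | 180 | 3.0%
6 | 161 | 2.7%
) | 135 | 2.2%
( | 135 | 2.2%
Other values (35) | 1540 | 25.5%

Number Forms
Value | Count | Frequency (%)
(not preserved) | 5 | 100.0%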

아파트코드
Text

Distinct: 2109
Distinct (%): 21.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 80
Unique (%): 0.8%

Sample

1st row: A10028146
2nd row: A15205104
3rd row: A13380803
4th row: A14003007
5th row: A15685503

Value | Count | Frequency (%)
a12125202 | 12 | 0.1%
a15703204 | 12 | 0.1%
a13923103 | 12 | 0.1%
a12071002 | 12 | 0.1%
a15679109 | 12 | 0.1%
a41279912 | 11 | 0.1%
a14319005 | 11 | 0.1%
a15003801 | 11 | 0.1%
a12205003 | 11 | 0.1%
a41279924 | 11 | 0.1%
Other values (2099) | 9885 | 98.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 18309 | 20.3%
1 | 17581 | 19.5%
A | 9995 | 11.1%
3 | 9064 | 10.1%
2 | 8076 | 9.0%
5 | 6200 | 6.9%
8 | 5711 | 6.3%
7 | 4877 | 5.4%
4 | 3789 | 4.2%
6 | 3366 | 3.7%
Other values (2) | 3032 | 3.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18309 | 22.9%
1 | 17581 | 22.0%
3 | 9064 | 11.3%
2 | 8076 | 10.1%
5 | 6200 | 7.8%
8 | 5711 | 7.1%
7 | 4877 | 6.1%
4 | 3789 | 4.7%
6 | 3366 | 4.2%
9 | 3027 | 3.8%

Uppercase Letter
Value | Count | Frequency (%)
A | 9995 | > 99.9%
B | 5 | < 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 88.9%
Latin | 10000 | 11.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18309 | 22.9%
1 | 17581 | 22.0%
3 | 9064 | 11.3%
2 | 8076 | 10.1%
5 | 6200 | 7.8%
8 | 5711 | 7.1%
7 | 4877 | 6.1%
4 | 3789 | 4.7%
6 | 3366 | 4.2%
9 | 3027 | 3.8%

Latin
Value | Count | Frequency (%)
A | 9995 | > 99.9%
B | 5 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18309 | 20.3%
1 | 17581 | 19.5%
A | 9995 | 11.1%
3 | 9064 | 10.1%
2 | 8076 | 9.0%
5 | 6200 | 6.9%
8 | 5711 | 6.3%
7 | 4877 | 5.4%
4 | 3789 | 4.2%
6 | 3366 | 3.7%
Other values (2) | 3032 | 3.4%

비용명
Text

Distinct: 77
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 9
Mean length: 5.9586
Min length: 2

Characters and Unicode

Total characters: 59586
Distinct characters: 107
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 승강기유지비충당부채
2nd row: 연차수당충당부채
3rd row: 미지급비용
4th row: 공동주택적립금예금
5th row: 미부과관리비

Value | Count | Frequency (%)
예수금 | 345 | 3.5%
연차수당충당부채 | 324 | 3.2%
미처분이익잉여금 | 322 | 3.2%
관리비미수금 | 321 | 3.2%
퇴직급여충당부채 | 319 | 3.2%
당기순이익 | 318 | 3.2%
선급비용 | 317 | 3.2%
가수금 | 314 | 3.1%
예금 | 305 | 3.0%
장기수선충당예금 | 301 | 3.0%
Other values (67) | 6814 | 68.1%

Most occurring characters

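Value | Count | Frequency (%)
(not preserved) | 4759 | 8.0%
(not preserved) | 3787 | 6.4%
(not preserved) | 3294 | 5.5%
(not preserved) | 3078 | 5.2%
(not preserved) | 2948 | 4.9%
(not preserved) | 2928 | 4.9%
(not preserved) | 2630 | 4.4%
(not preserved) | 2304 | 3.9%
(not preserved) | 1880 | 3.2%
(not preserved) | 1783 | 3.0%
Other values (97) | 30195 | 50.7%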

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 59586 | 100.0%

Most frequent character per category

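Other Letter
Value | Count | Frequency (%)
(not preserved) | 4759 | 8.0%
(not preserved) | 3787 | 6.4%
(not preserved) | 3294 | 5.5%
(not preserved) | 3078 | 5.2%
(not preserved) | 2948 | 4.9%
(not preserved) | 2928 | 4.9%
(not preserved) | 2630 | 4.4%
(not preserved) | 2304 | 3.9%
(not preserved) | 1880 | 3.2%
(not preserved) | 1783 | 3.0%
Other values (97) | 30195 | 50.7%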

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 59586 | 100.0%

Most frequent character per script

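Hangul
Value | Count | Frequency (%)
(not preserved) | 4759 | 8.0%
(not preserved) | 3787 | 6.4%
(not preserved) | 3294 | 5.5%
(not preserved) | 3078 | 5.2%
(not preserved) | 2948 | 4.9%
(not preserved) | 2928 | 4.9%
(not preserved) | 2630 | 4.4%
(not preserved) | 2304 | 3.9%
(not preserved) | 1880 | 3.2%
(not preserved) | 1783 | 3.0%
Other values (97) | 30195 | 50.7%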

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 59586 | 100.0%

Most frequent character per block

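Hangul
Value | Count | Frequency (%)
(not preserved) | 4759 | 8.0%
(not preserved) | 3787 | 6.4%
(not preserved) | 3294 | 5.5%
(not preserved) | 3078 | 5.2%
(not preserved) | 2948 | 4.9%
(not preserved) | 2928 | 4.9%
(not preserved) | 2630 | 4.4%
(not preserved) | 2304 | 3.9%
(not preserved) | 1880 | 3.2%
(not preserved) | 1783 | 3.0%
Other values (97) | 30195 | 50.7%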

년월일
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 201903
2nd row: 201903
3rd row: 201903
4th row: 201903
5th row: 201903

Common Values

Value | Count | Frequency (%)
201903 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
201903 | 10000 | 100.0%

금액
Real number (ℝ)

ZEROS

Distinct: 7634
Distinct (%): 76.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 73120152
Minimum: -2.135686 × 10^8
Maximum: 9.5631702 × 10^9
Zeros: 2028
Zeros (%): 20.3%
Negative: 308
Negative (%): 3.1%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -2.135686 × 10^8
5-th percentile: 0
Q1: 38492.75
median: 3574545
Q3: 36112644
95-th percentile: 3.4573956 × 10^8
Maximum: 9.5631702 × 10^9
Range: 9.7767388 × 10^9
Interquartile range (IQR): 36074152
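
Range and IQR are derived from the other quantiles: range = maximum - minimum and IQR = Q3 - Q1. A minimal check against the figures above (variable names are illustrative; the inputs are the rounded values as displayed, so the last digit can differ from the reported one):

```python
# Values copied from the quantile statistics above.
minimum = -2.135686e8
maximum = 9.5631702e9
q1 = 38492.75
q3 = 36112644

value_range = maximum - minimum  # spread between the extremes
iqr = q3 - q1                    # spread of the middle 50% of values

print(f"range = {value_range:.7e}")  # range = 9.7767388e+09
print(f"IQR   = {iqr:.2f}")          # IQR   = 36074151.25
```

The IQR computed here (36074151.25) agrees with the reported 36074152 up to display rounding of Q3.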

Descriptive statistics

Standard deviation: 2.8323149 × 10^8
Coefficient of variation (CV): 3.8735079
Kurtosis: 263.54194
Mean: 73120152
Median Absolute Deviation (MAD): 3574545
Skewness: 12.545802
Sum: 7.3120152 × 10^11
Variance: 8.0220078 × 10^16
Monotonicity: Not monotonic
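
Several of these statistics are tied together by simple identities: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. The sketch below checks the reported values against each other; the variable names are illustrative, and the inputs are rounded as displayed, so agreement is only approximate.

```python
import math

# Values copied from the descriptive statistics above.
mean = 73120152
std = 2.8323149e8
n = 10_000  # number of observations

cv = std / mean       # coefficient of variation
variance = std ** 2   # variance is the squared standard deviation
total = mean * n      # sum of all values

print(f"CV       = {cv:.4f}")        # CV       = 3.8735
print(f"variance = {variance:.4e}")
print(f"sum      = {total:.7e}")     # sum      = 7.3120152e+11

# The reported figures agree with these identities up to rounding.
assert math.isclose(cv, 3.8735079, rel_tol=1e-6)
assert math.isclose(variance, 8.0220078e16, rel_tol=1e-6)
```

A CV near 3.9 together with a skewness of about 12.5 indicates a heavily right-skewed distribution in which a few very large amounts dominate the spread.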
Histogram with fixed size bins (bins=50)
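
A fixed-size-bin histogram splits the interval from the minimum to the maximum into equal-width bins and counts the values in each. The following is a minimal standard-library sketch of that idea, not the plotting code the report used; `fixed_width_histogram` is a hypothetical helper.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if lo == hi:
        # Degenerate case: every value is identical, so use the first bin.
        counts[0] = len(values)
        return counts
    width = (hi - lo) / bins
    for v in values:
        # The maximum lies on the right edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


# Toy data for illustration only.
print(fixed_width_histogram(list(range(11)), bins=5))  # [2, 2, 2, 2, 3]
```

With the reported range of 9.7767388 × 10^9 and 50 bins, each bin spans roughly 1.96 × 10^8, so nearly all observations fall into the first few bins.
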

Value | Count | Frequency (%)
0 | 2028 | 20.3%
500000 | 28 | 0.3%
250000 | 17 | 0.2%
300000 | 15 | 0.1%
100000 | 14 | 0.1%
1000000 | 13 | 0.1%
484000 | 12 | 0.1%
242000 | 12 | 0.1%
30000000 | 10 | 0.1%
5000000 | 8 | 0.1%
Other values (7624) | 7843 | 78.4%

Value | Count | Frequency (%)
-213568600 | 1 | < 0.1%
-208368401 | 1 | < 0.1%
-205308728 | 1 | < 0.1%
-161481980 | 1 | < 0.1%
-150682660 | 1 | < 0.1%
-134130850 | 1 | < 0.1%
-126365153 | 1 | < 0.1%
-111088475 | 1 | < 0.1%
-111011750 | 1 | < 0.1%
-108093850 | 1 | < 0.1%

Value | Count | Frequency (%)
9563170161 | 1 | < 0.1%
7129338980 | 1 | < 0.1%
7037755693 | 1 | < 0.1%
5562530172 | 1 | < 0.1%
4855771708 | 1 | < 0.1%
4434646994 | 1 | < 0.1%
4309575926 | 1 | < 0.1%
3963148756 | 1 | < 0.1%
3928348880 | 1 | < 0.1%
3828996559 | 1 | < 0.1%

Interactions


Correlations

 | 비용명 | 금액
비용명 | 1.000 | 0.399
금액 | 0.399 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index) | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
3617 | 서초롯데캐슬프레지던트아파트 | A10028146 | 승강기유지비충당부채 | 201903 | 0
51816 | 구로한일유엔아이 | A15205104 | 연차수당충당부채 | 201903 | 5342640
20680 | 서울숲푸르지오 | A13380803 | 미지급비용 | 201903 | 63475198
43024 | LG한강자이 | A14003007 | 공동주택적립금예금 | 201903 | 0
58433 | 보라매파크빌 | A15685503 | 미부과관리비 | 201903 | 88128530
27345 | 수서동익 | A13588601 | 비품감가상각누계액 | 201903 | -36577648
5241 | 홍제현대아파트 | A12009102 | 선급비용 | 201903 | 11990290
22200 | 명일삼익그린11차 | A13407201 | 미처분이익잉여금 | 201903 | 5531850
46736 | 자양7차현대홈타운 | A14388204 | 퇴직급여충당예금 | 201903 | 60957862
24472 | 청담삼성1차 | A13510001 | 주차장충당부채 | 201903 | 21811176

(index) | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
64001 | 신월대성유니드 | A15809403 | 주차장충당부채 | 201903 | 0
62507 | 목동현대아이파크 | A15805102 | 관리비예치금 | 201903 | 25820000
7687 | 상암월드컵파크2단지 | A12127004 | 기타시설운영충당부채 | 201903 | 153284012
118 | 응암역효성해링턴플레이스 | A10025659 | 현금 | 201903 | 286800
48050 | 문래두산위브 | A15009505 | 연차수당충당부채 | 201903 | 4711920
57051 | 사당휴먼시아 | A15609003 | 공동체활성화단체지원적립금 | 201903 | 800000
32221 | 서초네이처힐3단지 | A13778205 | 수선유지비충당부채 | 201903 | 0
18207 | 방학4단지신동아 | A13285507 | 저장품 | 201903 | -76000
37002 | 가락삼익맨션 | A13885306 | 미수금 | 201903 | 7888934
31942 | 방배서리풀e편한세상 | A13771601 | 미수금 | 201903 | 0