Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
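The average record size follows from the two figures above it. The short check below is only a sketch of that arithmetic; the variable names are illustrative and the assumption is that the report divides total memory by the row count.

```python
# Figures copied from the "Dataset statistics" block of the report.
total_size_kib = 488.3      # "Total size in memory", in KiB
n_observations = 10_000     # "Number of observations"

# 1 KiB = 1024 bytes; dividing by the row count gives bytes per record.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the result agrees with the 50.0 B shown in the report.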

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15820/S/1/datasetView.do

Alerts

년월일 (date) has constant value "202306" (Constant)
금액 (amount) has 2347 (23.5%) zeros (Zeros)
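Both alerts come down to simple checks on a column. The sketch below illustrates them on the 금액 values of the first ten sample rows of this report; the helper names `zero_share` and `is_constant` are hypothetical and are not part of ydata-profiling, whose own alert thresholds are not shown in the report.

```python
def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


def is_constant(values):
    """True when every entry holds the same value."""
    return len(set(values)) == 1


# 금액 values of the first ten rows in the Sample section.
amounts = [14266440, 0, 324906548, 4668732, 3566620,
           17155, 369138111, 0, 1058992157, 28396770]
# 년월일 is "202306" in every row of the dataset.
year_months = ["202306"] * len(amounts)

print(zero_share(amounts))       # share of zeros in the ten sample rows
print(is_constant(year_months))  # the Constant alert condition

# Whole-dataset figure from the alert: 2347 zeros among 10000 rows.
report_zero_share = 2347 / 10_000
```

The ten-row sample is far too small to reproduce the 23.5% figure; it only shows the shape of the check.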

Reproduction

Analysis started: 2024-05-11 05:56:00.506221
Analysis finished: 2024-05-11 05:56:01.618575
Duration: 1.11 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명 (apartment name)
Text

Distinct: 2262
Distinct (%): 22.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 20
Mean length: 7.4677
Min length: 2

Characters and Unicode

Total characters: 74677
Distinct characters: 433
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
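As an illustration of those character properties, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own code; the `CATEGORY_NAMES` mapping and the `category_counts` helper are assumptions made for this example, and the two input strings are apartment names taken from the Sample section.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Nl": "Letter Number",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the two-letter code for categories not listed above.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Two apartment names from the Sample section of this report.
names = ["목동대원칸타빌2,3단지", "월곡3SH-vill"]
counts = category_counts(names)
for name, count in counts.most_common():
    print(name, count)
```

Hangul syllables fall under Other Letter, which is why that category dominates the tables for this variable. The standard module does not expose scripts or blocks, so those breakdowns would need another data source.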

Unique

Unique: 131
Unique (%): 1.3%

Sample

1st row: 양재우성
2nd row: 고덕아남
3rd row: 천호삼성아파트
4th row: 현대멤피스아파트
5th row: 금호어울림1차

Value | Count | Frequency (%)
아파트 | 156 | 1.4%
래미안 | 51 | 0.5%
e편한세상 | 25 | 0.2%
푸르지오 | 19 | 0.2%
sk뷰 | 19 | 0.2%
송파 | 16 | 0.1%
아이파크 | 16 | 0.1%
신반포 | 15 | 0.1%
해모로 | 15 | 0.1%
강남한신휴플러스 | 15 | 0.1%
Other values (2347) | 10489 | 96.8%

Most occurring characters

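(character glyphs not preserved; rows in rank order)
Rank | Count | Frequency (%)
1 | 2497 | 3.3%
2 | 2490 | 3.3%
3 | 2350 | 3.1%
4 | 1945 | 2.6%
5 | 1637 | 2.2%
6 | 1606 | 2.2%
7 | 1519 | 2.0%
8 | 1454 | 1.9%
9 | 1436 | 1.9%
10 | 1403 | 1.9%
Other values (423) | 56340 | 75.4%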

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 68178 | 91.3%
Decimal Number | 3724 | 5.0%
Space Separator | 918 | 1.2%
Uppercase Letter | 903 | 1.2%
Lowercase Letter | 364 | 0.5%
Open Punctuation | 157 | 0.2%
Close Punctuation | 157 | 0.2%
Dash Punctuation | 146 | 0.2%
Other Punctuation | 125 | 0.2%
Letter Number | 5 | < 0.1%

Most frequent character per category

Other Letter
(character glyphs not preserved; rows in rank order)
Rank | Count | Frequency (%)
1 | 2497 | 3.7%
2 | 2490 | 3.7%
3 | 2350 | 3.4%
4 | 1945 | 2.9%
5 | 1637 | 2.4%
6 | 1606 | 2.4%
7 | 1519 | 2.2%
8 | 1454 | 2.1%
9 | 1436 | 2.1%
10 | 1403 | 2.1%
Other values (378) | 49841 | 73.1%

Uppercase Letter
Value | Count | Frequency (%)
S | 162 | 17.9%
C | 128 | 14.2%
K | 123 | 13.6%
M | 86 | 9.5%
D | 86 | 9.5%
L | 63 | 7.0%
H | 46 | 5.1%
I | 46 | 5.1%
E | 41 | 4.5%
V | 31 | 3.4%
Other values (7) | 91 | 10.1%

Lowercase Letter
Value | Count | Frequency (%)
e | 196 | 53.8%
l | 34 | 9.3%
i | 34 | 9.3%
s | 23 | 6.3%
v | 19 | 5.2%
k | 13 | 3.6%
h | 13 | 3.6%
g | 9 | 2.5%
a | 9 | 2.5%
w | 8 | 2.2%

Decimal Number
Value | Count | Frequency (%)
1 | 1172 | 31.5%
2 | 1045 | 28.1%
3 | 534 | 14.3%
4 | 241 | 6.5%
5 | 196 | 5.3%
6 | 161 | 4.3%
8 | 107 | 2.9%
7 | 105 | 2.8%
9 | 95 | 2.6%
0 | 68 | 1.8%

Other Punctuation
Value | Count | Frequency (%)
, | 89 | 71.2%
. | 36 | 28.8%

Space Separator
Value | Count | Frequency (%)
(space) | 918 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 157 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 157 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 146 | 100.0%

Letter Number
Value | Count | Frequency (%)
(glyph not preserved) | 5 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 68178 | 91.3%
Common | 5227 | 7.0%
Latin | 1272 | 1.7%

Most frequent character per script

Hangul
(character glyphs not preserved; rows in rank order)
Rank | Count | Frequency (%)
1 | 2497 | 3.7%
2 | 2490 | 3.7%
3 | 2350 | 3.4%
4 | 1945 | 2.9%
5 | 1637 | 2.4%
6 | 1606 | 2.4%
7 | 1519 | 2.2%
8 | 1454 | 2.1%
9 | 1436 | 2.1%
10 | 1403 | 2.1%
Other values (378) | 49841 | 73.1%

Latin
Value | Count | Frequency (%)
e | 196 | 15.4%
S | 162 | 12.7%
C | 128 | 10.1%
K | 123 | 9.7%
M | 86 | 6.8%
D | 86 | 6.8%
L | 63 | 5.0%
H | 46 | 3.6%
I | 46 | 3.6%
E | 41 | 3.2%
Other values (19) | 295 | 23.2%

Common
Value | Count | Frequency (%)
1 | 1172 | 22.4%
2 | 1045 | 20.0%
(space) | 918 | 17.6%
3 | 534 | 10.2%
4 | 241 | 4.6%
5 | 196 | 3.7%
6 | 161 | 3.1%
( | 157 | 3.0%
) | 157 | 3.0%
- | 146 | 2.8%
Other values (6) | 500 | 9.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 68178 | 91.3%
ASCII | 6494 | 8.7%
Number Forms | 5 | < 0.1%

Most frequent character per block

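Hangul
(character glyphs not preserved; rows in rank order)
Rank | Count | Frequency (%)
1 | 2497 | 3.7%
2 | 2490 | 3.7%
3 | 2350 | 3.4%
4 | 1945 | 2.9%
5 | 1637 | 2.4%
6 | 1606 | 2.4%
7 | 1519 | 2.2%
8 | 1454 | 2.1%
9 | 1436 | 2.1%
10 | 1403 | 2.1%
Other values (378) | 49841 | 73.1%

ASCII
Value | Count | Frequency (%)
1 | 1172 | 18.0%
2 | 1045 | 16.1%
(space) | 918 | 14.1%
3 | 534 | 8.2%
4 | 241 | 3.7%
5 | 196 | 3.0%
e | 196 | 3.0%
S | 162 | 2.5%
6 | 161 | 2.5%
( | 157 | 2.4%
Other values (34) | 1712 | 26.4%

Number Forms
Value | Count | Frequency (%)
(glyph not preserved) | 5 | 100.0%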

아파트코드 (apartment code)
Text

Distinct: 2266
Distinct (%): 22.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 132
Unique (%): 1.3%

Sample

1st row: A13789203
2nd row: A13480403
3rd row: A13402305
4th row: A13782902
5th row: A13812003

Value | Count | Frequency (%)
a15003002 | 12 | 0.1%
a13987306 | 12 | 0.1%
a13590602 | 12 | 0.1%
a15601003 | 11 | 0.1%
a14272306 | 11 | 0.1%
a13671206 | 11 | 0.1%
a13006003 | 11 | 0.1%
a13613011 | 11 | 0.1%
a13822003 | 11 | 0.1%
a13887405 | 11 | 0.1%
Other values (2256) | 9887 | 98.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 18450 | 20.5%
1 | 17399 | 19.3%
A | 9989 | 11.1%
3 | 9000 | 10.0%
2 | 8260 | 9.2%
5 | 6280 | 7.0%
8 | 5557 | 6.2%
7 | 4659 | 5.2%
4 | 3976 | 4.4%
6 | 3394 | 3.8%
Other values (2) | 3036 | 3.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18450 | 23.1%
1 | 17399 | 21.7%
3 | 9000 | 11.2%
2 | 8260 | 10.3%
5 | 6280 | 7.8%
8 | 5557 | 6.9%
7 | 4659 | 5.8%
4 | 3976 | 5.0%
6 | 3394 | 4.2%
9 | 3025 | 3.8%

Uppercase Letter
Value | Count | Frequency (%)
A | 9989 | 99.9%
B | 11 | 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 88.9%
Latin | 10000 | 11.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18450 | 23.1%
1 | 17399 | 21.7%
3 | 9000 | 11.2%
2 | 8260 | 10.3%
5 | 6280 | 7.8%
8 | 5557 | 6.9%
7 | 4659 | 5.8%
4 | 3976 | 5.0%
6 | 3394 | 4.2%
9 | 3025 | 3.8%

Latin
Value | Count | Frequency (%)
A | 9989 | 99.9%
B | 11 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18450 | 20.5%
1 | 17399 | 19.3%
A | 9989 | 11.1%
3 | 9000 | 10.0%
2 | 8260 | 9.2%
5 | 6280 | 7.0%
8 | 5557 | 6.2%
7 | 4659 | 5.2%
4 | 3976 | 4.4%
6 | 3394 | 3.8%
Other values (2) | 3036 | 3.4%

비용명 (cost item name)
Text

Distinct: 77
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 10
Mean length: 5.9622
Min length: 2

Characters and Unicode

Total characters: 59622
Distinct characters: 107
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 연차수당충당부채
2nd row: 퇴직급여충당예금
3rd row: 장기수선충당예금
4th row: 연차수당충당부채
5th row: 수선유지비충당부채
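
Value | Count | Frequency (%)
예금 (deposits) | 332 | 3.3%
연차수당충당부채 (provision for annual leave allowance) | 327 | 3.3%
관리비미수금 (management fee receivables) | 317 | 3.2%
공동주택적립금 (apartment complex reserve fund) | 313 | 3.1%
장기수선충당예금 (long-term repair reserve deposit) | 311 | 3.1%
퇴직급여충당부채 (provision for retirement benefits) | 305 | 3.0%
선급비용 (prepaid expenses) | 303 | 3.0%
장기수선충당부채 (provision for long-term repairs) | 302 | 3.0%
미처분이익잉여금 (unappropriated retained earnings) | 300 | 3.0%
가수금 (suspense receipts) | 299 | 3.0%
Other values (67) | 6891 | 68.9%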

Most occurring characters

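(character glyphs not preserved; rows in rank order)
Rank | Count | Frequency (%)
1 | 4655 | 7.8%
2 | 3919 | 6.6%
3 | 3207 | 5.4%
4 | 3092 | 5.2%
5 | 2981 | 5.0%
6 | 2913 | 4.9%
7 | 2646 | 4.4%
8 | 2454 | 4.1%
9 | 1936 | 3.2%
10 | 1758 | 2.9%
Other values (97) | 30061 | 50.4%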

Most occurring categories

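Value | Count | Frequency (%)
Other Letter | 59622 | 100.0%

Most frequent character per category

Other Letter
(character glyphs not preserved; rows in rank order)
Rank | Count | Frequency (%)
1 | 4655 | 7.8%
2 | 3919 | 6.6%
3 | 3207 | 5.4%
4 | 3092 | 5.2%
5 | 2981 | 5.0%
6 | 2913 | 4.9%
7 | 2646 | 4.4%
8 | 2454 | 4.1%
9 | 1936 | 3.2%
10 | 1758 | 2.9%
Other values (97) | 30061 | 50.4%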

Most occurring scripts

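Value | Count | Frequency (%)
Hangul | 59622 | 100.0%

Most frequent character per script

Hangul
(character glyphs not preserved; rows in rank order)
Rank | Count | Frequency (%)
1 | 4655 | 7.8%
2 | 3919 | 6.6%
3 | 3207 | 5.4%
4 | 3092 | 5.2%
5 | 2981 | 5.0%
6 | 2913 | 4.9%
7 | 2646 | 4.4%
8 | 2454 | 4.1%
9 | 1936 | 3.2%
10 | 1758 | 2.9%
Other values (97) | 30061 | 50.4%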

Most occurring blocks

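Value | Count | Frequency (%)
Hangul | 59622 | 100.0%

Most frequent character per block

Hangul
(character glyphs not preserved; rows in rank order)
Rank | Count | Frequency (%)
1 | 4655 | 7.8%
2 | 3919 | 6.6%
3 | 3207 | 5.4%
4 | 3092 | 5.2%
5 | 2981 | 5.0%
6 | 2913 | 4.9%
7 | 2646 | 4.4%
8 | 2454 | 4.1%
9 | 1936 | 3.2%
10 | 1758 | 2.9%
Other values (97) | 30061 | 50.4%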

년월일 (date, YYYYMM)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
202306 | 10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202306
2nd row: 202306
3rd row: 202306
4th row: 202306
5th row: 202306

Common Values

Value | Count | Frequency (%)
202306 | 10000 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]
Value | Count | Frequency (%)
202306 | 10000 | 100.0%

금액 (amount)
Real number (ℝ)

ZEROS

Distinct: 7334
Distinct (%): 73.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 84359812
Minimum: -7.3862196 × 10^8
Maximum: 1.6922472 × 10^10
Zeros: 2347
Zeros (%): 23.5%
Negative: 336
Negative (%): 3.4%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -7.3862196 × 10^8
5-th percentile: 0
Q1: 0
Median: 3197775
Q3: 37194302
95-th percentile: 3.9086267 × 10^8
Maximum: 1.6922472 × 10^10
Range: 1.7661094 × 10^10
Interquartile range (IQR): 37194302
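The range and the IQR are differences of other entries in this list. The check below is a small sketch using the exact minimum and maximum from the extreme-value lists of this variable; the variable names are illustrative only.

```python
# Exact extremes of 금액, as listed among the smallest and largest values.
minimum = -738_621_958
maximum = 16_922_472_486

# Quartiles from the quantile statistics.
q1 = 0
q3 = 37_194_302

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # IQR = Q3 - Q1

print(f"{value_range:.7e}")
print(iqr)
```

Because Q1 is 0, the IQR equals Q3, which is why the same number appears twice in the list.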

Descriptive statistics

Standard deviation: 3.680474 × 10^8
Coefficient of variation (CV): 4.3628286
Kurtosis: 535.35607
Mean: 84359812
Median Absolute Deviation (MAD): 3197775
Skewness: 16.864668
Sum: 8.4359812 × 10^11
Variance: 1.3545889 × 10^17
Monotonicity: Not monotonic
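Several of these figures are tied together by standard definitions: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. The sketch below checks that the rounded values printed in the report are mutually consistent; it does not recompute anything from the raw data.

```python
import math

n = 10_000            # number of observations
mean = 84_359_812     # reported mean of 금액
std = 3.680474e8      # reported standard deviation (rounded in the report)

cv = std / mean       # coefficient of variation
variance = std ** 2   # variance is the square of the standard deviation
total = mean * n      # sum of all values

# Compare against the rounded figures in the report with a loose tolerance.
print(math.isclose(cv, 4.3628286, rel_tol=1e-6))
print(math.isclose(variance, 1.3545889e17, rel_tol=1e-6))
print(total)
```

A CV above 4 together with skewness near 17 points to a heavily right-skewed distribution, consistent with the gap between the median and the mean.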
[Figure: histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
0 | 2347 | 23.5%
500000 | 27 | 0.3%
250000 | 20 | 0.2%
300000 | 14 | 0.1%
242000 | 12 | 0.1%
484000 | 11 | 0.1%
100000 | 11 | 0.1%
250400 | 10 | 0.1%
600000 | 9 | 0.1%
20000000 | 9 | 0.1%
Other values (7324) | 7530 | 75.3%

Minimum 10 values
Value | Count | Frequency (%)
-738621958 | 1 | < 0.1%
-389001283 | 1 | < 0.1%
-289749540 | 1 | < 0.1%
-241325140 | 1 | < 0.1%
-214954380 | 1 | < 0.1%
-206615440 | 1 | < 0.1%
-205993544 | 1 | < 0.1%
-171420990 | 1 | < 0.1%
-169446320 | 1 | < 0.1%
-139539990 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
16922472486 | 1 | < 0.1%
8528159731 | 1 | < 0.1%
7136250926 | 1 | < 0.1%
6855010093 | 1 | < 0.1%
5694659305 | 1 | < 0.1%
5453286124 | 1 | < 0.1%
5120999288 | 1 | < 0.1%
4728350166 | 1 | < 0.1%
4688111055 | 1 | < 0.1%
4537885252 | 2 | < 0.1%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]
 | 비용명 | 금액
비용명 | 1.000 | 0.381
금액 | 0.381 | 1.000
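The report does not state which coefficient produced 0.381. One common way to measure association between a categorical column such as 비용명 and a numeric column such as 금액 is Cramér's V computed after binning the numeric values. The sketch below assumes that approach purely for illustration: `cramers_v` is the plain (uncorrected) statistic, `bin_amount` and its cut points are hypothetical, and the result is not expected to match the report's figure.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length label sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row = Counter(xs)
    col = Counter(ys)
    # Pearson chi-squared over the full contingency table.
    chi2 = 0.0
    for x in row:
        for y in col:
            expected = row[x] * col[y] / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0


def bin_amount(amount):
    """Hypothetical three-way binning of 금액; cut points are illustrative."""
    if amount == 0:
        return "zero"
    return "small" if amount < 10_000_000 else "large"


# Toy data: labels that determine the bin exactly, then labels unrelated to it.
perfect = cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"])
independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(perfect, independent)
```

Values near 1 indicate that the category largely determines the bin, and values near 0 indicate little association; 0.381 would sit in between under this reading.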

Missing values

[Figure: nullity bar chart] A simple visualization of nullity by column.
[Figure: nullity matrix] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index) | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
39242 | 양재우성 | A13789203 | 연차수당충당부채 | 202306 | 14266440
28714 | 고덕아남 | A13480403 | 퇴직급여충당예금 | 202306 | 0
27388 | 천호삼성아파트 | A13402305 | 장기수선충당예금 | 202306 | 324906548
38400 | 현대멤피스아파트 | A13782902 | 연차수당충당부채 | 202306 | 4668732
40366 | 금호어울림1차 | A13812003 | 수선유지비충당부채 | 202306 | 3566620
69920 | 목동대원칸타빌2,3단지 | A15805404 | 현금 | 202306 | 17155
44140 | 상계주공10단지 | A13920804 | 예금 | 202306 | 369138111
13113 | 삼성래미안공덕4차 | A12170601 | 기타시설운영충당부채 | 202306 | 0
50399 | 시티파크2단지 | A14088201 | 장기수선충당부채 | 202306 | 1058992157
8613 | 용마산하늘채아파트 | A10028033 | 비품 | 202306 | 28396770

(index) | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
8054 | DMC파크뷰자이아파트 | A10027817 | 선급금 | 202306 | 0
23620 | 방학삼성래미안1단지 | A13285406 | 미수금 | 202306 | 0
17002 | 제기이수브라운스톤 | A13006003 | 전신전화가입권 | 202306 | 0
52660 | 광장현대3단지아파트 | A14381415 | 공동체활성화단체지원적립금 | 202306 | 0
55064 | 양평동보아파트 | A15010501 | 관리비미수금 | 202306 | 1741840
36103 | 삼선푸르지오아파트 | A13672101 | 미부과관리비 | 202306 | 247251628
35247 | 월곡3SH-vill | A13613003 | 기타충당예금 | 202306 | 14373666
67038 | 마곡서광 | A15722306 | 선수금 | 202306 | 0
40670 | 오금현대아파트 | A13813010 | 기타충당부채 | 202306 | 32100000
23775 | 쌍문성원 | A13286106 | 저장품 | 202306 | 828000