Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
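
The derived figures in this table follow from the raw counts. A minimal sketch of that arithmetic, assuming the reported total size uses 1 KiB = 1024 bytes; the variable names are illustrative and are not part of the report:

```python
# Values copied from the Dataset statistics table.
n_variables = 5
n_observations = 10_000
missing_cells = 0
total_size_kib = 488.3  # assumption: 1 KiB = 1024 bytes

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size is the total size spread over all observations.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```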

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download (파일 다운로드)
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15820/S/1/datasetView.do

Alerts

년월일 (year-month) has constant value "202303" (Constant)
금액 (amount) has 2336 (23.4%) zeros (Zeros)
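
The two alerts can be reproduced with very simple checks. The sketch below is an illustration only, not ydata-profiling's own logic or thresholds; `column_alerts` is a hypothetical helper, and the input is five rows taken from the Sample section of this report.

```python
def column_alerts(name, values):
    """Return simple alert strings for one column (illustrative only)."""
    alerts = []
    # A column is constant when it holds exactly one distinct value.
    if len(set(values)) == 1:
        alerts.append(f'{name} has constant value "{values[0]}"')
    # Count zeros among numeric entries and report their share of all rows.
    zeros = sum(1 for v in values if isinstance(v, (int, float)) and v == 0)
    if zeros:
        pct = 100 * zeros / len(values)
        alerts.append(f"{name} has {zeros} ({pct:.1f}%) zeros")
    return alerts


# Five (년월일, 금액) pairs from the Sample section.
year_month = ["202303", "202303", "202303", "202303", "202303"]
amount = [32559055, 0, 65720870, 593820, 330348671]

for alert in column_alerts("년월일", year_month) + column_alerts("금액", amount):
    print(alert)
```

On the full dataset the same zero check would count 2336 of 10000 rows, which is the 23.4% reported above.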

Reproduction

Analysis started: 2024-05-11 05:55:38.503180
Analysis finished: 2024-05-11 05:55:39.441148
Duration: 0.94 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명 (apartment name)
Text

Distinct: 2248
Distinct (%): 22.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 20
Mean length: 7.4022
Min length: 2

Characters and Unicode

Total characters: 74022
Distinct characters: 431
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
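
As an illustration of the character properties mentioned above, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not ydata-profiling's implementation; the helper name `category_counts` is hypothetical, and the input is the five sample rows listed for this variable. `Lo` and `Nd` are the two-letter codes for the categories the report calls Other Letter and Decimal Number.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in `values`."""
    counts = Counter()
    for text in values:
        # Normalize to NFC so each Hangul syllable is a single code point.
        for ch in unicodedata.normalize("NFC", text):
            counts[unicodedata.category(ch)] += 1
    return counts


sample = [
    "롯데캐슬아이비",
    "한남하이츠",
    "면목늘푸른동아아파트",
    "서울숲2차푸르지오임대",
    "래미안트리베라1차",
]

counts = category_counts(sample)
total = sum(counts.values())
for category, n in counts.most_common():
    print(f"{category}: {n} ({100 * n / total:.1f}%)")
```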

Unique

Unique: 127
Unique (%): 1.3%

Sample

1st row: 롯데캐슬아이비
2nd row: 한남하이츠
3rd row: 면목늘푸른동아아파트
4th row: 서울숲2차푸르지오임대
5th row: 래미안트리베라1차

Value | Count | Frequency (%)
아파트 | 177 | 1.6%
래미안 | 39 | 0.4%
e편한세상 | 24 | 0.2%
아이파크 | 23 | 0.2%
sk뷰 | 17 | 0.2%
해모로 | 16 | 0.1%
힐스테이트 | 16 | 0.1%
경남아너스빌 | 15 | 0.1%
길음뉴타운 | 14 | 0.1%
북한산 | 14 | 0.1%
Other values (2333) | 10466 | 96.7%

Most occurring characters

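Value | Count | Frequency (%)
(not preserved) | 2566 | 3.5%
(not preserved) | 2507 | 3.4%
(not preserved) | 2388 | 3.2%
(not preserved) | 1785 | 2.4%
(not preserved) | 1726 | 2.3%
(not preserved) | 1660 | 2.2%
(not preserved) | 1497 | 2.0%
(not preserved) | 1433 | 1.9%
(not preserved) | 1431 | 1.9%
(not preserved) | 1389 | 1.9%
Other values (421) | 55640 | 75.2%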

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 67772 | 91.6%
Decimal Number | 3656 | 4.9%
Space Separator | 903 | 1.2%
Uppercase Letter | 839 | 1.1%
Lowercase Letter | 282 | 0.4%
Open Punctuation | 155 | 0.2%
Close Punctuation | 155 | 0.2%
Dash Punctuation | 130 | 0.2%
Other Punctuation | 125 | 0.2%
Letter Number | 5 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 2566 | 3.8%
(not preserved) | 2507 | 3.7%
(not preserved) | 2388 | 3.5%
(not preserved) | 1785 | 2.6%
(not preserved) | 1726 | 2.5%
(not preserved) | 1660 | 2.4%
(not preserved) | 1497 | 2.2%
(not preserved) | 1433 | 2.1%
(not preserved) | 1431 | 2.1%
(not preserved) | 1389 | 2.0%
Other values (377) | 49390 | 72.9%

Uppercase Letter
Value | Count | Frequency (%)
S | 145 | 17.3%
C | 113 | 13.5%
K | 104 | 12.4%
M | 87 | 10.4%
D | 87 | 10.4%
H | 49 | 5.8%
L | 48 | 5.7%
I | 42 | 5.0%
E | 41 | 4.9%
V | 26 | 3.1%
Other values (7) | 97 | 11.6%

Decimal Number
Value | Count | Frequency (%)
1 | 1089 | 29.8%
2 | 1077 | 29.5%
3 | 470 | 12.9%
4 | 277 | 7.6%
5 | 229 | 6.3%
6 | 161 | 4.4%
7 | 104 | 2.8%
9 | 94 | 2.6%
8 | 83 | 2.3%
0 | 72 | 2.0%

Lowercase Letter
Value | Count | Frequency (%)
e | 177 | 62.8%
i | 19 | 6.7%
k | 19 | 6.7%
l | 16 | 5.7%
s | 15 | 5.3%
v | 13 | 4.6%
c | 8 | 2.8%
w | 7 | 2.5%
a | 4 | 1.4%
g | 4 | 1.4%

Other Punctuation
Value | Count | Frequency (%)
, | 98 | 78.4%
. | 27 | 21.6%

Space Separator
Value | Count | Frequency (%)
(space) | 903 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 155 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 155 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 130 | 100.0%

Letter Number
Value | Count | Frequency (%)
(not preserved) | 5 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 67772 | 91.6%
Common | 5124 | 6.9%
Latin | 1126 | 1.5%

Most frequent character per script

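Hangul
Value | Count | Frequency (%)
(not preserved) | 2566 | 3.8%
(not preserved) | 2507 | 3.7%
(not preserved) | 2388 | 3.5%
(not preserved) | 1785 | 2.6%
(not preserved) | 1726 | 2.5%
(not preserved) | 1660 | 2.4%
(not preserved) | 1497 | 2.2%
(not preserved) | 1433 | 2.1%
(not preserved) | 1431 | 2.1%
(not preserved) | 1389 | 2.0%
Other values (377) | 49390 | 72.9%

Latin
Value | Count | Frequency (%)
e | 177 | 15.7%
S | 145 | 12.9%
C | 113 | 10.0%
K | 104 | 9.2%
M | 87 | 7.7%
D | 87 | 7.7%
H | 49 | 4.4%
L | 48 | 4.3%
I | 42 | 3.7%
E | 41 | 3.6%
Other values (18) | 233 | 20.7%

Common
Value | Count | Frequency (%)
1 | 1089 | 21.3%
2 | 1077 | 21.0%
(space) | 903 | 17.6%
3 | 470 | 9.2%
4 | 277 | 5.4%
5 | 229 | 4.5%
6 | 161 | 3.1%
( | 155 | 3.0%
) | 155 | 3.0%
- | 130 | 2.5%
Other values (6) | 478 | 9.3%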

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 67772 | 91.6%
ASCII | 6245 | 8.4%
Number Forms | 5 | < 0.1%

Most frequent character per block

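Hangul
Value | Count | Frequency (%)
(not preserved) | 2566 | 3.8%
(not preserved) | 2507 | 3.7%
(not preserved) | 2388 | 3.5%
(not preserved) | 1785 | 2.6%
(not preserved) | 1726 | 2.5%
(not preserved) | 1660 | 2.4%
(not preserved) | 1497 | 2.2%
(not preserved) | 1433 | 2.1%
(not preserved) | 1431 | 2.1%
(not preserved) | 1389 | 2.0%
Other values (377) | 49390 | 72.9%

ASCII
Value | Count | Frequency (%)
1 | 1089 | 17.4%
2 | 1077 | 17.2%
(space) | 903 | 14.5%
3 | 470 | 7.5%
4 | 277 | 4.4%
5 | 229 | 3.7%
e | 177 | 2.8%
6 | 161 | 2.6%
( | 155 | 2.5%
) | 155 | 2.5%
Other values (33) | 1552 | 24.9%

Number Forms
Value | Count | Frequency (%)
(not preserved) | 5 | 100.0%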

아파트코드 (apartment code)
Text

Distinct: 2253
Distinct (%): 22.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 127
Unique (%): 1.3%

Sample

1st row: A15088915
2nd row: A13375901
3rd row: A13183504
4th row: A13378103
5th row: A14272309

Value | Count | Frequency (%)
a12012203 | 12 | 0.1%
a13986306 | 11 | 0.1%
a15277302 | 11 | 0.1%
a13204510 | 11 | 0.1%
a15178201 | 11 | 0.1%
a15678103 | 11 | 0.1%
a14086001 | 11 | 0.1%
a13610003 | 11 | 0.1%
a13410006 | 11 | 0.1%
a15179701 | 10 | 0.1%
Other values (2243) | 9890 | 98.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 18439 | 20.5%
1 | 17415 | 19.4%
A | 9983 | 11.1%
3 | 8768 | 9.7%
2 | 8404 | 9.3%
5 | 6196 | 6.9%
8 | 5549 | 6.2%
7 | 4707 | 5.2%
4 | 4084 | 4.5%
6 | 3427 | 3.8%
Other values (2) | 3028 | 3.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18439 | 23.0%
1 | 17415 | 21.8%
3 | 8768 | 11.0%
2 | 8404 | 10.5%
5 | 6196 | 7.7%
8 | 5549 | 6.9%
7 | 4707 | 5.9%
4 | 4084 | 5.1%
6 | 3427 | 4.3%
9 | 3011 | 3.8%

Uppercase Letter
Value | Count | Frequency (%)
A | 9983 | 99.8%
B | 17 | 0.2%

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 88.9%
Latin | 10000 | 11.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18439 | 23.0%
1 | 17415 | 21.8%
3 | 8768 | 11.0%
2 | 8404 | 10.5%
5 | 6196 | 7.7%
8 | 5549 | 6.9%
7 | 4707 | 5.9%
4 | 4084 | 5.1%
6 | 3427 | 4.3%
9 | 3011 | 3.8%

Latin
Value | Count | Frequency (%)
A | 9983 | 99.8%
B | 17 | 0.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18439 | 20.5%
1 | 17415 | 19.4%
A | 9983 | 11.1%
3 | 8768 | 9.7%
2 | 8404 | 9.3%
5 | 6196 | 6.9%
8 | 5549 | 6.2%
7 | 4707 | 5.2%
4 | 4084 | 4.5%
6 | 3427 | 3.8%
Other values (2) | 3028 | 3.4%

비용명 (cost item name)
Text

Distinct: 77
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 10
Mean length: 5.9304
Min length: 2

Characters and Unicode

Total characters: 59304
Distinct characters: 107
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

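1st row: 수선유지비충당부채 (provision for repair and maintenance costs)
2nd row: 전신전화가입권 (telephone subscription rights)
3rd row: 미지급금 (accounts payable)
4th row: 안전진단비충당부채 (provision for safety inspection costs)
5th row: 장기수선충당부채 (provision for long-term repairs)

Value | Count | Frequency (%)
비품 (fixtures and equipment) | 320 | 3.2%
장기수선충당예금 (long-term repair reserve deposits) | 319 | 3.2%
퇴직급여충당부채 (provision for retirement benefits) | 319 | 3.2%
미처분이익잉여금 (unappropriated retained earnings) | 315 | 3.1%
선급비용 (prepaid expenses) | 314 | 3.1%
예금 (deposits) | 313 | 3.1%
관리비미수금 (management fees receivable) | 306 | 3.1%
예수금 (withholdings) | 305 | 3.0%
연차수당충당부채 (provision for annual leave allowance) | 303 | 3.0%
당기순이익 (net income for the period) | 301 | 3.0%
Other values (67) | 6885 | 68.8%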

Most occurring characters

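Value | Count | Frequency (%)
(not preserved) | 4678 | 7.9%
(not preserved) | 3804 | 6.4%
(not preserved) | 3168 | 5.3%
(not preserved) | 3034 | 5.1%
(not preserved) | 3006 | 5.1%
(not preserved) | 2864 | 4.8%
(not preserved) | 2577 | 4.3%
(not preserved) | 2434 | 4.1%
(not preserved) | 1878 | 3.2%
(not preserved) | 1780 | 3.0%
Other values (97) | 30081 | 50.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 59304 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 4678 | 7.9%
(not preserved) | 3804 | 6.4%
(not preserved) | 3168 | 5.3%
(not preserved) | 3034 | 5.1%
(not preserved) | 3006 | 5.1%
(not preserved) | 2864 | 4.8%
(not preserved) | 2577 | 4.3%
(not preserved) | 2434 | 4.1%
(not preserved) | 1878 | 3.2%
(not preserved) | 1780 | 3.0%
Other values (97) | 30081 | 50.7%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 59304 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 4678 | 7.9%
(not preserved) | 3804 | 6.4%
(not preserved) | 3168 | 5.3%
(not preserved) | 3034 | 5.1%
(not preserved) | 3006 | 5.1%
(not preserved) | 2864 | 4.8%
(not preserved) | 2577 | 4.3%
(not preserved) | 2434 | 4.1%
(not preserved) | 1878 | 3.2%
(not preserved) | 1780 | 3.0%
Other values (97) | 30081 | 50.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 59304 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not preserved) | 4678 | 7.9%
(not preserved) | 3804 | 6.4%
(not preserved) | 3168 | 5.3%
(not preserved) | 3034 | 5.1%
(not preserved) | 3006 | 5.1%
(not preserved) | 2864 | 4.8%
(not preserved) | 2577 | 4.3%
(not preserved) | 2434 | 4.1%
(not preserved) | 1878 | 3.2%
(not preserved) | 1780 | 3.0%
Other values (97) | 30081 | 50.7%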

년월일 (year-month)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
202303 | 10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202303
2nd row: 202303
3rd row: 202303
4th row: 202303
5th row: 202303

Common Values

Value | Count | Frequency (%)
202303 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
202303 | 10000 | 100.0%

금액 (amount)
Real number (ℝ)

ZEROS

Distinct: 7342
Distinct (%): 73.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 78772942
Minimum: -3.758759 × 10^8
Maximum: 1.6079052 × 10^10
Zeros: 2336
Zeros (%): 23.4%
Negative: 325
Negative (%): 3.2%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -3.758759 × 10^8
5-th percentile: 0
Q1: 0
Median: 3015638
Q3: 35228410
95-th percentile: 3.8693145 × 10^8
Maximum: 1.6079052 × 10^10
Range: 1.6454928 × 10^10
Interquartile range (IQR): 35228410

Descriptive statistics

Standard deviation: 3.3253583 × 10^8
Coefficient of variation (CV): 4.2214474
Kurtosis: 624.94763
Mean: 78772942
Median Absolute Deviation (MAD): 3015638
Skewness: 17.894021
Sum: 7.8772942 × 10^11
Variance: 1.1058008 × 10^17
Monotonicity: Not monotonic
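
Several of these statistics can be derived from others in the report, which gives a quick consistency check. The sketch below recomputes them from the reported values; it is a plain arithmetic illustration with made-up variable names, not the profiler's own code. The exact minimum and maximum are taken from the extreme-value listings for this variable.

```python
import math

# Values copied from the 금액 statistics in this report.
n = 10_000
mean = 78_772_942
std = 3.3253583e8
q1, q3 = 0, 35_228_410
minimum, maximum = -375_875_896, 16_079_052_029

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # Interquartile range = Q3 - Q1
cv = std / mean                  # Coefficient of variation = std / mean
variance = std ** 2              # Variance = std squared
total = mean * n                 # Sum = mean * number of observations

print(f"Range: {value_range:.7e}")
print(f"IQR: {iqr}")
print(f"CV: {cv:.7f}")
print(f"Variance: {variance:.7e}")
print(f"Sum: {total:.7e}")
```

A coefficient of variation above 4, together with the large skewness and kurtosis, indicates a heavily right-skewed amount column dominated by a few very large values.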
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 2336 | 23.4%
500000 | 27 | 0.3%
250000 | 19 | 0.2%
300000 | 18 | 0.2%
484000 | 15 | 0.1%
242000 | 14 | 0.1%
5000000 | 13 | 0.1%
1000000 | 13 | 0.1%
20000000 | 12 | 0.1%
200000 | 11 | 0.1%
Other values (7332) | 7522 | 75.2%

Minimum 10 values

Value | Count | Frequency (%)
-375875896 | 1 | < 0.1%
-322714222 | 1 | < 0.1%
-200435790 | 1 | < 0.1%
-188894300 | 1 | < 0.1%
-147908370 | 1 | < 0.1%
-146871400 | 1 | < 0.1%
-146061190 | 1 | < 0.1%
-140474971 | 1 | < 0.1%
-138136348 | 1 | < 0.1%
-135649720 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
16079052029 | 1 | < 0.1%
7278051186 | 1 | < 0.1%
6835355038 | 1 | < 0.1%
5641618667 | 1 | < 0.1%
5560181075 | 1 | < 0.1%
5141654741 | 1 | < 0.1%
5101420333 | 1 | < 0.1%
4608025703 | 1 | < 0.1%
4476478414 | 1 | < 0.1%
3944348839 | 1 | < 0.1%

Interactions


Correlations

(variable) | 비용명 | 금액
비용명 | 1.000 | 0.277
금액 | 0.277 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index) | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
56521 | 롯데캐슬아이비 | A15088915 | 수선유지비충당부채 | 202303 | 32559055
25466 | 한남하이츠 | A13375901 | 전신전화가입권 | 202303 | 0
20110 | 면목늘푸른동아아파트 | A13183504 | 미지급금 | 202303 | 65720870
25977 | 서울숲2차푸르지오임대 | A13378103 | 안전진단비충당부채 | 202303 | 593820
51216 | 래미안트리베라1차 | A14272309 | 장기수선충당부채 | 202303 | 330348671
17016 | 용두두산위브 | A13007001 | 미지급금 | 202303 | 55257846
54526 | 문래삼환 | A15009402 | 비품 | 202303 | 11320880
3811 | 래미안퍼스트하이 | A10025245 | 수선유지비충당부채 | 202303 | 0
54211 | 포레나 신길 | A15005501 | 연차수당충당부채 | 202303 | 24640140
57313 | 관악푸르지오제2단지 | A15105301 | 장기수선충당예금 | 202303 | 0

(index) | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
62361 | 남서울 무지개 | A15383905 | 미지급금 | 202303 | 122156049
31377 | 도곡삼성 | A13527004 | 미수수익 | 202303 | 671530
60508 | 신구로현대 | A15283902 | 일반관리비충당부채 | 202303 | 0
10272 | 독립문파크빌 | A12008001 | 기타충당예금 | 202303 | 0
30469 | 압구정신현대 | A13511004 | 기타유형자산 | 202303 | 9360990
11856 | 월드컵현대 | A12081602 | 선수수도료 | 202303 | 4476850
12186 | 삼성래미안공덕2차 | A12102008 | 기타당좌자산 | 202303 | 0
8921 | 롯데캐슬베네치아 | A10044002 | 미지급금 | 202303 | 238355894
25128 | 성수금호3차 | A13311101 | 당기순이익 | 202303 | 18734525
29563 | 역삼경남 | A13508002 | 미수관리비예치금 | 202303 | 0