Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
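
The average record size can be checked from the two figures above: total memory divided by the number of observations. The short sketch below assumes 1 KiB = 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics above.
total_kib = 488.3          # "Total size in memory"
n_observations = 10_000    # "Number of observations"

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the result agrees with the reported 50.0 B.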

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15820/S/1/datasetView.do

Alerts

년월일 has constant value "202102" (Constant)
금액 has 2267 (22.7%) zeros (Zeros)
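
Both alerts can be reproduced with simple checks. The sketch below applies them to the ten rows shown first in the report's Sample section; the helper names `is_constant` and `zero_fraction` are hypothetical, and the profiler's own alert thresholds are not stated in this report.

```python
from typing import Sequence


def is_constant(values: Sequence[object]) -> bool:
    """True when every entry in the column holds the same value."""
    return len(set(values)) == 1


def zero_fraction(values: Sequence[float]) -> float:
    """Share of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


# 년월일 and 금액 for the first ten rows of the report's Sample section.
dates = ["202102"] * 10
amounts = [0, 3964050, 75235680, 4868442, 25062463,
           983111482, 1075528, 26387799, 10400, 0]

print(is_constant(dates))      # every sampled date is 202102
print(zero_fraction(amounts))  # two of the ten sampled amounts are zero
```

On the full dataset the same ratio is 2267 / 10000, which is the 22.7% reported in the alert.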

Reproduction

Analysis started: 2024-05-11 05:59:25.500873
Analysis finished: 2024-05-11 05:59:26.336700
Duration: 0.84 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명
Text

Distinct: 2231
Distinct (%): 22.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 21
Median length: 19
Mean length: 7.2511
Min length: 2

Characters and Unicode

Total characters: 72511
Distinct characters: 436
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
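
As an illustration of the category counts, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module over the five sample rows the report lists for this variable. It is not the profiler's own implementation. The standard library exposes general categories only, so the script and block breakdowns would need another data source.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        # Normalise to NFC so each Hangul syllable is one code point.
        for ch in unicodedata.normalize("NFC", value):
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample rows listed for this variable in the report.
sample = ["구로다솜금호", "신당약수하이츠", "한강", "래미안도곡카운티", "보라매 sk뷰"]
counts = category_counts(sample)
print(counts)
```

Here `Lo` corresponds to the report's "Other Letter", `Zs` to "Space Separator", and `Ll` to "Lowercase Letter".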

Unique

Unique: 108
Unique (%): 1.1%
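
Distinct and Unique measure different things: Distinct counts the different values present, while Unique counts the values that occur exactly once (here, 108 of the 2231 distinct values). A minimal sketch of the distinction, using a hypothetical toy column and a hypothetical helper name:

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    n_distinct = len(counts)
    n_unique = sum(1 for c in counts.values() if c == 1)
    return n_distinct, n_unique


# Hypothetical toy column, not taken from the dataset.
names = ["한강", "한강", "래미안", "래미안", "래미안", "해모로"]
print(distinct_and_unique(names))  # (3, 1)
```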

Sample

1st row: 구로다솜금호
2nd row: 신당약수하이츠
3rd row: 한강
4th row: 래미안도곡카운티
5th row: 보라매 sk뷰

Value | Count | Frequency (%)
아파트 | 157 | 1.5%
아이파크 | 28 | 0.3%
래미안 | 24 | 0.2%
e편한세상 | 19 | 0.2%
고덕 | 15 | 0.1%
힐스테이트 | 15 | 0.1%
서울숲2차푸르지오임대 | 13 | 0.1%
경남아너스빌 | 13 | 0.1%
해모로 | 12 | 0.1%
강일리버파크6단지 | 12 | 0.1%
Other values (2299) | 10329 | 97.1%

Most occurring characters

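Value | Count | Frequency (%)
[unreadable] | 2464 | 3.4%
[unreadable] | 2462 | 3.4%
[unreadable] | 2219 | 3.1%
[unreadable] | 1864 | 2.6%
[unreadable] | 1838 | 2.5%
[unreadable] | 1664 | 2.3%
[unreadable] | 1509 | 2.1%
[unreadable] | 1480 | 2.0%
[unreadable] | 1473 | 2.0%
[unreadable] | 1306 | 1.8%
Other values (426) | 54232 | 74.8%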

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 66451 | 91.6%
Decimal Number | 3725 | 5.1%
Space Separator | 725 | 1.0%
Uppercase Letter | 676 | 0.9%
Lowercase Letter | 375 | 0.5%
Dash Punctuation | 149 | 0.2%
Close Punctuation | 146 | 0.2%
Open Punctuation | 146 | 0.2%
Other Punctuation | 114 | 0.2%
Letter Number | 4 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[unreadable] | 2464 | 3.7%
[unreadable] | 2462 | 3.7%
[unreadable] | 2219 | 3.3%
[unreadable] | 1864 | 2.8%
[unreadable] | 1838 | 2.8%
[unreadable] | 1664 | 2.5%
[unreadable] | 1509 | 2.3%
[unreadable] | 1480 | 2.2%
[unreadable] | 1473 | 2.2%
[unreadable] | 1306 | 2.0%
Other values (381) | 48172 | 72.5%

Uppercase Letter
Value | Count | Frequency (%)
S | 112 | 16.6%
C | 91 | 13.5%
K | 79 | 11.7%
D | 66 | 9.8%
M | 66 | 9.8%
L | 47 | 7.0%
H | 45 | 6.7%
I | 29 | 4.3%
E | 26 | 3.8%
G | 24 | 3.6%
Other values (7) | 91 | 13.5%

Lowercase Letter
Value | Count | Frequency (%)
e | 196 | 52.3%
l | 48 | 12.8%
i | 37 | 9.9%
v | 29 | 7.7%
s | 22 | 5.9%
k | 15 | 4.0%
w | 8 | 2.1%
h | 8 | 2.1%
g | 5 | 1.3%
a | 5 | 1.3%

Decimal Number
Value | Count | Frequency (%)
2 | 1124 | 30.2%
1 | 1071 | 28.8%
3 | 485 | 13.0%
4 | 273 | 7.3%
5 | 221 | 5.9%
6 | 152 | 4.1%
7 | 128 | 3.4%
8 | 104 | 2.8%
9 | 102 | 2.7%
0 | 65 | 1.7%

Other Punctuation
Value | Count | Frequency (%)
, | 92 | 80.7%
. | 22 | 19.3%

Space Separator
Value | Count | Frequency (%)
(space) | 725 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 149 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 146 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 146 | 100.0%

Letter Number
Value | Count | Frequency (%)
[unreadable] | 4 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 66451 | 91.6%
Common | 5005 | 6.9%
Latin | 1055 | 1.5%

Most frequent character per script

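Hangul
Value | Count | Frequency (%)
[unreadable] | 2464 | 3.7%
[unreadable] | 2462 | 3.7%
[unreadable] | 2219 | 3.3%
[unreadable] | 1864 | 2.8%
[unreadable] | 1838 | 2.8%
[unreadable] | 1664 | 2.5%
[unreadable] | 1509 | 2.3%
[unreadable] | 1480 | 2.2%
[unreadable] | 1473 | 2.2%
[unreadable] | 1306 | 2.0%
Other values (381) | 48172 | 72.5%

Latin
Value | Count | Frequency (%)
e | 196 | 18.6%
S | 112 | 10.6%
C | 91 | 8.6%
K | 79 | 7.5%
D | 66 | 6.3%
M | 66 | 6.3%
l | 48 | 4.5%
L | 47 | 4.5%
H | 45 | 4.3%
i | 37 | 3.5%
Other values (19) | 268 | 25.4%

Common
Value | Count | Frequency (%)
2 | 1124 | 22.5%
1 | 1071 | 21.4%
(space) | 725 | 14.5%
3 | 485 | 9.7%
4 | 273 | 5.5%
5 | 221 | 4.4%
6 | 152 | 3.0%
- | 149 | 3.0%
) | 146 | 2.9%
( | 146 | 2.9%
Other values (6) | 513 | 10.2%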

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 66451 | 91.6%
ASCII | 6056 | 8.4%
Number Forms | 4 | < 0.1%

Most frequent character per block

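Hangul
Value | Count | Frequency (%)
[unreadable] | 2464 | 3.7%
[unreadable] | 2462 | 3.7%
[unreadable] | 2219 | 3.3%
[unreadable] | 1864 | 2.8%
[unreadable] | 1838 | 2.8%
[unreadable] | 1664 | 2.5%
[unreadable] | 1509 | 2.3%
[unreadable] | 1480 | 2.2%
[unreadable] | 1473 | 2.2%
[unreadable] | 1306 | 2.0%
Other values (381) | 48172 | 72.5%

ASCII
Value | Count | Frequency (%)
2 | 1124 | 18.6%
1 | 1071 | 17.7%
(space) | 725 | 12.0%
3 | 485 | 8.0%
4 | 273 | 4.5%
5 | 221 | 3.6%
e | 196 | 3.2%
6 | 152 | 2.5%
- | 149 | 2.5%
) | 146 | 2.4%
Other values (34) | 1514 | 25.0%

Number Forms
Value | Count | Frequency (%)
[unreadable] | 4 | 100.0%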

아파트코드
Text

Distinct: 2237
Distinct (%): 22.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 109
Unique (%): 1.1%

Sample

1st row: A15283806
2nd row: A10045404
3rd row: A13790620
4th row: A13585404
5th row: A10025070

Value | Count | Frequency (%)
a13993501 | 12 | 0.1%
a14272314 | 12 | 0.1%
a13410004 | 12 | 0.1%
a13307204 | 11 | 0.1%
a13987301 | 11 | 0.1%
a15785613 | 11 | 0.1%
a15210207 | 11 | 0.1%
a13922110 | 11 | 0.1%
a15805002 | 11 | 0.1%
a15080507 | 11 | 0.1%
Other values (2227) | 9887 | 98.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 18348 | 20.4%
1 | 17675 | 19.6%
A | 9986 | 11.1%
3 | 8743 | 9.7%
2 | 8229 | 9.1%
5 | 6211 | 6.9%
8 | 5728 | 6.4%
7 | 4766 | 5.3%
4 | 3935 | 4.4%
6 | 3376 | 3.8%
Other values (2) | 3003 | 3.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18348 | 22.9%
1 | 17675 | 22.1%
3 | 8743 | 10.9%
2 | 8229 | 10.3%
5 | 6211 | 7.8%
8 | 5728 | 7.2%
7 | 4766 | 6.0%
4 | 3935 | 4.9%
6 | 3376 | 4.2%
9 | 2989 | 3.7%

Uppercase Letter
Value | Count | Frequency (%)
A | 9986 | 99.9%
B | 14 | 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 88.9%
Latin | 10000 | 11.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18348 | 22.9%
1 | 17675 | 22.1%
3 | 8743 | 10.9%
2 | 8229 | 10.3%
5 | 6211 | 7.8%
8 | 5728 | 7.2%
7 | 4766 | 6.0%
4 | 3935 | 4.9%
6 | 3376 | 4.2%
9 | 2989 | 3.7%

Latin
Value | Count | Frequency (%)
A | 9986 | 99.9%
B | 14 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18348 | 20.4%
1 | 17675 | 19.6%
A | 9986 | 11.1%
3 | 8743 | 9.7%
2 | 8229 | 9.1%
5 | 6211 | 6.9%
8 | 5728 | 6.4%
7 | 4766 | 5.3%
4 | 3935 | 4.4%
6 | 3376 | 3.8%
Other values (2) | 3003 | 3.3%

비용명
Text

Distinct: 77
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 9
Mean length: 5.9648
Min length: 2

Characters and Unicode

Total characters: 59648
Distinct characters: 107
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 기타의비유동부채
2nd row: 선급금
3rd row: 당기순이익
4th row: 기타시설운영충당부채
5th row: 당기순이익

Value | Count | Frequency (%)
연차수당충당부채 | 330 | 3.3%
예금 | 323 | 3.2%
당기순이익 | 318 | 3.2%
미처분이익잉여금 | 316 | 3.2%
관리비미수금 | 302 | 3.0%
예수금 | 300 | 3.0%
선급비용 | 300 | 3.0%
장기수선충당부채 | 299 | 3.0%
가수금 | 295 | 2.9%
장기수선충당예금 | 294 | 2.9%
Other values (67) | 6923 | 69.2%

Most occurring characters

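Value | Count | Frequency (%)
[unreadable] | 4719 | 7.9%
[unreadable] | 3813 | 6.4%
[unreadable] | 3210 | 5.4%
[unreadable] | 3074 | 5.2%
[unreadable] | 2980 | 5.0%
[unreadable] | 2953 | 5.0%
[unreadable] | 2657 | 4.5%
[unreadable] | 2413 | 4.0%
[unreadable] | 1916 | 3.2%
[unreadable] | 1784 | 3.0%
Other values (97) | 30129 | 50.5%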

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 59648 | 100.0%

Most frequent character per category

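Other Letter
Value | Count | Frequency (%)
[unreadable] | 4719 | 7.9%
[unreadable] | 3813 | 6.4%
[unreadable] | 3210 | 5.4%
[unreadable] | 3074 | 5.2%
[unreadable] | 2980 | 5.0%
[unreadable] | 2953 | 5.0%
[unreadable] | 2657 | 4.5%
[unreadable] | 2413 | 4.0%
[unreadable] | 1916 | 3.2%
[unreadable] | 1784 | 3.0%
Other values (97) | 30129 | 50.5%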

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 59648 | 100.0%

Most frequent character per script

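Hangul
Value | Count | Frequency (%)
[unreadable] | 4719 | 7.9%
[unreadable] | 3813 | 6.4%
[unreadable] | 3210 | 5.4%
[unreadable] | 3074 | 5.2%
[unreadable] | 2980 | 5.0%
[unreadable] | 2953 | 5.0%
[unreadable] | 2657 | 4.5%
[unreadable] | 2413 | 4.0%
[unreadable] | 1916 | 3.2%
[unreadable] | 1784 | 3.0%
Other values (97) | 30129 | 50.5%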

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 59648 | 100.0%

Most frequent character per block

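Hangul
Value | Count | Frequency (%)
[unreadable] | 4719 | 7.9%
[unreadable] | 3813 | 6.4%
[unreadable] | 3210 | 5.4%
[unreadable] | 3074 | 5.2%
[unreadable] | 2980 | 5.0%
[unreadable] | 2953 | 5.0%
[unreadable] | 2657 | 4.5%
[unreadable] | 2413 | 4.0%
[unreadable] | 1916 | 3.2%
[unreadable] | 1784 | 3.0%
Other values (97) | 30129 | 50.5%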

년월일
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
202102 | 10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202102
2nd row: 202102
3rd row: 202102
4th row: 202102
5th row: 202102

Common Values

Value | Count | Frequency (%)
202102 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
202102 | 10000 | 100.0%

금액
Real number (ℝ)

ZEROS 

Distinct: 7379
Distinct (%): 73.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 73533291
Minimum: -4.09024 × 10^9
Maximum: 9.022582 × 10^9
Zeros: 2267
Zeros (%): 22.7%
Negative: 343
Negative (%): 3.4%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -4.09024 × 10^9
5-th percentile: 0
Q1: 0
Median: 3084943
Q3: 37729620
95-th percentile: 3.7505777 × 10^8
Maximum: 9.022582 × 10^9
Range: 1.3112822 × 10^10
Interquartile range (IQR): 37729620
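
The range and interquartile range follow directly from the other quantile figures. The check below uses the exact minimum and maximum that the report lists among the extreme values of 금액; the variable names are illustrative only.

```python
# Exact extremes from the report's lists of smallest and largest 금액 values.
minimum = -4_090_240_000
maximum = 9_022_581_992

# Quartiles from the quantile statistics.
q1, q3 = 0, 37_729_620

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # IQR = Q3 - Q1

print(f"{value_range:.7e}")  # 1.3112822e+10
print(iqr)                   # 37729620
```

Because Q1 is 0, the IQR equals Q3.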

Descriptive statistics

Standard deviation: 2.8217111 × 10^8
Coefficient of variation (CV): 3.8373247
Kurtosis: 262.41963
Mean: 73533291
Median Absolute Deviation (MAD): 3084943
Skewness: 11.882754
Sum: 7.3533291 × 10^11
Variance: 7.9620536 × 10^16
Monotonicity: Not monotonic
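
Several of these statistics are tied together by simple identities: CV is the standard deviation divided by the mean, the variance is the standard deviation squared, and the sum is the mean times the number of observations. The sketch below checks them using the rounded values printed in the report, so agreement is only expected up to rounding.

```python
# Rounded figures copied from the report.
mean = 73_533_291
std = 2.8217111e8
n_observations = 10_000

cv = std / mean                 # coefficient of variation
variance = std ** 2             # variance is the squared standard deviation
total = mean * n_observations   # sum of all 금액 values

print(round(cv, 4))        # close to the reported 3.8373247
print(f"{variance:.4e}")   # close to the reported 7.9620536 × 10^16
print(total)               # 735332910000, i.e. 7.3533291 × 10^11
```
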
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0 | 2267 | 22.7%
250000 | 26 | 0.3%
500000 | 23 | 0.2%
200000 | 14 | 0.1%
484000 | 14 | 0.1%
1000000 | 11 | 0.1%
30000000 | 11 | 0.1%
55000 | 11 | 0.1%
2000000 | 10 | 0.1%
10000000 | 10 | 0.1%
Other values (7369) | 7603 | 76.0%

Value | Count | Frequency (%)
-4090240000 | 1 | < 0.1%
-279460094 | 1 | < 0.1%
-269782080 | 1 | < 0.1%
-244922944 | 1 | < 0.1%
-166143777 | 1 | < 0.1%
-143400530 | 1 | < 0.1%
-138548880 | 1 | < 0.1%
-122517705 | 1 | < 0.1%
-97530000 | 1 | < 0.1%
-91859147 | 1 | < 0.1%

Value | Count | Frequency (%)
9022581992 | 1 | < 0.1%
7573220043 | 1 | < 0.1%
7495281423 | 1 | < 0.1%
5624354910 | 1 | < 0.1%
5528919317 | 1 | < 0.1%
4139169162 | 1 | < 0.1%
3791119443 | 1 | < 0.1%
3751211249 | 1 | < 0.1%
3416407080 | 1 | < 0.1%
3176503778 | 1 | < 0.1%

Interactions


Correlations

Variable | 비용명 | 금액
비용명 | 1.000 | 0.282
금액 | 0.282 | 1.000
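
The report does not state which association measure produced the 0.282 between 비용명 and 금액. One measure commonly used when a categorical variable is paired with a binned numeric one is Cramér's V; the sketch below is an uncorrected, standard-library version run on hypothetical toy labels. Whether the profiler used this exact measure, a bias-corrected variant, or a particular binning is an assumption here.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length label sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row = Counter(xs)
    col = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x in row:
        for y in col:
            expected = row[x] * col[y] / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row), len(col)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Hypothetical toy data: a category label and a numeric value binned into labels.
labels = ["a", "a", "b", "b"]
bins_dependent = ["zero", "zero", "positive", "positive"]
bins_independent = ["zero", "positive", "zero", "positive"]

print(cramers_v(labels, bins_dependent))    # 1.0 (perfect association)
print(cramers_v(labels, bins_independent))  # 0.0 (no association)
```

Values near 0 indicate little association and values near 1 a strong one, so 0.282 would read as a weak to moderate association under this measure.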

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
58612 | 구로다솜금호 | A15283806 | 기타의비유동부채 | 202102 | 0
7008 | 신당약수하이츠 | A10045404 | 선급금 | 202102 | 3964050
37511 | 한강 | A13790620 | 당기순이익 | 202102 | 75235680
30794 | 래미안도곡카운티 | A13585404 | 기타시설운영충당부채 | 202102 | 4868442
1481 | 보라매 sk뷰 | A10025070 | 당기순이익 | 202102 | 25062463
10254 | 토정한강삼성 | A12106001 | 장기수선충당부채 | 202102 | 983111482
497 | 종암sh빌아파트 | A10024603 | 가수금 | 202102 | 1075528
65322 | 마곡수명산파크2단지 | A15728004 | 미처분이익잉여금 | 202102 | 26387799
36612 | 방배임광1,2차 | A13785005 | 선급금 | 202102 | 10400
47132 | 대우월드마크용산 | A14001101 | 미수관리비예치금 | 202102 | 0

Index | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
47813 | 후암미주 | A14019001 | 전신전화가입권 | 202102 | 448000
1064 | 대림 우성2차 | A10024829 | 선수관리비 | 202102 | 24000000
55651 | 남현동한일유앤아이 | A15108001 | 선수전기료 | 202102 | 181420
52058 | 신길건영 | A15005302 | 미수금 | 202102 | 339680
28528 | 강남신동아파밀리에1단지 | A13519001 | 미수금 | 202102 | 0
11069 | 월드컵아이파크1단지 | A12171101 | 예수금 | 202102 | 2328540
13898 | DMC자이1단지 | A12275501 | 공동주택적립금 | 202102 | 45670575
45672 | 월계청백3단지 | A13985105 | 기타의비유동부채 | 202102 | 0
49683 | 화양현대 | A14313001 | 장기수선충당예금 | 202102 | 351511198
26100 | 강일리버파크6단지 | A13410004 | 관리비미수금 | 202102 | 47325700