Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
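
The average record size follows from two of the figures above: total size in memory divided by the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes (the variable names are illustrative, not part of the report):

```python
# Figures copied from the dataset statistics above.
total_size_kib = 488.3
n_observations = 10_000

# Assumption: 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B per record")
```

Rounded to one decimal place, this agrees with the reported average record size.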

Variable types

Text: 3
Categorical: 1
Numeric: 1

Dataset

Description: File download
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15821/S/1/datasetView.do

Alerts

년월일 has constant value "202206" (Constant)
금액 has 1002 (10.0%) zeros (Zeros)

Reproduction

Analysis started: 2024-05-11 06:52:05.746302
Analysis finished: 2024-05-11 06:52:07.888017
Duration: 2.14 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

아파트명 (apartment name)
Text

Distinct: 2178
Distinct (%): 21.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 20
Mean length: 7.3628
Min length: 2

Characters and Unicode

Total characters: 73628
Distinct characters: 425
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
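
As a rough illustration of the character properties mentioned above, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories for one of the sample values; the helper name `category_counts` and the `LABELS` mapping are illustrative assumptions, not ydata-profiling's own code.

```python
import unicodedata
from collections import Counter

# Readable labels for a few two-letter general-category codes,
# matching the category names used in this report.
LABELS = {
    "Lo": "Other Letter",
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
}

def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category."""
    return Counter(unicodedata.category(ch) for ch in text)

# "보라매 sk뷰" is one of the sample values listed for this variable.
counts = category_counts("보라매 sk뷰")
for code, n in counts.most_common():
    print(LABELS.get(code, code), n)
```

Hangul syllables fall under Other Letter (Lo), which is why that category dominates the tables for the Korean-language text columns.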

Unique

Unique: 128
Unique (%): 1.3%

Sample

1st row: 상계주공4단지
2nd row: 도봉숲 아뜨리움
3rd row: 보라매자이 더 포레스트
4th row: 보라매 sk뷰
5th row: 방화동성아파트

Value | Count | Frequency (%)
아파트 | 224 | 2.0%
래미안 | 46 | 0.4%
e편한세상 | 32 | 0.3%
아이파크 | 27 | 0.2%
고덕 | 22 | 0.2%
sk뷰 | 17 | 0.2%
마포 | 17 | 0.2%
푸르지오 | 16 | 0.1%
팰리스 | 16 | 0.1%
경남아너스빌 | 15 | 0.1%
Other values (2257) | 10527 | 96.1%

Most occurring characters

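Value | Count | Frequency (%)
(lost) | 2777 | 3.8%
(lost) | 2669 | 3.6%
(lost) | 2476 | 3.4%
(lost) | 1715 | 2.3%
(lost) | 1675 | 2.3%
(lost) | 1617 | 2.2%
(lost) | 1524 | 2.1%
(lost) | 1456 | 2.0%
(lost) | 1402 | 1.9%
(lost) | 1270 | 1.7%
Other values (415) | 55047 | 74.8%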

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 67369 | 91.5%
Decimal Number | 3393 | 4.6%
Space Separator | 1033 | 1.4%
Uppercase Letter | 877 | 1.2%
Lowercase Letter | 370 | 0.5%
Open Punctuation | 158 | 0.2%
Close Punctuation | 158 | 0.2%
Dash Punctuation | 156 | 0.2%
Other Punctuation | 111 | 0.2%
Letter Number | 3 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(lost) | 2777 | 4.1%
(lost) | 2669 | 4.0%
(lost) | 2476 | 3.7%
(lost) | 1715 | 2.5%
(lost) | 1675 | 2.5%
(lost) | 1617 | 2.4%
(lost) | 1524 | 2.3%
(lost) | 1456 | 2.2%
(lost) | 1402 | 2.1%
(lost) | 1270 | 1.9%
Other values (370) | 48788 | 72.4%

Uppercase Letter
Value | Count | Frequency (%)
S | 136 | 15.5%
C | 122 | 13.9%
D | 96 | 10.9%
M | 96 | 10.9%
L | 84 | 9.6%
K | 83 | 9.5%
H | 74 | 8.4%
I | 41 | 4.7%
E | 37 | 4.2%
V | 27 | 3.1%
Other values (7) | 81 | 9.2%

Lowercase Letter
Value | Count | Frequency (%)
e | 209 | 56.5%
l | 44 | 11.9%
i | 31 | 8.4%
v | 21 | 5.7%
k | 14 | 3.8%
c | 12 | 3.2%
s | 12 | 3.2%
h | 10 | 2.7%
w | 7 | 1.9%
a | 5 | 1.4%

Decimal Number
Value | Count | Frequency (%)
1 | 1017 | 30.0%
2 | 1017 | 30.0%
3 | 457 | 13.5%
4 | 258 | 7.6%
5 | 171 | 5.0%
6 | 137 | 4.0%
8 | 97 | 2.9%
7 | 92 | 2.7%
9 | 89 | 2.6%
0 | 58 | 1.7%

Other Punctuation
Value | Count | Frequency (%)
, | 92 | 82.9%
. | 19 | 17.1%

Space Separator
Value | Count | Frequency (%)
(space) | 1033 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 158 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 158 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 156 | 100.0%

Letter Number
Value | Count | Frequency (%)
(lost) | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 67369 | 91.5%
Common | 5009 | 6.8%
Latin | 1250 | 1.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(lost) | 2777 | 4.1%
(lost) | 2669 | 4.0%
(lost) | 2476 | 3.7%
(lost) | 1715 | 2.5%
(lost) | 1675 | 2.5%
(lost) | 1617 | 2.4%
(lost) | 1524 | 2.3%
(lost) | 1456 | 2.2%
(lost) | 1402 | 2.1%
(lost) | 1270 | 1.9%
Other values (370) | 48788 | 72.4%

Latin
Value | Count | Frequency (%)
e | 209 | 16.7%
S | 136 | 10.9%
C | 122 | 9.8%
D | 96 | 7.7%
M | 96 | 7.7%
L | 84 | 6.7%
K | 83 | 6.6%
H | 74 | 5.9%
l | 44 | 3.5%
I | 41 | 3.3%
Other values (19) | 265 | 21.2%

Common
Value | Count | Frequency (%)
(space) | 1033 | 20.6%
1 | 1017 | 20.3%
2 | 1017 | 20.3%
3 | 457 | 9.1%
4 | 258 | 5.2%
5 | 171 | 3.4%
( | 158 | 3.2%
) | 158 | 3.2%
- | 156 | 3.1%
6 | 137 | 2.7%
Other values (6) | 447 | 8.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 67369 | 91.5%
ASCII | 6256 | 8.5%
Number Forms | 3 | < 0.1%

Most frequent character per block

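Hangul
Value | Count | Frequency (%)
(lost) | 2777 | 4.1%
(lost) | 2669 | 4.0%
(lost) | 2476 | 3.7%
(lost) | 1715 | 2.5%
(lost) | 1675 | 2.5%
(lost) | 1617 | 2.4%
(lost) | 1524 | 2.3%
(lost) | 1456 | 2.2%
(lost) | 1402 | 2.1%
(lost) | 1270 | 1.9%
Other values (370) | 48788 | 72.4%

ASCII
Value | Count | Frequency (%)
(space) | 1033 | 16.5%
1 | 1017 | 16.3%
2 | 1017 | 16.3%
3 | 457 | 7.3%
4 | 258 | 4.1%
e | 209 | 3.3%
5 | 171 | 2.7%
( | 158 | 2.5%
) | 158 | 2.5%
- | 156 | 2.5%
Other values (34) | 1622 | 25.9%

Number Forms
Value | Count | Frequency (%)
(lost) | 3 | 100.0%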

아파트코드 (apartment code)
Text

Distinct: 2184
Distinct (%): 21.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 90000
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 129
Unique (%): 1.3%

Sample

1st row: A13920706
2nd row: A10027136
3rd row: A10024098
4th row: A10025070
5th row: A15778101

Value | Count | Frequency (%)
a15083701 | 14 | 0.1%
a13613007 | 13 | 0.1%
a10024245 | 12 | 0.1%
a15086601 | 12 | 0.1%
a10025649 | 12 | 0.1%
a15805002 | 12 | 0.1%
a15284906 | 12 | 0.1%
a13922908 | 12 | 0.1%
a13384304 | 11 | 0.1%
a14388208 | 11 | 0.1%
Other values (2174) | 9879 | 98.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 18972 | 21.1%
1 | 17435 | 19.4%
A | 10000 | 11.1%
3 | 8828 | 9.8%
2 | 8311 | 9.2%
5 | 6120 | 6.8%
8 | 5482 | 6.1%
7 | 4479 | 5.0%
4 | 4022 | 4.5%
6 | 3487 | 3.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80000 | 88.9%
Uppercase Letter | 10000 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 18972 | 23.7%
1 | 17435 | 21.8%
3 | 8828 | 11.0%
2 | 8311 | 10.4%
5 | 6120 | 7.6%
8 | 5482 | 6.9%
7 | 4479 | 5.6%
4 | 4022 | 5.0%
6 | 3487 | 4.4%
9 | 2864 | 3.6%

Uppercase Letter
Value | Count | Frequency (%)
A | 10000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 88.9%
Latin | 10000 | 11.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 18972 | 23.7%
1 | 17435 | 21.8%
3 | 8828 | 11.0%
2 | 8311 | 10.4%
5 | 6120 | 7.6%
8 | 5482 | 6.9%
7 | 4479 | 5.6%
4 | 4022 | 5.0%
6 | 3487 | 4.4%
9 | 2864 | 3.6%

Latin
Value | Count | Frequency (%)
A | 10000 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 18972 | 21.1%
1 | 17435 | 19.4%
A | 10000 | 11.1%
3 | 8828 | 9.8%
2 | 8311 | 9.2%
5 | 6120 | 6.8%
8 | 5482 | 6.1%
7 | 4479 | 5.0%
4 | 4022 | 4.5%
6 | 3487 | 3.9%

비용명 (cost item name)
Text

Distinct: 87
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 10
Median length: 9
Mean length: 4.8885
Min length: 2

Characters and Unicode

Total characters: 48885
Distinct characters: 120
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 소독비
2nd row: 교통비
3rd row: 제수당
4th row: 경비비
5th row: 세대급탕비

Value | Count | Frequency (%)
청소비 | 249 | 2.5%
퇴직급여 | 230 | 2.3%
승강기유지비 | 226 | 2.3%
연체료수익 | 223 | 2.2%
소독비 | 215 | 2.1%
경비비 | 212 | 2.1%
세대전기료 | 212 | 2.1%
통신비 | 208 | 2.1%
도서인쇄비 | 207 | 2.1%
급여 | 206 | 2.1%
Other values (77) | 7812 | 78.1%

Most occurring characters

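Value | Count | Frequency (%)
(lost) | 5448 | 11.1%
(lost) | 3541 | 7.2%
(lost) | 2115 | 4.3%
(lost) | 1945 | 4.0%
(lost) | 1720 | 3.5%
(lost) | 1329 | 2.7%
(lost) | 1056 | 2.2%
(lost) | 897 | 1.8%
(lost) | 797 | 1.6%
(lost) | 748 | 1.5%
Other values (110) | 29289 | 59.9%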

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 48885 | 100.0%

Most frequent character per category

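Other Letter
Value | Count | Frequency (%)
(lost) | 5448 | 11.1%
(lost) | 3541 | 7.2%
(lost) | 2115 | 4.3%
(lost) | 1945 | 4.0%
(lost) | 1720 | 3.5%
(lost) | 1329 | 2.7%
(lost) | 1056 | 2.2%
(lost) | 897 | 1.8%
(lost) | 797 | 1.6%
(lost) | 748 | 1.5%
Other values (110) | 29289 | 59.9%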

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 48885 | 100.0%

Most frequent character per script

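Hangul
Value | Count | Frequency (%)
(lost) | 5448 | 11.1%
(lost) | 3541 | 7.2%
(lost) | 2115 | 4.3%
(lost) | 1945 | 4.0%
(lost) | 1720 | 3.5%
(lost) | 1329 | 2.7%
(lost) | 1056 | 2.2%
(lost) | 897 | 1.8%
(lost) | 797 | 1.6%
(lost) | 748 | 1.5%
Other values (110) | 29289 | 59.9%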

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 48885 | 100.0%

Most frequent character per block

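Hangul
Value | Count | Frequency (%)
(lost) | 5448 | 11.1%
(lost) | 3541 | 7.2%
(lost) | 2115 | 4.3%
(lost) | 1945 | 4.0%
(lost) | 1720 | 3.5%
(lost) | 1329 | 2.7%
(lost) | 1056 | 2.2%
(lost) | 897 | 1.8%
(lost) | 797 | 1.6%
(lost) | 748 | 1.5%
Other values (110) | 29289 | 59.9%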

년월일 (date)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
202206 | 10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202206
2nd row: 202206
3rd row: 202206
4th row: 202206
5th row: 202206

Common Values

Value | Count | Frequency (%)
202206 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
202206 | 10000 | 100.0%

금액 (amount)
Real number (ℝ)

ZEROS 

Distinct: 7165
Distinct (%): 71.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3118217.6
Minimum: -1969220
Maximum: 3.9121871 × 10^8
Zeros: 1002
Zeros (%): 10.0%
Negative: 12
Negative (%): 0.1%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -1969220
5-th percentile: 0
Q1: 75095
Median: 310000
Q3: 1395642.8
95-th percentile: 15468489
Maximum: 3.9121871 × 10^8
Range: 3.9318793 × 10^8
Interquartile range (IQR): 1320547.8

Descriptive statistics

Standard deviation: 10553099
Coefficient of variation (CV): 3.3843368
Kurtosis: 257.02113
Mean: 3118217.6
Median Absolute Deviation (MAD): 303427.5
Skewness: 11.477595
Sum: 3.1182176 × 10^10
Variance: 1.1136789 × 10^14
Monotonicity: Not monotonic
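
Several of the figures above are derived from one another, which allows a quick consistency check. A minimal sketch using the rounded values printed in this report, so the results agree only up to rounding (the variable names are illustrative):

```python
import math

# Values copied from the 금액 statistics in this report.
n = 10_000
mean = 3118217.6
std = 10553099
q1, q3 = 75095, 1395642.8
minimum, maximum = -1969220, 391218709
zeros = 1002

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
total = mean * n                 # sum of all values
variance = std ** 2              # variance
zeros_pct = 100 * zeros / n      # share of zero values, in percent

print(f"CV={cv:.4f} IQR={iqr:.1f} range={value_range}")
print(f"sum={total:.4e} variance={variance:.4e} zeros={zeros_pct:.1f}%")
```

A CV above 3, together with the high skewness and kurtosis, reflects a heavy right tail: a small number of very large amounts dominate the mean, which sits roughly ten times above the median.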
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 1002 | 10.0%
200000 | 98 | 1.0%
300000 | 61 | 0.6%
100000 | 57 | 0.6%
73000 | 43 | 0.4%
500000 | 40 | 0.4%
30000 | 38 | 0.4%
250000 | 34 | 0.3%
400000 | 33 | 0.3%
150000 | 33 | 0.3%
Other values (7155) | 8561 | 85.6%

Minimum 10 values
Value | Count | Frequency (%)
-1969220 | 1 | < 0.1%
-1469007 | 1 | < 0.1%
-760910 | 1 | < 0.1%
-251500 | 1 | < 0.1%
-208940 | 1 | < 0.1%
-154100 | 1 | < 0.1%
-110910 | 1 | < 0.1%
-55000 | 1 | < 0.1%
-31380 | 1 | < 0.1%
-2117 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
391218709 | 1 | < 0.1%
218638274 | 1 | < 0.1%
190370848 | 1 | < 0.1%
176792500 | 1 | < 0.1%
161666668 | 1 | < 0.1%
161579590 | 1 | < 0.1%
140470785 | 1 | < 0.1%
138866171 | 1 | < 0.1%
129929954 | 1 | < 0.1%
129566030 | 1 | < 0.1%

Interactions


Correlations

 | 비용명 | 금액
비용명 | 1.000 | 0.318
금액 | 0.318 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
(index) | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
62318 | 상계주공4단지 | A13920706 | 소독비 | 202206 | 1309000
9992 | 도봉숲 아뜨리움 | A10027136 | 교통비 | 202206 | 9500
990 | 보라매자이 더 포레스트 | A10024098 | 제수당 | 202206 | 4917700
4523 | 보라매 sk뷰 | A10025070 | 경비비 | 202206 | 44720476
95369 | 방화동성아파트 | A15778101 | 세대급탕비 | 202206 | 4977000
52014 | 정릉중앙하이츠빌1단지 | A13685104 | 경비비 | 202206 | 11115000
4634 | 월계센트럴아이파크아파트 | A10025097 | 통신비 | 202206 | 315470
8056 | 아크로리버뷰 신반포 | A10026227 | 도서인쇄비 | 202206 | 163350
69929 | 이촌강촌아파트 | A14003106 | 세대수도료 | 202206 | 20170610
30491 | 도봉삼환 | A13201207 | 수선유지비 | 202206 | 10656220

Last rows
(index) | 아파트명 | 아파트코드 | 비용명 | 년월일 | 금액
76980 | 문래삼환 | A15009402 | 충당부채전입이자비용 | 202206 | 11354
74043 | 중곡SK | A14322001 | 고용보험료 | 202206 | 53610
99163 | 신트리3단지 | A15807311 | 사무용품비 | 202206 | 0
72004 | 삼각산아이원임대 | A14210001 | 세대수도료 | 202206 | 4295870
4009 | 반포센트럴자이아파트 | A10024913 | 잡수익 | 202206 | 690
97078 | 염창무학 | A15786213 | 위탁관리수수료 | 202206 | 79570
23554 | 역촌센트레빌 | A12289501 | 고용안정사업비용 | 202206 | 208000
61102 | 풍납신성노바빌 | A13887301 | 고용보험료 | 202206 | 108380
16476 | 연희성원 | A12071101 | 감가상각비 | 202206 | 220660
80210 | 여의도시범아파트 | A15089421 | 세금과공과 | 202206 | 0