Overview

Dataset statistics

Number of variables: 7
Number of observations: 6394
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 362.3 KiB
Average record size in memory: 58.0 B
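
The average record size appears to be the total in-memory size divided by the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 B; the function name is illustrative and not part of the profiling tool:

```python
# Figures copied from the "Dataset statistics" block.
TOTAL_SIZE_KIB = 362.3
N_OBSERVATIONS = 6394


def average_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Convert KiB to bytes (1 KiB = 1024 B) and divide by the row count."""
    return total_kib * 1024 / n_rows


avg = average_record_size_bytes(TOTAL_SIZE_KIB, N_OBSERVATIONS)
print(f"{avg:.1f} B")  # prints "58.0 B"
```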

Variable types

Categorical: 4
Text: 1
Numeric: 2

Dataset

Description: File download
Author: 서울특별시
URL: https://data.seoul.go.kr/dataList/OA-22226/F/1/datasetView.do

Alerts

시도 has constant value "서울특별시" (Constant)
가구수 is highly overall correlated with 수급권자수 (High correlation)
수급권자수 is highly overall correlated with 가구수 (High correlation)
가구수 has 1843 (28.8%) zeros (Zeros)
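
The zero share in the last alert can be checked from the counts reported for the two numeric variables. A small sketch using only figures from this report; the helper name is an assumption:

```python
# Zero counts and row count as reported for the numeric variables.
N_OBSERVATIONS = 6394
ZERO_COUNTS = {"가구수": 1843, "수급권자수": 55}


def zero_share(zeros: int, n_rows: int) -> float:
    """Percentage of rows whose value is exactly zero, to one decimal."""
    return round(100 * zeros / n_rows, 1)


shares = {name: zero_share(z, N_OBSERVATIONS) for name, z in ZERO_COUNTS.items()}
print(shares)  # {'가구수': 28.8, '수급권자수': 0.9}
```

Only 가구수 is flagged in the alerts; the cut-off the tool applies is not stated in this report.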

Reproduction

Analysis started: 2024-03-23 03:33:32.087787
Analysis finished: 2024-03-23 03:33:34.106814
Duration: 2.02 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
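
A report of this kind can be regenerated roughly as follows. This is a sketch under assumptions: the CSV path, the output file name, and the report title are hypothetical, and the downloaded file may need an explicit `encoding` argument in `read_csv`. The imports sit inside the function so the snippet can be loaded even where the packages are not installed.

```python
def build_report(csv_path: str, output_html: str = "report.html") -> None:
    """Profile a CSV file and write the HTML report (paths are hypothetical)."""
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Load the downloaded dataset into a DataFrame.
    df = pd.read_csv(csv_path)

    # Build the profile with default settings and save it as HTML.
    profile = ProfileReport(df, title="Dataset profile")
    profile.to_file(output_html)
```

Calling `build_report("data.csv")` with a local copy of the dataset would write `report.html` to the working directory.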

Variables

시도
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 50.1 KiB
서울특별시: 6394

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 서울특별시
3rd row: 서울특별시
4th row: 서울특별시
5th row: 서울특별시

Common Values

Value | Count | Frequency (%)
서울특별시 | 6394 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
서울특별시 | 6394 | 100.0%

시군구
Categorical

Distinct: 25
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 50.1 KiB
송파구: 357
강서구: 341
관악구: 327
노원구: 318
성북구: 296
Other values (20): 4755

Length

Max length: 4
Median length: 3
Mean length: 3.0807007
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 종로구
2nd row: 종로구
3rd row: 종로구
4th row: 종로구
5th row: 종로구

Common Values

Value | Count | Frequency (%)
송파구 | 357 | 5.6%
강서구 | 341 | 5.3%
관악구 | 327 | 5.1%
노원구 | 318 | 5.0%
성북구 | 296 | 4.6%
양천구 | 276 | 4.3%
강남구 | 271 | 4.2%
중랑구 | 271 | 4.2%
강동구 | 268 | 4.2%
은평구 | 268 | 4.2%
Other values (15) | 3401 | 53.2%
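
The frequency column and the "Other values" row follow from the counts and the total number of observations. A short sketch that recomputes them from the ten listed districts; the function name is illustrative:

```python
# Counts copied from the "Common Values" table for 시군구.
N_OBSERVATIONS = 6394
TOP_COUNTS = {
    "송파구": 357, "강서구": 341, "관악구": 327, "노원구": 318, "성북구": 296,
    "양천구": 276, "강남구": 271, "중랑구": 271, "강동구": 268, "은평구": 268,
}


def frequency_table(counts: dict, n_rows: int) -> list:
    """Return (value, count, percent) rows plus a remainder row."""
    rows = [(v, c, round(100 * c / n_rows, 1)) for v, c in counts.items()]
    other = n_rows - sum(counts.values())  # rows not covered by listed values
    rows.append(("Other values", other, round(100 * other / n_rows, 1)))
    return rows


rows = frequency_table(TOP_COUNTS, N_OBSERVATIONS)
print(rows[0])   # ('송파구', 357, 5.6)
print(rows[-1])  # ('Other values', 3401, 53.2)
```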

Length

[Figure: Histogram of lengths of the category]

읍면동
Text

Distinct: 451
Distinct (%): 7.1%
Missing: 0
Missing (%): 0.0%
Memory size: 50.1 KiB

Length

Max length: 11
Median length: 4
Mean length: 3.8104473
Min length: 2

Characters and Unicode

Total characters: 24364
Distinct characters: 190
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
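
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories over three values that appear in this variable; the short codes map to the labels used in this report ("Lo" is Other Letter, "Nd" is Decimal Number, "Po" is Other Punctuation, "Pd" is Dash Punctuation). The helper name is an assumption.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in the values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Three values taken from the 읍면동 column of this report.
sample = ["청운효자동", "등촌3동", "화곡본동"]
counts = category_counts(sample)
print(counts["Lo"], counts["Nd"])  # 12 1
```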

Unique

Unique: 20
Unique (%): 0.3%

Sample

1st row: 청운효자동
2nd row: 청운효자동
3rd row: 청운효자동
4th row: 청운효자동
5th row: 청운효자동

Value | Count | Frequency (%)
신사동 | 30 | 0.5%
등촌3동 | 22 | 0.3%
화곡본동 | 22 | 0.3%
하계1동 | 21 | 0.3%
방화3동 | 21 | 0.3%
방화1동 | 21 | 0.3%
홍은2동 | 20 | 0.3%
오류1동 | 20 | 0.3%
상계1동 | 20 | 0.3%
중화1동 | 20 | 0.3%
Other values (441) | 6177 | 96.6%

Most occurring characters

Value | Count | Frequency (%)
(unreadable) | 6415 | 26.3%
2 | 1497 | 6.1%
1 | 1492 | 6.1%
3 | 669 | 2.7%
(unreadable) | 596 | 2.4%
4 | 394 | 1.6%
(unreadable) | 332 | 1.4%
(unreadable) | 286 | 1.2%
(unreadable) | 285 | 1.2%
(unreadable) | 277 | 1.1%
Other values (180) | 12121 | 49.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 19755 | 81.1%
Decimal Number | 4470 | 18.3%
Other Punctuation | 120 | 0.5%
Dash Punctuation | 19 | 0.1%

Most frequent character per category

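Other Letter
Value | Count | Frequency (%)
(unreadable) | 6415 | 32.5%
(unreadable) | 596 | 3.0%
(unreadable) | 332 | 1.7%
(unreadable) | 286 | 1.4%
(unreadable) | 285 | 1.4%
(unreadable) | 277 | 1.4%
(unreadable) | 271 | 1.4%
(unreadable) | 266 | 1.3%
(unreadable) | 256 | 1.3%
(unreadable) | 234 | 1.2%
Other values (167) | 10537 | 53.3%

Decimal Number
Value | Count | Frequency (%)
2 | 1497 | 33.5%
1 | 1492 | 33.4%
3 | 669 | 15.0%
4 | 394 | 8.8%
5 | 148 | 3.3%
6 | 98 | 2.2%
7 | 87 | 1.9%
8 | 51 | 1.1%
9 | 18 | 0.4%
0 | 16 | 0.4%

Other Punctuation
Value | Count | Frequency (%)
. | 101 | 84.2%
, | 19 | 15.8%

Dash Punctuation
Value | Count | Frequency (%)
- | 19 | 100.0%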

Most occurring scripts

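Value | Count | Frequency (%)
Hangul | 19755 | 81.1%
Common | 4609 | 18.9%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(unreadable) | 6415 | 32.5%
(unreadable) | 596 | 3.0%
(unreadable) | 332 | 1.7%
(unreadable) | 286 | 1.4%
(unreadable) | 285 | 1.4%
(unreadable) | 277 | 1.4%
(unreadable) | 271 | 1.4%
(unreadable) | 266 | 1.3%
(unreadable) | 256 | 1.3%
(unreadable) | 234 | 1.2%
Other values (167) | 10537 | 53.3%

Common
Value | Count | Frequency (%)
2 | 1497 | 32.5%
1 | 1492 | 32.4%
3 | 669 | 14.5%
4 | 394 | 8.5%
5 | 148 | 3.2%
. | 101 | 2.2%
6 | 98 | 2.1%
7 | 87 | 1.9%
8 | 51 | 1.1%
, | 19 | 0.4%
Other values (3) | 53 | 1.1%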

Most occurring blocks

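Value | Count | Frequency (%)
Hangul | 19755 | 81.1%
ASCII | 4609 | 18.9%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(unreadable) | 6415 | 32.5%
(unreadable) | 596 | 3.0%
(unreadable) | 332 | 1.7%
(unreadable) | 286 | 1.4%
(unreadable) | 285 | 1.4%
(unreadable) | 277 | 1.4%
(unreadable) | 271 | 1.4%
(unreadable) | 266 | 1.3%
(unreadable) | 256 | 1.3%
(unreadable) | 234 | 1.2%
Other values (167) | 10537 | 53.3%

ASCII
Value | Count | Frequency (%)
2 | 1497 | 32.5%
1 | 1492 | 32.4%
3 | 669 | 14.5%
4 | 394 | 8.5%
5 | 148 | 3.2%
. | 101 | 2.2%
6 | 98 | 2.1%
7 | 87 | 1.9%
8 | 51 | 1.1%
, | 19 | 0.4%
Other values (3) | 53 | 1.1%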

자격
Categorical

Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 50.1 KiB
차상위본인부담경감대상자: 1261
차상위계층 확인: 1182
차상위장애인: 1117
부자가족: 885
모자가족: 879
Other values (4): 1070

Length

Max length: 12
Median length: 10
Mean length: 7.1219894
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 모자가족
2nd row: 모자가족
3rd row: 부자가족
4th row: 부자가족
5th row: 차상위장애인

Common Values

Value | Count | Frequency (%)
차상위본인부담경감대상자 | 1261 | 19.7%
차상위계층 확인 | 1182 | 18.5%
차상위장애인 | 1117 | 17.5%
부자가족 | 885 | 13.8%
모자가족 | 879 | 13.7%
차상위자활 | 416 | 6.5%
청소년한부모모자가족 | 358 | 5.6%
조손가족 | 238 | 3.7%
청소년한부모부자가족 | 58 | 0.9%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
차상위본인부담경감대상자 | 1261 | 16.6%
차상위계층 | 1182 | 15.6%
확인 | 1182 | 15.6%
차상위장애인 | 1117 | 14.7%
부자가족 | 885 | 11.7%
모자가족 | 879 | 11.6%
차상위자활 | 416 | 5.5%
청소년한부모모자가족 | 358 | 4.7%
조손가족 | 238 | 3.1%
청소년한부모부자가족 | 58 | 0.8%

연령구간
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 50.1 KiB
18~64세: 2689
18세미만: 2140
65세이상: 1565

Length

Max length: 6
Median length: 5
Mean length: 5.4205505
Min length: 5
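
The length statistics can be reproduced from the three category counts, since the mean length is a count-weighted average of the string lengths. A minimal sketch, assuming length is measured in Unicode code points (which is what Python's `len` returns); the function name is illustrative:

```python
# Category counts copied from the 연령구간 overview.
COUNTS = {"18~64세": 2689, "18세미만": 2140, "65세이상": 1565}


def mean_length(counts: dict) -> float:
    """Count-weighted mean of the string lengths."""
    total = sum(counts.values())
    return sum(len(value) * n for value, n in counts.items()) / total


lengths = {value: len(value) for value in COUNTS}
print(lengths)                         # {'18~64세': 6, '18세미만': 5, '65세이상': 5}
print(round(mean_length(COUNTS), 7))   # 5.4205505
```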

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 18세미만
2nd row: 18~64세
3rd row: 18세미만
4th row: 18~64세
5th row: 18~64세

Common Values

Value | Count | Frequency (%)
18~64세 | 2689 | 42.1%
18세미만 | 2140 | 33.5%
65세이상 | 1565 | 24.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
18~64세 | 2689 | 42.1%
18세미만 | 2140 | 33.5%
65세이상 | 1565 | 24.5%

가구수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 158
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.220519
Minimum: 0
Maximum: 212
Zeros: 1843
Zeros (%): 28.8%
Negative: 0
Negative (%): 0.0%
Memory size: 56.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 4
Q3: 23
95-th percentile: 70
Maximum: 212
Range: 212
Interquartile range (IQR): 23

Descriptive statistics

Standard deviation: 25.409935
Coefficient of variation (CV): 1.5665303
Kurtosis: 8.6507955
Mean: 16.220519
Median Absolute Deviation (MAD): 4
Skewness: 2.5800654
Sum: 103714
Variance: 645.6648
Monotonicity: Not monotonic
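
Several of these figures are related: the mean is the sum divided by the number of observations, the variance is the square of the standard deviation, and the coefficient of variation is the standard deviation divided by the mean. A quick consistency check using the reported (rounded) values for 가구수; variable names are illustrative:

```python
# Reported values for 가구수.
N_OBSERVATIONS = 6394
TOTAL = 103714
STD = 25.409935

mean = TOTAL / N_OBSERVATIONS   # sum / n
variance = STD ** 2             # variance is the squared standard deviation
cv = STD / mean                 # coefficient of variation

print(round(mean, 6))      # 16.220519
print(round(variance, 4))  # 645.6648
print(round(cv, 5))        # 1.56653
```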
[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
0 | 1843 | 28.8%
1 | 826 | 12.9%
2 | 280 | 4.4%
3 | 159 | 2.5%
4 | 125 | 2.0%
7 | 118 | 1.8%
10 | 115 | 1.8%
5 | 109 | 1.7%
9 | 95 | 1.5%
14 | 95 | 1.5%
Other values (148) | 2629 | 41.1%

Value | Count | Frequency (%)
0 | 1843 | 28.8%
1 | 826 | 12.9%
2 | 280 | 4.4%
3 | 159 | 2.5%
4 | 125 | 2.0%
5 | 109 | 1.7%
6 | 81 | 1.3%
7 | 118 | 1.8%
8 | 89 | 1.4%
9 | 95 | 1.5%

Value | Count | Frequency (%)
212 | 1 | < 0.1%
208 | 1 | < 0.1%
204 | 1 | < 0.1%
197 | 2 | < 0.1%
188 | 1 | < 0.1%
187 | 1 | < 0.1%
184 | 1 | < 0.1%
175 | 1 | < 0.1%
169 | 1 | < 0.1%
168 | 1 | < 0.1%

수급권자수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 200
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.799656
Minimum: 0
Maximum: 307
Zeros: 55
Zeros (%): 0.9%
Negative: 0
Negative (%): 0.0%
Memory size: 56.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 3
median: 14
Q3: 34
95-th percentile: 95.35
Maximum: 307
Range: 307
Interquartile range (IQR): 31
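
The range and interquartile range are simple differences of the quantiles listed above. A brief sketch with the reported values for 수급권자수; the dictionary keys are illustrative names:

```python
# Quantiles reported for 수급권자수.
quantiles = {"min": 0, "q1": 3, "median": 14, "q3": 34, "max": 307}

value_range = quantiles["max"] - quantiles["min"]  # spread of all values
iqr = quantiles["q3"] - quantiles["q1"]            # spread of the middle 50%

print(value_range, iqr)  # 307 31
```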

Descriptive statistics

Standard deviation: 33.732258
Coefficient of variation (CV): 1.3074693
Kurtosis: 8.2653045
Mean: 25.799656
Median Absolute Deviation (MAD): 12
Skewness: 2.5031387
Sum: 164963
Variance: 1137.8652
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
1 | 909 | 14.2%
2 | 469 | 7.3%
3 | 270 | 4.2%
5 | 202 | 3.2%
4 | 201 | 3.1%
10 | 153 | 2.4%
6 | 152 | 2.4%
9 | 151 | 2.4%
8 | 139 | 2.2%
7 | 134 | 2.1%
Other values (190) | 3614 | 56.5%

Value | Count | Frequency (%)
0 | 55 | 0.9%
1 | 909 | 14.2%
2 | 469 | 7.3%
3 | 270 | 4.2%
4 | 201 | 3.1%
5 | 202 | 3.2%
6 | 152 | 2.4%
7 | 134 | 2.1%
8 | 139 | 2.2%
9 | 151 | 2.4%

Value | Count | Frequency (%)
307 | 1 | < 0.1%
288 | 1 | < 0.1%
262 | 1 | < 0.1%
244 | 1 | < 0.1%
237 | 1 | < 0.1%
234 | 1 | < 0.1%
230 | 1 | < 0.1%
228 | 1 | < 0.1%
224 | 1 | < 0.1%
219 | 1 | < 0.1%

Interactions

[Figures: interaction plots]

Correlations

[Figure: correlation heatmap]

 | 시군구 | 자격 | 연령구간 | 가구수 | 수급권자수
시군구 | 1.000 | 0.097 | 0.000 | 0.239 | 0.267
자격 | 0.097 | 1.000 | 0.537 | 0.306 | 0.362
연령구간 | 0.000 | 0.537 | 1.000 | 0.452 | 0.188
가구수 | 0.239 | 0.306 | 0.452 | 1.000 | 0.872
수급권자수 | 0.267 | 0.362 | 0.188 | 0.872 | 1.000

[Figure: correlation heatmap]

 | 연령구간 | 자격 | 시군구
연령구간 | 1.000 | 0.280 | 0.000
자격 | 0.280 | 1.000 | 0.037
시군구 | 0.000 | 0.037 | 1.000

[Figure: correlation heatmap]

 | 가구수 | 수급권자수 | 시군구 | 자격 | 연령구간
가구수 | 1.000 | 0.634 | 0.086 | 0.144 | 0.305
수급권자수 | 0.634 | 1.000 | 0.097 | 0.174 | 0.114
시군구 | 0.086 | 0.097 | 1.000 | 0.037 | 0.000
자격 | 0.144 | 0.174 | 0.037 | 1.000 | 0.280
연령구간 | 0.305 | 0.114 | 0.000 | 0.280 | 1.000
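
To see which pairs stand out in the first correlation matrix, one can scan its upper triangle for values above a cut-off. The sketch below uses a threshold of 0.8, which is an illustrative assumption and not the rule the profiling tool applies for its High correlation alert; the function name is also hypothetical.

```python
# Values copied from the first correlation matrix in this section.
LABELS = ["시군구", "자격", "연령구간", "가구수", "수급권자수"]
MATRIX = [
    [1.000, 0.097, 0.000, 0.239, 0.267],
    [0.097, 1.000, 0.537, 0.306, 0.362],
    [0.000, 0.537, 1.000, 0.452, 0.188],
    [0.239, 0.306, 0.452, 1.000, 0.872],
    [0.267, 0.362, 0.188, 0.872, 1.000],
]


def pairs_above(labels, matrix, threshold):
    """List each unordered pair whose absolute coefficient meets the threshold."""
    pairs = []
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):  # upper triangle only
            if abs(matrix[i][j]) >= threshold:
                pairs.append((labels[i], labels[j], matrix[i][j]))
    return pairs


strong = pairs_above(LABELS, MATRIX, threshold=0.8)
print(strong)  # [('가구수', '수급권자수', 0.872)]
```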

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

시도시군구읍면동자격연령구간가구수수급권자수
0서울특별시종로구청운효자동모자가족18세미만023
1서울특별시종로구청운효자동모자가족18~64세2130
2서울특별시종로구청운효자동부자가족18세미만03
3서울특별시종로구청운효자동부자가족18~64세49
4서울특별시종로구청운효자동차상위장애인18~64세910
5서울특별시종로구청운효자동차상위장애인65세이상99
6서울특별시종로구청운효자동차상위자활18~64세11
7서울특별시종로구청운효자동차상위본인부담경감대상자18세미만09
8서울특별시종로구청운효자동차상위본인부담경감대상자18~64세139
9서울특별시종로구청운효자동차상위본인부담경감대상자65세이상2226
시도시군구읍면동자격연령구간가구수수급권자수
6384서울특별시강동구둔촌2동차상위장애인18세미만02
6385서울특별시강동구둔촌2동차상위장애인18~64세1816
6386서울특별시강동구둔촌2동차상위장애인65세이상2324
6387서울특별시강동구둔촌2동차상위자활18~64세11
6388서울특별시강동구둔촌2동차상위본인부담경감대상자18세미만121
6389서울특별시강동구둔촌2동차상위본인부담경감대상자18~64세2919
6390서울특별시강동구둔촌2동차상위본인부담경감대상자65세이상2831
6391서울특별시강동구둔촌2동차상위계층 확인18세미만03
6392서울특별시강동구둔촌2동차상위계층 확인18~64세2549
6393서울특별시강동구둔촌2동차상위계층 확인65세이상3134