Overview

Dataset statistics

Number of variables: 6
Number of observations: 624
Missing cells: 104
Missing cells (%): 2.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 31.8 KiB
Average record size in memory: 52.2 B
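The missing-cell percentage can be rechecked from the counts above. The following is a minimal sketch, assuming the total cell count is variables × observations and that all 104 missing cells fall in the single column 대상자 수, as the Alerts section reports. The variable names are illustrative and do not come from the report.

```python
# Counts copied from the dataset statistics above.
n_variables = 6
n_observations = 624
missing_cells = 104

# Assumption: total cells = variables x observations.
total_cells = n_variables * n_observations
missing_cells_pct = 100 * missing_cells / total_cells

# Assumption: all missing cells sit in one column (대상자 수),
# so the per-column share divides by the number of observations.
column_missing_pct = 100 * missing_cells / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_cells_pct:.1f}%")
print(f"missing in 대상자 수: {column_missing_pct:.1f}%")
```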

Variable types

Numeric: 3
Categorical: 3

Dataset

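Description: Year (연도), region code (지역코드), region name (지역이름), category code (분류코드), category name (분류 명), number of subjects (대상자 수)
Author: 서울시정신건강복지센터 (Seoul Mental Health Welfare Center)
URL: https://data.seoul.go.kr/dataList/OA-20331/S/1/datasetView.do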

Alerts

분류 명 is highly overall correlated with 분류코드 (High correlation)
분류코드 is highly overall correlated with 분류 명 (High correlation)
지역코드 is highly overall correlated with 지역이름 (High correlation)
지역이름 is highly overall correlated with 지역코드 (High correlation)
대상자 수 has 104 (16.7%) missing values (Missing)

Reproduction

Analysis started: 2024-05-04 02:53:17.056540
Analysis finished: 2024-05-04 02:53:20.728155
Duration: 3.67 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
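A report of this kind is typically produced with a few lines of Python. The sketch below is illustrative only: the function name `build_report`, the CSV path, the output file name, and the report title are hypothetical, and the exact options used for this report are not known beyond what config.json would record. It assumes the third-party packages pandas and ydata-profiling are installed.

```python
def build_report(csv_path: str, output_path: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so this module loads even where the third-party
    # packages pandas and ydata-profiling are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(
        df,
        title="Profiling report",  # hypothetical title
        # Dataset metadata as shown in the Overview section.
        dataset={
            "description": "연도,지역코드,지역이름,분류코드,분류 명,대상자 수",
            "author": "서울시정신건강복지센터",
            "url": "https://data.seoul.go.kr/dataList/OA-20331/S/1/datasetView.do",
        },
    )
    profile.to_file(output_path)
```

Calling `build_report("data.csv")` with a local copy of the dataset (the file name is an assumption) would write `report.html` to the working directory.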

Variables

연도
Real number (ℝ)

Distinct: 6
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2019.5
Minimum: 2017
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.6 KiB

Quantile statistics

Minimum: 2017
5-th percentile: 2017
Q1: 2018
Median: 2019.5
Q3: 2021
95-th percentile: 2022
Maximum: 2022
Range: 5
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.7091952
Coefficient of variation (CV): 0.00084634574
Kurtosis: -1.2691179
Mean: 2019.5
Median Absolute Deviation (MAD): 1.5
Skewness: 0
Sum: 1260168
Variance: 2.9213483
Monotonicity: Decreasing
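The figures for 연도 can be reproduced by hand, because each of the six years occurs exactly 104 times. The sketch below uses only the Python standard library and assumes the report uses the sample (n − 1) variance and linearly interpolated quartiles; both assumptions agree with the values printed above, but the report itself does not state them.

```python
import statistics

# Each year from 2017 to 2022 occurs 104 times (624 observations).
years = [year for year in range(2017, 2023) for _ in range(104)]

total = sum(years)
mean = statistics.mean(years)
variance = statistics.variance(years)  # sample variance, n - 1 denominator
std_dev = statistics.stdev(years)      # square root of the sample variance
cv = std_dev / mean                    # coefficient of variation

# Quartiles with linear interpolation over the inclusive range.
q1, median, q3 = statistics.quantiles(years, n=4, method="inclusive")
iqr = q3 - q1

print(total, mean, variance, std_dev, cv)
print(q1, median, q3, iqr)
```

The coefficient of variation is tiny because the standard deviation (about 1.7 years) is divided by a mean near 2019.5, so it says little about the spread of a calendar-year column.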
Histogram with fixed size bins (bins=6)

Common values

| Value | Count | Frequency (%) |
| --- | --- | --- |
| 2022 | 104 | 16.7% |
| 2021 | 104 | 16.7% |
| 2020 | 104 | 16.7% |
| 2019 | 104 | 16.7% |
| 2018 | 104 | 16.7% |
| 2017 | 104 | 16.7% |

Minimum values

| Value | Count | Frequency (%) |
| --- | --- | --- |
| 2017 | 104 | 16.7% |
| 2018 | 104 | 16.7% |
| 2019 | 104 | 16.7% |
| 2020 | 104 | 16.7% |
| 2021 | 104 | 16.7% |
| 2022 | 104 | 16.7% |

Maximum values

| Value | Count | Frequency (%) |
| --- | --- | --- |
| 2022 | 104 | 16.7% |
| 2021 | 104 | 16.7% |
| 2020 | 104 | 16.7% |
| 2019 | 104 | 16.7% |
| 2018 | 104 | 16.7% |
| 2017 | 104 | 16.7% |

지역코드
Real number (ℝ)

HIGH CORRELATION 

Distinct: 26
Distinct (%): 4.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 112.5
Minimum: 100
Maximum: 125
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.6 KiB

Quantile statistics

Minimum: 100
5-th percentile: 101
Q1: 106
Median: 112.5
Q3: 119
95-th percentile: 124
Maximum: 125
Range: 25
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 7.5060168
Coefficient of variation (CV): 0.06672015
Kurtosis: -1.203578
Mean: 112.5
Median Absolute Deviation (MAD): 6.5
Skewness: 0
Sum: 70200
Variance: 56.340289
Monotonicity: Not monotonic
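Histogram with fixed size bins (bins=26)

Common values

| Value | Count | Frequency (%) |
| --- | --- | --- |
| 100 | 24 | 3.8% |
| 114 | 24 | 3.8% |
| 125 | 24 | 3.8% |
| 124 | 24 | 3.8% |
| 123 | 24 | 3.8% |
| 122 | 24 | 3.8% |
| 121 | 24 | 3.8% |
| 120 | 24 | 3.8% |
| 119 | 24 | 3.8% |
| 118 | 24 | 3.8% |
| Other values (16) | 384 | 61.5% |

Minimum values

| Value | Count | Frequency (%) |
| --- | --- | --- |
| 100 | 24 | 3.8% |
| 101 | 24 | 3.8% |
| 102 | 24 | 3.8% |
| 103 | 24 | 3.8% |
| 104 | 24 | 3.8% |
| 105 | 24 | 3.8% |
| 106 | 24 | 3.8% |
| 107 | 24 | 3.8% |
| 108 | 24 | 3.8% |
| 109 | 24 | 3.8% |

Maximum values

| Value | Count | Frequency (%) |
| --- | --- | --- |
| 125 | 24 | 3.8% |
| 124 | 24 | 3.8% |
| 123 | 24 | 3.8% |
| 122 | 24 | 3.8% |
| 121 | 24 | 3.8% |
| 120 | 24 | 3.8% |
| 119 | 24 | 3.8% |
| 118 | 24 | 3.8% |
| 117 | 24 | 3.8% |
| 116 | 24 | 3.8% |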

지역이름
Categorical

HIGH CORRELATION 

Distinct: 26
Distinct (%): 4.2%
Missing: 0
Missing (%): 0.0%
Memory size: 5.0 KiB

서울시: 24
종로구: 24
중구: 24
용산구: 24
성동구: 24
Other values (21): 504

Length

Max length: 11
Median length: 10
Mean length: 9.8076923
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울시
2nd row: 서울시
3rd row: 서울시
4th row: 서울시
5th row: 종로구

Common Values

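| Value | Count | Frequency (%) |
| --- | --- | --- |
| 서울시 | 24 | 3.8% |
| 종로구 | 24 | 3.8% |
| 중구 | 24 | 3.8% |
| 용산구 | 24 | 3.8% |
| 성동구 | 24 | 3.8% |
| 광진구 | 24 | 3.8% |
| 동대문구 | 24 | 3.8% |
| 중랑구 | 24 | 3.8% |
| 성북구 | 24 | 3.8% |
| 강북구 | 24 | 3.8% |
| Other values (16) | 384 | 61.5% |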

Length

Histogram of lengths of the category

| Value | Count | Frequency (%) |
| --- | --- | --- |
| 서울시 | 24 | 3.8% |
| 종로구 | 24 | 3.8% |
| 송파구 | 24 | 3.8% |
| 강남구 | 24 | 3.8% |
| 서초구 | 24 | 3.8% |
| 관악구 | 24 | 3.8% |
| 동작구 | 24 | 3.8% |
| 영등포구 | 24 | 3.8% |
| 금천구 | 24 | 3.8% |
| 구로구 | 24 | 3.8% |
| Other values (16) | 384 | 61.5% |

분류코드
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 5.0 KiB

301001: 156
301002: 156
301003: 156
301004: 156

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 301001
2nd row: 301002
3rd row: 301003
4th row: 301004
5th row: 301001

Common Values

| Value | Count | Frequency (%) |
| --- | --- | --- |
| 301001 | 156 | 25.0% |
| 301002 | 156 | 25.0% |
| 301003 | 156 | 25.0% |
| 301004 | 156 | 25.0% |

Length

Histogram of lengths of the category

Common Values (Plot)

| Value | Count | Frequency (%) |
| --- | --- | --- |
| 301001 | 156 | 25.0% |
| 301002 | 156 | 25.0% |
| 301003 | 156 | 25.0% |
| 301004 | 156 | 25.0% |

분류 명
Categorical

HIGH CORRELATION 

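Distinct: 4
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 5.0 KiB

주민등록인구 (registered resident population): 156
추계중증정신질환자 수 (estimated number of people with severe mental illness): 156
등록 정신장애인 수 (number of registered people with a mental disability): 156
추계중증정신질환자 대비 정신장애인등록률(%) (mental disability registration rate relative to estimated severe mental illness, %): 156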

Length

Max length: 24
Median length: 10.5
Mean length: 12.75
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 주민등록인구
2nd row: 추계중증정신질환자 수
3rd row: 등록 정신장애인 수
4th row: 추계중증정신질환자 대비 정신장애인등록률(%)
5th row: 주민등록인구

Common Values

| Value | Count | Frequency (%) |
| --- | --- | --- |
| 주민등록인구 | 156 | 25.0% |
| 추계중증정신질환자 수 | 156 | 25.0% |
| 등록 정신장애인 수 | 156 | 25.0% |
| 추계중증정신질환자 대비 정신장애인등록률(%) | 156 | 25.0% |

Length

Histogram of lengths of the category

Common Values (Plot)

Words

| Value | Count | Frequency (%) |
| --- | --- | --- |
| 추계중증정신질환자 | 312 | 22.2% |
| 수 | 312 | 22.2% |
| 주민등록인구 | 156 | 11.1% |
| 등록 | 156 | 11.1% |
| 정신장애인 | 156 | 11.1% |
| 대비 | 156 | 11.1% |
| 정신장애인등록률 | 156 | 11.1% |

대상자 수
Real number (ℝ)

MISSING 

Distinct: 443
Distinct (%): 85.2%
Missing: 104
Missing (%): 16.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 188835.43
Minimum: 7.6
Maximum: 9857426
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.6 KiB

Quantile statistics

Minimum: 7.6
5-th percentile: 13.295
Q1: 162.475
Median: 1568.5
Q3: 104555.25
95-th percentile: 502024
Maximum: 9857426
Range: 9857418.4
Interquartile range (IQR): 104392.77

Descriptive statistics

Standard deviation: 955105.24
Coefficient of variation (CV): 5.057871
Kurtosis: 93.154616
Mean: 188835.43
Median Absolute Deviation (MAD): 1555.25
Skewness: 9.5720374
Sum: 98194423
Variance: 9.1222602 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

| Value | Count | Frequency (%) |
| --- | --- | --- |
| 17.0 | 6 | 1.0% |
| 14.8 | 5 | 0.8% |
| 18.8 | 4 | 0.6% |
| 17.4 | 4 | 0.6% |
| 13.0 | 4 | 0.6% |
| 16.9 | 4 | 0.6% |
| 12.7 | 3 | 0.5% |
| 18.6 | 3 | 0.5% |
| 13.4 | 3 | 0.5% |
| 530.0 | 3 | 0.5% |
| Other values (433) | 481 | 77.1% |
| (Missing) | 104 | 16.7% |

Minimum values

| Value | Count | Frequency (%) |
| --- | --- | --- |
| 7.6 | 1 | 0.2% |
| 8.1 | 1 | 0.2% |
| 8.4 | 2 | 0.3% |
| 8.5 | 1 | 0.2% |
| 9.5 | 1 | 0.2% |
| 9.6 | 1 | 0.2% |
| 9.7 | 2 | 0.3% |
| 10.0 | 1 | 0.2% |
| 12.0 | 2 | 0.3% |
| 12.1 | 1 | 0.2% |

Maximum values

| Value | Count | Frequency (%) |
| --- | --- | --- |
| 9857426.0 | 1 | 0.2% |
| 9765623.0 | 1 | 0.2% |
| 9729107.0 | 1 | 0.2% |
| 9668465.0 | 1 | 0.2% |
| 9509458.0 | 1 | 0.2% |
| 675961.0 | 1 | 0.2% |
| 667960.0 | 1 | 0.2% |
| 666635.0 | 1 | 0.2% |
| 664496.0 | 1 | 0.2% |
| 658338.0 | 1 | 0.2% |

Interactions

[9 interaction plots omitted]

Correlations

| | 연도 | 지역코드 | 지역이름 | 분류코드 | 분류 명 | 대상자 수 |
| --- | --- | --- | --- | --- | --- | --- |
| 연도 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| 지역코드 | 0.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.314 |
| 지역이름 | 0.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.565 |
| 분류코드 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000 | 0.231 |
| 분류 명 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000 | 0.231 |
| 대상자 수 | 0.000 | 0.314 | 0.565 | 0.231 | 0.231 | 1.000 |

| | 분류 명 | 분류코드 | 지역이름 |
| --- | --- | --- | --- |
| 분류 명 | 1.000 | 1.000 | 0.000 |
| 분류코드 | 1.000 | 1.000 | 0.000 |
| 지역이름 | 0.000 | 0.000 | 1.000 |

| | 연도 | 지역코드 | 대상자 수 | 지역이름 | 분류코드 | 분류 명 |
| --- | --- | --- | --- | --- | --- | --- |
| 연도 | 1.000 | 0.000 | 0.001 | 0.000 | 0.000 | 0.000 |
| 지역코드 | 0.000 | 1.000 | 0.033 | 0.987 | 0.000 | 0.000 |
| 대상자 수 | 0.001 | 0.033 | 1.000 | 0.442 | 0.153 | 0.153 |
| 지역이름 | 0.000 | 0.987 | 0.442 | 1.000 | 0.000 | 0.000 |
| 분류코드 | 0.000 | 0.000 | 0.153 | 0.000 | 1.000 | 1.000 |
| 분류 명 | 0.000 | 0.000 | 0.153 | 0.000 | 1.000 | 1.000 |

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

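First rows

| | 연도 | 지역코드 | 지역이름 | 분류코드 | 분류 명 | 대상자 수 |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | 2022 | 100 | 서울시 | 301001 | 주민등록인구 | <NA> |
| 1 | 2022 | 100 | 서울시 | 301002 | 추계중증정신질환자 수 | <NA> |
| 2 | 2022 | 100 | 서울시 | 301003 | 등록 정신장애인 수 | <NA> |
| 3 | 2022 | 100 | 서울시 | 301004 | 추계중증정신질환자 대비 정신장애인등록률(%) | <NA> |
| 4 | 2022 | 101 | 종로구 | 301001 | 주민등록인구 | <NA> |
| 5 | 2022 | 101 | 종로구 | 301002 | 추계중증정신질환자 수 | <NA> |
| 6 | 2022 | 101 | 종로구 | 301003 | 등록 정신장애인 수 | <NA> |
| 7 | 2022 | 101 | 종로구 | 301004 | 추계중증정신질환자 대비 정신장애인등록률(%) | <NA> |
| 8 | 2022 | 102 | 중구 | 301001 | 주민등록인구 | <NA> |
| 9 | 2022 | 102 | 중구 | 301002 | 추계중증정신질환자 수 | <NA> |

Last rows

| | 연도 | 지역코드 | 지역이름 | 분류코드 | 분류 명 | 대상자 수 |
| --- | --- | --- | --- | --- | --- | --- |
| 614 | 2017 | 123 | 강남구 | 301003 | 등록 정신장애인 수 | 877.0 |
| 615 | 2017 | 123 | 강남구 | 301004 | 추계중증정신질환자 대비 정신장애인등록률(%) | 15.8 |
| 616 | 2017 | 124 | 송파구 | 301001 | 주민등록인구 | 664496.0 |
| 617 | 2017 | 124 | 송파구 | 301002 | 추계중증정신질환자 수 | 6645.0 |
| 618 | 2017 | 124 | 송파구 | 301003 | 등록 정신장애인 수 | 635.0 |
| 619 | 2017 | 124 | 송파구 | 301004 | 추계중증정신질환자 대비 정신장애인등록률(%) | 9.6 |
| 620 | 2017 | 125 | 강동구 | 301001 | 주민등록인구 | 436223.0 |
| 621 | 2017 | 125 | 강동구 | 301002 | 추계중증정신질환자 수 | 4362.0 |
| 622 | 2017 | 125 | 강동구 | 301003 | 등록 정신장애인 수 | 568.0 |
| 623 | 2017 | 125 | 강동구 | 301004 | 추계중증정신질환자 대비 정신장애인등록률(%) | 13.0 |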
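The sample rows show that the table is in long format: each region and year contributes four rows, one per 분류코드, and 대상자 수 holds a count or a percentage depending on the category. The sketch below rechecks how the four categories relate, using only the two complete 2017 districts in the last rows. The relationships it tests (the rate is registered ÷ estimated × 100, and the estimate is 1% of the population) are observations from these rows, not definitions stated by the dataset; the helper name `registration_rate` is hypothetical.

```python
# Values copied from the last rows of the sample table (year 2017).
# Keys are 분류코드 values: 301001 population, 301002 estimated count,
# 301003 registered count, 301004 registration rate (%).
rows_2017 = {
    "송파구": {"301001": 664496.0, "301002": 6645.0, "301003": 635.0, "301004": 9.6},
    "강동구": {"301001": 436223.0, "301002": 4362.0, "301003": 568.0, "301004": 13.0},
}


def registration_rate(registered: float, estimated: float) -> float:
    """Registered count as a percentage of the estimated count, one decimal."""
    return round(100 * registered / estimated, 1)


for district, values in rows_2017.items():
    recomputed_rate = registration_rate(values["301003"], values["301002"])
    # Assumption observed in these rows only: estimate = 1% of population.
    one_percent = round(values["301001"] * 0.01)
    print(district, recomputed_rate, values["301004"], one_percent, values["301002"])
```

Because 대상자 수 mixes populations in the millions with percentages below 20, its summary statistics (mean, skewness, quantiles) describe a blend of four different quantities and are best read per category.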