Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 2960
Missing cells (%): 5.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 478.5 KiB
Average record size in memory: 49.0 B
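
The percentage and per-record figures above follow from the raw counts. A minimal sketch of that arithmetic, using only the numbers reported in this section (the variable names are illustrative, not part of the report):

```python
# Values copied from the dataset statistics above.
n_variables = 5
n_observations = 10_000
missing_cells = 2_960
total_size_kib = 478.5

# Share of missing cells over all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size in bytes (1 KiB = 1024 bytes).
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

All 2960 missing cells sit in the single column 2020 년, which is why the per-column missing rate (29.6%) is five times the dataset-wide rate.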

Variable types

Text: 1
Categorical: 3
Numeric: 1

Dataset

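Description: Information on the population aged 6 and over by educational attainment: elementary school, middle school, high school, college (2- or 3-year), university (4-year or more), graduate school (master's and doctoral programs), and no schooling (including children not yet in school). * Population and Housing Census data (produced on a 5-year cycle)
Author: Incheon Metropolitan City (인천광역시)
URL: https://data.incheon.go.kr/findData/publicDataDetail?dataId=15055008&srcSe=7661IVAWM27C61E190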

Alerts

2020 년 has 2960 (29.6%) missing values [Missing]

Reproduction

Analysis started: 2024-03-18 04:19:12.869580
Analysis finished: 2024-03-18 04:19:13.825576
Duration: 0.96 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
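
A report of this kind can be regenerated from the source data with ydata-profiling. The following is a minimal sketch, not the exact configuration used here: the CSV path, the report title, and the function name `build_profile` are assumptions, and the settings stored in config.json are not reproduced.

```python
from pathlib import Path


def build_profile(csv_path: str, out_path: str = "report.html") -> Path:
    """Profile a CSV file and write the HTML report to out_path."""
    # Third-party imports stay inside the function so that this module can
    # be imported even where pandas / ydata-profiling are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # csv_path is a hypothetical local copy of the dataset.
    df = pd.read_csv(csv_path)
    report = ProfileReport(df, title="Population by educational attainment")
    report.to_file(out_path)
    return Path(out_path)
```

Calling `build_profile("education_population.csv")` (a hypothetical file name) would write `report.html` to the working directory.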

Variables

행정구역별(동읍면) (administrative district: dong, eup, or myeon)
Text

Distinct: 169
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 7
Median length: 4
Mean length: 3.7671
Min length: 2

Characters and Unicode

Total characters: 37671
Distinct characters: 117
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
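
As an illustration of those character properties, the sketch below looks up the general category of each code point in one of the sample values of this variable using Python's standard `unicodedata` module. It assumes the middle dot in the value is U+00B7 (MIDDLE DOT). The standard library exposes the general category but not the script or block, so only the category breakdown is shown.

```python
import unicodedata
from collections import Counter

# One of the sample values from this variable. The middle dot is written as
# an escape and is assumed to be U+00B7 (MIDDLE DOT).
value = "도화2\u00b73동"

# General category of each code point, for example "Lo" (Other Letter),
# "Nd" (Decimal Number), "Po" (Other Punctuation).
categories = Counter(unicodedata.category(ch) for ch in value)

for ch in value:
    print(f"U+{ord(ch):04X} {unicodedata.category(ch)} {unicodedata.name(ch)}")
print(dict(categories))
```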

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 자월면
2nd row: 삼산2동
3rd row: 만수4동
4th row: 도화2·3동
5th row: 교동면
Value | Count | Frequency (%)
부평6동 | 76 | 0.8%
불은면 | 71 | 0.7%
연수2동 | 71 | 0.7%
가정1동 | 70 | 0.7%
효성2동 | 67 | 0.7%
작전1동 | 67 | 0.7%
자월면 | 66 | 0.7%
마전동 | 66 | 0.7%
영종동 | 66 | 0.7%
용유동 | 66 | 0.7%
Other values (159) | 9314 | 93.1%

Most occurring characters

Value | Count | Frequency (%)
[?] | 8436 | 22.4%
2 | 1896 | 5.0%
1 | 1889 | 5.0%
3 | 1211 | 3.2%
[?] | 1188 | 3.2%
[?] | 834 | 2.2%
[?] | 798 | 2.1%
[?] | 792 | 2.1%
[?] | 774 | 2.1%
[?] | 762 | 2.0%
Other values (107) | 19091 | 50.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 30910 | 82.1%
Decimal Number | 6424 | 17.1%
Other Punctuation | 337 | 0.9%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[?] | 8436 | 27.3%
[?] | 1188 | 3.8%
[?] | 834 | 2.7%
[?] | 798 | 2.6%
[?] | 792 | 2.6%
[?] | 774 | 2.5%
[?] | 762 | 2.5%
[?] | 716 | 2.3%
[?] | 565 | 1.8%
[?] | 540 | 1.7%
Other values (98) | 15505 | 50.2%

Decimal Number
Value | Count | Frequency (%)
2 | 1896 | 29.5%
1 | 1889 | 29.4%
3 | 1211 | 18.9%
4 | 707 | 11.0%
5 | 352 | 5.5%
6 | 256 | 4.0%
8 | 60 | 0.9%
7 | 53 | 0.8%

Other Punctuation
Value | Count | Frequency (%)
· | 337 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 30910 | 82.1%
Common | 6761 | 17.9%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[?] | 8436 | 27.3%
[?] | 1188 | 3.8%
[?] | 834 | 2.7%
[?] | 798 | 2.6%
[?] | 792 | 2.6%
[?] | 774 | 2.5%
[?] | 762 | 2.5%
[?] | 716 | 2.3%
[?] | 565 | 1.8%
[?] | 540 | 1.7%
Other values (98) | 15505 | 50.2%

Common
Value | Count | Frequency (%)
2 | 1896 | 28.0%
1 | 1889 | 27.9%
3 | 1211 | 17.9%
4 | 707 | 10.5%
5 | 352 | 5.2%
· | 337 | 5.0%
6 | 256 | 3.8%
8 | 60 | 0.9%
7 | 53 | 0.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 30910 | 82.1%
ASCII | 6424 | 17.1%
None | 337 | 0.9%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[?] | 8436 | 27.3%
[?] | 1188 | 3.8%
[?] | 834 | 2.7%
[?] | 798 | 2.6%
[?] | 792 | 2.6%
[?] | 774 | 2.5%
[?] | 762 | 2.5%
[?] | 716 | 2.3%
[?] | 565 | 1.8%
[?] | 540 | 1.7%
Other values (98) | 15505 | 50.2%

ASCII
Value | Count | Frequency (%)
2 | 1896 | 29.5%
1 | 1889 | 29.4%
3 | 1211 | 18.9%
4 | 707 | 11.0%
5 | 352 | 5.5%
6 | 256 | 4.0%
8 | 60 | 0.9%
7 | 53 | 0.8%

None
Value | Count | Frequency (%)
· | 337 | 100.0%

성별 (sex; 여자 = female, 남자 = male)
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
여자: 5003
남자: 4997

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 남자
2nd row: 남자
3rd row: 남자
4th row: 남자
5th row: 남자

Common Values

Value | Count | Frequency (%)
여자 | 5003 | 50.0%
남자 | 4997 | 50.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
여자 | 5003 | 50.0%
남자 | 4997 | 50.0%

연령별 (age group; 세 = years of age, 이상 = or older)
Categorical

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
50-59세: 1295
20-29세: 1274
10-19세: 1263
70세 이상: 1253
40-49세: 1251
Other values (3): 3664

Length

Max length: 6
Median length: 6
Mean length: 5.7586
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 70세 이상
2nd row: 50-59세
3rd row: 20-29세
4th row: 50-59세
5th row: 20-29세

Common Values

Value | Count | Frequency (%)
50-59세 | 1295 | 13.0%
20-29세 | 1274 | 12.7%
10-19세 | 1263 | 12.6%
70세 이상 | 1253 | 12.5%
40-49세 | 1251 | 12.5%
60-69세 | 1230 | 12.3%
30-39세 | 1227 | 12.3%
6-9세 | 1207 | 12.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
50-59세 | 1295 | 11.5%
20-29세 | 1274 | 11.3%
10-19세 | 1263 | 11.2%
70세 | 1253 | 11.1%
이상 | 1253 | 11.1%
40-49세 | 1251 | 11.1%
60-69세 | 1230 | 10.9%
30-39세 | 1227 | 10.9%
6-9세 | 1207 | 10.7%
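
The percentages in the table directly above (11.5% for 50-59세) differ from those under Common Values (13.0%) because the rows here are tokens rather than whole categories: 70세 이상 is counted once as 70세 and once as 이상, so the denominator grows from 10000 to 11253. The sketch below reproduces that arithmetic from the Common Values counts. Splitting on whitespace is an assumption inferred from the split rows, not a statement of how the profiler tokenises internally.

```python
from collections import Counter

# Category counts copied from the Common Values table for this variable.
common_values = {
    "50-59세": 1295, "20-29세": 1274, "10-19세": 1263, "70세 이상": 1253,
    "40-49세": 1251, "60-69세": 1230, "30-39세": 1227, "6-9세": 1207,
}

# Assumed tokenisation: split each category value on whitespace.
words = Counter()
for value, count in common_values.items():
    for token in value.split():
        words[token] += count

total_tokens = sum(words.values())
for token, count in words.most_common():
    print(f"{token} | {count} | {100 * count / total_tokens:.1f}%")
```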

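교육정도별 (educational attainment)
Categorical

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
고등학교 (high school): 1467
대학교(4년제 이상) (university, 4-year or more): 1441
대학원(석박사 과정) (graduate school, master's and doctoral programs): 1431
대학교(2,3년제) (college, 2- or 3-year): 1430
받지 않았음(미취학 포함) (no schooling, including children not yet in school): 1423
Other values (2): 2808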

Length

Max length: 14
Median length: 11
Mean length: 8.1491
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 받지 않았음(미취학 포함)
2nd row: 받지 않았음(미취학 포함)
3rd row: 대학원(석박사 과정)
4th row: 초등학교
5th row: 초등학교

Common Values

Value | Count | Frequency (%)
고등학교 (high school) | 1467 | 14.7%
대학교(4년제 이상) (university, 4-year or more) | 1441 | 14.4%
대학원(석박사 과정) (graduate school, master's and doctoral programs) | 1431 | 14.3%
대학교(2,3년제) (college, 2- or 3-year) | 1430 | 14.3%
받지 않았음(미취학 포함) (no schooling, including children not yet in school) | 1423 | 14.2%
중학교 (middle school) | 1423 | 14.2%
초등학교 (elementary school) | 1385 | 13.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
고등학교 | 1467 | 9.3%
대학교(4년제 | 1441 | 9.2%
이상 | 1441 | 9.2%
대학원(석박사 | 1431 | 9.1%
과정 | 1431 | 9.1%
대학교(2,3년제 | 1430 | 9.1%
받지 | 1423 | 9.1%
않았음(미취학 | 1423 | 9.1%
포함 | 1423 | 9.1%
중학교 | 1423 | 9.1%

2020 년 (year 2020)
Real number (ℝ)

MISSING

Distinct: 1355
Distinct (%): 19.2%
Missing: 2960
Missing (%): 29.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 870.92699
Minimum: 1
Maximum: 159239
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 6
Q1: 31
median: 124
Q3: 375
95-th percentile: 1498.05
Maximum: 159239
Range: 159238
Interquartile range (IQR): 344

Descriptive statistics

Standard deviation: 5739.4169
Coefficient of variation (CV): 6.5900092
Kurtosis: 287.87933
Mean: 870.92699
Median Absolute Deviation (MAD): 110
Skewness: 15.214195
Sum: 6131326
Variance: 32940906
Monotonicity: Not monotonic
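
Several of the figures above are derived from others in the same section, which gives a quick consistency check. A minimal sketch using only the reported values (variable names are illustrative; small differences come from rounding in the report):

```python
# Values copied from the quantile and descriptive statistics above.
q1, q3 = 31, 375
minimum, maximum = 1, 159_239
mean, std = 870.92699, 5739.4169
total, n_obs, n_missing = 6_131_326, 10_000, 2_960

iqr = q3 - q1                                # interquartile range
value_range = maximum - minimum              # range
cv = std / mean                              # coefficient of variation
variance = std ** 2                          # variance from the rounded std
mean_from_sum = total / (n_obs - n_missing)  # mean over non-missing values

print(f"IQR: {iqr}, range: {value_range}")
print(f"CV: {cv:.4f}, variance: {variance:.0f}")
print(f"Mean from sum: {mean_from_sum:.5f}")
```

The mean (about 871) lies well above both the median (124) and Q3 (375), consistent with the strong right skew reported (skewness 15.2) and a maximum of 159239.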
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
8 | 103 | 1.0%
7 | 97 | 1.0%
5 | 78 | 0.8%
12 | 78 | 0.8%
4 | 76 | 0.8%
13 | 73 | 0.7%
2 | 69 | 0.7%
3 | 69 | 0.7%
9 | 66 | 0.7%
14 | 62 | 0.6%
Other values (1345) | 6269 | 62.7%
(Missing) | 2960 | 29.6%

Minimum 10 values

Value | Count | Frequency (%)
1 | 49 | 0.5%
2 | 69 | 0.7%
3 | 69 | 0.7%
4 | 76 | 0.8%
5 | 78 | 0.8%
6 | 60 | 0.6%
7 | 97 | 1.0%
8 | 103 | 1.0%
9 | 66 | 0.7%
10 | 61 | 0.6%

Maximum 10 values

Value | Count | Frequency (%)
159239 | 1 | < 0.1%
154893 | 1 | < 0.1%
109967 | 1 | < 0.1%
107602 | 1 | < 0.1%
94379 | 1 | < 0.1%
93220 | 1 | < 0.1%
91070 | 1 | < 0.1%
90986 | 1 | < 0.1%
90342 | 1 | < 0.1%
86622 | 1 | < 0.1%

Interactions


Correlations

 | 성별 | 연령별 | 교육정도별 | 2020 년
성별 | 1.000 | 0.000 | 0.000 | 0.021
연령별 | 0.000 | 1.000 | 0.000 | 0.078
교육정도별 | 0.000 | 0.000 | 1.000 | 0.059
2020 년 | 0.021 | 0.078 | 0.059 | 1.000

 | 연령별 | 성별 | 교육정도별
연령별 | 1.000 | 0.000 | 0.000
성별 | 0.000 | 1.000 | 0.000
교육정도별 | 0.000 | 0.000 | 1.000

 | 2020 년 | 성별 | 연령별 | 교육정도별
2020 년 | 1.000 | 0.016 | 0.026 | 0.032
성별 | 0.016 | 1.000 | 0.000 | 0.000
연령별 | 0.026 | 0.000 | 1.000 | 0.000
교육정도별 | 0.032 | 0.000 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

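First rows

Index | 행정구역별(동읍면) | 성별 | 연령별 | 교육정도별 | 2020 년
18752 | 자월면 | 남자 | 70세 이상 | 받지 않았음(미취학 포함) | 12
9897 | 삼산2동 | 남자 | 50-59세 | 받지 않았음(미취학 포함) | <NA>
6403 | 만수4동 | 남자 | 20-29세 | 대학원(석박사 과정) | 19
16387 | 도화2·3동 | 남자 | 50-59세 | 초등학교 | 22
17710 | 교동면 | 남자 | 20-29세 | 초등학교 | <NA>
15663 | 주안6동 | 여자 | 50-59세 | 대학교(4년제 이상) | 173
9632 | 십정1동 | 남자 | 6-9세 | 초등학교 | 198
2121 | 만석동 | 여자 | 70세 이상 | 초등학교 | 274
6501 | 만수5동 | 남자 | 6-9세 | 대학원(석박사 과정) | <NA>
9588 | 일신동 | 여자 | 10-19세 | 대학원(석박사 과정) | <NA>

Last rows

Index | 행정구역별(동읍면) | 성별 | 연령별 | 교육정도별 | 2020 년
15810 | 주안8동 | 남자 | 20-29세 | 대학교(4년제 이상) | 612
6195 | 만수2동 | 남자 | 50-59세 | 초등학교 | 8
7538 | 부평1동 | 남자 | 40-49세 | 받지 않았음(미취학 포함) | <NA>
4747 | 송도2동 | 남자 | 60-69세 | 중학교 | 19
2522 | 송현3동 | 여자 | 6-9세 | 고등학교 | <NA>
14058 | 미추홀구 | 여자 | 6-9세 | 고등학교 | <NA>
5184 | 구월1동 | 남자 | 40-49세 | 대학교(4년제 이상) | 1097
839 | 신흥동 | 남자 | 70세 이상 | 받지 않았음(미취학 포함) | 27
15114 | 주안1동 | 여자 | 70세 이상 | 중학교 | 140
1500 | 용유동 | 남자 | 60-69세 | 고등학교 | 273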