Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 2970
Missing cells (%): 5.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 478.5 KiB
Average record size in memory: 49.0 B
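The two derived figures above can be rechecked from the raw counts. The snippet below is a minimal sketch that assumes the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB equals 1024 bytes; the variable names are illustrative and do not come from the report.

```python
# Figures copied from the "Dataset statistics" block of the report.
n_variables = 5
n_observations = 10_000
missing_cells = 2_970
total_size_kib = 478.5

# Share of missing cells over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average in-memory size of one row, in bytes (assuming 1 KiB = 1024 B).
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Under these assumptions both values round to the reported 5.9% and 49.0 B.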

Variable types

Text: 1
Categorical: 3
Numeric: 1

Dataset

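Description: Population aged 6 and over by educational attainment (elementary school, middle school, high school, college (2- or 3-year), university (4-year or longer), graduate school (master's or doctoral program), no schooling (including children not yet enrolled)). * Source data: Population and Housing Census (produced every five years).
Author: Incheon Metropolitan City (인천광역시)
URL: https://www.data.go.kr/data/15055008/fileData.do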

Alerts

2020 년 has 2970 (29.7%) missing values (Missing)

Reproduction

Analysis started: 2023-12-23 08:02:48.841437
Analysis finished: 2023-12-23 08:02:51.202188
Duration: 2.36 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

행정구역별(동읍면) (administrative district: dong, eup or myeon)
Text

Distinct: 169
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 7
Median length: 4
Mean length: 3.768
Min length: 2

Characters and Unicode

Total characters: 37680
Distinct characters: 117
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
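As an illustration of these character properties, the sketch below uses Python's standard `unicodedata` module to look up the general category of each character in `숭의1·3동`, a district name that appears in the report's sample rows. It assumes the middle dot in the data is U+00B7. The standard library exposes categories and names only; script and block lookups would need another data source.

```python
import unicodedata
from collections import Counter

# District name taken from the report's sample rows; the middle dot is
# assumed to be U+00B7 (MIDDLE DOT).
sample = "숭의1\u00b73동"

# General category per character:
#   "Lo" = Other Letter, "Nd" = Decimal Number, "Po" = Other Punctuation.
category_counts = Counter(unicodedata.category(ch) for ch in sample)

for ch in sample:
    print(f"U+{ord(ch):04X}  {unicodedata.category(ch)}  {unicodedata.name(ch)}")
print(dict(category_counts))
```

The three categories returned here (`Lo`, `Nd`, `Po`) correspond to the three categories counted for this variable.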

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 백령면
2nd row: 영종동
3rd row: 중구
4th row: 중구
5th row: 송림6동
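
| Value | Count | Frequency (%) |
|---|---|---|
| 가좌1동 | 73 | 0.7% |
| 주안6동 | 73 | 0.7% |
| 구월2동 | 72 | 0.7% |
| 교동면 | 71 | 0.7% |
| 남촌도림동 | 69 | 0.7% |
| 가좌3동 | 69 | 0.7% |
| 계산2동 | 69 | 0.7% |
| 삼산2동 | 68 | 0.7% |
| 중구 | 68 | 0.7% |
| 송월동 | 68 | 0.7% |
| Other values (159) | 9300 | 93.0% |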
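
The "Other values" row is consistent with a simple top-k aggregation: 169 distinct values minus the 10 listed leaves 159, and 10000 observations minus the 700 covered by the listed values leaves 9300. The following is a minimal sketch of that aggregation, not the profiling tool's own code; `top_k_with_other` and the toy counts are hypothetical.

```python
from collections import Counter


def top_k_with_other(counts, k=10):
    """Return the k most frequent values plus one aggregated 'Other values' row."""
    total = sum(counts.values())
    top = counts.most_common(k)
    rows = [(value, n, 100 * n / total) for value, n in top]
    n_other_values = len(counts) - len(top)
    if n_other_values > 0:
        # Everything not covered by the top-k rows is folded into one row.
        other_count = total - sum(n for _, n in top)
        label = f"Other values ({n_other_values})"
        rows.append((label, other_count, 100 * other_count / total))
    return rows


# Toy data (not the real dataset): four distinct values, keep the top two.
toy = Counter({"중구": 5, "강화읍": 3, "영종동": 1, "백령면": 1})
for value, n, pct in top_k_with_other(toy, k=2):
    print(f"{value}\t{n}\t{pct:.1f}%")
```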

Most occurring characters

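| Value | Count | Frequency (%) |
|---|---|---|
| (not captured) | 8454 | 22.4% |
| 2 | 1869 | 5.0% |
| 1 | 1843 | 4.9% |
| 3 | 1228 | 3.3% |
| (not captured) | 1209 | 3.2% |
| (not captured) | 813 | 2.2% |
| (not captured) | 760 | 2.0% |
| (not captured) | 756 | 2.0% |
| (not captured) | 754 | 2.0% |
| (not captured) | 745 | 2.0% |
| Other values (107) | 19249 | 51.1% |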

Most occurring categories

| Value | Count | Frequency (%) |
|---|---|---|
| Other Letter | 30966 | 82.2% |
| Decimal Number | 6364 | 16.9% |
| Other Punctuation | 350 | 0.9% |

Most frequent character per category

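Other Letter

| Value | Count | Frequency (%) |
|---|---|---|
| (not captured) | 8454 | 27.3% |
| (not captured) | 1209 | 3.9% |
| (not captured) | 813 | 2.6% |
| (not captured) | 760 | 2.5% |
| (not captured) | 756 | 2.4% |
| (not captured) | 754 | 2.4% |
| (not captured) | 745 | 2.4% |
| (not captured) | 709 | 2.3% |
| (not captured) | 584 | 1.9% |
| (not captured) | 545 | 1.8% |
| Other values (98) | 15637 | 50.5% |

Decimal Number

| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 1869 | 29.4% |
| 1 | 1843 | 29.0% |
| 3 | 1228 | 19.3% |
| 4 | 701 | 11.0% |
| 5 | 364 | 5.7% |
| 6 | 245 | 3.8% |
| 7 | 60 | 0.9% |
| 8 | 54 | 0.8% |

Other Punctuation

| Value | Count | Frequency (%) |
|---|---|---|
| · | 350 | 100.0% |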

Most occurring scripts

| Value | Count | Frequency (%) |
|---|---|---|
| Hangul | 30966 | 82.2% |
| Common | 6714 | 17.8% |

Most frequent character per script

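Hangul

| Value | Count | Frequency (%) |
|---|---|---|
| (not captured) | 8454 | 27.3% |
| (not captured) | 1209 | 3.9% |
| (not captured) | 813 | 2.6% |
| (not captured) | 760 | 2.5% |
| (not captured) | 756 | 2.4% |
| (not captured) | 754 | 2.4% |
| (not captured) | 745 | 2.4% |
| (not captured) | 709 | 2.3% |
| (not captured) | 584 | 1.9% |
| (not captured) | 545 | 1.8% |
| Other values (98) | 15637 | 50.5% |

Common

| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 1869 | 27.8% |
| 1 | 1843 | 27.5% |
| 3 | 1228 | 18.3% |
| 4 | 701 | 10.4% |
| 5 | 364 | 5.4% |
| · | 350 | 5.2% |
| 6 | 245 | 3.6% |
| 7 | 60 | 0.9% |
| 8 | 54 | 0.8% |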

Most occurring blocks

| Value | Count | Frequency (%) |
|---|---|---|
| Hangul | 30966 | 82.2% |
| ASCII | 6364 | 16.9% |
| None | 350 | 0.9% |

Most frequent character per block

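Hangul

| Value | Count | Frequency (%) |
|---|---|---|
| (not captured) | 8454 | 27.3% |
| (not captured) | 1209 | 3.9% |
| (not captured) | 813 | 2.6% |
| (not captured) | 760 | 2.5% |
| (not captured) | 756 | 2.4% |
| (not captured) | 754 | 2.4% |
| (not captured) | 745 | 2.4% |
| (not captured) | 709 | 2.3% |
| (not captured) | 584 | 1.9% |
| (not captured) | 545 | 1.8% |
| Other values (98) | 15637 | 50.5% |

ASCII

| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 1869 | 29.4% |
| 1 | 1843 | 29.0% |
| 3 | 1228 | 19.3% |
| 4 | 701 | 11.0% |
| 5 | 364 | 5.7% |
| 6 | 245 | 3.8% |
| 7 | 60 | 0.9% |
| 8 | 54 | 0.8% |

None

| Value | Count | Frequency (%) |
|---|---|---|
| · | 350 | 100.0% |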

성별 (sex)
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
여자 (female): 5021
남자 (male): 4979

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 남자
2nd row: 여자
3rd row: 남자
4th row: 여자
5th row: 여자

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 여자 | 5021 | 50.2% |
| 남자 | 4979 | 49.8% |

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: plot of common values

Words (whitespace-separated tokens)

| Value | Count | Frequency (%) |
|---|---|---|
| 여자 | 5021 | 50.2% |
| 남자 | 4979 | 49.8% |

연령별 (age group; 세 = years of age)
Categorical

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
60-69세: 1306
30-39세: 1262
70세 이상 (70 and over): 1256
40-49세: 1246
50-59세: 1244
Other values (3): 3686

Length

Max length: 6
Median length: 6
Mean length: 5.752
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 6-9세
2nd row: 20-29세
3rd row: 50-59세
4th row: 50-59세
5th row: 60-69세

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 60-69세 | 1306 | 13.1% |
| 30-39세 | 1262 | 12.6% |
| 70세 이상 | 1256 | 12.6% |
| 40-49세 | 1246 | 12.5% |
| 50-59세 | 1244 | 12.4% |
| 6-9세 | 1240 | 12.4% |
| 10-19세 | 1228 | 12.3% |
| 20-29세 | 1218 | 12.2% |

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: plot of common values

Words (whitespace-separated tokens)

| Value | Count | Frequency (%) |
|---|---|---|
| 60-69세 | 1306 | 11.6% |
| 30-39세 | 1262 | 11.2% |
| 70세 | 1256 | 11.2% |
| 이상 | 1256 | 11.2% |
| 40-49세 | 1246 | 11.1% |
| 50-59세 | 1244 | 11.1% |
| 6-9세 | 1240 | 11.0% |
| 10-19세 | 1228 | 10.9% |
| 20-29세 | 1218 | 10.8% |
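
The table above lists `70세` and `이상` as separate entries, which suggests that the category values were split into whitespace-separated tokens before counting, with percentages taken over the total number of tokens rather than rows. The sketch below reproduces the figures under that assumption, starting from the category counts reported for 연령별. It is an illustration only, not the profiling tool's implementation.

```python
from collections import Counter

# Category counts as reported for the 연령별 column.
value_counts = {
    "60-69세": 1306, "30-39세": 1262, "70세 이상": 1256, "40-49세": 1246,
    "50-59세": 1244, "6-9세": 1240, "10-19세": 1228, "20-29세": 1218,
}

# Split each value on whitespace and credit every token with the value's count.
token_counts = Counter()
for value, n in value_counts.items():
    for token in value.split():
        token_counts[token] += n

# Percentages are taken over all tokens, so they are lower than the
# row-based percentages of the category table.
total_tokens = sum(token_counts.values())
for token, n in token_counts.most_common():
    print(f"{token}\t{n}\t{100 * n / total_tokens:.1f}%")
```

With 1256 two-token values, the token total is 11256, so 60-69세 drops from 13.1% of rows to 11.6% of tokens.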

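교육정도별 (educational attainment)
Categorical

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
대학교(4년제 이상) (university, 4-year or longer): 1463
대학교(2,3년제) (college, 2- or 3-year): 1445
초등학교 (elementary school): 1430
고등학교 (high school): 1429
중학교 (middle school): 1426
Other values (2): 2807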

Length

Max length: 14
Median length: 11
Mean length: 8.1286
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 대학교(4년제 이상)
2nd row: 고등학교
3rd row: 받지 않았음(미취학 포함)
4th row: 대학교(2,3년제)
5th row: 대학원(석박사 과정)

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 대학교(4년제 이상) | 1463 | 14.6% |
| 대학교(2,3년제) | 1445 | 14.4% |
| 초등학교 | 1430 | 14.3% |
| 고등학교 | 1429 | 14.3% |
| 중학교 | 1426 | 14.3% |
| 대학원(석박사 과정) | 1423 | 14.2% |
| 받지 않았음(미취학 포함) | 1384 | 13.8% |

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: plot of common values

Words (whitespace-separated tokens)

| Value | Count | Frequency (%) |
|---|---|---|
| 대학교(4년제 | 1463 | 9.3% |
| 이상 | 1463 | 9.3% |
| 대학교(2,3년제 | 1445 | 9.2% |
| 초등학교 | 1430 | 9.1% |
| 고등학교 | 1429 | 9.1% |
| 중학교 | 1426 | 9.1% |
| 대학원(석박사 | 1423 | 9.1% |
| 과정 | 1423 | 9.1% |
| 받지 | 1384 | 8.8% |
| 않았음(미취학 | 1384 | 8.8% |

2020 년 (year 2020)
Real number (ℝ)

MISSING

Distinct: 1372
Distinct (%): 19.5%
Missing: 2970
Missing (%): 29.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 831.72589
Minimum: 1
Maximum: 159239
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 6
Q1: 31
median: 121
Q3: 365
95-th percentile: 1494.65
Maximum: 159239
Range: 159238
Interquartile range (IQR): 334

Descriptive statistics

Standard deviation: 5568.4627
Coefficient of variation (CV): 6.6950696
Kurtosis: 344.44581
Mean: 831.72589
Median Absolute Deviation (MAD): 107
Skewness: 16.531688
Sum: 5847033
Variance: 31007777
Monotonicity: Not monotonic
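
Several of the figures above are tied together by simple identities: the mean is the sum divided by the number of non-missing rows, the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the IQR and range are differences of quantiles. The sketch below rechecks them using only numbers copied from the report; the variable names are illustrative.

```python
# Values copied from the report for the "2020 년" column.
n_observations = 10_000
n_missing = 2_970
total = 5_847_033           # Sum
std = 5568.4627             # Standard deviation
q1, q3 = 31, 365
minimum, maximum = 1, 159_239

n_valid = n_observations - n_missing   # rows that actually hold a value
mean = total / n_valid                 # mean over non-missing rows only
cv = std / mean                        # coefficient of variation
variance = std ** 2
iqr = q3 - q1
value_range = maximum - minimum

print(f"n_valid={n_valid} mean={mean:.5f} cv={cv:.5f}")
print(f"variance={variance:.0f} iqr={iqr} range={value_range}")
```

The mean only matches the reported 831.72589 when the 2970 missing rows are excluded from the denominator, which is also why the distinct percentage (1372 of 7030, about 19.5%) is computed over non-missing rows.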
Figure: Histogram with fixed size bins (bins=50)
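
| Value | Count | Frequency (%) |
|---|---|---|
| 7 | 102 | 1.0% |
| 8 | 100 | 1.0% |
| 4 | 80 | 0.8% |
| 6 | 77 | 0.8% |
| 13 | 72 | 0.7% |
| 2 | 69 | 0.7% |
| 10 | 68 | 0.7% |
| 9 | 66 | 0.7% |
| 12 | 66 | 0.7% |
| 5 | 66 | 0.7% |
| Other values (1362) | 6264 | 62.6% |
| (Missing) | 2970 | 29.7% |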
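
Minimum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 51 | 0.5% |
| 2 | 69 | 0.7% |
| 3 | 62 | 0.6% |
| 4 | 80 | 0.8% |
| 5 | 66 | 0.7% |
| 6 | 77 | 0.8% |
| 7 | 102 | 1.0% |
| 8 | 100 | 1.0% |
| 9 | 66 | 0.7% |
| 10 | 68 | 0.7% |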
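
Maximum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 159239 | 1 | < 0.1% |
| 154893 | 1 | < 0.1% |
| 132978 | 1 | < 0.1% |
| 128364 | 1 | < 0.1% |
| 107602 | 1 | < 0.1% |
| 90342 | 1 | < 0.1% |
| 86622 | 1 | < 0.1% |
| 85060 | 1 | < 0.1% |
| 84395 | 1 | < 0.1% |
| 73890 | 1 | < 0.1% |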

Interactions


Correlations

| | 성별 | 연령별 | 교육정도별 | 2020 년 |
|---|---|---|---|---|
| 성별 | 1.000 | 0.000 | 0.000 | 0.015 |
| 연령별 | 0.000 | 1.000 | 0.000 | 0.050 |
| 교육정도별 | 0.000 | 0.000 | 1.000 | 0.048 |
| 2020 년 | 0.015 | 0.050 | 0.048 | 1.000 |

| | 교육정도별 | 성별 | 연령별 |
|---|---|---|---|
| 교육정도별 | 1.000 | 0.000 | 0.000 |
| 성별 | 0.000 | 1.000 | 0.000 |
| 연령별 | 0.000 | 0.000 | 1.000 |

| | 2020 년 | 성별 | 연령별 | 교육정도별 |
|---|---|---|---|---|
| 2020 년 | 1.000 | 0.015 | 0.024 | 0.025 |
| 성별 | 0.015 | 1.000 | 0.000 | 0.000 |
| 연령별 | 0.024 | 0.000 | 1.000 | 0.000 |
| 교육정도별 | 0.025 | 0.000 | 0.000 | 1.000 |

Missing values

Figure: A simple visualization of nullity by column.
Figure: The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

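| | 행정구역별(동읍면) | 성별 | 연령별 | 교육정도별 | 2020 년 |
|---|---|---|---|---|---|
| 18365 | 백령면 | 남자 | 6-9세 | 대학교(4년제 이상) | <NA> |
| 1752 | 영종동 | 여자 | 20-29세 | 고등학교 | 186 |
| 489 | 중구 | 남자 | 50-59세 | 받지 않았음(미취학 포함) | <NA> |
| 542 | 중구 | 여자 | 50-59세 | 대학교(2,3년제) | 893 |
| 3127 | 송림6동 | 여자 | 60-69세 | 대학원(석박사 과정) | 4 |
| 789 | 신흥동 | 남자 | 6-9세 | 대학원(석박사 과정) | <NA> |
| 4009 | 청학동 | 여자 | 40-49세 | 대학원(석박사 과정) | 38 |
| 7832 | 부평3동 | 여자 | 60-69세 | 받지 않았음(미취학 포함) | <NA> |
| 16229 | 숭의1·3동 | 여자 | 60-69세 | 대학교(2,3년제) | 51 |
| 16619 | 강화읍 | 남자 | 60-69세 | 중학교 | 358 |

| | 행정구역별(동읍면) | 성별 | 연령별 | 교육정도별 | 2020 년 |
|---|---|---|---|---|---|
| 12457 | 가좌1동 | 남자 | 30-39세 | 대학교(4년제 이상) | 160 |
| 7332 | 서창2동 | 남자 | 70세 이상 | 대학교(2,3년제) | 53 |
| 8401 | 산곡3동 | 남자 | 6-9세 | 중학교 | <NA> |
| 3357 | 연수구 | 여자 | 70세 이상 | 대학교(4년제 이상) | 467 |
| 9537 | 일신동 | 남자 | 20-29세 | 대학교(2,3년제) | 387 |
| 11644 | 검암경서동 | 여자 | 70세 이상 | 대학교(2,3년제) | 1 |
| 16597 | 강화읍 | 남자 | 30-39세 | 초등학교 | <NA> |
| 11775 | 가정1동 | 남자 | 20-29세 | 중학교 | 12 |
| 3781 | 연수2동 | 여자 | 40-49세 | 중학교 | 18 |
| 9733 | 십정1동 | 여자 | 60-69세 | 대학교(2,3년제) | 29 |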