Overview

Dataset statistics

Number of variables: 18
Number of observations: 36
Missing cells: 142
Missing cells (%): 21.9%
Duplicate rows: 7
Duplicate rows (%): 19.4%
Total size in memory: 5.2 KiB
Average record size in memory: 147.7 B
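
The two percentages follow directly from the raw counts. A minimal sketch of the arithmetic, assuming missing cells are measured against all variables × observations and duplicates against all observations (the variable names are illustrative, not part of the report):

```python
# Counts copied from the dataset statistics above.
n_variables = 18
n_observations = 36
missing_cells = 142
duplicate_rows = 7

total_cells = n_variables * n_observations            # every cell in the table
missing_pct = 100 * missing_cells / total_cells       # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_observations  # share of duplicated rows

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
```

Rounded to one decimal place, both values agree with the figures reported above.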

Variable types

Text: 4
Unsupported: 14

Dataset

Description: 2022-02-23
Author: 주민등록인구통계 (Resident Registration Population Statistics)
URL: https://bigdata.gwangju.go.kr/usr/dataSet/getDataDetailView.rd?dataSetUncd=DS000201926

Alerts

Dataset has 7 (19.4%) duplicate rows [Duplicates]
인구이동보고서(1호) has 20 (55.6%) missing values [Missing]
Unnamed: 1 has 25 (69.4%) missing values [Missing]
Unnamed: 2 has 24 (66.7%) missing values [Missing]
Unnamed: 3 has 32 (88.9%) missing values [Missing]
Unnamed: 4 has 3 (8.3%) missing values [Missing]
Unnamed: 5 has 3 (8.3%) missing values [Missing]
Unnamed: 6 has 3 (8.3%) missing values [Missing]
Unnamed: 7 has 3 (8.3%) missing values [Missing]
Unnamed: 8 has 3 (8.3%) missing values [Missing]
Unnamed: 9 has 2 (5.6%) missing values [Missing]
Unnamed: 10 has 3 (8.3%) missing values [Missing]
Unnamed: 11 has 3 (8.3%) missing values [Missing]
Unnamed: 12 has 3 (8.3%) missing values [Missing]
Unnamed: 13 has 3 (8.3%) missing values [Missing]
Unnamed: 14 has 3 (8.3%) missing values [Missing]
Unnamed: 15 has 3 (8.3%) missing values [Missing]
Unnamed: 16 has 3 (8.3%) missing values [Missing]
Unnamed: 17 has 3 (8.3%) missing values [Missing]
Unnamed: 4 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 5 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 6 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 7 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 8 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 9 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 10 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 11 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 12 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 13 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 14 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 15 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 16 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 17 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
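
The report does not say why the fourteen value columns are flagged as unsupported. One plausible cause, stated here as an assumption, is that each column mixes header text with integer counts, as the sample rows of the report suggest. The sketch below uses a hypothetical helper, `column_types`, to detect such a mix in the first cells of `Unnamed: 4`:

```python
import math

def column_types(values):
    """Return the set of Python type names among the non-missing values."""
    names = set()
    for v in values:
        if v is None or (isinstance(v, float) and math.isnan(v)):
            continue  # skip missing cells
        names.add(type(v).__name__)
    return names

# First cells of "Unnamed: 4" as read from the report's sample rows:
# three missing cells, one text header, then integer counts.
unnamed_4 = [float("nan"), float("nan"), float("nan"), "합 계", 52362, 103470, 882, 96]

types = column_types(unnamed_4)
is_mixed = len(types) > 1   # more than one type in the same column
print(types, is_mixed)
```

If this is indeed the cause, promoting the header row before profiling would leave purely numeric columns.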

Reproduction

Analysis started: 2024-02-10 10:01:53.168403
Analysis finished: 2024-02-10 10:01:56.067887
Duration: 2.9 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

인구이동보고서(1호)
Text

MISSING

Distinct: 16
Distinct (%): 100.0%
Missing: 20
Missing (%): 55.6%
Memory size: 420.0 B

Length

Max length: 11
Median length: 10
Mean length: 7.75
Min length: 5

Characters and Unicode

Total characters: 124
Distinct characters: 42
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
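
As an illustration of these character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own implementation, and it covers only three of this variable's values. The two-letter codes map to the names used in this report: `Lo` is Other Letter, `Zs` is Space Separator, `Po` is Other Punctuation, `Ps` is Open Punctuation and `Pe` is Close Punctuation.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over all characters in `values`."""
    counts = Counter()
    for text in values:
        for ch in text:
            counts[unicodedata.category(ch)] += 1
    return counts

# Three values of this variable, typed here with plain spaces.
sample = ["행정기관 :", "작성기준 :", "시 군 구(읍면동)"]
counts = category_counts(sample)
print(counts)
```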

Unique

Unique: 16
Unique (%): 100.0%

Sample

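1st row: 행정기관 : (Administrative agency)
2nd row: 작성기준 : (Reference period)
3rd row: 시 군 구(읍면동) (City / county / district (eup, myeon, dong))
4th row: 전월말세대수 (Households at end of previous month)
5th row: 전월말인구수 (Population at end of previous month)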
Value | Count | Frequency (%)
(not rendered) | 2 | 7.7%
(not rendered) | 2 | 7.7%
(not rendered) | 2 | 7.7%
행정기관 | 1 | 3.8%
금월말거주불명자수 | 1 | 3.8%
금월말인구수 | 1 | 3.8%
금월말세대수 | 1 | 3.8%
거주불명자수증감 | 1 | 3.8%
인구수증감 | 1 | 3.8%
세대수증감 | 1 | 3.8%
Other values (13) | 13 | 50.0%

Most occurring characters

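Value | Count | Frequency (%)
(control character) | 12 | 9.7%
(not rendered) | 11 | 8.9%
(not rendered) | 8 | 6.5%
(not rendered) | 8 | 6.5%
(not rendered) | 5 | 4.0%
(not rendered) | 5 | 4.0%
(not rendered) | 4 | 3.2%
(not rendered) | 4 | 3.2%
(not rendered) | 4 | 3.2%
(not rendered) | 4 | 3.2%
Other values (32) | 59 | 47.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 104 | 83.9%
Control | 12 | 9.7%
Space Separator | 4 | 3.2%
Other Punctuation | 2 | 1.6%
Close Punctuation | 1 | 0.8%
Open Punctuation | 1 | 0.8%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not rendered) | 11 | 10.6%
(not rendered) | 8 | 7.7%
(not rendered) | 8 | 7.7%
(not rendered) | 5 | 4.8%
(not rendered) | 5 | 4.8%
(not rendered) | 4 | 3.8%
(not rendered) | 4 | 3.8%
(not rendered) | 4 | 3.8%
(not rendered) | 4 | 3.8%
(not rendered) | 4 | 3.8%
Other values (27) | 47 | 45.2%
Control
Value | Count | Frequency (%)
(control character) | 12 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 4 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
: | 2 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 1 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 104 | 83.9%
Common | 20 | 16.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not rendered) | 11 | 10.6%
(not rendered) | 8 | 7.7%
(not rendered) | 8 | 7.7%
(not rendered) | 5 | 4.8%
(not rendered) | 5 | 4.8%
(not rendered) | 4 | 3.8%
(not rendered) | 4 | 3.8%
(not rendered) | 4 | 3.8%
(not rendered) | 4 | 3.8%
(not rendered) | 4 | 3.8%
Other values (27) | 47 | 45.2%
Common
Value | Count | Frequency (%)
(control character) | 12 | 60.0%
(space) | 4 | 20.0%
: | 2 | 10.0%
) | 1 | 5.0%
( | 1 | 5.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 104 | 83.9%
ASCII | 20 | 16.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(control character) | 12 | 60.0%
(space) | 4 | 20.0%
: | 2 | 10.0%
) | 1 | 5.0%
( | 1 | 5.0%
Hangul
Value | Count | Frequency (%)
(not rendered) | 11 | 10.6%
(not rendered) | 8 | 7.7%
(not rendered) | 8 | 7.7%
(not rendered) | 5 | 4.8%
(not rendered) | 5 | 4.8%
(not rendered) | 4 | 3.8%
(not rendered) | 4 | 3.8%
(not rendered) | 4 | 3.8%
(not rendered) | 4 | 3.8%
(not rendered) | 4 | 3.8%
Other values (27) | 47 | 45.2%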

Unnamed: 1
Text

MISSING 

Distinct: 9
Distinct (%): 81.8%
Missing: 25
Missing (%): 69.4%
Memory size: 420.0 B

Length

Max length: 4
Median length: 2
Mean length: 2.3636364
Min length: 2

Characters and Unicode

Total characters: 26
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7
Unique (%): 63.6%

Sample

1st row: 전 입 (Move-in)
2nd row: 복귀 (Return)
3rd row: 출생 (Birth)
4th row: 등록 (Registration)
5th row: 국외 (Overseas)
Value | Count | Frequency (%)
국외 | 2 | 15.4%
기타 | 2 | 15.4%
(not rendered) | 2 | 15.4%
(not rendered) | 1 | 7.7%
복귀 | 1 | 7.7%
출생 | 1 | 7.7%
등록 | 1 | 7.7%
(not rendered) | 1 | 7.7%
사망 | 1 | 7.7%
말소 | 1 | 7.7%

Most occurring characters

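Value | Count | Frequency (%)
(control character) | 4 | 15.4%
(not rendered) | 2 | 7.7%
(not rendered) | 2 | 7.7%
(not rendered) | 2 | 7.7%
(not rendered) | 2 | 7.7%
(not rendered) | 2 | 7.7%
(not rendered) | 2 | 7.7%
(not rendered) | 1 | 3.8%
(not rendered) | 1 | 3.8%
(not rendered) | 1 | 3.8%
Other values (7) | 7 | 26.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 22 | 84.6%
Control | 4 | 15.4%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not rendered) | 2 | 9.1%
(not rendered) | 2 | 9.1%
(not rendered) | 2 | 9.1%
(not rendered) | 2 | 9.1%
(not rendered) | 2 | 9.1%
(not rendered) | 2 | 9.1%
(not rendered) | 1 | 4.5%
(not rendered) | 1 | 4.5%
(not rendered) | 1 | 4.5%
(not rendered) | 1 | 4.5%
Other values (6) | 6 | 27.3%
Control
Value | Count | Frequency (%)
(control character) | 4 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 22 | 84.6%
Common | 4 | 15.4%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not rendered) | 2 | 9.1%
(not rendered) | 2 | 9.1%
(not rendered) | 2 | 9.1%
(not rendered) | 2 | 9.1%
(not rendered) | 2 | 9.1%
(not rendered) | 2 | 9.1%
(not rendered) | 1 | 4.5%
(not rendered) | 1 | 4.5%
(not rendered) | 1 | 4.5%
(not rendered) | 1 | 4.5%
Other values (6) | 6 | 27.3%
Common
Value | Count | Frequency (%)
(control character) | 4 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 22 | 84.6%
ASCII | 4 | 15.4%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(control character) | 4 | 100.0%
Hangul
Value | Count | Frequency (%)
(not rendered) | 2 | 9.1%
(not rendered) | 2 | 9.1%
(not rendered) | 2 | 9.1%
(not rendered) | 2 | 9.1%
(not rendered) | 2 | 9.1%
(not rendered) | 2 | 9.1%
(not rendered) | 1 | 4.5%
(not rendered) | 1 | 4.5%
(not rendered) | 1 | 4.5%
(not rendered) | 1 | 4.5%
Other values (6) | 6 | 27.3%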

Unnamed: 2
Text

MISSING 

Distinct: 7
Distinct (%): 58.3%
Missing: 24
Missing (%): 66.7%
Memory size: 420.0 B

Length

Max length: 10
Median length: 9
Mean length: 3.4166667
Min length: 1

Characters and Unicode

Total characters: 41
Distinct characters: 20
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
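
Block membership can be approximated from code point ranges alone. The sketch below is a simplification rather than the profiler's method: it assumes the report's "Hangul" label corresponds to the Hangul Syllables block (U+AC00 to U+D7AF) and "ASCII" to Basic Latin (U+0000 to U+007F). `block_of` is a hypothetical helper.

```python
def block_of(ch):
    """Classify a character into one of two Unicode blocks by code point range."""
    cp = ord(ch)
    if cp <= 0x7F:
        return "ASCII"    # Basic Latin: U+0000..U+007F
    if 0xAC00 <= cp <= 0xD7AF:
        return "Hangul"   # Hangul Syllables: U+AC00..U+D7AF
    return "Other"

# One value of this variable: seven ASCII characters, a space, two Hangul syllables.
value = "2022.01 현재"
blocks = {}
for ch in value:
    name = block_of(ch)
    blocks[name] = blocks.get(name, 0) + 1
print(blocks)
```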

Unique

Unique: 2
Unique (%): 16.7%

Sample

1st row: 광주광역시 동구 (Dong-gu, Gwangju Metropolitan City)
2nd row: 2022.01 현재 (as of 2022.01)
3rd row: (not rendered)
4th row: 남자 (Male)
5th row: 여자 (Female)
Value | Count | Frequency (%)
(not rendered) | 2 | 14.3%
남자 | 2 | 14.3%
여자 | 2 | 14.3%
시도내 | 2 | 14.3%
시도간 | 2 | 14.3%
광주광역시 | 1 | 7.1%
동구 | 1 | 7.1%
2022.01 | 1 | 7.1%
현재 | 1 | 7.1%

Most occurring characters

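Value | Count | Frequency (%)
(not rendered) | 5 | 12.2%
(not rendered) | 4 | 9.8%
(not rendered) | 4 | 9.8%
(space) | 3 | 7.3%
2 | 3 | 7.3%
(not rendered) | 2 | 4.9%
0 | 2 | 4.9%
(not rendered) | 2 | 4.9%
(not rendered) | 2 | 4.9%
(not rendered) | 2 | 4.9%
Other values (10) | 12 | 29.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 31 | 75.6%
Decimal Number | 6 | 14.6%
Space Separator | 3 | 7.3%
Other Punctuation | 1 | 2.4%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not rendered) | 5 | 16.1%
(not rendered) | 4 | 12.9%
(not rendered) | 4 | 12.9%
(not rendered) | 2 | 6.5%
(not rendered) | 2 | 6.5%
(not rendered) | 2 | 6.5%
(not rendered) | 2 | 6.5%
(not rendered) | 2 | 6.5%
(not rendered) | 2 | 6.5%
(not rendered) | 1 | 3.2%
Other values (5) | 5 | 16.1%
Decimal Number
Value | Count | Frequency (%)
2 | 3 | 50.0%
0 | 2 | 33.3%
1 | 1 | 16.7%
Space Separator
Value | Count | Frequency (%)
(space) | 3 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
. | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 31 | 75.6%
Common | 10 | 24.4%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not rendered) | 5 | 16.1%
(not rendered) | 4 | 12.9%
(not rendered) | 4 | 12.9%
(not rendered) | 2 | 6.5%
(not rendered) | 2 | 6.5%
(not rendered) | 2 | 6.5%
(not rendered) | 2 | 6.5%
(not rendered) | 2 | 6.5%
(not rendered) | 2 | 6.5%
(not rendered) | 1 | 3.2%
Other values (5) | 5 | 16.1%
Common
Value | Count | Frequency (%)
(space) | 3 | 30.0%
2 | 3 | 30.0%
0 | 2 | 20.0%
1 | 1 | 10.0%
. | 1 | 10.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 31 | 75.6%
ASCII | 10 | 24.4%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not rendered) | 5 | 16.1%
(not rendered) | 4 | 12.9%
(not rendered) | 4 | 12.9%
(not rendered) | 2 | 6.5%
(not rendered) | 2 | 6.5%
(not rendered) | 2 | 6.5%
(not rendered) | 2 | 6.5%
(not rendered) | 2 | 6.5%
(not rendered) | 2 | 6.5%
(not rendered) | 1 | 3.2%
Other values (5) | 5 | 16.1%
ASCII
Value | Count | Frequency (%)
(space) | 3 | 30.0%
2 | 3 | 30.0%
0 | 2 | 20.0%
1 | 1 | 10.0%
. | 1 | 10.0%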

Unnamed: 3
Text

MISSING 

Distinct: 2
Distinct (%): 50.0%
Missing: 32
Missing (%): 88.9%
Memory size: 420.0 B

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 16
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%
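
"Distinct" and "Unique" measure different things: the first counts different values, the second counts values that occur exactly once. A short sketch over the four non-missing values reported for this variable, assuming both percentages are taken over the non-missing count:

```python
from collections import Counter

# The four non-missing values of "Unnamed: 3".
values = ["시군구내", "시군구간", "시군구내", "시군구간"]

counts = Counter(values)
distinct = len(counts)                               # number of different values
unique = sum(1 for c in counts.values() if c == 1)   # values seen exactly once
distinct_pct = 100 * distinct / len(values)
unique_pct = 100 * unique / len(values)
print(distinct, unique, distinct_pct, unique_pct)
```

Both values occur twice, so the variable has two distinct values and no unique ones.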

Sample

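1st row: 시군구내 (Within city/county/district)
2nd row: 시군구간 (Between cities/counties/districts)
3rd row: 시군구내 (Within city/county/district)
4th row: 시군구간 (Between cities/counties/districts)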
Value | Count | Frequency (%)
시군구내 | 2 | 50.0%
시군구간 | 2 | 50.0%

Most occurring characters

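Value | Count | Frequency (%)
(not rendered) | 4 | 25.0%
(not rendered) | 4 | 25.0%
(not rendered) | 4 | 25.0%
(not rendered) | 2 | 12.5%
(not rendered) | 2 | 12.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 16 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not rendered) | 4 | 25.0%
(not rendered) | 4 | 25.0%
(not rendered) | 4 | 25.0%
(not rendered) | 2 | 12.5%
(not rendered) | 2 | 12.5%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 16 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not rendered) | 4 | 25.0%
(not rendered) | 4 | 25.0%
(not rendered) | 4 | 25.0%
(not rendered) | 2 | 12.5%
(not rendered) | 2 | 12.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 16 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not rendered) | 4 | 25.0%
(not rendered) | 4 | 25.0%
(not rendered) | 4 | 25.0%
(not rendered) | 2 | 12.5%
(not rendered) | 2 | 12.5%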

Unnamed: 4
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 3
Missing (%): 8.3%
Memory size: 420.0 B

Unnamed: 5
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 3
Missing (%): 8.3%
Memory size: 420.0 B

Unnamed: 6
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 3
Missing (%): 8.3%
Memory size: 420.0 B

Unnamed: 7
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 3
Missing (%): 8.3%
Memory size: 420.0 B

Unnamed: 8
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 3
Missing (%): 8.3%
Memory size: 420.0 B

Unnamed: 9
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 2
Missing (%): 5.6%
Memory size: 420.0 B

Unnamed: 10
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 3
Missing (%): 8.3%
Memory size: 420.0 B

Unnamed: 11
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 3
Missing (%): 8.3%
Memory size: 420.0 B

Unnamed: 12
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 3
Missing (%): 8.3%
Memory size: 420.0 B

Unnamed: 13
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 3
Missing (%): 8.3%
Memory size: 420.0 B

Unnamed: 14
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 3
Missing (%): 8.3%
Memory size: 420.0 B

Unnamed: 15
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 3
Missing (%): 8.3%
Memory size: 420.0 B

Unnamed: 16
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 3
Missing (%): 8.3%
Memory size: 420.0 B

Unnamed: 17
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 3
Missing (%): 8.3%
Memory size: 420.0 B

Correlations

 | 인구이동보고서(1호) | Unnamed: 1 | Unnamed: 2 | Unnamed: 3
인구이동보고서(1호) | 1.000 | 0.000 | 1.000 | NaN
Unnamed: 1 | 0.000 | 1.000 | NaN | NaN
Unnamed: 2 | 1.000 | NaN | 1.000 | NaN
Unnamed: 3 | NaN | NaN | NaN | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
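
A common way to compute nullity correlation is the Pearson correlation between the missing-value indicators of two columns. The sketch below takes that approach as an assumption, since the report does not document its exact formula. The data and the helper `nullity_correlation` are hypothetical and not taken from this dataset.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # a column that is never (or always) missing has no variance
    return cov / sqrt(var_a * var_b)

# Hypothetical columns; None marks a missing cell.
left = ["a", None, "b", None, "c", None]
right = [None, 1, None, 2, None, 3]   # missing exactly where `left` is present
same = [1, None, 2, None, 3, None]    # missing exactly where `left` is missing

opposite_corr = nullity_correlation(left, right)
same_corr = nullity_correlation(left, same)
print(opposite_corr, same_corr)
```

A value near 1 means two columns tend to be missing together, and a value near -1 means one is present when the other is missing.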

Sample

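Index | 인구이동보고서(1호) | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | Unnamed: 11 | Unnamed: 12 | Unnamed: 13 | Unnamed: 14 | Unnamed: 15 | Unnamed: 16 | Unnamed: 17
0 | <NA> | <NA> | <NA> | <NA> | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
1 | 행정기관 : | <NA> | 광주광역시 동구 | <NA> | NaN | NaN | NaN | NaN | NaN | 출력일자 : 2022.02.08 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
2 | 작성기준 : | <NA> | 2022.01 현재 | <NA> | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
3 | 시 군 구(읍면동) | <NA> | <NA> | <NA> | 합 계 | 충장동 | 동명동 | 계림1동 | 계림2동 | 산수1동 | 산수2동 | 지산1동 | 지산2동 | 서남동 | 학동 | 학운동 | 지원1동 | 지원2동
4 | 전월말세대수 | <NA> | <NA> | <NA> | 52362 | 3522 | 2497 | 6016 | 4151 | 4503 | 4865 | 2429 | 2438 | 2227 | 3625 | 5234 | 3651 | 7204
5 | 전월말인구수 | <NA> | <NA> | <NA> | 103470 | 4642 | 3890 | 11009 | 9818 | 8579 | 10501 | 4249 | 4567 | 3056 | 7724 | 11445 | 7833 | 16157
6 | 전월말거주불명자수 | <NA> | <NA> | <NA> | 882 | 83 | 114 | 93 | 45 | 68 | 37 | 41 | 42 | 64 | 81 | 62 | 24 | 128
7 | 전월말재외국민등록자수 | <NA> | <NA> | <NA> | 96 | 8 | 6 | 12 | 12 | 6 | 4 | 2 | 7 | 3 | 8 | 13 | 6 | 9
8 | 증 가 요 인 | 전 입 | (not rendered) | <NA> | 1386 | 191 | 81 | 119 | 97 | 130 | 120 | 67 | 58 | 53 | 104 | 106 | 81 | 179
9 | <NA> | <NA> | 남자 | <NA> | 682 | 94 | 43 | 54 | 44 | 64 | 58 | 36 | 34 | 26 | 54 | 56 | 39 | 80

Index | 인구이동보고서(1호) | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | Unnamed: 11 | Unnamed: 12 | Unnamed: 13 | Unnamed: 14 | Unnamed: 15 | Unnamed: 16 | Unnamed: 17
26 | <NA> | 말소 | <NA> | <NA> | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1
27 | <NA> | 국외 | <NA> | <NA> | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
28 | <NA> | 기타 | <NA> | <NA> | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
29 | 세대수증감 | <NA> | <NA> | <NA> | 74 | 100 | 7 | -13 | 4 | 14 | 2 | 3 | -21 | -48 | 4 | 14 | 7 | 1
30 | 인구수증감 | <NA> | <NA> | <NA> | -4 | 89 | 5 | -11 | -18 | 43 | 12 | -16 | -52 | -70 | -4 | 9 | 12 | -3
31 | 거주불명자수증감 | <NA> | <NA> | <NA> | -10 | -1 | -2 | -1 | 0 | -1 | -1 | 0 | 0 | 0 | -2 | 1 | 0 | -3
32 | 금월말세대수 | <NA> | <NA> | <NA> | 52436 | 3622 | 2504 | 6003 | 4155 | 4517 | 4867 | 2432 | 2417 | 2179 | 3629 | 5248 | 3658 | 7205
33 | 금월말인구수 | <NA> | <NA> | <NA> | 103466 | 4731 | 3895 | 10998 | 9800 | 8622 | 10513 | 4233 | 4515 | 2986 | 7720 | 11454 | 7845 | 16154
34 | 금월말거주불명자수 | <NA> | <NA> | <NA> | 872 | 82 | 112 | 92 | 45 | 67 | 36 | 41 | 42 | 64 | 79 | 63 | 24 | 125
35 | 금월말재외국민등록자수 | <NA> | <NA> | <NA> | 95 | 8 | 6 | 12 | 12 | 6 | 4 | 2 | 6 | 3 | 8 | 13 | 6 | 9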
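
The sample shows a report-style layout: a few title rows, then a header row (index 3) that names the districts, then the data. One way to make such a table analysable is to promote the first fully populated row to a header. The sketch below is only an illustration of that idea on a few cells read from the sample (the label column and the first three value columns); `promote_header` is a hypothetical helper.

```python
# Label column plus the first three value columns of rows 0-5; None marks a missing cell.
raw_rows = [
    [None, None, None, None],
    ["행정기관 :", None, None, None],
    ["작성기준 :", None, None, None],
    ["시 군 구(읍면동)", "합 계", "충장동", "동명동"],
    ["전월말세대수", 52362, 3522, 2497],
    ["전월말인구수", 103470, 4642, 3890],
]

def promote_header(rows):
    """Use the first fully populated row as the header; return the rest as dict records."""
    for i, row in enumerate(rows):
        if all(cell is not None for cell in row):
            header = row
            return [dict(zip(header, r)) for r in rows[i + 1:]]
    raise ValueError("no fully populated row found")

records = promote_header(raw_rows)
print(records[0]["충장동"])
```

After this step each record maps a district name to a count, so the value columns hold numbers only.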

Duplicate rows

Most frequently occurring

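Index | 인구이동보고서(1호) | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | # duplicates
0 | <NA> | 국외 | <NA> | <NA> | 2
1 | <NA> | 기타 | <NA> | <NA> | 2
2 | <NA> | <NA> | 남자 | <NA> | 2
3 | <NA> | <NA> | 시도간 | <NA> | 2
4 | <NA> | <NA> | 시도내 | 시군구내 | 2
5 | <NA> | <NA> | 여자 | <NA> | 2
6 | <NA> | <NA> | <NA> | 시군구간 | 2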
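
The table lists only the four text columns. That suggests the duplicates were detected on those columns alone, with the fourteen unsupported columns ignored, though this is an inference and the report does not state it. Under that assumption, rows with different counts can still be flagged as duplicates. A sketch of the key-counting logic, using the seven keys above plus one non-repeated row:

```python
from collections import Counter

# Keys are (인구이동보고서(1호), Unnamed: 1, Unnamed: 2, Unnamed: 3); None marks a missing cell.
duplicate_keys = [
    (None, "국외", None, None),
    (None, "기타", None, None),
    (None, None, "남자", None),
    (None, None, "시도간", None),
    (None, None, "시도내", "시군구내"),
    (None, None, "여자", None),
    (None, None, None, "시군구간"),
]
# Each key above occurs twice in the data; the last key occurs once.
keys = duplicate_keys * 2 + [("전월말세대수", None, None, None)]

counts = Counter(keys)
duplicates = {key: n for key, n in counts.items() if n > 1}
print(len(duplicates))
```

If duplicates are an artifact of the ignored columns, they should disappear once the value columns are cleaned and included in the comparison.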