Overview

Dataset statistics

Number of variables: 16
Number of observations: 10000
Missing cells: 18145
Missing cells (%): 11.3%
Duplicate rows: 80
Duplicate rows (%): 0.8%
Total size in memory: 1.3 MiB
Average record size in memory: 137.0 B
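
The percentages in these statistics can be cross-checked from the counts. The sketch below assumes that "Missing cells (%)" is the number of missing cells divided by the total number of cells (variables × observations), and that the per-variable missing counts are the ones listed later in the Variables section. Attributing the zero-missing count to 개표진행상황(투표구별) and the count of 2 to Unnamed: 2 is an inference from the column order, not something the report states.

```python
# Counts copied from the report.
n_variables = 16
n_observations = 10_000
duplicate_rows = 80

# Per-variable missing counts from the Variables section.
# Assumption: the categorical column is 개표진행상황(투표구별) (0 missing)
# and the first text column is "Unnamed: 2" (2 missing).
missing_per_variable = {
    "Unnamed: 0": 10_000,
    "개표진행상황(투표구별)": 0,
    "Unnamed: 2": 2,
    "Unnamed: 3": 7_492,
    "Unnamed: 4": 630,
}
# Unnamed: 5 to Unnamed: 15 each report 2 missing values, except Unnamed: 7.
missing_per_variable.update({f"Unnamed: {i}": 2 for i in range(5, 16)})
missing_per_variable["Unnamed: 7"] = 1

missing_cells = sum(missing_per_variable.values())
missing_pct = 100 * missing_cells / (n_variables * n_observations)
duplicate_pct = 100 * duplicate_rows / n_observations

print(missing_cells)            # 18145
print(f"{missing_pct:.1f}%")    # 11.3%
print(f"{duplicate_pct:.1f}%")  # 0.8%
```

The per-variable counts sum to the reported 18145 missing cells, so the overview figures and the variable-level figures are consistent with each other.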

Variable types

Unsupported: 12
Categorical: 1
Text: 3

Dataset

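Description: Vote-count results for the presidential election. The data can be queried nationwide by province or metropolitan city (sido), district (gu/si/gun), subdistrict (eup/myeon/dong), and polling precinct.
URL: https://www.data.go.kr/data/15025528/fileData.do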

Alerts

Dataset has 80 (0.8%) duplicate rows [Duplicates]
Unnamed: 0 has 10000 (100.0%) missing values [Missing]
Unnamed: 3 has 7492 (74.9%) missing values [Missing]
Unnamed: 4 has 630 (6.3%) missing values [Missing]
Unnamed: 0 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 5 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 6 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 7 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 8 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 9 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 10 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 11 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 12 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 13 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 14 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 15 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
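
The alerts ask whether the unsupported columns need cleaning. One possible cleaning pass is sketched below in plain Python, under two assumptions that the report does not confirm: that the unsupported columns hold vote counts mixed with non-numeric cells (such as embedded header text), and that an all-missing column like Unnamed: 0 can simply be dropped. The helper names, the thousands-separator handling, and the example values are all hypothetical.

```python
def to_int_or_none(value):
    """Return value as an int, or None when it is missing or not numeric."""
    if value is None:
        return None
    try:
        # Assumption: numbers may arrive as strings with thousands separators.
        return int(str(value).replace(",", "").strip())
    except ValueError:
        return None


def clean_table(table, numeric_columns):
    """Drop all-missing columns and coerce the named columns to integers.

    `table` maps a column name to its list of cell values.
    """
    cleaned = {}
    for name, values in table.items():
        if all(v is None for v in values):
            continue  # e.g. a column that is 100% missing
        if name in numeric_columns:
            values = [to_int_or_none(v) for v in values]
        cleaned[name] = values
    return cleaned


# Hypothetical raw data: an empty column and a mixed-type count column.
raw = {
    "Unnamed: 0": [None, None, None],
    "Unnamed: 5": ["선거인수", 3795, "1249"],
}
cleaned = clean_table(raw, numeric_columns={"Unnamed: 5"})
print(cleaned)
```

After a pass like this, the count columns hold only integers or missing values, which a profiler would be expected to treat as numeric rather than unsupported.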

Reproduction

Analysis started: 2023-12-12 04:30:58.339502
Analysis finished: 2023-12-12 04:31:00.059165
Duration: 1.72 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
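
A report of this kind could be regenerated roughly as follows. This is a sketch only: it assumes the source file is a CSV readable by pandas, and the file paths and report title are hypothetical. The exact configuration used for this report is in config.json and is not reproduced here.

```python
# Dataset metadata matching the "Dataset" section of the report.
DATASET_META = {
    "description": (
        "Presidential election vote-count results by sido, gu/si/gun, "
        "eup/myeon/dong, and polling precinct."
    ),
    "url": "https://www.data.go.kr/data/15025528/fileData.do",
}


def build_report(csv_path, out_path="report.html"):
    """Profile a CSV file and write an HTML report (paths are hypothetical)."""
    # Imported lazily so this module loads even where the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(
        df,
        title="Election results profile",  # hypothetical title
        dataset=DATASET_META,
    )
    profile.to_file(out_path)
    return out_path
```

Calling `build_report("election_results.csv")` with a real file path would write the HTML report next to the script. Column names such as "Unnamed: 0" come from pandas assigning placeholder names to blank header cells.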

Variables

Unnamed: 0
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 10000
Missing (%): 100.0%
Memory size: 166.0 KiB

개표진행상황(투표구별)
Categorical

Distinct: 20
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

경기도  1936
서울특별시  1547
경상북도  755
경상남도  729
전라남도  692
Other values (15)  4341

Length

Max length: 7
Median length: 5
Mean length: 4.1902
Min length: 2

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 경기도
2nd row: 전라남도
3rd row: 서울특별시
4th row: 경기도
5th row: 경상북도

Common Values

Value  Count  Frequency (%)
경기도  1936  19.4%
서울특별시  1547  15.5%
경상북도  755  7.5%
경상남도  729  7.3%
전라남도  692  6.9%
부산광역시  631  6.3%
충청남도  542  5.4%
전라북도  507  5.1%
강원도  493  4.9%
인천광역시  487  4.9%
Other values (10)  1681  16.8%

Length

Histogram of lengths of the category

Unnamed: 2
Text

Distinct: 230
Distinct (%): 2.3%
Missing: 2
Missing (%): < 0.1%
Memory size: 156.2 KiB

Length

Max length: 8
Median length: 3
Mean length: 3.4116823
Min length: 2

Characters and Unicode

Total characters: 34110
Distinct characters: 145
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 고양시일산서구
2nd row: 진도군
3rd row: 강동구
4th row: 양평군
5th row: 칠곡군

Value  Count  Frequency (%)
서구  279  2.8%
남구  239  2.4%
동구  225  2.3%
북구  220  2.2%
중구  184  1.8%
강서구  111  1.1%
송파구  98  1.0%
제주시  95  1.0%
달서구  89  0.9%
파주시  83  0.8%
Other values (220)  8375  83.8%

Most occurring characters

ValueCountFrequency (%)
5326
 
15.6%
4497
 
13.2%
2023
 
5.9%
1042
 
3.1%
1014
 
3.0%
919
 
2.7%
886
 
2.6%
843
 
2.5%
822
 
2.4%
806
 
2.4%
Other values (135) 15932
46.7%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  34110  100.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5326
 
15.6%
4497
 
13.2%
2023
 
5.9%
1042
 
3.1%
1014
 
3.0%
919
 
2.7%
886
 
2.6%
843
 
2.5%
822
 
2.4%
806
 
2.4%
Other values (135) 15932
46.7%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  34110  100.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5326
 
15.6%
4497
 
13.2%
2023
 
5.9%
1042
 
3.1%
1014
 
3.0%
919
 
2.7%
886
 
2.6%
843
 
2.5%
822
 
2.4%
806
 
2.4%
Other values (135) 15932
46.7%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  34110  100.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
5326
 
15.6%
4497
 
13.2%
2023
 
5.9%
1042
 
3.1%
1014
 
3.0%
919
 
2.7%
886
 
2.6%
843
 
2.5%
822
 
2.4%
806
 
2.4%
Other values (135) 15932
46.7%

Unnamed: 3
Text

MISSING 

Distinct: 1760
Distinct (%): 70.2%
Missing: 7492
Missing (%): 74.9%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 9
Mean length: 3.6953748
Min length: 2

Characters and Unicode

Total characters: 9268
Distinct characters: 303
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
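
As an illustration of these character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using the five sample values listed for this variable. It covers categories only: the standard library does not expose script or block properties, so it does not reproduce the profiler's script and block counts. The helper name and the category-name mapping are hypothetical.

```python
import unicodedata
from collections import Counter

# Readable names for the general-category codes that occur in this column.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# The five sample rows shown for Unnamed: 3.
samples = ["상동면", "잘못 투입·구분된 투표지", "효자2동", "산격1동", "신선동"]
counts = category_counts(samples)
for name, n in counts.most_common():
    print(name, n)
```

Hangul syllables fall under Other Letter, digits under Decimal Number, the space under Space Separator, and the middle dot under Other Punctuation, which matches the four categories reported for this variable.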

Unique

Unique: 1669
Unique (%): 66.5%

Sample

1st row: 상동면
2nd row: 잘못 투입·구분된 투표지
3rd row: 효자2동
4th row: 산격1동
5th row: 신선동

Value  Count  Frequency (%)
소계  285  10.9%
국내부재자투표  143  5.5%
재외투표  143  5.5%
잘못  48  1.8%
투입·구분된  48  1.8%
투표지  48  1.8%
중앙동  20  0.8%
남면  9  0.3%
북면  6  0.2%
동면  5  0.2%
Other values (1752)  1849  71.0%

Most occurring characters

ValueCountFrequency (%)
1203
 
13.0%
650
 
7.0%
382
 
4.1%
335
 
3.6%
318
 
3.4%
302
 
3.3%
287
 
3.1%
2 216
 
2.3%
209
 
2.3%
1 206
 
2.2%
Other values (293) 5160
55.7%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  8504  91.8%
Decimal Number  610  6.6%
Space Separator  96  1.0%
Other Punctuation  58  0.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1203
 
14.1%
650
 
7.6%
382
 
4.5%
335
 
3.9%
318
 
3.7%
302
 
3.6%
287
 
3.4%
209
 
2.5%
200
 
2.4%
177
 
2.1%
Other values (281) 4441
52.2%
Decimal Number
ValueCountFrequency (%)
2 216
35.4%
1 206
33.8%
3 94
15.4%
4 47
 
7.7%
5 24
 
3.9%
6 10
 
1.6%
7 5
 
0.8%
8 4
 
0.7%
9 3
 
0.5%
0 1
 
0.2%
Space Separator
ValueCountFrequency (%)
96
100.0%
Other Punctuation
ValueCountFrequency (%)
· 58
100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  8504  91.8%
Common  764  8.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1203
 
14.1%
650
 
7.6%
382
 
4.5%
335
 
3.9%
318
 
3.7%
302
 
3.6%
287
 
3.4%
209
 
2.5%
200
 
2.4%
177
 
2.1%
Other values (281) 4441
52.2%
Common
ValueCountFrequency (%)
2 216
28.3%
1 206
27.0%
96
12.6%
3 94
12.3%
· 58
 
7.6%
4 47
 
6.2%
5 24
 
3.1%
6 10
 
1.3%
7 5
 
0.7%
8 4
 
0.5%
Other values (2) 4
 
0.5%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  8504  91.8%
ASCII  706  7.6%
None  58  0.6%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1203
 
14.1%
650
 
7.6%
382
 
4.5%
335
 
3.9%
318
 
3.7%
302
 
3.6%
287
 
3.4%
209
 
2.5%
200
 
2.4%
177
 
2.1%
Other values (281) 4441
52.2%
ASCII
ValueCountFrequency (%)
2 216
30.6%
1 206
29.2%
96
13.6%
3 94
13.3%
4 47
 
6.7%
5 24
 
3.4%
6 10
 
1.4%
7 5
 
0.7%
8 4
 
0.6%
9 3
 
0.4%
None
ValueCountFrequency (%)
· 58
100.0%

Unnamed: 4
Text

MISSING 

Distinct: 7211
Distinct (%): 77.0%
Missing: 630
Missing (%): 6.3%
Memory size: 156.2 KiB

Length

Max length: 14
Median length: 13
Mean length: 5.6518677
Min length: 2

Characters and Unicode

Total characters: 52958
Distinct characters: 323
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7015
Unique (%): 74.9%

Sample

1st row: 주엽1동제1투
2nd row: 고군면제1투
3rd row: 암사제3동제2투
4th row: 서종면제4투
5th row: 지천면제1투

Value  Count  Frequency (%)
소계  1888  20.1%
중앙동제2투  13  0.1%
남면제1투  8  0.1%
중앙동제3투  8  0.1%
중앙동제1투  7  0.1%
남면제2투  6  0.1%
북면제2투  6  0.1%
서면제2투  6  0.1%
중앙동제4투  5  0.1%
동천동제3투  4  < 0.1%
Other values (7201)  7419  79.2%

Most occurring characters

ValueCountFrequency (%)
8212
15.5%
7482
14.1%
5443
 
10.3%
1 2896
 
5.5%
2 2727
 
5.1%
2196
 
4.1%
2014
 
3.8%
3 1736
 
3.3%
1497
 
2.8%
4 1153
 
2.2%
Other values (313) 17602
33.2%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  42526  80.3%
Decimal Number  10336  19.5%
Other Punctuation  96  0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
8212
19.3%
7482
17.6%
5443
 
12.8%
2196
 
5.2%
2014
 
4.7%
1497
 
3.5%
793
 
1.9%
574
 
1.3%
363
 
0.9%
331
 
0.8%
Other values (302) 13621
32.0%
Decimal Number
ValueCountFrequency (%)
1 2896
28.0%
2 2727
26.4%
3 1736
16.8%
4 1153
 
11.2%
5 710
 
6.9%
6 481
 
4.7%
7 295
 
2.9%
8 174
 
1.7%
9 106
 
1.0%
0 58
 
0.6%
Other Punctuation
ValueCountFrequency (%)
· 96
100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  42526  80.3%
Common  10432  19.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
8212
19.3%
7482
17.6%
5443
 
12.8%
2196
 
5.2%
2014
 
4.7%
1497
 
3.5%
793
 
1.9%
574
 
1.3%
363
 
0.9%
331
 
0.8%
Other values (302) 13621
32.0%
Common
ValueCountFrequency (%)
1 2896
27.8%
2 2727
26.1%
3 1736
16.6%
4 1153
 
11.1%
5 710
 
6.8%
6 481
 
4.6%
7 295
 
2.8%
8 174
 
1.7%
9 106
 
1.0%
· 96
 
0.9%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  42526  80.3%
ASCII  10336  19.5%
None  96  0.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
8212
19.3%
7482
17.6%
5443
 
12.8%
2196
 
5.2%
2014
 
4.7%
1497
 
3.5%
793
 
1.9%
574
 
1.3%
363
 
0.9%
331
 
0.8%
Other values (302) 13621
32.0%
ASCII
ValueCountFrequency (%)
1 2896
28.0%
2 2727
26.4%
3 1736
16.8%
4 1153
 
11.2%
5 710
 
6.9%
6 481
 
4.7%
7 295
 
2.9%
8 174
 
1.7%
9 106
 
1.0%
0 58
 
0.6%
None
ValueCountFrequency (%)
· 96
100.0%

Unnamed: 5
Unsupported

REJECTED  UNSUPPORTED 

Missing: 2
Missing (%): < 0.1%
Memory size: 156.2 KiB

Unnamed: 6
Unsupported

REJECTED  UNSUPPORTED

Missing: 2
Missing (%): < 0.1%
Memory size: 156.2 KiB

Unnamed: 7
Unsupported

REJECTED  UNSUPPORTED

Missing: 1
Missing (%): < 0.1%
Memory size: 156.2 KiB

Unnamed: 8
Unsupported

REJECTED  UNSUPPORTED

Missing: 2
Missing (%): < 0.1%
Memory size: 156.2 KiB

Unnamed: 9
Unsupported

REJECTED  UNSUPPORTED

Missing: 2
Missing (%): < 0.1%
Memory size: 156.2 KiB

Unnamed: 10
Unsupported

REJECTED  UNSUPPORTED

Missing: 2
Missing (%): < 0.1%
Memory size: 156.2 KiB

Unnamed: 11
Unsupported

REJECTED  UNSUPPORTED

Missing: 2
Missing (%): < 0.1%
Memory size: 156.2 KiB

Unnamed: 12
Unsupported

REJECTED  UNSUPPORTED

Missing: 2
Missing (%): < 0.1%
Memory size: 156.2 KiB

Unnamed: 13
Unsupported

REJECTED  UNSUPPORTED

Missing: 2
Missing (%): < 0.1%
Memory size: 156.2 KiB

Unnamed: 14
Unsupported

REJECTED  UNSUPPORTED

Missing: 2
Missing (%): < 0.1%
Memory size: 156.2 KiB

Unnamed: 15
Unsupported

REJECTED  UNSUPPORTED

Missing: 2
Missing (%): < 0.1%
Memory size: 156.2 KiB

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
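
Nullity correlation can be sketched as the Pearson correlation between two columns' missing-value indicators (1 where a cell is missing, 0 otherwise). The example below uses the Unnamed: 3 and Unnamed: 4 values of six rows from the sample table, so its result illustrates the calculation only and is not the value shown in the heatmap. The function names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # undefined when a column is never or always missing
    return cov / sqrt(var_x * var_y)


def nullity_correlation(rows, col_a, col_b):
    """Correlate the missing-value indicators of two columns."""
    a = [1 if row[col_a] is None else 0 for row in rows]
    b = [1 if row[col_b] is None else 0 for row in rows]
    return pearson(a, b)


# Six rows from the sample table; None stands for <NA>.
rows = [
    {"Unnamed: 3": None, "Unnamed: 4": "주엽1동제1투"},
    {"Unnamed: 3": None, "Unnamed: 4": "고군면제1투"},
    {"Unnamed: 3": "상동면", "Unnamed: 4": "소계"},
    {"Unnamed: 3": "잘못 투입·구분된 투표지", "Unnamed: 4": None},
    {"Unnamed: 3": "효자2동", "Unnamed: 4": "소계"},
    {"Unnamed: 3": None, "Unnamed: 4": "암사제3동제2투"},
]
r = nullity_correlation(rows, "Unnamed: 3", "Unnamed: 4")
print(round(r, 3))
```

A negative value means the two columns tend to be missing in different rows. A column with no missing values, or with only missing values, has no defined nullity correlation, which is why the helper returns None in that case.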

Sample

Unnamed: 0개표진행상황(투표구별)Unnamed: 2Unnamed: 3Unnamed: 4Unnamed: 5Unnamed: 6Unnamed: 7Unnamed: 8Unnamed: 9Unnamed: 10Unnamed: 11Unnamed: 12Unnamed: 13Unnamed: 14Unnamed: 15
8671<NA>경기도고양시일산서구<NA>주엽1동제1투3795288713051565124128789908
14865<NA>전라남도진도군<NA>고군면제1투124990869817007489711341
2714<NA>서울특별시강동구<NA>암사제3동제2투3739313418091303036431259605
10323<NA>경기도양평군<NA>서종면제4투52935324810200013512176
16287<NA>경상북도칠곡군<NA>지천면제1투30662359205527210317234811707
13192<NA>전라북도군산시<NA>옥도면제4투250159141440000158191
17208<NA>경상남도김해시상동면소계31012269140080956412223633832
1201<NA>서울특별시은평구잘못 투입·구분된 투표지<NA>0191270000190-19
10473<NA>강원도춘천시효자2동소계10529753045112968131477504262999
4266<NA>대구광역시북구산격1동소계1057981596932117225988128312420
Unnamed: 0개표진행상황(투표구별)Unnamed: 2Unnamed: 3Unnamed: 4Unnamed: 5Unnamed: 6Unnamed: 7Unnamed: 8Unnamed: 9Unnamed: 10Unnamed: 11Unnamed: 12Unnamed: 13Unnamed: 14Unnamed: 15
7483<NA>경기도의정부시<NA>신곡1동제8투44753489148819810240347514986
5374<NA>인천광역시서구<NA>청라2동제3투4232331715081797013033098915
17497<NA>경상남도창녕군<NA>창녕읍제1투2740208314975610023206320657
6034<NA>대전광역시동구<NA>효동제1투3474266715001153123326625807
10614<NA>강원도원주시<NA>무실동제5투47073498171817641091349351209
1478<NA>서울특별시양천구<NA>신월5동제3투416130051332163922732985201156
17301<NA>경상남도밀양시<NA>산외면제2투112085960324220128509261
1575<NA>서울특별시강서구화곡본동소계29005203828349118961211291220309738623
3004<NA>부산광역시부산진구<NA>양정제2동제2투3517271516591045012227096802
17843<NA>경상남도합천군<NA>대병면투표소19451454110327841532142331491

Duplicate rows

Most frequently occurring

   개표진행상황(투표구별)  Unnamed: 2  Unnamed: 3  Unnamed: 4  # duplicates
0  강원도  동해시  소계  <NA>  2
1  강원도  삼척시  소계  <NA>  2
2  강원도  영월군  소계  <NA>  2
3  강원도  정선군  소계  <NA>  2
4  강원도  횡성군  소계  <NA>  2
5  경기도  고양시덕양구  소계  <NA>  2
6  경기도  과천시  소계  <NA>  2
7  경기도  동두천시  소계  <NA>  2
8  경기도  수원시팔달구  소계  <NA>  2
9  경기도  안산시단원구  소계  <NA>  2
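
The duplicate table lists only the categorical and text columns, which suggests (though the report does not say so) that duplicates were detected on those columns alone, leaving out the unsupported ones. A minimal sketch of that kind of count follows. The rows are hypothetical and merely modeled on the table, and the function name is an assumption.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return each row that occurs more than once, with its occurrence count."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Hypothetical rows: (개표진행상황(투표구별), Unnamed: 2, Unnamed: 3, Unnamed: 4).
# None stands for <NA>.
rows = [
    ("강원도", "동해시", "소계", None),
    ("강원도", "동해시", "소계", None),
    ("강원도", "삼척시", "소계", None),
    ("경기도", "과천시", "소계", None),
    ("경기도", "과천시", "소계", None),
]
groups = duplicate_groups(rows)
for row, n in groups.items():
    print(row, n)
```

Rows are stored as tuples so that they are hashable and can be counted directly. If the numeric columns were included in the comparison, rows that look identical here could turn out to be distinct.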