Overview

Dataset statistics

Number of variables: 7
Number of observations: 23
Missing cells: 44
Missing cells (%): 27.3%
Duplicate rows: 1
Duplicate rows (%): 4.3%
Total size in memory: 1.4 KiB
Average record size in memory: 62.7 B
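
The percentages in this table follow from the raw counts. A minimal sketch of that arithmetic, assuming (as the per-variable figures in this report indicate) that the 44 missing cells come from 4 columns with 11 missing values each:

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 7
n_observations = 23
missing_cells = 44
duplicate_rows = 1

# Assumption: 4 columns each report 11 missing values.
columns_with_missing = 4
missing_per_column = 11

total_cells = n_variables * n_observations              # 7 * 23 = 161
missing_pct = 100 * missing_cells / total_cells         # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_observations   # share of duplicate rows

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print("Per-column counts add up:", columns_with_missing * missing_per_column == missing_cells)
```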

Variable types

Numeric: 1
Text: 3
Categorical: 3

Dataset

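Description: Current roster of the Boryeong City Council (보령시 의회) in Chungcheongnam-do (충청남도). Each record lists a member's name (성명), party (정당), electoral district (선거구), position (직위), and telephone number (전화번호). For more detail on individual members, see the Boryeong City Council website (http://www.brcouncil.go.kr/kr/member/name.do).
Author: Chungcheongnam-do (충청남도)
URL: https://alldam.chungnam.go.kr/index.chungnam?menuCd=DOM_000000201001001001&st=&cds=&orgCd=&apiType=&isOpen=Y&pageIndex=41&beforeMenuCd=DOM_000000201001001000&publicdatapk=15114843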

Alerts

Dataset has 1 (4.3%) duplicate rows (Duplicates)
정당 is highly overall correlated with 데이터기준일자 (High correlation)
데이터기준일자 is highly overall correlated with 연번 and 2 other fields (High correlation)
선거구 is highly overall correlated with 데이터기준일자 (High correlation)
연번 is highly overall correlated with 데이터기준일자 (High correlation)
연번 has 11 (47.8%) missing values (Missing)
성명 has 11 (47.8%) missing values (Missing)
직위 has 11 (47.8%) missing values (Missing)
전화번호 has 11 (47.8%) missing values (Missing)

Reproduction

Analysis started: 2024-01-09 20:03:41.091163
Analysis finished: 2024-01-09 20:03:41.959268
Duration: 0.87 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 12
Distinct (%): 100.0%
Missing: 11
Missing (%): 47.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.5
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 339.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1.55
Q1: 3.75
Median: 6.5
Q3: 9.25
95-th percentile: 11.45
Maximum: 12
Range: 11
Interquartile range (IQR): 5.5

Descriptive statistics

Standard deviation: 3.6055513
Coefficient of variation (CV): 0.5547002
Kurtosis: -1.2
Mean: 6.5
Median Absolute Deviation (MAD): 3
Skewness: 0
Sum: 78
Variance: 13
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=12)
Value             Count  Frequency (%)
1                 1      4.3%
2                 1      4.3%
3                 1      4.3%
4                 1      4.3%
5                 1      4.3%
6                 1      4.3%
7                 1      4.3%
8                 1      4.3%
9                 1      4.3%
10                1      4.3%
Other values (2)  2      8.7%
(Missing)         11     47.8%

Minimum 10 values

Value  Count  Frequency (%)
1      1      4.3%
2      1      4.3%
3      1      4.3%
4      1      4.3%
5      1      4.3%
6      1      4.3%
7      1      4.3%
8      1      4.3%
9      1      4.3%
10     1      4.3%

Maximum 10 values

Value  Count  Frequency (%)
12     1      4.3%
11     1      4.3%
10     1      4.3%
9      1      4.3%
8      1      4.3%
7      1      4.3%
6      1      4.3%
5      1      4.3%
4      1      4.3%
3      1      4.3%
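
The quantile and descriptive statistics reported for 연번 can be cross-checked with the standard library. The sketch below assumes the 12 non-missing values are the integers 1 to 12 (implied by Minimum 1, Maximum 12, Distinct 12 and Sum 78), and that percentiles use linear interpolation between neighbouring ranks. The `percentile` helper is illustrative and is not part of ydata-profiling.

```python
import math
import statistics

# Assumption: the non-missing 연번 values are exactly 1..12.
values = list(range(1, 13))

def percentile(sorted_values, p):
    """Percentile by linear interpolation between the two nearest ranks."""
    position = (len(sorted_values) - 1) * p
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    step = sorted_values[upper] - sorted_values[lower]
    return sorted_values[lower] + step * fraction

mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
std = statistics.stdev(values)           # sample standard deviation
cv = std / mean                          # coefficient of variation
median = statistics.median(values)
mad = statistics.median([abs(v - median) for v in values])

print("mean", mean, "variance", variance, "std", round(std, 7))
print("CV", round(cv, 7), "median", median, "MAD", mad)
for p in (0.05, 0.25, 0.75, 0.95):
    print(f"{p:.0%} percentile:", round(percentile(values, p), 2))
```

Under these assumptions the computed values agree with the figures listed above.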

성명
Text

MISSING 

Distinct: 12
Distinct (%): 100.0%
Missing: 11
Missing (%): 47.8%
Memory size: 316.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 36
Distinct characters: 30
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 12
Unique (%): 100.0%

Sample

1st row: 박상모
2nd row: 김충호
3rd row: 백성현
4th row: 이정근
5th row: 백영창

Value             Count  Frequency (%)
박상모            1      8.3%
김충호            1      8.3%
백성현            1      8.3%
이정근            1      8.3%
백영창            1      8.3%
성태용            1      8.3%
최은순            1      8.3%
김정훈            1      8.3%
조장현            1      8.3%
김재관            1      8.3%
Other values (2)  2      16.7%

Most occurring characters

Value              Count  Frequency (%)
(lost)             3      8.3%
(lost)             2      5.6%
(lost)             2      5.6%
(lost)             2      5.6%
(lost)             2      5.6%
(lost)             1      2.8%
(lost)             1      2.8%
(lost)             1      2.8%
(lost)             1      2.8%
(lost)             1      2.8%
Other values (20)  20     55.6%

Most occurring categories

Value         Count  Frequency (%)
Other Letter  36     100.0%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
(lost)             3      8.3%
(lost)             2      5.6%
(lost)             2      5.6%
(lost)             2      5.6%
(lost)             2      5.6%
(lost)             1      2.8%
(lost)             1      2.8%
(lost)             1      2.8%
(lost)             1      2.8%
(lost)             1      2.8%
Other values (20)  20     55.6%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  36     100.0%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
(lost)             3      8.3%
(lost)             2      5.6%
(lost)             2      5.6%
(lost)             2      5.6%
(lost)             2      5.6%
(lost)             1      2.8%
(lost)             1      2.8%
(lost)             1      2.8%
(lost)             1      2.8%
(lost)             1      2.8%
Other values (20)  20     55.6%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  36     100.0%

Most frequent character per block

Hangul
Value              Count  Frequency (%)
(lost)             3      8.3%
(lost)             2      5.6%
(lost)             2      5.6%
(lost)             2      5.6%
(lost)             2      5.6%
(lost)             1      2.8%
(lost)             1      2.8%
(lost)             1      2.8%
(lost)             1      2.8%
(lost)             1      2.8%
Other values (20)  20     55.6%

정당
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 13.0%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B

Length

Max length: 6
Median length: 4
Mean length: 4.3478261
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 국민의힘
2nd row: 국민의힘
3rd row: 국민의힘
4th row: 더불어민주당
5th row: 국민의힘

Common Values

Value         Count  Frequency (%)
<NA>          11     47.8%
국민의힘      8      34.8%
더불어민주당  4      17.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value         Count  Frequency (%)
na            11     47.8%
국민의힘      8      34.8%
더불어민주당  4      17.4%

선거구
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 26.1%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 다선거구
2nd row: 가선거구
3rd row: 다선거구
4th row: 라선거구
5th row: 가선거구

Common Values

Value     Count  Frequency (%)
<NA>      11     47.8%
다선거구  3      13.0%
나선거구  3      13.0%
가선거구  2      8.7%
라선거구  2      8.7%
비례대표  2      8.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value     Count  Frequency (%)
na        11     47.8%
다선거구  3      13.0%
나선거구  3      13.0%
가선거구  2      8.7%
라선거구  2      8.7%
비례대표  2      8.7%

직위
Text

MISSING 

Distinct: 8
Distinct (%): 66.7%
Missing: 11
Missing (%): 47.8%
Memory size: 316.0 B

Length

Max length: 9
Median length: 8
Mean length: 4.25
Min length: 2

Characters and Unicode

Total characters: 51
Distinct characters: 27
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7
Unique (%): 58.3%

Sample

1st row: 의장
2nd row: 부의장
3rd row: 의회운영위원장
4th row: 자치행정위원장
5th row: 경제개발위원장

Value               Count  Frequency (%)
의원                5      41.7%
의장                1      8.3%
부의장              1      8.3%
의회운영위원장      1      8.3%
자치행정위원장      1      8.3%
경제개발위원장      1      8.3%
더불어민주당        1      8.3%
예산결산특별위원장  1      8.3%

Most occurring characters

Value              Count  Frequency (%)
(lost)             9      17.6%
(lost)             8      15.7%
(lost)             6      11.8%
(lost)             4      7.8%
(lost)             2      3.9%
(lost)             1      2.0%
(lost)             1      2.0%
(lost)             1      2.0%
(lost)             1      2.0%
(lost)             1      2.0%
Other values (17)  17     33.3%

Most occurring categories

Value         Count  Frequency (%)
Other Letter  51     100.0%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
(lost)             9      17.6%
(lost)             8      15.7%
(lost)             6      11.8%
(lost)             4      7.8%
(lost)             2      3.9%
(lost)             1      2.0%
(lost)             1      2.0%
(lost)             1      2.0%
(lost)             1      2.0%
(lost)             1      2.0%
Other values (17)  17     33.3%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  51     100.0%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
(lost)             9      17.6%
(lost)             8      15.7%
(lost)             6      11.8%
(lost)             4      7.8%
(lost)             2      3.9%
(lost)             1      2.0%
(lost)             1      2.0%
(lost)             1      2.0%
(lost)             1      2.0%
(lost)             1      2.0%
Other values (17)  17     33.3%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  51     100.0%

Most frequent character per block

Hangul
Value              Count  Frequency (%)
(lost)             9      17.6%
(lost)             8      15.7%
(lost)             6      11.8%
(lost)             4      7.8%
(lost)             2      3.9%
(lost)             1      2.0%
(lost)             1      2.0%
(lost)             1      2.0%
(lost)             1      2.0%
(lost)             1      2.0%
Other values (17)  17     33.3%

전화번호
Text

MISSING 

Distinct: 12
Distinct (%): 100.0%
Missing: 11
Missing (%): 47.8%
Memory size: 316.0 B

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 144
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 12
Unique (%): 100.0%

Sample

1st row: 041-930-4000
2nd row: 041-930-4001
3rd row: 041-930-4007
4th row: 041-930-4008
5th row: 041-930-4006

Value             Count  Frequency (%)
041-930-4000      1      8.3%
041-930-4001      1      8.3%
041-930-4007      1      8.3%
041-930-4008      1      8.3%
041-930-4006      1      8.3%
041-930-4009      1      8.3%
041-930-4002      1      8.3%
041-930-4005      1      8.3%
041-930-4011      1      8.3%
041-930-4004      1      8.3%
Other values (2)  2      16.7%

Most occurring characters

Value  Count  Frequency (%)
0      48     33.3%
4      25     17.4%
-      24     16.7%
1      16     11.1%
9      13     9.0%
3      13     9.0%
7      1      0.7%
8      1      0.7%
6      1      0.7%
2      1      0.7%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    120    83.3%
Dash Punctuation  24     16.7%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      48     40.0%
4      25     20.8%
1      16     13.3%
9      13     10.8%
3      13     10.8%
7      1      0.8%
8      1      0.8%
6      1      0.8%
2      1      0.8%
5      1      0.8%

Dash Punctuation
Value  Count  Frequency (%)
-      24     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  144    100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      48     33.3%
4      25     17.4%
-      24     16.7%
1      16     11.1%
9      13     9.0%
3      13     9.0%
7      1      0.7%
8      1      0.7%
6      1      0.7%
2      1      0.7%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  144    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      48     33.3%
4      25     17.4%
-      24     16.7%
1      16     11.1%
9      13     9.0%
3      13     9.0%
7      1      0.7%
8      1      0.7%
6      1      0.7%
2      1      0.7%
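
The 전화번호 statistics (every value 12 characters long, made of ASCII digits and dashes only) suggest a fixed `NNN-NNN-NNNN` layout. A minimal validation sketch under that assumption follows. The pattern and the `is_valid_phone` helper are illustrative and do not come from the source dataset or from ydata-profiling.

```python
import re

# Sample values copied from the 전화번호 sample rows in this report.
samples = [
    "041-930-4000",
    "041-930-4001",
    "041-930-4007",
    "041-930-4008",
    "041-930-4006",
]

# Assumed layout: 3 digits, 3 digits, 4 digits, separated by dashes.
PHONE_RE = re.compile(r"[0-9]{3}-[0-9]{3}-[0-9]{4}")

def is_valid_phone(value):
    """Return True when the whole string matches the assumed layout."""
    return PHONE_RE.fullmatch(value) is not None

for value in samples:
    print(value, len(value), is_valid_phone(value))
```

A check like this would flag values that drop a dash or carry extra characters. All five sampled values pass.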

데이터기준일자
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 8.7%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B

Length

Max length: 10
Median length: 10
Mean length: 7.1304348
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-06-20
2nd row: 2023-06-20
3rd row: 2023-06-20
4th row: 2023-06-20
5th row: 2023-06-20

Common Values

Value       Count  Frequency (%)
2023-06-20  12     52.2%
<NA>        11     47.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value       Count  Frequency (%)
2023-06-20  12     52.2%
na          11     47.8%

Interactions

[Figure: interactions plot]

Correlations

          연번   성명   정당   선거구  직위   전화번호
연번      1.000  1.000  0.644  0.972  0.865  1.000
성명      1.000  1.000  1.000  1.000  1.000  1.000
정당      0.644  1.000  1.000  0.000  0.000  1.000
선거구    0.972  1.000  0.000  1.000  0.666  1.000
직위      0.865  1.000  0.000  0.666  1.000  1.000
전화번호  1.000  1.000  1.000  1.000  1.000  1.000

                정당   데이터기준일자  선거구
정당            1.000  1.000           0.000
데이터기준일자  1.000  1.000           1.000
선거구          0.000  1.000           1.000

                연번   정당   선거구  데이터기준일자
연번            1.000  0.000  0.436  1.000
정당            0.000  1.000  0.000  1.000
선거구          0.436  0.000  1.000  1.000
데이터기준일자  1.000  1.000  1.000  1.000

Missing values

[Figure: count] A simple visualization of nullity by column.
[Figure: matrix] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure: heatmap] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
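
To illustrate nullity correlation, the sketch below computes a Pearson correlation between presence indicators for two columns. It assumes that 연번 and 성명 are missing in the same 11 of the 23 rows, which the all-`<NA>` sample rows suggest but the report does not state outright. The `pearson` helper is written out by hand for illustration and is not the heatmap's actual implementation.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# 1 = value present, 0 = value missing.
# Assumption: both columns are populated in the first 12 rows and empty in the last 11.
serial_present = [1] * 12 + [0] * 11   # 연번
name_present = [1] * 12 + [0] * 11     # 성명

nullity_corr = pearson(serial_present, name_present)
print(round(nullity_corr, 3))
```

Identical presence patterns give a correlation of approximately 1. A column with no missing values has a constant indicator, so its correlation is undefined (the denominator is zero).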

Sample

First rows

(index)  연번  성명    정당          선거구    직위                전화번호      데이터기준일자
0        1     박상모  국민의힘      다선거구  의장                041-930-4000  2023-06-20
1        2     김충호  국민의힘      가선거구  부의장              041-930-4001  2023-06-20
2        3     백성현  국민의힘      다선거구  의회운영위원장      041-930-4007  2023-06-20
3        4     이정근  더불어민주당  라선거구  자치행정위원장      041-930-4008  2023-06-20
4        5     백영창  국민의힘      가선거구  경제개발위원장      041-930-4006  2023-06-20
5        6     성태용  더불어민주당  나선거구  더불어민주당        041-930-4009  2023-06-20
6        7     최은순  국민의힘      나선거구  의원                041-930-4002  2023-06-20
7        8     김정훈  국민의힘      라선거구  예산결산특별위원장  041-930-4005  2023-06-20
8        9     조장현  더불어민주당  다선거구  의원                041-930-4011  2023-06-20
9        10    김재관  국민의힘      나선거구  의원                041-930-4004  2023-06-20

Last rows

(index)  연번  성명  정당  선거구  직위  전화번호  데이터기준일자
13       <NA>  <NA>  <NA>  <NA>    <NA>  <NA>      <NA>
14       <NA>  <NA>  <NA>  <NA>    <NA>  <NA>      <NA>
15       <NA>  <NA>  <NA>  <NA>    <NA>  <NA>      <NA>
16       <NA>  <NA>  <NA>  <NA>    <NA>  <NA>      <NA>
17       <NA>  <NA>  <NA>  <NA>    <NA>  <NA>      <NA>
18       <NA>  <NA>  <NA>  <NA>    <NA>  <NA>      <NA>
19       <NA>  <NA>  <NA>  <NA>    <NA>  <NA>      <NA>
20       <NA>  <NA>  <NA>  <NA>    <NA>  <NA>      <NA>
21       <NA>  <NA>  <NA>  <NA>    <NA>  <NA>      <NA>
22       <NA>  <NA>  <NA>  <NA>    <NA>  <NA>      <NA>
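
In the first ten sample rows above, one 직위 (position) value is a party name rather than a position. A minimal sketch of a consistency check that would surface such rows is shown below. The tuples are transcribed from those sample rows, and the rule itself (a position must not equal any known party name) is an illustrative assumption, not a check the report performs.

```python
# (성명, 정당, 직위) transcribed from the first ten sample rows.
rows = [
    ("박상모", "국민의힘", "의장"),
    ("김충호", "국민의힘", "부의장"),
    ("백성현", "국민의힘", "의회운영위원장"),
    ("이정근", "더불어민주당", "자치행정위원장"),
    ("백영창", "국민의힘", "경제개발위원장"),
    ("성태용", "더불어민주당", "더불어민주당"),
    ("최은순", "국민의힘", "의원"),
    ("김정훈", "국민의힘", "예산결산특별위원장"),
    ("조장현", "더불어민주당", "의원"),
    ("김재관", "국민의힘", "의원"),
]

# Every party name seen in the 정당 column.
party_names = {party for _, party, _ in rows}

# Flag rows whose position field holds a party name.
suspect = [name for name, _, position in rows if position in party_names]

print(suspect)  # ['성태용']
```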

Duplicate rows

Most frequently occurring

(index)  연번  성명  정당  선거구  직위  전화번호  데이터기준일자  # duplicates
0        <NA>  <NA>  <NA>  <NA>    <NA>  <NA>      <NA>            11
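
The figures above appear consistent with one distinct row pattern (every field missing) that occurs 11 times, reported as 1 duplicate row out of 23 (4.3%). The sketch below reproduces that count on stand-in data. The two-column rows are a simplified assumption made for illustration, not the real seven-column table.

```python
from collections import Counter

# Stand-in data (assumption): 12 populated rows with distinct serial numbers,
# plus 11 rows in which every field is missing.
populated = [(i, "2023-06-20") for i in range(1, 13)]
empty = [(None, None)] * 11
rows = populated + empty

# Count how often each full row occurs and keep those seen more than once.
counts = Counter(rows)
duplicated = {row: n for row, n in counts.items() if n > 1}

n_patterns = len(duplicated)
share = 100 * n_patterns / len(rows)

print(n_patterns, f"{share:.1f}%", duplicated)
```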