Overview

Dataset statistics

Number of variables: 7
Number of observations: 95
Missing cells: 10
Missing cells (%): 1.5%
Duplicate rows: 1
Duplicate rows (%): 1.1%
Total size in memory: 5.3 KiB
Average record size in memory: 57.4 B
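
The percentages and the total size follow from the raw counts. The short check below recomputes them from the figures listed above; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 7
n_observations = 95
missing_cells = 10
duplicate_rows = 1
avg_record_bytes = 57.4

total_cells = n_variables * n_observations            # 7 * 95 = 665 cells
missing_pct = 100 * missing_cells / total_cells       # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_observations
total_kib = avg_record_bytes * n_observations / 1024  # bytes -> KiB

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Total size in memory: {total_kib:.1f} KiB")
```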

Variable types

Unsupported: 2
Categorical: 4
Text: 1

Alerts

Dataset has 1 (1.1%) duplicate rows [Duplicates]
Unnamed: 2 is highly overall correlated with Unnamed: 3 and 2 other fields [High correlation]
Unnamed: 3 is highly overall correlated with Unnamed: 2 and 1 other fields [High correlation]
Unnamed: 5 is highly overall correlated with Unnamed: 2 and 1 other fields [High correlation]
Unnamed: 6 is highly overall correlated with Unnamed: 2 and 2 other fields [High correlation]
Unnamed: 2 is highly imbalanced (64.0%) [Imbalance]
Unnamed: 0 has 4 (4.2%) missing values [Missing]
◇ 도청사 내 C C T V 설치 현황 ◇ has 2 (2.1%) missing values [Missing]
Unnamed: 4 has 4 (4.2%) missing values [Missing]
Unnamed: 0 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
◇ 도청사 내 C C T V 설치 현황 ◇ is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]

Reproduction

Analysis started: 2024-03-14 01:15:24.389476
Analysis finished: 2024-03-14 01:15:24.943884
Duration: 0.55 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
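
A report of this kind can be regenerated with ydata-profiling's `ProfileReport` class. The sketch below is a minimal example under stated assumptions: the function name, the file paths, the report title, and the use of `pandas.read_csv` are all hypothetical, because the original source file and loading options are not recorded here.

```python
def build_report(source_path: str, output_path: str = "report.html") -> None:
    """Profile a tabular file and write an HTML report (paths are hypothetical)."""
    # Imported lazily so this module still loads where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the source is a CSV file; use pd.read_excel for a workbook.
    df = pd.read_csv(source_path)
    report = ProfileReport(df, title="CCTV installation status")
    report.to_file(output_path)
```

Calling `build_report("cctv.csv")` with a real path would write `report.html` next to the script; the timestamps and duration will differ from the ones listed above.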

Variables

Unnamed: 0
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 4
Missing (%): 4.2%
Memory size: 892.0 B

◇ 도청사 내 C C T V 설치 현황 ◇ (English: "CCTV installation status in the provincial government building")
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 2
Missing (%): 2.1%
Memory size: 892.0 B

Unnamed: 2
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 9
Distinct (%): 9.5%
Missing: 0
Missing (%): 0.0%
Memory size: 892.0 B

Length

Max length: 4
Median length: 1
Mean length: 1.2421053
Min length: 1
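
These length statistics can be recomputed from the value counts in the Common Values table for this variable. The sketch below assumes that both blank entries are one-character whitespace strings; the actual characters are not visible in the report, so `" "` and `"\u3000"` are stand-ins chosen only for illustration.

```python
from statistics import mean, median

# (value, count) pairs from the Common Values table for Unnamed: 2.
# Assumption: the two blank values are single whitespace characters.
value_counts = [
    (" ", 78), ('"', 6), ("<NA>", 4), ("주차장", 2), ("\u3000", 1),
    ("청사동", 1), ("공연장", 1), ("외곽", 1), ("승강기", 1),
]

# One length per row, so each value's length is repeated by its count.
lengths = [len(value) for value, count in value_counts for _ in range(count)]

print("Max length:", max(lengths))
print("Median length:", median(lengths))
print("Mean length:", round(mean(lengths), 7))
print("Min length:", min(lengths))
```

Under that assumption the 95 lengths sum to 118, which gives the reported mean of 118 / 95.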

Unique

Unique: 5
Unique (%): 5.3%
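
"Distinct" counts how many different values occur, while "Unique" counts the values that occur exactly once. The sketch below reproduces both figures from the counts in the Common Values table for this variable; the keys `blank_a` and `blank_b` are hypothetical labels for the two blank entries, whose actual content is not visible in the report.

```python
# Counts per value, as listed in the Common Values table for Unnamed: 2.
# "blank_a" and "blank_b" are placeholder names for the two blank values.
counts = {
    "blank_a": 78, '"': 6, "<NA>": 4, "주차장": 2, "blank_b": 1,
    "청사동": 1, "공연장": 1, "외곽": 1, "승강기": 1,
}

n_rows = sum(counts.values())
distinct = len(counts)                               # different values
unique = sum(1 for c in counts.values() if c == 1)   # values seen exactly once

print(f"Distinct: {distinct} ({100 * distinct / n_rows:.1f}%)")
print(f"Unique: {unique} ({100 * unique / n_rows:.1f}%)")
```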

Sample

1st row: <NA>
2nd row: (blank)
3rd row: 청사동 (office building)
4th row: (blank)
5th row: (blank)

Common Values

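Value | Count | Frequency (%)
(blank) | 78 | 82.1%
" | 6 | 6.3%
<NA> | 4 | 4.2%
주차장 (parking lot) | 2 | 2.1%
(blank) | 1 | 1.1%
청사동 (office building) | 1 | 1.1%
공연장 (performance hall) | 1 | 1.1%
외곽 (perimeter) | 1 | 1.1%
승강기 (elevator) | 1 | 1.1%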
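
The 64.0% imbalance reported for this variable in the Alerts section can be reproduced from these counts as one minus the normalized Shannon entropy. The report does not state the formula, so treat the sketch below as an inference that happens to match the figure, not as a description of the tool's internals.

```python
from math import log2

# Counts from the Common Values table for Unnamed: 2 (nine distinct values).
counts = [78, 6, 4, 2, 1, 1, 1, 1, 1]
total = sum(counts)

# Shannon entropy of the value distribution, in bits.
entropy = -sum((c / total) * log2(c / total) for c in counts)

# Normalize by the maximum entropy for this many classes; 0 = balanced, 1 = one class only.
imbalance = 1 - entropy / log2(len(counts))

print(f"Imbalance: {imbalance:.1%}")
```

A score near 1 means a single value dominates; here the blank value alone accounts for 82.1% of the rows.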

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
(blank) | 78 | 82.1%
[illegible] | 6 | 6.3%
na | 4 | 4.2%
주차장 | 2 | 2.1%
(blank) | 1 | 1.1%
청사동 | 1 | 1.1%
공연장 | 1 | 1.1%
외곽 | 1 | 1.1%
승강기 | 1 | 1.1%

Unnamed: 3
Categorical

HIGH CORRELATION 

Distinct: 27
Distinct (%): 28.4%
Missing: 0
Missing (%): 0.0%
Memory size: 892.0 B

Length

Max length: 4
Median length: 1
Mean length: 1.6947368
Min length: 1

Unique

Unique: 16
Unique (%): 16.8%

Sample

1st row: <NA>
2nd row: (blank)
3rd row: 옥상 (rooftop)
4th row: 18F
5th row: 17F

Common Values

Value | Count | Frequency (%)
(blank) | 45 | 47.4%
" | 6 | 6.3%
1F | 6 | 6.3%
B1F | 5 | 5.3%
<NA> | 4 | 4.2%
2F | 3 | 3.2%
B2F | 2 | 2.1%
4F | 2 | 2.1%
3F | 2 | 2.1%
야외 (outdoors) | 2 | 2.1%
Other values (17) | 18 | 18.9%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
(blank) | 45 | 47.4%
1f | 6 | 6.3%
[illegible] | 6 | 6.3%
b1f | 5 | 5.3%
na | 4 | 4.2%
2f | 3 | 3.2%
3f | 2 | 2.1%
옥상 | 2 | 2.1%
야외 | 2 | 2.1%
4f | 2 | 2.1%
Other values (17) | 18 | 18.9%

Unnamed: 4
Text

MISSING 

Distinct: 89
Distinct (%): 97.8%
Missing: 4
Missing (%): 4.2%
Memory size: 892.0 B

Length

Max length: 24
Median length: 12
Mean length: 10.879121
Min length: 2

Characters and Unicode

Total characters: 990
Distinct characters: 130
Distinct categories: 8
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
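
As an illustration of those character properties, Python's standard `unicodedata` module reports the general category of each code point. The sketch below classifies one sample value from this column; the two-letter codes correspond to the category names used in the tables (Lo = Other Letter, Zs = Space Separator, Nd = Decimal Number, Ps = Open Punctuation, Pe = Close Punctuation, Po = Other Punctuation). It illustrates the idea only and is not the profiler's own code.

```python
import unicodedata
from collections import Counter

# One of the sample values of Unnamed: 4 ("elevator hall entrance (18th floor)").
sample = "승강장 홀 입구(18층)"

# Count how many characters fall into each Unicode general category.
categories = Counter(unicodedata.category(ch) for ch in sample)
for code, count in categories.most_common():
    print(code, count)

# The ditto mark U+3003 seen in other rows is classified as Other Punctuation.
ditto_category = unicodedata.category("\u3003")
print("Ditto mark category:", ditto_category)
```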

Unique

Unique: 87
Unique (%): 95.6%

Sample

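1st row: 위치 (location)
2nd row: 옥상 헬기 착륙장 (rooftop helipad)
3rd row: 승강장 홀 입구(18층) (elevator hall entrance, 18th floor)
4th row: 〃 (17층) (ditto, 17th floor)
5th row: 〃 (16층) (ditto, 16th floor)

Value | Count | Frequency (%)
[illegible] | 23 | 10.9%
[illegible] | 23 | 10.9%
승강장 (elevator hall) | 9 | 4.3%
[illegible] | 9 | 4.3%
주차장 (parking lot) | 7 | 3.3%
북2문 (north gate 2) | 3 | 1.4%
복도 (corridor) | 3 | 1.4%
중앙 (center) | 3 | 1.4%
민원실 (civil service office) | 3 | 1.4%
정문 (main gate) | 3 | 1.4%
Other values (99) | 125 | 59.2%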

Most occurring characters

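Value | Count | Frequency (%)
(space) | 430 | 43.4%
[illegible] | 29 | 2.9%
( | 27 | 2.7%
) | 27 | 2.7%
[illegible] | 26 | 2.6%
[illegible] | 23 | 2.3%
[illegible] | 23 | 2.3%
1 | 18 | 1.8%
[illegible] | 17 | 1.7%
[illegible] | 16 | 1.6%
Other values (120) | 354 | 35.8%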

Most occurring categories

Value | Count | Frequency (%)
Space Separator | 430 | 43.4%
Other Letter | 424 | 42.8%
Decimal Number | 53 | 5.4%
Open Punctuation | 27 | 2.7%
Close Punctuation | 27 | 2.7%
Other Punctuation | 23 | 2.3%
Lowercase Letter | 3 | 0.3%
Uppercase Letter | 3 | 0.3%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[illegible] | 29 | 6.8%
[illegible] | 26 | 6.1%
[illegible] | 23 | 5.4%
[illegible] | 17 | 4.0%
[illegible] | 16 | 3.8%
[illegible] | 16 | 3.8%
[illegible] | 14 | 3.3%
[illegible] | 14 | 3.3%
[illegible] | 12 | 2.8%
[illegible] | 11 | 2.6%
Other values (100) | 246 | 58.0%

Decimal Number
Value | Count | Frequency (%)
1 | 18 | 34.0%
2 | 9 | 17.0%
3 | 5 | 9.4%
4 | 5 | 9.4%
7 | 4 | 7.5%
5 | 3 | 5.7%
8 | 3 | 5.7%
6 | 3 | 5.7%
9 | 2 | 3.8%
0 | 1 | 1.9%

Lowercase Letter
Value | Count | Frequency (%)
n | 1 | 33.3%
u | 1 | 33.3%
t | 1 | 33.3%

Uppercase Letter
Value | Count | Frequency (%)
O | 1 | 33.3%
B | 1 | 33.3%
I | 1 | 33.3%

Space Separator
Value | Count | Frequency (%)
(space) | 430 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 27 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 27 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
[illegible] | 23 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 560 | 56.6%
Hangul | 421 | 42.5%
Latin | 6 | 0.6%
Han | 3 | 0.3%

Most frequent character per script

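Hangul
Value | Count | Frequency (%)
[illegible] | 29 | 6.9%
[illegible] | 26 | 6.2%
[illegible] | 23 | 5.5%
[illegible] | 17 | 4.0%
[illegible] | 16 | 3.8%
[illegible] | 16 | 3.8%
[illegible] | 14 | 3.3%
[illegible] | 14 | 3.3%
[illegible] | 12 | 2.9%
[illegible] | 11 | 2.6%
Other values (99) | 243 | 57.7%

Common
Value | Count | Frequency (%)
(space) | 430 | 76.8%
( | 27 | 4.8%
) | 27 | 4.8%
[illegible] | 23 | 4.1%
1 | 18 | 3.2%
2 | 9 | 1.6%
3 | 5 | 0.9%
4 | 5 | 0.9%
7 | 4 | 0.7%
5 | 3 | 0.5%
Other values (4) | 9 | 1.6%

Latin
Value | Count | Frequency (%)
n | 1 | 16.7%
O | 1 | 16.7%
u | 1 | 16.7%
t | 1 | 16.7%
B | 1 | 16.7%
I | 1 | 16.7%

Han
Value | Count | Frequency (%)
[illegible] | 3 | 100.0%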

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 543 | 54.8%
Hangul | 421 | 42.5%
None | 23 | 2.3%
CJK | 3 | 0.3%

Most frequent character per block

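ASCII
Value | Count | Frequency (%)
(space) | 430 | 79.2%
( | 27 | 5.0%
) | 27 | 5.0%
1 | 18 | 3.3%
2 | 9 | 1.7%
3 | 5 | 0.9%
4 | 5 | 0.9%
7 | 4 | 0.7%
5 | 3 | 0.6%
8 | 3 | 0.6%
Other values (9) | 12 | 2.2%

Hangul
Value | Count | Frequency (%)
[illegible] | 29 | 6.9%
[illegible] | 26 | 6.2%
[illegible] | 23 | 5.5%
[illegible] | 17 | 4.0%
[illegible] | 16 | 3.8%
[illegible] | 16 | 3.8%
[illegible] | 14 | 3.3%
[illegible] | 14 | 3.3%
[illegible] | 12 | 2.9%
[illegible] | 11 | 2.6%
Other values (99) | 243 | 57.7%

None
Value | Count | Frequency (%)
[illegible] | 23 | 100.0%

CJK
Value | Count | Frequency (%)
[illegible] | 3 | 100.0%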

Unnamed: 5
Categorical

HIGH CORRELATION 

Distinct: 9
Distinct (%): 9.5%
Missing: 0
Missing (%): 0.0%
Memory size: 892.0 B

Length

Max length: 4
Median length: 1
Mean length: 1.7789474
Min length: 1

Unique

Unique: 3
Unique (%): 3.2%

Sample

1st row: <NA>
2nd row: 화소 (pixels)
3rd row: 45만 (450,000)
4th row: (blank)
5th row: (blank)

Common Values

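Value | Count | Frequency (%)
(blank) | 53 | 55.8%
52만 (520,000) | 13 | 13.7%
41만 (410,000) | 10 | 10.5%
" | 7 | 7.4%
45만 (450,000) | 5 | 5.3%
<NA> | 4 | 4.2%
화소 (pixels) | 1 | 1.1%
적외선 (infrared) | 1 | 1.1%
200만 (2,000,000) | 1 | 1.1%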

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
(blank) | 53 | 55.8%
52만 | 13 | 13.7%
41만 | 10 | 10.5%
[illegible] | 7 | 7.4%
45만 | 5 | 5.3%
na | 4 | 4.2%
화소 | 1 | 1.1%
적외선 | 1 | 1.1%
200만 | 1 | 1.1%

Unnamed: 6
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 7.4%
Missing: 0
Missing (%): 0.0%
Memory size: 892.0 B

Length

Max length: 8
Median length: 1
Mean length: 2.2947368
Min length: 1

Unique

Unique: 1
Unique (%): 1.1%

Sample

1st row: <NA>
2nd row: 비 고 (remarks)
3rd row: 360˚ 회전형 (360° rotating type)
4th row: 고정형 (fixed type)
5th row: (blank)

Common Values

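Value | Count | Frequency (%)
(blank) | 62 | 65.3%
360˚ 회전형 (360° rotating type) | 11 | 11.6%
고정형 (fixed type) | 9 | 9.5%
" | 6 | 6.3%
<NA> | 4 | 4.2%
180˚ 회전형 (180° rotating type) | 2 | 2.1%
비 고 (remarks) | 1 | 1.1%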

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
(blank) | 62 | 56.9%
회전형 | 13 | 11.9%
360˚ | 11 | 10.1%
고정형 | 9 | 8.3%
[illegible] | 6 | 5.5%
na | 4 | 3.7%
180˚ | 2 | 1.8%
[illegible] | 1 | 0.9%
[illegible] | 1 | 0.9%

Correlations

Variable | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6
Unnamed: 2 | 1.000 | 0.943 | 1.000 | 0.920 | 0.857
Unnamed: 3 | 0.943 | 1.000 | 1.000 | 0.801 | 0.880
Unnamed: 4 | 1.000 | 1.000 | 1.000 | 0.951 | 1.000
Unnamed: 5 | 0.920 | 0.801 | 0.951 | 1.000 | 0.783
Unnamed: 6 | 0.857 | 0.880 | 1.000 | 0.783 | 1.000

Variable | Unnamed: 5 | Unnamed: 6 | Unnamed: 2 | Unnamed: 3
Unnamed: 5 | 1.000 | 0.579 | 0.567 | 0.421
Unnamed: 6 | 0.579 | 1.000 | 0.690 | 0.561
Unnamed: 2 | 0.567 | 0.690 | 1.000 | 0.668
Unnamed: 3 | 0.421 | 0.561 | 0.668 | 1.000

Variable | Unnamed: 2 | Unnamed: 3 | Unnamed: 5 | Unnamed: 6
Unnamed: 2 | 1.000 | 0.668 | 0.567 | 0.690
Unnamed: 3 | 0.668 | 1.000 | 0.421 | 0.561
Unnamed: 5 | 0.567 | 0.421 | 1.000 | 0.579
Unnamed: 6 | 0.690 | 0.561 | 0.579 | 1.000
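
The report does not say which coefficient produced each matrix. For pairs of categorical variables, a common association measure is Cramér's V, which ranges from 0 (no association) to 1 (one variable determines the other). The sketch below is a plain, uncorrected implementation on toy data; the function name and the sample lists are hypothetical, and the values it returns are not expected to match the matrices above exactly.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed count per (x, y) cell
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Toy data: "zone" fully determines "floor_group" but is independent of "camera".
zone = ["a", "a", "b", "b"]
floor_group = ["x", "x", "y", "y"]
camera = ["p", "q", "p", "q"]
print(cramers_v(zone, floor_group))
print(cramers_v(zone, camera))
```

Columns with many near-unique values, such as Unnamed: 4 here, tend to score close to 1 against everything, so such entries say little about a real relationship.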

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
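
Nullity correlation can be read as the ordinary Pearson correlation between two columns' "is missing" indicators. The sketch below computes it on toy columns; the function name and the sample data are hypothetical and are not taken from this dataset.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns."""
    a = [1.0 if value is None else 0.0 for value in col_a]
    b = [1.0 if value is None else 0.0 for value in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n

    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # a column that is never (or always) missing
    return cov / sqrt(var_a * var_b)


# Toy columns: first and second are missing together; third is missing when they are present.
first = [None, 1, 2, None]
second = [None, "x", "y", None]
third = ["p", None, None, "q"]
print(nullity_correlation(first, second))
print(nullity_correlation(first, third))
```

A value of 1 means the two columns are always missing together, and -1 means one is missing exactly when the other is present.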

Sample

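Index | Unnamed: 0 | ◇ 도청사 내 C C T V 설치 현황 ◇ | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6
0 | NaN | NaN | <NA> | <NA> | <NA> | <NA> | <NA>
1 | 순번 | NO | (blank) | (blank) | 위치 | 화소 | 비 고
2 | 1 | 1 | 청사동 | 옥상 | 옥상 헬기 착륙장 | 45만 | 360˚ 회전형
3 | 2 | 3 | (blank) | 18F | 승강장 홀 입구(18층) | (blank) | 고정형
4 | 3 | 4 | (blank) | 17F | 〃 (17층) | (blank) | (blank)
5 | 4 | 5 | (blank) | 16F | 〃 (16층) | (blank) | (blank)
6 | 5 | 6 | (blank) | 15F | 〃 (15층) | (blank) | (blank)
7 | 6 | 7 | (blank) | 14F | 〃 (14층) | (blank) | (blank)
8 | 7 | 8 | (blank) | 13F | 〃 (13층) | (blank) | (blank)
9 | 8 | 9 | (blank) | 12F | 〃 (12층) | (blank) | (blank)

Index | Unnamed: 0 | ◇ 도청사 내 C C T V 설치 현황 ◇ | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6
85 | 84 | 96 | (blank) | (blank) | 〃 9호기 | (blank) | (blank)
86 | 85 | 99 | 주차장 | 야외 | 남무 입구 | 200만 | 고정형
87 | 86 | 100 | " | " | 남문 출구 | " | "
88 | 87 | 101 | " | " | 북1문 입구 | " | "
89 | 88 | 102 | " | " | 북1문 출구 | " | "
90 | 89 | 103 | " | " | 북2문 입구 | " | "
91 | 90 | 104 | " | " | 북2문 출구 | " | "
92 | NaN | * 의회동 카메라 총 14대 의회 총무담당관실로 이관 조치(2012년 9월 27일) - 현황제외 | <NA> | <NA> | <NA> | <NA> | <NA>
93 | NaN | NaN | <NA> | <NA> | <NA> | <NA> | <NA>
94 | NaN | 총:90대-고정형*33대,승강기용*9대,360˚회전형*33대,180˚회전형*7대,외곽*8대(360도) | <NA> | <NA> | <NA> | <NA> | <NA>

Translations of the Korean cell values: 순번 = serial number; 위치 = location; 화소 = pixels; 비 고 = remarks; 청사동 = office building; 옥상 = rooftop; 옥상 헬기 착륙장 = rooftop helipad; 승강장 홀 입구(18층) = elevator hall entrance (18th floor); 〃 = ditto mark ("same as above"); 9호기 = unit no. 9; 주차장 = parking lot; 야외 = outdoors; 남무 입구 = south gate entrance (the source spells it 남무, apparently for 남문); 남문 출구 = south gate exit; 북1문 / 북2문 = north gate 1 / north gate 2; 입구 / 출구 = entrance / exit; 만 = ten thousand (45만 = 450,000); 360˚ 회전형 = 360° rotating type; 고정형 = fixed type.
Row 92 reads: "* A total of 14 cameras in the Assembly building were transferred to the Assembly's General Affairs Officer's office (September 27, 2012) - excluded from this status list."
Row 94 reads: "Total: 90 units - fixed type × 33, elevator × 9, 360° rotating type × 33, 180° rotating type × 7, perimeter × 8 (360 degrees)."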
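
The sample shows ditto marks (`"` and `〃`) standing in for "same as the row above", which inflates the counts of those marks in every categorical column. One possible cleaning step, sketched below, replaces a cell that holds only a ditto mark with the value from the previous row. The function name is hypothetical and the rows are a small excerpt of the last five columns; cells such as `〃 (17층)` that mix a ditto mark with other text are deliberately left unchanged.

```python
DITTO_MARKS = {'"', "〃"}


def fill_ditto(rows):
    """Replace cells that contain only a ditto mark with the value above them."""
    filled = []
    for row in rows:
        previous = filled[-1] if filled else None
        new_row = [
            previous[i]
            if previous is not None and cell.strip() in DITTO_MARKS
            else cell
            for i, cell in enumerate(row)
        ]
        filled.append(new_row)
    return filled


# Excerpt of Unnamed: 2 .. Unnamed: 6 for indexes 86-88 of the sample.
rows = [
    ["주차장", "야외", "남무 입구", "200만", "고정형"],
    ['"', '"', "남문 출구", '"', '"'],
    ['"', '"', "북1문 입구", '"', '"'],
]
for row in fill_ditto(rows):
    print(row)
```

Because each row is filled from the already-filled row above it, a run of consecutive ditto rows inherits the last explicit value.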

Duplicate rows

Most frequently occurring

Index | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | 4
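
The table lists only the five supported columns, which suggests that duplicates were evaluated on those columns alone; the four all-`<NA>` rows (indexes 0 and 92 to 94 in the sample) then collapse into a single duplicated row with four occurrences. The sketch below illustrates that reading on a small excerpt; it is an interpretation of the figures, not the tool's documented behaviour.

```python
from collections import Counter

# Rows restricted to Unnamed: 2 .. Unnamed: 6, as in the table above.
na_row = ("<NA>",) * 5
rows = [
    na_row,                                            # index 0
    ("주차장", "야외", "남무 입구", "200만", "고정형"),  # index 86
    na_row, na_row, na_row,                            # indexes 92-94
]

# Count identical rows and keep those that occur more than once.
occurrences = Counter(rows)
duplicated = {row: n for row, n in occurrences.items() if n > 1}

print("Distinct duplicated rows:", len(duplicated))
print("Occurrences:", duplicated[na_row])
```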