Overview

Dataset statistics

Number of variables: 4
Number of observations: 101
Missing cells: 14
Missing cells (%): 3.5%
Duplicate rows: 4
Duplicate rows (%): 4.0%
Total size in memory: 3.3 KiB
Average record size in memory: 33.3 B
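The two percentages above use different denominators: missing cells are measured against all cells (variables times observations), while duplicate rows are measured against the number of observations. A minimal check of that arithmetic, where the helper name pct is an illustrative choice and not something the report defines:

```python
def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Figures copied from the dataset statistics above.
n_variables, n_observations = 4, 101
missing_cells, duplicate_rows = 14, 4

# Missing cells are counted over the full grid of cells.
missing_cells_pct = pct(missing_cells, n_variables * n_observations)
# Duplicate rows are counted over rows only.
duplicate_rows_pct = pct(duplicate_rows, n_observations)

print(missing_cells_pct, duplicate_rows_pct)
```

Note that the same 14 missing values are 13.9% of the rows of the one column that contains them, which is the figure the alerts section quotes.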

Variable types

Unsupported: 1
Categorical: 2
Text: 1

Alerts

Dataset has 4 (4.0%) duplicate rows [Duplicates]
Unnamed: 1 is highly overall correlated with Unnamed: 2 [High correlation]
Unnamed: 2 is highly overall correlated with Unnamed: 1 [High correlation]
Unnamed: 1 is highly imbalanced (59.2%) [Imbalance]
Unnamed: 3 has 14 (13.9%) missing values [Missing]
C C T V 설치 현황 ("CCTV installation status") is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
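The report does not explain how the 59.2% imbalance figure is derived. One common definition is one minus the Shannon entropy of the class frequencies, normalized by the maximum entropy for that number of classes. The sketch below applies that definition to the ten value counts listed for Unnamed: 1 under Common Values; the function name and the choice of formula are assumptions, although the result agrees with the reported figure.

```python
import math

def imbalance_score(counts):
    """1 - normalized Shannon entropy; 0 for uniform classes, near 1 when one class dominates."""
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # undefined for a single class; 0.0 is a convention chosen here
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# Counts for Unnamed: 1 as listed under Common Values (they sum to 101).
unnamed_1_counts = [80, 5, 3, 3, 3, 2, 2, 1, 1, 1]
score = imbalance_score(unnamed_1_counts)
print(f"{score:.1%}")
```

Most of the imbalance comes from the 80 rows that hold the missing marker, so the score says more about the sparsity of the column than about the named categories.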

Reproduction

Analysis started: 2024-03-14 02:47:26.240650
Analysis finished: 2024-03-14 02:47:26.603769
Duration: 0.36 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
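The duration is simply the difference between the two timestamps. A small check using only the standard library (variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-03-14 02:47:26.240650")
finished = datetime.fromisoformat("2024-03-14 02:47:26.603769")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```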

Variables

C C T V 설치 현황
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 940.0 B

Unnamed: 1
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 10
Distinct (%): 9.9%
Missing: 0
Missing (%): 0.0%
Memory size: 940.0 B

Top values: <NA> (80), 청사동 (5), 외곽 (3), 승강기 (3), 공연장 (3), other values (5 distinct, 7 rows)

Length

Max length: 4
Median length: 4
Mean length: 3.7227723
Min length: 1
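These length statistics appear to treat the missing marker as the literal four-character string <NA>: doing so with the counts listed under Common Values reproduces each figure. The check below is a sketch under two assumptions: that <NA> is counted as text, and that the one value whose text did not survive extraction is a single character (which the minimum length of 1 suggests). The "?" key is a stand-in for that value.

```python
from statistics import median

# Value counts for Unnamed: 1; "?" stands in for the one unrecovered single-character value.
value_counts = {
    "<NA>": 80, "청사동": 5, "외곽": 3, "승강기": 3, "공연장": 3,
    "주차장": 2, "별관": 2, "?": 1, "대강당": 1, "의회동": 1,
}

# Expand to one length per row, counting "<NA>" as a 4-character string.
lengths = [len(value) for value, count in value_counts.items() for _ in range(count)]

mean_length = sum(lengths) / len(lengths)
median_length = median(lengths)
print(max(lengths), median_length, mean_length, min(lengths))
```

If that reading is right, the length figures mostly describe the missing marker and should be recomputed after real missing values are handled.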

Unique

Unique: 3
Unique (%): 3.0%
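Distinct and Unique are easy to confuse. Judging by the counts in this report, Distinct is the number of different values, while Unique is the number of values that occur exactly once. A short sketch using the ten counts listed for Unnamed: 1 (variable names are illustrative):

```python
# Occurrence counts of the ten different values of Unnamed: 1.
counts = [80, 5, 3, 3, 3, 2, 2, 1, 1, 1]
n_rows = sum(counts)

distinct = len(counts)                        # how many different values exist
unique = sum(1 for c in counts if c == 1)     # values that occur exactly once

distinct_pct = round(100 * distinct / n_rows, 1)
unique_pct = round(100 * unique / n_rows, 1)
print(distinct, distinct_pct, unique, unique_pct)
```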

Sample

1st row: <NA>
2nd row: (not recovered)
3rd row: 청사동
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 80 | 79.2%
청사동 | 5 | 5.0%
외곽 | 3 | 3.0%
승강기 | 3 | 3.0%
공연장 | 3 | 3.0%
주차장 | 2 | 2.0%
별관 | 2 | 2.0%
(not recovered) | 1 | 1.0%
대강당 | 1 | 1.0%
의회동 | 1 | 1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values]

Value | Count | Frequency (%)
na | 80 | 79.2%
청사동 | 5 | 5.0%
외곽 | 3 | 3.0%
승강기 | 3 | 3.0%
공연장 | 3 | 3.0%
주차장 | 2 | 2.0%
별관 | 2 | 2.0%
(not recovered) | 1 | 1.0%
대강당 | 1 | 1.0%
의회동 | 1 | 1.0%

Unnamed: 2
Categorical

HIGH CORRELATION 

Distinct: 25
Distinct (%): 24.8%
Missing: 0
Missing (%): 0.0%
Memory size: 940.0 B

Top values: <NA> (47), 1F (9), B1F (7), 4F (5), 2F (5), other values (20 distinct, 28 rows)

Length

Max length: 4
Median length: 3
Mean length: 3.0693069
Min length: 1

Unique

Unique: 15
Unique (%): 14.9%

Sample

1st row: <NA>
2nd row: (not recovered)
3rd row: 옥상
4th row: 18F
5th row: 17F

Common Values

Value | Count | Frequency (%)
<NA> | 47 | 46.5%
1F | 9 | 8.9%
B1F | 7 | 6.9%
4F | 5 | 5.0%
2F | 5 | 5.0%
야외 | 3 | 3.0%
3F | 3 | 3.0%
(not recovered) | 3 | 3.0%
B2F | 2 | 2.0%
5F | 2 | 2.0%
Other values (15) | 15 | 14.9%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
na | 47 | 46.5%
1f | 9 | 8.9%
b1f | 7 | 6.9%
4f | 5 | 5.0%
2f | 5 | 5.0%
야외 | 3 | 3.0%
3f | 3 | 3.0%
(not recovered) | 3 | 3.0%
5f | 2 | 2.0%
b2f | 2 | 2.0%
Other values (15) | 15 | 14.9%

Unnamed: 3
Text

MISSING 

Distinct: 76
Distinct (%): 87.4%
Missing: 14
Missing (%): 13.9%
Memory size: 940.0 B

Length

Max length: 13
Median length: 10
Mean length: 6.4827586
Min length: 3

Characters and Unicode

Total characters: 564
Distinct characters: 136
Distinct categories: 7
Distinct scripts: 4
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
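As an illustration of those character properties, Python's standard unicodedata module exposes the general category of each code point. The sketch below tallies categories for three of the sample values shown for this variable; the CATEGORY_NAMES mapping and the function name are illustrative, and the counts cover only these three strings, not the whole column.

```python
import unicodedata
from collections import Counter

# Two-letter general category codes mapped to the long names used in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter", "Zs": "Space Separator", "Nd": "Decimal Number",
    "Ll": "Lowercase Letter", "Lu": "Uppercase Letter",
    "Ps": "Open Punctuation", "Pe": "Close Punctuation",
}

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# Three sample values of Unnamed: 3, copied from the Sample block.
sample_values = ["설치장소(위치)", "옥상 헬기 착륙장", "승강장 홀"]
counts = category_counts(sample_values)
print(counts.most_common())
```

Hangul syllables fall under Other Letter because the script has no upper or lower case, which is why that category dominates the tables that follow.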

Unique

Unique: 74
Unique (%): 85.1%
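For this variable the percentages and the mean length are consistent with a denominator of 87 non-missing rows (101 observations minus 14 missing), not all 101 rows. A quick check of that reading, with illustrative variable names:

```python
# Figures copied from the Unnamed: 3 summary blocks.
n_rows, missing = 101, 14
distinct, unique, total_characters = 76, 74, 564

non_missing = n_rows - missing  # rows that actually contain text

distinct_pct = round(100 * distinct / non_missing, 1)
unique_pct = round(100 * unique / non_missing, 1)
mean_length = total_characters / non_missing
print(non_missing, distinct_pct, unique_pct, round(mean_length, 7))
```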

Sample

1st row: 설치장소(위치)
2nd row: 옥상 헬기 착륙장
3rd row: 승강장 홀
4th row: 대회의실 앞
5th row: 승강장 홀
Value | Count | Frequency (%)
(not recovered) | 27 | 15.1%
승강장 | 12 | 6.7%
(not recovered) | 12 | 6.7%
주차장 | 7 | 3.9%
복도 | 4 | 2.2%
청사동 | 3 | 1.7%
민원실 | 3 | 1.7%
(not recovered) | 3 | 1.7%
정문 | 3 | 1.7%
중앙 | 3 | 1.7%
Other values (86) | 102 | 57.0%

Most occurring characters

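Value | Count | Frequency (%)
(space) | 92 | 16.3%
(not recovered) | 37 | 6.6%
(not recovered) | 28 | 5.0%
(not recovered) | 18 | 3.2%
(not recovered) | 17 | 3.0%
(not recovered) | 14 | 2.5%
(not recovered) | 13 | 2.3%
(not recovered) | 13 | 2.3%
(not recovered) | 12 | 2.1%
(not recovered) | 12 | 2.1%
Other values (126) | 308 | 54.6%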

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 439 | 77.8%
Space Separator | 92 | 16.3%
Decimal Number | 24 | 4.3%
Lowercase Letter | 3 | 0.5%
Close Punctuation | 2 | 0.4%
Open Punctuation | 2 | 0.4%
Uppercase Letter | 2 | 0.4%

Most frequent character per category

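Other Letter
Value | Count | Frequency (%)
(not recovered) | 37 | 8.4%
(not recovered) | 28 | 6.4%
(not recovered) | 18 | 4.1%
(not recovered) | 17 | 3.9%
(not recovered) | 14 | 3.2%
(not recovered) | 13 | 3.0%
(not recovered) | 13 | 3.0%
(not recovered) | 12 | 2.7%
(not recovered) | 12 | 2.7%
(not recovered) | 10 | 2.3%
Other values (108) | 265 | 60.4%

Decimal Number
Value | Count | Frequency (%)
1 | 9 | 37.5%
2 | 6 | 25.0%
7 | 2 | 8.3%
4 | 1 | 4.2%
3 | 1 | 4.2%
6 | 1 | 4.2%
0 | 1 | 4.2%
5 | 1 | 4.2%
8 | 1 | 4.2%
9 | 1 | 4.2%

Lowercase Letter
Value | Count | Frequency (%)
u | 1 | 33.3%
t | 1 | 33.3%
n | 1 | 33.3%

Uppercase Letter
Value | Count | Frequency (%)
I | 1 | 50.0%
O | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 92 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 2 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 2 | 100.0%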

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 436 | 77.3%
Common | 120 | 21.3%
Latin | 5 | 0.9%
Han | 3 | 0.5%

Most frequent character per script

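Hangul
Value | Count | Frequency (%)
(not recovered) | 37 | 8.5%
(not recovered) | 28 | 6.4%
(not recovered) | 18 | 4.1%
(not recovered) | 17 | 3.9%
(not recovered) | 14 | 3.2%
(not recovered) | 13 | 3.0%
(not recovered) | 13 | 3.0%
(not recovered) | 12 | 2.8%
(not recovered) | 12 | 2.8%
(not recovered) | 10 | 2.3%
Other values (107) | 262 | 60.1%

Common
Value | Count | Frequency (%)
(space) | 92 | 76.7%
1 | 9 | 7.5%
2 | 6 | 5.0%
) | 2 | 1.7%
( | 2 | 1.7%
7 | 2 | 1.7%
4 | 1 | 0.8%
3 | 1 | 0.8%
6 | 1 | 0.8%
0 | 1 | 0.8%
Other values (3) | 3 | 2.5%

Latin
Value | Count | Frequency (%)
u | 1 | 20.0%
t | 1 | 20.0%
I | 1 | 20.0%
n | 1 | 20.0%
O | 1 | 20.0%

Han
Value | Count | Frequency (%)
(not recovered) | 3 | 100.0%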

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 436 | 77.3%
ASCII | 125 | 22.2%
CJK | 3 | 0.5%

Most frequent character per block

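ASCII
Value | Count | Frequency (%)
(space) | 92 | 73.6%
1 | 9 | 7.2%
2 | 6 | 4.8%
) | 2 | 1.6%
( | 2 | 1.6%
7 | 2 | 1.6%
4 | 1 | 0.8%
3 | 1 | 0.8%
6 | 1 | 0.8%
0 | 1 | 0.8%
Other values (8) | 8 | 6.4%

Hangul
Value | Count | Frequency (%)
(not recovered) | 37 | 8.5%
(not recovered) | 28 | 6.4%
(not recovered) | 18 | 4.1%
(not recovered) | 17 | 3.9%
(not recovered) | 14 | 3.2%
(not recovered) | 13 | 3.0%
(not recovered) | 13 | 3.0%
(not recovered) | 12 | 2.8%
(not recovered) | 12 | 2.8%
(not recovered) | 10 | 2.3%
Other values (107) | 262 | 60.1%

CJK
Value | Count | Frequency (%)
(not recovered) | 3 | 100.0%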

Correlations

 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3
Unnamed: 1 | 1.000 | 0.875 | 0.655
Unnamed: 2 | 0.875 | 1.000 | 0.775
Unnamed: 3 | 0.655 | 0.775 | 1.000

 | Unnamed: 2 | Unnamed: 1
Unnamed: 2 | 1.000 | 0.605
Unnamed: 1 | 0.605 | 1.000

 | Unnamed: 1 | Unnamed: 2
Unnamed: 1 | 1.000 | 0.605
Unnamed: 2 | 0.605 | 1.000
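The report does not state which association measure each matrix uses. For pairs of categorical variables, Cramér's V is a common choice, so the sketch below shows the uncorrected version of that statistic. It runs on hypothetical toy labels and is not meant to reproduce the 0.605 or 0.875 values above; the function name and the data are illustrative.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length sequences of category labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Hypothetical labels: each building maps to exactly one floor, so association is perfect.
buildings = ["A", "A", "B", "B"]
floors_dependent = ["1F", "1F", "2F", "2F"]
floors_independent = ["1F", "2F", "1F", "2F"]

print(cramers_v(buildings, floors_dependent), cramers_v(buildings, floors_independent))
```

With 80 of 101 rows of Unnamed: 1 and 47 rows of Unnamed: 2 holding the missing marker, any such measure computed on the raw columns is likely driven by where the markers coincide.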

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

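Index | C C T V 설치 현황 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3
0 | (2014년 7월 30일) | <NA> | <NA> | <NA>
1 | 연번 | (not recovered) | (not recovered) | 설치장소(위치)
2 | 1 | 청사동 | 옥상 | 옥상 헬기 착륙장
3 | 2 | <NA> | 18F | 승강장 홀
4 | 3 | <NA> | 17F | <NA>
5 | 4 | <NA> | 16F | <NA>
6 | 5 | <NA> | 15F | <NA>
7 | 6 | <NA> | 14F | <NA>
8 | 7 | <NA> | 13F | <NA>
9 | 8 | <NA> | 12F | <NA>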
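Index | C C T V 설치 현황 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3
91 | 90 | <NA> | <NA> | 의원총회의실 출입문2
92 | 91 | <NA> | 1F | 로비중앙 계단
93 | 92 | <NA> | <NA> | 기자회견장 앞
94 | 93 | <NA> | <NA> | 농협 앞 복도
95 | 94 | <NA> | B1F | 승강장 홀
96 | 95 | <NA> | <NA> | 자료열람실 앞
97 | 96 | <NA> | <NA> | 문서고 앞
98 | 97 | <NA> | <NA> | 의회 지하주차장
99 | 98 | 승강기 | (not recovered) | 10호기
100 | 99 | <NA> | <NA> | 11호기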
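The sample rows suggest why three columns are named Unnamed: N: the sheet title seems to have been read as the header, row 0 holds a date, and row 1 holds the real header cells, including 연번 ("serial number") and 설치장소(위치) ("installation location"). The long runs of <NA> under Unnamed: 1 also look like merged cells in which a building name applies to the rows beneath it, although the report does not confirm that. Under that assumption only, a cleaning step could forward-fill the column. The function below is a hypothetical sketch in plain Python, not part of the profiled pipeline.

```python
def forward_fill(values, missing="<NA>"):
    """Replace each missing marker with the most recent non-missing value (None if there is none yet)."""
    filled, last = [], None
    for value in values:
        if value != missing:
            last = value
        filled.append(last)
    return filled

# Unnamed: 1 for sample rows 2 to 5, as shown in the first sample table.
building_column = ["청사동", "<NA>", "<NA>", "<NA>"]
print(forward_fill(building_column))
```

Whether the same treatment suits Unnamed: 2 is less clear, since a missing floor may mean something other than "same as above"; that would need checking against the source spreadsheet.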

Duplicate rows

Most frequently occurring

Index | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | # duplicates
0 | <NA> | 1F | 승강장 홀 | 2
1 | <NA> | 2F | 승강장 홀 | 2
2 | <NA> | B1F | 승강장 홀 | 2
3 | <NA> | <NA> | 공연장 내부 | 2
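The duplicates table lists only Unnamed: 1 to Unnamed: 3, which suggests that the unsupported first column (the serial number) was left out of the comparison; that would explain how rows with different serial numbers can count as duplicates. The sketch below shows one way to find such groups by counting identical tuples. The row list is a small hypothetical subset built from values visible in this report, not the full dataset.

```python
from collections import Counter

def duplicate_groups(rows):
    """Return each row that occurs more than once, mapped to its number of occurrences."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

# Hypothetical subset: (Unnamed: 1, Unnamed: 2, Unnamed: 3) with the serial number dropped.
rows = [
    ("<NA>", "1F", "승강장 홀"),
    ("<NA>", "1F", "승강장 홀"),
    ("<NA>", "2F", "승강장 홀"),
    ("<NA>", "B1F", "승강장 홀"),
    ("<NA>", "B1F", "승강장 홀"),
    ("청사동", "옥상", "옥상 헬기 착륙장"),
]
groups = duplicate_groups(rows)
print(groups)
```

If building names were filled in for the <NA> cells first, some of these groups might no longer match, so the duplicate count may partly be an artifact of the sparse Unnamed: 1 column.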