Overview

Dataset statistics

Number of variables: 16
Number of observations: 10000
Missing cells: 49992
Missing cells (%): 31.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.3 MiB
Average record size in memory: 136.0 B
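
The missing-cell total can be cross-checked against the per-variable counts reported in the Alerts and Variables sections of this report. The following is a minimal sketch in plain Python; the name `missing_per_column` is purely illustrative, and the counts are copied from this report (five columns with 9998 missing values, two columns with 1 each, all other columns 0).

```python
# Per-column missing counts as reported elsewhere in this profile.
# Columns not listed here are reported with 0 missing values.
missing_per_column = {
    "유동인구_속성조사_2014": 1,
    "Unnamed: 1": 1,
    "Unnamed: 2": 9998,
    "Unnamed: 3": 9998,
    "Unnamed: 5": 9998,
    "Unnamed: 11": 9998,
    "Unnamed: 12": 9998,
}

n_variables = 16
n_observations = 10000

# Total missing cells and their share of all cells in the table.
missing_cells = sum(missing_per_column.values())
missing_pct = 100 * missing_cells / (n_variables * n_observations)

print(missing_cells)
print(f"{missing_pct:.1f}%")
```

Almost all of the missing cells come from the five columns that are empty apart from two header-like values.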

Variable types

Text: 7
Categorical: 9

Dataset

Description: File download
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-21706/F/1/datasetView.do

Alerts

Unnamed: 9 is highly overall correlated with Unnamed: 4 and 4 other fields (High correlation)
Unnamed: 14 is highly overall correlated with Unnamed: 4 and 5 other fields (High correlation)
Unnamed: 7 is highly overall correlated with Unnamed: 4 and 5 other fields (High correlation)
Unnamed: 4 is highly overall correlated with Unnamed: 6 and 7 other fields (High correlation)
Unnamed: 10 is highly overall correlated with Unnamed: 4 and 7 other fields (High correlation)
Unnamed: 6 is highly overall correlated with Unnamed: 4 and 7 other fields (High correlation)
Unnamed: 15 is highly overall correlated with Unnamed: 4 and 7 other fields (High correlation)
Unnamed: 8 is highly overall correlated with Unnamed: 4 and 4 other fields (High correlation)
Unnamed: 13 is highly overall correlated with Unnamed: 4 and 7 other fields (High correlation)
Unnamed: 4 is highly imbalanced (56.9%) (Imbalance)
Unnamed: 15 is highly imbalanced (99.8%) (Imbalance)
Unnamed: 2 has 9998 (> 99.9%) missing values (Missing)
Unnamed: 3 has 9998 (> 99.9%) missing values (Missing)
Unnamed: 5 has 9998 (> 99.9%) missing values (Missing)
Unnamed: 11 has 9998 (> 99.9%) missing values (Missing)
Unnamed: 12 has 9998 (> 99.9%) missing values (Missing)
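
The alerts above are consistent with a header that was not parsed on load: the five "missing" columns each contain only a Korean label and an English identifier (for example 조사일자 and EXAMIN_DAY in Unnamed: 2), and the other columns carry the same two stray values. The sketch below shows one way to promote the identifier row to the header using only the standard library. The layout of `RAW` (title line, empty row, label row, identifier row, then data) is an assumption inferred from this report, abridged to three columns, and the function name `load_with_identifier_header` is hypothetical.

```python
import csv
import io

# Hypothetical, abridged stand-in for the raw file: a title line, an empty row,
# a Korean label row, an English identifier row, then data rows.
RAW = """유동인구_속성조사_2014,,
,,
남여구분,보행환경,년도
MW_MN_SE,WALK_ENVRN,YEAR
남자,매우불만족,2014
여자,약간불만족,2014
"""

def load_with_identifier_header(text, marker="YEAR"):
    """Return (header, records), using the row that contains `marker` as header."""
    rows = list(csv.reader(io.StringIO(text)))
    # Locate the identifier row; everything before it is title or label material.
    header_pos = next(i for i, row in enumerate(rows) if marker in row)
    header = rows[header_pos]
    # Keep only non-empty rows that follow the identifier row.
    records = [
        dict(zip(header, row)) for row in rows[header_pos + 1:] if any(row)
    ]
    return header, records

header, records = load_with_identifier_header(RAW)
print(header)
print(records[0])
```

With pandas, a similar effect is usually achieved through the `header` or `skiprows` arguments of `read_csv`; the right row offset depends on the actual file, which this report does not show.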

Reproduction

Analysis started: 2023-12-11 03:59:47.392943
Analysis finished: 2023-12-11 03:59:49.911112
Duration: 2.52 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

유동인구_속성조사_2014
Text

Distinct: 9999
Distinct (%): 100.0%
Missing: 1
Missing (%): < 0.1%
Memory size: 156.2 KiB

Length

Max length: 14
Median length: 6
Mean length: 4.4459446
Min length: 1

Characters and Unicode

Total characters: 44455
Distinct characters: 25
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9999
Unique (%): 100.0%

Sample

1st row: 19997
2nd row: 7111
3rd row: 17682
4th row: 19104
5th row: 1714
ValueCountFrequency (%)
19997 1
 
< 0.1%
15917 1
 
< 0.1%
10410 1
 
< 0.1%
12614 1
 
< 0.1%
13516 1
 
< 0.1%
10111 1
 
< 0.1%
6513 1
 
< 0.1%
1430 1
 
< 0.1%
1967 1
 
< 0.1%
1666 1
 
< 0.1%
Other values (9989) 9989
99.9%

Most occurring characters

ValueCountFrequency (%)
1 8993
20.2%
7 4068
9.2%
8 4065
9.1%
2 4036
9.1%
4 4001
9.0%
5 3982
9.0%
3 3961
8.9%
6 3956
8.9%
9 3898
8.8%
0 3475
 
7.8%
Other values (15) 20
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 44435
> 99.9%
Uppercase Letter 14
 
< 0.1%
Other Letter 4
 
< 0.1%
Connector Punctuation 2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 8993
20.2%
7 4068
9.2%
8 4065
9.1%
2 4036
9.1%
4 4001
9.0%
5 3982
9.0%
3 3961
8.9%
6 3956
8.9%
9 3898
8.8%
0 3475
 
7.8%
Uppercase Letter
ValueCountFrequency (%)
I 3
21.4%
D 2
14.3%
A 2
14.3%
R 1
 
7.1%
M 1
 
7.1%
X 1
 
7.1%
E 1
 
7.1%
B 1
 
7.1%
T 1
 
7.1%
N 1
 
7.1%
Other Letter
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 44437
> 99.9%
Latin 14
 
< 0.1%
Hangul 4
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
1 8993
20.2%
7 4068
9.2%
8 4065
9.1%
2 4036
9.1%
4 4001
9.0%
5 3982
9.0%
3 3961
8.9%
6 3956
8.9%
9 3898
8.8%
0 3475
 
7.8%
Latin
ValueCountFrequency (%)
I 3
21.4%
D 2
14.3%
A 2
14.3%
R 1
 
7.1%
M 1
 
7.1%
X 1
 
7.1%
E 1
 
7.1%
B 1
 
7.1%
T 1
 
7.1%
N 1
 
7.1%
Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 44451
> 99.9%
Hangul 4
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 8993
20.2%
7 4068
9.2%
8 4065
9.1%
2 4036
9.1%
4 4001
9.0%
5 3982
9.0%
3 3961
8.9%
6 3956
8.9%
9 3898
8.8%
0 3475
 
7.8%
Other values (11) 16
 
< 0.1%
Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Unnamed: 1
Text

Distinct: 1002
Distinct (%): 10.0%
Missing: 1
Missing (%): < 0.1%
Memory size: 156.2 KiB

Length

Max length: 14
Median length: 6
Mean length: 6.1842184
Min length: 6

Characters and Unicode

Total characters: 61836
Distinct characters: 30
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 25-816
2nd row: 09-011
3rd row: 23-066
4th row: 24-2040
5th row: 02-051
ValueCountFrequency (%)
15-3013 17
 
0.2%
08-014 17
 
0.2%
17-026 16
 
0.2%
05-070 16
 
0.2%
22-1905 16
 
0.2%
03-025 16
 
0.2%
23-017 16
 
0.2%
20-040 16
 
0.2%
23-2121 15
 
0.2%
24-061 15
 
0.2%
Other values (992) 9839
98.4%

Most occurring characters

ValueCountFrequency (%)
0 14232
23.0%
- 9997
16.2%
1 9519
15.4%
2 8767
14.2%
3 4138
 
6.7%
4 3535
 
5.7%
5 3111
 
5.0%
6 2374
 
3.8%
7 2232
 
3.6%
9 2034
 
3.3%
Other values (20) 1897
 
3.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 51819
83.8%
Dash Punctuation 9997
 
16.2%
Uppercase Letter 12
 
< 0.1%
Other Letter 6
 
< 0.1%
Connector Punctuation 2
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 1
8.3%
C 1
8.3%
T 1
8.3%
O 1
8.3%
P 1
8.3%
S 1
8.3%
N 1
8.3%
I 1
8.3%
A 1
8.3%
X 1
8.3%
Other values (2) 2
16.7%
Decimal Number
ValueCountFrequency (%)
0 14232
27.5%
1 9519
18.4%
2 8767
16.9%
3 4138
 
8.0%
4 3535
 
6.8%
5 3111
 
6.0%
6 2374
 
4.6%
7 2232
 
4.3%
9 2034
 
3.9%
8 1877
 
3.6%
Other Letter
ValueCountFrequency (%)
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
Dash Punctuation
ValueCountFrequency (%)
- 9997
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 61818
> 99.9%
Latin 12
 
< 0.1%
Hangul 6
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
0 14232
23.0%
- 9997
16.2%
1 9519
15.4%
2 8767
14.2%
3 4138
 
6.7%
4 3535
 
5.7%
5 3111
 
5.0%
6 2374
 
3.8%
7 2232
 
3.6%
9 2034
 
3.3%
Other values (2) 1879
 
3.0%
Latin
ValueCountFrequency (%)
M 1
8.3%
C 1
8.3%
T 1
8.3%
O 1
8.3%
P 1
8.3%
S 1
8.3%
N 1
8.3%
I 1
8.3%
A 1
8.3%
X 1
8.3%
Other values (2) 2
16.7%
Hangul
ValueCountFrequency (%)
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 61830
> 99.9%
Hangul 6
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 14232
23.0%
- 9997
16.2%
1 9519
15.4%
2 8767
14.2%
3 4138
 
6.7%
4 3535
 
5.7%
5 3111
 
5.0%
6 2374
 
3.8%
7 2232
 
3.6%
9 2034
 
3.3%
Other values (14) 1891
 
3.1%
Hangul
ValueCountFrequency (%)
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%

Unnamed: 2
Text

MISSING 

Distinct: 2
Distinct (%): 100.0%
Missing: 9998
Missing (%): > 99.9%
Memory size: 156.2 KiB

Length

Max length: 10
Median length: 7
Mean length: 7
Min length: 4

Characters and Unicode

Total characters: 14
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: 조사일자
2nd row: EXAMIN_DAY
ValueCountFrequency (%)
조사일자 1
50.0%
examin_day 1
50.0%

Most occurring characters

ValueCountFrequency (%)
A 2
14.3%
1
 
7.1%
1
 
7.1%
1
 
7.1%
1
 
7.1%
E 1
 
7.1%
X 1
 
7.1%
M 1
 
7.1%
I 1
 
7.1%
N 1
 
7.1%
Other values (3) 3
21.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 9
64.3%
Other Letter 4
28.6%
Connector Punctuation 1
 
7.1%
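
The category counts above can be reproduced from the two non-missing values of this column with Python's standard `unicodedata` module. This is a minimal sketch rather than the profiler's own implementation; the two-letter codes it prints are Unicode general categories (`Lu` = Uppercase Letter, `Lo` = Other Letter, `Pc` = Connector Punctuation).

```python
import unicodedata
from collections import Counter

# The two non-missing values reported for Unnamed: 2.
values = ["조사일자", "EXAMIN_DAY"]

# Count the Unicode general category of every character across all values.
category_counts = Counter(
    unicodedata.category(ch) for value in values for ch in value
)

for code, count in category_counts.most_common():
    print(code, count)
```

The counts match the 9 / 4 / 1 split in the table: nine uppercase Latin letters, four Hangul syllables, and one underscore.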

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 2
22.2%
E 1
11.1%
X 1
11.1%
M 1
11.1%
I 1
11.1%
N 1
11.1%
D 1
11.1%
Y 1
11.1%
Other Letter
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9
64.3%
Hangul 4
28.6%
Common 1
 
7.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 2
22.2%
E 1
11.1%
X 1
11.1%
M 1
11.1%
I 1
11.1%
N 1
11.1%
D 1
11.1%
Y 1
11.1%
Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Common
ValueCountFrequency (%)
_ 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10
71.4%
Hangul 4
 
28.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 2
20.0%
E 1
10.0%
X 1
10.0%
M 1
10.0%
I 1
10.0%
N 1
10.0%
_ 1
10.0%
D 1
10.0%
Y 1
10.0%
Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Unnamed: 3
Text

MISSING 

Distinct: 2
Distinct (%): 100.0%
Missing: 9998
Missing (%): > 99.9%
Memory size: 156.2 KiB

Length

Max length: 11
Median length: 7.5
Mean length: 7.5
Min length: 4

Characters and Unicode

Total characters: 15
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: 조사요일
2nd row: EXAMIN_DATE
ValueCountFrequency (%)
조사요일 1
50.0%
examin_date 1
50.0%

Most occurring characters

ValueCountFrequency (%)
E 2
13.3%
A 2
13.3%
1
 
6.7%
1
 
6.7%
1
 
6.7%
1
 
6.7%
X 1
 
6.7%
M 1
 
6.7%
I 1
 
6.7%
N 1
 
6.7%
Other values (3) 3
20.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 10
66.7%
Other Letter 4
 
26.7%
Connector Punctuation 1
 
6.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 2
20.0%
A 2
20.0%
X 1
10.0%
M 1
10.0%
I 1
10.0%
N 1
10.0%
D 1
10.0%
T 1
10.0%
Other Letter
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10
66.7%
Hangul 4
 
26.7%
Common 1
 
6.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 2
20.0%
A 2
20.0%
X 1
10.0%
M 1
10.0%
I 1
10.0%
N 1
10.0%
D 1
10.0%
T 1
10.0%
Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Common
ValueCountFrequency (%)
_ 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11
73.3%
Hangul 4
 
26.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 2
18.2%
A 2
18.2%
X 1
9.1%
M 1
9.1%
I 1
9.1%
N 1
9.1%
_ 1
9.1%
D 1
9.1%
T 1
9.1%
Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Unnamed: 4
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

여자: 5378
남자: 4619
<NA>: 1
남여구분: 1
MW_MN_SE: 1

Length

Max length: 8
Median length: 2
Mean length: 2.001
Min length: 2

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: 남자
2nd row: 여자
3rd row: 여자
4th row: 여자
5th row: 여자

Common Values

ValueCountFrequency (%)
여자 5378
53.8%
남자 4619
46.2%
<NA> 1
 
< 0.1%
남여구분 1
 
< 0.1%
MW_MN_SE 1
 
< 0.1%
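
An imbalance of 56.9% may look surprising for a split of roughly 54% to 46%. One reading that reproduces both imbalance figures in the Alerts section is a normalised-entropy score, 1 - H / log(k), where H is the entropy of the value counts and k is the number of distinct values. Under that reading, the three single-occurrence values (`<NA>` and the two stray header cells) raise k from 2 to 5 and so inflate the score. The definition below is inferred from the reported numbers and is an assumption, not something this report states; `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k): 0 for a uniform split, near 1 when one value dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    # Shannon entropy of the empirical distribution (natural log; the base cancels).
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)

unnamed_4 = [5378, 4619, 1, 1, 1]  # 여자, 남자, <NA>, 남여구분, MW_MN_SE
unnamed_15 = [9997, 1, 1, 1]       # 2014, <NA>, 년도, YEAR

print(f"{imbalance_score(unnamed_4):.1%}")
print(f"{imbalance_score(unnamed_15):.1%}")
```

With only the two genuine categories (5378 and 4619), the same function returns a score well below 1%, which suggests the alert is largely an artefact of the unparsed header rows.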

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
여자 5378
53.8%
남자 4619
46.2%
na 1
 
< 0.1%
남여구분 1
 
< 0.1%
mw_mn_se 1
 
< 0.1%

Unnamed: 5
Text

MISSING 

Distinct: 2
Distinct (%): 100.0%
Missing: 9998
Missing (%): > 99.9%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 8.5
Mean length: 8.5
Min length: 5

Characters and Unicode

Total characters: 17
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: 조사시간대
2nd row: EXAMIN_TMZON
ValueCountFrequency (%)
조사시간대 1
50.0%
examin_tmzon 1
50.0%

Most occurring characters

ValueCountFrequency (%)
M 2
 
11.8%
N 2
 
11.8%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
E 1
 
5.9%
X 1
 
5.9%
A 1
 
5.9%
Other values (5) 5
29.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 11
64.7%
Other Letter 5
29.4%
Connector Punctuation 1
 
5.9%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 2
18.2%
N 2
18.2%
E 1
9.1%
X 1
9.1%
A 1
9.1%
I 1
9.1%
T 1
9.1%
Z 1
9.1%
O 1
9.1%
Other Letter
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11
64.7%
Hangul 5
29.4%
Common 1
 
5.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 2
18.2%
N 2
18.2%
E 1
9.1%
X 1
9.1%
A 1
9.1%
I 1
9.1%
T 1
9.1%
Z 1
9.1%
O 1
9.1%
Hangul
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%
Common
ValueCountFrequency (%)
_ 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12
70.6%
Hangul 5
29.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 2
16.7%
N 2
16.7%
E 1
8.3%
X 1
8.3%
A 1
8.3%
I 1
8.3%
_ 1
8.3%
T 1
8.3%
Z 1
8.3%
O 1
8.3%
Hangul
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%

Unnamed: 6
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

오전7시30분~11시: 2703
오후2시~5시: 2587
오후5시~8시: 2394
오전11시~오후2시: 2313
<NA>: 1
Other values (2): 2

Length

Max length: 17
Median length: 11
Mean length: 8.776
Min length: 4

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: 오후5시~8시
2nd row: 오전11시~오후2시
3rd row: 오전11시~오후2시
4th row: 오후2시~5시
5th row: 오전11시~오후2시

Common Values

ValueCountFrequency (%)
오전7시30분~11시 2703
27.0%
오후2시~5시 2587
25.9%
오후5시~8시 2394
23.9%
오전11시~오후2시 2313
23.1%
<NA> 1
 
< 0.1%
조사시간대_텍스트 1
 
< 0.1%
EXAMIN_TMZON_TEXT 1
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
오전7시30분~11시 2703
27.0%
오후2시~5시 2587
25.9%
오후5시~8시 2394
23.9%
오전11시~오후2시 2313
23.1%
na 1
 
< 0.1%
조사시간대_텍스트 1
 
< 0.1%
examin_tmzon_text 1
 
< 0.1%

Unnamed: 7
Categorical

HIGH CORRELATION 

Distinct: 14
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

50-54세: 1191
55-59세: 1115
25-29세: 1049
30-34세: 1016
60-64세: 994
Other values (9): 4635

Length

Max length: 8
Median length: 6
Mean length: 5.9997
Min length: 3

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: 40-44세
2nd row: 45-49세
3rd row: 50-54세
4th row: 50-54세
5th row: 25-29세

Common Values

ValueCountFrequency (%)
50-54세 1191
11.9%
55-59세 1115
11.2%
25-29세 1049
10.5%
30-34세 1016
10.2%
60-64세 994
9.9%
45-49세 919
9.2%
20-24세 868
8.7%
35-39세 826
8.3%
65세 이상 785
7.8%
40-44세 749
7.5%
Other values (4) 488
4.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
50-54세 1191
11.0%
55-59세 1115
10.3%
25-29세 1049
9.7%
30-34세 1016
9.4%
60-64세 994
9.2%
45-49세 919
8.5%
20-24세 868
8.0%
35-39세 826
7.7%
65세 785
7.3%
이상 785
7.3%
Other values (5) 1237
11.5%

Unnamed: 8
Categorical

HIGH CORRELATION 

Distinct: 43
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

노원구: 586
성북구: 516
강남구: 497
양천구: 474
구로구: 469
Other values (38): 7458

Length

Max length: 9
Median length: 3
Mean length: 3.0512
Min length: 2

Unique

Unique: 7
Unique (%): 0.1%

Sample

1st row: 구로구
2nd row: 강북구
3rd row: 강동구
4th row: 송파구
5th row: 노원구

Common Values

ValueCountFrequency (%)
노원구 586
 
5.9%
성북구 516
 
5.2%
강남구 497
 
5.0%
양천구 474
 
4.7%
구로구 469
 
4.7%
동대문구 461
 
4.6%
송파구 448
 
4.5%
서초구 440
 
4.4%
영등포구 424
 
4.2%
강서구 417
 
4.2%
Other values (33) 5268
52.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
노원구 586
 
5.9%
성북구 516
 
5.2%
강남구 497
 
5.0%
양천구 474
 
4.7%
구로구 469
 
4.7%
동대문구 461
 
4.6%
송파구 448
 
4.5%
서초구 440
 
4.4%
영등포구 424
 
4.2%
강서구 417
 
4.2%
Other values (33) 5268
52.7%

Unnamed: 9
Categorical

HIGH CORRELATION 

Distinct: 16
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

개인용무/집안일(병원,은행,관공서,종교활동,봉사활동등): 1523
여가/오락/친교/모임/식사(회식,음료 포함): 1376
출근: 1297
업무관련: 1277
물건을사려고: 957
Other values (11): 3570

Length

Max length: 30
Median length: 22
Mean length: 14.1748
Min length: 2

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: 업무관련
2nd row: 여가/오락/친교/모임/식사(회식,음료 포함)
3rd row: 여가/오락/친교/모임/식사(회식,음료 포함)
4th row: 여가/오락/친교/모임/식사(회식,음료 포함)
5th row: 업무관련

Common Values

ValueCountFrequency (%)
개인용무/집안일(병원,은행,관공서,종교활동,봉사활동등) 1523
15.2%
여가/오락/친교/모임/식사(회식,음료 포함) 1376
13.8%
출근 1297
13.0%
업무관련 1277
12.8%
물건을사려고 957
9.6%
그냥걸으려고(운동, 산책, 기분전환 등) 936
9.4%
역/정류장/교통수단이용 737
7.4%
귀가/퇴근/하교 731
7.3%
학업관련(학원/도서관) 413
 
4.1%
누군가를데리러(또는 데려다 주러) 283
 
2.8%
Other values (6) 470
 
4.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
개인용무/집안일(병원,은행,관공서,종교활동,봉사활동등 1523
9.9%
여가/오락/친교/모임/식사(회식,음료 1376
 
9.0%
포함 1376
 
9.0%
출근 1297
 
8.5%
업무관련 1277
 
8.3%
물건을사려고 957
 
6.2%
그냥걸으려고(운동 936
 
6.1%
산책 936
 
6.1%
기분전환 936
 
6.1%
936
 
6.1%
Other values (15) 3779
24.7%

Unnamed: 10
Categorical

HIGH CORRELATION 

Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

주 3~5회: 3225
매일: 3077
주 1~2회: 1821
월1~2회: 1036
오늘 처음: 433
Other values (4): 408

Length

Max length: 8
Median length: 6
Mean length: 4.7031
Min length: 2

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: 월1~2회
2nd row: 매일
3rd row: 6개월 1~3회
4th row: 주 3~5회
5th row: 주 3~5회

Common Values

ValueCountFrequency (%)
주 3~5회 3225
32.2%
매일 3077
30.8%
주 1~2회 1821
18.2%
월1~2회 1036
 
10.4%
오늘 처음 433
 
4.3%
6개월 1~3회 405
 
4.0%
<NA> 1
 
< 0.1%
방문횟수 1
 
< 0.1%
VISIT_FQ 1
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
5046
31.8%
3~5회 3225
20.3%
매일 3077
19.4%
1~2회 1821
 
11.5%
월1~2회 1036
 
6.5%
오늘 433
 
2.7%
처음 433
 
2.7%
6개월 405
 
2.5%
1~3회 405
 
2.5%
na 1
 
< 0.1%
Other values (2) 2
 
< 0.1%

Unnamed: 11
Text

MISSING 

Distinct: 2
Distinct (%): 100.0%
Missing: 9998
Missing (%): > 99.9%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 8.5
Mean length: 8.5
Min length: 4

Characters and Unicode

Total characters: 17
Distinct characters: 11
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: 동행자명
2nd row: CNCDNT_MAN_NM
ValueCountFrequency (%)
동행자명 1
50.0%
cncdnt_man_nm 1
50.0%

Most occurring characters

ValueCountFrequency (%)
N 4
23.5%
C 2
11.8%
_ 2
11.8%
M 2
11.8%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
D 1
 
5.9%
T 1
 
5.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 11
64.7%
Other Letter 4
 
23.5%
Connector Punctuation 2
 
11.8%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 4
36.4%
C 2
18.2%
M 2
18.2%
D 1
 
9.1%
T 1
 
9.1%
A 1
 
9.1%
Other Letter
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11
64.7%
Hangul 4
 
23.5%
Common 2
 
11.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 4
36.4%
C 2
18.2%
M 2
18.2%
D 1
 
9.1%
T 1
 
9.1%
A 1
 
9.1%
Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Common
ValueCountFrequency (%)
_ 2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13
76.5%
Hangul 4
 
23.5%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 4
30.8%
C 2
15.4%
_ 2
15.4%
M 2
15.4%
D 1
 
7.7%
T 1
 
7.7%
A 1
 
7.7%
Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Unnamed: 12
Text

MISSING 

Distinct: 2
Distinct (%): 100.0%
Missing: 9998
Missing (%): > 99.9%
Memory size: 156.2 KiB

Length

Max length: 11
Median length: 7.5
Mean length: 7.5
Min length: 4

Characters and Unicode

Total characters: 15
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: 교통수단
2nd row: TRNSPORT_MN
ValueCountFrequency (%)
교통수단 1
50.0%
trnsport_mn 1
50.0%

Most occurring characters

ValueCountFrequency (%)
T 2
13.3%
R 2
13.3%
N 2
13.3%
1
6.7%
1
6.7%
1
6.7%
1
6.7%
S 1
6.7%
P 1
6.7%
O 1
6.7%
Other values (2) 2
13.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 10
66.7%
Other Letter 4
 
26.7%
Connector Punctuation 1
 
6.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
T 2
20.0%
R 2
20.0%
N 2
20.0%
S 1
10.0%
P 1
10.0%
O 1
10.0%
M 1
10.0%
Other Letter
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10
66.7%
Hangul 4
 
26.7%
Common 1
 
6.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
T 2
20.0%
R 2
20.0%
N 2
20.0%
S 1
10.0%
P 1
10.0%
O 1
10.0%
M 1
10.0%
Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Common
ValueCountFrequency (%)
_ 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11
73.3%
Hangul 4
 
26.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
T 2
18.2%
R 2
18.2%
N 2
18.2%
S 1
9.1%
P 1
9.1%
O 1
9.1%
_ 1
9.1%
M 1
9.1%
Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Unnamed: 13
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

보통: 4706
약간만족: 2356
약간불만족: 1350
매우만족: 1152
매우불만족: 433
Other values (3): 3

Length

Max length: 10
Median length: 5
Mean length: 3.2377
Min length: 2

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: 매우불만족
2nd row: 약간불만족
3rd row: 매우만족
4th row: 매우만족
5th row: 약간불만족

Common Values

ValueCountFrequency (%)
보통 4706
47.1%
약간만족 2356
23.6%
약간불만족 1350
 
13.5%
매우만족 1152
 
11.5%
매우불만족 433
 
4.3%
<NA> 1
 
< 0.1%
보행환경 1
 
< 0.1%
WALK_ENVRN 1
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
보통 4706
47.1%
약간만족 2356
23.6%
약간불만족 1350
 
13.5%
매우만족 1152
 
11.5%
매우불만족 433
 
4.3%
na 1
 
< 0.1%
보행환경 1
 
< 0.1%
walk_envrn 1
 
< 0.1%

Unnamed: 14
Categorical

HIGH CORRELATION 

Distinct: 14
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

(전업) 주부: 2177
사무/기술직: 1535
(대) 학생: 1330
판매/서비스직: 1137
자영업: 1066
Other values (9): 2755

Length

Max length: 7
Median length: 6
Mean length: 5.9387
Min length: 2

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: 사무/기술직
2nd row: (전업) 주부
3rd row: (전업) 주부
4th row: (전업) 주부
5th row: 사무/기술직

Common Values

ValueCountFrequency (%)
(전업) 주부 2177
21.8%
사무/기술직 1535
15.3%
(대) 학생 1330
13.3%
판매/서비스직 1137
11.4%
자영업 1066
10.7%
전문/자유직 839
 
8.4%
무직/기타 713
 
7.1%
일용/작업직 511
 
5.1%
경영/관리직 427
 
4.3%
생산/운수직 259
 
2.6%
Other values (4) 6
 
0.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
전업 2177
16.1%
주부 2177
16.1%
사무/기술직 1535
11.4%
1330
9.8%
학생 1330
9.8%
판매/서비스직 1137
8.4%
자영업 1066
7.9%
전문/자유직 839
 
6.2%
무직/기타 713
 
5.3%
일용/작업직 511
 
3.8%
Other values (6) 692
 
5.1%

Unnamed: 15
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

2014: 9997
<NA>: 1
년도: 1
YEAR: 1

Length

Max length: 4
Median length: 4
Mean length: 3.9998
Min length: 2

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: 2014
2nd row: 2014
3rd row: 2014
4th row: 2014
5th row: 2014

Common Values

ValueCountFrequency (%)
2014 9997
> 99.9%
<NA> 1
 
< 0.1%
년도 1
 
< 0.1%
YEAR 1
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
2014 9997
> 99.9%
na 1
 
< 0.1%
년도 1
 
< 0.1%
year 1
 
< 0.1%

Correlations

Column headings abbreviate "Unnamed: N" as "N".

             2      3      4      5      6      7      8      9      10     11     12     13     14     15
Unnamed: 2   1.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
Unnamed: 3   0.000  1.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
Unnamed: 4   0.000  0.000  1.000  0.000  0.925  0.919  0.955  0.932  0.981  0.000  0.000  0.876  0.943  1.000
Unnamed: 5   0.000  0.000  0.000  1.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
Unnamed: 6   0.000  0.000  0.925  0.000  1.000  0.849  0.909  0.894  0.819  0.000  0.000  0.798  0.850  1.000
Unnamed: 7   0.000  0.000  0.919  0.000  0.849  1.000  0.820  0.806  0.806  0.000  0.000  0.828  0.921  1.000
Unnamed: 8   0.000  0.000  0.955  0.000  0.909  0.820  1.000  0.816  0.859  0.000  0.000  0.891  0.834  1.000
Unnamed: 9   0.000  0.000  0.932  0.000  0.894  0.806  0.816  1.000  0.840  0.000  0.000  0.836  0.829  1.000
Unnamed: 10  0.000  0.000  0.981  0.000  0.819  0.806  0.859  0.840  1.000  0.000  0.000  0.790  0.806  1.000
Unnamed: 11  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  1.000  0.000  0.000  0.000  0.000
Unnamed: 12  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  1.000  0.000  0.000  0.000
Unnamed: 13  0.000  0.000  0.876  0.000  0.798  0.828  0.891  0.836  0.790  0.000  0.000  1.000  0.828  1.000
Unnamed: 14  0.000  0.000  0.943  0.000  0.850  0.921  0.834  0.829  0.806  0.000  0.000  0.828  1.000  1.000
Unnamed: 15  0.000  0.000  1.000  0.000  1.000  1.000  1.000  1.000  1.000  0.000  0.000  1.000  1.000  1.000

Column headings abbreviate "Unnamed: N" as "N".

             9      14     7      4      10     6      15     8      13
Unnamed: 9   1.000  0.484  0.452  0.829  0.564  0.682  0.999  0.381  0.578
Unnamed: 14  0.484  1.000  0.507  0.870  0.538  0.635  0.999  0.428  0.578
Unnamed: 7   0.452  0.507  1.000  0.816  0.536  0.634  0.999  0.409  0.577
Unnamed: 4   0.829  0.870  0.816  1.000  0.816  0.817  1.000  0.816  0.816
Unnamed: 10  0.564  0.538  0.536  0.816  1.000  0.638  1.000  0.545  0.579
Unnamed: 6   0.682  0.635  0.634  0.817  0.638  1.000  1.000  0.633  0.633
Unnamed: 15  0.999  0.999  0.999  1.000  1.000  1.000  1.000  0.998  1.000
Unnamed: 8   0.381  0.428  0.409  0.816  0.545  0.633  0.998  1.000  0.582
Unnamed: 13  0.578  0.578  0.577  0.816  0.579  0.633  1.000  0.582  1.000

Column headings abbreviate "Unnamed: N" as "N".

             4      6      7      8      9      10     13     14     15
Unnamed: 4   1.000  0.817  0.816  0.816  0.829  0.816  0.816  0.870  1.000
Unnamed: 6   0.817  1.000  0.634  0.633  0.682  0.638  0.633  0.635  1.000
Unnamed: 7   0.816  0.634  1.000  0.409  0.452  0.536  0.577  0.507  0.999
Unnamed: 8   0.816  0.633  0.409  1.000  0.381  0.545  0.582  0.428  0.998
Unnamed: 9   0.829  0.682  0.452  0.381  1.000  0.564  0.578  0.484  0.999
Unnamed: 10  0.816  0.638  0.536  0.545  0.564  1.000  0.579  0.538  1.000
Unnamed: 13  0.816  0.633  0.577  0.582  0.578  0.579  1.000  0.578  1.000
Unnamed: 14  0.870  0.635  0.507  0.428  0.484  0.538  0.578  1.000  0.999
Unnamed: 15  1.000  1.000  0.999  0.998  0.999  1.000  1.000  0.999  1.000
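
The near-perfect association between Unnamed: 15 (almost always 2014) and every other categorical column is what one would expect if the stray header cells are treated as ordinary categories. The sketch below illustrates this with a plain, uncorrected Cramér's V written with the standard library only. It is an assumption that the matrices above report a measure of this kind; this report does not name the coefficient, and the profiler may apply a bias correction that this sketch omits. The function name `cramers_v` and the five-row toy data are illustrative.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Plain (uncorrected) Cramér's V for two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, rx in row_totals.items():
        for y, cy in col_totals.items():
            expected = rx * cy / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    # A constant column has no variation, so the association is reported as 0.
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Toy data: four data rows plus one stray identifier row.
gender = ["남자", "여자", "남자", "여자", "MW_MN_SE"]
year = ["2014", "2014", "2014", "2014", "YEAR"]

with_header_row = cramers_v(gender, year)
without_header_row = cramers_v(gender[:4], year[:4])
print(with_header_row, without_header_row)
```

In the toy data the single identifier row is enough to drive the coefficient to about 1, and removing it leaves a constant year column with no measurable association. Re-profiling after the header rows are removed should show whether the correlations above are genuine.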

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
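
As a concrete reading of nullity correlation, the sketch below computes the Pearson correlation between the missingness indicators of two columns, using only the standard library. The helper name `nullity_correlation` and the five-row toy columns are illustrative; they mimic the pattern in this report, where Unnamed: 2 and Unnamed: 3 are populated in the same two rows and empty elsewhere.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    # A column that is never (or always) missing has no nullity variation.
    if var_a == 0 or var_b == 0:
        return float("nan")
    return cov / math.sqrt(var_a * var_b)

unnamed_2 = ["조사일자", "EXAMIN_DAY", None, None, None]
unnamed_3 = ["조사요일", "EXAMIN_DATE", None, None, None]
unnamed_4 = ["남여구분", "MW_MN_SE", "남자", "여자", "여자"]

print(nullity_correlation(unnamed_2, unnamed_3))
print(nullity_correlation(unnamed_2, unnamed_4))
```

Columns that are missing in exactly the same rows correlate at about 1, while a fully populated column yields an undefined value, which is why such columns typically drop out of a nullity heatmap.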

Sample

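(index) | 유동인구_속성조사_2014 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | Unnamed: 11 | Unnamed: 12 | Unnamed: 13 | Unnamed: 14 | Unnamed: 15
19999 | 19997 | 25-816 | <NA> | <NA> | 남자 | <NA> | 오후5시~8시 | 40-44세 | 구로구 | 업무관련 | 월1~2회 | <NA> | <NA> | 매우불만족 | 사무/기술직 | 2014
7113 | 7111 | 09-011 | <NA> | <NA> | 여자 | <NA> | 오전11시~오후2시 | 45-49세 | 강북구 | 여가/오락/친교/모임/식사(회식,음료 포함) | 매일 | <NA> | <NA> | 약간불만족 | (전업) 주부 | 2014
17684 | 17682 | 23-066 | <NA> | <NA> | 여자 | <NA> | 오전11시~오후2시 | 50-54세 | 강동구 | 여가/오락/친교/모임/식사(회식,음료 포함) | 6개월 1~3회 | <NA> | <NA> | 매우만족 | (전업) 주부 | 2014
19106 | 19104 | 24-2040 | <NA> | <NA> | 여자 | <NA> | 오후2시~5시 | 50-54세 | 송파구 | 여가/오락/친교/모임/식사(회식,음료 포함) | 주 3~5회 | <NA> | <NA> | 매우만족 | (전업) 주부 | 2014
1716 | 1714 | 02-051 | <NA> | <NA> | 여자 | <NA> | 오전11시~오후2시 | 25-29세 | 노원구 | 업무관련 | 주 3~5회 | <NA> | <NA> | 약간불만족 | 사무/기술직 | 2014
19490 | 19488 | 25-014 | <NA> | <NA> | 남자 | <NA> | 오전7시30분~11시 | 30-34세 | 강동구 | 출근 | 매일 | <NA> | <NA> | 보통 | 사무/기술직 | 2014
8486 | 8484 | 11-053 | <NA> | <NA> | 여자 | <NA> | 오후2시~5시 | 40-44세 | 노원구 | 누군가를데리러(또는 데려다 주러) | 주 3~5회 | <NA> | <NA> | 약간만족 | (전업) 주부 | 2014
15995 | 15993 | 22-036 | <NA> | <NA> | 여자 | <NA> | 오후2시~5시 | 50-54세 | 서초구 | 물건을사려고 | 매일 | <NA> | <NA> | 매우만족 | (전업) 주부 | 2014
5523 | 5521 | 07-022 | <NA> | <NA> | 여자 | <NA> | 오전11시~오후2시 | 55-59세 | 중랑구 | 업무관련 | 주 3~5회 | <NA> | <NA> | 약간만족 | 생산/운수직 | 2014
4049 | 4047 | 05-008 | <NA> | <NA> | 여자 | <NA> | 오전7시30분~11시 | 55-59세 | 광진구 | 업무관련 | 오늘 처음 | <NA> | <NA> | 약간만족 | 사무/기술직 | 2014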

(index) | 유동인구_속성조사_2014 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | Unnamed: 11 | Unnamed: 12 | Unnamed: 13 | Unnamed: 14 | Unnamed: 15
4082 | 4080 | 05-011 | <NA> | <NA> | 여자 | <NA> | 오전11시~오후2시 | 25-29세 | 용산구 | 업무관련 | 주 1~2회 | <NA> | <NA> | 보통 | 판매/서비스직 | 2014
9174 | 9172 | 12-002 | <NA> | <NA> | 여자 | <NA> | 오전7시30분~11시 | 20-24세 | 은평구 | 학업관련(학원/도서관) | 주 3~5회 | <NA> | <NA> | 약간만족 | (대) 학생 | 2014
4840 | 4838 | 06-021 | <NA> | <NA> | 여자 | <NA> | 오후5시~8시 | 20-24세 | 동대문구 | 여가/오락/친교/모임/식사(회식,음료 포함) | 주 1~2회 | <NA> | <NA> | 보통 | (대) 학생 | 2014
875 | 873 | 01-2029 | <NA> | <NA> | 남자 | <NA> | 오전7시30분~11시 | 45-49세 | 강원 | 여가/오락/친교/모임/식사(회식,음료 포함) | 오늘 처음 | <NA> | <NA> | 보통 | 사무/기술직 | 2014
7506 | 7504 | 09-3124 | <NA> | <NA> | 남자 | <NA> | 오후2시~5시 | 45-49세 | 강북구 | 개인용무/집안일(병원,은행,관공서,종교활동,봉사활동등) | 6개월 1~3회 | <NA> | <NA> | 매우만족 | 자영업 | 2014
9657 | 9655 | 12-106 | <NA> | <NA> | 여자 | <NA> | 오후5시~8시 | 35-39세 | 은평구 | 개인용무/집안일(병원,은행,관공서,종교활동,봉사활동등) | 주 1~2회 | <NA> | <NA> | 보통 | (전업) 주부 | 2014
3137 | 3135 | 03-1066 | <NA> | <NA> | 남자 | <NA> | 오후2시~5시 | 45-49세 | 성동구 | 역/정류장/교통수단이용 | 월1~2회 | <NA> | <NA> | 약간불만족 | 생산/운수직 | 2014
9219 | 9217 | 12-006 | <NA> | <NA> | 여자 | <NA> | 오전7시30분~11시 | 15-19세 | 은평구 | 학업관련(학원/도서관) | 매일 | <NA> | <NA> | 매우불만족 | (대) 학생 | 2014
5312 | 5310 | 06-265 | <NA> | <NA> | 남자 | <NA> | 오후5시~8시 | 25-29세 | 동대문구 | 여가/오락/친교/모임/식사(회식,음료 포함) | 주 3~5회 | <NA> | <NA> | 매우불만족 | 사무/기술직 | 2014
2967 | 2965 | 03-067 | <NA> | <NA> | 남자 | <NA> | 오전11시~오후2시 | 40-44세 | 양천구 | 역/정류장/교통수단이용 | 월1~2회 | <NA> | <NA> | 약간불만족 | 사무/기술직 | 2014