Overview

Dataset statistics

Number of variables: 16
Number of observations: 10000
Missing cells: 29999
Missing cells (%): 18.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.3 MiB
Average record size in memory: 136.0 B
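The missing-cell total can be cross-checked against the per-variable figures reported in the Variables section. The sketch below assumes those figures are complete: 9999 missing values each for Unnamed: 5, Unnamed: 11 and Unnamed: 12, and 1 each for the first two variables. The dictionary and variable names are illustrative only.

```python
# Per-variable missing counts as listed in the Variables section of this report.
missing_per_variable = {
    "유동인구_속성조사_2015": 1,
    "Unnamed: 1": 1,
    "Unnamed: 5": 9999,
    "Unnamed: 11": 9999,
    "Unnamed: 12": 9999,
}
n_variables = 16
n_observations = 10000

# Total missing cells and their share of all cells in the table.
missing_cells = sum(missing_per_variable.values())
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)          # 29999
print(f"{missing_pct:.1f}%")  # 18.7%
```

Both results agree with the overview, which suggests that almost all missing cells come from the three near-empty columns.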

Variable types

Text: 5
Categorical: 11

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-21706/F/1/datasetView.do

Alerts

Unnamed: 5 has constant value "" (Constant)
Unnamed: 11 has constant value "" (Constant)
Unnamed: 12 has constant value "" (Constant)
Unnamed: 9 is highly overall correlated with Unnamed: 3 and 3 other fields (High correlation)
Unnamed: 14 is highly overall correlated with Unnamed: 3 and 3 other fields (High correlation)
Unnamed: 7 is highly overall correlated with Unnamed: 3 and 3 other fields (High correlation)
Unnamed: 4 is highly overall correlated with Unnamed: 2 and 9 other fields (High correlation)
Unnamed: 2 is highly overall correlated with Unnamed: 3 and 3 other fields (High correlation)
Unnamed: 10 is highly overall correlated with Unnamed: 3 and 3 other fields (High correlation)
Unnamed: 6 is highly overall correlated with Unnamed: 2 and 9 other fields (High correlation)
Unnamed: 15 is highly overall correlated with Unnamed: 2 and 9 other fields (High correlation)
Unnamed: 8 is highly overall correlated with Unnamed: 3 and 3 other fields (High correlation)
Unnamed: 3 is highly overall correlated with Unnamed: 2 and 9 other fields (High correlation)
Unnamed: 13 is highly overall correlated with Unnamed: 3 and 3 other fields (High correlation)
Unnamed: 4 is highly imbalanced (50.1%) (Imbalance)
Unnamed: 15 is highly imbalanced (99.8%) (Imbalance)
Unnamed: 5 has 9999 (> 99.9%) missing values (Missing)
Unnamed: 11 has 9999 (> 99.9%) missing values (Missing)
Unnamed: 12 has 9999 (> 99.9%) missing values (Missing)
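
Taken together, these alerts suggest, though the report does not confirm it, that the source file has a title row above its real header: the columns are named "Unnamed: N", and header-like strings such as EXAMIN_DAY and YEAR appear exactly once as data values. The following is a minimal sketch of how such a file could be re-read with pandas, assuming CSV input. The inline sample and the choice of three columns are hypothetical and are not taken from the actual file.

```python
import io

import pandas as pd

# Hypothetical three-column excerpt: a title row, then the real header, then data.
raw = (
    "유동인구_속성조사_2015,,\n"
    "EXAMIN_DAY,MW_MN_SE,YEAR\n"
    "1017,여자,2015\n"
    "1016,남자,2015\n"
)

# Default parsing treats the title row as the header, so blank header cells
# become "Unnamed: N" and the real header is read as the first data row.
naive = pd.read_csv(io.StringIO(raw), dtype=str)

# header=1 uses the second line as the header and discards the title row.
fixed = pd.read_csv(io.StringIO(raw), header=1, dtype=str)

print(list(naive.columns))
print(list(fixed.columns))
```

If the real file follows this layout, profiling the re-read frame would likely remove the three constant/missing alerts and the header strings from the value counts.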

Reproduction

Analysis started: 2023-12-11 03:59:33.304442
Analysis finished: 2023-12-11 03:59:35.832201
Duration: 2.53 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

유동인구_속성조사_2015
Text

Distinct: 9999
Distinct (%): 100.0%
Missing: 1
Missing (%): < 0.1%
Memory size: 156.2 KiB

Length

Max length14
Median length5
Mean length4.4482448
Min length1

Characters and Unicode

Total characters44478
Distinct characters21
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique9999 ?
Unique (%)100.0%

Sample

1st row18472
2nd row8387
3rd row2889
4th row8673
5th row4092
ValueCountFrequency (%)
18472 1
 
< 0.1%
6339 1
 
< 0.1%
16396 1
 
< 0.1%
15738 1
 
< 0.1%
17665 1
 
< 0.1%
3086 1
 
< 0.1%
19037 1
 
< 0.1%
7805 1
 
< 0.1%
5853 1
 
< 0.1%
3812 1
 
< 0.1%
Other values (9989) 9989
99.9%

Most occurring characters

ValueCountFrequency (%)
1 9038
20.3%
7 4073
9.2%
3 4050
9.1%
8 4034
9.1%
6 4029
9.1%
2 3995
9.0%
9 3955
8.9%
5 3948
8.9%
4 3938
8.9%
0 3404
 
7.7%
Other values (11) 14
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 44464
> 99.9%
Uppercase Letter 12
 
< 0.1%
Connector Punctuation 2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 9038
20.3%
7 4073
9.2%
3 4050
9.1%
8 4034
9.1%
6 4029
9.1%
2 3995
9.0%
9 3955
8.9%
5 3948
8.9%
4 3938
8.9%
0 3404
 
7.7%
Uppercase Letter
ValueCountFrequency (%)
A 2
16.7%
I 2
16.7%
D 1
8.3%
T 1
8.3%
R 1
8.3%
B 1
8.3%
E 1
8.3%
X 1
8.3%
M 1
8.3%
N 1
8.3%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 44466
> 99.9%
Latin 12
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
1 9038
20.3%
7 4073
9.2%
3 4050
9.1%
8 4034
9.1%
6 4029
9.1%
2 3995
9.0%
9 3955
8.9%
5 3948
8.9%
4 3938
8.9%
0 3404
 
7.7%
Latin
ValueCountFrequency (%)
A 2
16.7%
I 2
16.7%
D 1
8.3%
T 1
8.3%
R 1
8.3%
B 1
8.3%
E 1
8.3%
X 1
8.3%
M 1
8.3%
N 1
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 44478
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 9038
20.3%
7 4073
9.2%
3 4050
9.1%
8 4034
9.1%
6 4029
9.1%
2 3995
9.0%
9 3955
8.9%
5 3948
8.9%
4 3938
8.9%
0 3404
 
7.7%
Other values (11) 14
 
< 0.1%

Unnamed: 1
Text

Distinct: 1001
Distinct (%): 10.0%
Missing: 1
Missing (%): < 0.1%
Memory size: 156.2 KiB

Length

Max length14
Median length6
Mean length6.2613261
Min length6

Characters and Unicode

Total characters62607
Distinct characters24
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st row24-032
2nd row11-148
3rd row03-1285
4th row11-3002
5th row05-228
ValueCountFrequency (%)
23-087 17
 
0.2%
03-054 17
 
0.2%
06-030 16
 
0.2%
12-208 15
 
0.2%
25-029 15
 
0.2%
06-208 15
 
0.2%
22-3009 15
 
0.2%
24-1905 15
 
0.2%
15-067 15
 
0.2%
15-120 15
 
0.2%
Other values (991) 9844
98.4%

Most occurring characters

ValueCountFrequency (%)
0 12837
20.5%
- 9998
16.0%
1 9877
15.8%
2 9185
14.7%
3 4391
 
7.0%
4 3664
 
5.9%
5 3148
 
5.0%
6 2500
 
4.0%
8 2433
 
3.9%
7 2373
 
3.8%
Other values (14) 2201
 
3.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 52595
84.0%
Dash Punctuation 9998
 
16.0%
Uppercase Letter 12
 
< 0.1%
Connector Punctuation 2
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 1
8.3%
T 1
8.3%
O 1
8.3%
P 1
8.3%
S 1
8.3%
X 1
8.3%
N 1
8.3%
I 1
8.3%
M 1
8.3%
A 1
8.3%
Other values (2) 2
16.7%
Decimal Number
ValueCountFrequency (%)
0 12837
24.4%
1 9877
18.8%
2 9185
17.5%
3 4391
 
8.3%
4 3664
 
7.0%
5 3148
 
6.0%
6 2500
 
4.8%
8 2433
 
4.6%
7 2373
 
4.5%
9 2187
 
4.2%
Dash Punctuation
ValueCountFrequency (%)
- 9998
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 62595
> 99.9%
Latin 12
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
0 12837
20.5%
- 9998
16.0%
1 9877
15.8%
2 9185
14.7%
3 4391
 
7.0%
4 3664
 
5.9%
5 3148
 
5.0%
6 2500
 
4.0%
8 2433
 
3.9%
7 2373
 
3.8%
Other values (2) 2189
 
3.5%
Latin
ValueCountFrequency (%)
C 1
8.3%
T 1
8.3%
O 1
8.3%
P 1
8.3%
S 1
8.3%
X 1
8.3%
N 1
8.3%
I 1
8.3%
M 1
8.3%
A 1
8.3%
Other values (2) 2
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 62607
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 12837
20.5%
- 9998
16.0%
1 9877
15.8%
2 9185
14.7%
3 4391
 
7.0%
4 3664
 
5.9%
5 3148
 
5.0%
6 2500
 
4.0%
8 2433
 
3.9%
7 2373
 
3.8%
Other values (14) 2201
 
3.5%

Unnamed: 2
Categorical

HIGH CORRELATION 

Distinct10
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
1016
1344 
1017
1328 
1023
1294 
1030
1288 
1024
1274 
Other values (5)
3472 

Length

Max length10
Median length4
Mean length4.0006
Min length4

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st row1017
2nd row1016
3rd row1023
4th row1017
5th row1002

Common Values

ValueCountFrequency (%)
1016 1344
13.4%
1017 1328
13.3%
1023 1294
12.9%
1030 1288
12.9%
1024 1274
12.7%
1031 1263
12.6%
1010 1117
11.2%
1002 1090
10.9%
EXAMIN_DAY 1
 
< 0.1%
<NA> 1
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1016 1344
13.4%
1017 1328
13.3%
1023 1294
12.9%
1030 1288
12.9%
1024 1274
12.7%
1031 1263
12.6%
1010 1117
11.2%
1002 1090
10.9%
examin_day 1
 
< 0.1%
na 1
 
< 0.1%

Unnamed: 3
Categorical

HIGH CORRELATION 

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
5016 
4982 
EXAMIN_DATE
 
1
<NA>
 
1

Length

Max length11
Median length1
Mean length1.0013
Min length1

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

ValueCountFrequency (%)
5016
50.2%
4982
49.8%
EXAMIN_DATE 1
 
< 0.1%
<NA> 1
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
5016
50.2%
4982
49.8%
examin_date 1
 
< 0.1%
na 1
 
< 0.1%

Unnamed: 4
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
여자
5377 
남자
4621 
MW_MN_SE
 
1
<NA>
 
1

Length

Max length8
Median length2
Mean length2.0008
Min length2

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st row여자
2nd row여자
3rd row여자
4th row여자
5th row여자

Common Values

ValueCountFrequency (%)
여자 5377
53.8%
남자 4621
46.2%
MW_MN_SE 1
 
< 0.1%
<NA> 1
 
< 0.1%
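
The imbalance figures in this report (50.1% for Unnamed: 4, 99.8% for Unnamed: 15) are consistent with a normalized-entropy score, 1 - H / log2(k), where H is the Shannon entropy of the value counts and k is the number of distinct values. The sketch below reproduces both figures from the counts listed in the report. It is an illustration of that formula, and the function name is hypothetical rather than the profiler's own API.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of category counts.

    H is the Shannon entropy (base 2) of the class proportions and k is the
    number of classes. A single-class input is scored 0.0 by convention here.
    """
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Counts for Unnamed: 4: 여자, 남자, MW_MN_SE, <NA>.
score_unnamed_4 = imbalance_score([5377, 4621, 1, 1])
# Counts for Unnamed: 15: 2015, YEAR, <NA>.
score_unnamed_15 = imbalance_score([9998, 1, 1])

print(f"{100 * score_unnamed_4:.1f}%")   # 50.1%
print(f"{100 * score_unnamed_15:.1f}%")  # 99.8%
```

Note that the two single-occurrence values (the header string and `<NA>`) count as classes. Under this formula they are what raises the score for an otherwise nearly balanced two-value variable to about 50%.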

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
여자 5377
53.8%
남자 4621
46.2%
mw_mn_se 1
 
< 0.1%
na 1
 
< 0.1%

Unnamed: 5
Text

CONSTANT  MISSING 

Distinct1
Distinct (%)100.0%
Missing9999
Missing (%)> 99.9%
Memory size156.2 KiB

Length

Max length12
Median length12
Mean length12
Min length12

Characters and Unicode

Total characters12
Distinct characters10
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
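
As a small illustration of these character properties, the sketch below counts Unicode general categories for the single non-missing value of this variable, EXAMIN_TMZON, using Python's standard `unicodedata` module. The standard library exposes general categories but not scripts or blocks, so only categories are shown.

```python
import unicodedata
from collections import Counter

# The only non-missing value reported for Unnamed: 5.
value = "EXAMIN_TMZON"

# Map each character to its two-letter general category code and count them.
# "Lu" is Uppercase Letter, "Pc" is Connector Punctuation.
categories = Counter(unicodedata.category(ch) for ch in value)

print(categories)  # Counter({'Lu': 11, 'Pc': 1})
```

The result matches the category table for this variable: 11 uppercase letters and 1 connector punctuation character.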

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st rowEXAMIN_TMZON
ValueCountFrequency (%)
examin_tmzon 1
100.0%

Most occurring characters

ValueCountFrequency (%)
M 2
16.7%
N 2
16.7%
E 1
8.3%
X 1
8.3%
A 1
8.3%
I 1
8.3%
_ 1
8.3%
T 1
8.3%
Z 1
8.3%
O 1
8.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 11
91.7%
Connector Punctuation 1
 
8.3%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 2
18.2%
N 2
18.2%
E 1
9.1%
X 1
9.1%
A 1
9.1%
I 1
9.1%
T 1
9.1%
Z 1
9.1%
O 1
9.1%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11
91.7%
Common 1
 
8.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 2
18.2%
N 2
18.2%
E 1
9.1%
X 1
9.1%
A 1
9.1%
I 1
9.1%
T 1
9.1%
Z 1
9.1%
O 1
9.1%
Common
ValueCountFrequency (%)
_ 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 2
16.7%
N 2
16.7%
E 1
8.3%
X 1
8.3%
A 1
8.3%
I 1
8.3%
_ 1
8.3%
T 1
8.3%
Z 1
8.3%
O 1
8.3%

Unnamed: 6
Categorical

HIGH CORRELATION 

Distinct6
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
오전7시30분~11시
2918 
오후2시~5시
2728 
오전11시~오후2시
2184 
오후5시~8시
2168 
EXAMIN_TMZON_TEXT
 
1

Length

Max length17
Median length11
Mean length8.8231
Min length4

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st row오후2시~5시
2nd row오후2시~5시
3rd row오후5시~8시
4th row오전7시30분~11시
5th row오후5시~8시

Common Values

ValueCountFrequency (%)
오전7시30분~11시 2918
29.2%
오후2시~5시 2728
27.3%
오전11시~오후2시 2184
21.8%
오후5시~8시 2168
21.7%
EXAMIN_TMZON_TEXT 1
 
< 0.1%
<NA> 1
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
오전7시30분~11시 2918
29.2%
오후2시~5시 2728
27.3%
오전11시~오후2시 2184
21.8%
오후5시~8시 2168
21.7%
examin_tmzon_text 1
 
< 0.1%
na 1
 
< 0.1%

Unnamed: 7
Categorical

HIGH CORRELATION 

Distinct13
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
50-54세
1170 
55-59세
1133 
25-29세
1039 
30-34세
979 
60-64세
969 
Other values (8)
4710 

Length

Max length8
Median length6
Mean length5.9163
Min length4

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st row40-44세
2nd row45-49세
3rd row35-39세
4th row35-39세
5th row65세이상

Common Values

ValueCountFrequency (%)
50-54세 1170
11.7%
55-59세 1133
11.3%
25-29세 1039
10.4%
30-34세 979
9.8%
60-64세 969
9.7%
45-49세 905
9.0%
20-24세 901
9.0%
65세이상 837
8.4%
35-39세 779
7.8%
40-44세 745
7.4%
Other values (3) 543
5.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
50-54세 1170
11.7%
55-59세 1133
11.3%
25-29세 1039
10.4%
30-34세 979
9.8%
60-64세 969
9.7%
45-49세 905
9.0%
20-24세 901
9.0%
65세이상 837
8.4%
35-39세 779
7.8%
40-44세 745
7.4%
Other values (3) 543
5.4%

Unnamed: 8
Categorical

HIGH CORRELATION 

Distinct27
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
노원구
 
481
송파구
 
479
서초구
 
479
강남구
 
467
양천구
 
461
Other values (22)
7633 

Length

Max length9
Median length3
Mean length3.1333
Min length2

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st row송파구
2nd row노원구
3rd row용산구
4th row노원구
5th row광진구

Common Values

ValueCountFrequency (%)
노원구 481
 
4.8%
송파구 479
 
4.8%
서초구 479
 
4.8%
강남구 467
 
4.7%
양천구 461
 
4.6%
동대문구 458
 
4.6%
구로구 434
 
4.3%
중랑구 432
 
4.3%
강북구 432
 
4.3%
성북구 424
 
4.2%
Other values (17) 5453
54.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
노원구 481
 
4.8%
송파구 479
 
4.8%
서초구 479
 
4.8%
강남구 467
 
4.7%
양천구 461
 
4.6%
동대문구 458
 
4.6%
구로구 434
 
4.3%
중랑구 432
 
4.3%
강북구 432
 
4.3%
성북구 424
 
4.2%
Other values (17) 5453
54.5%

Unnamed: 9
Categorical

HIGH CORRELATION 

Distinct14
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
개인용무/집안일(병원,은행,관공서,종교활동,봉사활동등)
1462 
여가/오락/친교/모임/식사(회식,음료포함)
1451 
출근
1309 
업무관련
1296 
그냥 걸으려고(운동,산책,기분전환등)
963 
Other values (9)
3519 

Length

Max length30
Median length16
Mean length13.8466
Min length2

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st row개인용무/집안일(병원,은행,관공서,종교활동,봉사활동등)
2nd row그냥 걸으려고(운동,산책,기분전환등)
3rd row물건을사려고
4th row누군가를데리러(또는데려다주러)
5th row물건을사려고

Common Values

ValueCountFrequency (%)
개인용무/집안일(병원,은행,관공서,종교활동,봉사활동등) 1462
14.6%
여가/오락/친교/모임/식사(회식,음료포함) 1451
14.5%
출근 1309
13.1%
업무관련 1296
13.0%
그냥 걸으려고(운동,산책,기분전환등) 963
9.6%
물건을사려고 913
9.1%
역/정류장/교통수단 이용 714
7.1%
귀가/퇴근/하교 669
6.7%
학업관련(학원/도서관) 477
 
4.8%
누군가를데리러(또는데려다주러) 312
 
3.1%
Other values (4) 434
 
4.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
개인용무/집안일(병원,은행,관공서,종교활동,봉사활동등 1462
12.5%
여가/오락/친교/모임/식사(회식,음료포함 1451
12.4%
출근 1309
11.2%
업무관련 1296
11.1%
그냥 963
8.2%
걸으려고(운동,산책,기분전환등 963
8.2%
물건을사려고 913
7.8%
역/정류장/교통수단 714
6.1%
이용 714
6.1%
귀가/퇴근/하교 669
5.7%
Other values (6) 1223
10.5%

Unnamed: 10
Categorical

HIGH CORRELATION 

Distinct8
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
주3~5회
3196 
매일
3035 
주1~2회
1849 
월1~2회
1041 
오늘처음
464 
Other values (3)
415 

Length

Max length8
Median length5
Mean length4.1259
Min length2

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st row주3~5회
2nd row주3~5회
3rd row주3~5회
4th row주3~5회
5th row월1~2회

Common Values

ValueCountFrequency (%)
주3~5회 3196
32.0%
매일 3035
30.3%
주1~2회 1849
18.5%
월1~2회 1041
 
10.4%
오늘처음 464
 
4.6%
6개월1~3회 413
 
4.1%
VISIT_FQ 1
 
< 0.1%
<NA> 1
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
주3~5회 3196
32.0%
매일 3035
30.3%
주1~2회 1849
18.5%
월1~2회 1041
 
10.4%
오늘처음 464
 
4.6%
6개월1~3회 413
 
4.1%
visit_fq 1
 
< 0.1%
na 1
 
< 0.1%

Unnamed: 11
Text

CONSTANT  MISSING 

Distinct1
Distinct (%)100.0%
Missing9999
Missing (%)> 99.9%
Memory size156.2 KiB

Length

Max length13
Median length13
Mean length13
Min length13

Characters and Unicode

Total characters13
Distinct characters7
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st rowCNCDNT_MAN_NM
ValueCountFrequency (%)
cncdnt_man_nm 1
100.0%

Most occurring characters

ValueCountFrequency (%)
N 4
30.8%
C 2
15.4%
_ 2
15.4%
M 2
15.4%
D 1
 
7.7%
T 1
 
7.7%
A 1
 
7.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 11
84.6%
Connector Punctuation 2
 
15.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 4
36.4%
C 2
18.2%
M 2
18.2%
D 1
 
9.1%
T 1
 
9.1%
A 1
 
9.1%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11
84.6%
Common 2
 
15.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 4
36.4%
C 2
18.2%
M 2
18.2%
D 1
 
9.1%
T 1
 
9.1%
A 1
 
9.1%
Common
ValueCountFrequency (%)
_ 2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 4
30.8%
C 2
15.4%
_ 2
15.4%
M 2
15.4%
D 1
 
7.7%
T 1
 
7.7%
A 1
 
7.7%

Unnamed: 12
Text

CONSTANT  MISSING 

Distinct1
Distinct (%)100.0%
Missing9999
Missing (%)> 99.9%
Memory size156.2 KiB

Length

Max length11
Median length11
Mean length11
Min length11

Characters and Unicode

Total characters11
Distinct characters8
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st rowTRNSPORT_MN
ValueCountFrequency (%)
trnsport_mn 1
100.0%

Most occurring characters

ValueCountFrequency (%)
T 2
18.2%
R 2
18.2%
N 2
18.2%
S 1
9.1%
P 1
9.1%
O 1
9.1%
_ 1
9.1%
M 1
9.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 10
90.9%
Connector Punctuation 1
 
9.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
T 2
20.0%
R 2
20.0%
N 2
20.0%
S 1
10.0%
P 1
10.0%
O 1
10.0%
M 1
10.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10
90.9%
Common 1
 
9.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
T 2
20.0%
R 2
20.0%
N 2
20.0%
S 1
10.0%
P 1
10.0%
O 1
10.0%
M 1
10.0%
Common
ValueCountFrequency (%)
_ 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
T 2
18.2%
R 2
18.2%
N 2
18.2%
S 1
9.1%
P 1
9.1%
O 1
9.1%
_ 1
9.1%
M 1
9.1%

Unnamed: 13
Categorical

HIGH CORRELATION 

Distinct7
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
보통
4751 
약간불만족
2380 
약간만족
1270 
매우불만족
1198 
매우만족
 
399
Other values (2)
 
2

Length

Max length10
Median length5
Mean length3.4082
Min length2

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st row약간만족
2nd row매우만족
3rd row매우불만족
4th row약간불만족
5th row보통

Common Values

ValueCountFrequency (%)
보통 4751
47.5%
약간불만족 2380
23.8%
약간만족 1270
 
12.7%
매우불만족 1198
 
12.0%
매우만족 399
 
4.0%
WALK_ENVRN 1
 
< 0.1%
<NA> 1
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
보통 4751
47.5%
약간불만족 2380
23.8%
약간만족 1270
 
12.7%
매우불만족 1198
 
12.0%
매우만족 399
 
4.0%
walk_envrn 1
 
< 0.1%
na 1
 
< 0.1%

Unnamed: 14
Categorical

HIGH CORRELATION 

Distinct12
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
(전업)주부
2167 
사무/기술직
1487 
(대)학생
1453 
판매/서비스직
1142 
자영업
1068 
Other values (7)
2683 

Length

Max length7
Median length6
Mean length5.5014
Min length2

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st row판매/서비스직
2nd row(전업)주부
3rd row자영업
4th row(전업)주부
5th row(전업)주부

Common Values

ValueCountFrequency (%)
(전업)주부 2167
21.7%
사무/기술직 1487
14.9%
(대)학생 1453
14.5%
판매/서비스직 1142
11.4%
자영업 1068
10.7%
전문/자유직 862
 
8.6%
<NA> 728
 
7.3%
일용/작업직 453
 
4.5%
경영/관리직 379
 
3.8%
생산/운수직 256
 
2.6%
Other values (2) 5
 
0.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
전업)주부 2167
21.7%
사무/기술직 1487
14.9%
대)학생 1453
14.5%
판매/서비스직 1142
11.4%
자영업 1068
10.7%
전문/자유직 862
 
8.6%
na 728
 
7.3%
일용/작업직 453
 
4.5%
경영/관리직 379
 
3.8%
생산/운수직 256
 
2.6%
Other values (2) 5
 
< 0.1%

Unnamed: 15
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
2015
9998 
YEAR
 
1
<NA>
 
1

Length

Max length4
Median length4
Mean length4
Min length4

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st row2015
2nd row2015
3rd row2015
4th row2015
5th row2015

Common Values

ValueCountFrequency (%)
2015 9998
> 99.9%
YEAR 1
 
< 0.1%
<NA> 1
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
2015 9998
> 99.9%
year 1
 
< 0.1%
na 1
 
< 0.1%

Correlations

Variable | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | Unnamed: 13 | Unnamed: 14 | Unnamed: 15
Unnamed: 2 | 1.000 | 1.000 | 0.938 | 0.700 | 0.661 | 0.710 | 0.658 | 0.641 | 0.710 | 0.637 | 1.000
Unnamed: 3 | 1.000 | 1.000 | 0.943 | 0.718 | 0.932 | 0.872 | 0.836 | 0.769 | 0.941 | 0.821 | 1.000
Unnamed: 4 | 0.938 | 0.943 | 1.000 | 0.718 | 0.932 | 0.872 | 0.848 | 0.769 | 0.942 | 0.878 | 1.000
Unnamed: 6 | 0.700 | 0.718 | 0.718 | 1.000 | 0.717 | 0.772 | 0.781 | 0.666 | 0.640 | 0.723 | 1.000
Unnamed: 7 | 0.661 | 0.932 | 0.932 | 0.717 | 1.000 | 0.685 | 0.714 | 0.676 | 0.804 | 0.766 | 1.000
Unnamed: 8 | 0.710 | 0.872 | 0.872 | 0.772 | 0.685 | 1.000 | 0.738 | 0.725 | 0.748 | 0.690 | 1.000
Unnamed: 9 | 0.658 | 0.836 | 0.848 | 0.781 | 0.714 | 0.738 | 1.000 | 0.733 | 0.708 | 0.749 | 1.000
Unnamed: 10 | 0.641 | 0.769 | 0.769 | 0.666 | 0.676 | 0.725 | 0.733 | 1.000 | 0.640 | 0.672 | 1.000
Unnamed: 13 | 0.710 | 0.941 | 0.942 | 0.640 | 0.804 | 0.748 | 0.708 | 0.640 | 1.000 | 0.695 | 1.000
Unnamed: 14 | 0.637 | 0.821 | 0.878 | 0.723 | 0.766 | 0.690 | 0.749 | 0.672 | 0.695 | 1.000 | 1.000
Unnamed: 15 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Variable | Unnamed: 9 | Unnamed: 14 | Unnamed: 7 | Unnamed: 4 | Unnamed: 2 | Unnamed: 10 | Unnamed: 6 | Unnamed: 15 | Unnamed: 8 | Unnamed: 3 | Unnamed: 13
Unnamed: 9 | 1.000 | 0.420 | 0.368 | 0.728 | 0.355 | 0.454 | 0.572 | 0.999 | 0.294 | 0.710 | 0.448
Unnamed: 14 | 0.420 | 1.000 | 0.440 | 0.799 | 0.353 | 0.414 | 0.505 | 1.000 | 0.319 | 0.707 | 0.449
Unnamed: 7 | 0.368 | 0.440 | 1.000 | 0.708 | 0.354 | 0.413 | 0.502 | 0.999 | 0.304 | 0.707 | 0.448
Unnamed: 4 | 0.728 | 0.799 | 0.708 | 1.000 | 0.707 | 0.707 | 0.707 | 1.000 | 0.707 | 0.707 | 0.707
Unnamed: 2 | 0.355 | 0.353 | 0.354 | 0.707 | 1.000 | 0.408 | 0.500 | 1.000 | 0.358 | 1.000 | 0.448
Unnamed: 10 | 0.454 | 0.414 | 0.413 | 0.707 | 0.408 | 1.000 | 0.510 | 1.000 | 0.412 | 0.707 | 0.449
Unnamed: 6 | 0.572 | 0.505 | 0.502 | 0.707 | 0.500 | 0.510 | 1.000 | 1.000 | 0.502 | 0.707 | 0.500
Unnamed: 15 | 0.999 | 1.000 | 0.999 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.999 | 1.000 | 1.000
Unnamed: 8 | 0.294 | 0.319 | 0.304 | 0.707 | 0.358 | 0.412 | 0.502 | 0.999 | 1.000 | 0.706 | 0.454
Unnamed: 3 | 0.710 | 0.707 | 0.707 | 0.707 | 1.000 | 0.707 | 0.707 | 1.000 | 0.706 | 1.000 | 0.707
Unnamed: 13 | 0.448 | 0.449 | 0.448 | 0.707 | 0.448 | 0.449 | 0.500 | 1.000 | 0.454 | 0.707 | 1.000

Variable | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | Unnamed: 13 | Unnamed: 14 | Unnamed: 15
Unnamed: 2 | 1.000 | 1.000 | 0.707 | 0.500 | 0.354 | 0.358 | 0.355 | 0.408 | 0.448 | 0.353 | 1.000
Unnamed: 3 | 1.000 | 1.000 | 0.707 | 0.707 | 0.707 | 0.706 | 0.710 | 0.707 | 0.707 | 0.707 | 1.000
Unnamed: 4 | 0.707 | 0.707 | 1.000 | 0.707 | 0.708 | 0.707 | 0.728 | 0.707 | 0.707 | 0.799 | 1.000
Unnamed: 6 | 0.500 | 0.707 | 0.707 | 1.000 | 0.502 | 0.502 | 0.572 | 0.510 | 0.500 | 0.505 | 1.000
Unnamed: 7 | 0.354 | 0.707 | 0.708 | 0.502 | 1.000 | 0.304 | 0.368 | 0.413 | 0.448 | 0.440 | 0.999
Unnamed: 8 | 0.358 | 0.706 | 0.707 | 0.502 | 0.304 | 1.000 | 0.294 | 0.412 | 0.454 | 0.319 | 0.999
Unnamed: 9 | 0.355 | 0.710 | 0.728 | 0.572 | 0.368 | 0.294 | 1.000 | 0.454 | 0.448 | 0.420 | 0.999
Unnamed: 10 | 0.408 | 0.707 | 0.707 | 0.510 | 0.413 | 0.412 | 0.454 | 1.000 | 0.449 | 0.414 | 1.000
Unnamed: 13 | 0.448 | 0.707 | 0.707 | 0.500 | 0.448 | 0.454 | 0.448 | 0.449 | 1.000 | 0.449 | 1.000
Unnamed: 14 | 0.353 | 0.707 | 0.799 | 0.505 | 0.440 | 0.319 | 0.420 | 0.414 | 0.449 | 1.000 | 1.000
Unnamed: 15 | 1.000 | 1.000 | 1.000 | 1.000 | 0.999 | 0.999 | 0.999 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
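
Nullity correlation can be sketched as the Pearson correlation between per-column missingness indicators. The example below uses a small hypothetical frame, not the profiled dataset, and computes the correlation directly with pandas. The profiler's own heatmap may be produced differently.

```python
import pandas as pd

# Hypothetical toy frame: "a" and "b" are missing on the same rows,
# while "c" is missing exactly where they are present.
df = pd.DataFrame(
    {
        "a": [1.0, None, None, 4.0],
        "b": ["x", None, None, "y"],
        "c": [None, 2.0, 3.0, None],
    }
)

# 1 where a cell is missing, 0 otherwise.
nullity = df.isna().astype(int)

# Pearson correlation between the missingness indicators of each column pair:
# +1 means the columns are always missing together, -1 means never together.
nullity_corr = nullity.corr()

print(nullity_corr)
```

In this report, Unnamed: 5, Unnamed: 11 and Unnamed: 12 are each missing in 9999 of 10000 rows, presumably the same rows, so they would be expected to show a strong positive nullity correlation with one another.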

Sample

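Index | 유동인구_속성조사_2015 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | Unnamed: 11 | Unnamed: 12 | Unnamed: 13 | Unnamed: 14 | Unnamed: 15
18474 | 18472 | 24-032 | 1017 |  | 여자 | <NA> | 오후2시~5시 | 40-44세 | 송파구 | 개인용무/집안일(병원,은행,관공서,종교활동,봉사활동등) | 주3~5회 | <NA> | <NA> | 약간만족 | 판매/서비스직 | 2015
8389 | 8387 | 11-148 | 1016 |  | 여자 | <NA> | 오후2시~5시 | 45-49세 | 노원구 | 그냥 걸으려고(운동,산책,기분전환등) | 주3~5회 | <NA> | <NA> | 매우만족 | (전업)주부 | 2015
2891 | 2889 | 03-1285 | 1023 |  | 여자 | <NA> | 오후5시~8시 | 35-39세 | 용산구 | 물건을사려고 | 주3~5회 | <NA> | <NA> | 매우불만족 | 자영업 | 2015
8675 | 8673 | 11-3002 | 1017 |  | 여자 | <NA> | 오전7시30분~11시 | 35-39세 | 노원구 | 누군가를데리러(또는데려다주러) | 주3~5회 | <NA> | <NA> | 약간불만족 | (전업)주부 | 2015
4094 | 4092 | 05-228 | 1002 |  | 여자 | <NA> | 오후5시~8시 | 65세이상 | 광진구 | 물건을사려고 | 월1~2회 | <NA> | <NA> | 보통 | (전업)주부 | 2015
12584 | 12582 | 17-095 | 1024 |  | 여자 | <NA> | 오전7시30분~11시 | 30-34세 | 구로구 | 개인용무/집안일(병원,은행,관공서,종교활동,봉사활동등) | 매일 | <NA> | <NA> | 약간불만족 | (전업)주부 | 2015
10754 | 10752 | 14-451 | 1031 |  | 남자 | <NA> | 오전7시30분~11시 | 55-59세 | 마포구 | 출근 | 주3~5회 | <NA> | <NA> | 보통 | 생산/운수직 | 2015
5056 | 5054 | 06-252 | 1031 |  | 여자 | <NA> | 오전11시~오후2시 | 60-64세 | 동대문구 | 귀가/퇴근/하교 | 매일 | <NA> | <NA> | 보통 | <NA> | 2015
4197 | 4195 | 05-246 | 1017 |  | 남자 | <NA> | 오후5시~8시 | 50-54세 | 광진구 | 그냥 걸으려고(운동,산책,기분전환등) | 매일 | <NA> | <NA> | 보통 | 사무/기술직 | 2015
12286 | 12284 | 17-016 | 1002 |  | 여자 | <NA> | 오후2시~5시 | 45-49세 | 구로구 | 물건을사려고 | 매일 | <NA> | <NA> | 보통 | (전업)주부 | 2015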
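
Index | 유동인구_속성조사_2015 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | Unnamed: 11 | Unnamed: 12 | Unnamed: 13 | Unnamed: 14 | Unnamed: 15
16605 | 16603 | 22-3009 | 1023 |  | 남자 | <NA> | 오전11시~오후2시 | 15-19세 | 서초구 | 개인용무/집안일(병원,은행,관공서,종교활동,봉사활동등) | 주3~5회 | <NA> | <NA> | 보통 | (대)학생 | 2015
16055 | 16053 | 22-094 | 1031 |  | 남자 | <NA> | 오전7시30분~11시 | 40-44세 | 관악구 | 업무관련 | 주3~5회 | <NA> | <NA> | 약간불만족 | 사무/기술직 | 2015
12819 | 12817 | 17-254 | 1010 |  | 여자 | <NA> | 오후2시~5시 | 30-34세 | 구로구 | 개인용무/집안일(병원,은행,관공서,종교활동,봉사활동등) | 매일 | <NA> | <NA> | 보통 | (전업)주부 | 2015
16479 | 16477 | 22-202 | 1002 |  | 남자 | <NA> | 오후5시~8시 | 55-59세 | 서초구 | 학업관련(학원/도서관) | 주1~2회 | <NA> | <NA> | 매우불만족 | 경영/관리직 | 2015
16330 | 16328 | 22-168 | 1030 |  | 남자 | <NA> | 오전7시30분~11시 | 40-44세 | 서초구 | 출근 | 매일 | <NA> | <NA> | 약간불만족 | 전문/자유직 | 2015
7132 | 7130 | 09-3125 | 1017 |  | 남자 | <NA> | 오후5시~8시 | 60-64세 | 강북구 | 개인용무/집안일(병원,은행,관공서,종교활동,봉사활동등) | 매일 | <NA> | <NA> | 보통 | <NA> | 2015
9804 | 9802 | 13-218 | 1024 |  | 여자 | <NA> | 오전7시30분~11시 | 25-29세 | 서대문구 | 출근 | 주3~5회 | <NA> | <NA> | 보통 | 판매/서비스직 | 2015
13774 | 13772 | 19-068 | 1016 |  | 여자 | <NA> | 오전7시30분~11시 | 30-34세 | 강서구 | 출근 | 매일 | <NA> | <NA> | 약간만족 | 경영/관리직 | 2015
18180 | 18178 | 23-662 | 1031 |  | 여자 | <NA> | 오후5시~8시 | 50-54세 | 송파구 | 귀가/퇴근/하교 | 주3~5회 | <NA> | <NA> | 약간불만족 | 경영/관리직 | 2015
17679 | 17677 | 23-1258 | 1002 |  | 여자 | <NA> | 오전7시30분~11시 | 50-54세 | 서초구 | 귀가/퇴근/하교 | 매일 | <NA> | <NA> | 약간만족 | 생산/운수직 | 2015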