Overview

Dataset statistics

Number of variables: 6
Number of observations: 1211
Missing cells: 4
Missing cells (%): 0.1%
Duplicate rows: 3
Duplicate rows (%): 0.2%
Total size in memory: 56.9 KiB
Average record size in memory: 48.1 B

Variable types

Unsupported: 3
Categorical: 2
Text: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15462/S/1/datasetView.do

Alerts

Dataset has 3 (0.2%) duplicate rows [Duplicates]
Unnamed: 1 is highly overall correlated with Unnamed: 4 [High correlation]
Unnamed: 4 is highly overall correlated with Unnamed: 1 [High correlation]
서울특별시 간행물 판매 정보(20.04.) is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 3 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 5 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]

Reproduction

Analysis started: 2024-05-04 06:23:19.998662
Analysis finished: 2024-05-04 06:23:22.221721
Duration: 2.22 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

서울특별시 간행물 판매 정보(20.04.)
Unsupported

REJECTED  UNSUPPORTED 

Missing: 1
Missing (%): 0.1%
Memory size: 9.6 KiB

Unnamed: 1
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 9.6 KiB

Length

Max length: 5
Median length: 5
Mean length: 4.748142
Min length: 2

Unique

Unique: 2
Unique (%): 0.2%

Sample

1st row: <NA>
2nd row: 분류
3rd row: 일반행정
4th row: 역사/사료
5th row: 일반행정

Common Values

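Value | Count | Frequency (%)
역사/사료 (History/Historical records) | 477 | 39.4%
연구/논문 (Research/Papers) | 283 | 23.4%
일반행정 (General administration) | 223 | 18.4%
문화/관광 (Culture/Tourism) | 200 | 16.5%
통계 (Statistics) | 26 | 2.1%
<NA> | 1 | 0.1%
분류 (Category; the column header read as data) | 1 | 0.1%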

Length

Histogram of lengths of the category

Common Values (Plot)

Bar chart of the value counts listed under Common Values.

Unnamed: 2
Text

Distinct: 1204
Distinct (%): 99.5%
Missing: 1
Missing (%): 0.1%
Memory size: 9.6 KiB

Length

Max length: 76
Median length: 41
Mean length: 17.828099
Min length: 3

Characters and Unicode

Total characters: 21572
Distinct characters: 702
Distinct categories: 16
Distinct scripts: 4
Distinct blocks: 10
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
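
As an illustration of how such character properties can be read, the sketch below counts characters per Unicode general category with Python's standard `unicodedata` module. The function name `category_counts`, the `CATEGORY_NAMES` mapping, and the two-title input are illustrative assumptions; this is not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in the toy input below.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for text in values:
        # Normalize to NFC so each precomposed Hangul syllable is one code point.
        for ch in unicodedata.normalize("NFC", text):
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# Two titles taken from the sample rows of this column.
titles = ["2020 알기쉬운 지방세", "함께 읽는 도시재생(8권세트)"]
counts = category_counts(titles)
for name, n in counts.most_common():
    print(name, n)
```

Hangul syllables fall into the "Other Letter" category, which is why that category dominates a column of Korean titles.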

Unique

Unique: 1198
Unique (%): 99.0%

Sample

1st row: 상품명
2nd row: 2020 서울특별시 도시계획위원회 매뉴얼
3rd row: 식민도시 경성, 차별에서 파괴까지(서울역사강좌09)
4th row: 2020 알기쉬운 지방세
5th row: 함께 읽는 도시재생(8권세트)

Value | Count | Frequency (%)
서울시 | 112 | 2.8%
연구 | 86 | 2.2%
서울의 | 80 | 2.0%
서울 | 64 | 1.6%
한성부자료집(漢城府資料集 | 46 | 1.2%
(unreadable) | 42 | 1.1%
향토서울 | 36 | 0.9%
위한 | 31 | 0.8%
서울학연구 | 30 | 0.8%
(unreadable) | 29 | 0.7%
Other values (2463) | 3400 | 85.9%

Most occurring characters

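Value | Count | Frequency (%)
(space) | 2746 | 12.7%
(unreadable) | 995 | 4.6%
(unreadable) | 795 | 3.7%
( | 392 | 1.8%
) | 392 | 1.8%
(unreadable) | 389 | 1.8%
(unreadable) | 360 | 1.7%
2 | 357 | 1.7%
0 | 332 | 1.5%
1 | 323 | 1.5%
Other values (692) | 14491 | 67.2%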

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 15085 | 69.9%
Space Separator | 2746 | 12.7%
Decimal Number | 1720 | 8.0%
Lowercase Letter | 529 | 2.5%
Open Punctuation | 393 | 1.8%
Close Punctuation | 393 | 1.8%
Uppercase Letter | 320 | 1.5%
Other Punctuation | 239 | 1.1%
Dash Punctuation | 101 | 0.5%
Letter Number | 18 | 0.1%
Other values (6) | 28 | 0.1%

Most frequent character per category

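Other Letter
Value | Count | Frequency (%)
(unreadable) | 995 | 6.6%
(unreadable) | 795 | 5.3%
(unreadable) | 389 | 2.6%
(unreadable) | 360 | 2.4%
(unreadable) | 297 | 2.0%
(unreadable) | 297 | 2.0%
(unreadable) | 271 | 1.8%
(unreadable) | 264 | 1.8%
(unreadable) | 221 | 1.5%
(unreadable) | 216 | 1.4%
Other values (598) | 10980 | 72.8%

Lowercase Letter
Value | Count | Frequency (%)
e | 78 | 14.7%
o | 63 | 11.9%
l | 50 | 9.5%
i | 37 | 7.0%
n | 36 | 6.8%
u | 34 | 6.4%
a | 33 | 6.2%
t | 31 | 5.9%
s | 28 | 5.3%
r | 27 | 5.1%
Other values (13) | 112 | 21.2%

Uppercase Letter
Value | Count | Frequency (%)
S | 61 | 19.1%
G | 27 | 8.4%
E | 21 | 6.6%
D | 20 | 6.2%
U | 19 | 5.9%
P | 18 | 5.6%
R | 17 | 5.3%
N | 16 | 5.0%
O | 16 | 5.0%
I | 13 | 4.1%
Other values (13) | 92 | 28.7%

Other Punctuation
Value | Count | Frequency (%)
: | 88 | 36.8%
, | 74 | 31.0%
. | 28 | 11.7%
/ | 17 | 7.1%
! | 10 | 4.2%
· | 8 | 3.3%
' | 5 | 2.1%
? | 3 | 1.3%
& | 3 | 1.3%
; | 1 | 0.4%
Other values (2) | 2 | 0.8%

Decimal Number
Value | Count | Frequency (%)
2 | 357 | 20.8%
0 | 332 | 19.3%
1 | 323 | 18.8%
3 | 117 | 6.8%
5 | 110 | 6.4%
4 | 108 | 6.3%
6 | 107 | 6.2%
9 | 102 | 5.9%
7 | 91 | 5.3%
8 | 73 | 4.2%

Other Number
Value | Count | Frequency (%)
(unreadable) | 1 | 16.7%
(unreadable) | 1 | 16.7%
(unreadable) | 1 | 16.7%
(unreadable) | 1 | 16.7%
(unreadable) | 1 | 16.7%
(unreadable) | 1 | 16.7%

Letter Number
Value | Count | Frequency (%)
(unreadable) | 10 | 55.6%
(unreadable) | 5 | 27.8%
(unreadable) | 1 | 5.6%
(unreadable) | 1 | 5.6%
(unreadable) | 1 | 5.6%

Open Punctuation
Value | Count | Frequency (%)
( | 392 | 99.7%
[ | 1 | 0.3%

Close Punctuation
Value | Count | Frequency (%)
) | 392 | 99.7%
] | 1 | 0.3%

Math Symbol
Value | Count | Frequency (%)
~ | 13 | 92.9%
+ | 1 | 7.1%

Modifier Symbol
Value | Count | Frequency (%)
˙ | 2 | 66.7%
` | 1 | 33.3%

Initial Punctuation
Value | Count | Frequency (%)
(unreadable) | 1 | 50.0%
(unreadable) | 1 | 50.0%

Final Punctuation
Value | Count | Frequency (%)
(unreadable) | 1 | 50.0%
(unreadable) | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 2746 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 101 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 1 | 100.0%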

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 14747 | 68.4%
Common | 5620 | 26.1%
Latin | 867 | 4.0%
Han | 338 | 1.6%

Most frequent character per script

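Hangul
Value | Count | Frequency (%)
(unreadable) | 995 | 6.7%
(unreadable) | 795 | 5.4%
(unreadable) | 389 | 2.6%
(unreadable) | 360 | 2.4%
(unreadable) | 297 | 2.0%
(unreadable) | 297 | 2.0%
(unreadable) | 271 | 1.8%
(unreadable) | 264 | 1.8%
(unreadable) | 221 | 1.5%
(unreadable) | 216 | 1.5%
Other values (556) | 10642 | 72.2%

Latin
Value | Count | Frequency (%)
e | 78 | 9.0%
o | 63 | 7.3%
S | 61 | 7.0%
l | 50 | 5.8%
i | 37 | 4.3%
n | 36 | 4.2%
u | 34 | 3.9%
a | 33 | 3.8%
t | 31 | 3.6%
s | 28 | 3.2%
Other values (41) | 416 | 48.0%

Common
Value | Count | Frequency (%)
(space) | 2746 | 48.9%
( | 392 | 7.0%
) | 392 | 7.0%
2 | 357 | 6.4%
0 | 332 | 5.9%
1 | 323 | 5.7%
3 | 117 | 2.1%
5 | 110 | 2.0%
4 | 108 | 1.9%
6 | 107 | 1.9%
Other values (33) | 636 | 11.3%

Han
Value | Count | Frequency (%)
(unreadable) | 50 | 14.8%
(unreadable) | 50 | 14.8%
(unreadable) | 49 | 14.5%
(unreadable) | 48 | 14.2%
(unreadable) | 48 | 14.2%
(unreadable) | 48 | 14.2%
(unreadable) | 4 | 1.2%
(unreadable) | 3 | 0.9%
(unreadable) | 2 | 0.6%
(unreadable) | 2 | 0.6%
Other values (32) | 34 | 10.1%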

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 14745 | 68.4%
ASCII | 6448 | 29.9%
CJK | 337 | 1.6%
Number Forms | 18 | 0.1%
None | 8 | < 0.1%
Enclosed Alphanum | 6 | < 0.1%
Punctuation | 5 | < 0.1%
Modifier Letters | 2 | < 0.1%
Compat Jamo | 2 | < 0.1%
CJK Compat Ideographs | 1 | < 0.1%

Most frequent character per block

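ASCII
Value | Count | Frequency (%)
(space) | 2746 | 42.6%
( | 392 | 6.1%
) | 392 | 6.1%
2 | 357 | 5.5%
0 | 332 | 5.1%
1 | 323 | 5.0%
3 | 117 | 1.8%
5 | 110 | 1.7%
4 | 108 | 1.7%
6 | 107 | 1.7%
Other values (66) | 1464 | 22.7%

Hangul
Value | Count | Frequency (%)
(unreadable) | 995 | 6.7%
(unreadable) | 795 | 5.4%
(unreadable) | 389 | 2.6%
(unreadable) | 360 | 2.4%
(unreadable) | 297 | 2.0%
(unreadable) | 297 | 2.0%
(unreadable) | 271 | 1.8%
(unreadable) | 264 | 1.8%
(unreadable) | 221 | 1.5%
(unreadable) | 216 | 1.5%
Other values (555) | 10640 | 72.2%

CJK
Value | Count | Frequency (%)
(unreadable) | 50 | 14.8%
(unreadable) | 50 | 14.8%
(unreadable) | 49 | 14.5%
(unreadable) | 48 | 14.2%
(unreadable) | 48 | 14.2%
(unreadable) | 48 | 14.2%
(unreadable) | 4 | 1.2%
(unreadable) | 3 | 0.9%
(unreadable) | 2 | 0.6%
(unreadable) | 2 | 0.6%
Other values (31) | 33 | 9.8%

Number Forms
Value | Count | Frequency (%)
(unreadable) | 10 | 55.6%
(unreadable) | 5 | 27.8%
(unreadable) | 1 | 5.6%
(unreadable) | 1 | 5.6%
(unreadable) | 1 | 5.6%

None
Value | Count | Frequency (%)
· | 8 | 100.0%

Modifier Letters
Value | Count | Frequency (%)
˙ | 2 | 100.0%

Compat Jamo
Value | Count | Frequency (%)
(unreadable) | 2 | 100.0%

CJK Compat Ideographs
Value | Count | Frequency (%)
(unreadable) | 1 | 100.0%

Punctuation
Value | Count | Frequency (%)
(unreadable) | 1 | 20.0%
(unreadable) | 1 | 20.0%
(unreadable) | 1 | 20.0%
(unreadable) | 1 | 20.0%
(unreadable) | 1 | 20.0%

Enclosed Alphanum
Value | Count | Frequency (%)
(unreadable) | 1 | 16.7%
(unreadable) | 1 | 16.7%
(unreadable) | 1 | 16.7%
(unreadable) | 1 | 16.7%
(unreadable) | 1 | 16.7%
(unreadable) | 1 | 16.7%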

Unnamed: 3
Unsupported

REJECTED  UNSUPPORTED 

Missing: 1
Missing (%): 0.1%
Memory size: 9.6 KiB

Unnamed: 4
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 9.6 KiB

Length

Max length: 4
Median length: 4
Mean length: 3.1940545
Min length: 2

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: <NA>
2nd row: 품절여부
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

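Value | Count | Frequency (%)
<NA> | 668 | 55.2%
품절 (Sold out) | 436 | 36.0%
임시품절 (Temporarily sold out) | 54 | 4.5%
절판 (Out of print) | 52 | 4.3%
품절여부 (Sold-out status; the column header read as data) | 1 | 0.1%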

Length

Histogram of lengths of the category

Common Values (Plot)

Bar chart of the value counts listed under Common Values.

Unnamed: 5
Unsupported

REJECTED  UNSUPPORTED 

Missing: 1
Missing (%): 0.1%
Memory size: 9.6 KiB

Correlations

(variable) | Unnamed: 1 | Unnamed: 4
Unnamed: 1 | 1.000 | 0.790
Unnamed: 4 | 0.790 | 1.000

(variable) | Unnamed: 1 | Unnamed: 4
Unnamed: 1 | 1.000 | 0.634
Unnamed: 4 | 0.634 | 1.000

(variable) | Unnamed: 1 | Unnamed: 4
Unnamed: 1 | 1.000 | 0.634
Unnamed: 4 | 0.634 | 1.000
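
Both correlated columns are categorical, so an association measure for categories is needed rather than a linear correlation. The sketch below computes Cramér's V, one standard measure of this kind, using only the standard library. Whether the coefficients in this report were produced by this exact (uncorrected) formula is an assumption, the function name `cramers_v` is hypothetical, and the input lists are made-up toy data.

```python
from collections import Counter
from math import sqrt

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long sequences of categories."""
    n = len(xs)
    x_counts = Counter(xs)
    y_counts = Counter(ys)
    pair_counts = Counter(zip(xs, ys))
    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, nx in x_counts.items():
        for y, ny in y_counts.items():
            expected = nx * ny / n
            observed = pair_counts.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(x_counts), len(y_counts)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association
    return sqrt(chi2 / (n * k))

# Toy data: a missing marker is treated as just another category.
category = ["역사/사료", "역사/사료", "일반행정", "일반행정"]
perfectly_tied = ["품절", "품절", "<NA>", "<NA>"]
unrelated = ["품절", "<NA>", "품절", "<NA>"]
v_tied = cramers_v(category, perfectly_tied)
v_unrelated = cramers_v(category, unrelated)
print(v_tied, v_unrelated)
```

A value near 1 means one column almost determines the other; a value near 0 means the categories are distributed independently.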

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
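
The per-column nullity view amounts to counting missing cells in each column. A minimal standard-library sketch follows; the helper name `missing_per_column`, the shortened column names, and the use of `None` as the missing marker are illustrative assumptions, and the rows are a small made-up subset.

```python
def missing_per_column(rows, columns):
    """Return a dict mapping each column name to its number of missing cells."""
    counts = {name: 0 for name in columns}
    for row in rows:
        for name, cell in zip(columns, row):
            if cell is None:  # None stands in for NaN / <NA> in this sketch
                counts[name] += 1
    return counts

columns = ["Unnamed: 1", "Unnamed: 2", "Unnamed: 4"]
rows = [
    (None, None, None),                 # fully empty leading row
    ("일반행정", "뚜벅뚜벅 찾동씨", None),
    ("문화/관광", "한강홍보엽서", "절판"),
]
missing = missing_per_column(rows, columns)
print(missing)
```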

Sample

First rows

(index) | 서울특별시 간행물 판매 정보(20.04.) | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5
0 | NaN | <NA> | <NA> | NaN | <NA> | NaN
1 | 상품번호 | 분류 | 상품명 | 판매가격 | 품절여부 | 등록일자
2 | 11496 | 일반행정 | 2020 서울특별시 도시계획위원회 매뉴얼 | 6000 | <NA> | 2020-03-11 00:00:00
3 | 11457 | 역사/사료 | 식민도시 경성, 차별에서 파괴까지(서울역사강좌09) | 10000 | <NA> | 2020-03-04 00:00:00
4 | 11376 | 일반행정 | 2020 알기쉬운 지방세 | 2000 | <NA> | 2020-02-24 00:00:00
5 | 11356 | 일반행정 | 함께 읽는 도시재생(8권세트) | 40000 | <NA> | 2020-02-19 00:00:00
6 | 11344 | 역사/사료 | 백제학 연구총서 쟁점백제사 15 무령왕릉 다시보기 | 10000 | <NA> | 2020-02-14 00:00:00
7 | 11342 | 일반행정 | 뚜벅뚜벅 찾동씨 | 4000 | <NA> | 2020-02-14 00:00:00
8 | 11336 | 역사/사료 | 서울기획연구6 한양의 삼군영 | 15000 | <NA> | 2020-02-14 00:00:00
9 | 11316 | 문화/관광 | 미술관에 놓인 배움의 식탁 : 예술가의 런치박스 레시피 | 20000 | <NA> | 2020-01-31 00:00:00

Last rows

(index) | 서울특별시 간행물 판매 정보(20.04.) | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5
1201 | 148 | 역사/사료 | 서울육백년사 5 | 15000 | 품절 | 2004-01-20 00:00:00
1202 | 147 | 연구/논문 | 서울학 연구서설 | 8000 | 품절 | 2004-01-20 00:00:00
1203 | 146 | 역사/사료 | 서울육백년사 4 | 15000 | 품절 | 2004-01-20 00:00:00
1204 | 145 | 역사/사료 | 서울근현대사기행 | 8000 | 품절 | 2004-01-20 00:00:00
1205 | 143 | 문화/관광 | 서울의 음식문화 | 8000 | <NA> | 2004-01-20 00:00:00
1206 | 142 | 일반행정 | 2002 FIFA 월드컵 한국/일본 서울특별시 리포트 1509일의 대장정 | 16000 | 품절 | 2004-01-20 00:00:00
1207 | 140 | 문화/관광 | 서울의 경과곡 | 6000 | 임시품절 | 2004-01-20 00:00:00
1208 | 133 | 문화/관광 | 한강홍보엽서 | 500 | 절판 | 2004-01-20 00:00:00
1209 | 130 | 역사/사료 | 사진으로보는서울2 - 일제 침략 아래서의 서울 | 20000 | <NA> | 2004-01-20 00:00:00
1210 | 129 | 역사/사료 | 사진으로보는서울1 - 개항 이후 서울의 근대화와 그 시련 | 20000 | <NA> | 2004-01-20 00:00:00
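
The sample suggests why the columns are named "Unnamed: N" and why header words such as 분류 appear among the data values: the sheet title was read as the header, the first data row is empty, and the real header (상품번호, 분류, 상품명, 판매가격, 품절여부, 등록일자; roughly product number, category, product name, sale price, sold-out status, registration date) sits in the second row. A minimal standard-library sketch of promoting that row to the header follows; the helper name `promote_header` and the two-record input are illustrative assumptions, not part of the report.

```python
def promote_header(rows):
    """Drop fully empty rows, then use the first remaining row as the header.

    Returns a list of dicts keyed by the promoted header values.
    """
    def is_blank(row):
        return all(cell is None or cell == "" for cell in row)

    body = [row for row in rows if not is_blank(row)]
    header, *records = body
    return [dict(zip(header, record)) for record in records]

raw_rows = [
    [None, None, None, None, None, None],
    ["상품번호", "분류", "상품명", "판매가격", "품절여부", "등록일자"],
    [11376, "일반행정", "2020 알기쉬운 지방세", 2000, None, "2020-02-24"],
    [133, "문화/관광", "한강홍보엽서", 500, "절판", "2004-01-20"],
]
records = promote_header(raw_rows)
print(records[0]["상품명"], records[0]["판매가격"])
```

Re-profiling the data after such a cleanup would likely let the price and date columns be typed properly instead of being flagged as unsupported, though that outcome is not shown in this report.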

Duplicate rows

Most frequently occurring

(index) | Unnamed: 1 | Unnamed: 2 | Unnamed: 4 | # duplicates
0 | 역사/사료 | 사진으로보는서울1 - 개항 이후 서울의 근대화와 그 시련 | <NA> | 2
1 | 연구/논문 | 도시설계 개론 | 품절 | 2
2 | 일반행정 | 청소년 노동권리 수첩 | <NA> | 2
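
The duplicate table compares rows on Unnamed: 1, Unnamed: 2, and Unnamed: 4 only, so two rows can count as duplicates even if their other columns differ. A minimal sketch of that kind of check with the standard library is shown below; the helper name `duplicate_rows` is hypothetical and the input is a small made-up subset of rows restricted to those three columns.

```python
from collections import Counter

def duplicate_rows(rows):
    """Return a dict of row -> occurrence count for rows seen more than once."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}

rows = [
    ("연구/논문", "도시설계 개론", "품절"),
    ("일반행정", "청소년 노동권리 수첩", None),
    ("연구/논문", "도시설계 개론", "품절"),
    ("문화/관광", "한강홍보엽서", "절판"),
    ("일반행정", "청소년 노동권리 수첩", None),
]
dupes = duplicate_rows(rows)
for row, n in dupes.items():
    print(row, n)
```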