Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 3
Missing cells (%): < 0.1%
Duplicate rows: 692
Duplicate rows (%): 6.9%
Total size in memory: 634.8 KiB
Average record size in memory: 65.0 B
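
The percentages and the average record size follow directly from the counts above. A minimal sketch of that arithmetic, using only the figures reported in this section (the variable names are illustrative and not part of the report):

```python
# Figures copied from the "Dataset statistics" table above.
n_variables = 7
n_observations = 10_000
missing_cells = 3
duplicate_rows = 692
total_size_kib = 634.8

# Share of missing cells among all cells (rows x columns).
missing_cells_pct = 100 * missing_cells / (n_variables * n_observations)

# Share of the row count represented by the duplicate-row figure.
duplicate_rows_pct = 100 * duplicate_rows / n_observations

# Average bytes per row: total size converted from KiB to bytes, divided by rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"missing cells: {missing_cells_pct:.4f}%")
print(f"duplicate rows: {duplicate_rows_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

The missing-cell share is far below the 0.1% display threshold, which is why the report shows it as "< 0.1%".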

Variable types

Numeric: 1
Text: 1
Categorical: 4
DateTime: 1

Dataset

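Description: Gangnam-gu Smart Library DB. Provides monograph bibliographic index information: purchase year (구입년도), title (서명), book type (도서타입), use restriction category (이용제한구분), management category (관리구분), and shelf location code (선반위치부호).
Author: Gangnam-gu, Seoul (서울특별시 강남구)
URL: https://www.data.go.kr/data/15071665/fileData.do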

Alerts

도서타입 has constant value "" [Constant]
데이터기준일 has constant value "" [Constant]
Dataset has 692 (6.9%) duplicate rows [Duplicates]
선반위치부호 is highly overall correlated with 구입년도 and 1 other field [High correlation]
관리구분 is highly overall correlated with 구입년도 and 1 other field [High correlation]
구입년도 is highly overall correlated with 관리구분 and 1 other field [High correlation]
이용제한구분 is highly imbalanced (91.8%) [Imbalance]
관리구분 is highly imbalanced (73.6%) [Imbalance]
선반위치부호 is highly imbalanced (54.7%) [Imbalance]
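
The imbalance percentages in the alerts are consistent with a normalized-entropy score, 1 - H / log2(k), where H is the Shannon entropy of the class frequencies and k is the number of classes. The sketch below is an assumption about how such a score can be derived, not a copy of the profiler's code. `imbalance_score` is a hypothetical helper, and the class counts are the ones listed for each categorical variable in this report.

```python
from math import log2

def imbalance_score(counts):
    """Return 1 - H / log2(k) for class counts (0 = balanced, 1 = one dominant class)."""
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no balance to measure
    total = sum(counts)
    # Shannon entropy in bits of the observed class frequencies.
    entropy = -sum((c / total) * log2(c / total) for c in counts if c > 0)
    return 1 - entropy / log2(k)

# Class counts as listed in the report for each categorical variable.
class_counts = {
    "이용제한구분": [9898, 102],
    "관리구분": [8959, 951, 88, 2],
    "선반위치부호": [7646, 1208, 650, 406, 88, 2],
}

scores = {name: 100 * imbalance_score(c) for name, c in class_counts.items()}
for name, pct in scores.items():
    print(f"{name}: {pct:.1f}%")
```

Under this assumption the three scores come out close to the 91.8%, 73.6%, and 54.7% shown in the alerts.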

Reproduction

Analysis started: 2023-12-12 06:37:49.876106
Analysis finished: 2023-12-12 06:37:50.766466
Duration: 0.89 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

구입년도
Real number (ℝ)

HIGH CORRELATION 

Distinct: 13
Distinct (%): 0.1%
Missing: 3
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 2016.181
Minimum: 2006
Maximum: 2020
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2006
5-th percentile: 2011
Q1: 2014
Median: 2016
Q3: 2018
95-th percentile: 2020
Maximum: 2020
Range: 14
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.7127678
Coefficient of variation (CV): 0.0013454982
Kurtosis: -0.5635247
Mean: 2016.181
Median Absolute Deviation (MAD): 2
Skewness: -0.42757587
Sum: 20155761
Variance: 7.359109
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)

Common values

Value | Count | Frequency (%)
2018 | 1241 | 12.4%
2019 | 1237 | 12.4%
2017 | 1214 | 12.1%
2020 | 1198 | 12.0%
2016 | 1155 | 11.6%
2015 | 1151 | 11.5%
2014 | 1119 | 11.2%
2013 | 753 | 7.5%
2011 | 298 | 3.0%
2012 | 295 | 2.9%
Other values (3) | 336 | 3.4%

Minimum 10 values

Value | Count | Frequency (%)
2006 | 1 | < 0.1%
2009 | 52 | 0.5%
2010 | 283 | 2.8%
2011 | 298 | 3.0%
2012 | 295 | 2.9%
2013 | 753 | 7.5%
2014 | 1119 | 11.2%
2015 | 1151 | 11.5%
2016 | 1155 | 11.6%
2017 | 1214 | 12.1%

Maximum 10 values

Value | Count | Frequency (%)
2020 | 1198 | 12.0%
2019 | 1237 | 12.4%
2018 | 1241 | 12.4%
2017 | 1214 | 12.1%
2016 | 1155 | 11.6%
2015 | 1151 | 11.5%
2014 | 1119 | 11.2%
2013 | 753 | 7.5%
2012 | 295 | 2.9%
2011 | 298 | 3.0%
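
Taken together, the value tables for 구입년도 give a count for every one of the 13 distinct years, so the reported sum, mean, variance, standard deviation, and median can be re-derived from the counts alone. A minimal sketch with Python's standard library follows; using the sample (n - 1) variance is an assumption, chosen because it matches the reported figure.

```python
from collections import Counter
from statistics import fmean, median, variance

# Year -> count, merged from the value tables (the 3 missing values are excluded).
year_counts = Counter({
    2006: 1, 2009: 52, 2010: 283, 2011: 298, 2012: 295, 2013: 753,
    2014: 1119, 2015: 1151, 2016: 1155, 2017: 1214, 2018: 1241,
    2019: 1237, 2020: 1198,
})

# Expand the counts back into one value per non-missing observation.
years = sorted(year_counts.elements())

n = len(years)
total = sum(years)
mean = fmean(years)
var = variance(years)   # sample variance (divides by n - 1)
std = var ** 0.5
med = median(years)

print(n, total, round(mean, 3), round(var, 6), round(std, 7), med)
```

The count of 9997 is the 10000 observations minus the 3 missing values, and the total equals the reported Sum of 20155761.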

서명
Text

Distinct: 146
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 207
Median length: 60
Mean length: 11.8863
Min length: 2

Characters and Unicode

Total characters: 118863
Distinct characters: 313
Distinct categories: 10
Distinct scripts: 5
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
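
As a small illustration of these character properties, the sketch below counts Unicode general categories for the five sample titles listed for this variable, using Python's standard `unicodedata` module. The two-letter codes map onto the category names used in this report: `Lo` is Other Letter, `Ll` Lowercase Letter, `Lu` Uppercase Letter, `Zs` Space Separator, `Sm` Math Symbol, `Nd` Decimal Number, and `Ps`/`Pe` Open/Close Punctuation. `category_counts` is a hypothetical helper, not part of the profiler.

```python
import unicodedata
from collections import Counter

# Sample titles copied from the sample rows of the 서명 column.
titles = ["고래가 그랬어", "(월간)TEPS", "위즈키즈", "Newton = 뉴턴", "씨네21"]

def category_counts(text):
    """Count the characters of `text` by two-letter Unicode general category."""
    return Counter(unicodedata.category(ch) for ch in text)

per_title = {title: category_counts(title) for title in titles}
for title, counts in per_title.items():
    print(title, dict(counts))
```

Script and block membership are not exposed by `unicodedata`, so this sketch covers only the category breakdown.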

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: 고래가 그랬어
2nd row: (월간)TEPS
3rd row: 위즈키즈
4th row: Newton = 뉴턴
5th row: 씨네21

Value | Count | Frequency (%)
(not preserved) | 4307 | 16.9%
이코노미스트 | 964 | 3.8%
marie | 621 | 2.4%
maison | 621 | 2.4%
claire | 621 | 2.4%
씨네21 | 510 | 2.0%
매경이코노미 | 373 | 1.5%
시사in | 371 | 1.5%
21 | 364 | 1.4%
한겨레 | 364 | 1.4%
Other values (279) | 16388 | 64.3%

Most occurring characters

Value | Count | Frequency (%)
(space) | 16731 | 14.1%
= | 4495 | 3.8%
e | 4382 | 3.7%
a | 3176 | 2.7%
i | 3015 | 2.5%
(not preserved) | 2368 | 2.0%
r | 2282 | 1.9%
E | 2039 | 1.7%
(not preserved) | 1923 | 1.6%
o | 1906 | 1.6%
Other values (303) | 76546 | 64.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 47857 | 40.3%
Lowercase Letter | 25695 | 21.6%
Uppercase Letter | 19984 | 16.8%
Space Separator | 16731 | 14.1%
Math Symbol | 4511 | 3.8%
Decimal Number | 1964 | 1.7%
Close Punctuation | 650 | 0.5%
Open Punctuation | 650 | 0.5%
Other Punctuation | 448 | 0.4%
Dash Punctuation | 373 | 0.3%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 2368 | 4.9%
(not preserved) | 1923 | 4.0%
(not preserved) | 1578 | 3.3%
(not preserved) | 1453 | 3.0%
(not preserved) | 1339 | 2.8%
(not preserved) | 1234 | 2.6%
(not preserved) | 1134 | 2.4%
(not preserved) | 988 | 2.1%
(not preserved) | 906 | 1.9%
(not preserved) | 820 | 1.7%
Other values (233) | 34114 | 71.3%

Lowercase Letter
Value | Count | Frequency (%)
e | 4382 | 17.1%
a | 3176 | 12.4%
i | 3015 | 11.7%
r | 2282 | 8.9%
o | 1906 | 7.4%
s | 1509 | 5.9%
n | 1500 | 5.8%
l | 1083 | 4.2%
c | 976 | 3.8%
w | 802 | 3.1%
Other values (15) | 5064 | 19.7%

Uppercase Letter
Value | Count | Frequency (%)
E | 2039 | 10.2%
N | 1706 | 8.5%
I | 1700 | 8.5%
M | 1612 | 8.1%
T | 1440 | 7.2%
R | 1366 | 6.8%
O | 1291 | 6.5%
L | 1111 | 5.6%
A | 1027 | 5.1%
H | 1009 | 5.0%
Other values (14) | 5683 | 28.4%

Decimal Number
Value | Count | Frequency (%)
1 | 896 | 45.6%
2 | 878 | 44.7%
3 | 91 | 4.6%
0 | 49 | 2.5%
4 | 21 | 1.1%
5 | 18 | 0.9%
6 | 11 | 0.6%

Other Punctuation
Value | Count | Frequency (%)
& | 152 | 33.9%
: | 137 | 30.6%
, | 120 | 26.8%
. | 26 | 5.8%
% | 13 | 2.9%

Math Symbol
Value | Count | Frequency (%)
= | 4495 | 99.6%
+ | 16 | 0.4%

Close Punctuation
Value | Count | Frequency (%)
) | 590 | 90.8%
] | 60 | 9.2%

Open Punctuation
Value | Count | Frequency (%)
( | 590 | 90.8%
[ | 60 | 9.2%

Dash Punctuation
Value | Count | Frequency (%)
(not preserved) | 360 | 96.5%
- | 13 | 3.5%

Space Separator
Value | Count | Frequency (%)
(space) | 16731 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 45679 | 38.4%
Hangul | 45073 | 37.9%
Common | 25327 | 21.3%
Katakana | 1440 | 1.2%
Han | 1344 | 1.1%

Most frequent character per script

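Hangul
Value | Count | Frequency (%)
(not preserved) | 2368 | 5.3%
(not preserved) | 1923 | 4.3%
(not preserved) | 1578 | 3.5%
(not preserved) | 1453 | 3.2%
(not preserved) | 1339 | 3.0%
(not preserved) | 1234 | 2.7%
(not preserved) | 1134 | 2.5%
(not preserved) | 988 | 2.2%
(not preserved) | 906 | 2.0%
(not preserved) | 820 | 1.8%
Other values (223) | 31330 | 69.5%

Latin
Value | Count | Frequency (%)
e | 4382 | 9.6%
a | 3176 | 7.0%
i | 3015 | 6.6%
r | 2282 | 5.0%
E | 2039 | 4.5%
o | 1906 | 4.2%
N | 1706 | 3.7%
I | 1700 | 3.7%
M | 1612 | 3.5%
s | 1509 | 3.3%
Other values (39) | 22352 | 48.9%

Common
Value | Count | Frequency (%)
(space) | 16731 | 66.1%
= | 4495 | 17.7%
1 | 896 | 3.5%
2 | 878 | 3.5%
) | 590 | 2.3%
( | 590 | 2.3%
(not preserved) | 360 | 1.4%
& | 152 | 0.6%
: | 137 | 0.5%
, | 120 | 0.5%
Other values (11) | 378 | 1.5%

Han
Value | Count | Frequency (%)
(not preserved) | 360 | 26.8%
(not preserved) | 360 | 26.8%
(not preserved) | 360 | 26.8%
(not preserved) | 88 | 6.5%
(not preserved) | 88 | 6.5%
(not preserved) | 88 | 6.5%

Katakana
Value | Count | Frequency (%)
(not preserved) | 360 | 25.0%
(not preserved) | 360 | 25.0%
(not preserved) | 360 | 25.0%
(not preserved) | 360 | 25.0%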

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 70646 | 59.4%
Hangul | 45073 | 37.9%
Katakana | 1440 | 1.2%
CJK | 1344 | 1.1%
Punctuation | 360 | 0.3%

Most frequent character per block

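ASCII
Value | Count | Frequency (%)
(space) | 16731 | 23.7%
= | 4495 | 6.4%
e | 4382 | 6.2%
a | 3176 | 4.5%
i | 3015 | 4.3%
r | 2282 | 3.2%
E | 2039 | 2.9%
o | 1906 | 2.7%
N | 1706 | 2.4%
I | 1700 | 2.4%
Other values (59) | 29214 | 41.4%

Hangul
Value | Count | Frequency (%)
(not preserved) | 2368 | 5.3%
(not preserved) | 1923 | 4.3%
(not preserved) | 1578 | 3.5%
(not preserved) | 1453 | 3.2%
(not preserved) | 1339 | 3.0%
(not preserved) | 1234 | 2.7%
(not preserved) | 1134 | 2.5%
(not preserved) | 988 | 2.2%
(not preserved) | 906 | 2.0%
(not preserved) | 820 | 1.8%
Other values (223) | 31330 | 69.5%

Katakana
Value | Count | Frequency (%)
(not preserved) | 360 | 25.0%
(not preserved) | 360 | 25.0%
(not preserved) | 360 | 25.0%
(not preserved) | 360 | 25.0%

CJK
Value | Count | Frequency (%)
(not preserved) | 360 | 26.8%
(not preserved) | 360 | 26.8%
(not preserved) | 360 | 26.8%
(not preserved) | 88 | 6.5%
(not preserved) | 88 | 6.5%
(not preserved) | 88 | 6.5%

Punctuation
Value | Count | Frequency (%)
(not preserved) | 360 | 100.0%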

도서타입
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
도서: 10000

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 도서
2nd row: 도서
3rd row: 도서
4th row: 도서
5th row: 도서

Common Values

Value | Count | Frequency (%)
도서 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
도서 | 10000 | 100.0%

이용제한구분
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
적용안함: 9898
일반: 102

Length

Max length: 4
Median length: 4
Mean length: 3.9796
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 적용안함
2nd row: 적용안함
3rd row: 적용안함
4th row: 적용안함
5th row: 적용안함

Common Values

Value | Count | Frequency (%)
적용안함 | 9898 | 99.0%
일반 | 102 | 1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
적용안함 | 9898 | 99.0%
일반 | 102 | 1.0%

관리구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
도곡정보문화도서관: 8959
삼성도서관: 951
세곡마루도서관: 88
정다운도서관: 2

Length

Max length: 9
Median length: 9
Mean length: 8.6014
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 도곡정보문화도서관
2nd row: 도곡정보문화도서관
3rd row: 도곡정보문화도서관
4th row: 도곡정보문화도서관
5th row: 도곡정보문화도서관

Common Values

Value | Count | Frequency (%)
도곡정보문화도서관 | 8959 | 89.6%
삼성도서관 | 951 | 9.5%
세곡마루도서관 | 88 | 0.9%
정다운도서관 | 2 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
도곡정보문화도서관 | 8959 | 89.6%
삼성도서관 | 951 | 9.5%
세곡마루도서관 | 88 | 0.9%
정다운도서관 | 2 | < 0.1%

선반위치부호
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
[도곡정보] 종합자료실: 7646
[도곡정보] 어린이자료실: 1208
[삼성] 어린이서가: 650
적용안함: 406
[세곡마루]자료실: 88

Length

Max length: 13
Median length: 12
Mean length: 11.6392
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: [도곡정보] 종합자료실
2nd row: [도곡정보] 종합자료실
3rd row: [도곡정보] 어린이자료실
4th row: [도곡정보] 종합자료실
5th row: [도곡정보] 종합자료실

Common Values

Value | Count | Frequency (%)
[도곡정보] 종합자료실 | 7646 | 76.5%
[도곡정보] 어린이자료실 | 1208 | 12.1%
[삼성] 어린이서가 | 650 | 6.5%
적용안함 | 406 | 4.1%
[세곡마루]자료실 | 88 | 0.9%
[정다운] 어린이실 | 2 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
도곡정보 | 8854 | 45.4%
종합자료실 | 7646 | 39.2%
어린이자료실 | 1208 | 6.2%
삼성 | 650 | 3.3%
어린이서가 | 650 | 3.3%
적용안함 | 406 | 2.1%
세곡마루]자료실 | 88 | 0.5%
정다운 | 2 | < 0.1%
어린이실 | 2 | < 0.1%

데이터기준일
Date

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2022-03-25 00:00:00
Maximum: 2022-03-25 00:00:00
Histogram with fixed size bins (bins=1)

Interactions


Correlations

Variable | 구입년도 | 이용제한구분 | 관리구분 | 선반위치부호
구입년도 | 1.000 | 0.235 | 0.726 | 0.597
이용제한구분 | 0.235 | 1.000 | 0.045 | 0.115
관리구분 | 0.726 | 0.045 | 1.000 | 0.999
선반위치부호 | 0.597 | 0.115 | 0.999 | 1.000

Variable | 선반위치부호 | 관리구분 | 이용제한구분
선반위치부호 | 1.000 | 0.985 | 0.083
관리구분 | 0.985 | 1.000 | 0.030
이용제한구분 | 0.083 | 0.030 | 1.000

Variable | 구입년도 | 이용제한구분 | 관리구분 | 선반위치부호
구입년도 | 1.000 | 0.176 | 0.767 | 0.585
이용제한구분 | 0.176 | 1.000 | 0.030 | 0.083
관리구분 | 0.767 | 0.030 | 1.000 | 0.985
선반위치부호 | 0.585 | 0.083 | 0.985 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 구입년도 | 서명 | 도서타입 | 이용제한구분 | 관리구분 | 선반위치부호 | 데이터기준일
4238 | 2013 | 고래가 그랬어 | 도서 | 적용안함 | 도곡정보문화도서관 | [도곡정보] 종합자료실 | 2022-03-25
4647 | 2015 | (월간)TEPS | 도서 | 적용안함 | 도곡정보문화도서관 | [도곡정보] 종합자료실 | 2022-03-25
4515 | 2019 | 위즈키즈 | 도서 | 적용안함 | 도곡정보문화도서관 | [도곡정보] 어린이자료실 | 2022-03-25
3147 | 2020 | Newton = 뉴턴 | 도서 | 적용안함 | 도곡정보문화도서관 | [도곡정보] 종합자료실 | 2022-03-25
3065 | 2014 | 씨네21 | 도서 | 적용안함 | 도곡정보문화도서관 | [도곡정보] 종합자료실 | 2022-03-25
10438 | 2020 | 월간중앙 | 도서 | 적용안함 | 세곡마루도서관 | [세곡마루]자료실 | 2022-03-25
1313 | 2016 | (월간)헬스조선 | 도서 | 적용안함 | 도곡정보문화도서관 | [도곡정보] 종합자료실 | 2022-03-25
9938 | 2017 | 기획회의 | 도서 | 적용안함 | 도곡정보문화도서관 | [도곡정보] 종합자료실 | 2022-03-25
8399 | 2016 | 책 :책과 문화, 예술을 담은 잡지 = Chaeg = Chaeg = Chaeg = Chaeg | 도서 | 적용안함 | 도곡정보문화도서관 | [도곡정보] 종합자료실 | 2022-03-25
10378 | 2020 | 이코노미스트 | 도서 | 적용안함 | 도곡정보문화도서관 | [도곡정보] 종합자료실 | 2022-03-25

Index | 구입년도 | 서명 | 도서타입 | 이용제한구분 | 관리구분 | 선반위치부호 | 데이터기준일
2566 | 2020 | 사람과 산 | 도서 | 적용안함 | 도곡정보문화도서관 | [도곡정보] 종합자료실 | 2022-03-25
5256 | 2014 | 이코노미스트 | 도서 | 적용안함 | 도곡정보문화도서관 | [도곡정보] 종합자료실 | 2022-03-25
333 | 2014 | 어린이과학동아 | 도서 | 적용안함 | 도곡정보문화도서관 | [도곡정보] 어린이자료실 | 2022-03-25
3295 | 2015 | 씨네21 | 도서 | 적용안함 | 도곡정보문화도서관 | [도곡정보] 종합자료실 | 2022-03-25
8050 | 2015 | (중학)독서평설 = 독서 | 도서 | 적용안함 | 도곡정보문화도서관 | [도곡정보] 종합자료실 | 2022-03-25
4792 | 2016 | 전원주택라이프 | 도서 | 적용안함 | 도곡정보문화도서관 | [도곡정보] 종합자료실 | 2022-03-25
10056 | 2020 | Around = 어라운드 | 도서 | 적용안함 | 도곡정보문화도서관 | [도곡정보] 종합자료실 | 2022-03-25
1248 | 2016 | TIME = 타임 | 도서 | 적용안함 | 도곡정보문화도서관 | [도곡정보] 종합자료실 | 2022-03-25
2678 | 2020 | 월간미술 | 도서 | 적용안함 | 도곡정보문화도서관 | [도곡정보] 종합자료실 | 2022-03-25
5245 | 2014 | 이코노미스트 | 도서 | 적용안함 | 도곡정보문화도서관 | [도곡정보] 종합자료실 | 2022-03-25

Duplicate rows

Most frequently occurring

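Index | 구입년도 | 서명 | 도서타입 | 이용제한구분 | 관리구분 | 선반위치부호 | 데이터기준일 | # duplicates
28 | 2012 | 이코노미스트 | 도서 | 적용안함 | 삼성도서관 | 적용안함 | 2022-03-25 | 100
20 | 2011 | 이코노미스트 | 도서 | 적용안함 | 삼성도서관 | 적용안함 | 2022-03-25 | 96
19 | 2011 | 이코노미스트 | 도서 | 적용안함 | 삼성도서관 | [삼성] 어린이서가 | 2022-03-25 | 95
27 | 2012 | 이코노미스트 | 도서 | 적용안함 | 삼성도서관 | [삼성] 어린이서가 | 2022-03-25 | 95
11 | 2010 | 이코노미스트 | 도서 | 적용안함 | 삼성도서관 | [삼성] 어린이서가 | 2022-03-25 | 90
12 | 2010 | 이코노미스트 | 도서 | 적용안함 | 삼성도서관 | 적용안함 | 2022-03-25 | 83
277 | 2016 | 매경이코노미 | 도서 | 적용안함 | 도곡정보문화도서관 | [도곡정보] 종합자료실 | 2022-03-25 | 51
284 | 2016 | 시사IN | 도서 | 적용안함 | 도곡정보문화도서관 | [도곡정보] 종합자료실 | 2022-03-25 | 51
537 | 2019 | 시사IN | 도서 | 적용안함 | 도곡정보문화도서관 | [도곡정보] 종합자료실 | 2022-03-25 | 51
112 | 2014 | THE JUNIOR HERALD | 도서 | 적용안함 | 도곡정보문화도서관 | [도곡정보] 어린이자료실 | 2022-03-25 | 50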
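
The "# duplicates" values of the ten rows listed here already add up to more than the 692 duplicate rows reported in the overview. The overview figure therefore appears to count distinct row patterns that occur more than once, rather than the number of rows involved. That reading is an inference from the numbers, not something the report states. The sketch below checks the sum and shows the two ways of counting on a small set of hypothetical rows; `duplicate_summary` and `toy_rows` are illustrative names only.

```python
from collections import Counter

# "# duplicates" values of the ten most frequent duplicate rows in this report.
top_group_sizes = [100, 96, 95, 95, 90, 83, 51, 51, 51, 50]
rows_in_top_groups = sum(top_group_sizes)

def duplicate_summary(rows):
    """Return (distinct repeated patterns, total rows that belong to them)."""
    counts = Counter(rows)
    repeated = {row: n for row, n in counts.items() if n > 1}
    return len(repeated), sum(repeated.values())

# Hypothetical rows, not taken from the dataset: (구입년도, 서명).
toy_rows = [
    (2012, "A"), (2012, "A"), (2012, "A"),
    (2011, "B"), (2011, "B"),
    (2010, "C"),
]
n_patterns, n_rows = duplicate_summary(toy_rows)

print(rows_in_top_groups, n_patterns, n_rows)
```

In the toy data, two patterns repeat and five rows belong to them, which shows how far the two counts can diverge when a few patterns repeat many times.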