Overview

Dataset statistics

Number of variables: 4
Number of observations: 117
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 2
Duplicate rows (%): 1.7%
Total size in memory: 4.0 KiB
Average record size in memory: 35.1 B

Variable types

Numeric: 1
Text: 1
Categorical: 2

Dataset

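Description: Data on the toy library of the Daejeon Metropolitan City Women and Family Center (대전광역시 여성가족원), providing detailed fields such as each toy's product name, manufacturer, and quantity.
URL: https://www.data.go.kr/data/15077706/fileData.do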

Alerts

Dataset has 2 (1.7%) duplicate rows (Duplicates)
연번 is highly overall correlated with 제조사명 (High correlation)
제조사명 is highly overall correlated with 연번 (High correlation)
개수 is highly imbalanced (84.3%) (Imbalance)
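
The 84.3% imbalance figure for 개수 is consistent with one minus the normalized Shannon entropy of the class counts. The sketch below checks that reading against the counts listed in this report (113, 2 and 2). The function name `imbalance_score` and the single-class convention are assumptions made for illustration, not the profiling library's API.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k): H is the Shannon entropy of the class
    distribution and k the number of classes (0 = balanced, near 1 = skewed)."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class is reported as not imbalanced
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)


# Class counts of 개수 as listed in this report: value 2 -> 113, 3 -> 2, 1 -> 2
score = imbalance_score([113, 2, 2])
print(f"{score:.1%}")
```

A perfectly balanced column scores 0 under this definition, so a value above 0.8 indicates that almost every row carries the same quantity.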

Reproduction

Analysis started: 2023-12-12 06:04:43.682117
Analysis finished: 2023-12-12 06:04:44.192872
Duration: 0.51 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 114
Distinct (%): 97.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 58.606838
Minimum: 1
Maximum: 114
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.2 KiB

Quantile statistics

Minimum: 1
5th percentile: 6.8
Q1: 30
Median: 59
Q3: 88
95th percentile: 108.2
Maximum: 114
Range: 113
Interquartile range (IQR): 58
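
The quantile figures can be reproduced with linear interpolation between neighbouring order statistics. The sketch below assumes that 연번 holds each value from 1 to 114 once, plus one extra 100 and two extra 101s, which is what the value counts in this report imply (117 rows, sum 6857). The helper `quantile` is illustrative and not part of any library.

```python
def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two nearest order statistics."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


# Assumed reconstruction of the 연번 column from the report's value counts.
values = sorted(list(range(1, 115)) + [100, 101, 101])

summary = {
    "p5": quantile(values, 0.05),
    "q1": quantile(values, 0.25),
    "median": quantile(values, 0.50),
    "q3": quantile(values, 0.75),
    "p95": quantile(values, 0.95),
}
summary["iqr"] = summary["q3"] - summary["q1"]
summary["range"] = values[-1] - values[0]
print(summary)
```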

Descriptive statistics

Standard deviation: 33.334759
Coefficient of variation (CV): 0.56878618
Kurtosis: -1.2343733
Mean: 58.606838
Median Absolute Deviation (MAD): 29
Skewness: -0.04422532
Sum: 6857
Variance: 1111.2062
Monotonicity: Not monotonic
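
Several of the descriptive statistics follow directly from the column values. The sketch below recomputes the mean, sample variance, standard deviation, coefficient of variation and MAD with Python's `statistics` module, under the same assumption about the 연번 values (1 to 114 once each, plus one extra 100 and two extra 101s). Skewness and kurtosis are left out because their exact estimator is not stated in the report.

```python
import statistics

# Assumed reconstruction of the 연번 column from the report's value counts.
values = list(range(1, 115)) + [100, 101, 101]

mean = statistics.fmean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)          # square root of the sample variance
cv = std / mean                         # coefficient of variation
median = statistics.median(values)
mad = statistics.median([abs(v - median) for v in values])  # median absolute deviation

print(f"sum={sum(values)} mean={mean:.6f} variance={variance:.4f}")
print(f"std={std:.6f} cv={cv:.8f} mad={mad}")
```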
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
101  3  2.6%
100  2  1.7%
1  1  0.9%
84  1  0.9%
83  1  0.9%
82  1  0.9%
81  1  0.9%
80  1  0.9%
79  1  0.9%
78  1  0.9%
Other values (104)  104  88.9%

Value  Count  Frequency (%)
1  1  0.9%
2  1  0.9%
3  1  0.9%
4  1  0.9%
5  1  0.9%
6  1  0.9%
7  1  0.9%
8  1  0.9%
9  1  0.9%
10  1  0.9%

Value  Count  Frequency (%)
114  1  0.9%
113  1  0.9%
112  1  0.9%
111  1  0.9%
110  1  0.9%
109  1  0.9%
108  1  0.9%
107  1  0.9%
106  1  0.9%
105  1  0.9%

장난감명 (toy name)
Text

Distinct: 108
Distinct (%): 92.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

Length

Max length: 12
Median length: 10
Mean length: 7.4529915
Min length: 3

Characters and Unicode

Total characters: 872
Distinct characters: 243
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
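
As a small illustration of such character properties, the sketch below counts the Unicode general categories in the first sample value of this variable, using Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical. The two-letter codes correspond to the category names used in this report: `Lo` is Other Letter, `Zs` is Space Separator, `Pd` is Dash Punctuation, `Nd` is Decimal Number and `Lu` is Uppercase Letter.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters by Unicode general category (e.g. 'Lo', 'Zs', 'Pd')."""
    return Counter(unicodedata.category(ch) for ch in text)


sample = "동물의 왕국-공룡"  # first sample row of this variable
counts = category_counts(sample)
print(dict(counts))

# Hangul syllables carry their script in the character name.
print(unicodedata.name(sample[0]))
```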

Unique

Unique: 106
Unique (%): 90.6%

Sample

1st row: 동물의 왕국-공룡
2nd row: 동물의 왕국-동물
3rd row: 그린토이즈 덤프트럭
4th row: 그린토이즈 소방차
5th row: 그린토이즈 스쿨버스

Value  Count  Frequency (%)
코코몽정비자동차  6  4.2%
에듀플러스  5  3.5%
고고다이노  4  2.8%
그린토이즈  4  2.8%
터치앤고레이서  3  2.1%
러기드머신  3  2.1%
동물의  2  1.4%
덤프트럭  2  1.4%
자석미로찾기  1  0.7%
뽀로로냠냠러닝테이블  1  0.7%
Other values (112)  112  78.3%

Most occurring characters

Count  Frequency (%)
49  5.6%
28  3.2%
27  3.1%
24  2.8%
21  2.4%
18  2.1%
16  1.8%
15  1.7%
15  1.7%
14  1.6%
Other values (233)  645  74.0%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  823  94.4%
Space Separator  28  3.2%
Decimal Number  11  1.3%
Uppercase Letter  8  0.9%
Dash Punctuation  2  0.2%

Most frequent character per category

Other Letter
Count  Frequency (%)
49  6.0%
27  3.3%
24  2.9%
21  2.6%
18  2.2%
16  1.9%
15  1.8%
15  1.8%
14  1.7%
13  1.6%
Other values (219)  611  74.2%

Uppercase Letter
Value  Count  Frequency (%)
Q  2  25.0%
D  1  12.5%
C  1  12.5%
R  1  12.5%
I  1  12.5%
L  1  12.5%
E  1  12.5%

Decimal Number
Value  Count  Frequency (%)
3  4  36.4%
1  3  27.3%
2  2  18.2%
9  1  9.1%
0  1  9.1%

Space Separator
Value  Count  Frequency (%)
(space)  28  100.0%

Dash Punctuation
Value  Count  Frequency (%)
-  2  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  823  94.4%
Common  41  4.7%
Latin  8  0.9%

Most frequent character per script

Hangul
Count  Frequency (%)
49  6.0%
27  3.3%
24  2.9%
21  2.6%
18  2.2%
16  1.9%
15  1.8%
15  1.8%
14  1.7%
13  1.6%
Other values (219)  611  74.2%

Common
Value  Count  Frequency (%)
(space)  28  68.3%
3  4  9.8%
1  3  7.3%
-  2  4.9%
2  2  4.9%
9  1  2.4%
0  1  2.4%

Latin
Value  Count  Frequency (%)
Q  2  25.0%
D  1  12.5%
C  1  12.5%
R  1  12.5%
I  1  12.5%
L  1  12.5%
E  1  12.5%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  823  94.4%
ASCII  49  5.6%

Most frequent character per block

Hangul
Count  Frequency (%)
49  6.0%
27  3.3%
24  2.9%
21  2.6%
18  2.2%
16  1.9%
15  1.8%
15  1.8%
14  1.7%
13  1.6%
Other values (219)  611  74.2%

ASCII
Value  Count  Frequency (%)
(space)  28  57.1%
3  4  8.2%
1  3  6.1%
-  2  4.1%
Q  2  4.1%
2  2  4.1%
D  1  2.0%
9  1  2.0%
C  1  2.0%
R  1  2.0%
Other values (4)  4  8.2%

제조사명 (manufacturer name)
Categorical

HIGH CORRELATION 

Distinct: 34
Distinct (%): 29.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

Value  Count
코니스  8
브이텍  8
브라이트스타트  8
우드피아  7
아이코닉스  7
Other values (29)  79

Length

Max length: 9
Median length: 7
Mean length: 3.9059829
Min length: 2

Unique

Unique: 10
Unique (%): 8.5%

Sample

1st row: SH라이온
2nd row: SH라이온
3rd row: 그린토이즈
4th row: 그린토이즈
5th row: 그린토이즈

Common Values

Value  Count  Frequency (%)
코니스  8  6.8%
브이텍  8  6.8%
브라이트스타트  8  6.8%
우드피아  7  6.0%
아이코닉스  7  6.0%
한립  7  6.0%
대성토이즈  6  5.1%
아이존  6  5.1%
원펀  5  4.3%
리틀타익스  5  4.3%
Other values (24)  50  42.7%

Length

Histogram of lengths of the category

개수 (quantity)
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

Value  Count
2  113
3  2
1  2

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value  Count  Frequency (%)
2  113  96.6%
3  2  1.7%
1  2  1.7%

Length

Histogram of lengths of the category


Interactions


Correlations

          연번     제조사명  개수
연번      1.000  0.973  0.468
제조사명  0.973  1.000  0.733
개수      0.468  0.733  1.000

          개수     제조사명
개수      1.000  0.431
제조사명  0.431  1.000

          연번     제조사명  개수
연번      1.000  0.729  0.308
제조사명  0.729  1.000  0.431
개수      0.308  0.431  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

     연번  장난감명  제조사명  개수
0  1  동물의 왕국-공룡  SH라이온  2
1  2  동물의 왕국-동물  SH라이온  2
2  3  그린토이즈 덤프트럭  그린토이즈  2
3  4  그린토이즈 소방차  그린토이즈  2
4  5  그린토이즈 스쿨버스  그린토이즈  2
5  6  그린토이즈 청소차  그린토이즈  2
6  7  수개념별컵쌓기  그린토이즈  2
7  8  헬로키티왕바쾨걸음마  금보  2
8  9  높낮이농구대  꼬마토이즈  2
9  10  뉴말하는119구급대  대성토이즈  2

     연번  장난감명  제조사명  개수
107  105  에듀플러스  코니스  2
108  106  코코몽정비자동차  한립  2
109  107  에듀플러스  코니스  2
110  108  코코몽정비자동차  한립  2
111  109  퍼즐하우스  한립  2
112  110  러기드머신 덤프트럭  헬로토이  2
113  111  러기드머신 포크레인  헬로토이  2
114  112  러기드머신 휠로더  헬로토이  2
115  113  바이크레미콘  와우  2
116  114  행복한멍멍이  미니토  2

Duplicate rows

Most frequently occurring

     연번  장난감명  제조사명  개수  # duplicates
0  100  에듀플러스  코니스  2  2
1  101  코코몽정비자동차  한립  2  2
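
Duplicate rows of this kind can be found by counting whole records. The sketch below uses a small illustrative list rather than the full dataset: the two repeated records come from the table above, and the last row comes from the head of the sample. Treating each record as a tuple and counting with `collections.Counter` is one simple approach, not necessarily how the profiling tool does it.

```python
from collections import Counter

# Illustrative rows as (연번, 장난감명, 제조사명, 개수) tuples, not the full dataset.
rows = [
    (100, "에듀플러스", "코니스", 2),
    (100, "에듀플러스", "코니스", 2),
    (101, "코코몽정비자동차", "한립", 2),
    (101, "코코몽정비자동차", "한립", 2),
    (1, "동물의 왕국-공룡", "SH라이온", 2),
]

row_counts = Counter(rows)

# Records that occur more than once, with their occurrence count.
duplicates = {row: n for row, n in row_counts.items() if n > 1}

# Number of redundant copies, i.e. rows that could be dropped.
extra_copies = sum(n - 1 for n in duplicates.values())

for row, n in duplicates.items():
    print(row, n)
print("redundant rows:", extra_copies)
```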