Overview

Dataset statistics

Number of variables: 4
Number of observations: 4941
Missing cells: 5
Missing cells (%): < 0.1%
Duplicate rows: 717
Duplicate rows (%): 14.5%
Total size in memory: 159.4 KiB
Average record size in memory: 33.0 B
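
The percentages and the average record size above can be cross-checked from the raw counts. The sketch below is only an arithmetic check on the figures shown in this report. It assumes the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes. The variable names are illustrative and do not come from the profiling tool.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 4
n_observations = 4941
missing_cells = 5
duplicate_rows = 717
total_size_kib = 159.4

# Assumption: the missing percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.3f}")
print(f"Duplicate rows (%): {duplicate_pct:.1f}")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

With these inputs the missing share stays well below 0.1%, consistent with the "< 0.1%" shown above.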

Variable types

Categorical: 2
Numeric: 1
DateTime: 1

Dataset

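Description: Statistical information from the Ceramic Materials Information Bank (세라믹소재정보은행) of the Korea Institute of Ceramic Engineering and Technology (한국세라믹기술원).
Author: Korea Institute of Ceramic Engineering and Technology (한국세라믹기술원)
URL: https://www.data.go.kr/data/15072087/fileData.do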

Alerts

Dataset has 717 (14.5%) duplicate rows [Duplicates]
사용자구분 is highly overall correlated with 일련번호 and 1 other field [High correlation]
메뉴명 is highly overall correlated with 일련번호 and 1 other field [High correlation]
일련번호 is highly overall correlated with 메뉴명 and 1 other field [High correlation]

Reproduction

Analysis started: 2023-12-12 09:14:02.294741
Analysis finished: 2023-12-12 09:14:02.798043
Duration: 0.5 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
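
The reported duration follows directly from the two timestamps. A minimal check using only the standard library, with the timestamps copied from the lines above:

```python
from datetime import datetime

# Timestamps copied verbatim from the "Reproduction" block.
started = datetime.fromisoformat("2023-12-12 09:14:02.294741")
finished = datetime.fromisoformat("2023-12-12 09:14:02.798043")

duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.1f} seconds")
```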

Variables

메뉴명 (menu name)
Categorical

HIGH CORRELATION

Distinct: 14
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 38.7 KiB

일반검색             1093
유전체/전압체         1035
그래프분석           687
응용검색             535
전도체/반도체         452
Other values (9)    1139

Length

Max length: 7
Median length: 5
Mean length: 5.034406
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일반검색
2nd row: 형광체
3rd row: 자동차부품
4th row: 구조소재
5th row: 전극재료

Common Values

Value               Count   Frequency (%)
일반검색             1093    22.1%
유전체/전압체         1035    20.9%
그래프분석           687     13.9%
응용검색             535     10.8%
전도체/반도체         452     9.1%
다차원분석           256     5.2%
구조소재             237     4.8%
유리                179     3.6%
자동차부품           178     3.6%
내열소재             136     2.8%
Other values (4)    153     3.1%
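
The Frequency (%) column and the "Other values" row can both be recovered from the counts and the total of 4941 observations. The sketch below assumes each percentage is the count divided by the total number of rows, which fits here because this variable has no missing values. The names `top_counts` and `other_count` are illustrative.

```python
# Counts of the ten most common 메뉴명 values, copied from the table above.
top_counts = {
    "일반검색": 1093,
    "유전체/전압체": 1035,
    "그래프분석": 687,
    "응용검색": 535,
    "전도체/반도체": 452,
    "다차원분석": 256,
    "구조소재": 237,
    "유리": 179,
    "자동차부품": 178,
    "내열소재": 136,
}
n_observations = 4941

# Rows not covered by the ten listed values fall into "Other values".
other_count = n_observations - sum(top_counts.values())

frequencies = {
    value: round(100 * count / n_observations, 1)
    for value, count in top_counts.items()
}
other_frequency = round(100 * other_count / n_observations, 1)

for value, pct in frequencies.items():
    print(f"{value}: {pct}%")
print(f"Other values: {other_count} ({other_frequency}%)")
```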

Length

[Figure: Histogram of lengths of the category]

일련번호 (serial number)
Real number (ℝ)

HIGH CORRELATION

Distinct: 13
Distinct (%): 0.3%
Missing: 5
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.98906
Minimum: 1
Maximum: 13
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 43.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 4
Q3: 11
95-th percentile: 13
Maximum: 13
Range: 12
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 4.6080281
Coefficient of variation (CV): 0.76940757
Kurtosis: -1.6193927
Mean: 5.98906
Median Absolute Deviation (MAD): 3
Skewness: 0.27816373
Sum: 29562
Variance: 21.233923
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=13)]

Common values

Value               Count   Frequency (%)
1                   1093    22.1%
2                   1035    20.9%
11                  687     13.9%
13                  535     10.8%
9                   452     9.1%
12                  256     5.2%
3                   237     4.8%
8                   179     3.6%
6                   178     3.6%
4                   136     2.8%
Other values (3)    148     3.0%

Minimum 10 values

Value   Count   Frequency (%)
1       1093    22.1%
2       1035    20.9%
3       237     4.8%
4       136     2.8%
5       91      1.8%
6       178     3.6%
7       11      0.2%
8       179     3.6%
9       452     9.1%
10      46      0.9%

Maximum 10 values

Value   Count   Frequency (%)
13      535     10.8%
12      256     5.2%
11      687     13.9%
10      46      0.9%
9       452     9.1%
8       179     3.6%
7       11      0.2%
6       178     3.6%
5       91      1.8%
4       136     2.8%
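
Because the value tables for 일련번호 together list counts for all 13 distinct values, the summary statistics reported for this variable can be recomputed from them. The sketch below uses only the standard library. It assumes the reported variance is the sample variance (divisor n - 1) and that quartiles use linear interpolation between order statistics. The report does not state either convention, although the recomputed numbers agree with the reported ones.

```python
import statistics

# Count per 일련번호 value, combined from the value tables of this variable.
value_counts = {
    1: 1093, 2: 1035, 3: 237, 4: 136, 5: 91, 6: 178, 7: 11,
    8: 179, 9: 452, 10: 46, 11: 687, 12: 256, 13: 535,
}

# Expand the frequency table into the individual non-missing observations.
values = [value for value, count in value_counts.items() for _ in range(count)]

n_non_missing = len(values)           # 4941 rows minus the 5 missing cells
total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, divisor n - 1
std_dev = statistics.stdev(values)
cv = std_dev / mean                     # coefficient of variation
median = statistics.median(values)

# Quartiles with linear interpolation (assumed convention).
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

print(n_non_missing, total, round(mean, 5), round(variance, 6))
print(round(std_dev, 7), round(cv, 8), median, q1, q3, iqr)
```

The expanded list has 4936 entries, which matches 4941 observations minus the 5 missing values reported for this variable.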

사용자구분 (user category)
Categorical

HIGH CORRELATION

Distinct: 16
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 38.7 KiB

STND_SEARCH          1093
FIELD_SEARCH01       1035
GRAPH_ANALYSIS       687
APPLIED_SEARCH       535
FIELD_SEARCH08       452
Other values (11)    1139

Length

Max length: 14
Median length: 14
Mean length: 13.333536
Min length: 10

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: STND_SEARCH
2nd row: FIELD_SEARCH04
3rd row: FIELD_SEARCH05
4th row: FIELD_SEARCH02
5th row: FIELD_SEARCH09

Common Values

Value               Count   Frequency (%)
STND_SEARCH         1093    22.1%
FIELD_SEARCH01      1035    20.9%
GRAPH_ANALYSIS      687     13.9%
APPLIED_SEARCH      535     10.8%
FIELD_SEARCH08      452     9.1%
MULTI_ANALYSIS      256     5.2%
FIELD_SEARCH02      237     4.8%
FIELD_SEARCH07      179     3.6%
FIELD_SEARCH05      178     3.6%
FIELD_SEARCH03      136     2.8%
Other values (6)    153     3.1%

Length

[Figure: Histogram of lengths of the category]

Value (lowercased)  Count   Frequency (%)
stnd_search         1093    22.1%
field_search01      1035    20.9%
graph_analysis      687     13.9%
applied_search      535     10.8%
field_search08      452     9.1%
multi_analysis      256     5.2%
field_search02      237     4.8%
field_search07      179     3.6%
field_search05      178     3.6%
field_search03      136     2.8%
Other values (6)    153     3.1%

사용일 (date of use)
DateTime

Distinct: 270
Distinct (%): 5.5%
Missing: 0
Missing (%): 0.0%
Memory size: 38.7 KiB
Minimum: 2010-08-24 00:00:00
Maximum: 2011-10-24 00:00:00
[Figure: Histogram with fixed size bins (bins=50)]

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

              메뉴명    일련번호   사용자구분
메뉴명         1.000    1.000     1.000
일련번호       1.000    1.000     1.000
사용자구분     1.000    1.000     1.000

[Figure: correlation heatmap]

              사용자구분   메뉴명
사용자구분     1.000      1.000
메뉴명         1.000      1.000

[Figure: correlation heatmap]

              일련번호   메뉴명    사용자구분
일련번호       1.000     1.000    1.000
메뉴명         1.000     1.000    1.000
사용자구분     1.000     1.000    1.000
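
A coefficient of 1.000 everywhere is what one would expect if each 메뉴명 value always appears with the same 사용자구분 code. The sketch below illustrates this with the plain (uncorrected) Cramér's V, a common association measure for categorical pairs, on the first ten sample rows of this report. The report does not state which measure produced each matrix, and the profiling tool's own implementation may apply corrections not shown here, so `cramers_v` is a hypothetical helper and not the tool's API.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Plain (uncorrected) Cramér's V for two equal-length categorical sequences."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))
    # chi2 = n * (sum of O_ij^2 / (row_i * col_j) - 1); empty cells contribute 0.
    chi2 = n * (
        sum(o * o / (row_totals[x] * col_totals[y]) for (x, y), o in observed.items())
        - 1
    )
    k = min(len(row_totals), len(col_totals))
    return sqrt(max(chi2, 0.0) / (n * (k - 1)))


# 메뉴명 and 사용자구분 from the first ten sample rows of this report.
menu = ["일반검색", "형광체", "자동차부품", "구조소재", "전극재료",
        "일반검색", "구조소재", "유전체/전압체", "유전체/전압체", "유전체/전압체"]
code = ["STND_SEARCH", "FIELD_SEARCH04", "FIELD_SEARCH05", "FIELD_SEARCH02",
        "FIELD_SEARCH09", "STND_SEARCH", "FIELD_SEARCH02", "FIELD_SEARCH01",
        "FIELD_SEARCH01", "FIELD_SEARCH01"]

v_sample = cramers_v(menu, code)

# For contrast: two columns with no association at all.
v_independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])

print(v_sample, v_independent)
```

In these ten rows every menu name maps to exactly one code, so the statistic reaches its maximum. The contrast pair shows the other extreme.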

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

First rows

Index   메뉴명           일련번호   사용자구분        사용일
0       일반검색         1         STND_SEARCH      2010-08-30
1       형광체           5         FIELD_SEARCH04   2010-08-30
2       자동차부품       6         FIELD_SEARCH05   2010-08-30
3       구조소재         3         FIELD_SEARCH02   2010-08-30
4       전극재료         10        FIELD_SEARCH09   2010-08-30
5       일반검색         1         STND_SEARCH      2010-08-30
6       구조소재         3         FIELD_SEARCH02   2010-08-30
7       유전체/전압체     2         FIELD_SEARCH01   2010-08-31
8       유전체/전압체     2         FIELD_SEARCH01   2010-08-31
9       유전체/전압체     2         FIELD_SEARCH01   2010-08-31

Last rows

Index   메뉴명           일련번호   사용자구분        사용일
4931    일반검색         1         STND_SEARCH      2011-07-26
4932    그래프분석       11        GRAPH_ANALYSIS   2011-07-26
4933    다차원분석       12        MULTI_ANALYSIS   2011-07-26
4934    그래프분석       11        GRAPH_ANALYSIS   2011-07-26
4935    구조소재         3         FIELD_SEARCH02   2011-07-29
4936    일반검색         1         STND_SEARCH      2011-08-01
4937    응용검색         13        APPLIED_SEARCH   2011-08-04
4938    응용검색         13        APPLIED_SEARCH   2011-08-05
4939    그래프분석       11        GRAPH_ANALYSIS   2011-08-13
4940    다차원분석       12        MULTI_ANALYSIS   2011-08-19

Duplicate rows

Most frequently occurring

Index   메뉴명           일련번호   사용자구분        사용일        # duplicates
239     유전체/전압체     2         FIELD_SEARCH01   2010-09-08   104
217     유리             8         FIELD_SEARCH07   2010-09-08   68
673     전도체/반도체     9         FIELD_SEARCH08   2010-12-20   54
238     유전체/전압체     2         FIELD_SEARCH01   2010-09-07   52
236     유전체/전압체     2         FIELD_SEARCH01   2010-09-03   51
457     일반검색         1         STND_SEARCH      2010-09-01   49
45      그래프분석       11        GRAPH_ANALYSIS   2010-09-03   48
52      그래프분석       11        GRAPH_ANALYSIS   2010-09-28   46
240     유전체/전압체     2         FIELD_SEARCH01   2010-09-15   43
244     유전체/전압체     2         FIELD_SEARCH01   2010-09-28   40
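
A row counts as a duplicate when all four columns match another row. The sketch below groups the first ten sample rows of this report by their full contents, in the same spirit as the "# duplicates" column above. It reports two possible tallies, because the report does not state which convention its headline duplicate count uses: the number of distinct rows that occur more than once, and the number of extra copies beyond each first occurrence. All variable names are illustrative.

```python
from collections import Counter

# (메뉴명, 일련번호, 사용자구분, 사용일) for the first ten sample rows.
rows = [
    ("일반검색", 1, "STND_SEARCH", "2010-08-30"),
    ("형광체", 5, "FIELD_SEARCH04", "2010-08-30"),
    ("자동차부품", 6, "FIELD_SEARCH05", "2010-08-30"),
    ("구조소재", 3, "FIELD_SEARCH02", "2010-08-30"),
    ("전극재료", 10, "FIELD_SEARCH09", "2010-08-30"),
    ("일반검색", 1, "STND_SEARCH", "2010-08-30"),
    ("구조소재", 3, "FIELD_SEARCH02", "2010-08-30"),
    ("유전체/전압체", 2, "FIELD_SEARCH01", "2010-08-31"),
    ("유전체/전압체", 2, "FIELD_SEARCH01", "2010-08-31"),
    ("유전체/전압체", 2, "FIELD_SEARCH01", "2010-08-31"),
]

# Size of each group of identical rows; only groups larger than 1 are duplicates.
group_sizes = Counter(rows)
duplicated_groups = {row: size for row, size in group_sizes.items() if size > 1}

n_distinct_duplicated = len(duplicated_groups)
n_extra_copies = sum(size - 1 for size in duplicated_groups.values())

# Largest groups first, mirroring the "Most frequently occurring" ordering.
for row, size in sorted(duplicated_groups.items(), key=lambda item: -item[1]):
    print(row, size)
print(n_distinct_duplicated, n_extra_copies)
```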