Dataset statistics
Number of variables | 11 |
---|---|
Number of observations | 25 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 2.4 KiB |
Average record size in memory | 97.3 B |
Variable types
Categorical | 9 |
---|---|
Numeric | 1 |
Text | 1 |
Dataset
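Description | Korean Intellectual Property Office (특허청) KIPRISPlus thesaurus. Provides thesaurus information: a dictionary of words grouped by similar pronunciation or meaning. (KIPRISPlus service) |
---|---|
Author | Korean Intellectual Property Office (특허청) |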
URL | https://www.data.go.kr/data/15044347/fileData.do |
시소러스IPC섹션구분코드 has constant value "G" | Constant |
시소러스기준단어명 has constant value "유리제품" | Constant |
시소러스기준단어언어코드 has constant value "S10801" | Constant |
부서코드 has constant value "" | Constant |
시소러스단어일련번호 is highly overall correlated with 시소러스단어분류코드 | High correlation |
시소러스단어분류코드 is highly overall correlated with 시소러스단어일련번호 and 1 other fields | High correlation |
시소러스관련단어가중치 is highly overall correlated with 시소러스단어분류코드 | High correlation |
시소러스단어입력일자 is highly overall correlated with 시소러스단어변경일자 | High correlation |
시소러스단어변경일자 is highly overall correlated with 시소러스단어입력일자 | High correlation |
시소러스단어입력일자 is highly imbalanced (59.8%) | Imbalance |
시소러스단어변경일자 is highly imbalanced (59.8%) | Imbalance |
시소러스관련단어중요도구분 is highly imbalanced (75.8%) | Imbalance |
시소러스단어일련번호 has unique values | Unique |
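The imbalance percentages in the alerts above are consistent with a normalized-entropy score, 1 - H / log2(k), where H is the Shannon entropy of the class frequencies and k is the number of classes. The following is a minimal sketch under that assumption, not a copy of the profiler's own code. The function name `imbalance_score` is hypothetical, and the class counts are taken from the Common Values tables in this report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class
    dominates. Returning 0.0 for fewer than two classes is a convention
    chosen for this sketch.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    # Shannon entropy in bits, skipping empty classes.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Class counts as listed in this report.
date_counts = [23, 2]        # 시소러스단어입력일자 and 시소러스단어변경일자
importance_counts = [24, 1]  # 시소러스관련단어중요도구분

print(f"{imbalance_score(date_counts):.1%}")        # 59.8%
print(f"{imbalance_score(importance_counts):.1%}")  # 75.8%
```

Both results match the percentages shown in the Imbalance alerts, which supports (but does not prove) the assumed formula.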
Reproduction
Analysis started | 2023-12-13 00:44:25.587466 |
---|---|
Analysis finished | 2023-12-13 00:44:26.153689 |
Duration | 0.57 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
시소러스IPC섹션구분코드
Categorical
CONSTANT
 
Distinct | 1 |
---|---|
Distinct (%) | 4.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 332.0 B |
Length
Max length | 1 |
---|---|
Median length | 1 |
Mean length | 1 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | G |
---|---|
2nd row | G |
3rd row | G |
4th row | G |
5th row | G |
Common Values
Value | Count | Frequency (%) |
G | 25 | 100.0% |
시소러스기준단어명
Categorical
CONSTANT
 
Distinct | 1 |
---|---|
Distinct (%) | 4.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 332.0 B |
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 4 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 유리제품 |
---|---|
2nd row | 유리제품 |
3rd row | 유리제품 |
4th row | 유리제품 |
5th row | 유리제품 |
Common Values
Value | Count | Frequency (%) |
유리제품 | 25 | 100.0% |
시소러스단어일련번호
Real number (ℝ)
HIGH CORRELATION
  UNIQUE
 
Distinct | 25 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 17.8 |
Minimum | 1 |
---|---|
Maximum | 34 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 357.0 B |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 3.2 |
Q1 | 9 |
median | 18 |
Q3 | 25 |
95-th percentile | 32.8 |
Maximum | 34 |
Range | 33 |
Interquartile range (IQR) | 16 |
Descriptive statistics
Standard deviation | 10.099505 |
---|---|
Coefficient of variation (CV) | 0.56738792 |
Kurtosis | -1.1880774 |
Mean | 17.8 |
Median Absolute Deviation (MAD) | 9 |
Skewness | 0.00026378611 |
Sum | 445 |
Variance | 102 |
Monotonicity | Strictly increasing |
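Several of the figures above are derived from one another, so they can be cross-checked with simple arithmetic: mean = sum / n, variance = std², CV = std / mean, range = max - min, and IQR = Q3 - Q1. The sketch below recomputes those quantities from the values printed in this report. The dictionary keys and the helper name `derive` are hypothetical, and the individual observations are not used because the report does not list all 25 of them.

```python
import math

# Values copied from the statistics reported for 시소러스단어일련번호.
reported = {
    "n": 25, "sum": 445, "mean": 17.8,
    "std": 10.099505, "variance": 102, "cv": 0.56738792,
    "min": 1, "max": 34, "range": 33,
    "q1": 9, "q3": 25, "iqr": 16,
}


def derive(r):
    """Recompute the dependent statistics from the independent ones."""
    mean = r["sum"] / r["n"]
    return {
        "mean": mean,
        "variance": r["std"] ** 2,
        "cv": r["std"] / mean,
        "range": r["max"] - r["min"],
        "iqr": r["q3"] - r["q1"],
    }


derived = derive(reported)
for name, value in derived.items():
    # A loose tolerance absorbs the rounding of the printed figures.
    ok = math.isclose(value, reported[name], rel_tol=1e-6)
    print(f"{name}: derived={value:.6f} reported={reported[name]} consistent={ok}")
```

Under these definitions every derived value agrees with the reported one to within rounding.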
Value | Count | Frequency (%) |
1 | 1 | 4.0% |
3 | 1 | 4.0% |
34 | 1 | 4.0% |
33 | 1 | 4.0% |
32 | 1 | 4.0% |
30 | 1 | 4.0% |
29 | 1 | 4.0% |
28 | 1 | 4.0% |
25 | 1 | 4.0% |
24 | 1 | 4.0% |
Other values (15) | 15 | 60.0% |
Value | Count | Frequency (%) |
1 | 1 | 4.0% |
3 | 1 | 4.0% |
4 | 1 | 4.0% |
6 | 1 | 4.0% |
7 | 1 | 4.0% |
8 | 1 | 4.0% |
9 | 1 | 4.0% |
11 | 1 | 4.0% |
12 | 1 | 4.0% |
13 | 1 | 4.0% |
Value | Count | Frequency (%) |
34 | 1 | 4.0% |
33 | 1 | 4.0% |
32 | 1 | 4.0% |
30 | 1 | 4.0% |
29 | 1 | 4.0% |
28 | 1 | 4.0% |
25 | 1 | 4.0% |
24 | 1 | 4.0% |
23 | 1 | 4.0% |
22 | 1 | 4.0% |
시소러스기준단어언어코드
Categorical
CONSTANT
 
Distinct | 1 |
---|---|
Distinct (%) | 4.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 332.0 B |
Length
Max length | 6 |
---|---|
Median length | 6 |
Mean length | 6 |
Min length | 6 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | S10801 |
---|---|
2nd row | S10801 |
3rd row | S10801 |
4th row | S10801 |
5th row | S10801 |
Common Values
Value | Count | Frequency (%) |
S10801 | 25 | 100.0% |
시소러스단어분류코드
Categorical
HIGH CORRELATION
 
Distinct | 3 |
---|---|
Distinct (%) | 12.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 332.0 B |
Length
Max length | 6 |
---|---|
Median length | 6 |
Mean length | 6 |
Min length | 6 |
Unique
Unique | 0 ? |
---|---|
Unique (%) | 0.0% |
Sample
1st row | S10704 |
---|---|
2nd row | S10704 |
3rd row | S10704 |
4th row | S10704 |
5th row | S10704 |
Common Values
Value | Count | Frequency (%) |
S10702 | 15 | 60.0% |
S10704 | 7 | 28.0% |
S10701 | 3 | 12.0% |
시소러스관련단어명
Text
Distinct | 22 |
---|---|
Distinct (%) | 88.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 332.0 B |
Length
Max length | 16 |
---|---|
Median length | 13 |
Mean length | 5.44 |
Min length | 2 |
Characters and Unicode
Total characters | 136 |
---|---|
Distinct characters | 57 |
Distinct categories | 4 |
Distinct scripts | 3 |
Distinct blocks | 2 |
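The category counts reported for this text variable can be illustrated with Python's standard `unicodedata` module, which returns a two-letter general category per character (`Lu` uppercase letter, `Ll` lowercase letter, `Lo` other letter, `Zs` space separator). This is a minimal sketch, not the profiler's implementation; the helper name `category_counts` is hypothetical, and the input is only the five sample rows shown in this report rather than the full column.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample rows listed for 시소러스관련단어명.
sample = ["CERAMIC", "GLASS", "Glass article", "article of glass", "glass product"]

counts = category_counts(sample)
print(dict(counts))                     # {'Lu': 13, 'Ll': 37, 'Zs': 4}
print(dict(category_counts(["유리"])))  # {'Lo': 2}
```

Hangul syllables fall in the `Lo` (Other Letter) category, which is why the Korean entries in this column are counted there. Script and block detection are not covered by `unicodedata` and are left out of this sketch.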
Unique
Unique | 19 |
---|---|
Unique (%) | 76.0% |
Sample
1st row | CERAMIC |
---|---|
2nd row | GLASS |
3rd row | Glass article |
4th row | article of glass |
5th row | glass product |
Value | Count | Frequency (%) |
glass | 5 | 15.6% |
세라믹 | 2 | 6.2% |
글래스 | 2 | 6.2% |
프로덕트 | 2 | 6.2% |
article | 2 | 6.2% |
product | 2 | 6.2% |
시레믹 | 1 | 3.1% |
ceramic | 1 | 3.1% |
새라믹 | 1 | 3.1% |
질그릇 | 1 | 3.1% |
Other values (13) | 13 | 40.6% |
Most occurring characters
Value | Count | Frequency (%) |
s | 10 | 7.4% |
a | 8 | 5.9% |
(space) | 7 | 5.1% |
l | 7 | 5.1% |
스 | 5 | 3.7% |
r | 5 | 3.7% |
g | 4 | 2.9% |
라 | 4 | 2.9% |
믹 | 4 | 2.9% |
c | 4 | 2.9% |
Other values (47) | 78 | 57.4% |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 58 | 42.6% |
Other Letter | 58 | 42.6% |
Uppercase Letter | 13 | 9.6% |
Space Separator | 7 | 5.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
스 | 5 | 8.6% |
라 | 4 | 6.9% |
믹 | 4 | 6.9% |
래 | 3 | 5.2% |
글 | 3 | 5.2% |
유 | 3 | 5.2% |
리 | 3 | 5.2% |
그 | 3 | 5.2% |
레 | 2 | 3.4% |
트 | 2 | 3.4% |
Other values (22) | 26 | 44.8% |
Lowercase Letter
Value | Count | Frequency (%) |
s | 10 | 17.2% |
a | 8 | 13.8% |
l | 7 | 12.1% |
r | 5 | 8.6% |
g | 4 | 6.9% |
c | 4 | 6.9% |
t | 4 | 6.9% |
e | 3 | 5.2% |
o | 3 | 5.2% |
i | 2 | 3.4% |
Other values (5) | 8 | 13.8% |
Uppercase Letter
Value | Count | Frequency (%) |
A | 2 | 15.4% |
C | 2 | 15.4% |
G | 2 | 15.4% |
S | 2 | 15.4% |
I | 1 | 7.7% |
E | 1 | 7.7% |
R | 1 | 7.7% |
M | 1 | 7.7% |
L | 1 | 7.7% |
Space Separator
Value | Count | Frequency (%) |
(space) | 7 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 71 | 52.2% |
Hangul | 58 | 42.6% |
Common | 7 | 5.1% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
스 | 5 | 8.6% |
라 | 4 | 6.9% |
믹 | 4 | 6.9% |
래 | 3 | 5.2% |
글 | 3 | 5.2% |
유 | 3 | 5.2% |
리 | 3 | 5.2% |
그 | 3 | 5.2% |
레 | 2 | 3.4% |
트 | 2 | 3.4% |
Other values (22) | 26 | 44.8% |
Latin
Value | Count | Frequency (%) |
s | 10 | 14.1% |
a | 8 | 11.3% |
l | 7 | 9.9% |
r | 5 | 7.0% |
g | 4 | 5.6% |
c | 4 | 5.6% |
t | 4 | 5.6% |
e | 3 | 4.2% |
o | 3 | 4.2% |
A | 2 | 2.8% |
Other values (14) | 21 | 29.6% |
Common
Value | Count | Frequency (%) |
(space) | 7 | 100.0% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 78 | 57.4% |
Hangul | 58 | 42.6% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
s | 10 | 12.8% |
a | 8 | 10.3% |
(space) | 7 | 9.0% |
l | 7 | 9.0% |
r | 5 | 6.4% |
g | 4 | 5.1% |
c | 4 | 5.1% |
t | 4 | 5.1% |
e | 3 | 3.8% |
o | 3 | 3.8% |
Other values (15) | 23 | 29.5% |
Hangul
Value | Count | Frequency (%) |
스 | 5 | 8.6% |
라 | 4 | 6.9% |
믹 | 4 | 6.9% |
래 | 3 | 5.2% |
글 | 3 | 5.2% |
유 | 3 | 5.2% |
리 | 3 | 5.2% |
그 | 3 | 5.2% |
레 | 2 | 3.4% |
트 | 2 | 3.4% |
Other values (22) | 26 | 44.8% |
시소러스관련단어가중치
Categorical
HIGH CORRELATION
 
Distinct | 3 |
---|---|
Distinct (%) | 12.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 332.0 B |
Length
Max length | 2 |
---|---|
Median length | 1 |
Mean length | 1.04 |
Min length | 1 |
Unique
Unique | 1 |
---|---|
Unique (%) | 4.0% |
Sample
1st row | 7 |
---|---|
2nd row | 7 |
3rd row | 4 |
4th row | 4 |
5th row | 4 |
Common Values
Value | Count | Frequency (%) |
7 | 16 | 64.0% |
4 | 8 | 32.0% |
14 | 1 | 4.0% |
부서코드
Categorical
CONSTANT
 
Distinct | 1 |
---|---|
Distinct (%) | 4.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 332.0 B |
Length
Max length | 1 |
---|---|
Median length | 1 |
Mean length | 1 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | |
---|---|
2nd row | |
3rd row | |
4th row | |
5th row |
Common Values
Value | Count | Frequency (%) |
(blank) | 25 | 100.0% |
시소러스단어입력일자
Categorical
HIGH CORRELATION
  IMBALANCE
 
Distinct | 2 |
---|---|
Distinct (%) | 8.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 332.0 B |
Length
Max length | 8 |
---|---|
Median length | 8 |
Mean length | 8 |
Min length | 8 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 20090701 |
---|---|
2nd row | 20090701 |
3rd row | 20090701 |
4th row | 20090701 |
5th row | 20090701 |
Common Values
Value | Count | Frequency (%) |
20090701 | 23 | 92.0% |
20090602 | 2 | 8.0% |
시소러스단어변경일자
Categorical
HIGH CORRELATION
  IMBALANCE
 
Distinct | 2 |
---|---|
Distinct (%) | 8.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 332.0 B |
Length
Max length | 8 |
---|---|
Median length | 8 |
Mean length | 8 |
Min length | 8 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 20090701 |
---|---|
2nd row | 20090701 |
3rd row | 20090701 |
4th row | 20090701 |
5th row | 20090701 |
Common Values
Value | Count | Frequency (%) |
20090701 | 23 | 92.0% |
20090707 | 2 | 8.0% |
시소러스관련단어중요도구분
Categorical
IMBALANCE
 
Distinct | 2 |
---|---|
Distinct (%) | 8.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 332.0 B |
Length
Max length | 2 |
---|---|
Median length | 2 |
Mean length | 2 |
Min length | 2 |
Unique
Unique | 1 |
---|---|
Unique (%) | 4.0% |
Sample
1st row | 낮음 |
---|---|
2nd row | 낮음 |
3rd row | 낮음 |
4th row | 낮음 |
5th row | 낮음 |
Common Values
Value | Count | Frequency (%) |
낮음 | 24 | 96.0% |
높음 | 1 | 4.0% |
| | 시소러스단어일련번호 | 시소러스단어분류코드 | 시소러스관련단어명 | 시소러스관련단어가중치 | 시소러스단어입력일자 | 시소러스단어변경일자 | 시소러스관련단어중요도구분 |
|---|---|---|---|---|---|---|---|
| 시소러스단어일련번호 | 1.000 | 0.801 | 0.866 | 0.474 | 0.504 | 0.504 | 0.559 |
| 시소러스단어분류코드 | 0.801 | 1.000 | 1.000 | 0.869 | 0.192 | 0.192 | 0.104 |
| 시소러스관련단어명 | 0.866 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000 |
| 시소러스관련단어가중치 | 0.474 | 0.869 | 1.000 | 1.000 | 0.206 | 0.206 | 0.058 |
| 시소러스단어입력일자 | 0.504 | 0.192 | 0.000 | 0.206 | 1.000 | 0.901 | 0.382 |
| 시소러스단어변경일자 | 0.504 | 0.192 | 0.000 | 0.206 | 0.901 | 1.000 | 0.382 |
| 시소러스관련단어중요도구분 | 0.559 | 0.104 | 0.000 | 0.058 | 0.382 | 0.382 | 1.000 |
| | 시소러스단어변경일자 | 시소러스관련단어중요도구분 | 시소러스단어분류코드 | 시소러스관련단어가중치 | 시소러스단어입력일자 |
|---|---|---|---|---|---|
| 시소러스단어변경일자 | 1.000 | 0.246 | 0.304 | 0.325 | 0.714 |
| 시소러스관련단어중요도구분 | 0.246 | 1.000 | 0.158 | 0.074 | 0.246 |
| 시소러스단어분류코드 | 0.304 | 0.158 | 1.000 | 0.559 | 0.304 |
| 시소러스관련단어가중치 | 0.325 | 0.074 | 0.559 | 1.000 | 0.325 |
| 시소러스단어입력일자 | 0.714 | 0.246 | 0.304 | 0.325 | 1.000 |
| | 시소러스단어일련번호 | 시소러스단어분류코드 | 시소러스관련단어가중치 | 시소러스단어입력일자 | 시소러스단어변경일자 | 시소러스관련단어중요도구분 |
|---|---|---|---|---|---|---|
| 시소러스단어일련번호 | 1.000 | 0.564 | 0.238 | 0.292 | 0.292 | 0.330 |
| 시소러스단어분류코드 | 0.564 | 1.000 | 0.559 | 0.304 | 0.304 | 0.158 |
| 시소러스관련단어가중치 | 0.238 | 0.559 | 1.000 | 0.325 | 0.325 | 0.074 |
| 시소러스단어입력일자 | 0.292 | 0.304 | 0.325 | 1.000 | 0.714 | 0.246 |
| 시소러스단어변경일자 | 0.292 | 0.304 | 0.325 | 0.714 | 1.000 | 0.246 |
| 시소러스관련단어중요도구분 | 0.330 | 0.158 | 0.074 | 0.246 | 0.246 | 1.000 |
| | 시소러스IPC섹션구분코드 | 시소러스기준단어명 | 시소러스단어일련번호 | 시소러스기준단어언어코드 | 시소러스단어분류코드 | 시소러스관련단어명 | 시소러스관련단어가중치 | 부서코드 | 시소러스단어입력일자 | 시소러스단어변경일자 | 시소러스관련단어중요도구분 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | G | 유리제품 | 1 | S10801 | S10704 | CERAMIC | 7 | | 20090701 | 20090701 | 낮음 |
| 1 | G | 유리제품 | 3 | S10801 | S10704 | GLASS | 7 | | 20090701 | 20090701 | 낮음 |
| 2 | G | 유리제품 | 4 | S10801 | S10704 | Glass article | 4 | | 20090701 | 20090701 | 낮음 |
| 3 | G | 유리제품 | 6 | S10801 | S10704 | article of glass | 4 | | 20090701 | 20090701 | 낮음 |
| 4 | G | 유리제품 | 7 | S10801 | S10704 | glass product | 4 | | 20090701 | 20090701 | 낮음 |
| 5 | G | 유리제품 | 8 | S10801 | S10704 | glassware | 4 | | 20090701 | 20090701 | 낮음 |
| 6 | G | 유리제품 | 9 | S10801 | S10704 | glass product | 4 | | 20090602 | 20090707 | 높음 |
| 7 | G | 유리제품 | 11 | S10801 | S10702 | 광물 | 7 | | 20090701 | 20090701 | 낮음 |
| 8 | G | 유리제품 | 12 | S10801 | S10702 | 그라스 | 7 | | 20090701 | 20090701 | 낮음 |
| 9 | G | 유리제품 | 13 | S10801 | S10702 | 그래스 | 7 | | 20090701 | 20090701 | 낮음 |
| | 시소러스IPC섹션구분코드 | 시소러스기준단어명 | 시소러스단어일련번호 | 시소러스기준단어언어코드 | 시소러스단어분류코드 | 시소러스관련단어명 | 시소러스관련단어가중치 | 부서코드 | 시소러스단어입력일자 | 시소러스단어변경일자 | 시소러스관련단어중요도구분 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 15 | G | 유리제품 | 22 | S10801 | S10702 | 세라믹 | 7 | | 20090701 | 20090701 | 낮음 |
| 16 | G | 유리제품 | 23 | S10801 | S10702 | 소다유리 | 7 | | 20090701 | 20090701 | 낮음 |
| 17 | G | 유리제품 | 24 | S10801 | S10702 | 세라믹 | 7 | | 20090701 | 20090701 | 낮음 |
| 18 | G | 유리제품 | 25 | S10801 | S10702 | 시레믹 | 7 | | 20090701 | 20090701 | 낮음 |
| 19 | G | 유리제품 | 28 | S10801 | S10702 | 요업 | 7 | | 20090701 | 20090701 | 낮음 |
| 20 | G | 유리제품 | 29 | S10801 | S10702 | 유리 | 7 | | 20090701 | 20090701 | 낮음 |
| 21 | G | 유리제품 | 30 | S10801 | S10701 | 유리재품 | 4 | | 20090701 | 20090701 | 낮음 |
| 22 | G | 유리제품 | 32 | S10801 | S10702 | 점토 | 7 | | 20090701 | 20090701 | 낮음 |
| 23 | G | 유리제품 | 33 | S10801 | S10702 | 질그릇 | 7 | | 20090701 | 20090701 | 낮음 |
| 24 | G | 유리제품 | 34 | S10801 | S10702 | 초자 | 14 | | 20090701 | 20090701 | 낮음 |