Overview

Dataset statistics

Number of variables: 5
Number of observations: 224
Missing cells: 4
Missing cells (%): 0.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 9.5 KiB
Average record size in memory: 43.6 B
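
The missing-cell percentage follows from the counts above: the table has 5 × 224 cells, of which 4 are missing. A minimal sketch of that arithmetic (the variable names are illustrative only):

```python
# Figures taken from the dataset statistics above.
n_variables = 5
n_observations = 224
missing_cells = 4

total_cells = n_variables * n_observations        # 1120 cells in total
missing_pct = 100 * missing_cells / total_cells   # roughly 0.357

print(f"{missing_pct:.1f}%")
```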

Variable types

Text: 2
Numeric: 3

Dataset

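Description: Provides basic data on ceramic classifications and mineral types, sales, and sales growth rates, organized according to the Korean Standard Industrial Classification (KSIC).
Author: Korea Institute of Ceramic Engineering and Technology (한국세라믹기술원)
URL: https://www.data.go.kr/data/3041386/fileData.do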

Alerts

매출액(2012 / 억원) [sales, 2012, in 100 million KRW] is highly overall correlated with 매출액(2013 / 억원) [sales, 2013, in 100 million KRW] (High correlation)
매출액(2013 / 억원) is highly overall correlated with 매출액(2012 / 억원) (High correlation)
매출액(2012 / 억원) has 61 (27.2%) zeros (Zeros)
매출액(2013 / 억원) has 51 (22.8%) zeros (Zeros)
증감율(%) [growth rate, %] has 89 (39.7%) zeros (Zeros)

Reproduction

Analysis started: 2023-12-12 06:45:54.423380
Analysis finished: 2023-12-12 06:45:55.717206
Duration: 1.29 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
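
A report of this kind can be regenerated along the following lines. This is a sketch, not the exact invocation used for this report: the CSV file name, the CP949 encoding, and the report title are assumptions, and the configuration in config.json is not reproduced here.

```python
def build_report(csv_path: str, html_path: str = "report.html") -> None:
    """Profile a CSV file and write the HTML report to html_path."""
    # Imported lazily so the function can be defined without the
    # third-party packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the file is CP949-encoded; change this if decoding fails.
    df = pd.read_csv(csv_path, encoding="cp949")

    # Assumption: the title is a placeholder, not the original report title.
    profile = ProfileReport(df, title="Ceramic industry sales (KSIC)")
    profile.to_file(html_path)


# Hypothetical usage; the file name is a placeholder.
# build_report("ceramic_sales.csv")
```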

Variables

분 류 [classification]
Text

Distinct: 223
Distinct (%): 100.0%
Missing: 1
Missing (%): 0.4%
Memory size: 1.9 KiB

Length

Max length: 5
Median length: 5
Mean length: 4.5067265
Min length: 1

Characters and Unicode

Total characters: 1005
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
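
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories over the five sample values of this variable; it is not the profiler's own implementation, and the helper name `category_counts` is made up for this example. `Lu` is Uppercase Letter and `Nd` is Decimal Number.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample rows listed for this variable.
sample = ["A", "A01", "A0101", "A0102", "A02"]
counts = category_counts(sample)
print(dict(counts))
```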

Unique

Unique: 223
Unique (%): 100.0%

Sample

1st row: A
2nd row: A01
3rd row: A0101
4th row: A0102
5th row: A02

Value               Count  Frequency (%)
a                   1      0.4%
d04                 1      0.4%
d0401               1      0.4%
d0201               1      0.4%
d0202               1      0.4%
d0203               1      0.4%
d0204               1      0.4%
d03                 1      0.4%
d0301               1      0.4%
d0302               1      0.4%
Other values (213)  213    95.5%

Most occurring characters

Value             Count  Frequency (%)
0                 376    37.4%
1                 90     9.0%
2                 76     7.6%
3                 61     6.1%
D                 53     5.3%
4                 52     5.2%
B                 52     5.2%
A                 50     5.0%
5                 47     4.7%
E                 35     3.5%
Other values (5)  113    11.2%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    782    77.8%
Uppercase Letter  223    22.2%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
0      376    48.1%
1      90     11.5%
2      76     9.7%
3      61     7.8%
4      52     6.6%
5      47     6.0%
6      30     3.8%
7      23     2.9%
8      17     2.2%
9      10     1.3%

Uppercase Letter

Value  Count  Frequency (%)
D      53     23.8%
B      52     23.3%
A      50     22.4%
E      35     15.7%
C      33     14.8%

Most occurring scripts

Value   Count  Frequency (%)
Common  782    77.8%
Latin   223    22.2%

Most frequent character per script

Common

Value  Count  Frequency (%)
0      376    48.1%
1      90     11.5%
2      76     9.7%
3      61     7.8%
4      52     6.6%
5      47     6.0%
6      30     3.8%
7      23     2.9%
8      17     2.2%
9      10     1.3%

Latin

Value  Count  Frequency (%)
D      53     23.8%
B      52     23.3%
A      50     22.4%
E      35     15.7%
C      33     14.8%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1005   100.0%

Most frequent character per block

ASCII

Value             Count  Frequency (%)
0                 376    37.4%
1                 90     9.0%
2                 76     7.6%
3                 61     6.1%
D                 53     5.3%
4                 52     5.2%
B                 52     5.2%
A                 50     5.0%
5                 47     4.7%
E                 35     3.5%
Other values (5)  113    11.2%

광물 [mineral]
Text

Distinct: 213
Distinct (%): 95.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Length

Max length: 33
Median length: 17
Mean length: 6.3035714
Min length: 2

Characters and Unicode

Total characters: 1412
Distinct characters: 221
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 207
Unique (%): 92.4%

Sample

1st row: 광물 [minerals]
2nd row: 규산질 원료 [siliceous raw materials]
3rd row: 규사 [silica sand]
4th row: 규조토 [diatomite]
5th row: 규산알루미늄 원료 [aluminum silicate raw materials]

Value                              Count  Frequency (%)
기타 [other]                       26     7.2%
부품 [parts]                       21     5.8%
원료 [raw materials]               15     4.2%
[illegible]                        12     3.3%
세라믹 [ceramic]                   12     3.3%
제품 [products]                    6      1.7%
[illegible]                        4      1.1%
복합산화물 [complex oxides]        4      1.1%
도자기 [pottery]                   4      1.1%
비금속광물 [non-metallic minerals]  4      1.1%
Other values (226)                 253    70.1%

Most occurring characters

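Value               Count  Frequency (%)
(space)             137    9.7%
[illegible]         50     3.5%
[illegible]         46     3.3%
[illegible]         45     3.2%
[illegible]         43     3.0%
[illegible]         35     2.5%
[illegible]         33     2.3%
[illegible]         32     2.3%
[illegible]         31     2.2%
[illegible]         28     2.0%
Other values (211)  932    66.0%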

Most occurring categories

Value              Count  Frequency (%)
Other Letter       1237   87.6%
Space Separator    137    9.7%
Other Punctuation  10     0.7%
Uppercase Letter   10     0.7%
Open Punctuation   6      0.4%
Close Punctuation  6      0.4%
Lowercase Letter   5      0.4%
Decimal Number     1      0.1%

Most frequent character per category

Other Letter

Value               Count  Frequency (%)
[illegible]         50     4.0%
[illegible]         46     3.7%
[illegible]         45     3.6%
[illegible]         43     3.5%
[illegible]         35     2.8%
[illegible]         33     2.7%
[illegible]         32     2.6%
[illegible]         31     2.5%
[illegible]         28     2.3%
[illegible]         24     1.9%
Other values (196)  870    70.3%

Lowercase Letter

Value  Count  Frequency (%)
l      1      20.0%
e      1      20.0%
u      1      20.0%
d      1      20.0%
o      1      20.0%

Uppercase Letter

Value  Count  Frequency (%)
E      3      30.0%
D      3      30.0%
L      3      30.0%
M      1      10.0%

Other Punctuation

Value  Count  Frequency (%)
/      9      90.0%
·      1      10.0%

Space Separator

Value    Count  Frequency (%)
(space)  137    100.0%

Open Punctuation

Value  Count  Frequency (%)
(      6      100.0%

Close Punctuation

Value  Count  Frequency (%)
)      6      100.0%

Decimal Number

Value  Count  Frequency (%)
1      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  1237   87.6%
Common  160    11.3%
Latin   15     1.1%

Most frequent character per script

Hangul

Value               Count  Frequency (%)
[illegible]         50     4.0%
[illegible]         46     3.7%
[illegible]         45     3.6%
[illegible]         43     3.5%
[illegible]         35     2.8%
[illegible]         33     2.7%
[illegible]         32     2.6%
[illegible]         31     2.5%
[illegible]         28     2.3%
[illegible]         24     1.9%
Other values (196)  870    70.3%

Latin

Value  Count  Frequency (%)
E      3      20.0%
D      3      20.0%
L      3      20.0%
l      1      6.7%
e      1      6.7%
u      1      6.7%
d      1      6.7%
o      1      6.7%
M      1      6.7%

Common

Value    Count  Frequency (%)
(space)  137    85.6%
/        9      5.6%
(        6      3.8%
)        6      3.8%
·        1      0.6%
1        1      0.6%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  1237   87.6%
ASCII   174    12.3%
None    1      0.1%

Most frequent character per block

ASCII

Value             Count  Frequency (%)
(space)           137    78.7%
/                 9      5.2%
(                 6      3.4%
)                 6      3.4%
E                 3      1.7%
D                 3      1.7%
L                 3      1.7%
l                 1      0.6%
1                 1      0.6%
e                 1      0.6%
Other values (4)  4      2.3%

Hangul

Value               Count  Frequency (%)
[illegible]         50     4.0%
[illegible]         46     3.7%
[illegible]         45     3.6%
[illegible]         43     3.5%
[illegible]         35     2.8%
[illegible]         33     2.7%
[illegible]         32     2.6%
[illegible]         31     2.5%
[illegible]         28     2.3%
[illegible]         24     1.9%
Other values (196)  870    70.3%

None

Value  Count  Frequency (%)
·      1      100.0%

매출액(2012 / 억원) [sales, 2012, in 100 million KRW]
Real number (ℝ)

HIGH CORRELATION  ZEROS

Distinct: 140
Distinct (%): 62.8%
Missing: 1
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 6514.2646
Minimum: 0
Maximum: 229052
Zeros: 61
Zeros (%): 27.2%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 186
Q3: 2626.5
95th percentile: 31622.6
Maximum: 229052
Range: 229052
Interquartile range (IQR): 2626.5

Descriptive statistics

Standard deviation: 23966.865
Coefficient of variation (CV): 3.6791359
Kurtosis: 46.752387
Mean: 6514.2646
Median Absolute Deviation (MAD): 186
Skewness: 6.3778602
Sum: 1452681
Variance: 5.7441062 × 10^8
Monotonicity: Not monotonic
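
Several of these figures are tied together by simple identities: the mean is the sum divided by the number of non-missing values, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. A small sketch that checks the reported values against each other (variable names are illustrative):

```python
# Values reported above for 매출액(2012 / 억원).
total = 1452681
mean = 6514.2646
std = 23966.865

n = round(total / mean)   # number of non-missing observations
cv = std / mean           # coefficient of variation
variance = std ** 2       # variance is the squared standard deviation

print(n, round(cv, 4), f"{variance:.4e}")
```

The recovered count of 223 matches the 224 observations minus the one missing value.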
Histogram with fixed size bins (bins=50)
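
Fixed-size (equal-width) binning splits the range from minimum to maximum into bins of identical width. The sketch below is a plain-Python illustration, not the profiler's own code; the input is a handful of illustrative values chosen to mimic the skew of this variable, and the function name is hypothetical.

```python
def fixed_bin_histogram(values, bins=50):
    """Count values in `bins` equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        if width == 0:
            idx = 0  # all values identical: everything lands in the first bin
        else:
            # The maximum sits on the right edge, so clamp it into the last bin.
            idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts, width


# Illustrative values only (many small values, one very large one).
values = [0, 0, 0, 9, 26, 186, 2803, 4769, 38120, 229052]
counts, width = fixed_bin_histogram(values, bins=50)
print(round(width, 2), counts[0], counts[-1])
```

With a maximum this far from the bulk of the data, each bin is about 4581 wide, so most observations collapse into the first bin.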
Common values

Value               Count  Frequency (%)
0                   61     27.2%
9                   6      2.7%
4                   4      1.8%
26                  4      1.8%
4769                2      0.9%
2803                2      0.9%
20                  2      0.9%
8                   2      0.9%
40                  2      0.9%
144                 2      0.9%
Other values (130)  136    60.7%

Minimum 10 values

Value  Count  Frequency (%)
0      61     27.2%
4      4      1.8%
8      2      0.9%
9      6      2.7%
10     1      0.4%
12     1      0.4%
14     1      0.4%
16     1      0.4%
17     1      0.4%
19     1      0.4%

Maximum 10 values

Value   Count  Frequency (%)
229052  1      0.4%
155436  1      0.4%
152431  1      0.4%
99084   1      0.4%
93473   1      0.4%
60422   1      0.4%
59038   1      0.4%
44365   1      0.4%
38448   1      0.4%
38120   1      0.4%

매출액(2013 / 억원) [sales, 2013, in 100 million KRW]
Real number (ℝ)

HIGH CORRELATION  ZEROS

Distinct: 141
Distinct (%): 63.2%
Missing: 1
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 7349.6054
Minimum: 0
Maximum: 270200
Zeros: 51
Zeros (%): 22.8%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 8
Median: 210
Q3: 2630.5
95th percentile: 35928.4
Maximum: 270200
Range: 270200
Interquartile range (IQR): 2622.5

Descriptive statistics

Standard deviation: 27365.119
Coefficient of variation (CV): 3.7233453
Kurtosis: 50.712004
Mean: 7349.6054
Median Absolute Deviation (MAD): 210
Skewness: 6.6048075
Sum: 1638962
Variance: 7.4884973 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value               Count  Frequency (%)
0                   51     22.8%
12                  9      4.0%
10                  6      2.7%
4                   4      1.8%
32                  4      1.8%
4987                2      0.9%
306                 2      0.9%
5124                2      0.9%
16                  2      0.9%
8                   2      0.9%
Other values (131)  139    62.1%

Minimum 10 values

Value  Count  Frequency (%)
0      51     22.8%
4      4      1.8%
8      2      0.9%
9      1      0.4%
10     6      2.7%
12     9      4.0%
14     1      0.4%
16     2      0.9%
17     1      0.4%
19     1      0.4%

Maximum 10 values

Value   Count  Frequency (%)
270200  1      0.4%
174291  1      0.4%
172414  1      0.4%
113051  1      0.4%
98507   1      0.4%
60680   1      0.4%
58477   1      0.4%
57093   1      0.4%
43181   1      0.4%
42705   1      0.4%

증감율(%) [growth rate, %]
Real number (ℝ)

ZEROS

Distinct: 106
Distinct (%): 47.5%
Missing: 1
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.991928
Minimum: -53.5
Maximum: 823.6
Zeros: 89
Zeros (%): 39.7%
Negative: 32
Negative (%): 14.3%
Memory size: 2.1 KiB

Quantile statistics

Minimum: -53.5
5th percentile: -9.28
Q1: 0
Median: 0
Q3: 11.1
95th percentile: 59.32
Maximum: 823.6
Range: 877.1
Interquartile range (IQR): 11.1

Descriptive statistics

Standard deviation: 60.718508
Coefficient of variation (CV): 4.6735563
Kurtosis: 144.86365
Mean: 12.991928
Median Absolute Deviation (MAD): 2.1
Skewness: 11.118644
Sum: 2897.2
Variance: 3686.7372
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value              Count  Frequency (%)
0.0                89     39.7%
11.1               6      2.7%
68.4               4      1.8%
23.1               4      1.8%
4.6                3      1.3%
16.3               3      1.3%
0.5                3      1.3%
1.4                2      0.9%
7.3                2      0.9%
2.6                2      0.9%
Other values (96)  105    46.9%

Minimum 10 values

Value  Count  Frequency (%)
-53.5  1      0.4%
-37.7  1      0.4%
-32.8  1      0.4%
-15.6  1      0.4%
-14.3  2      0.9%
-14.0  1      0.4%
-13.7  1      0.4%
-13.4  1      0.4%
-9.8   1      0.4%
-9.3   2      0.9%

Maximum 10 values

Value  Count  Frequency (%)
823.6  1      0.4%
214.9  1      0.4%
162.8  1      0.4%
145.2  1      0.4%
95.7   1      0.4%
68.4   4      1.8%
67.3   1      0.4%
66.2   1      0.4%
60.1   1      0.4%
52.3   1      0.4%

Interactions

[Figure: pairwise scatter plots of the three numeric variables (9 panels)]

Correlations

                     매출액(2012 / 억원)  매출액(2013 / 억원)  증감율(%)
매출액(2012 / 억원)  1.000                0.972                0.000
매출액(2013 / 억원)  0.972                1.000                0.000
증감율(%)            0.000                0.000                1.000

                     매출액(2012 / 억원)  매출액(2013 / 억원)  증감율(%)
매출액(2012 / 억원)  1.000                0.987                0.324
매출액(2013 / 억원)  0.987                1.000                0.356
증감율(%)            0.324                0.356                1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
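
Nullity correlation can be pictured as an ordinary Pearson correlation computed on missing-value indicators (1 where a value is absent, 0 otherwise). The sketch below uses made-up toy columns, not values from this dataset, and the helper name is hypothetical.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None  # undefined when a column is never (or always) missing
    return cov / sqrt(var_a * var_b)


# Hypothetical toy columns, not taken from the dataset.
sales_2012 = [10, None, 30, 40, None]
sales_2013 = [12, None, 33, 41, None]     # missing in the same rows
growth = [20.0, 5.0, None, 2.5, 1.0]      # missing in a different row

same_rows = nullity_correlation(sales_2012, sales_2013)
different = nullity_correlation(sales_2012, growth)
```

Columns that are missing in exactly the same rows score close to 1, while columns whose gaps never coincide score below zero.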

Sample

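First rows

Index  분 류   광물                                                매출액(2012 / 억원)  매출액(2013 / 억원)  증감율(%)
0      A      광물 [minerals]                                     7730                 8066                 4.3
1      A01    규산질 원료 [siliceous raw materials]               1177                 1369                 16.3
2      A0101  규사 [silica sand]                                  921                  1118                 21.4
3      A0102  규조토 [diatomite]                                  256                  251                  -2.0
4      A02    규산알루미늄 원료 [aluminum silicate raw materials]  388                  471                  21.4
5      A0201  실리마나이트족 광물 [sillimanite group minerals]    38                   64                   68.4
6      A0202  카올린족 광물 [kaolin group minerals]               76                   128                  68.4
7      A0203  엽납석 [pyrophyllite]                               237                  215                  -9.3
8      A0204  점토 [clay]                                         38                   64                   68.4
9      A03    알루미나 원료 [alumina raw materials]               51                   63                   23.5

Last rows

Index  분 류   광물                                                매출액(2012 / 억원)  매출액(2013 / 억원)  증감율(%)
214    E0601  필터 [filters]                                      525                  528                  0.6
215    E0602  촉매담체 [catalyst supports]                        11458                11581                1.1
216    E0603  기타 [other]                                        0                    0                    0.0
217    E07    열적 세라믹 부품 [thermal ceramic parts]            3019                 3196                 5.9
218    E0701  내열세라믹 부품 [heat-resistant ceramic parts]      2742                 2920                 6.5
219    E0702  발열용 부품 [heating parts]                         237                  237                  0.0
220    E0703  금속제조용 부품 [metal manufacturing parts]         40                   40                   0.0
221    E08    방탄 세라믹 부품 [ballistic ceramic parts]          70                   60                   -14.3
222    E0801  방탄용 부품 [ballistic parts]                       70                   60                   -14.3
223    E0802  기타 [other]                                        0                    0                    0.0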
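
The sample rows are consistent with 증감율(%) being the year-over-year change from the 2012 to the 2013 sales figure, rounded to one decimal place. That formula is inferred from the rows shown, not stated by the data provider, so the sketch below should be read as a consistency check under that assumption; the function name is hypothetical.

```python
def growth_rate(prev, curr):
    """Year-over-year change in percent, rounded to one decimal place."""
    if prev == 0:
        # Assumption: rows going from 0 to 0 are listed as 0.0 in the sample;
        # any other change from zero is left undefined here.
        return 0.0 if curr == 0 else None
    return round((curr - prev) / prev * 100, 1)


# (code, sales 2012, sales 2013, reported growth rate) from the sample rows.
rows = [
    ("A", 7730, 8066, 4.3),
    ("A01", 1177, 1369, 16.3),
    ("A0102", 256, 251, -2.0),
    ("E0602", 11458, 11581, 1.1),
    ("E08", 70, 60, -14.3),
    ("E0603", 0, 0, 0.0),
]

mismatches = [code for code, s12, s13, reported in rows
              if growth_rate(s12, s13) != reported]
print(mismatches)
```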