Overview

Dataset statistics

Number of variables: 8
Number of observations: 1887
Missing cells: 81
Missing cells (%): 0.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 125.4 KiB
Average record size in memory: 68.1 B
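
As a quick sanity check, the missing-cell percentage can be re-derived from the counts above. This is a minimal sketch that assumes the percentage is missing cells divided by all cells (variables × observations); the report itself does not state the formula.

```python
# Figures copied from the dataset statistics above.
n_variables = 8
n_observations = 1887
missing_cells = 81

# Assumption: the percentage is taken over all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

Under that assumption the result rounds to the 0.5% shown in the report.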

Variable types

Numeric: 3
Text: 3
Boolean: 1
Categorical: 1

Dataset

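Description: Industry classification table from the miscellaneous system-management codes of the Hazardous Chemical Handling Facility Inspection Management System. Columns: 업종분류아이디 (industry classification ID), 분류코드 (classification code), 업종분류명 (industry classification name), 업종분류영문명 (industry classification English name), 부모_아이디 (parent ID), 순서 (order), 사용여부 (in-use flag), 레벨 (level)
Author: Korea Environment Corporation (한국환경공단)
URL: https://www.data.go.kr/data/15093717/fileData.do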

Alerts

사용여부 has constant value "True" (Constant)
산업분류아이디 is highly overall correlated with 부모아이디 (High correlation)
부모아이디 is highly overall correlated with 산업분류아이디 (High correlation)
순서 is highly overall correlated with 레벨 (High correlation)
레벨 is highly overall correlated with 순서 (High correlation)
부모아이디 has 76 (4.0%) missing values (Missing)
산업분류아이디 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 08:14:25.804470
Analysis finished: 2023-12-12 08:14:27.823352
Duration: 2.02 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

산업분류아이디
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 1887
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 968.42077
Minimum: 1
Maximum: 1999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 16.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 95.3
Q1: 474.5
Median: 964
Q3: 1457.5
95-th percentile: 1845.7
Maximum: 1999
Range: 1998
Interquartile range (IQR): 983

Descriptive statistics

Standard deviation: 562.46811
Coefficient of variation (CV): 0.58080963
Kurtosis: -1.2131563
Mean: 968.42077
Median Absolute Deviation (MAD): 492
Skewness: -0.0001215177
Sum: 1827410
Variance: 316370.38
Monotonicity: Not monotonic
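
Several of the figures above are tied together by simple identities. The sketch below re-derives them from the reported values for 산업분류아이디; it assumes the usual definitions (mean = sum / count, variance = standard deviation squared, CV = standard deviation / mean, IQR = Q3 − Q1), which the report does not spell out.

```python
# Figures copied from the 산업분류아이디 statistics above.
count = 1887            # observations, none missing
total = 1827410         # Sum
mean = 968.42077
std = 562.46811
q1, q3 = 474.5, 1457.5
minimum, maximum = 1, 1999

# Assumed textbook definitions; small differences come from rounding
# in the reported inputs.
derived = {
    "mean": total / count,
    "variance": std ** 2,
    "cv": std / mean,
    "iqr": q3 - q1,
    "range": maximum - minimum,
}
for name, value in derived.items():
    print(f"{name}: {value:.5f}")
```

The derived values agree with the reported Mean, Variance, CV, IQR and Range to within the rounding of the inputs.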
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
575 | 1 | 0.1%
1919 | 1 | 0.1%
1931 | 1 | 0.1%
1930 | 1 | 0.1%
1929 | 1 | 0.1%
1928 | 1 | 0.1%
1927 | 1 | 0.1%
1926 | 1 | 0.1%
1925 | 1 | 0.1%
1924 | 1 | 0.1%
Other values (1877) | 1877 | 99.5%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 1 | 0.1%
7 | 1 | 0.1%
8 | 1 | 0.1%
9 | 1 | 0.1%
10 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
1999 | 1 | 0.1%
1979 | 1 | 0.1%
1959 | 1 | 0.1%
1949 | 1 | 0.1%
1936 | 1 | 0.1%
1935 | 1 | 0.1%
1934 | 1 | 0.1%
1933 | 1 | 0.1%
1932 | 1 | 0.1%
1931 | 1 | 0.1%

분류코드
Text

Distinct: 1848
Distinct (%): 97.9%
Missing: 0
Missing (%): 0.0%
Memory size: 14.9 KiB

Length

Max length: 5
Median length: 5
Mean length: 4.3290938
Min length: 1

Characters and Unicode

Total characters: 8169
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
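
To illustrate how such category counts can be obtained, here is a minimal sketch using Python's standard `unicodedata` module. The helper `category_counts` and the `CATEGORY_NAMES` mapping are illustrative names, not part of ydata-profiling. The value "test" is a guess suggested by the lowercase letters t, e, s counted in this report; it is not confirmed to be in the data.

```python
import unicodedata
from collections import Counter

# Readable names for the general-category codes used in this sketch.
CATEGORY_NAMES = {"Nd": "Decimal Number", "Ll": "Lowercase Letter"}

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# Five sample codes from this report plus a hypothetical stray value
# "test" (an assumption, included only to show the letter category).
sample = ["2612", "27329", "274", "2740", "27401", "test"]
print(category_counts(sample))
```

The standard library exposes the general category but not script or block names, so those two breakdowns would need an additional data source.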

Unique

Unique: 1811
Unique (%): 96.0%

Sample

1st row: 2612
2nd row: 27329
3rd row: 274
4th row: 2740
5th row: 27401

Value | Count | Frequency (%)
111 | 3 | 0.2%
11 | 3 | 0.2%
201 | 2 | 0.1%
20 | 2 | 0.1%
3120 | 2 | 0.1%
141 | 2 | 0.1%
203 | 2 | 0.1%
2020 | 2 | 0.1%
2012 | 2 | 0.1%
2011 | 2 | 0.1%
Other values (1838) | 1865 | 98.8%

Most occurring characters

Value | Count | Frequency (%)
1 | 1778 | 21.8%
2 | 1642 | 20.1%
9 | 884 | 10.8%
3 | 782 | 9.6%
4 | 732 | 9.0%
0 | 711 | 8.7%
6 | 504 | 6.2%
5 | 464 | 5.7%
7 | 384 | 4.7%
8 | 284 | 3.5%
Other values (3) | 4 | < 0.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 8165 | > 99.9%
Lowercase Letter | 4 | < 0.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 1778 | 21.8%
2 | 1642 | 20.1%
9 | 884 | 10.8%
3 | 782 | 9.6%
4 | 732 | 9.0%
0 | 711 | 8.7%
6 | 504 | 6.2%
5 | 464 | 5.7%
7 | 384 | 4.7%
8 | 284 | 3.5%

Lowercase Letter
Value | Count | Frequency (%)
t | 2 | 50.0%
e | 1 | 25.0%
s | 1 | 25.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 8165 | > 99.9%
Latin | 4 | < 0.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 1778 | 21.8%
2 | 1642 | 20.1%
9 | 884 | 10.8%
3 | 782 | 9.6%
4 | 732 | 9.0%
0 | 711 | 8.7%
6 | 504 | 6.2%
5 | 464 | 5.7%
7 | 384 | 4.7%
8 | 284 | 3.5%

Latin
Value | Count | Frequency (%)
t | 2 | 50.0%
e | 1 | 25.0%
s | 1 | 25.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 8169 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 1778 | 21.8%
2 | 1642 | 20.1%
9 | 884 | 10.8%
3 | 782 | 9.6%
4 | 732 | 9.0%
0 | 711 | 8.7%
6 | 504 | 6.2%
5 | 464 | 5.7%
7 | 384 | 4.7%
8 | 284 | 3.5%
Other values (3) | 4 | < 0.1%

업종분류명
Text

Distinct: 1618
Distinct (%): 85.7%
Missing: 0
Missing (%): 0.0%
Memory size: 14.9 KiB

Length

Max length: 37
Median length: 27
Mean length: 11.926338
Min length: 2

Characters and Unicode

Total characters: 22505
Distinct characters: 446
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1385
Unique (%): 73.4%

Sample

1st row: 다이오드, 트랜지스터 및 유사 반도체소자 제조업
2nd row: 기타 광학기기 제조업
3rd row: 시계 및 시계부품 제조업
4th row: 시계 및 시계부품 제조업
5th row: 시계제조업

Value | Count | Frequency (%)
(value not preserved) | 788 | 12.2%
제조업 | 637 | 9.8%
기타 | 336 | 5.2%
서비스업 | 138 | 2.1%
소매업 | 98 | 1.5%
도매업 | 92 | 1.4%
그외 | 75 | 1.2%
운영업 | 65 | 1.0%
운송업 | 45 | 0.7%
기계 | 34 | 0.5%
Other values (1590) | 4173 | 64.4%

Most occurring characters

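Value | Count | Frequency (%)
(space) | 4597 | 20.4%
(character not preserved) | 1803 | 8.0%
(character not preserved) | 927 | 4.1%
(character not preserved) | 791 | 3.5%
(character not preserved) | 788 | 3.5%
(character not preserved) | 773 | 3.4%
(character not preserved) | 352 | 1.6%
(character not preserved) | 351 | 1.6%
(character not preserved) | 324 | 1.4%
(character not preserved) | 320 | 1.4%
Other values (436) | 11479 | 51.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 17628 | 78.3%
Space Separator | 4597 | 20.4%
Other Punctuation | 255 | 1.1%
Decimal Number | 13 | 0.1%
Lowercase Letter | 12 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(character not preserved) | 1803 | 10.2%
(character not preserved) | 927 | 5.3%
(character not preserved) | 791 | 4.5%
(character not preserved) | 788 | 4.5%
(character not preserved) | 773 | 4.4%
(character not preserved) | 352 | 2.0%
(character not preserved) | 351 | 2.0%
(character not preserved) | 324 | 1.8%
(character not preserved) | 320 | 1.8%
(character not preserved) | 257 | 1.5%
Other values (424) | 10942 | 62.1%

Lowercase Letter
Value | Count | Frequency (%)
s | 4 | 33.3%
a | 3 | 25.0%
d | 2 | 16.7%
t | 2 | 16.7%
e | 1 | 8.3%

Other Punctuation
Value | Count | Frequency (%)
, | 222 | 87.1%
· | 23 | 9.0%
; | 10 | 3.9%

Decimal Number
Value | Count | Frequency (%)
1 | 10 | 76.9%
2 | 2 | 15.4%
3 | 1 | 7.7%

Space Separator
Value | Count | Frequency (%)
(space) | 4597 | 100.0%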

Most occurring scripts

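Value | Count | Frequency (%)
Hangul | 17628 | 78.3%
Common | 4865 | 21.6%
Latin | 12 | 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(character not preserved) | 1803 | 10.2%
(character not preserved) | 927 | 5.3%
(character not preserved) | 791 | 4.5%
(character not preserved) | 788 | 4.5%
(character not preserved) | 773 | 4.4%
(character not preserved) | 352 | 2.0%
(character not preserved) | 351 | 2.0%
(character not preserved) | 324 | 1.8%
(character not preserved) | 320 | 1.8%
(character not preserved) | 257 | 1.5%
Other values (424) | 10942 | 62.1%

Common
Value | Count | Frequency (%)
(space) | 4597 | 94.5%
, | 222 | 4.6%
· | 23 | 0.5%
1 | 10 | 0.2%
; | 10 | 0.2%
2 | 2 | < 0.1%
3 | 1 | < 0.1%

Latin
Value | Count | Frequency (%)
s | 4 | 33.3%
a | 3 | 25.0%
d | 2 | 16.7%
t | 2 | 16.7%
e | 1 | 8.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 17628 | 78.3%
ASCII | 4854 | 21.6%
None | 23 | 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 4597 | 94.7%
, | 222 | 4.6%
1 | 10 | 0.2%
; | 10 | 0.2%
s | 4 | 0.1%
a | 3 | 0.1%
d | 2 | < 0.1%
2 | 2 | < 0.1%
t | 2 | < 0.1%
3 | 1 | < 0.1%

Hangul
Value | Count | Frequency (%)
(character not preserved) | 1803 | 10.2%
(character not preserved) | 927 | 5.3%
(character not preserved) | 791 | 4.5%
(character not preserved) | 788 | 4.5%
(character not preserved) | 773 | 4.4%
(character not preserved) | 352 | 2.0%
(character not preserved) | 351 | 2.0%
(character not preserved) | 324 | 1.8%
(character not preserved) | 320 | 1.8%
(character not preserved) | 257 | 1.5%
Other values (424) | 10942 | 62.1%

None
Value | Count | Frequency (%)
· | 23 | 100.0%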

업종분류영문명
Text

Distinct: 1605
Distinct (%): 85.1%
Missing: 2
Missing (%): 0.1%
Memory size: 14.9 KiB

Length

Max length: 146
Median length: 86
Mean length: 40.901326
Min length: 1

Characters and Unicode

Total characters: 77099
Distinct characters: 58
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1366
Unique (%): 72.5%

Sample

1st row: Manufacture of Diodes, Transistors and Similar Semi-conductor Devices
2nd row: Manufacture of Other Optical Instruments and Photographic Equipment
3rd row: Manufacture of Watches, Clocks and its Parts
4th row: Manufacture of Watches, Clocks and its Parts
5th row: Manufacture of Watches and Clocks

Value | Count | Frequency (%)
of | 1195 | 11.6%
and | 948 | 9.2%
manufacture | 634 | 6.1%
other | 345 | 3.3%
services | 165 | 1.6%
products | 148 | 1.4%
activities | 147 | 1.4%
sale | 116 | 1.1%
retail | 98 | 0.9%
equipment | 98 | 0.9%
Other values (1440) | 6431 | 62.3%

Most occurring characters

Value | Count | Frequency (%)
(space) | 8542 | 11.1%
e | 6736 | 8.7%
a | 6020 | 7.8%
n | 5211 | 6.8%
i | 4993 | 6.5%
r | 4845 | 6.3%
t | 4838 | 6.3%
o | 4618 | 6.0%
s | 3499 | 4.5%
c | 3081 | 4.0%
Other values (48) | 24716 | 32.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 60109 | 78.0%
Space Separator | 8542 | 11.1%
Uppercase Letter | 7730 | 10.0%
Other Punctuation | 579 | 0.8%
Dash Punctuation | 139 | 0.2%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 6736 | 11.2%
a | 6020 | 10.0%
n | 5211 | 8.7%
i | 4993 | 8.3%
r | 4845 | 8.1%
t | 4838 | 8.0%
o | 4618 | 7.7%
s | 3499 | 5.8%
c | 3081 | 5.1%
u | 2798 | 4.7%
Other values (16) | 13470 | 22.4%

Uppercase Letter
Value | Count | Frequency (%)
M | 1124 | 14.5%
S | 855 | 11.1%
P | 753 | 9.7%
C | 559 | 7.2%
A | 535 | 6.9%
O | 533 | 6.9%
R | 435 | 5.6%
F | 387 | 5.0%
E | 340 | 4.4%
W | 322 | 4.2%
Other values (15) | 1887 | 24.4%

Other Punctuation
Value | Count | Frequency (%)
, | 331 | 57.2%
. | 225 | 38.9%
; | 11 | 1.9%
' | 9 | 1.6%
: | 3 | 0.5%

Space Separator
Value | Count | Frequency (%)
(space) | 8542 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 139 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 67839 | 88.0%
Common | 9260 | 12.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 6736 | 9.9%
a | 6020 | 8.9%
n | 5211 | 7.7%
i | 4993 | 7.4%
r | 4845 | 7.1%
t | 4838 | 7.1%
o | 4618 | 6.8%
s | 3499 | 5.2%
c | 3081 | 4.5%
u | 2798 | 4.1%
Other values (41) | 21200 | 31.3%

Common
Value | Count | Frequency (%)
(space) | 8542 | 92.2%
, | 331 | 3.6%
. | 225 | 2.4%
- | 139 | 1.5%
; | 11 | 0.1%
' | 9 | 0.1%
: | 3 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 77099 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 8542 | 11.1%
e | 6736 | 8.7%
a | 6020 | 7.8%
n | 5211 | 6.8%
i | 4993 | 6.5%
r | 4845 | 6.3%
t | 4838 | 6.3%
o | 4618 | 6.0%
s | 3499 | 4.5%
c | 3081 | 4.0%
Other values (48) | 24716 | 32.1%

부모아이디
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 776
Distinct (%): 42.8%
Missing: 76
Missing (%): 4.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 963.56875
Minimum: 1
Maximum: 1999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 16.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 96
Q1: 474
Median: 962
Q3: 1451
95-th percentile: 1841
Maximum: 1999
Range: 1998
Interquartile range (IQR): 977

Descriptive statistics

Standard deviation: 561.07272
Coefficient of variation (CV): 0.58228613
Kurtosis: -1.2079243
Mean: 963.56875
Median Absolute Deviation (MAD): 488
Skewness: 0.0061835998
Sum: 1745023
Variance: 314802.6
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
157 | 9 | 0.5%
1112 | 9 | 0.5%
724 | 9 | 0.5%
688 | 9 | 0.5%
107 | 8 | 0.4%
939 | 8 | 0.4%
1025 | 8 | 0.4%
922 | 7 | 0.4%
1142 | 7 | 0.4%
474 | 7 | 0.4%
Other values (766) | 1730 | 91.7%
(Missing) | 76 | 4.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 5 | 0.3%
2 | 5 | 0.3%
3 | 1 | 0.1%
5 | 3 | 0.2%
9 | 2 | 0.1%
12 | 1 | 0.1%
14 | 3 | 0.2%
18 | 4 | 0.2%
19 | 2 | 0.1%
22 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
1999 | 1 | 0.1%
1979 | 1 | 0.1%
1959 | 1 | 0.1%
1949 | 1 | 0.1%
1934 | 2 | 0.1%
1933 | 1 | 0.1%
1932 | 1 | 0.1%
1930 | 1 | 0.1%
1929 | 1 | 0.1%
1927 | 1 | 0.1%

순서
Real number (ℝ)

HIGH CORRELATION 

Distinct: 78
Distinct (%): 4.1%
Missing: 3
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.9156051
Minimum: 1
Maximum: 99
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 16.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 2
Q3: 3
95-th percentile: 7
Maximum: 99
Range: 98
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 10.447949
Coefficient of variation (CV): 2.6682845
Kurtosis: 44.40928
Mean: 3.9156051
Median Absolute Deviation (MAD): 1
Skewness: 6.4572751
Sum: 7377
Variance: 109.15963
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1 | 774 | 41.0%
2 | 500 | 26.5%
3 | 267 | 14.1%
4 | 142 | 7.5%
5 | 71 | 3.8%
6 | 31 | 1.6%
7 | 18 | 1.0%
8 | 8 | 0.4%
9 | 4 | 0.2%
26 | 1 | 0.1%
Other values (68) | 68 | 3.6%
(Missing) | 3 | 0.2%

Minimum 10 values

Value | Count | Frequency (%)
1 | 774 | 41.0%
2 | 500 | 26.5%
3 | 267 | 14.1%
4 | 142 | 7.5%
5 | 71 | 3.8%
6 | 31 | 1.6%
7 | 18 | 1.0%
8 | 8 | 0.4%
9 | 4 | 0.2%
10 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
99 | 1 | 0.1%
98 | 1 | 0.1%
97 | 1 | 0.1%
96 | 1 | 0.1%
95 | 1 | 0.1%
94 | 1 | 0.1%
91 | 1 | 0.1%
90 | 1 | 0.1%
87 | 1 | 0.1%
86 | 1 | 0.1%

사용여부
Boolean

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB

Value | Count | Frequency (%)
True | 1887 | 100.0%

레벨
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 14.9 KiB

Length

Max length: 4
Median length: 1
Mean length: 1.0047695
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 4
3rd row: 2
4th row: 3
5th row: 4

Common Values

Value | Count | Frequency (%)
4 | 1107 | 58.7%
3 | 473 | 25.1%
2 | 227 | 12.0%
1 | 77 | 4.1%
<NA> | 3 | 0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
4 | 1107 | 58.7%
3 | 473 | 25.1%
2 | 227 | 12.0%
1 | 77 | 4.1%
na | 3 | 0.2%

Interactions

[Figure: 9 interaction plots; images not included]

Correlations

Variable | 산업분류아이디 | 부모아이디 | 순서 | 레벨
산업분류아이디 | 1.000 | 1.000 | 0.418 | 0.000
부모아이디 | 1.000 | 1.000 | NaN | 0.063
순서 | 0.418 | NaN | 1.000 | 0.732
레벨 | 0.000 | 0.063 | 0.732 | 1.000

Variable | 산업분류아이디 | 부모아이디 | 순서 | 레벨
산업분류아이디 | 1.000 | 1.000 | -0.005 | 0.000
부모아이디 | 1.000 | 1.000 | -0.021 | 0.038
순서 | -0.005 | -0.021 | 1.000 | 0.537
레벨 | 0.000 | 0.038 | 0.537 | 1.000
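
The high-correlation alerts near the top of the report can be related to the first correlation matrix above with a simple threshold scan. The sketch below is illustrative only: the helper `high_pairs` is a hypothetical name, and the cut-off of 0.5 is an assumption, since the report does not state which threshold was configured.

```python
import math

NAN = float("nan")
columns = ["산업분류아이디", "부모아이디", "순서", "레벨"]
# First correlation matrix above, row by row.
matrix = [
    [1.000, 1.000, 0.418, 0.000],
    [1.000, 1.000, NAN, 0.063],
    [0.418, NAN, 1.000, 0.732],
    [0.000, 0.063, 0.732, 1.000],
]

def high_pairs(columns, matrix, threshold=0.5):
    """Return (a, b, value) for each off-diagonal pair at or above threshold.

    The default threshold is an assumption, not a value from the report.
    """
    pairs = []
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            value = matrix[i][j]
            if not math.isnan(value) and abs(value) >= threshold:
                pairs.append((columns[i], columns[j], value))
    return pairs

for a, b, value in high_pairs(columns, matrix):
    print(f"{a} <-> {b}: {value:.3f}")
```

With that assumed cut-off, the scan returns the same two pairs that the alerts list: 산업분류아이디 with 부모아이디, and 순서 with 레벨.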

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
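
One way to read "nullity correlation" is as the correlation between the missing-value indicators of two columns. The sketch below assumes a Pearson correlation on 0/1 indicators, which the report does not confirm. The helper `nullity_correlation` and the toy columns are hypothetical and not taken from the dataset.

```python
def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # indicator is constant, correlation undefined
    return cov / (var_a * var_b) ** 0.5

# Hypothetical toy columns (None marks a missing value).
order = [1, None, 3, None, 5]
level = ["1", None, "3", None, "4"]
parent = [None, 10, 11, 12, None]

print(nullity_correlation(order, level))   # identical missing pattern
print(nullity_correlation(order, parent))  # disjoint missing pattern
```

A value near 1 means the two columns tend to be missing in the same rows; a negative value means one tends to be present where the other is missing.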

Sample

First rows

(index) | 산업분류아이디 | 분류코드 | 업종분류명 | 업종분류영문명 | 부모아이디 | 순서 | 사용여부 | 레벨
0 | 575 | 2612 | 다이오드, 트랜지스터 및 유사 반도체소자 제조업 | Manufacture of Diodes, Transistors and Similar Semi-conductor Devices | 572 | 2 | Y | 3
1 | 642 | 27329 | 기타 광학기기 제조업 | Manufacture of Other Optical Instruments and Photographic Equipment | 639 | 3 | Y | 4
2 | 643 | 274 | 시계 및 시계부품 제조업 | Manufacture of Watches, Clocks and its Parts | 617 | 4 | Y | 2
3 | 644 | 2740 | 시계 및 시계부품 제조업 | Manufacture of Watches, Clocks and its Parts | 643 | 1 | Y | 3
4 | 645 | 27401 | 시계제조업 | Manufacture of Watches and Clocks | 644 | 1 | Y | 4
5 | 646 | 27402 | 시계부품 제조업 | Manufacture of parts of Watches and Clocks | 644 | 2 | Y | 4
6 | 647 | 28 | 전기장비 제조업 | Manufacture of electrical equipment | <NA> | 28 | Y | 1
7 | 648 | 281 | 전동기, 발전기 및 전기 변환 · 공급 · 제어 장치 제조업 | Manufacture of Electric Motors, Generators and Transforming, Distributing and Controlling Apparatus of Electricity | 647 | 1 | Y | 2
8 | 649 | 2811 | 전동기, 발전기 및 전기변환장치 제조업 | Manufacture of Electric Motors, Generators and Transformers | 648 | 1 | Y | 3
9 | 650 | 28111 | 전동기 및 발전기 제조업 | Manufacture of Electric Motors and Generators | 649 | 1 | Y | 4

Last rows

(index) | 산업분류아이디 | 분류코드 | 업종분류명 | 업종분류영문명 | 부모아이디 | 순서 | 사용여부 | 레벨
1877 | 997 | 46104 | 기계장비 중개업 | Brokerage of Machinery Equipment | 993 | 4 | Y | 4
1878 | 998 | 46105 | 상품종합 중개업 | Brokerage of a Variety of Goods | 993 | 5 | Y | 4
1879 | 999 | 46109 | 기타 상품 중개업 | Brokerage of Other Commodity | 993 | 6 | Y | 4
1880 | 1000 | 462 | 산업용 농축산물 및 산동물 도매업 | Wholesale of Agricultural Raw Materials and Live Animals | 991 | 2 | Y | 2
1881 | 1001 | 4620 | 산업용 농축산물 및 산동물 도매업 | Wholesale of Agricultural Raw Materials and Live Animals | 1000 | 1 | Y | 3
1882 | 1002 | 46201 | 곡물 도매업 | Wholesale of Grains | 1001 | 1 | Y | 4
1883 | 1003 | 46202 | 종자 및 묘목 도매업 | Wholesale of Seeds and seedling | 1001 | 2 | Y | 4
1884 | 1004 | 46203 | 사료 도매업 | Wholesale of Animal Feeds | 1001 | 3 | Y | 4
1885 | 1005 | 46204 | 화초 및 산식물 도매업 | Wholesale of Flowers and Plants | 1001 | 4 | Y | 4
1886 | 1006 | 46205 | 산동물 도매업 | Wholesale of Live Animals | 1001 | 5 | Y | 4
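
The sample rows suggest that 부모아이디 points at the 산업분류아이디 of the parent entry and that 레벨 is the depth in that hierarchy. This reading is an inference from the sample, not something the dataset description states. The sketch below checks it for ids 647 to 650 as read from the first sample rows above; the helper `depth` is a hypothetical name.

```python
# (산업분류아이디, 부모아이디, 레벨) for ids 647-650, as read from the sample.
# None stands for the <NA> parent of the top-level entry.
rows = [
    (647, None, 1),
    (648, 647, 2),
    (649, 648, 3),
    (650, 649, 4),
]
parent_of = {node_id: parent for node_id, parent, _ in rows}

def depth(node_id):
    """Count nodes on the path from node_id up to its root (a root has depth 1)."""
    steps = 1
    while parent_of.get(node_id) is not None:
        node_id = parent_of[node_id]
        steps += 1
    return steps

for node_id, _, level in rows:
    print(node_id, depth(node_id), level)
```

For these four rows the computed depth equals the recorded 레벨, which is consistent with the parent-pointer reading. Other rows would need their full ancestor chain loaded before the same check applies.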