Overview

Dataset statistics

Number of variables: 4
Number of observations: 68
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.4 KiB
Average record size in memory: 36.9 B

Variable types

Text: 1
Numeric: 3

Dataset

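Description: Provides the current number of small, medium, and large enterprises in Ulsan Metropolitan City, broken down by industry classification (agriculture, forestry, fishing, metals, non-metals, textiles, apparel, coal, etc.).
Author: Ulsan Metropolitan City (울산광역시)
URL: https://www.data.go.kr/data/3076119/fileData.do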

Alerts

소기업 (small enterprises) is highly overall correlated with 중기업 (medium enterprises) [High correlation]
중기업 is highly overall correlated with 소기업 and 1 other field [High correlation]
대기업 (large enterprises) is highly overall correlated with 중기업 [High correlation]
산업분류 (industry classification) has unique values [Unique]
중기업 has 12 (17.6%) zeros [Zeros]
대기업 has 42 (61.8%) zeros [Zeros]
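
The percentages in the two zero alerts are each variable's zero count divided by the 68 observations. A minimal sketch that recomputes them from the reported counts; `zero_share` is a hypothetical helper written for this illustration, not a ydata-profiling function.

```python
N_OBSERVATIONS = 68  # "Number of observations" in the dataset statistics

# Zero counts as reported in the alerts.
zero_counts = {"중기업": 12, "대기업": 42}


def zero_share(zeros: int, total: int) -> float:
    """Return the share of zero-valued cells as a percentage."""
    return 100.0 * zeros / total


shares = {name: round(zero_share(count, N_OBSERVATIONS), 1)
          for name, count in zero_counts.items()}
for name, pct in shares.items():
    print(f"{name}: {pct}% zeros")
```

Rounded to one decimal place, the results agree with the 17.6% and 61.8% shown in the alerts.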

Reproduction

Analysis started: 2024-03-14 14:39:56.918944
Analysis finished: 2024-03-14 14:39:59.573852
Duration: 2.65 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
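
The reported duration is the difference between the two timestamps. A short sketch of that arithmetic, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the Reproduction section.
started = datetime.strptime("2024-03-14 14:39:56.918944", FMT)
finished = datetime.strptime("2024-03-14 14:39:59.573852", FMT)

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```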

Variables

산업분류
Text

UNIQUE 

Distinct: 68
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 672.0 B

Length

Max length: 28
Median length: 17.5
Mean length: 12.029412
Min length: 2

Characters and Unicode

Total characters: 818
Distinct characters: 160
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
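
As an illustration of those character properties, the sketch below uses Python's standard `unicodedata` module to count general categories in one 산업분류 value taken from this report. It only shows the idea and is not necessarily how the report itself computes its tables.

```python
import unicodedata
from collections import Counter

# One value of 산업분류, copied from the sample rows of this report.
value = "비금속광물 광업 ; 연료용 제외"

# unicodedata.category returns the two-letter general category code,
# e.g. "Lo" (Other Letter), "Zs" (Space Separator), "Po" (Other Punctuation).
category_counts = Counter(unicodedata.category(ch) for ch in value)
print(category_counts)
```

Hangul syllables fall under "Lo", the space under "Zs", and the semicolon under "Po", which correspond to the Other Letter, Space Separator, and Other Punctuation categories listed in this section.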

Unique

Unique: 68
Unique (%): 100.0%

Sample

1st row: 농업
2nd row: 임업
3rd row: 어업
4th row: 비금속광물 광업 ; 연료용 제외
5th row: 광업 지원 서비스업
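
The length statistics for this variable are computed over string lengths. A small sketch on the five sample values only, so the numbers differ from the full-column figures reported under Length:

```python
from statistics import mean, median

# The five sample values of 산업분류 listed above.
sample = [
    "농업",
    "임업",
    "어업",
    "비금속광물 광업 ; 연료용 제외",
    "광업 지원 서비스업",
]

lengths = [len(s) for s in sample]  # len() counts Unicode code points
print("max:", max(lengths), "median:", median(lengths),
      "mean:", mean(lengths), "min:", min(lengths))
```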
Value | Count | Frequency (%)
(not rendered) | 35 | 15.0%
제조업 | 22 | 9.4%
서비스업 | 12 | 5.2%
기타 | 6 | 2.6%
제외 | 6 | 2.6%
자동차 | 3 | 1.3%
운송업 | 3 | 1.3%
보험 | 2 | 0.9%
기계 | 2 | 0.9%
광업 | 2 | 0.9%
Other values (135) | 140 | 60.1%

Most occurring characters

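Value | Count | Frequency (%)
(space) | 165 | 20.2%
(not rendered) | 72 | 8.8%
(not rendered) | 42 | 5.1%
(not rendered) | 35 | 4.3%
(not rendered) | 24 | 2.9%
(not rendered) | 22 | 2.7%
(not rendered) | 21 | 2.6%
(not rendered) | 19 | 2.3%
(not rendered) | 18 | 2.2%
, | 18 | 2.2%
Other values (150) | 382 | 46.7%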

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 626 | 76.5%
Space Separator | 165 | 20.2%
Other Punctuation | 26 | 3.2%
Decimal Number | 1 | 0.1%

Most frequent character per category

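Other Letter
Value | Count | Frequency (%)
(not rendered) | 72 | 11.5%
(not rendered) | 42 | 6.7%
(not rendered) | 35 | 5.6%
(not rendered) | 24 | 3.8%
(not rendered) | 22 | 3.5%
(not rendered) | 21 | 3.4%
(not rendered) | 19 | 3.0%
(not rendered) | 18 | 2.9%
(not rendered) | 15 | 2.4%
(not rendered) | 8 | 1.3%
Other values (145) | 350 | 55.9%

Other Punctuation
Value | Count | Frequency (%)
, | 18 | 69.2%
; | 7 | 26.9%
· | 1 | 3.8%

Space Separator
Value | Count | Frequency (%)
(space) | 165 | 100.0%

Decimal Number
Value | Count | Frequency (%)
1 | 1 | 100.0%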

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 626 | 76.5%
Common | 192 | 23.5%

Most frequent character per script

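Hangul
Value | Count | Frequency (%)
(not rendered) | 72 | 11.5%
(not rendered) | 42 | 6.7%
(not rendered) | 35 | 5.6%
(not rendered) | 24 | 3.8%
(not rendered) | 22 | 3.5%
(not rendered) | 21 | 3.4%
(not rendered) | 19 | 3.0%
(not rendered) | 18 | 2.9%
(not rendered) | 15 | 2.4%
(not rendered) | 8 | 1.3%
Other values (145) | 350 | 55.9%

Common
Value | Count | Frequency (%)
(space) | 165 | 85.9%
, | 18 | 9.4%
; | 7 | 3.6%
· | 1 | 0.5%
1 | 1 | 0.5%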

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 626 | 76.5%
ASCII | 191 | 23.3%
None | 1 | 0.1%

Most frequent character per block

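ASCII
Value | Count | Frequency (%)
(space) | 165 | 86.4%
, | 18 | 9.4%
; | 7 | 3.7%
1 | 1 | 0.5%

Hangul
Value | Count | Frequency (%)
(not rendered) | 72 | 11.5%
(not rendered) | 42 | 6.7%
(not rendered) | 35 | 5.6%
(not rendered) | 24 | 3.8%
(not rendered) | 22 | 3.5%
(not rendered) | 21 | 3.4%
(not rendered) | 19 | 3.0%
(not rendered) | 18 | 2.9%
(not rendered) | 15 | 2.4%
(not rendered) | 8 | 1.3%
Other values (145) | 350 | 55.9%

None
Value | Count | Frequency (%)
· | 1 | 100.0%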

소기업
Real number (ℝ)

HIGH CORRELATION 

Distinct: 63
Distinct (%): 92.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1165.8676
Minimum: 1
Maximum: 18082
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 740.0 B

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 30.5
Median: 214.5
Q3: 667
95-th percentile: 5252.85
Maximum: 18082
Range: 18081
Interquartile range (IQR): 636.5

Descriptive statistics

Standard deviation: 2979.5328
Coefficient of variation (CV): 2.5556355
Kurtosis: 20.637705
Mean: 1165.8676
Median Absolute Deviation (MAD): 202
Skewness: 4.3509584
Sum: 79279
Variance: 8877615.6
Monotonicity: Not monotonic
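
Several of the 소기업 figures follow from others by simple arithmetic. The sketch below recomputes them from the reported values as a consistency check. Because the inputs are already rounded, small differences in the last digit are expected.

```python
# Values copied from the 소기업 statistics in this report.
n = 68
total = 79279
minimum, maximum = 1, 18082
q1, q3 = 30.5, 667
std = 2979.5328

mean = total / n                 # reported mean: 1165.8676
value_range = maximum - minimum  # reported range: 18081
iqr = q3 - q1                    # reported IQR: 636.5
cv = std / mean                  # reported CV: 2.5556355
variance = std ** 2              # reported variance: 8877615.6

print(mean, value_range, iqr, cv, variance)
```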
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
2 | 3 | 4.4%
11 | 2 | 2.9%
1 | 2 | 2.9%
23 | 2 | 2.9%
149 | 1 | 1.5%
51 | 1 | 1.5%
84 | 1 | 1.5%
3 | 1 | 1.5%
26 | 1 | 1.5%
948 | 1 | 1.5%
Other values (53) | 53 | 77.9%

Value | Count | Frequency (%)
1 | 2 | 2.9%
2 | 3 | 4.4%
3 | 1 | 1.5%
11 | 2 | 2.9%
12 | 1 | 1.5%
13 | 1 | 1.5%
17 | 1 | 1.5%
18 | 1 | 1.5%
19 | 1 | 1.5%
23 | 2 | 2.9%

Value | Count | Frequency (%)
18082 | 1 | 1.5%
14451 | 1 | 1.5%
6871 | 1 | 1.5%
5690 | 1 | 1.5%
4441 | 1 | 1.5%
4292 | 1 | 1.5%
2628 | 1 | 1.5%
2569 | 1 | 1.5%
2547 | 1 | 1.5%
1928 | 1 | 1.5%

중기업
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 36
Distinct (%): 52.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.058824
Minimum: 0
Maximum: 285
Zeros: 12
Zeros (%): 17.6%
Negative: 0
Negative (%): 0.0%
Memory size: 740.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1.75
Median: 8
Q3: 39
95-th percentile: 166.5
Maximum: 285
Range: 285
Interquartile range (IQR): 37.25

Descriptive statistics

Standard deviation: 59.965135
Coefficient of variation (CV): 1.8704721
Kurtosis: 9.8367691
Mean: 32.058824
Median Absolute Deviation (MAD): 8
Skewness: 3.1227565
Sum: 2180
Variance: 3595.8174
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=36)
Value | Count | Frequency (%)
0 | 12 | 17.6%
1 | 5 | 7.4%
2 | 4 | 5.9%
4 | 4 | 5.9%
8 | 4 | 5.9%
39 | 3 | 4.4%
6 | 3 | 4.4%
10 | 2 | 2.9%
3 | 2 | 2.9%
41 | 2 | 2.9%
Other values (26) | 27 | 39.7%

Value | Count | Frequency (%)
0 | 12 | 17.6%
1 | 5 | 7.4%
2 | 4 | 5.9%
3 | 2 | 2.9%
4 | 4 | 5.9%
6 | 3 | 4.4%
7 | 1 | 1.5%
8 | 4 | 5.9%
9 | 1 | 1.5%
10 | 2 | 2.9%

Value | Count | Frequency (%)
285 | 1 | 1.5%
278 | 1 | 1.5%
231 | 1 | 1.5%
205 | 1 | 1.5%
95 | 1 | 1.5%
93 | 1 | 1.5%
76 | 1 | 1.5%
70 | 1 | 1.5%
59 | 1 | 1.5%
52 | 1 | 1.5%

대기업
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 9
Distinct (%): 13.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.3088235
Minimum: 0
Maximum: 65
Zeros: 42
Zeros (%): 61.8%
Negative: 0
Negative (%): 0.0%
Memory size: 740.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 2
95-th percentile: 5
Maximum: 65
Range: 65
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 8.2665616
Coefficient of variation (CV): 3.5804216
Kurtosis: 51.289188
Mean: 2.3088235
Median Absolute Deviation (MAD): 0
Skewness: 6.8762358
Sum: 157
Variance: 68.33604
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
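
A rough sketch of fixed-width binning with 9 bins, using the 대기업 value counts listed in this section. The bin edges (equal-width intervals over [min, max], with the maximum placed in the last bin) are an assumption made for illustration; the report does not state its exact edge convention, and `fixed_width_histogram` is a hypothetical helper.

```python
# Frequency table of 대기업 from this report: value -> count.
frequencies = {0: 42, 1: 5, 2: 7, 3: 4, 4: 2, 5: 5, 7: 1, 21: 1, 65: 1}


def fixed_width_histogram(freqs: dict[int, int], bins: int) -> list[int]:
    """Count observations in `bins` equal-width intervals over [min, max]."""
    lo, hi = min(freqs), max(freqs)
    width = (hi - lo) / bins
    counts = [0] * bins
    for value, count in freqs.items():
        # The maximum falls on the right edge, so clamp it into the last bin.
        index = min(int((value - lo) / width), bins - 1)
        counts[index] += count
    return counts


histogram = fixed_width_histogram(frequencies, bins=9)
print(histogram)
```

With a bin width of about 7.2, every value from 0 to 7 lands in the first bin, which is why a variable with this much skew looks like a single spike in a fixed-width histogram.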
Value | Count | Frequency (%)
0 | 42 | 61.8%
2 | 7 | 10.3%
5 | 5 | 7.4%
1 | 5 | 7.4%
3 | 4 | 5.9%
4 | 2 | 2.9%
7 | 1 | 1.5%
65 | 1 | 1.5%
21 | 1 | 1.5%

Value | Count | Frequency (%)
0 | 42 | 61.8%
1 | 5 | 7.4%
2 | 7 | 10.3%
3 | 4 | 5.9%
4 | 2 | 2.9%
5 | 5 | 7.4%
7 | 1 | 1.5%
21 | 1 | 1.5%
65 | 1 | 1.5%

Value | Count | Frequency (%)
65 | 1 | 1.5%
21 | 1 | 1.5%
7 | 1 | 1.5%
5 | 5 | 7.4%
4 | 2 | 2.9%
3 | 4 | 5.9%
2 | 7 | 10.3%
1 | 5 | 7.4%
0 | 42 | 61.8%
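
Because 대기업 has only 9 distinct values, its frequency table determines all 68 observations, so the summary statistics can be re-derived from it. A minimal sketch using Python's `statistics` module; using the sample variance (n - 1 denominator) and the "inclusive" quantile method is an assumption about how the reported figures were produced.

```python
import statistics

# Frequency table of 대기업 from this report: value -> count.
frequencies = {0: 42, 1: 5, 2: 7, 3: 4, 4: 2, 5: 5, 7: 1, 21: 1, 65: 1}

# Expand the table back into the 68 individual observations (sorted).
values = sorted(v for v, c in frequencies.items() for _ in range(c))

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)
cv = std / mean
# Quartiles with linear interpolation between order statistics.
q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")

print(len(values), sum(values), mean, variance, std, cv, q1, med, q3)
```

Under those assumptions the results agree with the reported sum (157), mean, variance, standard deviation, CV, and quartiles to the precision shown in the report.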

Interactions

[Figure: nine pairwise interaction plots for 소기업, 중기업, and 대기업]

Correlations

 | 산업분류 | 소기업 | 중기업 | 대기업
산업분류 | 1.000 | 1.000 | 1.000 | 1.000
소기업 | 1.000 | 1.000 | 0.826 | 0.000
중기업 | 1.000 | 0.826 | 1.000 | 0.464
대기업 | 1.000 | 0.000 | 0.464 | 1.000

 | 소기업 | 중기업 | 대기업
소기업 | 1.000 | 0.788 | 0.418
중기업 | 0.788 | 1.000 | 0.571
대기업 | 0.418 | 0.571 | 1.000
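
This extract does not say which coefficient produced each matrix. As an illustration of one rank-based option, the sketch below computes Spearman's rho as the Pearson correlation of average ranks, using 소기업 and 중기업 from nine of the sample rows in this report. The helper names are hypothetical, and a nine-row subset will not reproduce the full-data values shown above.

```python
from statistics import mean


def average_ranks(values: list[float]) -> list[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x: list[float], y: list[float]) -> float:
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# 소기업 and 중기업 for sample rows 58-62 and 64-67 of this report.
small = [293, 751, 321, 4292, 1129, 277, 2569, 1928, 5690]
medium = [21, 70, 20, 76, 278, 1, 7, 95, 44]
rho = spearman(small, medium)
print(f"Spearman rho on the nine sample rows: {rho:.3f}")
```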

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

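Index | 산업분류 | 소기업 | 중기업 | 대기업
0 | 농업 | 12 | 0 | 0
1 | 임업 | 11 | 0 | 0
2 | 어업 | 2 | 0 | 0
3 | 비금속광물 광업 ; 연료용 제외 | 13 | 4 | 0
4 | 광업 지원 서비스업 | 1 | 0 | 0
5 | 식료품 제조업 | 998 | 6 | 2
6 | 음료 제조업 | 19 | 1 | 0
7 | 섬유제품 제조업; 의복 제외 | 214 | 17 | 2
8 | 의복, 의복액세서리 및 모피제품 제조업 | 139 | 0 | 0
9 | 가죽, 가방 및 신발 제조업 | 29 | 1 | 0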
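
Index | 산업분류 | 소기업 | 중기업 | 대기업
58 | 사업시설 관리 및 조경 서비스업 | 293 | 21 | 0
59 | 사업지원 서비스업 | 751 | 70 | 1
60 | 임대업;부동산 제외 | 321 | 20 | 0
61 | 교육 서비스업 | 4292 | 76 | 0
62 | 보건업 | 1129 | 278 | 0
63 | 사회복지 서비스업 | 664 | 8 | 0
64 | 창작, 예술 및 여가관련 서비스업 | 277 | 1 | 0
65 | 스포츠 및 오락관련 서비스업 | 2569 | 7 | 0
66 | 개인 및 소비용품 수리업 | 1928 | 95 | 0
67 | 기타 개인 서비스업 | 5690 | 44 | 0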