Overview

Dataset statistics

Number of variables: 5
Number of observations: 30
Missing cells: 9
Missing cells (%): 6.0%
Duplicate rows: 1
Duplicate rows (%): 3.3%
Total size in memory: 1.3 KiB
Average record size in memory: 45.4 B
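
The two percentages follow directly from the counts. A minimal sketch of that arithmetic, using only the figures reported in this section (the variable names are illustrative, not part of the report):

```python
# Figures copied from the dataset statistics in this section.
n_variables = 5
n_observations = 30
missing_cells = 9
duplicate_rows = 1

total_cells = n_variables * n_observations             # 150 cells in total
missing_pct = 100 * missing_cells / total_cells        # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_observations  # share of duplicate rows

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```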

Variable types

Numeric: 1
Text: 2
Categorical: 2

Dataset

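Description: This dataset publishes the current list of air pollutant emission facilities located in Dong-gu, Ulsan Metropolitan City (business name, location, industry, class).
URL: https://www.data.go.kr/data/15119404/fileData.do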

Alerts

Dataset has 1 (3.3%) duplicate rows (Duplicates)
연번 has 3 (10.0%) missing values (Missing)
업소명 has 3 (10.0%) missing values (Missing)
소재지 has 3 (10.0%) missing values (Missing)

Reproduction

Analysis started: 2023-12-12 23:32:37.918010
Analysis finished: 2023-12-12 23:32:38.553155
Duration: 0.64 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
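
The reported duration is the difference between the two timestamps. A short sketch of that calculation, assuming both timestamps share the same clock and time zone:

```python
from datetime import datetime

# Timestamps copied from the "Analysis started" and "Analysis finished" fields.
started = datetime.fromisoformat("2023-12-12 23:32:37.918010")
finished = datetime.fromisoformat("2023-12-12 23:32:38.553155")

duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```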

Variables

연번 (serial number)
Real number (ℝ)

MISSING 

Distinct: 27
Distinct (%): 100.0%
Missing: 3
Missing (%): 10.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14
Minimum: 1
Maximum: 27
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 1
5-th percentile: 2.3
Q1: 7.5
Median: 14
Q3: 20.5
95-th percentile: 25.7
Maximum: 27
Range: 26
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 7.9372539
Coefficient of variation (CV): 0.56694671
Kurtosis: -1.2
Mean: 14
Median Absolute Deviation (MAD): 7
Skewness: 0
Sum: 378
Variance: 63
Monotonicity: Strictly increasing
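
Most of these figures can be checked by hand. The sketch below assumes the 27 non-missing values are the integers 1 to 27, which is consistent with the reported minimum, maximum, distinct count, sum and monotonicity but is not stated outright in the report. The `percentile` helper is a hypothetical linear-interpolation implementation written for this illustration.

```python
import statistics

# Assumption: the non-missing values are exactly 1, 2, ..., 27.
values = list(range(1, 28))

def percentile(sorted_values, q):
    """Percentile with linear interpolation between neighbouring ranks."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)       # sample standard deviation
cv = std_dev / mean                      # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)  # median absolute deviation
q1 = percentile(values, 0.25)
q3 = percentile(values, 0.75)

print(total, mean, variance, round(std_dev, 7), round(cv, 8))
print(median, mad, q1, q3, q3 - q1)
```

Under that assumption the sample variance (n - 1 denominator) reproduces the reported value of 63.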
Histogram with fixed size bins (bins=27)
Value | Count | Frequency (%)
2 | 1 | 3.3%
27 | 1 | 3.3%
26 | 1 | 3.3%
25 | 1 | 3.3%
24 | 1 | 3.3%
23 | 1 | 3.3%
22 | 1 | 3.3%
21 | 1 | 3.3%
20 | 1 | 3.3%
19 | 1 | 3.3%
Other values (17) | 17 | 56.7%
(Missing) | 3 | 10.0%

Value | Count | Frequency (%)
1 | 1 | 3.3%
2 | 1 | 3.3%
3 | 1 | 3.3%
4 | 1 | 3.3%
5 | 1 | 3.3%
6 | 1 | 3.3%
7 | 1 | 3.3%
8 | 1 | 3.3%
9 | 1 | 3.3%
10 | 1 | 3.3%

Value | Count | Frequency (%)
27 | 1 | 3.3%
26 | 1 | 3.3%
25 | 1 | 3.3%
24 | 1 | 3.3%
23 | 1 | 3.3%
22 | 1 | 3.3%
21 | 1 | 3.3%
20 | 1 | 3.3%
19 | 1 | 3.3%
18 | 1 | 3.3%

업소명 (business name)
Text

MISSING 

Distinct: 27
Distinct (%): 100.0%
Missing: 3
Missing (%): 10.0%
Memory size: 372.0 B

Length

Max length: 18
Median length: 13
Mean length: 10.444444
Min length: 5

Characters and Unicode

Total characters: 282
Distinct characters: 86
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
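
As an illustration of how such character properties can be read, the sketch below uses Python's standard `unicodedata` module on one value taken from this variable's sample rows. It is not necessarily how the report itself computes these counts.

```python
import unicodedata
from collections import Counter

# One value copied from the sample rows of this variable.
name = "㈜현대백화점 서부2점"

# unicodedata.category returns the two-letter general category of a character:
# "Lo" = Other Letter, "So" = Other Symbol, "Zs" = Space Separator,
# "Nd" = Decimal Number.
category_counts = Counter(unicodedata.category(ch) for ch in name)

for category, count in category_counts.most_common():
    print(category, count)
```

The four categories that appear here are the same four listed for this variable under "Most occurring categories".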

Unique

Unique: 27
Unique (%): 100.0%

Sample

1st row: ㈜동일쇼파
2nd row: ㈜한국정비
3rd row: ㈜현대백화점 동구점
4th row: ㈜현대백화점 서부2점
5th row: 그린정비공업사
Value | Count | Frequency (%)
현대중공업㈜ | 6 | 14.6%
울산대학교병원 | 2 | 4.9%
㈜현대미포조선 | 2 | 4.9%
㈜현대백화점 | 2 | 4.9%
전하재기숙사 | 1 | 2.4%
율전재 | 1 | 2.4%
현대스포츠 | 1 | 2.4%
한국조선해양㈜ | 1 | 2.4%
한마음회관 | 1 | 2.4%
삼전재 | 1 | 2.4%
Other values (23) | 23 | 56.1%

Most occurring characters

Value | Count | Frequency (%)
 | 18 | 6.4%
 | 18 | 6.4%
 | 16 | 5.7%
 | 15 | 5.3%
 | 14 | 5.0%
 | 10 | 3.5%
 | 9 | 3.2%
 | 8 | 2.8%
 | 8 | 2.8%
 | 8 | 2.8%
Other values (76) | 158 | 56.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 239 | 84.8%
Other Symbol | 18 | 6.4%
Space Separator | 18 | 6.4%
Decimal Number | 7 | 2.5%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 16 | 6.7%
 | 15 | 6.3%
 | 14 | 5.9%
 | 10 | 4.2%
 | 9 | 3.8%
 | 8 | 3.3%
 | 8 | 3.3%
 | 8 | 3.3%
 | 7 | 2.9%
 | 6 | 2.5%
Other values (72) | 138 | 57.7%

Decimal Number
Value | Count | Frequency (%)
1 | 4 | 57.1%
2 | 3 | 42.9%

Other Symbol
Value | Count | Frequency (%)
 | 18 | 100.0%

Space Separator
Value | Count | Frequency (%)
 | 18 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 257 | 91.1%
Common | 25 | 8.9%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 18 | 7.0%
 | 16 | 6.2%
 | 15 | 5.8%
 | 14 | 5.4%
 | 10 | 3.9%
 | 9 | 3.5%
 | 8 | 3.1%
 | 8 | 3.1%
 | 8 | 3.1%
 | 7 | 2.7%
Other values (73) | 144 | 56.0%

Common
Value | Count | Frequency (%)
 | 18 | 72.0%
1 | 4 | 16.0%
2 | 3 | 12.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 239 | 84.8%
ASCII | 25 | 8.9%
None | 18 | 6.4%

Most frequent character per block

None
Value | Count | Frequency (%)
 | 18 | 100.0%

ASCII
Value | Count | Frequency (%)
 | 18 | 72.0%
1 | 4 | 16.0%
2 | 3 | 12.0%

Hangul
Value | Count | Frequency (%)
 | 16 | 6.7%
 | 15 | 6.3%
 | 14 | 5.9%
 | 10 | 4.2%
 | 9 | 3.8%
 | 8 | 3.3%
 | 8 | 3.3%
 | 8 | 3.3%
 | 7 | 2.9%
 | 6 | 2.5%
Other values (72) | 138 | 57.7%

소재지 (location)
Text

MISSING 

Distinct: 26
Distinct (%): 96.3%
Missing: 3
Missing (%): 10.0%
Memory size: 372.0 B

Length

Max length: 19
Median length: 18
Mean length: 15.296296
Min length: 12

Characters and Unicode

Total characters: 413
Distinct characters: 46
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 25
Unique (%): 92.6%

Sample

1st row: 방어진순환도로 693 (일산동)
2nd row: 방어진순환도로 560 (화정동)
3rd row: 방어진순환도로 899 (서부동)
4th row: 방어진순환도로 995-2 (서부동)
5th row: 방어진순환도로 494 (방어동)
Value | Count | Frequency (%)
방어진순환도로 | 15 | 18.5%
방어동 | 9 | 11.1%
전하동 | 6 | 7.4%
서부동 | 6 | 7.4%
화정동 | 3 | 3.7%
30 | 3 | 3.7%
녹수2길 | 2 | 2.5%
일산동 | 2 | 2.5%
봉수로 | 1 | 1.2%
16 | 1 | 1.2%
Other values (33) | 33 | 40.7%

Most occurring characters

Value | Count | Frequency (%)
 | 55 | 13.3%
 | 28 | 6.8%
( | 27 | 6.5%
) | 27 | 6.5%
 | 24 | 5.8%
 | 24 | 5.8%
 | 19 | 4.6%
 | 15 | 3.6%
 | 15 | 3.6%
 | 15 | 3.6%
Other values (36) | 164 | 39.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 223 | 54.0%
Decimal Number | 80 | 19.4%
Space Separator | 55 | 13.3%
Open Punctuation | 27 | 6.5%
Close Punctuation | 27 | 6.5%
Dash Punctuation | 1 | 0.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 28 | 12.6%
 | 24 | 10.8%
 | 24 | 10.8%
 | 19 | 8.5%
 | 15 | 6.7%
 | 15 | 6.7%
 | 15 | 6.7%
 | 15 | 6.7%
 | 8 | 3.6%
 | 7 | 3.1%
Other values (22) | 53 | 23.8%

Decimal Number
Value | Count | Frequency (%)
1 | 12 | 15.0%
0 | 10 | 12.5%
3 | 10 | 12.5%
5 | 9 | 11.2%
4 | 9 | 11.2%
6 | 7 | 8.8%
9 | 7 | 8.8%
2 | 6 | 7.5%
7 | 5 | 6.2%
8 | 5 | 6.2%

Space Separator
Value | Count | Frequency (%)
 | 55 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 27 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 27 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 223 | 54.0%
Common | 190 | 46.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 28 | 12.6%
 | 24 | 10.8%
 | 24 | 10.8%
 | 19 | 8.5%
 | 15 | 6.7%
 | 15 | 6.7%
 | 15 | 6.7%
 | 15 | 6.7%
 | 8 | 3.6%
 | 7 | 3.1%
Other values (22) | 53 | 23.8%

Common
Value | Count | Frequency (%)
 | 55 | 28.9%
( | 27 | 14.2%
) | 27 | 14.2%
1 | 12 | 6.3%
0 | 10 | 5.3%
3 | 10 | 5.3%
5 | 9 | 4.7%
4 | 9 | 4.7%
6 | 7 | 3.7%
9 | 7 | 3.7%
Other values (4) | 17 | 8.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 223 | 54.0%
ASCII | 190 | 46.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
 | 55 | 28.9%
( | 27 | 14.2%
) | 27 | 14.2%
1 | 12 | 6.3%
0 | 10 | 5.3%
3 | 10 | 5.3%
5 | 9 | 4.7%
4 | 9 | 4.7%
6 | 7 | 3.7%
9 | 7 | 3.7%
Other values (4) | 17 | 8.9%

Hangul
Value | Count | Frequency (%)
 | 28 | 12.6%
 | 24 | 10.8%
 | 24 | 10.8%
 | 19 | 8.5%
 | 15 | 6.7%
 | 15 | 6.7%
 | 15 | 6.7%
 | 15 | 6.7%
 | 8 | 3.6%
 | 7 | 3.1%
Other values (22) | 53 | 23.8%

업종 (industry)
Categorical

Distinct: 12
Distinct (%): 40.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 13
Median length: 6
Mean length: 5.8333333
Min length: 3

Unique

Unique: 8
Unique (%): 26.7%

Sample

1st row: 가구 및 기타제품 제조업
2nd row: 자동차수리업
3rd row: 백화점
4th row: 백화점
5th row: 자동차수리업

Common Values

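Value | Count | Frequency (%)
자동차수리업 (auto repair) | 9 | 30.0%
부동산임대업 (real estate leasing) | 8 | 26.7%
<NA> | 3 | 10.0%
백화점 (department store) | 2 | 6.7%
가구 및 기타제품 제조업 (furniture and other products manufacturing) | 1 | 3.3%
종합병원 (general hospital) | 1 | 3.3%
호텔업 (hotel business) | 1 | 3.3%
조립금속제품제조업 (fabricated metal products manufacturing) | 1 | 3.3%
영화관운영업 (cinema operation) | 1 | 3.3%
수영장운영업 (swimming pool operation) | 1 | 3.3%
Other values (2) | 2 | 6.7%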

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
자동차수리업 | 9 | 27.3%
부동산임대업 | 8 | 24.2%
na | 3 | 9.1%
백화점 | 2 | 6.1%
가구 | 1 | 3.0%
및 | 1 | 3.0%
기타제품 | 1 | 3.0%
제조업 | 1 | 3.0%
종합병원 | 1 | 3.0%
호텔업 | 1 | 3.0%
Other values (5) | 5 | 15.2%

종별 (class)
Categorical

Distinct: 3
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 4
Median length: 2
Mean length: 2.2
Min length: 2
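
These length figures are consistent with the `<NA>` placeholder being counted as an ordinary four-character string, which would also explain why this variable reports 0 missing values. A small sketch under that assumption, using the value counts listed under Common Values for this variable:

```python
import statistics

# Assumption: 14 values "5종", 13 values "4종" and 3 literal "<NA>" strings.
values = ["5종"] * 14 + ["4종"] * 13 + ["<NA>"] * 3
lengths = [len(v) for v in values]

max_length = max(lengths)
min_length = min(lengths)
median_length = statistics.median(lengths)
mean_length = statistics.mean(lengths)

print(max_length, median_length, round(mean_length, 1), min_length)
```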

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 5종
2nd row: 5종
3rd row: 5종
4th row: 5종
5th row: 5종

Common Values

ValueCountFrequency (%)
Value | Count | Frequency (%)
5종 | 14 | 46.7%
4종 | 13 | 43.3%
<NA> | 3 | 10.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
5종 | 14 | 46.7%
4종 | 13 | 43.3%
na | 3 | 10.0%

Interactions


Correlations

 | 연번 | 업소명 | 소재지 | 업종 | 종별
연번 | 1.000 | 1.000 | 0.937 | 0.494 | 0.498
업소명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
소재지 | 0.937 | 1.000 | 1.000 | 1.000 | 1.000
업종 | 0.494 | 1.000 | 1.000 | 1.000 | 0.194
종별 | 0.498 | 1.000 | 1.000 | 0.194 | 1.000

 | 종별 | 업종
종별 | 1.000 | 0.089
업종 | 0.089 | 1.000

 | 연번 | 업종 | 종별
연번 | 1.000 | 0.178 | 0.298
업종 | 0.178 | 1.000 | 0.089
종별 | 0.298 | 0.089 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
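
To make the idea of nullity correlation concrete, the sketch below computes a Pearson correlation between per-column "is missing" indicators. It assumes 30 rows in which 연번 and 업소명 are missing in the same final three rows, as the sample suggests. `other_is_null` is a purely hypothetical column added for contrast, and the `pearson` helper is written for this illustration rather than taken from the report's software.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# 1 marks a missing cell, 0 a present one.
serial_is_null = [0] * 27 + [1] * 3   # assumed pattern for 연번
name_is_null = [0] * 27 + [1] * 3     # assumed pattern for 업소명
other_is_null = [1] * 3 + [0] * 27    # hypothetical column, missing elsewhere

same_rows = pearson(serial_is_null, name_is_null)
different_rows = pearson(serial_is_null, other_is_null)
print(round(same_rows, 3), round(different_rows, 3))
```

Columns that are missing in exactly the same rows correlate at 1, while columns missing in disjoint rows give a slightly negative value.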

Sample

 | 연번 | 업소명 | 소재지 | 업종 | 종별
0 | 1 | ㈜동일쇼파 | 방어진순환도로 693 (일산동) | 가구 및 기타제품 제조업 | 5종
1 | 2 | ㈜한국정비 | 방어진순환도로 560 (화정동) | 자동차수리업 | 5종
2 | 3 | ㈜현대백화점 동구점 | 방어진순환도로 899 (서부동) | 백화점 | 5종
3 | 4 | ㈜현대백화점 서부2점 | 방어진순환도로 995-2 (서부동) | 백화점 | 5종
4 | 5 | 그린정비공업사 | 방어진순환도로 494 (방어동) | 자동차수리업 | 5종
5 | 6 | 동울산정비 | 방어진순환도로 448 (방어동) | 자동차수리업 | 4종
6 | 7 | 동울산정비2공장 | 방어진순환도로 446 (방어동) | 자동차수리업 | 5종
7 | 8 | 문현1급종합정비 | 방어진순환도로 445 (방어동) | 자동차수리업 | 5종
8 | 9 | 베어정비공업사 | 방어진순환도로 831 (전하동) | 자동차수리업 | 4종
9 | 10 | 울산광역시방어진공영차고지 | 문현로 120 (방어동) | 자동차수리업 | 4종

 | 연번 | 업소명 | 소재지 | 업종 | 종별
20 | 21 | 현대중공업㈜ 삼전재 | 방어진순환도로 1035 (서부동) | 부동산임대업 | 4종
21 | 22 | 현대중공업㈜ 화암재1동 | 화잠로 53 (방어동) | 부동산임대업 | 5종
22 | 23 | 한국조선해양㈜ 현대스포츠 | 봉수로 507 (서부동) | 부동산임대업 | 4종
23 | 24 | 현대중공업㈜ 율전재 | 방어진순환도로 955 (서부동) | 부동산임대업 | 4종
24 | 25 | 울산대학교병원 전하재기숙사 | 녹수2길 30 (전하동) | 부동산임대업 | 5종
25 | 26 | ㈜현대미포조선 미래재기숙사 | 문현6길 41 (방어동) | 주거용건물임대업 | 4종
26 | 27 | ㈜대한제21호위탁관리부동산투자회사 | 방어진순환도로 637 (일산동) | 부동산임대업 | 4종
27 | <NA> | <NA> | <NA> | <NA> | <NA>
28 | <NA> | <NA> | <NA> | <NA> | <NA>
29 | <NA> | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

 | 연번 | 업소명 | 소재지 | 업종 | 종별 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | 3
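
The overview reports 1 duplicate row while this table shows a count of 3. One possible reading, offered here as an assumption rather than a confirmed description of the tool, is that the overview counts distinct repeated rows and this table counts how often each occurs. The sketch below illustrates that distinction with placeholder records; only the three all-missing rows mirror the actual data.

```python
from collections import Counter

NA = None  # stand-in for a missing value

# Assumption: 27 distinct placeholder records plus three all-missing rows,
# mirroring the tail of the sample rows in this report.
rows = [(i, f"name-{i}", f"address-{i}", "industry", "class") for i in range(1, 28)]
rows += [(NA, NA, NA, NA, NA)] * 3

row_counts = Counter(rows)
duplicated = {row: count for row, count in row_counts.items() if count > 1}

distinct_duplicates = len(duplicated)        # distinct rows that repeat
occurrences = list(duplicated.values())      # how often each one occurs
duplicate_pct = 100 * distinct_duplicates / len(rows)

print(distinct_duplicates, occurrences, f"{duplicate_pct:.1f}%")
```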