Overview

Dataset statistics

Number of variables: 3
Number of observations: 37
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.0 KiB
Average record size in memory: 28.6 B

Variable types

Text: 2
Categorical: 1

Dataset

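Description: Public disclosure of workplaces that violated the obligation to report industrial accidents two or more times between 2019-01-01 and 2021-12-31. Fields: workplace name (사업장명), workplace location (사업장 소재지), and number of violations (산재미보고 적발건수).
URL: https://www.data.go.kr/data/15090007/fileData.do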

Alerts

Imbalance: 산재미보고 적발건수 is highly imbalanced (57.8%)
Unique: 사업장명 has unique values
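
The 57.8% in the imbalance alert can be reproduced from the class counts reported for 산재미보고 적발건수 (32, 4, and 1 out of 37 rows). The following is a minimal sketch, assuming the score is one minus the Shannon entropy of the class distribution divided by its maximum, log2 of the number of classes. The function name `imbalance_score` is hypothetical, and the profiler's own implementation may differ in detail.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy (in bits) of the counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    The formula is an assumption that matches the value shown in the alert.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # a single class has no balance to measure
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(n_classes)

# Class counts from the Common Values table: value 2 -> 32, 3 -> 4, 4 -> 1.
score = imbalance_score([32, 4, 1])
print(f"{score:.1%}")
```

Under this definition a column with equal class counts scores 0, so the alert reflects that 32 of the 37 workplaces share the value 2.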

Reproduction

Analysis started: 2023-12-12 21:10:06.751317
Analysis finished: 2023-12-12 21:10:07.098330
Duration: 0.35 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
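
The reported duration follows directly from the two timestamps. A small sketch using only the standard library; the format string is an assumption about how the timestamps are laid out in this report.

```python
from datetime import datetime

# Timestamp layout assumed from the "Analysis started/finished" lines.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-12 21:10:06.751317", FMT)
finished = datetime.strptime("2023-12-12 21:10:07.098330", FMT)

# Elapsed wall-clock time in seconds, rounded the way the report displays it.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```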

Variables

사업장명
Text

UNIQUE 

Distinct: 37
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 428.0 B

Length

Max length: 35
Median length: 19
Mean length: 11.027027
Min length: 4
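
The mean length can be cross-checked against the total character count reported for this variable (408) and the number of observations (37). The sketch below also applies a simple consistency bound to the median; the variable names are illustrative, and the bound assumes the usual definition of the median as the middle sorted value.

```python
n = 37                   # number of observations
total_characters = 408   # "Total characters" reported for 사업장명
min_length, median_length = 4, 19

# Mean length is total characters divided by the number of values.
mean_length = total_characters / n
print(round(mean_length, 6))

# With an odd n, (n + 1) // 2 values are at least the median and the
# remaining values are at least the minimum, which bounds the total from below.
at_least_median = (n + 1) // 2
lower_bound_total = (at_least_median * median_length
                     + (n - at_least_median) * min_length)
print(lower_bound_total, lower_bound_total <= total_characters)
```

The mean agrees with the report, but the bound (433) exceeds the reported total of 408, so the median length of 19 is not consistent with the other figures under this definition and is best treated with caution.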

Characters and Unicode

Total characters: 408
Distinct characters: 165
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
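
As an illustration of these character properties, the sketch below counts Unicode general categories for the five sample values of 사업장명 using Python's standard `unicodedata` module. This is an assumption-laden stand-in: the profiler may rely on a different library, and scripts and blocks are not available from `unicodedata`.

```python
import unicodedata
from collections import Counter

# The five sample values shown for 사업장명 in this report.
samples = [
    "롯데네슬레코리아주식회사",
    "(주)코멧",
    "대창경금속(주)",
    "도레이첨단소재(주)3공장",
    "두산에너빌리티(주)",
]

def category_counts(values):
    """Count Unicode general categories (e.g. 'Lo', 'Ps') over all characters."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)

counts = category_counts(samples)
for code, count in counts.most_common():
    print(code, count)
```

The two-letter codes map to the names used in the tables: `Lo` is Other Letter (Hangul syllables fall here), `Ps` is Open Punctuation, `Pe` is Close Punctuation, `Nd` is Decimal Number, and `Zs` is Space Separator.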

Unique

Unique: 37
Unique (%): 100.0%

Sample

1st row: 롯데네슬레코리아주식회사
2nd row: (주)코멧
3rd row: 대창경금속(주)
4th row: 도레이첨단소재(주)3공장
5th row: 두산에너빌리티(주)
Value | Count | Frequency (%)
롯데네슬레코리아주식회사 | 1 | 1.9%
개발사업 | 1 | 1.9%
네오트랜스주식회사(용인지점 | 1 | 1.9%
대신정기화물 | 1 | 1.9%
동화금속 | 1 | 1.9%
르노코리아자동차김해정비사업소주식회사 | 1 | 1.9%
부산신항다목적터미널(주 | 1 | 1.9%
사상구청(고용산재대표관리번호 | 1 | 1.9%
서주제과(주 | 1 | 1.9%
서한산업개발주식회사 | 1 | 1.9%
Other values (43) | 43 | 81.1%

Most occurring characters

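Value | Count | Frequency (%)
(character lost) | 31 | 7.6%
) | 25 | 6.1%
( | 25 | 6.1%
(space) | 17 | 4.2%
(character lost) | 12 | 2.9%
(character lost) | 8 | 2.0%
(character lost) | 7 | 1.7%
(character lost) | 7 | 1.7%
(character lost) | 7 | 1.7%
(character lost) | 6 | 1.5%
Other values (155) | 263 | 64.5%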

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 329 | 80.6%
Close Punctuation | 25 | 6.1%
Open Punctuation | 25 | 6.1%
Space Separator | 17 | 4.2%
Lowercase Letter | 6 | 1.5%
Other Symbol | 4 | 1.0%
Decimal Number | 2 | 0.5%

Most frequent character per category

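Other Letter
Value | Count | Frequency (%)
(character lost) | 31 | 9.4%
(character lost) | 12 | 3.6%
(character lost) | 8 | 2.4%
(character lost) | 7 | 2.1%
(character lost) | 7 | 2.1%
(character lost) | 7 | 2.1%
(character lost) | 6 | 1.8%
(character lost) | 6 | 1.8%
(character lost) | 6 | 1.8%
(character lost) | 6 | 1.8%
Other values (144) | 233 | 70.8%

Lowercase Letter
Value | Count | Frequency (%)
f | 2 | 33.3%
e | 1 | 16.7%
c | 1 | 16.7%
i | 1 | 16.7%
o | 1 | 16.7%

Decimal Number
Value | Count | Frequency (%)
2 | 1 | 50.0%
3 | 1 | 50.0%

Close Punctuation
Value | Count | Frequency (%)
) | 25 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 25 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 17 | 100.0%

Other Symbol
Value | Count | Frequency (%)
(character lost) | 4 | 100.0%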

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 333 | 81.6%
Common | 69 | 16.9%
Latin | 6 | 1.5%

Most frequent character per script

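Hangul
Value | Count | Frequency (%)
(character lost) | 31 | 9.3%
(character lost) | 12 | 3.6%
(character lost) | 8 | 2.4%
(character lost) | 7 | 2.1%
(character lost) | 7 | 2.1%
(character lost) | 7 | 2.1%
(character lost) | 6 | 1.8%
(character lost) | 6 | 1.8%
(character lost) | 6 | 1.8%
(character lost) | 6 | 1.8%
Other values (145) | 237 | 71.2%

Common
Value | Count | Frequency (%)
) | 25 | 36.2%
( | 25 | 36.2%
(space) | 17 | 24.6%
2 | 1 | 1.4%
3 | 1 | 1.4%

Latin
Value | Count | Frequency (%)
f | 2 | 33.3%
e | 1 | 16.7%
c | 1 | 16.7%
i | 1 | 16.7%
o | 1 | 16.7%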

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 329 | 80.6%
ASCII | 75 | 18.4%
None | 4 | 1.0%

Most frequent character per block

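Hangul
Value | Count | Frequency (%)
(character lost) | 31 | 9.4%
(character lost) | 12 | 3.6%
(character lost) | 8 | 2.4%
(character lost) | 7 | 2.1%
(character lost) | 7 | 2.1%
(character lost) | 7 | 2.1%
(character lost) | 6 | 1.8%
(character lost) | 6 | 1.8%
(character lost) | 6 | 1.8%
(character lost) | 6 | 1.8%
Other values (144) | 233 | 70.8%

ASCII
Value | Count | Frequency (%)
) | 25 | 33.3%
( | 25 | 33.3%
(space) | 17 | 22.7%
f | 2 | 2.7%
2 | 1 | 1.3%
e | 1 | 1.3%
c | 1 | 1.3%
i | 1 | 1.3%
o | 1 | 1.3%
3 | 1 | 1.3%

None
Value | Count | Frequency (%)
(character lost) | 4 | 100.0%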

사업장 소재지
Text

Distinct: 36
Distinct (%): 97.3%
Missing: 0
Missing (%): 0.0%
Memory size: 428.0 B

Length

Max length: 35
Median length: 28
Mean length: 23.378378
Min length: 16

Characters and Unicode

Total characters: 865
Distinct characters: 151
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 35
Unique (%): 94.6%
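
The report separates distinct values (36) from unique values (35). Reading "distinct" as the number of different values and "unique" as the number of values that occur exactly once, 37 rows with 36 distinct and 35 unique values imply that one address appears twice. The sketch below illustrates that reading on hypothetical stand-in data; the function name and the placeholder addresses are not taken from the dataset.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values occurring exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Hypothetical stand-in: 35 addresses that occur once plus one that occurs twice.
addresses = [f"address-{i}" for i in range(35)] + ["shared-address"] * 2

distinct, unique = distinct_and_unique(addresses)
print(len(addresses), distinct, unique)
print(f"{distinct / len(addresses):.1%}", f"{unique / len(addresses):.1%}")
```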

Sample

1st row: 충북 청주시 흥덕구 백봉로72번길 21(송정동)
2nd row: 경기 파주시 문산읍 돈유로 98
3rd row: 경기 김포시 통진읍 애기봉로 650
4th row: 경북 구미시 3공단2로 300(임수동)
5th row: 경남 창원시 성산구 두산볼보로 22 (귀곡동)
Value | Count | Frequency (%)
경남 | 7 | 3.8%
경기 | 5 | 2.7%
부산 | 4 | 2.2%
전남 | 4 | 2.2%
김해시 | 3 | 1.6%
충북 | 3 | 1.6%
강원 | 3 | 1.6%
영암군 | 2 | 1.1%
창원시 | 2 | 1.1%
충주시 | 2 | 1.1%
Other values (139) | 149 | 81.0%
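
The values in this table are individual words of the addresses rather than whole addresses, and the percentages are relative to the total number of words. The sketch below reproduces the percentages from the listed counts and shows one way such counts could be obtained; splitting on whitespace is an assumption about the profiler's tokenization, and only the five sample addresses from this report are used for the demonstration.

```python
from collections import Counter

# Word counts copied from the table; "Other values (139)" accounts for 149 words.
top_words = {"경남": 7, "경기": 5, "부산": 4, "전남": 4, "김해시": 3,
             "충북": 3, "강원": 3, "영암군": 2, "창원시": 2, "충주시": 2}
other_count = 149

total_words = sum(top_words.values()) + other_count
percentages = {w: round(100 * c / total_words, 1) for w, c in top_words.items()}
print(total_words, percentages["경남"], round(100 * other_count / total_words, 1))

# Assumed tokenization: split each address on whitespace and count the pieces.
sample_addresses = [
    "충북 청주시 흥덕구 백봉로72번길 21(송정동)",
    "경기 파주시 문산읍 돈유로 98",
    "경기 김포시 통진읍 애기봉로 650",
    "경북 구미시 3공단2로 300(임수동)",
    "경남 창원시 성산구 두산볼보로 22 (귀곡동)",
]
word_counts = Counter(word for address in sample_addresses for word in address.split())
print(word_counts.most_common(3))
```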

Most occurring characters

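Value | Count | Frequency (%)
(space) | 150 | 17.3%
2 | 33 | 3.8%
(character lost) | 28 | 3.2%
(character lost) | 26 | 3.0%
1 | 26 | 3.0%
(character lost) | 26 | 3.0%
( | 24 | 2.8%
) | 24 | 2.8%
(character lost) | 22 | 2.5%
3 | 18 | 2.1%
Other values (141) | 488 | 56.4%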

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 506 | 58.5%
Space Separator | 150 | 17.3%
Decimal Number | 148 | 17.1%
Open Punctuation | 24 | 2.8%
Close Punctuation | 24 | 2.8%
Dash Punctuation | 7 | 0.8%
Uppercase Letter | 5 | 0.6%
Other Punctuation | 1 | 0.1%

Most frequent character per category

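Other Letter
Value | Count | Frequency (%)
(character lost) | 28 | 5.5%
(character lost) | 26 | 5.1%
(character lost) | 26 | 5.1%
(character lost) | 22 | 4.3%
(character lost) | 18 | 3.6%
(character lost) | 17 | 3.4%
(character lost) | 15 | 3.0%
(character lost) | 14 | 2.8%
(character lost) | 12 | 2.4%
(character lost) | 10 | 2.0%
Other values (121) | 318 | 62.8%

Decimal Number
Value | Count | Frequency (%)
2 | 33 | 22.3%
1 | 26 | 17.6%
3 | 18 | 12.2%
5 | 16 | 10.8%
6 | 11 | 7.4%
0 | 10 | 6.8%
9 | 10 | 6.8%
4 | 9 | 6.1%
7 | 8 | 5.4%
8 | 7 | 4.7%

Uppercase Letter
Value | Count | Frequency (%)
G | 1 | 20.0%
S | 1 | 20.0%
A | 1 | 20.0%
B | 1 | 20.0%
L | 1 | 20.0%

Space Separator
Value | Count | Frequency (%)
(space) | 150 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 24 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 24 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 7 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 1 | 100.0%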

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 506 | 58.5%
Common | 354 | 40.9%
Latin | 5 | 0.6%

Most frequent character per script

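Hangul
Value | Count | Frequency (%)
(character lost) | 28 | 5.5%
(character lost) | 26 | 5.1%
(character lost) | 26 | 5.1%
(character lost) | 22 | 4.3%
(character lost) | 18 | 3.6%
(character lost) | 17 | 3.4%
(character lost) | 15 | 3.0%
(character lost) | 14 | 2.8%
(character lost) | 12 | 2.4%
(character lost) | 10 | 2.0%
Other values (121) | 318 | 62.8%

Common
Value | Count | Frequency (%)
(space) | 150 | 42.4%
2 | 33 | 9.3%
1 | 26 | 7.3%
( | 24 | 6.8%
) | 24 | 6.8%
3 | 18 | 5.1%
5 | 16 | 4.5%
6 | 11 | 3.1%
0 | 10 | 2.8%
9 | 10 | 2.8%
Other values (5) | 32 | 9.0%

Latin
Value | Count | Frequency (%)
G | 1 | 20.0%
S | 1 | 20.0%
A | 1 | 20.0%
B | 1 | 20.0%
L | 1 | 20.0%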

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 506 | 58.5%
ASCII | 359 | 41.5%

Most frequent character per block

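ASCII
Value | Count | Frequency (%)
(space) | 150 | 41.8%
2 | 33 | 9.2%
1 | 26 | 7.2%
( | 24 | 6.7%
) | 24 | 6.7%
3 | 18 | 5.0%
5 | 16 | 4.5%
6 | 11 | 3.1%
0 | 10 | 2.8%
9 | 10 | 2.8%
Other values (10) | 37 | 10.3%

Hangul
Value | Count | Frequency (%)
(character lost) | 28 | 5.5%
(character lost) | 26 | 5.1%
(character lost) | 26 | 5.1%
(character lost) | 22 | 4.3%
(character lost) | 18 | 3.6%
(character lost) | 17 | 3.4%
(character lost) | 15 | 3.0%
(character lost) | 14 | 2.8%
(character lost) | 12 | 2.4%
(character lost) | 10 | 2.0%
Other values (121) | 318 | 62.8%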

산재미보고 적발건수
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): 8.1%
Missing: 0
Missing (%): 0.0%
Memory size: 428.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 2.7%

Sample

1st row: 4
2nd row: 3
3rd row: 3
4th row: 3
5th row: 3

Common Values

Value | Count | Frequency (%)
2 | 32 | 86.5%
3 | 4 | 10.8%
4 | 1 | 2.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values 2 (32 rows, 86.5%), 3 (4 rows, 10.8%), and 4 (1 row, 2.7%)]

Correlations

Variable | 사업장명 | 사업장 소재지 | 산재미보고 적발건수
사업장명 | 1.000 | 1.000 | 1.000
사업장 소재지 | 1.000 | 1.000 | 1.000
산재미보고 적발건수 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Row | 사업장명 | 사업장 소재지 | 산재미보고 적발건수
0 | 롯데네슬레코리아주식회사 | 충북 청주시 흥덕구 백봉로72번길 21(송정동) | 4
1 | (주)코멧 | 경기 파주시 문산읍 돈유로 98 | 3
2 | 대창경금속(주) | 경기 김포시 통진읍 애기봉로 650 | 3
3 | 도레이첨단소재(주)3공장 | 경북 구미시 3공단2로 300(임수동) | 3
4 | 두산에너빌리티(주) | 경남 창원시 성산구 두산볼보로 22 (귀곡동) | 3
5 | (주)거산기계 | 경남 김해시 진영읍 본산로 277 (거산기계(주)) | 2
6 | (주)경동 | 강원 삼척시 도계읍 도상로 512 | 2
7 | (주)경동월드와이드 | 경남 양산시 산막공단북4길 39(산막동) | 2
8 | ㈜다원디자인 에스에스지닷컴 office 이전에 따른 인테리어공사 | 서울 종로구 우정국로 26 (공평동) | 2
9 | (주)몽돌구미지점 | 경북 구미시 수출대로 225(공단동 188-7) | 2

Last rows

Row | 사업장명 | 사업장 소재지 | 산재미보고 적발건수
27 | 여흥건설㈜ 의정부 중앙생활권2구역 주택재개발 정비사업 | 경기 의정부시 경의로132번길 25 | 2
28 | 주식회사미래환경 | 경남 창원시 의창구 명서로89번길 22-41(명서동) | 2
29 | 지에스칼텍스(주) | 전남 여수시 월내동 GS칼텍스1056번지 | 2
30 | 진산선무(주) | 울산 남구 장생포고래로 305(매암동) | 2
31 | 창녕군청 | 경남 창녕군 창녕읍 군청길1, A동 창녕군청 행정과 | 2
32 | 충주치매전문요양원 | 충북 충주시 만리산15길 8 (교현동) 3층 | 2
33 | 케이피건설㈜ 용현자이크레스트(인천) 건축골조 | 인천 남동구 인주대로623번길 32 | 2
34 | 탄용환경개발(주) | 충북 충주시 대소원면 창현로 911(탄용환경) | 2
35 | 태삼건설주식회사 | 충남 보령시 오천면 오천중앙로 266 | 2
36 | 태현개발주식회사 | 전북 부안군 봉덕리 157-2 본사 자재창고 | 2