"������������������������������������_20210831.xlsx"의 파일명이 "특수건강진단신청처리내역_20210831.xlsx"으로 변경 됨.

Overview

Dataset statistics

Number of variables: 35
Number of observations: 10000
Missing cells: 204040
Missing cells (%): 58.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.9 MiB
Average record size in memory: 300.0 B
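
The missing-cell percentage follows from the counts in this table. As a quick arithmetic check (the variable names are illustrative and not part of the report):

```python
# Values transcribed from the "Dataset statistics" table.
n_variables = 35
n_observations = 10_000
missing_cells = 204_040

total_cells = n_variables * n_observations       # 350,000 cells in total
missing_pct = 100 * missing_cells / total_cells  # share of cells that are missing

print(f"{missing_pct:.1f}%")  # prints "58.3%"
```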

Variable types

Numeric: 5
Categorical: 16
Boolean: 1
Unsupported: 13

Dataset

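Description: Records of special health examination support applications and their processing. Fields such as examination institution name, responsible branch office, number of workers, and hazardous substances are available for use.
Author: 한국산업안전보건공단 (Korea Occupational Safety and Health Agency, KOSHA)
URL: https://www.data.go.kr/data/15065992/fileData.do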

Alerts

사업년도 has constant value "2013" Constant
번호 has constant value "0" Constant
개인정보 활용( Y,N ) has constant value "True" Constant
사업장_주생산품 has a high cardinality: 687 distinct values High cardinality
유해물질코드 has a high cardinality: 3722 distinct values High cardinality
인자명 has a high cardinality: 1683 distinct values High cardinality
검진기관명 has a high cardinality: 743 distinct values High cardinality
SAM파일 has a high cardinality: 1678 distinct values High cardinality
발주처 has a high cardinality: 570 distinct values High cardinality
공사금액 has a high cardinality: 888 distinct values High cardinality
관할 노동관서 has 397 (4.0%) missing values Missing
사업장_업종코드 has 9893 (98.9%) missing values Missing
사업장_주생산품 has 7940 (79.4%) missing values Missing
검진신청근로자수 has 3950 (39.5%) missing values Missing
실근로자수 has 9607 (96.1%) missing values Missing
유해물질코드 has 1670 (16.7%) missing values Missing
인자명 has 1674 (16.7%) missing values Missing
검진기관명 has 3965 (39.6%) missing values Missing
SAM파일 has 8322 (83.2%) missing values Missing
비고 has 10000 (100.0%) missing values Missing
발주처 has 8322 (83.2%) missing values Missing
공사금액 has 8322 (83.2%) missing values Missing
공사시작일자 has 10000 (100.0%) missing values Missing
공사종료일자 has 10000 (100.0%) missing values Missing
비대상선정사유 has 9973 (99.7%) missing values Missing
등록일자 has 10000 (100.0%) missing values Missing
수정일자 has 10000 (100.0%) missing values Missing
기업정보이용동의(동의:Y,비동의:N) has 10000 (100.0%) missing values Missing
공단에서 검진기관 지정에 동의(동의:Y,비동의:N) has 10000 (100.0%) missing values Missing
측정예정일 has 10000 (100.0%) missing values Missing
수시(임시)접수여부 has 10000 (100.0%) missing values Missing
검진측정기관코드 has 10000 (100.0%) missing values Missing
시도명 has 10000 (100.0%) missing values Missing
시군구명 has 10000 (100.0%) missing values Missing
지원일자 has 10000 (100.0%) missing values Missing
검진신청근로자수 is highly skewed (γ1 = 77.78174593) Skewed
df_index has unique values Unique
비고 is an unsupported type, check if it needs cleaning or further analysis Unsupported
공사시작일자 is an unsupported type, check if it needs cleaning or further analysis Unsupported
공사종료일자 is an unsupported type, check if it needs cleaning or further analysis Unsupported
등록일자 is an unsupported type, check if it needs cleaning or further analysis Unsupported
수정일자 is an unsupported type, check if it needs cleaning or further analysis Unsupported
기업정보이용동의(동의:Y,비동의:N) is an unsupported type, check if it needs cleaning or further analysis Unsupported
공단에서 검진기관 지정에 동의(동의:Y,비동의:N) is an unsupported type, check if it needs cleaning or further analysis Unsupported
측정예정일 is an unsupported type, check if it needs cleaning or further analysis Unsupported
수시(임시)접수여부 is an unsupported type, check if it needs cleaning or further analysis Unsupported
검진측정기관코드 is an unsupported type, check if it needs cleaning or further analysis Unsupported
시도명 is an unsupported type, check if it needs cleaning or further analysis Unsupported
시군구명 is an unsupported type, check if it needs cleaning or further analysis Unsupported
지원일자 is an unsupported type, check if it needs cleaning or further analysis Unsupported
사업장_근로자수 has 172 (1.7%) zeros Zeros
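
Many of the alerts above concern columns that are entirely or almost entirely empty. A minimal sketch for shortlisting such columns from the reported counts; the helper name and the 95% threshold are assumptions chosen for illustration, not something the report prescribes:

```python
# Missing-value counts transcribed from the alerts above (out of 10,000 rows).
N_ROWS = 10_000
missing_counts = {
    "관할 노동관서": 397,
    "사업장_업종코드": 9893,
    "사업장_주생산품": 7940,
    "검진신청근로자수": 3950,
    "실근로자수": 9607,
    "유해물질코드": 1670,
    "인자명": 1674,
    "검진기관명": 3965,
    "SAM파일": 8322,
    "비고": 10000,
    "발주처": 8322,
    "공사금액": 8322,
    "공사시작일자": 10000,
    "공사종료일자": 10000,
    "비대상선정사유": 9973,
    "등록일자": 10000,
    "수정일자": 10000,
    "기업정보이용동의(동의:Y,비동의:N)": 10000,
    "공단에서 검진기관 지정에 동의(동의:Y,비동의:N)": 10000,
    "측정예정일": 10000,
    "수시(임시)접수여부": 10000,
    "검진측정기관코드": 10000,
    "시도명": 10000,
    "시군구명": 10000,
    "지원일자": 10000,
}


def columns_missing_at_least(counts, n_rows, threshold):
    """Return column names whose missing fraction is >= threshold (0..1)."""
    return [name for name, n in counts.items() if n / n_rows >= threshold]


empty_columns = columns_missing_at_least(missing_counts, N_ROWS, 1.0)
sparse_columns = columns_missing_at_least(missing_counts, N_ROWS, 0.95)

print(len(empty_columns))   # 13, the same columns reported as "Unsupported"
print(len(sparse_columns))  # 16
```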

Reproduction

Analysis started: 2022-08-13 09:09:02.415889
Analysis finished: 2022-08-13 09:09:03.335620
Duration: 0.92 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
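
A report of this kind could be regenerated roughly as follows. This is a sketch under assumptions: the report does not say how the spreadsheet was loaded or sampled, so `pandas.read_excel`, the random sample of 10,000 rows, the seed, the title, and the function name `build_report` are all hypothetical. The `dataset` keys mirror the Description, Author, and URL fields shown in the Dataset section.

```python
def build_report(xlsx_path, out_path="report.html", n_rows=10_000, seed=0):
    """Profile a random sample of the spreadsheet and write an HTML report.

    Assumptions (not stated in the report): the data is loaded with
    pandas.read_excel and sampled down to n_rows rows; seed is arbitrary.
    """
    # Imported lazily so that defining this function does not require the
    # third-party packages to be installed.
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_excel(xlsx_path)
    sample = df.sample(n=min(n_rows, len(df)), random_state=seed)

    profile = ProfileReport(
        sample,
        title="특수건강진단신청처리내역",  # hypothetical title
        dataset={
            "description": "Special health examination support application records",
            "author": "한국산업안전보건공단",
            "url": "https://www.data.go.kr/data/15065992/fileData.do",
        },
    )
    profile.to_file(out_path)
    return profile
```

Calling `build_report("특수건강진단신청처리내역_20210831.xlsx")` would then write `report.html` to the working directory, provided pandas, an Excel reader such as openpyxl, and pandas-profiling are installed.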

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct10000
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean12941.7994
Minimum10
Maximum25902
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size78.2 KiB

Quantile statistics

Minimum10
5-th percentile1247.9
Q16469.75
median12910.5
Q319382.25
95-th percentile24649.2
Maximum25902
Range25892
Interquartile range (IQR)12912.5

Descriptive statistics

Standard deviation7479.641927
Coefficient of variation (CV)0.5779445111
Kurtosis-1.194084626
Mean12941.7994
Median Absolute Deviation (MAD)6450
Skewness-0.001940253291
Sum129417994
Variance55945043.35
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
177271
 
< 0.1%
1691
 
< 0.1%
48461
 
< 0.1%
69091
 
< 0.1%
151961
 
< 0.1%
147141
 
< 0.1%
154381
 
< 0.1%
90601
 
< 0.1%
243251
 
< 0.1%
20131
 
< 0.1%
Other values (9990)9990
99.9%
ValueCountFrequency (%)
101
< 0.1%
121
< 0.1%
131
< 0.1%
141
< 0.1%
151
< 0.1%
161
< 0.1%
231
< 0.1%
251
< 0.1%
261
< 0.1%
331
< 0.1%
ValueCountFrequency (%)
259021
< 0.1%
259011
< 0.1%
259001
< 0.1%
258961
< 0.1%
258951
< 0.1%
258941
< 0.1%
258921
< 0.1%
258911
< 0.1%
258901
< 0.1%
258891
< 0.1%

사업년도
Categorical

CONSTANT
REJECTED

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size78.2 KiB
2013
10000 

Length

Max length4
Median length4
Mean length4
Min length4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row2013
2nd row2013
3rd row2013
4th row2013
5th row2013

Common Values

ValueCountFrequency (%)
201310000
100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
201310000
100.0%

번호
Categorical

CONSTANT
REJECTED

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size78.2 KiB
0
10000 

Length

Max length1
Median length1
Mean length1
Min length1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
010000
100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
010000
100.0%

건설일용직지원
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size78.2 KiB
1
8322 
0
1678 

Length

Max length1
Median length1
Mean length1
Min length1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row0
2nd row1
3rd row1
4th row1
5th row1

Common Values

ValueCountFrequency (%)
18322
83.2%
01678
 
16.8%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
18322
83.2%
01678
 
16.8%

접수일
Real number (ℝ≥0)

Distinct273
Distinct (%)2.7%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean20130477.47
Minimum20130121
Maximum20131126
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size88.0 KiB

Quantile statistics

Minimum20130121
5-th percentile20130123
Q120130218
median20130411
Q320130710
95-th percentile20131028
Maximum20131126
Range1005
Interquartile range (IQR)492

Descriptive statistics

Standard deviation297.9922406
Coefficient of variation (CV): 1.480303888 × 10^-5
Kurtosis-0.8373279829
Mean20130477.47
Median Absolute Deviation (MAD)207
Skewness0.5883777731
Sum: 2.013047747 × 10^11
Variance88799.37547
MonotonicityNot monotonic
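
The statistics above treat 접수일 as a real number, but the values look like dates encoded as YYYYMMDD integers (an interpretation, consistent with 사업년도 being 2013). Under that reading, quantities such as the mean, the standard deviation, and the range of 1005 are not calendar quantities. A minimal sketch of converting the values before analysis, with `parse_yyyymmdd` as a hypothetical helper:

```python
from datetime import datetime


def parse_yyyymmdd(value):
    """Convert an integer such as 20130121 into a datetime.date."""
    return datetime.strptime(str(value), "%Y%m%d").date()


first = parse_yyyymmdd(20130121)  # reported minimum
last = parse_yyyymmdd(20131126)   # reported maximum

# Calendar span between the two dates, versus the integer "Range" of 1005.
print((last - first).days)  # 309
```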
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
20130122263
 
2.6%
20130204194
 
1.9%
20130222186
 
1.9%
20130121180
 
1.8%
20130225180
 
1.8%
20130123171
 
1.7%
20130130159
 
1.6%
20130213158
 
1.6%
20130129157
 
1.6%
20130226146
 
1.5%
Other values (263)8206
82.1%
ValueCountFrequency (%)
20130121180
1.8%
20130122263
2.6%
20130123171
1.7%
2013012479
 
0.8%
20130125125
1.2%
201301266
 
0.1%
20130128134
1.3%
20130129157
1.6%
20130130159
1.6%
2013013178
 
0.8%
ValueCountFrequency (%)
201311261
 
< 0.1%
2013112521
0.2%
201311232
 
< 0.1%
2013112215
0.1%
2013112120
0.2%
2013112030
0.3%
2013111925
0.2%
2013111818
0.2%
201311171
 
< 0.1%
2013111522
0.2%

지원확인 상태
Categorical

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size78.2 KiB
대상
9192 
비대상
 
596
미확정
 
212

Length

Max length3
Median length2
Mean length2.0808
Min length2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row대상
2nd row대상
3rd row대상
4th row대상
5th row대상

Common Values

ValueCountFrequency (%)
대상9192
91.9%
비대상596
 
6.0%
미확정212
 
2.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
대상9192
91.9%
비대상596
 
6.0%
미확정212
 
2.1%

관할 노동관서
Categorical

MISSING

Distinct47
Distinct (%)0.5%
Missing397
Missing (%)4.0%
Memory size78.2 KiB
부천
 
624
여수
 
580
창원
 
546
안산
 
522
양산
 
496
Other values (42)
6835 

Length

Max length4
Median length2
Mean length2.524419452
Min length2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row서울동부
2nd row창원
3rd row부천
4th row구미
5th row전주

Common Values

ValueCountFrequency (%)
부천624
 
6.2%
여수580
 
5.8%
창원546
 
5.5%
안산522
 
5.2%
양산496
 
5.0%
경기461
 
4.6%
중부청434
 
4.3%
부산북부423
 
4.2%
인천북부421
 
4.2%
울산421
 
4.2%
Other values (37)4675
46.8%
(Missing)397
 
4.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
부천624
 
6.5%
여수580
 
6.0%
창원546
 
5.7%
안산522
 
5.4%
양산496
 
5.2%
경기461
 
4.8%
중부청434
 
4.5%
부산북부423
 
4.4%
인천북부421
 
4.4%
울산421
 
4.4%
Other values (37)4675
48.7%

관할 지사
Categorical

Distinct26
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size78.2 KiB
인천지역본부
855 
부산지역본부
767 
경남지역본부
765 
경기서부지사
701 
경기중부지사
 
623
Other values (21)
6289 

Length

Max length8
Median length6
Mean length6.059
Min length6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row서울지역본부
2nd row경남지역본부
3rd row경기중부지사
4th row경북지역본부
5th row전북지역본부

Common Values

ValueCountFrequency (%)
인천지역본부855
 
8.6%
부산지역본부767
 
7.7%
경남지역본부765
 
7.6%
경기서부지사701
 
7.0%
경기중부지사623
 
6.2%
경기지역본부600
 
6.0%
전남동부지사580
 
5.8%
경기북부지사499
 
5.0%
경남동부지사497
 
5.0%
울산지역본부421
 
4.2%
Other values (16)3692
36.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
인천지역본부855
 
8.6%
부산지역본부767
 
7.7%
경남지역본부765
 
7.6%
경기서부지사701
 
7.0%
경기중부지사623
 
6.2%
경기지역본부600
 
6.0%
전남동부지사580
 
5.8%
경기북부지사499
 
5.0%
경남동부지사497
 
5.0%
울산지역본부421
 
4.2%
Other values (16)3692
36.9%

사업장_업종코드
Categorical

MISSING

Distinct5
Distinct (%)4.7%
Missing9893
Missing (%)98.9%
Memory size78.2 KiB
합성수지 및 기타 플라스틱 물질 제조업
99 
재생섬유 제조업
 
4
합성섬유 제조업
 
2
김치류 제조업
 
1
의약용 화합물 및 항생물질 제조업
 
1

Length

Max length21
Median length21
Mean length20.11214953
Min length7

Unique

Unique: 2
Unique (%): 1.9%

Sample

1st row합성수지 및 기타 플라스틱 물질 제조업
2nd row합성수지 및 기타 플라스틱 물질 제조업
3rd row합성수지 및 기타 플라스틱 물질 제조업
4th row합성수지 및 기타 플라스틱 물질 제조업
5th row합성수지 및 기타 플라스틱 물질 제조업

Common Values

ValueCountFrequency (%)
합성수지 및 기타 플라스틱 물질 제조업99
 
1.0%
재생섬유 제조업4
 
< 0.1%
합성섬유 제조업2
 
< 0.1%
김치류 제조업1
 
< 0.1%
의약용 화합물 및 항생물질 제조업1
 
< 0.1%
(Missing)9893
98.9%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
제조업107
17.5%
100
16.3%
합성수지99
16.2%
기타99
16.2%
플라스틱99
16.2%
물질99
16.2%
재생섬유4
 
0.7%
합성섬유2
 
0.3%
김치류1
 
0.2%
의약용1
 
0.2%
Other values (2)2
 
0.3%

사업장_주생산품
Categorical

HIGH CARDINALITY
MISSING

Distinct687
Distinct (%)33.3%
Missing7940
Missing (%)79.4%
Memory size78.2 KiB
기타가공품
230 
차량수리및도장업
 
149
기타
 
69
코팅및도금용역
 
66
기타인쇄물
 
25
Other values (682)
1521 

Length

Max length19
Median length14
Mean length6.010194175
Min length1

Unique

Unique: 443
Unique (%): 21.5%

Sample

1st row코팅및도금용역
2nd row목공용띠톱날
3rd row침대
4th row사출성형기
5th row자동차내장재

Common Values

ValueCountFrequency (%)
기타가공품230
 
2.3%
차량수리및도장업149
 
1.5%
기타69
 
0.7%
코팅및도금용역66
 
0.7%
기타인쇄물25
 
0.2%
자동차내장재25
 
0.2%
인쇄회로기판25
 
0.2%
석면20
 
0.2%
금형스프링18
 
0.2%
기타미분류가구17
 
0.2%
Other values (677)1416
 
14.2%
(Missing)7940
79.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
기타가공품230
 
11.0%
차량수리및도장업149
 
7.1%
기타69
 
3.3%
코팅및도금용역66
 
3.2%
기타인쇄물25
 
1.2%
자동차내장재25
 
1.2%
인쇄회로기판25
 
1.2%
석면20
 
1.0%
금형스프링18
 
0.9%
기타미분류가구17
 
0.8%
Other values (688)1451
69.3%

사업장_근로자수
Real number (ℝ≥0)

ZEROS

Distinct228
Distinct (%)2.3%
Missing5
Missing (%)< 0.1%
Infinite0
Infinite (%)0.0%
Mean108.0527264
Minimum0
Maximum30729
Zeros172
Zeros (%)1.7%
Negative0
Negative (%)0.0%
Memory size88.0 KiB

Quantile statistics

Minimum0
5-th percentile1
Q13
median5
Q38
95-th percentile39
Maximum30729
Range30729
Interquartile range (IQR)5

Descriptive statistics

Standard deviation1418.419728
Coefficient of variation (CV)13.12710725
Kurtosis363.0511815
Mean108.0527264
Median Absolute Deviation (MAD)3
Skewness18.40157567
Sum1079987
Variance2011914.524
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
31132
11.3%
11064
10.6%
4988
9.9%
2975
9.8%
5949
9.5%
6865
8.6%
7736
7.4%
8685
6.9%
9600
 
6.0%
10291
 
2.9%
Other values (218)1710
17.1%
ValueCountFrequency (%)
0172
 
1.7%
11064
10.6%
2975
9.8%
31132
11.3%
4988
9.9%
5949
9.5%
6865
8.6%
7736
7.4%
8685
6.9%
9600
6.0%
ValueCountFrequency (%)
307296
0.1%
305336
0.1%
289832
 
< 0.1%
286292
 
< 0.1%
246411
 
< 0.1%
238111
 
< 0.1%
221861
 
< 0.1%
210661
 
< 0.1%
186732
 
< 0.1%
172181
 
< 0.1%

검진신청근로자수
Real number (ℝ≥0)

MISSING
SKEWED

Distinct127
Distinct (%)2.1%
Missing3950
Missing (%)39.5%
Infinite0
Infinite (%)0.0%
Mean174691.857
Minimum0
Maximum1056821268
Zeros8
Zeros (%)0.1%
Negative0
Negative (%)0.0%
Memory size88.0 KiB

Quantile statistics

Minimum0
5-th percentile1
Q12
median4
Q37
95-th percentile22.55
Maximum1056821268
Range1056821268
Interquartile range (IQR)5

Descriptive statistics

Standard deviation13587008.68
Coefficient of variation (CV)77.77700068
Kurtosis6050
Mean174691.857
Median Absolute Deviation (MAD)2
Skewness77.78174593
Sum1056885735
Variance: 1.84606805 × 10^14
MonotonicityNot monotonic
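
The extreme mean, variance, and skewness of 검진신청근로자수 appear to be driven by a single value, the maximum of 1056821268, which is listed once among the largest values in this section. A rough check using only the reported figures (an arithmetic sketch, not a recomputation from the raw data) shows how the mean changes when that one value is set aside:

```python
# Values transcribed from the statistics of this variable.
n_rows = 10_000
n_missing = 3_950
total = 1_056_885_735    # Sum
maximum = 1_056_821_268  # Maximum, assumed to occur exactly once

n_present = n_rows - n_missing                          # 6,050 non-missing values
mean_all = total / n_present                            # reproduces the reported Mean
mean_without_max = (total - maximum) / (n_present - 1)  # mean of the remaining values

print(round(mean_all, 3))          # 174691.857
print(round(mean_without_max, 2))  # 10.66
```

A mean near 10.66 is much closer to the reported median of 4 and 95th percentile of 22.55, which suggests the maximum is a data-entry error worth inspecting before any modelling.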
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
21085
 
10.8%
1827
 
8.3%
3729
 
7.3%
4694
 
6.9%
9577
 
5.8%
5493
 
4.9%
6407
 
4.1%
7301
 
3.0%
8250
 
2.5%
10108
 
1.1%
Other values (117)579
 
5.8%
(Missing)3950
39.5%
ValueCountFrequency (%)
08
 
0.1%
1827
8.3%
21085
10.8%
3729
7.3%
4694
6.9%
5493
4.9%
6407
 
4.1%
7301
 
3.0%
8250
 
2.5%
9577
5.8%
Value        Count  Frequency (%)
1056821268   1      < 0.1%
1000         5      0.1%
871          4      < 0.1%
600          9      0.1%
500          7      0.1%
450          1      < 0.1%
415          2      < 0.1%
405          1      < 0.1%
400          1      < 0.1%
391          2      < 0.1%

실근로자수
Real number (ℝ≥0)

MISSING

Distinct27
Distinct (%)6.9%
Missing9607
Missing (%)96.1%
Infinite0
Infinite (%)0.0%
Mean11.02290076
Minimum0
Maximum999
Zeros1
Zeros (%)< 0.1%
Negative0
Negative (%)0.0%
Memory size88.0 KiB

Quantile statistics

Minimum0
5-th percentile1
Q15
median7
Q39
95-th percentile12
Maximum999
Range999
Interquartile range (IQR)4

Descriptive statistics

Standard deviation53.77335264
Coefficient of variation (CV)4.878330468
Kurtosis297.6175051
Mean11.02290076
Median Absolute Deviation (MAD)2
Skewness16.71423123
Sum4332
Variance2891.573454
MonotonicityNot monotonic
Histogram with fixed size bins (bins=27)
ValueCountFrequency (%)
977
 
0.8%
863
 
0.6%
760
 
0.6%
642
 
0.4%
530
 
0.3%
124
 
0.2%
221
 
0.2%
417
 
0.2%
314
 
0.1%
1012
 
0.1%
Other values (17)33
 
0.3%
(Missing)9607
96.1%
ValueCountFrequency (%)
01
 
< 0.1%
124
 
0.2%
221
 
0.2%
314
 
0.1%
417
 
0.2%
530
 
0.3%
642
0.4%
760
0.6%
863
0.6%
977
0.8%
ValueCountFrequency (%)
9991
< 0.1%
3751
< 0.1%
1051
< 0.1%
631
< 0.1%
521
< 0.1%
351
< 0.1%
321
< 0.1%
261
< 0.1%
221
< 0.1%
202
< 0.1%

실시여부
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size78.2 KiB
미실시
5587 
실시
4413 

Length

Max length3
Median length3
Mean length2.5587
Min length2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row미실시
2nd row실시
3rd row실시
4th row미실시
5th row미실시

Common Values

ValueCountFrequency (%)
미실시5587
55.9%
실시4413
44.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
미실시5587
55.9%
실시4413
44.1%

유해물질코드
Categorical

HIGH CARDINALITY
MISSING

Distinct3722
Distinct (%)44.7%
Missing1670
Missing (%)16.7%
Memory size78.2 KiB
없음
936 
금속가공유
 
520
.
 
289
-
 
258
해당없음
 
182
Other values (3717)
6145 

Length

Max length100
Median length96
Mean length11.17358944
Min length1

Unique

Unique: 3247
Unique (%): 39.0%

Sample

1st row6가크롬,시안화나트륨, 염화수소
2nd row산화철, 크롬, 니켈
3rd row메틸알코올,스틸렌,유리선유분진
4th row용접흄
5th row산화철분진

Common Values

ValueCountFrequency (%)
없음936
 
9.4%
금속가공유520
 
5.2%
.289
 
2.9%
-258
 
2.6%
해당없음182
 
1.8%
석면147
 
1.5%
89
 
0.9%
소음70
 
0.7%
톨루엔58
 
0.6%
유기화합물55
 
0.5%
Other values (3712)5726
57.3%
(Missing)1670
 
16.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
없음971
 
7.2%
금속가공유629
 
4.6%
600
 
4.4%
톨루엔576
 
4.2%
크실렌387
 
2.9%
340
 
2.5%
망간326
 
2.4%
크롬262
 
1.9%
산화철197
 
1.5%
니켈189
 
1.4%
Other values (3162)9089
67.0%
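
The high cardinality of 유해물질코드 comes largely from free-text entries that list several substances separated by commas, mixed with placeholder strings such as "없음", ".", "-", and "해당없음". One possible cleaning step, sketched here with a hypothetical helper `split_agents`, is to split each entry into individual names and drop the placeholders:

```python
# Placeholder strings that stand in for "no value". "없음", ".", "-" and
# "해당없음" come from the frequency tables of this variable; "없슴" (a
# misspelling of "없음") appears in the report's sample rows.
PLACEHOLDERS = {"없음", "없슴", "해당없음", ".", "-", ""}


def split_agents(raw):
    """Split a free-text entry into a list of cleaned substance names."""
    if raw is None:
        return []
    parts = [part.strip() for part in raw.split(",")]
    return [part for part in parts if part not in PLACEHOLDERS]


print(split_agents("6가크롬,시안화나트륨, 염화수소"))
# ['6가크롬', '시안화나트륨', '염화수소']
print(split_agents("없음"))  # []
```

Counting the split names rather than the raw strings would give a per-substance frequency table, at the cost of treating every comma as a separator.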

인자명
Categorical

HIGH CARDINALITY
MISSING

Distinct1683
Distinct (%)20.2%
Missing1674
Missing (%)16.7%
Memory size78.2 KiB
소음
1788 
없음
850 
-
 
259
.
 
216
분진
 
211
Other values (1678)
5002 

Length

Max length197
Median length115
Mean length5.677155897
Min length1

Unique

Unique: 1322
Unique (%): 15.9%

Sample

1st row해당없음
2nd row소음, 석영
3rd row소음
4th row용접
5th row소음

Common Values

ValueCountFrequency (%)
소음1788
17.9%
없음850
 
8.5%
-259
 
2.6%
.216
 
2.2%
분진211
 
2.1%
석면193
 
1.9%
용접흄183
 
1.8%
해당없음176
 
1.8%
소음,분진169
 
1.7%
광물성분진146
 
1.5%
Other values (1673)4135
41.3%
(Missing)1674
16.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
소음2901
26.2%
없음887
 
8.0%
분진557
 
5.0%
용접흄532
 
4.8%
516
 
4.7%
광물성분진271
 
2.4%
석면200
 
1.8%
금속가공유184
 
1.7%
해당없음178
 
1.6%
소음,분진172
 
1.6%
Other values (1421)4670
42.2%

개인정보 활용( Y,N )
Boolean

CONSTANT
REJECTED

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size9.9 KiB
True
10000 
ValueCountFrequency (%)
True10000
100.0%

검진기관명
Categorical

HIGH CARDINALITY
MISSING

Distinct743
Distinct (%)12.3%
Missing3965
Missing (%)39.6%
Memory size78.2 KiB
대한산업보건협회
803 
여수성심병원
 
231
인천산재병원
 
167
진주고려병원
 
140
굿모닝병원
 
133
Other values (738)
4561 

Length

Max length21
Median length19
Mean length7.49991715
Min length1

Unique

Unique: 393
Unique (%): 6.5%

Sample

1st row파티마병원
2nd row제이에스병원
3rd row구미강동병원
4th row대한산업보건협회
5th row대한산업보건협회

Common Values

ValueCountFrequency (%)
대한산업보건협회803
 
8.0%
여수성심병원231
 
2.3%
인천산재병원167
 
1.7%
진주고려병원140
 
1.4%
굿모닝병원133
 
1.3%
경희산업의학센터120
 
1.2%
대한산업보건협회남부산센터118
 
1.2%
보건협회 부산103
 
1.0%
성심병원100
 
1.0%
김병원99
 
1.0%
Other values (733)4021
40.2%
(Missing)3965
39.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
대한산업보건협회1085
 
15.9%
여수성심병원232
 
3.4%
보건협회205
 
3.0%
인천산재병원170
 
2.5%
성심병원152
 
2.2%
진주고려병원140
 
2.1%
굿모닝병원136
 
2.0%
경희산업의학센터120
 
1.8%
대한산업보건협회남부산센터118
 
1.7%
부산107
 
1.6%
Other values (713)4354
63.9%

SAM파일
Categorical

HIGH CARDINALITY
MISSING

Distinct1678
Distinct (%)100.0%
Missing8322
Missing (%)83.2%
Memory size78.2 KiB
FILE_000000000010163
 
1
FILE_000000000018445
 
1
FILE_000000000006554
 
1
FILE_000000000009674
 
1
FILE_000000000012713
 
1
Other values (1673)
1673 

Length

Max length20
Median length20
Mean length20
Min length20

Unique

Unique: 1678
Unique (%): 100.0%

Sample

1st rowFILE_000000000012337
2nd rowFILE_000000000003766
3rd rowFILE_000000000018445
4th rowFILE_000000000006554
5th rowFILE_000000000009674

Common Values

ValueCountFrequency (%)
FILE_0000000000101631
 
< 0.1%
FILE_0000000000184451
 
< 0.1%
FILE_0000000000065541
 
< 0.1%
FILE_0000000000096741
 
< 0.1%
FILE_0000000000127131
 
< 0.1%
FILE_0000000000113721
 
< 0.1%
FILE_0000000000192491
 
< 0.1%
FILE_0000000000088811
 
< 0.1%
FILE_0000000000059941
 
< 0.1%
FILE_0000000000148811
 
< 0.1%
Other values (1668)1668
 
16.7%
(Missing)8322
83.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
file_0000000000101631
 
0.1%
file_0000000000145751
 
0.1%
file_0000000000101721
 
0.1%
file_0000000000203321
 
0.1%
file_0000000000159211
 
0.1%
file_0000000000203011
 
0.1%
file_0000000000067721
 
0.1%
file_0000000000092161
 
0.1%
file_0000000000108441
 
0.1%
file_0000000000180311
 
0.1%
Other values (1668)1668
99.4%

비고
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing10000
Missing (%)100.0%
Memory size88.0 KiB

발주처
Categorical

HIGH CARDINALITY
MISSING

Distinct570
Distinct (%)34.0%
Missing8322
Missing (%)83.2%
Memory size78.2 KiB
삼성물산
 
84
현대건설
 
49
GS칼텍스
 
41
대림산업(주)
 
39
동원개발
 
26
Other values (565)
1439 

Length

Max length23
Median length19
Mean length6.307508939
Min length2

Unique

Unique: 312
Unique (%): 18.6%

Sample

1st row삼성물산(주)
2nd row금호석유화학(주)
3rd row삼성에버랜드
4th rowsk이노베이션
5th rowSK에너지

Common Values

ValueCountFrequency (%)
삼성물산84
 
0.8%
현대건설49
 
0.5%
GS칼텍스41
 
0.4%
대림산업(주)39
 
0.4%
동원개발26
 
0.3%
대림산업19
 
0.2%
SK에너지17
 
0.2%
여천NCC(주)16
 
0.2%
한국도로공사16
 
0.2%
한국토지주택공사16
 
0.2%
Other values (560)1355
 
13.6%
(Missing)8322
83.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
삼성물산84
 
4.7%
현대건설49
 
2.7%
gs칼텍스43
 
2.4%
대림산업(주39
 
2.2%
동원개발26
 
1.5%
sk24
 
1.3%
sk에너지21
 
1.2%
sk이노베이션20
 
1.1%
대림산업19
 
1.1%
한국환경공단19
 
1.1%
Other values (562)1441
80.7%
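
Several 발주처 values seem to name the same organisation in different spellings, for example "삼성물산" and "삼성물산(주)", or "대림산업(주)" and "대림산업". A heuristic normalisation, sketched below, could reduce the cardinality before grouping. The helper `normalize_company` and its rules (drop corporate-form markers, drop whitespace, upper-case Latin letters) are assumptions for illustration and may merge names that should stay distinct.

```python
import re


def normalize_company(name):
    """Return a canonical form of an ordering-party name for grouping."""
    cleaned = re.sub(r"\(주\)|㈜|주식회사", "", name)  # drop corporate-form markers
    cleaned = re.sub(r"\s+", "", cleaned)             # drop whitespace
    return cleaned.upper()                            # unify Latin letter case


print(normalize_company("삼성물산(주)"))  # 삼성물산
print(normalize_company("sk이노베이션"))  # SK이노베이션
print(normalize_company("대림산업(주)") == normalize_company("대림산업"))  # True
```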

공사금액
Categorical

HIGH CARDINALITY
MISSING

Distinct888
Distinct (%)52.9%
Missing8322
Missing (%)83.2%
Memory size78.2 KiB
100000
 
40
1560000000
 
25
10000
 
21
500000
 
21
2000000
 
19
Other values (883)
1552 

Length

Max length10
Median length9
Mean length6.601907032
Min length1

Unique

Unique: 607
Unique (%): 36.2%

Sample

1st row731500
2nd row100000
3rd row1604800
4th row156000000
5th row2000000

Common Values

ValueCountFrequency (%)
10000040
 
0.4%
156000000025
 
0.2%
1000021
 
0.2%
50000021
 
0.2%
200000019
 
0.2%
500000016
 
0.2%
20000016
 
0.2%
100000014
 
0.1%
300000013
 
0.1%
3000012
 
0.1%
Other values (878)1481
 
14.8%
(Missing)8322
83.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
10000040
 
2.4%
156000000025
 
1.5%
1000021
 
1.3%
50000021
 
1.3%
200000019
 
1.1%
500000016
 
1.0%
20000016
 
1.0%
100000014
 
0.8%
300000013
 
0.8%
3000012
 
0.7%
Other values (878)1481
88.3%

공사시작일자
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing10000
Missing (%)100.0%
Memory size88.0 KiB

공사종료일자
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing10000
Missing (%)100.0%
Memory size88.0 KiB

비대상선정사유
Categorical

MISSING

Distinct21
Distinct (%)77.8%
Missing9973
Missing (%)99.7%
Memory size78.2 KiB
실근로자수 유선확인 완료: 4
하반기2차 작측대상 선정시 고용센터 근로자수 확인: 2
처리완료: 2
특수검진 예산소진으로 마감됨: 2
진주사업장 근로자가 특검대상으로 비대상: 1
Other values (16): 16

Length

Max length87
Median length22
Mean length19
Min length4

Unique

Unique: 17
Unique (%): 63.0%

Sample

1st row사업장관리번호 다시입력하여 신청예정
2nd row근로자수가 10인 미만 대상이라서 근로자 수 확인하려고 여러번 전화해도 351-4386은 없는 전화번호이고 10인 확인불가하여 현전산망 10인라 비대상처리함
3rd row근로자수 10인 이상
4th row규정상 10인 미만아라서 고용보험인원 19명으로 초과
5th row실근로자수 유선확인 완료

Common Values

ValueCountFrequency (%)
실근로자수 유선확인 완료4
 
< 0.1%
하반기2차 작측대상 선정시 고용센터 근로자수 확인2
 
< 0.1%
처리완료2
 
< 0.1%
특수검진 예산소진으로 마감됨2
 
< 0.1%
진주사업장 근로자가 특검대상으로 비대상1
 
< 0.1%
근로자수가 10인 미만 대상이라서 근로자 수 확인하려고 여러번 전화해도 351-4386은 없는 전화번호이고 10인 확인불가하여 현전산망 10인라 비대상처리함1
 
< 0.1%
근로자수 10인 이상1
 
< 0.1%
규정상 10인 미만아라서 고용보험인원 19명으로 초과1
 
< 0.1%
특검대상물질아님1
 
< 0.1%
근로자수 0명으로 비대상처리1
 
< 0.1%
Other values (11)11
 
0.1%
(Missing)9973
99.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
10인6
 
5.6%
실근로자수4
 
3.7%
완료4
 
3.7%
근로자수4
 
3.7%
특수검진4
 
3.7%
예산소진으로4
 
3.7%
유선확인4
 
3.7%
대상임2
 
1.9%
실근로자수가2
 
1.9%
참고2
 
1.9%
Other values (59)72
66.7%

등록일자
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing10000
Missing (%)100.0%
Memory size88.0 KiB

수정일자
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing10000
Missing (%)100.0%
Memory size88.0 KiB

기업정보이용동의(동의:Y,비동의:N)
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing10000
Missing (%)100.0%
Memory size88.0 KiB

공단에서 검진기관 지정에 동의(동의:Y,비동의:N)
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing10000
Missing (%)100.0%
Memory size88.0 KiB

측정예정일
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing10000
Missing (%)100.0%
Memory size88.0 KiB

수시(임시)접수여부
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing10000
Missing (%)100.0%
Memory size88.0 KiB

검진측정기관코드
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing10000
Missing (%)100.0%
Memory size88.0 KiB

시도명
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing10000
Missing (%)100.0%
Memory size88.0 KiB

시군구명
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing10000
Missing (%)100.0%
Memory size88.0 KiB

지원일자
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing10000
Missing (%)100.0%
Memory size88.0 KiB

Sample

First rows

df_index사업년도번호건설일용직지원접수일지원확인 상태관할 노동관서관할 지사사업장_업종코드사업장_주생산품사업장_근로자수검진신청근로자수실근로자수실시여부유해물질코드인자명개인정보 활용( Y,N )검진기관명SAM파일비고발주처공사금액공사시작일자공사종료일자비대상선정사유등록일자수정일자기업정보이용동의(동의:Y,비동의:N)공단에서 검진기관 지정에 동의(동의:Y,비동의:N)측정예정일수시(임시)접수여부검진측정기관코드시도명시군구명지원일자
01772720130020130624대상서울동부서울지역본부<NA><NA>19<NA>미실시<NA><NA>Y<NA>FILE_000000000012337<NA>삼성물산(주)731500<NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA>
132420130120130121대상창원경남지역본부<NA>코팅및도금용역3<NA><NA>실시6가크롬,시안화나트륨, 염화수소해당없음Y파티마병원<NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA>
2485320130120130207대상부천경기중부지사<NA><NA>2<NA><NA>실시산화철, 크롬, 니켈소음, 석영Y제이에스병원<NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA>
3197620130120130125대상구미경북지역본부<NA><NA>17<NA>미실시메틸알코올,스틸렌,유리선유분진소음Y구미강동병원<NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA>
4377120130120130204대상전주전북지역본부<NA><NA>45<NA>미실시용접흄용접Y대한산업보건협회<NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA>
5594120130120130214대상대전청대전세종지역본부<NA>목공용띠톱날2<NA><NA>실시산화철분진소음Y대한산업보건협회<NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA>
61464520130120130507대상양산경남동부지사<NA><NA>2<NA><NA>미실시없슴없슴Y<NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA>
7512220130120130208대상포항경북동부지사<NA><NA>7<NA><NA>실시메탄올소음,분진Y<NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA>
8429420130020130205대상여수전남동부지사<NA><NA>9<NA><NA>실시<NA><NA>Y<NA>FILE_000000000003766<NA>금호석유화학(주)100000<NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA>
92273320130120130911대상전주전북지역본부<NA><NA>75<NA>미실시페인트,신너분진Y대한산업보건협회<NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA>

Last rows

df_index사업년도번호건설일용직지원접수일지원확인 상태관할 노동관서관할 지사사업장_업종코드사업장_주생산품사업장_근로자수검진신청근로자수실근로자수실시여부유해물질코드인자명개인정보 활용( Y,N )검진기관명SAM파일비고발주처공사금액공사시작일자공사종료일자비대상선정사유등록일자수정일자기업정보이용동의(동의:Y,비동의:N)공단에서 검진기관 지정에 동의(동의:Y,비동의:N)측정예정일수시(임시)접수여부검진측정기관코드시도명시군구명지원일자
9990163320130120130124대상여수전남동부지사<NA><NA>33<NA>미실시벤젠, 톨루엔, 크실렌, 헥산, 헵탄, 1.3 BD, DMF ,IPA ,에틸렌 글리콜,THF소음,분진Y성심병원<NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA>
99911785920130120130625대상광주청광주지역본부<NA><NA>32<NA>미실시용접흄소음Y(사)대한산업보건협회광주산업보건센타<NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA>
99921484420130020130509대상포항경북동부지사<NA><NA>4037<NA>미실시<NA><NA>Y포항성모병원FILE_000000000010294<NA>예수성심시녀회46000000<NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA>
99931623520130120130604대상포항경북동부지사<NA><NA>8<NA><NA>미실시없음분진Y<NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA>
9994914220130120130228비대상여수전남동부지사<NA><NA>137100<NA>실시메틸알코올, 벤젠등분진, 소음Y<NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA>
99951886020130120130701대상서울동부서울지역본부<NA><NA>45<NA>미실시용접용접Y<NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA>
99961662120130120130610대상부천경기중부지사<NA><NA>5<NA><NA>미실시.광물성분진Y<NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA>
99971410720130120130426대상여수전남동부지사<NA><NA>7<NA><NA>실시벤젠,톨루엔소음,광물성분진Y<NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA>
99981957520130120130713대상진주경남지역본부<NA><NA>6<NA><NA>미실시톨루엔,크실렌,에틸벤젠,메틸이소부틸케톤등활석Y진주고려병원<NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA>
99992360220130120131007대상포항경북동부지사<NA>페인트용희석제및세척제54<NA>미실시혼합유기화합물(폐페인트,폐유기화합물)소음Y경주동대병원<NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA><NA>