Overview

Dataset statistics

Number of variables: 4
Number of observations: 1196
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 39.8 KiB
Average record size in memory: 34.1 B

Variable types

Numeric: 2
Text: 2

Dataset

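Description: Employment insurance industry codes and industry names that apply to workplaces enrolled in employment insurance, provided by the Korea Workers' Compensation and Welfare Service (근로복지공단). Employment insurance business types are classified, in principle, at the most detailed level (세세분류) of the Korean Standard Industrial Classification. When one business owner runs two or more businesses, the business type is determined by, in order of priority, (1) the number of regular workers, (2) total wages, and (3) sales. Data source: Statistics Korea, Korean Standard Industrial Classification, 10th revision.
Author: Korea Workers' Compensation and Welfare Service (근로복지공단)
URL: https://www.data.go.kr/data/15122658/fileData.do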
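
The priority rule in the description (number of regular workers first, then total wages, then sales) can be read as a lexicographic comparison. The sketch below is one interpretation, not the agency's procedure: the `Business` record, its field names, and the example figures are hypothetical, and it assumes a later criterion is consulted only when the earlier ones tie.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Business:
    """Hypothetical record for one business run by a single owner."""
    industry_code: int    # value from the 고용보험 업종코드 column
    regular_workers: int  # criterion 1
    total_wages: int      # criterion 2
    sales: int            # criterion 3


def primary_business(businesses):
    """Return the business ranked highest on (workers, wages, sales)."""
    if not businesses:
        raise ValueError("at least one business is required")
    # Tuples compare element by element, so wages only matter when the
    # worker counts tie, and sales only when wages tie as well.
    return max(
        businesses,
        key=lambda b: (b.regular_workers, b.total_wages, b.sales),
    )


# Made-up figures; only the industry codes appear in the dataset sample.
candidates = [
    Business(industry_code=1110, regular_workers=12, total_wages=300, sales=900),
    Business(industry_code=1121, regular_workers=12, total_wages=450, sales=700),
    Business(industry_code=1122, regular_workers=5, total_wages=800, sales=2000),
]
chosen = primary_business(candidates)
print(chosen.industry_code)
```

Here the first two candidates tie on regular workers, so total wages decide between them; the third has the largest sales but loses on the first criterion.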

Alerts

연번 is highly overall correlated with 고용보험 업종코드 (High correlation)
고용보험 업종코드 is highly overall correlated with 연번 (High correlation)
연번 has unique values (Unique)
고용보험 업종명 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 11:58:46.864225
Analysis finished: 2023-12-12 11:58:48.055493
Duration: 1.19 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 1196
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 598.5
Minimum: 1
Maximum: 1196
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 60.75
Q1: 299.75
Median: 598.5
Q3: 897.25
95-th percentile: 1136.25
Maximum: 1196
Range: 1195
Interquartile range (IQR): 597.5

Descriptive statistics

Standard deviation: 345.39977
Coefficient of variation (CV): 0.57710905
Kurtosis: -1.2
Mean: 598.5
Median Absolute Deviation (MAD): 299
Skewness: 0
Sum: 715806
Variance: 119301
Monotonicity: Strictly increasing
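
Because 연번 is reported as strictly increasing with 1196 distinct values, minimum 1, and maximum 1196, it is consistent with the integers 1 through 1196. Under that assumption, several of the figures above can be re-derived with the standard library. The `percentile` helper is a hypothetical implementation of linear interpolation between order statistics, which is assumed (the report does not say) to be the quantile method in use.

```python
import math
import statistics

# Assumption: 연번 is exactly 1, 2, ..., 1196.
values = list(range(1, 1197))


def percentile(sorted_values, q):
    """Linear interpolation between the two nearest order statistics."""
    position = (len(sorted_values) - 1) * q
    lower = math.floor(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (
        sorted_values[upper] - sorted_values[lower]
    )


total = sum(values)
mean = statistics.fmean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std_dev = math.sqrt(variance)
median = statistics.median(values)
# Median absolute deviation: median distance from the median.
mad = statistics.median(abs(v - median) for v in values)
q1 = percentile(values, 0.25)
q3 = percentile(values, 0.75)

print(total, mean, variance, round(std_dev, 5))
print(q1, median, q3, q3 - q1, mad)
```

For a run of consecutive integers 1..n the sum is n(n+1)/2 and the sample variance is n(n+1)/12, which is why these statistics describe the row numbering rather than anything about the industries themselves.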
[Figure: histogram with fixed size bins (bins=50)]
Common values
Value | Count | Frequency (%)
1 | 1 | 0.1%
805 | 1 | 0.1%
803 | 1 | 0.1%
802 | 1 | 0.1%
801 | 1 | 0.1%
800 | 1 | 0.1%
799 | 1 | 0.1%
798 | 1 | 0.1%
797 | 1 | 0.1%
796 | 1 | 0.1%
Other values (1186) | 1186 | 99.2%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 1 | 0.1%
7 | 1 | 0.1%
8 | 1 | 0.1%
9 | 1 | 0.1%
10 | 1 | 0.1%

Maximum 10 values
Value | Count | Frequency (%)
1196 | 1 | 0.1%
1195 | 1 | 0.1%
1194 | 1 | 0.1%
1193 | 1 | 0.1%
1192 | 1 | 0.1%
1191 | 1 | 0.1%
1190 | 1 | 0.1%
1189 | 1 | 0.1%
1188 | 1 | 0.1%
1187 | 1 | 0.1%

고용보험 업종(대분류) (employment insurance industry, major category)
Text

Distinct: 1150
Distinct (%): 96.2%
Missing: 0
Missing (%): 0.0%
Memory size: 9.5 KiB

Length

Max length: 43
Median length: 34
Mean length: 18.575251
Min length: 11

Characters and Unicode

Total characters: 22216
Distinct characters: 128
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
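
As a small illustration of those character properties, the sketch below counts Unicode general categories for the first sample value of this variable using Python's `unicodedata` module. The helper name `category_counts` is hypothetical. The two-letter codes returned by `unicodedata.category` correspond to the category names used in this report: `Lo` is Other Letter, `Nd` Decimal Number, `Zs` Space Separator, `Po` Other Punctuation, `Ps` and `Pe` Open and Close Punctuation, `Lu` Uppercase Letter, and `Sm` Math Symbol.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count the characters of `text` by Unicode general category."""
    return Counter(unicodedata.category(ch) for ch in text)


# First sample value of the 고용보험 업종(대분류) column.
sample = "A. 농업, 임업 및 어업(01~03)"
counts = category_counts(sample)

for category, count in counts.most_common():
    print(category, count)
```

The tilde is classified as a Math Symbol, which is why a range such as `01~03` contributes to that category in the tables for this variable.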

Unique

Unique: 1146
Unique (%): 95.8%
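
The report lists both a Distinct count (1150) and a Unique count (1146) for this variable. A common reading, assumed here rather than taken from the report, is that "distinct" counts different values while "unique" counts values that occur exactly once. The sketch below shows the difference on a made-up list; `distinct_and_unique` and the labels are hypothetical.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for occurrences in counts.values() if occurrences == 1)
    return distinct, unique


# Made-up labels: "A" and "C" repeat, "B" and "D" occur once.
labels = ["A", "A", "A", "B", "C", "C", "D"]
distinct, unique = distinct_and_unique(labels)
print(distinct, unique)
```

Under that reading, 1150 distinct and 1146 unique values in 1196 rows would mean that 4 labels repeat and together cover the remaining 50 rows.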

Sample

1st row: A. 농업, 임업 및 어업(01~03)
2nd row: A. 농업, 임업 및 어업(01~03)
3rd row: A. 농업, 임업 및 어업(01~03)
4th row: A. 농업, 임업 및 어업(01~03)
5th row: A. 농업, 임업 및 어업(01~03)

Value | Count | Frequency (%)
[lost] | 618 | 14.2%
c | 477 | 10.9%
도매 | 184 | 4.2%
g | 184 | 4.2%
m | 51 | 1.2%
전문 | 51 | 1.2%
과학 | 51 | 1.2%
기술 | 51 | 1.2%
운수 | 48 | 1.1%
h | 48 | 1.1%
Other values (1210) | 2604 | 59.6%

Most occurring characters

Value | Count | Frequency (%)
(space) | 3171 | 14.3%
[lost] | 1323 | 6.0%
) | 1196 | 5.4%
( | 1196 | 5.4%
. | 1196 | 5.4%
~ | 1116 | 5.0%
1 | 1116 | 5.0%
0 | 916 | 4.1%
4 | 681 | 3.1%
[lost] | 618 | 2.8%
Other values (118) | 9687 | 43.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 7550 | 34.0%
Decimal Number | 5313 | 23.9%
Space Separator | 3171 | 14.3%
Other Punctuation | 1478 | 6.7%
Close Punctuation | 1196 | 5.4%
Open Punctuation | 1196 | 5.4%
Uppercase Letter | 1196 | 5.4%
Math Symbol | 1116 | 5.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[lost] | 1323 | 17.5%
[lost] | 618 | 8.2%
[lost] | 486 | 6.4%
[lost] | 479 | 6.3%
[lost] | 368 | 4.9%
[lost] | 277 | 3.7%
[lost] | 228 | 3.0%
[lost] | 225 | 3.0%
[lost] | 203 | 2.7%
[lost] | 187 | 2.5%
Other values (81) | 3156 | 41.8%

Uppercase Letter
Value | Count | Frequency (%)
C | 477 | 39.9%
G | 184 | 15.4%
M | 51 | 4.3%
H | 48 | 4.0%
F | 45 | 3.8%
R | 43 | 3.6%
J | 42 | 3.5%
S | 41 | 3.4%
A | 34 | 2.8%
P | 33 | 2.8%
Other values (11) | 198 | 16.6%

Decimal Number
Value | Count | Frequency (%)
1 | 1116 | 21.0%
0 | 916 | 17.2%
4 | 681 | 12.8%
5 | 520 | 9.8%
9 | 417 | 7.8%
7 | 344 | 6.5%
3 | 342 | 6.4%
8 | 342 | 6.4%
2 | 328 | 6.2%
6 | 307 | 5.8%

Other Punctuation
Value | Count | Frequency (%)
. | 1196 | 80.9%
, | 282 | 19.1%

Space Separator
Value | Count | Frequency (%)
(space) | 3171 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1196 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1196 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 1116 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 13470 | 60.6%
Hangul | 7550 | 34.0%
Latin | 1196 | 5.4%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[lost] | 1323 | 17.5%
[lost] | 618 | 8.2%
[lost] | 486 | 6.4%
[lost] | 479 | 6.3%
[lost] | 368 | 4.9%
[lost] | 277 | 3.7%
[lost] | 228 | 3.0%
[lost] | 225 | 3.0%
[lost] | 203 | 2.7%
[lost] | 187 | 2.5%
Other values (81) | 3156 | 41.8%

Latin
Value | Count | Frequency (%)
C | 477 | 39.9%
G | 184 | 15.4%
M | 51 | 4.3%
H | 48 | 4.0%
F | 45 | 3.8%
R | 43 | 3.6%
J | 42 | 3.5%
S | 41 | 3.4%
A | 34 | 2.8%
P | 33 | 2.8%
Other values (11) | 198 | 16.6%

Common
Value | Count | Frequency (%)
(space) | 3171 | 23.5%
) | 1196 | 8.9%
( | 1196 | 8.9%
. | 1196 | 8.9%
~ | 1116 | 8.3%
1 | 1116 | 8.3%
0 | 916 | 6.8%
4 | 681 | 5.1%
5 | 520 | 3.9%
9 | 417 | 3.1%
Other values (6) | 1945 | 14.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 14666 | 66.0%
Hangul | 7550 | 34.0%

Most frequent character per block

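ASCII
Value | Count | Frequency (%)
(space) | 3171 | 21.6%
) | 1196 | 8.2%
( | 1196 | 8.2%
. | 1196 | 8.2%
~ | 1116 | 7.6%
1 | 1116 | 7.6%
0 | 916 | 6.2%
4 | 681 | 4.6%
5 | 520 | 3.5%
C | 477 | 3.3%
Other values (27) | 3081 | 21.0%

Hangul
Value | Count | Frequency (%)
[lost] | 1323 | 17.5%
[lost] | 618 | 8.2%
[lost] | 486 | 6.4%
[lost] | 479 | 6.3%
[lost] | 368 | 4.9%
[lost] | 277 | 3.7%
[lost] | 228 | 3.0%
[lost] | 225 | 3.0%
[lost] | 203 | 2.7%
[lost] | 187 | 2.5%
Other values (81) | 3156 | 41.8%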

고용보험 업종코드 (employment insurance industry code)
Real number (ℝ)

HIGH CORRELATION

Distinct: 1195
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 44641.28
Minimum: 1110
Maximum: 99009
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.6 KiB

Quantile statistics

Minimum: 1110
5-th percentile: 10402.75
Q1: 24217.5
Median: 45211.5
Q3: 63993
95-th percentile: 91193
Maximum: 99009
Range: 97899
Interquartile range (IQR): 39775.5

Descriptive statistics

Standard deviation: 25743.154
Coefficient of variation (CV): 0.57666702
Kurtosis: -0.81965279
Mean: 44641.28
Median Absolute Deviation (MAD): 19958
Skewness: 0.43364258
Sum: 53390971
Variance: 6.6270998 × 10^8
Monotonicity: Increasing
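
This variable is labelled "Increasing" whereas 연번 is labelled "Strictly increasing". The difference is whether equal neighbouring values are allowed: the code 1212 occurs twice here (1195 distinct values in 1196 rows). The sketch below is a hypothetical classifier, not the profiler's own code; the two labels seen in this report are reused and the other label strings are assumptions. The list of codes is a non-contiguous excerpt of values that appear in this report, not consecutive rows of the dataset.

```python
def monotonicity(values):
    """Classify a sequence by comparing each value with its successor."""
    pairs = list(zip(values, values[1:]))
    if all(a < b for a, b in pairs):
        return "Strictly increasing"
    if all(a <= b for a, b in pairs):
        return "Increasing"  # ties allowed
    if all(a > b for a, b in pairs):
        return "Strictly decreasing"
    if all(a >= b for a, b in pairs):
        return "Decreasing"  # ties allowed
    return "Not monotonic"


serial_numbers = [1, 2, 3, 4, 5]
# Non-contiguous excerpt; the repeated 1212 is the duplicated code.
codes_excerpt = [1110, 1121, 1122, 1123, 1131, 1212, 1212, 50111]

print(monotonicity(serial_numbers))
print(monotonicity(codes_excerpt))
```

In a non-decreasing column, repeated values necessarily sit in adjacent rows, so a single duplicated code is enough to move the label from "Strictly increasing" to "Increasing".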
[Figure: histogram with fixed size bins (bins=50)]
Common values
Value | Count | Frequency (%)
1212 | 2 | 0.2%
1110 | 1 | 0.1%
50111 | 1 | 0.1%
50202 | 1 | 0.1%
50201 | 1 | 0.1%
50130 | 1 | 0.1%
50122 | 1 | 0.1%
50121 | 1 | 0.1%
50112 | 1 | 0.1%
49500 | 1 | 0.1%
Other values (1185) | 1185 | 99.1%

Minimum 10 values
Value | Count | Frequency (%)
1110 | 1 | 0.1%
1121 | 1 | 0.1%
1122 | 1 | 0.1%
1123 | 1 | 0.1%
1131 | 1 | 0.1%
1132 | 1 | 0.1%
1140 | 1 | 0.1%
1151 | 1 | 0.1%
1152 | 1 | 0.1%
1159 | 1 | 0.1%

Maximum 10 values
Value | Count | Frequency (%)
99009 | 1 | 0.1%
99001 | 1 | 0.1%
98200 | 1 | 0.1%
98100 | 1 | 0.1%
97000 | 1 | 0.1%
96999 | 1 | 0.1%
96995 | 1 | 0.1%
96994 | 1 | 0.1%
96993 | 1 | 0.1%
96992 | 1 | 0.1%

고용보험 업종명 (employment insurance industry name)
Text

Distinct: 1196
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 9.5 KiB

Length

Max length: 33
Median length: 24
Mean length: 12.699833
Min length: 2

Characters and Unicode

Total characters: 15189
Distinct characters: 451
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1196
Unique (%): 100.0%

Sample

1st row: 곡물 및 기타 식량작물 재배업
2nd row: 채소작물 재배업
3rd row: 화훼작물 재배업
4th row: 종자 및 묘목 생산업
5th row: 과실작물 재배업

Value | Count | Frequency (%)
[lost] | 484 | 10.5%
제조업 | 418 | 9.1%
기타 | 241 | 5.2%
도매업 | 88 | 1.9%
소매업 | 68 | 1.5%
[lost] | 64 | 1.4%
서비스업 | 63 | 1.4%
[lost] | 62 | 1.3%
운영업 | 56 | 1.2%
유사 | 28 | 0.6%
Other values (1500) | 3043 | 65.9%

Most occurring characters

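Value | Count | Frequency (%)
(space) | 3419 | 22.5%
[lost] | 1148 | 7.6%
[lost] | 598 | 3.9%
[lost] | 504 | 3.3%
[lost] | 492 | 3.2%
[lost] | 484 | 3.2%
[lost] | 250 | 1.6%
[lost] | 240 | 1.6%
[lost] | 223 | 1.5%
[lost] | 180 | 1.2%
Other values (441) | 7651 | 50.4%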

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 11630 | 76.6%
Space Separator | 3419 | 22.5%
Other Punctuation | 129 | 0.8%
Decimal Number | 5 | < 0.1%
Close Punctuation | 3 | < 0.1%
Open Punctuation | 3 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[lost] | 1148 | 9.9%
[lost] | 598 | 5.1%
[lost] | 504 | 4.3%
[lost] | 492 | 4.2%
[lost] | 484 | 4.2%
[lost] | 250 | 2.1%
[lost] | 240 | 2.1%
[lost] | 223 | 1.9%
[lost] | 180 | 1.5%
[lost] | 176 | 1.5%
Other values (436) | 7335 | 63.1%

Space Separator
Value | Count | Frequency (%)
(space) | 3419 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 129 | 100.0%

Decimal Number
Value | Count | Frequency (%)
1 | 5 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 3 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 11630 | 76.6%
Common | 3559 | 23.4%

Most frequent character per script

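Hangul
Value | Count | Frequency (%)
[lost] | 1148 | 9.9%
[lost] | 598 | 5.1%
[lost] | 504 | 4.3%
[lost] | 492 | 4.2%
[lost] | 484 | 4.2%
[lost] | 250 | 2.1%
[lost] | 240 | 2.1%
[lost] | 223 | 1.9%
[lost] | 180 | 1.5%
[lost] | 176 | 1.5%
Other values (436) | 7335 | 63.1%

Common
Value | Count | Frequency (%)
(space) | 3419 | 96.1%
, | 129 | 3.6%
1 | 5 | 0.1%
) | 3 | 0.1%
( | 3 | 0.1%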

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 11600 | 76.4%
ASCII | 3559 | 23.4%
Compat Jamo | 30 | 0.2%

Most frequent character per block

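ASCII
Value | Count | Frequency (%)
(space) | 3419 | 96.1%
, | 129 | 3.6%
1 | 5 | 0.1%
) | 3 | 0.1%
( | 3 | 0.1%

Hangul
Value | Count | Frequency (%)
[lost] | 1148 | 9.9%
[lost] | 598 | 5.2%
[lost] | 504 | 4.3%
[lost] | 492 | 4.2%
[lost] | 484 | 4.2%
[lost] | 250 | 2.2%
[lost] | 240 | 2.1%
[lost] | 223 | 1.9%
[lost] | 180 | 1.6%
[lost] | 176 | 1.5%
Other values (435) | 7305 | 63.0%

Compat Jamo
Value | Count | Frequency (%)
[lost] | 30 | 100.0%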

Interactions

[Figure: interaction plots of the numeric variables (4 panels)]

Correlations

[Figure: correlation heatmap]
Variable | 연번 | 고용보험 업종코드
연번 | 1.000 | 0.981
고용보험 업종코드 | 0.981 | 1.000

[Figure: correlation heatmap]
Variable | 연번 | 고용보험 업종코드
연번 | 1.000 | 1.000
고용보험 업종코드 | 1.000 | 1.000
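
This extract does not name the coefficient behind either matrix. One way two monotonically related columns can score below 1 on one measure and exactly 1 on another is the difference between a linear coefficient (Pearson) and a rank coefficient (Spearman); whether that is what the report shows is an assumption. The sketch below computes both on the first ten rows of the Sample section. The helper functions are hypothetical, and `ranks` assumes there are no tied values.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """Rank values from 1 (smallest) to n; assumes no ties."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# First ten rows of the sample: 연번 and 고용보험 업종코드.
serial = list(range(1, 11))
codes = [1110, 1121, 1122, 1123, 1131, 1132, 1140, 1151, 1152, 1159]

linear = pearson(serial, codes)
rank_based = spearman(serial, codes)
print(round(linear, 3), round(rank_based, 3))
```

Because the codes rise with the row number but in uneven steps, the rank coefficient is 1 while the linear one falls slightly short. Either way the relationship reflects that the file is sorted by code, not a property of the industries.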

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
(index) | 연번 | 고용보험 업종(대분류) | 고용보험 업종코드 | 고용보험 업종명
0 | 1 | A. 농업, 임업 및 어업(01~03) | 1110 | 곡물 및 기타 식량작물 재배업
1 | 2 | A. 농업, 임업 및 어업(01~03) | 1121 | 채소작물 재배업
2 | 3 | A. 농업, 임업 및 어업(01~03) | 1122 | 화훼작물 재배업
3 | 4 | A. 농업, 임업 및 어업(01~03) | 1123 | 종자 및 묘목 생산업
4 | 5 | A. 농업, 임업 및 어업(01~03) | 1131 | 과실작물 재배업
5 | 6 | A. 농업, 임업 및 어업(01~03) | 1132 | 음료용 및 향신용 작물 재배업
6 | 7 | A. 농업, 임업 및 어업(01~03) | 1140 | 기타 작물 재배업
7 | 8 | A. 농업, 임업 및 어업(01~03) | 1151 | 콩나물 재배업
8 | 9 | A. 농업, 임업 및 어업(01~03) | 1152 | 채소, 화훼 및 과실작물 시설 재배업
9 | 10 | A. 농업, 임업 및 어업(01~03) | 1159 | 기타 시설작물 재배업

Last rows
(index) | 연번 | 고용보험 업종(대분류) | 고용보험 업종코드 | 고용보험 업종명
1186 | 1187 | S. 협회 및 단체, 수리 및 기타 개인 서비스업(94~132) | 96992 | 점술 및 유사 서비스업
1187 | 1188 | S. 협회 및 단체, 수리 및 기타 개인 서비스업(94~133) | 96993 | 개인 간병 및 유사 서비스업
1188 | 1189 | S. 협회 및 단체, 수리 및 기타 개인 서비스업(94~134) | 96994 | 결혼 상담 및 준비 서비스업
1189 | 1190 | S. 협회 및 단체, 수리 및 기타 개인 서비스업(94~135) | 96995 | 애완 동물 장묘 및 보호 서비스업
1190 | 1191 | S. 협회 및 단체, 수리 및 기타 개인 서비스업(94~136) | 96999 | 그 외 기타 달리 분류되지 않은 개인 서비스업
1191 | 1192 | T. 가구 내 고용활동 및 달리 분류되지 않은 자가 소비 생산활동(97~98) | 97000 | 가구 내 고용활동
1192 | 1193 | T. 가구 내 고용활동 및 달리 분류되지 않은 자가 소비 생산활동(97~98) | 98100 | 자가 소비를 위한 가사 생산 활동
1193 | 1194 | T. 가구 내 고용활동 및 달리 분류되지 않은 자가 소비 생산활동(97~98) | 98200 | 자가 소비를 위한 가사 서비스 활동
1194 | 1195 | U. 국제 및 외국기관(99) | 99001 | 주한 외국 공관
1195 | 1196 | U. 국제 및 외국기관(99) | 99009 | 기타 국제 및 외국기관
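
In the last rows, the labels for category S differ only in the upper bound of the parenthesised range (94~132 through 94~136). That may be part of why 고용보험 업종(대분류) has 1150 distinct values, though this extract alone cannot confirm it. If a stable grouping key is wanted, one option is to split each label into its letter, its name, and its range, and to group by the letter. The sketch below assumes every label follows the pattern "letter, period, name, parenthesised range" seen in the sample; `split_major_category` and `LABEL_PATTERN` are hypothetical names.

```python
import re

# Assumed label shape: "<LETTER>. <name>(<range>)", as in the sample rows.
LABEL_PATTERN = re.compile(
    r"^(?P<letter>[A-Z])\.\s*(?P<name>.+?)\((?P<range>[^()]*)\)$"
)


def split_major_category(label):
    """Split a major-category label into (letter, name, range text)."""
    match = LABEL_PATTERN.match(label.strip())
    if match is None:
        raise ValueError(f"unexpected label format: {label!r}")
    return match["letter"], match["name"].strip(), match["range"]


labels = [
    "S. 협회 및 단체, 수리 및 기타 개인 서비스업(94~132)",
    "S. 협회 및 단체, 수리 및 기타 개인 서비스업(94~133)",
    "U. 국제 및 외국기관(99)",
]
parsed = [split_major_category(label) for label in labels]
for letter, name, code_range in parsed:
    print(letter, name, code_range)
```

Grouping by the letter (or by the letter and name together) treats the two S rows as the same category even though their raw strings differ.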