Overview

Dataset statistics

Number of variables: 5
Number of observations: 8322
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 341.5 KiB
Average record size in memory: 42.0 B
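
As a quick check, the average record size follows from the total memory size and the number of observations reported above. The sketch below assumes 1 KiB = 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics above.
total_size_kib = 341.5
observations = 8322

# Assumption: 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
average_record_bytes = total_size_bytes / observations

print(f"{average_record_bytes:.1f} B")
```

Rounded to one decimal place, the result agrees with the reported average record size.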

Variable types

Text: 2
Numeric: 2
DateTime: 1

Dataset

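Description: Information on the vocational life counselors that businesses employing people with disabilities appoint to support those employees' working life (counselor training cohort, year of qualification, qualification number, completion date, affiliated organization, etc.)
URL: https://www.data.go.kr/data/15014778/fileData.do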

Alerts

자격년도 (qualification year) is highly correlated overall with 자격번호 (qualification number): High correlation
자격번호 is highly correlated overall with 자격년도: High correlation
자격번호 has unique values: Unique

Reproduction

Analysis started: 2023-12-12 03:31:17.311884
Analysis finished: 2023-12-12 03:31:18.814132
Duration: 1.5 seconds
Software version: ydata-profiling v4.5.1
Configuration: config.json

Variables

기수
Text

Distinct: 142
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 65.1 KiB

Length

Max length: 8
Median length: 3
Mean length: 3.3202355
Min length: 2

Characters and Unicode

Total characters: 27631
Distinct characters: 20
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
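
The category counts in this section can be approximated with Python's standard `unicodedata` module. The following is a minimal sketch, not the profiler's own implementation: the helper `category_counts` and the mapping `CATEGORY_NAMES` are hypothetical names, and the two input strings are values that appear in the 기수 column of this report.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in the 기수 column.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Lo": "Other Letter",
    "Pd": "Dash Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the two-letter code for categories not listed above.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Two values taken from the 기수 column.
sample = ["1기", "맞춤-2017"]
counts = category_counts(sample)
for name, count in counts.most_common():
    print(name, count)
```

Hangul syllables fall under "Other Letter", ASCII digits under "Decimal Number", and the hyphen under "Dash Punctuation", which matches the three categories reported for this variable.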

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1기
2nd row: 1기
3rd row: 1기
4th row: 1기
5th row: 1기

Value               Count  Frequency (%)
62기                1462   17.6%
맞춤-2017           150    1.8%
101기               148    1.8%
102기               132    1.6%
129기               121    1.5%
131기               117    1.4%
1기                 112    1.3%
99기                102    1.2%
64기                100    1.2%
2기                 95     1.1%
Other values (132)  5783   69.5%
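
The frequency column is each count divided by the number of observations (8322). A short sketch of that arithmetic follows, using counts copied from the table above; the function name `frequency_percent` is hypothetical.

```python
TOTAL_OBSERVATIONS = 8322

# Counts copied from the table of most common 기수 values.
counts = {
    "62기": 1462,
    "맞춤-2017": 150,
    "101기": 148,
    "Other values (132)": 5783,
}


def frequency_percent(count, total=TOTAL_OBSERVATIONS):
    """Return the share of rows as a percentage rounded to one decimal."""
    return round(100 * count / total, 1)


for value, count in counts.items():
    print(f"{value}: {count} ({frequency_percent(count)}%)")
```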

Most occurring characters

ValueCountFrequency (%)
8096
29.3%
1 4041
14.6%
2 3411
12.3%
6 2486
 
9.0%
3 1429
 
5.2%
9 1355
 
4.9%
0 1284
 
4.6%
8 1206
 
4.4%
5 1186
 
4.3%
4 1133
 
4.1%
Other values (10) 2004
 
7.3%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    18631  67.4%
Other Letter      8772   31.7%
Dash Punctuation  228    0.8%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      4041   21.7%
2      3411   18.3%
6      2486   13.3%
3      1429   7.7%
9      1355   7.3%
0      1284   6.9%
8      1206   6.5%
5      1186   6.4%
4      1133   6.1%
7      1100   5.9%
Other Letter
ValueCountFrequency (%)
8096
92.3%
245
 
2.8%
245
 
2.8%
31
 
0.4%
31
 
0.4%
31
 
0.4%
31
 
0.4%
31
 
0.4%
31
 
0.4%
Dash Punctuation
Value  Count  Frequency (%)
-      228    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  18859  68.3%
Hangul  8772   31.7%

Most frequent character per script

Common
Value  Count  Frequency (%)
1      4041   21.4%
2      3411   18.1%
6      2486   13.2%
3      1429   7.6%
9      1355   7.2%
0      1284   6.8%
8      1206   6.4%
5      1186   6.3%
4      1133   6.0%
7      1100   5.8%
Hangul
ValueCountFrequency (%)
8096
92.3%
245
 
2.8%
245
 
2.8%
31
 
0.4%
31
 
0.4%
31
 
0.4%
31
 
0.4%
31
 
0.4%
31
 
0.4%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   18859  68.3%
Hangul  8772   31.7%

Most frequent character per block

Hangul
ValueCountFrequency (%)
8096
92.3%
245
 
2.8%
245
 
2.8%
31
 
0.4%
31
 
0.4%
31
 
0.4%
31
 
0.4%
31
 
0.4%
31
 
0.4%
ASCII
Value  Count  Frequency (%)
1      4041   21.4%
2      3411   18.1%
6      2486   13.2%
3      1429   7.6%
9      1355   7.2%
0      1284   6.8%
8      1206   6.4%
5      1186   6.3%
4      1133   6.0%
7      1100   5.8%

자격년도
Real number (ℝ)

HIGH CORRELATION 

Distinct: 33
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2012.8266
Minimum: 1991
Maximum: 2023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 73.3 KiB

Quantile statistics

Minimum: 1991
5th percentile: 1994
Q1: 2010
Median: 2012
Q3: 2020
95th percentile: 2022
Maximum: 2023
Range: 32
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 7.8576507
Coefficient of variation (CV): 0.0039037892
Kurtosis: 0.53284256
Mean: 2012.8266
Median Absolute Deviation (MAD): 6
Skewness: -0.95153375
Sum: 16750743
Variance: 61.742674
Monotonicity: Increasing
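
Several of the figures for 자격년도 are derived from one another. The sketch below recomputes the mean, variance, coefficient of variation, range, and IQR from the reported sum, standard deviation, and quantiles, assuming the usual definitions (mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean). It works from the rounded values printed in the report, so agreement is only to the displayed precision.

```python
import math

# Figures copied from the report for 자격년도.
n = 8322
total = 16750743
std = 7.8576507
q1, q3 = 2010, 2020
minimum, maximum = 1991, 2023

mean = total / n                 # arithmetic mean
variance = std ** 2              # variance is the squared standard deviation
cv = std / mean                  # coefficient of variation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range

print(f"mean={mean:.4f} variance={variance:.6f} cv={cv:.10f}")
print(f"range={value_range} iqr={iqr}")

# Sanity check against the mean printed in the report.
assert math.isclose(mean, 2012.8266, abs_tol=1e-4)
```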
Histogram with fixed size bins (bins=33)

Value              Count  Frequency (%)
2012               1691   20.3%
2020               694    8.3%
2022               653    7.8%
2021               618    7.4%
2011               450    5.4%
2019               418    5.0%
2018               318    3.8%
2017               297    3.6%
2023               257    3.1%
2006               251    3.0%
Other values (23)  2675   32.1%

Value  Count  Frequency (%)
1991   112    1.3%
1992   181    2.2%
1993   81     1.0%
1994   44     0.5%
1995   40     0.5%
1996   28     0.3%
1997   43     0.5%
1998   48     0.6%
1999   47     0.6%
2000   103    1.2%

Value  Count  Frequency (%)
2023   257    3.1%
2022   653    7.8%
2021   618    7.4%
2020   694    8.3%
2019   418    5.0%
2018   318    3.8%
2017   297    3.6%
2016   158    1.9%
2015   136    1.6%
2014   189    2.3%

자격번호
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 8322
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.012831 × 10^9
Minimum: 1.991 × 10^9
Maximum: 2.0230003 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 73.3 KiB

Quantile statistics

Minimum: 1.991 × 10^9
5th percentile: 1.994 × 10^9
Q1: 2.01 × 10^9
Median: 2.0120015 × 10^9
Q3: 2.0200001 × 10^9
95th percentile: 2.0220005 × 10^9
Maximum: 2.0230003 × 10^9
Range: 32000256
Interquartile range (IQR): 10000110

Descriptive statistics

Standard deviation: 7860964
Coefficient of variation (CV): 0.0039054266
Kurtosis: 0.53008412
Mean: 2.012831 × 10^9
Median Absolute Deviation (MAD): 5998778
Skewness: -0.95098469
Sum: 1.675078 × 10^13
Variance: 6.1794754 × 10^13
Monotonicity: Strictly increasing
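
The sample rows in this report suggest, although the report does not state it, that 자격번호 is built from the four-digit qualification year followed by a six-digit running sequence (for example 1991000001). If that holds, it would account for both the strictly increasing order and the high correlation with 자격년도. The sketch below illustrates the assumed layout; `split_qualification_number` is a hypothetical helper, not something defined by the dataset.

```python
def split_qualification_number(number):
    """Split a number such as 1991000001 into (year, sequence).

    Assumption: the first four digits are the year and the last six
    digits are a running sequence within that year.
    """
    return divmod(number, 1_000_000)


# (자격년도, 자격번호) pairs copied from the Sample section of this report.
rows = [
    (1991, 1991000001),
    (1991, 1991000010),
    (2023, 2023000248),
    (2023, 2023000257),
]

for year, number in rows:
    prefix, sequence = split_qualification_number(number)
    print(number, "->", prefix, sequence, "year matches:", prefix == year)
```

Under this assumption the largest 2023 number has sequence 257, which equals the count of rows reported for 자격년도 2023.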
Histogram with fixed size bins (bins=50)

Value                 Count  Frequency (%)
1991000001            1      < 0.1%
2018000196            1      < 0.1%
2018000194            1      < 0.1%
2018000193            1      < 0.1%
2018000192            1      < 0.1%
2018000191            1      < 0.1%
2018000190            1      < 0.1%
2018000189            1      < 0.1%
2018000188            1      < 0.1%
2018000187            1      < 0.1%
Other values (8312)   8312   99.9%

Value       Count  Frequency (%)
1991000001  1      < 0.1%
1991000002  1      < 0.1%
1991000003  1      < 0.1%
1991000004  1      < 0.1%
1991000005  1      < 0.1%
1991000006  1      < 0.1%
1991000007  1      < 0.1%
1991000008  1      < 0.1%
1991000009  1      < 0.1%
1991000010  1      < 0.1%

Value       Count  Frequency (%)
2023000257  1      < 0.1%
2023000256  1      < 0.1%
2023000255  1      < 0.1%
2023000254  1      < 0.1%
2023000253  1      < 0.1%
2023000252  1      < 0.1%
2023000251  1      < 0.1%
2023000250  1      < 0.1%
2023000249  1      < 0.1%
2023000248  1      < 0.1%

수료일
DateTime

Distinct: 90
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 65.1 KiB
Minimum: 1991-12-31 00:00:00
Maximum: 2023-06-30 00:00:00
Histogram with fixed size bins (bins=50)

소속기관명
Text

Distinct: 6114
Distinct (%): 73.5%
Missing: 0
Missing (%): 0.0%
Memory size: 65.1 KiB

Length

Max length: 32
Median length: 25
Mean length: 7.7658015
Min length: 1

Characters and Unicode

Total characters: 64627
Distinct characters: 740
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5108
Unique (%): 61.4%

Sample

1st row: 대양고무(주)
2nd row: 삼화콘덴서공업(주)
3rd row: 교보생명
4th row: 화승인더스트리㈜
5th row: (주)한진중공업 인사과장

Value                Count  Frequency (%)
없음                 238    2.4%
주식회사             204    2.0%
이마트               154    1.5%
사회복지법인         69     0.7%
총무부               55     0.6%
학교법인             37     0.4%
                     36     0.4%
                     35     0.4%
대리                 34     0.3%
삼성전자             33     0.3%
Other values (6436)  9092   91.0%

Most occurring characters

ValueCountFrequency (%)
2637
 
4.1%
) 2328
 
3.6%
( 2262
 
3.5%
1665
 
2.6%
1484
 
2.3%
1466
 
2.3%
1359
 
2.1%
1188
 
1.8%
1123
 
1.7%
1114
 
1.7%
Other values (730) 48001
74.3%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       56405  87.3%
Close Punctuation  2329   3.6%
Open Punctuation   2263   3.5%
Space Separator    1665   2.6%
Other Symbol       844    1.3%
Uppercase Letter   805    1.2%
Lowercase Letter   185    0.3%
Other Punctuation  69     0.1%
Decimal Number     47     0.1%
Dash Punctuation   15     < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2637
 
4.7%
1484
 
2.6%
1466
 
2.6%
1359
 
2.4%
1188
 
2.1%
1123
 
2.0%
1114
 
2.0%
1099
 
1.9%
971
 
1.7%
946
 
1.7%
Other values (659) 43018
76.3%
Uppercase Letter
ValueCountFrequency (%)
S 132
16.4%
C 77
 
9.6%
K 70
 
8.7%
G 55
 
6.8%
L 45
 
5.6%
R 44
 
5.5%
I 41
 
5.1%
D 40
 
5.0%
T 40
 
5.0%
M 40
 
5.0%
Other values (15) 221
27.5%
Lowercase Letter
ValueCountFrequency (%)
s 28
15.1%
k 23
12.4%
c 20
10.8%
t 17
9.2%
i 15
8.1%
o 12
 
6.5%
e 11
 
5.9%
a 10
 
5.4%
r 7
 
3.8%
p 7
 
3.8%
Other values (11) 35
18.9%
Other Punctuation
ValueCountFrequency (%)
. 32
46.4%
& 17
24.6%
* 6
 
8.7%
, 4
 
5.8%
/ 4
 
5.8%
" 2
 
2.9%
' 2
 
2.9%
; 1
 
1.4%
1
 
1.4%
Decimal Number
ValueCountFrequency (%)
1 15
31.9%
2 11
23.4%
9 4
 
8.5%
3 4
 
8.5%
6 4
 
8.5%
4 3
 
6.4%
5 3
 
6.4%
8 2
 
4.3%
7 1
 
2.1%
Close Punctuation
ValueCountFrequency (%)
) 2328
> 99.9%
] 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 2262
> 99.9%
[ 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
1665
100.0%
Other Symbol
ValueCountFrequency (%)
844
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 15
100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  57249  88.6%
Common  6388   9.9%
Latin   990    1.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2637
 
4.6%
1484
 
2.6%
1466
 
2.6%
1359
 
2.4%
1188
 
2.1%
1123
 
2.0%
1114
 
1.9%
1099
 
1.9%
971
 
1.7%
946
 
1.7%
Other values (660) 43862
76.6%
Latin
ValueCountFrequency (%)
S 132
 
13.3%
C 77
 
7.8%
K 70
 
7.1%
G 55
 
5.6%
L 45
 
4.5%
R 44
 
4.4%
I 41
 
4.1%
D 40
 
4.0%
T 40
 
4.0%
M 40
 
4.0%
Other values (36) 406
41.0%
Common
ValueCountFrequency (%)
) 2328
36.4%
( 2262
35.4%
1665
26.1%
. 32
 
0.5%
& 17
 
0.3%
- 15
 
0.2%
1 15
 
0.2%
2 11
 
0.2%
* 6
 
0.1%
9 4
 
0.1%
Other values (14) 33
 
0.5%

Most occurring blocks

Value        Count  Frequency (%)
Hangul       56404  87.3%
ASCII        7377   11.4%
None         845    1.3%
Compat Jamo  1      < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2637
 
4.7%
1484
 
2.6%
1466
 
2.6%
1359
 
2.4%
1188
 
2.1%
1123
 
2.0%
1114
 
2.0%
1099
 
1.9%
971
 
1.7%
946
 
1.7%
Other values (658) 43017
76.3%
ASCII
ValueCountFrequency (%)
) 2328
31.6%
( 2262
30.7%
1665
22.6%
S 132
 
1.8%
C 77
 
1.0%
K 70
 
0.9%
G 55
 
0.7%
L 45
 
0.6%
R 44
 
0.6%
I 41
 
0.6%
Other values (59) 658
 
8.9%
None
ValueCountFrequency (%)
844
99.9%
1
 
0.1%
Compat Jamo
ValueCountFrequency (%)
1
100.0%

Interactions


Correlations

          자격년도  자격번호  수료일
자격년도  1.000     1.000     1.000
자격번호  1.000     1.000     1.000
수료일    1.000     1.000     1.000

          자격년도  자격번호
자격년도  1.000     0.995
자격번호  0.995     1.000
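
The report does not say which coefficient each matrix uses. Purely as an illustration of how a value close to 1 arises between 자격년도 and 자격번호, the sketch below computes a Pearson correlation (an assumed choice) over the 20 rows listed in the Sample section. It does not reproduce the 0.995 figure, which comes from the full dataset; the function `pearson` is a hand-written helper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# The first ten and last ten rows shown in the Sample section.
years = [1991] * 10 + [2023] * 10
numbers = [1991000001 + i for i in range(10)] + [2023000248 + i for i in range(10)]

r = pearson(years, numbers)
print(f"Pearson r on the sample rows: {r:.6f}")
```

Because each number begins with its year, the two columns move almost in lockstep, so the coefficient on these rows is very close to 1.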

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

      기수    자격년도  자격번호    수료일      소속기관명
0     1기     1991      1991000001  1991-12-31  대양고무(주)
1     1기     1991      1991000002  1991-12-31  삼화콘덴서공업(주)
2     1기     1991      1991000003  1991-12-31  교보생명
3     1기     1991      1991000004  1991-12-31  화승인더스트리㈜
4     1기     1991      1991000005  1991-12-31  (주)한진중공업 인사과장
5     1기     1991      1991000006  1991-12-31  (주)태평양 영업기획팀장
6     1기     1991      1991000007  1991-12-31  충남방적(주) 천안공장
7     1기     1991      1991000008  1991-12-31  근로복지공단 경인지역본부 복지부 차장
8     1기     1991      1991000009  1991-12-31  없음
9     1기     1991      1991000010  1991-12-31  (주)삼호실업

      기수    자격년도  자격번호    수료일      소속기관명
8312  137기   2023      2023000248  2023-06-28  한국생명공학연구원
8313  137기   2023      2023000249  2023-06-28  부천세종병원
8314  137기   2023      2023000250  2023-06-28  소화천사의집
8315  137기   2023      2023000251  2023-06-28  (사)서울특별시시각장애인연합회
8316  137기   2023      2023000252  2023-06-30  포천나눔의집
8317  137기   2023      2023000253  2023-06-28  ㈜브라보비버대구
8318  137기   2023      2023000254  2023-06-29  충주시니어클럽
8319  137기   2023      2023000255  2023-06-28  사회복지법인양혜원
8320  137기   2023      2023000256  2023-06-30  남악오룡발달장애인주간활동방과후서비스센터
8321  137기   2023      2023000257  2023-06-28  에스제이씨코리아 주식회사