Overview

Dataset statistics

Number of variables: 4
Number of observations: 94
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.2 KiB
Average record size in memory: 35.4 B

Variable types

Numeric: 2
Text: 1
Categorical: 1

Dataset

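Description: Enrollment status by region of mid-sized enterprises in the Naeil Chaeum Mutual Aid program (내일채움공제), which the Korea SMEs and Startups Agency (중소벤처기업진흥공단) administers to support long-term employment and asset building for workers at small and medium-sized venture enterprises.
URL: https://www.data.go.kr/data/15102333/fileData.do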

Alerts

High correlation: 사업자번호 (business registration number) is highly overall correlated with 기업 소재지역 (company location)
High correlation: 기업 소재지역 is highly overall correlated with 사업자번호
Unique: 순번 (sequence number) has unique values
Unique: 업체명 (company name) has unique values

Reproduction

Analysis started: 2023-12-11 23:50:15.684784
Analysis finished: 2023-12-11 23:50:16.305791
Duration: 0.62 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

순번 (sequence number)
Real number (ℝ)

UNIQUE 

Distinct: 94
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 47.5
Minimum: 1
Maximum: 94
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 978.0 B

Quantile statistics

Minimum: 1
5th percentile: 5.65
Q1: 24.25
Median: 47.5
Q3: 70.75
95th percentile: 89.35
Maximum: 94
Range: 93
Interquartile range (IQR): 46.5

Descriptive statistics

Standard deviation: 27.279418
Coefficient of variation (CV): 0.57430354
Kurtosis: -1.2
Mean: 47.5
Median Absolute Deviation (MAD): 23.5
Skewness: 0
Sum: 4465
Variance: 744.16667
Monotonicity: Strictly increasing
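
Because 순번 appears to be simply the integers 1 through 94 (94 distinct values, minimum 1, maximum 94, strictly increasing), most of the figures above can be re-derived by hand. The following is a minimal sketch using only the Python standard library. The helper name `percentile` and its linear-interpolation rule are assumptions chosen because they reproduce the reported quantiles; they are not a statement of how ydata-profiling computes them internally.

```python
import math
import statistics

# Assumption: the column holds exactly the integers 1..94.
values = list(range(1, 95))


def percentile(sorted_values, q):
    """Linearly interpolated percentile for q in [0, 1] (hypothetical helper)."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)      # sample standard deviation
cv = std_dev / mean                     # coefficient of variation
mad = statistics.median(abs(v - median) for v in values)
q1 = percentile(values, 0.25)
q3 = percentile(values, 0.75)

print("sum:", sum(values))
print("mean:", mean, "median:", median)
print("variance:", round(variance, 5), "std:", round(std_dev, 6))
print("CV:", round(cv, 8), "MAD:", mad)
print("Q1:", q1, "Q3:", q3, "IQR:", q3 - q1)
print("5th / 95th percentile:",
      round(percentile(values, 0.05), 2), round(percentile(values, 0.95), 2))
```

Under that assumption the sample variance is 94 × 95 / 12 ≈ 744.17, which matches the reported value and suggests the report uses the n − 1 denominator for this statistic.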
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
1                  1      1.1%
61                 1      1.1%
70                 1      1.1%
69                 1      1.1%
68                 1      1.1%
67                 1      1.1%
66                 1      1.1%
65                 1      1.1%
64                 1      1.1%
63                 1      1.1%
Other values (84)  84     89.4%

Value  Count  Frequency (%)
1      1      1.1%
2      1      1.1%
3      1      1.1%
4      1      1.1%
5      1      1.1%
6      1      1.1%
7      1      1.1%
8      1      1.1%
9      1      1.1%
10     1      1.1%

Value  Count  Frequency (%)
94     1      1.1%
93     1      1.1%
92     1      1.1%
91     1      1.1%
90     1      1.1%
89     1      1.1%
88     1      1.1%
87     1      1.1%
86     1      1.1%
85     1      1.1%

사업자번호 (business registration number)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 17
Distinct (%): 18.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.6300043 × 10^9
Minimum: 1.0281196 × 10^9
Maximum: 7.468601 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 978.0 B

Quantile statistics

Minimum: 1.0281196 × 10^9
5th percentile: 1.1986847 × 10^9
Q1: 3.4088016 × 10^9
Median: 5.0391719 × 10^9
Q3: 5.1281058 × 10^9
95th percentile: 7.468601 × 10^9
Maximum: 7.468601 × 10^9
Range: 6.4404814 × 10^9
Interquartile range (IQR): 1.7193042 × 10^9

Descriptive statistics

Standard deviation: 1.597471 × 10^9
Coefficient of variation (CV): 0.34502582
Kurtosis: 0.055914914
Mean: 4.6300043 × 10^9
Median Absolute Deviation (MAD): 1.0624608 × 10^9
Skewness: -0.075267101
Sum: 4.352204 × 10^11
Variance: 2.5519136 × 10^18
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=17)
Value             Count  Frequency (%)
5048155608        17     18.1%
3408801583        15     16.0%
7468601021        11     11.7%
5128105808        9      9.6%
4018117232        8      8.5%
5029182694        6      6.4%
1198684671        5      5.3%
3866000359        5      5.3%
6101632770        4      4.3%
5039171943        4      4.3%
Other values (7)  10     10.6%

Value       Count  Frequency (%)
1028119575  1      1.1%
1198684671  5      5.3%
2209600816  1      1.1%
2618123607  3      3.2%
3010846905  1      1.1%
3408801583  15     16.0%
3866000359  5      5.3%
4018117232  8      8.5%
4028129274  1      1.1%
5029182694  6      6.4%

Value       Count  Frequency (%)
7468601021  11     11.7%
7408100766  1      1.1%
6108160482  2      2.1%
6101632770  4      4.3%
5128105808  9      9.6%
5048155608  17     18.1%
5039171943  4      4.3%
5029182694  6      6.4%
4028129274  1      1.1%
4018117232  8      8.5%

업체명 (company name)
Text

UNIQUE 

Distinct: 94
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 884.0 B

Length

Max length: 21
Median length: 14
Mean length: 8.3404255
Min length: 3

Characters and Unicode

Total characters: 784
Distinct characters: 188
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
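
As an illustration of those character properties, here is a minimal sketch that counts Unicode general categories with Python's standard `unicodedata` module. It runs only on the five sample company names shown in this report, not on the full 94-row column, so its counts are not expected to match the report's totals. The variable names are illustrative.

```python
import unicodedata
from collections import Counter

# The five sample rows shown in this report for 업체명 (not the full column).
names = [
    "대한폴리켐 주식회사",
    "(주)혜성프로비젼",
    "버커트코리아",
    "(주)훼밀리팜",
    "한국요꼬가와일렉트로닉스매뉴팩처링(주)",
]

category_counts = Counter()
for name in names:
    # Normalise to NFC so each Hangul syllable is a single code point.
    for ch in unicodedata.normalize("NFC", name):
        category_counts[unicodedata.category(ch)] += 1

# Two-letter codes: Lo = Other Letter, Ps = Open Punctuation,
# Pe = Close Punctuation, Zs = Space Separator.
for category, count in category_counts.most_common():
    print(category, count)
```

Hangul syllables fall under `Lo` (Other Letter), which is why that category dominates the report's category table.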

Unique

Unique: 94
Unique (%): 100.0%

Sample

1st row: 대한폴리켐 주식회사
2nd row: (주)혜성프로비젼
3rd row: 버커트코리아
4th row: (주)훼밀리팜
5th row: 한국요꼬가와일렉트로닉스매뉴팩처링(주)
Value              Count  Frequency (%)
주식회사           17     15.2%
대한폴리켐         1      0.9%
일진건설산업(주    1      0.9%
주)정도            1      0.9%
에스피씨삼립       1      0.9%
에스피지           1      0.9%
금문철강           1      0.9%
주)에스티씨        1      0.9%
대림통상(주        1      0.9%
크레텍책임(주      1      0.9%
Other values (86)  86     76.8%

Most occurring characters

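Value               Count  Frequency (%)
[glyph lost]        80     10.2%
(                   55     7.0%
)                   55     7.0%
[glyph lost]        32     4.1%
[glyph lost]        30     3.8%
[glyph lost]        26     3.3%
[glyph lost]        26     3.3%
[glyph lost]        25     3.2%
[glyph lost]        22     2.8%
(space)             19     2.4%
Other values (178)  414    52.8%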

Most occurring categories

Value              Count  Frequency (%)
Other Letter       649    82.8%
Open Punctuation   55     7.0%
Close Punctuation  55     7.0%
Space Separator    19     2.4%
Other Punctuation  3      0.4%
Uppercase Letter   2      0.3%
Decimal Number     1      0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
[glyph lost]        80     12.3%
[glyph lost]        32     4.9%
[glyph lost]        30     4.6%
[glyph lost]        26     4.0%
[glyph lost]        26     4.0%
[glyph lost]        25     3.9%
[glyph lost]        22     3.4%
[glyph lost]        10     1.5%
[glyph lost]        10     1.5%
[glyph lost]        10     1.5%
Other values (170)  378    58.2%

Other Punctuation
Value  Count  Frequency (%)
.      2      66.7%
&      1      33.3%

Uppercase Letter
Value  Count  Frequency (%)
S      1      50.0%
C      1      50.0%

Open Punctuation
Value  Count  Frequency (%)
(      55     100.0%

Close Punctuation
Value  Count  Frequency (%)
)      55     100.0%

Space Separator
Value    Count  Frequency (%)
(space)  19     100.0%

Decimal Number
Value  Count  Frequency (%)
5      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  649    82.8%
Common  133    17.0%
Latin   2      0.3%

Most frequent character per script

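Hangul
Value               Count  Frequency (%)
[glyph lost]        80     12.3%
[glyph lost]        32     4.9%
[glyph lost]        30     4.6%
[glyph lost]        26     4.0%
[glyph lost]        26     4.0%
[glyph lost]        25     3.9%
[glyph lost]        22     3.4%
[glyph lost]        10     1.5%
[glyph lost]        10     1.5%
[glyph lost]        10     1.5%
Other values (170)  378    58.2%

Common
Value    Count  Frequency (%)
(        55     41.4%
)        55     41.4%
(space)  19     14.3%
.        2      1.5%
5        1      0.8%
&        1      0.8%

Latin
Value  Count  Frequency (%)
S      1      50.0%
C      1      50.0%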

Most occurring blocks

Value   Count  Frequency (%)
Hangul  649    82.8%
ASCII   135    17.2%

Most frequent character per block

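Hangul
Value               Count  Frequency (%)
[glyph lost]        80     12.3%
[glyph lost]        32     4.9%
[glyph lost]        30     4.6%
[glyph lost]        26     4.0%
[glyph lost]        26     4.0%
[glyph lost]        25     3.9%
[glyph lost]        22     3.4%
[glyph lost]        10     1.5%
[glyph lost]        10     1.5%
[glyph lost]        10     1.5%
Other values (170)  378    58.2%

ASCII
Value    Count  Frequency (%)
(        55     40.7%
)        55     40.7%
(space)  19     14.1%
.        2      1.5%
5        1      0.7%
S        1      0.7%
&        1      0.7%
C        1      0.7%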

기업 소재지역 (company location)
Categorical

HIGH CORRELATION 

Distinct: 9
Distinct (%): 9.6%
Missing: 0
Missing (%): 0.0%
Memory size: 884.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 2
Unique (%): 2.1%

Sample

1st row: 경기
2nd row: 경북
3rd row: 경기
4th row: 경북
5th row: 대구

Common Values

Value             Count  Frequency (%)
경기 (Gyeonggi)   32     34.0%
대구 (Daegu)      27     28.7%
경북 (Gyeongbuk)  9      9.6%
전남 (Jeonnam)    8      8.5%
서울 (Seoul)      7      7.4%
강원 (Gangwon)    5      5.3%
울산 (Ulsan)      4      4.3%
충북 (Chungbuk)   1      1.1%
전북 (Jeonbuk)    1      1.1%
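
The percentages and the "Unique" figure for this variable follow directly from the counts. A minimal sketch, assuming the counts in the Common Values table are complete (they sum to the 94 observations) and that "Unique" means values that occur exactly once:

```python
# Counts copied from the Common Values table for 기업 소재지역.
region_counts = {
    "경기": 32, "대구": 27, "경북": 9, "전남": 8, "서울": 7,
    "강원": 5, "울산": 4, "충북": 1, "전북": 1,
}

total = sum(region_counts.values())

# Relative frequency of each region, formatted to one decimal place.
frequencies = {
    region: f"{100 * count / total:.1f}%"
    for region, count in region_counts.items()
}

distinct = len(region_counts)
# Assumption: "Unique" counts the values that appear exactly once.
unique = [region for region, count in region_counts.items() if count == 1]

for region, share in frequencies.items():
    print(region, region_counts[region], share)
print("distinct:", distinct, f"({100 * distinct / total:.1f}%)")
print("unique:", len(unique), f"({100 * len(unique) / total:.1f}%)")
```

The two regions that occur once (충북 and 전북) account for the reported Unique count of 2 (2.1%).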

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: bar chart of the counts in the Common Values table]

Interactions

[Figure: interaction plots (4 panels)]

Correlations

               순번   사업자번호  업체명  기업 소재지역
순번           1.000  0.000       1.000   0.000
사업자번호     0.000  1.000       1.000   0.871
업체명         1.000  1.000       1.000   1.000
기업 소재지역  0.000  0.871       1.000   1.000

               순번   사업자번호  기업 소재지역
순번           1.000  0.053       0.000
사업자번호     0.053  1.000       0.653
기업 소재지역  0.000  0.653       1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

    순번  사업자번호  업체명                                  기업 소재지역
0   1     7468601021  대한폴리켐 주식회사                     경기
1   2     5128105808  (주)혜성프로비젼                        경북
2   3     1198684671  버커트코리아                            경기
3   4     5128105808  (주)훼밀리팜                            경북
4   5     5039171943  한국요꼬가와일렉트로닉스매뉴팩처링(주)  대구
5   6     3866000359  이지스 주식회사                         강원
6   7     5029182694  킨코스코리아                            대구
7   8     3408801583  주식회사 씨에스에코                     경기
8   9     5048155608  (주)에이스침대                          대구
9   10    3408801583  주식회사 에스에이엘                     경기

    순번  사업자번호  업체명                    기업 소재지역
84  85    3408801583  주식회사 아이티엠반도체   경기
85  86    5128105808  (주)한국티디비신용정보    경북
86  87    6101632770  엠케이전자(주)            울산
87  88    3866000359  자화전자(주)              강원
88  89    5048155608  (주)풍림푸드              대구
89  90    5039171943  필에너지                  대구
90  91    5048155608  (주)우진아이엔에스        대구
91  92    5048155608  (주)에넥스                대구
92  93    5029182694  케이지이티에스(주)        대구
93  94    4028129274  (주)보성메탈              전북