Overview

Dataset statistics

Number of variables: 7
Number of observations: 32
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.0 KiB
Average record size in memory: 65.1 B

Variable types

Categorical: 1
Text: 1
Numeric: 5

Dataset

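Description: Workforce ratio by educational attainment of employees in ICT small and medium-sized enterprises, taken from the 2021 ICT SME Survey Report by the Ministry of Science and ICT (과기정통부), the National IT Industry Promotion Agency (정보통신산업진흥원), and the Korea Venture Business Association (벤처기업협회).
URL: https://www.data.go.kr/data/15105584/fileData.do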

Alerts

고졸이하 is highly overall correlated with 전문대졸 and 3 other fields (High correlation)
전문대졸 is highly overall correlated with 고졸이하 and 3 other fields (High correlation)
대졸 is highly overall correlated with 고졸이하 and 3 other fields (High correlation)
대학원졸(석사) is highly overall correlated with 고졸이하 and 3 other fields (High correlation)
대학원졸(박사) is highly overall correlated with 고졸이하 and 3 other fields (High correlation)
상세구분 has unique values (Unique)
고졸이하 has unique values (Unique)
전문대졸 has unique values (Unique)
대졸 has unique values (Unique)
대학원졸(석사) has unique values (Unique)
대학원졸(박사) has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 22:44:44.100011
Analysis finished: 2023-12-12 22:44:46.696837
Duration: 2.6 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

구분 (category)
Categorical

Distinct: 5
Distinct (%): 15.6%
Missing: 0
Missing (%): 0.0%
Memory size: 388.0 B

Length

Max length: 5
Median length: 4
Mean length: 4
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 업종
2nd row: 업종
3rd row: 업종
4th row: 세부업종
5th row: 세부업종

Common Values

Value | Count | Frequency (%)
세부업종 (sub-industry) | 11 | 34.4%
매출액규모 (revenue size) | 6 | 18.8%
종사자규모 (employee size) | 6 | 18.8%
권역별 (region) | 6 | 18.8%
업종 (industry) | 3 | 9.4%
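
The percentages in the Common Values table are each category's count divided by the 32 observations. The following is a minimal sketch of that arithmetic in Python, using only the counts shown above. The variable names are illustrative and are not part of the profiling tool.

```python
from collections import Counter

# Category counts as listed in the Common Values table for 구분.
counts = Counter({"세부업종": 11, "매출액규모": 6, "종사자규모": 6, "권역별": 6, "업종": 3})

total = sum(counts.values())  # number of observations

# Relative frequency of each category, in percent.
frequency_pct = {value: 100 * n / total for value, n in counts.items()}

# Print one row per category, most frequent first.
for value, n in counts.most_common():
    print(f"{value}\t{n}\t{frequency_pct[value]:.1f}%")
```

Because no category occurs exactly once, the report lists 0 unique values for this variable even though it has 5 distinct values.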

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values of 구분]

상세구분 (detailed category)
Text

UNIQUE

Distinct: 32
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 388.0 B

Length

Max length: 11
Median length: 9
Mean length: 7.03125
Min length: 2

Characters and Unicode

Total characters: 225
Distinct characters: 72
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
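
As an illustration of those character properties, the sketch below counts characters by Unicode general category with Python's standard `unicodedata` module. It uses three 상세구분 values that appear in the sample rows of this report. The helper `category_counts` and the label mapping are illustrative, and the code is not the profiling tool's own implementation.

```python
import unicodedata
from collections import Counter

# Three 상세구분 values taken from the report's sample rows.
values = ["컴퓨터 및 주변기기", "인천/경기", "20~49명"]

# Long names for the general-category codes that occur in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Sm": "Math Symbol",
    "Lu": "Uppercase Letter",
}

def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category code (e.g. 'Lo', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)

# Accumulate the counts over all example values.
totals = Counter()
for v in values:
    totals += category_counts(v)

for code, n in totals.most_common():
    print(CATEGORY_NAMES.get(code, code), n)
```

Hangul syllables fall under Other Letter, `/` under Other Punctuation, and `~` under Math Symbol, which is why those categories appear in the tables for this variable.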

Unique

Unique: 32
Unique (%): 100.0%

Sample

1st row: 정보통신방송서비스
2nd row: 정보통신방송기기
3rd row: 소프트웨어
4th row: 통신서비스
5th row: 방송서비스

Words

Value | Count | Frequency (%)
미만 | 5 | 11.1%
(missing) | 3 | 6.7%
소프트웨어 | 2 | 4.4%
50~99명 | 1 | 2.2%
10억~50억 | 1 | 2.2%
50억~100억 | 1 | 2.2%
100억 | 1 | 2.2%
이상 | 1 | 2.2%
1~9명 | 1 | 2.2%
10~19명 | 1 | 2.2%
Other values (28) | 28 | 62.2%

Most occurring characters

Value | Count | Frequency (%)
0 | 15 | 6.7%
(space) | 13 | 5.8%
(missing) | 12 | 5.3%
1 | 10 | 4.4%
(missing) | 10 | 4.4%
/ | 9 | 4.0%
~ | 9 | 4.0%
9 | 7 | 3.1%
(missing) | 6 | 2.7%
(missing) | 6 | 2.7%
Other values (62) | 128 | 56.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 151 | 67.1%
Decimal Number | 41 | 18.2%
Space Separator | 13 | 5.8%
Other Punctuation | 9 | 4.0%
Math Symbol | 9 | 4.0%
Uppercase Letter | 2 | 0.9%

Most frequent character per category

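Other Letter
Value | Count | Frequency (%)
(missing) | 12 | 7.9%
(missing) | 10 | 6.6%
(missing) | 6 | 4.0%
(missing) | 6 | 4.0%
(missing) | 5 | 3.3%
(missing) | 5 | 3.3%
(missing) | 5 | 3.3%
(missing) | 5 | 3.3%
(missing) | 5 | 3.3%
(missing) | 5 | 3.3%
Other values (50) | 87 | 57.6%

Decimal Number
Value | Count | Frequency (%)
0 | 15 | 36.6%
1 | 10 | 24.4%
9 | 7 | 17.1%
5 | 5 | 12.2%
2 | 2 | 4.9%
3 | 1 | 2.4%
4 | 1 | 2.4%

Uppercase Letter
Value | Count | Frequency (%)
T | 1 | 50.0%
I | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 13 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 9 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 9 | 100.0%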

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 151 | 67.1%
Common | 72 | 32.0%
Latin | 2 | 0.9%

Most frequent character per script

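Hangul
Value | Count | Frequency (%)
(missing) | 12 | 7.9%
(missing) | 10 | 6.6%
(missing) | 6 | 4.0%
(missing) | 6 | 4.0%
(missing) | 5 | 3.3%
(missing) | 5 | 3.3%
(missing) | 5 | 3.3%
(missing) | 5 | 3.3%
(missing) | 5 | 3.3%
(missing) | 5 | 3.3%
Other values (50) | 87 | 57.6%

Common
Value | Count | Frequency (%)
0 | 15 | 20.8%
(space) | 13 | 18.1%
1 | 10 | 13.9%
/ | 9 | 12.5%
~ | 9 | 12.5%
9 | 7 | 9.7%
5 | 5 | 6.9%
2 | 2 | 2.8%
3 | 1 | 1.4%
4 | 1 | 1.4%

Latin
Value | Count | Frequency (%)
T | 1 | 50.0%
I | 1 | 50.0%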

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 151 | 67.1%
ASCII | 74 | 32.9%

Most frequent character per block

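ASCII
Value | Count | Frequency (%)
0 | 15 | 20.3%
(space) | 13 | 17.6%
1 | 10 | 13.5%
/ | 9 | 12.2%
~ | 9 | 12.2%
9 | 7 | 9.5%
5 | 5 | 6.8%
2 | 2 | 2.7%
3 | 1 | 1.4%
4 | 1 | 1.4%
Other values (2) | 2 | 2.7%

Hangul
Value | Count | Frequency (%)
(missing) | 12 | 7.9%
(missing) | 10 | 6.6%
(missing) | 6 | 4.0%
(missing) | 6 | 4.0%
(missing) | 5 | 3.3%
(missing) | 5 | 3.3%
(missing) | 5 | 3.3%
(missing) | 5 | 3.3%
(missing) | 5 | 3.3%
(missing) | 5 | 3.3%
Other values (50) | 87 | 57.6%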

고졸이하 (high school graduate or below)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 32
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 24068.312
Minimum: 715
Maximum: 123133
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 420.0 B

Quantile statistics

Minimum: 715
5-th percentile: 1680.95
Q1: 5162.75
Median: 18217
Q3: 31422.25
95-th percentile: 69202.7
Maximum: 123133
Range: 122418
Interquartile range (IQR): 26259.5

Descriptive statistics

Standard deviation: 26314.021
Coefficient of variation (CV): 1.0933056
Kurtosis: 5.5304055
Mean: 24068.312
Median Absolute Deviation (MAD): 13202
Skewness: 2.0755066
Sum: 770186
Variance: 6.9242772 × 10^8
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=32)]
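
Several of the statistics listed for 고졸이하 follow directly from others in the same list. The sketch below recomputes the mean, range, IQR, coefficient of variation, and variance from the reported sum, extremes, quartiles, and standard deviation. The inputs are copied from this report and are rounded, so the results agree with the reported figures only to the displayed precision.

```python
# Values copied from the 고졸이하 statistics in this report.
n = 32                          # number of observations
total = 770186                  # Sum
minimum, maximum = 715, 123133
q1, q3 = 5162.75, 31422.25
std = 26314.021                 # Standard deviation (rounded in the report)

mean = total / n                 # arithmetic mean
value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance, reported as about 6.924 x 10^8

print(mean, value_range, iqr, round(cv, 7), f"{variance:.4e}")
```

A coefficient of variation above 1 means the standard deviation exceeds the mean, which fits the strong right skew (skewness about 2.08) reported for this variable.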

Common values

Value | Count | Frequency (%)
4736 | 1 | 3.1%
52274 | 1 | 3.1%
2369 | 1 | 3.1%
6865 | 1 | 3.1%
29296 | 1 | 3.1%
18700 | 1 | 3.1%
72506 | 1 | 3.1%
24300 | 1 | 3.1%
5305 | 1 | 3.1%
22875 | 1 | 3.1%
Other values (22) | 22 | 68.8%

Minimum 10 values

Value | Count | Frequency (%)
715 | 1 | 3.1%
840 | 1 | 3.1%
2369 | 1 | 3.1%
2543 | 1 | 3.1%
2589 | 1 | 3.1%
3181 | 1 | 3.1%
3428 | 1 | 3.1%
4736 | 1 | 3.1%
5305 | 1 | 3.1%
6865 | 1 | 3.1%

Maximum 10 values

Value | Count | Frequency (%)
123133 | 1 | 3.1%
72506 | 1 | 3.1%
66500 | 1 | 3.1%
52274 | 1 | 3.1%
44918 | 1 | 3.1%
42248 | 1 | 3.1%
42209 | 1 | 3.1%
32269 | 1 | 3.1%
31140 | 1 | 3.1%
29296 | 1 | 3.1%

전문대졸 (junior college graduate)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 32
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23285.125
Minimum: 1192
Maximum: 86482
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 420.0 B

Quantile statistics

Minimum: 1192
5-th percentile: 2323.65
Q1: 6702.25
Median: 18637
Q3: 31673.25
95-th percentile: 53169.2
Maximum: 86482
Range: 85290
Interquartile range (IQR): 24971

Descriptive statistics

Standard deviation: 19849.313
Coefficient of variation (CV): 0.85244606
Kurtosis: 1.7322389
Mean: 23285.125
Median Absolute Deviation (MAD): 12140.5
Skewness: 1.214418
Sum: 745124
Variance: 3.9399523 × 10^8
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=32)]

Common values

Value | Count | Frequency (%)
9974 | 1 | 3.1%
42591 | 1 | 3.1%
3578 | 1 | 3.1%
7051 | 1 | 3.1%
18803 | 1 | 3.1%
21941 | 1 | 3.1%
53904 | 1 | 3.1%
43748 | 1 | 3.1%
6125 | 1 | 3.1%
25210 | 1 | 3.1%
Other values (22) | 22 | 68.8%

Minimum 10 values

Value | Count | Frequency (%)
1192 | 1 | 3.1%
2256 | 1 | 3.1%
2379 | 1 | 3.1%
2706 | 1 | 3.1%
3578 | 1 | 3.1%
5589 | 1 | 3.1%
6125 | 1 | 3.1%
6526 | 1 | 3.1%
6761 | 1 | 3.1%
7051 | 1 | 3.1%

Maximum 10 values

Value | Count | Frequency (%)
86482 | 1 | 3.1%
53904 | 1 | 3.1%
52568 | 1 | 3.1%
47464 | 1 | 3.1%
47126 | 1 | 3.1%
43748 | 1 | 3.1%
42591 | 1 | 3.1%
34272 | 1 | 3.1%
30807 | 1 | 3.1%
30733 | 1 | 3.1%

대졸 (university graduate)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 32
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 60836.875
Minimum: 2168
Maximum: 195063
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 420.0 B

Quantile statistics

Minimum: 2168
5-th percentile: 5295.25
Q1: 19411.5
Median: 48297.5
Q3: 84391.75
95-th percentile: 162595.75
Maximum: 195063
Range: 192895
Interquartile range (IQR): 64980.25

Descriptive statistics

Standard deviation: 50504.877
Coefficient of variation (CV): 0.83016883
Kurtosis: 0.55943632
Mean: 60836.875
Median Absolute Deviation (MAD): 30745.5
Skewness: 1.0585458
Sum: 1946780
Variance: 2.5507426 × 10^9
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=32)]

Common values

Value | Count | Frequency (%)
37509 | 1 | 3.1%
109526 | 1 | 3.1%
10963 | 1 | 3.1%
17952 | 1 | 3.1%
34397 | 1 | 3.1%
40073 | 1 | 3.1%
116272 | 1 | 3.1%
169699 | 1 | 3.1%
13786 | 1 | 3.1%
66016 | 1 | 3.1%
Other values (22) | 22 | 68.8%

Minimum 10 values

Value | Count | Frequency (%)
2168 | 1 | 3.1%
5100 | 1 | 3.1%
5455 | 1 | 3.1%
10963 | 1 | 3.1%
11459 | 1 | 3.1%
13786 | 1 | 3.1%
16382 | 1 | 3.1%
17952 | 1 | 3.1%
19898 | 1 | 3.1%
23882 | 1 | 3.1%

Maximum 10 values

Value | Count | Frequency (%)
195063 | 1 | 3.1%
169699 | 1 | 3.1%
156784 | 1 | 3.1%
116272 | 1 | 3.1%
114269 | 1 | 3.1%
110826 | 1 | 3.1%
109526 | 1 | 3.1%
99238 | 1 | 3.1%
79443 | 1 | 3.1%
77539 | 1 | 3.1%
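
The quantile statistics reported for 대졸 are consistent with linear interpolation between neighbouring order statistics. The sketch below is a check under that assumption and is not taken from the tool's source. It uses only the ten smallest and ten largest 대졸 values listed in this report, so it covers only percentiles whose neighbouring ranks fall in those lists. The `percentile` helper and the `known` mapping are illustrative names.

```python
import math

n = 32  # number of observations

# Ten smallest and ten largest 대졸 values as listed in this report.
lowest = [2168, 5100, 5455, 10963, 11459, 13786, 16382, 17952, 19898, 23882]
highest = [195063, 169699, 156784, 116272, 114269, 110826, 109526, 99238, 79443, 77539]

# Map 0-based rank in the sorted data to its value; ranks 10-21 are unknown.
known = {rank: v for rank, v in enumerate(lowest)}
known.update({n - 1 - i: v for i, v in enumerate(highest)})

def percentile(q: float) -> float:
    """Linearly interpolate the q-th percentile (0-100) between adjacent ranks.

    Raises KeyError when a required rank is not among the listed values.
    """
    pos = q * (n - 1) / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return known[lo] + (pos - lo) * (known[hi] - known[lo])

for q in (5, 25, 75, 95):
    print(q, round(percentile(q), 2))
```

The results for the 5th, 25th, 75th, and 95th percentiles match the reported 5295.25, 19411.5, 84391.75, and 162595.75. The median cannot be checked this way because ranks 15 and 16 are not among the listed values.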

대학원졸(석사) (graduate school, master's)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 32
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4311.8438
Minimum: 130
Maximum: 13600
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 420.0 B

Quantile statistics

Minimum: 130
5-th percentile: 315.9
Q1: 1273.5
Median: 3599
Q3: 6533.75
95-th percentile: 10741.9
Maximum: 13600
Range: 13470
Interquartile range (IQR): 5260.25

Descriptive statistics

Standard deviation: 3499.0286
Coefficient of variation (CV): 0.81149243
Kurtosis: 0.33493486
Mean: 4311.8438
Median Absolute Deviation (MAD): 2365
Skewness: 0.93697597
Sum: 137979
Variance: 12243201
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=32)]

Common values

Value | Count | Frequency (%)
2351 | 1 | 3.1%
7352 | 1 | 3.1%
750 | 1 | 3.1%
1285 | 1 | 3.1%
3815 | 1 | 3.1%
6456 | 1 | 3.1%
5287 | 1 | 3.1%
10003 | 1 | 3.1%
1239 | 1 | 3.1%
5839 | 1 | 3.1%
Other values (22) | 22 | 68.8%

Minimum 10 values

Value | Count | Frequency (%)
130 | 1 | 3.1%
295 | 1 | 3.1%
333 | 1 | 3.1%
450 | 1 | 3.1%
750 | 1 | 3.1%
1108 | 1 | 3.1%
1229 | 1 | 3.1%
1239 | 1 | 3.1%
1285 | 1 | 3.1%
1926 | 1 | 3.1%

Maximum 10 values

Value | Count | Frequency (%)
13600 | 1 | 3.1%
11645 | 1 | 3.1%
10003 | 1 | 3.1%
8706 | 1 | 3.1%
8504 | 1 | 3.1%
7514 | 1 | 3.1%
7352 | 1 | 3.1%
6767 | 1 | 3.1%
6456 | 1 | 3.1%
5839 | 1 | 3.1%

대학원졸(박사) (graduate school, doctorate)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 32
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 782.25
Minimum: 13
Maximum: 2973
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 420.0 B

Quantile statistics

Minimum: 13
5-th percentile: 28.3
Q1: 220.25
Median: 649
Q3: 993.25
95-th percentile: 1961.55
Maximum: 2973
Range: 2960
Interquartile range (IQR): 773

Descriptive statistics

Standard deviation: 710.24103
Coefficient of variation (CV): 0.90794635
Kurtosis: 1.5836556
Mean: 782.25
Median Absolute Deviation (MAD): 416.5
Skewness: 1.2893809
Sum: 25032
Variance: 504442.32
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=32)]

Common values

Value | Count | Frequency (%)
161 | 1 | 3.1%
1054 | 1 | 3.1%
197 | 1 | 3.1%
86 | 1 | 3.1%
579 | 1 | 3.1%
1438 | 1 | 3.1%
763 | 1 | 3.1%
1944 | 1 | 3.1%
228 | 1 | 3.1%
841 | 1 | 3.1%
Other values (22) | 22 | 68.8%

Minimum 10 values

Value | Count | Frequency (%)
13 | 1 | 3.1%
25 | 1 | 3.1%
31 | 1 | 3.1%
70 | 1 | 3.1%
86 | 1 | 3.1%
117 | 1 | 3.1%
161 | 1 | 3.1%
197 | 1 | 3.1%
228 | 1 | 3.1%
237 | 1 | 3.1%

Maximum 10 values

Value | Count | Frequency (%)
2973 | 1 | 3.1%
1983 | 1 | 3.1%
1944 | 1 | 3.1%
1872 | 1 | 3.1%
1777 | 1 | 3.1%
1438 | 1 | 3.1%
1395 | 1 | 3.1%
1054 | 1 | 3.1%
973 | 1 | 3.1%
841 | 1 | 3.1%

Interactions

[Figures: 25 pairwise interaction plots of the five numeric variables]

Correlations

[Figure: correlation heatmap]

Variable | 구분 | 상세구분 | 고졸이하 | 전문대졸 | 대졸 | 대학원졸(석사) | 대학원졸(박사)
구분 | 1.000 | 1.000 | 0.286 | 0.593 | 0.584 | 0.663 | 0.263
상세구분 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
고졸이하 | 0.286 | 1.000 | 1.000 | 0.840 | 0.605 | 0.808 | 0.655
전문대졸 | 0.593 | 1.000 | 0.840 | 1.000 | 0.868 | 0.868 | 0.860
대졸 | 0.584 | 1.000 | 0.605 | 0.868 | 1.000 | 0.896 | 0.896
대학원졸(석사) | 0.663 | 1.000 | 0.808 | 0.868 | 0.896 | 1.000 | 0.833
대학원졸(박사) | 0.263 | 1.000 | 0.655 | 0.860 | 0.896 | 0.833 | 1.000

[Figure: correlation heatmap]

Variable | 고졸이하 | 전문대졸 | 대졸 | 대학원졸(석사) | 대학원졸(박사) | 구분
고졸이하 | 1.000 | 0.943 | 0.840 | 0.834 | 0.760 | 0.195
전문대졸 | 0.943 | 1.000 | 0.959 | 0.943 | 0.870 | 0.402
대졸 | 0.840 | 0.959 | 1.000 | 0.941 | 0.889 | 0.395
대학원졸(석사) | 0.834 | 0.943 | 0.941 | 1.000 | 0.938 | 0.314
대학원졸(박사) | 0.760 | 0.870 | 0.889 | 0.938 | 1.000 | 0.127
구분 | 0.195 | 0.402 | 0.395 | 0.314 | 0.127 | 1.000
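
The report does not state which coefficient each matrix uses. As a general illustration of how a pairwise coefficient between two numeric columns can be computed, the sketch below implements Pearson and Spearman correlation with the standard library only. It uses the 고졸이하 and 전문대졸 values of the first ten sample rows of this report, so its output is not expected to reproduce the matrix entries, which are based on all 32 rows. The function names are illustrative.

```python
import math

# 고졸이하 and 전문대졸 for sample rows 0-9 of this report (10 of 32 rows).
hs = [4736, 123133, 26169, 715, 840, 3181, 42209, 2543, 8453, 3428]
college = [9974, 86482, 52568, 1192, 2256, 6526, 27511, 2379, 6761, 2706]

def pearson(x, y):
    """Pearson correlation: covariance divided by the product of spreads."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """Rank values from 1 to n; assumes no ties (true for these columns)."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

print("pearson", round(pearson(hs, college), 3))
print("spearman", round(spearman(hs, college), 3))
```

Rank-based coefficients such as Spearman are less sensitive to single large rows than Pearson, which matters for right-skewed counts like these.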

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

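First rows

Index | 구분 | 상세구분 | 고졸이하 | 전문대졸 | 대졸 | 대학원졸(석사) | 대학원졸(박사)
0 | 업종 | 정보통신방송서비스 | 4736 | 9974 | 37509 | 2351 | 161
1 | 업종 | 정보통신방송기기 | 123133 | 86482 | 156784 | 11645 | 1872
2 | 업종 | 소프트웨어 | 26169 | 52568 | 195063 | 13600 | 2973
3 | 세부업종 | 통신서비스 | 715 | 1192 | 2168 | 130 | 13
4 | 세부업종 | 방송서비스 | 840 | 2256 | 11459 | 295 | 31
5 | 세부업종 | 정보서비스 | 3181 | 6526 | 23882 | 1926 | 117
6 | 세부업종 | 전자부품 | 42209 | 27511 | 52308 | 2987 | 567
7 | 세부업종 | 컴퓨터 및 주변기기 | 2543 | 2379 | 5455 | 450 | 70
8 | 세부업종 | 통신 및 방송기기 | 8453 | 6761 | 16382 | 1108 | 237
9 | 세부업종 | 영상 및 음향기기 | 3428 | 2706 | 5100 | 333 | 25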
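
Last rows

Index | 구분 | 상세구분 | 고졸이하 | 전문대졸 | 대졸 | 대학원졸(석사) | 대학원졸(박사)
22 | 종사자규모 | 20~49명 | 42248 | 34272 | 99238 | 7514 | 1983
23 | 종사자규모 | 50~99명 | 32269 | 29397 | 70449 | 4857 | 497
24 | 종사자규모 | 100~299명 | 22875 | 25210 | 66016 | 5839 | 841
25 | 종사자규모 | 300명이상 | 5305 | 6125 | 13786 | 1239 | 228
26 | 권역별 | 서울 | 24300 | 43748 | 169699 | 10003 | 1944
27 | 권역별 | 인천/경기 | 72506 | 53904 | 116272 | 5287 | 763
28 | 권역별 | 대전/세종/충청/강원 | 18700 | 21941 | 40073 | 6456 | 1438
29 | 권역별 | 부산/울산/경남 | 29296 | 18803 | 34397 | 3815 | 579
30 | 권역별 | 대구/경북 | 6865 | 7051 | 17952 | 1285 | 86
31 | 권역별 | 광주/전라/제주 | 2369 | 3578 | 10963 | 750 | 197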