Overview

Dataset statistics

Number of variables: 4
Number of observations: 30
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.1 KiB
Average record size in memory: 38.4 B

Variable types

Numeric: 1
Categorical: 1
Text: 2

Dataset

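Description: To promote a workplace culture that is good for women to work in, Incheon-type women-friendly companies were selected from 2020 to 2023; this dataset provides the status of the selected companies.
URL: https://www.data.go.kr/data/15104137/fileData.do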

Alerts

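High correlation: 연번 (serial number) is highly correlated overall with 연도 (year)
High correlation: 연도 (year) is highly correlated overall with 연번 (serial number)
Unique: 연번 (serial number) has unique values
Unique: 상호 (company name) has unique values
Unique: 사업자번호 (business registration number) has unique values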

Reproduction

Analysis started: 2023-12-11 23:06:18.382641
Analysis finished: 2023-12-11 23:06:18.834471
Duration: 0.45 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
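
As a quick consistency check, the reported duration can be recomputed from the two timestamps. The sketch below uses only Python's standard library; the variable names are illustrative and not part of the report.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of the report.
started = datetime.fromisoformat("2023-12-11 23:06:18.382641")
finished = datetime.fromisoformat("2023-12-11 23:06:18.834471")

# Elapsed wall-clock time between start and finish, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")  # 0.45 seconds
```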

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 30
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.5
Minimum: 1
Maximum: 30
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 1
5-th percentile: 2.45
Q1: 8.25
Median: 15.5
Q3: 22.75
95-th percentile: 28.55
Maximum: 30
Range: 29
Interquartile range (IQR): 14.5

Descriptive statistics

Standard deviation: 8.8034084
Coefficient of variation (CV): 0.56796183
Kurtosis: -1.2
Mean: 15.5
Median Absolute Deviation (MAD): 7.5
Skewness: 0
Sum: 465
Variance: 77.5
Monotonicity: Strictly increasing
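
The quantile and descriptive statistics for this variable can be recomputed by hand as a sanity check. The sketch below assumes, from the reported minimum, maximum, distinct count, and strictly increasing order, that the column holds the integers 1 through 30. It uses Python's standard library rather than ydata-profiling itself, and assumes sample (n - 1) variance and linearly interpolated quantiles.

```python
import statistics

# Assumption: the column is the sequence 1..30 (30 distinct, strictly increasing values).
values = list(range(1, 31))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (divides by n - 1)
std_dev = statistics.stdev(values)       # square root of the sample variance
cv = std_dev / mean                      # coefficient of variation

median = statistics.median(values)
# Median absolute deviation: the median of |x - median|.
mad = statistics.median(abs(v - median) for v in values)

# "inclusive" interpolates linearly between the observed minimum and maximum.
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
twentieths = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]
iqr = q3 - q1

print(mean, variance, round(std_dev, 7), round(cv, 8), mad)
print(p5, q1, q2, q3, p95, iqr, total)
```

Under these assumptions the recomputed values agree with the report (for example, variance 77.5 and IQR 14.5).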
Histogram with fixed size bins (bins=30)
Value              Count  Frequency (%)
1                  1      3.3%
17                 1      3.3%
30                 1      3.3%
29                 1      3.3%
28                 1      3.3%
27                 1      3.3%
26                 1      3.3%
25                 1      3.3%
24                 1      3.3%
23                 1      3.3%
Other values (20)  20     66.7%
Value              Count  Frequency (%)
1                  1      3.3%
2                  1      3.3%
3                  1      3.3%
4                  1      3.3%
5                  1      3.3%
6                  1      3.3%
7                  1      3.3%
8                  1      3.3%
9                  1      3.3%
10                 1      3.3%
Value              Count  Frequency (%)
30                 1      3.3%
29                 1      3.3%
28                 1      3.3%
27                 1      3.3%
26                 1      3.3%
25                 1      3.3%
24                 1      3.3%
23                 1      3.3%
22                 1      3.3%
21                 1      3.3%

연도 (year)
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

2020: 10
2022: 10
2023: 10

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%
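
The report shows Distinct = 3 but Unique = 0 for this variable, which can look contradictory. A minimal sketch of the distinction follows, assuming "distinct" counts the different values present and "unique" counts values that occur exactly once; the value counts are taken from the report, while the variable names are illustrative.

```python
from collections import Counter

# Value counts from the report: three years, ten rows each.
years = ["2020"] * 10 + ["2022"] * 10 + ["2023"] * 10

counts = Counter(years)
n_rows = len(years)

# Distinct: how many different values appear at all.
distinct = len(counts)
# Unique (assumed definition): how many values appear exactly once.
unique = sum(1 for c in counts.values() if c == 1)

distinct_pct = 100 * distinct / n_rows
unique_pct = 100 * unique / n_rows
print(distinct, distinct_pct, unique, unique_pct)  # 3 10.0 0 0.0
```

Because every year occurs ten times, no value is unique even though three distinct values exist.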

Sample

1st row: 2020
2nd row: 2020
3rd row: 2020
4th row: 2020
5th row: 2020

Common Values

Value  Count  Frequency (%)
2020   10     33.3%
2022   10     33.3%
2023   10     33.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
2020   10     33.3%
2022   10     33.3%
2023   10     33.3%

상호 (company name)
Text

UNIQUE 

Distinct: 30
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 16
Median length: 11
Mean length: 6.8
Min length: 3

Characters and Unicode

Total characters: 204
Distinct characters: 99
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
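
To illustrate how such character properties can be read off, the sketch below tallies Unicode general categories for the five sample company names listed in this report. It uses Python's standard `unicodedata` module, which is an assumption about tooling and not necessarily how the profiler computes its figures.

```python
import unicodedata
from collections import Counter

# The five sample company names shown in the report.
names = ["㈜세문스크린", "㈜하이베로", "㈜원웨드", "우먼산업", "㈜토마스"]

# Two-letter general category codes: "Lo" = Other Letter,
# "So" = Other Symbol, "Zs" = Space Separator.
category_counts = Counter(
    unicodedata.category(ch) for name in names for ch in name
)

for code, count in category_counts.most_common():
    print(code, count)
```

In this sample the Hangul syllables fall under Other Letter and the ㈜ sign under Other Symbol, matching the category names used in the report.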

Unique

Unique: 30
Unique (%): 100.0%

Sample

1st row: ㈜세문스크린
2nd row: ㈜하이베로
3rd row: ㈜원웨드
4th row: 우먼산업
5th row: ㈜토마스
Value              Count  Frequency (%)
주식회사               7      17.5%
㈜세문스크린             1      2.5%
태양이엔에스㈜            1      2.5%
협동조합               1      2.5%
꿈꾸는                1      2.5%
문화놀이터              1      2.5%
                   1      2.5%
㈜미가에스티             1      2.5%
굿모닝바이오             1      2.5%
엑스파워네트웍스           1      2.5%
Other values (24)  24     60.0%

Most occurring characters

Value              Count  Frequency (%)
                   17     8.3%
                   10     4.9%
                   9      4.4%
                   8      3.9%
                   8      3.9%
                   7      3.4%
                   7      3.4%
                   7      3.4%
                   5      2.5%
                   4      2.0%
Other values (89)  122    59.8%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       177    86.8%
Other Symbol       17     8.3%
Space Separator    10     4.9%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
                   9      5.1%
                   8      4.5%
                   8      4.5%
                   7      4.0%
                   7      4.0%
                   7      4.0%
                   5      2.8%
                   4      2.3%
                   3      1.7%
                   3      1.7%
Other values (87)  116    65.5%
Other Symbol
Value              Count  Frequency (%)
                   17     100.0%
Space Separator
Value              Count  Frequency (%)
                   10     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  194    95.1%
Common  10     4.9%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
                   17     8.8%
                   9      4.6%
                   8      4.1%
                   8      4.1%
                   7      3.6%
                   7      3.6%
                   7      3.6%
                   5      2.6%
                   4      2.1%
                   3      1.5%
Other values (88)  119    61.3%
Common
Value              Count  Frequency (%)
                   10     100.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  177    86.8%
None    17     8.3%
ASCII   10     4.9%

Most frequent character per block

None
Value              Count  Frequency (%)
                   17     100.0%
ASCII
Value              Count  Frequency (%)
                   10     100.0%
Hangul
Value              Count  Frequency (%)
                   9      5.1%
                   8      4.5%
                   8      4.5%
                   7      4.0%
                   7      4.0%
                   7      4.0%
                   5      2.8%
                   4      2.3%
                   3      1.7%
                   3      1.7%
Other values (87)  116    65.5%

사업자번호 (business registration number)
Text

UNIQUE 

Distinct: 30
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12
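
Every value has the same length of 12, which suggests a fixed format. The sketch below checks the five sample values listed in this report against a 3-2-5 digit grouping separated by hyphens; that pattern is inferred from the samples and is an assumption, not something the report states.

```python
import re

# Sample values from the report.
business_numbers = [
    "131-86-41240",
    "122-51-95112",
    "707-87-00868",
    "131-16-76784",
    "131-86-45421",
]

# Assumed format: three digits, two digits, five digits, hyphen-separated.
PATTERN = re.compile(r"\d{3}-\d{2}-\d{5}")

lengths = [len(b) for b in business_numbers]
all_match = all(PATTERN.fullmatch(b) is not None for b in business_numbers)
print(min(lengths), max(lengths), all_match)  # 12 12 True
```

A check like this could be run over the full column before treating the field as a reliable key.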

Characters and Unicode

Total characters: 360
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 30
Unique (%): 100.0%

Sample

1st row: 131-86-41240
2nd row: 122-51-95112
3rd row: 707-87-00868
4th row: 131-16-76784
5th row: 131-86-45421
Value              Count  Frequency (%)
131-86-41240       1      3.3%
122-51-95112       1      3.3%
122-86-23971       1      3.3%
137-06-79932       1      3.3%
827-88-01703       1      3.3%
112-29-75654       1      3.3%
122-81-89930       1      3.3%
344-86-01432       1      3.3%
121-86-17326       1      3.3%
137-86-07935       1      3.3%
Other values (20)  20     66.7%

Most occurring characters

Value  Count  Frequency (%)
1      60     16.7%
-      60     16.7%
8      42     11.7%
6      34     9.4%
2      34     9.4%
0      33     9.2%
7      29     8.1%
3      27     7.5%
4      15     4.2%
5      13     3.6%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    300    83.3%
Dash Punctuation  60     16.7%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      60     20.0%
8      42     14.0%
6      34     11.3%
2      34     11.3%
0      33     11.0%
7      29     9.7%
3      27     9.0%
4      15     5.0%
5      13     4.3%
9      13     4.3%
Dash Punctuation
Value  Count  Frequency (%)
-      60     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  360    100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
1      60     16.7%
-      60     16.7%
8      42     11.7%
6      34     9.4%
2      34     9.4%
0      33     9.2%
7      29     8.1%
3      27     7.5%
4      15     4.2%
5      13     3.6%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  360    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1      60     16.7%
-      60     16.7%
8      42     11.7%
6      34     9.4%
2      34     9.4%
0      33     9.2%
7      29     8.1%
3      27     7.5%
4      15     4.2%
5      13     3.6%

Interactions


Correlations

           연번     연도     상호     사업자번호
연번         1.000  0.926  1.000  1.000
연도         0.926  1.000  1.000  1.000
상호         1.000  1.000  1.000  1.000
사업자번호      1.000  1.000  1.000  1.000

           연번     연도
연번         1.000  0.773
연도         0.773  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

    연번    연도    상호            사업자번호
0   1     2020  ㈜세문스크린        131-86-41240
1   2     2020  ㈜하이베로         122-51-95112
2   3     2020  ㈜원웨드          707-87-00868
3   4     2020  우먼산업          131-16-76784
4   5     2020  ㈜토마스          131-86-45421
5   6     2020  현대산업          130-17-72180
6   7     2020  ㈜지오테크놀로지      131-86-36602
7   8     2020  ㈜중원인더스트리      122-86-11422
8   9     2020  ㈜에이치비         107-81-68367
9   10    2020  ㈜명품크리너스       758-88-00167
    연번    연도    상호            사업자번호
20  21    2023  ㈜미가에스티        121-86-39806
21  22    2023  태양이엔에스㈜       137-86-07935
22  23    2023  주식회사 굿모닝바이오   121-86-17326
23  24    2023  ㈜비바           344-86-01432
24  25    2023  ㈜대명아이넥스       122-81-89930
25  26    2023  대한계전          112-29-75654
26  27    2023  주식회사 빌리고      827-88-01703
27  28    2023  강화섬김치         137-06-79932
28  29    2023  ㈜우진피앤티        122-86-23971
29  30    2023  ㈜경우정밀         131-86-36223