Overview

Dataset statistics

Number of variables: 5
Number of observations: 2926
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 123.0 KiB
Average record size in memory: 43.0 B

Variable types

Numeric: 2
Categorical: 3

Dataset

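Description: This dataset records whether companies that received startup support generated sales revenue in the year they received it. Each company has a different founding date, so the extent of revenue generation may vary by company depending on how long it has been in business.
Author: 중소벤처기업진흥공단 (Korea SMEs and Startups Agency)
URL: https://www.data.go.kr/data/15018331/fileData.do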

Alerts

비고 has a constant value [Constant]
연번 is highly overall correlated with 지원년도 [High correlation]
지원년도 is highly overall correlated with 연번 [High correlation]
매출발생여부 is highly imbalanced (54.8%) [Imbalance]
연번 has unique values [Unique]
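
The 54.8% in the imbalance alert can be reproduced from the class counts listed later in this report (2649 and 277). The sketch below assumes the score is one minus the Shannon entropy of the class shares, normalized by the maximum entropy for that number of classes. That assumption matches the reported figure, but the function name `imbalance_score` is illustrative and is not taken from the profiling library.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class shares and k is the number of classes.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no balance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Class counts for 매출발생여부 as listed in this report.
counts = [2649, 277]
score = imbalance_score(counts)
print(f"{score:.1%}")
```

A score of 0% would correspond to a 50/50 split between the two classes.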

Reproduction

Analysis started: 2023-12-12 23:37:13.385149
Analysis finished: 2023-12-12 23:37:14.270567
Duration: 0.89 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
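
A report of this kind could be regenerated roughly as follows. This is a minimal sketch, not the configuration actually used (that lives in config.json): the function name `build_report`, the CSV path, the output file name, and the report title are all hypothetical, and the source file's encoding is not stated in this report.

```python
def build_report(csv_path, output_path="report.html"):
    """Profile a CSV file and write an HTML report.

    Sketch only: csv_path and output_path are hypothetical, and the
    settings here are defaults rather than those recorded in config.json.
    """
    # Imported lazily so the function can be defined without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)  # pass encoding=... if the file is not UTF-8
    report = ProfileReport(
        df,
        title="Profiling report",  # hypothetical title
        dataset={
            # Metadata shown in the "Dataset" section of the report.
            "author": "중소벤처기업진흥공단",
            "url": "https://www.data.go.kr/data/15018331/fileData.do",
        },
    )
    report.to_file(output_path)
    return report
```

Calling `build_report("data.csv")` with the downloaded file would write `report.html` next to the script, assuming pandas and ydata-profiling are installed.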

Variables

연번 (sequence number)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 2926
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1463.5
Minimum: 1
Maximum: 2926
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 25.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 147.25
Q1: 732.25
Median: 1463.5
Q3: 2194.75
95-th percentile: 2779.75
Maximum: 2926
Range: 2925
Interquartile range (IQR): 1462.5

Descriptive statistics

Standard deviation: 844.80777
Coefficient of variation (CV): 0.57725164
Kurtosis: -1.2
Mean: 1463.5
Median Absolute Deviation (MAD): 731.5
Skewness: 0
Sum: 4282201
Variance: 713700.17
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Common values

Value                  Count   Frequency (%)
1                      1       < 0.1%
1945                   1       < 0.1%
1947                   1       < 0.1%
1948                   1       < 0.1%
1949                   1       < 0.1%
1950                   1       < 0.1%
1951                   1       < 0.1%
1952                   1       < 0.1%
1953                   1       < 0.1%
1954                   1       < 0.1%
Other values (2916)    2916    99.7%

Minimum 10 values

Value   Count   Frequency (%)
1       1       < 0.1%
2       1       < 0.1%
3       1       < 0.1%
4       1       < 0.1%
5       1       < 0.1%
6       1       < 0.1%
7       1       < 0.1%
8       1       < 0.1%
9       1       < 0.1%
10      1       < 0.1%

Maximum 10 values

Value   Count   Frequency (%)
2926    1       < 0.1%
2925    1       < 0.1%
2924    1       < 0.1%
2923    1       < 0.1%
2922    1       < 0.1%
2921    1       < 0.1%
2920    1       < 0.1%
2919    1       < 0.1%
2918    1       < 0.1%
2917    1       < 0.1%
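
The statistics reported for 연번 are exactly what the consecutive integers 1 through 2926 would produce, which supports reading the column as a plain row counter. The sketch below rebuilds that sequence and recomputes several of the reported figures with the standard library; the variance is the sample variance, and the quartiles use linear interpolation over the inclusive range.

```python
import statistics

n = 2926
values = list(range(1, n + 1))  # assumed content of 연번: 1, 2, ..., 2926

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (divides by n - 1)
std = statistics.stdev(values)
# Quartiles with linear interpolation between the closest ranks.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")

print(total, mean, round(variance, 2), round(std, 5))
print(q1, median, q3, q3 - q1)
```

For consecutive integers 1..n the sample variance has the closed form n(n + 1) / 12, which for n = 2926 gives the reported 713700.17.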

지원년도 (support year)
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 23.0 KiB
2021: 1044
2020: 983
2022: 899

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020
2nd row: 2020
3rd row: 2020
4th row: 2020
5th row: 2020

Common Values

Value   Count   Frequency (%)
2021    1044    35.7%
2020    983     33.6%
2022    899     30.7%

Length

Histogram of lengths of the category


일련번호 (serial number)
Real number (ℝ)

Distinct: 1044
Distinct (%): 35.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1489.9781
Minimum: 1001
Maximum: 2044
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 25.8 KiB

Quantile statistics

Minimum: 1001
5-th percentile: 1049.25
Q1: 1244.25
Median: 1488
Q3: 1732
95-th percentile: 1940.75
Maximum: 2044
Range: 1043
Interquartile range (IQR): 487.75

Descriptive statistics

Standard deviation: 284.69335
Coefficient of variation (CV): 0.19107217
Kurtosis: -1.1515515
Mean: 1489.9781
Median Absolute Deviation (MAD): 244
Skewness: 0.036302026
Sum: 4359676
Variance: 81050.305
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                  Count   Frequency (%)
1001                   3       0.1%
1564                   3       0.1%
1594                   3       0.1%
1595                   3       0.1%
1596                   3       0.1%
1597                   3       0.1%
1598                   3       0.1%
1599                   3       0.1%
1600                   3       0.1%
1601                   3       0.1%
Other values (1034)    2896    99.0%

Minimum 10 values

Value   Count   Frequency (%)
1001    3       0.1%
1002    3       0.1%
1003    3       0.1%
1004    3       0.1%
1005    3       0.1%
1006    3       0.1%
1007    3       0.1%
1008    3       0.1%
1009    3       0.1%
1010    3       0.1%

Maximum 10 values

Value   Count   Frequency (%)
2044    1       < 0.1%
2043    1       < 0.1%
2042    1       < 0.1%
2041    1       < 0.1%
2040    1       < 0.1%
2039    1       < 0.1%
2038    1       < 0.1%
2037    1       < 0.1%
2036    1       < 0.1%
2035    1       < 0.1%
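
The figures for 일련번호 are consistent with a counter that restarts at 1001 for each 지원년도. This is an inference from the numbers in this report, not something the source states. The sketch below rebuilds the column under that assumption, using the per-year row counts reported for 지원년도, and checks it against the reported sum, distinct count, median, and maximum.

```python
import statistics
from collections import Counter

# Rows per 지원년도, as reported in this document.
rows_per_year = {2020: 983, 2021: 1044, 2022: 899}

# Assumption: within each year, 일련번호 runs 1001, 1002, ... without gaps.
serials = []
for year, count in rows_per_year.items():
    serials.extend(range(1001, 1001 + count))

total = sum(serials)
distinct = len(set(serials))
mean = total / len(serials)
median = statistics.median(serials)
count_1001 = Counter(serials)[1001]  # 1001 should appear once per year

print(len(serials), total, distinct, max(serials), median, round(mean, 4))
```

Under this reading, the 1044 distinct values simply reflect the largest year (2021), and values up to 1899 appear three times because all three years reach at least that far.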

매출발생여부 (whether revenue was generated)
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 23.0 KiB
매출발생 (revenue generated): 2649
매출미발생 (no revenue generated): 277

Length

Max length: 5
Median length: 4
Mean length: 4.0946685
Min length: 4
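
The mean length above is a count-weighted average of the two category string lengths (4 and 5 characters). A small sketch of that arithmetic, using the counts listed for this variable:

```python
# Category counts for 매출발생여부 as listed in this report.
counts = {"매출발생": 2649, "매출미발생": 277}

total = sum(counts.values())
# Weight each category's character length by how often it occurs.
mean_length = sum(len(value) * count for value, count in counts.items()) / total

print(round(mean_length, 7))
```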

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 매출발생
2nd row: 매출발생
3rd row: 매출발생
4th row: 매출발생
5th row: 매출발생

Common Values

Value        Count   Frequency (%)
매출발생     2649    90.5%
매출미발생   277     9.5%

Length

Histogram of lengths of the category


비고 (remarks)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 23.0 KiB
당해년도매출발생여부 (whether revenue was generated in the support year): 2926

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 당해년도매출발생여부
2nd row: 당해년도매출발생여부
3rd row: 당해년도매출발생여부
4th row: 당해년도매출발생여부
5th row: 당해년도매출발생여부

Common Values

Value                  Count   Frequency (%)
당해년도매출발생여부   2926    100.0%

Length

Histogram of lengths of the category


Interactions

[Figures: four interaction plots, not reproduced here]

Correlations

                연번     지원년도   일련번호   매출발생여부
연번            1.000    0.957      0.889      0.200
지원년도        0.957    1.000      0.210      0.085
일련번호        0.889    0.210      1.000      0.068
매출발생여부    0.200    0.085      0.068      1.000

                지원년도   매출발생여부
지원년도        1.000      0.141
매출발생여부    0.141      1.000

                연번     일련번호   지원년도   매출발생여부
연번            1.000    0.282      0.954      0.153
일련번호        0.282    1.000      0.128      0.053
지원년도        0.954    0.128      1.000      0.141
매출발생여부    0.153    0.053      0.141      1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

(index)   연번    지원년도   일련번호   매출발생여부   비고
0         1       2020       1001       매출발생       당해년도매출발생여부
1         2       2020       1002       매출발생       당해년도매출발생여부
2         3       2020       1003       매출발생       당해년도매출발생여부
3         4       2020       1004       매출발생       당해년도매출발생여부
4         5       2020       1005       매출발생       당해년도매출발생여부
5         6       2020       1006       매출발생       당해년도매출발생여부
6         7       2020       1007       매출발생       당해년도매출발생여부
7         8       2020       1008       매출발생       당해년도매출발생여부
8         9       2020       1009       매출발생       당해년도매출발생여부
9         10      2020       1010       매출발생       당해년도매출발생여부

Last rows

(index)   연번    지원년도   일련번호   매출발생여부   비고
2916      2917    2022       1890       매출발생       당해년도매출발생여부
2917      2918    2022       1891       매출발생       당해년도매출발생여부
2918      2919    2022       1892       매출발생       당해년도매출발생여부
2919      2920    2022       1893       매출발생       당해년도매출발생여부
2920      2921    2022       1894       매출발생       당해년도매출발생여부
2921      2922    2022       1895       매출발생       당해년도매출발생여부
2922      2923    2022       1896       매출발생       당해년도매출발생여부
2923      2924    2022       1897       매출발생       당해년도매출발생여부
2924      2925    2022       1898       매출발생       당해년도매출발생여부
2925      2926    2022       1899       매출발생       당해년도매출발생여부