Overview

Dataset statistics

Number of variables: 5
Number of observations: 25
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.2 KiB
Average record size in memory: 48.1 B

Variable types

Numeric: 3
Text: 2

Dataset

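Description: Data on the status of startup companies in 전북특별자치도 (Jeonbuk Special Self-Governing Province). It provides the industry code (업종코드: C10, C11, etc.), the industry (업종: food products, beverages, etc.), the number of businesses (업체수), the number of employees (종사자수), and related fields.
Author: 전북특별자치도
URL: https://www.data.go.kr/data/15089303/fileData.do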

Alerts

업체수 is highly overall correlated with 종사자수 (High correlation)
종사자수 is highly overall correlated with 업체수 (High correlation)
연번 has unique values (Unique)
업종코드 has unique values (Unique)
업종 has unique values (Unique)

Reproduction

Analysis started: 2024-03-14 17:29:55.797461
Analysis finished: 2024-03-14 17:29:57.978472
Duration: 2.18 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번 (serial number)
Real number (ℝ)

UNIQUE 

Distinct: 25
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13
Minimum: 1
Maximum: 25
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 353.0 B

Quantile statistics

Minimum: 1
5th percentile: 2.2
Q1: 7
Median: 13
Q3: 19
95th percentile: 23.8
Maximum: 25
Range: 24
Interquartile range (IQR): 12
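
The fractional percentiles above (2.2 and 23.8) are consistent with linear interpolation between neighbouring order statistics. The sketch below is an illustration rather than the profiler's own code: it assumes 연번 takes the values 1 to 25 (consistent with the reported minimum, maximum, sum and strict monotonicity), and the helper name `percentile` is hypothetical.

```python
def percentile(sorted_values, q):
    """Return the q-th percentile (0-100) by linear interpolation."""
    position = (len(sorted_values) - 1) * q / 100  # fractional index
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


# Assumption: 연번 is simply 1..25.
serial_numbers = list(range(1, 26))

quantiles = {q: percentile(serial_numbers, q) for q in (5, 25, 50, 75, 95)}
iqr = quantiles[75] - quantiles[25]

for q, value in quantiles.items():
    print(f"{q}th percentile: {value:.1f}")
print(f"IQR: {iqr:.1f}")
```

With 25 values the 5th percentile sits at fractional index 1.2, that is, one fifth of the way from the second value to the third.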

Descriptive statistics

Standard deviation: 7.3598007
Coefficient of variation (CV): 0.56613852
Kurtosis: -1.2
Mean: 13
Median Absolute Deviation (MAD): 6
Skewness: 0
Sum: 325
Variance: 54.166667
Monotonicity: Strictly increasing
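
As a cross-check, the figures above can be recomputed by hand. The sketch below assumes 연번 is 1 to 25 and uses the sample variance (n - 1 denominator) together with the bias-corrected skewness and excess-kurtosis estimators, which are the ones pandas uses for `skew()` and `kurt()`. Whether the profiler calls exactly these routines is an assumption; the variable names are illustrative.

```python
import math

values = list(range(1, 26))  # assumption: 연번 runs 1..25
n = len(values)
mean = sum(values) / n

# Sums of powers of the deviations from the mean.
s2 = sum((x - mean) ** 2 for x in values)
s3 = sum((x - mean) ** 3 for x in values)
s4 = sum((x - mean) ** 4 for x in values)

variance = s2 / (n - 1)  # sample variance
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

# Bias-corrected sample skewness and excess kurtosis.
skewness = n * math.sqrt(n - 1) / (n - 2) * s3 / s2 ** 1.5
kurtosis = (n * (n + 1) * (n - 1) * s4 / ((n - 2) * (n - 3) * s2 ** 2)
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

print(f"variance={variance:.6f} std={std:.7f} cv={cv:.8f}")
print(f"skewness={skewness:.4f} kurtosis={kurtosis:.4f}")
```

A kurtosis of -1.2 is what these estimators give for evenly spaced values, which fits a plain running index.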
Histogram with fixed size bins (bins=25)
Common values

Value | Count | Frequency (%)
1 | 1 | 4.0%
2 | 1 | 4.0%
25 | 1 | 4.0%
24 | 1 | 4.0%
23 | 1 | 4.0%
22 | 1 | 4.0%
21 | 1 | 4.0%
20 | 1 | 4.0%
19 | 1 | 4.0%
18 | 1 | 4.0%
Other values (15) | 15 | 60.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 4.0%
2 | 1 | 4.0%
3 | 1 | 4.0%
4 | 1 | 4.0%
5 | 1 | 4.0%
6 | 1 | 4.0%
7 | 1 | 4.0%
8 | 1 | 4.0%
9 | 1 | 4.0%
10 | 1 | 4.0%

Maximum 10 values

Value | Count | Frequency (%)
25 | 1 | 4.0%
24 | 1 | 4.0%
23 | 1 | 4.0%
22 | 1 | 4.0%
21 | 1 | 4.0%
20 | 1 | 4.0%
19 | 1 | 4.0%
18 | 1 | 4.0%
17 | 1 | 4.0%
16 | 1 | 4.0%

업종코드 (industry code)
Text

UNIQUE 

Distinct: 25
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 328.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 75
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
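
The character and category counts for this variable can be reproduced with Python's standard `unicodedata` module. The sketch below assumes the 25 codes run contiguously from C10 to C34; the report only shows C10 to C19 and C25 to C34, but this assumption matches the per-character counts it lists. Script and block lookups are left out because the standard library does not expose them.

```python
import unicodedata
from collections import Counter

# Assumption: the codes are C10, C11, ..., C34 (25 values).
codes = [f"C{number}" for number in range(10, 35)]
text = "".join(codes)

char_counts = Counter(text)
# General categories: "Lu" = Uppercase Letter, "Nd" = Decimal Number.
category_counts = Counter(unicodedata.category(ch) for ch in text)

print("Total characters:", len(text))
print("Distinct characters:", len(char_counts))
print("Most occurring characters:", char_counts.most_common(4))
print("Categories:", dict(category_counts))
```

Under that assumption the output gives 75 characters in total, 11 distinct characters, and a 25 to 50 split between uppercase letters and decimal digits.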

Unique

Unique: 25
Unique (%): 100.0%

Sample

1st row: C10
2nd row: C11
3rd row: C12
4th row: C13
5th row: C14

Value | Count | Frequency (%)
c10 | 1 | 4.0%
c23 | 1 | 4.0%
c33 | 1 | 4.0%
c32 | 1 | 4.0%
c31 | 1 | 4.0%
c30 | 1 | 4.0%
c29 | 1 | 4.0%
c28 | 1 | 4.0%
c27 | 1 | 4.0%
c26 | 1 | 4.0%
Other values (15) | 15 | 60.0%

Most occurring characters

Value | Count | Frequency (%)
C | 25 | 33.3%
1 | 13 | 17.3%
2 | 13 | 17.3%
3 | 8 | 10.7%
0 | 3 | 4.0%
4 | 3 | 4.0%
5 | 2 | 2.7%
6 | 2 | 2.7%
7 | 2 | 2.7%
8 | 2 | 2.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 50 | 66.7%
Uppercase Letter | 25 | 33.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 13 | 26.0%
2 | 13 | 26.0%
3 | 8 | 16.0%
0 | 3 | 6.0%
4 | 3 | 6.0%
5 | 2 | 4.0%
6 | 2 | 4.0%
7 | 2 | 4.0%
8 | 2 | 4.0%
9 | 2 | 4.0%

Uppercase Letter
Value | Count | Frequency (%)
C | 25 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 50 | 66.7%
Latin | 25 | 33.3%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 13 | 26.0%
2 | 13 | 26.0%
3 | 8 | 16.0%
0 | 3 | 6.0%
4 | 3 | 6.0%
5 | 2 | 4.0%
6 | 2 | 4.0%
7 | 2 | 4.0%
8 | 2 | 4.0%
9 | 2 | 4.0%

Latin
Value | Count | Frequency (%)
C | 25 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 75 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
C | 25 | 33.3%
1 | 13 | 17.3%
2 | 13 | 17.3%
3 | 8 | 10.7%
0 | 3 | 4.0%
4 | 3 | 4.0%
5 | 2 | 2.7%
6 | 2 | 2.7%
7 | 2 | 2.7%
8 | 2 | 2.7%

업종 (industry)
Text

UNIQUE 

Distinct: 25
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 328.0 B

Length

Max length: 24
Median length: 15
Mean length: 10.96
Min length: 2

Characters and Unicode

Total characters: 274
Distinct characters: 93
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 25
Unique (%): 100.0%

Sample

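1st row: 식료품 (food products)
2nd row: 음료 (beverages)
3rd row: 담배 제조업 (tobacco manufacturing)
4th row: 섬유제품(의복 제외) (textile products, excluding apparel)
5th row: 의복, 의복액세서리 및 모피제품 (apparel, clothing accessories and fur products)

Value | Count | Frequency (%)
(not preserved) | 15 | 20.0%
제외 (excluding) | 3 | 4.0%
가구 (furniture) | 2 | 2.7%
기계 (machinery) | 2 | 2.7%
기타 (other) | 2 | 2.7%
식료품 (food products) | 1 | 1.3%
통신장비 (communication equipment) | 1 | 1.3%
비금속 (non-metallic) | 1 | 1.3%
광물제품 (mineral products) | 1 | 1.3%
1차 (primary) | 1 | 1.3%
Other values (46) | 46 | 61.3%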

Most occurring characters

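Value | Count | Frequency (%)
(space) | 50 | 18.2%
(not preserved) | 17 | 6.2%
(not preserved) | 15 | 5.5%
(not preserved) | 15 | 5.5%
(not preserved) | 10 | 3.6%
, | 9 | 3.3%
(not preserved) | 7 | 2.6%
(not preserved) | 6 | 2.2%
(not preserved) | 6 | 2.2%
(not preserved) | 5 | 1.8%
Other values (83) | 134 | 48.9%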

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 206 | 75.2%
Space Separator | 50 | 18.2%
Other Punctuation | 9 | 3.3%
Open Punctuation | 4 | 1.5%
Close Punctuation | 4 | 1.5%
Decimal Number | 1 | 0.4%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 17 | 8.3%
(not preserved) | 15 | 7.3%
(not preserved) | 15 | 7.3%
(not preserved) | 10 | 4.9%
(not preserved) | 7 | 3.4%
(not preserved) | 6 | 2.9%
(not preserved) | 6 | 2.9%
(not preserved) | 5 | 2.4%
(not preserved) | 4 | 1.9%
(not preserved) | 4 | 1.9%
Other values (78) | 117 | 56.8%

Space Separator
Value | Count | Frequency (%)
(space) | 50 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 9 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 4 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 4 | 100.0%

Decimal Number
Value | Count | Frequency (%)
1 | 1 | 100.0%

Most occurring scripts

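Value | Count | Frequency (%)
Hangul | 206 | 75.2%
Common | 68 | 24.8%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 17 | 8.3%
(not preserved) | 15 | 7.3%
(not preserved) | 15 | 7.3%
(not preserved) | 10 | 4.9%
(not preserved) | 7 | 3.4%
(not preserved) | 6 | 2.9%
(not preserved) | 6 | 2.9%
(not preserved) | 5 | 2.4%
(not preserved) | 4 | 1.9%
(not preserved) | 4 | 1.9%
Other values (78) | 117 | 56.8%

Common
Value | Count | Frequency (%)
(space) | 50 | 73.5%
, | 9 | 13.2%
( | 4 | 5.9%
) | 4 | 5.9%
1 | 1 | 1.5%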

Most occurring blocks

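Value | Count | Frequency (%)
Hangul | 206 | 75.2%
ASCII | 68 | 24.8%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 50 | 73.5%
, | 9 | 13.2%
( | 4 | 5.9%
) | 4 | 5.9%
1 | 1 | 1.5%

Hangul
Value | Count | Frequency (%)
(not preserved) | 17 | 8.3%
(not preserved) | 15 | 7.3%
(not preserved) | 15 | 7.3%
(not preserved) | 10 | 4.9%
(not preserved) | 7 | 3.4%
(not preserved) | 6 | 2.9%
(not preserved) | 6 | 2.9%
(not preserved) | 5 | 2.4%
(not preserved) | 4 | 1.9%
(not preserved) | 4 | 1.9%
Other values (78) | 117 | 56.8%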

업체수 (number of businesses)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 22
Distinct (%): 88.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 48.44
Minimum: 1
Maximum: 233
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 353.0 B

Quantile statistics

Minimum: 1
5th percentile: 1.2
Q1: 15
Median: 29
Q3: 69
95th percentile: 158.2
Maximum: 233
Range: 232
Interquartile range (IQR): 54

Descriptive statistics

Standard deviation: 55.667974
Coefficient of variation (CV): 1.149215
Kurtosis: 4.5160325
Mean: 48.44
Median Absolute Deviation (MAD): 20
Skewness: 2.033846
Sum: 1211
Variance: 3098.9233
Monotonicity: Not monotonic
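
All 25 values of 업체수 can be reassembled from this report alone: the ten lowest and ten highest values listed for this variable, plus the two remaining values (30 and 38) that appear in the Sample rows. The sketch below uses that reconstructed list, which is an inference and not the original file, to recompute the mean, the sample variance and the median absolute deviation with the standard `statistics` module. The variable names are illustrative.

```python
import statistics

# Reconstructed from the report's value tables and Sample rows (an inference).
business_counts = [
    1, 1, 2, 4, 9, 13, 15, 15, 17, 18, 18, 27, 29,
    30, 38, 44, 46, 60, 69, 74, 77, 87, 115, 169, 233,
]

total = sum(business_counts)
mean = statistics.mean(business_counts)
variance = statistics.variance(business_counts)  # sample variance (n - 1)
std = statistics.stdev(business_counts)
cv = std / mean

median = statistics.median(business_counts)
# MAD: the median of the absolute deviations from the median.
mad = statistics.median([abs(x - median) for x in business_counts])

print(f"sum={total} mean={mean:.2f} variance={variance:.4f}")
print(f"std={std:.6f} cv={cv:.6f} median={median} mad={mad}")
```

The MAD (20) is far smaller than the standard deviation (about 55.7), which reflects how strongly the few large industries, such as C10 with 233 businesses, pull the variance upward.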
Histogram with fixed size bins (bins=22)
Common values

Value | Count | Frequency (%)
1 | 2 | 8.0%
18 | 2 | 8.0%
15 | 2 | 8.0%
233 | 1 | 4.0%
169 | 1 | 4.0%
4 | 1 | 4.0%
87 | 1 | 4.0%
17 | 1 | 4.0%
69 | 1 | 4.0%
115 | 1 | 4.0%
Other values (12) | 12 | 48.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 2 | 8.0%
2 | 1 | 4.0%
4 | 1 | 4.0%
9 | 1 | 4.0%
13 | 1 | 4.0%
15 | 2 | 8.0%
17 | 1 | 4.0%
18 | 2 | 8.0%
27 | 1 | 4.0%
29 | 1 | 4.0%

Maximum 10 values

Value | Count | Frequency (%)
233 | 1 | 4.0%
169 | 1 | 4.0%
115 | 1 | 4.0%
87 | 1 | 4.0%
77 | 1 | 4.0%
74 | 1 | 4.0%
69 | 1 | 4.0%
60 | 1 | 4.0%
46 | 1 | 4.0%
44 | 1 | 4.0%

종사자수 (number of employees)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 24
Distinct (%): 96.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 362.12
Minimum: 5
Maximum: 1441
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 353.0 B

Quantile statistics

Minimum: 5
5th percentile: 5.4
Q1: 75
Median: 177
Q3: 542
95th percentile: 1072.4
Maximum: 1441
Range: 1436
Interquartile range (IQR): 467

Descriptive statistics

Standard deviation: 389.70142
Coefficient of variation (CV): 1.0761665
Kurtosis: 1.0842721
Mean: 362.12
Median Absolute Deviation (MAD): 158
Skewness: 1.2766686
Sum: 9053
Variance: 151867.19
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=24)
Common values

Value | Count | Frequency (%)
5 | 2 | 8.0%
1441 | 1 | 4.0%
217 | 1 | 4.0%
32 | 1 | 4.0%
275 | 1 | 4.0%
49 | 1 | 4.0%
75 | 1 | 4.0%
792 | 1 | 4.0%
540 | 1 | 4.0%
724 | 1 | 4.0%
Other values (14) | 14 | 56.0%

Minimum 10 values

Value | Count | Frequency (%)
5 | 2 | 8.0%
7 | 1 | 4.0%
32 | 1 | 4.0%
40 | 1 | 4.0%
49 | 1 | 4.0%
75 | 1 | 4.0%
81 | 1 | 4.0%
112 | 1 | 4.0%
121 | 1 | 4.0%
136 | 1 | 4.0%

Maximum 10 values

Value | Count | Frequency (%)
1441 | 1 | 4.0%
1136 | 1 | 4.0%
818 | 1 | 4.0%
792 | 1 | 4.0%
744 | 1 | 4.0%
724 | 1 | 4.0%
542 | 1 | 4.0%
540 | 1 | 4.0%
493 | 1 | 4.0%
335 | 1 | 4.0%

Interactions

[Figures: nine pairwise interaction plots for the numeric variables 연번, 업체수 and 종사자수]

Correlations

Variable | 연번 | 업종코드 | 업종 | 업체수 | 종사자수
연번 | 1.000 | 1.000 | 1.000 | 0.247 | 0.521
업종코드 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
업종 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
업체수 | 0.247 | 1.000 | 1.000 | 1.000 | 0.853
종사자수 | 0.521 | 1.000 | 1.000 | 0.853 | 1.000

Variable | 연번 | 업체수 | 종사자수
연번 | 1.000 | 0.219 | 0.050
업체수 | 0.219 | 1.000 | 0.867
종사자수 | 0.050 | 0.867 | 1.000
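
The extracted report does not say which coefficient each matrix holds. As a hedged illustration of two common choices, the sketch below computes the Pearson and Spearman correlations between 업체수 and 종사자수 using only the 20 rows visible in the Sample section. Because five rows are missing, the results are not expected to match the matrices above exactly. The helper names `pearson`, `average_ranks` and `spearman` are hypothetical.

```python
import math

# (업체수, 종사자수) pairs for the 20 rows shown in the Sample section.
rows = [
    (233, 1441), (15, 156), (1, 5), (38, 335), (30, 136),
    (1, 5), (29, 121), (18, 542), (13, 81), (2, 7),
    (169, 744), (46, 818), (27, 112), (77, 724), (115, 540),
    (69, 792), (17, 75), (18, 49), (87, 275), (4, 32),
]


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


businesses = [row[0] for row in rows]
employees = [row[1] for row in rows]
pearson_r = pearson(businesses, employees)
spearman_r = spearman(businesses, employees)
print(f"Pearson: {pearson_r:.3f}  Spearman: {spearman_r:.3f}")
```

Spearman works on ranks, so it is less sensitive than Pearson to the single very large industry (C10) in this small sample.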

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

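First rows

(index) | 연번 | 업종코드 | 업종 | 업종 (English gloss) | 업체수 | 종사자수
0 | 1 | C10 | 식료품 | food products | 233 | 1441
1 | 2 | C11 | 음료 | beverages | 15 | 156
2 | 3 | C12 | 담배 제조업 | tobacco manufacturing | 1 | 5
3 | 4 | C13 | 섬유제품(의복 제외) | textile products (excluding apparel) | 38 | 335
4 | 5 | C14 | 의복, 의복액세서리 및 모피제품 | apparel, clothing accessories and fur products | 30 | 136
5 | 6 | C15 | 가죽, 가방 및 신발 | leather, bags and footwear | 1 | 5
6 | 7 | C16 | 목재 및 나무제품(가구제외) | wood and wood products (excluding furniture) | 29 | 121
7 | 8 | C17 | 펄프, 종이 및 종이제품 | pulp, paper and paper products | 18 | 542
8 | 9 | C18 | 인쇄 및 기록매체 복제업 | printing and reproduction of recorded media | 13 | 81
9 | 10 | C19 | 코크스, 연탄 및 석유정유제품 | coke, briquettes and refined petroleum products | 2 | 7

Last rows

(index) | 연번 | 업종코드 | 업종 | 업종 (English gloss) | 업체수 | 종사자수
15 | 16 | C25 | 금속가공제품(기계 및 가구 제외) | fabricated metal products (excluding machinery and furniture) | 169 | 744
16 | 17 | C26 | 전자부품, 컴퓨터, 영상, 음향 및 통신장비 | electronic components, computers, video, audio and communication equipment | 46 | 818
17 | 18 | C27 | 의료, 정밀, 광학기기 및 시계 | medical, precision and optical instruments, and watches | 27 | 112
18 | 19 | C28 | 전기장비 | electrical equipment | 77 | 724
19 | 20 | C29 | 기타 기계 및 장비 | other machinery and equipment | 115 | 540
20 | 21 | C30 | 자동차 및 트레일러 | motor vehicles and trailers | 69 | 792
21 | 22 | C31 | 기타 운송장비 | other transport equipment | 17 | 75
22 | 23 | C32 | 가구 | furniture | 18 | 49
23 | 24 | C33 | 기타제품 | other products | 87 | 275
24 | 25 | C34 | 산업용 기계 및 장비수리업 | industrial machinery and equipment repair | 4 | 32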