Overview

Dataset statistics

Number of variables: 7
Number of observations: 30
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.8 KiB
Average record size in memory: 62.4 B

Variable types

DateTime: 1
Numeric: 2
Text: 3
Categorical: 1

Dataset

Description: Sample data (샘플 데이터)
Author: 경기신용보증재단 (Gyeonggi Credit Guarantee Foundation)
URL: https://bigdata-region.kr/#/dataset/f57576ce-d9cc-475e-8715-e7c3e09c0059

Alerts

기준년월 has constant value "2021-10-01 00:00:00" (Constant)
관리번호 has unique values (Unique)

Reproduction

Analysis started: 2023-12-10 14:15:00.633578
Analysis finished: 2023-12-10 14:15:02.503621
Duration: 1.87 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
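
The reported duration can be cross-checked from the two timestamps. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and are not part of the report.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2023-12-10 14:15:00.633578")
finished = datetime.fromisoformat("2023-12-10 14:15:02.503621")

# Elapsed wall-clock time between start and finish, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```

Rounded to two decimals, the difference agrees with the Duration field.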

Variables

기준년월 (reference year-month)
Date

CONSTANT

Distinct: 1
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B
Minimum: 2021-10-01 00:00:00
Maximum: 2021-10-01 00:00:00

Histogram with fixed size bins (bins=1)

관리번호 (management number)
Real number (ℝ)

UNIQUE

Distinct: 30
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.0967418 × 10^8
Minimum: 1.0006314 × 10^8
Maximum: 1.1001507 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 1.0006314 × 10^8
5-th percentile: 1.1000035 × 10^8
Q1: 1.1000248 × 10^8
median: 1.1000496 × 10^8
Q3: 1.1000838 × 10^8
95-th percentile: 1.1001292 × 10^8
Maximum: 1.1001507 × 10^8
Range: 9951935
Interquartile range (IQR): 5902.75

Descriptive statistics

Standard deviation: 1815239.5
Coefficient of variation (CV): 0.016551202
Kurtosis: 29.999716
Mean: 1.0967418 × 10^8
Median Absolute Deviation (MAD): 3267.5
Skewness: -5.4771879
Sum: 3.2902254 × 10^9
Variance: 3.2950943 × 10^12
Monotonicity: Not monotonic
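
Several of these figures are derived from one another, which gives a quick consistency check. The sketch below assumes the mean reads 1.0967418 × 10^8 and takes the minimum and maximum from the value tables for this variable; the variable names are illustrative only.

```python
# Inputs copied from the report (rounded as printed there).
minimum = 100_063_139      # smallest 관리번호 listed
maximum = 110_015_074      # largest 관리번호 listed
mean = 1.0967418e8
std_dev = 1815239.5

value_range = maximum - minimum   # Range = maximum - minimum
cv = std_dev / mean               # Coefficient of variation = std / mean
variance = std_dev ** 2           # Variance = std squared

print(value_range, round(cv, 9), f"{variance:.7e}")
```

The range is reproduced exactly; the CV and variance agree with the report up to the rounding of the printed inputs.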
Histogram with fixed size bins (bins=30)

Common values

Value | Count | Frequency (%)
100063139 | 1 | 3.3%
110004795 | 1 | 3.3%
110015074 | 1 | 3.3%
110005365 | 1 | 3.3%
110008422 | 1 | 3.3%
110008421 | 1 | 3.3%
110008417 | 1 | 3.3%
110008410 | 1 | 3.3%
110013090 | 1 | 3.3%
110008398 | 1 | 3.3%
Other values (20) | 20 | 66.7%

Minimum 10 values

Value | Count | Frequency (%)
100063139 | 1 | 3.3%
110000288 | 1 | 3.3%
110000436 | 1 | 3.3%
110000754 | 1 | 3.3%
110000933 | 1 | 3.3%
110001019 | 1 | 3.3%
110001788 | 1 | 3.3%
110002445 | 1 | 3.3%
110002571 | 1 | 3.3%
110003848 | 1 | 3.3%

Maximum 10 values

Value | Count | Frequency (%)
110015074 | 1 | 3.3%
110013090 | 1 | 3.3%
110012712 | 1 | 3.3%
110008422 | 1 | 3.3%
110008421 | 1 | 3.3%
110008417 | 1 | 3.3%
110008410 | 1 | 3.3%
110008398 | 1 | 3.3%
110008323 | 1 | 3.3%
110006958 | 1 | 3.3%

시군명 (city/county name)
Text

Distinct: 15
Distinct (%): 50.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 8
Median length: 7.5
Mean length: 4.7666667
Min length: 3

Characters and Unicode

Total characters: 143
Distinct characters: 27
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
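
As an illustration of those character properties, the sketch below classifies the characters of one sample value from this variable with Python's standard `unicodedata` module. It covers only the general category (the script and block breakdowns in this report are not available from `unicodedata`), and it is not necessarily how the profiling tool computes its counts.

```python
import unicodedata
from collections import Counter

# One value from this variable's sample rows; any string works here.
text = "포천시(의정부)"

# Map each character to its two-letter Unicode general category:
# "Lo" = Other Letter, "Ps" = Open Punctuation, "Pe" = Close Punctuation.
category_counts = Counter(unicodedata.category(ch) for ch in text)

for category, count in category_counts.most_common():
    print(category, count)
```

Hangul syllables fall under Other Letter, and the parentheses under Open and Close Punctuation, which matches the three categories reported for this variable.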

Unique

Unique: 4
Unique (%): 13.3%

Sample

1st row: 포천시(의정부)
2nd row: 평택시
3rd row: 의정부
4th row: 수원시
5th row: 성남시

Value | Count | Frequency (%)
포천시(의정부 | 3 | 10.0%
수원시 | 3 | 10.0%
용인시 | 3 | 10.0%
남양주시 | 3 | 10.0%
평택시 | 2 | 6.7%
의정부 | 2 | 6.7%
성남시 | 2 | 6.7%
양주시(의정부 | 2 | 6.7%
화성시(화성 | 2 | 6.7%
파주시(고양 | 2 | 6.7%
Other values (5) | 6 | 20.0%

Most occurring characters

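Value | Count | Frequency (%)
(not shown) | 29 | 20.3%
( | 11 | 7.7%
) | 11 | 7.7%
(not shown) | 8 | 5.6%
(not shown) | 7 | 4.9%
(not shown) | 7 | 4.9%
(not shown) | 7 | 4.9%
(not shown) | 7 | 4.9%
(not shown) | 7 | 4.9%
(not shown) | 5 | 3.5%
Other values (17) | 44 | 30.8%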

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 121 | 84.6%
Open Punctuation | 11 | 7.7%
Close Punctuation | 11 | 7.7%

Most frequent character per category

Other Letter
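Value | Count | Frequency (%)
(not shown) | 29 | 24.0%
(not shown) | 8 | 6.6%
(not shown) | 7 | 5.8%
(not shown) | 7 | 5.8%
(not shown) | 7 | 5.8%
(not shown) | 7 | 5.8%
(not shown) | 7 | 5.8%
(not shown) | 5 | 4.1%
(not shown) | 5 | 4.1%
(not shown) | 4 | 3.3%
Other values (15) | 35 | 28.9%

Open Punctuation
Value | Count | Frequency (%)
( | 11 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 11 | 100.0%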

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 121 | 84.6%
Common | 22 | 15.4%

Most frequent character per script

Hangul
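Value | Count | Frequency (%)
(not shown) | 29 | 24.0%
(not shown) | 8 | 6.6%
(not shown) | 7 | 5.8%
(not shown) | 7 | 5.8%
(not shown) | 7 | 5.8%
(not shown) | 7 | 5.8%
(not shown) | 7 | 5.8%
(not shown) | 5 | 4.1%
(not shown) | 5 | 4.1%
(not shown) | 4 | 3.3%
Other values (15) | 35 | 28.9%

Common
Value | Count | Frequency (%)
( | 11 | 50.0%
) | 11 | 50.0%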

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 121 | 84.6%
ASCII | 22 | 15.4%

Most frequent character per block

Hangul
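Value | Count | Frequency (%)
(not shown) | 29 | 24.0%
(not shown) | 8 | 6.6%
(not shown) | 7 | 5.8%
(not shown) | 7 | 5.8%
(not shown) | 7 | 5.8%
(not shown) | 7 | 5.8%
(not shown) | 7 | 5.8%
(not shown) | 5 | 4.1%
(not shown) | 5 | 4.1%
(not shown) | 4 | 3.3%
Other values (15) | 35 | 28.9%

ASCII
Value | Count | Frequency (%)
( | 11 | 50.0%
) | 11 | 50.0%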

업종대분류명 (industry major category name)
Categorical

Distinct: 7
Distinct (%): 23.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 25
Median length: 18
Mean length: 14.866667
Min length: 12

Unique

Unique: 2
Unique (%): 6.7%

Sample

1st row: C 제조업(10~34)
2nd row: C 제조업(10~34)
3rd row: G 도매 및 소매업(45~47)
4th row: J출판;영상;방송통신및정보서비스업(58~63)
5th row: P 교육 서비스업(85)

Common Values

Value | Count | Frequency (%)
C 제조업(10~34) | 10 | 33.3%
C 제조업 (10 ~ 33) | 7 | 23.3%
G 도매 및 소매업 (45~47) | 6 | 20.0%
G 도매 및 소매업(45~47) | 3 | 10.0%
F 건설업(41~42) | 2 | 6.7%
J출판;영상;방송통신및정보서비스업(58~63) | 1 | 3.3%
P 교육 서비스업(85) | 1 | 3.3%
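
Some of the seven labels differ only in spacing or in the code range (for example the two "C 제조업" and the two "G 도매 및 소매업" variants). The sketch below tallies the counts by the leading letter of each label. Treating that first character as a section code is an assumption made for illustration; the report itself counts these as seven distinct categories.

```python
from collections import Counter

# (label, count) pairs copied from the Common Values table.
common_values = [
    ("C 제조업(10~34)", 10),
    ("C 제조업 (10 ~ 33)", 7),
    ("G 도매 및 소매업 (45~47)", 6),
    ("G 도매 및 소매업(45~47)", 3),
    ("F 건설업(41~42)", 2),
    ("J출판;영상;방송통신및정보서비스업(58~63)", 1),
    ("P 교육 서비스업(85)", 1),
]

# Sum the counts per leading letter (assumed to be the section code).
by_section = Counter()
for label, count in common_values:
    by_section[label[0]] += count

for section, count in by_section.most_common():
    print(section, count)
```

Under that assumption the 30 observations fall into five groups, with C and G together covering 26 of them.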

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
c | 17 | 16.2%
제조업(10~34 | 10 | 9.5%
g | 9 | 8.6%
도매 | 9 | 8.6%
(not shown) | 9 | 8.6%
제조업 | 7 | 6.7%
10 | 7 | 6.7%
(not shown) | 7 | 6.7%
33 | 7 | 6.7%
45~47 | 6 | 5.7%
Other values (8) | 17 | 16.2%

업종중분류명 (industry middle category name)
Text

Distinct: 26
Distinct (%): 86.7%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 29
Median length: 17
Mean length: 12.533333
Min length: 3

Characters and Unicode

Total characters: 376
Distinct characters: 80
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 23
Unique (%): 76.7%

Sample

1st row: 전기공급 및 전기제어 장치 제조업
2nd row: 기타 전자부품 제조업
3rd row: 기타 기계 및 장비 도매업
4th row: 기타 정보 서비스업
5th row: 일반 교습 학원

Value | Count | Frequency (%)
제조업 | 15 | 13.8%
기타 | 12 | 11.0%
(not shown) | 10 | 9.2%
도매업 | 6 | 5.5%
음·식료품 | 4 | 3.7%
담배 | 4 | 3.7%
기계 | 3 | 2.8%
식품 | 3 | 2.8%
소매업 | 2 | 1.8%
공사업 | 2 | 1.8%
Other values (44) | 48 | 44.0%

Most occurring characters

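Value | Count | Frequency (%)
(space) | 79 | 21.0%
(not shown) | 30 | 8.0%
(not shown) | 21 | 5.6%
(not shown) | 20 | 5.3%
(not shown) | 17 | 4.5%
(not shown) | 16 | 4.3%
(not shown) | 12 | 3.2%
(not shown) | 10 | 2.7%
(not shown) | 8 | 2.1%
(not shown) | 8 | 2.1%
Other values (70) | 155 | 41.2%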

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 290 | 77.1%
Space Separator | 79 | 21.0%
Other Punctuation | 6 | 1.6%
Decimal Number | 1 | 0.3%

Most frequent character per category

Other Letter
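Value | Count | Frequency (%)
(not shown) | 30 | 10.3%
(not shown) | 21 | 7.2%
(not shown) | 20 | 6.9%
(not shown) | 17 | 5.9%
(not shown) | 16 | 5.5%
(not shown) | 12 | 4.1%
(not shown) | 10 | 3.4%
(not shown) | 8 | 2.8%
(not shown) | 8 | 2.8%
(not shown) | 7 | 2.4%
Other values (66) | 141 | 48.6%

Other Punctuation
Value | Count | Frequency (%)
· | 4 | 66.7%
; | 2 | 33.3%

Space Separator
Value | Count | Frequency (%)
(space) | 79 | 100.0%

Decimal Number
Value | Count | Frequency (%)
1 | 1 | 100.0%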

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 290 | 77.1%
Common | 86 | 22.9%

Most frequent character per script

Hangul
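Value | Count | Frequency (%)
(not shown) | 30 | 10.3%
(not shown) | 21 | 7.2%
(not shown) | 20 | 6.9%
(not shown) | 17 | 5.9%
(not shown) | 16 | 5.5%
(not shown) | 12 | 4.1%
(not shown) | 10 | 3.4%
(not shown) | 8 | 2.8%
(not shown) | 8 | 2.8%
(not shown) | 7 | 2.4%
Other values (66) | 141 | 48.6%

Common
Value | Count | Frequency (%)
(space) | 79 | 91.9%
· | 4 | 4.7%
; | 2 | 2.3%
1 | 1 | 1.2%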

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 290 | 77.1%
ASCII | 82 | 21.8%
None | 4 | 1.1%

Most frequent character per block

ASCII
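Value | Count | Frequency (%)
(space) | 79 | 96.3%
; | 2 | 2.4%
1 | 1 | 1.2%

Hangul
Value | Count | Frequency (%)
(not shown) | 30 | 10.3%
(not shown) | 21 | 7.2%
(not shown) | 20 | 6.9%
(not shown) | 17 | 5.9%
(not shown) | 16 | 5.5%
(not shown) | 12 | 4.1%
(not shown) | 10 | 3.4%
(not shown) | 8 | 2.8%
(not shown) | 8 | 2.8%
(not shown) | 7 | 2.4%
Other values (66) | 141 | 48.6%

None
Value | Count | Frequency (%)
· | 4 | 100.0%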

주요제품명 (main product name)
Text

Distinct: 29
Distinct (%): 96.7%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 19
Median length: 11
Mean length: 7.3
Min length: 2

Characters and Unicode

Total characters: 219
Distinct characters: 122
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 28
Unique (%): 93.3%

Sample

1st row: 수배전반
2nd row: BLU; IP CCTV
3rd row: 의료기기
4th row: 시스템 소프트웨어 자문;개발
5th row: 보습학원

Value | Count | Frequency (%)
육가공제품 | 2 | 4.8%
자동차부품 | 2 | 4.8%
반도체 | 2 | 4.8%
시스템 | 2 | 4.8%
초콜릿;설탕류;감미료또는과자류도매업 | 1 | 2.4%
화장지;위생용품 | 1 | 2.4%
수배전반 | 1 | 2.4%
금속탱크 | 1 | 2.4%
소가죽;피혁 | 1 | 2.4%
목재제재 | 1 | 2.4%
Other values (28) | 28 | 66.7%

Most occurring characters

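Value | Count | Frequency (%)
(space) | 12 | 5.5%
; | 11 | 5.0%
(not shown) | 7 | 3.2%
(not shown) | 6 | 2.7%
(not shown) | 6 | 2.7%
(not shown) | 5 | 2.3%
(not shown) | 4 | 1.8%
(not shown) | 4 | 1.8%
(not shown) | 4 | 1.8%
(not shown) | 4 | 1.8%
Other values (112) | 156 | 71.2%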

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 183 | 83.6%
Space Separator | 12 | 5.5%
Other Punctuation | 11 | 5.0%
Uppercase Letter | 11 | 5.0%
Lowercase Letter | 2 | 0.9%

Most frequent character per category

Other Letter
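Value | Count | Frequency (%)
(not shown) | 7 | 3.8%
(not shown) | 6 | 3.3%
(not shown) | 6 | 3.3%
(not shown) | 5 | 2.7%
(not shown) | 4 | 2.2%
(not shown) | 4 | 2.2%
(not shown) | 4 | 2.2%
(not shown) | 4 | 2.2%
(not shown) | 4 | 2.2%
(not shown) | 4 | 2.2%
Other values (99) | 135 | 73.8%

Uppercase Letter
Value | Count | Frequency (%)
C | 3 | 27.3%
N | 1 | 9.1%
V | 1 | 9.1%
T | 1 | 9.1%
P | 1 | 9.1%
I | 1 | 9.1%
U | 1 | 9.1%
L | 1 | 9.1%
B | 1 | 9.1%

Lowercase Letter
Value | Count | Frequency (%)
p | 1 | 50.0%
d | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 12 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
; | 11 | 100.0%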

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 183 | 83.6%
Common | 23 | 10.5%
Latin | 13 | 5.9%

Most frequent character per script

Hangul
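Value | Count | Frequency (%)
(not shown) | 7 | 3.8%
(not shown) | 6 | 3.3%
(not shown) | 6 | 3.3%
(not shown) | 5 | 2.7%
(not shown) | 4 | 2.2%
(not shown) | 4 | 2.2%
(not shown) | 4 | 2.2%
(not shown) | 4 | 2.2%
(not shown) | 4 | 2.2%
(not shown) | 4 | 2.2%
Other values (99) | 135 | 73.8%

Latin
Value | Count | Frequency (%)
C | 3 | 23.1%
N | 1 | 7.7%
V | 1 | 7.7%
T | 1 | 7.7%
p | 1 | 7.7%
P | 1 | 7.7%
I | 1 | 7.7%
U | 1 | 7.7%
L | 1 | 7.7%
B | 1 | 7.7%

Common
Value | Count | Frequency (%)
(space) | 12 | 52.2%
; | 11 | 47.8%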

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 183 | 83.6%
ASCII | 36 | 16.4%

Most frequent character per block

ASCII
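Value | Count | Frequency (%)
(space) | 12 | 33.3%
; | 11 | 30.6%
C | 3 | 8.3%
N | 1 | 2.8%
V | 1 | 2.8%
T | 1 | 2.8%
p | 1 | 2.8%
P | 1 | 2.8%
I | 1 | 2.8%
U | 1 | 2.8%
Other values (3) | 3 | 8.3%

Hangul
Value | Count | Frequency (%)
(not shown) | 7 | 3.8%
(not shown) | 6 | 3.3%
(not shown) | 6 | 3.3%
(not shown) | 5 | 2.7%
(not shown) | 4 | 2.2%
(not shown) | 4 | 2.2%
(not shown) | 4 | 2.2%
(not shown) | 4 | 2.2%
(not shown) | 4 | 2.2%
(not shown) | 4 | 2.2%
Other values (99) | 135 | 73.8%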

종업원수 (number of employees)
Real number (ℝ)

Distinct: 17
Distinct (%): 56.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.266667
Minimum: 1
Maximum: 50
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 4
median: 6.5
Q3: 13
95-th percentile: 25.1
Maximum: 50
Range: 49
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 10.013554
Coefficient of variation (CV): 0.97534617
Kurtosis: 7.7797404
Mean: 10.266667
Median Absolute Deviation (MAD): 4
Skewness: 2.4226898
Sum: 308
Variance: 100.27126
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=17)

Common values

Value | Count | Frequency (%)
6 | 4 | 13.3%
17 | 3 | 10.0%
2 | 3 | 10.0%
3 | 3 | 10.0%
4 | 2 | 6.7%
13 | 2 | 6.7%
8 | 2 | 6.7%
5 | 2 | 6.7%
11 | 1 | 3.3%
1 | 1 | 3.3%
Other values (7) | 7 | 23.3%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 3.3%
2 | 3 | 10.0%
3 | 3 | 10.0%
4 | 2 | 6.7%
5 | 2 | 6.7%
6 | 4 | 13.3%
7 | 1 | 3.3%
8 | 2 | 6.7%
9 | 1 | 3.3%
11 | 1 | 3.3%

Maximum 10 values

Value | Count | Frequency (%)
50 | 1 | 3.3%
26 | 1 | 3.3%
24 | 1 | 3.3%
18 | 1 | 3.3%
17 | 3 | 10.0%
13 | 2 | 6.7%
12 | 1 | 3.3%
11 | 1 | 3.3%
9 | 1 | 3.3%
8 | 2 | 6.7%
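
Between them, the smallest-value and largest-value tables for 종업원수 list all 17 distinct values, so the full column can be rebuilt and the summary statistics re-derived. The sketch below uses only the Python standard library; the `counts` mapping is transcribed from those tables, and the "inclusive" quantile method (linear interpolation over n - 1 intervals) is an assumption about how the quartiles were computed.

```python
import statistics

# value -> count, merged from the two extreme-value tables for 종업원수.
counts = {1: 1, 2: 3, 3: 3, 4: 2, 5: 2, 6: 4, 7: 1, 8: 2, 9: 1,
          11: 1, 12: 1, 13: 2, 17: 3, 18: 1, 24: 1, 26: 1, 50: 1}

# Expand to the individual observations, sorted ascending.
values = sorted(v for v, c in counts.items() for _ in range(c))

total = sum(values)
mean = statistics.mean(values)
median = statistics.median(values)
std_dev = statistics.stdev(values)   # sample standard deviation (n - 1)
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")

print(len(values), total, round(mean, 6), median, round(std_dev, 6), q1, q3)
```

With these inputs the rebuilt column has 30 observations and reproduces the reported Sum (308), median (6.5), Q1 (4) and Q3 (13). The standard deviation matches the reported 10.013554 when computed as a sample standard deviation.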

Interactions

[Figures: interaction plots]

Correlations

Variable | 관리번호 | 시군명 | 업종대분류명 | 업종중분류명 | 주요제품명 | 종업원수
관리번호 | 1.000 | NaN | NaN | NaN | NaN | NaN
시군명 | NaN | 1.000 | 0.549 | 0.840 | 1.000 | 0.708
업종대분류명 | NaN | 0.549 | 1.000 | 1.000 | 1.000 | 0.000
업종중분류명 | NaN | 0.840 | 1.000 | 1.000 | 1.000 | 0.889
주요제품명 | NaN | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
종업원수 | NaN | 0.708 | 0.000 | 0.889 | 1.000 | 1.000

Variable | 관리번호 | 종업원수 | 업종대분류명
관리번호 | 1.000 | -0.255 | 0.000
종업원수 | -0.255 | 1.000 | 0.000
업종대분류명 | 0.000 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 기준년월 | 관리번호 | 시군명 | 업종대분류명 | 업종중분류명 | 주요제품명 | 종업원수
0 | 2021-10 | 100063139 | 포천시(의정부) | C 제조업(10~34) | 전기공급 및 전기제어 장치 제조업 | 수배전반 | 11
1 | 2021-10 | 110000288 | 평택시 | C 제조업(10~34) | 기타 전자부품 제조업 | BLU; IP CCTV | 50
2 | 2021-10 | 110000754 | 의정부 | G 도매 및 소매업(45~47) | 기타 기계 및 장비 도매업 | 의료기기 | 7
3 | 2021-10 | 110000933 | 수원시 | J출판;영상;방송통신및정보서비스업(58~63) | 기타 정보 서비스업 | 시스템 소프트웨어 자문;개발 | 17
4 | 2021-10 | 110001019 | 성남시 | P 교육 서비스업(85) | 일반 교습 학원 | 보습학원 | 2
5 | 2021-10 | 110008323 | 양주시(의정부) | C 제조업 (10 ~ 33) | 기타 식품 제조업 | 양갱 | 13
6 | 2021-10 | 110002445 | 양주시(의정부) | C 제조업(10~34) | 합성고무 및 플라스틱 물질 제조업 | 합성수지필름 | 6
7 | 2021-10 | 110005366 | 포천시(의정부) | G 도매 및 소매업 (45~47) | 음·식료품 및 담배 도매업 | 육가공제품 | 17
8 | 2021-10 | 110002571 | 김포시(부천) | C 제조업 (10 ~ 33) | 기타 금속가공제품 제조업 | 도장업 | 13
9 | 2021-10 | 110005714 | 시흥시(안산) | C 제조업 (10 ~ 33) | 기타 금속가공제품 제조업 | 반도체; 자동차부품 | 3

Last rows

Index | 기준년월 | 관리번호 | 시군명 | 업종대분류명 | 업종중분류명 | 주요제품명 | 종업원수
20 | 2021-10 | 110005147 | 남양주시 | C 제조업 (10 ~ 33) | 제재 및 목재 가공업 | 목재제재 | 8
21 | 2021-10 | 110012712 | 파주시(고양) | C 제조업(10~34) | 기타 특수목적용 기계 제조업 | 고압세척기;NC테이블 | 26
22 | 2021-10 | 110008398 | 파주시(고양) | G 도매 및 소매업 (45~47) | 음·식료품 및 담배 도매업 | 초콜릿;설탕류;감미료또는과자류도매업 | 2
23 | 2021-10 | 110013090 | 수원시 | G 도매 및 소매업(45~47) | 상품 중개업 | 반도체장비및 부품 | 5
24 | 2021-10 | 110008410 | 안산시 | C 제조업 (10 ~ 33) | 가구 제조업 | 옥외용벤치 | 6
25 | 2021-10 | 110008417 | 안산시 | F 건설업(41~42) | 통신 공사업 | 보안용카메라 | 4
26 | 2021-10 | 110008421 | 용인시 | G 도매 및 소매업 (45~47) | 기타 가정용품 소매업 | 보일러 | 3
27 | 2021-10 | 110008422 | 용인시 | G 도매 및 소매업 (45~47) | 음·식료품 및 담배 소매업 | 유제품 | 1
28 | 2021-10 | 110005365 | 포천시(의정부) | G 도매 및 소매업 (45~47) | 음·식료품 및 담배 도매업 | 육가공제품 | 17
29 | 2021-10 | 110015074 | 의정부 | G 도매 및 소매업(45~47) | 신선 식품 및 단순 가공 식품 도매업 | 육류 | 3