Overview

Dataset statistics

Number of variables: 3
Number of observations: 6026
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 147.2 KiB
Average record size in memory: 25.0 B
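
As a quick sanity check, the average record size follows from the total memory size divided by the number of observations. The sketch below assumes that KiB means 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Values copied from the dataset statistics above.
total_size_kib = 147.2   # "Total size in memory"
n_observations = 6026    # "Number of observations"

# Assumption: 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the result agrees with the 25.0 B shown in the report.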

Variable types

Numeric: 1
Text: 2

Dataset

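Description: Company information registered in the KEPCO KDN (한전KDN) electronic bidding system. It provides each company's name, representative, and business registration number.
URL: https://www.data.go.kr/data/15064074/fileData.do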

Alerts

사업자번호 (business registration number) has unique values: Unique

Reproduction

Analysis started: 2023-12-12 01:38:02.916839
Analysis finished: 2023-12-12 01:38:03.831731
Duration: 0.91 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

사업자번호 (business registration number)
Real number (ℝ)

UNIQUE

Distinct: 6026
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.3929105 × 10^9
Minimum: 1.0181193 × 10^9
Maximum: 8.9988011 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 53.1 KiB

Quantile statistics

Minimum: 1.0181193 × 10^9
5-th percentile: 1.1081692 × 10^9
Q1: 1.4081827 × 10^9
Median: 3.0186037 × 10^9
Q3: 5.04218 × 10^9
95-th percentile: 7.0610505 × 10^9
Maximum: 8.9988011 × 10^9
Range: 7.9806818 × 10^9
Interquartile range (IQR): 3.6339973 × 10^9

Descriptive statistics

Standard deviation: 1.996494 × 10^9
Coefficient of variation (CV): 0.58843108
Kurtosis: -0.46527871
Mean: 3.3929105 × 10^9
Median Absolute Deviation (MAD): 1.6904118 × 10^9
Skewness: 0.67914058
Sum: 2.0445679 × 10^13
Variance: 3.9859882 × 10^18
Monotonicity: Not monotonic
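
Several of the figures reported for 사업자번호 can be derived from one another, which makes it easy to confirm that the powers of ten read correctly. The sketch below is a consistency check under the usual definitions (range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared, sum = mean × count); the dictionary names are illustrative only.

```python
import math

# Values copied from the report for 사업자번호.
n = 6026
mean = 3.3929105e9
std = 1.996494e9
minimum, maximum = 1.0181193e9, 8.9988011e9
q1, q3 = 1.4081827e9, 5.04218e9

# Quantities recomputed from the values above.
derived = {
    "range": maximum - minimum,
    "iqr": q3 - q1,
    "cv": std / mean,
    "variance": std ** 2,
    "sum": mean * n,
}

# The same quantities as printed in the report.
reported = {
    "range": 7.9806818e9,
    "iqr": 3.6339973e9,
    "cv": 0.58843108,
    "variance": 3.9859882e18,
    "sum": 2.0445679e13,
}

# The report rounds to about eight significant digits, so compare loosely.
matches = {
    name: math.isclose(value, reported[name], rel_tol=1e-6)
    for name, value in derived.items()
}

for name in derived:
    print(f"{name:9s} derived={derived[name]:.7e} reported={reported[name]:.7e} match={matches[name]}")
```

All five derived values agree with the reported ones within rounding, which supports reading the exponents as 10^9, 10^13, and 10^18.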
Figure: Histogram with fixed size bins (bins=50)
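
To make "fixed size bins" concrete, here is a minimal sketch of how 50 equal-width bins could be laid out between the smallest and largest 사업자번호 values listed in this report. The function names are hypothetical, and the convention that the maximum falls into the last bin is an assumption, not something the report states.

```python
def fixed_bin_edges(minimum, maximum, bins):
    """Return bins + 1 equally spaced edges covering [minimum, maximum]."""
    width = (maximum - minimum) / bins
    return [minimum + i * width for i in range(bins + 1)]


def bin_index(value, minimum, maximum, bins):
    """Return the zero-based bin that contains value.

    Assumption: the last bin is closed on the right, so the maximum
    lands in bin bins - 1 rather than in a bin of its own.
    """
    if not minimum <= value <= maximum:
        raise ValueError("value lies outside the histogram range")
    width = (maximum - minimum) / bins
    return min(int((value - minimum) / width), bins - 1)


# Smallest and largest values from the extreme-value listings of this report.
lowest, highest = 1018119252, 8998801071
edges = fixed_bin_edges(lowest, highest, 50)

print(len(edges))                                  # number of edges
print(edges[1] - edges[0])                         # width of one bin
print(bin_index(3030895251, lowest, highest, 50))  # bin of one listed value
```

Each bin spans roughly 1.6 × 10^8, so a bar in the histogram aggregates all registration numbers within that interval.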
Common values

Value                Count  Frequency (%)
1148706420           1      < 0.1%
4110229026           1      < 0.1%
6138176606           1      < 0.1%
1338135263           1      < 0.1%
3030895251           1      < 0.1%
4230302385           1      < 0.1%
4128110054           1      < 0.1%
1048610901           1      < 0.1%
1298605673           1      < 0.1%
6131467390           1      < 0.1%
Other values (6016)  6016   99.8%

Minimum 10 values

Value       Count  Frequency (%)
1018119252  1      < 0.1%
1018139321  1      < 0.1%
1018145385  1      < 0.1%
1018147345  1      < 0.1%
1018151776  1      < 0.1%
1018154491  1      < 0.1%
1018154598  1      < 0.1%
1018192253  1      < 0.1%
1018197466  1      < 0.1%
1018607542  1      < 0.1%

Maximum 10 values

Value       Count  Frequency (%)
8998801071  1      < 0.1%
8998702113  1      < 0.1%
8988801501  1      < 0.1%
8988800180  1      < 0.1%
8988100463  1      < 0.1%
8978600655  1      < 0.1%
8968701022  1      < 0.1%
8968700571  1      < 0.1%
8968101020  1      < 0.1%
8958701636  1      < 0.1%

업체명 (company name)
Text

Distinct: 5955
Distinct (%): 98.8%
Missing: 0
Missing (%): 0.0%
Memory size: 47.2 KiB

Length

Max length: 23
Median length: 21
Mean length: 8.3153004
Min length: 2

Characters and Unicode

Total characters: 50108
Distinct characters: 638
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
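
To illustrate how such character properties are obtained, the sketch below classifies the characters of the five sample 업체명 rows shown in this report with Python's standard `unicodedata` module. The mapping from two-letter general-category codes to the long names used in this report is written out by hand, and the helper name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Zs": "Space Separator",
    "So": "Other Symbol",
    "Sk": "Modifier Symbol",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for text in values:
        for ch in text:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# The five sample rows listed for 업체명 in this report.
sample = [
    "(유)에이씨이엠에이코리아",
    "서경씨엔티",
    "주식회사 아이티네트웍스",
    "주식회사 지미션",
    "찬빈정보통신",
]

counts = category_counts(sample)
for name, n in counts.most_common():
    print(name, n)
```

Hangul syllables fall under Other Letter, which is why that category dominates both the sample and the full column.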

Unique

Unique: 5888
Unique (%): 97.7%
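
Distinct and Unique are easy to confuse. The numbers in this report are consistent with Distinct counting how many different values occur and Unique counting only the values that occur exactly once. The sketch below shows the difference on a small made-up list; the data and the function name are hypothetical.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    n_distinct = len(counts)
    n_unique = sum(1 for c in counts.values() if c == 1)
    return n_distinct, n_unique


# Hypothetical toy column: "beta" and "gamma" repeat, "alpha" and "delta" do not.
toy = ["alpha", "beta", "beta", "gamma", "gamma", "gamma", "delta"]
n_distinct, n_unique = distinct_and_unique(toy)

print(n_distinct, n_unique)
```

Under this reading, a column can never have more unique values than distinct values, and the two counts coincide only when no value repeats.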

Sample

1st row: (유)에이씨이엠에이코리아
2nd row: 서경씨엔티
3rd row: 주식회사 아이티네트웍스
4th row: 주식회사 지미션
5th row: 찬빈정보통신

Value                Count  Frequency (%)
주식회사              1320   17.5%
(blank)              51     0.7%
유한회사              35     0.5%
합자회사              16     0.2%
건축사사무소           9      0.1%
사단법인              5      0.1%
산학협력단             4      0.1%
가람                 3      < 0.1%
하나정보통신           3      < 0.1%
주)동광전기           3      < 0.1%
Other values (5994)  6094   80.8%
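
The counts in this table add up to 7543, more than the 6026 observations, and 1320 / 7543 is about 17.5%, so the shares appear to be taken over word tokens rather than over rows. The sketch below reproduces that kind of count on the five sample rows, assuming tokens are produced by splitting on whitespace; the report's own tokenizer may differ (for example by also stripping punctuation), and the function name is hypothetical.

```python
from collections import Counter


def word_frequencies(values):
    """Return (word, count, share in percent), most frequent first.

    Assumption: words are whitespace-separated tokens.
    """
    counts = Counter(token for text in values for token in text.split())
    total = sum(counts.values())
    return [(word, n, 100 * n / total) for word, n in counts.most_common()]


# The five sample rows listed for 업체명 in this report.
sample = [
    "(유)에이씨이엠에이코리아",
    "서경씨엔티",
    "주식회사 아이티네트웍스",
    "주식회사 지미션",
    "찬빈정보통신",
]

frequencies = word_frequencies(sample)
for word, n, pct in frequencies:
    print(f"{word}\t{n}\t{pct:.1f}%")
```

Even in this tiny sample, 주식회사 is the most frequent token, in line with its position at the top of the table.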

Most occurring characters

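Value               Count  Frequency (%)
[not captured]      5231   10.4%
)                   3465   6.9%
(                   3449   6.9%
[not captured]      2222   4.4%
[not captured]      1940   3.9%
[not captured]      1834   3.7%
(space)             1551   3.1%
[not captured]      1495   3.0%
[not captured]      998    2.0%
[not captured]      979    2.0%
Other values (628)  26944  53.8%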

Most occurring categories

Value              Count  Frequency (%)
Other Letter       41416  82.7%
Close Punctuation  3465   6.9%
Open Punctuation   3449   6.9%
Space Separator    1551   3.1%
Uppercase Letter   137    0.3%
Lowercase Letter   56     0.1%
Other Punctuation  22     < 0.1%
Decimal Number     10     < 0.1%
Other Symbol       1      < 0.1%
Dash Punctuation   1      < 0.1%

Most frequent character per category

Other Letter

Value               Count  Frequency (%)
[not captured]      5231   12.6%
[not captured]      2222   5.4%
[not captured]      1940   4.7%
[not captured]      1834   4.4%
[not captured]      1495   3.6%
[not captured]      998    2.4%
[not captured]      979    2.4%
[not captured]      952    2.3%
[not captured]      879    2.1%
[not captured]      861    2.1%
Other values (571)  24025  58.0%

Uppercase Letter

Value              Count  Frequency (%)
E                  15     10.9%
T                  14     10.2%
C                  11     8.0%
G                  10     7.3%
N                  10     7.3%
B                  9      6.6%
A                  9      6.6%
I                  8      5.8%
S                  8      5.8%
O                  7      5.1%
Other values (15)  36     26.3%

Lowercase Letter

Value             Count  Frequency (%)
o                 10     17.9%
n                 6      10.7%
d                 6      10.7%
e                 5      8.9%
a                 4      7.1%
y                 3      5.4%
i                 3      5.4%
s                 3      5.4%
c                 3      5.4%
p                 2      3.6%
Other values (8)  11     19.6%

Decimal Number

Value  Count  Frequency (%)
1      4      40.0%
3      2      20.0%
2      2      20.0%
4      1      10.0%
5      1      10.0%

Other Punctuation

Value  Count  Frequency (%)
.      15     68.2%
&      5      22.7%
/      1      4.5%
,      1      4.5%

Close Punctuation

Value  Count  Frequency (%)
)      3465   100.0%

Open Punctuation

Value  Count  Frequency (%)
(      3449   100.0%

Space Separator

Value    Count  Frequency (%)
(space)  1551   100.0%

Other Symbol

Value           Count  Frequency (%)
[not captured]  1      100.0%

Dash Punctuation

Value  Count  Frequency (%)
-      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  41417  82.7%
Common  8498   17.0%
Latin   193    0.4%

Most frequent character per script

Hangul

Value               Count  Frequency (%)
[not captured]      5231   12.6%
[not captured]      2222   5.4%
[not captured]      1940   4.7%
[not captured]      1834   4.4%
[not captured]      1495   3.6%
[not captured]      998    2.4%
[not captured]      979    2.4%
[not captured]      952    2.3%
[not captured]      879    2.1%
[not captured]      861    2.1%
Other values (572)  24026  58.0%

Latin

Value              Count  Frequency (%)
E                  15     7.8%
T                  14     7.3%
C                  11     5.7%
G                  10     5.2%
N                  10     5.2%
o                  10     5.2%
B                  9      4.7%
A                  9      4.7%
I                  8      4.1%
S                  8      4.1%
Other values (33)  89     46.1%

Common

Value             Count  Frequency (%)
)                 3465   40.8%
(                 3449   40.6%
(space)           1551   18.3%
.                 15     0.2%
&                 5      0.1%
1                 4      < 0.1%
3                 2      < 0.1%
2                 2      < 0.1%
/                 1      < 0.1%
,                 1      < 0.1%
Other values (3)  3      < 0.1%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  41416  82.7%
ASCII   8691   17.3%
None    1      < 0.1%

Most frequent character per block

Hangul

Value               Count  Frequency (%)
[not captured]      5231   12.6%
[not captured]      2222   5.4%
[not captured]      1940   4.7%
[not captured]      1834   4.4%
[not captured]      1495   3.6%
[not captured]      998    2.4%
[not captured]      979    2.4%
[not captured]      952    2.3%
[not captured]      879    2.1%
[not captured]      861    2.1%
Other values (571)  24025  58.0%

ASCII

Value              Count  Frequency (%)
)                  3465   39.9%
(                  3449   39.7%
(space)            1551   17.8%
E                  15     0.2%
.                  15     0.2%
T                  14     0.2%
C                  11     0.1%
G                  10     0.1%
N                  10     0.1%
o                  10     0.1%
Other values (46)  141    1.6%

None

Value           Count  Frequency (%)
[not captured]  1      100.0%

대표자 (representative)
Text

Distinct: 5383
Distinct (%): 89.3%
Missing: 0
Missing (%): 0.0%
Memory size: 47.2 KiB

Length

Max length: 14
Median length: 3
Mean length: 3.05692
Min length: 1

Characters and Unicode

Total characters: 18421
Distinct characters: 302
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
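
The length and character figures of the two text variables are linked: the mean length equals the total number of characters divided by the number of observations. The sketch below checks this with the totals printed in this report; the dictionary names are illustrative only.

```python
import math

n_observations = 6026

# "Total characters" and "Mean length" as printed for each text variable.
total_characters = {"업체명": 50108, "대표자": 18421}
reported_mean_length = {"업체명": 8.3153004, "대표자": 3.05692}

mean_length = {
    name: total / n_observations for name, total in total_characters.items()
}

for name, value in mean_length.items():
    ok = math.isclose(value, reported_mean_length[name], abs_tol=1e-6)
    print(f"{name}: computed={value:.7f} reported={reported_mean_length[name]} match={ok}")
```

Both computed means agree with the reported values to within rounding.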

Unique

Unique: 4919
Unique (%): 81.6%

Sample

1st row: 진태호
2nd row: 박미화
3rd row: 이택진
4th row: 한준섭
5th row: 류선희

Value                Count  Frequency (%)
김광수                9      0.1%
이상훈                9      0.1%
김영호                6      0.1%
김영수                6      0.1%
김영숙                6      0.1%
김기영                5      0.1%
김종현                5      0.1%
최은정                5      0.1%
김정희                5      0.1%
김영미                5      0.1%
Other values (5395)  6001   99.0%

Most occurring characters

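Value               Count  Frequency (%)
[not captured]      1299   7.1%
[not captured]      919    5.0%
[not captured]      626    3.4%
[not captured]      564    3.1%
[not captured]      490    2.7%
[not captured]      344    1.9%
[not captured]      344    1.9%
[not captured]      335    1.8%
[not captured]      319    1.7%
[not captured]      315    1.7%
Other values (292)  12866  69.8%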

Most occurring categories

Value              Count  Frequency (%)
Other Letter       18285  99.3%
Other Punctuation  71     0.4%
Space Separator    44     0.2%
Uppercase Letter   12     0.1%
Decimal Number     8      < 0.1%
Modifier Symbol    1      < 0.1%

Most frequent character per category

Other Letter

Value               Count  Frequency (%)
[not captured]      1299   7.1%
[not captured]      919    5.0%
[not captured]      626    3.4%
[not captured]      564    3.1%
[not captured]      490    2.7%
[not captured]      344    1.9%
[not captured]      344    1.9%
[not captured]      335    1.8%
[not captured]      319    1.7%
[not captured]      315    1.7%
Other values (277)  12730  69.6%

Uppercase Letter

Value  Count  Frequency (%)
C      2      16.7%
N      2      16.7%
H      2      16.7%
G      1      8.3%
E      1      8.3%
F      1      8.3%
A      1      8.3%
U      1      8.3%
I      1      8.3%

Other Punctuation

Value  Count  Frequency (%)
,      67     94.4%
.      3      4.2%
/      1      1.4%

Space Separator

Value    Count  Frequency (%)
(space)  44     100.0%

Decimal Number

Value  Count  Frequency (%)
1      8      100.0%

Modifier Symbol

Value  Count  Frequency (%)
`      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  18285  99.3%
Common  124    0.7%
Latin   12     0.1%

Most frequent character per script

Hangul

Value               Count  Frequency (%)
[not captured]      1299   7.1%
[not captured]      919    5.0%
[not captured]      626    3.4%
[not captured]      564    3.1%
[not captured]      490    2.7%
[not captured]      344    1.9%
[not captured]      344    1.9%
[not captured]      335    1.8%
[not captured]      319    1.7%
[not captured]      315    1.7%
Other values (277)  12730  69.6%

Latin

Value  Count  Frequency (%)
C      2      16.7%
N      2      16.7%
H      2      16.7%
G      1      8.3%
E      1      8.3%
F      1      8.3%
A      1      8.3%
U      1      8.3%
I      1      8.3%

Common

Value    Count  Frequency (%)
,        67     54.0%
(space)  44     35.5%
1        8      6.5%
.        3      2.4%
/        1      0.8%
`        1      0.8%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  18285  99.3%
ASCII   136    0.7%

Most frequent character per block

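Hangul

Value               Count  Frequency (%)
[not captured]      1299   7.1%
[not captured]      919    5.0%
[not captured]      626    3.4%
[not captured]      564    3.1%
[not captured]      490    2.7%
[not captured]      344    1.9%
[not captured]      344    1.9%
[not captured]      335    1.8%
[not captured]      319    1.7%
[not captured]      315    1.7%
Other values (277)  12730  69.6%

ASCII

Value             Count  Frequency (%)
,                 67     49.3%
(space)           44     32.4%
1                 8      5.9%
.                 3      2.2%
C                 2      1.5%
N                 2      1.5%
H                 2      1.5%
G                 1      0.7%
E                 1      0.7%
F                 1      0.7%
Other values (5)  5      3.7%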

Interactions

[Figure: interactions plot]

Missing values

Figure: A simple visualization of nullity by column.
Figure: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
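
Both nullity views rest on a simple per-column count of missing entries. The sketch below shows that count on the first three rows of the sample table in this report, plus one hypothetical incomplete row added only to show a non-zero result. Representing rows as dictionaries with `None` for a missing cell is an assumption made for the example, and the function name is hypothetical.

```python
def missing_per_column(rows, columns):
    """Count None entries per column across a list of row dictionaries."""
    return {col: sum(1 for row in rows if row.get(col) is None) for col in columns}


columns = ["사업자번호", "업체명", "대표자"]

# First three rows of the sample table in this report.
rows = [
    {"사업자번호": 1148706420, "업체명": "(유)에이씨이엠에이코리아", "대표자": "진태호"},
    {"사업자번호": 5048130039, "업체명": "서경씨엔티", "대표자": "박미화"},
    {"사업자번호": 4018153387, "업체명": "주식회사 아이티네트웍스", "대표자": "이택진"},
]
complete = missing_per_column(rows, columns)

# Hypothetical row with no representative; it is not part of the dataset.
incomplete_row = {"사업자번호": 1, "업체명": "example", "대표자": None}
with_gap = missing_per_column(rows + [incomplete_row], columns)

print(complete)
print(with_gap)
```

For the real dataset every column count is zero, matching the 0 missing cells reported in the overview.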

Sample

First rows

Index  사업자번호    업체명                                      대표자
0      1148706420  (유)에이씨이엠에이코리아                      진태호
1      5048130039  서경씨엔티                                  박미화
2      4018153387  주식회사 아이티네트웍스                       이택진
3      2388801373  주식회사 지미션                              한준섭
4      1480701727  찬빈정보통신                                류선희
5      1212271726  토탈이앤텍                                  장현호
6      2258119067  ((주)미승전력소방                            강정민
7      5108208067  (사)국민장애인복지협의회온누리                 조혜규
8      1168205951  (사)대한경제연구원                           선종배
9      2098212394  (사)대한문화체육교육협회 장애인자립지원단       김상배

Last rows

Index  사업자번호    업체명                  대표자
6016   1298657225  휴텍주식회사            김동엽
6017   1028107837  흥국화재해상보험(주)     권중원
6018   2258112496  흥리종합건설주식회사     정교진
6019   5010362494  흥양전업공사            우용정
6020   4128120120  흥양종합건설(주)        강명철
6021   2278109890  흥일건설(주)            한옥분
6022   2268128569  흥해종합건설(주)        김선교
6023   1048613043  흥화텔레콤(주)          김희정 강철수
6024   4668601955  희명이엔씨              위정희
6025   1238187798  희상건설(주)            이경범