Overview

Dataset statistics

Number of variables: 7
Number of observations: 42
Missing cells: 53
Missing cells (%): 18.0%
Duplicate rows: 1
Duplicate rows (%): 2.4%
Total size in memory: 2.4 KiB
Average record size in memory: 59.1 B
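
The percentages in this table can be cross-checked from the raw counts. The sketch below is illustrative only: the variable names are hypothetical, and it assumes the total size is roughly the average record size multiplied by the number of observations.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 7
n_observations = 42
missing_cells = 53
duplicate_rows = 1
avg_record_size_bytes = 59.1

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: total size is approximately average record size times row count.
total_kib = avg_record_size_bytes * n_observations / 1024

print(f"total cells:    {total_cells}")
print(f"missing cells:  {missing_pct:.1f}%")
print(f"duplicate rows: {duplicate_pct:.1f}%")
print(f"total size:     {total_kib:.1f} KiB")
```

The rounded results agree with the reported 18.0%, 2.4% and 2.4 KiB.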

Variable types

Unsupported: 2
Categorical: 2
Text: 3

Alerts

Unnamed: 6 has constant value "" [Constant]
Dataset has 1 (2.4%) duplicate rows [Duplicates]
Unnamed: 2 is highly overall correlated with Unnamed: 3 [High correlation]
Unnamed: 3 is highly overall correlated with Unnamed: 2 [High correlation]
□ 태양광기업 현황(시공기업 포함) has 3 (7.1%) missing values [Missing]
Unnamed: 1 has 1 (2.4%) missing values [Missing]
Unnamed: 4 has 4 (9.5%) missing values [Missing]
Unnamed: 5 has 4 (9.5%) missing values [Missing]
Unnamed: 6 has 41 (97.6%) missing values [Missing]
□ 태양광기업 현황(시공기업 포함) is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 1 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]

Reproduction

Analysis started: 2024-03-14 02:35:04.010608
Analysis finished: 2024-03-14 02:35:04.444339
Duration: 0.43 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
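
As a quick consistency check, the reported duration can be recomputed from the two timestamps above. This is a minimal sketch using only Python's standard library; it is not part of the report itself.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" section.
started = datetime.fromisoformat("2024-03-14 02:35:04.010608")
finished = datetime.fromisoformat("2024-03-14 02:35:04.444339")

# Subtracting two datetimes yields a timedelta.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```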

Variables

□ 태양광기업 현황(시공기업 포함)
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 3
Missing (%): 7.1%
Memory size: 468.0 B

Unnamed: 1
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 1
Missing (%): 2.4%
Memory size: 468.0 B

Unnamed: 2
Categorical

HIGH CORRELATION 

Distinct: 10
Distinct (%): 23.8%
Missing: 0
Missing (%): 0.0%
Memory size: 468.0 B

Length

Max length: 16
Median length: 1
Mean length: 2.1904762
Min length: 1
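
These length figures can be reproduced from the value counts listed in the Common Values table for this variable. The sketch below rests on two assumptions that the report does not state outright: that the blank entry with 30 occurrences is a single whitespace character, and that <NA> is stored as the literal four-character string (the variable reports 0 missing values).

```python
from statistics import mean, median

# Value counts transcribed from the Common Values table for Unnamed: 2.
value_counts = {
    " ": 30,      # assumption: the blank value is one whitespace character
    "<NA>": 4,    # assumption: stored as a literal string, not a missing value
    "대표생산품": 1,
    "폴리실리콘": 1,
    "잉곳 도가니": 1,
    "모듈": 1,
    "모듈 및 시공": 1,
    "인버터": 1,
    "Flexible CIGS 모듈": 1,
    "시공": 1,
}

# Expand the counts into one string length per observation.
lengths = [len(value) for value, count in value_counts.items() for _ in range(count)]

print("observations:", len(lengths))
print("max length:", max(lengths))
print("median length:", median(lengths))
print("mean length:", round(mean(lengths), 7))
print("min length:", min(lengths))
```

Under those assumptions the results match the reported maximum, median, mean and minimum, which suggests the assumptions are at least consistent with the data.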

Unique

Unique: 8
Unique (%): 19.0%

Sample

1st row: <NA>
2nd row: 대표생산품
3rd row: <NA>
4th row: 폴리실리콘
5th row: 잉곳 도가니

Common Values

Value | Count | Frequency (%)
(blank) | 30 | 71.4%
<NA> | 4 | 9.5%
대표생산품 | 1 | 2.4%
폴리실리콘 | 1 | 2.4%
잉곳 도가니 | 1 | 2.4%
모듈 | 1 | 2.4%
모듈 및 시공 | 1 | 2.4%
인버터 | 1 | 2.4%
Flexible CIGS 모듈 | 1 | 2.4%
시공 | 1 | 2.4%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Value | Count | Frequency (%)
(blank) | 30 | 63.8%
na | 4 | 8.5%
모듈 | 3 | 6.4%
시공 | 2 | 4.3%
대표생산품 | 1 | 2.1%
폴리실리콘 | 1 | 2.1%
잉곳 | 1 | 2.1%
도가니 | 1 | 2.1%
(blank) | 1 | 2.1%
인버터 | 1 | 2.1%
Other values (2) | 2 | 4.3%

Unnamed: 3
Categorical

HIGH CORRELATION 

Distinct: 12
Distinct (%): 28.6%
Missing: 0
Missing (%): 0.0%
Memory size: 468.0 B

Length

Max length: 4
Median length: 2
Mean length: 2.3571429
Min length: 2

Unique

Unique: 5
Unique (%): 11.9%

Sample

1st row: <NA>
2nd row: 소재지
3rd row: <NA>
4th row: 군산시
5th row: 익산시

Common Values

Value | Count | Frequency (%)
전주 | 21 | 50.0%
<NA> | 4 | 9.5%
정읍 | 3 | 7.1%
고창 | 3 | 7.1%
익산시 | 2 | 4.8%
완주군 | 2 | 4.8%
군산 | 2 | 4.8%
소재지 | 1 | 2.4%
군산시 | 1 | 2.4%
전주시 | 1 | 2.4%
Other values (2) | 2 | 4.8%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
전주 | 21 | 50.0%
na | 4 | 9.5%
정읍 | 3 | 7.1%
고창 | 3 | 7.1%
익산시 | 2 | 4.8%
완주군 | 2 | 4.8%
군산 | 2 | 4.8%
소재지 | 1 | 2.4%
군산시 | 1 | 2.4%
전주시 | 1 | 2.4%
Other values (2) | 2 | 4.8%

Unnamed: 4
Text

MISSING 

Distinct: 29
Distinct (%): 76.3%
Missing: 4
Missing (%): 9.5%
Memory size: 468.0 B

Length

Max length: 4
Median length: 3
Mean length: 2.4473684
Min length: 1

Characters and Unicode

Total characters: 93
Distinct characters: 53
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
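
To illustrate how such character properties can be looked up, here is a minimal sketch using Python's standard unicodedata module. It is not necessarily how the profiling tool computes its tables. The input is the five sample values shown for this variable plus a "-" entry, and the Hangul check via the character name prefix is a rough stand-in for a real script lookup, which unicodedata does not provide.

```python
import unicodedata
from collections import Counter

# Sample values for this variable, plus a "-" placeholder entry.
values = ["대표", "이우현", "강남훈", "박현우", "허병호", "-"]

# General category per character: "Lo" is Other Letter, "Pd" is Dash Punctuation.
category_counts = Counter(
    unicodedata.category(ch) for value in values for ch in value
)

# Rough script check (assumption): Hangul code points have names that
# start with "HANGUL".
hangul_count = sum(
    1
    for value in values
    for ch in value
    if unicodedata.name(ch, "").startswith("HANGUL")
)

print(dict(category_counts))
print("Hangul characters:", hangul_count)
```

The two-letter codes correspond to the category names used in the tables for this variable: Lo is Other Letter, Pd is Dash Punctuation, Zs is Space Separator and Nd is Decimal Number.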

Unique

Unique: 28
Unique (%): 73.7%

Sample

1st row: 대표
2nd row: 이우현
3rd row: 강남훈
4th row: 박현우
5th row: 허병호

Value | Count | Frequency (%)
(blank) | 10 | 26.3%
장희근 | 1 | 2.6%
대표 | 1 | 2.6%
김현식 | 1 | 2.6%
최은미 | 1 | 2.6%
김영우 | 1 | 2.6%
조창국 | 1 | 2.6%
박익수 | 1 | 2.6%
윤창복 | 1 | 2.6%
김용국 | 1 | 2.6%
Other values (19) | 19 | 50.0%

Most occurring characters

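Value | Count | Frequency (%)
- | 10 | 10.8%
(not shown) | 7 | 7.5%
(not shown) | 4 | 4.3%
(not shown) | 4 | 4.3%
(not shown) | 3 | 3.2%
(not shown) | 3 | 3.2%
(not shown) | 3 | 3.2%
(not shown) | 3 | 3.2%
(not shown) | 3 | 3.2%
(not shown) | 2 | 2.2%
Other values (43) | 51 | 54.8%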

Most occurring categories

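Value | Count | Frequency (%)
Other Letter | 82 | 88.2%
Dash Punctuation | 10 | 10.8%
Space Separator | 1 | 1.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not shown) | 7 | 8.5%
(not shown) | 4 | 4.9%
(not shown) | 4 | 4.9%
(not shown) | 3 | 3.7%
(not shown) | 3 | 3.7%
(not shown) | 3 | 3.7%
(not shown) | 3 | 3.7%
(not shown) | 3 | 3.7%
(not shown) | 2 | 2.4%
(not shown) | 2 | 2.4%
Other values (41) | 48 | 58.5%

Dash Punctuation
Value | Count | Frequency (%)
- | 10 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1 | 100.0%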

Most occurring scripts

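Value | Count | Frequency (%)
Hangul | 82 | 88.2%
Common | 11 | 11.8%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not shown) | 7 | 8.5%
(not shown) | 4 | 4.9%
(not shown) | 4 | 4.9%
(not shown) | 3 | 3.7%
(not shown) | 3 | 3.7%
(not shown) | 3 | 3.7%
(not shown) | 3 | 3.7%
(not shown) | 3 | 3.7%
(not shown) | 2 | 2.4%
(not shown) | 2 | 2.4%
Other values (41) | 48 | 58.5%

Common
Value | Count | Frequency (%)
- | 10 | 90.9%
(space) | 1 | 9.1%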

Most occurring blocks

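Value | Count | Frequency (%)
Hangul | 82 | 88.2%
ASCII | 11 | 11.8%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 10 | 90.9%
(space) | 1 | 9.1%

Hangul
Value | Count | Frequency (%)
(not shown) | 7 | 8.5%
(not shown) | 4 | 4.9%
(not shown) | 4 | 4.9%
(not shown) | 3 | 3.7%
(not shown) | 3 | 3.7%
(not shown) | 3 | 3.7%
(not shown) | 3 | 3.7%
(not shown) | 3 | 3.7%
(not shown) | 2 | 2.4%
(not shown) | 2 | 2.4%
Other values (41) | 48 | 58.5%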

Unnamed: 5
Text

MISSING 

Distinct: 38
Distinct (%): 100.0%
Missing: 4
Missing (%): 9.5%
Memory size: 468.0 B

Length

Max length: 13
Median length: 8
Mean length: 8.1052632
Min length: 1

Characters and Unicode

Total characters: 308
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 38
Unique (%): 100.0%

Sample

1st row: 연락처
2nd row: 460-6000
3rd row: 720-1200
4th row: 710-3000
5th row: 714-3111

Value | Count | Frequency (%)
214-9911 | 1 | 2.6%
563-6101 | 1 | 2.6%
211-8007 | 1 | 2.6%
905-6260 | 1 | 2.6%
277-1623 | 1 | 2.6%
010-8368-7744 | 1 | 2.6%
237-1380 | 1 | 2.6%
1588-6509 | 1 | 2.6%
(blank) | 1 | 2.6%
561-0505 | 1 | 2.6%
Other values (28) | 28 | 73.7%

Most occurring characters

Value | Count | Frequency (%)
- | 39 | 12.7%
1 | 34 | 11.0%
0 | 33 | 10.7%
2 | 31 | 10.1%
3 | 31 | 10.1%
5 | 31 | 10.1%
6 | 30 | 9.7%
7 | 27 | 8.8%
4 | 20 | 6.5%
8 | 17 | 5.5%
Other values (4) | 15 | 4.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 266 | 86.4%
Dash Punctuation | 39 | 12.7%
Other Letter | 3 | 1.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 34 | 12.8%
0 | 33 | 12.4%
2 | 31 | 11.7%
3 | 31 | 11.7%
5 | 31 | 11.7%
6 | 30 | 11.3%
7 | 27 | 10.2%
4 | 20 | 7.5%
8 | 17 | 6.4%
9 | 12 | 4.5%

Other Letter
Value | Count | Frequency (%)
(not shown) | 1 | 33.3%
(not shown) | 1 | 33.3%
(not shown) | 1 | 33.3%

Dash Punctuation
Value | Count | Frequency (%)
- | 39 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 305 | 99.0%
Hangul | 3 | 1.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 39 | 12.8%
1 | 34 | 11.1%
0 | 33 | 10.8%
2 | 31 | 10.2%
3 | 31 | 10.2%
5 | 31 | 10.2%
6 | 30 | 9.8%
7 | 27 | 8.9%
4 | 20 | 6.6%
8 | 17 | 5.6%

Hangul
Value | Count | Frequency (%)
(not shown) | 1 | 33.3%
(not shown) | 1 | 33.3%
(not shown) | 1 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 305 | 99.0%
Hangul | 3 | 1.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 39 | 12.8%
1 | 34 | 11.1%
0 | 33 | 10.8%
2 | 31 | 10.2%
3 | 31 | 10.2%
5 | 31 | 10.2%
6 | 30 | 9.8%
7 | 27 | 8.9%
4 | 20 | 6.6%
8 | 17 | 5.6%

Hangul
Value | Count | Frequency (%)
(not shown) | 1 | 33.3%
(not shown) | 1 | 33.3%
(not shown) | 1 | 33.3%

Unnamed: 6
Text

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 100.0%
Missing: 41
Missing (%): 97.6%
Memory size: 468.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 2
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 비고

Value | Count | Frequency (%)
비고 | 1 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
(not shown) | 1 | 50.0%
(not shown) | 1 | 50.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not shown) | 1 | 50.0%
(not shown) | 1 | 50.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not shown) | 1 | 50.0%
(not shown) | 1 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not shown) | 1 | 50.0%
(not shown) | 1 | 50.0%

Correlations

Variable | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5
Unnamed: 2 | 1.000 | 0.898 | 1.000 | 1.000
Unnamed: 3 | 0.898 | 1.000 | 0.702 | 1.000
Unnamed: 4 | 1.000 | 0.702 | 1.000 | 1.000
Unnamed: 5 | 1.000 | 1.000 | 1.000 | 1.000

Variable | Unnamed: 2 | Unnamed: 3
Unnamed: 2 | 1.000 | 0.675
Unnamed: 3 | 0.675 | 1.000

Variable | Unnamed: 2 | Unnamed: 3
Unnamed: 2 | 1.000 | 0.675
Unnamed: 3 | 0.675 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

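Index | □ 태양광기업 현황(시공기업 포함) | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6
0 | NaN | NaN | <NA> | <NA> | <NA> | <NA> | <NA>
1 | 연번 | 기 업 명 | 대표생산품 | 소재지 | 대표 | 연락처 | 비고
2 | 총계 | 38 | <NA> | <NA> | <NA> | <NA> | <NA>
3 | 1 | OCI_군산공장 | 폴리실리콘 | 군산시 | 이우현 | 460-6000 | <NA>
4 | 2 | 쿼츠테크 | 잉곳 도가니 | 익산시 | 강남훈 | 720-1200 | <NA>
5 | 3 | 솔라파크코리아 | 모듈 | 완주군 | 박현우 | 710-3000 | <NA>
6 | 4 | 이엠테크 | 모듈 및 시공 | 완주군 | 허병호 | 714-3111 | <NA>
7 | 5 | 세스 | 인버터 | 전주시 | 김차현 | 070-4322-0430 | <NA>
8 | 6 | 일진머티리얼즈 | Flexible CIGS 모듈 | 익산시 | 허재명 | 835-3616 | <NA>
9 | 8 | ㈜신후건설 | 시공 | 전주 | 이창용 | 277-5678 | <NA>

Index | □ 태양광기업 현황(시공기업 포함) | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6
32 | 30 | 우아이앤지(주) | (blank) | 전주 | 김영우 | 1544-7386 | <NA>
33 | 31 | (유)S&E | (blank) | 전주 | 최은미 | 1566-5139 | <NA>
34 | 32 | 강남에너지산업 | (blank) | 전주 | 김현식 | 1566-2593 | <NA>
35 | 33 | ㈜탑에너지농장 | (blank) | 전주 | 정상수 | 1666-9772 | <NA>
36 | 34 | 다우태양광 | (blank) | 임실 | - | 644-5555 | <NA>
37 | 35 | 선제1태양광발전소 | (blank) | 군산 | - | 453-5753 | <NA>
38 | 36 | 케이씨태양광 | (blank) | 군산 | - | 461-1232 | <NA>
39 | 37 | 승한태양광발전소 | (blank) | 정읍 | - | 532-1271 | <NA>
40 | 38 | 신태인-한양 | (blank) | 정읍 | - | 571-2539 | <NA>
41 | NaN | 태양광발전소 | <NA> | <NA> | <NA> | <NA> | <NA>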
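
The sample suggests that the sheet's real header sits in row 1 (연번, 기 업 명, 대표생산품, 소재지, 대표, 연락처, 비고), with an empty row above it and a totals row (총계) below it, which would explain the Unnamed: N column names. A minimal sketch of promoting that row to a header follows, using plain Python lists. The function name promote_header and the rule of keeping only rows whose first cell is numeric are assumptions made for illustration; the input is transcribed from the first few sample rows.

```python
# First five sample rows, with None standing in for NaN / <NA>.
rows = [
    [None, None, None, None, None, None, None],
    ["연번", "기 업 명", "대표생산품", "소재지", "대표", "연락처", "비고"],
    ["총계", "38", None, None, None, None, None],
    ["1", "OCI_군산공장", "폴리실리콘", "군산시", "이우현", "460-6000", None],
    ["2", "쿼츠테크", "잉곳 도가니", "익산시", "강남훈", "720-1200", None],
]


def promote_header(rows):
    """Use the first fully populated row as the header (hypothetical helper)."""
    header_idx = next(
        i for i, row in enumerate(rows) if all(cell is not None for cell in row)
    )
    header = rows[header_idx]
    records = []
    for row in rows[header_idx + 1:]:
        # Assumption: real company rows carry a numeric serial number (연번),
        # so the totals row (총계) and stray fragments are skipped.
        if row[0] is not None and row[0].isdigit():
            records.append(dict(zip(header, row)))
    return header, records


header, records = promote_header(rows)
print(header)
for record in records:
    print(record["연번"], record["기 업 명"], record["소재지"])
```

Re-profiling the data after a step like this would likely clear the Unsupported alerts on the first two columns, although that depends on how the remaining values are typed.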

Duplicate rows

Most frequently occurring

Index | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | 4
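
A duplicate count like the one above can be obtained by tallying identical rows over the listed columns. The sketch below uses collections.Counter on a small illustrative subset of rows (three all-<NA> rows, the header row and one company row), so it yields a count of 3 rather than the 4 found in the full dataset; it is not the profiling tool's own implementation.

```python
from collections import Counter

NA = "<NA>"

# Illustrative subset of rows over Unnamed: 2 to Unnamed: 6.
rows = [
    (NA, NA, NA, NA, NA),
    ("대표생산품", "소재지", "대표", "연락처", "비고"),
    (NA, NA, NA, NA, NA),
    ("폴리실리콘", "군산시", "이우현", "460-6000", NA),
    (NA, NA, NA, NA, NA),
]

# Count how often each distinct row occurs, then keep those seen more than once.
counts = Counter(rows)
duplicates = {row: n for row, n in counts.items() if n > 1}

for row, n in duplicates.items():
    print(row, n)
```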