Overview

Dataset statistics

Number of variables: 8
Number of observations: 122
Missing cells: 69
Missing cells (%): 7.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 8.0 KiB
Average record size in memory: 67.0 B
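
The percentages follow directly from the counts. As a quick sanity check, the missing-cell share and the average record size can be recomputed from the reported figures. This is a minimal sketch; the variable names are illustrative and not part of the report, and the memory size is taken from the already rounded "8.0 KiB".

```python
# Figures copied from the dataset statistics above.
n_variables = 8
n_observations = 122
missing_cells = 69
total_size_bytes = 8.0 * 1024  # "8.0 KiB" as reported (already rounded)

total_cells = n_variables * n_observations        # every cell in the table
missing_pct = 100 * missing_cells / total_cells   # share of empty cells
avg_record_bytes = total_size_bytes / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: about {avg_record_bytes:.0f} B")
```

Both recomputed values agree with the report after rounding.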

Variable types

Text: 4
Numeric: 1
Categorical: 2
DateTime: 1

Dataset

Description: 대구광역시_부동산 중개업_20161016 (Daegu Metropolitan City_Real Estate Brokerage_20161016)
Author: 대구광역시 (Daegu Metropolitan City)
URL: http://data.daegu.go.kr/open/data/dataView.do?dataSetId=15069124&dataSetDetailId=150691241f0738c580d02&provdMethod=FILE

Alerts

전문인력인원수 is highly overall correlated with 상태 [High correlation]
상태 is highly overall correlated with 전문인력인원수 [High correlation]
영업소 소재지 has 68 (55.7%) missing values [Missing]
등록번호 has unique values [Unique]

Reproduction

Analysis started: 2024-04-20 16:40:22.273264
Analysis finished: 2024-04-20 16:40:24.043966
Duration: 1.77 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
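
The reported duration can be checked from the two timestamps, and a report of this kind can be regenerated with ydata-profiling. The sketch below is an illustration under assumptions: `build_report`, the CSV path, the output file name, and the encoding are hypothetical and do not come from the report; `ProfileReport` and `to_file` are the ydata-profiling calls used to produce such a report.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-04-20 16:40:22.273264")
finished = datetime.fromisoformat("2024-04-20 16:40:24.043966")
duration_s = (finished - started).total_seconds()
print(f"duration: {duration_s:.2f} s")


def build_report(csv_path, output_html="report.html", encoding="utf-8"):
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so the duration check above runs without these packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Reading 등록번호 as str is an assumption, meant to keep it a text column.
    df = pd.read_csv(csv_path, encoding=encoding, dtype={"등록번호": str})
    profile = ProfileReport(df, title="대구광역시_부동산 중개업_20161016")
    profile.to_file(output_html)
    return profile
```

The source file's actual encoding is not stated in the report, so the `encoding` argument may need to be changed for the real file.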

Variables

등록번호 (registration number)
Text

UNIQUE

Distinct: 122
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 8
Median length: 8
Mean length: 7.9836066
Min length: 6

Characters and Unicode

Total characters: 974
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
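
To illustrate how such character properties can be read, the sketch below uses Python's standard `unicodedata` module on the five sample values listed for 등록번호 in this report. It is not how the report itself was computed, only an example of the idea: `Nd` is the code for "Decimal Number" and `Lo` for "Other Letter". The length arithmetic assumes 121 values of length 8 and one of length 6, which is consistent with the reported minimum, maximum, and total.

```python
import unicodedata
from collections import Counter

# The five sample values shown for 등록번호 in this report.
samples = ["160005", "대구070001", "대구070002", "대구080001", "대구080002"]

# Count the Unicode general category of every character in the samples.
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)
print(category_counts)

# Character total implied by the report: 121 values of length 8 plus one of length 6.
total_chars = 121 * 8 + 1 * 6
mean_length = total_chars / 122
print(total_chars, round(mean_length, 7))
```

The recomputed total and mean length match the reported 974 characters and mean length 7.9836066.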

Unique

Unique: 122
Unique (%): 100.0%

Sample

1st row: 160005
2nd row: 대구070001
3rd row: 대구070002
4th row: 대구080001
5th row: 대구080002

Value | Count | Frequency (%)
160005 | 1 | 0.8%
대구140004 | 1 | 0.8%
대구140002 | 1 | 0.8%
대구140001 | 1 | 0.8%
대구130016 | 1 | 0.8%
대구130015 | 1 | 0.8%
대구130014 | 1 | 0.8%
대구130013 | 1 | 0.8%
대구130012 | 1 | 0.8%
대구130011 | 1 | 0.8%
Other values (112) | 112 | 91.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 377 | 38.7%
1 | 138 | 14.2%
대 | 121 | 12.4%
구 | 121 | 12.4%
8 | 43 | 4.4%
2 | 36 | 3.7%
3 | 35 | 3.6%
5 | 29 | 3.0%
4 | 26 | 2.7%
6 | 21 | 2.2%
Other values (2) | 27 | 2.8%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 732 | 75.2%
Other Letter | 242 | 24.8%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 377 | 51.5%
1 | 138 | 18.9%
8 | 43 | 5.9%
2 | 36 | 4.9%
3 | 35 | 4.8%
5 | 29 | 4.0%
4 | 26 | 3.6%
6 | 21 | 2.9%
9 | 15 | 2.0%
7 | 12 | 1.6%

Other Letter
Value | Count | Frequency (%)
대 | 121 | 50.0%
구 | 121 | 50.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 732 | 75.2%
Hangul | 242 | 24.8%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 377 | 51.5%
1 | 138 | 18.9%
8 | 43 | 5.9%
2 | 36 | 4.9%
3 | 35 | 4.8%
5 | 29 | 4.0%
4 | 26 | 3.6%
6 | 21 | 2.9%
9 | 15 | 2.0%
7 | 12 | 1.6%

Hangul
Value | Count | Frequency (%)
대 | 121 | 50.0%
구 | 121 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 732 | 75.2%
Hangul | 242 | 24.8%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 377 | 51.5%
1 | 138 | 18.9%
8 | 43 | 5.9%
2 | 36 | 4.9%
3 | 35 | 4.8%
5 | 29 | 4.0%
4 | 26 | 3.6%
6 | 21 | 2.9%
9 | 15 | 2.0%
7 | 12 | 1.6%

Hangul
Value | Count | Frequency (%)
대 | 121 | 50.0%
구 | 121 | 50.0%

상호 (business name)
Text

Distinct: 121
Distinct (%): 99.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 16
Median length: 13
Mean length: 8.5245902
Min length: 3

Characters and Unicode

Total characters: 1040
Distinct characters: 164
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 120
Unique (%): 98.4%

Sample

1st row: (주)오션글로벌
2nd row: 아세아건설㈜
3rd row: ㈜빅스타건설
4th row: ㈜대덕디엔씨
5th row: ㈜동서개발

Value | Count | Frequency (%)
주식회사 | 33 | 21.3%
주)미래엔텍 | 2 | 1.3%
주)서림종합건설 | 1 | 0.6%
주)영선건설 | 1 | 0.6%
주)두강건설 | 1 | 0.6%
주)한라개발 | 1 | 0.6%
주)엠제이건설 | 1 | 0.6%
주)한돌 | 1 | 0.6%
주)덕송종합건설 | 1 | 0.6%
주)상수건설 | 1 | 0.6%
Other values (112) | 112 | 72.3%

Most occurring characters

ValueCountFrequency (%)
114
 
11.0%
( 74
 
7.1%
) 74
 
7.1%
55
 
5.3%
52
 
5.0%
37
 
3.6%
36
 
3.5%
35
 
3.4%
33
 
3.2%
24
 
2.3%
Other values (154) 506
48.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 847 | 81.4%
Open Punctuation | 74 | 7.1%
Close Punctuation | 74 | 7.1%
Space Separator | 33 | 3.2%
Other Symbol | 12 | 1.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
114
 
13.5%
55
 
6.5%
52
 
6.1%
37
 
4.4%
36
 
4.3%
35
 
4.1%
24
 
2.8%
24
 
2.8%
18
 
2.1%
18
 
2.1%
Other values (150) 434
51.2%
Open Punctuation
Value | Count | Frequency (%)
( | 74 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 74 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 33 | 100.0%

Other Symbol
Value | Count | Frequency (%)
㈜ | 12 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 859 | 82.6%
Common | 181 | 17.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
114
 
13.3%
55
 
6.4%
52
 
6.1%
37
 
4.3%
36
 
4.2%
35
 
4.1%
24
 
2.8%
24
 
2.8%
18
 
2.1%
18
 
2.1%
Other values (151) 446
51.9%
Common
Value | Count | Frequency (%)
( | 74 | 40.9%
) | 74 | 40.9%
(space) | 33 | 18.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 847 | 81.4%
ASCII | 181 | 17.4%
None | 12 | 1.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
114
 
13.5%
55
 
6.5%
52
 
6.1%
37
 
4.4%
36
 
4.3%
35
 
4.1%
24
 
2.8%
24
 
2.8%
18
 
2.1%
18
 
2.1%
Other values (150) 434
51.2%
ASCII
Value | Count | Frequency (%)
( | 74 | 40.9%
) | 74 | 40.9%
(space) | 33 | 18.2%

None
Value | Count | Frequency (%)
㈜ | 12 | 100.0%

대표자 (representative)
Text

Distinct: 113
Distinct (%): 93.4%
Missing: 1
Missing (%): 0.8%
Memory size: 1.1 KiB

Length

Max length: 23
Median length: 3
Mean length: 3.5785124
Min length: 2

Characters and Unicode

Total characters: 433
Distinct characters: 110
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 105
Unique (%): 86.8%

Sample

1st row: 최인규
2nd row: 박규남
3rd row: 신태노
4th row: 신동하
5th row: 이승현

Value | Count | Frequency (%)
문영호 | 2 | 1.7%
조만현 | 2 | 1.7%
우대현 | 2 | 1.7%
윤창진 | 2 | 1.7%
정선자 | 2 | 1.7%
이석우 | 2 | 1.7%
박승화 | 2 | 1.7%
류일옥 | 2 | 1.7%
전재준 | 1 | 0.8%
서재정 | 1 | 0.8%
Other values (103) | 103 | 85.1%

Most occurring characters

ValueCountFrequency (%)
24
 
5.5%
18
 
4.2%
, 18
 
4.2%
15
 
3.5%
15
 
3.5%
14
 
3.2%
12
 
2.8%
10
 
2.3%
9
 
2.1%
9
 
2.1%
Other values (100) 289
66.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 415 | 95.8%
Other Punctuation | 18 | 4.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
24
 
5.8%
18
 
4.3%
15
 
3.6%
15
 
3.6%
14
 
3.4%
12
 
2.9%
10
 
2.4%
9
 
2.2%
9
 
2.2%
7
 
1.7%
Other values (99) 282
68.0%
Other Punctuation
Value | Count | Frequency (%)
, | 18 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 415 | 95.8%
Common | 18 | 4.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
24
 
5.8%
18
 
4.3%
15
 
3.6%
15
 
3.6%
14
 
3.4%
12
 
2.9%
10
 
2.4%
9
 
2.2%
9
 
2.2%
7
 
1.7%
Other values (99) 282
68.0%
Common
Value | Count | Frequency (%)
, | 18 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 415 | 95.8%
ASCII | 18 | 4.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
24
 
5.8%
18
 
4.3%
15
 
3.6%
15
 
3.6%
14
 
3.4%
12
 
2.9%
10
 
2.4%
9
 
2.2%
9
 
2.2%
7
 
1.7%
Other values (99) 282
68.0%
ASCII
Value | Count | Frequency (%)
, | 18 | 100.0%

자본금(천원) (capital, in thousand KRW)
Real number (ℝ)

Distinct: 49
Distinct (%): 40.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7230010.1
Minimum: 300000
Maximum: 6.80625 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.2 KiB

Quantile statistics

Minimum: 300000
5-th percentile: 300000
Q1: 500000
Median: 510000
Q3: 1210000
95-th percentile: 5119370.5
Maximum: 6.80625 × 10^8
Range: 6.80325 × 10^8
Interquartile range (IQR): 710000

Descriptive statistics

Standard deviation: 61735328
Coefficient of variation (CV): 8.538761
Kurtosis: 119.8705
Mean: 7230010.1
Median Absolute Deviation (MAD): 210000
Skewness: 10.909259
Sum: 8.8206123 × 10^8
Variance: 3.8112508 × 10^15
Monotonicity: Not monotonic
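
Several of these figures are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, the IQR is Q3 minus Q1, and the sum is the mean times the number of observations. The sketch below rechecks them from the reported (rounded) values; the variable names are illustrative.

```python
# Values copied from the statistics reported for 자본금(천원).
mean = 7230010.1
std = 61735328
q1, q3 = 500000, 1210000
minimum, maximum = 300000, 6.80625e8
n = 122  # number of observations

cv = std / mean                  # coefficient of variation
variance = std ** 2              # squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
total = mean * n                 # sum recovered from the mean

print(f"CV = {cv:.4f}")
print(f"variance = {variance:.4e}")
print(f"IQR = {iqr}, range = {value_range:.5e}, sum = {total:.4e}")
```

The large gap between the median (510000) and the mean, together with a CV well above 1, reflects the single very large maximum value.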
Histogram with fixed size bins (bins=49)
Common values
Value | Count | Frequency (%)
500000 | 32 | 26.2%
300000 | 18 | 14.8%
550000 | 7 | 5.7%
510000 | 6 | 4.9%
1200000 | 3 | 2.5%
600000 | 3 | 2.5%
1210000 | 3 | 2.5%
505000 | 3 | 2.5%
1500000 | 2 | 1.6%
1800000 | 2 | 1.6%
Other values (39) | 43 | 35.2%

Minimum 10 values
Value | Count | Frequency (%)
300000 | 18 | 14.8%
310000 | 1 | 0.8%
350000 | 2 | 1.6%
500000 | 32 | 26.2%
503000 | 1 | 0.8%
505000 | 3 | 2.5%
510000 | 6 | 4.9%
550000 | 7 | 5.7%
560000 | 1 | 0.8%
600000 | 3 | 2.5%

Maximum 10 values
Value | Count | Frequency (%)
680625000 | 1 | 0.8%
62254000 | 1 | 0.8%
8200000 | 1 | 0.8%
6816160 | 1 | 0.8%
6769410 | 1 | 0.8%
5705000 | 1 | 0.8%
5120390 | 1 | 0.8%
5100000 | 1 | 0.8%
5000000 | 1 | 0.8%
4511000 | 1 | 0.8%

전문인력인원수 (number of professional staff)
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count
0 | 69
2 | 49
3 | 2
1 | 1
5 | 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 2
Unique (%): 1.6%

Sample

1st row: 2
2nd row: 0
3rd row: 2
4th row: 0
5th row: 2

Common Values

Value | Count | Frequency (%)
0 | 69 | 56.6%
2 | 49 | 40.2%
3 | 2 | 1.6%
1 | 1 | 0.8%
5 | 1 | 0.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 69 | 56.6%
2 | 49 | 40.2%
3 | 2 | 1.6%
1 | 1 | 0.8%
5 | 1 | 0.8%

영업소 소재지 (business office address)
Text

MISSING

Distinct: 54
Distinct (%): 100.0%
Missing: 68
Missing (%): 55.7%
Memory size: 1.1 KiB

Length

Max length: 40
Median length: 33
Mean length: 24.87037
Min length: 16

Characters and Unicode

Total characters: 1343
Distinct characters: 124
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 54
Unique (%): 100.0%

Sample

1st row: 대구광역시 달성군 유가면 비슬로64길 56-3
2nd row: 대구광역시 중구 경상감영길 227
3rd row: 대구광역시 중구 달구벌대로 1929
4th row: 대구광역시 수성구 동대구로 111 (황금동)
5th row: 대구광역시 북구 원대로 130 , 6층

Value | Count | Frequency (%)
대구광역시 | 54 | 18.9%
수성구 | 17 | 6.0%
동구 | 11 | 3.9%
달서구 | 9 | 3.2%
 | 6 | 2.1%
동대구로 | 6 | 2.1%
북구 | 5 | 1.8%
2층 | 5 | 1.8%
범어동 | 5 | 1.8%
중구 | 5 | 1.8%
Other values (139) | 162 | 56.8%

Most occurring characters

ValueCountFrequency (%)
231
17.2%
122
 
9.1%
73
 
5.4%
56
 
4.2%
54
 
4.0%
54
 
4.0%
54
 
4.0%
53
 
3.9%
1 41
 
3.1%
2 36
 
2.7%
Other values (114) 569
42.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 800 | 59.6%
Space Separator | 231 | 17.2%
Decimal Number | 222 | 16.5%
Close Punctuation | 30 | 2.2%
Open Punctuation | 30 | 2.2%
Other Punctuation | 22 | 1.6%
Dash Punctuation | 8 | 0.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
122
15.2%
73
 
9.1%
56
 
7.0%
54
 
6.8%
54
 
6.8%
54
 
6.8%
53
 
6.6%
24
 
3.0%
19
 
2.4%
19
 
2.4%
Other values (99) 272
34.0%
Decimal Number
Value | Count | Frequency (%)
1 | 41 | 18.5%
2 | 36 | 16.2%
4 | 27 | 12.2%
3 | 25 | 11.3%
5 | 24 | 10.8%
6 | 17 | 7.7%
0 | 17 | 7.7%
8 | 13 | 5.9%
7 | 11 | 5.0%
9 | 11 | 5.0%

Space Separator
Value | Count | Frequency (%)
(space) | 231 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 30 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 30 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 22 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 8 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 800 | 59.6%
Common | 543 | 40.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
122
15.2%
73
 
9.1%
56
 
7.0%
54
 
6.8%
54
 
6.8%
54
 
6.8%
53
 
6.6%
24
 
3.0%
19
 
2.4%
19
 
2.4%
Other values (99) 272
34.0%
Common
Value | Count | Frequency (%)
(space) | 231 | 42.5%
1 | 41 | 7.6%
2 | 36 | 6.6%
) | 30 | 5.5%
( | 30 | 5.5%
4 | 27 | 5.0%
3 | 25 | 4.6%
5 | 24 | 4.4%
, | 22 | 4.1%
6 | 17 | 3.1%
Other values (5) | 60 | 11.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 800 | 59.6%
ASCII | 543 | 40.4%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 231 | 42.5%
1 | 41 | 7.6%
2 | 36 | 6.6%
) | 30 | 5.5%
( | 30 | 5.5%
4 | 27 | 5.0%
3 | 25 | 4.6%
5 | 24 | 4.4%
, | 22 | 4.1%
6 | 17 | 3.1%
Other values (5) | 60 | 11.0%
Hangul
ValueCountFrequency (%)
122
15.2%
73
 
9.1%
56
 
7.0%
54
 
6.8%
54
 
6.8%
54
 
6.8%
53
 
6.6%
24
 
3.0%
19
 
2.4%
19
 
2.4%
Other values (99) 272
34.0%

등록일 (registration date)
DateTime

Distinct: 103
Distinct (%): 84.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB
Minimum: 2007-12-20 00:00:00
Maximum: 2016-10-13 00:00:00

Histogram with fixed size bins (bins=50)

상태 (status)
Categorical

HIGH CORRELATION

Distinct: 6
Distinct (%): 4.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count
등록말소(폐업) | 55
등록완료 | 49
등록말소(전출) | 7
등록취소 | 6
등록말소(양도) | 3

Length

Max length: 8
Median length: 8
Mean length: 6.1967213
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 등록완료
2nd row: 등록말소(폐업)
3rd row: 등록완료
4th row: 등록말소(폐업)
5th row: 등록완료

Common Values

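Value | English gloss | Count | Frequency (%)
등록말소(폐업) | deregistered (business closed) | 55 | 45.1%
등록완료 | registration complete | 49 | 40.2%
등록말소(전출) | deregistered (moved out) | 7 | 5.7%
등록취소 | registration revoked | 6 | 4.9%
등록말소(양도) | deregistered (transferred) | 3 | 2.5%
등록완료(전입) | registration complete (moved in) | 2 | 1.6%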
1.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
등록말소(폐업) | 55 | 45.1%
등록완료 | 49 | 40.2%
등록말소(전출) | 7 | 5.7%
등록취소 | 6 | 4.9%
등록말소(양도) | 3 | 2.5%
등록완료(전입) | 2 | 1.6%

Interactions


Correlations

Variable | 자본금(천원) | 전문인력인원수 | 영업소 소재지 | 상태
자본금(천원) | 1.000 | 0.000 | 1.000 | 0.000
전문인력인원수 | 0.000 | 1.000 | 1.000 | 0.685
영업소 소재지 | 1.000 | 1.000 | 1.000 | 1.000
상태 | 0.000 | 0.685 | 1.000 | 1.000

Variable | 전문인력인원수 | 상태
전문인력인원수 | 1.000 | 0.545
상태 | 0.545 | 1.000

Variable | 자본금(천원) | 전문인력인원수 | 상태
자본금(천원) | 1.000 | 0.000 | 0.000
전문인력인원수 | 0.000 | 1.000 | 0.545
상태 | 0.000 | 0.545 | 1.000
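
The report does not state which association measure produced each matrix. One common choice for two categorical variables is Cramér's V, sketched below in plain Python as an assumption, not as the report's method. The function `cramers_v` is a hypothetical helper, and the input is only the 20 rows listed in the Sample section, so the result is not expected to match the full-data coefficients above.

```python
from collections import Counter
from math import sqrt


def cramers_v(pairs):
    """Plain (uncorrected) Cramér's V for a list of (x, y) category pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    row_totals = Counter(x for x, _ in pairs)
    col_totals = Counter(y for _, y in pairs)
    # Chi-squared statistic: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for x, rt in row_totals.items():
        for y, ct in col_totals.items():
            expected = rt * ct / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# (전문인력인원수, 상태) pairs read off the 20 rows in the Sample section.
pairs = (
    [("0", "등록말소(폐업)")] * 9
    + [("2", "등록완료")] * 10
    + [("5", "등록완료")]
)
v = cramers_v(pairs)
print(f"Cramér's V on the 20 sample rows: {v:.3f}")
```

In the 20 sample rows every closed business has 0 professional staff and every registered one has 2 or 5, which is the pattern behind the high-correlation alert for these two variables.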

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
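
A related check is to see whether missing values in one column line up with the values of another. The sketch below tallies, for the 20 rows listed in the Sample section only, whether 영업소 소재지 is present for each 상태 value. The row data is read off that sample; the variable names are illustrative, and the full dataset (68 missing addresses) may show exceptions that the sample does not.

```python
from collections import defaultdict

# (상태, address present?) for the 20 rows listed in the Sample section.
rows = (
    [("등록완료", True)] * 11
    + [("등록말소(폐업)", False)] * 9
)

# Tally present and missing addresses per status value.
summary = defaultdict(lambda: {"missing": 0, "present": 0})
for status, has_address in rows:
    summary[status]["present" if has_address else "missing"] += 1

for status, counts in summary.items():
    print(status, counts)
```

In the sample, every closed business lacks an address and every registered one has an address, which suggests the missing addresses are tied to registration status and are not random gaps.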

Sample

First rows

Index | 등록번호 | 상호 | 대표자 | 자본금(천원) | 전문인력인원수 | 영업소 소재지 | 등록일 | 상태
0 | 160005 | (주)오션글로벌 | 최인규 | 300000 | 2 | 대구광역시 달성군 유가면 비슬로64길 56-3 | 2016-08-05 | 등록완료
1 | 대구070001 | 아세아건설㈜ | 박규남 | 1220000 | 0 | <NA> | 2007-12-20 | 등록말소(폐업)
2 | 대구070002 | ㈜빅스타건설 | 신태노 | 2000000 | 2 | 대구광역시 중구 경상감영길 227 | 2007-12-21 | 등록완료
3 | 대구080001 | ㈜대덕디엔씨 | 신동하 | 500000 | 0 | <NA> | 2008-02-14 | 등록말소(폐업)
4 | 대구080002 | ㈜동서개발 | 이승현 | 5000000 | 2 | 대구광역시 중구 달구벌대로 1929 | 2008-02-14 | 등록완료
5 | 대구080003 | ㈜일신이엔지 | 하영규 | 500000 | 0 | <NA> | 2008-02-20 | 등록말소(폐업)
6 | 대구080004 | ㈜동흥개발 | 배준환 | 500000 | 0 | <NA> | 2008-03-12 | 등록말소(폐업)
7 | 대구080005 | ㈜현창건설 | 박승화 | 1210000 | 0 | <NA> | 2008-03-19 | 등록말소(폐업)
8 | 대구080006 | 코오롱씨앤씨주식회사 | 조현철 | 500000 | 0 | <NA> | 2008-04-10 | 등록말소(폐업)
9 | 대구080007 | ㈜케이홍인종합건설 | 김수일 | 505000 | 0 | <NA> | 2008-04-17 | 등록말소(폐업)

Last rows

Index | 등록번호 | 상호 | 대표자 | 자본금(천원) | 전문인력인원수 | 영업소 소재지 | 등록일 | 상태
112 | 대구150014 | (주)에스에이치건설 | 이재철 | 300000 | 0 | <NA> | 2015-11-13 | 등록말소(폐업)
113 | 대구150015 | (주)세진티앤디 | 박재임 | 350000 | 2 | 대구광역시 남구 용두2길 29 (봉덕동) | 2015-11-16 | 등록완료
114 | 대구150016 | 지성건설(주) | 이형선 | 500000 | 2 | 대구광역시 남구 명덕로 322 , 302호(이천동) | 2015-11-24 | 등록완료
115 | 대구160001 | 신성토건(주) | 심지영 | 2700000 | 2 | 대구광역시 수성구 만촌로 164 ,3층 | 2016-01-19 | 등록완료
116 | 대구160002 | 주식회사 한집 | 노양래 | 310000 | 2 | 대구광역시 동구 검사동 957-44 1층 | 2016-04-15 | 등록완료
117 | 대구160003 | (주)석영건축사사무소 | 조영우 | 300000 | 2 | 대구광역시 달서구 장기로 228 1310호 | 2016-06-15 | 등록완료
118 | 대구160004 | (주)브리티시건설 | 류일옥 | 500000 | 0 | <NA> | 2016-08-03 | 등록말소(폐업)
119 | 대구160006 | 신서드림프로젝트금융투자주식회사 | 정현기,박성준,정현기 | 5100000 | 5 | 대구광역시 동구 이노밸리로 305 1501호, 서원킹스밀오피스텔 | 2016-08-22 | 등록완료
120 | 대구160007 | 주식회사 썬라이즈 | 이상철 | 900000 | 2 | 대구광역시 중구 달구벌대로447길 34-39 | 2016-08-26 | 등록완료
121 | 대구160008 | 주식회사 에이치엔에스 | 허상수 | 300000 | 2 | 대구광역시 동구 동부로26길 70 | 2016-10-13 | 등록완료