Overview

Dataset statistics

Number of variables: 7
Number of observations: 5397
Missing cells: 10086
Missing cells (%): 26.7%
Duplicate rows: 18
Duplicate rows (%): 0.3%
Total size in memory: 295.3 KiB
Average record size in memory: 56.0 B
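
The derived figures in this block can be re-checked from the raw counts. The sketch below is illustrative arithmetic only; it assumes that the missing-cell percentage is taken over all cells (variables times observations) and that the average record size is the total memory divided by the number of observations, which is consistent with the reported values.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 7
n_observations = 5397
missing_cells = 10086
duplicate_rows = 18
total_size_kib = 295.3

# Assumption: percentages are relative to all cells / all rows.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: average record size = total memory / number of rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```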

Variable types

Text: 5
Categorical: 1
DateTime: 1

Dataset

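Description: Construction information for building sites in Namhae-gun, Gyeongsangnam-do. The fields include the construction site location, main use, construction start date, completion date, contractor name, supervision office, and design office name, together with phone numbers.
Author: Namhae-gun, Gyeongsangnam-do (경상남도 남해군)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15035684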

Alerts

Dataset has 18 (0.3%) duplicate rows [Duplicates]
주용도 is highly imbalanced (55.9%) [Imbalance]
착공일 has 307 (5.7%) missing values [Missing]
준공일 has 303 (5.6%) missing values [Missing]
시공업체명(전화번화) has 4934 (91.4%) missing values [Missing]
감리사무소명(전화번호) has 4516 (83.7%) missing values [Missing]
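
The missing-value alerts can be cross-checked against the dataset-level total of 10086 missing cells. The sketch below is a consistency check, not part of the report. The count of 26 is the Missing figure reported for the seventh column later in the report, and attributing it to 설계사무소명(전화번호) is an assumption based on the column order in the Sample table.

```python
n_observations = 5397

# Missing counts per column, copied from the alerts, plus the
# count of 26 reported for the remaining column (assumed to be
# 설계사무소명(전화번호)). Columns without missing values are omitted.
missing_by_column = {
    "착공일": 307,
    "준공일": 303,
    "시공업체명(전화번화)": 4934,
    "감리사무소명(전화번호)": 4516,
    "설계사무소명(전화번호)": 26,
}

missing_pct = {
    column: f"{100 * count / n_observations:.1f}%"
    for column, count in missing_by_column.items()
}
total_missing = sum(missing_by_column.values())

for column, pct in missing_pct.items():
    print(column, pct)
print("Total missing cells:", total_missing)
```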

Reproduction

Analysis started: 2023-12-11 00:57:52.081624
Analysis finished: 2023-12-11 00:57:53.158003
Duration: 1.08 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
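
The reported duration follows from the two timestamps. A minimal check using only the Python standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2023-12-11 00:57:52.081624")
finished = datetime.fromisoformat("2023-12-11 00:57:53.158003")

duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```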

Variables

시공 대지위치
Text

Distinct: 4942
Distinct (%): 91.6%
Missing: 0
Missing (%): 0.0%
Memory size: 42.3 KiB

Length

Max length: 30
Median length: 28
Mean length: 22.830276
Min length: 17

Characters and Unicode

Total characters: 123215
Distinct characters: 108
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
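
As an illustration of these character properties, the sketch below counts Unicode general categories for two of the sample addresses with Python's standard `unicodedata` module. It is not the profiler's own implementation. The mapping from two-letter category codes to the names used in this report (Other Letter, Space Separator, Decimal Number, Dash Punctuation) follows the Unicode Standard; script and block lookups are not available in the standard library and are left out.

```python
import unicodedata
from collections import Counter

# Two sample values of this variable, copied from the report.
samples = [
    "경상남도 남해군 남면 상가리 144",
    "경상남도 남해군 남면 평산리 1570-2",
]

# Human-readable names for the category codes that occur here.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Nd": "Decimal Number",
    "Pd": "Dash Punctuation",
}

category_counts = Counter(
    unicodedata.category(ch) for text in samples for ch in text
)

for code, count in category_counts.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```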

Unique

Unique: 4543
Unique (%): 84.2%

Sample

1st row: 경상남도 남해군 남면 상가리 144
2nd row: 경상남도 남해군 남면 평산리 1570-2
3rd row: 경상남도 남해군 창선면 동대리 산 157-23
4th row: 경상남도 남해군 남해읍 평리 980-2
5th row: 경상남도 남해군 상주면 양아리 1853-10

Value | Count | Frequency (%)
경상남도 | 5397 | 18.8%
남해군 | 5397 | 18.8%
외1필지 | 1042 | 3.6%
창선면 | 907 | 3.2%
남해읍 | 894 | 3.1%
남면 | 673 | 2.3%
삼동면 | 638 | 2.2%
이동면 | 517 | 1.8%
설천면 | 438 | 1.5%
고현면 | 431 | 1.5%
Other values (3770) | 12443 | 43.2%
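
The word table above appears to count whitespace-separated tokens across all addresses (every row contributes one 경상남도 and one 남해군). The sketch below reproduces that kind of count on the five sample rows only. Splitting on whitespace is an assumption about the tokenisation; the profiler's actual tokeniser may differ.

```python
from collections import Counter

# The five sample addresses listed in the report.
addresses = [
    "경상남도 남해군 남면 상가리 144",
    "경상남도 남해군 남면 평산리 1570-2",
    "경상남도 남해군 창선면 동대리 산 157-23",
    "경상남도 남해군 남해읍 평리 980-2",
    "경상남도 남해군 상주면 양아리 1853-10",
]

# Assumption: tokens are separated by whitespace.
token_counts = Counter(
    token for address in addresses for token in address.split()
)
total_tokens = sum(token_counts.values())

for token, count in token_counts.most_common(3):
    print(f"{token} | {count} | {100 * count / total_tokens:.1f}%")
```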

Most occurring characters

ValueCountFrequency (%)
23380
19.0%
12581
 
10.2%
6291
 
5.1%
5992
 
4.9%
1 5638
 
4.6%
5546
 
4.5%
5397
 
4.4%
5397
 
4.4%
5396
 
4.4%
4523
 
3.7%
Other values (98) 43074
35.0%

Most occurring categories

ValueCountFrequency (%)
Other Letter 74151
60.2%
Space Separator 23380
 
19.0%
Decimal Number 22484
 
18.2%
Dash Punctuation 3200
 
2.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
12581
17.0%
6291
 
8.5%
5992
 
8.1%
5546
 
7.5%
5397
 
7.3%
5397
 
7.3%
5396
 
7.3%
4523
 
6.1%
1925
 
2.6%
1685
 
2.3%
Other values (86) 19418
26.2%
Decimal Number
ValueCountFrequency (%)
1 5638
25.1%
2 2961
13.2%
3 2200
 
9.8%
4 2136
 
9.5%
5 1823
 
8.1%
6 1765
 
7.9%
8 1545
 
6.9%
7 1539
 
6.8%
0 1522
 
6.8%
9 1355
 
6.0%
Space Separator
ValueCountFrequency (%)
23380
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3200
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 74151
60.2%
Common 49064
39.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
12581
17.0%
6291
 
8.5%
5992
 
8.1%
5546
 
7.5%
5397
 
7.3%
5397
 
7.3%
5396
 
7.3%
4523
 
6.1%
1925
 
2.6%
1685
 
2.3%
Other values (86) 19418
26.2%
Common
ValueCountFrequency (%)
23380
47.7%
1 5638
 
11.5%
- 3200
 
6.5%
2 2961
 
6.0%
3 2200
 
4.5%
4 2136
 
4.4%
5 1823
 
3.7%
6 1765
 
3.6%
8 1545
 
3.1%
7 1539
 
3.1%
Other values (2) 2877
 
5.9%

Most occurring blocks

ValueCountFrequency (%)
Hangul 74151
60.2%
ASCII 49064
39.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
23380
47.7%
1 5638
 
11.5%
- 3200
 
6.5%
2 2961
 
6.0%
3 2200
 
4.5%
4 2136
 
4.4%
5 1823
 
3.7%
6 1765
 
3.6%
8 1545
 
3.1%
7 1539
 
3.1%
Other values (2) 2877
 
5.9%
Hangul
ValueCountFrequency (%)
12581
17.0%
6291
 
8.5%
5992
 
8.1%
5546
 
7.5%
5397
 
7.3%
5397
 
7.3%
5396
 
7.3%
4523
 
6.1%
1925
 
2.6%
1685
 
2.3%
Other values (86) 19418
26.2%

주용도
Categorical

IMBALANCE 

Distinct: 27
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 42.3 KiB

Value | Count
단독주택 | 3250
창고시설 | 599
제1종근린생활시설 | 474
제2종근린생활시설 | 396
동.식물관련시설 | 376
Other values (22) | 302

Length

Max length: 10
Median length: 4
Mean length: 5.1410043
Min length: 2

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 창고시설
2nd row: 단독주택
3rd row: 단독주택
4th row: 노유자시설
5th row: 제1종근린생활시설

Common Values

Value | Count | Frequency (%)
단독주택 | 3250 | 60.2%
창고시설 | 599 | 11.1%
제1종근린생활시설 | 474 | 8.8%
제2종근린생활시설 | 396 | 7.3%
동.식물관련시설 | 376 | 7.0%
숙박시설 | 46 | 0.9%
공장 | 36 | 0.7%
공동주택 | 34 | 0.6%
노유자시설 | 22 | 0.4%
업무시설 | 19 | 0.4%
Other values (17) | 145 | 2.7%

Length

Histogram of lengths of the category

착공일
Text

MISSING 

Distinct: 2624
Distinct (%): 51.6%
Missing: 307
Missing (%): 5.7%
Memory size: 42.3 KiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 50900
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1299
Unique (%): 25.5%

Sample

1st row: 2020-07-28
2nd row: 2020-07-22
3rd row: 2020-07-22
4th row: 2020-07-10
5th row: 2020-07-08

Value | Count | Frequency (%)
2010-03-20 | 10 | 0.2%
2010-11-05 | 9 | 0.2%
2019-04-05 | 9 | 0.2%
2017-05-10 | 8 | 0.2%
2010-02-18 | 8 | 0.2%
2016-06-15 | 7 | 0.1%
2016-11-28 | 7 | 0.1%
2017-04-26 | 7 | 0.1%
2013-04-10 | 7 | 0.1%
2012-08-21 | 7 | 0.1%
Other values (2614) | 5011 | 98.4%
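
착공일 is reported as Text even though every non-missing value has length 10 and looks like a YYYY-MM-DD date; the report does not say why it was not typed as a date. The sketch below shows one way to validate such strings before converting them. The helper `parse_iso_date` and the malformed value "2020-13-01" are hypothetical and only illustrate the failure path.

```python
from datetime import date

def parse_iso_date(text):
    """Return a date for 'YYYY-MM-DD' text, or None if missing or malformed."""
    if text is None or len(text) != 10:
        return None
    try:
        return date.fromisoformat(text)
    except ValueError:
        return None

# Two sample values from the report, one missing value, and one
# hypothetical malformed value (month 13) to show the failure path.
values = ["2020-07-28", "2020-07-22", None, "2020-13-01"]
parsed = [parse_iso_date(v) for v in values]
print(parsed)

# Consistency check on the reported figures: 5397 - 307 non-missing
# values of length 10 give the reported total of 50900 characters.
non_missing = 5397 - 307
print(non_missing * 10)
```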

Most occurring characters

ValueCountFrequency (%)
0 12599
24.8%
- 10180
20.0%
1 9210
18.1%
2 8477
16.7%
5 1702
 
3.3%
3 1604
 
3.2%
9 1587
 
3.1%
4 1400
 
2.8%
8 1391
 
2.7%
7 1383
 
2.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 40720
80.0%
Dash Punctuation 10180
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 12599
30.9%
1 9210
22.6%
2 8477
20.8%
5 1702
 
4.2%
3 1604
 
3.9%
9 1587
 
3.9%
4 1400
 
3.4%
8 1391
 
3.4%
7 1383
 
3.4%
6 1367
 
3.4%
Dash Punctuation
ValueCountFrequency (%)
- 10180
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 50900
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 12599
24.8%
- 10180
20.0%
1 9210
18.1%
2 8477
16.7%
5 1702
 
3.3%
3 1604
 
3.2%
9 1587
 
3.1%
4 1400
 
2.8%
8 1391
 
2.7%
7 1383
 
2.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50900
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 12599
24.8%
- 10180
20.0%
1 9210
18.1%
2 8477
16.7%
5 1702
 
3.3%
3 1604
 
3.2%
9 1587
 
3.1%
4 1400
 
2.8%
8 1391
 
2.7%
7 1383
 
2.7%

준공일
Date

MISSING 

Distinct: 2274
Distinct (%): 44.6%
Missing: 303
Missing (%): 5.6%
Memory size: 42.3 KiB
Minimum: 2009-01-02 00:00:00
Maximum: 2020-08-11 00:00:00
Histogram with fixed size bins (bins=50)

시공업체명(전화번화)
Text

MISSING

Distinct: 348
Distinct (%): 75.2%
Missing: 4934
Missing (%): 91.4%
Memory size: 42.3 KiB

Length

Max length: 32
Median length: 30
Mean length: 20.546436
Min length: 5

Characters and Unicode

Total characters: 9513
Distinct characters: 206
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 287
Unique (%): 62.0%

Sample

1st row: (주)덕산종합건설(054-534-0301)
2nd row: (주)태흥종합건설(055-741-5678)
3rd row: 주식회사 화성종합건설(055-251-7555)
4th row: 동휘종합건설㈜(055-863-4561)
5th row: (주)민수종합건설(055-867-8438)

Value | Count | Frequency (%)
주식회사 | 13 | 2.7%
정남종합건설(주)(055-863-2523 | 12 | 2.4%
주)와이비(055-862-0488 | 10 | 2.0%
주)금강종합건설(055-864-0599 | 10 | 2.0%
한라종합건설(주)(055-253-5823 | 8 | 1.6%
수광종합건설(주)(055-758-9000 | 6 | 1.2%
주)민수종합건설(055-867-8438 | 4 | 0.8%
용봉종합건설(주)(053-633-5795 | 4 | 0.8%
한남종합건설주식회사(055-863-2523 | 4 | 0.8%
거성종합건설(주)(055-864-8043 | 4 | 0.8%
Other values (351) | 415 | 84.7%
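
The values of this column combine a company name and a phone number in a single string, as in the sample rows. The sketch below shows one way to separate the two with a regular expression. The pattern and the helper `split_name_and_phone` are assumptions made for illustration; they are not part of the dataset or the profiler, and entries that do not end in a parenthesised phone number are returned unchanged with no phone.

```python
import re

# Assumed layout: "<name>(<area>-<exchange>-<line>)" at the end of the string.
ENTRY_RE = re.compile(r"^(?P<name>.+)\((?P<phone>\d{2,4}-\d{3,4}-\d{4})\)$")

def split_name_and_phone(entry):
    """Split 'name(phone)' into (name, phone); phone is None if absent."""
    match = ENTRY_RE.match(entry.strip())
    if match is None:
        return entry, None
    return match.group("name").strip(), match.group("phone")

# Sample values copied from the report.
samples = [
    "(주)덕산종합건설(054-534-0301)",
    "주식회사 화성종합건설(055-251-7555)",
    "동휘종합건설㈜(055-863-4561)",
]
parsed = [split_name_and_phone(s) for s in samples]
for name, phone in parsed:
    print(name, "|", phone)

# Hypothetical entry without a phone number.
no_phone = split_name_and_phone("건축사사무소")
print(no_phone)
```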

Most occurring characters

ValueCountFrequency (%)
5 1001
 
10.5%
- 800
 
8.4%
0 704
 
7.4%
) 691
 
7.3%
( 689
 
7.2%
398
 
4.2%
389
 
4.1%
3 374
 
3.9%
374
 
3.9%
8 359
 
3.8%
Other values (196) 3734
39.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4065
42.7%
Other Letter 3182
33.4%
Dash Punctuation 800
 
8.4%
Close Punctuation 691
 
7.3%
Open Punctuation 689
 
7.2%
Other Symbol 57
 
0.6%
Space Separator 27
 
0.3%
Uppercase Letter 2
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
398
 
12.5%
389
 
12.2%
374
 
11.8%
267
 
8.4%
264
 
8.3%
106
 
3.3%
106
 
3.3%
100
 
3.1%
54
 
1.7%
48
 
1.5%
Other values (179) 1076
33.8%
Decimal Number
ValueCountFrequency (%)
5 1001
24.6%
0 704
17.3%
3 374
 
9.2%
8 359
 
8.8%
2 341
 
8.4%
6 306
 
7.5%
4 290
 
7.1%
7 270
 
6.6%
1 253
 
6.2%
9 167
 
4.1%
Uppercase Letter
ValueCountFrequency (%)
N 1
50.0%
H 1
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 800
100.0%
Close Punctuation
ValueCountFrequency (%)
) 691
100.0%
Open Punctuation
ValueCountFrequency (%)
( 689
100.0%
Other Symbol
ValueCountFrequency (%)
57
100.0%
Space Separator
ValueCountFrequency (%)
27
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 6272
65.9%
Hangul 3239
34.0%
Latin 2
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
398
 
12.3%
389
 
12.0%
374
 
11.5%
267
 
8.2%
264
 
8.2%
106
 
3.3%
106
 
3.3%
100
 
3.1%
57
 
1.8%
54
 
1.7%
Other values (180) 1124
34.7%
Common
ValueCountFrequency (%)
5 1001
16.0%
- 800
12.8%
0 704
11.2%
) 691
11.0%
( 689
11.0%
3 374
 
6.0%
8 359
 
5.7%
2 341
 
5.4%
6 306
 
4.9%
4 290
 
4.6%
Other values (4) 717
11.4%
Latin
ValueCountFrequency (%)
N 1
50.0%
H 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6274
66.0%
Hangul 3182
33.4%
None 57
 
0.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 1001
16.0%
- 800
12.8%
0 704
11.2%
) 691
11.0%
( 689
11.0%
3 374
 
6.0%
8 359
 
5.7%
2 341
 
5.4%
6 306
 
4.9%
4 290
 
4.6%
Other values (6) 719
11.5%
Hangul
ValueCountFrequency (%)
398
 
12.5%
389
 
12.2%
374
 
11.8%
267
 
8.4%
264
 
8.3%
106
 
3.3%
106
 
3.3%
100
 
3.1%
54
 
1.7%
48
 
1.5%
Other values (179) 1076
33.8%
None
ValueCountFrequency (%)
57
100.0%

감리사무소명(전화번호)
Text

MISSING

Distinct: 216
Distinct (%): 24.5%
Missing: 4516
Missing (%): 83.7%
Memory size: 42.3 KiB

Length

Max length: 32
Median length: 31
Mean length: 22.438138
Min length: 2

Characters and Unicode

Total characters: 19768
Distinct characters: 194
Distinct categories: 9
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 151
Unique (%): 17.1%

Sample

1st row: 김윤섭건축사(사)(055-864-3315)
2nd row: 예터건축사사무소(070-7504-0153)
3rd row: 장문호건축사(사)(055-864-7400)
4th row: 김윤섭건축사(사)(055-864-3315)
5th row: 예터건축사사무소(070-7504-0153)

Value | Count | Frequency (%)
김윤섭건축사(사)(055-864-3315 | 104 | 10.8%
건축사사무소동성(055-862-5900 | 91 | 9.5%
남해건축사(사)(055-863-4441 | 85 | 8.9%
장문호건축사(사)(055-864-7400 | 63 | 6.6%
장문호건축사사무소(055-864-7400 | 58 | 6.0%
김윤섭건축사사무소(055-864-3315 | 55 | 5.7%
고원건축사사무소(055-863-4300 | 46 | 4.8%
건축사사무소 | 28 | 2.9%
도하건축사사무소(055-863-4182 | 24 | 2.5%
예터건축사사무소(070-7504-0153 | 24 | 2.5%
Other values (234) | 381 | 39.7%

Most occurring characters

ValueCountFrequency (%)
5 2034
 
10.3%
1745
 
8.8%
- 1678
 
8.5%
0 1674
 
8.5%
) 1182
 
6.0%
( 1180
 
6.0%
4 956
 
4.8%
881
 
4.5%
876
 
4.4%
8 784
 
4.0%
Other values (184) 6778
34.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8501
43.0%
Other Letter 7123
36.0%
Dash Punctuation 1678
 
8.5%
Close Punctuation 1182
 
6.0%
Open Punctuation 1180
 
6.0%
Space Separator 78
 
0.4%
Uppercase Letter 23
 
0.1%
Other Punctuation 2
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1745
24.5%
881
12.4%
876
12.3%
580
 
8.1%
579
 
8.1%
169
 
2.4%
168
 
2.4%
166
 
2.3%
129
 
1.8%
127
 
1.8%
Other values (161) 1703
23.9%
Decimal Number
ValueCountFrequency (%)
5 2034
23.9%
0 1674
19.7%
4 956
11.2%
8 784
 
9.2%
3 772
 
9.1%
6 768
 
9.0%
1 542
 
6.4%
7 418
 
4.9%
2 370
 
4.4%
9 183
 
2.2%
Uppercase Letter
ValueCountFrequency (%)
A 5
21.7%
C 5
21.7%
E 3
13.0%
T 3
13.0%
S 3
13.0%
M 3
13.0%
L 1
 
4.3%
Dash Punctuation
ValueCountFrequency (%)
- 1678
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1182
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1180
100.0%
Space Separator
ValueCountFrequency (%)
78
100.0%
Other Punctuation
ValueCountFrequency (%)
& 2
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 12622
63.9%
Hangul 7120
36.0%
Latin 23
 
0.1%
Han 3
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1745
24.5%
881
12.4%
876
12.3%
580
 
8.1%
579
 
8.1%
169
 
2.4%
168
 
2.4%
166
 
2.3%
129
 
1.8%
127
 
1.8%
Other values (158) 1700
23.9%
Common
ValueCountFrequency (%)
5 2034
16.1%
- 1678
13.3%
0 1674
13.3%
) 1182
9.4%
( 1180
9.3%
4 956
7.6%
8 784
 
6.2%
3 772
 
6.1%
6 768
 
6.1%
1 542
 
4.3%
Other values (6) 1052
8.3%
Latin
ValueCountFrequency (%)
A 5
21.7%
C 5
21.7%
E 3
13.0%
T 3
13.0%
S 3
13.0%
M 3
13.0%
L 1
 
4.3%
Han
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12645
64.0%
Hangul 7120
36.0%
CJK 2
 
< 0.1%
CJK Compat Ideographs 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 2034
16.1%
- 1678
13.3%
0 1674
13.2%
) 1182
9.3%
( 1180
9.3%
4 956
7.6%
8 784
 
6.2%
3 772
 
6.1%
6 768
 
6.1%
1 542
 
4.3%
Other values (13) 1075
8.5%
Hangul
ValueCountFrequency (%)
1745
24.5%
881
12.4%
876
12.3%
580
 
8.1%
579
 
8.1%
169
 
2.4%
168
 
2.4%
166
 
2.3%
129
 
1.8%
127
 
1.8%
Other values (158) 1700
23.9%
CJK Compat Ideographs
ValueCountFrequency (%)
1
100.0%
CJK
ValueCountFrequency (%)
1
50.0%
1
50.0%

설계사무소명(전화번호)
Text

Distinct: 501
Distinct (%): 9.3%
Missing: 26
Missing (%): 0.5%
Memory size: 42.3 KiB

Length

Max length: 32
Median length: 31
Mean length: 22.481475
Min length: 2

Characters and Unicode

Total characters: 120748
Distinct characters: 266
Distinct categories: 9
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 327
Unique (%): 6.1%

Sample

1st row: 건축사사무소가인(055-833-3724)
2nd row: 김윤섭건축사(사)(055-864-3315)
3rd row: 남해건축사(사)(055-863-4441)
4th row: 남해건축사(사)(055-863-4441)
5th row: 남해건축사(사)(055-863-4441)

Value | Count | Frequency (%)
김윤섭건축사(사)(055-864-3315 | 864 | 15.5%
남해건축사(사)(055-863-4441 | 810 | 14.5%
건축사사무소동성(055-862-5900 | 644 | 11.5%
고원건축사사무소(055-863-4300 | 584 | 10.5%
장문호건축사(사)(055-864-7400 | 460 | 8.2%
장문호건축사사무소(055-864-7400 | 438 | 7.8%
김윤섭건축사사무소(055-864-3315 | 184 | 3.3%
부산종합건축사(사)(055-863-1156 | 128 | 2.3%
예터건축사사무소(070-7504-0153 | 111 | 2.0%
부산종합건축사사무소(055-863-1156 | 103 | 1.8%
Other values (545) | 1258 | 22.5%

Most occurring characters

ValueCountFrequency (%)
5 12797
 
10.6%
10762
 
8.9%
0 10579
 
8.8%
- 10484
 
8.7%
) 7749
 
6.4%
( 7740
 
6.4%
4 6625
 
5.5%
5422
 
4.5%
5413
 
4.5%
3 5258
 
4.4%
Other values (256) 37919
31.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 52707
43.7%
Other Letter 41794
34.6%
Dash Punctuation 10484
 
8.7%
Close Punctuation 7749
 
6.4%
Open Punctuation 7740
 
6.4%
Space Separator 214
 
0.2%
Uppercase Letter 49
 
< 0.1%
Other Punctuation 10
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
10762
25.8%
5422
13.0%
5413
13.0%
3029
 
7.2%
3027
 
7.2%
1084
 
2.6%
1077
 
2.6%
1073
 
2.6%
920
 
2.2%
912
 
2.2%
Other values (228) 9075
21.7%
Uppercase Letter
ValueCountFrequency (%)
M 9
18.4%
A 8
16.3%
E 7
14.3%
S 6
12.2%
C 6
12.2%
L 4
8.2%
T 4
8.2%
D 2
 
4.1%
U 1
 
2.0%
G 1
 
2.0%
Decimal Number
ValueCountFrequency (%)
5 12797
24.3%
0 10579
20.1%
4 6625
12.6%
3 5258
10.0%
6 4998
 
9.5%
8 4965
 
9.4%
1 3120
 
5.9%
7 1947
 
3.7%
2 1479
 
2.8%
9 939
 
1.8%
Other Punctuation
ValueCountFrequency (%)
. 5
50.0%
& 5
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 10484
100.0%
Close Punctuation
ValueCountFrequency (%)
) 7749
100.0%
Open Punctuation
ValueCountFrequency (%)
( 7740
100.0%
Space Separator
ValueCountFrequency (%)
214
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 78905
65.3%
Hangul 41791
34.6%
Latin 49
 
< 0.1%
Han 3
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
10762
25.8%
5422
13.0%
5413
13.0%
3029
 
7.2%
3027
 
7.2%
1084
 
2.6%
1077
 
2.6%
1073
 
2.6%
920
 
2.2%
912
 
2.2%
Other values (225) 9072
21.7%
Common
ValueCountFrequency (%)
5 12797
16.2%
0 10579
13.4%
- 10484
13.3%
) 7749
9.8%
( 7740
9.8%
4 6625
8.4%
3 5258
6.7%
6 4998
 
6.3%
8 4965
 
6.3%
1 3120
 
4.0%
Other values (7) 4590
 
5.8%
Latin
ValueCountFrequency (%)
M 9
18.4%
A 8
16.3%
E 7
14.3%
S 6
12.2%
C 6
12.2%
L 4
8.2%
T 4
8.2%
D 2
 
4.1%
U 1
 
2.0%
G 1
 
2.0%
Han
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 78954
65.4%
Hangul 41791
34.6%
CJK 2
 
< 0.1%
CJK Compat Ideographs 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 12797
16.2%
0 10579
13.4%
- 10484
13.3%
) 7749
9.8%
( 7740
9.8%
4 6625
8.4%
3 5258
6.7%
6 4998
 
6.3%
8 4965
 
6.3%
1 3120
 
4.0%
Other values (18) 4639
 
5.9%
Hangul
ValueCountFrequency (%)
10762
25.8%
5422
13.0%
5413
13.0%
3029
 
7.2%
3027
 
7.2%
1084
 
2.6%
1077
 
2.6%
1073
 
2.6%
920
 
2.2%
912
 
2.2%
Other values (225) 9072
21.7%
CJK
ValueCountFrequency (%)
1
50.0%
1
50.0%
CJK Compat Ideographs
ValueCountFrequency (%)
1
100.0%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

Index | 시공 대지위치 | 주용도 | 착공일 | 준공일 | 시공업체명(전화번화) | 감리사무소명(전화번호) | 설계사무소명(전화번호)
0 | 경상남도 남해군 남면 상가리 144 | 창고시설 | <NA> | <NA> | <NA> | <NA> | <NA>
1 | 경상남도 남해군 남면 평산리 1570-2 | 단독주택 | <NA> | <NA> | <NA> | <NA> | 건축사사무소가인(055-833-3724)
2 | 경상남도 남해군 창선면 동대리 산 157-23 | 단독주택 | <NA> | <NA> | <NA> | <NA> | 김윤섭건축사(사)(055-864-3315)
3 | 경상남도 남해군 남해읍 평리 980-2 | 노유자시설 | <NA> | <NA> | <NA> | 김윤섭건축사(사)(055-864-3315) | 남해건축사(사)(055-863-4441)
4 | 경상남도 남해군 상주면 양아리 1853-10 | 제1종근린생활시설 | 2020-07-28 | 2020-08-11 | <NA> | <NA> | 남해건축사(사)(055-863-4441)
5 | 경상남도 남해군 삼동면 봉화리 237 | 단독주택 | <NA> | <NA> | <NA> | <NA> | 남해건축사(사)(055-863-4441)
6 | 경상남도 남해군 서면 정포리 816 외2필지 | 단독주택 | <NA> | <NA> | <NA> | <NA> | 장문호건축사사무소(055-864-7400)
7 | 경상남도 남해군 남해읍 차산리 238-3 | 창고시설 | <NA> | <NA> | <NA> | <NA> | 장문호건축사사무소(055-864-7400)
8 | 경상남도 남해군 창선면 당저리 227 외1필지 | 숙박시설 | <NA> | <NA> | <NA> | <NA> | 진원건축사사무소(055-852-0071)
9 | 경상남도 남해군 창선면 동대리 산 32 | 제2종근린생활시설 | <NA> | <NA> | <NA> | <NA> | 건축사사무소이현(055-743-0017)

Index | 시공 대지위치 | 주용도 | 착공일 | 준공일 | 시공업체명(전화번화) | 감리사무소명(전화번호) | 설계사무소명(전화번호)
5387 | 경상남도 남해군 이동면 무림리 1049 | 창고시설 | 2006-05-08 | 2010-03-22 | <NA> | <NA> | 부산종합건축사사무소(055-863-1156)
5388 | 경상남도 남해군 삼동면 물건리 1133 | 단독주택 | 2006-02-22 | 2010-08-19 | <NA> | <NA> | 김윤섭건축사사무소(055-864-3315)
5389 | 경상남도 남해군 창선면 당저리 375-2 | 제1종근린생활시설 | 2005-10-07 | 2013-12-19 | <NA> | <NA> | 장문호건축사(사)(055-864-7400)
5390 | 경상남도 남해군 미조면 송정리 797 | 숙박시설 | 2006-03-02 | 2010-06-25 | <NA> | <NA> | 김윤섭건축사(사)(055-864-3315)
5391 | 경상남도 남해군 설천면 남양리 601-4 | 동.식물관련시설 | 2005-10-05 | 2010-07-16 | <NA> | <NA> | 김윤섭건축사사무소(0545-864-3315)
5392 | 경상남도 남해군 설천면 남양리 601-9 | 동.식물관련시설 | 2005-10-05 | 2010-07-16 | <NA> | <NA> | 김윤섭건축사사무소(0545-864-3315)
5393 | 경상남도 남해군 설천면 남양리 601-10 | 동.식물관련시설 | 2005-10-05 | 2010-07-16 | <NA> | <NA> | 김윤섭건축사사무소(0545-864-3315)
5394 | 경상남도 남해군 서면 서상리 622-1 | 단독주택 | 2004-10-15 | 2013-05-09 | <NA> | <NA> | 남해건축사(사)(055-863-4441)
5395 | 경상남도 남해군 서면 서상리 619-3 외1필지 | 창고시설 | 2004-04-03 | 2010-05-13 | <NA> | <NA> | 김윤섭건축사(사)(055-864-3315)
5396 | 경상남도 남해군 남해읍 북변리 250-3 | 제2종근린생활시설 | 2002-11-01 | 2013-07-05 | <NA> | <NA> | 남해건축사(사)(055-863-4441)

Duplicate rows

Most frequently occurring

Index | 시공 대지위치 | 주용도 | 착공일 | 준공일 | 시공업체명(전화번화) | 감리사무소명(전화번호) | 설계사무소명(전화번호) | # duplicates
7 | 경상남도 남해군 상주면 상주리 1056-5 외1필지 | 단독주택 | 2012-02-23 | 2012-07-31 | <NA> | 건축사사무소바로(055-748-5771) | 건축사사무소바로(055-748-5771) | 3
13 | 경상남도 남해군 창선면 부윤리 157 | 단독주택 | <NA> | <NA> | <NA> | <NA> | 남해건축사(사)(055-863-4441) | 3
0 | 경상남도 남해군 고현면 대사리 산 80-7 | 단독주택 | <NA> | <NA> | <NA> | <NA> | 건축사사무소이현(055-743-0017) | 2
1 | 경상남도 남해군 고현면 도마리 1094-1 | 단독주택 | 2019-01-14 | 2019-02-28 | <NA> | <NA> | 고원건축사사무소(055-863-4300) | 2
2 | 경상남도 남해군 남면 당항리 31-1 | 단독주택 | 2017-06-07 | 2018-08-10 | <NA> | <NA> | 건축사사무소동성(055-862-5900) | 2
3 | 경상남도 남해군 남면 홍현리 산 8 외1필지 | 단독주택 | <NA> | <NA> | <NA> | <NA> | 남해건축사(사)(055-863-4441) | 2
4 | 경상남도 남해군 남해읍 선소리 67-6 | 단독주택 | 2019-01-11 | 2019-01-23 | <NA> | <NA> | 건축사사무소동성(055-862-5900) | 2
5 | 경상남도 남해군 남해읍 선소리 67-7 | 단독주택 | 2019-01-10 | 2019-01-23 | <NA> | <NA> | 고원건축사사무소(055-863-4300) | 2
6 | 경상남도 남해군 남해읍 평리 1680 | 창고시설 | 2019-02-18 | 2019-02-25 | <NA> | <NA> | 남해건축사(사)(055-863-4441) | 2
8 | 경상남도 남해군 설천면 문항리 6 외1필지 | 단독주택 | 2009-10-25 | 2010-06-15 | <NA> | <NA> | 남해건축사(사)(055-863-4441) | 2
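
A table like this one can be produced by counting identical records. The sketch below groups full rows with `collections.Counter`, using two of the duplicated records above plus one non-duplicated record from the Sample table. It is a minimal illustration, not the profiler's implementation, and the report does not state whether its figure of 18 counts duplicated groups or surplus copies, so both are computed.

```python
from collections import Counter

# Rows as tuples in column order; None stands for <NA>.
row_a = ("경상남도 남해군 창선면 부윤리 157", "단독주택",
         None, None, None, None, "남해건축사(사)(055-863-4441)")
row_b = ("경상남도 남해군 남해읍 평리 1680", "창고시설",
         "2019-02-18", "2019-02-25", None, None,
         "남해건축사(사)(055-863-4441)")
row_c = ("경상남도 남해군 남면 상가리 144", "창고시설",
         None, None, None, None, None)

rows = [row_a, row_a, row_a, row_b, row_b, row_c]

counts = Counter(rows)
duplicated = {row: n for row, n in counts.items() if n > 1}

n_groups = len(duplicated)                             # distinct duplicated records
n_extra_copies = sum(n - 1 for n in duplicated.values())  # surplus copies

for row, n in sorted(duplicated.items(), key=lambda item: -item[1]):
    print(row[0], "|", n)
print(n_groups, n_extra_copies)
```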