Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 14
Duplicate rows (%): 0.1%
Total size in memory: 742.2 KiB
Average record size in memory: 76.0 B
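
The two derived figures in this block follow from the raw counts. A minimal sketch, assuming the report uses binary kibibytes (1 KiB = 1024 bytes) and rounds to one decimal place; the variable names are illustrative and not part of the report:

```python
# Values copied from the "Dataset statistics" block of the report.
KIB = 1024            # bytes per KiB (assumed binary prefix)
n_rows = 10_000       # number of observations
total_kib = 742.2     # total size in memory, in KiB
duplicate_rows = 14   # number of duplicate rows

# Average record size = total bytes / number of rows.
avg_record_bytes = total_kib * KIB / n_rows

# Share of duplicate rows, as a percentage of all rows.
duplicate_pct = 100 * duplicate_rows / n_rows

print(f"Average record size: {avg_record_bytes:.1f} B")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
```

With these inputs the script reproduces the reported 76.0 B and 0.1%.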

Variable types

Categorical: 5
Text: 1
Numeric: 2

Dataset

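Description: Provides the standard market value (시가표준액), the basis for levying local tax, of general buildings in Boryeong-si, Chungcheongnam-do, for 2017 to 2021. *For reference when comparing property values by item.
URL: https://www.data.go.kr/data/15079936/fileData.do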

Alerts

시도명 has constant value "충청남도" (Constant)
시군구명 has constant value "보령시" (Constant)
자치단체코드 has constant value "44180" (Constant)
Dataset has 14 (0.1%) duplicate rows (Duplicates)
과세년도 is highly overall correlated with 기준일자 (High correlation)
기준일자 is highly overall correlated with 과세년도 (High correlation)
시가표준액 is highly overall correlated with 연면적 (High correlation)
연면적 is highly overall correlated with 시가표준액 (High correlation)
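
A "Constant" alert means a column holds a single distinct value across all rows. The check can be sketched as follows; `constant_columns` is a hypothetical helper written for illustration, not a ydata-profiling function, and the rows are a subset of columns from three records in the report's Sample section:

```python
def constant_columns(rows):
    """Return the names of columns that hold a single distinct value."""
    if not rows:
        return []
    return [col for col in rows[0] if len({row[col] for row in rows}) == 1]


# Three records (subset of columns) taken from the report's Sample section.
rows = [
    {"시도명": "충청남도", "시군구명": "보령시", "자치단체코드": "44180", "과세년도": "2019"},
    {"시도명": "충청남도", "시군구명": "보령시", "자치단체코드": "44180", "과세년도": "2021"},
    {"시도명": "충청남도", "시군구명": "보령시", "자치단체코드": "44180", "과세년도": "2017"},
]

print(constant_columns(rows))
```

Here the first three columns are flagged, while 과세년도 is not because it varies between rows.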

Reproduction

Analysis started: 2023-12-12 22:58:55.271357
Analysis finished: 2023-12-12 22:58:56.485766
Duration: 1.21 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
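
A report of this kind can be regenerated along the following lines. This is a sketch under assumptions, not the exact command used here: the CSV path, the `cp949` encoding, and the report title are all hypothetical, and the original run's settings live in config.json, which is not reproduced in this text.

```python
def build_report(csv_path, html_path="report.html"):
    """Profile a CSV file with ydata-profiling and write an HTML report."""
    # Imported lazily so the function can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # encoding="cp949" is an assumption; use the encoding of the downloaded file.
    df = pd.read_csv(csv_path, encoding="cp949")

    # The title is illustrative only.
    report = ProfileReport(df, title="Boryeong-si general buildings")
    report.to_file(html_path)
    return html_path
```

Calling `build_report("buildings.csv")` (a hypothetical file name) would write `report.html` to the working directory.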

Variables

시도명 (province name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value  Count
충청남도  10000

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 충청남도
2nd row: 충청남도
3rd row: 충청남도
4th row: 충청남도
5th row: 충청남도

Common Values

Value  Count  Frequency (%)
충청남도  10000  100.0%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: plot of the value counts listed under Common Values

시군구명 (city/county/district name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value  Count
보령시  10000

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 보령시
2nd row: 보령시
3rd row: 보령시
4th row: 보령시
5th row: 보령시

Common Values

Value  Count  Frequency (%)
보령시  10000  100.0%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: plot of the value counts listed under Common Values

자치단체코드 (local government code)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value  Count
44180  10000

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 44180
2nd row: 44180
3rd row: 44180
4th row: 44180
5th row: 44180

Common Values

Value  Count  Frequency (%)
44180  10000  100.0%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: plot of the value counts listed under Common Values

과세년도 (tax year)
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value  Count
2021  2914
2018  2534
2019  2454
2017  2098

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2019
2nd row: 2019
3rd row: 2021
4th row: 2017
5th row: 2017

Common Values

Value  Count  Frequency (%)
2021  2914  29.1%
2018  2534  25.3%
2019  2454  24.5%
2017  2098  21.0%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: plot of the value counts listed under Common Values

물건지 (property location)
Text

Distinct: 8340
Distinct (%): 83.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 34
Median length: 31
Mean length: 25.3353
Min length: 19

Characters and Unicode

Total characters: 253353
Distinct characters: 275
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
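
As an illustration of those character properties, the sketch below counts the Unicode general category of each character in one address value from this column, using Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical. The two-letter codes correspond to the category names used in this report: `Lo` is Other Letter, `Nd` is Decimal Number, `Zs` is Space Separator, `Pd` is Dash Punctuation, `Ps` is Open Punctuation, and `Pe` is Close Punctuation.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# One address value from the 물건지 column of this report.
sample = "[ 구장터로 6 ] 0000동 8101호"
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```

For this value the counts are 9 decimal digits, 6 Hangul letters, 5 spaces, and one opening and one closing bracket.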

Unique

Unique: 7000
Unique (%): 70.0%

Sample

1st row: 충청남도 보령시 청라면 황룡리 957 101호
2nd row: 충청남도 보령시 요암동 371-8 1동 102호
3rd row: [ 구장터로 6 ] 0000동 8101호
4th row: 충청남도 보령시 죽정동 562 1동 8101호
5th row: 충청남도 보령시 주산면 증산리 421 101호

Value  Count  Frequency (%)
(blank)  8012  13.4%
충청남도  5994  10.0%
보령시  5994  10.0%
0000동  3708  6.2%
101호  2502  4.2%
0101호  1543  2.6%
102호  981  1.6%
1동  970  1.6%
천북면  834  1.4%
대천동  599  1.0%
Other values (4450)  28677  47.9%

Most occurring characters

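Value  Count  Frequency (%)
(space)  49814  19.7%
0  31415  12.4%
1  22205  8.8%
(Hangul character, not preserved)  10088  4.0%
2  8275  3.3%
(Hangul character, not preserved)  8090  3.2%
(Hangul character, not preserved)  6795  2.7%
(Hangul character, not preserved)  6687  2.6%
(Hangul character, not preserved)  6333  2.5%
(Hangul character, not preserved)  6310  2.5%
Other values (265)  97341  38.4%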

Most occurring categories

Value  Count  Frequency (%)
Other Letter  103374  40.8%
Decimal Number  86993  34.3%
Space Separator  49814  19.7%
Dash Punctuation  5160  2.0%
Open Punctuation  4006  1.6%
Close Punctuation  4006  1.6%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
(not preserved)  10088  9.8%
(not preserved)  8090  7.8%
(not preserved)  6795  6.6%
(not preserved)  6687  6.5%
(not preserved)  6333  6.1%
(not preserved)  6310  6.1%
(not preserved)  6202  6.0%
(not preserved)  6163  6.0%
(not preserved)  6081  5.9%
(not preserved)  4199  4.1%
Other values (251)  36426  35.2%

Decimal Number
Value  Count  Frequency (%)
0  31415  36.1%
1  22205  25.5%
2  8275  9.5%
3  5153  5.9%
4  4373  5.0%
6  3382  3.9%
7  3296  3.8%
5  3262  3.7%
8  3148  3.6%
9  2484  2.9%

Space Separator
Value  Count  Frequency (%)
(space)  49814  100.0%

Dash Punctuation
Value  Count  Frequency (%)
-  5160  100.0%

Open Punctuation
Value  Count  Frequency (%)
[  4006  100.0%

Close Punctuation
Value  Count  Frequency (%)
]  4006  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Common  149979  59.2%
Hangul  103374  40.8%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
(not preserved)  10088  9.8%
(not preserved)  8090  7.8%
(not preserved)  6795  6.6%
(not preserved)  6687  6.5%
(not preserved)  6333  6.1%
(not preserved)  6310  6.1%
(not preserved)  6202  6.0%
(not preserved)  6163  6.0%
(not preserved)  6081  5.9%
(not preserved)  4199  4.1%
Other values (251)  36426  35.2%

Common
Value  Count  Frequency (%)
(space)  49814  33.2%
0  31415  20.9%
1  22205  14.8%
2  8275  5.5%
-  5160  3.4%
3  5153  3.4%
4  4373  2.9%
[  4006  2.7%
]  4006  2.7%
6  3382  2.3%
Other values (4)  12190  8.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  149979  59.2%
Hangul  103374  40.8%

Most frequent character per block

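ASCII
Value  Count  Frequency (%)
(space)  49814  33.2%
0  31415  20.9%
1  22205  14.8%
2  8275  5.5%
-  5160  3.4%
3  5153  3.4%
4  4373  2.9%
[  4006  2.7%
]  4006  2.7%
6  3382  2.3%
Other values (4)  12190  8.1%

Hangul
Value  Count  Frequency (%)
(not preserved)  10088  9.8%
(not preserved)  8090  7.8%
(not preserved)  6795  6.6%
(not preserved)  6687  6.5%
(not preserved)  6333  6.1%
(not preserved)  6310  6.1%
(not preserved)  6202  6.0%
(not preserved)  6163  6.0%
(not preserved)  6081  5.9%
(not preserved)  4199  4.1%
Other values (251)  36426  35.2%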

시가표준액 (standard market value)
Real number (ℝ)

HIGH CORRELATION

Distinct: 8995
Distinct (%): 90.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 66169334
Minimum: 29260
Maximum: 7.9289672 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 29260
5-th percentile: 519931.5
Q1: 3570480
Median: 18658560
Q3: 63466500
95-th percentile: 2.3618863 × 10^8
Maximum: 7.9289672 × 10^9
Range: 7.9289379 × 10^9
Interquartile range (IQR): 59896020

Descriptive statistics

Standard deviation: 2.324486 × 10^8
Coefficient of variation (CV): 3.5129355
Kurtosis: 433.12963
Mean: 66169334
Median Absolute Deviation (MAD): 17359920
Skewness: 17.712559
Sum: 6.6169334 × 10^11
Variance: 5.4032352 × 10^16
Monotonicity: Not monotonic
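
Several of these statistics are derived from others, which gives a quick consistency check on the reported numbers. A minimal sketch using the usual definitions (range = maximum − minimum, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared); the exact maximum is taken from the report's list of largest values, and because the standard deviation is printed in rounded form, the derived CV and variance are approximate:

```python
# Values copied from the 시가표준액 statistics in this report.
minimum, maximum = 29_260, 7_928_967_200
q1, q3 = 3_570_480, 63_466_500
mean = 66_169_334
std = 2.324486e8  # printed rounded in the report

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance

print(value_range, iqr, round(cv, 4), f"{variance:.4e}")
```

The range and IQR match the report exactly (7928937940 and 59896020), and the CV and variance agree with the reported 3.5129355 and 5.4032352 × 10^16 to within rounding.
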
Figure: Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
49695360  21  0.2%
96946670  14  0.1%
22106700  9  0.1%
1782000  8  0.1%
480000  8  0.1%
288000  8  0.1%
21903420  8  0.1%
450000  7  0.1%
90000  7  0.1%
55852500  7  0.1%
Other values (8985)  9903  99.0%

Minimum 10 values

Value  Count  Frequency (%)
29260  1  < 0.1%
36000  1  < 0.1%
43200  1  < 0.1%
45000  3  < 0.1%
47520  1  < 0.1%
50000  1  < 0.1%
51750  1  < 0.1%
54000  1  < 0.1%
55440  1  < 0.1%
60000  2  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
7928967200  1  < 0.1%
7268134450  1  < 0.1%
5930761030  1  < 0.1%
5788629500  1  < 0.1%
5747120320  1  < 0.1%
4816076850  1  < 0.1%
4708265100  1  < 0.1%
4653233430  1  < 0.1%
4191515100  1  < 0.1%
4188195130  1  < 0.1%

연면적 (total floor area)
Real number (ℝ)

HIGH CORRELATION

Distinct: 5842
Distinct (%): 58.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 222.30309
Minimum: 0.72
Maximum: 19459.53
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0.72
5-th percentile: 14.5195
Q1: 49.5
Median: 106.375
Q3: 203.7325
95-th percentile: 756.678
Maximum: 19459.53
Range: 19458.81
Interquartile range (IQR): 154.2325

Descriptive statistics

Standard deviation: 571.66337
Coefficient of variation (CV): 2.5715493
Kurtosis: 370.62155
Mean: 222.30309
Median Absolute Deviation (MAD): 69.415
Skewness: 15.840855
Sum: 2223030.9
Variance: 326799.01
Monotonicity: Not monotonic
Figure: Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
18.0  141  1.4%
27.0  38  0.4%
198.0  31  0.3%
60.0  31  0.3%
66.0  29  0.3%
50.0  28  0.3%
72.0  26  0.3%
9.0  25  0.2%
36.0  25  0.2%
162.0  24  0.2%
Other values (5832)  9602  96.0%

Minimum 10 values

Value  Count  Frequency (%)
0.72  1  < 0.1%
1.0  3  < 0.1%
1.44  1  < 0.1%
1.64  1  < 0.1%
1.76  1  < 0.1%
1.8  4  < 0.1%
1.95  1  < 0.1%
1.98  1  < 0.1%
2.0  5  0.1%
2.25  1  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
19459.53  1  < 0.1%
16465.22  1  < 0.1%
14356.72  2  < 0.1%
12885.92  1  < 0.1%
12871.7  1  < 0.1%
10349.42  2  < 0.1%
9293.41  1  < 0.1%
8991.76  1  < 0.1%
8599.99  1  < 0.1%
7964.19  1  < 0.1%

기준일자 (reference date)
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value  Count
2021-12-31  2914
2018-12-31  2534
2019-12-31  2454
2017-12-31  2098

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2019-12-31
2nd row: 2019-12-31
3rd row: 2021-12-31
4th row: 2017-12-31
5th row: 2017-12-31

Common Values

Value  Count  Frequency (%)
2021-12-31  2914  29.1%
2018-12-31  2534  25.3%
2019-12-31  2454  24.5%
2017-12-31  2098  21.0%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: plot of the value counts listed under Common Values

Interactions

Figure: interaction plots (four panels; images not preserved)

Correlations

Variable  과세년도  시가표준액  연면적  기준일자
과세년도  1.000  0.000  0.000  1.000
시가표준액  0.000  1.000  0.898  0.000
연면적  0.000  0.898  1.000  0.000
기준일자  1.000  0.000  0.000  1.000

Variable  과세년도  기준일자
과세년도  1.000  1.000
기준일자  1.000  1.000

Variable  시가표준액  연면적  과세년도  기준일자
시가표준액  1.000  0.594  0.000  0.000
연면적  0.594  1.000  0.000  0.000
과세년도  0.000  0.000  1.000  1.000
기준일자  0.000  0.000  1.000  1.000
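
A coefficient of 1.000 between 과세년도 and 기준일자 is what one would expect if every tax year maps to exactly one reference date and vice versa, which the sample records suggest (each date is 31 December of the same year). The sketch below checks that property on (year, date) pairs; `is_one_to_one` is a hypothetical helper, and the pairs are copied from records shown in this report rather than from the full dataset:

```python
def is_one_to_one(pairs):
    """Return True if each left value pairs with exactly one right value and vice versa."""
    forward, backward = {}, {}
    for left, right in pairs:
        # setdefault stores the first partner seen and returns it on later visits.
        if forward.setdefault(left, right) != right:
            return False
        if backward.setdefault(right, left) != left:
            return False
    return True


# (과세년도, 기준일자) pairs from records shown in this report.
pairs = [
    ("2019", "2019-12-31"),
    ("2019", "2019-12-31"),
    ("2021", "2021-12-31"),
    ("2017", "2017-12-31"),
    ("2018", "2018-12-31"),
]

print(is_one_to_one(pairs))
```

When the mapping is one-to-one, one of the two columns carries no information beyond the other, so dropping either loses nothing for modelling.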

Missing values

Figure: A simple visualization of nullity by column.
Figure: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

(index)  시도명  시군구명  자치단체코드  과세년도  물건지  시가표준액  연면적  기준일자
50896  충청남도  보령시  44180  2019  충청남도 보령시 청라면 황룡리 957 101호  452400  113.1  2019-12-31
53819  충청남도  보령시  44180  2019  충청남도 보령시 요암동 371-8 1동 102호  22443670  66.147  2019-12-31
13797  충청남도  보령시  44180  2021  [ 구장터로 6 ] 0000동 8101호  32495000  129.98  2021-12-31
80382  충청남도  보령시  44180  2017  충청남도 보령시 죽정동 562 1동 8101호  32187480  92.92  2017-12-31
90068  충청남도  보령시  44180  2017  충청남도 보령시 주산면 증산리 421 101호  1792980  199.22  2017-12-31
73340  충청남도  보령시  44180  2018  [ 구상가길 26 ] 0000동 0301호  5823090  73.71  2018-12-31
91101  충청남도  보령시  44180  2017  충청남도 보령시 천북면 장은리 90-3 103호  7931000  113.3  2017-12-31
51194  충청남도  보령시  44180  2019  충청남도 보령시 청소면 재정리 149-1 103호  783320  195.83  2019-12-31
76994  충청남도  보령시  44180  2018  [ 사호장은로 335 ] 0000동 0101호  7143500  109.9  2018-12-31
62059  충청남도  보령시  44180  2018  충청남도 보령시 주교면 은포리 292-3 1동 102호  32044920  110.12  2018-12-31

Last rows

(index)  시도명  시군구명  자치단체코드  과세년도  물건지  시가표준액  연면적  기준일자
47288  충청남도  보령시  44180  2019  충청남도 보령시 대천동 333-2 1동 101호  10390870  131.53  2019-12-31
65634  충청남도  보령시  44180  2018  충청남도 보령시 청소면 죽림리 582-2 107호  1218000  21.0  2018-12-31
12917  충청남도  보령시  44180  2021  [ 중앙로 47 ] 0000동 0101호  91353750  154.85  2021-12-31
49125  충청남도  보령시  44180  2019  [ 한내로터리길 78-30 ] 0000동 0101호  115409950  149.34  2019-12-31
80770  충청남도  보령시  44180  2017  충청남도 보령시 대천동 618-441 328호  24631080  49.46  2017-12-31
89544  충청남도  보령시  44180  2017  충청남도 보령시 성주면 성주리 238-2 102호  15457500  67.5  2017-12-31
42433  충청남도  보령시  44180  2019  충청남도 보령시 웅천읍 대창리 685-1 101호  30814960  317.68  2019-12-31
23486  충청남도  보령시  44180  2021  [ 중앙로 226 ] 0000동 0201호  55043980  84.294  2021-12-31
80899  충청남도  보령시  44180  2017  충청남도 보령시 대천동 618-227 8101호  22177520  39.1  2017-12-31
29378  충청남도  보령시  44180  2019  충청남도 보령시 천북면 궁포리 107-2 104호  7049000  704.9  2019-12-31

Duplicate rows

Most frequently occurring

(index)  시도명  시군구명  자치단체코드  과세년도  물건지  시가표준액  연면적  기준일자  # duplicates
5  충청남도  보령시  44180  2018  충청남도 보령시 오천면 교성리 1326-20 101호  44741900  389.06  2018-12-31  3
0  충청남도  보령시  44180  2017  충청남도 보령시 미산면 봉성리 462-1 101호  10022400  172.8  2017-12-31  2
1  충청남도  보령시  44180  2017  충청남도 보령시 오천면 오포리 773 117호  268560000  720.0  2017-12-31  2
2  충청남도  보령시  44180  2017  충청남도 보령시 천북면 궁포리 381-2 1동 101호  13608000  1512.0  2017-12-31  2
3  충청남도  보령시  44180  2017  충청남도 보령시 천북면 장은리 905-16 1동 101호  4724400  1574.8  2017-12-31  2
4  충청남도  보령시  44180  2017  충청남도 보령시 청라면 장산리 806 101호  1188000  396.0  2017-12-31  2
6  충청남도  보령시  44180  2018  충청남도 보령시 오천면 오포리 773 122호  18466080  49.64  2018-12-31  2
7  충청남도  보령시  44180  2018  충청남도 보령시 천북면 장은리 703 1동 101호  460800  115.2  2018-12-31  2
8  충청남도  보령시  44180  2019  충청남도 보령시 웅천읍 노천리 563 101호  62741250  371.25  2019-12-31  2
9  충청남도  보령시  44180  2019  충청남도 보령시 주교면 관창리 405-1 101호  1782000  66.0  2019-12-31  2
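
A table like this can be produced by counting identical records. The sketch below is one way to do it with the standard library; `duplicate_counts` is a hypothetical helper, and the toy input is built from two addresses in the table above, with the numeric values as read from the flattened rows, plus one unrelated record:

```python
from collections import Counter


def duplicate_counts(rows):
    """Map each record that occurs more than once to its number of occurrences."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Toy records: (과세년도, 물건지, 시가표준액, 연면적).
triple = ("2018", "충청남도 보령시 오천면 교성리 1326-20 101호", 44741900, 389.06)
double = ("2017", "충청남도 보령시 미산면 봉성리 462-1 101호", 10022400, 172.8)
single = ("2019", "충청남도 보령시 청라면 황룡리 957 101호", 452400, 113.1)
rows = [triple, double, triple, single, double, triple]

for row, n in sorted(duplicate_counts(rows).items(), key=lambda item: -item[1]):
    print(n, row[1])
```

Whether repeated records in the real dataset are errors or legitimate separate assessments cannot be told from the report alone, so they are worth inspecting before being dropped.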