Overview

Dataset statistics

Number of variables: 6
Number of observations: 3994
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 202.9 KiB
Average record size in memory: 52.0 B
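
The last two figures are consistent with each other: dividing the total in-memory size by the number of observations gives the average record size. A minimal check, assuming 1 KiB = 1024 bytes (the variable names are illustrative and not part of the report):

```python
# Values copied from the dataset statistics above.
total_size_kib = 202.9
n_observations = 3994

# Assumption: "KiB" means 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")
```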

Variable types

Categorical: 2
Text: 1
Numeric: 3

Alerts

지번수(개) is highly overall correlated with 소유자수(명) and 1 other field (High correlation)
소유자수(명) is highly overall correlated with 지번수(개) (High correlation)
토지면적(㎡) is highly overall correlated with 지번수(개) (High correlation)

Reproduction

Analysis started: 2023-12-16 06:03:47.620589
Analysis finished: 2023-12-16 06:03:55.591899
Duration: 7.97 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
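
The reported duration follows directly from the two timestamps. A small sketch using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the reproduction details above.
started = datetime.fromisoformat("2023-12-16 06:03:47.620589")
finished = datetime.fromisoformat("2023-12-16 06:03:55.591899")

# Elapsed wall-clock time between start and finish, in seconds.
duration_seconds = (finished - started).total_seconds()

print(f"{duration_seconds:.2f} seconds")
```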

Variables

기준년월 (reference year-month)
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 31.3 KiB

Top values (count):
201712: 1998
201212: 1996

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 201712
2nd row: 201712
3rd row: 201712
4th row: 201712
5th row: 201712

Common Values

Value  Count  Frequency (%)
201712  1998  50.0%
201212  1996  50.0%
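
Both values show 50.0% even though their counts differ by two, because the shares are rounded to one decimal place. A short sketch of the arithmetic (the dictionary and variable names are illustrative):

```python
# Counts copied from the common-values table above.
counts = {"201712": 1998, "201212": 1996}
total = sum(counts.values())

# Share of each value as a percentage of all observations.
shares = {value: 100 * n / total for value, n in counts.items()}

for value, share in shares.items():
    # Show the unrounded share next to the one-decimal figure used in the report.
    print(f"{value}: {share:.3f}% -> {share:.1f}%")
```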

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

시군명 (city/county name)
Categorical

Distinct: 31
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 31.3 KiB

Top values (count):
화성시: 386
안성시: 382
여주시: 291
평택시: 272
이천시: 264
Other values (26): 2399

Length

Max length: 4
Median length: 3
Mean length: 3.0425638
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 가평군
2nd row: 가평군
3rd row: 가평군
4th row: 가평군
5th row: 가평군

Common Values

Value  Count  Frequency (%)
화성시  386  9.7%
안성시  382  9.6%
여주시  291  7.3%
평택시  272  6.8%
이천시  264  6.6%
파주시  252  6.3%
양평군  220  5.5%
용인시  214  5.4%
포천시  176  4.4%
연천군  172  4.3%
Other values (21)  1365  34.2%

Length

[Figure: histogram of lengths of the category]

법정동명 (legal dong name)
Text

Distinct: 2234
Distinct (%): 55.9%
Missing: 0
Missing (%): 0.0%
Memory size: 31.3 KiB

Length

Max length: 20
Median length: 15
Mean length: 14.502253
Min length: 10

Characters and Unicode

Total characters: 57922
Distinct characters: 316
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
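
As an illustration of those character properties, the sketch below counts Unicode general categories for one sample value using Python's standard `unicodedata` module. It is not how the report itself was computed; the sample string is taken from the report's sample rows, and the two-letter codes `Lo`, `Zs` and `Nd` correspond to Other Letter, Space Separator and Decimal Number.

```python
import unicodedata
from collections import Counter

# One sample value of the text variable, copied from the report's sample rows.
text = "경기도 가평군 설악면 엄소리"

# unicodedata.category returns the two-letter general category of a character:
# "Lo" = Other Letter, "Zs" = Space Separator, "Nd" = Decimal Number.
category_counts = Counter(unicodedata.category(ch) for ch in text)

print(len(text), dict(category_counts))
```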

Unique

Unique: 474
Unique (%): 11.9%

Sample

1st row: 경기도 가평군 설악면 엄소리
2nd row: 경기도 가평군 설악면 창의리
3rd row: 경기도 가평군 설악면 묵안리
4th row: 경기도 가평군 설악면 가일리
5th row: 경기도 가평군 설악면 방일리
Value  Count  Frequency (%)
경기도  3994  25.9%
화성시  386  2.5%
안성시  382  2.5%
평택시  272  1.8%
이천시  264  1.7%
파주시  252  1.6%
양평군  220  1.4%
용인시  214  1.4%
포천시  176  1.1%
연천군  172  1.1%
Other values (1981)  9097  59.0%

Most occurring characters

Value  Count  Frequency (%)
(space)  11435  19.7%
[?]  4162  7.2%
[?]  4072  7.0%
[?]  4004  6.9%
[?]  3410  5.9%
[?]  2871  5.0%
[?]  2099  3.6%
[?]  1468  2.5%
[?]  1127  1.9%
[?]  1009  1.7%
Other values (306)  22265  38.4%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  46475  80.2%
Space Separator  11435  19.7%
Decimal Number  12  < 0.1%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
[?]  4162  9.0%
[?]  4072  8.8%
[?]  4004  8.6%
[?]  3410  7.3%
[?]  2871  6.2%
[?]  2099  4.5%
[?]  1468  3.2%
[?]  1127  2.4%
[?]  1009  2.2%
[?]  935  2.0%
Other values (302)  21318  45.9%

Decimal Number
Value  Count  Frequency (%)
2  4  33.3%
3  4  33.3%
1  4  33.3%

Space Separator
Value  Count  Frequency (%)
(space)  11435  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  46475  80.2%
Common  11447  19.8%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
[?]  4162  9.0%
[?]  4072  8.8%
[?]  4004  8.6%
[?]  3410  7.3%
[?]  2871  6.2%
[?]  2099  4.5%
[?]  1468  3.2%
[?]  1127  2.4%
[?]  1009  2.2%
[?]  935  2.0%
Other values (302)  21318  45.9%

Common
Value  Count  Frequency (%)
(space)  11435  99.9%
2  4  < 0.1%
3  4  < 0.1%
1  4  < 0.1%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  46475  80.2%
ASCII  11447  19.8%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
(space)  11435  99.9%
2  4  < 0.1%
3  4  < 0.1%
1  4  < 0.1%

Hangul
Value  Count  Frequency (%)
[?]  4162  9.0%
[?]  4072  8.8%
[?]  4004  8.6%
[?]  3410  7.3%
[?]  2871  6.2%
[?]  2099  4.5%
[?]  1468  3.2%
[?]  1127  2.4%
[?]  1009  2.2%
[?]  935  2.0%
Other values (302)  21318  45.9%

지번수(개) (number of lots)
Real number (ℝ)

HIGH CORRELATION

Distinct: 2113
Distinct (%): 52.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1427.2308
Minimum: 0
Maximum: 11676
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 35.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 249.65
Q1: 838
Median: 1277.5
Q3: 1817.75
95-th percentile: 2984.4
Maximum: 11676
Range: 11676
Interquartile range (IQR): 979.75

Descriptive statistics

Standard deviation: 949.66895
Coefficient of variation (CV): 0.66539267
Kurtosis: 14.301709
Mean: 1427.2308
Median Absolute Deviation (MAD): 481.5
Skewness: 2.5275595
Sum: 5700360
Variance: 901871.11
Monotonicity: Not monotonic
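
Several of these statistics are derived from one another: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. A minimal sketch that re-derives them from the reported values for 지번수(개) (the variable names are illustrative; small differences come from the rounding of the printed inputs):

```python
# Values copied from the quantile and descriptive statistics of 지번수(개).
minimum, maximum = 0, 11676
q1, q3 = 838, 1817.75
mean, std = 1427.2308, 949.66895

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance

print(value_range, iqr, round(cv, 4), round(variance, 1))
```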
[Figure: histogram with fixed size bins (bins=50)]

Value  Count  Frequency (%)
843  8  0.2%
1425  8  0.2%
1434  7  0.2%
1222  7  0.2%
1117  7  0.2%
930  7  0.2%
1118  7  0.2%
1256  7  0.2%
1637  7  0.2%
1315  6  0.2%
Other values (2103)  3923  98.2%
Minimum 10 values
Value  Count  Frequency (%)
0  1  < 0.1%
1  2  0.1%
2  2  0.1%
3  3  0.1%
4  2  0.1%
5  4  0.1%
6  3  0.1%
7  2  0.1%
8  2  0.1%
10  1  < 0.1%

Maximum 10 values
Value  Count  Frequency (%)
11676  1  < 0.1%
10411  1  < 0.1%
9548  1  < 0.1%
8935  1  < 0.1%
8899  1  < 0.1%
7828  1  < 0.1%
7726  1  < 0.1%
7660  1  < 0.1%
7395  1  < 0.1%
7309  1  < 0.1%

소유자수(명) (number of owners)
Real number (ℝ)

HIGH CORRELATION

Distinct: 2181
Distinct (%): 54.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2354.5819
Minimum: 1
Maximum: 50390
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 35.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 218.65
Q1: 516
Median: 835
Q3: 1677
95-th percentile: 10789.25
Maximum: 50390
Range: 50389
Interquartile range (IQR): 1161

Descriptive statistics

Standard deviation: 4694.5751
Coefficient of variation (CV): 1.9938042
Kurtosis: 26.723409
Mean: 2354.5819
Median Absolute Deviation (MAD): 412
Skewness: 4.595715
Sum: 9404200
Variance: 22039036
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Value  Count  Frequency (%)
689  10  0.3%
690  9  0.2%
534  8  0.2%
422  8  0.2%
671  8  0.2%
960  8  0.2%
424  8  0.2%
787  8  0.2%
649  7  0.2%
858  7  0.2%
Other values (2171)  3913  98.0%
Minimum 10 values
Value  Count  Frequency (%)
1  2  0.1%
2  1  < 0.1%
3  3  0.1%
5  2  0.1%
7  2  0.1%
8  1  < 0.1%
9  4  0.1%
10  2  0.1%
11  3  0.1%
17  2  0.1%

Maximum 10 values
Value  Count  Frequency (%)
50390  1  < 0.1%
48141  1  < 0.1%
46924  1  < 0.1%
44060  1  < 0.1%
42469  1  < 0.1%
41275  1  < 0.1%
40511  1  < 0.1%
40125  1  < 0.1%
39167  1  < 0.1%
38174  1  < 0.1%

토지면적(㎡) (land area in m²)
Real number (ℝ)

HIGH CORRELATION

Distinct: 3993
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2380412.4
Minimum: 277.55
Maximum: 24931714
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 35.2 KiB

Quantile statistics

Minimum: 277.55
5-th percentile: 317978.6
Q1: 1258444.1
Median: 2036219.2
Q3: 2996528.7
95-th percentile: 5370642.1
Maximum: 24931714
Range: 24931437
Interquartile range (IQR): 1738084.6

Descriptive statistics

Standard deviation: 1799582.3
Coefficient of variation (CV): 0.75599603
Kurtosis: 18.694502
Mean: 2380412.4
Median Absolute Deviation (MAD): 862192.38
Skewness: 2.864614
Sum: 9.5073671 × 10^9
Variance: 3.2384965 × 10^12
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Value  Count  Frequency (%)
132930.0  2  0.1%
479488.85  1  < 0.1%
2089669.83  1  < 0.1%
2485969.71  1  < 0.1%
1506928.52  1  < 0.1%
2120007.41  1  < 0.1%
1789958.69  1  < 0.1%
2370085.71  1  < 0.1%
955090.92  1  < 0.1%
1124103.93  1  < 0.1%
Other values (3983)  3983  99.7%
Minimum 10 values
Value  Count  Frequency (%)
277.55  1  < 0.1%
365.25  1  < 0.1%
557.27  1  < 0.1%
3106.33  1  < 0.1%
3649.03  1  < 0.1%
4045.34  1  < 0.1%
6203.29  1  < 0.1%
7305.0  1  < 0.1%
7567.0  1  < 0.1%
8992.0  1  < 0.1%

Maximum 10 values
Value  Count  Frequency (%)
24931714.49  1  < 0.1%
24112594.18  1  < 0.1%
18040213.07  1  < 0.1%
18010685.77  1  < 0.1%
13539731.47  1  < 0.1%
13455762.56  1  < 0.1%
12181726.83  1  < 0.1%
12027745.83  1  < 0.1%
11988582.15  1  < 0.1%
11938676.8  1  < 0.1%

Interactions

[Figures: 9 interaction plots]

Correlations

[Figure: correlation heatmap]

Variable  기준년월  시군명  지번수(개)  소유자수(명)  토지면적(㎡)
기준년월  1.000  0.000  0.000  0.000  0.000
시군명  0.000  1.000  0.563  0.619  0.465
지번수(개)  0.000  0.563  1.000  0.730  0.484
소유자수(명)  0.000  0.619  0.730  1.000  0.043
토지면적(㎡)  0.000  0.465  0.484  0.043  1.000

[Figure: correlation heatmap]

Variable  기준년월  시군명
기준년월  1.000  0.000
시군명  0.000  1.000

[Figure: correlation heatmap]

Variable  지번수(개)  소유자수(명)  토지면적(㎡)  기준년월  시군명
지번수(개)  1.000  0.709  0.637  0.000  0.234
소유자수(명)  0.709  1.000  0.275  0.000  0.269
토지면적(㎡)  0.637  0.275  1.000  0.000  0.200
기준년월  0.000  0.000  0.000  1.000  0.000
시군명  0.234  0.269  0.200  0.000  1.000
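
The high-correlation alerts near the top of the report can be related to the last matrix above. The sketch below lists every variable pair whose coefficient reaches a cutoff. The cutoff of 0.5 is an assumption (the report does not state it), and `THRESHOLD`, `columns` and `high_pairs` are illustrative names. With that cutoff, the pairs found are the same ones named in the alerts.

```python
from itertools import combinations

# Coefficients copied from the last correlation matrix above.
columns = ["지번수(개)", "소유자수(명)", "토지면적(㎡)", "기준년월", "시군명"]
matrix = [
    [1.000, 0.709, 0.637, 0.000, 0.234],
    [0.709, 1.000, 0.275, 0.000, 0.269],
    [0.637, 0.275, 1.000, 0.000, 0.200],
    [0.000, 0.000, 0.000, 1.000, 0.000],
    [0.234, 0.269, 0.200, 0.000, 1.000],
]

THRESHOLD = 0.5  # Assumption: cutoff for "high" correlation, not stated in the report.

# Visit each unordered pair once (upper triangle) and keep those at or above the cutoff.
high_pairs = [
    (columns[i], columns[j], matrix[i][j])
    for i, j in combinations(range(len(columns)), 2)
    if abs(matrix[i][j]) >= THRESHOLD
]

for left, right, value in high_pairs:
    print(f"{left} ~ {right}: {value:.3f}")
```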

Missing values

[Figure: nullity count by column] A simple visualization of nullity by column.
[Figure: nullity matrix] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
(index)  기준년월  시군명  법정동명  지번수(개), 소유자수(명), 토지면적(㎡) (digits run together in the source)
0  201712  가평군  경기도 가평군 설악면 엄소리  15867406766695.7
1  201712  가평군  경기도 가평군 설악면 창의리  11585422188643.09
2  201712  가평군  경기도 가평군 설악면 묵안리  17118204656227.31
3  201712  가평군  경기도 가평군 설악면 가일리  13576042204678.35
4  201712  가평군  경기도 가평군 설악면 방일리  171814674373828.12
5  201712  가평군  경기도 가평군 설악면 천안리  14786172731466.11
6  201712  가평군  경기도 가평군 설악면 이천리  11785452187854.43
7  201712  가평군  경기도 가평군 청평면 청평리  419060194394336.95
8  201712  가평군  경기도 가평군 청평면 상천리  6347541118010685.77
9  201712  가평군  경기도 가평군 청평면 하천리  250629306689814.92

Last rows
(index)  기준년월  시군명  법정동명  지번수(개), 소유자수(명), 토지면적(㎡) (digits run together in the source)
3984  201212  화성시  경기도 화성시 송산면 지화리  17388073451201.72
3985  201212  화성시  경기도 화성시 송산면 중송리  15137202307992.37
3986  201212  화성시  경기도 화성시 송산면 육일리  15599311833131.18
3987  201212  화성시  경기도 화성시 송산면 칠곡리  11165222054995.6
3988  201212  화성시  경기도 화성시 서신면 전곡리  196310943379908.38
3989  201212  화성시  경기도 화성시 남양동  431546914892897.47
3990  201212  화성시  경기도 화성시 신남동  302319734261458.91
3991  201212  화성시  경기도 화성시 장덕동  23149442915176.18
3992  201212  화성시  경기도 화성시 안석동  15536872080930.23
3993  201212  화성시  경기도 화성시 활초동  14437262227072.44