Overview

Dataset statistics

Number of variables: 6
Number of observations: 5556
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 276.8 KiB
Average record size in memory: 51.0 B
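
The average record size can be checked from the total size and the number of observations. A minimal sketch, assuming that 1 KiB means 1024 bytes; the variable names are illustrative and not part of the report:

```python
# Figures copied from the dataset statistics in this report.
total_kib = 276.8        # total size in memory, in KiB
n_observations = 5556    # number of rows

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```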

Variable types

Numeric: 3
Text: 1
Categorical: 2

Dataset

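Description: Market price survey of forest products conducted with the Integrated Forest Resource Management System (산림자원통합관리시스템), an information system for forest resources that covers afforestation, forest tending, standing timber sales, and related work.
Author: Korea Forest Service (산림청)
URL: https://www.data.go.kr/data/15093799/fileData.do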

Alerts

시가번호 is highly overall correlated with 임산물시가 and 1 other field (High correlation)
임산물시가 is highly overall correlated with 시가번호 and 1 other field (High correlation)
시가수종명 is highly overall correlated with 시가번호 (High correlation)
품등 is highly overall correlated with 임산물시가 (High correlation)

Reproduction

Analysis started: 2023-12-12 12:53:29.622500
Analysis finished: 2023-12-12 12:53:31.294823
Duration: 1.67 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
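
The reported duration is the difference between the two timestamps. A small sketch using only the standard library, with the timestamps copied from this section:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-12 12:53:29.622500")
finished = datetime.fromisoformat("2023-12-12 12:53:31.294823")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```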

Variables

시가조사년월 (price survey year-month)
Real number (ℝ)

Distinct: 12
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 201906.5
Minimum: 201901
Maximum: 201912
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 49.0 KiB

Quantile statistics

Minimum: 201901
5-th percentile: 201901
Q1: 201903.75
Median: 201906.5
Q3: 201909.25
95-th percentile: 201912
Maximum: 201912
Range: 11
Interquartile range (IQR): 5.5
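
The fractional quartiles follow from linear interpolation between neighbouring sorted values. The sketch below rebuilds the column from the value counts listed for this variable (an assumption: each of the 12 months occurs exactly 463 times) and uses `statistics.quantiles` with `method="inclusive"`, which reproduces the figures above. It is an illustration, not the profiler's own code.

```python
import statistics

# Assumption: every month 201901..201912 occurs 463 times (12 * 463 = 5556 rows),
# as listed in the value counts for this variable.
months = [201900 + m for m in range(1, 13)]
values = [v for v in months for _ in range(463)]

# "inclusive" interpolates linearly between order statistics of the data.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1
value_range = max(values) - min(values)
print(q1, median, q3, iqr, value_range)
```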

Descriptive statistics

Standard deviation: 3.4523632
Coefficient of variation (CV): 1.7098822 × 10⁻⁵
Kurtosis: -1.2167983
Mean: 201906.5
Median Absolute Deviation (MAD): 3
Skewness: 0
Sum: 1.1217925 × 10⁹
Variance: 11.918812
Monotonicity: Increasing
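
The mean, variance, standard deviation, coefficient of variation, and sum can be recomputed from the value counts. This sketch assumes each month occurs 463 times and that the variance uses the sample (n - 1) denominator, which is the choice that matches the reported figures:

```python
import statistics

# Assumption: every month 201901..201912 occurs 463 times, per the value counts.
months = [201900 + m for m in range(1, 13)]
values = [v for v in months for _ in range(463)]

mean = statistics.fmean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std = statistics.stdev(values)          # square root of the sample variance
cv = std / mean                         # coefficient of variation
total = sum(values)
print(mean, variance, std, cv, total)
```

The very small CV reflects that the month codes differ by at most 11 while their magnitude is about 200,000.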
Histogram with fixed size bins (bins=12)
Common values
Value             Count  Frequency (%)
201901            463    8.3%
201902            463    8.3%
201903            463    8.3%
201904            463    8.3%
201905            463    8.3%
201906            463    8.3%
201907            463    8.3%
201908            463    8.3%
201909            463    8.3%
201910            463    8.3%
Other values (2)  926    16.7%

Minimum 10 values
Value   Count  Frequency (%)
201901  463    8.3%
201902  463    8.3%
201903  463    8.3%
201904  463    8.3%
201905  463    8.3%
201906  463    8.3%
201907  463    8.3%
201908  463    8.3%
201909  463    8.3%
201910  463    8.3%

Maximum 10 values
Value   Count  Frequency (%)
201912  463    8.3%
201911  463    8.3%
201910  463    8.3%
201909  463    8.3%
201908  463    8.3%
201907  463    8.3%
201906  463    8.3%
201905  463    8.3%
201904  463    8.3%
201903  463    8.3%

시가번호 (price number)
Real number (ℝ)

HIGH CORRELATION

Distinct: 500
Distinct (%): 9.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 233.57163
Minimum: 1
Maximum: 500
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 49.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 24
Q1: 117
Median: 233
Q3: 350
95-th percentile: 443
Maximum: 500
Range: 499
Interquartile range (IQR): 233

Descriptive statistics

Standard deviation: 134.65706
Coefficient of variation (CV): 0.57651291
Kurtosis: -1.1880432
Mean: 233.57163
Median Absolute Deviation (MAD): 116
Skewness: 0.0067796112
Sum: 1297724
Variance: 18132.524
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
1                   12     0.2%
366                 12     0.2%
377                 12     0.2%
376                 12     0.2%
375                 12     0.2%
374                 12     0.2%
373                 12     0.2%
371                 12     0.2%
370                 12     0.2%
37                  12     0.2%
Other values (490)  5436   97.8%

Minimum 10 values
Value  Count  Frequency (%)
1      12     0.2%
2      12     0.2%
3      12     0.2%
4      12     0.2%
5      12     0.2%
6      12     0.2%
7      11     0.2%
8      12     0.2%
9      12     0.2%
10     12     0.2%

Maximum 10 values
Value  Count  Frequency (%)
500    1      < 0.1%
499    1      < 0.1%
498    1      < 0.1%
497    1      < 0.1%
496    1      < 0.1%
495    1      < 0.1%
494    1      < 0.1%
493    1      < 0.1%
492    1      < 0.1%
491    1      < 0.1%

주민법정동명 (legal district name)
Text

Distinct: 53
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 43.5 KiB

Length

Max length: 8
Median length: 7
Mean length: 7.3304536
Min length: 5

Characters and Unicode

Total characters: 40728
Distinct characters: 64
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 인천광역시
2nd row: 광주광역시
3rd row: 경상북도 봉화군
4th row: 경상북도 봉화군
5th row: 경상북도 봉화군
Value              Count  Frequency (%)
강원도             1416   13.3%
경기도             1128   10.6%
충청북도           528    4.9%
전라남도           516    4.8%
전라북도           468    4.4%
경상남도           420    3.9%
경상북도           324    3.0%
충청남도           324    3.0%
인천광역시         252    2.4%
파주시             252    2.4%
Other values (51)  5052   47.3%

Most occurring characters

Value              Count  Frequency (%)
                   5124   12.6%
                   5124   12.6%
                   3228   7.9%
                   2388   5.9%
                   1872   4.6%
                   1788   4.4%
                   1620   4.0%
                   1548   3.8%
                   1320   3.2%
                   1320   3.2%
Other values (54)  15396  37.8%

Most occurring categories

Value            Count  Frequency (%)
Other Letter     35604  87.4%
Space Separator  5124   12.6%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
                   5124   14.4%
                   3228   9.1%
                   2388   6.7%
                   1872   5.3%
                   1788   5.0%
                   1620   4.6%
                   1548   4.3%
                   1320   3.7%
                   1320   3.7%
                   1296   3.6%
Other values (53)  14100  39.6%

Space Separator
Value              Count  Frequency (%)
                   5124   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  35604  87.4%
Common  5124   12.6%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
                   5124   14.4%
                   3228   9.1%
                   2388   6.7%
                   1872   5.3%
                   1788   5.0%
                   1620   4.6%
                   1548   4.3%
                   1320   3.7%
                   1320   3.7%
                   1296   3.6%
Other values (53)  14100  39.6%

Common
Value              Count  Frequency (%)
                   5124   100.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  35604  87.4%
ASCII   5124   12.6%

Most frequent character per block

Hangul
Value              Count  Frequency (%)
                   5124   14.4%
                   3228   9.1%
                   2388   6.7%
                   1872   5.3%
                   1788   5.0%
                   1620   4.6%
                   1548   4.3%
                   1320   3.7%
                   1320   3.7%
                   1296   3.6%
Other values (53)  14100  39.6%

ASCII
Value              Count  Frequency (%)
                   5124   100.0%

시가수종명 (tree species name)
Categorical

HIGH CORRELATION

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 43.5 KiB

Value             Count
낙엽송            2292
소나무            1368
잣나무            720
참나무            456
리기다소나무      420
Other values (2)  300

Length

Max length: 6
Median length: 3
Mean length: 3.1857451
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 소나무
2nd row: 소나무
3rd row: 소나무
4th row: 소나무
5th row: 소나무

Common Values

Value         Count  Frequency (%)
낙엽송        2292   41.3%
소나무        1368   24.6%
잣나무        720    13.0%
참나무        456    8.2%
리기다소나무  420    7.6%
편백          228    4.1%
삼나무        72     1.3%
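
The percentages and the mean label length for this variable can be rederived from the counts alone. A minimal sketch with the counts copied from the table above; the dictionary and variable names are illustrative:

```python
# Counts per species, copied from the Common Values table for 시가수종명.
counts = {
    "낙엽송": 2292,
    "소나무": 1368,
    "잣나무": 720,
    "참나무": 456,
    "리기다소나무": 420,
    "편백": 228,
    "삼나무": 72,
}

total = sum(counts.values())

# Share of each category, rounded to one decimal place as in the report.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

# Mean label length in characters, weighted by how often each label occurs.
mean_length = sum(len(name) * n for name, n in counts.items()) / total

print(total, shares, round(mean_length, 7))
```

The counts sum to the 5556 observations reported in the overview, which is consistent with the variable having no missing values.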

Length

Histogram of lengths of the category

Common Values (Plot)

품등 (grade)
Categorical

HIGH CORRELATION

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 43.5 KiB

Value             Count
3등               1224
2등               1164
1등               1056
원주재            828
특용재            684
Other values (2)  600

Length

Max length: 3
Median length: 2
Mean length: 2.3736501
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 특용재
2nd row: 3등
3rd row: 1등
4th row: 2등
5th row: 3등

Common Values

Value   Count  Frequency (%)
3등     1224   22.0%
2등     1164   21.0%
1등     1056   19.0%
원주재  828    14.9%
특용재  684    12.3%
원료재  564    10.2%
기타    36     0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

임산물시가 (forest product market price)
Real number (ℝ)

HIGH CORRELATION

Distinct: 245
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 152419.37
Minimum: 3900
Maximum: 460000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 49.0 KiB

Quantile statistics

Minimum: 3900
5-th percentile: 63200
Q1: 134000
Median: 144000
Q3: 158000
95-th percentile: 261000
Maximum: 460000
Range: 456100
Interquartile range (IQR): 24000

Descriptive statistics

Standard deviation: 67454.976
Coefficient of variation (CV): 0.44256172
Kurtosis: 7.0534178
Mean: 152419.37
Median Absolute Deviation (MAD): 12000
Skewness: 2.1149958
Sum: 8.46842 × 10⁸
Variance: 4.5501738 × 10⁹
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
144000              287    5.2%
142000              223    4.0%
138000              211    3.8%
146000              182    3.3%
150000              175    3.1%
156000              170    3.1%
143000              169    3.0%
140000              139    2.5%
152000              123    2.2%
147000              117    2.1%
Other values (235)  3760   67.7%

Minimum 10 values
Value  Count  Frequency (%)
3900   10     0.2%
4000   10     0.2%
4500   10     0.2%
59200  18     0.3%
59500  6      0.1%
59800  3      0.1%
59900  6      0.1%
60000  15     0.3%
60100  15     0.3%
60800  30     0.5%

Maximum 10 values
Value   Count  Frequency (%)
460000  12     0.2%
450000  12     0.2%
440000  53     1.0%
435000  12     0.2%
430000  31     0.6%
420000  12     0.2%
395000  12     0.2%
390000  12     0.2%
385000  12     0.2%
375000  12     0.2%

Interactions


Correlations

              시가조사년월  시가번호  주민법정동명  시가수종명  품등   임산물시가
시가조사년월  1.000         0.000     0.000         0.000       0.000  0.000
시가번호      0.000         1.000     0.913         0.909       0.335  0.740
주민법정동명  0.000         0.913     1.000         0.855       0.674  0.742
시가수종명    0.000         0.909     0.855         1.000       0.550  0.667
품등          0.000         0.335     0.674         0.550       1.000  0.764
임산물시가    0.000         0.740     0.742         0.667       0.764  1.000

            품등   시가수종명
품등        1.000  0.219
시가수종명  0.219  1.000

              시가조사년월  시가번호  임산물시가  시가수종명  품등
시가조사년월  1.000         0.008     -0.020      0.000       0.000
시가번호      0.008         1.000     -0.553      0.771       0.176
임산물시가    -0.020        -0.553    1.000       0.421       0.536
시가수종명    0.000         0.771     0.421       1.000       0.219
품등          0.000         0.176     0.536       0.219       1.000
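
The four "High correlation" alerts at the top of the report are consistent with the last, five-column matrix in this section when a cutoff of 0.5 is applied to the absolute values. The sketch below shows that check. The 0.5 threshold and the function name `high_correlations` are assumptions made for illustration, not settings read from the report's configuration.

```python
# Assumption: an absolute correlation of at least 0.5 counts as "high".
THRESHOLD = 0.5

# Values copied from the five-column matrix at the end of the Correlations section.
columns = ["시가조사년월", "시가번호", "임산물시가", "시가수종명", "품등"]
matrix = [
    [1.000, 0.008, -0.020, 0.000, 0.000],
    [0.008, 1.000, -0.553, 0.771, 0.176],
    [-0.020, -0.553, 1.000, 0.421, 0.536],
    [0.000, 0.771, 0.421, 1.000, 0.219],
    [0.000, 0.176, 0.536, 0.219, 1.000],
]


def high_correlations(columns, matrix, threshold):
    """Map each column to the other columns whose |correlation| >= threshold."""
    result = {}
    for i, name in enumerate(columns):
        partners = [
            columns[j]
            for j, value in enumerate(matrix[i])
            if j != i and abs(value) >= threshold
        ]
        if partners:
            result[name] = partners
    return result


alerts = high_correlations(columns, matrix, THRESHOLD)
for name, partners in alerts.items():
    print(name, "->", ", ".join(partners))
```

Under this assumption, 시가조사년월 raises no alert because none of its off-diagonal values exceeds 0.02 in absolute value.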

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
Index  시가조사년월  시가번호  주민법정동명     시가수종명  품등    임산물시가
0      201901        1         인천광역시       소나무      특용재  395000
1      201901        10        광주광역시       소나무      3등     213000
2      201901        100       경상북도 봉화군  소나무      1등     235000
3      201901        101       경상북도 봉화군  소나무      2등     210000
4      201901        102       경상북도 봉화군  소나무      3등     177000
5      201901        103       경상북도 봉화군  소나무      원주재  148000
6      201901        104       경상남도 함양군  소나무      특용재  450000
7      201901        105       경상남도 함양군  소나무      1등     215000
8      201901        106       경상남도 함양군  소나무      2등     211000
9      201901        107       경상남도 함양군  소나무      3등     195000

Last rows
Index  시가조사년월  시가번호  주민법정동명     시가수종명  품등    임산물시가
5546   201912        90        경상북도 안동시  소나무      1등     233000
5547   201912        91        경상북도 안동시  소나무      2등     217000
5548   201912        92        경상북도 안동시  소나무      3등     188000
5549   201912        93        경상북도 안동시  소나무      원주재  184800
5550   201912        94        경상북도 영주시  소나무      특용재  460000
5551   201912        95        경상북도 영주시  소나무      1등     271000
5552   201912        96        경상북도 영주시  소나무      2등     201400
5553   201912        97        경상북도 영주시  소나무      3등     185500
5554   201912        98        경상북도 영주시  소나무      원주재  187500
5555   201912        99        경상북도 봉화군  소나무      특용재  420000