Preparing model inputs
Miquel De Cáceres
2024-11-05
Source:vignettes/intro/PreparingInputs.Rmd
PreparingInputs.Rmd
About this article
A companion article Understanding model inputs explained the vegetation, soil and weather structures needed to run the simulation models included in medfate. Preparing inputs for simulations with medfate is not straightforward, because it requires obtaining and reshaping data for vegetation, soil and weather. Therefore, this article illustrates some common issues that arise in the process of preparing inputs, so that the user is aware of them when processing his/her own data.
We begin by loading packages medfate and meteoland:
Building/manipulating forest objects
In this section we show how to build and manipulate objects of class
forest
, for their use in package medfate, starting from a
table containing forest inventory data.
Poblet tree data set
Package medfate includes a small dataset of tree data, corresponding to a dense holm oak forest in Poblet (Catalonia, Spain). As a result of the abandonment of previous coppicing exploitation, there is a high density of stems per individual.
We begin by loading the tree data from Poblet:
data("poblet_trees")
and we inspect its content, for example using:
summary(poblet_trees)
#> Plot.Code Indv.Ref Species Diameter.cm
#> Length:717 Min. : 1.0 Length:717 Min. : 7.50
#> Class :character 1st Qu.: 45.0 Class :character 1st Qu.: 9.10
#> Mode :character Median : 97.0 Mode :character Median :11.10
#> Mean :103.4 Mean :11.62
#> 3rd Qu.:156.0 3rd Qu.:13.40
#> Max. :261.0 Max. :26.00
The data frame includes tree data corresponding to three forest inventories:
table(poblet_trees$Plot.Code)
#>
#> POBL_CTL POBL_THI_AFT POBL_THI_BEF
#> 267 189 261
POBL_CTL
corresponds to an oak forest where no treatment
was done (control), whereas POBL_THI_BEF
and
POBL_THI_AFT
are two forest inventories conducted on the
same forest plot, before and after a thinning intervention to reduce the
number of stems.
Mapping trees from the control forest
We initialize an empty forest object using function
emptyforest()
from package medfate:
pobl_ctl <- emptyforest()
pobl_ctl
#> $treeData
#> [1] Species DBH Height N Z50 Z95
#> <0 rows> (or 0-length row.names)
#>
#> $shrubData
#> [1] Species Height Cover Z50 Z95
#> <0 rows> (or 0-length row.names)
#>
#> $herbCover
#> [1] NA
#>
#> $herbHeight
#> [1] NA
#>
#> $seedBank
#> [1] Species Percent
#> <0 rows> (or 0-length row.names)
#>
#> attr(,"class")
#> [1] "forest" "list"
Now we will fill in data for element treeData
in the
forest
object. For that, we need to define a mapping from
column names in poblet_trees
to variables in
treeData
. The mapping can be defined using a named
string vector, i.e. a vector where element names are variable
names in treeData
and vector elements are strings of the
variable names in poblet_trees
:
mapping <- c("Species.name" = "Species", "DBH" = "Diameter.cm")
We can now replace the empty treeData
in
pobl_ctl
using functions subset()
and
forest_mapTreeTable()
:
pobl_ctl$treeData <- forest_mapTreeTable(subset(poblet_trees, Plot.Code=="POBL_CTL"),
mapping_x = mapping, SpParams = SpParamsMED)
We can inspect the result using:
summary(pobl_ctl$treeData)
#> Species N Height DBH Z50
#> Length:267 Min. :1 Mode:logical Min. : 7.50 Mode:logical
#> Class :character 1st Qu.:1 NA's:267 1st Qu.: 9.00 NA's:267
#> Mode :character Median :1 Median :10.70
#> Mean :1 Mean :11.53
#> 3rd Qu.:1 3rd Qu.:13.30
#> Max. :1 Max. :26.00
#> Z95
#> Mode:logical
#> NA's:267
#>
#>
#>
#>
Some data are missing, but we will not worry about it now. One way to
evaluate if the tree data is correctly specified is to display a summary
of the forest
object using the summary
function defined in medfate for this object class:
summary(pobl_ctl, SpParamsMED)
#> Tree BA (m2/ha): 3.0179815 adult trees: 3.0179815 saplings: 0
#> Density (ind/ha) adult trees: 267 saplings: 0 shrubs (estimated): 0
#> Cover (%) adult trees: 42.1205627 saplings: 0 shrubs: 0 herbs: 0
#> LAI (m2/m2) total: 0.544959 adult trees: 0.544959 saplings: 0 shrubs: 0 herbs: 0
#> Fuel loading (kg/m2) total: 0.1421746 adult trees: 0.1421746 saplings: 0 shrubs: 0 herbs: 0
#> PAR ground (%): NA SWR ground (%): NA
The values of stand density and stand basal area are too low for such
a dense forest, which indicates that something needs to be corrected. At
this point, it is important to remember that forest
objects
need the density of trees to be specified as stems per hectare.
We conducted our tree data mapping without indicating the area of the
sampled plot. We are told that forest stand sampling was done using a
circular plot whose radius was 15 m. We can calculate the sampled area
using:
sampled_area <- pi*15^2
and use this information to map the tree data again, where we specify
parameter plot_size_x
:
pobl_ctl$treeData <- forest_mapTreeTable(subset(poblet_trees, Plot.Code=="POBL_CTL"),
mapping_x = mapping, SpParams = SpParamsMED,
plot_size_x = sampled_area)
We run again the summary:
summary(pobl_ctl, SpParamsMED)
#> Tree BA (m2/ha): 42.6957047 adult trees: 42.6957047 saplings: 0
#> Density (ind/ha) adult trees: 3777.277316 saplings: 0 shrubs (estimated): 0
#> Cover (%) adult trees: 100 saplings: 0 shrubs: 0 herbs: 0
#> LAI (m2/m2) total: 5.6770407 adult trees: 5.6770407 saplings: 0 shrubs: 0 herbs: 0
#> Fuel loading (kg/m2) total: 1.493419 adult trees: 1.493419 saplings: 0 shrubs: 0 herbs: 0
#> PAR ground (%): NA SWR ground (%): NA
which results in a much higher basal area and density, as should be expected for a dense oak forest resulting from an abandoned old coppice.
Another issue that we see is the percentage of PAR and SWR that reaches the ground, which have missing values. This indicates that medfate cannot calculate the light extinction profile, in our case because tree heights are missing. Thus, we should somehow estimate tree heights, for example using an allometric relationship:
poblet_trees$Height.cm <- 100 * 1.806*poblet_trees$Diameter.cm^0.518
summary(poblet_trees$Height.cm)
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 512.9 566.9 628.3 638.0 692.7 976.5
So trees are between 5 and 10 m height. Once tree heights are defined, we can include them in our mapping:
mapping = c("Species.name" = "Species", "DBH" = "Diameter.cm", "Height" = "Height.cm")
and rerun the tree data mapping:
pobl_ctl$treeData <- forest_mapTreeTable(subset(poblet_trees, Plot.Code=="POBL_CTL"),
mapping_x = mapping, SpParams = SpParamsMED,
plot_size_x = sampled_area)
Now the summary of the control forest stand looks like:
summary(pobl_ctl, SpParamsMED)
#> Tree BA (m2/ha): 42.6957047 adult trees: 42.6957047 saplings: 0
#> Density (ind/ha) adult trees: 3777.277316 saplings: 0 shrubs (estimated): 0
#> Cover (%) adult trees: 100 saplings: 0 shrubs: 0 herbs: 0
#> LAI (m2/m2) total: 5.6770407 adult trees: 5.6770407 saplings: 0 shrubs: 0 herbs: 0
#> Fuel loading (kg/m2) total: 1.493419 adult trees: 1.493419 saplings: 0 shrubs: 0 herbs: 0
#> PAR ground (%): NA SWR ground (%): NA
The fraction of PAR/SWR reaching the ground is low, as would be expected for a dense forest.
Mapping trees from the managed forest
Here we can repeat our mapping for the managed forest plot, which has two codes corresponding to before and after the thinning intervention. Let us first address the pre-thinning state:
pobl_thi_bef <- emptyforest()
pobl_thi_bef$treeData <- forest_mapTreeTable(subset(poblet_trees, Plot.Code=="POBL_THI_BEF"),
mapping_x = mapping, SpParams = SpParamsMED,
plot_size_x = sampled_area)
#> Warning in forest_mapTreeTable(subset(poblet_trees, Plot.Code ==
#> "POBL_THI_BEF"), : Taxon names that were not matched: Quercus humilis.
A warning is raised that not all species names could be parsed. In this case, the reason is that the name used for the downy oak (Quercus humilis) is a synonym and needs to be replaced by its accepted name (Quercus pubescens), which we can do:
poblet_trees$Species[poblet_trees$Species=="Quercus humilis"] <- "Quercus pubescens"
Now we repeat our mapping:
pobl_thi_bef$treeData <- forest_mapTreeTable(subset(poblet_trees, Plot.Code=="POBL_THI_BEF"),
mapping_x = mapping, SpParams = SpParamsMED,
plot_size_x = sampled_area)
summary(pobl_thi_bef, SpParamsMED)
#> Tree BA (m2/ha): 40.9224267 adult trees: 40.9224267 saplings: 0
#> Density (ind/ha) adult trees: 3692.3946797 saplings: 0 shrubs (estimated): 0
#> Cover (%) adult trees: 100 saplings: 0 shrubs: 0 herbs: 0
#> LAI (m2/m2) total: 5.5833511 adult trees: 5.5833511 saplings: 0 shrubs: 0 herbs: 0
#> Fuel loading (kg/m2) total: 1.4629714 adult trees: 1.4629714 saplings: 0 shrubs: 0 herbs: 0
#> PAR ground (%): NA SWR ground (%): NA
Like the control plot, these statistics indicate a dense oak forest. We can repeat the same operations with the forest plot after the thinning intervention:
pobl_thi_aft = emptyforest()
pobl_thi_aft$treeData <- forest_mapTreeTable(subset(poblet_trees, Plot.Code=="POBL_THI_AFT"),
mapping_x = mapping, SpParams = SpParamsMED,
plot_size_x = sampled_area)
summary(pobl_thi_aft, SpParamsMED)
#> Tree BA (m2/ha): 31.6162035 adult trees: 31.6162035 saplings: 0
#> Density (ind/ha) adult trees: 2673.8030439 saplings: 0 shrubs (estimated): 0
#> Cover (%) adult trees: 100 saplings: 0 shrubs: 0 herbs: 0
#> LAI (m2/m2) total: 4.5328748 adult trees: 4.5328748 saplings: 0 shrubs: 0 herbs: 0
#> Fuel loading (kg/m2) total: 1.1915321 adult trees: 1.1915321 saplings: 0 shrubs: 0 herbs: 0
#> PAR ground (%): NA SWR ground (%): NA
Note the decrease in tree density and basal area, and the increase in light reaching the ground, despite the estimated leaf area index is still high.
Reducing the number of woody cohorts
So far we have considered that each tree record should correspond to
a woody cohort. We can check the number of tree cohorts in each
forest
structure using:
nrow(pobl_ctl$treeData)
#> [1] 267
nrow(pobl_thi_bef$treeData)
#> [1] 261
nrow(pobl_thi_aft$treeData)
#> [1] 189
This large amount of cohorts can slow done simulations considerably.
Hence, it is advisable to lump them into coarser woody cohorts. One way
of doing this is via function forest_mergeTrees()
from
package medfate:
pobl_ctl <- forest_mergeTrees(pobl_ctl)
pobl_thi_bef <- forest_mergeTrees(pobl_thi_bef)
pobl_thi_aft <- forest_mergeTrees(pobl_thi_aft)
By default, the function will pool tree cohorts of the same species and diameter class (defined every 5 cm). We can check the new number of tree cohorts using again:
nrow(pobl_ctl$treeData)
#> [1] 9
nrow(pobl_thi_bef$treeData)
#> [1] 11
nrow(pobl_thi_aft$treeData)
#> [1] 8
We can check whether stand properties were altered using the
summary()
function:
summary(pobl_thi_aft, SpParamsMED)
#> Tree BA (m2/ha): 31.6162035 adult trees: 31.6162035 saplings: 0
#> Density (ind/ha) adult trees: 2673.8030439 saplings: 0 shrubs (estimated): 0
#> Cover (%) adult trees: 100 saplings: 0 shrubs: 0 herbs: 0
#> LAI (m2/m2) total: 4.0969956 adult trees: 4.0969956 saplings: 0 shrubs: 0 herbs: 0
#> Fuel loading (kg/m2) total: 1.0724731 adult trees: 1.0724731 saplings: 0 shrubs: 0 herbs: 0
#> PAR ground (%): NA SWR ground (%): NA
Function forest_mergeTrees()
will preserve the stand
density and basal area that the stand description had before merging
cohorts. Other properties like leaf area index may be slightly
modified.
In general, it is advisable to reduce the number of woody cohorts before running simulation models in medfate.
Retrieving SoilGrids data
Because soil properties vary strongly at fine spatial scales, ideally soil physical attributes should be measured on samples taken at the forest stand to be simulated. For those users lacking such data, soil properties modelled at larger scales are available via SoilGrids.org.
Retrieval of soil properties from SoilGrids can be done using
function add_soilgrids()
from package
medfateland. Assuming we know the plot coordinates, we
first create an object sf
(see package
sf):
With this function we will obtain, for each location, a data frame of soil properties:
pobl_soil_props
#> widths clay sand om bd rfc
#> 1 300 26.43333 31.06667 4.133333 1.166667 18.0
#> 2 700 30.40000 29.75000 0.900000 1.440000 19.2
#> 3 1000 31.60000 29.60000 0.610000 1.500000 20.9
This data frame is a physical description of the soil. Initialization
of additional parameters and state variables is done using function
soil()
:
pobl_soil <- soil(pobl_soil_props)
We can inspect the soil definition using:
summary(pobl_soil)
#> Soil depth (mm): 2000
#>
#> Layer 1 [ 0 to 300 mm ]
#> clay (%): 26 silt (%): 38 sand (%): 31 organic matter (%): 4 [ Loam ]
#> Rock fragment content (%): 18 Macroporosity (%): 22
#> Theta WP (%): 18 Theta FC (%): 34 Theta SAT (%): 51 Theta current (%) 34
#> Vol. WP (mm): 44 Vol. FC (mm): 83 Vol. SAT (mm): 125 Vol. current (mm): 83
#> Temperature (Celsius): NA
#>
#> Layer 2 [ 300 to 1000 mm ]
#> clay (%): 30 silt (%): 39 sand (%): 30 organic matter (%): 1 [ Clay loam ]
#> Rock fragment content (%): 19 Macroporosity (%): 9
#> Theta WP (%): 19 Theta FC (%): 33 Theta SAT (%): 44 Theta current (%) 33
#> Vol. WP (mm): 106 Vol. FC (mm): 186 Vol. SAT (mm): 250 Vol. current (mm): 186
#> Temperature (Celsius): NA
#>
#> Layer 3 [ 1000 to 2000 mm ]
#> clay (%): 32 silt (%): 38 sand (%): 30 organic matter (%): 1 [ Clay loam ]
#> Rock fragment content (%): 21 Macroporosity (%): 6
#> Theta WP (%): 19 Theta FC (%): 33 Theta SAT (%): 44 Theta current (%) 33
#> Vol. WP (mm): 152 Vol. FC (mm): 263 Vol. SAT (mm): 347 Vol. current (mm): 263
#> Temperature (Celsius): NA
#>
#> Total soil saturated capacity (mm): 723
#> Total soil water holding capacity (mm): 532
#> Total soil extractable water (mm): 280
#> Total soil current Volume (mm): 532
#> Saturated water depth (mm): NA
It is important to remember that SoilGrids may underestimate the
amount of rocks in the soil. This is because soil samples (which were
used to generate the global database) do not normally contain large
stones or blocks. Hence, realistic simulations should reduce the soil
water holding capacity by increasing the column rfc
. For
example, here we will assume that the third layer contains 80% of
rocks:
pobl_soil_props$rfc[3] <- 80
If we rebuild the soil object and inspect its properties we will see the effect on the soil water holding capacity and soil extractable water:
pobl_soil <- soil(pobl_soil_props)
summary(pobl_soil)
#> Soil depth (mm): 2000
#>
#> Layer 1 [ 0 to 300 mm ]
#> clay (%): 26 silt (%): 38 sand (%): 31 organic matter (%): 4 [ Loam ]
#> Rock fragment content (%): 18 Macroporosity (%): 22
#> Theta WP (%): 18 Theta FC (%): 34 Theta SAT (%): 51 Theta current (%) 34
#> Vol. WP (mm): 44 Vol. FC (mm): 83 Vol. SAT (mm): 125 Vol. current (mm): 83
#> Temperature (Celsius): NA
#>
#> Layer 2 [ 300 to 1000 mm ]
#> clay (%): 30 silt (%): 39 sand (%): 30 organic matter (%): 1 [ Clay loam ]
#> Rock fragment content (%): 19 Macroporosity (%): 9
#> Theta WP (%): 19 Theta FC (%): 33 Theta SAT (%): 44 Theta current (%) 33
#> Vol. WP (mm): 106 Vol. FC (mm): 186 Vol. SAT (mm): 250 Vol. current (mm): 186
#> Temperature (Celsius): NA
#>
#> Layer 3 [ 1000 to 2000 mm ]
#> clay (%): 32 silt (%): 38 sand (%): 30 organic matter (%): 1 [ Clay loam ]
#> Rock fragment content (%): 80 Macroporosity (%): 6
#> Theta WP (%): 19 Theta FC (%): 33 Theta SAT (%): 44 Theta current (%) 33
#> Vol. WP (mm): 38 Vol. FC (mm): 67 Vol. SAT (mm): 88 Vol. current (mm): 67
#> Temperature (Celsius): NA
#>
#> Total soil saturated capacity (mm): 463
#> Total soil water holding capacity (mm): 336
#> Total soil extractable water (mm): 179
#> Total soil current Volume (mm): 336
#> Saturated water depth (mm): NA
Interpolating weather
While soil information is often scarce and uncertain, obtaining daily
weather data suitable for simulations is not straightforward either.
Here we illustrate one way of obtaining such data using package
meteoland. We begin by adding topographic variables
into a sf
object:
pobl_spt <- sf::st_sf(sf_pt) |>
dplyr::mutate(elevation = 850,
slope = 15.1,
aspect = 15)
pobl_spt
#> Simple feature collection with 1 feature and 3 fields
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: 1.0219 ymin: 41.3443 xmax: 1.0219 ymax: 41.3443
#> Geodetic CRS: WGS 84
#> sf_pt elevation slope aspect
#> 1 POINT (1.0219 41.3443) 850 15.1 15
The more difficult part of using package meteoland
is to assemble reference weather data from surface weather stations into
the socalled interpolator object (of class
stars
). Please see the meteoland
package documentation to learn how to create interpolator objects.
Here we will assume that such an object is already available, by using
the example object provided in the meteoland
package.
data("meteoland_interpolator_example")
Once we have this interpolator, obtaining interpolated weather for a
set of target points is rather straightforward using function
interpolate_data()
from meteoland:
meteo <- interpolate_data(pobl_spt, meteoland_interpolator_example)
#> ℹ Starting interpolation...
#> ℹ Temperature interpolation is needed also...
#> • Interpolating temperature...
#> ℹ Precipitation interpolation is needed also...
#> • Interpolating precipitation...
#> ℹ Relative humidity interpolation is needed also...
#> • Interpolating relative humidity...
#> ℹ Radiation calculation is needed also...
#> • Calculating radiation...
#> ℹ Wind interpolation is needed also...
#> • Interpolating wind...
#> • Calculating PET...
#> ✔ Interpolation done...
The output of function interpolate_data()
is an object
of class sf
:
meteo
#> Simple feature collection with 1 feature and 4 fields
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: 1.0219 ymin: 41.3443 xmax: 1.0219 ymax: 41.3443
#> Geodetic CRS: WGS 84
#> # A tibble: 1 × 5
#> sf_pt elevation slope aspect interpolated_data
#> <POINT [°]> <dbl> <dbl> <dbl> <list>
#> 1 (1.0219 41.3443) 850 15.1 15 <tibble [30 × 13]>
We can access the weather data frame by subsetting the appropriate
element of interpolated_data
:
pobl_weather <- meteo$interpolated_data[[1]]
head(pobl_weather)
#> # A tibble: 6 × 13
#> dates DOY MeanTemperature MinTemperature MaxTemperature
#> <dttm> <dbl> <dbl> <dbl> <dbl>
#> 1 2022-04-01 00:00:00 91 3.37 -2.21 6.99
#> 2 2022-04-02 00:00:00 92 3.60 -4.01 8.54
#> 3 2022-04-03 00:00:00 93 2.33 -7.67 8.83
#> 4 2022-04-04 00:00:00 94 4.16 -4.46 9.76
#> 5 2022-04-05 00:00:00 95 5.86 -5.21 13.1
#> 6 2022-04-06 00:00:00 96 9.16 0.472 14.8
#> # ℹ 8 more variables: Precipitation <dbl>, MeanRelativeHumidity <dbl>,
#> # MinRelativeHumidity <dbl>, MaxRelativeHumidity <dbl>, Radiation <dbl>,
#> # WindSpeed <dbl>, WindDirection <dbl>, PET <dbl>