Mesoamerican tree richness across the Americas: conditional inference trees for biodiversity patterns
Source:vignettes/articles/trees.Rmd
trees.RmdTree species richness in the Americas is not evenly distributed — Mesoamerican forests are global hotspots of tree diversity, yet that richness declines sharply toward drier or colder regions. Understanding what drives those gradients calls for a dataset with both broad geographic coverage and diverse climate predictors.
Overview
The trees dataset focuses on the distribution of the
tree species present in Mesoamerica across North and South America. The
presence of Mesoamerican trees was obtained from the Tree
Biodiversity Network (BIOTREE-NET) dataset (project now
defunct). The distribution of these species across the Americas was
derived from GBIF. The data represents richness of
Mesoamerican trees in the Americas for 3,373 hexagonal grid cells
across the Americas, combined with 50 environmental predictor variables
from different sources. The dataset was originally compiled for Benito
et al. (2013).
Its main features are:
- Americas scope: Hexagonal cells covering longitudes -125.3° to -34.3° and latitudes -34.4° to 49.9°.
-
Single response:
trees, an integer count of tree species richness per hexagonal cell. - Rich environmental predictors: 50 predictors spanning 10 categories (climate, soil, vegetation, geography, and more).
We designed it to support regression modelling, multicollinearity filtering, and spatial analysis of tree diversity patterns.
Description
The dataset is an sf data frame with 3373 rows and 53
columns, and 989 cells with NA in the trees
response variable (cells where environmental data is valid but no tree
species were found in the relevant databases). The first 10 records and
all columns but geometry are shown below.
trees |>
head(n = 10) |>
dplyr::glimpse()
#> Rows: 10
#> Columns: 53
#> $ cellid <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
#> $ trees <int> 1, NA, 1, 5, 1, NA, NA, NA, NA, NA
#> $ air_humidity_max <dbl> 66.72578, 68.97865, 65.92500, 63.75984, …
#> $ air_humidity <dbl> 64.19672, 65.28275, 62.97762, 60.19885, …
#> $ air_humidity_min <dbl> 62.62891, 62.51946, 59.47713, 55.22597, …
#> $ air_humidity_range <dbl> 3.675112, 5.961699, 5.949539, 8.043704, …
#> $ aridity <dbl> 3.034283, 7.000583, 2.228907, 3.352518, …
#> $ cloud_cover_max <dbl> 51.80775, 59.59544, 49.02710, 53.67749, …
#> $ cloud_cover <dbl> 39.38748, 49.43344, 36.08522, 37.64137, …
#> $ cloud_cover_min <dbl> 23.980626, 37.773336, 18.689770, 14.2178…
#> $ cloud_cover_range <dbl> 27.21311, 21.36061, 29.83153, 38.95784, …
#> $ evapotranspiration_max <dbl> 145.8778, 121.9521, 162.0084, 165.8928, …
#> $ evapotranspiration <dbl> 81.39344, 62.25680, 86.53583, 83.38519, …
#> $ evapotranspiration_min <dbl> 31.34277, 18.14818, 26.22737, 24.54848, …
#> $ evapotranspiration_range <dbl> 114.0328, 103.3089, 135.2890, 140.8501, …
#> $ rainfall_seasonality <dbl> 69.05514, 58.10799, 83.43844, 73.72836, …
#> $ rainfall <dbl> 1984.803, 3788.573, 1797.700, 2674.796, …
#> $ rainfall_coldest_quarter <dbl> 899.4158, 1485.1863, 935.7548, 1274.9368…
#> $ rainfall_driest_month <dbl> 13.207154, 72.568857, 4.360102, 12.90247…
#> $ rainfall_driest_quarter <dbl> 74.86140, 287.73483, 34.07716, 87.48721,…
#> $ rainfall_warmest_quarter <dbl> 74.86140, 314.23001, 34.07716, 87.48721,…
#> $ rainfall_wettest_month <dbl> 354.6468, 601.3340, 376.6572, 511.7660, …
#> $ rainfall_wettest_quarter <dbl> 952.3741, 1699.2585, 954.7104, 1328.0056…
#> $ temperature_isothermality <dbl> 42.31744, 31.10151, 46.19262, 40.70953, …
#> $ temperature_mean_daily_range <dbl> 4.970194, 4.531185, 7.362851, 7.845559, …
#> $ temperature <dbl> 10.965723, 8.226245, 10.837620, 10.12197…
#> $ temperature_range <dbl> 12.12370, 15.21725, 16.33477, 19.58319, …
#> $ temperature_seasonality <dbl> 255.8554, 356.1905, 308.2309, 410.3759, …
#> $ temperature_coldest_month_min <dbl> 5.6348733, 1.8606111, 4.1480463, 2.53274…
#> $ temperature_coldest_quarter <dbl> 8.016393, 4.015278, 7.217357, 5.473721, …
#> $ temperature_driest_quarter <dbl> 14.54247, 12.75220, 15.08227, 15.84711, …
#> $ temperature_warmest_month_max <dbl> 18.26975, 17.30557, 20.92225, 22.42327, …
#> $ temperature_warmest_quarter <dbl> 14.54247, 13.03118, 15.08227, 15.84711, …
#> $ temperature_wettest_quarter <dbl> 8.542474, 4.589159, 7.622423, 5.954469, …
#> $ distance_to_ocean <dbl> 47.70194, 95.32629, 160.80169, 236.81731…
#> $ elevation <dbl> 82.64232, 279.17706, 327.39466, 516.4347…
#> $ latitude <dbl> 43.07518, 48.57816, 40.58208, 42.43964, …
#> $ longitude <dbl> -124.3935, -124.6602, -124.0775, -124.14…
#> $ soil_clay <dbl> 18.97297, 19.41023, 26.68421, 28.00797, …
#> $ soil_nitrogen <dbl> 4.081081, 5.990067, 3.774358, 4.297677, …
#> $ soil_organic_carbon <dbl> 69.37015, 86.66117, 72.22415, 75.67207, …
#> $ soil_ph <dbl> 5.242456, 4.839451, 5.537970, 5.388273, …
#> $ soil_sand <dbl> 49.89072, 32.86094, 28.39756, 31.97907, …
#> $ soil_silt <dbl> 29.78496, 46.36619, 43.56146, 38.65843, …
#> $ soil_temperature_max <dbl> 20.68563, 19.32484, 23.68939, 24.77120, …
#> $ soil_temperature <dbl> 10.056886, 7.044211, 11.533294, 10.98734…
#> $ soil_temperature_min <dbl> 1.9985030, -0.7671579, 2.9644339, 1.4102…
#> $ soil_temperature_range <dbl> 18.21407, 20.42632, 20.22940, 22.93742, …
#> $ ndvi_max <dbl> 0.7350104, 0.8117852, 0.7702507, 0.77888…
#> $ ndvi <dbl> 0.6665069, 0.7123055, 0.7063288, 0.71021…
#> $ ndvi_min <dbl> 0.5839046, 0.6050872, 0.6405807, 0.62898…
#> $ ndvi_range <dbl> 0.1511058, 0.2066980, 0.1296699, 0.14990…
#> $ geometry <POLYGON [°]> POLYGON ((-124.8289 42.6689..., POLYGON …The hexagonal grid cells cover the Americas between approximately 50°N and 35°S. The map below shows the grid coloured by tree richness.