Final Report to the Environmental Protection Agency
Report Date: September 30, 1998
Executive Summary
The EPA's Scientific Advisory Board, in a report, Reducing Risk: Priorities and Strategies for Environmental Protection, identified biological depletion and habitat modification among the four highest priority ecological risks in the United States. In recognition that the loss of biological diversity can only be effectively addressed through cooperation of vested interests, EPA formed the Biodiversity Research Consortium (BRC) to develop the technical information and databases needed to assess and manage risks to biodiversity. The BRC proposed a national assessment of comparative risks to biodiversity. The objective of a national assessment would be to identify those areas having species assemblages that contribute the greatest genetic diversity to the biota of their biogeographic regions and high-risk areas requiring management intervention to sustain biodiversity.
Before authorizing a national assessment, the Scientific Advisory Board urged the BRC to demonstrate that reliable data and methodology exist. To answer both the ecological and technical (data and analysis) components of these questions, the BRC sponsored and coordinated pilot studies on five aspects of a potential national assessment:
In 1994, the BRC entered a cooperative agreement with the University of California, Santa Barbara (UCSB), to continue the investigation by developing and analyzing a GIS database of stressors and biophysical factors. This report presents the results of the UCSB research related to aspect #4 above.
A stressor can be defined as a factor operating at the organism, population, community, or ecosystem level of organization that causes or may cause a change in habitat, or a change in exposure to adverse physical, chemical, or biological conditions. We can distinguish between stressor as a disturbance and stress as the response to disturbance. This definition implies that the effects of stressors should be measured in relation to reference conditions. Indicators of stress can also be divided into stressors that directly affect biodiversity at one or more of the ecological levels mentioned above and conditions that characterize the effects of stress.
EPA specified that the study area be the three West Coast states of the United States (Washington, Oregon, and California), for which the requisite data on species distributions by hexagon were being compiled by The Nature Conservancy. This region, referred to in this report as the West Coast Transect or WCT, spans 16 degrees of latitude and 10 degrees of longitude. The WCT environments range from cool, moist temperate rainforest in the Pacific Northwest to hot, arid deserts in southeastern California near the border with Mexico. The region provides a good sampling of environments and habitats in which to examine patterns of richness and relationships with biophysical variables. This region has also experienced a wide range in the magnitude and types of environmental stress. Because the study area includes three states with a combination of public and private lands, the data sets evaluated in this pilot study had to be comprehensive across states and ownership, and they are therefore reasonable prototypes of a national database. Sampling units for the BRC national assessment were defined as the 640 km² sampling hexagons designed for EPA’s Environmental Monitoring and Assessment Program (EMAP).
The UCSB research attempted to answer several questions regarding the level of risk to biodiversity. Which existing data sets relate to stressors and biophysical factors that affect biodiversity? How should they be manipulated to create a consistent database (i.e., structured by EMAP hexagons) that best represents these factors with respect to species richness and stress? Beyond identifying a candidate set of stressors (and biophysical factors), the BRC program needed answers to two specific questions:
Because this was a pilot study, the primary criterion for evaluating project success is whether the two BRC questions about consistency between data sets and their relationship to biodiversity were answered. Stressors and biophysical factors should be selected for analysis based on two criteria:
The next three sections are organized around the major research themes of the study: patterns of species richness and relationships with biophysical factors, the potential for remotely sensed data to estimate environmental stress, and an assessment of risk to regional biodiversity. The subsequent section describes the application of stressor data in several related research projects. The report concludes with our answers to the two research questions above and with our recommendations for future research.
Comparisons across Taxa of Biophysical Predictors of Species Richness
Recent improvements in the quality and resolution of spatial data on species distributions and related biophysical factors prompted a study to re-examine the relationship between them. Our research objectives in this study were to 1) identify common biophysical factors that predict richness among taxa, 2) evaluate the potential contribution of remotely sensed data as an integrator of biophysical factors for predicting richness, and 3) identify natural stressors from among the biophysical variables that could be used in Chapter 4 for analysis with anthropogenic stressors.
Data on species distributions had been compiled by EMAP hexagon by The Nature Conservancy for the three western states. Improved biophysical data include new interpolations of climate variables that account for topographic effects, recently completed soils maps, and remotely sensed data products known to be related to primary production. Species richness was compiled for birds, mammals, amphibians, reptiles, rare vertebrates, and trees at the resolution of EMAP hexagons. Biophysical factors were summarized as means and standard deviations of the pixel values within hexagons (see Mean annual precipitation as an example). The relationships were examined by regression tree analysis to identify the most useful predictors of richness and compare these with other findings from the scientific literature (see the regression tree predicting tree species richness as an example).
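The hexagon summaries described above can be illustrated with a minimal sketch. It assumes each pixel has already been assigned to an EMAP hexagon; the hexagon identifiers and precipitation values are made up for illustration, and the use of the population standard deviation is an assumption rather than a detail stated in this report.

```python
from collections import defaultdict
from statistics import mean, pstdev


def summarize_by_hexagon(pixels):
    """Group (hexagon_id, value) pairs and return {hexagon_id: (mean, std)}."""
    groups = defaultdict(list)
    for hex_id, value in pixels:
        groups[hex_id].append(value)
    # The mean stands for the regional condition of the hexagon; the standard
    # deviation stands for heterogeneity within the hexagon.
    return {h: (mean(v), pstdev(v)) for h, v in groups.items()}


# Hypothetical mean annual precipitation pixels (mm), not data from the study.
pixels = [
    ("H1", 400.0), ("H1", 600.0),
    ("H2", 1200.0), ("H2", 1200.0), ("H2", 1500.0),
]
summary = summarize_by_hexagon(pixels)
for hex_id, (avg, sd) in sorted(summary.items()):
    print(hex_id, round(avg, 1), round(sd, 1))
```

Each hexagon then contributes two candidate predictors per biophysical factor, one for the mean and one for the standard deviation.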
The most frequently used biophysical variables in the regression tree models for the different taxonomic groups represented broad regional patterns of climatic conditions, i.e., mean values in hexagons. Variables representing habitat heterogeneity within EMAP hexagons, i.e., standard deviations, were very seldom selected for the models, and then only for lower level splits in the regression tree. The one exception was for mammals where the standard deviation of mean annual precipitation was the most important variable in predicting richness. Even the habitat heterogeneity factor derived from the number of ecosystems in a hexagon, as mapped by Hargrove and Luxmoore (1998), was never selected in any richness model. Soil factors were not generally useful in this study area in predicting richness patterns. The climate factors adequately captured the regional patterns of richness so that elevation and topographic diversity or relief were not needed, although they were frequently cited in past literature. Perhaps interpolation of precipitation and temperature over digital elevation models to high resolution grids was a substantial improvement over the cruder estimates from meteorological stations used in the past. Solar irradiance was a factor in every model except for rare species. Rarely was the same factor selected for the primary split in a regression tree for more than one taxon.
The satellite-based remote sensing data played a small role in predicting richness. In several groups, the models with NDVI factors explained slightly more variation than those without them. NDVI factors were not used at all in models for rare vertebrates or trees. The variation explained for reptiles was actually reduced slightly when using NDVI factors, but this may reflect the subjectivity in pruning regression trees more than the usefulness of the data. Potential weaknesses in using NDVI composite data from USGS in biodiversity studies have been discussed elsewhere (Stoms et al. 1997). In addition, NDVI reflects current land use and disturbance, whereas the TNC species data tend to portray long-term patterns of species distributions. At the scale of EMAP hexagons, actual productivity may be less useful than potential primary production.
Land use information may be better applied in understanding potential stresses on biodiversity. Human land use can divert the natural inputs of precipitation and energy away from native species and ecosystems. This change would be reflected in imagery obtained from space. Land use also disrupts normal ecosystem flows and processes by reducing and fragmenting habitats, increasing mortality, introducing superior competitors or predators, and exposing species to harmful toxins. While these stressors do not necessarily modify the basic inputs found in this chapter to be associated with patterns of species richness, they lower the potential of native species to use them fully. Species are also exposed to natural stressors, such as extremes of heat and cold between seasons of the year.
One of the most dramatic impacts of human activities on ecosystem functioning is the appropriation of net primary production (NPP) for human uses at the expense of much of the rest of the biota. Assessing the amount of this appropriation and monitoring its expansion is difficult at the resolution needed for regional scale studies. Data are often too coarse (e.g., county, state, or national level statistics), are only collected in some areas (e.g., National Resource Inventory on non-federal lands), or are collected infrequently (e.g., the decadal census).
Periodic satellite remote sensing offers one tool for monitoring ecosystem functioning. Therefore a method that tracks changes in NPP would estimate levels of environmental stress that may cause decline or extinction of local species populations. In this study, rather than using AVHRR imagery to detect changes between two dates, we developed a simple predictive model of potential greenness in the absence of human land use effects. Sample pixels were selected from nature reserves as mapped for the Gap Analysis Program where land uses generally maintain natural ecological processes. A regression tree model for the West Coast Transect study area (California, Oregon, and Washington) used mean annual precipitation, mean January and July temperatures, and available soil water capacity, and it captured most of the variation in NDVI in undisturbed natural areas. Actual greenness deviated as expected from potential greenness in response to human land use effects such as urbanization and agriculture.
Patterns of NDVI: a) actual and b) predicted (or potential) time integrated NDVI, c) positive deviations between predicted and actual time integrated NDVI (greenness was predicted higher than actual), and d) negative deviations (greenness was predicted lower than actual).
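The comparison of potential and actual greenness can be sketched as a simple difference per pixel. In this sketch the deviation is taken as predicted minus actual, the tolerance band is a hypothetical value, and the sample values are invented only to exercise each branch; none of these numbers come from the study.

```python
def greenness_deviation(predicted, actual):
    """Deviation of potential (predicted) from actual time integrated NDVI."""
    return predicted - actual


def classify(deviation, tolerance=0.05):
    # tolerance is a hypothetical noise band, not a threshold from the report.
    if deviation > tolerance:
        return "actual below potential"
    if deviation < -tolerance:
        return "actual above potential"
    return "within tolerance"


# Made-up (predicted, actual) pairs; the site names are illustrative only.
samples = {
    "site_a": (0.62, 0.60),
    "site_b": (0.55, 0.20),
    "site_c": (0.15, 0.58),
}
labels = {name: classify(greenness_deviation(p, a))
          for name, (p, a) in samples.items()}
for name in sorted(labels):
    print(name, labels[name])
```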
Regression trees are a useful exploratory data analysis technique for predictive modeling, as was done for NDVI here. The method typically captures broad regional patterns at the upper levels of the hierarchical divisions, and then incorporates local processes or effects at lower levels. In fact, unexpected relationships may emerge that contradict the results of simple correlation analysis. For instance, mean July temperature was negatively correlated with time integrated NDVI because the hottest locations in the deserts have very sparse vegetation cover. In the branch of the model with high precipitation and cold January temperatures, however, warmer July temperatures predicted higher NDVI in the interior mountains. This important local relationship would probably have been overlooked in multiple regression analysis because of the conflicting trends in subsets of the data. The drawback of regression tree analysis is that pruning trees is still a case-specific process. Analysts could produce different trees from the same data set depending on how they choose to prune the initial tree that fits all observations. Defining the study region to be modeled could also influence the final shape of the tree. Some local patterns may have been better predicted if the West Coast Transect had been partitioned into smaller subregions first. The risks of partitioning are that artificially abrupt changes will be created in the predicted variable at the boundaries of subregions and that training samples will be inadequate to represent the variation in each subregion. In this case, when a categorical variable for ecoregions was tested in the regression tree analysis, it did not enter the final model.
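The way a regression tree can expose a local trend that runs against the overall correlation can be shown with a small sketch. It implements only the first step of tree building (choosing the single split that minimizes the summed squared error) and is not the software or settings used in this study. The seven rows are synthetic values constructed to mimic the pattern described above.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, target, predictors):
    """Return (predictor, threshold, total_sse) for the best binary split."""
    best = None
    for name in predictors:
        levels = sorted({r[name] for r in rows})
        for lo, hi in zip(levels, levels[1:]):
            threshold = (lo + hi) / 2
            left = [r[target] for r in rows if r[name] <= threshold]
            right = [r[target] for r in rows if r[name] > threshold]
            total = sse(left) + sse(right)
            if best is None or total < best[2]:
                best = (name, threshold, total)
    return best


def covariance_sign(rows, x, y):
    """Return +1, -1, or 0 for the sign of the covariance of x and y."""
    mx = sum(r[x] for r in rows) / len(rows)
    my = sum(r[y] for r in rows) / len(rows)
    cov = sum((r[x] - mx) * (r[y] - my) for r in rows)
    return (cov > 0) - (cov < 0)


# Synthetic rows: precipitation (mm), mean July temperature (C), NDVI.
rows = [
    {"precip": 100, "july_temp": 35, "ndvi": 0.10},
    {"precip": 150, "july_temp": 33, "ndvi": 0.12},
    {"precip": 200, "july_temp": 31, "ndvi": 0.15},
    {"precip": 300, "july_temp": 18, "ndvi": 0.20},
    {"precip": 1500, "july_temp": 14, "ndvi": 0.50},
    {"precip": 1600, "july_temp": 17, "ndvi": 0.60},
    {"precip": 1700, "july_temp": 20, "ndvi": 0.70},
]
name, threshold, _ = best_split(rows, "ndvi", ["precip", "july_temp"])
wet_branch = [r for r in rows if r[name] > threshold]
overall_sign = covariance_sign(rows, "july_temp", "ndvi")
branch_sign = covariance_sign(wet_branch, "july_temp", "ndvi")
print(name, threshold, overall_sign, branch_sign)
```

In this synthetic example the first split falls on precipitation, the overall relationship between July temperature and NDVI is negative, and the relationship within the wet branch is positive.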
While this approach appears promising for monitoring environmental stress, it is far from operational. Additional research is needed to corroborate these preliminary findings and to make the methodology reliable. Limitations of the present study can be summarized as those of AVHRR data, sampling of training sites, and validation. For all these reasons, we present the regression tree model developed in this report only as a promising initial step rather than as a finished work.
There are several limitations to the AVHRR data set used in this study. The model was developed using only 1990 NDVI composites from USGS. We know that NDVI metrics can vary significantly between years in response to interannual differences in weather, local disturbance, and sensor factors (e.g., calibration, drift in time of acquisition, and replacement of satellites). Using a single year to develop the predictive model has two limitations. First, the variables that enter the model, and therefore potential greenness, are likely to be sensitive to the choice of year(s) used to construct it. Second, the deviations calculated in comparison with a given year’s NDVI data might be sensitive to the data used in model building. Time integrated NDVI is probably less sensitive than other metrics such as the date of first green-up. It would still be prudent to test the sensitivity of the model and deviations to the choice of year. At the very least, sensitivity analysis would identify the effects of year-to-year variation in actual NDVI and deviations from baseline, and therefore the level of deviation that would robustly detect actual change. Perhaps a model could be developed using NDVI averaged over many years to remove the influence of weather variability, or alternatively using standardized principal components of multi-year data.
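One way to picture the sensitivity analysis suggested above is to compare each deviation with the year-to-year variation in NDVI at the same location. The sketch below is only an illustration of that idea: the multiplier k, the years, and the NDVI values are all hypothetical, and the rule of flagging deviations larger than k standard deviations is an assumption, not a method from this study.

```python
from statistics import mean, stdev


def baseline(ndvi_by_year):
    """Return the multi-year mean and sample standard deviation of NDVI."""
    values = list(ndvi_by_year.values())
    return mean(values), stdev(values)


def exceeds_interannual_noise(predicted, actual, year_to_year_sd, k=2.0):
    # k is a hypothetical multiplier; a real study would have to calibrate it.
    return abs(predicted - actual) > k * year_to_year_sd


# Hypothetical time integrated NDVI for one pixel in three years.
ndvi_by_year = {1989: 0.50, 1990: 0.56, 1991: 0.44}
avg, sd = baseline(ndvi_by_year)

small_change = exceeds_interannual_noise(0.62, 0.55, sd)
large_change = exceeds_interannual_noise(0.62, 0.30, sd)
print(round(avg, 2), round(sd, 2), small_change, large_change)
```

A deviation that falls inside the band of interannual variation would not be treated as evidence of real change under this rule.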
Modeling was limited to a study area on the west coast of the United States. Comparisons with other regions and for the entire nation will be needed to evaluate how general our results are. The compositing strategy used by USGS is known to produce NDVI images that tend to be biased towards off-nadir views, particularly in the western U.S. This satellite zenith angle bias blurs the effective resolution of the NDVI images and tends to inflate the apparent value of NDVI through a combination of atmospheric and bi-directional reflection effects. Ideally, alternative compositing algorithms should be compared with the maximum value compositing algorithm currently in use by USGS, both for the robustness of the model and for its accuracy in detecting real change in greenness. Other vegetation indices should also be examined, since some of these are less sensitive to atmospheric effects and soil background color. Others have had success combining land-surface temperature with NDVI to detect changes in vegetation.
The second issue relates to the choice of training data used to develop the regression tree model. We selected pixels randomly from managed areas that, in general, are not urbanized, cultivated, or intensively managed for resource extraction. The training data were extracted from areas being managed primarily for biodiversity objectives as mapped by the Gap Analysis Program. This does not mean that all pixels in the training set have been unaffected by human influences such as livestock grazing, mining, dams, or invasion of exotic species. Wildfires are routinely suppressed in the majority of these areas with varying effects on the plant canopy. Selecting training pixels known to be as similar as possible to presettlement conditions would be essential to develop the best predictive model. This will require a substantial effort to identify enough such sites to provide an adequate sample of all environments in the study area. Finding adequate sites will be problematic in many parts of the world where much of the landscape has already been altered.
This pilot study only evaluated deviations between potential and actual NDVI against mapped information on land use and land management. Further work is needed to interpret the results with higher resolution map or field information to determine if apparent deviations truly reflect environmental stress and not model errors.
The launch of the Moderate Resolution Imaging Spectroradiometer (MODIS) is currently scheduled for late 1998. MODIS will provide global image coverage of better quality than its AVHRR predecessor. Radiometric calibration, atmospheric corrections using specially chosen water absorption bands, and other enhancements will greatly improve the quality of daily images for monitoring environmental stress. Making our predictive modeling technique useful in the MODIS era will require careful cross-calibration of the AVHRR and MODIS data sets. The new MODIS data stream can then be appended to the historical time series compiled from AVHRR imagery. MODIS will give us both a more useful tool for monitoring changes in primary production and a means of extending our historical record of change.
In this analysis, we set out to answer three questions. The first question dealt with the pattern of species richness for rare and vulnerable terrestrial vertebrates. By extracting the species from the TNC database that were ranked as G1-G3 and tallying the number of species in each hexagon, we observed a distinct geographic pattern in richness. The hexagons with the greatest number of species (9-14) occurred along the California coastal zone, with a narrow band in the north and a wider band in the south. Richness of rare species generally declined with distance from the coastline. Desert regions in California generally had 1-3 rare species, while the interior hexagons of Oregon and Washington frequently had none.
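The tally of rare species by hexagon can be sketched as a filter followed by a count. The record layout, species names, and hexagon identifiers below are hypothetical and are not taken from the TNC database; only the G1-G3 rank filter follows the description above.

```python
from collections import Counter

RARE_RANKS = {"G1", "G2", "G3"}


def rare_richness(occurrences, global_rank):
    """Count distinct G1-G3 species per hexagon.

    occurrences: iterable of (hexagon_id, species) pairs.
    global_rank: mapping from species to its global rank string.
    """
    # A set removes duplicate records of the same species in a hexagon.
    rare_pairs = {
        (hex_id, species)
        for hex_id, species in occurrences
        if global_rank.get(species) in RARE_RANKS
    }
    return Counter(hex_id for hex_id, _ in rare_pairs)


# Hypothetical records for illustration only.
global_rank = {"sp_a": "G1", "sp_b": "G3", "sp_c": "G5"}
occurrences = [
    ("H1", "sp_a"), ("H1", "sp_b"), ("H1", "sp_a"),
    ("H2", "sp_c"), ("H2", "sp_b"),
    ("H3", "sp_c"),
]
richness = rare_richness(occurrences, global_rank)
print(richness["H1"], richness["H2"], richness["H3"])
```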
Regression tree modeling was applied to answer the second question about the relationship of this pattern of richness to biophysical and anthropogenic stressors. Data were compiled for 13 stressors. Two data sets represented natural stressors, 8 were anthropogenic stressors, and 3 were derived from satellite data and represented a combination of both types of stressors. Some of the data sets correspond to actual stressors (e.g., roads), while others are surrogate measures of environmental conditions (e.g., an index of habitat condition or percentage of area protected). For simplicity, we refer to all factors as "stressors" throughout the text. By far the most important predictors of rare species richness were two natural stressors, seasonal temperature difference and degree-day cool sum. These two variables represent the extremes of hot and cold to which rare species must adapt and the severity of the winter, during which body temperature must be preserved and food must be available. Rare species richness was highest in hexagons with the lowest values of these stressors, that is, where the climate is relatively mild year-round, such as areas with a marine influence. The only anthropogenic stressors selected in the regression tree model were the numbers of exotic species (both total and terrestrial vertebrates alone). Rare species and exotic species tended to have similar distribution patterns. It is unclear from our analysis whether exotic species have caused more vertebrates to become rare and vulnerable or whether, in the West Coast Transect study area, both are simply influenced by the same, undetermined ecological processes. The more direct measures of stress such as population density, roadedness, or habitat loss were not used by the regression tree model.
Our third question, about the value of satellite data in estimating environmental stress and the number of vulnerable species, produced a negative result. None of the three measures of environmental stress from NDVI was selected by the regression tree. The most significant differences between potential and actual NDVI were in urban and agricultural areas, which were not generally associated with large numbers of rare species. It may be that the vulnerable species have already been extirpated from these hexagons and were thus not in TNC’s database.
Our study was hindered by a lack of stressor data at the required resolution. A great deal of data, however, exists at the county scale or for similar geographic units. It may still be possible to use these data to estimate stressors at the finer, hexagon scale through development of smart interpolation methods. For instance, grazing density could be inferred from a model that uses commonly available spatial data such as topography. This kind of GIS model can explicitly limit predicted land uses to appropriate environmental settings and land stewards while disaggregating county level statistics. Other land uses such as logging might be modeled in a similar manner, as might the agricultural census data on chemical applications.
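The disaggregation idea above can be sketched as a proportional allocation of a county total among hexagons. The suitability weights are hypothetical stand-ins for the output of a GIS model (for example, one based on topography and land stewardship), and the proportional rule is an assumption made for illustration, not a method developed in this study.

```python
def disaggregate(county_total, hexagon_weights):
    """Split a county statistic among hexagons in proportion to weights.

    A weight of zero marks a hexagon as unsuitable for the land use,
    so it receives none of the county total.
    """
    total_weight = sum(hexagon_weights.values())
    if total_weight == 0:
        return {h: 0.0 for h in hexagon_weights}
    return {h: county_total * w / total_weight
            for h, w in hexagon_weights.items()}


# Hypothetical county grazing total and suitability weights.
weights = {"H1": 3.0, "H2": 1.0, "H3": 0.0}
allocation = disaggregate(1000.0, weights)
for hex_id in sorted(allocation):
    print(hex_id, allocation[hex_id])
```

The allocation preserves the county total, so the hexagon estimates remain consistent with the published statistic.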
We mentioned at the beginning of this chapter how human activity has appropriated a large proportion of NPP and energy. This monumental alteration of ecosystem function has been estimated only at national or global levels. We were disappointed in our attempts to apply this approach at the hexagon level. Data inputs were often too coarse, as discussed above. It may yet be possible to implement this approach, but it will require greater use of the coarse-scale data and smart interpolation techniques. One intriguing possibility for relating energy usage to biodiversity loss is to estimate energy usage from the nighttime lights data collected by DMSP satellites.
There is an alternative approach that could be taken in using the stressor data sets. Rather than using them to predict richness of vulnerable species in a modeling context, they could be used to identify potential "train wrecks" in hexagons where both stresses and biodiversity are high. This would require the development either of thresholds for levels of stressors that correspond to threat to biodiversity or of a new index that synthesizes the effects of stresses into an overall metric of threat.
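The threshold version of this screening can be sketched in a few lines. The richness and stress values, the single combined stress score, and both cutoffs are hypothetical; as noted above, defensible thresholds or a synthetic index would still have to be developed.

```python
def flag_train_wrecks(hexagons, richness_min, stress_min):
    """Return hexagon ids where both richness and stress meet the cutoffs.

    hexagons: mapping from hexagon id to (richness, stress) tuples.
    """
    return sorted(
        hex_id
        for hex_id, (richness, stress) in hexagons.items()
        if richness >= richness_min and stress >= stress_min
    )


# Hypothetical rare species counts and stress scores (0 to 1).
hexagons = {
    "H1": (12, 0.9),  # high richness, high stress
    "H2": (11, 0.2),  # high richness, low stress
    "H3": (1, 0.8),   # low richness, high stress
}
flagged = flag_train_wrecks(hexagons, richness_min=9, stress_min=0.7)
print(flagged)
```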
During the project period, we also participated in several related studies. These gave us the opportunity to develop methods for compiling and processing stressor data in smaller portions of the WCT study area and to apply these data in planning exercises. Two studies used similar measures of stress and management issues to guide selection of potential new biodiversity management areas. In both studies, stressors were used to rate the suitability of sites for biodiversity management as a complement to biological information. The first of these studies was part of the Sierra Nevada Ecosystem Project (SNEP) funded by the U.S. Forest Service and directed by Congress. The reserve selection modeling in this case was exploratory rather than for decision making. The second project, however, assisted TNC in planning its conservation portfolio for the Columbia Plateau ecoregion. Stressor data were valuable in the identification of an initial portfolio, which TNC staff then revised with detailed local knowledge of site conditions. In addition, the California Gap Analysis Project used stressor measures of roadedness, projected population growth, and level of protection to predict vulnerability of native plant communities.
Recommendations
We recommend three areas for further research.