Our Mission:
To Understand and Predict Ecological Systems
There is considerable scientific and societal interest in better understanding the terrestrial carbon cycle: how carbon dioxide is taken out of the air by plants, moves through ecosystems (i.e., fluxes), is stored in different plant and soil pools, and is released back to the atmosphere. In particular we need to better understand the variability and predictability of the terrestrial carbon cycle over the short-term (for carbon inventory monitoring, reporting, and verification), medium-term (for developing natural climate solutions), and long-term (to understand climate stabilizing feedbacks), and across spatial scales from individual sites to continents. Here we aim to better understand carbon variability and predictability through the generation and analysis of a new North American terrestrial carbon cycle data product that harmonizes information coming from an unprecedented volume and variety of on-the-ground measurements, data from satellites, and mathematical models of how the carbon cycle works. These analyses will provide new insights into long-standing questions such as: (1) How do different carbon pools and fluxes vary across space, time, and in response to environmental variables like temperature, precipitation, land use, and topography? (2) Under what conditions are our mathematical models most/least reliable? (3) Where are the gaps in our existing data-collection networks? (4) How far into the future are different carbon pools/fluxes predictable and which sources of uncertainty most limit predictability? To facilitate uptake we will work with the US Forest Service (USFS) to incorporate these data products into federal carbon accounting efforts and seek certification of these open-source technologies for use in the voluntary carbon markets. Finally, this project will contribute to the training of two graduate students and four undergraduates, with the latter recruited through an environmental data science program focused on increasing American Indian/Alaskan Native involvement in STEM.
This project will produce a carbon cycle “reanalysis” product based on the iterative model–data assimilation approaches commonly employed in numerical weather forecasting to harmonize process-based mathematical models with new observations. Specifically, we will expand the PEcAn terrestrial carbon data assimilation and forecasting system to integrate twelve new bottom-up field data constraints from the National Ecological Observatory Network (NEON), five data constraints from the USFS Forest Inventory, and Ameriflux eddy-covariance tower and ancillary data. These bottom-up constraints will “anchor” PEcAn’s existing assimilation, which is based on optical, lidar, and microwave remote sensing. To support this, we will refine existing data assimilation approaches not only to jointly estimate pools and fluxes, but also to capture spatiotemporal variability in model parameters and to employ hybrid machine learning approaches to understand the variability in model residual error. Finally, we will extensively analyze the variability in our continental-scale reanalysis product against a range of explanatory variables and across multiple time scales, spatial scales, and prediction lead times.
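To make the assimilation step concrete, below is a minimal sketch of an ensemble Kalman filter update of the kind used in such reanalyses, applied to a hypothetical two-pool carbon model. The model (`step_model`), the observation operator, and all numbers are illustrative assumptions, not the PEcAn implementation.

```python
import numpy as np

rng = np.random.default_rng(42)

def step_model(x, dt=1.0):
    """Toy two-pool carbon model: a fast pool (leaf/litter) feeds a slow
    (soil) pool. Purely illustrative dynamics, not a real ecosystem model."""
    fast, slow = x
    uptake = 2.0                    # constant C input to the fast pool
    transfer = 0.10 * fast          # fast -> slow transfer
    resp_fast = 0.05 * fast         # respiration losses from each pool
    resp_slow = 0.01 * slow
    return np.array([fast + dt * (uptake - transfer - resp_fast),
                     slow + dt * (transfer - resp_slow)])

def enkf_update(X, y, H, R):
    """Ensemble Kalman filter analysis step.
    X: (n_ens, n_state) forecast ensemble; y: observation vector;
    H: observation operator; R: observation error covariance."""
    Pf = np.cov(X.T)                                  # forecast covariance
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)    # Kalman gain
    # Perturbed observations give the analysis ensemble the correct spread
    Y = y + rng.multivariate_normal(np.zeros(len(y)), R, size=X.shape[0])
    return X + (Y - X @ H.T) @ K.T

n_ens = 100
X = rng.normal([20.0, 100.0], [5.0, 20.0], size=(n_ens, 2))  # initial ensemble
H = np.array([[1.0, 0.0]])   # only the fast pool is observed (e.g., biomass)
R = np.array([[1.0]])        # observation error variance

for y_t in [22.0, 23.5, 24.1]:                    # synthetic observations
    X = np.apply_along_axis(step_model, 1, X)     # forecast step
    X = enkf_update(X, np.array([y_t]), H, R)     # analysis step
    m, s = X.mean(axis=0), X.std(axis=0)
    print(f"fast C: {m[0]:6.1f} +/- {s[0]:.1f} | slow C: {m[1]:6.1f} +/- {s[1]:.1f}")
```

Even though only the fast pool is observed, the slow pool is adjusted through the ensemble covariance between the two, which is the mechanism that lets disparate observations jointly constrain all model states.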
NASA has devoted considerable resources to developing remote sensing data products aimed at quantifying and understanding the terrestrial carbon (C) cycle. Similar efforts have been undertaken throughout the research community, generating bottom-up estimates based on inventory data, eddy covariance, process-based models, etc. While these efforts collectively span a wide range of observations (optical, lidar, radar, field measurements) and response variables (cover, pools, fluxes, disturbances), each data product typically leverages only one or two data sources. However, what is fundamentally needed to improve monitoring, reporting, and verification (MRV) is not numerous alternative C estimates but a synthetic view of the whole. Furthermore, any approach to synthesis needs to be flexible and extensible, so that it can deal with different data sources with different spatial and temporal resolutions, extents, and uncertainties, as well as new sensors and products as they are brought online. Finally, it needs to inform top-down atmospheric inversions, which currently cannot ingest these bottom-up C estimates as a constraint.
In our first NASA CMS project we developed a prototype synthesis, focused initially on the continental US (CONUS), by employing formal Bayesian model-data assimilation between process-based ecosystem models and multiple data sources to estimate key C pools and fluxes. Models are at the center of our novel system, but rather than providing a prognostic forward simulation they serve as a scaffold in a fundamentally data-driven process by allowing different data sources to be merged together. Essentially, while data on different scales and processes are difficult to merge directly, all of these data can be used to inform the state variables (i.e., pools, not parameters) in the models. In addition to a ‘best estimate’ of the terrestrial C cycle, a key outcome of such a synthesis is a robust and transparent accounting of uncertainties. This approach is also highly extensible to new data products, or to changes in the availability of data in space and time, as assimilation only requires the construction of simple data models (e.g., likelihoods) that link model states to observations. Our bottom-up model-data assimilation also provides informative prior means and uncertainties for the CarbonTracker-Lagrange (CT-L) inverse modeling framework. The assimilation of a robust, data-driven bottom-up prior will provide, for the first time, a formal synthesis between top-down and bottom-up C estimates.
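The "simple data models" idea can be illustrated with a toy example: two independent Gaussian likelihoods, standing in for, say, a precise plot-level inventory and a noisier remote-sensing retrieval, jointly updating a single model state. The conjugate precision-weighting algebra and all values below are illustrative assumptions, not the project's actual observation models.

```python
def gaussian_fusion(prior_mu, prior_var, obs):
    """Combine a Gaussian prior (from the process model) with independent
    Gaussian likelihoods, one per data source, by precision weighting.
    obs: list of (observed_value, error_variance) pairs."""
    precision = 1.0 / prior_var
    weighted_sum = prior_mu / prior_var
    for y, var in obs:
        precision += 1.0 / var
        weighted_sum += y / var
    post_var = 1.0 / precision
    return weighted_sum * post_var, post_var

# Model forecast of aboveground biomass (Mg C / ha), deliberately uncertain
prior_mu, prior_var = 80.0, 25.0 ** 2

inventory = (95.0, 5.0 ** 2)    # precise plot-level field inventory
satellite = (70.0, 20.0 ** 2)   # noisier remote-sensing retrieval

mu, var = gaussian_fusion(prior_mu, prior_var, [inventory, satellite])
print(f"posterior: {mu:.1f} +/- {var ** 0.5:.1f} Mg C/ha")
```

The posterior is pulled hardest by the most precise constraint, and its variance is smaller than that of any single source; adding a new sensor is just another entry in the list, which is what makes the likelihood-based design extensible.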
In our second NASA CMS project we are extending our initial prototype in a number of exciting new directions. First, we will extend this system to all of North America, with a focus on 2015-present, a period that saw the addition of many new remote sensing platforms that are important for carbon monitoring. Specifically, we will extend our assimilation to NASA's SMAP (Soil Moisture Active Passive) microwave mission, the GEDI lidar mission, OCO-2 and -3 solar-induced fluorescence, and ECOSTRESS thermal imagery. We also aim to merge our CMS system with the NEFI site-scale land C forecast system to produce a continental-scale C budget with a seamless transition from reanalysis (best harmonized estimate of past carbon pools and fluxes) to nowcast (current state) to forecasts on the weather (35-day) and subseasonal-to-seasonal (9-month) timescales. We will also integrate new algorithms our team has developed to assimilate discrete data on land use, land-use change, and forestry (LULUCF) and natural disturbance. By coupling disturbance assimilation with near-term forecasting we will be able to make recovery forecasts that could help guide management and restoration efforts. More broadly, this system aims to directly meet the needs of land managers in a number of sectors, who desire information on daily to seasonal scales and make decisions based on what they expect to happen in the future.
We will develop a remotely sensed, data-constrained biogeochemical (BGC) model for landscape-scale (statewide) annual carbon and greenhouse gas (GHG) analysis and future projection modeling. Further, the model and data associated with this project will be open source to maximize transparency and innovation and to improve governance and future contracting options for the State of California. This monitoring and modeling framework will estimate carbon stocks and GHG fluxes on every acre of annual and perennial cropland in California, whether or not it is enrolled in healthy-soils or other climate-smart agricultural programs. This approach allows California to monitor, and to project into the future, the benefits of climate action and the consequences of inaction.
Because of the slow pace of terrestrial ecosystem processes, including the long generation times, slow growth, and slow decomposition of trees, the impact of changing climate and disturbance on forests plays out over hundreds of years. For this reason, terrestrial ecosystem models are used to project forest responses to environmental change at centennial scales. Current terrestrial ecosystem model predictions vary widely, and results have large statistical uncertainties. Furthermore, testing and calibration of these models rely on short-term (sub-daily to decadal) data that fail to capture longer-term trends and infrequent extreme events. The capacity of ecosystem models for scientific inference and long-term prediction would be greatly improved if uncertainties could be reduced through rigorous testing against observational data. PalEON is an interdisciplinary team of paleoecologists, statisticians, and modelers that has partnered to rigorously synthesize long-term paleoecological data and incorporate it into ecosystem models, providing a deeper understanding of past dynamics and using this knowledge to improve long-term forecasting capabilities.
PalEON addresses four objectives and associated research questions: 1) Validation: How well do ecosystem models simulate decadal-to-centennial dynamics when confronted with past climate change, and what limits model accuracy? 2) Initialization: How sensitive are ecosystem models to initialization state and equilibrium assumptions? Do data-constrained simulations of centennial-scale dynamics improve 20th-century simulations? 3) Inference: Was the terrestrial biosphere a carbon sink or source during the Little Ice Age and Medieval Climate Anomaly? and 4) Improvement: How can parameters and processes responsible for data-model divergences be improved? The data synthesis will span a wide range of ecosystems, encompass past climate variations that were large enough to affect tree growth rates, disturbance regimes, and forest demography, and leverage available paleodata. The synthesis will include 1) fossil pollen and Public Land Survey data to reconstruct forest composition, 2) sedimentary charcoal, stand-age, and fire-scar indicators of past disturbance regimes, 3) tree-ring records of tree growth rates, and 4) multiple paleoclimatic proxies and paleoclimatic simulations. Bayesian hierarchical statistical models will be used to reconstruct key ecological variables and their associated uncertainty estimates. A standardized model intercomparison involving 13 ecosystem modeling groups will be used to evaluate the robustness of the modeling approach.
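As a simplified illustration of the hierarchical reconstruction idea, the sketch below fits a three-level model (a latent random-walk trend, Gaussian proxy observations with a gap, and an unknown process variance) using a basic Metropolis-within-Gibbs sampler. The synthetic proxy record, priors, and sampler settings are all invented for illustration and are far simpler than the actual PalEON statistical models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic proxy record: noisy observations of a latent ecological trend,
# with np.nan marking a gap (e.g., a poorly preserved sediment interval).
T = 60
true_z = 10.0 + np.cumsum(rng.normal(0, 0.3, T))
obs_sd = 0.8
y = true_z + rng.normal(0, obs_sd, T)
y[20:30] = np.nan

def log_lik(z_val, y_val):
    """Data level: Gaussian proxy model, skipped where the record is missing."""
    return 0.0 if np.isnan(y_val) else -0.5 * ((y_val - z_val) / obs_sd) ** 2

n_iter, step = 4000, 0.4
z = np.where(np.isnan(y), np.nanmean(y), y)   # initialize latent states
proc_var = 0.1                                # process (random-walk) variance
keep = []

for it in range(n_iter):
    # Process level: Metropolis update of each latent state z_t
    for t in range(T):
        prop = z[t] + rng.normal(0, step)
        lp_old, lp_new = log_lik(z[t], y[t]), log_lik(prop, y[t])
        for s in (t - 1, t + 1):              # random-walk neighbors
            if 0 <= s < T:
                lp_old -= 0.5 * (z[t] - z[s]) ** 2 / proc_var
                lp_new -= 0.5 * (prop - z[s]) ** 2 / proc_var
        if np.log(rng.uniform()) < lp_new - lp_old:
            z[t] = prop
    # Parameter level: conjugate inverse-gamma draw for the process variance
    a = 2.0 + (T - 1) / 2.0
    b = 0.1 + 0.5 * np.sum(np.diff(z) ** 2)
    proc_var = b / rng.gamma(a)
    if it >= 1000:                            # discard burn-in
        keep.append(z.copy())

keep = np.array(keep)
lo, hi = np.percentile(keep, [2.5, 97.5], axis=0)
print(f"t=25 (inside the gap): {keep[:, 25].mean():.2f} "
      f"[{lo[25]:.2f}, {hi[25]:.2f}] vs truth {true_z[25]:.2f}")
```

The posterior credible interval widens inside the data gap, which is the kind of honest, spatially and temporally varying uncertainty estimate the reconstructions aim to deliver.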
Three areas will be emphasized for PalEON's broader impacts. Community Building: The PalEON research community has doubled over the past 10 months, now numbering more than 60 participants. It is anticipated to nearly double again over the next five years, and project funds will support this ongoing community-building via annual large meetings and task-oriented workshops. Interdisciplinary Training and Mentoring: A new generation of researchers will be trained to naturally conceptualize large spatial and temporal scales and to approach ecological forecasting as an integrative activity spanning data collection to model prediction. Additionally, the PalEON Summer Short Course provides an intensive cross-training experience for young scientists in all areas encompassed by PalEON. The 2012 course will be followed by courses in 2014 and 2016. Building Scientific Infrastructure: All PalEON datasets will be made publicly available upon publication, as will our new data-assimilation methods and model intercomparison protocols. Tools will be developed for optimal site selection (given the goal of reducing the integrated prediction uncertainty about past vegetation and climate over space and time), and a publicly available webtool version will be distributed and linked directly to the Neotoma Paleoecology Database.
Humanity depends upon the health of the natural world for its survival. However, in the face of climate change and other environmental challenges, society can no longer rely solely on past experience to understand and manage the world around us. This project asks the question, "What would it take to forecast ecological processes the same way we forecast the weather?" Central to this project is the development of an iterative cycle between making forecasts, performing analyses, and updating predictions in light of new evidence. This iterative process of gaining feedback, building experience, and correcting models and methods is critical for building a forecasting capacity, and is also a crucial part of any decision making under high uncertainty.
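One minimal way to picture this iterative cycle is a particle filter that repeatedly forecasts, scores the forecast against the newest observation, and updates both its state and its model parameters before forecasting again. Everything in the sketch below (the green-up toy model, the growth-rate parameter, the noise levels) is a hypothetical stand-in for a real forecast workflow.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "truth": a spring green-up index that rises by ~1.5 units per day.
true_rate, obs_sd = 1.5, 2.0
truth = np.cumsum(rng.normal(true_rate, 0.5, 20))

# Ensemble beliefs about the current state AND a model parameter (the rate),
# with the parameter prior deliberately biased low so learning is visible.
n_ens = 500
state = rng.normal(0.0, 2.0, n_ens)
rate = rng.normal(1.0, 0.5, n_ens)

for day, y in enumerate(truth):
    # 1. Forecast: push every ensemble member forward one day
    state = state + rate + rng.normal(0, 0.3, n_ens)
    print(f"day {day:2d}: forecast {state.mean():5.1f} +/- {state.std():.1f}"
          f" | observed {y:5.1f}")
    # 2. Analysis: weight members by the likelihood of the new observation
    w = np.exp(-0.5 * ((y - state) / obs_sd) ** 2)
    w /= w.sum()
    # 3. Update: resample states AND parameters, so the model itself learns,
    #    then jitter parameters slightly to avoid ensemble collapse
    idx = rng.choice(n_ens, n_ens, p=w)
    state, rate = state[idx], rate[idx] + rng.normal(0, 0.02, n_ens)

print(f"learned rate: {rate.mean():.2f} +/- {rate.std():.2f} (truth {true_rate})")
```

Each pass through the loop is one turn of the forecast cycle: issue a prediction, confront it with new data, and revise both the state estimate and the model itself before the next forecast.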
In addition to making ecology more relevant to management, near-term forecasts routinely compare specific, quantitative predictions to new data, which is one of the strongest tests of any scientific theory. This project will generate near-term forecasts that leverage ecological data collected by the National Ecological Observatory Network and spanning a wide range of themes: leaf phenology, land carbon and energy fluxes, tick-borne disease incidence, small-mammal populations, aquatic productivity, and soil microbial diversity and function. This broad, comparative approach will be used to address cross-cutting hypotheses about the nature of predictability in ecology and develop an overarching body of forecasting theory and methods.
The Near-term Ecological Forecasting Initiative (NEFI) will advance ecological knowledge at three levels: (1) overarching across-theme hypotheses about the predictability of ecological systems; (2) pressing within-theme questions about what drives process and predictability; and (3) advancing the tools and techniques that will enable an iterative approach to quantitative hypothesis testing. The overarching hypotheses of this project are that: (1) ecological predictability is driven more by process error than by initial-condition error; (2) there are consistent patterns in the sources of uncertainty across themes; (3) across themes, spatial and temporal autocorrelation are positively correlated; and (4) spatial and temporal autocorrelation are positively correlated with limits of predictability. Overall, the answers to these questions will address the extent to which there are general patterns in ecological predictability, which would both advance our basic understanding of ecological processes and constrain the practical problem of making forecasts.
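Hypothesis (1) can be made concrete with a small simulation experiment: run a toy stochastic model once with all uncertainty sources active and then with each source in isolation, and compare their contributions to forecast variance across lead times. The AR(1)-style model and all variances below are illustrative assumptions, not a NEFI analysis.

```python
import numpy as np

rng = np.random.default_rng(7)

def ensemble_variance(n_ens, horizon, ic_sd, proc_sd, r=0.9):
    """Run a toy AR(1) ecological state forward and return the ensemble
    variance at each lead time. ic_sd: initial-condition spread;
    proc_sd: process-error spread added at every step."""
    x = rng.normal(0.0, ic_sd, n_ens)
    var = np.empty(horizon)
    for t in range(horizon):
        x = r * x + rng.normal(0, proc_sd, n_ens)
        var[t] = x.var()
    return var

horizon = 30
total = ensemble_variance(5000, horizon, ic_sd=1.0, proc_sd=0.3)
ic_only = ensemble_variance(5000, horizon, ic_sd=1.0, proc_sd=0.0)
proc_only = ensemble_variance(5000, horizon, ic_sd=0.0, proc_sd=0.3)

for t in (0, 4, 14, 29):
    print(f"lead {t + 1:2d}: total var {total[t]:.2f} | "
          f"IC share {ic_only[t] / total[t]:5.1%} | "
          f"process share {proc_only[t] / total[t]:5.1%}")
```

For a stable system the initial-condition share decays toward zero while the process-error share saturates, so which source dominates depends on lead time; whether this pattern recurs across NEON themes is exactly what hypothesis (1) asks.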
The expected outcomes of NEFI are to: (1) disseminate data products and predictions that benefit society; (2) develop new tools and cyberinfrastructure that enhance research and education; and (3) promote teaching, training, and learning. Specific NEFI forecasts, such as tick-borne disease risk, aquatic blooms, carbon sequestration, and leaf phenology, are of direct relevance to society. Forecasts will be made available via open cyberinfrastructure that disseminates forecasts to the public and allows other ecologists to contribute new forecasts. To produce these forecasts, NEFI will develop an open-source statistical package, ecoforecastR, which will advance the tools and techniques beyond what is currently used by the community. Finally, in addition to the graduate students directly mentored through the project, NEFI will run an annual summer course on ecological forecasting that will train the next generation of ecologists.
Computer simulations play an essential role in ecological research, the management of national forests and other public and private land resources, and projections of climate change impacts on ecosystem services at the local, state, national, and international levels. However, at the moment, a number of barriers are slowing the pace of model improvement and limiting the wider use of models. First, the software for using each model is unique and does not communicate well with other models. Second, because each model is unique, the tools to manage data going into models, analyze models, and visualize results are not shared. In this project PEcAn (Predictive Ecosystem Analyzer) is being developed to provide a common set of software tools for researchers and land managers to effectively work with multiple ecosystem models and data. Web technologies will be used to allow distant modeling teams to share information, work together, and better use public and private cloud and supercomputing resources. Other tools will be developed to identify model errors and combine new and existing applications into workflows to make ecological research more efficient, better forecast ecosystem services, and support evidence-based decision making. The PEcAn team will also develop training tools for new users and work with the scientific community to add more models to PEcAn. PEcAn will make ecological research more transparent, repeatable, and accountable.
PEcAn is an open-source ecoinformatics system designed to enable ecologists with a range of modeling backgrounds to more easily parameterize, run, analyze, and assimilate data into ecosystem models at local and regional scales. This project will expand the PEcAn user community, incorporate more models, and develop tools that are more intuitive and accessible. Further, the project intends to transform the tools for managing the flow of information into and out of ecosystem models into a resilient, scalable, and distributed peer-to-peer network for managing the flow of this information among modeling teams and with the broader community. To support a larger number of models, data processing workflows will be improved and tools will be developed for multi-model visualization and benchmarking. Applications that distribute analyses across the PEcAn network, cloud, and high-performance computing environments will be used to better understand model structural error using data mining approaches. Models will be benchmarked over a range of environmental conditions, allowing model improvement to be tracked and users to select the best models for different applications in an informed manner. Finally, PEcAn tools will be combined into customizable workflows for real-time synthesis, forecasting, and decision support. By allowing modelers to focus on science rather than informatics, and allowing ecologists to easily compare their data to models, PEcAn will greatly accelerate the pace of model improvement and hypothesis testing. These activities are essential for improving ecosystem models and reducing uncertainty about the impacts of climate change on ecosystems and carbon cycle-climate feedbacks. Project information and results are available at http://pecanproject.org while project computer code is available at https://github.com/pecanproject.
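As a sketch of what condition-stratified benchmarking might look like, the example below scores two hypothetical models against synthetic flux observations both overall and within temperature bins, so that a model that looks adequate on average can still be flagged as unreliable under, say, hot conditions. None of this is PEcAn code; the models, data, and bins are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic benchmark data: observed carbon flux across a temperature
# gradient, standing in for the field data a benchmarking service would hold.
temp = np.linspace(0.0, 35.0, 300)
observed = 10.0 * np.exp(-((temp - 22.0) / 10.0) ** 2) + rng.normal(0, 0.8, 300)

# Two hypothetical models with different failure modes
predictions = {
    "model_linear":  0.35 * temp,                                # no decline at high T
    "model_optimum": 9.5 * np.exp(-((temp - 20.0) / 11.0) ** 2), # captures the optimum
}

def rmse(pred, obs):
    return float(np.sqrt(np.mean((pred - obs) ** 2)))

# Score each model overall AND within environmental bins, so users can see
# where each model is reliable instead of trusting one global number.
bins = [(0.0, 15.0, "cool"), (15.0, 25.0, "moderate"), (25.0, 35.1, "hot")]
for name, pred in predictions.items():
    parts = []
    for lo, hi, label in bins:
        mask = (temp >= lo) & (temp < hi)
        parts.append(f"{label} {rmse(pred[mask], observed[mask]):.2f}")
    print(f"{name}: overall RMSE {rmse(pred, observed):.2f} | " + ", ".join(parts))
```

Stratifying skill scores by environmental conditions is what allows users to pick the best model for a given application, and to track whether a model revision actually improved performance where it previously failed.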