EFI-CI Briefing Paper

Authors

Cameron Thompson, Jacob Zwart, Jake Kritzer, Chris Brown, Jessica Burnett, Mike Dietze, Jody Peters, Hassan Moustahfid

Cyberinfrastructure to Enable Ecological Forecasting for Decision Making

Society increasingly faces environmental challenges that require people to make decisions based on what they think the future state of nature will be. To that end, ecological forecasting attempts to make predictions that inform decision-makers on the likely future state of the environment, and numerous predictive models have been developed. These models represent environmental processes and use real-world data to make predictions, the results of which help us understand the natural world and give context to changes we observe. However, they are often run manually or with boutique, independently developed workflows that are difficult to learn from iteratively and that hinder our collective ability to generate forecasts efficiently. Meeting the challenge of making informed decisions in a rapidly changing world requires frequent, near-term, and reproducible ecological forecasts that are supported by robust cyberinfrastructure.

A Workshop for Cyberinfrastructure Design Principles

Cyberinfrastructure can facilitate our ability to create, improve, and learn from forecasts, but there appears to be a lack of widely accepted cyberinfrastructure design principles, which is a major obstacle for researchers and organizations seeking to engage in ecological forecasting. To address this obstacle, EFI and the Northeastern Regional Association for Coastal Ocean Observing Systems (NERACOOS) convened a workshop to:

  1. Collate common cyberinfrastructure practices currently in use
  2. Identify cyberinfrastructure needs and gaps
  3. Propose cyberinfrastructure designs for various forecasting problems

The workshop brought together a diverse mix of participants from sectors including NGOs, private industry, academia, and federal agencies.

Through presentations, panels, and breakout sessions, participants discussed workflows, best practices, and the implementation of cyberinfrastructure across agencies. Although there were technical presentations, the makeup of participants shaped the scope of the topics covered, and most discussions focused on higher-level concepts and general design rather than the technical aspects of cyberinfrastructure. The participants’ recommendations often reflected those in the literature, such as the FAIR principles, which stipulate that data should be findable, accessible, interoperable, and reusable. Beyond that, their varied experiences and perspectives highlighted the many challenges that hinder the adoption of best practices and the advancement of ecological forecasting. Here we report on some of the major themes from the workshop and identify areas for advancing the implementation of cyberinfrastructure to support ecological forecasting.

Cyberinfrastructure and Cloud-based Architecture

Cyberinfrastructure refers to the integrated suite of hardware, software, data, and human resources designed to support data analysis and information processing. The design of cyberinfrastructure for ecological forecasting can take many different forms, and the sequence of processes within that infrastructure constitutes the workflow. Cyberinfrastructure best practices include concepts like data standards and modularity, which enable the workflow to be automated, flexible, and scalable. These workflow best practices are also aspects of cloud-based architecture, which involves designing systems specifically to leverage cloud services and capabilities. However, cyberinfrastructure can adopt cloud-based architecture principles without using vendor cloud services. Doing so takes advantage of the flexible, scalable, efficient design while enabling local control and the use of cost-effective infrastructure like on-premises high-performance computing systems.
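As an illustration of modularity, the following is a minimal sketch (not a design presented at the workshop) of a workflow built from interchangeable stages. The stage names `ingest`, `forecast`, and `publish`, the sample observations, and the placeholder mean model are all hypothetical, chosen only to show how stages can be swapped without touching the rest of the workflow.

```python
from typing import Callable, Dict, List, Optional

# A stage takes the shared workflow state and returns an updated copy.
Stage = Callable[[Dict], Dict]

def ingest(state: Dict) -> Dict:
    # Hypothetical stand-in for pulling observations from a data source.
    return {**state, "observations": [10.0, 12.0, 11.0]}

def forecast(state: Dict) -> Dict:
    # Placeholder model (an assumption): predict the mean of the observations.
    obs = state["observations"]
    return {**state, "prediction": sum(obs) / len(obs)}

def publish(state: Dict) -> Dict:
    # Hypothetical stand-in for writing results to shared storage.
    return {**state, "published": True}

def run_workflow(stages: List[Stage], state: Optional[Dict] = None) -> Dict:
    # Run each stage in order, passing the state from one to the next.
    state = dict(state or {})
    for stage in stages:
        state = stage(state)
    return state

result = run_workflow([ingest, forecast, publish])

# Swapping the model only requires replacing one stage in the list.
def forecast_max(state: Dict) -> Dict:
    return {**state, "prediction": max(state["observations"])}

alternative = run_workflow([ingest, forecast_max, publish])
```

Because each stage depends only on the shared state, the same list of stages could in principle be scheduled on a laptop, an on-premises cluster, or a vendor cloud service without changing the stage code.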

The rapid and inevitable shift to cloud computing is upending traditional IT architectures, procurement models, and budget structures. Organizations risk cost inflation and loss of leverage to commercial providers when locking into cloud vendor services. Adopting cloud-based architecture while minimizing reliance on vendor services addresses some of these challenges and enables adoption of cyberinfrastructure best practices. Another strategy is to coordinate across agencies and other organizations to reach collective, or at least transparent, agreements on service requirements and costs.

Community Standards, Best Practices, and Flexibility

Establishing community-wide conventions for data standards and workflow best practices is crucial for developing robust cyberinfrastructure. However, a balance must be struck between standardization and flexibility to accommodate the diverse needs of ecological forecasting. Integrating and harmonizing heterogeneous data from diverse sources is a significant barrier, particularly for biological and ecological data which are difficult to standardize. Therefore, the focus should be on common concepts and prioritizing standard outputs rather than inputs, which allows for direct comparison of outcomes.
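To make the idea of standardized outputs concrete, here is a minimal sketch under stated assumptions. The column names in `REQUIRED_COLUMNS`, the two example models, and the observed values are hypothetical and are not a convention endorsed by the workshop. The point is only that forecasts sharing an output layout can be checked and scored by the same code, whatever inputs or internals produced them.

```python
# Hypothetical minimal output layout shared by all forecasts.
REQUIRED_COLUMNS = {"datetime", "site_id", "variable", "prediction"}

def missing_columns(rows):
    """Return the required columns that are absent from at least one row."""
    missing = set()
    for row in rows:
        missing |= REQUIRED_COLUMNS - set(row)
    return missing

def mean_absolute_error(rows, observed):
    """Score a forecast against observations keyed by (datetime, site_id, variable)."""
    errors = [
        abs(r["prediction"] - observed[(r["datetime"], r["site_id"], r["variable"])])
        for r in rows
    ]
    return sum(errors) / len(errors)

# Two hypothetical models with different internals but the same output layout.
model_a = [
    {"datetime": "2024-06-01", "site_id": "lake1", "variable": "temperature", "prediction": 10.0},
    {"datetime": "2024-06-02", "site_id": "lake1", "variable": "temperature", "prediction": 12.0},
]
model_b = [
    {"datetime": "2024-06-01", "site_id": "lake1", "variable": "temperature", "prediction": 11.5},
    {"datetime": "2024-06-02", "site_id": "lake1", "variable": "temperature", "prediction": 11.0},
]

# Made-up observations used only to demonstrate scoring.
observed = {
    ("2024-06-01", "lake1", "temperature"): 11.0,
    ("2024-06-02", "lake1", "temperature"): 11.0,
}

scores = {
    name: mean_absolute_error(rows, observed)
    for name, rows in {"model_a": model_a, "model_b": model_b}.items()
    if not missing_columns(rows)
}
```

Here `scores` holds one comparable error value per model, which is the kind of direct comparison that a shared output layout makes possible.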

The development of standards should be fostered through communities of practice, since grassroots involvement will lead to greater adoption and widespread acceptance. Where available, existing conventions should be borrowed; where no standards exist, projects are encouraged to create them. To incentivize the establishment and adoption of data standards and best practices, they can be required by funding agencies and recognized by communities, as is done in forecasting challenges. Further support should be provided by collating catalogs of existing models, data, and tools, and by investing in open-source technologies.

Human Capital, Collaboration, and Communication

Cyberinfrastructure for ecological forecasting requires people with diverse skills and expertise to ensure effective development, implementation, and maintenance of the system. At the research stage, ecological forecasters are typically domain scientists and data scientists who often struggle to operationalize a model into a workflow because of the steep learning curve for the required skills and expertise. In addition to scientific knowledge, ecological forecasting requires technical skills in software development, continuous deployment, cloud architecture, data management, visualization, and project management. Collaboration and effective communication are needed among people with these various backgrounds in order to advance ecological forecasting. However, there is a lack of trained individuals accessible to the ecological forecasting community who can fill those roles.

Multiple strategies should be employed to build the human infrastructure needed for ecological forecasting. These include training and workforce development for new professionals to fill various roles as well as skill-building for inexperienced workers. This will enable them to work towards operationalizing models and more effectively communicate and collaborate with other technical professionals.

Design Justice Principles and Stakeholder Engagement

When designing an ecological forecasting model and supporting cyberinfrastructure, it is essential to meaningfully engage with stakeholders early on and continuously throughout the research and operationalization process. Ecological forecasting models should be designed to inform specific decisions to ensure their relevance and usefulness. This design includes their visualization to ensure that the information being transmitted is intuitive and accessible, providing stakeholders with clear and actionable insights.

However, effective stakeholder engagement takes time to build relationships and trust, and high-quality engagement is particularly challenging when working with marginalized and under-resourced communities. Embedding design justice principles into the development process can center the voices and needs of these communities during the creation of ecological forecasting tools. This approach advocates for equitable participation and the inclusion of diverse perspectives which helps distribute the benefits of ecological forecasting equitably and addresses the unique challenges faced by marginalized groups. For ecological forecasting cyberinfrastructure, specific concerns include the relevance and accessibility of products and the use of data derived from marginalized communities. In those latter cases, following CARE (Collective Benefit, Authority to Control, Responsibility, and Ethics) principles can further help avoid potential conflicts and ensure ethical practices.

Culture, Institutions, and Community

Science culture and institutions incentivize novelty and innovation, while the operationalization of ecological forecasts with cyberinfrastructure requires stability, reliability, and continuity. Furthermore, there are tensions between what is considered research quality and production quality, which hinder the translation of models into operation-ready workflows. These barriers are exacerbated by the limited scope and duration of grant funding, which is typically inadequate for establishing and maintaining cyberinfrastructure.

Organizational inertia is another obstacle to ecological forecasting implementation, since workers are habituated to rely on particular funding sources and to use familiar technologies and workflows that are not well suited for sustained cyberinfrastructure. These ruts may also lead workers to resist established conventions and best practices in favor of their own or those used by their organization. Meanwhile, different agencies and organizations have their own missions and security requirements, which may make steps towards interoperability challenging.

Cross-agency collaboration and implementation of sustained cyberinfrastructure are a sure path towards the advancement of ecological forecasting, but they will require overcoming cultural and institutional barriers. Therefore, it is imperative for grassroots communities, in conjunction with agency champions, to organize and advocate for necessary changes.

Conclusion and Recommendations

The workshop was conceived on the premise that ecological forecasting cyberinfrastructure lacks widely accepted design principles. However, participants largely agreed with the best practices presented, which are consistent with the existing literature. Despite this consensus, barriers to adoption persist, primarily due to human factors, organizational culture, and limited institutional resources. To overcome these obstacles, we recommend the following:

  • Develop compelling communication materials in coordination with scientific societies including:
    • White papers that explain the need for ecological forecasting and cyberinfrastructure
    • Case studies that showcase the success of ecological forecasting
    • Flagship projects that could have substantial impact and visibility
  • Find and empower champions at multiple levels within agencies and equip them with communication materials
  • Build communities of practice:
    • Participate in their workshops and conferences to promote cyberinfrastructure best practices
    • Provide them with tools, training, and platforms to foster collaborations
  • Expand the workforce and bridge expertise by collaborating with academic partners to modernize curricula, develop core competencies, and create professional development opportunities
  • Align with agency missions and priorities to gain traction in a resource-constrained environment
    • Reframe ecological forecasting in terms of agency missions and societal priorities such as economic development, public health, food and water security, hazard resilience, and environmental justice
    • Capitalize on strategic opportunities such as high-profile initiatives, administration transitions, and budget cycles to promote ecological forecasting
  • Advocate for ecological forecasting cyberinfrastructure in coordination with scientific societies, stakeholder groups, and boundary organizations to effectively translate needs, priorities, and opportunities