1 Introduction
Coastal communities account for a disproportionately large share of population and economic activity worldwide. In the United States (U.S.) alone, they are home to 123 million people (40% of the total population; NOAA, 2014a), generate 45% of the Gross Domestic Product (GDP) and support over 51 million jobs (
NOAA, 2014b). These communities rely heavily on a diverse range of services provided by coastal margins, which constitute a resource-rich but ecologically sensitive interface between land and ocean.
Essential sub-systems of coastal margins are estuaries, complex environments where transitions from fresh to marine waters often occur across steep spatial and temporal gradients. Estuaries provide economic resilience to coastal communities and deliver important ecosystem services (
Barbier et al., 2011). Many estuaries are vital waterways, critical to local and global commerce. They also serve as essential migration routes or nurseries for birds, fish and shellfish, while buffering the coastal ocean from increased nutrient loads and other terrestrial contaminants. Collectively, they form a global resource of increasingly recognized significance for society (
MEA, 2005; OGE, 2014).
The susceptibility of estuaries to climate change and economic development is a major ecological and socioeconomic concern (
Lotze et al., 2006). Sea level rise alone poses a significant threat to the world’s estuaries (
Statham, 2012;
Robinson et al., 2013;
IPCC, 2014), via potential increases in flooding, salinity intrusion and disruption of aquatic ecosystems. Also, anthropogenic practices have imposed major stresses on estuaries in recent decades, such as increased nutrient loads leading to hypoxic dead zones (
Diaz and Rosenberg, 2008;
Howarth et al., 2011;
Statham, 2012). In some estuaries, the effects of these stresses have been mitigated or even reversed through management approaches that employ best practices informed by scientific understanding; for instance, nutrient loads have been reduced and hypoxic dead zones eliminated in several systems (
Diaz and Rosenberg, 2008).
However, for many estuaries, potentially drastic reductions in ecological resilience lie ahead as climate change and local anthropogenic stresses intensify. As a striking example, the 10 most populated river basins (4 in China, 3 in India, 2 in Africa, and 1 in Europe) already account for 10% of the world's GDP, a share projected to rise to 25% within 40 years (more than Japan, Germany, and the U.S. combined;
HSBC, 2012). Without effective management, this massive development poses an unprecedented threat to estuaries, with major potential for disruption of global biogeochemical cycles, of local and global ecosystem services and ultimately of human well-being.
Developing or improving science-based anticipatory capabilities that allow society to best manage estuaries is critical to a sustainable Earth, but is also challenging because estuaries are complex and highly variable ecosystems. To meet these challenges for a specific estuary, and provide a model for others, we created a distinctive scientific infrastructure that we term a collaboratory. The SATURN (Science And Technology University Research Network) collaboratory is designed to facilitate the generation and open flow of information, towards enhanced scientific understanding, prediction, operation, and sustainability of the Columbia River estuary in the U.S. (Fig. 1). Here, we document the operational infrastructure of SATURN and provide brief insights into lessons learned.
2 The study area
2.1 The broader context of the Columbia River basin
The Columbia River is the largest of the North American rivers flowing into the Pacific Ocean, with daily average discharges (at Bonneville Dam, Fig. 1) ranging from ~2,000 m³·s⁻¹ in the early fall to ~15,000 m³·s⁻¹ during spring freshets (Fig. 2, top panel). The river basin crosses the U.S.-Canada border, and includes seven U.S. states: Idaho, Oregon, Washington, Montana, Nevada, Wyoming, and Utah. As a major economic resource for the region, the Columbia River is managed for multiple purposes: power generation, flood risk mitigation, agricultural water supply, fisheries and navigation. Essential to these activities is the operation of major navigation and hydropower systems.
The Columbia-Snake River System is a vital waterway that supports the economies of multiple U.S. states, including Oregon and Washington (
Simmons and Casavant, 2010). With deep-draft and inland components, this waterway is a major gateway for wheat and barley exports (#1 in the U.S.) and for mineral bulk exports, wood exports and auto imports (#1 on the U.S. West Coast). Recently deepened to 13.1 m, the deep-draft channel extends for 166.6 km along the estuary and supports more than 40 million tons of cargo per year, valued at US$17 billion. The inland river system, mostly upstream of the estuary, supports 10 million tons of cargo per year, valued at US$3 billion.
The Federal Columbia River Power System is a network of 31 federally owned U.S. dams managed for hydropower production and flood protection. With ~22,500 MW of maximum generating capacity (over a third of the total U.S. hydroelectric capacity), these dams are the foundation of the Pacific Northwest power supply and also provide power to other U.S. states and Canada. In spite of the large number of U.S. dams, the storage capacity of the river is largely in Canada. The 1964 U.S.-Canada Columbia River Treaty, which regulates access by the U.S. to that storage capacity, is up for review in the 2014−2024 timeframe. While in its original form it focused only on power generation and flood protection, a revised Treaty will likely also address ecosystem function (U.S.
Entity, 2013).
Ecosystem function is already a major consideration in the operation of both the Columbia-Snake River System and the Federal Columbia River Power System. Of particular significance, four Pacific salmon stocks are listed as endangered and thirteen as threatened under the Endangered Species Act (ESA). ESA-mandated Biological Opinions set specific guidance for salmon protection and recovery that affects all development activities in the Columbia River basin. Also of direct relevance to salmon protection and recovery are the 1855 Columbia Basin Treaties between the U.S. and the Umatilla, Nez Perce, Yakama and Warm Springs tribes, which granted those Native American tribes permanent fishing rights in their reservations and accustomed lands; the latter collectively represent approximately a quarter of the basin's area.
2.2 The Columbia River estuary
The Columbia River estuary serves as a biogeochemical buffer between river and ocean and offers important transitional habitat for fish and birds. Seasonal upwelling (Fig. 2, second panel) from the California Current influences estuarine conditions through tidal exchange, and results in the transport of deep-ocean waters to the estuary during the summer months. These waters are relatively rich in nutrients (
Roegner et al., 2011a), low in oxygen (
Roegner et al., 2011b; also, bottom panel of Fig. 2) and pH, and high in carbon dioxide. The estuary is bounded upstream by Bonneville Dam on the Columbia River (~160 km upstream of the mouth) and by Willamette Falls on the Willamette River, a major tributary. Freshwater loads of sediments and nutrients are largest from late autumn through early spring (Fig. 2, third and fourth panels).
The influence of the estuary extends along the continental shelf through a plume that reaches north to British Columbia or south to California, depending on the prevailing winds (
Barnes et al., 1972;
Liu et al., 2009;
Burla et al., 2010a;
Hickey et al., 2010). By contrast, the
chemical estuary, defined by the presence of salinity upstream of the mouth, is limited to ~45 km (
Chawla et al., 2008), with actual extent dependent primarily on river flow and tidal conditions.
It is common to refer to the Columbia River estuary as a river-dominated mesotidal estuary. More specifically, the chemical estuary operates across four physical estuarine regimes: salt wedge, time-dependent salt wedge, highly stratified and partially mixed. Each regime corresponds to a specific pairing of river flow and tidal range, and has distinctive stratification and mixing characteristics.
Generally shallow, the chemical estuary is cut deeply by two major channels (Fig. 1): the South Channel, which is maintained for deep-draft navigation and the North Channel, which is unmanaged. Both channels are ecologically important, for instance as conduits of ocean influences into the estuary (
Roegner et al., 2011a,
b), as migration corridors for juvenile Pacific salmon (
Bottom et al., 2005) and as loci for seasonal
Mesodinium spp. blooms (
Herfort et al., 2011a). However, each channel is dynamically distinct: the South Channel has stronger river outflow than the North Channel, less tidal transport and a less well-developed salt wedge (
Chawla et al., 2008).
Also of ecological significance are four lateral bays, two of which are brackish (Baker Bay and Youngs Bay) and the other two (Cathlamet Bay and Grays Bay) primarily freshwater. Ecosystem services provided by these bays are diverse, ranging from transitional habitat for out-migrant juvenile salmon (e.g., Cathlamet Bay;
Bottom et al., 2005) to seeding seasonal
Mesodinium spp. blooms (e.g., Baker Bay;
Herfort et al., 2011a).
Like other estuaries, the Columbia River estuary reduces fluxes of natural and anthropogenic materials to the coastal ocean, thus providing an ecosystem service and functioning as a natural bioreactor. We use the term “bioreactor” (rather than the more commonly used terms “filter” or “buffer”) to evoke the combination of active microbial, biogeochemical and ecological processes that result in the biogeochemical transformation of organic and inorganic materials. The term bioreactor is also a useful construct to encapsulate material and energy flows through the ecosystem, ultimately leading to food web structure and defining water quality and ecosystem health.
Because of the estuary’s short residence times, many of the biogeochemical transformations in the bioreactor are hypothesized to occur in association with biological hotspots that extend water or particle residence time. These hotspots specifically include estuarine turbidity maxima, seasonal Mesodinium spp. blooms and intertidal zones in lateral bays. The dynamics and function of the bioreactor and its hotspots are the focus of the research of the Center for Coastal Margin Observation & Prediction (CMOP), a multi-institutional Science and Technology Center funded by the U.S. National Science Foundation.
3 Columbia River collaboratory
3.1 Concept
Many issues of societal importance, from fisheries to shipping and power generation, require the study of estuaries. The knowledge base needed to address these issues is enabled or enhanced by high-resolution, long-term time series of observations, especially when coupled with computational models that extend the range of observations. The term estuarine collaboratory captures the open and collaborative infrastructure that allows the knowledge base for an estuary to grow from study to study and application to application.
We view an estuarine collaboratory as a networked integration of sensors and platforms, models, and information flows, designed to enable diverse communities of practice to interact productively with reduced geographic, disciplinary or institutional barriers. The term
communities of practice refers to groups of people who share a concern or interest (such as the environmentally sound operation of the Federal Columbia River Power System), and who deepen their knowledge and expertise (e.g., ability to reach consensus on controversial issues at the power-system–salmon-conservation interface) by interacting and sharing information on a sustained basis (
Lave and Wenger, 1991;
Wenger, 1998).
SATURN is an implementation of the collaboratory concept (
Baptista et al., 2008;
Baptista, 2015). It is designed to support interdisciplinary research in the Columbia River estuary, as well as to support regional decisions on the appropriate balance for the use of the estuary and river for hydropower generation, flood protection, navigation, and nursery and migration habitat for salmon. Of particular interest is the characterization of physical and biogeochemical processes that underlie the function of the estuary, and that illuminate the estuary’s variability and susceptibility to changes in global climate and in the operation of the Federal Columbia River Power System and the Columbia-Snake River System.
The successful implementation of SATURN is predicated on the open flow of information, which requires a supporting cyber-infrastructure. It is also predicated on the availability and integration of observations and simulations, which requires an observation network and a modeling system. These three operational components of SATURN are the focus of this paper.
3.2 SATURN observation network
Starting in 1996, the development of the observation network was initially driven by the need to calibrate and validate a modeling system for estuarine circulation (CORIE;
Baptista et al., 1999,
2008;
Baptista, 2006). The network consisted of several real-time stations measuring physical variables: salinity, temperature, water levels and, at a few stations, velocity profiles. Driven by the CMOP focus on the Columbia River estuary as a river-dominated estuarine bioreactor, the SATURN network underwent fundamental changes after 2008 (Fig. 3). It now encompasses both endurance stations and a pioneer array (both defined and described below), and has an interdisciplinary focus with a spectrum of physical and biogeochemical sensors deployed.
3.2.1 Real-time endurance stations
Endurance stations provide long-term, real-time, high-resolution time series at fixed locations. Most stations are concentrated in the chemical estuary, but some are in the tidal freshwater, the near-field plume or the continental shelf (Fig. 1). In the estuary, stations are located primarily along the main channels, to characterize large-scale patterns of estuarine flow and transport. These locations were chosen based on an empirical understanding of estuarine dynamics, and re-affirmed through a formal network optimization study (
Frolov et al., 2008). Selected estuarine stations are also located in ecologically important lateral bays: Cathlamet Bay, Youngs Bay, and Baker Bay, each exposed to distinct ocean influences.
Some of the SATURN stations (represented by dark cyan triangles in Fig.1) remain strictly physical in terms of their sensor suites, typically with single-depth measurements of salinity and temperature. They constitute a useful reference for calibration and validation of circulation models (e.g.,
Baptista et al., 2005). However, nine stations now also have interdisciplinary sensor packages, and form the core of the network. These stations are named SATURN-
nn, with
nn assigned sequentially in order of the date of initiation of their interdisciplinary data streams. Some SATURN-
nn stations (e.g., SATURN-01, -03 and -04) have replaced collocated (or nearly collocated) physical stations, in which case relevant legacy physical data from before the inception of the interdisciplinary station are available.
Most interdisciplinary stations are distinctive in design (Fig. 4) and purpose. From the perspective of design, three stations are particularly noteworthy: SATURN-01 is a vertical profiler, while SATURN-03 and -04 are field laboratories, each with a multi-depth in-water pumping system. SATURN-01 (unique in its ability to capture the vertical structure, Fig. 5) and SATURN-03 characterize transport and mixing in the North and South channels of the chemical estuary, respectively. SATURN-05, -06, and -08 characterize river inputs (e.g., Fig. 2, third and fourth panels). SATURN-04, -09, and -07 characterize exchanges between the estuary and lateral bays with progressively greater ocean influence (Cathlamet Bay, Youngs Bay, and Baker Bay, respectively). SATURN-02 helps characterize estuary-shelf exchanges (e.g., Fig. 2, bottom panel).
Sensor composition varies per station, as shown in Fig. 6. However, all SATURN-
nn stations measure the same set of baseline variables: temperature, salinity (derived from conductivity), dissolved oxygen, turbidity or backscatter, chlorophyll
a fluorescence, colored dissolved organic matter (CDOM) and, at some sites, the fluorescence of the algal pigment phycoerythrin. Other parameters measured with regularity at some of these stations include velocity profiles (at SATURN-01, -02, -03, and -04), nitrate (at SATURN-01, -02, -03, -04, -05, and -08), pH and pCO2 (at SATURN-03 and -04), atmospheric variables (air temperature, solar radiation, and wind speed and direction at SATURN-02 and at a physical station, Desdemona Sands) and fluorescence quantum yield (at SATURN-01 and -03). Episodically, specialty instrumentation is also deployed at the SATURN-03 and -04 “field laboratories,” benefiting from physical and biogeochemical contextualization by the endurance sensors available at those stations. Examples of such specialty instrumentation are the Environmental Sample Processor (ESP;
Scholin, 2013), a technology developed at the Monterey Bay Aquarium Research Institute that we use for adaptive, autonomous sampling of microbial communities (Herfort et al., in press); and the SeaFlow (
Swalwell et al., 2011), a flow cytometer developed at the University of Washington that allows continuous real-time observations of small phytoplankton populations.
The geographical extent of the river-to-ocean system leads to significant logistical challenges in the implementation of the observatory. The endurance stations in the chemical estuary and continental shelf (SATURN-01, -02, -03, -04, -07, and -09) are maintained by an operational field team based at a field station in Astoria, Oregon, near the mouth of the estuary. A Portland-based research team maintains SATURN-05 and -08. SATURN-06 was developed from its inception in collaboration with the United States Geological Survey (USGS), which has since fully assumed responsibility for the station.
The number of physical stations has been reduced since 2008 (Fig. 3), in part because some were transformed into interdisciplinary stations, but also because some (light cyan triangles in Fig. 1) were discontinued. The design of the network calls for Acoustic Doppler Profilers (ADPs) to support physical oceanography and circulation modeling studies, and to help characterize material fluxes. ADPs were first introduced in the network in late 1998, and have been progressively restored after a 2008‒2011 hiatus (Fig. 3). By post-2011 design, ADP locations help characterize fluxes at the North (SATURN-01) and South (SATURN-03) channels, at the estuary-plume interface (SATURN-02) and between the main estuary and a freshwater lateral bay (SATURN-04).
3.2.2 Pioneer array
Pioneer array is a term borrowed from the Ocean Observatories Initiative (
NSF, 2005) and is used here to refer to assets that can be deployed on-demand to add spatial scope or spatial resolution to the endurance network, for limited time periods. The SATURN pioneer array consists of manned and unmanned mobile platforms (some of which have real-time telemetry) and re-deployable bottom nodes. Assets can be deployed in isolation, or in coordination within a scientifically targeted field campaign. Pioneer array assets are typically deployed with the benefit of information from endurance stations and operational models.
Unmanned mobile platforms include Slocum gliders and Remus-100 autonomous underwater vehicles (AUVs). Seasonal glider deployments (typically May–September) conduct broad sweeps of the Washington continental shelf to characterize waters that serve as ocean sources for the Columbia River estuary during the upwelling season (Hickey et al., 2010; Roegner et al., 2011a, b). Gliders are equipped with temperature, salinity/conductivity, dissolved oxygen, optical backscatter, chlorophyll a and CDOM sensors. Figure 7 illustrates a typical deployment. The coverage area is north of the Columbia River, roughly from Grays Harbor to Quinault, and the deployment typically follows a radiator pattern or a variant thereof. The assessment of hypoxic conditions on the shelf, illustrated in the figure, is an example of the larger-scale information needed to interpret estuarine data such as the oxygen saturation shown in the bottom panel of Fig. 2 (for SATURN-03) and in the top-middle panel of Fig. 5 (for SATURN-01). Glider data are also useful in support of modeling efforts, for instance by helping define ocean boundary conditions to drive estuarine biogeochemical simulations.
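For orientation, the sketch below generates a simple radiator ("lawnmower") waypoint sequence over a rectangular coverage box; the coordinates, line count and function name are illustrative and are not part of the actual glider planning software.

```python
# Minimal sketch: waypoints for a radiator ("lawnmower") glider survey.
# The bounding box and number of lines are illustrative only.
def radiator_waypoints(lon_min, lon_max, lat_min, lat_max, n_lines):
    """Return (lon, lat) waypoints tracing alternating east-west survey lines
    (n_lines >= 2), spaced evenly from south to north."""
    waypoints = []
    for i in range(n_lines):
        lat = lat_min + i * (lat_max - lat_min) / (n_lines - 1)
        line = [(lon_min, lat), (lon_max, lat)]
        if i % 2 == 1:          # reverse every other line to avoid dead-heading
            line.reverse()
        waypoints.extend(line)
    return waypoints

# Example: a hypothetical box off the Washington shelf, six cross-shelf lines
for lon, lat in radiator_waypoints(-124.9, -124.2, 46.9, 47.4, 6):
    print(f"{lat:.3f} N, {abs(lon):.3f} W")
```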
Two AUVs, often deployed in tandem, are used for process studies—typically in the North Channel of the estuary and occasionally across the mouth and in the near-plume. On-board sensors measure temperature, salinity/conductivity, depth, dissolved oxygen, optical backscatter at two wavelengths (700 nm and 880 nm), fluorescence of pigments (chlorophyll
a, phycoerythrin, and phycocyanin), CDOM, currents and bathymetry. AUV data have been instrumental in providing insight into biological hotspots such as the estuarine turbidity maxima and the
Mesodinium spp. blooms. AUV data have also contributed to a stringent benchmark for the CMOP modeling system, which has resulted in novel assessments (e.g., Fig. 8) and substantial improvements of modeling skill (
Kärnä et al., 2015). In turn, AUV missions are planned and interpreted with the benefit of the CMOP numerical models, by exploring
in silico (not shown) alternative paths of the AUVs through forecasted fields of water velocity, density, and turbidity.
Multiple manned platforms have been integral to the SATURN pioneer array:
• The R/V
CORIE, a 20 ft (6 m) rigid-hull inflatable boat that is the workhorse for the maintenance of the in-estuary endurance stations, has been used for specialized data collection (
Herfort et al., 2011b,
2012) and for AUV deployments.
• The 50 ft (15 m) training vessel M/V
Forerunner, which is owned and operated by Clatsop Community College (CCC) for mariner training programs, has been used for both serendipitous data collection and targeted campaigns. CMOP installed a flow-through system (with salinity, temperature, chlorophyll a, and turbidity sensors) and a downward-looking ADP, which perform automated data collection during CCC classes, station deployments, and other periods of serendipitous vessel operation. In addition, the M/V
Forerunner is used by CMOP and other scientists for short-term field campaigns (
Roegner et al., 2011a,
b;
Bräuer et al., 2011,
2013;
Peterson et al., 2013;
Kahn et al., 2014) and calibration sampling near endurance stations.
• Multiple UNOLS research vessels—65 ft (20 m) R/V
Clifford Barnes, 135 ft (41 m) R/V
Point Sur, 143 ft (44 m) R/V
Horizon, and 177 ft (53 m) R/V
Wecoma and R/V
Oceanus—have been used in CMOP campaigns across the river-to-shelf continuum (
Smith et al., 2010,
2012,
2013;
Anderson et al., 2011;
Bräuer et al., 2011,
2013;
Fortunato and Crump, 2011;
Herfort et al., 2011b,
2012;
DeLorenzo et al., 2012;
Fortunato et al., 2012,
2013;
Maier et al., 2012;
Durkin et al., 2013;
Evans et al., 2013;
Gilbert et al., 2013;
Peterson et al., 2013;
Kahn et al., 2014). For campaign details, see stccmop.org/research/cruise.
• A kayak, instrumented with the same baseline sensors as the SATURN-
nn endurance stations (
Rathmell et al., 2013), was deployed multiple times in Baker Bay in 2012 and 2013, to capture spatial variability in that shallow lateral bay. For data, see stccmop.org/datamart/observation_network/kiviuq.
Bottom nodes, instrumented with at least an upward looking ADP and a conductivity-temperature-depth (CTD) sensor, have been deployed for weeks to months, when temporal detail is temporarily needed at locations not occupied by endurance stations. Typically, bottom node deployments have been coordinated with AUV deployments or with broader field campaigns, especially in the North Channel and in Cathlamet Bay.
Vessel-based field campaigns have been conducted both in exploratory mode (e.g., recurrent baseline sampling to explore microbial population dynamics (
Smith et al., 2010;
Fortunato and Crump, 2011;
Fortunato et al., 2012,
2013)) and targeted at understanding specific processes or balances, such as in the August 2007 Barnes cruise focused on estuarine turbidity maxima (
Bräuer et al., 2011,
2013;
Herfort et al., 2011c). Vessel operations have been coordinated, as appropriate and feasible, with other assets of the observation network (e.g., Fig. 9). Importantly, the design, implementation and interpretation of field campaigns have benefited from the spatial and temporal context provided by the endurance stations and by the products of the modeling system (e.g., short-term forecasts, simulation climatologies, or targeted simulation hindcasts). The importance of this context cannot be overstated: climatological information from simulation databases has transformed campaign design and post-campaign interpretation, and real-time and predictive shipboard information has transformed implementation. For instance, CMOP scientists addressing hypotheses related to dynamic features such as estuarine turbidity maxima or salinity intrusion are able to remove (within model and observational uncertainty) much of the guesswork involved in platform placement for data collection.
We have also conducted land-based campaigns in which SATURN-03 or SATURN-04 function as field laboratories. In these campaigns, water samples are collected or specialty instruments are deployed inside the stations, for short periods of time; data acquired through the samples and specialty instruments automatically benefit from contextualization by synoptic high-resolution time series of baseline variables, which are routinely generated by the station sensors. Of particular note are adaptive sampling strategies targeting transient estuarine events for microbial RNA and DNA analysis, triggered by turbidity or other variables measured at SATURN-03 (Herfort et al., in press). Land-based and vessel campaigns have been cross-coordinated when appropriate. As an example, the contribution of Cathlamet Bay to estuarine nutrient balances was quantified through a combination (Fig. 9, bottom panel) of vessel transects, bottom nodes and SATURN-04 operated as a field laboratory. Specifically, a high-resolution Fast Methane Analyzer (from Los Gatos Research) was operated from the station (Fig. 10) and water samples for biogeochemical and microbiological analysis were extracted, all contextualized by endurance sensors.
3.3 SATURN modeling system
The SATURN modeling system (henceforth Virtual Columbia River, Fig. 11) is designed to create a progressively more comprehensive and skilled multi-scale description of the estuary and associated tidal freshwater and continental shelf plume. The Virtual Columbia River has helped integrate and expand understanding of how the contemporary chemical estuary functions as a dynamic ecosystem. Furthermore, it has contributed a historical perspective on past evolution of the estuary and provided assessments of future conditions under alternative scenarios of change in global climate and in regional management. Besides being a research tool, the Virtual Columbia River is also an important science-translation tool, having been used over the years to support multiple management decisions by diverse stakeholders (U.S.
Entity, 2013;
Seaton et al., 2014).
The Virtual Columbia River is anchored on high-resolution circulation simulations (
Baptista et al., 2005;
Burla et al., 2010a;
Kärnä et al., 2015), upon which operational products are created and complementary (sediment dynamics and biogeochemical) models are built. The infrastructure for the circulation simulations integrates models, bathymetry, grids, forcing, and skill assessment strategies. Circulation simulations are conducted with 3D baroclinic circulation codes, currently SELFE (
Zhang and Baptista, 2008) and in the past ELCIRC (
Zhang et al., 2004) and others. SELFE solves the 3D baroclinic shallow-water equations, typically with the hydrostatic approximation (a non-hydrostatic option is also available). The primary variables are free-surface elevation, velocities, salinity and temperature. Triangle-based unstructured grids are used in the horizontal direction and hybrid vertical coordinates (a combination of terrain-following
S coordinates and
Z coordinates) are used in the vertical direction.
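In schematic form, the equations solved are of the standard hydrostatic, Boussinesq free-surface type; the simplified statement below omits horizontal viscosity and diffusion, atmospheric pressure and tidal potential terms (see Zhang and Baptista, 2008, for the complete formulation):

```latex
% Schematic (simplified) hydrostatic, Boussinesq free-surface equations
\begin{aligned}
\frac{\partial \eta}{\partial t} + \nabla \cdot \int_{-h}^{\eta} \mathbf{u}\,\mathrm{d}z &= 0
  && \text{(free-surface continuity)}\\
\frac{D\mathbf{u}}{Dt} + f\,\mathbf{k}\times\mathbf{u} &=
  -g\nabla\eta \;-\; \frac{g}{\rho_0}\nabla\!\int_{z}^{\eta} \bigl(\rho-\rho_0\bigr)\,\mathrm{d}\zeta
  \;+\; \frac{\partial}{\partial z}\!\Bigl(\nu\,\frac{\partial \mathbf{u}}{\partial z}\Bigr)
  && \text{(horizontal momentum)}\\
\frac{DS}{Dt} &= \frac{\partial}{\partial z}\!\Bigl(\kappa\,\frac{\partial S}{\partial z}\Bigr), \qquad
\frac{DT}{Dt} = \frac{\partial}{\partial z}\!\Bigl(\kappa\,\frac{\partial T}{\partial z}\Bigr)
  && \text{(salt and heat transport)}
\end{aligned}
```

Here η is the free-surface elevation, u the horizontal velocity, S and T salinity and temperature, ρ(S, T) the density from an equation of state, f the Coriolis parameter, and ν and κ the vertical eddy viscosity and diffusivity supplied by a turbulence closure; the vertical velocity is diagnosed from 3D continuity.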
Multiple modeling domains have been used. Typical computational grids for circulation simulations now extend from the Bonneville Dam and Willamette Falls through a lengthy tidal freshwater region and a comparatively compressed estuary into the continental shelves of Oregon and Washington (Fig. 12). The grids have higher resolution in the estuary than in the continental shelf, and more recent generations of grids tend to have overall higher resolution than older generations. Higher resolution has become possible because of advances in the efficiency and parallelization of SELFE (Lopez and Brown, 2014) and of improved access to supercomputer centers.
Key operational products are daily forecasts (e.g., Fig. 13) and multi-year simulation databases of circulation (
Baptista et al., 2005;
Burla et al., 2010a;
Kärnä et al., 2015), the latter forming the basis for a Climatological Atlas for the estuary and plume (Fig. 14). Multiple generations of forecasts and databases are numbered sequentially, and preceded by the distinguishing letters F (for forecasts) and DB (for databases). Thus, in Figs. 12‒14, DB33 is a simulation database of a more recent generation than DB22, and F33 is a forecast of the same generation as simulation database DB33. The skill of all forecasts and simulation databases is routinely assessed against observations (
Baptista et al., 2005;
Burla et al., 2010a;
Kärnä et al., 2015). Observations of salinity, temperature, and velocity come primarily from the SATURN observation network and CMOP campaigns, with tidal observations coming primarily from the National Oceanic and Atmospheric Administration (NOAA).
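For illustration, the sketch below computes three point-wise metrics often used in such assessments (bias, root-mean-square error and the Willmott index of agreement); the specific metrics, station pairings and acceptance thresholds used in the cited studies may differ.

```python
import numpy as np

def skill_metrics(model, obs):
    """Bias, RMSE and Willmott (1981) index of agreement for paired series."""
    model, obs = np.asarray(model, float), np.asarray(obs, float)
    valid = ~np.isnan(model) & ~np.isnan(obs)        # ignore gaps in either series
    m, o = model[valid], obs[valid]
    bias = np.mean(m - o)
    rmse = np.sqrt(np.mean((m - o) ** 2))
    denom = np.sum((np.abs(m - o.mean()) + np.abs(o - o.mean())) ** 2)
    willmott = 1.0 - np.sum((m - o) ** 2) / denom if denom > 0 else np.nan
    return bias, rmse, willmott

# Hypothetical example: bottom salinity (psu) at an estuarine station
obs = np.array([2.1, 8.4, 17.0, 24.3, 18.2, 9.5])
mod = np.array([1.8, 9.1, 16.2, 25.0, 19.4, 8.7])
print("bias=%.2f  rmse=%.2f  skill=%.3f" % skill_metrics(mod, obs))
```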
Of relevance to regional applications, a “
salmon filter” can be applied to circulation results to create physical metrics (ranging from salinity intrusion length and plume volume to physical habitat opportunity) identified by fisheries researchers as useful to characterize contemporary variability (
Bottom et al., 2005;
Burla et al., 2010b;
Miller et al., 2013,
2014;
Burke et al., 2014) and to predict future changes in the role of the estuary, plume and shelf in the salmon lifecycle.
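As a simplified illustration of one such metric, the sketch below estimates a salinity intrusion length from an along-channel transect of modeled bottom salinity; the 1 psu threshold, array names and example values are illustrative rather than the definitions used in the cited studies.

```python
import numpy as np

def salinity_intrusion_length(distance_km, bottom_salinity, threshold=1.0):
    """Farthest along-channel distance (km from the mouth) where bottom salinity
    still exceeds a threshold; a simplified intrusion-length metric."""
    salty = np.asarray(bottom_salinity) >= threshold
    return float(np.asarray(distance_km)[salty].max()) if salty.any() else 0.0

# Hypothetical transect: distance from the mouth (km) and modeled bottom salinity (psu)
x = np.array([0, 5, 10, 15, 20, 25, 30, 35, 40])
s = np.array([32, 28, 21, 14, 8, 3, 1.2, 0.4, 0.1])
print(salinity_intrusion_length(x, s))   # -> 30.0 km for this example
```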
Critically important for CMOP research activities are emerging modeling capabilities for sediment dynamics (
Lopez et al., 2012) and biogeochemistry (
Spitz, 2011), all of which rely on SELFE for circulation and scalar transport. Model domains are typically reduced (estuary-centric) versions of the river-to-shelf domains used for circulation simulations. The sediment model, an adaptation of a previous implementation (
Pinto et al., 2012) with enhanced computational efficiency, solves for the transport of suspended sediment and bedload, and tracks morphological changes due to erosion and deposition. The model was tested against laboratory benchmarks and is being applied to study the Columbia River estuarine turbidity maxima. Calibration and validation for the field application have relied on high-resolution turbidity data from the SATURN endurance stations and CMOP AUVs, complemented by turbidity and sediment concentration data from CMOP field campaigns.
The biogeochemical models are designed to characterize nutrient cycles and to provide insights into the estuary as a river-dominated bioreactor. At the core of the models is a flexible formulation of nutrient, bacteria, phytoplankton, zooplankton, dissolved organic matter and detritus that permits the calculation of dissolved oxygen and will ultimately address carbonate chemistry parameters. We started from an existing open-ocean conceptual model (
Spitz et al., 2001), to which we introduced several adaptations customized to the nutrient cycles in the estuary, and designed to account for the influences of strong estuarine salinity gradients on marine and freshwater phytoplankton populations. The nutrient pool includes nitrogen and carbon. The water column and the exchanges at the water-air and water-benthos interfaces are included. Light attenuation is accounted for via pre-defined water types. Blooms of
Mesodinium spp. are accounted for in a special module.
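For orientation only, a generic NPZD-type skeleton of the kind such formulations extend is sketched below; it is not the CMOP model, which additionally carries bacteria, dissolved organic matter, dissolved oxygen, carbon and the Mesodinium module noted above.

```latex
% Generic NPZD skeleton (source/sink terms only; advection-diffusion handled by SELFE).
% mu: light/nutrient-limited growth; g: grazing; gamma: assimilation efficiency;
% m_P, m_Z: mortalities; r: remineralization; w_s: detrital sinking speed.
\begin{aligned}
\frac{\partial P}{\partial t} &= \mu(N,I)\,P \;-\; g(P)\,Z \;-\; m_P P\\
\frac{\partial Z}{\partial t} &= \gamma\,g(P)\,Z \;-\; m_Z Z\\
\frac{\partial D}{\partial t} &= (1-\gamma)\,g(P)\,Z \;+\; m_P P \;+\; m_Z Z \;-\; r\,D \;-\; w_s\frac{\partial D}{\partial z}\\
\frac{\partial N}{\partial t} &= -\,\mu(N,I)\,P \;+\; r\,D
\end{aligned}
```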
After an extended period of development and assessment against interdisciplinary data from the SATURN endurance stations and CMOP cruises, the biogeochemical models are being applied to characterize Net Ecosystem Metabolism (production minus respiration) in the Columbia River. The objective is to elucidate biogeochemical processes, transformations and fluxes occurring in and across the biological hotspots targeted by CMOP science: Mesodinium spp. blooms, estuarine turbidity maxima, and lateral bays.
3.4 Cyber-infrastructure
The primary role of the SATURN cyber-infrastructure is to enable the free and timely flow of information among diverse producers and consumers, while also adding value to the information when appropriate (
Baptista et al., 2008). There are three major information producers in the SATURN collaboratory, all of which are also (to varying degrees) information consumers: (a) sensors and platforms of the endurance stations and pioneer array; (b) model simulations; and (c) communities of practice ranging from scientists to educators, industry, emergency responders, regional managers and decision makers.
Examples of products and tools that add value to information include the Climatological Atlas, the Data Explorer and Data Near Here (see also stccmop.org/datamart/data_tools). The Climatological Atlas, based primarily on the Virtual Columbia River, offers insights into multiple scales of variability of the contemporary system, via statistics of various estuarine metrics and river and ocean forcing (e.g., Fig. 14). Data Near Here (
Megler and Maier, 2013; also, Fig. 15) is a ranked-search engine designed to locate relevant CMOP datasets based on position, depth, time, variables and variable values. Increasingly sophisticated versions of this tool have been tested and deployed. The Data Explorer is a web-based tool for access, exploratory analysis and contextualization of SATURN observations (e.g., Fig. 16), which gives users the ability to annotate data and share analyses.
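To convey the flavor of a ranked spatiotemporal search, the toy sketch below scores hypothetical dataset summaries against a query by their overlap in space, time and variables; it is not the Data Near Here implementation described by Megler and Maier (2013), and all names and values are illustrative.

```python
from dataclasses import dataclass

@dataclass
class DatasetSummary:
    """Hypothetical per-dataset metadata summary: bounding intervals per dimension."""
    name: str
    lat: tuple       # (min, max)
    lon: tuple       # (min, max)
    time: tuple      # (start, end), as year fractions for simplicity
    variables: set

def overlap(query_interval, ds_interval):
    """Fraction of the query interval covered by the dataset interval (0..1)."""
    lo = max(query_interval[0], ds_interval[0])
    hi = min(query_interval[1], ds_interval[1])
    width = query_interval[1] - query_interval[0]
    return max(0.0, hi - lo) / width if width > 0 else 0.0

def score(query, ds):
    """Toy relevance score: sum of per-dimension overlaps plus variable match."""
    s = overlap(query["lat"], ds.lat) + overlap(query["lon"], ds.lon) \
        + overlap(query["time"], ds.time)
    s += len(query["variables"] & ds.variables) / max(1, len(query["variables"]))
    return s

datasets = [
    DatasetSummary("saturn03_2012", (46.18, 46.19), (-123.88, -123.87),
                   (2012.0, 2013.0), {"salinity", "oxygen", "nitrate"}),
    DatasetSummary("glider_shelf_2012", (46.5, 47.4), (-125.0, -124.1),
                   (2012.4, 2012.7), {"salinity", "oxygen", "chlorophyll"}),
]
query = {"lat": (46.0, 47.0), "lon": (-124.2, -123.8),
         "time": (2012.3, 2012.8), "variables": {"oxygen", "salinity"}}
for ds in sorted(datasets, key=lambda d: score(query, d), reverse=True):
    print(f"{score(query, ds):.2f}  {ds.name}")
```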
Underlying all products and tools are managed information flows (Fig. 17). Once the data is standardized and accessible, we apply several levels of quality assurance, creating multiple versions of the data. All quality levels are stored and made available. This approach allows researchers to see the effects of quality assurance procedures and apply alternative filters if desired; also, data can be reprocessed if better quality protocols are developed. The approach is implemented through a combination of file transfer and archiving, storage of data, creation of metadata and quality assurance metadata in a relational database, and generation of multiple iterations of a NetCDF archive.
As an example, consider the real-time flow of data from a sensor at SATURN-03. A computer at the station receives data from the instrumentation in real time through a serial port (SP) reader. The SP reader writes each raw data line from the sensor to a file, preceded by a time stamp line. Each data line is initially processed (a) to assign depth, instrument ID and time information; (b) for some instruments, to convert engineering units into scientific units; and (c) to write the data to a file. The raw (RV0) and initially processed (RV1) files are transferred from the station computer to a central CMOP computer network in Portland using the rsync utility (with the first link being over an 802.11 wireless network, using the UNOLS Shipboard Wireless Access Protocol, SWAP; see siomail.ucsd.edu/mailman/listinfo/swap).
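A minimal sketch of this station-side acquisition step is shown below, assuming the pyserial package; the port name, file names, depth, instrument ID and unit-conversion placeholder are illustrative and do not reproduce the actual station software.

```python
import time
import serial  # pyserial; the port name below is illustrative

def to_scientific_units(raw_line):
    """Hypothetical placeholder for instrument-specific unit conversion."""
    return raw_line  # real code would apply calibration coefficients here

port = serial.Serial("/dev/ttyUSB0", baudrate=9600, timeout=5)
with open("saturn03.rv0", "a") as rv0, open("saturn03.rv1", "a") as rv1:
    while True:
        raw = port.readline().decode("ascii", errors="replace").strip()
        if not raw:
            continue
        stamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
        # RV0: raw line preceded by a time-stamp line, as received from the sensor
        rv0.write(stamp + "\n" + raw + "\n")
        # RV1: time, depth and instrument ID prepended; units converted where applicable
        rv1.write(f"{stamp},depth=8.2m,inst=CT-01," + to_scientific_units(raw) + "\n")
        rv0.flush(); rv1.flush()
```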
As the RV0 file is updated in Portland, a second processing program reads the new records and regenerates the RV1 data lines, which are then entered into a station-specific table in a PostgreSQL database. A third processing program reads new lines from the database and further processes them into instrument-specific tables. For some data sets, where data from multiple instruments are required to generate scientifically meaningful results (e.g., dissolved oxygen, for which salinity and temperature from a conductivity-temperature sensor are needed to convert dissolved oxygen voltage into concentration), additional programs pull data from the instrument-specific tables and load the derived results into further instrument-specific tables.
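A sketch of one such derived-variable step is shown below, using psycopg2; the table and column names, connection string and calibration function are hypothetical stand-ins for the station-specific schema and the vendor calibration equation.

```python
import psycopg2

def oxygen_concentration(voltage, salinity, temperature):
    """Hypothetical placeholder for the vendor calibration equation, which converts
    sensor voltage to concentration using co-located salinity and temperature."""
    return 2.5 * voltage * (1.0 - 0.01 * salinity) * (1.0 - 0.02 * (temperature - 10.0))

conn = psycopg2.connect("dbname=saturn")  # connection string is illustrative
cur = conn.cursor()
# Join the oxygen-voltage records with the co-located CT records (hypothetical schema)
cur.execute("""
    SELECT o.obs_time, o.voltage, ct.salinity, ct.temperature
    FROM saturn03_oxygen_raw o
    JOIN saturn03_ct ct ON ct.obs_time = o.obs_time
    WHERE o.processed = FALSE
""")
for obs_time, voltage, sal, temp in cur.fetchall():
    cur.execute(
        "INSERT INTO saturn03_oxygen (obs_time, concentration) VALUES (%s, %s)",
        (obs_time, oxygen_concentration(voltage, sal, temp)),
    )
conn.commit()
```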
A series of metadata tables describes which instruments are collecting data at SATURN-03 at any given time. These metadata are used by an additional script to generate NetCDF files containing all received data that were successfully parsed; the files are publicly served via THREDDS and labeled raw data (PD0). For some variables, real-time quality control generates quality flags, which are stored in the database, and NetCDF files are generated from which bad or suspect data are excluded. These files are served via THREDDS and labeled preliminary data (PD1).
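The sketch below illustrates the PD1 generation step with the netCDF4 library; the variable names, flag convention, values and file name are illustrative.

```python
import numpy as np
from datetime import datetime, timedelta
from netCDF4 import Dataset, date2num

# Hypothetical parsed records and real-time QC flags (1 = good, 4 = bad/suspect)
times = [datetime(2014, 7, 1) + timedelta(minutes=6 * i) for i in range(5)]
salinity = np.array([12.3, 12.4, 99.9, 12.6, 12.5])
flags = np.array([1, 1, 4, 1, 1])

with Dataset("saturn03_pd1_201407.nc", "w") as nc:
    nc.createDimension("time", None)
    t = nc.createVariable("time", "f8", ("time",))
    t.units = "seconds since 1970-01-01 00:00:00"
    s = nc.createVariable("salinity", "f4", ("time",), fill_value=np.float32(-999.0))
    s.units = "psu"
    t[:] = date2num(times, t.units)
    s[:] = np.where(flags == 1, salinity, -999.0)   # exclude bad/suspect values in PD1
```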
The PD0 and near real-time PD1 data are ultimately further quality controlled. The quality-control process includes a visual inspection of the PD0 or PD1 data and a review of the data within a historical context as well as in relation to data from other SATURN stations. Results from pre- and post-deployment checks and sensor-specific quality assurance protocols are used to evaluate sensor calibration stability and drift. At some stations, sensor performance and stability are monitored using on-station weekly measurements of aerated DI (deionized water) and near-station CTD casts made with the M/V Forerunner. Additional quality control processes may include corrections for sensor artifacts, identification of periods of fouling and corrections for sensor drift. A final quality level (on a scale of 1 [excellent] to 5 [bad]) is assigned to the data, and metadata are generated to detail the data quality determination and any corrections applied to the data.
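The sketch below illustrates the kind of automated checks that can feed such a quality determination; the thresholds and flag assignments are illustrative, whereas the actual SATURN protocols are sensor-specific and include fouling and drift assessments.

```python
import numpy as np

def quality_flags(values, valid_range=(0.0, 35.0), spike_limit=5.0):
    """Assign per-sample flags on a 1 (excellent) .. 5 (bad) scale using simple
    range and spike tests; real protocols add fouling and drift corrections."""
    v = np.asarray(values, float)
    flags = np.ones(v.size, dtype=int)
    flags[np.isnan(v)] = 5
    flags[(v < valid_range[0]) | (v > valid_range[1])] = 5    # gross range test
    jump = np.abs(np.diff(v, prepend=v[0]))
    flags[(jump > spike_limit) & (flags == 1)] = 3            # suspected spike
    return flags

salinity = [12.2, 12.4, 31.0, 12.5, np.nan, 40.2]
print(quality_flags(salinity))   # -> [1 1 3 3 5 5]
```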
After full quality control is completed, a third set of NetCDF files is generated and served via THREDDS, labeled verified data (PD2) (e.g., Fig. 18). The PD2 files contain the raw data values, final data values that may have been adjusted during quality control, and data quality flags. For the THREDDS service, SATURN-03 data are divided into monthly files.
Once sensor data from SATURN-03 are in the database and NetCDF files, they can be viewed on the SATURN-03 station page, located with Data Near Here, searched and manipulated in Data Explorer, or downloaded in several different formats.
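For example, a monthly PD2 file can be read remotely through the THREDDS OPeNDAP service with a few lines of Python; the URL and variable names below are hypothetical, with actual endpoints listed on the station pages.

```python
import xarray as xr

# Hypothetical OPeNDAP endpoint for a monthly SATURN-03 verified (PD2) file
url = "http://example-thredds.stccmop.org/dodsC/saturn03/saturn03_pd2_201407.nc"

ds = xr.open_dataset(url)                                  # lazily opens the remote dataset
good = ds["salinity"].where(ds["salinity_flag"] <= 2)      # keep only high-quality values
print(good.mean().values)                                  # monthly mean, quality-screened
```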
4 Lessons learned
It is challenging to implement and maintain an infrastructure such as SATURN. While a comprehensive discussion is beyond the scope of this paper, Figure 3 illustrates some basic challenges for the observation network. First, we note that there are extensive data gaps in the endurance stations and that, post-2008, some of the original physical stations were discontinued. The gaps reflect in part the often-harsh environmental conditions in the Columbia River coastal margin. But data gaps and discontinuations also reflect the precarious balance among available resources, long-term goals and short-term priorities.
Figure 3 also shows that the inter-annual variability of ocean forcing (illustrated by the El Niño-Southern Oscillation) and river forcing (illustrated by discharge at Bonneville Dam) occurs at scales that require longer time series than are currently available, both for physical and especially for biogeochemical variables. For instance, no large El Niño has yet been observed with the interdisciplinary endurance stations. The time series are not yet long enough to capture climate scales of importance for estuaries, because endurance observation networks are a relatively recent concept for these ecosystems. Moreover, funding mechanisms to maintain endurance time series on a permanent basis are insufficiently established, even if the U.S. Integrated Ocean Observing System represents a step in the right direction.
In spite of these and other challenges, collaboratories such as SATURN can transform the ability to conduct hypothesis-driven science in estuaries, because the testing of hypotheses can be conducted faster, at a reduced cost and more thoroughly by leveraging the (sensed and modeled) data-rich environments that these infrastructures create. The power of collaboratories is exemplified by modern field campaigns. For instance, it is now possible with the benefit of skill-assessed modeling products to plan campaigns in silico prior to the deployment of any mobile platforms (e.g., vessels or AUVs) in the field. This capability increases the likelihood of being able to place sensors where and when needed to capture features as complex and transient as plume fronts and estuarine turbidity maxima. This capability also improves the safety of high-risk deployments, such as AUV missions in the highly energetic North Channel or across the mouth of the Columbia River. In another example, it is also possible to conduct land-based campaigns where specialized adaptive sampling (e.g., ESP sampling for RNA and DNA analyses) is triggered by features (e.g., estuarine turbidity maxima) or thresholds (e.g., oxygen-saturation levels) detected by co-located sensors or predicted by coordinated models.
The extensive data sets produced by collaboratories also open the door to
exploratory or
discovery science (Hey et al., 2009) as a legitimate and effective manner to advance understanding of estuaries. For instance, multi-year simulations of circulation are fertile grounds for exploring seasonal and inter-annual patterns and trends of change, not only for circulation (e.g., characterization of estuarine regimes), but also for circulation-dependent ecosystem features, such as estuarine habitat and ocean-entry conditions of juvenile salmon (
Burla, 2009;
Burla et al., 2010b;
Miller et al., 2013,
2014). It is also possible to conduct low-cost remote research or education. As an example, it is possible for someone in, say, China, to conduct exploratory research on the Columbia River estuary—or to introduce students to estuaries using the Columbia River as reference (
Green et al., 2013)—by assessing and contextualizing SATURN sensor data via tools such as the Data Explorer (e.g., Fig. 16).
Regardless of whether hypothesis-testing or exploratory analyses are preferred (or combined), the open information flows of collaboratories are particularly effective in supporting interdisciplinary team science, as illustrated by the ~150 multi-author peer-reviewed articles published between CMOP's inception in July 2006 and March 2015, with our major scientific synthesis still ahead. See stccmop.org/publications for a list of CMOP articles.
Collaboratories such as SATURN also offer exciting opportunities for technology assessment and development. For instance, rigorous modeling benchmarks based on SATURN observations (
Kärnä et al., 2015) have not only improved the skill of current circulation simulation databases, but have also offered guidance on priority requirements for next-generation estuarine models. In another example, the development of sophisticated estuarine biogeochemical models for the Columbia River estuary has been transformed by the availability of time series from interdisciplinary sensor packages. In addition, the SATURN infrastructure has supported the development or testing of multiple sensor technologies under challenging environmental conditions (high turbidity, fluctuating salinity, etc.).
Collaboratories are also extremely effective in helping translate science into regional decision-making and management because scientists can respond quickly to requests for applied studies, and also because regional stakeholders have knowledge and even “ownership” of the development, priorities and maintenance of collaboratories. Examples of science translation to stakeholders include the CMOP contributions to the Columbia River Channel Improvement Project (
Seaton et al., 2014) and to the Columbia River Treaty Review (U.S.
Entity, 2013). In both cases, circulation models, skill-assessed against sensor data and linked to fisheries and other ecosystem metrics, helped build regional consensus on specific issues. In the case of the Columbia River Channel Improvement Project, the relevant issue was whether to continue post-construction field observations of the potential impact of a recent deepening of the channel. In the case of the Columbia River Treaty Review, the relevant issue was to what extent the estuary was influenced by a set of scenarios of change in hydropower operations.