Introduction
Snowmelt derived from mountain headwaters provides water resources to over half of the world’s population (
Debarbieux and Price, 2012). Accordingly, the ability to measure or estimate the snowpack in mountain landscapes is critical to water resource management (
Bales et al., 2006). While the importance of information on mountain snowpack is clear, obtaining these measures for estimates remains challenging. This is in part because snow depth and snow water equivalent (SWE) are both spatially and temporally variable, and mountain regions are generally difficult to access. As a step toward addressing this challenge we evaluated methodologies to increase the efficiency of snow surveys and to enhance remotely derived estimates.
The correlation between snow depth and terrain/canopy variables has been well established (
Meiman, 1968;
Elder et al., 1991;
Balk and Elder, 2000;
Erxleben et al., 2002;
Erickson et al., 2005;
Molotch and Bales, 2005;
Hultstrand et al., 2006;
López Moreno and Nogués Bravo, 2006;
López-Moreno et al., 2011;
Sexstone and Fassnacht, 2014). Existing methods to estimate snow depths integrate digital elevation models (DEMs) and land cover (LC) data within a geographic information system (GIS) analysis framework. With these approaches, only a few snow depth measurements, often from one (
Elder et al., 2009) to three (
Molotch and Bales, 2005) points, are used to represent the area of one DEM or LC pixel (e.g., 30 m).
Clearly, this raises the question: how much variability in snow depth exists within a pixel (
López-Moreno et al., 2011)? Various assumptions (
Sturm and Wagner, 2010) and analyses (
Erickson et al., 2005;
Deems et al., 2008;
López-Moreno et al., 2015) have demonstrated that snow accumulation and distribution patterns are consistent over time for mean snow depth within a given pixel. Thus, a second question follows: is the variability of snow depth per pixel also consistent over time? To address these two questions, we pose the following more specific questions for our research objectives:
1) How different are the estimates of snow depth per pixel using one or three points compared to using more measurements per pixel? Specifically, do more measurements per pixel provide more information?
2) How many measurements are needed per pixel to estimate the mean value of a pixel within an acceptable limit?
3) How do the pixel mean and pixel standard deviations vary inter-annually for the same location at the same time of the year?
4) How do meteorological driving forces, represented by terrain and canopy variables, change in relative effect on snow depth distribution between sites and inter-annually at the same site?
Background on snow depth sampling
From a water resource management perspective, information on water storage or SWE is preferred to estimates of snow depth (
Bales et al., 2006). However, snow depth (
ds) information is the easiest snowpack property to measure. SWE is the product of d
s and the depth averaged snowpack density, and density has been shown to be less spatially variable than SWE or depth (
Logan, 1973;
Fassnacht et al., 2010;
López-Moreno et al., 2013). Operationally, the Natural Resource Conservation Service (NRCS) has measured snowpack properties (
ds and SWE) for more than 80 years with the snow course sites, and over the past 40 years has used data from the automated snowpack telemetry (SNOTEL) network to forecast spring and summer runoff volumes (
U.S. Department of Agriculture, 2011, 2016) across the Western United States where snowmelt contributes water supply to more than 60 million people. SNOTEL data have a daily temporal resolution that represents the dynamic evolution of the snowpack, but only represent a small area (~10 m
2). Thus, individual measurements or stations tend not to be representative of the area(s) surrounding them (
Molotch and Bales, 2005;
Kashipazha, 2012;
Meromy et al., 2013), and such a single fixed point measurement is not a statistically useful tool to represent the average snow depth of an area, even with a relatively uniform snow pack (
Neumann et al., 2006). This issue arises for manual field surveys used to evaluate the distribution of snow (e.g.,
Meiman, 1968;
Mould, 1985;
Elder et al., 1991;
Balk and Elder, 2000;
Erxleben et al., 2002;
Erickson et al., 2005;
Molotch and Bales, 2005;
Hultstrand et al., 2006;
López-Moreno and Nogués Bravo, 2006;
Sexstone and Fassnacht, 2014). Most of these surveys use one (
Elder et al., 2009) or three (
Elder et al., 1991) snow depth measurements to represent one pixel (e.g., 30 m) of a DEM.
Rice and Bales (2010) suggested that a network of sensors could be used to provide a better estimate of snow depth, especially for areas that cannot be readily or easily surveyed manually. On average, 5 sensors out of a network up to 44 yielded a snow depth estimate within 25% of the basin mean, depending on the snow variability (
Rice and Bales, 2010). More sensors are likely needed to achieve this level of accuracy in terrain with strong topographic variability (
Neumann et al., 2006).
López-Moreno et al. (2011) sampled 121 points at 15 relatively homogenous 100 m
2 plots in the Spanish Pyrenees and found that five to seven points produced an average snow depth within 5% of the mean value based on the 121 points. In this paper we used 11, 17, or 21 snow depth measurements per location or pixel size of 900 m
2 for over 150 pixels to evaluate snow depth variability within a given location and to determine the optimal number of points.
Data and methods
Study sites and sampling
Snow depth was measured manually about two SNOTEL stations using evolving sampling strategies. The first survey was about the Togwotee Pass SNOTEL station in Northwestern Wyoming (Fig. 1(a)) on 17 March 2009 (TP09), while the second and third surveys were about the Joe Wright SNOTEL station near Cameron Pass in Northern Colorado (Fig. 1(a)) on 1 May 2009 (JW09) and 2 May 2010 (JW10). The Togwotee Pass site has a mixed canopy density throughout (Fig. 1(b)-i), while the Joe Wright site is moderately to densely forested, except along the road through the middle of the study area (Fig. 1(b)-ii). Both sites are located in the Spruce-Fir forest composed mainly of Picea engelmanii and Abies lasiocarpa trees.
Snow depth measurements were taken to cover a 1 km×1 km area about the SNOTEL station in 9 or 10 transects each separated by 100 m (
Meromy et al., 2013; Fig. 1(b)). A 1 cm diameter aluminum probe (extendable in 1 m lengths) was used to measure snow depth to the nearest one centimeter. Measurements were taken at 50 m intervals along each transect. During snow sampling, a hand-held Garmin or Trimble global positioning system (GPS) unit was used to navigate to each sampling location. Each person was instructed to navigate to within 10 m for each coordinate, and record the GPS coordinates at the center point to the nearest one-meter.
The simplest and most basic sampling design was three snow depth measurements per pixel taken 5 m apart, denoted –5, 0 and+5 m (
Meromy et al., 2013; Fig. 1(c)). At the first site (TP09), two sets of four points were added between each of the basic three measurements (–5, 0 and+5 m) plus an additional five points to the left and right of center point yielding 21 points per pixel (Fig. 1(c)-i). The 10 additional points to the left and right added a substantial amount of sampling time, and were only used for TP09. Eleven points were used per pixel for the Joe Wright 2009 survey (Fig. 1(c)-ii). To increase the number of points per pixel, while only slightly increasing the sampling time, one point was added each to the left and right of the start (–5), center (0) and end (+5) resulting in 17 points per pixel at Joe Wright in 2010 (Fig. 1(c)-iii). These surveys produced 157, 203, and 206 pixels for TP09, JW09, and JW10, respectively. All measurements that were by the road or in the anthropogenically produced snow bank beside the road were noted and not used in the analysis.
Analysis and statistics
We examined the impact that the number of sampling points per pixel had on computed mean and standard deviations (Tables 1 and 2; Fig. 2). Firstly, we compared the center snow depth measurement to the depths collected along the 11 points in a row (Fig. 1(c)-ii) for all three datasets, and then all points for TP09 (21 points, Fig. 1(c)-i) and JW10 (17 points, Fig. 1(c)-iii). Secondly, we used the basic three measurements in a row (–5, 0, +5 m;
Meromy et al., 2013), and then the five measurements in a plus from TP09 (Fig. 1(c)-i). For TP09 and JW10 we also compared the 11 in a row versus all measurements. Lastly, we computed the difference from the pixel mean snow depth for all samples per pixel as a function of the number of points per pixel (Fig. 3). These comparisons were evaluated using the Nash Sutcliffe coefficient of efficiency (NSCE) (
Nash and Sutcliffe, 1970), the coefficient of determination (
R2), and the slope of the best-fit regression.
To evaluate repeatability of sampling (
Erickson et al., 2005;
Sturm and Wagner, 2010), comparisons of the two years of snow survey data for the Joe Wright site were conducted by overlapping both datasets to ensure that each corresponding data point from both years was within the same 30-m pixel (Fig. 4). Any 2009 measurement location that did not have measurements in 2010 but was within one pixel of a 2010 measurement was also compared; this increased the number of comparable points between the two survey years (Fig. 4).
Various statistical techniques (e.g.,
Erxleben et al., 2002;
López-Moreno and Nogués Bravo, 2006) have been used to determine which terrain and canopy variables explain the spatial distribution of snow. A comparison of methods for the same extent used here (~ 1 km
2 shown in Fig. 1(b)) was applied across several Rocky Mountain sites in Colorado for the NASA Cold Land Process Experiment, and binary regression trees were determined to be the most successful at explaining the variability of snow depth (
Erxleben et al., 2002). Accordingly, we created binary regression trees (Fig. 5) following
Elder et al. (1991) using the Matlab data processing language. The trees were developed to identify the sequence of terrain and canopy variables that best distributed mean and standard deviations of snow depths (Table 3), as well as to determine the number of measurements needed to produce a value within 5% of the pixel mean (
López-Moreno et al., 2011) (hereinafter called measurements to 5%, Table 3). The binary regression trees were built with a series of branching decisions ending in a group of measurements or “leaf”. A minimum leaf size equal to 5% of each site’s measurements was used so that each terminal node contained a reasonable number of measurements. Each binary regression tree was analyzed across all possible mergers of decision points and leaves to find the tree with the lowest standard error. Standard error here is a function of the relative standard deviation within each leaf. A tree with one leaf for each measurement will have an error of zero, and error generally increases as number of leaves decrease. There are exceptions where reducing leaves and decision points produces slightly more homogenous terminal nodes (
Meromy et al., 2013).
Geospatial data
Manual snow depth sampling was carried out at a resolution of 50 m to 100 m while a 30-m DEM was used for topographic data. The statistical analyses identified the terrain and canopy variables that describe the distribution and variability of snow depth about the two SNOTEL stations. These terrain variables included elevation (
Dingman et al., 1988;
Fassnacht et al., 2003), slope as the sine of slope (
Sexstone and Fassnacht, 2014), northness (
Fassnacht et al., 2012), eastness (
Wallace and Gass, 2008), cumulative monthly clear sky solar radiation (
Dozier and Frew, 1990), and maximum upwind slope (
Winstral et al., 2002). Northness is the product of sine of slope and cosine of aspect to represent an integration of wind and sun influences with a steep north facing slope approaching a value of 1 and a steep south facing slope approaching a value of –1. Eastness is the product of sine of slope and sine of aspect to represent wind processes. Maximum upwind slope describes the topographic shelter or exposure relative to a specific wind direction (
Winstral and Marks, 2002), and can explain the variability in snow deposition due to wind transport and redistribution (
Winstral et al., 2002).
Elevation data were obtained from the
U.S. Geological Survey (2018) 30-m DEM, and used to derive the terrain variables. Canopy cover offers snow protection from the wind, and provides shade on sunny aspects, and was thus also used. Canopy cover data were also obtained from the
U.S. Geological Survey (2018). Data points were overlain onto the DEM and the other variables in the ArcMap Geographical Information System (GIS) software. The value for the variables at each data point was then extracted.
Results
When center points were compared to overall pixel means, only Joe Wright 2010 was not considered good or very good (Fig. 2(a)), as per the classification of
Moriasi et al. (2007). The mean of the three main measurement points (–5, center, +5; Fig. 1(c)) is better than the center point measurement alone for representing the overall pixel mean, as shown by the improved NSCE and R
2 values (Tables 1 and 2, Fig. 2). For Togwotee Pass 2009 and Joe Wright 2010, measuring 11 points was nearly the same as measuring all 21 (TP09) or 17 (JW10) points, in terms of estimating the pixel mean (Table 1; Fig. 2(a)). The five extreme points arranged in a plus at Togwotee Pass did produce a more accurate mean than three points in a row (Fig. 1(c)-i versus Fig. 1(c)-ii).
The standard deviation computed from just three measurements is not adequate to represent the variation across a pixel from all measured points (Table 2; Fig. 2(b)). The five extreme points at Togwotee Pass (Fig. 1(c)-i) is better than the 11 points in a row at representing the variability within each pixel (Table 2). At Joe Wright 2010, the standard deviation from 11 points in a row is similar to the pixel standard deviation computed with the extra left and right points at –5, 0 and+5, i.e., 17 points in total (Fig. 1(c)-iii).
The average number of points per pixel required to represent the mean of the pixel depends on the acceptable error (Fig. 3), and we used the 5% threshold suggested by
López-Moreno et al. (2011). Only 20 (TP) to 30% (JW) of the actual points measured, i.e., 3 to 5 points, are required to obtain the mean within 5% (Fig. 3(a)). Since there are more measurements per pixel at TP than JW (Fig. 1(c)), the deviation from the mean is smaller even with a lower proportion of the total points (Fig. 3). When using only the 11 points in a row configuration (Fig. 1(c)-ii), the relation between difference from mean and the number of points are almost identical (Fig. 3(b)). However, there is considerable variability among individual pixels (Tables 1 and 2; Fig. 2).
There is inter-annual consistency in the average number of points necessary to estimate the pixel mean (Fig. 4(a)), but much less consistency among the mean of individual pixels within the same year (Fig. 4(a)). This decreases further from the fit of R2 = 0.26 when nearby measurements (within one pixel) are added (R2 = 0.11). The snow depth was almost the same for the early May sampling at Joe Wright; the measurement means were 156 and 151 cm, and the SNOTEL measured depths were 177 and 180 cm for 2009 and 2010, respectively. At the SNOTEL station, melt started after the sampling (6 May 2009 and 16 May 2010). There is almost no correlation of pixel standard deviation between years (Fig. 3(b)).
The relative influence of terrain and canopy variables was somewhat more consistent. Eastness appears as an explanatory variable in all nine binary regression trees (BRTs), with canopy density and elevation appearing in eight of the nine BRTs (Table 3). Canopy density appears early in the binary regression trees for mean snow depth and standard deviation (Table 3(b) and 3(c)). Elevation appears at the top of the binary regression trees for measurements to 5% at all three sites (Table 3(a)), and in the second (JW09 and JW10) or third (TP09) branch for mean, but later (or not at all JW09) for standard deviation. Sine of slope appears in the first or second branch and then again in the third branch for the standard deviation. There is more consistency between the order of appearance of terrain and canopy variables for the two years at JW than for the same year at TP and JW (Table 3).
Discussion
Previous surveys have used only three measurement points in the sub alpine (e.g.,
Molotch and Bales, 2005;
Meromy et al., 2013 around SNOTEL stations) or up to five measurement points in the alpine (e.g.,
Elder et al. (1991) at Emerald Lake, Sierra Nevada CA or
Hultstrand et al. (2006) at GLEES, Snowy Range WY) to evaluate snowpack depth and variability per sample location. The NASA Cold Land Process (CLPX) used only one snow depth measurement, but with a random stratified sampling design which yielded 550 snow depth measurements over a 1 km
2 area (
Erxleben et al., 2002). In a study in the Pyrenees, 5 to 6 samples were adequate to represent one location at a 5% threshold (
López-Moreno et al., 2011). However, the sites used by
López-Moreno et al. (2011) were chosen as they appeared to have homogenous snow distribution on the surface, and only 15 plots of 121 points each were surveyed twice. A reliable sampling strategy is five measurement points taken at 5-m intervals (Fig. 1(c)-i), but this is more difficult in forested areas and complex terrain.
The 2009 and 2010 Joe Wright SNOTEL snow surveys used a sampling strategy of 11 and 17 points, respectively. In these two surveys only 99 sampling locations were within the same pixel and 70 were in an adjacent pixel (Fig. 4) due to the complexity of the terrain, the dense canopy, and safety/logistics of sampling. It should be noted that in this dense forest, the GPS accuracy was occasionally reported to be as poor as 10 m, implying that a specific pixel measurement may actually be within a different pixel. The 10-m transect (Fig. 1(c)) is not necessarily all within one 30-m pixel, and could even be sampled across three pixels (Figure 1ci). Snow surveyors were told to target each sampling pixel within 10 m of a specified center point, which could lead to up to maximum potential error of 40 m for points between the 2009 and 2010 survey. Furthermore, since snow depth, terrain, and canopy all vary at scales finer than the 30-m resolution data used here (
Fassnacht and Deems, 2006), measuring more than three points per pixel (e.g., 11 to 21; Fig. 1(c)) could reduce sampling error.
Much of the snow hydrology literature suggests that there is a temporal consistency in the spatial patterns of the distribution of snow (e.g.,
Erickson et al., 2005;
Sturm and Wagner, 2010;
Webb, 2017). In open areas or areas with low density canopy, the snow distribution should be consistent through each snow year, as the areas are affected by the same physical features each season, while in more closed canopy areas, the snow distribution may vary through seasons due to changes in vegetative structure and density, especially over time (
Rice and Bales, 2010). The patterns can be less consistent intra-annually than inter-annually over small domains (
Kashipazha, 2012), than this justifies further investigations into the patterns at the seasonal scale (
Webb, 2017). For the Joe Wright data, the difference between 2009 to 2010 (Figs. 4(a) and 4(b)) could not be explained by land cover type, as 93% of the pixels were classified as evergreen, nor by canopy density, as most of the pixels had a dense canopy (91% had a canopy density greater than 50%, 70% of the pixels has a canopy density greater than 75%).
Binary regression trees (e.g., Fig. 5) illustrate the systematic relation of snow properties per pixel to terrain and canopy variables (Table 3;
Elder et al., 1991;
Erxleben et al., 2002;
López-Moreno and Nogués Bravo, 2006). Elevation influences the distribution of snow (
Fassnacht et al., 2003), but more so at Joe Wright (Table 3) as the elevation range was 50% greater (3055 to 3295 m at Joe Wright versus 2840 to 3000 m at Togwotee Pass). Slope also affected the distribution of snow, appearing often in the mean regression tree at Joe Wright. Sine of slope and maximum upwind slope were only related when the local slope faces into the direction of the wind from the west at TP and south at JW (
Meromy et al., 2013).
Canopy density was an important factor due to its effects on wind and solar radiation (
Winstral et al., 2002;
Veatch et al., 2009), and it appeared in the binary regression trees for all snow surveys for both mean snow depth (Table 3(b);
Fassnacht et al., 2017) and standard deviation (Table 3(c)) per pixel, but fewer for measurements to 5% (Table 3(a)). Both study sites are forested with an average of 63% and 74% at Togwotee Pass and Joe Wright, respectively (Fig. 1(b)). Similarly, modeled clear sky cumulative solar radiation (plus eastness and northness) appeared in the regression trees and had some correlation to standard deviation at Joe Wright. Since all the measurements were taken prior to the start of melt, canopy would further be relevant for the distribution of snow during ablation (
Molotch et al., 2009).
How much snow varies within a pixel (Table 3(c); Fig. 2(b)) is not directly a function of the same terrain/canopy variables that drive the distribution of snow across the landscape (Table 3(b)), nor how many measurements are required to reach 5% (Table 3(a); Fig. 5). This difference is related to in part to scale (
Blöschl, 1999); snow depth and vegetation are correlated at shorter scales (
Deems et al., 2006) when trees are present (
Trujillo et al., 2007). At longer distances, snow depth is more related to elevation (
Fassnacht et al., 2003;
Deems et al., 2006;
Fassnacht et al., 2009).
Since each of the Joe Wright surveys composes the same area and uses 99 of the same sample points, it is expected that the key variables influencing the average snow depths for each survey to be the same or similar. Variables such as canopy cover, slope, and aspect are expected to be constant between years, and other factors, such as the driving local meteorology (given by solar radiation and maximum upwind slope), are consistent from year to year (
Meromy et al., 2013). Here, we showed limited inter-annual consistency in mean snow depth (Fig. 4(a)). There is no previous research examining the distribution of variability; no consistency was found in the variability of snow depth for a specific pixel between years (Fig. 5(b)). However, the correlation to the terrain and canopy variables is consistent (Table 3). It is recommended that more data should be collected to draw more solid conclusions between variability in snow depth at the same location over different years (e.g.,
Erickson et al., 2005).
With 11 to 21 samples per pixel over more than 150 pixels (Fig. 1(b)), our measurements have higher accuracy than the single SNOTEL measurement (
Kashipazha, 2012;
Meromy et al., 2013). The 11 measurement points in a row (Joe Wright 2009;
Fassnacht et al., 2017) was the easiest to perform in the field, while 21 measurement points in a plus (Togwotee Pass) was the most difficult, especially in dense canopy and steep areas. The 17 measurement point per pixel survey from Joe Wright in 2010 seemed to be the most reasonable compromise to ensure representative samples (Kashipazha, 2012), and efficiency in time and energy spent to sample. On average, sampling three of 11 points in a row yields a mean snow depth within 5% (
López-Moreno et al., 2011) of the pixel mean (Fig. 3(b)). The additional measurements for TP09 (21 points) and JW10 (17 points) improve estimation of snow depth in a pixel because these additional 6 to 10 points are off the row of 11 samples. However, for TP09, the 5 meters measurements taken off the main sampling line were difficult due to the dense canopy (Fig. 1(b)) and slope of the terrain at some of the pixels. The 17 point sampling strategy (Fig. 1(c)-iii) was more efficient. To determine the variability across a pixel using the standard deviation, the five extreme points in the 21 point sampling strategy (Fig. 1(c)-i) were best strategy (Tables 1 and 2). The different needs depend on the final purpose of the research and must be considered when designing a sampling strategy; sampling to estimate the mean would be used for an overall estimation of the snow resource while sampling to estimate the standard deviation would be used to assess sub-grid variability.
Conclusions
Snow depth sampling strategies, the distribution of snow, and its sub-pixel variability were investigated at two sites. Using one snow depth measurement is not adequate to estimate snow depth within a sampling pixel, when compared to 11, 17, or 21 measurements. While increasing the sampling to three points is an improvement when compared to using 11 measurement, but more measurements provide a better estimate of the mean snow depth across a sampling pixel. Using an acceptable error of sampling within 5% of the overall mean, measuring between 3 and 6 points was optimal. The results were similar for each snow survey. Adding sampling off of the main transect improved the estimation of pixel mean snow depth. However, sampling five meters to the left and right of the main transect increased the sampling time, so sampling one meter on each side at three locations in each sampling transect was a good compromise between sampling time and quantity of data collected.
On a pixel by pixel basis, the mean snow depth was not consistent for two similar sampling years; there was no correlation between years for the standard deviation of snow depth within the same pixels. The same key variables appeared more often in the binary regression trees across years and sites, but the hierarchy was inconsistent. The most relevant variables were elevation, canopy density, and eastness (the product of sine of slope and sine of aspect).
Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature