Introduction
The increasing global demand for energy and growing environmental concerns over the dependence on fossils lead scholars to explore for alternative solutions (
Pardalos et al., 2013). Wind is among the most important renewable sources of energy with minimal environmental impact and is abundant in nature. Wind farms do not use fuel and do not emit air pollution, and the energy consumed to manufacture a wind power plant is recovered within a few months only. In addition, the microclimate generated by wind turbines has advantages to corn and soybean crops, such as preventing the late spring and early autumn frosts and reducing pathogenic fungi (
Takle and Lundquist, 2010). The disadvantages of harvesting wind energy are minimal compared with traditional sources that involve sound and visual pollution, bird and bat deaths, and interference with radio reception and ground radar systems used for military, weather, and air traffic control.
Globally available wind power over land in locations worldwide with mean wind speeds exceeding 6.9 m/s at 80 m height is approximately 72 TW (630–700 PWh/yr) according to several studies (
Jacobson, 2009). As of 2005, this resource was five times the world’s total power production and 20 times the world’s electric power production (
Archer and Jacobson, 2005). Hence, recent years have been marked by a rapid development in the wind energy sector, especially in China where wind power has become a key economic growth component. In 2016, China added 19.3 GW of wind power generation capacity, accounting the total capacity of up to 149 GW; the country generated 241 TWh of electricity from the wind, which constituted 4% of the country’s total consumption for the year (
Global Times, 2017). Currently, China has the largest capacity of installed wind farms and is ranked second worldwide in the production of wind energy, with the US being first due to their higher capacity factor (
Vaughan and Kelley, 2016). The remarkable recent progress in developing wind energy capabilities may be the beginning of a new era because the potential for continuing growth is tremendous.
McElroy et al. (2009) asserted that China could meet all of its electricity demands through wind power by 2030. However, for these projections to materialize, the focus of governmental and private entities involved in the wind energy sector should shift from the domain of landscape assessment and environmental impact to the economic viability and efficiency.
One of the main challenges that should be overcome to maximize fully the wind for energy generation purposes is the high variability of wind speeds (
Soder, 2004). Particularly, the distribution of wind speed over time resembles the Weibull function (
Piacquadio and De la Barra, 2014). A diverse set of locations for wind farms could be selected to ensure a steady supply of wind energy, such that at least some of the selected locations have sufficiently high wind speed at any time. Specifically, the following question needs to be addressed:
Given the wind speed data collected from the potential locations of wind farms, which locations should be selected to maximize the wind power production while balancing the wind speed variability?Various aspects of wind energy related to this question have been considered in previous studies based on different perspectives (
Kahn, 1979;
Milligan and Artig, 1999;
Milligan, 2000;
Archer and Jacobson, 2007;
Drake and Hubacek, 2007;
Roques et al., 2010;
Degeilh and Singh, 2011;
Grothe and Schnieders, 2011;
Novoa and Jin, 2011;
Liu et al., 2013). Different from the existing works on the topic, the present work approaches the issue by using a network representation of the wind energy system. In this study, the nodes correspond to potential sites for wind farm location and the edges connect the pairs of sites characterized by negative correlations of wind speed fluctuations. Hence, the problem of selecting appropriate locations can be reduced to finding tightly knit clusters of nodes corresponding to windy locations. We illustrate the proposed methodology by using a real-life wind speed data set collected from more than 200 locations in Bolivia over a 10-year period.
The network-based data mining approach explored in this work is based on observation that big data arising in various complex systems can be conveniently modeled by using networks or graphs. The components of the complex system are represented as vertices (nodes), and the pairwise interactions among different components are described by edges (arcs) that link pairs of vertices. This simple and intuitive modeling technique can maximize the rich and powerful network analysis tools to reveal several global structural properties of the underlying system and predict the overall trends in its dynamics. This approach has been successfully used to analyze several other complex systems involved in various applications (
Boginski et al., 2003), including phone call records in telecommunications (
Abello et al., 1999), social networks (
Pattillo et al., 2012), and stock market data (
Boginski et al., 2006).
The remaining sections of this paper are organized as follows. Section 2 provides some preliminaries from graph theory and network analysis. Section 3 describes the process used to construct a wind speed graph by using the wind speed time series. Section 4 presents the results obtained by applying the proposed approach to the available data set. Finally, Section 5 elaborates the conclusion of the study.
Preliminaries
Before describing the process of constructing the wind speed graph, we introduce the necessary definitions, notations, and other preliminaries from the graph theory and network analysis.
A simple undirected graph (network) is provided as G=(V,E), where V is the set of n vertices (nodes), and E is the set of m edges (arcs) that connect the pairs of vertices. The edge between vertices is denoted by . If the edge (i,j) exists in the graph, then the two vertices, i and j, defining the edge are the endpoints of this edge, also said to be neighbors and adjacent to each other. The number of neighbors of a vertex is called the degree of that vertex. A graph is regarded as complete if it contains all possible n(n−1)/2 edges. For a subset , the subgraph induced by S, denoted by G(S)=(S,E(S)), uses S as its set of vertices and E(S) as its set of edges, where E(S) consists of all edges in E that have both endpoints in S. A subset C of vertices is called a clique if the induced subgraph G(C) is a complete graph, that is, all vertices in C are pairwise adjacent.
Cliques can be considered an idealistic model of a tightly knit cluster of nodes in a network, where all possible pairwise connections are present. On the one hand, the “perfect” nature of a clique makes it an attractive cluster model, on the other hand it is the reason cliques are considered overly restrictive in several applications, especially when the studied network model is only an approximation of the actual complex system under investigation. As a result, many alternative cluster models have been introduced in the literature, aiming to overcome this drawback by allowing missing edges while preserving certain cohesiveness characteristics of cliques. Such models, referred to as clique relaxations (
Pattillo et al., 2013), often provide a better description of clusters than the cliques. In this study, we will use a clique relaxation called
s-plex to model clusters in wind speed graphs. Specifically, a subset of vertices
C is called an
s-plex if each vertex from
C is a neighbor of all but at most
s vertices from
C , where
s is a given positive integer constant. Note that for
s =1,
s-plex is equivalent to a clique (i.e., the only non-neighbor of each vertex is the vertex itself). For low
s values (2,3),
s -plexes provide extensively cohesive clusters that retain the properties of a clique, such as high connectivity and low diameter, in a slightly relaxed form (
Balasundaram et al., 2011;
Verma et al., 2015).
Large tightly knit clusters in a network constitute several most important building blocks that fully influence the network’s overall structure. Particularly, the size of the largest clique and
s-plex can be used to characterize the global cohesiveness properties of the network. The problems of finding the largest clique and
s-plex in a graph are referred to as the maximum clique and maximum
s-plex problems, respectively. Both problems are NP-hard (
Garey and Johnson, 1979;
Balasundaram et al., 2011). Nevertheless, effective methods that can successfully address practical occurrences of these problems are available (
Trukhanov et al., 2013;
Verma et al., 2015).
Construction of the wind speed graph
In constructing the wind speed graph, we consider a number of geographically diverse locations to be the potential places for harvesting wind energy and represent each location as a node in the graph. In the case study presented in the subsequent section, we use a data set based on a 10-year study in Bolivia (
GeoBolivia, 2008). The data set contains the wind speed data at a height of 10 m for 201 potential wind mill locations recorded every 10 min. The information collected is represented in monthly charts, correcting the variation of the data in air density, height, and temperature depending on the location, due to the altitude in comparison with the sea level. Figure 1 shows a map of the 201 locations with their corresponding average wind speed. Different ranges of velocities are denoted by various radii of blue disks in the figure.
Although the measurements are obtained at the height of 10 m, a typical wind turbine is operated at a height of 80 m or 100 m, where the wind speeds are typically higher. In accordance with the wind profile power law relationship (
Touma, 1977), we have
where
u is the wind speed (m/s) at height
z (m), and
ur is the known wind speed at a reference height
zr. The friction coefficient
depends on the type of the location and ranges between 0.1 for lakes, ocean, and smooth hard ground and 0.4 for city areas with high-rise buildings (
Bañuelos-Ruedas et al., 2011).
Figure 2 shows the average wind speed for each of the investigated sites over a considered period at a height of 10 m. Here, the X- and Y-axes correspond to the order index of the locations and the average wind speed, respectively. The overall average speed for all the locations is 2.70 m/s.
Figure 3 shows the cumulative average wind speeds for all the investigated sites. From the figure, the X-axis is the set of locations sorted in decreasing order of the average wind speeds, and the Y-axis corresponds to the average (left) and cumulative average wind speeds (right, indicated by the orange line). The variance of the average wind speeds is 2.2867, and the standard deviation is 1.5122.
The edges in the wind speed graph are constructed on the basis of the pairwise correlations of wind speed fluctuations of the considered sites. Let
Si(
t) denote the wind speed in location
i at time
t. Moreover, the fluctuation of the wind speed for the
ith site at time
t is expressed as
, whereas the correlation coefficient between sites
i and
j can be calculated as follows (
Mantegna and Stanley, 2000):
where V denotes the set of the considered sites, which is also the set of vertices of the constructed wind speed graph. Given that for our working data set, we have 20100 pairs of correlation coefficients. Figure 4 shows the distribution of the computed correlation coefficient values. Here, the X-axis corresponds to the correlation value and the Y-axis is the frequency of pairs of sites with the correlation coefficient value in the corresponding interval. The most frequently occurring correlation coefficient values are around 0.33, the average correlation coefficient value is approximately 0.27, and the median is around 0.3.
Two versions of the wind speed graph can be considered on the basis of the correlation coefficients, that is, the complete edge-weighted version (where all possible edges are present with the weights given by the corresponding correlation coefficients) and a threshold-based version constructed as follows. The correlation threshold is selected and the edge in the wind speed graph is included if and only if . In this work, we focus on the latter version. Selecting an appropriate correlation threshold is an important consideration in constructing a threshold-based wind speed graph. The wind farms must be located in places that have negative pairwise correlations of wind speed fluctuations to minimize the effects of wind speed variability. Hence, is used as the threshold correlation value in this work. The number of edges in the corresponding wind speed graph is 4811 out of possible 20100 edges, corresponding to the edge density of 0.239353. The edge density is expressed as 2m/(n(n−1)), where m is the number of edges and n is the number of vertices.
Figures 5 and 6 show the wind speed graph and its degree distribution, respectively. The average degree of a vertex is around 47.87.
Case study
In this section, we present the network analysis results obtained from the wind speed graph constructed in the previous section based on the data set of the 10-year study in Bolivia (
GeoBolivia, 2008). Then, we solve the maximum clique and the maximum
s-plex problems for
s=2,3, as well as their vertex-weighted versions. Given the nonnegative weight
associated with each vertex
, the maximum weight clique (
s-plex) is utilized to find a clique (
s-plex) C that maximizes
. The weight of a vertex in the wind speed graph is defined as the average wind speed in the corresponding location. To address the investigated optimization problems, we use the combinatorial branch-and-bound approach called the Russian Doll Search (
Trukhanov et al., 2013).
In the first set of experiments, we consider all the 201 locations to be the potential wind farm sites, independent on the attractiveness of their wind speed profile. Table 1 summarizes the results of these experiments. The maximum clique in this case consists of six vertices that correspond to the locations as follows (with their average speeds in parentheses): Mendoza (3.8 m/s), Santa Ana (3.7 m/s), Colquechaca (2.8 m/s), Taquiri (1.9 m/s), Tariquia (1.3 m/s), and El Puente (0.4 m/s). The maximum weight clique has five vertices, namely, Huacullani (6.9 m/s), Redencion (4.9 m/s), Robore (4.3 m/s), Santa Ana (3.7 m/s), and Chilcara (2.7 m/s). Evidently, some of the locations suggested by these solutions (especially in the unweighted case) are unacceptable due to their rather low average wind speed. The maximum weights of the 2-plex and 3-plex models yield slightly highly attractive solutions. Particularly, the six sites selected by the maximum weight 2-plex solution have the average speeds of 2.8, 3.8, 4.5, 6.0, 6.4, and 8.1 m/s, respectively, with an overall average of 5.25 m/s. Figure 7 illustrates the monthly average wind speeds for these locations in comparison with the locations included in the optimal solutions of the other considered problems (i.e., weighted and unweighted), as well as all the 201 locations (denoted by “Bolivia” in the plot). Noticeably, the maximum weight 2-plex solution is relatively attractive with regard to the overall wind speed average and the balanced wind speeds over time.
Although the results of the investigated case are rather encouraging, we can further improve them by disregarding the locations that are clearly unattractive due to their low average wind speeds. By performing such a feasibility study, we only have 39 locations to consider for placing the wind power plants with the average wind speeds of more than 4 m/s. (The complete list of locations, their geographical coordinates, and average wind speeds are presented in Table A1 of Appendix) Subsequently, we proceed to solving the considered weighted optimization problems of this reduced wind speed graph (hereinafter referred to as the feasible wind speed graph). Solving the maximum weight clique problem in this graph yields an optimal solution consisting of only two sites, that is, Villa Puni (5.98 m/s) and Calamarca (8.12 m/s), with an overall average wind speed of 7.05 m/s. The maximum weight 2-plex solution adds two locations, that is, Chorocona (6.19 m/s) and Tacagua (8.12 m/s), to the maximum weight clique solution. Similarly, the maximum weight 3-plex solution adds two sites, namely, Huaraco (5.66 m/s) and Tinquipaya (4.92 m/s). Table A1 summarizes the corresponding results. Figure 8 shows the maximum weight 3-plex solution on a geographical map, and Fig. 9 presents the plot of the average monthly wind speeds for the optimal solutions obtained for the weighted versions of all the considered optimization problems of the reduced wind speed graph. From the figure, all the weighted solutions considerably outperform the overall average. In addition, although the maximum weight clique solution seems to be superior with regard to the average monthly wind speeds, the solution only consists of two locations and may not be robust in the long run. Meanwhile, the maximum weight 3-plex solution includes a further geographically diverse set of six locations.
Conclusions
This study presents a network-based approach for determining the optimal locations for wind farms among the given sites on the basis of the wind speed information available for each site. We conclude that the network models provide a convenient, intuitive, visual description of wind power systems. On the basis of the results, the weighted-plex approach seems to be particularly useful for balancing the energy output and wind speed variability.
One of the possible future research directions to explore is to find other clique relaxations, such as
s-defective clique and
-quasiclique (
Pattillo et al., 2013), as alternative models for selecting useful wind farm locations. Considering the fractional objective function, which would maximize the energy output per dollar invested, could also be interesting. Given a simple undirected graph
G=(
E,
V), where each vertex
is assigned two nonnegative rational weights,
ai and
bi, the maximum ratio clique problem (MRCP) is to find an inclusion-wise maximal clique
C in
G that maximizes the quantity
(
Sethuraman and Butenko, 2015). Moreover, the proposed approach can be extended to other types of renewable energy sources, such as solar. Generally, the renewable sources must be optimally integrated into a broad power grid. Hence, a combination of all available types of energy sources rather than an independent one must be considered (
Jaramillo et al., 2004). However, the corresponding optimization problems in this case become increasingly complex; therefore, effective heuristic algorithms may be necessary (e.g.,
Pei et al. (2017) and
Yang et al. (2018)).
The Author(s) 2018. Published by Higher Education Press. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0)