Data-driven distribution network topology identification considering correlated generation power of distributed energy resource

Jialiang CHEN , Xiaoyuan XU , Zheng YAN , Han WANG

Front. Energy ›› 2022, Vol. 16 ›› Issue (1) : 121 -129.

DOI: 10.1007/s11708-021-0780-x
RESEARCH ARTICLE


Abstract

This paper proposes a data-driven topology identification method for distribution systems with distributed energy resources (DERs). First, a neural network is trained to depict the relationship between nodal power injections and voltage magnitude measurements, and then it is used to generate synthetic measurements under independent nodal power injections, thus eliminating the influence of correlated nodal power injections on topology identification. Second, a maximal information coefficient-based maximum spanning tree algorithm is developed to obtain the network topology by evaluating the dependence among the synthetic measurements. The proposed method is tested on different distribution networks and the simulation results are compared with those of other methods to validate the effectiveness of the proposed method.


Keywords

power distribution network / data-driven / topology identification / distributed energy resource / maximal information coefficient

Cite this article

Jialiang CHEN, Xiaoyuan XU, Zheng YAN, Han WANG. Data-driven distribution network topology identification considering correlated generation power of distributed energy resource. Front. Energy, 2022, 16(1): 121-129 DOI:10.1007/s11708-021-0780-x


1 Introduction

Power network topology identification plays a vital role in power system operations since it is the foundation of fault location and network reconfiguration [1–3]. Topology identification was first investigated in Ref. [4], and since then various methods have been proposed to solve transmission network topology identification problems in the framework of state estimation [5,6]. Recently, the proliferation of distributed energy resources (DERs) has placed high demands on power distribution system operation [7,8]. Various DER control methods have been proposed to enhance the reliability and economy of distribution system operation [9,10]. Moreover, distribution network topology identification, which has a remarkable impact on distribution system operation, has attracted increasing attention [11].

Broadly speaking, distribution network topology identification methods are categorized into model-driven and data-driven ones. The interpretable model-driven methods are developed from state estimation methods. Reference [12] investigated the spatial and temporal correlations of distribution network measurements and designed a state estimation-based topology identification algorithm using compressed measurements. Reference [13] formulated distribution network topology identification as a maximum likelihood problem based on the covariance matrix of voltage data and obtained the numerical bounds of optimal results of the relaxed likelihood function. Reference [14] identified distribution network topology with the weighted least squares residuals of limited measurements. Despite the high interpretability of state estimation-based methods, model-driven methods may fail when the underlying models are inaccurate.

Different from model-driven methods, data-driven methods solve topology identification problems by mining information from measurements, instead of establishing specific distribution network models. With the fast development of smart meters and distribution-level PMUs (phasor measurement units), it has become possible to perform topology identification by exploiting the measurements gathered by smart sensors [15,16]. Reference [17] proposed a data-driven method for distribution network topology identification based on the assumption that nodal power injections are independent and nodal voltage magnitudes follow Gaussian distributions. Reference [18] indicated that the parent nodes of distribution networks contained the principal components of their children nodes, and on this basis designed a topology identification algorithm with principal component analysis. Reference [19] proposed a topology identification method using independent nodal power injections and obtained accurate results under incomplete measurements. Since data-driven methods do not rely on mathematical models and capture the relationship between measurements and network topologies, they eliminate the errors caused by inaccurate models and adapt to the flexible integration of DERs. The data-driven methods have shown superior performance in solving distribution network topology identification problems. However, in most studies, the nodal power injections of distribution networks are assumed to be independent and nodal voltage magnitudes are assumed to follow Gaussian distributions, which limits the application of data-driven methods to distribution networks with DERs. In fact, the output power of adjacent DERs is strongly correlated due to similar weather conditions, which affects the joint probability distribution of nodal voltage magnitudes. Neglecting the correlations will lead to inaccurate topology identification results.

To overcome the deficiency of existing methods, this paper proposes a data-driven distribution network topology identification method considering the correlations between nodal power injections. Only nodal voltage magnitude measurements are used to depict the connection between nodes, and detailed distribution system models are not established. In particular, a neural network representing the distribution system power flow is established to generate synthetic voltage magnitude measurements under independent nodal power injections. The maximal information coefficient (MIC) is adopted to evaluate the dependence between synthetic measurements, which reflects the connection between nodes. An MIC-based maximum spanning tree algorithm is developed to estimate the network topology. The main contributions of the paper are summarized as follows:

First, a neural network is established to represent the distribution system power flow. Specifically, the input and output of the neural network are designed as nodal power injections and voltage magnitudes, respectively. The neural network is trained with original measurements and is then used to generate synthetic voltage magnitude measurements with independent nodal power injections. Therefore, the influences of correlations between nodal power injections on voltage magnitudes are eliminated and only the network topology affects the dependence between different nodal voltage magnitudes.

In addition, considering the fact that interdependent nodal voltage magnitudes are related to the connection between nodes, MIC is utilized to evaluate the dependence between voltage magnitudes of different nodes. Unlike other dependence measures, MIC does not rely on the type of probability distribution; it captures the relationship between random variables through the largest mutual information attainable at a given grid resolution.

Moreover, the MIC-based maximum spanning tree algorithm is developed to identify the topology of radial distribution networks. The algorithm constructs a radial network by gradually adding the distribution lines with the largest MIC. The proposed topology identification method is tested on different distribution networks, and simulation results are compared with those obtained by other methods to validate the effectiveness of the proposed method.

2 Topology identification problem

2.1 Introduction to topology identification

The topology identification problem is stated as follows. For distribution networks with nodal voltage magnitude measurements and unknown connections between nodes, the objective of topology identification is to determine the connections between nodes. As shown in Fig. 1, the dotted lines in the left figure show possible connections between nodes and the right figure shows the actual connections.

To describe the network topology, the distribution network is modeled as a graphical model, G = (V, E), where V is the set of nodes, representing the nodes of the distribution network, and E is the set of undirected edges, representing the lines of the distribution network. Each node is associated with a random variable ui, representing the voltage magnitude of node i. For an N-node distribution network, the joint probability distribution of voltage magnitudes of all nodes except that of the root node is stated as

$$p(u) = p(u_2, u_3, \ldots, u_N) = p(u_2)\, p(u_3 \mid u_2) \cdots p(u_N \mid u_2, u_3, \ldots, u_{N-1}), \tag{1}$$

where p(ui|u2, u3, …, ui-1) is the conditional probability distribution of ui given measurements u2, u3, …, ui-1. Note that the voltage magnitude of the root node is constant, thus, u1 is not given in Eq. (1).

Reference [17] points out that the voltage magnitudes of non-descendant nodes are conditionally independent given the voltage magnitudes of their parent node when nodal power injections are independent. Hence, the probability distribution Eq. (1) is represented by N‒1 product terms, which denote the connection between node i and its parent node, as stated in Eq. (2)

$$p_{\mathrm{po}}(u) = \prod_{i=2}^{N} p(u_i \mid u_{\mathrm{pa}(i)}), \tag{2}$$

where the subscript pa(i) indicates the parent node of node i.

Since the probability distribution Eq. (2) depicts the connection between nodes, the network topology identification is performed by constructing p*(u) in the form of Eq. (2), such that p*(u) is close to p(u).

2.2 Mutual information-based topology identification

In this section, the Kullback-Leibler (KL) distance and mutual information (MI) are introduced to lay the foundation of finding p*(u).

The KL distance measures the difference between two probability distributions:

$$D(p \,\|\, q) = E_p \log \frac{p(x)}{q(x)}, \tag{3}$$

where p(x) and q(x) are probability distributions and Ep denotes the expectation taken with respect to p(x). The KL distance is nonnegative and is equal to zero if and only if p(x) = q(x).

Then, for random variables X and Y, MI is defined as the KL distance between their joint distribution p(x, y) and the product of marginal distributions p(x) and p(y), as stated in Eq. (4) [20].

$$I(X, Y) = D\big(p(x, y) \,\|\, p(x)p(y)\big) = E_{p(x,y)} \log \frac{p(x, y)}{p(x)p(y)}. \tag{4}$$

If X and Y are independent, p(x, y) is equal to p(x)p(y); thus, I(X, Y) = 0. If X and Y are dependent, p(x, y) differs from p(x)p(y), making I(X, Y) > 0. Moreover, a larger difference between p(x, y) and p(x)p(y) indicates a stronger dependence between X and Y.
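As a small numerical illustration of Eq. (4), the sketch below evaluates MI for two discrete joint distributions. The 2×2 probability tables and the function name are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math

def mutual_information(joint):
    """MI in nats of a discrete joint distribution given as a 2-D list of probabilities."""
    px = [sum(row) for row in joint]            # marginal of the row variable
    py = [sum(col) for col in zip(*joint)]      # marginal of the column variable
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0.0:                       # 0 * log(0) is taken as 0
                mi += pxy * math.log(pxy / (px[i] * py[j]))
    return mi

# Hypothetical joint tables: the first factorizes into its marginals, the second does not.
independent = [[0.25, 0.25], [0.25, 0.25]]
dependent = [[0.40, 0.10], [0.10, 0.40]]

mi_independent = mutual_information(independent)
mi_dependent = mutual_information(dependent)
print(mi_independent, mi_dependent)
```

The first table yields zero MI because the joint distribution equals the product of its marginals, while the second yields a strictly positive value.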

For distribution network topology identification, let X, Y, and Z be the voltage magnitudes of nodes nx, ny, and nz, respectively. The MI between X and the pair (Y, Z), i.e., between p(x, (y, z)) and the product of p(x) and p(y, z), is expanded by the chain rule as

$$I(X; (Y, Z)) = I(X; Z) + I(X; Y \mid Z) = I(X; Y) + I(X; Z \mid Y). \tag{5}$$

Suppose that node ny is the parent node of the non-descendant nodes nx and nz. Then random variables X and Z are conditionally independent given Y, so that I(X; Z|Y) = 0. In addition, I(X; Y|Z) ≥ 0 since MI is nonnegative. Substituting both into Eq. (5) gives I(X; Y) = I(X; Z) + I(X; Y|Z), from which Eq. (6) is obtained.

$$I(X; Y) \ge I(X; Z). \tag{6}$$

Equation (6) indicates that the MI of voltage magnitudes of connected nodes is at least as large as that of unconnected ones. Based on the above consideration, the MI-based maximum spanning tree algorithm was proposed in Ref. [17] to obtain the distribution network topology based on the following lemma:

Lemma 1: p*(u) is the closest to p(u) if and only if its corresponding distribution network satisfies the following MI-based maximum spanning tree [21]:

$$\sum_{i=2}^{N} I\big(u_i, u_{\mathrm{pa}^*(n_i)}\big) \ge \sum_{i=2}^{N} I\big(u_i, u_{\mathrm{pa}(n_i)}\big), \tag{7}$$

where pa*(ni) is the parent node of node ni of the distribution network based on the maximum spanning tree.

Based on Lemma 1, the distribution network topology is identified by establishing a maximum spanning tree, where the weights of edges are the MI of nodal voltage magnitudes. However, there are two issues for the MI-based distribution network topology identification method. First, it requires independent nodal power injections, a condition that is not satisfied in actual distribution networks. Second, there is no closed-form formula to calculate the MI of non-normally distributed random variables. The alternatives are to apply the calculation formula for normally distributed random variables or to estimate the MI via numerical integration, both of which lead to inaccurate results for non-normally distributed random variables.
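For reference, the standard closed form for a pair of jointly Gaussian random variables with correlation coefficient ρ is I = −(1/2) ln(1 − ρ²). The sketch below evaluates it; whether the Gaussian-based method of Ref. [17] uses exactly this pairwise form is an assumption here, and the function name is hypothetical.

```python
import math

def gaussian_mi(rho):
    """MI in nats of a bivariate Gaussian pair with correlation coefficient rho."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return -0.5 * math.log(1.0 - rho * rho)

# MI grows with |rho| and vanishes for uncorrelated Gaussian variables.
for rho in (0.0, 0.5, 0.9):
    print(rho, gaussian_mi(rho))
```

The formula depends on the data only through ρ, which is why it can misjudge the dependence between voltage magnitudes whose distributions are far from Gaussian.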

In this paper, neural networks and MIC are utilized to tackle the challenge of distribution network topology identification considering the integration of DERs. First, the neural network is established to convert nodal voltage magnitudes under correlated nodal power injections into those under independent nodal power injections, which enables the proposed method to solve the topology identification problem with correlated nodal power injections. Second, MIC is adopted to evaluate the dependence between non-normally distributed nodal voltage magnitudes, thus providing information on the connection between nodes.

3 Methodology

3.1 Neural network

In this section, a neural network is designed to generate voltage magnitude measurements under independent nodal power injections. The input and output of the neural network are the nodal power injections and voltage magnitudes, respectively. The training samples are the original measurements, and the testing samples are synthetic measurements under independent nodal power injections. The training and testing samples are generated based on two principles: they follow the same marginal probability distributions, and the sample space of the training samples covers that of the testing samples. Let M and N be the sample size and the number of nodes, respectively. The procedure to obtain training and testing samples is given in Fig. 2 and the steps are given as follows:

(1) Obtain the nodal power injection measurement vector pi of node i (i = 1, 2, …, N).

(2) Transform pi into the samples pn,i of the standard normal distribution via the cumulative distribution function transformation.

(3) Calculate the covariance matrix ∑ of Pn, where Pn = {pn,1, pn,2, …, pn,N}.

(4) Perform the Cholesky decomposition of ∑ to obtain the lower triangular matrix L.

(5) Obtain the training samples $P_t = L^{-1} P_n^{\mathrm{T}}$, where each column of Pt represents a training sample.

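(6) Calculate the standard deviation vector σ:

$$\sigma_i = \begin{cases} 1/\tilde{l}_{11}, & i = 1, \\ \sqrt{1 - \sum_{k=1}^{i-1} \big(\tilde{l}_{ik}\,\sigma_k\big)^2}, & 2 \le i \le N. \end{cases} \tag{8}$$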

(7) Generate samples pc,i from independent normal distributions with zero mean and standard deviation σi.

(8) Obtain the testing samples $P_s = L^{-1} P_c^{\mathrm{T}}$, where Pc = {pc,1, pc,2, …, pc,N} and each column of Ps represents a testing sample.
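Steps (2) to (5) can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the cumulative distribution function is estimated from ranks (the paper does not say how it is obtained), the two correlated injection series are synthetic, and all function names are hypothetical. Steps (6) to (8) are not covered.

```python
import math
import random
from statistics import NormalDist

def normal_scores(samples):
    """Step (2): map samples to standard-normal scores via the rank-based empirical CDF."""
    m, std_normal = len(samples), NormalDist()
    scores = [0.0] * m
    for rank, t in enumerate(sorted(range(m), key=lambda t: samples[t])):
        scores[t] = std_normal.inv_cdf((rank + 0.5) / m)  # probability stays inside (0, 1)
    return scores

def covariance(rows):
    """Step (3): sample covariance matrix; rows[i] holds the M samples of node i."""
    m = len(rows[0])
    means = [sum(r) / m for r in rows]
    return [[sum((ra[t] - ma) * (rb[t] - mb) for t in range(m)) / (m - 1)
             for rb, mb in zip(rows, means)] for ra, ma in zip(rows, means)]

def cholesky(a):
    """Step (4): lower-triangular L with L L^T = a, for symmetric positive definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(a[i][i] - s) if i == j else (a[i][j] - s) / low[j][j]
    return low

def whiten(rows, low):
    """Step (5): compute L^{-1} P_n^T by forward substitution, one sample (column) at a time."""
    n, m = len(rows), len(rows[0])
    out = [[0.0] * m for _ in range(n)]
    for t in range(m):
        for i in range(n):
            s = sum(low[i][k] * out[k][t] for k in range(i))
            out[i][t] = (rows[i][t] - s) / low[i][i]
    return out

# Synthetic, correlated, non-Gaussian injections of two nodes (illustrative data only).
random.seed(0)
common = [random.gauss(0.0, 1.0) for _ in range(500)]
p1 = [c + 0.5 * random.gauss(0.0, 1.0) for c in common]
p2 = [math.exp(0.5 * c) + 0.2 * random.gauss(0.0, 1.0) for c in common]

p_n = [normal_scores(p1), normal_scores(p2)]
sigma = covariance(p_n)
p_t = whiten(p_n, cholesky(sigma))
cov_t = covariance(p_t)
print(sigma[0][1], cov_t[0][1])
```

In this sketch the normal scores are clearly correlated before the transformation, while the sample covariance of the transformed samples equals the identity matrix up to rounding error.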

3.2 Maximal information coefficient

MIC is a robust dependence measure that achieves good equitability across a wide range of relationship types [22,23]. Taking two random variables as an example, the procedure to calculate their MIC is given as follows:

First, a certain grid resolution is selected based on the scatterplot of random variables X and Y. The resolution indicates the number of rows and columns of the grid.

Second, different grids with k rows and l columns are drawn and MI is calculated for each grid. Then, the largest MI over these grids is given as

$$I^*((X, Y), k, l) = \max I((X, Y) \mid G), \tag{9}$$

where G is a grid with k rows and l columns.

Third, the normalized largest MI of the grid with k rows and l columns is obtained as

$$M(X, Y)_{k,l} = \frac{I^*((X, Y), k, l)}{\log \min\{k, l\}}. \tag{10}$$

Finally, let M(X, Y)k,l be the (k, l)th element of matrix M(X, Y). MIC is defined as the largest element of M(X, Y), stated as

$$\mathrm{MIC}(X, Y) = \max M(X, Y). \tag{11}$$
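The grid search above can be illustrated with a deliberately simplified sketch. It evaluates only equal-frequency grids for each resolution, so it yields a lower bound on the largest grid MI and is not the optimization performed by minepy. The resolution budget k·l ≤ M^0.6 follows the default suggested for MIC in Ref. [22]; the function names and test data are hypothetical.

```python
import math
import random

def equal_frequency_bins(values, n_bins):
    """Assign each value to one of n_bins bins holding (almost) equally many points."""
    m = len(values)
    bins = [0] * m
    for rank, t in enumerate(sorted(range(m), key=lambda t: values[t])):
        bins[t] = rank * n_bins // m
    return bins

def grid_mi(x, y, k, l):
    """MI in nats of the empirical distribution on a k-row, l-column equal-frequency grid."""
    bx, by = equal_frequency_bins(x, l), equal_frequency_bins(y, k)
    m = len(x)
    joint, px, py = {}, [0] * l, [0] * k
    for a, b in zip(bx, by):
        joint[(a, b)] = joint.get((a, b), 0) + 1
        px[a] += 1
        py[b] += 1
    return sum(c / m * math.log(c * m / (px[a] * py[b])) for (a, b), c in joint.items())

def approx_mic(x, y, alpha=0.6):
    """Largest normalized grid MI over resolutions with k * l <= len(x) ** alpha."""
    budget = max(4, int(len(x) ** alpha))
    best = 0.0
    for k in range(2, budget // 2 + 1):
        for l in range(2, budget // k + 1):
            best = max(best, grid_mi(x, y, k, l) / math.log(min(k, l)))
    return best

# Illustrative data: a noiseless monotonic relationship and an independent pair.
random.seed(1)
x = [t / 200 for t in range(200)]
score_functional = approx_mic(x, [2.0 * v + 1.0 for v in x])
score_independent = approx_mic([random.random() for _ in range(200)],
                               [random.random() for _ in range(200)])
print(score_functional, score_independent)
```

The noiseless functional relationship attains the maximum score of 1, whereas the independent pair stays close to zero, up to finite-sample bias.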

The MIC-based maximum spanning tree algorithm is stated in Fig. 3. The objective of the algorithm is to build a tree network whose probability distribution is the closest to that of the actual distribution network.
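Since Fig. 3 is not reproduced here, the following is a generic sketch of a maximum spanning tree built with Kruskal's algorithm on a symmetric weight matrix. Treating the MIC matrix as the edge weights is the idea described above, but the function name, the union-find details, and the example weights are assumptions for illustration.

```python
def maximum_spanning_tree(weights):
    """Kruskal's algorithm on a symmetric weight matrix; returns edges (i, j) with i < j."""
    n = len(weights)
    parent = list(range(n))

    def find(a):
        # Representative of the component containing a, with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    # Candidate edges, heaviest (largest dependence) first.
    candidates = sorted(
        ((weights[i][j], i, j) for i in range(n) for j in range(i + 1, n)),
        reverse=True,
    )
    tree = []
    for _, i, j in candidates:
        ri, rj = find(i), find(j)
        if ri != rj:               # joining two components keeps the network radial
            parent[ri] = rj
            tree.append((i, j))
            if len(tree) == n - 1:
                break
    return tree

# Hypothetical MIC-like weights for a 4-node feeder whose true topology is 0-1-2-3.
w = [
    [0.0, 0.9, 0.6, 0.3],
    [0.9, 0.0, 0.8, 0.5],
    [0.6, 0.8, 0.0, 0.7],
    [0.3, 0.5, 0.7, 0.0],
]
edges = sorted(maximum_spanning_tree(w))
print(edges)
```

Edges that would close a loop are skipped even when their weight is large, which is how the radial structure is enforced.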

3.3 Procedure of topology identification

Based on the neural network and the MIC-based maximum spanning tree algorithm, the procedure of the data-driven distribution network topology identification is presented in Fig. 4.

The F1 score is used in this paper to evaluate the accuracy of topology identification results, expressed as

$$F_1 = \frac{2PR}{P + R}, \tag{12}$$

where P is the precision rate, i.e., the fraction of edges in the inferred topology that also exist in the actual one; and R is the recall rate, i.e., the fraction of edges in the actual topology that also exist in the inferred one.
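A minimal sketch of this accuracy index on edge lists is given below. Edges are treated as undirected, and the function name and example topologies are hypothetical.

```python
def f1_score(inferred_edges, actual_edges):
    """F1 score of an inferred topology against the actual one (undirected edges)."""
    inferred = {frozenset(e) for e in inferred_edges}
    actual = {frozenset(e) for e in actual_edges}
    correct = len(inferred & actual)
    if correct == 0:
        return 0.0
    precision = correct / len(inferred)   # inferred edges that are real
    recall = correct / len(actual)        # real edges that were found
    return 2 * precision * recall / (precision + recall)

actual = [(0, 1), (1, 2), (2, 3)]
inferred = [(1, 0), (1, 2), (1, 3)]       # one of the three lines is misplaced
score = f1_score(inferred, actual)
print(score)
```

With two of three lines correct, precision and recall both equal 2/3, and so does the F1 score.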

4 Case studies

The topology identification method is tested on five IEEE distribution networks, namely the IEEE 8-node, 33-node, 69-node, 123-node, and 141-node distribution networks, to validate its effectiveness. The program is developed using MATLAB and PyCharm on a server with a Xeon(R) E5-2650 v4 CPU and 64 GB RAM.

4.1 Test systems

For each distribution network, the voltage measurement samples are simulated based on the actual load data using MATPOWER [24,25]. Ten scenarios are generated for each network and the results obtained by the proposed method are compared with those of two popular methods, denoted as A and B in the rest of the paper. Both methods A and B use the MI-based maximum spanning tree algorithm to perform distribution network topology identification [17]. The difference between methods A and B is that A treats voltage magnitude measurements as Gaussian random variables and uses the MI calculation formula of Gaussian distributions while B estimates MI via numerical integration. The proposed method, denoted as method C, estimates MIC using minepy [26]. The neural network of method C is a fully connected network with 2 hidden layers, each of which consists of 1000 neurons. The number of neurons in both the input layer and the output layer equals the number of nodes of the distribution network. The neural network is trained with the mean squared error loss function and the Adam optimization algorithm. The initial learning rate is set as 5 × 10⁻⁴ and is halved at the 100th, 200th, and 400th epochs. The hyperparameters of the neural networks are tuned by grid search based on the F1 scores of topology identification results.
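The learning-rate schedule described above can be written as a small step function. This is a sketch only: the function name is hypothetical, and whether epochs are counted from 0 or 1 is not stated in the paper, so the milestone comparison below is an assumption.

```python
def learning_rate(epoch, initial=5e-4, milestones=(100, 200, 400)):
    """Step schedule: the rate is halved once for every milestone already reached."""
    halvings = sum(1 for m in milestones if epoch >= m)  # assumes the halving applies from epoch m on
    return initial * 0.5 ** halvings

for epoch in (0, 100, 200, 400):
    print(epoch, learning_rate(epoch))
```

After the last milestone the rate stays at one eighth of its initial value.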

4.2 Results of MIC

Figure 5 gives the MI and MIC indices of nodal voltage magnitudes when nodal power injections follow Gaussian distributions. The results of node 21 in the 33-node network, node 47 in the 69-node network, and node 8 in the 141-node network are analyzed. The connections between nodes are marked in blue in the first row. The MI or MIC indices are presented in the next three rows, where darker bars indicate stronger dependences between voltage magnitude measurements.

As shown in Fig. 5, the MI indices obtained by method A reflect the connection between nodes. This is because the probability distribution of voltage magnitudes is close to the Gaussian distribution; thus, the Gaussian distribution-based method obtains accurate MI indices. On the other hand, the MI indices of method B do not reflect the true connection between nodes, indicating that numerical integration does not obtain accurate MI indices. Moreover, the MIC indices obtained by method C also accurately reflect the connection between nodes despite the numerical integration used in calculating MIC. Therefore, MIC is a robust and effective index to evaluate the connection between nodes.

4.3 Results of topology identification

Figure 6 demonstrates the topology identification results of method C. The upper figures show the inferred topologies and each point indicates the connection between nodes. The lower figures show the MIC indices and darker cells indicate larger MIC between nodal voltage magnitude measurements. It is found that node-pairs with larger MIC are more likely to be connected in the inferred topology. Some nodes with large MIC are not connected because their connection will lead to meshed networks while the distribution network is radial.

Figure 7 presents the topology identification accuracy indices of the three methods when nodal power injections are correlated. Method C obtains more accurate results than methods A and B do in all the test systems. The reason for this is that, besides the connection of nodes, the correlations between power injections also have significant impacts on voltage magnitudes. Therefore, determining the connected nodes using original voltage magnitude measurements leads to inaccurate topology identification results. In method C, the nodal voltage magnitudes used for network topology identification are generated with independent nodal power injections, which diminishes the influence of correlations between nodal power injections. Therefore, method C obtains the most accurate results.

4.4 Influence of measurement noise

This section investigates the influence of measurement noise on topology identification. Specifically, the measurement noise is modeled by uniform distributions with zero mean and standard deviations ranging from 1% to 20% of the average voltage magnitudes. In addition, the standard deviation of nodal voltage magnitudes is also varied. Figure 8 shows the accuracy indices of topology identification for different measurement variabilities and noise levels. For the same standard deviation of voltage measurements, the accuracy of topology identification decreases as the noise increases. Moreover, larger standard deviations of voltage magnitudes make topology identification less sensitive to measurement noise.
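A zero-mean uniform distribution on [−a, a] has standard deviation a/√3, so a prescribed noise level s corresponds to a half-width a = √3·s. The sketch below generates such noise; the function name, the 1.0 p.u. average voltage, and the 5% noise level are illustrative assumptions, not values taken from the case studies.

```python
import math
import random

def uniform_noise(std, size, rng):
    """Zero-mean uniform noise whose standard deviation equals std."""
    half_width = math.sqrt(3.0) * std     # Uniform(-a, a) has standard deviation a / sqrt(3)
    return [rng.uniform(-half_width, half_width) for _ in range(size)]

rng = random.Random(0)
mean_voltage = 1.0                        # hypothetical average voltage magnitude in p.u.
noise_std = 0.05 * mean_voltage           # a 5% noise level, within the 1% to 20% range
noise = uniform_noise(noise_std, 20000, rng)

sample_std = math.sqrt(sum(e * e for e in noise) / len(noise))
print(sample_std)
```

The noisy measurement is then obtained by adding each noise sample to the corresponding voltage magnitude.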

4.5 Influence of DER

This section analyzes the influence of DER integration on topology identification. The percentage of nodes connected with PV power units is increased from 10% to 100%, and the accuracy indices of topology identification obtained by method C are exhibited in Fig. 9. When a node is connected with PV units, a negative load is added to nodal power injections to represent the output power of PV units. The PV power data are taken from Ref. [27].

As displayed in Fig. 9, method C obtains accurate results for high-level PV power penetration, which indicates that the proposed method applies to distribution networks with PV integration. On the other hand, the nodal power injections are correlated since the output power of PV units is correlated due to similar weather conditions. Figure 10 presents the probability distributions of nodal voltage magnitudes with and without DERs. The dotted black curves are Gaussian distributions. As is observed in Fig. 10, when DERs are not integrated, the probability distributions of nodal voltage magnitudes are close to Gaussian distributions. When DERs are integrated, the probability distributions of nodal voltage magnitudes differ considerably from the Gaussian distribution, and both methods A and B fail to obtain accurate MI indices. In this case, the proposed method C accurately evaluates the dependence between nodal voltage magnitudes, thus obtaining satisfactory results in topology identification.

In summary, the proposed method obtains accurate topology identification results because the neural network depicts the relationship between nodal power injections and voltage magnitudes and tackles the problem caused by the correlated output power of DERs. In addition, the neural network is specifically designed to obtain accurate synthetic nodal voltage magnitude measurements by using the original measurements as training samples. Moreover, MIC accurately evaluates the dependence between non-normally-distributed nodal voltage magnitudes.

5 Conclusions

This paper proposed a data-driven method to identify the topology of power distribution networks with DER penetration. The distribution network was modeled as a probabilistic graphical model. The voltage magnitudes under independent nodal power injections were generated using a designed neural network. The MIC-based maximum spanning tree algorithm was developed to infer the connection between nodes based on the dependence between voltage magnitudes. The proposed method was tested on five distribution networks, and the following conclusions are drawn:

Compared with MI, MIC is more robust to evaluate dependence between non-normally distributed random variables. Removing the correlation between nodal power injections improves the topology identification accuracy. Compared with commonly used data-driven methods, the proposed method obtains more accurate topology identification results for distribution networks with DERs. In the future, the distribution network topology identification problem with limited synchronous phasor measurements will be investigated.

References

[1] Jiang H, Zhang J J, Gao W, Fault detection, identification, and location in smart grid based on data-driven computational methods. IEEE Transactions on Smart Grid, 2014, 5(6): 2947–2956

[2] Pignati M, Zanni L, Romano P, Fault detection and faulted line identification in active distribution networks using synchrophasors-based real-time state estimation. IEEE Transactions on Power Delivery, 2017, 32(1): 381–392

[3] Erdiwansyah M, Mahidin, Husin H, A critical review of the integration of renewable energy sources with various technologies. Protection and Control of Modern Power Systems, 2021, 6(1): 3

[4] Irving M R, Sterling M J H. Substation data validation. IEE Proceedings C (Generation, Transmission and Distribution), 1982, 129(3): 119–122

[5] Zhu H, Giannakis G B. Sparse overcomplete representations for efficient identification of power line outages. IEEE Transactions on Power Systems, 2012, 27(4): 2215–2224

[6] He M, Zhang J. A dependency graph approach for fault detection and localization towards secure smart grid. IEEE Transactions on Smart Grid, 2011, 2(2): 342–351

[7] Zhou D, Ma S, Huang D, An operating state estimation model for integrated energy systems based on distributed solution. Frontiers in Energy, 2020, 14(4): 801–816

[8] Sharma R, Suhag S. Feedback linearization based control for weak grid connected PV system under normal and abnormal conditions. Frontiers in Energy, 2020, 14(2): 400–409

[9] Zhang N, Sun Q, Wang J, Distributed adaptive dual control via consensus algorithm in the energy Internet. IEEE Transactions on Industrial Informatics, 2021, 17(7): 4848–4860

[10] Zhang N, Sun Q, Yang L, Event-triggered distributed hybrid control scheme for the integrated energy system. IEEE Transactions on Industrial Informatics, 2021, online,

[11] Li R, Wong P, Wang K, Power quality enhancement and engineering application with high permeability distributed photovoltaic access to low-voltage distribution networks in Australia. Protection and Control of Modern Power Systems, 2020, 5(1): 18

[12] Alam S M S, Natarajan B, Pahwa A. Distribution grid state estimation from compressed measurements. IEEE Transactions on Smart Grid, 2014, 5(4): 1631–1642

[13] Cavraro G, Kekatos V, Veeramachaneni S. Voltage analytics for power distribution network topology verification. IEEE Transactions on Smart Grid, 2019, 10(1): 1058–1067

[14] Tian Z, Wu W, Zhang B. A mixed integer quadratic programming model for topology identification in distribution network. IEEE Transactions on Power Systems, 2016, 31(1): 823–824

[15] Cunha V C, Freitas W, Trindade F C L, Automated determination of topology and line parameters in low voltage systems using smart meters measurements. IEEE Transactions on Smart Grid, 2020, 11(6): 5028–5038

[16] Jiang W, Chen J, Tang H, A physical probabilistic network model for distribution network topology recognition using smart meter data. IEEE Transactions on Smart Grid, 2019, 10(6): 6965–6973

[17] Weng Y, Liao Y, Rajagopal R. Distributed energy resources topology identification via graphical modeling. IEEE Transactions on Power Systems, 2017, 32(4): 2682–2694

[18] Pappu S J, Bhatt N, Pasumarthy R, Identifying topology of low voltage distribution networks based on smart meter data. IEEE Transactions on Smart Grid, 2018, 9(5): 5113–5122

[19] Deka D, Backhaus S, Chertkov M. Structure learning in power distribution networks. IEEE Transactions on Control of Network Systems, 2018, 5(3): 1061–1074

[20] Cover T M, Thomas J A. Elements of Information Theory. New York: John Wiley & Sons, Inc., 1991

[21] Chow C, Liu C. Approximating discrete probability distributions with dependence trees. IEEE Transactions on Information Theory, 1968, 14(3): 462–467

[22] Reshef D N, Reshef Y A, Finucane H K, Detecting novel associations in large data sets. Science, 2011, 334(6062): 1518–1524

[23] Reshef Y A, Reshef D N, Finucane H K, Measuring dependence powerfully and equitably. Journal of Machine Learning Research, 2016, 17(1): 7406–7468

[24] Roberts M B, Haghdadi N, Bruce A, Clustered residential electricity load profiles from smart grid smart city dataset. 2019, available at the website of narcis.nl

[25] Zimmerman R D, Murillo-Sanchez C E, Thomas R J. MATPOWER: steady-state operations, planning, and analysis tools for power systems research and education. IEEE Transactions on Power Systems, 2011, 26(1): 12–19

[26] Albanese D, Filosi M, Visintainer R, Minerva and minepy: a C engine for the MINE suite and its R, Python and MATLAB wrappers. Bioinformatics (Oxford, England), 2013, 29(3): 407–408

[27] UK Power Networks. Photovoltaic (PV) solar panel energy generation data. London Datastore, 2020–08–18, available at the website of

RIGHTS & PERMISSIONS

Higher Education Press
