This paper presents a novel optimal synchronization control method for multi-agent systems with input saturation. Multi-agent game theory is introduced to transform the optimal synchronization control problem into a multi-agent nonzero-sum game. The Nash equilibrium can then be obtained by solving the coupled Hamilton–Jacobi–Bellman (HJB) equations with nonquadratic input energy terms. A novel off-policy reinforcement learning method is presented to obtain the Nash equilibrium solution without requiring the system models, and critic neural networks (NNs) and actor NNs are introduced to implement the method. Theoretical analysis shows that the iterative control laws converge to the Nash equilibrium. Simulation results demonstrate the effectiveness of the presented method.
We investigate a distributed game strategy for unmanned aerial vehicle (UAV) formations subject to external disturbances and obstacles. The strategy is based on a distributed model predictive control (MPC) framework and Levy flight based pigeon inspired optimization (LFPIO). First, we propose a non-singular fast terminal sliding mode observer (NFTSMO) to estimate the influence of a disturbance, and prove that the observer converges in fixed time using a Lyapunov function. Second, we design an obstacle avoidance strategy based on topology reconstruction, by which the UAV can save energy and safely pass obstacles. Third, we establish a distributed MPC framework where each UAV exchanges messages only with its neighbors. We then design a cost function for each UAV, by which the UAV formation problem is transformed into a game problem. Finally, we develop LFPIO and use it to solve for the Nash equilibrium. Numerical simulations are conducted, and the efficiency of LFPIO-based distributed MPC is verified through comparative simulations.
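As a rough illustration of the Levy flight ingredient named above, the following sketch draws heavy-tailed step lengths with Mantegna's algorithm, a common way to approximate Levy-stable steps. It is an assumption that LFPIO uses this particular sampler; the function names `mantegna_sigma` and `levy_step` and the default exponent `beta = 1.5` are illustrative and do not come from the abstract.

```python
import math
import random


def mantegna_sigma(beta):
    """Scale of the numerator Gaussian in Mantegna's algorithm (0 < beta < 2)."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (num / den) ** (1 / beta)


def levy_step(beta=1.5, rng=None):
    """Draw one heavy-tailed step: u / |v|^(1/beta), u ~ N(0, sigma^2), v ~ N(0, 1)."""
    rng = rng or random.Random()
    u = rng.gauss(0.0, mantegna_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against the (practically impossible) zero draw
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


if __name__ == "__main__":
    rng = random.Random(0)
    steps = [levy_step(1.5, rng) for _ in range(5)]
    print(steps)  # mostly small steps, with occasional large jumps
```

In a swarm optimizer, such a step would typically perturb a candidate position so that most moves are local while rare long jumps help escape local optima; how LFPIO combines it with the pigeon-inspired update is not specified here.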
Multi-agent reinforcement learning is difficult to apply in practice, partially because of the gap between simulated and real-world scenarios. One reason for the gap is that simulated systems always assume that agents can work normally all the time, while in practice, one or more agents may unexpectedly “crash” during the coordination process due to inevitable hardware or software failures. Such crashes destroy the cooperation among agents and lead to performance degradation. In this work, we present a formal conceptualization of a cooperative multi-agent reinforcement learning system with unexpected crashes. To enhance the robustness of the system to crashes, we propose a coach-assisted multi-agent reinforcement learning framework that introduces a virtual coach agent to adjust the crash rate during training. We have designed three coaching strategies (fixed crash rate, curriculum learning, and adaptive crash rate) and a re-sampling strategy for our coach agent. To our knowledge, this work is the first to study unexpected crashes in a multi-agent system. Extensive experiments on grid-world and StarCraft II micromanagement tasks demonstrate the efficacy of the adaptive strategy compared with the fixed crash rate strategy and curriculum learning strategy. The ablation study further illustrates the effectiveness of our re-sampling strategy.
This paper studies the multi-agent differential game problem and its application to cooperative synchronization control. A systematic formulation and analysis method for the multi-agent differential game is proposed, and a data-driven methodology based on reinforcement learning (RL) is given. First, it is pointed out that, because of the coupling of networked interactions, typical distributed controllers do not necessarily lead to a global Nash equilibrium of the differential game in general cases. Second, to address this, an alternative local Nash solution is derived by defining the best response concept, and the problem is decomposed into local differential games. An off-policy RL algorithm using neighboring interactive data is constructed to update the controller without requiring a system model, and its stability and robustness properties are proved. Third, to further tackle the dilemma, another differential game configuration is investigated based on modified coupling index functions. In contrast to the previous case, the distributed solution can achieve a global Nash equilibrium while guaranteeing stability. An equivalent parallel RL method is constructed corresponding to this Nash solution. Finally, simulation results illustrate the effectiveness of the learning process and the stability of the synchronization control.
In this study, we solve the finite-time leader-follower consensus problem of discrete-time second-order multi-agent systems (MASs) subject to external disturbances. First, a consensus scheme is designed using a novel adaptive sliding mode control theory. Our adaptive controller is designed using the traditional sliding mode reaching law, and its advantages are chatter reduction and invariance to disturbances. In addition, finite-time stability is demonstrated by constructing a discrete Lyapunov function. Finally, simulation results are presented to verify the validity of our theoretical results.
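To make the reaching-law idea concrete, the sketch below iterates a commonly cited discrete-time reaching law, s(k+1) = (1 - qT) s(k) - εT sgn(s(k)). It is an assumption that this is the "traditional" law the abstract refers to; the adaptive controller itself is not reproduced, and the parameter names `q`, `eps`, and `T` are illustrative.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def reaching_law_trajectory(s0, q, eps, T, steps):
    """Iterate s(k+1) = (1 - q*T) * s(k) - eps*T * sgn(s(k)) from s(0) = s0."""
    if not 0 < 1 - q * T < 1:
        raise ValueError("choose q and T so that 0 < 1 - q*T < 1")
    s = s0
    traj = [s]
    for _ in range(steps):
        s = (1 - q * T) * s - eps * T * sign(s)
        traj.append(s)
    return traj


if __name__ == "__main__":
    traj = reaching_law_trajectory(s0=1.0, q=5.0, eps=0.1, T=0.01, steps=500)
    # After a finite number of steps the sliding variable stays in a narrow
    # band around zero; the switching term makes it zigzag inside that band.
    print(max(abs(x) for x in traj[-50:]))
```

The residual zigzag inside the band is the chatter that the switching term produces, which is the kind of effect the abstract says the adaptive design reduces.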
Cooperative planning is one of the critical problems in the field of multi-agent system gaming. This work focuses on cooperative planning when each agent has only a local observation range and local communication. We propose a novel cooperative planning architecture that combines a graph neural network with a task-oriented knowledge fusion sampling method. Compared with previous work, this paper makes two main contributions: (1) we realize feasible and dynamic fusion of adjacent information using GraphSAGE (i.e., Graph SAmple and aggreGatE), which is the first time this method has been applied to the cooperative planning problem, and (2) we propose a task-oriented sampling method that aggregates the available knowledge from a particular orientation, yielding an effective and stable training process for our model. Experimental results demonstrate the good performance of our proposed method.
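The sample-and-aggregate step behind GraphSAGE can be sketched in a few lines of plain Python. This is a minimal illustration of the generic mean aggregator with uniform neighbor sampling, not the task-oriented sampling method proposed in the paper; the names `sample_neighbors`, `mean_vec`, and `sage_mean_layer`, and the toy graph in the usage section, are all hypothetical.

```python
import random


def sample_neighbors(adj, node, k, rng):
    """Uniformly sample k neighbors (with replacement if fewer than k exist)."""
    nbrs = adj[node]
    if not nbrs:
        return []
    if len(nbrs) >= k:
        return rng.sample(nbrs, k)
    return [rng.choice(nbrs) for _ in range(k)]


def mean_vec(vectors, dim):
    """Element-wise mean of a list of vectors; zeros if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def sage_mean_layer(adj, feats, weight, k, rng):
    """One layer: h_v' = ReLU(W * concat(h_v, mean of sampled neighbor features))."""
    dim = len(next(iter(feats.values())))
    out = {}
    for node, h in feats.items():
        sampled = sample_neighbors(adj, node, k, rng)
        agg = mean_vec([feats[u] for u in sampled], dim)
        z = h + agg  # list concatenation: [self features, aggregated features]
        out[node] = [max(0.0, sum(w * x for w, x in zip(row, z))) for row in weight]
    return out


if __name__ == "__main__":
    adj = {0: [1, 2], 1: [0], 2: [0]}
    feats = {0: [1.0, 0.0], 1: [0.0, 2.0], 2: [0.0, 4.0]}
    identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    print(sage_mean_layer(adj, feats, identity, k=2, rng=random.Random(0)))
```

Because each agent only needs features from a fixed-size sample of its neighbors, this kind of layer fits the local-observation, local-communication setting described above.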
Light field (LF) imaging has attracted attention because of its ability to solve computer vision problems. In this paper, we briefly review the research progress in computer vision in recent years. Among the factors that affect the development of computer vision, the richness and accuracy of visual information acquisition are decisive. LF imaging technology has made great contributions to computer vision because it uses cameras or microlens arrays to record the position and direction information of light rays, acquiring complete three-dimensional (3D) scene information. LF imaging technology improves the accuracy of depth estimation, image segmentation, blending, fusion, and 3D reconstruction. LF has also been innovatively applied to iris and face recognition, identification of materials and fake pedestrians, acquisition of epipolar plane images, shape recovery, and LF microscopy. Here, we further summarize the existing problems and the development trends of LF imaging in computer vision, including the establishment and evaluation of LF datasets; applications under high dynamic range (HDR) conditions; LF image enhancement; virtual reality, 3D display, and 3D movies; military optical camouflage technology; image recognition at the micro-scale; image processing methods based on HDR; and the optimal relationship between spatial resolution and four-dimensional (4D) LF information acquisition. LF imaging has achieved great success in various studies. Over the past 25 years, more than 180 publications have reported the capability of LF imaging in solving computer vision problems. We summarize these reports to make it easier for researchers to find detailed methods for specific solutions.
The surface–volume–surface electric field integral equation (SVS-EFIE) can lead to complex equations, laborious implementation, and unacceptable computational complexity in the method of moments (MoM). Therefore, a general matrix equation (GME) is proposed for electromagnetic scattering from arbitrary metal–dielectric composite objects, and its enhanced solution is presented in this paper. In previous works, the MoM solution formulation of SVS-EFIE was presented only for three-region metal–dielectric composite scatterers, and the two-stage process resulted in two integral operators in SVS-EFIE, a formulation that is arduous to implement and incapable of reducing computational complexity. To address these difficulties, GME, which is versatile for homogeneous objects and composite objects consisting of more than three sub-regions, is proposed for the first time. Accelerated solving policies are proposed for GME based on the coupling degree, which concerns the spacing between sub-regions, and the coupling degree criterion can be set adaptively to balance accuracy and efficiency. In this paper, the reformed addition theorem is applied for the strong coupling case, and the iterative method is presented for the weak coupling case. Parallelism can be easily applied in the enhanced solution. Numerical results demonstrate that the proposed method requires, on average, only 11.6% of the memory and 11.8% of the CPU time of the previous direct solution.
This paper presents a group-based dynamic stuck-at fault diagnosis scheme intended for resistive random-access memory (ReRAM). Traditional static random-access memory, dynamic random-access memory, NAND, and NOR flash memory are limited by their scalability, power, package density, and so forth. Next-generation memory types like ReRAMs are considered to have various advantages such as high package density, non-volatility, scalability, and low power consumption, but cell reliability has been a problem. Unreliable memory operation is caused by permanent stuck-at faults due to extensive use of write- or memory-intensive workloads. An increased number of stuck-at faults also prematurely limits chip lifetime. Therefore, a cellular automaton (CA) based dynamic stuck-at fault-tolerant design is proposed here to combat unreliable cell functioning and variable cell lifetime issues. A scalable, block-level fault diagnosis and recovery scheme is introduced to ensure readable data despite multi-bit stuck-at faults. The scheme is novel in that it aims to remove all restrictions on the number and nature of stuck-at faults under general fault conditions. The proposed scheme is based on Wolfram’s null boundary and periodic boundary CA theory. Various special classes of CAs are introduced for 100% fault tolerance: single-length-cycle single-attractor cellular automata (SACAs), single-length-cycle two-attractor cellular automata (TACAs), and single-length-cycle multiple-attractor cellular automata (MACAs). The target micro-architectural unit is designed with optimal space overhead.
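The CA terminology above can be illustrated with a small sketch of a one-dimensional, two-state, three-neighborhood CA under a null boundary, where cells beyond the ends are fixed at 0. A single-length-cycle attractor is taken here to mean a state that maps to itself in one step; that reading, the per-cell Wolfram rule list, and the function names are assumptions for illustration, not the paper's fault-tolerant design.

```python
from itertools import product


def ca_step(state, rules):
    """Apply one synchronous update; rules[i] is the Wolfram rule number of cell i."""
    n = len(state)
    nxt = []
    for i in range(n):
        left = state[i - 1] if i > 0 else 0       # null boundary on the left
        right = state[i + 1] if i < n - 1 else 0  # null boundary on the right
        idx = (left << 2) | (state[i] << 1) | right
        nxt.append((rules[i] >> idx) & 1)
    return tuple(nxt)


def single_length_cycle_attractors(n, rules):
    """Enumerate all n-cell states s with ca_step(s) == s (brute force, small n only)."""
    return [s for s in product((0, 1), repeat=n) if ca_step(s, rules) == s]


if __name__ == "__main__":
    # Uniform rule 90 (left XOR right) on 3 cells has one self-looping state,
    # while rule 204 (identity) makes every state a self-loop.
    print(single_length_cycle_attractors(3, [90, 90, 90]))
    print(len(single_length_cycle_attractors(3, [204, 204, 204])))
```

Counting such self-looping states is one simple way to distinguish single-, two-, and multiple-attractor behavior for small cell counts; a practical design would rely on the algebraic characterization rather than exhaustive enumeration.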
Self-attention has been innovatively applied to text-to-speech (TTS) because of its parallel structure and superior strength in modeling sequential data. However, when used in end-to-end speech synthesis with an autoregressive decoding scheme, its inference speed becomes relatively low due to the quadratic complexity in sequence length. This problem becomes particularly severe on devices without graphics processing units (GPUs). To alleviate this problem, we propose an efficient decoding self-attention (EDSA) module as an alternative. Combined with a dynamic programming decoding procedure, TTS model inference can be effectively accelerated to have linear computational complexity. We conduct studies on Mandarin and English datasets and find that our proposed model with EDSA can achieve 720% and 50% higher inference speed on the central processing unit (CPU) and GPU, respectively, with almost the same performance. Thus, this method may make the deployment of such models easier when there are limited GPU resources. In addition, our model may perform better than the baseline Transformer TTS on out-of-domain utterances.
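The quadratic cost mentioned above can be made concrete by counting query-key score evaluations during autoregressive decoding. The sketch below is only a counting argument under the assumption that keys are cached, so decoding step t scores the new query against t keys; it does not implement EDSA, whose internals the abstract does not describe, and both function names are illustrative.

```python
def cached_attention_scores(num_steps):
    """Total query-key scores when step t attends over all t cached positions."""
    return sum(t for t in range(1, num_steps + 1))  # equals T * (T + 1) / 2


def constant_step_scores(num_steps, per_step=1):
    """Total work for a hypothetical decoder whose per-step cost does not grow."""
    return num_steps * per_step


if __name__ == "__main__":
    for T in (10, 100, 1000):
        print(T, cached_attention_scores(T), constant_step_scores(T))
```

The first count grows as T squared while the second grows as T, which is the gap a linear-complexity decoding procedure aims to close for long utterances.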