Multiagent reinforcement learning (MARL) has attracted considerable attention within the reinforcement learning community in recent years, demonstrating immense potential across many application scenarios. The reward function directs agents to explore their environments and make optimal decisions within them by establishing evaluation criteria and feedback mechanisms. Concurrently, cooperative objectives at the macro level provide a trajectory for agents' learning, ensuring alignment between individual behavioral strategies and the overarching system goals. The interplay between reward structures and cooperative objectives not only bolsters the effectiveness of individual agents but also fosters interagent collaboration, offering both momentum and direction for the development of swarm intelligence and the harmonious operation of multiagent systems. This review examines methods for designing reward structures and optimizing cooperative objectives in MARL, along with the most recent scientific advances in this field. It also reviews the application of simulation environments in cooperative scenarios and discusses future trends and potential research directions, providing a forward-looking perspective and inspiration for subsequent research.
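As a minimal illustration of the kind of reward structure surveyed in such reviews, the sketch below blends each agent's individual reward with a shared team reward. The function name, mixing coefficient, and example values are illustrative assumptions, not a formulation taken from any particular work.

```python
# Minimal sketch (assumption): one simple way to couple individual rewards
# with a shared cooperative objective is a weighted mixing term.
from typing import List

def mixed_rewards(individual_rewards: List[float],
                  team_reward: float,
                  mix_coefficient: float = 0.5) -> List[float]:
    """Blend each agent's local reward with a shared team reward.

    mix_coefficient = 0 -> fully independent learners;
    mix_coefficient = 1 -> fully shared (cooperative) reward.
    """
    return [(1.0 - mix_coefficient) * r_i + mix_coefficient * team_reward
            for r_i in individual_rewards]

# Example: three agents with local rewards and a global task bonus.
print(mixed_rewards([1.0, 0.2, -0.5], team_reward=2.0, mix_coefficient=0.5))
```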
Autonomous driving systems (ADSs) have attracted wide attention in the machine learning community. With the help of deep neural networks (DNNs), ADSs have shown both satisfactory performance under significant environmental uncertainty and the ability to compensate for system failures without external intervention. However, the vulnerability of ADSs has raised concerns, since DNNs have been proven vulnerable to adversarial attacks. In this paper, we present a comprehensive survey of current physical adversarial vulnerabilities in ADSs. We first divide the physical adversarial attack and defense methods, according to their deployment restrictions, into three scenarios: real-world, simulator-based, and digital-world. Then, we consider the adversarial vulnerabilities targeting the various sensors in ADSs and categorize them as camera-based, light detection and ranging (LiDAR)-based, and multifusion-based attacks. Subsequently, we divide the attack tasks by traffic element. For the physical defenses, we establish a taxonomy covering input image preprocessing, adversarial example detection, and model enhancement for the DNN models to achieve full coverage of the adversarial defenses. Based on this survey, we finally discuss the challenges in this research field and provide an outlook on future directions.
Event extraction (EE) is a complex natural language processing (NLP) task that aims to identify and classify triggers and arguments in raw text. The polysemy of triggers and arguments stands out as one of the key challenges affecting the precise extraction of events. Existing approaches commonly assume the semantic distribution of triggers and arguments to be balanced. However, the sample quantities of different semantics in the same trigger or argument vary in real-world scenarios, leading to a biased semantic distribution. The bias introduces two challenges: (1) low-frequency semantics are difficult to identify, and (2) high-frequency semantics are often mistakenly identified. To tackle these challenges, we propose an adaptive learning method with a reward-penalty mechanism for balancing the semantic distribution in polysemous triggers and arguments. The reward-penalty mechanism balances the semantic distribution by enlarging the gap between the target and nontarget semantics, rewarding correct classifications and penalizing incorrect ones. Additionally, we propose a sentence-level event situation awareness (SA) mechanism to guide the encoder to accurately learn the knowledge of events mentioned in the sentence, thereby enhancing the target event semantics in the distribution of polysemous triggers and arguments. Finally, for the various semantics in different tasks, we propose task-specific semantic decoders to precisely identify the boundaries of the predicted triggers and arguments for the corresponding semantics. Our experimental results on ACE2005 and its variants, along with the Rich Entities, Relations, and Events (Rich ERE) dataset, demonstrate the superiority of our approach over single-task and multi-task EE baselines.
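To illustrate how a reward-penalty weighting can enlarge the gap between target and nontarget semantics, the sketch below scales a per-token cross-entropy loss up for incorrect classifications and down for correct ones. The reward and penalty factors and the exact weighting scheme are assumptions for illustration, not the authors' formulation.

```python
# Minimal sketch (assumption, not the paper's exact mechanism): a
# reward-penalty weighted cross-entropy. Correct predictions are rewarded by
# down-weighting their loss; incorrect predictions are penalized by
# up-weighting theirs, pushing the model to widen the target/nontarget gap.
import torch
import torch.nn.functional as F

def reward_penalty_loss(logits: torch.Tensor,
                        targets: torch.Tensor,
                        reward: float = 1.2,
                        penalty: float = 1.2) -> torch.Tensor:
    pred = logits.argmax(dim=-1)
    correct = pred == targets
    base = F.cross_entropy(logits, targets, reduction="none")
    weights = torch.where(correct,
                          torch.full_like(base, 1.0 / reward),   # reward
                          torch.full_like(base, penalty))        # penalty
    return (weights * base).mean()

logits = torch.randn(4, 6)            # 4 tokens, 6 candidate semantics
targets = torch.tensor([0, 2, 5, 1])
print(reward_penalty_loss(logits, targets))
```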
Semi-supervised sound event detection (SSED) tasks typically leverage a large amount of unlabeled and synthetic data to facilitate model generalization during training, reducing overfitting on a limited set of labeled data. However, the generalization training process often encounters noisy interference introduced by pseudo-labels or domain knowledge gaps. To alleviate noisy interference in class distribution learning, we propose an efficient semi-supervised class distribution learning method through dynamic prompt tuning, named prompting class distribution optimization (PADO). Specifically, when modeling real labeled data, PADO dynamically incorporates independent learnable prompt tokens to explore prior knowledge about the true distribution. This prior knowledge then serves as prompt information that dynamically interacts with the posterior noisy class distribution information. In this way, PADO achieves class distribution optimization while maintaining model generalization, leading to a significant improvement in the efficiency of class distribution learning. Compared with state-of-the-art methods on the SSED datasets from the DCASE 2019, 2020, and 2021 challenges, PADO achieves significant performance improvements. Furthermore, it is readily extendable to other benchmark models.
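A minimal sketch of the dynamic-prompt idea follows, assuming learnable prompt tokens prepended to the frame embeddings of a transformer-style SSED encoder so that they interact with the frames through attention. The module names, dimensions, and interaction mechanism are illustrative assumptions, not PADO's actual implementation.

```python
# Minimal sketch (assumption): learnable prompt tokens prepended to frame
# embeddings of a small transformer encoder for frame-level classification.
import torch
import torch.nn as nn

class PromptedEncoder(nn.Module):
    def __init__(self, dim: int = 128, n_prompts: int = 8, n_classes: int = 10):
        super().__init__()
        self.prompts = nn.Parameter(torch.randn(n_prompts, dim) * 0.02)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, dim); prompts interact with frames via attention.
        b = frames.size(0)
        prompts = self.prompts.unsqueeze(0).expand(b, -1, -1)
        x = self.encoder(torch.cat([prompts, frames], dim=1))
        return self.head(x[:, self.prompts.size(0):])   # frame-level logits

model = PromptedEncoder()
print(model(torch.randn(2, 50, 128)).shape)  # torch.Size([2, 50, 10])
```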
In the fiercely competitive landscape of product-oriented operating systems, including the Internet of Things (IoT), efficiently managing a substantial stream of real-time tasks that coexist with resource-intensive user applications on constrained embedded hardware presents a significant challenge. Bridging the gap between embedded and general-purpose operating systems, we introduce XIRAC, an optimized operating system shaped by information-theoretic principles. XIRAC leverages Shannon's information theory to regulate processor workloads, minimize context switches, and preempt processes by maximizing system entropy tolerance. Unlike prior approaches that apply information theory to task priority alignment, the proposed method integrates maximum entropy into the core of the real-time operating system (RTOS) and its scheduling algorithms. Subsequently, we optimize numerous system parameters by shifting commonly used, unlimited tasks from the application layer into the kernel and integrating them there. We describe the advantages of this architectural shift, including improved system performance, scalability, and adaptability. A new application-programming paradigm, termed "object-emulated programming," has emerged from this integration. Practical implementations of XIRAC in diverse products have revealed additional benefits, including reduced learning curves, elimination of dependencies on library functions and threading, optimized use of chip capabilities, and increased competitiveness in product development. We provide a comprehensive explanation of these benefits and explore their impact through real-world use cases and practical applications.
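As a toy illustration of entropy-guided workload regulation (an assumption about the general idea, not XIRAC's actual scheduler), the sketch below places a new task on the core that keeps the per-core load distribution closest to uniform, i.e., maximizes its Shannon entropy. All names and values are illustrative.

```python
# Minimal sketch (assumption): Shannon entropy of the per-core load
# distribution as a balance criterion for task placement.
import math
from typing import List

def load_entropy(loads: List[float]) -> float:
    total = sum(loads)
    if total == 0:
        return 0.0
    probs = [l / total for l in loads if l > 0]
    return -sum(p * math.log2(p) for p in probs)

def place_task(core_loads: List[float], task_cost: float) -> int:
    """Return the core index whose assignment maximizes load entropy."""
    best_core, best_entropy = 0, -1.0
    for i in range(len(core_loads)):
        trial = core_loads.copy()
        trial[i] += task_cost              # tentatively place the task here
        h = load_entropy(trial)
        if h > best_entropy:
            best_core, best_entropy = i, h
    return best_core

loads = [3.0, 1.0, 0.5]
print(place_task(loads, task_cost=2.0))    # -> 2 (the least-loaded core)
```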
Sparsity-based joint active user detection and channel estimation (JADCE) algorithms are crucial in grant-free massive machine-type communication (mMTC) systems. Conventional compressed sensing algorithms are tailored for noncoherent communication systems, where the correlation between any two measurements is kept as small as possible. However, existing sparsity-based JADCE approaches may not achieve optimal performance in strongly coherent systems, especially with a small number of pilot subcarriers. To tackle this challenge, we formulate JADCE as a joint sparse signal recovery problem, leveraging the block-type row-sparse structure of millimeter-wave (mmWave) channels in massive multiple-input multiple-output orthogonal frequency-division multiplexing (MIMO-OFDM) systems. We then propose an efficient difference-of-convex function algorithm (DCA) based JADCE algorithm within the multiple measurement vector (MMV) framework, promoting the row-sparsity of the channel matrix. To further reduce the computational complexity, we introduce a fast DCA-based JADCE algorithm via a proximal operator, which allows a low-complexity alternating direction method of multipliers (ADMM) to solve the optimization problem directly. Finally, simulation results demonstrate that the two proposed difference-of-convex (DC) algorithms achieve effective active user detection and accurate channel estimation compared with state-of-the-art compressed-sensing-based JADCE techniques.
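For illustration, one common difference-of-convex surrogate for row-sparsity in the MMV setting is the l2,1 norm minus the Frobenius norm. The formulation below is an assumption sketching the type of problem involved, not the paper's exact objective; the symbols Y, A, X, and lambda are illustrative.

```latex
% Illustrative only: Y is the received pilot signal, A the known pilot
% (measurement) matrix, X the row-sparse channel matrix whose nonzero rows
% indicate active users, and lambda > 0 a regularization weight.
\[
  \min_{\mathbf{X}} \ \tfrac{1}{2}\,\lVert \mathbf{Y}-\mathbf{A}\mathbf{X} \rVert_F^2
  + \lambda \bigl( \lVert \mathbf{X} \rVert_{2,1} - \lVert \mathbf{X} \rVert_F \bigr),
  \qquad
  \lVert \mathbf{X} \rVert_{2,1} = \sum_{n} \lVert \mathbf{x}_{n,:} \rVert_2 .
\]
```

The second term is concave, so the objective is a difference of convex functions and can be handled by DCA-type linearization of the concave part at each iteration.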
Transformer models have become a cornerstone of various natural language processing (NLP) tasks. However, the substantial computational overhead during inference remains a significant challenge, limiting their deployment in practical applications. In this study, we address this challenge by minimizing the inference overhead of transformer models using the controlling element of artificial intelligence (AI) accelerators. Our work is anchored by four key contributions. First, we conduct a comprehensive analysis of the overhead composition within the transformer inference process, identifying the primary bottlenecks. Second, we leverage the management processing element (MPE) of the Shenwei AI (SWAI) accelerator, implementing a three-tier scheduling framework that reduces the number of host-device launches to approximately 1/10 000 of that in the original PyTorch-GPU setup. Third, we introduce a zero-copy memory management technique based on segment-page fusion, which significantly reduces memory access latency and improves overall inference efficiency. Finally, we develop a fast model loading method that eliminates redundant computations during model verification and initialization, reducing the total loading time for large models from 22 128.31 ms to 1041.72 ms. Together, these contributions enable more efficient and expedited transformer inference on AI accelerators.
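The following sketch illustrates, in purely schematic Python, why recording an operator schedule once and replaying it in a single submission reduces the host-side launch count. The class and method names are hypothetical and do not correspond to the SWAI or MPE APIs; only the launch-count arithmetic is the point.

```python
# Schematic sketch only (all names hypothetical): recording an operator
# sequence once and replaying it in one submission replaces one host-device
# launch per operator with one launch per schedule.
from typing import Callable, List

class RecordedSchedule:
    """Toy stand-in for a device-side operator schedule."""
    def __init__(self) -> None:
        self.ops: List[Callable[[], None]] = []

    def record(self, op: Callable[[], None]) -> None:
        self.ops.append(op)      # captured once on the host

    def replay(self) -> int:
        for op in self.ops:      # all operators run from one submission
            op()
        return 1                 # a single host-device launch

ops = [lambda: None for _ in range(1000)]   # stand-ins for 1000 kernels

# Naive execution: one host-device launch per operator.
naive_launches = 0
for op in ops:
    op()
    naive_launches += 1

# Scheduled execution: record once, replay with one launch.
schedule = RecordedSchedule()
for op in ops:
    schedule.record(op)
batched_launches = schedule.replay()

print(naive_launches, batched_launches)     # 1000 vs 1
```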
With the evolution of fifth-generation (5G) and sixth-generation (6G) wireless communication technologies, various Internet of Things (IoT) devices and artificial intelligence applications are proliferating, putting enormous pressure on existing computing power networks. Unmanned aerial vehicle (UAV)-enabled mobile edge computing (U-MEC) shows potential to alleviate this pressure and has been recognized as a new paradigm for responding to the data explosion. Nevertheless, the conflict between growing computing demands and resource-constrained UAVs poses a great challenge. Recently, researchers have proposed resource management solutions in U-MEC for computing tasks with dependencies. However, the repeatability among these tasks has been ignored. In this paper, considering both repeatability and dependency, we propose a U-MEC paradigm based on a computing power pool for processing computationally intensive tasks, in which UAVs can share information and computing resources. To ensure the effectiveness of computing power pool construction, we formulate the problem of balancing the energy consumption of UAVs through joint optimization of the offloading strategy, task scheduling, and resource allocation. To address this NP-hard problem, we adopt a two-stage alternating optimization algorithm based on successive convex approximation (SCA) and an improved genetic algorithm (GA). Simulation results show that the proposed scheme reduces time consumption by 18.41% and energy consumption by 21.68% on average, thereby improving the working efficiency of UAVs.
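One common way to formalize "balancing" energy consumption is a min-max objective over the joint decision variables. The formulation below is an illustrative assumption, not the paper's exact model; the symbols a, s, f, and E_m are introduced only for this sketch.

```latex
% Illustrative only: a is the offloading decision, s the task schedule, f the
% resource allocation, M the set of UAVs, and E_m the energy consumed by UAV m.
\[
  \min_{\mathbf{a},\,\mathbf{s},\,\mathbf{f}} \ \max_{m \in \mathcal{M}} \ E_m(\mathbf{a},\mathbf{s},\mathbf{f})
  \quad \text{s.t.} \quad \text{task-dependency, deadline, and resource constraints.}
\]
```

A two-stage alternation then typically fixes the discrete variables (e.g., offloading and scheduling, handled here by the improved GA) while the continuous resource allocation is updated via SCA, and vice versa, until convergence.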
For near-field multiuser communications based on hybrid beamforming (HBF) architectures, high-quality effective channel estimation is required to obtain the channel state information (CSI) for the design of the digital beamformer. To simplify the system reconfiguration and eliminate the pilot overhead required by effective channel estimation, we consider an analog-only beamforming (AoBF) architecture in this study. The AoBF design aims to maximize the sum rate, which we transform into the problem of maximizing the power transmitted to the target user equipment (UE) while minimizing the power leaked to the other UEs. To solve this problem, we exploit beam focusing and beam nulling and propose two AoBF schemes based on the majorization-minimization algorithm. First, we propose an AoBF scheme based on perfect CSI, focusing on beamforming performance without regard to CSI acquisition. Then, we propose an AoBF scheme based on imperfect CSI, where low-dimensional imperfect CSI is obtained by beam sweeping with a near-field codebook. Simulation results demonstrate that the two AoBF schemes approach HBF schemes in terms of the sum rate and outperform them in terms of energy efficiency.
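An illustrative form of the resulting design problem (an assumption, not the paper's exact objective) trades focused power against leakage under unit-modulus constraints on the analog weights:

```latex
% Illustrative only: w is the analog beamforming vector with unit-modulus
% entries, a_k the near-field array response of the target UE k, and rho >= 0
% a weight on the power leaked to the other UEs.
\[
  \max_{\mathbf{w}} \ \lvert \mathbf{a}_k^{H}\mathbf{w} \rvert^2
  \;-\; \rho \sum_{j \neq k} \lvert \mathbf{a}_j^{H}\mathbf{w} \rvert^2
  \quad \text{s.t.} \quad \lvert w_n \rvert = 1,\ n = 1,\dots,N .
\]
```

Such nonconvex unit-modulus problems are commonly handled with majorization-minimization, replacing the objective at each iteration with a tractable surrogate that is tight at the current iterate.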
A novel approach to widening the active reflection coefficient (ARC) bandwidth of an antenna array by employing a parasitic coupling network (PCN) is investigated in this article. Unlike traditional tightly coupled arrays, which adopt spatial structures to enhance the coupling among balanced-excitation antennas, the PCN, derived from rigorous formulas, is employed in the feed lines of unbalanced-excitation antennas. Based on network analysis, the mutual coupling utilization condition for an (M×N)-element antenna array is first deduced, and the PCN is implemented. Then, PCNs are realized by introducing a parasitic element and a coupling network into the two-element H-plane and E-plane dual-layer coupled microstrip antenna arrays, resulting in 10.9% and 30.8% bandwidth enhancements compared with the original arrays, respectively. Moreover, the PCNs are further extended to multielement antenna arrays, including three- and five-element one-dimensional arrays and an 8×2 two-dimensional array, exhibiting approximately 40% overlapped ARC bandwidths with normal radiation patterns, steady gains, and applicable scanning characteristics. These results indicate the PCN's potential for application in large-scale wideband arrays.