Noncooperative computer systems and network confrontation present a core challenge in cyberspace security. Traditional cybersecurity technologies predominantly rely on passive response mechanisms, which exhibit significant limitations when addressing real-world complex and unknown threats. This paper introduces the concept of “active cybersecurity,” aiming to enhance network security not only through technical measures but also by leveraging strategy-level defenses. The core assumption of this concept is that attackers and defenders, in the context of network confrontations, act as rational decision-makers seeking to maximize their respective objectives. Building on this assumption, this paper integrates game theory to analyze the interdependent relationships between attackers and defenders, thereby optimizing their strategies. Guided by this foundational idea, we propose an active cybersecurity model, termed SAPC, that combines intelligent threat sensing, in-depth behavior analysis, comprehensive path profiling, and dynamic countermeasures, and is designed to foster an integrated defense capability encompassing threat perception, analysis, tracing, and response. At its core, SAPC incorporates theoretical analyses of adversarial behavior and the optimization of corresponding strategies informed by game theory. By profiling adversaries and modeling confrontation as a “game,” the model establishes a comprehensive framework that provides both theoretical insights into and practical guidance for cybersecurity. The proposed active cybersecurity model marks a transformative shift from passive defense to proactive perception and confrontation. It facilitates the evolution of cybersecurity technologies toward a new paradigm characterized by active prediction, prevention, and strategic guidance.
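As a toy illustration of the game-theoretic core of this kind of model, the sketch below solves a small zero-sum attacker-defender game for the defender's optimal mixed strategy via linear programming. The strategies and payoff values are invented for illustration and are not taken from the paper.

```python
# A minimal sketch of game-theoretic attacker-defender analysis.
# Payoff values are illustrative assumptions, not figures from the paper.
import numpy as np
from scipy.optimize import linprog

# Defender payoff matrix (rows: defender strategies, cols: attacker strategies)
# in a zero-sum abstraction of the confrontation "game".
A = np.array([
    [ 3, -1,  0],   # harden perimeter
    [ 1,  2, -2],   # deploy deception
    [-1,  0,  4],   # active tracing
], dtype=float)

m, n = A.shape
# Maximin LP: maximize v subject to A^T x >= v, sum(x) = 1, x >= 0.
# Variables: [x_1..x_m, v]; linprog minimizes, so minimize -v.
c = np.zeros(m + 1); c[-1] = -1.0
A_ub = np.hstack([-A.T, np.ones((n, 1))])   # v - (A^T x)_j <= 0
b_ub = np.zeros(n)
A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])
b_eq = np.array([1.0])
bounds = [(0, None)] * m + [(None, None)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
x, v = res.x[:m], res.x[-1]
print("defender mixed strategy:", np.round(x, 3), "game value:", round(v, 3))
```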
Mimic active defense technology effectively disrupts attack routes and reduces the probability of successful attacks by using a dynamic heterogeneous redundancy (DHR) architecture. However, current approaches often overlook the adaptability of the adjudication mechanism in complex and variable network environments, focusing primarily on system security while neglecting performance considerations. To address these limitations, we propose a DHR architecture based on output difference feedback and system benefit control. This architecture introduces an adjudication mechanism based on output difference feedback, which enhances adaptability by considering the impact of each executor's output deviation on the global decision. Additionally, the architecture incorporates a scheduling strategy based on system benefit, which models the quality of service and switching overhead as a bi-objective optimization problem, balancing security with reduced computational costs and system overhead. Simulation results demonstrate that our architecture improves adaptability to different network environments and effectively reduces both the attack success rate and average failure rate.
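A minimal sketch of the system-benefit scheduling idea, treating executor selection as a trade-off between quality of service and switching overhead under a DHR heterogeneity constraint. The executor scores, weights, and benefit function are illustrative assumptions, not the paper's exact model.

```python
# Choose the next executor set by trading off QoS against switching overhead,
# rejecting candidate sets that lack DHR heterogeneity (all-distinct variants).
from itertools import combinations

executors = {
    "e1": {"qos": 0.90, "variant": "A"},
    "e2": {"qos": 0.80, "variant": "B"},
    "e3": {"qos": 0.85, "variant": "C"},
    "e4": {"qos": 0.70, "variant": "A"},
}
current = {"e1", "e2", "e4"}
W_QOS, W_SWITCH = 1.0, 0.4   # assumed weights for the two objectives

def benefit(cand):
    qos = sum(executors[e]["qos"] for e in cand) / len(cand)
    switch_cost = len(cand - current) / len(cand)       # fraction replaced
    variants = {executors[e]["variant"] for e in cand}  # DHR heterogeneity
    if len(variants) < len(cand):                       # require distinct variants
        return float("-inf")
    return W_QOS * qos - W_SWITCH * switch_cost

best = max((set(c) for c in combinations(executors, 3)), key=benefit)
print("next executor set:", sorted(best), "benefit:", round(benefit(best), 3))
```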
Temporal attention mechanisms are essential for video action recognition, enabling models to focus on semantically informative moments. However, these models frequently exhibit temporal infidelity, i.e., misaligned attention weights caused by limited training diversity and the absence of fine-grained temporal supervision. While video-level labels provide coarse-grained action guidance, the lack of detailed constraints allows attention noise to persist, especially in complex scenarios with distracting spatial elements. To address this issue, we propose temporal fidelity enhancement (TFE), a competitive learning paradigm based on the disentangled information bottleneck (DisenIB) theory. TFE mitigates temporal infidelity by decoupling action-relevant semantics from spurious correlations through adversarial feature disentanglement. Using pre-trained representations for initialization, TFE establishes an adversarial process in which segments with elevated temporal attention compete against contexts with diminished action relevance. This mechanism ensures temporal consistency and enhances the fidelity of attention patterns without requiring explicit fine-grained supervision. Extensive experiments on the UCF-101, HMDB-51, and Charades benchmarks validate the effectiveness of our method, with significant improvements in action recognition accuracy.
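To make the competitive mechanism concrete, the sketch below pits high-attention segments (which must support the action label) against low-attention context (pushed toward an uninformative prediction). The loss form, top-k selection, and weighting are our assumptions rather than the exact TFE objective.

```python
# A PyTorch sketch of a competition between high- and low-attention segments.
import torch
import torch.nn.functional as F

def competitive_loss(seg_feats, attn, labels, classifier, k=4):
    # seg_feats: (B, T, D) per-segment features; attn: (B, T) attention weights
    top = attn.topk(k, dim=1).indices            # most action-relevant segments
    bot = (-attn).topk(k, dim=1).indices         # least relevant context
    gather = lambda idx: seg_feats.gather(
        1, idx.unsqueeze(-1).expand(-1, -1, seg_feats.size(-1))).mean(1)
    logits_top = classifier(gather(top))
    logits_bot = classifier(gather(bot))
    ce = F.cross_entropy(logits_top, labels)     # attended segments: predict action
    uniform = torch.full_like(logits_bot, 1.0 / logits_bot.size(-1))
    kl = F.kl_div(F.log_softmax(logits_bot, -1), uniform, reduction="batchmean")
    return ce + 0.1 * kl                         # assumed weighting

B, T, D, C = 2, 16, 256, 101
classifier = torch.nn.Linear(D, C)
loss = competitive_loss(torch.randn(B, T, D), torch.rand(B, T),
                        torch.randint(0, C, (B,)), classifier)
loss.backward()
```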
Diffusion tensor imaging (DTI) is a widely used imaging technique for mapping the microstructure and structural connectivity of living human brain tissue. Recently, deep learning methods have been proposed to rapidly estimate diffusion tensors (DTs) using only a small number of diffusion-weighted (DW) images. However, these methods typically use DW images obtained with fixed q-space sampling schemes as training data, limiting their application scenarios. To address this issue, we develop a new deep neural network called q-space-coordinate-guided diffusion tensor imaging (QCG-DTI), which can efficiently and accurately estimate DTs under flexible q-space sampling schemes. First, we propose a q-space-coordinate-embedded feature consistency strategy to ensure the correspondence between q-space coordinates and their respective DW images. Second, a q-space-coordinate fusion (QCF) module is introduced, which efficiently embeds q-space coordinates into multiscale features of the corresponding DW images by linearly adjusting the feature maps along the channel dimension, thus eliminating the dependence on fixed diffusion sampling schemes. Finally, a multiscale feature residual dense (MRD) module is proposed, which enhances the network's feature extraction and image reconstruction capabilities by using dual-branch convolutions with different kernel sizes to extract features at different scales. Compared to state-of-the-art methods that rely on a fixed sampling scheme, the proposed network can obtain high-quality diffusion tensors and derived parameters even from DW images acquired with flexible q-space sampling schemes. Compared to state-of-the-art deep learning methods, QCG-DTI reduces the mean absolute error by approximately 15% on fractional anisotropy and around 25% on mean diffusivity.
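The QCF description, linearly adjusting feature maps along the channel dimension conditioned on q-space coordinates, suggests a FiLM-style modulation; a minimal PyTorch sketch under that assumption follows, with illustrative layer sizes.

```python
# FiLM-style channel-wise modulation conditioned on q-space coordinates:
# a small MLP maps the coordinate vector to per-channel scale and shift.
import torch
import torch.nn as nn

class QSpaceFiLM(nn.Module):
    def __init__(self, coord_dim=3, channels=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(coord_dim, 128), nn.ReLU(),
            nn.Linear(128, 2 * channels),      # per-channel (gamma, beta)
        )

    def forward(self, feats, coords):
        # feats: (B, C, H, W) features of one DW image;
        # coords: (B, 3) its q-space direction/coordinate vector.
        gamma, beta = self.mlp(coords).chunk(2, dim=-1)
        gamma = gamma.unsqueeze(-1).unsqueeze(-1)
        beta = beta.unsqueeze(-1).unsqueeze(-1)
        return (1 + gamma) * feats + beta      # linear adjustment along channels

film = QSpaceFiLM()
out = film(torch.randn(2, 64, 96, 96), torch.randn(2, 3))
print(out.shape)   # torch.Size([2, 64, 96, 96])
```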
End-to-end object detection methods have attracted extensive interest recently since they alleviate the need for complicated human-designed components and simplify the detection pipeline. However, these methods suffer from slower training convergence and inferior detection performance compared to conventional detectors, as their feature fusion and selection processes are constrained by insufficient positive supervision. To address this issue, we introduce a novel query-selection encoder (QSE) designed for end-to-end object detectors to improve the training convergence speed and detection accuracy. QSE is composed of multiple encoder layers stacked on top of the backbone. A lightweight head network is added after each encoder layer to continuously optimize features in a cascading manner, providing more positive supervision for efficient training. Additionally, a hierarchical feature-aware attention (HFA) mechanism is incorporated in each encoder layer, including in- and cross-level feature attention, to enhance the interaction between features from different levels. HFA can effectively suppress similar feature representations and highlight discriminative ones, thereby accelerating the feature selection process. Our method is highly versatile in accommodating both CNN- and Transformer-based detectors. Extensive experiments were conducted on the popular benchmark datasets MS COCO, CrowdHuman, and PASCAL VOC to demonstrate the effectiveness of our method. The results showed that CNN- and Transformer-based detectors using QSE can achieve better end-to-end performance within fewer training epochs.
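A rough PyTorch sketch of the hierarchical feature-aware attention idea: self-attention within each pyramid level (in-level), followed by cross-attention to a neighboring level (cross-level). The dimensions and single-neighbor wiring are our assumptions, not the paper's exact design.

```python
# In-level self-attention plus cross-level attention over flattened
# feature-pyramid levels, so features interact across scales.
import torch
import torch.nn as nn

class HFABlock(nn.Module):
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.in_level = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross_level = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, levels):
        # levels: list of (B, N_l, D) flattened feature maps, coarse to fine.
        levels = [self.norm1(x + self.in_level(x, x, x)[0]) for x in levels]
        out = []
        for i, x in enumerate(levels):
            nbr = levels[i - 1] if i > 0 else levels[i + 1]  # adjacent level
            out.append(self.norm2(x + self.cross_level(x, nbr, nbr)[0]))
        return out

feats = [torch.randn(2, n, 256) for n in (100, 400, 1600)]
outs = HFABlock()(feats)
print([o.shape for o in outs])
```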
Large language models (LLMs) exhibit remarkable capabilities in various natural language processing tasks, such as machine translation. However, the large number of LLM parameters incurs significant costs during inference. Previous studies have attempted to train translation-tailored LLMs of moderate size by fine-tuning them on translation data. Nevertheless, when performing translations in zero-shot directions that are absent from the fine-tuning data, the problem of ignoring instructions and thus producing translations in the wrong language (i.e., the off-target translation issue) remains unresolved. In this work, we design a two-stage fine-tuning algorithm to improve the instruction-following ability of translation-tailored LLMs, particularly for maintaining accurate translation directions. We first fine-tune LLMs on the translation data to elicit basic translation capabilities. In the second stage, we construct instruction-conflicting samples by randomly replacing the instructions with incorrect ones. Then, we introduce an extra unlikelihood loss to reduce the probability assigned to those samples. Experiments on two benchmarks using the LLaMA 2 and LLaMA 3 models, spanning 16 zero-shot directions, demonstrate that, compared to the competitive baseline of translation-fine-tuned LLaMA, our method effectively reduces the off-target translation ratio (by up to 62.4 percentage points), thus improving translation quality (by up to 9.7 bilingual evaluation understudy (BLEU) points). Analysis shows that our method can preserve the model's performance on other tasks, such as supervised translation and general tasks. Code is released at https://github.com/alphadl/LanguageAware_Tuning.
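The unlikelihood term has a standard form, which the sketch below applies to an instruction-conflicting sample: it penalizes probability mass the model still assigns to the target tokens under the wrong instruction. Shapes and masking details are illustrative assumptions.

```python
# Token-level unlikelihood loss: L = -log(1 - p(token)) on the targets of an
# instruction-conflicting sample, lowering their assigned probability.
import torch
import torch.nn.functional as F

def unlikelihood_loss(logits, target_ids, pad_id=0):
    # logits: (B, T, V) from the LLM on an instruction-conflicting sample;
    # target_ids: (B, T) the target tokens under the (wrong) instruction.
    logp = F.log_softmax(logits, dim=-1)
    tok_logp = logp.gather(-1, target_ids.unsqueeze(-1)).squeeze(-1)  # (B, T)
    one_minus_p = (1.0 - tok_logp.exp()).clamp(min=1e-6)  # numerical stability
    mask = (target_ids != pad_id).float()
    return -(one_minus_p.log() * mask).sum() / mask.sum()

loss = unlikelihood_loss(torch.randn(2, 8, 32000), torch.randint(1, 32000, (2, 8)))
print(loss.item())
```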
Effectively tuning the parameters of proportional-integral-derivative (PID) controllers has persistently posed a challenge in control engineering. This study proposes enhanced hippopotamus optimization (EHO) to address this challenge. Latin hypercube sampling and adaptive lens reverse learning are used to initialize the population, improving population diversity and enhancing global search. Additionally, an adaptive perturbation mechanism is introduced into the position update in the exploration phase. To validate the performance of EHO, it is benchmarked against hippopotamus optimization and four classical or state-of-the-art intelligent algorithms on the CEC2022 test suite. The effectiveness of EHO is further evaluated by applying it to tuning PID controllers for different types of systems. The performance of EHO is compared with five other algorithms and the classical Ziegler-Nichols method. Analysis of convergence curves, step responses, box plots, and radar charts indicates that EHO outperforms the compared methods in accuracy, convergence speed, and stability. Finally, EHO is used to tune the cascade PID controller for trajectory tracking in a quadrotor unmanned aerial vehicle to assess its applicability. The simulation results indicate that the integral of time-weighted absolute error (ITAE) values for the position channels (x, y, z), when the system is optimized using EHO over an 80 s runtime, are 59.979, 22.162, and 0.017, respectively. These values are notably lower than those obtained by the original hippopotamus optimization and manual parameter adjustment.
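A minimal numpy sketch of the two initialization ingredients named above: Latin hypercube sampling for stratified coverage, followed by a lens-imaging (reverse learning) opposition step that keeps the better of each candidate pair. The exact operators in EHO may differ; this shows the general mechanism.

```python
# Latin hypercube initialization plus a lens-imaging opposition refinement.
import numpy as np

def lhs_init(pop_size, dim, lb, ub, rng=np.random.default_rng(0)):
    # One stratified sample per individual and per dimension.
    u = (rng.permuted(np.tile(np.arange(pop_size), (dim, 1)), axis=1).T
         + rng.random((pop_size, dim))) / pop_size
    return lb + u * (ub - lb)

def lens_reverse(pop, lb, ub, k=2.0):
    # Lens-imaging opposition: reflect each candidate about the domain center.
    return (lb + ub) / 2 + (lb + ub) / (2 * k) - pop / k

lb, ub = np.array([-5.0] * 4), np.array([5.0] * 4)
pop = lhs_init(20, 4, lb, ub)
opp = np.clip(lens_reverse(pop, lb, ub), lb, ub)
f = lambda X: (X ** 2).sum(axis=1)              # sphere as a stand-in objective
pop = np.where((f(opp) < f(pop))[:, None], opp, pop)  # keep better of each pair
print(pop.shape)
```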
Federated learning (FL) has emerged as a machine learning setting that enables collaborative training of deep models on decentralized clients under privacy constraints. In the vanilla federated averaging algorithm (FedAvg), the global model is generated by a weighted linear combination of local models, with weights proportional to the local data sizes. This methodology, however, encounters challenges when facing heterogeneous and unknown client data distributions, often leading to discrepancies from the intended global objective. The linear combination-based aggregation often fails to address the varied dynamics presented by the diverse scenarios, settings, and data distributions inherent in FL, resulting in hindered convergence and compromised generalization. In this paper, we present a new aggregation method, FedMcon, within a meta-learning framework for FL. We introduce a learnable controller, trained on a small proxy dataset, that serves as an aggregator and learns how to adaptively aggregate heterogeneous local models into a better global model toward the desired objective. The experimental results indicate that the proposed method is effective on extremely non-independent and identically distributed data, achieving up to a 19-fold communication speedup in a single FL setting.
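A minimal PyTorch sketch of the learnable-aggregator idea: a small controller maps per-client statistics to aggregation weights, replacing FedAvg's fixed data-size weighting. The controller's inputs and architecture are assumptions for illustration; in the paper's setup it would be trained end-to-end by evaluating the aggregated model on the proxy dataset.

```python
# A learnable controller produces client weights; local models are then
# combined with those weights instead of data-size proportions.
import torch
import torch.nn as nn

class AggregationController(nn.Module):
    def __init__(self, feat_dim=2):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim, 16), nn.ReLU(),
                                 nn.Linear(16, 1))

    def forward(self, client_stats):
        # client_stats: (K, feat_dim), e.g. local loss and update norm per client.
        return torch.softmax(self.net(client_stats).squeeze(-1), dim=0)  # (K,)

def aggregate(local_models, weights):
    global_sd = {}
    for name in local_models[0].state_dict():
        global_sd[name] = sum(w * m.state_dict()[name]
                              for w, m in zip(weights, local_models))
    return global_sd

# Usage: compute weights, aggregate, then backpropagate the proxy-set loss of
# the aggregated model into the controller to train it.
clients = [nn.Linear(10, 2) for _ in range(5)]
controller = AggregationController()
sd = aggregate(clients, controller(torch.randn(5, 2)))
print(sd["weight"].shape)
```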
Digital simulation of the full operation of a remotely operated vehicle (ROV) is an economically feasible way to pretest algorithms and train operators prior to actual underwater tasks, given the difficulty, high equipment cost, and time-consuming nature of underwater testing. In this paper, a human-interactive digital simulation platform is established for the navigation, motion, and teleoperated manipulation of work-class ROVs, providing the human operator with a visualization of the full operation process. Specifically, two mechanisms are presented in this platform: one provides a virtual simulation platform for operator training; the other provides real-time visual and force feedback when implementing actual tasks. Moreover, an open data interface is designed so that researchers can pretest various algorithms before implementing actual underwater tasks. Additionally, typical underwater ROV scenarios, including underwater sediment sampling and pipeline docking tasks, are selected as case studies for hydrodynamics-based simulation. The human operator can operate the manipulator installed on the ROV via the master manipulator, with visual and force feedback, after the ROV is navigated to the desired position. During the full operation, the dynamic window approach (DWA)-based local navigation algorithm, sliding mode control (SMC) controller, and the teleoperation control framework are implemented to show the effectiveness of the designed platform. Finally, a user study on the ROV operation mode is carried out, and several metrics are designed to evaluate the superiority and accuracy of the digital simulation platform for immersive underwater teleoperation.
It is challenging for underwater vehicle-manipulator systems (UVMSs) to operate autonomously in unstructured underwater environments. Relying solely on teleoperation for both the underwater vehicle (UV) and the underwater manipulator (UM) imposes a considerable cognitive and physical load on the operator. In this paper, we propose a unified shared control (USC) architecture for the UVMS, integrating divisible shared control (DSC) and interactive shared control (ISC) to alleviate the operator's workload. By applying task priority based on DSC, we divide the whole-body task into constraint, operation, and posture optimization subtasks. The robot autonomously avoids self-collisions and adjusts its posture according to the user's visual preferences. ISC incorporates haptic feedback to enhance human-robot collaboration, seamlessly integrating it into the operation task via a whole-body controller for the UVMS. Simulations and pool experiments are conducted to verify the feasibility of the method. Compared to manual control (MC), the proposed method reduces completion time by 17.50%, operator input length by 25.00%, and cognitive load by 35.53% in the simulations, with corresponding reductions of 22.73%, 40.00%, and 29.91% in the pool experiments. Subjective measurements demonstrate the reduction in operator workload with the proposed method.
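Task-priority decomposition of this kind typically builds on classical null-space projection, which can be sketched as below: the primary (constraint) task is satisfied exactly, and a secondary (operation) task acts only in its null space. The Jacobians here are random stand-ins for the UVMS kinematics, not the paper's model.

```python
# Classical two-level task-priority redundancy resolution with null-space
# projection, in the spirit of the DSC subtask hierarchy.
import numpy as np

def task_priority(J1, dx1, J2, dx2):
    J1p = np.linalg.pinv(J1)
    N1 = np.eye(J1.shape[1]) - J1p @ J1              # null-space projector of task 1
    dq = J1p @ dx1                                   # satisfy the primary task
    dq += np.linalg.pinv(J2 @ N1) @ (dx2 - J2 @ dq)  # secondary task in null space
    return dq

rng = np.random.default_rng(0)
n = 10                                               # UVMS DoF (vehicle + arm)
J1, J2 = rng.standard_normal((3, n)), rng.standard_normal((6, n))
dx1, dx2 = rng.standard_normal(3), rng.standard_normal(6)
dq = task_priority(J1, dx1, J2, dx2)
print(np.allclose(J1 @ dq, dx1))                     # primary task met exactly: True
```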
Text-to-image diffusion models have demonstrated impressive capabilities in image generation and have been effectively applied to image inpainting. While a text prompt provides intuitive guidance for conditional inpainting, users often seek the ability to inpaint a specific object with a customized appearance by providing an exemplar image. Unfortunately, existing methods struggle to achieve high fidelity in exemplar-driven inpainting. To address this, we use a plug-and-play low-rank adaptation (LoRA) module based on a pretrained text-driven inpainting model. The LoRA module is dedicated to learning exemplar-specific concepts through few-shot fine-tuning, bringing improved fitting capability to customized exemplar images without intensive training on large-scale datasets. Additionally, we introduce GPT-4V prompting and prior noise initialization techniques to further improve the fidelity of the inpainting results. In brief, the denoising diffusion process starts with noise derived from a composite exemplar-background image and is subsequently guided by an expressive prompt generated from the exemplar using the GPT-4V model. Extensive experiments demonstrate that our method achieves state-of-the-art performance, qualitatively and quantitatively, offering users an exemplar-driven inpainting tool with enhanced customization capability.
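A minimal PyTorch sketch of the kind of plug-and-play LoRA layer such a method attaches to a pretrained model: the base weight stays frozen, and only a low-rank update is fitted on the few exemplar images. The rank and scaling are illustrative assumptions.

```python
# Standard LoRA wrapper: frozen base linear layer plus a trainable
# low-rank update B @ A, scaled by alpha / rank.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank=4, alpha=4.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # pretrained weights stay frozen
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(320, 320))
out = layer(torch.randn(2, 77, 320))         # e.g., a cross-attention projection
print(out.shape, sum(p.numel() for p in layer.parameters() if p.requires_grad))
```

Initializing B to zero means the wrapped layer starts out exactly equal to the pretrained one, so few-shot fine-tuning only gradually injects the exemplar-specific concept.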
Continual learning (CL) has emerged as a crucial paradigm for learning from sequential data while retaining previous knowledge. Continual graph learning (CGL), characterized by graphs that evolve dynamically from streaming data, presents distinct challenges that demand efficient algorithms to prevent catastrophic forgetting. The first challenge stems from the interdependencies between different graph data, in which previous graphs influence new data distributions. The second challenge lies in handling large graphs efficiently. To address these challenges, we propose an efficient continual graph learner (E-CGL) in this paper. We address the interdependence issue by demonstrating the effectiveness of replay strategies and introducing a combined sampling approach that considers both node importance and diversity. To improve efficiency, E-CGL leverages a simple yet effective multi-layer perceptron (MLP) model that shares weights with a graph neural network (GNN) during training, thereby accelerating computation by circumventing the expensive message-passing process. Our method achieves state-of-the-art results on four CGL datasets under two settings, while significantly lowering the catastrophic forgetting value to an average of −1.1%. Additionally, E-CGL achieves average training and inference speedups of 15.83× and 4.89×, respectively, across the four datasets. These results indicate that E-CGL not only effectively manages correlations between different graph data during continual training but also enhances efficiency in large-scale CGL.
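A minimal PyTorch sketch of the weight-sharing trick described above: an MLP path and a GCN-style path reuse the same weight matrices, so training can run on node features alone (no message passing) while inference can still propagate over the graph. The two-layer shape and the identity adjacency stand-in are illustrative assumptions.

```python
# One backbone, two forward paths sharing the same weights: a cheap MLP path
# for training and a message-passing path for graph-aware inference.
import torch
import torch.nn as nn

class SharedBackbone(nn.Module):
    def __init__(self, d_in=128, d_hid=64, n_cls=10):
        super().__init__()
        self.W1 = nn.Linear(d_in, d_hid)
        self.W2 = nn.Linear(d_hid, n_cls)

    def mlp_forward(self, X):                  # fast path used during training
        return self.W2(torch.relu(self.W1(X)))

    def gnn_forward(self, X, A_hat):           # message-passing path, same weights
        H = torch.relu(A_hat @ self.W1(X))
        return A_hat @ self.W2(H)

N = 1000
X = torch.randn(N, 128)
A_hat = torch.eye(N)                           # stand-in normalized adjacency
model = SharedBackbone()
loss = nn.functional.cross_entropy(model.mlp_forward(X),
                                   torch.randint(0, 10, (N,)))
loss.backward()                                # cheap: no message passing
logits = model.gnn_forward(X, A_hat)           # graph-aware inference
```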
Metasurface-based circular polarizers suffer from a trade-off between structural complexity and the polarization extinction ratio (ER). Herein, we present a single-layer chiral metasurface with strong circular dichroism. The structure converts a circularly polarized incident beam into a linearly polarized beam, achieving a high circular polarization ER. The operating wavelength of the proposed metasurface is tunable by changing the geometric parameters. Localized surface plasmon resonances between the structures ensure strong chiral optical effects. We further experimentally demonstrate the circular dichroism of the fabricated metasurface.
The integration of electric field enhancement structures (EFESs) with Rydberg atomic sensors (RASs) has garnered considerable interest due to its potential to enhance detection sensitivity in quantum measurement systems. Despite this, there is a dearth of research on the directional response of EFESs, and the analysis of the three-dimensional (3D) patterns of RASs remains a formidable challenge. RASs are employed in non-destructive measurement techniques and are responsive to electric fields, primarily serving as reception devices. However, analyzing their reception patterns is a complex task that requires a sophisticated approach. To address this, we adopt characteristic mode (CM) analysis to characterize the omnidirectional performance of RASs. According to CM theory, the reception pattern can be calculated from a series of modal currents and their corresponding coefficients. The analytical representation of these coefficients negates the need for time-consuming full-wave (FW) numerical simulations, which are typically required to generate EFES patterns because numerous angle parameters must be scanned. This approach significantly reduces the complexity of solving EFES patterns and provides insightful guidance for the design process. To validate the efficacy of our proposed method, we construct three prototypes. The results indicate that the final model resonates at 1.96 GHz, achieving an electric field gain of 25 dB and an out-of-roundness of 2.4 dB. These findings underscore the effectiveness of our method in analyzing EFES patterns, highlighting its potential for future applications in the field.
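For reference, the textbook characteristic-mode expansion behind this approach writes the induced current as a superposition of modal currents with closed-form excitation coefficients, which is what removes the need for angle-by-angle full-wave simulation (the notation below is the standard one, not necessarily the paper's):

```latex
\mathbf{J} = \sum_{n} \alpha_n \,\mathbf{J}_n ,
\qquad
\alpha_n = \frac{V_n^{\,i}}{1 + j\lambda_n},
\qquad
V_n^{\,i} = \int_{S} \mathbf{J}_n \cdot \mathbf{E}^{\,i}\, \mathrm{d}S ,
```

where the $\lambda_n$ are the modal eigenvalues, $\mathbf{E}^{i}$ is the incident field, and the reception pattern follows by superposing the modal fields with the coefficients $\alpha_n$.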
Because underwater sensor networks (USNs) have limited energy resources due to environmental constraints, it is essential to improve energy utilization. For this purpose, an autonomous underwater vehicle (AUV) with greater onboard computation power is used to process measurement data, and the mobility of the AUV is leveraged to optimize the USN topology, enhancing tracking accuracy. First, to address the transmission delay of underwater acoustic signals, a time-delay compensated centralized extended Kalman filter (TD-CEKF) algorithm is proposed. Next, the mathematical relationship between AUV position and USN topology is established, based on which the optimization target is constructed. Subsequently, a penalty function is introduced to remove the constraints from the objective function, and the optimal AUV position is searched using the gradient descent method to optimize the USN topology. The simulation results demonstrate that the proposed algorithm can effectively overcome the influence of transmission delay on target tracking and achieve improved tracking performance.
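A minimal numpy sketch of the constrained search step: a quadratic penalty folds the constraint into the objective, and gradient descent (with a numerical gradient here) moves the AUV position. The geometry-based objective and the range constraint are illustrative assumptions, not the paper's exact formulation.

```python
# Penalty-function gradient descent over the AUV position: maximize a
# D-optimality-style measure of sensor-target geometry under an assumed
# communication-range constraint.
import numpy as np

sensors = np.array([[0.0, 0.0], [100.0, 0.0], [0.0, 100.0]])  # fixed USN nodes
target = np.array([40.0, 60.0])
R_MAX = 80.0                                   # assumed comms-range constraint

def objective(p):
    nodes = np.vstack([sensors, p])
    u = (target - nodes) / np.linalg.norm(target - nodes, axis=1, keepdims=True)
    fim = u.T @ u                              # geometry-only Fisher information
    cost = -np.log(np.linalg.det(fim))         # maximize information
    violation = max(0.0, np.linalg.norm(p - target) - R_MAX)
    return cost + 1e3 * violation ** 2         # quadratic penalty term

def grad(f, p, h=1e-5):
    e = np.eye(2) * h
    return np.array([(f(p + e[i]) - f(p - e[i])) / (2 * h) for i in range(2)])

p = np.array([80.0, 80.0])                     # initial AUV position
for _ in range(200):
    p -= 0.5 * grad(objective, p)
print("optimized AUV position:", np.round(p, 2))
```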
In quasi-static wireless channel scenarios, physical layer key generation faces the challenge of invariant spatial and temporal channel characteristics, resulting in a high key disagreement rate (KDR) and a low key generation rate (KGR). To address these issues, we propose a novel reconfigurable intelligent surface (RIS)-aided secret key generation approach using an autoencoder and a K-means quantization algorithm. The proposed method uses channel state information (CSI) for channel estimation and dynamically adjusts the reflection coefficients of the RIS to create a rapidly fluctuating channel. This strategy enables the extraction of dynamic channel parameters, thereby enhancing channel randomness. Additionally, by integrating the autoencoder with the K-means clustering quantization algorithm, the method efficiently extracts random bits from complex, ambiguous, and high-dimensional channel parameters, significantly reducing the KDR. Simulations demonstrate that, under various signal-to-noise ratios (SNRs), the proposed method performs well in terms of both KGR and KDR. Furthermore, the randomness of the generated keys is validated through the National Institute of Standards and Technology (NIST) test suite.
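A minimal sketch of the K-means quantization stage: compressed channel features (random stand-ins here for the autoencoder output) are clustered, and each sample contributes the Gray-coded index of its cluster as key bits. The cluster count, feature dimension, and cluster-ordering rule are illustrative assumptions.

```python
# K-means quantization of channel features into key bits, with a
# deterministic cluster ordering so both endpoints label clusters alike
# and Gray coding so nearby clusters differ by one bit.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
features = rng.standard_normal((256, 2))       # autoencoder latent per probe

K = 4                                          # clusters -> log2(K) bits/sample
km = KMeans(n_clusters=K, n_init=10, random_state=0).fit(features)

# Order clusters deterministically so both endpoints label them identically.
order = np.argsort(km.cluster_centers_[:, 0])
rank = np.empty(K, dtype=int); rank[order] = np.arange(K)
gray = [i ^ (i >> 1) for i in range(K)]        # Gray code limits bit flips

bits = "".join(format(gray[rank[c]], "02b") for c in km.labels_)
print(bits[:32], "...", len(bits), "bits")
```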