With the rapid development of the logistics and manufacturing industries, traditional handling robots can no longer meet practical needs. To enable the rapid handling of diversified products, this study first combines deep learning techniques to improve the Double Actors Regularized Critics (DARC) algorithm and designs a robot path planning method. A Reachability Analysis-based Time Optimal Trajectory Planning (RA-TOP) algorithm is then designed to generate time-optimal trajectories from the interpolated robot path, thereby efficiently accomplishing the rapid handling of diversified products. The findings demonstrate that the enhanced DARC algorithm offers notable benefits in path planning: shorter, smoother paths with reduced curvature, a minimum path length below 20 meters, and fewer iterations to convergence, surpassing the alternative algorithms. The trajectory generation algorithm yields shorter motion times, about 1.75 seconds for the same displacement, outperforming the comparison algorithms and effectively avoiding motion jitter. Compared with the comparative methods, the obstacle avoidance trajectory of the proposed method is closer to the expected trajectory, with an average deviation of about 0.5 m. In the application example, the handling robot achieved a task success rate of 94% or above. These results indicate that, under the proposed method, robots can stably and dynamically avoid obstacles, generate optimal trajectories, meet enterprises' real-time path planning and efficient handling needs, and improve production efficiency.
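The core DARC idea the abstract builds on can be illustrated with a minimal sketch: two actors propose actions, and a conservative value estimate, the minimum over two critics, decides which proposal to execute. The toy one-dimensional state, the hand-made linear critics, and the lambda actors below are all invented for illustration; the paper's deep-network version and its path-planning extensions are not shown.

```python
# Minimal sketch of the double-actor, regularized-critic selection idea
# behind DARC (toy 1-D state/action; all functions are hand-made stand-ins).

def critic1(state, action):
    # Toy Q-estimate: prefers actions close to -state.
    return -(action + state) ** 2

def critic2(state, action):
    # A second, independently "trained" toy Q-estimate.
    return -(action + 0.9 * state) ** 2 - 0.1

def conservative_q(state, action):
    # DARC-style regularization: take the minimum of the two critics
    # to avoid overestimating an action's value.
    return min(critic1(state, action), critic2(state, action))

def select_action(state, actor_a, actor_b):
    # Two actors each propose an action; keep the proposal with the
    # higher conservative (min-critic) value.
    a, b = actor_a(state), actor_b(state)
    return a if conservative_q(state, a) >= conservative_q(state, b) else b

best = select_action(2.0, lambda s: -s, lambda s: -0.5 * s)
print(best)
```

With these toy critics, the first actor's proposal scores higher under the min-critic value and is selected.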
The aircraft final assembly is a complex system, encompassing various aspects and multidimensional production factors. These numerous factors are interconnected, significantly impacting the efficiency of the final assembly process. To investigate the interrelationships among various production factors, this study introduces a specialized fine-tuned large language model for aircraft final assembly, termed Aircraft Final Assembly ChatGLM (AFA-ChatGLM). This model is designed to automatically extract essential information regarding key production factors from process documentation. Furthermore, the FP-Growth algorithm is employed to uncover association rules between these production factors and the various stages of the final assembly. Experimental results indicate that our method demonstrates outstanding performance in the aircraft final assembly domain. Specifically, for the assembly process documents of the C919 large passenger aircraft, our proposed model achieved a Precision of 82.7%, Recall of 89.1%, and F1 score of 85.4%, representing a substantial improvement over traditional word segmentation methods. Leveraging the superior performance of the model, we utilized association rule mining techniques to construct 44,851 high-confidence association rules for the final assembly line of the C919, laying a foundation for subsequent optimization of the production line.
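The rule-mining step can be sketched with a brute-force miner over toy "assembly stage / production factor" transactions: frequent itemsets are found by support counting, and rules are kept when their confidence clears a threshold. FP-Growth, which the paper uses, finds the same frequent itemsets far more efficiently via a prefix tree; the transaction names, support, and confidence thresholds below are invented for illustration.

```python
# Hedged sketch of association-rule mining on toy assembly transactions.
# FP-Growth would produce the same frequent itemsets without enumerating
# candidates; this brute-force version only illustrates support/confidence.
from itertools import combinations

transactions = [
    {"wing_join", "torque_tool", "shift_A"},
    {"wing_join", "torque_tool", "shift_B"},
    {"cabin_fit", "sealant", "shift_A"},
    {"wing_join", "torque_tool", "shift_A"},
]

def support(itemset):
    # Fraction of transactions containing the itemset.
    return sum(itemset <= t for t in transactions) / len(transactions)

def rules(min_support=0.5, min_conf=0.8):
    items = set().union(*transactions)
    out = []
    for size in (2, 3):
        for combo in combinations(sorted(items), size):
            s = support(set(combo))
            if s < min_support:
                continue
            for k in range(1, size):
                for lhs in combinations(combo, k):
                    conf = s / support(set(lhs))
                    if conf >= min_conf:
                        rhs = tuple(i for i in combo if i not in lhs)
                        out.append((lhs, rhs, round(conf, 2)))
    return out

print(rules())
```

On this toy data the miner recovers, for instance, the rule "wing_join implies torque_tool" with confidence 1.0.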
This paper presents a geometric solution framework for a target defense problem, formulated as a variant of the classical Game of Two Cars. The setting considers a Dubins defender that is faster and more maneuverable and aims to intercept a Dubins attacker attempting to reach a convex target set. To address the computational complexity of solving the associated Hamilton-Jacobi-Isaacs (HJI) equations, a geometric approach based on the concept of the Attacker Dominance Region (ADR) is developed. The ADR is constructed piecewise from the boundaries of the players’ reachable sets. The complete solution consists of two components: a Game of Kind, which determines the outcome based on the spatial relationship between the ADR and the target set, and a Game of Degree, which derives optimal strategies that achieve equilibrium. Simulation results demonstrate the effectiveness of the proposed method under realistic motion constraints and indicate its potential applicability to practical target defense scenarios.
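For intuition, a simplified analogue of the Attacker Dominance Region (ADR) can be computed for straight-line (non-Dubins) motion, where the ADR boundary reduces to an Apollonius circle: the attacker dominates exactly the points it can reach strictly earlier than the defender. The piecewise ADR construction from Dubins reachable-set boundaries in the paper is more involved; positions and speeds below are invented.

```python
# Simplified straight-line-motion analogue of the ADR and the Game of Kind.
# The paper's Dubins-constrained construction is not reproduced here.
import math

def attacker_dominates(p, attacker, defender, v_a, v_d):
    # Attacker dominates point p if it arrives strictly earlier.
    t_a = math.dist(p, attacker) / v_a
    t_d = math.dist(p, defender) / v_d
    return t_a < t_d

def attacker_can_score(target_pts, attacker, defender, v_a, v_d):
    # Game of Kind: the attacker wins if some target point lies
    # inside its dominance region.
    return any(attacker_dominates(p, attacker, defender, v_a, v_d)
               for p in target_pts)

target = [(5.0, 0.0), (5.0, 1.0)]
print(attacker_can_score(target, (4.0, 0.0), (0.0, 0.0), 1.0, 2.0))
```

Even with a faster defender (speed 2.0 vs 1.0), the attacker starting close to the target still dominates part of it, so the Game of Kind resolves in the attacker's favor in this configuration.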
Blockchain has achieved widespread application in various fields due to its decentralized nature, data immutability, and transparency. Particularly, its integration with satellite networks provides a more secure and efficient solution for cross-regional, high-speed transmission, and reliable communication. However, challenges such as network fluctuations, performance bottlenecks, and leader election issues arise in this context, primarily due to the uneven computational power distribution of heterogeneous devices in satellite networks, as well as bandwidth limitations, signal delays, and instability. To address these challenges, this paper proposes a Dynamic Scoped Hierarchical Raft algorithm based on the network performance and computational power differences of nodes. The algorithm establishes consensus groups and restricts the pool of eligible leader candidates, thereby enhancing the adaptability of blockchain in satellite networks. Furthermore, by introducing different consensus subgroups, the scalability of the blockchain system is improved. Experimental results show that, compared to the traditional Raft algorithm, the proposed algorithm achieves a 65% increase in average throughput, a 12% reduction in latency, and a 71% reduction in leader election time, with a significantly lower chance of leader node failure when nodes drop out due to network instability.
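The candidate-restriction idea can be sketched as a scoring pass run before a Raft election round: each node gets a combined compute/network score, and only nodes above a threshold may stand for leader. The field names, scoring weights, and threshold below are assumptions for illustration; the abstract does not specify the exact scoring model.

```python
# Sketch of restricting the leader-candidate pool by node capability,
# as in the proposed Dynamic Scoped Hierarchical Raft. Weights and the
# threshold are invented stand-ins.

def node_score(node, w_compute=0.6, w_network=0.4):
    # Higher compute and lower normalized latency -> better candidate.
    return w_compute * node["compute"] + w_network * (1.0 - node["latency"])

def eligible_candidates(nodes, threshold=0.5):
    # Filter the pool before a Raft election round; ineligible nodes
    # still replicate the log but never start elections.
    return [n["id"] for n in nodes if node_score(n) >= threshold]

nodes = [
    {"id": "sat-1", "compute": 0.9, "latency": 0.2},  # strong, well-connected
    {"id": "sat-2", "compute": 0.3, "latency": 0.8},  # weak, high-latency
    {"id": "sat-3", "compute": 0.6, "latency": 0.4},
]
print(eligible_candidates(nodes))
```

Keeping weak, high-latency satellites out of the candidate pool is what shortens election time and lowers the chance of electing a leader that soon fails.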
High-precision 3D perception in autonomous driving remains constrained by dependence on expensive LiDAR sensors and computationally intensive models. These prohibitive requirements effectively limit resource-constrained platforms from accessing advanced perception capabilities, hindering the widespread adoption of autonomous technology. We present T-3MS Fusion, a transformer-based middle-fusion framework that achieves state-of-the-art 3D object detection performance using only a Velodyne VLP-32C LiDAR and a consumer-grade 360° camera, eliminating the need for test-time augmentation while maintaining computational efficiency. In contrast to early-fusion strategies that weaken spatial fidelity and late-fusion methods that lose geometric consistency, T-3MS employs a transformer-based middle-deep fusion architecture. This design leverages hierarchical gated residual transformers and adaptive cross-modal reactivation to preserve LiDAR geometry and camera semantics while enabling effective multi-scale feature integration. Sparse bird’s-eye-view processing and quantization-aware training enable real-time inference on embedded platforms. Validation on the nuScenes benchmark confirms strong performance with 74.9% NDS and 72.8% mAP, while evaluation on a self-collected semi-urban dataset acquired with low-cost and accessible hardware demonstrates resilience under occlusion, adverse illumination, and sparse point-cloud conditions, where even high-resolution LiDAR systems often experience significant performance degradation. These results establish T-3MS Fusion as an effective approach for jointly advancing accuracy, efficiency, and affordability in next-generation autonomous driving perception systems.
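The flavor of mechanism behind "hierarchical gated residual transformers" can be shown in miniature: a gate computed from both modalities decides, per feature, how much camera evidence to mix in, while a residual path lets LiDAR geometry pass through unchanged. The scalar features and hand-set gate weights below are invented; the real model learns these end to end over full feature maps.

```python
# Toy gated residual fusion of LiDAR and camera features. Hand-set
# weights; illustrative only, not the T-3MS architecture itself.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_residual_fusion(lidar_feat, cam_feat, gate_w):
    fused = []
    for l, c, w in zip(lidar_feat, cam_feat, gate_w):
        g = sigmoid(w * (l + c))   # gate conditioned on both modalities
        fused.append(l + g * c)    # residual: LiDAR always passes through
    return fused

out = gated_residual_fusion([1.0, -0.5], [0.5, 2.0], [10.0, 0.0])
print(out)
```

With a saturated gate the camera feature is fully admitted; with a neutral gate only half of it is mixed in, and in both cases the LiDAR term survives intact.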
We present a modular, production-ready approach that integrates compact Neural Networks (NNs) into a Kalman-filter-based Multi-Object Tracking (MOT) pipeline. We design three tiny task-specific networks to retain modularity, interpretability and real-time suitability for embedded Automotive Driver Assistance Systems:
1. SPENT (Single-Prediction Network) — predicts per-track states and replaces heuristic motion models used by the Kalman Filter (KF).
2. SANT (Single-Association Network) — assigns a single incoming sensor object to existing tracks, without relying on heuristic distance and association metrics.
3. MANTa (Multi-Association Network) — jointly associates multiple sensor objects with multiple tracks in a single step.
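The SPENT idea above can be sketched by making the KF's motion model pluggable: the same predict step runs with either the heuristic constant-velocity model or a learned predictor. The `tiny_net` function stands in for the trained network, its damping coefficient is invented, and covariance propagation is omitted for brevity.

```python
# Minimal sketch: swapping the KF's heuristic motion model for a
# pluggable learned predictor (SPENT stand-in; coefficients invented).

def constant_velocity(state, dt):
    # Heuristic KF motion model: position += velocity * dt.
    x, v = state
    return (x + v * dt, v)

def tiny_net(state, dt):
    # Stand-in "network": a learned damped-velocity model.
    x, v = state
    return (x + 0.95 * v * dt, 0.95 * v)

def kf_predict(state, dt, motion_model):
    # Covariance propagation omitted to keep the sketch short.
    return motion_model(state, dt)

print(kf_predict((0.0, 2.0), 1.0, constant_velocity))
print(kf_predict((0.0, 2.0), 1.0, tiny_net))
```

Because only the motion-model callable changes, the rest of the KF pipeline stays intact, which is what preserves modularity and interpretability.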
Multi-object tracking (MOT) has long been a challenging task in computer vision, particularly in complex scenes with intricate motion patterns and frequent occlusions. Existing approaches often face significant hurdles in maintaining consistent and accurate trajectories under such demanding conditions. The integration of motion and appearance cues has proven beneficial, yet most methods rely on static fusion strategies that fail to adapt to dynamic scene variations. In this paper, we propose Adaptive Hybrid Association Tracking (AHAT), a novel framework designed to address the limitations of traditional MOT methods. AHAT employs a two-stage dynamic feature selection mechanism. The first stage combines motion and appearance features to achieve high-precision matching for high-scoring detection boxes, while the second stage utilizes a dynamic threshold for simple matching against low-scoring detection boxes. This approach effectively reduces trajectory fragmentation and ID switches, improving tracking robustness in crowded and dynamic environments. Notably, AHAT achieves a 5% improvement in HOTA in scenarios with low detection confidence or high motion complexity and reduces identity switches by over 10%. These results highlight AHAT’s effectiveness in practical applications, especially in video surveillance and robotics where high accuracy and real-time performance are crucial. The modular design of AHAT allows for seamless integration into existing tracking frameworks, offering a simple yet effective solution.
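The two-stage association described above can be sketched as follows: high-confidence detections are matched first with a fused motion+appearance cost, then the remaining tracks are matched against low-confidence detections under a looser, motion-only threshold. The greedy matcher, the cost definitions, and all thresholds are simplified stand-ins for AHAT's actual components.

```python
# Hedged sketch of AHAT-style two-stage association (greedy matching,
# invented costs and thresholds).

def greedy_match(tracks, dets, cost_fn, max_cost):
    matches, used = [], set()
    for t in tracks:
        best, best_c = None, max_cost
        for d in dets:
            if d["id"] in used:
                continue
            c = cost_fn(t, d)
            if c < best_c:
                best, best_c = d, c
        if best is not None:
            matches.append((t["id"], best["id"]))
            used.add(best["id"])
    return matches

motion_cost = lambda t, d: abs(t["x"] - d["x"])
fused_cost = lambda t, d: 0.5 * motion_cost(t, d) + 0.5 * abs(t["app"] - d["app"])

def two_stage_associate(tracks, dets, score_thr=0.6):
    high = [d for d in dets if d["score"] >= score_thr]
    low = [d for d in dets if d["score"] < score_thr]
    m1 = greedy_match(tracks, high, fused_cost, max_cost=1.0)
    left = [t for t in tracks if t["id"] not in {a for a, _ in m1}]
    m2 = greedy_match(left, low, motion_cost, max_cost=2.0)  # looser stage-2 gate
    return m1 + m2

tracks = [{"id": "T1", "x": 0.0, "app": 0.1}, {"id": "T2", "x": 5.0, "app": 0.9}]
dets = [{"id": "D1", "x": 0.2, "app": 0.15, "score": 0.9},
        {"id": "D2", "x": 6.0, "app": 0.5, "score": 0.4}]
print(two_stage_associate(tracks, dets))
```

Recovering track T2 via the second, low-score stage is exactly what reduces trajectory fragmentation when detection confidence drops.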
The proliferation of Large Language Models (LLMs) is often constrained by their significant computational and memory requirements, limiting their deployment to large data centers. Small Language Models (SLMs) offer a feasible solution for on-device applications, yet their efficiency requires optimization to operate well on resource-constrained hardware. This study investigates methods for improving the computational efficiency of SLMs. The effects of two primary approaches, post-training optimization and architectural innovation, were compared through a quantitative and observational study. Using a standardized suite of benchmarks measuring accuracy, reasoning, and inference performance, a baseline is established with state-of-the-art SLMs such as Phi-3 and Llama 3. Post-training techniques were evaluated, including 4-bit quantization (GPTQ) and knowledge distillation from a superior teacher model. Finally, these optimized Transformers were compared against a custom-trained, non-Transformer model based on the Mamba (State-Space Model) architecture. Results show that 4-bit quantization is the most effective compression strategy among those evaluated: it reduces peak inference memory footprint by 71%, increases throughput by 83%, and does so with minimal accuracy degradation. Within the controlled experimental space evaluated in this study, the 4-bit quantized Phi-3-mini model occupies a Pareto-optimal position in memory-normalized accuracy. Knowledge distillation proved most effective for focused skill development, while architectures such as Mamba offer a different trade-off, delivering the highest raw throughput on streaming workloads. The findings indicate that quantizing current Transformer-based SLMs is the best route for general-purpose deployment, whereas customized architectures and distillation are better suited to specific, high-performance use cases. This research offers a definitive framework and pragmatic recommendations for advancing the next generation of robust, efficient, and accessible language models.
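The memory/accuracy trade-off behind the 4-bit result can be shown with a naive round-to-nearest symmetric quantizer: weights are mapped onto 16 signed integer levels, each costing 4 bits instead of 32. GPTQ refines this with per-group scales and an error-compensating solver, so the sketch below illustrates only the basic idea, not GPTQ itself; the example weights are invented.

```python
# Naive symmetric 4-bit weight quantization sketch (not GPTQ).

def quantize_4bit(weights):
    # Map floats onto the 16 signed levels [-8, 7] with one shared scale.
    scale = max(abs(w) for w in weights) / 7.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Reconstruct approximate floats from the integer codes.
    return [v * scale for v in q]

w = [0.7, -0.35, 0.1, -0.02]
q, s = quantize_4bit(w)
w_hat = dequantize(q, s)
print(q)      # 4-bit integer codes vs 32 bits per original float
print(w_hat)  # reconstruction with small rounding error
```

The codes occupy an eighth of the original storage, which is the source of the large memory and throughput gains reported for the quantized models; the reconstruction error is what GPTQ's solver works to minimize.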
Unmanned aerial vehicle swarms serving high-demand ground communication users in dynamic environments must simultaneously optimize three-dimensional trajectories, communication network topology, and routing strategies while considering limited energy, link quality fluctuations, and collision avoidance constraints. This problem faces three core challenges: routing decisions under dynamic topology require real-time adaptation to vehicle position changes and channel variations; end-to-end delay and throughput optimization in multi-hop communication demands coordinated forwarding strategies across all vehicles; and high-dimensional continuous action spaces together with partial observability make the problem difficult for traditional optimization methods to solve. This paper models the problem as a multi-agent partially observable Markov decision process and proposes a graph attention-based multi-agent deep deterministic policy gradient algorithm to jointly optimize velocity vectors, communication power, and routing decisions for each vehicle. The reward function comprehensively considers user quality of service, system throughput, end-to-end delay, and energy consumption while ensuring safety distance and energy margins through constraint penalties. Simulation results demonstrate that compared to single-agent deep deterministic policy gradient and independent Q-learning baseline methods, the proposed method achieves approximately 50% improvement in convergence speed, 12% to 18% increase in user service satisfaction, 25% to 40% improvement in system throughput, 30% to 45% reduction in end-to-end delay, and 39% to 102% improvement in energy efficiency. The framework dynamically adjusts network topology and routing strategies according to user demands, providing a deployable solution for large-scale vehicle swarm communication networks.
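A reward of the shape described above can be sketched as a weighted mix of QoS, throughput, delay, and energy terms with hard constraint penalties for safety distance and energy margin. All weights, thresholds, and penalty magnitudes below are invented for illustration; the paper's exact formulation is not given in the abstract.

```python
# Sketch of a constrained multi-objective UAV reward (invented weights).

def uav_reward(qos, throughput, delay, energy,
               min_neighbor_dist, battery,
               safe_dist=5.0, min_battery=0.1,
               w=(1.0, 0.5, 0.8, 0.3)):
    # Positive terms reward service quality and throughput;
    # negative terms penalize delay and energy use.
    r = (w[0] * qos + w[1] * throughput
         - w[2] * delay - w[3] * energy)
    if min_neighbor_dist < safe_dist:   # collision-avoidance penalty
        r -= 100.0
    if battery < min_battery:           # energy-margin penalty
        r -= 100.0
    return r

safe = uav_reward(qos=0.9, throughput=2.0, delay=0.5, energy=1.0,
                  min_neighbor_dist=8.0, battery=0.6)
unsafe = uav_reward(qos=0.9, throughput=2.0, delay=0.5, energy=1.0,
                    min_neighbor_dist=3.0, battery=0.6)
print(safe, unsafe)
```

Making the constraint penalties much larger than the objective terms is a common way to keep learned policies inside the safety and energy envelopes while they trade off the softer objectives.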