In multivariate time series forecasting, most existing Transformer models follow a fixed modeling paradigm: they either focus on capturing temporal patterns within each variable or on learning the interactions between variables. However, such a single-paradigm approach often fails to adapt to real-world time series, which exhibit complex and diverse characteristics. To this end, we propose Adformer, an adaptive and unified forecasting framework that integrates a hybrid architecture capable of capturing both intra-variable and inter-variable dependencies. This hybrid design faces a key challenge: when inter-variable correlations in the data are weak, forcing the model to learn inter-variable interactions may introduce statistical noise and thus degrade forecasting performance. To enable the model to circumvent this issue, the hybrid architecture is dynamically guided by a data-driven strategy selection module. This module analyzes the input's intrinsic correlation structure using unsupervised clustering and, based on this analysis, automatically selects the optimal modeling path for the architecture: intra-variable patterns, inter-variable interactions, or a hybrid of both. Additionally, we introduce a frequency-aware loss function, which helps the model focus on meaningful low-frequency components and improves robustness under noisy conditions. Extensive experiments on several public benchmarks demonstrate that our adaptive framework consistently outperforms state-of-the-art methods across various forecasting tasks, showing strong generalization and robustness, and highlighting its potential as a foundation for future time series models.
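The idea of a frequency-aware loss can be illustrated with a minimal sketch: compare the prediction and target in the spectral domain and up-weight low-frequency error. The cutoff ratio, weighting factor, and function name below are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def frequency_aware_loss(pred, target, cutoff_ratio=0.25, low_freq_weight=2.0):
    """Sketch of a frequency-aware loss: weight low-frequency spectral
    errors more heavily. `cutoff_ratio` and `low_freq_weight` are
    hypothetical hyperparameters chosen for illustration."""
    # Compare the spectra of prediction and ground truth via the real FFT.
    pred_f = np.fft.rfft(pred)
    tgt_f = np.fft.rfft(target)
    err = np.abs(pred_f - tgt_f) ** 2

    # Up-weight the low-frequency band, which carries trend/seasonal signal.
    cutoff = max(1, int(len(err) * cutoff_ratio))
    weights = np.ones_like(err)
    weights[:cutoff] = low_freq_weight
    return float(np.mean(weights * err))
```

Under this weighting, an error concentrated in a low-frequency bin is penalized more than an equal-energy error in a high-frequency bin, nudging the model toward matching the dominant trend components.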
Domain Adaptation (DA) aims to transfer knowledge from a labeled source domain to an unlabeled or sparsely labeled target domain under domain shifts. Most prior works focus on capturing inter-domain transferability but largely overlook rich intra-domain structures, which empirically results in even worse discriminability. To tackle this tradeoff, we propose a generalized graph SPectral Alignment framework, SPA++. Its core design is as follows: (1) by casting the DA problem into graph primitives, it composes a coarse graph alignment mechanism with a novel spectral regularizer that aligns the domain graphs in eigenspaces; (2) we further develop a fine-grained neighbor-aware propagation mechanism for enhanced discriminability in the target domain; (3) by incorporating data augmentation and consistency regularization, SPA++ can adapt to complex scenarios, including most DA settings and even challenging distribution shifts. Furthermore, we provide theoretical analysis to support our method, including a generalization bound for graph-based DA and the roles of spectral alignment and smoothing consistency. Extensive experiments on benchmark datasets demonstrate that SPA++ consistently outperforms existing cutting-edge methods, achieving superior robustness and adaptability across various challenging adaptation scenarios.
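The spirit of aligning domain graphs in eigenspaces can be sketched in a few lines: build a graph Laplacian per domain and penalize the gap between their leading eigenvalues. The exact regularizer in SPA++ is more involved; the function names and the squared-gap form here are illustrative assumptions.

```python
import numpy as np

def laplacian(adj):
    """Unnormalized graph Laplacian L = D - A for a symmetric adjacency."""
    deg = np.diag(adj.sum(axis=1))
    return deg - adj

def spectral_alignment_penalty(adj_src, adj_tgt, k=3):
    """Hypothetical spectral regularizer: squared gap between the k
    smallest Laplacian eigenvalues of the source and target graphs.
    (A sketch of eigenspace alignment, not the paper's definition.)"""
    ev_s = np.sort(np.linalg.eigvalsh(laplacian(adj_src)))[:k]
    ev_t = np.sort(np.linalg.eigvalsh(laplacian(adj_tgt)))[:k]
    return float(np.sum((ev_s - ev_t) ** 2))
```

Because the low end of the Laplacian spectrum encodes the graph's cluster structure, driving this penalty down pushes the two domain graphs toward similar community geometry.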
Applying artificial intelligence to Traditional Chinese Medicine (TCM) treatment has enabled online intelligent TCM diagnosis. However, AI-driven prescription systems for TCM face two critical challenges. The first is limited generalizability: existing methods merely retrieve similar historical prescriptions, failing to generate novel prescriptions for rare symptoms or patients with unique constitutions. The second is accuracy degradation, which primarily stems from three factors: experiential bias in practitioner-dependent decision patterns, neglect of the historical patient context critical for personalization, and geometric distortion induced by Euclidean embeddings of scale-free herb interaction networks. To address these issues, we propose HyperRxGen, a historically contextualized hyperbolic framework for herb prescription generation. HyperRxGen is architected around two core components: a Hyperbolic Multi-Graph Neural Network (HMGNN) and an HMGNN-based Prescription Generator (HM-PG). The HMGNN leverages hyperbolic geometry to encode TCM knowledge graphs, achieving lower distortion than Euclidean GNNs. The HM-PG injects patients' historical records into the prescription generation process, enhancing personalized treatment consistency through adaptive history weighting. Extensive experiments on real-world datasets demonstrate the superior effectiveness and efficiency of HyperRxGen over various baselines. This work bridges hyperbolic deep learning and clinical decision support, offering a potential paradigm shift for personalized healthcare.
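The geometry underlying hyperbolic encoders such as HMGNN can be illustrated with the standard distance function of the Poincaré ball model, in which distances grow rapidly toward the boundary, giving the exponential room needed to embed scale-free, tree-like graphs with low distortion. This is the textbook formula, not the paper's full model.

```python
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance in the Poincare ball model of hyperbolic space:
    d(u, v) = arccosh(1 + 2 * ||u - v||^2 / ((1 - ||u||^2)(1 - ||v||^2))).
    Inputs must lie strictly inside the unit ball."""
    uu = np.dot(u, u)
    vv = np.dot(v, v)
    duv = np.dot(u - v, u - v)
    x = 1.0 + 2.0 * duv / max((1.0 - uu) * (1.0 - vv), eps)
    return float(np.arccosh(x))
```

Points near the boundary are far from everything, which mirrors how leaf nodes of a hierarchy sit far apart despite a shallow common ancestor.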
Continual learning (CL) involves acquiring and accumulating knowledge from evolving tasks while alleviating catastrophic forgetting. Recently, leveraging contrastive losses to construct more transferable and less forgetful representations has been a promising direction in CL. Despite these advancements, the performance of such methods is still limited by the confusion arising from both inter-task and intra-task features. To address this problem, we propose a simple yet effective contrastive strategy named Global Pre-fixing, Local Adjusting for Supervised Contrastive learning (GPLASC). Specifically, to avoid task-level confusion, we divide the entire unit hypersphere of representations into non-overlapping regions, with the centers of the regions forming an inter-task pre-fixed Equiangular Tight Frame (ETF). Meanwhile, for individual tasks, our method regulates the feature structure to form intra-task adjustable ETFs within their respective allocated regions. As a result, GPLASC simultaneously ensures discriminative feature structures both between and within tasks and can be seamlessly integrated into any existing contrastive continual learning framework. Extensive experiments validate its effectiveness.
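The pre-fixed region centers can be made concrete with the standard simplex ETF construction: k unit vectors whose pairwise inner products all equal -1/(k-1), i.e. maximally and equally separated directions. This is the common textbook construction, shown as a sketch of the structure GPLASC fixes between tasks.

```python
import numpy as np

def simplex_etf(k):
    """Columns form a k-vector simplex Equiangular Tight Frame:
    unit-norm vectors with pairwise inner product -1/(k-1).
    Standard construction: sqrt(k/(k-1)) * (I - (1/k) * ones)."""
    return np.sqrt(k / (k - 1)) * (np.eye(k) - np.ones((k, k)) / k)
```

Fixing task centers to such a frame in advance guarantees every pair of task regions is equally (and maximally) separated, so no later task can crowd an earlier one.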
Though vertical federated learning (VFL) is generally considered privacy-preserving, recent studies have shown that VFL systems are vulnerable to label inference attacks originating from various attack surfaces. Among these, the model completion (MC) attack is currently the most powerful. Existing defenses against it either sacrifice model accuracy or incur impractical computational overhead. In this paper, we propose VMask, a novel label privacy protection framework designed to defend against the MC attack from the perspective of layer masking. Our key insight is to disrupt the strong correlation between input data and intermediate outputs by applying the secret sharing (SS) technique to mask layer parameters in the attacker's model. We devise a strategy for selecting critical layers to mask, reducing the overhead that would arise from naively applying SS to the entire model. Moreover, VMask is the first framework to offer a tunable privacy budget to defenders, allowing flexible control over the level of label privacy according to actual requirements. We built a VFL system, implemented VMask on it, and extensively evaluated it using five model architectures and 13 datasets of different modalities, comparing it against 12 other defense methods. The results demonstrate that VMask achieves the best privacy-utility trade-off, successfully thwarting the MC attack (reducing label inference accuracy to the level of random guessing) while preserving model performance (e.g., for Transformer-based models, the average drop in VFL model accuracy is only 0.09%). VMask's runtime is up to 60,846 times faster than that of cryptography-based methods, and it is only 1.8 times that of standard VFL in a large Transformer-based model, which is generally acceptable.
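The masking idea rests on additive secret sharing: split a weight tensor into shares so that any single share is pure noise, yet the shares sum back to the original. The toy version below (over real numbers, with hypothetical function names) conveys the principle; practical SS protocols like VMask's operate over finite rings and handle the layer-selection strategy described above.

```python
import numpy as np

rng = np.random.default_rng(0)

def share(w, n_parties=2):
    """Additively secret-share a weight tensor among n_parties.
    Each share alone reveals nothing about w (it is masked by random
    noise), but all shares sum exactly back to w. Toy sketch: real SS
    works over finite rings, not floating-point reals."""
    shares = [rng.standard_normal(w.shape) for _ in range(n_parties - 1)]
    shares.append(w - sum(shares))  # last share cancels the noise
    return shares

def reconstruct(shares):
    """Recover the original tensor by summing all shares."""
    return sum(shares)
```

Masking a layer's parameters this way breaks the attacker's ability to run that layer locally, which is exactly the input-to-intermediate correlation an MC attack exploits.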
Retrieval-Augmented Generation (RAG) has proven effective in enhancing the generation capabilities of large language models (LLMs) across various natural language processing tasks. However, its performance on low-resource machine translation drops sharply due to noise interference caused by semantic mismatch between retrieved content and translation requirements. To alleviate this drawback, we propose a novel hierarchical dynamic retrieval and matching approach for Southeast Asian low-resource machine translation. First, we construct a hierarchical index structure over an existing parallel corpus that uses high-frequency word statistics as key indices, associating bilingual short and long sentence pairs. Second, we dynamically match words between the source sentence and the hierarchical index structure to retrieve all associated short and long bilingual sentence pairs, and rerank the candidates by computing cross-lingual semantic similarity between the source sentence and the retrieved pairs. Finally, the sample with the highest semantic similarity is integrated into the prompt to guide LLMs toward more accurate translations. Experimental results show that our approach outperforms mainstream machine translation systems without fine-tuning LLM parameters. Detailed analysis indicates that our method precisely matches fine-grained semantic information, thus reducing noise interference and improving low-resource translation performance.
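The reranking step can be sketched as cosine similarity between the source-sentence embedding and each retrieved candidate's embedding, keeping the most similar candidate for the prompt. The embedding model is assumed (any multilingual sentence encoder would do); the function name is illustrative.

```python
import numpy as np

def rerank(src_vec, candidate_vecs):
    """Order retrieved bilingual pairs by cosine similarity to the
    source-sentence embedding, most similar first. Embeddings are
    assumed to come from some multilingual sentence encoder."""
    src = src_vec / np.linalg.norm(src_vec)
    cands = candidate_vecs / np.linalg.norm(candidate_vecs, axis=1, keepdims=True)
    sims = cands @ src            # cosine similarity per candidate
    return np.argsort(-sims)      # indices from best to worst match
```

Taking only the top-ranked pair into the prompt is what filters out the semantically mismatched retrievals that would otherwise act as noise.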
Semi-supervised multi-label learning (SSMLL) trains models efficiently by leveraging a small amount of labeled data along with a large set of unlabeled data. In SSMLL, since each instance can be associated with multiple labels, a key problem in pseudo-labeling is how to convert soft predicted probabilities into hard positive/negative labels for unlabeled data. Recent work addresses this problem with a class-wise thresholding method but neglects the fact that different instances carry different contextual information, causing the model to make biased predictions for the same class. This in turn yields biased pseudo-labels, which degrade the model's performance. To solve this problem, we propose an instance-adaptive thresholding method for SSMLL, which aims to avoid introducing contextual bias into pseudo-labeling. The core idea is a learnable thresholding function that adaptively generates instance-wise thresholds to separate the positive and negative labels of each unlabeled instance. The thresholding function can be easily learned with an improved pairwise ranking loss on labeled data, and the strategy can serve as a plug-in for other SSMLL methods to generate hard pseudo-labels. Experimental results demonstrate that our thresholding strategy consistently improves existing SSMLL methods and achieves state-of-the-art performance when integrated into strong architectures.
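Instance-wise thresholding can be sketched with a minimal parametric form: a function of the instance's features produces one threshold per instance, and class probabilities are split against that threshold. The linear-plus-sigmoid form and parameter names below are illustrative assumptions; the paper learns its thresholding function with a pairwise ranking loss.

```python
import numpy as np

def instance_threshold(features, w, b):
    """Hypothetical thresholding function: map an instance's feature
    vector to a scalar threshold in (0, 1) via a linear layer + sigmoid.
    In practice (w, b) would be learned from labeled data."""
    return 1.0 / (1.0 + np.exp(-(features @ w + b)))

def pseudo_label(probs, features, w, b):
    """Convert soft class probabilities into hard positive/negative
    pseudo-labels using the per-instance threshold."""
    tau = instance_threshold(features, w, b)
    return (probs >= tau).astype(int)
```

Because the threshold moves with the instance's features, two instances with identical class probabilities can still receive different pseudo-labels, which is exactly the contextual flexibility a single class-wise cutoff lacks.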