2025-03-10, Volume 1, Issue 1

  • research-article
    Kuang Du, Jing Du, Zhi Wei

    Drug-Drug Interactions (DDIs) can occur when diseases are treated with combinations of drugs, leading to changes in the pharmacological activity of these drugs. Predicting DDIs has become a crucial task in medical health. Recently, hierarchical graph representation learning methods have attracted significant interest and have proven effective for this task. However, collecting drug interaction data through biological experiments in wet laboratories is resource- and time-intensive. Given the limited amount of available drug interaction data, the performance of existing hierarchical graph methods has encountered a bottleneck. Current approaches are supervised learning methods, which train graph neural networks on specific datasets and can suffer from overfitting. Additionally, supervised learning models cannot leverage information from massive unlabeled public molecular datasets, such as ZINC15. To overcome this limitation, we propose a novel method for multi-view graph representation learning, namely, Self-Supervised Multi-View Graph Representation Learning for Drug-Drug Interaction Prediction (SMG-DDI). SMG-DDI leverages a pre-trained Graph Convolutional Network to generate inter-view molecule graph representations, with atoms as nodes and chemical bonds as edges. Subsequently, SMG-DDI captures intra-view interactions between molecules, and final drug-drug interaction predictions are derived from the drug embeddings produced by this intra-view analysis. Our experiments on various real datasets demonstrate that molecular structure information can aid in predicting potential drug-drug interactions, and our proposed approach outperforms state-of-the-art DDI prediction methods, with accuracies of 0.83, 0.79, and 0.73 on small-, medium-, and large-scale test datasets, respectively.
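The molecule-graph encoding the abstract describes (atoms as nodes, chemical bonds as edges, processed by a graph convolutional network) can be sketched in miniature. This is an illustrative toy, not the authors' SMG-DDI implementation; the 4-atom molecule, feature sizes, and random weights are assumptions:

```python
# Toy sketch of one graph-convolution step on a molecule graph:
# H' = ReLU(A_norm @ H @ W), with A_norm the symmetrically
# normalized adjacency matrix with self-loops added.
import numpy as np

# Hypothetical 4-atom molecule: bonds 0-1, 1-2, 2-3 (a simple chain)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
A_hat = A + np.eye(4)                       # add self-loops
d = A_hat.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt    # symmetric normalization

rng = np.random.default_rng(0)
H = rng.normal(size=(4, 5))                 # initial atom features
W = rng.normal(size=(5, 3))                 # weights (random stand-in)
H_next = np.maximum(A_norm @ H @ W, 0.0)    # one GCN layer with ReLU
print(H_next.shape)
```

Stacking such layers and pooling the node states yields the per-molecule embedding that methods of this kind feed into the interaction predictor.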

  • research-article
    Gaurav Bagwe, Lan Zhang, Linke Guo, Miao Pan, Xiaolong Ma, Xiaoyong Yuan

    Embedding-as-a-Service (EaaS) has emerged as a popular paradigm for empowering users with limited resources to leverage large language models (LLMs). Through an API, EaaS providers grant access to their large language embedding models (LLEMs), enabling users with domain expertise to construct the domain-specific layers locally. However, the close interaction between EaaS providers and users raises new concerns: Is EaaS safe for users? Although recent research has highlighted the vulnerability of LLMs to backdoor attacks, especially task-agnostic backdoor attacks, existing attacks cannot be effectively executed in EaaS due to challenges in terms of attack efficacy, attack stealthiness, and user-side knowledge limitations. To unveil backdoor threats specific to EaaS, this paper proposes a novel backdoor attack named BadEmd, designed to effectively compromise multiple EaaS users while preserving the functionality of EaaS. BadEmd comprises two key modules: meta-prompt-based attack buildup creates backdoor attack surfaces in EaaS while seamlessly integrating with prior task-agnostic attacks to ensure attack stealthiness; user-specific trigger migration enforces attack efficacy despite limited user-side knowledge. Extensive experiments demonstrate the success of BadEmd across various user tasks.

  • research-article
    Yihe Zhou, Tao Ni, Wei-Bin Lee, Qingchuan Zhao

    Large Language Models (LLMs) have achieved remarkably advanced capabilities in understanding and generating human-language text and have gained increasing popularity over recent years. Beyond their state-of-the-art natural language processing (NLP) performance, their widespread usage in many industries, including medicine, finance, and education, has raised growing security concerns. In recent years, backdoor attacks have evolved alongside the advancement of defense mechanisms against them and of increasingly capable LLMs. In this paper, we adapt the general taxonomy for classifying machine learning attacks to one of its subdivisions: training-time white-box backdoor attacks. Besides systematically classifying attack methods, we also consider the corresponding defense methods against backdoor attacks. By providing an extensive summary of existing works, we hope this survey can serve as a guideline for inspiring future research that further extends the attack scenarios and creates stronger defenses against them for more robust LLMs.

  • research-article
    Yutong Zhao, Jianye Pang, Xinjie Zhu, Wenhua Shao

    Traditional automated machine learning (AutoML) often faces limitations in manual effort, complexity management, and subjective design choices. This paper introduces a novel LLM-driven AutoML framework centered on the innovation of decomposed prompting. We hypothesize that by strategically breaking down complex AutoML tasks into sequential, guided sub-prompts, Large Language Models (LLMs) operating within a code sandbox on standard PCs can autonomously design, implement, evaluate, and select high-performing machine learning models. To validate this, we primarily applied our decomposed prompting approach to sleep disorder classification (illustrating potential benefits in healthcare). To assess the generalizability and robustness of our method across different data types, we subsequently evaluated it on the established 20 Newsgroups text classification benchmark. We rigorously compared decomposed prompting against zero-shot and few-shot prompting strategies, as well as a manually engineered baseline. Our results demonstrate that decomposed prompting significantly outperforms these alternatives, enabling the LLM to autonomously achieve superior classifier design and performance, with particularly strong results in the primary sleep disorder domain and robust performance in the benchmark task. These findings underscore the transformative potential of decomposed prompting as a key technique for advancing LLM-driven AutoML across diverse application areas beyond the specific examples explored here, paving the way for more automated and accessible problem-solving in scientific and engineering disciplines.
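The decomposed-prompting loop described above can be sketched as a simple orchestration: a composite AutoML task is split into sequential sub-prompts, and each step sees the accumulated answers of the previous steps. The sub-prompts and the `call_llm` stub below are hypothetical stand-ins for illustration, not the paper's actual prompts or API:

```python
# Illustrative sketch of decomposed prompting for AutoML.
# `call_llm` is a stub; a real system would call an LLM API here
# and execute the generated code inside a sandbox.
SUB_PROMPTS = [
    "Inspect the dataset and summarize feature types and the target.",
    "Propose three candidate model families suited to this summary.",
    "Write training code for each candidate inside the sandbox.",
    "Evaluate the candidates and select the best-performing model.",
]

def call_llm(prompt: str, context: str) -> str:
    # Stub answer; tags each step with how many prior answers it saw.
    n_prior = len(context.split("->")) if context else 0
    return f"answer({n_prior}): {prompt[:20]}"

def run_decomposed(sub_prompts):
    """Feed each sub-prompt the chain of previous answers."""
    context = ""
    for p in sub_prompts:
        answer = call_llm(p, context)
        context = f"{context}->{answer}" if context else answer
    return context

trace = run_decomposed(SUB_PROMPTS)
print(trace.count("->"))  # three hand-offs between the four steps
```

The point of the decomposition is exactly this chaining: each sub-task is small enough for the model to solve reliably, while the shared context keeps the steps coherent.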

  • research-article
    Fei Deng, Catherine H. Feng, Nan Gao, Lanjing Zhang

    Normalization is a critical step in quantitative analyses of biological processes. Recent works show that cross-platform integration and normalization enable machine learning (ML) training on RNA microarray and RNA-seq data, but no independent datasets were used in their studies. Therefore, it is unclear how to improve ML modelling performance on independent RNA microarray- and RNA-seq-based datasets. Inspired by the housekeeping genes that are commonly used in experimental biology, this study tests the hypothesis that non-differentially expressed genes (NDEG) may improve normalization of transcriptomic data and subsequently the cross-platform modelling performance of ML models. Microarray and RNA-seq datasets of the TCGA breast cancer cohort were used as independent training and test datasets, respectively, to classify the molecular subtypes of breast cancer. NDEG (p > 0.85) and differentially expressed genes (DEG) (p < 0.05) were selected based on the p-values of ANOVA and used for subsequent data normalization and classification, respectively. Models trained on data from one platform were used for testing on the other platform. Our data show that NDEG and DEG gene selection could effectively improve the model classification performance. Normalization methods based on parametric statistical analysis were inferior to those based on nonparametric statistics. In this study, the LOG_QN and LOG_QNZ normalization methods combined with the neural network classification model seem to achieve better performance. Therefore, NDEG-based normalization appears useful for cross-platform testing on completely independent datasets. However, more studies are required to examine whether NDEG-based normalization can improve ML classification performance in other datasets and other omic data types.
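The gene-selection step stated in the abstract (NDEG at p > 0.85 and DEG at p < 0.05, from one-way ANOVA p-values across subtype groups) can be illustrated on synthetic data. This is a minimal sketch under assumed group counts and effect sizes, not the study's actual pipeline:

```python
# Sketch: select NDEG (p > 0.85) for normalization and DEG (p < 0.05)
# as classification features, using one-way ANOVA per gene.
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(0)
n_genes, n_per_group = 200, 30
# Synthetic expression matrix: three molecular subtypes (groups)
groups = [rng.normal(0, 1, (n_per_group, n_genes)) for _ in range(3)]
groups[0][:, :20] += 2.0  # make the first 20 genes differentially expressed

pvals = np.array([
    f_oneway(*(g[:, j] for g in groups)).pvalue for j in range(n_genes)
])

ndeg = np.where(pvals > 0.85)[0]  # stable genes -> used for normalization
deg = np.where(pvals < 0.05)[0]   # informative genes -> used as features
print(len(ndeg), len(deg))
```

The two thresholds carve out disjoint gene sets, mirroring how housekeeping-like genes anchor the normalization while differentially expressed genes carry the subtype signal.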

  • research-article
    Jiayimei Wang, Tao Ni, Wei-Bin Lee, Qingchuan Zhao

    The increasing complexity of software systems has driven significant advancements in program analysis, as traditional methods are unable to meet the demands of modern software development. To address these limitations, deep learning techniques, particularly Large Language Models (LLMs), have gained attention due to their context-aware capabilities in code comprehension. Recognizing the potential of LLMs, researchers have extensively explored their application in program analysis since their introduction. Despite existing surveys on LLM applications in cybersecurity, comprehensive reviews specifically addressing their role in program analysis remain scarce. This survey reviews the application of LLMs in program analysis, categorizing existing work into static, dynamic, and hybrid approaches. We also identify current research hotspots, such as LLM integration in automated vulnerability detection and code analysis, common challenges like model interpretability and training data limitations, and future directions, including using LLMs to convert dynamic analysis tasks into static ones. This survey aims to demonstrate the potential of LLMs in advancing program analysis practices and offer actionable insights for security researchers seeking to enhance detection frameworks or develop domain-specific models.

  • research-article
    Iris Z. Shen, Lanjing Zhang

    Individual investors often trail institutional ones in investor return. Artificial intelligence (AI) has been increasingly used in the investing and finance sectors. However, its impact on the disparity in investor returns is unclear. We therefore discuss how and to what extent the application of AI tools exacerbates return disparities between individual and institutional investors. A literature search and review were conducted. Hypothetical drawdowns during the 2020 market crisis were simulated and reported. Our data and review of the literature show that AI may worsen these disparities through additional technological and psychological edges gained by institutional (versus individual) investors and large (versus smaller) institutions. To address this concern, we propose several approaches to mitigate the increasing disparities in investor return, including raising awareness of the risks of AI-driven tools, playing defensively in the market, actions by lawmakers and law enforcement agencies, and fiduciary requirements for financial advisors and brokers. However, there are several exceptions to the increasing disparities that may help individual investors and those in small institutions. In summary, AI tools will likely increase the disparity in investor return between individual and institutional investors and between large and smaller institutions. Yet we believe that these disparities can be prevented or mitigated through collaborative efforts of investors, the public, academics, and government officials.

  • research-article
    Dapeng Oliver Wu
  • research-article
    Renwei Yang, Yun Wang, Yongcan Luo, Zhengjie Yang, Zhimin Zong, Dapeng Oliver Wu

    In this survey, we examine contemporary advancements in Artificial Intelligence (AI) applications for Financial Technology (FinTech), with a specific focus on three rapidly evolving domains: recommendation systems, risk analysis, and AI-generated commercial content (AIGC). For recommendation systems, self-supervised learning and graph neural network methodologies facilitate real-time, hyper-personalized financial product suggestions, optimizing the balance between conversion efficacy and regulatory adherence. For risk analysis, large language models, including GPT-4 and Llama 3, enhanced through sophisticated prompt engineering techniques, have significantly transformed credit assessment and stress testing processes for small and medium-sized enterprises, reducing analytical cycles from weeks to minutes. Concurrently, multimodal generative models, such as DALL-E 3, are revolutionizing advertising through the automated generation of compliant and engaging content across textual, visual, and video formats, markedly compressing production timelines. The survey further critically addresses persistent challenges, encompassing data privacy, algorithmic transparency, and cultural bias within AIGC, while delineating future research trajectories for developing trustworthy and scalable AI solutions in FinTech.

  • research-article
    Juntao Hu, Zhengjie Yang, Peng Wang, Guanyi Zhao, Hong Huang, Zhimin Zong, Dapeng Oliver Wu

    Federated Learning (FL) has emerged as a transformative paradigm in medical image analysis, addressing the critical challenges of data scarcity and patient privacy. By enabling collaborative model training across decentralized datasets without requiring data sharing, FL aligns with stringent privacy regulations like HIPAA and GDPR. However, existing surveys on FL for medical image analysis often focus narrowly on aspects like privacy and security or fail to categorize methods within a clear taxonomy. Our survey bridges these gaps by systematically organizing FL methodologies for medical image analysis around three core pillars: training, architecture, and unlearning. We emphasize the unique demands of the medical domain, such as handling heterogeneous imaging modalities and annotations. Unlike prior works, our survey strikes a balance between technical rigor and clinical practicality, covering approaches not only for privacy and security but also for accuracy and efficiency. By synthesizing insights from various studies, we provide a comprehensive roadmap to guide researchers and practitioners in leveraging FL’s potential to advance AI-driven healthcare.

  • research-article
    Siyuan Guo, Dapeng Oliver Wu

    Current clinical decision-making relies heavily on doctors’ experience, often leading to generic treatments that overlook individual patient factors. A lack of clinical expertise, especially in resource-limited regions, hinders optimal decisions and contributes to higher patient mortality. To address this, traditional AI systems have modeled clinical decision-making as a predator-prey game. However, such approaches fail to recognize that disease agents, such as cancer cells, can exhibit adaptive, human-like intelligence. Immunological studies reveal that malignant tumors evade the immune system through camouflage, coercion, and cytoprotection. To counter these adaptive strategies, game-theoretic approaches are essential. In this paper, we present Game Theoretical AI (GTAI)—a novel approach that formalizes and automates strategic reasoning to enhance clinical decision-making against complex diseases. Inspired by Sun Tzu’s The Art of War and the Thirty-Six Stratagems, GTAI mimics expert clinical reasoning through four stages: (1) observation and diagnosis, (2) treatment planning, (3) execution, and (4) outcome evaluation. Within this framework, GTAI can dynamically select and carry out high-level tactics analogous to humans’ stratagems at each decision stage. This unified approach yields six major discoveries that bridge theory and practice. Collectively, these advances demonstrate the power of integrating strategic intelligence with computational models, opening new avenues for the application of AI in precision medicine and adaptive clinical practice.

  • research-article
    Cong Wang, Wei Bao
  • research-article
    Monan Zhou, Xiaobing Li, Feng Yu, Wei Li

    The EMelodyGen system mainly focuses on melody generation in ABC notation controlled by emotional conditions. To overcome the scarcity of emotional labeled sheet music, we utilize statistical correlations derived from small-scale symbolic music datasets with emotion labels and music psychology conclusions to guide subsequent feature extraction, emotional control and automatic annotation. We then automatically annotate a large, well-structured sheet music collection with rough emotional labels, convert the annotated dataset into ABC notation format, and apply data augmentation to address label imbalance, resulting in the creation of a dataset named Rough4Q. We demonstrate that our system backbone pre-trained on Rough4Q can achieve up to 99% music21 parsing rate. Our emotional control parameters, categorized into directly modifiable, embedding, dual-stage, and guidance features, can be selected and assembled to design customized emotional control templates that can lead to a 91% alignment in emotional expression in blind listening tests. Ablation studies further validate the impact of these control conditions on emotional accuracy.

  • research-article
    Haofeng Wang, Yilin Guo, Tiange Zhang, Zehao Li, Tong Yue, Yizong Wang, Rongqun Lin, Feng Gao, Shiqi Wang, Siwei Ma

    The Yellow River culture is a cornerstone of Chinese civilization, embodying rich historical, social, and ecological significance. To conserve and promote this invaluable cultural heritage, we propose RiverEcho-2.0, a real-time interactive digital system designed to facilitate user engagement with Yellow River culture. As the foundation of our system, we curated and digitized a comprehensive collection of books and documents related to Yellow River heritage, constructing a dedicated multimodal corpus. To effectively leverage this corpus, we introduce a novel multi-modal Document Retrieval-Augmented Generation (RAG) framework that enhances document retrieval through context-aware image-text alignment and joint embedding. Experimental results demonstrate that our method achieves a large improvement over existing state-of-the-art multi-modal RAG baselines, leading to significant gains in downstream tasks.

  • research-article
    Yash Gondkar, Chengjie Zheng, Yumeng Yang, Shiqian Shen, Wei Ding, Ping Chen

    Deep learning models built upon Transformer architectures have led to substantial advancements in sequential data analysis. Nevertheless, their direct application to video-based tasks, such as Group Activity Recognition (GAR), remains constrained by the quadratic computational complexity and excessive memory requirements of global self-attention, especially when handling long video sequences. To overcome these limitations, we propose SUGAR: A Sequence Unfolding Based Transformer Model for Group Activity Recognition. Our approach introduces a novel sequence unfolding and folding mechanism that partitions long video sequences into overlapping local windows, enabling the model to concentrate attention within compact temporal regions. This local attention design dramatically reduces computational cost and memory footprint while maintaining high recognition accuracy. Within the Bi-Causal framework, SUGAR replaces conventional Transformer blocks, and experimental results on the Volleyball dataset demonstrate that our model achieves state-of-the-art performance, consistently exceeding 93% accuracy, with significantly improved efficiency. In addition, we investigate Lightning Attention 2 as an alternative linear-complexity attention module, identifying practical challenges such as increased memory usage and unstable convergence. To ensure robustness and training stability, we incorporate a dedicated safety mechanism that mitigates these issues. In summary, SUGAR offers a scalable, resource-efficient solution for group activity analysis in videos and exhibits strong potential for broader applications involving lengthy sequential data in computer vision and bioinformatics.
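The unfolding and folding mechanism described above, which partitions a long sequence into overlapping local windows so attention stays within compact temporal regions and then merges the windows back, can be sketched with plain arrays. The window size and stride below are illustrative choices, not the paper's hyperparameters:

```python
# Sketch of sequence unfolding/folding with overlapping windows.
import numpy as np

def unfold(seq: np.ndarray, window: int, stride: int) -> np.ndarray:
    """Unfold (T, D) features into (num_windows, window, D) overlapping chunks."""
    T = seq.shape[0]
    starts = range(0, max(T - window, 0) + 1, stride)
    return np.stack([seq[s:s + window] for s in starts])

def fold(windows: np.ndarray, stride: int, T: int) -> np.ndarray:
    """Average overlapping windows back into a (T, D) sequence."""
    num_w, window, D = windows.shape
    out = np.zeros((T, D))
    counts = np.zeros((T, 1))
    for i in range(num_w):
        s = i * stride
        out[s:s + window] += windows[i]
        counts[s:s + window] += 1
    return out / counts          # positions in overlaps are averaged

x = np.arange(40, dtype=float).reshape(20, 2)   # toy 20-frame sequence
w = unfold(x, window=8, stride=4)               # 4 windows of 8 frames
y = fold(w, stride=4, T=20)
print(w.shape, np.allclose(x, y))
```

Attention computed inside each 8-frame window costs far less than global attention over all 20 frames, which is the source of the efficiency gain the abstract claims for long videos.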

  • research-article
    Lanjing Zhang, Yanjun Li
  • research-article
    Ronghua Cai, James She

    This paper presents Verse-in-Wine, a generative framework that integrates Chinese classical poetry, traditional wine culture, and calligraphy painting through large language models (LLMs) and visual generation. Given user-selected intention keywords from culturally grounded categories, the system recommends poetic lines, maps them to symbolic wines and historical calligraphy styles, and synthesizes visually coherent outputs. A fully functional prototype was developed and evaluated through both automated and user studies. LLM-based evaluation across 300 samples achieved an overall score of 0.9165, while a user study with 100 samples yielded a comparable human rating of 0.8900, confirming both the system’s cultural fidelity and usability. The framework demonstrates how generative AI can meaningfully engage with heritage aesthetics, linking related cultures for artistic expression.

  • research-article
    Yefei Huang, Wei Zhong, Shuzhan Hu, Fei Hu, Long Ye, Qin Zhang

    Numerous studies have demonstrated that gender-specific emotional patterns are prevalent and can be reflected in electroencephalography (EEG) signals. However, most existing EEG-based emotion recognition models fail to fully account for these gender differences, leading to limited generalization performance. To address this problem, this paper proposes a regionally progressive graph convolutional network with gender-sensitive domain adaptation (RPGCN-GDA). Grounded in prior information on gender differences, the proposed model is expected to flexibly capture gender-specific connectivity patterns across functional brain regions using a progressive graph structure. By fully fusing hierarchical emotional features and adaptively adjusting distributional differences between genders, our model exhibits remarkable generalization capabilities in both cross-subject and cross-gender emotion recognition tasks. The experimental results on public datasets demonstrate that the model not only excels in subject-dependent and subject-independent tasks but also shows significant advantages in handling gender-specific emotional responses, offering a promising new direction for developing more gender-sensitive emotion recognition systems.

  • research-article
    Yuehan Lee, Yi Qin

    This paper presents MIRTracks, a large-scale dataset containing 240 hours of royalty-free multi-track audio, aiming to address the limitations of traditional music source separation datasets, including single-dimensional annotation and semantic information gaps. By integrating multi-dimensional musical information annotation with a semi-automated annotation pipeline, MIRTracks achieves high-quality semantic annotation across rock, electronic, and pop music genres. Experiments demonstrate that fine-tuning a small-scale model on this dataset significantly improves beat detection accuracy from 66.2% to 80.1%, reaching 91.0% of the performance of large-scale models.

  • research-article
    Qinyan Li, James She

    Curatorial texts are essential interpretive tools in art exhibitions, bridging the communication between artworks, curators, and visitors. While advancements in AI, particularly large language models (LLMs), have opened new possibilities for automating and assisting in the creation of curatorial texts, current AI models often suffer from inaccuracies and limited interpretive depth. This paper proposes “CurateXelerator”, a collaborative Human-AI framework that integrates a structured input strategy into the curatorial workflow to address these challenges. Comprehensive evaluations demonstrate that the proposed method generates texts significantly superior to baseline AI prompts in narrative coherence, rhetorical style, and adherence to constraints. Crucially, in a human paired-samples study, texts by CurateXelerator earned an average quality rating of 3.17/5, achieving statistical parity with texts from human writers, which averaged 2.89/5, while significantly outperforming them in technical simplicity and logical structure. This research contributes a validated collaborative Human-AI framework that enhances curatorial efficiency and quality, a reusable dataset and benchmark for future curatorial practice, and critical insights for integrating AI into human-centered creative practices.

  • research-article
    Yue Gao

    Effective reinforcement learning (RL) for sepsis treatment depends on learning stable, clinically meaningful state representations from irregular ICU time series. While previous works have explored representation learning for this task, the critical challenge of training instability in sequential representations and its detrimental impact on policy performance has been overlooked. This work demonstrates that Controlled Differential Equation (CDE) state representations can achieve strong RL policies when two key factors are met: (1) ensuring training stability through early stopping or stabilization methods, and (2) enforcing acuity-aware representations by correlation regularization with clinical scores (SOFA, SAPS-II, OASIS). Experiments on the MIMIC-III sepsis cohort reveal that a stable CDE autoencoder produces representations strongly correlated with acuity scores and enables RL policies with superior performance (WIS return > 0.9). In contrast, unstable CDE training leads to degraded representations and policy failure (WIS return ∼ 0). Visualizations of the latent space show that stable CDEs not only separate survivor and non-survivor trajectories but also reveal clear acuity score gradients, whereas unstable training fails to capture either pattern. These findings highlight practical guidelines for using CDEs to encode irregular medical time series in clinical RL, emphasizing the need for training stability in sequential representation learning.
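The correlation regularization with acuity scores mentioned above can be illustrated with a minimal numpy sketch: a penalty that shrinks as a latent coordinate becomes more correlated with a clinical score such as SOFA. The function names, synthetic scores, and latent values below are assumptions for illustration, not the authors' code:

```python
# Sketch of an acuity-aware correlation regularizer.
import numpy as np

def pearson(a, b):
    """Pearson correlation of two 1-D arrays."""
    a = (a - a.mean()) / (a.std() + 1e-8)
    b = (b - b.mean()) / (b.std() + 1e-8)
    return float((a * b).mean())

def acuity_regularizer(latent, sofa):
    """Penalty that is small when |corr(latent, sofa)| is large."""
    return 1.0 - abs(pearson(latent, sofa))

rng = np.random.default_rng(1)
sofa = rng.integers(0, 24, 500).astype(float)   # synthetic SOFA scores
aligned = 0.9 * sofa + rng.normal(0, 1, 500)    # acuity-aligned latent
random_latent = rng.normal(0, 1, 500)           # uninformative latent
print(acuity_regularizer(aligned, sofa) < acuity_regularizer(random_latent, sofa))
```

Added to the autoencoder's reconstruction loss, a term of this shape pulls the latent space toward the acuity-score gradients the abstract reports in its visualizations.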

  • research-article
    Tiange Zhou, Marco Bidin

    The systematic comparison of pre-modern literary traditions demands a methodology capable of balancing historical specificity with analytical rigor. This study constructs an integrative framework that bridges deep hermeneutic traditions and computational literary analysis. Focusing on the peony in Tang-dynasty China (618-907 CE) and the rose in Renaissance England (ca. 1500-1660 CE), we situate these emblematic flowers within Janet Abu-Lughod’s paradigm of the pre-modern Afro-Eurasian “world system”, establishing their comparability as distinct yet interconnected cultural formations. To move beyond impressionistic analogy, we introduce a computational transposition of the logic of cultural dimensions research—the translation of abstract values into measurable axes of variation—from sociological surveys to poetic language. Treating the peony and the rose as Ernst Cassirer’s symbolic forms, we employ a multi-method computational triangulation—multilingual BERT embeddings for semantic resonance, LDA topic modeling for thematic structure, and transformer-based emotion analysis for affective tonality—to model their symbolic architectures across a bilingual corpus of 49 Tang peony poems and 45 Renaissance rose sonnets after quality control and alignment constraints. Results reveal a shared preoccupation with beauty and transience as a universal axis of meaning. Yet a significant negative correlation (r = −0.128, p < 0.05) between semantic and thematic alignment suggests a pattern of structural complementarity rather than direct similarity: Tang and Renaissance poetics converge in emotional and aesthetic concerns while diverging in symbolic framing. The study thus proposes both a substantive insight into cross-civilizational poetics and a transferable methodological paradigm—one that complements close reading with empirical scalability and structural clarity in the comparative study of world literature.

  • research-article
    Luntian Mou