Dec 2023, Volume 17 Issue 6
    

  • Perspective
  • PERSPECTIVE
    Hongda QI, Changjun JIANG
  • Excellent Young Computer Scientists Forum
  • RESEARCH ARTICLE
    Haoyu MA, Ningning LU, Junjun MEI, Tao GUAN, Yu ZHANG, Xin GENG

    Recently, segmentation-based scene text detection has drawn wide research interest due to its flexibility in describing scene text instances of arbitrary shapes, such as curved text. However, existing methods usually need complex post-processing stages to handle ambiguous labels, i.e., the labels of pixels near the text boundary, which may belong to either the text or the background. In this paper, we present a framework for segmentation-based scene text detection that learns from ambiguous labels. We use label distribution learning to handle the label ambiguity of text annotations, which achieves good performance without an additional post-processing stage. Experiments on benchmark datasets demonstrate that our method produces better results than state-of-the-art methods for segmentation-based scene text detection.

  • RESEARCH ARTICLE
    Xumeng WANG, Ziliang WU, Wenqi HUANG, Yating WEI, Zhaosong HUANG, Mingliang XU, Wei CHEN

    Visualization and artificial intelligence (AI) are widely applied approaches to data analysis. On the one hand, visualization helps humans understand data through intuitive visual representation and interactive exploration. On the other hand, AI is able to learn from data and carry out bulky tasks for humans. In complex data analysis scenarios, such as epidemic traceability and city planning, humans need to understand large-scale data and make decisions, which requires combining the strengths of both visualization and AI. Existing studies have introduced AI-assisted visualization as AI4VIS and visualization-assisted AI as VIS4AI. However, an account of how AI and visualization can complement each other and be integrated into data analysis processes is still missing. In this paper, we define three integration levels of visualization and AI. The highest integration level is described as the framework of VIS+AI, which allows AI to learn human intelligence from interactions and communicate with humans through visual interfaces. We also summarize future directions of VIS+AI to inspire related studies.

  • REVIEW ARTICLE
    Muning WEN, Runji LIN, Hanjing WANG, Yaodong YANG, Ying WEN, Luo MAI, Jun WANG, Haifeng ZHANG, Weinan ZHANG

    Transformer architectures have facilitated the development of large-scale and general-purpose sequence models for prediction tasks in natural language processing and computer vision, e.g., GPT-3 and Swin Transformer. Although these models were originally designed for prediction problems, it is natural to ask whether they are suitable for sequential decision-making and reinforcement learning (RL) problems, which are typically beset by long-standing issues involving sample efficiency, credit assignment, and partial observability. In recent years, sequence models, especially the Transformer, have attracted increasing interest in the RL communities, spawning numerous approaches with notable effectiveness and generalizability. This survey presents a comprehensive overview of recent works aimed at solving sequential decision-making tasks with sequence models such as the Transformer, by discussing the connection between sequential decision-making and sequence modeling, and by categorizing these works based on the way they utilize the Transformer. Moreover, this paper puts forth various potential avenues for future research intended to improve the effectiveness of large sequence models for sequential decision-making, encompassing theoretical foundations, network architectures, algorithms, and efficient training systems.

  • Software
  • RESEARCH ARTICLE
    Bo YANG, Xiuyin MA, Chunhui WANG, Haoran GUO, Huai LIU, Zhi JIN

    Agile development aims at rapidly developing software while embracing the continuous evolution of user requirements along the whole development process. User stories are the primary means of requirements collection and elicitation in agile development. A project can involve a large number of user stories, which should be clustered into different groups based on their functional similarity for systematic requirements analysis, effective mapping to developed features, and efficient maintenance. Nevertheless, user story clustering is currently conducted mainly by hand, which is time-consuming and subject to human bias. In this paper, we propose a novel approach for clustering user stories automatically on the basis of natural language processing. Specifically, the sentence patterns of each component in a user story are first analysed and determined such that the critical structure in the representative tasks can be automatically extracted based on the user story meta-model. The similarity of user stories is then calculated and used to generate the connected graph that serves as the basis of automatic user story clustering. We evaluate the approach on thirteen datasets and compare it against ten baseline techniques. Experimental results show that our clustering approach has higher accuracy, recall rate and F1-score than these baselines. It is demonstrated that the proposed approach can significantly improve the efficacy of user story clustering and thus enhance the overall performance of agile development. The study also highlights promising research directions for more accurate requirements elicitation.
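
The last two steps of the abstract above (pairwise similarity, then a graph whose connected parts form the clusters) can be illustrated with a minimal sketch. It is not the authors' method: Jaccard token overlap stands in for their sentence-pattern-based similarity, the `threshold` value is hypothetical, and reading "connected graph" as "clusters are connected components" is an assumption.

```python
from collections import defaultdict


def jaccard(a: set, b: set) -> float:
    """Token-overlap similarity; a stand-in for the paper's similarity measure."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_stories(stories, threshold=0.5):
    """Group stories whose similarity graph connects them.

    An edge joins two stories when their similarity reaches `threshold`
    (a hypothetical cut-off); each connected component is one cluster.
    Returns a sorted list of clusters, each a list of story indices.
    """
    tokens = [set(s.lower().split()) for s in stories]
    parent = list(range(len(stories)))  # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(stories)):
        for j in range(i + 1, len(stories)):
            if jaccard(tokens[i], tokens[j]) >= threshold:
                parent[find(i)] = find(j)  # merge the two components

    groups = defaultdict(list)
    for i in range(len(stories)):
        groups[find(i)].append(i)
    return sorted(groups.values())


stories = [
    "as a user i want to reset my password",
    "as a user i want to change my password",
    "as an admin i want to export sales reports",
    "as an admin i want to export audit reports",
]
clusters = cluster_stories(stories)
```

With these four made-up stories, the two password stories end up in one cluster and the two report stories in another, because only those pairs share enough tokens to be linked.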

  • RESEARCH ARTICLE
    Yamin HU, Hao JIANG, Zongyao HU

    The maintainability of source code is a key characteristic of software quality. Many approaches have been proposed to quantitatively measure code maintainability. Such approaches rely heavily on code metrics, e.g., the number of Lines of Code and McCabe’s Cyclomatic Complexity. The employed code metrics are essentially statistics regarding code elements, e.g., the numbers of tokens, lines, references, and branch statements. However, natural language in source code, especially identifiers, is rarely exploited by such approaches. As a result, replacing meaningful identifiers with nonsense tokens would not significantly influence their outputs, although the replacement should significantly reduce code maintainability. To this end, in this paper, we propose a novel approach (called DeepM) to measure code maintainability by exploiting the lexical semantics of text in source code. DeepM leverages deep learning techniques (e.g., LSTM and attention mechanism) to exploit these lexical semantics in measuring code maintainability. Another key rationale of DeepM is that measuring code maintainability is complex and often far beyond the capabilities of statistics or simple heuristics. Consequently, DeepM leverages deep learning techniques to automatically select useful features from complex and lengthy inputs and to construct a complex mapping (rather than simple heuristics) from the input to the output (code maintainability index). DeepM is evaluated on a manually-assessed dataset. The evaluation results suggest that DeepM is accurate, and it generates the same rankings of code maintainability as those of experienced programmers on 87.5% of manually ranked pairs of Java classes.

  • LETTER
    Ding BAO, Wei REN, Yuexin XIANG, Weimao LIU, Tianqing ZHU, Yi REN, Kim-Kwang Raymond CHOO
  • Artificial Intelligence
  • RESEARCH ARTICLE
    Shuo TAN, Lei ZHANG, Xin SHU, Zizhou WANG

    The attention mechanism has become a widely researched method to improve the performance of convolutional neural networks (CNNs). Most studies focus on designing channel-wise and spatial-wise attention modules but neglect the importance of the unique information in each feature, which is critical for deciding both “what” and “where” to focus. In this paper, a feature-wise attention module is proposed, which can give each feature of the input feature map an attention weight. Specifically, the module is based on the well-known surround suppression phenomenon in neuroscience, and it consists of two sub-modules: a Minus-Square-Add (MSA) operation and a group of learnable non-linear mapping functions. The MSA imitates surround suppression and defines an energy function that can be applied to each feature to measure its importance. The group of non-linear functions refines the energy calculated by the MSA to more reasonable values. With these two sub-modules, feature-wise attention can be well captured. Meanwhile, due to the simple structure and few parameters of the two sub-modules, the proposed module can easily be integrated into almost any CNN. To verify the performance and effectiveness of the proposed module, several experiments were conducted on the Cifar10, Cifar100, Cinic10, and Tiny-ImageNet datasets. The experimental results demonstrate that the proposed module is flexible and effective in improving the performance of CNNs.

  • RESEARCH ARTICLE
    Yufei ZENG, Zhixin LI, Zhenbin CHEN, Huifang MA

    Deep learning methods based on syntactic dependency trees have achieved great success on Aspect-based Sentiment Analysis (ABSA). However, the accuracy of the dependency parser cannot be determined, which may keep aspect words away from their related opinion words in a dependency tree. Moreover, few models incorporate external affective knowledge for ABSA. Based on this, we propose a novel architecture to tackle the above two limitations, while filling the gap in applying heterogeneous graph convolution networks to ABSA. Specifically, we employ affective knowledge as a sentiment node to augment the representation of words. Then, we link sentiment nodes, which have different attributes, with word nodes through a specific edge to form a heterogeneous graph based on the dependency tree. Finally, we design a multi-level semantic heterogeneous graph convolution network (Semantic-HGCN) to encode the heterogeneous graph for sentiment prediction. Extensive experiments are conducted on the datasets SemEval 2014 Task 4, SemEval 2015 Task 12, SemEval 2016 Task 5 and ACL 14 Twitter. The experimental results show that our method achieves state-of-the-art performance.

  • RESEARCH ARTICLE
    Bin-Bin JIA, Jun-Ying LIU, Jun-Yi HANG, Min-Ling ZHANG

    Multi-class classification can be solved by decomposing it into a set of binary classification problems according to some encoding rules, e.g., one-vs-one, one-vs-rest, and error-correcting output codes. Existing works solve these binary classification problems in the original feature space, which might be suboptimal as different binary classification problems correspond to different positive and negative examples. In this paper, we propose to learn label-specific features for each decomposed binary classification problem to account for the specific characteristics contained in its positive and negative examples. Specifically, to generate the label-specific features, clustering analysis is conducted separately on the positive and negative examples in each decomposed binary data set to discover their inherent information, and then the label-specific features for an example are obtained by measuring the similarity between it and all cluster centers. Experiments clearly validate the effectiveness of learning label-specific features for decomposition-based multi-class classification.
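
The feature construction described above (cluster positives and negatives separately, then describe each example by its relation to all cluster centers) can be sketched for a single decomposed binary problem. This is an illustrative sketch, not the authors' implementation: the plain k-means routine, the choice of Euclidean distance as the similarity measure, and the parameter `k` are all assumptions.

```python
import math
import random


def kmeans(points, k, n_iter=20, seed=0):
    """Plain k-means over a list of numeric tuples; returns k centers."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(n_iter):
        buckets = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            buckets[nearest].append(p)
        for c, members in enumerate(buckets):
            if members:  # keep the old center if a cluster went empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def label_specific_features(X, y, k=2):
    """Map each example to its distances from all positive and negative centers.

    `y` holds the binary labels (1 or 0) of one decomposed problem.
    Distance is used as an (inverse) similarity here, which is an assumption.
    """
    positives = [x for x, t in zip(X, y) if t == 1]
    negatives = [x for x, t in zip(X, y) if t == 0]
    centers = kmeans(positives, k) + kmeans(negatives, k)
    return [[math.dist(x, c) for c in centers] for x in X]


# Toy binary problem: four positive and four negative 2-D examples.
X = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0), (10, 1), (0, 10), (1, 10)]
y = [1, 1, 1, 1, 0, 0, 0, 0]
features = label_specific_features(X, y, k=2)
```

Each example is re-represented by `2 * k` values, one per cluster center, and a separate binary classifier would then be trained on these features for each decomposed problem.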

  • RESEARCH ARTICLE
    Ting WU, Hong QIAN, Ziqi LIU, Jun ZHOU, Aimin ZHOU

    The Bayesian network is a popular approach to representing and reasoning about uncertain knowledge. Structure learning is the first step in learning a Bayesian network, and score-based methods are among the most popular ways of learning the structure. In most cases, the score of a Bayesian network is defined as the sum of a log-likelihood score and a complexity score given by a penalty function. If the penalty function is set unreasonably, it may hurt the performance of the structure search. Thus, Bayesian network structure learning is essentially a bi-objective optimization problem. However, the existing bi-objective structure learning algorithms can only be applied to small-scale networks. To this end, this paper proposes a bi-objective evolutionary Bayesian network structure learning algorithm via skeleton constraint (BBS) for medium-scale networks. To boost search performance, BBS introduces the random order prior (ROP) initial operator. ROP generates a skeleton to constrain the search space, which is the key to expanding the scale of structure learning problems. Then, acyclic structures are guaranteed by adding the orders of variables in the initial skeleton. After that, BBS designs the Pareto rank based crossover and skeleton guided mutation operators. These operators work on the skeleton obtained in ROP to make the search more targeted. Finally, BBS provides a strategy to choose the final solution. The experimental results show that BBS can always find a structure that is closer to the ground truth than those found by single-objective structure learning methods. Furthermore, compared with the existing bi-objective structure learning methods, BBS is scalable and can be applied to medium-scale Bayesian network datasets. On the educational problem of discovering the factors that influence students’ academic performance, BBS provides higher-quality solutions and offers flexibility in solution selection compared with widely used Bayesian network structure learning methods.
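
The penalized score mentioned in this abstract can be written out as follows. The notation is illustrative rather than taken from the paper: $G$ is a candidate structure, $D$ a data set of $N$ samples, $\hat{\theta}_G$ the maximum-likelihood parameters, and $\dim(G)$ the number of free parameters. The BIC score is shown only as a familiar instance of such a penalty, and the abstract does not state which objectives BBS actually uses.

```latex
% Generic single-objective score: log-likelihood minus a weighted complexity penalty
\mathrm{score}(G \mid D) = \log P(D \mid G, \hat{\theta}_G) - \lambda \, \mathrm{pen}(G)

% A familiar instance (BIC): the penalty grows with the number of free parameters
\mathrm{BIC}(G \mid D) = \log P(D \mid G, \hat{\theta}_G) - \frac{\log N}{2} \, \dim(G)

% Bi-objective view: keep the two terms separate instead of fixing \lambda
\max_{G} \; \bigl( \log P(D \mid G, \hat{\theta}_G), \; -\mathrm{pen}(G) \bigr)
```

In the bi-objective form no single weight $\lambda$ has to be chosen in advance. The search returns a set of Pareto-optimal structures trading fit against complexity, which is why a final selection strategy is needed.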

  • LETTER
    Xiaohu LUO, Zili ZHANG
  • LETTER
    Junxiao XUE, Shibo HUANG, Huawei SONG, Lei SHI
  • LETTER
    Ruisong LI, Hongjiu LIU, Yanrong HU
  • LETTER
    Shangwei WU, Yingtong XIONG, Chuliang WENG
  • LETTER
    Jipeng QIANG, Yang LI, Yun LI, Yunhao YUAN, Yi ZHU
  • LETTER
    Shiyu ZHU, Yun LI, Xiaoye OUYANG, Xiaocheng HU, Jipeng QIANG
  • Theoretical Computer Science
  • RESEARCH ARTICLE
    Huisi ZHOU, Dantong OUYANG, Xinliang TIAN, Liming ZHANG

    Model-based diagnosis (MBD) with multiple observations is important for identifying fault locations. The existing approaches for MBD with multiple observations use observations that are inconsistent with the prediction of the system. In this paper, we propose a novel diagnosis approach, namely, the Diagnosis with Different Observations (DiagDO), to perform diagnosis given a set of pseudo normal observations and a set of abnormal observations. Three ideas are proposed in this paper. First, for each pseudo normal observation, we propagate the values of system inputs and obtain fanin-free edges to shrink the set of possibly faulty components. Second, for each abnormal observation, we utilize filtered nodes to seek surely normal components. Finally, we encode all the surely normal components and parts of the dominated components into hard clauses and compute diagnoses using the MaxSAT solver and MCS algorithm. Extensive tests on the ISCAS'85 and ITC'99 benchmarks show that our approach performs better than the state-of-the-art algorithms.

  • LETTER
    Xiangyan KONG, Zhen ZHANG
  • Networks and Communication
  • LETTER
    Zihang SONG, Han ZHANG, Sean FULLER, Andrew LAMBERT, Zhinong YING, Petri MAHONEN, Yonina ELDAR, Shuguang CUI, Mark D. PLUMBLEY, Clive PARINI, Arumugam NALLANATHAN, Yue GAO
  • Information Systems
  • RESEARCH ARTICLE
    Yongquan LIANG, Qiuyu SONG, Zhongying ZHAO, Hui ZHOU, Maoguo GONG

    Session-based recommendation is a popular research topic that aims to predict the next item a user is likely to interact with by exploiting anonymous sessions. The existing studies mainly focus on making predictions by considering a single type of interactive behavior. Some recent efforts have been made to exploit multiple interactive behaviors, but they generally ignore the influences of different interactive behaviors and the noise in interactive sequences. To address these problems, we propose a behavior-aware graph neural network for session-based recommendation. First, different interactive sequences are modeled as directed graphs, from which the item representations are learned via graph neural networks. Then, a sparse self-attention module is designed to remove the noise in behavior sequences. Finally, the representations of different behavior sequences are aggregated with the gating mechanism to obtain the session representations. Experimental results on two public datasets show that our proposed method outperforms all competitive baselines. The source code is available on GitHub.
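
The first step above, modeling each behavior sequence as a directed graph, can be sketched as follows. This is only an illustration under stated assumptions: the behavior names (`click`, `purchase`), the item identifiers, and the use of transition counts as edge weights are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def session_to_graph(item_sequence):
    """Turn one item sequence into a weighted directed graph.

    Each consecutive pair (previous item, next item) becomes a directed
    edge; the weight counts how often that transition occurs.
    """
    edges = defaultdict(int)
    for src, dst in zip(item_sequence, item_sequence[1:]):
        edges[(src, dst)] += 1
    return dict(edges)


def build_behavior_graphs(session):
    """Build one directed graph per behavior type within a session."""
    return {behavior: session_to_graph(items) for behavior, items in session.items()}


# Hypothetical anonymous session with two behavior types.
session = {
    "click": ["a", "b", "c", "b", "c"],
    "purchase": ["b", "c"],
}
graphs = build_behavior_graphs(session)
```

Keeping one graph per behavior type lets a downstream model learn separate item representations for each behavior before aggregating them into a single session representation.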

  • LETTER
    Fangshu CHEN, Yufei ZHANG, Lu CHEN, Xiankai MENG, Yanqiang QI, Jiahui WANG
  • LETTER
    Fan LI, Tiancheng ZHANG, Shengjia CUI, Hengyu LIU, Zhibin REN, Donglin DI, Xiao WANG, Po ZHANG, Ge YU
  • Image and Graphics
  • LETTER
    Shaolei LIU, Xiaoyuan LUO, Kexue FU, Manning WANG, Zhijian SONG
  • LETTER
    Mingqiang GUO, Hongting SHENG, Zhizheng ZHANG, Ying HUANG, Xueye CHEN, Cunjin WANG, Jiaming ZHANG
  • Information Security
  • LETTER
    Xiaofan LIU, Wei REN, Kim-Kwang Raymond CHOO
  • LETTER
    Liu ZHANG, Jinyu LU, Zilong WANG, Chao LI
  • ANNOUNCEMENT