Sep 2017, Volume 11 Issue 5
    

  • PERSPECTIVE
    Kun ZHOU
  • REVIEW ARTICLE
    Zhen LI, Yuqing WANG, Tian ZHI, Tianshi CHEN

    Machine-learning techniques have recently proved successful in various domains, especially in emerging commercial applications. As a set of machine-learning techniques, artificial neural networks (ANNs), which require a considerable amount of computation and memory, are among the most popular algorithms and have been applied in a broad range of applications such as speech recognition, face identification, and natural language processing. Conventional CPUs and GPUs are the straightforward platforms for running ANNs, but they are energy-inefficient because of the excessive effort they spend on flexibility. In response, in recent years many researchers have proposed neural network accelerators that achieve high performance and low power consumption. The main purpose of this article is to briefly review recent related work, as well as the DianNao-family accelerators. In summary, this review can serve as a reference for hardware researchers in the area of neural networks.

  • RESEARCH ARTICLE
    Xiaobing WANG, Cong TIAN, Zhenhua DUAN, Liang ZHAO

    The development of types is an important but challenging issue in temporal logic programming. In this paper, we investigate how to formalize and implement types in the temporal logic programming language MSVL, which is an executable subset of projection temporal logic (PTL). Specifically, we extend MSVL with a few groups of types, including basic data types, pointer types and struct types. For each type, we specify the domain of values and define some standard operations in terms of logic functions and predicates. It then becomes feasible to formalize type declarations of program variables and struct definitions as logic formulas. To implement the theory, we extend the MSV toolkit with support for modeling, simulation and verification of typed MSVL programs. Applications to the construction of an AVL tree and an ordered list show the practicality of the language.

  • REVIEW ARTICLE
    Houkui ZHOU, Huimin YU, Roland HU

    Accurately representing the quantity and characteristics of users’ interest in certain topics is an important problem facing topic evolution researchers, particularly as it applies to modern online environments. Search engines can retrieve information on a specified topic from archived data, but fail to reflect changes in interest toward the topic over time in a structured way. This paper reviews, from multiple aspects, notable research of the past decade on topic evolution based on the probabilistic topic model. First, we introduce the notation, terminology, and basic topic model explored in the survey. We then summarize three categories of topic evolution based on the probabilistic topic model: the discrete time topic evolution model, the continuous time topic evolution model, and the online topic evolution model. Next, we describe applications of the topic evolution model, summarize methods for evaluating model generalization performance and topic evolution, and provide comparative experimental results for different models. To conclude the review, we pose some open questions and discuss possible future research directions.

  • RESEARCH ARTICLE
    Donggang CAO, Lianghuan KANG, Hanglong ZHAN, Hong MEI

    In current cluster computing, several distributed frameworks are designed to support elasticity so that business services can adapt to environment fluctuation. However, most existing work supports elasticity mainly at the resource level, leaving the problem of application-level elasticity to domain-specific frameworks and applications. This paper proposes a general actor-based approach to support application-level elasticity for multiple cluster computing frameworks. The actor model offers scalability and decouples language-level concurrency from the runtime environment. By extending actors, a new middle layer called Unisupervisor is designed to “sit” between the resource management layer and the application framework layer. Actors in Unisupervisor can automatically distribute and execute tasks over clusters and dynamically scale in/out. Based on Unisupervisor, high-level profiles (MasterSlave, MapReduce, Streaming, Graph, and Pipeline) for diverse cluster computing requirements can be supported. The entire approach is implemented in a prototype system called UniAS. In the evaluation, both benchmarks and real applications are tested and analyzed in a small-scale cluster. Results show that UniAS is expressive and efficiently elastic.

  • RESEARCH ARTICLE
    Tao WU, Qiusong YANG, Yeping HE

    Two key issues exist during virtual machine (VM) migration in cloud computing. One is when to start migration, and the other is how to determine a reliable target. In previous studies, both depend entirely on whether the source hypervisor is trusted. Once the source hypervisor is no longer trusted, however, migration faces unprecedented challenges. To address these problems, we propose a secure architecture, SMIG (secure migration), which defines a new concept of Region Critical TCB and leverages an innovative adjacent integrity measurement (AIM) mechanism. AIM dynamically monitors the integrity of its adjacent hypervisor and passes the results to the Region Critical TCB, which then determines whether to start migration and where to migrate according to a table named the integrity validation table. We have implemented a prototype of SMIG based on the Xen hypervisor. Experimental evaluation shows that SMIG can detect a malicious hypervisor and rapidly start migration to a trusted one, incurring only a moderate overhead for computing-intensive and I/O-intensive tasks and a small overhead for others.

  • RESEARCH ARTICLE
    Qian LI, Gang LI, Wenjia NIU, Yanan CAO, Liang CHANG, Jianlong TAN, Li GUO

    Learning from imbalanced data is a challenging task in a wide range of applications, and it attracts significant research effort from the machine learning and data mining communities. As a natural approach to this issue, oversampling balances the training samples by replicating existing samples or synthesizing new ones. In general, synthesization outperforms replication by supplying additional information on the minority class. However, the additional information needs to follow the same normal distribution as the training set, which further constrains the new samples to the predefined range of the training set. In this paper, we present the Wiener process oversampling (WPO) technique, which brings physical phenomena into sample synthesization. WPO constructs a robust decision region by expanding the attribute ranges in the training set while keeping the same normal distribution. WPO achieves satisfactory performance with much lower computational complexity. In addition, by integrating WPO with ensemble learning, the WPOBoost algorithm outperforms many prevalent imbalance learning solutions.

  • RESEARCH ARTICLE
    Wayne Xin ZHAO, Chen LIU, Ji-Rong WEN, Xiaoming LI

    Detecting and using bursty patterns to analyze text streams has been one of the fundamental approaches in many temporal text mining applications. So far, most existing studies have focused on developing methods to detect bursty features based purely on term frequency changes. Few have taken the semantic contexts of bursty features into consideration, and as a result the detected bursty features may not always be interesting and can be hard to interpret. In this article, we propose to model the contexts of bursty features using a language modeling approach. We propose two methods to estimate the context language models, based on sentence-level context and document-level context. We then propose a novel topic diversity-based metric that uses the context models to find newsworthy bursty features. We also propose to use the context models to automatically assign meaningful tags to bursty features. Using a large corpus of news articles, we quantitatively show that the proposed context language models can effectively help rank bursty features by their newsworthiness and assign meaningful tags to annotate them. We also use two example text mining applications to qualitatively demonstrate the usefulness of bursty feature ranking and tagging.

  • RESEARCH ARTICLE
    Yansheng DU, Zhihua CHEN, Changqing ZHANG, Xiaochun CAO

    The design of rectangular concrete-filled steel tubular (CFT) columns has been a major concern owing to their complex constraint mechanism. Most existing methods are based on simplified mechanical models with limited experimental data, which are not reliable under many conditions, e.g., for columns using high-strength materials. In recent years, artificial neural network (ANN) models have proved effective at solving complex problems in many areas of civil engineering. In this paper, ANN models were employed to predict the axial bearing capacity of rectangular CFT columns based on experimental data. A total of 305 experimental samples were collected from published articles; 275 were chosen to train the ANN models and 30 were used for testing. Based on a comparison among different models, artificial neural network model 1 (ANN1) and artificial neural network model 2 (ANN2), each with a 20-neuron hidden layer, were chosen as suitable prediction models. ANN1 has five inputs: the length (D) and width (B) of the cross section, the thickness of the steel (t), the yield strength of the steel (fy), and the cylinder strength of the concrete (f'c). ANN2 has ten inputs: D, B, t, fy, f'c, the length-to-width ratio (D/B), the length-to-thickness ratio (D/t), the width-to-thickness ratio (B/t), the restraint coefficient (ξ), and the steel ratio (α). The axial bearing capacity is the output for both models. The outputs from ANN1 and ANN2 were verified and compared with those from EC4, ACI, GJB4142 and AISC360-10. The results show that the implemented models have good prediction and generalization capacity. A parametric study conducted using ANN1 and ANN2 indicates that the effect of the basic column parameters on the axial bearing capacity of rectangular CFT columns differs from that given by the design codes. The results also provide a convincing design reference for rectangular CFT columns.

  • RESEARCH ARTICLE
    Yongyi YAN, Zengqiang CHEN, Jumei YUE

    A new modeling tool, the algebraic state space approach to logical dynamic systems, recently developed from the theory of the semi-tensor product of matrices (STP), is applied to the automata field. Using the STP, this paper investigates the modeling and control problems of combined automata constructed by parallel, serial and feedback composition. By representing the states and the input and output symbols in vector form, the transition and output functions are expressed as algebraic equations of the states and inputs. Based on these algebraic descriptions, the control problems of combined automata, including output control and state control, are considered, and two necessary and sufficient conditions for controllability are presented. From these conditions, two algorithms are established to find all the control strings that make a combined automaton reach a target state or produce a desired output. The results are quite different from existing methods and provide a new angle and means to understand and analyze the dynamics of combined automata.
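
As a rough illustration of what expressing a transition function as an algebraic equation can look like, the sketch below encodes a toy two-state automaton with the standard left semi-tensor product. It is not taken from the paper: the helper names `stp` and `delta`, the example automaton, and the convention x(t+1) = F ⋉ u(t) ⋉ x(t) with a structure matrix F are all assumptions made for illustration.

```python
import numpy as np

def stp(A, B):
    """Left semi-tensor product of A (m x n) and B (p x q).

    With t = lcm(n, p): A |x B = (A kron I_{t/n}) @ (B kron I_{t/p}).
    It reduces to the ordinary matrix product when n == p.
    """
    n, p = A.shape[1], B.shape[0]
    t = int(np.lcm(n, p))
    return np.kron(A, np.eye(t // n)) @ np.kron(B, np.eye(t // p))

def delta(n, i):
    """i-th column (1-based) of the n x n identity, as an n x 1 vector."""
    return np.eye(n)[:, [i - 1]]

# Hypothetical automaton: states s1, s2 and inputs a, b.
# Input a toggles the state, input b leaves it unchanged.
# Columns of F are ordered as (a,s1), (a,s2), (b,s1), (b,s2).
F = np.hstack([delta(2, 2), delta(2, 1), delta(2, 1), delta(2, 2)])

def step(x, u):
    """One transition under the assumed form x(t+1) = F |x u |x x."""
    return stp(F, stp(u, x))

a, b = delta(2, 1), delta(2, 2)
x = delta(2, 1)              # start in s1
for u in (a, b, a):          # read the input string "aba"
    x = step(x, u)
final_state = int(np.argmax(x)) + 1   # index of the 1 entry in the state vector
```

Because every state and input is a column of an identity matrix, running an input string becomes repeated matrix multiplication, which is what makes controllability questions amenable to algebraic conditions.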

  • RESEARCH ARTICLE
    Wei LI, Yuefei SUI

    This paper proposes a B4-valued propositional logic with three unary logical connectives ∼1, ∼2, ¬ and two binary logical connectives ∧, ∨, and gives a Gentzen-type deduction system that is sound and complete with respect to the B4-valued semantics, where B4 is a Boolean algebra.

  • RESEARCH ARTICLE
    Cheqing JIN, Jie CHEN, Huiping LIU

    Entity matching, which aims to find records that refer to the same real-world object, has been studied for decades. To avoid verifying every pair of records in a massive data set, a common method, known as the blocking-based method, selects a small proportion of record pairs for verification at a cost far lower than O(n²), where n is the size of the data set. Furthermore, executing multiple blocking functions independently is critical, since many more matching records can be found in this way and the quality of the query result can be improved significantly.

    The MapReduce (MR) framework is widely used to improve the performance and scalability of complicated queries by running many map and reduce tasks in parallel. However, entity matching on the MapReduce framework is non-trivial because of two unavoidable challenges: load balancing and pair deduplication. In this paper, we propose a novel solution, called MrEm, to handle these challenges with the support of multiple blocking functions. Although existing work can deal with load balancing and pair deduplication separately, it cannot deal with both challenges at the same time. Theoretical analysis and experimental results on real and synthetic data sets illustrate the high effectiveness and efficiency of the proposed solution.
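
The blocking idea in the abstract above can be sketched on a single machine as follows. This is a minimal illustration, not MrEm itself: it does not use MapReduce or address load balancing, and the function `candidate_pairs`, the two blocking functions and the sample records are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def candidate_pairs(records, blocking_functions):
    """Return record-id pairs that share a block under at least one blocking function."""
    pairs = set()
    for block_key in blocking_functions:
        # Group record ids by the key this blocking function assigns.
        blocks = defaultdict(list)
        for rec_id, rec in records.items():
            blocks[block_key(rec)].append(rec_id)
        # Only records inside the same block are paired for verification.
        for ids in blocks.values():
            for a, b in combinations(sorted(ids), 2):
                # The set drops pairs already produced by another blocking function.
                pairs.add((a, b))
    return pairs

# Hypothetical sample data.
records = {
    1: {"name": "Ann Lee", "zip": "10001"},
    2: {"name": "Anne Lee", "zip": "10001"},
    3: {"name": "Bob Ray", "zip": "20002"},
    4: {"name": "Ann Li", "zip": "30003"},
}

def by_zip(rec):
    return rec["zip"]

def by_name_prefix(rec):
    return rec["name"][:3].lower()

pairs = candidate_pairs(records, [by_zip, by_name_prefix])
all_pairs = len(records) * (len(records) - 1) // 2
```

Here the pair (1, 2) is produced by both blocking functions but kept once, and only 3 of the 6 possible pairs are passed on for verification. In a distributed setting the same deduplication has to happen across tasks, which is part of what makes the MapReduce version non-trivial.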

  • RESEARCH ARTICLE
    Rui LIU, Wenge RONG, Yuanxin OUYANG, Zhang XIONG

    When people want to move to a new job, the sheer amount of job information available often makes it difficult. Selecting an appropriate job and then submitting a resume is tedious. It is particularly difficult for university students, since they normally have no work experience and are unfamiliar with the job market. To deal with the information overload that students face during their transition into work, a job recommendation system can be very valuable. In this research, after investigating the pros and cons of current job recommendation systems for university students, we propose a student-profiling-based re-ranking framework. In this system, students are recommended a list of potential jobs based on students who graduated and obtained job offers over the past few years. Furthermore, recommended employers are also used as input for re-ranking the job recommendation results. Our experimental study on real recruitment data from the past four years shows this method’s potential.