Despite significant successes in knowledge discovery, traditional machine learning methods may fail to achieve satisfactory performance when dealing with complex data, such as imbalanced, high-dimensional, or noisy data. The reason is that it is difficult for these methods to capture the multiple characteristics and underlying structure of the data. In this context, how to effectively construct an efficient knowledge discovery and mining model has become an important topic in the data mining field. Ensemble learning, an active research topic, aims to integrate data fusion, data modeling, and data mining into a unified framework. Specifically, ensemble learning first extracts a set of features with a variety of transformations. Based on these learned features, multiple learning algorithms are used to produce weak predictive results. Finally, ensemble learning adaptively fuses the informative knowledge from these results via voting schemes to achieve knowledge discovery and better predictive performance. In this paper, we review the research progress of the mainstream approaches to ensemble learning and classify them based on different characteristics. In addition, we present challenges and possible research directions for each mainstream approach, and we also introduce the combination of ensemble learning with other active areas of machine learning, such as deep learning and reinforcement learning.
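The fusion step described above can be illustrated with a minimal sketch. It assumes plain unweighted majority voting, which is only one of the voting schemes the survey refers to and is not adaptive; the three threshold-based weak learners and the sample values are hypothetical and chosen purely for illustration.

```python
from collections import Counter
from typing import Callable, List, Sequence

Sample = Sequence[float]
Classifier = Callable[[Sample], int]


def majority_vote(predictions: Sequence[int]) -> int:
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(predictions).most_common(1)[0][0]


def ensemble_predict(classifiers: List[Classifier], x: Sample) -> int:
    """Collect one prediction per weak learner and fuse them by voting."""
    return majority_vote([clf(x) for clf in classifiers])


# Hypothetical weak learners, each looking at a different "view" of the sample.
weak_learners: List[Classifier] = [
    lambda x: int(x[0] > 0.5),         # uses only the first feature
    lambda x: int(x[1] > 0.5),         # uses only the second feature
    lambda x: int(x[0] + x[1] > 1.0),  # uses a simple transformed feature
]

if __name__ == "__main__":
    # The individual learners disagree on this sample; the vote decides.
    print(ensemble_predict(weak_learners, (0.9, 0.2)))
```

A weighted or learned combination rule could replace `majority_vote` without changing the rest of the sketch.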
Compilers are widely used infrastructure for accelerating software development and are expected to be trustworthy. In the literature, various testing technologies have been proposed to guarantee the quality of compilers. However, it remains difficult to comprehensively characterize and understand compiler testing. To overcome this obstacle, we propose a literature analysis framework to gain insights into the compiler testing area. First, we perform an extensive search to construct a dataset of compiler testing papers. Then, we conduct a bibliometric analysis of the productive authors, the influential papers, and the frequently tested compilers based on our dataset. Finally, we use association rules and collaboration networks to mine the authorships and the communities of interests among researchers and keywords. We find that the USA is the leading country, with the most influential researchers and institutions. The most active keyword is “random testing”. We also find that most researchers in the compiler testing area have broad interests but collaborate within small groups.
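As a rough illustration of association-rule mining over paper keywords, the sketch below computes the two standard rule measures, support and confidence. The keyword sets are hypothetical toy data, not drawn from the dataset described above, and the function names are assumptions made for this example.

```python
from typing import List, Set


def support(itemset: Set[str], transactions: List[Set[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent: Set[str], consequent: Set[str],
               transactions: List[Set[str]]) -> float:
    """Estimate P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(antecedent, transactions)
    if base == 0.0:
        return 0.0
    return support(antecedent | consequent, transactions) / base


# Hypothetical keyword sets, one per paper (toy data for illustration only).
papers: List[Set[str]] = [
    {"random testing", "C compiler"},
    {"random testing", "test generation"},
    {"random testing", "C compiler", "test generation"},
    {"formal verification"},
]

if __name__ == "__main__":
    print(support({"random testing", "C compiler"}, papers))
    print(confidence({"random testing"}, {"C compiler"}, papers))
```

Rules whose support and confidence both exceed chosen thresholds would be kept as candidate associations between keywords.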
Blockchain (BC), an emerging distributed database technology with advanced security and reliability, has attracted much attention from experts working on e-finance, intellectual property protection, the Internet of Things (IoT), and so forth. However, its inefficient transaction processing speed, which hinders the widespread adoption of BC, has not yet been well addressed. In this paper, we propose a novel architecture, called the Dual-Channel Parallel Broadcast model (DCPB), which addresses this problem to a greater extent by using three methods: dual communication channels, parallel pipeline processing, and a block broadcast strategy. In the dual-channel model, one channel processes transactions, and the other engages in the execution of BFT. The parallel pipeline processing allows the system to operate asynchronously. The block generation strategy improves the efficiency and speed of processing. Extensive experiments on BeihangChain, a simplified prototype BC system, show that its transaction processing speed can be improved to 16K transactions per second, which could well support many real-world scenarios such as a BC-based energy trading system and the Micro-film copyright trading system in CCTV.
Semi-supervised learning constructs a predictive model by learning from a few labeled training examples and a large pool of unlabeled ones. It has a wide range of application scenarios and has attracted much attention in the past decades. However, although learning performance is expected to improve by exploiting unlabeled data, some empirical studies show that there are situations where the use of unlabeled data may degrade performance. Thus, it is desirable to be able to exploit unlabeled data safely. This article reviews research progress on safe semi-supervised learning, focusing on three types of safeness issues: data quality, where the training data is risky or of low quality; model uncertainty, where the learning algorithm fails to handle the uncertainty during training; and measure diversity, where the safe performance could be adapted to diverse measures.
Elastic simulation plays an important role in computer graphics and has been widely applied in the film and game industries. It is also closely related to virtual reality and computational fabrication applications. The balance between accuracy and performance is the most important challenge in the design of an elastic simulation algorithm. This survey begins with the basic knowledge of elastic simulation and then investigates two major acceleration techniques for it. From the viewpoint of deformation energy, we introduce typical linearization and reduction ideas for acceleration. We also introduce some recent progress in projective and position-based dynamics, which mainly rely on special numerical methods. In addition, optimal control for elastic objects and typical collision resolution techniques are discussed. Finally, we discuss several possible directions for future work on integrating elastic simulation into virtual reality and 3D printing applications.
In recent times, the mobile Internet has witnessed explosive growth in video applications, embracing user-generated content, Internet Protocol television (IPTV), live streaming, video-on-demand, video conferencing, and FaceTime-like video communications. The exponential rise of video traffic and dynamic user behaviors have proved to be a major challenge to video resource sharing and delivery in the mobile environment. In this article, we present a survey of state-of-the-art video distribution solutions over the Internet. We first discuss the challenges of mobile peer-to-peer (MP2P)-based solutions and categorize them into two groups. We discuss the design idea, characteristics, and drawbacks of solutions in each group. We also review solutions for video transmission in wireless heterogeneous networks. Furthermore, we summarize the information-centric networking (ICN)-based video solutions in terms of in-network caching and name-based routing. Finally, we outline the open issues for mobile video systems that require further study.
The evolution of social network and multimedia technologies encourages more and more people to generate and upload visual information, which leads to the generation of large-scale video data. Therefore, advanced compression technologies are highly desirable to facilitate the storage and transmission of this tremendous volume of video data for a wide variety of applications. In this paper, a systematic review of recent advances in large-scale video compression (LSVC) is presented. Specifically, fast video coding algorithms and effective models for improving video compression efficiency are introduced in detail, since coding complexity and compression efficiency are two important factors in evaluating video coding approaches. Finally, the challenges and future research trends for LSVC are discussed.