Dec 2018, Volume 12 Issue 6
    

  • REVIEW ARTICLE
    Mu WANG, Changqiao XU, Shijie JIA, Gabriel-Miro MUNTEAN

    In recent times, mobile Internet has witnessed explosive growth in video applications, including user-generated content, Internet Protocol television (IPTV), live streaming, video-on-demand, video conferencing, and FaceTime-like video communications. The exponential rise in video traffic and dynamic user behaviors pose a major challenge to video resource sharing and delivery in the mobile environment. In this article, we present a survey of state-of-the-art video distribution solutions over the Internet. We first discuss the challenges of mobile peer-to-peer (MP2P)-based solutions and categorize them into two groups, discussing the design ideas, characteristics, and drawbacks of the solutions in each group. We also review solutions for video transmission in wireless heterogeneous networks. Furthermore, we summarize information-centric networking (ICN)-based video solutions in terms of in-network caching and name-based routing. Finally, we outline open issues for mobile video systems that require further study.

  • REVIEW ARTICLE
    Ruochen HUANG, Xin WEI, Liang ZHOU, Chaoping LV, Hao MENG, Jiefeng JIN

    With the development of mobile communication technology and the proliferation of mobile devices, the requirements for user quality of experience (QoE) are becoming ever higher. Network operators and content providers are interested in QoE evaluation as a means of improving users' QoE. However, multimedia QoE evaluation faces severe challenges due to the subjective nature of QoE. In this paper, we survey the state of the art in applying data-driven approaches to QoE evaluation. First, we describe how to choose the factors that influence QoE. Then we investigate and discuss the strengths and shortcomings of existing machine learning algorithms for modeling and predicting users' QoE. Finally, we describe our research on how to evaluate QoE on imbalanced datasets.
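
    To make the imbalance issue concrete, the following is a minimal, hypothetical sketch of data-driven QoE classification with class reweighting; the features (bitrate, stall ratio, RTT), thresholds, and synthetic data are illustrative assumptions, not drawn from the paper.

```python
# Hypothetical sketch: predicting a binary QoE label (satisfied / unsatisfied)
# from network-level features, with class weighting to counter imbalance.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
n = 2000
# Illustrative influencing factors: bitrate (kbps), stall ratio, RTT (ms).
X = np.column_stack([
    rng.uniform(300, 5000, n),   # bitrate
    rng.beta(1, 20, n),          # stall ratio (mostly near zero)
    rng.uniform(10, 300, n),     # round-trip time
])
# Imbalanced labels: most sessions are rated "satisfied" (label 0).
y = ((X[:, 1] > 0.2) | (X[:, 2] > 280)).astype(int)  # 1 = unsatisfied (rare)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
# class_weight='balanced' reweights the rare "unsatisfied" class.
clf = RandomForestClassifier(n_estimators=200, class_weight="balanced",
                             random_state=0)
clf.fit(X_tr, y_tr)
print("F1 on minority class:", f1_score(y_te, clf.predict(X_te)))
```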

  • RESEARCH ARTICLE
    Qiang LIU, Xiaoshe DONG, Heng CHEN, Yinfeng WANG

    Large-scale graph computation is often required in a variety of emerging applications such as social network computation and Web services. Such graphs are typically large and frequently updated with minor changes. However, re-computing an entire graph when a few vertices or edges are updated is often prohibitively expensive. To reduce the cost of such updates, this study proposes an incremental graph computation model called IncPregel, which leverages the non-aftereffect property of the first-order Markov chain and provides incremental programming abstractions to avoid redundant computation and message communication. This is accomplished by employing an efficient and fine-grained reuse mechanism. We implemented this model on Hama, a popular open-source framework based on Pregel, to construct an incremental graph processing system called IncHama. IncHama automatically detects changes in the input in order to recognize "changed vertices" and to exchange reusable data by means of shuffling. The evaluation results on large-scale graphs show that, compared with Hama, IncHama is 1.1–2.7 times faster and reduces communication messages by more than 50% when the number of incremental edges increases from 0.1k to 100k.
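
    The incremental idea can be sketched as follows: only vertices affected by changed edges are re-activated, and untouched vertices reuse their previous values. This toy example uses incremental shortest paths (for edge insertions or weight decreases) as a stand-in for the general vertex program; it is not the IncHama implementation.

```python
# Toy sketch of Pregel-style incremental recomputation: apply the edge
# updates, mark only the affected targets as "changed vertices", and run
# supersteps over the dirty set while reusing all previous distances.
def incremental_sssp(adj, dist, changed_edges):
    """adj: {u: {v: weight}}; dist: previous results; changed_edges: [(u, v, w)].
    Handles edge insertions / weight decreases, reusing old distances."""
    active = set()
    for u, v, w in changed_edges:              # apply updates, mark targets dirty
        adj.setdefault(u, {})[v] = w
        active.add(v)
    while active:                              # superstep loop over dirty vertices
        next_active = set()
        for v in active:
            best = min((dist.get(u, float("inf")) + w
                        for u, nbrs in adj.items()
                        if (w := nbrs.get(v)) is not None),
                       default=float("inf"))
            if best < dist.get(v, float("inf")):   # improved: propagate downstream
                dist[v] = best
                next_active.update(adj.get(v, {}))
        active = next_active
    return dist

graph = {"s": {"a": 1.0}, "a": {"b": 5.0}}
dist = {"s": 0.0, "a": 1.0, "b": 6.0}          # previous (pre-update) result
print(incremental_sssp(graph, dist, [("a", "b", 2.0)]))  # only b is recomputed
```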

  • RESEARCH ARTICLE
    Jingyu ZHANG, Chentao WU, Dingyu YANG, Yuanyi CHEN, Xiaodong MENG, Liting XU, Minyi GUO

    The traditional dynamic random-access memory (DRAM) storage medium can be integrated on chips via modern emerging 3D-stacking technology to architect a DRAM shared cache in multicore systems. Compared with static random-access memory (SRAM), DRAM is larger but slower. In existing research, much work has been devoted to improving workload performance by using SRAM and stacked DRAM together in shared cache systems, ranging from SRAM structure improvement to optimizing cache tags and data access. However, little attention has been paid to designing a shared cache scheduling scheme for multiprogrammed workloads with different memory footprints in multicore systems. Motivated by this, we propose a hybrid shared cache scheduling scheme that allows a multicore system to utilize SRAM and 3D-stacked DRAM efficiently, thus achieving better workload performance. The scheme employs (1) a cache monitor, which collects cache statistics; (2) a cache evaluator, which evaluates the cache information while programs execute; and (3) a cache switcher, which self-adaptively chooses between the SRAM and DRAM shared cache modules. A cache data migration policy is naturally developed to guarantee that the scheduling scheme works correctly. Extensive experiments are conducted to evaluate the workload performance of our proposed scheme. The results show that our method can improve multiprogrammed workload performance by up to 25% compared with state-of-the-art methods (including conventional and DRAM cache systems).
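
    A schematic of the monitor / evaluator / switcher structure might look like the sketch below; the statistics, capacities, and thresholds are invented for illustration and are not the paper's parameters.

```python
# Illustrative sketch of the three components named above. Sizes and the
# miss-ratio threshold are assumptions, not values from the paper.
from dataclasses import dataclass

@dataclass
class CacheStats:          # collected by the cache monitor
    accesses: int
    misses: int
    footprint_kb: int      # working-set size of the running program

SRAM_CAPACITY_KB = 2 * 1024      # small but fast (assumed size)
DRAM_CAPACITY_KB = 64 * 1024     # large but slow (assumed size)

def evaluate(stats: CacheStats) -> float:
    """Cache evaluator: miss ratio as a simple quality signal."""
    return stats.misses / max(stats.accesses, 1)

def choose_module(stats: CacheStats) -> str:
    """Cache switcher: programs whose footprint fits in SRAM and that miss
    rarely keep the fast module; others are switched to stacked DRAM."""
    if stats.footprint_kb <= SRAM_CAPACITY_KB and evaluate(stats) < 0.3:
        return "SRAM"
    return "DRAM"

print(choose_module(CacheStats(accesses=10_000, misses=800, footprint_kb=512)))
print(choose_module(CacheStats(accesses=10_000, misses=6_000, footprint_kb=48_000)))
```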

  • RESEARCH ARTICLE
    Munish SAINI, Kuljit Kaur CHAHAL

    Source code management systems (such as git) record changes to code repositories of Open-Source Software (OSS) projects. The metadata about a change includes a change message recording the intention of the change. Classification of changes, based on change messages, into different change types has been explored in the past to understand the evolution of software systems, but only from the perspective of change size and change density. Software evolution analysis based on change classification with a focus on change evolution patterns is still an open research problem. This study examines the change messages of 106 OSS projects, as recorded in their git repositories, to explore their evolutionary patterns with respect to the types of changes performed over time. An automated keyword-based classifier technique is applied to the change messages to categorize the changes into various types (corrective, adaptive, perfective, preventive, and enhancement). Cluster analysis helps to uncover the distinct change patterns that each change type follows. We identify three categories of the 106 projects for each change type: high activity, moderate activity, and low activity. Evolutionary behavior differs across these categories. Projects with high and moderate activity receive the most changes during months 76–81 of the project lifetime. Project attributes such as the number of committers, the number of files changed, and the total number of commits seem to contribute the most to the change activity of the projects. The statistical findings show that the change activity of a project is related to the number of contributors, the amount of work done, and the total commits of the project, irrespective of the change type. Further, we explored the languages and domains of the projects to correlate them with change types. The statistical analysis indicates no significant or strong relation of change types with the domains and languages of the 106 projects.
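
    A keyword-based change classifier of the kind described can be sketched in a few lines; the keyword lists below are illustrative guesses, not the study's exact dictionary.

```python
# Minimal keyword-based classification of commit messages into the five
# change types named in the abstract. Keywords are assumed for illustration.
CHANGE_KEYWORDS = {
    "corrective":  ["fix", "bug", "error", "fail", "crash"],
    "adaptive":    ["port", "upgrade", "migrate", "platform", "compat"],
    "perfective":  ["performance", "optimiz", "speed", "memory"],
    "preventive":  ["refactor", "cleanup", "restructur", "test"],
    "enhancement": ["add", "feature", "support", "implement", "new"],
}

def classify_change(message: str) -> str:
    """Return the first change type whose keywords match the message."""
    msg = message.lower()
    for change_type, keywords in CHANGE_KEYWORDS.items():
        if any(k in msg for k in keywords):
            return change_type
    return "unclassified"

print(classify_change("Fix null pointer crash in parser"))     # corrective
print(classify_change("Add support for HTTP/2 connections"))   # enhancement
```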

  • RESEARCH ARTICLE
    Hailong LIU, Zhanhuai LI, Qun CHEN, Zhaoqiang CHEN

    Data incompleteness is one of the most important data quality problems in enterprise information systems. Most existing data imputation techniques merely deduce approximate values for the incomplete attributes by means of specific data quality rules or mathematical methods. Unfortunately, such approximations may be far from the truth, and when the observed data is inadequate, these techniques do not work well. The World Wide Web (WWW) has become the most important and most widely used information source, and several recent works have shown that using Web data can improve the quality of databases. In this paper, we propose a Web-based relational data imputation framework, which tries to automatically retrieve real values from the WWW for the incomplete attributes. We take full advantage of the relations among different kinds of objects, based on the idea that the same kinds of things must have the same kinds of relations with their relatives in a specific world. Our proposed techniques consist of two automatic query formulation algorithms and one graph-based candidate extraction model. Evaluations on two high-quality real datasets and one poor-quality real dataset demonstrate the effectiveness of our approaches.
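
    To illustrate what query formulation can mean here, the following is a hypothetical sketch that turns the complete attributes of an incomplete tuple into Web search queries; the templates and example tuple are assumptions, not the paper's algorithms.

```python
# Hypothetical query formulation for Web-based imputation: the known
# attributes of a tuple become search queries for the missing one.
def formulate_queries(record, missing_attr):
    """Build candidate search queries from a record's complete attributes."""
    known = {k: v for k, v in record.items()
             if v is not None and k != missing_attr}
    # Query 1: all known values plus the missing attribute's name.
    q1 = " ".join(map(str, known.values())) + f" {missing_attr}"
    # Query 2: the same information phrased as a question template.
    q2 = f"what is the {missing_attr} of " + " ".join(map(str, known.values()))
    return [q1, q2]

paper = {"title": "IncPregel", "venue": None, "year": 2018}
print(formulate_queries(paper, "venue"))
```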

  • RESEARCH ARTICLE
    Qianjun ZHANG, Lei ZHANG

    Convolutional neural networks (CNNs) are typical structures for deep learning and are widely used in image recognition and classification. However, the random initialization strategy tends to become stuck at local plateaus or even diverge, which results in rather unstable and ineffective solutions in real applications. To address this limitation, we propose a hybrid deep learning CNN-AdapDAE model, which applies the features learned by the AdapDAE algorithm to initialize CNN filters and then trains the improved CNN for classification tasks. In this model, AdapDAE is proposed as a CNN pre-training procedure that adaptively obtains the noise level based on the principle of annealing: it starts with a high level of noise and lowers it as training progresses. Thus, the features learned by AdapDAE combine features at different levels of granularity. Extensive experimental results on the STL-10, CIFAR-10, and MNIST datasets demonstrate that the proposed algorithm performs favorably compared to CNN (random filters), CNN-AE (filters pre-trained by an autoencoder), and several other unsupervised feature learning methods.
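
    The annealed-noise idea can be sketched as a corruption schedule for a denoising autoencoder: the noise level starts high and decays over training. The exponential decay and masking noise below are illustrative choices, not the paper's exact formula.

```python
# Sketch of annealed corruption for denoising-autoencoder pre-training.
import numpy as np

def noise_level(epoch, total_epochs, start=0.5, end=0.05):
    """Anneal the masking-noise probability from `start` down to `end`."""
    ratio = epoch / max(total_epochs - 1, 1)
    return start * (end / start) ** ratio      # geometric interpolation

def corrupt(x, p, rng):
    """Masking noise: zero out each input unit with probability p."""
    return x * (rng.random(x.shape) >= p)

rng = np.random.default_rng(0)
x = rng.random((4, 8))
for epoch in range(0, 10, 3):
    p = noise_level(epoch, 10)
    print(f"epoch {epoch}: noise {p:.3f}, kept mean {corrupt(x, p, rng).mean():.3f}")
```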

  • RESEARCH ARTICLE
    Jun XIAO, Sidong LIU, Liang HU, Ying WANG

    Filtering is an essential step in the process of obtaining rock data. To the best of our knowledge, there are no special algorithms for filtering the point clouds of rock masses. Existing filtering methods remove noisy points by fitting the surface of the ground and deleting the points that lie within a certain range above that surface. Such methods have limitations in rock engineering owing to the uniqueness of the particular rock mass being studied. In this paper, a method for filtering rock points is proposed based on a backpropagation (BP) neural network and principal component analysis (PCA). In the proposed method, PCA is applied for feature extraction and for obtaining dimensional information, which can effectively distinguish rock points from other points at different scales. A BP neural network, which has a strong nonlinear processing capability, is then used to identify the rock points from these features. The efficiency of the proposed technique is illustrated by classifying steep rocky slopes into rock and vegetation. A comparison with existing methods indicates the superiority of the proposed method in terms of the point cloud filtering of rock masses.
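
    Dimensional information from PCA typically means eigenvalue-based descriptors of a point's local neighborhood. The sketch below shows one common formulation (linearity, planarity, scattering); the exact features and scales used in the paper may differ.

```python
# PCA-based dimensionality features of a local 3D neighborhood: planar rock
# faces and volumetric vegetation produce very different eigenvalue profiles.
import numpy as np

def dimensionality_features(neighborhood):
    """neighborhood: (k, 3) array of 3D points around a query point.
    Returns (linearity, planarity, scattering) from sorted PCA eigenvalues."""
    centered = neighborhood - neighborhood.mean(axis=0)
    cov = centered.T @ centered / len(neighborhood)
    l1, l2, l3 = np.sort(np.linalg.eigvalsh(cov))[::-1]   # l1 >= l2 >= l3
    linearity  = (l1 - l2) / l1
    planarity  = (l2 - l3) / l1     # high on smooth rock surfaces
    scattering = l3 / l1            # high in volumetric vegetation
    return linearity, planarity, scattering

rng = np.random.default_rng(0)
plane = np.column_stack([rng.random((200, 2)), 0.01 * rng.random(200)])  # rock-like
bush  = rng.random((200, 3))                                             # vegetation-like
print("plane:", np.round(dimensionality_features(plane), 2))
print("bush: ", np.round(dimensionality_features(bush), 2))
```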

  • RESEARCH ARTICLE
    Jun ZHANG, Bineng ZHONG, Pengfei WANG, Cheng WANG, Jixiang DU

    Owing to the inherent lack of training data in visual tracking, recent work on deep learning-based trackers has focused on learning a generic representation offline from large-scale training data and transferring the pre-trained feature representation to the tracking task. Offline pre-training is time-consuming, and the learned generic representation may be either less discriminative for tracking specific objects or overfitted to typical tracking datasets. In this paper, we propose an online discriminative tracking method based on robust feature learning without large-scale pre-training. Specifically, we first design a PCA filter bank-based convolutional neural network (CNN) architecture to learn robust features online from a few positive and negative samples in the high-dimensional feature space. Then, we use a simple soft-thresholding method to produce sparse features that are more robust to target appearance variations. Moreover, we increase the reliability of our tracker using edge information generated from edge box proposals during visual tracking. Finally, effective visual tracking results are achieved by systematically combining the tracking information and edge box-based scores in a particle filtering framework. Extensive results on the widely used online tracking benchmark (OTB-50) with 50 videos validate the robustness and effectiveness of the proposed tracker without large-scale pre-training.
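
    Soft-thresholding itself is a one-line operation; the sketch below shows the standard shrinkage operator the abstract refers to, with an illustrative (assumed) threshold value.

```python
# Soft-thresholding: shrink feature responses toward zero and zero out small
# ones, yielding sparse features less sensitive to appearance noise.
import numpy as np

def soft_threshold(features, tau=0.1):
    """sign(x) * max(|x| - tau, 0), applied elementwise."""
    return np.sign(features) * np.maximum(np.abs(features) - tau, 0.0)

f = np.array([0.05, -0.3, 0.12, -0.02, 0.8])
print(soft_threshold(f))   # small responses become exactly zero
```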

  • RESEARCH ARTICLE
    Yan LI, Shiguang SHAN, Ruiping WANG, Zhen CUI, Xilin CHEN

    High-accuracy face recognition is of great importance for a wide variety of real-world applications. Although significant progress has been made over the last few decades, fully automatic face recognition systems have not yet reached the goal of surpassing the human vision system, even under controlled conditions. In this paper, we propose an approach for robust face recognition by fusing two complementary features: the Gabor magnitude of multiple scales and orientations, and the Fourier phase encoded by spatial pyramid-based local phase quantization (SPLPQ). To reduce the high dimensionality of both features, block-wise Fisher discriminant analysis (BFDA) is applied, and the features are further combined by score-level fusion. Moreover, inspired by biological cognitive mechanisms, multiple face models are exploited to further boost the robustness of the proposed approach. We evaluate the proposed approach on three challenging databases, i.e., FRGC ver2.0, LFW, and CFW-p, covering two face classification scenarios: verification and identification. Experimental results consistently exhibit the complementarity of the two features and the performance boost gained by the multiple face models. The proposed approach achieved approximately a 96% verification rate at an FAR of 0.1% on FRGC ver2.0 Exp.4, surpassing all the best known results.
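
    Score-level fusion of two feature channels can be sketched as follows; the z-score normalization and equal weights are illustrative assumptions, not necessarily the paper's choices.

```python
# Score-level fusion of two match-score channels (e.g., Gabor magnitude and
# SPLPQ phase): normalize each channel, then take a weighted sum.
import numpy as np

def fuse_scores(gabor_scores, lpq_scores, w=0.5):
    """Normalize each channel's match scores, then combine with weight w."""
    def zscore(s):
        s = np.asarray(s, dtype=float)
        return (s - s.mean()) / (s.std() + 1e-9)
    return w * zscore(gabor_scores) + (1 - w) * zscore(lpq_scores)

gabor = [0.91, 0.40, 0.55]   # similarity to three gallery faces
lpq   = [0.80, 0.40, 0.75]
fused = fuse_scores(gabor, lpq)
print("fused:", fused, "-> best match index:", fused.argmax())
```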

  • RESEARCH ARTICLE
    Chen LUO, Fei HE

    Differential privacy enables sensitive data to be analyzed in a privacy-preserving manner. In this paper, we focus on the online setting, where each analyst is assigned a privacy budget and queries the data interactively. Existing differentially private data analytics systems such as PINQ process each query independently, which may waste the privacy budget unnecessarily. Motivated by this, we present a satisfiability modulo theories (SMT)-based query tracking approach to reduce privacy budget usage. In brief, our approach uses SMT solving to automatically locate past queries that access parts of the dataset disjoint from those of the current query, thereby saving privacy cost. To improve efficiency, we further propose an optimization based on explicitly specified column ranges to facilitate the search process. We have implemented a prototype of our approach with Z3 and conducted several sets of experiments. The results show that our approach can save a considerable amount of the privacy budget and that each query can be tracked efficiently, within milliseconds.
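
    The core disjointness check can be illustrated with Z3's Python API: two queries' row predicates are disjoint iff their conjunction is unsatisfiable. The predicates below are invented examples, not the paper's encoding.

```python
# Disjointness via SMT: if (past AND current) is unsat, the queries touch
# disjoint rows and the privacy budget need not be charged twice.
from z3 import Int, And, Solver, unsat

age = Int("age")
past_query    = And(age >= 18, age < 30)   # predicate of a past query
current_query = And(age >= 40, age < 50)   # predicate of the current query

s = Solver()
s.add(And(past_query, current_query))
if s.check() == unsat:
    print("Disjoint rows: privacy budget can be shared.")
else:
    print("Possible overlap: charge the full privacy cost.")
```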

  • RESEARCH ARTICLE
    Hongguo YANG, Derong SHEN, Yue KOU, Tiezhen NIE, Ge YU

    In this paper, an efficient exact PageRank (PR) algorithm is proposed, which improves computation efficiency without sacrificing accuracy. Existing exact algorithms are generally based on the original power method (PM). To reduce the number of I/Os and thereby improve efficiency, they partition the big graph into multiple smaller ones that can fit entirely in memory. The algorithm proposed in this paper further reduces the required number of I/Os. Instead of partitioning the graph into general subgraphs, our algorithm partitions it into a special kind of subgraph: strongly connected components (SCCs), whose nodes are mutually reachable. By exploiting the properties of SCCs, we derive results that allow the computation iterations to be constrained to these SCC subgraphs. Our algorithm thus avoids many I/Os and saves a large amount of computation while preserving exactness, making it more efficient than existing exact algorithms. The experiments demonstrate that the algorithms proposed in this paper achieve a clear efficiency improvement while attaining highly accurate results.
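
    The SCC-based structure can be sketched as follows: condense the graph into its SCCs and process them in topological order, so each iteration touches only one smaller subgraph. This sketch uses networkx for the condensation machinery and only prints the processing order; it is not the paper's algorithm.

```python
# Condense a directed graph into SCCs and walk them in topological order,
# so upstream components are settled before downstream ones.
import networkx as nx

G = nx.DiGraph([(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (5, 4)])
C = nx.condensation(G)                      # DAG whose nodes are SCCs
for scc_id in nx.topological_sort(C):       # process upstream SCCs first
    members = C.nodes[scc_id]["members"]
    print(f"SCC {scc_id}: vertices {sorted(members)}")
    # A full implementation would run power-method iterations restricted to
    # `members`, holding already-converged upstream SCC ranks fixed.
```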

  • RESEARCH ARTICLE
    Xianxian LI, Peipei SUI, Yan BAI, Li-E WANG

    Transactional data collection and sharing currently face the challenge of preventing information leakage and protecting data from privacy breaches while maintaining high-quality data utility. Data anonymization methods such as perturbation, generalization, and suppression have been proposed for privacy protection. However, many of these methods incur excessive information loss and cannot satisfy multipurpose utility requirements. In this paper, we propose a multidimensional generalization method that provides multipurpose optimization when anonymizing transactional data, offering better data utility for different applications. Our methodology uses bipartite graphs together with attribute generalization, item grouping, and outlier perturbation. Experiments on real-life datasets show that our solution considerably improves data utility compared to existing algorithms.
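
    One building block named above, attribute generalization, can be illustrated in a few lines; the hierarchy (exact age to decade range) and the example records are invented for illustration.

```python
# Illustrative attribute generalization: replace a specific quasi-identifier
# value with a broader one from a value hierarchy.
def age_range(age):
    """One hierarchy level: exact age -> decade range."""
    lo = (age // 10) * 10
    return f"{lo}-{lo + 9}"

def generalize(transactions):
    """Coarsen the quasi-identifying 'age' attribute of each record."""
    return [{**t, "age": age_range(t["age"])} for t in transactions]

data = [{"age": 34, "items": ["milk", "beer"]},
        {"age": 37, "items": ["bread"]}]
print(generalize(data))   # both records now share age "30-39"
```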

  • LETTER
    Thanh-Hieu BUI, Seong-Bae PARK
  • LETTER
    Muhammad Aminur RAHAMAN, Mahmood JASIM, Md. Haider ALI, Tao ZHANG, Md. HASANUZZAMAN
  • LETTER
    Muzhou HOU, Yunlei YANG, Taohua LIU, Wenping PENG
  • LETTER
    Xiaochun WANG, Yidong LI