Frontiers of Computer Science

30 Most Downloaded Articles
Mining the interests of Chinese microbloggers via keyword extraction
Zhiyuan LIU, Xinxiong CHEN, Maosong SUN
Front Comput Sci    2012, 6 (1): 76-87.

Microblogging provides a new platform for communicating and sharing information among Web users. Users can express opinions and record daily life using microblogs. Microblogs that are posted by users indicate their interests to some extent. We aim to mine user interests via keyword extraction from microblogs. Traditional keyword extraction methods are usually designed for formal documents such as news articles or scientific papers. Messages posted by microblogging users, however, are usually noisy and full of new words, which is a challenge for keyword extraction. In this paper, we combine a translation-based method with a frequency-based method for keyword extraction. In our experiments, we extract keywords for microblog users from the largest microblogging website in China, Sina Weibo. The results show that our method can identify users’ interests accurately and efficiently.
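The frequency-based side of such an approach can be sketched with a simple TF-IDF scorer that favors words frequent for one user but rare across the user population (a generic illustration with hypothetical function names, not the authors' actual implementation):

```python
from collections import Counter
import math

def tfidf_keywords(user_posts, all_users_posts, top_k=5):
    """Rank a user's words by TF-IDF: frequent for this user,
    rare across the whole user population."""
    tf = Counter(w for post in user_posts for w in post.split())
    n_users = len(all_users_posts)

    def idf(word):
        # number of users whose posts ever mention the word
        df = sum(any(word in p.split() for p in posts) for posts in all_users_posts)
        return math.log((1 + n_users) / (1 + df))

    scored = {w: c * idf(w) for w, c in tf.items()}
    return [w for w, _ in sorted(scored.items(), key=lambda x: -x[1])[:top_k]]
```

A real system would add the translation-based component and filtering of noisy tokens; this only shows the frequency signal.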

Cited: WebOfScience(35)
User behavior modeling for better Web search ranking
Yiqun LIU, Chao WANG, Min ZHANG, Shaoping MA
Front. Comput. Sci.

Modern search engines record user interactions and use them to improve search quality. In particular, user click-through has been successfully used to improve click-through rate (CTR), Web search ranking, and query recommendations and suggestions. Although click-through logs can provide implicit feedback of users’ click preferences, deriving accurate absolute relevance judgments is difficult because of the existence of click noises and behavior biases. Previous studies showed that user clicking behaviors are biased toward many aspects such as “position” (user’s attention decreases from top to bottom) and “trust” (Web site reputations will affect user’s judgment). To address these problems, researchers have proposed several behavior models (usually referred to as click models) to describe users’ practical browsing behaviors and to obtain an unbiased estimation of result relevance. In this study, we review recent efforts to construct click models for better search ranking and propose a novel convolutional neural network architecture for building click models. Compared to traditional click models, our model not only considers user behavior assumptions as input signals but also uses the content and context information of search engine result pages. In addition, our model uses parameters from traditional click models to restrict the meaning of some outputs in our model’s hidden layer. Experimental results show that the proposed model can achieve considerable improvement over state-of-the-art click models based on the evaluation metric of click perplexity.
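As background, the position bias mentioned above is commonly captured by the position-based model (PBM), in which a click requires both examination (rank-dependent) and attractiveness (document-dependent). A minimal sketch with hypothetical function names, using a simple moment estimator rather than the EM fitting used in practice:

```python
def pbm_click_prob(exam_prob, relevance):
    """Position-based model: a result is clicked iff it is examined
    (depends only on its rank) and attractive (depends only on the doc)."""
    return exam_prob * relevance

def estimate_relevance(ctr_at_pos, exam_prob):
    """Debias raw per-position CTR with known examination probabilities:
    relevance ~= CTR / exam_prob (a simple moment estimator)."""
    return {pos: ctr / exam_prob[pos] for pos, ctr in ctr_at_pos.items()}
```

For example, a 0.4 CTR at rank 1 and a 0.1 CTR at rank 2 imply the same underlying relevance once examination probabilities of 0.8 and 0.2 are factored out.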

Cited: Crossref(1) WebOfScience(1)
Big graph search: challenges and techniques
Shuai MA,Jia LI,Chunming HU,Xuelian LIN,Jinpeng HUAI
Front. Comput. Sci.    2016, 10 (3): 387-398.

On one hand, compared with traditional relational and XML models, graphs have more expressive power and are widely used today. On the other hand, various applications of social computing trigger a pressing need for a new search paradigm. In this article, we argue that big graph search is the paradigm filling this gap. We first introduce the application of graph search in various scenarios. We then formalize the graph search problem, and give an analysis of graph search from an evolutionary point of view, followed by evidence from both industry and academia. After that, we analyze the difficulties and challenges of big graph search. Finally, we present three classes of techniques towards big graph search: query techniques, data techniques, and distributed computing techniques.

Cited: Crossref(16) WebOfScience(13)
The use of mathematics in software quality assurance
David Lorge PARNAS
Front Comput Sci    2012, 6 (1): 3-16.

The use of mathematics for documenting, inspecting, and testing software is explained and illustrated. Three measures of software quality are described and discussed. Then three distinct complementary approaches to software quality assurance are presented. A case study, the testing and inspection of a safety-critical system, is discussed in detail.

Cited: WebOfScience(1)
A survey on software defined networking and its applications
Yili GONG,Wei HUANG,Wenjie WANG,Yingchun LEI
Front. Comput. Sci.    2015, 9 (6): 827-845.

Software defined networking (SDN) achieves network routing management with logically centralized control software that decouples the network data plane from the control plane. This new design paradigm greatly facilitates network innovation. This paper introduces the background of SDN technology with its design principles, explains how it differs from traditional network architectures, and summarizes the research efforts on SDN network architecture, components, and applications. Based on the observation of current SDN development, this paper analyzes the potential driving forces of SDN deployment and its future trend.

Cited: Crossref(12) WebOfScience(10)
Formal engineering methods for software quality assurance
Shaoying LIU
Front Comput Sci    2012, 6 (1): 1-2.
Cloud service selection using cloud service brokers: approaches and challenges
Front. Comput. Sci.    2019, 13 (3): 599-617.

Cloud computing users are faced with a wide variety of services to choose from. Consequently, a number of cloud service brokers (CSBs) have emerged to help users in their service selection process. This paper reviews the recent approaches that have been introduced and used for cloud service brokerage and discusses their challenges accordingly. We propose a set of attributes for a CSB to be considered effective. Different CSBs’ approaches are classified as either single service or multiple service models. The CSBs are then assessed, analyzed, and compared with respect to the proposed set of attributes. Based on our studies, CSBs with multiple service models that support more of the proposed effective CSB attributes have wider application in cloud computing environments.

Top-k probabilistic prevalent co-location mining in spatially uncertain data sets
Lizhen WANG,Jun HAN,Hongmei CHEN,Junli LU
Front. Comput. Sci.    2016, 10 (3): 488-503.

A co-location pattern is a set of spatial features whose instances frequently appear in a spatial neighborhood. This paper efficiently mines the top-k probabilistic prevalent co-locations over spatially uncertain data sets and makes the following contributions: 1) the concept of the top-k probabilistic prevalent co-locations based on a possible world model is defined; 2) a framework for discovering the top-k probabilistic prevalent co-locations is set up; 3) a matrix method is proposed to improve the computation of the prevalence probability of a top-k candidate, and two pruning rules of the matrix block are given to accelerate the search for exact solutions; 4) a polynomial matrix is developed to further speed up the top-k candidate refinement process; 5) an approximate algorithm with a compensation factor is introduced so that relatively large quantities of data can be processed quickly. The efficiency of our proposed algorithms, as well as the accuracy of the approximation algorithm, is evaluated with an extensive set of experiments using both synthetic and real uncertain data sets.

Cited: Crossref(10) WebOfScience(10)
Saliency-based framework for facial expression recognition
Rizwan Ahmed KHAN, Alexandre MEYER, Hubert KONIK, Saida BOUAKAZ
Front. Comput. Sci.    2019, 13 (1): 183-198.

This article proposes a novel framework for the recognition of six universal facial expressions. The framework is based on three sets of features extracted from a face image: entropy, brightness, and local binary pattern. First, saliency maps are obtained using the state-of-the-art saliency detection algorithm “frequency-tuned salient region detection”. The idea is to use saliency maps to determine appropriate weights or values for the extracted features (i.e., brightness and entropy). We have performed a visual experiment to validate the performance of the saliency detection algorithm against the human visual system. Eye movements of 15 subjects were recorded using an eye-tracker in free-viewing conditions while they watched a collection of 54 videos selected from the Cohn-Kanade facial expression database. The results of the visual experiment demonstrated that the obtained saliency maps are consistent with the data on human fixations. Finally, the performance of the proposed framework is demonstrated via satisfactory classification results achieved with the Cohn-Kanade database, FG-NET FEED database, and Dartmouth database of children’s faces.

Cited: Crossref(4) WebOfScience(1)
System architecture for high-performance permissioned blockchains
Libo FENG, Hui ZHANG, Wei-Tek TSAI, Simeng SUN
Front. Comput. Sci.    2019, 13 (6): 1151-1165.

Blockchain (BC), as an emerging distributed database technology with advanced security and reliability, has attracted much attention from experts devoted to e-finance, intellectual property protection, the Internet of Things (IoT), and so forth. However, the inefficient transaction processing speed, which hinders BC’s widespread adoption, has not been well tackled yet. In this paper, we propose a novel architecture, called the Dual-Channel Parallel Broadcast model (DCPB), which addresses this problem to a great extent through three methods: dual communication channels, parallel pipeline processing, and a block broadcast strategy. In the dual-channel model, one channel processes transactions, and the other engages in the execution of BFT. The parallel pipeline processing allows the system to operate asynchronously. The block broadcast strategy improves the efficiency and speed of processing. Extensive experiments applied to BeihangChain, a simplified prototype BC system, illustrate that its transaction processing speed can be improved to 16K transactions per second, which can well support many real-world scenarios such as a BC-based energy trading system and the micro-film copyright trading system in CCTV.

Cited: Crossref(2) WebOfScience(1)
String similarity search and join: a survey
Minghe YU,Guoliang LI,Dong DENG,Jianhua FENG
Front. Comput. Sci.    2016, 10 (3): 399-417.

String similarity search and join are two important operations in data cleaning and integration, which extend traditional exact search and exact join operations in databases by tolerating the errors and inconsistencies in the data. They have many real-world applications, such as spell checking, duplicate detection, entity resolution, and webpage clustering. Although these two problems have been extensively studied in the recent decade, there is no thorough survey. In this paper, we present a comprehensive survey on string similarity search and join. We first give the problem definitions and introduce widely-used similarity functions to quantify the similarity. We then present an extensive set of algorithms for string similarity search and join. We also discuss their variants, including approximate entity extraction, type-ahead search, and approximate substring matching. Finally, we provide some open datasets and summarize some research challenges and open problems.
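A common similarity function from this literature is edit distance; a naive threshold-based similarity search (the baseline that the surveyed filtering and indexing algorithms improve on) can be sketched as follows, with hypothetical function names:

```python
def edit_distance(s, t):
    """Levenshtein distance via the standard dynamic program, O(|s||t|),
    keeping only two rows of the DP table."""
    m, n = len(s), len(t)
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cur[j] = min(prev[j] + 1,          # deletion
                         cur[j - 1] + 1,       # insertion
                         prev[j - 1] + (s[i - 1] != t[j - 1]))  # substitution
        prev = cur
    return prev[n]

def similarity_search(query, strings, tau):
    """Return every string within edit distance tau of the query
    (brute force; real systems prune candidates with filters first)."""
    return [s for s in strings if edit_distance(query, s) <= tau]
```

For instance, "kitten" and "sitting" are at distance 3, so a spell checker with tau = 1 would not suggest one for the other.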

Cited: Crossref(37) WebOfScience(30)
Robust artificial intelligence and robust human organizations
Front. Comput. Sci.    2019, 13 (1): 1-3.
Coordinating workload balancing and power switching in renewable energy powered data center
Xian LI,Rui WANG,Zhongzhi LUAN,Yi LIU,Depei QIAN
Front. Comput. Sci.    2016, 10 (3): 574-587.

There has been growing concern about the energy consumption and environmental impact of datacenters. Some pioneers have begun to power datacenters with renewable energy to offset their carbon footprint. However, it is challenging to integrate intermittent renewable energy into a datacenter power system. Grid-tied systems are widely deployed in renewable energy powered datacenters, but the drawbacks of grid-tie inverters (e.g., harmonic disturbance and high cost) hamper this design. Besides, the mixture of green load and brown load makes power management heavily dependent on software measurement and monitoring, which often suffers from inaccuracy. We propose DualPower, a novel power provisioning architecture that enables green datacenters to integrate renewable power supplies without grid-tie inverters. To optimize DualPower operation, we propose a specially designed power management framework to coordinate workload balancing with power supply switching. We evaluate three optimization schemes (LM, PS, and JO) under different datacenter operation scenarios on our trace-driven simulation platform. The experimental results show that DualPower can be as efficient as a grid-tied system and has good scalability. In contrast to previous works, DualPower integrates renewable power at lower cost and maintains full availability of datacenter servers.

Cited: Crossref(3) WebOfScience(2)
Learning multiple metrics for ranking
Xiubo GENG, Xue-Qi CHENG
Front Comput Sci Chin    2011, 5 (3): 259-267.

Directly optimizing an information retrieval (IR) metric has become a hot topic in the field of learning to rank. Conventional wisdom holds that it is better to train with the loss function that will be used for evaluation. But we often observe different results in reality. For example, directly optimizing average precision achieves higher performance than directly optimizing precision@3 when the ranking results are evaluated in terms of precision@3. This motivates us to combine multiple metrics in the process of optimizing IR metrics. For simplicity we study learning with two metrics. Since we usually conduct the learning process in a restricted hypothesis space, e.g., a linear hypothesis space, it is usually difficult to maximize both metrics at the same time. To tackle this problem, we propose a relaxed approach in this paper. Specifically, we incorporate one metric within the constraint while maximizing the other one. By restricting the feasible hypothesis space, we can get a more robust ranking model. Empirical results on the benchmark data set LETOR show that the relaxed approach is superior to the direct linear combination approach, and also outperforms other baselines.
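For reference, the two IR metrics contrasted above can be computed as follows (a standard textbook sketch, not the paper's code):

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked documents that are relevant."""
    return sum(1 for d in ranked[:k] if d in relevant) / k

def average_precision(ranked, relevant):
    """Mean of precision values taken at each rank where a relevant
    document appears, divided by the total number of relevant docs."""
    hits, total = 0, 0.0
    for i, d in enumerate(ranked, 1):
        if d in relevant:
            hits += 1
            total += hits / i
    return total / len(relevant) if relevant else 0.0
```

With ranking [a, b, c, d] and relevant set {a, c}, precision@3 is 2/3 while average precision is (1/1 + 2/3)/2 = 5/6, illustrating how the two metrics reward the same ranking differently.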

Cited: Crossref(1) WebOfScience(1)
Adding regular expressions to graph reachability and pattern queries
Wenfei FAN, Jianzhong LI, Shuai MA, Nan TANG, Yinghui WU
Front Comput Sci    2012, 6 (3): 313-338.

It is increasingly common to find graphs in which edges are of different types, indicating a variety of relationships. For such graphs we propose a class of reachability queries and a class of graph patterns, in which an edge is specified with a regular expression of a certain form, expressing the connectivity of a data graph via edges of various types. In addition, we define graph pattern matching based on a revised notion of graph simulation. On graphs in emerging applications such as social networks, we show that these queries are capable of finding more sensible information than their traditional counterparts. Better still, their increased expressive power does not come with extra complexity. Indeed, (1) we investigate their containment and minimization problems, and show that these fundamental problems are in quadratic time for reachability queries and are in cubic time for pattern queries. (2) We develop an algorithm for answering reachability queries, in quadratic time as for their traditional counterpart. (3) We provide two cubic-time algorithms for evaluating graph pattern queries, as opposed to the NP-completeness of graph pattern matching via subgraph isomorphism. (4) The effectiveness and efficiency of these algorithms are experimentally verified using real-life data and synthetic data.
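The core of answering such a reachability query can be illustrated with a product-automaton BFS over (node, automaton-state) pairs. This generic sketch assumes the restricted regular expression has already been compiled into a DFA over edge types; all names are hypothetical, and the paper's actual algorithms are more refined:

```python
from collections import deque

def regex_reachable(graph, src, dst, dfa, start, accept):
    """Is dst reachable from src along a path whose edge-type string is
    accepted by the DFA? Explore (node, dfa_state) pairs with BFS.
    graph: {node: [(neighbor, edge_type), ...]}
    dfa:   {(state, edge_type): next_state}"""
    seen = {(src, start)}
    q = deque(seen)
    while q:
        node, state = q.popleft()
        if node == dst and state in accept:
            return True
        for nxt, etype in graph.get(node, []):
            ns = dfa.get((state, etype))
            if ns is not None and (nxt, ns) not in seen:
                seen.add((nxt, ns))
                q.append((nxt, ns))
    return False
```

Since at most |V| x |Q| pairs are explored over |E| x |Q| transitions, the running time stays polynomial, consistent with the quadratic bounds claimed for reachability queries.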

Cited: Crossref(1) WebOfScience(9)
Consolidated cluster systems for data centers in the cloud age: a survey and analysis
Jian LIN, Li ZHA, Zhiwei XU
Front. Comput. Sci.    2013, 7 (1): 1-19.

In the cloud age, heterogeneous application modes on large-scale infrastructures bring challenges in resource utilization and manageability to data centers. Many resource and runtime management systems have been developed or evolved to address these challenges and related problems from different perspectives. This paper tries to identify the main motivations, key concerns, common features, and representative solutions of such systems through a survey and analysis. A typical kind of these systems is generalized as the consolidated cluster system, whose design goal is identified as reducing the overall costs under a quality-of-service premise. A survey of such systems is given, and the critical issues they address are summarized as resource consolidation and runtime coordination. These two issues are analyzed and classified according to the design styles and external characteristics abstracted from the surveyed work. Five representative consolidated cluster systems from both academia and industry are illustrated and compared in detail based on the analysis and classifications. We hope this survey and analysis will be conducive to both the design and implementation and the technology selection of such systems, in response to the constantly emerging challenges of infrastructure and application management in data centers.

Cited: Crossref(11) WebOfScience(8)
Prediction of urban human mobility using large-scale taxi traces and its applications
Xiaolong LI, Gang PAN, Zhaohui WU, Guande QI, Shijian LI, Daqing ZHANG, Wangsheng ZHANG, Zonghui WANG
Front Comput Sci    2012, 6 (1): 111-121.

This paper investigates human mobility patterns in an urban taxi transportation system. The work focuses on predicting human mobility by discovering patterns in the passenger pick-up quantity (PUQ) of urban hotspots. This paper proposes an improved ARIMA-based prediction method to forecast the spatial-temporal variation of passengers in a hotspot. Evaluation with a large-scale real-world data set of 4,000 taxis’ GPS traces over one year shows a prediction error of only 5.8%. We also explore the application of the prediction approach to help drivers find their next passengers. The simulation results using historical real-world data demonstrate that, with our guidance, drivers can reduce the time taken and distance travelled to find their next passenger by 37.1% and 6.4%, respectively.
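As a simplified stand-in for the ARIMA approach described above, an AR(1) model fitted by least squares shows the basic idea of forecasting a hotspot's pick-up quantity from its own history (illustrative only; a real ARIMA model adds differencing and moving-average terms, and all names here are hypothetical):

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = a * x[t-1] + b, i.e., an AR(1) model:
    regress each observation on its predecessor."""
    xs, ys = series[:-1], series[1:]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    a = cov / var
    b = my - a * mx
    return a, b

def forecast(series, steps, a, b):
    """Roll the fitted recurrence forward for the requested horizon."""
    out, last = [], series[-1]
    for _ in range(steps):
        last = a * last + b
        out.append(last)
    return out
```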

Cited: WebOfScience(104)
Understanding information interactions in diffusion: an evolutionary game-theoretic perspective
Yuan SU,Xi ZHANG,Lixin LIU,Shouyou SONG,Binxing FANG
Front. Comput. Sci.    2016, 10 (3): 518-531.

Social networks are fundamental mediums for the diffusion of information: contagions appear at some node of the network and get propagated over the edges. Prior research mainly focuses on each contagion spreading independently, disregarding the interactions of multiple contagions that propagate at the same time. In the real world, simultaneous news and events usually have to compete for users’ attention to get propagated. In other cases, they can cooperate with each other and achieve greater influence.

In this paper, an evolutionary game theoretic framework is proposed to model the interactions among multiple contagions. The basic idea is that different contagions in social networks are similar to the multiple organisms in a population, and the diffusion process resembles organisms interacting and then evolving from one state to another. This framework statistically learns the payoffs of contagions interacting with each other and builds the payoff matrix. Since learning payoffs for all pairs of contagions is almost impossible (quadratic in the number of contagions), a contagion clustering method is proposed in order to decrease the number of parameters to fit, which makes our approach efficient and scalable. To verify the proposed framework, we conduct experiments using a real-world information spreading dataset from Digg. Experimental results show that the proposed game theoretic framework helps to comprehend the information diffusion process better and can predict users’ forwarding behaviors with more accuracy than previous studies. The analyses of the evolution dynamics of contagions and the evolutionarily stable strategy reveal whether a contagion can be promoted or suppressed by others in the diffusion process.
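The evolution from one state to another mentioned above is typically modeled with replicator dynamics, in which a strategy's (contagion's) share grows when its expected payoff exceeds the population average. A minimal sketch of one update step (a generic illustration, not the paper's learned payoff matrices):

```python
def replicator_step(freqs, payoff, dt=0.1):
    """One discrete step of replicator dynamics.
    freqs:  current frequency of each strategy (sums to 1)
    payoff: payoff[i][j] = payoff of strategy i against strategy j"""
    n = len(freqs)
    # expected payoff (fitness) of each strategy against the population
    fitness = [sum(payoff[i][j] * freqs[j] for j in range(n)) for i in range(n)]
    avg = sum(f * w for f, w in zip(freqs, fitness))
    # each share grows or shrinks in proportion to its payoff advantage
    return [x + dt * x * (w - avg) for x, w in zip(freqs, fitness)]
```

Iterating this step drives the population toward an evolutionarily stable strategy, which is the object analyzed in the paper.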

Cited: Crossref(9) WebOfScience(10)
Efficient reinforcement learning in continuous state and action spaces with Dyna and policy approximation
Shan ZHONG, Quan LIU, Zongzhang ZHANG, Qiming FU
Front. Comput. Sci.    2019, 13 (1): 106-126.

Dyna is an effective reinforcement learning (RL) approach that combines value function evaluation with model learning. However, existing works on Dyna mostly discuss only its efficiency in RL problems with discrete action spaces. This paper proposes a novel Dyna variant, called Dyna-LSTD-PA, aiming to handle problems with continuous action spaces. Dyna-LSTD-PA stands for Dyna based on least-squares temporal difference (LSTD) and policy approximation. Dyna-LSTD-PA consists of two simultaneous, interacting processes. The learning process determines the probability distribution over action spaces using the Gaussian distribution; estimates the underlying value function, policy, and model by linear representation; and updates their parameter vectors online by LSTD(λ). The planning process updates the parameter vector of the value function again by using offline LSTD(λ). Dyna-LSTD-PA also uses the Sherman–Morrison formula to improve the efficiency of LSTD(λ), and weights the parameter vector of the value function to bring the two processes together. Theoretically, the global error bound is derived by considering approximation, estimation, and model errors. Experimentally, Dyna-LSTD-PA outperforms two representative methods in terms of convergence rate, success rate, and stability performance on four benchmark RL problems.
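The Sherman–Morrison formula mentioned above updates a matrix inverse after a rank-one change, which is what makes incremental LSTD updates cheap: O(n^2) per update instead of re-inverting in O(n^3). A small, self-contained sketch in pure Python:

```python
def sherman_morrison(A_inv, u, v):
    """Given A^-1, return (A + u v^T)^-1 via
    A^-1 - (A^-1 u)(v^T A^-1) / (1 + v^T A^-1 u)."""
    n = len(A_inv)
    Au = [sum(A_inv[i][k] * u[k] for k in range(n)) for i in range(n)]  # A^-1 u
    vA = [sum(v[k] * A_inv[k][j] for k in range(n)) for j in range(n)]  # v^T A^-1
    denom = 1.0 + sum(v[k] * Au[k] for k in range(n))
    return [[A_inv[i][j] - Au[i] * vA[j] / denom for j in range(n)]
            for i in range(n)]
```

For example, starting from the 2x2 identity (its own inverse) and adding the rank-one matrix e1 e2^T yields the inverse of [[1,1],[0,1]], namely [[1,-1],[0,1]].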

Cited: Crossref(4) WebOfScience(1)
EnAli: entity alignment across multiple heterogeneous data sources
Chao KONG, Ming GAO, Chen XU, Yunbin FU, Weining QIAN, Aoying ZHOU
Front. Comput. Sci.    2019, 13 (1): 157-169.

Entity alignment is the problem of identifying which entities in different data sources refer to the same real-world entity. Identifying entities across heterogeneous data sources is paramount to many research fields, such as data cleaning, data integration, information retrieval, and machine learning. The aligning process is not only overwhelmingly expensive for large data sources, since it involves all tuples from two or more data sources, but must also handle heterogeneous entity attributes. In this paper, we propose an unsupervised approach, called EnAli, to match entities across two or more heterogeneous data sources. EnAli employs a generative probabilistic model that incorporates heterogeneous entity attributes via the exponential family and handles missing values, and it utilizes a locality-sensitive hashing scheme to reduce the candidate tuples and speed up the aligning process. EnAli is highly accurate and efficient even without any ground-truth tuples. We illustrate the performance of EnAli on re-identifying entities from the same data source, as well as aligning entities across three real data sources. Our experimental results manifest that our proposed approach outperforms the comparable baselines.
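The locality-sensitive hashing step can be illustrated with MinHash signatures and banding: records whose signatures agree on at least one band become candidate pairs, so most cross-source comparisons are skipped. A generic sketch; the parameters and names are illustrative, not EnAli's actual scheme:

```python
import hashlib

def minhash_signature(tokens, n_hashes=16):
    """For each of n salted hash functions, keep the minimum hash value
    over the record's tokens; similar sets yield similar signatures."""
    return [min(int(hashlib.md5(f"{seed}:{t}".encode()).hexdigest(), 16)
                for t in tokens)
            for seed in range(n_hashes)]

def lsh_candidates(records, bands=4):
    """Bucket records by each band of their signature; only records
    sharing a bucket are compared in full."""
    sigs = {rid: minhash_signature(toks) for rid, toks in records.items()}
    rows = len(next(iter(sigs.values()))) // bands
    buckets = {}
    for rid, sig in sigs.items():
        for b in range(bands):
            key = (b, tuple(sig[b * rows:(b + 1) * rows]))
            buckets.setdefault(key, set()).add(rid)
    pairs = set()
    for group in buckets.values():
        for a in group:
            for c in group:
                if a < c:
                    pairs.add((a, c))
    return pairs
```

Records with identical token sets always collide (their signatures match in every band), while unrelated records rarely share a bucket.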

Cited: Crossref(3) WebOfScience(2)
rCOS: a formal model-driven engineering method for component-based software
Wei KE, Xiaoshan LI, Zhiming LIU, Volker STOLZ
Front Comput Sci    2012, 6 (1): 17-39.

Model-driven architecture (MDA) has become a mainstream technology for software-intensive system design. The main engineering principle behind it is that the inherent complexity of software development can only be mastered by building, analyzing and manipulating system models. MDA also deals with system complexity by providing component-based design techniques, allowing independent component design, implementation and deployment, and then system integration and reconfiguration based on component interfaces. The model of a system in any stage is an integration of models of different viewpoints. Therefore, for a model-driven method to be applied effectively, it must provide a body of techniques and an integrated suite of tools for model construction, validation, and transformation. This requires a number of modeling notations for the specification of different concerns and viewpoints of the system. These notations should have formally defined syntaxes and a unified theory of semantics. The underlying theory of the method is needed to underpin the development of tools and correct use of tools in software development, as well as to formally verify and reason about properties of systems in mission-critical applications. The modeling notations, techniques, and tools must be designed so that they can be used seamlessly in supporting development activities and documentation of artifacts in software design processes. This article presents such a method, called rCOS, focusing on the models of a system at different stages in a software development process, their semantic integration, and how they are constructed, analyzed, transformed, validated, and verified.

Cited: WebOfScience(8)
Recommend trustworthy services using interval numbers of four parameters via cloud model for potential users
Hua MA,Zhigang HU
Front. Comput. Sci.    2015, 9 (6): 887-903.

How to discover trustworthy services is a challenge for potential users because of the deficiency of usage experience and the information overload of QoE (quality of experience) evaluations from consumers. To address the limitations of traditional interval numbers in measuring the trustworthiness of a service, this paper proposes a novel service recommendation approach using interval numbers of four parameters (INF) for potential users. In this approach, a trustworthiness cloud model is established to identify the eigenvalue of INF via a backward cloud generator, and a new formula of INF possibility degree based on geometrical analysis is presented to ensure high calculation precision. In order to select the most valuable QoE evaluations, the similarity of client-side features between the potential user and consumers is calculated, and the multi-attribute trustworthiness values are aggregated into INF by the fuzzy analytic hierarchy process method. On the basis of ranking INF, the trustworthiness ranks of candidate services are obtained, and the trustworthy services are chosen to recommend to the potential user. Experiments based on a real-world dataset show that the approach improves the recommendation accuracy of trustworthy services compared to other approaches, which contributes to solving the cold start and information overload problems in service recommendation.

Cited: Crossref(6) WebOfScience(6)
Compound graph based hybrid data center topologies
Lailong LUO,Deke GUO,Wenxin LI,Tian ZHANG,Junjie XIE,Xiaolei ZHOU
Front. Comput. Sci.    2015, 9 (6): 860-874.

In large-scale data centers, many servers are interconnected via a dedicated networking structure, so as to satisfy specific design goals, such as low equipment cost, high network capacity, and incremental expansion. The topological properties of a networking structure are critical factors that dominate the performance of the entire data center. The existing networking structures are either fully random or completely structured. Although such networking structures exhibit advantages in given aspects, they suffer obvious shortcomings in other essential respects. In this paper, we aim to design a hybrid topology, called R3, which is a compound graph of structured and random topology. It employs a random regular graph as a unit cluster and connects many such clusters by means of a structured topology, i.e., the generalized hypercube. Consequently, the hybrid topology combines the advantages of structured as well as random topologies seamlessly. Meanwhile, a coloring-based algorithm is proposed for R3 to enable fast and accurate routing. R3 possesses many attractive characteristics, such as modularity and expansibility, at the cost of only increasing the degree of any node by one. Comprehensive evaluation results show that our hybrid topology possesses excellent topological properties and network performance.

Cited: Crossref(6) WebOfScience(6)
Optimization methods for regularization-based ill-posed problems: a survey and a multi-objective framework
Maoguo GONG, Xiangming JIANG, Hao LI
Front. Comput. Sci.    2017, 11 (3): 362-391.

Ill-posed problems widely exist in signal processing. In this paper, we review popular regularization models such as truncated singular value decomposition regularization, iterative regularization, and variational regularization. Meanwhile, we also review popular optimization approaches and regularization parameter choice methods. In fact, the regularization problem is inherently a multi-objective problem. Traditional methods usually combine the fidelity term and the regularization term into a single objective with regularization parameters, which are difficult to tune. Therefore, we propose a multi-objective framework for ill-posed problems, which can handle complex features of a problem such as non-convexity and discontinuity. In this framework, the fidelity term and regularization term are optimized simultaneously to gain more insight into the ill-posed problems. A case study on signal recovery shows the effectiveness of the multi-objective framework for ill-posed problems.
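The two formulations contrasted above can be written side by side. In the single-objective form, a regularization parameter $\lambda$ trades the fidelity term off against the regularization term, whereas the multi-objective form optimizes both simultaneously and yields a Pareto front instead of a single $\lambda$-dependent solution (a generic statement of the idea, with $L$ a regularization operator such as the identity or a derivative):

```latex
% Single-objective (weighted) formulation, parameter \lambda to tune:
\min_x \; \|Ax - b\|_2^2 + \lambda \, \|Lx\|_2^2

% Multi-objective formulation, no \lambda; the two terms are
% optimized simultaneously over a Pareto front:
\min_x \; \bigl( \|Ax - b\|_2^2,\; \|Lx\|_2^2 \bigr)
```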

Cited: Crossref(5) WebOfScience(7)
iGraph: an incremental data processing system for dynamic graph
Wuyang JU,Jianxin LI,Weiren YU,Richong ZHANG
Front. Comput. Sci.    2016, 10 (3): 462-476.
Abstract   PDF (625KB)

With the popularity of social networks, the demand for real-time processing of graph data is increasing. However, most existing graph systems adopt a batch processing mode, so the overhead of maintaining and processing a dynamic graph is significantly high. In this paper, we design iGraph, an incremental graph processing system for dynamic graphs with continuous updates. The contributions of iGraph include: 1) a hash-based graph partition strategy to enable fine-grained graph updates; 2) a vertex-based graph computing model to support incremental data processing; 3) hotspot detection and rebalancing methods to address the workload imbalance problem during incremental processing. Through its general-purpose API, iGraph can be used to implement various graph processing algorithms such as PageRank. We have implemented iGraph on Apache Spark, and experimental results show that, on real-life datasets, iGraph outperforms the original GraphX in terms of both graph update and graph computation.
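A minimal sketch of the hash-based partitioning idea (not iGraph's actual implementation): hashing vertices to partitions means a single edge update touches at most two partitions, which is what makes fine-grained incremental processing possible.

```python
class HashPartitionedGraph:
    """Toy hash-partitioned adjacency store: vertices are assigned to
    partitions by hashing, so an edge update only touches the partitions
    owning its two endpoints."""
    def __init__(self, k):
        self.k = k
        self.parts = [dict() for _ in range(k)]  # vertex -> adjacency set

    def _part(self, v):
        return hash(v) % self.k

    def add_edge(self, u, v):
        self.parts[self._part(u)].setdefault(u, set()).add(v)
        self.parts[self._part(v)].setdefault(v, set()).add(u)
        # Only these partitions must re-run incremental computation.
        return {self._part(u), self._part(v)}

    def degree(self, v):
        return len(self.parts[self._part(v)].get(v, set()))
```

An incremental algorithm such as PageRank would recompute only on the partitions returned by `add_edge`, instead of re-processing the whole graph in batch mode.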

Reference | Supplementary Material | Related Articles | Metrics
Cited: Crossref(5) WebOfScience(4)
Design and verification of a lightweight reliable virtual machine monitor for a many-core architecture
Yuehua DAI, Yi SHI, Yong QI, Jianbao REN, Peijian WANG
Front Comput Sci    2013, 7 (1): 34-43.
Abstract   HTML   PDF (398KB)

Virtual machine monitors (VMMs) play a central role in cloud computing, and their reliability and availability are critical. Virtualization and device emulation make the VMM code base large and the interface between the OS and the VMM complex, which makes the security of the VMM very hard to verify. For example, a misuse of a VMM hyper-call by a malicious guest OS can corrupt the whole VMM. The complexity of the VMM also makes it hard to formally verify the correctness of the system's behavior. In this paper a new VMM, operating system virtualization (OSV), is proposed. In OSV, the multiprocessor boot interface and the memory configuration interface are virtualized at boot time in the Linux kernel. After booting, only inter-processor interrupt operations are intercepted by OSV, which keeps the interface between OSV and the OS simple. The interface is verified using formal model checking, which ensures that a malicious OS cannot attack OSV through it. Currently, OSV is implemented on the AMD Opteron multi-core server architecture. Evaluation results show that Linux running on OSV has performance similar to native Linux, and OSV has a performance improvement of 4%-13% over Xen.
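The style of interface verification described above can be illustrated with a tiny explicit-state model checker. The hypercall model below is a made-up toy for illustration only; OSV's real model and the properties it checks are not given in the abstract.

```python
from collections import deque

def model_check(init, transitions, invariant):
    """Tiny explicit-state model checker: breadth-first search over the
    reachable state space, returning a violating state, or None if the
    invariant holds everywhere."""
    seen, frontier = {init}, deque([init])
    while frontier:
        s = frontier.popleft()
        if not invariant(s):
            return s
        for t in transitions(s):
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return None

# Toy interface model (an illustrative assumption): state is a tuple of
# per-guest flags, True when the monitor has granted that guest the
# inter-processor-interrupt channel.
GUESTS = 2

def transitions(state):
    nxt = []
    for g in range(GUESTS):
        if not state[g] and not any(state):          # grant only if free
            nxt.append(tuple(i == g or f for i, f in enumerate(state)))
        if state[g]:                                 # guest releases
            nxt.append(tuple(i != g and f for i, f in enumerate(state)))
    return nxt

# Mutual exclusion: at most one guest ever holds the channel.
violation = model_check((False, False), transitions, lambda s: sum(s) <= 1)
```

Because the interface is small, the reachable state space is tiny and the check is exhaustive; that is precisely the benefit of keeping the OS/VMM interface simple.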

Reference | Related Articles | Metrics
Cited: Crossref(8) WebOfScience(4)
A survey of RDF data management systems
M. Tamer ÖZSU
Front. Comput. Sci.    2016, 10 (3): 418-432.
Abstract   PDF (850KB)

RDF is increasingly being used to encode data for the semantic web and for data exchange. A large number of works address RDF data management following different approaches, and in this paper we provide an overview of them. The review considers centralized solutions (referred to as warehousing approaches), distributed solutions, and the techniques that have been developed for querying linked data. In each category, further classifications are provided to assist readers in understanding the identifying characteristics of the different approaches.
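As a concrete illustration of the centralized (warehousing) approach, here is a toy triple store with one index per triple permutation, a common design in such systems; this is a generic sketch, not modeled on any particular engine in the survey.

```python
class TripleStore:
    """Toy RDF warehouse: three permutation indexes (SPO, POS, OSP) so
    that a triple pattern with any bound leading component resolves by
    dictionary lookup instead of a full scan."""
    def __init__(self):
        self.spo, self.pos, self.osp = {}, {}, {}

    def add(self, s, p, o):
        self.spo.setdefault(s, {}).setdefault(p, set()).add(o)
        self.pos.setdefault(p, {}).setdefault(o, set()).add(s)
        self.osp.setdefault(o, {}).setdefault(s, set()).add(p)

    def triples(self, s=None, p=None, o=None):
        """Yield matching triples; None is a wildcard.  The bound
        positions select the index whose prefix they match."""
        if s is not None:
            for p2, objs in self.spo.get(s, {}).items():
                if p in (None, p2):
                    for o2 in objs:
                        if o in (None, o2):
                            yield (s, p2, o2)
        elif p is not None:
            for o2, subs in self.pos.get(p, {}).items():
                if o in (None, o2):
                    for s2 in subs:
                        yield (s2, p, o2)
        elif o is not None:
            for s2, preds in self.osp.get(o, {}).items():
                for p2 in preds:
                    yield (s2, p2, o)
        else:
            for s2 in self.spo:
                yield from self.triples(s=s2)
```

SPARQL-style pattern matching then reduces to joining the results of such single-pattern lookups.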

Reference | Supplementary Material | Related Articles | Metrics
Cited: Crossref(29) WebOfScience(20)
VIPLFaceNet: an open source deep face recognition SDK
Xin LIU, Meina KAN, Wanglong WU, Shiguang SHAN, Xilin CHEN
Front. Comput. Sci.    2017, 11 (2): 208-218.
Abstract   PDF (461KB)

Robust face representation is imperative for highly accurate face recognition. In this work, we propose an open-source face recognition method with deep representation, named VIPLFaceNet, a 10-layer deep convolutional neural network with seven convolutional layers and three fully-connected layers. Compared with the well-known AlexNet, our VIPLFaceNet takes only 20% of the training time and 60% of the testing time, but achieves a 40% drop in error rate on the real-world face recognition benchmark LFW. VIPLFaceNet achieves 98.60% mean accuracy on LFW using a single network. An open-source C++ SDK based on VIPLFaceNet is released under the BSD license. The SDK takes about 150 ms to process one face image in a single thread on an i7 desktop CPU. VIPLFaceNet provides a state-of-the-art starting point for both academic and industrial face recognition applications.
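The abstract specifies only the layer counts (seven convolutional plus three fully-connected). The skeleton below encodes that structure; every channel and kernel size is a hypothetical placeholder, not VIPLFaceNet's published configuration.

```python
from dataclasses import dataclass

@dataclass
class Conv:
    out_channels: int
    kernel: int
    stride: int = 1

@dataclass
class FC:
    out_features: int

# Skeleton matching only the 7-conv + 3-FC structure stated in the
# abstract; all sizes below are invented placeholders.
VIPLFACENET_LIKE = [
    Conv(48, 9, 4), Conv(128, 3), Conv(128, 3), Conv(256, 3),
    Conv(192, 3), Conv(192, 3), Conv(128, 3),
    FC(4096), FC(2048), FC(2048),
]

def layer_counts(net):
    """Count convolutional and fully-connected layers in a spec list."""
    conv = sum(isinstance(layer, Conv) for layer in net)
    fc = sum(isinstance(layer, FC) for layer in net)
    return conv, fc
```

Such a declarative spec list is a common way to pin down an architecture before committing to a framework-specific implementation.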

Reference | Related Articles | Metrics
Cited: Crossref(39) WebOfScience(26)