Frontiers of Computer Science

30 Most Downloaded Articles
Published in last 1 year | In last 2 years| In last 3 years| All| Most Downloaded in Recent Month | Most Downloaded in Recent Year|

Please wait a minute...
For Selected: View Abstracts Toggle Thumbnails
Mining the interests of Chinese microbloggers via keyword extraction
Zhiyuan LIU, Xinxiong CHEN, Maosong SUN
Front Comput Sci    2012, 6 (1): 76-87.   DOI: 10.1007/s11704-011-1174-8
Abstract   HTML   PDF (547KB)

Microblogging provides a new platform for communicating and sharing information amongWeb users. Users can express opinions and record daily life using microblogs. Microblogs that are posted by users indicate their interests to some extent. We aim to mine user interests via keyword extraction from microblogs. Traditional keyword extraction methods are usually designed for formal documents such as news articles or scientific papers. Messages posted by microblogging users, however, are usually noisy and full of new words, which is a challenge for keyword extraction. In this paper, we combine a translation-based method with a frequency-based method for keyword extraction. In our experiments, we extract keywords for microblog users from the largest microblogging website in China, Sina Weibo. The results show that our method can identify users’ interests accurately and efficiently.

Reference | Related Articles | Metrics
The use of mathematics in software quality assurance
David Lorge PARNAS
Front Comput Sci    2012, 6 (1): 3-16.   DOI: 10.1007/s11704-012-2904-2
Abstract   HTML   PDF (425KB)

The use of mathematics for documenting, inspecting, and testing software is explained and illustrated. Three measures of software quality are described and discussed. Then three distinct complementary approaches to software quality assurance are presented. A case study, the testing and inspection of a safety-critical system, is discussed in detail.

Reference | Related Articles | Metrics
Big graph search: challenges and techniques
Shuai MA,Jia LI,Chunming HU,Xuelian LIN,Jinpeng HUAI
Front. Comput. Sci.    2016, 10 (3): 387-398.   DOI: 10.1007/s11704-015-4515-1
Abstract   PDF (484KB)

On one hand, compared with traditional relational and XML models, graphs have more expressive power and are widely used today. On the other hand, various applications of social computing trigger the pressing need of a new search paradigm. In this article, we argue that big graph search is the one filling this gap. We first introduce the application of graph search in various scenarios. We then formalize the graph search problem, and give an analysis of graph search from an evolutionary point of view, followed by the evidences from both the industry and academia. After that, we analyze the difficulties and challenges of big graph search. Finally, we present three classes of techniques towards big graph search: query techniques, data techniques and distributed computing techniques.

Reference | Related Articles | Metrics
Cited: Crossref(4)
Formal engineering methods for software quality assurance
Shaoying LIU
Front Comput Sci    2012, 6 (1): 1-2.   DOI: 10.1007/s11704-012-2900-6
Abstract   HTML   PDF (184KB)
Related Articles | Metrics
Learning multiple metrics for ranking
Xiubo GENG, Xue-Qi CHENG
Front Comput Sci Chin    2011, 5 (3): 259-267.   DOI: 10.1007/s11704-011-0152-5
Abstract   HTML   PDF (174KB)

Directly optimizing an information retrieval (IR) metric has become a hot topic in the field of learning to rank. Conventional wisdom believes that it is better to train for the loss function on which will be used for evaluation. But we often observe different results in reality. For example, directly optimizing averaged precision achieves higher performance than directly optimizing precision@3 when the ranking results are evaluated in terms of precision@3. This motivates us to combine multiple metrics in the process of optimizing IR metrics. For simplicity we study learning with two metrics. Since we usually conduct the learning process in a restricted hypothesis space, e.g., linear hypothesis space, it is usually difficult to maximize both metrics at the same time. To tackle this problem, we propose a relaxed approach in this paper. Specifically, we incorporate one metric within the constraint while maximizing the other one. By restricting the feasible hypothesis space, we can get a more robust ranking model. Empirical results on the benchmark data set LETOR show that the relaxed approach is superior to the direct linear combination approach, and also outperforms other baselines.

Table and Figures | Reference | Related Articles | Metrics
Cited: Crossref(1)
Adding regular expressions to graph reachability and pattern queries
Wenfei FAN, Jianzhong LI, Shuai MA, Nan TANG, Yinghui WU
Front Comput Sci    2012, 6 (3): 313-338.   DOI: 10.1007/s11704-012-1312-y
Abstract   HTML   PDF (1236KB)

It is increasingly common to find graphs in which edges are of different types, indicating a variety of relationships. For such graphs we propose a class of reachability queries and a class of graph patterns, in which an edge is specified with a regular expression of a certain form, expressing the connectivity of a data graph via edges of various types. In addition, we define graph pattern matching based on a revised notion of graph simulation. On graphs in emerging applications such as social networks, we show that these queries are capable of finding more sensible information than their traditional counterparts. Better still, their increased expressive power does not come with extra complexity. Indeed, (1) we investigate their containment and minimization problems, and show that these fundamental problems are in quadratic time for reachability queries and are in cubic time for pattern queries. (2) We develop an algorithm for answering reachability queries, in quadratic time as for their traditional counterpart. (3) We provide two cubic-time algorithms for evaluating graph pattern queries, as opposed to the NP-completeness of graph pattern matching via subgraph isomorphism. (4) The effectiveness and efficiency of these algorithms are experimentally verified using real-life data and synthetic data.

Reference | Related Articles | Metrics
Consolidated cluster systems for data centers in the cloud age: a survey and analysis
Jian LIN, Li ZHA, Zhiwei XU
Front. Comput. Sci.    2013, 7 (1): 1-19.   DOI: 10.1007/s11704-012-2086-y
Abstract   HTML   PDF (718KB)

In the cloud age, heterogeneous application modes on large-scale infrastructures bring about the challenges on resource utilization and manageability to data centers. Many resource and runtime management systems are developed or evolved to address these challenges and relevant problems from different perspectives. This paper tries to identify the main motivations, key concerns, common features, and representative solutions of such systems through a survey and analysis. A typical kind of these systems is generalized as the consolidated cluster system, whose design goal is identified as reducing the overall costs under the quality of service premise. A survey on this kind of systems is given, and the critical issues concerned by such systems are summarized as resource consolidation and runtime coordination. These two issues are analyzed and classified according to the design styles and external characteristics abstracted from the surveyed work. Five representative consolidated cluster systems from both academia and industry are illustrated and compared in detail based on the analysis and classifications. We hope this survey and analysis to be conducive to both design implementation and technology selection of this kind of systems, in response to the constantly emerging challenges on infrastructure and application management in data centers.

Reference | Related Articles | Metrics
Cited: Crossref(7)
A survey on software defined networking and its applications
Yili GONG,Wei HUANG,Wenjie WANG,Yingchun LEI
Front. Comput. Sci.    2015, 9 (6): 827-845.   DOI: 10.1007/s11704-015-3448-z
Abstract   PDF (1083KB)

Software defined networking (SDN) achieves network routing management with logically centralized control software that decouples the network data plane from the control plane. This new design paradigm greatly emancipates network innovation. This paper introduces the background of SDN technology with its design principles, explains its differentiation, and summarizes the research efforts on SDN network architecture, components and applications. Based on the observation of current SDN development, this paper analyzes the potential driving forces of SDN deployment and its future trend.

Reference | Supplementary Material | Related Articles | Metrics
Cited: Crossref(3)
Prediction of urban human mobility using large-scale taxi traces and its applications
Xiaolong LI, Gang PAN, Zhaohui WU, Guande QI, Shijian LI, Daqing ZHANG, Wangsheng ZHANG, Zonghui WANG
Front Comput Sci    2012, 6 (1): 111-121.   DOI: 10.1007/s11704-011-1192-6
Abstract   HTML   PDF (665KB)

This paper investigates human mobility patterns in an urban taxi transportation system. This work focuses on predicting humanmobility fromdiscovering patterns of in the number of passenger pick-ups quantity (PUQ) from urban hotspots. This paper proposes an improved ARIMA based prediction method to forecast the spatial-temporal variation of passengers in a hotspot. Evaluation with a large-scale realworld data set of 4 000 taxis’ GPS traces over one year shows a prediction error of only 5.8%. We also explore the application of the prediction approach to help drivers find their next passengers. The simulation results using historical real-world data demonstrate that, with our guidance, drivers can reduce the time taken and distance travelled, to find their next passenger, by 37.1% and 6.4%, respectively.

Reference | Related Articles | Metrics
rCOS: a formal model-driven engineering method for component-based software
Wei KE, Xiaoshan LI, Zhiming LIU, Volker STOLZ
Front Comput Sci    2012, 6 (1): 17-39.   DOI: 10.1007/s11704-012-2901-5
Abstract   HTML   PDF (772KB)

Model-driven architecture (MDA) has become a main stream technology for software-intensive system design. The main engineering principle behind it is that the inherent complexity of software development can only be mastered by building, analyzing and manipulating system models. MDA also deals with system complexity by providing component-based design techniques, allowing independent component design, implementation and deployment, and then system integration and reconfiguration based on component interfaces. The model of a system in any stage is an integration of models of different viewpoints. Therefore, for a model-driven method to be applied effectively, it must provide a body of techniques and an integrated suite of tools for model construction, validation, and transformation. This requires a number of modeling notations for the specification of different concerns and viewpoints of the system. These notations should have formally defined syntaxes and a unified theory of semantics. The underlying theory of the method is needed to underpin the development of tools and correct use of tools in software development, as well as to formally verify and reason about properties of systems in mission-critical applications. The modeling notations, techniques, and tools must be designed so that they can be used seamlessly in supporting development activities and documentation of artifacts in software design processes. This article presents such a method, called the rCOS, focusing on the models of a system at different stages in a software development process, their semantic integration, and how they are constructed, analyzed, transformed, validated, and verified.

Reference | Related Articles | Metrics
Coordinating workload balancing and power switching in renewable energy powered data center
Xian LI,Rui WANG,Zhongzhi LUAN,Yi LIU,Depei QIAN
Front. Comput. Sci.    2016, 10 (3): 574-587.   DOI: 10.1007/s11704-015-5018-9
Abstract   PDF (882KB)

There has been growing concern about energy consumption and environmental impact of datacenters. Some pioneers begin to power datacenters with renewable energy to offset carbon footprint. However, it is challenging to integrate intermittent renewable energy into datacenter power system. Grid-tied system is widely deployed in renewable energy powered datacenters. But the drawbacks (e.g. Harmonic disturbance and costliness) of grid tie inverter harass this design. Besides, the mixture of green load and brown load makes power management heavily depend on software measurement and monitoring, which often suffers inaccuracy. We propose DualPower, a novel power provisioning architecture that enables green datacenters to integrate renewable power supply without grid tie inverters. To optimize DualPower operation, we propose a specially designed power management framework to coordinate workload balancing with power supply switching. We evaluate three optimization schemes (LM, PS and JO) under different datacenter operation scenarios on our trace-driven simulation platform. The experimental results show that DualPower can be as efficient as grid-tied system and has good scalability. In contrast to previous works, DualPower integrates renewable power at lower cost and maintains full availability of datacenter servers.

Reference | Supplementary Material | Related Articles | Metrics
Design and verification of a lightweight reliable virtual machine monitor for a many-core architecture
Yuehua DAI, Yi SHI, Yong QI, Jianbao REN, Peijian WANG
Front Comput Sci    2013, 7 (1): 34-43.   DOI: 10.1007/s11704-012-2084-0
Abstract   HTML   PDF (398KB)

Virtual machine monitors (VMMs) play a central role in cloud computing. Their reliability and availability are critical for cloud computing. Virtualization and device emulation make the VMM code base large and the interface between OS and VMM complex. This results in a code base that is very hard to verify the security of the VMM. For example, a misuse of a VMM hyper-call by a malicious guest OS can corrupt the whole VMM. The complexity of the VMM also makes it hard to formally verify the correctness of the system’s behavior. In this paper a new VMM, operating system virtualization (OSV), is proposed. The multiprocessor boot interface and memory configuration interface are virtualized in OSV at boot time in the Linux kernel. After booting, only inter-processor interrupt operations are intercepted by OSV, which makes the interface between OSV and OS simple. The interface is verified using formal model checking, which ensures a malicious OS cannot attack OSV through the interface. Currently, OSV is implemented based on the AMD Opteron multi-core server architecture. Evaluation results show that Linux running on OSV has a similar performance to native Linux. OSV has a performance improvement of 4%-13% over Xen.

Reference | Related Articles | Metrics
Cited: Crossref(3)
Top-k probabilistic prevalent co-location mining in spatially uncertain data sets
Lizhen WANG,Jun HAN,Hongmei CHEN,Junli LU
Front. Comput. Sci.    2016, 10 (3): 488-503.   DOI: 10.1007/s11704-015-4196-9
Abstract   PDF (774KB)

A co-location pattern is a set of spatial features whose instances frequently appear in a spatial neighborhood. This paper efficiently mines the top-k probabilistic prevalent co-locations over spatially uncertain data sets and makes the following contributions: 1) the concept of the top-k probabilistic prevalent co-locations based on a possible world model is defined; 2) a framework for discovering the top-k probabilistic prevalent co-locations is set up; 3) a matrix method is proposed to improve the computation of the prevalence probability of a top-k candidate, and two pruning rules of the matrix block are given to accelerate the search for exact solutions; 4) a polynomial matrix is developed to further speed up the top-k candidate refinement process; 5) an approximate algorithm with compensation factor is introduced so that relatively large quantity of data can be processed quickly. The efficiency of our proposed algorithms as well as the accuracy of the approximation algorithms is evaluated with an extensive set of experiments using both synthetic and real uncertain data sets.

Reference | Supplementary Material | Related Articles | Metrics
Cited: Crossref(4)
On social computing research collaboration patterns: a social network perspective
Tao WANG, Qingpeng ZHANG, Zhong LIU, Wenli LIU, Ding WEN
Front Comput Sci    2012, 6 (1): 122-130.   DOI: 10.1007/s11704-011-1173-9
Abstract   HTML   PDF (724KB)

The field of social computing emerged more than ten years ago. During the last decade, researchers from a variety of disciplines have been closely collaborating to boost the growth of social computing research. This paper aims at identifying key researchers and institutions, and examining the collaboration patterns in the field. We employ co-authorship network analysis at different levels to study the bibliographic information of 6 543 publications in social computing from 1998 to 2011. This paper gives a snapshot of the current research in social computing and can provide an initial guidance to new researchers in social computing.

Reference | Related Articles | Metrics
Introduction to the special section on peer-to-peer computing and web data management
ZHOU Aoying
Front. Comput. Sci.    2008, 2 (3): 209-210.   DOI: 10.1007/s11704-008-0031-x
Abstract   HTML   PDF (41KB)
Related Articles | Metrics
Hybrid MARTE statecharts
Jing LIU, Ziwei LIU, Jifeng HE, Frédéric MALLET, Zuohua DING
Front Comput Sci    2013, 7 (1): 95-108.   DOI: 10.1007/s11704-012-1301-1
Abstract   HTML   PDF (672KB)

The specification of modeling and analysis of real-time and embedded systems (MARTE) is an extension of the unified modeling language (UML) in the domain of real-time and embedded systems. Even though MARTE time model offers a support to describe both discrete and dense clocks, the biggest effort has been put so far on the specification and analysis of discrete MARTE models. To address hybrid real-time and embedded systems, we propose to extend statecharts using both MARTE and the theory of hybrid automata. We call this extension hybrid MARTE statecharts. It provides an improvement over the hybrid automata in that: the logical time variables and the chronometric time variables are unified. The formal syntax and semantics of hybrid MARTE statecharts are given based on labeled transition systems and live transition systems. As a case study, we model the behavior of a train control system with hybrid MARTE statecharts to demonstrate the benefit.

Reference | Related Articles | Metrics
Cited: Crossref(11)
Enriching short text representation in microblog for clustering
Jiliang TANG, Xufei WANG, Huiji GAO, Xia HU, Huan LIU
Front Comput Sci    2012, 6 (1): 88-101.   DOI: 10.1007/s11704-011-1167-7
Abstract   HTML   PDF (660KB)

Social media websites allow users to exchange short texts such as tweets via microblogs and user status in friendship networks. Their limited length, pervasive abbreviations, and coined acronyms and words exacerbate the problems of synonymy and polysemy, and bring about new challenges to data mining applications such as text clustering and classification. To address these issues, we dissect some potential causes and devise an efficient approach that enriches data representation by employing machine translation to increase the number of features from different languages. Then we propose a novel framework which performs multi-language knowledge integration and feature reduction simultaneously through matrix factorization techniques. The proposed approach is evaluated extensively in terms of effectiveness on two social media datasets from Facebook and Twitter. With its significant performance improvement, we further investigate potential factors that contribute to the improved performance.

Reference | Related Articles | Metrics
Semantic theories of programs with nested interrupts
Yanhong HUANG,Jifeng HE,Huibiao ZHU,Yongxin ZHAO,Jianqi SHI,Shengchao QIN
Front. Comput. Sci.    2015, 9 (3): 331-345.   DOI: 10.1007/s11704-015-3251-x
Abstract   PDF (424KB)

In the design of dependable software for embedded and real-time operating systems, time analysis is a crucial but extremely difficult issue, the challenge of which is exacerbated due to the randomness and nondeterminism of interrupt handling behaviors. Thus research into a theory that integrates interrupt behaviors and time analysis seems to be important and challenging. In this paper, we present a programming language to describe programs with interrupts that is comprised of two essential parts: main program and interrupt handling programs. We also explore a timed operational semantics and a denotational semantics to specify the meanings of our language. Furthermore, a strategy of deriving denotational semantics from the timed operational semantics is provided to demonstrate the soundness of our operational semantics by showing the consistency between the derived denotational semantics and the original denotational semantics.

Reference | Supplementary Material | Related Articles | Metrics
Cited: Crossref(1)
String similarity search and join: a survey
Minghe YU,Guoliang LI,Dong DENG,Jianhua FENG
Front. Comput. Sci.    2016, 10 (3): 399-417.   DOI: 10.1007/s11704-015-5900-5
Abstract   PDF (720KB)

String similarity search and join are two important operations in data cleaning and integration, which extend traditional exact search and exact join operations in databases by tolerating the errors and inconsistencies in the data. They have many real-world applications, such as spell checking, duplicate detection, entity resolution, and webpage clustering. Although these two problems have been extensively studied in the recent decade, there is no thorough survey. In this paper, we present a comprehensive survey on string similarity search and join. We first give the problem definitions and introduce widely-used similarity functions to quantify the similarity. We then present an extensive set of algorithms for string similarity search and join. We also discuss their variants, including approximate entity extraction, type-ahead search, and approximate substring matching. Finally, we provide some open datasets and summarize some research challenges and open problems.

Reference | Supplementary Material | Related Articles | Metrics
Cited: Crossref(9)
Large-scale virtual machines provisioning in clouds:challenges and approaches
Zhaoning ZHANG,Dongsheng LI,Kui WU
Front. Comput. Sci.    2016, 10 (1): 2-18.   DOI: 10.1007/s11704-015-4420-7
Abstract   PDF (748KB)

The scale of global data center market has been explosive in recent years. As the market grows, the demand for fast provisioning of the virtual resources to support elastic, manageable, and economical computing over the cloud becomes high. Fast provisioning of large-scale virtual machines (VMs), in particular, is critical to guarantee quality of service (QoS). In this paper, we systematically review the existing VM provisioning schemes and classify them in three main categories. We discuss the features and research status of each category, and introduce two recent solutions, VMThunder and VMThunder+, both of which can provision hundreds of VMs in seconds.

Reference | Supplementary Material | Related Articles | Metrics
Cited: Crossref(9)
VGQ-Vor: extending virtual grid quadtree with Voronoi diagram for mobile k nearest neighbor queries over mobile objects
Botao WANG, Jingwei QU, Xiaosong WANG, Guoren WANG, Masaru KITSUREGAWA
Front Comput Sci    2013, 7 (1): 44-54.   DOI: 10.1007/s11704-012-2069-z
Abstract   HTML   PDF (723KB)

Performing mobile k nearest neighbor (MkNN) queries whilst also being mobile is a challenging problem. All the mobile objects issuing queries and/or being queried aremobile. The performance of this kind of query relies heavily on the maintenance of the current locations of the objects. The index used for mobile objects must support efficient update operations and efficient query handling. This study aims to improve the performance of the MkNN queries while reducing update costs. Our approach is based on an observation that the frequency of one region changing between being occupied or not by mobile objects is much lower than the frequency of the position changes reported by the mobile objects. We first propose an virtual grid quadtree with Voronoi diagram(VGQ-Vor), which is a two-layer index structure that indexes regions occupied by mobile objects in a quadtree and builds a Voronoi diagram of the regions. Then we propose a moving k nearest neighbor (kNN) query algorithm on the VGQ-Vor and prove the correctness of the algorithm. The experimental results show that the VGQ-Vor outperforms the existing techniques (Bx-tree, Bdual-tree) by one to three orders of magnitude in most cases.

Reference | Related Articles | Metrics
Cited: Crossref(2)
Compound graph based hybrid data center topologies
Lailong LUO,Deke GUO,Wenxin LI,Tian ZHANG,Junjie XIE,Xiaolei ZHOU
Front. Comput. Sci.    2015, 9 (6): 860-874.   DOI: 10.1007/s11704-015-4483-5
Abstract   PDF (658KB)

In large-scale data centers, many servers are interconnected via a dedicated networking structure, so as to satisfy specific design goals, such as the low equipment cost, the high network capacity, and the incremental expansion. The topological properties of a networking structure are critical factors that dominate the performance of the entire data center. The existing networking structures are either fully random or completely structured. Although such networking structures exhibit advantages on given aspects, they suffer obvious shortcomings in other essential fields. In this paper, we aim to design a hybrid topology, called R3, which is the compound graph of structured and random topology. It employs random regular graph as a unit cluster and connects many such clusters by means of a structured topology, i.e., the generalized hypercube. Consequently, the hybrid topology combines the advantages of structured as well as random topologies seamlessly. Meanwhile, a coloring-based algorithm is proposed for R3 to enable fast and accurate routing. R3 possesses many attractive characteristics, such as the modularity and expansibility at the cost of only increasing the degree of any node by one. Comprehensive evaluation results show that our hybrid topology possesses excellent topology properties and network performance.

Reference | Supplementary Material | Related Articles | Metrics
Cited: Crossref(4)
Recent advances in program verification through computer algebra
Lu YANG, Chaochen ZHOU, Naijun ZHAN, Bican XIA,
Front. Comput. Sci.    2010, 4 (1): 1-16.   DOI: 10.1007/s11704-009-0074-7
Abstract   PDF (255KB)
In this paper, we summarize the results on program verification through semi-algebraic systemsSASs solving that we have obtained, including automatic discovery of invariants and ranking functions, symbolic decision procedure for the termination of a class of linear loops, termination analysis of nonlinear systems, and so on.
Reference | Related Articles | Metrics
Cited: Crossref(18)
An institution theory of formal meta-modelling in graphically extended BNF
Hong ZHU
Front Comput Sci    2012, 6 (1): 40-56.   DOI: 10.1007/s11704-012-2902-4
Abstract   HTML   PDF (403KB)

Meta-modelling plays an important role in model driven software development. In this paper, a graphic extension of BNF (GEBNF) is proposed to define the abstract syntax of graphic modelling languages. From a GEBNF syntax definition, a formal predicate logic language can be induced so that meta-modelling can be performed formally by specifying a predicate on the domain of syntactically valid models. In this paper, we investigate the theoretical foundation of this meta-modelling approach. We formally define the semantics of GEBNF and its induced predicate logic languages, then apply Goguen and Burstall’s institution theory to prove that they form a sound and valid formal specification language for meta-modelling.

Reference | Related Articles | Metrics
A graph-based generic type system for object-oriented programs
Wei KE, Zhiming LIU, Shuling WANG, Liang ZHAO
Front Comput Sci    2013, 7 (1): 109-134.   DOI: 10.1007/s11704-012-1307-8
Abstract   HTML   PDF (754KB)

We present a graph-basedmodel of a generic type system for an OO language. The type system supports the features of recursive types, generics and interfaces, which are commonly found in modern OO languages such as Java. In the classical graph theory, we define type graphs, instantiation graphs and conjunction graphs that naturally illustrate the relations among types, generics and interfaces within complex OO programs. The model employs a combination of nominal and anonymous nodes to represent respectively types that are identified by names and structures, and defines graph-based relations and operations on types including equivalence, subtyping, conjunction and instantiation. Algorithms based on the graph structures are designed for the implementation of the type system. We believe that this type system is important for the development of a graph-based logical foundation of a formal method for verification of and reasoning about OO programs.

Reference | Related Articles | Metrics
Cited: Crossref(3)
Basic research in computer science and software engineering at SKLCS
ZHANG Jian, ZHANG Wenhui, ZHAN Naijun, SHEN Yidong, CHEN Haiming, ZHANG Yunquan, WANG Yongji, WU Enhua, WANG Hongan, ZHU Xueyang
Front. Comput. Sci.    2008, 2 (1): 1-11.   DOI: 10.1007/s11704-008-0001-3
Abstract   HTML   PDF (167KB)
The State Key Laboratory of Computer Science (SKLCS) is committed to basic research in computer science and software engineering. The research topics of the laboratory include: concurrency theory, theory and algorithms for real-time systems, formal specifications based on context-free grammars, semantics of programming languages, model checking, automated reasoning, logic programming, software testing, software process improvement, middleware technology, parallel algorithms and parallel software, computer graphics and human-computer interaction. This paper describes these topics in some detail and summarizes some results obtained in recent years.
Reference | Related Articles | Metrics
Cited: Crossref(1)
Hierarchical caches in content-centric networks: modeling and analysis
Zixiao JIA,Jiwei HUANG,Chuang LIN
Front. Comput. Sci.    2015, 9 (6): 846-859.   DOI: 10.1007/s11704-015-4064-7
Abstract   PDF (862KB)

Content-centric network (CCN) is a new Internet architecture in which content is treated as the primitive of communication. In CCN, routers are equipped with content stores at the content level, which act as caches for frequently requested content. Based on this design, the Internet is available to provide content distribution services without any application-layer support.

In addition, as caches are integrated into routers, the overall performance of CCN will be deeply affected by the caching efficiency. In this paper, our aim is to gain some insights on how caches should be designed to maintain a high performance in a cost-efficient way. We try to model the two-layer cache hierarchy composed of CCN routers using a two-dimensional discrete-time Markov chain, and develop an efficient algorithm to calculate the hit ratios of these caches. Simulations validate the accuracy of our modeling method, and convey some meaningful information which can help us better understand the caching mechanism of CCN.

Reference | Supplementary Material | Related Articles | Metrics
RDF partitioning for scalable SPARQL query processing
Xiaoyan WANG,Tao YANG,Jinchuan CHEN,Long HE,Xiaoyong DU
Front. Comput. Sci.    2015, 9 (6): 919-933.   DOI: 10.1007/s11704-015-4104-3
Abstract   PDF (1014KB)

The volume of RDF data increases dramatically within recent years, while cloud computing platforms like Hadoop are supposed to be a good choice for processing queries over huge data sets for their wonderful scalability. Previous work on evaluating SPARQL queries with Hadoop mainly focus on reducing the number of joins through careful split of HDFS files and algorithms for generating Map/Reduce jobs. However, the way of partitioning RDF data could also affect system performance. Specifically, a good partitioning solution would greatly reduce or even totally avoid cross-node joins, and significantly cut down the cost in query evaluation. Based on HadoopDB, this work processes SPARQL queries in a hybrid architecture, where Map/Reduce takes charge of the computing tasks, and RDF query engines like RDF-3X store the data and execute join operations. According to the analysis of query workloads, this work proposes a novel algorithm for automatically partitioning RDF data and an approximate solution to physically place the partitions in order to reduce data redundancy. It also discusses how to make a good trade-off between query evaluation efficiency and data redundancy. All of these proposed approaches have been evaluated by extensive experiments over large RDF data sets.

Reference | Supplementary Material | Related Articles | Metrics
Cited: Crossref(3)
State of the art and prospects of structured sensing matrices in compressed sensing
Kezhi LI,Shuang CONG
Front. Comput. Sci.    2015, 9 (5): 665-677.   DOI: 10.1007/s11704-015-3326-8
Abstract   PDF (513KB)

Compressed sensing (CS) enables people to acquire the compressed measurements directly and recover sparse or compressible signals faithfully even when the sampling rate is much lower than the Nyquist rate. However, the pure random sensing matrices usually require huge memory for storage and high computational cost for signal reconstruction. Many structured sensing matrices have been proposed recently to simplify the sensing scheme and the hardware implementation in practice. Based on the restricted isometry property and coherence, couples of existing structured sensing matrices are reviewed in this paper, which have special structures, high recovery performance, and many advantages such as the simple construction, fast calculation and easy hardware implementation. The number of measurements and the universality of different structure matrices are compared.

Reference | Supplementary Material | Related Articles | Metrics
Cited: Crossref(5)
VIPLFaceNet: an open source deep face recognition SDK
Xin LIU,Meina KAN,Wanglong WU,Shiguang SHAN,Xilin CHEN
Front. Comput. Sci.    2017, 11 (2): 208-218.   DOI: 10.1007/s11704-016-6076-3
Abstract   PDF (461KB)

Robust face representation is imperative to highly accurate face recognition. In this work, we propose an open source face recognition method with deep representation named as VIPLFaceNet, which is a 10-layer deep convolutional neural network with seven convolutional layers and three fully-connected layers. Compared with the well-known AlexNet, our VIPLFaceNet takes only 20% training time and 60% testing time, but achieves 40% drop in error rate on the real-world face recognition benchmark LFW. Our VIPLFaceNet achieves 98.60% mean accuracy on LFW using one single network. An open-source C++ SDK based on VIPLFaceNet is released under BSD license. The SDK takes about 150ms to process one face image in a single thread on an i7 desktop CPU. VIPLFaceNet provides a state-of-the-art start point for both academic and industrial face recognition applications.

Reference | Related Articles | Metrics