Service level agreements (SLAs) are used to maintain reliable quality of service between cloud providers and clients in cloud environments. There has therefore been a growing effort to reduce power consumption while complying with the SLA, by maximizing physical machine (PM)-level utilization and applying load-balancing techniques in infrastructure as a service. However, with the recent introduction of container as a service by cloud providers, containers are increasingly popular and are expected to become the major deployment model in the cloud environment, specifically in platform as a service. Therefore, reducing power consumption while complying with the SLA at the virtual machine (VM) level becomes essential. In this context, we exploit a container consolidation scheme with usage prediction to achieve these objectives. To obtain a reliable characterization of overutilized and underutilized PMs, our scheme jointly exploits the current and predicted CPU utilization, based on the local history of the considered PMs, during container consolidation. We demonstrate our solution through simulations on real workloads. The experimental results show that the container consolidation scheme with usage prediction reduces the power consumption, the number of container migrations, and the average number of active VMs while complying with the SLA.
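The idea of combining current and predicted utilization can be illustrated with a minimal sketch. The abstract does not specify the predictor or the thresholds, so everything below is an assumption: a least-squares linear trend over the local history stands in for the prediction model, `upper=0.8` and `lower=0.2` are hypothetical thresholds, and a host is flagged only when both the current and the predicted value cross the same threshold.

```python
from statistics import mean


def predict_next(history):
    """Extrapolate a least-squares linear trend one step ahead (assumed predictor)."""
    n = len(history)
    if n < 2:
        return history[-1]
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(history)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, history))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Utilization is a fraction, so clamp the forecast to [0, 1].
    return min(1.0, max(0.0, intercept + slope * n))


def classify(history, upper=0.8, lower=0.2):
    """Label a host from its CPU utilization history (fractions in [0, 1])."""
    current = history[-1]
    predicted = predict_next(history)
    if current > upper and predicted > upper:
        return "overutilized"
    if current < lower and predicted < lower:
        return "underutilized"
    return "normal"
```

Under these assumptions, `classify([0.7, 0.8, 0.9])` returns `"overutilized"`, while `classify([0.9, 0.5, 0.85])` returns `"normal"` because the trend does not confirm the momentary spike. Requiring both signals is one way to avoid migrations triggered by short-lived peaks.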
When evaluating the performance of a distributed software-defined network (SDN) controller architecture in data center networks, the number of controllers required for a given network topology and their locations are major issues of interest. To address these issues, this study proposes adaptively adjusting and mapping controllers (AAMcon) to design a stateful data plane. We use complex network community theory to select a key switch at which to place the controller, so that the controller is close to the switches it controls in a subnet. A physically distributed but logically centralized controller pool is built based on network function virtualization (NFV). We then propose a fast start/overload avoid algorithm to adaptively adjust the number of controllers according to demand. We analyze AAMcon to find the optimal distance between the switch and the controller. Finally, experiments show the following results. (1) The number of controllers in AAMcon closely follows the demand, and each controller responds to switch requests over the shortest distance, minimizing the delay between the switch and the controller. (2) For failure tolerance, AAMcon shows good robustness. (3) AAMcon incurs less delay in networks with a more pronounced community structure. In fact, there is an inverse relationship between the community modularity and the average distance between the switch and the controller, i.e., the average delay decreases when the community modularity increases. (4) AAMcon can achieve load balance between the controllers. (5) Compared to DCP-GK and k-critical, AAMcon shows good performance.
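The placement step can be sketched as follows. This is not the AAMcon algorithm itself, whose details the abstract does not give. It assumes the communities have already been detected, and it assumes the "key switch" is the member with the smallest average hop distance to the other switches in its community. The function names and the adjacency-dictionary format are hypothetical.

```python
from collections import deque


def hop_distances(adj, start, members):
    """Breadth-first hop counts from start, staying inside the community."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in members and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def place_controller(adj, community):
    """Return (switch, average hop distance) for the most central switch."""
    members = set(community)
    best, best_avg = None, float("inf")
    for s in sorted(members):
        dist = hop_distances(adj, s, members)
        if len(dist) < len(members):
            continue  # s cannot reach every member through the community
        # The average includes the zero distance from s to itself.
        avg = sum(dist.values()) / len(members)
        if avg < best_avg:
            best, best_avg = s, avg
    return best, best_avg
```

For a line of switches `1-2-3` connected to a second line `4-5-6` through the link `3-4`, the sketch places one controller at switch 2 and one at switch 5 when the two lines are treated as separate communities.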
Ordinal regression (OR), or ordinal classification, is a machine learning paradigm for ordinal labels. To date, a variety of methods have been proposed, including kernel-based and neural-network-based methods with strong performance. However, existing OR methods rarely consider the latent structure of the given data, particularly the interactions among covariates, and thus lose interpretability to some extent. To compensate for this, we present a new OR method, the ordinal factorization machine with hierarchical sparsity (OFMHS), which combines a factorization machine with hierarchical sparsity to explore the hierarchical structure behind the input variables. For the sake of optimization, we formulate OFMHS as a convex optimization problem and solve it with the efficient alternating direction method of multipliers (ADMM) algorithm. Experimental results on synthetic and real datasets demonstrate the superiority of our method in both performance and selection of significant variables.
Compilers are widely used infrastructure for accelerating software development and are expected to be trustworthy. In the literature, various testing technologies have been proposed to guarantee the quality of compilers. However, there remains an obstacle to comprehensively characterizing and understanding compiler testing. To overcome this obstacle, we propose a literature analysis framework to gain insights into the compiler testing area. First, we perform an extensive search to construct a dataset of compiler testing papers. Then, we conduct a bibliometric analysis of the productive authors, the influential papers, and the frequently tested compilers based on our dataset. Finally, we use association rules and collaboration networks to mine the authorships and the communities of interest among researchers and keywords. Some valuable results are reported. We find that the USA is the leading country, with the most influential researchers and institutions. The most active keyword is “random testing”. We also find that most researchers in the compiler testing area have broad interests but work with a small group of collaborators.
In the development of a system-on-a-chip (SoC) design, a family of SystemC transaction level models (TLMs) is often created. TLMs in the same family often share common functionalities but differ in their timing, implementation, configuration, and performance across the phases of SoC development. In most cases, all the TLMs in a family must be verified before the follow-up design activities. In our previous work, we called such a family a TLM product line (TPL) and proposed a feature-oriented (FO) design methodology for efficient TPL development. However, developers can only verify the TLMs in a family one by one, which causes a large amount of duplicated verification overhead. Functional verification of a TPL has therefore become a bottleneck in our proposed methodology. In this paper, we propose a novel TPL verification method for FO designs. In our method, for a given property, we can exponentially reduce the number of TLMs to be verified by identifying mute-feature-modules (MFMs), which avoids duplicated verification. The proposed method is presented both informally and formally, and its correctness is proved. The theoretical analysis and experimental results on a real design show the correctness and efficiency of the proposed method.
When users store data in big data platforms, the integrity of outsourced data is a major concern for data owners because they lack direct control over the data. However, the existing remote data auditing schemes for big data platforms are only applicable to static data. To verify the integrity of dynamic data in a Hadoop big data platform, we present a dynamic auditing scheme that meets the special requirements of Hadoop. Concretely, a new data structure, the Data Block Index Table, is designed to support dynamic data operations on HDFS (Hadoop distributed file system), including appending, inserting, deleting, and modifying. Then, combined with the MapReduce framework, a dynamic auditing algorithm is designed to audit the data on HDFS concurrently. Analysis shows that the proposed scheme is secure against forgery, replacement, and replay attacks on the big data platform. It is also efficient in both computation and communication.
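The four dynamic operations can be illustrated with a minimal index-table sketch. The abstract does not describe the actual layout of the Data Block Index Table, so the fields here are assumptions: each logical position maps to a hypothetical stable `block_id` and a `version` counter. Stable identifiers mean an insertion or deletion does not renumber the blocks that follow, and the version changes whenever a block is modified, so a verifier could tell a stale block from a current one.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    block_id: int  # stable identifier, never reused after deletion (assumed)
    version: int   # incremented on every modification (assumed)


@dataclass
class BlockIndexTable:
    entries: list = field(default_factory=list)
    next_id: int = 0

    def _fresh(self):
        """Create an entry with a new identifier and version 1."""
        entry = Entry(self.next_id, 1)
        self.next_id += 1
        return entry

    def append(self):
        self.entries.append(self._fresh())

    def insert(self, position):
        self.entries.insert(position, self._fresh())

    def delete(self, position):
        del self.entries[position]

    def modify(self, position):
        self.entries[position].version += 1

    def layout(self):
        """Logical order of the file as (block_id, version) pairs."""
        return [(e.block_id, e.version) for e in self.entries]
```

After three appends and `insert(1)`, `layout()` is `[(0, 1), (3, 1), (1, 1), (2, 1)]`: the new block takes identifier 3 and the existing blocks keep theirs.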
Performance variability, stemming from nondeterministic hardware and software behaviors or from deterministic behaviors such as measurement bias, is a well-known phenomenon of computer systems. It increases the difficulty of comparing computer performance metrics and is slated to become even more of a concern as interest in Big Data analytics increases. Conventional methods use various measures (such as the geometric mean) to quantify performance on different benchmarks and compare computers without considering this variability, which may lead to wrong conclusions. In this paper, we propose three resampling methods for performance evaluation and comparison: a randomization test for a general performance comparison between two computers, bootstrapping confidence estimation, and an empirical distribution and five-number summary for performance evaluation. The results show that for both PARSEC and high-variance BigDataBench benchmarks, 1) the randomization test substantially improves our chance of identifying a performance difference when the difference is not large; 2) bootstrapping confidence estimation provides an accurate confidence interval for the performance comparison measure (e.g., ratio of geometric means); and 3) when the difference is very small, a single test is often not enough to reveal the nature of the computer performance due to the variability of computer systems. We further propose using the empirical distribution to evaluate computer performance and a five-number summary to summarize it. We use published SPEC 2006 results to investigate the sources of performance variation by predicting performance and relative variation for 8,236 machines. We achieve a correlation of 0.992 for predicted performance and a correlation of 0.5 between predicted and measured relative variation. Finally, we propose a novel biplotting technique to visualize the effectiveness of benchmarks and to cluster machines by behavior. We illustrate the results and conclusions through detailed Monte Carlo simulation studies and real examples.
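The first two resampling methods can be sketched in a few lines. The abstract does not state the exact resampling scheme, so this sketch makes assumptions: the data are paired (`a[i]` and `b[i]` are scores for the same benchmark on the two machines), the test statistic is the absolute log of the ratio of geometric means, the randomization test swaps machine labels within each benchmark, and the bootstrap resamples benchmarks with replacement and reports a percentile interval. The function names are hypothetical.

```python
import math
import random


def geo_mean(xs):
    return math.exp(sum(math.log(x) for x in xs) / len(xs))


def gm_ratio(a, b):
    return geo_mean(a) / geo_mean(b)


def randomization_test(a, b, trials=2000, seed=0):
    """Two-sided p-value under the null that machine labels are exchangeable."""
    rng = random.Random(seed)
    observed = abs(math.log(gm_ratio(a, b)))
    hits = 0
    for _ in range(trials):
        pa, pb = [], []
        for x, y in zip(a, b):
            if rng.random() < 0.5:  # swap the labels for this benchmark
                x, y = y, x
            pa.append(x)
            pb.append(y)
        if abs(math.log(gm_ratio(pa, pb))) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (trials + 1)


def bootstrap_ci(a, b, trials=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the ratio of geometric means."""
    rng = random.Random(seed)
    n = len(a)
    stats = []
    for _ in range(trials):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(gm_ratio([a[i] for i in idx], [b[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * trials)]
    hi = stats[min(trials - 1, int((1 - alpha / 2) * trials))]
    return lo, hi
```

A small p-value from `randomization_test` suggests the observed ratio is unlikely under label exchangeability. An interval from `bootstrap_ci` that excludes 1.0 points the same way and also indicates the size of the difference.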
With the increasing number of GPS-equipped vehicles, more and more trajectories are generated continuously, making urban applications such as route planning feasible. In general, a popular route that has been travelled frequently is a good choice, especially for people who are not familiar with the road network. Moreover, accurate estimation of the travel cost (such as travel time, travel fee, and fuel consumption) benefits a well-scheduled trip plan. In this paper, we address this issue by finding the popular route with travel cost estimation. To this end, we design a system consisting of three main components. First, we propose a novel structure, called the popular traverse graph, in which each node is a popular location and each edge is a popular route between locations, to summarize historical trajectories without road network information. Second, we propose a self-adaptive method to model the travel cost on each popular route over different time intervals, so that each time interval has a stable travel cost. Finally, based on the graph, given a query consisting of a source, a destination, and a leaving time, we devise an efficient route planning algorithm, which considers optimal route concatenation, to search for the popular route from the source to the destination at the leaving time with accurate travel cost estimation. Moreover, we conduct comprehensive experiments and implement our system as a mobile app. The results show that our method is both effective and efficient.
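A query over such a graph can be illustrated with a time-dependent shortest-path search. This is a substitute technique, not the route concatenation algorithm of the paper, whose details the abstract does not give. The sketch assumes that the travel cost is travel time in minutes, that each edge carries a hypothetical piecewise-constant profile of `(interval_start_minute, minutes)` pairs sorted by start and beginning at minute 0, and that departing later never lets a traveller arrive earlier (the condition under which Dijkstra's algorithm stays correct with time-dependent costs).

```python
import heapq


def travel_time(profile, depart):
    """Look up the cost of the interval that contains the departure minute."""
    minute_of_day = depart % 1440
    cost = profile[0][1]
    for start, minutes in profile:
        if minute_of_day >= start:
            cost = minutes
    return cost


def plan(graph, source, dest, leave):
    """Return (path, total_minutes), or None if dest is unreachable."""
    best = {source: leave}  # earliest known arrival time per node
    prev = {}
    heap = [(leave, source)]
    while heap:
        t, node = heapq.heappop(heap)
        if node == dest:
            break
        if t > best.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, profile in graph.get(node, {}).items():
            arrive = t + travel_time(profile, t)
            if arrive < best.get(nxt, float("inf")):
                best[nxt] = arrive
                prev[nxt] = node
                heapq.heappush(heap, (arrive, nxt))
    if dest not in best:
        return None
    path = [dest]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], best[dest] - leave
```

With edges `A→B` costing 10 minutes before 08:00 and 30 minutes afterwards, `A→C` costing 15, and `B→D` and `C→D` costing 10 each, a query leaving at minute 0 returns `(["A", "B", "D"], 20)`, while one leaving at minute 480 returns `(["A", "C", "D"], 25)`. The best route changes with the leaving time because the edge costs do.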