Frontiers of Computer Science

Jun 2020, Volume 14 Issue 3

Select all

RESEARCH ARTICLE

A provable-secure and practical two-party distributed signing protocol for SM2 signature algorithm

Yudi ZHANG, Debiao HE, Mingwu ZHANG, Kim-Kwang Raymond CHOO

2020, 14(3): 143803. https://doi.org/10.1007/s11704-018-8106-9

Download PDF

Mobile devices are widely used for data access, communications and storage. However, storing a private key for signature and other cryptographic usage on a single mobile device can be challenging, due to its computational limitations. Thus, a number of (t, n) threshold secret sharing schemes designed to minimize private key from leakage have been proposed in the literature. However, existing schemes generally suffer from key reconstruction attack. In this paper, we propose an efficient and secure two-party distributed signing protocol for the SM2 signature algorithm. The latter has been mandated by the Chinese government for all electronic commerce applications. The proposed protocol separates the private key to storage on two devices and can generate a valid signature without the need to reconstruct the entire private key. We prove that our protocol is secure under nonstandard assumption. Then, we implement our protocol using MIRACL Cryptographic SDK to demonstrate that the protocol can be deployed in practice to prevent key disclosure.
RESEARCH ARTICLE

Hybritus: a password strength checker by ensemble learning from the query feedbacks of websites

Yongzhong HE, Endalew Elsabeth ALEM, Wei WANG

2020, 14(3): 143802. https://doi.org/10.1007/s11704-019-7342-y

Download PDF

Password authentication is vulnerable to dictionary attacks. Password strength measurement helps users to choose hard-to-guess passwords and enhance the security of systems based on password authentication. Although there are many password strength metrics and tools, none of them produces an objective measurement with inconsistent policies and different dictionaries. In this work, we analyzed the password policies and checkers of top 100 popular websites that are selected from Alexa rankings. The checkers are inconsistent and thus they may label the same password as different strength labels, because each checker is sensitive to its configuration, e.g., the algorithm used and the training data. Attackers are empowered to exploit the above vulnerabilities to crack the protected systems more easily. As such, single metrics or local training data are not enough to build a robust and secure password checker. Based on these observations, we proposed Hybritus that integrates different websites’ strategies and views into a global and robust model of the attackers with multiple layer perceptron (MLP) neural networks. Our data set is comprised of more than 3.3 million passwords taken from the leaked, transformed and randomly generated dictionaries. The data set were sent to 10 website checkers to get the feedbacks on the strength of passwords labeled as strong, medium and weak. Then we used the features of passwords generated by term frequency–inverse document frequency to train and test Hybritus. The experimental results show that the accuracy of passwords strength checking can be as high as 97.7% and over 94% even if it was trained with only ten thousand passwords. User study shows that Hybritus is usable as well as secure.
RESEARCH ARTICLE

TPII: tracking personally identifiable information via user behaviors in HTTP traffic

Yi LIU, Tian SONG, Lejian LIAO

2020, 14(3): 143801. https://doi.org/10.1007/s11704-018-7451-z

Download PDF

It is widely common that mobile applications collect non-critical personally identifiable information (PII) from users’ devices to the cloud by application service providers (ASPs) in a positive manner to provide precise and recommending services. Meanwhile, Internet service providers (ISPs) or local network providers also have strong requirements to collect PIIs for finer-grained traffic control and security services. However, it is a challenge to locate PIIs accurately in the massive data of network traffic just like looking a needle in a haystack. In this paper, we address this challenge by presenting an efficient and light-weight approach, namely TPII, which can locate and track PIIs from the HTTP layer rebuilt from raw network traffics. This approach only collects three features from HTTP fields as users’ behaviors and then establishes a tree-based decision model to dig PIIs efficiently and accurately.Without any priori knowledge, TPII can identify any types of PIIs from any mobile applications, which has a broad vision of applications. We evaluate the proposed approach on a real dataset collected from a campus network with more than 13k users. The experimental results show that the precision and recall of TPII are 91.72% and 94.51% respectively and a parallel implementation of TPII can achieve 213 million records digging and labelling within one hour, reaching near to support 1Gbps wirespeed inspection in practice. Our approach provides network service providers a practical way to collect PIIs for better services.
LETTER

A privacy-enhancing scheme against contextual knowledge-based attacks in location-based services

Jiaxun HUA, Yu LIU, Yibin SHEN, Xiuxia TIAN, Yifeng LUO, Cheqing JIN

2020, 14(3): 143605. https://doi.org/10.1007/s11704-019-8300-4

Download PDF
RESEARCH ARTICLE

Evaluating and improving the interpretability of item embeddings using item-tag relevance information

Tao LIAN, Lin DU, Mingfu ZHAO, Chaoran CUI, Zhumin CHEN, Jun MA

2020, 14(3): 143603. https://doi.org/10.1007/s11704-019-7427-7

Download PDF

Matrix factorization (MF) methods have superior recommendation performance and are flexible to incorporate other side information, but it is hard for humans to interpret the derived latent factors. Recently, the item-item cooccurrence information is exploited to learn item embeddings and enhance the recommendation performance. However, the item-item co-occurrence information, constructed from the sparse and long-tail distributed user-item interaction matrix, is over-estimated for rare items, which could lead to bias in learned item embeddings. In this paper, we seek to evaluate and improve the interpretability of item embeddings by leveraging a dense item-tag relevance matrix. Specifically, we design two metrics to quantitatively evaluate the interpretability of item embeddings from different viewpoints: interpretability of individual dimensions of item embeddings and semantic coherence of local neighborhoods in the latent space.We also propose a tag-informed item embedding (TIE) model that jointly factorizes the user-item interaction matrix, the item-item co-occurrence matrix and the item-tag relevance matrix with shared item embeddings so that different forms of information can co-operate with each other to learn better item embeddings. Experiments on the MovieLens20M dataset demonstrate that compared with other state-of-the-art MF methods, TIE achieves better top-N recommendations, and the relative improvement is larger when the user-item interaction matrix becomes sparser. By leveraging the itemtag relevance information, individual dimensions of item embeddings are more interpretable and local neighborhoods in the latent space are more semantically coherent; the bias in learned item embeddings are also mitigated to some extent.
RESEARCH ARTICLE

Personalized query suggestion diversification in information retrieval

Wanyu CHEN, Fei CAI, Honghui CHEN, Maarten DE RIJKE

2020, 14(3): 143602. https://doi.org/10.1007/s11704-018-7283-x

Download PDF

Query suggestions help users refine their queries after they input an initial query. Previous work on query suggestion has mainly concentrated on approaches that are similarity-based or context-based, developing models that either focus on adapting to a specific user (personalization) or on diversifying query aspects in order to maximize the probability of the user being satisfied (diversification). We consider the task of generating query suggestions that are both personalized and diversified. We propose a personalized query suggestion diversification (PQSD) model, where a user’s long-term search behavior is injected into a basic greedy query suggestion diversification model that considers a user’s search context in their current session. Query aspects are identified through clicked documents based on the open directory project (ODP) with a latent dirichlet allocation (LDA) topic model. We quantify the improvement of our proposed PQSD model against a state-of-the-art baseline using the public america online (AOL) query log and show that it beats the baseline in terms of metrics used in query suggestion ranking and diversification. The experimental results show that PQSD achieves its best performance when only queries with clicked documents are taken as search context rather than all queries, especially when more query suggestions are returned in the list.
RESEARCH ARTICLE

Most similar maximal clique query on large graphs

Yun PENG, Yitong XU, Huawei ZHAO, Zhizheng ZHOU, Huimin HAN

2020, 14(3): 143601. https://doi.org/10.1007/s11704-019-7235-0

Download PDF

This paper studies the most similar maximal clique query (MSMCQ). Given a graphG and a set of nodes Q,MSMCQ is to find the maximal clique of G having the largest similarity with Q. MSMCQ has many real applications including advertising industry, public security, task crowdsourcing and social network, etc. MSMCQ can be studied as a special case of the general set similarity query (SSQ). However, the MCs of G has several specialties from the general sets. Based on the specialties of MCs, we propose a novel index, namely MCIndex. MCIndex outperforms the state-of-the-art SSQ method significantly in terms of the number of candidates and the query time. Specifically, we first construct an inverted index I for all the MCs of G. Since the MCs in a posting list often have a lot of overlaps, MCIndex selects some pivots to cluster the MCs with a small radius. Given a query Q, we compute the distance from the pivots to Q. The clusters of the pivots assured not answer can be pruned by our distance based pruning rule. Since it is NP-hard to construct a minimum MCIndex, we propose to construct a minimal MCIndex on I(v) with an approximation ratio 1+ ln |I(v)|. Since the MCs have properties that are inherent of graph structure, we further propose a SIndex within each cluster of a MCIndex and a structure based pruning rule. SIndex can significantly reduce the number of candidates. Since the sizes of intersections between Q and many MCs need to be computed during the query evaluation, we also propose a binary representation of MCs to improve the efficiency of the intersection size computation. Our extensive experiments confirm the effectiveness and efficiency of our proposed techniques on several real-world datasets.
RESEARCH ARTICLE

Varna-based optimization: a novel method for capacitated controller placement problem in SDN

Ashutosh Kumar SINGH, Saurabh MAURYA, Shashank SRIVASTAVA

2020, 14(3): 143402. https://doi.org/10.1007/s11704-018-7277-8

Download PDF

Recently, software defined networking (SDN) is a promising paradigm shift that decouples the control plane from the data plane. It can centrally monitor and control the network through softwarization, i.e., controller. Multiple controllers are a necessity of current SDN basedWAN. Placing multiple controllers in an optimum way is known as controller placement problem (CPP). Earlier, solutions of CPP only concentrated on propagation latency but overlooked the capacity of controllers and the dynamic load on switches, which is a significant factor in real networks. In this paper, we develop a novel optimization algorithm named varna-based optimization (VBO) and use it to solve CPP. To the best of our knowledge, this is the first attempt to minimize the total average latency of SDN along with the implementation of TLBO and Jaya algorithms to solve CPP for all twelve possible scenarios. Our experimental results show that TLBO outperforms PSO, and VBO outperforms TLBO and Jaya algorithms in all scenarios for all topologies.
RESEARCH ARTICLE

Connection models for the Internet-of-Things

Kangli HE, Holger HERMANNS, Hengyang WU, Yixiang CHEN

2020, 14(3): 143401. https://doi.org/10.1007/s11704-018-7395-3

Download PDF

The Internet-of-Things (IoT) is expected to swamp the world. In order to study and understand the emergent behaviour of connected things, effective support for their modelling is needed. At the heart of IoT are flexible and adaptive connection patterns between things, which can naturally be modelled by channel-based coordination primitives, and characteristics of connection failure probabilities, execution and waiting times, as well as resource consumption. The latter is especially important in light of severely limited power and computation budgets inside the things. In this paper, we tackle the IoT modelling challenge, based on a conservative extension of channel-based Reo circuits. We introduce a model called priced probabilistic timed constraint automaton, which combines models of probabilistic and timed aspects, and integrates pricing information. An expressive logic called priced probabilistic timed scheduled data stream logic is presented, so as to enable the specification and verification of properties, which characterize data-flow streams and prices. A small but illustrative IoT case demonstrates the principal benefits of the proposed approach.
RESEARCH ARTICLE

Bangla language modeling algorithm for automatic recognition of hand-sign-spelled Bangla sign language

Muhammad Aminur RAHAMAN, Mahmood JASIM, Md. Haider ALI, Md. HASANUZZAMAN

2020, 14(3): 143302. https://doi.org/10.1007/s11704-018-7253-3

Download PDF

Because of using traditional hand-sign segmentation and classification algorithm, many diversities of Bangla language including joint-letters, dependent vowels etc. and representing 51 Bangla written characters by using only 36 hand-signs, continuous hand-sign-spelled Bangla sign language (BdSL) recognition is challenging. This paper presents a Bangla language modeling algorithm for automatic recognition of hand-sign-spelled Bangla sign language which consists of two phases. First phase is designed for hand-sign classification and the second phase is designed for Bangla language modeling algorithm (BLMA) for automatic recognition of hand-sign-spelledBangla sign language. In first phase, we have proposed two step classifiers for hand-sign classification using normalized outer boundary vector (NOBV) and window-grid vector (WGV) by calculating maximum inter correlation coefficient (ICC) between test feature vector and pre-trained feature vectors. At first, the system classifies hand-signs using NOBV. If classification score does not satisfy specific threshold then another classifier based on WGV is used. The system is trained using 5,200 images and tested using another (5, 200 × 6) images of 52 hand-signs from 10 signers in 6 different challenging environments achieving mean accuracy of 95.83% for classification with the computational cost of 39.972 milliseconds per frame. In the Second Phase, we have proposed Bangla language modeling algorithm (BLMA) which discovers all “hidden characters” based on “recognized characters” from 52 hand-signs of BdSL to make any Bangla words, composite numerals and sentences in BdSL with no training, only based on the result of first phase. To the best of our knowledge, the proposed system is the first system in BdSL designed on automatic recognition of hand-sign-spelled BdSL for large lexicon. The system is tested for BLMA using hand-sign-spelled 500 words, 100 composite numerals and 80 sentences in BdSL achieving mean accuracy of 93.50%, 95.50% and 90.50% respectively.
RESEARCH ARTICLE

A correlative denoising autoencoder to model social influence for top-N recommender system

Yiteng PAN, Fazhi HE, Haiping YU

2020, 14(3): 143301. https://doi.org/10.1007/s11704-019-8123-3

Download PDF

In recent years, there are numerous works been proposed to leverage the techniques of deep learning to improve social-aware recommendation performance. In most cases, it requires a larger number of data to train a robust deep learning model, which contains a lot of parameters to fit training data. However, both data of user ratings and social networks are facing critical sparse problem, which makes it not easy to train a robust deep neural networkmodel. Towards this problem, we propose a novel correlative denoising autoencoder (CoDAE) method by taking correlations between users with multiple roles into account to learn robust representations from sparse inputs of ratings and social networks for recommendation. We develop the CoDAE model by utilizing three separated autoencoders to learn user featureswith roles of rater, truster and trustee, respectively. Especially, on account of that each input unit of user vectors with roles of truster and trustee is corresponding to a particular user, we propose to utilize shared parameters to learn common information of the units that corresponding to same users. Moreover, we propose a related regularization term to learn correlations between user features that learnt by the three subnetworks of CoDAE model. We further conduct a series of experiments to evaluate the proposed method on two public datasets for Top-N recommendation task. The experimental results demonstrate that the proposed model outperforms state-of-the-art algorithms on rank-sensitive metrics of MAP and NDCG.
LETTER

Software testing without the oracle correctness assumption

Tun LI, Wanwei LIU, Xinrui GUO, Ji WANG

2020, 14(3): 143203. https://doi.org/10.1007/s11704-019-8434-4

Download PDF
LETTER

Perspectives on search strategies in automated test input generation

Yang CAO, Yanyan JIANG, Chang XU, Jun MA, Xiaoxing MA

2020, 14(3): 143202. https://doi.org/10.1007/s11704-019-8281-3

Download PDF
RESEARCH ARTICLE

TRAP: trace runtime analysis of properties

Daian YUE, Vania JOLOBOFF, Frédéric MALLET

2020, 14(3): 143201. https://doi.org/10.1007/s11704-018-7217-7

Download PDF

We present a method and a tool for the verification of causal and temporal properties for embedded systems. We analyze trace streams resulting from the execution of virtual prototypes that combine simulated hardware and embedded software. The main originality lies in the use of logical clocks to abstract away irrelevant information from the trace. We propose a model-based approach that relies on domain specific languages (DSL). A first DSL, called TISL (trace item specification language), captures the relevant data structures. A second DSL, called STML (simulation trace mapping language), abstracts the simulation raw data into logical clocks, abstracting simulation data into relevant observation probes and thus reducing the trace streams size. The third DSL, called TPSL, defines a set of behavioral patterns that include widely used temporal properties. This is meant for users who are not familiar with temporal logics. Each pattern is transformed into an automata. All the automata are executed concurrently and each one raises an error if and when the related TPSL property is violated. The contribution is the integration of this pattern-based property specification language into the SimSoC virtual prototyping framework without requiring to recompile all the simulation models when the properties evolve. We illustrate our approach with experiments that show the possibility to use multi-core platforms to parallelize the simulation and verification processes, thus reducing the verification time.
RESEARCH ARTICLE

Transparent partial page migration between CPU and GPU

Shiqing ZHANG, Zheng QIN, Yaohua YANG, Li SHEN, Zhiying WANG

2020, 14(3): 143101. https://doi.org/10.1007/s11704-018-7386-4

Download PDF

Despite the increasing investment in integrated GPU and next-generation interconnect research, discrete GPU connected by PCIe still account for the dominant position of the market, the management of data communication between CPU and GPU continues to evolve. Initially, the programmer explicitly controls the data transfer between CPU and GPU. To simplify programming and enable systemwide atomic memory operations, GPU vendors have developed a programming model that provides a single, virtual address space for accessing all CPU and GPU memories in the system. The page migration engine in this model automatically migrates pages between CPU and GPU on demand. To meet the needs of high-performanceworkloads, the page size tends to be larger. Limited by low bandwidth and high latency interconnects compared to GDDR, larger page migration has longer delay, which may reduce the overlap of computation and transmission, waste time to migrate unrequested data, block subsequent requests, and cause serious performance decline. In this paper, we propose partial page migration that only migrates the requested part of a page to reduce the migration unit, shorten the migration latency, and avoid the performance degradation of the full page migration when the page becomes larger. We show that partial page migration is possible to largely hide the performance overheads of full page migration. Compared with programmer controlled data transmission, when the page size is 2MB and the PCIe bandwidth is 16GB/sec, full page migration is 72.72× slower, while our partial page migration achieves 1.29× speedup. When the PCIe bandwidth is changed to 96GB/sec, full page migration is 18.85× slower, while our partial page migration provides 1.37× speedup. Additionally, we examine the performance impact that PCIe bandwidth and migration unit size have on execution time, enabling designers to make informed decisions.

Please choose a citation manager

About the journal

Aims & scope

Description

Editorial board

Abstracting / Indexing

Contact us

Browse

Just accepted

Online first

Latest issue

All volumes and issues

Collections

Featured articles

Most accessed

Most cited

Collections

Multimedia collections

Authors & reviewers

Online submisson

Call for papers

Guidelines for authors

Download templates

Guidelines for reviewers

Jun 2020, Volume 14 Issue 3