Apr 2024, Volume 18 Issue 2

  • Architecture
  • RESEARCH ARTICLE
    Ye CHI, Jianhui YUE, Xiaofei LIAO, Haikun LIU, Hai JIN

    Hybrid memory systems composed of dynamic random access memory (DRAM) and non-volatile memory (NVM) often exploit page migration technologies to take full advantage of the different memory media. Most previous proposals migrate data at a granularity of 4 KB pages, and thus waste memory bandwidth and DRAM resources. In this paper, we propose Mocha, a non-hierarchical architecture that organizes DRAM and NVM in a flat physical address space but manages them as a cache/memory hierarchy. Since the commercial NVM device, Intel Optane DC Persistent Memory Module (DCPMM), actually accesses the physical media at a granularity of 256 bytes (an Optane block), we manage the DRAM cache at a 256-byte block size to match this feature of Optane. This design not only enables fine-grained data migration and management for the DRAM cache, but also avoids write amplification on Intel Optane DCPMM. We also create an Indirect Address Cache (IAC) in the Hybrid Memory Controller (HMC) and propose a reverse address mapping table in DRAM to speed up address translation and cache replacement. Moreover, we exploit a utility-based caching mechanism to filter cold blocks in the NVM and further improve the efficiency of the DRAM cache. We implement Mocha in an architectural simulator. Experimental results show that Mocha improves application performance by 8.2% on average (up to 24.6%), and reduces energy consumption by 6.9% and data migration traffic by 25.9% on average, compared with a typical hybrid memory architecture, HSCC.
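    The granularity argument above can be made concrete with a toy sketch: caching NVM data in 256-byte blocks instead of 4 KB pages means a miss migrates 16× less data. The class below is purely illustrative (a fully-associative LRU cache keyed by block number); it is not Mocha's actual IAC or replacement design.

```python
BLOCK_SIZE = 256  # bytes; matches Optane's internal access granularity

class BlockCache:
    """Toy fully-associative DRAM cache indexed by 256-byte block number."""
    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.blocks = {}   # NVM block number -> cached data
        self.lru = []      # least-recently-used order, coldest first

    def access(self, nvm_addr):
        blk = nvm_addr // BLOCK_SIZE   # fine-grained block, not a 4 KB page
        hit = blk in self.blocks
        if hit:
            self.lru.remove(blk)
        else:
            if len(self.blocks) >= self.capacity:  # evict the coldest block
                victim = self.lru.pop(0)
                del self.blocks[victim]
            self.blocks[blk] = None   # migrate only 256 B, not 4096 B
        self.lru.append(blk)
        return hit
```

    With 4 KB pages, the two addresses 0 and 300 would fall in one page and cause one migration of 4096 bytes; at 256-byte granularity they occupy two blocks but each miss moves only 256 bytes.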

  • RESEARCH ARTICLE
    Kun WANG, Song WU, Shengbang LI, Zhuo HUANG, Hao FAN, Chen YU, Hai JIN

    Container-based virtualization is becoming increasingly popular in cloud computing due to its efficiency and flexibility. Resource isolation is a fundamental property of containers. Existing works have shown that weak resource isolation can cause significant performance degradation for containerized applications, and have proposed enhancements to resource isolation. However, current studies have rarely discussed the isolation problems of the page cache, which is a key resource for containers. Containers leverage the memory cgroup to control page cache usage. Unfortunately, the existing policy introduces two major problems in a container-based environment. First, containers can use more memory than their cgroup limit allows, effectively breaking memory isolation. Second, the OS kernel has to evict page cache to make space for newly arrived memory requests, slowing down containerized applications. This paper performs an empirical study of these problems and demonstrates their performance impact on containerized applications. We then propose pCache (precise control of page cache), which addresses the problems by dividing the page cache into private and shared parts and controlling both kinds separately and precisely. To do so, pCache leverages two new techniques: fair account (f-account) and evict on demand (EoD). F-account splits the shared page cache charge among containers based on their per-container share, preventing containers from using memory for free and enhancing memory isolation. EoD reduces unnecessary page cache evictions to avoid the performance impact. Evaluation results demonstrate that our system can effectively enhance memory isolation for containers and achieve substantial performance improvements over the original page cache management policy.
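    The f-account idea, splitting the charge for shared page cache in proportion to each container's share so nobody uses memory "for free", can be sketched as follows. This is an illustrative accounting rule, not pCache's kernel implementation; the function name and rounding policy are assumptions.

```python
def split_shared_charge(shared_pages, shares):
    """Charge shared page cache to containers in proportion to their
    configured share weight. `shares` maps container name -> weight."""
    total = sum(shares.values())
    charge = {c: shared_pages * w // total for c, w in shares.items()}
    # assign the integer rounding remainder to the largest-share container
    remainder = shared_pages - sum(charge.values())
    if remainder:
        charge[max(shares, key=shares.get)] += remainder
    return charge
```

    Under the default cgroup policy, a shared page is charged only to the first container that touches it; the proportional split above is what prevents later sharers from consuming page cache without being accounted for it.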

  • RESEARCH ARTICLE
    Mingzhen LI, Changxi LIU, Jianjin LIAO, Xuegui ZHENG, Hailong YANG, Rujun SUN, Jun XU, Lin GAN, Guangwen YANG, Zhongzhi LUAN, Depei QIAN

    The flourishing of deep learning frameworks and hardware platforms has been demanding an efficient compiler that can shield the diversity in both software and hardware in order to provide application portability. Among existing deep learning compilers, TVM is well known for its efficiency in code generation and optimization across diverse hardware devices. Meanwhile, the Sunway many-core processor renders itself a competitive candidate for its attractive computational power in both scientific computing and deep learning workloads. This paper combines the trends in these two directions. Specifically, we propose swTVM, which extends the original TVM to support ahead-of-time compilation for architectures that require cross-compilation, such as Sunway. In addition, we leverage architectural features during compilation, such as the core group for massive parallelism, DMA for high-bandwidth memory transfer, and local device memory for data locality, in order to generate efficient code for deep learning workloads on Sunway. Experimental results show that the code generated by swTVM achieves a 1.79× improvement in inference latency on average compared to the state-of-the-art deep learning framework on Sunway, across eight representative benchmarks. This work is the first attempt from the compiler perspective to bridge the gap between deep learning and the Sunway processor, with both productivity and efficiency in mind. We believe this work will encourage more people to embrace the power of deep learning and the Sunway many-core processor.

  • Software
  • REVIEW ARTICLE
    Pak-Lok POON, Man Fai LAU, Yuen Tak YU, Sau-Fun TANG

    Spreadsheets are very common tools for information processing that support decision making by both professional developers and non-technical end users. Moreover, business intelligence and artificial intelligence are increasingly popular in industry, where spreadsheets have been used as, or integrated into, intelligent or expert systems in various application domains. However, it has been repeatedly reported that faults often exist in operational spreadsheets, which can severely compromise the quality of conclusions and decisions based on them. With a view to systematically examining this problem through a survey of existing work, we have conducted a comprehensive literature review on the quality issues and related techniques of spreadsheets over a 35.5-year period (January 1987 to June 2022) for target journals and a 10.5-year period (January 2012 to June 2022) for target conferences. Two major findings are: (a) spreadsheet quality is best addressed throughout the whole spreadsheet life cycle, rather than at just a few specific stages; and (b) relatively more studies focus on spreadsheet testing and debugging (related to fault detection and removal) than on spreadsheet specification, modeling, and design (related to development). As prevention is better than cure, more research should be performed on the early stages of the spreadsheet life cycle. Informed by our comprehensive review, we identify the major research gaps and highlight key research directions for future work in the area.

  • RESEARCH ARTICLE
    Zhuo ZHANG, Jianxin XUE, Deheng YANG, Xiaoguang MAO

    In software development, the ability to localize faults is crucial for improving the efficiency of debugging. Generally speaking, detecting and repairing errant behavior at an early stage of the development cycle considerably reduces costs and development time. Researchers have tried various methods to locate faulty code. However, failing test cases usually account for only a small portion of the test suite, which inevitably leads to a class-imbalance phenomenon and hampers the effectiveness of fault localization.

    Accordingly, in this work we propose a new fault localization approach named ContextAug. After obtaining dynamic execution traces from test cases, ContextAug builds an information model from these traces; subsequently, it constructs a failure context with propagation dependencies and intersects it with new model-domain failing test samples synthesized by minimally varying the minority feature space. In contrast to traditional test generation, which works directly in the input domain, ContextAug takes a new perspective by synthesizing failing test samples in the model domain, which makes it much easier to augment test suites. In an empirical study on large real-world programs against 13 state-of-the-art fault localization approaches, ContextAug improves fault localization effectiveness by up to 54.53%.
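    The minority-class synthesis step described above is in the same spirit as SMOTE-style interpolation: new failing samples are generated by moving a small step from one failing feature vector toward another. The sketch below is illustrative only; the function name, the interpolation rule, and the `alpha` step size are assumptions, not ContextAug's exact algorithm.

```python
import random

def synthesize_failing_samples(failing, n_new, alpha=0.1, seed=0):
    """Create new minority-class (failing) samples in the model domain by
    minimally perturbing an existing failing vector toward a peer."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_new):
        a, b = rng.sample(failing, 2)   # pick two distinct failing samples
        # take a small step (alpha) from a toward b: minimal variability
        out.append([x + alpha * (y - x) for x, y in zip(a, b)])
    return out
```

    Because each synthesized vector is a convex combination of two failing samples, it stays inside the failing region of feature space rather than drifting toward passing behavior.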

  • Artificial Intelligence
  • LETTER
    Jing LI, Donghong HAN, Zhishuai GUO, Baiyou QIAO, Gang WU
  • LETTER
    Luyu JIANG, Dantong OUYANG, Qi ZHANG, Liming ZHANG
  • RESEARCH ARTICLE
    Shiwei LU, Ruihu LI, Wenbin LIU

    Federated learning (FL) has emerged to break data silos and protect clients’ privacy in the field of artificial intelligence. However, the deep leakage from gradients (DLG) attack can fully reconstruct clients’ data from the submitted gradients, which threatens the fundamental privacy guarantee of FL. Although cryptography and differential privacy can prevent privacy leakage from gradients, they incur costs in communication overhead or model performance. Moreover, these schemes change the original distribution of the local gradients, which makes it difficult to defend against adversarial attacks. In this paper, we propose FedDAA, a novel federated learning framework with model decomposition, aggregation, and assembling, along with a training algorithm, in which the local gradient is decomposed into multiple blocks that are sent to different proxy servers for aggregation. To improve the privacy protection of FedDAA, we design an indicator based on image structural similarity to measure privacy leakage under the DLG attack, and give an optimization method that protects privacy with the fewest proxy servers. In addition, we provide defense schemes against adversarial attacks in FedDAA and design an algorithm to verify the correctness of the aggregated results. Experimental results demonstrate that FedDAA reduces the structural similarity between the reconstructed image and the original image to 0.014 while maintaining a model convergence accuracy of 0.952, thus achieving the best privacy protection performance and model training effect. More importantly, the defense schemes against adversarial attacks are compatible with privacy protection in FedDAA, and their defense effects are no weaker than those in traditional FL. Moreover, the verification algorithm for aggregated results adds negligible overhead to FedDAA.
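    The decompose/aggregate/assemble pipeline can be sketched in a few lines: each client splits its flat gradient into contiguous blocks, each proxy averages one block position across clients, and the server concatenates the averaged blocks. This is a minimal sketch of the data flow under simple averaging; the block layout and routing are assumptions, not FedDAA's exact protocol.

```python
def decompose_gradient(grad, n_proxies):
    """Split a flat local gradient into n_proxies contiguous blocks, so no
    single proxy server sees the full gradient."""
    k, r = divmod(len(grad), n_proxies)
    blocks, start = [], 0
    for i in range(n_proxies):
        size = k + (1 if i < r else 0)
        blocks.append(grad[start:start + size])
        start += size
    return blocks  # blocks[i] is routed to proxy i

def aggregate_and_assemble(client_blocks):
    """Each proxy averages its block across clients; the assembled result
    equals the average of the full gradients."""
    n_clients = len(client_blocks)
    n_proxies = len(client_blocks[0])
    assembled = []
    for p in range(n_proxies):
        block_len = len(client_blocks[0][p])
        assembled.extend(
            sum(client_blocks[c][p][j] for c in range(n_clients)) / n_clients
            for j in range(block_len))
    return assembled
```

    Note that splitting changes who sees what, not the arithmetic: the assembled result is identical to averaging whole gradients on one server.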

  • RESEARCH ARTICLE
    Dawei ZHANG, Peng WANG, Yongfeng DONG, Linhao LI, Xin LI

    Moving target detection is one of the most basic tasks in computer vision. In conventional wisdom, the problem is solved by iterative optimization under either the Matrix Decomposition (MD) or the Matrix Factorization (MF) framework. MD utilizes foreground information to facilitate background recovery; MF uses noise-based weights to fine-tune the background. Thus both noise and foreground information contribute to the recovery of the background. Inspired by the complementary characteristics of the two frameworks, we propose to exploit their advantages simultaneously in a unified framework called Joint Matrix Decomposition and Factorization (JMDF). To improve background extraction, a fuzzy factorization is designed: the fuzzy membership of the background/foreground association is calculated during the factorization process to distinguish the respective contributions of background and foreground to background estimation. To describe the spatio-temporal continuity of the foreground more accurately, we incorporate the first-order temporal difference into the group sparsity constraint, and the temporal constraint is adjusted adaptively. The foreground and the background are jointly estimated through an effective alternating optimization process, and the noise can be modeled with a specific probability distribution. Experimental results on a large set of real videos illustrate the effectiveness of our method. Compared with the current state of the art, our method usually recovers a clearer background and extracts a more accurate foreground. Anti-noise experiments show the noise robustness of our method.

  • RESEARCH ARTICLE
    Wuxiu QUAN, Yu HU, Tingting DAN, Junyu LI, Yue ZHANG, Hongmin CAI

    Instance co-segmentation aims to segment the co-occurrent instances in two images. This task relies heavily on instance-related cues provided by co-peaks, which are generally estimated by exhaustively examining all paired candidates in a point-to-point pattern. However, such patterns can yield a large number of false-positive co-peaks, resulting in over-segmentation whenever there are mutual occlusions. To tackle this issue, this paper proposes an instance co-segmentation method via tensor-based salient co-peak search (TSCPS-ICS). The proposed method explores high-order correlations via triple-to-triple matching among feature maps to find reliable co-peaks with the help of co-saliency detection. The method is shown to capture more accurate intra-peaks and inter-peaks among feature maps, reducing the false-positive rate of the co-peak search. With accurate co-peaks, one can efficiently infer the responses of the targeted instance. Experiments on four benchmark datasets validate the superior performance of the proposed method.

  • Theoretical Computer Science
  • RESEARCH ARTICLE
    Ningning CHEN, Huibiao ZHU

    The Internet of Things (IoT) can realize the interconnection of people, machines, and things anytime, anywhere. Most existing research focuses on the practical applications of IoT, and there is a lack of research on modeling and reasoning about IoT systems from the perspective of formal methods. The Calculus of the Internet of Things (CaIT) has therefore been proposed to specify and analyze IoT systems before their actual implementation, which can effectively improve development efficiency and enhance system quality and reliability. To verify the correctness of IoT systems described by CaIT, this paper presents a proof system for CaIT, in which specifications and verifications are based on Hoare logic extended with time. Furthermore, we explore the cooperation between isolated proofs to validate the postconditions of the communication actions occurring in these proofs, with a particular focus on broadcast communication. We also demonstrate the soundness of our proof system. A simple “smart home” example illustrates the applicability of our proof system.

  • Networks and Communication
  • RESEARCH ARTICLE
    Jilin WANG, Yinsen HUANG, Bin WANG

    In this paper, we introduce a sub-Nyquist sampling-based receiver architecture and method for wideband spectrum sensing. Instead of recovering the original wideband analog signal, the proposed method directly reconstructs the power spectrum of the wideband analog signal from sub-Nyquist samples; the power spectrum alone is sufficient for wideband spectrum sensing. Since only the covariance matrix of the wideband signal is needed, the proposed method, unlike compressed sensing-based methods, does not impose any sparsity requirement on the frequency domain. The method is based on a multi-coset sampling architecture. By exploiting the inherent sampling structure, we propose a fast compressed power spectrum estimation method whose primary computational task is the fast Fourier transform (FFT). Simulation results show the effectiveness of the proposed method.
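    The key fact the method relies on is that the power spectrum is the Fourier transform of the signal's autocorrelation (Wiener-Khinchin), so second-order statistics suffice and the waveform itself need not be recovered. The toy below demonstrates that identity with full Nyquist-rate samples and a naive DFT; the paper instead estimates the same quantity from sub-Nyquist multi-coset samples with FFTs.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform (illustration only; use FFT in practice)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def power_spectrum_from_autocorr(x):
    """Power spectrum computed from the circular autocorrelation of a real
    signal, never from the signal's spectrum directly."""
    n = len(x)
    # circular autocorrelation r[l] = (1/n) * sum_t x[t] * x[(t+l) mod n]
    r = [sum(x[t] * x[(t + l) % n] for t in range(n)) / n for l in range(n)]
    return [c.real for c in dft(r)]
```

    For a real signal this equals |X[k]|^2 / n, i.e. the periodogram, which is why estimating covariances alone is enough for spectrum sensing.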

  • Information Systems
  • LETTER
    Pengyi ZHANG, Yuchen YUAN, Jie SONG, Yu GU, Qiang QU, Yongjie BAI
  • Image and Graphics
  • LETTER
    Yang WU, Gang DONG, Lingyan LIANG, Yaqian ZHAO, Kaihua ZHANG
  • LETTER
    Erkang CHEN, Sixiang CHEN, Tian YE, Yun LIU
  • RESEARCH ARTICLE
    Mingzhi YUAN, Kexue FU, Zhihao LI, Manning WANG

    Estimating a rigid transformation from noisy correspondences is critical to feature-based point cloud registration. Recently, a series of studies have attempted to combine traditional robust model fitting with deep learning. Among them, DHVR proposed a Hough voting-based method and achieved new state-of-the-art performance. However, we find that voting on rotation and translation simultaneously hinders performance. Therefore, we propose a new Hough voting-based method that decouples the rotation and translation spaces. Specifically, we first use Hough voting and a neural network to estimate the rotation. Then, given a good rotation initialization, we can easily obtain an accurate rigid transformation. Extensive experiments on the 3DMatch and 3DLoMatch datasets show that our method achieves performance comparable to state-of-the-art methods. We further demonstrate the generalization of our method on the KITTI dataset.
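    Why does decoupling help? Once the rotation is fixed, the translation follows in closed form from the centroids, so the hard search collapses to a lower-dimensional one. The 2D toy below illustrates only that decoupling (rotation first from centered correspondences, then translation); the paper itself works in 3D with Hough voting and a network, so everything here is an illustrative assumption.

```python
import math

def estimate_rigid_2d(src, dst):
    """Recover (theta, t) with dst = R(theta) @ src + t from exact 2D
    correspondences: rotation first, then translation given the rotation."""
    n = len(src)
    cs = [sum(p[i] for p in src) / n for i in (0, 1)]   # source centroid
    cd = [sum(q[i] for q in dst) / n for i in (0, 1)]   # target centroid
    # best-fit rotation angle from centered correspondences
    sxx = sum((p[0]-cs[0])*(q[0]-cd[0]) + (p[1]-cs[1])*(q[1]-cd[1])
              for p, q in zip(src, dst))
    sxy = sum((p[0]-cs[0])*(q[1]-cd[1]) - (p[1]-cs[1])*(q[0]-cd[0])
              for p, q in zip(src, dst))
    theta = math.atan2(sxy, sxx)
    # with rotation fixed, translation is immediate from the centroids
    c, s = math.cos(theta), math.sin(theta)
    return theta, (cd[0] - (c*cs[0] - s*cs[1]), cd[1] - (s*cs[0] + c*cs[1]))
```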

  • Information Security
  • LETTER
    Yongxin ZHANG, Hong LEI, Bin WANG, Qinghao WANG, Ning LU, Wenbo SHI, Bangdao CHEN, Qiuling YUE
  • RESEARCH ARTICLE
    Yuanjing HAO, Long LI, Liang CHANG, Tianlong GU

    With the emergence of network-centric data, social network graph publishing helps data analysts mine the value of social networks, analyze the social behavior of individuals or groups, implement personalized recommendations, and so on. However, published social network graphs are often subject to re-identification attacks from adversaries, which results in the leakage of users’ privacy. The k-anonymity technique is widely used in the field of graph publishing and is quite effective against re-identification attacks. However, current research still leaves several issues unsolved: the protection of directed graphs receives less attention than that of undirected graphs; the protection of graph structure is often ignored while protecting node identities; and the same protection is applied to all users, which does not meet their differing privacy requirements. To address these issues, this paper proposes a multi-level k-degree anonymity (MLDA) scheme for directed social network graphs. First, node sets with different importance are identified with the firefly algorithm and constrained connectedness upper approximation, and different levels of k-degree anonymity protection are applied to them to meet the differing privacy requirements of users. Second, a new graph anonymization method is proposed, which adds and removes edges with the help of fake nodes. In addition, to improve the utility of the anonymized graph, a new edge cost criterion is proposed to select the most appropriate edge to remove. Third, to preserve the community structure of the original graph as much as possible, fake nodes within the same community are merged before fake nodes in different communities. Experimental results on real datasets show that the newly proposed MLDA scheme effectively balances the privacy and utility of the anonymized graph.
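    The property the scheme enforces is simple to state: a graph is k-degree anonymous when every degree value is shared by at least k nodes, so an adversary who knows a target's degree cannot narrow it down to fewer than k candidates. A minimal checker, purely illustrative of the definition (not MLDA's algorithm):

```python
from collections import Counter

def is_k_degree_anonymous(degrees, k):
    """True if every degree value in `degrees` occurs at least k times.
    For directed graphs, pass (in-degree, out-degree) pairs instead."""
    return all(count >= k for count in Counter(degrees).values())
```

    MLDA's multi-level twist is to run this test with a larger k on the important node set and a smaller k elsewhere, instead of one k for all users.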

  • REVIEW ARTICLE
    Guocheng ZHU, Debiao HE, Haoyang AN, Min LUO, Cong PENG

    After the Ethereum DAO attack in 2016, which resulted in significant economic losses, blockchain governance has become a prominent research area. However, there is a lack of a comprehensive and systematic literature review on blockchain governance. To deeply understand the process of blockchain governance and provide guidance for the future design of blockchain governance models, we provide an in-depth review of the topic. In this paper, we first introduce the consensus algorithms currently used in blockchains and relate them to governance theory. Second, we present the main content of off-chain governance and investigate two well-known off-chain governance projects. Third, we investigate four common on-chain governance voting techniques, summarize seven attributes that an on-chain governance voting process should satisfy, and analyze four well-known on-chain governance blockchain projects in light of this research. We hope this survey provides insight into the potential development directions of blockchain governance and helps devise a future research agenda.

  • RESEARCH ARTICLE
    B Swaroopa REDDY, T Uday Kiran REDDY

    In this work, we propose a stateless blockchain called CompactChain, which compacts the entire state of UTXO (Unspent Transaction Output)-based blockchain systems into two RSA accumulators. The first accumulator, the Transaction Output (TXO) commitment, represents the TXO set. The second, the Spent Transaction Output (STXO) commitment, represents the STXO set. We discuss three algorithms: (i) updating the TXO and STXO commitments by the miner, who also provides proofs of the correctness of the updated commitments; (ii) proving a transaction’s validity by providing a membership witness in the TXO commitment and a non-membership witness against the STXO commitment for a coin being spent by a user; and (iii) updating the witness for a coin that is not yet spent. Experimental results evaluate the performance of CompactChain in terms of the time a miner takes to update the commitments and the time a validator takes to verify the commitments and validate transactions. We compare CompactChain with existing state-of-the-art work on stateless blockchains. CompactChain shows a reduction in commitment update complexity and transaction witness size, which in turn reduces the mempool size and propagation latency without compromising system throughput (transactions per second, TPS).
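    The mechanics of an RSA accumulator are compact enough to sketch: adding a (prime-mapped) element raises the accumulator to that power mod N, a membership witness for x is the accumulation of all other elements, and verification re-raises the witness by x. The parameters below are tiny and insecure, purely to illustrate how TXO/STXO-style commitments behave; real deployments use a large RSA modulus with unknown factorization.

```python
def accumulate(acc, elem, modulus):
    """Add one element to the accumulator: acc <- acc^elem mod N."""
    return pow(acc, elem, modulus)

def witness(base, elements, member, modulus):
    """Membership witness for `member`: accumulate every *other* element."""
    w = base
    for e in elements:
        if e != member:
            w = pow(w, e, modulus)
    return w

def verify(witness_value, member, acc_value, modulus):
    """Check that raising the witness by `member` reproduces the accumulator."""
    return pow(witness_value, member, modulus) == acc_value
```

    A spend in CompactChain's model would show a membership witness like this against the TXO commitment and a non-membership proof against the STXO commitment; witnesses of unspent coins must be refreshed as the accumulators evolve, which is algorithm (iii).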

  • RESEARCH ARTICLE
    Min HAO, Beihai TAN, Siming WANG, Rong YU, Ryan Wen LIU, Lisu YU

    The sixth-generation (6G) wireless communication system is envisioned to be capable of providing highly dependable services by integrating native reliability and trustworthiness functionalities. Zero-trust vehicular networks are one of the typical scenarios for 6G dependable services. Under the technical framework of vehicle-and-roadside collaboration, more and more on-board devices and roadside infrastructures will communicate to exchange information. The reliability and security of this vehicle-and-roadside collaboration directly affect transportation safety. Considering a zero-trust vehicular environment, to prevent malicious vehicles from uploading false or invalid information, we propose a malicious vehicle identity disclosure approach based on the Shamir secret sharing scheme. Meanwhile, a two-layer consortium blockchain architecture and smart contracts are designed to protect the identity and privacy of benign vehicles as well as the security of their private data. After that, to improve the efficiency of vehicle identity disclosure, we present an inspection policy based on zero-sum game theory and a roadside unit incentive mechanism that jointly uses contract theory and a subjective logic model. We verify the performance of the entire zero-trust solution through extensive simulation experiments. While preserving vehicle privacy, our solution is demonstrated to significantly improve the reliability and security of 6G vehicular networks.
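    The threshold property that makes Shamir secret sharing fit identity disclosure: a vehicle's identity is split into n shares, and only when at least k parties agree (e.g. enough roadside units flag the vehicle) can it be reconstructed; fewer shares reveal nothing. Below is the standard textbook scheme over a prime field; the field size and parameter names are illustrative, not the paper's concrete instantiation.

```python
import random

P = 2**61 - 1  # Mersenne prime field, illustrative size only

def shamir_split(secret, n, k, seed=0):
    """Split `secret` into n shares; any k of them reconstruct it."""
    rng = random.Random(seed)
    coeffs = [secret] + [rng.randrange(P) for _ in range(k - 1)]
    def f(x):  # evaluate the degree-(k-1) polynomial via Horner's rule
        acc = 0
        for c in reversed(coeffs):
            acc = (acc * x + c) % P
        return acc
    return [(x, f(x)) for x in range(1, n + 1)]

def shamir_reconstruct(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num = den = 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret
```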

  • Interdisciplinary
  • LETTER
    Zishan XU, Linlin SONG, Shichao LIU, Wen ZHANG
  • LETTER
    Qiang ZHANG, Juan LIU, Wen ZHANG, Feng YANG, Zhihui YANG, Xiaolei ZHANG
  • RESEARCH ARTICLE
    Yizheng WANG, Xin ZHANG, Ying JU, Qing LIU, Quan ZOU, Yazhou ZHANG, Yijie DING, Ying ZHANG

    Numerous studies have demonstrated that human microRNAs (miRNAs) and diseases are associated, and many studies of the microRNA-disease association (MDA) have been conducted. We developed a model using a low-rank approximation-based link propagation algorithm with Hilbert–Schmidt independence criterion-based multiple kernel learning (HSIC-MKL) to reduce the large time commitment and cost of traditional biological experiments involving miRNAs and diseases, and to improve model performance. We constructed three kernels each in the miRNA and disease spaces and fused them using HSIC-MKL. Link propagation uses matrix factorization and matrix approximation to effectively reduce computation and time costs. Experimental results show that the proposed approach performs well and, in some respects, outperforms existing models.