Hadoop-based secure storage solution for big data in cloud computing environment

Shaopeng Guan, Conghui Zhang, Yilin Wang, Wenqing Liu

2024, Vol. 10, Issue (1): 227-236. DOI: 10.1016/j.dcan.2023.01.014
Regular Papers

Abstract

To address the reliance on a single encryption algorithm, the low encryption efficiency, and the unreliable metadata that affect static data storage on big data platforms in the cloud computing environment, we propose a Hadoop-based secure storage scheme for big data. First, to distribute the NameNode service from a single server across multiple servers, we combine the HDFS federation and HDFS high-availability mechanisms and use the Zookeeper distributed coordination mechanism to coordinate the nodes, achieving dual-channel storage. Next, we improve the ECC encryption algorithm for encrypting ordinary data and adopt a homomorphic encryption algorithm for data that must be computed on. To accelerate encryption, we adopt a dual-thread encryption mode. Finally, we design an HDFS control module that combines the encryption algorithm with the storage model. Experimental results show that the proposed solution eliminates the single point of failure for metadata, performs well in terms of metadata reliability, and provides server fault tolerance. With the improved encryption algorithm integrated into the dual-channel storage mode, encrypted storage efficiency improves by 27.6% on average.
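
The dual-thread encryption mode mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the work is divided between the two threads, so the split into two halves below is an assumption, and the function names (`keystream_xor`, `dual_thread_encrypt`, `dual_thread_decrypt`) are hypothetical. The cipher is a placeholder keystream XOR built from SHA-256, used only so the example stays self-contained; it is not the improved ECC algorithm of the paper and is not suitable for real use.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from typing import Tuple


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Placeholder cipher: XOR data with a SHA-256 counter keystream.

    Stand-in for illustration only; NOT the paper's improved ECC scheme.
    Applying it twice with the same key and nonce restores the input.
    """
    out = bytearray(len(data))
    for block, offset in enumerate(range(0, len(data), 32)):
        pad = hashlib.sha256(key + nonce + block.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out[offset:offset + len(chunk)] = bytes(a ^ b for a, b in zip(chunk, pad))
    return bytes(out)


def dual_thread_encrypt(key: bytes, data: bytes) -> Tuple[bytes, bytes]:
    """Split data into two halves (assumed split) and encrypt each on its own thread."""
    mid = len(data) // 2
    with ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(keystream_xor, key, b"half-0", data[:mid])
        second = pool.submit(keystream_xor, key, b"half-1", data[mid:])
        return first.result(), second.result()


def dual_thread_decrypt(key: bytes, first: bytes, second: bytes) -> bytes:
    """Decrypt both halves in parallel and reassemble the original data."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        a = pool.submit(keystream_xor, key, b"half-0", first)
        b = pool.submit(keystream_xor, key, b"half-1", second)
        return a.result() + b.result()
```

Calling `dual_thread_encrypt(key, data)` returns two ciphertext halves, and `dual_thread_decrypt(key, first, second)` recovers the original bytes. The sketch shows only the structure of the approach; it makes no claim about the speedup reported in the paper, which depends on the actual cipher and runtime.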

Keywords

Big data security / Data encryption / Hadoop / Parallel encrypted storage / Zookeeper

Cite this article

Shaopeng Guan, Conghui Zhang, Yilin Wang, Wenqing Liu. Hadoop-based secure storage solution for big data in cloud computing environment. 2024, 10(1): 227-236. DOI: 10.1016/j.dcan.2023.01.014
