Development of robust machine learning models to estimate hydrochar higher heating value and yield based upon biomass proximate analysis

Guoliang Hou, Ahmad Alkhayyat, Ahmad Almalkawi, Anupam Yadav, H. S. Shreenidhi, Vishnu Saini, Shirin Shomurotova, Devendra Singh, Vatsal Jain, Aseel Smerat, Ahmad Khalid

Bioresources and Bioprocessing, 2025, Vol. 12, Issue (1): 138. DOI: 10.1186/s40643-025-00979-1

Abstract

This study introduces a robust machine learning framework for predicting hydrochar yield and higher heating value (HHV) from biomass proximate analysis. A curated dataset of 481 samples was assembled, featuring input variables such as fixed carbon, volatile matter, ash content, reaction time, temperature, and water content. Hydrochar yield and HHV served as the target outputs. To enhance data quality, Monte Carlo Outlier Detection (MCOD) was employed to eliminate anomalous entries. Thirteen machine learning algorithms, including convolutional neural networks (CNN), linear regression, decision trees, and advanced ensemble methods (CatBoost, LightGBM, XGBoost), were systematically compared. CatBoost performed best, achieving an R² of 0.98 and a mean squared error (MSE) of 0.05 for HHV prediction, and an R² of 0.94 with an MSE of 0.03 for yield estimation. SHAP analysis identified ash content as the most influential feature for HHV prediction, while temperature, water content, and fixed carbon were the key drivers of yield. These results validate the effectiveness of gradient boosting models, particularly CatBoost, in accurately modeling hydrothermal carbonization outcomes and in supporting data-driven biomass valorization strategies.
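
The two metrics quoted in the abstract, R² and MSE, follow standard definitions. The sketch below shows how they can be computed from measured and predicted values. It is an illustration only, not the authors' code: the function names and the sample HHV values are hypothetical, and the abstract does not state the units or scaling behind the reported MSE figures.

```python
from typing import Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of the squared residuals between measured and predicted values."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the measured values around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all measured values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical HHV values, invented for illustration (not from the dataset).
measured = [18.0, 20.0, 22.0, 24.0]
predicted = [18.5, 19.5, 22.5, 23.5]

print(f"MSE = {mean_squared_error(measured, predicted):.2f}")  # MSE = 0.25
print(f"R2  = {r2_score(measured, predicted):.2f}")            # R2  = 0.95
```

An R² close to 1 means the predictions explain nearly all of the variance in the measured values, while MSE is expressed in the squared units of the target, so the two figures are best read together.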

Keywords

Biomass proximate analysis / Hydrochar yield prediction / Machine learning / Higher heating value (HHV) / CatBoost algorithm

Cite this article

Guoliang Hou, Ahmad Alkhayyat, Ahmad Almalkawi, Anupam Yadav, H. S. Shreenidhi, Vishnu Saini, Shirin Shomurotova, Devendra Singh, Vatsal Jain, Aseel Smerat, Ahmad Khalid. Development of robust machine learning models to estimate hydrochar higher heating value and yield based upon biomass proximate analysis. Bioresources and Bioprocessing, 2025, 12(1): 138. DOI: 10.1186/s40643-025-00979-1

RIGHTS & PERMISSIONS

The Author(s)
