Performance Evaluation of Machine Learning Algorithms for Predicting Organic Photovoltaic Efficiency

Mirza Sanita Haque, Simon Y. Foo

Clean Energy Sustain. 2025, Vol. 3, Issue 4: 10016

DOI: 10.70322/ces.2025.10016
Article

Abstract

This study predicts the power conversion efficiency (PCE) of organic solar cells from experimental data on donor and non-fullerene acceptor materials. We built a dataset containing both numerical and categorical features, preprocessed with standard scaling and one-hot encoding, respectively. We developed and compared several machine learning (ML) models: multilayer perceptron, random forest, XGBoost, multiple linear regression, and partial least squares. The modified XGBoost model performed best, achieving a root mean squared error (RMSE) of 0.564, a mean absolute error (MAE) of 0.446, and a coefficient of determination (R²) of 0.980 on the test set. We also assessed the model's generalization and reliability by examining learning curves, calibration curves, and the residual distribution. Feature correlation and permutation importance plots showed that ionization potential and electron affinity were key predictors. The results demonstrate that, with proper tuning, gradient boosting methods can provide highly accurate and interpretable predictions of organic solar cell efficiency. This work establishes a reproducible machine learning workflow for rapidly screening and rationally designing high-efficiency photovoltaic materials.
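As a rough illustration of the preprocessing steps and evaluation metrics named in the abstract, the following minimal sketch implements standard scaling, one-hot encoding, and the RMSE, MAE, and R² calculations using only the Python standard library. It is not the authors' code. The function names, the `material_class` feature, and all numeric values are hypothetical and were chosen only so that the arithmetic is easy to follow.

```python
import math


def standard_scale(values):
    """Rescale a numerical column to zero mean and unit variance."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(variance) or 1.0  # avoid dividing by zero for a constant column
    return [(v - mean) / std for v in values]


def one_hot(labels):
    """Encode a categorical column as 0/1 indicator rows, one column per category."""
    categories = sorted(set(labels))
    rows = [[1 if label == c else 0 for c in categories] for label in labels]
    return categories, rows


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, and the coefficient of determination (R²)."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


# Hypothetical values for illustration only; these are NOT data from the study.
ionization_potential = [5.2, 5.4, 5.6, 5.8]   # numerical feature
material_class = ["B", "A", "B", "C"]         # hypothetical categorical feature
pce_measured = [10.0, 12.0, 14.0, 16.0]
pce_predicted = [10.5, 11.5, 14.5, 15.5]

scaled = standard_scale(ionization_potential)
categories, encoded = one_hot(material_class)
metrics = regression_metrics(pce_measured, pce_predicted)

print(categories, encoded[0])
print(metrics)
```

For these toy values every prediction is off by 0.5, so RMSE and MAE both equal 0.5, and R² is 1 - 1/20 = 0.95. In practice RMSE exceeds MAE whenever the errors vary in size, because squaring weights large errors more heavily.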

Keywords

Organic solar cell / Power conversion efficiency / Machine learning / XGBoost / Multilayer perceptron / Feature importance / Photovoltaic material / Data modeling

Cite this article

Mirza Sanita Haque, Simon Y. Foo. Performance Evaluation of Machine Learning Algorithms for Predicting Organic Photovoltaic Efficiency. Clean Energy Sustain., 2025, 3(4): 10016 DOI:10.70322/ces.2025.10016

Author Contributions

Conceptualization, M.S.H. and S.Y.F.; Methodology, M.S.H.; Software, M.S.H.; Validation, M.S.H. and S.Y.F.; Formal Analysis, M.S.H.; Investigation, M.S.H.; Resources, M.S.H.; Data Curation, M.S.H.; Writing—Original Draft Preparation, M.S.H.; Writing—Review & Editing, M.S.H. and S.Y.F.; Visualization, M.S.H. and S.Y.F.; Supervision, S.Y.F.

Ethics Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset is still being compiled and may be made available after publication.

Funding

This research received no external funding.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
