An overview of crude oil price forecasting based on big data technology

Ting YAO, Pan-Feng ZHANG, Yue-Jun ZHANG

Front. Eng ›› 2025, Vol. 12 ›› Issue (4): 938-951. DOI: 10.1007/s42524-025-5042-x
Energy and Environmental Systems
REVIEW ARTICLE


Abstract

Accurate crude oil price forecasting is critical in energy economics and energy engineering, as it informs economic policy-making and investment decisions. The emergence of big data brings both new opportunities and challenges for crude oil price forecasting. This paper systematically reviews recent advances in crude oil price forecasting in the context of big data, with a focus on the evolution of data types, predictors, and modeling techniques. In particular, it analyzes key forecasting approaches, including conventional and data-driven forecasting models, while emphasizing the growing role of emerging data sources. Promising directions for future research include the integration of multi-source data, the reconstruction of high-frequency supply and demand indicators, the development of hybrid modeling approaches, the enhancement of model interpretability, and the evaluation of the economic value of forecasting outcomes.

Keywords

crude oil price forecasting / big data technology / machine learning

Cite this article

Ting YAO, Pan-Feng ZHANG, Yue-Jun ZHANG. An overview of crude oil price forecasting based on big data technology. Front. Eng, 2025, 12(4): 938-951. DOI: 10.1007/s42524-025-5042-x

RIGHTS & PERMISSIONS

Higher Education Press
