This paper reviews the use of Big Data analytics, an emerging trend, in the upstream and downstream oil and gas industry. Big Data, or Big Data analytics, refers to a new technology for handling large datasets that exhibit six main characteristics: volume, variety, velocity, veracity, value, and complexity. With the recent advent of data-recording sensors in exploration, drilling, and production operations, the oil and gas industry has become massively data intensive. Applications of Big Data in the oil and gas industry include analyzing seismic and micro-seismic data, improving reservoir characterization and simulation, reducing drilling time and increasing drilling safety, optimizing the performance of production pumps, improving petrochemical asset management, improving shipping and transportation, and improving occupational safety. Although the oil and gas industry has recently become more interested in Big Data analytics, challenges remain, mainly due to a lack of business support and of awareness about Big Data within the industry. Furthermore, data quality and understanding the complexity of the problem are also among the challenges facing the application of Big Data.
In the quest for interpretable models, two versions of a neural network rule extraction algorithm were proposed and compared. The two algorithms are called the Piece-Wise Linear Artificial Neural Network (PWL-ANN) and enhanced Piece-Wise Linear Artificial Neural Network (enhanced PWL-ANN) algorithms. The PWL-ANN algorithm is a decompositional artificial neural network (ANN) rule extraction algorithm. The enhanced PWL-ANN algorithm improves upon it and extracts multiple linear regression equations from a trained ANN model by approximating the hidden sigmoid activation functions with N-piece linear equations. In doing so, the algorithm provides interpretable models from the originally trained opaque ANN models. A detailed application case study illustrates how the generated enhanced-PWL-ANN models can provide understandable IF-THEN rules about a problem domain. A comparison of the results generated by the two versions of the algorithm showed that the enhanced-PWL-ANN models achieve higher fidelity to the originally trained ANN models than the PWL-ANN models. The results also showed that more concise rule sets could be generated using the enhanced-PWL-ANN algorithm. If a simpler set of rules is desired, the enhanced-PWL-ANN algorithm can be combined with the decision tree approach. Applying the algorithms to domains related to petroleum engineering can help enhance understanding of the problems.
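The idea of approximating a sigmoid activation with N linear pieces can be illustrated with a minimal sketch. This is not the PWL-ANN algorithm itself: the evenly spaced breakpoints, the interval [-6, 6], the flat tails outside it, and the names `make_pwl_sigmoid` and `max_abs_error` are all assumptions made for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-x))


def make_pwl_sigmoid(n_pieces: int, x_min: float = -6.0, x_max: float = 6.0):
    """Return a piecewise-linear approximation of the sigmoid.

    Assumption of this sketch: n_pieces equal-width segments on
    [x_min, x_max], with constant values outside that interval.
    """
    if n_pieces < 1:
        raise ValueError("n_pieces must be at least 1")
    step = (x_max - x_min) / n_pieces
    xs = [x_min + i * step for i in range(n_pieces + 1)]
    ys = [sigmoid(x) for x in xs]

    def pwl(x: float) -> float:
        if x <= xs[0]:
            return ys[0]
        if x >= xs[-1]:
            return ys[-1]
        # Locate the segment containing x, then interpolate linearly.
        i = min(int((x - x_min) / step), n_pieces - 1)
        slope = (ys[i + 1] - ys[i]) / step
        return ys[i] + slope * (x - xs[i])

    return pwl


def max_abs_error(approx, lo: float = -10.0, hi: float = 10.0, samples: int = 2001) -> float:
    """Largest absolute gap between approx and the true sigmoid on a grid."""
    grid = [lo + (hi - lo) * k / (samples - 1) for k in range(samples)]
    return max(abs(approx(x) - sigmoid(x)) for x in grid)


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        print(n, max_abs_error(make_pwl_sigmoid(n)))
```

Within each segment the hidden unit's output is a linear function of its input, which is what allows a network with such units to be rewritten as a set of linear regression equations, one per combination of active segments.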
The traditional method of using the drag coefficient-Reynolds number relationship to predict cuttings settling velocity involves an implicit procedure that requires repeated, time-consuming and tedious iterations using Newtonian or, mostly, non-Newtonian correlations. Usually, these correlations are limited to certain fluid flow regimes. Besides, most of the existing explicit and direct cuttings settling velocity models assume that the cuttings are spherical particles. In the field, however, the cuttings are a mixture of various shapes and are hardly ever spherical, so these models produce large errors when applied to field conditions. The objective of this work was to use a nature-inspired algorithm, the artificial neural network (ANN), to develop a robust, field-usable model for estimating cuttings settling velocity that takes the shape of the cuttings into account. The data used for this work were obtained from research experiments in the literature. The model was evaluated using four performance metrics: mean squared error (MSE), root mean square error (RMSE), sum of squares error (SSE) and goodness of fit (R^2). The model's predictions agreed with experimental evidence. Furthermore, the developed model can generalize across new input datasets and can be applied to particles of any shape, which defines the novelty of this research and bridges the gap between theory and practice. When compared with state-of-the-art models, the developed model shows a high degree of robustness: the ANN model performed reasonably well, with an MSE of 7.5 × 10^-4, an R^2 of 0.978, an RMSE of 0.0274 and an SSE of 0.25. To test generalization across new input datasets, the developed model was cross-validated with new data that were not part of the training process. It was found that the ANN model had an MSE of 0.00807, an RMSE of 0.0898, an MAE of 0.065, an SSE of 2.74 and a MAPE of 0.675%. To ensure the replicability of the ANN model, the weights and biases for the input, hidden and output layers are presented in this work, unlike other artificial intelligence-based models in the literature. The range of application for the developed ANN model is 0.0001 < particle Reynolds number < 100 and 0.471 < cuttings sphericity < 1. With the model developed in this work, the cuttings settling velocity can be predicted with minimal error in a quick, less cumbersome, non-iterative manner, and the prediction is not limited by cuttings shape factor or fluid flow regime.
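The error metrics named in this abstract can be computed as in the following minimal sketch. The function name `regression_metrics` is hypothetical, R^2 is assumed to be the coefficient of determination (1 - SSE/SST), and MAPE assumes that no measured value is zero. The abstract does not state which exact definitions were used.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return SSE, MSE, RMSE, MAE, MAPE (in percent) and R^2 for paired values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    sse = sum(e * e for e in errors)            # sum of squared errors
    mse = sse / n                               # mean squared error
    mean_true = sum(y_true) / n
    # Total sum of squares; zero if all measured values are identical.
    sst = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "SSE": sse,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(e) for e in errors) / n,
        # Assumes every measured value is non-zero.
        "MAPE_percent": 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
        # Assumes R^2 = 1 - SSE/SST (coefficient of determination).
        "R2": 1.0 - sse / sst,
    }


if __name__ == "__main__":
    # Made-up numbers for illustration only, not data from the paper.
    measured = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.1, 1.9, 3.2, 3.8]
    for name, value in regression_metrics(measured, predicted).items():
        print(f"{name}: {value:.4f}")
```

Because RMSE is the square root of MSE, the reported pairs can be cross-checked directly: the square root of 7.5 × 10^-4 is about 0.0274, and the square root of 0.00807 is about 0.0898.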
Crude oil price prediction is a challenging task in oil-producing countries. The price of crude oil is among the most complex and difficult to model because its fluctuations are highly irregular and nonlinear and vary dynamically with high uncertainty. This paper proposes a hybrid model for crude oil price prediction that combines complex network analysis with long short-term memory (LSTM), a deep learning algorithm. Complex network analysis is used to preprocess the original data: a tool called the visibility graph maps the dataset onto a network, and K-core centrality is employed to extract the nonlinearity features of the crude oil price and reconstruct the dataset. Thereafter, LSTM is employed to model the reconstructed data. To verify the result, we compared the empirical results with other research in the literature. The experiments show that the proposed model has higher accuracy and is more robust and reliable.
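The mapping from a time series to a network can be sketched as follows. The abstract does not say which visibility graph variant was used; this sketch assumes the natural visibility criterion, under which two samples are linked when the straight line between them passes above every sample in between. The function names and the toy series are hypothetical, and the core numbers are computed with a simple peeling procedure rather than any particular library.

```python
def visibility_graph(series):
    """Build a natural visibility graph; returns {node: set of neighbours}."""
    n = len(series)
    adjacency = {i: set() for i in range(n)}
    for a in range(n - 1):
        for b in range(a + 1, n):
            ya, yb = series[a], series[b]
            # a and b see each other if every intermediate sample lies
            # strictly below the line joining (a, ya) and (b, yb).
            visible = all(
                series[c] < yb + (ya - yb) * (b - c) / (b - a)
                for c in range(a + 1, b)
            )
            if visible:
                adjacency[a].add(b)
                adjacency[b].add(a)
    return adjacency


def core_numbers(adjacency):
    """Core number of each node, found by repeatedly removing a minimum-degree node."""
    degree = {v: len(neighbours) for v, neighbours in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.__getitem__)
        k = max(k, degree[v])      # core level never decreases during peeling
        core[v] = k
        remaining.remove(v)
        for u in adjacency[v]:
            if u in remaining:
                degree[u] -= 1
    return core


if __name__ == "__main__":
    # Made-up values for illustration only, not real crude oil prices.
    prices = [52.1, 50.3, 51.7, 49.8, 53.4, 52.9, 54.0]
    graph = visibility_graph(prices)
    print(core_numbers(graph))
```

This pairwise check costs on the order of n^3 operations in the worst case, which is acceptable for a short illustration but would need a faster construction for long price histories.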
Acidizing treatment is a significant process in oil well stimulation, used to form wormholes in carbonate formations in order to enhance reservoir fluid production. Obtaining the number of pore volumes to breakthrough is an important objective in matrix acidizing because it helps determine wormhole characteristics such as type, shape, and size. Finding this number experimentally requires a considerable amount of time, energy and cost. Therefore, this study aimed to establish an analytical method that gives a reasonable result for the number of pore volumes to breakthrough. This is accomplished using only acid and formation properties, without performing any experimental work. Wormhole creation is represented by a numerical model based on the law of conservation of mass, in which the carbonate core is treated as a closed system whose overall mass remains constant during the acid injection process. Furthermore, a constant is added to the mathematical part of the model in order to eliminate the dimensionless Damköhler number, which would otherwise have to be determined experimentally. The results of the numerical model are compared with four other experimental works, giving an average accuracy for this model of 95.98%. This study puts forward a comprehensive numerical model that estimates the number of pore volumes to breakthrough with acceptable accuracy using only known acid and core properties.
As the price of oil decreases, it is becoming increasingly important for oil companies to operate in the most cost-effective manner. This problem is especially apparent in Western Canada, where most oil production depends on costly enhanced oil recovery (EOR) techniques such as steam-assisted gravity drainage (SAGD). Therefore, the goal of this study is to create an artificial neural network (ANN) capable of accurately predicting the ultimate recovery factor of oil reservoirs produced by SAGD. The developed ANN model used over 250 unique entries, collected from the literature, for oil viscosity, steam injection rate, horizontal permeability, permeability ratio, porosity, reservoir thickness, and steam injection pressure. The collected data set was passed through a feed-forward back-propagation neural network to train, validate, and test the model so that it predicts the recovery factor of the SAGD method as accurately as possible. Results from this study revealed that the neural network was able to predict recovery factors of selected projects with less than 10% error. When the neural network was exposed to a new simulation data set of 64 points, the predictions were found to have an accuracy of 82% as measured by linear regression. Finally, the study confirmed that an ANN can predict, with reasonable accuracy, the recovery performance of one of the most complicated enhanced heavy oil recovery techniques.
The transparent open box (TOB) learning network algorithm offers an alternative to the lack of transparency of most machine-learning algorithms. It provides the exact calculations and relationships among the underlying input variables of the datasets to which it is applied. It can also achieve credible and auditable levels of prediction accuracy for complex, non-linear datasets typical of those encountered in the oil and gas sector, while highlighting the potential for underfitting and overfitting. The algorithm is applied here to predict bubble-point pressure from a published PVT dataset of 166 data records involving four easy-to-measure variables (reservoir temperature, gas-oil ratio, oil gravity, gas density relative to air) with uneven and, in parts, sparse data coverage. The TOB network demonstrates high prediction accuracy for this complex system, although its predictions for the full dataset are outperformed by an artificial neural network (ANN). However, the performance of the TOB algorithm reveals the risk of overfitting in the sparse areas of the dataset, and it achieves a prediction performance that matches the ANN algorithm where the underlying data population is adequate. Its high level of transparency and its resistance to overfitting enable the TOB learning network to provide information about the underlying dataset that complements the information provided by traditional machine-learning algorithms. This makes it suitable for application in parallel with neural-network algorithms, to overcome their black-box tendencies, and for benchmarking the prediction performance of other machine-learning algorithms.
In the present work, artificial neural network (ANN) based models were successfully developed for predicting the equilibrium solubility and mass transfer coefficient of CO2 absorption into aqueous solutions of 4-diethylamino-2-butanol (DEAB), a high-performance alternative solvent. The ANN models show outstanding predictive performance compared with the predictive correlations proposed in the literature. To predict the equilibrium solubility, the ANN model was developed based on three input parameters: operating temperature, concentration of DEAB, and partial pressure of CO2. It achieves an average absolute deviation (AAD) of 2.4%, compared with 7.1-8.3% AAD from the literature. Additionally, the developed ANN model significantly improves the prediction of the mass transfer coefficient, with 3.1% AAD compared with 14.5% AAD from the existing semi-empirical model. The mass transfer coefficient is considered to be a function of liquid flow rate, liquid inlet temperature, concentration of DEAB, inlet CO2 loading, outlet CO2 loading, and concentration of CO2 along the height of the column.
This paper proposes an integrated method of analytical calculation, artificial intelligence, and probabilistic analysis to cost-effectively determine geomechanical properties and in-situ stresses from borehole deformation measured by caliper logs. It is also demonstrated that the actual borehole size cannot simply be taken as the bit size by default, and that an adjusted borehole size has to be used to obtain a reasonable borehole deformation. In the proposed method, an artificial neural network (ANN) is applied to map the relationship among in-situ stress, adjusted borehole size, geomechanical properties, and borehole displacements. A genetic algorithm (GA) searches for the set of unknown stresses and geomechanical properties that matches the objective borehole deformation function. Probabilistic analysis is conducted after ANN-GA modeling to estimate the most probable ranges of the parameters. The hybrid method is demonstrated in a field case study that estimates the adjusted borehole size, Young's modulus, and the two horizontal in-situ stresses using borehole deformation information from four-arm caliper logs of a vertical borehole in the Liard Basin in Canada.
Fluid-flow measurements of petroleum can be performed using a variety of equipment such as orifice meters and wellhead chokes. It is useful to understand the relationship between flow rate through orifice meters (Qv) and the five input variables that influence fluid flow: pressure (P), temperature (T), viscosity (μ), square root of differential pressure (ΔP^0.5), and oil specific gravity (SG). Here we evaluate these relationships using a range of machine-learning algorithms applied to orifice meter data from a pipeline flowing from the Cheshmeh Khosh Iranian oil field. Correlation coefficients indicate that Qv has weak to moderate positive correlations with T, P, and μ, a strong positive correlation with ΔP^0.5, and a weak negative correlation with SG. In order to predict the flow rate with reliable accuracy, five machine-learning algorithms are applied to a dataset of 1037 data records (830 used for algorithm training; 207 used for testing), with the full input variable values for the dataset provided. The algorithms evaluated are: Adaptive Neuro Fuzzy Inference System (ANFIS), Least Squares Support Vector Machine (LSSVM), Radial Basis Function (RBF), Multilayer Perceptron (MLP), and Gene Expression Programming (GEP). The prediction performance analysis reveals that all of the applied methods provide predictions at acceptable levels of accuracy. The MLP algorithm achieves the most accurate predictions of orifice meter flow rates for the dataset studied. GEP and RBF also achieve high levels of accuracy. ANFIS and LSSVM perform less well, particularly in the lower flow rate range (i.e., <40,000 stb/day). Some machine-learning algorithms have the potential to overcome the limitations of idealized streamline analysis based on the Bernoulli equation when predicting flow rate across an orifice meter, particularly at low flow rates and in turbulent flow conditions. Further studies on additional datasets are required to confirm this.
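For context, the idealized Bernoulli-based analysis mentioned in this abstract can be sketched with the textbook orifice equation for incompressible flow. This is not the method or data of the paper: the function name, the SI units, the example dimensions, and the default discharge coefficient of 0.61 (a commonly quoted value for sharp-edged orifices) are all assumptions.

```python
import math


def orifice_flow_rate(delta_p, density, d_orifice, d_pipe, discharge_coeff=0.61):
    """Volumetric flow rate (m^3/s) through an orifice plate.

    Idealized incompressible form:
        Q = Cd * A_o / sqrt(1 - beta^4) * sqrt(2 * delta_p / rho)
    delta_p in Pa, density in kg/m^3, diameters in m.
    discharge_coeff is an assumed, dimensionless correction factor.
    """
    if not 0.0 < d_orifice < d_pipe:
        raise ValueError("orifice diameter must be positive and smaller than the pipe diameter")
    if delta_p < 0.0 or density <= 0.0:
        raise ValueError("delta_p must be non-negative and density positive")
    beta = d_orifice / d_pipe                      # diameter ratio
    area = math.pi * d_orifice ** 2 / 4.0          # orifice bore area
    velocity_of_approach = 1.0 / math.sqrt(1.0 - beta ** 4)
    return discharge_coeff * area * velocity_of_approach * math.sqrt(2.0 * delta_p / density)


if __name__ == "__main__":
    # Hypothetical operating point: 850 kg/m^3 oil, 50 mm bore in a 100 mm pipe.
    for dp in (1_000.0, 4_000.0, 16_000.0):
        q = orifice_flow_rate(dp, 850.0, 0.05, 0.10)
        print(f"delta_p = {dp:8.0f} Pa -> Q = {q:.5f} m^3/s")
```

In this idealized form the flow rate is proportional to the square root of the differential pressure, which is consistent with the strong positive correlation between Qv and ΔP^0.5 reported above. The discharge coefficient absorbs the departures from ideal flow that the machine-learning models are intended to capture from data.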
Minimum miscibility pressure (MMP) is a key parameter in the successful design of miscible gas injection, such as CO2 flooding, for enhanced oil recovery (EOR). MMP is generally determined through experimental tests such as the slim tube and the rising bubble apparatus (RBA). As these tests are time-consuming and very expensive, several correlations have been developed. However, despite their simplicity, these correlations suffer from inaccuracy and poor generalization because of their limited ranges of application. This paper aims to establish a global model to predict MMP for both pure and impure CO2-crude oil systems in EOR processes by combining support vector regression (SVR) with the artificial bee colony (ABC) algorithm. ABC is used to find the best SVR hyper-parameters. A total of 201 data points, collected from authenticated published literature and covering a wide range of variables, are used to develop the SVR-ABC pure/impure CO2-crude oil MMP model with the following inputs: reservoir temperature (TR), critical temperature of the injection gas (Tc), molecular weight of the pentane-plus fraction of the crude oil (MWC5+), and the ratio of volatile components to intermediate components in the crude oil (xvol/xint). Statistical indicators and graphical error analyses show that the SVR-ABC MMP model yields excellent results, with a low mean absolute percentage error (3.24%) and root mean square error (0.79) and a high coefficient of determination (0.9868). Furthermore, the results reveal that SVR-ABC outperforms both ordinary SVR tuned by trial and error and all existing methods considered in this work in the prediction of pure and impure CO2-crude oil MMP. Finally, the leverage approach (Williams plot) is used to investigate the applicability domain of the new model and to detect any probable erroneous data points.
Lost circulation is the most common problem encountered while drilling oil wells. When it occurs, it can waste a great deal of time and money. A fast and cost-effective way to predict and solve the lost circulation problem is therefore necessary when drilling oil wells. Expert systems have lately been used for problems that involve uncertainty. In this paper, three approaches are applied to the prediction of lost circulation: design of experiments (DOE), data mining, and adaptive neuro-fuzzy inference system (ANFIS). Data from 61 wells of the Maroon oilfield are selected and sorted as the feed of the systems. Seventeen variables are used as inputs and one variable is used as the output. First, DOE is conducted to observe the effects of the variables; the Plackett-Burman method is used to determine their effects on lost circulation. After that, data mining is conducted to predict the amount of lost circulation. The regression class of methods is used to determine a function that models the data and to quantify the error of the model. Then, ANFIS is applied to predict the amount of lost circulation. The chosen data are used to train, test, and control the ANFIS. Furthermore, subtractive clustering is used to train the fuzzy inference system (FIS) of the model. The performance of the ANFIS model is assessed through the root mean squared error (RMSE). The results suggest that the ANFIS method can be successfully applied to establish a lost circulation prediction model. In addition, ANFIS and data mining are compared in terms of their prediction performance. The comparison reveals that the ANFIS error is much lower than the data mining error.
As oil and gas extraction activities move into deeper rock formations, many experimental studies and field investigations indicate that rock exhibits plastic behavior rather than purely linear elastic behavior, so poroelastoplasticity must be taken into account in reservoir simulation. Because reservoir rock is a porous material consisting of a compressible solid matrix and a number of compressible fluids occupying the pore space, fully coupled modeling is required for reservoir simulation that considers solid-fluid interaction, complex stress conditions and nonlinear behavior. However, the computational process can be cumbersome when the constant stiffness method is used to address the poroelastoplastic behavior. In this paper, a fully coupled poroelastoplastic reservoir model based on the Drucker-Prager yield criterion is implemented with the tangent stiffness method, and its computational efficiency is compared with that of the constant stiffness method. The accuracy of the two methods is demonstrated in one-dimensional consolidation. In a case study, the two methods are used to analyze the stresses and pore pressure of a reservoir, and their computed results and running efficiency are compared. The linear elastic and nonlinear solutions are also compared in one-dimensional consolidation and reservoir modeling. The results show that the difference between the constant stiffness method and the tangent stiffness method is very small, while the tangent stiffness method requires significantly fewer iterations and a shorter running time than the constant stiffness method.
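The difference in iteration count between the two stiffness strategies can be illustrated on a one-degree-of-freedom toy problem. This is only an analogy, not the coupled poroelastoplastic model of the paper: the softening force law, the load value, and all names are assumptions chosen so that the exact answer is known.

```python
import math


def internal_force(u: float) -> float:
    """Toy softening response (assumed): stiff at first, then flattening out."""
    return 10.0 * (1.0 - math.exp(-u))


def tangent_stiffness(u: float) -> float:
    """Derivative of internal_force with respect to displacement u."""
    return 10.0 * math.exp(-u)


def solve(load: float, use_tangent: bool, tol: float = 1e-10, max_iter: int = 1000):
    """Iterate until |load - internal_force(u)| < tol.

    use_tangent=True  : stiffness is updated every iteration (tangent method).
    use_tangent=False : the initial stiffness is reused (constant method).
    Returns the displacement and the number of updates performed.
    """
    u = 0.0
    k_initial = tangent_stiffness(0.0)
    for updates in range(max_iter + 1):
        residual = load - internal_force(u)
        if abs(residual) < tol:
            return u, updates
        stiffness = tangent_stiffness(u) if use_tangent else k_initial
        u += residual / stiffness
    raise RuntimeError("no convergence within max_iter updates")


if __name__ == "__main__":
    load = 8.0  # exact solution is u = ln(5) for this force law
    for label, flag in (("tangent", True), ("constant", False)):
        u, updates = solve(load, flag)
        print(f"{label:8s} stiffness: u = {u:.8f} after {updates} updates")
```

Both strategies converge to the same displacement, in line with the abstract's observation that the results differ very little. The constant stiffness iteration needs many more updates because its correction step does not account for the loss of stiffness as the material softens.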