Project objective
The objective of this project is to build an artificial intelligence model (smart proxy) that can replicate the functions of the CMG model, in short, to predict the pressure distribution and CO2 plume at any time step during injection and post-injection. Why? Doing that with the CMG model takes a long time (in some cases around a week). The same run with the artificial intelligence model takes seconds or minutes while maintaining the same accuracy.
CMG models
100 CMG realizations, created from a saline CO2 sequestration model, were received from NETL.
The realizations were generated by varying porosity and permeability across 7 average levels (5, 10, 25, 50, 75, 90, 95).
Later they noticed that levels 5 and 10 are unrealistically low while levels 90 and 95 are unrealistically high, so they decided to go forward using only levels 25, 50 and 75.
Levels 25, 50 and 75 contain 22, 23 and 16 realizations respectively, for a total of 61 realizations.
The only properties that change across those 61 realizations are permeability and porosity (these drive the changes in the outputs, pressure and CO2 saturation).
The main purpose of the model is to predict the changes in pressure distribution and the CO2 plume in the saline aquifer after injecting CO2 for 10 years.
There is also a post-injection period of 170 years, used to monitor what happens to the CO2 and the pressure after injection stops.
There are four injectors in the field, where the target is to inject around 2 metric tons of CO2 yearly for 10 years.
There are two constraints on the injection process: the first is the bottom hole pressure and the second is the injection rate, neither of which should be exceeded.
The model contains 30 layers (211 x 211 cells in each layer). The first 2 layers are sealed and the injection wells are not completed in them; the wells are completed in layers 3-30.
There are 3 rock types in the model (rock type 1, 2, 3).
Rock type 1 is only used in layers 1 and 2 (it has very low relative permeability, which serves the design purpose of making those 2 layers act like a seal).
Layers 3-30 are a mixture of rock types 2 and 3, where rock type 2 has the same relative permeability as rock type 1 (sealed). Each model contains 1,335,630 cells (211 cells x 211 cells for each of the 30 layers).
A number of realizations are selected and their data extracted, then extra features are generated to improve the prediction results. After creating the dataset, we noticed that some features, such as permeability, need an engineering step to overcome their skewed distribution.
The dataset is then divided into 3 parts. The first is the training dataset, which is used to train the smart proxy model. The second is the calibration dataset, which is used as a watchdog: after every complete cycle of presenting the training dataset to the smart proxy, the model is tested on the calibration dataset, the prediction result is checked, and the weights of the model neurons are modified according to the result of this test. Finally, the third part is the blind validation dataset, which is completely separated from the training and calibration datasets before the training process starts and is used to perform a final test on the model after it achieves the desired accuracy in training.
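The three-way partition described above can be sketched as follows. This is a minimal illustration only: the 80/10/10 fractions, the random seed, and the choice to split individual rows (rather than whole realizations) are assumptions, since the proportions used in the project are not stated here.

```python
import random

def split_indices(n_rows, train_frac=0.8, calib_frac=0.1, seed=0):
    """Shuffle row indices and cut them into training, calibration and blind sets.

    The fractions and the seed are illustrative assumptions, not project values.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_train = int(n_rows * train_frac)
    n_calib = int(n_rows * calib_frac)
    train = indices[:n_train]
    calib = indices[n_train:n_train + n_calib]
    blind = indices[n_train + n_calib:]  # set aside before any training starts
    return train, calib, blind

train_idx, calib_idx, blind_idx = split_indices(1000)
```

The blind set is cut out before training begins, so it never influences the model during the training and calibration cycles.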
This process of training, calibration and blind validation is repeated for every model in the smart proxy. In this project there are 2 models (pressure and CO2 saturation).
Features generation
After studying and understanding the data structure, new features were generated to improve the models' ability to learn and achieve precise predictions. Since the data is structured in a cell-based fashion, each row in the dataset represents a focal cell in the model. The dimensions of the model are 211 cells along the "I" axis, 211 cells along the "J" axis and 30 layers along the "K" axis; i, j, k correspond to x, y, z respectively. The total number of cells in each realization is 211x211x30 = 1,335,630. The features are generated as columns, meaning each new feature is a column holding a value for each focal cell.
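As a small worked check of the cell count and the row-per-cell layout, the sketch below maps a dataset row number to its (i, j, k) position. The ordering (i varying fastest, then j, then k) and the function name `row_to_ijk` are assumptions made for illustration; the actual row order depends on how the data was exported.

```python
NI, NJ, NK = 211, 211, 30  # grid dimensions along I, J and K

def row_to_ijk(row):
    """Map a 0-based row number to 1-based (i, j, k), assuming i varies fastest."""
    k, rest = divmod(row, NI * NJ)
    j, i = divmod(rest, NI)
    return i + 1, j + 1, k + 1

n_cells = NI * NJ * NK  # 211 x 211 x 30 = 1,335,630 rows per realization
first_cell = row_to_ijk(0)
last_cell = row_to_ijk(n_cells - 1)
```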
Distance from the focal cell to the (closest, 2nd closest, 3rd closest, 4th closest) injectors (4 new attributes)
[Figure: the focal cell and its closest, 2nd closest, 3rd closest and 4th closest injectors]
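A minimal sketch of the four distance attributes is given below. It assumes the distance is measured on the (i, j) plane in grid-cell units, and the injector locations are hypothetical because the real well coordinates are not listed in this document.

```python
import math

def sorted_injector_distances(cell_ij, injectors_ij):
    """Return the distances from a focal cell to every injector, closest first.

    Distances are in grid-cell units on the (i, j) plane (an assumption).
    """
    ci, cj = cell_ij
    return sorted(math.hypot(ci - wi, cj - wj) for wi, wj in injectors_ij)

# Hypothetical injector locations, for illustration only.
injectors = [(50, 50), (50, 160), (160, 50), (160, 160)]
d1, d2, d3, d4 = sorted_injector_distances((53, 54), injectors)
```

Here `d1` to `d4` would become the four new columns for the focal cell at (53, 54).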
After generating the distance to each injector, the best way to represent injection rate, cumulative injection and BHP is to arrange them in the same order (injection rate of the closest injector, cumulative injection of the closest injector, BHP of the closest injector, and so on for the 2nd, 3rd and 4th closest injectors) (12 new attributes).
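The proximity ordering of the well attributes could look like the sketch below. The well names, locations, values and column names (`rate_1`, `cum_1`, `bhp_1`, ...) are all hypothetical; only the idea of ranking the injectors by distance and writing their values in that order comes from the text.

```python
import math

def injector_features_by_proximity(cell_ij, wells):
    """Flatten per-well values into one row, ordered from closest to farthest well.

    `wells` maps a well name to a dict holding its (i, j) location and its
    injection rate, cumulative injection and bottom hole pressure.
    """
    ci, cj = cell_ij
    ranked = sorted(
        wells.values(),
        key=lambda w: math.hypot(ci - w["ij"][0], cj - w["ij"][1]),
    )
    row = {}
    for rank, well in enumerate(ranked, start=1):
        row[f"rate_{rank}"] = well["rate"]
        row[f"cum_{rank}"] = well["cum"]
        row[f"bhp_{rank}"] = well["bhp"]
    return row

# Hypothetical wells and values, for illustration only.
wells = {
    "INJ1": {"ij": (50, 50), "rate": 1.0, "cum": 10.0, "bhp": 3000.0},
    "INJ2": {"ij": (50, 160), "rate": 2.0, "cum": 20.0, "bhp": 3100.0},
    "INJ3": {"ij": (160, 50), "rate": 3.0, "cum": 30.0, "bhp": 3200.0},
    "INJ4": {"ij": (160, 160), "rate": 4.0, "cum": 40.0, "bhp": 3300.0},
}
features = injector_features_by_proximity((150, 155), wells)
```

With 4 injectors and 3 values per injector, the row holds the 12 attributes mentioned above.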
We also apply the same method (injection rate of the closest injector, BHP of the closest injector, etc.) to the layers above and below the layer of the focal cell. For example, if the focal cell is in layer 4, the values for layers 3 and 5 are also included as new attributes, to account for any possible communication between layers.
Rock type was extracted from the numerical simulator as a single attribute with values 1, 2 and 3. We converted that attribute into 3 new attributes, where each column represents one rock type as a binary value (either 1 or 0).
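This conversion is a one-hot encoding. A minimal sketch, with hypothetical column names, is:

```python
def one_hot_rock_type(rock_type, rock_types=(1, 2, 3)):
    """Turn a rock-type code into one binary column per rock type."""
    return {f"rock_type_{rt}": int(rock_type == rt) for rt in rock_types}

encoded = one_hot_rock_type(2)
```

For a cell of rock type 2, only the `rock_type_2` column is set to 1.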
Tier cells are the cells surrounding the focal cell. There are three types of tier cells (face, line, point).
Face tier cells have face contact with the focal cell; each focal cell has 6 face tier cells.
Line tier cells have line contact with the focal cell; each focal cell has 12 line tier cells.
Point tier cells have point contact with the focal cell; each focal cell has 8 point tier cells.
New attributes were generated from the tier cells: permeability, porosity, initial pressure and rock type (3 attributes), that is 6 attributes for each of the 6 + 12 + 8 = 26 tier cells, so we end up adding 6×26 = 156 new attributes.
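The three tier types can be derived from the 26 neighbour offsets of a cell by counting how many of the three indices change. The sketch below shows that classification and the resulting attribute count. The names are hypothetical, and simply dropping neighbours that fall outside the grid is an assumption, since the treatment of boundary cells is not described here.

```python
from itertools import product

# Classify the 26 neighbour offsets by how many of (di, dj, dk) are non-zero:
# 1 -> face contact, 2 -> line contact, 3 -> point contact.
TIER_BY_CHANGES = {1: "face", 2: "line", 3: "point"}
TIER_OFFSETS = {"face": [], "line": [], "point": []}
for offset in product((-1, 0, 1), repeat=3):
    n_changed = sum(1 for d in offset if d != 0)
    if n_changed:  # skip (0, 0, 0), which is the focal cell itself
        TIER_OFFSETS[TIER_BY_CHANGES[n_changed]].append(offset)

def tier_cells(i, j, k, shape=(211, 211, 30)):
    """Return the 1-based (i, j, k) indices of in-grid tier cells, grouped by tier."""
    result = {}
    for tier, offsets in TIER_OFFSETS.items():
        cells = [(i + di, j + dj, k + dk) for di, dj, dk in offsets]
        result[tier] = [
            c for c in cells if all(1 <= idx <= n for idx, n in zip(c, shape))
        ]
    return result

# permeability, porosity, initial pressure and 3 rock-type flags
ATTRIBUTES_PER_TIER_CELL = 6
n_tier_cells = sum(len(v) for v in TIER_OFFSETS.values())
n_new_attributes = ATTRIBUTES_PER_TIER_CELL * n_tier_cells

interior = tier_cells(100, 100, 15)
corner = tier_cells(1, 1, 1)
```

An interior cell has all 26 tier cells, while a cell in a corner of the grid keeps only 3 face, 3 line and 1 point tier cell under this assumption.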
While running the numerical reservoir simulator, there were instances where a cell value was very low or empty because of the difficulty of converging with a very small permeability. Therefore, 4 new attributes were generated to show the number of inactive face tier cells, inactive line tier cells, inactive point tier cells and the total number of inactive tier cells for each focal cell.
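The four inactive-cell counts could be computed as in the sketch below, shown on a toy 3x3x3 grid. How a cell is flagged as inactive is not specified in this document, so the `is_active` map is a hypothetical input, and skipping neighbours outside the grid is an assumption.

```python
from itertools import product

TIER_NAMES = {1: "face", 2: "line", 3: "point"}

def count_inactive_tiers(cell, is_active):
    """Count the inactive neighbours of `cell` per tier, plus their total.

    `is_active` maps (i, j, k) to True or False. Neighbours missing from the
    map (outside the grid) are skipped, which is an assumption.
    """
    counts = {"face": 0, "line": 0, "point": 0}
    for offset in product((-1, 0, 1), repeat=3):
        n_changed = sum(1 for d in offset if d != 0)
        if n_changed == 0:
            continue  # the focal cell itself
        neighbour = tuple(c + d for c, d in zip(cell, offset))
        if neighbour in is_active and not is_active[neighbour]:
            counts[TIER_NAMES[n_changed]] += 1
    counts["total"] = counts["face"] + counts["line"] + counts["point"]
    return counts

# Toy 3x3x3 grid with two inactive cells around the centre cell (1, 1, 1).
is_active = {idx: True for idx in product(range(3), repeat=3)}
is_active[(1, 1, 0)] = False  # face neighbour of (1, 1, 1)
is_active[(0, 0, 0)] = False  # point neighbour of (1, 1, 1)
inactive = count_inactive_tiers((1, 1, 1), is_active)
```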
Since there are only 2 relative permeability curves for the 3 rock types, we can consider rock types 1 and 2 as sealed (very low permeability) and rock type 3 as conductive. Therefore we generated a new binary attribute for each cell: if the rock type in the focal cell is 1 or 2 the value is 0, and if it is rock type 3 the value is 1.
There are 2 uncompleted (not perforated) layers (layers 1 and 2), and layers 3-30 are completed. So a new binary attribute was generated to show whether the focal cell is in a completed layer (1) or a sealed layer (0).
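Both binary attributes follow directly from the rock type and the layer index. A minimal sketch with hypothetical function names:

```python
def conductive_flag(rock_type):
    """1 for the conductive rock type 3, 0 for the sealed rock types 1 and 2."""
    return 1 if rock_type == 3 else 0

def completed_layer_flag(k, first_completed=3, last_completed=30):
    """1 if layer k is completed (layers 3-30), 0 if it is sealed (layers 1-2)."""
    return 1 if first_completed <= k <= last_completed else 0

rock_flags = [conductive_flag(rt) for rt in (1, 2, 3)]
layer_flags = [completed_layer_flag(k) for k in (1, 2, 3, 30)]
```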
To strengthen the models' understanding of the different rock types, effective relative permeability was introduced to convey the relation between the relative permeability curves and each rock type. 4 relative permeability values were taken from each curve at similar intervals, resulting in 8 values: 4 krg values and 4 krw values.
That was applied to the focal cell and the face tier cells (8 attributes for the focal cell and 8 for each of the 6 face tier cells, 8 + 8×6 = 56 new attributes).
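The sampling of 4 krg and 4 krw values per curve might be sketched as below. The curve table is entirely hypothetical (it is not the project's relative permeability data), and taking equally spaced gas saturations between the table end points with linear interpolation is one assumed reading of "similar intervals".

```python
def interpolate(x, xs, ys):
    """Piecewise-linear interpolation on a table with ascending xs."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for n in range(len(xs) - 1):
        x0, x1 = xs[n], xs[n + 1]
        if x0 <= x <= x1:
            return ys[n] + (ys[n + 1] - ys[n]) * (x - x0) / (x1 - x0)

def sample_curve(sat_table, kr_table, n_points=4):
    """Read n_points values off a curve at equally spaced saturations."""
    lo, hi = sat_table[0], sat_table[-1]
    step = (hi - lo) / (n_points - 1)
    return [interpolate(lo + n * step, sat_table, kr_table) for n in range(n_points)]

# Hypothetical curve table for one rock type (gas saturation, krg, krw).
sg = [0.0, 0.2, 0.4, 0.6, 0.8]
krg = [0.0, 0.05, 0.2, 0.5, 0.9]
krw = [1.0, 0.6, 0.3, 0.1, 0.0]

kr_features = sample_curve(sg, krg) + sample_curve(sg, krw)  # 4 krg + 4 krw
n_kr_attributes = len(kr_features) * (1 + 6)  # focal cell + 6 face tier cells
```

Each cell then carries the 8 values of its own rock type, and repeating them for the 6 face tier cells gives the 56 attributes.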
Attribute engineering
Most permeability values lie below 100 mD while some values go up to 12000 mD, which makes the normalization process less efficient. To overcome this issue we implemented a method called "power transformation".
Power transforms are a family of parametric, monotonic transformations that are applied to make data more Gaussian-like. This is useful for modeling issues related to heteroscedasticity (non-constant variance), or other situations where normality is desired.
When permeability was normalized directly, we noticed a skewness issue.
After applying the power transformation to permeability and then normalizing, the skewness was reduced.
This technique was applied to the effective permeability attributes as well, since they are derived from permeability.
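To illustrate the effect, the sketch below applies a Box-Cox power transform followed by min-max normalization to a handful of made-up permeability values and compares the skewness before and after. The specific transform (Box-Cox), the fixed exponent `lam = 0` (the log case), the min-max normalization and the sample values are all assumptions; in practice the exponent is usually fitted to the data rather than fixed.

```python
import math

def box_cox(x, lam):
    """Box-Cox power transform of a positive value; lam = 0 gives the natural log."""
    if x <= 0:
        raise ValueError("Box-Cox requires positive values")
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam

def skewness(values):
    """Moment-based sample skewness (0 for a symmetric distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def min_max_normalize(values):
    """Scale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Made-up permeabilities in mD: mostly below 100, a few very large values.
perm_md = [5, 10, 20, 40, 60, 80, 95, 500, 3000, 12000]

transformed = [box_cox(k, lam=0) for k in perm_md]
normalized = min_max_normalize(transformed)

skew_before = skewness(perm_md)
skew_after = skewness(transformed)
```

On this sample the skewness of the transformed values is much closer to zero than that of the raw values, so the normalized column is spread more evenly over its range.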
Results of model predictions will be provided later