Using an MPC to control a system: Applying the controller to the system
Last time we saw the cost function. It is quadratic, which lets us search for a minimum at which the values obtained for the control input vector will steer the system in the right direction. Here’s the cost function:
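For reference, a typical quadratic MPC cost of this shape can be written as follows, with e standing for the tracking error and Δθ for the control move. This is a sketch of the general form rather than a verbatim copy of the figure, so the symbols may differ slightly from the images:

```latex
J = \frac{1}{2}\, e_{k+N}^{T} S\, e_{k+N}
  + \frac{1}{2} \sum_{i=0}^{N-1} \left( e_{k+i}^{T} Q\, e_{k+i}
  + \Delta\theta_{k+i}^{T} R\, \Delta\theta_{k+i} \right)
```

Note that the errors run from sample K to K+N (with S weighting the terminal one) while the control moves run from K to K+N-1, matching the counts discussed below.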
Beware, this story will be dense in mathematical derivations. So, buckle up and prepare to digest lots of letters and very few numbers.
Something that should be mentioned is that the cost function contains the error from time sample K to K+N and the control input from time sample K to K+N-1. When we defined the equation that calculates the prediction of the states, we saw that we use the current states (at time sample K) and N control inputs to calculate N new predicted states. This is the equation:
So, we’ll have N+1 samples of states (the current one and N more) and N samples of control inputs. Below is a schematic of how the weight matrices are spread across the time samples in the cost function.
We also discussed the effects of the different parameters of the function and concluded that there is one parameter b that is irrelevant because it changes neither the position of the minimum nor the behaviour of the function. Before we start changing the cost function, we need to take into account the following algebraic rules:
These rules tell us how a transpose outside parentheses affects the sums, differences and products inside. Let’s take a look at the error term that deals with time sample K+N and perform the operations we have just seen:
Let’s expand and get rid of the parentheses.
Now, bring your attention to the terms highlighted in blue. They are very similar, and one could think that by transposing one of them, reversing the order of the matrices and vectors inside, they could be made identical. But there are two apparent problems. The weight matrix S would end up transposed in one of the terms and not in the other. Also, we can’t just transpose a whole term and expect it to keep the same shape and remain the same term. Or can we? Let’s deconstruct the second of the highlighted terms and look at its dimensionality.
We can see that it is nothing but a scalar, and the transpose of a scalar is the scalar itself:
Note that although our system is simple and we only control one variable, this reasoning still applies to more complex systems. What if we had two variables that we wanted to control?
We still get a scalar. And now it is visible that the weight matrix S is diagonal, and the transpose of a diagonal matrix is also itself.
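As a quick numerical sanity check, here is a small sketch with made-up numbers: a hypothetical two-variable case with a diagonal weight matrix S, showing that the term is a scalar, that transposing it changes nothing, and that the transpose rule just swaps the order of its constituents:

```python
import numpy as np

# Hypothetical 2-variable example: a reference-like vector r, an error-like
# vector e, and a diagonal weight matrix S (so S.T equals S).
r = np.array([[2.0], [0.7]])
e = np.array([[1.5], [-0.3]])
S = np.diag([10.0, 4.0])

term = r.T @ S @ e        # a 1x1 matrix, i.e. a scalar
term_T = term.T           # transposing a scalar changes nothing
swapped = e.T @ S @ r     # (r^T S e)^T = e^T S^T r = e^T S r

print(term.item(), term_T.item(), swapped.item())
```

All three values printed are identical, which is exactly the property used to merge the two blue terms.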
Going back to the terms highlighted in blue and applying the changes we just talked about, we get:
Everything we discussed can also be done to the terms that involve the weight matrix Q, for all time samples. Let’s change the cost function according to all the above considerations.
As we can see, the same line of thought was applied to the first term inside the sum operator. Some new terms have been highlighted. In the red ones, the vector r represents the reference values, which are constant for every time sample, and so are the weight matrices S and Q.
The terms in blue are constant only for the time sample where i = 0 (the current time sample K), because the tilde X vector then contains the current states, which already exist and can’t be changed to minimise the cost function.
These terms that are constant can be discarded to obtain the following equation:
And the summation parts in this new cost function J’ can be expanded and converted into a matrix form that is much more intuitive by joining the reference, tilde X and Δθ vectors into their global counterparts.
That can be simplified with the following notation:
Where the double bar letters Q, S and R represent the matrices in the same order. The global vectors are the following:
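One common way to build the double bar weight matrices is to stack the stage weights along a block diagonal, with the terminal weight S in the last block. The sketch below uses made-up weights and a hypothetical horizon N = 4; the exact block layout in my figures may differ slightly:

```python
import numpy as np

def block_diag(blocks):
    """Stack matrices along the diagonal, with zeros everywhere else."""
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols))
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return out

N = 4                     # hypothetical prediction horizon
Q = np.diag([100.0])      # stage error weight (one controlled variable)
S = np.diag([1000.0])     # terminal error weight
R = np.diag([10.0])       # control input weight

# Double bar matrices: Q repeated over the horizon with S in the last block,
# and R repeated once per control move.
Q_bar = block_diag([Q] * (N - 1) + [S])
R_bar = block_diag([R] * N)

print(Q_bar.shape, np.diag(Q_bar), R_bar.shape)
```

Because each block is diagonal, the global matrices are diagonal too, which is what lets us treat their transposes as themselves in the derivation.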
Remember the equation that enabled us to calculate predictions? We are going to substitute it into the cost function. But before that, let’s change its notation from this:
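The target is the compact global form, stacking all N predicted states into a single vector. Schematically, using the double bar and global vector notation from above (a sketch, since the exact symbols live in the figures):

```latex
\bar{X} = \bar{A}\,\tilde{x}_{k} + \bar{B}\,\Delta\bar{\theta}
```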
And by substituting it into the cost function we get:
Now, let’s expand again and see if there are any more terms that we can set to zero.
The terms highlighted in red are constants. The tilde X vector at the Kth iteration contains the states at the current iteration, not predictions, as discussed before. So, again, they are already real values, not predictions that can be changed and optimised. The double bar matrices A and Q are just constant parameters and weights. The r vector contains only the set point values, which are also constant. We’ll set both terms to zero.
The terms highlighted in blue are very similar. Doing the same analysis as before, we can see that each of them is just a scalar and that double bar Q is a diagonal matrix, so its transpose is itself.
Adding both terms together and setting the constants to zero we get:
Which can be simplified to:
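Written out, the simplified cost is a pure quadratic in the global input vector. A sketch in the double bar notation, assuming the standard substitution of the prediction equation into the cost (symbols hedged against the figures):

```latex
J' = \frac{1}{2}\,\Delta\bar{\theta}^{T}
     \left( \bar{B}^{T} \bar{Q}\, \bar{B} + \bar{R} \right) \Delta\bar{\theta}
   + \left( \bar{A}\,\tilde{x}_{k} - \bar{r} \right)^{T} \bar{Q}\,\bar{B}\;\Delta\bar{\theta}
```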
We have finally deduced the simplest form of the cost function. We still need to find its minimum. From calculus we know that, at a minimum, the derivative of the function is zero. But how can we obtain such a derivative?
Differentiating the cost function
We can see that the only variable here is the global control input vector. We can see it more clearly by using the simpler notation.
Where x represents the variable, and thus the global control input vector. We can see that the cost’s gradient w.r.t. x is:
∇ is the gradient operator; in this case it represents d/dx. In case you were wondering why the cost function had a 1/2 multiplying its terms, here is the reason: it simplifies differentiation by cancelling the factor of 2 brought down from the quadratic power.
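In this simpler notation the cost is a generic quadratic, and for a symmetric matrix a the rule being used is the following (a sketch of the standard result):

```latex
J(x) = \frac{1}{2}\, x^{T} a\, x + b^{T} x
\quad\Longrightarrow\quad
\nabla J = a\, x + b
```

The 1/2 cancels the 2 that the quadratic term brings down, leaving just a x plus the constant b.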
I won’t make it this publication’s objective to show how to deduce the derivative of the cost function. Instead, I am going to show the resulting equation and leave the explanation for the next publication. I decided to take this approach because this series of publications is already long, and I wanted to show some control results with my code. Also, this publication is already very heavy on mathematics.
Can you see some resemblance to the simpler version I showed above? If we compare it to the cost function, we can see that the constant b got transposed and its constituents swapped. We can finally calculate the best values of the control input by setting the cost derivative to zero.
And there we have it. We calculate the global control input vector and with it we can calculate predictions with the equation we defined before that used the value of states in the current iteration.
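As a numerical sketch of this last step, here is a made-up 2x2 quadratic, with H and f standing in for the double bar quantities of the real cost derivative. Setting the gradient to zero amounts to solving one linear system:

```python
import numpy as np

# Made-up symmetric positive definite H and vector f, standing in for the
# double bar matrices of the real cost derivative.
H = np.array([[4.0, 1.0],
              [1.0, 3.0]])
f = np.array([-1.0, 2.0])

# Gradient of J = 0.5 x^T H x + f^T x is H x + f; setting it to zero:
x_opt = np.linalg.solve(H, -f)

# The gradient at x_opt should vanish (within floating point error).
grad = H @ x_opt + f
print(x_opt, np.allclose(grad, 0.0))
```

In the real controller, x_opt plays the role of the global control input vector for the whole prediction horizon.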
Then, we use the first value of the global control input vector to actually change the system. If the model is accurate enough, the states are going to change to values very similar to those of the (K+1)th prediction.
Simulating the real world system
In a previous publication I said that in the next iteration, the states would be the same as the predicted values. Even though that made sense while explaining how predictions worked, it is not what’s actually going to happen.
To simulate the evolution of the real world system, we are going to calculate the derivatives of the states w.r.t. time using the undiscretized forms of the system equations we defined before. With those derivatives we’ll integrate discretely over subsets of the time sample to increase accuracy.
I’ll explain using the equation containing the derivative of temperature w.r.t. time. Let’s recall the equation:
At a certain instant in time, we use the values of the states at the current iteration and calculate T dot. Then we integrate that T dot and add the result to the current value of T. If we divide the time sample into subsamples, we can make more accurate simulations, because T dot is continuous and constantly changing regardless of the time steps we decide to use. This way, we use updated T dot values more often and obtain more precise values of T.
In this image we have a state S that changes in time. The full blue line is its continuous profile if this was an actual real system. We understand that the slope at each instant is its derivative w.r.t. time and is ever changing. In green, we have a dashed line containing five different slopes. The points represent a subsample of time and in those moments, the derivative of S w.r.t. time was updated and that new value was used to obtain a better simulation than in the red situation where S dot was only updated twice.
Above is the calculation of the integral and its discretization by time sample t_s. If we divide t_s into five subsamples we get:
We start with T_0, calculate T_0.2 and so on up to T_1.0, and only use this final value as our real system simulation. Each T dot is an updated value. If our model is accurate enough, this T_1.0 should be very similar to the first predicted value. This process is repeated for all the states of our system.
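The sub-stepping idea can be sketched in a few lines. The T_dot below is a hypothetical first-order stand-in, not the post’s actual temperature equation; the point is only to show that refreshing the derivative at every sub-step tracks the continuous solution better than one big Euler step:

```python
import math

def T_dot(T, u):
    """Hypothetical stand-in for the continuous dT/dt equation of the system."""
    return -0.5 * T + u

def simulate_step(T0, u, t_s=1.0, n_sub=5):
    """Integrate T over one time sample t_s using n_sub Euler sub-steps."""
    dt = t_s / n_sub
    T = T0
    for _ in range(n_sub):
        T += T_dot(T, u) * dt   # refresh T_dot at every sub-step
    return T

coarse = simulate_step(20.0, 5.0, n_sub=1)    # T_dot evaluated only once
fine = simulate_step(20.0, 5.0, n_sub=50)     # T_dot refreshed 50 times
exact = 10.0 + 10.0 * math.exp(-0.5)          # analytic solution at t = t_s
print(coarse, fine, exact)
```

For this toy equation the analytic solution is known, and the fine simulation lands much closer to it than the coarse one, which is exactly the green-versus-red situation in the figure.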
The MPC loop
Our MPC loop is finally complete. To put it simply: we apply the states at the current iteration K to the derivative of the cost function, adjusting the weights to shape the behaviour of the control, and calculate a new global control input vector as long as our prediction horizon.
We apply both the states and control input vector in the equation that enables us to make state predictions.
Finally, we use the first control input value from the global vector in the simulation of the OLS for the next iteration. Then we just repeat the loop. Note that in the Simulating the real world system section we only saw the T dot equation, but all system equations need to go through that step; using the V dot dot equation, we would have seen that the first control input value is used to update the V dot dot values.
Below is a scheme of how our MPC loop looks, externally and internally.
We created an LTI model that could represent our system and discretised it, and with the error of our control we defined a quadratic cost function that would calculate a set of control input values. We expanded our system equation from the SSES to include as many predictions as we wanted, and with the global control input vector calculated those predictions. Then we used its first value to calculate the new states through subsets of the time sample to increase accuracy. Hopefully, the first predicted value is very similar to the state values of the new iteration.
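The whole loop can be sketched end to end for a toy scalar LTI system. This is an illustrative stand-in, not the post’s steam-heating model: the dynamics, horizon and weights are all made up, and for simplicity the "real system" update reuses the model itself rather than a sub-stepped integration:

```python
import numpy as np

# Toy scalar LTI system x_{k+1} = a x_k + b u_k (hypothetical values).
a, b = 0.9, 0.5
N = 10                      # prediction horizon
q, r_w = 100.0, 1.0         # error weight and input weight

# Global prediction matrices: X = A_bar x_k + B_bar U over the horizon.
A_bar = np.array([[a ** (i + 1)] for i in range(N)])
B_bar = np.zeros((N, N))
for i in range(N):
    for j in range(i + 1):
        B_bar[i, j] = a ** (i - j) * b

Q_bar = q * np.eye(N)       # global error weight
R_bar = r_w * np.eye(N)     # global input weight

x = 0.0                     # current state
ref = 1.0                   # constant reference
history = []
for k in range(30):
    r_vec = np.full((N, 1), ref)
    # Cost in U is 0.5 U^T H U + f^T U, minimised where H U + f = 0.
    H = B_bar.T @ Q_bar @ B_bar + R_bar
    f = B_bar.T @ Q_bar @ (A_bar * x - r_vec)
    U = np.linalg.solve(H, -f)
    u0 = U[0, 0]            # apply only the first control move
    x = a * x + b * u0      # "real system" update (model assumed exact)
    history.append(x)

print(round(history[-1], 3))  # the state settles close to the reference
```

Each pass through the loop is one MPC iteration: measure the state, solve the quadratic for the whole horizon, apply the first input, and repeat.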
Simulating the MPC
Let me show you some simulations. These will take into consideration different weights so that it is easier to understand how they actually influence the control of the system.
In the following gifs, you’ll see a series of moving graphs. In the first graph, the blue line is the reference temperature we want. Instead of a constant temperature, we are using a sinusoidal reference to see how the MPC copes with the varying temperature requirement. The red line is the actual temperature and the purple line is the horizon of predictions.
The second graph shows the temperature difference between the reference and the actual temperature. At the bottom, the left graph is the pressure acting on the steam. The right one shows the volumetric flow of the steam.
In this first example we set Q and S to 10000 and R to 100. It is easy to see that this set of parameters isn’t good, because the system takes time to stabilise. The greater Q and S values make the controller want to approach the reference as quickly as possible, making it less flexible, which we can see in the horizon line as it tries to meet the blue line sooner. The smaller R value lets the pressure vary much more than in the next example.
In the second example we set Q and S to 100 and R to 10000. As discussed in the previous publication, smaller weights give the controller more freedom to adjust the predictions. Here, the stiffness of the control input is greater, and as a consequence the pressure changes at a slower pace compared to the first example. Also, since the temperature weights Q and S are small, the temperature is not so constricted to the values of the blue line in the horizon prediction.
Today we went through a mathematics-intensive explanation of how to simplify the cost function by identifying and removing terms that don’t change the behaviour of the control.
We also saw the differentiated counterpart of the cost function, but since this story is already so long and heavy on mathematics, I’ll leave the deduction for another time.
We learned how to simulate the real world system and make it more accurate with subdivisions of the time sample.
We saw the MPC loop in a scheme and discussed the key steps of an iteration.
Finally we ran two simulations of the MPC loop with different weights to understand how they influence control.
The knowledge needed to create this project was a mix of what I learned in my MEng in Chemical Engineering and the course Applied Control Systems with Python on Udemy.com.
Disclaimer: I have nothing to do with it and I gain nothing if you visit or purchase it, but if you are interested in learning more about the MPC there you have the source of what I learnt.
As always, thanks for reading!